Lampros Smyrnaios
|
b7c8acc563
|
- Update the code that acquires the "IMPALA_HDFS_NODE" to test the "tmp"-dir instead of the base-dir, and introduce retries to overcome potential file-system failures. This change was suggested by "Sebastian Tymkow" and "Grzegorz Bakalarski".
- Fix typos.
|
2024-04-03 13:15:37 +03:00 |
Antonis Lempesis
|
df6e3bda04
|
added new orgs in monitor
|
2024-04-01 22:45:29 +03:00 |
Antonis Lempesis
|
573b081f1d
|
added new orgs in monitor
|
2024-04-01 22:24:46 +03:00 |
Antonis Lempesis
|
0bf2a7a359
|
fixed the result_country definition
|
2024-04-01 15:23:22 +03:00 |
Antonis Lempesis
|
9ff44eed96
|
fixed typo in indicator query
added more institutions
|
2024-03-27 14:39:01 +02:00 |
Antonis Lempesis
|
1fee4124e0
|
added missing EOS
|
2024-03-27 12:58:25 +02:00 |
Lampros Smyrnaios
|
036ba03fcd
|
Generate tables with parquet-files, instead of csv, in "dhp-stats-update/.../contexts.sh" script.
|
2024-03-26 13:29:04 +02:00 |
Lampros Smyrnaios
|
bc8c97182d
|
Automatically select the ACTIVE HDFS NODE for Impala cluster, in all "copyDataToImpalaCluster.sh" scripts.
|
2024-03-26 13:01:12 +02:00 |
Lampros Smyrnaios
|
92cc27e7eb
|
Use the ACTIVE HDFS NODE for Impala cluster, in "copyDataToImpalaCluster.sh" script.
|
2024-03-26 12:34:11 +02:00 |
Antonis Lempesis
|
4c40c96e30
|
code cleanup
|
2024-03-22 10:16:49 +02:00 |
Antonis Lempesis
|
459167ac2f
|
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta
|
2024-03-21 12:44:58 +02:00 |
Antonis Lempesis
|
07f634a46d
|
code cleanup
|
2024-03-21 12:44:30 +02:00 |
Antonis Lempesis
|
9521625a07
|
code cleanup
|
2024-03-21 11:45:08 +02:00 |
Antonis Lempesis
|
67a5aa0a38
|
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta
|
2024-03-19 11:24:54 +02:00 |
dimitrispie
|
a3a570e9a0
|
Commit monitor-updates-wf
|
2024-03-19 09:42:21 +02:00 |
Antonis Lempesis
|
f74c7e8689
|
selecting distinct peer_reviewed
|
2024-03-12 02:13:04 +02:00 |
Antonis Lempesis
|
3c79720342
|
fixed the irish result subset
|
2024-03-07 14:08:57 +02:00 |
Antonis Lempesis
|
5ae4b4286c
|
Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta
|
2024-03-07 12:15:19 +02:00 |
Antonis Lempesis
|
316d585c8a
|
using distinct apcs per publication to avoid huge sums
|
2024-03-07 02:07:59 +02:00 |
Antonis Lempesis
|
dd4c27f4f3
|
added 2 new institutions in monitor
|
2024-02-08 12:57:57 +02:00 |
Antonis Lempesis
|
a512ead447
|
changed orcid ids to all capital
|
2024-01-30 16:54:47 +02:00 |
Antonis Lempesis
|
bb10a22290
|
merged changes from dnet-hadoop
|
2024-01-29 21:51:47 +02:00 |
Claudio Atzori
|
926903b06b
|
Merge branch 'beta' into stats_with_spark_sql
|
2024-01-29 09:11:45 +01:00 |
Giambattista Bloisi
|
078df0b4d1
|
Use SparkSQL in place of Hive for executing step16-createIndicatorsTables.sql of stats update wf
|
2024-01-26 21:56:55 +01:00 |
Claudio Atzori
|
ce3200263e
|
Merge branch 'beta' into crossref_missing_author_fix
|
2024-01-26 15:57:04 +01:00 |
Sandro La Bruzzo
|
e889808daa
|
Fixed problem on missing author in crossref Mapping
|
2024-01-26 12:19:04 +01:00 |
Antonis Lempesis
|
c548796463
|
Changed step16-createIndicatorsTables to use a spark oozie action instead of hive
|
2024-01-26 02:04:48 +02:00 |
Antonis Lempesis
|
a7115cfa9e
|
max mem of joins (hive.mapjoin.followby.gby.localtask.max.memory.usage) now 80%, up from 55%.
|
2024-01-25 15:13:16 +01:00 |
Antonis Lempesis
|
fd43b0e84a
|
max mem of joins (hive.mapjoin.followby.gby.localtask.max.memory.usage) now 80%, up from 55%.
|
2024-01-25 15:06:34 +01:00 |
Claudio Atzori
|
9b13c22e5d
|
[graph provision] retrieve all the context information by adding all=true to the requests issued to the API
|
2024-01-23 15:36:08 +01:00 |
Claudio Atzori
|
f87f3a6483
|
[graph provision] updated param specification for the XML converter job
|
2024-01-23 08:54:37 +01:00 |
Claudio Atzori
|
6fd25cf549
|
code formatting
|
2024-01-23 08:47:12 +01:00 |
Claudio Atzori
|
f76852f385
|
Merge branch 'beta' into update_pivots_table
|
2024-01-22 16:37:22 +01:00 |
Claudio Atzori
|
1c6db320f4
|
[graph provision] obtain context info from the context API instead of from the ISLookUp service
|
2024-01-22 15:53:17 +01:00 |
Claudio Atzori
|
2655eea5bc
|
[orcid enrichment] drop paths before copying the non-modified contents
|
2024-01-19 16:28:05 +01:00 |
Claudio Atzori
|
c6b3401596
|
increased shuffle partitions for publications in the country propagation workflow
|
2024-01-19 10:15:39 +01:00 |
Miriam Baglioni
|
bcc0a13981
|
[enrichment single step] adding <end> element in wf definition
|
2024-01-18 17:39:14 +01:00 |
Miriam Baglioni
|
6af536541d
|
[enrichment single step] moving parameter file in correct location
|
2024-01-18 15:35:40 +01:00 |
Miriam Baglioni
|
a12a3eb143
|
-
|
2024-01-18 15:18:10 +01:00 |
Miriam Baglioni
|
82e9e262ee
|
[enrichment single step] remove parameter from execution
|
2024-01-17 17:38:03 +01:00 |
Miriam Baglioni
|
67ce2d54be
|
[enrichment single step] refactoring to fix issues in disappeared result type
|
2024-01-17 16:50:00 +01:00 |
Miriam Baglioni
|
59eaccbd87
|
[enrichment single step] refactoring to fix issue in disappeared result type
|
2024-01-15 17:49:54 +01:00 |
Giambattista Bloisi
|
21a14fcd80
|
Reusable RunSQLSparkJob for executing SQL in Spark through Oozie Spark Actions
Implements pivots table update oozie workflow
|
2024-01-15 10:18:14 +01:00 |
Miriam Baglioni
|
f612125939
|
fix issue on FoS integration. Removing the null values from FoS
|
2024-01-12 10:20:28 +01:00 |
Claudio Atzori
|
cb9e739484
|
Merge branch 'beta' into resource_types
|
2024-01-11 16:29:41 +01:00 |
Claudio Atzori
|
2753044d13
|
refined mapping for the extraction of the original resource type
|
2024-01-11 16:28:26 +01:00 |
Giambattista Bloisi
|
3c66e3bd7b
|
Create dedup record for "merged" pivots
Do not create dedup records for groups that have more than 20 different acceptance dates
|
2024-01-10 22:59:52 +01:00 |
Giambattista Bloisi
|
10e135db1e
|
Use dedup_wf_002 in place of dedup_wf_001 to make explicit that a different algorithm has been used to generate this kind of ID
|
2024-01-10 22:59:52 +01:00 |
Giambattista Bloisi
|
831cc1fdde
|
Generate "merged" dedup id relations also for records that are filtered out by the cut parameters
|
2024-01-10 22:59:52 +01:00 |
Giambattista Bloisi
|
1287315ffb
|
No longer use dedupId information from the pivotHistory database
|
2024-01-10 22:59:52 +01:00 |