1
0
Fork 0
Commit Graph

4259 Commits

Author SHA1 Message Date
Claudio Atzori 09a6d17059 Merge pull request '[Stats wf] #372, #405 to production' (#406) from antonis.lempesis/dnet-hadoop:beta into master
Reviewed-on: D-Net/dnet-hadoop#406
2024-03-26 12:18:26 +01:00
Claudio Atzori d70793847d resolving conflicts on step16-createIndicatorsTables.sql 2024-03-26 12:17:52 +01:00
Lampros Smyrnaios bc8c97182d Automatically select the ACTIVE HDFS NODE for Impala cluster, in all "copyDataToImpalaCluster.sh" scripts. 2024-03-26 13:01:12 +02:00
Lampros Smyrnaios 92cc27e7eb Use the ACTIVE HDFS NODE for Impala cluster, in "copyDataToImpalaCluster.sh" script. 2024-03-26 12:34:11 +02:00
Michele De Bonis f6601ea7d1 default parameters for openorgs updated 2024-03-25 13:07:04 +01:00
Michele De Bonis cd4c3c934d openorgs wf updated 2024-03-22 15:42:37 +01:00
Antonis Lempesis 4c40c96e30 code cleanup 2024-03-22 10:16:49 +02:00
Antonis Lempesis 459167ac2f Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta 2024-03-21 12:44:58 +02:00
Antonis Lempesis 07f634a46d code cleanup 2024-03-21 12:44:30 +02:00
Antonis Lempesis 9521625a07 code cleanup 2024-03-21 11:45:08 +02:00
Antonis Lempesis 67a5aa0a38 Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta 2024-03-19 11:24:54 +02:00
dimitrispie a3a570e9a0 Commit monitor-updates-wf 2024-03-19 09:42:21 +02:00
Michele Artini a99942f7cf filter by base types 2024-03-13 12:12:42 +01:00
Michele Artini 7f7083f53e updated sql query for filtering BASE records 2024-03-13 11:57:26 +01:00
Michele Artini d9b23a76c5 comments 2024-03-12 14:53:34 +01:00
Michele Artini 3bcfc40293 new plugin to collect from a dump of BASE 2024-03-12 12:17:58 +01:00
Antonis Lempesis f74c7e8689 selecting distinct peer_reviewed 2024-03-12 02:13:04 +02:00
Antonis Lempesis 3c79720342 fixed the irish result subset 2024-03-07 14:08:57 +02:00
Antonis Lempesis 5ae4b4286c Merge branch 'beta' of https://code-repo.d3science.org/antonis.lempesis/dnet-hadoop into beta 2024-03-07 12:15:19 +02:00
Antonis Lempesis 316d585c8a using distinct apcs per publication to avoid huge sums 2024-03-07 02:07:59 +02:00
Giambattista Bloisi 3067ea390d Use SparkSQL in place of Hive for executing step16-createIndicatorsTables.sql of stats update wf 2024-03-04 11:13:34 +01:00
Miriam Baglioni c94d94035c [BulkTagging] added check to verify if field is present in the pathMap 2024-02-28 09:41:42 +01:00
Michele Artini 4374d7449e mapping of project PIDs 2024-02-22 14:44:35 +01:00
Claudio Atzori b3ddbaed58 fixed import of ORPs stored on HDFS in the internal graph format (e.g. Datacite) 2024-02-15 15:02:48 +01:00
Claudio Atzori 1416f16b35 [graph raw] fixed mapping of the original resource type from the Datacite format 2024-02-09 10:19:53 +01:00
Giambattista Bloisi 079085286c Merge branch 'master' into fix_dedupaliases_deletedbyinference 2024-02-08 15:29:13 +01:00
Giambattista Bloisi 8dd666aedd Dedup aliases, created when a dedup in a previous build has been merged in a new dedup, need to be marked as "deletedbyinference", since they are "merged" in the new dedup 2024-02-08 15:27:57 +01:00
Claudio Atzori d86b909db2 [actiosets] fixed join type 2024-02-08 15:10:55 +01:00
Claudio Atzori 08162902ab [actiosets] introduced support for the PromoteAction strategy 2024-02-08 15:10:40 +01:00
Antonis Lempesis dd4c27f4f3 added 2 new institutions in monitor 2024-02-08 12:57:57 +02:00
Claudio Atzori f28c63d5ef [orcid enrichment] fixed directory cleanup before distcp 2024-02-05 09:44:56 +02:00
Antonis Lempesis a512ead447 changed orcid ids to all capital 2024-01-30 16:54:47 +02:00
Claudio Atzori 1a8b609ed2 code formatting 2024-01-30 11:34:16 +01:00
Antonis Lempesis bb10a22290 merged changes from dnet-hadoop 2024-01-29 21:51:47 +02:00
Miriam Baglioni 4c8706efee [orcid-enrichment] change the value of parameters. 2024-01-29 18:21:36 +01:00
Claudio Atzori 926903b06b Merge branch 'beta' into stats_with_spark_sql 2024-01-29 09:11:45 +01:00
Giambattista Bloisi 078df0b4d1 Use SparkSQL in place of Hive for executing step16-createIndicatorsTables.sql of stats update wf 2024-01-26 21:56:55 +01:00
Claudio Atzori 4d0c59669b merged changes from beta 2024-01-26 16:08:54 +01:00
Claudio Atzori ce3200263e Merge branch 'beta' into crossref_missing_author_fix 2024-01-26 15:57:04 +01:00
Sandro La Bruzzo e889808daa Fixed problem on missing author in crossref Mapping 2024-01-26 12:19:04 +01:00
Antonis Lempesis c548796463 Changed step16-createIndicatorsTables to use a spark oozie action instead of hive 2024-01-26 02:04:48 +02:00
Antonis Lempesis a7115cfa9e max mem of joins (hive.mapjoin.followby.gby.localtask.max.memory.usage) now 80%, up from 55%. 2024-01-25 15:13:16 +01:00
Antonis Lempesis fd43b0e84a max mem of joins (hive.mapjoin.followby.gby.localtask.max.memory.usage) now 80%, up from 55%. 2024-01-25 15:06:34 +01:00
Claudio Atzori 9b13c22e5d [graph provision] retrieve all the context information by adding all=true to the requests issued to thr API 2024-01-23 15:36:08 +01:00
Claudio Atzori f87f3a6483 [graph provision] updated param specification for the XML converter job 2024-01-23 08:54:37 +01:00
Claudio Atzori 6fd25cf549 code formatting 2024-01-23 08:47:12 +01:00
Claudio Atzori f76852f385 Merge branch 'beta' into update_pivots_table 2024-01-22 16:37:22 +01:00
Claudio Atzori 1c6db320f4 [graph provision] obtain context info from the context API instead from the ISLookUp service 2024-01-22 15:53:17 +01:00
Claudio Atzori 2655eea5bc [orcid enrichment] drop paths before copying the non-modifyed contents 2024-01-19 16:28:05 +01:00
Claudio Atzori c6b3401596 increased shuffle partitions for publications in the country propagation workflow 2024-01-19 10:15:39 +01:00