1
0
Fork 0
Commit Graph

4134 Commits

Author SHA1 Message Date
Claudio Atzori 9a5b134ddf Merge branch 'beta' into FOSNew 2024-03-25 16:07:37 +01:00
Claudio Atzori 71c1f81b54 Merge branch 'beta' into exception_on_invalid_transofmation_rule 2024-03-25 16:05:11 +01:00
Claudio Atzori 91b61687fa Merge branch 'beta' into bulkTaggingPathMapExtention 2024-03-25 15:50:18 +01:00
Claudio Atzori 54936b7f42 Merge branch 'beta' into transformativeagreement 2024-03-25 15:42:22 +01:00
Michele Artini e1149eb5c4 xslt rules and tests 2024-03-25 15:01:42 +01:00
Michele Artini 3f174ad90f Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta 2024-03-25 12:16:02 +01:00
Michele Artini 6ffb1faf09 fixed a problem with multiple nodes 2024-03-25 12:15:51 +01:00
Giambattista Bloisi 3f22c101d9 Merge pull request 'Enrich authors with ORCID info using new matching algorithm' (#398) from new_orcid_enhancement into beta
Reviewed-on: D-Net/dnet-hadoop#398
2024-03-22 17:29:20 +01:00
Giambattista Bloisi 0ff7faad72 Fix conditions that prevented ORCID Enrichment 2024-03-22 16:24:49 +01:00
Michele Artini 7faa115ba0 Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta 2024-03-22 11:08:59 +01:00
Michele Artini f9c74c98fa fixed an identifier xpath 2024-03-22 11:08:45 +01:00
Giambattista Bloisi 664a381d31 Unify merge logic of entities in MergeUtils.class 2024-03-18 16:04:49 +01:00
Michele Artini cb29b9773c xslt rules 2024-03-18 15:31:34 +01:00
Michele Artini 85b844d57e updated BASE filter param 2024-03-15 15:03:27 +01:00
Michele Artini 455f2e1e07 apply commits from master 2024-03-15 14:56:39 +01:00
Michele Artini 88fef367b9 new plugin to collect from a dump of BASE 2024-03-15 10:47:52 +01:00
Giambattista Bloisi 9092075760 Enrich authors with ORCID info using new matching algorithm 2024-03-11 13:23:59 +01:00
Miriam Baglioni 5180b6ec8a [FOSNEW] removed test class 2024-03-07 10:47:13 +01:00
Miriam Baglioni 7827a2d66b [OCNEW] added creation of the actionset for the results classified with FoS based ont he OpenAIRE identifier 2024-03-07 10:36:30 +01:00
Giambattista Bloisi 3cd5590f3b When converting json to XML, remove characters that are not allowed in the XML 1.0 specs, as they will cause xpath failures even if escaped 2024-02-28 15:14:18 +01:00
Giambattista Bloisi 56dd05f85c Merge pull request 'Revised procedure when converting json data into xml' (#395) from restiterator_xmlcleanup into beta
Reviewed-on: D-Net/dnet-hadoop#395
2024-02-28 10:38:54 +01:00
Sandro La Bruzzo 7d806a434c formatted code 2024-02-28 09:31:58 +01:00
Sandro La Bruzzo b63994dcc4 Merge remote-tracking branch 'origin/beta' into orcid_update 2024-02-28 09:11:18 +01:00
Sandro La Bruzzo 915a76a796 following the comment on the pull requests:
- Added #NUM_OF_THREADS complete job in the queue at the end of  the main loop to avoid deadlock
2024-02-28 09:10:55 +01:00
Giambattista Bloisi 773e856550 Revised procedure when converting json data into xml:
- json object keys are renamed to be conformant to xml tag elements, special characters are substituted or removed
- json string values are no longer post-processed as they are already escaped by the org.json.XML.toString method
2024-02-24 16:54:30 +01:00
Sandro La Bruzzo a712df1e1d Merge remote-tracking branch 'origin/beta' into orcid_update 2024-02-23 10:12:25 +01:00
Sandro La Bruzzo b32a9d1994 Implemented workflow for updating table , added step to check if the new generated table is valid 2024-02-23 10:04:28 +01:00
Michele Artini 3268570b2c mapping of project PIDs 2024-02-22 14:47:21 +01:00
Miriam Baglioni 72bae7af76 [Transformative Agreement] removed the relations from the ActionSet waiting to have the gree light from Ioanna 2024-02-19 16:20:12 +01:00
Serafeim Chatzopoulos f0dc12634b Add Action Set creation for affiliations inferred from the OpenAPC data 2024-02-18 18:02:09 +02:00
Claudio Atzori a63b091bae Merge branch 'beta' into import_orps_fix 2024-02-15 15:01:56 +01:00
Miriam Baglioni eca021f4d6 [Transformative Agreement] add results with information abount the agreement and the country of the organization paid for it 2024-02-13 12:21:07 +01:00
Miriam Baglioni bdb6bbb365 mergin with branch beta 2024-02-12 15:50:43 +01:00
Claudio Atzori d85d2df6ad [graph raw] fixed mapping of the original resource type from the Datacite format 2024-02-09 10:20:20 +01:00
Giambattista Bloisi b19643f6eb Dedup aliases, created when a dedup in a previous build has been merged in a new dedup, need to be marked as "deletedbyinference", since they are "merged" in the new dedup 2024-02-08 15:34:59 +01:00
Claudio Atzori 38c9001147 fixed import of ORPs stored on HDFS in the internal graph format (e.g. Datacite) 2024-02-07 17:02:05 +01:00
Claudio Atzori fd17c1f17c [actiosets] fixed join type 2024-02-05 16:55:36 +02:00
Claudio Atzori 009dcf6aea [actiosets] introduced support for the PromoteAction strategy 2024-02-05 16:43:40 +02:00
Claudio Atzori 42f5506306 [orcid enrichment] fixed directory cleanup before distcp 2024-02-05 09:45:36 +02:00
Alessia Bardi f2a08d8cc2 test for Italian records from IRS repositories 2024-01-30 19:20:14 +01:00
Miriam Baglioni 07a373a0bd [bulkTagging] removing checks while performing the substring action so that it will fire an Exception if the paramneters are wrongly set 2024-01-30 13:51:11 +01:00
Miriam Baglioni ead08b0dd4 mergin with branch beta 2024-01-30 12:19:10 +01:00
Miriam Baglioni a5995ab557 [orcid-enrichment] change the value of parameters. 2024-01-29 18:19:48 +01:00
Sandro La Bruzzo 9aebca77a0 Added exception throwing in Hadoop transformation when TR is not syntactically valid 2024-01-29 14:41:02 +01:00
Claudio Atzori 926903b06b Merge branch 'beta' into stats_with_spark_sql 2024-01-29 09:11:45 +01:00
Giambattista Bloisi 078df0b4d1 Use SparkSQL in place of Hive for executing step16-createIndicatorsTables.sql of stats update wf 2024-01-26 21:56:55 +01:00
Claudio Atzori ce3200263e Merge branch 'beta' into crossref_missing_author_fix 2024-01-26 15:57:04 +01:00
Sandro La Bruzzo e889808daa Fixed problem on missing author in crossref Mapping 2024-01-26 12:19:04 +01:00
Sandro La Bruzzo 0386f36385 Added workflow to update ORCID and replaced some parsing, because the update works and employments xml differs from the dump one. 2024-01-25 19:40:59 +01:00
Antonis Lempesis a7115cfa9e max mem of joins (hive.mapjoin.followby.gby.localtask.max.memory.usage) now 80%, up from 55%. 2024-01-25 15:13:16 +01:00