Commit Graph

2898 Commits

Author SHA1 Message Date
Miriam Baglioni 73dc082927 added new dumped field (openaccessroute, pid and alternate identifier at the level of the instance) and the bipFinder measure at the level of the result 2021-08-05 15:20:50 +02:00
Miriam Baglioni ee13da9258 merge branch with master 2021-08-05 11:34:20 +02:00
Sandro La Bruzzo 74afe43c3a fixed wrong test file 2021-08-04 10:16:17 +02:00
Claudio Atzori 11e26c020a Update 'README.md' 2021-07-30 11:54:13 +02:00
Claudio Atzori 5219d56be5 Merge pull request 'Patch the identifiers (source/target) in the relations, refinements' (#131) from fct_project_id_replacement into master
Reviewed-on: D-Net/dnet-hadoop#131
2021-07-30 11:07:54 +02:00
Claudio Atzori 4f78565c04 fixed implementation of PatchRelationsApplication, refined the relative unit test 2021-07-30 11:07:09 +02:00
Claudio Atzori a6a38cca9e fixed implementation of PatchRelationsApplication, refined the relative unit test 2021-07-30 11:06:11 +02:00
Miriam Baglioni 9bc4fd3b69 Patch FCT relations - fixed issue with join 2021-07-30 10:34:05 +02:00
Miriam Baglioni 2fc89fc9b5 Merge branch 'fct_project_id_replacement' of https://code-repo.d4science.org/D-Net/dnet-hadoop into fct_project_id_replacement 2021-07-30 10:20:43 +02:00
Claudio Atzori 081fe92a21 Merge branch 'fct_project_id_replacement' of https://code-repo.d4science.org/D-Net/dnet-hadoop into fct_project_id_replacement 2021-07-30 10:13:56 +02:00
Claudio Atzori 576693d782 added unit test for PatchRelationsApplication 2021-07-30 10:13:33 +02:00
Claudio Atzori 6e3554a45e [provision] lowercase relation filter 2021-07-29 13:56:37 +02:00
Claudio Atzori e725c88ebb [raw_all] patching relation identifier phase to be run at the end, i.e. includes also claimed relations 2021-07-29 13:03:43 +02:00
Claudio Atzori f83dd70e1c Merge pull request 'Patch the identifiers (source/target) in the relations' (#125) from fct_project_id_replacement into master
Reviewed-on: D-Net/dnet-hadoop#125
2021-07-29 12:11:27 +02:00
Claudio Atzori 5f7330d407 Merge branch 'master' into fct_project_id_replacement 2021-07-29 11:38:22 +02:00
Claudio Atzori 1923c1ce21 replaced full join + filtering with a left join 2021-07-29 11:36:20 +02:00
Claudio Atzori a9961a1835 [cleaning] title cleaning based on the me.xuender:unidecode library 2021-07-28 16:36:33 +02:00
Alessia Bardi 9594343725 code formatting after mvn compile 2021-07-28 11:41:34 +02:00
Claudio Atzori d267dce520 [raw_all] added extra workflow step for patching the identifiers in the relations, given an id mapping dataset 2021-07-27 17:18:29 +02:00
Claudio Atzori 998b66855a updated assertions in eu.dnetlib.dhp.oa.graph.raw.MappersTest 2021-07-27 15:11:37 +02:00
Miriam Baglioni 35e395eae8 merge with master 2021-07-27 12:34:59 +02:00
Claudio Atzori 5b6844b969 mapping funding relations from Datacite should be done according to the actual result identifier 2021-07-23 18:14:37 +02:00
Claudio Atzori ffdb2a3ea3 [cleaning] fixed filtering function for missing titles 2021-07-23 11:55:55 +02:00
Alessia Bardi 9069958479 tests for enermaps 2021-07-20 19:31:43 +02:00
Claudio Atzori 77e8c6c7f7 filtering 'old' OpenAIRE ids from the entity.originalId[] array in the OAF -> XML searialization procedure 2021-07-20 11:51:33 +02:00
Claudio Atzori 5947cddafc adding record identifier among the originalIds regardless of what IdentifierFactory produces 2021-07-19 17:52:24 +02:00
Miriam Baglioni 13cf444f85 Merge pull request 'force orginalId for claimed records' (#124) from forceOrginalId_claims into master
Reviewed-on: D-Net/dnet-hadoop#124
2021-07-19 17:41:58 +02:00
Claudio Atzori 5e5f65a3c3 contents mapped from the stores with 'claim' interpretation will not change their identifier along their way towards the graph 2021-07-19 15:56:55 +02:00
Claudio Atzori 9913b6073c Merge pull request 'orcid-no-doi' (#123) from enrico.ottonello/dnet-hadoop:orcid-no-doi into master
Reviewed-on: D-Net/dnet-hadoop#123
2021-07-15 17:53:58 +02:00
Enrico Ottonello 2dc50c0999 added default value to process path 2021-07-14 17:02:22 +02:00
Enrico Ottonello 66604bb2b4 added absolute path to process folder 2021-07-14 16:44:51 +02:00
Enrico Ottonello 7840cc6526 merged with master 2021-07-14 15:33:59 +02:00
Enrico Ottonello a65667d217 added publication to dataset even if no contributors 2021-07-14 15:07:07 +02:00
Sandro La Bruzzo 10068c00ea Code refactor:
- removed old workflows in doiboost
 - splitted workflow of doiboost in preprocess and process
2021-07-14 14:45:50 +02:00
Miriam Baglioni 1cdd09cd8e Tentative fix for testing of Jenkins 2021-07-14 11:14:59 +02:00
Sandro La Bruzzo 4cb65bc64a fixed process doiboost workflow:
- splitted OrcidToOAF into two phase preprocess and process
- updated workflow used in production
2021-07-14 09:44:32 +02:00
Miriam Baglioni 774cdb190e changes to mirror the last dump of the graph with the ols data model. 2021-07-13 18:57:24 +02:00
Miriam Baglioni 886617afd0 One result linked to more than on project is saved just once 2021-07-13 18:15:35 +02:00
Miriam Baglioni 320cf02d96 Changed the way to find results linked to projects. We verify to actually have the project on the graph before selecting the result 2021-07-13 18:13:32 +02:00
Miriam Baglioni 52ce35d57b - 2021-07-13 18:08:46 +02:00
Miriam Baglioni 970b387b8d modification to allow dump of a single community 2021-07-13 18:08:10 +02:00
Miriam Baglioni eae10c5894 modification to allow the dump for a single community 2021-07-13 18:07:25 +02:00
Miriam Baglioni c028feef4f workflow for the dump as sub workflows 2021-07-13 18:06:44 +02:00
Miriam Baglioni d70f8c96fd funding contains and not starts with h2020 2021-07-13 17:34:53 +02:00
Miriam Baglioni 5e38c7f42d dumping only communities with status all 2021-07-13 17:32:38 +02:00
Claudio Atzori 734de62474 [doiboost] added workflow for the ActionSet update dedicated to production 2021-07-13 17:26:04 +02:00
Miriam Baglioni d418c309f5 removed the part after part-x- in the file name generated by spark. It was too long and created problems while creating the tar entries 2021-07-13 17:11:49 +02:00
Miriam Baglioni 618d2de2da minor changes and refactoring 2021-07-13 17:10:02 +02:00
Miriam Baglioni 59615da65e Add test to verify the creation of relation between context and projects 2021-07-13 17:09:15 +02:00
Miriam Baglioni 084b4ef999 added the creation of the openaireId from funder and grant number if the element is not present in the context profile 2021-07-13 17:07:46 +02:00