Commit Graph

117 Commits

Author SHA1 Message Date
Enrico Ottonello 9a2fa9dc2f added test for other names parsing from summaries dump 2020-11-13 10:25:34 +01:00
Enrico Ottonello 13f28fa225 moved AuthorData to dhp-schemas; added other names to author data 2020-11-12 17:43:32 +01:00
Enrico Ottonello 1f861f2b0d now wf output is a sequence file with the format seq("eu.dnetlib.dhp.schema.oaf.Publication",eu.dnetlib.dhp.schema.action.AtomicActions) 2020-11-11 17:38:50 +01:00
Enrico Ottonello fea2451658 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi 2020-11-10 11:49:43 +01:00
Enrico Ottonello 1513174d7e added further test case 2020-11-10 11:44:55 +01:00
Sandro La Bruzzo cd27df91a1 fixed bug on missing relation in ANDS 2020-11-06 17:12:31 +01:00
Enrico Ottonello 6bc7dbeca7 first version of dataset successfully generated from orcid dump 2020 2020-11-06 13:47:50 +01:00
Sandro La Bruzzo 39337d8a8a fixed test 2020-11-02 09:26:25 +01:00
Enrico Ottonello 9818e74a70 added dependency version in main pom.xml for orcid no doi 2020-10-22 16:38:00 +02:00
Enrico Ottonello 210a50e4f4 replaced null value 2020-10-22 16:24:42 +02:00
Enrico Ottonello b0290dbcb7 moved all dependencies version to main pom.xml 2020-10-22 16:20:46 +02:00
Enrico Ottonello a38ab57062 let run test methods 2020-10-22 15:43:50 +02:00
Enrico Ottonello 1139d6568d replaced null value with a more safe empty string as return value 2020-10-22 15:32:26 +02:00
Enrico Ottonello c58db1c8ea added filter on null value after map function 2020-10-22 15:11:02 +02:00
Enrico Ottonello 846ba30873 if typologies mapping fails, an exception will be propagated 2020-10-22 14:36:18 +02:00
Enrico Ottonello c3114ba0ae replaced null as return value with a more safe empty string 2020-10-22 14:21:31 +02:00
Enrico Ottonello c295c71ca0 added comment 2020-10-22 14:07:26 +02:00
Enrico Ottonello ab083f9946 propagate exception on parsing work (PR request) 2020-10-22 14:02:32 +02:00
sandro 3a81a940b7 solved bug on merge publication 2020-10-21 22:41:55 +02:00
Sandro La Bruzzo 34bf64c94f fixed export Scholexplorer to OpenAire 2020-10-13 08:47:58 +02:00
Sandro La Bruzzo cd9c377d18 adapted scholexplorer Dump generation to the new Dataset definition 2020-10-08 10:10:13 +02:00
Sandro La Bruzzo c4a3c52e45 fixed Doiboost bug in the identifier 2020-10-01 15:46:44 +02:00
Enrico Ottonello a97ad20c7b exception is now propagated (PR review) 2020-09-22 10:46:34 +02:00
Enrico Ottonello fefbcfb106 dependency version moved to main pom (PR review) 2020-09-22 10:20:25 +02:00
Enrico Ottonello 9e8e7fe6ef add comments 2020-09-15 11:32:49 +02:00
Enrico Ottonello 0377b40fba output to one parquet file 2020-07-30 18:38:07 +02:00
Enrico Ottonello 196f36c6ed fix publication dataset creation 2020-07-30 13:38:33 +02:00
Enrico Ottonello c82b15b5f4 migrate configuration to ocean, fix publication dataset creation 2020-07-28 15:23:52 +02:00
Enrico Ottonello ca37d3427b separate workflow to parse orcid summaries, activities and generate dataset with no doi publications; test 2020-07-03 23:30:31 +02:00
Enrico Ottonello 1729cc5cf3 publication conversion from json to oaf test 2020-07-02 18:46:20 +02:00
Enrico Ottonello 5525f57ec8 converter from orcid work json to oaf 2020-07-01 18:36:14 +02:00
Enrico Ottonello b7b6be12a5 fixed enriched works generation 2020-06-29 18:03:16 +02:00
Enrico Ottonello b2213b6435 merged with dnet version 2020-06-26 17:27:34 +02:00
Enrico Ottonello c5e149c46e Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi 2020-06-26 16:15:38 +02:00
Enrico Ottonello d6498278ed added workflow to generate seq(orcidId,work) and seq(orcidId,enrichedWork) 2020-06-25 18:43:29 +02:00
Sandro La Bruzzo a6c0faac70 added test to verify secondary sorting 2020-06-25 10:48:15 +02:00
Enrico Ottonello fcbb4c1489 parser of orcid publication data from xml original dump 2020-06-24 16:29:32 +02:00
Claudio Atzori 9cd27183b6 [maven-release-plugin] prepare for next development iteration 2020-06-22 11:27:44 +02:00
Claudio Atzori 1e3dab0631 [maven-release-plugin] prepare release dhp-1.2.3 2020-06-22 11:27:39 +02:00
Sandro La Bruzzo 9bf67f5de1 resolved conflicts 2020-06-17 09:15:43 +02:00
Sandro La Bruzzo 1d4275acc4 implemented first version of exportation of Scholexplorer into ActionSet 2020-06-17 09:10:38 +02:00
Claudio Atzori c4d9f1837f [maven-release-plugin] prepare for next development iteration 2020-06-12 12:21:08 +02:00
Claudio Atzori f0746a7605 [maven-release-plugin] prepare release dhp-1.2.2 2020-06-12 12:21:03 +02:00
Claudio Atzori 67c7b31ba6 Merge branch 'master' into graph_cleaning 2020-06-10 15:00:35 +02:00
Claudio Atzori a2fdf85ba1 WIP: graph cleaner implementation 2020-06-09 19:52:53 +02:00
Alessia Bardi 4551c1082f mapping csv for orcid 2020-06-09 18:08:47 +02:00
Alessia Bardi 2d3f7d1eb4 fixed log classes to make the ORCID test run 2020-06-09 18:07:14 +02:00
Alessia Bardi a3a6755d58 mapping csv for Unpaywall 2020-06-09 17:45:44 +02:00
Alessia Bardi f3b033cf09 added csv line for funders from Crossref 2020-06-09 17:08:26 +02:00
Alessia Bardi fc4d220964 updated function name for SNSF 2020-06-09 17:05:31 +02:00