dnet-hadoop/dhp-workflows/dhp-doiboost/src/main/java/eu/dnetlib/doiboost/orcid
Enrico Ottonello c0c2e05eae added wf to extracting authors and works xml data from orcid dump to hdfs; added wf to download the lamda file (containing last orcid update informations) from orcid to hdfs 2020-11-17 18:23:12 +01:00
json fixed enriched works generation 2020-06-29 18:03:16 +02:00
model moved AuthorData to dhp-schemas; added other names to author data 2020-11-12 17:43:32 +01:00
xml added wf to extracting authors and works xml data from orcid dump to hdfs; added wf to download the lamda file (containing last orcid update informations) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ActivitiesDecompressor.java added wf to extracting authors and works xml data from orcid dump to hdfs; added wf to download the lamda file (containing last orcid update informations) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ExtractXMLActivitiesData.java added wf to extracting authors and works xml data from orcid dump to hdfs; added wf to download the lamda file (containing last orcid update informations) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ExtractXMLSummariesData.java added wf to extracting authors and works xml data from orcid dump to hdfs; added wf to download the lamda file (containing last orcid update informations) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ORCIDToOAF.scala fixed log classes to make the ORCID test run 2020-06-09 18:07:14 +02:00
OrcidAuthorsDOIsDataGen.java separate workflow to parse orcid summaries, activities and generate dataset with no doi publications; test 2020-07-03 23:30:31 +02:00
OrcidDSManager.java first version of dataset successful generated from orcid dump 2020 2020-11-06 13:47:50 +01:00
OrcidDownloader.java separate workflow to parse orcid summaries, activities and generate dataset with no doi publications; test 2020-07-03 23:30:31 +02:00
SparkConvertORCIDToOAF.scala changed mapping ORCIDToOAF 2020-05-29 09:32:04 +02:00
SparkGenerateDoiAuthorList.java moved AuthorData to dhp-schemas; added other names to author data 2020-11-12 17:43:32 +01:00
SparkOrcidGenerateAuthors.java added accumulator; last modified date of the record is added to saved data; lambda file is partitioned into 20 parts before starting downloading 2020-05-18 19:51:29 +02:00
SparkPartitionLambdaFile.java added accumulator; last modified date of the record is added to saved data; lambda file is partitioned into 20 parts before starting downloading 2020-05-18 19:51:29 +02:00
SummariesDecompressor.java added wf to extracting authors and works xml data from orcid dump to hdfs; added wf to download the lamda file (containing last orcid update informations) from orcid to hdfs 2020-11-17 18:23:12 +01:00