dnet-hadoop/dhp-workflows/dhp-doiboost/src/main/java/eu/dnetlib/doiboost/orcid
Enrico Ottonello 1265dadc90 workflow aligned with stable_ids 2021-05-20 19:01:28 +02:00
json original orcid xml data are stored in a field of the class that models orcid data 2020-12-09 09:45:19 +01:00
model action to convert lambda file into seq file; spark action to download updated authors 2020-11-23 09:49:22 +01:00
util fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
xml original orcid xml data are stored in a field of the class that models orcid data 2020-12-09 09:45:19 +01:00
ActivitiesDecompressor.java added wf to extract authors and works xml data from orcid dump to hdfs; added wf to download the lambda file (containing last orcid update information) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ExtractXMLActivitiesData.java added wf to extract authors and works xml data from orcid dump to hdfs; added wf to download the lambda file (containing last orcid update information) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ExtractXMLSummariesData.java added wf to extract authors and works xml data from orcid dump to hdfs; added wf to download the lambda file (containing last orcid update information) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ORCIDToOAF.scala merged manually changes on stable_id for doiboost into master 2021-05-05 10:23:32 +02:00
OrcidAuthorsDOIsDataGen.java separate workflow to parse orcid summaries, activities and generate dataset with no doi publications; test 2020-07-03 23:30:31 +02:00
OrcidDSManager.java first version of dataset successfully generated from orcid dump 2020 2020-11-06 13:47:50 +01:00
SparkConvertORCIDToOAF.scala imported changes in stable_id into master 2021-05-07 12:53:50 +02:00
SparkDownloadOrcidAuthors.java workflow aligned with stable_ids 2021-05-20 19:01:28 +02:00
SparkDownloadOrcidWorks.java fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
SparkGenLastModifiedSeq.java fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
SparkGenerateDoiAuthorList.java wf doi_authors generates one json data for each row 2020-12-07 15:28:10 +01:00
SparkPreprocessORCID.scala imported changes in stable_id into master 2021-05-07 12:53:50 +02:00
SparkUpdateOrcidAuthors.java removed duplicate orcid data in authors set 2021-03-25 11:20:52 +01:00
SparkUpdateOrcidWorks.java fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
SummariesDecompressor.java added wf to extract authors and works xml data from orcid dump to hdfs; added wf to download the lambda file (containing last orcid update information) from orcid to hdfs 2020-11-17 18:23:12 +01:00