dnet-hadoop/dhp-workflows/dhp-doiboost/src/main/java/eu/dnetlib/doiboost/orcid
Enrico Ottonello ebd67b8c8f removed duplicates orcid data on authors set 2021-03-25 11:20:52 +01:00
json original orcid xml data are stored in a field of the class that models orcid data 2020-12-09 09:45:19 +01:00
model action to convert lambda file in seq file; spark action to download updated authors 2020-11-23 09:49:22 +01:00
util fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
xml original orcid xml data are stored in a field of the class that models orcid data 2020-12-09 09:45:19 +01:00
ActivitiesDecompressor.java added wf to extract authors and works xml data from orcid dump to hdfs; added wf to download the lambda file (containing last orcid update information) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ExtractXMLActivitiesData.java added wf to extract authors and works xml data from orcid dump to hdfs; added wf to download the lambda file (containing last orcid update information) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ExtractXMLSummariesData.java added wf to extract authors and works xml data from orcid dump to hdfs; added wf to download the lambda file (containing last orcid update information) from orcid to hdfs 2020-11-17 18:23:12 +01:00
ORCIDToOAF.scala using com.fasterxml.jackson.databind.ObjectMapper instead of org.codehaus.jackson.map.ObjectMapper 2020-12-23 16:59:52 +01:00
OrcidAuthorsDOIsDataGen.java separate workflow to parse orcid summaries, activities and generate dataset with no doi publications; test 2020-07-03 23:30:31 +02:00
OrcidDSManager.java first version of dataset successfully generated from orcid dump 2020 2020-11-06 13:47:50 +01:00
SparkConvertORCIDToOAF.scala fixed doiboost mapping and workflows 2020-12-07 19:59:33 +01:00
SparkDownloadOrcidAuthors.java fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
SparkDownloadOrcidWorks.java fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
SparkGenLastModifiedSeq.java fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
SparkGenerateDoiAuthorList.java wf doi_authors generates one json data for each row 2020-12-07 15:28:10 +01:00
SparkUpdateOrcidAuthors.java removed duplicates orcid data on authors set 2021-03-25 11:20:52 +01:00
SparkUpdateOrcidWorks.java fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
SummariesDecompressor.java added wf to extract authors and works xml data from orcid dump to hdfs; added wf to download the lambda file (containing last orcid update information) from orcid to hdfs 2020-11-17 18:23:12 +01:00