workingPath
Base path of the working directory
sparkDriverMemory
Memory allocated to the driver process
sparkExecutorMemory
Memory allocated to each executor
sparkExecutorCores
Number of cores used by each executor
timestamp
Timestamp for incremental harvesting
Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]
${jobTracker}
${nameNode}
eu.dnetlib.doiboost.crossref.CrossrefImporter
-t ${workingPath}/input/crossref/index_update
-n ${nameNode}
-ts ${timestamp}
yarn-cluster
cluster
ExtractCrossrefToOAF
eu.dnetlib.doiboost.crossref.CrossrefDataset
dhp-doiboost-${projectVersion}.jar
--executor-memory=${sparkExecutorMemory}
--executor-cores=${sparkExecutorCores}
--driver-memory=${sparkDriverMemory}
--conf spark.sql.shuffle.partitions=3840
${sparkExtraOPT}
--workingPath /data/doiboost/input/crossref
--master yarn-cluster
yarn-cluster
cluster
ConvertCrossrefToOAF
eu.dnetlib.doiboost.crossref.SparkMapDumpIntoOAF
dhp-doiboost-${projectVersion}.jar
--executor-memory=${sparkExecutorMemory}
--executor-cores=${sparkExecutorCores}
--driver-memory=${sparkDriverMemory}
--conf spark.sql.shuffle.partitions=3840
${sparkExtraOPT}
--sourcePath ${workingPath}/input/crossref/crossref_ds
--targetPath ${workingPath}/process/
--master yarn-cluster
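
The ${...} placeholders in the ConvertCrossrefToOAF step are filled in from the workflow parameters listed at the top. The following is a minimal Python sketch of that substitution, not the workflow engine's own mechanism. It assumes each application argument is a flag name followed by a separate value. The parameter values and the helper names (expand, build_options, build_args) are hypothetical and chosen only for illustration.

```python
from string import Template

# Hypothetical values; the real ones come from the job configuration.
params = {
    "sparkExecutorMemory": "4G",
    "sparkExecutorCores": "2",
    "sparkDriverMemory": "2G",
    "sparkExtraOPT": "",
    "workingPath": "/tmp/doiboost",
}

# Spark options copied from the ConvertCrossrefToOAF step.
SPARK_OPTS = [
    "--executor-memory=${sparkExecutorMemory}",
    "--executor-cores=${sparkExecutorCores}",
    "--driver-memory=${sparkDriverMemory}",
    "--conf spark.sql.shuffle.partitions=3840",
    "${sparkExtraOPT}",
]

# Application arguments, assumed to be (flag, value) pairs.
APP_ARGS = [
    ("--sourcePath", "${workingPath}/input/crossref/crossref_ds"),
    ("--targetPath", "${workingPath}/process/"),
    ("--master", "yarn-cluster"),
]


def expand(text, values):
    """Replace ${name} placeholders; raises KeyError if a name is missing."""
    return Template(text).substitute(values)


def build_options(values):
    """Expand the Spark options and drop any that end up empty."""
    expanded = (expand(opt, values) for opt in SPARK_OPTS)
    return [opt for opt in expanded if opt]


def build_args(values):
    """Expand the application arguments into a flat flag/value list."""
    out = []
    for flag, value in APP_ARGS:
        out.extend([flag, expand(value, values)])
    return out


if __name__ == "__main__":
    print(build_options(params))
    print(build_args(params))
```

Because substitute fails on an undefined placeholder, a missing parameter such as workingPath surfaces immediately rather than producing a malformed path.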