Import DOIboost
Import InfoSpace
30
set the input path for the MAG dump
MAGDumpPath
/data/doiboost/mag-2021-02-15
set the input path for the Crossref dump
crossrefDumpPath
/data/doiboost/crossref/
set the intermediate path used to process MAG
intermediatePathMAG
/data/doiboost/input/mag
set the input path for Crossref
inputPathCrossref
/data/doiboost/input/crossref
set the timestamp for the Crossref incremental harvesting
crossrefTimestamp
1607614921429
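The `crossrefTimestamp` value has 13 digits, which suggests a Unix timestamp in milliseconds. The profile does not state the unit, so this is an assumption. Under that assumption, a minimal Python sketch for reading the value as a UTC date (the helper name `epoch_millis_to_utc` is hypothetical):

```python
from datetime import datetime, timezone


def epoch_millis_to_utc(millis: int) -> datetime:
    """Interpret an integer as milliseconds since the Unix epoch (assumed unit)."""
    seconds, ms = divmod(millis, 1000)
    # Build the datetime from whole seconds, then attach the millisecond part.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).replace(microsecond=ms * 1000)


crossref_timestamp = 1607614921429  # value of crossrefTimestamp in this profile
print(epoch_millis_to_utc(crossref_timestamp).isoformat())
```

Splitting the value with `divmod` avoids floating-point rounding of the millisecond part.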
set the input path for Unpaywall
inputPathUnpayWall
/data/doiboost/input/unpayWall
set the input path for ORCID
inputPathOrcid
/data/orcid_activities_2020/last_orcid_dataset
set the working path for ORCID
workingPathOrcid
/data/doiboost/input/orcid
set the hostedBy map path
hostedByMapPath
/data/doiboost/input/hostedBy/hbMap.gz
set the Oozie workflow name from which execution will be resumed
resumeFrom
ConvertCrossrefToOAF
wait configurations
prepare action sets
[
{
'set' : 'doiboost',
'jobProperty' : 'export_action_set_doiboost',
'enablingProperty' : 'active_doiboost',
'enabled' : 'true'
}
]
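The action set list above uses single-quoted strings, so it is a Python-style literal rather than strict JSON. How the workflow engine consumes it is not stated here. The following sketch only shows one way to read the entries and keep those whose `enabled` flag is the string `'true'` (the helper name `enabled_job_properties` is hypothetical):

```python
import ast

# The list as it appears in this profile.
ACTION_SETS_TEXT = """
[
  {
    'set' : 'doiboost',
    'jobProperty' : 'export_action_set_doiboost',
    'enablingProperty' : 'active_doiboost',
    'enabled' : 'true'
  }
]
"""


def enabled_job_properties(text: str) -> dict:
    """Map each enabled entry's jobProperty to its set name."""
    entries = ast.literal_eval(text.strip())
    # 'enabled' is stored as a string, so compare text rather than truthiness.
    return {e['jobProperty']: e['set'] for e in entries if e['enabled'].lower() == 'true'}


print(enabled_job_properties(ACTION_SETS_TEXT))
```

Comparing the flag as text matters because a non-empty string such as `'false'` would otherwise count as true.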
extract the HDFS output path generated by the previous node
outputPath
prepare a new version of DOIBoost
executeOozieJob
IIS
{
'crossrefTimestamp' : 'crossrefTimestamp',
'hostedByMapPath' : 'hostedByMapPath',
'MAGDumpPath' : 'MAGDumpPath',
'inputPathMAG' : 'intermediatePathMAG',
'inputPathCrossref' : 'inputPathCrossref',
'crossrefDumpPath' : 'crossrefDumpPath',
'inputPathUnpayWall' : 'inputPathUnpayWall',
'inputPathOrcid' : 'inputPathOrcid',
'outputPath' : 'outputPath',
'workingPathOrcid' : 'workingPathOrcid',
'resumeFrom' : 'resumeFrom'
}
{
'oozie.wf.application.path' : '/lib/dnet/PROD/actionmanager/doiboost_process/oozie_app',
'workingPath' : '/data/doiboost/process_p',
'sparkExecutorCores' : '2',
'sparkExecutorIntersectionMemory' : '12G',
'sparkExecutorMemory' : '8G',
'esServer' : '[es_server]',
'esIndex' : 'crossref'
}
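The two maps above appear to play different roles. The first seems to pair each Oozie job property (key) with the name of a workflow parameter (value), while the second seems to hold literal values. This reading is an assumption, not something the profile states. Under it, the final job properties could be assembled roughly as follows (the function `build_job_properties` and the `outputPath` placeholder are hypothetical, and only part of the second map is reproduced):

```python
# Workflow parameters declared earlier in this profile (name -> value).
workflow_env = {
    'crossrefTimestamp': '1607614921429',
    'hostedByMapPath': '/data/doiboost/input/hostedBy/hbMap.gz',
    'MAGDumpPath': '/data/doiboost/mag-2021-02-15',
    'intermediatePathMAG': '/data/doiboost/input/mag',
    'inputPathCrossref': '/data/doiboost/input/crossref',
    'crossrefDumpPath': '/data/doiboost/crossref/',
    'inputPathUnpayWall': '/data/doiboost/input/unpayWall',
    'inputPathOrcid': '/data/orcid_activities_2020/last_orcid_dataset',
    'workingPathOrcid': '/data/doiboost/input/orcid',
    'resumeFrom': 'ConvertCrossrefToOAF',
    # Set at run time from the previous node; placeholder value only.
    'outputPath': '<outputPath from previous node>',
}

# First map: job property name -> workflow parameter name (assumed meaning).
param_map = {
    'crossrefTimestamp': 'crossrefTimestamp',
    'hostedByMapPath': 'hostedByMapPath',
    'MAGDumpPath': 'MAGDumpPath',
    'inputPathMAG': 'intermediatePathMAG',
    'inputPathCrossref': 'inputPathCrossref',
    'crossrefDumpPath': 'crossrefDumpPath',
    'inputPathUnpayWall': 'inputPathUnpayWall',
    'inputPathOrcid': 'inputPathOrcid',
    'outputPath': 'outputPath',
    'workingPathOrcid': 'workingPathOrcid',
    'resumeFrom': 'resumeFrom',
}

# Second map: literal values (subset shown).
static_params = {
    'workingPath': '/data/doiboost/process_p',
    'sparkExecutorCores': '2',
    'sparkExecutorMemory': '8G',
}


def build_job_properties(param_map: dict, static_params: dict, env: dict) -> dict:
    """Resolve mapped parameters against env and merge them with static values."""
    missing = sorted(name for name in param_map.values() if name not in env)
    if missing:
        raise KeyError(f"undefined workflow parameters: {missing}")
    props = dict(static_params)
    props.update({job_key: env[env_key] for job_key, env_key in param_map.items()})
    return props


job_properties = build_job_properties(param_map, static_params, workflow_env)
print(job_properties['inputPathMAG'])
```

Under this reading, `inputPathMAG` is the one job property whose name differs from its source parameter: it takes its value from `intermediatePathMAG`.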
build-report
update action sets
wf_20210714_075237_381
2021-07-14T09:51:46+00:00
SUCCESS