sourceNN
the source name node
isLookupUrl
the isLookup service endpoint
workingDirectory
/tmp/actionsets
working directory
distcp_memory_mb
6144
memory for distcp copying actionsets from remote cluster
distcp_task_timeout
60000000
timeout for distcp copying actionsets from remote cluster
distcp_num_maps
1
maximum number of map tasks used in the distcp process
transform_only
activate transform-only mode: apply only the transformation step
sparkDriverMemory
memory for driver process
sparkExecutorMemory
memory for individual executor
sparkExecutorCores
number of cores used by single executor
spark2YarnHistoryServerAddress
spark 2.* yarn history server address
spark2EventLogDir
spark 2.* event log dir location
${jobTracker}
${nameNode}
mapreduce.job.queuename
${queueName}
oozie.launcher.mapred.job.queue.name
${oozieLauncherQueueName}
eu.dnetlib.dhp.migration.actions.MigrateActionSet
-Dmapred.task.timeout=${distcp_task_timeout}
-is ${isLookupUrl}
-sn ${sourceNN}
-tn ${nameNode}
-w ${workingDirectory}
-nm ${distcp_num_maps}
-mm ${distcp_memory_mb}
-tt ${distcp_task_timeout}
-tr ${transform_only}
yarn
cluster
transform_actions
eu.dnetlib.dhp.migration.actions.TransformActions
dhp-aggregation-${projectVersion}.jar
--executor-cores ${sparkExecutorCores}
--executor-memory ${sparkExecutorMemory}
--driver-memory=${sparkDriverMemory}
--conf spark.extraListeners="com.cloudera.spark.lineage.NavigatorAppListener"
--conf spark.sql.queryExecutionListeners="com.cloudera.spark.lineage.NavigatorQueryListener"
--conf spark.yarn.historyServer.address=${spark2YarnHistoryServerAddress}
--conf spark.eventLog.dir=${nameNode}${spark2EventLogDir}
-mt yarn
-is ${isLookupUrl}
--inputPaths ${wf:actionData('migrate_actionsets')['target_paths']}
migrate_actions failed, error message[${wf:errorMessage(wf:lastErrorNode())}]