outputDir: the path where the generated data will be stored
esEventIndexName: the elasticsearch index name for events
esIndexHost: the elasticsearch host
esBatchWriteRetryCount (value: 8): an ES configuration property
esBatchWriteRetryWait (value: 60s): an ES configuration property
esBatchSizeEntries (value: 200): an ES configuration property
esNodesWanOnly (value: true): an ES configuration property
maxIndexedEventsForDsAndTopic: the max number of events for each (ds/topic) pair
brokerApiBaseUrl: the url of the broker service api
sparkDriverMemory: memory for the driver process
sparkExecutorMemory: memory for an individual executor
sparkExecutorCores: number of cores used by a single executor
oozieActionShareLibForSpark2: oozie action sharelib for spark 2.*
spark2ExtraListeners (value: com.cloudera.spark.lineage.NavigatorAppListener): spark 2.* extra listeners classname
spark2SqlQueryExecutionListeners (value: com.cloudera.spark.lineage.NavigatorQueryListener): spark 2.* sql query execution listeners classname
spark2YarnHistoryServerAddress: spark 2.* yarn history server address
spark2EventLogDir: spark 2.* event log dir location
sparkMaxExecutorsForIndexing (value: 8): max number of workers for ElasticSearch indexing

${jobTracker}
${nameNode}
mapreduce.job.queuename ${queueName}
oozie.launcher.mapred.job.queue.name ${oozieLauncherQueueName}
oozie.action.sharelib.for.spark ${oozieActionShareLibForSpark2}

Action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]

yarn
cluster
IndexEventSubsetOnESJob
eu.dnetlib.dhp.broker.oa.IndexEventSubsetJob
dhp-broker-events-${projectVersion}.jar
--executor-memory=${sparkExecutorMemory}
--driver-memory=${sparkDriverMemory}
--conf spark.dynamicAllocation.maxExecutors=${sparkMaxExecutorsForIndexing}
--conf spark.extraListeners=${spark2ExtraListeners}
--conf spark.sql.queryExecutionListeners=${spark2SqlQueryExecutionListeners}
--conf spark.yarn.historyServer.address=${spark2YarnHistoryServerAddress}
--conf spark.eventLog.dir=${nameNode}${spark2EventLogDir}
--conf spark.sql.shuffle.partitions=3840
--outputDir ${outputDir}
--index ${esEventIndexName}
--esHost ${esIndexHost}
--esBatchWriteRetryCount ${esBatchWriteRetryCount}
--esBatchWriteRetryWait ${esBatchWriteRetryWait}
--esBatchSizeEntries ${esBatchSizeEntries}
--esNodesWanOnly ${esNodesWanOnly}
--maxEventsForTopic ${maxIndexedEventsForDsAndTopic}
--brokerApiBaseUrl ${brokerApiBaseUrl}
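
Each job argument is a flag followed by the value of the corresponding workflow parameter. The Python sketch below illustrates how those placeholders might resolve into a flat argument list. It is not part of the workflow and only approximates Oozie's substitution with `string.Template`. The function `resolve_args` and every value in `example_properties` are hypothetical; the four defaults are the ones listed with the parameters.

```python
from string import Template

# Defaults taken from the parameter list of the workflow.
DEFAULTS = {
    "esBatchWriteRetryCount": "8",
    "esBatchWriteRetryWait": "60s",
    "esBatchSizeEntries": "200",
    "esNodesWanOnly": "true",
}

# Flag/placeholder pairs, in the order the workflow lists them.
ARG_TEMPLATES = [
    ("--outputDir", "${outputDir}"),
    ("--index", "${esEventIndexName}"),
    ("--esHost", "${esIndexHost}"),
    ("--esBatchWriteRetryCount", "${esBatchWriteRetryCount}"),
    ("--esBatchWriteRetryWait", "${esBatchWriteRetryWait}"),
    ("--esBatchSizeEntries", "${esBatchSizeEntries}"),
    ("--esNodesWanOnly", "${esNodesWanOnly}"),
    ("--maxEventsForTopic", "${maxIndexedEventsForDsAndTopic}"),
    ("--brokerApiBaseUrl", "${brokerApiBaseUrl}"),
]


def resolve_args(job_properties):
    """Return the flat argument list after placeholder substitution.

    Hypothetical helper: values in job_properties override DEFAULTS.
    """
    params = {**DEFAULTS, **job_properties}
    args = []
    for flag, placeholder in ARG_TEMPLATES:
        # substitute() raises KeyError when a parameter has no value.
        args.extend([flag, Template(placeholder).substitute(params)])
    return args


# Hypothetical values, for illustration only.
example_properties = {
    "outputDir": "/tmp/broker/events",
    "esEventIndexName": "example_events",
    "esIndexHost": "es.example.org",
    "maxIndexedEventsForDsAndTopic": "100",
    "brokerApiBaseUrl": "http://broker.example.org/api",
}

if __name__ == "__main__":
    print(" ".join(resolve_args(example_properties)))
```

Parameters without a default (for example `outputDir`) must be supplied by the caller; in this sketch a missing one surfaces as a `KeyError` rather than an empty argument.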