forked from D-Net/dnet-hadoop
Lampros Smyrnaios
d46b78b659
- Set Steps 2-7 and 9 to limit the number of files generated by Spark from 8000 down to 100, to improve file-transfer and querying performance.
- Allow the workflow to run up to Step10. Step11 seems to have some issues even when using hive-action.
scripts
config-default.xml
contexts.sh
copyDataToImpalaCluster.sh
createPDFsAggregated.sh
finalizeImpalaCluster.sh
finalizedb.sh
indicators.sh
monitor-post.sh
monitor.sh
observatory-post.sh
observatory-pre.sh
updateCache.sh
workflow.xml