documented in the readme

This commit is contained in:
Sandro La Bruzzo 2022-10-18 16:24:34 +02:00
parent 310e44e3e4
commit dc39761130
1 changed files with 26 additions and 1 deletions

@ -1,3 +1,28 @@
# DHP-Explorer
For lazy people who hate to execute a Zeppelin notebook that doesn't work for unknown reasons!
# How does it work?
Let's say you want to create a series of Spark jobs to evaluate some features of your datasets. You have to use a Zeppelin notebook.
Sometimes the notebook doesn't work well, and you can't understand why you get errors there when the same code works in spark-shell.
With this project, your problems are solved.
## Step 1
Create a Java/Scala main application that runs your code.
## Step 2
Run the Python script:
`python execute_notebook.py {SSH USER NAME} {MAIN CLASS reference path} {arguments_file path}`
The arguments_file is a file that contains all the arguments, one per line.
This script does the following:
- create the main jar using mvn package
- upload the jar to the IIS machine (_iis-cdh5-test-gw.ocean.icm.edu.pl_)
- upload all the dependency jars, found by checking the pom for dependencies preceded by the comment `<!-- JAR NEED -->`
- submit the Spark job, so you can watch the output log directly on your machine
That's AWESOME!
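
If you are curious how the steps listed above could fit together, here is a minimal sketch. It is not the code of `execute_notebook.py`: the function names (`read_arguments`, `marked_dependencies`, `plan_commands`) are made up for illustration, skipping blank lines in the arguments file is an assumption, and so is the use of `scp`, `ssh` and the `spark-submit` options `--class` and `--jars`. The sketch only builds the commands and does not run them. Mapping Maven coordinates to actual jar files is left out.

```python
import os
import re
import shlex

# Matches a <dependency> element that directly follows the marker comment.
MARKED_DEPENDENCY = re.compile(
    r"<!--\s*JAR NEED\s*-->\s*<dependency>(.*?)</dependency>", re.DOTALL
)


def read_arguments(text):
    """Return one argument per line; skipping blank lines is an assumption."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def child_text(block, name):
    """Return the text of the first <name> element inside block, or None."""
    match = re.search(rf"<{name}>\s*(.*?)\s*</{name}>", block, re.DOTALL)
    return match.group(1) if match else None


def marked_dependencies(pom_text):
    """List (groupId, artifactId, version) for each marked dependency."""
    names = ("groupId", "artifactId", "version")
    return [
        tuple(child_text(block, name) for name in names)
        for block in MARKED_DEPENDENCY.findall(pom_text)
    ]


def plan_commands(user, host, main_class, app_jar, dep_jars, arguments):
    """Build the commands for the listed steps, without running them."""
    remote = f"{user}@{host}"
    # After scp, the jars sit in the remote home directory under their base names.
    remote_deps = [os.path.basename(jar) for jar in dep_jars]
    submit = ["spark-submit", "--class", main_class]
    if remote_deps:
        submit += ["--jars", ",".join(remote_deps)]
    submit += [os.path.basename(app_jar)] + list(arguments)
    return [
        ["mvn", "package"],
        ["scp", app_jar] + list(dep_jars) + [f"{remote}:"],
        ["ssh", remote, shlex.join(submit)],
    ]


# Hypothetical pom fragment: only the first dependency carries the marker.
pom = """
<!-- JAR NEED -->
<dependency>
  <groupId>org.example</groupId>
  <artifactId>helper</artifactId>
  <version>1.0</version>
</dependency>
<dependency>
  <groupId>org.example</groupId>
  <artifactId>unmarked</artifactId>
  <version>2.0</version>
</dependency>
"""
print(marked_dependencies(pom))

# Hypothetical user, class, jar paths and arguments file contents.
commands = plan_commands(
    "myuser",
    "iis-cdh5-test-gw.ocean.icm.edu.pl",
    "org.example.Main",
    "target/app.jar",
    ["lib/helper-1.0.jar"],
    read_arguments("--input\n/tmp/source_dataset\n"),
)
for command in commands:
    print(command)
```

Returning the commands as lists instead of running them makes the plan easy to inspect first. Each list could then be passed to `subprocess.run` once you are happy with it.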