Lampros Smyrnaios
UrlsWorker
The Worker application requests assignments from the Controller and processes them, downloading the available full-texts.
It then posts the results to the Controller, which in turn requests from the Worker, in batches, the full-texts that have not already been found by other workers.
For each batch, the Worker responds by compressing and sending the requested files.
We use Facebook's Zstandard compression algorithm, which offers substantial benefits in both compression ratio and speed.
To install and run the application:
- Run `git clone` and then `cd UrlsWorker`.
- [Optional] Create the file `inputData.txt`, which contains just one line with the workerId, the maxAssignmentsLimitPerBatch, the maxAssignmentsBatchesToHandleBeforeRestart, the controller's base api-url and the shutdownOrCancelCode, all separated by a comma ",".
  For example: `worker_1,1000,0,http://IP:PORT/api/,stopOrCancelCode`.
  The shutdownOrCancelCode acts as a kind of "auth-code" when receiving "shutdown" and "cancel-shutdown" requests.
- Execute the `installAndRun.sh` script.
  If the above file (inputData.txt) does not exist, the script will request the required data from the user and then create the inputData.txt file.
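The single-line format of inputData.txt can be illustrated with a small parsing sketch. This is not the Worker's actual parsing code: the function name `parse_input_line`, the variable names and the validation rules are assumptions made for illustration, based only on the field order described above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: split and sanity-check the single line of inputData.txt.
# Variable names mirror the README's field names; the checks are assumptions.

parse_input_line() {
  local line="$1"
  local extra
  # Split on commas into the five expected fields; "extra" catches any surplus.
  IFS=',' read -r worker_id max_assignments_per_batch max_batches_before_restart \
    controller_base_url shutdown_or_cancel_code extra <<< "$line"

  # Require exactly five fields, with the text fields non-empty.
  if [[ -z "$worker_id" || -z "$controller_base_url" || -z "$shutdown_or_cancel_code" || -n "$extra" ]]; then
    echo "Expected exactly 5 comma-separated fields" >&2
    return 1
  fi
  # Both limits must be integers; the batches limit may be zero or negative.
  if ! [[ "$max_assignments_per_batch" =~ ^[0-9]+$ && "$max_batches_before_restart" =~ ^-?[0-9]+$ ]]; then
    echo "Numeric fields must be integers" >&2
    return 1
  fi
}

# Usage with the example line from this README.
parse_input_line "worker_1,1000,0,http://IP:PORT/api/,stopOrCancelCode" \
  && echo "workerId=$worker_id url=$controller_base_url"
```

To check a real file, its first line could be passed in with `parse_input_line "$(head -n 1 inputData.txt)"`.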
Notes:
- If "maxAssignmentsBatchesToHandleBeforeRestart" is zero or negative, then an unlimited number of assignments-batches will be handled.
- The above script installs the PublicationsRetriever as a library and then compiles and runs the whole application.
- If you want to just run the app, run the script with the argument "1": `./installAndRun.sh 1`.
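The "zero or negative means unlimited" rule from the first note can be sketched as a small predicate. The function name `should_handle_next_batch` and the counter of handled batches are hypothetical; the Worker's real logic may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the restart-limit rule; not the Worker's actual code.

# Succeeds (exit status 0) when another assignments-batch should be handled.
should_handle_next_batch() {
  local max_batches="$1"   # maxAssignmentsBatchesToHandleBeforeRestart
  local handled="$2"       # batches handled so far (assumed counter)
  # Zero or negative means "no limit".
  if (( max_batches <= 0 )); then
    return 0
  fi
  # Otherwise continue only while the limit has not been reached.
  (( handled < max_batches ))
}

should_handle_next_batch 0 500 && echo "limit 0: keep going after 500 batches"
should_handle_next_batch 3 3 || echo "limit 3: stop after 3 batches"
```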