UrlsWorker

The Worker application requests assignments from the Controller and processes them, downloading the available full-texts.
It then posts the results to the Controller, which in turn requests from the Worker, in batches, the full-texts that other workers have not already found.
The Worker responds by compressing and sending the requested files of each batch.

To install and run the application:

  • Run git clone and then cd UrlsWorker.
  • [Optional] Create the file inputData.txt, which contains just one line with the workerId, the maxAssignmentsLimitPerBatch, the maxAssignmentsBatchesToHandleBeforeRestart, the Controller's base api-url and the shutdownOrCancelCode, all separated by commas (",").
    For example: worker_1,1000,0,http://IP:PORT/api/,stopOrCancelCode
    The shutdownOrCancelCode serves as an "auth-code" for incoming "shutdown" and "cancel-shutdown" requests.
  • Execute the installAndRun.sh script.
    If inputData.txt does not exist, the script asks the user for the required data and then creates the file.
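
The optional inputData.txt step can be sketched in shell as follows. This is only an illustration: the helper name write_input_data is hypothetical (it is not part of the repository), and the values are the placeholders from the example above, so IP, PORT and the code must be replaced with real ones.

```shell
# Sketch: write the single-line inputData.txt into a given directory.
# "write_input_data" is a hypothetical helper, not part of the repository.
write_input_data() {
  target_dir="$1"

  # Placeholder values, in the field order described above.
  worker_id="worker_1"
  max_assignments_limit_per_batch="1000"
  max_batches_before_restart="0"             # zero or negative: no limit
  controller_base_url="http://IP:PORT/api/"  # replace IP and PORT
  shutdown_or_cancel_code="stopOrCancelCode" # replace with your own code

  # One line, five comma-separated fields.
  printf '%s,%s,%s,%s,%s\n' \
    "$worker_id" \
    "$max_assignments_limit_per_batch" \
    "$max_batches_before_restart" \
    "$controller_base_url" \
    "$shutdown_or_cancel_code" > "$target_dir/inputData.txt"
}

# Usage, from inside the cloned UrlsWorker directory:
#   write_input_data . && ./installAndRun.sh
```

Writing the file up front should avoid the interactive prompts, which may be convenient for unattended setups.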

Notes:

  • If the "maxAssignmentsBatchesToHandleBeforeRestart" is zero or negative, then an unlimited number of assignments-batches will be handled.
  • The above script installs the PublicationsRetriever as a library and then compiles and runs the whole application.
  • If you only want to run the app (without re-installing), run the script with the argument "1": ./installAndRun.sh 1
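
The first note can be illustrated with a small shell check on an inputData.txt line. This is a sketch, not the Worker's actual code: the helper describe_batches_limit is hypothetical, and reading a positive value as "at most N batches" is an assumption based on the field's name.

```shell
# Sketch: report how the third field of an inputData.txt line would be read.
# "describe_batches_limit" is a hypothetical helper, not part of the repository.
describe_batches_limit() {
  # $1: the single comma-separated line of inputData.txt
  limit="$(printf '%s' "$1" | cut -d',' -f3)"

  # Accept only an optionally negative integer.
  case "$limit" in
    ''|*[!0-9-]*|-|*?-*) echo "invalid"; return 1 ;;
  esac

  if [ "$limit" -le 0 ]; then
    echo "unlimited"                # zero or negative: no limit on batches
  else
    echo "at most $limit batches"   # assumed meaning of a positive value
  fi
}

# Usage:
#   describe_batches_limit "$(cat inputData.txt)"
```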