The Worker app of the PDF Aggregation Service.
Lampros Smyrnaios 839a797124 - Improve performance of full-texts transferring to the Controller, by preloading some bytes for faster response to the Controller's read requests.
- Optimize the directory-creation process by eliminating the additional check for existence, as that check already takes place inside the "mkdirs()" method.
- Remove the obsolete code which used a different base-dir in case the specific assignments' subdirectory failed to be created. Since the user-defined baseDir has already been created successfully upon initialization, any problem with creating subdirectories inside that base directory will most likely persist even after changing the base directory. Additionally, even if creating the subdirectory under the changed base directory succeeded, the "FullTextsController.getFullTexts()" method would not use it, resulting in errors.
- Code polishing.
2023-03-08 13:12:17 +02:00
| File | Last commit | Date |
|---|---|---|
| gradle/wrapper | Show a warning, in case the number of archived files is different from the number of requested files. | 2023-03-07 16:25:10 +02:00 |
| src | Improve performance of full-texts transferring to the Controller, by preloading some bytes for faster response to the Controller's read requests. | 2023-03-08 13:12:17 +02:00 |
| .gitignore | | |
| README.md | Update/improve documentation. | 2023-01-27 14:27:57 +02:00 |
| build.gradle | Show the elapsed time for each assignments-request to be processed by the Worker. | 2023-03-02 17:34:44 +02:00 |
| gradle.properties | | |
| installAndRun.sh | Show a warning, in case the number of archived files is different from the number of requested files. | 2023-03-07 16:25:10 +02:00 |
| settings.gradle | | |

UrlsWorker

The Worker application requests assignments from the Controller, processes them with the help of the PublicationsRetriever software, and downloads the available full-texts.
It then posts the results to the Controller, which in turn requests from the Worker, in batches, the full-texts that have not already been found by other workers.
The Worker responds to each batch by compressing and sending the requested files.

We use Facebook's Zstandard compression algorithm, which offers a high compression ratio together with fast compression and decompression.


To install and run the application:

  • Run git clone and then cd UrlsWorker.
  • Set the preferred values inside the application.properties file.
  • Execute the installAndRun.sh script.

Notes:

  • The above script installs the PublicationsRetriever software as a library, and then compiles and runs the whole application.
  • If you just want to run the app, run the script with the argument "1": ./installAndRun.sh 1. In this scenario, the Spring Boot app is not rebuilt and the PublicationsRetriever software is not re-installed.
  • If you want to avoid re-installing the PublicationsRetriever software, e.g. when using a development (non-published) version, or when nothing has changed and you want to avoid the time overhead, run the script with the argument "0" followed by the argument "1": ./installAndRun.sh 0 1.
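
The argument combinations described in the notes above can be summarized with a small sketch. This is not the actual content of installAndRun.sh. It is a hypothetical helper, `plan_steps`, which assumes that the first argument means "just run" and the second means "skip re-installing PublicationsRetriever", as the notes suggest. The step names it prints are illustrative only.

```shell
#!/usr/bin/env bash

# Hypothetical sketch: maps the two positional arguments of installAndRun.sh
# to the steps that the notes above say would be performed.
# Assumption: arg 1 = "just run" flag, arg 2 = "skip PublicationsRetriever install" flag.
plan_steps() {
  local just_run="${1:-0}"                 # defaults to 0 when omitted
  local skip_retriever_install="${2:-0}"   # defaults to 0 when omitted

  if [ "$just_run" = "1" ]; then
    # Only start the already-built app: no re-install, no re-build.
    echo "run"
  elif [ "$skip_retriever_install" = "1" ]; then
    # Re-build and run the app, but keep the installed PublicationsRetriever.
    echo "build run"
  else
    # Default: install PublicationsRetriever, then build and run the app.
    echo "install-retriever build run"
  fi
}

plan_steps        # corresponds to: ./installAndRun.sh
plan_steps 1      # corresponds to: ./installAndRun.sh 1
plan_steps 0 1    # corresponds to: ./installAndRun.sh 0 1
```

Under these assumptions, the three calls print "install-retriever build run", "run", and "build run" respectively, which mirrors the three scenarios in the notes.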