The Worker app of the PDF Aggregation Service.
Lampros Smyrnaios d630f16198 Improve the compression of fulltext files:
- Fix not using the big bufferSize it was supposed to use.
- Make sure the maximum compression-level is used. Before, the invalid value "bufferSize" was passed as the level, and it is unclear which real compression-level it was changed to inside the zstd-library (19 or 22 (only allowed through "ultra mode")); probably the ultra-level, as this "switch" seems to be required only through the cli.
- Exclude the possibly outdated "commons-compress" transitive dependency from the "publications_retriever" dependency.
2024-06-10 18:21:35 +03:00
gradle/wrapper - Fix the process of shutting down the worker, in case the user sends the relevant request, while the worker is stuck in a data-request error-loop. 2024-04-29 17:08:40 +03:00
src Improve the compression of fulltext files: 2024-06-10 18:21:35 +03:00
.gitignore - Update the "installAndRun.sh" script to be able to just run the app (without re-installing), if you want. 2021-09-09 16:28:58 +03:00
LICENSE - Automatically use the latest version of "publications_retriever" software from the Nexus maven-repository. 2024-02-08 18:33:18 +02:00
README.md - Update README. 2024-04-26 13:36:41 +03:00
build.gradle Improve the compression of fulltext files: 2024-06-10 18:21:35 +03:00
createSwapStorage.sh - Automatically use the latest version of "publications_retriever" software from the Nexus maven-repository. 2024-02-08 18:33:18 +02:00
gradle.properties - Set some optimization settings for gradle. 2022-11-30 16:25:57 +02:00
gradlew Add some gradle files to be used by Jenkins. 2024-02-08 19:06:54 +02:00
gradlew.bat Add some gradle files to be used by Jenkins. 2024-02-08 19:06:54 +02:00
installAndRun.sh Update Gradle in the install script. 2024-04-26 15:02:50 +03:00
settings.gradle - Fix the project's name inside "settings.gradle". 2021-09-22 17:06:30 +03:00

README.md

UrlsWorker


The Worker application requests assignments from the Controller, processes them with the help of the PublicationsRetriever software, and downloads the available full-texts.
It then posts the results to the Controller, which in turn requests from the Worker, in batches, the full-texts that have not already been found by other workers.
The Worker responds to each batch request by compressing and sending the requested files.
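
The batch-wise delivery described above can be sketched as follows. This is only a minimal illustration, not the Worker's actual code: the class `FullTextBatcher`, the method `toBatches` and the `batchSize` parameter are hypothetical names, and the real batch boundaries are decided by the Controller's requests.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; it is not part of the UrlsWorker codebase.
class FullTextBatcher {

    // Splits the requested file names into consecutive batches of at most batchSize items.
    static List<List<String>> toBatches(List<String> fileNames, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < fileNames.size(); start += batchSize) {
            int end = Math.min(start + batchSize, fileNames.size());
            // Copy the sub-list so that each batch is independent of the input list.
            batches.add(new ArrayList<>(fileNames.subList(start, end)));
        }
        return batches;
    }
}
```

Each resulting batch would then be compressed and sent as one response, so a failed transfer only affects the files of that batch.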

Multiple instances of this app are deployed on the cloud.
We use Facebook's Zstandard compression algorithm, which offers significant benefits in both compression ratio and speed.


To install and run the application:

  • Clone the repository with git clone and then cd UrlsWorker.
  • Set your preferred values inside the application.properties file.
  • Execute the installAndRun.sh script.

Notes:

  • If you just want to run the app, run the script with the argument "1": ./installAndRun.sh 1. In this case, the Spring Boot app will not be re-built.