Update/improve documentation.

Lampros Smyrnaios 2023-01-27 14:27:57 +02:00
parent 24b52fba63
commit b98ea92dec
3 changed files with 15 additions and 14 deletions


@@ -1,22 +1,23 @@
# UrlsWorker
The Worker's Application, requests assignments from the [Controller](https://code-repo.d4science.org/lsmyrnaios/UrlsController) and processes them, downloading the available full-texts.<br>
The Worker's Application requests assignments from the [Controller](https://code-repo.d4science.org/lsmyrnaios/UrlsController), processes them with the help of the [__PublicationsRetriever__](https://github.com/LSmyrnaios/PublicationsRetriever) software and downloads the available full-texts.<br>
Then, it posts the results to the Controller, which in turn, requests from the Worker, the full-texts which are not already found by other workers, in batches.<br>
The Worker responds by compressing and sending the requested files, in each batch.<br>
<br>
We use Facebook's [**Zstandard**](https://facebook.github.io/zstd/) compression algorithm, which brings very big benefits in compression rate and speed.
<br>
To install and run the application:
- Run ```git clone``` and then ```cd UrlsWorker```.
- [Optional] Create the file ```inputData.txt``` , which contains just one line with the ___workerId___, the ___maxAssignmentsLimitPerBatch___, the ___maxAssignmentsBatchesToHandleBeforeRestart___, the ___controller's base api-url___ and the ___shutdownOrCancelCode___, all separated by a _comma_ "```,```" .<br>
For example: ```worker_1,1000,0,http://IP:PORT/api/,stopOrCancelCode```.<br>
The ___shutdownOrCancelCode___ is kind of an "auth-code", when receiving "__shutdown__" and "__cancel-shutdown__" requests.
- Execute the ```installAndRun.sh``` script.<br>
In case the above file (_inputData.txt_) does not exist, the script will request the required data from the user, and then it will create the _inputData.txt_ file.<br>
<br>
Notes:
- If the "maxAssignmentsBatchesToHandleBeforeRestart" is zero or negative, then an infinite number of assignments-batches will be handled.<br>
- The above script, installs the [PublicationsRetriever](https://github.com/LSmyrnaios/PublicationsRetriever), as a library and then compiles and runs the whole Application.<br>
- If you want to just run the app, then run the script with the argument "1": ```./installAndRun.sh 1```.<br>
**To install and run the application**:
- Run ```git clone``` and then ```cd UrlsWorker```.
- Set the preferable values inside the [__application.properties__](https://code-repo.d4science.org/lsmyrnaios/UrlsWorker/src/branch/master/src/main/resources/application.properties) file.
- Execute the ```installAndRun.sh``` script.<br>
<br>
**Notes**:
- The above script installs the [PublicationsRetriever](https://github.com/LSmyrnaios/PublicationsRetriever) software as a library, and then compiles and runs the whole Application.<br>
- If you want to just run the app, then run the script with the argument "1": ```./installAndRun.sh 1```. In this scenario, the SpringBoot-app will not be re-built and
the [PublicationsRetriever](https://github.com/LSmyrnaios/PublicationsRetriever) software will not be re-installed either.<br>
- If you want to avoid re-installing the [PublicationsRetriever](https://github.com/LSmyrnaios/PublicationsRetriever) software, e.g. when using a development (non-published) version
or when nothing has changed and you want to avoid the time-overhead, run the script with the argument "0", followed by the argument "1": ```./installAndRun.sh 0 1```.<br>
<br>
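As a rough illustration of how the two script arguments described in the notes could interact, here is a minimal sketch. The function name `resolve_run_mode` and the way the arguments are read are assumptions made for this example; only the variable names `justRun` and `avoidReInstallingPublicationsRetriever`, the jar path and the fallback behaviour come from the script itself, and the real `installAndRun.sh` may be structured differently.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not taken from installAndRun.sh.
# Usage: resolve_run_mode <jarFile> [justRun] [avoidReInstallingPublicationsRetriever]
resolve_run_mode() {
  local jarFile=$1
  shift

  # Both flags default to 0: re-install the library, then build and run.
  justRun=0
  avoidReInstallingPublicationsRetriever=0
  if [[ $# -ge 1 ]]; then justRun=$1; fi
  if [[ $# -ge 2 ]]; then avoidReInstallingPublicationsRetriever=$2; fi

  # If the user asked to skip the re-install but the jar is missing,
  # fall back to the normal procedure, as the script does.
  if [[ "$justRun" -eq 0 && "$avoidReInstallingPublicationsRetriever" -eq 1 && ! -f "$jarFile" ]]; then
    avoidReInstallingPublicationsRetriever=0
  fi
}

# Example: corresponds to "./installAndRun.sh 0 1".
resolve_run_mode ./libs/publications_retriever-1.0-SNAPSHOT.jar 0 1
echo "justRun=$justRun avoidReInstallingPublicationsRetriever=$avoidReInstallingPublicationsRetriever"
```

With this layout, the skip request is honoured only when the jar-file is already present, so a first-time run still installs the library.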


@@ -28,6 +28,7 @@ if [[ justRun -eq 0 ]]; then
  if [[ avoidReInstallingPublicationsRetriever -eq 1 ]]; then
    if [[ ! -f ./libs/publications_retriever-1.0-SNAPSHOT.jar ]]; then
      echo -e "\n\nThe required \"PublicationsRetriever\" software has not been installed yet, thus the script will override the user-defined value of \"avoidReInstallingPublicationsRetriever\" to FALSE..\n\n"
      avoidReInstallingPublicationsRetriever=0; # In case the jar-file does not exist, then make sure we follow the normal procedure, independently from what the user requested.
    fi
  fi


@@ -20,8 +20,7 @@ server.servlet.context-path=/api
#Input data configurations
info.workerId = XX
info.maxAssignmentsLimitPerBatch = 10000
# The following can be set to <0> in order to never shutdown, or to another positive number,
# in order to run for 2 or 3 times and then shutdown.
# If the "info.maxAssignmentsBatchesToHandleBeforeShutdown" is zero, then an infinite number of assignments-batches will be handled.
info.maxAssignmentsBatchesToHandleBeforeShutdown = 0
info.controllerBaseUrl = XX
info.shutdownOrCancelCode = XX
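
Since the README asks for the values in this file to be set before running the script, a small pre-flight check can catch values that were left untouched. The sketch below assumes that "XX" is a placeholder that must be replaced; the function name `find_unset_properties` is hypothetical and not part of the repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the "info.*" keys whose value is still "XX".
# Assumption: "XX" marks a value that the user has not filled in yet.
find_unset_properties() {
  local propertiesFile=$1
  # Match lines such as "info.workerId = XX", tolerating spaces around "=",
  # then strip everything from the "=" onwards so only the key remains.
  grep -E '^info\.[A-Za-z]+[[:space:]]*=[[:space:]]*XX[[:space:]]*$' "$propertiesFile" \
    | sed -E 's/[[:space:]]*=.*$//'
}

# Example usage (the path is the one linked from the README):
# find_unset_properties src/main/resources/application.properties
```

If the function prints nothing, every "info.*" value has been changed from the assumed placeholder.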