# UrlsController
The Controller application receives requests from the [Workers](https://code-repo.d4science.org/lsmyrnaios/UrlsWorker), builds an assignments-list from data retrieved from a database, and returns the list to the Workers.<br>
It then receives the "WorkerReports", requests the full-texts from the Workers in batches, and uploads them to the S3 Object Store. Finally, it writes the related reports, along with the updated file locations, to the database.<br>
The database used is [Impala](https://impala.apache.org/).<br>
<br>
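
As a rough illustration of the "in batches" step above, the following Python sketch splits a list of full-text file names into fixed-size groups. The function name and the batch size are hypothetical and chosen only for this example; the Controller's real batching logic and batch size are not described in this README.

```python
from typing import Iterator, List, Sequence


def split_into_batches(items: Sequence[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `batch_size` entries.

    Hypothetical helper: it only illustrates the idea of requesting
    full-texts in batches and is not taken from the Controller's code.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        # The last batch may be shorter than batch_size.
        yield list(items[start:start + batch_size])
```

Each yielded batch would then correspond to one full-texts request sent to a Worker.
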
**Statistics API**:
- "**getNumberOfPayloads**" endpoint: **http://IP:PORT/api/stats/getNumberOfPayloads**
- "**getNumberOfRecordsInspected**" endpoint: **http://IP:PORT/api/stats/getNumberOfRecordsInspected**
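
The endpoints above can be queried from any HTTP client. Below is a minimal Python sketch that builds the endpoint URL and reads the response. It assumes that the endpoints answer plain HTTP GET requests and return a short text body; neither detail is stated in this README. The function names are hypothetical.

```python
from urllib.request import urlopen

# Endpoint names as listed in the Statistics API section.
STATS_ENDPOINTS = ("getNumberOfPayloads", "getNumberOfRecordsInspected")


def stats_url(host: str, port: int, endpoint: str) -> str:
    """Build the URL of a statistics endpoint: http://IP:PORT/api/stats/<endpoint>."""
    if endpoint not in STATS_ENDPOINTS:
        raise ValueError(f"unknown statistics endpoint: {endpoint!r}")
    return f"http://{host}:{port}/api/stats/{endpoint}"


def fetch_stat(host: str, port: int, endpoint: str, timeout: float = 10.0) -> str:
    """Send a GET request to the endpoint and return the response body as text.

    Assumption: the endpoint accepts GET and replies with a UTF-8 text body.
    """
    with urlopen(stats_url(host, port, endpoint), timeout=timeout) as response:
        return response.read().decode("utf-8").strip()
```

For example, `fetch_stat("IP", PORT, "getNumberOfPayloads")`, with the Controller's actual IP and port filled in, would return whatever body that endpoint sends back.
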
<br>
<br>
**To install and run the application**:
- Run ```git clone``` and then ```cd UrlsController```.
- Set the preferred values in the [__application.properties__](https://code-repo.d4science.org/lsmyrnaios/UrlsWorker/src/branch/master/src/main/resources/application.properties) file.
- Execute the ```installAndRun.sh``` script, which builds and runs the app.<br>
To just run the app, run the script with the argument "1": ```./installAndRun.sh 1```.<br>
To build and run the app in a **Docker Container**, run the script with the argument "0" followed by the argument "1": ```./installAndRun.sh 0 1```.<br>
<br>
**Implementation notes**:
- For transferring the full-text files, we use Facebook's [**Zstandard**](https://facebook.github.io/zstd/) compression algorithm, which provides a high compression ratio together with fast compression and decompression.
- The names of the uploaded full-text files are of the following form: "***datasourceID/recordId::fileHash.pdf***"
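
To make the file-name format above concrete, here is a small Python sketch that builds and parses names of the form `datasourceID/recordId::fileHash.pdf`. The type and function names are hypothetical and do not come from the Controller's code. The parser assumes that the datasource ID contains no `/`, and it splits on the last `::` in case the record ID itself contains `::`.

```python
from typing import NamedTuple


class FullTextName(NamedTuple):
    datasource_id: str
    record_id: str
    file_hash: str


def build_full_text_name(datasource_id: str, record_id: str, file_hash: str) -> str:
    """Compose a name of the form datasourceID/recordId::fileHash.pdf."""
    return f"{datasource_id}/{record_id}::{file_hash}.pdf"


def parse_full_text_name(name: str) -> FullTextName:
    """Split a full-text file name back into its three parts."""
    if not name.endswith(".pdf"):
        raise ValueError(f"not a .pdf name: {name!r}")
    stem = name[: -len(".pdf")]
    # Assumption: the first "/" separates the datasource ID from the rest.
    datasource_id, slash, remainder = stem.partition("/")
    # Split on the last "::" so that a record ID containing "::" stays intact.
    record_id, separator, file_hash = remainder.rpartition("::")
    if not (slash and separator and datasource_id and record_id and file_hash):
        raise ValueError(f"unexpected file name format: {name!r}")
    return FullTextName(datasource_id, record_id, file_hash)
```

Building a name and parsing it again returns the original three parts, which gives a quick check that a given name follows the format.
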