The Controller app of the PDF Aggregation Service.
Commit 88acaae20f by Lampros Smyrnaios, 2022-02-23 17:40:06 +02:00
- Replace the "numFullTextUrlsFound"-counter with "numFullTextsFound"-counter to reflect the end result of the actually available full-texts (which were downloaded by the Worker).
- Optimize the gather-fileNames loop.
- Improve a message in "installAndRun.sh".
gradle/wrapper: In case of an error when creating the "current_assignment" table (e.g., out of memory in the backend database server), check for partial creation and drop it. Also, in any case, before we drop this table, now check if it exists first (in general it should always exist, unless the creation results in an error and the table was not created at all). (2022-02-14 12:36:00 +02:00)
src/main: Replace the "numFullTextUrlsFound"-counter with "numFullTextsFound"-counter to reflect the end result of the actually available full-texts (which were downloaded by the Worker). (2022-02-23 17:40:06 +02:00)
.gitignore: springified project (2022-01-30 22:15:13 +02:00)
Dockerfile: Allow the user to build, push and run the App in Docker, straight through the "installAndRun.sh" script. (2022-02-04 15:49:56 +02:00)
README.md: Update the README.md (2022-02-07 21:11:03 +02:00)
build.gradle: In case of an error when creating the "current_assignment" table (e.g., out of memory in the backend database server), check for partial creation and drop it. Also, in any case, before we drop this table, now check if it exists first (in general it should always exist, unless the creation results in an error and the table was not created at all). (2022-02-14 12:36:00 +02:00)
installAndRun.sh: Replace the "numFullTextUrlsFound"-counter with "numFullTextsFound"-counter to reflect the end result of the actually available full-texts (which were downloaded by the Worker). (2022-02-23 17:40:06 +02:00)
settings.gradle: Add the "isControllerAlive"-endpoint. (2021-09-23 15:08:52 +03:00)


UrlsController

The Controller application receives requests from the Workers, constructs an assignments-list from data retrieved from a database, and returns the list to the Workers.
It then receives the "WorkerReports", requests the full-texts from the Workers in batches, and uploads them to the S3 Object Store. Finally, it writes the related reports, along with the updated file locations, to the database.
The database used is Impala.
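
The "in batches" step can be illustrated with a minimal sketch. It is not taken from the UrlsController code base: the class name `FullTextBatcher`, the method `partition`, the sample file names, and the batch size are all assumptions made for illustration. It only shows one plausible way to split a list of full-text file names into fixed-size batches before each batch is requested from a Worker.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the UrlsController code base.
class FullTextBatcher {

    // Splits the given file names into consecutive batches of at most batchSize elements.
    static List<List<String>> partition(List<String> fileNames, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<String>> batches = new ArrayList<>();
        for (int start = 0; start < fileNames.size(); start += batchSize) {
            int end = Math.min(start + batchSize, fileNames.size());
            // Copy the sub-list so that each batch is independent of the input list.
            batches.add(new ArrayList<>(fileNames.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Sample file names and batch size are assumptions, chosen only for the demo.
        List<String> fileNames = List.of("a.pdf", "b.pdf", "c.pdf", "d.pdf", "e.pdf");
        for (List<String> batch : partition(fileNames, 2)) {
            // In the real service, each batch would be requested from a Worker
            // and then uploaded to the S3 Object Store.
            System.out.println("Requesting batch: " + batch);
        }
    }
}
```

Requesting the files in bounded batches keeps each transfer small, so a failure affects only one batch and not the whole set of full-texts from a report.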

To install and run the application:

  • Clone the repository with git clone, then cd UrlsController.
  • Provide the S3 Object Store configuration in the src/main/resources/application.properties file.
  • Execute the installAndRun.sh script, which builds and runs the app.
    To just run the app, pass the argument "1": ./installAndRun.sh 1.
    To build and run the app in a Docker container, pass the argument "0" followed by the argument "1": ./installAndRun.sh 0 1.