Lampros Smyrnaios
c7bfd75973
- Add the "getWorkersInfo" endpoint.
...
- Improve startup speed by using a faster remote server to get the host machine's public IP. This also reduces the risk of failing to get the public IP at all.
- Fix the detection of a different IP for a known worker.
- Improve documentation.
2023-05-23 14:57:15 +03:00
Lampros Smyrnaios
d7797eaaf6
Add the "getNumberOfPayloadsForDatasource" endpoint.
2023-04-24 09:54:35 +03:00
Lampros Smyrnaios
a1c16ffc19
- Exclude empty and null urls in the assignments.
...
- Update the "getFullTextsImproved"-call to "getFullTexts", now that the "improved" version is stable.
- Update Gradle.
- Code polishing.
2023-02-16 14:24:47 +02:00
Lampros Smyrnaios
49fefefafd
- Refactor the payloads-statistics code and provide two endpoints: "getNumberOfPayloadsAggregatedByService", which returns the number of payloads aggregated only by the PDF-Aggregation-Service, and "getNumberOfAllPayloads", which returns the number of all payloads in the database, including those aggregated in the past by other software.
...
- Update README.md.
- Make sure the docker image is clean-built, by avoiding the use of cache.
2023-02-02 17:58:47 +02:00
Lampros Smyrnaios
f89730f196
Improve documentation.
2023-01-27 14:31:07 +02:00
Lampros Smyrnaios
8876089022
- Use Facebook's [**Zstandard**](https://facebook.github.io/zstd/) compression algorithm, which significantly improves both compression ratio and speed.
...
- Update the minIO dependency.
- Code polishing.
2023-01-10 13:34:54 +02:00
Lampros Smyrnaios
e51ee9dd27
- Add info about the Stats API usage in "README.md".
...
- Optimize performance in "ParquetFileUtils.createAndLoadParquetDataIntoAttemptTable()" and "ParquetFileUtils.createAndLoadParquetDataIntoPayloadTable()".
- Handle the "EmptyResultDataAccessException" inside "StatsController".
- Optimize gradle's performance.
- Code polishing.
2022-12-15 14:04:22 +02:00
Lampros Smyrnaios
95c38c4a24
- Fix the "assignment" table always being created in the testDatabase.
...
- Code polishing.
2022-12-07 14:58:38 +02:00
Lampros Smyrnaios
5819bf584b
Update the README.md
2022-02-07 21:11:03 +02:00
Lampros Smyrnaios
48eed20dd8
- Implement the "getAndUploadFullTexts" functionality. To access the S3-ObjectStore from one trusted place, the Controller will request the files from the workers and upload them to S3. Afterwards, the workers will delete those files from their local storage. Previously, each worker uploaded its own files.
...
- Move the "mergeParquetFiles" and "getCutBatchExceptionMessage" methods inside the "FileUtils" class.
- Code cleanup.
2021-11-30 18:23:27 +02:00
Lampros Smyrnaios
983b900da7
- Add the "installAndRun.sh" script.
...
- Update the README.
- Update the dependencies.
2021-09-09 15:56:37 +03:00
Lampros Smyrnaios
8a4376da9c
Initial commit of UrlsController.
2021-03-16 15:25:15 +02:00