Commit Graph

13 Commits

Author SHA1 Message Date
Lampros Smyrnaios 03bf4294b8 - Add documentation about the "BulkImport API" in the README.
- Fix a link in README.
- Update dependencies.
2023-05-29 12:13:39 +03:00
Lampros Smyrnaios c7bfd75973 - Add the "getWorkersInfo" endpoint.
- Improve startup speed, by using a faster remote server to get the host's machine public IP. This also reduces the risk of not being able to get the public IP at all.
- Fix the detection of a different IP for a known worker.
- Improve documentation.
2023-05-23 14:57:15 +03:00
Lampros Smyrnaios d7797eaaf6 Add the "getNumberOfPayloadsForDatasource" endpoint. 2023-04-24 09:54:35 +03:00
Lampros Smyrnaios a1c16ffc19 - Exclude empty and null urls in the assignments.
- Update the "getFullTextsImproved"-call to "getFullTexts", now that the "improved" version is stable.
- Update Gradle.
- Code polishing.
2023-02-16 14:24:47 +02:00
Lampros Smyrnaios 49fefefafd - Refactor the payloads-statistics-code and provide two endpoints: "getNumberOfPayloadsAggregatedByService", which returns the number of payloads aggregated only by the PDF-Aggregation-Service, and the "getNumberOfAllPayloads", which returns the number of all payloads existing in the database, even the ones aggregated in the past, by other pieces of software.
- Update README.md.
- Make sure the docker image is clean-built, by avoiding the use of cache.
2023-02-02 17:58:47 +02:00
Lampros Smyrnaios f89730f196 Improve documentation. 2023-01-27 14:31:07 +02:00
Lampros Smyrnaios 8876089022 - Use Facebook's [**Zstandard**](https://facebook.github.io/zstd/) compression algorithm, which brings very big benefits on compression rate and speed.
- Update the minIO dependency.
- Code polishing.
2023-01-10 13:34:54 +02:00
Lampros Smyrnaios e51ee9dd27 - Add info about the Stats API usage in "README.md".
- Optimize performance in "ParquetFileUtils.createAndLoadParquetDataIntoAttemptTable()" and "ParquetFileUtils.createAndLoadParquetDataIntoPayloadTable()".
- Handle the "EmptyResultDataAccessException" inside "StatsController".
- Optimize gradle's performance.
- Code polishing.
2022-12-15 14:04:22 +02:00
Lampros Smyrnaios 95c38c4a24 - Fix creating the "assignment" table, always in the testDatabase.
- Code polishing.
2022-12-07 14:58:38 +02:00
Lampros Smyrnaios 5819bf584b Update the README.md 2022-02-07 21:11:03 +02:00
Lampros Smyrnaios 48eed20dd8 - Implement the "getAndUploadFullTexts" functionality. In order to access the S3-ObjectStore from one trusted place, the Controller will request the files from the workers and upload them on S3. Afterwards, the workers will delete those files from their local storage. Previously, each worker uploaded its own files.
- Move the "mergeParquetFiles" and "getCutBatchExceptionMessage" methods inside the "FileUtils" class.
- Code cleanup.
2021-11-30 18:23:27 +02:00
Lampros Smyrnaios 983b900da7 - Add the "installAndRun.sh" script.
- Update the README.
- Update the dependencies.
2021-09-09 15:56:37 +03:00
Lampros Smyrnaios 8a4376da9c Initial commit of UrlsController. 2021-03-16 15:25:15 +02:00