Lampros Smyrnaios
3f1e96e9f3
Update README.md
2024-09-19 13:02:11 +02:00
Lampros Smyrnaios
ab18ac5ff8
Add new Prometheus metrics:
...
- averageFulltextsTransferSizeOfWorkerReports
- averageSuccessPercentageOfWorkerReports
2024-06-14 17:27:52 +03:00
Lampros Smyrnaios
0d63165b6d
- Add checks to verify that the Service has active workers before posting "(cancel)Shutdown" requests to all known workers.
...
- Add documentation in README.
2024-06-14 13:32:44 +03:00
Lampros Smyrnaios
8bc5cc35e2
- Optimize writing to the Bulk-import-report file.
...
- Show the IP of the worker which posts a "workerShutdownReport".
- Code polishing.
2024-03-22 17:50:55 +02:00
Lampros Smyrnaios
43ea64758d
- Improve handling of the case where no fulltexts were found, or where none of the found ones were requested from the worker because they had already been retrieved in the past.
...
- Show the number of files with problematic locations (if any of them exist).
- Code polishing.
2024-02-23 12:39:28 +02:00
Lampros Smyrnaios
749172edd8
Add the Jenkins build-status badge to the README.
2024-02-08 19:49:58 +02:00
Lampros Smyrnaios
34d7a143e7
Add/improve documentation.
2024-02-01 14:37:29 +02:00
Lampros Smyrnaios
e8644cb64f
- Optimize the "insertAssignmentsQuery".
...
- Add documentation about the Prometheus Metrics, in README.
- Update Dependencies.
- Code polishing.
2023-07-05 17:10:30 +03:00
Lampros Smyrnaios
0f4b63c4a9
Expose the following statistics as Prometheus metrics and create/update a stats-endpoint for each one:
...
- "numOfPayloadsAggregatedByServiceThroughCrawling"
- "numOfPayloadsAggregatedByServiceThroughBulkImport"
- "numOfPayloadsAggregatedByService"
- "numOfLegacyPayloads"
- "numOfRecordsInspectedByServiceThroughCrawling" (renamed from "numOfInspectedRecords")
2023-06-23 15:22:26 +03:00
Lampros Smyrnaios
b6b1cb08b9
Add instructions on how to run the Prometheus and Grafana docker-containers alongside the UrlsController, by using the same script.
2023-06-23 14:52:07 +03:00
Lampros Smyrnaios
798fa09d68
- Identify and handle a possible Worker-crash, in "UrlsServiceImpl.postReportResultToWorker()".
...
- Add/Improve some log messages.
- Update and cleanup dependencies.
- Code polishing.
2023-06-15 23:19:36 +03:00
Lampros Smyrnaios
e2776c50d0
- Optimize the "WorkerReportResult" and the "ShutdownWorker" requests.
...
- Improve documentation.
2023-06-10 02:31:57 +03:00
Lampros Smyrnaios
5d99a4be5d
- Add the Shutdown Service API documentation.
...
- Improve the BulkImport API documentation.
- Fix markdown in README.
- Update the app's version.
2023-06-06 16:18:38 +03:00
Lampros Smyrnaios
03bf4294b8
- Add documentation about the "BulkImport API" in the README.
...
- Fix a link in README.
- Update dependencies.
2023-05-29 12:13:39 +03:00
Lampros Smyrnaios
c7bfd75973
- Add the "getWorkersInfo" endpoint.
...
- Improve startup speed by using a faster remote server to get the host machine's public IP. This also reduces the risk of not being able to get the public IP at all.
- Fix the detection of a different IP for a known worker.
- Improve documentation.
2023-05-23 14:57:15 +03:00
Lampros Smyrnaios
d7797eaaf6
Add the "getNumberOfPayloadsForDatasource" endpoint.
2023-04-24 09:54:35 +03:00
Lampros Smyrnaios
a1c16ffc19
- Exclude empty and null urls in the assignments.
...
- Update the "getFullTextsImproved"-call to "getFullTexts", now that the "improved" version is stable.
- Update Gradle.
- Code polishing.
2023-02-16 14:24:47 +02:00
Lampros Smyrnaios
49fefefafd
- Refactor the payloads-statistics code and provide two endpoints: "getNumberOfPayloadsAggregatedByService", which returns the number of payloads aggregated only by the PDF-Aggregation-Service, and "getNumberOfAllPayloads", which returns the number of all payloads existing in the database, including those aggregated in the past by other pieces of software.
...
- Update README.md.
- Make sure the docker image is clean-built, by avoiding the use of cache.
2023-02-02 17:58:47 +02:00
Lampros Smyrnaios
f89730f196
Improve documentation.
2023-01-27 14:31:07 +02:00
Lampros Smyrnaios
8876089022
- Use Facebook's [**Zstandard**](https://facebook.github.io/zstd/) compression algorithm, which brings major benefits in compression rate and speed.
...
- Update the minIO dependency.
- Code polishing.
2023-01-10 13:34:54 +02:00
Lampros Smyrnaios
e51ee9dd27
- Add info about the Stats API usage in "README.md".
...
- Optimize performance in "ParquetFileUtils.createAndLoadParquetDataIntoAttemptTable()" and "ParquetFileUtils.createAndLoadParquetDataIntoPayloadTable()".
- Handle the "EmptyResultDataAccessException" inside "StatsController".
- Optimize gradle's performance.
- Code polishing.
2022-12-15 14:04:22 +02:00
Lampros Smyrnaios
95c38c4a24
- Fix the "assignment" table always being created in the testDatabase.
...
- Code polishing.
2022-12-07 14:58:38 +02:00
Lampros Smyrnaios
5819bf584b
Update the README.md
2022-02-07 21:11:03 +02:00
Lampros Smyrnaios
48eed20dd8
- Implement the "getAndUploadFullTexts" functionality. In order to access the S3-ObjectStore from one trusted place, the Controller will request the files from the workers and upload them to S3. Afterwards, the workers will delete those files from their local storage. Previously, each worker uploaded its own files.
...
- Move the "mergeParquetFiles" and "getCutBatchExceptionMessage" methods inside the "FileUtils" class.
- Code cleanup.
2021-11-30 18:23:27 +02:00
Lampros Smyrnaios
983b900da7
- Add the "installAndRun.sh" script.
...
- Update the README.
- Update the dependencies.
2021-09-09 15:56:37 +03:00
Lampros Smyrnaios
8a4376da9c
Initial commit of UrlsController.
2021-03-16 15:25:15 +02:00