UrlsController/src/main/java/eu/openaire/urls_controller/services
Lampros Smyrnaios 8f9786de09 Upgrade the algorithm for finding the previously-found fulltexts, based on their md5hash:
- Use a single query with a list of the fileHashes, instead of thousands of single-md5hash-check queries (run at most 6 in parallel), which require a lot of I/O.
- Avoid checking the same fileHash multiple times when it is related to multiple payloads.
- In case of a database error, avoid completely losing the full-texts of that worker; instead, continue processing them.
2024-03-13 11:28:37 +02:00
..
BulkImportService.java - Add bulk-import support for non-Authoritative data-sources. 2023-09-26 18:02:48 +03:00
BulkImportServiceImpl.java - If we receive an "UnknownHostException" when uploading to the S3ObjectStore, then skip the current full-texts' batch to leave some time for the network to get unstuck. 2023-11-22 15:29:18 +02:00
ShutdownService.java Add the "shutdownAllWorkersGracefully" and "cancelShutdownAllWorkersGracefully" endpoints, in order to be able to shut them down at once and update them, without shutting down the whole Service. So in this case the bulk-import procedures will continue to work. 2023-11-29 16:45:58 +02:00
ShutdownServiceImpl.java - Allow to easily change the port used by workers. 2023-12-19 23:31:42 +02:00
StatsService.java - Improve the "shutdownController.sh" script. 2023-07-27 18:27:48 +03:00
StatsServiceImpl.java - Improve handling of the case when no fulltexts have been found or none of the found ones were requested from the worker, as they were already retrieved in the past. 2024-02-23 12:39:28 +02:00
UrlsService.java - Process the WorkerReports in background Jobs and post the reportResults to the Workers. 2023-05-24 13:52:28 +03:00
UrlsServiceImpl.java Upgrade the algorithm for finding the previously-found fulltexts, based on their md5hash: 2024-03-13 11:28:37 +02:00
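
The latest commit message describes three ideas: de-duplicating the file hashes, checking them all with one query instead of one query per hash, and continuing after a database error. The following is only a minimal sketch of those ideas, not the code in UrlsServiceImpl.java. The class name `HashLookupSketch`, its method names, the `payload` table with its `hash` and `location` columns, and the `lookup` callback (a stand-in for the real database call) are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the repository.
class HashLookupSketch {

    // Collect each file hash once, even when several payloads share it.
    static Set<String> distinctHashes(List<String> payloadHashes) {
        Set<String> hashes = new LinkedHashSet<>();
        for (String hash : payloadHashes) {
            if (hash != null && !hash.isEmpty()) {
                hashes.add(hash);
            }
        }
        return hashes;
    }

    // Build ONE parameterized query covering all hashes.
    // The table and column names are assumed, not taken from the project.
    static String buildQuery(int hashCount) {
        if (hashCount < 1) {
            throw new IllegalArgumentException("At least one hash is required.");
        }
        String placeholders = String.join(", ", Collections.nCopies(hashCount, "?"));
        return "select hash, location from payload where hash in (" + placeholders + ")";
    }

    // Run the lookup through the given callback. If it fails, return an empty
    // map so the caller treats every full-text as new and keeps processing.
    static Map<String, String> findKnownLocations(Set<String> hashes,
                                                  Function<Set<String>, Map<String, String>> lookup) {
        if (hashes.isEmpty()) {
            return Collections.emptyMap();
        }
        try {
            return lookup.apply(hashes);
        } catch (RuntimeException e) {
            System.err.println("Hash lookup failed, continuing without it: " + e.getMessage());
            return Collections.emptyMap();
        }
    }

    public static void main(String[] args) {
        Set<String> hashes = distinctHashes(Arrays.asList("aaa", "bbb", "aaa"));
        System.out.println(buildQuery(hashes.size()));

        // In-memory stand-in for the database: only "aaa" was seen before.
        Map<String, String> known = findKnownLocations(hashes, requested -> {
            Map<String, String> found = new HashMap<>();
            if (requested.contains("aaa")) {
                found.put("aaa", "s3://bucket/aaa.pdf"); // made-up location
            }
            return found;
        });
        System.out.println(known);
    }
}
```

In a real service the `lookup` callback would bind the hashes to the placeholders of the generated query and execute it once. Returning an empty map on failure trades possible re-uploads of already-known files for not dropping the worker's full-texts.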