
20 Commits

Author SHA1 Message Date
Lampros Smyrnaios bfa76e9484 - Show the full stacktrace in the weird case of a "RestClientException" without an exception-message. Also, in this case, retry immediately, as there is no long-lasting network problem that requires some time between requests, but most probably a random interruption.
- Code polishing.
2023-10-27 17:36:54 +03:00
Lampros Smyrnaios e85282d35b Update the "addReportResultToWorker"-endpoint to check if the given "assignmentsCounter" was handled by that worker, without considering the related full-texts directory, since that may have been deleted in the meantime. 2023-08-31 17:52:52 +03:00
Lampros Smyrnaios b579296ada - Code optimization and polishing.
- Update dependencies.
2023-08-28 16:11:26 +03:00
Lampros Smyrnaios 088cf73b30 - Update dependencies.
- Code optimization and polishing.
2023-07-27 17:46:17 +03:00
Lampros Smyrnaios 952bf7c035 - Update dependencies.
- Code polishing.
2023-07-06 13:22:09 +03:00
Lampros Smyrnaios 9c897b8bf4 - Make use of the new Normalizer utilized by the PublicationRetriever plugin.
- Code polishing.
2023-06-10 02:40:45 +03:00
Lampros Smyrnaios 2aedae2367 - In case a serious error happened while processing the assignments, instead of shutting down immediately, now the Worker shuts down the executor service, registers that it will shut down soon and waits for the Controller to retrieve the already downloaded full-text files.
- In case the full-texts' subdirectory could not be created, then terminate the "handleAssignment" method immediately. No posting of a faulty workerReport to the Controller should happen.
- Code polishing.
2023-05-31 15:25:36 +03:00
Lampros Smyrnaios 0908dcab8a Use a single "restTemplate" object, with the same timeouts (a bit increased from the old requestRestTemplate, to account for a possible overloaded Controller), since we no longer need to wait for hours until the workerReport is processed by the Controller. 2023-05-29 14:15:55 +03:00
Lampros Smyrnaios 9fdaa9503b - Delete any left-over full-texts after 36 hours.
- Upon shutting down, post a "shutdownReport" to the Controller.
2023-05-23 22:22:57 +03:00
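The 36-hour cleanup mentioned in the commit above could look roughly like the following. This is only a sketch: the class name `LeftoverCleaner`, the method `deleteOlderThan`, and the assumption that the files sit directly in one flat directory are all hypothetical, and the Worker's actual implementation is not shown in this log.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.stream.Stream;

class LeftoverCleaner {

    // Deletes the regular files directly inside "dir" whose last-modified time
    // is older than "maxAge" relative to "now". Returns how many were deleted.
    // Assumption: a flat directory; subdirectories are not traversed.
    static int deleteOlderThan(Path dir, Duration maxAge, Instant now) {
        Instant cutoff = now.minus(maxAge);
        int deleted = 0;
        try (Stream<Path> files = Files.list(dir)) {
            for (Path file : (Iterable<Path>) files::iterator) {
                if (Files.isRegularFile(file)
                        && Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
                    Files.delete(file);
                    deleted++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted;
    }
}
```

A scheduled task could call `LeftoverCleaner.deleteOlderThan(fullTextsDir, Duration.ofHours(36), Instant.now())`, where `fullTextsDir` stands for whatever directory holds the delivered files.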
Lampros Smyrnaios 903032f454 - After a WorkerReport has been sent, ask for new assignments immediately. So, the Worker does not have to wait for hours for the Controller to check for duplicate files in the DB, retrieve and upload the full-texts and insert the records to the DB.
- Special care is taken to delete the delivered full-texts as soon as possible.
- Write the workerReport to a json-file, in case something goes wrong, and keep it until the Controller notifies the Worker that the processing was successful.
2023-05-23 22:19:41 +03:00
Lampros Smyrnaios 4d90846261 - In case the specified "controllerIP" is actually a domain-name, find its IP-address, so that a proper IP-to-IP comparison can be performed and the "securityChecks" can pass.
- Increase the "read-timeout" when searching for the host's machine public-IP.
- Update dependencies.
- Code polishing.
2023-05-22 21:25:22 +03:00
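To illustrate the IP-to-IP comparison described in the commit above, here is a minimal sketch using the standard `InetAddress` class. The class and method names (`ControllerAddress`, `resolveToIp`, `isFromController`) are hypothetical, and the real "securityChecks" may work differently.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

class ControllerAddress {

    // Accepts either an IP literal or a domain name and returns the textual IP,
    // or null if the name cannot be resolved.
    static String resolveToIp(String controllerIp) {
        try {
            return InetAddress.getByName(controllerIp).getHostAddress();
        } catch (UnknownHostException e) {
            return null;
        }
    }

    // Compares the remote address of an incoming request against the configured
    // Controller, after resolving the configured value to an IP address.
    static boolean isFromController(String remoteIp, String configuredController) {
        String resolved = resolveToIp(configuredController);
        return resolved != null && resolved.equals(remoteIp);
    }
}
```

When given an IP literal, `InetAddress.getByName` only validates the format and performs no DNS lookup, so the same code path covers both kinds of configuration value.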
Lampros Smyrnaios bd0ead816d Make the time-out value for "restTemplateForReport" scale along with the "maxAssignmentsLimitPerBatch". 2023-05-16 19:08:59 +03:00
Lampros Smyrnaios d5a997ad3d Use restTemplates with different read timeouts depending on the operation. For the assignments-request we need a shorter read timeout than the one we need for the worker-report. This guarantees that the connection does not hang for so long when the Controller crashes before sending the assignments. 2023-04-29 17:24:16 +03:00
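One way a read timeout might scale with the batch limit is a fixed base plus a per-assignment allowance. The sketch below assumes exactly that; the formula, both constants, and the names `ReportTimeout` and `readTimeoutFor` are illustrative guesses, since the commit message does not state the real values.

```java
import java.time.Duration;

class ReportTimeout {

    // Hypothetical constants; the actual values are not given in the commit message.
    static final Duration BASE = Duration.ofMinutes(5);
    static final Duration PER_ASSIGNMENT = Duration.ofMillis(500);

    // Linear scaling: a fixed base plus an allowance for every assignment in the batch.
    static Duration readTimeoutFor(int maxAssignmentsLimitPerBatch) {
        return BASE.plus(PER_ASSIGNMENT.multipliedBy(maxAssignmentsLimitPerBatch));
    }

    public static void main(String[] args) {
        System.out.println(readTimeoutFor(1000));
    }
}
```

The resulting `Duration` could then be passed to whichever HTTP client configuration the Worker uses for the report request.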
Lampros Smyrnaios 0ba15dd31a Increase the "requestReadTimeoutDuration" to 10 hours, as the number of full-texts to be transferred to the Controller keeps getting larger. 2023-04-26 15:08:46 +03:00
Lampros Smyrnaios 839a797124 - Improve performance of full-texts transferring to the Controller, by preloading some bytes for faster response to the Controller's read requests.
- Optimize the directories-creation process by eliminating the additional check for existence, as that check already takes place inside the "mkdirs()" method.
- Remove the obsolete code which used a different base-dir in case the specific assignments' subdirectory failed to be created. Since the user-defined baseDir has already been successfully created upon initialization, any problem in creating subdirectories inside that base-directory will most likely persist even when changing the base directory. Additionally, even if creating the subdirectory under the changed base-directory succeeded, the "FullTextsController.getFullTexts()" method would not use it, resulting in errors.
- Code polishing.
2023-03-08 13:12:17 +02:00
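The point about "mkdirs()" in the commit above can be sketched as follows. `File.mkdirs()` itself returns false when the directory already exists, so a preceding `exists()` call is redundant. The class and method names here (`DirCreator`, `ensureDir`) are hypothetical and not taken from the project.

```java
import java.io.File;

class DirCreator {

    // Returns true if the directory exists when the method returns.
    // mkdirs() is called directly, without a prior exists() check. It returns
    // false both on failure and when the directory is already there, so
    // isDirectory() is consulted only in that case to tell the two apart.
    static boolean ensureDir(String path) {
        File dir = new File(path);
        return dir.mkdirs() || dir.isDirectory();
    }
}
```

In the common case of a new subdirectory, only the `mkdirs()` call runs.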
Lampros Smyrnaios 4da54e7a7d - Show a warning, in case the number of archived files is different from the number of requested files.
- Code polishing.
- Update Gradle.
2023-03-07 16:25:10 +02:00
Lampros Smyrnaios ff4fd3d289 - Show the elapsed time for each assignments-request to be processed by the Worker.
- Update dependencies.
2023-03-02 17:34:44 +02:00
Lampros Smyrnaios 81b61b530f Drastically improve performance by applying a pre-processing algorithm to the assignments-list, which opens some "space" between assignments that have the same domain and, in turn, causes the threads to block less during execution.
(The threads block, due to the mandatory "politeness-delay" before reconnecting with the same domain, in order to avoid overloading the remote servers.)
2023-02-24 23:23:37 +02:00
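The commit above does not spell out its pre-processing algorithm. One simple way to open "space" between same-domain assignments is round-robin interleaving across domain groups, sketched below. This is an assumed technique for illustration, not necessarily the one the Worker uses, and `DomainSpacer`, `domainOf` and `spread` are hypothetical names.

```java
import java.net.URI;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DomainSpacer {

    // Extracts the host part of a URL, or "" if it is missing or unparsable.
    static String domainOf(String url) {
        try {
            String host = URI.create(url).getHost();
            return (host != null) ? host : "";
        } catch (IllegalArgumentException e) {
            return "";
        }
    }

    // Groups the URLs by domain (in first-seen order), then repeatedly takes
    // one URL from each remaining group, so that URLs of the same domain end
    // up as far apart as the number of distinct domains allows.
    static List<String> spread(List<String> urls) {
        Map<String, Deque<String>> byDomain = new LinkedHashMap<>();
        for (String url : urls) {
            byDomain.computeIfAbsent(domainOf(url), k -> new ArrayDeque<>()).add(url);
        }
        List<String> result = new ArrayList<>(urls.size());
        while (!byDomain.isEmpty()) {
            Iterator<Deque<String>> it = byDomain.values().iterator();
            while (it.hasNext()) {
                Deque<String> queue = it.next();
                result.add(queue.poll());
                if (queue.isEmpty()) {
                    it.remove(); // this domain has no URLs left
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(spread(List.of(
                "https://a.org/1", "https://a.org/2", "https://b.org/1", "https://c.org/1")));
    }
}
```

With such an ordering, a thread that picks up the next assignment is less likely to hit a domain that another thread has just contacted, so it waits less often on the politeness-delay.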
Lampros Smyrnaios 84a37bd4b7 - Handle the case where an instance of a urlReport record (having the same id and sourceUrl) may have failed to give a docUrl, due to an error, even if another instance gives the docUrl and the docFile. The absence of that handling could lead to a record-instance being assigned a "fileLocation" which was actually an error-message (comment); as a result, the real "fileLocation" would never have been assigned, so the payload would be lost.
- Improve exceptions-handling.
2023-02-21 15:22:49 +02:00
Lampros Smyrnaios 24b52fba63 - Refactor the initialization and configuration process and Spring-ify the project.
- Update Spring dependency.
2023-01-25 18:33:49 +02:00