Commit Graph

18 Commits

Author SHA1 Message Date
Lampros Smyrnaios fb2877dbe8 Upgrade the execution system for the backgroundTasks:
- Submit each task immediately for execution, instead of waiting for a scheduling thread to send all gathered tasks (up to that point) to the ExecutorService (and block until they are finished, before it can start again).
- Hold the Future of each submitted task in a synchronized list, so that the result of each task can be checked at a scheduled time.
- Reduce the CPU time needed to ensure that the Service can shut down, by checking for "active" and "about-to-be-executed" tasks at the same time, instead of relying on an additional check of the "shutdown"-status of each worker to verify that no active task exists.
- Improve the threads' shutdown procedure.
2023-10-09 17:23:59 +03:00
Lampros Smyrnaios 7dc72e242e - Fix missing changes.
- Change the HTTP-method of the renamed "test/uploadParquetFile" endpoint to "POST".
2023-07-24 19:55:37 +03:00
Lampros Smyrnaios a38d6ace79 Code polishing. 2023-05-29 12:21:48 +03:00
Lampros Smyrnaios 74ff31fc64 - Show the workerIPs in the logs.
- Rename the "FullTexts"-files to "BulkImport".
2023-05-29 12:12:08 +03:00
Lampros Smyrnaios cd1fb0af88 - Process the WorkerReports in background Jobs and post the reportResults to the Workers.
- Save the workerReports to json files, until they are processed successfully.
- Show some custom metrics in prometheus.
2023-05-24 13:52:28 +03:00
Lampros Smyrnaios 0ea3e2de24 Add the "shutdownService" and "cancelShutdownService" endpoints. The Controller sends the related requests to the Workers and shuts down gracefully, after all workers have shut down. 2023-05-24 13:42:29 +03:00
Lampros Smyrnaios c2a1b96069 - Rename the mounted "mnt/bulkImport/" directory to "/mnt/bulk_import/".
- Increase the "awaitTermination" timeout for the ExecutorService to 2 minutes.
2023-05-23 21:09:34 +03:00
Lampros Smyrnaios b6e8cd1889 New feature: BulkImport full-text files from compatible datasources. 2023-05-11 03:07:55 +03:00
Lampros Smyrnaios c4670073ae - Add missing refactoring-change.
- Code polishing.
- Update Spring.
2023-02-24 23:49:04 +02:00
Lampros Smyrnaios e11afe5ab2 Improve performance of the hash-checking algorithm by using multithreading. 2022-12-15 18:34:28 +02:00
Lampros Smyrnaios 3e8f9c6074 Update the "UriBuilder.java" to be able to acquire the running port of the server, in case the port-number was initially set to "random" (0). Also make sure we get the "localHostAddress" and not the "localHostName", in case the public IP is not retrievable. 2022-09-12 17:04:05 +03:00
Lampros Smyrnaios ad5dbdde9b - Improve performance when inserting records into the "attempt" table, by splitting the records equally, across more threads.
- Bring back the "UriBuilder", which reports the Controller's URL (IP, PORT, API) in the logs.
- Code cleanup.
2022-02-22 13:54:16 +02:00
Lampros Smyrnaios 6aab1d242b - Improve performance when handling WorkerReports' database insertions, by using parallelism to insert into two different tables at the same time. Also, pre-cache the query-argument-types.
- Update the error-message and counting system, on a partial-insertion event.
2022-02-04 14:48:22 +02:00
Antonis Lempesis bf26bf955f springified project 2022-01-30 22:14:52 +02:00
Lampros Smyrnaios 33ba3e8d91 - Avoid getting and uploading (to S3) full-texts which were already uploaded by previous assignments-batches.
- Fix not updating the fileLocation with the s3Url for records which share the same full-text.
- Set only one delete-order for the files of each assignments-batch, not one (or more, by mistake) per zip-batch.
- Set the HttpStatus to "204 - NO_CONTENT", when no assignments are available to be returned to the Worker.
- Fix not unlocking the "dataBaseLock" in case of a "dataBase-connection"-error, in "addWorkerReport()".
- Improve some log-messages.
- Change the log-level for the "S3-bucket already exists" message.
- Update Gradle.
- Optimize imports.
- Code cleanup.
2021-12-21 15:55:27 +02:00
Lampros Smyrnaios 48eed20dd8 - Implement the "getAndUploadFullTexts" functionality. In order to access the S3-ObjectStore from one trusted place, the Controller will request the files from the workers and upload them to S3. Afterwards, the workers will delete those files from their local storage. Previously, each worker uploaded its own files.
- Move the "mergeParquetFiles" and "getCutBatchExceptionMessage" methods inside the "FileUtils" class.
- Code cleanup.
2021-11-30 18:23:27 +02:00
Lampros Smyrnaios d100af35d0 - Implement the "getUrls" and "addWorkerReport" endpoints with full database-handling.
- Add connectivity with an Impala-database and create a dedicated Controller for future statistics-requests.
- Optimize the "getTestUrls"-endpoint.
- Disable the "reportCurrentTime()" scheduled-task.
- Update dependencies and bump the project's version to '1.0.0-SNAPSHOT'.
- Set the logging-appender to "File".
- Code cleanup.
2021-11-09 23:59:27 +02:00
Lampros Smyrnaios 8a4376da9c Initial commit of UrlsController. 2021-03-16 15:25:15 +02:00