16 Commits

Author SHA1 Message Date
Lampros Smyrnaios 9b95eebb6c - Remove the obsolete "parenthesis" and "increasing duplicate-num" from the full-texts' names, before sending them to the S3-Object-Store. They now end with the "file-hash", so it is guaranteed that they will be unique. The Worker continues to produce the previous kind of names, without any disturbance.
- Improve logging.
- Update MinIO dependency.
2022-04-11 21:15:22 +03:00
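
The renaming described in this commit can be pictured with a minimal sketch. It assumes the Worker produces names such as "id.pdf" or "id(1).pdf" and that the stored name becomes the id followed by the file-hash. The class name, the method name and the "::" separator are hypothetical and are not taken from the project.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the class, the method and the "::" separator are illustrative assumptions.
class FullTextNameSketch {

    // "<id>", then an optional "(<n>)" duplicate counter, then ".<extension>".
    private static final Pattern WORKER_NAME =
            Pattern.compile("^(.+?)(?:\\((\\d+)\\))?\\.([A-Za-z0-9]+)$");

    // Drops the "(n)" counter and appends the file-hash, so the name stays unique.
    static String toS3Name(String workerFileName, String fileHash) {
        Matcher m = WORKER_NAME.matcher(workerFileName);
        if (!m.matches()) {
            throw new IllegalArgumentException("Unexpected full-text name: " + workerFileName);
        }
        String id = m.group(1);
        String extension = m.group(3);
        return id + "::" + fileHash + "." + extension;
    }

    public static void main(String[] args) {
        // Two full-texts of the same id get different names because their hashes differ.
        System.out.println(toS3Name("id.pdf", "0cc175b9"));
        System.out.println(toS3Name("id(1).pdf", "d41d8cd9"));
    }
}
```

Because the counter is only stripped on the way to the S3-Object-Store, the Worker-side names can stay as they are, which matches the note above about the Worker being undisturbed.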
Lampros Smyrnaios a81ed3c60f - Add an "isTestEnvironment"-switch, which makes it easier to work with production and test databases.
- In case the Worker cannot be reached during a full-texts' batch request, abort the rest of the batches.
- Fix memory leaks when unzipping the batch-zip-file.
- Add explanatory comments for picking the database related to a full-text file.
2022-04-08 17:39:45 +03:00
Lampros Smyrnaios a23c918a42 - Fix a "@JsonProperty" annotation inside "Payload.java".
- Fix a "@Value" annotation inside "FileUtils.java".
- Add a new database and show its name along with the initial's name in the logs.
- Code cleanup and improvement.
2022-04-05 00:01:44 +03:00
Lampros Smyrnaios 1111c850b9 - Add support for more than one full-text per id. Allow recognizing fileName additions: "id(1).pdf", "id(2).pdf", etc.
- Fix not giving the databaseName in the "ImpalaController.get10PublicationIdsTest()".
- Improve consistency in the "maxAttemptsPerRecord" value, among different threads. Also, reduce the value-increase by one.
- Check if the tableName string is empty, in the "mergeParquetFiles".
- Improve error-logging.
- Set some local variables to "final", optimizing code-execution by the JVM.
2022-02-07 13:57:09 +02:00
Lampros Smyrnaios be4898e43e Bug fixes and improvements:
- Fix an NPE, when the "getTestUrls"-endpoint is called. It was thrown because of an absent per-thread initialization of some thread-local variables.
- Fix JdbcTemplate error when querying the "getFileLocationForHashQuery".
- Fix the "S3ObjectStore.isLocationInStore" check.
- Fix not catching/handling some exceptions.
- Fix/improve log-messages.
- Optimize the "getFileLocationForHashQuery" to return only the first row. Since the latest change, without this optimization, the query-result would throw an exception, so the same-hash cases would not be handled.
- Optimize the "ImpalaConnector.databaseLock.lock()" positioning.
- Update the "getTestUrls" api-path.
- Optimize list-allocation.
- Re-add the info-message about the successful emptying of the S3-bucket.
- Code cleanup.
2022-02-02 20:19:46 +02:00
Antonis Lempesis 35966b6f6e finishing touches 2022-02-01 16:57:28 +02:00
Antonis Lempesis e9bede5c45 more fixes 2022-02-01 02:08:02 +02:00
Antonis Lempesis 9ac10fc4b3 fixed Value annotations 2022-01-31 14:01:26 +02:00
Antonis Lempesis 1c82088a7c fixed Value annotations 2022-01-31 13:49:14 +02:00
Antonis Lempesis 6dde8c0faa finished merge 2022-01-31 04:17:16 +02:00
Antonis Lempesis bf26bf955f springified project 2022-01-30 22:14:52 +02:00
Lampros Smyrnaios d0ab42e4fa - Change the scheme of the file-location URI.
- Move the old and the current database names in the "application.properties" file.
- Improve logging.
2022-01-28 07:24:42 +02:00
Lampros Smyrnaios ab99bc6168 - Make sure the temp table "current_assignment", left over from a cancelled previous execution, is dropped and purged on startup.
- Improve logging.
- Code cleanup.
2022-01-19 01:37:47 +02:00
Lampros Smyrnaios 33ba3e8d91 - Avoid getting and uploading (to S3), full-texts which are already uploaded by previous assignments-batches.
- Fix not updating the fileLocation with the s3Url for records which share the same full-text.
- Set only one delete-order for the files of each assignments-batch, instead of one (or more, by mistake) per zip-batch.
- Set the HttpStatus to "204 - NO_CONTENT", when no assignments are available to be returned to the Worker.
- Fix not unlocking the "dataBaseLock" in case of a "dataBase-connection"-error, in "addWorkerReport()".
- Improve some log-messages.
- Change the log-level for the "S3-bucket already exists" message.
- Update Gradle.
- Optimize imports.
- Code cleanup.
2021-12-21 15:55:27 +02:00
Lampros Smyrnaios 780ed15ce2 - Fix a "databaseLock" bug, which could cause both the payload and attempt inserts and the "mergeParquetFiles" to fail, as the inserts could be executed concurrently with tables-compaction.
- Fix the "null" representation of an "unknown" payload-size in the database.
- Remove the obsolete thread-locking for the "CreateDatabase" operation. This code is guaranteed to run BEFORE any other operation in the database.
- Implement the "handlePreparedStatementException" and "closeConnection" methods.
- Improve error-logs.
- Update dependencies.
- Code cleanup.
2021-11-30 13:26:19 +02:00
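
The "databaseLock" fix in this commit can be illustrated with a minimal sketch. It assumes a single shared lock that both the inserts and the table-compaction must hold, released in a "finally" block so that a failure cannot leave it locked. The class, the methods and the in-memory list standing in for a table are hypothetical and are not the project's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the database access code; all names are illustrative.
class DatabaseLockSketch {

    // One shared lock for every operation that touches the tables.
    static final ReentrantLock databaseLock = new ReentrantLock();

    // Non-thread-safe list standing in for a real table.
    static final List<String> table = new ArrayList<>();

    static void insertRecord(String record) {
        databaseLock.lock();
        try {
            if (record == null) {
                throw new IllegalArgumentException("record must not be null");
            }
            table.add(record);
        } finally {
            databaseLock.unlock(); // released even when the insert fails
        }
    }

    // Stand-in for the compaction: rewrites the table while no insert can run.
    static int mergeFiles() {
        databaseLock.lock();
        try {
            List<String> compacted = new ArrayList<>(table);
            table.clear();
            table.addAll(compacted);
            return table.size();
        } finally {
            databaseLock.unlock();
        }
    }
}
```

With both paths taking the same lock, an insert can never interleave with a compaction, and the "finally" blocks cover the error case in which the lock would otherwise stay held.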
Lampros Smyrnaios d100af35d0 - Implement the "getUrls" and "addWorkerReport" endpoints with full database-handling.
- Add connectivity with an Impala-database and create a dedicated Controller for future statistics-requests.
- Optimize the "getTestUrls"-endpoint.
- Disable the "reportCurrentTime()" scheduled-task.
- Update dependencies and bump project's version to '1.0.0-SNAPSHOT'.
- Set the logging-appender to "File".
- Code cleanup.
2021-11-09 23:59:27 +02:00