Commit Graph

31 Commits

Author SHA1 Message Date
Lampros Smyrnaios 44459c8681 - Rename "ImpalaConnector.java" to "DatabaseConnector.java".
- Update dependencies.
- Code polishing.
2023-08-23 16:55:23 +03:00
Lampros Smyrnaios a524375656 - Create the HDFS-subDirs before generating "callableTasks" for creating and uploading the parquetFiles.
- Delete gradle .zip file after installation.
2023-08-04 15:30:41 +03:00
Lampros Smyrnaios cde6063d63 - Update dependencies.
- Code polishing.
2023-07-25 12:12:56 +03:00
Lampros Smyrnaios e8644cb64f - Optimize the "insertAssignmentsQuery".
- Add documentation about the Prometheus Metrics, in README.
- Update Dependencies.
- Code polishing.
2023-07-05 17:10:30 +03:00
Lampros Smyrnaios d52601e819 - Add handling for the "EmptyResultDataAccessException" in "UrlsServiceImpl.getAssignments()", which is thrown when no assignments are returned from the query.
- Code polishing.
2023-06-22 12:39:11 +03:00
Lampros Smyrnaios b499209ce3 - Move the Prometheus and grafana configuration in a dedicated directory and docker-compose file.
- Add documentation about setting-up prometheus and grafana.
2023-05-15 18:52:31 +03:00
Lampros Smyrnaios 1b14a7e554 - Add profiles to docker-services to selectively run the additional "Prometheus" and "Grafana" services or not.
- Update Gradle.
2023-04-22 16:50:33 +03:00
Lampros Smyrnaios 68759e3023 Update dependencies. 2023-04-20 18:57:16 +03:00
Lampros Smyrnaios c2b17163cd Automatically show the Controller's logs after the docker-container starts running and the status is shown. 2023-04-11 11:59:10 +03:00
Lampros Smyrnaios 4dc34429f8 - Increase the waiting-time before checking the docker containers' status, in order to catch configuration-crashes.
- Code polishing.
2023-04-10 22:28:53 +03:00
Lampros Smyrnaios 495d5de19b - Automatically get the status of the docker containers after 30 secs of their initialization.
- Add an error-handling in "installAndRun.sh"
- Update dependencies.
2023-03-27 19:43:15 +03:00
Lampros Smyrnaios e975bec911 - Add Prometheus and Grafana which help measuring various metrics for the Controller's health and performance.
- Fix Docker config still using the old (now removed) "application.properties" file.
- Simplify the process of building and running the docker image; Now we use docker compose to run the Controller, along with the Prometheus and Grafana. Also, now it is not requested from the user to login and push the image (this may change in the future).
2023-03-21 16:46:33 +02:00
Lampros Smyrnaios 38643c76a3 - Code polishing.
- Update Gradle.
2023-03-07 16:55:41 +02:00
Lampros Smyrnaios b7f6056032 - Improve an error-message.
- Update Gradle.
2023-02-21 15:42:07 +02:00
Lampros Smyrnaios a1c16ffc19 - Exclude empty and null urls in the assignments.
- Update the "getFullTextsImproved"-call to "getFullTexts", now that the "improved" version is stable.
- Update Gradle.
- Code polishing.
2023-02-16 14:24:47 +02:00
Lampros Smyrnaios 49fefefafd - Refactor the payloads-statistics-code and provide two endpoints: "getNumberOfPayloadsAggregatedByService", which returns the number of payloads aggregated only by the PDF-Aggregation-Service, and the "getNumberOfAllPayloads", which returns the number of all payloads existing in the database, even the ones aggregated in the past, by other pieces of software.
- Update README.md.
- Make sure the docker image is clean-built, by avoiding the use of cache.
2023-02-02 17:58:47 +02:00
Lampros Smyrnaios 0209d24068 - Change the parquet compression from "Snappy" to "Gzip", as there is an unhandleable exception when the app is running inside a Docker Container and uses the "Snappy" compression.
- Code polishing.
2022-12-08 16:28:41 +02:00
Lampros Smyrnaios 577ea983e8 - Improve some log-messages.
- Set some optimization settings for gradle.
- Fix error-handling in "installAndRun.sh".
- Update dependencies.
2022-11-30 16:28:39 +02:00
Lampros Smyrnaios 6a03103b79 Update dependencies. 2022-11-10 16:50:21 +02:00
Lampros Smyrnaios 5e4fad2479 - Change the fileNames' structure in the S3-ObjectStore.
- Update dependencies.
2022-04-01 19:24:04 +03:00
Lampros Smyrnaios 48670f3399 - Show the percentage of the "NumFullTextsFound", in the logs.
- Update dependencies.
2022-03-28 14:29:31 +03:00
Lampros Smyrnaios 88acaae20f - Replace the "numFullTextUrlsFound"-counter with "numFullTextsFound"-counter to reflect the end result of the actually available full-texts (which were downloaded by the Worker).
- Optimize the gather-fileNames loop.
- Improve a message in "installAndRun.sh"
2022-02-23 17:40:06 +02:00
Lampros Smyrnaios 71f6b46130 - In case of an error when creating the "current_assignment" table (e.g out of memory in the backend database server), check for partial-creation and drop it. Also, in any case, before we drop this table, now check if it exists firsts (in general it should always exist, unless the creation results in an error and the table was not created at all).
- Fix an error-message.
- Update dependencies.
- Code cleanup.
2022-02-14 12:36:00 +02:00
Lampros Smyrnaios b206114144 - Allow the user to build, push and run the App in Docker, straight though the "installAndRun.sh" script.
- Re-add the logback-spring configuration.
- Change the docker-app name.
2022-02-04 15:49:56 +02:00
Lampros Smyrnaios ff46839158 Fix not prioritizing the gradle version defined inside the "installAndRun.sh" script. 2022-01-21 15:45:12 +02:00
Lampros Smyrnaios 82bf11b9b3 - Workaround a bug of Impala-JDBC-Driver, when creating insert-prepared-statements.
- Update dependencies.
2021-12-24 00:25:50 +02:00
Lampros Smyrnaios 33ba3e8d91 - Avoid getting and uploading (to S3), full-texts which are already uploaded by previous assignments-batches.
- Fix not updating the fileLocation with the s3Url for records which share the same full-text.
- Set only one delete-order for each assignments-batch-files, not one (or more, by mistake) per zip-batch.
- Set the HttpStatus to "204 - NO_CONTENT", when no assignments are available to be returned to the Worker.
- Fix not unlocking the "dataBaseLock" in case of a "dataBase-connection"-error, in "addWorkerReport()".
- Improve some log-messages.
- Change the log-level for the "S3-bucket already exists" message.
- Update Gradle.
- Optimize imports.
- Code cleanup.
2021-12-21 15:55:27 +02:00
Lampros Smyrnaios dea257b87f - Fix a bug, which caused the get-full-texts request to fail, because of the wrong "requestAssignmentsCounter".
- Fix a bug, which caused multiple workers to get assigned the same batch-counter, while the assignment-tasks where different.
- Set a max-size limit to the amount of space the logs can use. Over that size, the older logs will be deleted.
- Show the error-message returned from the Worker, when a getFullTexts-request fails.
- Improve some log-messages.
- Update dependencies.
- Code cleanup.
2021-12-06 20:18:30 +02:00
Lampros Smyrnaios 780ed15ce2 - Fix a "databaseLock" bug, which could cause both the payload and attempt inserts and the "mergeParquetFiles" to fail, as the inserts could be executed concurrently with tables-compaction.
- Fix the "null" representation of an "unknown" payload-size in the database.
- Remove the obsolete thread-locking for the "CreateDatabase" operation. This code is guaranteed to run BEFORE any other operation in the database.
- Implement the "handlePreparedStatementException" and "closeConnection" methods.
- Improve error-logs.
- Update dependencies.
- Code cleanup.
2021-11-30 13:26:19 +02:00
Lampros Smyrnaios 0540820817 -Update "installAndRun.sh":
-- Check if a gradle installation with the given version already exists, before downloading and installing gradle.
-- Make sure the "unzip" package is installed, before trying unzipping the gradle package.
- Update the logs' fileName.
2021-10-14 02:46:33 +03:00
Lampros Smyrnaios 983b900da7 - Add the "installAndRun.sh" script.
- Update the README.
- Update the dependencies.
2021-09-09 15:56:37 +03:00