- Increase the "numOfThreadsPerBulkImportProcedure" to 6.
- Fix Bulk import not working from a second-level subdirectory; the report-subDirectory was not created.
- Fix not returning the bulk-import-report as "application/json".
- Add useful messages for missing parameters.
- Change the HTTP-method for the "bulkImportFullTexts" endpoint to "POST".
- Show a structured json-response for the "bulkImportFullTexts" endpoint.
- Fix uncommon date-format.
- Remove single quotes from json-report, since they are returned as bytes, not characters.
- Optimize the generation of the json-bulkImport-report.
After that enchantment, each worker could request multiple assignment-batches, before its previous batches were processed by the Controller. This means that for each batch that was processed, the Controller was deleting from the "assignment" table, all the assignments (-batches) delivered to the Worker that brought that batch, even though the "attempt" and "payload" records for the rest of the batches were not inserted in the DB yet. So in a new assignments-batch request, the same publications that were already under processing, were delivered to the same or other Workers.
Now, for each finished batch, only the assignments of that batch are deleted from the "assignment" table.
- Store each worker's info in a hash-table, in order to efficiently know if we need to create new hdfs subdirectories. Also, this will help to issue "shutdown" requests to the workers in the future, as well as to know which worker has shutdown.
- Catch the more general "Exception", inside "FileUtils.mergeParquetFiles()", in order to be certain that the "SQLException" can also be caught.
- Code polishing.
One side effect of using the parquet-files, is that the timestamps are now BIGDECIMAL numbers, instead of "Timestamp" objects, but, converting them to such objects is pretty easy, if we ever need to do it.
- Code polishing.
- Fix a "@Value" annotation inside "FileUtils.java".
- Add a new database and show its name along with the initial's name in the logs.
- Code cleanup and improvement.
- Optimize the "findAssignmentsQuery" by using an inner limit (larger than the outer).
- Save a ton of time from inserting the assignments into the database, by using a temporal table to hold the new assignments, in order for them to be easily accessible both from the Controller (which processes them and send them to the Worker) and the database itself, in order to "import" them into the "assignment"-table.
- Replace the "Date" with "Timestamp", in order to hold more detailed information.
- Code cleanup.
- Change the data-type of the "UrlReport.status" to be "enum StatusType", in order to increase consistency and comparability.
- Change the "Date" datatype in "Payload" to have the SQL's version.
- Fix the project's name inside "settings.gradle".
- Code cleanup.
In order to match the database, now we have a list of Assignments sent through the AssignmentResponse, instead of a single Assignment having a list of tasks.
- Cleanup the members of the "Payload" model (also prepare for database integration).