- Delete the assignments-batch full-texts after the whole procedure (for each assignments-batch) is finished, either successfully or not.
- Do not check for remaining files when the Worker shuts down, since the files are deleted anyway, even when their handling fails.
The full-texts do not need to be kept in case of an error, since the Controller will reassign the non-downloaded id-url records to some worker (possibly a different one), where the files will be downloaded and handled again.
Also, change "assignmentsNumsHandled" to hold data only for assignments that were handled all the way through, including the upload of the full-texts from the Controller and the insertion of the WorkerReport into the database.
- Apply the checks for the "totalZipBatches" param before the Worker-related checks in "FullTextsController.getMultipleFullTexts()".
- Show the Heap-sizes in megabytes.
- Clear the "ConnSupportUtils.domainsWithConnectionData" data-structure, after each batch.
- Move the code for handling the "CookieStore" inside the "PublicationsRetrieverPlugin", as it is more related to that.
- Avoid running the "deleteHandledAssignmentsFullTexts()" scheduled task on application start.
- Optimize assignment of "requestUrl".
- Make the scheduled tasks clearer by using "fixedDelay" instead of "fixedRate", to signify that the specified time is counted from the moment the previous run finishes (even though, without enabling "Async", there is no "danger" of them running in parallel).
- Code cleanup.
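
The "fixedDelay" semantics mentioned above can be illustrated with the JDK's own scheduler, where "scheduleWithFixedDelay" counts the delay from the end of the previous run, just like Spring's "fixedDelay". This is only a minimal sketch: the class "FixedDelayDemo" and its method are hypothetical and are not part of the Worker's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class, used only to illustrate the "fixed delay" semantics.
class FixedDelayDemo {

    // Runs a task lasting "taskMillis" with a fixed delay of "delayMillis"
    // and returns the start-time (in ms) of each of the first "runs" executions.
    static List<Long> startTimesWithFixedDelay(long taskMillis, long delayMillis, int runs) {
        List<Long> startTimes = new CopyOnWriteArrayList<>();
        CountDownLatch done = new CountDownLatch(runs);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

        scheduler.scheduleWithFixedDelay(() -> {
            if (startTimes.size() >= runs)
                return; // Ignore any run which starts after we have collected enough data.
            startTimes.add(System.nanoTime() / 1_000_000);
            sleepQuietly(taskMillis); // Simulate the work of the scheduled task.
            done.countDown();
        }, 0, delayMillis, TimeUnit.MILLISECONDS);

        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return new ArrayList<>(startTimes);
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a 100 ms task and a 50 ms delay, consecutive runs start roughly 150 ms apart. With "scheduleAtFixedRate" and a 50 ms period, the same task would be expected to start again as soon as the previous run ends, since the period is measured between start-times.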
- Use the "InputStreamResource" also in the "get(single)FullText"-endpoint, in order to avoid loading a big full-text file into memory.
- Decrease the system-reserved memory by 128 MB.
- Fix path-variable regexes for "getFullText"-endpoint.
- Optimize imports.
- Code cleanup.
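
The idea behind serving a full-text from a stream, instead of loading it whole, can be sketched with plain JDK streams. Spring's "InputStreamResource" wraps an InputStream so that the body is read from it while the response is written. The class "FullTextStreaming" below is a hypothetical stand-in for that behavior and is not the endpoint's actual code.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, illustrating chunked copying of a full-text file.
class FullTextStreaming {

    // Copies the file to the output in fixed-size chunks, so the memory used
    // is bounded by the buffer size, no matter how big the file is.
    // Returns the number of bytes written.
    static long streamFile(Path file, OutputStream out) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

The alternative, "Files.readAllBytes(file)", would allocate an array as large as the file itself, which is what this change aims to avoid.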
- Set increased lower and upper limits for the Java Heap Size.
- Update the "ServerBaseURL" to the Public IP Address of the machine which is running the app.
- Improve two log-messages.
- Implement the "getFullTexts"-endpoint, which returns the requested full-texts in a zip file.
- Implement the "getFullText"-endpoint, which returns the requested full-text.
- Implement the "getHandledAssignmentsCounts"-endpoint, which returns the assignments-numbers that were handled by that Worker.
- Make sure each urlReport has the same "Date" for a given assignments-number. Also, make sure the "size" and "hash" have a "null" value, in case the full-text was not found.
- Check and log thread-pool shutdown errors.
- Write the stack-trace to the error-logs, instead of stderr.
- Update SpringBoot dependency.
- Change log levels.
- Code cleanup.
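
Packing several full-texts into one zip file, as the "getFullTexts"-endpoint does, can be sketched with "java.util.zip". This is a minimal illustration under assumptions: the class "FullTextsZipper", its methods and the in-memory map of file-names to contents are hypothetical, and the real endpoint most likely reads the files from disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not the Worker's actual zipping code.
class FullTextsZipper {

    // Creates a zip archive with one entry per (fileName -> content) pair.
    static byte[] zip(Map<String, byte[]> fullTexts) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> fullText : fullTexts.entrySet()) {
                zos.putNextEntry(new ZipEntry(fullText.getKey()));
                zos.write(fullText.getValue());
                zos.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray(); // The archive is complete once the zip-stream is closed.
    }

    // Lists the entry-names of a zip archive, in the order they were stored.
    static List<String> entryNames(byte[] zipBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null)
                names.add(entry.getName());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```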
--Ask the user to give the "workerId" and the "controllerBaseUrl".
--Make sure the "libs" directory is created, if it does not exist.
--Make sure the "unzip" package is installed.
- Change the data-type of the "UrlReport.status" to be "enum StatusType", in order to increase consistency and comparability.
- Update the guidelines in the README.
- Switch the "AssignmentsHandler.askForTest" to "false".
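
The benefit of using an enum for the status can be shown with a short sketch. The constants of "StatusType" are not listed in this log, so the two values below are hypothetical, as is the "UrlReportSketch" class.

```java
import java.util.List;

// Hypothetical, simplified stand-in for the "UrlReport" class.
class UrlReportSketch {

    // Hypothetical constants; the actual values of "StatusType" may differ.
    enum StatusType { ACCESSIBLE, NON_ACCESSIBLE }

    final String url;
    final StatusType status;

    UrlReportSketch(String url, StatusType status) {
        this.url = url;
        this.status = status;
    }

    // An enum can be compared with "==", and a misspelled constant fails at compile-time,
    // while a misspelled status-string would go unnoticed until runtime.
    static long countWithStatus(List<UrlReportSketch> reports, StatusType wanted) {
        return reports.stream().filter(report -> report.status == wanted).count();
    }
}
```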
- Get the size and the hash of a docFile which was previously downloaded by another ID in that batch.
- Reset the "AssignmentHandler.urlReports" list after posting the results to the Controller.
- Enhance logging and comments.
- Add more guidelines in the README.
- Disable the scheduled test-live job.
- Code cleanup.
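
Getting the size and the hash of an already-downloaded docFile can be sketched as follows. This is one possible approach and not necessarily the Worker's own: the class "DocFileInfoCache", the per-batch map and the choice of MD5 as the hash algorithm are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: when a second ID in the batch points to a docFile which is
// already downloaded, its size and hash are looked up, instead of being left empty.
class DocFileInfoCache {

    static final class Info {
        final long size;
        final String hash;
        Info(long size, String hash) { this.size = size; this.hash = hash; }
    }

    // Keyed by the file's location; meant to be cleared after each batch.
    private final Map<Path, Info> infoByLocation = new HashMap<>();

    Info getInfo(Path docFile) {
        return infoByLocation.computeIfAbsent(docFile, DocFileInfoCache::compute);
    }

    void clear() {
        infoByLocation.clear();
    }

    // Reads the file in chunks, feeding the digest while reading. MD5 is an assumption here.
    static Info compute(Path docFile) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            try (InputStream in = new DigestInputStream(Files.newInputStream(docFile), md)) {
                byte[] buffer = new byte[8192];
                while (in.read(buffer) != -1) { /* Each read updates the digest. */ }
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest())
                hex.append(String.format("%02x", b));
            return new Info(Files.size(docFile), hex.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```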