Continuous Validation Workflow #388

Open
lsmyrnaios wants to merge 18 commits from lsmyrnaios/dnet-hadoop:continuous_validation2 into beta

This is a proof-of-concept for the Continuous Validation Workflow.

It is an Oozie workflow, which takes the following parameters:

  • a parquet file or directory which contains the metadata-records
  • the OpenAIRE-Guidelines according to which the metadata will be validated
  • an output-path where the validation-results will be saved, in JSON format

It uses Spark to distribute the metadata records across many cores and validates them.
Then it collects the results and saves them in JSON files.
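
To make the flow above concrete, here is a minimal plain-Java sketch of the same steps: drop null records, validate each one in parallel, and emit one JSON object per record under a normalized output path. It is not the code from this PR. `parallelStream()` only stands in for Spark's distribution across cores, and the class name, the method names and the `<title>` check are hypothetical placeholders; the real workflow validates against the selected OpenAIRE Guidelines.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class ContinuousValidationSketch {

    // Hypothetical placeholder rule: a record passes if it contains a <title> element.
    // The real workflow applies the chosen OpenAIRE Guidelines profile instead.
    static boolean passesPlaceholderRule(String record) {
        return record.contains("<title>");
    }

    // Make sure the output path ends with "/", so file names can be appended safely.
    static String normalizeOutputPath(String outputPath) {
        return outputPath.endsWith("/") ? outputPath : outputPath + "/";
    }

    // Skip null records, validate the rest, and render one JSON object per record.
    // parallelStream() is a stand-in for Spark; it is not how the PR distributes work.
    static List<String> validateAll(List<String> records) {
        return records.parallelStream()
                .filter(Objects::nonNull)
                .map(r -> "{\"valid\":" + passesPlaceholderRule(r) + "}")
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList(
                "<record><title>A</title></record>", null, "<record/>");
        String outputPath = normalizeOutputPath("/tmp/validation-results");
        System.out.println("Would write to: " + outputPath);
        validateAll(records).forEach(System.out::println);
    }
}
```

In the actual workflow, the input would be read from the parquet path and the results written by Spark rather than printed.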

lsmyrnaios added 18 commits 3 months ago
b71633fd7f - Fix the location of the "input_continuous_validator_parameters.json" file.
- Fix handling the "isSparkSessionManaged" parameter.
- Add the "provided" scope for some dependencies. They do not inherit it from the main pom, since the "version" tag is declared, even though the value is the same as the one from the main pom.
- Code polishing / cleanup.
a2feda6c07 - Fix acquiring the "openaire_guidelines" parameter.
- Use the right Guidelines-profile, depending on the "openaire_guidelines" version.
- Update log-levels.
- Optimize imports.
17282ea8fc - Fix the "is not NULL" checks inside "spark.filter()"
- Make sure the "outputPath" ends with a "/", in any case.
- Fix a parameter-description.
ff47a941f5 - Add the "installProject.sh" script.
- Show the Job-ID or potential deployment-error-logs, right after the deployment of the workflow.
- Code polishing.
8d0ed7d414 Continuous-Validation updates:
- Update the "uoa-validator-engine2" dependency.
- Update the "installProject.sh" script to account for potential conflict with previous builds.
- Add documentation.
cbe7c6734a - Add documentation.
- Code polishing/cleanup.
lsmyrnaios requested review from claudio.atzori 3 months ago

Reviewers

claudio.atzori was requested for review 3 months ago
This pull request has changes conflicting with the target branch.
  • dhp-workflows/dhp-aggregation/src/main/java/eu/dnetlib/dhp/actionmanager/bipmodel/score/deserializers/BipProjectModel.java
  • dhp-workflows/dhp-doiboost/src/test/java/eu/dnetlib/doiboost/orcid/OrcidClientTest.java

Reference: D-Net/dnet-hadoop#388