
Continuous-Validation updates:

- Update the "uoa-validator-engine2" dependency.
- Update the "installProject.sh" script to account for potential conflict with previous builds.
- Add documentation.
Lampros Smyrnaios 2024-01-25 18:23:14 +02:00
parent eee085c104
commit 8d0ed7d414
4 changed files with 21 additions and 3 deletions


@@ -1,3 +1,11 @@
# Continuous Validation
This module is responsible for deploying an **Oozie Workflow** (on the desired cluster), which executes a **Spark** action.<br>
This action takes the HDFS path of a directory of parquet files containing metadata records and applies the validation process to all of them, in parallel. It then writes the results, in JSON format, to the given output directory.<br>
The validation process is powered by the [**uoa-validator-engine2**](https://code-repo.d4science.org/MaDgIK/uoa-validator-engine2) software.<br>
### Install and run
Run the **./installProject.sh** script and then the **./runOozieWorkflow.sh** script.<br>
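
The two steps above can be chained so that the workflow is deployed only when the installation succeeds. This is a minimal sketch, not part of the module: `run_in_order` is a hypothetical helper, and it assumes both scripts are executable and live in the current directory.

```shell
# Hypothetical helper (not part of this module): run each given command in
# order and stop at the first one that exits with a non-zero status.
run_in_order() {
  for step in "$@"; do
    echo "Running: ${step}"
    "${step}" || { echo "Step failed: ${step}" >&2; return 1; }
  done
}

# Assumed usage, from this module's directory:
# run_in_order ./installProject.sh ./runOozieWorkflow.sh
```

Stopping at the first failure avoids deploying a workflow that was built from a stale or broken installation.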
[...]


@@ -1,9 +1,17 @@
# Install the whole "dnet-hadoop" project.
# Delete this module's previous build-files in order to avoid any conflicts.
rm -rf target/
# Go to the root directory of this project.
cd ../../
# Select the build profile.
DEFAULT_PROFILE='' # It's the empty profile.
NEWER_VERSIONS_PROFILE='-Pscala-2.12'
CHOSEN_MAVEN_PROFILE=${DEFAULT_PROFILE}
# Install the project.
mvn clean install -U ${CHOSEN_MAVEN_PROFILE} -Dmaven.test.skip=true
# We skip the tests for all modules, since they take a long time to run and some of them fail.
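
The script above selects the profile by editing the `CHOSEN_MAVEN_PROFILE` assignment by hand. As a possible alternative, the choice could be passed as an argument. The following is a sketch only: `choose_maven_profile` and the `default`/`newer` keywords are hypothetical names, while the two profile values are the ones defined in the script.

```shell
# Hypothetical helper (not part of the commit): map a keyword to the
# Maven profile flag used by the install script.
choose_maven_profile() {
  case "${1:-default}" in
    default) printf '%s' '' ;;             # The empty profile.
    newer)   printf '%s' '-Pscala-2.12' ;; # The "newer versions" profile.
    *)       echo "Unknown profile: $1" >&2; return 1 ;;
  esac
}

# Assumed usage; the substitution is left unquoted on purpose, so that the
# empty profile expands to no argument at all:
# mvn clean install -U $(choose_maven_profile "${1:-}") -Dmaven.test.skip=true
```

Rejecting unknown keywords keeps a mistyped profile name from silently falling back to the default build.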


@@ -1,12 +1,14 @@
-# This script deploys and runs the oozie workflow.
+# This script deploys and runs the oozie workflow on the cluster, defined in the "~/.dhp/application.properties" file.
# Select the build profile.
DEFAULT_PROFILE='' # It's the empty profile.
NEWER_VERSIONS_PROFILE='-Pscala-2.12'
CHOSEN_MAVEN_PROFILE=${DEFAULT_PROFILE}
# Build and deploy this module.
mvn clean package -U ${CHOSEN_MAVEN_PROFILE} -Poozie-package,deploy,run \
-Dworkflow.source.dir=eu/dnetlib/dhp/continuous_validator
# Show the Oozie-job-ID.
echo -e "\n\nShowing the contents of \"extract-and-run-on-remote-host.log\":\n"
cat ./target/extract-and-run-on-remote-host.log


@@ -207,7 +207,7 @@
<dependency>
<groupId>eu.dnetlib</groupId>
<artifactId>uoa-validator-engine2</artifactId>
-<version>0.9.3</version>
+<version>2.0.0-SNAPSHOT</version>
</dependency>
<dependency>