Graph construction for IIS [PROD NEW]
IIS 30

set blacklist of funder nsPrefixes
    nsPrefixBlacklist = conicytf____,dfgf________,gsrt________,innoviris___,miur________,rif_________,rsf_________,sgov________,sfrs________

set the path of the map defining the relations id mappings
    idMappingPath = /data/maps/fct_map.json

Set the path containing the PROD AGGREGATOR graph
    aggregatorGraphPath = /tmp/prod_inference/graph/00_graph_aggregator

Set the target path to store the RAW graph
    rawGraphPath = /tmp/prod_inference/graph/01_graph_raw

Set the target path to store the CLEANED graph
    cleanedFirstGraphPath = /tmp/prod_inference/graph/02_graph_clean_first

Set the target path to store the DEDUPED graph
    dedupGraphPath = /tmp/prod_inference/graph/03_graph_dedup

Set the target path to store the CONSISTENCY graph
    consistentGraphPath = /tmp/prod_inference/graph/04_graph_consistent

Set the target path to store the CLEANED graph
    cleanedGraphPath = /tmp/prod_inference/graph/05_graph_cleaned

Set the dedup orchestrator name
    dedupConfig = dedup-similarity-result-decisiontree-v2

declares the ActionSet ids to promote in the RAW graph
    actionSetIdsRawGraph = scholexplorer-dump,doiboost,orcidworks-no-doi,datacite

Set the IS lookup service address
    isLookUpUrl = http://services.openaire.eu:8280/is/services/isLookUp?wsdl

wait configurations

reuse cached ODF claims from the PROD aggregation system
    reuseODFClaims = true

reuse cached OAF claims from the PROD aggregation system
    reuseOAFClaims = true

reuse cached ODF records on HDFS from the PROD aggregation system
    reuseODFhdfs = true

reuse cached OAF records on HDFS from the PROD aggregation system
    reuseOAFhdfs = true

reuse cached ODF content from the PROD aggregation system
    reuseODF = true

reuse cached OAF content from the PROD aggregation system
    reuseOAF = true

reuse cached DB content from the PROD aggregation system
    reuseDB = true

reuse cached OpenOrgs content from the PROD aggregation system
    reuseDBOpenorgs = true

should apply the relations id patching based on the provided idMapping?
    shouldPatchRelations = false

set the PROD aggregator content path
    contentPath = /tmp/prod_aggregator

wait configurations

create the PROD AGGREGATOR graph
    executeOozieJob IIS
    {
        'graphOutputPath' : 'aggregatorGraphPath',
        'isLookupUrl' : 'isLookUpUrl',
        'reuseODFClaims' : 'reuseODFClaims',
        'reuseOAFClaims' : 'reuseOAFClaims',
        'reuseDB' : 'reuseDB',
        'reuseDBOpenorgs' : 'reuseDBOpenorgs',
        'reuseODF' : 'reuseODF',
        'reuseODF_hdfs' : 'reuseODFhdfs',
        'reuseOAF' : 'reuseOAF',
        'reuseOAF_hdfs' : 'reuseOAFhdfs',
        'contentPath' : 'contentPath',
        'nsPrefixBlacklist' : 'nsPrefixBlacklist',
        'shouldPatchRelations' : 'shouldPatchRelations',
        'idMappingPath' : 'idMappingPath'
    }
    {
        'oozie.wf.application.path' : '/lib/dnet/PROD/oa/graph/raw_all/oozie_app',
        'mongoURL' : '',
        'mongoDb' : '',
        'mdstoreManagerUrl' : '',
        'postgresURL' : '',
        'postgresUser' : '',
        'postgresPassword' : '',
        'postgresOpenOrgsURL' : '',
        'postgresOpenOrgsUser' : '',
        'postgresOpenOrgsPassword' : '',
        'shouldHashId' : 'true',
        'importOpenorgs' : 'true',
        'workingDir' : '/tmp/prod_inference/working_dir/prod_aggregator'
    }
    build-report

create the RAW graph
    executeOozieJob IIS
    {
        'inputActionSetIds' : 'actionSetIdsRawGraph',
        'inputGraphRootPath' : 'aggregatorGraphPath',
        'outputGraphRootPath' : 'rawGraphPath',
        'isLookupUrl' : 'isLookUpUrl'
    }
    {
        'oozie.wf.application.path' : '/lib/dnet/PROD/actionmanager/wf/main/oozie_app',
        'sparkExecutorCores' : '3',
        'sparkExecutorMemory' : '10G',
        'activePromoteDatasetActionPayload' : 'true',
        'activePromoteDatasourceActionPayload' : 'true',
        'activePromoteOrganizationActionPayload' : 'true',
        'activePromoteOtherResearchProductActionPayload' : 'true',
        'activePromoteProjectActionPayload' : 'true',
        'activePromotePublicationActionPayload' : 'true',
        'activePromoteRelationActionPayload' : 'true',
        'activePromoteResultActionPayload' : 'true',
        'activePromoteSoftwareActionPayload' : 'true',
        'mergeAndGetStrategy' : 'MERGE_FROM_AND_GET',
        'workingDir' : '/tmp/prod_inference/working_dir/promoteActionsRaw'
    }
    build-report
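
Each executeOozieJob step carries two maps. The values in the first map (aggregatorGraphPath, isLookUpUrl, and so on) match the names of the parameters declared earlier, so it seems reasonable to read the first map as "job property name -> workflow parameter name" and the second as literal job properties. That reading is an assumption, not something the definition states. Under it, the properties for the RAW graph job could be assembled roughly as in this Python sketch; the function name resolve_job_properties is hypothetical and only a subset of the literal properties is shown.

```python
def resolve_job_properties(env, env_params, literal_params):
    """Build the property map for one job (hypothetical helper).

    env            -- workflow parameters declared before the job runs
    env_params     -- job property name -> name of a workflow parameter
    literal_params -- job property name -> literal value
    """
    # Fail early if the job refers to a parameter that was never declared.
    missing = sorted(name for name in env_params.values() if name not in env)
    if missing:
        raise KeyError(f"undefined workflow parameters: {missing}")

    # Look up each referenced parameter, then add the literal properties.
    resolved = {key: env[name] for key, name in env_params.items()}
    resolved.update(literal_params)
    return resolved


# Values transcribed from the "create the RAW graph" step.
env = {
    "actionSetIdsRawGraph": "scholexplorer-dump,doiboost,orcidworks-no-doi,datacite",
    "aggregatorGraphPath": "/tmp/prod_inference/graph/00_graph_aggregator",
    "rawGraphPath": "/tmp/prod_inference/graph/01_graph_raw",
    "isLookUpUrl": "http://services.openaire.eu:8280/is/services/isLookUp?wsdl",
}
env_params = {
    "inputActionSetIds": "actionSetIdsRawGraph",
    "inputGraphRootPath": "aggregatorGraphPath",
    "outputGraphRootPath": "rawGraphPath",
    "isLookupUrl": "isLookUpUrl",
}
literal_params = {
    "oozie.wf.application.path": "/lib/dnet/PROD/actionmanager/wf/main/oozie_app",
    "sparkExecutorCores": "3",
    "sparkExecutorMemory": "10G",
    "mergeAndGetStrategy": "MERGE_FROM_AND_GET",
    "workingDir": "/tmp/prod_inference/working_dir/promoteActionsRaw",
}

job_properties = resolve_job_properties(env, env_params, literal_params)
```

Note that the job property is spelled isLookupUrl here while the workflow parameter is spelled isLookUpUrl; keeping the two namespaces separate, as the sketch does, avoids confusing them.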

clean the properties in the graph typed as Qualifier according to the vocabulary indicated in schemeid
    executeOozieJob IIS
    {
        'graphInputPath' : 'rawGraphPath',
        'graphOutputPath' : 'cleanedFirstGraphPath',
        'isLookupUrl' : 'isLookUpUrl'
    }
    {
        'oozie.wf.application.path' : '/lib/dnet/PROD/oa/graph/clean/oozie_app',
        'workingDir' : '/tmp/prod_inference/working_dir/clean_first'
    }
    build-report

search for duplicates in the raw graph
    executeOozieJob IIS
    {
        'actionSetId' : 'dedupConfig',
        'graphBasePath' : 'cleanedFirstGraphPath',
        'dedupGraphPath' : 'dedupGraphPath',
        'isLookUpUrl' : 'isLookUpUrl'
    }
    {
        'oozie.wf.application.path' : '/lib/dnet/PROD/oa/dedup/scan/oozie_app',
        'actionSetIdOpenorgs' : 'dedup-similarity-organization-simple',
        'workingPath' : '/tmp/prod_inference/working_dir/dedup',
        'sparkExecutorCores' : '3',
        'sparkExecutorMemory' : '10G'
    }
    build-report

mark duplicates as deleted and redistribute the relationships
    executeOozieJob IIS
    {
        'graphBasePath' : 'dedupGraphPath',
        'graphOutputPath' : 'consistentGraphPath'
    }
    {
        'oozie.wf.application.path' : '/lib/dnet/PROD/oa/dedup/consistency/oozie_app',
        'workingPath' : '/tmp/prod_inference/working_dir/dedup'
    }
    build-report

clean the properties in the graph typed as Qualifier according to the vocabulary indicated in schemeid
    executeOozieJob IIS
    {
        'graphInputPath' : 'consistentGraphPath',
        'graphOutputPath' : 'cleanedGraphPath',
        'isLookupUrl' : 'isLookUpUrl'
    }
    {
        'oozie.wf.application.path' : '/lib/dnet/PROD/oa/graph/clean/oozie_app',
        'workingDir' : '/tmp/prod_inference/working_dir/clean'
    }
    build-report

wf_20210719_165159_86 2021-07-19T20:45:09+00:00 SUCCESS
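
The six graph paths form a chain: each job reads the graph written by the job before it (00_graph_aggregator through 05_graph_cleaned). A minimal Python sketch of a check for that chaining follows. The step list is transcribed from the definition above; the helper name broken_links is hypothetical, and the sketch only compares parameter names, it does not touch HDFS or Oozie.

```python
# (step description, parameter read as input, parameter written as output)
STEPS = [
    ("create the PROD AGGREGATOR graph", None, "aggregatorGraphPath"),
    ("create the RAW graph", "aggregatorGraphPath", "rawGraphPath"),
    ("clean (first pass)", "rawGraphPath", "cleanedFirstGraphPath"),
    ("search for duplicates", "cleanedFirstGraphPath", "dedupGraphPath"),
    ("mark duplicates as deleted", "dedupGraphPath", "consistentGraphPath"),
    ("clean (final pass)", "consistentGraphPath", "cleanedGraphPath"),
]


def broken_links(steps):
    """Return (previous step, step) pairs where a step does not read
    the graph that the previous step wrote."""
    return [
        (prev[0], cur[0])
        for prev, cur in zip(steps, steps[1:])
        if cur[1] != prev[2]
    ]


if __name__ == "__main__":
    problems = broken_links(STEPS)
    if problems:
        for before, after in problems:
            print(f"'{after}' does not consume the output of '{before}'")
    else:
        print("every step reads the previous step's output")
```

A check like this is cheap to run before submitting the workflow, since a mistyped path parameter would otherwise only surface after the earlier, long-running jobs have finished.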