8172_impact_indicators_workflow #284
4 Participants
Reference: D-Net/dnet-hadoop#284
This PR adds an Oozie workflow for the calculation of impact indicators using https://github.com/athenarc/Bip-Ranker in graph provision.
Title changed from "WIP: 8172_impact_indicators_workflow" to "8172_impact_indicators_workflow"
The workflow has been successfully tested and is ready to be merged.
It first calculates the impact indicators for results producing the relevant action sets, and then aggregates those indicators for projects, also producing appropriate action sets.
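The two-stage flow described above can be pictured with a minimal sketch. This is not the workflow's code: the record layout, the `aggregate_to_projects` helper, and the choice of a plain sum as the aggregation are all assumptions made for illustration.

```python
from collections import defaultdict


def aggregate_to_projects(result_scores, result_projects):
    """Sum each indicator over the results linked to a project.

    The sum is an assumed aggregation; the thread does not say which one is used.
    """
    totals = defaultdict(lambda: defaultdict(float))
    for result_id, project_id in result_projects:
        for indicator, value in result_scores.get(result_id, {}).items():
            totals[project_id][indicator] += value
    return {project: dict(indicators) for project, indicators in totals.items()}


# Stage 1 output (hypothetical values): indicator scores per result.
result_scores = {
    "r1": {"popularity": 2.0, "influence": 1.0},
    "r2": {"popularity": 3.0, "influence": 0.5},
}
# Hypothetical result-to-project links.
result_projects = [("r1", "p1"), ("r2", "p1"), ("r2", "p2")]

# Stage 2: aggregate the result-level indicators per project.
project_scores = aggregate_to_projects(result_scores, result_projects)
```

Both `result_scores` and `project_scores` would then be turned into action sets by the corresponding workflow steps.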
Since the performance of this WF was discussed in the past, I attach the execution times obtained with the latest prod graph (v6.0.0).
I see you are creating two different action sets: one for projects and one for results. We already have an action set that stores the updates produced by the computation of the impact indicators; you should use that same action set for both projects and results.
It seems they are not two distinct action sets; rather, you are changing the destination directory for the same AS, appending /result in one case and /project in the other to the path you should use to store the files.
You should avoid doing that. Action sets are associated with profiles on the aggregator. These profiles are updated each time the action set is generated. The profile records the association between the last update of the action set and the directory that stores that update.
When the AS is included in the graph, the directory that is searched for the files is the one specified in the profile.
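The concern above can be illustrated with a small sketch. It assumes a reader that only picks up files located directly in the directory recorded in the profile; how the aggregator actually resolves the directory is not shown in this thread, and the `profiles` mapping, the `files_for_action_set` helper, and the file names are all hypothetical.

```python
import os
import tempfile


def files_for_action_set(profiles, action_set_id):
    """Return the files found directly in the directory recorded in the profile.

    Assumption: subdirectories are not traversed.
    """
    directory = profiles[action_set_id]
    return sorted(
        name
        for name in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, name))
    )


base = tempfile.mkdtemp()

# Layout A: everything is written straight into the directory named in the profile.
flat = os.path.join(base, "flat")
os.makedirs(flat)
for name in ("part-00000", "part-00001"):
    with open(os.path.join(flat, name), "w"):
        pass

# Layout B: the output is split into /result and /project below that directory.
split = os.path.join(base, "split")
for sub in ("result", "project"):
    os.makedirs(os.path.join(split, sub))
    with open(os.path.join(split, sub, "part-00000"), "w"):
        pass

# Hypothetical profiles: action set id -> directory of the last update.
profiles = {"as-flat": flat, "as-split": split}
flat_files = files_for_action_set(profiles, "as-flat")
split_files = files_for_action_set(profiles, "as-split")
```

Under that assumption, the flat layout exposes both part files, while the split layout exposes none, because the files sit one level below the directory the profile points to.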
Title changed from "8172_impact_indicators_workflow" to "WIP: 8172_impact_indicators_workflow"
Title changed from "WIP: 8172_impact_indicators_workflow" to "8172_impact_indicators_workflow"
Just updated the code to store the action sets for projects and results in the same file.
The changes in SparkAtomicActionScoreJob seem to break the other existing workflow that uses it (dhp-workflows/dhp-aggregation/src/main/resources/eu/dnetlib/dhp/actionmanager/bipfinder/oozie_app/workflow.xml).
I think we can remove that workflow, since this implementation should replace the old asynchronous one.
Yes, I was thinking the same, but to be on the safe side I will adapt that workflow in case it is still needed.
As agreed with Serafeim, I will merge this PR; the changes to the other related WF (if it is not removed) will be made afterwards.