participant project contribution #223

Merged
claudio.atzori merged 5 commits from project_organization_contribution into beta 2 years ago
Owner

This PR extends the mapping applied to the aggregator's DB content to include the individual monetary contribution for each project partner, which becomes a property of the project - organization relation.

A side consideration: in case of false positives produced by the Organization deduplication, we might end up merging different relationships bearing this information. IT WOULD BE A MISTAKE TO MERGE THEM, therefore in such an event the most reasonable approach seems to be a conservative one: do not set any contribution information.
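
The conservative rule described above could be sketched roughly as follows. This is only an illustration, not the code in this PR: the `Contribution` class, its fields and the `merge` helper are hypothetical names, and the actual relation model in the repository may represent this information differently.

```java
import java.util.Optional;

public class ConservativeContributionMerge {

    // Hypothetical stand-in for the contribution carried by a project - organization relation.
    public static final class Contribution {
        public final String currency;
        public final double amount;

        public Contribution(String currency, double amount) {
            this.currency = currency;
            this.amount = amount;
        }
    }

    // Conservative policy: if both merged relations carry a contribution, keep neither;
    // if only one of them carries it, that value survives the merge.
    public static Optional<Contribution> merge(Optional<Contribution> left, Optional<Contribution> right) {
        if (left.isPresent() && right.isPresent()) {
            return Optional.empty();
        }
        return left.isPresent() ? left : right;
    }

    public static void main(String[] args) {
        Contribution a = new Contribution("EUR", 150000.0);
        Contribution b = new Contribution("EUR", 80000.0);

        // Only one relation has the information: it is kept.
        System.out.println(merge(Optional.of(a), Optional.empty()).isPresent()); // true
        // Both relations have it (possible wrong dedup): nothing is set.
        System.out.println(merge(Optional.of(a), Optional.of(b)).isPresent());   // false
    }
}
```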

claudio.atzori added the enhancement label 2 years ago
claudio.atzori added 1 commit 2 years ago
claudio.atzori requested review from alessia.bardi 2 years ago
claudio.atzori requested review from miriam.baglioni 2 years ago
claudio.atzori added 1 commit 2 years ago
Collaborator

I think the PR can be merged. Only one consideration: you say that when more than one organization brings information about the funded amount, it would be safer not to show any contribution. I think we should check whether the funded amount is the same. In that case we can map that value. It is different if we have two values that are not the same. In this case we should collect them somewhere so they can be used to improve the deduplication. One more thing: we need to consider all the relations that will be merged together, to be sure that the funded amount occurs only once or, if it occurs more than once, that it has the same value every time.
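
A minimal sketch of this check over a whole group of relations being merged, under stated assumptions: `GroupContributionMerge`, `Result` and `merge` are hypothetical names, amounts are modelled as plain `Double` values compared exactly, and a missing amount is represented by `null`. Real monetary values would likely also need a currency check and a more careful numeric type.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class GroupContributionMerge {

    // Outcome of merging a group: the agreed amount (if any) plus the conflicting values to report.
    public static final class Result {
        public final Optional<Double> amount;
        public final Set<Double> conflicts;

        Result(Optional<Double> amount, Set<Double> conflicts) {
            this.amount = amount;
            this.conflicts = conflicts;
        }
    }

    // fundedAmounts holds one entry per relation in the group; null means "no amount provided".
    public static Result merge(List<Double> fundedAmounts) {
        Set<Double> distinct = new LinkedHashSet<>();
        for (Double amount : fundedAmounts) {
            if (amount != null) {
                distinct.add(amount);
            }
        }
        if (distinct.size() == 1) {
            // The amount occurs once, or several times with the same value: safe to map.
            return new Result(Optional.of(distinct.iterator().next()), Collections.emptySet());
        }
        if (distinct.isEmpty()) {
            // No relation carries an amount: nothing to map, nothing to report.
            return new Result(Optional.empty(), Collections.emptySet());
        }
        // Different values: leave the amount unset and hand the values over for dedup analysis.
        return new Result(Optional.empty(), distinct);
    }
}
```

The `conflicts` set is where the differing values could be collected for later inspection; how they would be stored or surfaced is left open here.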

claudio.atzori added 1 commit 2 years ago
Poster
Owner

> I think the PR can be merged. Only one consideration: you say that when more than one organization brings information about the funded amount, it would be safer not to show any contribution. I think we should check whether the funded amount is the same. In that case we can map that value.

As a matter of fact, the question is not limited to the presentation of the funded amount. Taking any action in case of wrong organization deduplication would always imply drawbacks on the aggregated views. The only case that would not be affected is when exactly the same amount occurs, but I'm not sure that can happen, and if it does it would be by pure chance.

> It is different if we have two values that are not the same. In this case we should collect them somewhere so they can be used to improve the deduplication.

I agree it would be helpful. The information would need to be dumped somewhere in such cases, but we'd need to be alerted as well, otherwise I doubt anyone will ever take a look at such data.

> One more thing: we need to consider all the relations that will be merged together, to be sure that the funded amount occurs only once or, if it occurs more than once, that it has the same value every time.

I'm not sure I get what you mean here. Can you elaborate?

Overall, I think we could proceed with integrating the changes as they are; I just want to include some more to-the-point unit tests. As for how to proceed in case of wrong organization dedup, we can verify to what extent this occurs by querying the actual graph data as soon as this change is shipped to beta and the first graph is built.

claudio.atzori added 1 commit 2 years ago
claudio.atzori added 1 commit 2 years ago
claudio.atzori merged commit 37cfda0fc5 into beta 2 years ago
claudio.atzori deleted branch project_organization_contribution 2 years ago

Reviewers

alessia.bardi was requested for review 2 years ago
miriam.baglioni was requested for review 2 years ago
The pull request has been merged as 37cfda0fc5.
Reference: D-Net/dnet-hadoop#223