forked from D-Net/dnet-hadoop
Merge branch 'mvn_site_documentation' of code-repo.d4science.org:D-Net/dnet-hadoop into mvn_site_documentation
commit 2506d7a679
@@ -1,9 +1,20 @@
 ##DHP-Aggregation
 
-This module defines a set of oozie workflows for the **collection** and **transformation** of metadata records.
+This module defines a set of oozie workflows for
 
-Both workflows interact with the Metadata Store Manager (MdSM) to handle the logical transactions required to ensure
+1. the **collection** and **transformation** of metadata records.
+2. the **integration** of new external information in the result
+
+
+### Collection and Transformation
+
+The workflows interact with the Metadata Store Manager (MdSM) to handle the logical transactions required to ensure
 the consistency of the read/write operations on the data as the MdSM in fact keeps track of the logical-physical mapping
 of each MDStore.
 
 It defines [mappings](mappings.md) for transformation of different datasource (See mapping section).
+
+### Integration of external information in the result
+
+The workflows create new entities in the OpenAIRE format (OAF) whose aim is to enrich the results already contained in the graph.
+See the integration section for more details.
@@ -0,0 +1,36 @@
+DHP Aggregation - Integration method
+=====================================
+
+The integration method can be applied every time new information, which is not aggregated from the repositories
+nor computed directly by OpenAIRE, should be added to the results of the graph.
+
+The information integrated so far is:
+
+1. Article impact measures
+    1. [Bip!Finder](https://dl.acm.org/doi/10.1145/3357384.3357850) scores
+2. Result Subjects
+    1. Integration of Fields of Science and Technology ([FOS](https://www.qnrf.org/en-us/FOS)) classification in
+       results subjects.
+
+
+The method always consists in the creation of a new entity in the OpenAIRE format (OAF entity) containing only the id
+and the element in the OAF model that should be used to map the information we want to integrate.
+
+The id is set by using a particular encoding of the given PID
+
+*unresolved:[pid]:[pidtype]*
+
+where
+
+1. *unresolved* is a constant value
+2. *pid* is the persistent id value, e.g. 10.5281/zenodo.4707307
+3. *pidtype* is the persistent id type, e.g. doi
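
The identifier encoding described above can be sketched in a few lines of Python. This is only an illustration of the *unresolved:[pid]:[pidtype]* pattern, not code from the repository: the function names `build_unresolved_id` and `parse_unresolved_id` are hypothetical, and the parser assumes that the pidtype itself contains no colon.

```python
PREFIX = "unresolved"  # the constant first component of the identifier


def build_unresolved_id(pid: str, pidtype: str) -> str:
    """Encode a PID as unresolved:[pid]:[pidtype] (hypothetical helper)."""
    return f"{PREFIX}:{pid}:{pidtype}"


def parse_unresolved_id(identifier: str) -> tuple[str, str]:
    """Split an identifier back into (pid, pidtype).

    The pid may contain colons, so the prefix is cut at the first colon
    and the pidtype at the last one. Assumption: pidtype has no colon.
    """
    prefix, sep, rest = identifier.partition(":")
    if prefix != PREFIX or not sep:
        raise ValueError(f"not an unresolved identifier: {identifier!r}")
    pid, sep, pidtype = rest.rpartition(":")
    if not sep or not pid or not pidtype:
        raise ValueError(f"malformed unresolved identifier: {identifier!r}")
    return pid, pidtype


example = build_unresolved_id("10.5281/zenodo.4707307", "doi")
```

With the example values from the list above, `example` holds `unresolved:10.5281/zenodo.4707307:doi`, and `parse_unresolved_id(example)` recovers the original pid and pidtype.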
+
+Such entities are matched against those available in the graph using the result.instance.pid values.
+
+This mechanism can be used to integrate enrichments associated with a given PID.
+If a match is found with one of the results already in the graph, that result is enriched with the information
+present in the new OAF.
+All the objects for which a match is not found are discarded.
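
The matching step described above can be illustrated with a small in-memory sketch. It is not the actual workflow, which runs on the graph itself. The dictionary layout (`instance`, `pid`, `type`, `value`), the function name `enrich_results`, the case-insensitive comparison and the list-append merge are all assumptions made for this example.

```python
def enrich_results(results, unresolved_entities):
    """Enrich matching results in place; return how many entities were discarded.

    results: list of dicts with an "instance" list, each instance holding a
             "pid" list of {"type": ..., "value": ...} (simplified, assumed layout).
    unresolved_entities: list of dicts whose "id" is unresolved:[pid]:[pidtype]
             and whose other fields are lists of values to add to the result.
    """
    # Index the graph results by (pidtype, pid); lower-casing is an assumption.
    index = {}
    for result in results:
        for instance in result.get("instance", []):
            for pid in instance.get("pid", []):
                key = (pid["type"].lower(), pid["value"].lower())
                bucket = index.setdefault(key, [])
                # Avoid indexing the same result twice under one key.
                if all(existing is not result for existing in bucket):
                    bucket.append(result)

    discarded = 0
    for entity in unresolved_entities:
        # Drop the constant prefix, then split pidtype off the end.
        _prefix, rest = entity["id"].split(":", 1)
        pid_value, pid_type = rest.rsplit(":", 1)
        matches = index.get((pid_type.lower(), pid_value.lower()), [])
        if not matches:
            discarded += 1  # no match in the graph: the entity is dropped
            continue
        for result in matches:
            for field, values in entity.items():
                if field != "id":
                    result.setdefault(field, []).extend(values)
    return discarded
```

Building the index once keeps the lookup for each new entity constant-time, which is the same idea as joining the two datasets on the PID.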
+
@@ -19,6 +19,9 @@
 <item name="Mappings" href="mappings.html" collapse="true">
 <item name="Pubmed" href="pubmed.html"/>
 <item name="Datacite" href="datacite.html"/>
+</item>
+<item name="Integration" href="integration.html" collapse="true">
+
 </item>
 <item name="General Information" href="about.html"/>
 