forked from D-Net/dnet-hadoop
Merge branch 'mvn_site_documentation' of code-repo.d4science.org:D-Net/dnet-hadoop into mvn_site_documentation
This commit is contained in:
commit
2506d7a679
|
@ -1,9 +1,20 @@
|
|||
## DHP-Aggregation
|
||||
|
||||
This module defines a set of oozie workflows for the **collection** and **transformation** of metadata records.
|
||||
This module defines a set of Oozie workflows for:
|
||||
|
||||
Both workflows interact with the Metadata Store Manager (MdSM) to handle the logical transactions required to ensure
|
||||
1. the **collection** and **transformation** of metadata records.
|
||||
2. the **integration** of new external information into the results.
|
||||
|
||||
|
||||
### Collection and Transformation
|
||||
|
||||
The workflows interact with the Metadata Store Manager (MdSM) to handle the logical transactions required to ensure
|
||||
the consistency of the read/write operations on the data, since the MdSM keeps track of the logical-physical mapping
|
||||
of each MDStore.
|
||||
|
||||
It defines [mappings](mappings.md) for the transformation of records from different data sources (see the mappings section).
|
||||
|
||||
### Integration of external information in the result
|
||||
|
||||
The workflows create new entities in the OpenAIRE format (OAF) whose aim is to enrich the results already contained in the graph.
|
||||
See the integration section for more details.
|
||||
|
|
|
@ -0,0 +1,36 @@
|
|||
DHP Aggregation - Integration method
|
||||
=====================================
|
||||
|
||||
The integration method can be applied every time new information, which is not aggregated from the repositories
|
||||
nor computed directly by OpenAIRE, should be added to the results of the graph.
|
||||
|
||||
The information integrated so far is:
|
||||
|
||||
1. Article impact measures
|
||||
1. [Bip!Finder](https://dl.acm.org/doi/10.1145/3357384.3357850) scores
|
||||
2. Result Subjects
|
||||
1. Integration of Fields of Science and Technology ([FOS](https://www.qnrf.org/en-us/FOS)) classification in
|
||||
results subjects.
|
||||
|
||||
|
||||
The method always consists in the creation of a new entity in the OpenAIRE format (OAF entity) containing only the id
|
||||
and the element in the OAF model that should be used to map the information we want to integrate.
|
||||
|
||||
The id is set by using a particular encoding of the given PID:
|
||||
|
||||
*unresolved:[pid]:[pidtype]*
|
||||
|
||||
where
|
||||
|
||||
1. *unresolved* is a constant value
|
||||
2. *pid* is the persistent id value, e.g. 10.5281/zenodo.4707307
|
||||
3. *pidtype* is the persistent id type, e.g. doi
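
The encoding described above can be sketched as follows. This is only a minimal illustration that follows the pattern exactly as written in this page; the function name `unresolved_id` is hypothetical and is not taken from the dnet-hadoop code base, and no normalisation of the PID value is assumed.

```python
# Constant prefix used by every identifier built with this encoding.
UNRESOLVED_PREFIX = "unresolved"


def unresolved_id(pid: str, pidtype: str) -> str:
    """Build an identifier of the form unresolved:[pid]:[pidtype].

    Hypothetical helper: the real implementation may normalise the
    PID value or its type before building the identifier.
    """
    return f"{UNRESOLVED_PREFIX}:{pid}:{pidtype}"


# Example with the PID value and type mentioned in the list above.
example_id = unresolved_id("10.5281/zenodo.4707307", "doi")
print(example_id)
```

With the example values from the list, the helper returns `unresolved:10.5281/zenodo.4707307:doi`.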
|
||||
|
||||
Such entities are matched against those available in the graph using the result.instance.pid values.
|
||||
|
||||
This mechanism can be used to integrate any enrichment that is associated with a given PID.
|
||||
If a match is found with one of the results already in the graph, that result is enriched with the information
|
||||
present in the new OAF.
|
||||
All the objects for which a match is not found are discarded.
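
The matching step can be pictured with the small in-memory sketch below. It is an assumption-laden illustration, not the actual workflow: the `Result` class, the `integrate` function and the use of a `subjects` list as the enriched field are all hypothetical, and the real process runs on the graph rather than on Python lists.

```python
from dataclasses import dataclass, field


@dataclass
class Result:
    """Hypothetical, heavily simplified stand-in for a result in the graph."""
    id: str
    instance_pids: list  # (pid, pidtype) pairs, standing in for result.instance.pid
    subjects: list = field(default_factory=list)


def unresolved_id(pid, pidtype):
    # Same encoding as described in this page: unresolved:[pid]:[pidtype]
    return f"unresolved:{pid}:{pidtype}"


def integrate(graph_results, unresolved_entities):
    """Enrich matching results in place and return the ids that were discarded.

    unresolved_entities maps an unresolved id to the list of values to add.
    """
    # Index the graph results by the unresolved-style key of each instance PID.
    index = {}
    for result in graph_results:
        for pid, pidtype in result.instance_pids:
            index.setdefault(unresolved_id(pid, pidtype), []).append(result)

    discarded = []
    for entity_id, new_subjects in unresolved_entities.items():
        matches = index.get(entity_id)
        if not matches:
            # No result in the graph carries this PID: the entity is dropped.
            discarded.append(entity_id)
            continue
        for result in matches:
            result.subjects.extend(new_subjects)
    return discarded


graph = [Result("r1", [("10.5281/zenodo.4707307", "doi")])]
incoming = {
    "unresolved:10.5281/zenodo.4707307:doi": ["example subject"],
    "unresolved:10.0000/not-in-graph:doi": ["another subject"],
}
dropped = integrate(graph, incoming)
```

In this example the first entity matches `r1` and its subject is added to it, while the second entity has no counterpart in the graph and ends up in `dropped`.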
|
||||
|
||||
|
|
@ -19,6 +19,9 @@
|
|||
<item name="Mappings" href="mappings.html" collapse="true">
|
||||
<item name="Pubmed" href="pubmed.html"/>
|
||||
<item name="Datacite" href="datacite.html"/>
|
||||
</item>
|
||||
<item name="Integration" href="integration.html" collapse="true">
|
||||
|
||||
</item>
|
||||
<item name="General Information" href="about.html"/>
|
||||
|
||||
|
|