Updated documentation on OpenAIRE Research Graph dumps

git-svn-id: https://svn.driver.research-infrastructures.eu/driver/dnet45/modules/dnet-api-http-doc/src@60262 d315682c-612b-4755-9ff5-7f18f6832af3
Alessia Bardi 2021-01-15 18:56:04 +00:00
parent 8f18ae8e38
commit a74d7b9aa7
4 changed files with 490 additions and 108 deletions


@@ -192,30 +192,11 @@
<!--div class="uk-card uk-margin-remove uk-padding-remove uk-card-default uk-card-body" style="z-index: 980;" uk-sticky="offset:100; bottom: #bottom">
<a class="uk-totop" href="#" uk-scroll="" uk-totop="">
</a></div-->
<!-- <div class="uk-alert-danger uk-text-center" uk-alert>
<p>
OpenAIRE is about to release its new face with lots of news and services. <br>
During September, you may notice downtime in services, while some functionalities (e.g. user registration, validation, claiming) will be temporarily disabled.
We apologize for the inconvenience, please stay tuned! <br>
For further information please contact <a href="mailto:helpdesk@openaire.eu">helpdesk@openaire.eu</a>.
</p>
</div> -->
<div class="uk-alert-danger" uk-alert>
<h3>Contribute to improve the OpenAIRE Research Graph</h3>
<p>You can explore and test the beta release of the OpenAIRE Research Graph via the <a href="https://beta.explore.openaire.eu">OpenAIRE BETA Explore Portal</a> or via data dumps made available in <a href="https://zenodo.org/communities/openaire-research-graph">Zenodo</a>. </p>
<p>Help us making the graph ready for its 1st production release by providing your feedback.<br/>
Go to the <a href="https://trello.com/b/o1tEJ3rN/openaire-research-graph">OpenAIRE Research Graph Trello Board</a> to report content quality issues, including missing metadata records, wrong values, mistakes in the detection of duplicates and anything else that looks "weird" or wrong.
<p>Find the complete information about the OpenAIRE Research Graph, how to test it and contribute to improve it on <a href="https://www.openaire.eu/blogs/the-openaire-research-graph">our blog</a>.</p>
</div>
<div class="uk-alert-danger" uk-alert>
<p>Openaire XML Schema changed on 2 October 2018. <a href="https://www.openaire.eu/openaire-xml-schema-change-announcement" target="_blank">Click here</a> for details</p>
</div>
<p>
The OpenAIRE HTTP API allows developers to access the metadata information space of OpenAIRE by performing queries over publications, datasets, and projects.
The OpenAIRE HTTP API allows developers to access metadata records of the OpenAIRE Research Graph by performing queries over publications, datasets, and projects.
The API is intended for metadata discovery and exploration only, that is it does not give direct access to publication files and it does not provide access to
the whole information space: the number of total results returned by one query is limited to 10,000. For accessing the whole information space, developers are
encouraged to use the <a href="http://api.openaire.eu/#cha_oai_pmh" target="_blank">OAI-PMH</a>.<br>
the whole information space: the number of total results returned by one query is limited to 10,000. For accessing the whole graph, developers are
encouraged to use the <a href="./graph-dumps.html">OpenAIRE Research Graph dumps</a>.<br>
</p>
<div>


@@ -166,25 +166,6 @@
<div class="uk-container">
<div uk-grid="" class="uk-grid uk-grid-stack">
<div class="tm-main uk-width-1-1@s uk-width-1-1@m uk-width-1-1@l uk-row-first uk-first-column">
<!-- <div class="uk-alert-danger uk-text-center" uk-alert>
<p>
OpenAIRE is about to release its new face with lots of news and services. <br>
During September, you may notice downtime in services, while some functionalities (e.g. user registration, validation, claiming) will be temporarily disabled.
We apologize for the inconvenience, please stay tuned! <br>
For further information please contact <a href="mailto:helpdesk@openaire.eu">helpdesk@openaire.eu</a>.
</p>
</div> -->
<div class="uk-alert-danger" uk-alert>
<h3>Contribute to improve the OpenAIRE Research Graph</h3>
<p>You can explore and test the beta release of the OpenAIRE Research Graph via the <a href="https://beta.explore.openaire.eu">OpenAIRE BETA Explore Portal</a> or via data dumps made available in <a href="https://zenodo.org/communities/openaire-research-graph">Zenodo</a>. </p>
<p>Help us making the graph ready for its 1st production release by providing your feedback.<br/>
Go to the <a href="https://trello.com/b/o1tEJ3rN/openaire-research-graph">OpenAIRE Research Graph Trello Board</a> to report content quality issues, including missing metadata records, wrong values, mistakes in the detection of duplicates and anything else that looks "weird" or wrong.
<p>Find the complete information about the OpenAIRE Research Graph, how to test it and contribute to improve it on <a href="https://www.openaire.eu/blogs/the-openaire-research-graph">our blog</a>.</p>
</div>
<div class="uk-alert-danger" uk-alert>
<p>Openaire XML Schema changed on 2 October 2018. <a href="https://www.openaire.eu/openaire-xml-schema-change-announcement" target="_blank">Click here</a> for details</p>
</div>
<h2>Bulk access to projects</h2>
<p>
The APIs offer custom access to metadata about projects funded by a selection of international funders for the <strong>DSpace</strong> and <strong>EPrints</strong> platforms.

437
graph-dumps-old.html Executable file

@@ -0,0 +1,437 @@
<!DOCTYPE html>
<html lang="en-gb" dir="ltr" vocab="http://schema.org/">
<head>
<!--link href="http://demo.openaire.eu" rel="canonical" /-->
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<meta name="description" content="OpenAIRE API documentation, OAI-PMH, open access, research, scientific publication, European Commission, EC, FP7, ERC, Horizon 2020, H2020, search, projects "/>
<link href="./assets/favicon.ico" rel="shortcut icon" />
<title>OpenAIRE API documentation - Dumps of the OpenAIRE Research Graph</title>
<script src="./assets/jquery.js"></script>
<script src="./assets/uikit.js"></script>
<script src="./assets/uikit-icon-max.js"></script>
<link rel="stylesheet" href="./assets/theme.css">
<link rel="stylesheet" href="./assets/custom.css">
<link rel="stylesheet" href="./assets/develop-custom.css">
</head>
<body class="" style="">
<div class="uk-offcanvas-content uk-height-viewport">
<div class="tm-header-mobile uk-hidden@m">
<div animation="uk-animation-slide-top" class="uk-navbar-container uk-sticky uk-navbar-transparent " cls-active="uk-active uk-navbar-sticky" cls-inactive="uk-navbar-transparent " uk-sticky="">
<nav class="uk-navbar-container uk-navbar" uk-navbar="">
<div class="uk-navbar-left">
<a class="uk-navbar-toggle" href="#tm-mobile" uk-toggle="">
<div uk-navbar-toggle-icon="" class="uk-navbar-toggle-icon uk-icon">
</div>
</a>
</div>
<div class="uk-navbar-center">
<a class="uk-navbar-item uk-logo" href="overview.html">
<img src="assets/OA DEVELOP_A.png" class="uk-responsive-height" alt="OpenAIRE"> </a>
</div>
</nav>
<div id="tm-mobile" uk-offcanvas="" mode="slide" overlay="" class="uk-offcanvas">
<div class="uk-offcanvas-bar">
<button class="uk-offcanvas-close uk-close uk-icon" type="button" uk-close="">
</button>
<div class="uk-child-width-1-1 uk-grid" uk-grid="">
<div>
<div class="uk-panel" id="module-0">
<ul class="uk-nav uk-nav-default">
<li class="uk-nav-header uk-parent" >
Dashboards
<ul class="uk-nav-sub">
<li><a href="https://explore.openaire.eu" target="_blank" class="uk-heading-bullet explore-heading-bullet">EXPLORE</a></li>
<li><a href="https://provide.openaire.eu" target="_blank" class="uk-heading-bullet provide-heading-bullet">PROVIDE</a></li>
<li><a href="https://connect.openaire.eu" target="_blank" class="uk-heading-bullet connect-heading-bullet">CONNECT</a></li>
<li><a href="https://monitor.openaire.eu" target="_blank" class="uk-heading-bullet monitor-heading-bullet">MONITOR</a></li>
</ul>
</li>
<li class="uk-nav-header uk-parent">
<a href="./overview.html"> Overview </a>
</li>
<li class="uk-nav-header uk-parent uk-active">
Bulk access
<ul class="uk-nav-sub">
<li><a routerLinkActive="uk-link" href="./graph-dumps.html" >OpenAIRE Research Graph Dumps</a></li>
<li><a routerLinkActive="uk-link" href="./oai-pmh.html" >OAI-PMH (discontinued)</a></li>
<li><a routerLinkActive="uk-link" href="./bulk-projects.html" >Bulk access to projects</a></li>
</ul>
</li>
<li class="uk-nav-header uk-parent">
<a href="./api.html">Selective access</a>
<ul class="uk-nav-sub">
<li><a href="./api.html#pubs" >Publications</a></li>
<li><a href="./api.html#datasets" >Research Data</a></li>
<li><a href="./api.html#software" >Software</a></li>
<li><a href="./api.html#other" >Other Research Products</a></li>
<li><a href="./api.html#projects" >Projects</a></li>
</ul>
</li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<!--Mobile ENDS here -->
<div class="tm-toolbar custom-develop-toolbar uk-visible@m">
<div class="uk-container uk-flex uk-flex-middle uk-container-expand ">
<div class="uk-margin-auto-left">
<div class="uk-grid-medium uk-child-width-auto uk-flex-middle uk-grid uk-grid-stack" uk-grid="margin: uk-margin-small-top">
<div class="uk-first-column">
<div class="uk-panel inner" id="module-119">
<ul class="uk-subnav">
<li class="line"><a href="https://www.openaire.eu"><img class="uk-responsive-height" src="assets/Home_24white.svg" alt="home"/></a></li>
<li class="line"><a href="https://explore.openaire.eu" title="Search in OA. Link your research">Explore</a></li>
<li class="line"><a href="https://provide.openaire.eu" title="Content Provider Dashboard">Provide</a></li>
<li class="line"><a href="https://connect.openaire.eu" title="Research Community Dashboard">Connect</a></li>
<li class="line"><a href="https://monitor.openaire.eu" title="Monitoring Dashboard">Monitor</a></li>
<li class="line custom-develop-li "><a href="overview.html" title="APIs">Develop</a></li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="tm-header uk-visible@m tm-header-transparent" uk-header="">
<div animation="uk-animation-slide-top" class="uk-navbar-container uk-sticky uk-navbar-transparent" cls-active="uk-active uk-navbar-sticky" cls-inactive="uk-navbar-transparent" top=".tm-header + [class*=&quot;uk-section&quot;]" uk-sticky="">
<div class="uk-navbar-container uk-navbar-transparent">
<div class="uk-container uk-container-expand">
<nav class="uk-navbar" uk-navbar="{&quot;align&quot;:&quot;left&quot;}">
<div class="uk-navbar-left">
<a href="overview.html" class="uk-navbar-item uk-logo">
<img src="assets/OA DEVELOP_B.png" class="uk-responsive-height" alt="OpenAIRE"></a>
</div>
<div class="uk-navbar-right">
<ul class="uk-navbar-nav">
<li class="uk-parent">
<a href="overview.html" class="" aria-expanded="false">Overview</a>
</li>
<li class="uk-parent uk-active">
<a href="#" class="" aria-expanded="false">Bulk access</a>
<div class="uk-navbar-dropdown uk-navbar-dropdown-bottom-left" style="left: 116px; top: 80px;">
<div class="uk-navbar-dropdown-grid uk-child-width-1-1 uk-grid uk-grid-stack" uk-grid="">
<div class="uk-first-column">
<ul class="uk-nav uk-navbar-dropdown-nav">
<li><a routerLinkActive="uk-link" href="./graph-dumps.html" >OpenAIRE Research Graph Dumps</a></li>
<li><a routerLinkActive="uk-link" href="./oai-pmh.html" >OAI-PMH (discontinued)</a></li>
<li><a routerLinkActive="uk-link" href="./bulk-projects.html" >Bulk access to projects</a></li>
</ul>
</div>
</div>
</div>
</li>
<li class="uk-parent">
<a href="#" class="" aria-expanded="false">Selective access</a>
<div class="uk-navbar-dropdown uk-navbar-dropdown-bottom-left" style="left: 228px; top: 80px;">
<div class="uk-navbar-dropdown-grid uk-child-width-1-1 uk-grid uk-grid-stack" uk-grid="">
<div class="uk-first-column">
<ul class="uk-nav uk-navbar-dropdown-nav">
<li><a href="./api.html#pubs" >Publications</a></li>
<li><a href="./api.html#datasets" >Research Data</a></li>
<li><a href="./api.html#software" >Software</a></li>
<li><a href="./api.html#other" >Other Research Products</a></li>
<li><a href="./api.html#projects" >Projects</a></li>
</ul>
</div>
</div>
</div>
</li>
</ul>
</div>
</nav>
</div>
</div>
</div>
</div>
<!-- MENU ENDS HERE-->
<div class="first_page_section uk-section-default uk-section uk-padding-remove-vertical">
<div class="first_page_banner_headline uk-grid-collapse uk-flex-middle uk-margin-remove-vertical uk-grid" uk-grid="">
</div>
</div>
<div class=" uk-section uk-margin-large-top tm-middle custom-main-content" id="tm-main">
<div class="uk-container">
<div uk-grid="" class="uk-grid uk-grid-stack">
<div class="tm-main uk-width-1-1@s uk-width-1-1@m uk-width-1-1@l uk-row-first uk-first-column">
<!-- Content GOES HERE-->
<div class="uk-alert-danger" uk-alert>
<h3>Contribute to improve the OpenAIRE Research Graph</h3>
<p>You can explore and test the beta release of the OpenAIRE Research Graph via the <a href="https://beta.explore.openaire.eu">OpenAIRE BETA Explore Portal</a> or via data dumps made available in <a href="https://zenodo.org/communities/openaire-research-graph">Zenodo</a>. </p>
<p>Help us make the graph ready for its first production release by providing your feedback.<br/>
Go to the <a href="https://trello.com/b/o1tEJ3rN/openaire-research-graph">OpenAIRE Research Graph Trello Board</a> to report content quality issues, including missing metadata records, wrong values, mistakes in the detection of duplicates and anything else that looks "weird" or wrong.</p>
<p>Find the complete information about the OpenAIRE Research Graph, how to test it and contribute to improve it on <a href="https://www.openaire.eu/blogs/the-openaire-research-graph">our blog</a>.</p>
</div>
<h2>OpenAIRE Research Graph Dumps</h2>
<p>The OpenAIRE Research Graph is one of the largest open scholarly record collections worldwide, key in fostering Open Science and establishing its practices in the daily research activities.
Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back in the hands of the scientific community.
</p>
<p>Imagine a vast collection of research products all linked together, contextualised and openly available.
For the past ten years OpenAIRE has been working to gather this valuable record. OpenAIRE is pleased to announce the beta release of its Research Graph, a massive collection of metadata and links between
scientific products such as articles, datasets, software, and other research products, entities like organisations, funders, funding streams, projects, communities, and data sources.
</p>
<p>As of today, the OpenAIRE Research Graph aggregates around 450 million metadata records, with links, collected from 10,000 data sources trusted by scientists, including repositories registered in <a href="https://v2.sherpa.ac.uk/opendoar/">OpenDOAR</a>, Open Access journals registered in <a href="https://doaj.org/">DOAJ</a>, <a href="https://www.crossref.org/">Crossref</a>, <a href="https://unpaywall.org">Unpaywall</a>, <a href="https://orcid.org/">ORCID</a> and <a href="https://aka.ms/msracad">Microsoft Academic Graph</a>.
After cleaning, deduplication, and fine-grained classification, they narrow down to ~100 million publications, ~8 million datasets, ~200,000 software research products, and 8 million other research products, linked together with semantic relations.
More than 10 million full texts of Open Access publications are mined by algorithms to enrich metadata records with additional properties and links among research products, funders, projects, communities, and organizations.
Thanks to the mining algorithms, the graph is completed with 480 million semantic relations.
</p>
<p>The OpenAIRE Research Graph is available via our <a href="https://beta.explore.openaire.eu">BETA Explore Portal</a>, and you can download it from <a href="https://zenodo.org/communities/openaire-research-graph">Zenodo</a>.
</p>
<h3>Get the dumps</h3>
<div>
<p>The OpenAIRE Research Graph is exported as several dump files available on Zenodo (go to <a href="https://doi.org/10.5281/zenodo.3516917"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.3516917.svg" alt="DOI"></a>), so you can download only the parts you are interested in. </p>
<ul>
<li> <strong>publications</strong>: metadata records about research literature (includes types of publications listed <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/publication">here</a>)</li>
<li> <strong>datasets</strong>: metadata records about research data (includes the subtypes listed <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/dataset">here</a>)</li>
<li> <strong>software</strong>: metadata records about research software (includes the subtypes listed <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/software">here</a>)</li>
<li> <strong>orps</strong>: metadata records about research products that cannot be classified as research literature, data or software (includes types of products listed <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/other">here</a>)</li>
<li> <strong>organizations</strong>: metadata records about organizations involved in the research life-cycle, such as universities, research organizations, funders.</li>
<li> <strong>content_providers</strong>: metadata records about providers whose content is available in the OpenAIRE Research Graph. They include institutional and thematic repositories, journals, aggregators, and funders' databases.</li>
<li> <strong>results_by_funder</strong>: metadata records about research results funded by a given funder. Each result includes information about its type (publications, datasets, software or other) and its specific sub-type (check the list of sub-types for <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/publication">publications</a>, <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/dataset">datasets</a>, <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/software">software</a>, and <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/other">other research products</a>). </li>
</ul>
<p>The up-to-date list of funders available on OpenAIRE BETA can be found <a href="https://beta.explore.openaire.eu/search/entity-registries?datasourcetypename=%22Funder%20database%22">here on the BETA Explore portal</a>.</p>
<p> In the same <a href="https://zenodo.org/communities/openaire-research-graph">Zenodo community</a> you can also find the dumps of ScholeXplorer and DOIBoost.</p>
</div>
<div>
<p>The dumps contain XML records compliant with the <b>OpenAIRE data model</b> and the <b>oaf metadata format</b> (the same format as the records exported via <a href="./oai-pmh.html">OAI-PMH</a>):</p>
<ul>
<li><a href="" target="_blank">See the description of the OpenAIRE data model</a></li>
<li><a href="https://www.openaire.eu/schema/latest/oaf.xsd" target="_blank">See the oaf XML schema</a></li>
<li><a href="https://www.openaire.eu/schema/latest/doc/oaf.html" target="_blank">See the oaf XML schema documentation (generated via Oxygen XML Editor)</a></li>
</ul>
<p>Keep reading for instructions on how to consume the dumps.</p>
</div>
<h3>Consume the dumps</h3>
<div>
Each dump is a gzipped JSON file with one record per line. Each line has the form:
<code>{"_id":{"$oid":"59b82504895be144859a9804"},"body":{"$binary":"base64(zip(XML_record))","$type":"00"}}</code><br/>
where the <code>$binary</code> field of <code>body</code> contains the base64 encoding of the zip-compressed XML record. <br/>
To get the XML records you have to:
<ol>
<li>Decompress the gzipped file</li>
<li>Read each line and extract the value of the <code>$binary</code> field</li>
<li>Base64-decode that value</li>
<li>Unzip the decoded bytes</li>
</ol>
For example, to print the XML records on the standard output you can run this command on macOS/Unix/Linux-based systems:
<code>gunzip -c file.json.gz | jq '.body."$binary"' -r | while IFS= read -r line; do echo "$line" | base64 --decode | bsdtar -x -O ; done </code><br/>
where
<ul>
<li><code>file.json.gz</code> is the name you gave to the downloaded file dump;</li>
<li><code>jq</code> is a command to parse JSON files. It is not installed by default, but you can easily find it in official repositories. <a href="https://stedolan.github.io/jq/download/">Click here for installation instructions</a>.</li>
<li><code>base64</code> and <code>bsdtar</code> are two command-line tools that are typically pre-installed.</li>
</ul>
Note that you should decide what to do with the extracted XML records (parse them inline or store them somewhere).
We suggest starting with a few records to test and decide what to do, by adding a <code>head</code> command after the <code>gunzip</code>, like:
<code>gunzip -c file.json.gz | head -n 10 | jq '.body."$binary"' -r | while IFS= read -r line; do echo "$line" | base64 --decode | bsdtar -x -O ; done</code>
</div>
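<p>The decoding steps described above can also be sketched in Python, using only the standard library. This is a minimal sketch, not an official OpenAIRE tool: the function names <code>decode_dump_line</code> and <code>iter_xml_records</code> are hypothetical, and the code assumes that the <code>$binary</code> value is a base64-encoded zip archive (as the use of <code>bsdtar</code> suggests) whose entries are UTF-8 encoded XML.</p>

```python
import base64
import gzip
import io
import json
import zipfile
from typing import Iterator, List, Optional


def decode_dump_line(line: str) -> List[str]:
    """Return the XML documents packed in one JSON line of a dump.

    Assumption: body.$binary holds a base64-encoded zip archive
    containing one or more UTF-8 encoded XML entries.
    """
    record = json.loads(line)
    zipped = base64.b64decode(record["body"]["$binary"])
    with zipfile.ZipFile(io.BytesIO(zipped)) as archive:
        return [archive.read(name).decode("utf-8") for name in archive.namelist()]


def iter_xml_records(path: str, limit: Optional[int] = None) -> Iterator[str]:
    """Yield XML records from a gzipped dump file, one JSON object per line.

    `limit` caps the number of JSON lines read, like `head -n` in the
    shell example; None means read the whole file.
    """
    with gzip.open(path, "rt", encoding="utf-8") as dump:
        for count, line in enumerate(dump):
            if limit is not None and count >= limit:
                break
            if line.strip():  # skip blank lines, if any
                yield from decode_dump_line(line)
```

<p>For instance, <code>for xml in iter_xml_records("file.json.gz", limit=10): print(xml)</code> would print the records found in the first ten lines, where <code>file.json.gz</code> is the name you gave to the downloaded dump. Reading the file as a stream avoids loading the whole dump into memory.</p>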
<h3>Cite us</h3>
<p>If you use the OpenAIRE Research Graph for research purposes, please cite it as:<br/>
<i>Manghi, Paolo, Atzori, Claudio, Bardi, Alessia, Schirrwagen, Jochen, Dimitropoulos, Harry, La Bruzzo, Sandro, … Summann, Friedrich. (2019). OpenAIRE Research Graph Dump [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3516917</i><br/>
If you want to cite a specific version, please follow the suggestion on Zenodo. For the current version (1.0.0-beta), please use: <br/>
<i>Manghi, Paolo, Atzori, Claudio, Bardi, Alessia, Schirrwagen, Jochen, Dimitropoulos, Harry, La Bruzzo, Sandro, … Summann, Friedrich. (2019). OpenAIRE Research Graph Dump (Version 1.0.0-beta) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3516918</i><br/>
The OpenAIRE Research Graph includes data from <a href="https://aka.ms/msracad">Microsoft Academic Graph</a> (MAG): please also acknowledge MAG following <a href="https://docs.microsoft.com/en-us/academic-services/graph/resources-faq#license">this guideline</a>.
</p>
<h3>License</h3>
<p>The OpenAIRE Research Graph is released under the CC-BY license.</p>
<p>OpenAIRE is working to produce dumps that contain only metadata records that can be redistributed under the CC0 license: stay tuned!</p>
</div>
</div>
</div>
</div>
<!-- FOOTER STARTS HERE-->
<div class="uk-section-primary uk-section uk-section-small uk-padding-remove-bottom">
<div class="uk-container uk-container-expand">
<div class="uk-container uk-container-expand uk-margin-small">
<div class="uk-grid-collapse uk-grid" uk-grid="">
<div id="footer#3" class="uk-width-expand@s uk-first-column">
<div class="uk-margin-small uk-margin-remove-top uk-text-left@s uk-text-center">
<img src="assets/Logo_Horizontal_white_small.png" data-width="126" data-height="30" class="el-image" alt="OpenAIRE">
</div>
<!--div id="footer#5" class="uk-margin uk-text-left@s uk-text-center">
<img src="assets/commission.jpg" sizes="(min-width: 50px) 50px" data-width="427" data-height="285" class="el-image" alt="European Commission">
</div-->
<div class="uk-margin"><img style="margin-right: 8px; float: left;" src="assets/commission.jpg" alt="flag black white low" width="50" height="33"><span style="font-size: 8pt; line-height: 0.7!important;">OpenAIRE-Advance receives funding from the European Union's Horizon 2020 Research and Innovation programme under Grant Agreement No. 777541.</span></div>
<div id="footer#6" class="newsletter uk-margin uk-margin-remove-bottom uk-text-left@s uk-text-center uk-panel">
<h5 class="el-title uk-margin uk-h5">
Newsletter
</h5>
<a target="_blank" href="https://www.openaire.eu/newsletter/view" class="el-link">
<span class="el-image uk-icon">
<svg width="20" height="20" viewBox="0 0 20 20" xmlns="http://www.w3.org/2000/svg">
<circle cx="3.12" cy="16.8" r="1.85"></circle>
<path fill="none" stroke="#000" stroke-width="1.1" d="M1.5,8.2 C1.78,8.18 2.06,8.16 2.35,8.16 C7.57,8.16 11.81,12.37 11.81,17.57 C11.81,17.89 11.79,18.19 11.76,18.5"></path>
<path fill="none" stroke="#000" stroke-width="1.1" d="M1.5,2.52 C1.78,2.51 2.06,2.5 2.35,2.5 C10.72,2.5 17.5,9.24 17.5,17.57 C17.5,17.89 17.49,18.19 17.47,18.5"></path>
</svg>
</span>
</a>
</div>
<div id="footer#7" class="newsletter uk-margin-small uk-margin-remove-top uk-text-left@s uk-text-center uk-panel">
<div class="acymailing_module" id="acymailing_module_formAcymailing60611">
<div class="acymailing_mootoolsbutton" id="acymailing_toggle_formAcymailing60611">
<p><a class="acymailing_togglemodule" id="acymailing_togglemodule_formAcymailing60611" target="_blank" href="https://www.openaire.eu/past-newsletters/listing">Subscribe</a></p>
</div>
</div>
</div>
<div class="uk-margin-small uk-margin-remove-top uk-text-left@s uk-text-center">
<div class="uk-child-width-auto uk-grid-small uk-flex-left@s uk-flex-center uk-grid" uk-grid="">
<div class="uk-first-column">
<a href="http://www.facebook.com/groups/openaire/" target="_blank" class="el-link uk-icon-button uk-icon">
<svg width="20" height="20" viewBox="0 0 20 20" xmlns="http://www.w3.org/2000/svg">
<path d="M11,10h2.6l0.4-3H11V5.3c0-0.9,0.2-1.5,1.5-1.5H14V1.1c-0.3,0-1-0.1-2.1-0.1C9.6,1,8,2.4,8,5v2H5.5v3H8v8h3V10z"></path>
</svg>
</a>
</div>
<div>
<a href="http://www.twitter.com/OpenAIRE_eu" target="_blank" class="el-link uk-icon-button uk-icon">
<svg width="20" height="20" viewBox="0 0 20 20" xmlns="http://www.w3.org/2000/svg">
<path d="M19,4.74 C18.339,5.029 17.626,5.229 16.881,5.32 C17.644,4.86 18.227,4.139 18.503,3.28 C17.79,3.7 17.001,4.009 16.159,4.17 C15.485,3.45 14.526,3 13.464,3 C11.423,3 9.771,4.66 9.771,6.7 C9.771,6.99 9.804,7.269 9.868,7.539 C6.795,7.38 4.076,5.919 2.254,3.679 C1.936,4.219 1.754,4.86 1.754,5.539 C1.754,6.82 2.405,7.95 3.397,8.61 C2.79,8.589 2.22,8.429 1.723,8.149 L1.723,8.189 C1.723,9.978 2.997,11.478 4.686,11.82 C4.376,11.899 4.049,11.939 3.713,11.939 C3.475,11.939 3.245,11.919 3.018,11.88 C3.49,13.349 4.852,14.419 6.469,14.449 C5.205,15.429 3.612,16.019 1.882,16.019 C1.583,16.019 1.29,16.009 1,15.969 C2.635,17.019 4.576,17.629 6.662,17.629 C13.454,17.629 17.17,12 17.17,7.129 C17.17,6.969 17.166,6.809 17.157,6.649 C17.879,6.129 18.504,5.478 19,4.74"></path>
</svg>
</a>
</div>
<div>
<a href="http://www.linkedin.com/groups/OpenAIRE-3893548" target="_blank" class="el-link uk-icon-button uk-icon">
<svg width="20" height="20" viewBox="0 0 20 20" xmlns="http://www.w3.org/2000/svg">
<path d="M5.77,17.89 L5.77,7.17 L2.21,7.17 L2.21,17.89 L5.77,17.89 L5.77,17.89 Z M3.99,5.71 C5.23,5.71 6.01,4.89 6.01,3.86 C5.99,2.8 5.24,2 4.02,2 C2.8,2 2,2.8 2,3.85 C2,4.88 2.77,5.7 3.97,5.7 L3.99,5.7 L3.99,5.71 L3.99,5.71 Z"></path>
<path d="M7.75,17.89 L11.31,17.89 L11.31,11.9 C11.31,11.58 11.33,11.26 11.43,11.03 C11.69,10.39 12.27,9.73 13.26,9.73 C14.55,9.73 15.06,10.71 15.06,12.15 L15.06,17.89 L18.62,17.89 L18.62,11.74 C18.62,8.45 16.86,6.92 14.52,6.92 C12.6,6.92 11.75,7.99 11.28,8.73 L11.3,8.73 L11.3,7.17 L7.75,7.17 C7.79,8.17 7.75,17.89 7.75,17.89 L7.75,17.89 L7.75,17.89 Z"></path>
</svg>
</a>
</div>
<div>
<a href="http://www.slideshare.net/OpenAIRE_eu" target="_blank" class="el-link uk-icon-button uk-icon">
<svg width="20" height="20" viewBox="0 0 20 20" xmlns="http://www.w3.org/2000/svg">
<line fill="none" stroke="#000" stroke-width="1.1" x1="13.4" y1="14" x2="6.3" y2="10.7"></line>
<line fill="none" stroke="#000" stroke-width="1.1" x1="13.5" y1="5.5" x2="6.5" y2="8.8"></line>
<circle fill="none" stroke="#000" stroke-width="1.1" cx="15.5" cy="4.6" r="2.3"></circle>
<circle fill="none" stroke="#000" stroke-width="1.1" cx="15.5" cy="14.8" r="2.3"></circle>
<circle fill="none" stroke="#000" stroke-width="1.1" cx="4.5" cy="9.8" r="2.3"></circle>
</svg>
</a>
</div>
<div>
<a href="https://www.youtube.com/channel/UChFYqizc-S6asNjQSoWuwjw" target="_blank" class="el-link uk-icon-button uk-icon">
<svg width="20" height="20" viewBox="0 0 20 20" xmlns="http://www.w3.org/2000/svg">
<path d="M15,4.1c1,0.1,2.3,0,3,0.8c0.8,0.8,0.9,2.1,0.9,3.1C19,9.2,19,10.9,19,12c-0.1,1.1,0,2.4-0.5,3.4c-0.5,1.1-1.4,1.5-2.5,1.6 c-1.2,0.1-8.6,0.1-11,0c-1.1-0.1-2.4-0.1-3.2-1c-0.7-0.8-0.7-2-0.8-3C1,11.8,1,10.1,1,8.9c0-1.1,0-2.4,0.5-3.4C2,4.5,3,4.3,4.1,4.2 C5.3,4.1,12.6,4,15,4.1z M8,7.5v6l5.5-3L8,7.5z"></path>
</svg>
</a>
</div>
</div>
</div>
</div>
<div id="footer#9" class="uk-width-expand@s">
<div id="footer#10" class="uk-width-medium uk-text-left@s uk-text-center uk-panel">
<h3 class="el-title uk-h6">Dashboards</h3>
<ul class="uk-nav uk-nav-default uk-nav-parent-icon uk-nav-accordion" uk-nav="">
<li><a href="https://explore.openaire.eu" target="_blank">Explore</a></li>
<li><a href="https://provide.openaire.eu" target="_blank">Provide</a></li>
<li><a href="https://connect.openaire.eu/" target="_blank">Connect</a></li>
<li><a href="https://monitor.openaire.eu" target="_blank">Monitor</a></li>
<li><a href="https://develop.openaire.eu" target="_blank">Develop</a></li>
</ul>
</div>
</div>
<div id="footer#11" class="uk-width-expand@s">
<div id="footer#12" class="uk-width-medium uk-text-left@s uk-text-center uk-panel">
<h3 class="el-title uk-h6">Support</h3>
<ul class="uk-nav uk-nav-default uk-nav-parent-icon uk-nav-accordion" uk-nav="">
<li><a href="https://www.openaire.eu/contact-noads">NOADs</a></li>
<li><a target="_blank" href="https://www.openaire.eu/guides">Guides</a></li>
<li><a target="_blank" href="https://www.openaire.eu/faqs">FAQs</a></li>
<li><a target="_blank" href="https://www.openaire.eu/frontpage/webinars">Webinars</a></li>
<li><a target="_blank" href="https://www.openaire.eu/support/helpdesk">Ask a question</a></li>
</ul>
</div>
</div>
<div id="footer#13" class="uk-width-expand@s">
<div id="footer#14" class="uk-width-medium uk-text-left@s uk-text-center uk-panel">
<h3 class="el-title uk-h6">Updates</h3>
<ul class="uk-nav uk-nav-default uk-nav-parent-icon uk-nav-accordion" uk-nav="">
<li><a target="_blank" href="https://www.openaire.eu/news/">News</a></li>
<li><a target="_blank" href="https://www.openaire.eu/events">Events</a></li>
<li><a target="_blank" href="https://www.openaire.eu/blogs/magazine">Blogs</a></li>
<li><a href="https://www.openaire.eu/newsletter/listing">Newsletters</a></li>
<li><a target="_blank" href="https://www.openaire.eu/documents">Documents</a></li>
</ul>
</div>
</div>
</div>
</div>
</div>
</div>
<div class="uk-section-primary uk-section uk-section-xsmall">
<div class="uk-container">
<div class="uk-grid-margin uk-grid uk-grid-stack" uk-grid="">
<div class="uk-width-expand@m">
</div>
</div>
</div>
</div>
<div class="uk-section-primary uk-section uk-section-xsmall">
<div class="uk-container uk-container-expand">
<div class="uk-grid-margin uk-grid" uk-grid="">
<div class="uk-width-small@m uk-first-column">
</div>
<div class="uk-width-expand@m">
<div id="footer#22" class=" uk-text-small uk-margin uk-margin-remove-bottom uk-text-center@m uk-text-center uk-text-lead">
<a href="http://creativecommons.org/licenses/by/4.0/" rel="license">
<img src="/images/Icons/cc.svg" uk-svg="" hidden="true">
<svg viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg" width="24" height="24" class=" uk-svg">
<title></title>
<g data-name="Creative Commons" id="Creative_Commons">
<circle cx="12" cy="12" r="11.5" style="fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round"></circle>
<path d="M10.87,10a3.5,3.5,0,1,0,0,4" style="fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round"></path>
<path d="M18.87,10a3.5,3.5,0,1,0,0,4" style="fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round"></path>
</g>
</svg>
&nbsp;<img src="/images/Icons/cc-by.svg" uk-svg="" hidden="true">
<svg viewBox="0 0 24 24" xmlns="http://www.w3.org/2000/svg" width="24" height="24" class=" uk-svg">
<title></title>
<g id="Attribution">
<g data-name="<Group>" id="_Group_">
<circle cx="12" cy="5" data-name="<Path>" id="_Path_" r="1.5" style="fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round"></circle>
<path d="M12,8a5,5,0,0,0-3.07.71,1,1,0,0,0-.43.83V15H10v5.5h4V15h1.5V9.54a1,1,0,0,0-.43-.83A5,5,0,0,0,12,8Z" data-name="<Path>" id="_Path_2" style="fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round"></path>
</g>
<circle cx="12" cy="12" r="11.5" style="fill:none;stroke:#ffffff;stroke-linecap:round;stroke-linejoin:round"></circle>
</g>
</svg>
</a>
&nbsp;Unless otherwise indicated, all materials created by OpenAIRE are licenced under&nbsp;<a href="http://creativecommons.org/licenses/by/4.0/" rel="license">CC ATTRIBUTION 4.0 INTERNATIONAL LICENSE</a>.
</div>
</div>
<div class="uk-width-small@m">
<div class="uk-margin uk-margin-remove-top uk-margin-remove-bottom uk-text-right@m uk-text-center">
<a href="#" uk-totop="" uk-scroll="" class="uk-totop uk-icon">
</a>
</div>
</div>
</div>
</div>
</div>
<!-- FOOTER ENDS HERE-->
</div>
</body>
</html>

View File

@ -165,20 +165,12 @@
<div uk-grid="" class="uk-grid uk-grid-stack">
<div class="tm-main uk-width-1-1@s uk-width-1-1@m uk-width-1-1@l uk-row-first uk-first-column">
<!-- Content GOES HERE-->
<div class="uk-alert-danger" uk-alert>
<h3>Contribute to improve the OpenAIRE Research Graph</h3>
<p>You can explore and test the beta release of the OpenAIRE Research Graph via the <a href="https://beta.explore.openaire.eu">OpenAIRE BETA Explore Portal</a> or via data dumps made available in <a href="https://zenodo.org/communities/openaire-research-graph">Zenodo</a>. </p>
<p>Help us make the graph ready for its first production release by providing your feedback.<br/>
Go to the <a href="https://trello.com/b/o1tEJ3rN/openaire-research-graph">OpenAIRE Research Graph Trello Board</a> to report content quality issues, including missing metadata records, wrong values, mistakes in the detection of duplicates and anything else that looks "weird" or wrong.</p>
<p>Find the complete information about the OpenAIRE Research Graph, how to test it and contribute to improve it on <a href="https://www.openaire.eu/blogs/the-openaire-research-graph">our blog</a>.</p>
</div>
<h2>OpenAIRE Research Graph Dumps</h2>
<h2>The OpenAIRE Research Graph</h2>
<p>The OpenAIRE Research Graph is one of the largest open scholarly record collections worldwide, key in fostering Open Science and establishing its practices in the daily research activities.
Conceived as a public and transparent good, populated out of data sources trusted by scientists, the Graph aims at bringing discovery, monitoring, and assessment of science back in the hands of the scientific community.
</p>
<p>Imagine a vast collection of research products all linked together, contextualised and openly available.
For the past ten years OpenAIRE has been working to gather this valuable record. OpenAIRE is pleased to announce the beta release of its Research Graph, a massive collection of metadata and links between
For the past ten years OpenAIRE has been working to gather this valuable record. It is a massive collection of metadata and links between
scientific products such as articles, datasets, software, and other research products, entities like organisations, funders, funding streams, projects, communities, and data sources.
</p>
<p>As of today, the OpenAIRE Research Graph aggregates around 450Mi metadata records with links collected from 10,000 data sources trusted by scientists, including repositories registered in <a href="https://v2.sherpa.ac.uk/opendoar/">OpenDOAR</a>, Open Access journals registered in <a href="https://doaj.org/">DOAJ</a>, <a href="https://www.crossref.org/">Crossref</a>, <a href="https://unpaywall.org">Unpaywall</a>, <a href="https://orcid.org/">ORCID</a> and <a href="https://aka.ms/msracad">Microsoft Academic Graph</a>.
@ -186,71 +178,62 @@
More than 10Mi full-texts of Open Access publications are mined by algorithms to enrich metadata records with additional properties and links among research products, funders, projects, communities, and organizations.
Thanks to the mining algorithm, the graph is completed with 480Mi semantic relations.
</p>
<p>The OpenAIRE Research Graph is available via our <a href="https://beta.explore.openaire.eu">BETA Explore Portal</a> and you can download it from <a href="https://zenodo.org/communities/openaire-research-graph">Zenodo</a>.
</p>
<p>Detailed information can be found on <a href="https://graph.openaire.eu">https://graph.openaire.eu</a></p>
<h3>Get the dumps</h3>
<div>
<p>The OpenAIRE Research Graph is exported as several dump files available on Zenodo (go to <a href="https://doi.org/10.5281/zenodo.3516917"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.3516917.svg" alt="DOI"></a>), so you can download the parts you are interested in. </p>
<ul>
<li> <strong>publications</strong>: metadata records about research literature (includes types of publications listed <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/publication">here</a>)</li>
<li> <strong>datasets</strong>: metadata records about research data (includes the subtypes listed <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/dataset">here</a>)</li>
<li> <strong>software</strong>: metadata records about research software (includes the subtypes listed <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/software">here</a>)</li>
<li> <strong>orps</strong>: metadata records about research products that cannot be classified as research literature, data or software (includes types of products listed <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/other">here</a>)</li>
<li> <strong>organizations</strong>: metadata records about organizations involved in the research life-cycle, such as universities, research organizations, funders.</li>
<li> <strong>content_providers</strong>: metadata records about providers whose content is available in the OpenAIRE Research Graph. They include institutional and thematic repositories, journals, aggregators, funders' databases.</li>
<li> <strong>results_by_funder</strong>: metadata records about research results funded by a given funder. Each result includes information about its type (publications, datasets, software or other) and its specific sub-type (check the list of sub-types for <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/publication">publications</a>, <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/dataset">datasets</a>, <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/software">software</a>, and <a href="http://api.openaire.eu/vocabularies/dnet:result_typologies/other">other research products</a>). </li>
</ul>
<p>The up-to-date list of funders available on OpenAIRE BETA can be found <a href="https://beta.explore.openaire.eu/search/entity-registries?datasourcetypename=%22Funder%20database%22">here on the BETA Explore portal</a>.</p>
<p> In the same <a href="https://zenodo.org/communities/openaire-research-graph">Zenodo community</a> you can also find the dumps of ScholeXplorer and DOIBoost.</p>
<p>To serve different needs, several dumps are available.
All are available under the <a href="https://zenodo.org/communities/openaire-research-graph">Zenodo community called OpenAIRE Research Graph</a>.</p>
<ul>
<li>The <strong>whole OpenAIRE Research Graph Dump</strong><br/>
Dataset: <a href="https://doi.org/10.5281/zenodo.3516917"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.3516917.svg" alt="DOI"></a><br/>
Schema: <a href="https://doi.org/10.5281/zenodo.4238938"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.4238938.svg" alt="DOI"></a><br/>
This dataset is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.<br/>
It is composed of several files so that you can download the parts you are interested in.
Each file is a tar archive containing gz files, each with one JSON record per line.
</li>
<li>The <strong>OpenAIRE COVID-19 dump</strong> <br/>
Dataset: <a href="https://doi.org/10.5281/zenodo.3980490"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.3980490.svg" alt="DOI"></a><br/>
Schema: <a href="https://doi.org/10.5281/zenodo.3974225"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.3974225.svg" alt="DOI"></a><br/>
This dataset is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.<br/>
It contains metadata records of publications, research data, software and projects on the topic of Corona Virus and COVID-19.
This dump is part of the <a href="https://www.openaire.eu/openaire-activities-for-covid-19">activities of OpenAIRE to support the fight against COVID-19</a> together with the <a href="https://covid-19.openaire.eu">OpenAIRE COVID-19 Gateway</a>.
The dump consists of a tar archive containing gzip files with one json per line.
</li>
<li>
The <strong>dumps about research communities, initiatives and infrastructures</strong> <br/>
Dataset: <a href="https://doi.org/10.5281/zenodo.3974604"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.3974604.svg" alt="DOI"></a><br/>
Schema: <a href="https://doi.org/10.5281/zenodo.3974225"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.3974225.svg" alt="DOI"></a><br/>
This dataset is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.<br/>
The dataset contains one file per community/initiative/infrastructure collaborating with OpenAIRE. Check out also their community gateways on <a href="https://connect.openaire.eu">CONNECT</a>.
Each file is a tar archive containing gzip files with one json per line.
</li>
<li>The dump of <strong>ScholeXplorer</strong> <br/>
Dataset: <a href="https://doi.org/10.5281/zenodo.1200252"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.1200252.svg" alt="DOI"></a><br/>
Schema (Scholix version 3): <a href="https://doi.org/10.5281/zenodo.1120275"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.1120275.svg" alt="DOI"></a><br/>
This dataset is licensed under a <a rel="license" href="https://creativecommons.org/publicdomain/zero/1.0/">CC0 1.0 Universal (CC0 1.0) Public Domain Dedication</a>.<br/>
The dataset contains the GZ-compressed dump of the Scholix links exposed by the <a href="https://scholexplorer.openaire.eu">OpenAIRE ScholeXplorer service</a>.
</li>
<li>The dump of <strong>DOIBoost</strong> <br/>
Dataset: <a href="https://doi.org/10.5281/zenodo.1438355"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.1438355.svg" alt="DOI"></a><br/>
Publication: <a href="https://doi.org/10.5281/zenodo.1441071"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.1441071.svg" alt="DOI"></a><br/>
Software: <a href="https://doi.org/10.5281/zenodo.1441057"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.1441057.svg" alt="DOI"></a><br/>
This dataset is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</a>.<br/>
DOIBoost is a metadata collection that enriches CrossRef with inputs from Microsoft Academic Graph, ORCID, and Unpaywall.
</li>
</ul>
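<p>As an illustration of the "tar archive containing gz files with one json per line" layout described above, here is a minimal Python sketch that streams the records without unpacking the archive to disk. It is not an official OpenAIRE tool: the function name <code>iter_dump_records</code> is hypothetical, and the sketch assumes that the compressed members of the archive have names ending in <code>.gz</code> and are UTF-8 encoded.</p>

```python
import gzip
import json
import tarfile


def iter_dump_records(tar_path):
    """Yield one parsed JSON object per line from every .gz member of a tar archive."""
    with tarfile.open(tar_path, mode="r") as archive:
        for member in archive:
            # Assumption: data files are regular members whose names end in ".gz";
            # directories and any other members are skipped.
            if not member.isfile() or not member.name.endswith(".gz"):
                continue
            raw = archive.extractfile(member)
            # Decompress on the fly and read the member as text, one line at a time.
            with gzip.open(raw, mode="rt", encoding="utf-8") as lines:
                for line in lines:
                    line = line.strip()
                    if line:
                        yield json.loads(line)
```

<p>Calling <code>iter_dump_records("publication.tar")</code> (the file name is only an example) returns a generator, so records can be inspected or filtered one at a time without loading a whole file into memory.</p>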
</div>
<h3>Cite us</h3>
<p>If you use any of the dumps above for research purposes, please cite it following the recommendation that you find on its Zenodo page.<br/>
The OpenAIRE Research Graph and DOIBoost include data from <a href="https://aka.ms/msracad">Microsoft Academic Graph</a> (MAG): please also acknowledge MAG following <a href="https://docs.microsoft.com/en-us/academic-services/graph/resources-faq#license">this guideline</a>.<br/>
</p>
<h3>Still using the old XML dumps?</h3>
<div>
<p>The dumps contain XML records compliant with the <b>OpenAIRE data model</b> and with the <b>oaf metadata format</b> (the same format as the records exported via <a href="./oai-pmh.html">OAI-PMH</a>):</p>
<ul>
<li><a href="" target="_blank">See the description of the OpenAIRE data model</a></li>
<li><a href="https://www.openaire.eu/schema/latest/oaf.xsd" target="_blank">See the oaf XML schema</a></li>
<li><a href="https://www.openaire.eu/schema/latest/doc/oaf.html" target="_blank">See the oaf XML schema documentation (generated via Oxygen XML Editor)</a></li>
</ul>
<p>Keep reading for instructions on how to consume the dumps.</p>
Please migrate to the new json dumps. Meanwhile, you can still access the <a href="./graph-dumps-old.html">old documentation here</a>.
</div>
<h3>Consume the dumps</h3>
<div>
Each dump is a gzipped JSON file with one record per line. Each line has the form:
<code>{"_id":{"$oid":"59b82504895be144859a9804"},"body":{"$binary":"base64(zip(XML_record))","$type":"00"}}</code><br/>
where the <code>$binary</code> value inside the <code>body</code> field contains the base64 encoding of the compressed XML record. <br/>
In order to get the XMLs you have to:
<ol>
<li>Decompress the gzipped file</li>
<li>Read the file line by line and parse each line as JSON</li>
<li>Take the value of the <code>$binary</code> field and base64 decode it</li>
<li>Unzip the decoded bytes to obtain the XML record</li>
</ol>
For example, to print the XMLs on the standard output you can run this command on MacOS/Unix/Linux based systems:
<code>gunzip -c file.json.gz | jq '.body."$binary"' -r | while IFS= read -r line; do echo "$line" | base64 --decode | bsdtar -x -O ; done </code><br/>
where
<ul>
<li><code>file.json.gz</code> is the name you gave to the downloaded file dump;</li>
<li><code>jq</code> is a command to parse JSON files. It is not installed by default, but you can easily find it in official repositories. <a href="https://stedolan.github.io/jq/download/">Click here for installation instructions</a>.</li>
<li><code>base64</code> and <code>bsdtar</code> are two command-line tools that are typically pre-installed.</li>
</ul>
Note that you have to decide what to do with the extracted XML records (parse them inline or store them somewhere).
We suggest starting with a few records to test your processing, by adding a <code>head</code> command after the <code>gunzip</code>, like:
<code>gunzip -c file.json.gz | head -n 10 | jq '.body."$binary"' -r | while IFS= read -r line; do echo "$line" | base64 --decode | bsdtar -x -O ; done</code>
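<p>The same decoding can also be sketched in Python using only the standard library. This is a minimal, unofficial example: the function name <code>iter_xml_records</code> is hypothetical, and it assumes that the decoded payload is a ZIP archive (as suggested by <code>zip(XML_record)</code> and the use of <code>bsdtar</code>) and that the XML records are UTF-8 encoded.</p>

```python
import base64
import gzip
import io
import json
import zipfile


def iter_xml_records(dump_path):
    """Yield each XML record of a gzipped, one-JSON-per-line dump as a string."""
    # Decompress the gzipped dump and read it line by line.
    with gzip.open(dump_path, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            if not line.strip():
                continue
            # Keep only the value of the "$binary" field inside "body".
            encoded = json.loads(line)["body"]["$binary"]
            # Base64-decode it to get the compressed bytes.
            compressed = base64.b64decode(encoded)
            # Assumption: the payload is a ZIP archive holding the XML record.
            with zipfile.ZipFile(io.BytesIO(compressed)) as archive:
                for name in archive.namelist():
                    # Assumption: the XML is UTF-8 encoded.
                    yield archive.read(name).decode("utf-8")
```

<p>To test on a few records only, wrap the call in <code>itertools.islice(iter_xml_records("file.json.gz"), 10)</code>, which plays the same role as <code>head -n 10</code> in the shell command.</p>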
</div>
<h3>Cite us</h3>
<p>If you use the OpenAIRE Research Graph for research purposes, please cite it as:<br/>
<i>Manghi, Paolo, Atzori, Claudio, Bardi, Alessia, Shirrwagen, Jochen, Dimitropoulos, Harry, La Bruzzo, Sandro, … Summan, Friedrich. (2019). OpenAIRE Research Graph Dump [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3516917</i><br/>
If you want to cite a specific version, please follow the suggestion on Zenodo. For the current version (1.0.0-beta), please use: <br/>
<i>Manghi, Paolo, Atzori, Claudio, Bardi, Alessia, Shirrwagen, Jochen, Dimitropoulos, Harry, La Bruzzo, Sandro, … Summan, Friedrich. (2019). OpenAIRE Research Graph Dump (Version 1.0.0-beta) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3516918</i><br/>
The OpenAIRE Research Graph includes data from <a href="https://aka.ms/msracad">Microsoft Academic Graph</a> (MAG): please also acknowledge MAG following <a href="https://docs.microsoft.com/en-us/academic-services/graph/resources-faq#license">this guideline</a>.
</p>
<h3>License</h3>
<p>The OpenAIRE Research Graph is released under a CC-BY license.</p>
<p>OpenAIRE is working to produce dumps that contain only metadata records that can be re-distributed under the CC0 license: stay tuned!</p>
</div>
</div>
</div>