Compare commits


150 Commits

Author SHA1 Message Date
Miriam Baglioni e0a0dddfac remove from the set of relations to be included in the dump also those with relType resultService. It is not possible to extend the removeSet parameter because the relClass is IsRelatedTo 2024-04-29 18:30:51 +02:00
Miriam Baglioni fcec4b4225 added the possibility to start the execution of the workflow from the make tar step 2024-04-18 14:19:55 +02:00
Miriam Baglioni ec55a22091 [Funder] changed the chosen funder identifier 2024-03-01 14:50:21 +01:00
Miriam Baglioni d5000502ea - 2024-02-14 11:39:55 +01:00
Miriam Baglioni 47757d2836 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dhp-graph-dump 2024-02-13 12:10:30 +01:00
Miriam Baglioni e4b56a4f88 [Organization] dump for organizations only 2024-02-06 14:13:12 +02:00
Claudio Atzori 1a083b2079 Merge pull request 'code of conduct and contributing' (#9) from contributing into master
Reviewed-on: #9
2024-01-24 15:39:54 +01:00
Claudio Atzori 3c6efc142d added code of conduct and contributing files 2024-01-24 10:54:32 +01:00
Miriam Baglioni 3ad0d6edfc - 2024-01-18 11:18:53 +01:00
Miriam Baglioni 5dcb7019a5 fixed left over 2024-01-08 12:26:16 +01:00
Miriam Baglioni 253ffb42f6 merge origin master after PR merge branch 'master' of https://code-repo.d4science.org/D-Net/dhp-graph-dump 2024-01-08 12:10:45 +01:00
Miriam Baglioni 874a5ea63e Merge pull request 'communityAPI' (#8) from communityAPI into master
Reviewed-on: #8
2024-01-08 11:55:36 +01:00
Miriam Baglioni 53257bc041 fixed error for mapping of open access color 2024-01-08 11:54:33 +01:00
Miriam Baglioni 4f01c5b046 added new fields related to the Irish funder to the mapping 2024-01-04 16:48:13 +01:00
Miriam Baglioni aed2308299 removed not needed file 2024-01-04 16:41:35 +01:00
Miriam Baglioni ac8ef53d02 removed last isLookUpUrl. Added new data for the Irish tender 2023-12-19 11:08:04 +01:00
Miriam Baglioni 14e2027bb0 - 2023-12-13 16:09:03 +01:00
Miriam Baglioni 4c7eb7d2c3 updating gitignore 2023-12-13 16:07:59 +01:00
Miriam Baglioni 9743216062 changed the .gitignore 2023-11-30 14:55:57 +01:00
Miriam Baglioni aefec499a9 avoid dump of trust for fos and sdg 2023-11-30 14:49:05 +01:00
Miriam Baglioni 08f0b1c84c removed not needed parameters 2023-11-27 11:00:07 +01:00
Miriam Baglioni b2ca6b3bb9 modified the .gitignore file 2023-11-22 12:24:48 +01:00
Miriam Baglioni 566c1a9e4e - 2023-11-21 16:57:15 +01:00
Miriam Baglioni d170789adc removed not needed parameter 2023-11-13 18:20:34 +01:00
Miriam Baglioni 0a2c00ce29 - 2023-11-13 18:13:08 +01:00
Miriam Baglioni cc86f24372 removed not needed parameter 2023-11-13 11:38:56 +01:00
Miriam Baglioni d9ca135c1f removed not needed parameter 2023-11-13 10:33:31 +01:00
Miriam Baglioni 332c02c2c1 removed not needed parameter 2023-11-10 14:59:16 +01:00
Miriam Baglioni 998048b494 - 2023-11-10 14:42:07 +01:00
Miriam Baglioni c7eb1f7dbe removed leftover file 2023-11-10 13:54:40 +01:00
Miriam Baglioni 6c0ffd9824 added public community status 2023-11-07 11:51:24 +01:00
Miriam Baglioni e3c1ae809d refactoring 2023-10-31 10:42:43 +01:00
Miriam Baglioni 10ea974d56 removed not needed class 2023-10-31 10:39:29 +01:00
Miriam Baglioni 012d4cece6 adding test classes 2023-10-31 10:38:09 +01:00
Miriam Baglioni 818bb4b11c removing interaction with the IS. Using communityAPIs instead 2023-10-30 14:28:55 +01:00
Miriam Baglioni e91636817c - 2023-10-25 17:32:09 +02:00
Miriam Baglioni eb407ba0d3 first try to remove the IS from the code 2023-10-25 14:58:19 +02:00
Miriam Baglioni 8cf6a40bdf - 2023-10-24 16:02:21 +02:00
Miriam Baglioni 176d6d7f2b change MEDIA_TYPE to avoid error on Zenodo side 2023-10-24 12:45:55 +02:00
Miriam Baglioni f2b890f8a8 - 2023-10-24 08:55:14 +02:00
Miriam Baglioni 5529bbe3cc add retry with exponential backoff and delay between the calls 2023-09-29 15:53:08 +02:00
Miriam Baglioni 32d64dd7a1 added possibility to copy the graph from hive 2023-09-22 15:25:53 +02:00
Miriam Baglioni 9dea4f30ca removed constraint on length of the name 2023-09-19 11:29:27 +02:00
Miriam Baglioni 60e2713d56 - 2023-09-18 14:30:31 +02:00
Miriam Baglioni 9aec98cea0 changed make tar to avoid repetition of name in archive 2023-09-18 12:07:54 +02:00
Miriam Baglioni 4885d36b3b refactoring 2023-08-17 15:11:36 +02:00
Miriam Baglioni f6677429c7 fixed conflicts 2023-08-17 15:03:41 +02:00
Miriam Baglioni c0ce9023a5 Merge pull request 'masterRelationNoNode' (#4) from masterRelationNoNode into master
Reviewed-on: #4
2023-08-17 14:57:41 +02:00
Miriam Baglioni d1f41b8e28 removed organization without legalname and legalshortname from the dump 2023-08-17 10:14:20 +02:00
Miriam Baglioni 24be522e7c fixed NPE, moved class to generate tar 2023-08-07 13:56:58 +02:00
Miriam Baglioni e9aca6b702 refactoring 2023-08-04 19:32:16 +02:00
Miriam Baglioni 5fb58362c5 moved parameter file. Added 40| as prefix on projects for computing the delta 2023-08-04 17:18:15 +02:00
Miriam Baglioni 097905171a adding master duplicate to avoid join of relation. Changed the model for the indicators 2023-08-04 16:22:23 +02:00
Miriam Baglioni 6b113961c1 - 2023-07-28 10:26:22 +02:00
Miriam Baglioni a175ac2c7f [dump] refactoring 2023-07-19 09:40:48 +02:00
Miriam Baglioni 2566b97138 [dumpCSV] remove not needed code 2023-07-17 16:28:46 +02:00
Miriam Baglioni 0482648131 merge 2023-07-17 16:24:57 +02:00
Miriam Baglioni 5ff50d115a [dumpCSV] adding double quotes enclosing all the fields 2023-07-17 16:21:20 +02:00
Miriam Baglioni 81b55dc20b merging with master 2023-07-15 11:15:24 +02:00
Miriam Baglioni 7ccd4e7866 Merge pull request 'Dump per Country' (#3) from dumpSubset into master
Reviewed-on: #3
2023-07-15 11:13:28 +02:00
Miriam Baglioni 25be584028 [dumpSubset] aligned with master 2023-07-15 11:12:27 +02:00
Miriam Baglioni 21a521b97c changed the API to consider the upload only of an already open version 2023-07-15 10:36:54 +02:00
Miriam Baglioni b74d6f1c23 resolved conflict 2023-07-13 18:20:31 +02:00
Miriam Baglioni 787d4d0b4a changed the pom reference to the dho schema 2023-07-13 18:19:12 +02:00
Miriam Baglioni b01573e201 [dumpCSV] removed output directory before starting the jobs 2023-07-12 07:38:53 +02:00
Miriam Baglioni baef25560a [dumpCSV] align pom version with master for graph 2023-07-11 13:47:28 +02:00
Miriam Baglioni 95125d704a [dump] removed usage stats info from the datasource and project 2023-07-11 12:40:21 +02:00
Miriam Baglioni abc30756e4 - 2023-07-07 18:42:41 +02:00
Miriam Baglioni ab791fe424 [master] update reference to ZenodoAPI 2023-07-07 18:12:17 +02:00
Miriam Baglioni 3bfac8bc6e [dumpCSV] addressing the issues pointed out by the Dare Lab people. Repeated relations from author to result due to the author repeated in the data. Repeated relations from result to result due to the same pid present in more than one result. Author table not properly formatted due to the bad formatting of the input data 2023-07-07 18:01:26 +02:00
Miriam Baglioni 9d1b708a89 [dumpCSV] addressing the issues pointed out by the Dare Lab people. Repeated relations from author to result due to the author repeated in the data. Repeated relations from result to result due to the same pid present in more than one result. Author table not properly formatted due to the bad formatting of the input data 2023-07-07 17:44:19 +02:00
Miriam Baglioni 8a44653dbe [DumpCSV] fixing issues 2023-07-05 09:58:55 +02:00
Miriam Baglioni b26fb92838 changed the pom dependency of a different schema 2023-07-01 12:38:18 +02:00
Miriam Baglioni 29b81bef26 refactoring 2023-07-01 11:54:48 +02:00
Miriam Baglioni d53c6850aa Merge pull request 'dump_zenodo_2' (#2) from dump_zenodo_2 into master
Reviewed-on: #2
2023-07-01 11:09:19 +02:00
Miriam Baglioni 3fba247c38 refactoring 2023-07-01 11:07:41 +02:00
Miriam Baglioni 2ac5c4a9ab moved also the model and other linked classes to the dump project 2023-07-01 11:06:41 +02:00
Miriam Baglioni 766288d1c9 Merge branch 'dump_zenodo_2' of https://code-repo.d4science.org/D-Net/dhp-graph-dump into dump_zenodo_2 2023-07-01 10:39:37 +02:00
Miriam Baglioni b9d4d67c72 - 2023-06-30 19:06:15 +02:00
Sandro La Bruzzo d746390b9f new implementation with okhttp 2023-06-23 15:15:09 +02:00
Miriam Baglioni 72ead1bd85 added okhttp3 again 2023-06-23 14:16:15 +02:00
Sandro La Bruzzo 6ace388cff fixed method 2023-06-23 14:16:10 +02:00
Sandro La Bruzzo d472050ad4 Added new implementation upload huge file 2023-06-22 17:43:53 +02:00
Sandro La Bruzzo 5d0d14528f Added new implementation upload huge file 2023-06-22 16:54:17 +02:00
Miriam Baglioni e87b790a60 - 2023-06-22 16:54:13 +02:00
Miriam Baglioni 8661bc0c90 aligned with the last version of pom for production 2023-06-02 16:13:18 +02:00
Miriam Baglioni 2e8639f22d added test to verify the dump for indicators at the level of project and datasource. Fixed issue on identifier with the prefix 2023-06-01 15:10:00 +02:00
Miriam Baglioni 32983e90d1 change to the model of the Relation -> flatten: remove the node and add source, sourceType, target, targetType. Adding indicators at the level of Projects and Datasources. Removing the prefix from the identifier of the entities 2023-06-01 12:58:56 +02:00
Miriam Baglioni 2e0999a1df First implementation of the csv dump 2023-05-29 10:16:47 +02:00
Miriam Baglioni f79b9d5c0d [DUMP CSV] slight modification 2023-05-17 16:58:04 +02:00
Miriam Baglioni 21599598ae [DUMP CSV] test and resources for the result dumps 2023-05-17 16:57:25 +02:00
Miriam Baglioni 66873c1744 [DUMP CSV] Dumping of the results, of the authors and the relationships between results and authors and results and pids 2023-05-17 16:56:28 +02:00
Miriam Baglioni 7563499740 [DUMP CSV] - 2023-05-16 14:29:31 +02:00
Miriam Baglioni f79c06209e [DUMP CSV] test and resources for the SelectResultAndDumpRelation job 2023-05-16 14:21:39 +02:00
Miriam Baglioni 2ed76d4662 [DUMP CSV] tested file to execute the dump of the relations with semantics Cites from nodes belonging to the selected communities. It also dumps the relationships result_communities and prepares the ground for the dump of the results. 2023-05-16 14:20:45 +02:00
Miriam Baglioni 44a256fc90 [DUMP CSV] refactoring 2023-05-16 14:10:14 +02:00
Miriam Baglioni 636945a5c5 [DUMP CSV] refactoring 2023-05-16 14:09:21 +02:00
Miriam Baglioni b9076f9aa8 [DUMP CSV] model classes to mirror the attributes of the tables to be dumped 2023-05-16 14:06:25 +02:00
Miriam Baglioni acb3c691bc [DUMP CSV] added query and method to get the information to dump in the CSV regarding the selected communities 2023-05-16 14:04:44 +02:00
Miriam Baglioni d0f144d422 first implementation for the dump in csv of the subset of the graph related to specific communities. The only relations considered are the cites. The source must be within the set of communities, the target can be outside => we also have to map nodes not related to the communities of interest. These communities are given as parameter 2023-05-11 16:44:54 +02:00
Miriam Baglioni 1fb840ff28 added test classes and resources. removed one step from the workflow since it was not needed 2023-05-04 12:05:10 +02:00
Miriam Baglioni 011b7737ad - 2023-05-02 15:47:06 +02:00
Miriam Baglioni 6ba43a1b67 selects actual result per result type associated with the given country and saves them 2023-04-27 18:16:01 +02:00
Miriam Baglioni 7f57f3cd1e selection of the results id having the given country among the countries, or being in relation with other entities associated with the given country 2023-04-27 18:14:48 +02:00
Miriam Baglioni 1671e78e59 - 2023-04-21 11:32:07 +02:00
Miriam Baglioni 563c5d8527 - 2023-04-19 15:19:03 +02:00
Miriam Baglioni b6e0c7d660 changed the interaction with Zenodo since the API changed 2023-04-19 09:40:45 +02:00
Miriam Baglioni 43e9286db2 Changed the code for the production of the dump for FCT 2023-04-05 19:00:10 +02:00
Miriam Baglioni 80d51cea56 changed dependency in the workflow (leftover with old library name) 2023-03-30 10:29:00 +02:00
Miriam Baglioni f738db860a refactoring 2023-01-25 11:52:51 +01:00
Miriam Baglioni 4dcd03b78e minor changes and fixed wrong number in test because of change in the input resource 2022-12-31 13:00:00 +01:00
Miriam Baglioni 2cae97d049 Merge pull request 'Dump of indicators changed' (#1) from changeMeasure into master
Reviewed-on: #1
2022-12-29 15:20:10 +01:00
Miriam Baglioni b743dc2960 removed class 2022-12-29 15:19:36 +01:00
Miriam Baglioni 5e36b80dc1 merge with changeMeasure 2022-12-29 15:14:20 +01:00
Miriam Baglioni ad1ba563cd update of the schema of the dump 2022-12-29 14:46:51 +01:00
Miriam Baglioni 8ec02787f2 minor changes 2022-12-28 23:00:37 +01:00
Miriam Baglioni 2d2b62386f removed indicators from Instance 2022-12-28 21:50:48 +01:00
Miriam Baglioni 71862838b0 [dump] removed relations extracted from products where the datasource was not in the graph 2022-12-27 10:00:47 +01:00
Miriam Baglioni b26ecd74ea merging with dumpSubset 2022-12-23 09:43:30 +01:00
Miriam Baglioni dc5e79dc64 [dumpSubset] added test to verify why sdsn-gr disappears from the community set 2022-12-23 09:42:49 +01:00
Miriam Baglioni 4bedecaa60 [dumpSubset] added the correct path to the context relations 2022-12-22 13:48:15 +01:00
Miriam Baglioni 62d8180891 [ChangeMeasure] simplified workflow 2022-12-22 09:54:21 +01:00
Miriam Baglioni db36a9be2e [Dump Subset] issue on the relations 2022-12-22 09:38:09 +01:00
Miriam Baglioni 45cc165e92 [Dump Subset] moved one step ahead the change of master in hosted by, collectedfrom 2022-11-30 09:54:45 +01:00
Miriam Baglioni 0a0e2cfc9c refactoring 2022-11-29 16:09:10 +01:00
Miriam Baglioni 054103ae70 [Dump Subset] fixed issue in workflow parameter 2022-11-29 15:32:08 +01:00
Miriam Baglioni 99fb3dc1d0 [Dump Subset] fixed issue in parameter file 2022-11-28 14:58:57 +01:00
Miriam Baglioni f26378f426 [Dump Subset] change code to read from db 2022-11-25 17:52:46 +01:00
Miriam Baglioni 67d48763fa [Dump subset] change the class to read from the db and added needed parameters in the workflow 2022-11-24 10:24:22 +01:00
Miriam Baglioni 0bb97fead7 [Dump Subset] fixing issue with missing datasource: search in collectedfrom at the level of the result and select the master id if a duplicate id is inserted in hostedby or collectedfrom in the result. Added new test 2022-11-22 15:58:50 +01:00
Miriam Baglioni d3da9ab2c6 [Dump Subset] fixing issue and finalizing workflow 2022-11-21 14:10:46 +01:00
Miriam Baglioni 8878b96204 [Dump Subset] first try for dump subset and refactoring 2022-11-17 16:13:10 +01:00
Miriam Baglioni 447af1a851 [DUMP INDICATORS ] added annotation to ignore null values in serialization 2022-11-10 09:46:13 +01:00
Miriam Baglioni 956962453f [DUMP INDICATORS ] changed to accommodate the new indicators 2022-11-10 09:45:06 +01:00
Miriam Baglioni 5544b049a9 [DUMP INDICATORS ] new classes needed for the indicators 2022-11-10 09:44:23 +01:00
Miriam Baglioni 31ce13ffb4 [DUMP INDICATORS ] refactoring 2022-11-10 09:40:42 +01:00
Miriam Baglioni 0a53c29a8f [DUMP INDICATORS ] added code and resource to test the serialization of indicators 2022-11-10 09:39:09 +01:00
Miriam Baglioni bdd1cfc1e0 [DUMP INDICATORS ] added code to serialize the indicators 2022-11-10 09:37:54 +01:00
Miriam Baglioni e222c2c4d7 [DUMP INDICATORS ] added new constants for the indicators 2022-11-10 09:37:28 +01:00
Miriam Baglioni 5e8cd02acd [DUMP INDICATORS ] adding a step of mapping to string with object mapper to support decorator in getter and setter to have 'class' as value for a serialized variable 2022-11-10 09:37:05 +01:00
Miriam Baglioni 4b339df43f [DUMP INDICATORS ] refactoring 2022-11-10 09:32:05 +01:00
Miriam Baglioni 3cc2802a75 [Dump] removing all EOSC related addition from master and fixed issue on dump of datasource pids 2022-10-13 11:49:37 +02:00
Miriam Baglioni 8a574fee2a [Dump] removing all EOSC related addition from master 2022-10-11 11:41:19 +02:00
Miriam Baglioni 80e525e0c1 Changed the jar from dhp-graph-dump to dump 2022-10-04 12:37:24 +02:00
Miriam Baglioni 3fe35345c3 minor changes 2022-10-04 12:13:18 +02:00
Claudio Atzori 6a4589aa2f Merge branch 'master' into changeMeasure 2022-09-27 15:10:57 +02:00
Miriam Baglioni eb06474106 [Extend Measure] added test to verify the new serialization model and the serialization at the level of the result 2022-09-22 18:02:37 +02:00
Miriam Baglioni b5ee457969 added measure at the level of the result. Changed the way the measures are dumped since the previous serialization was not able to describe in the correct way the current association measure and value for the indicators (for BipFinder) 2022-09-22 15:50:39 +02:00
Miriam Baglioni 3905afa0c2 fixed format in measure after modification of the model 2022-08-03 15:40:28 +02:00
Miriam Baglioni e7eb17f73e first attempt at changing the measure element 2022-08-03 12:25:32 +02:00
348 changed files with 20413 additions and 6847 deletions
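
Commit 5529bbe3cc above adds retry with exponential backoff and a delay between the Zenodo calls. The retry code itself is not part of this diff, so the following is only a minimal sketch of the pattern; the class and method names are illustrative, not the project's API.

import java.util.concurrent.Callable;

public class RetrySketch {

	// Retry the call up to maxAttempts (>= 1) times, sleeping
	// baseDelayMillis * 2^attempt between failed attempts.
	public static <T> T withExponentialBackoff(Callable<T> call, int maxAttempts, long baseDelayMillis)
		throws Exception {
		Exception last = null;
		for (int attempt = 0; attempt < maxAttempts; attempt++) {
			try {
				return call.call();
			} catch (Exception e) {
				last = e;
				Thread.sleep(baseDelayMillis * (1L << attempt)); // exponential delay
			}
		}
		throw last; // all attempts failed
	}
}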

.gitignore (vendored, 5 lines changed)

@@ -26,5 +26,8 @@ spark-warehouse
/**/*.log
/**/.factorypath
/**/.scalafmt.conf
/*/*/job.properties
/**/job.properties
/job.properties
/*/job.properties
/*/*/job.properties
/*/*/*/job.properties

CODE_OF_CONDUCT.md (new file, 43 lines)

@@ -0,0 +1,43 @@
# Contributor Code of Conduct
Openness, transparency and our community-driven participatory approach guide us in our day-to-day interactions and decision-making. Our open source projects are no exception. Trust, respect, collaboration and transparency are core values we believe should live and breathe within our projects. Our community welcomes participants from around the world with different experiences, unique perspectives, and great ideas to share.
## Our Pledge
In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, religion, or sexual identity and orientation.
## Our Standards
Examples of behavior that contributes to creating a positive environment include:
- Using welcoming and inclusive language
- Being respectful of differing viewpoints and experiences
- Gracefully accepting constructive criticism
- Attempting collaboration before conflict
- Focusing on what is best for the community
- Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
- Violence, threats of violence, or inciting others to commit self-harm
- The use of sexualized language or imagery and unwelcome sexual attention or advances
- Trolling, intentionally spreading misinformation, insulting/derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or electronic address, without explicit permission
- Abuse of the reporting process to intentionally harass or exclude others
- Advocating for, or encouraging, any of the above behavior
- Other conduct which could reasonably be considered inappropriate in a professional setting
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful.
## Scope
This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org/), [version 1.4](https://www.contributor-covenant.org/version/1/4/code-of-conduct.html).

CONTRIBUTING.md (new file, 10 lines)

@@ -0,0 +1,10 @@
# Contributing to D-Net Hadoop
:+1::tada: First off, thanks for taking the time to contribute! :tada::+1:
This project and everyone participating in it is governed by our [Code of Conduct](CODE_OF_CONDUCT.md). By participating, you are expected to uphold this code. Please report unacceptable behavior to [dnet-team@isti.cnr.it](mailto:dnet-team@isti.cnr.it).
The following is a set of guidelines for contributing to this project and its packages. These are mostly guidelines, not rules, which apply to this project as a whole, including all its sub-modules.
Use your best judgment, and feel free to propose changes to this document in a pull request.
All contributions are welcome; all contributions will be considered to be contributed under the [project license](LICENSE.md).

README.md

@@ -1,2 +1,8 @@
# dhp-graph-dump
This module defines the oozie workflows for creating & publishing the OpenAIRE Graph dumps
This project defines the oozie workflows for creating & publishing the OpenAIRE Graph dumps.
This project adheres to the Contributor Covenant [code of conduct](CODE_OF_CONDUCT.md).
By participating, you are expected to uphold this code. Please report unacceptable behavior to [dnet-team@isti.cnr.it](mailto:dnet-team@isti.cnr.it).
This project is licensed under the [AGPL v3 or later version](LICENSE.md).

pom.xml

@@ -45,5 +45,5 @@
</dependencies>
</project>

pom.xml

@@ -6,7 +6,6 @@
<artifactId>dhp-graph-dump</artifactId>
<groupId>eu.dnetlib.dhp</groupId>
<version>1.2.5-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>
<modelVersion>4.0.0</modelVersion>

ExecCreateSchemas.java

@@ -12,7 +12,8 @@ import com.fasterxml.jackson.databind.SerializationFeature;
import com.github.imifou.jsonschema.module.addon.AddonModule;
import com.github.victools.jsonschema.generator.*;
import eu.dnetlib.dhp.eosc.model.Result;
import eu.dnetlib.dhp.oa.model.community.CommunityResult;
import eu.dnetlib.dhp.oa.model.graph.*;
public class ExecCreateSchemas {
final static String DIRECTORY = "/eu/dnetlib/dhp/schema/dump/jsonschemas/";
@@ -39,6 +40,7 @@ public class ExecCreateSchemas {
.get(Paths.get(getClass().getResource("/").getPath()).toAbsolutePath() + directory)
.toString();
System.out.println(dir);
if (!Files.exists(Paths.get(dir))) {
Files.createDirectories(Paths.get(dir));
}
@@ -59,8 +61,14 @@
ExecCreateSchemas ecs = new ExecCreateSchemas();
ecs.init();
ecs.generate(GraphResult.class, DIRECTORY, "result_schema.json");
ecs.generate(ResearchCommunity.class, DIRECTORY, "community_infrastructure_schema.json");
ecs.generate(Datasource.class, DIRECTORY, "datasource_schema.json");
ecs.generate(Project.class, DIRECTORY, "project_schema.json");
ecs.generate(Relation.class, DIRECTORY, "relation_schema.json");
ecs.generate(Organization.class, DIRECTORY, "organization_schema.json");
ecs.generate(Result.class, DIRECTORY, "eosc_result_schema.json");
ecs.generate(CommunityResult.class, DIRECTORY, "community_result_schema.json");
}
}
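
ExecCreateSchemas above drives the victools jsonschema-generator together with the AddonModule that reads the @JsonSchema(description = ...) annotations used throughout the model classes in this diff. The generate(...) internals are not shown here, so the following is only a sketch of how such a method is typically wired; the class and output path names are illustrative.

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.github.imifou.jsonschema.module.addon.AddonModule;
import com.github.victools.jsonschema.generator.*;

public class SchemaSketch {
	public static void main(String[] args) throws Exception {
		// Register the addon module so the @JsonSchema descriptions end up in the schema
		SchemaGeneratorConfigBuilder builder = new SchemaGeneratorConfigBuilder(
			new ObjectMapper(), SchemaVersion.DRAFT_7, OptionPreset.PLAIN_JSON)
				.with(new AddonModule());
		SchemaGenerator generator = new SchemaGenerator(builder.build());
		// Stand-in target class; ExecCreateSchemas passes e.g. Relation.class
		JsonNode schema = generator.generateSchema(Object.class);
		Path out = Paths.get("target/jsonschemas/relation_schema.json");
		Files.createDirectories(out.getParent());
		Files.write(out, schema.toPrettyString().getBytes(StandardCharsets.UTF_8));
	}
}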

Affiliation.java (deleted)

@@ -1,46 +0,0 @@
package eu.dnetlib.dhp.eosc.model;
import java.io.Serializable;
import java.util.List;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* @author miriam.baglioni
* @Date 13/09/22
*/
public class Affiliation implements Serializable {
@JsonSchema(description = "the OpenAIRE id of the organizaiton")
private String id;
@JsonSchema(description = "the name of the organization")
private String name;
@JsonSchema(description = "the list of pids we have in OpenAIRE for the organization")
private List<OrganizationPid> pid;
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public List<OrganizationPid> getPid() {
return pid;
}
public void setPid(List<OrganizationPid> pid) {
this.pid = pid;
}
}

EoscInteroperabilityFramework.java (deleted)

@@ -1,67 +0,0 @@
package eu.dnetlib.dhp.eosc.model;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* @author miriam.baglioni
* @Date 29/07/22
*/
public class EoscInteroperabilityFramework implements Serializable {
@JsonSchema(description = "EOSC-IF label")
private String label;
@JsonSchema(
description = "EOSC-IF local code. Later on it could be populated with a PID (e.g. DOI), but for the time being we stick to a more loose definition.")
private String code;
@JsonSchema(description = "EOSC-IF url to the guidelines")
private String url;
@JsonSchema(description = "EOSC-IF semantic relation (e.g. compliesWith)")
private String semanticRelation;
public String getLabel() {
return label;
}
public void setLabel(String label) {
this.label = label;
}
public String getCode() {
return code;
}
public void setCode(String code) {
this.code = code;
}
public String getUrl() {
return url;
}
public void setUrl(String url) {
this.url = url;
}
public String getSemanticRelation() {
return semanticRelation;
}
public void setSemanticRelation(String semanticRelation) {
this.semanticRelation = semanticRelation;
}
public static EoscInteroperabilityFramework newInstance(String code, String label, String url,
String semanticRelation) {
EoscInteroperabilityFramework eif = new EoscInteroperabilityFramework();
eif.label = label;
eif.code = code;
eif.url = url;
eif.semanticRelation = semanticRelation;
return eif;
}
}

Indicator.java (deleted)

@@ -1,33 +0,0 @@
package eu.dnetlib.dhp.eosc.model;
import java.io.Serializable;
/**
* @author miriam.baglioni
* @Date 04/11/22
*/
public class Indicator implements Serializable {
private UsageCounts usageCounts;
public UsageCounts getUsageCounts() {
return usageCounts;
}
public void setUsageCounts(UsageCounts usageCounts) {
this.usageCounts = usageCounts;
}
public static Indicator newInstance(UsageCounts uc) {
Indicator i = new Indicator();
i.setUsageCounts(uc);
return i;
}
public static Indicator newInstance(String downloads, String views) {
Indicator i = new Indicator();
i.setUsageCounts(UsageCounts.newInstance(views, downloads));
return i;
}
}

OrganizationPid.java (deleted)

@@ -1,43 +0,0 @@
package eu.dnetlib.dhp.eosc.model;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* @author miriam.baglioni
* @Date 13/09/22
*/
public class OrganizationPid implements Serializable {
@JsonSchema(description = "the type of the organization pid")
private String type;
@JsonSchema(description = "the value of the organization pid")
private String value;
public String getType() {
return type;
}
public void setType(String type) {
this.type = type;
}
public String getValue() {
return value;
}
public void setValue(String value) {
this.value = value;
}
public static OrganizationPid newInstance(String type, String value) {
OrganizationPid op = new OrganizationPid();
op.type = type;
op.value = value;
return op;
}
}

ProjectSummary.java (deleted)

@@ -1,97 +0,0 @@
package eu.dnetlib.dhp.eosc.model;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* @author miriam.baglioni
* @Date 26/01/23
*/
public class ProjectSummary implements Serializable {
@JsonSchema(description = "The OpenAIRE id for the project")
protected String id;// OpenAIRE id
@JsonSchema(description = "The grant agreement number")
protected String code;
@JsonSchema(description = "The acronym of the project")
protected String acronym;
protected String title;
@JsonSchema(description = "Information about the funder funding the project")
private FunderShort funder;
private Provenance provenance;
private Validated validated;
public void setValidated(Validated validated) {
this.validated = validated;
}
public Validated getValidated() {
return validated;
}
public Provenance getProvenance() {
return provenance;
}
public void setProvenance(Provenance provenance) {
this.provenance = provenance;
}
public FunderShort getFunder() {
return funder;
}
public void setFunder(FunderShort funders) {
this.funder = funders;
}
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public String getCode() {
return code;
}
public void setCode(String code) {
this.code = code;
}
public String getAcronym() {
return acronym;
}
public void setAcronym(String acronym) {
this.acronym = acronym;
}
public String getTitle() {
return title;
}
public void setTitle(String title) {
this.title = title;
}
public static ProjectSummary newInstance(String id, String code, String acronym, String title, FunderShort funder) {
ProjectSummary project = new ProjectSummary();
project.setAcronym(acronym);
project.setCode(code);
project.setFunder(funder);
project.setId(id);
project.setTitle(title);
return project;
}
}

Subject.java (deleted)

@@ -1,34 +0,0 @@
package eu.dnetlib.dhp.eosc.model;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* @author miriam.baglioni
* @Date 10/08/22
*/
public class Subject implements Serializable {
@JsonSchema(description = "Why this subject is associated to the result")
private Provenance provenance;
@JsonSchema(description = "The subject value")
private String value;
public Provenance getProvenance() {
return provenance;
}
public void setProvenance(Provenance provenance) {
this.provenance = provenance;
}
public String getValue() {
return value;
}
public void setValue(String value) {
this.value = value;
}
}


@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;

AccessRight.java

@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
/**
* AccessRight. Used to represent the result access rights. It extends the eu.dnetlib.dhp.schema.dump.oaf.BestAccessRight


@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;


@@ -1,10 +1,8 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import org.apache.commons.lang3.StringUtils;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**

AuthorPid.java

@@ -1,10 +1,8 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import org.apache.commons.lang3.StringUtils;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
@@ -38,15 +36,15 @@ public class AuthorPid implements Serializable {
public static AuthorPid newInstance(AuthorPidSchemeValue pid, Provenance provenance) {
AuthorPid p = new AuthorPid();
p.setId(pid);
p.setProvenance(provenance);
p.id = pid;
p.provenance = provenance;
return p;
}
public static AuthorPid newInstance(AuthorPidSchemeValue pid) {
AuthorPid p = new AuthorPid();
p.setId(pid);
p.id = pid;
return p;
}

AuthorPidSchemeValue.java

@@ -1,10 +1,8 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import org.apache.commons.lang3.StringUtils;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
public class AuthorPidSchemeValue implements Serializable {
@@ -39,5 +37,4 @@ public class AuthorPidSchemeValue implements Serializable {
return cf;
}
}


@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;

Container.java

@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
@@ -27,13 +27,10 @@ public class Container implements Serializable {
@JsonSchema(description = "Name of the journal or conference")
private String name;
@JsonSchema(description = "The issn")
private String issnPrinted;
@JsonSchema(description = "The eissn")
private String issnOnline;
@JsonSchema(description = "The lissn")
private String issnLinking;
@JsonSchema(description = "End page")

Country.java

@@ -1,10 +1,8 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import org.apache.commons.lang3.StringUtils;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
@@ -45,4 +43,5 @@ public class Country implements Serializable {
c.setLabel(label);
return c;
}
}

Funder.java

@@ -1,15 +1,11 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* @author miriam.baglioni
* @Date 26/01/23
*/
public class FunderShort implements Serializable {
public class Funder implements Serializable {
@JsonSchema(description = "The short name of the funder (EC)")
private String shortName;
@@ -44,15 +40,4 @@
public void setName(String name) {
this.name = name;
}
@JsonSchema(description = "Stream of funding (e.g. for European Commission can be H2020 or FP7)")
private String fundingStream;
public String getFundingStream() {
return fundingStream;
}
public void setFundingStream(String fundingStream) {
this.fundingStream = fundingStream;
}
}


@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;

ImpactIndicators.java (new file)

@@ -0,0 +1,56 @@
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
/**
* @author miriam.baglioni
* @Date 07/11/22
*/
public class ImpactIndicators implements Serializable {
Score influence;
Score influence_alt;
Score popularity;
Score popularity_alt;
Score impulse;
public Score getInfluence() {
return influence;
}
public void setInfluence(Score influence) {
this.influence = influence;
}
public Score getInfluence_alt() {
return influence_alt;
}
public void setInfluence_alt(Score influence_alt) {
this.influence_alt = influence_alt;
}
public Score getPopularity() {
return popularity;
}
public void setPopularity(Score popularity) {
this.popularity = popularity;
}
public Score getPopularity_alt() {
return popularity_alt;
}
public void setPopularity_alt(Score popularity_alt) {
this.popularity_alt = popularity_alt;
}
public Score getImpulse() {
return impulse;
}
public void setImpulse(Score impulse) {
this.impulse = impulse;
}
}

Indicator.java (new file)

@@ -0,0 +1,34 @@
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import java.util.List;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
public class Indicator implements Serializable {
@JsonSchema(description = "The impact measures (i.e. popularity)")
List<Score> bipIndicators;
@JsonSchema(description = "The usage counts (i.e. downloads)")
UsageCounts usageCounts;
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<Score> getBipIndicators() {
return bipIndicators;
}
public void setBipIndicators(List<Score> bipIndicators) {
this.bipIndicators = bipIndicators;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public UsageCounts getUsageCounts() {
return usageCounts;
}
public void setUsageCounts(UsageCounts usageCounts) {
this.usageCounts = usageCounts;
}
}
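
The @JsonInclude(JsonInclude.Include.NON_NULL) annotations sit on the getters, so Jackson omits each property from the JSON individually when it is null. A minimal sketch of the effect, assuming UsageCounts has a no-arg constructor:

import com.fasterxml.jackson.databind.ObjectMapper;

public class IndicatorSerializationSketch {
	public static void main(String[] args) throws Exception {
		Indicator indicator = new Indicator();
		indicator.setUsageCounts(new UsageCounts()); // bipIndicators stays null
		// The output contains "usageCounts" but no "bipIndicators" key at all
		System.out.println(new ObjectMapper().writeValueAsString(indicator));
	}
}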

Instance.java

@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import java.util.List;
@@ -8,30 +8,26 @@ import com.fasterxml.jackson.annotation.JsonInclude;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* @author miriam.baglioni
* @Date 02/02/23
*/
/**
* It extends eu.dnetlib.dhp.dump.oaf.Instance with values related to the community dump. In the Result dump this
* information is not present because it is dumped as a set of relations between the result and the datasource. -
* hostedby of type eu.dnetlib.dhp.schema.dump.oaf.KeyValue to store the information about the source from which the
* instance can be viewed or downloaded. It is mapped against the hostedby parameter of the instance to be dumped and -
* key corresponds to hostedby.key - value corresponds to hostedby.value - collectedfrom of type
* eu.dnetlib.dhp.schema.dump.oaf.KeyValue to store the information about the source from which the instance has been
* collected. It is mapped against the collectedfrom parameter of the instance to be dumped and - key corresponds to
* collectedfrom.key - value corresponds to collectedfrom.value
* Represents the manifestations (i.e. different versions) of the result. For example: the pre-print and the published
* versions are two manifestations of the same research result. It has the following parameters:
* - license of type String to store the license applied to the instance. It corresponds to the value of the licence in
* the instance to be dumped
* - accessright of type eu.dnetlib.dhp.schema.dump.oaf.AccessRight to store the access right of the instance
* - type of type String to store the type of the instance as defined in the corresponding dnet vocabulary
* (dnet:publication_resource). It corresponds to the instancetype.classname of the instance to be mapped
* - url of type List<String>: the list of locations where the instance is accessible. It corresponds to url of the
* instance to be dumped
* - publicationdate of type String to store the publication date of the instance. It corresponds to dateofacceptance
* - refereed of type String to store information about the review status of the instance. Possible values are
* 'Unknown', 'nonPeerReviewed', 'peerReviewed'. It corresponds to refereed.classname of the instance to be dumped
* - articleprocessingcharge of type APC to store the article processing charges possibly associated to the instance
* - pid of type List<ControlledField>: the list of pids associated to the result coming from authoritative sources for
* that pid
* - alternateIdentifier of type List<ControlledField>: the list of pids associated to the result coming from NON
* authoritative sources for that pid
* - measure of type List<KeyValue> to represent the measures computed for this instance (for example the Bip!Finder
* ones). It corresponds to measures in the model
*/
public class Instance implements Serializable {
@JsonSchema(description = "Information about the source from which the instance can be viewed or downloaded.")
private CfHbKeyValue hostedby;
@JsonSchema(description = "Information about the source from which the record has been collected")
@JsonInclude(JsonInclude.Include.NON_NULL)
private CfHbKeyValue collectedfrom;
@JsonSchema(description = "Measures computed for this instance, for example Bip!Finder ones")
private List<Measure> measures;
// @JsonSchema(description = "Indicators computed for this instance, for example Bip!Finder ones")
// private Indicator indicators;
private List<ResultPid> pid;
@@ -64,26 +60,7 @@
"nonPeerReviewed, UNKNOWN (as defined in https://api.openaire.eu/vocabularies/dnet:review_levels)")
private String refereed; // peer-review status
private String fulltext;
private List<String> eoscDsId;
public List<String> getEoscDsId() {
return eoscDsId;
}
public void setEoscDsId(List<String> eoscId) {
this.eoscDsId = eoscId;
}
public String getFulltext() {
return fulltext;
}
public void setFulltext(String fulltext) {
this.fulltext = fulltext;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getLicense() {
return license;
}
@@ -92,6 +69,7 @@
this.license = license;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public AccessRight getAccessright() {
return accessright;
}
@@ -100,6 +78,7 @@
this.accessright = accessright;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getType() {
return type;
}
@@ -108,6 +87,7 @@
this.type = type;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getUrl() {
return url;
}
@@ -116,6 +96,7 @@
this.url = url;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getPublicationdate() {
return publicationdate;
}
@@ -124,6 +105,7 @@
this.publicationdate = publicationdate;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getRefereed() {
return refereed;
}
@@ -132,6 +114,7 @@
this.refereed = refereed;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public APC getArticleprocessingcharge() {
return articleprocessingcharge;
}
@@ -140,6 +123,7 @@
this.articleprocessingcharge = articleprocessingcharge;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<ResultPid> getPid() {
return pid;
}
@@ -148,6 +132,7 @@
this.pid = pid;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<AlternateIdentifier> getAlternateIdentifier() {
return alternateIdentifier;
}
@@ -156,27 +141,12 @@
this.alternateIdentifier = alternateIdentifier;
}
public List<Measure> getMeasures() {
return measures;
}
public void setMeasures(List<Measure> measures) {
this.measures = measures;
}
public CfHbKeyValue getHostedby() {
return hostedby;
}
public void setHostedby(CfHbKeyValue hostedby) {
this.hostedby = hostedby;
}
public CfHbKeyValue getCollectedfrom() {
return collectedfrom;
}
public void setCollectedfrom(CfHbKeyValue collectedfrom) {
this.collectedfrom = collectedfrom;
}
// @JsonInclude(JsonInclude.Include.NON_NULL)
// public Indicator getIndicators() {
// return indicators;
// }
//
// public void setIndicators(Indicator indicators) {
// this.indicators = indicators;
// }
}


@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;

Measure.java

@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
@@ -8,8 +8,12 @@ import org.apache.commons.lang3.StringUtils;
import com.fasterxml.jackson.annotation.JsonIgnore;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* @author miriam.baglioni
* @Date 03/08/22
*/
public class Measure implements Serializable {
@JsonSchema(description = "The measure (i.e. popularity)")
@JsonSchema(description = "The measure (i.e. class)")
private String key;
@JsonSchema(description = "The value for that measure")
@@ -32,15 +36,14 @@
}
public static Measure newInstance(String key, String value) {
Measure inst = new Measure();
inst.key = key;
inst.value = value;
return inst;
Measure mes = new Measure();
mes.key = key;
mes.value = value;
return mes;
}
@JsonIgnore
public boolean isBlank() {
return StringUtils.isBlank(key) && StringUtils.isBlank(value);
}
}

OpenAccessColor.java (new file)

@@ -0,0 +1,15 @@
package eu.dnetlib.dhp.oa.model;
/**
* @author miriam.baglioni
* @Date 19/12/23
*/
/**
* The OpenAccess color meant to be used on the result level
*/
public enum OpenAccessColor {
gold, hybrid, bronze
}
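
Jackson serializes enum constants by name by default, so the lowercase constants above appear in the dumped JSON as the bare strings "gold", "hybrid" and "bronze". A one-line check:

import com.fasterxml.jackson.databind.ObjectMapper;

public class OpenAccessColorSketch {
	public static void main(String[] args) throws Exception {
		// Jackson writes enum constants via name() by default
		System.out.println(new ObjectMapper().writeValueAsString(OpenAccessColor.gold)); // prints "gold"
	}
}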


@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
/**
* This Enum models the OpenAccess status, currently including only the values from Unpaywall

Project.java (new file)

@@ -0,0 +1,57 @@
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* This class stores the common information about the project that will be dumped for community and for the whole graph:
* - private String id to store the id of the project (OpenAIRE id)
* - private String code to store the grant agreement of the project
* - private String acronym to store the acronym of the project
* - private String title to store the title of the project
*/
public class Project implements Serializable {
@JsonSchema(description = "The OpenAIRE id for the project")
protected String id;// OpenAIRE id
@JsonSchema(description = "The grant agreement number")
protected String code;
@JsonSchema(description = "The acronym of the project")
protected String acronym;
protected String title;
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public String getCode() {
return code;
}
public void setCode(String code) {
this.code = code;
}
public String getAcronym() {
return acronym;
}
public void setAcronym(String acronym) {
this.acronym = acronym;
}
public String getTitle() {
return title;
}
public void setTitle(String title) {
this.title = title;
}
}

Provenance.java

@@ -1,13 +1,12 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import org.apache.commons.lang3.StringUtils;
/**
* @author miriam.baglioni
* @Date 26/01/23
* Indicates the process that produced (or provided) the information, and the trust associated to the information. It
* has two parameters: - provenance of type String to store the provenance of the information, - trust of type String to
* store the trust associated to the information
*/
public class Provenance implements Serializable {
private String provenance;
@@ -31,13 +30,12 @@
public static Provenance newInstance(String provenance, String trust) {
Provenance p = new Provenance();
p.setProvenance(provenance);
p.setTrust(trust);
p.provenance = provenance;
p.trust = trust;
return p;
}
// public String toStringProvenance(Provenance p) {
// return p.getProvenance() + p.getTrust();
// }
public String toString() {
return provenance + trust;
}
}

Result.java

@@ -1,52 +1,127 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import java.util.List;
import java.util.Map;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
* @author miriam.baglioni
* @Date 29/07/22
* To represent the dumped result. It will be extended in the dump for Research Communities - Research
* Initiative/Infrastructures. It has the following parameters:
* - author of type List<eu.dnetlib.dhp.schema.dump.oaf.Author> to describe the authors of a result. For each author in
* the result represented in the internal model one author in the external model is produced.
* - type of type String to represent the category of the result. Possible values are publication, dataset, software,
* other. It corresponds to resulttype.classname of the dumped result
* - language of type eu.dnetlib.dhp.schema.dump.oaf.Language to store information about the language of the result. It
* is dumped as - code corresponds to language.classid - value corresponds to language.classname
* - country of type List<eu.dnetlib.dhp.schema.dump.oaf.Country> to store the country list to which the result is
* associated. For each country in the result represented in the internal model one country in the external model is
* produced
* - subjects of type List<eu.dnetlib.dhp.dump.oaf.Subject> to store the subjects for the result. For each subject in
* the result represented in the internal model one subject in the external model is produced
* - maintitle of type String to store the main title of the result. It corresponds to the value of the first title in
* the result to be dumped having classid equal to "main title"
* - subtitle of type String to store the subtitle of the result. It corresponds to the value of the first title in the
* result to be dumped having classid equal to "subtitle"
* - description of type List<String> to store the description of the result. It corresponds to the list of
* description.value in the result represented in the internal model
* - publicationdate of type String to store the publication date. It corresponds to dateofacceptance.value in the
* result represented in the internal model
* - publisher of type String to store information about the publisher. It corresponds to publisher.value of the result
* represented in the internal model
* - embargoenddate of type String to store the embargo end date. It corresponds to embargoenddate.value of the result
* represented in the internal model
* - source of type List<String>. See the definition of the Dublin Core field dc:source. It corresponds to the list of
* source.value in the result represented in the internal model
* - format of type List<String>. It corresponds to the list of format.value in the result represented in the internal
* model
* - contributor of type List<String> to represent contributors for this result. It corresponds to the list of
* contributor.value in the result represented in the internal model
* - coverage of type String. It corresponds to the list of coverage.value in the result represented in the internal
* model
* - bestaccessright of type eu.dnetlib.dhp.schema.dump.oaf.AccessRight to store information about the openest access
* right associated to the manifestations of this research result. It corresponds to the same parameter in the result
* represented in the internal model
* - container of type eu.dnetlib.dhp.schema.dump.oaf.Container (only for results of type publication). It corresponds
* to the parameter journal of the result represented in the internal model
* - documentationUrl of type List<String> (only for results of type software) to store the URLs to the software
* documentation. It corresponds to the list of documentationUrl.value of the result represented in the internal model
* - codeRepositoryUrl of type String (only for results of type software) to store the URL to the repository with the
* source code. It corresponds to codeRepositoryUrl.value of the result represented in the internal model
* - programmingLanguage of type String (only for results of type software) to store the programming language. It
* corresponds to programmingLanguage.classid of the result represented in the internal model
* - contactperson of type List<String> (only for results of type other) to store the contact person for this result.
* It corresponds to the list of contactperson.value of the result represented in the internal model
* - contactgroup of type List<String> (only for results of type other) to store the information for the contact group.
* It corresponds to the list of contactgroup.value of the result represented in the internal model
* - tool of type List<String> (only for results of type other) to store information about tools useful for the
* interpretation and/or re-use of the research product. It corresponds to the list of tool.value in the result
* represented in the internal model
* - size of type String (only for results of type dataset) to store the size of the dataset. It corresponds to
* size.value in the result represented in the internal model
* - version of type String (only for results of type dataset) to store the version. It corresponds to version.value of
* the result represented in the internal model
* - geolocation of type List<eu.dnetlib.dhp.schema.dump.oaf.GeoLocation> (only for results of type dataset) to store
* geolocation information. For each geolocation element in the result represented in the internal model a GeoLocation
* in the external model is produced
* - id of type String to store the OpenAIRE id of the result. It corresponds to the id of the result represented in
* the internal model
* - originalId of type List<String> to store the original ids of the result. It corresponds to the originalId of the
* result represented in the internal model
* - pid of type List<eu.dnetlib.dhp.schema.dump.oaf.ControlledField> to store the persistent identifiers for the
* result. For each pid in the results represented in the internal model one pid in the external model is produced. The
* value correspondence is: - scheme corresponds to pid.qualifier.classid of the result represented in the internal
* model - value corresponds to the pid.value of the result represented in the internal model
* - dateofcollection of type String to store information about the time OpenAIRE collected the record. It corresponds
* to dateofcollection of the result represented in the internal model
* - lastupdatetimestamp of type String to store the timestamp of the last update of the record. It corresponds to
* lastupdatetimestamp of the record represented in the internal model
*
*/
public class Result implements Serializable {
@JsonSchema(description = "Describes a reference to the EOSC Interoperability Framework (IF) Guidelines")
private List<EoscInteroperabilityFramework> eoscIF;
@JsonSchema(description = "The subject dumped by type associated to the result")
private Map<String, List<Subject>> subject;
@JsonSchema(description = "The list of keywords associated to the result")
private List<String> keywords;
@JsonSchema(description = "The list of organizations the result is affiliated to")
private List<Affiliation> affiliation;
@JsonSchema(description = "The indicators for this result")
private Indicator indicator;
@JsonSchema(description = "List of projects (i.e. grants) that (co-)funded the production ofn the research results")
private List<ProjectSummary> projects;
@JsonSchema(
description = "Reference to a relevant research infrastructure, initiative or community (RI/RC) among those collaborating with OpenAIRE. Please see https://connect.openaire.eu")
private List<Context> context;
@JsonSchema(description = "Information about the sources from which the result has been collected")
@JsonInclude(JsonInclude.Include.NON_NULL)
protected List<CfHbKeyValue> collectedfrom;
@JsonSchema(
description = "Each instance is one specific materialisation or version of the result. For example, you can have one result with three instance: one is the pre-print, one is the post-print, one is te published version")
private List<Instance> instance;
private List<Author> author;
// resulttype allows subclassing results into publications | datasets | software
@JsonProperty("isGreen")
@JsonSchema(description = "True if the result is green Open Access")
private Boolean isGreen;
@JsonSchema(description = "The Open Access Color of the publication")
private OpenAccessColor openAccessColor;
@JsonProperty("isInDiamondJournal")
@JsonSchema(description = "True if the result is published in a Diamond Journal")
private Boolean isInDiamondJournal;
@JsonSchema(description = "True if the result is outcome of a project")
private Boolean publiclyFunded;
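// Without the @JsonProperty annotations above, Jackson would derive the JSON
// names "green" and "inDiamondJournal" from the getter/setter pairs below;
// the annotations pin the serialized keys to "isGreen" and "isInDiamondJournal".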
public Boolean getGreen() {
return isGreen;
}
public void setGreen(Boolean green) {
isGreen = green;
}
public OpenAccessColor getOpenAccessColor() {
return openAccessColor;
}
public void setOpenAccessColor(OpenAccessColor openAccessColor) {
this.openAccessColor = openAccessColor;
}
public Boolean getInDiamondJournal() {
return isInDiamondJournal;
}
public void setInDiamondJournal(Boolean inDiamondJournal) {
isInDiamondJournal = inDiamondJournal;
}
public Boolean getPubliclyFunded() {
return publiclyFunded;
}
public void setPubliclyFunded(Boolean publiclyFunded) {
this.publiclyFunded = publiclyFunded;
}
@JsonSchema(
description = "Type of the result: one of 'publication', 'dataset', 'software', 'other' (see also https://api.openaire.eu/vocabularies/dnet:result_typologies)")
private String type; // resulttype
@@ -57,6 +132,9 @@ public class Result implements Serializable {
@JsonSchema(description = "The list of countries associated to this result")
private List<ResultCountry> country;
@JsonSchema(description = "Keywords associated to the result")
private List<Subject> subjects;
@JsonSchema(
description = "A name or title by which a scientific result is known. May be the title of a publication, of a dataset or the name of a piece of software.")
private String maintitle;
@@ -139,20 +217,19 @@
@JsonSchema(description = "Timestamp of last update of the record in OpenAIRE")
private Long lastupdatetimestamp;
@JsonSchema(description = "The set of relations associated to this result")
private List<Relation> relations;
@JsonSchema(description = "Indicators computed for this result, for example UsageCount ones")
private Indicator indicators;
@JsonSchema(description = "The direct link to the full-text as collected from the data source")
private List<String> fulltext;
public List<String> getFulltext() {
return fulltext;
@JsonInclude(JsonInclude.Include.NON_NULL)
public Indicator getIndicators() {
return indicators;
}
public void setFulltext(List<String> fulltext) {
this.fulltext = fulltext;
public void setIndicators(Indicator indicators) {
this.indicators = indicators;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public Long getLastupdatetimestamp() {
return lastupdatetimestamp;
}
@@ -161,6 +238,7 @@
this.lastupdatetimestamp = lastupdatetimestamp;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getId() {
return id;
}
@@ -169,6 +247,7 @@
this.id = id;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getOriginalId() {
return originalId;
}
@ -177,6 +256,7 @@ public class Result implements Serializable {
this.originalId = originalId;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<ResultPid> getPid() {
return pid;
}
@ -185,6 +265,7 @@ public class Result implements Serializable {
this.pid = pid;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getDateofcollection() {
return dateofcollection;
}
@ -193,10 +274,12 @@ public class Result implements Serializable {
this.dateofcollection = dateofcollection;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<Author> getAuthor() {
return author;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getType() {
return type;
}
@ -205,6 +288,7 @@ public class Result implements Serializable {
this.type = type;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public Container getContainer() {
return container;
}
@ -217,6 +301,7 @@ public class Result implements Serializable {
this.author = author;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public Language getLanguage() {
return language;
}
@ -225,6 +310,7 @@ public class Result implements Serializable {
this.language = language;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<ResultCountry> getCountry() {
return country;
}
@ -233,6 +319,16 @@ public class Result implements Serializable {
this.country = country;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<Subject> getSubjects() {
return subjects;
}
public void setSubjects(List<Subject> subjects) {
this.subjects = subjects;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getMaintitle() {
return maintitle;
}
@ -241,6 +337,7 @@ public class Result implements Serializable {
this.maintitle = maintitle;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getSubtitle() {
return subtitle;
}
@ -249,6 +346,7 @@ public class Result implements Serializable {
this.subtitle = subtitle;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getDescription() {
return description;
}
@ -257,6 +355,7 @@ public class Result implements Serializable {
this.description = description;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getPublicationdate() {
return publicationdate;
}
@ -265,6 +364,7 @@ public class Result implements Serializable {
this.publicationdate = publicationdate;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getPublisher() {
return publisher;
}
@ -273,6 +373,7 @@ public class Result implements Serializable {
this.publisher = publisher;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getEmbargoenddate() {
return embargoenddate;
}
@ -281,6 +382,7 @@ public class Result implements Serializable {
this.embargoenddate = embargoenddate;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getSource() {
return source;
}
@ -289,6 +391,7 @@ public class Result implements Serializable {
this.source = source;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getFormat() {
return format;
}
@ -297,6 +400,7 @@ public class Result implements Serializable {
this.format = format;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getContributor() {
return contributor;
}
@ -305,6 +409,7 @@ public class Result implements Serializable {
this.contributor = contributor;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getCoverage() {
return coverage;
}
@ -313,6 +418,7 @@ public class Result implements Serializable {
this.coverage = coverage;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public BestAccessRight getBestaccessright() {
return bestaccessright;
}
@ -321,6 +427,7 @@ public class Result implements Serializable {
this.bestaccessright = bestaccessright;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getDocumentationUrl() {
return documentationUrl;
}
@ -329,6 +436,7 @@ public class Result implements Serializable {
this.documentationUrl = documentationUrl;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getCodeRepositoryUrl() {
return codeRepositoryUrl;
}
@ -337,6 +445,7 @@ public class Result implements Serializable {
this.codeRepositoryUrl = codeRepositoryUrl;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getProgrammingLanguage() {
return programmingLanguage;
}
@ -345,6 +454,7 @@ public class Result implements Serializable {
this.programmingLanguage = programmingLanguage;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getContactperson() {
return contactperson;
}
@ -353,6 +463,7 @@ public class Result implements Serializable {
this.contactperson = contactperson;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getContactgroup() {
return contactgroup;
}
@ -361,6 +472,7 @@ public class Result implements Serializable {
this.contactgroup = contactgroup;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<String> getTool() {
return tool;
}
@ -369,6 +481,7 @@ public class Result implements Serializable {
this.tool = tool;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getSize() {
return size;
}
@ -377,6 +490,7 @@ public class Result implements Serializable {
this.size = size;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public String getVersion() {
return version;
}
@ -385,6 +499,7 @@ public class Result implements Serializable {
this.version = version;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<GeoLocation> getGeolocation() {
return geolocation;
}
@ -393,83 +508,4 @@ public class Result implements Serializable {
this.geolocation = geolocation;
}
public List<Instance> getInstance() {
return instance;
}
public void setInstance(List<Instance> instance) {
this.instance = instance;
}
public List<CfHbKeyValue> getCollectedfrom() {
return collectedfrom;
}
public void setCollectedfrom(List<CfHbKeyValue> collectedfrom) {
this.collectedfrom = collectedfrom;
}
public List<ProjectSummary> getProjects() {
return projects;
}
public void setProjects(List<ProjectSummary> projects) {
this.projects = projects;
}
public List<Context> getContext() {
return context;
}
public void setContext(List<Context> context) {
this.context = context;
}
public List<Relation> getRelations() {
return relations;
}
public void setRelations(List<Relation> relations) {
this.relations = relations;
}
public Indicator getIndicator() {
return indicator;
}
public void setIndicator(Indicator indicator) {
this.indicator = indicator;
}
public List<String> getKeywords() {
return keywords;
}
public void setKeywords(List<String> keywords) {
this.keywords = keywords;
}
public List<EoscInteroperabilityFramework> getEoscIF() {
return eoscIF;
}
public void setEoscIF(List<EoscInteroperabilityFramework> eoscIF) {
this.eoscIF = eoscIF;
}
public Map<String, List<Subject>> getSubject() {
return subject;
}
public void setSubject(Map<String, List<Subject>> subject) {
this.subject = subject;
}
public List<Affiliation> getAffiliation() {
return affiliation;
}
public void setAffiliation(List<Affiliation> affiliation) {
this.affiliation = affiliation;
}
}
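The @JsonProperty annotations on the Boolean fields above are needed because Jackson derives JSON names from the bean accessors: getGreen()/setGreen() would otherwise be serialized as "green" rather than "isGreen". A minimal sketch of the effect, assuming Jackson 2.x on the classpath (the class and values are illustrative; in Result the annotation sits on the field, while this sketch places it on the getter):

import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

public class BooleanNamingSketch {
	static class Flags {
		private Boolean isGreen;

		// Without the annotation, Jackson would name this property "green",
		// since the implicit name of getGreen()/setGreen() is "green".
		@JsonProperty("isGreen")
		public Boolean getGreen() { return isGreen; }

		public void setGreen(Boolean green) { isGreen = green; }
	}

	public static void main(String[] args) throws Exception {
		Flags f = new Flags();
		f.setGreen(true);
		// prints {"isGreen":true} instead of {"green":true}
		System.out.println(new ObjectMapper().writeValueAsString(f));
	}
}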

View File

@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
@ -38,5 +38,4 @@ public class ResultCountry extends Country {
public static ResultCountry newInstance(String code, String label, String provenance, String trust) {
return newInstance(code, label, Provenance.newInstance(provenance, trust));
}
}

View File

@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;

View File

@ -0,0 +1,46 @@
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import com.fasterxml.jackson.annotation.JsonGetter;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.annotation.JsonSetter;
/**
* @author miriam.baglioni
* @Date 07/11/22
*/
public class Score implements Serializable {
private String indicator;
private String score;
@JsonProperty("class")
private String clazz;
public String getScore() {
return score;
}
public void setScore(String score) {
this.score = score;
}
@JsonGetter("class")
public String getClazz() {
return clazz;
}
@JsonSetter("class")
public void setClazz(String clazz) {
this.clazz = clazz;
}
public String getIndicator() {
return indicator;
}
public void setIndicator(String indicator) {
this.indicator = indicator;
}
}
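Since class is a reserved word in Java, the field is named clazz and the @JsonProperty/@JsonGetter/@JsonSetter trio pins the JSON name to "class". A short round-trip sketch, assuming Jackson 2.x and the Score class above on the classpath (indicator and values are made up):

import com.fasterxml.jackson.databind.ObjectMapper;

import eu.dnetlib.dhp.oa.model.Score;

public class ScoreJsonSketch {
	public static void main(String[] args) throws Exception {
		Score s = new Score();
		s.setIndicator("influence"); // illustrative indicator name
		s.setScore("1.2E-8");        // illustrative value
		s.setClazz("C5");            // rendered in JSON as "class"
		ObjectMapper mapper = new ObjectMapper();
		String json = mapper.writeValueAsString(s);
		System.out.println(json); // e.g. {"indicator":"influence","score":"1.2E-8","class":"C5"}
		// deserialization maps "class" back onto the clazz field
		Score back = mapper.readValue(json, Score.class);
		System.out.println(back.getClazz()); // C5
	}
}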

View File

@ -0,0 +1,40 @@
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
 * To represent keywords associated to the result. It has two parameters:
 * - subject of type eu.dnetlib.dhp.schema.dump.oaf.SubjectSchemeValue to describe the subject. It is mapped as:
 *   - scheme: it corresponds to qualifier.classid of the dumped subject
 *   - value: it corresponds to the subject value
 * - provenance of type eu.dnetlib.dhp.schema.dump.oaf.Provenance to represent the provenance of the subject. It is
 *   dumped only if dataInfo is not null. In this case:
 *   - provenance: it corresponds to dataInfo.provenanceaction.classname
 *   - trust: it corresponds to dataInfo.trust
*/
public class Subject implements Serializable {
private SubjectSchemeValue subject;
@JsonSchema(description = "Why this subject is associated to the result")
private Provenance provenance;
public SubjectSchemeValue getSubject() {
return subject;
}
public void setSubject(SubjectSchemeValue subject) {
this.subject = subject;
}
public Provenance getProvenance() {
return provenance;
}
public void setProvenance(Provenance provenance) {
this.provenance = provenance;
}
}
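Following the mapping described in the Javadoc, a Subject pairs a SubjectSchemeValue with an optional Provenance. A minimal construction sketch (the scheme, value, provenance and trust below are illustrative; Provenance.newInstance(provenance, trust) is assumed from its use elsewhere in this module):

import eu.dnetlib.dhp.oa.model.Provenance;
import eu.dnetlib.dhp.oa.model.Subject;
import eu.dnetlib.dhp.oa.model.SubjectSchemeValue;

public class SubjectSketch {
	public static void main(String[] args) {
		Subject subject = new Subject();
		// scheme <- qualifier.classid of the dumped subject, value <- the subject value
		subject.setSubject(SubjectSchemeValue.newInstance("FOS", "01 natural sciences"));
		// set only when dataInfo is not null in the internal model:
		// provenance <- dataInfo.provenanceaction.classname, trust <- dataInfo.trust
		subject.setProvenance(Provenance.newInstance("sysimport:crosswalk:repository", "0.9"));
	}
}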

View File

@ -0,0 +1,42 @@
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
public class SubjectSchemeValue implements Serializable {
@JsonSchema(
description = "OpenAIRE subject classification scheme (https://api.openaire.eu/vocabularies/dnet:subject_classification_typologies).")
private String scheme;
@JsonSchema(
description = "The value for the subject in the selected scheme. When the scheme is 'keyword', it means that the subject is free-text (i.e. not a term from a controlled vocabulary).")
private String value;
public String getScheme() {
return scheme;
}
public void setScheme(String scheme) {
this.scheme = scheme;
}
public String getValue() {
return value;
}
public void setValue(String value) {
this.value = value;
}
public static SubjectSchemeValue newInstance(String scheme, String value) {
SubjectSchemeValue cf = new SubjectSchemeValue();
cf.setScheme(scheme);
cf.setValue(value);
return cf;
}
}

View File

@ -1,25 +1,15 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model;
import java.io.Serializable;
import org.apache.commons.lang3.StringUtils;
/**
* @author miriam.baglioni
* @Date 04/11/22
* @Date 07/11/22
*/
public class UsageCounts implements Serializable {
private String views;
private String downloads;
public String getViews() {
return views;
}
public void setViews(String views) {
this.views = views;
}
private String views;
public String getDownloads() {
return downloads;
@ -29,15 +19,11 @@ public class UsageCounts implements Serializable {
this.downloads = downloads;
}
public static UsageCounts newInstance(String views, String downloads) {
UsageCounts uc = new UsageCounts();
uc.setViews(views);
uc.setDownloads(downloads);
return uc;
public String getViews() {
return views;
}
public boolean isEmpty() {
return StringUtils.isEmpty(this.downloads) || StringUtils.isEmpty(this.views);
public void setViews(String views) {
this.views = views;
}
}

View File

@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.community;
import java.io.Serializable;
@ -32,15 +32,16 @@ public class CfHbKeyValue implements Serializable {
this.value = value;
}
public static CfHbKeyValue newInstance(String key, String value) {
CfHbKeyValue inst = new CfHbKeyValue();
inst.key = key;
inst.value = value;
return inst;
}
@JsonIgnore
public boolean isBlank() {
return StringUtils.isBlank(key) && StringUtils.isBlank(value);
}
public static CfHbKeyValue newInstance(String key, String value) {
CfHbKeyValue inst = new CfHbKeyValue();
inst.setKey(key);
inst.setValue(value);
return inst;
}
}
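The @JsonIgnore on isBlank() matters because Jackson treats is-prefixed boolean methods as getters: without it a spurious "blank" property would appear next to "key" and "value" in the serialized record. A quick sketch, assuming Jackson 2.x (the identifier and name are illustrative):

import com.fasterxml.jackson.databind.ObjectMapper;

import eu.dnetlib.dhp.oa.model.community.CfHbKeyValue;

public class CfHbKeyValueSketch {
	public static void main(String[] args) throws Exception {
		CfHbKeyValue cf = CfHbKeyValue.newInstance("10|openaire____::abc123", "Example Repository");
		// thanks to @JsonIgnore this prints only the two real fields:
		// {"key":"10|openaire____::abc123","value":"Example Repository"}
		System.out.println(new ObjectMapper().writeValueAsString(cf));
	}
}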

View File

@ -0,0 +1,43 @@
package eu.dnetlib.dhp.oa.model.community;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
import eu.dnetlib.dhp.oa.model.Instance;
/**
* It extends eu.dnetlib.dhp.dump.oaf.Instance with values related to the community dump. In the Result dump this
 * information is not present because it is dumped as a set of relations between the result and the datasource.
 * - hostedby of type eu.dnetlib.dhp.schema.dump.oaf.KeyValue to store the information about the source from which the
 *   instance can be viewed or downloaded. It is mapped against the hostedby parameter of the instance to be dumped:
 *   - key corresponds to hostedby.key
 *   - value corresponds to hostedby.value
 * - collectedfrom of type eu.dnetlib.dhp.schema.dump.oaf.KeyValue to store the information about the source from which
 *   the instance has been collected. It is mapped against the collectedfrom parameter of the instance to be dumped:
 *   - key corresponds to collectedfrom.key
 *   - value corresponds to collectedfrom.value
*/
public class CommunityInstance extends Instance {
@JsonSchema(description = "Information about the source from which the instance can be viewed or downloaded.")
private CfHbKeyValue hostedby;
@JsonSchema(description = "Information about the source from which the record has been collected")
private CfHbKeyValue collectedfrom;
@JsonInclude(JsonInclude.Include.NON_NULL)
public CfHbKeyValue getHostedby() {
return hostedby;
}
public void setHostedby(CfHbKeyValue hostedby) {
this.hostedby = hostedby;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public CfHbKeyValue getCollectedfrom() {
return collectedfrom;
}
public void setCollectedfrom(CfHbKeyValue collectedfrom) {
this.collectedfrom = collectedfrom;
}
}
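A construction sketch for the two community-specific fields, mirroring the hostedby/collectedfrom mapping described in the Javadoc (the datasource identifiers and names are illustrative):

import eu.dnetlib.dhp.oa.model.community.CfHbKeyValue;
import eu.dnetlib.dhp.oa.model.community.CommunityInstance;

public class CommunityInstanceSketch {
	public static void main(String[] args) {
		CommunityInstance instance = new CommunityInstance();
		// key <- hostedby.key, value <- hostedby.value
		instance.setHostedby(CfHbKeyValue.newInstance("10|doajarticles::xyz", "Example Journal"));
		// key <- collectedfrom.key, value <- collectedfrom.value
		instance.setCollectedfrom(CfHbKeyValue.newInstance("10|openaire____::abc", "Example Aggregator"));
	}
}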

View File

@ -0,0 +1,75 @@
package eu.dnetlib.dhp.oa.model.community;
import java.util.List;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
import eu.dnetlib.dhp.oa.model.Result;
/**
 * Extends eu.dnetlib.dhp.schema.dump.oaf.Result with the following parameters:
 * - projects of type List<eu.dnetlib.dhp.schema.dump.oaf.community.Project> to store the list of projects related to
 *   the result. The information is added after the result is mapped to the external model
 * - context of type List<eu.dnetlib.dhp.schema.dump.oaf.community.Context> to store information about the RC/RI
 *   related to the result. For each context in the result represented in the internal model one context in the
 *   external model is produced
 * - collectedfrom of type List<eu.dnetlib.dhp.schema.dump.oaf.KeyValue> to store information about the sources from
 *   which the record has been collected. For each collectedfrom in the result represented in the internal model one
 *   collectedfrom in the external model is produced
 * - instance of type List<eu.dnetlib.dhp.schema.dump.oaf.community.CommunityInstance> to store all the instances
 *   associated to the result. It corresponds to the same parameter in the result represented in the internal model
*/
public class CommunityResult extends Result {
@JsonSchema(description = "List of projects (i.e. grants) that (co-)funded the production ofn the research results")
private List<Project> projects;
@JsonSchema(
description = "Reference to a relevant research infrastructure, initiative or community (RI/RC) among those collaborating with OpenAIRE. Please see https://connect.openaire.eu")
private List<Context> context;
@JsonSchema(description = "Information about the sources from which the record has been collected")
protected List<CfHbKeyValue> collectedfrom;
@JsonSchema(
description = "Each instance is one specific materialisation or version of the result. For example, you can have one result with three instance: one is the pre-print, one is the post-print, one is te published version")
private List<CommunityInstance> instance;
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<CommunityInstance> getInstance() {
return instance;
}
public void setInstance(List<CommunityInstance> instance) {
this.instance = instance;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<CfHbKeyValue> getCollectedfrom() {
return collectedfrom;
}
public void setCollectedfrom(List<CfHbKeyValue> collectedfrom) {
this.collectedfrom = collectedfrom;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<Project> getProjects() {
return projects;
}
public void setProjects(List<Project> projects) {
this.projects = projects;
}
@JsonInclude(JsonInclude.Include.NON_NULL)
public List<Context> getContext() {
return context;
}
public void setContext(List<Context> context) {
this.context = context;
}
}
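Because the getters here (and in the parent Result) are annotated with @JsonInclude(JsonInclude.Include.NON_NULL), enrichments that were never attached, such as projects or context, are simply omitted from the serialized record. A minimal sketch, assuming Jackson 2.x (the title is illustrative):

import com.fasterxml.jackson.databind.ObjectMapper;

import eu.dnetlib.dhp.oa.model.community.CommunityResult;

public class CommunityResultSketch {
	public static void main(String[] args) throws Exception {
		CommunityResult r = new CommunityResult();
		r.setMaintitle("An example title"); // setter inherited from Result
		// properties whose getters carry @JsonInclude(NON_NULL) disappear when unset,
		// so no "projects", "context", "collectedfrom" or "instance" keys are emitted
		System.out.println(new ObjectMapper().writeValueAsString(r));
	}
}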

View File

@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.community;
import java.util.List;
import java.util.Objects;
@ -8,6 +8,8 @@ import java.util.stream.Collectors;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
import eu.dnetlib.dhp.oa.model.Provenance;
/**
* Reference to a relevant research infrastructure, initiative or community (RI/RC) among those collaborating with
 * OpenAIRE. It extends eu.dnetlib.dhp.schema.dump.oaf.Qualifier with a parameter provenance of type

View File

@ -0,0 +1,24 @@
package eu.dnetlib.dhp.oa.model.community;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
 * To store information about the funder funding the project related to the result. It has the following parameters:
 * - shortName of type String to store the funder short name (e.g. AKA)
 * - name of type String to store the funder name (e.g. Academy of Finland)
 * - fundingStream of type String to store the funding stream
 * - jurisdiction of type String to store the jurisdiction of the funder
*/
public class Funder extends eu.dnetlib.dhp.oa.model.Funder {
@JsonSchema(description = "Stream of funding (e.g. for European Commission can be H2020 or FP7)")
private String fundingStream;
public String getFundingStream() {
return fundingStream;
}
public void setFundingStream(String fundingStream) {
this.fundingStream = fundingStream;
}
}

View File

@ -0,0 +1,58 @@
package eu.dnetlib.dhp.oa.model.community;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
import eu.dnetlib.dhp.oa.model.Provenance;
/**
 * To store information about the project related to the result. This information is not directly mapped from the
 * result represented in the internal model because it is not there. The mapped result will be enriched with project
 * information derived by the relations between results and projects. Project extends
 * eu.dnetlib.dhp.schema.dump.oaf.Project with the following parameters:
 * - funder of type eu.dnetlib.dhp.schema.dump.oaf.community.Funder to store information about the funder funding the
 *   project
 * - provenance of type eu.dnetlib.dhp.schema.dump.oaf.Provenance to store information about the provenance of the
 *   association between the result and the project
*/
public class Project extends eu.dnetlib.dhp.oa.model.Project {
@JsonSchema(description = "Information about the funder funding the project")
private Funder funder;
private Provenance provenance;
private Validated validated;
public void setValidated(Validated validated) {
this.validated = validated;
}
public Validated getValidated() {
return validated;
}
public Provenance getProvenance() {
return provenance;
}
public void setProvenance(Provenance provenance) {
this.provenance = provenance;
}
public Funder getFunder() {
return funder;
}
public void setFunder(Funder funders) {
this.funder = funders;
}
public static Project newInstance(String id, String code, String acronym, String title, Funder funder) {
Project project = new Project();
project.setAcronym(acronym);
project.setCode(code);
project.setFunder(funder);
project.setId(id);
project.setTitle(title);
return project;
}
}
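A usage sketch for the newInstance factory, assuming the usual bean setters (setShortName, setName) exist on the parent eu.dnetlib.dhp.oa.model.Funder, as its documented field list suggests (all identifiers and names below are illustrative):

import eu.dnetlib.dhp.oa.model.community.Funder;
import eu.dnetlib.dhp.oa.model.community.Project;

public class ProjectSketch {
	public static void main(String[] args) {
		Funder funder = new Funder();
		funder.setShortName("EC");             // assumed setter on the parent Funder
		funder.setName("European Commission"); // assumed setter on the parent Funder
		funder.setFundingStream("H2020");
		Project project = Project.newInstance(
			"40|corda__h2020::1234567890abcdef", // illustrative OpenAIRE project id
			"101000000",                          // illustrative grant code
			"EXAMPLE",                            // acronym
			"An example project",                 // title
			funder);
	}
}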

View File

@ -1,13 +1,13 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.community;
import java.io.Serializable;
import org.apache.commons.lang3.StringUtils;
/**
* @author miriam.baglioni
* @Date 26/01/23
 * To store information about the validation of the relation between a project and a result, as asserted by the funder.
 * It has the following parameters:
 * - validatedByFunder of type Boolean to store whether the association was validated by the funder
 * - validationDate of type String to store the date of the validation
*/
public class Validated implements Serializable {
@ -32,9 +32,8 @@ public class Validated implements Serializable {
public static Validated newInstance(Boolean validated, String validationDate) {
Validated v = new Validated();
v.setValidatedByFunder(validated);
v.setValidationDate(validationDate);
v.validatedByFunder = validated;
v.validationDate = validationDate;
return v;
}
}

View File

@ -0,0 +1,21 @@
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;
public class Constants implements Serializable {
	// collectedFrom goes with isProvidedBy -> taken from ModelSupport
public static final String HOSTED_BY = "isHostedBy";
public static final String HOSTS = "hosts";
	// for community results we use isRelatedTo
public static final String RESULT_ENTITY = "result";
public static final String DATASOURCE_ENTITY = "datasource";
public static final String CONTEXT_ENTITY = "context";
public static final String CONTEXT_ID = "60";
public static final String CONTEXT_NS_PREFIX = "context____";
}

View File

@ -0,0 +1,358 @@
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;
import java.util.List;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
import eu.dnetlib.dhp.oa.model.Container;
import eu.dnetlib.dhp.oa.model.Indicator;
/**
 * To store information about the datasource OpenAIRE collects information from. It contains the following parameters:
 * - id of type String to store the OpenAIRE id for the datasource. It corresponds to the parameter id of the
 *   datasource represented in the internal model
 * - originalId of type List<String> to store the list of original ids associated to the datasource. It corresponds to
 *   the parameter originalId of the datasource represented in the internal model. The null values are filtered out
 * - pid of type List<eu.dnetlib.dhp.schema.dump.oaf.ControlledField> to store the persistent identifiers for the
 *   datasource. For each pid in the datasource represented in the internal model one pid in the external model is
 *   produced as:
 *   - scheme corresponds to pid.qualifier.classid of the datasource represented in the internal model
 *   - value corresponds to pid.value of the datasource represented in the internal model
 * - datasourceType of type eu.dnetlib.dhp.schema.dump.oaf.ControlledField to store the datasource type (e.g.
 *   pubsrepository::institutional, Institutional Repository) as in the dnet vocabulary dnet:datasource_typologies. It
 *   corresponds to datasourcetype of the datasource represented in the internal model and:
 *   - code corresponds to datasourcetype.classid
 *   - value corresponds to datasourcetype.classname
 * - openairecompatibility of type String to store information about the OpenAIRE compatibility of the ingested
 *   results (which guidelines they are compliant to). It corresponds to openairecompatibility.classname of the
 *   datasource represented in the internal model
 * - officialname of type String to store the official name of the datasource. It corresponds to officialname.value of
 *   the datasource represented in the internal model
 * - englishname of type String to store the English name of the datasource. It corresponds to englishname.value of
 *   the datasource represented in the internal model
 * - websiteurl of type String to store the URL of the website of the datasource. It corresponds to websiteurl.value
 *   of the datasource represented in the internal model
 * - logourl of type String to store the URL of the logo for the datasource. It corresponds to logourl.value of the
 *   datasource represented in the internal model
 * - dateofvalidation of type String to store the date of validation against the guidelines for the datasource
 *   records. It corresponds to dateofvalidation.value of the datasource represented in the internal model
 * - description of type String to store the description for the datasource. It corresponds to description.value of
 *   the datasource represented in the internal model
*/
public class Datasource implements Serializable {
@JsonSchema(description = "The OpenAIRE id of the data source")
private String id; // string
@JsonSchema(description = "Original identifiers for the datasource")
private List<String> originalId; // list string
@JsonSchema(description = "Persistent identifiers of the datasource")
private List<DatasourcePid> pid; // list<String>
@JsonSchema(
description = "The type of the datasource. See https://api.openaire.eu/vocabularies/dnet:datasource_typologies")
private DatasourceSchemeValue datasourcetype; // value
@JsonSchema(
description = "OpenAIRE guidelines the data source comply with. See also https://guidelines.openaire.eu.")
private String openairecompatibility; // value
@JsonSchema(description = "The official name of the datasource")
private String officialname; // string
@JsonSchema(description = "The English name of the datasource")
private String englishname; // string
private String websiteurl; // string
private String logourl; // string
@JsonSchema(description = "The date of last validation against the OpenAIRE guidelines for the datasource records")
private String dateofvalidation; // string
private String description; // description
@JsonSchema(description = "List of subjects associated to the datasource")
private List<String> subjects; // List<String>
// opendoar specific fields (od*)
@JsonSchema(description = "The languages present in the data source's content, as defined by OpenDOAR.")
private List<String> languages; // odlanguages List<String>
@JsonSchema(description = "Types of content in the data source, as defined by OpenDOAR")
private List<String> contenttypes; // odcontent types List<String>
// re3data fields
@JsonSchema(description = "Releasing date of the data source, as defined by re3data.org")
private String releasestartdate; // string
@JsonSchema(
description = "Date when the data source went offline or stopped ingesting new research data. As defined by re3data.org")
private String releaseenddate; // string
@JsonSchema(
description = "The URL of a mission statement describing the designated community of the data source. As defined by re3data.org")
private String missionstatementurl; // string
@JsonSchema(
description = "Type of access to the data source, as defined by re3data.org. Possible values: " +
"{open, restricted, closed}")
private String accessrights; // databaseaccesstype string
// {open, restricted or closed}
@JsonSchema(description = "Type of data upload. As defined by re3data.org: one of {open, restricted,closed}")
private String uploadrights; // datauploadtype string
@JsonSchema(
description = "Access restrinctions to the data source, as defined by re3data.org. One of {feeRequired, registration, other}")
private String databaseaccessrestriction; // string
@JsonSchema(
description = "Upload restrictions applied by the datasource, as defined by re3data.org. One of {feeRequired, registration, other}")
private String datauploadrestriction; // string
@JsonSchema(description = "As defined by redata.org: 'yes' if the data source supports versioning, 'no' otherwise.")
private Boolean versioning; // boolean
@JsonSchema(
description = "The URL of the data source providing information on how to cite its items. As defined by re3data.org.")
private String citationguidelineurl; // string
// {yes, no, unknown}
@JsonSchema(
description = "The persistent identifier system that is used by the data source. As defined by re3data.org")
private String pidsystems; // string
@JsonSchema(
description = "The certificate, seal or standard the data source complies with. As defined by re3data.org.")
private String certificates; // string
@JsonSchema(description = "Policies of the data source, as defined in OpenDOAR.")
private List<String> policies; //
@JsonSchema(description = "Information about the journal, if this data source is of type Journal.")
private Container journal; // issn etc del Journal
// @JsonSchema(description = "Indicators computed for this Datasource, for example UsageCount ones")
// private Indicator indicators;
//
// public Indicator getIndicators() {
// return indicators;
// }
//
// public void setIndicators(Indicator indicators) {
// this.indicators = indicators;
// }
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public List<String> getOriginalId() {
return originalId;
}
public void setOriginalId(List<String> originalId) {
this.originalId = originalId;
}
public List<DatasourcePid> getPid() {
return pid;
}
public void setPid(List<DatasourcePid> pid) {
this.pid = pid;
}
public DatasourceSchemeValue getDatasourcetype() {
return datasourcetype;
}
public void setDatasourcetype(DatasourceSchemeValue datasourcetype) {
this.datasourcetype = datasourcetype;
}
public String getOpenairecompatibility() {
return openairecompatibility;
}
public void setOpenairecompatibility(String openairecompatibility) {
this.openairecompatibility = openairecompatibility;
}
public String getOfficialname() {
return officialname;
}
public void setOfficialname(String officialname) {
this.officialname = officialname;
}
public String getEnglishname() {
return englishname;
}
public void setEnglishname(String englishname) {
this.englishname = englishname;
}
public String getWebsiteurl() {
return websiteurl;
}
public void setWebsiteurl(String websiteurl) {
this.websiteurl = websiteurl;
}
public String getLogourl() {
return logourl;
}
public void setLogourl(String logourl) {
this.logourl = logourl;
}
public String getDateofvalidation() {
return dateofvalidation;
}
public void setDateofvalidation(String dateofvalidation) {
this.dateofvalidation = dateofvalidation;
}
public String getDescription() {
return description;
}
public void setDescription(String description) {
this.description = description;
}
public List<String> getSubjects() {
return subjects;
}
public void setSubjects(List<String> subjects) {
this.subjects = subjects;
}
public List<String> getLanguages() {
return languages;
}
public void setLanguages(List<String> languages) {
this.languages = languages;
}
public List<String> getContenttypes() {
return contenttypes;
}
public void setContenttypes(List<String> contenttypes) {
this.contenttypes = contenttypes;
}
public String getReleasestartdate() {
return releasestartdate;
}
public void setReleasestartdate(String releasestartdate) {
this.releasestartdate = releasestartdate;
}
public String getReleaseenddate() {
return releaseenddate;
}
public void setReleaseenddate(String releaseenddate) {
this.releaseenddate = releaseenddate;
}
public String getMissionstatementurl() {
return missionstatementurl;
}
public void setMissionstatementurl(String missionstatementurl) {
this.missionstatementurl = missionstatementurl;
}
public String getAccessrights() {
return accessrights;
}
public void setAccessrights(String accessrights) {
this.accessrights = accessrights;
}
public String getUploadrights() {
return uploadrights;
}
public void setUploadrights(String uploadrights) {
this.uploadrights = uploadrights;
}
public String getDatabaseaccessrestriction() {
return databaseaccessrestriction;
}
public void setDatabaseaccessrestriction(String databaseaccessrestriction) {
this.databaseaccessrestriction = databaseaccessrestriction;
}
public String getDatauploadrestriction() {
return datauploadrestriction;
}
public void setDatauploadrestriction(String datauploadrestriction) {
this.datauploadrestriction = datauploadrestriction;
}
public Boolean getVersioning() {
return versioning;
}
public void setVersioning(Boolean versioning) {
this.versioning = versioning;
}
public String getCitationguidelineurl() {
return citationguidelineurl;
}
public void setCitationguidelineurl(String citationguidelineurl) {
this.citationguidelineurl = citationguidelineurl;
}
public String getPidsystems() {
return pidsystems;
}
public void setPidsystems(String pidsystems) {
this.pidsystems = pidsystems;
}
public String getCertificates() {
return certificates;
}
public void setCertificates(String certificates) {
this.certificates = certificates;
}
public List<String> getPolicies() {
return policies;
}
public void setPolicies(List<String> policiesr3) {
this.policies = policiesr3;
}
public Container getJournal() {
return journal;
}
public void setJournal(Container journal) {
this.journal = journal;
}
}
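A population sketch showing how a few of the fields above fit together, serialized with Jackson (all values are illustrative, including the compatibility string):

import java.util.Arrays;

import com.fasterxml.jackson.databind.ObjectMapper;

import eu.dnetlib.dhp.oa.model.graph.Datasource;
import eu.dnetlib.dhp.oa.model.graph.DatasourcePid;
import eu.dnetlib.dhp.oa.model.graph.DatasourceSchemeValue;

public class DatasourceSketch {
	public static void main(String[] args) throws Exception {
		Datasource d = new Datasource();
		d.setId("10|opendoar____::1234567890abcdef"); // illustrative OpenAIRE id
		d.setOfficialname("Example Institutional Repository");
		d.setEnglishname("Example Institutional Repository");
		// scheme/value pair from dnet:datasource_typologies
		d.setDatasourcetype(DatasourceSchemeValue.newInstance(
			"pubsrepository::institutional", "Institutional Repository"));
		d.setOpenairecompatibility("openaire4.0"); // illustrative compatibility value
		d.setPid(Arrays.asList(DatasourcePid.newInstance("re3data", "r3d100000001")));
		d.setContenttypes(Arrays.asList("journal articles", "theses"));
		System.out.println(new ObjectMapper().writeValueAsString(d));
	}
}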

View File

@ -0,0 +1,41 @@
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
public class DatasourcePid implements Serializable {
@JsonSchema(description = "The scheme used to express the value ")
private String scheme;
@JsonSchema(description = "The value expressed in the scheme ")
private String value;
public String getScheme() {
return scheme;
}
public void setScheme(String scheme) {
this.scheme = scheme;
}
public String getValue() {
return value;
}
public void setValue(String value) {
this.value = value;
}
public static DatasourcePid newInstance(String scheme, String value) {
DatasourcePid cf = new DatasourcePid();
cf.setScheme(scheme);
cf.setValue(value);
return cf;
}
}

View File

@ -0,0 +1,41 @@
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
// TODO change the DatasourceSchemeValue to DatasourceKeyValue. The scheme is always the dnet one. What we show
// here is the entry in the scheme (the key) and its understandable value
public class DatasourceSchemeValue implements Serializable {
@JsonSchema(description = "The scheme used to express the value (i.e. pubsrepository::journal)")
private String scheme;
@JsonSchema(description = "The value expressed in the scheme (Journal)")
private String value;
public String getScheme() {
return scheme;
}
public void setScheme(String scheme) {
this.scheme = scheme;
}
public String getValue() {
return value;
}
public void setValue(String value) {
this.value = value;
}
public static DatasourceSchemeValue newInstance(String scheme, String value) {
DatasourceSchemeValue cf = new DatasourceSchemeValue();
cf.setScheme(scheme);
cf.setValue(value);
return cf;
}
}

View File

@ -1,14 +1,6 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.graph;
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
@ -16,7 +8,7 @@ import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
 * eu.dnetlib.dhp.schema.dump.oaf.Funder with the following parameter:
 * - private eu.dnetlib.dhp.schema.dump.oaf.graph.Fundings funding_stream to store the fundingstream
*/
public class Funder extends FunderShort {
public class Funder extends eu.dnetlib.dhp.oa.model.Funder {
@JsonSchema(description = "Description of the funding stream")
private Fundings funding_stream;

View File

@ -1,14 +1,6 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.graph;
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;

View File

@ -1,14 +1,6 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.graph;
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;

View File

@ -0,0 +1,28 @@
package eu.dnetlib.dhp.oa.model.graph;
import java.util.List;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
import eu.dnetlib.dhp.oa.model.Instance;
import eu.dnetlib.dhp.oa.model.Result;
/**
* It extends the eu.dnetlib.dhp.schema.dump.oaf.Result with - instance of type
* List<eu.dnetlib.dhp.schema.dump.oaf.Instance> to store all the instances associated to the result. It corresponds to
* the same parameter in the result represented in the internal model
*/
public class GraphResult extends Result {
@JsonSchema(
description = "Each instance is one specific materialisation or version of the result. For example, you can have one result with three instance: one is the pre-print, one is the post-print, one is te published version")
private List<Instance> instance;
public List<Instance> getInstance() {
return instance;
}
public void setInstance(List<Instance> instance) {
this.instance = instance;
}
}

View File

@ -0,0 +1,82 @@
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;
/**
* To store information about the classification for the project. The classification depends on the programme. For example
* H2020-EU.3.4.5.3 can be classified as
* H2020-EU.3. => Societal Challenges (level1)
* H2020-EU.3.4. => Transport (level2)
* H2020-EU.3.4.5. => CLEANSKY2 (level3)
* H2020-EU.3.4.5.3. => IADP Fast Rotorcraft (level4)
*
* We decided to explicitly represent up to three levels in the classification.
*
* H2020Classification has the following parameters:
* - private Programme programme to store the information about the programme related to this classification
* - private String level1 to store the information about the level 1 of the classification (Priority or Pillar of the EC)
 * - private String level2 to store the information about the level2 of the classification (Objectives (?))
* - private String level3 to store the information about the level3 of the classification
* - private String classification to store the entire classification related to the programme
*/
public class H2020Classification implements Serializable {
private Programme programme;
private String level1;
private String level2;
private String level3;
private String classification;
public Programme getProgramme() {
return programme;
}
public void setProgramme(Programme programme) {
this.programme = programme;
}
public String getLevel1() {
return level1;
}
public void setLevel1(String level1) {
this.level1 = level1;
}
public String getLevel2() {
return level2;
}
public void setLevel2(String level2) {
this.level2 = level2;
}
public String getLevel3() {
return level3;
}
public void setLevel3(String level3) {
this.level3 = level3;
}
public String getClassification() {
return classification;
}
public void setClassification(String classification) {
this.classification = classification;
}
public static H2020Classification newInstance(String programme_code, String programme_description, String level1,
String level2, String level3, String classification) {
H2020Classification h2020classification = new H2020Classification();
h2020classification.programme = Programme.newInstance(programme_code, programme_description);
h2020classification.level1 = level1;
h2020classification.level2 = level2;
h2020classification.level3 = level3;
h2020classification.classification = classification;
return h2020classification;
}
}
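A construction sketch that mirrors the H2020-EU.3.4.5.3 example from the Javadoc (the programme description and the format of the flattened classification string are guesses):

import eu.dnetlib.dhp.oa.model.graph.H2020Classification;

public class H2020ClassificationSketch {
	public static void main(String[] args) {
		H2020Classification c = H2020Classification.newInstance(
			"H2020-EU.3.4.5.3.",      // programme code
			"IADP Fast Rotorcraft",   // programme description (assumed)
			"Societal Challenges",    // level1: Priority or Pillar of the EC
			"Transport",              // level2
			"CLEANSKY2",              // level3
			// entire classification related to the programme (format assumed)
			"Societal Challenges | Transport | CLEANSKY2 | IADP Fast Rotorcraft");
	}
}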

View File

@ -1,11 +1,13 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;
import java.util.List;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
import eu.dnetlib.dhp.oa.model.Country;
/**
 * To represent the generic organization. It has the following parameters:
 * - private String legalshortname to store the legalshortname of the organization

View File

@ -0,0 +1,42 @@
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
public class OrganizationPid implements Serializable {
@JsonSchema(description = "The scheme of the identifier (i.e. isni)")
private String scheme;
@JsonSchema(description = "The value in the schema (i.e. 0000000090326370)")
private String value;
public String getScheme() {
return scheme;
}
public void setScheme(String scheme) {
this.scheme = scheme;
}
public String getValue() {
return value;
}
public void setValue(String value) {
this.value = value;
}
public static OrganizationPid newInstance(String scheme, String value) {
OrganizationPid cf = new OrganizationPid();
cf.setScheme(scheme);
cf.setValue(value);
return cf;
}
}

View File

@ -1,14 +1,6 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.graph;
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;

View File

@ -1,19 +1,13 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.graph;
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
/**
* @author miriam.baglioni
* @Date 25/10/23
*/
import java.io.Serializable;
import java.util.List;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
import eu.dnetlib.dhp.oa.model.Indicator;
/**
* This is the class representing the Project in the model used for the dumps of the whole graph. At the moment the dump
* of the Projects differs from the other dumps because we do not create relations between Funders (Organization) and
@ -76,6 +70,17 @@ public class Project implements Serializable {
@JsonSchema(description = "The h2020 programme funding the project")
private List<Programme> h2020programme;
// @JsonSchema(description = "Indicators computed for this project, for example UsageCount ones")
// private Indicator indicators;
//
// public Indicator getIndicators() {
// return indicators;
// }
//
// public void setIndicators(Indicator indicators) {
// this.indicators = indicators;
// }
public String getId() {
return id;
}

View File

@ -1,5 +1,5 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;

View File

@ -1,12 +1,13 @@
package eu.dnetlib.dhp.eosc.model;
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;
import java.util.Objects;
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
import eu.dnetlib.dhp.oa.model.Provenance;
/**
 * To represent the generic relation between two entities. It has the following parameters: - private Node source to
* represent the entity source of the relation - private Node target to represent the entity target of the relation -
@ -14,31 +15,30 @@ import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
* provenance of the relation
*/
public class Relation implements Serializable {
@JsonSchema(description = "The node source in the relation")
@JsonSchema(description = "The identifier of the source in the relation")
private String source;
@JsonSchema(description = "The node target in the relation")
@JsonSchema(description = "The entity type of the source in the relation")
private String sourceType;
@JsonSchema(description = "The identifier of the target in the relation")
private String target;
@JsonSchema(description = "The entity type of the target in the relation")
private String targetType;
@JsonSchema(description = "To represent the semantics of a relation between two entities")
@JsonIgnoreProperties(ignoreUnknown = true)
private RelType reltype;
@JsonSchema(description = "The reason why OpenAIRE holds the relation ")
@JsonIgnoreProperties(ignoreUnknown = true)
private Provenance provenance;
@JsonSchema(description = "The result type of the target for this relation")
@JsonIgnoreProperties(ignoreUnknown = true)
private String targetType;
@JsonSchema(
description = "True if the relation is related to a project and it has been collected from an authoritative source (i.e. the funder)")
private boolean validated;
public String getTargetType() {
return targetType;
}
public void setTargetType(String targetType) {
this.targetType = targetType;
}
@JsonSchema(description = "The date when the relation was collected from OpenAIRE")
private String validationDate;
public String getSource() {
return source;
@ -48,6 +48,14 @@ public class Relation implements Serializable {
this.source = source;
}
public String getSourceType() {
return sourceType;
}
public void setSourceType(String sourceType) {
this.sourceType = sourceType;
}
public String getTarget() {
return target;
}
@ -56,6 +64,14 @@ public class Relation implements Serializable {
this.target = target;
}
public String getTargetType() {
return targetType;
}
public void setTargetType(String targetType) {
this.targetType = targetType;
}
public RelType getReltype() {
return reltype;
}
@ -72,26 +88,37 @@ public class Relation implements Serializable {
this.provenance = provenance;
}
public void setValidated(boolean validate) {
this.validated = validate;
}
public boolean getValidated() {
return validated;
}
public void setValidationDate(String validationDate) {
this.validationDate = validationDate;
}
public String getValidationDate() {
return validationDate;
}
@Override
public int hashCode() {
return Objects.hash(source, target, reltype.getType() + ":" + reltype.getName());
}
public static Relation newInstance(String source, String target, RelType reltype, Provenance provenance) {
public static Relation newInstance(String source, String sourceType, String target, String targetType,
RelType reltype, Provenance provenance) {
Relation relation = new Relation();
relation.source = source;
relation.sourceType = sourceType;
relation.target = target;
relation.targetType = targetType;
relation.reltype = reltype;
relation.provenance = provenance;
return relation;
}
public static Relation newInstance(String source, String target) {
Relation relation = new Relation();
relation.source = source;
relation.target = target;
return relation;
}
}
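With the new sourceType/targetType fields, the entity types of both endpoints travel with the relation instead of being implied by the identifiers. A sketch using the two-argument factory plus the Constants defined earlier in this package (the identifiers are illustrative):

import eu.dnetlib.dhp.oa.model.graph.Constants;
import eu.dnetlib.dhp.oa.model.graph.Relation;

public class RelationSketch {
	public static void main(String[] args) {
		Relation r = Relation.newInstance(
			"50|doi_________::1234567890abcdef",  // illustrative result id
			"10|opendoar____::1234567890abcdef"); // illustrative datasource id
		r.setSourceType(Constants.RESULT_ENTITY);     // "result"
		r.setTargetType(Constants.DATASOURCE_ENTITY); // "datasource"
		r.setValidated(false);
	}
}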

View File

@ -0,0 +1,27 @@
package eu.dnetlib.dhp.oa.model.graph;
import java.util.List;
import com.fasterxml.jackson.annotation.JsonInclude;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
 * To represent RC entities. It extends eu.dnetlib.dhp.dump.oaf.graph.ResearchInitiative by adding the parameter subject
* to store the list of subjects related to the community
*/
public class ResearchCommunity extends ResearchInitiative {
@JsonSchema(
description = "Only for research communities: the list of the subjects associated to the research community")
@JsonInclude(JsonInclude.Include.NON_NULL)
private List<String> subject;
public List<String> getSubject() {
return subject;
}
public void setSubject(List<String> subject) {
this.subject = subject;
}
}

View File

@ -0,0 +1,89 @@
package eu.dnetlib.dhp.oa.model.graph;
import java.io.Serializable;
import com.github.imifou.jsonschema.module.addon.annotation.JsonSchema;
/**
 * To represent an entity of type RC/RI. It has the following parameters, which are mostly derived from the profile:
 * - private String id to store the openaire id for the entity. It has code 00 and will be created as
 *   00|context_____::md5(originalId)
 * - private String originalId to store the id of the context as provided in the profile (i.e. mes)
 * - private String name to store the name of the context (taken from the label attribute in the context definition)
 * - private String type to store the type of the context (i.e. research initiative or research community)
 * - private String description to store the description of the context as given in the profile
 * - private String zenodo_community to store the zenodo community associated to the context (main zenodo community)
*/
public class ResearchInitiative implements Serializable {
@JsonSchema(description = "The OpenAIRE id for the community/research infrastructure")
private String id; // openaireId
@JsonSchema(description = "The acronym of the community")
private String acronym; // context id
@JsonSchema(description = "The long name of the community")
private String name; // context name
@JsonSchema(description = "One of {Research Community, Research infrastructure}")
private String type; // context type: research initiative or research community
@JsonSchema(description = "Description of the research community/research infrastructure")
private String description;
@JsonSchema(
description = "The URL of the Zenodo community associated to the Research community/Research infrastructure")
private String zenodo_community;
public String getZenodo_community() {
return zenodo_community;
}
public void setZenodo_community(String zenodo_community) {
this.zenodo_community = zenodo_community;
}
public String getType() {
return type;
}
public void setType(String type) {
this.type = type;
}
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public String getName() {
return name;
}
public void setName(String label) {
this.name = label;
}
public String getAcronym() {
return acronym;
}
public void setAcronym(String acronym) {
this.acronym = acronym;
}
public String getDescription() {
return description;
}
public void setDescription(String description) {
this.description = description;
}
}
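The id rule stated in the Javadoc, 00|context_____::md5(originalId), can be reproduced with the JDK's MessageDigest. A sketch under that stated rule (the helper name is made up), using the example originalId "mes" from the comment:

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class ContextIdSketch {
	// hypothetical helper reproducing the rule from the Javadoc above
	static String openaireId(String originalId) throws Exception {
		MessageDigest md5 = MessageDigest.getInstance("MD5");
		StringBuilder hex = new StringBuilder();
		for (byte b : md5.digest(originalId.getBytes(StandardCharsets.UTF_8))) {
			hex.append(String.format("%02x", b & 0xff));
		}
		return "00|context_____::" + hex;
	}

	public static void main(String[] args) throws Exception {
		System.out.println(openaireId("mes"));
	}
}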

View File

@ -1,35 +1,38 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"acronym": {
"type": "string",
"description": "The acronym of the community"
},
"description": {
"type": "string",
"description": "Description of the research community/research infrastructure"
},
"id": {
"type": "string",
"description": "OpenAIRE id of the research community/research infrastructure"
},
"name": {
"type": "string",
"description": "The long name of the community"
},
"subject": {
"description": "Only for research communities: the list of the subjects associated to the research community",
"type": "array",
"items": {"type": "string"}
},
"type": {
"type": "string",
"description": "One of {Research Community, Research infrastructure}"
},
"zenodo_community": {
"type": "string",
"description": "The URL of the Zenodo community associated to the Research community/Research infrastructure"
}
"$schema" : "http://json-schema.org/draft-07/schema#",
"type" : "object",
"properties" : {
"acronym" : {
"type" : "string",
"description" : "The acronym of the community"
},
"description" : {
"type" : "string",
"description" : "Description of the research community/research infrastructure"
},
"id" : {
"type" : "string",
"description" : "The OpenAIRE id for the community/research infrastructure"
},
"name" : {
"type" : "string",
"description" : "The long name of the community"
},
"subject" : {
"description" : "Only for research communities: the list of the subjects associated to the research community",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Only for research communities: the list of the subjects associated to the research community"
}
},
"type" : {
"type" : "string",
"description" : "One of {Research Community, Research infrastructure}"
},
"zenodo_community" : {
"type" : "string",
"description" : "The URL of the Zenodo community associated to the Research community/Research infrastructure"
}
}
}
}

View File

@ -1,192 +1,196 @@
{
"$schema":"http://json-schema.org/draft-07/schema#",
"definitions": {
"ControlledField": {
"type": "object",
"properties": {
"scheme": {
"type": "string"
"$schema" : "http://json-schema.org/draft-07/schema#",
"type" : "object",
"properties" : {
"accessrights" : {
"type" : "string",
"description" : "Type of access to the data source, as defined by re3data.org. Possible values: {open, restricted, closed}"
},
"certificates" : {
"type" : "string",
"description" : "The certificate, seal or standard the data source complies with. As defined by re3data.org."
},
"citationguidelineurl" : {
"type" : "string",
"description" : "The URL of the data source providing information on how to cite its items. As defined by re3data.org."
},
"contenttypes" : {
"description" : "Types of content in the data source, as defined by OpenDOAR",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Types of content in the data source, as defined by OpenDOAR"
}
},
"databaseaccessrestriction" : {
"type" : "string",
"description" : "Access restrinctions to the data source, as defined by re3data.org. One of {feeRequired, registration, other}"
},
"datasourcetype" : {
"type" : "object",
"properties" : {
"scheme" : {
"type" : "string",
"description" : "The scheme used to express the value (i.e. pubsrepository::journal)"
},
"value": {
"type": "string"
"value" : {
"type" : "string",
"description" : "The value expressed in the scheme (Journal)"
}
},
"description": "To represent the information described by a scheme and a value in that scheme (i.e. pid)"
}
},
"type":"object",
"properties": {
"accessrights": {
"type": "string",
"description": "Type of access to the data source, as defined by re3data.org. Possible values: {open, restricted, closed}"
"description" : "The type of the datasource. See https://api.openaire.eu/vocabularies/dnet:datasource_typologies"
},
"certificates": {
"type": "string",
"description": "The certificate, seal or standard the data source complies with. As defined by re3data.org."
"datauploadrestriction" : {
"type" : "string",
"description" : "Upload restrictions applied by the datasource, as defined by re3data.org. One of {feeRequired, registration, other}"
},
"citationguidelineurl": {
"type": "string",
"description":"The URL of the data source providing information on how to cite its items. As defined by re3data.org."
"dateofvalidation" : {
"type" : "string",
"description" : "The date of last validation against the OpenAIRE guidelines for the datasource records"
},
"contenttypes": {
"description": "Types of content in the data source, as defined by OpenDOAR",
"type": "array",
"items": {
"type": "string"
}
"description" : {
"type" : "string"
},
"databaseaccessrestriction": {
"type": "string",
"description": "Access restrinctions to the data source, as defined by re3data.org. One of {feeRequired, registration, other}"
"englishname" : {
"type" : "string",
"description" : "The English name of the datasource"
},
"datasourcetype": {
"allOf": [
{
"$ref": "#/definitions/ControlledField"
},
{
"description": "The type of the datasource. See https://api.openaire.eu/vocabularies/dnet:datasource_typologies"
}
]
"id" : {
"type" : "string",
"description" : "The OpenAIRE id of the data source"
},
"datauploadrestriction": {
"type": "string",
"description": "Upload restrictions applied by the datasource, as defined by re3data.org. One of {feeRequired, registration, other}"
},
"dateofvalidation": {
"type": "string",
"description": "The date of last validation against the OpenAIRE guidelines for the datasource records"
},
"description": {
"type": "string"
},
"englishname": {
"type": "string",
"description": "The English name of the datasource"
},
"id": {
"type": "string",
"description": "The OpenAIRE id of the data source"
},
"journal": {
"type": "object",
"properties": {
"conferencedate": {
"type": "string"
"journal" : {
"type" : "object",
"properties" : {
"conferencedate" : {
"type" : "string"
},
"conferenceplace": {
"type": "string"
"conferenceplace" : {
"type" : "string"
},
"edition": {
"type": "string"
"edition" : {
"type" : "string",
"description" : "Edition of the journal or conference proceeding"
},
"ep": {
"type": "string",
"description": "End page"
"ep" : {
"type" : "string",
"description" : "End page"
},
"iss": {
"type": "string",
"description": "Issue number"
"iss" : {
"type" : "string",
"description" : "Journal issue number"
},
"issnLinking": {
"type": "string"
"issnLinking" : {
"type" : "string"
},
"issnOnline": {
"type": "string"
"issnOnline" : {
"type" : "string"
},
"issnPrinted": {
"type": "string"
"issnPrinted" : {
"type" : "string"
},
"name": {
"type": "string"
"name" : {
"type" : "string",
"description" : "Name of the journal or conference"
},
"sp": {
"type": "string",
"description": "Start page"
"sp" : {
"type" : "string",
"description" : "Start page"
},
"vol": {
"type": "string",
"description": "Volume"
"vol" : {
"type" : "string",
"description" : "Volume"
}
},
"description": "Information about the journal, if this data source is of type Journal."
"description" : "Information about the journal, if this data source is of type Journal."
},
"languages": {
"description": "The languages present in the data source's content, as defined by OpenDOAR.",
"type": "array",
"items": {
"type": "string"
"languages" : {
"description" : "The languages present in the data source's content, as defined by OpenDOAR.",
"type" : "array",
"items" : {
"type" : "string",
"description" : "The languages present in the data source's content, as defined by OpenDOAR."
}
},
"logourl": {
"type": "string"
"logourl" : {
"type" : "string"
},
"missionstatementurl": {
"type": "string",
"description":"The URL of a mission statement describing the designated community of the data source. As defined by re3data.org"
"missionstatementurl" : {
"type" : "string",
"description" : "The URL of a mission statement describing the designated community of the data source. As defined by re3data.org"
},
"officialname": {
"type": "string",
"description": "The official name of the datasource"
"officialname" : {
"type" : "string",
"description" : "The official name of the datasource"
},
"openairecompatibility": {
"type": "string",
"description": "OpenAIRE guidelines the data source comply with. See also https://guidelines.openaire.eu."
"openairecompatibility" : {
"type" : "string",
"description" : "OpenAIRE guidelines the data source comply with. See also https://guidelines.openaire.eu."
},
"originalId": {
"description": "Original identifiers for the datasource"
"type": "array",
"items": {
"type": "string"
"originalId" : {
"description" : "Original identifiers for the datasource",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Original identifiers for the datasource"
}
},
"pid": {
"description": "Persistent identifiers of the datasource",
"type": "array",
"items": {
"allOf": [
{
"$ref": "#/definitions/ControlledField"
"pid" : {
"description" : "Persistent identifiers of the datasource",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"scheme" : {
"type" : "string",
"description" : "The scheme used to express the value "
},
"value" : {
"type" : "string",
"description" : "The value expressed in the scheme "
}
]
},
"description" : "Persistent identifiers of the datasource"
}
},
"pidsystems": {
"type": "string",
"description": "The persistent identifier system that is used by the data source. As defined by re3data.org"
"pidsystems" : {
"type" : "string",
"description" : "The persistent identifier system that is used by the data source. As defined by re3data.org"
},
"policies": {
"description": "Policies of the data source, as defined in OpenDOAR.",
"type": "array",
"items": {
"type": "string"
"policies" : {
"description" : "Policies of the data source, as defined in OpenDOAR.",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Policies of the data source, as defined in OpenDOAR."
}
},
"releaseenddate": {
"type": "string",
"description": "Date when the data source went offline or stopped ingesting new research data. As defined by re3data.org"
"releaseenddate" : {
"type" : "string",
"description" : "Date when the data source went offline or stopped ingesting new research data. As defined by re3data.org"
},
"releasestartdate": {
"type": "string",
"description": "Releasing date of the data source, as defined by re3data.org"
"releasestartdate" : {
"type" : "string",
"description" : "Releasing date of the data source, as defined by re3data.org"
},
"subjects": {
"description": "List of subjects associated to the datasource",
"type": "array",
"items": {
"type": "string"
"subjects" : {
"description" : "List of subjects associated to the datasource",
"type" : "array",
"items" : {
"type" : "string",
"description" : "List of subjects associated to the datasource"
}
},
"uploadrights": {
"type": "string",
"description": "Type of data upload. As defined by re3data.org: one of {open, restricted,closed}"
"uploadrights" : {
"type" : "string",
"description" : "Type of data upload. As defined by re3data.org: one of {open, restricted,closed}"
},
"versioning": {
"type": "boolean",
"description": "As defined by redata.org: 'yes' if the data source supports versioning, 'no' otherwise."
"versioning" : {
"type" : "boolean",
"description" : "As defined by redata.org: 'yes' if the data source supports versioning, 'no' otherwise."
},
"websiteurl": {
"type": "string"
"websiteurl" : {
"type" : "string"
}
}
}
}
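For concreteness, a minimal datasource record that the schema above would accept could look like the following sketch. The field names and types come from the schema; every value (identifier, names, URL, compatibility label) is an illustrative placeholder, not taken from a real dump:

{
  "id" : "issn__online::example",
  "officialname" : "Example Journal Archive",
  "englishname" : "Example Journal Archive",
  "openairecompatibility" : "openaire4.0",
  "websiteurl" : "https://archive.example.org",
  "languages" : [ "en", "it" ],
  "contenttypes" : [ "journal article" ],
  "pid" : [ { "scheme" : "re3data", "value" : "r3d000000000" } ],
  "uploadrights" : "open",
  "versioning" : true
}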


@ -0,0 +1,621 @@
{
"$schema" : "http://json-schema.org/draft-07/schema#",
"definitions" : {
"CfHbKeyValue" : {
"type" : "object",
"properties" : {
"key" : {
"type" : "string",
"description" : "the OpenAIRE identifier of the data source"
},
"value" : {
"type" : "string",
"description" : "the name of the data source"
}
}
},
"Provenance" : {
"type" : "object",
"properties" : {
"provenance" : {
"type" : "string"
},
"trust" : {
"type" : "string"
}
}
},
"ResultPid" : {
"type" : "object",
"properties" : {
"scheme" : {
"type" : "string",
"description" : "The scheme of the persistent identifier for the result (i.e. doi). If the pid is here it means the information for the pid has been collected from an authority for that pid type (i.e. Crossref/Datacite for doi). The set of authoritative pid is: doi when collected from Crossref or Datacite pmid when collected from EuroPubmed, arxiv when collected from arXiv, handle from the repositories"
},
"value" : {
"type" : "string",
"description" : "The value expressed in the scheme (i.e. 10.1000/182)"
}
}
}
},
"type" : "object",
"properties" : {
"author" : {
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"fullname" : {
"type" : "string"
},
"name" : {
"type" : "string"
},
"pid" : {
"type" : "object",
"properties" : {
"id" : {
"type" : "object",
"properties" : {
"scheme" : {
"type" : "string",
"description" : "The author's pid scheme. OpenAIRE currently supports 'ORCID'"
},
"value" : {
"type" : "string",
"description" : "The author's pid value in that scheme (i.e. 0000-1111-2222-3333)"
}
}
},
"provenance" : {
"allOf" : [ {
"$ref" : "#/definitions/Provenance"
}, {
"description" : "The reason why the pid was associated to the author"
} ]
}
},
"description" : "The author's persistent identifiers"
},
"rank" : {
"type" : "integer"
},
"surname" : {
"type" : "string"
}
}
}
},
"bestaccessright" : {
"type" : "object",
"properties" : {
"code" : {
"type" : "string",
"description" : "COAR access mode code: http://vocabularies.coar-repositories.org/documentation/access_rights/"
},
"label" : {
"type" : "string",
"description" : "Label for the access mode"
},
"scheme" : {
"type" : "string",
"description" : "Scheme of reference for access right code. Always set to COAR access rights vocabulary: http://vocabularies.coar-repositories.org/documentation/access_rights/"
}
},
"description" : "The openest of the access rights of this result."
},
"codeRepositoryUrl" : {
"type" : "string",
"description" : "Only for results with type 'software': the URL to the repository with the source code"
},
"collectedfrom" : {
"description" : "Information about the sources from which the record has been collected",
"type" : "array",
"items" : {
"allOf" : [ {
"$ref" : "#/definitions/CfHbKeyValue"
}, {
"description" : "Information about the sources from which the record has been collected"
} ]
}
},
"contactgroup" : {
"description" : "Only for results with type 'software': Information on the group responsible for providing further information regarding the resource",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Only for results with type 'software': Information on the group responsible for providing further information regarding the resource"
}
},
"contactperson" : {
"description" : "Only for results with type 'software': Information on the person responsible for providing further information regarding the resource",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Only for results with type 'software': Information on the person responsible for providing further information regarding the resource"
}
},
"container" : {
"type" : "object",
"properties" : {
"conferencedate" : {
"type" : "string"
},
"conferenceplace" : {
"type" : "string"
},
"edition" : {
"type" : "string",
"description" : "Edition of the journal or conference proceeding"
},
"ep" : {
"type" : "string",
"description" : "End page"
},
"iss" : {
"type" : "string",
"description" : "Journal issue number"
},
"issnLinking" : {
"type" : "string"
},
"issnOnline" : {
"type" : "string"
},
"issnPrinted" : {
"type" : "string"
},
"name" : {
"type" : "string",
"description" : "Name of the journal or conference"
},
"sp" : {
"type" : "string",
"description" : "Start page"
},
"vol" : {
"type" : "string",
"description" : "Volume"
}
},
"description" : "Container has information about the conference or journal where the result has been presented or published"
},
"context" : {
"description" : "Reference to a relevant research infrastructure, initiative or community (RI/RC) among those collaborating with OpenAIRE. Please see https://connect.openaire.eu",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"code" : {
"type" : "string",
"description" : "Code identifying the RI/RC"
},
"label" : {
"type" : "string",
"description" : "Label of the RI/RC"
},
"provenance" : {
"description" : "Why this result is associated to the RI/RC.",
"type" : "array",
"items" : {
"allOf" : [ {
"$ref" : "#/definitions/Provenance"
}, {
"description" : "Why this result is associated to the RI/RC."
} ]
}
}
},
"description" : "Reference to a relevant research infrastructure, initiative or community (RI/RC) among those collaborating with OpenAIRE. Please see https://connect.openaire.eu"
}
},
"contributor" : {
"description" : "Contributors for the result",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Contributors for the result"
}
},
"country" : {
"description" : "The list of countries associated to this result",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"code" : {
"type" : "string",
"description" : "ISO 3166-1 alpha-2 country code (i.e. IT)"
},
"label" : {
"type" : "string",
"description" : "The label for that code (i.e. Italy)"
},
"provenance" : {
"allOf" : [ {
"$ref" : "#/definitions/Provenance"
}, {
"description" : "Why this result is associated to the country."
} ]
}
},
"description" : "The list of countries associated to this result"
}
},
"coverage" : {
"type" : "array",
"items" : {
"type" : "string"
}
},
"dateofcollection" : {
"type" : "string",
"description" : "When OpenAIRE collected the record the last time"
},
"description" : {
"type" : "array",
"items" : {
"type" : "string"
}
},
"documentationUrl" : {
"description" : "Only for results with type 'software': URL to the software documentation",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Only for results with type 'software': URL to the software documentation"
}
},
"embargoenddate" : {
"type" : "string",
"description" : "Date when the embargo ends and this result turns Open Access"
},
"format" : {
"type" : "array",
"items" : {
"type" : "string"
}
},
"geolocation" : {
"description" : "Geolocation information",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"box" : {
"type" : "string"
},
"place" : {
"type" : "string"
},
"point" : {
"type" : "string"
}
},
"description" : "Geolocation information"
}
},
"id" : {
"type" : "string",
"description" : "The OpenAIRE identifiers for this result"
},
"indicators" : {
"type" : "object",
"properties" : {
"bipIndicators" : {
"description" : "The impact measures (i.e. popularity)",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"clazz" : {
"type" : "string"
},
"indicator" : {
"type" : "string"
},
"score" : {
"type" : "string"
}
},
"description" : "The impact measures (i.e. popularity)"
}
},
"usageCounts" : {
"type" : "object",
"properties" : {
"downloads" : {
"type" : "string"
},
"views" : {
"type" : "string"
}
},
"description" : "The usage counts (i.e. downloads)"
}
},
"description" : "Indicators computed for this result, for example UsageCount ones"
},
"instance" : {
"description" : "Each instance is one specific materialisation or version of the result. For example, you can have one result with three instance: one is the pre-print, one is the post-print, one is te published version",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"accessright" : {
"type" : "object",
"properties" : {
"code" : {
"type" : "string",
"description" : "COAR access mode code: http://vocabularies.coar-repositories.org/documentation/access_rights/"
},
"label" : {
"type" : "string",
"description" : "Label for the access mode"
},
"openAccessRoute" : {
"type" : "string",
"enum" : [ "gold", "green", "hybrid", "bronze" ]
},
"scheme" : {
"type" : "string",
"description" : "Scheme of reference for access right code. Always set to COAR access rights vocabulary: http://vocabularies.coar-repositories.org/documentation/access_rights/"
}
},
"description" : "The accessRights for this materialization of the result"
},
"alternateIdentifier" : {
"description" : "All the identifiers other than pids forged by an authorithy for the pid type (i.e. Crossref for DOIs",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"scheme" : {
"type" : "string",
"description" : "The scheme of the identifier. It can be a persistent identifier (i.e. doi). If it is present in the alternate identifiers it means it has not been forged by an authority for that pid. For example we collect metadata from an institutional repository that provides as identifier for the result also the doi"
},
"value" : {
"type" : "string",
"description" : "The value expressed in the scheme"
}
},
"description" : "All the identifiers other than pids forged by an authorithy for the pid type (i.e. Crossref for DOIs"
}
},
"articleprocessingcharge" : {
"type" : "object",
"properties" : {
"amount" : {
"type" : "string"
},
"currency" : {
"type" : "string"
}
},
"description" : "The money spent to make this book or article available in Open Access. Source for this information is the OpenAPC initiative."
},
"collectedfrom" : {
"allOf" : [ {
"$ref" : "#/definitions/CfHbKeyValue"
}, {
"description" : "Information about the source from which the record has been collected"
} ]
},
"hostedby" : {
"allOf" : [ {
"$ref" : "#/definitions/CfHbKeyValue"
}, {
"description" : "Information about the source from which the instance can be viewed or downloaded."
} ]
},
"license" : {
"type" : "string"
},
"pid" : {
"type" : "array",
"items" : {
"$ref" : "#/definitions/ResultPid"
}
},
"publicationdate" : {
"type" : "string",
"description" : "Date of the research product"
},
"refereed" : {
"type" : "string",
"description" : "If this instance has been peer-reviewed or not. Allowed values are peerReviewed, nonPeerReviewed, UNKNOWN (as defined in https://api.openaire.eu/vocabularies/dnet:review_levels)"
},
"type" : {
"type" : "string",
"description" : "The specific sub-type of this instance (see https://api.openaire.eu/vocabularies/dnet:result_typologies following the links)"
},
"url" : {
"description" : "URLs to the instance. They may link to the actual full-text or to the landing page at the hosting source. ",
"type" : "array",
"items" : {
"type" : "string",
"description" : "URLs to the instance. They may link to the actual full-text or to the landing page at the hosting source. "
}
}
},
"description" : "Each instance is one specific materialisation or version of the result. For example, you can have one result with three instance: one is the pre-print, one is the post-print, one is te published version"
}
},
"language" : {
"type" : "object",
"properties" : {
"code" : {
"type" : "string",
"description" : "alpha-3/ISO 639-2 code of the language"
},
"label" : {
"type" : "string",
"description" : "Language label in English"
}
}
},
"lastupdatetimestamp" : {
"type" : "integer",
"description" : "Timestamp of last update of the record in OpenAIRE"
},
"maintitle" : {
"type" : "string",
"description" : "A name or title by which a scientific result is known. May be the title of a publication, of a dataset or the name of a piece of software."
},
"originalId" : {
"description" : "Identifiers of the record at the original sources",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Identifiers of the record at the original sources"
}
},
"pid" : {
"description" : "Persistent identifiers of the result",
"type" : "array",
"items" : {
"allOf" : [ {
"$ref" : "#/definitions/ResultPid"
}, {
"description" : "Persistent identifiers of the result"
} ]
}
},
"programmingLanguage" : {
"type" : "string",
"description" : "Only for results with type 'software': the programming language"
},
"projects" : {
"description" : "List of projects (i.e. grants) that (co-)funded the production ofn the research results",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"acronym" : {
"type" : "string",
"description" : "The acronym of the project"
},
"code" : {
"type" : "string",
"description" : "The grant agreement number"
},
"funder" : {
"type" : "object",
"properties" : {
"fundingStream" : {
"type" : "string",
"description" : "Stream of funding (e.g. for European Commission can be H2020 or FP7)"
},
"jurisdiction" : {
"type" : "string",
"description" : "Geographical jurisdiction (e.g. for European Commission is EU, for Croatian Science Foundation is HR)"
},
"name" : {
"type" : "string",
"description" : "The name of the funder (European Commission)"
},
"shortName" : {
"type" : "string",
"description" : "The short name of the funder (EC)"
}
},
"description" : "Information about the funder funding the project"
},
"id" : {
"type" : "string",
"description" : "The OpenAIRE id for the project"
},
"provenance" : {
"$ref" : "#/definitions/Provenance"
},
"title" : {
"type" : "string"
},
"validated" : {
"type" : "object",
"properties" : {
"validatedByFunder" : {
"type" : "boolean"
},
"validationDate" : {
"type" : "string"
}
}
}
},
"description" : "List of projects (i.e. grants) that (co-)funded the production ofn the research results"
}
},
"publicationdate" : {
"type" : "string",
"description" : "Main date of the research product: typically the publication or issued date. In case of a research result with different versions with different dates, the date of the result is selected as the most frequent well-formatted date. If not available, then the most recent and complete date among those that are well-formatted. For statistics, the year is extracted and the result is counted only among the result of that year. Example: Pre-print date: 2019-02-03, Article date provided by repository: 2020-02, Article date provided by Crossref: 2020, OpenAIRE will set as date 2019-02-03, because its the most recent among the complete and well-formed dates. If then the repository updates the metadata and set a complete date (e.g. 2020-02-12), then this will be the new date for the result because it becomes the most recent most complete date. However, if OpenAIRE then collects the pre-print from another repository with date 2019-02-03, then this will be the “winning date” because it becomes the most frequent well-formatted date."
},
"publisher" : {
"type" : "string",
"description" : "The name of the entity that holds, archives, publishes prints, distributes, releases, issues, or produces the resource."
},
"size" : {
"type" : "string",
"description" : "Only for results with type 'dataset': the declared size of the dataset"
},
"source" : {
"description" : "See definition of Dublin Core field dc:source",
"type" : "array",
"items" : {
"type" : "string",
"description" : "See definition of Dublin Core field dc:source"
}
},
"subjects" : {
"description" : "Keywords associated to the result",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"provenance" : {
"allOf" : [ {
"$ref" : "#/definitions/Provenance"
}, {
"description" : "Why this subject is associated to the result"
} ]
},
"subject" : {
"type" : "object",
"properties" : {
"scheme" : {
"type" : "string",
"description" : "OpenAIRE subject classification scheme (https://api.openaire.eu/vocabularies/dnet:subject_classification_typologies)."
},
"value" : {
"type" : "string",
"description" : "The value for the subject in the selected scheme. When the scheme is 'keyword', it means that the subject is free-text (i.e. not a term from a controlled vocabulary)."
}
}
}
},
"description" : "Keywords associated to the result"
}
},
"subtitle" : {
"type" : "string",
"description" : "Explanatory or alternative name by which a scientific result is known."
},
"tool" : {
"description" : "Only for results with type 'other': tool useful for the interpretation and/or re-used of the research product",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Only for results with type 'other': tool useful for the interpretation and/or re-used of the research product"
}
},
"type" : {
"type" : "string",
"description" : "Type of the result: one of 'publication', 'dataset', 'software', 'other' (see also https://api.openaire.eu/vocabularies/dnet:result_typologies)"
},
"version" : {
"type" : "string",
"description" : "Version of the result"
}
}
}
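A minimal result record fitting the structure above is sketched below. It reuses the example values embedded in the schema descriptions (doi 10.1000/182, ORCID 0000-1111-2222-3333, the COAR access rights vocabulary); the record identifier, the author name, and the assumption that the COAR code for open access is c_abf2 are illustrative, not taken from a real dump:

{
  "id" : "doi_dedup___::example",
  "type" : "publication",
  "maintitle" : "An example article title",
  "publicationdate" : "2019-02-03",
  "language" : { "code" : "eng", "label" : "English" },
  "bestaccessright" : {
    "code" : "c_abf2",
    "label" : "OPEN",
    "scheme" : "http://vocabularies.coar-repositories.org/documentation/access_rights/"
  },
  "pid" : [ { "scheme" : "doi", "value" : "10.1000/182" } ],
  "author" : [ {
    "fullname" : "Doe, Jane",
    "name" : "Jane",
    "surname" : "Doe",
    "rank" : 1,
    "pid" : {
      "id" : { "scheme" : "orcid", "value" : "0000-1111-2222-3333" },
      "provenance" : { "provenance" : "Harvested", "trust" : "0.9" }
    }
  } ]
}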


@ -1,57 +1,59 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"alternativenames": {
"description": "Alternative names that identify the organisation",
"type": "array",
"items": {
"type": "string"
"$schema" : "http://json-schema.org/draft-07/schema#",
"type" : "object",
"properties" : {
"alternativenames" : {
"description" : "Alternative names that identify the organisation",
"type" : "array",
"items" : {
"type" : "string",
"description" : "Alternative names that identify the organisation"
}
},
"country": {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "The organisation country code"
"country" : {
"type" : "object",
"properties" : {
"code" : {
"type" : "string",
"description" : "ISO 3166-1 alpha-2 country code (i.e. IT)"
},
"label": {
"type": "string",
"description": "The organisation country label"
"label" : {
"type" : "string",
"description" : "The label for that code (i.e. Italy)"
}
},
"description": "The country of the organisation"
"description" : "The organisation country"
},
"id": {
"type": "string",
"description": "The OpenAIRE id for the organisation"
"id" : {
"type" : "string",
"description" : "The OpenAIRE id for the organisation"
},
"legalname": {
"type": "string"
"legalname" : {
"type" : "string"
},
"legalshortname": {
"type": "string"
"legalshortname" : {
"type" : "string"
},
"pid": {
"description": "Persistent identifiers for the organisation i.e. isni 0000000090326370",
"type": "array",
"items": {
"type": "object",
"properties": {
"scheme": {
"type": "string",
"description": "The scheme of the identifier (i.e. isni)"
"pid" : {
"description" : "Persistent identifiers for the organisation i.e. isni 0000000090326370",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"scheme" : {
"type" : "string",
"description" : "The scheme of the identifier (i.e. isni)"
},
"value": {
"type": "string",
"description": "the value in the schema (i.e. 0000000090326370)"
"value" : {
"type" : "string",
"description" : "The value in the schema (i.e. 0000000090326370)"
}
}
},
"description" : "Persistent identifiers for the organisation i.e. isni 0000000090326370"
}
},
"websiteurl": {
"type": "string"
"websiteurl" : {
"type" : "string"
}
}
}
}
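The organisation schema is much flatter. A record sketch with illustrative values follows; the country pair and the isni value are the examples given in the descriptions above, everything else is a placeholder:

{
  "id" : "openorgs____::example",
  "legalname" : "Example University",
  "legalshortname" : "ExU",
  "alternativenames" : [ "Example Univ." ],
  "country" : { "code" : "IT", "label" : "Italy" },
  "pid" : [ { "scheme" : "isni", "value" : "0000000090326370" } ],
  "websiteurl" : "https://www.example.edu"
}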


@ -1,119 +1,119 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"acronym": {
"type": "string"
"$schema" : "http://json-schema.org/draft-07/schema#",
"type" : "object",
"properties" : {
"acronym" : {
"type" : "string"
},
"callidentifier": {
"type": "string"
"callidentifier" : {
"type" : "string"
},
"code": {
"type": "string",
"description": "The grant agreement number"
"code" : {
"type" : "string"
},
"enddate": {
"type": "string"
"enddate" : {
"type" : "string"
},
"funding": {
"description": "Funding information for the project",
"type": "array",
"items": {
"type": "object",
"properties": {
"funding_stream": {
"type": "object",
"properties": {
"description": {
"type": "string",
"description": "Description of the funding stream"
"funding" : {
"description" : "Funding information for the project",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"funding_stream" : {
"type" : "object",
"properties" : {
"description" : {
"type" : "string"
},
"id": {
"type": "string",
"description": "Id of the funding stream"
"id" : {
"type" : "string",
"description" : "Id of the funding stream"
}
}
},
"description" : "Description of the funding stream"
},
"jurisdiction": {
"type": "string",
"description": "The jurisdiction of the funder (i.e. EU)"
"jurisdiction" : {
"type" : "string",
"description" : "Geographical jurisdiction (e.g. for European Commission is EU, for Croatian Science Foundation is HR)"
},
"name": {
"type": "string",
"description": "The name of the funder (European Commission)"
"name" : {
"type" : "string",
"description" : "The name of the funder (European Commission)"
},
"shortName": {
"type": "string",
"description": "The short name of the funder (EC)"
"shortName" : {
"type" : "string",
"description" : "The short name of the funder (EC)"
}
}
},
"description" : "Funding information for the project"
}
},
"granted": {
"type": "object",
"properties": {
"currency": {
"type": "string",
"description": "The currency of the granted amount (e.g. EUR)"
"granted" : {
"type" : "object",
"properties" : {
"currency" : {
"type" : "string",
"description" : "The currency of the granted amount (e.g. EUR)"
},
"fundedamount": {
"type": "number",
"description": "The funded amount"
"fundedamount" : {
"type" : "number",
"description" : "The funded amount"
},
"totalcost": {
"type": "number",
"description": "The total cost of the project"
"totalcost" : {
"type" : "number",
"description" : "The total cost of the project"
}
},
"description": "The money granted to the project"
"description" : "The money granted to the project"
},
"h2020programme": {
"description": "The h2020 programme funding the project",
"type": "array",
"items": {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "The code of the programme"
"h2020programme" : {
"description" : "The h2020 programme funding the project",
"type" : "array",
"items" : {
"type" : "object",
"properties" : {
"code" : {
"type" : "string",
"description" : "The code of the programme"
},
"description": {
"type": "string",
"description": "The description of the programme"
"description" : {
"type" : "string",
"description" : "The description of the programme"
}
}
},
"description" : "The h2020 programme funding the project"
}
},
"id": {
"type": "string",
"description": "OpenAIRE id for the project"
"id" : {
"type" : "string"
},
"keywords": {
"type": "string"
"keywords" : {
"type" : "string"
},
"openaccessmandatefordataset": {
"type": "boolean"
"openaccessmandatefordataset" : {
"type" : "boolean"
},
"openaccessmandateforpublications": {
"type": "boolean"
"openaccessmandateforpublications" : {
"type" : "boolean"
},
"startdate": {
"type": "string"
"startdate" : {
"type" : "string"
},
"subject": {
"type": "array",
"items": {
"type": "string"
"subject" : {
"type" : "array",
"items" : {
"type" : "string"
}
},
"summary": {
"type": "string"
"summary" : {
"type" : "string"
},
"title": {
"type": "string"
"title" : {
"type" : "string"
},
"websiteurl": {
"type": "string"
"websiteurl" : {
"type" : "string"
}
}
}
}
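A project record sketch against the new shape of this schema is shown below. The funder fields reuse the European Commission examples from the descriptions (name, shortName EC, jurisdiction EU); the project identifier, grant code, dates, amounts, and funding stream id are illustrative placeholders:

{
  "id" : "corda__h2020::example",
  "acronym" : "EXMPL",
  "code" : "123456",
  "title" : "An example project",
  "startdate" : "2020-01-01",
  "enddate" : "2023-12-31",
  "funding" : [ {
    "funding_stream" : { "id" : "EC::H2020", "description" : "Horizon 2020 Framework Programme" },
    "jurisdiction" : "EU",
    "name" : "European Commission",
    "shortName" : "EC"
  } ],
  "granted" : { "currency" : "EUR", "fundedamount" : 1000000, "totalcost" : 1200000 },
  "openaccessmandateforpublications" : true,
  "openaccessmandatefordataset" : false
}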


@ -1,68 +1,54 @@
{
"$schema":"http://json-schema.org/draft-07/schema#",
"definitions": {
"Node": {
"type": "object",
"properties": {
"id": {
"type": "string",
"description": "The OpenAIRE id of the entity"
"$schema" : "http://json-schema.org/draft-07/schema#",
"type" : "object",
"properties" : {
"provenance" : {
"type" : "object",
"properties" : {
"provenance" : {
"type" : "string"
},
"type": {
"type": "string",
"description": "The type of the entity (i.e. organisation)"
}
}
}
},
"type":"object",
"properties": {
"provenance": {
"type": "object",
"properties": {
"provenance": {
"type": "string",
"description": "The reason why OpenAIRE holds the relation "
},
"trust": {
"type": "string",
"description": "The trust of the relation in the range of [0,1]. Where greater the number, more the trust. Harvested relationships have typically a high trust (0.9). The trust of inferred relationship is calculated by the inference algorithm that generated them, as described in https://graph.openaire.eu/about#architecture (Enrichment --> Mining)"
}
}
},
"reltype": {
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "The semantics of the relation (i.e. isAuthorInstitutionOf). "
},
"type": {
"type": "string",
"description": "the type of the relation (i.e. affiliation)"
"trust" : {
"type" : "string"
}
},
"description": "To represent the semantics of a relation between two entities"
"description" : "The reason why OpenAIRE holds the relation "
},
"source": {
"allOf": [
{"$ref": "#/definitions/Node"},
{"description": "The node source in the relation"}
]
"reltype" : {
"type" : "object",
"properties" : {
"name" : {
"type" : "string"
},
"type" : {
"type" : "string"
}
},
"description" : "To represent the semantics of a relation between two entities"
},
"target": {
"allOf": [
{"$ref": "#/definitions/Node"},
{"description": "The node target in the relation"}
]
"source" : {
"type" : "string",
"description" : "The identifier of the source in the relation"
},
"validated":{
"type":"boolean",
"description":"True if the relation is related to a project and it has been collected from an authoritative source (i.e. the funder)"
"sourceType" : {
"type" : "string",
"description" : "The entity type of the source in the relation"
},
"validationDate":{
"type":"string",
"description":"The date when the relation was collected from OpenAIRE"
"target" : {
"type" : "string",
"description" : "The identifier of the target in the relation"
},
"targetType" : {
"type" : "string",
"description" : "The entity type of the target in the relation"
},
"validated" : {
"type" : "boolean",
"description" : "True if the relation is related to a project and it has been collected from an authoritative source (i.e. the funder)"
},
"validationDate" : {
"type" : "string",
"description" : "The date when the relation was collected from OpenAIRE"
}
}
}
}
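The new relation shape drops the old Node definition: source and target are now plain identifier strings, and the entity types travel in the separate sourceType and targetType fields. Below is a sketch of an affiliation relation in the new shape, reusing the isAuthorInstitutionOf/affiliation semantics and the 0.9 trust mentioned in the removed descriptions; the identifiers and type labels are illustrative placeholders:

{
  "source" : "openorgs____::example",
  "sourceType" : "organisation",
  "target" : "doi_dedup___::example",
  "targetType" : "result",
  "reltype" : { "name" : "isAuthorInstitutionOf", "type" : "affiliation" },
  "provenance" : { "provenance" : "Harvested", "trust" : "0.9" },
  "validated" : false
}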


@ -1,415 +1,506 @@
{
"$schema": "http://json-schema.org/draft-07/schema#",
"definitions": {
"ControlledField": {
"type": "object",
"properties": {
"scheme": {
"type": "string"
},
"value": {
"type": "string"
}
},
"description": "To represent the information described by a scheme and a value in that scheme (i.e. pid)"
},
"Provenance": {
"type": "object",
"properties": {
"provenance": {
"type": "string",
"description": "The process that produced/provided the information"
"description": "Description of provenance"
},
"trust": {
"type": "string"
"type": "string",
"description": "Description of trust"
}
},
"description": "Indicates the process that produced (or provided) the information, and the trust associated to the information"
}
},
"ResultPid": {
"type": "object",
"properties": {
"scheme": {
"type": "string",
"description": "Description of scheme"
},
"value": {
"type": "string",
"description": "Description of value"
}
}
}
},
"type": "object",
"properties": {
"author": {
"description": "Description of author",
"type": "array",
"items": {
"type": "object",
"properties": {
"fullname": {
"type": "string"
"type": "string",
"description": "Description of fullname"
},
"name": {
"type": "string"
"type": "string",
"description": "Description of name"
},
"pid": {
"type": "object",
"properties": {
"id": {
"allOf": [
{"$ref": "#/definitions/ControlledField"},
{"description": "The author's id and scheme. OpenAIRE currently supports 'ORCID'"}
]
"type": "object",
"properties": {
"scheme": {
"type": "string",
"description": "Description of scheme"
},
"value": {
"type": "string",
"description": "Description of value"
}
},
"description": "Description of id"
},
"provenance": {
"allOf": [
{"$ref": "#/definitions/Provenance"},
{"description": "Provenance of author's pid"}
{"description": "Description of provenance"}
]
}
}
},
"description": "Description of pid"
},
"rank": {
"type": "integer"
"type": "integer",
"description": "Description of rank"
},
"surname": {
"type": "string"
"type": "string",
"description": "Description of surname"
}
}
},
"description": "Description of author"
}
},
"bestaccessright":{
"type":"object",
"properties":{
"code": {
"type": "string",
"description": "COAR access mode code: http://vocabularies.coar-repositories.org/documentation/access_rights/"
},
"label": {
"type": "string",
"description": "Label for the access mode"
},
"bestaccessright": {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "Description of code"
},
"label": {
"type": "string",
"description": "Description of label"
},
"scheme": {
"type": "string",
"description": "Scheme of reference for access right code. Always set to COAR access rights vocabulary: http://vocabularies.coar-repositories.org/documentation/access_rights/"
}
}
"type": "string",
"description": "Description of scheme"
}
},
"description": "Description of bestaccessright"
},
"codeRepositoryUrl": {
"type": "string",
"description": "Only for results with type 'software': the URL to the repository with the source code"
"description": "Description of codeRepositoryUrl"
},
"contactgroup": {
"description": "Only for results with type 'software': Information on the group responsible for providing further information regarding the resource",
"description": "Description of contactgroup",
"type": "array",
"items": {
"type": "string"
"type": "string",
"description": "Description of contactgroup"
}
},
"contactperson": {
"description": "Only for results with type 'software': Information on the person responsible for providing further information regarding the resource",
"description": "Description of contactperson",
"type": "array",
"items": {
"type": "string"
"type": "string",
"description": "Description of contactperson"
}
},
"container": {
"type": "object",
"properties": {
"conferencedate": {
"type": "string"
"type": "string",
"description": "Description of conferencedate"
},
"conferenceplace": {
"type": "string"
"type": "string",
"description": "Description of conferenceplace"
},
"edition": {
"type": "string",
"description": "Edition of the journal or conference proceeding"
"description": "Description of edition"
},
"ep": {
"type": "string",
"description": "End page"
"description": "Description of ep"
},
"iss": {
"type": "string",
"description": "Journal issue"
"description": "Description of iss"
},
"issnLinking": {
"type": "string"
"type": "string",
"description": "Description of issnLinking"
},
"issnOnline": {
"type": "string"
"type": "string",
"description": "Description of issnOnline"
},
"issnPrinted": {
"type": "string"
"type": "string",
"description": "Description of issnPrinted"
},
"name": {
"type": "string",
"description": "Name of the journal or conference"
"description": "Description of name"
},
"sp": {
"type": "string",
"description": "start page"
"description": "Description of sp"
},
"vol": {
"type": "string"
"type": "string",
"description": "Description of vol"
}
},
"description": "Container has information about the conference or journal where the result has been presented or published"
"description": "Description of container"
},
"contributor": {
"description": "Description of contributor",
"type": "array",
"items": {
"type": "string",
"description": "Contributors for the result"
"description": "Description of contributor"
}
},
"country": {
"description": "Description of country",
"type": "array",
"items": {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "ISO 3166-1 alpha-2 country code"
"description": "Description of code"
},
"label": {
"type": "string"
"type": "string",
"description": "Description of label"
},
"provenance": {
"allOf": [
{"$ref": "#/definitions/Provenance"},
{"description": "Why this result is associated to the country."}
{"description": "Description of provenance"}
]
}
}
},
"description": "Description of country"
}
},
"coverage": {
"description": "Description of coverage",
"type": "array",
"items": {
"type": "string"
"type": "string",
"description": "Description of coverage"
}
},
"dateofcollection": {
"type": "string",
"description": "When OpenAIRE collected the record the last time"
"description": "Description of dateofcollection"
},
"description": {
"description": "Description of description",
"type": "array",
"items": {
"type": "string"
"type": "string",
"description": "Description of description"
}
},
"documentationUrl": {
"description": "Only for results with type 'software': URL to the software documentation",
"description": "Description of documentationUrl",
"type": "array",
"items": {
"type": "string"
"type": "string",
"description": "Description of documentationUrl"
}
},
"embargoenddate": {
"type": "string",
"description": "Date when the embargo ends and this result turns Open Access"
"description": "Description of embargoenddate"
},
"format": {
"description": "Description of format",
"type": "array",
"items": {
"type": "string"
"type": "string",
"description": "Description of format"
}
},
"geolocation": {
"description": "Geolocation information",
"description": "Description of geolocation",
"type": "array",
"items": {
"type": "object",
"properties": {
"box": {
"type": "string"
"type": "string",
"description": "Description of box"
},
"place": {
"type": "string"
"type": "string",
"description": "Description of place"
},
"point": {
"type": "string"
"type": "string",
"description": "Description of point"
}
}
},
"description": "Description of geolocation"
}
},
"id": {
"type": "string",
"description": "OpenAIRE Identifier"
"description": "Description of id"
},
"instance":{
"description":"Each instance is one specific materialisation or version of the result. For example, you can have one result with three instance: one is the pre-print, one is the post-print, one is te published version",
"type":"array",
"items":{
"type":"object",
"properties":{
"accessright":{
"type":"object",
"properties":{
"indicators": {
"type": "object",
"properties": {
"bipIndicators": {
"description": "Description of bipIndicators",
"type": "array",
"items": {
"type": "object",
"properties": {
"clazz": {
"type": "string",
"description": "Description of clazz"
},
"indicator": {
"type": "string",
"description": "Description of indicator"
},
"score": {
"type": "string",
"description": "Description of score"
}
},
"description": "Description of bipIndicators"
}
},
"usageCounts": {
"type": "object",
"properties": {
"downloads": {
"type": "string",
"description": "Description of downloads"
},
"views": {
"type": "string",
"description": "Description of views"
}
},
"description": "Description of usageCounts"
}
},
"description": "Description of indicators"
},
"instance": {
"description": "Description of instance",
"type": "array",
"items": {
"type": "object",
"properties": {
"accessright": {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "COAR access mode code: http://vocabularies.coar-repositories.org/documentation/access_rights/"
"description": "Description of code"
},
"label": {
"type": "string",
"description": "Label for the access mode"
"description": "Description of label"
},
"openAccessRoute":{
"type":"string",
"enum":[
"openAccessRoute": {
"type": "string",
"enum": [
"gold",
"green",
"hybrid",
"bronze"
],
"description":"The type of OpenAccess applied to the result"
"description": "Description of openAccessRoute"
},
"scheme": {
"type": "string",
"description": "Scheme of reference for access right code. Always set to COAR access rights vocabulary: http://vocabularies.coar-repositories.org/documentation/access_rights/"
"description": "Description of scheme"
}
},
"description": "Description of accessright"
},
"alternateIdentifier": {
"description": "Description of alternateIdentifier",
"type": "array",
"items": {
"type": "object",
"properties": {
"scheme": {
"type": "string",
"description": "Description of scheme"
},
"value": {
"type": "string",
"description": "Description of value"
}
},
"description": "Description of alternateIdentifier"
}
},
"alternateIdentifier":{
"type":"array",
"items":{
"allOf":[
{
"$ref":"#/definitions/ControlledField"
},
{
"description":"All the identifiers other than pids forged by an authorithy for the pid type (i.e. Crossref for DOIs"
}
"articleprocessingcharge": {
"type": "object",
"properties": {
"amount": {
"type": "string",
"description": "Description of amount"
},
"currency": {
"type": "string",
"description": "Description of currency"
}
},
"description": "Description of articleprocessingcharge"
},
"license": {
"type": "string",
"description": "Description of license"
},
"pid": {
"description": "Description of pid",
"type": "array",
"items": {
"allOf": [
{"$ref": "#/definitions/ResultPid"},
{"description": "Description of pid"}
]
}
},
"articleprocessingcharge":{
"description": "The money spent to make this book or article available in Open Access. Source for this information is the OpenAPC initiative.",
"type":"object",
"properties":{
"amount":{
"type":"string"
},
"currency":{
"type":"string"
}
}
"publicationdate": {
"type": "string",
"description": "Description of publicationdate"
},
"license":{
"type":"string"
"refereed": {
"type": "string",
"description": "Description of refereed"
},
"measures":{
"type":"array",
"items":{
"type":"object",
"properties":{
"key":{
"type":"string",
"description":"The measure"
},
"value":{
"type":"string",
"description":"The value for the measure"
}
},
"description":"Measures computed for this instance, for example Bip!Finder ones"
}
"type": {
"type": "string",
"description": "Description of type"
},
"pid":{
"description":"The set of persistent identifiers associated to this instance that have been collected from an authority for the pid type (i.e. Crossref/Datacite for doi)",
"type":"array",
"items":{
"allOf":[
{
"$ref":"#/definitions/ControlledField"
},
{
"description":"The persistent identifier associated to the result"
}
]
}
},
"publicationdate":{
"type":"string",
"description": "Date of the research product"
},
"refereed":{
"description": "If this instance has been peer-reviewed or not. Allowed values are peerReviewed, nonPeerReviewed, UNKNOWN (as defined in https://api.openaire.eu/vocabularies/dnet:review_levels)",
"type":"string"
},
"type":{
"type":"string",
"description":"The specific sub-type of this instance (see https://api.openaire.eu/vocabularies/dnet:result_typologies following the links)"
},
"url":{
"description":"URLs to the instance. They may link to the actual full-text or to the landing page at the hosting source. ",
"type":"array",
"items":{
"type":"string"
"url": {
"description": "Description of url",
"type": "array",
"items": {
"type": "string",
"description": "Description of url"
}
}
}
},
"description": "Description of instance"
}
},
"isGreen": {
"type": "boolean",
"description": "Description of isGreen"
},
"isInDiamondJournal": {
"type": "boolean",
"description": "Description of isInDiamondJournal"
},
"language": {
"type": "object",
"properties": {
"code": {
"type": "string",
"description": "alpha-3/ISO 639-2 code of the language"
"description": "Description of code"
},
"label": {
"type": "string",
"description": "Language label in English"
"description": "Description of label"
}
}
},
"description": "Description of language"
},
"lastupdatetimestamp": {
"type": "integer",
"description": "Timestamp of last update of the record in OpenAIRE"
"description": "Description of lastupdatetimestamp"
},
"maintitle": {
"type": "string",
"descriptio": "A name or title by which a scientific result is known. May be the title of a publication, of a dataset or the name of a piece of software."
"description": "Description of maintitle"
},
"subtitle": {
"openAccessColor": {
"type": "string",
"descriptio": "Explanatory or alternative name by which a scientific result is known."
"enum": [
"gold",
"hybrid",
"bronze"
],
"description": "Description of openAccessColor"
},
"originalId": {
"description": "Identifiers of the record at the original sources",
"description": "Description of originalId",
"type": "array",
"items": {
"type": "string"
"type": "string",
"description": "Description of originalId"
}
},
"pid": {
"description": "Persistent identifiers of the result",
"description": "Description of pid",
"type": "array",
"items": {
"allOf": [
{"$ref": "#/definitions/ControlledField"},
{"description": "scheme: list of available schemes are at https://api.openaire.eu/vocabularies/dnet:pid_types, value: the PID of the result. Note: the result will have a pid associated only if it was collected from an authority for that pid type. For example a doi will be among the pids for one result if the result metadata were collected from Crossref or Datacite. In all the other cases, the doi will be present among the alteranteIdentifiers for the result "}
{"$ref": "#/definitions/ResultPid"},
{"description": "Description of pid"}
]
}
},
"programmingLanguage": {
"type": "string",
"description": "Only for results with type 'software': the programming language"
"description": "Description of programmingLanguage"
},
"publicationdate": {
"type": "string",
"description": "Main date of the research product: typically the publication or issued date. In case of a research result with different versions with different dates, the date of the result is selected as the most frequent well-formatted date. If not available, then the most recent and complete date among those that are well-formatted. For statistics, the year is extracted and the result is counted only among the result of that year. Example: Pre-print date: 2019-02-03, Article date provided by repository: 2020-02, Article date provided by Crossref: 2020, OpenAIRE will set as date 2019-02-03, because its the most recent among the complete and well-formed dates. If then the repository updates the metadata and set a complete date (e.g. 2020-02-12), then this will be the new date for the result because it becomes the most recent most complete date. However, if OpenAIRE then collects the pre-print from another repository with date 2019-02-03, then this will be the “winning date” because it becomes the most frequent well-formatted date."
"description": "Description of publicationdate"
},
"publiclyFunded": {
"type": "boolean",
"description": "Description of publiclyFunded"
},
"publisher": {
"type": "string",
"description": "The name of the entity that holds, archives, publishes prints, distributes, releases, issues, or produces the resource."
"description": "Description of publisher"
},
"size": {
"type": "string",
"description": "Only for results with type 'dataset': the declared size of the dataset"
"description": "Description of size"
},
"source": {
"description": "See definition of Dublin Core field dc:source",
"description": "Description of source",
"type": "array",
"items": {
"type": "string"
"type": "string",
"description": "Description of source"
}
},
"subjects": {
"description": "Keywords associated to the result",
"description": "Description of subjects",
"type": "array",
"items": {
"type": "object",
@ -417,32 +508,46 @@
"provenance": {
"allOf": [
{"$ref": "#/definitions/Provenance"},
{"description": "Why this subject is associated to the result"}
{"description": "Description of provenance"}
]
},
"subject": {
"allOf": [
{"$ref": "#/definitions/ControlledField"},
{"description": "OpenAIRE subject classification scheme (https://api.openaire.eu/vocabularies/dnet:subject_classification_typologies) and value. When the scheme is 'keyword', it means that the subject is free-text (i.e. not a term from a controlled vocabulary)."}
]
"type": "object",
"properties": {
"scheme": {
"type": "string",
"description": "Description of scheme"
},
"value": {
"type": "string",
"description": "Description of value"
}
},
"description": "Description of subject"
}
}
},
"description": "Description of subjects"
}
},
"subtitle": {
"type": "string",
"description": "Description of subtitle"
},
"tool": {
"description": "Only for results with type 'other': tool useful for the interpretation and/or re-used of the research product",
"description": "Description of tool",
"type": "array",
"items": {
"type": "string"
"type": "string",
"description": "Description of tool"
}
},
"type": {
"type": "string",
"description": "Type of the result: one of 'publication', 'dataset', 'software', 'other' (see also https://api.openaire.eu/vocabularies/dnet:result_typologies)"
"description": "Description of type"
},
"version": {
"type": "string",
"description": "Version of the result"
"description": "Description of version"
}
}
}
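Besides replacing the hand-written descriptions with generated "Description of X" placeholders, this version of the schema introduces the new Open Access fields openAccessColor (enum gold/hybrid/bronze), isGreen, isInDiamondJournal, and publiclyFunded, which the ResultMapper changes later in this diff populate. A result fragment using them could look like this, with illustrative values:

{
  "openAccessColor" : "gold",
  "isGreen" : true,
  "isInDiamondJournal" : false,
  "publiclyFunded" : true
}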


@ -9,31 +9,29 @@ import com.github.imifou.jsonschema.module.addon.AddonModule;
import com.github.victools.jsonschema.generator.*;
import eu.dnetlib.dhp.ExecCreateSchemas;
import eu.dnetlib.dhp.eosc.model.Relation;
import eu.dnetlib.dhp.eosc.model.Result;
import eu.dnetlib.dhp.oa.model.Result;
import eu.dnetlib.dhp.oa.model.community.CommunityResult;
import eu.dnetlib.dhp.oa.model.graph.*;
//@Disabled
class GenerateJsonSchema {
@Test
void generateSchema3() throws JsonProcessingException {
ObjectMapper objectMapper = new ObjectMapper();
AddonModule module = new AddonModule();
SchemaGeneratorConfigBuilder configBuilder = new SchemaGeneratorConfigBuilder(objectMapper,
SchemaVersion.DRAFT_7, OptionPreset.PLAIN_JSON)
.with(module)
void generateSchema() {
SchemaGeneratorConfigBuilder configBuilder = new SchemaGeneratorConfigBuilder(SchemaVersion.DRAFT_7,
OptionPreset.PLAIN_JSON)
.with(Option.SCHEMA_VERSION_INDICATOR)
.without(Option.NONPUBLIC_NONSTATIC_FIELDS_WITHOUT_GETTERS);
configBuilder.forFields().withDescriptionResolver(field -> "Description of " + field.getDeclaredName());
SchemaGeneratorConfig config = configBuilder.build();
SchemaGenerator generator = new SchemaGenerator(config);
JsonNode jsonSchema = generator.generateSchema(Result.class);
JsonNode jsonSchema = generator.generateSchema(CommunityResult.class);
System.out.println(new ObjectMapper().writeValueAsString(jsonSchema));
System.out.println(jsonSchema.toString());
}
@Test
void generateSchemaEoscRelation() throws JsonProcessingException {
void generateSchema2() {
ObjectMapper objectMapper = new ObjectMapper();
AddonModule module = new AddonModule();
@ -46,7 +44,7 @@ class GenerateJsonSchema {
SchemaGenerator generator = new SchemaGenerator(config);
JsonNode jsonSchema = generator.generateSchema(Result.class);
System.out.println(new ObjectMapper().writeValueAsString(jsonSchema));
System.out.println(jsonSchema.toString());
}
@Test


@ -6,7 +6,6 @@
<artifactId>dhp-graph-dump</artifactId>
<groupId>eu.dnetlib.dhp</groupId>
<version>1.2.5-SNAPSHOT</version>
<relativePath>../pom.xml</relativePath>
</parent>
<modelVersion>4.0.0</modelVersion>
@ -59,6 +58,15 @@
<artifactId>api</artifactId>
<version>1.2.5-SNAPSHOT</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
</dependency>
<dependency>
<groupId>io.github.classgraph</groupId>
<artifactId>classgraph</artifactId>
<version>4.8.71</version>
</dependency>
<dependency>
<groupId>eu.dnetlib.dhp</groupId>
<artifactId>api</artifactId>


@ -10,6 +10,7 @@ import java.util.Optional;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.io.IOUtils;
import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import org.slf4j.Logger;
@ -20,13 +21,15 @@ import eu.dnetlib.dhp.application.ArgumentApplicationParser;
public class MakeTarArchive implements Serializable {
private static final Logger log = LoggerFactory.getLogger(MakeTarArchive.class);
private static int index = 1;
private static String prevname = "";
public static void main(String[] args) throws Exception {
String jsonConfiguration = IOUtils
.toString(
MakeTarArchive.class
.getResourceAsStream(
"/eu/dnetlib/dhp/eosc_input_maketar_parameters.json"));
"/eu/dnetlib/dhp/common/input_maketar_parameters.json"));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);
@ -154,13 +157,21 @@ public class MakeTarArchive implements Serializable {
String pString = p.toString();
if (!pString.endsWith("_SUCCESS")) {
String name = pString.substring(pString.lastIndexOf("/") + 1);
if (name.startsWith("part-") & name.length() > 10) {
String tmp = name.substring(0, 10);
if (name.contains(".")) {
tmp += name.substring(name.indexOf("."));
}
name = tmp;
}
// if (name.startsWith("part-") & name.length() > 10) {
// String tmp = name.substring(0, 10);
// if (prevname.equalsIgnoreCase(tmp)) {
// tmp = tmp + "_" + index;
// index += 1;
// } else {
// prevname = tmp;
// index = 1;
// }
// if (name.contains(".")) {
// tmp += name.substring(name.indexOf("."));
// }
// name = tmp;
//
// }
if (rename) {
if (name.endsWith(".txt.gz"))
name = name.replace(".txt.gz", ".json.gz");


@ -27,6 +27,14 @@ public class Constants {
public static final String RESEARCH_INFRASTRUCTURE = "Research Infrastructure/Initiative";
public static final String USAGE_COUNT_DOWNLOADS = "downloads";
public static final String USAGE_COUNT_VIEWS = "views";
public static final String IMPACT_POPULARITY = "popularity";
public static final String IMPACT_POPULARITY_ALT = "popularity_alt";
public static final String IMPACT_INFLUENCE = "influence";
public static final String IMPACT_INFLUENCE_ALT = "influence_alt";
public static final String IMPACT_IMPULSE = "impulse";
static {
ACCESS_RIGHTS_COAR_MAP.put(ModelConstants.ACCESS_RIGHT_OPEN, CABF2);
ACCESS_RIGHTS_COAR_MAP.put("RESTRICTED", "c_16ec");
@ -43,7 +51,7 @@ public class Constants {
}
public enum DUMPTYPE {
COMPLETE("complete"), COMMUNITY("community"), FUNDER("funder"), EOSC("eosc");
COMPLETE("complete"), COMMUNITY("community"), FUNDER("funder");
private final String type;


@ -1,80 +0,0 @@
package eu.dnetlib.dhp.oa.graph.dump;
import static eu.dnetlib.dhp.common.SparkSessionSupport.runWithSparkSession;
import java.io.Serializable;
import java.util.Optional;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FilterFunction;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;
import eu.dnetlib.dhp.eosc.model.Result;
import eu.dnetlib.dhp.oa.graph.dump.eosc.CommunityMap;
import eu.dnetlib.dhp.oa.graph.dump.eosc.Utils;
import eu.dnetlib.dhp.oa.graph.dump.exceptions.CardinalityTooHighException;
import eu.dnetlib.dhp.oa.graph.dump.exceptions.NoAvailableEntityTypeException;
import eu.dnetlib.dhp.schema.oaf.DataInfo;
import eu.dnetlib.dhp.schema.oaf.OafEntity;
/**
* It fires the execution of the actual dump for result entities. If the dump is for RC/RI products, it checks for each
* result its belongingness to at least one RC/RI before "asking" for its mapping.
*/
public class DumpProducts implements Serializable {
public void run(Boolean isSparkSessionManaged, String inputPath, String outputPath, String communityMapPath,
Class<? extends OafEntity> inputClazz) {
SparkConf conf = new SparkConf();
runWithSparkSession(
conf,
isSparkSessionManaged,
spark -> {
Utils.removeOutputDir(spark, outputPath);
execDump(
spark, inputPath, outputPath, communityMapPath, inputClazz);
});
}
public static <I extends OafEntity> void execDump(
SparkSession spark,
String inputPath,
String outputPath,
String communityMapPath,
Class<I> inputClazz) {
CommunityMap communityMap = Utils.getCommunityMap(spark, communityMapPath);
Utils
.readPath(spark, inputPath, inputClazz)
.map((MapFunction<I, Result>) value -> execMap(value, communityMap), Encoders.bean(Result.class))
.filter((FilterFunction<Result>) value -> value != null)
.write()
.mode(SaveMode.Overwrite)
.option("compression", "gzip")
.json(outputPath);
}
private static <I extends OafEntity> Result execMap(I value,
CommunityMap communityMap) throws NoAvailableEntityTypeException, CardinalityTooHighException {
Optional<DataInfo> odInfo = Optional.ofNullable(value.getDataInfo());
if (odInfo.isPresent()) {
if (odInfo.get().getDeletedbyinference() || odInfo.get().getInvisible()) {
return null;
}
} else {
return null;
}
return ResultMapper.map(value, communityMap, null);
}
}


@ -26,7 +26,7 @@ public class MakeTar implements Serializable {
.toString(
MakeTar.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/eosc_input_maketar_parameters.json"));
"/eu/dnetlib/dhp/oa/graph/dump/input_maketar_parameters.json"));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);


@ -1,49 +1,54 @@
package eu.dnetlib.dhp.oa.graph.dump;
import static eu.dnetlib.dhp.oa.graph.dump.Constants.*;
import static eu.dnetlib.dhp.oa.graph.dump.Utils.ENTITY_ID_SEPARATOR;
import static eu.dnetlib.dhp.oa.graph.dump.Utils.getEntityId;
import java.io.Serializable;
import java.util.*;
import java.util.stream.Collectors;
import org.apache.commons.lang3.StringUtils;
import org.apache.spark.api.java.function.FilterFunction;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Column;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import eu.dnetlib.dhp.eosc.model.*;
import eu.dnetlib.dhp.eosc.model.AccessRight;
import eu.dnetlib.dhp.eosc.model.Author;
import eu.dnetlib.dhp.eosc.model.Context;
import eu.dnetlib.dhp.eosc.model.GeoLocation;
import eu.dnetlib.dhp.eosc.model.Measure;
import eu.dnetlib.dhp.eosc.model.OpenAccessRoute;
import eu.dnetlib.dhp.eosc.model.Provenance;
import eu.dnetlib.dhp.eosc.model.Result;
import eu.dnetlib.dhp.oa.graph.dump.eosc.MasterDuplicate;
import eu.dnetlib.dhp.oa.graph.dump.exceptions.CardinalityTooHighException;
import eu.dnetlib.dhp.oa.graph.dump.exceptions.NoAvailableEntityTypeException;
import eu.dnetlib.dhp.oa.model.*;
import eu.dnetlib.dhp.oa.model.AccessRight;
import eu.dnetlib.dhp.oa.model.Author;
import eu.dnetlib.dhp.oa.model.GeoLocation;
import eu.dnetlib.dhp.oa.model.Instance;
import eu.dnetlib.dhp.oa.model.OpenAccessColor;
import eu.dnetlib.dhp.oa.model.OpenAccessRoute;
import eu.dnetlib.dhp.oa.model.Result;
import eu.dnetlib.dhp.oa.model.Subject;
import eu.dnetlib.dhp.oa.model.community.CfHbKeyValue;
import eu.dnetlib.dhp.oa.model.community.CommunityInstance;
import eu.dnetlib.dhp.oa.model.community.CommunityResult;
import eu.dnetlib.dhp.oa.model.community.Context;
import eu.dnetlib.dhp.oa.model.graph.GraphResult;
import eu.dnetlib.dhp.schema.common.ModelConstants;
import eu.dnetlib.dhp.schema.oaf.*;
public class ResultMapper implements Serializable {
private static final Logger log = LoggerFactory.getLogger(ResultMapper.class);
private static final String NULL = "null";
public static <E extends eu.dnetlib.dhp.schema.oaf.OafEntity> Result map(
E in, Map<String, String> communityMap,
List<MasterDuplicate> eoscIds)
E in, Map<String, String> communityMap, String dumpType)
throws NoAvailableEntityTypeException, CardinalityTooHighException {
log.info("*****************" + eoscIds.size());
Result out = new Result();
Result out;
if (Constants.DUMPTYPE.COMPLETE.getType().equals(dumpType)) {
out = new GraphResult();
} else {
out = new CommunityResult();
}
eu.dnetlib.dhp.schema.oaf.Result input = (eu.dnetlib.dhp.schema.oaf.Result) in;
Optional<eu.dnetlib.dhp.schema.oaf.Qualifier> ort = Optional.ofNullable(input.getResulttype());
if (ort.isPresent()) {
try {
addTypeSpecificInformation(out, input, ort.get());
mapAuthor(out, input);
mapAccessRight(out, input);
@@ -51,25 +56,30 @@ public class ResultMapper implements Serializable {
mapCountry(out, input);
mapCoverage(out, input);
out.setDateofcollection(input.getDateofcollection());
out.setGreen(input.getIsGreen());
out.setInDiamondJournal(input.getIsInDiamondJournal());
out.setPubliclyFunded(input.getPubliclyFunded());
mapOpenAccessColor(out, input);
mapDescription(out, input);
mapEmbrargo(out, input);
mapMeasure(out, input);
mapEmbargo(out, input);
mapFormat(out, input);
out.setId(input.getId());
out.setId(getEntityId(input.getId(), ENTITY_ID_SEPARATOR));
mapOriginalId(out, input);
mapInstance(out, input, eoscIds);
mapLamguage(out, input);
mapInstance(dumpType, out, input);
mapLanguage(out, input);
mapLastUpdateTimestamp(out, input);
mapTitle(out, input);
mapPid(out, input);
mapAcceptanceDate(out, input);
mapDateOfAcceptance(out, input);
mapPublisher(out, input);
mapSource(out, input);
mapSubject(out, input);
out.setType(input.getResulttype().getClassid());
mapContext(communityMap, out, input);
mapCollectedfrom(out, input);
mapMeasure(out, input);
if (!Constants.DUMPTYPE.COMPLETE.getType().equals(dumpType)) {
mapCollectedfrom((CommunityResult) out, input);
mapContext(communityMap, (CommunityResult) out, input);
}
} catch (ClassCastException cce) {
return null;
}
@@ -79,24 +89,24 @@ public class ResultMapper implements Serializable {
}
private static void mapCollectedfrom(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
private static void mapOpenAccessColor(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getOpenAccessColor()).isPresent())
switch (input.getOpenAccessColor()) {
case bronze:
out.setOpenAccessColor(OpenAccessColor.bronze);
break;
case gold:
out.setOpenAccessColor(OpenAccessColor.gold);
break;
case hybrid:
out.setOpenAccessColor(OpenAccessColor.hybrid);
break;
out
.setCollectedfrom(
input
.getCollectedfrom()
.stream()
.map(cf -> CfHbKeyValue.newInstance(cf.getKey(), cf.getValue()))
.collect(Collectors.toList()));
}
}
private static void mapFulltext(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getFulltext()).isPresent() && !input.getFulltext().isEmpty())
out.setFulltext(input.getFulltext().stream().map(ft -> ft.getValue()).collect(Collectors.toList()));
}
private static void mapContext(Map<String, String> communityMap, Result out,
private static void mapContext(Map<String, String> communityMap, CommunityResult out,
eu.dnetlib.dhp.schema.oaf.Result input) {
Set<String> communities = communityMap.keySet();
List<Context> contextList = Optional
@@ -162,37 +172,50 @@ public class ResultMapper implements Serializable {
}
}
private static void mapSubject(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getSubject()).isPresent()) {
out.setSubject(createSubjectMap(input));
out
.setKeywords(
input
.getSubject()
.stream()
.filter(
s -> s.getQualifier().getClassid().equalsIgnoreCase("keyword") &&
!s.getValue().equalsIgnoreCase("EOSC::RO-crate"))
.map(s -> s.getValue())
.collect(Collectors.toList()));
private static void mapCollectedfrom(CommunityResult out, eu.dnetlib.dhp.schema.oaf.Result input) {
out
.setCollectedfrom(
input
.getCollectedfrom()
.stream()
.map(cf -> CfHbKeyValue.newInstance(getEntityId(cf.getKey(), ENTITY_ID_SEPARATOR), cf.getValue()))
.collect(Collectors.toList()));
}
if (Optional.ofNullable(input.getEoscifguidelines()).isPresent()) {
out
.setEoscIF(
input
.getEoscifguidelines()
.stream()
.map(
eig -> EoscInteroperabilityFramework
.newInstance(
eig.getCode(), eig.getLabel(), eig.getUrl(),
eig.getSemanticRelation()))
.collect(Collectors.toList()));
}
private static void mapMeasure(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getMeasures()).isPresent() && input.getMeasures().size() > 0) {
out.setIndicators(Utils.getIndicator(input.getMeasures()));
}
}
private static void mapSubject(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
List<Subject> subjectList = new ArrayList<>();
Optional
.ofNullable(input.getSubject())
.ifPresent(
value -> value
.stream()
// .filter(
// s -> !((s.getQualifier().getClassid().equalsIgnoreCase("fos") &&
// Optional.ofNullable(s.getDataInfo()).isPresent()
// && Optional.ofNullable(s.getDataInfo().getProvenanceaction()).isPresent() &&
// s.getDataInfo().getProvenanceaction().getClassid().equalsIgnoreCase("subject:fos"))
// ||
// (s.getQualifier().getClassid().equalsIgnoreCase("sdg") &&
// Optional.ofNullable(s.getDataInfo()).isPresent()
// && Optional.ofNullable(s.getDataInfo().getProvenanceaction()).isPresent() &&
// s
// .getDataInfo()
// .getProvenanceaction()
// .getClassid()
// .equalsIgnoreCase("subject:sdg"))))
.filter(s -> !s.getValue().equalsIgnoreCase(NULL))
.forEach(s -> subjectList.add(getSubject(s))));
out.setSubjects(subjectList);
}
private static void mapSource(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
Optional
.ofNullable(input.getSource())
@@ -201,14 +224,18 @@ public class ResultMapper implements Serializable {
}
private static void mapPublisher(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getPublisher()).isPresent()) {
out.setPublisher(input.getPublisher().getValue());
Optional<Field<String>> oStr;
oStr = Optional.ofNullable(input.getPublisher());
if (oStr.isPresent()) {
out.setPublisher(oStr.get().getValue());
}
}
private static void mapAcceptanceDate(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getDateofacceptance()).isPresent()) {
out.setPublicationdate(input.getDateofacceptance().getValue());
private static void mapDateOfAcceptance(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
Optional<Field<String>> oStr;
oStr = Optional.ofNullable(input.getDateofacceptance());
if (oStr.isPresent()) {
out.setPublicationdate(oStr.get().getValue());
}
}
@@ -227,9 +254,10 @@ public class ResultMapper implements Serializable {
}
private static void mapTitle(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getTitle()).isPresent()) {
List<StructuredProperty> iTitle = input
.getTitle()
Optional<List<StructuredProperty>> otitle = Optional.ofNullable(input.getTitle());
if (otitle.isPresent()) {
List<StructuredProperty> iTitle = otitle
.get()
.stream()
.filter(t -> t.getQualifier().getClassid().equalsIgnoreCase("main title"))
.collect(Collectors.toList());
@@ -237,8 +265,8 @@ public class ResultMapper implements Serializable {
out.setMaintitle(iTitle.get(0).getValue());
}
iTitle = input
.getTitle()
iTitle = otitle
.get()
.stream()
.filter(t -> t.getQualifier().getClassid().equalsIgnoreCase("subtitle"))
.collect(Collectors.toList());
@@ -250,32 +278,38 @@ public class ResultMapper implements Serializable {
}
private static void mapLastUpdateTimestamp(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getLastupdatetimestamp()).isPresent()) {
out.setLastupdatetimestamp(input.getLastupdatetimestamp());
Optional<Long> oLong = Optional.ofNullable(input.getLastupdatetimestamp());
if (oLong.isPresent()) {
out.setLastupdatetimestamp(oLong.get());
}
}
private static void mapLamguage(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getLanguage()).isPresent()) {
out
.setLanguage(
Language.newInstance(input.getLanguage().getClassid(), input.getLanguage().getClassname()));
private static void mapLanguage(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
Optional<Qualifier> oL = Optional.ofNullable(input.getLanguage());
if (oL.isPresent()) {
Qualifier language = oL.get();
out.setLanguage(Language.newInstance(language.getClassid(), language.getClassname()));
}
}
private static void mapInstance(Result out, eu.dnetlib.dhp.schema.oaf.Result input,
List<MasterDuplicate> eoscIds) {
if (Optional
.ofNullable(input.getInstance())
.isPresent()) {
out
.setInstance(
input
.getInstance()
.stream()
.map(i -> getCommunityInstance(i, input.getResulttype().getClassid(), eoscIds))
.collect(Collectors.toList()));
private static void mapInstance(String dumpType, Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
Optional<List<eu.dnetlib.dhp.schema.oaf.Instance>> oInst = Optional
.ofNullable(input.getInstance());
if (oInst.isPresent()) {
if (DUMPTYPE.COMPLETE.getType().equals(dumpType)) {
((GraphResult) out)
.setInstance(
oInst.get().stream().map(ResultMapper::getGraphInstance).collect(Collectors.toList()));
} else {
((CommunityResult) out)
.setInstance(
oInst
.get()
.stream()
.map(ResultMapper::getCommunityInstance)
.collect(Collectors.toList()));
}
}
}
@@ -297,32 +331,14 @@ public class ResultMapper implements Serializable {
final List<String> formatList = new ArrayList<>();
Optional
.ofNullable(input.getFormat())
.ifPresent(value -> value.forEach(f -> formatList.add(f.getValue())));
.ifPresent(value -> value.stream().forEach(f -> formatList.add(f.getValue())));
out.setFormat(formatList);
}
private static void mapMeasure(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getMeasures()).isPresent()) {
Indicator i = new Indicator();
UsageCounts uc = new UsageCounts();
input.getMeasures().forEach(m -> {
if (m.getId().equals("downloads")) {
uc.setDownloads(m.getUnit().get(0).getValue());
}
if (m.getId().equals("views")) {
uc.setViews(m.getUnit().get(0).getValue());
}
});
if (!uc.isEmpty()) {
i.setUsageCounts(uc);
out.setIndicator(i);
}
}
}
private static void mapEmbrargo(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
if (Optional.ofNullable(input.getEmbargoenddate()).isPresent()) {
out.setEmbargoenddate(input.getEmbargoenddate().getValue());
private static void mapEmbargo(Result out, eu.dnetlib.dhp.schema.oaf.Result input) {
Optional<Field<String>> oStr = Optional.ofNullable(input.getEmbargoenddate());
if (oStr.isPresent()) {
out.setEmbargoenddate(oStr.get().getValue());
}
}
@@ -407,31 +423,6 @@ public class ResultMapper implements Serializable {
ats -> out.setAuthor(ats.stream().map(ResultMapper::getAuthor).collect(Collectors.toList())));
}
private static Map<String, List<eu.dnetlib.dhp.eosc.model.Subject>> createSubjectMap(
eu.dnetlib.dhp.schema.oaf.Result input) {
Map<String, List<eu.dnetlib.dhp.eosc.model.Subject>> map = new HashMap<>();
input.getSubject().stream().forEach(s -> {
String key = s.getQualifier().getClassid().toLowerCase();
if (!key.equalsIgnoreCase("http://www.abs.gov.au/ausstats/abs@.nsf/0/6BB427AB9696C225CA2574180004463E") &&
!key.equalsIgnoreCase("keyword") &&
!key.equalsIgnoreCase("eosc")) {
if (!map.containsKey(key)) {
map.put(key, new ArrayList<>());
}
eu.dnetlib.dhp.eosc.model.Subject subject = new eu.dnetlib.dhp.eosc.model.Subject();
subject.setValue(s.getValue());
Provenance p = getProvenance(s);
if (p != null) {
subject.setProvenance(p);
}
map.get(key).add(subject);
}
});
return map;
}
private static void addTypeSpecificInformation(Result out, eu.dnetlib.dhp.schema.oaf.Result input,
eu.dnetlib.dhp.schema.oaf.Qualifier ort) throws NoAvailableEntityTypeException {
switch (ort.getClassid()) {
@@ -455,8 +446,6 @@ public class ResultMapper implements Serializable {
out.setContainer(c);
out.setType(ModelConstants.PUBLICATION_DEFAULT_RESULTTYPE.getClassname());
}
if (Optional.ofNullable(((Publication) input).getFulltext()).isPresent())
mapFulltext(out, input);
break;
case "dataset":
Dataset id = (Dataset) input;
@@ -529,16 +518,15 @@ public class ResultMapper implements Serializable {
.orElse(null));
out.setType(ModelConstants.ORP_DEFAULT_RESULTTYPE.getClassname());
if (Optional.ofNullable(((OtherResearchProduct) input).getFulltext()).isPresent())
mapFulltext(out, input);
break;
default:
throw new NoAvailableEntityTypeException();
}
}
private static eu.dnetlib.dhp.eosc.model.Instance getGraphInstance(eu.dnetlib.dhp.schema.oaf.Instance i) {
eu.dnetlib.dhp.eosc.model.Instance instance = new eu.dnetlib.dhp.eosc.model.Instance();
private static Instance getGraphInstance(eu.dnetlib.dhp.schema.oaf.Instance i) {
Instance instance = new Instance();
setCommonValue(i, instance);
@@ -546,48 +534,35 @@ public class ResultMapper implements Serializable {
}
private static eu.dnetlib.dhp.eosc.model.Instance getCommunityInstance(eu.dnetlib.dhp.schema.oaf.Instance i,
String resultType, List<MasterDuplicate> eoscIds) {
eu.dnetlib.dhp.eosc.model.Instance instance = new eu.dnetlib.dhp.eosc.model.Instance();
private static CommunityInstance getCommunityInstance(eu.dnetlib.dhp.schema.oaf.Instance i) {
CommunityInstance instance = new CommunityInstance();
setCommonValue(i, instance);
instance
.setHostedby(
CfHbKeyValue.newInstance(i.getHostedby().getKey(), i.getHostedby().getValue()));
List<MasterDuplicate> eoscDsIds = eoscIds
.stream()
.filter(
dm -> dm
.getGraphId()
.equals(i.getHostedby().getKey()) ||
dm
.getGraphId()
.equals(i.getCollectedfrom().getKey()))
.collect(Collectors.toList());
if (eoscDsIds.size() > 0) {
if (Optional.ofNullable(i.getCollectedfrom()).isPresent() &&
Optional.ofNullable(i.getCollectedfrom().getKey()).isPresent() &&
StringUtils.isNotBlank(i.getCollectedfrom().getKey()))
instance
.setEoscDsId(
eoscDsIds
.stream()
.map(dm -> dm.getEoscId())
.collect(Collectors.toList()));
.setCollectedfrom(
CfHbKeyValue
.newInstance(
getEntityId(i.getCollectedfrom().getKey(), ENTITY_ID_SEPARATOR),
i.getCollectedfrom().getValue()));
}
if (Optional.ofNullable(i.getHostedby()).isPresent() &&
Optional.ofNullable(i.getHostedby().getKey()).isPresent() &&
StringUtils.isNotBlank(i.getHostedby().getKey()))
instance
.setHostedby(
CfHbKeyValue
.newInstance(
getEntityId(i.getHostedby().getKey(), ENTITY_ID_SEPARATOR), i.getHostedby().getValue()));
if (resultType.equals("publication") ||
resultType.equals("other")) {
if (Optional.ofNullable(i.getFulltext()).isPresent())
instance.setFulltext(i.getFulltext());
}
return instance;
}
private static void setCommonValue(eu.dnetlib.dhp.schema.oaf.Instance i,
eu.dnetlib.dhp.eosc.model.Instance instance) {
private static <I extends Instance> void setCommonValue(eu.dnetlib.dhp.schema.oaf.Instance i, I instance) {
Optional<eu.dnetlib.dhp.schema.oaf.AccessRight> opAr = Optional.ofNullable(i.getAccessright());
if (opAr.isPresent() && Constants.ACCESS_RIGHTS_COAR_MAP.containsKey(opAr.get().getClassid())) {
@@ -601,16 +576,6 @@ public class ResultMapper implements Serializable {
Constants.COAR_CODE_LABEL_MAP.get(code),
Constants.COAR_ACCESS_RIGHT_SCHEMA));
Optional<List<eu.dnetlib.dhp.schema.oaf.Measure>> mes = Optional.ofNullable(i.getMeasures());
if (mes.isPresent()) {
List<Measure> measure = new ArrayList<>();
mes
.get()
.forEach(
m -> m.getUnit().forEach(u -> measure.add(Measure.newInstance(m.getId(), u.getValue()))));
instance.setMeasures(measure);
}
if (opAr.get().getOpenAccessRoute() != null) {
switch (opAr.get().getOpenAccessRoute()) {
case hybrid:
@@ -726,15 +691,20 @@ public class ResultMapper implements Serializable {
}
private static Provenance getProvenance(StructuredProperty s) {
private static Subject getSubject(StructuredProperty s) {
Subject subject = new Subject();
subject.setSubject(SubjectSchemeValue.newInstance(s.getQualifier().getClassid(), s.getValue()));
Optional<DataInfo> di = Optional.ofNullable(s.getDataInfo());
if (di.isPresent()) {
Provenance p = new Provenance();
p.setProvenance(di.get().getProvenanceaction().getClassname());
p.setTrust(di.get().getTrust());
return p;
if (!s.getQualifier().getClassid().equalsIgnoreCase("fos") &&
!s.getQualifier().getClassid().equalsIgnoreCase("sdg"))
p.setTrust(di.get().getTrust());
subject.setProvenance(p);
}
return null;
return subject;
}
private static Author getAuthor(eu.dnetlib.dhp.schema.oaf.Author oa) {
@@ -775,8 +745,7 @@ public class ResultMapper implements Serializable {
AuthorPidSchemeValue
.newInstance(
pid.getQualifier().getClassid(),
pid.getValue()),
null
pid.getValue())
);
}


@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.oa.graph.dump.eosc;
package eu.dnetlib.dhp.oa.graph.dump;
import java.io.BufferedWriter;
import java.io.IOException;
@@ -17,7 +17,6 @@ import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import eu.dnetlib.dhp.application.ArgumentApplicationParser;
import eu.dnetlib.dhp.oa.graph.dump.UtilCommunityAPI;
/**
* This class connects with the community APIs for production. It saves the information about the
@@ -54,7 +53,7 @@ public class SaveCommunityMap implements Serializable {
.toString(
SaveCommunityMap.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/eosc_cm_parameters.json"));
"/eu/dnetlib/dhp/oa/graph/dump/input_cm_parameters.json"));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);


@@ -3,10 +3,15 @@ package eu.dnetlib.dhp.oa.graph.dump;
import java.io.Serializable;
import java.util.Optional;
import java.util.concurrent.TimeUnit;
import org.apache.commons.io.IOUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.*;
import org.apache.http.HttpStatus;
import org.joda.time.DateTime;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import eu.dnetlib.dhp.application.ArgumentApplicationParser;
import eu.dnetlib.dhp.oa.graph.dump.exceptions.NoAvailableEntityTypeException;
@@ -18,6 +23,11 @@ public class SendToZenodoHDFS implements Serializable {
private static final String NEW = "new"; // to be used for a brand new deposition in zenodo
private static final String VERSION = "version"; // to be used to upload a new version of a published deposition
private static final String UPDATE = "update"; // to upload content to an open deposition not published
private static final Integer NUMBER_OF_RETRIES = 5;
private static final Integer DELAY = 10;
private static final Integer MULTIPLIER = 5;
private static final Logger log = LoggerFactory.getLogger(SendToZenodoHDFS.class);
public static void main(final String[] args) throws Exception, MissingConceptDoiException {
final ArgumentApplicationParser parser = new ArgumentApplicationParser(
@@ -25,7 +35,7 @@ public class SendToZenodoHDFS implements Serializable {
.toString(
SendToZenodoHDFS.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/eosc_upload_zenodo.json")));
"/eu/dnetlib/dhp/oa/graph/dump/upload_zenodo.json")));
parser.parseArgument(args);
@@ -79,21 +89,44 @@ public class SendToZenodoHDFS implements Serializable {
Path p = fileStatus.getPath();
String pString = p.toString();
boolean retry = true;
int numberOfRetries = 0;
if (!pString.endsWith("_SUCCESS")) {
String name = pString.substring(pString.lastIndexOf("/") + 1);
log.info("Upoloading: {}", name);
FSDataInputStream inputStream = fileSystem.open(p);
zenodoApiClient.uploadIS3(inputStream, name, fileSystem.getFileStatus(p).getLen());
}
while (retry && numberOfRetries < NUMBER_OF_RETRIES) {
int response_code = -1;
try {
response_code = zenodoApiClient
.uploadIS3(inputStream, name, fileSystem.getFileStatus(p).getLen());
} catch (Exception e) {
log.info(e.getMessage());
throw new RuntimeException("Error while uploading on Zenodo");
}
log.info("response code: {}", response_code);
if (HttpStatus.SC_OK == response_code || HttpStatus.SC_CREATED == response_code) {
retry = false;
} else {
numberOfRetries += 1;
TimeUnit.SECONDS.sleep((long) (DELAY * Math.pow(MULTIPLIER, numberOfRetries))); // exponential backoff: ^ in Java is XOR, not a power operator
}
}
if (numberOfRetries == NUMBER_OF_RETRIES) {
throw new RuntimeException("reached the maximun number or retries to upload on Zenodo");
}
}
// log.info(DateTime.now().toDateTimeISO().toString());
TimeUnit.SECONDS.sleep(DELAY);
// log.info("Delayed: {}", DateTime.now().toDateTimeISO().toString());
}
if (!metadata.equals("")) {
zenodoApiClient.sendMretadata(metadata);
}
// if (Boolean.TRUE.equals(publish)) {
// zenodoApiClient.publish();
// }
}
}
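As a side note, the retry loop above waits longer after every failed upload attempt. Assuming the intended delay is DELAY * MULTIPLIER^numberOfRetries seconds, with DELAY = 10 and MULTIPLIER = 5 the schedule looks as follows (illustrative only):

// waits applied after each failed attempt, in seconds
for (int n = 1; n <= NUMBER_OF_RETRIES; n++) {
System.out.println("after failure " + n + ": wait " + (long) (10 * Math.pow(5, n)) + "s"); // 50, 250, 1250, 6250, 31250
}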


@@ -0,0 +1,92 @@
package eu.dnetlib.dhp.oa.graph.dump;
import static eu.dnetlib.dhp.common.SparkSessionSupport.runWithSparkSession;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import org.apache.commons.io.IOUtils;
import org.apache.commons.lang3.StringUtils;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FilterFunction;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import eu.dnetlib.dhp.application.ArgumentApplicationParser;
import eu.dnetlib.dhp.schema.common.ModelSupport;
import eu.dnetlib.dhp.schema.oaf.Relation;
import eu.dnetlib.dhp.schema.oaf.Result;
import scala.Tuple2;
/**
* @author miriam.baglioni
* @Date 22/09/23
*/
public class SparkCopyGraph implements Serializable {
private static final Logger log = LoggerFactory.getLogger(SparkCopyGraph.class);
public static void main(String[] args) throws Exception {
String jsonConfiguration = IOUtils
.toString(
SparkCopyGraph.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/copygraph_parameters.json"));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);
Boolean isSparkSessionManaged = Optional
.ofNullable(parser.get("isSparkSessionManaged"))
.map(Boolean::valueOf)
.orElse(Boolean.TRUE);
log.info("isSparkSessionManaged: {}", isSparkSessionManaged);
final String hivePath = parser.get("hivePath");
log.info("hivePath: {}", hivePath);
final String outputPath = parser.get("outputPath");
log.info("outputPath: {}", outputPath);
SparkConf conf = new SparkConf();
runWithSparkSession(
conf,
isSparkSessionManaged,
spark ->
execCopy(
spark,
hivePath,
outputPath));
}
private static void execCopy(SparkSession spark, String hivePath, String outputPath) {
ModelSupport.oafTypes.entrySet().parallelStream().forEach(entry -> {
String entityType = entry.getKey();
Class<?> clazz = entry.getValue();
// if (!entityType.equalsIgnoreCase("relation")) {
spark
.read()
.schema(Encoders.bean(clazz).schema())
.parquet(hivePath + "/" + entityType)
.write()
.mode(SaveMode.Overwrite)
.option("compression", "gzip")
.json(outputPath + "/" + entityType);
});
}
}


@@ -1,9 +1,13 @@
package eu.dnetlib.dhp.oa.graph.dump;
import static eu.dnetlib.dhp.utils.DHPUtils.MAPPER;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
import org.slf4j.Logger;
@@ -12,7 +16,10 @@ import org.slf4j.LoggerFactory;
import com.fasterxml.jackson.databind.ObjectMapper;
import eu.dnetlib.dhp.communityapi.model.*;
import eu.dnetlib.dhp.oa.graph.dump.eosc.CommunityMap;
import eu.dnetlib.dhp.oa.graph.dump.community.CommunityMap;
import eu.dnetlib.dhp.oa.graph.dump.complete.ContextInfo;
import eu.dnetlib.dhp.oa.graph.dump.csv.Constants;
import eu.dnetlib.dhp.utils.DHPUtils;
public class UtilCommunityAPI {
@@ -32,6 +39,28 @@ public class UtilCommunityAPI {
return map;
}
public List<String> getCommunityCsv(List<String> comms) {
return comms.stream().map(c -> {
try {
CommunityModel community = getCommunity(c);
StringBuilder builder = new StringBuilder();
builder.append(DHPUtils.md5(community.getId()));
builder.append(Constants.SEP);
builder.append(community.getName());
builder.append(Constants.SEP);
builder.append(community.getId());
builder.append(Constants.SEP);
builder
.append(
community.getDescription());
return builder.toString();
} catch (IOException e) {
throw new RuntimeException(e);
}
}).collect(Collectors.toList());
}
private List<CommunityModel> getValidCommunities() throws IOException {
ObjectMapper mapper = new ObjectMapper();
return mapper
@@ -52,4 +81,123 @@
}
public List<ContextInfo> getContextInformation() throws IOException {
return getValidCommunities()
.stream()
.map(c -> getContext(c))
.collect(Collectors.toList());
}
public ContextInfo getContext(CommunityModel c) {
ContextInfo cinfo = new ContextInfo();
cinfo.setId(c.getId());
cinfo.setDescription(c.getDescription());
CommunityModel cm = null;
try {
cm = getCommunity(c.getId());
} catch (IOException e) {
throw new RuntimeException(e);
}
cinfo.setSubject(new ArrayList<>());
cinfo.getSubject().addAll(cm.getSubjects());
cinfo.setZenodocommunity(c.getZenodoCommunity());
cinfo.setType(c.getType());
return cinfo;
}
public List<ContextInfo> getContextRelation() throws IOException {
return getValidCommunities().stream().map(c -> {
ContextInfo cinfo = new ContextInfo();
cinfo.setId(c.getId());
cinfo.setDatasourceList(getDatasourceList(c.getId()));
cinfo.setProjectList(getProjectList(c.getId()));
return cinfo;
}).collect(Collectors.toList());
}
private List<String> getDatasourceList(String id) {
List<String> datasourceList = new ArrayList<>();
try {
new ObjectMapper()
.readValue(
eu.dnetlib.dhp.communityapi.QueryCommunityAPI.communityDatasource(id),
DatasourceList.class)
.stream()
.forEach(ds -> {
if (Optional.ofNullable(ds.getOpenaireId()).isPresent()) {
datasourceList.add(ds.getOpenaireId());
}
});
} catch (IOException e) {
throw new RuntimeException(e);
}
return datasourceList;
}
private List<String> getProjectList(String id) {
int page = -1;
int size = 100;
ContentModel cm = null;
ArrayList<String> projectList = new ArrayList<>();
do {
page++;
try {
cm = new ObjectMapper()
.readValue(
eu.dnetlib.dhp.communityapi.QueryCommunityAPI
.communityProjects(
id, String.valueOf(page), String.valueOf(size)),
ContentModel.class);
if (cm.getContent().size() > 0) {
cm.getContent().forEach(p -> {
if (Optional.ofNullable(p.getOpenaireId()).isPresent())
projectList.add(p.getOpenaireId());
});
}
} catch (IOException e) {
throw new RuntimeException(e);
}
} while (!cm.getLast());
return projectList;
}
/**
* it returns for each organization the list of associated communities
*/
public CommunityEntityMap getCommunityOrganization() throws IOException {
CommunityEntityMap organizationMap = new CommunityEntityMap();
getValidCommunities()
.forEach(community -> {
String id = community.getId();
try {
List<String> associatedOrgs = MAPPER
.readValue(
eu.dnetlib.dhp.communityapi.QueryCommunityAPI.communityPropagationOrganization(id),
OrganizationList.class);
associatedOrgs.forEach(o -> {
if (!organizationMap.containsKey(o))
organizationMap.put(o, new ArrayList<>());
organizationMap.get(o).add(community.getId());
});
} catch (IOException e) {
throw new RuntimeException(e);
}
});
return organizationMap;
}
}
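The returned CommunityEntityMap therefore indexes community ids by organization id; a lookup sketch with hypothetical identifiers:

CommunityEntityMap orgMap = new UtilCommunityAPI().getCommunityOrganization();
// orgMap.get("20|openorgs____::0123abcd") -> ["my-comm", "another-comm"] (hypothetical ids)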


@@ -0,0 +1,200 @@
package eu.dnetlib.dhp.oa.graph.dump;
import static eu.dnetlib.dhp.oa.graph.dump.Constants.*;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SparkSession;
import org.jetbrains.annotations.NotNull;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.google.gson.Gson;
import eu.dnetlib.dhp.common.HdfsSupport;
import eu.dnetlib.dhp.oa.graph.dump.community.CommunityMap;
import eu.dnetlib.dhp.oa.graph.dump.complete.Constants;
import eu.dnetlib.dhp.oa.model.Indicator;
import eu.dnetlib.dhp.oa.model.Score;
import eu.dnetlib.dhp.oa.model.UsageCounts;
import eu.dnetlib.dhp.oa.model.graph.GraphResult;
import eu.dnetlib.dhp.oa.model.graph.Relation;
import eu.dnetlib.dhp.oa.model.graph.ResearchCommunity;
import eu.dnetlib.dhp.schema.oaf.KeyValue;
import eu.dnetlib.dhp.schema.oaf.Measure;
import eu.dnetlib.dhp.utils.DHPUtils;
import scala.Tuple2;
public class Utils {
public static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
public static final String ENTITY_ID_SEPARATOR = "|";
private Utils() {
}
public static void removeOutputDir(SparkSession spark, String path) {
HdfsSupport.remove(path, spark.sparkContext().hadoopConfiguration());
}
public static <R> Dataset<R> readPath(
SparkSession spark, String inputPath, Class<R> clazz) {
return spark
.read()
.textFile(inputPath)
.map((MapFunction<String, R>) value -> OBJECT_MAPPER.readValue(value, clazz), Encoders.bean(clazz));
}
public static String getContextId(String id) {
return String
.format(
"%s::%s", Constants.CONTEXT_NS_PREFIX,
DHPUtils.md5(id));
}
public static CommunityMap getCommunityMap(SparkSession spark, String communityMapPath) {
return new Gson().fromJson(spark.read().textFile(communityMapPath).collectAsList().get(0), CommunityMap.class);
}
public static CommunityMap readCommunityMap(FileSystem fileSystem, String communityMapPath) throws IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(fileSystem.open(new Path(communityMapPath))));
StringBuilder sb = new StringBuilder();
try {
String line;
while ((line = br.readLine()) != null) {
sb.append(line);
}
} finally {
br.close();
}
return new Gson().fromJson(sb.toString(), CommunityMap.class);
}
public static String getEntityId(String id, String separator) {
return id.substring(id.indexOf(separator) + 1);
}
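A quick illustration of the identifier trimming performed by getEntityId, using the ENTITY_ID_SEPARATOR ("|") defined above; the identifier below is hypothetical:

String raw = "50|doi_________::0123456789abcdef0123456789abcdef"; // internal graph id (hypothetical)
String dumped = getEntityId(raw, ENTITY_ID_SEPARATOR);
// dumped == "doi_________::0123456789abcdef0123456789abcdef" (the "50|" type prefix is dropped)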
public static Dataset<String> getEntitiesId(SparkSession spark, String inputPath) {
Dataset<String> dumpedIds = Utils
.readPath(spark, inputPath + "/publication", GraphResult.class)
.map((MapFunction<GraphResult, String>) r -> r.getId(), Encoders.STRING())
.union(
Utils
.readPath(spark, inputPath + "/dataset", GraphResult.class)
.map((MapFunction<GraphResult, String>) r -> r.getId(), Encoders.STRING()))
.union(
Utils
.readPath(spark, inputPath + "/software", GraphResult.class)
.map((MapFunction<GraphResult, String>) r -> r.getId(), Encoders.STRING()))
.union(
Utils
.readPath(spark, inputPath + "/otherresearchproduct", GraphResult.class)
.map((MapFunction<GraphResult, String>) r -> r.getId(), Encoders.STRING()))
.union(
Utils
.readPath(spark, inputPath + "/organization", eu.dnetlib.dhp.oa.model.graph.Organization.class)
.map(
(MapFunction<eu.dnetlib.dhp.oa.model.graph.Organization, String>) o -> o.getId(),
Encoders.STRING()))
.union(
Utils
.readPath(spark, inputPath + "/project", eu.dnetlib.dhp.oa.model.graph.Project.class)
.map(
(MapFunction<eu.dnetlib.dhp.oa.model.graph.Project, String>) o -> o.getId(), Encoders.STRING()))
.union(
Utils
.readPath(spark, inputPath + "/datasource", eu.dnetlib.dhp.oa.model.graph.Datasource.class)
.map(
(MapFunction<eu.dnetlib.dhp.oa.model.graph.Datasource, String>) o -> o.getId(),
Encoders.STRING()))
.union(
Utils
.readPath(spark, inputPath + "/communities_infrastructures", ResearchCommunity.class)
.map((MapFunction<ResearchCommunity, String>) c -> c.getId(), Encoders.STRING()));
return dumpedIds;
}
public static Dataset<Relation> getValidRelations(Dataset<Relation> relations,
Dataset<String> entitiesIds) {
Dataset<Tuple2<String, Relation>> relationSource = relations
.map(
(MapFunction<Relation, Tuple2<String, Relation>>) r -> new Tuple2<>(r.getSource(), r),
Encoders.tuple(Encoders.STRING(), Encoders.bean(Relation.class)));
Dataset<Tuple2<String, Relation>> relJoinSource = relationSource
.joinWith(entitiesIds, relationSource.col("_1").equalTo(entitiesIds.col("value")))
.map(
(MapFunction<Tuple2<Tuple2<String, Relation>, String>, Tuple2<String, Relation>>) t2 -> new Tuple2<>(
t2._1()._2().getTarget(), t2._1()._2()),
Encoders.tuple(Encoders.STRING(), Encoders.bean(Relation.class)));
return relJoinSource
.joinWith(entitiesIds, relJoinSource.col("_1").equalTo(entitiesIds.col("value")))
.map(
(MapFunction<Tuple2<Tuple2<String, Relation>, String>, Relation>) t2 -> t2._1()._2(),
Encoders.bean(Relation.class));
}
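In other words, a relation survives only when both its source and its target appear among the dumped identifiers. A minimal usage sketch, assuming the datasets come from the helpers above (variable names are illustrative):

Dataset<String> dumpedIds = Utils.getEntitiesId(spark, outputPath); // ids of every dumped entity
Dataset<Relation> valid = Utils.getValidRelations(relations, dumpedIds); // both endpoints must be dumped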
public static Indicator getIndicator(List<Measure> measures) {
Indicator i = new Indicator();
for (eu.dnetlib.dhp.schema.oaf.Measure m : measures) {
switch (m.getId()) {
case USAGE_COUNT_DOWNLOADS:
getUsageCounts(i).setDownloads(m.getUnit().get(0).getValue());
break;
case USAGE_COUNT_VIEWS:
getUsageCounts(i).setViews(m.getUnit().get(0).getValue());
break;
default:
getImpactMeasure(i).add(getScore(m.getId(), m.getUnit()));
break;
}
}
return i;
}
@NotNull
private static UsageCounts getUsageCounts(Indicator i) {
if (i.getUsageCounts() == null) {
i.setUsageCounts(new UsageCounts());
}
return i.getUsageCounts();
}
@NotNull
private static List<Score> getImpactMeasure(Indicator i) {
if (i.getBipIndicators() == null) {
i.setBipIndicators(new ArrayList<>());
}
return i.getBipIndicators();
}
private static Score getScore(String indicator, List<KeyValue> unit) {
Score s = new Score();
s.setIndicator(indicator);
for (KeyValue u : unit) {
if (u.getKey().equals("score")) {
s.setScore(u.getValue());
} else {
s.setClazz(u.getValue());
}
}
return s;
}
}
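To make the measure routing in getIndicator concrete, here is a small sketch with hypothetical values; buildMeasure is an illustrative helper that creates a Measure with a single KeyValue unit, and the measure ids are assumed to match the USAGE_COUNT_* constants:

List<Measure> measures = Arrays.asList(
buildMeasure("downloads", "count", "12"), // routed to Indicator.usageCounts.downloads
buildMeasure("views", "count", "30"), // routed to Indicator.usageCounts.views
buildMeasure("influence", "score", "1.2e-8")); // any other id becomes a Score in bipIndicators
Indicator i = Utils.getIndicator(measures);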


@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.oa.graph.dump.eosc;
package eu.dnetlib.dhp.oa.graph.dump.community;
import java.io.Serializable;
import java.util.HashMap;


@@ -0,0 +1,75 @@
package eu.dnetlib.dhp.oa.graph.dump.community;
import static eu.dnetlib.dhp.common.SparkSessionSupport.runWithSparkSession;
import java.io.Serializable;
import java.util.Optional;
import java.util.stream.Collectors;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FilterFunction;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;
import com.fasterxml.jackson.databind.ObjectMapper;
import eu.dnetlib.dhp.oa.graph.dump.Utils;
import eu.dnetlib.dhp.oa.model.community.CommunityResult;
import eu.dnetlib.dhp.oa.model.community.Context;
/**
* This class splits the dumped results according to the research community - research initiative/infrastructure they
* are related to. The information about the community is found in the element "context.id" in the result. Since the
* contexts found in a result may be associated not only to communities, a community map is provided to guide the
* splitting process. Note the repartition(1) just before writing the results related to a community: this is a
* choice due to uploading constraints (just one file for each community). As soon as a better solution is in place,
* remove the repartition.
*/
public class CommunitySplit implements Serializable {
public void run(Boolean isSparkSessionManaged, String inputPath, String outputPath, String communityMapPath) {
SparkConf conf = new SparkConf();
runWithSparkSession(
conf,
isSparkSessionManaged,
spark -> {
Utils.removeOutputDir(spark, outputPath);
CommunityMap communityMap = Utils.getCommunityMap(spark, communityMapPath);
execSplit(spark, inputPath, outputPath, communityMap);
});
}
private static void execSplit(SparkSession spark, String inputPath, String outputPath,
CommunityMap communities) {
Dataset<CommunityResult> result = Utils
.readPath(spark, inputPath + "/publication", CommunityResult.class)
.union(Utils.readPath(spark, inputPath + "/dataset", CommunityResult.class))
.union(Utils.readPath(spark, inputPath + "/orp", CommunityResult.class))
.union(Utils.readPath(spark, inputPath + "/software", CommunityResult.class));
communities
.keySet()
.stream()
.parallel()
.forEach(c -> {
result
.filter(
(FilterFunction<CommunityResult>) r -> Optional.ofNullable(r.getContext()).isPresent() &&
r.getContext().stream().anyMatch(con -> con.getCode().equals(c)))
.map(
(MapFunction<CommunityResult, String>) cr -> new ObjectMapper().writeValueAsString(cr),
Encoders.STRING())
.write()
.option("compression", "gzip")
.mode(SaveMode.Overwrite)
.text(outputPath + "/" + c.replace(" ", "_"));
});
}
}
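With a hypothetical community map containing the keys "my-comm" and "my second comm", the split writes one gzip-compressed text dataset per community, replacing spaces in the key with underscores:

// outputPath/my-comm/part-00000.txt.gz
// outputPath/my_second_comm/part-00000.txt.gz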


@@ -1,14 +1,14 @@
package eu.dnetlib.dhp.oa.graph.dump.eosc;
package eu.dnetlib.dhp.oa.graph.dump.community;
import java.io.Serializable;
import java.util.List;
import eu.dnetlib.dhp.eosc.model.ProjectSummary;
import eu.dnetlib.dhp.oa.model.community.Project;
public class ResultProject implements Serializable {
private String resultId;
private List<ProjectSummary> projectsList;
private List<Project> projectsList;
public String getResultId() {
return resultId;
@@ -18,11 +18,11 @@ public class ResultProject implements Serializable {
this.resultId = resultId;
}
public List<ProjectSummary> getProjectsList() {
public List<Project> getProjectsList() {
return projectsList;
}
public void setProjectsList(List<ProjectSummary> projectsList) {
public void setProjectsList(List<Project> projectsList) {
this.projectsList = projectsList;
}
}


@@ -0,0 +1,155 @@
package eu.dnetlib.dhp.oa.graph.dump.community;
import static eu.dnetlib.dhp.common.SparkSessionSupport.runWithSparkSession;
import java.io.Serializable;
import java.util.*;
import java.util.stream.Collectors;
import org.apache.commons.io.IOUtils;
import org.apache.commons.lang.StringUtils;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FilterFunction;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.fasterxml.jackson.databind.ObjectMapper;
import eu.dnetlib.dhp.application.ArgumentApplicationParser;
import eu.dnetlib.dhp.oa.graph.dump.Constants;
import eu.dnetlib.dhp.oa.graph.dump.ResultMapper;
import eu.dnetlib.dhp.oa.graph.dump.Utils;
import eu.dnetlib.dhp.oa.graph.dump.exceptions.CardinalityTooHighException;
import eu.dnetlib.dhp.oa.graph.dump.exceptions.NoAvailableEntityTypeException;
import eu.dnetlib.dhp.oa.model.community.CommunityResult;
import eu.dnetlib.dhp.schema.common.ModelSupport;
import eu.dnetlib.dhp.schema.oaf.Context;
import eu.dnetlib.dhp.schema.oaf.DataInfo;
import eu.dnetlib.dhp.schema.oaf.OafEntity;
import eu.dnetlib.dhp.schema.oaf.Result;
/**
* Spark action to trigger the dump of results associated to a research community - research initiative/infrastructure.
* The actual dump is performed via the class DumpProducts, which is also used for the entire graph dump
*/
public class SparkDumpCommunityProducts implements Serializable {
private static final Logger log = LoggerFactory.getLogger(SparkDumpCommunityProducts.class);
public static void main(String[] args) throws Exception {
String jsonConfiguration = IOUtils
.toString(
SparkDumpCommunityProducts.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/input_parameters.json"));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);
Boolean isSparkSessionManaged = Optional
.ofNullable(parser.get("isSparkSessionManaged"))
.map(Boolean::valueOf)
.orElse(Boolean.TRUE);
log.info("isSparkSessionManaged: {}", isSparkSessionManaged);
final String inputPath = parser.get("sourcePath");
log.info("inputPath: {}", inputPath);
final String outputPath = parser.get("outputPath");
log.info("outputPath: {}", outputPath);
final String resultClassName = parser.get("resultTableName");
log.info("resultTableName: {}", resultClassName);
String communityMapPath = Optional
.ofNullable(parser.get("communityMapPath"))
.orElse(null);
String dumpType = Optional
.ofNullable(parser.get("dumpType"))
.orElse(null);
Class<? extends Result> inputClazz = (Class<? extends Result>) Class.forName(resultClassName);
SparkConf conf = new SparkConf();
runWithSparkSession(
conf,
isSparkSessionManaged,
spark -> {
Utils.removeOutputDir(spark, outputPath);
resultDump(
spark, inputPath, outputPath, communityMapPath, inputClazz, dumpType);
});
}
public static <I extends OafEntity> void resultDump(
SparkSession spark,
String inputPath,
String outputPath,
String communityMapPath,
Class<I> inputClazz,
String dumpType) {
CommunityMap communityMap = null;
if (!StringUtils.isEmpty(communityMapPath))
communityMap = Utils.getCommunityMap(spark, communityMapPath);
CommunityMap finalCommunityMap = communityMap;
Utils
.readPath(spark, inputPath, inputClazz)
.map(
(MapFunction<I, CommunityResult>) value -> execMap(value, finalCommunityMap, dumpType),
Encoders.bean(CommunityResult.class))
.filter((FilterFunction<CommunityResult>) value -> value != null)
.map(
(MapFunction<CommunityResult, String>) r -> new ObjectMapper().writeValueAsString(r), Encoders.STRING())
.write()
.mode(SaveMode.Overwrite)
.option("compression", "gzip")
.text(outputPath);
}
private static <I extends OafEntity, O extends eu.dnetlib.dhp.oa.model.Result> O execMap(I value,
CommunityMap communityMap, String dumpType) throws NoAvailableEntityTypeException, CardinalityTooHighException {
Optional<DataInfo> odInfo = Optional.ofNullable(value.getDataInfo());
if (Boolean.FALSE.equals(odInfo.isPresent())) {
return null;
}
if (Boolean.TRUE.equals(odInfo.get().getDeletedbyinference())
|| Boolean.TRUE.equals(odInfo.get().getInvisible())) {
return null;
}
if (StringUtils.isEmpty(dumpType)) {
Set<String> communities = communityMap.keySet();
Optional<List<Context>> inputContext = Optional
.ofNullable(((eu.dnetlib.dhp.schema.oaf.Result) value).getContext());
if (!inputContext.isPresent()) {
return null;
}
List<String> toDumpFor = inputContext.get().stream().map(c -> {
if (communities.contains(c.getId())) {
return c.getId();
}
if (c.getId().contains("::") && communities.contains(c.getId().substring(0, c.getId().indexOf("::")))) {
return c.getId().substring(0, c.getId().indexOf("::"));
}
return null;
}).filter(Objects::nonNull).collect(Collectors.toList());
if (toDumpFor.isEmpty()) {
return null;
}
}
return (O) ResultMapper.map(value, communityMap, Constants.DUMPTYPE.COMMUNITY.getType());
}
}


@@ -1,7 +1,9 @@
package eu.dnetlib.dhp.oa.graph.dump.eosc;
package eu.dnetlib.dhp.oa.graph.dump.community;
import static eu.dnetlib.dhp.common.SparkSessionSupport.runWithSparkSession;
import static eu.dnetlib.dhp.oa.graph.dump.Utils.ENTITY_ID_SEPARATOR;
import static eu.dnetlib.dhp.oa.graph.dump.Utils.getEntityId;
import java.io.Serializable;
import java.io.StringReader;
@@ -26,10 +28,11 @@ import org.slf4j.LoggerFactory;
import org.xml.sax.SAXException;
import eu.dnetlib.dhp.application.ArgumentApplicationParser;
import eu.dnetlib.dhp.eosc.model.FunderShort;
import eu.dnetlib.dhp.eosc.model.ProjectSummary;
import eu.dnetlib.dhp.eosc.model.Provenance;
import eu.dnetlib.dhp.eosc.model.Validated;
import eu.dnetlib.dhp.oa.graph.dump.Utils;
import eu.dnetlib.dhp.oa.model.Provenance;
import eu.dnetlib.dhp.oa.model.community.Funder;
import eu.dnetlib.dhp.oa.model.community.Project;
import eu.dnetlib.dhp.oa.model.community.Validated;
import eu.dnetlib.dhp.schema.common.ModelConstants;
import eu.dnetlib.dhp.schema.oaf.DataInfo;
import eu.dnetlib.dhp.schema.oaf.Field;
@@ -49,7 +52,7 @@ public class SparkPrepareResultProject implements Serializable {
.toString(
SparkPrepareResultProject.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/eosc_project_prep_parameters.json"));
"/eu/dnetlib/dhp/oa/graph/dump/project_prep_parameters.json"));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);
@@ -60,6 +63,13 @@ public class SparkPrepareResultProject implements Serializable {
.orElse(Boolean.TRUE);
log.info("isSparkSessionManaged: {}", isSparkSessionManaged);
Boolean substring = Optional
.ofNullable(parser.get("substring"))
.map(Boolean::valueOf)
.orElse(Boolean.TRUE);
log.info("isSparkSessionManaged: {}", isSparkSessionManaged);
final String inputPath = parser.get("sourcePath");
log.info("inputPath: {}", inputPath);
@@ -73,11 +83,12 @@ public class SparkPrepareResultProject implements Serializable {
isSparkSessionManaged,
spark -> {
Utils.removeOutputDir(spark, outputPath);
prepareResultProjectList(spark, inputPath, outputPath);
prepareResultProjectList(spark, inputPath, outputPath, substring);
});
}
private static void prepareResultProjectList(SparkSession spark, String inputPath, String outputPath) {
private static void prepareResultProjectList(SparkSession spark, String inputPath, String outputPath,
Boolean substring) {
Dataset<Relation> relation = Utils
.readPath(spark, inputPath + "/relation", Relation.class)
.filter(
@@ -100,22 +111,20 @@ public class SparkPrepareResultProject implements Serializable {
Set<String> projectSet = new HashSet<>();
Tuple2<eu.dnetlib.dhp.schema.oaf.Project, Relation> first = it.next();
ResultProject rp = new ResultProject();
rp.setResultId(s);
if (substring)
rp.setResultId(getEntityId(s, ENTITY_ID_SEPARATOR));
else
rp.setResultId(s);
eu.dnetlib.dhp.schema.oaf.Project p = first._1();
projectSet.add(p.getId());
ProjectSummary ps = getProject(p, first._2);
Project ps = getProject(p, first._2);
List<ProjectSummary> projList = new ArrayList<>();
List<Project> projList = new ArrayList<>();
projList.add(ps);
rp.setProjectsList(projList);
it.forEachRemaining(c -> {
eu.dnetlib.dhp.schema.oaf.Project op = c._1();
if (!projectSet.contains(op.getId())) {
if (!Optional.ofNullable(op.getCode()).isPresent()
|| !Optional.ofNullable(op.getCode().getValue()).isPresent()) {
throw new RuntimeException("No project code for " + p.getId());
}
projList
.add(getProject(op, c._2));
@@ -132,10 +141,10 @@ public class SparkPrepareResultProject implements Serializable {
.json(outputPath);
}
private static ProjectSummary getProject(eu.dnetlib.dhp.schema.oaf.Project op, Relation relation) {
ProjectSummary p = ProjectSummary
private static Project getProject(eu.dnetlib.dhp.schema.oaf.Project op, Relation relation) {
Project p = Project
.newInstance(
op.getId(),
getEntityId(op.getId(), ENTITY_ID_SEPARATOR),
op.getCode().getValue(),
Optional
.ofNullable(op.getAcronym())
@@ -148,7 +157,7 @@ public class SparkPrepareResultProject implements Serializable {
Optional
.ofNullable(op.getFundingtree())
.map(value -> {
List<FunderShort> tmp = value
List<Funder> tmp = value
.stream()
.map(ft -> getFunder(ft.getValue()))
.collect(Collectors.toList());
@@ -174,8 +183,8 @@ public class SparkPrepareResultProject implements Serializable {
}
private static FunderShort getFunder(String fundingtree) {
final FunderShort f = new FunderShort();
private static Funder getFunder(String fundingtree) {
final Funder f = new Funder();
final Document doc;
try {
final SAXReader reader = new SAXReader();


@@ -0,0 +1,50 @@
package eu.dnetlib.dhp.oa.graph.dump.community;
import java.io.Serializable;
import java.util.Optional;
import org.apache.commons.io.IOUtils;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import eu.dnetlib.dhp.application.ArgumentApplicationParser;
/**
* Spark job to trigger the split of results associated to a research community - research initiative/infrastructure. The
* actual split is performed by the class CommunitySplit
*/
public class SparkSplitForCommunity implements Serializable {
private static final Logger log = LoggerFactory.getLogger(SparkSplitForCommunity.class);
public static void main(String[] args) throws Exception {
String jsonConfiguration = IOUtils
.toString(
SparkSplitForCommunity.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/split_parameters.json"));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);
Boolean isSparkSessionManaged = Optional
.ofNullable(parser.get("isSparkSessionManaged"))
.map(Boolean::valueOf)
.orElse(Boolean.TRUE);
log.info("isSparkSessionManaged: {}", isSparkSessionManaged);
final String inputPath = parser.get("sourcePath");
log.info("inputPath: {}", inputPath);
final String outputPath = parser.get("outputPath");
log.info("outputPath: {}", outputPath);
final String communityMapPath = parser.get("communityMapPath");
CommunitySplit split = new CommunitySplit();
split.run(isSparkSessionManaged, inputPath, outputPath, communityMapPath);
}
}


@@ -1,15 +1,14 @@
package eu.dnetlib.dhp.oa.graph.dump.eosc;
package eu.dnetlib.dhp.oa.graph.dump.community;
import static eu.dnetlib.dhp.common.SparkSessionSupport.runWithSparkSession;
import java.io.Serializable;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import org.apache.commons.io.IOUtils;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.ForeachFunction;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
@@ -21,21 +20,21 @@ import org.slf4j.LoggerFactory;
import com.fasterxml.jackson.databind.ObjectMapper;
import eu.dnetlib.dhp.application.ArgumentApplicationParser;
import eu.dnetlib.dhp.eosc.model.Result;
import eu.dnetlib.dhp.oa.graph.dump.Constants;
import eu.dnetlib.dhp.oa.graph.dump.Utils;
import eu.dnetlib.dhp.oa.model.community.CommunityResult;
import scala.Tuple2;
public class ExtendEoscResultWithProject implements Serializable {
public class SparkUpdateProjectInfo implements Serializable {
private static final Logger log = LoggerFactory.getLogger(ExtendEoscResultWithProject.class);
private static final Logger log = LoggerFactory.getLogger(SparkUpdateProjectInfo.class);
public static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();
public static void main(String[] args) throws Exception {
String jsonConfiguration = IOUtils
.toString(
ExtendEoscResultWithProject.class
SparkUpdateProjectInfo.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/eosc_project_input_parameters.json"));
"/eu/dnetlib/dhp/oa/graph/dump/project_input_parameters.json"));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);
@@ -55,11 +54,6 @@ public class ExtendEoscResultWithProject implements Serializable {
final String preparedInfoPath = parser.get("preparedInfoPath");
log.info("preparedInfoPath: {}", preparedInfoPath);
final String dumpType = Optional
.ofNullable(parser.get("dumpType"))
.orElse(Constants.DUMPTYPE.COMMUNITY.getType());
log.info("dumpType: {}", dumpType);
SparkConf conf = new SparkConf();
runWithSparkSession(
@@ -76,28 +70,25 @@ public class ExtendEoscResultWithProject implements Serializable {
String inputPath,
String outputPath,
String preparedInfoPath) {
Dataset<CommunityResult> result = Utils.readPath(spark, inputPath, CommunityResult.class);
Dataset<ResultProject> resultProject = Utils.readPath(spark, preparedInfoPath, ResultProject.class);
List<String> entities = Arrays.asList("publication", "dataset", "software", "otherresearchproduct");
entities.parallelStream().forEach(entity -> {
Dataset<Result> result = Utils.readPath(spark, inputPath + "/" + entity, Result.class);
Dataset<ResultProject> resultProject = Utils.readPath(spark, preparedInfoPath, ResultProject.class);
result
.joinWith(
resultProject, result.col("id").equalTo(resultProject.col("resultId")),
"left")
.map((MapFunction<Tuple2<Result, ResultProject>, Result>) value -> {
Result r = value._1();
Optional.ofNullable(value._2()).ifPresent(rp -> r.setProjects(rp.getProjectsList()));
return r;
}, Encoders.bean(Result.class))
.write()
.option("compression", "gzip")
.mode(SaveMode.Append)
.json(outputPath + "/" + entity);
});
result
.joinWith(
resultProject, result.col("id").equalTo(resultProject.col("resultId")),
"left")
.map((MapFunction<Tuple2<CommunityResult, ResultProject>, CommunityResult>) value -> {
CommunityResult r = value._1();
Optional.ofNullable(value._2()).ifPresent(rp -> r.setProjects(rp.getProjectsList()));
return r;
}, Encoders.bean(CommunityResult.class))
.map(
(MapFunction<CommunityResult, String>) cr -> new ObjectMapper().writeValueAsString(cr),
Encoders.STRING())
.write()
.option("compression", "gzip")
.mode(SaveMode.Append)
.text(outputPath);
}


@@ -1,5 +1,5 @@
package eu.dnetlib.dhp.oa.graph.dump.eosc;
package eu.dnetlib.dhp.oa.graph.dump.complete;
import java.io.Serializable;


@@ -0,0 +1,84 @@
package eu.dnetlib.dhp.oa.graph.dump.complete;
import java.io.Serializable;
import java.util.List;
/**
* Deserialization of the information in the context needed to create Context Entities, and relations between context
* entities and datasources and projects
*/
public class ContextInfo implements Serializable {
private String id;
private String description;
private String type;
private String zenodocommunity;
private String name;
private List<String> projectList;
private List<String> datasourceList;
private List<String> subject;
public List<String> getSubject() {
return subject;
}
public void setSubject(List<String> subject) {
this.subject = subject;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public String getId() {
return id;
}
public void setId(String id) {
this.id = id;
}
public String getDescription() {
return description;
}
public void setDescription(String description) {
this.description = description;
}
public String getType() {
return type;
}
public void setType(String type) {
this.type = type;
}
public String getZenodocommunity() {
return zenodocommunity;
}
public void setZenodocommunity(String zenodocommunity) {
this.zenodocommunity = zenodocommunity;
}
public List<String> getProjectList() {
return projectList;
}
public void setProjectList(List<String> projectList) {
this.projectList = projectList;
}
public List<String> getDatasourceList() {
return datasourceList;
}
public void setDatasourceList(List<String> datasourceList) {
this.datasourceList = datasourceList;
}
}


@@ -0,0 +1,106 @@
package eu.dnetlib.dhp.oa.graph.dump.complete;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Serializable;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import java.util.function.Function;
import org.apache.commons.io.IOUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import eu.dnetlib.dhp.application.ArgumentApplicationParser;
import eu.dnetlib.dhp.oa.graph.dump.UtilCommunityAPI;
import eu.dnetlib.dhp.oa.graph.dump.Utils;
import eu.dnetlib.dhp.oa.model.graph.ResearchInitiative;
/**
* Writes Context entities on HDFS. It queries the community APIs and collects the general information for contexts of
* type community or ri. The general information is the id of the
* context, its label, the subjects associated to the context, its zenodo community, description and type. This
* information is used to create a new Context Entity
*/
public class CreateContextEntities implements Serializable {
private static final Logger log = LoggerFactory.getLogger(CreateContextEntities.class);
private final transient Configuration conf;
private final transient BufferedWriter writer;
public static void main(String[] args) throws Exception {
String jsonConfiguration = IOUtils
.toString(
CreateContextEntities.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/input_entity_parameter.json"));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);
final String hdfsPath = parser.get("hdfsPath");
log.info("hdfsPath: {}", hdfsPath);
final String hdfsNameNode = parser.get("nameNode");
log.info("nameNode: {}", hdfsNameNode);
final CreateContextEntities cce = new CreateContextEntities(hdfsPath, hdfsNameNode);
log.info("Processing contexts...");
cce.execute(Process::getEntity);
cce.close();
}
private void close() throws IOException {
writer.close();
}
public CreateContextEntities(String hdfsPath, String hdfsNameNode) throws IOException {
this.conf = new Configuration();
this.conf.set("fs.defaultFS", hdfsNameNode);
FileSystem fileSystem = FileSystem.get(this.conf);
Path hdfsWritePath = new Path(hdfsPath);
final FSDataOutputStream fsDataOutputStream = fileSystem.exists(hdfsWritePath)
    ? fileSystem.append(hdfsWritePath)
    : fileSystem.create(hdfsWritePath);
CompressionCodecFactory factory = new CompressionCodecFactory(conf);
CompressionCodec codec = factory.getCodecByClassName("org.apache.hadoop.io.compress.GzipCodec");
this.writer = new BufferedWriter(new OutputStreamWriter(codec.createOutputStream(fsDataOutputStream),
StandardCharsets.UTF_8));
}
public <R extends ResearchInitiative> void execute(final Function<ContextInfo, R> producer)
throws IOException {
UtilCommunityAPI queryCommunityAPI = new UtilCommunityAPI();
final Consumer<ContextInfo> consumer = ci -> writeEntity(producer.apply(ci));
queryCommunityAPI.getContextInformation().forEach(consumer);
}
protected <R extends ResearchInitiative> void writeEntity(final R r) {
try {
writer.write(Utils.OBJECT_MAPPER.writeValueAsString(r));
writer.newLine();
} catch (final IOException e) {
throw new IllegalArgumentException(e);
}
}
}
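
The append-or-create plus gzip-codec logic in the constructor is the reusable piece of this class. A self-contained sketch of that pattern, assuming a hypothetical name node address and output path:

import java.io.BufferedWriter;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class GzipHdfsWriterSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://localhost:8020"); // hypothetical name node
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/tmp/context_entities.gz"); // hypothetical output path
        // Append when the file already exists, create it otherwise; appended gzip
        // members still decompress as a single stream.
        FSDataOutputStream out = fs.exists(path) ? fs.append(path) : fs.create(path);
        CompressionCodec codec = new CompressionCodecFactory(conf)
            .getCodecByClassName("org.apache.hadoop.io.compress.GzipCodec");
        try (BufferedWriter writer = new BufferedWriter(
            new OutputStreamWriter(codec.createOutputStream(out), StandardCharsets.UTF_8))) {
            writer.write("{\"id\":\"00|context_::example\"}"); // one JSON entity per line
            writer.newLine();
        }
    }
}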

View File

@@ -0,0 +1,113 @@
package eu.dnetlib.dhp.oa.graph.dump.complete;
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;
import org.apache.commons.io.IOUtils;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import com.fasterxml.jackson.databind.ObjectMapper;
import eu.dnetlib.dhp.application.ArgumentApplicationParser;
import eu.dnetlib.dhp.oa.graph.dump.UtilCommunityAPI;
import eu.dnetlib.dhp.oa.graph.dump.Utils;
import eu.dnetlib.dhp.oa.graph.dump.exceptions.MyRuntimeException;
import eu.dnetlib.dhp.oa.graph.dump.subset.MasterDuplicate;
import eu.dnetlib.dhp.oa.model.graph.*;
/**
* Writes the set of new Relations between contexts and datasources. At the moment the relations between contexts and
* projects are not created because of the low coverage of OpenAIRE project ids in the context profiles
*/
public class CreateContextRelation implements Serializable {
private static final Logger log = LoggerFactory.getLogger(CreateContextRelation.class);
private final transient Configuration conf;
private final transient BufferedWriter writer;
public static void main(String[] args) throws Exception {
String jsonConfiguration = IOUtils
.toString(
Objects
.requireNonNull(
CreateContextRelation.class
.getResourceAsStream(
"/eu/dnetlib/dhp/oa/graph/dump/input_entity_parameter.json")));
final ArgumentApplicationParser parser = new ArgumentApplicationParser(jsonConfiguration);
parser.parseArgument(args);
Boolean isSparkSessionManaged = Optional
.ofNullable(parser.get("isSparkSessionManaged"))
.map(Boolean::valueOf)
.orElse(Boolean.TRUE);
log.info("isSparkSessionManaged: {}", isSparkSessionManaged);
final String hdfsPath = parser.get("hdfsPath");
log.info("hdfsPath: {}", hdfsPath);
final String hdfsNameNode = parser.get("nameNode");
log.info("hdfsNameNode: {}", hdfsNameNode);
final CreateContextRelation cce = new CreateContextRelation(hdfsPath, hdfsNameNode);
log.info("Creating relation for datasources and projects...");
cce.execute(Process::getRelation);
cce.close();
}
private void close() throws IOException {
writer.close();
}
public CreateContextRelation(String hdfsPath, String hdfsNameNode)
throws IOException {
this.conf = new Configuration();
this.conf.set("fs.defaultFS", hdfsNameNode);
FileSystem fileSystem = FileSystem.get(this.conf);
Path hdfsWritePath = new Path(hdfsPath);
final FSDataOutputStream fsDataOutputStream = fileSystem.exists(hdfsWritePath)
    ? fileSystem.append(hdfsWritePath)
    : fileSystem.create(hdfsWritePath);
this.writer = new BufferedWriter(new OutputStreamWriter(fsDataOutputStream, StandardCharsets.UTF_8));
}
public void execute(final Function<ContextInfo, List<Relation>> producer) throws IOException {
final Consumer<ContextInfo> consumer = ci -> producer.apply(ci).forEach(this::writeEntity);
UtilCommunityAPI queryCommunityAPI = new UtilCommunityAPI();
queryCommunityAPI.getContextRelation().forEach(consumer);
}
protected void writeEntity(final Relation r) {
try {
writer.write(Utils.OBJECT_MAPPER.writeValueAsString(r));
writer.newLine();
} catch (final Exception e) {
throw new MyRuntimeException(e);
}
}
}
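
Since execute(...) accepts any Function<ContextInfo, List<Relation>>, producers other than Process::getRelation can be plugged in. A minimal sketch of one, using the Relation setters shown in Extractor below; the class name and the type and relation labels are illustrative:

import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;
import eu.dnetlib.dhp.oa.model.graph.RelType;
import eu.dnetlib.dhp.oa.model.graph.Relation;

public class ContextRelationProducerSketch {
    // Derives one context -> datasource relation per entry in the datasource list.
    static final Function<ContextInfo, List<Relation>> PRODUCER = ci -> ci
        .getDatasourceList()
        .stream()
        .map(dsId -> {
            Relation r = new Relation();
            r.setSource(ci.getId());
            r.setSourceType("context"); // illustrative labels
            r.setTarget(dsId);
            r.setTargetType("datasource");
            r.setReltype(RelType.newInstance("isRelatedTo", "relationship"));
            return r;
        })
        .collect(Collectors.toList());
}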

View File

@@ -0,0 +1,203 @@
package eu.dnetlib.dhp.oa.graph.dump.complete;
import static eu.dnetlib.dhp.common.SparkSessionSupport.runWithSparkSession;
import static eu.dnetlib.dhp.oa.graph.dump.Utils.ENTITY_ID_SEPARATOR;
import static eu.dnetlib.dhp.oa.graph.dump.Utils.getEntityId;
import java.io.Serializable;
import java.util.*;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;
import eu.dnetlib.dhp.oa.graph.dump.Utils;
import eu.dnetlib.dhp.oa.graph.dump.community.CommunityMap;
import eu.dnetlib.dhp.oa.model.Provenance;
import eu.dnetlib.dhp.oa.model.graph.RelType;
import eu.dnetlib.dhp.oa.model.graph.Relation;
import eu.dnetlib.dhp.schema.common.ModelConstants;
import eu.dnetlib.dhp.schema.oaf.KeyValue;
import eu.dnetlib.dhp.schema.oaf.Result;
/**
* Creates new Relations (as in eu.dnetlib.dhp.schema.dump.oaf.graph.Relation) from the information in the Entity. The
* new Relations are created for the datasources in the collectedfrom and hostedby elements and for the contexts
* related to communities and research initiatives/infrastructures. For collectedfrom elements it creates datasource ->
* provides -> result and result -> isProvidedBy -> datasource; for hostedby elements, datasource -> hosts -> result
* and result -> isHostedBy -> datasource; for context elements, context <-> isRelatedTo <-> result. Note for contexts:
* only the first provenance in the dataInfo is dumped; if more than one is present, the others are ignored
*/
public class Extractor implements Serializable {
public void run(Boolean isSparkSessionManaged,
String inputPath,
String outputPath,
Class<? extends Result> inputClazz,
String communityMapPath) {
SparkConf conf = new SparkConf();
runWithSparkSession(
conf,
isSparkSessionManaged,
spark -> {
extractRelationResult(
spark, inputPath, outputPath, inputClazz, Utils.getCommunityMap(spark, communityMapPath));
});
}
private <R extends Result> void extractRelationResult(SparkSession spark,
String inputPath,
String outputPath,
Class<R> inputClazz,
CommunityMap communityMap) {
Set<Integer> hashCodes = new HashSet<>();
Utils
.readPath(spark, inputPath, inputClazz)
.flatMap((FlatMapFunction<R, Relation>) value -> {
List<Relation> relationList = new ArrayList<>();
extractRelationsFromInstance(hashCodes, value, relationList);
Set<String> communities = communityMap.keySet();
Optional
.ofNullable(value.getContext())
.ifPresent(contexts -> contexts.forEach(context -> {
String id = context.getId();
if (id.contains(":")) {
id = id.substring(0, id.indexOf(":"));
}
if (communities.contains(id)) {
String contextId = Utils.getContextId(id);
Provenance provenance = Optional
.ofNullable(context.getDataInfo())
.map(
dinfo -> Optional
.ofNullable(dinfo.get(0).getProvenanceaction())
.map(
paction -> Provenance
.newInstance(
paction.getClassid(),
dinfo.get(0).getTrust()))
.orElse(null))
.orElse(null);
Relation r = getRelation(
getEntityId(value.getId(), ENTITY_ID_SEPARATOR), contextId,
Constants.RESULT_ENTITY,
Constants.CONTEXT_ENTITY,
ModelConstants.IS_RELATED_TO, ModelConstants.RELATIONSHIP, provenance);
if (!hashCodes.contains(r.hashCode())) {
    relationList.add(r);
    hashCodes.add(r.hashCode());
}
r = getRelation(
contextId, getEntityId(value.getId(), ENTITY_ID_SEPARATOR),
Constants.CONTEXT_ENTITY,
Constants.RESULT_ENTITY,
ModelConstants.IS_RELATED_TO,
ModelConstants.RELATIONSHIP, provenance);
if (!hashCodes.contains(r.hashCode())) {
    relationList.add(r);
    hashCodes.add(r.hashCode());
}
}
}));
return relationList.iterator();
}, Encoders.bean(Relation.class))
.write()
.option("compression", "gzip")
.mode(SaveMode.Append)
.json(outputPath);
}
private <R extends Result> void extractRelationsFromInstance(Set<Integer> hashCodes, R value,
List<Relation> relationList) {
Optional
.ofNullable(value.getInstance())
.ifPresent(inst -> inst.forEach(instance -> {
Optional
.ofNullable(instance.getCollectedfrom())
.ifPresent(
cf -> getRelationPair(
value, relationList, cf,
ModelConstants.IS_PROVIDED_BY, ModelConstants.PROVIDES, hashCodes));
Optional
.ofNullable(instance.getHostedby())
.ifPresent(
hb -> getRelationPair(
value, relationList, hb,
Constants.IS_HOSTED_BY, Constants.HOSTS, hashCodes));
}));
}
private static <R extends Result> void getRelationPair(R value, List<Relation> relationList, KeyValue cf,
String resultDatasource, String datasourceResult,
Set<Integer> hashCodes) {
Provenance provenance = Optional
.ofNullable(cf.getDataInfo())
.map(
dinfo -> Optional
.ofNullable(dinfo.getProvenanceaction())
.map(
paction -> Provenance
.newInstance(
paction.getClassname(),
dinfo.getTrust()))
.orElse(
Provenance
.newInstance(
eu.dnetlib.dhp.oa.graph.dump.Constants.HARVESTED,
eu.dnetlib.dhp.oa.graph.dump.Constants.DEFAULT_TRUST)))
.orElse(
Provenance
.newInstance(
eu.dnetlib.dhp.oa.graph.dump.Constants.HARVESTED,
eu.dnetlib.dhp.oa.graph.dump.Constants.DEFAULT_TRUST));
Relation r = getRelation(
getEntityId(value.getId(), ENTITY_ID_SEPARATOR),
getEntityId(cf.getKey(), ENTITY_ID_SEPARATOR), Constants.RESULT_ENTITY, Constants.DATASOURCE_ENTITY,
resultDatasource, ModelConstants.PROVISION,
provenance);
if (!hashCodes.contains(r.hashCode())) {
    relationList.add(r);
    hashCodes.add(r.hashCode());
}
r = getRelation(
getEntityId(cf.getKey(), ENTITY_ID_SEPARATOR), getEntityId(value.getId(), ENTITY_ID_SEPARATOR),
Constants.DATASOURCE_ENTITY, Constants.RESULT_ENTITY,
datasourceResult, ModelConstants.PROVISION,
provenance);
if (!hashCodes.contains(r.hashCode())) {
    relationList.add(r);
    hashCodes.add(r.hashCode());
}
}
private static Relation getRelation(String source, String target, String sourceType, String targetType,
String relName, String relType, Provenance provenance) {
Relation r = new Relation();
r.setSource(source);
r.setSourceType(sourceType);
r.setTarget(target);
r.setTargetType(targetType);
r.setReltype(RelType.newInstance(relName, relType));
r.setProvenance(provenance);
return r;
}
}
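
The de-duplication above relies on collecting Relation hash codes in a HashSet captured by the Spark closure; note that each task deserialises its own copy of the set, so duplicates are only guaranteed to be filtered within a task, not across partitions. A self-contained sketch of the pattern, with strings standing in for Relation objects:

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RelationDedupSketch {
    public static void main(String[] args) {
        Set<Integer> hashCodes = new HashSet<>();
        List<String> relationList = new ArrayList<>();
        String[] candidates = {
            "result|isProvidedBy|datasource",
            "datasource|provides|result",
            "result|isProvidedBy|datasource" // duplicate, skipped below
        };
        for (String r : candidates) {
            if (!hashCodes.contains(r.hashCode())) {
                relationList.add(r);
                hashCodes.add(r.hashCode());
            }
        }
        System.out.println(relationList); // only the two distinct relations survive
    }
}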

View File

@@ -0,0 +1,25 @@
package eu.dnetlib.dhp.oa.graph.dump.complete;
import java.io.Serializable;
public class MergedRels implements Serializable {
private String organizationId;
private String representativeId;
public String getOrganizationId() {
return organizationId;
}
public void setOrganizationId(String organizationId) {
this.organizationId = organizationId;
}
public String getRepresentativeId() {
return representativeId;
}
public void setRepresentativeId(String representativeId) {
this.representativeId = representativeId;
}
}

View File

@@ -0,0 +1,21 @@
package eu.dnetlib.dhp.oa.graph.dump.complete;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
public class OrganizationMap extends HashMap<String, List<String>> {
public OrganizationMap() {
super();
}
/**
 * Null-safe lookup: returns an empty list for missing keys. Overriding get(Object) rather than overloading
 * get(String) ensures the behaviour also applies when the map is accessed through the Map interface.
 */
@Override
public List<String> get(Object key) {
    List<String> valueList = super.get(key);
    return valueList == null ? new ArrayList<>() : valueList;
}
}
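
A brief usage sketch of the null-safe lookup, with hypothetical keys and values:

import java.util.Arrays;

public class OrganizationMapExample {
    public static void main(String[] args) {
        OrganizationMap map = new OrganizationMap();
        map.put("openorgs____::1234", Arrays.asList("egi", "ni")); // hypothetical ids
        System.out.println(map.get("openorgs____::1234")); // [egi, ni]
        System.out.println(map.get("missing-key")); // [] rather than null
    }
}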

Some files were not shown because too many files have changed in this diff.