Commit Graph

788 Commits

Author SHA1 Message Date
Claudio Atzori 93dd9cc639 code formatting 2024-04-23 11:28:00 +02:00
Miriam Baglioni 6189879643 [NOAMI] removed entry for Irish Research eLibray (IReL) Care Board from the list of funders. 2024-04-23 11:09:18 +02:00
Miriam Baglioni 7de114bda0 [WebCrawl] addressing comments from PR 2024-04-22 13:52:50 +02:00
Miriam Baglioni 776c898c4b [WebCrawl] adding affiliation relations from web information 2024-04-22 11:04:17 +02:00
Claudio Atzori 0656ab2838 code formatting 2024-04-20 08:10:58 +02:00
Claudio Atzori e5879b68c7 [transformative agreement] including reuslt-funder relations to the information imported from the TRs 2024-04-19 17:14:18 +02:00
Sandro La Bruzzo b84ad0c06e merged beta 2024-04-19 14:39:59 +02:00
Miriam Baglioni 0625b9061f removed the funder id : 100011062 Asian Spinal Cord Network, wrongly associated to Ireland 2024-04-16 15:26:53 +02:00
Miriam Baglioni 9eeb9f5d32 mergin with branch beta 2024-04-16 15:24:40 +02:00
Sandro La Bruzzo a5ddd8dfbb Added Action set generation for the MAG organization 2024-04-16 13:39:15 +02:00
Michele Artini 78b9d84e4a test 2024-04-16 09:41:16 +02:00
Sandro La Bruzzo 41a42dde64 code formatted 2024-04-11 17:43:48 +02:00
Sandro La Bruzzo 843dc95340 resolved conflict 2024-04-11 17:38:16 +02:00
Sandro La Bruzzo 1e30454ee0 added vocabulary tu instanceTypeMApping of Mag 2024-04-11 17:32:30 +02:00
Sandro La Bruzzo 2581672c11 updated wf of MAG and crossref to use transaction 2024-04-11 17:27:49 +02:00
Sandro La Bruzzo a0642bd190 added instanceTypeMapping field on MAG 2024-04-11 13:10:12 +02:00
Sandro La Bruzzo 98dc042db5 mapping generated for MAG,
missing generation of Organization Action set
2024-04-05 18:12:53 +02:00
Sandro La Bruzzo ef582948a7 Updated mapping 2024-04-05 11:10:44 +02:00
Sandro La Bruzzo 5142f462b5 completed mapping from paper to OAF, not tested 2024-04-04 21:06:04 +02:00
Miriam Baglioni 0794e0667b Merge branch 'doidoost_dismiss' of https://code-repo.d4science.org/D-Net/dnet-hadoop into doidoost_dismiss 2024-04-04 09:16:18 +02:00
Miriam Baglioni 4b1de076ac [DataciteHostedByMap] added entry for EBRAINS 2024-04-04 09:16:14 +02:00
Miriam Baglioni c8a88b2187 [DataciteHostedByMap] added entry for EBRAINS 2024-04-04 09:14:58 +02:00
Sandro La Bruzzo 31e152d2bb Merge remote-tracking branch 'origin/doidoost_dismiss' into doidoost_dismiss 2024-04-03 17:08:35 +02:00
Sandro La Bruzzo 6f3e925cae Implemented first part of the new MAG mapping 2024-04-03 17:07:14 +02:00
Miriam Baglioni f0f6abf892 [MapToFunderLink]added references for HFRI and Erasmus+ for the creation of links for funders 2024-04-03 14:59:09 +02:00
Miriam Baglioni 50fbebf186 [NOAMI] removed entry for Health and Social Care Board from the list of funders. Modified IRC putting 1596 and 1597 as synonyms, as required in ticket 9635 2024-04-03 11:45:40 +02:00
Michele Artini 71d6e02886 Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta 2024-04-03 09:50:41 +02:00
Michele Artini 02c9a311c8 base datainfo with trust=0.89 2024-04-03 09:50:21 +02:00
Miriam Baglioni 42846d3b91 [OpenCitation] add compression option when writing the sequence file 2024-04-03 09:25:00 +02:00
Miriam Baglioni 4f0a044245 Merge pull request 'Add action set creation for Datacite affiliations' (#413) from 9647_datacite_affiliations into beta
Reviewed-on: #413
2024-04-02 17:33:38 +02:00
Serafeim Chatzopoulos cbe13a5c61 Fix datacite input path in properties file 2024-04-02 18:00:35 +03:00
Miriam Baglioni 9c9a9562ae [UsageCount] fixed error 2024-04-02 16:56:37 +02:00
Miriam Baglioni b42bdd5fb3 [UsageCount] add check in case the datasource is not matched against those present in the graph 2024-04-02 16:28:27 +02:00
Miriam Baglioni 64cbd8abe9 Merge pull request '[UsageCount] Usage count per result split by datasource' (#318) from UsageStatsRecordDS into beta
Reviewed-on: #318
2024-04-02 10:21:39 +02:00
Serafeim Chatzopoulos 0eb0701b26 Add action set creation for Datacite affiliations 2024-04-01 17:23:26 +03:00
Sandro La Bruzzo 73a67c0e4a Improved Crossref mapping to include also unpaywall tested 2024-03-26 17:26:47 +01:00
Miriam Baglioni 94b931f7bd [BulkTagging - tag datasource and projects]merging with branch beta 2024-03-26 14:25:19 +01:00
Claudio Atzori ef52128c55 included new stats* workflows in parent pom list of modules, code formatting 2024-03-26 10:42:10 +01:00
Sandro La Bruzzo ece56f0178 update crossref mapping to be transformed together with UnpayWall 2024-03-25 18:18:10 +01:00
Claudio Atzori 74e5d05577 Merge branch 'beta' into ocnew 2024-03-25 16:10:31 +01:00
Claudio Atzori 6c3b692f60 integrated minor change from beta branch 2024-03-25 16:10:23 +01:00
Claudio Atzori 9a5b134ddf Merge branch 'beta' into FOSNew 2024-03-25 16:07:37 +01:00
Claudio Atzori 71c1f81b54 Merge branch 'beta' into exception_on_invalid_transofmation_rule 2024-03-25 16:05:11 +01:00
Claudio Atzori 91b61687fa Merge branch 'beta' into bulkTaggingPathMapExtention 2024-03-25 15:50:18 +01:00
Claudio Atzori 54936b7f42 Merge branch 'beta' into transformativeagreement 2024-03-25 15:42:22 +01:00
Michele Artini e1149eb5c4 xslt rules and tests 2024-03-25 15:01:42 +01:00
Michele Artini 6ffb1faf09 fixed a problem with multiple nodes 2024-03-25 12:15:51 +01:00
Michele Artini 7faa115ba0 Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta 2024-03-22 11:08:59 +01:00
Michele Artini f9c74c98fa fixed an identifier xpath 2024-03-22 11:08:45 +01:00
Sandro La Bruzzo 58dbe71d39 update crossref mapping to be runnable separately as a single datasource outside doiboost 2024-03-20 17:04:52 +01:00
Giambattista Bloisi 664a381d31 Unify merge logic of entities in MergeUtils.class 2024-03-18 16:04:49 +01:00
Michele Artini cb29b9773c xslt rules 2024-03-18 15:31:34 +01:00
Michele Artini 85b844d57e updated BASE filter param 2024-03-15 15:03:27 +01:00
Michele Artini 455f2e1e07 apply commits from master 2024-03-15 14:56:39 +01:00
Michele Artini 88fef367b9 new plugin to collect from a dump of BASE 2024-03-15 10:47:52 +01:00
Sandro La Bruzzo 5281f010a5 applied cherry pick 2024-03-13 09:59:20 +01:00
Sandro La Bruzzo ee1fcb672b code refactor 2024-03-13 09:46:31 +01:00
Miriam Baglioni 5a32bb9578 [OC New] last fix 2024-03-13 09:36:18 +01:00
Sandro La Bruzzo c532831718 Moved Crossref Mapping on dhp-aggregations,
refactored code, avoid to use utility for create part of the oaf defined in DOIBoostMappingUtils, used instead utility in OafMappingUtils
2024-03-13 06:56:10 +01:00
Miriam Baglioni 48c052215c [OC New] last fix 2024-03-12 23:12:32 +01:00
Sandro La Bruzzo cbd4e5e4bb update mag mapping 2024-03-08 16:31:40 +01:00
Miriam Baglioni 5180b6ec8a [FOSNEW] removed test class 2024-03-07 10:47:13 +01:00
Miriam Baglioni 7827a2d66b [OCNEW] added creation of the actionset for the results classified with FoS based ont he OpenAIRE identifier 2024-03-07 10:36:30 +01:00
Miriam Baglioni fd34372c40 [OCNEW] first implementation 2024-03-06 13:42:00 +01:00
Sandro La Bruzzo d34cef3f8d Merge remote-tracking branch 'origin/beta' into doidoost_dismiss 2024-03-05 11:45:31 +01:00
Sandro La Bruzzo 3b837d38ce added oozie workflow 2024-03-05 11:44:59 +01:00
Sandro La Bruzzo f417515e43 Implemented class that generates a normalized table of MAG, which is the starting point for the creation of the mag source 2024-03-04 17:15:13 +01:00
Sandro La Bruzzo ad0e9aa80c added first part of refactoring of the code generating MAG,
make it more readable using spark sql queries
2024-02-29 18:16:15 +01:00
Giambattista Bloisi 3cd5590f3b When converting json to XML, remove characters that are not allowed in the XML 1.0 specs, as they will cause xpath failures even if escaped 2024-02-28 15:14:18 +01:00
Giambattista Bloisi 56dd05f85c Merge pull request 'Revised procedure when converting json data into xml' (#395) from restiterator_xmlcleanup into beta
Reviewed-on: #395
2024-02-28 10:38:54 +01:00
Sandro La Bruzzo 7d806a434c formatted code 2024-02-28 09:31:58 +01:00
Sandro La Bruzzo 915a76a796 following the comment on the pull requests:
- Added #NUM_OF_THREADS complete job in the queue at the end of  the main loop to avoid deadlock
2024-02-28 09:10:55 +01:00
Giambattista Bloisi 773e856550 Revised procedure when converting json data into xml:
- json object keys are renamed to be conformant to xml tag elements, special characters are substituted or removed
- json string values are no longer post-processed as they are already escaped by the org.json.XML.toString method
2024-02-24 16:54:30 +01:00
Sandro La Bruzzo a712df1e1d Merge remote-tracking branch 'origin/beta' into orcid_update 2024-02-23 10:12:25 +01:00
Sandro La Bruzzo b32a9d1994 Implemented workflow for updating table , added step to check if the new generated table is valid 2024-02-23 10:04:28 +01:00
Miriam Baglioni 72bae7af76 [Transformative Agreement] removed the relations from the ActionSet waiting to have the gree light from Ioanna 2024-02-19 16:20:12 +01:00
Serafeim Chatzopoulos f0dc12634b Add Action Set creation for affiliations inferred from the OpenAPC data 2024-02-18 18:02:09 +02:00
Miriam Baglioni eca021f4d6 [Transformative Agreement] add results with information abount the agreement and the country of the organization paid for it 2024-02-13 12:21:07 +01:00
Miriam Baglioni bdb6bbb365 mergin with branch beta 2024-02-12 15:50:43 +01:00
Miriam Baglioni 07a373a0bd [bulkTagging] removing checks while performing the substring action so that it will fire an Exception if the paramneters are wrongly set 2024-01-30 13:51:11 +01:00
Miriam Baglioni a418dacb47 [UsageCount] code extention to include also the name of the datasource 2024-01-29 18:12:33 +01:00
Miriam Baglioni e9131f4e4a mergin with branch beta 2024-01-29 16:27:18 +01:00
Sandro La Bruzzo 9aebca77a0 Added exception throwing in Hadoop transformation when TR is not syntactically valid 2024-01-29 14:41:02 +01:00
Sandro La Bruzzo 0386f36385 Added workflow to update ORCID and replaced some parsing, because the update works and employments xml differs from the dump one. 2024-01-25 19:40:59 +01:00
Sandro La Bruzzo 43e0bba7ed logg added during download 2024-01-23 15:04:49 +01:00
Miriam Baglioni f7d06dc661 compilation after merging 2024-01-23 11:43:08 +01:00
Miriam Baglioni e0ec800d7e [BulkTagging] extend the definition of the pathMap to include also actions that should be performed of the value extracted from the result befor applying the constraint 2024-01-23 11:34:53 +01:00
Sandro La Bruzzo e0753f19da Fixed error of connection timeout 2024-01-13 09:27:08 +01:00
sandro.labruzzo e328bc0ade fixed missing parameter on download update 2024-01-12 16:18:20 +01:00
Miriam Baglioni f612125939 fix issue on FoS integration. Removing the null values from FoS 2024-01-12 10:20:28 +01:00
Sandro La Bruzzo 859babf722 added some useful comment 2024-01-10 19:51:13 +01:00
Sandro La Bruzzo 8f61063201 Added workflow 2024-01-10 19:42:22 +01:00
Sandro La Bruzzo 1a42a5c10d Implemented Download update of ORCID 2024-01-10 18:03:20 +01:00
Miriam Baglioni 624f5f3f21 [Transformative Agreement] added check to verify the APC were paid byu the IReL funder 2023-12-18 15:28:19 +01:00
Miriam Baglioni 354e02e6a9 [Transformative Agreement] removed not needed class. Read directly the json and no need to pass from the csv 2023-12-18 15:20:27 +01:00
Miriam Baglioni b00771c7cc [Transformative Agreement] added code to extract relations from the transformative agreement file for the IE products got from OpenAPC 2023-12-18 15:12:44 +01:00
Sandro La Bruzzo 15fd93a2b6 uploaded input parameters on CreateBaseline WF 2023-12-18 12:21:55 +01:00
Sandro La Bruzzo 9d342a47da updated the transformation Baseline workflow to include mdstore rollback/commit action 2023-12-18 11:48:57 +01:00
Giambattista Bloisi 613ec5ffce Add profiles for different spark versions: spark-24, spark-34, spark-35 2023-12-05 19:11:06 +01:00
Sandro La Bruzzo 52495f2cd2 used javax.xml.stream.XMLEventReader instead of deprecated scala.xml.pull.XMLEventReader 2023-12-05 19:11:06 +01:00