Commit Graph

842 Commits

Author SHA1 Message Date
Miriam Baglioni c7f6669f1a [webcrawl] the blacklist is now in json and no more in csv after the normalization process 2024-07-25 15:20:18 +02:00
Miriam Baglioni 7cff281d3e [webcrawl] the blacklist is now in json and no more in csv after the normalization process 2024-07-25 15:16:42 +02:00
Miriam Baglioni fc60661ac5 [webcrawl] added code and test (code/resource) to verify the deletion of the relations related to results put in blacklist 2024-07-25 12:25:14 +02:00
Miriam Baglioni 6f1801d7d1 [webcrawl]- 2024-07-23 17:34:48 +02:00
Miriam Baglioni 19806c2ae3 [SDG]fixed switch of methods 2024-07-23 17:12:55 +02:00
Miriam Baglioni 9573bf576d [SDG]added code to ingest also the SDG without DOI 2024-07-23 12:47:57 +02:00
Miriam Baglioni 79985ad197 [Crossref]added mapping for DFG versus the unidentified project [https://support.openaire.eu/issues/9926?next_issue_id=9924&prev_issue_id=9927#note-4] 2024-07-17 18:30:24 +02:00
Claudio Atzori 06e3985b77 merged from beta 2024-07-17 12:01:40 +02:00
Claudio Atzori 83327239de fixed pom definitions, bumped dependency version for the dhp-schema module, removed unnecessary dependencies 2024-07-17 11:58:48 +02:00
Claudio Atzori e39e8bbd47 Merge pull request '[WebCrawlAffiliation]remove from the creation of the action set the relations for pmc and pmid. Only doi are allowed' (#462) from affiliationFromWebCrawlOnlyDOI into beta; Reviewed-on: #462 2024-07-17 11:12:32 +02:00
Claudio Atzori a65241fcaf Merge pull request 'implementation of the new collector plugin: research_fi' (#456) from research_fi_collector_plugin into beta; Reviewed-on: #456 2024-07-17 10:25:38 +02:00
Claudio Atzori c99f92efaa Merge pull request '[beta] OpenAIRE Affiliation Inference' (#452) from affRoFromRawString into beta; Reviewed-on: #452 2024-07-17 10:24:39 +02:00
Miriam Baglioni d96215cb9b [UnpayWall]added othe : in the identifier construction 2024-07-16 18:17:32 +02:00
Miriam Baglioni 9246bdec1c [WebCrawlAffiliation]remove from the creation of the action set the relations for pmc and pmid. Only doi are allowed 2024-07-16 14:07:37 +02:00
Claudio Atzori 61d1fa9b9f [metadata collection] added -Dcom.sun.security.enableAIAcaIssuers=true as a default for metadata collection 2024-07-12 10:26:45 +02:00
Claudio Atzori f9ed2ae33c [metadata collection] added the possibility to specify the JAVA_HOME and the JAVA_OPTS parameters 2024-07-11 15:32:36 +02:00
Michele Artini bbe52584f7 log message 2024-07-11 15:14:34 +02:00
Michele Artini 5cdba9172b implementation of the new collector plugin: research_fi 2024-07-10 14:53:13 +02:00
Miriam Baglioni c465835061 [Person]new implementation for the extraction of the coAuthorship relations 2024-07-09 12:29:55 +02:00
Miriam Baglioni 814e650e12 [Irish Tender]changed the irish.json file according to comments #26, #29, and #34 for 9635 2024-07-04 12:24:28 +02:00
Miriam Baglioni ddd20e7f8e [Person]first implementation of the action set to include Person entity in the graph starting from the orcid data 2024-07-04 12:08:46 +02:00
Miriam Baglioni 9cbe966b4a [AffiliationIngestion]refactoring 2024-06-29 18:35:49 +02:00
Miriam Baglioni 236b64d830 [AffiliationIngestion]Extended the ingestion of affiliation from OpenAIRE to include also links derived from Web Crawl. Extended the test. Inserted in Constants the id and name of the webcrawl datasource to be used here and also in the ingestion of links from web crawl 2024-06-29 18:29:20 +02:00
Miriam Baglioni 67ff783e65 [Person]First implementation to include Person entity in the graph 2024-06-29 17:13:01 +02:00
Miriam Baglioni d35edac212 [IrishFunderList]made changes according to 9635 comment 20, 21, 22 and 23 2024-06-20 12:28:28 +02:00
Miriam Baglioni 6421f8fece Merge remote-tracking branch 'origin/beta' into beta 2024-06-19 11:12:15 +02:00
Miriam Baglioni ac270f795b [IrishFunderList]made changes according to 9635 comment 14, 15 and 16 2024-06-19 11:11:52 +02:00
Giambattista Bloisi 9bf2bda1c6 Fix: next returned a null value at end of stream 2024-06-12 13:28:51 +02:00
Giambattista Bloisi d90cb099b8 Fix for paginationStart parameter management 2024-06-11 20:23:44 +02:00
Miriam Baglioni 8fe934810f Merge remote-tracking branch 'origin/beta' into beta 2024-06-11 10:28:51 +02:00
Miriam Baglioni 9da006e98c [SDGFoSActionSet]remove datainfo for the result. It is not needed (qualifier.classid = UPDATE) useless since subject do not go at the level of the instance 2024-06-11 10:28:32 +02:00
Giambattista Bloisi 85c1eae7e0 Fixes for pagination strategy looping at end of download 2024-06-10 19:03:58 +02:00
Michele Artini c726572418 changed some parameters in OSF test 2024-06-07 12:03:26 +02:00
Claudio Atzori a02f3f0d2b code formatting 2024-05-30 10:21:18 +02:00
Alessia Bardi 05ee783c07 Merge branch 'beta' into dblp_collection_plugin 2024-05-29 16:04:39 +02:00
Claudio Atzori c272c4ad68 code formatting 2024-05-29 15:50:07 +02:00
Alessia Bardi c5f4da16a4 Merge branch 'beta' into rest-collector-request-header-map 2024-05-29 15:46:23 +02:00
Alessia 1b165a14a0 Rest collector plugin on hadoop supports a new param to pass request headers 2024-05-29 15:41:36 +02:00
Michele Artini e996787be2 OSF test 2024-05-29 15:05:17 +02:00
Miriam Baglioni 5d85b70e1f [NOAMI] removed Ireland funder id 501100011103. ticket 9635 2024-05-29 11:55:00 +02:00
Miriam Baglioni 75d5ddb999 Update to include a blackList that filters out the results we know are wrongly associated to IE - update workflow definition - the blacklist parameter 2024-05-27 12:01:28 +02:00
Miriam Baglioni 87c9c61b41 Update to include a blackList that filters out the results we know are wrongly associated to IE - refactoring 2024-05-27 12:01:16 +02:00
Miriam Baglioni b55fed09f8 Update to include a blackList that filters out the results we know are wrongly associated to IE 2024-05-27 12:01:01 +02:00
Sandro La Bruzzo 66c1ffc866 merged again from beta (I hope for the last time) 2024-05-22 11:02:46 +02:00
Sandro La Bruzzo e8a61d5dd5 removed plugin, use only FileGZip plugin 2024-05-21 13:45:29 +02:00
Sandro La Bruzzo ca9414b737 Implement multiple node name splitter on GZipCollectorPlugin and all nodes that use XMLIterator. If the splitter name contains is a comma separated values it splits for all the values 2024-05-21 09:11:13 +02:00
Sandro La Bruzzo 032bcc8279 since last beta workflow we decide to introduce in the graph only MAG item with DOI and set them invisible (this should be the same behaviour of the previous DOIBoost mapping). This commit apply this type of mapping 2024-05-20 09:24:15 +02:00
Claudio Atzori f7d56e2ef2 Merge branch 'beta' into rest-collector-plugin-with-retry 2024-05-10 09:02:21 +02:00
Claudio Atzori 26363060ed fixed id prefix creation for the fosnodoi records, again 2024-05-03 15:53:52 +02:00
Claudio Atzori e1a0fb8933 fixed id prefix creation for the fosnodoi records 2024-05-03 14:14:18 +02:00
Michele Artini f4068de298 code reindent + tests 2024-05-02 09:51:33 +02:00
Michele Artini 2615136efc added a retry mechanism 2024-04-30 11:58:42 +02:00
Sandro La Bruzzo 052c6aac9d formatted code 2024-04-26 16:03:04 +02:00
Sandro La Bruzzo 0d628cd62b merged again from beta 2024-04-23 17:34:55 +02:00
Claudio Atzori 93dd9cc639 code formatting 2024-04-23 11:28:00 +02:00
Miriam Baglioni 6189879643 [NOAMI] removed entry for Irish Research eLibrary (IReL) Care Board from the list of funders. 2024-04-23 11:09:18 +02:00
Miriam Baglioni 7de114bda0 [WebCrawl] addressing comments from PR 2024-04-22 13:52:50 +02:00
Miriam Baglioni 776c898c4b [WebCrawl] adding affiliation relations from web information 2024-04-22 11:04:17 +02:00
Claudio Atzori 0656ab2838 code formatting 2024-04-20 08:10:58 +02:00
Claudio Atzori e5879b68c7 [transformative agreement] including result-funder relations to the information imported from the TRs 2024-04-19 17:14:18 +02:00
Sandro La Bruzzo b84ad0c06e merged beta 2024-04-19 14:39:59 +02:00
Miriam Baglioni 0625b9061f removed the funder id : 100011062 Asian Spinal Cord Network, wrongly associated to Ireland 2024-04-16 15:26:53 +02:00
Miriam Baglioni 9eeb9f5d32 merging with branch beta 2024-04-16 15:24:40 +02:00
Sandro La Bruzzo a5ddd8dfbb Added Action set generation for the MAG organization 2024-04-16 13:39:15 +02:00
Michele Artini 78b9d84e4a test 2024-04-16 09:41:16 +02:00
Sandro La Bruzzo 41a42dde64 code formatted 2024-04-11 17:43:48 +02:00
Sandro La Bruzzo 843dc95340 resolved conflict 2024-04-11 17:38:16 +02:00
Sandro La Bruzzo 1e30454ee0 added vocabulary to instanceTypeMapping of MAG 2024-04-11 17:32:30 +02:00
Sandro La Bruzzo 2581672c11 updated wf of MAG and crossref to use transaction 2024-04-11 17:27:49 +02:00
Sandro La Bruzzo a0642bd190 added instanceTypeMapping field on MAG 2024-04-11 13:10:12 +02:00
Sandro La Bruzzo 98dc042db5 mapping generated for MAG, missing generation of Organization Action set 2024-04-05 18:12:53 +02:00
Sandro La Bruzzo ef582948a7 Updated mapping 2024-04-05 11:10:44 +02:00
Sandro La Bruzzo 5142f462b5 completed mapping from paper to OAF, not tested 2024-04-04 21:06:04 +02:00
Miriam Baglioni 0794e0667b Merge branch 'doidoost_dismiss' of https://code-repo.d4science.org/D-Net/dnet-hadoop into doidoost_dismiss 2024-04-04 09:16:18 +02:00
Miriam Baglioni 4b1de076ac [DataciteHostedByMap] added entry for EBRAINS 2024-04-04 09:16:14 +02:00
Miriam Baglioni c8a88b2187 [DataciteHostedByMap] added entry for EBRAINS 2024-04-04 09:14:58 +02:00
Sandro La Bruzzo 31e152d2bb Merge remote-tracking branch 'origin/doidoost_dismiss' into doidoost_dismiss 2024-04-03 17:08:35 +02:00
Sandro La Bruzzo 6f3e925cae Implemented first part of the new MAG mapping 2024-04-03 17:07:14 +02:00
Miriam Baglioni f0f6abf892 [MapToFunderLink]added references for HFRI and Erasmus+ for the creation of links for funders 2024-04-03 14:59:09 +02:00
Miriam Baglioni 50fbebf186 [NOAMI] removed entry for Health and Social Care Board from the list of funders. Modified IRC putting 1596 and 1597 as synonyms, as required in ticket 9635 2024-04-03 11:45:40 +02:00
Michele Artini 71d6e02886 Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into beta 2024-04-03 09:50:41 +02:00
Michele Artini 02c9a311c8 base datainfo with trust=0.89 2024-04-03 09:50:21 +02:00
Miriam Baglioni 42846d3b91 [OpenCitation] add compression option when writing the sequence file 2024-04-03 09:25:00 +02:00
Miriam Baglioni 4f0a044245 Merge pull request 'Add action set creation for Datacite affiliations' (#413) from 9647_datacite_affiliations into beta; Reviewed-on: #413 2024-04-02 17:33:38 +02:00
Serafeim Chatzopoulos cbe13a5c61 Fix datacite input path in properties file 2024-04-02 18:00:35 +03:00
Miriam Baglioni 9c9a9562ae [UsageCount] fixed error 2024-04-02 16:56:37 +02:00
Miriam Baglioni b42bdd5fb3 [UsageCount] add check in case the datasource is not matched against those present in the graph 2024-04-02 16:28:27 +02:00
Miriam Baglioni 64cbd8abe9 Merge pull request '[UsageCount] Usage count per result split by datasource' (#318) from UsageStatsRecordDS into beta; Reviewed-on: #318 2024-04-02 10:21:39 +02:00
Serafeim Chatzopoulos 0eb0701b26 Add action set creation for Datacite affiliations 2024-04-01 17:23:26 +03:00
Sandro La Bruzzo 73a67c0e4a Improved Crossref mapping to include also unpaywall tested 2024-03-26 17:26:47 +01:00
Miriam Baglioni 94b931f7bd [BulkTagging - tag datasource and projects]merging with branch beta 2024-03-26 14:25:19 +01:00
Claudio Atzori ef52128c55 included new stats* workflows in parent pom list of modules, code formatting 2024-03-26 10:42:10 +01:00
Sandro La Bruzzo ece56f0178 update crossref mapping to be transformed together with UnpayWall 2024-03-25 18:18:10 +01:00
Claudio Atzori 74e5d05577 Merge branch 'beta' into ocnew 2024-03-25 16:10:31 +01:00
Claudio Atzori 6c3b692f60 integrated minor change from beta branch 2024-03-25 16:10:23 +01:00
Claudio Atzori 9a5b134ddf Merge branch 'beta' into FOSNew 2024-03-25 16:07:37 +01:00
Claudio Atzori 71c1f81b54 Merge branch 'beta' into exception_on_invalid_transofmation_rule 2024-03-25 16:05:11 +01:00
Claudio Atzori 91b61687fa Merge branch 'beta' into bulkTaggingPathMapExtention 2024-03-25 15:50:18 +01:00
Claudio Atzori 54936b7f42 Merge branch 'beta' into transformativeagreement 2024-03-25 15:42:22 +01:00
Michele Artini e1149eb5c4 xslt rules and tests 2024-03-25 15:01:42 +01:00