Commit Graph

5482 Commits

Author SHA1 Message Date
Michele Artini fdbe629f49 removed the deletedByInference=true filter 2024-09-23 15:27:28 +02:00
Antonis Lempesis 619aa34a15 Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta 2024-09-23 15:25:59 +03:00
Antonis Lempesis dbea7a4072 removed duplicate line 2024-09-23 14:57:11 +03:00
Antonis Lempesis c9241dba0d Merge pull request 'convert_hive_to_spark_actions' (#1) from convert_hive_to_spark_actions into beta (Reviewed-on: antonis.lempesis/dnet-hadoop#1) 2024-09-23 13:53:28 +02:00
Claudio Atzori e0ff84baf0 [graph provision] person serialisation, limit the number of authorships and coauthorships before expanding the payloads 2024-09-23 10:29:46 +02:00
Michele Artini 755a5aefcf Merge pull request 'osfPreprints_plugin' (#482) from osfPreprints_plugin into beta (Reviewed-on: #482) 2024-09-23 10:21:34 +02:00
Michele Artini 2d7a7a962d unit test @Disabled 2024-09-23 10:19:36 +02:00
Michele Artini 6b0f7cc8b0 skip urls with authentication 2024-09-23 10:16:53 +02:00
Claudio Atzori 5f86c93be6 [graph provision] person serialisation 2024-09-20 12:20:00 +02:00
Michele Artini db6f137cf9 Merge pull request 'osfPreprints_plugin' (#480) from osfPreprints_plugin into beta (Reviewed-on: #480) 2024-09-20 09:56:50 +02:00
Michele Artini 339d8124f2 osf plugin: links to contributors and primary_file 2024-09-20 08:44:05 +02:00
Michele Artini 52bb7af03b use of dom4j 2024-09-19 14:59:05 +02:00
Michele Artini 9073b1159d partial implementation of osfPreprints plugin + tests 2024-09-19 13:58:53 +02:00
Michele Artini dcf09811a2 partial implementation of osfPreprints plugin 2024-09-19 12:42:45 +02:00
Claudio Atzori 23e0ab3a7c run mergeResultsOfDifferentTypes only when checkDelegatedAuthority is true 2024-09-17 15:36:10 +02:00
Claudio Atzori bfd05cdab2 run mergeResultsOfDifferentTypes only when checkDelegatedAuthority is true 2024-09-17 10:49:32 +02:00
Michele Artini 714a16854e Merge pull request 'gtr2Publications_plugin' (#477) from gtr2Publications_plugin into beta (Reviewed-on: #477) 2024-09-17 10:23:39 +02:00
Michele Artini a2fac78dcc fixed a problem in incremental harvesting 2024-09-17 10:16:28 +02:00
Michele Artini 99b7adda0c gtr2 unit test 2024-09-16 15:13:44 +02:00
Michele Artini bb9cee4f40 implementation of gtr2Publications plugin 2024-09-16 14:16:56 +02:00
Michele De Bonis 6df6b4583e blacklist filtering moved before the cleanup phase in order to have case sensitive regex 2024-09-16 14:04:59 +02:00
Alessia 07e6e7b4d6 #9839: include claimed affiliation relationships 2024-09-16 13:41:56 +02:00
Antonis Lempesis 37ad259296 cleanup 2024-09-05 16:02:44 +03:00
Antonis Lempesis b64c144abf added new institutions 2024-09-05 16:00:09 +03:00
Serafeim Chatzopoulos b043f8a963 Remove redundant error messages from impact indicators workflow 2024-09-04 14:28:43 +03:00
Serafeim Chatzopoulos db03f85366 Remove steps for updating BIP! from the impact indicators workflow 2024-09-04 14:25:44 +03:00
Miriam Baglioni 468f2aa5a5 [AffiliationAffRo]align beta with new affiliation from publisher webpage introduced in production. AffRo collectedfrom OpenAIRE to discriminate against WebCrawl 2024-08-12 18:10:46 +02:00
Miriam Baglioni 89fcf4086c [Person]fix issue in affiliation relation id construction for person (missing ::) 2024-08-12 18:04:43 +02:00
Miriam Baglioni 45605f93ae merging with branch beta 2024-08-12 18:03:10 +02:00
Miriam Baglioni 5a7ba77271 [Person]fix issue in affiliation relation id construction for person (missing ::) 2024-08-12 18:01:15 +02:00
Miriam Baglioni 8c185a7b1a resolving conflicts 2024-08-05 17:14:11 +02:00
Claudio Atzori e16616b964 added dataInfo to person records 2024-08-05 15:57:37 +02:00
Claudio Atzori 8e7ef79ce0 [bip affiliations] considers only DOI based records 2024-08-05 12:13:48 +02:00
Miriam Baglioni 985ca15264 [openaire-affiliation]removes matchings without DOI 2024-08-05 12:10:40 +02:00
Claudio Atzori 0bf76f2a34 [graph provision] added person to the graph2hive workflow 2024-08-05 09:35:07 +02:00
Claudio Atzori 975d44cac7 [graph provision] added person to the provision workflow 2024-08-02 16:14:10 +02:00
Claudio Atzori fecbf93e0e Merge pull request 'FoS L1 & L2' (#465) from fos_l1l2 into beta (Reviewed-on: #465) 2024-08-01 13:58:04 +02:00
Claudio Atzori 6bdb8643e6 ActionManager promote: allow to ingest person records in a graph that did not contain them, bumped dhp-schemas version 2024-07-31 11:02:22 +02:00
Claudio Atzori 9486e21a44 copy or process the person records throughout the graph pipeline 2024-07-30 14:25:31 +02:00
Claudio Atzori 64740475d0 depending on dhp-schemas:7.0.1 2024-07-29 11:51:42 +02:00
Claudio Atzori 75a11d0ba5 [dedup] avoid NPEs in the countryInference dedup utility 2024-07-25 16:34:32 +02:00
Claudio Atzori 8f551afa52 Merge pull request 'Remove Relation From AS' (#466) from webCrawlLessBlackList into beta (Reviewed-on: #466) 2024-07-25 15:50:19 +02:00
Miriam Baglioni 1af6571474 merging with branch beta 2024-07-25 15:48:05 +02:00
Claudio Atzori a81c555fe6 [graph provision] include only FoS L1..L2 in the record serialization 2024-07-25 15:26:47 +02:00
Claudio Atzori 359b8ebda8 [graph provision] include only FoS L1..L2 in the record serialization 2024-07-25 15:22:29 +02:00
Miriam Baglioni c7f6669f1a [webcrawl] the blacklist is now in json and no more in csv after the normalization process 2024-07-25 15:20:18 +02:00
Miriam Baglioni 7cff281d3e [webcrawl] the blacklist is now in json and no more in csv after the normalization process 2024-07-25 15:16:42 +02:00
Claudio Atzori d4bf449e8c minor 2024-07-25 14:53:06 +02:00
Miriam Baglioni fc60661ac5 [webcrawl] added code and test (code/resource) to verify the deletion of the relations related to results put in blacklist 2024-07-25 12:25:14 +02:00
Claudio Atzori d771a883f9 [dedup] updated sql query used to read organizations from the OpenOrgs DB to include their typology 2024-07-25 09:53:48 +02:00