Commit Graph

4500 Commits

Author SHA1 Message Date
Claudio Atzori 5734b80861 Merge pull request 'datasource table creation split in steps' (#489) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #489
2024-09-30 16:34:38 +02:00
Antonis Lempesis f3c179658a datasource table creation split in steps 2024-09-30 17:12:21 +03:00
Miriam Baglioni b18ad035c1 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2024-09-30 15:10:44 +02:00
Miriam Baglioni e430826e00 [ImportOC] fix to move original folder instead of extracted ones 2024-09-30 15:10:10 +02:00
Claudio Atzori 3fcafc7ed6 Merge pull request 'Latest institutions in monitor dbs' (#472) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: #472
2024-09-26 09:49:01 +02:00
Miriam Baglioni 599e56dbc6 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2024-09-25 17:28:23 +02:00
Claudio Atzori 6397141e56 code formatting 2024-09-25 15:27:32 +02:00
Claudio Atzori e354f9853a [OpenCitations] move the extracted contents under a backup path to avoid needing to re-download it in case of errors 2024-09-25 15:27:02 +02:00
Sandro La Bruzzo 6a097abc89 as described on ticket #9525
1. Changed the mapping applied to Crossref records: anything that has a relationship "is-review-of" must be mapped as publication of type "Review".
2. Force the hostedby of Crossref records with DOI prefix 10.3410 and 10.12703 to the H1 Connect data source.
2024-09-25 11:32:54 +02:00
Michele Artini 9754521847 Merge pull request 'fixed a bug with id' (#486) from osfPreprints_plugin into beta
Reviewed-on: #486
2024-09-25 10:02:24 +02:00
Michele Artini fa2532db30 fixed a bug with id 2024-09-25 09:38:50 +02:00
Michele Artini 54f8b4da39 Merge pull request 'fixed a bug with 'null' string' (#484) from osfPreprints_plugin into beta
Reviewed-on: #484
2024-09-24 15:19:54 +02:00
Michele Artini b35d046fd2 fixed a bug with 'null' string 2024-09-24 15:18:54 +02:00
Miriam Baglioni 4d3e079590 Merge remote-tracking branch 'origin/beta' into beta 2024-09-24 14:26:29 +02:00
Michele Artini e941adbe2b fixed a bug with topic ENRICH/MORE/SUBJECT/ARXIV 2024-09-24 08:57:37 +02:00
Michele Artini fdbe629f49 removed the deletedByInference=true filter 2024-09-23 15:27:28 +02:00
Antonis Lempesis 619aa34a15 Merge branch 'beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into beta 2024-09-23 15:25:59 +03:00
Antonis Lempesis dbea7a4072 removed duplicate line 2024-09-23 14:57:11 +03:00
Antonis Lempesis c9241dba0d Merge pull request 'convert_hive_to_spark_actions' (#1) from convert_hive_to_spark_actions into beta
Reviewed-on: antonis.lempesis/dnet-hadoop#1
2024-09-23 13:53:28 +02:00
Michele Artini 2d7a7a962d unit test @Disabled 2024-09-23 10:19:36 +02:00
Michele Artini 6b0f7cc8b0 skip urls with authentication 2024-09-23 10:16:53 +02:00
Michele Artini 339d8124f2 osf plugin: links to contributors and primaty_file 2024-09-20 08:44:05 +02:00
Michele Artini 52bb7af03b use of dom4j 2024-09-19 14:59:05 +02:00
Michele Artini 9073b1159d partial implementation of osfPreprints plugin + tests 2024-09-19 13:58:53 +02:00
Michele Artini dcf09811a2 partial implementation of osfPreprints plugin 2024-09-19 12:42:45 +02:00
Claudio Atzori bfd05cdab2 run mergeResultsOfDifferentTypes only when checkDelegatedAuthority is true 2024-09-17 10:49:32 +02:00
Michele Artini a2fac78dcc fixed a problem in incremental harvesting 2024-09-17 10:16:28 +02:00
Michele Artini 99b7adda0c gtr2 unit test 2024-09-16 15:13:44 +02:00
Michele Artini bb9cee4f40 implementation of gtr2Publications plugin 2024-09-16 14:16:56 +02:00
Antonis Lempesis 37ad259296 cleanup 2024-09-05 16:02:44 +03:00
Antonis Lempesis b64c144abf added new institutions 2024-09-05 16:00:09 +03:00
Miriam Baglioni 468f2aa5a5 [AffiliationAffRo]align beta with new affiliation from publisher webpage introduced in production. AffRo collectedfrom OpenAIRE to discriminate against WebCrawl 2024-08-12 18:10:46 +02:00
Miriam Baglioni 89fcf4086c [Person]fix issue in affiliation relation id construction for person (missing ::) 2024-08-12 18:04:43 +02:00
Miriam Baglioni 8c185a7b1a resolving conflicts 2024-08-05 17:14:11 +02:00
Claudio Atzori 8e7ef79ce0 [bip affiliations] considers only DOI based records 2024-08-05 12:13:48 +02:00
Miriam Baglioni 985ca15264 [openaire-affiliation]removes matchings without DOI 2024-08-05 12:10:40 +02:00
Claudio Atzori fecbf93e0e Merge pull request 'FoS L1 & L2' (#465) from fos_l1l2 into beta
Reviewed-on: #465
2024-08-01 13:58:04 +02:00
Claudio Atzori 64740475d0 depending on dhp-schemas:7.0.1 2024-07-29 11:51:42 +02:00
Miriam Baglioni 1af6571474 merging with branch beta 2024-07-25 15:48:05 +02:00
Claudio Atzori a81c555fe6 [graph provision] include only FoS L1..L2 in the record serialization 2024-07-25 15:26:47 +02:00
Claudio Atzori 359b8ebda8 [graph provision] include only FoS L1..L2 in the record serialization 2024-07-25 15:22:29 +02:00
Miriam Baglioni c7f6669f1a [webcrawl] the blacklist is now in json and no more in csv after the normalization process 2024-07-25 15:20:18 +02:00
Miriam Baglioni 7cff281d3e [webcrawl] the blacklist is now in json and no more in csv after the normalization process 2024-07-25 15:16:42 +02:00
Claudio Atzori d4bf449e8c minor 2024-07-25 14:53:06 +02:00
Miriam Baglioni fc60661ac5 [webcrawl] added code and test (code/resource) to verify the deletion of the relations related to results put in blacklist 2024-07-25 12:25:14 +02:00
Claudio Atzori d771a883f9 [dedup] updated sql query used to read organizations from the OpenOrgs DB to include their typology 2024-07-25 09:53:48 +02:00
Claudio Atzori 01958a3e07 [graph provision] addded filter to exclude records marked with datainfo.deletedbyinference = true 2024-07-24 10:00:10 +02:00
Miriam Baglioni 6f1801d7d1 [webcrawl]- 2024-07-23 17:34:48 +02:00
Miriam Baglioni 19806c2ae3 [SDG]fixed switch of methods 2024-07-23 17:12:55 +02:00
Antonis Lempesis d0590e0e49 added latest institutions 2024-07-23 15:17:15 +03:00