Commit Graph

3982 Commits

Author SHA1 Message Date
Claudio Atzori c86cc53520 suppressing hyper verbose spark logs during unit test execution 2022-09-23 15:20:40 +02:00
Claudio Atzori c42850328e fixed semantic (subreltype) for ServiceOrganization relations 2022-09-22 16:23:25 +02:00
Claudio Atzori 26e1badded added instance.url syntactical validation, avoid creating multiple duplicated URLs 2022-09-19 11:19:10 +02:00
Claudio Atzori 192215a18e merged from branch discard-non-wellformed 2022-09-19 10:17:10 +02:00
Claudio Atzori e370e940d8 [aggregator graph] save invalid records aside for further inspection 2022-09-16 14:06:28 +02:00
Claudio Atzori 465e941214 Merge pull request '[stats wf] Changes to indicators tables' (#244) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: #244) 2022-09-16 10:13:58 +02:00
Claudio Atzori 1e42d984e1 [aggregator graph] save invalid records aside for further inspection 2022-09-15 10:49:42 +02:00
Alessia Bardi 9e7ec4198f fixed test 2022-09-14 18:08:56 +02:00
Claudio Atzori c48f6e9c57 [aggregator graph] save invalid records aside for further inspection 2022-09-14 17:11:26 +02:00
dimitrispie 3bf3127251 Changes to monitor and indicator scripts 2022-09-14 16:36:19 +03:00
Claudio Atzori a0919ed495 [aggregator graph] save invalid records aside for further inspection 2022-09-14 13:27:39 +02:00
Alessia Bardi b99a011345 return empty Oaf list if record cannot be parsed 2022-09-13 11:51:55 +02:00
Alessia Bardi 27af5122d2 logs for non well formed XML files 2022-09-12 14:25:23 +02:00
Claudio Atzori 5066db3386 Merge pull request 'subjects cleaning' (#239) from clean_subjects into beta (Reviewed-on: #239) 2022-09-09 15:17:02 +02:00
Claudio Atzori ff6f789b6d code formatting 2022-09-09 15:16:31 +02:00
Claudio Atzori b5f7bd30be Merge branch 'beta' into clean_subjects 2022-09-09 12:20:04 +02:00
Claudio Atzori 690be4482f Merge pull request '#7861#note-8 instance url from handle' (#243) from handle_as_instance_urls into beta (Reviewed-on: #243) 2022-09-09 12:19:17 +02:00
Alessia Bardi f14107ad77 Merge branch 'handle_as_instance_urls' of https://code-repo.d4science.org/D-Net/dnet-hadoop into handle_as_instance_urls 2022-09-09 12:17:19 +02:00
Alessia Bardi a539c6ccaf https for handle URLs 2022-09-09 12:16:28 +02:00
dimitrispie 71b069ca90 Changes to indicator and monitor scripts 2022-09-09 13:15:58 +03:00
Claudio Atzori 1203378441 Merge branch 'beta' into clean_subjects 2022-09-09 10:38:47 +02:00
Claudio Atzori 853c996fa2 Merge branch 'beta' into handle_as_instance_urls 2022-09-09 09:47:16 +02:00
Claudio Atzori a431e01383 Merge pull request 'orcid_multipleworks_download' (#242) from enrico.ottonello/dnet-hadoop:orcid_multipleworks_download into beta (Reviewed-on: #242) 2022-09-09 08:45:02 +02:00
Alessia Bardi 9ef063d502 #7861#note-8 instance url from handle 2022-09-07 17:29:54 +03:00
Alessia Bardi 5c45d52af3 testing for RiuNet 2022-09-07 15:40:57 +03:00
dimitrispie 2b5f8c9c9a comment out duplicate table creation 2022-09-06 12:27:53 +03:00
Alessia Bardi a11eb38065 testing for RO-Hub 2022-09-02 16:07:36 +02:00
Enrico Ottonello bfdf2dc390 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid_multipleworks_download 2022-08-25 12:07:54 +02:00
Enrico Ottonello da1cf561e6 alignment with beta 2022-08-25 11:57:20 +02:00
Enrico Ottonello 27445ccdaa cleaned log 2022-08-25 11:56:14 +02:00
Claudio Atzori b7c387c21f cleaning of subjects: avoid duplicated subjects, prioritise collected vs inferred or other sources 2022-08-12 15:09:16 +02:00
Claudio Atzori adb526b0e1 Merge branch 'beta' into clean_subjects 2022-08-12 10:51:17 +02:00
Claudio Atzori cb7c07c54e [scholix] added step to create tar archive 2022-08-11 11:25:24 +02:00
Claudio Atzori 2aa16d0432 [scholix] fixed OpenCitation dump procedure 2022-08-10 17:39:29 +02:00
Claudio Atzori 51ad93e545 [scholix] fixed OpenCitation dump procedure 2022-08-10 11:57:56 +02:00
Claudio Atzori 3418ce50ac cleaning of subjects: perform the cleaning when the given value is equivalent to one of the terms in the vocabulary 2022-08-08 12:48:47 +02:00
Claudio Atzori a78028dabc Merge branch 'beta' into clean_subjects 2022-08-08 12:34:33 +02:00
Claudio Atzori d85ba3c1a9 Merge pull request 'serialising field eoscifguidelines field in the Solr XML records' (#234) from tagEosc into beta (Reviewed-on: #234) 2022-08-08 10:28:41 +02:00
Claudio Atzori 3937ff04de Merge branch 'beta' into tagEosc 2022-08-08 09:57:23 +02:00
Claudio Atzori a4815f6bec Merge branch 'beta' into clean_subjects 2022-08-05 16:57:03 +02:00
Claudio Atzori 29c4cde42e Merge branch 'clean_subjects' of https://code-repo.d4science.org/D-Net/dnet-hadoop into clean_subjects 2022-08-05 16:56:37 +02:00
Claudio Atzori 4eaa063b1f cleaning of subjects 2022-08-05 16:56:09 +02:00
Claudio Atzori 84598c7535 Merge pull request 'restored some collab indicators' (#240) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: #240) 2022-08-05 15:50:39 +02:00
Antonis Lempesis fcef5294e2 restored some collab indicators 2022-08-05 13:45:01 +03:00
Claudio Atzori 844f6eb465 Merge branch 'beta' into clean_subjects 2022-08-05 12:39:05 +02:00
Claudio Atzori 32cee1f619 WIP: cleaning of subjects 2022-08-05 12:32:08 +02:00
Claudio Atzori c1f2ffc53d Merge pull request 'commenting out the collab indicators because they still fail' (#237) from antonis.lempesis/dnet-hadoop:beta into beta (Reviewed-on: #237) 2022-08-05 11:57:36 +02:00
Antonis Lempesis 227e10f4b3 commenting out the collab indicators because they still fail 2022-08-05 12:54:36 +03:00
Claudio Atzori 6c0fd9284b merge from beta 2022-08-05 10:42:53 +02:00
Claudio Atzori b78889a0ce WIP: cleaning of subjects 2022-08-05 09:11:37 +02:00