Commit Graph

3416 Commits

Author SHA1 Message Date
Sandro La Bruzzo 63952018c0 [scholexplorer]
- moved SparkRetrieveDataciteDelta in scala folder
2021-12-15 11:25:32 +01:00
Sandro La Bruzzo e5bff64f2e [scholexplorer]
- Minor fix on SparkConvertRDDtoDataset
- first implementation of retrieve datacite dump
2021-12-15 11:25:32 +01:00
Claudio Atzori e30e5ac8a8 Merge pull request '[Affiliation Propagation]' (#162) from affiliationPropagation into beta
Reviewed-on: D-Net/dnet-hadoop#162
2021-12-14 15:28:23 +01:00
Claudio Atzori 1790fa2d44 Merge branch 'beta' into affiliationPropagation 2021-12-14 15:26:56 +01:00
Claudio Atzori aff3ddc8d2 added cleaning for the format field, removing carriage return and tab characters 2021-12-14 11:41:46 +01:00
Claudio Atzori 98eb292c59 avoid NPEs merging XMLInstance(s) 2021-12-13 13:27:20 +01:00
Claudio Atzori 5e17247bb6 avoid NPEs merging XMLInstance(s) 2021-12-13 11:48:40 +01:00
Claudio Atzori b70ecccea0 avoid NPEs merging XMLInstance(s) 2021-12-12 12:37:38 +01:00
Claudio Atzori 25dc7929a9 Merge pull request '[graph cleaning] improved instance type defaults' (#172) from graph_cleaning into beta
Reviewed-on: D-Net/dnet-hadoop#172
2021-12-09 16:47:06 +01:00
Claudio Atzori eb43eda42a Merge branch 'beta' into graph_cleaning 2021-12-09 16:46:48 +01:00
Claudio Atzori 41c70c607d cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies 2021-12-09 16:44:28 +01:00
Alessia Bardi 8f1e018ceb Merge pull request 'Serialization of fields in XML records for Sygma (and not only)' (#171) from sygma_indexing into beta
Reviewed-on: D-Net/dnet-hadoop#171
2021-12-09 15:53:27 +01:00
Alessia Bardi cba63e9f82 Merge branch 'beta' into sygma_indexing 2021-12-09 15:52:16 +01:00
Alessia Bardi e53228401b style 2021-12-09 15:46:22 +01:00
Claudio Atzori adf17452b0 Merge pull request '[graph cleaning] consider terms as synonyms in the vocabulary lookup' (#170) from graph_cleaning into beta
Reviewed-on: D-Net/dnet-hadoop#170
2021-12-09 14:45:14 +01:00
Claudio Atzori e6e177dda0 vocabulary based cleaning considers also the term label when looking up for a synonym 2021-12-09 13:57:53 +01:00
Alessia Bardi 6b5d7688a4 #7275 serialize license information in XML records 2021-12-09 13:46:48 +01:00
Sandro La Bruzzo 5d51b3dd4a Merge pull request 'scala_refactor' (#169) from scala_refactor into beta
Reviewed-on: D-Net/dnet-hadoop#169
2021-12-06 15:33:44 +01:00
Miriam Baglioni d9836f0cf3 [OpenCitations] fixed test when executed one after the other 2021-12-06 15:27:09 +01:00
Miriam Baglioni d1df01ff1e [Graph Dump] fixed resource for test 2021-12-06 15:15:48 +01:00
Sandro La Bruzzo ed0c352799 [test-fixing] fixed wrong test 2021-12-06 15:07:41 +01:00
Sandro La Bruzzo e9f285ec4d [scala-refactor] Module dhp-doiboost:
Moved all scala source into src/main/scala and src/test/scala
2021-12-06 14:24:03 +01:00
Sandro La Bruzzo bf880e2508 [scala-refactor] Module dhp-graph-mapper:
Moved all scala source into src/main/scala and src/test/scala
2021-12-06 13:57:41 +01:00
Sandro La Bruzzo 81bf604059 [scala-refactor] Module dhp-common:
Moved all scala source into src/main/scala and src/test/scala
2021-12-06 11:29:24 +01:00
Sandro La Bruzzo 7af0bbd0b1 [scala-refactor] Module dhp-aggregation:
Moved all scala source into src/main/scala and src/test/scala
2021-12-06 11:26:36 +01:00
Claudio Atzori 9132727793 fixed date cleaning test 2021-12-06 10:54:05 +01:00
Claudio Atzori 08795cbd30 using helper method from ModelSupport to find the inverse relation descriptor 2021-12-06 10:39:56 +01:00
Sandro La Bruzzo 0fa0ce33d6 removed duplicated on gitignore 2021-12-03 11:47:35 +01:00
Sandro La Bruzzo f7011b90d8 format code 2021-12-03 11:15:09 +01:00
Claudio Atzori 372633880f Merge pull request 'XML serialisation of instances with the same URLs' (#167) from instance_group_by_url into beta
Reviewed-on: D-Net/dnet-hadoop#167
2021-12-03 09:28:06 +01:00
Claudio Atzori dd0b2e5244 Merge branch 'beta' into instance_group_by_url 2021-12-03 09:27:58 +01:00
Claudio Atzori c4c705aa46 Merge pull request 'Cleaning of invisible records' (#168) from clean_invisible_records into beta
Reviewed-on: D-Net/dnet-hadoop#168
2021-12-03 09:27:41 +01:00
Claudio Atzori 863a2f9db3 avoid to filter OAF records defined as invisible = true 2021-12-03 09:08:12 +01:00
Claudio Atzori 9cac283bec implemented Instance serialization features requested in https://support.openaire.eu/issues/7156 2021-12-02 17:20:33 +01:00
Claudio Atzori 3b19821f3c added stats computation on the graph hive DB tables 2021-12-02 10:44:10 +01:00
Claudio Atzori cfa4560769 minor: fixed hive action name 2021-12-02 10:43:36 +01:00
Claudio Atzori d85af6fc25 [cleaning wf] fixed OAF record navigation, a mapping defined on a container object would have prevented the navigation to continue on its properties 2021-12-01 15:49:15 +01:00
Claudio Atzori 4fe7888817 code formatting 2021-12-01 15:48:15 +01:00
Claudio Atzori 01e5e0142a added test to verify the relation inverse lookup operation 2021-12-01 09:46:26 +01:00
Claudio Atzori 0df9574a6f Merge pull request '[stats wf] Added sprint 3&4 of indicators' (#166) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#166
2021-11-29 10:40:26 +01:00
Claudio Atzori 014e872ae1 [resolution wf] added optional parameter to skip the entity resolution 2021-11-26 15:38:56 +01:00
Claudio Atzori 5c6d328537 code formatting 2021-11-26 15:38:16 +01:00
dimitrispie 09fc2afdca Added indi_funder_country_collab
Kept only indi_pub_has_cc_licence
2021-11-26 16:13:10 +02:00
dimitrispie 8750a71502 Merge remote-tracking branch 'origin/beta' into beta 2021-11-26 16:11:26 +02:00
dimitrispie 25fc8abf77 Sprint 4 2021-11-26 16:10:58 +02:00
Antonis Lempesis 0b4163ee0b added sprint3,4, removed 2, chaos 2021-11-26 15:58:01 +02:00
dimitrispie 29f69f2f89 Sprint 4 2021-11-26 15:22:04 +02:00
Sandro La Bruzzo bb7f556eff Merge remote-tracking branch 'origin/beta' into beta 2021-11-25 13:03:25 +01:00
Sandro La Bruzzo 1e1f5e4fe0 minor fix 2021-11-25 13:03:17 +01:00
Miriam Baglioni ac07ed8251 Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta 2021-11-25 12:32:58 +01:00