Commit Graph

2319 Commits

Author SHA1 Message Date
Miriam Baglioni cd285e98bc using the constants defined in the ModelConstants class 2020-12-02 11:13:23 +01:00
Miriam Baglioni 51c582c08c added orcid class name among the constants set 2020-12-02 11:12:54 +01:00
Miriam Baglioni 4b0d1530a2 merge upstream 2020-12-02 11:05:00 +01:00
Claudio Atzori faa977df7e Merge pull request 'orcid-no-doi' (#43) from enrico.ottonello/dnet-hadoop:orcid-no-doi into master. The dataset was generated and is now part of the actionsets available in BETA 2020-12-02 10:55:12 +01:00
Claudio Atzori 57f448b7a4 graph cleaning workflow separate orcid_pending from orcid, depending on the author pid provenance 2020-12-02 10:44:05 +01:00
Alessia Bardi 2d15667b4a testing XML generation from json object (case AMS ACTA) 2020-12-02 10:16:26 +01:00
Alessia Bardi a417624670 tests for raw graph mapping 2020-12-02 10:15:26 +01:00
Claudio Atzori 943b961cf6 introduced PidBlacklist 2020-12-02 09:30:34 +01:00
Claudio Atzori 893ac4a77b GenerateEntitiesApplication can be configured to hash the id value or not 2020-12-02 09:30:06 +01:00
Miriam Baglioni f8468c9c22 added extension for new author pid (orcid_pending) 2020-12-01 20:09:35 +01:00
Miriam Baglioni 888175baf7 added javadoc 2020-12-01 18:36:29 +01:00
Miriam Baglioni 3d62d99d5d fixed issue in workflow variable 2020-12-01 15:02:49 +01:00
Miriam Baglioni 17680296b9 removed unnecessary variable and unused method 2020-12-01 15:02:31 +01:00
Miriam Baglioni 5b3ed70808 refactoring 2020-12-01 14:31:34 +01:00
Miriam Baglioni 62ff4999e3 added workflow and last step of collection and save 2020-12-01 14:30:56 +01:00
Miriam Baglioni 45d06c45c7 collecting all the atomic actions for result type and save them all in the AS path 2020-12-01 14:29:18 +01:00
Miriam Baglioni 0051ebede5 extending test 2020-12-01 12:43:03 +01:00
Miriam Baglioni 719da15f04 added test resources 2020-12-01 12:42:30 +01:00
Miriam Baglioni e819155eb2 added implements Serializable 2020-12-01 09:51:58 +01:00
Miriam Baglioni db36e11912 classes, test classes and resources for production of the actionset to include bipFinder score in results 2020-11-30 20:14:23 +01:00
Claudio Atzori 349e7246aa do not consider NCID, GBIF as PIDs candidate for the ID creation 2020-11-30 16:52:40 +01:00
Enrico Ottonello f2df3ead74 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi 2020-11-30 14:22:46 +01:00
Enrico Ottonello 40c4559e92 added datainfo on authors pid with "sysimport:crosswalk:entityregistry", 2020-11-30 14:19:22 +01:00
Claudio Atzori 2c407e775e GenerateEntitiesApplication can be configured to hash the id value or not 2020-11-30 12:00:38 +01:00
Antonis Lempesis 815d6b25d9 added last step to update cache 2020-11-30 00:48:10 +02:00
Claudio Atzori 758d27745d cleaning tab characters from text fields 2020-11-27 16:07:24 +01:00
Claudio Atzori 596a2a459d added testing class for OafMapperUtils 2020-11-27 12:01:11 +01:00
Claudio Atzori e731a7658d cleaning texts to remove tab characters too 2020-11-27 09:00:04 +01:00
Claudio Atzori fa66e5b6b8 ResultTypeComparator gives priority to Records collectedfrom Crossref 2020-11-26 13:09:19 +01:00
Claudio Atzori 5151850a19 CROSSREF and DATACITE constants moved in common ModelConstants 2020-11-26 13:08:36 +01:00
Claudio Atzori a104d2b6ad cleanup 2020-11-26 11:12:00 +01:00
Claudio Atzori d0d5525d40 minor changes 2020-11-26 11:04:17 +01:00
Claudio Atzori 13eae4b31e GroupEntitiesSparkJob must read all graph paths but relations 2020-11-26 11:04:01 +01:00
Claudio Atzori 76363a8512 SimpleDateFormat is not thread safe; improved error reporting in case of invalid dates 2020-11-26 11:03:12 +01:00
Claudio Atzori c1b9a4045a grouping of records will be performed by the dedup workflow 2020-11-26 10:59:10 +01:00
Miriam Baglioni 124591a7f3 refactoring 2020-11-25 18:23:28 +01:00
Miriam Baglioni 1a89f8211c D-Net/dnet-hadoop#61 (comment) 2020-11-25 18:12:40 +01:00
Miriam Baglioni 5fbe54ef54 D-Net/dnet-hadoop#61 (comment) 2020-11-25 18:10:28 +01:00
Miriam Baglioni ed01e5a5e1 D-Net/dnet-hadoop#61 (comment) 2020-11-25 18:09:34 +01:00
Miriam Baglioni d4ddde2ef2 changed because of D-Net/dnet-hadoop#61 (comment) 2020-11-25 18:01:01 +01:00
Miriam Baglioni f5e5e92a10 changed because of D-Net/dnet-hadoop#61 (comment) 2020-11-25 17:58:53 +01:00
Miriam Baglioni 1df94b85b4 changed because of D-Net/dnet-hadoop#61 (comment) 2020-11-25 17:57:43 +01:00
Miriam Baglioni 66c0e3e574 changed because of D-Net/dnet-hadoop#61 (comment) 2020-11-25 17:52:17 +01:00
Claudio Atzori db0181b8af Merge pull request 'added bidirectionality to relations from project and result coming from crossref' (#60) from miriam.baglioni/dnet-hadoop:sxBidirectionality into master 2020-11-25 17:17:40 +01:00
Sandro La Bruzzo ec3e238de6 Fixed problem on duplicated identifier 2020-11-25 17:15:54 +01:00
Claudio Atzori 1372a4d1bf fixed merging method 2020-11-25 16:05:51 +01:00
Claudio Atzori e208b03755 renamed workflow 2020-11-25 14:55:50 +01:00
Claudio Atzori dfd6205b95 Consistency graph workflow merges all the entities by ID 2020-11-25 14:55:32 +01:00
Miriam Baglioni 90d4369fd2 added test to verify the compression in writing community info on hdfs 2020-11-25 14:34:58 +01:00
Miriam Baglioni 6750e33d69 merge branch with master 2020-11-25 14:09:01 +01:00