Commit Graph

136 Commits

Author SHA1 Message Date
Claudio Atzori 6dddad86ee [cleaning] title cleaning based on the me.xuender:unidecode library 2021-07-28 16:21:29 +02:00
Claudio Atzori bc4b86c27c updated URL in the issueManagement tag 2021-07-13 11:54:32 +02:00
Claudio Atzori 9038fdc771 depending on dhp-schemas:2.7.14 (release) 2021-07-12 17:46:12 +02:00
Sandro La Bruzzo c952c8d236 generate first side of scholix mapping 2021-07-06 09:53:14 +02:00
Sandro La Bruzzo c6fa8598e1 massive code refactor: removed modules dhp-*-scholexplorer 2021-07-01 22:13:45 +02:00
Claudio Atzori fd54ecf7bd bumped dhp-schemas dependency version 2021-06-18 10:08:07 +02:00
Claudio Atzori 10bd6ca194 depending on dhp-schemas:2.5.12 (release) 2021-06-11 16:59:56 +02:00
Claudio Atzori a900bfb874 delegating the date parsing to https://github.com/sisyphsu/dateparser 2021-06-11 16:53:01 +02:00
Claudio Atzori eb6acfbabc [cleaning] removing non parsable relation.validationDate(s) 2021-05-28 10:50:44 +02:00
Claudio Atzori ac3d090e9e bumped dhp-schemas dependency version 2021-05-27 17:31:12 +02:00
Claudio Atzori c3d92247d3 bumped dhp-schemas dependency version 2021-05-27 15:10:51 +02:00
Claudio Atzori 4f58418184 depending on dhp-schemas:2.4.7 (release) 2021-05-24 10:32:48 +02:00
Claudio Atzori 0358ae16ce depending on the latest dhp-schema version 2021-05-14 11:28:33 +02:00
Claudio Atzori d1cbee8413 imported methods from CleaningFunctions, defined in GraphCleaningFunctions 2021-05-10 16:43:39 +02:00
Claudio Atzori 3797543600 MDStoreManager model classes moved in dhp-schemas 2021-05-10 14:32:05 +02:00
Claudio Atzori 923d19ea8e mdstore read lock/unlock when bulk copying records from mongodb to hdfs 2021-05-04 18:06:21 +02:00
Claudio Atzori 5cc3e6d61c bumped pace-core dependency version 2021-05-03 16:40:50 +02:00
Claudio Atzori 91e7220f20 cleaned up workflow for actionset migration, adjusted dnet|cnr* dependency versions 2021-04-29 10:09:52 +02:00
Claudio Atzori 233d849f90 added dnet45-bootstrap-snapshot and dnet45-bootstrap-release repositories 2021-04-27 12:03:40 +02:00
Claudio Atzori 4028176559 enabled snapshots from dnet45-snapshots repository 2021-04-27 11:37:32 +02:00
Claudio Atzori 27ab8a704d adjusted poms to align with the external dhp-schema module 2021-04-27 10:12:27 +02:00
Claudio Atzori c2bb03c8b5 depending on external dhp-schemas module 2021-04-23 17:57:35 +02:00
miconis 2709d08fc2 Merge branch 'stable_ids' into openorgswf 2021-03-29 16:39:07 +02:00
miconis 28c1cdd132 merged stable_ids into openorgswf 2021-03-25 10:44:49 +01:00
Sandro La Bruzzo c73072079d fix conflicts 2021-03-22 16:36:31 +01:00
Claudio Atzori acbe3119a4 RestCollectorPlugin imported from dne45 2021-03-08 09:44:09 +01:00
Claudio Atzori fa7930d2e2 merging contributions from PR#97 2021-03-05 15:45:28 +01:00
miconis 1a85020572 bug fix in graph-mapper, changes in the implementation of the openorgs wf to create relations and populate openorgs db 2021-02-26 10:19:28 +01:00
Claudio Atzori 72c57b28fa switched project version to 1.2.4-branch_hadoop_aggregator-SNAPSHOT 2021-02-04 14:08:18 +01:00
Claudio Atzori d62ea1490d cleaned up RabbitMQ stuff 2021-02-02 10:53:19 +01:00
Michele Artini b9d90e95b8 Added eventId to ShortEventMessage 2021-01-14 14:32:31 +01:00
Claudio Atzori 197f286fa4 removed duplicated dependency (org.apache.httpcomponents:httpclent 2020-12-07 21:52:17 +01:00
Enrico Ottonello 2b0c9bbb7e Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi 2020-11-17 18:24:34 +01:00
Claudio Atzori 628ca54dd3 disable old maven repository URLs 2020-11-17 12:26:16 +01:00
Enrico Ottonello c796adae24 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi 2020-11-16 11:57:19 +01:00
Claudio Atzori 2facfefc19 updated maven repository URL 2020-11-13 15:38:40 +01:00
Enrico Ottonello 6bc7dbeca7 first version of dataset successful generated from orcid dump 2020 2020-11-06 13:47:50 +01:00
Enrico Ottonello 9818e74a70 added dependency version in main pom.xml for orcid no doi 2020-10-22 16:38:00 +02:00
Enrico Ottonello b0290dbcb7 moved all dependencies version to main pom.xml 2020-10-22 16:20:46 +02:00
Miriam Baglioni ae08b3c0dd merge branch with master 2020-10-05 11:35:55 +02:00
Claudio Atzori 4fddd18403 updating to dnet-pace-core:4.0.5 - fixed error in the treeprocessor. it used th=-1 as default value, now it use th=1 5021e5048f - fixed error in the block processor: entities with orderField=null were not considered 9e8ea8f6ee 2020-10-02 12:37:25 +02:00
Miriam Baglioni 5ef03e5971 added the dependencies from dhp-aggregation for h2020classification 2020-10-01 15:44:40 +02:00
Enrico Ottonello fefbcfb106 dependency version moved to main pom (PR review) 2020-09-22 10:20:25 +02:00
Michele Artini 51321c2701 partition of events by opedoarId 2020-09-17 11:38:07 +02:00
Miriam Baglioni 02a4986e7b Applying changed from code reviews D-Net/dnet-hadoop#40 (comment) and D-Net/dnet-hadoop#40 (comment) and D-Net/dnet-hadoop#40 (comment) 2020-08-13 11:53:01 +02:00
Claudio Atzori 3a11a387a9 data provision workflow enhancement: added nodes to perform DELETE BY QUERY before the indexing begins and COMMIT after the indexing is completed 2020-08-03 14:28:08 +02:00
Claudio Atzori 105176105c updated dnet-pace-core dependency to version 4.0.4 to include the latest clustering function 2020-07-20 09:59:47 +02:00
Claudio Atzori 4b9fb2ffb8 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2020-07-15 11:26:04 +02:00
Claudio Atzori 5033c25587 code formatting 2020-07-15 11:26:00 +02:00
Michele Artini 262c29463e relations with multiple datasources 2020-07-15 09:18:40 +02:00