Commit Graph

2597 Commits

Author SHA1 Message Date
Claudio Atzori 3f8f78cbfb Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2021-02-11 09:36:10 +01:00
Claudio Atzori b34b5a39ca index field authoridtypevalue mixes up different author id-type value pairs, dropped in favour of orcidtypevalue 2021-02-11 09:36:04 +01:00
Michele Artini 7249cceb53 switch of 2 nodes 2021-02-11 09:27:08 +01:00
Claudio Atzori 73393d3c4d Merge pull request 'validatedLinksToProjects' (#93) from validatedLinksToProjects into master
LGTM
2021-02-10 12:32:35 +01:00
Alessia Bardi 986dd969d3 use the proper import for Lists 2021-02-10 12:03:54 +01:00
miconis 4b2124a18e implementation of the openorgs wfs, implementation of the raw_all wf to migrate openorgs db entities 2021-02-10 11:51:50 +01:00
Alessia Bardi c4d1feca74 mapper test with validated link to project 2021-02-10 11:22:54 +01:00
Alessia Bardi 09fc7e2f78 serialization of validated flag on relationships 2021-02-10 11:22:09 +01:00
Enrico Ottonello ee4ba7298b fix last update read/write from file on hdfs 2021-02-09 23:24:57 +01:00
Claudio Atzori bc458d1b54 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop 2021-02-09 16:27:30 +01:00
Claudio Atzori 82e6c50f3f updated solr fields (authoridtypevalue, resultsubject, resultresourcetypename) 2021-02-09 16:27:04 +01:00
Claudio Atzori 62bd3c53ee Merge branch 'master' into provision_indexing 2021-02-09 15:46:26 +01:00
Claudio Atzori bae029f828 collection_java_xmx allows to declare the heap size allocated for the java actions involved in the metadata collectionw workflow 2021-02-08 18:07:23 +01:00
Claudio Atzori bebc54d5bf seq file storing native records is now compressed 2021-02-08 18:06:25 +01:00
Claudio Atzori 50add4c61b added requestDelay to HttpConnector2 configuration; Aggregation workflow constants moved in dhp-common 2021-02-08 12:19:38 +01:00
Claudio Atzori 40df0f987d better logging, WIP: collectorWorker error reporting; common functions moved in DHPUtils 2021-02-06 20:12:00 +01:00
Claudio Atzori a8a758925e better logging, WIP: collectorWorker error reporting 2021-02-05 19:18:05 +01:00
Michele Artini 2ee0c3e47e http entity as json string 2021-02-05 09:45:39 +01:00
Claudio Atzori 730973679a Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator 2021-02-04 17:25:00 +01:00
Claudio Atzori deb85706db imported HttpConnector from https://svn.driver.research-infrastructures.eu/driver/dnet45/modules/dnet-modular-collector-service/trunk/src/main/java/eu/dnetlib/data/collector/plugins/HttpConnector.java as HttpConnector2 2021-02-04 17:24:52 +01:00
Sandro La Bruzzo 4dae5e605d implemented messaging btween collection worker and Dnet 2021-02-04 15:51:15 +01:00
Claudio Atzori 72c57b28fa switched project version to 1.2.4-branch_hadoop_aggregator-SNAPSHOT 2021-02-04 14:08:18 +01:00
Claudio Atzori 40764cf626 better logging, WIP: collectorWorker error reporting 2021-02-04 14:06:02 +01:00
Enrico Ottonello c238561001 Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi 2021-02-04 10:44:21 +01:00
Enrico Ottonello 465ce39f75 job execution now based on file last_update.txt on hdfs 2021-02-04 10:44:04 +01:00
Sandro La Bruzzo 69c253710b fixed test 2021-02-04 10:30:49 +01:00
Michele Artini 3ea8c328ac Merge branch 'hadoop_aggregator' of code-repo.d4science.org:D-Net/dnet-hadoop into hadoop_aggregator 2021-02-04 09:46:13 +01:00
Michele Artini 26d2eb946f messages sender 2021-02-04 09:45:46 +01:00
Claudio Atzori 4758b58aa2 Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator 2021-02-03 17:58:29 +01:00
Claudio Atzori e04045089f better logging, WIP: collectorWorker error reporting 2021-02-03 17:58:22 +01:00
Alessia Bardi c67329d3ad updated test for EU Open Data portal datasets 2021-02-03 17:06:48 +01:00
Michele Artini 1b9731632b Message Sender 2021-02-03 16:42:36 +01:00
Michele Artini 820d729e99 recover of Message and MessageType class 2021-02-03 16:20:34 +01:00
Michele Artini 33f4696d6e Merge branch 'hadoop_aggregator' of code-repo.d4science.org:D-Net/dnet-hadoop into hadoop_aggregator 2021-02-03 16:08:21 +01:00
Michele Artini c286d28ad2 logs 2021-02-03 16:07:49 +01:00
Claudio Atzori 0e8a4f9f1a better logging, WIP: collectorWorker error reporting 2021-02-03 12:33:41 +01:00
Alessia Bardi fd705404a1 tests for EU Open Data portal dataset mapping 2021-02-03 10:28:17 +01:00
Claudio Atzori 53884d12c2 code formatting 2021-02-02 14:38:03 +01:00
Claudio Atzori ac46c247d2 code formatting 2021-02-02 14:24:00 +01:00
Claudio Atzori bde14b149a fixed transformation target paths 2021-02-02 12:49:29 +01:00
Claudio Atzori ca4391aa1c minor changes 2021-02-02 12:44:04 +01:00
Claudio Atzori bb89b99b24 code formatting 2021-02-02 12:34:14 +01:00
Claudio Atzori 75807ea5ae factored out constants 2021-02-02 12:28:21 +01:00
Sandro La Bruzzo 4ed1e306b6 Merge branch 'hadoop_aggregator' of code-repo.d4science.org:D-Net/dnet-hadoop into hadoop_aggregator 2021-02-02 12:12:51 +01:00
Sandro La Bruzzo 0634674add implemented transformation test 2021-02-02 12:12:14 +01:00
Claudio Atzori d62ea1490d cleaned up RabbitMQ stuff 2021-02-02 10:53:19 +01:00
Claudio Atzori 73d772a4b4 added method to list the known vocabulary names 2021-02-02 10:39:47 +01:00
Claudio Atzori 8eaa1fd4b4 WIP: metadata collection in INCREMENTAL mode and relative test 2021-02-01 19:29:10 +01:00
Sandro La Bruzzo bead34d11a code refactor 2021-02-01 14:58:06 +01:00
Sandro La Bruzzo 6ff234d81b Implemented a first prototype of incremental harvesting and trasformation using readlock 2021-02-01 13:56:05 +01:00