Claudio Atzori
|
55f6ff5f55
|
README.md for aggregation workflows
|
2021-03-03 16:18:34 +01:00 |
Claudio Atzori
|
36f750cd1d
|
removed unused classes
|
2021-03-03 10:22:29 +01:00 |
Claudio Atzori
|
b73dce3e3a
|
more logging on the MDStore mongodb client. Forcing UTF_8 encoding on the content
|
2021-03-03 10:17:16 +01:00 |
Claudio Atzori
|
e76c4f62c1
|
MetadataRecord moved in dhp-schemas
|
2021-02-26 10:58:48 +01:00 |
Claudio Atzori
|
7df2461ccc
|
indent XML records collected from oai-pmh endpoints
|
2021-02-25 16:19:12 +01:00 |
Claudio Atzori
|
b830e33392
|
mdstore collector plugin
|
2021-02-25 12:30:30 +01:00 |
Claudio Atzori
|
271e88537b
|
code formatting
|
2021-02-25 12:28:56 +01:00 |
Claudio Atzori
|
9c899f4433
|
cleanup on transformation functions and the related tests
|
2021-02-24 15:07:59 +01:00 |
Claudio Atzori
|
fc3fa5e343
|
implemented mdstore collector plugin
|
2021-02-24 15:07:24 +01:00 |
Claudio Atzori
|
e7eba9f7e7
|
WIP: transformation workflow error reporting; cleanup
|
2021-02-17 16:54:08 +01:00 |
Claudio Atzori
|
58467aaf1e
|
WIP: transformation workflow error reporting
|
2021-02-17 16:14:41 +01:00 |
Claudio Atzori
|
cc88701f29
|
retry for any Socket exception
|
2021-02-17 16:13:54 +01:00 |
Claudio Atzori
|
545f8f3e48
|
using Jackson ObjectMapper instead of Gson to serialise the aggregation report
|
2021-02-17 12:15:00 +01:00 |
Claudio Atzori
|
b592d78bb4
|
WIP: collectorWorker error reporting, generalised reported implementation
|
2021-02-17 10:28:01 +01:00 |
Claudio Atzori
|
cf27905a71
|
WIP: collectorWorker error reporting, added report messages
|
2021-02-16 16:53:14 +01:00 |
Claudio Atzori
|
1abe6d1ad7
|
WIP: collectorWorker error reporting, added report messages
|
2021-02-15 15:08:59 +01:00 |
Claudio Atzori
|
523a6bfa97
|
Merge pull request 'first commit to the correct branch' (#94) from andreas.czerniak/BrAggr_dnet-hadoop:hadoop_aggregator into hadoop_aggregator
Looks good to me, thanks Andreas!
|
2021-02-15 12:15:31 +01:00 |
Sandro La Bruzzo
|
7edcc87ed4
|
changed xslt behaviour on failure
|
2021-02-12 17:27:08 +01:00 |
Sandro La Bruzzo
|
6a37c7f175
|
merge fixed
|
2021-02-12 16:38:47 +01:00 |
Sandro La Bruzzo
|
b3f5c2351d
|
Merge branch 'hadoop_aggregator' of code-repo.d4science.org:D-Net/dnet-hadoop into hadoop_aggregator
Conflicts:
dhp-workflows/dhp-aggregation/src/test/java/eu/dnetlib/dhp/transformation/TransformationJobTest.java
|
2021-02-12 16:37:14 +01:00 |
Sandro La Bruzzo
|
f216277219
|
Implemented cleaning date
|
2021-02-12 16:34:52 +01:00 |
Andreas Czerniak
|
5a9017cf18
|
clone, min. changes, test, run
|
2021-02-12 14:32:36 +01:00 |
Claudio Atzori
|
aa55dedb8a
|
Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator
|
2021-02-12 12:31:05 +01:00 |
Claudio Atzori
|
29c6f7e255
|
classes related to the collection workflow moved into common package; implemented MongoDB collection plugins
|
2021-02-12 12:31:02 +01:00 |
Sandro La Bruzzo
|
17e6f1934e
|
fixed NPE on cleaner
|
2021-02-12 11:48:11 +01:00 |
Sandro La Bruzzo
|
ebcc3ec14f
|
updated wrong datacite identifier in transformation
|
2021-02-11 16:25:51 +01:00 |
Claudio Atzori
|
bae029f828
|
collection_java_xmx allows declaring the heap size allocated for the java actions involved in the metadata collection workflow
|
2021-02-08 18:07:23 +01:00 |
Claudio Atzori
|
bebc54d5bf
|
seq file storing native records is now compressed
|
2021-02-08 18:06:25 +01:00 |
Claudio Atzori
|
50add4c61b
|
added requestDelay to HttpConnector2 configuration; Aggregation workflow constants moved in dhp-common
|
2021-02-08 12:19:38 +01:00 |
Claudio Atzori
|
40df0f987d
|
better logging, WIP: collectorWorker error reporting; common functions moved in DHPUtils
|
2021-02-06 20:12:00 +01:00 |
Claudio Atzori
|
a8a758925e
|
better logging, WIP: collectorWorker error reporting
|
2021-02-05 19:18:05 +01:00 |
Claudio Atzori
|
730973679a
|
Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator
|
2021-02-04 17:25:00 +01:00 |
Claudio Atzori
|
deb85706db
|
imported HttpConnector from https://svn.driver.research-infrastructures.eu/driver/dnet45/modules/dnet-modular-collector-service/trunk/src/main/java/eu/dnetlib/data/collector/plugins/HttpConnector.java as HttpConnector2
|
2021-02-04 17:24:52 +01:00 |
Sandro La Bruzzo
|
4dae5e605d
|
implemented messaging between collection worker and Dnet
|
2021-02-04 15:51:15 +01:00 |
Claudio Atzori
|
72c57b28fa
|
switched project version to 1.2.4-branch_hadoop_aggregator-SNAPSHOT
|
2021-02-04 14:08:18 +01:00 |
Claudio Atzori
|
40764cf626
|
better logging, WIP: collectorWorker error reporting
|
2021-02-04 14:06:02 +01:00 |
Sandro La Bruzzo
|
69c253710b
|
fixed test
|
2021-02-04 10:30:49 +01:00 |
Claudio Atzori
|
e04045089f
|
better logging, WIP: collectorWorker error reporting
|
2021-02-03 17:58:22 +01:00 |
Claudio Atzori
|
0e8a4f9f1a
|
better logging, WIP: collectorWorker error reporting
|
2021-02-03 12:33:41 +01:00 |
Claudio Atzori
|
53884d12c2
|
code formatting
|
2021-02-02 14:38:03 +01:00 |
Claudio Atzori
|
ac46c247d2
|
code formatting
|
2021-02-02 14:24:00 +01:00 |
Claudio Atzori
|
bde14b149a
|
fixed transformation target paths
|
2021-02-02 12:49:29 +01:00 |
Claudio Atzori
|
ca4391aa1c
|
minor changes
|
2021-02-02 12:44:04 +01:00 |
Claudio Atzori
|
bb89b99b24
|
code formatting
|
2021-02-02 12:34:14 +01:00 |
Claudio Atzori
|
75807ea5ae
|
factored out constants
|
2021-02-02 12:28:21 +01:00 |
Sandro La Bruzzo
|
0634674add
|
implemented transformation test
|
2021-02-02 12:12:14 +01:00 |
Claudio Atzori
|
8eaa1fd4b4
|
WIP: metadata collection in INCREMENTAL mode and related test
|
2021-02-01 19:29:10 +01:00 |
Sandro La Bruzzo
|
bead34d11a
|
code refactor
|
2021-02-01 14:58:06 +01:00 |
Sandro La Bruzzo
|
6ff234d81b
|
Implemented a first prototype of incremental harvesting and transformation using readlock
|
2021-02-01 13:56:05 +01:00 |
Sandro La Bruzzo
|
b6b835ef49
|
update transformation Factory to get Transformation Rule by Id and not by Title
|
2021-02-01 08:49:42 +01:00 |