Claudio Atzori
55f6ff5f55
README.md for aggregation workflows
2021-03-03 16:18:34 +01:00
Claudio Atzori
e8789b0cdb
Merge pull request 'stats DB for monitor' ( #99 ) from antonis.lempesis/dnet-hadoop:master into master
...
Looks good to me, just a note on the parsing of the citations: since the last version, IIS produces citations as proper relationships among results. This is what we got already in the BETA graph
```
count r.reltype r.subreltype r.relclass
62.129.254 resultResult citation cites
62.043.309 resultResult citation isCitedBy
```
Thus, I suggest to move away from the current property based implementation for the extraction of the citation links and start relying on the relationships instead.
2021-03-03 10:29:09 +01:00
Claudio Atzori
ec80b7ade3
code formatting
2021-03-03 10:22:53 +01:00
Claudio Atzori
36f750cd1d
removed unused classes
2021-03-03 10:22:29 +01:00
Claudio Atzori
b73dce3e3a
more logging on the MDStore mongodb client. Forcing UTF_8 encoding on the content
2021-03-03 10:17:16 +01:00
Antonis Lempesis
27796343ca
crude sleep. hardcoded value
2021-03-03 01:37:47 +02:00
Claudio Atzori
e76c4f62c1
MetadataRecord moved in dhp-schemas
2021-02-26 10:58:48 +01:00
Claudio Atzori
7df2461ccc
indent XML records collected from oai-pmh endpoints
2021-02-25 16:19:12 +01:00
Claudio Atzori
b830e33392
mdstore collector plugin
2021-02-25 12:30:30 +01:00
Claudio Atzori
dc98c39500
more logging
2021-02-25 12:29:18 +01:00
Claudio Atzori
271e88537b
code formatting
2021-02-25 12:28:56 +01:00
Claudio Atzori
9c899f4433
cleanup on transformation functions and the relative tests
2021-02-24 15:07:59 +01:00
Claudio Atzori
fc3fa5e343
implemented mdstore collector plugin
2021-02-24 15:07:24 +01:00
Antonis Lempesis
d90767c733
correctly invalidating metadata
2021-02-19 03:18:47 +02:00
Antonis Lempesis
3681afbe04
typo
2021-02-19 03:04:27 +02:00
Antonis Lempesis
c5502eba8f
actually moved stats computation in impala instead of hive...
2021-02-19 02:54:39 +02:00
Antonis Lempesis
33c85d4e66
moved stats computation in impala instead of hive
2021-02-18 17:23:34 +02:00
Antonis Lempesis
b8e96c8ae7
moved cache update to the end
2021-02-18 16:42:22 +02:00
Antonis Lempesis
bcbfc052b1
fixed last errors in step 21
2021-02-18 16:32:54 +02:00
Antonis Lempesis
10a29a4b9a
fixes in monitor step
2021-02-18 15:05:59 +02:00
Antonis Lempesis
8ef66452d5
fixed typo
2021-02-17 22:24:44 +02:00
Antonis Lempesis
a8836e2f5f
fixed typo
2021-02-17 19:27:07 +02:00
Claudio Atzori
e7eba9f7e7
WIP: transformation workflow error reporting; cleanup
2021-02-17 16:54:08 +01:00
Claudio Atzori
58467aaf1e
WIP: transformation workflow error reporting
2021-02-17 16:14:41 +01:00
Claudio Atzori
cc88701f29
retry for any Socket exception
2021-02-17 16:13:54 +01:00
Antonis Lempesis
a445c1ac3d
fixed variable names in monitor script
2021-02-17 16:45:09 +02:00
Antonis Lempesis
00d516360f
added missing ;
2021-02-17 16:41:10 +02:00
Claudio Atzori
545f8f3e48
using jackson objectmapper instead of GSon to serialise the aggregation report
2021-02-17 12:15:00 +01:00
Claudio Atzori
b592d78bb4
WIP: collectorWorker error reporting, generalised reported implementation
2021-02-17 10:28:01 +01:00
Antonis Lempesis
cd1b794409
added the monitor db wf
2021-02-17 02:11:55 +02:00
Claudio Atzori
cf27905a71
WIP: collectorWorker error reporting, added report messages
2021-02-16 16:53:14 +01:00
Alessia Bardi
bf2830b981
Merge pull request 'manage merging of Relatation validation attributes' ( #95 ) from merge_rel_validation into master
2021-02-16 12:14:27 +01:00
Claudio Atzori
6f9864c564
manage merging of Relatation validation attributes
2021-02-16 11:53:44 +01:00
Alessia Bardi
32e81c2d89
non validated rel has null value in validated field
2021-02-16 11:01:42 +01:00
Claudio Atzori
58288a95b8
WIP: collectorWorker error reporting, added report messages
2021-02-15 15:28:53 +01:00
Claudio Atzori
1abe6d1ad7
WIP: collectorWorker error reporting, added report messages
2021-02-15 15:08:59 +01:00
Claudio Atzori
523a6bfa97
Merge pull request 'first commit to the correct branch' ( #94 ) from andreas.czerniak/BrAggr_dnet-hadoop:hadoop_aggregator into hadoop_aggregator
...
Looks good to me, thanks Andreas!
2021-02-15 12:15:31 +01:00
Antonis Lempesis
1c029b9fc0
fixed formatting
2021-02-14 03:14:24 +02:00
Antonis Lempesis
2c4dcc90ba
analyzing tables to produce stats
2021-02-14 02:54:55 +02:00
Sandro La Bruzzo
7edcc87ed4
changed xslt behaviour on failure
2021-02-12 17:27:08 +01:00
Sandro La Bruzzo
6a37c7f175
merge fixed
2021-02-12 16:38:47 +01:00
Sandro La Bruzzo
b3f5c2351d
Merge branch 'hadoop_aggregator' of code-repo.d4science.org:D-Net/dnet-hadoop into hadoop_aggregator
...
Conflicts:
dhp-workflows/dhp-aggregation/src/test/java/eu/dnetlib/dhp/transformation/TransformationJobTest.java
2021-02-12 16:37:14 +01:00
Sandro La Bruzzo
f216277219
Implemented cleaning date
2021-02-12 16:34:52 +01:00
Andreas Czerniak
5a9017cf18
clone, min. changes, test, run
2021-02-12 14:32:36 +01:00
Claudio Atzori
aa55dedb8a
Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator
2021-02-12 12:31:05 +01:00
Claudio Atzori
29c6f7e255
classes related to the collection workflow moved into common package; implemented MongoDB collection plugins
2021-02-12 12:31:02 +01:00
Sandro La Bruzzo
17e6f1934e
fixed NPE on cleaner
2021-02-12 11:48:11 +01:00
Sandro La Bruzzo
ebcc3ec14f
updated wrong datacite identifier in trasformation
2021-02-11 16:25:51 +01:00
Michele Artini
83d815d0bc
only stats
2021-02-11 10:57:23 +01:00
Michele Artini
8c836bf930
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
2021-02-11 10:54:41 +01:00