Sandro La Bruzzo
|
486e850bcc
|
next step of MAG conversion implemented
|
2020-05-19 09:24:45 +02:00 |
Enrico Ottonello
|
d4e9075f22
|
Merge branch 'doiboost' of https://code-repo.d4science.org/D-Net/dnet-hadoop into doiboost
|
2020-05-18 19:51:36 +02:00 |
Enrico Ottonello
|
fc80e8c7de
|
added accumulator; last modified date of the record is added to saved data; lambda file is partitioned into 20 parts before starting downloading
|
2020-05-18 19:51:29 +02:00 |
Claudio Atzori
|
f3bc8aed31
|
lifted memory requirements for country propagation wf
|
2020-05-18 15:29:10 +02:00 |
Miriam Baglioni
|
b71fbb68b1
|
removed the removeOutputDir command from code. Relations are written in Append mode. Erasing the output dir meant removing all the relations computed in the previous steps
|
2020-05-18 13:57:20 +02:00 |
Miriam Baglioni
|
629af7cb79
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-05-18 13:07:36 +02:00 |
Claudio Atzori
|
ef9a9a9f1a
|
remove the output path when starting
|
2020-05-15 22:34:19 +02:00 |
Enrico Ottonello
|
0b29bb7e3b
|
spark job to download orcid records modified after a fixed date
|
2020-05-15 19:49:26 +02:00 |
Claudio Atzori
|
7838f2c63f
|
init the empty list for author pids mapped from OAF
|
2020-05-15 17:06:01 +02:00 |
Claudio Atzori
|
82b615ab33
|
NPE check
|
2020-05-15 16:04:46 +02:00 |
Miriam Baglioni
|
e26a67c3eb
|
merge with upstream
|
2020-05-15 15:53:05 +02:00 |
Claudio Atzori
|
7a89507ab1
|
code formatting
|
2020-05-15 15:16:54 +02:00 |
Miriam Baglioni
|
5ec8c49ad5
|
removed serialization points
|
2020-05-15 12:49:58 +02:00 |
Claudio Atzori
|
1d35836a58
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-05-15 12:26:31 +02:00 |
Claudio Atzori
|
cfc8948717
|
fixed mapping OdfToGraph: pick the correct element to map author pids and author affiliations; extended mapping Oaf2Graph: added support for author pids
|
2020-05-15 12:26:16 +02:00 |
Michele Artini
|
2a4e68a292
|
events recognition
|
2020-05-15 12:25:37 +02:00 |
Claudio Atzori
|
a832658296
|
code formatting
|
2020-05-15 10:21:09 +02:00 |
Claudio Atzori
|
b7e198475a
|
added common methods to create HiveDB table identifiers
|
2020-05-15 10:20:07 +02:00 |
Claudio Atzori
|
50d6a2ad3c
|
added output directory removal in the blacklist spark actions; included common global properties in blacklist's workflow.xml
|
2020-05-15 09:53:37 +02:00 |
Claudio Atzori
|
18f46e47b9
|
added relations to the graph2hive import workflow
|
2020-05-15 09:34:48 +02:00 |
Claudio Atzori
|
9d028ffe1c
|
cleanup
|
2020-05-15 09:28:55 +02:00 |
Claudio Atzori
|
fd62359538
|
cleanup
|
2020-05-15 09:28:15 +02:00 |
Claudio Atzori
|
eb64335a54
|
parallel implementation for graph Hive importer
|
2020-05-15 09:05:26 +02:00 |
Miriam Baglioni
|
94571c9a51
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-05-14 18:29:55 +02:00 |
Miriam Baglioni
|
f25db01664
|
changed the constant from propagationconstants to modelconstants
|
2020-05-14 18:29:24 +02:00 |
Miriam Baglioni
|
d05630d979
|
removed the constants added in ModelConstants
|
2020-05-14 18:22:50 +02:00 |
Miriam Baglioni
|
42085e8d99
|
added some constants
|
2020-05-14 18:22:28 +02:00 |
Claudio Atzori
|
f044d09315
|
revised mapping: more accurate mapping for name/surname from datacite format; improved mapping of null values
|
2020-05-14 15:07:24 +02:00 |
Miriam Baglioni
|
e7eb4f377e
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-05-14 10:34:17 +02:00 |
Miriam Baglioni
|
8828458acf
|
minor changes
|
2020-05-14 10:34:12 +02:00 |
Claudio Atzori
|
ab37953332
|
added global properties in wf definitions to avoid repeating name-node and job-tracker in the (many) distcp actions; reintroduced output directory removal at the beginning of each spark action
|
2020-05-14 10:25:41 +02:00 |
Claudio Atzori
|
12bfa6702e
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-05-13 17:01:17 +02:00 |
Claudio Atzori
|
5ecacad70a
|
fixed default resource typing in Oaf/Odf mapping
|
2020-05-13 17:01:11 +02:00 |
Enrico Ottonello
|
12756f9d41
|
multithread (4 threads) test to feed elastic search
|
2020-05-13 16:11:40 +02:00 |
Michele Artini
|
c0265213a0
|
partial implementation
|
2020-05-13 12:00:27 +02:00 |
Sandro La Bruzzo
|
a92ee0f41e
|
Merge remote-tracking branch 'origin/master' into doiboost
|
2020-05-13 10:38:13 +02:00 |
Sandro La Bruzzo
|
d876f47d06
|
next step of MAG conversion implemented
|
2020-05-13 10:38:04 +02:00 |
Claudio Atzori
|
1ddd33de41
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-05-13 09:04:41 +02:00 |
Claudio Atzori
|
85f3c55992
|
fixed node names in blacklist workflow
|
2020-05-13 09:04:33 +02:00 |
Miriam Baglioni
|
43f127448d
|
changed the package name from dhp-propagation to dhp-enrichment for the preparation phase of funding propagation
|
2020-05-12 18:24:26 +02:00 |
Enrico Ottonello
|
08040cef80
|
spark action to analyze orcid lambda file
|
2020-05-12 16:57:43 +02:00 |
Claudio Atzori
|
ec0782e582
|
renamed jar containing the bulktagging and propagation workflows from dhp-[bulktagging|propagation] to dhp-enrichment; adjusted xml formatting
|
2020-05-12 15:49:28 +02:00 |
Miriam Baglioni
|
1547ca7e15
|
added blacklist step to the end of the provision wf
|
2020-05-12 12:17:27 +02:00 |
Miriam Baglioni
|
14979f299e
|
changed the configuration factory
|
2020-05-12 11:28:38 +02:00 |
Miriam Baglioni
|
f8aef6161a
|
minor modification
|
2020-05-12 11:28:07 +02:00 |
Miriam Baglioni
|
7387f3449a
|
changed the route to find the verb resolver classes
|
2020-05-12 11:27:38 +02:00 |
Miriam Baglioni
|
7687519f00
|
merged conflicts with upstream branch
|
2020-05-12 10:03:44 +02:00 |
Miriam Baglioni
|
8ffc050b8a
|
fixed problem in communityconfigurationfactory test
|
2020-05-12 10:01:09 +02:00 |
Claudio Atzori
|
527e8169a8
|
adjusted paths pointing to test configurations, cleanup
|
2020-05-11 18:17:05 +02:00 |
Claudio Atzori
|
f9a62ba63b
|
added wf nodes to copy entities to the output path
|
2020-05-11 18:16:39 +02:00 |