Antonis Lempesis
|
33c85d4e66
|
moved stats computation in impala instead of hive
|
2021-02-18 17:23:34 +02:00 |
Antonis Lempesis
|
b8e96c8ae7
|
moved cache update to the end
|
2021-02-18 16:42:22 +02:00 |
Antonis Lempesis
|
bcbfc052b1
|
fixed last errors in step 21
|
2021-02-18 16:32:54 +02:00 |
Antonis Lempesis
|
10a29a4b9a
|
fixes in monitor step
|
2021-02-18 15:05:59 +02:00 |
Antonis Lempesis
|
8ef66452d5
|
fixed typo
|
2021-02-17 22:24:44 +02:00 |
Antonis Lempesis
|
a8836e2f5f
|
fixed typo
|
2021-02-17 19:27:07 +02:00 |
Antonis Lempesis
|
a445c1ac3d
|
fixed variable names in monitor script
|
2021-02-17 16:45:09 +02:00 |
Antonis Lempesis
|
00d516360f
|
added missing ;
|
2021-02-17 16:41:10 +02:00 |
Antonis Lempesis
|
cd1b794409
|
added the monitor db wf
|
2021-02-17 02:11:55 +02:00 |
Alessia Bardi
|
32e81c2d89
|
non validated rel has null value in validated field
|
2021-02-16 11:01:42 +01:00 |
Antonis Lempesis
|
1c029b9fc0
|
fixed formatting
|
2021-02-14 03:14:24 +02:00 |
Antonis Lempesis
|
2c4dcc90ba
|
analyzing tables to produce stats
|
2021-02-14 02:54:55 +02:00 |
Michele Artini
|
83d815d0bc
|
only stats
|
2021-02-11 10:57:23 +01:00 |
Michele Artini
|
8c836bf930
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2021-02-11 10:54:41 +01:00 |
Michele Artini
|
8c1600398a
|
added resumeFrom parameter
|
2021-02-11 10:54:16 +01:00 |
Claudio Atzori
|
3f8f78cbfb
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2021-02-11 09:36:10 +01:00 |
Claudio Atzori
|
b34b5a39ca
|
index field authoridtypevalue mixes up different author id-type value pairs, dropped in favour of orcidtypevalue
|
2021-02-11 09:36:04 +01:00 |
Michele Artini
|
7249cceb53
|
switch of 2 nodes
|
2021-02-11 09:27:08 +01:00 |
Alessia Bardi
|
986dd969d3
|
use the proper import for Lists
|
2021-02-10 12:03:54 +01:00 |
Alessia Bardi
|
c4d1feca74
|
mapper test with validated link to project
|
2021-02-10 11:22:54 +01:00 |
Alessia Bardi
|
09fc7e2f78
|
serialization of validated flag on relationships
|
2021-02-10 11:22:09 +01:00 |
Claudio Atzori
|
bc458d1b54
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2021-02-09 16:27:30 +01:00 |
Claudio Atzori
|
82e6c50f3f
|
updated solr fields (authoridtypevalue, resultsubject, resultresourcetypename)
|
2021-02-09 16:27:04 +01:00 |
Claudio Atzori
|
62bd3c53ee
|
Merge branch 'master' into provision_indexing
|
2021-02-09 15:46:26 +01:00 |
Miriam Baglioni
|
2f5e6647c6
|
merge upstream
|
2021-02-08 10:33:11 +01:00 |
Alessia Bardi
|
c67329d3ad
|
updated test for EU Open Data portal datasets
|
2021-02-03 17:06:48 +01:00 |
Alessia Bardi
|
fd705404a1
|
tests for EU Open Data portal dataset mapping
|
2021-02-03 10:28:17 +01:00 |
Miriam Baglioni
|
6190465851
|
merge upstream
|
2021-02-03 10:27:27 +01:00 |
Claudio Atzori
|
f1a852f278
|
align usage-stats workflow poms with latest snapshot version
|
2021-01-26 15:42:42 +01:00 |
Claudio Atzori
|
9c32119dc2
|
Merge pull request 'usage-stats-export-wf-v2' (#89) from dimitris.pierrakos/dnet-hadoop:usage-stats-export-wf-v2 into master
Thank you Dimitris!
|
2021-01-26 15:01:41 +01:00 |
Claudio Atzori
|
885e0dd926
|
[Cleaning] filter authors not providing word characters in the fullname
|
2021-01-26 09:48:53 +01:00 |
Claudio Atzori
|
2890511613
|
[Cleaning] normalise missing Result.country
|
2021-01-26 09:41:44 +01:00 |
Claudio Atzori
|
4eb9ed35b1
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2021-01-25 18:12:24 +01:00 |
Claudio Atzori
|
cd379eb5e3
|
[Cleaning] trying to avoid NPEs, this time by ruling out authors without a defined fullname
|
2021-01-25 18:11:49 +01:00 |
Alessia Bardi
|
505477f36f
|
format code
|
2021-01-25 18:02:49 +01:00 |
Alessia Bardi
|
ded6ed8d7d
|
no ',' author if there are no authors in ODF records
|
2021-01-25 17:57:51 +01:00 |
Claudio Atzori
|
3465c8ccee
|
[Cleaning] trying to avoid NPEs
|
2021-01-25 16:54:53 +01:00 |
Claudio Atzori
|
07a0ccfc96
|
[Cleaning] trying to avoid NPEs
|
2021-01-25 13:36:01 +01:00 |
Claudio Atzori
|
34d653de41
|
[Cleaning] updated cleaning rule for DOIs
|
2021-01-22 14:16:33 +01:00 |
Miriam Baglioni
|
fe36895c53
|
added datasource blacklist for the organization to result propagation through institutional repositories
|
2021-01-22 11:55:10 +01:00 |
Dimitris
|
3e8d2a6b2d
|
Clean workflows
|
2021-01-15 16:19:12 +02:00 |
Michele Artini
|
cfbcdc95bc
|
fixed a wf param
|
2021-01-14 14:45:23 +01:00 |
Michele Artini
|
69ba3203c0
|
fixed a conflict
|
2021-01-14 14:43:25 +01:00 |
Michele Artini
|
b230d44411
|
fixed conflict
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
b9d90e95b8
|
Added eventId to ShortEventMessage
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
64b0b0bfb3
|
fixed a bug with invalid subject topic
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
e3e0ab1de1
|
fixed a problem with join
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
26a941315a
|
openaireId
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
6f4d1a37f0
|
ES wf properties
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
1391341d06
|
mkdir of output dir
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
3c9cbd19f3
|
whitelist of topics
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
467aa77279
|
workingDir and outputDir
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
10f3f7eca7
|
workingDir and outputDir
|
2021-01-14 14:32:31 +01:00 |
Michele Artini
|
ff41a7b3a4
|
gzipped output
|
2021-01-14 14:32:31 +01:00 |
Claudio Atzori
|
80cf55ef2e
|
[Broker] fixed partitionEventsByOpendoarIds workflow parameter names
|
2021-01-13 16:24:30 +01:00 |
Claudio Atzori
|
41500669e2
|
[BIP! Scores integration] merged missing classes from bipFinder branch
|
2021-01-11 14:39:47 +01:00 |
Claudio Atzori
|
2a7a10809e
|
[BIP! Scores integration] merged missing classes from bipFinder branch
|
2021-01-11 10:05:02 +01:00 |
Claudio Atzori
|
d6686dd7cf
|
merged from master
|
2021-01-08 18:16:12 +01:00 |
Claudio Atzori
|
34229970e6
|
[BIP! Scores integration] Create updates as Result rather than subclasses; Result considers also metrics in the mergeFrom operation
|
2021-01-08 16:29:17 +01:00 |
Claudio Atzori
|
1361c9eb0c
|
[BIP! Scores integration] Create updates as Result rather than subclasses; Result considers also metrics in the mergeFrom operation
|
2021-01-07 10:07:30 +01:00 |
Claudio Atzori
|
ab2fe9266a
|
[DOIBoost] minor fixes in workflow definition
|
2021-01-05 10:26:39 +01:00 |
Claudio Atzori
|
7c722f3fdc
|
[DOIBoost] fixed typo
|
2021-01-05 10:25:54 +01:00 |
Claudio Atzori
|
8879704ba0
|
[DOIBoost] configurable ES server url and index name in crossref importer
|
2021-01-05 10:00:13 +01:00 |
Claudio Atzori
|
26e9d55c13
|
code formatting
|
2021-01-05 09:59:26 +01:00 |
Sandro La Bruzzo
|
7834a35768
|
avoid saving intermediate dataset before generation of Sequence file
|
2021-01-04 17:54:57 +01:00 |
Sandro La Bruzzo
|
e79445a8b4
|
minor fix for claudio polemica
|
2021-01-04 17:39:25 +01:00 |
Sandro La Bruzzo
|
8765020b85
|
minor fix
|
2021-01-04 17:37:08 +01:00 |
Sandro La Bruzzo
|
b0dc92786f
|
defined a single oozie workflow for the generation of doiboost
|
2021-01-04 17:01:35 +01:00 |
Claudio Atzori
|
7185158942
|
ignore missing properties
|
2020-12-29 11:06:28 +01:00 |
Claudio Atzori
|
28460c2cd1
|
using com.fasterxml.jackson.databind.ObjectMapper instead of org.codehaus.jackson.map.ObjectMapper
|
2020-12-23 16:59:52 +01:00 |
Claudio Atzori
|
60649ac7d2
|
swapped expected and actual in tests, updated expected number of authors
|
2020-12-23 12:26:04 +01:00 |
Claudio Atzori
|
723b01f9e9
|
trivial: the less magic numbers and values around, the better
|
2020-12-23 12:22:48 +01:00 |
Claudio Atzori
|
7bfc35df5e
|
Merge pull request 'Changed typo in script names' (#82) from antonis.lempesis/dnet-hadoop:master into master
no need to! :)
|
2020-12-22 12:36:21 +01:00 |
Antonis Lempesis
|
be5969a8c2
|
Changed typo in script names
|
2020-12-22 13:33:32 +02:00 |
Claudio Atzori
|
6cb0dc3f43
|
extended ORCID cleaning procedure
|
2020-12-21 11:40:17 +01:00 |
Claudio Atzori
|
573a8a3272
|
Merge pull request 'Changed typo in script names' (#81) from antonis.lempesis/dnet-hadoop:master into master
ok! LGTM
|
2020-12-18 17:44:26 +01:00 |
Antonis Lempesis
|
2a074c3b2b
|
Changed typo in script names
|
2020-12-18 18:40:48 +02:00 |
Claudio Atzori
|
47270d9af5
|
lenient mock can be lenient
|
2020-12-18 15:38:59 +01:00 |
Claudio Atzori
|
2e503ee101
|
code formatting
|
2020-12-17 13:47:38 +01:00 |
Claudio Atzori
|
5a3e2199b2
|
Merge pull request 'Creation of the action set to include the bipFinder! score' (#80) from miriam.baglioni/dnet-hadoop:bipFinder into bipFinder_master_test
|
2020-12-17 12:26:38 +01:00 |
Claudio Atzori
|
03319d3bd9
|
Revert "Merge pull request 'Creation of the action set to include the bipFinder! score' (#62) from miriam.baglioni/dnet-hadoop:bipFinder into master"
This reverts commit add7e1693b, reversing
changes made to f9a8fd8bbd.
|
2020-12-17 12:23:58 +01:00 |
Claudio Atzori
|
add7e1693b
|
Merge pull request 'Creation of the action set to include the bipFinder! score' (#62) from miriam.baglioni/dnet-hadoop:bipFinder into master
|
2020-12-17 12:09:03 +01:00 |
Alessia Bardi
|
f9a8fd8bbd
|
updated test record for textgrid
|
2020-12-17 11:59:45 +01:00 |
Claudio Atzori
|
4766495f5b
|
[orcid_to_result_from_semrel_propagation] fixed typo in SQL
|
2020-12-17 09:15:50 +01:00 |
Claudio Atzori
|
de00094ebc
|
Merge pull request 'FIX on the creation of subject based broker enrichments' (#79) from broker into master
|
2020-12-15 14:58:31 +01:00 |
Michele Artini
|
f9dc1e45fd
|
fixed a bug with invalid subject topic
|
2020-12-15 14:54:11 +01:00 |
Sandro La Bruzzo
|
f92bd56f56
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-12-15 11:47:29 +01:00 |
Sandro La Bruzzo
|
1f6c8a9e83
|
added orcid_pending type to records coming from Crossref
|
2020-12-15 11:47:15 +01:00 |
Claudio Atzori
|
9f1181290e
|
Merge pull request 'broker' (#78) from broker into master
The changes look good to me.
|
2020-12-15 10:03:45 +01:00 |
Michele Artini
|
0a0f62bd01
|
Merge branch 'master' into broker
|
2020-12-15 08:30:52 +01:00 |
Michele Artini
|
12fa5d122a
|
fixed a problem with join
|
2020-12-15 08:30:26 +01:00 |
Michele Artini
|
991e675dc6
|
validation in claim rels
|
2020-12-14 15:41:25 +01:00 |
Michele Artini
|
3e19cf7b4a
|
openaireId
|
2020-12-14 15:24:33 +01:00 |
Claudio Atzori
|
b6f08ce226
|
re-adding the old junit:junit dep as solr-test-framework needs it
|
2020-12-14 15:07:31 +01:00 |
Claudio Atzori
|
7d325e2c57
|
using actual result subclasses instead of their parent class
|
2020-12-14 14:40:54 +01:00 |
Claudio Atzori
|
152916890f
|
renamed test name
|
2020-12-14 14:40:05 +01:00 |
Michele Artini
|
a203aee32a
|
ES wf properties
|
2020-12-14 12:02:33 +01:00 |
Claudio Atzori
|
1506f49052
|
Xml record serialization for author PIDs: 1) only one value per PID type is allowed; 2) orcid prevails over orcid_pending
|
2020-12-14 11:14:03 +01:00 |
Michele Artini
|
d03756c962
|
mkdir of output dir
|
2020-12-14 11:11:41 +01:00 |
Michele Artini
|
399548f221
|
whitelist of topics
|
2020-12-14 11:03:55 +01:00 |