Michele Artini
|
5de8a7276f
|
wf to partition opendoar events
|
2020-12-07 14:56:06 +01:00 |
Claudio Atzori
|
a104a632df
|
cleanup
|
2020-12-04 16:32:47 +01:00 |
Claudio Atzori
|
5b4e1142a8
|
Merge pull request 'added last step to update cache' (#64) from antonis.lempesis/dnet-hadoop:master into master
Looks good to me, thanks!
|
2020-12-04 14:42:31 +01:00 |
Antonis Lempesis
|
b1ed1afdcc
|
added the new parameter (stats_tool_api_url) in the workflow parameters
|
2020-12-04 13:07:18 +02:00 |
Antonis Lempesis
|
7cb113e088
|
added the new parameter (stats_tool_api_url) in the workflow parameters
|
2020-12-04 13:04:25 +02:00 |
Antonis Lempesis
|
d23ccae0d5
|
ignoring deletedbyinference relations
|
2020-12-04 12:42:17 +02:00 |
Antonis Lempesis
|
413afcfed5
|
finished first implementation of wf
|
2020-12-02 15:57:17 +02:00 |
Antonis Lempesis
|
0948536614
|
initial implementation of the promote wf
|
2020-12-02 15:41:56 +02:00 |
Sandro La Bruzzo
|
6ba8037cc7
|
fixed test failure caused by changed input
|
2020-12-02 11:34:46 +01:00 |
Claudio Atzori
|
cfb55effd9
|
code formatting
|
2020-12-02 11:23:49 +01:00 |
Claudio Atzori
|
74242e450e
|
using constants from ModelConstants
|
2020-12-02 11:23:35 +01:00 |
Claudio Atzori
|
873c358d1d
|
Merge pull request 'added extension for new author pid (orcid_pending)' (#63) from miriam.baglioni/dnet-hadoop:master into master
LGTM
|
2020-12-02 11:15:00 +01:00 |
Miriam Baglioni
|
cd285e98bc
|
using the constants defined in the ModelConstants class
|
2020-12-02 11:13:23 +01:00 |
Miriam Baglioni
|
51c582c08c
|
added orcid class name among the constants set
|
2020-12-02 11:12:54 +01:00 |
Miriam Baglioni
|
4b0d1530a2
|
merge upstream
|
2020-12-02 11:05:00 +01:00 |
Claudio Atzori
|
faa977df7e
|
Merge pull request 'orcid-no-doi' (#43) from enrico.ottonello/dnet-hadoop:orcid-no-doi into master
The dataset was generated and is now part of the actionsets available in BETA
|
2020-12-02 10:55:12 +01:00 |
Claudio Atzori
|
57f448b7a4
|
graph cleaning workflow separate orcid_pending from orcid, depending on the author pid provenance
|
2020-12-02 10:44:05 +01:00 |
Alessia Bardi
|
2d15667b4a
|
testing XML generation from json object (case AMS ACTA)
|
2020-12-02 10:16:26 +01:00 |
Alessia Bardi
|
a417624670
|
tests for raw graph mapping
|
2020-12-02 10:15:26 +01:00 |
Miriam Baglioni
|
f8468c9c22
|
added extension for new author pid (orcid_pending)
|
2020-12-01 20:09:35 +01:00 |
Enrico Ottonello
|
f2df3ead74
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi
|
2020-11-30 14:22:46 +01:00 |
Enrico Ottonello
|
40c4559e92
|
added datainfo on authors pid with "sysimport:crosswalk:entityregistry",
|
2020-11-30 14:19:22 +01:00 |
Antonis Lempesis
|
815d6b25d9
|
added last step to update cache
|
2020-11-30 00:48:10 +02:00 |
Claudio Atzori
|
e731a7658d
|
cleaning texts to remove tab characters too
|
2020-11-27 09:00:04 +01:00 |
Claudio Atzori
|
a104d2b6ad
|
cleanup
|
2020-11-26 11:12:00 +01:00 |
Claudio Atzori
|
db0181b8af
|
Merge pull request 'added bidirectionality to relations from project and result coming from crossref' (#60) from miriam.baglioni/dnet-hadoop:sxBidirectionality into master
|
2020-11-25 17:17:40 +01:00 |
Sandro La Bruzzo
|
ec3e238de6
|
Fixed problem on duplicated identifier
|
2020-11-25 17:15:54 +01:00 |
Sandro La Bruzzo
|
264723ffd8
|
updated stuff for zenodo upload
|
2020-11-25 11:56:07 +01:00 |
Claudio Atzori
|
eeebd5a920
|
Cleaning workflow: remove newlines from titles, descriptions, subjects
|
2020-11-24 18:40:25 +01:00 |
Enrico Ottonello
|
99a086f0c6
|
max concurrent executors set to 10, according to ORCID Director of Technology mail request
|
2020-11-24 17:49:32 +01:00 |
Miriam Baglioni
|
00874a8ce6
|
added bidirectionality to relations from project and result
|
2020-11-24 15:17:23 +01:00 |
Enrico Ottonello
|
5c17e768b2
|
set wf configuration with spark.dynamicAllocation.maxExecutors 20 over 20 input partitions
|
2020-11-23 16:01:23 +01:00 |
Enrico Ottonello
|
5c9a727895
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi
|
2020-11-23 09:49:53 +01:00 |
Enrico Ottonello
|
97c8111847
|
action to convert lambda file in seq file; spark action to download updated authors
|
2020-11-23 09:49:22 +01:00 |
Claudio Atzori
|
d48f388fb2
|
Merge branch 'provision_indexing'
|
2020-11-19 15:59:55 +01:00 |
Claudio Atzori
|
46bde9c13f
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-11-19 15:26:27 +01:00 |
Claudio Atzori
|
7c9feaf9e7
|
project attributes removed from the XML record serialization: contactfullname, contactfax, contactphone, contactemail
|
2020-11-19 15:26:20 +01:00 |
Michele Artini
|
293da47ad9
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-11-19 10:42:31 +01:00 |
Michele Artini
|
ab08d12c46
|
considering abstract > MIN_LENGTH in ENRICH_MISSING_ABSTRACT
|
2020-11-19 10:42:10 +01:00 |
Claudio Atzori
|
e503271abe
|
fixed notification workflow name
|
2020-11-19 10:41:38 +01:00 |
Claudio Atzori
|
0374d34c3e
|
introduced configuration param outputFormat: HDFS | SOLR
|
2020-11-19 10:34:28 +01:00 |
Claudio Atzori
|
ede7fae6c8
|
Merge pull request 'XML record indexing test' (#58) from provision_indexing into master
|
2020-11-18 17:04:34 +01:00 |
Claudio Atzori
|
5218718e8b
|
updated set of fields from the MDFormatDSResourceType on PROD
|
2020-11-18 15:00:41 +01:00 |
Claudio Atzori
|
d9e07a242b
|
extended XmlIndexingJob to accept an optional parameter: outputPath. When present, forces the job to write its output on the specified HDFS location
|
2020-11-18 14:34:55 +01:00 |
Claudio Atzori
|
29dcff0f34
|
spark complains about missing classes, so here they are again
|
2020-11-18 14:32:32 +01:00 |
Claudio Atzori
|
12acf25519
|
Merge pull request 'starting from first step...' (#57) from antonis.lempesis/dnet-hadoop:master into master
No judging. Just re-deploying...
|
2020-11-18 11:01:49 +01:00 |
Claudio Atzori
|
8177ce7939
|
test for XmlIndexingJob based on a local miniSolrCluster
|
2020-11-18 10:58:05 +01:00 |
Alessia Bardi
|
10e673660f
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-11-18 10:01:23 +01:00 |
Alessia Bardi
|
be7b310cef
|
rel semantics ignore case
|
2020-11-18 10:01:20 +01:00 |
Michele Artini
|
33da2e3d6c
|
xpaths for dateOfCollection and dateOfTransformation
|
2020-11-18 09:26:20 +01:00 |