Michele Artini
|
9cfc124ac5
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-09-08 16:39:54 +02:00 |
Michele Artini
|
a597a218ab
|
* forall topics
|
2020-09-08 16:39:40 +02:00 |
Claudio Atzori
|
8a523474b7
|
code formatting
|
2020-09-07 11:40:16 +02:00 |
Michele Artini
|
bb459caf69
|
support for all topic subscriptions
|
2020-08-27 11:01:21 +02:00 |
Michele Artini
|
82ed8edafd
|
notification indexing
|
2020-08-26 15:10:48 +02:00 |
Miriam Baglioni
|
b72a7dad46
|
resource for pid graph dump
|
2020-08-24 17:09:01 +02:00 |
Miriam Baglioni
|
8694bb9b31
|
refactoring due to compilation
|
2020-08-24 17:07:34 +02:00 |
Miriam Baglioni
|
8a069a4fea
|
-
|
2020-08-24 17:01:30 +02:00 |
Miriam Baglioni
|
34fa96f3b1
|
-
|
2020-08-24 17:00:20 +02:00 |
Miriam Baglioni
|
5fb2949cb8
|
added utils methods
|
2020-08-24 17:00:09 +02:00 |
Miriam Baglioni
|
2a540b6c01
|
added constants for the pid graph dump
|
2020-08-24 16:55:35 +02:00 |
Miriam Baglioni
|
da103c399a
|
resources for the pid graph dump test
|
2020-08-24 16:52:07 +02:00 |
Miriam Baglioni
|
630a6a1fe7
|
first tests for the pid graph dump
|
2020-08-24 16:51:26 +02:00 |
Miriam Baglioni
|
40c8d2de7b
|
test resources for the dump of the pids graph
|
2020-08-24 16:50:39 +02:00 |
Miriam Baglioni
|
bef79d3bdf
|
first attempt to the dump of pids graph
|
2020-08-24 16:49:38 +02:00 |
Michele Artini
|
da470422d3
|
deleting events
|
2020-08-21 14:52:48 +02:00 |
Michele Artini
|
6e60bf026a
|
indexing only a subset of events
|
2020-08-19 12:39:22 +02:00 |
Miriam Baglioni
|
85203c16e3
|
merge branch with master
|
2020-08-19 11:49:03 +02:00 |
Miriam Baglioni
|
2c783793ba
|
removed the affiliation from the author to mirror the changes in the model
|
2020-08-19 11:48:12 +02:00 |
Miriam Baglioni
|
f6bf888016
|
removed affiliation from author to mirror the changes in the model
|
2020-08-19 11:41:41 +02:00 |
Miriam Baglioni
|
66d0e0d3f2
|
-
|
2020-08-19 11:31:50 +02:00 |
Miriam Baglioni
|
1c593a9cfe
|
-
|
2020-08-19 11:29:51 +02:00 |
Miriam Baglioni
|
e42b2f5ae2
|
-
|
2020-08-19 11:29:09 +02:00 |
Miriam Baglioni
|
f81ee22418
|
changed to mirror the changes in the model (Instance, CommunityInstance, GraphResult)
|
2020-08-19 11:28:26 +02:00 |
Miriam Baglioni
|
387be43fd4
|
changed to discriminate if dumping all the results type together or each one in its own archive
|
2020-08-19 11:25:27 +02:00 |
Miriam Baglioni
|
c5858afb88
|
added parameter to guide the dump for the result (resultAggregation). true if all the result types should be dumped together, false otherwise.
|
2020-08-19 11:24:14 +02:00 |
Miriam Baglioni
|
d407852ac2
|
changed to reflect the changes in the model
|
2020-08-19 11:15:05 +02:00 |
Miriam Baglioni
|
47c21a8961
|
refactoring due to compilation
|
2020-08-19 11:11:57 +02:00 |
Miriam Baglioni
|
5570678c65
|
changed parameter name from hfdsNameNode to nameNode
|
2020-08-19 10:59:26 +02:00 |
Miriam Baglioni
|
dc5096a327
|
refactoring due to compilation
|
2020-08-19 10:57:36 +02:00 |
Miriam Baglioni
|
55e24c2547
|
relclass for relation and corresponding values have been put to lower case (isSupplementedBy was written as IsSupplementedBy - orcid propagation)
|
2020-08-18 16:42:08 +02:00 |
Miriam Baglioni
|
f44dd5d886
|
changed in mapping the result semantic name as it will be visible in the relclass Relation: from IsSupplementedBy to isSupplementedBy
|
2020-08-17 17:15:09 +02:00 |
Miriam Baglioni
|
bc6b5d5b34
|
removed leftover parameter
|
2020-08-15 11:22:35 +02:00 |
Miriam Baglioni
|
200cd5c730
|
removed leftover parameter
|
2020-08-15 11:22:19 +02:00 |
Miriam Baglioni
|
96600ed04a
|
modified test resource for mirroring the deletion of affiliation from author parameters
|
2020-08-14 20:41:49 +02:00 |
Miriam Baglioni
|
09f5b92763
|
added specific reference to class
|
2020-08-14 20:00:09 +02:00 |
Miriam Baglioni
|
37e7c43652
|
changed parameter name from hdfsNaemNode to nameNode
|
2020-08-14 18:18:25 +02:00 |
Claudio Atzori
|
5b994d7ccf
|
Merge branch 'dump' of https://code-repo.d4science.org/miriam.baglioni/dnet-hadoop into resolve_conflicts_pr40_dump
|
2020-08-14 15:32:29 +02:00 |
Miriam Baglioni
|
de995970ea
|
try again to solve clash with master
|
2020-08-14 15:24:36 +02:00 |
Miriam Baglioni
|
5040d72d5e
|
changed to make it equal to master branch
|
2020-08-14 15:20:17 +02:00 |
Miriam Baglioni
|
be8106c339
|
added space to avoid conflicts with master branch
|
2020-08-14 15:16:27 +02:00 |
Claudio Atzori
|
1871d1c6f6
|
solve error java.lang.NoSuchFieldError: INSTANCE when instantiating Solr client
|
2020-08-14 11:18:30 +02:00 |
Miriam Baglioni
|
d2a8a4961a
|
refactoring
|
2020-08-13 18:50:33 +02:00 |
Miriam Baglioni
|
a5043de5da
|
added method to get the mapped instance
|
2020-08-13 18:45:50 +02:00 |
Miriam Baglioni
|
b7e49aee8d
|
removed commented code
|
2020-08-13 18:44:07 +02:00 |
Miriam Baglioni
|
f439a6231e
|
added missing constraint in XQuery (verify the status of the RC/RI different from hidden)
|
2020-08-13 15:30:55 +02:00 |
Miriam Baglioni
|
0fe800b1c9
|
modified because of D-Net/dnet-hadoop#40#issuecomment-1902
|
2020-08-13 15:17:12 +02:00 |
Miriam Baglioni
|
270c89489c
|
fixed issue created while renaming subject to subjects in community configuration xml
|
2020-08-13 15:16:04 +02:00 |
Miriam Baglioni
|
fcd10f452c
|
changed because of D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:55:32 +02:00 |
Miriam Baglioni
|
fd48ae3b85
|
changed because of D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:19:15 +02:00 |
Miriam Baglioni
|
04a3e1ab38
|
disabled tests
|
2020-08-13 12:18:13 +02:00 |
Miriam Baglioni
|
2ede397933
|
Apply change because of D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:16:39 +02:00 |
Miriam Baglioni
|
bfd1fcde6d
|
removed not useful method and changed because of D-Net/dnet-hadoop#40 (comment) and D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:14:37 +02:00 |
Miriam Baglioni
|
7fd8397123
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:13:15 +02:00 |
Miriam Baglioni
|
753d448cc9
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:12:58 +02:00 |
Miriam Baglioni
|
c0e071fa26
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:12:40 +02:00 |
Miriam Baglioni
|
526db915bc
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:12:16 +02:00 |
Miriam Baglioni
|
b0fab0d138
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:11:57 +02:00 |
Miriam Baglioni
|
1b6320b251
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:11:41 +02:00 |
Miriam Baglioni
|
743d31be22
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:11:22 +02:00 |
Miriam Baglioni
|
65b48df652
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:11:06 +02:00 |
Miriam Baglioni
|
90b54d3efb
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:08:24 +02:00 |
Miriam Baglioni
|
69bbb9592a
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:07:39 +02:00 |
Miriam Baglioni
|
945323299a
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:07:24 +02:00 |
Miriam Baglioni
|
e04c993247
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:07:07 +02:00 |
Miriam Baglioni
|
ed0812d0ce
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:06:49 +02:00 |
Miriam Baglioni
|
d55cfe0ea5
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:06:20 +02:00 |
Miriam Baglioni
|
80866bec7d
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:06:05 +02:00 |
Miriam Baglioni
|
1400978c0a
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:05:44 +02:00 |
Miriam Baglioni
|
7b941a2e0a
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:05:17 +02:00 |
Miriam Baglioni
|
f7474f50fe
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:04:52 +02:00 |
Miriam Baglioni
|
367203f412
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:04:33 +02:00 |
Miriam Baglioni
|
3ab4809d31
|
apply changes in D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 12:04:10 +02:00 |
Miriam Baglioni
|
02a4986e7b
|
Applying changes from code reviews D-Net/dnet-hadoop#40 (comment) and D-Net/dnet-hadoop#40 (comment) and D-Net/dnet-hadoop#40 (comment)
|
2020-08-13 11:53:01 +02:00 |
Miriam Baglioni
|
235d4e4d6e
|
moved Context as relevant for Communities dump
|
2020-08-12 18:16:45 +02:00 |
Miriam Baglioni
|
adf9f96a67
|
test for extraction of relation between organizations and context
|
2020-08-12 10:04:47 +02:00 |
Miriam Baglioni
|
7400cd019d
|
removed not needed variable
|
2020-08-12 10:03:33 +02:00 |
Miriam Baglioni
|
98d28bab5c
|
fixed missing _ in context nsprefix
|
2020-08-12 10:00:18 +02:00 |
Miriam Baglioni
|
8f48cb29f4
|
changed resource because of a change in the XQuery that returned the XML to be parsed. The main Zenodo community is no longer a separate element, but part of the <zenodocommunities> element
|
2020-08-11 18:04:38 +02:00 |
Miriam Baglioni
|
c3672b162b
|
merge branch with master
|
2020-08-11 17:53:04 +02:00 |
Miriam Baglioni
|
a16bbf3202
|
changed test resource to mirror change in the XQuery that produced data to be parsed. The main Zenodo community is no longer provided in a different element, but is part of the <zenodocommunities>
|
2020-08-11 17:48:44 +02:00 |
Miriam Baglioni
|
25f4fbceea
|
draft of test and resources
|
2020-08-11 17:37:22 +02:00 |
Miriam Baglioni
|
30a2b19b65
|
changed metadata for deposition of covid-19 dump in Zenodo
|
2020-08-11 17:36:56 +02:00 |
Claudio Atzori
|
f7cc52ab02
|
Merge pull request 'enrichment_wfs' (#39) from enrichment_wfs into master
LGTM
|
2020-08-11 17:26:13 +02:00 |
Miriam Baglioni
|
49788b532a
|
changed to mirror changes in the schema
|
2020-08-11 16:05:03 +02:00 |
Miriam Baglioni
|
b08511287b
|
-
|
2020-08-11 16:01:36 +02:00 |
Miriam Baglioni
|
7e81a17068
|
changed the XQUERY to mirror the change in the code
|
2020-08-11 16:00:33 +02:00 |
Miriam Baglioni
|
37ad2f28e9
|
removed added | in prefix for datasource
|
2020-08-11 15:55:06 +02:00 |
Miriam Baglioni
|
f31c2e9461
|
enabled test
|
2020-08-11 15:49:25 +02:00 |
Miriam Baglioni
|
2d67476417
|
merge branch with master
|
2020-08-11 15:46:04 +02:00 |
Miriam Baglioni
|
77a390878c
|
merge upstream
|
2020-08-11 15:45:48 +02:00 |
Miriam Baglioni
|
6d3804e24c
|
-
|
2020-08-11 15:45:12 +02:00 |
Miriam Baglioni
|
0603ec4757
|
changed test to upload the dump for covid-19 community
|
2020-08-11 15:43:25 +02:00 |
Miriam Baglioni
|
7dfd56df9d
|
-
|
2020-08-11 15:42:35 +02:00 |
Miriam Baglioni
|
a169d7e7c1
|
added test file for the MakeTar class
|
2020-08-11 15:40:41 +02:00 |
Miriam Baglioni
|
acb0926b2e
|
json schemas for the dumped entities and relation
|
2020-08-11 15:39:48 +02:00 |
Miriam Baglioni
|
ff52c51f92
|
added the communityMapPath parameter and removed the isLookUpUrl parameter
|
2020-08-11 15:39:22 +02:00 |
Miriam Baglioni
|
6f43acda5e
|
added the maketar and send to zenodo step. Adjusted wf parameters
|
2020-08-11 15:38:20 +02:00 |
Miriam Baglioni
|
ddc19de2e9
|
removed the isLookUpUrl among the parameters
|
2020-08-11 15:37:47 +02:00 |
Miriam Baglioni
|
592a8ea573
|
added parameter file for maketar class
|
2020-08-11 15:37:14 +02:00 |
Miriam Baglioni
|
77a0951b32
|
added the make archive step in the workflow
|
2020-08-11 15:32:32 +02:00 |
Miriam Baglioni
|
cf4d918787
|
added description, changed parameter name and added method
|
2020-08-11 15:27:31 +02:00 |
Miriam Baglioni
|
dc5fc5366d
|
Creation of an archive for each related dump part
|
2020-08-11 15:26:06 +02:00 |
Miriam Baglioni
|
0ce49049d6
|
added description
|
2020-08-11 15:25:11 +02:00 |
Miriam Baglioni
|
9bae991167
|
added description of the class
|
2020-08-11 11:20:43 +02:00 |
Miriam Baglioni
|
341dc59ead
|
removed the repartition(1). Added code for the creation of an archive containing all the parts dumped for each community
|
2020-08-11 11:18:58 +02:00 |
Sandro La Bruzzo
|
fe8d640aee
|
fixed error on oozie workflow
|
2020-08-11 09:43:03 +02:00 |
Sandro La Bruzzo
|
304590e854
|
updated workflow of indexing to start from the beginning
|
2020-08-11 09:17:47 +02:00 |
Sandro La Bruzzo
|
eaf0dc68a2
|
fixed indexing
|
2020-08-11 09:17:03 +02:00 |
Miriam Baglioni
|
1991a49f70
|
removed reference to isLookUp to get the communityMap
|
2020-08-10 18:02:56 +02:00 |
Miriam Baglioni
|
c378c38546
|
disabled test. The testing functionalities for the upload in Zenodo are moved to common
|
2020-08-10 12:41:11 +02:00 |
Miriam Baglioni
|
63ad0ed209
|
changed to use communityMapPath instead of IsLookUp
|
2020-08-10 12:40:19 +02:00 |
Miriam Baglioni
|
cec795f2ea
|
changed resources to mirror changes in the model
|
2020-08-10 12:39:35 +02:00 |
Miriam Baglioni
|
f50e3e7333
|
changed the class for which to generate the schema
|
2020-08-10 12:03:49 +02:00 |
Miriam Baglioni
|
b8c26f656c
|
test using communityMapPath instead of isLookUp
|
2020-08-10 12:02:55 +02:00 |
Miriam Baglioni
|
fe88904df0
|
changed the wf definition
|
2020-08-10 12:01:14 +02:00 |
Miriam Baglioni
|
87856467e2
|
removed isLookUpUrl and added code to read from HDFS the communitymap
|
2020-08-10 11:38:41 +02:00 |
Miriam Baglioni
|
1cf7043e26
|
removed isLookUpUrl from the parameters
|
2020-08-10 11:38:03 +02:00 |
Claudio Atzori
|
cf6b68ce5a
|
Merge pull request 'data provision workflow: add nodes to perform DELETE BY QUERY before the indexing begins and COMMIT after the indexing is completed' (#36) from provision_indexing into master
|
2020-08-10 11:16:29 +02:00 |
Sandro La Bruzzo
|
0ade33ad15
|
updated mergeFrom function for DLI Unknown
|
2020-08-10 10:18:35 +02:00 |
Miriam Baglioni
|
46986aae2d
|
added the new parameter for newdeposion/newversion and concept_record_id
|
2020-08-07 18:00:06 +02:00 |
Miriam Baglioni
|
3aedfdf0d6
|
added option to do a new deposition or new version of an old deposition
|
2020-08-07 17:49:14 +02:00 |
Miriam Baglioni
|
1b3ad1bce6
|
filter out authors pid (only orcid). Added check to get unique provenance for context id. filter out countries with code UNKNOWN
|
2020-08-07 17:48:18 +02:00 |
Miriam Baglioni
|
5ceb8c5f0a
|
moved constants from graph.Constants
|
2020-08-07 17:46:47 +02:00 |
Miriam Baglioni
|
6c65c93c0e
|
refactoring
|
2020-08-07 17:45:35 +02:00 |
Miriam Baglioni
|
68adf86fe4
|
refactoring
|
2020-08-07 17:43:20 +02:00 |
Miriam Baglioni
|
26d2ad6ebb
|
refactoring
|
2020-08-07 17:41:56 +02:00 |
Miriam Baglioni
|
9675af7965
|
refactoring
|
2020-08-07 17:41:07 +02:00 |
Miriam Baglioni
|
346a91f4d9
|
Added constants
|
2020-08-07 17:35:39 +02:00 |
Miriam Baglioni
|
d52b0e1797
|
no use of IsLookUp. The query is done once and its result stored on HDFS. The path to the result is given instead of the isLookUpUrl
|
2020-08-07 17:34:40 +02:00 |
Miriam Baglioni
|
ae1b7fbfdb
|
changed method signature from set of mapkey entries to String representing path on file system where to find the map
|
2020-08-07 17:32:27 +02:00 |
Miriam Baglioni
|
931fa2ff00
|
removed dependencies
|
2020-08-07 16:46:37 +02:00 |
Miriam Baglioni
|
545ea9f77e
|
moved in common. Zenodo response model and APIClient to deposit in Zenodo
|
2020-08-07 16:44:51 +02:00 |
Sandro La Bruzzo
|
ddb1446ceb
|
fixed test
|
2020-08-07 11:34:33 +02:00 |
Sandro La Bruzzo
|
718bc7bbc8
|
implemented provision workflows using the new implementation with Dataset
|
2020-08-07 11:05:18 +02:00 |
Miriam Baglioni
|
da9b012c15
|
fixed description
|
2020-08-06 11:55:44 +02:00 |
Miriam Baglioni
|
6dbadcf181
|
the new schema for the dumped result
|
2020-08-06 11:05:56 +02:00 |
Sandro La Bruzzo
|
a44e5abaa7
|
reformat code
|
2020-08-06 10:30:22 +02:00 |
Sandro La Bruzzo
|
4fb1821fab
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-08-06 10:28:31 +02:00 |
Sandro La Bruzzo
|
9d9e9edbd2
|
improved extractEntity Relation workflows using dataset
|
2020-08-06 10:28:24 +02:00 |
Miriam Baglioni
|
adf0ca5aa7
|
test to send is from hdfs
|
2020-08-05 14:24:43 +02:00 |
Miriam Baglioni
|
14eda4f46e
|
added method to try to put inputstream to zenodo
|
2020-08-05 14:18:25 +02:00 |
Miriam Baglioni
|
e737a47270
|
added classes to try to send input stream to zenodo for the upload
|
2020-08-05 14:17:40 +02:00 |
Miriam Baglioni
|
873e9cd50c
|
changed hadoop setting to connect to s3
|
2020-08-04 15:37:25 +02:00 |
Alessia Bardi
|
a29565ff57
|
code formatting
|
2020-08-04 12:55:27 +02:00 |
Alessia Bardi
|
01db29e208
|
fixes redmine issue #5846: datacite and its different namespace declarations
|
2020-08-04 12:53:48 +02:00 |
Alessia Bardi
|
b4e4e5f858
|
do not duplicate result PIDs
|
2020-08-04 12:52:14 +02:00 |
Alessia Bardi
|
09a323d18d
|
testing a dataset from Nakala
|
2020-08-04 12:50:52 +02:00 |
Alessia Bardi
|
c35bf486cc
|
added handle among the possible PIDs
|
2020-08-04 12:50:12 +02:00 |
Miriam Baglioni
|
5b651abf82
|
merge branch with master
|
2020-08-04 10:14:07 +02:00 |
Miriam Baglioni
|
88e4c3b751
|
added default trust to context bulktagged
|
2020-08-04 10:13:25 +02:00 |
Miriam Baglioni
|
f9342cb484
|
added constant
|
2020-08-03 18:32:35 +02:00 |
Miriam Baglioni
|
96c3c891f4
|
added trust
|
2020-08-03 18:32:17 +02:00 |
Miriam Baglioni
|
53656600ad
|
changed XQuery to select only community and ri with status not hidden
|
2020-08-03 18:29:30 +02:00 |
Miriam Baglioni
|
b34177d8ef
|
merge upstream
|
2020-08-03 18:13:42 +02:00 |
Miriam Baglioni
|
901ae37f7b
|
added step to workflow
|
2020-08-03 18:12:54 +02:00 |
Miriam Baglioni
|
fa38cdb10b
|
added resource
|
2020-08-03 18:11:12 +02:00 |
Miriam Baglioni
|
e9fcc0b2f1
|
commented test unit - to decide change for mirroring the changed logics
|
2020-08-03 18:10:53 +02:00 |
Miriam Baglioni
|
e43aeb139a
|
added new property file and changed some parameter to old files
|
2020-08-03 18:07:28 +02:00 |
Miriam Baglioni
|
aa9f3d9698
|
changed logic for save in s3 directly
|
2020-08-03 18:06:18 +02:00 |
Miriam Baglioni
|
d465f0eec9
|
added fulltext to result
|
2020-08-03 18:03:27 +02:00 |
Miriam Baglioni
|
ec4b392d12
|
added new dependencies for writing on s3
|
2020-08-03 17:57:04 +02:00 |
Miriam Baglioni
|
c892c7dfa7
|
changed to query for community map just once and save the result for remaining executions
|
2020-08-03 17:56:31 +02:00 |
Claudio Atzori
|
3a11a387a9
|
data provision workflow enhancement: added nodes to perform DELETE BY QUERY before the indexing begins and COMMIT after the indexing is completed
|
2020-08-03 14:28:08 +02:00 |
Alessia Bardi
|
8cc067fe76
|
specific test for claims
|
2020-08-03 11:17:50 +02:00 |
Claudio Atzori
|
a89b6cc3ba
|
Merge pull request 'nsprefix_blacklist' (#34) from nsprefix_blacklist into master
|
2020-07-31 11:52:23 +02:00 |
Sandro La Bruzzo
|
0c3bc9ea4b
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-07-31 09:07:18 +02:00 |
Sandro La Bruzzo
|
168bfb496a
|
adopted dedup to the new schema
|
2020-07-31 09:06:57 +02:00 |
Michele Artini
|
652b13abb6
|
Merge branch 'master' into nsprefix_blacklist
|
2020-07-31 07:58:37 +02:00 |
Claudio Atzori
|
cd631bb5bc
|
defaults fixed in the cleaning workflow forces result.publisher to NULL when result.publisher.value is empty
|
2020-07-30 17:03:53 +02:00 |
Miriam Baglioni
|
872d7783fc
|
-
|
2020-07-30 16:50:36 +02:00 |
Miriam Baglioni
|
57c87b7653
|
re-implemented to fix issue on not serializable Set<String> variable
|
2020-07-30 16:43:43 +02:00 |
Miriam Baglioni
|
ef8e5957b5
|
added specific directory where to save results
|
2020-07-30 16:42:46 +02:00 |
Miriam Baglioni
|
75f3361c85
|
-
|
2020-07-30 16:41:31 +02:00 |
Miriam Baglioni
|
3f695b25fa
|
refactoring
|
2020-07-30 16:40:15 +02:00 |
Miriam Baglioni
|
e623f12bef
|
refactoring
|
2020-07-30 16:32:59 +02:00 |
Miriam Baglioni
|
ff7d05abb4
|
added support class to store the couple organizationId representativeId got from sql query on hive
|
2020-07-30 16:32:04 +02:00 |
Miriam Baglioni
|
cf6d80b2ab
|
added command to close the writer
|
2020-07-30 16:31:22 +02:00 |
Miriam Baglioni
|
f985bca37b
|
added USER_CLAIM constant value
|
2020-07-30 16:25:26 +02:00 |
Claudio Atzori
|
4bbfcf1ac6
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-07-30 16:25:06 +02:00 |
Claudio Atzori
|
4ff8007518
|
added function to set the missing vocabulary names, used in the cleaning workflow as a pre-cleaning step
|
2020-07-30 16:24:39 +02:00 |
Miriam Baglioni
|
6f1c40a933
|
-
|
2020-07-30 16:24:28 +02:00 |
Miriam Baglioni
|
2b66a93f9e
|
added property file that was missing
|
2020-07-30 16:24:17 +02:00 |
Michele Artini
|
bdece15ca0
|
blacklist of nsprefix
|
2020-07-30 16:13:38 +02:00 |
Sandro La Bruzzo
|
c97c8f0c44
|
implemented new oozie job to extract entities in a separate dataset
|
2020-07-30 12:13:58 +02:00 |
Sandro La Bruzzo
|
3010a362bc
|
updated changing in the workflow of provision in the phase of aggregation. Removed serialization in JSON RDD and used spark Dataset
|
2020-07-30 09:25:56 +02:00 |
Sandro La Bruzzo
|
487226f669
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-07-30 09:25:39 +02:00 |
Sandro La Bruzzo
|
16ae3c9ccf
|
updated changing in the workflow of provision in the phase of aggregation. Removed serialization in JSON RDD and used spark Dataset
|
2020-07-30 09:25:32 +02:00 |
Miriam Baglioni
|
ee8420c6b3
|
added resource for datasource test
|
2020-07-29 18:28:43 +02:00 |
Miriam Baglioni
|
76bcab98ce
|
added code to filter out null originalId from the dump
|
2020-07-29 18:28:21 +02:00 |
Miriam Baglioni
|
ef1d8aef17
|
added one test to verify the dump for the datasources
|
2020-07-29 18:27:46 +02:00 |
Miriam Baglioni
|
86bab79512
|
-
|
2020-07-29 18:20:22 +02:00 |
Miriam Baglioni
|
31791dcf3d
|
fixed wrong property file path name
|
2020-07-29 18:20:08 +02:00 |
Miriam Baglioni
|
9e722aa1ef
|
-
|
2020-07-29 18:00:08 +02:00 |
Miriam Baglioni
|
d22f106f27
|
added constant to identify datasource associated to funders
|
2020-07-29 17:56:55 +02:00 |
Miriam Baglioni
|
40e194fe2f
|
added check to not dump datasources related to funders
|
2020-07-29 17:56:18 +02:00 |
Miriam Baglioni
|
b48934f6df
|
changed the workflow name
|
2020-07-29 17:43:43 +02:00 |
Miriam Baglioni
|
1433db825d
|
refactoring
|
2020-07-29 17:43:24 +02:00 |
Miriam Baglioni
|
074e9ab75e
|
refactoring
|
2020-07-29 17:42:50 +02:00 |
Miriam Baglioni
|
8ad8dac7d4
|
merge branch with fork master
|
2020-07-29 17:38:28 +02:00 |