Claudio Atzori
57f448b7a4
graph cleaning workflow separates orcid_pending from orcid, depending on the author pid provenance
2020-12-02 10:44:05 +01:00
Alessia Bardi
a417624670
tests for raw graph mapping
2020-12-02 10:15:26 +01:00
Claudio Atzori
2c407e775e
GenerateEntitiesApplication can be configured to hash the id value or not
2020-11-30 12:00:38 +01:00
Miriam Baglioni
124591a7f3
refactoring
2020-11-25 18:23:28 +01:00
Miriam Baglioni
1a89f8211c
D-Net/dnet-hadoop#61 (comment)
2020-11-25 18:12:40 +01:00
Miriam Baglioni
d4ddde2ef2
changed because of D-Net/dnet-hadoop#61 (comment)
2020-11-25 18:01:01 +01:00
Miriam Baglioni
90d4369fd2
added test to verify the compression in writing community info on hdfs
2020-11-25 14:34:58 +01:00
Miriam Baglioni
1f130cdf92
changed the relation (produces -> isProducedBy) due to the change in the code
2020-11-25 14:04:26 +01:00
Miriam Baglioni
305e3d0c9c
added resource file for relation with relClass = isProducedBy
2020-11-25 13:43:41 +01:00
Miriam Baglioni
bde6d337dd
test classes for dump of results related to funders
2020-11-25 13:42:01 +01:00
Miriam Baglioni
b37b9352d7
added constant value for semantic relationship between projects and results
2020-11-25 13:41:08 +01:00
Claudio Atzori
e1a1bb3ee4
moved class CleaningFunctions to the correct package. Removed newlines from titles, descriptions, subjects
2020-11-24 18:34:03 +01:00
Miriam Baglioni
54a309bb6b
refactoring
2020-11-24 14:45:30 +01:00
Miriam Baglioni
35ecea8842
changed to consider the modification for the specification of the type of dump
2020-11-24 14:45:15 +01:00
Claudio Atzori
3f34757c63
merged from master
2020-11-19 14:34:54 +01:00
Claudio Atzori
ede7fae6c8
Merge pull request 'XML record indexing test' (#58) from provision_indexing into master
2020-11-18 17:04:34 +01:00
Claudio Atzori
8177ce7939
test for XmlIndexingJob based on a local miniSolrCluster
2020-11-18 10:58:05 +01:00
Michele Artini
33da2e3d6c
xpaths for dateOfCollection and dateOfTransformation
2020-11-18 09:26:20 +01:00
Alessia Bardi
7e0a76a8ac
test for TextGrid
2020-11-17 18:39:25 +01:00
Claudio Atzori
331d621800
added test resource
2020-11-14 12:16:15 +01:00
Claudio Atzori
768bc5304c
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
2020-11-13 15:40:34 +01:00
Claudio Atzori
93f7b7974f
Merge pull request 'trust truncated to 3 decimals' (#24) from trunc_trust into master
...
LGTM
2020-11-13 15:40:02 +01:00
Claudio Atzori
2bed29eb09
WIP: added oozie workflow for grouping graph entities by id
2020-11-13 10:05:12 +01:00
Claudio Atzori
13e36a4da0
WIP: added oozie workflow for grouping graph entities by id
2020-11-13 10:05:02 +01:00
Claudio Atzori
9b0fb9e958
merged from master
2020-11-12 09:27:12 +01:00
Sandro La Bruzzo
027ef2326c
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
2020-11-06 17:12:42 +01:00
Sandro La Bruzzo
cd27df91a1
fixed bug on missing relation in ANDS
2020-11-06 17:12:31 +01:00
Claudio Atzori
d10447e747
re-packaged graph dump workflow sources
2020-11-05 17:38:18 +01:00
Miriam Baglioni
56150d7e5e
removed all code related to the dump of pids graph
2020-11-04 17:13:12 +01:00
Miriam Baglioni
c010a8442f
fixed issue on test code
2020-11-03 17:26:51 +01:00
Miriam Baglioni
8ec7a61188
merge branch with master
2020-11-03 16:59:08 +01:00
Claudio Atzori
86d6fbe95b
refactoring: CleaningFunctions and OafMapperUtils moved to dhp-common
2020-11-03 12:19:46 +01:00
Claudio Atzori
3fcd669e99
result merge operation leverages a custom ResultTypeComparator in the aggregator graph construction
2020-11-03 10:53:23 +01:00
Claudio Atzori
09e44dabff
Merge branch 'master' into stable_ids
2020-11-02 12:16:01 +01:00
Sandro La Bruzzo
754c86f33e
fixed test to work on jenkins
2020-11-02 09:35:01 +01:00
Claudio Atzori
4ca75d6951
Merge pull request 'Dedup ID creation policy' (#48) from deduptesting into stable_ids
2020-10-30 15:15:32 +01:00
Miriam Baglioni
3241ec1777
added connection timeout and socket timeout 600 sec
2020-10-27 16:12:11 +01:00
Alessia Bardi
1425d810a8
testing mapping
2020-10-19 17:46:14 +02:00
Claudio Atzori
266bf1a221
common IdentifierFactory in use on the mapping from the aggregator data; merge the entities sharing the same id; code formatting
2020-10-16 17:02:10 +02:00
Claudio Atzori
34f1d0904b
common IdentifierFactory in use on the mapping from the aggregator data
2020-10-16 16:00:19 +02:00
Sandro La Bruzzo
fed711da80
Merge remote-tracking branch 'origin/master' into merge_record_to_common
2020-10-13 15:32:45 +02:00
Alessia Bardi
8775a64bc1
Merge pull request 'Merging different compatibility levels (pinocchio operator)' (#47) from merge_graph into master
2020-10-09 14:44:52 +02:00
Sandro La Bruzzo
fe0a7870e6
Added test to check if merge authors works
2020-10-08 10:33:12 +02:00
Sandro La Bruzzo
cd9c377d18
adapted scholexplorer Dump generation to the new Dataset definition
2020-10-08 10:10:13 +02:00
Claudio Atzori
8d85a2fced
[BETA wf only] datasources involved in the merge operation don't obey the infra precedence policy, but rely on a custom behaviour that, given two datasources from beta and prod, returns the one from prod with the highest compatibility among the two
2020-10-07 16:28:52 +02:00
Claudio Atzori
c2a6e2a9bf
fixed mapping for datasource journal info (ISSNs)
2020-10-02 09:37:08 +02:00
Miriam Baglioni
fcaedac980
merge branch with master
2020-10-01 16:46:59 +02:00
Claudio Atzori
2e9e13444d
author pids made unique by value
2020-10-01 12:50:40 +02:00
Claudio Atzori
e265c3e125
cleaning functions factored out in a dedicated class
2020-10-01 10:50:15 +02:00
Miriam Baglioni
de6c4d46d8
fixed conflicts
2020-09-24 15:35:01 +02:00
Claudio Atzori
27df1cea6d
code formatting
2020-09-24 12:16:00 +02:00
Claudio Atzori
fb22f4d70b
included values for projects fundedamount and totalcost fields in the mapping tests. Swapped expected and actual values in junit test assertions
2020-09-24 12:10:59 +02:00
Claudio Atzori
9e3e93c6b6
setting the correct issn type in the datasource.journal element
2020-09-24 10:39:16 +02:00
Miriam Baglioni
c2b5c780ff
-
2020-09-14 14:34:03 +02:00
Miriam Baglioni
1f893e63dc
-
2020-09-14 14:33:10 +02:00
Claudio Atzori
8a523474b7
code formatting
2020-09-07 11:40:16 +02:00
Miriam Baglioni
b72a7dad46
resource for pid graph dump
2020-08-24 17:09:01 +02:00
Miriam Baglioni
da103c399a
resources for the pid graph dump test
2020-08-24 16:52:07 +02:00
Miriam Baglioni
630a6a1fe7
first tests for the pid graph dump
2020-08-24 16:51:26 +02:00
Miriam Baglioni
2c783793ba
removed the affiliation from the author to mirror the changes in the model
2020-08-19 11:48:12 +02:00
Miriam Baglioni
f6bf888016
removed affiliation from author to mirror the changes in the model
2020-08-19 11:41:41 +02:00
Miriam Baglioni
66d0e0d3f2
-
2020-08-19 11:31:50 +02:00
Miriam Baglioni
d407852ac2
changed to reflect the changes in the model
2020-08-19 11:15:05 +02:00
Miriam Baglioni
47c21a8961
refactoring due to compilation
2020-08-19 11:11:57 +02:00
Miriam Baglioni
96600ed04a
modified test resource for mirroring the deletion of affiliation from author parameters
2020-08-14 20:41:49 +02:00
Miriam Baglioni
d2a8a4961a
refactoring
2020-08-13 18:50:33 +02:00
Miriam Baglioni
fd48ae3b85
changed because of D-Net/dnet-hadoop#40 (comment)
2020-08-13 12:19:15 +02:00
Miriam Baglioni
04a3e1ab38
disabled tests
2020-08-13 12:18:13 +02:00
Miriam Baglioni
2ede397933
Apply change because of D-Net/dnet-hadoop#40 (comment)
2020-08-13 12:16:39 +02:00
Miriam Baglioni
adf9f96a67
test for extraction of relation between organizations and context
2020-08-12 10:04:47 +02:00
Miriam Baglioni
25f4fbceea
draft of test and resources
2020-08-11 17:37:22 +02:00
Miriam Baglioni
30a2b19b65
changed metadata for deposition of covid-19 dump in Zenodo
2020-08-11 17:36:56 +02:00
Miriam Baglioni
49788b532a
changed to mirror changes in the schema
2020-08-11 16:05:03 +02:00
Miriam Baglioni
b08511287b
-
2020-08-11 16:01:36 +02:00
Miriam Baglioni
7e81a17068
changed the XQUERY to mirror the change in the code
2020-08-11 16:00:33 +02:00
Miriam Baglioni
37ad2f28e9
removed the | added in the prefix for datasource
2020-08-11 15:55:06 +02:00
Miriam Baglioni
f31c2e9461
enabled test
2020-08-11 15:49:25 +02:00
Miriam Baglioni
2d67476417
merge branch with master
2020-08-11 15:46:04 +02:00
Miriam Baglioni
6d3804e24c
-
2020-08-11 15:45:12 +02:00
Miriam Baglioni
0603ec4757
changed test to upload the dump for covid-19 community
2020-08-11 15:43:25 +02:00
Miriam Baglioni
7dfd56df9d
-
2020-08-11 15:42:35 +02:00
Miriam Baglioni
a169d7e7c1
added test file for the MakeTar class
2020-08-11 15:40:41 +02:00
Miriam Baglioni
c378c38546
disabled test. The testing functionalities for the upload to Zenodo are moved to common
2020-08-10 12:41:11 +02:00
Miriam Baglioni
63ad0ed209
changed to use communityMapPath instead of IsLookUp
2020-08-10 12:40:19 +02:00
Miriam Baglioni
cec795f2ea
changed resources to mirror changes in the model
2020-08-10 12:39:35 +02:00
Miriam Baglioni
f50e3e7333
changed the class for which to generate the schema
2020-08-10 12:03:49 +02:00
Miriam Baglioni
b8c26f656c
test using communityMapPath instead of isLookUp
2020-08-10 12:02:55 +02:00
Sandro La Bruzzo
0ade33ad15
updated mergeFrom function for DLI Unknown
2020-08-10 10:18:35 +02:00
Miriam Baglioni
545ea9f77e
moved to common: Zenodo response model and APIClient to deposit in Zenodo
2020-08-07 16:44:51 +02:00
Miriam Baglioni
adf0ca5aa7
test to send is from hdfs
2020-08-05 14:24:43 +02:00
Alessia Bardi
a29565ff57
code formatting
2020-08-04 12:55:27 +02:00
Alessia Bardi
09a323d18d
testing a dataset from Nakala
2020-08-04 12:50:52 +02:00
Alessia Bardi
c35bf486cc
added handle among the possible PIDs
2020-08-04 12:50:12 +02:00
Miriam Baglioni
5b651abf82
merge branch with master
2020-08-04 10:14:07 +02:00
Miriam Baglioni
fa38cdb10b
added resource
2020-08-03 18:11:12 +02:00
Miriam Baglioni
e9fcc0b2f1
commented out unit test - change to mirror the changed logic still to be decided
2020-08-03 18:10:53 +02:00
Miriam Baglioni
c892c7dfa7
changed to query for community map just once and save the result for remaining executions
2020-08-03 17:56:31 +02:00
Alessia Bardi
8cc067fe76
specific test for claims
2020-08-03 11:17:50 +02:00
Michele Artini
652b13abb6
Merge branch 'master' into nsprefix_blacklist
2020-07-31 07:58:37 +02:00
Claudio Atzori
cd631bb5bc
defaults fixed in the cleaning workflow; forces result.publisher to NULL when result.publisher.value is empty
2020-07-30 17:03:53 +02:00
Miriam Baglioni
872d7783fc
-
2020-07-30 16:50:36 +02:00
Claudio Atzori
4ff8007518
added function to set the missing vocabulary names, used in the cleaning workflow as a pre-cleaning step
2020-07-30 16:24:39 +02:00
Michele Artini
bdece15ca0
blacklist of nsprefix
2020-07-30 16:13:38 +02:00
Miriam Baglioni
ee8420c6b3
added resource for datasource test
2020-07-29 18:28:43 +02:00
Miriam Baglioni
ef1d8aef17
added one test to verify the dump for the datasources
2020-07-29 18:27:46 +02:00
Miriam Baglioni
1433db825d
refactoring
2020-07-29 17:43:24 +02:00
Miriam Baglioni
8ad8dac7d4
merge branch with fork master
2020-07-29 17:38:28 +02:00
Miriam Baglioni
536e7f6352
added and changed resources for testing of the whole graph dump and of community related products dumps
2020-07-29 17:33:34 +02:00
Miriam Baglioni
4d7f590493
tests for the whole graph dump
2020-07-29 17:32:37 +02:00
Miriam Baglioni
a2f73ec2c7
changed due to changes in the model
2020-07-29 17:32:02 +02:00
Miriam Baglioni
481585e9d3
-
2020-07-29 17:31:41 +02:00
Miriam Baglioni
de2ebb467e
changed due to changes in the model
2020-07-29 17:08:02 +02:00
Miriam Baglioni
d0ff2a56fb
-
2020-07-29 17:06:53 +02:00
Miriam Baglioni
b96dedb56b
changed due to changes in the model
2020-07-29 17:05:31 +02:00
Michele Artini
35e6e9c064
tests
2020-07-28 12:02:15 +02:00
Miriam Baglioni
332258d199
split the classes related to the communities dump and to the whole graph dump
2020-07-24 17:21:48 +02:00
Sandro La Bruzzo
9ab594ccf6
fixed test
2020-07-21 10:36:21 +02:00
Miriam Baglioni
40bbe94f7c
merge with master fork
2020-07-20 18:10:03 +02:00
Miriam Baglioni
3aab7680f6
changed the test results
2020-07-20 18:00:43 +02:00
Miriam Baglioni
5076e4f320
changed test to comply with the modifications
2020-07-20 17:55:18 +02:00
Claudio Atzori
54ac583923
code formatting
2020-07-20 17:37:08 +02:00
Claudio Atzori
124e7ce19c
in case of missing attribute //dr:CobjCategory/@type the resulttype is derived by looking up the vocabulary dnet:result_typologies with the 1st instance type available
2020-07-20 17:33:37 +02:00
Claudio Atzori
050dda223d
Merge pull request 'removed duplicated fields' (#25) from unique_field_in_lists into master
...
Looks good as a temporary workaround. I agree the model could seamlessly make the distinct operation by using HashSets instead of Linked (or Array) Lists.
The task to update the model in such a way is added on #9#issuecomment-1583
Thanks!
2020-07-20 12:12:50 +02:00
Michele Artini
331a3cbdd0
fixed originalId
2020-07-20 09:50:29 +02:00
Michele Artini
442f30930c
removed duplicated fields
2020-07-17 12:25:36 +02:00
Michele Artini
3adedd0a68
trust truncated to 3 decimals
2020-07-17 11:58:11 +02:00
Sandro La Bruzzo
c01efed79b
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
2020-07-10 14:44:57 +02:00
Sandro La Bruzzo
a7d3977481
added generation of EBI Dataset
2020-07-10 14:44:50 +02:00
Claudio Atzori
67e1d222b6
bulk cleaning: when bestaccessrights is found null or empty, sets it by evaluating the result instances
2020-07-08 17:53:35 +02:00
Alessia Bardi
9a898c0e4c
Json schema generator
2020-07-08 12:52:00 +02:00
Miriam Baglioni
7fe00cb4fb
-
2020-07-08 10:29:37 +02:00
Miriam Baglioni
375ef07d7b
changed the description for the upload
2020-07-07 18:41:27 +02:00
Miriam Baglioni
817cddfc52
-
2020-07-07 18:25:12 +02:00
Miriam Baglioni
a66aa9bd83
removed useless tests
2020-07-07 18:25:00 +02:00
Miriam Baglioni
9b20a21b24
removed useless tests
2020-07-07 18:23:37 +02:00
Miriam Baglioni
0208bc18f3
added new resource for testing
2020-07-07 17:47:24 +02:00
Miriam Baglioni
f5bb65c9ef
the json schema for the dump of the results
2020-07-07 17:34:40 +02:00
Miriam Baglioni
f8bf4acd76
-
2020-07-02 16:03:11 +02:00
Miriam Baglioni
e6c79d44e6
-
2020-07-02 16:02:02 +02:00
Miriam Baglioni
94500a581b
merge branch with fork master
2020-07-02 14:25:39 +02:00
Sandro La Bruzzo
1d420eedb4
added generation of EBI Dataset
2020-07-02 12:37:43 +02:00
Miriam Baglioni
3e5570de7a
-
2020-06-23 15:44:54 +02:00
Michele Artini
38bb45d0b6
test osf:refereed
2020-06-23 10:14:39 +02:00
Miriam Baglioni
e4b21be004
-
2020-06-22 17:31:50 +02:00
Miriam Baglioni
df80ae5c1b
merge branch with fork master
2020-06-22 10:51:23 +02:00
Miriam Baglioni
e8f914f8b3
-
2020-06-22 10:50:41 +02:00
Claudio Atzori
d0ac7514b2
cleaning workflow to include cleaning of default values
2020-06-18 19:37:25 +02:00
Miriam Baglioni
fb80353018
-
2020-06-18 14:21:36 +02:00
Miriam Baglioni
65bf312360
merge branch with fork master
2020-06-18 11:35:27 +02:00
Miriam Baglioni
a118b66858
-
2020-06-18 11:34:30 +02:00