Claudio Atzori
|
5c6d328537
|
code formatting
|
2021-11-26 15:38:16 +01:00 |
Sandro La Bruzzo
|
93fe8ce8b2
|
entity resolution: fix test
|
2021-11-22 15:50:43 +01:00 |
Sandro La Bruzzo
|
35e20b0647
|
updated resolution wf:
- generate a new version of the graph
- changed merge from union to join
|
2021-11-22 11:48:55 +01:00 |
Miriam Baglioni
|
fdb75b180e
|
[Cleaning] added couple of tests for DOIBOOST publications
|
2021-11-21 16:35:22 +01:00 |
Miriam Baglioni
|
0136a8c266
|
[Graph Dump] Change test to mirror that measure is at the level of the isntance
|
2021-11-18 14:38:33 +01:00 |
Miriam Baglioni
|
1b79c0ee79
|
mergin with branch beta
|
2021-11-18 11:01:00 +01:00 |
Claudio Atzori
|
e0395719d7
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-11-17 14:17:27 +01:00 |
Claudio Atzori
|
82a4e4efae
|
[cleaning wf] fixed methodology to rule out invalid result titles, based on https://support.openaire.eu/issues/7206
|
2021-11-17 14:17:22 +01:00 |
Miriam Baglioni
|
6d4a1c57ee
|
[Resolve Entities] Change test dataset to mirror the modification in the creation of the map between the pids and the unresolved
|
2021-11-17 12:41:52 +01:00 |
Miriam Baglioni
|
99d86134f5
|
[Graph Dump] changed the dump since the measures have been moded at the level of the instance
|
2021-11-16 12:04:21 +01:00 |
Miriam Baglioni
|
6595135a1a
|
[Dump Schemas] changed the schema of the dumped result according to the modifications in the bestAccessRight type
|
2021-11-12 11:45:38 +01:00 |
Miriam Baglioni
|
b3f9370125
|
merge with beta - resolved conflict in pom
|
2021-11-12 11:25:26 +01:00 |
Miriam Baglioni
|
ffb0ce1d59
|
merge with beta - resolved conflict in pom
|
2021-11-12 10:19:59 +01:00 |
Miriam Baglioni
|
b8bdabfae9
|
[Graph DUmp] removed OpenAccessRoute from test in best access right
|
2021-11-11 16:16:48 +01:00 |
Miriam Baglioni
|
e5498052e8
|
[Graph DUmp] removed OpenAccessRoute from test in best access right
|
2021-11-11 16:14:10 +01:00 |
Miriam Baglioni
|
935062edec
|
[Bypass Action Set] creation of unresolved entities
|
2021-11-11 16:11:25 +01:00 |
Sandro La Bruzzo
|
9cb195314f
|
implemented and tested resolution of entities
|
2021-11-11 10:17:40 +01:00 |
Sandro La Bruzzo
|
ae4e99a471
|
Adapted workflow of resolution of PID to work into OpenAIRE data workflow
- Added relations in both verse on all Scholexplorer datasources
|
2021-10-20 17:12:16 +02:00 |
Miriam Baglioni
|
fec40bdd95
|
merging with branch beta - resolved conflicts
|
2021-10-12 09:16:36 +02:00 |
Miriam Baglioni
|
83f51f1812
|
refactoring
|
2021-10-12 09:14:43 +02:00 |
Sandro La Bruzzo
|
5606014b17
|
code refactor see ticket #7065
|
2021-10-12 08:11:53 +02:00 |
Sandro La Bruzzo
|
2557bb41f5
|
Implemented new method for update baseline inside scala node
|
2021-10-06 16:41:08 +02:00 |
Sandro La Bruzzo
|
b84e0cabeb
|
Implemented new method for update baseline
|
2021-10-05 16:34:47 +02:00 |
Miriam Baglioni
|
e653756e3d
|
applied some suggestiond from Sonar Lint
|
2021-10-04 18:40:07 +02:00 |
Miriam Baglioni
|
c4ccd7b32c
|
-
|
2021-10-01 12:59:47 +02:00 |
Miriam Baglioni
|
c8321ad31a
|
merge with branch beta
|
2021-10-01 12:59:08 +02:00 |
Claudio Atzori
|
ebf53a1616
|
added cleaning for relation fields: subRelType & relClass according to dedicated vocabs
|
2021-09-15 16:10:37 +02:00 |
Sandro La Bruzzo
|
45898c71ac
|
fixed wrong doi in pubmed
|
2021-08-24 15:20:04 +02:00 |
Alessia Bardi
|
00a28c0080
|
originalId was renamed to acronym
|
2021-08-23 15:02:21 +02:00 |
Alessia Bardi
|
f19b04d41b
|
code formatting after mvn compile
|
2021-08-23 14:33:39 +02:00 |
Alessia Bardi
|
931f430129
|
Merge branch 'beta' into datasource_model_eosc_beta
|
2021-08-23 11:57:21 +02:00 |
Alessia Bardi
|
4c1474e693
|
Dealing with #6859#note-2: we have to decode URLs to avoid & and other chars encoded becasue of the original XML representation of data
|
2021-08-20 17:03:30 +02:00 |
Claudio Atzori
|
f74adc4752
|
added DownloadCSV2 as alternative implementation of the same download procedure
|
2021-08-13 15:52:15 +02:00 |
Claudio Atzori
|
5f0903d50d
|
fixed CSV downloader & tests
|
2021-08-13 14:17:54 +02:00 |
Claudio Atzori
|
baed5e3337
|
test classes moved in specific components
|
2021-08-13 12:14:47 +02:00 |
Claudio Atzori
|
3359f73fcf
|
cleanup & best practices
|
2021-08-13 12:00:42 +02:00 |
Miriam Baglioni
|
f4ec81c92c
|
mergin with branch beta
|
2021-08-13 10:31:35 +02:00 |
Miriam Baglioni
|
32fd75691f
|
refactoring
|
2021-08-13 10:15:42 +02:00 |
Miriam Baglioni
|
01db1f8bc4
|
GetCSV refactoring - removed not needed import
|
2021-08-13 10:14:17 +02:00 |
Claudio Atzori
|
9587d4aee8
|
Merge branch 'beta' into hostedbymap
|
2021-08-12 17:04:30 +02:00 |
Claudio Atzori
|
86d940044c
|
added test to verify bad records from FWF-E-Book-Library
|
2021-08-12 11:32:56 +02:00 |
Claudio Atzori
|
8cdce59e0e
|
[graph raw] let the mapping exceptions propagate
|
2021-08-12 11:32:26 +02:00 |
Miriam Baglioni
|
785db1d5b2
|
refactoring
|
2021-08-11 17:44:07 +02:00 |
Miriam Baglioni
|
8229632839
|
adding assertions to the mapping of the unibi part of gold list
|
2021-08-11 16:36:01 +02:00 |
Miriam Baglioni
|
8da3a25cf6
|
merging with branch beta
|
2021-08-11 15:55:34 +02:00 |
Claudio Atzori
|
9f4db73f30
|
updated/fixed unit tests
|
2021-08-11 15:02:51 +02:00 |
Claudio Atzori
|
61d811ba53
|
suggestions from intellij
|
2021-08-11 12:18:20 +02:00 |
Claudio Atzori
|
2ee21da43b
|
suggestions from SonarLint
|
2021-08-11 12:13:22 +02:00 |
Miriam Baglioni
|
b954fe9ba8
|
mergin with branch beta
|
2021-08-11 10:12:46 +02:00 |
Miriam Baglioni
|
b688567db5
|
hostedbymap - modified part of test to check the bestaccessright changed
|
2021-08-11 10:12:10 +02:00 |
Miriam Baglioni
|
a90bac3bc9
|
Graph Dump - added method to test class to verify addition of validation date in projects for community result
|
2021-08-09 16:36:54 +02:00 |
Miriam Baglioni
|
bc9e3a06ba
|
Graph Dump - extended the test class
|
2021-08-09 15:46:06 +02:00 |
Miriam Baglioni
|
eff499af9f
|
added new tests and changed the test example
|
2021-08-09 11:12:30 +02:00 |
Miriam Baglioni
|
c3931557e3
|
extended the logic of the dump to consider the validation date in the relation (also in the dumped result for communities and funders at the level of the project), the extention on the instance for the APC, the pid, the alternate identifiers, and the extention of the AccessRight to store the OpenAccessRoute. Added new resourec for testing and extended the old class to verify the new dump. Fixed also issue on relation dump: only relation whose source and target are entities in the graph are dumped. The same hold for references to projects
|
2021-08-06 18:56:18 +02:00 |
Miriam Baglioni
|
6bd1eca7e0
|
merge branch with beta
|
2021-08-05 15:23:32 +02:00 |
Miriam Baglioni
|
ee13da9258
|
merge branch with master
|
2021-08-05 11:34:20 +02:00 |
Claudio Atzori
|
83c04e5d28
|
mapping test for dataset records adapted to reflect the delegated pid authority (zenodo)
|
2021-08-04 10:37:57 +02:00 |
Miriam Baglioni
|
eb8c3f8594
|
Hosted By Map - test modified because of the application of the new aggregator on datasources
|
2021-08-04 10:19:17 +02:00 |
Miriam Baglioni
|
ee7ccb98dc
|
Hosted By Map - test class to verify the application of the hbm to results and datasource
|
2021-08-02 19:36:18 +02:00 |
Miriam Baglioni
|
90e91486e2
|
Hosted By Map - test class to verify each step in the preparation process
|
2021-08-02 19:35:52 +02:00 |
Miriam Baglioni
|
1695d45bd4
|
Hosted By Map - Test class to verify the preparation of the intermediate information
|
2021-07-30 17:57:01 +02:00 |
Miriam Baglioni
|
d1807781c0
|
mergin with branch beta
|
2021-07-30 14:34:07 +02:00 |
Miriam Baglioni
|
1d6ac3715b
|
merge branch with beta
|
2021-07-30 11:58:29 +02:00 |
Claudio Atzori
|
19620eed46
|
applying PR#131, Patch the identifiers (source/target) in the relations, refinements
|
2021-07-30 11:09:32 +02:00 |
Claudio Atzori
|
a6a38cca9e
|
fixed implementation of PatchRelationsApplication, refined the relative unit test
|
2021-07-30 11:06:11 +02:00 |
Claudio Atzori
|
081fe92a21
|
Merge branch 'fct_project_id_replacement' of https://code-repo.d4science.org/D-Net/dnet-hadoop into fct_project_id_replacement
|
2021-07-30 10:13:56 +02:00 |
Claudio Atzori
|
576693d782
|
added unit test for PatchRelationsApplication
|
2021-07-30 10:13:33 +02:00 |
Miriam Baglioni
|
baad01cadc
|
hostedbymap
|
2021-07-29 13:04:39 +02:00 |
Claudio Atzori
|
a9961a1835
|
[cleaning] title cleaning based on the me.xuender:unidecode library
|
2021-07-28 16:36:33 +02:00 |
Claudio Atzori
|
e1797c0a42
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-07-28 16:21:36 +02:00 |
Claudio Atzori
|
6dddad86ee
|
[cleaning] title cleaning based on the me.xuender:unidecode library
|
2021-07-28 16:21:29 +02:00 |
Alessia Bardi
|
c806387d4b
|
tests for enermaps
|
2021-07-28 11:54:36 +02:00 |
Claudio Atzori
|
2fff24df55
|
code formatting
|
2021-07-28 11:34:19 +02:00 |
Michele Artini
|
9f1c7b8e17
|
tests
|
2021-07-28 11:32:34 +02:00 |
Miriam Baglioni
|
708d0ade34
|
Merge branch 'beta' into hostedbymap
|
2021-07-28 10:37:22 +02:00 |
Miriam Baglioni
|
0424f47494
|
HostedByMap fixing issues
|
2021-07-28 10:24:13 +02:00 |
Claudio Atzori
|
5aa7d16d1b
|
updated assertions in eu.dnetlib.dhp.oa.graph.raw.MappersTest
|
2021-07-27 15:11:58 +02:00 |
Claudio Atzori
|
998b66855a
|
updated assertions in eu.dnetlib.dhp.oa.graph.raw.MappersTest
|
2021-07-27 15:11:37 +02:00 |
Miriam Baglioni
|
35e395eae8
|
merge with master
|
2021-07-27 12:34:59 +02:00 |
Miriam Baglioni
|
eb07f7f40f
|
Hosted By Map
|
2021-07-27 12:27:26 +02:00 |
Alessia Bardi
|
9069958479
|
tests for enermaps
|
2021-07-20 19:31:43 +02:00 |
Miriam Baglioni
|
774cdb190e
|
changes to mirror the last dump of the graph with the ols data model.
|
2021-07-13 18:57:24 +02:00 |
Miriam Baglioni
|
618d2de2da
|
minor changes and refactoring
|
2021-07-13 17:10:02 +02:00 |
Miriam Baglioni
|
59615da65e
|
Add test to verify the creation of relation between context and projects
|
2021-07-13 17:09:15 +02:00 |
Miriam Baglioni
|
5295d10691
|
added check not to dump deletedByInference entities
|
2021-07-13 16:11:46 +02:00 |
Miriam Baglioni
|
39b1a6edf6
|
added test class for the selection of valid relations and description
|
2021-07-13 15:23:09 +02:00 |
Miriam Baglioni
|
6410ab71d8
|
added APC in the dump and test method
|
2021-07-13 15:13:58 +02:00 |
Miriam Baglioni
|
87a6e2b967
|
extended test class
|
2021-07-13 14:38:28 +02:00 |
Sandro La Bruzzo
|
4c54bd8742
|
add test to verify merge scholix on source
|
2021-07-06 11:32:14 +02:00 |
Sandro La Bruzzo
|
c952c8d236
|
generate first side of scholix mapping
|
2021-07-06 09:53:14 +02:00 |
Sandro La Bruzzo
|
c6fa8598e1
|
massive code refactor:
removed modules dhp-*-scholexplorer
|
2021-07-01 22:13:45 +02:00 |
Sandro La Bruzzo
|
623a0c4edb
|
code Refactor, renaming packages
|
2021-06-30 11:09:30 +02:00 |
Sandro La Bruzzo
|
075055eaca
|
added relation dates in bio mapping
|
2021-06-29 10:33:09 +02:00 |
Sandro La Bruzzo
|
f36f92287d
|
implemented mapping from Crossref Event Data to Oaf
|
2021-06-29 10:21:23 +02:00 |
Sandro La Bruzzo
|
511ec14c63
|
implemented mapping from EBI and Scholix Resolved to OAF
|
2021-06-28 22:04:22 +02:00 |
Sandro La Bruzzo
|
ad50415167
|
Merge remote-tracking branch 'origin/stable_ids' into stable_id_scholexplorer
|
2021-06-24 17:20:50 +02:00 |
Sandro La Bruzzo
|
80e15cc455
|
implemented mapping from uniprot, pdb and ebi links
|
2021-06-24 17:20:00 +02:00 |
Claudio Atzori
|
2e8fd2c531
|
cleanup
|
2021-06-23 14:38:24 +02:00 |
Sandro La Bruzzo
|
080a280bea
|
added pdb to Oaf Transformation
|
2021-06-21 16:23:59 +02:00 |
Sandro La Bruzzo
|
4fe7b75644
|
renamed packages
|
2021-06-18 16:41:24 +02:00 |
Sandro La Bruzzo
|
cc0f2b11fb
|
Implemented mapping from pubmed baseline to OAF
|
2021-06-16 14:56:24 +02:00 |
Claudio Atzori
|
2039bb9f5f
|
orcid / orcid_pending cleaning backported from master branch
|
2021-06-14 09:40:50 +02:00 |
Claudio Atzori
|
dd19c4ac5a
|
Merge pull request 'import_new_mdstores' (#112) from import_new_mdstores into stable_ids
Reviewed-on: #112
|
2021-06-14 09:23:55 +02:00 |
Claudio Atzori
|
a900bfb874
|
delegating the date parsing to https://github.com/sisyphsu/dateparser
|
2021-06-11 16:53:01 +02:00 |
Sandro La Bruzzo
|
e57294ac99
|
implemented changes on PUBMed dataflow
|
2021-06-03 10:52:09 +02:00 |
Michele Artini
|
f0fbfdcfae
|
Merge branch 'stable_ids' into import_new_mdstores
|
2021-06-01 12:03:00 +02:00 |
Michele Artini
|
03a510859a
|
removed coalesce(1)
|
2021-05-31 14:10:51 +02:00 |
Michele Artini
|
e9f2b6037c
|
patch of mdstore records
|
2021-05-31 11:36:26 +02:00 |
Claudio Atzori
|
6e3a4e9237
|
updated test expectations
|
2021-05-28 09:37:50 +02:00 |
Claudio Atzori
|
9d725efdc1
|
reverted implementation of the mdstore client
|
2021-05-20 18:26:09 +02:00 |
Claudio Atzori
|
ae5c28e54f
|
code formatting
|
2021-05-20 16:13:06 +02:00 |
Claudio Atzori
|
232dce83db
|
fixes #6701: xpath for titles to support both datacite and Guidelines v4 mapping
|
2021-05-20 14:41:15 +02:00 |
Claudio Atzori
|
23b8883ab1
|
applied intellij code cleanup
|
2021-05-14 10:58:12 +02:00 |
Claudio Atzori
|
d1cbee8413
|
imported methods from CleaningFunctions, defined in GraphCleaningFunctions
|
2021-05-10 16:43:39 +02:00 |
Claudio Atzori
|
d4a30fabe3
|
clean up tests
|
2021-05-05 17:28:15 +02:00 |
Claudio Atzori
|
dccaf173cf
|
fixed mapping applied to ODF records. Added unit test to verify the mapping for OpenTrials
|
2021-05-05 16:36:15 +02:00 |
Claudio Atzori
|
2e1eb96f9a
|
code formatting
|
2021-05-05 11:23:57 +02:00 |
Claudio Atzori
|
fb930b84d3
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-05-04 18:06:30 +02:00 |
Claudio Atzori
|
923d19ea8e
|
mdstore read lock/unlock when bulk copying records from mongodb to hdfs
|
2021-05-04 18:06:21 +02:00 |
Sandro La Bruzzo
|
714b71bd21
|
updated pubmed
|
2021-05-04 14:54:12 +02:00 |
Sandro La Bruzzo
|
2129e9caa7
|
updated pangaea transformation to parse directly the xml
|
2021-04-28 10:21:03 +02:00 |
Claudio Atzori
|
5afa7d3e0c
|
core utilities in dhp-common moved in external module dhp-schemas
|
2021-04-27 15:44:01 +02:00 |
Sandro La Bruzzo
|
7f8848ecdd
|
added first implementation of Pangaea Mapping
|
2021-04-27 11:30:37 +02:00 |
Claudio Atzori
|
d0d477cca3
|
code formatting
|
2021-04-20 12:50:34 +02:00 |
miconis
|
0393cdce42
|
addition of alternative names in export queries
|
2021-04-20 12:45:21 +02:00 |
Claudio Atzori
|
d1ca025b0b
|
[cleaning] remiving authors without fullname or providing 'deactivated' keyword. Removing test test titles
|
2021-04-13 14:32:41 +02:00 |
Claudio Atzori
|
827e7e37db
|
[Cleaning] drop instance.alternateIdentifier elements when they are available among instance.pid
|
2021-03-25 11:07:59 +01:00 |
Claudio Atzori
|
751125fdf9
|
[Actionmanager] zero function considers empty entity.id as well as rel.source/rel.target
|
2021-03-23 17:34:32 +01:00 |
Claudio Atzori
|
b4febed138
|
updated mapping tests as consequence of the special treatment reserved to Handle PIDs
|
2021-03-23 09:37:48 +01:00 |
Claudio Atzori
|
431cbe9955
|
handle missing instance.pid during bulk cleaning
|
2021-03-23 09:28:58 +01:00 |
Sandro La Bruzzo
|
c73072079d
|
fix conflicts
|
2021-03-22 16:36:31 +01:00 |
Claudio Atzori
|
8257f9a2bc
|
result.pid: adjusted the mapping applied to the contents from the aggregator
|
2021-03-17 12:45:38 +01:00 |
Claudio Atzori
|
640b885706
|
added instance.alternativeIdentifiers to the graph model, adjusted the mapping applied to the contents from the aggregator
|
2021-03-16 14:19:32 +01:00 |
Claudio Atzori
|
01630f638d
|
IdentifierFactory implementation based on the list of datasources authoritative for a given pid type
|
2021-03-09 17:11:50 +01:00 |
Claudio Atzori
|
59532b0919
|
[#6281 Provenance of product PIDs] Added PIDs to the Instance type; extended mapping for OAF/ODF records
|
2021-03-09 11:14:45 +01:00 |
Claudio Atzori
|
f468c7f0d7
|
merged from master
|
2021-03-09 09:12:41 +01:00 |
Claudio Atzori
|
8d2bb24512
|
merged from master
|
2021-03-08 15:44:34 +01:00 |
Alessia Bardi
|
c4d1feca74
|
mapper test with validated link to project
|
2021-02-10 11:22:54 +01:00 |
Alessia Bardi
|
c67329d3ad
|
updated test for EU Open Data portal datasets
|
2021-02-03 17:06:48 +01:00 |
Alessia Bardi
|
fd705404a1
|
tests for EU Open Data portal dataset mapping
|
2021-02-03 10:28:17 +01:00 |
Sandro La Bruzzo
|
686e7b507c
|
Merge branch 'hadoop_aggregator' of code-repo.d4science.org:D-Net/dnet-hadoop into aggregation_on_hadoop
|
2021-01-28 10:02:13 +01:00 |
Sandro La Bruzzo
|
98b9498b57
|
Removed old messaging system not quite used from collection and Transformation workflow
code refactor
|
2021-01-28 09:51:17 +01:00 |
Sandro La Bruzzo
|
150a617bd1
|
Merge pull request 'aggregation_on_hadoop' (#90) from sandro.labruzzo/dnet-hadoop:aggregation_on_hadoop into hadoop_aggregator
Wonderfull code... You're the Best Sandro
|
2021-01-26 16:00:47 +01:00 |
Alessia Bardi
|
505477f36f
|
format code
|
2021-01-25 18:02:49 +01:00 |
Alessia Bardi
|
ded6ed8d7d
|
no ',' author, if there are no author in ODF records
|
2021-01-25 17:57:51 +01:00 |
Sandro La Bruzzo
|
a54848a59c
|
Moved Vocabulary stuff to common module
|
2021-01-25 15:43:04 +01:00 |
Claudio Atzori
|
47270d9af5
|
lenient mock can be lenient
|
2020-12-18 15:38:59 +01:00 |
Claudio Atzori
|
12e2f930c8
|
resolved conflicts
|
2020-12-10 10:57:39 +01:00 |
Alessia Bardi
|
112da6d76a
|
in theory, just auto-formatting after mvn compile
|
2020-12-09 20:00:27 +01:00 |
Alessia Bardi
|
bece04b330
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-12-09 19:54:43 +01:00 |