Alessia Bardi
|
ccf4103a25
|
keep the original url if the decoder fails for any reason
|
2021-08-25 10:07:58 +02:00 |
Sandro La Bruzzo
|
45898c71ac
|
fixed wrong doi in pubmed
|
2021-08-24 15:20:04 +02:00 |
Alessia Bardi
|
00a28c0080
|
originalId was renamed to acronym
|
2021-08-23 15:02:21 +02:00 |
Alessia Bardi
|
f19b04d41b
|
code formatting after mvn compile
|
2021-08-23 14:33:39 +02:00 |
Alessia Bardi
|
931f430129
|
Merge branch 'beta' into datasource_model_eosc_beta
|
2021-08-23 11:57:21 +02:00 |
Alessia Bardi
|
4c1474e693
|
Dealing with #6859#note-2: we have to decode URLs to avoid & and other chars encoded because of the original XML representation of data
|
2021-08-20 17:03:30 +02:00 |
Miriam Baglioni
|
5f8ccbc365
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-08-20 11:13:47 +02:00 |
Miriam Baglioni
|
882abb40e4
|
CrossrefDump -
|
2021-08-20 11:12:53 +02:00 |
Miriam Baglioni
|
45c62609af
|
CrossrefDump - modified because parameter file was moved
|
2021-08-20 11:12:31 +02:00 |
Miriam Baglioni
|
35880c0e7b
|
CrossrefDump - changed the wf to be able to resume from one of the steps
|
2021-08-20 11:11:35 +02:00 |
Miriam Baglioni
|
f3b6c392c1
|
CrossrefDump - moving parameter file under folder crossref_dump_reader
|
2021-08-20 11:10:58 +02:00 |
Miriam Baglioni
|
65822400ce
|
CrossrefDump - added new parameter file that was missing
|
2021-08-20 11:10:35 +02:00 |
Alessia Bardi
|
a053e1513c
|
different funders in blacklist from BETA and PROD aggregator
|
2021-08-19 11:32:27 +02:00 |
Alessia Bardi
|
812bd54c57
|
different funders in blacklist from BETA and PROD aggregator
|
2021-08-19 11:30:14 +02:00 |
Miriam Baglioni
|
a65d3caaea
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-08-19 10:29:10 +02:00 |
Miriam Baglioni
|
e5cf11d088
|
change open access route to result matching hbm to gold
|
2021-08-19 10:29:04 +02:00 |
Claudio Atzori
|
7c0c67bdd6
|
added mock pom
|
2021-08-13 17:45:53 +02:00 |
Claudio Atzori
|
82086f3422
|
fixed directory name
|
2021-08-13 17:42:14 +02:00 |
Claudio Atzori
|
bc7068106c
|
added crossref download oozie workflow
|
2021-08-13 17:19:44 +02:00 |
Claudio Atzori
|
2c0a05f11a
|
manually merged PR#139
|
2021-08-13 17:15:53 +02:00 |
Claudio Atzori
|
d43667d857
|
Merge pull request 'Automatic download of Crossref' (#138) from crossref_dw_wf into beta
Reviewed-on: D-Net/dnet-hadoop#138
|
2021-08-13 17:10:10 +02:00 |
Miriam Baglioni
|
5856ca8a7b
|
merging with branch beta - resolved conflicts
|
2021-08-13 16:45:45 +02:00 |
Miriam Baglioni
|
6fec71e8d2
|
removed from the wf name the specifics of the infra we are running the wf on
|
2021-08-13 16:39:02 +02:00 |
Miriam Baglioni
|
ed7e28490a
|
change in sh
|
2021-08-13 16:19:01 +02:00 |
Claudio Atzori
|
7743d0f919
|
consolidated dnet wf profiles into the same submodule
|
2021-08-13 16:14:54 +02:00 |
Miriam Baglioni
|
6eb7508995
|
merging with branch beta
|
2021-08-13 16:07:04 +02:00 |
Claudio Atzori
|
f74adc4752
|
added DownloadCSV2 as alternative implementation of the same download procedure
|
2021-08-13 15:52:15 +02:00 |
Claudio Atzori
|
5f0903d50d
|
fixed CSV downloader & tests
|
2021-08-13 14:17:54 +02:00 |
Claudio Atzori
|
17cefe6a97
|
[HBM] removed stale replace option
|
2021-08-13 12:43:59 +02:00 |
Claudio Atzori
|
7ee2757fcd
|
fixed DownloadCSV parameters spec; workflow patching the hostedby replaces the graph content (publication, datasource) rather than creating a copy
|
2021-08-13 12:41:01 +02:00 |
Claudio Atzori
|
c3ad4ab701
|
minor fixes
|
2021-08-13 12:23:15 +02:00 |
Claudio Atzori
|
baed5e3337
|
test classes moved in specific components
|
2021-08-13 12:14:47 +02:00 |
Claudio Atzori
|
3359f73fcf
|
cleanup & best practices
|
2021-08-13 12:00:42 +02:00 |
Miriam Baglioni
|
f4ec81c92c
|
merging with branch beta
|
2021-08-13 10:31:35 +02:00 |
Miriam Baglioni
|
dc8b05b39e
|
Hosted By Map - changed the association with the datasource id for the hostedby element: there is no longer any need to compute it. With the new HBM it is already the id in the graph
|
2021-08-13 10:18:25 +02:00 |
Miriam Baglioni
|
32fd75691f
|
refactoring
|
2021-08-13 10:15:42 +02:00 |
Miriam Baglioni
|
01db1f8bc4
|
GetCSV refactoring - removed not needed import
|
2021-08-13 10:14:17 +02:00 |
Miriam Baglioni
|
964a46ca21
|
GetCSV refactoring - modified due to movement of classes
|
2021-08-13 10:11:18 +02:00 |
Miriam Baglioni
|
eaf077fc34
|
GetCSV refactoring - removed not needed dependency
|
2021-08-13 10:08:58 +02:00 |
Miriam Baglioni
|
5f674efb0c
|
moved dependency version in external pom
|
2021-08-13 10:07:53 +02:00 |
Miriam Baglioni
|
5cd5714530
|
GetCSV refactoring - added ignore annotation for fields not in input csv
|
2021-08-13 10:06:49 +02:00 |
Miriam Baglioni
|
ed183d878e
|
GetCSV refactoring - modified test classes due to change in the model of projects and programme
|
2021-08-13 09:28:51 +02:00 |
Miriam Baglioni
|
8769dd8eef
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:20:56 +02:00 |
Miriam Baglioni
|
6b9e1bf2e3
|
GetCSV refactoring - removing not needed dependency
|
2021-08-12 18:17:50 +02:00 |
Miriam Baglioni
|
d57b2bb927
|
GetCSV refactoring - removing not needed dependency
|
2021-08-12 18:12:51 +02:00 |
Miriam Baglioni
|
9da74b544a
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:12:15 +02:00 |
Miriam Baglioni
|
ab8abd61bb
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:11:07 +02:00 |
Miriam Baglioni
|
335a824e34
|
GetCSV refactoring - fixed issue
|
2021-08-12 18:10:10 +02:00 |
Miriam Baglioni
|
f0845e9865
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:04:58 +02:00 |
Miriam Baglioni
|
7a789423aa
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:04:27 +02:00 |
Miriam Baglioni
|
e9fc3ef3bc
|
GetCSV refactoring - changed to use the new class to get and write the csv file
|
2021-08-12 18:03:41 +02:00 |
Miriam Baglioni
|
4317211a2b
|
GetCSV refactoring - refactoring due to movement
|
2021-08-12 18:03:14 +02:00 |
Miriam Baglioni
|
b62cd656a7
|
GetCSV refactoring - changed the model to store only the information needed
|
2021-08-12 18:01:10 +02:00 |
Miriam Baglioni
|
d36e925277
|
GetCSV refactoring - moved under model package
|
2021-08-12 18:00:21 +02:00 |
Miriam Baglioni
|
6e84b3951f
|
GetCSV refactoring - moving classes to dhp-common that have dependency with GetCSV class (that was located in graph-mapper)
|
2021-08-12 17:57:41 +02:00 |
Claudio Atzori
|
9587d4aee8
|
Merge branch 'beta' into hostedbymap
|
2021-08-12 17:04:30 +02:00 |
Claudio Atzori
|
86d940044c
|
added test to verify bad records from FWF-E-Book-Library
|
2021-08-12 11:32:56 +02:00 |
Claudio Atzori
|
8cdce59e0e
|
[graph raw] let the mapping exceptions propagate
|
2021-08-12 11:32:26 +02:00 |
Miriam Baglioni
|
08dd2b2102
|
moving the dependency version to the external pom file
|
2021-08-11 18:09:41 +02:00 |
Miriam Baglioni
|
ac417ca798
|
removed not needed test resource
|
2021-08-11 17:50:33 +02:00 |
Miriam Baglioni
|
e33daaeee8
|
reverting
|
2021-08-11 17:46:19 +02:00 |
Miriam Baglioni
|
785db1d5b2
|
refactoring
|
2021-08-11 17:44:07 +02:00 |
Miriam Baglioni
|
95e5482bbb
|
removing not needed dependency
|
2021-08-11 17:42:26 +02:00 |
Miriam Baglioni
|
b966329833
|
reverting
|
2021-08-11 17:37:00 +02:00 |
Miriam Baglioni
|
8ad7c71417
|
reverting
|
2021-08-11 17:36:12 +02:00 |
Miriam Baglioni
|
0e1a6bec20
|
reverting
|
2021-08-11 17:32:29 +02:00 |
Miriam Baglioni
|
c6a2a780a9
|
reverting
|
2021-08-11 17:30:17 +02:00 |
Miriam Baglioni
|
b6b58bba28
|
reverting
|
2021-08-11 17:25:37 +02:00 |
Miriam Baglioni
|
804589eb30
|
reverting
|
2021-08-11 17:23:35 +02:00 |
Miriam Baglioni
|
d688749ad9
|
reverting
|
2021-08-11 17:22:28 +02:00 |
Miriam Baglioni
|
524c06e028
|
reverting
|
2021-08-11 17:20:30 +02:00 |
Miriam Baglioni
|
7aa3260729
|
reverting
|
2021-08-11 17:18:45 +02:00 |
Miriam Baglioni
|
55fc500d8d
|
reverting
|
2021-08-11 17:17:48 +02:00 |
Miriam Baglioni
|
8229632839
|
adding assertions to the mapping of the unibi part of gold list
|
2021-08-11 16:36:01 +02:00 |
Miriam Baglioni
|
b1c6140ebf
|
removed all comments in Italian
|
2021-08-11 16:23:33 +02:00 |
Miriam Baglioni
|
52c18c2697
|
removed not needed test class. The functionality has been moved
|
2021-08-11 16:16:55 +02:00 |
Miriam Baglioni
|
8da3a25cf6
|
merging with branch beta
|
2021-08-11 15:55:34 +02:00 |
Claudio Atzori
|
9f4db73f30
|
updated/fixed unit tests
|
2021-08-11 15:02:51 +02:00 |
Claudio Atzori
|
61d811ba53
|
suggestions from intellij
|
2021-08-11 12:18:20 +02:00 |
Claudio Atzori
|
2ee21da43b
|
suggestions from SonarLint
|
2021-08-11 12:13:22 +02:00 |
Miriam Baglioni
|
b954fe9ba8
|
merging with branch beta
|
2021-08-11 10:12:46 +02:00 |
Miriam Baglioni
|
b688567db5
|
hostedbymap - modified part of test to check the bestaccessright changed
|
2021-08-11 10:12:10 +02:00 |
Miriam Baglioni
|
9731a6144a
|
hostedbymap - in case the journal is open access the access may be changed also for the best access right in the result
|
2021-08-10 17:49:45 +02:00 |
Miriam Baglioni
|
a90bac3bc9
|
Graph Dump - added method to test class to verify addition of validation date in projects for community result
|
2021-08-09 16:36:54 +02:00 |
Miriam Baglioni
|
bd0d7bfba7
|
Graph Dump - added resources for testing addition of validation date in project for communityresult
|
2021-08-09 16:36:17 +02:00 |
Miriam Baglioni
|
8daaa32e90
|
Graph Dump - added resources for testing
|
2021-08-09 15:46:29 +02:00 |
Miriam Baglioni
|
bc9e3a06ba
|
Graph Dump - extended the test class
|
2021-08-09 15:46:06 +02:00 |
Claudio Atzori
|
d64a942a76
|
fixed MappersTest
|
2021-08-09 12:32:26 +02:00 |
Miriam Baglioni
|
2efa5abda5
|
refactoring
|
2021-08-09 12:28:36 +02:00 |
Claudio Atzori
|
577f3b1ac8
|
added dnet workflows responsible for the graph construction, enrichment, provision
|
2021-08-09 11:53:58 +02:00 |
Miriam Baglioni
|
da20fceaf7
|
removed all the part related to the crossref dump download since it is done in a separate workflow
|
2021-08-09 11:53:45 +02:00 |
Claudio Atzori
|
964f97ed4d
|
cleanup
|
2021-08-09 11:53:06 +02:00 |
Miriam Baglioni
|
54a6cbb244
|
CrossrefDump - put token among the parameters
|
2021-08-09 11:41:10 +02:00 |
Miriam Baglioni
|
b7079804cb
|
CrossrefDump - put token among the parameters
|
2021-08-09 11:34:35 +02:00 |
Miriam Baglioni
|
a5f82f442b
|
Merge branch 'beta' into doiboost_wf
|
2021-08-09 11:17:51 +02:00 |
Miriam Baglioni
|
b6dcf89d22
|
merging with branch beta
|
2021-08-09 11:14:43 +02:00 |
Miriam Baglioni
|
eff499af9f
|
added new tests and changed the test example
|
2021-08-09 11:12:30 +02:00 |
Claudio Atzori
|
a45b95ccc1
|
resolving conflicts for PR#134
|
2021-08-09 10:50:03 +02:00 |
Miriam Baglioni
|
5d70f842eb
|
merging with branch beta
|
2021-08-06 18:57:09 +02:00 |
Miriam Baglioni
|
c3931557e3
|
extended the logic of the dump to consider the validation date in the relation (also in the dumped result for communities and funders at the level of the project), the extension on the instance for the APC, the pid, the alternate identifiers, and the extension of the AccessRight to store the OpenAccessRoute. Added new resource for testing and extended the old class to verify the new dump. Also fixed an issue on the relation dump: only relations whose source and target are entities in the graph are dumped. The same holds for references to projects
|
2021-08-06 18:56:18 +02:00 |
Claudio Atzori
|
66f398fe6f
|
Merge pull request '[stats] fixed a typo' (#133) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#133
|
2021-08-06 14:29:57 +02:00 |
Miriam Baglioni
|
6bd1eca7e0
|
merge branch with beta
|
2021-08-05 15:23:32 +02:00 |
Miriam Baglioni
|
73dc082927
|
added new dumped field (openaccessroute, pid and alternate identifier at the level of the instance) and the bipFinder measure at the level of the result
|
2021-08-05 15:20:50 +02:00 |
Miriam Baglioni
|
ee13da9258
|
merge branch with master
|
2021-08-05 11:34:20 +02:00 |
Miriam Baglioni
|
bd096f5170
|
removed not needed param file
|
2021-08-05 10:55:43 +02:00 |
Miriam Baglioni
|
5faeefbda8
|
added script to download the dump, changed the workflow input parameters
|
2021-08-05 10:54:03 +02:00 |
Miriam Baglioni
|
1965e4eece
|
new workflow for downloading the dump of crossref and unpack it
|
2021-08-04 18:29:03 +02:00 |
Claudio Atzori
|
83c04e5d28
|
mapping test for dataset records adapted to reflect the delegated pid authority (zenodo)
|
2021-08-04 10:37:57 +02:00 |
Miriam Baglioni
|
b4eb026c8b
|
merging with branch beta
|
2021-08-04 10:21:37 +02:00 |
Miriam Baglioni
|
c7b71647c6
|
Hosted By Map - modification of the resource for testing the presence of only one entry per datasource id
|
2021-08-04 10:20:02 +02:00 |
Miriam Baglioni
|
eb8c3f8594
|
Hosted By Map - test modified because of the application of the new aggregator on datasources
|
2021-08-04 10:19:17 +02:00 |
Miriam Baglioni
|
e94ae0b1de
|
Hosted By Map - extension of the workflow to consider also the application of the map to publications and datasources
|
2021-08-04 10:18:11 +02:00 |
Miriam Baglioni
|
67ba4c40e0
|
Hosted By Map - added parameter resources
|
2021-08-04 10:17:28 +02:00 |
Miriam Baglioni
|
eccf3851b0
|
Hosted By Map - refactoring
|
2021-08-04 10:16:30 +02:00 |
Sandro La Bruzzo
|
74afe43c3a
|
fixed wrong test file
|
2021-08-04 10:16:17 +02:00 |
Miriam Baglioni
|
1e952cccf6
|
Hosted By Map - refactoring and deletion of not needed methods
|
2021-08-04 10:15:43 +02:00 |
Miriam Baglioni
|
8ba8c77f92
|
Hosted By Map - refactoring
|
2021-08-04 10:14:57 +02:00 |
Miriam Baglioni
|
8f7623e77a
|
Hosted By Map - refactoring and application of the new aggregator
|
2021-08-04 10:14:20 +02:00 |
Sandro La Bruzzo
|
3fc820203b
|
fixed wrong test file
|
2021-08-04 10:13:59 +02:00 |
Miriam Baglioni
|
a7bf314fd2
|
Hosted By Map - added new aggregator to get just one result per datasource id
|
2021-08-04 10:13:30 +02:00 |
Miriam Baglioni
|
9831725073
|
Hosted By Map - removed a not needed step from the workflow. The hbm will also take care of the integration of the unibi list of gold open access journals
|
2021-08-03 11:02:17 +02:00 |
Miriam Baglioni
|
100e54e6c8
|
merging with branch beta
|
2021-08-03 10:47:11 +02:00 |
Miriam Baglioni
|
461b8a29a0
|
removed not needed class
|
2021-08-03 10:46:51 +02:00 |
Miriam Baglioni
|
327cddde33
|
Hosted By Map - refactoring
|
2021-08-03 10:44:13 +02:00 |
Miriam Baglioni
|
17292c6641
|
Hosted By Map - resources for testing purposes
|
2021-08-02 19:37:08 +02:00 |
Miriam Baglioni
|
ee7ccb98dc
|
Hosted By Map - test class to verify the application of the hbm to results and datasource
|
2021-08-02 19:36:18 +02:00 |
Miriam Baglioni
|
90e91486e2
|
Hosted By Map - test class to verify each step in the preparation process
|
2021-08-02 19:35:52 +02:00 |
Miriam Baglioni
|
1e859706a3
|
Hosted By Map - Classes to apply the HBM to results and datasources
|
2021-08-02 19:35:23 +02:00 |
Miriam Baglioni
|
72df8f9232
|
Hosted By Map - removed the aggregator for the datasource (it is no longer needed) and added a new aggregator for the results. Changed also the hostedBYMap aggregator
|
2021-08-02 19:34:44 +02:00 |
Miriam Baglioni
|
ff1ce75e33
|
Hosted By Map - modification in the code to prepare the info needed to apply the HostedByMap. There is no need to join datasources with the hbm: all the information needed is in the hosted by map already
|
2021-08-02 19:32:59 +02:00 |
Claudio Atzori
|
e826aae848
|
using constants from ModelConstants
|
2021-08-02 14:28:59 +02:00 |
Antonis Lempesis
|
117c3d5c67
|
fixed a typo
|
2021-08-02 12:15:58 +03:00 |
Miriam Baglioni
|
1695d45bd4
|
Hosted By Map - Test class to verify the preparation of the intermediate information
|
2021-07-30 17:57:01 +02:00 |
Miriam Baglioni
|
7c6ea2f4c7
|
Hosted By Map - first attempt for the creation of intermediate information to be used to apply the hosted by map on the graph entities
|
2021-07-30 17:56:27 +02:00 |
Miriam Baglioni
|
d8b9b0553b
|
Hosted By Map - model classes to store the intermediate information to be used to apply the hosted by map
|
2021-07-30 17:55:39 +02:00 |
Miriam Baglioni
|
613bd3bde0
|
Hosted By Map - refactor of the first attempt to prepare a new hosted by map dependent on the datasource in the graph and on two external sources: the gold list from unibi and the doaj list of open access journals. Both lists are downloaded from the provided url parameter
|
2021-07-30 17:54:45 +02:00 |
Miriam Baglioni
|
d1807781c0
|
merging with branch beta
|
2021-07-30 14:34:07 +02:00 |
Miriam Baglioni
|
1d6ac3715b
|
merge branch with beta
|
2021-07-30 11:58:29 +02:00 |
Claudio Atzori
|
19620eed46
|
applying PR#131, Patch the identifiers (source/target) in the relations, refinements
|
2021-07-30 11:09:32 +02:00 |
Claudio Atzori
|
4f78565c04
|
fixed implementation of PatchRelationsApplication, refined the relative unit test
|
2021-07-30 11:07:09 +02:00 |
Claudio Atzori
|
a6a38cca9e
|
fixed implementation of PatchRelationsApplication, refined the relative unit test
|
2021-07-30 11:06:11 +02:00 |
Miriam Baglioni
|
9bc4fd3b69
|
Patch FCT relations - fixed issue with join
|
2021-07-30 10:34:05 +02:00 |
Miriam Baglioni
|
2fc89fc9b5
|
Merge branch 'fct_project_id_replacement' of https://code-repo.d4science.org/D-Net/dnet-hadoop into fct_project_id_replacement
|
2021-07-30 10:20:43 +02:00 |
Claudio Atzori
|
081fe92a21
|
Merge branch 'fct_project_id_replacement' of https://code-repo.d4science.org/D-Net/dnet-hadoop into fct_project_id_replacement
|
2021-07-30 10:13:56 +02:00 |
Claudio Atzori
|
576693d782
|
added unit test for PatchRelationsApplication
|
2021-07-30 10:13:33 +02:00 |
Claudio Atzori
|
55e6470f44
|
Merge pull request 'added the sprint 2 indicators in monitor db' (#129) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#129
|
2021-07-30 10:11:46 +02:00 |
Sandro La Bruzzo
|
6358f92c3a
|
added sleep to solve problem of lost request of creating index
|
2021-07-30 08:54:37 +02:00 |
Antonis Lempesis
|
26af0320d0
|
added the sprint 2 indicators in monitor db
|
2021-07-30 00:31:33 +03:00 |
Claudio Atzori
|
7b172e7cd9
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-07-29 13:57:06 +02:00 |
Claudio Atzori
|
c53d106e80
|
[provision] lowercase relation filter
|
2021-07-29 13:57:00 +02:00 |
Claudio Atzori
|
6e3554a45e
|
[provision] lowercase relation filter
|
2021-07-29 13:56:37 +02:00 |
Sandro La Bruzzo
|
b1b0cc3f15
|
fixed wrong package name
|
2021-07-29 13:55:08 +02:00 |
Miriam Baglioni
|
baad01cadc
|
hostedbymap
|
2021-07-29 13:04:39 +02:00 |
Claudio Atzori
|
e725c88ebb
|
[raw_all] patching relation identifier phase to be run at the end, i.e. includes also claimed relations
|
2021-07-29 13:03:43 +02:00 |
Claudio Atzori
|
5d08ad86ae
|
[raw_all] patching relation identifier phase to be run at the end, i.e. includes also claimed relations
|
2021-07-29 13:03:16 +02:00 |
Claudio Atzori
|
e87e1805c4
|
[raw_all] added extra workflow step for patching the identifiers in the relations, given an id mapping dataset
|
2021-07-29 12:13:06 +02:00 |
Claudio Atzori
|
5f7330d407
|
Merge branch 'master' into fct_project_id_replacement
|
2021-07-29 11:38:22 +02:00 |
Claudio Atzori
|
1923c1ce21
|
replaced full join + filtering with a left join
|
2021-07-29 11:36:20 +02:00 |
Claudio Atzori
|
dc55ed4acd
|
Merge pull request '[beta] stats update workflow' (#128) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#128
|
2021-07-29 11:13:21 +02:00 |
Claudio Atzori
|
908f57a475
|
code formatting
|
2021-07-29 10:49:39 +02:00 |
Sandro La Bruzzo
|
3721df7aa6
|
refactoring create actionset of scholexplorer, moved on package dhp-aggregation
|
2021-07-29 10:45:35 +02:00 |
Antonis Lempesis
|
4afa5215a9
|
fixed a NPE?
|
2021-07-28 21:59:12 +03:00 |
Antonis Lempesis
|
3d1580fa9b
|
fixed a typo
|
2021-07-28 18:50:31 +03:00 |
Claudio Atzori
|
4c5a71ba2f
|
[broker] updated relation descriptors, making use of constant values
|
2021-07-28 17:11:18 +02:00 |
Claudio Atzori
|
a9961a1835
|
[cleaning] title cleaning based on the me.xuender:unidecode library
|
2021-07-28 16:36:33 +02:00 |
Claudio Atzori
|
e1797c0a42
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-07-28 16:21:36 +02:00 |
Claudio Atzori
|
6dddad86ee
|
[cleaning] title cleaning based on the me.xuender:unidecode library
|
2021-07-28 16:21:29 +02:00 |
Sandro La Bruzzo
|
3d8f0f629b
|
implemented workflow of creation action set for scholexplorer
|
2021-07-28 16:15:34 +02:00 |
Antonis Lempesis
|
9b181ffa73
|
added the h2020 classification scheme for projects
|
2021-07-28 16:31:29 +03:00 |
Alessia Bardi
|
df8715a1ec
|
format code after mvn compile
|
2021-07-28 11:58:26 +02:00 |
Michele Artini
|
3e2a2d6e71
|
added new fields in xml
|
2021-07-28 11:56:55 +02:00 |
Alessia Bardi
|
c806387d4b
|
tests for enermaps
|
2021-07-28 11:54:36 +02:00 |
Alessia Bardi
|
9594343725
|
code formatting after mvn compile
|
2021-07-28 11:41:34 +02:00 |
Claudio Atzori
|
2fff24df55
|
code formatting
|
2021-07-28 11:34:19 +02:00 |
Michele Artini
|
9f1c7b8e17
|
tests
|
2021-07-28 11:32:34 +02:00 |
Antonis Lempesis
|
4a9741825d
|
added result_orcid, result_project provenance, issn in datasources
|
2021-07-28 12:28:04 +03:00 |
Miriam Baglioni
|
3d2bba3d5d
|
removing not needed classes
|
2021-07-28 11:25:43 +02:00 |
Miriam Baglioni
|
cc0d3d8a7b
|
merging with branch beta
|
2021-07-28 11:24:46 +02:00 |
Michele Artini
|
e6f1773d63
|
mapping of new eosc fields
|
2021-07-28 11:17:11 +02:00 |
Miriam Baglioni
|
80d5b3b4de
|
DoiBoost AccessRight #4362 - removing commented code
|
2021-07-28 11:16:49 +02:00 |
Miriam Baglioni
|
5fe016dcbc
|
DoiBoost AccessRight #4362 - related to https://code-repo.d4science.org/D-Net/dnet-hadoop/pulls/126/files#issuecomment-4194
|
2021-07-28 11:14:28 +02:00 |
Miriam Baglioni
|
73ed7374a9
|
merging with branch beta
|
2021-07-28 11:05:16 +02:00 |
Miriam Baglioni
|
43e62fcae9
|
DoiBoost AccessRight #4362 - related to https://code-repo.d4science.org/D-Net/dnet-hadoop/pulls/126/files#issuecomment-4193
|
2021-07-28 11:04:55 +02:00 |
Michele Artini
|
c72c960ffb
|
added eosc fields
|
2021-07-28 11:03:15 +02:00 |
Michele Artini
|
1fb572a33a
|
added eosc fields
|
2021-07-28 10:52:24 +02:00 |
Miriam Baglioni
|
708d0ade34
|
Merge branch 'beta' into hostedbymap
|
2021-07-28 10:37:22 +02:00 |
Sandro La Bruzzo
|
16c91203bd
|
implemented workflow of creation action set for scholexplorer
|
2021-07-28 10:30:49 +02:00 |
Miriam Baglioni
|
6c936943aa
|
merging with branch beta
|
2021-07-28 10:24:48 +02:00 |
Miriam Baglioni
|
0424f47494
|
HostedByMap fixing issues
|
2021-07-28 10:24:13 +02:00 |
Michele Artini
|
52e2315ba2
|
removed trick for datasourcetypeui
|
2021-07-28 10:23:00 +02:00 |
Claudio Atzori
|
d267dce520
|
[raw_all] added extra workflow step for patching the identifiers in the relations, given an id mapping dataset
|
2021-07-27 17:18:29 +02:00 |
Sandro La Bruzzo
|
825d9f0289
|
fixed datacite workflow starting from Importing delta
|
2021-07-27 16:09:46 +02:00 |
Claudio Atzori
|
5aa7d16d1b
|
updated assertions in eu.dnetlib.dhp.oa.graph.raw.MappersTest
|
2021-07-27 15:11:58 +02:00 |
Claudio Atzori
|
998b66855a
|
updated assertions in eu.dnetlib.dhp.oa.graph.raw.MappersTest
|
2021-07-27 15:11:37 +02:00 |
Antonis Lempesis
|
1a28a69cac
|
changed the citeee in *_citations to cites
|
2021-07-27 15:14:09 +03:00 |
Miriam Baglioni
|
74f801b689
|
merging with branch beta
|
2021-07-27 13:18:31 +02:00 |
Miriam Baglioni
|
35e395eae8
|
merge with master
|
2021-07-27 12:34:59 +02:00 |
Miriam Baglioni
|
eb07f7f40f
|
Hosted By Map
|
2021-07-27 12:27:26 +02:00 |
Antonis Lempesis
|
ed185fd7ed
|
added missing colons
|
2021-07-27 11:42:47 +03:00 |
Antonis Lempesis
|
f3b9570354
|
properly invalidating metadata
|
2021-07-26 13:00:16 +03:00 |
Sandro La Bruzzo
|
848aabbb6c
|
minor fix
|
2021-07-25 12:06:41 +02:00 |
Sandro La Bruzzo
|
8fac10c91e
|
fixed definition of the wf creating the final infospace of scholexplorer
|
2021-07-25 11:15:37 +02:00 |
Sandro La Bruzzo
|
3920c69bc8
|
change implementation of resolve Relation to generate jsonRdd in output
|
2021-07-25 09:51:36 +02:00 |
Antonis Lempesis
|
f9fbb0f261
|
added indicators second sprint
|
2021-07-24 16:40:28 +03:00 |
Claudio Atzori
|
a0393607a7
|
mapping funding relations from Datacite should be done according to the actual result identifier
|
2021-07-23 18:15:08 +02:00 |
Claudio Atzori
|
5b6844b969
|
mapping funding relations from Datacite should be done according to the actual result identifier
|
2021-07-23 18:14:37 +02:00 |
Sandro La Bruzzo
|
d9e3b89937
|
implemented last part of workflows to generate scholixGraph
|
2021-07-23 16:38:32 +02:00 |
Sandro La Bruzzo
|
cfde63a7c3
|
fixed resolve relation join
|
2021-07-23 14:17:29 +02:00 |
Sandro La Bruzzo
|
4a439c3863
|
NPE fixed
|
2021-07-23 14:17:29 +02:00 |
Sandro La Bruzzo
|
ca74e8dd02
|
create a separate wf for resolving relation
|
2021-07-23 11:40:06 +02:00 |
Sandro La Bruzzo
|
43e9380cd3
|
update resolve relation to use the same format of openaire graph
|
2021-07-23 11:25:18 +02:00 |
Sandro La Bruzzo
|
058b636d4d
|
added control to check if the entity exists
|
2021-07-22 16:08:54 +02:00 |
Sandro La Bruzzo
|
62ae36a3d2
|
fixed NPE
|
2021-07-22 15:41:38 +02:00 |
Miriam Baglioni
|
63553a76b3
|
added code to download gold issn list from unibi
|
2021-07-22 12:01:48 +02:00 |
Miriam Baglioni
|
1a5b114906
|
DoiBoost AccessRight #4362 - refactoring
|
2021-07-22 12:00:23 +02:00 |
Sandro La Bruzzo
|
31d2d6d41e
|
Scholexplorer: introduction of dedup openaire
|
2021-07-21 18:09:32 +02:00 |
Miriam Baglioni
|
b226ba4439
|
merging with branch beta
|
2021-07-21 09:46:40 +02:00 |
Alessia Bardi
|
9069958479
|
tests for enermaps
|
2021-07-20 19:31:43 +02:00 |
Claudio Atzori
|
10d7b4f0b4
|
filtering 'old' OpenAIRE ids from the entity.originalId[] array in the OAF -> XML serialization procedure
|
2021-07-20 11:52:05 +02:00 |
Claudio Atzori
|
77e8c6c7f7
|
filtering 'old' OpenAIRE ids from the entity.originalId[] array in the OAF -> XML serialization procedure
|
2021-07-20 11:51:33 +02:00 |
Miriam Baglioni
|
83fe31c92e
|
changed the name of the workflows
|
2021-07-19 18:19:14 +02:00 |
Miriam Baglioni
|
dd81c36b60
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-07-19 18:18:14 +02:00 |
Miriam Baglioni
|
54acc5373b
|
changed the name of the workflows
|
2021-07-19 18:18:09 +02:00 |
Miriam Baglioni
|
b420b11ed3
|
duplicate the number of partitions in ProcessMag
|
2021-07-19 18:16:23 +02:00 |
Claudio Atzori
|
65934888a1
|
adding record identifier among the originalIds regardless of what IdentifierFactory produces
|
2021-07-19 17:52:52 +02:00 |
Claudio Atzori
|
5947cddafc
|
adding record identifier among the originalIds regardless of what IdentifierFactory produces
|
2021-07-19 17:52:24 +02:00 |
Claudio Atzori
|
0977baf41d
|
contents mapped from the stores with 'claim' interpretation will not change their identifier along their way towards the graph
|
2021-07-19 17:43:52 +02:00 |
Claudio Atzori
|
5e5f65a3c3
|
contents mapped from the stores with 'claim' interpretation will not change their identifier along their way towards the graph
|
2021-07-19 15:56:55 +02:00 |
Miriam Baglioni
|
662c396354
|
duplicate the number of partitions in ConvertCrossrefToOaf
|
2021-07-19 12:41:14 +02:00 |
Miriam Baglioni
|
59530a14fb
|
DoiBoost AccessRight #4362 - set BestAccessRight with the usual comparator
|
2021-07-19 12:34:35 +02:00 |
Miriam Baglioni
|
199123b74b
|
DoiBoost AccessRight #4362 - Fixed issue on date formatting. Added test method and associated resource
|
2021-07-16 17:30:27 +02:00 |
Miriam Baglioni
|
c4b18e6ccb
|
changed the download.sh, added skip step to allow to not execute one phase and changed the workflow sequence of steps
|
2021-07-16 15:01:25 +02:00 |
Miriam Baglioni
|
acd6056330
|
added shell action to automatically download the new dump and put it in a specified hdfs location
|
2021-07-16 12:47:10 +02:00 |
Miriam Baglioni
|
3bc9a05bc9
|
merging with branch beta
|
2021-07-16 10:32:27 +02:00 |
Miriam Baglioni
|
34506df1b6
|
DoiBoost AccessRight #4362 - if the journal is open, the OPEN access right is set to all instances and color is GOLD (overwrite if the color was already set in one of the previous steps)
|
2021-07-16 10:29:51 +02:00 |
Claudio Atzori
|
bf9e0d2d4f
|
Merge pull request 'orcid-no-doi' (#123) from enrico.ottonello/dnet-hadoop:orcid-no-doi into beta
Reviewed-on: D-Net/dnet-hadoop#123
|
2021-07-15 17:59:41 +02:00 |
Sandro La Bruzzo
|
7e2caafe84
|
Scholexplorer: fixed mapping typologies
|
2021-07-15 09:53:12 +02:00 |
Enrico Ottonello
|
2dc50c0999
|
added default value to process path
|
2021-07-14 17:02:22 +02:00 |
Enrico Ottonello
|
66604bb2b4
|
added absolute path to process folder
|
2021-07-14 16:44:51 +02:00 |
Enrico Ottonello
|
7840cc6526
|
merged with master
|
2021-07-14 15:33:59 +02:00 |
Miriam Baglioni
|
4da46bb62f
|
merging with branch beta
|
2021-07-14 15:08:52 +02:00 |
Enrico Ottonello
|
a65667d217
|
added publication to dataset even if no contributors
|
2021-07-14 15:07:07 +02:00 |
Sandro La Bruzzo
|
10068c00ea
|
Code refactor:
- removed old workflows in doiboost
- split workflow of doiboost into preprocess and process
|
2021-07-14 14:45:50 +02:00 |
Miriam Baglioni
|
09ad7b2a9e
|
DoiBoost AccessRight #4362 - Unpaywall mapped to OAF with OPEN instance (non oa are filtered out) (unknown hostedby) + map the color as it is
|
2021-07-14 14:45:21 +02:00 |
Miriam Baglioni
|
f4f7c6f9d3
|
DoiBoost AccessRight #4362 - Unpaywall mapped to OAF with OPEN instance (non oa are filtered out) (unknown hostedby) + map the color as it is
|
2021-07-14 14:44:54 +02:00 |
Miriam Baglioni
|
6222adf176
|
DoiBoost AccessRight #4362 - added resources and test for crossref mapping (licence part included)
|
2021-07-14 14:42:34 +02:00 |
Miriam Baglioni
|
981b1018f6
|
DoiBoost AccessRight #4362 - decide access right according to licence. Default access right is Unknown
|
2021-07-14 14:42:06 +02:00 |
Sandro La Bruzzo
|
3d8e2aa146
|
Code refactor:
- removed old workflows in doiboost
- split workflow of doiboost into preprocess and process
|
2021-07-14 14:37:06 +02:00 |
Miriam Baglioni
|
441701c85c
|
DoiBoost AccessRight #4362 - If multiple licenses are available, take the one applied to 'vor'
|
2021-07-14 14:14:50 +02:00 |
Sandro La Bruzzo
|
c35c117601
|
fixed process doiboost workflow:
- split OrcidToOAF into two phases, preprocess and process
- updated workflow used in production
|
2021-07-14 12:48:01 +02:00 |
Miriam Baglioni
|
1cdd09cd8e
|
Tentative fix for testing of Jenkins
|
2021-07-14 11:14:59 +02:00 |
Sandro La Bruzzo
|
4cb65bc64a
|
fixed process doiboost workflow:
- split OrcidToOAF into two phases, preprocess and process
- updated workflow used in production
|
2021-07-14 09:44:32 +02:00 |
Miriam Baglioni
|
774cdb190e
|
changes to mirror the last dump of the graph with the ols data model.
|
2021-07-13 18:57:24 +02:00 |
Miriam Baglioni
|
886617afd0
|
One result linked to more than one project is saved just once
|
2021-07-13 18:15:35 +02:00 |
Miriam Baglioni
|
320cf02d96
|
Changed the way to find results linked to projects. We verify to actually have the project on the graph before selecting the result
|
2021-07-13 18:13:32 +02:00 |
Miriam Baglioni
|
52ce35d57b
|
-
|
2021-07-13 18:08:46 +02:00 |
Miriam Baglioni
|
970b387b8d
|
modification to allow dump of a single community
|
2021-07-13 18:08:10 +02:00 |
Miriam Baglioni
|
eae10c5894
|
modification to allow the dump for a single community
|
2021-07-13 18:07:25 +02:00 |
Miriam Baglioni
|
c028feef4f
|
workflow for the dump as sub workflows
|
2021-07-13 18:06:44 +02:00 |
Miriam Baglioni
|
d70f8c96fd
|
funding contains and not starts with h2020
|
2021-07-13 17:34:53 +02:00 |
Miriam Baglioni
|
5e38c7f42d
|
dumping only communities with status all
|
2021-07-13 17:32:38 +02:00 |
Claudio Atzori
|
734de62474
|
[doiboost] added workflow for the ActionSet update dedicated to production
|
2021-07-13 17:26:04 +02:00 |
Miriam Baglioni
|
618d2de2da
|
minor changes and refactoring
|
2021-07-13 17:10:02 +02:00 |
Miriam Baglioni
|
59615da65e
|
Add test to verify the creation of relation between context and projects
|
2021-07-13 17:09:15 +02:00 |
Miriam Baglioni
|
084b4ef999
|
added the creation of the openaireId from funder and grant number if the element is not present in the context profile
|
2021-07-13 17:07:46 +02:00 |
Claudio Atzori
|
fa720c1da4
|
[doiboost] added workflow for the ActionSet update dedicated to production
|
2021-07-13 16:59:30 +02:00 |
Miriam Baglioni
|
8f322a73cb
|
change because of the renaming of originalId to acronym
|
2021-07-13 16:22:58 +02:00 |
Miriam Baglioni
|
72397ea1ba
|
Added fix for community of arbitrary name length
|
2021-07-13 16:18:35 +02:00 |
Miriam Baglioni
|
5295d10691
|
added check not to dump deletedByInference entities
|
2021-07-13 16:11:46 +02:00 |
Claudio Atzori
|
9629569e22
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2021-07-13 16:04:08 +02:00 |
Claudio Atzori
|
f13e11e3f7
|
[aggregation] datacite wf: defined parameter declaring the path used to store the OAF objects produced by the transformation phase
|
2021-07-13 16:04:02 +02:00 |
Miriam Baglioni
|
e9a17ec899
|
added check to avoid adding a void APC
|
2021-07-13 15:53:35 +02:00 |
Miriam Baglioni
|
8429aed6c6
|
Added resource for testing selection of valid relations
|
2021-07-13 15:49:38 +02:00 |
Miriam Baglioni
|
39b1a6edf6
|
added test class for the selection of valid relations and description
|
2021-07-13 15:23:09 +02:00 |
Miriam Baglioni
|
9a58f1b93d
|
added logic to select only the valid relations: those not deletedbyinference and having both ends of the relation as entities in the graph
|
2021-07-13 15:20:39 +02:00 |
Miriam Baglioni
|
13c66e16be
|
changed logic to split for communities
|
2021-07-13 15:15:27 +02:00 |
Miriam Baglioni
|
6410ab71d8
|
added APC in the dump and test method
|
2021-07-13 15:13:58 +02:00 |
Miriam Baglioni
|
65a242646d
|
added resource for APC dump
|
2021-07-13 14:45:25 +02:00 |
Miriam Baglioni
|
4b432fbee8
|
extended test class
|
2021-07-13 14:40:39 +02:00 |
Miriam Baglioni
|
87a6e2b967
|
extended test class
|
2021-07-13 14:38:28 +02:00 |
Miriam Baglioni
|
69fd40fd30
|
modified code to split the Croatian funder
|
2021-07-13 14:35:26 +02:00 |
Miriam Baglioni
|
86e50f7311
|
modified code to split the Croatian funder
|
2021-07-13 14:31:45 +02:00 |
Miriam Baglioni
|
da88c850c6
|
changed the logic to verify if a community is contained in the list of contexts of a result
|
2021-07-13 14:22:44 +02:00 |
Miriam Baglioni
|
2f66fedfec
|
changed the logic to verify if a community is contained in the list of contexts of a result
|
2021-07-13 14:22:23 +02:00 |
Miriam Baglioni
|
f5486ffb14
|
Fixed issues in tests
|
2021-07-13 14:07:45 +02:00 |
Claudio Atzori
|
e0061232e9
|
[aggregation] datacite wf: conditional creation of links, optional resume from intermediate phases
|
2021-07-13 13:41:21 +02:00 |
Sandro La Bruzzo
|
bbe8193930
|
merged stable ids
|
2021-07-12 17:00:43 +02:00 |
Claudio Atzori
|
ae2b47b29d
|
[broker] added coalesce(1) on the stats dataset before storing it on postgres
|
2021-07-09 15:47:51 +02:00 |
Sandro La Bruzzo
|
57c74c73c6
|
fixed mistakes in oozie workflow
|
2021-07-09 12:28:09 +02:00 |
Sandro La Bruzzo
|
61ccb54fde
|
removed wrong loop on oozie wf
|
2021-07-09 12:17:57 +02:00 |
Sandro La Bruzzo
|
9f5a0f3ab6
|
moved wf indexing of Scholexplorer in dhp-graph-provision
|
2021-07-09 12:06:43 +02:00 |
Sandro La Bruzzo
|
09fccf8000
|
added workflow to serialize scholix and summary in json
|
2021-07-09 11:01:42 +02:00 |
Sandro La Bruzzo
|
0ea576745f
|
updated CreateInputGraph because generics don't work on Spark Dataset
|
2021-07-09 10:29:24 +02:00 |
Sandro La Bruzzo
|
cd17e19044
|
implemented branch workflow to import datacite and crossref in scholexplorer
|
2021-07-08 21:20:19 +02:00 |
Miriam Baglioni
|
c30f3ce647
|
merge doi normalization
|
2021-07-08 19:20:02 +02:00 |
Sandro La Bruzzo
|
8a034e46e1
|
updated baseline workflow
|
2021-07-08 11:11:41 +02:00 |
Claudio Atzori
|
b7b8e0986e
|
[raw_all] The claim merge procedure includes the claimed contexts in the merged result
|
2021-07-08 10:42:31 +02:00 |
Sandro La Bruzzo
|
0799ac9fb6
|
fixed wrong path
|
2021-07-08 10:36:37 +02:00 |
Sandro La Bruzzo
|
4d53402712
|
extended ebiLinks to create a dataset before generation of OAF
|
2021-07-08 10:26:21 +02:00 |
Sandro La Bruzzo
|
a4a54a3786
|
code refactor
|
2021-07-08 09:08:25 +02:00 |