Alessia Bardi
982bcc1e35
test wrid pid and record identifier
2022-09-23 12:06:06 +02:00
Claudio Atzori
c42850328e
fixed semantic (subreltype) for ServiceOrganization relations
2022-09-22 16:23:25 +02:00
Claudio Atzori
e45ec15221
Merge branch 'beta' into clean_country
2022-09-19 11:34:02 +02:00
Claudio Atzori
26e1badded
added instance.url syntactical validation, avoid creating multiple duplicated URLs
2022-09-19 11:19:10 +02:00
Claudio Atzori
192215a18e
merged from branch discard-non-wellformed
2022-09-19 10:17:10 +02:00
Claudio Atzori
e370e940d8
[aggregator graph] save invalid records aside for further inspection
2022-09-16 14:06:28 +02:00
Claudio Atzori
1e42d984e1
[aggregator graph] save invalid records aside for further inspection
2022-09-15 10:49:42 +02:00
Claudio Atzori
c48f6e9c57
[aggregator graph] save invalid records aside for further inspection
2022-09-14 17:11:26 +02:00
Claudio Atzori
a0919ed495
[aggregator graph] save invalid records aside for further inspection
2022-09-14 13:27:39 +02:00
Alessia Bardi
b99a011345
return empty Oaf list if record cannot be parsed
2022-09-13 11:51:55 +02:00
Alessia Bardi
27af5122d2
logs for non well formed XML files
2022-09-12 14:25:23 +02:00
Claudio Atzori
ff6f789b6d
code formatting
2022-09-09 15:16:31 +02:00
Claudio Atzori
b5d6966c01
Merge branch 'beta' into clean_country
2022-09-09 12:20:19 +02:00
Claudio Atzori
b5f7bd30be
Merge branch 'beta' into clean_subjects
2022-09-09 12:20:04 +02:00
Alessia Bardi
a539c6ccaf
https for handle URLs
2022-09-09 12:16:28 +02:00
Alessia Bardi
9ef063d502
#7861#note-8 instance url from handle
2022-09-07 17:29:54 +03:00
Miriam Baglioni
7dbdd4a0fe
[Clean Country]changes related to #241 (comment)
2022-08-10 15:13:10 +02:00
Miriam Baglioni
62d2138806
[Clean Context] changed a bit the logic. Added the check not to have result hosted by a datasource of type institutional repository from NL. Added also the check that the country should have been included in the result via propagation for it to be removed
2022-08-08 14:10:47 +02:00
Claudio Atzori
3418ce50ac
cleaning of subjects: perform the cleaning when the given value is equivalent to one of the terms in the vocabulary
2022-08-08 12:48:47 +02:00
Miriam Baglioni
390013a4b2
mergin with branch beta
2022-08-08 12:30:31 +02:00
Claudio Atzori
4eaa063b1f
cleaning of subjects
2022-08-05 16:56:09 +02:00
Claudio Atzori
32cee1f619
WIP: cleaning of subjects
2022-08-05 12:32:08 +02:00
Claudio Atzori
6c0fd9284b
merge from beta
2022-08-05 10:42:53 +02:00
Claudio Atzori
b78889a0ce
WIP: cleaning of subjects
2022-08-05 09:11:37 +02:00
Miriam Baglioni
a7a18d7630
[Graph Dump] removed code for the dump from the project. Fixed issues in tests when possible
2022-08-04 17:40:40 +02:00
Claudio Atzori
27a91841e7
WIP: cleaning of subjects
2022-08-04 11:39:39 +02:00
Claudio Atzori
1dd1e4fe3a
extended test for mapping project_organization relations
2022-07-28 11:27:08 +02:00
Claudio Atzori
09ccc7b472
Merge branch 'beta' into project_organization_contribution
2022-07-28 09:49:59 +02:00
Miriam Baglioni
5968ec018d
[Clean Country] modified workflow and added param file
2022-07-22 16:48:38 +02:00
Miriam Baglioni
a12d28c644
[Clean Country] added logic not to remove country from result if it exist a hosting datasource with that country. Moreover the country will be removed only if added with propagation
2022-07-22 16:23:12 +02:00
Miriam Baglioni
65cc736e2f
[Clean Country] first implementation to remove country NL from results collected from NARCIS when doi starts with mendely prefix
2022-07-20 17:05:56 +02:00
Claudio Atzori
1138b2ac8e
code formatting
2022-07-19 14:15:49 +02:00
Claudio Atzori
0c1cfee396
mapping oaf:fulltext elements in the result.fulltext field
2022-07-11 17:34:59 +02:00
Claudio Atzori
0cb1c70788
code formatting
2022-07-01 10:44:08 +02:00
Claudio Atzori
4ec13e2b66
Merge branch 'master' into dump_new_funded_products
2022-07-01 10:30:28 +02:00
Claudio Atzori
7da24c1dec
added more logging
2022-06-28 13:47:49 +02:00
Miriam Baglioni
71744a1f52
[DUMP DELTA PROJECTS] refactoring
2022-06-27 18:07:58 +02:00
Claudio Atzori
5130eac247
mapping by participant project contribution
2022-06-24 17:16:42 +02:00
Miriam Baglioni
b98f904d48
[Funder Products Dump] new way to avoid using hive
2022-06-21 17:52:27 +02:00
Miriam Baglioni
7423577a08
[Graph DUMP] add code to produce the delta of new projects with respect to the previous delta/dump
2022-06-21 14:51:38 +02:00
Claudio Atzori
4c8e820ff0
mapping relationship from trasformed records based on oaf:relation
2022-06-14 08:49:02 +02:00
Claudio Atzori
116902c028
mapping relationship from trasformed records based on oaf:relation
2022-06-13 14:31:48 +02:00
Claudio Atzori
5d3b4a9c25
[graph merge beta] merge datasource originalid, collectedfrom, and pid lists
2022-05-11 14:13:06 +02:00
Claudio Atzori
2a8e0fb72f
[openorgs] mapping parent/child relations without massaging the semantic labels
2022-05-10 08:45:53 +02:00
Claudio Atzori
77bc9863e9
[openorgs] mapping parent/child relations without massaging the semantic labels
2022-05-09 16:06:04 +02:00
Claudio Atzori
da611cfbbd
[eosc_services] resolved merge conflicts
2022-05-03 13:37:15 +02:00
Claudio Atzori
2ade69dea6
EOSC Services - minor
2022-05-02 17:03:31 +02:00
Claudio Atzori
b6a7ff3a99
EOSC Services - removed fields from mapping, testing preparation
2022-05-02 15:52:33 +02:00
Claudio Atzori
f5f532d134
EOSC Services - ongoing update
2022-04-29 12:25:24 +02:00
Claudio Atzori
5ffc24d1ba
EOSC Services - ongoing update
2022-04-26 16:18:41 +02:00
Claudio Atzori
29150a5d0c
code formatting
2022-04-21 13:31:56 +02:00
Miriam Baglioni
a38f0f5ea7
mergin with branch beta
2022-04-20 15:44:18 +02:00
Claudio Atzori
05fafa1408
[graph raw] avoid NPEs importing datasource consent fields
2022-04-06 15:23:50 +02:00
Claudio Atzori
8c457f1b2c
conflicts resolved, merged from beta
2022-04-06 10:27:52 +02:00
Miriam Baglioni
79336d46c5
[Clean Context] first naive implementation of a functionality to clean not wanted contextes from one result. This implementation simply verifies the main title of the results start with a given string
2022-04-04 15:52:31 +02:00
Claudio Atzori
0a0ae84c22
[graph raw] DOI based instance URLs on https
2022-03-29 10:52:58 +02:00
Miriam Baglioni
0f7d8ca2e0
[HostedByMap] change on master to align to PR 201 on beta merged as 9f3036c847
2022-03-11 15:16:02 +01:00
Claudio Atzori
f25407bbe2
added mapping for datasource consent fields to integrate them in the graph
2022-03-11 09:32:42 +01:00
Miriam Baglioni
2c5087d55a
[HostedByMap] download of doaj from json, modification of test resources, deletion of class no more needed for the CSV download
2022-03-04 15:18:21 +01:00
Miriam Baglioni
5d608d6291
[HostedByMap] changed the model to include also oaStart date and review process that could be possibly used in the future
2022-03-04 11:06:09 +01:00
Miriam Baglioni
8a41f63348
[HostedByMap] update to download the json instead of the csv
2022-03-04 10:38:43 +01:00
Miriam Baglioni
44b0c03080
[HostedByMap] update to download the json instead of the csv
2022-03-04 10:37:59 +01:00
Alessia Bardi
600ede1798
serialisation of APCs int he XML records
2022-02-11 11:00:20 +01:00
Miriam Baglioni
aae667e6b6
[APC at the result level] added the APC at the level of the result and modified test class
2022-02-04 12:34:25 +01:00
Claudio Atzori
f0ea2410e5
improved mapping titles from datacite records to consider title types
2022-01-21 10:50:34 +01:00
Miriam Baglioni
31b26d48ac
[Graph Dump] fixed issue on extraction of relation between entities and contexts: the relationship name and type were swapped
2021-12-23 10:09:47 +01:00
Miriam Baglioni
56409d1281
[Dump] resolved conflicts with beta and merging
2021-12-14 15:03:45 +01:00
Miriam Baglioni
8d755cca80
-
2021-12-13 15:01:40 +01:00
Claudio Atzori
41c70c607d
cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies
2021-12-09 16:44:28 +01:00
Sandro La Bruzzo
bf880e2508
[scala-refactor] Module dhp-graph-mapper:
...
Moved all scala source into src/main/scala and src/test/scala
2021-12-06 13:57:41 +01:00
Miriam Baglioni
58bc3f223a
[GRAPH DUMP] Add filtering for relation we do not want to dump. It is based on the relclass
2021-12-02 14:09:46 +01:00
Miriam Baglioni
8905a39bf3
mergin with branch beta
2021-12-02 13:17:29 +01:00
Claudio Atzori
d85af6fc25
[cleaning wf] fixed OAF record navigation, a mapping defined on a container object would have prevented the natvigation to continue on its properties
2021-12-01 15:49:15 +01:00
Sandro La Bruzzo
93fe8ce8b2
entity resolution: fix test
2021-11-22 15:50:43 +01:00
Sandro La Bruzzo
35e20b0647
updated resolution wf:
...
- generate a new version of the graph
- changed merge from union to join
2021-11-22 11:48:55 +01:00
Miriam Baglioni
0506fa2654
[Graph Dump] changed to mirror the changes in the model
2021-11-19 15:56:25 +01:00
Miriam Baglioni
9fae872181
[Graph Dump] changed to mirror the changes in the model
2021-11-19 11:25:50 +01:00
Claudio Atzori
bb5dca7979
cleanup
2021-11-18 17:10:46 +01:00
Miriam Baglioni
793b5a8e5f
Aggiornare 'dhp-workflows/dhp-graph-mapper/src/main/java/eu/dnetlib/dhp/oa/graph/dump/ResultMapper.java'
...
Removing the dump of Measure at the level of the result. We decided not to map it
2021-11-18 14:49:38 +01:00
Miriam Baglioni
c6a9f0a1a8
mergin with branch beta
2021-11-16 12:04:40 +01:00
Miriam Baglioni
99d86134f5
[Graph Dump] changed the dump since the measures have been moded at the level of the instance
2021-11-16 12:04:21 +01:00
Claudio Atzori
668ac25224
[graph resolution] using existing argument parser file name
2021-11-15 17:02:45 +01:00
Miriam Baglioni
b3f9370125
merge with beta - resolved conflict in pom
2021-11-12 11:25:26 +01:00
Sandro La Bruzzo
a7763d2492
removed alternate identifier in resolutionMap
2021-11-12 09:56:45 +01:00
Sandro La Bruzzo
2ca0a436ad
added SparkResolveEntities node to the oozie wf
2021-11-11 10:25:42 +01:00
Sandro La Bruzzo
9cb195314f
implemented and tested resolution of entities
2021-11-11 10:17:40 +01:00
Miriam Baglioni
8cc50ecee0
[Graph Dump] changed AccessRight with BestAccessRight in the dump and modified the dependency to the schema to the SNAPSHOT
2021-11-11 08:59:20 +01:00
Miriam Baglioni
88b73f4f49
mergin with branch beta
2021-11-10 17:00:52 +01:00
Sandro La Bruzzo
6477a40670
implement filter of openCitation
2021-11-09 11:27:12 +01:00
Miriam Baglioni
94918a673c
[Graph DUMP] Fix issue for empty origilaId list
2021-11-08 10:25:28 +01:00
Miriam Baglioni
8442efd8d1
[Graph DUMP] Filtering out from the originalIds the id of the result in OpenAIRE
2021-11-05 12:29:22 +01:00
Miriam Baglioni
a22c29fba1
[Graph DUMP] Filtering out from the originalIds the id of the result in OpenAIRE
2021-11-05 12:08:33 +01:00
Miriam Baglioni
0857849a86
[Graph DUMP] Remove dump of measure until it will be clear where to put it (at the level of result or at the level of the instance)
2021-11-05 11:02:37 +01:00
Sandro La Bruzzo
7bd224f051
implement first version of scholexplorer integration for the generation of final graph
2021-11-02 15:58:15 +01:00
Sandro La Bruzzo
d9cbca83f7
moved filter on next phase
2021-10-28 16:13:24 +02:00
Sandro La Bruzzo
1be9aa0a5f
Removed filter of datacite items from the raw graph merging phase, Datacite is not an actionset anymore in beta
2021-10-26 17:52:20 +02:00
Sandro La Bruzzo
4acfa8fa2e
Scholexplorer Datasource Aggregation:
...
- Added collectedfrom in the inverse relation generated
Relation resolution:
- increased number of partitions in workflow.xml
- using classid instead of classname to build the pid-dnetId mapping
2021-10-26 17:51:20 +02:00
Sandro La Bruzzo
034304b33a
conflict resolved on merge
2021-10-26 09:40:47 +02:00
Claudio Atzori
d147295c2f
avoiding java.io.NotSerializableException: java.util.HashMap
2021-10-21 14:15:57 +02:00
Claudio Atzori
3702fe478d
cleanup
2021-10-21 12:05:02 +02:00