Claudio Atzori
|
8de9788308
|
applied fix for avoiding ruling out the invisible (APC) records during the graph cleaning
|
2022-01-24 11:29:22 +01:00 |
Claudio Atzori
|
2f385b3ac6
|
updated dnet workflow profile definitions
|
2022-01-21 13:59:46 +01:00 |
Claudio Atzori
|
dd52bf1bb8
|
copy relations to the graphOutputPath
|
2022-01-21 13:59:29 +01:00 |
Claudio Atzori
|
4983d6536d
|
Merge branch 'beta' into delegated_authorities
|
2022-01-21 13:02:48 +01:00 |
Claudio Atzori
|
f0ea2410e5
|
improved mapping titles from datacite records to consider title types
|
2022-01-21 10:50:34 +01:00 |
Claudio Atzori
|
b37bc277c4
|
reintroduced the hostedby patching to the datacite records
|
2022-01-21 09:15:13 +01:00 |
Claudio Atzori
|
f2fde5566b
|
using helper method from ModelSupport to find the inverse relation descriptor
|
2022-01-20 09:19:07 +01:00 |
Claudio Atzori
|
3b9020c1b7
|
added unit test for the DispatchEntitiesJob
|
2022-01-19 18:15:55 +01:00 |
Claudio Atzori
|
abfa9c6045
|
code formatting
|
2022-01-19 17:17:11 +01:00 |
Claudio Atzori
|
391aa1373b
|
added unit test
|
2022-01-19 17:13:21 +01:00 |
Claudio Atzori
|
44a937f4ed
|
factored out entity grouping implementation, extended to consider results from delegated authorities rather than identical records from other sources
|
2022-01-19 12:24:52 +01:00 |
Miriam Baglioni
|
a7c4d0d16d
|
[DoiBoost Organizations] added parameter to specify the action in the wf raw_organizations to be able to load the openorgs organization as in the loading step for the construction of the graph
|
2022-01-13 13:52:00 +01:00 |
Miriam Baglioni
|
a75fb8c47a
|
[BipFinderInstanceLevel] change pom to align to the dhp-schema release 2.10.24 and refactoring
|
2022-01-12 18:06:26 +01:00 |
Miriam Baglioni
|
e7d5a39c03
|
[BipFinderInstanceLevel] added tests in test class
|
2022-01-12 17:25:04 +01:00 |
Miriam Baglioni
|
4993666d73
|
[BipFinderInstanceLevel] changed creation of the instance to allow enriching existing instances with the same pid
|
2022-01-12 16:53:47 +01:00 |
Claudio Atzori
|
9acc32faa6
|
[stats wf] final touches for the integration of PRs #166, #179 in the master branch
|
2022-01-12 12:04:31 +01:00 |
dimitrispie
|
b053b0178e
|
Sprint 5 and other changes
|
2022-01-12 11:23:37 +01:00 |
Antonis Lempesis
|
b6b4bc0df9
|
added first indicator of sprint 5
|
2022-01-12 11:20:28 +01:00 |
Antonis Lempesis
|
e91f06f39b
|
fixed typos in indicators. Added extra views in monitor
|
2022-01-12 11:18:28 +01:00 |
Antonis Lempesis
|
3ce1976627
|
fixed column names
|
2022-01-12 11:14:41 +01:00 |
Antonis Lempesis
|
4878d7485c
|
added usage stats
|
2022-01-12 11:13:25 +01:00 |
Antonis Lempesis
|
a4316bafed
|
fixed a typo
|
2022-01-12 11:12:53 +01:00 |
Antonis Lempesis
|
bb17e070d8
|
added result_result relations
|
2022-01-12 11:09:38 +01:00 |
Claudio Atzori
|
a30a98a716
|
Applying PR#166 in the master branch (Added sprint 3&4 of indicators). Merge commit '0df9574a6f5d9d75bc840decb023561ae941f9d6'
|
2022-01-12 10:57:19 +01:00 |
Sandro La Bruzzo
|
57e2c4b749
|
formatted code
|
2022-01-12 09:40:28 +01:00 |
Claudio Atzori
|
0f2144b5e0
|
scalafmt: code formatting
|
2022-01-11 17:03:44 +01:00 |
Claudio Atzori
|
dcd282977c
|
pulled from beta
|
2022-01-11 16:59:41 +01:00 |
Claudio Atzori
|
4f212652ca
|
scalafmt: code formatting
|
2022-01-11 16:57:48 +01:00 |
Sandro La Bruzzo
|
0163dadb7f
|
[doiboost]
- update MAG schema, new field added in version dec-2021
|
2022-01-11 11:05:44 +01:00 |
Miriam Baglioni
|
904e1c2667
|
Merge pull request 'Affiliation Propagation through semantic relation' (#183) from enrichment into beta
Reviewed-on: D-Net/dnet-hadoop#183
|
2022-01-07 19:18:16 +01:00 |
Miriam Baglioni
|
064f9bbd87
|
[AFFPropSR] added new parameter for the number of iterations and new code for just one iteration
|
2022-01-07 18:58:51 +01:00 |
Miriam Baglioni
|
b7e450070b
|
[SDG-FOS] to import SDG file not considering the header
|
2022-01-07 12:13:26 +01:00 |
Miriam Baglioni
|
639190370a
|
merging with branch beta
|
2022-01-07 11:29:25 +01:00 |
Miriam Baglioni
|
adccc2346a
|
[SDG-FOS] to lower case for the doi
|
2022-01-07 11:28:50 +01:00 |
Claudio Atzori
|
8ae46ca789
|
OAF-store-graph mdstores: further fix for PR#180
|
2022-01-05 15:52:15 +01:00 |
Claudio Atzori
|
908294d86e
|
OAF-store-graph mdstores: further fix for PR#180
|
2022-01-05 15:49:05 +01:00 |
Claudio Atzori
|
3bd3653be9
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 16:39:39 +01:00 |
Claudio Atzori
|
3dc48c7ab5
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 16:39:27 +01:00 |
Claudio Atzori
|
f82db765db
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 16:39:15 +01:00 |
Claudio Atzori
|
8d13effa31
|
test for the tolerant deserialisation utility method
|
2022-01-04 16:38:26 +01:00 |
Claudio Atzori
|
9458ee7938
|
serialise records in the OAF-store-graph mdstores in json format. Read them again in the graph construction phase using a tolerant parser to support backward compatible changes in the evolution of the schema
|
2022-01-04 16:38:09 +01:00 |
Claudio Atzori
|
58f8998e3d
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 15:02:09 +01:00 |
Claudio Atzori
|
174c3037e1
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 14:40:16 +01:00 |
Claudio Atzori
|
045d767013
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 14:23:01 +01:00 |
Claudio Atzori
|
bd59b58efb
|
test for the tolerant deserialisation utility method
|
2022-01-04 11:26:56 +01:00 |
Claudio Atzori
|
a6977197b3
|
serialise records in the OAF-store-graph mdstores in json format. Read them again in the graph construction phase using a tolerant parser to support backward compatible changes in the evolution of the schema
|
2022-01-03 17:25:26 +01:00 |
Miriam Baglioni
|
4c60ee1718
|
merging with branch beta
|
2022-01-03 15:24:02 +01:00 |
Miriam Baglioni
|
92fd69e25d
|
[SDG-FOS] alternative way to get input data to avoid OOM error while getting csv
|
2022-01-03 15:23:06 +01:00 |
Claudio Atzori
|
fe7e5f4748
|
Merge pull request '[stats wf] result_result relations, usage stats, monitor views, indicator for sprint 5' (#179) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#179
|
2022-01-03 14:52:11 +01:00 |
Claudio Atzori
|
bcea4e3a9b
|
added dnet workflow profile for the orchestration of the simplified and complete graph construction and processing pipeline, where the IIS works on the non-deduplicated graph
|
2022-01-03 14:33:00 +01:00 |
Miriam Baglioni
|
a706ba0c08
|
Merge pull request 'SDG Integration' (#178) from SDG into beta
Reviewed-on: D-Net/dnet-hadoop#178
|
2021-12-23 14:50:00 +01:00 |
Antonis Lempesis
|
81ee654271
|
added result_result relations
|
2021-12-23 15:46:17 +02:00 |
Antonis Lempesis
|
7551e52e95
|
fixed a typo
|
2021-12-23 15:33:53 +02:00 |
Miriam Baglioni
|
7a1b440413
|
[SDG] logic to create unresolved entities out of SDG input. This also changes some classes related to FOS to reuse the same code. The code under createunresolvedentities creates results with the merged update of the inputs provided (bip at the level of the instance, fos and sdg for subjects)
|
2021-12-23 13:24:28 +01:00 |
Claudio Atzori
|
cccb16900c
|
https://support.openaire.eu/issues/7330 normalising DOI urls
|
2021-12-23 12:33:53 +01:00 |
Miriam Baglioni
|
2a67ee13ec
|
[SDG] added model class
|
2021-12-23 10:37:52 +01:00 |
Miriam Baglioni
|
69e9ea9eeb
|
[Graph Dump] Test for extraction of rels from entities extended
|
2021-12-23 10:15:30 +01:00 |
Miriam Baglioni
|
31b26d48ac
|
[Graph Dump] fixed issue on extraction of relation between entities and contexts: the relationship name and type were swapped
|
2021-12-23 10:09:47 +01:00 |
Miriam Baglioni
|
10579c0dd0
|
[FOS]fixed doi value in test
|
2021-12-22 23:10:16 +01:00 |
Miriam Baglioni
|
6116fc5d40
|
[FOS]added logic to include only different subjects. Test refactoring and extension
|
2021-12-22 23:04:22 +01:00 |
Miriam Baglioni
|
b81efb6a9d
|
[FOS]changed the mapping between the csv and the model. Changed Test classes and resources
|
2021-12-22 21:40:35 +01:00 |
Miriam Baglioni
|
de6c4c8968
|
[FOS]creation of the unresolved entities: remove the split for the doi: no longer needed since each row is related to one doi
|
2021-12-22 16:44:44 +01:00 |
Miriam Baglioni
|
34ac56565d
|
refactoring
|
2021-12-22 16:28:11 +01:00 |
Miriam Baglioni
|
20ef1d657f
|
refactoring
|
2021-12-22 16:26:36 +01:00 |
Miriam Baglioni
|
813f856d3f
|
[BipFinder] removing left over parameter in wf
|
2021-12-22 16:11:12 +01:00 |
Miriam Baglioni
|
2c126ed014
|
[BipFinder] create unresolved entities with measures at the level of the instance
|
2021-12-22 16:03:41 +01:00 |
Miriam Baglioni
|
0807fdb65a
|
[BipFinder] remove not needed resources
|
2021-12-22 15:37:00 +01:00 |
Miriam Baglioni
|
b5e11a3a0a
|
[BipFinder] put in common package BipFinder model
|
2021-12-22 15:33:05 +01:00 |
Miriam Baglioni
|
c5739c4266
|
[BipFinder] create action set for the measures at the level of the result
|
2021-12-22 15:08:33 +01:00 |
Miriam Baglioni
|
da5f6260aa
|
merging with branch beta
|
2021-12-22 13:12:02 +01:00 |
Miriam Baglioni
|
be0acccf42
|
Merge branch 'beta' into dump
|
2021-12-22 12:39:57 +01:00 |
Antonis Lempesis
|
16539d7360
|
added usage stats
|
2021-12-22 02:54:42 +02:00 |
Antonis Lempesis
|
3edd661608
|
fixed column names
|
2021-12-21 22:55:04 +02:00 |
Antonis Lempesis
|
a4c0cbb98c
|
fixed typos in indicators. Added extra views in monitor
|
2021-12-21 15:54:38 +02:00 |
Miriam Baglioni
|
e24a7f3496
|
merging with branch beta
|
2021-12-21 13:57:19 +01:00 |
Miriam Baglioni
|
d1ae219cb4
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-21 13:55:53 +01:00 |
Miriam Baglioni
|
460e6b95d6
|
[Graph Dump] -
|
2021-12-21 13:48:03 +01:00 |
Sandro La Bruzzo
|
3920d68992
|
Fixed workflow generation of delta in datacite
|
2021-12-21 11:41:49 +01:00 |
Antonis Lempesis
|
58996972d9
|
added first indicator of sprint 5
|
2021-12-21 03:35:04 +02:00 |
dimitrispie
|
c1cdec09a9
|
Sprint 5 and other changes
|
2021-12-20 19:23:57 +02:00 |
Miriam Baglioni
|
3cc1b7b153
|
merging with branch beta
|
2021-12-15 17:25:02 +01:00 |
Miriam Baglioni
|
63b648b0dd
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-15 12:41:15 +01:00 |
Antonis Lempesis
|
f0b523cfa7
|
removed the too restrictive clause. will discuss again
|
2021-12-15 12:32:15 +01:00 |
Sandro La Bruzzo
|
b881ee5ef8
|
[scholexplorer]
- implemented generation of scholix of delta update of datacite
|
2021-12-15 11:25:32 +01:00 |
Sandro La Bruzzo
|
63952018c0
|
[scholexplorer]
-moved SparkRetrieveDataciteDelta in scala folder
|
2021-12-15 11:25:32 +01:00 |
Sandro La Bruzzo
|
e5bff64f2e
|
[scholexplorer]
- Minor fix on SparkConvertRDDtoDataset
-first implementation of retrieve datacite dump
|
2021-12-15 11:25:32 +01:00 |
Claudio Atzori
|
1790fa2d44
|
Merge branch 'beta' into affiliationPropagation
|
2021-12-14 15:26:56 +01:00 |
Miriam Baglioni
|
56409d1281
|
[Dump] resolved conflicts with beta and merging
|
2021-12-14 15:03:45 +01:00 |
Miriam Baglioni
|
22d4b5619b
|
[BipFinder Result] last changes to test and resources files
|
2021-12-14 14:54:13 +01:00 |
Miriam Baglioni
|
6fb6236cd4
|
changed the way to produce the AS for bipFinder.
|
2021-12-14 14:51:14 +01:00 |
Miriam Baglioni
|
573bd17cbb
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-14 11:12:25 +01:00 |
Miriam Baglioni
|
4eb8276493
|
-
|
2021-12-14 11:12:17 +01:00 |
Miriam Baglioni
|
936578aaf1
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-13 15:01:47 +01:00 |
Miriam Baglioni
|
8d755cca80
|
-
|
2021-12-13 15:01:40 +01:00 |
Claudio Atzori
|
98eb292c59
|
avoid NPEs merging XMLInstance(s)
|
2021-12-13 13:27:20 +01:00 |
Claudio Atzori
|
5e17247bb6
|
avoid NPEs merging XMLInstance(s)
|
2021-12-13 11:48:40 +01:00 |
Claudio Atzori
|
b70ecccea0
|
avoid NPEs merging XMLInstance(s)
|
2021-12-12 12:37:38 +01:00 |
Claudio Atzori
|
c1b6ae47cd
|
cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies
|
2021-12-09 16:47:41 +01:00 |
Claudio Atzori
|
eb43eda42a
|
Merge branch 'beta' into graph_cleaning
|
2021-12-09 16:46:48 +01:00 |
Claudio Atzori
|
41c70c607d
|
cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies
|
2021-12-09 16:44:28 +01:00 |
Alessia Bardi
|
cba63e9f82
|
Merge branch 'beta' into sygma_indexing
|
2021-12-09 15:52:16 +01:00 |
Alessia Bardi
|
e53228401b
|
style
|
2021-12-09 15:46:22 +01:00 |
Claudio Atzori
|
cd9c51fd7a
|
vocabulary based cleaning also considers the term label when looking up a synonym
|
2021-12-09 14:49:24 +01:00 |
Claudio Atzori
|
e6e177dda0
|
vocabulary based cleaning also considers the term label when looking up a synonym
|
2021-12-09 13:57:53 +01:00 |
Alessia Bardi
|
6b5d7688a4
|
#7275 serialize license information in XML records
|
2021-12-09 13:46:48 +01:00 |
Miriam Baglioni
|
b113586207
|
resolved conflicts
|
2021-12-07 10:16:14 +01:00 |
Sandro La Bruzzo
|
5d51b3dd4a
|
Merge pull request 'scala_refactor' (#169) from scala_refactor into beta
Reviewed-on: D-Net/dnet-hadoop#169
|
2021-12-06 15:33:44 +01:00 |
Miriam Baglioni
|
d9836f0cf3
|
[OpenCitations] fixed test when executed one after the other
|
2021-12-06 15:27:09 +01:00 |
Miriam Baglioni
|
d1df01ff1e
|
[Graph Dump] fixed resource for test
|
2021-12-06 15:15:48 +01:00 |
Sandro La Bruzzo
|
ed0c352799
|
[test-fixing] fixed wrong test
|
2021-12-06 15:07:41 +01:00 |
Miriam Baglioni
|
96a7d46278
|
[Graph Dump] fixed tests
|
2021-12-06 15:06:32 +01:00 |
Sandro La Bruzzo
|
e9f285ec4d
|
[scala-refactor] Module dhp-doiboost:
Moved all scala source into src/main/scala and src/test/scala
|
2021-12-06 14:24:03 +01:00 |
Sandro La Bruzzo
|
bf880e2508
|
[scala-refactor] Module dhp-graph-mapper:
Moved all scala source into src/main/scala and src/test/scala
|
2021-12-06 13:57:41 +01:00 |
Sandro La Bruzzo
|
7af0bbd0b1
|
[scala-refactor] Module dhp-aggregation:
Moved all scala source into src/main/scala and src/test/scala
|
2021-12-06 11:26:36 +01:00 |
Claudio Atzori
|
08795cbd30
|
using helper method from ModelSupport to find the inverse relation descriptor
|
2021-12-06 10:39:56 +01:00 |
Miriam Baglioni
|
f430688ff7
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-03 12:36:08 +01:00 |
Miriam Baglioni
|
4bb1d43afc
|
-
|
2021-12-03 12:35:51 +01:00 |
Sandro La Bruzzo
|
f7011b90d8
|
format code
|
2021-12-03 11:15:09 +01:00 |
Claudio Atzori
|
dd0b2e5244
|
Merge branch 'beta' into instance_group_by_url
|
2021-12-03 09:27:58 +01:00 |
Claudio Atzori
|
863a2f9db3
|
avoid filtering OAF records defined as invisible = true
|
2021-12-03 09:08:12 +01:00 |
Claudio Atzori
|
9cac283bec
|
implemented Instance serialization features requested in https://support.openaire.eu/issues/7156
|
2021-12-02 17:20:33 +01:00 |
Miriam Baglioni
|
d9f80488cc
|
[GRAPH DUMP] Add one more test to check the filtering of the relations
|
2021-12-02 14:15:19 +01:00 |
Miriam Baglioni
|
58bc3f223a
|
[GRAPH DUMP] Add filtering for relation we do not want to dump. It is based on the relclass
|
2021-12-02 14:09:46 +01:00 |
Miriam Baglioni
|
8905a39bf3
|
merging with branch beta
|
2021-12-02 13:17:29 +01:00 |
Miriam Baglioni
|
87eedad898
|
-
|
2021-12-02 13:17:19 +01:00 |
Claudio Atzori
|
3b19821f3c
|
added stats computation on the graph hive DB tables
|
2021-12-02 10:44:10 +01:00 |
Claudio Atzori
|
cfa4560769
|
minor: fixed hive action name
|
2021-12-02 10:43:36 +01:00 |
Claudio Atzori
|
d85af6fc25
|
[cleaning wf] fixed OAF record navigation, a mapping defined on a container object would have prevented the navigation from continuing on its properties
|
2021-12-01 15:49:15 +01:00 |
Claudio Atzori
|
4fe7888817
|
code formatting
|
2021-12-01 15:48:15 +01:00 |
Claudio Atzori
|
01e5e0142a
|
added test to verify the relation inverse lookup operation
|
2021-12-01 09:46:26 +01:00 |
Claudio Atzori
|
0df9574a6f
|
Merge pull request '[stats wf] Added sprint 3&4 of indicators' (#166) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#166
|
2021-11-29 10:40:26 +01:00 |
Claudio Atzori
|
1de881b796
|
resolved conflicts for #165
|
2021-11-26 16:15:11 +01:00 |
Claudio Atzori
|
014e872ae1
|
[resolution wf] added optional parameter to skip the entity resolution
|
2021-11-26 15:38:56 +01:00 |
Claudio Atzori
|
5c6d328537
|
code formatting
|
2021-11-26 15:38:16 +01:00 |
dimitrispie
|
09fc2afdca
|
Added indi_funder_country_collab
Kept only indi_pub_has_cc_licence
|
2021-11-26 16:13:10 +02:00 |
Antonis Lempesis
|
0b4163ee0b
|
added sprint3,4, removed 2, chaos
|
2021-11-26 15:58:01 +02:00 |
dimitrispie
|
29f69f2f89
|
Sprint 4
|
2021-11-26 15:22:04 +02:00 |
Miriam Baglioni
|
ac07ed8251
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-11-25 12:32:58 +01:00 |
Miriam Baglioni
|
5fd0e610bf
|
[DOIBOOST Process] fix filtering to filter results with non null id
|
2021-11-25 12:10:45 +01:00 |
Sandro La Bruzzo
|
feea154e89
|
remove working dir after test
|
2021-11-25 11:02:38 +01:00 |
Sandro La Bruzzo
|
028a8acad8
|
add test resources
|
2021-11-25 10:54:47 +01:00 |
Sandro La Bruzzo
|
2164a2a889
|
Datacite: Code Refactor generated a general SparkApplication Scala where all the spark scala have to inherit
Commented a little the Datacite transformation code
|
2021-11-25 10:54:13 +01:00 |
Miriam Baglioni
|
3f9b2ba8ce
|
[Hosted By Map] fix issue in test
|
2021-11-22 16:59:43 +01:00 |
Sandro La Bruzzo
|
a7cf277d98
|
Datacite: Removed HostedBy Patch as described on ticket #7219, Now all the records will have hosted by Unknown Repository
|
2021-11-22 16:03:17 +01:00 |
Sandro La Bruzzo
|
483d3039d1
|
entity resolution: added distcpt of missing entities in graph materialization
|
2021-11-22 15:55:24 +01:00 |
Sandro La Bruzzo
|
93fe8ce8b2
|
entity resolution: fix test
|
2021-11-22 15:50:43 +01:00 |
Sandro La Bruzzo
|
35e20b0647
|
updated resolution wf:
- generate a new version of the graph
- changed merge from union to join
|
2021-11-22 11:48:55 +01:00 |
Miriam Baglioni
|
fdb75b180e
|
[Cleaning] added couple of tests for DOIBOOST publications
|
2021-11-21 16:35:22 +01:00 |
Miriam Baglioni
|
0506fa2654
|
[Graph Dump] changed to mirror the changes in the model
|
2021-11-19 15:56:25 +01:00 |
Sandro La Bruzzo
|
3426451d3f
|
Merge remote-tracking branch 'origin/beta' into beta
|
2021-11-19 14:49:04 +01:00 |