Claudio Atzori
|
174c3037e1
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 14:40:16 +01:00 |
Claudio Atzori
|
045d767013
|
OAF-store-graph mdstores: save them in text format
|
2022-01-04 14:23:01 +01:00 |
Claudio Atzori
|
bd59b58efb
|
test for the tolerant deserialisation utility method
|
2022-01-04 11:26:56 +01:00 |
Claudio Atzori
|
a6977197b3
|
serialise records in the OAF-store-graph mdstores in json format. Read them again in the graph construction phase using a tolerant parser to support backward compatible changes in the evolution of the schema
|
2022-01-03 17:25:26 +01:00 |
Miriam Baglioni
|
4c60ee1718
|
merging with branch beta
|
2022-01-03 15:24:02 +01:00 |
Miriam Baglioni
|
92fd69e25d
|
[SDG-FOS] alternative way to get input data to avoid OOM error while getting csv
|
2022-01-03 15:23:06 +01:00 |
Claudio Atzori
|
fe7e5f4748
|
Merge pull request '[stats wf] result_result relations, usage stats, monitor views, indicator for sprint 5' (#179) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#179
|
2022-01-03 14:52:11 +01:00 |
Claudio Atzori
|
bcea4e3a9b
|
added dnet workflow profile for the orchestration of the simplified and complete graph construction and processing pipeline, where the IIS works on the non-deduplicated graph
|
2022-01-03 14:33:00 +01:00 |
Miriam Baglioni
|
a706ba0c08
|
Merge pull request 'SDG Integration' (#178) from SDG into beta
Reviewed-on: D-Net/dnet-hadoop#178
|
2021-12-23 14:50:00 +01:00 |
Antonis Lempesis
|
81ee654271
|
added result_result relations
|
2021-12-23 15:46:17 +02:00 |
Antonis Lempesis
|
7551e52e95
|
fixed a typo
|
2021-12-23 15:33:53 +02:00 |
Miriam Baglioni
|
7a1b440413
|
[SDG] logic to create unresolved entities out of SDG input. This also changes some classes related to FOS to reuse the same code. The code under createunresolvedentities creates results with the merged update of the inputs provided (bip at the level of the instance, fos and sdg for subjects)
|
2021-12-23 13:24:28 +01:00 |
Claudio Atzori
|
cccb16900c
|
https://support.openaire.eu/issues/7330 normalising DOI urls
|
2021-12-23 12:33:53 +01:00 |
Miriam Baglioni
|
2a67ee13ec
|
[SDG] added model class
|
2021-12-23 10:37:52 +01:00 |
Miriam Baglioni
|
69e9ea9eeb
|
[Graph Dump] Test for extraction of rels from entities extended
|
2021-12-23 10:15:30 +01:00 |
Miriam Baglioni
|
31b26d48ac
|
[Graph Dump] fixed issue on extraction of relation between entities and contexts: the relationship name and type were swapped
|
2021-12-23 10:09:47 +01:00 |
Miriam Baglioni
|
10579c0dd0
|
[FOS]fixed doi value in test
|
2021-12-22 23:10:16 +01:00 |
Miriam Baglioni
|
6116fc5d40
|
[FOS]added logic to include only different subjects. Test refactoring and extension
|
2021-12-22 23:04:22 +01:00 |
Miriam Baglioni
|
b81efb6a9d
|
[FOS]changed the mapping between the csv and the model. Changed Test classes and resources
|
2021-12-22 21:40:35 +01:00 |
Miriam Baglioni
|
de6c4c8968
|
[FOS]creation of the unresolved entities: removed the split for the doi, which is no longer needed since each row is related to one doi
|
2021-12-22 16:44:44 +01:00 |
Miriam Baglioni
|
34ac56565d
|
refactoring
|
2021-12-22 16:28:11 +01:00 |
Miriam Baglioni
|
20ef1d657f
|
refactoring
|
2021-12-22 16:26:36 +01:00 |
Miriam Baglioni
|
813f856d3f
|
[BipFinder] removing left over parameter in wf
|
2021-12-22 16:11:12 +01:00 |
Miriam Baglioni
|
2c126ed014
|
[BipFinder] create unresolved entities with measures at the level of the instance
|
2021-12-22 16:03:41 +01:00 |
Miriam Baglioni
|
0807fdb65a
|
[BipFinder] remove not needed resources
|
2021-12-22 15:37:00 +01:00 |
Miriam Baglioni
|
b5e11a3a0a
|
[BipFinder] put in common package BipFinder model
|
2021-12-22 15:33:05 +01:00 |
Miriam Baglioni
|
c5739c4266
|
[BipFinder] create action set for the measures at the level of the result
|
2021-12-22 15:08:33 +01:00 |
Miriam Baglioni
|
da5f6260aa
|
merging with branch beta
|
2021-12-22 13:12:02 +01:00 |
Miriam Baglioni
|
be0acccf42
|
Merge branch 'beta' into dump
|
2021-12-22 12:39:57 +01:00 |
Antonis Lempesis
|
16539d7360
|
added usage stats
|
2021-12-22 02:54:42 +02:00 |
Antonis Lempesis
|
3edd661608
|
fixed column names
|
2021-12-21 22:55:04 +02:00 |
Antonis Lempesis
|
a4c0cbb98c
|
fixed typos in indicators. Added extra views in monitor
|
2021-12-21 15:54:38 +02:00 |
Miriam Baglioni
|
e24a7f3496
|
merging with branch beta
|
2021-12-21 13:57:19 +01:00 |
Miriam Baglioni
|
d1ae219cb4
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-21 13:55:53 +01:00 |
Miriam Baglioni
|
460e6b95d6
|
[Graph Dump] -
|
2021-12-21 13:48:03 +01:00 |
Sandro La Bruzzo
|
3920d68992
|
Fixed workflow generation of delta in datacite
|
2021-12-21 11:41:49 +01:00 |
Antonis Lempesis
|
58996972d9
|
added first indicator of sprint 5
|
2021-12-21 03:35:04 +02:00 |
dimitrispie
|
c1cdec09a9
|
Sprint 5 and other changes
|
2021-12-20 19:23:57 +02:00 |
Miriam Baglioni
|
3cc1b7b153
|
merging with branch beta
|
2021-12-15 17:25:02 +01:00 |
Miriam Baglioni
|
63b648b0dd
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-15 12:41:15 +01:00 |
Antonis Lempesis
|
f0b523cfa7
|
removed the too restrictive clause. will discuss again
|
2021-12-15 12:32:15 +01:00 |
Sandro La Bruzzo
|
b881ee5ef8
|
[scholexplorer]
- implemented generation of scholix of delta update of datacite
|
2021-12-15 11:25:32 +01:00 |
Sandro La Bruzzo
|
63952018c0
|
[scholexplorer]
-moved SparkRetrieveDataciteDelta in scala folder
|
2021-12-15 11:25:32 +01:00 |
Sandro La Bruzzo
|
e5bff64f2e
|
[scholexplorer]
- Minor fix on SparkConvertRDDtoDataset
-first implementation of retrieve datacite dump
|
2021-12-15 11:25:32 +01:00 |
Claudio Atzori
|
1790fa2d44
|
Merge branch 'beta' into affiliationPropagation
|
2021-12-14 15:26:56 +01:00 |
Miriam Baglioni
|
56409d1281
|
[Dump] resolved conflicts with beta and merging
|
2021-12-14 15:03:45 +01:00 |
Miriam Baglioni
|
22d4b5619b
|
[BipFinder Result] last changes to test and resources files
|
2021-12-14 14:54:13 +01:00 |
Miriam Baglioni
|
6fb6236cd4
|
changed the way to produce the AS for bipFinder.
|
2021-12-14 14:51:14 +01:00 |
Miriam Baglioni
|
573bd17cbb
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-14 11:12:25 +01:00 |
Miriam Baglioni
|
4eb8276493
|
-
|
2021-12-14 11:12:17 +01:00 |
Miriam Baglioni
|
936578aaf1
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-13 15:01:47 +01:00 |
Miriam Baglioni
|
8d755cca80
|
-
|
2021-12-13 15:01:40 +01:00 |
Claudio Atzori
|
98eb292c59
|
avoid NPEs merging XMLInstance(s)
|
2021-12-13 13:27:20 +01:00 |
Claudio Atzori
|
5e17247bb6
|
avoid NPEs merging XMLInstance(s)
|
2021-12-13 11:48:40 +01:00 |
Claudio Atzori
|
b70ecccea0
|
avoid NPEs merging XMLInstance(s)
|
2021-12-12 12:37:38 +01:00 |
Claudio Atzori
|
c1b6ae47cd
|
cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies
|
2021-12-09 16:47:41 +01:00 |
Claudio Atzori
|
eb43eda42a
|
Merge branch 'beta' into graph_cleaning
|
2021-12-09 16:46:48 +01:00 |
Claudio Atzori
|
41c70c607d
|
cleaning workflow assigns the proper default instance type when a value could not be cleaned using the vocabularies
|
2021-12-09 16:44:28 +01:00 |
Alessia Bardi
|
cba63e9f82
|
Merge branch 'beta' into sygma_indexing
|
2021-12-09 15:52:16 +01:00 |
Alessia Bardi
|
e53228401b
|
style
|
2021-12-09 15:46:22 +01:00 |
Claudio Atzori
|
cd9c51fd7a
|
vocabulary based cleaning also considers the term label when looking up a synonym
|
2021-12-09 14:49:24 +01:00 |
Claudio Atzori
|
e6e177dda0
|
vocabulary based cleaning also considers the term label when looking up a synonym
|
2021-12-09 13:57:53 +01:00 |
Alessia Bardi
|
6b5d7688a4
|
#7275 serialize license information in XML records
|
2021-12-09 13:46:48 +01:00 |
Miriam Baglioni
|
b113586207
|
resolved conflicts
|
2021-12-07 10:16:14 +01:00 |
Sandro La Bruzzo
|
5d51b3dd4a
|
Merge pull request 'scala_refactor' (#169) from scala_refactor into beta
Reviewed-on: D-Net/dnet-hadoop#169
|
2021-12-06 15:33:44 +01:00 |
Miriam Baglioni
|
d9836f0cf3
|
[OpenCitations] fixed test when executed one after the other
|
2021-12-06 15:27:09 +01:00 |
Miriam Baglioni
|
d1df01ff1e
|
[Graph Dump] fixed resource for test
|
2021-12-06 15:15:48 +01:00 |
Sandro La Bruzzo
|
ed0c352799
|
[test-fixing] fixed wrong test
|
2021-12-06 15:07:41 +01:00 |
Miriam Baglioni
|
96a7d46278
|
[Graph Dump] fixed tests
|
2021-12-06 15:06:32 +01:00 |
Sandro La Bruzzo
|
e9f285ec4d
|
[scala-refactor] Module dhp-doiboost:
Moved all scala source into src/main/scala and src/test/scala
|
2021-12-06 14:24:03 +01:00 |
Sandro La Bruzzo
|
bf880e2508
|
[scala-refactor] Module dhp-graph-mapper:
Moved all scala source into src/main/scala and src/test/scala
|
2021-12-06 13:57:41 +01:00 |
Sandro La Bruzzo
|
7af0bbd0b1
|
[scala-refactor] Module dhp-aggregation:
Moved all scala source into src/main/scala and src/test/scala
|
2021-12-06 11:26:36 +01:00 |
Claudio Atzori
|
08795cbd30
|
using helper method from ModelSupport to find the inverse relation descriptor
|
2021-12-06 10:39:56 +01:00 |
Miriam Baglioni
|
f430688ff7
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-12-03 12:36:08 +01:00 |
Miriam Baglioni
|
4bb1d43afc
|
-
|
2021-12-03 12:35:51 +01:00 |
Sandro La Bruzzo
|
f7011b90d8
|
format code
|
2021-12-03 11:15:09 +01:00 |
Claudio Atzori
|
dd0b2e5244
|
Merge branch 'beta' into instance_group_by_url
|
2021-12-03 09:27:58 +01:00 |
Claudio Atzori
|
863a2f9db3
|
avoid filtering OAF records defined as invisible = true
|
2021-12-03 09:08:12 +01:00 |
Claudio Atzori
|
9cac283bec
|
implemented Instance serialization features requested in https://support.openaire.eu/issues/7156
|
2021-12-02 17:20:33 +01:00 |
Miriam Baglioni
|
d9f80488cc
|
[GRAPH DUMP] Add one more test to check the filtering of the relations
|
2021-12-02 14:15:19 +01:00 |
Miriam Baglioni
|
58bc3f223a
|
[GRAPH DUMP] Add filtering for relation we do not want to dump. It is based on the relclass
|
2021-12-02 14:09:46 +01:00 |
Miriam Baglioni
|
8905a39bf3
|
merging with branch beta
|
2021-12-02 13:17:29 +01:00 |
Miriam Baglioni
|
87eedad898
|
-
|
2021-12-02 13:17:19 +01:00 |
Claudio Atzori
|
3b19821f3c
|
added stats computation on the graph hive DB tables
|
2021-12-02 10:44:10 +01:00 |
Claudio Atzori
|
cfa4560769
|
minor: fixed hive action name
|
2021-12-02 10:43:36 +01:00 |
Claudio Atzori
|
d85af6fc25
|
[cleaning wf] fixed OAF record navigation, a mapping defined on a container object would have prevented the navigation from continuing on its properties
|
2021-12-01 15:49:15 +01:00 |
Claudio Atzori
|
4fe7888817
|
code formatting
|
2021-12-01 15:48:15 +01:00 |
Claudio Atzori
|
01e5e0142a
|
added test to verify the relation inverse lookup operation
|
2021-12-01 09:46:26 +01:00 |
Claudio Atzori
|
0df9574a6f
|
Merge pull request '[stats wf] Added sprint 3&4 of indicators' (#166) from antonis.lempesis/dnet-hadoop:beta into beta
Reviewed-on: D-Net/dnet-hadoop#166
|
2021-11-29 10:40:26 +01:00 |
Claudio Atzori
|
1de881b796
|
resolved conflicts for #165
|
2021-11-26 16:15:11 +01:00 |
Claudio Atzori
|
014e872ae1
|
[resolution wf] added optional parameter to skip the entity resolution
|
2021-11-26 15:38:56 +01:00 |
Claudio Atzori
|
5c6d328537
|
code formatting
|
2021-11-26 15:38:16 +01:00 |
dimitrispie
|
09fc2afdca
|
Added indi_funder_country_collab
Kept only indi_pub_has_cc_licence
|
2021-11-26 16:13:10 +02:00 |
Antonis Lempesis
|
0b4163ee0b
|
added sprint3,4, removed 2, chaos
|
2021-11-26 15:58:01 +02:00 |
dimitrispie
|
29f69f2f89
|
Sprint 4
|
2021-11-26 15:22:04 +02:00 |
Miriam Baglioni
|
ac07ed8251
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-11-25 12:32:58 +01:00 |
Miriam Baglioni
|
5fd0e610bf
|
[DOIBOOST Process] fix filtering to filter results with non null id
|
2021-11-25 12:10:45 +01:00 |
Sandro La Bruzzo
|
feea154e89
|
remove working dir after test
|
2021-11-25 11:02:38 +01:00 |
Sandro La Bruzzo
|
028a8acad8
|
add test resources
|
2021-11-25 10:54:47 +01:00 |
Sandro La Bruzzo
|
2164a2a889
|
Datacite: code refactoring, created a general Scala SparkApplication that all the Spark Scala jobs have to inherit from
Added a few comments to the Datacite transformation code
|
2021-11-25 10:54:13 +01:00 |
Miriam Baglioni
|
3f9b2ba8ce
|
[Hosted By Map] fix issue in test
|
2021-11-22 16:59:43 +01:00 |
Sandro La Bruzzo
|
a7cf277d98
|
Datacite: Removed HostedBy Patch as described on ticket #7219, Now all the records will have hosted by Unknown Repository
|
2021-11-22 16:03:17 +01:00 |
Sandro La Bruzzo
|
483d3039d1
|
entity resolution: added distcpt of missing entities in graph materialization
|
2021-11-22 15:55:24 +01:00 |
Sandro La Bruzzo
|
93fe8ce8b2
|
entity resolution: fix test
|
2021-11-22 15:50:43 +01:00 |
Sandro La Bruzzo
|
35e20b0647
|
updated resolution wf:
- generate a new version of the graph
- changed merge from union to join
|
2021-11-22 11:48:55 +01:00 |
Miriam Baglioni
|
fdb75b180e
|
[Cleaning] added couple of tests for DOIBOOST publications
|
2021-11-21 16:35:22 +01:00 |
Miriam Baglioni
|
0506fa2654
|
[Graph Dump] changed to mirror the changes in the model
|
2021-11-19 15:56:25 +01:00 |
Sandro La Bruzzo
|
3426451d3f
|
Merge remote-tracking branch 'origin/beta' into beta
|
2021-11-19 14:49:04 +01:00 |
Sandro La Bruzzo
|
4542a2338b
|
updated site configuration to deploy on website
|
2021-11-19 13:44:08 +01:00 |
Claudio Atzori
|
e5a2c596b2
|
Merge branch 'beta' into preserve_openorg_parent_child_relations
|
2021-11-19 11:35:46 +01:00 |
Claudio Atzori
|
f4538f3c4c
|
cleanup
|
2021-11-19 11:33:10 +01:00 |
Claudio Atzori
|
2b46b87f56
|
fixed filtering criteria applied in SparkCopyRelationsNoOpenorgs to keep the parent/child relations from OpenOrgs
|
2021-11-19 11:30:29 +01:00 |
Miriam Baglioni
|
9fae872181
|
[Graph Dump] changed to mirror the changes in the model
|
2021-11-19 11:25:50 +01:00 |
Sandro La Bruzzo
|
fc03c99805
|
fixed javadocs url after deploying site
|
2021-11-19 10:46:33 +01:00 |
Sandro La Bruzzo
|
0c0d561bc4
|
added public class into tests to create correct javadoc
|
2021-11-19 09:54:22 +01:00 |
Claudio Atzori
|
62fa61f3cf
|
merge from beta
|
2021-11-19 09:23:42 +01:00 |
Claudio Atzori
|
bd9a43cefd
|
Revert to 4094f2bb9a
|
2021-11-19 09:20:43 +01:00 |
Claudio Atzori
|
3a4d925386
|
Merge branch 'beta' into hierarchical_orgs_relations
|
2021-11-18 18:07:08 +01:00 |
Claudio Atzori
|
3974fa7dc1
|
Merge branch 'beta' into affiliationPropagation
|
2021-11-18 18:06:26 +01:00 |
Claudio Atzori
|
a24b9f8268
|
[dedup] trivial refactoring
|
2021-11-18 17:12:02 +01:00 |
Claudio Atzori
|
c0750fb17c
|
avoid non necessary count operations over large spark datasets
|
2021-11-18 17:11:31 +01:00 |
Claudio Atzori
|
bb5dca7979
|
cleanup
|
2021-11-18 17:10:46 +01:00 |
Miriam Baglioni
|
793b5a8e5f
|
Update 'dhp-workflows/dhp-graph-mapper/src/main/java/eu/dnetlib/dhp/oa/graph/dump/ResultMapper.java'
Removing the dump of Measure at the level of the result. We decided not to map it
|
2021-11-18 14:49:38 +01:00 |
Miriam Baglioni
|
5dc5792722
|
[Graph Dump] Change test resource to mirror the movement of the measure element
|
2021-11-18 14:39:12 +01:00 |
Miriam Baglioni
|
0136a8c266
|
[Graph Dump] Change test to mirror that measure is at the level of the instance
|
2021-11-18 14:38:33 +01:00 |
Miriam Baglioni
|
1b79c0ee79
|
merging with branch beta
|
2021-11-18 11:01:00 +01:00 |
Antonis Lempesis
|
cb3adb90f4
|
Merge branch 'beta' into beta
|
2021-11-17 14:33:45 +01:00 |
Antonis Lempesis
|
c283406829
|
added Universidad Politécnica de Madrid
|
2021-11-17 15:33:00 +02:00 |
Claudio Atzori
|
e0395719d7
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-11-17 14:17:27 +01:00 |
Claudio Atzori
|
82a4e4efae
|
[cleaning wf] fixed methodology to rule out invalid result titles, based on https://support.openaire.eu/issues/7206
|
2021-11-17 14:17:22 +01:00 |
Miriam Baglioni
|
6d4a1c57ee
|
[Resolve Entities] Change test dataset to mirror the modification in the creation of the map between the pids and the unresolved
|
2021-11-17 12:41:52 +01:00 |
Sandro La Bruzzo
|
9c82d670b8
|
make class public in order to create javadoc
|
2021-11-17 12:31:02 +01:00 |
Sandro La Bruzzo
|
1f5ee116ed
|
code refactor, created and moved scala code on the correct maven folder under src/main/scala and src/test/scala
fixed test
|
2021-11-17 12:23:52 +01:00 |
Sandro La Bruzzo
|
2fd9ceac13
|
code refactor, created and moved scala code on the correct maven folder under src/main/scala and src/test/scala
|
2021-11-17 11:35:22 +01:00 |
Sandro La Bruzzo
|
2506d7a679
|
Merge branch 'mvn_site_documentation' of code-repo.d4science.org:D-Net/dnet-hadoop into mvn_site_documentation
|
2021-11-17 11:07:24 +01:00 |
Sandro La Bruzzo
|
cded363b55
|
code refactor, created and moved scala code on the correct maven folder under src/main/scala and src/test/scala
|
2021-11-17 11:06:35 +01:00 |
Miriam Baglioni
|
4094f2bb9a
|
added integration md file
|
2021-11-17 10:04:52 +01:00 |
Miriam Baglioni
|
ec8b0219ff
|
[Documentation] Added first page for Integration via unresolved entities generation
|
2021-11-16 17:41:34 +01:00 |
Miriam Baglioni
|
2bbece2ca5
|
merging with branch beta
|
2021-11-16 16:35:40 +01:00 |
Sandro La Bruzzo
|
2d67020c59
|
added dhp-enrichment maven site template
|
2021-11-16 16:01:08 +01:00 |
Miriam Baglioni
|
28ea532ece
|
[Affiliation Propagation] moved the selection of graph relation as a preparation step
|
2021-11-16 15:24:19 +01:00 |
Sandro La Bruzzo
|
18c1d70ef4
|
Merge branch 'beta' of code-repo.d4science.org:D-Net/dnet-hadoop into mvn_site_documentation
|
2021-11-16 15:16:49 +01:00 |
Sandro La Bruzzo
|
a1cafaf2e3
|
added mvn site for dnet-hadoop project
|
2021-11-16 15:16:28 +01:00 |
Miriam Baglioni
|
7c96e3fd46
|
removed not useful dir
|
2021-11-16 13:57:26 +01:00 |
Miriam Baglioni
|
c7c0c3187b
|
[AFFILIATION PROPAGATION] Applied some SonarLint suggestions
|
2021-11-16 13:56:32 +01:00 |
Miriam Baglioni
|
c6a9f0a1a8
|
merging with branch beta
|
2021-11-16 12:04:40 +01:00 |
Miriam Baglioni
|
99d86134f5
|
[Graph Dump] changed the dump since the measures have been moved to the level of the instance
|
2021-11-16 12:04:21 +01:00 |
Claudio Atzori
|
0a727d325d
|
[dedup] increased number of partitions in the consistency phase
|
2021-11-16 08:43:41 +01:00 |
Claudio Atzori
|
bafa2990f3
|
code formatting
|
2021-11-15 17:07:16 +01:00 |
Claudio Atzori
|
668ac25224
|
[graph resolution] using existing argument parser file name
|
2021-11-15 17:02:45 +01:00 |