Miriam Baglioni
|
72df8f9232
|
Hosted By Map - removed the aggregator for the datasource (it is no longer needed) and added a new aggregator for the results. Also changed the hostedBYMap aggregator
|
2021-08-02 19:34:44 +02:00 |
Miriam Baglioni
|
ff1ce75e33
|
Hosted By Map - modification in the code to prepare the info needed to apply the HostedByMap. There is no need to join datasources with the hbm: all the information needed is in the hosted by map already
|
2021-08-02 19:32:59 +02:00 |
Miriam Baglioni
|
1695d45bd4
|
Hosted By Map - Test class to verify the preparation of the intermediate information
|
2021-07-30 17:57:01 +02:00 |
Miriam Baglioni
|
7c6ea2f4c7
|
Hosted By Map - first attempt at creating the intermediate information to be used to apply the hosted by map on the graph entities
|
2021-07-30 17:56:27 +02:00 |
Miriam Baglioni
|
d8b9b0553b
|
Hosted By Map - model classes to store the intermediate information to be used to apply the hosted by map
|
2021-07-30 17:55:39 +02:00 |
Miriam Baglioni
|
613bd3bde0
|
Hosted By Map - refactor of the first attempt to prepare a new hosted by map dependent on the datasource in the graph and on two external sources: the gold list from unibi and the doaj list of open access journals. Both lists are downloaded from the URLs provided as parameters
|
2021-07-30 17:54:45 +02:00 |
Miriam Baglioni
|
d1807781c0
|
merging with branch beta
|
2021-07-30 14:34:07 +02:00 |
Miriam Baglioni
|
1d6ac3715b
|
merge branch with beta
|
2021-07-30 11:58:29 +02:00 |
Claudio Atzori
|
19620eed46
|
applying PR#131, Patch the identifiers (source/target) in the relations, refinements
|
2021-07-30 11:09:32 +02:00 |
Miriam Baglioni
|
baad01cadc
|
hostedbymap
|
2021-07-29 13:04:39 +02:00 |
Claudio Atzori
|
5d08ad86ae
|
[raw_all] patching relation identifier phase to be run at the end, i.e. includes also claimed relations
|
2021-07-29 13:03:16 +02:00 |
Claudio Atzori
|
e87e1805c4
|
[raw_all] added extra workflow step for patching the identifiers in the relations, given an id mapping dataset
|
2021-07-29 12:13:06 +02:00 |
Claudio Atzori
|
e1797c0a42
|
Merge branch 'beta' of https://code-repo.d4science.org/D-Net/dnet-hadoop into beta
|
2021-07-28 16:21:36 +02:00 |
Claudio Atzori
|
6dddad86ee
|
[cleaning] title cleaning based on the me.xuender:unidecode library
|
2021-07-28 16:21:29 +02:00 |
Alessia Bardi
|
c806387d4b
|
tests for enermaps
|
2021-07-28 11:54:36 +02:00 |
Claudio Atzori
|
2fff24df55
|
code formatting
|
2021-07-28 11:34:19 +02:00 |
Miriam Baglioni
|
708d0ade34
|
Merge branch 'beta' into hostedbymap
|
2021-07-28 10:37:22 +02:00 |
Miriam Baglioni
|
0424f47494
|
HostedByMap fixing issues
|
2021-07-28 10:24:13 +02:00 |
Claudio Atzori
|
5aa7d16d1b
|
updated assertions in eu.dnetlib.dhp.oa.graph.raw.MappersTest
|
2021-07-27 15:11:58 +02:00 |
Miriam Baglioni
|
74f801b689
|
merging with branch beta
|
2021-07-27 13:18:31 +02:00 |
Miriam Baglioni
|
eb07f7f40f
|
Hosted By Map
|
2021-07-27 12:27:26 +02:00 |
Sandro La Bruzzo
|
848aabbb6c
|
minor fix
|
2021-07-25 12:06:41 +02:00 |
Sandro La Bruzzo
|
8fac10c91e
|
fixed the definition of the workflow that creates the final infospace of scholexplorer
|
2021-07-25 11:15:37 +02:00 |
Sandro La Bruzzo
|
3920c69bc8
|
change implementation of resolve Relation to generate jsonRdd in output
|
2021-07-25 09:51:36 +02:00 |
Sandro La Bruzzo
|
d9e3b89937
|
implemented last part of workflows to generate scholixGraph
|
2021-07-23 16:38:32 +02:00 |
Sandro La Bruzzo
|
cfde63a7c3
|
fixed resolve relation join
|
2021-07-23 14:17:29 +02:00 |
Sandro La Bruzzo
|
4a439c3863
|
NPE fixed
|
2021-07-23 14:17:29 +02:00 |
Sandro La Bruzzo
|
ca74e8dd02
|
create a separate wf for resolving relation
|
2021-07-23 11:40:06 +02:00 |
Sandro La Bruzzo
|
43e9380cd3
|
update resolve relation to use the same format as the openaire graph
|
2021-07-23 11:25:18 +02:00 |
Sandro La Bruzzo
|
62ae36a3d2
|
fixed NPE
|
2021-07-22 15:41:38 +02:00 |
Sandro La Bruzzo
|
31d2d6d41e
|
Scholexplorer: introduction of dedup openaire
|
2021-07-21 18:09:32 +02:00 |
Claudio Atzori
|
65934888a1
|
adding record identifier among the originalIds regardless of what IdentifierFactory produces
|
2021-07-19 17:52:52 +02:00 |
Claudio Atzori
|
0977baf41d
|
contents mapped from the stores with 'claim' interpretation will not change their identifier along their way towards the graph
|
2021-07-19 17:43:52 +02:00 |
Sandro La Bruzzo
|
7e2caafe84
|
Scholexplorer: fixed mapping typologies
|
2021-07-15 09:53:12 +02:00 |
Sandro La Bruzzo
|
bbe8193930
|
merged stable ids
|
2021-07-12 17:00:43 +02:00 |
Sandro La Bruzzo
|
09fccf8000
|
added workflow to serialize scholix and summary in json
|
2021-07-09 11:01:42 +02:00 |
Sandro La Bruzzo
|
0ea576745f
|
updated CreateInputGraph because generics don't work on Spark Dataset
|
2021-07-09 10:29:24 +02:00 |
Sandro La Bruzzo
|
cd17e19044
|
implemented branch workflow to import datacite and crossref in scholexplorer
|
2021-07-08 21:20:19 +02:00 |
Sandro La Bruzzo
|
8a034e46e1
|
updated baseline workflow
|
2021-07-08 11:11:41 +02:00 |
Claudio Atzori
|
b7b8e0986e
|
[raw_all] The claim merge procedure includes the claimed contexts in the merged result
|
2021-07-08 10:42:31 +02:00 |
Sandro La Bruzzo
|
0799ac9fb6
|
fixed wrong path
|
2021-07-08 10:36:37 +02:00 |
Sandro La Bruzzo
|
4d53402712
|
extended ebiLinks to create a dataset before generation of OAF
|
2021-07-08 10:26:21 +02:00 |
Sandro La Bruzzo
|
a4a54a3786
|
code refactor
|
2021-07-08 09:08:25 +02:00 |
Sandro La Bruzzo
|
a01dbe0ab0
|
completed workflow of generation of scholix and summaries
|
2021-07-07 23:10:34 +02:00 |
Claudio Atzori
|
fdcff42e46
|
[raw_all] Aggregator graph creation merges claims (updates) with the corresponding entity
|
2021-07-07 19:01:59 +02:00 |
Claudio Atzori
|
bc014023c8
|
Merge pull request 'to solve the scala SI-3623' (#122) from andreas.czerniak/BrStableId_dnet-hadoop:stable_ids into stable_ids
Reviewed-on: #122
|
2021-07-07 11:13:51 +02:00 |
Claudio Atzori
|
32bdfdccbc
|
[raw_all] Aggregator graph creation merges claims (updates) with the corresponding entity
|
2021-07-07 11:08:27 +02:00 |
Claudio Atzori
|
f580cb77e1
|
added mapping for claim relation 'resultResult_publicationDataset_isRelatedTo' (present on BETA)
|
2021-07-06 21:11:11 +02:00 |
Sandro La Bruzzo
|
8535506c22
|
added scholix generation
|
2021-07-06 17:18:06 +02:00 |
Sandro La Bruzzo
|
4c54bd8742
|
add test to verify merge scholix on source
|
2021-07-06 11:32:14 +02:00 |