Sandro La Bruzzo
|
0dff14b28e
|
added property to gitignore
|
2020-01-27 10:53:54 +01:00 |
miconis
|
5c8f6febee
|
minor changes in comparators
|
2020-01-24 10:01:11 +01:00 |
Sandro La Bruzzo
|
19a80e4638
|
implemented workflow for aggregation and generation of infospace graph
|
2020-01-24 09:58:55 +01:00 |
Claudio Atzori
|
fcbc4ccd70
|
a bit of docs doesn't hurt
|
2020-01-24 08:43:23 +01:00 |
Claudio Atzori
|
a55f5fecc6
|
joining entities using T x R x S method with groupByKey, WIP: making target objects (T) have lower memory footprint
|
2020-01-24 08:17:53 +01:00 |
Michele Artini
|
6bfe2dc96e
|
partial implementation
|
2020-01-22 16:00:23 +01:00 |
Claudio Atzori
|
799929c1e3
|
joining entities using T x R x S method with groupByKey
|
2020-01-21 16:35:44 +01:00 |
Michele Artini
|
f6eccdde33
|
partial implementation
|
2020-01-21 14:17:05 +01:00 |
Michele Artini
|
cd114f1c3b
|
partial update
|
2020-01-21 12:32:10 +01:00 |
Michele Artini
|
b35c59eb42
|
partial implementation of entities from db
|
2020-01-20 16:04:19 +01:00 |
Sandro La Bruzzo
|
fa7504bf29
|
removed DLI stuff, which should be in a branch
|
2020-01-20 10:28:00 +01:00 |
Michele Artini
|
81f82b5d34
|
partial implementation of applications to migrate entities
|
2020-01-17 15:26:21 +01:00 |
Claudio Atzori
|
1cd6899480
|
merged from master
|
2020-01-17 14:25:57 +01:00 |
Claudio Atzori
|
749b0660ab
|
instance URLs must be repeatable
|
2020-01-17 14:22:15 +01:00 |
Claudio Atzori
|
63c0db4ff8
|
instance URLs must be repeatable
|
2020-01-16 15:54:53 +02:00 |
Claudio Atzori
|
97c239ee0d
|
WIP: trying to find a way to build the records for the index
|
2020-01-16 12:02:28 +02:00 |
miconis
|
4955be0197
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-01-14 15:03:44 +02:00 |
miconis
|
f61adfc2bb
|
minor changes
|
2020-01-14 15:03:27 +02:00 |
miconis
|
9bdcb02179
|
minor changes and update of the configuration for publications
|
2020-01-14 15:01:03 +02:00 |
miconis
|
4dce785375
|
update in the implementation of the tree: addition of new logic aggregations and statistics
|
2020-01-14 11:42:43 +02:00 |
Michele Artini
|
f7b9a7a9af
|
entity migration (partial implementation)
|
2020-01-10 15:55:23 +01:00 |
Claudio Atzori
|
731f9b64e6
|
Merge branch 'master' of michele.artini/dnet-hadoop into master
|
2019-12-20 14:22:37 +01:00 |
Michele Artini
|
7229fecbcf
|
fix warnings in poms
|
2019-12-20 13:41:08 +01:00 |
Sandro La Bruzzo
|
dd21db7036
|
fixed stuff
|
2019-12-18 16:28:22 +01:00 |
miconis
|
b3748b8d77
|
minor changes
|
2019-12-18 16:20:35 +01:00 |
miconis
|
b21b1b8f61
|
implementation of new aggregation in the tree node processing
|
2019-12-18 16:19:36 +01:00 |
miconis
|
20fcfe6328
|
implementation of new aggregation in the tree node processing
|
2019-12-18 16:19:26 +01:00 |
Sandro La Bruzzo
|
d924f28b93
|
fixed wrong use of jspath
|
2019-12-18 09:29:44 +01:00 |
Claudio Atzori
|
7ba586d2e5
|
oozie workflow aimed to build the adjacency lists representation of the graph, needed to build the records to be indexed
|
2019-12-17 16:24:49 +01:00 |
miconis
|
84aaa65501
|
implementation of new json comparator and update of the publication configuration
|
2019-12-17 09:16:26 +01:00 |
Sandro La Bruzzo
|
76efcde4fd
|
using new branch decisionTreeDedup
|
2019-12-13 12:20:35 +01:00 |
Sandro La Bruzzo
|
5c01ae4c92
|
merged JqMapping branch into tree2
|
2019-12-13 11:30:02 +01:00 |
Sandro La Bruzzo
|
b4392f9f43
|
implemented DedupRecord factory for missing entities
|
2019-12-13 09:40:02 +01:00 |
miconis
|
545e940007
|
implementation of the mergeFrom for the Datasources
|
2019-12-12 15:36:41 +01:00 |
Sandro La Bruzzo
|
39367676d7
|
implemented DedupRecord factory with the merge of project
|
2019-12-12 15:18:48 +01:00 |
Sandro La Bruzzo
|
6b45e37e22
|
implemented DedupRecord factory with the merge of organizations
|
2019-12-11 16:57:37 +01:00 |
Sandro La Bruzzo
|
abd9034da0
|
implemented DedupRecord factory with the merge of publications
|
2019-12-11 15:43:24 +01:00 |
miconis
|
4b66b471a4
|
implementation of the sorting by trust mechanism and the merge of oaf entities
|
2019-12-10 14:57:16 +01:00 |
Sandro La Bruzzo
|
35008fdbf9
|
fix stuff
|
2019-12-06 15:28:30 +01:00 |
Sandro La Bruzzo
|
cc63706347
|
Implemented deduplication on spark
|
2019-12-06 13:38:00 +01:00 |
Sandro La Bruzzo
|
16c670a5d5
|
Improved deduplication
|
2019-12-05 14:14:25 +01:00 |
miconis
|
49f9beb4a8
|
implementation of romansmatch and re-implementation of the getNumber function. New terms in the translation map and update of the configuration
|
2019-11-28 16:54:44 +01:00 |
miconis
|
f791730330
|
addition of one term to the translation maps in the configurations
|
2019-11-27 15:48:37 +01:00 |
miconis
|
d2278fe358
|
minor change in the citymatch
|
2019-11-21 10:54:02 +01:00 |
miconis
|
8c0d346005
|
the param map has been updated: now it accepts string parameters
|
2019-11-21 09:37:56 +01:00 |
miconis
|
ddd40540aa
|
jarowinklernormalizedname split into 3 different comparators: citymatch, keywordmatch and jarowinkler. Implementation of the TreeStatistic support functions
|
2019-11-20 10:45:00 +01:00 |
Claudio Atzori
|
6a7bee5e43
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2019-11-14 15:43:07 +01:00 |
Claudio Atzori
|
0c4b316f82
|
align Result model with the latest OpenAIRE schema changes introduced in the protobuf model
|
2019-11-14 15:42:52 +01:00 |
Sandro La Bruzzo
|
aad0cb40b7
|
Added schema Scholexplorer
|
2019-11-14 10:34:09 +01:00 |
miconis
|
c687956371
|
code cleaning and implementation of the TreeDedup + minor changes
|
2019-11-14 10:01:21 +01:00 |