Miriam Baglioni | 46094a3eec | bug fixing for implementation with dataset | 2020-03-24 16:19:36 +01:00
Miriam Baglioni | ad712f2d79 | added the needed variables in the config and read the variables in the workflow | 2020-03-23 17:11:36 +01:00
Miriam Baglioni | f1e9fe9752 | changed implementation using dataset and query on hive | 2020-03-23 17:11:00 +01:00
Miriam Baglioni | f09cd1e911 | removed unused variable in the configuration | 2020-03-23 17:10:14 +01:00
Miriam Baglioni | 9418e3d4fa | read dataset from files instead of using hive tables | 2020-03-23 17:09:27 +01:00
Miriam Baglioni | a7bf037306 | remove unused class | 2020-03-23 14:36:43 +01:00
Miriam Baglioni | 8ab8b6b0bf | minor | 2020-03-23 14:35:23 +01:00
Miriam Baglioni | 30d58fd98c | change the configuration of the workflow | 2020-03-23 14:32:49 +01:00
Miriam Baglioni | a440152b46 | refactoring | 2020-03-23 14:30:56 +01:00
Miriam Baglioni | 47561f3597 | changed the implementation from rdd to dataset obtained from sql queries (on hive) | 2020-03-23 11:58:32 +01:00
Miriam Baglioni | 67ea3cf3ed | changed the way to read the file with info on resource or relation: from sequenceFile to textFile | 2020-03-17 16:32:05 +01:00
Miriam Baglioni | b4652d018c | moved the creation of new dir to common class. | 2020-03-17 16:31:24 +01:00
Miriam Baglioni | 92f4e0001d | Merge branch 'bulktag' | 2020-03-16 13:33:27 +01:00
Miriam Baglioni | ab08a37024 | Merge remote-tracking branch 'upstream/master' | 2020-03-16 12:45:23 +01:00
Claudio Atzori | af835f2f98 | when migrating actionsets from DM cluster, populate the AtomicAction.targetValue when empty (dedup similarities) | 2020-03-15 18:07:59 +01:00
Claudio Atzori | 9c84e21b87 | added workflow to migrate latest version of each actionset content from DM to OCEAN cluster, mapping the targetValues from the old protobuf data model to the dhp.OAF datamodel | 2020-03-13 15:56:52 +01:00
Claudio Atzori | 8fe7ae1482 | xml formatting | 2020-03-13 15:53:56 +01:00
Claudio Atzori | 23a929177d | updates to the graph require this to be an actual class | 2020-03-13 14:56:35 +01:00
Claudio Atzori | 7b6f0c8756 | reading graph dump as text files, encoded as newline-delimited JSON records, as indicated in the wiki | 2020-03-10 17:19:17 +01:00
Claudio Atzori | 60aedb1110 | Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop | 2020-03-10 17:09:44 +01:00
Claudio Atzori | a3f184fd3f | added field websiteurl in related organizations | 2020-03-10 17:08:58 +01:00
Claudio Atzori | 0e95544495 | fixed serialization for datasource subjects | 2020-03-10 17:07:44 +01:00
Michele Artini | b6efa9d6ab | Configuration of the SequenceFile Writer | 2020-03-05 15:49:14 +01:00
Claudio Atzori | ccb153de78 | updated image | 2020-03-05 15:11:42 +01:00
Claudio Atzori | 5e342a555c | no need to compute the inverse relClass, fixed text() in xpath expressions | 2020-03-05 12:51:48 +01:00
Claudio Atzori | 6ec04d4e02 | specified column used to perform the join operation in the javadoc | 2020-03-05 12:50:38 +01:00
Claudio Atzori | 960619de98 | updated image | 2020-03-04 16:51:55 +01:00
Claudio Atzori | e89aa52e58 | updated image | 2020-03-04 16:18:49 +01:00
Claudio Atzori | 5474e8ac9f | Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop | 2020-03-04 14:54:46 +01:00
Claudio Atzori | d7137e566e | added dhp-doc-resources, aimed to include all the documentation resources used in the wiki pages | 2020-03-04 14:54:41 +01:00
Michele Artini | 7a2a466161 | Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop | 2020-03-04 14:50:59 +01:00
Michele Artini | 755eade2fb | fix creation ids | 2020-03-04 14:49:45 +01:00
Claudio Atzori | 6379f32466 | Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop | 2020-03-04 10:57:06 +01:00
Claudio Atzori | 0233987603 | introduced post processing step following the hive DB creation/population | 2020-03-04 10:56:50 +01:00
Claudio Atzori | 1e563bc15e | introduced distinct properties driving the resource usage for the XML record creation and the indexing phase | 2020-03-04 10:55:11 +01:00
Claudio Atzori | 9af3e904be | close the SparkSession at the end | 2020-03-04 10:53:31 +01:00
Michele Artini | 086af63158 | Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop | 2020-03-04 10:46:40 +01:00
Michele Artini | e7167b996a | logs and closeable | 2020-03-04 10:46:36 +01:00
Claudio Atzori | 25ceec29ab | code formatting | 2020-03-04 10:44:24 +01:00
Claudio Atzori | 63c00c5e88 | fixed typo | 2020-03-04 10:43:44 +01:00
Miriam Baglioni | c37f2bd1b5 | moved some classes to package to make code clearer | 2020-03-03 16:42:23 +01:00
Miriam Baglioni | d9d2060561 | implementation for bulk tagging | 2020-03-03 16:38:50 +01:00
Miriam Baglioni | e80f80ca93 | properties and workflow for new propagation | 2020-03-02 17:03:31 +01:00
Claudio Atzori | 9cf5ce2e66 | Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop | 2020-03-02 17:03:10 +01:00
Claudio Atzori | bc7cfd5975 | indexing workflow WIP: fixed projects fundingtree xml conversion, prioritized links between results and projects when limiting them to 100 in the join procedure | 2020-03-02 17:03:07 +01:00
Miriam Baglioni | 50080c1b3c | changed the implementation of addAll method. Before adding all the items in a collection, we check if the accumulator set is not empty | 2020-03-02 16:41:37 +01:00
Miriam Baglioni | 02815dd2cf | update result for community moved in propagationconstants | 2020-03-02 16:40:56 +01:00
Miriam Baglioni | 95f8c3092f | update for new propagation implementation; moved the updateResult for community business logic, since the same logic can be used for result to community from organization and result to community from semrel | 2020-03-02 16:40:17 +01:00
Miriam Baglioni | 3d63f35dcb | implementation of new propagation. Result to community for results linked to given organization. We exploit the hasAuthorInstitution semantic link to discover which results are related to institutions | 2020-03-02 16:39:03 +01:00
Michele Artini | 4b29a121b0 | migration using spark in step2 | 2020-03-02 16:12:14 +01:00