Miriam Baglioni
|
771cde3d05
|
moved the library version to global pom
|
2020-10-01 15:43:47 +02:00 |
Miriam Baglioni
|
632351c0da
|
modified test resources to mirror the changes in the code
|
2020-10-01 15:43:02 +02:00 |
Miriam Baglioni
|
ebc1c5513f
|
modified test resources to mirror the changes in the code
|
2020-10-01 15:42:29 +02:00 |
Miriam Baglioni
|
3a374c34b6
|
fixed null pointer exception
|
2020-10-01 15:41:01 +02:00 |
Miriam Baglioni
|
83ea746163
|
added check to the test
|
2020-10-01 15:40:28 +02:00 |
Miriam Baglioni
|
6e5db85b32
|
-
|
2020-10-01 11:51:11 +02:00 |
Miriam Baglioni
|
a46179f61c
|
refactoring
|
2020-10-01 11:22:01 +02:00 |
Miriam Baglioni
|
b90bee124b
|
removing rows that are empty from those imported
|
2020-10-01 11:16:49 +02:00 |
Miriam Baglioni
|
c107f193c9
|
refactoring
|
2020-10-01 11:16:22 +02:00 |
Miriam Baglioni
|
706a80a29a
|
added test to check that separator '-' (not hyphen) will be recognized
|
2020-10-01 10:38:31 +02:00 |
Miriam Baglioni
|
3dca586b3b
|
refactoring
|
2020-10-01 10:34:48 +02:00 |
Miriam Baglioni
|
416bda6066
|
changed the programme.description by using the same value used in the classification instead of the short title or the title
|
2020-10-01 10:31:33 +02:00 |
Miriam Baglioni
|
f6587c91f3
|
added comparison to a char that looks like '-' but is not
|
2020-10-01 10:30:26 +02:00 |
Miriam Baglioni
|
7e73bb88b3
|
changed the logic to add the topic description to the project
|
2020-09-28 17:21:43 +02:00 |
Miriam Baglioni
|
0a035e3630
|
-
|
2020-09-28 17:20:49 +02:00 |
Miriam Baglioni
|
16bee2084d
|
added the topic code to the project subset
|
2020-09-28 17:20:11 +02:00 |
Miriam Baglioni
|
0bf2d0db52
|
added to the workflow the download of the topic excel file and one property needed to get the input path of the topic file in the hdfs filesystem
|
2020-09-28 12:17:22 +02:00 |
Miriam Baglioni
|
c2abde4d9f
|
changed the implementation of Atomic Actions creation by exploiting the topic information obtained from the cordis excel file
|
2020-09-28 12:16:34 +02:00 |
Miriam Baglioni
|
d930b8d3fc
|
changed the query to get only the code of the project and not the optional1 (topic code) and optional2 (topic description)
|
2020-09-28 12:15:48 +02:00 |
Miriam Baglioni
|
f8f5cfd5cc
|
removed the part added to set the topic code and description in the step of project preparation
|
2020-09-28 12:13:33 +02:00 |
Miriam Baglioni
|
9e19c9a221
|
remove the topic description from the values in the CSVProject class
|
2020-09-28 12:11:03 +02:00 |
Miriam Baglioni
|
6d8b932e40
|
refactoring
|
2020-09-28 12:06:56 +02:00 |
Miriam Baglioni
|
b77f166549
|
changed the package name from csvutils to utils
|
2020-09-28 12:05:47 +02:00 |
Miriam Baglioni
|
e33e3277de
|
added needed dependency to read the excel file
|
2020-09-28 12:03:14 +02:00 |
Miriam Baglioni
|
f4739a371a
|
code to get the information related to the topic association between code and description.
|
2020-09-28 12:02:48 +02:00 |
Miriam Baglioni
|
12c2dfc268
|
modified the resource to consider the information added to the model
|
2020-09-25 14:17:23 +02:00 |
Miriam Baglioni
|
969fa8d96e
|
fixed issue and changed the transformation of the programme file to consider the new model
|
2020-09-25 13:32:34 +02:00 |
Miriam Baglioni
|
e917281822
|
-
|
2020-09-24 15:24:05 +02:00 |
Miriam Baglioni
|
9f54f69e6d
|
added topic information
|
2020-09-24 15:23:35 +02:00 |
Miriam Baglioni
|
d6206d6e63
|
add the topic description to the action set associated to the project
|
2020-09-24 15:22:40 +02:00 |
Miriam Baglioni
|
6b50226f3b
|
added topic code and topic description
|
2020-09-24 15:21:49 +02:00 |
Miriam Baglioni
|
15af1f527e
|
modified to consider the topic information
|
2020-09-24 15:20:56 +02:00 |
Miriam Baglioni
|
609ff17cfc
|
now the commission gives us the framework programme (FP7 - H2020), so use this information to filter out programmes not associated to H2020
|
2020-09-24 15:19:31 +02:00 |
Miriam Baglioni
|
b66f930466
|
Added optional1 and optional2 information to the files read from the db. Optional1 contains the topic code and optional2 contains the topic description
|
2020-09-24 15:16:56 +02:00 |
Miriam Baglioni
|
860e6d38a6
|
added topic description to the CSV project variables
|
2020-09-24 15:15:26 +02:00 |
Miriam Baglioni
|
1d84cf19a6
|
added new line to resource file
|
2020-09-23 17:32:22 +02:00 |
Miriam Baglioni
|
f0c476b6c9
|
modification to the test classes to consider h2020classification
|
2020-09-23 17:31:49 +02:00 |
Miriam Baglioni
|
2cba3cb484
|
modification to the classes building the actionset to consider the h2020classification
|
2020-09-23 17:31:15 +02:00 |
Miriam Baglioni
|
1069cf243a
|
modification to the schema to consider the H2020classification of the programme. The field Programme has been moved inside the H2020classification that is now associated to the Project. Programme is no longer associated directly to the Project but via H2020classification
|
2020-09-22 14:38:00 +02:00 |
Claudio Atzori
|
9cd27183b6
|
[maven-release-plugin] prepare for next development iteration
|
2020-06-22 11:27:44 +02:00 |
Claudio Atzori
|
1e3dab0631
|
[maven-release-plugin] prepare release dhp-1.2.3
|
2020-06-22 11:27:39 +02:00 |
Claudio Atzori
|
306669209f
|
code formatting
|
2020-06-16 16:54:44 +02:00 |
Claudio Atzori
|
603b1bd0bb
|
Merge branch 'master' into dhp_oaf_model
|
2020-06-16 15:43:59 +02:00 |
Claudio Atzori
|
c4d9f1837f
|
[maven-release-plugin] prepare for next development iteration
|
2020-06-12 12:21:08 +02:00 |
Claudio Atzori
|
f0746a7605
|
[maven-release-plugin] prepare release dhp-1.2.2
|
2020-06-12 12:21:03 +02:00 |
Claudio Atzori
|
a2fdf85ba1
|
WIP: graph cleaner implementation
|
2020-06-09 19:52:53 +02:00 |
Miriam Baglioni
|
dfa4997a4f
|
removed commented code
|
2020-05-29 10:45:18 +02:00 |
Miriam Baglioni
|
6f1eea28b6
|
changed message in log
|
2020-05-29 10:41:39 +02:00 |
Miriam Baglioni
|
8b6e886fb6
|
added new resource for testing
|
2020-05-28 23:54:31 +02:00 |
Miriam Baglioni
|
6989fb9c8a
|
changed the project test according to the newly introduced join with the db project codes
|
2020-05-28 23:53:24 +02:00 |
Miriam Baglioni
|
782984d8e5
|
added needed parameter
|
2020-05-28 23:52:41 +02:00 |
Miriam Baglioni
|
01f7876595
|
fix issue with flatMap - the return type must not be null
|
2020-05-28 23:50:32 +02:00 |
Miriam Baglioni
|
773735f870
|
added the path to the file containing the projects code from the db
|
2020-05-28 17:30:45 +02:00 |
Miriam Baglioni
|
6a15067a64
|
added one step in the workflow
|
2020-05-28 17:30:09 +02:00 |
Miriam Baglioni
|
5309a99a70
|
modified the PrepareProjects to consider those in the db
|
2020-05-28 17:29:53 +02:00 |
Miriam Baglioni
|
b737ed8236
|
added part to read projects from the openaire db to filter out those in the csv file that are not in the db
|
2020-05-28 17:29:21 +02:00 |
Miriam Baglioni
|
35b7279147
|
changed test because data are saved as SequenceFile now, and because of the group by the number of produced updates decreases
|
2020-05-28 10:26:12 +02:00 |
Miriam Baglioni
|
df44db686a
|
refactoring
|
2020-05-28 10:07:00 +02:00 |
Miriam Baglioni
|
87b07f4af8
|
removed unused variables
|
2020-05-28 10:05:43 +02:00 |
Miriam Baglioni
|
1060977272
|
added fs actions to remove and then create the workingDir
|
2020-05-28 10:04:36 +02:00 |
Miriam Baglioni
|
96d1a3c431
|
deleted the file used to store the csv files
|
2020-05-28 10:04:10 +02:00 |
Miriam Baglioni
|
669c05c771
|
added groupBy before creating Actions
|
2020-05-28 10:00:45 +02:00 |
Miriam Baglioni
|
1855453434
|
changed the outputdir of the last step
|
2020-05-27 17:59:36 +02:00 |
Miriam Baglioni
|
92e3a52e91
|
merge branch with fork master
|
2020-05-26 15:57:51 +02:00 |
Claudio Atzori
|
7582532e73
|
[maven-release-plugin] prepare for next development iteration
|
2020-05-25 19:48:18 +02:00 |
Claudio Atzori
|
01c2e93395
|
[maven-release-plugin] prepare release dhp-1.2.1
|
2020-05-25 19:48:14 +02:00 |
Miriam Baglioni
|
ac8025f469
|
-
|
2020-05-22 15:29:41 +02:00 |
Miriam Baglioni
|
50ad83b97f
|
-
|
2020-05-22 15:27:19 +02:00 |
Miriam Baglioni
|
473c6d3a23
|
produces AtomicActions instead of Projects
|
2020-05-22 15:26:57 +02:00 |
Miriam Baglioni
|
4589c428b1
|
generates action sets and saves them in the hdfs path for the action sets
|
2020-05-21 16:30:39 +02:00 |
Miriam Baglioni
|
055eec5a77
|
added resource for prepare project test
|
2020-05-20 13:54:10 +02:00 |
Miriam Baglioni
|
9079bc1f61
|
-
|
2020-05-20 13:53:32 +02:00 |
Miriam Baglioni
|
67ba4fde57
|
added test for prepare projects step
|
2020-05-20 13:53:08 +02:00 |
Miriam Baglioni
|
3c0eb12d3e
|
removed the not zipped files
|
2020-05-20 10:31:05 +02:00 |
Miriam Baglioni
|
c0d9e02340
|
zipped test resources that are too big
|
2020-05-20 10:30:25 +02:00 |
Miriam Baglioni
|
5e9c9fa87c
|
tests
|
2020-05-20 10:29:57 +02:00 |
Miriam Baglioni
|
faed7521bf
|
added resources for testing
|
2020-05-20 10:29:29 +02:00 |
Miriam Baglioni
|
75491482de
|
added a new preparation step to replicate each project for the programme it is associated to
|
2020-05-20 10:28:56 +02:00 |
Miriam Baglioni
|
eb0e47ba53
|
parameters for h2020 programme
|
2020-05-20 10:26:44 +02:00 |
Miriam Baglioni
|
08218d2f3f
|
new workflow with added steps
|
2020-05-19 18:44:25 +02:00 |
Miriam Baglioni
|
457293ccc0
|
test for the various steps of project update with programme
|
2020-05-19 18:43:42 +02:00 |
Miriam Baglioni
|
9447d78ef3
|
added preparation classes
|
2020-05-19 18:42:50 +02:00 |
Miriam Baglioni
|
f0f14caf99
|
removed script files for shell actions not performed
|
2020-05-18 13:06:16 +02:00 |
Miriam Baglioni
|
23bbac7d7c
|
-
|
2020-05-18 13:05:03 +02:00 |
Miriam Baglioni
|
4f1ff7ba73
|
added dependency to org.apache.commons commons-csv
|
2020-05-18 13:04:39 +02:00 |
Miriam Baglioni
|
abc45f2708
|
added dnet-45 HttpConnector and related Classes, produced the POJO for projects and programme
|
2020-05-18 13:04:06 +02:00 |
Miriam Baglioni
|
5a648016ef
|
parameters from the GetFile class
|
2020-05-15 18:18:50 +02:00 |
Miriam Baglioni
|
83c262a483
|
workflow to download the files
|
2020-05-15 18:18:31 +02:00 |
Miriam Baglioni
|
22cb9e0da7
|
simple code to get file from URL
|
2020-05-15 18:18:01 +02:00 |
Claudio Atzori
|
60c40618d3
|
[maven-release-plugin] prepare for next development iteration
|
2020-05-11 10:17:14 +02:00 |
Claudio Atzori
|
c267d958d5
|
[maven-release-plugin] prepare release dhp-1.2.0
|
2020-05-11 10:17:10 +02:00 |
Claudio Atzori
|
42f1a2bf94
|
bumped project version to 1.2.0-SNAPSHOT
|
2020-05-11 10:05:57 +02:00 |
Claudio Atzori
|
0ccc864ad9
|
[maven-release-plugin] prepare for next development iteration
|
2020-05-08 17:01:31 +02:00 |
Claudio Atzori
|
6e47c724c6
|
[maven-release-plugin] prepare release dhp-1.1.7
|
2020-05-08 17:01:27 +02:00 |
Claudio Atzori
|
0825321d0b
|
improved unit tests in dhp-aggregation
|
2020-05-05 12:39:04 +02:00 |
Claudio Atzori
|
439c6255a2
|
cleanup
|
2020-04-29 19:09:07 +02:00 |
Claudio Atzori
|
6f5b899038
|
reformatted code according to the updated style descriptor
|
2020-04-28 11:23:29 +02:00 |
Claudio Atzori
|
a0bdbacdae
|
switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin
|
2020-04-27 14:52:31 +02:00 |
Claudio Atzori
|
7a3f8085f7
|
switched automatic code formatting plugin to net.revelc.code.formatter:formatter-maven-plugin
|
2020-04-27 14:45:40 +02:00 |
Claudio Atzori
|
9147af7fed
|
actionsets migration workflow moved in dhp-workflows/dhp-actionmanager
|
2020-04-20 15:24:33 +02:00 |
Claudio Atzori
|
d714bfb4d4
|
collectedfrom field moved in common parent class Oaf.java
|
2020-04-20 12:25:19 +02:00 |
Claudio Atzori
|
ad7a131b18
|
introduced common project code formatting plugin, works on the commit hook, based on https://github.com/Cosium/git-code-format-maven-plugin, applied to each java class in the project
|
2020-04-18 12:42:58 +02:00 |
Claudio Atzori
|
6b5f9ca9cb
|
raw graph creation workflow moved under dhp-graph-mapper, claims integration is included
|
2020-04-10 17:53:07 +02:00 |
Claudio Atzori
|
7061d07727
|
ActionSets migration serializes the output as plain text files instead of SequenceFiles
|
2020-04-01 14:58:22 +02:00 |
Claudio Atzori
|
377e1ba840
|
[maven-release-plugin] prepare for next development iteration
|
2020-03-30 20:06:00 +02:00 |
Claudio Atzori
|
76d9315129
|
[maven-release-plugin] prepare release dhp-1.1.6
|
2020-03-30 20:05:56 +02:00 |
Michele Artini
|
ae03948eed
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-03-27 11:47:07 +01:00 |
Michele Artini
|
f6e86b44a6
|
tests
|
2020-03-27 11:46:37 +01:00 |
Michele Artini
|
408be3c632
|
test and fixed a problem with datacite namespaces
|
2020-03-27 11:44:50 +01:00 |
Sandro La Bruzzo
|
0cd022ad6a
|
merge with master
|
2020-03-26 14:08:29 +01:00 |
Claudio Atzori
|
c0e825e713
|
dhp-aggregation workflow tests upgraded to junit5
|
2020-03-25 17:59:45 +01:00 |
Michele Artini
|
ebe45003d9
|
fixed some junit packages
|
2020-03-25 16:45:03 +01:00 |
Michele Artini
|
d9bfdcd607
|
updated poms
|
2020-03-25 16:31:12 +01:00 |
Michele Artini
|
fd57722c69
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-03-25 15:56:49 +01:00 |
Michele Artini
|
2559299da4
|
tests
|
2020-03-25 12:25:00 +01:00 |
Michele Artini
|
0fda2c3a30
|
some tests on db records
|
2020-03-25 09:43:58 +01:00 |
Michele Artini
|
e3760c7f39
|
fix a bug with organization countries
|
2020-03-24 08:43:56 +01:00 |
Claudio Atzori
|
ecb64e4998
|
Merge branch 'migration_wfs_regular_all_steps'
|
2020-03-23 08:57:01 +01:00 |
Michele Artini
|
15160032bd
|
fixed a bug setting some organization fields
|
2020-03-23 08:39:14 +01:00 |
Claudio Atzori
|
36236dd1c1
|
action migration workflow produces eu.dnetlib.dhp.schema.action.AtomicAction(s)
|
2020-03-19 14:00:38 +01:00 |
Claudio Atzori
|
abe8fb69a2
|
added global properties, moved postprocessing script inside the oozie_app directory
|
2020-03-18 15:43:54 +01:00 |
Claudio Atzori
|
c7e0730720
|
compress the output produced by migration steps 1 and 2
|
2020-03-18 09:34:57 +01:00 |
Claudio Atzori
|
2f11e37602
|
fixed expansion of path variables
|
2020-03-17 19:41:07 +01:00 |
Claudio Atzori
|
2795b0b096
|
no need to mkdir the all_entities file
|
2020-03-17 17:22:14 +01:00 |
Claudio Atzori
|
19746ad308
|
when reuseContent, reset ${workingPath}/all_entities
|
2020-03-17 17:17:06 +01:00 |
Claudio Atzori
|
2f0c85eeb3
|
updated parameters for regular_all_steps workflow, introduced flag 'reuseContent'
|
2020-03-17 17:04:58 +01:00 |
Claudio Atzori
|
b8290b5851
|
updated parameters for regular_all_steps workflow
|
2020-03-17 15:45:30 +01:00 |
Claudio Atzori
|
4706f24ec5
|
updated parameters for regular_all_steps workflow
|
2020-03-17 15:23:54 +01:00 |
Claudio Atzori
|
af835f2f98
|
when migrating actionsets from DM cluster, populate the AtomicAction.targetValue when empty (dedup similarities)
|
2020-03-15 18:07:59 +01:00 |
Claudio Atzori
|
9c84e21b87
|
added workflow to migrate latest version of each actionset content from DM to OCEAN cluster, mapping the targetValues from the old protobuf data model to the dhp.OAF datamodel
|
2020-03-13 15:56:52 +01:00 |
Michele Artini
|
b6efa9d6ab
|
Configuration of the SequenceFile Writer
|
2020-03-05 15:49:14 +01:00 |
Michele Artini
|
755eade2fb
|
fix creation ids
|
2020-03-04 14:49:45 +01:00 |
Michele Artini
|
e7167b996a
|
logs and closeable
|
2020-03-04 10:46:36 +01:00 |
Michele Artini
|
4b29a121b0
|
migration using spark in step2
|
2020-03-02 16:12:14 +01:00 |
Michele Artini
|
5445a57102
|
migration using spark in step2
|
2020-03-02 16:11:59 +01:00 |
Michele Artini
|
93665773ea
|
Fixed a problem with JavaRDD Union
|
2020-02-25 15:59:21 +01:00 |
Michele Artini
|
5d3739b5cf
|
migration of claims
|
2020-02-19 15:11:17 +01:00 |
Michele Artini
|
173f1df1e5
|
saved a query for openaire production database
|
2020-02-19 10:15:08 +01:00 |
Sandro La Bruzzo
|
9a2d74ac82
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-02-19 10:13:45 +01:00 |
Sandro La Bruzzo
|
e5d7cdf422
|
fixed sql query
|
2020-02-19 10:13:36 +01:00 |
Sandro La Bruzzo
|
2b8675462f
|
refactoring code
|
2020-02-19 10:07:08 +01:00 |
Claudio Atzori
|
6a288625e5
|
fixed workflow outgoing node
|
2020-02-17 15:04:33 +01:00 |
Sandro La Bruzzo
|
76ee85141a
|
added oozie job for DNET migration and implemented Spark job for extracting entities
|
2020-02-17 12:31:44 +01:00 |
Michele Artini
|
176c5606bd
|
aligned with origin/master, aligned model and mapping
|
2020-02-17 10:40:53 +01:00 |
Claudio Atzori
|
a3d0b57b25
|
[maven-release-plugin] prepare for next development iteration
|
2020-02-13 18:11:33 +01:00 |
Claudio Atzori
|
6ed9a15bc8
|
[maven-release-plugin] prepare release dhp-1.1.5
|
2020-02-13 18:11:31 +01:00 |
Claudio Atzori
|
49e648f7c3
|
bumped version
|
2020-02-13 18:09:31 +01:00 |
Michele Artini
|
80cb52593f
|
bug fixing
|
2020-02-13 15:34:13 +01:00 |
Michele Artini
|
cdea0dae75
|
bug fixing
|
2020-02-12 16:34:00 +01:00 |
Michele Artini
|
69336195d3
|
simplifications
|
2020-02-12 11:12:38 +01:00 |