Miriam Baglioni
|
ea88dc3401
|
fixed issue in property name
|
2020-12-03 11:24:23 +01:00 |
Claudio Atzori
|
893ac4a77b
|
GenerateEntitiesApplication can be configured to hash the id value or not
|
2020-12-02 09:30:06 +01:00 |
Miriam Baglioni
|
5fbe54ef54
|
#61 (comment)
|
2020-11-25 18:10:28 +01:00 |
Miriam Baglioni
|
ed01e5a5e1
|
#61 (comment)
|
2020-11-25 18:09:34 +01:00 |
Miriam Baglioni
|
f5e5e92a10
|
changed because of #61 (comment)
|
2020-11-25 17:58:53 +01:00 |
Claudio Atzori
|
dfd6205b95
|
Consistency graph workflow merges all the entities by ID
|
2020-11-25 14:55:32 +01:00 |
Miriam Baglioni
|
e7e418e444
|
added decision node to check whether to upload to Zenodo
|
2020-11-25 13:44:10 +01:00 |
Miriam Baglioni
|
39f4a20873
|
changed the path and the name for saving the communities_infrastructures dump file
|
2020-11-24 14:47:32 +01:00 |
Miriam Baglioni
|
7e14452a87
|
final version of the wf to get the dump of results associated to at least one funder, per funder
|
2020-11-24 14:46:34 +01:00 |
Miriam Baglioni
|
c167a18057
|
added new parameter for the dumpType
|
2020-11-24 14:45:50 +01:00 |
Claudio Atzori
|
33bae02451
|
reverted behaviour of the cleaning workflow: grouping entities by ID will be managed differently
|
2020-11-24 14:42:33 +01:00 |
Miriam Baglioni
|
0a9db67eec
|
-
|
2020-11-20 12:21:33 +01:00 |
Miriam Baglioni
|
cf3f47563f
|
new parameter files
|
2020-11-19 19:16:05 +01:00 |
Miriam Baglioni
|
24c56fa7a3
|
new logic and workflow for the dump of results with links to projects. In this implementation the result matches the model of the communityresult.
|
2020-11-19 19:15:39 +01:00 |
Miriam Baglioni
|
fafb688887
|
-
|
2020-11-18 18:56:48 +01:00 |
Miriam Baglioni
|
46ba3793f6
|
code, workflow and parameters for the dump of the results associated to funders
|
2020-11-18 16:47:31 +01:00 |
Miriam Baglioni
|
57cac36898
|
changed the workflow name
|
2020-11-18 13:38:03 +01:00 |
Claudio Atzori
|
6ab1ce53c9
|
fixed condition in result pid cleaning; cleanup
|
2020-11-16 10:09:17 +01:00 |
Claudio Atzori
|
4de8c8b237
|
fixed workflow variable name
|
2020-11-16 10:03:11 +01:00 |
Claudio Atzori
|
5d4e34e26a
|
fixed typo in variable name
|
2020-11-14 10:32:26 +01:00 |
Claudio Atzori
|
528231a287
|
grouping graph entities by id turned out to be an easy extension for the already existing cleaning workflow
|
2020-11-13 15:37:48 +01:00 |
Claudio Atzori
|
2bed29eb09
|
WIP: added oozie workflow for grouping graph entities by id
|
2020-11-13 10:05:12 +01:00 |
Claudio Atzori
|
13e36a4da0
|
WIP: added oozie workflow for grouping graph entities by id
|
2020-11-13 10:05:02 +01:00 |
Michele Artini
|
40160d171f
|
organizations pids
|
2020-11-09 12:58:36 +01:00 |
Claudio Atzori
|
d10447e747
|
re-packaged graph dump workflow sources
|
2020-11-05 17:38:18 +01:00 |
Miriam Baglioni
|
f8e9bda24c
|
merge branch with master
|
2020-11-05 16:31:18 +01:00 |
Miriam Baglioni
|
be5ed8f554
|
added check to avoid sending empty metadata.
|
2020-11-05 16:10:17 +01:00 |
Claudio Atzori
|
2148a51fae
|
minor changes
|
2020-11-05 11:24:12 +01:00 |
Miriam Baglioni
|
b90a945c49
|
removed property files for pid graph dump
|
2020-11-04 17:28:33 +01:00 |
Alessia Bardi
|
51808b5afd
|
Updated descriptions
|
2020-11-04 12:29:48 +01:00 |
Alessia Bardi
|
e6becf8659
|
Updated descriptions
|
2020-11-04 12:17:57 +01:00 |
Alessia Bardi
|
0abe0eee33
|
Updated descriptions
|
2020-11-04 12:15:30 +01:00 |
Alessia Bardi
|
f6ab238f5d
|
Updated descriptions
|
2020-11-04 11:50:47 +01:00 |
Miriam Baglioni
|
c209284ca7
|
new schemas for the entities in the dump with added descriptions
|
2020-11-03 16:58:08 +01:00 |
Miriam Baglioni
|
08806deddf
|
added the splitSize non mandatory parameter. Default size 10G
|
2020-11-03 16:57:34 +01:00 |
Miriam Baglioni
|
7d2eda43ca
|
added new non-mandatory property publish to determine whether to publish the upload or leave it pending. Default value false
|
2020-11-03 16:57:01 +01:00 |
Miriam Baglioni
|
d4382b54df
|
moved the tar archive with max size to the common module
|
2020-11-03 16:54:50 +01:00 |
Miriam Baglioni
|
78fdb11c3f
|
merge branch with master
|
2020-10-29 12:55:22 +01:00 |
Sandro La Bruzzo
|
1d9fdb7367
|
fixed spark memory issue in SparkSplitOafTODLIEntities
|
2020-10-28 12:30:32 +01:00 |
Miriam Baglioni
|
3241ec1777
|
added connection timeout and socket timeout 600 sec
|
2020-10-27 16:12:11 +01:00 |
Claudio Atzori
|
b961dc7d1e
|
added originalid to the fields in the result graph view
|
2020-10-09 13:53:15 +02:00 |
Miriam Baglioni
|
11b7eaae09
|
changed the name of the folder where to store the context entity from context to communities_infrastructures
|
2020-10-05 11:24:54 +02:00 |
Claudio Atzori
|
c2a6e2a9bf
|
fixed mapping for datasource journal info (ISSNs)
|
2020-10-02 09:37:08 +02:00 |
Miriam Baglioni
|
01117a46e1
|
whole workflow activated
|
2020-10-01 17:19:21 +02:00 |
Miriam Baglioni
|
fcaedac980
|
merge branch with master
|
2020-10-01 16:46:59 +02:00 |
Claudio Atzori
|
4287164aba
|
include relevantdate field in the result view
|
2020-10-01 10:28:55 +02:00 |
Miriam Baglioni
|
983a12ed15
|
temporary modification to allow the upload of files in the sandbox without the need to recreate the mapping from scratch
|
2020-09-25 16:41:51 +02:00 |
Miriam Baglioni
|
8b36d19182
|
added property depositionId and changed property newVersion from boolean to string to handle the three possible distinct values
|
2020-09-25 16:41:15 +02:00 |
Miriam Baglioni
|
54800fb9b0
|
enabled only the step to upload in zenodo
|
2020-09-25 14:40:22 +02:00 |
Miriam Baglioni
|
de6c4d46d8
|
fixed conflicts
|
2020-09-24 15:35:01 +02:00 |
Claudio Atzori
|
044d3a0214
|
fixed query used to load datasources in the Graph
|
2020-09-24 13:48:58 +02:00 |
Claudio Atzori
|
42f55395c8
|
fixed order of the ISSNs returned by the SQL query
|
2020-09-24 12:09:58 +02:00 |
Claudio Atzori
|
9a7e72d528
|
using concat_ws to join textual columns from PSQL. When using || to perform the concatenation, a single Null column makes the result of the whole operation Null
|
2020-09-24 10:42:47 +02:00 |
Miriam Baglioni
|
e2ceefe9be
|
-
|
2020-09-14 14:33:28 +02:00 |
Miriam Baglioni
|
40c8d2de7b
|
test resources for the dump of the pids graph
|
2020-08-24 16:50:39 +02:00 |
Miriam Baglioni
|
c5858afb88
|
added parameter to guide the dump for the result (resultAggregation). true if all the result types should be dumped together, false otherwise.
|
2020-08-19 11:24:14 +02:00 |
Miriam Baglioni
|
5570678c65
|
changed parameter name from hfdsNameNode to nameNode
|
2020-08-19 10:59:26 +02:00 |
Miriam Baglioni
|
37e7c43652
|
changed parameter name from hdfsNaemNode to nameNode
|
2020-08-14 18:18:25 +02:00 |
Miriam Baglioni
|
acb0926b2e
|
json schemas for the dumped entities and relation
|
2020-08-11 15:39:48 +02:00 |
Miriam Baglioni
|
ff52c51f92
|
added the communityMapPath parameter and removed the isLookUpUrl parameter
|
2020-08-11 15:39:22 +02:00 |
Miriam Baglioni
|
6f43acda5e
|
added the maketar and send to zenodo step. Adjusted wf parameters
|
2020-08-11 15:38:20 +02:00 |
Miriam Baglioni
|
ddc19de2e9
|
removed the isLookUpUrl among the parameters
|
2020-08-11 15:37:47 +02:00 |
Miriam Baglioni
|
592a8ea573
|
added parameter file for maketar class
|
2020-08-11 15:37:14 +02:00 |
Miriam Baglioni
|
77a0951b32
|
added the make archive step in the workflow
|
2020-08-11 15:32:32 +02:00 |
Miriam Baglioni
|
fe88904df0
|
changed the wf definition
|
2020-08-10 12:01:14 +02:00 |
Miriam Baglioni
|
1cf7043e26
|
removed isLookUpUrl from the parameters
|
2020-08-10 11:38:03 +02:00 |
Miriam Baglioni
|
46986aae2d
|
added the new parameter for newdeposition/newversion and concept_record_id
|
2020-08-07 18:00:06 +02:00 |
Miriam Baglioni
|
da9b012c15
|
fixed description
|
2020-08-06 11:55:44 +02:00 |
Miriam Baglioni
|
6dbadcf181
|
the new schema for the dumped result
|
2020-08-06 11:05:56 +02:00 |
Miriam Baglioni
|
5b651abf82
|
merge branch with master
|
2020-08-04 10:14:07 +02:00 |
Miriam Baglioni
|
901ae37f7b
|
added step to workflow
|
2020-08-03 18:12:54 +02:00 |
Miriam Baglioni
|
e43aeb139a
|
added new property file and changed some parameters in old files
|
2020-08-03 18:07:28 +02:00 |
Miriam Baglioni
|
c892c7dfa7
|
changed to query for community map just once and save the result for remaining executions
|
2020-08-03 17:56:31 +02:00 |
Miriam Baglioni
|
6f1c40a933
|
-
|
2020-07-30 16:24:28 +02:00 |
Miriam Baglioni
|
2b66a93f9e
|
added property file that was missing
|
2020-07-30 16:24:17 +02:00 |
Michele Artini
|
bdece15ca0
|
blacklist of nsprefix
|
2020-07-30 16:13:38 +02:00 |
Sandro La Bruzzo
|
3010a362bc
|
updated the provision workflow in the aggregation phase. Removed serialization in JSON RDD and used Spark Dataset
|
2020-07-30 09:25:56 +02:00 |
Miriam Baglioni
|
b48934f6df
|
changed the workflow name
|
2020-07-29 17:43:43 +02:00 |
Miriam Baglioni
|
8ad8dac7d4
|
merge branch with fork master
|
2020-07-29 17:38:28 +02:00 |
Miriam Baglioni
|
40a8dafbdc
|
-
|
2020-07-29 17:30:44 +02:00 |
Miriam Baglioni
|
8d4327b292
|
input parameters and workflow definition for the dump of the whole graph
|
2020-07-29 17:00:34 +02:00 |
Miriam Baglioni
|
178c2729a7
|
changed the path to reach the java class to be executed
|
2020-07-29 12:29:51 +02:00 |
Miriam Baglioni
|
437ac12139
|
removed unused parameter
|
2020-07-29 12:28:16 +02:00 |
Miriam Baglioni
|
332258d199
|
split the classes related to the communities dump and to the whole graph dump
|
2020-07-24 17:21:48 +02:00 |
Claudio Atzori
|
56bbfdc65d
|
introduced parameter 'numParitions', driving the hive DB table data partitioning. Currently specified only for table 'project'
|
2020-07-23 08:54:10 +02:00 |
Miriam Baglioni
|
40bbe94f7c
|
merge with master fork
|
2020-07-20 18:10:03 +02:00 |
Miriam Baglioni
|
23160b4d29
|
realignment of the workflow classes with the changes in the structure of the module
|
2020-07-20 18:04:30 +02:00 |
Claudio Atzori
|
e0c4cf6f7b
|
added parameter to drive the graph merge strategy: priority (BETA|PROD)
|
2020-07-20 10:48:01 +02:00 |
Claudio Atzori
|
94ccdb4852
|
Merge branch 'master' into merge_graph
|
2020-07-20 10:14:55 +02:00 |
Sandro La Bruzzo
|
9116d75b3e
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-07-17 18:01:30 +02:00 |
Claudio Atzori
|
878f2b931c
|
Merge branch 'master' into merge_graph
|
2020-07-16 16:34:24 +02:00 |
Claudio Atzori
|
cc77446dc4
|
added dbSchema parameter to the raw_db workflow
|
2020-07-10 19:01:50 +02:00 |
Sandro La Bruzzo
|
c01efed79b
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-07-10 14:44:57 +02:00 |
Sandro La Bruzzo
|
a7d3977481
|
added generation of EBI Dataset
|
2020-07-10 14:44:50 +02:00 |
Claudio Atzori
|
610d377d57
|
first implementation of the BETA & PROD graphs merge procedure
|
2020-07-08 16:54:26 +02:00 |
Alessia Bardi
|
8f83b726fa
|
Dump JSON schema compliant with JSON Schema Draft 7
|
2020-07-08 12:48:46 +02:00 |
Miriam Baglioni
|
7fe00cb4fb
|
-
|
2020-07-08 10:29:37 +02:00 |
Miriam Baglioni
|
b2782025f6
|
enabled the whole workflow to run. Added property to give priority to dependencies in the classpath, to solve conflicts
|
2020-07-07 18:10:47 +02:00 |
Miriam Baglioni
|
f5bb65c9ef
|
the json schema for the dump of the results
|
2020-07-07 17:34:40 +02:00 |
Miriam Baglioni
|
c19818a3f8
|
merge branch with fork master
|
2020-07-06 13:58:23 +02:00 |