Miriam Baglioni
|
08806deddf
|
added the splitSize non mandatory parameter. Default size 10G
|
2020-11-03 16:57:34 +01:00 |
Miriam Baglioni
|
7d2eda43ca
|
added new non mandatory property publish to determine whether to publish the upload or leave it pending. Default value false
|
2020-11-03 16:57:01 +01:00 |
Miriam Baglioni
|
cbbb1bdc54
|
moved business logic to new class in common for handling the zip of the archives
|
2020-11-03 16:55:50 +01:00 |
Miriam Baglioni
|
d4382b54df
|
moved the tar archive with max size to the common module
|
2020-11-03 16:54:50 +01:00 |
Claudio Atzori
|
5310e56dba
|
remove empty PIDs
|
2020-11-03 11:52:10 +01:00 |
Sandro La Bruzzo
|
754c86f33e
|
fixed test to work on jenkins
|
2020-11-02 09:35:01 +01:00 |
Sandro La Bruzzo
|
39337d8a8a
|
fixed test
|
2020-11-02 09:26:25 +01:00 |
Miriam Baglioni
|
dabb33e018
|
changed the discriminant used to split the file
|
2020-10-30 17:52:22 +01:00 |
Claudio Atzori
|
c5dda3a00c
|
Merge pull request 'h2020classification' (#49) from miriam.baglioni/dnet-hadoop:h2020classification into master
LGTM
|
2020-10-30 17:10:05 +01:00 |
Miriam Baglioni
|
4905739be6
|
changed resource file to mirror change in business logic
|
2020-10-30 17:02:57 +01:00 |
Miriam Baglioni
|
b40360ebfb
|
changed the code to mirror the changed decision in the classification level and programme description labels
|
2020-10-30 17:02:30 +01:00 |
Miriam Baglioni
|
696409fb9f
|
disabled tests because they need a remote resource
|
2020-10-30 17:01:48 +01:00 |
Miriam Baglioni
|
0fba08eae4
|
max allowed size per file 10 Gb
|
2020-10-30 16:05:55 +01:00 |
Miriam Baglioni
|
b828587252
|
prevent the code from cycling indefinitely
|
2020-10-30 15:01:25 +01:00 |
Miriam Baglioni
|
f747e303ac
|
classes for dumping of the graph as ttl file
|
2020-10-30 14:13:45 +01:00 |
Miriam Baglioni
|
16baf5b69e
|
formatting
|
2020-10-30 14:13:14 +01:00 |
Miriam Baglioni
|
a9eef9c852
|
added check for possible Optional value in relation dataInfo
|
2020-10-30 14:12:28 +01:00 |
Miriam Baglioni
|
5f4de9a962
|
formatting
|
2020-10-30 14:11:40 +01:00 |
Miriam Baglioni
|
14bf2e7238
|
added option to split dumps bigger than 40Gb into different files
|
2020-10-30 14:09:04 +01:00 |
Miriam Baglioni
|
78fdb11c3f
|
merge branch with master
|
2020-10-29 12:55:22 +01:00 |
Sandro La Bruzzo
|
1d9fdb7367
|
fixed spark memory issue in SparkSplitOafTODLIEntities
|
2020-10-28 12:30:32 +01:00 |
Miriam Baglioni
|
d2374e3b9e
|
added code to handle cases where the funding tree is not existing
|
2020-10-27 16:15:21 +01:00 |
Miriam Baglioni
|
5d3012eeb4
|
changed code to dump only the programme list and not the classification list
|
2020-10-27 16:14:18 +01:00 |
Miriam Baglioni
|
3241ec1777
|
added connection timeout and socket timeout 600 sec
|
2020-10-27 16:12:11 +01:00 |
Enrico Ottonello
|
9818e74a70
|
added dependency version in main pom.xml for orcid no doi
|
2020-10-22 16:38:00 +02:00 |
Enrico Ottonello
|
210a50e4f4
|
replaced null value
|
2020-10-22 16:24:42 +02:00 |
Enrico Ottonello
|
b0290dbcb7
|
moved all dependencies version to main pom.xml
|
2020-10-22 16:20:46 +02:00 |
Enrico Ottonello
|
a38ab57062
|
let run test methods
|
2020-10-22 15:43:50 +02:00 |
Enrico Ottonello
|
1139d6568d
|
replaced null value with a more safe empty string as return value
|
2020-10-22 15:32:26 +02:00 |
Enrico Ottonello
|
c58db1c8ea
|
added filter on null value after map function
|
2020-10-22 15:11:02 +02:00 |
Enrico Ottonello
|
846ba30873
|
if typologies mapping fails, an exception will be propagated
|
2020-10-22 14:36:18 +02:00 |
Enrico Ottonello
|
c3114ba0ae
|
replaced null as return value with a more safe empty string
|
2020-10-22 14:21:31 +02:00 |
Enrico Ottonello
|
c295c71ca0
|
added comment
|
2020-10-22 14:07:26 +02:00 |
Enrico Ottonello
|
ab083f9946
|
propagate exception on parsing work (PR request)
|
2020-10-22 14:02:32 +02:00 |
sandro
|
3a81a940b7
|
solved bug on merge publication
|
2020-10-21 22:41:55 +02:00 |
Miriam Baglioni
|
a2ce527fae
|
changed to match the requirements for short titles in level and long titles in classification
|
2020-10-20 17:03:25 +02:00 |
Sandro La Bruzzo
|
346ed65e2c
|
added upload to zenodo node
|
2020-10-20 16:59:55 +02:00 |
sandro
|
271b4db450
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-10-20 16:09:49 +02:00 |
sandro
|
d58d02d448
|
added workflow upload on zenodo
|
2020-10-20 16:09:07 +02:00 |
Alessia Bardi
|
1425d810a8
|
testing mapping
|
2020-10-19 17:46:14 +02:00 |
Sandro La Bruzzo
|
fed711da80
|
Merge remote-tracking branch 'origin/master' into merge_record_to_common
|
2020-10-13 15:32:45 +02:00 |
Sandro La Bruzzo
|
34bf64c94f
|
fixed export Scholexplorer to OpenAire
|
2020-10-13 08:47:58 +02:00 |
Alessia Bardi
|
8775a64bc1
|
Merge pull request 'Merging different compatibility levels (pinocchio operator)' (#47) from merge_graph into master
|
2020-10-09 14:44:52 +02:00 |
Claudio Atzori
|
e751c1402f
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-10-09 13:53:21 +02:00 |
Claudio Atzori
|
b961dc7d1e
|
added originalid to the fields in the result graph view
|
2020-10-09 13:53:15 +02:00 |
Sandro La Bruzzo
|
734934e2eb
|
fixed error on empty intersection with publication and relation on export to OAF
|
2020-10-08 17:29:29 +02:00 |
Sandro La Bruzzo
|
eec418cd26
|
moved AuthoreMerger into dhp-common
|
2020-10-08 10:33:55 +02:00 |
Sandro La Bruzzo
|
fe0a7870e6
|
Added test to check if merge authors works
|
2020-10-08 10:33:12 +02:00 |
Sandro La Bruzzo
|
cd9c377d18
|
adapted scholexplorer Dump generation to the new Dataset definition
|
2020-10-08 10:10:13 +02:00 |
Claudio Atzori
|
a3f37a9414
|
javadoc
|
2020-10-07 16:44:22 +02:00 |
Claudio Atzori
|
8d85a2fced
|
[BETA wf only] datasources involved in the merge operation don't obey the infra precedence policy, but rely on a custom behaviour that, given two datasources from beta and prod, returns the one from prod with the highest compatibility among the two
|
2020-10-07 16:28:52 +02:00 |
Claudio Atzori
|
5f7b75f5c5
|
code formatting
|
2020-10-07 13:22:54 +02:00 |
miconis
|
5a8bc329c5
|
bug fix in the result merge: it takes the correct bestaccessright based on the license instead of the trust
|
2020-10-06 15:26:44 +02:00 |
Miriam Baglioni
|
061527f06e
|
adding short description
|
2020-10-05 13:54:39 +02:00 |
Miriam Baglioni
|
0c12d7bdd8
|
adding short description
|
2020-10-05 11:39:55 +02:00 |
Miriam Baglioni
|
ae08b3c0dd
|
merge branch with master
|
2020-10-05 11:35:55 +02:00 |
Miriam Baglioni
|
11b7eaae09
|
changed the name of the folder where to store the context entity from context to communities_infrastructures
|
2020-10-05 11:24:54 +02:00 |
Miriam Baglioni
|
32bffb0134
|
changed the name from communities_infrastructures to communities_infrastuctures.json
|
2020-10-05 11:24:17 +02:00 |
Claudio Atzori
|
23f64d9eb4
|
updated dedup tests following the dnet-pace-core library update
|
2020-10-02 14:30:53 +02:00 |
Miriam Baglioni
|
fc2f7636be
|
removed not used code
|
2020-10-02 12:33:52 +02:00 |
Miriam Baglioni
|
25cbcf6114
|
changed to solve issues about names. context renamed communities_infrastructure.json and removed the double json.gz extension from the name of the part in the tar
|
2020-10-02 12:17:46 +02:00 |
Claudio Atzori
|
9db0f88fb8
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-10-02 09:43:35 +02:00 |
Claudio Atzori
|
49ae3450a9
|
code formatting
|
2020-10-02 09:43:24 +02:00 |
Claudio Atzori
|
c2a6e2a9bf
|
fixed mapping for datasource journal info (ISSNs)
|
2020-10-02 09:37:08 +02:00 |
Miriam Baglioni
|
01117a46e1
|
whole workflow activated
|
2020-10-01 17:19:21 +02:00 |
Miriam Baglioni
|
cfb5766c6b
|
removed double json.gz from names of files in the tar
|
2020-10-01 17:18:34 +02:00 |
Miriam Baglioni
|
fcaedac980
|
merge branch with master
|
2020-10-01 16:46:59 +02:00 |
Miriam Baglioni
|
c6e6ed1bd8
|
merge branch with master
|
2020-10-01 16:24:41 +02:00 |
Miriam Baglioni
|
4aec347351
|
refactoring
|
2020-10-01 16:23:52 +02:00 |
Miriam Baglioni
|
61946b4092
|
refactoring
|
2020-10-01 16:22:48 +02:00 |
Miriam Baglioni
|
7e6d35e56c
|
added the link to the excel file related to topic
|
2020-10-01 15:53:31 +02:00 |
Sandro La Bruzzo
|
1a0a44e85a
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-10-01 15:46:53 +02:00 |
Sandro La Bruzzo
|
c4a3c52e45
|
fixed Doiboost bug in the identifier
|
2020-10-01 15:46:44 +02:00 |
Miriam Baglioni
|
43cbd62c2b
|
added classpath.first in the configuration
|
2020-10-01 15:46:34 +02:00 |
Miriam Baglioni
|
cd69c6b023
|
added dependency for the topic file path
|
2020-10-01 15:45:59 +02:00 |
Miriam Baglioni
|
771cde3d05
|
moved the library version to global pom
|
2020-10-01 15:43:47 +02:00 |
Miriam Baglioni
|
632351c0da
|
modified test resources to mirror the changes in the code
|
2020-10-01 15:43:02 +02:00 |
Miriam Baglioni
|
ebc1c5513f
|
modified test resources to mirror the changes in the code
|
2020-10-01 15:42:29 +02:00 |
Miriam Baglioni
|
3a374c34b6
|
fixed null pointer exception
|
2020-10-01 15:41:01 +02:00 |
Miriam Baglioni
|
83ea746163
|
added check to the test
|
2020-10-01 15:40:28 +02:00 |
Claudio Atzori
|
2e9e13444d
|
author pids made unique by value
|
2020-10-01 12:50:40 +02:00 |
Miriam Baglioni
|
6e5db85b32
|
-
|
2020-10-01 11:51:11 +02:00 |
Miriam Baglioni
|
a46179f61c
|
refactoring
|
2020-10-01 11:22:01 +02:00 |
Miriam Baglioni
|
b90bee124b
|
removing rows that are empty from those imported
|
2020-10-01 11:16:49 +02:00 |
Miriam Baglioni
|
c107f193c9
|
refactoring
|
2020-10-01 11:16:22 +02:00 |
Claudio Atzori
|
e265c3e125
|
cleaning functions factored out in a dedicated class
|
2020-10-01 10:50:15 +02:00 |
Miriam Baglioni
|
706a80a29a
|
added test to check that separator '-' (not hyphen) will be recognized
|
2020-10-01 10:38:31 +02:00 |
Miriam Baglioni
|
3dca586b3b
|
refactoring
|
2020-10-01 10:34:48 +02:00 |
Miriam Baglioni
|
416bda6066
|
changed the programme.description by using the same value used in the classification instead of the short title or the title
|
2020-10-01 10:31:33 +02:00 |
Miriam Baglioni
|
f6587c91f3
|
added comparison to a char that looks like - but is not
|
2020-10-01 10:30:26 +02:00 |
Claudio Atzori
|
4287164aba
|
include relevantdate field in the result view
|
2020-10-01 10:28:55 +02:00 |
Miriam Baglioni
|
7e73bb88b3
|
changed the logic to add the topic description to the project
|
2020-09-28 17:21:43 +02:00 |
Miriam Baglioni
|
0a035e3630
|
-
|
2020-09-28 17:20:49 +02:00 |
Miriam Baglioni
|
16bee2084d
|
added the topic code to the project subset
|
2020-09-28 17:20:11 +02:00 |
Miriam Baglioni
|
0bf2d0db52
|
added to the workflow the download of the topic excel file and one property needed to get the input path of the topic file in the hdfs filesystem
|
2020-09-28 12:17:22 +02:00 |
Miriam Baglioni
|
c2abde4d9f
|
changed the implementation of Atomic Actions creation by exploiting the topic information obtained from the cordis excel file
|
2020-09-28 12:16:34 +02:00 |
Miriam Baglioni
|
d930b8d3fc
|
changed the query to get only the code of the project and not the optional1 (topic code) and optional2 (topic description)
|
2020-09-28 12:15:48 +02:00 |
Miriam Baglioni
|
f8f5cfd5cc
|
removed the part added to set the topic code and description in the step of project preparation
|
2020-09-28 12:13:33 +02:00 |
Miriam Baglioni
|
9e19c9a221
|
remove the topic description from the values in the CSVProject class
|
2020-09-28 12:11:03 +02:00 |
Miriam Baglioni
|
6d8b932e40
|
refactoring
|
2020-09-28 12:06:56 +02:00 |
Miriam Baglioni
|
b77f166549
|
changed the package name from csvutils to utils
|
2020-09-28 12:05:47 +02:00 |
Miriam Baglioni
|
e33e3277de
|
added needed dependency to read the excel file
|
2020-09-28 12:03:14 +02:00 |
Miriam Baglioni
|
f4739a371a
|
code to get the information related to the topic association between code and description.
|
2020-09-28 12:02:48 +02:00 |
Miriam Baglioni
|
7b6a7333e6
|
merge branch with master
|
2020-09-25 16:42:07 +02:00 |
Miriam Baglioni
|
983a12ed15
|
temporary modification to allow the upload of files in the sandbox without the need to recreate the mapping from scratch
|
2020-09-25 16:41:51 +02:00 |
Miriam Baglioni
|
8b36d19182
|
added property depositionId and changed property newVersion from boolean to string to handle the three possible distinct values
|
2020-09-25 16:41:15 +02:00 |
Miriam Baglioni
|
ed5239f9ec
|
added new code to handle the new possibility to upload files to an already open deposition
|
2020-09-25 16:34:32 +02:00 |
Miriam Baglioni
|
3a8c524fce
|
refactor
|
2020-09-25 16:34:02 +02:00 |
Miriam Baglioni
|
2ac2b537b6
|
merge branch with master
|
2020-09-25 14:40:47 +02:00 |
Miriam Baglioni
|
54800fb9b0
|
enabled only the step to upload in zenodo
|
2020-09-25 14:40:22 +02:00 |
Miriam Baglioni
|
12c2dfc268
|
modified the resource to consider the information added to the model
|
2020-09-25 14:17:23 +02:00 |
Miriam Baglioni
|
969fa8d96e
|
fixed issue and changed the transformation of the programme file to consider the new model
|
2020-09-25 13:32:34 +02:00 |
Michele Artini
|
c171fdebe1
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-09-25 09:03:09 +02:00 |
Michele Artini
|
c96598aaa4
|
opendoar partition
|
2020-09-25 09:02:58 +02:00 |
Miriam Baglioni
|
de6c4d46d8
|
fixed conflicts
|
2020-09-24 15:35:01 +02:00 |
Miriam Baglioni
|
e917281822
|
-
|
2020-09-24 15:24:05 +02:00 |
Miriam Baglioni
|
9f54f69e6d
|
added topic information
|
2020-09-24 15:23:35 +02:00 |
Miriam Baglioni
|
d6206d6e63
|
add the topic description to the action set associated to the project
|
2020-09-24 15:22:40 +02:00 |
Miriam Baglioni
|
6b50226f3b
|
added topic code and topic description
|
2020-09-24 15:21:49 +02:00 |
Miriam Baglioni
|
15af1f527e
|
modified to consider the topic information
|
2020-09-24 15:20:56 +02:00 |
Miriam Baglioni
|
609ff17cfc
|
now the commission gives us the framework programme (FP7 - H2020) so use this information to filter out programmes not associated to H2020
|
2020-09-24 15:19:31 +02:00 |
Miriam Baglioni
|
b66f930466
|
Added optional1 and optional2 information to the files read from the db. Optional1 contains the topic code and optional2 contains the topic description
|
2020-09-24 15:16:56 +02:00 |
Miriam Baglioni
|
860e6d38a6
|
added topic description to the CSV project variables
|
2020-09-24 15:15:26 +02:00 |
Claudio Atzori
|
044d3a0214
|
fixed query used to load datasources in the Graph
|
2020-09-24 13:48:58 +02:00 |
Claudio Atzori
|
27df1cea6d
|
code formatting
|
2020-09-24 12:16:00 +02:00 |
Claudio Atzori
|
fb22f4d70b
|
included values for projects fundedamount and totalcost fields in the mapping tests. Swapped expected and actual values in junit test assertions
|
2020-09-24 12:10:59 +02:00 |
Claudio Atzori
|
42f55395c8
|
fixed order of the ISSNs returned by the SQL query
|
2020-09-24 12:09:58 +02:00 |
Claudio Atzori
|
fadf5c7c69
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-09-24 10:42:52 +02:00 |
Claudio Atzori
|
9a7e72d528
|
using concat_ws to join textual columns from PSQL. When using || to perform the concatenation, Null columns make the operation result Null
|
2020-09-24 10:42:47 +02:00 |
Claudio Atzori
|
9e3e93c6b6
|
setting the correct issn type in the datasource.journal element
|
2020-09-24 10:39:16 +02:00 |
Miriam Baglioni
|
0d83f47166
|
merge branch with master
|
2020-09-23 17:33:49 +02:00 |
Miriam Baglioni
|
39eb8ab25b
|
changed the dump to move from h2020programme to h2020classification
|
2020-09-23 17:33:00 +02:00 |
Miriam Baglioni
|
1d84cf19a6
|
added new line to resource file
|
2020-09-23 17:32:22 +02:00 |
Miriam Baglioni
|
f0c476b6c9
|
modification to the test classes to consider h2020classification
|
2020-09-23 17:31:49 +02:00 |
Miriam Baglioni
|
2cba3cb484
|
modification to the classes building the actionset to consider the h2020classification
|
2020-09-23 17:31:15 +02:00 |
Miriam Baglioni
|
1069cf243a
|
modification to the schema to consider the H2020classification of the programme. The field Programme has been moved inside the H2020classification that is now associated to the Project. Programme is no longer associated directly to the Project but via H2020CLassification
|
2020-09-22 14:38:00 +02:00 |
Enrico Ottonello
|
a97ad20c7b
|
exception is now propagated (PR review)
|
2020-09-22 10:46:34 +02:00 |
Enrico Ottonello
|
fefbcfb106
|
dependency version moved to main pom (PR review)
|
2020-09-22 10:20:25 +02:00 |
Michele Artini
|
9e681609fd
|
stats to sql file
|
2020-09-17 15:51:22 +02:00 |
Michele Artini
|
51321c2701
|
partition of events by opendoarId
|
2020-09-17 11:38:07 +02:00 |
Claudio Atzori
|
cf2ce1a09b
|
code formatting
|
2020-09-15 15:58:03 +02:00 |
Enrico Ottonello
|
9e8e7fe6ef
|
add comments
|
2020-09-15 11:32:49 +02:00 |
Miriam Baglioni
|
c2b5c780ff
|
-
|
2020-09-14 14:34:03 +02:00 |
Miriam Baglioni
|
e2ceefe9be
|
-
|
2020-09-14 14:33:28 +02:00 |
Miriam Baglioni
|
1f893e63dc
|
-
|
2020-09-14 14:33:10 +02:00 |
Enrico Ottonello
|
538f299767
|
merged
|
2020-09-14 12:35:16 +02:00 |
Enrico Ottonello
|
eb8c9b2348
|
Merge remote-tracking branch 'upstream/master' into orcid-no-doi
|
2020-09-14 12:00:56 +02:00 |
Michele Artini
|
9b0c12f5d3
|
send notifications
|
2020-09-11 12:06:16 +02:00 |
Michele Artini
|
028613b751
|
remove old notifications
|
2020-09-09 15:32:06 +02:00 |
Michele Artini
|
9cfc124ac5
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-09-08 16:39:54 +02:00 |