Miriam Baglioni
|
14bf2e7238
|
added option to split dumps bigger than 40GB into different files
|
2020-10-30 14:09:04 +01:00 |
Claudio Atzori
|
58f28296ea
|
ProvisionConstants moved as ModelHardLimits in dhp-common and applied to truncate long abstracts (len > 150000). Further filtering for empty PID values
|
2020-10-30 10:56:42 +01:00 |
Miriam Baglioni
|
78fdb11c3f
|
merge branch with master
|
2020-10-29 12:55:22 +01:00 |
Sandro La Bruzzo
|
1d9fdb7367
|
fixed spark memory issue in SparkSplitOafTODLIEntities
|
2020-10-28 12:30:32 +01:00 |
Miriam Baglioni
|
d2374e3b9e
|
added code to handle cases where the funding tree does not exist
|
2020-10-27 16:15:21 +01:00 |
Miriam Baglioni
|
5d3012eeb4
|
changed code to dump only the programme list and not the classification list
|
2020-10-27 16:14:18 +01:00 |
Miriam Baglioni
|
3241ec1777
|
added connection timeout and socket timeout 600 sec
|
2020-10-27 16:12:11 +01:00 |
sandro
|
3a81a940b7
|
solved bug on merge publication
|
2020-10-21 22:41:55 +02:00 |
Miriam Baglioni
|
a2ce527fae
|
changed to match the requirements for short titles in level and long titles in classification
|
2020-10-20 17:03:25 +02:00 |
Sandro La Bruzzo
|
346ed65e2c
|
added upload to zenodo node
|
2020-10-20 16:59:55 +02:00 |
sandro
|
271b4db450
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-10-20 16:09:49 +02:00 |
sandro
|
d58d02d448
|
added workflow upload on zenodo
|
2020-10-20 16:09:07 +02:00 |
miconis
|
c4a59d1b9a
|
merge with the master to port the new packages
|
2020-10-20 16:07:30 +02:00 |
miconis
|
708d887e64
|
minor changes
|
2020-10-20 15:12:19 +02:00 |
miconis
|
0e54803177
|
bug fix in the id generator and implementation of jobs for organization dedup
|
2020-10-20 12:19:46 +02:00 |
Alessia Bardi
|
1425d810a8
|
testing mapping
|
2020-10-19 17:46:14 +02:00 |
Claudio Atzori
|
266bf1a221
|
common IdentifierFactory in use on the mapping from the aggregator data; merge the entities sharing the same id; code formatting
|
2020-10-16 17:02:10 +02:00 |
Claudio Atzori
|
34f1d0904b
|
common IdentifierFactory in use on the mapping from the aggregator data
|
2020-10-16 16:00:19 +02:00 |
Sandro La Bruzzo
|
fed711da80
|
Merge remote-tracking branch 'origin/master' into merge_record_to_common
|
2020-10-13 15:32:45 +02:00 |
Sandro La Bruzzo
|
34bf64c94f
|
fixed export Scholexplorer to OpenAire
|
2020-10-13 08:47:58 +02:00 |
Alessia Bardi
|
8775a64bc1
|
Merge pull request 'Merging different compatibility levels (pinocchio operator)' (#47) from merge_graph into master
|
2020-10-09 14:44:52 +02:00 |
Claudio Atzori
|
e751c1402f
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-10-09 13:53:21 +02:00 |
Claudio Atzori
|
b961dc7d1e
|
added originalid to the fields in the result graph view
|
2020-10-09 13:53:15 +02:00 |
miconis
|
6f8720982c
|
bug fix in the idgenerator and test implementation
|
2020-10-09 09:30:23 +02:00 |
Sandro La Bruzzo
|
734934e2eb
|
fixed error on empty intersection with publication and relation on export to OAF
|
2020-10-08 17:29:29 +02:00 |
Sandro La Bruzzo
|
eec418cd26
|
moved AuthorMerger into dhp-common
|
2020-10-08 10:33:55 +02:00 |
Sandro La Bruzzo
|
fe0a7870e6
|
Added test to check if merge authors works
|
2020-10-08 10:33:12 +02:00 |
Sandro La Bruzzo
|
cd9c377d18
|
adapted scholexplorer Dump generation to the new Dataset definition
|
2020-10-08 10:10:13 +02:00 |
Claudio Atzori
|
a3f37a9414
|
javadoc
|
2020-10-07 16:44:22 +02:00 |
Claudio Atzori
|
8d85a2fced
|
[BETA wf only] datasources involved in the merge operation don't obey the infra precedence policy, but rely on a custom behaviour that, given two datasources from beta and prod, returns the one from prod with the highest compatibility among the two
|
2020-10-07 16:28:52 +02:00 |
Claudio Atzori
|
5f7b75f5c5
|
code formatting
|
2020-10-07 13:22:54 +02:00 |
miconis
|
1804c5d809
|
refactoring: classes moved in the right package
|
2020-10-06 16:44:51 +02:00 |
miconis
|
7093355487
|
bug fix and minor changes
|
2020-10-06 16:21:34 +02:00 |
miconis
|
5a8bc329c5
|
bug fix in the result merge: it takes the correct bestaccessright based on the license instead of the trust
|
2020-10-06 15:26:44 +02:00 |
miconis
|
a2ac7e52fb
|
implementation of the workflow for new organizations in openorgs
|
2020-10-06 13:58:09 +02:00 |
Miriam Baglioni
|
061527f06e
|
adding short description
|
2020-10-05 13:54:39 +02:00 |
Miriam Baglioni
|
0c12d7bdd8
|
adding short description
|
2020-10-05 11:39:55 +02:00 |
Miriam Baglioni
|
ae08b3c0dd
|
merge branch with master
|
2020-10-05 11:35:55 +02:00 |
Miriam Baglioni
|
11b7eaae09
|
changed the name of the folder where to store the context entity from context to communities_infrastructures
|
2020-10-05 11:24:54 +02:00 |
Miriam Baglioni
|
32bffb0134
|
changed the name from communities_infrastructures to communities_infrastuctures.json
|
2020-10-05 11:24:17 +02:00 |
Claudio Atzori
|
23f64d9eb4
|
updated dedup tests following the dnet-pace-core library update
|
2020-10-02 14:30:53 +02:00 |
Miriam Baglioni
|
fc2f7636be
|
removed unused code
|
2020-10-02 12:33:52 +02:00 |
Miriam Baglioni
|
25cbcf6114
|
changed to solve issues about names. context renamed communities_infrastructure.json and removed the double json.gz extension from the name of the part in the tar
|
2020-10-02 12:17:46 +02:00 |
Claudio Atzori
|
9db0f88fb8
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-10-02 09:43:35 +02:00 |
Claudio Atzori
|
49ae3450a9
|
code formatting
|
2020-10-02 09:43:24 +02:00 |
Claudio Atzori
|
c2a6e2a9bf
|
fixed mapping for datasource journal info (ISSNs)
|
2020-10-02 09:37:08 +02:00 |
Miriam Baglioni
|
01117a46e1
|
whole workflow activated
|
2020-10-01 17:19:21 +02:00 |
Miriam Baglioni
|
cfb5766c6b
|
removed double json.gz from names of files in the tar
|
2020-10-01 17:18:34 +02:00 |
Miriam Baglioni
|
fcaedac980
|
merge branch with master
|
2020-10-01 16:46:59 +02:00 |
Miriam Baglioni
|
c6e6ed1bd8
|
merge branch with master
|
2020-10-01 16:24:41 +02:00 |
Miriam Baglioni
|
4aec347351
|
refactoring
|
2020-10-01 16:23:52 +02:00 |
Miriam Baglioni
|
61946b4092
|
refactoring
|
2020-10-01 16:22:48 +02:00 |
Miriam Baglioni
|
7e6d35e56c
|
added the link to the excel file related to topic
|
2020-10-01 15:53:31 +02:00 |
Sandro La Bruzzo
|
1a0a44e85a
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-10-01 15:46:53 +02:00 |
Sandro La Bruzzo
|
c4a3c52e45
|
fixed Doiboost bug in the identifier
|
2020-10-01 15:46:44 +02:00 |
Miriam Baglioni
|
43cbd62c2b
|
added classpath.first in the configuration
|
2020-10-01 15:46:34 +02:00 |
Miriam Baglioni
|
cd69c6b023
|
added dependency for the topic file path
|
2020-10-01 15:45:59 +02:00 |
Miriam Baglioni
|
771cde3d05
|
moved the library version to global pom
|
2020-10-01 15:43:47 +02:00 |
Miriam Baglioni
|
632351c0da
|
modified test resources to mirror the changes in the code
|
2020-10-01 15:43:02 +02:00 |
Miriam Baglioni
|
ebc1c5513f
|
modified test resources to mirror the changes in the code
|
2020-10-01 15:42:29 +02:00 |
Miriam Baglioni
|
3a374c34b6
|
fixed null pointer exception
|
2020-10-01 15:41:01 +02:00 |
Miriam Baglioni
|
83ea746163
|
added check to the test
|
2020-10-01 15:40:28 +02:00 |
Claudio Atzori
|
2e9e13444d
|
author pids made unique by value
|
2020-10-01 12:50:40 +02:00 |
Miriam Baglioni
|
6e5db85b32
|
-
|
2020-10-01 11:51:11 +02:00 |
Miriam Baglioni
|
a46179f61c
|
refactoring
|
2020-10-01 11:22:01 +02:00 |
Miriam Baglioni
|
b90bee124b
|
removing rows that are empty from those imported
|
2020-10-01 11:16:49 +02:00 |
Miriam Baglioni
|
c107f193c9
|
refactoring
|
2020-10-01 11:16:22 +02:00 |
Claudio Atzori
|
e265c3e125
|
cleaning functions factored out in a dedicated class
|
2020-10-01 10:50:15 +02:00 |
Miriam Baglioni
|
706a80a29a
|
added test to check that separator '-' (not hyphen) will be recognized
|
2020-10-01 10:38:31 +02:00 |
Miriam Baglioni
|
3dca586b3b
|
refactoring
|
2020-10-01 10:34:48 +02:00 |
Miriam Baglioni
|
416bda6066
|
changed the programme.description by using the same value used in the classification instead of the short title or the title
|
2020-10-01 10:31:33 +02:00 |
Miriam Baglioni
|
f6587c91f3
|
added comparison to a char that looks like '-' but is not
|
2020-10-01 10:30:26 +02:00 |
Claudio Atzori
|
4287164aba
|
include relevantdate field in the result view
|
2020-10-01 10:28:55 +02:00 |
miconis
|
e3f7798d1b
|
minor changes in dedup tests, bug fix in the idgenerator and pace-core version update
|
2020-09-29 15:31:46 +02:00 |
Miriam Baglioni
|
7e73bb88b3
|
changed the logic to add the topic description to the project
|
2020-09-28 17:21:43 +02:00 |
Miriam Baglioni
|
0a035e3630
|
-
|
2020-09-28 17:20:49 +02:00 |
Miriam Baglioni
|
16bee2084d
|
added the topic code to the project subset
|
2020-09-28 17:20:11 +02:00 |
Miriam Baglioni
|
0bf2d0db52
|
added to the workflow the download of the topic excel file and one property needed to get the input path of the topic file in the hdfs filesystem
|
2020-09-28 12:17:22 +02:00 |
Miriam Baglioni
|
c2abde4d9f
|
changed the implementation of Atomic Actions creation by exploiting the topic information obtained from the CORDIS Excel file
|
2020-09-28 12:16:34 +02:00 |
Miriam Baglioni
|
d930b8d3fc
|
changed the query to get only the code of the project and not the optional1 (topic code) and optional2 (topic description)
|
2020-09-28 12:15:48 +02:00 |
Miriam Baglioni
|
f8f5cfd5cc
|
removed the part added to set the topic code and description in the step of project preparation
|
2020-09-28 12:13:33 +02:00 |
Miriam Baglioni
|
9e19c9a221
|
remove the topic description from the values in the CSVProject class
|
2020-09-28 12:11:03 +02:00 |
Miriam Baglioni
|
6d8b932e40
|
refactoring
|
2020-09-28 12:06:56 +02:00 |
Miriam Baglioni
|
b77f166549
|
changed the package name from csvutils to utils
|
2020-09-28 12:05:47 +02:00 |
Miriam Baglioni
|
e33e3277de
|
added needed dependency to read the excel file
|
2020-09-28 12:03:14 +02:00 |
Miriam Baglioni
|
f4739a371a
|
code to get the information related to the topic association between code and description.
|
2020-09-28 12:02:48 +02:00 |
Miriam Baglioni
|
7b6a7333e6
|
merge branch with master
|
2020-09-25 16:42:07 +02:00 |
Miriam Baglioni
|
983a12ed15
|
temporary modification to allow the upload of files in the sandbox without the need to recreate the mapping from scratch
|
2020-09-25 16:41:51 +02:00 |
Miriam Baglioni
|
8b36d19182
|
added property depositionId and changed property newVersion from boolean to string to handle the three possible distinct values
|
2020-09-25 16:41:15 +02:00 |
Miriam Baglioni
|
ed5239f9ec
|
added new code to handle the new possibility to upload files to an already open deposition
|
2020-09-25 16:34:32 +02:00 |
Miriam Baglioni
|
3a8c524fce
|
refactor
|
2020-09-25 16:34:02 +02:00 |
Miriam Baglioni
|
2ac2b537b6
|
merge branch with master
|
2020-09-25 14:40:47 +02:00 |
Miriam Baglioni
|
54800fb9b0
|
enabled only the step to upload in zenodo
|
2020-09-25 14:40:22 +02:00 |
Miriam Baglioni
|
12c2dfc268
|
modified the resource to consider the information added to the model
|
2020-09-25 14:17:23 +02:00 |
Miriam Baglioni
|
969fa8d96e
|
fixed issue and changed the transformation of the programme file to consider the new model
|
2020-09-25 13:32:34 +02:00 |
miconis
|
4cf79f32eb
|
implementation of the oozie wf to prepare the openorgs input: relations between organizations
|
2020-09-25 11:29:51 +02:00 |
Michele Artini
|
c171fdebe1
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-09-25 09:03:09 +02:00 |
Michele Artini
|
c96598aaa4
|
opendoar partition
|
2020-09-25 09:02:58 +02:00 |
Miriam Baglioni
|
de6c4d46d8
|
fixed conflicts
|
2020-09-24 15:35:01 +02:00 |
Miriam Baglioni
|
e917281822
|
-
|
2020-09-24 15:24:05 +02:00 |
Miriam Baglioni
|
9f54f69e6d
|
added topic information
|
2020-09-24 15:23:35 +02:00 |
Miriam Baglioni
|
d6206d6e63
|
add the topic description to the action set associated to the project
|
2020-09-24 15:22:40 +02:00 |
Miriam Baglioni
|
6b50226f3b
|
added topic code and topic description
|
2020-09-24 15:21:49 +02:00 |
Miriam Baglioni
|
15af1f527e
|
modified to consider the topic information
|
2020-09-24 15:20:56 +02:00 |
Miriam Baglioni
|
609ff17cfc
|
now the commission gives us the framework programme (FP7 - H2020), so use this information to filter out programmes not associated to H2020
|
2020-09-24 15:19:31 +02:00 |
Miriam Baglioni
|
b66f930466
|
Added optional1 and optional2 information to the files read from the db. Optional1 contains the topic code and optional2 contains the topic description
|
2020-09-24 15:16:56 +02:00 |
Miriam Baglioni
|
860e6d38a6
|
added topic description to the CSV project variables
|
2020-09-24 15:15:26 +02:00 |
Claudio Atzori
|
044d3a0214
|
fixed query used to load datasources in the Graph
|
2020-09-24 13:48:58 +02:00 |
Claudio Atzori
|
27df1cea6d
|
code formatting
|
2020-09-24 12:16:00 +02:00 |
Claudio Atzori
|
fb22f4d70b
|
included values for projects fundedamount and totalcost fields in the mapping tests. Swapped expected and actual values in junit test assertions
|
2020-09-24 12:10:59 +02:00 |
Claudio Atzori
|
42f55395c8
|
fixed order of the ISSNs returned by the SQL query
|
2020-09-24 12:09:58 +02:00 |
Claudio Atzori
|
fadf5c7c69
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop
|
2020-09-24 10:42:52 +02:00 |
Claudio Atzori
|
9a7e72d528
|
using concat_ws to join textual columns from PSQL. When using || to perform the concatenation, a Null column makes the result of the operation Null
|
2020-09-24 10:42:47 +02:00 |
Claudio Atzori
|
9e3e93c6b6
|
setting the correct issn type in the datasource.journal element
|
2020-09-24 10:39:16 +02:00 |
Miriam Baglioni
|
0d83f47166
|
merge branch with master
|
2020-09-23 17:33:49 +02:00 |
Miriam Baglioni
|
39eb8ab25b
|
changed the dump to move from h2020programme to h2020classification
|
2020-09-23 17:33:00 +02:00 |
Miriam Baglioni
|
1d84cf19a6
|
added new line to resource file
|
2020-09-23 17:32:22 +02:00 |
Miriam Baglioni
|
f0c476b6c9
|
modification to the test classes to consider h2020classification
|
2020-09-23 17:31:49 +02:00 |
Miriam Baglioni
|
2cba3cb484
|
modification to the classes building the actionset to consider the h2020classification
|
2020-09-23 17:31:15 +02:00 |
Miriam Baglioni
|
1069cf243a
|
modification to the schema to consider the H2020classification of the programme. The field Programme has been moved inside the H2020classification that is now associated to the Project. Programme is no longer associated directly to the Project but via H2020classification
|
2020-09-22 14:38:00 +02:00 |
miconis
|
259362ef47
|
implementation of the job to collect simrels from postgres db
|
2020-09-22 09:43:27 +02:00 |
Michele Artini
|
9e681609fd
|
stats to sql file
|
2020-09-17 15:51:22 +02:00 |
Michele Artini
|
51321c2701
|
partition of events by opendoarId
|
2020-09-17 11:38:07 +02:00 |
Claudio Atzori
|
cf2ce1a09b
|
code formatting
|
2020-09-15 15:58:03 +02:00 |
Miriam Baglioni
|
c2b5c780ff
|
-
|
2020-09-14 14:34:03 +02:00 |
Miriam Baglioni
|
e2ceefe9be
|
-
|
2020-09-14 14:33:28 +02:00 |
Miriam Baglioni
|
1f893e63dc
|
-
|
2020-09-14 14:33:10 +02:00 |
Michele Artini
|
9b0c12f5d3
|
send notifications
|
2020-09-11 12:06:16 +02:00 |
Michele Artini
|
028613b751
|
remove old notifications
|
2020-09-09 15:32:06 +02:00 |
Michele Artini
|
9cfc124ac5
|
Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-hadoop
|
2020-09-08 16:39:54 +02:00 |
Michele Artini
|
a597a218ab
|
* for all topics
|
2020-09-08 16:39:40 +02:00 |
Claudio Atzori
|
8a523474b7
|
code formatting
|
2020-09-07 11:40:16 +02:00 |
Michele Artini
|
bb459caf69
|
support for all topic subscriptions
|
2020-08-27 11:01:21 +02:00 |
Michele Artini
|
82ed8edafd
|
notification indexing
|
2020-08-26 15:10:48 +02:00 |
Miriam Baglioni
|
b72a7dad46
|
resource for pid graph dump
|
2020-08-24 17:09:01 +02:00 |
Miriam Baglioni
|
8694bb9b31
|
refactoring due to compilation
|
2020-08-24 17:07:34 +02:00 |
Miriam Baglioni
|
8a069a4fea
|
-
|
2020-08-24 17:01:30 +02:00 |
Miriam Baglioni
|
34fa96f3b1
|
-
|
2020-08-24 17:00:20 +02:00 |
Miriam Baglioni
|
5fb2949cb8
|
added utils methods
|
2020-08-24 17:00:09 +02:00 |
Miriam Baglioni
|
2a540b6c01
|
added constants for the pid graph dump
|
2020-08-24 16:55:35 +02:00 |
Miriam Baglioni
|
da103c399a
|
resources for the pid graph dump test
|
2020-08-24 16:52:07 +02:00 |
Miriam Baglioni
|
630a6a1fe7
|
first tests for the pid graph dump
|
2020-08-24 16:51:26 +02:00 |
Miriam Baglioni
|
40c8d2de7b
|
test resources for the dump of the pids graph
|
2020-08-24 16:50:39 +02:00 |
Miriam Baglioni
|
bef79d3bdf
|
first attempt at the dump of the pids graph
|
2020-08-24 16:49:38 +02:00 |
Michele Artini
|
da470422d3
|
deleting events
|
2020-08-21 14:52:48 +02:00 |
Michele Artini
|
6e60bf026a
|
indexing only a subset of events
|
2020-08-19 12:39:22 +02:00 |
Miriam Baglioni
|
85203c16e3
|
merge branch with master
|
2020-08-19 11:49:03 +02:00 |
Miriam Baglioni
|
2c783793ba
|
removed the affiliation from the author to mirror the changes in the model
|
2020-08-19 11:48:12 +02:00 |
Miriam Baglioni
|
f6bf888016
|
removed affiliation from author to mirror the changes in the model
|
2020-08-19 11:41:41 +02:00 |
Miriam Baglioni
|
66d0e0d3f2
|
-
|
2020-08-19 11:31:50 +02:00 |