Sandro La Bruzzo
|
ab3a99d3e9
|
removed old datacite oozie workflow
|
2021-10-20 17:19:47 +02:00 |
Sandro La Bruzzo
|
ae4e99a471
|
Adapted workflow of resolution of PID to work into OpenAIRE data workflow
- Added relations in both directions on all Scholexplorer datasources
|
2021-10-20 17:12:16 +02:00 |
Miriam Baglioni
|
1cc09adfaa
|
Opencitations: changed the test class to mirror the creation or not of duplicate dois for .refs oc original plus added optional parameter to duplicate the relation
|
2021-10-18 14:11:27 +02:00 |
Sandro La Bruzzo
|
7b15b88d4c
|
renamed wrong package, implemented last aggregation workflow for scholexplorer
|
2021-10-15 15:00:15 +02:00 |
Sandro La Bruzzo
|
51a03c0a50
|
refactor code for EBI from dhp-graph-mapper into dhp-aggregation
|
2021-10-14 14:23:13 +02:00 |
Sandro La Bruzzo
|
7387416e90
|
added params skip update to direct transform in OAF, this should be set to true in production
|
2021-10-12 12:36:30 +02:00 |
Sandro La Bruzzo
|
511da98d0c
|
- fixed bug on download pmc Article
- removed unused line of code in SparkCreateActionset
|
2021-10-12 11:47:49 +02:00 |
Sandro La Bruzzo
|
5606014b17
|
code refactor see ticket #7065
|
2021-10-12 08:11:53 +02:00 |
Sandro La Bruzzo
|
66702b1973
|
Added node to update datacite
|
2021-09-28 08:59:06 +02:00 |
Miriam Baglioni
|
5ec69889db
|
OpenCitations: creation of AS from OC
|
2021-09-27 16:02:06 +02:00 |
Miriam Baglioni
|
f2118d771a
|
first steps in the implementation of the integration of opencitations
|
2021-09-22 15:18:05 +02:00 |
Claudio Atzori
|
663b1556d7
|
manually integrating PR#140 D-Net/dnet-hadoop#140
|
2021-09-15 16:40:25 +02:00 |
Sandro La Bruzzo
|
aed29156c7
|
changed behavior in transformation job, that doesn't fail at first error
|
2021-09-07 19:05:46 +02:00 |
Sandro La Bruzzo
|
3c6fc2096c
|
fixed bug in the OAI iterator that skipped cleaned records
|
2021-09-07 10:46:26 +02:00 |
Sandro La Bruzzo
|
9f8a80deb7
|
fixed wrong import of unresolved relation in openaire
|
2021-09-01 14:16:27 +02:00 |
Sandro La Bruzzo
|
e8b3cb9147
|
Implemented method to download delta updates in EBI Links
|
2021-08-30 09:32:45 +02:00 |
Alessia Bardi
|
931f430129
|
Merge branch 'beta' into datasource_model_eosc_beta
|
2021-08-23 11:57:21 +02:00 |
Claudio Atzori
|
baed5e3337
|
test classes moved in specific components
|
2021-08-13 12:14:47 +02:00 |
Claudio Atzori
|
3359f73fcf
|
cleanup & best practices
|
2021-08-13 12:00:42 +02:00 |
Miriam Baglioni
|
32fd75691f
|
refactoring
|
2021-08-13 10:15:42 +02:00 |
Miriam Baglioni
|
5cd5714530
|
GetCSV refactoring - added ignore annotation for fields not in input csv
|
2021-08-13 10:06:49 +02:00 |
Miriam Baglioni
|
ed183d878e
|
GetCSV refactoring - modified test classes due to change in the model of projects and programme
|
2021-08-13 09:28:51 +02:00 |
Miriam Baglioni
|
8769dd8eef
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:20:56 +02:00 |
Miriam Baglioni
|
6b9e1bf2e3
|
GetCSV refactoring - removing not needed dependency
|
2021-08-12 18:17:50 +02:00 |
Miriam Baglioni
|
9da74b544a
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:12:15 +02:00 |
Miriam Baglioni
|
ab8abd61bb
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:11:07 +02:00 |
Miriam Baglioni
|
335a824e34
|
GetCSV refactoring - fixed issue
|
2021-08-12 18:10:10 +02:00 |
Miriam Baglioni
|
f0845e9865
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:04:58 +02:00 |
Miriam Baglioni
|
7a789423aa
|
GetCSV refactoring - refactoring due to movement of classes
|
2021-08-12 18:04:27 +02:00 |
Miriam Baglioni
|
e9fc3ef3bc
|
GetCSV refactoring - changed to use the new class to get and write the csv file
|
2021-08-12 18:03:41 +02:00 |
Miriam Baglioni
|
4317211a2b
|
GetCSV refactoring - refactoring due to movement
|
2021-08-12 18:03:14 +02:00 |
Miriam Baglioni
|
b62cd656a7
|
GetCSV refactoring - changed the model to store only the information needed
|
2021-08-12 18:01:10 +02:00 |
Miriam Baglioni
|
d36e925277
|
GetCSV refactoring - moved under model package
|
2021-08-12 18:00:21 +02:00 |
Miriam Baglioni
|
6e84b3951f
|
GetCSV refactoring - moving classes to dhp-common that have dependency with GetCSV class (that was located in graph-mapper)
|
2021-08-12 17:57:41 +02:00 |
Miriam Baglioni
|
804589eb30
|
reverting
|
2021-08-11 17:23:35 +02:00 |
Miriam Baglioni
|
d688749ad9
|
reverting
|
2021-08-11 17:22:28 +02:00 |
Miriam Baglioni
|
524c06e028
|
reverting
|
2021-08-11 17:20:30 +02:00 |
Miriam Baglioni
|
7aa3260729
|
reverting
|
2021-08-11 17:18:45 +02:00 |
Miriam Baglioni
|
55fc500d8d
|
reverting
|
2021-08-11 17:17:48 +02:00 |
Miriam Baglioni
|
8da3a25cf6
|
merging with branch beta
|
2021-08-11 15:55:34 +02:00 |
Claudio Atzori
|
9f4db73f30
|
updated/fixed unit tests
|
2021-08-11 15:02:51 +02:00 |
Claudio Atzori
|
61d811ba53
|
suggestions from intellij
|
2021-08-11 12:18:20 +02:00 |
Claudio Atzori
|
2ee21da43b
|
suggestions from SonarLint
|
2021-08-11 12:13:22 +02:00 |
Miriam Baglioni
|
1d6ac3715b
|
merge branch with beta
|
2021-07-30 11:58:29 +02:00 |
Sandro La Bruzzo
|
b1b0cc3f15
|
fixed wrong package name
|
2021-07-29 13:55:08 +02:00 |
Sandro La Bruzzo
|
3721df7aa6
|
refactoring create actionset of scholexplorer, moved on package dhp-aggregation
|
2021-07-29 10:45:35 +02:00 |
Sandro La Bruzzo
|
3d8f0f629b
|
implemented workflow of creation action set for scholexplorer
|
2021-07-28 16:15:34 +02:00 |
Miriam Baglioni
|
cc0d3d8a7b
|
merging with branch beta
|
2021-07-28 11:24:46 +02:00 |
Miriam Baglioni
|
708d0ade34
|
Merge branch 'beta' into hostedbymap
|
2021-07-28 10:37:22 +02:00 |
Sandro La Bruzzo
|
16c91203bd
|
implemented workflow of creation action set for scholexplorer
|
2021-07-28 10:30:49 +02:00 |
Sandro La Bruzzo
|
825d9f0289
|
fixed datacite workflow starting from Importing delta
|
2021-07-27 16:09:46 +02:00 |
Miriam Baglioni
|
74f801b689
|
merging with branch beta
|
2021-07-27 13:18:31 +02:00 |
Miriam Baglioni
|
eb07f7f40f
|
Hosted By Map
|
2021-07-27 12:27:26 +02:00 |
Claudio Atzori
|
a0393607a7
|
mapping funding relations from Datacite should be done according to the actual result identifier
|
2021-07-23 18:15:08 +02:00 |
Sandro La Bruzzo
|
62ae36a3d2
|
fixed NPE
|
2021-07-22 15:41:38 +02:00 |
Miriam Baglioni
|
63553a76b3
|
added code to download gold issn list from unibi
|
2021-07-22 12:01:48 +02:00 |
Sandro La Bruzzo
|
bbe8193930
|
merged stable ids
|
2021-07-12 17:00:43 +02:00 |
Sandro La Bruzzo
|
cd17e19044
|
implemented branch workflow to import datacite and crossref in scholexplorer
|
2021-07-08 21:20:19 +02:00 |
Claudio Atzori
|
777536ce91
|
[aggregation] string values used as regular expressions in the OAI collection classes are defined in a single point as constants, to be reused across the code (PR#122)
|
2021-07-07 11:23:48 +02:00 |
Claudio Atzori
|
bc014023c8
|
Merge pull request 'to solve the scala SI-3623' (#122) from andreas.czerniak/BrStableId_dnet-hadoop:stable_ids into stable_ids
Reviewed-on: D-Net/dnet-hadoop#122
|
2021-07-07 11:13:51 +02:00 |
Andreas Czerniak
|
ebf3f47a02
|
from&until more OAI2.0 compl., adding tfs
|
2021-07-07 09:29:49 +02:00 |
Claudio Atzori
|
70ded407bb
|
HttpClient used in metadata collection retries also on 404
|
2021-07-05 18:04:30 +02:00 |
Sandro La Bruzzo
|
db933ebd21
|
Merge remote-tracking branch 'origin/stable_ids' into stable_id_scholexplorer
|
2021-06-29 14:16:12 +02:00 |
Sandro La Bruzzo
|
7e08655e5f
|
added relation dates in all scholexplorer Datasources
|
2021-06-29 12:02:03 +02:00 |
Claudio Atzori
|
af42377d0e
|
HttpClient used in metadata collection retries on 502, 503, 504
|
2021-06-28 09:34:30 +02:00 |
Sandro La Bruzzo
|
ad50415167
|
Merge remote-tracking branch 'origin/stable_ids' into stable_id_scholexplorer
|
2021-06-24 17:20:50 +02:00 |
Sandro La Bruzzo
|
80e15cc455
|
implemented mapping from uniprot, pdb and ebi links
|
2021-06-24 17:20:00 +02:00 |
Claudio Atzori
|
5edcc6832a
|
applying sonarLint suggestions
|
2021-06-23 09:53:29 +02:00 |
Sandro La Bruzzo
|
1dc0c59e20
|
merged fix thai dates from stable_ids
|
2021-06-21 10:39:46 +02:00 |
Sandro La Bruzzo
|
507e42102a
|
added pdb to oaf class
|
2021-06-21 09:36:40 +02:00 |
Sandro La Bruzzo
|
3990165d05
|
changed typologies of unresolved relation
|
2021-06-18 11:43:59 +02:00 |
Sandro La Bruzzo
|
cc0f2b11fb
|
Implemented mapping from pubmed baseline to OAF
|
2021-06-16 14:56:24 +02:00 |
Sandro La Bruzzo
|
aeb8132627
|
Merged branch stable_ids
|
2021-06-14 10:07:29 +02:00 |
Claudio Atzori
|
e9e86a237d
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-06-11 17:00:02 +02:00 |
Claudio Atzori
|
a900bfb874
|
delegating the date parsing to https://github.com/sisyphsu/dateparser
|
2021-06-11 16:53:01 +02:00 |
Sandro La Bruzzo
|
dd997c49e0
|
fix wrong relation id
fix date thai ticket #6791
|
2021-06-10 14:47:18 +02:00 |
Sandro La Bruzzo
|
0cdb7ccdaa
|
added inverse relations to datacite mapping
|
2021-06-04 15:10:20 +02:00 |
Sandro La Bruzzo
|
5b724d9972
|
added relations to datacite mapping
|
2021-06-04 10:14:22 +02:00 |
Sandro La Bruzzo
|
02ef46535f
|
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_ids
|
2021-05-31 09:50:15 +02:00 |
Sandro La Bruzzo
|
aeadc5a366
|
updated wf Datacite Import to retrieve the block size as parameter
|
2021-05-31 09:49:53 +02:00 |
Claudio Atzori
|
d512062b58
|
integrating pull #109, H2020Classification
|
2021-05-27 12:22:47 +02:00 |
Sandro La Bruzzo
|
bced804151
|
updated wf Datacite Import to retrieve the block size as parameter
|
2021-05-26 17:06:50 +02:00 |
Miriam Baglioni
|
abd88f663d
|
changed test resource to mirror change in the input file
|
2021-05-21 15:20:47 +02:00 |
Miriam Baglioni
|
c844877de2
|
changed workflow flow to possibly parallelize also the programme and project preparation steps
|
2021-05-21 14:41:57 +02:00 |
Miriam Baglioni
|
073d76864d
|
refactoring
|
2021-05-21 14:41:03 +02:00 |
Miriam Baglioni
|
4c8b4a774c
|
removed not needed code
|
2021-05-21 14:40:07 +02:00 |
Miriam Baglioni
|
53b9d87fec
|
new prepareProgramme according to the new file
|
2021-05-21 11:49:31 +02:00 |
Miriam Baglioni
|
1ee8f13580
|
refactoring and added "left" as join type to be 100% sure to get the whole set of projects
|
2021-05-21 11:49:05 +02:00 |
Miriam Baglioni
|
e07c3ba089
|
due to change in the input file the filtering step is no more needed
|
2021-05-21 11:47:43 +02:00 |
Miriam Baglioni
|
54f6e2f693
|
changed to get the needed information to build the action set as parallel jobs
|
2021-05-21 11:47:00 +02:00 |
Miriam Baglioni
|
7180505519
|
removed non needed variable
|
2021-05-21 11:46:13 +02:00 |
Miriam Baglioni
|
2eb1a8b344
|
changed because the input file changed
|
2021-05-21 11:40:20 +02:00 |
Claudio Atzori
|
9d725efdc1
|
reverted implementation of the mdstore client
|
2021-05-20 18:26:09 +02:00 |
Miriam Baglioni
|
9610224671
|
added param to workflow property
|
2021-05-20 18:21:12 +02:00 |
Miriam Baglioni
|
052c837843
|
-
|
2021-05-20 15:54:44 +02:00 |
Claudio Atzori
|
b695932ae4
|
integrated pull#108
|
2021-05-20 15:34:04 +02:00 |
Miriam Baglioni
|
dc0ad8d2e0
|
fixed issue related to change in the file name downloaded. Added sheet name as parameter and also a check if the name should change
|
2021-05-20 14:53:53 +02:00 |
Claudio Atzori
|
239d0f0a9a
|
ROR actionset import workflow backported from branch stable_ids
|
2021-05-18 16:12:11 +02:00 |
Michele Artini
|
c1e20de7cf
|
fixed the deserialization of a json property
|
2021-05-18 14:00:14 +02:00 |
Claudio Atzori
|
23b8883ab1
|
applied intellij code cleanup
|
2021-05-14 10:58:12 +02:00 |
Sandro La Bruzzo
|
6424cd9062
|
Added passing of the following parameters:
-varDataSourceId
-varOfficialName
in Each transformation Rule
|
2021-05-11 15:17:38 +02:00 |
Sandro La Bruzzo
|
073dcea2aa
|
Added passing of the following parameters:
-varDataSourceId
-varOfficialName
in Each transformation Rule
|
2021-05-11 15:05:58 +02:00 |
Claudio Atzori
|
3797543600
|
MDStoreManager model classes moved in dhp-schemas
|
2021-05-10 14:32:05 +02:00 |
Michele Artini
|
d82071ba6c
|
originalId with prefix
|
2021-05-06 15:34:48 +02:00 |
Claudio Atzori
|
923d19ea8e
|
mdstore read lock/unlock when bulk copying records from mongodb to hdfs
|
2021-05-04 18:06:21 +02:00 |
Claudio Atzori
|
ba86835951
|
using common constants from ModelConstants
|
2021-05-04 11:51:52 +02:00 |
Michele Artini
|
a278d67175
|
parse input file
|
2021-04-29 11:34:47 +02:00 |
Michele Artini
|
f77ba34126
|
pid types
|
2021-04-29 09:50:05 +02:00 |
Michele Artini
|
7c5cd86927
|
annotations and tests
|
2021-04-29 09:29:19 +02:00 |
Michele Artini
|
b5cf505cc6
|
partial implementation of the ROR->actionset workflow
|
2021-04-28 16:00:24 +02:00 |
Claudio Atzori
|
5afa7d3e0c
|
core utilities in dhp-common moved in external module dhp-schemas
|
2021-04-27 15:44:01 +02:00 |
Sandro La Bruzzo
|
63c0303137
|
removed unused import, add log
|
2021-04-27 12:17:23 +02:00 |
Claudio Atzori
|
fa42026590
|
fixed PersonCleaner extension functions
|
2021-04-27 10:10:06 +02:00 |
Sandro La Bruzzo
|
fd29307b84
|
updated workflow name
|
2021-04-21 09:21:41 +02:00 |
Claudio Atzori
|
d0d477cca3
|
code formatting
|
2021-04-20 12:50:34 +02:00 |
Sandro La Bruzzo
|
e06c7f32f6
|
updated id figshare as described in #6377
|
2021-04-20 10:18:07 +02:00 |
Sandro La Bruzzo
|
dbe0d0378e
|
resolved ticket #6377
|
2021-04-20 09:44:44 +02:00 |
Sandro La Bruzzo
|
524e5f3092
|
Improved parallelization on transformation wf on hadoop
|
2021-04-19 15:17:25 +02:00 |
Sandro La Bruzzo
|
cdfe01bbae
|
improved parallelization on transformation job
|
2021-04-19 15:14:52 +02:00 |
Claudio Atzori
|
3125cef545
|
code formatting
|
2021-04-14 09:11:54 +02:00 |
Andreas Czerniak
|
3b694074ff
|
add xslt, personname cleaner
|
2021-04-13 07:04:27 +02:00 |
Claudio Atzori
|
7941d7be29
|
WIP: using common definitions from ModelConstants
|
2021-03-31 18:33:57 +02:00 |
Claudio Atzori
|
879e8cc7ef
|
WIP: using common definitions from ModelConstants
|
2021-03-31 17:12:01 +02:00 |
Claudio Atzori
|
72ce741ea6
|
WIP: using common definitions from ModelConstants
|
2021-03-31 17:07:13 +02:00 |
Sandro La Bruzzo
|
616d2ecce2
|
split the workflow collecting datacite into two workflows.
Released on beta
|
2021-03-31 15:45:58 +02:00 |
Sandro La Bruzzo
|
1dfda3624e
|
improved workflow importing datacite
|
2021-03-26 13:56:29 +01:00 |
Claudio Atzori
|
8db248aa13
|
avoiding error on jenkins compilations: java.net.BindException: Cannot assign requested address: Service 'sparkDriver' failed after 16 retries (on a random free port)!
|
2021-03-23 09:56:34 +01:00 |
Sandro La Bruzzo
|
c73072079d
|
fix conflicts
|
2021-03-22 16:36:31 +01:00 |
Claudio Atzori
|
61a2551e74
|
migrated last changes from svn (dnet45)
|
2021-03-15 17:17:55 +01:00 |
Claudio Atzori
|
acbe3119a4
|
RestCollectorPlugin imported from dnet45
|
2021-03-08 09:44:09 +01:00 |
Claudio Atzori
|
fa7930d2e2
|
merging contributions from PR#97
|
2021-03-05 15:45:28 +01:00 |
Claudio Atzori
|
36f750cd1d
|
removed unused classes
|
2021-03-03 10:22:29 +01:00 |
Claudio Atzori
|
b73dce3e3a
|
more logging on the MDStore mongodb client. Forcing UTF_8 encoding on the content
|
2021-03-03 10:17:16 +01:00 |
Claudio Atzori
|
e76c4f62c1
|
MetadataRecord moved in dhp-schemas
|
2021-02-26 10:58:48 +01:00 |
Claudio Atzori
|
7df2461ccc
|
indent XML records collected from oai-pmh endpoints
|
2021-02-25 16:19:12 +01:00 |
Claudio Atzori
|
b830e33392
|
mdstore collector plugin
|
2021-02-25 12:30:30 +01:00 |
Claudio Atzori
|
271e88537b
|
code formatting
|
2021-02-25 12:28:56 +01:00 |
Claudio Atzori
|
9c899f4433
|
cleanup on transformation functions and the relative tests
|
2021-02-24 15:07:59 +01:00 |
Claudio Atzori
|
fc3fa5e343
|
implemented mdstore collector plugin
|
2021-02-24 15:07:24 +01:00 |
Claudio Atzori
|
e7eba9f7e7
|
WIP: transformation workflow error reporting; cleanup
|
2021-02-17 16:54:08 +01:00 |
Claudio Atzori
|
58467aaf1e
|
WIP: transformation workflow error reporting
|
2021-02-17 16:14:41 +01:00 |
Claudio Atzori
|
cc88701f29
|
retry for any Socket exception
|
2021-02-17 16:13:54 +01:00 |
Claudio Atzori
|
545f8f3e48
|
using jackson objectmapper instead of GSon to serialise the aggregation report
|
2021-02-17 12:15:00 +01:00 |
Claudio Atzori
|
b592d78bb4
|
WIP: collectorWorker error reporting, generalised reported implementation
|
2021-02-17 10:28:01 +01:00 |
Claudio Atzori
|
cf27905a71
|
WIP: collectorWorker error reporting, added report messages
|
2021-02-16 16:53:14 +01:00 |
Claudio Atzori
|
1abe6d1ad7
|
WIP: collectorWorker error reporting, added report messages
|
2021-02-15 15:08:59 +01:00 |
Claudio Atzori
|
523a6bfa97
|
Merge pull request 'first commit to the correct branch' (#94) from andreas.czerniak/BrAggr_dnet-hadoop:hadoop_aggregator into hadoop_aggregator
Looks good to me, thanks Andreas!
|
2021-02-15 12:15:31 +01:00 |
Sandro La Bruzzo
|
7edcc87ed4
|
changed xslt behaviour on failure
|
2021-02-12 17:27:08 +01:00 |
Sandro La Bruzzo
|
6a37c7f175
|
merge fixed
|
2021-02-12 16:38:47 +01:00 |
Sandro La Bruzzo
|
b3f5c2351d
|
Merge branch 'hadoop_aggregator' of code-repo.d4science.org:D-Net/dnet-hadoop into hadoop_aggregator
Conflicts:
dhp-workflows/dhp-aggregation/src/test/java/eu/dnetlib/dhp/transformation/TransformationJobTest.java
|
2021-02-12 16:37:14 +01:00 |
Sandro La Bruzzo
|
f216277219
|
Implemented cleaning date
|
2021-02-12 16:34:52 +01:00 |
Andreas Czerniak
|
5a9017cf18
|
clone, min. changes, test, run
|
2021-02-12 14:32:36 +01:00 |
Claudio Atzori
|
aa55dedb8a
|
Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator
|
2021-02-12 12:31:05 +01:00 |
Claudio Atzori
|
29c6f7e255
|
classes related to the collection workflow moved into common package; implemented MongoDB collection plugins
|
2021-02-12 12:31:02 +01:00 |
Sandro La Bruzzo
|
17e6f1934e
|
fixed NPE on cleaner
|
2021-02-12 11:48:11 +01:00 |
Sandro La Bruzzo
|
ebcc3ec14f
|
updated wrong datacite identifier in transformation
|
2021-02-11 16:25:51 +01:00 |
Claudio Atzori
|
bae029f828
|
collection_java_xmx allows declaring the heap size allocated for the java actions involved in the metadata collection workflow
|
2021-02-08 18:07:23 +01:00 |
Claudio Atzori
|
bebc54d5bf
|
seq file storing native records is now compressed
|
2021-02-08 18:06:25 +01:00 |
Claudio Atzori
|
50add4c61b
|
added requestDelay to HttpConnector2 configuration; Aggregation workflow constants moved in dhp-common
|
2021-02-08 12:19:38 +01:00 |
Claudio Atzori
|
40df0f987d
|
better logging, WIP: collectorWorker error reporting; common functions moved in DHPUtils
|
2021-02-06 20:12:00 +01:00 |
Claudio Atzori
|
a8a758925e
|
better logging, WIP: collectorWorker error reporting
|
2021-02-05 19:18:05 +01:00 |
Claudio Atzori
|
730973679a
|
Merge branch 'hadoop_aggregator' of https://code-repo.d4science.org/D-Net/dnet-hadoop into hadoop_aggregator
|
2021-02-04 17:25:00 +01:00 |
Claudio Atzori
|
deb85706db
|
imported HttpConnector from https://svn.driver.research-infrastructures.eu/driver/dnet45/modules/dnet-modular-collector-service/trunk/src/main/java/eu/dnetlib/data/collector/plugins/HttpConnector.java as HttpConnector2
|
2021-02-04 17:24:52 +01:00 |
Sandro La Bruzzo
|
4dae5e605d
|
implemented messaging between collection worker and Dnet
|
2021-02-04 15:51:15 +01:00 |
Claudio Atzori
|
40764cf626
|
better logging, WIP: collectorWorker error reporting
|
2021-02-04 14:06:02 +01:00 |
Sandro La Bruzzo
|
69c253710b
|
fixed test
|
2021-02-04 10:30:49 +01:00 |
Claudio Atzori
|
e04045089f
|
better logging, WIP: collectorWorker error reporting
|
2021-02-03 17:58:22 +01:00 |
Claudio Atzori
|
0e8a4f9f1a
|
better logging, WIP: collectorWorker error reporting
|
2021-02-03 12:33:41 +01:00 |
Claudio Atzori
|
53884d12c2
|
code formatting
|
2021-02-02 14:38:03 +01:00 |
Claudio Atzori
|
ac46c247d2
|
code formatting
|
2021-02-02 14:24:00 +01:00 |
Claudio Atzori
|
bde14b149a
|
fixed transformation target paths
|
2021-02-02 12:49:29 +01:00 |
Claudio Atzori
|
ca4391aa1c
|
minor changes
|
2021-02-02 12:44:04 +01:00 |
Claudio Atzori
|
bb89b99b24
|
code formatting
|
2021-02-02 12:34:14 +01:00 |
Claudio Atzori
|
75807ea5ae
|
factored out constants
|
2021-02-02 12:28:21 +01:00 |
Sandro La Bruzzo
|
0634674add
|
implemented transformation test
|
2021-02-02 12:12:14 +01:00 |
Claudio Atzori
|
8eaa1fd4b4
|
WIP: metadata collection in INCREMENTAL mode and relative test
|
2021-02-01 19:29:10 +01:00 |
Sandro La Bruzzo
|
bead34d11a
|
code refactor
|
2021-02-01 14:58:06 +01:00 |
Sandro La Bruzzo
|
6ff234d81b
|
Implemented a first prototype of incremental harvesting and transformation using readlock
|
2021-02-01 13:56:05 +01:00 |
Sandro La Bruzzo
|
b6b835ef49
|
update transformation Factory to get Transformation Rule by Id and not by Title
|
2021-02-01 08:49:42 +01:00 |
Sandro La Bruzzo
|
e423634cb6
|
RollBack in case of error WORKS!!!
|
2021-01-29 17:21:42 +01:00 |
Sandro La Bruzzo
|
8ee82576c6
|
Collection on Refresh WORKS!!!
|
2021-01-29 17:02:46 +01:00 |
Sandro La Bruzzo
|
0276180039
|
WIP mdstore
transaction implemented on hadoop side
|
2021-01-29 16:42:41 +01:00 |
Sandro La Bruzzo
|
0f8e2ecce6
|
Merged Datacite transform into this branch
|
2021-01-29 10:45:07 +01:00 |
Sandro La Bruzzo
|
99cf3a8ea4
|
Merged Datacite transform into this branch
|
2021-01-28 16:34:46 +01:00 |
Sandro La Bruzzo
|
98b9498b57
|
Removed old messaging system not quite used from collection and Transformation workflow
code refactor
|
2021-01-28 09:51:17 +01:00 |
Sandro La Bruzzo
|
184e7b3856
|
Implemented new Transformation using spark
|
2021-01-27 15:43:08 +01:00 |
Sandro La Bruzzo
|
ffb092b8d3
|
removed duplicate code HttpConnector.java
|
2021-01-25 15:05:37 +01:00 |
Claudio Atzori
|
41500669e2
|
[BIP! Scores integration] merged missing classes from bipFinder branch
|
2021-01-11 14:39:47 +01:00 |
Claudio Atzori
|
2a7a10809e
|
[BIP! Scores integration] merged missing classes from bipFinder branch
|
2021-01-11 10:05:02 +01:00 |
Claudio Atzori
|
d6686dd7cf
|
merged from master
|
2021-01-08 18:16:12 +01:00 |
Claudio Atzori
|
34229970e6
|
[BIP! Scores integration] Create updates as Result rather than subclasses; Result considers also metrics in the mergeFrom operation
|
2021-01-08 16:29:17 +01:00 |
Claudio Atzori
|
1361c9eb0c
|
[BIP! Scores integration] Create updates as Result rather than subclasses; Result considers also metrics in the mergeFrom operation
|
2021-01-07 10:07:30 +01:00 |
Claudio Atzori
|
2e503ee101
|
code formatting
|
2020-12-17 13:47:38 +01:00 |
Claudio Atzori
|
03319d3bd9
|
Revert "Merge pull request 'Creation of the action set to include the bipFinder! score' (#62) from miriam.baglioni/dnet-hadoop:bipFinder into master"
This reverts commit add7e1693b , reversing
changes made to f9a8fd8bbd .
|
2020-12-17 12:23:58 +01:00 |
Miriam Baglioni
|
888175baf7
|
added java doc
|
2020-12-01 18:36:29 +01:00 |
Miriam Baglioni
|
3d62d99d5d
|
fixed issue in workflow variable
|
2020-12-01 15:02:49 +01:00 |
Miriam Baglioni
|
17680296b9
|
removed unnecessary variable and unused method
|
2020-12-01 15:02:31 +01:00 |
Miriam Baglioni
|
5b3ed70808
|
refactoring
|
2020-12-01 14:31:34 +01:00 |
Miriam Baglioni
|
62ff4999e3
|
added workflow and last step of collection and save
|
2020-12-01 14:30:56 +01:00 |
Miriam Baglioni
|
45d06c45c7
|
collecting all the atomic actions for result type and save them all in the AS path
|
2020-12-01 14:29:18 +01:00 |
Miriam Baglioni
|
0051ebede5
|
extending test
|
2020-12-01 12:43:03 +01:00 |
Miriam Baglioni
|
719da15f04
|
added test resources
|
2020-12-01 12:42:30 +01:00 |
Miriam Baglioni
|
db36e11912
|
classes, test classes and resources for production of the actionset to include bipFinder score in results
|
2020-11-30 20:14:23 +01:00 |
Sandro La Bruzzo
|
66efb39634
|
implemented merge scholix
|
2020-11-04 09:04:01 +01:00 |
Miriam Baglioni
|
4905739be6
|
changed resource file to mirror change in business logic
|
2020-10-30 17:02:57 +01:00 |
Miriam Baglioni
|
b40360ebfb
|
changed the code to mirror the changed decision in the classification level and programme description labels
|
2020-10-30 17:02:30 +01:00 |
Miriam Baglioni
|
696409fb9f
|
disabled tests because needing remote resource
|
2020-10-30 17:01:48 +01:00 |
Miriam Baglioni
|
a2ce527fae
|
changed to match the requirements for short titles in level and long titles in classification
|
2020-10-20 17:03:25 +02:00 |
Claudio Atzori
|
5f7b75f5c5
|
code formatting
|
2020-10-07 13:22:54 +02:00 |
Miriam Baglioni
|
061527f06e
|
adding short description
|
2020-10-05 13:54:39 +02:00 |
Miriam Baglioni
|
0c12d7bdd8
|
adding short description
|
2020-10-05 11:39:55 +02:00 |
Miriam Baglioni
|
fc2f7636be
|
removed not used code
|
2020-10-02 12:33:52 +02:00 |
Miriam Baglioni
|
4aec347351
|
refactoring
|
2020-10-01 16:23:52 +02:00 |
Miriam Baglioni
|
61946b4092
|
refactoring
|
2020-10-01 16:22:48 +02:00 |
Miriam Baglioni
|
7e6d35e56c
|
added the link to the excel file related to topic
|
2020-10-01 15:53:31 +02:00 |
Miriam Baglioni
|
43cbd62c2b
|
added classpath.first in the configuration
|
2020-10-01 15:46:34 +02:00 |
Miriam Baglioni
|
cd69c6b023
|
added dependency for the topic file path
|
2020-10-01 15:45:59 +02:00 |
Miriam Baglioni
|
632351c0da
|
modified test resources to mirror the changes in the code
|
2020-10-01 15:43:02 +02:00 |
Miriam Baglioni
|
ebc1c5513f
|
modified test resources to mirror the changes in the code
|
2020-10-01 15:42:29 +02:00 |
Miriam Baglioni
|
3a374c34b6
|
fixed null pointer exception
|
2020-10-01 15:41:01 +02:00 |
Miriam Baglioni
|
83ea746163
|
added check to the test
|
2020-10-01 15:40:28 +02:00 |
Miriam Baglioni
|
6e5db85b32
|
-
|
2020-10-01 11:51:11 +02:00 |
Miriam Baglioni
|
a46179f61c
|
refactoring
|
2020-10-01 11:22:01 +02:00 |
Miriam Baglioni
|
b90bee124b
|
removing rows that are empty from those imported
|
2020-10-01 11:16:49 +02:00 |
Miriam Baglioni
|
c107f193c9
|
refactoring
|
2020-10-01 11:16:22 +02:00 |
Miriam Baglioni
|
706a80a29a
|
added test to check that separator '-' (not hyphen) will be recognized
|
2020-10-01 10:38:31 +02:00 |
Miriam Baglioni
|
3dca586b3b
|
refactoring
|
2020-10-01 10:34:48 +02:00 |
Miriam Baglioni
|
416bda6066
|
changed the programme.description by using the same value used in the classification instead of the short title or the title
|
2020-10-01 10:31:33 +02:00 |
Miriam Baglioni
|
f6587c91f3
|
added comparison to a char that looks like '-' but is not
|
2020-10-01 10:30:26 +02:00 |
Miriam Baglioni
|
7e73bb88b3
|
changed the logic to add the topic description to the project
|
2020-09-28 17:21:43 +02:00 |
Miriam Baglioni
|
0a035e3630
|
-
|
2020-09-28 17:20:49 +02:00 |
Miriam Baglioni
|
16bee2084d
|
added the topic code to the project subset
|
2020-09-28 17:20:11 +02:00 |
Miriam Baglioni
|
0bf2d0db52
|
added to the workflow the download of the topic excel file and one property needed to get the input path of the topic file in the hdfs filesystem
|
2020-09-28 12:17:22 +02:00 |
Miriam Baglioni
|
c2abde4d9f
|
changed the implementation of Atomic Actions creation by exploiting the topic information obtained from the cordis excel file
|
2020-09-28 12:16:34 +02:00 |
Miriam Baglioni
|
d930b8d3fc
|
changed the query to get only the code of the project and not the optional1 (topic code) and optional2 (topic description)
|
2020-09-28 12:15:48 +02:00 |
Miriam Baglioni
|
f8f5cfd5cc
|
removed the part added to set the topic code and description in the step of project preparation
|
2020-09-28 12:13:33 +02:00 |
Miriam Baglioni
|
9e19c9a221
|
remove the topic description from the values in the CSVProject class
|
2020-09-28 12:11:03 +02:00 |
Miriam Baglioni
|
6d8b932e40
|
refactoring
|
2020-09-28 12:06:56 +02:00 |
Miriam Baglioni
|
b77f166549
|
changed the package name from csvutils to utils
|
2020-09-28 12:05:47 +02:00 |
Miriam Baglioni
|
f4739a371a
|
code to get the information related to the topic association between code and description.
|
2020-09-28 12:02:48 +02:00 |
Miriam Baglioni
|
12c2dfc268
|
modified the resource to consider the information added to the model
|
2020-09-25 14:17:23 +02:00 |
Miriam Baglioni
|
969fa8d96e
|
fixed issue and changed the transformation of the programme file to consider the new model
|
2020-09-25 13:32:34 +02:00 |
Miriam Baglioni
|
e917281822
|
-
|
2020-09-24 15:24:05 +02:00 |
Miriam Baglioni
|
9f54f69e6d
|
added topic information
|
2020-09-24 15:23:35 +02:00 |
Miriam Baglioni
|
d6206d6e63
|
add the topic description to the action set associated to the project
|
2020-09-24 15:22:40 +02:00 |
Miriam Baglioni
|
6b50226f3b
|
added topic code and topic description
|
2020-09-24 15:21:49 +02:00 |
Miriam Baglioni
|
15af1f527e
|
modified to consider the topic information
|
2020-09-24 15:20:56 +02:00 |
Miriam Baglioni
|
609ff17cfc
|
now the commission gives us the framework programme (FP7 - H2020) so use this information to filter out programmes not associated to H2020
|
2020-09-24 15:19:31 +02:00 |
Miriam Baglioni
|
b66f930466
|
Added optional1 and optional2 information to the files read from the db. Optional1 contains the topic code and optional2 contains the topic description
|
2020-09-24 15:16:56 +02:00 |
Miriam Baglioni
|
860e6d38a6
|
added topic description to the CSV project variables
|
2020-09-24 15:15:26 +02:00 |