Miriam Baglioni
|
4b1920f008
|
changed the working path parameter value to depend on the dnet-workflow working dir parameter
|
2021-10-14 09:18:09 +02:00 |
Miriam Baglioni
|
8db39c86e2
|
added new parameter in the doiboost process workflow to specify a folder for processing the MAG dataset
|
2021-10-14 09:17:39 +02:00 |
Sandro La Bruzzo
|
be79d74e3d
|
Fixed DoiBoost generation to point to correct organization in affiliation relation
|
2021-09-27 16:57:04 +02:00 |
Miriam Baglioni
|
882abb40e4
|
CrossrefDump -
|
2021-08-20 11:12:53 +02:00 |
Miriam Baglioni
|
35880c0e7b
|
CrossrefDump - changed the wf to be able to resume from one of the steps
|
2021-08-20 11:11:35 +02:00 |
Miriam Baglioni
|
f3b6c392c1
|
CrossrefDump - moving parameter file under folder crossref_dump_reader
|
2021-08-20 11:10:58 +02:00 |
Miriam Baglioni
|
65822400ce
|
CrossrefDump - added new parameter file that was missing
|
2021-08-20 11:10:35 +02:00 |
Claudio Atzori
|
bc7068106c
|
added crossref download oozie workflow
|
2021-08-13 17:19:44 +02:00 |
Claudio Atzori
|
2c0a05f11a
|
manually merged PR#139
|
2021-08-13 17:15:53 +02:00 |
Miriam Baglioni
|
5856ca8a7b
|
merging with branch beta - resolved conflicts
|
2021-08-13 16:45:45 +02:00 |
Miriam Baglioni
|
6fec71e8d2
|
removed the specifics of the infra we are running the wf on from the wf name
|
2021-08-13 16:39:02 +02:00 |
Miriam Baglioni
|
ed7e28490a
|
change in sh
|
2021-08-13 16:19:01 +02:00 |
Miriam Baglioni
|
6eb7508995
|
merging with branch beta
|
2021-08-13 16:07:04 +02:00 |
Miriam Baglioni
|
b6b58bba28
|
reverting
|
2021-08-11 17:25:37 +02:00 |
Miriam Baglioni
|
da20fceaf7
|
removed all the part related to the crossref dump download since it is done in a separate workflow
|
2021-08-09 11:53:45 +02:00 |
Miriam Baglioni
|
54a6cbb244
|
CrossrefDump - put token among the parameters
|
2021-08-09 11:41:10 +02:00 |
Miriam Baglioni
|
b7079804cb
|
CrossrefDump - put token among the parameters
|
2021-08-09 11:34:35 +02:00 |
Miriam Baglioni
|
bd096f5170
|
removed not needed param file
|
2021-08-05 10:55:43 +02:00 |
Miriam Baglioni
|
5faeefbda8
|
added script to download the dump, changed the workflow input parameters
|
2021-08-05 10:54:03 +02:00 |
Miriam Baglioni
|
1965e4eece
|
new workflow for downloading the dump of crossref and unpack it
|
2021-08-04 18:29:03 +02:00 |
Miriam Baglioni
|
b4eb026c8b
|
merging with branch beta
|
2021-08-04 10:21:37 +02:00 |
Miriam Baglioni
|
9831725073
|
Hosted By Map - removed an unneeded step from the workflow. The hbm will also take care of integrating the unibi list of gold open access journals
|
2021-08-03 11:02:17 +02:00 |
Miriam Baglioni
|
eb07f7f40f
|
Hosted By Map
|
2021-07-27 12:27:26 +02:00 |
Miriam Baglioni
|
63553a76b3
|
added code to download gold issn list from unibi
|
2021-07-22 12:01:48 +02:00 |
Miriam Baglioni
|
b226ba4439
|
merging with branch beta
|
2021-07-21 09:46:40 +02:00 |
Miriam Baglioni
|
83fe31c92e
|
changed the name of the workflows
|
2021-07-19 18:19:14 +02:00 |
Miriam Baglioni
|
54acc5373b
|
changed the name of the workflows
|
2021-07-19 18:18:09 +02:00 |
Miriam Baglioni
|
b420b11ed3
|
doubled the number of partitions in ProcessMag
|
2021-07-19 18:16:23 +02:00 |
Miriam Baglioni
|
662c396354
|
doubled the number of partitions in ConvertCrossrefToOaf
|
2021-07-19 12:41:14 +02:00 |
Miriam Baglioni
|
c4b18e6ccb
|
changed download.sh, added a skip step to allow skipping one phase, and changed the workflow sequence of steps
|
2021-07-16 15:01:25 +02:00 |
Miriam Baglioni
|
acd6056330
|
added shell action to automatically download the new dump and put it in a specified hdfs location
|
2021-07-16 12:47:10 +02:00 |
Claudio Atzori
|
bf9e0d2d4f
|
Merge pull request 'orcid-no-doi' (#123) from enrico.ottonello/dnet-hadoop:orcid-no-doi into beta
Reviewed-on: #123
|
2021-07-15 17:59:41 +02:00 |
Sandro La Bruzzo
|
3d8e2aa146
|
Code refactor:
- removed old workflows in doiboost
- split the doiboost workflow into preprocess and process
|
2021-07-14 14:37:06 +02:00 |
Sandro La Bruzzo
|
c35c117601
|
fixed process doiboost workflow:
- split OrcidToOAF into two phases, preprocess and process
- updated workflow used in production
|
2021-07-14 12:48:01 +02:00 |
Miriam Baglioni
|
13c96622c9
|
-
|
2021-06-18 09:45:16 +02:00 |
Miriam Baglioni
|
3585e53da3
|
split the generation of the crossref dataset into two steps
|
2021-06-18 09:41:23 +02:00 |
Miriam Baglioni
|
95885bcf12
|
forces executor memory and driver memory to be 7G (trying to avoid OOM)
|
2021-06-16 10:17:52 +02:00 |
Miriam Baglioni
|
2550a73981
|
-
|
2021-06-16 10:04:41 +02:00 |
Miriam Baglioni
|
1c47c0d786
|
modified the number of executors to try to avoid the OOM exception
|
2021-06-15 21:05:39 +02:00 |
Miriam Baglioni
|
7deac55138
|
added one option for resume from in the wf
|
2021-06-15 18:38:20 +02:00 |
Miriam Baglioni
|
66e7ef892f
|
changed the parameter name
|
2021-06-15 11:08:54 +02:00 |
Miriam Baglioni
|
4f47ad0891
|
no need to rename the folders, just write in overwrite mode, so I changed the name of the output folder
|
2021-06-15 09:28:31 +02:00 |
Miriam Baglioni
|
6ebc236657
|
added needed property: outputPath
|
2021-06-15 09:23:24 +02:00 |
Miriam Baglioni
|
f7379255b6
|
changed the workflow to extract info from the dump
|
2021-06-15 09:22:54 +02:00 |
Miriam Baglioni
|
8873e6b6d1
|
workflow and parameter
|
2021-06-14 10:15:57 +02:00 |
Miriam Baglioni
|
0f1acdf6b6
|
workflow and parameter
|
2021-06-14 10:08:55 +02:00 |
Enrico Ottonello
|
c537986b7c
|
deleted folders with merged data immediately before merge phases
|
2021-04-28 11:25:25 +02:00 |
Claudio Atzori
|
e5abbec2ba
|
[orcid] download of the lambda file defined in a script
|
2021-04-22 11:22:10 +02:00 |
Claudio Atzori
|
55964cbd81
|
[orcid] large oozie workflow cleanup; updated workflow for the orcidnodoi actionset creation
|
2021-04-22 10:18:09 +02:00 |
Claudio Atzori
|
52244f813a
|
merging from enrico.ottonello/dnet-hadoop:orcid-no-doi
|
2021-04-21 12:24:09 +02:00 |