Sandro La Bruzzo
|
80e15cc455
|
implemented mapping from uniprot, pdb and ebi links
|
2021-06-24 17:20:00 +02:00 |
Sandro La Bruzzo
|
080a280bea
|
added pdb to Oaf Transformation
|
2021-06-21 16:23:59 +02:00 |
Sandro La Bruzzo
|
1dc0c59e20
|
merged fix for Thai dates from stable_ids
|
2021-06-21 10:39:46 +02:00 |
Sandro La Bruzzo
|
dc66cf615b
|
Merge branch 'stable_id_scholexplorer' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_id_scholexplorer
|
2021-06-21 09:38:33 +02:00 |
Sandro La Bruzzo
|
507e42102a
|
added pdb to oaf class
|
2021-06-21 09:36:40 +02:00 |
Sandro La Bruzzo
|
a167543637
|
Merge branch 'stable_ids' of code-repo.d4science.org:D-Net/dnet-hadoop into stable_id_scholexplorer
|
2021-06-21 09:14:11 +02:00 |
Sandro La Bruzzo
|
4fe7b75644
|
renamed packages
|
2021-06-18 16:41:24 +02:00 |
Sandro La Bruzzo
|
3990165d05
|
changed typologies of unresolved relation
|
2021-06-18 11:43:59 +02:00 |
Claudio Atzori
|
2dd5449c13
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-06-18 10:08:15 +02:00 |
Claudio Atzori
|
fd54ecf7bd
|
bumped dhp-schemas dependency version
|
2021-06-18 10:08:07 +02:00 |
Miriam Baglioni
|
180d671127
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-06-18 09:46:18 +02:00 |
Miriam Baglioni
|
13c96622c9
|
-
|
2021-06-18 09:45:16 +02:00 |
Miriam Baglioni
|
b486ae498f
|
added test and test resource to verify the generation of the date of acceptance from the input extracted from the dump
|
2021-06-18 09:43:32 +02:00 |
Miriam Baglioni
|
464c2ddde3
|
changed to split in two steps the generation of the crossref dataset
|
2021-06-18 09:42:31 +02:00 |
Miriam Baglioni
|
6aca0d8ebb
|
added kryo encoding for input files
|
2021-06-18 09:42:07 +02:00 |
Miriam Baglioni
|
3585e53da3
|
changed to split in two steps the generation of the crossref dataset
|
2021-06-18 09:41:23 +02:00 |
Claudio Atzori
|
41b551562e
|
applying PR#115 (DatePicker) on stable_ids
|
2021-06-17 09:33:50 +02:00 |
Sandro La Bruzzo
|
3100166d29
|
Merge remote-tracking branch 'origin/stable_ids' into stable_id_scholexplorer
|
2021-06-16 16:22:16 +02:00 |
Claudio Atzori
|
74833d04f1
|
Merge branch 'pids_beta' of https://code-repo.d4science.org/antonis.lempesis/dnet-hadoop into stable_ids
|
2021-06-16 15:54:18 +02:00 |
Claudio Atzori
|
7243a40c88
|
code formatting
|
2021-06-16 15:03:03 +02:00 |
Sandro La Bruzzo
|
dfcf78cf24
|
removed wrong code
|
2021-06-16 14:57:42 +02:00 |
Sandro La Bruzzo
|
cc0f2b11fb
|
Implemented mapping from pubmed baseline to OAF
|
2021-06-16 14:56:24 +02:00 |
Miriam Baglioni
|
95885bcf12
|
forces executor memory and driver executor memory to be 7G (trying to avoid OOM)
|
2021-06-16 10:17:52 +02:00 |
Miriam Baglioni
|
2550a73981
|
-
|
2021-06-16 10:04:41 +02:00 |
Miriam Baglioni
|
1c47c0d786
|
modified the number of executors trying to avoid OOM exception
|
2021-06-15 21:05:39 +02:00 |
Miriam Baglioni
|
7deac55138
|
added a "resume from" option to the workflow
|
2021-06-15 18:38:20 +02:00 |
Antonis Lempesis
|
f7c0b80e35
|
storing result_instance as parquet
|
2021-06-15 14:45:48 +03:00 |
Miriam Baglioni
|
66e7ef892f
|
changed the parameter name
|
2021-06-15 11:08:54 +02:00 |
Miriam Baglioni
|
4f47ad0891
|
no need to rename the folders, just write in overwrite mode, so I changed the name of the output folder
|
2021-06-15 09:28:31 +02:00 |
Miriam Baglioni
|
9f9dd00b94
|
refactoring
|
2021-06-15 09:24:46 +02:00 |
Miriam Baglioni
|
63d74ee379
|
refactoring
|
2021-06-15 09:24:11 +02:00 |
Miriam Baglioni
|
6ebc236657
|
added needed property: outputPath
|
2021-06-15 09:23:24 +02:00 |
Miriam Baglioni
|
f7379255b6
|
changed the workflow to extract info from the dump
|
2021-06-15 09:22:54 +02:00 |
Miriam Baglioni
|
d6e21bb6ea
|
creates the crossref dataset used for doiboost together with unpacking part from tar
|
2021-06-14 17:27:19 +02:00 |
Miriam Baglioni
|
4da141bd7c
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-06-14 13:41:02 +02:00 |
Miriam Baglioni
|
ce0cfd79e0
|
creates the crossref dataset used for doiboost
|
2021-06-14 13:40:19 +02:00 |
Miriam Baglioni
|
93efe4de82
|
split the construction of crossref dataset in two parts. This one just unpacks the tar entries
|
2021-06-14 13:39:40 +02:00 |
Michele Artini
|
ada063ce70
|
fixed a problem with empty mdstore list (2)
|
2021-06-14 12:04:47 +02:00 |
Michele Artini
|
83132ee99a
|
fixed a problem with empty mdstore list
|
2021-06-14 11:57:00 +02:00 |
Miriam Baglioni
|
cf360d7c97
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-06-14 10:19:49 +02:00 |
Miriam Baglioni
|
8873e6b6d1
|
workflow and parameter
|
2021-06-14 10:15:57 +02:00 |
Miriam Baglioni
|
0f1acdf6b6
|
workflow and parameter
|
2021-06-14 10:08:55 +02:00 |
Sandro La Bruzzo
|
aeb8132627
|
Merged branch stable_ids
|
2021-06-14 10:07:29 +02:00 |
Sandro La Bruzzo
|
efbea1e01a
|
minor fix
|
2021-06-14 09:45:14 +02:00 |
Miriam Baglioni
|
75780fc636
|
extraction of the tar for the dump of crossref, and creation of the dataset
|
2021-06-14 09:45:07 +02:00 |
Claudio Atzori
|
2039bb9f5f
|
orcid / orcid_pending cleaning backported from master branch
|
2021-06-14 09:40:50 +02:00 |
Claudio Atzori
|
dd19c4ac5a
|
Merge pull request 'import_new_mdstores' (#112) from import_new_mdstores into stable_ids
Reviewed-on: D-Net/dnet-hadoop#112
|
2021-06-14 09:23:55 +02:00 |
Claudio Atzori
|
e9e86a237d
|
Merge branch 'stable_ids' of https://code-repo.d4science.org/D-Net/dnet-hadoop into stable_ids
|
2021-06-11 17:00:02 +02:00 |
Claudio Atzori
|
10bd6ca194
|
depending on dhp-schemas:2.5.12 (release)
|
2021-06-11 16:59:56 +02:00 |
Claudio Atzori
|
a900bfb874
|
delegating the date parsing to https://github.com/sisyphsu/dateparser
|
2021-06-11 16:53:01 +02:00 |