Enrico Ottonello
|
9e8e7fe6ef
|
add comments
|
2020-09-15 11:32:49 +02:00 |
Enrico Ottonello
|
0377b40fba
|
output to one parquet file
|
2020-07-30 18:38:07 +02:00 |
Enrico Ottonello
|
196f36c6ed
|
fix publication dataset creation
|
2020-07-30 13:38:33 +02:00 |
Enrico Ottonello
|
c82b15b5f4
|
migrate configuration to ocean, fix publication dataset creation
|
2020-07-28 15:23:52 +02:00 |
Enrico Ottonello
|
ca37d3427b
|
separate workflow to parse orcid summaries, activities and generate dataset with no doi publications; test
|
2020-07-03 23:30:31 +02:00 |
Enrico Ottonello
|
1729cc5cf3
|
publication conversion from json to oaf test
|
2020-07-02 18:46:20 +02:00 |
Enrico Ottonello
|
5525f57ec8
|
converter from orcid work json to oaf
|
2020-07-01 18:36:14 +02:00 |
Enrico Ottonello
|
b7b6be12a5
|
fixed enriched works generation
|
2020-06-29 18:03:16 +02:00 |
Enrico Ottonello
|
b2213b6435
|
merged with dnet version
|
2020-06-26 17:27:34 +02:00 |
Enrico Ottonello
|
c5e149c46e
|
Merge branch 'master' of https://code-repo.d4science.org/D-Net/dnet-hadoop into orcid-no-doi
|
2020-06-26 16:15:38 +02:00 |
Enrico Ottonello
|
d6498278ed
|
added workflow to generate seq(orcidId,work) and seq(orcidId,enrichedWork)
|
2020-06-25 18:43:29 +02:00 |
Sandro La Bruzzo
|
a6c0faac70
|
added test to verify secondary sorting
|
2020-06-25 10:48:15 +02:00 |
Enrico Ottonello
|
fcbb4c1489
|
parser of orcid publication data from xml original dump
|
2020-06-24 16:29:32 +02:00 |
Claudio Atzori
|
67c7b31ba6
|
Merge branch 'master' into graph_cleaning
|
2020-06-10 15:00:35 +02:00 |
Claudio Atzori
|
a2fdf85ba1
|
WIP: graph cleaner implementation
|
2020-06-09 19:52:53 +02:00 |
Alessia Bardi
|
2d3f7d1eb4
|
fixed log classes to make the ORCID test run
|
2020-06-09 18:07:14 +02:00 |
Alessia Bardi
|
fc4d220964
|
updated function name for SNSF
|
2020-06-09 17:05:31 +02:00 |
Alessia Bardi
|
d6de406e11
|
fixed classid for subjects
|
2020-06-09 14:43:34 +02:00 |
Alessia Bardi
|
f072125152
|
map volume and issue in journal information from MAG
|
2020-06-09 14:32:10 +02:00 |
Alessia Bardi
|
b7cb1163ea
|
identifiers always start with 50
|
2020-06-09 10:39:11 +02:00 |
Alessia Bardi
|
9fd25887f7
|
Result identifiers all start with 50|
|
2020-06-08 19:32:24 +02:00 |
Alessia Bardi
|
16cb073b15
|
set the instance dateofacceptance with the Crossref createdDate in case the issuedDate is blank
|
2020-06-08 19:06:03 +02:00 |
Sandro La Bruzzo
|
7ac1ba2e35
|
improvement DOIBoost
|
2020-06-04 14:39:20 +02:00 |
Sandro La Bruzzo
|
13815d5d13
|
improvement DOIBoost
|
2020-06-01 17:52:12 +02:00 |
Sandro La Bruzzo
|
b87b3ddb6b
|
changed mapping ORCIDToOAF
|
2020-05-29 09:32:04 +02:00 |
Sandro La Bruzzo
|
7d29b61c62
|
code refactor
|
2020-05-28 09:57:46 +02:00 |
Sandro La Bruzzo
|
25f52e19a4
|
implemented generation of ActionSet
|
2020-05-26 09:15:33 +02:00 |
Sandro La Bruzzo
|
2408083566
|
implemented filtering step
|
2020-05-23 08:46:49 +02:00 |
Sandro La Bruzzo
|
147dd389bf
|
minor fix
|
2020-05-22 20:51:42 +02:00 |
Sandro La Bruzzo
|
22936d0877
|
Merge branch 'doiboost' of code-repo.d4science.org:D-Net/dnet-hadoop into doiboost
|
2020-05-22 15:15:17 +02:00 |
Sandro La Bruzzo
|
9fbb221457
|
completed mapping of UnpayWall and ORCID
|
2020-05-22 15:15:09 +02:00 |
Enrico Ottonello
|
1109d3b3fc
|
Merge branch 'doiboost' of https://code-repo.d4science.org/D-Net/dnet-hadoop into doiboost
|
2020-05-21 00:41:27 +02:00 |
Enrico Ottonello
|
869a53040e
|
save to text file format
|
2020-05-21 00:41:21 +02:00 |
Sandro La Bruzzo
|
5818abaab4
|
fixed Crossref Mapping
|
2020-05-20 17:05:46 +02:00 |
Sandro La Bruzzo
|
b771d67e9d
|
next step of MAG conversion implemented
|
2020-05-20 08:14:03 +02:00 |
Enrico Ottonello
|
934ad570e0
|
joined summaries and activities dataset
|
2020-05-19 12:57:21 +02:00 |
Enrico Ottonello
|
ca722d4d18
|
merged
|
2020-05-19 09:43:12 +02:00 |
Enrico Ottonello
|
7362bc3e9d
|
workflow to generate seq(doi,AuthorList)
|
2020-05-19 09:34:44 +02:00 |
Sandro La Bruzzo
|
486e850bcc
|
next step of MAG conversion implemented
|
2020-05-19 09:24:45 +02:00 |
Enrico Ottonello
|
d4e9075f22
|
Merge branch 'doiboost' of https://code-repo.d4science.org/D-Net/dnet-hadoop into doiboost
|
2020-05-18 19:51:36 +02:00 |
Enrico Ottonello
|
fc80e8c7de
|
added accumulator; last modified date of the record is added to saved data; lambda file is partitioned into 20 parts before the download starts
|
2020-05-18 19:51:29 +02:00 |
Enrico Ottonello
|
0b29bb7e3b
|
spark job to download orcid records modified after a fixed date
|
2020-05-15 19:49:26 +02:00 |
Sandro La Bruzzo
|
d876f47d06
|
next step of MAG conversion implemented
|
2020-05-13 10:38:04 +02:00 |
Enrico Ottonello
|
08040cef80
|
spark action to analyze orcid lambda file
|
2020-05-12 16:57:43 +02:00 |
Enrico Ottonello
|
f53e42bda7
|
merged
|
2020-05-11 14:49:28 +02:00 |
Enrico Ottonello
|
7990894454
|
different date format in lambda file parsing
|
2020-05-11 14:41:11 +02:00 |
Sandro La Bruzzo
|
0c6774e4da
|
updated pom version
|
2020-05-11 14:35:14 +02:00 |
Sandro La Bruzzo
|
2b48a2c32c
|
Merge branch 'doiboost' of code-repo.d4science.org:D-Net/dnet-hadoop into doiboost
|
2020-05-11 09:38:36 +02:00 |
Sandro La Bruzzo
|
4cebca09d2
|
start implementing MAG mapping
|
2020-05-11 09:38:27 +02:00 |
Enrico Ottonello
|
b9d126dd1f
|
formatting modified after commit
|
2020-05-08 14:54:37 +02:00 |