Open Citation integration #401
Reference: D-Net/dnet-hadoop#401
New implementation for the integration of the new Open Citation (OC) dump. The implementation has to change because OC has changed the information in the dataset: it now uses internal OC identifiers, instead of pids (DOI, PMID, etc.), to link citing and cited results.
We also need to download a correspondence file that maps the internal OC identifiers to pids.
OC also uses ISBN and ISSN as identifiers for the citing/cited resource. For now, these types of identifiers are not considered. We can reconsider them once venues have been inserted.
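As a rough illustration of the mapping described above, the sketch below resolves an internal OC identifier to its pids through a correspondence table and drops the ISBN/ISSN entries. It is not the code in this PR: the class name, the `resolve` method, the `type:value` pid notation, and the in-memory `Map` standing in for the correspondence file are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names and data layout are illustrative, not taken from the PR.
class OcIdMappingSketch {

    // Pid types that are not considered for now, per the description.
    static final Set<String> SKIPPED_TYPES = new HashSet<>(Arrays.asList("isbn", "issn"));

    // Resolves an internal OC identifier to its pids (assumed "type:value" strings),
    // leaving out the skipped types. Unknown identifiers resolve to an empty list.
    static List<String> resolve(String ocId, Map<String, List<String>> correspondence) {
        List<String> pids = new ArrayList<>();
        for (String pid : correspondence.getOrDefault(ocId, Collections.emptyList())) {
            int sep = pid.indexOf(':');
            if (sep <= 0) {
                continue; // no recognizable type prefix: ignore the entry
            }
            String type = pid.substring(0, sep).toLowerCase();
            if (!SKIPPED_TYPES.contains(type)) {
                pids.add(pid);
            }
        }
        return pids;
    }
}
```

With a correspondence entry that lists one DOI and one ISSN for the same internal identifier, `resolve` would return only the DOI.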
Parameters to be passed to the workflow:
Possible values for the resumeFrom parameter:
DownloadDump: to download the new OC dump files and the correspondence file
ExtractContent: to extract the zip from the dump
ReadContent: to read the content in the new dump and create the corresponding json file
MapContent: to map the internal OC identifiers in the corresponding pids
CreateAS: to create the action set
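The way a resumeFrom value could select the steps to run can be sketched as follows. This assumes the steps run in the order listed and that resuming from one step also runs every step after it. The description does not state this, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: the stage names come from the description above,
// everything else (class, method, ordering semantics) is an assumption.
class ResumeFromSketch {

    // Stages in the order they are listed in the description.
    static final List<String> STAGES = Arrays.asList(
        "DownloadDump", "ExtractContent", "ReadContent", "MapContent", "CreateAS");

    // Returns the given stage followed by all later stages.
    static List<String> stagesFrom(String resumeFrom) {
        int start = STAGES.indexOf(resumeFrom);
        if (start < 0) {
            throw new IllegalArgumentException("Unknown resumeFrom value: " + resumeFrom);
        }
        return STAGES.subList(start, STAGES.size());
    }
}
```

Under these assumptions, `stagesFrom("MapContent")` yields `MapContent` and `CreateAS`, so a run could skip the download, extraction, and read steps when their outputs are already in place.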