WIP: 8549_affiliation_extraction #50
Reference: D-Net/openaire-graph-docs#50
This PR introduces the following changes:
@claudio.atzori @miriam.baglioni as you can see, I have highlighted some parts of the pages above that need your feedback.
Please provide your input and let me know if you need anything from me.
@ -82,0 +67,4 @@
| uniprot | [Protein Data Bank](http://www.pdb.org/) <span className="todo">[ or EMBL-EBI ?]</span> | `uniprot_____`
| ena | [Protein Data Bank](http://www.pdb.org/) <span className="todo">[ or EMBL-EBI ?]</span> | `ena_________`
| pdb | [Protein Data Bank](http://www.pdb.org/) <span className="todo">[ or EMBL-EBI ?]</span> | `pdb_________`
| handle | Any repository | <span className="todo">`handle______`</span>
This isn't correct. Handles bypass the rule: they are integrated as legitimate PIDs (i.e. set in the fields `result.pid` and `result.instance.pid`), but they do not take part in the creation of the internal identifier, which depends on the datasource-specific prefix + `md5(localid)`, where the local id is typically the OAI identifier.
@ -82,0 +71,4 @@
#### Delegated authorities
<span className="todo">[TODO: the problem that this solves is that we can get a specific PID from more than one authoritative source, right? For example, if we get DOIs from Crossref, Datacite, and Zenodo (btw Zenodo was not mentioned in the first table).
Does your comment refer to the concept of PID authorities, or to the concept of delegated authorities? Zenodo is not listed in the table above because it is not a PID authority; it is a delegated authority (for DOIs).
In any case, the problem that delegated authorities aim to solve (the solution is still in place) was explained in D-Net/dnet-hadoop#187; I hope it helps.
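To make the identifier rule from the handle thread above concrete (datasource-specific prefix + `md5(localid)`), here is a minimal sketch. It is not the actual implementation: the function names are hypothetical, the 12-character underscore padding is inferred only from the prefixes shown in the table (e.g. `ena_________`), and the `::` separator is an assumption.

```python
import hashlib

# Assumption: prefixes are right-padded with "_" to 12 characters,
# as suggested by the examples in the table (e.g. "uniprot_____").
PREFIX_WIDTH = 12


def pad_prefix(name: str, width: int = PREFIX_WIDTH) -> str:
    """Right-pad a datasource prefix with underscores (hypothetical helper)."""
    return name.ljust(width, "_")


def internal_id(datasource_prefix: str, local_id: str, sep: str = "::") -> str:
    """Build an identifier as padded prefix + separator + md5(local_id).

    The separator is an assumption; the comment above only states
    'datasource specific prefix + md5(localid)'.
    """
    digest = hashlib.md5(local_id.encode("utf-8")).hexdigest()
    return f"{pad_prefix(datasource_prefix)}{sep}{digest}"


if __name__ == "__main__":
    # The local id is typically the OAI identifier; this one is made up.
    print(internal_id("pdb", "oai:example.org:123"))
```

Under this reading, a handle attached to a record would only appear in `result.pid` / `result.instance.pid` and would never be passed to a function like `internal_id`.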
@ -82,0 +72,4 @@
#### Delegated authorities
<span className="todo">[TODO: the problem that this solves is that we can get a specific PID from more than one authoritative source, right? For example, if we get DOIs from Crossref, Datacite, and Zenodo (btw Zenodo was not mentioned in the first table).
Can't we mention those sources by priority in the first table and simply mention in the text that we prefer to collect those PIDs starting from the first till the last one? Is this the problem or I am missing something else here?]</span>
We could list all the prefixes in a single, comprehensive table that covers both the prefixes (and sources) considered PID authorities and those considered delegated authorities. If we do so, however, we should explain the two concepts before the table.
I am including a few, but we should decide which datasource registries (prefixes) should be included in the table describing their PIDs.
The frequency per prefix can be extracted with the following query, which returns 139 entries; the top 10 are listed below:
Similarly, for organizations we should describe the most important prefixes (indicating the registry/source from which the organization comes):
And last, but not least, for projects: