Describe the usage of the pivot table to improve stability of “representative records” and how “non authoritative” PIDs are used to generate “representative records”
parent 6bb810a606
commit 6b3533d29a
@@ -163,18 +163,19 @@ are included in the group, while the remaining elements are kept ungrouped.
#### Selection of the pivot record
Each group of duplicate records needs to be identified in the final graph with
-an OpenAIRE identifier, derived from a record of the group known as the pivot
-record. The pivot record is determined after sorting by the following criteria:
+an OpenAIRE identifier, derived from a record of the group known as the _pivot
+record_. It is determined after sorting the group of duplicate records by the
+following criteria:
1. Records previously chosen as pivot records in the graph's previous
generations.
-2. Records with identifiers from a "PID authority".
+2. Records with identifiers from a [PID authority](/data-model/pids-and-identifiers#pid-authorities).
3. Publications from CrossRef or datasets from DataCite.
4. Records with an earlier date of acceptance.
5. Records with smaller IDs in lexicographical order.
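
The five criteria above can be read as a single composite sort key. The following is a minimal sketch under stated assumptions, not the actual OpenAIRE implementation: the `Record` class, its field names, and the choice to sort undated records last are all hypothetical and only serve to illustrate the ordering.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, Optional


# Hypothetical record shape; the field names are illustrative only.
@dataclass(frozen=True)
class Record:
    id: str
    was_pivot: bool              # criterion 1: listed in the pivot history
    has_authority_pid: bool      # criterion 2: identifier from a PID authority
    from_preferred_source: bool  # criterion 3: CrossRef publication or DataCite dataset
    date_of_acceptance: Optional[date]  # criterion 4


def pivot_sort_key(record: Record):
    # False sorts before True, so each flag is negated to put matches first.
    # Assumption: records without a date of acceptance sort after dated ones.
    return (
        not record.was_pivot,
        not record.has_authority_pid,
        not record.from_preferred_source,
        record.date_of_acceptance or date.max,
        record.id,  # criterion 5: lexicographical tie-breaker
    )


def select_pivot(group: Iterable[Record]) -> Record:
    # The pivot is the first record in sorted order, i.e. the minimum key.
    return min(group, key=pivot_sort_key)
```

Because tuples compare element by element, a later criterion is consulted only when all earlier ones tie, which matches the priority order of the list.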
The first sorting criterion is possible because a state table, called "pivot
-history," is maintained across graph generations. It keeps track of which
+history", is maintained across graph generations. It keeps track of which
records were used as pivot records in what graph, guaranteed to retain data for
the last 12 months.
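
As a rough illustration of how such a state table might be consulted, the sketch below models the pivot history as a list of `(record_id, graph_date)` rows. The row layout, the function names, and the 366-day approximation of "12 months" are assumptions. The text only guarantees that the last 12 months are retained, so whether older rows are actually dropped is also an assumption.

```python
from datetime import date

# Assumption: "12 months" is approximated as 366 days.
RETENTION_DAYS = 366


def prune_history(history, today):
    """Keep only rows whose graph date falls inside the retention window."""
    return [
        (record_id, graph_date)
        for record_id, graph_date in history
        if (today - graph_date).days <= RETENTION_DAYS
    ]


def previous_pivots(history):
    """Return the set of record IDs that served as a pivot in a retained graph."""
    return {record_id for record_id, _ in history}


# Hypothetical rows: record ID and the date of the graph that used it as pivot.
history = [("id-1", date(2023, 1, 1)), ("id-2", date(2024, 5, 1))]
retained = prune_history(history, today=date(2024, 6, 1))
known_pivots = previous_pivots(retained)
```

A membership test against `known_pivots` is then enough to evaluate the first sorting criterion for any record in a group.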