# dhp-mdstore-manager

A key component in the OpenAIRE aggregation workflows is the Metadata Store Manager (MdSM).
It manages the set of metadata record collections resulting from the aggregation processes (MDStores), keeping track of the
logical-physical mapping of each MDStore, i.e. the HDFS location where each set of records is stored.

Moreover, as the aggregation workflows are intrinsically asynchronous processes, it must ensure the consistency of the
records stored within each MDStore. Since HDFS is an append-only filesystem (i.e. it does not support batch updates to
existing files), the MdSM introduces the concept of MDStore version: each MDStore is represented by one or more versions,
each being a set of metadata records associated with the timestamp of its creation (so that versions can be ordered over time).
One of the versions is defined as the "current" one, available for clients to read.
Writing a new batch of records to a given MDStore doesn't alter the content available to clients reading from the same
MDStore until the write operations are concluded and committed, thus implementing a transaction (similar to
transactions in an RDBMS).
Only a predefined number of versions is kept. From time to time a garbage collection mechanism disposes of older MDStore
versions, preserving the last N, where N is a configuration parameter defined in the information system (or in the MdSM
itself, yet to be decided).

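The versioning scheme described above can be illustrated with a minimal in-memory sketch in Python. It is not the MdSM implementation: the class and method names (`MDStore`, `new_version`, `commit`, `abort`, `expire`) are hypothetical, records are kept in plain lists rather than on HDFS, a counter stands in for the creation timestamp, and the number of retained versions is passed in as `keep_last`.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Version:
    version_id: str
    created: int                    # creation order, stands in for the timestamp
    records: list = field(default_factory=list)
    committed: bool = False


class MDStore:
    """Toy model of one MDStore and its versions (all names are illustrative)."""

    def __init__(self, md_id):
        self.md_id = md_id
        self.versions = {}          # version_id -> Version
        self.current = None         # id of the version clients read from
        self._clock = itertools.count(1)

    def new_version(self):
        # A preliminary version is invisible to readers until it is committed.
        created = next(self._clock)
        version = Version(f"{self.md_id}-{created}", created)
        self.versions[version.version_id] = version
        return version

    def commit(self, version_id):
        # Promote the preliminary version: readers see it from now on.
        self.versions[version_id].committed = True
        self.current = version_id

    def abort(self, version_id):
        # Discard a preliminary version together with the records written to it.
        if self.versions[version_id].committed:
            raise ValueError("cannot abort a committed version")
        del self.versions[version_id]

    def read(self):
        # Readers only ever see the records of the current version.
        if self.current is None:
            return []
        return list(self.versions[self.current].records)

    def expire(self, keep_last):
        # Garbage collection: keep the newest `keep_last` committed versions.
        committed = sorted(
            (v for v in self.versions.values() if v.committed),
            key=lambda v: v.created,
        )
        expired = committed[:-keep_last] if keep_last > 0 else committed
        for version in expired:
            if version.version_id != self.current:
                del self.versions[version.version_id]


store = MDStore("md-1")
v1 = store.new_version()
v1.records.append("record-a")
store.commit(v1.version_id)

v2 = store.new_version()
v2.records.append("record-b")
visible_during_write = store.read()   # still the records of v1
store.commit(v2.version_id)
visible_after_commit = store.read()   # now the records of v2
```

Until `commit` is called on the second version, `read` keeps returning the records of the first one, which is the transactional behaviour the paragraph describes.
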
The MdSM implements a lock mechanism that acts as a semaphore: the lock includes a counter that is incremented by one
every time a given MDStore is read and decremented by one when the read operation concludes. Locked MDStores
(those associated with a non-zero counter) cannot be deleted or garbage collected.

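A minimal sketch of such a read counter, assuming one counter per MDStore version. The `ReadLock` class and its method names are hypothetical and only illustrate the idea; they are not taken from the MdSM code base.

```python
import threading


class ReadLock:
    """Counting lock: an entry is locked while its read count is non-zero."""

    def __init__(self):
        self._counts = {}                # version_id -> number of active readers
        self._mutex = threading.Lock()   # protects the counters themselves

    def start_reading(self, version_id):
        # Increment the counter and return its new value.
        with self._mutex:
            self._counts[version_id] = self._counts.get(version_id, 0) + 1
            return self._counts[version_id]

    def end_reading(self, version_id):
        # Decrement the counter; refuse to go below zero.
        with self._mutex:
            count = self._counts.get(version_id, 0)
            if count == 0:
                raise ValueError(f"{version_id} is not being read")
            self._counts[version_id] = count - 1
            return count - 1

    def is_locked(self, version_id):
        with self._mutex:
            return self._counts.get(version_id, 0) > 0

    def deletable(self, version_ids):
        # Only versions nobody is reading may be deleted or garbage collected.
        with self._mutex:
            return [v for v in version_ids if self._counts.get(v, 0) == 0]


lock = ReadLock()
lock.start_reading("v1")
lock.start_reading("v1")
lock.end_reading("v1")                     # one reader of v1 is still active
candidates = lock.deletable(["v1", "v2"])  # v1 is still locked, v2 is not
```

A garbage collector built on this sketch would pass the expired version ids through `deletable` and remove only the ones it returns.
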
The MdSM implements the following operations:

* GET /mdstores/mdstore/{mdId}/newVersion : Create a new preliminary version of an MDStore, used to begin writing new records.
* GET /mdstores/version/{versionId}/commit/{size} : Promote a preliminary version to current, used when a process writing new records has finished.
* GET /mdstores/version/{versionId}/abort : Abort a preliminary version, used to discard records temporarily stored in a new MDStore version.
* GET /mdstores/mdstore/{mdId}/startReading : Increase the read count of the current version of an MDStore.
* GET /mdstores/version/{versionId}/endReading : Decrease the read count of an MDStore version.
* GET /mdstores/mdstore/{mdId}/versions : Return all the versions of an MDStore.
* DELETE /mdstoremanager/mdstores/mdstore/{mdId} : Delete an MDStore by id.
* DELETE /mdstores/versions/expired : Delete expired versions.
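
As an illustration, a client could wrap its writes and reads in the calls listed above. The Python sketch below rests on several assumptions that this document does not confirm: the base URL is a placeholder, `newVersion` and `startReading` are assumed to return a JSON object with an `id` field naming the version, and `{size}` is assumed to be the number of records written. The helper names (`write_records`, `reading`, `store_batch`) are hypothetical, and `http_get` is injectable so the flow can be exercised without a running service.

```python
import contextlib
import json
import urllib.request


def default_http_get(url):
    # Plain GET returning the response body as text.
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def write_records(base_url, md_id, records, store_batch, http_get=default_http_get):
    """Write `records` into a new version of `md_id`, committing on success.

    `store_batch(version, records)` is a caller-supplied function performing
    the actual write (e.g. to HDFS); it is outside the scope of this sketch.
    """
    # Assumption: the response is JSON and carries the version id under "id".
    version = json.loads(http_get(f"{base_url}/mdstores/mdstore/{md_id}/newVersion"))
    version_id = version["id"]
    try:
        store_batch(version, records)
    except Exception:
        # Discard the preliminary version; readers keep seeing the current one.
        http_get(f"{base_url}/mdstores/version/{version_id}/abort")
        raise
    # Assumption: {size} is the number of records in the new version.
    http_get(f"{base_url}/mdstores/version/{version_id}/commit/{len(records)}")
    return version_id


@contextlib.contextmanager
def reading(base_url, md_id, http_get=default_http_get):
    """Hold a read lock on the current version for the duration of the block."""
    # Assumption: startReading returns the locked version as JSON with an "id".
    version = json.loads(http_get(f"{base_url}/mdstores/mdstore/{md_id}/startReading"))
    try:
        yield version
    finally:
        # Always release the lock, even if the reader fails.
        http_get(f"{base_url}/mdstores/version/{version['id']}/endReading")
```

Pairing `startReading` with `endReading` in a `finally` clause keeps a failed reader from leaving the version locked, which would otherwise prevent it from being garbage collected.
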