Merge branch 'master' of code-repo.d4science.org:D-Net/dnet-applications
commit e4fea5585b
@@ -1,2 +1,32 @@
# dhp-mdstore-manager

The Metadata Store Manager (MdSM) is a key component of the OpenAIRE aggregation workflows.
It manages the collections of metadata records produced by the aggregation processes (MDStores) and keeps track of the
logical-to-physical mapping of each MDStore, i.e. the HDFS location where each set of records is stored.

Moreover, as the aggregation workflows are intrinsically asynchronous processes, the MdSM must ensure the consistency of
the records stored within each MDStore. Since HDFS is an append-only filesystem (i.e. existing files cannot be updated
in place), the MdSM introduces the concept of MDStore version: each MDStore consists of a set of versions, where each
version is a set of metadata records associated with the timestamp of its creation (so that versions can be ordered over time).
One of these versions is marked as "current" and is the one available for clients to read.
Writing a new batch of records to a given MDStore does not alter the content available to clients reading from the same
MDStore until the write operations are concluded and committed, thereby implementing a transaction (similar to
transactions in an RDBMS).
Only a predefined number of versions is kept. From time to time a garbage collection mechanism disposes of older MDStore
versions, preserving the last N, where N is a configuration parameter defined in the information system (or in the MdSM
itself, yet to be decided).
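
The version lifecycle described above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the MdSM implementation: the class names, the `<mdId>-<timestamp>` id scheme, and the `keep_last` parameter (standing in for N) are hypothetical, and no HDFS access is modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class MDStoreVersion:
    version_id: str
    created_at: int        # creation timestamp, used to order versions
    writing: bool = True   # True while the version is still preliminary
    size: int = 0          # number of records, set on commit


@dataclass
class MDStore:
    md_id: str
    versions: Dict[str, MDStoreVersion] = field(default_factory=dict)
    current: Optional[str] = None  # id of the version clients read from

    def new_version(self, now: int) -> MDStoreVersion:
        # Hypothetical id scheme: "<mdId>-<timestamp>".
        version = MDStoreVersion(f"{self.md_id}-{now}", now)
        self.versions[version.version_id] = version
        return version

    def commit(self, version_id: str, size: int) -> None:
        # Promote a preliminary version to "current"; readers switch only now.
        version = self.versions[version_id]
        version.writing = False
        version.size = size
        self.current = version_id

    def abort(self, version_id: str) -> None:
        # Discard a preliminary version; "current" is left untouched.
        if not self.versions[version_id].writing:
            raise ValueError("only preliminary versions can be aborted")
        del self.versions[version_id]

    def collect_garbage(self, keep_last: int) -> List[str]:
        # Keep the newest `keep_last` committed versions (and always "current").
        committed = sorted(
            (v for v in self.versions.values() if not v.writing),
            key=lambda v: v.created_at,
        )
        cutoff = max(len(committed) - keep_last, 0)
        expired = [
            v.version_id for v in committed[:cutoff] if v.version_id != self.current
        ]
        for version_id in expired:
            del self.versions[version_id]
        return expired
```

In this model a reader that resolves `current` while a new version is being written keeps seeing the previously committed records, which is the transactional behaviour the paragraph describes.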

The MdSM implements a lock mechanism that acts as a semaphore: the lock includes a counter that is incremented by one
every time a given MDStore is read, and decremented by one when the read operation is concluded. Locked MDStores
(those with a non-zero counter) cannot be deleted or garbage collected.
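
A minimal sketch of such a counter-based lock is shown below. It assumes the counter is kept per version id and guarded by an in-process mutex; the class and method names are hypothetical and do not reflect how the MdSM actually persists its read counts.

```python
import threading
from typing import Dict


class ReadLocks:
    """Per-version read counters acting as a semaphore (illustrative only)."""

    def __init__(self) -> None:
        self._counts: Dict[str, int] = {}
        self._mutex = threading.Lock()  # protects the counters from concurrent updates

    def start_reading(self, version_id: str) -> int:
        # Increment the counter by one when a read begins.
        with self._mutex:
            self._counts[version_id] = self._counts.get(version_id, 0) + 1
            return self._counts[version_id]

    def end_reading(self, version_id: str) -> int:
        # Decrement the counter by one when a read ends; never go below zero.
        with self._mutex:
            count = self._counts.get(version_id, 0)
            if count == 0:
                raise ValueError(f"{version_id} is not being read")
            self._counts[version_id] = count - 1
            return count - 1

    def can_delete(self, version_id: str) -> bool:
        # A version may be deleted or garbage collected only if nobody is reading it.
        with self._mutex:
            return self._counts.get(version_id, 0) == 0
```

A garbage collector built on this sketch would call `can_delete` for each expired version and skip those that are still locked.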

The MdSM implements the following operations:

* GET /mdstores/mdstore/{mdId}/newVersion : Create a new preliminary version of an MDStore, used to begin writing new records.
* GET /mdstores/version/{versionId}/commit/{size} : Promote a preliminary version to current, used when a process writing new records has finished.
* GET /mdstores/version/{versionId}/abort : Abort a preliminary version, used to discard records temporarily stored in a new MDStore version.
* GET /mdstores/mdstore/{mdId}/startReading : Increase the read count of the current version of an MDStore.
* GET /mdstores/version/{versionId}/endReading : Decrease the read count of an MDStore version.
* GET /mdstores/mdstore/{mdId}/versions : Return all the versions of an MDStore.
* DELETE /mdstoremanager/mdstores/mdstore/{mdId} : Delete an MDStore by id.
* DELETE /mdstores/versions/expired : Delete expired versions.
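
The operations listed above can be turned into concrete requests with a small helper. The sketch below only builds the HTTP method and URL for each operation and sends nothing: the operation keys, the `build_request` function, and the base URL are assumptions made for illustration, while the paths are copied as listed. Response formats are not described here, so they are not modelled.

```python
from typing import Dict, Tuple
from urllib.parse import quote

# Operation keys are hypothetical labels; methods and paths are taken from the list above.
ENDPOINTS: Dict[str, Tuple[str, str]] = {
    "newVersion": ("GET", "/mdstores/mdstore/{mdId}/newVersion"),
    "commit": ("GET", "/mdstores/version/{versionId}/commit/{size}"),
    "abort": ("GET", "/mdstores/version/{versionId}/abort"),
    "startReading": ("GET", "/mdstores/mdstore/{mdId}/startReading"),
    "endReading": ("GET", "/mdstores/version/{versionId}/endReading"),
    "versions": ("GET", "/mdstores/mdstore/{mdId}/versions"),
    "deleteMdstore": ("DELETE", "/mdstoremanager/mdstores/mdstore/{mdId}"),
    "deleteExpired": ("DELETE", "/mdstores/versions/expired"),
}


def build_request(base_url: str, operation: str, **params: object) -> Tuple[str, str]:
    """Return (HTTP method, absolute URL) for one MdSM operation."""
    method, template = ENDPOINTS[operation]
    # Percent-encode each path parameter so ids with special characters stay valid.
    encoded = {name: quote(str(value), safe="") for name, value in params.items()}
    return method, base_url.rstrip("/") + template.format(**encoded)


# Assumed base URL, for illustration only.
BASE_URL = "http://localhost:8080"

# A write is expected to issue newVersion first, then commit (or abort on failure).
write_requests = [
    build_request(BASE_URL, "newVersion", mdId="md-1"),
    build_request(BASE_URL, "commit", versionId="md-1-v2", size=1000),
]
```

A reader would follow the same pattern with `startReading` before accessing the records and `endReading` once finished, so that the version stays locked for the duration of the read.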