From dcaf1e16cf1ae64d6fc3ec67e8fd0889f529624d Mon Sep 17 00:00:00 2001
From: Claudio Atzori
Date: Wed, 3 Mar 2021 16:32:08 +0100
Subject: [PATCH] MDsM README.md

---
 apps/dhp-mdstore-manager/README.md | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/apps/dhp-mdstore-manager/README.md b/apps/dhp-mdstore-manager/README.md
index 2167cb75..3cee0fd0 100644
--- a/apps/dhp-mdstore-manager/README.md
+++ b/apps/dhp-mdstore-manager/README.md
@@ -1,2 +1,32 @@

# dhp-mdstore-manager

A key component in the OpenAIRE aggregation workflows is the Metadata Store Manager (MdSM).
It manages the set of metadata record collections resulting from the aggregation processes (MDStores), keeping track of the
logical-physical mapping of each MDStore, i.e. the HDFS location that stores each set of records.

Moreover, as the aggregation workflows are intrinsically asynchronous processes, it must ensure the consistency of the
records stored within each MDStore. Because HDFS is an append-only filesystem (i.e. it does not support updates on existing
files as batch operations), the MdSM introduces the concept of MDStore version: each version of an MDStore represents a set
of metadata records and is associated with the timestamp of its creation (so that versions can be ordered over time).
One of the versions is designated as the "current" one, available for clients to read.
Writing a new batch of records to a given MDStore doesn't alter the content available to clients reading from the same
MDStore until the write operations are concluded and committed, thereby implementing a transaction (similar to
transactions in an RDBMS).
Only a predefined number of versions is kept. From time to time a garbage collection mechanism disposes of older MDStore
versions, preserving the last N, where N is a configuration parameter defined in the information system (or in the MdSM
itself, yet to be decided).
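
The version lifecycle described above (open a preliminary version, commit or abort it, then garbage collect all but the last N) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not the MdSM implementation: the class names, the `"{md_id}-{timestamp}"` version id scheme, and the `keep_last` parameter are hypothetical, and the sketch ignores HDFS and read locks entirely.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Version:
    version_id: str
    created_at: int          # creation timestamp, used to order versions
    committed: bool = False  # a version is preliminary until committed
    size: int = 0            # number of records, supplied at commit time


@dataclass
class MDStore:
    md_id: str
    versions: Dict[str, Version] = field(default_factory=dict)
    current: Optional[str] = None  # id of the version clients read

    def new_version(self, now: int) -> Version:
        """Open a preliminary version; readers keep seeing `current`."""
        # Hypothetical id scheme, chosen only for this illustration.
        version = Version(version_id=f"{self.md_id}-{now}", created_at=now)
        self.versions[version.version_id] = version
        return version

    def commit(self, version_id: str, size: int) -> None:
        """Promote a preliminary version to current."""
        version = self.versions[version_id]
        version.committed = True
        version.size = size
        self.current = version_id

    def abort(self, version_id: str) -> None:
        """Discard a preliminary version together with its records."""
        if self.versions[version_id].committed:
            raise ValueError("cannot abort a committed version")
        del self.versions[version_id]

    def collect_garbage(self, keep_last: int) -> List[str]:
        """Remove committed versions older than the newest `keep_last`.

        The current version is never removed. Returns the removed ids.
        """
        if keep_last < 1:
            raise ValueError("keep_last must be at least 1")
        committed = sorted(
            (v for v in self.versions.values() if v.committed),
            key=lambda v: v.created_at,
        )
        expired = [
            v.version_id
            for v in committed[:-keep_last]
            if v.version_id != self.current
        ]
        for version_id in expired:
            del self.versions[version_id]
        return expired
```

In this model a writer calls `new_version`, writes its records, and then calls `commit`; until that point `current` still refers to the previously committed version, which is the transaction-like behaviour the text describes.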

The MdSM implements a lock mechanism that acts as a semaphore: the lock includes a counter that is incremented by one
every time a given MDStore is read, and decremented when the read operation is concluded. Locked MDStores
(associated with non-zero semaphores) cannot be deleted or garbage collected.

The MdSM implements the following operations:

* GET /mdstores/mdstore/{mdId}/newVersion : Create a new preliminary version of an MDStore, used to begin writing new records.
* GET /mdstores/version/{versionId}/commit/{size} : Promote a preliminary version to current, used when a process writing new records has finished.
* GET /mdstores/version/{versionId}/abort : Abort a preliminary version, used to discard records temporarily stored in a new MDStore version.
* GET /mdstores/mdstore/{mdId}/startReading : Increase the read count of the current MDStore version.
* GET /mdstores/version/{versionId}/endReading : Decrease the read count of an MDStore version.
* GET /mdstores/mdstore/{mdId}/versions : Return all the versions of an MDStore.
* DELETE /mdstoremanager/mdstores/mdstore/{mdId} : Delete an MDStore by id.
* DELETE /mdstores/versions/expired : Delete expired versions.
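
The read-count semantics behind `startReading` and `endReading` can be sketched as a per-version counter. The following is a minimal in-memory illustration, not the service's code: the `ReadLockRegistry` class, its method names, and the example ids are hypothetical. It assumes, as the endpoint paths suggest, that reading starts from an MDStore id (resolved to its current version) and ends on a version id.

```python
from typing import Dict


class ReadLockRegistry:
    """Hypothetical in-memory stand-in for the MdSM read counters."""

    def __init__(self, current_versions: Dict[str, str]) -> None:
        # Maps an MDStore id to the id of its current version.
        self._current = dict(current_versions)
        # Maps a version id to the number of readers still active on it.
        self._read_counts: Dict[str, int] = {}

    def start_reading(self, md_id: str) -> str:
        """Mirror of startReading: lock the current version, return its id."""
        version_id = self._current[md_id]
        self._read_counts[version_id] = self._read_counts.get(version_id, 0) + 1
        return version_id

    def end_reading(self, version_id: str) -> None:
        """Mirror of endReading: release one reader of the given version."""
        count = self._read_counts.get(version_id, 0)
        if count == 0:
            raise ValueError(f"version {version_id} is not being read")
        self._read_counts[version_id] = count - 1

    def is_locked(self, version_id: str) -> bool:
        """A version with a non-zero read count must not be deleted."""
        return self._read_counts.get(version_id, 0) > 0
```

A reader would keep the version id returned by `start_reading` and pass it to `end_reading` when done, so the counter is released on the version that was actually read even if a newer version has been committed in the meantime. A delete or garbage collection step would check `is_locked` first and skip any version that still has active readers.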