# Action Management Framework

This module implements the Oozie workflow for the integration of pre-built contents into the OpenAIRE Graph.

Such contents can be

* brand new, non-existing records to be introduced as nodes of the graph
* updates (or enrichments) for records that already exist in the graph (e.g. a new subject term for a publication)
* relations among existing nodes

The actionset contents are organised into logical containers. Each container can hold multiple versions of the contents and is characterised by

* a name
* an identifier
* the paths on HDFS where each version of the contents is stored

Each version is in turn characterised by

* the creation date
* the last update date
* an indication of whether it is the latest version or an expired one, candidate for garbage collection

## ActionSet serialization

Each actionset version contains records compliant with the graph internal data model, i.e. subclasses of `eu.dnetlib.dhp.schema.oaf.Oaf`, defined in the external schemas module

```
<dependency>
    <groupId>eu.dnetlib.dhp</groupId>
    <artifactId>${dhp-schemas.artifact}</artifactId>
    <version>${dhp-schemas.version}</version>
</dependency>
```

When the actionset contains a relationship, the model class to use is `eu.dnetlib.dhp.schema.oaf.Relation`. When it contains an entity, the model class is `eu.dnetlib.dhp.schema.oaf.OafEntity` or one of its subclasses `Datasource`, `Organization`, `Project`, `Result` (or one of the subclasses of `Result`: `Publication`, `Dataset`, etc.).

Then, each OpenAIRE Graph model class instance must be wrapped using the class `eu.dnetlib.dhp.schema.action.AtomicAction`, a generic container that defines two attributes

* `T payload` the OpenAIRE Graph class instance containing the data;
* `Class<T> clazz` the class whose instance is contained in the payload.
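The pairing of payload and class described above can be pictured with a small, dependency-free sketch. `Action` and `DemoRelation` below are hypothetical stand-ins written for illustration only; they are not the real `AtomicAction` or `Relation` classes, whose actual API may differ.

```java
import java.util.Objects;

// Illustration only: a hypothetical stand-in for the generic wrapper
// described above, NOT eu.dnetlib.dhp.schema.action.AtomicAction itself.
public class AtomicActionSketch {

    /** Generic container pairing a payload with the class of that payload. */
    static final class Action<T> {
        private final Class<T> clazz;
        private final T payload;

        Action(Class<T> clazz, T payload) {
            this.clazz = Objects.requireNonNull(clazz, "clazz");
            this.payload = Objects.requireNonNull(payload, "payload");
        }

        Class<T> getClazz() {
            return clazz;
        }

        T getPayload() {
            return payload;
        }
    }

    /** Hypothetical placeholder for a graph model class such as Relation. */
    static final class DemoRelation {
        final String source;
        final String target;

        DemoRelation(String source, String target) {
            this.source = source;
            this.target = target;
        }
    }

    /** Wraps a model instance together with its class, as the text requires. */
    static Action<DemoRelation> wrap(DemoRelation relation) {
        return new Action<>(DemoRelation.class, relation);
    }

    public static void main(String[] args) {
        Action<DemoRelation> aa = wrap(new DemoRelation("id-a", "id-b"));
        // The class travels with the payload, so a consumer can tell
        // which model type the payload holds.
        System.out.println(aa.getClazz().getCanonicalName());
        System.out.println(aa.getPayload().source + " -> " + aa.getPayload().target);
    }
}
```

Keeping the class next to the payload means the type information survives once the payload has been turned into plain JSON.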

Each AtomicAction can then be serialised in JSON format using `com.fasterxml.jackson.databind.ObjectMapper` from

```
<dependency>
    <groupId>com.fasterxml.jackson.core</groupId>
    <artifactId>jackson-databind</artifactId>
    <version>${dhp.jackson.version}</version>
</dependency>
```

Then, the JSON serialization must be stored as a GZip compressed sequence file (`org.apache.hadoop.mapred.SequenceFileOutputFormat`). As such, it contains a set of key/value tuples, where both the key and the value are of type `org.apache.hadoop.io.Text` and

* the `key` must be set to the canonical name of the class contained in the `AtomicAction`;
* the `value` must be set to the AtomicAction JSON serialization.

The following snippet provides an example of how to create an actionset version of Relation records:

```
// rels, OBJECT_MAPPER (a Jackson ObjectMapper) and outputPath are defined elsewhere
rels // JavaRDD<Relation>
    // wrap each Relation together with its class
    .map(relation -> new AtomicAction<>(Relation.class, relation))
    // key: canonical class name, value: JSON serialization of the AtomicAction
    .mapToPair(
        aa -> new Tuple2<>(new Text(aa.getClazz().getCanonicalName()),
            new Text(OBJECT_MAPPER.writeValueAsString(aa))))
    // store as a GZip compressed sequence file
    .saveAsHadoopFile(outputPath, Text.class, Text.class, SequenceFileOutputFormat.class, GzipCodec.class);
```
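
The key/value contract can also be illustrated without Spark or Hadoop on the classpath. The sketch below is a dependency-free stand-in under stated assumptions: `toRecord` and `DemoRelation` are hypothetical names, plain `String`s take the place of `org.apache.hadoop.io.Text`, and the JSON value is passed in already serialised rather than produced by `ObjectMapper`. It also guards against one pitfall of `Class.getCanonicalName()`, which returns `null` for anonymous and local classes.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Illustration only: plain Strings stand in for org.apache.hadoop.io.Text,
// and the JSON value is assumed to have been produced elsewhere.
public class ActionSetRecordSketch {

    /** Hypothetical placeholder for a graph model class such as Relation. */
    static final class DemoRelation {
    }

    /**
     * Builds one (key, value) record:
     * key = canonical name of the wrapped class, value = JSON serialization.
     */
    static Map.Entry<String, String> toRecord(Class<?> clazz, String json) {
        String key = clazz.getCanonicalName();
        if (key == null) {
            // Anonymous and local classes have no canonical name,
            // so they cannot provide a usable key.
            throw new IllegalArgumentException("class has no canonical name: " + clazz.getName());
        }
        return new SimpleImmutableEntry<>(key, json);
    }

    public static void main(String[] args) {
        // "{}" is a placeholder value, not the real AtomicAction JSON.
        Map.Entry<String, String> record = toRecord(DemoRelation.class, "{}");
        System.out.println(record.getKey() + "\t" + record.getValue());
    }
}
```

In a real job the same pairing is produced by the `mapToPair` step shown earlier, with both elements wrapped in `Text`.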