SmartExecutor



A new RESTful version of the Smart Executor (version 2.0) is available. The wiki will be updated ASAP.

The SmartExecutor service executes "gCube Tasks" and lets clients monitor their execution status. Each instance of the SmartExecutor service can run the "gCube Tasks" provided by the plugins available on that instance.

Each instance of the SmartExecutor service publishes descriptive information about the co-deployed plugins.

Clients may interact with the SmartExecutor service through a library (SmartExecutor Client) that provides high-level facilities to simplify the discovery of the plugins available on those instances. Each client can request the execution of a "gCube Task" or get information about the state of its execution.

The Service

The SmartExecutor service executes tasks by means of co-deployed plugins. The service lets the caller pass input parameters to the plugin requested to run.

A scheduled plugin is executed every time its scheduling parameters are matched. The way to schedule the plugin execution is indicated by the scheduling parameter (see Launch Inputs). Passing a null scheduling means a one-shot execution: the plugin starts immediately and runs just once. Moreover, a new SmartExecutor instance can take over a plugin when the node on which it was previously allocated crashes or is overloaded.

In the following we will use this notation:

  • run and die plugin: the plugin is launched just once, and its execution is not repeated;
  • scheduled plugin: the plugin repeats its execution over time according to a delay interval or a more complex cron expression.

Details are given in the sections below.

Plugins

At startup, the SmartExecutor discovers the available plugins (using java.util.ServiceLoader). For each discovered plugin, the SmartExecutor gathers some mandatory information (e.g. name, version) and asks the plugin to discover the capabilities of the machine. Clients can use the discovered capabilities to select one instance rather than another (e.g. available Python version, available R version, a certain hardware capability). The machine capabilities are plugin-dependent, so it is the responsibility of the plugin to discover them.
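As a rough illustration of the discovery step, the sketch below uses java.util.ServiceLoader with a hypothetical DemoPluginDeclaration interface standing in for the real declaration type. The interface and class names are assumptions made for this example only; the service's actual startup code is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical stand-in for the real plugin declaration interface.
interface DemoPluginDeclaration {
    String getName();
    String getVersion();
}

final class PluginDiscoverySketch {

    // Collects "name:version" for every provider registered under
    // META-INF/services/<fully qualified interface name> on the classpath.
    static List<String> discover() {
        List<String> found = new ArrayList<>();
        for (DemoPluginDeclaration declaration : ServiceLoader.load(DemoPluginDeclaration.class)) {
            found.add(declaration.getName() + ":" + declaration.getVersion());
        }
        return found;
    }
}
```

When no provider is registered for the interface, the loop body never runs and the returned list is empty.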

The SmartExecutor publishes the mandatory information and the discovered capabilities on the IS (Information System) using a ServiceEndpoint with Category: VREManagement and Name: SmartExecutor.

The SmartExecutor Client searches for these service endpoints to retrieve the available SmartExecutor instances running a certain plugin.

Client

The SmartExecutor service provides a client library to simplify the following procedures:

  • discover service instances that can execute the target task. This requires interaction with the Information System.
  • launch the execution of the task on one of the discovered instances.
  • monitor the execution of the running task.

In order to use the SmartExecutor client library, you have to declare this dependency in your pom.xml file

<dependency>
	<groupId>org.gcube.vremanagement</groupId>
	<artifactId>smart-executor-client</artifactId>
</dependency>

You can now instantiate the SmartExecutorProxy object, which lets you perform the operations listed above:

// instantiate the proxy (you need the plugin name)
SmartExecutorProxy proxy = ExecutorPlugin.getExecutorProxy(YourSmartExecutorPlugin.NAME).build();
 
// launch the plugin passing the inputs it needs
Map<String, Object> inputs = new HashMap<String, Object>();
 
...
 
// set the parameters (give the inputs, the scheduling options and so on)
LaunchParameter parameter = new LaunchParameter(YourSmartExecutorPlugin.NAME, inputs);
parameter.setScheduling(...);
 
// launch (for example one shot)
String uuidPluginLaunched = proxy.launch(parameter);
 
// after a while you can check the status of the launch, for example
PluginState state = proxy.getState(uuidPluginLaunched);
 
switch(state){
 case CREATED : ....;
 case DONE : ....;
 case FAILED : ....;
 case ... : ....;
 default : ....;
}

Instance Discovery

SmartExecutor Client provides some facilities to get the SmartExecutor service instances.

It retrieves one of the running SmartExecutor services, filtering them in one of the following ways:

  • By plugin name only (that's the case in the example shown above);
  • By plugin name AND one or more of the following filters:
    • KEY-VALUE list. Each KEY-VALUE must be found on the Service Endpoints.
    • XQuery conditions to filter the Service Endpoints. The conditions are added to the standard query performed.
    • List of endpoints to be selected among the available ones.

In the latter cases you can use the following available functions:

public static ProxyBuilder<SmartExecutorProxy> getExecutorProxy(String pluginName, Tuple<String, String> ... tuples);
 
public static ProxyBuilder<SmartExecutorProxy> getExecutorProxy(String pluginName, 
			                                        Tuple<String, String>[] tuples, 
			                                        ServiceEndpointQueryFilter serviceEndpointQueryFilter,
			                                        EndpointDiscoveryFilter endpointDiscoveryFilter);

Using the SmartExecutor Client, you obtain a proxy for remote invocation. With this proxy, calling a remote method is as easy as calling a local one.

Managing Tasks

One of the APIs exposed by the SmartExecutor service is used to launch a plugin execution instance. Each time a task launch is requested, the service (after some checks) creates a separate thread, instantiates the related plugin, and invokes the execution, passing the input parameters. The service returns immediately after the task is launched (the task is executed asynchronously). The string representation of the ID (a UUID) is returned to the client. That ID can be used to monitor the task execution.
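The launch-and-return-immediately behaviour can be pictured with plain JDK concurrency classes. The following is only a sketch under that assumption: AsyncLaunchSketch and its methods are hypothetical names, and the real service may organise its threads differently.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class AsyncLaunchSketch {
    private final ExecutorService pool = Executors.newCachedThreadPool();
    private final Map<String, Future<?>> launched = new ConcurrentHashMap<>();

    // Starts the task on another thread and returns its ID without waiting for it.
    String launch(Runnable task) {
        String id = UUID.randomUUID().toString();
        launched.put(id, pool.submit(task));
        return id;
    }

    // True once the task identified by id has finished, normally or not.
    boolean isFinished(String id) {
        Future<?> future = launched.get(id);
        if (future == null) {
            throw new IllegalArgumentException("Unknown id: " + id);
        }
        return future.isDone();
    }

    // Stops accepting tasks and waits up to the given time for running ones to end.
    boolean shutdownAndWait(long millis) {
        pool.shutdown();
        try {
            return pool.awaitTermination(millis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

The caller keeps only the returned string and uses it for later queries, which mirrors how the ID returned by the launch is used for monitoring.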

Launch Inputs

The invocation parameters are:

  • name of the plugin to be instantiated and run (MANDATORY)
  • inputs to pass to plugin instance execution (MANDATORY)
  • capabilities the plugin has to satisfy to correctly run an execution (e.g. the machine has a certain version of python interpreter) (OPTIONAL)
  • scheduling strategy to launch the execution (e.g. every day at 22:00) (OPTIONAL)
  • persist: whether the scheduling has to be persisted. If the SmartExecutor instance terminates, another SmartExecutor must take charge of that scheduling. Because a scheduling can survive over time, it is strongly recommended to declare the needed capabilities.

Please note that capability matches are checked before launching the plugin execution. This is a service-side check and is conceptually different from filtering instances at discovery time using the client.
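A capability check of this kind can be sketched as a comparison between two key-value maps. The page does not say how the service compares values, so exact string equality is an assumption here, and CapabilityCheckSketch is a hypothetical name.

```java
import java.util.Map;
import java.util.Objects;

final class CapabilityCheckSketch {

    // True when every required key is offered with exactly the required value.
    // Exact equality is an assumption; the real matching rule may differ.
    static boolean satisfies(Map<String, String> offered, Map<String, String> required) {
        if (required == null || required.isEmpty()) {
            return true; // nothing was requested, so any instance qualifies
        }
        for (Map.Entry<String, String> entry : required.entrySet()) {
            if (!offered.containsKey(entry.getKey())
                    || !Objects.equals(offered.get(entry.getKey()), entry.getValue())) {
                return false;
            }
        }
        return true;
    }
}
```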

As stated earlier, plugins can be launched and executed just once (run and die), or scheduled. The scheduling parameter can be set in different ways; some examples are reported below:

// Specifying a simple delay parameter (minimum time between two executions) and the number of times the plugin should execute
Scheduling scheduling = new Scheduling(1000 * 60, 50); // perform 50 iterations  with 1 minute delay between two of them
                                                       // If you do not specify the number of iterations, the plugin 
                                                       // will be executed once
 
// Using a CronExpression
CronExpression cronExpression = new CronExpression("0 */10 * * * ?"); // every 10 minutes starting from now
Scheduling scheduling = new Scheduling(cronExpression, true); // the true parameter specifies that before starting a new execution, 
                                                              // the previous one must be terminated 
                                                              // (otherwise the new one will be deferred)

Finally, set the schedule parameter you defined above by using a LaunchParameter object passed to the launch() method.

LaunchParameter parameter = new LaunchParameter(SimpleExamplePlugin.NAME, inputs);
parameter.setScheduling(scheduling);
 
// launch with the above parameters
String uuidLaunchedPlugin = proxy.launch(parameter);
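To get a feel for delay-based scheduling, where the delay is the minimum time between two executions and runs do not overlap, the sketch below uses the JDK's ScheduledExecutorService.scheduleWithFixedDelay. It does not use the Scheduling class and only approximates the semantics described above; FixedDelaySketch is a hypothetical name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class FixedDelaySketch {

    // Runs the task `iterations` times on a single thread, waiting delayMillis
    // after each run ends before starting the next; returns the number of runs.
    static int runRepeatedly(Runnable task, long delayMillis, int iterations) {
        if (delayMillis <= 0 || iterations <= 0) {
            throw new IllegalArgumentException("delay and iterations must be positive");
        }
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);
        scheduler.scheduleWithFixedDelay(() -> {
            if (completed.get() >= iterations) {
                return; // guard against a run that starts just before shutdown
            }
            try {
                task.run();
            } catch (RuntimeException e) {
                // swallowed so one failing run does not cancel the following ones
            }
            if (completed.incrementAndGet() == iterations) {
                done.countDown();
            }
        }, 0, delayMillis, TimeUnit.MILLISECONDS);
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            scheduler.shutdownNow();
        }
        return completed.get();
    }
}
```

Because a single scheduler thread is used, a run cannot start before the previous one has ended, which is close in spirit to requiring that the previous execution be terminated.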

Monitor Task Execution

The ID returned by the launch invocation can be used to monitor an execution.

The possible states of a task are:

  • CREATED : The Task is created but not yet running.
  • RUNNING : The Task is running.
  • STOPPED : The Task has been stopped.
  • DONE : The Task terminated successfully.
  • FAILED : The Task failed the execution.
  • DISCARDED : The Task has been discarded by the scheduler. This happens only for repetitive or recurrent tasks, and only when the launch parameters require that the previous execution be completed before a new one starts.

There are two APIs to monitor a task execution:

  • The first one to retrieve the current state of the last execution.
  • The second one to retrieve the current state of the nth execution (execution number).

The latter is used to monitor scheduled tasks. The execution number identifies the nth run launched according to the scheduling strategy.

To monitor run and die tasks, the first API should be used. Alternatively, you can use the second API, providing the value 1 as the execution number.

Requesting the state of an execution that has not been performed raises an exception. The method invocations for the two cases are the following:

// retrieve the state about the last execution
PluginState state = proxy.getState(uuidPluginLaunched);
 
// retrieve the state about the n-th execution
PluginState stateN = proxy.getState(uuidPluginLaunched, executionNumber);
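A client usually wants to poll until the task reaches a final state. The sketch below shows one way to do that against any state source. TaskState is a local mirror of the states listed above, not the library's PluginState type, and treating STOPPED, DONE, FAILED and DISCARDED as final is an assumption of this example.

```java
import java.util.EnumSet;
import java.util.Set;
import java.util.function.Supplier;

final class StatePollingSketch {

    // Local mirror of the documented states; not the library's own enum.
    enum TaskState { CREATED, RUNNING, STOPPED, DONE, FAILED, DISCARDED }

    // States assumed to be final for the purpose of this sketch.
    static final Set<TaskState> FINAL_STATES =
            EnumSet.of(TaskState.STOPPED, TaskState.DONE, TaskState.FAILED, TaskState.DISCARDED);

    // Polls until a final state is seen or maxAttempts is exhausted,
    // and returns the last state observed (null if maxAttempts is not positive).
    static TaskState waitForEnd(Supplier<TaskState> source, int maxAttempts, long pauseMillis) {
        TaskState last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            last = source.get();
            if (FINAL_STATES.contains(last)) {
                return last;
            }
            try {
                Thread.sleep(pauseMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return last;
            }
        }
        return last;
    }
}
```

In real client code the supplier would wrap a call such as proxy.getState(uuidPluginLaunched) and map the result to the local enum.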

Unschedule a Task Execution

There is an API to stop a running execution. When an execution is stopped, its final state is set to STOPPED. If the task has already finished, this method has no effect. The task to be stopped is identified by the ID returned by the launch method.

This method stops the current execution, if any. If the plugin is scheduled to be repeated, the method stops only the current execution. If this method is used on a run and die task, the effect is to stop the current execution, if any.

// stop (or at least try to stop) the execution of the plugin given its uuid
boolean stopped = proxy.stop(uuidPluginLaunched);

There is also an advanced mechanism that takes into account the fact that the plugin schedule can be of type repeat and/or that another SmartExecutor node must take care of a plugin that was running on a node that went down or is overloaded.

// Unschedule the current execution and if the task was of type repeat, no other iteration will occur.
// If the task was set to be Global, no other iteration will be launched on the current node, but another
// SmartExecutor node will take care of it.
boolean unscheduled = proxy.unSchedule(uuidPluginLaunched);
 
// unschedule globally (stop current iteration and the following ones on any node)
boolean unscheduledGlobally = proxy.unSchedule(uuidPluginLaunched, true);

Plugin Development

Each plugin must declare:

  • Name
  • Version
  • Description

Moreover, the plugin can discover the machine capabilities it wants to publish, so that clients can filter instances, by providing them as a key-value map.

The mandatory information and the discovered capabilities are published in the ServiceEndpoint by the SmartExecutor service.
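As an illustration of what such a key-value map might contain, the sketch below collects a few facts about the local machine using only JDK calls. The keys and the choice of facts are assumptions made for this example; a real plugin would publish whatever its clients need to filter on.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CapabilityDiscoverySketch {

    // Builds a key-value map that a declaration could return as its capabilities.
    // The keys used here are illustrative, not a convention of the service.
    static Map<String, String> discover() {
        Map<String, String> capabilities = new LinkedHashMap<>();
        capabilities.put("java.version", System.getProperty("java.version"));
        capabilities.put("os.name", System.getProperty("os.name"));
        capabilities.put("processors",
                String.valueOf(Runtime.getRuntime().availableProcessors()));
        return capabilities;
    }
}
```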

SmartExecutor plugins may have arbitrary size and dependencies.

Each plugin must implement at least two classes:

  • The first one has to implement the org.gcube.vremanagement.executor.plugin.PluginDeclaration interface. This class is responsible for providing the mandatory information and discovering the capabilities. It is used by the SmartExecutor service just once, at application startup, to populate the ServiceEndpoint.
  • The second one has to extend the org.gcube.vremanagement.executor.plugin.Plugin<? extends PluginDeclaration> class. This class must implement two methods. The first one is responsible for launching the plugin execution with the provided input parameters. The second one is used to perform termination actions when an execution is stopped due to a client request.

Maven Dependencies

The Maven dependencies you need to develop a SmartExecutorPlugin are the following:

<dependency>
	<groupId>org.gcube.vremanagement</groupId>
	<artifactId>smart-executor-api</artifactId>
	<version>[1.0.0-SNAPSHOT, 2.0.0-SNAPSHOT)</version>
</dependency>
<dependency>
	<groupId>org.slf4j</groupId>
	<artifactId>slf4j-api</artifactId>
	<scope>provided</scope>
</dependency>

We suggest using a logging mechanism rather than writing your debug messages to the console, so please add the slf4j library to your pom too. Of course, you can put whatever dependency you need inside the pom.xml, but you need to set its scope to compile. In fact, the final .jar package will contain all the needed classes.

Simple Plugin Example

As stated earlier, your project needs at least two classes. The first class must implement the PluginDeclaration interface. Its structure can be as simple as this:

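package org.gcube.seplugin;
 
import java.util.HashMap;
import java.util.Map;
 
import org.gcube.vremanagement.executor.plugin.Plugin;
import org.gcube.vremanagement.executor.plugin.PluginDeclaration;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
 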
/**
 * Your first amazing SE plugin.
 * @author ...
 */
public class SimpleExamplePluginDeclaration implements PluginDeclaration {
 
	/**
	 * Logger
	 */
	private static Logger logger = LoggerFactory.getLogger(SimpleExamplePluginDeclaration.class);
 
	/**
	 * Plugin name used by the Executor to retrieve this class
	 */
	public static final String NAME = "simple-plugin-example-name";
	public static final String DESCRIPTION = "Simple plugin example description";
	public static final String VERSION = "1.0.0";
 
	/**{@inheritDoc}*/
	@Override
	public void init() {
		logger.debug(String.format("%s initialized", SimpleExamplePluginDeclaration.class.getSimpleName()));
	}
 
	/**{@inheritDoc}*/
	@Override
	public String getName() {
		return NAME;
	}
 
	/**{@inheritDoc}*/
	@Override
	public String getDescription() {
		return DESCRIPTION;
	}
 
	/**{@inheritDoc}*/
	@Override
	public String getVersion() {
		return VERSION;
	}
 
	/**{@inheritDoc}*/
	@Override
	public Map<String, String> getSupportedCapabilities() {
		Map<String, String> discoveredCapabilities = new HashMap<String, String>();
		// No capabilities to discover
		return discoveredCapabilities;
	}
 
	/**{@inheritDoc}*/
	@Override
	public Class<? extends Plugin<? extends PluginDeclaration>> getPluginImplementation() {
		return SimpleExamplePlugin.class;
	}
 
	@Override
	public String toString(){
		return String.format("%s : %s - %s - %s - %s - %s", 
				this.getClass().getSimpleName(), 
				getName(), getVersion(), getDescription(), 
				getSupportedCapabilities(), 
				getPluginImplementation().getSimpleName());
	}
}


Now, the real body of the plugin must be written inside the SimpleExamplePlugin class, which extends the Plugin<SimpleExamplePluginDeclaration> class.

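package org.gcube.seplugin;
 
import java.util.Map;
 
import org.gcube.vremanagement.executor.plugin.Plugin;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
 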
public class SimpleExamplePlugin extends Plugin<SimpleExamplePluginDeclaration>{
 
	/**
	 * Logger
	 */
	private static Logger logger = LoggerFactory.getLogger(SimpleExamplePlugin.class);
 
	public SimpleExamplePlugin(SimpleExamplePluginDeclaration pluginDeclaration){
 
		super(pluginDeclaration);
		logger.debug("Constructor");
 
	}
 
	/**{@inheritDoc}*/
	@Override
	public void launch(Map<String, Object> inputs){
 
		// Here you put the code you want to execute
	}
 
 
	/**{@inheritDoc}*/
	@Override
	protected void onStop() throws Exception {
 
		// code called on stop() or unSchedule()
		logger.debug("onStop()");
		Thread.currentThread().interrupt();
 
	}
}
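The launch body is where the inputs map is read and the real work happens. The sketch below shows one possible shape for such a body: it reads a hypothetical "steps" input and stops early when the current thread has been interrupted. Whether a stop request actually reaches the working thread as an interrupt depends on the service, so this is an assumption, and InterruptibleWorkSketch is not part of the plugin API.

```java
import java.util.Map;

final class InterruptibleWorkSketch {

    // Performs up to the number of steps given under the hypothetical "steps" key
    // (default 1), stopping early if the current thread has been interrupted.
    // Returns the number of steps completed.
    static int run(Map<String, Object> inputs) {
        Object raw = inputs.get("steps");
        int steps = (raw instanceof Number) ? ((Number) raw).intValue() : 1;
        int completed = 0;
        while (completed < steps && !Thread.currentThread().isInterrupted()) {
            completed++; // one unit of real work would go here
        }
        return completed;
    }
}
```

Checking the interrupt flag between units of work keeps a long-running task responsive to termination requests without leaving a unit half done.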


In order to be discoverable by the ServiceLoader, your plugin project must have, under the src/main/resources directory:

  • the subdirectory META-INF/services;
  • inside that subdirectory, a file named org.gcube.vremanagement.executor.plugin.PluginDeclaration;
  • inside this file, the fully qualified name of the plugin declaration class (in the previous example it would be org.gcube.seplugin.SimpleExamplePluginDeclaration).
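Before deploying, it can help to verify that the provider file really ends up on the classpath. The sketch below reads the provider file for a given service name through a class loader and returns the class names it lists. ProviderFileSketch is a hypothetical helper, not something the SmartExecutor provides, and it reads only the first matching resource.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class ProviderFileSketch {

    // Returns the provider class names listed in META-INF/services/<serviceName>
    // as seen by the given class loader; empty if the file is not found.
    static List<String> listedProviders(ClassLoader loader, String serviceName) {
        List<String> names = new ArrayList<>();
        try (InputStream in = loader.getResourceAsStream("META-INF/services/" + serviceName)) {
            if (in == null) {
                return names;
            }
            BufferedReader reader =
                    new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
            String line;
            while ((line = reader.readLine()) != null) {
                int hash = line.indexOf('#'); // '#' starts a comment in provider files
                if (hash >= 0) {
                    line = line.substring(0, hash);
                }
                line = line.trim();
                if (!line.isEmpty()) {
                    names.add(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

Called from a test in the plugin project with the name org.gcube.vremanagement.executor.plugin.PluginDeclaration, the list would be expected to contain the declaration class if the file is in place.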

Now you can create the .jar package of your SimpleExamplePlugin, deploy it on a SmartExecutor node under the path tomcat/webapps/smart-executor/WEB-INF/lib/, and launch it as shown in the Client section.

A simple plugin project is available at the following url https://code-repo.d4science.org/gCubeSystem/hello-world-se-plugin

Check it out to better understand a plugin skeleton.