Workflow


Design


Introduction

caBIG aims to bring together disparate data and analytic resources into a "World Wide Web of cancer research." This will be achieved through common standards and software frameworks for the federation of these resources into "grid" services. Many of the tasks in the collection and analysis of cancer-related data on the grid involve the use of workflow. Here, we define workflow as the connecting of services to solve a problem that each individual service could not solve. caGrid implements workflow by providing a grid service for submitting and running workflows that are composed of other grid services. This chapter describes the architecture and APIs for interacting with caGrid workflow.

Workflow Architecture


Figure 1 Overview of the architecture of the caGrid Workflow component.

The Workflow component leverages the same infrastructure stack as the caGrid toolkit (GT4, Tomcat, Java, Ant, and Introduce) with the addition of the ActiveBPEL workflow engine. The WorkflowFactoryService is a standard Introduce-built grid service that allows a workflow to be created from a BPEL workflow document. An EPR is returned to a WorkflowManagementService resource that can be used to start, stop, pause, resume, cancel, and destroy the created workflow. The WorkflowManagementService is layered on top of the ActiveBPEL workflow engine, which provides the primary functionality for running the BPEL-defined workflow. See Figure 1 for an overview of this architecture.

The following actions are performed when a user invokes start on the workflow management service:

  • The input BPEL document is parsed and an exception is thrown if it is not well-formed (with respect to schema compliance).
  • The input arguments to the workflow are declared as an array of xsd:any. They are parsed and cast to the types that they are meant to be.
  • The service implementation invokes PDDGenerator to generate the deployment descriptor for the workflow.
  • The service implementation invokes BPRGenerator to generate the BPelArchive (called bpr hereafter) that is ready to be deployed.
  • The workflow management service is bootstrapped with the location of the ActiveBPEL admin service.
  • The service implementation invokes the deployBpr operation on the admin service, which is a vanilla Axis-based Web Service, to deploy the generated workflow.
  • The admin service responds with a deployment summary that reports success if the workflow was deployed successfully.
  • Once deployed, the workflow is exposed as a web service inside ActiveBPEL.
  • To start the workflow, a message is sent to the receiving partnerLink in the workflow.
  • After the workflow executes successfully, the results are returned to the client application or user by a call to getWorkflowOutput.

WorkflowFactoryService API


Workflows are created using the WorkflowFactoryService, which is a grid service that follows the resource pattern. The returned object holds an EPR to a WorkflowManagementService, which can be used to manipulate the created workflow.

Public WorkflowFactoryOutputType createWorkflow(WorkflowDescriptionType wmsInputType) throws WorkflowException

Description:
This method creates a workflow resource from the BPEL document found in wmsInputType and returns an EPR of the created resource to the client. The BPEL resource, along with the most recent state, is persisted in a MySQL database and is recovered in the event of a container crash.

WorkflowDescriptionType:
This is the input to createWorkflow. It consists of a workflowName, a String bpelDoc, an array of wsdlReferences, and an initial termination time for the workflow. If the termination time is not specified, the service defaults to 24 hours. Termination of the workflow invalidates the WorkflowManagementService EPR and stops any running workflow.

<xsd:complexType name="WorkflowDescriptionType">
    <xsd:sequence>
        <xsd:element name="workflowName" type="xsd:string" minOccurs="1" maxOccurs="1" />
        <xsd:element name="bpelDoc" type="xsd:string" maxOccurs="1" />
        <xsd:element name="wsdlReferences" type="tns:WSDLReferences" maxOccurs="unbounded" />
        <xsd:element name="InitialTerminationTime" type="xsd:dateTime"/>
    </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="WSDLReferences">
    <xsd:sequence>
        <xsd:element name="wsdlNamespace" type="xsd:anyURI"/>
        <xsd:element name="wsdlLocation" type="xsd:string"/>
        <xsd:element name="serviceUrl" type="xsd:anyURI"/>
    </xsd:sequence>
</xsd:complexType>
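
The 24-hour default for the termination time can be illustrated with a minimal sketch. The class and method names below are hypothetical and are not part of the caGrid API; an unset InitialTerminationTime is modeled here as null, which is an assumption about how the value reaches the service.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; not part of the caGrid API. */
final class TerminationTimeSketch {
    /** Default lifetime stated in the text: 24 hours. */
    static final Duration DEFAULT_LIFETIME = Duration.ofHours(24);

    /**
     * Returns the requested termination time, or now + 24 hours when the
     * client did not supply one (modeled here as null, an assumption).
     */
    static Instant resolveTerminationTime(Instant requested, Instant now) {
        return requested != null ? requested : now.plus(DEFAULT_LIFETIME);
    }
}
```

Passing the current time as a parameter, rather than reading the clock inside the method, keeps the default easy to check in a unit test.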

WorkflowFactoryOutputType:
This is the output of the createWorkflow method. An EPR is constructed by the factory and returned to the client. At this point the workflow document is deployed in the workflow engine and is also stored in a database, but it has not started. The EPR points to an instance of the WorkflowManagementService, which should be used to start the workflow.

<xsd:complexType name="WorkflowFactoryOutputType">
    <xsd:annotation>
      <xsd:documentation>This type represents the output from a workflow</xsd:documentation>
    </xsd:annotation>
    <xsd:sequence>
      <xsd:element name="WorkflowEPR" type="wsa:EndpointReferenceType" />
    </xsd:sequence>
  </xsd:complexType>

Faults:

  • UnableToDeployWorkflowFault: This fault is thrown if the workflow is unable to be deployed (e.g. the BPEL document submitted fails pre-deployment validation).
  • InvalidBPELFault extends UnableToDeployWorkflowFault: This fault is thrown if the BPEL document submitted fails pre-deployment validation (e.g. not valid XML).

Factory ResourceProperties:
We intend to provide aggregate resource properties on the factory service in our next iteration. Examples include:
Total number of workflows
ListOfWorkflowsSubmitted

WorkflowManagementService API


This service is used to manage the workflow resources created by the WorkflowFactoryService. The service provides asynchronous execution of deployed workflows. The following are the operations the service provides in addition to the standard WS-RF operations such as destroy(), setTerminationTime(), etc.

Workflow Service State Diagram

Public WorkflowStatusType start(StartInputType input) throws WorkflowException, StartCalledOnStartedWorkflowFault

Description:
This operation is used to start the workflow deployed using the factory with a set of input parameters. The input parameters are modeled as an array of xsd:any elements. The operation returns the resulting status of the workflow as a WorkflowStatusType.

<xsd:complexType name="StartInputType">
    <xsd:sequence>
      <xsd:element name="inputArgs" type="tns:WorkflowInputType" maxOccurs="1" />
    </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="WorkflowInputType">
    <xsd:sequence>
      <xsd:any maxOccurs="1" />
    </xsd:sequence>
  </xsd:complexType>

Faults:

  • StartCalledOnStartedWorkflowFault: This is thrown if start() is called on a workflow that has already been started and has not yet reached one of the terminal states (i.e. Done, Failed, Cancelled).
  • WorkflowException: Every other fault results in the service throwing this exception with a message describing what went wrong.

Public WorkflowStatusType getStatus( ) throws WorkflowException

Description:
This operation is used to query for the status of the deployed workflow. WorkflowStatusType includes a fault.

<xsd:simpleType name="WorkflowStatusType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Pending" />
      <xsd:enumeration value="Active" />
      <xsd:enumeration value="Done" />
      <xsd:enumeration value="Failed" />
      <xsd:enumeration value="Cancelled" />
    </xsd:restriction>
  </xsd:simpleType>

Public WorkflowStatusType pause() throws CannotPauseFault

Description:
This operation pauses the workflow until resume() or cancel() is invoked. This operation translates to invoking an equivalent operation provided in the ActiveBPEL Admin interface. When the pause operation is invoked, ActiveBPEL stops the execution of the workflow document, which means that there will be no further service invocations or other activities. However, this does not affect invocations already in progress when pause() is invoked. This operation returns the new state of the workflow resource (which should always be Paused).

Public WorkflowStatusType resume() throws CannotResumeFault

Description:
This operation resumes a paused workflow. It translates to invoking an equivalent operation provided in the ActiveBPEL Admin interface.

Public WorkflowOutputType getWorkflowOutput() throws WorkflowException

Description:
This operation is used to get the final output of a completed workflow. It will return a fault if the workflow is not yet completed. If ActiveBPEL allows for intermediate access of results, then this operation can potentially return the last result that the workflow engine has for this workflow. The output is modeled as an array of xsd:any elements.

<xsd:complexType name="WorkflowOutputType">
    <xsd:sequence>
      <xsd:any maxOccurs="1" />
    </xsd:sequence>
  </xsd:complexType>
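
Because getWorkflowOutput() faults until the workflow has completed, a client would typically poll getStatus() first. The following is a minimal sketch of that pattern. The ManagementClient interface is a hypothetical stand-in for a WorkflowManagementService client stub (its methods return plain strings for simplicity), and the polling interval and retry limit are assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Illustrative sketch; names are hypothetical, not the caGrid client API. */
final class WorkflowPollingSketch {
    /** Hypothetical stand-in for a WorkflowManagementService client stub. */
    interface ManagementClient {
        String getStatus();
        String getWorkflowOutput();
    }

    /** Terminal states named in the text. */
    private static final Set<String> TERMINAL =
            new HashSet<>(Arrays.asList("Done", "Failed", "Cancelled"));

    /** Polls until the workflow is Done, then fetches its output. */
    static String awaitOutput(ManagementClient client, long pollMillis, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            String status = client.getStatus();
            if ("Done".equals(status)) {
                return client.getWorkflowOutput();
            }
            if (TERMINAL.contains(status)) {
                throw new IllegalStateException("Workflow ended with status " + status);
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new IllegalStateException("Interrupted while polling", e);
            }
        }
        throw new IllegalStateException("Workflow not finished after " + maxPolls + " polls");
    }
}
```

Subscribing to the WorkflowStatusRP resource property for notifications would avoid polling altogether; the loop above is only the simplest client-side option.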

Public void cancel() throws WorkflowException

Description:
This operation terminates a workflow. It translates to invoking an equivalent operation provided in the ActiveBPEL Admin interface.

Resource Properties:

WorkflowStatusRP:

The status of a workflow is exposed as a Resource Property so clients can subscribe to it and get notified when a state change happens. WorkflowStatusType is modeled as an Enum of Strings with the following valid values:

  • Pending (Created but Start has not been called)
  • Active
  • Paused
  • Done
  • Failed
  • Cancelled

The status also includes the latest fault a workflow execution throws.

<xsd:simpleType name="WorkflowStatusType">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="Pending" />
      <xsd:enumeration value="Active" />
      <xsd:enumeration value="Done" />
      <xsd:enumeration value="Failed" />
      <xsd:enumeration value="Cancelled" />
      <xsd:enumeration value="Paused" />
    </xsd:restriction>
  </xsd:simpleType>

WorkflowStartTimeRP:

This property denotes the time when the start() operation is called on the resource.

WorkflowEndTimeRP:

This property denotes the time when the workflow status is set to Done, Failed, or Cancelled.

Public void destroy()

Description:
This is a standard WS-RF operation, but it is mentioned here to clarify its semantics for a Workflow resource. If called, this method deletes the Workflow resource from the database along with its intermediate results and notification subscriptions. This operation is called by the GT4 framework when the lifetime of a resource expires. The lifetime is set in the initial create() call in the factory. Internally, destroy() removes all the database entries for a particular workflow resource, all the subscriptions for notifications, and other temporary resources both in memory and on disk.

Security in WorkflowFactory and Context Services


Two deployment patterns for Workflow are needed with regard to security. In the first, the factory and the context service run with grid security (using transport-level security and caGrid authorization) and require a client to present grid credentials to submit and run workflows. Once a workflow resource is created by a user, programmatic GridMap authorization is used to limit access to the resource to its creator. Delegation of credentials is performed using the delegation service of the Globus Toolkit. This deployment is used to orchestrate workflows that require secure access to any services involved in the workflow. The second deployment has no security and is used to orchestrate workflows between unsecured grid services.

Service Selection


A custom invoke handler is written for ActiveBPEL that queries a pre-configured GT4 index service to get the list of services. The query is based on the input and output types of the service invocation. Once a list of service handles is obtained from the index service, the dynamic endpoint for the service invocation is replaced by the first endpoint in the list.
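
The selection rule itself is simple enough to sketch. In the snippet below, IndexLookup is a hypothetical stand-in for the index service query and does not reflect the GT4 or ActiveBPEL APIs. Keeping the original endpoint when no match is found is an assumption, since the text does not say what happens in that case.

```java
import java.util.List;

/** Illustrative sketch; names are hypothetical, not GT4 or ActiveBPEL APIs. */
final class ServiceSelectionSketch {
    /** Hypothetical stand-in for a query against the index service. */
    interface IndexLookup {
        List<String> findEndpoints(String inputType, String outputType);
    }

    /**
     * Returns the first endpoint the index reports for the given types.
     * Falling back to the original endpoint when nothing matches is an
     * assumption; the text does not define that case.
     */
    static String selectEndpoint(IndexLookup index, String inputType,
                                 String outputType, String originalEndpoint) {
        List<String> candidates = index.findEndpoints(inputType, outputType);
        if (candidates == null || candidates.isEmpty()) {
            return originalEndpoint;
        }
        return candidates.get(0);
    }
}
```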

Provenance Tracking


This is out of scope for this component during this release. No provenance tracking is exposed via the workflow component.

WS-RF Resources in Workflows


Executing a BPEL workflow can involve changing the state of a WS-RF resource. Additional support needs to be added to ActiveBPEL to pass WS-A headers that contain EPRs and other relevant information to caGrid services.

Workflow Management Interface design and Implementation


Background: ActiveBPEL provides a web application for finding more detailed status of the submitted workflows. However, exposing this management interface to end users is not advisable, as it provides access to more powerful operations such as stopping the Workflow Engine or stopping the execution of other workflows. In this implementation we are trying to provide the same level of status information in the Workflow Submission GUI without the user having to go to the ActiveBPEL Admin webapp.

Goal: We want the Workflow Service to expose detailed status so it is possible to know which portion of the workflow is being executed without the need for the ActiveBPEL web admin interface.

Our Approach: ActiveBPEL provides an option for executing BPEL processes to log their current state to a file. A file is created for each execution of a BPEL process under $USER_HOME/AeBPELEngine/process-logs/*.log. We can write a Java class that tails the log file and provides more detailed status information to the user. This approach does not require modifying any of the ActiveBPEL code and hence stays well within the GPL boundary. It also means adding a new getDetailedStatus() method to the WorkflowServiceImpl. The return type for this method is an array of WorkflowStatusEventType with the following structure:

Timestamp: xsd:dateTime
State: Enum of {Executing, Completed, Failed}
Current Operation: An XPath expression into the BPEL process

The format of the log file for a process with id 2 is as follows:

[2][2007-05-10 16:38:01.022] : Executing [/process]
[2][2007-05-10 16:38:01.022] : Executing [/process/sequence]
[2][2007-05-10 16:38:01.032] : Executing [/process/sequence/receive]
[2][2007-05-10 16:38:01.052] : Completed normally [/process/sequence/receive]
[2][2007-05-10 16:38:01.052] : Executing [/process/sequence/invoke]
[2][2007-05-10 16:38:01.173] : Completed normally [/process/sequence/invoke]
[2][2007-05-10 16:38:01.173] : Executing [/process/sequence/reply]
[2][2007-05-10 16:38:01.233] : Completed normally [/process/sequence/reply]
[2][2007-05-10 16:38:01.233] : Completed normally [/process/sequence]
[2][2007-05-10 16:38:01.233] : Completed normally [/process]
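
As a minimal sketch of how one such log line could be turned into a status event, the following Java class parses the format shown above with a regular expression. The class, its fields, and the UNRECOGNIZED state are hypothetical. The sample log does not show what a failure entry looks like, so no mapping to Failed is attempted here.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch; not the actual WorkflowStatusEventType. */
final class ProcessLogSketch {
    /** Failure lines are not shown in the sample, so they are not mapped. */
    enum State { EXECUTING, COMPLETED, UNRECOGNIZED }

    static final class Event {
        final int processId;
        final LocalDateTime timestamp;
        final State state;
        final String path; // XPath-like location in the BPEL process

        Event(int processId, LocalDateTime timestamp, State state, String path) {
            this.processId = processId;
            this.timestamp = timestamp;
            this.state = state;
            this.path = path;
        }
    }

    // Matches: [pid][yyyy-MM-dd HH:mm:ss.SSS] : message [path]
    private static final Pattern LINE =
            Pattern.compile("^\\[(\\d+)\\]\\[([^\\]]+)\\] : (.+?) \\[(.+)\\]$");
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

    /** Parses one log line; returns null if it does not match the format. */
    static Event parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        String message = m.group(3);
        State state = message.startsWith("Executing") ? State.EXECUTING
                : message.startsWith("Completed") ? State.COMPLETED
                : State.UNRECOGNIZED;
        return new Event(Integer.parseInt(m.group(1)),
                LocalDateTime.parse(m.group(2), STAMP), state, m.group(4));
    }
}
```

A getDetailedStatus() implementation could apply parse to each new line read from the log file and collect the non-null results into the returned array.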

Alternate Approach: ActiveBPEL admin interface provides an operation called addProcessListener that would let us pass a callback object that gets called every time any event happens in a BPEL process. The events contain the following information:

  • @param aPID The process ID of the event.
  • @param aPath The path of the object triggering the event.
  • @param aEventID The event id of the event.
  • @param aFault The associated Fault, or empty.
  • @param aInfo Extra info to register with the event.
  • @param aTimestamp The event timestamp

We need to investigate this approach to make sure we don't raise any GPL concerns by implementing the ProcessListener interface provided by ActiveBPEL. A Process or Engine Event listener is a web service that exposes an operation the engine can invoke to report events. The endpointURL you pass is the endpoint reference for the service that the engine will call. The context ID is a key you pass when you register the listener, which the engine returns when it calls your service. You can use it for correlation, etc. This seems to be a rather heavyweight approach to getting detailed status.

Conclusion: After some investigation and prototyping, we found the first approach to be more practical than the second. It will be implemented and distributed with the caGrid 1.1 release.

Workflow Submission GUI


Design Document Artifact for Workflow Submission GUI

Introduction: The Workflow Submission GUI allows users to submit and monitor BPEL workflows. It also allows users to discover and add services to be used in the workflow. Users can also get the output of the workflow once it has finished executing.

In the following paragraphs we try to outline some of the capabilities of the GUI.

Submitting a Workflow: The user can enter the path to the BPEL document inside the BPEL File text field or the user can browse to the location of the BPEL document.

The workflow name should be the name of the BPEL document. Once both text fields contain valid values, the user presses the Submit button. This submits the workflow to a pre-configured Workflow Factory Service. After this step the BPEL document is validated and submitted to the Workflow Engine. If there are no errors, the Status changes from Pending to Submitted.

Execution of the Workflow: The next step is to execute the submitted workflow with some input. In the current implementation the user has to paste the input XML into the text area and press the Start button. If the workflow starts without problems, the Status changes to Active.

Querying for Status: Press the Get Status button. If the workflow's status differs from the one shown in the left-hand corner of the GUI, the display is updated to the latest status.

Getting Workflow Output: Press the Get Status button. If the status of the workflow is Done, the workflow output is displayed in the output text area.

Future Work: Future work includes tying this GUI to a BPEL authoring GUI, adding partnerLinks (services to be used in the workflow) to the workflow and adding support for getting more detailed status.

Last edited by
Sarah Honacki (1038 days ago)