caBIG aims to bring together disparate data and analytic resources into a "World Wide Web of cancer research." This will be achieved through common standards and software frameworks for the federation of these resources into "grid" services. Many of the tasks in the collection and analysis of cancer-related data on the grid involve the use of workflow. Here, we define workflow as the connecting of services to solve a problem that no individual service could solve on its own. caGrid implements workflow by providing a grid service for submitting and running workflows that are composed of other grid services.
The Business Process Execution Language (BPEL) is an XML language for describing business process behavior based on web/grid services. BPEL is layered on top of other Web technologies such as WSDL 1.1, XML Schema 1.0, XPath 1.0, and WS-Addressing, which makes it a good fit for caGrid. The BPEL notation includes flow control, variables, concurrent execution, input and output, transaction scoping/compensation, and error handling. A BPEL process describes a business process, which often invokes Web/Grid services to perform functional tasks. A process can be either abstract or executable. Abstract processes are similar to library APIs: they describe what the process can do in terms of inputs and outputs, but they do not describe how the work actually gets done. Abstract processes are useful for describing a business process to another party that wants to implement the process. Executable processes do the "heavy lifting": they contain all of the execution steps that represent a cohesive unit of work. This document focuses on executable processes, as they are concrete workflows that can be run through the workflow service.
Some vocabulary must be established to understand a BPEL document. While a typical domain user such as an oncologist is not expected to write a BPEL document, developers are expected to be able to produce BPEL from higher-level tools. In BPEL, a process consists of activities connected by links. A process sometimes contains only one top-level activity, but that activity is usually a container for more activities. The path taken through the activities and their links is determined by many things, including the values of variables and the evaluation of expressions. The starting points are called start activities, and their createInstance attribute is set to "yes". When a start activity is triggered, a new business process instance is created. Each service that is invoked by the workflow is represented by a PartnerLink, and BPEL extends this concept to include the client that is invoking the workflow.
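To make this vocabulary concrete, the sketch below embeds a minimal BPEL process in a Python string and inspects it with the standard library. It assumes the BPEL4WS 1.1 namespace; the process name, partner link names, port types, operations, and message types are all hypothetical and are not taken from the caGrid sample workflows.

```python
import xml.etree.ElementTree as ET

# BPEL4WS 1.1 namespace (assumption: adjust if your engine expects another BPEL version).
BPEL_NS = "http://schemas.xmlsoap.org/ws/2003/03/business-process/"

# Hypothetical process: all names below are illustrative placeholders.
BPEL_DOC = f"""
<process name="ExampleWorkflow"
         targetNamespace="http://example.org/workflow"
         xmlns="{BPEL_NS}"
         xmlns:tns="http://example.org/workflow">
  <partnerLinks>
    <!-- The client that invokes the workflow is itself a partner link. -->
    <partnerLink name="client" partnerLinkType="tns:ClientLT" myRole="workflowProvider"/>
    <!-- One partner link per service the workflow invokes. -->
    <partnerLink name="testService" partnerLinkType="tns:ServiceLT" partnerRole="serviceProvider"/>
  </partnerLinks>
  <variables>
    <variable name="request" messageType="tns:RequestMessage"/>
    <variable name="response" messageType="tns:ResponseMessage"/>
  </variables>
  <!-- A single top-level activity that acts as a container for the others. -->
  <sequence>
    <receive partnerLink="client" portType="tns:WorkflowPT" operation="start"
             variable="request" createInstance="yes"/>
    <invoke partnerLink="testService" portType="tns:ServicePT" operation="run"
            inputVariable="request" outputVariable="response"/>
    <reply partnerLink="client" portType="tns:WorkflowPT" operation="start"
           variable="response"/>
  </sequence>
</process>
"""


def partner_links(root):
    """Return the names of all partner links declared in the process."""
    return [pl.get("name") for pl in root.iter(f"{{{BPEL_NS}}}partnerLink")]


def start_activities(root):
    """Return the local tag names of activities marked createInstance="yes"."""
    return [el.tag.split("}")[1] for el in root.iter()
            if el.get("createInstance") == "yes"]


root = ET.fromstring(BPEL_DOC.strip())
print(partner_links(root))     # names of the declared partner links
print(start_activities(root))  # activities that create a new process instance
```

Here the receive activity is the start activity: an incoming message from the client creates a new process instance, the sequence then invokes the partner service, and the reply returns the result to the client.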
- Get the endpoints of the services you want the workflow to consist of. These endpoints can be obtained from a query to the Index Service based on the ServiceMetaData, though that must be done prior to creating the workflow.
- Define PartnerLinks for the services you want to interact with
- Create a BPEL document using a GUI if available
- Submit the BPEL document to the WorkflowFactoryService using the Workflow GUI client
- The command-line client submits the workflow and starts it using the input document specified by the user.
- Download the services from http://gforge.nci.nih.gov/frs/download.php/2223/workflow-services.zip
- Unzip the services into a directory
- cd workflow-services/TestService1
- Make sure $CATALINA_HOME is set and points to a working Tomcat installation with caGrid 1.1.
- Run ant deployTomcat
- This will deploy TestService1 in Tomcat. Do not install the second service yet.
- Restart Tomcat and check that the service is up by opening: http://<hostname>:<port>/wsrf/services/cagrid/WorkflowTestService1?wsdl
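The final check can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names wsdl_url and service_is_up are hypothetical, and treating any HTTP 200 response that contains the word "definitions" as a served WSDL is a simplifying assumption.

```python
from urllib.error import URLError
from urllib.request import urlopen

# Service path taken from the deployment step above.
SERVICE_PATH = "/wsrf/services/cagrid/WorkflowTestService1"


def wsdl_url(host, port):
    """Build the ?wsdl URL for the deployed test service."""
    return f"http://{host}:{port}{SERVICE_PATH}?wsdl"


def service_is_up(host, port, timeout=5.0):
    """Return True if the container answers the WSDL request.

    Assumption: a 200 response whose first bytes mention "definitions"
    (the WSDL root element) means the service deployed correctly.
    """
    try:
        with urlopen(wsdl_url(host, port), timeout=timeout) as resp:
            return resp.status == 200 and b"definitions" in resp.read(4096)
    except (URLError, OSError):
        # Connection refused, timeout, HTTP error, and similar failures.
        return False


print(wsdl_url("localhost", 8080))
```

After restarting Tomcat, calling service_is_up("localhost", 8080) with your own host and port gives a quick yes/no answer before you move on to the GUI.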
Prerequisites: A Tomcat container with caGrid installed. A Tomcat container with ActiveBPEL installed. Note that both are already in place if you installed the Workflow component using the caGrid installer.
The Workflow Submission GUI allows users to submit and monitor BPEL workflows. It also allows users to discover and add services to be used in a workflow, and to retrieve the output of a workflow once it has finished executing.
The following paragraphs outline some of the capabilities of the GUI.
Browse to the WorkflowFactoryService directory under the caGrid source distribution (caGrid/projects/workflow/WorkflowFactoryService/). From that directory, run ant ui to launch the Workflow GUI. Once the GUI is launched, click the Window -> Preferences menu option. Browse to the WorkflowFactoryService(s) endpoint option and make sure it points to the Workflow service that needs to be validated. Use the Add/Remove buttons to add or remove Workflow service endpoints. Use the Move up/Decrease keys to move an endpoint up or down in the list.
Click the Submit Workflow button to open the submission window. The user can either type the path to the BPEL document into the BPEL File text field or browse to the location of the BPEL document. Select the Test1.bpel workflow document, which is under the WorkflowFactoryService folder.
The workflow name should be the name of the BPEL document (Test1 in this case). Once both text fields contain valid values, press the Add Partner Links button.
Pressing this button opens the PartnerLinkFrame GUI, which is used to enter the endpoints of the services taking part in the workflow. In this example, one service is invoked twice in the workflow. Enter the appropriate values in the fields of the PartnerLinkFrame GUI. For this test the values are as follows:
Select Type: Static
Service Endpoint: http://localhost:8080/wsrf/services/cagrid/WorkflowTestService1
WSDL Location : http://localhost:8080/wsrf/share/schema/WorkflowTestService1/WorkflowTestService1.wsdl
Replace localhost:8080 with the host:port on which the test service is deployed (see the first part of this document). Click Add.
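Because both URLs differ only in host and port, they can be derived rather than typed by hand. The sketch below shows one way to do that; the PartnerLinkEntry class and the partner_link_for helper are hypothetical conveniences and are not part of the caGrid client API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartnerLinkEntry:
    """Values to type into the PartnerLinkFrame GUI (hypothetical helper type)."""
    link_type: str
    service_endpoint: str
    wsdl_location: str


def partner_link_for(host="localhost", port=8080, service="WorkflowTestService1"):
    """Derive the endpoint and WSDL location from the deployment host and port.

    The URL layout mirrors the example values above; other services may
    publish their WSDL under a different path.
    """
    base = f"http://{host}:{port}/wsrf"
    return PartnerLinkEntry(
        link_type="Static",
        service_endpoint=f"{base}/services/cagrid/{service}",
        wsdl_location=f"{base}/share/schema/{service}/{service}.wsdl",
    )


entry = partner_link_for()
print(entry.service_endpoint)
print(entry.wsdl_location)
```

Calling partner_link_for("myhost.example.org", 18080) would yield the same two values for a non-default deployment, which helps avoid mismatched host or port between the endpoint and the WSDL location.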
Now you are ready to submit the workflow. Click Submit. This submits the workflow to a pre-configured Workflow Factory Service. The BPEL document is then validated and submitted to the Workflow Engine. If there are no errors, the Status changes from Pending to Submitted.
The next step is to execute the submitted workflow with some input.
In the current implementation the user has to paste the input XML into the text area and press the Start button. If the workflow starts without problems, the Status changes to Active.
Copy the following XML blob into the input XML Text area:
Click Start. This starts the workflow and changes the status from Submitted to Active.
Press the Get Status button to refresh the status. If the workflow's status differs from the one currently shown in the left-hand corner of the GUI, the display is updated to the latest status.
Once Get Status reports that the status of the workflow is Done, the workflow output is displayed in the output text area.
[Figure: Output when Status is Done]
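Pressing Get Status repeatedly amounts to polling. The sketch below illustrates that loop in Python, assuming the status names used in this walkthrough (Submitted, Active, Done). The get_status callable is a hypothetical stand-in for whatever client call returns the current status; it is not a caGrid API.

```python
import time


def wait_for_done(get_status, interval=2.0, max_polls=30, sleep=time.sleep):
    """Poll until the workflow reports Done; return the distinct statuses seen.

    get_status is a caller-supplied function (hypothetical) that returns the
    current status string. Failure statuses are not handled here because
    their names depend on the engine.
    """
    seen = []
    for _ in range(max_polls):
        status = get_status()
        # Record a status only when it changes, as the GUI does.
        if not seen or seen[-1] != status:
            seen.append(status)
        if status == "Done":
            return seen
        sleep(interval)
    raise TimeoutError(f"workflow not Done after {max_polls} polls; last status: {seen[-1]}")


# Simulated status source so the sketch runs without a live service.
fake_statuses = iter(["Submitted", "Active", "Active", "Done"])
history = wait_for_done(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(history)
```

Passing sleep as a parameter keeps the loop testable without real delays; against a live service you would leave the default and choose an interval suited to the expected run time of the workflow.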
If you submit a long workflow, you may want to see which portion of the workflow is currently being executed. Press the Get Details button to see where the workflow execution currently stands. The GUI displays the XPath expression of each node in the workflow document along with its status.
Since workflows are modeled as WS-RF resources, they have a lifetime associated with them. The Workflow service provides a standard "destroy" operation to stop the workflow and free up all the resources that it uses.
The command-line client provides an operation by which users can pause an active workflow. This command will result in the service invoking the "pause" operation of the workflow management service using the workflow id.
The command-line client provides an operation by which users can resume a paused workflow. This command will result in the service invoking the "resume" operation of the workflow management service using the workflow id.
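The start, pause, resume, and destroy operations described above can be summarized as a small transition table. The sketch below is only an illustration: the status labels "Paused" and "Destroyed" are assumptions (the text names the operations, not the resulting statuses), as is the rule that destroy is accepted from any state.

```python
# (current status, operation) -> next status
TRANSITIONS = {
    ("Submitted", "start"): "Active",
    ("Active", "pause"): "Paused",    # "Paused" is an assumed status label
    ("Paused", "resume"): "Active",
}


def apply_operation(status, operation):
    """Return the status that results from applying an operation.

    Raises ValueError for combinations that the text does not describe,
    such as pausing a workflow that is not active.
    """
    if operation == "destroy":
        # Assumption: destroy stops the workflow regardless of its state.
        return "Destroyed"
    try:
        return TRANSITIONS[(status, operation)]
    except KeyError:
        raise ValueError(f"cannot {operation} a workflow in status {status}") from None


status = "Submitted"
for op in ("start", "pause", "resume", "destroy"):
    status = apply_operation(status, op)
    print(op, "->", status)
```

A table like this makes it easy to see that pause applies only to an active workflow and resume only to a paused one.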