caBIG aims to bring together disparate data and analytic resources into a "World Wide Web of cancer research." This is achieved through common standards and software frameworks that federate these resources as "grid" services. Key to realizing the benefits of grid computing is the ability to integrate basic services into higher-level applications. Workflow languages permit such aggregation of services. With such languages, a higher-level application can be modeled as a graph in which the nodes represent tasks and the edges represent inter-task dependencies, data flow, or flow control. Tasks may be performed by basic services. Many of the tasks in the collection and analysis of cancer-related data on the grid involve the use of workflows. Here, we define a workflow as the connection of services to solve a problem that no individual service could solve on its own. caGrid implements workflow by providing a grid service for submitting and running workflows that are composed of other grid services.
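The graph model described above can be illustrated with a minimal Python sketch. It is not caGrid or Taverna code: the `run_workflow` helper, the task names, and the toy sequence tasks are all hypothetical, and plain Python functions stand in for grid services. Each node is a task, each edge names an upstream task whose output feeds the downstream task, and the tasks run in dependency order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_workflow(tasks, dependencies, inputs):
    """Run tasks in dependency order, feeding upstream results downstream.

    tasks:        maps a task name to a callable (a stand-in for a service).
    dependencies: maps a task name to the ordered list of nodes it consumes.
    inputs:       maps workflow input names to their initial values.
    """
    results = dict(inputs)
    for name in TopologicalSorter(dependencies).static_order():
        if name in results:
            continue  # workflow inputs are already available
        upstream = [results[dep] for dep in dependencies.get(name, ())]
        results[name] = tasks[name](*upstream)
    return results


# Hypothetical tasks; in a real workflow each would be a remote service call.
tasks = {
    "normalize": lambda raw: raw.strip().upper(),
    "gc_count": lambda seq: sum(seq.count(base) for base in "GC"),
    "length": lambda seq: len(seq),
    "gc_fraction": lambda gc, n: gc / n,
}

# Edges of the workflow graph: each task lists the nodes it depends on.
dependencies = {
    "normalize": ["raw_sequence"],
    "gc_count": ["normalize"],
    "length": ["normalize"],
    "gc_fraction": ["gc_count", "length"],
}

results = run_workflow(tasks, dependencies, {"raw_sequence": " gattaca\n"})
print(results["gc_fraction"])
```

No single task here computes the GC fraction of a raw sequence; the result only emerges from the composition, which is the sense of "workflow" used in this section.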
The Workflow component leverages the same infrastructure stack as the caGrid toolkit (GT4, Tomcat, Java, Ant, and Introduce), with the addition of the Taverna workflow engine. This workflow engine is wrapped and exposed as a WS-Resource, which allows users to integrate other WSRF-based grid services hosted within and outside caGrid into their analyses.
Taverna, a part of the myGrid project, is an application that helps users who are not necessarily experts in web services and programming to build and execute workflows. It provides access to a range of services with programmatic interfaces, primarily the molecular biology tools and databases available on the web, especially as web services. It allows bioinformaticians to construct workflows or pipelines of services to perform a range of different analyses, such as sequence analysis and genome annotation. These high-level workflows can integrate many different resources into a single analysis.
caGrid Workflow 1.4 supports the features made available in the Taverna 2.1.2 release.
Taverna 2.1.2 contains a plug-in that enables semantic search for caGrid services described by the caGrid Index Service. Once the services are found, users can easily add them as components to their workflows.
Taverna 2.1.2 also contains two additional caGrid plug-ins: the caGrid plug-in, for invoking caGrid services; and the caGrid remote execution plug-in, for remote execution of caGrid workflows on a server.