Developing Python Clients
The best way to access web services from Python is to use the ZSI library. The ZSI distribution includes a code generator that reads WSDL and produces a Python API; after that initial step, ZSI serves as the SOAP library at run time. Unfortunately, ZSI does not currently work "out of the box" with caGrid services. Some modifications to the ZSI code are necessary, and they are described here in detail.
Assumptions
This tutorial walks through accessing the caCORE 3.2 Grid Service from a Python client. It is a typical caGrid data service, so it should serve as a reasonably general example of Python access to caGrid.
We assume the following software versions:
Installation
Download the ZSI source distribution from SourceForge and install it. Do not use the .egg file, because we will need to make some modifications to the ZSI code base. Also note that this tutorial contains workarounds specific to ZSI version 2.1_a1. Future versions of ZSI may require updates to these instructions.
Once you have the tarball, install as follows:
$ tar xvfz ZSI-2.1-a1.tar.gz
$ cd ZSI-2.1-a1
$ python setup.py install
This will usually install ZSI in the /usr/lib/python2.5/site-packages/ZSI directory.
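Because the changes described later edit the installed files directly, it helps to confirm where ZSI actually landed on your system. The following is a minimal sketch and not part of ZSI: `package_dir` is a hypothetical helper that imports a package by name and reports the directory it was loaded from, or None if the import fails.

```python
import os


def package_dir(name):
    """Return the directory a top-level package was loaded from, or None."""
    try:
        module = __import__(name)
    except ImportError:
        return None
    path = getattr(module, "__file__", None)  # built-in modules have no __file__
    if path is None:
        return None
    return os.path.dirname(os.path.abspath(path))


if __name__ == "__main__":
    # Shows the ZSI install directory, or None if ZSI cannot be imported.
    print(package_dir("ZSI"))
```

If this prints None, the install step above did not put ZSI on the interpreter's module path.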
As mentioned previously, ZSI requires some minor code changes to work with caGrid. You can choose to apply all the patches at once, or see the ZSI Code Changes section below for a step-by-step walkthrough. These patches are temporary stop-gap measures, and should not be relied on for production code.
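Since these are stop-gap edits to installed library files, it is prudent to keep a pristine copy of each file before touching it. A minimal sketch follows; the helper name `backup_once` and the `.orig` suffix are illustrative choices, not anything ZSI provides.

```python
import os
import shutil


def backup_once(path, suffix=".orig"):
    """Copy path to path + suffix unless that backup already exists.

    Returns the backup path. An existing backup is never overwritten, so
    running this again after patching still preserves the original file.
    """
    backup = path + suffix
    if not os.path.exists(backup):
        shutil.copy2(path, backup)  # copy2 also preserves timestamps
    return backup
```

To undo a change, copy the `.orig` file back over the patched one.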
API Generation
We can use wsdl2py to generate the Python code for the domain and service object. In general, caGrid services will need two special options enabled: lazy typecode evaluation (-l) and complex types (-b).
wsdl2py -lb http://cabiogrid32.nci.nih.gov:80/wsrf/services/cagrid/CaBIO32GridSvc?wsdl
This should generate three modules:
- CaBIO32GridSvc_client.py
- CaBIO32GridSvc_server.py
- CaBIO32GridSvc_types.py
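A quick way to confirm that generation succeeded is to check for these three files. The sketch below assumes the `_client.py`, `_server.py`, `_types.py` naming seen above applies to the service name you pass in; `missing_generated_modules` is a hypothetical helper, not a ZSI function.

```python
import os


def missing_generated_modules(service_name, directory="."):
    """List the expected wsdl2py output files that are absent from directory."""
    suffixes = ("_client.py", "_server.py", "_types.py")
    expected = [service_name + suffix for suffix in suffixes]
    return [name for name in expected
            if not os.path.isfile(os.path.join(directory, name))]
```

An empty list from `missing_generated_modules("CaBIO32GridSvc")` means all three modules are present in the current directory.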
API Usage
As a simple example, we will search for caBIO Genes that have a symbol starting with "Brca".
from CaBIO32GridSvc_client import *
c = CaBIO32GridSvcServiceLocator().getCaBIO32GridSvcPortTypePort()
# Create Attribute restriction
attr = ns18.Attribute_Def(None).pyclass()
attr._attrs = { 'name': 'symbol',
                'value': 'Brca%',
                'predicate': 'LIKE', }
# Create Target (Gene)
target = ns18.Object_Def(None).pyclass()
target._attrs = {'name':'gov.nih.nci.cabio.domain.Gene'}
target.Attribute = attr
# Create CQLQuery
cq = ns18.CQLQuery_Dec().pyclass()
cq.Target = target
# Create QueryRequest
qacq = ns3.QueryRequest_Dec.cqlQuery_Dec().pyclass()
qacq.CQLQuery = cq
qr = QueryRequest()
qr.CqlQuery = qacq
# Execute query
ret = c.query(qr)
# Print results
for o in ret.CQLQueryResultCollection.ObjectResult:
    print o.Any.get_attribute_fullName()
This program should output:
Breast cancer 1
Breast cancer 2
To print out SOAP messages for debugging purposes, one can do the following:
import sys
c = CaBIO32GridSvcServiceLocator().getCaBIO32GridSvcPortTypePort(tracefile=sys.stdout)
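To keep the trace on disk while still watching it on the console, the tracefile can be a small object that fans writes out to several targets. This is a sketch under the assumption, not verified here, that ZSI only needs a `write()` method on the tracefile object; the `Tee` class and the log file name are hypothetical.

```python
import sys


class Tee(object):
    """Write-only file-like object that forwards every write to several targets."""

    def __init__(self, *targets):
        self.targets = targets

    def write(self, data):
        for target in self.targets:
            target.write(data)

    def flush(self):
        for target in self.targets:
            target.flush()


# Hypothetical usage with the generated client (needs ZSI installed):
#   log = open("soap_trace.log", "w")
#   c = CaBIO32GridSvcServiceLocator().getCaBIO32GridSvcPortTypePort(
#       tracefile=Tee(sys.stdout, log))
```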
ZSI Code Changes
ComplexType with Annotation
Running wsdl2py may initially produce the following exception:
IndexError: list index out of range
This appears to be a bug in ZSI: it assumes that a <complexType> always has another child after an <annotation>. We can fix this as follows:
ZSI/wstools/XMLSchema.py
2471,2472c2474,2479
< component = SplitQName(contents[indx].getTagName())[1]
<
---
> if indx < len(contents):
>     component = SplitQName(contents[indx].getTagName())[1]
> else:
>     component = None
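The guard in this change can be illustrated in isolation. The sketch below is not ZSI code: it uses plain strings as stand-ins for DOM nodes, and `component_at` is a hypothetical name.

```python
def component_at(contents, indx):
    """Return the local tag name at contents[indx], or None past the end."""
    if indx < len(contents):
        # Keep the part after the prefix, e.g. "xsd:sequence" -> "sequence".
        return contents[indx].split(":")[-1]
    return None


# An <annotation> followed by a <sequence>: the next child exists.
print(component_at(["xsd:annotation", "xsd:sequence"], 1))  # sequence
# An <annotation> with nothing after it: an unguarded lookup would raise IndexError.
print(component_at(["xsd:annotation"], 1))  # None
```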
Unsupported Base Type
When we make the first change and run wsdl2py again, we encounter another error:
Unsupported base('http://docs.oasis-open.org/wsrf/2004/06/wsrf-WS-BaseFaults-1.2-draft-01.xsd', 'BaseFaultType')
This seems to be a limitation of ZSI, which only supports a limited set of base types. Commenting out the error code seems to work.
ZSI/generate/containers.py
2392,2399c2392,2399
< if base is None:
<     raise ContainerError, 'Unsupported derivation: %s'\
<         %derivation.getItemTrace()
<
< if base != (SOAP.ENC,'Array') and base != (SCHEMA.XSD3,'anyType'):
<     raise ContainerError, 'Unsupported base(%s): %s' %(
<         base, derivation.getItemTrace()
<         )
---
> # if base is None:
> #     raise ContainerError, 'Unsupported derivation: %s'\
> #         %derivation.getItemTrace()
> #
> # if base != (SOAP.ENC,'Array') and base != (SCHEMA.XSD3,'anyType'):
> #     raise ContainerError, 'Unsupported base(%s): %s' %(
> #         base, derivation.getItemTrace()
> #         )
The wsdl2py program should now run without issue.
CQLQuery Serialization
Attempting to use query() with a CQLQuery may result in the following exception:
TypeError: bad usage, failed to serialize element reference (http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery, CQLQuery), in: /SOAP-ENV:Body/ns1:QueryRequest
The error message is too vague to diagnose the underlying problem. However, serialization works if we simply comment out these two error checks:
TCcompound.py
58,61c58,61
< if (typecode.nspname,typecode.pname) == (sub.nspname,sub.pname):
<     raise TypeError(\
<         'bad usage, failed to serialize element reference (%s, %s), in: %s' %
<         (typecode.nspname, typecode.pname, sw.Backtrace(elt),))
---
> # if (typecode.nspname,typecode.pname) == (sub.nspname,sub.pname):
> #     raise TypeError(\
> #         'bad usage, failed to serialize element reference (%s, %s), in: %s' %
> #         (typecode.nspname, typecode.pname, sw.Backtrace(elt),))
67,70c67,70
< raise TypeError(\
<     'failed to serialize (%s, %s) illegal sub GED (%s,%s): %s' %
<     (typecode.nspname, typecode.pname, sub.nspname, sub.pname,
<     sw.Backtrace(elt),))
---
> # raise TypeError(\
> #     'failed to serialize (%s, %s) illegal sub GED (%s,%s): %s' %
> #     (typecode.nspname, typecode.pname, sub.nspname, sub.pname,
> #     sw.Backtrace(elt),))
Target Namespace
Finally, there is an error from the Axis side:
ZSI.FaultException: org.xml.sax.SAXException: Invalid element in gov.nih.nci.cagrid.cqlquery.CQLQuery - Target
This happens because the SOAP message generated by ZSI puts Target and Attribute in the wrong namespace. For example:
<ns2:CQLQuery>
  <Target name="gov.nih.nci.cabio.domain.Gene" xsi:type="ns2:Object">
    <Attribute name="symbol" predicate="LIKE" value="Brca1" xsi:type="ns2:Attribute"></Attribute>
  </Target>
</ns2:CQLQuery>
Target and Attribute should be in the same namespace as CQLQuery. For reasons that are not entirely clear, ZSI discards the namespace on these elements and uses xsi:type instead. To ensure the namespace appears correctly, we can do the following:
schema.py
352c352,355
< self.__cache = self.klass(pname=self.pname,
---
> reconstruct_pname = self.pname
> if self.nspname:
>     reconstruct_pname = (self.nspname,self.pname)
> self.__cache = self.klass(pname=reconstruct_pname,
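For comparison, the message shape the service expects, with Target and Attribute in the same namespace as CQLQuery, can be sketched independently of ZSI using the standard library's ElementTree. The namespace URI is taken from the serialization error message shown earlier; the function name `build_cql_query` is hypothetical, and this illustrates the target XML only, not how ZSI builds it.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the serialization error message earlier in this section.
CQL_NS = "http://CQL.caBIG/1/gov.nih.nci.cagrid.CQLQuery"


def build_cql_query(class_name, attr_name, attr_value, predicate):
    """Build a CQLQuery element whose Target and Attribute share its namespace."""
    query = ET.Element("{%s}CQLQuery" % CQL_NS)
    target = ET.SubElement(query, "{%s}Target" % CQL_NS, {"name": class_name})
    ET.SubElement(target, "{%s}Attribute" % CQL_NS,
                  {"name": attr_name, "value": attr_value, "predicate": predicate})
    return query


query = build_cql_query("gov.nih.nci.cabio.domain.Gene", "symbol", "Brca%", "LIKE")
print(ET.tostring(query).decode("ascii"))
```

In the serialized output every element carries the same namespace prefix, which is what the patched ZSI should produce for Target and Attribute as well.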
All of these changes are available as a patch.