Utility Data Service APIs
The caGrid data services infrastructure includes several utility classes which can be used to ease development and use of data services. These classes are found in the gov.nih.nci.cagrid.data.utilities package distributed with the data service infrastructure.
Creating CQL Query Results
The CQLResultsCreationUtil class provides convenience methods for creating CQLQueryResults instances for object results, attribute results, and a counting result. A convenience method for identifier results may be added in the future. The class provides three public static methods, one for each type of results currently supported.
- public static CQLQueryResults createObjectResults(List objects, String targetName, Mappings classToQname)
- objects - a list of Java objects to be placed in a new CQLQueryResults object.
- targetName - the name of the class targeted by the query which produced these object results. All items in the objects list should be of this type.
- classToQname - a mapping from class name to QName. This is a generated Java bean from the XML schema for the data service infrastructure and contains an array of name/value pairs that map class names to QNames.
- public static CQLQueryResults createAttributeResults(List attribArrays, String targetClassname, String[] attribNames)
- attribArrays - a List of Object arrays. Each array should hold one value for each requested attribute of a single object. These values may be null. The values must be in an order corresponding to the ordering of the attribute names.
- targetClassname - the name of the class targeted by the query which produced these attribute results. All attribute arrays should have come from objects of this type.
- attribNames - the names of the attributes returned by the query. These should be in the same ordering used by the attribute arrays.
- public static CQLQueryResults createCountResults(long count, String targetClassname)
- count - the number of resulting items (objects, attribute sets) from a query.
- targetClassname - the name of the class which was the target of the query
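The ordering requirement between attribArrays and attribNames can be illustrated without any caGrid classes. The following is a minimal sketch, not part of CQLResultsCreationUtil: the class AttributeRowSketch, the helper pairWithNames, and the sample values are all hypothetical. It pairs each value array with the attribute names by position and rejects arrays whose length does not match.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-alone illustration; no caGrid classes are used here.
class AttributeRowSketch {

    /**
     * Pairs each value array with the attribute names by position.
     * Hypothetical helper, not part of CQLResultsCreationUtil.
     */
    static List<Map<String, Object>> pairWithNames(List<Object[]> attribArrays, String[] attribNames) {
        List<Map<String, Object>> rows = new ArrayList<Map<String, Object>>();
        for (Object[] values : attribArrays) {
            if (values.length != attribNames.length) {
                throw new IllegalArgumentException("Expected " + attribNames.length
                    + " values but found " + values.length);
            }
            // LinkedHashMap keeps the attribute ordering of attribNames
            Map<String, Object> row = new LinkedHashMap<String, Object>();
            for (int i = 0; i < attribNames.length; i++) {
                row.put(attribNames[i], values[i]); // null values are allowed
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data: two attribute names, two result rows
        String[] attribNames = {"id", "symbol"};
        List<Object[]> attribArrays = new ArrayList<Object[]>();
        attribArrays.add(new Object[] {Integer.valueOf(1), "exampleSymbol"});
        attribArrays.add(new Object[] {Integer.valueOf(2), null});
        System.out.println(pairWithNames(attribArrays, attribNames));
    }
}
```

A list and a names array prepared this way have the shape that createAttributeResults expects for its attribArrays and attribNames parameters.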
Querying and Iterating
The DataServiceIterator is an interface that allows a query to be submitted to a data service and returns an Iterator over the result set. Implementations of this interface are provided for the standard data service, for data services with enumeration enabled, and for BDT-enabled data services.
- DataServiceHandle
- The data service handle is the implementation of the data service iterator interface for a base caGrid Data Service. It has three constructors, all of which take a DataServiceClient instance. The simplest constructor needs only this parameter. The other two constructors should be used when custom serialization and deserialization of types has been specified for the service. The extra parameter can be either the filename of a wsdd file containing this mapping information, or an InputStream to the same information.
- EnumDataServiceHandle
- The enum data service handle is the implementation of the data service iterator interface for a WS-Enumeration enabled caGrid Data Service. It has two constructors, both of which take an enumeration data service client instance. The simplest constructor needs only this parameter. The second constructor also takes an IterationConstraints instance, which contains information about how data should be requested from the enumeration data service.
- BdtDataServiceHandle
- The BDT data service handle is an implementation of the data service iterator interface to be used with a BDT-enabled caGrid Data Service. Its behavior is the same as that of the enum data service handle, except that it handles the additional invocation of the BDT context to support enumeration internally.
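The difference between iterating a standard result set and iterating an enumeration can be sketched with stand-in types. Nothing below is the caGrid API: QueryIteratorSketch, EagerHandleSketch, PagedHandleSketch, the String-based query, and the page size parameter are assumptions made for illustration. The eager variant collects every match before returning its Iterator, while the paged variant pulls results in chunks as the caller iterates, which is roughly the behavior that iteration constraints would govern.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for the query-then-iterate contract; not the caGrid interface. */
interface QueryIteratorSketch {
    Iterator<String> query(String queryText);
}

/** Eager variant: the whole result set is collected up front, then iterated. */
class EagerHandleSketch implements QueryIteratorSketch {
    private final List<String> backingData;

    EagerHandleSketch(List<String> backingData) {
        this.backingData = backingData;
    }

    public Iterator<String> query(String queryText) {
        List<String> matches = new ArrayList<String>();
        for (String item : backingData) {
            if (item.contains(queryText)) {
                matches.add(item);
            }
        }
        return matches.iterator();
    }
}

/** Paged variant: results are pulled in fixed-size chunks while the caller iterates. */
class PagedHandleSketch implements QueryIteratorSketch {
    private final List<String> backingData;
    private final int pageSize;
    int pagesFetched = 0; // counts simulated round trips

    PagedHandleSketch(List<String> backingData, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be at least 1");
        }
        this.backingData = backingData;
        this.pageSize = pageSize;
    }

    public Iterator<String> query(String queryText) {
        // Simulates result state that a remote service would hold
        final List<String> serverSide = new EagerHandleSketch(backingData).toList(queryText);
        return new Iterator<String>() {
            private List<String> page = new ArrayList<String>();
            private int nextOffset = 0;
            private int posInPage = 0;

            public boolean hasNext() {
                if (posInPage < page.size()) {
                    return true;
                }
                if (nextOffset >= serverSide.size()) {
                    return false;
                }
                int end = Math.min(nextOffset + pageSize, serverSide.size());
                page = serverSide.subList(nextOffset, end); // fetch the next chunk
                nextOffset = end;
                posInPage = 0;
                pagesFetched++;
                return true;
            }

            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.get(posInPage++);
            }

            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }
}

class IteratorSketchDemo {
    public static void main(String[] args) {
        List<String> data = Arrays.asList("gene-1", "protein-1", "gene-2", "gene-3");
        Iterator<String> results = new PagedHandleSketch(data, 2).query("gene");
        while (results.hasNext()) {
            System.out.println(results.next());
        }
    }
}
```

Calling code only sees an Iterator in both cases, which is the point of hiding the service type behind a single interface.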
Domain Model Manipulation
The DomainModelUtils class provides a means to extract useful information from a domain model.
- public static UMLClass getReferencedUMLClass(DomainModel model, UMLClassReference reference)
- To save on document size, domain models do not duplicate class information when an association is defined, but rather use class references based on ID values. These reference values can be traced back to their original UML Class instance with this function.
- public static UMLClass[] getAllSuperclasses(DomainModel model, String className)
- Superclasses of a UML Class can be determined by traversing UML class references and generalization information. There are two methods which perform this task in the DomainModelUtils class. One uses a class name and the other extracts the name from a UMLClass instance and passes it to the first.
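The traversal described above can be sketched with a simplified stand-in model. The class SuperclassWalkSketch and its maps are hypothetical and do not reflect the real DomainModel beans; the sketch also assumes single inheritance and unique class names. It resolves reference IDs back to classes and follows generalization links upward until no superclass remains.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Stand-in domain model: classes keyed by reference ID, plus subclass-to-superclass links. */
class SuperclassWalkSketch {
    private final Map<String, String> classNameById = new HashMap<String, String>();
    private final Map<String, String> superIdBySubId = new HashMap<String, String>();

    void addClass(String refId, String className) {
        classNameById.put(refId, className);
    }

    void addGeneralization(String subclassRefId, String superclassRefId) {
        superIdBySubId.put(subclassRefId, superclassRefId);
    }

    /** Resolves a reference ID back to the class it points at (here, just its name). */
    String getReferencedClass(String refId) {
        return classNameById.get(refId);
    }

    /** Walks generalization links upward, nearest superclass first. */
    List<String> getAllSuperclasses(String className) {
        String currentId = null;
        for (Map.Entry<String, String> entry : classNameById.entrySet()) {
            if (entry.getValue().equals(className)) {
                currentId = entry.getKey();
                break;
            }
        }
        List<String> superclasses = new ArrayList<String>();
        Set<String> visited = new HashSet<String>(); // guards against cyclic links
        while (currentId != null && visited.add(currentId)) {
            String superId = superIdBySubId.get(currentId);
            if (superId != null && !visited.contains(superId)) {
                superclasses.add(getReferencedClass(superId));
            }
            currentId = superId;
        }
        return superclasses;
    }

    public static void main(String[] args) {
        // Made-up three-level hierarchy
        SuperclassWalkSketch model = new SuperclassWalkSketch();
        model.addClass("id-1", "Gene");
        model.addClass("id-2", "BiologicalEntity");
        model.addClass("id-3", "Entity");
        model.addGeneralization("id-1", "id-2");
        model.addGeneralization("id-2", "id-3");
        System.out.println(model.getAllSuperclasses("Gene"));
    }
}
```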
WSDD Manipulation
The WsddUtil utility class contains functions to set parameters in a wsdd file. This class is used internally by the Introduce data service extension to edit the wsdd files and change the castor mapping file name.
- public static void setGlobalClientParameter(String clientWsddFile, String key, String value)
- clientWsddFile - the name of the client side wsdd file to edit. When edits are complete, the changed file is saved to the same location.
- key - the key of the parameter. This is the name by which the parameter can be accessed.
- value - the value stored in the parameter
- public static void setServiceParameter(String serverWsddFile, String serviceName, String key, String value)
- serverWsddFile - the name of the server side wsdd file to edit. When edits are complete, the changed file will be saved to the same location.
- serviceName - the name of the service in the wsdd file on which the parameter is set.
- key - the key of the parameter. This is the name by which the parameter can be accessed.
- value - the value stored in the parameter.
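What setting a service parameter amounts to can be sketched with the standard Java DOM API. This is not the WsddUtil implementation: the class WsddParameterSketch and its methods are hypothetical, and the sketch works on an in-memory document rather than saving back to a file. It assumes the usual Axis WSDD layout, in which a service element carries parameter children with name and value attributes.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch; not the caGrid WsddUtil class.
class WsddParameterSketch {
    static final String WSDD_NS = "http://xml.apache.org/axis/wsdd/";

    static Document parse(String wsddXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(wsddXml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse wsdd content", e);
        }
    }

    /** Updates the named parameter on the named service, adding it if absent. */
    static void setServiceParameter(Document doc, String serviceName, String key, String value) {
        NodeList services = doc.getElementsByTagNameNS(WSDD_NS, "service");
        for (int i = 0; i < services.getLength(); i++) {
            Element service = (Element) services.item(i);
            if (!serviceName.equals(service.getAttribute("name"))) {
                continue;
            }
            Element target = null;
            NodeList params = service.getElementsByTagNameNS(WSDD_NS, "parameter");
            for (int j = 0; j < params.getLength(); j++) {
                Element param = (Element) params.item(j);
                if (key.equals(param.getAttribute("name"))) {
                    target = param;
                }
            }
            if (target == null) {
                target = doc.createElementNS(WSDD_NS, "parameter");
                target.setAttribute("name", key);
                service.appendChild(target);
            }
            target.setAttribute("value", value);
        }
    }

    static String serialize(Document doc) {
        try {
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize wsdd content", e);
        }
    }

    public static void main(String[] args) {
        // Made-up service name and parameter values
        Document doc = parse("<deployment xmlns=\"" + WSDD_NS + "\">"
            + "<service name=\"cagrid/Example\"/></deployment>");
        setServiceParameter(doc, "cagrid/Example", "exampleKey", "exampleValue");
        System.out.println(serialize(doc));
    }
}
```

A global client parameter could be handled the same way by targeting the globalConfiguration element instead of a named service.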
Query Validation Tools
The caGrid Data Services infrastructure provides for validation of queries against both the CQL schema and the domain model exposed by a service, as well as validation of query results against the data types the service exposes.
CQL Query Syntax
The caGrid Data Service infrastructure provides mechanisms to validate CQL queries for syntactic correctness. While the Axis engine prevents malformed XML from ever being turned into CQL objects, it does not handle XML that does not conform to certain schema restrictions. For example, Axis does not prevent populating multiple child elements of an XML schema 'choice'. For this reason, CQL syntax validation can be enabled on a caGrid data service. This mechanism will reject invalid queries before they ever reach a CQL Query Processor implementation, saving the processor's developer from having to handle them. This same validation can be performed either on the client side or completely offline by using the query validation utilities.
For syntax validation, the interface gov.nih.nci.cagrid.data.cql.validation.CqlStructureValidator is provided, as are two implementations of this interface. The interface provides the validateCqlStructure() method, which takes a single CQLQuery instance parameter, and throws a MalformedQueryException if an error is encountered.
The default implementation of this interface is the gov.nih.nci.cagrid.data.cql.validation.ObjectWalkingCQLValidator class. As its name suggests, this class walks through the CQL object model, seeking out inconsistencies with the published CQL schema. This class also has a main() method, which allows it to be run from the command line with a list of CQL query XML files specified as arguments. The data service infrastructure uses this class by default when query validation is enabled. This can be changed to any other class which implements the CqlStructureValidator interface by editing the value of the dataService_cqlValidatorClass service property in a generated data service.
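The kind of check an object-walking validator performs can be sketched with stand-in types. The classes below are hypothetical simplifications, not the generated CQL beans or the real validator; the sketch assumes a query node whose attribute, association, and group children form a schema choice, so at most one of them may be populated. It walks the node tree and rejects any node that populates more than one.

```java
// Hypothetical sketch; not the ObjectWalkingCQLValidator implementation.
class ChoiceValidationSketch {

    /** Simplified stand-in for a CQL object node; not the generated caGrid bean. */
    static class NodeSketch {
        String name;
        String attribute;        // stand-in for an attribute restriction child
        NodeSketch association;  // stand-in for an association child
        String group;            // stand-in for a group child

        NodeSketch(String name) {
            this.name = name;
        }
    }

    /** Stand-in for the exception a structure validator would throw. */
    static class MalformedQuerySketchException extends Exception {
        MalformedQuerySketchException(String message) {
            super(message);
        }
    }

    /** Walks the node tree and enforces the "at most one child of the choice" rule. */
    static void validateStructure(NodeSketch node) throws MalformedQuerySketchException {
        if (node.name == null || node.name.length() == 0) {
            throw new MalformedQuerySketchException("Node has no name");
        }
        int populated = 0;
        if (node.attribute != null) {
            populated++;
        }
        if (node.association != null) {
            populated++;
        }
        if (node.group != null) {
            populated++;
        }
        if (populated > 1) {
            throw new MalformedQuerySketchException(
                "Node " + node.name + " populates " + populated + " children of a choice");
        }
        if (node.association != null) {
            validateStructure(node.association); // recurse into nested nodes
        }
    }

    public static void main(String[] args) {
        // Made-up query that violates the choice by setting two children
        NodeSketch target = new NodeSketch("Gene");
        target.attribute = "symbol";
        target.group = "AND";
        try {
            validateStructure(target);
            System.out.println("query accepted");
        } catch (MalformedQuerySketchException e) {
            System.out.println("query rejected: " + e.getMessage());
        }
    }
}
```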
Domain Model Conformance
The Data Service infrastructure also provides mechanisms to validate a structurally sound CQL query against a Domain Model to ensure its restrictions are supported by the domain model's exposed structure. Domain Model validation may be enabled for a caGrid data service, and will be performed on every query submitted to the service before it is passed to the CQL query processor.
The interface gov.nih.nci.cagrid.data.cql.validation.CqlDomainValidator is provided, along with a single implementation. The interface provides the validateDomainModel() method, which takes a single CQLQuery instance parameter, and throws a MalformedQueryException if an error is encountered.
The lone implementation provided with the caGrid Data Service infrastructure is the gov.nih.nci.cagrid.data.cql.validation.DomainModelValidator class. Like the CQL structure validator, this class has a main() method, which allows it to be run from a command line. The arguments should be first a domain model XML file, then a list of CQL query files to be validated. The data service infrastructure uses this class when domain model validation is enabled. This implementation may be substituted for another by editing the value of the dataService_domainModelValidatorClass service property in a generated data service.
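Domain model conformance can be pictured as a lookup against the classes and attributes a service exposes. The sketch below is a hypothetical simplification and not the DomainModelValidator implementation: it reduces the domain model to a map from class name to attribute names, and a query to a target class plus an optional attribute restriction.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the real validator works on DomainModel and CQLQuery beans.
class DomainConformanceSketch {
    private final Map<String, Set<String>> attributesByClass = new HashMap<String, Set<String>>();

    /** Registers a class and the attributes it exposes in the stand-in model. */
    void exposeClass(String className, String... attributeNames) {
        attributesByClass.put(className, new HashSet<String>(Arrays.asList(attributeNames)));
    }

    /**
     * Checks that the target class exists and, if an attribute restriction
     * is given (non-null), that the class exposes that attribute.
     */
    void validate(String targetClass, String attributeName) {
        Set<String> attributes = attributesByClass.get(targetClass);
        if (attributes == null) {
            throw new IllegalArgumentException("Class not in domain model: " + targetClass);
        }
        if (attributeName != null && !attributes.contains(attributeName)) {
            throw new IllegalArgumentException(
                "Class " + targetClass + " has no attribute " + attributeName);
        }
    }

    public static void main(String[] args) {
        // Made-up model with a single class
        DomainConformanceSketch model = new DomainConformanceSketch();
        model.exposeClass("Gene", "id", "symbol");
        model.validate("Gene", "symbol"); // passes silently
        try {
            model.validate("Gene", "chromosome");
        } catch (IllegalArgumentException e) {
            System.out.println("query rejected: " + e.getMessage());
        }
    }
}
```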
Query Results Validation
The data service infrastructure also provides a means to both validate the results of a CQL query against a known set of targets, and to determine what target data types are allowed to be returned by a caGrid Data Service. Every data service exposes a schema through its WSDL that enumerates the data types which may be returned by the data service. This schema appears in generated services under the schemas/<ServiceName> directory as <ServiceName>_CQLResultTypes.xsd.
The utility class gov.nih.nci.cagrid.data.utilities.validation.CQLQueryResultsValidator has been provided to both retrieve this file and verify that a CQLQueryResults instance conforms to this schema. An instance of this class can be constructed with either the full path to a data service's WSDL file, or an endpoint reference to a running data service.
The validator exposes two public methods:
- public void saveRestrictedCQLResultSetXSD(File fileLocation) throws SchemaValidationException
- This method locates the restriction XSD file and saves its contents into the file specified.
- fileLocation - a file into which the restriction XSD will be saved.
- public void validateCQLResultSet(CQLQueryResults resultSet) throws SchemaValidationException
- resultSet - a set of results generated by a query into a caGrid Data Service. The object contents of this result set will be processed against the restriction XSD.
The CQLQueryResultsValidator class also has a main() method, which takes two arguments. The first argument is a URL to a caGrid Data Service, which will be used to retrieve the result restriction schema. The second argument should be the filename of a CQLQueryResults instance serialized to an XML document.
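Checking serialized results against a restriction XSD is, at its core, ordinary XML schema validation. The sketch below uses only the standard javax.xml.validation API and is not the CQLQueryResultsValidator implementation, whose internals are not described here. The class XsdValidationSketch, the one-element schema, and the CountResult element name are stand-ins invented for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical sketch built on the standard JAXP validation API.
class XsdValidationSketch {

    /** Returns true if the XML document is valid against the given XSD text. */
    static boolean conforms(String xsd, String xml) {
        Schema schema;
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        } catch (SAXException e) {
            // A schema that cannot be compiled is a caller error, not a validation result
            throw new IllegalArgumentException("Schema could not be compiled", e);
        }
        try {
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // the document violates the schema
        } catch (IOException e) {
            throw new IllegalStateException("Could not read the document", e);
        }
    }

    public static void main(String[] args) {
        // Made-up stand-in schema: a single element holding a long value
        String xsd = "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"CountResult\" type=\"xs:long\"/>"
            + "</xs:schema>";
        System.out.println(conforms(xsd, "<CountResult>5</CountResult>"));
        System.out.println(conforms(xsd, "<CountResult>five</CountResult>"));
    }
}
```

In practice the schema would be the file written by saveRestrictedCQLResultSetXSD, and the document would be a CQLQueryResults instance serialized to XML.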





