Federated Query Processor 1.2 Developers Guide
This document is intended to provide information to developers who wish to make use of the Federated Query Processor grid service and local Federated Query Engine APIs.
Prerequisites
To get started developing against the FQP APIs, your project will require the Java libraries found in the FQP project's ext/dependencies/jars directory, and those in its build/lib directory.
Developers using Ivy to integrate with the caGrid build artifacts may use the following line in their dependencies:
<dependency rev="latest.integration" org="caGrid" name="cql" conf="myconfiguration->cql"/>
Federated Query Engine API
The Federated Query Engine is the core component of the Federated Query Processor, and can be used as either a standalone API, or within the context of the Federated Query Processor grid service.
Constructing an Instance
There are two constructors for the Federated Query Engine:
- public FederatedQueryEngine()
- public FederatedQueryEngine(GlobusCredential credential)
The first constructor is a convenience that passes null to the second one; the credential parameter of the second constructor is optional and may be null.
When a credential is supplied, the Federated Query Engine uses it to query any secure data services involved in DCQL queries issued to the engine.
API Methods
The Federated Query Engine exposes two methods for executing a DCQL query:
Simple Query Execution
The execute method takes a single DCQL query parameter and returns a single DCQLQueryResultsCollection instance. This method may throw a FederatedQueryProcessingException.
public DCQLQueryResultsCollection execute(DCQLQuery dcqlQuery)
throws FederatedQueryProcessingException
This method processes the DCQL query by breaking it down into parts according to its foreign join conditions and generating further CQL queries until it has produced a single CQL query, which is then distributed to all target data services specified by the query. The results of this final query are placed in the DCQLQueryResultsCollection, grouped by the target data service that returned them, and then returned to the caller.
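The processing described above can be pictured with a small toy model. The sketch below is not FQP's implementation and uses no caGrid classes: the in-memory "services", the ToyQuery and ForeignJoin types, and every name in it are hypothetical stand-ins. It only illustrates the general idea of resolving a foreign join first and then sending a purely local query to each target service, with results grouped by service.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: in-memory "services" and a simplified query shape.
// This is NOT FQP's implementation and uses no caGrid classes.
class ForeignJoinSketch {

    // Hypothetical join: a local row matches when its localAttribute value
    // equals some foreignAttribute value returned by foreignQuery.
    static class ForeignJoin {
        final String localAttribute;
        final String foreignAttribute;
        final ToyQuery foreignQuery;

        ForeignJoin(String localAttribute, String foreignAttribute, ToyQuery foreignQuery) {
            this.localAttribute = localAttribute;
            this.foreignAttribute = foreignAttribute;
            this.foreignQuery = foreignQuery;
        }
    }

    // Hypothetical query: the services to ask, plus an optional foreign join.
    static class ToyQuery {
        final List<String> targetServiceUrls;
        final ForeignJoin join; // null when there is no foreign join

        ToyQuery(List<String> targetServiceUrls, ForeignJoin join) {
            this.targetServiceUrls = targetServiceUrls;
            this.join = join;
        }
    }

    // Resolves the foreign join first (recursively), then sends the now purely
    // local query to every target service, keeping results grouped by service.
    static Map<String, List<Map<String, String>>> execute(
            ToyQuery query, Map<String, List<Map<String, String>>> services) {
        Set<String> allowedValues = null;
        if (query.join != null) {
            allowedValues = new HashSet<>();
            for (List<Map<String, String>> rows : execute(query.join.foreignQuery, services).values()) {
                for (Map<String, String> row : rows) {
                    String value = row.get(query.join.foreignAttribute);
                    if (value != null) {
                        allowedValues.add(value);
                    }
                }
            }
        }
        Map<String, List<Map<String, String>>> resultsByService = new LinkedHashMap<>();
        for (String url : query.targetServiceUrls) {
            List<Map<String, String>> matches = new ArrayList<>();
            List<Map<String, String>> rows =
                    services.getOrDefault(url, Collections.<Map<String, String>>emptyList());
            for (Map<String, String> row : rows) {
                if (allowedValues == null || allowedValues.contains(row.get(query.join.localAttribute))) {
                    matches.add(row);
                }
            }
            resultsByService.put(url, matches);
        }
        return resultsByService;
    }
}
```

In this model the returned map plays the role of the DCQLQueryResultsCollection: each target service URL maps to the rows that service contributed.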
Execute and Aggregate Results
The executeAndAggregateResults method takes a single DCQL query parameter and returns a single CQLQueryResults instance. This method may also throw a FederatedQueryProcessingException.
public CQLQueryResults executeAndAggregateResults(DCQLQuery dcqlQuery)
throws FederatedQueryProcessingException
This method processes the DCQL query by breaking it down into parts according to its foreign join conditions and generating further CQL queries until it has produced a single CQL query, which is then distributed to all target data services specified by the query. The results of this final query are aggregated into a single CQL query results instance. This allows the results to be manipulated by existing data service infrastructure tooling (iterators, enumerators, etc.), at the cost of losing the context of which target data service produced a given result.
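The trade-off between the two return shapes can be pictured with a small stand-alone model. The sketch below does not use the caGrid result classes: a Map keyed by service URL stands in for the per-service results collection, a flat List stands in for the aggregated results, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Illustrative model only: these are NOT the caGrid result types.
class AggregationSketch {

    // Flattens per-service results into one list. The service URL that each
    // result came from is discarded, which is the trade-off described above.
    static List<String> aggregate(Map<String, List<String>> resultsByService) {
        List<String> aggregated = new ArrayList<>();
        for (List<String> serviceResults : resultsByService.values()) {
            aggregated.addAll(serviceResults);
        }
        return aggregated;
    }
}
```

Once flattened, nothing in the list records which service supplied an entry, so callers that need that information should prefer the per-service form.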
Federated Query Processing Exceptions
These exceptions may be thrown from either public API method when something goes wrong in the course of processing a DCQL query. Several common causes of this exception are:
- Failure of a data service involved in the DCQL query
  - Failure handling behavior for target data services is controllable by the Query Execution Parameters used to construct the Federated Query Engine
- Invalid CQL is passed along to a data service (typically due to invalid DCQL originally)
- Bad / unrecognized user certificate
Federated Query Processor Client
The Federated Query Processor Client is the client-side API for communicating with the caGrid Federated Query Processor Service.
Constructing an Instance
The Federated Query Processor Client has four constructors, most of which are simply convenience accessors to a final constructor. The various constructors are as follows:
- public FederatedQueryProcessorClient(String url) throws MalformedURIException, RemoteException
- public FederatedQueryProcessorClient(String url, GlobusCredential proxy) throws MalformedURIException, RemoteException
- public FederatedQueryProcessorClient(EndpointReferenceType epr) throws MalformedURIException, RemoteException
- public FederatedQueryProcessorClient(EndpointReferenceType epr, GlobusCredential proxy) throws MalformedURIException, RemoteException
The url parameter passed in the first two constructors is the URL of the Federated Query Processor Service you wish to connect to. The epr parameter in the last two constructors is an Axis Endpoint Reference which resolves to the FQP service you wish to connect to. The proxy parameter is a Globus Credential Proxy which you may use to authenticate to and securely communicate with the FQP service. These constructors should look familiar to users of other Introduce-generated caGrid services, since they are the standard client constructors.
Connecting to a Secure FQP Service
Versions 1.2 and earlier of the Federated Query Processor Client contain special modifications to always connect with a caller ID when the client supplies a credential. That is, if you create an FQP client using a constructor that takes a GlobusCredential object, that credential will always be passed to the FQP service when invoking service operations. Note: this does not provide any form of credential delegation.
API Methods
The Federated Query Processor Client offers three public methods for executing DCQL queries:
Simple Query Execution
The execute method takes a single DCQL query parameter and returns a single DCQLQueryResultsCollection instance. This method may throw a RemoteException or a FederatedQueryProcessingFault.
public DCQLQueryResultsCollection execute(DCQLQuery dcqlQuery)
throws RemoteException, FederatedQueryProcessingFault
This method sends a DCQL query to the service, which then uses the Federated Query Engine to process the DCQL query by breaking it down into parts according to its foreign join conditions and generating further CQL queries until it has produced a single CQL query, which is then distributed to all target data services specified by the query. The results of this final query are placed in the DCQLQueryResultsCollection, grouped by the target data service that returned them, and then returned to the caller.
Execute and Aggregate Results
The executeAndAggregateResults method takes a single DCQL query parameter and returns a single CQLQueryResults instance. This method may also throw a RemoteException or a FederatedQueryProcessingFault.
public CQLQueryResults executeAndAggregateResults(DCQLQuery dcqlQuery)
throws RemoteException, FederatedQueryProcessingFault
This method sends a DCQL query to the service, which then uses the Federated Query Engine to process the DCQL query by breaking it down into parts according to its foreign join conditions and generating further CQL queries until it has produced a single CQL query, which is then distributed to all target data services specified by the query. The results of this final query are aggregated into a single CQL query results instance. This allows the results to be manipulated by existing data service infrastructure tooling (iterators, enumerators, etc.), at the cost of losing the context of which target data service produced a given result.
Asynchronous Query Execution
The Federated Query Processor Client offers an API to perform a DCQL query asynchronously. With this functionality, a client can issue a DCQL query, immediately receive a Federated Query Results Client, and use that new client to retrieve results at a later time, potentially using WS-Notification functionality to determine when the query has completed processing on the service and results are available.
The executeAsynchronously method takes a single DCQL query parameter and returns a single Federated Query Results Client instance. This method may throw a RemoteException or a MalformedURIException.
public FederatedQueryResultsClient executeAsynchronously(DCQLQuery query)
throws RemoteException, org.apache.axis.types.URI.MalformedURIException
Federated Query Results Client
The Federated Query Results Client is a caGrid service client which can retrieve results and information about the current state of query processing from a Federated Query Processor Service which has been previously issued a query.
The Federated Query Results Client has the same constructors that any standard Introduce-generated service client would have, but only the constructors which take an Endpoint Reference Type should be used, since the EPR contains the resource key needed to access the server-side query results resource.
API Methods
The Federated Query Results Client supplies standard WS-ResourceLifetime methods, which may be used to set the resource's termination time or to immediately dispose of the resource. The results client also includes methods supporting WS-Notification, which can be used to determine when various query processing events have happened, and when query results are available.
Other methods specific to the Federated Query Results Client are as follows:
public boolean isProcessingComplete() throws RemoteException
This method simply returns true if the Federated Query Processor Service has completed execution of the original DCQL query and results are available, or false otherwise.
public DCQLQueryResultsCollection getResults() throws RemoteException, ProcessingNotCompleteFault, FederatedQueryProcessingFault, InternalErrorFault
This method gets the DCQL query results from the resource to which the Federated Query Results Client is connected. If processing has not yet completed (as indicated by the isProcessingComplete method), this will throw a Processing Not Complete Fault. Problems encountered while processing the query will cause a Federated Query Processing Fault to be thrown.
public CQLQueryResults getAggregateResults() throws RemoteException, FederatedQueryProcessingFault, ProcessingNotCompleteFault, InternalErrorFault
This method behaves very similarly to the Federated Query Engine's executeAndAggregateResults method. It gets the DCQL query results as a single, aggregate CQL Query Results instance which can be processed further with the standard data service tools.
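When WS-Notification is not being used, one simple way to combine isProcessingComplete with the result getters is to poll. The helper below is a minimal sketch rather than part of the FQP client API: the ResultsPollingSketch class, the CompletionCheck interface, and the attempt and sleep parameters are all hypothetical, and the interface merely stands in for a call to the results client so that the block compiles without the caGrid libraries.

```java
import java.util.concurrent.TimeoutException;

// Illustrative helper only: not part of the FQP client API.
class ResultsPollingSketch {

    // Stand-in for a call such as resultsClient.isProcessingComplete(),
    // which may throw a checked exception (RemoteException in the real client).
    interface CompletionCheck {
        boolean isComplete() throws Exception;
    }

    // Polls until the check returns true, sleeping between attempts.
    // Returns the number of checks that were needed, or throws
    // TimeoutException if maxAttempts checks all return false.
    static int waitUntilComplete(CompletionCheck check, int maxAttempts, long sleepMillis)
            throws Exception {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (check.isComplete()) {
                return attempt;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(sleepMillis);
            }
        }
        throw new TimeoutException("Query still processing after " + maxAttempts + " checks");
    }
}
```

With the real client, the check passed in would wrap isProcessingComplete(), and a successful return would be followed by a call to getResults() or getAggregateResults(). Bounding the number of attempts avoids waiting forever on a query that never completes.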





