


Part Seven: Analysis using Bootcamp Services



Introduction


This section provides an example of the client software required to interact with a caGrid Analytical Service. Despite the name, Analytical Services are not used purely for analyzing data; they can also store and retrieve data on the Grid. An excellent example of a complex caGrid Analytical Service is Dorian, which exposes many operations that allow users to register, log in, and obtain host credentials.

Although Analytical Services are created and deployed to the Grid like Data Services, there is no generic client for invoking their operations, unlike the Discovery and Data Service clients that we used in the previous sections. For this reason, we must obtain some Jar files from the Analytical Service that we intend to use. These Jars provide the data types used by the service and the client classes that invoke its exposed operations. Typically, you will have to contact the Analytical Service developer or administrator to obtain the required Jars (remember that this contact information is available in the service metadata in the Portal). Because we are using a service that was created by the caGrid Knowledge Center, the Jars are provided as part of this guide.

Because analytical services and their client applications are tightly coupled, we will also alter our interaction with the data service. In the Data Service section we received data as XML, which allowed us to interact with Data Services without working directly with the data types used by each service. Now, we want to work specifically with these data types. To accomplish this, we will implement another method of the DataServiceClient to obtain query results from the BootCampDataSvc as Java objects, rather than XML, before invoking the BootCampAnalyticalService. The BootCamp Analytical Service has been created to use the same caCORE data types as the Data Service.

Desired Application Functionality

Discover the BootCamp Data Services

Performed in Step Five: Discovery of Services on the Grid

Query of Data Services

Performed in Step Five: Data Service Querying

Processing CQL Result Sets as Java Objects

When processing the CQL result set, use the BootCampAnalyticalSvc client-config.wsdd to specify which deserializer is needed to retrieve results as Java Objects, then cast the results to ProteinSequences.
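
One practical detail when loading a configuration file such as client-config.wsdd from a Jar: Class.getResourceAsStream returns null, rather than throwing, when the resource cannot be found. The following is a minimal sketch of a guard for that case. The class name ResourceCheck and the helper requireResource are hypothetical and not part of caGrid; in the real client the anchor class would be BootCampAnalyticalSvcClient rather than the sketch class itself.

```java
import java.io.IOException;
import java.io.InputStream;

final class ResourceCheck {
    // Hypothetical helper: opens a resource relative to the anchor class
    // and fails with a clear message instead of returning null.
    static InputStream requireResource(Class<?> anchor, String name) {
        InputStream in = anchor.getResourceAsStream(name);
        if (in == null) {
            throw new IllegalStateException(
                "Resource '" + name + "' not found relative to " + anchor.getName());
        }
        return in;
    }

    public static void main(String[] args) {
        // The sketch class is used as the anchor only so that this compiles
        // without the BootCamp Jars on the classpath.
        try (InputStream in = requireResource(ResourceCheck.class, "client-config.wsdd")) {
            System.out.println("client-config.wsdd found");
        } catch (IllegalStateException | IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing early with a descriptive message makes a missing or misplaced Jar easier to diagnose than a later error raised while deserializing the results.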

Invoke an Analytical Service

Using the Jar files from the Analytical Service, invoke a service operation to perform some filtering of the result set that was obtained from a Data Service.

Example Application Flow

  1. Discover the BootCamp Data Services.
  2. Query the Boot Camp Data Service.
  3. Process the CQL result set to cast the results to ProteinSequences.
  4. Format the Protein Sequences into an array as required by the Analytical Service.
  5. Call the Analytical Service to perform some filtering of the result set.
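
Steps 3 to 5 of this flow can be sketched locally, without any Grid connection. This is only an illustration under stated assumptions: SimpleSequence is a hypothetical stand-in for the BootCamp ProteinSequence type, and keepAtOrAbove is a local placeholder for the remote filtering operation. Whether the real service keeps a sequence whose weight equals the threshold exactly is not stated in this guide, so the boundary behavior here is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the BootCamp ProteinSequence domain type.
final class SimpleSequence {
    final long id;
    final int molecularWeightInDaltons;

    SimpleSequence(long id, int molecularWeightInDaltons) {
        this.id = id;
        this.molecularWeightInDaltons = molecularWeightInDaltons;
    }
}

final class FlowSketch {
    // Step 4: the analytical service expects an array, not a List.
    static SimpleSequence[] toArray(List<SimpleSequence> sequences) {
        return sequences.toArray(new SimpleSequence[0]);
    }

    // Step 5 placeholder: a local filter standing in for the remote call.
    // Assumption: sequences at exactly the minimum weight are kept.
    static SimpleSequence[] keepAtOrAbove(SimpleSequence[] input, int minWeight) {
        List<SimpleSequence> kept = new ArrayList<SimpleSequence>();
        for (SimpleSequence s : input) {
            if (s.molecularWeightInDaltons >= minWeight) {
                kept.add(s);
            }
        }
        return toArray(kept);
    }

    public static void main(String[] args) {
        // Step 3 stand-in: results already available as typed objects.
        List<SimpleSequence> results = new ArrayList<SimpleSequence>();
        results.add(new SimpleSequence(1, 207899));
        results.add(new SimpleSequence(6, 372216));
        results.add(new SimpleSequence(10, 377345));

        SimpleSequence[] filtered = keepAtOrAbove(toArray(results), 300000);
        System.out.println("Kept " + filtered.length + " of " + results.size());
    }
}
```

In the real client the filtering happens on the Grid, so only the list-to-array conversion carries over directly.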

Creating Client Functionality


Now, we will implement the new data service query in our client application.

Step 1: Configure your Project with the BootCamp Analytical Service Jars.

Import Analytical Service Jars

  1. Download the provided BootCamp Analytical Service Jar files.
  2. Copy the files into the caGridClient/ext/lib directory.

Import Jars into the Project

  1. Right-click on the caGridClient project in the Eclipse Package Explorer.
  2. Select the Properties entry at the bottom of the list.
  3. Select Java Build Path, then Libraries.
  4. Click the Add Library button.
  5. In the Add Library dialog, click User Library, then click Next.
  6. Click the User Libraries button.
  7. In the next dialog, click the New button.
  8. Specify the name ext-lib as the library name and click "OK".
  9. Click the newly added library then click Add Jars.
  10. In the file selection dialog navigate to caGridClient/ext/lib.
  11. Select the 5 Jar files that you just added and click OK.
  12. In the Preferences dialog, click OK.
  13. In the Add Library dialog, click Finish.
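
After adding the Jars, it can help to confirm that the classes used later in this section are actually visible on the classpath. The following is a small sketch, not part of the tutorial's client; the class name ClasspathCheck and the helper isOnClasspath are hypothetical. It would need to be run with the same classpath as the caGridClient project, and it reports MISSING for both names when run without the BootCamp Jars.

```java
final class ClasspathCheck {
    // Returns true when the named class can be located on the classpath.
    // Passing initialize=false avoids running the class's static initializers.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        } catch (LinkageError e) {
            // The class was found but something it depends on could not be loaded.
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the imports used later in this section.
        String[] required = {
            "gov.nih.nci.training.BootCamp.domain.ProteinSequence",
            "gov.nih.nci.training.bootcamp.client.BootCampAnalyticalSvcClient"
        };
        for (String name : required) {
            System.out.println(name + ": " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```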

Step 2: Invoke an Analytical Service

We will now create a class that handles our calls to the BootCamp Analytical Service. This class contains two methods: one obtains BootCamp-specific service endpoints, and the other instantiates the BootCamp Analytical Service client and calls the removeProteinSequencesBelowMolecularWeight operation.

Create the BootCampSvcActions Class

  1. In the Eclipse Package Explorer right-click the src/org.cagrid.client package and select New->Class
  2. Set the class name as BootCampSvcActions and click Finish.
  3. Add imports for the BootCamp ProteinSequence data type and analytical service client.
    import gov.nih.nci.training.BootCamp.domain.ProteinSequence;
    import gov.nih.nci.training.bootcamp.client.BootCampAnalyticalSvcClient;
    
    import java.rmi.RemoteException;
    
    import org.apache.axis.message.addressing.EndpointReferenceType;
    import org.apache.axis.types.URI.MalformedURIException;
    
  4. Add the following method to discover the BootCamp specific service endpoints.
    public static EndpointReferenceType getBootCampSvcEndpoint (String indexUrl, String serviceName)
    {
    	// query the index service for services matching serviceName
    	EndpointReferenceType[] serviceEndpoints =null;
    	DiscoveryActions discActions;
    	try
    	{
    		discActions = new DiscoveryActions(indexUrl);
    		serviceEndpoints = discActions.searchServices(serviceName);
    
    		if (null != serviceEndpoints && serviceEndpoints.length > 0)
    		{
    			System.out.println ("Available " + serviceName + " Services:");
    			for (int i=0; i < serviceEndpoints.length; i++)
    			{
    				System.out.println("    " + i + ": " + serviceEndpoints[i].getAddress().toString());
    			}
    		}
    		else
    		{
    			System.out.println("No matching Services found");
    			return null;
    		}
    	}
    	catch (Exception e)
    	{
    		System.out.println("ERROR: " + e.getClass().getName() + ": " + e.toString());
    		e.printStackTrace();
    		// discovery failed, so there is no endpoint array to select from
    		return null;
    	}
    
    	// select one of the discovered endpoints
    	EndpointReferenceType dataServiceEndpoint = null;
    	if (serviceEndpoints.length > 1)
    	{
    		int last = serviceEndpoints.length -1;
    		String message = "Select Service [0.." + last + "]: ";
    		int serviceNumber = UserInteraction.getIntegerFromUser(message, 0, last);
    		dataServiceEndpoint = serviceEndpoints[serviceNumber];
    	}
    	else {dataServiceEndpoint = serviceEndpoints[0];}
    
    	return dataServiceEndpoint;
    }
    
  5. Add the following method to invoke the analytical service operation.
    public static ProteinSequence[] filterLowMolecularWeightProteinSequences (String url, ProteinSequence[] proteinSequences, int molecularWeightInDaltonsMin)
    {
    	BootCampAnalyticalSvcClient client;
    	try
    	{
    		client = new BootCampAnalyticalSvcClient(url);
    		ProteinSequence[] filteredSequences = client.removeProteinSequencesBelowMolecularWeight(proteinSequences, molecularWeightInDaltonsMin);
    		return filteredSequences;
    	}
    	catch (MalformedURIException e){e.printStackTrace();}
    	catch (RemoteException e){e.printStackTrace();}
    
    	return null;
    }
    
  6. Save the File.
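
The selection rule inside getBootCampSvcEndpoint can be isolated from the discovery call, which makes the null and empty cases easy to reason about. The sketch below is an illustration only: it uses plain strings in place of EndpointReferenceType, the example URLs are made up, and IndexChooser is a hypothetical stand-in for UserInteraction.getIntegerFromUser.

```java
final class EndpointSelection {
    // Hypothetical stand-in for UserInteraction.getIntegerFromUser.
    interface IndexChooser {
        int choose(String prompt, int min, int max);
    }

    // Returns null when there are no candidates, the only candidate when
    // there is exactly one, and otherwise the one picked by the chooser.
    static String select(String[] candidates, IndexChooser chooser) {
        if (candidates == null || candidates.length == 0) {
            return null;
        }
        if (candidates.length == 1) {
            return candidates[0];
        }
        int last = candidates.length - 1;
        int picked = chooser.choose("Select Service [0.." + last + "]: ", 0, last);
        return candidates[picked];
    }

    public static void main(String[] args) {
        // Made-up addresses for illustration.
        String[] found = { "https://host-a.example/svc", "https://host-b.example/svc" };
        // Always pick the last entry; a real client would prompt the user.
        IndexChooser pickLast = new IndexChooser() {
            public int choose(String prompt, int min, int max) {
                return max;
            }
        };
        System.out.println(select(found, pickLast));
        System.out.println(select(null, pickLast));
    }
}
```

Checking for a null array before reading its length matters because a failed discovery call leaves no array to select from.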

Update the GridClient Class
Add the necessary code to the GridClient class to allow the user to discover the specific BootCamp data and analytical services that this client must use, specify the ProteinSequence domain type for querying, and call the BootCampSvcActions class.

  1. Open GridClient.java.
  2. Add imports.
    import gov.nih.nci.training.BootCamp.domain.ProteinSequence;
    import gov.nih.nci.training.bootcamp.client.BootCampAnalyticalSvcClient;
    
    import gov.nih.nci.cagrid.data.utilities.CQLQueryResultsIterator;
    import java.util.ArrayList;
    
  3. Create a new method for handling our BootCamp service logic.
    public void invokeBootCampServices ()
    {
    }
    
  4. Add code to the method to search for the BootCampDataSvc.
    String indexServiceUrl = props.getProperty(DEFAULT_INDEX_SERVICE_URL_PROP);
    EndpointReferenceType dataServiceEndpoint = BootCampSvcActions.getBootCampSvcEndpoint (indexServiceUrl, "BootCampDataSvc");
    
    if (null == dataServiceEndpoint)
    {
    	System.out.println("ERROR: BootCampDataSvc not found");
    	System.exit(-1);
    }
    
  5. Add code to the method to search for the BootCampAnalyticalSvc.
    EndpointReferenceType analyticalServiceEndpoint = BootCampSvcActions.getBootCampSvcEndpoint (indexServiceUrl, "BootCampAnalyticalSvc");
    
    if (null == analyticalServiceEndpoint)
    {
    	System.out.println("ERROR: BootCampAnalyticalSvc not found");
    	System.exit(-1);
    }
    
  6. Add code to the method to specify the domain objects that we want to retrieve.
    // we are interested in the ProteinSequence object
    String proteinSequenceObject = "gov.nih.nci.training.BootCamp.domain.ProteinSequence";
    
  7. Add code to the method to query the data service, prepare the data for submission to the analytical service, and call our BootCampSvcActions class.
    	GlobusCredential cred;
    	try
    	{
    		// get the stored credential for secure service invocations
    		cred = ProxyUtil.loadProxy(props.getProperty(DEFAULT_PROXY_FILENAME));
    
    		CQLQueryResults results = DataServiceActions.invokeSecureService (dataServiceEndpoint, cred, proteinSequenceObject);
    
    		// IMPORTANT: we are specifying the wsdd of our
    		// analytical service in order to configure
    		// the correct deserializer in order to
    		// work with the objects, not XML
    		CQLQueryResultsIterator iter = new CQLQueryResultsIterator (results, BootCampAnalyticalSvcClient.class.getResourceAsStream("client-config.wsdd"));
    
    		// create an ArrayList of results
    		ArrayList<ProteinSequence> sequenceList = new ArrayList<ProteinSequence>();
    		int maxSequenceSize = 0;
    		while (iter.hasNext())
    		{
    			ProteinSequence singleResult = (ProteinSequence) iter.next();
    			sequenceList.add(singleResult);
    			if (singleResult.getMolecularWeightInDaltons().intValue() > maxSequenceSize)
    			{
    				maxSequenceSize = singleResult.getMolecularWeightInDaltons().intValue();
    			}
    		}
    
    		// convert ArrayList to ProteinSequence[] as
    		// required by analytical service
    		ProteinSequence[] unfilteredSequences = new ProteinSequence[sequenceList.size()];
    		unfilteredSequences = sequenceList.toArray(unfilteredSequences);
    		int unfilteredCount = sequenceList.size();
    
    		// Get filter value
    		String message = "Provide minimum molecular weight (in Daltons)";
    		int molecularWeightInDaltonsMin = UserInteraction.getIntegerFromUser(message, 0, maxSequenceSize );
    
    		// call the BootCampAnalyticalSvc
    		String analyticalServiceUrl = analyticalServiceEndpoint.getAddress().toString();
    		ProteinSequence[] filteredSequences = BootCampSvcActions.filterLowMolecularWeightProteinSequences(analyticalServiceUrl,
    				unfilteredSequences, molecularWeightInDaltonsMin);
    
    		int keptCount = filteredSequences.length;
    
    		System.out.println("Filtered Protein Sequences: Kept " + keptCount + " of " + unfilteredCount);
    		for (int i=0; i < filteredSequences.length; i++)
    		{
    			ProteinSequence ps = filteredSequences[i];
    
    			System.out.println("ID=" + ps.getId());
    			System.out.println("  LENGTH=" + ps.getLength());
    			System.out.println("  CHECKSUM=" + ps.getChecksum());
    			System.out.println("  MOLECULAR_WEIGHT_IN_DALTONS=" + ps.getMolecularWeightInDaltons());
    			System.out.println("  SEQUENCE_IN_FASTA_FORMAT=" + ps.getSequenceInFastaFormat());
    
    		}
    
    	}
    	catch (Exception e)
    	{
    		//TODO: Display appropriate client error
    		e.printStackTrace();
    	}
    
  8. In the main method, add the call to the invokeBootCampServices method where we evaluate the user input of the value 7.
    client.invokeBootCampServices();
    
  9. Save the file.
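
Note that filterLowMolecularWeightProteinSequences returns null when the remote call fails, so reading the length of its result can raise a NullPointerException. A minimal sketch of a guard follows; the class name FilterReport, the helper summarize, and the wording of the error line are hypothetical, and an Object array is used so that the sketch compiles without the BootCamp Jars.

```java
final class FilterReport {
    // Builds the summary line, treating a null result as a failed call.
    static String summarize(Object[] filtered, int unfilteredCount) {
        if (filtered == null) {
            return "ERROR: filtering failed, no result returned";
        }
        return "Filtered Protein Sequences: Kept " + filtered.length + " of " + unfilteredCount;
    }

    public static void main(String[] args) {
        System.out.println(summarize(new String[] { "a", "b" }, 5));
        System.out.println(summarize(null, 5));
    }
}
```

In the GridClient method the same check could be placed directly after the call to BootCampSvcActions, before the loop that prints each sequence.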

Step 3: Build the Application

  1. Build the project
    > cd GRID_CLIENT_HOME
    > ant all
    
  2. Fix all compilation errors.

Step 4: Test the Client

The Ant build file includes a target for running the client. It will create a classpath from the jars included by Ivy and the ext/lib directory.

  1. Open a command prompt.
  2. Change directory to your caGridClient location.
  3. Type ant run. This will build and execute the project.
  4. When prompted, type 7 to invoke the BootCamp services.
  5. You should see the following:
    [java] Searching for 'BootCampDataSvc' services
    [java] Available BootCamp Data Services:
    [java]     0: https://tutorials.training.cagrid.org:8443/wsrf/services/cagrid/BootCampDataSvc
    [java] Searching for 'BootCampAnalyticalSvc' services
    [java]     0: https://tutorials.training.cagrid.org:9443/wsrf/services/cagrid/BootCampAnalyticalSvc
    [java] Querying: gov.nih.nci.training.BootCamp.domain.ProteinSequence
    [java] <ns2:ProteinSequence value="" length="1863" molecularWeightInDaltons="207899" checksum="49673829CCFA756E" sequenceInFastaFormat="MDLSALRVEEVQNVINAMQKILECPICLELIKEPVSTKCDHIFCKF" id="1" xmlns:ns2="gme://caCORE.caCORE/3.2/gov.nih.nci.training.BootCamp.domain"/>
    [java] <ns3:ProteinSequence value="" length="1863" molecularWeightInDaltons="207954" checksum="D2C2569CEB3EA83B" sequenceInFastaFormat="MDLSAVRVEEVQNVINAMQKILECPICLELIKEPVSTKCDHIFCKF" id="2" xmlns:ns3="gme://caCORE.caCORE/3.2/gov.nih.nci.training.BootCamp.domain"/>
    ...
    [java] <ns14:ProteinSequence value="" length="1863" molecularWeightInDaltons="207761" checksum="6B5D0B281A9DBE8A" sequenceInFastaFormat="MDLSALRVEEVQNVINAMQKILECPICLELIKEPVSTKCDHIFCKF" id="13" xmlns:ns14="gme://caCORE.caCORE/3.2/gov.nih.nci.training.BootCamp.domain"/>
    [java] Provide minimum molecular weight (in Daltons) [0..384225] : 
    300000
    [java] Filtered Protein Sequences: Kept 4 of 13
    [java] ID=6
    [java]   LENGTH=3343
    [java]   CHECKSUM=653DB110D2302A8D
    [java]   MOLECULAR_WEIGHT_IN_DALTONS=372216
    [java]   SEQUENCE_IN_FASTA_FORMAT=MTVEYKRRPTFWEIFKARCSTADLGPISLNWFEELFSEAPPYNTEH
    [java] ID=10
    [java]   LENGTH=3372
    [java]   CHECKSUM=37F23DA23CA94665
    [java]   MOLECULAR_WEIGHT_IN_DALTONS=377345
    [java]   SEQUENCE_IN_FASTA_FORMAT=MPIGCKERPTFFEIFRTRCNKADLGPISLNWFEELCLEAPPYNSEP
    

Conclusion


In this section we implemented client functionality that is tightly coupled to the BootCamp Data and Analytical services. To do this, we added several Jar files from the BootCamp Analytical Service to the project. These Jars supply the data types from the BootCamp domain model and the BootCamp Analytical Service client that we used to invoke the analytical service.

Last edited by Saba Bokhari (365 days ago)