Troubleshoot Index Service Registration
Steps Necessary for Success
For successful registration, there are two processes that need to work:
- (Registration) The service must successfully "register" to the Index Service; the service must be able to connect to the Index Service. This simply means the service will show up in the list of services registered, but does not mean it will be "discoverable."
- (Aggregation) The service's metadata must be able to be retrieved and aggregated by the Index Service; the Index Service must be able to connect to the service.
Registration Problems
- The service in question is not actually initialized and running.
- The hosting machine's system clock is significantly off.
Aggregation Problems
- The container is not configured to run with a public IP or externally resolvable domain name.
- The service is running behind a firewall.
- The service or container is running with untrusted credentials.
- The service is not providing ServiceMetadata, or does not specify that it should be registered to the Index Service.
Diagnosing the Problem
Step 1: Turn on debug logging and monitor the log file.
In this example, the FederatedQueryProcessor service is registering to the Index Service running at cagrid01.bmi.ohio-state.edu. We can also see which metadata is supposed to be aggregated at the Index Service from this line:
<ns9:ResourcePropertyNames>ns7:ServiceMetadata</ns9:ResourcePropertyNames>
That line shows us that the caGrid standard common metadata (ServiceMetadata) is going to be aggregated. By turning on this debugging information, we can determine:
- What URL the service is trying to register
- Which Index Service it is registering to
- What metadata it is registering
- Whether or not the "registration" was successful
Verify the URL is correct by entering the URL in your web browser. You should see a result similar to the following:
"Hi there, this is an AXIS service!"
If you do not see this, then the service may be experiencing a registration issue. Check that your container is started and that your system clock is correct.
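The browser check above can also be scripted. The following is a minimal Python sketch, assuming the service URL is reachable over plain HTTP from the machine running the script; the function names are hypothetical and not part of caGrid or Globus.

```python
import urllib.request

# Greeting text quoted in this guide for a running service endpoint.
AXIS_GREETING = "Hi there, this is an AXIS service!"


def looks_like_axis_service(body: str) -> bool:
    """Return True if the page body contains the expected greeting."""
    return AXIS_GREETING in body


def fetch_body(url: str, timeout: float = 10.0) -> str:
    """Fetch the URL and return the response body as text.

    Raises urllib.error.URLError (or a subclass) if the request fails,
    which is itself a sign that the container may not be running.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

To use it, pass the URL from your log output to fetch_body and hand the result to looks_like_axis_service.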
It is vital that your server have the correct system time to register with the Index Service. Please double-check that your system time is accurate. We strongly recommend that you use NTP to keep your clock accurate.
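As a rough sanity check on the clock (not a substitute for NTP), you can compare the local time with the Date header returned by any HTTP server you trust. This Python sketch uses hypothetical function names, and the reference URL you pass in is your own choice; what counts as "significantly off" is not defined here.

```python
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def clock_skew_seconds(http_date: str, local_now: datetime) -> float:
    """Return local time minus remote time, in seconds.

    http_date is the value of an HTTP Date header, for example
    "Tue, 15 Nov 1994 08:12:31 GMT". local_now must be timezone-aware.
    A positive result means the local clock is ahead.
    """
    remote = parsedate_to_datetime(http_date)
    return (local_now - remote).total_seconds()


def remote_date_header(url: str, timeout: float = 10.0):
    """Return the Date header sent by the server at url, or None."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.headers.get("Date")
```

Network latency adds a small error, so treat the result as an approximation only.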
In the example output above, we can see the service registration won't "work" (the Index Service won't have its metadata, and therefore the DiscoveryClient won't be able to discover it) because the service is running in a container that is configured to run as "localhost". The following section will correct this.
Step 2: Verify your Grid service is registering with the Index service
Query the index service and check the output:
Then inspect the file named "all_grid_services.txt". Check that your Grid service URL is listed.
This step verified that your Grid service is advertising to the Index service. However, you need to check the next step to make sure the Index service can retrieve your service's metadata.
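Checking the listing by eye is error-prone when many services are registered. The following Python sketch searches the output file for your service URL. It assumes only that the URL appears somewhere on a line of "all_grid_services.txt"; the exact layout of that file is not assumed, and the function names are hypothetical.

```python
def url_in_lines(lines, service_url: str) -> bool:
    """Return True if service_url appears in any of the given lines.

    This is a plain substring match, so a URL that is a prefix of
    another registered URL will also match.
    """
    wanted = service_url.strip().rstrip("/")
    return any(wanted in line for line in lines)


def service_is_listed(listing_path: str, service_url: str) -> bool:
    """Search the listing file (e.g. all_grid_services.txt) for the URL."""
    with open(listing_path, encoding="utf-8", errors="replace") as listing:
        return url_in_lines(listing, service_url)
```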
Step 3: Using the caBIG Portal to diagnose your service
Next, please go to the caBIG portal to check the status of your service: Production portal service diagnostics page
Enter your Grid service URL in the "Service Diagnostics" pane and click "Diagnose."
If you are using the Training Grid, go to Training portal service diagnostics page.
The results will be similar to the following:
Index
If the result is not "Service found in the INDEX":
Make sure you are advertising to the correct Index service
Check your container for the <SERVICE NAME>_registration.xml file. Check the Appendix for the location of that file.
Here is a snippet from that file that indicates the service is registering to the Training Grid index service:
<ServiceGroupEPR>
  <wsa:Address>http://index.training.cagrid.org:8080/wsrf/services/DefaultIndexService</wsa:Address>
</ServiceGroupEPR>
Make sure you are registering to the correct Index service. That is, make sure the URL here matches the URL for the Index service in your target Grid (caBIG Production, Training Grid, etc.).
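To read the registered Index Service URL without opening the file by hand, a small Python sketch like the one below can help. It assumes only the ServiceGroupEPR and Address element names shown in the snippet above and ignores XML namespaces, because the namespace declarations are not shown here. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def index_service_addresses(registration_xml: str) -> list:
    """Return every Address found inside a ServiceGroupEPR element."""
    root = ET.fromstring(registration_xml)
    addresses = []
    for group in root.iter():
        if local_name(group.tag) != "ServiceGroupEPR":
            continue
        for child in group.iter():
            if local_name(child.tag) == "Address" and child.text:
                addresses.append(child.text.strip())
    return addresses
```

Compare the returned address with the Index Service URL of your target Grid.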
If you need to update it, simply edit the file and restart the container. Verify that you are registering with the appropriate Index service by checking your container's log file. You should see something similar to the following:
Also update that URL in your service's deploy.properties file so if you re-deploy it, it will point to the correct index service.
Then restart the container and check again with the portal diagnostics page. Please wait about 30 minutes to see your service register properly with the Index service.
Make sure your container is publishing the right host name
Your service must register with a publicly accessible address or DNS-resolvable host name, so the Index Service (and other clients) can connect to it. The caGrid Installer allows you to specify your container's host name and port, but if you used the incorrect settings, or didn't use the installer, this section details how to change those settings.
The caGrid advertisement code uses Globus's ServiceHost API to determine the container's host name and port information. The host name of the container, unless otherwise specified, will be the hosting machine's IP address or primary name. This behavior is controllable by parameters in the global configuration settings for Globus.
Users of the Globus container should edit the file $GLOBUS_LOCATION/etc/globus_wsrf_core/server-config.wsdd to add a logicalHost parameter with a value of the IP or host name they wish to use (which should be publicly accessible). If you have already deployed your service, then you need to modify this server-config.wsdd inside your container (the file location is specified in the Appendix).
For example:
...
<globalConfiguration>
...
<parameter name="sendXsiTypes" value="true"/>
<parameter name="logicalHost" value="somehost.cagrid.org"/>
<parameter name="publishHostName" value="true"/>
...
NOTE: Add the two lines for the parameters named "logicalHost" and "publishHostName" just after the "sendXsiTypes" parameter line, which is already in your configuration file.
The settings above will cause the container's services to advertise with the host somehost.cagrid.org (where somehost.cagrid.org is your externally visible DNS entry or IP value) regardless of what the primary host name of the machine is.
Tomcat users should also edit the deployed version of this file (which is what is used by Tomcat), in the same fashion, in the location $CATALINA_HOME/webapps/wsrf/WEB-INF/etc/globus_wsrf_core/server-config.wsdd.
JBoss users should edit the file at $JBOSS_HOME/server/default/deploy/wsrf.war/WEB-INF/etc/globus_wsrf_core/server-config.wsdd as specified above.
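After editing server-config.wsdd, you may want to confirm what the file actually contains. The Python sketch below reads the parameter elements under globalConfiguration and compares them with the example settings in this section. It ignores XML namespaces, and the function names and the wording of the notes are hypothetical.

```python
import xml.etree.ElementTree as ET


def global_parameters(wsdd_xml: str) -> dict:
    """Return name -> value for parameters under globalConfiguration."""
    root = ET.fromstring(wsdd_xml)
    params = {}
    for element in root.iter():
        if element.tag.rsplit("}", 1)[-1] != "globalConfiguration":
            continue
        for child in element:
            if child.tag.rsplit("}", 1)[-1] == "parameter":
                params[child.get("name")] = child.get("value")
    return params


def host_name_notes(params: dict) -> list:
    """List differences from the example settings in this section."""
    notes = []
    host = params.get("logicalHost")
    if not host:
        notes.append("logicalHost is not set; the container falls back "
                     "to the machine's IP address or primary name")
    elif host in ("localhost", "127.0.0.1"):
        notes.append("logicalHost is not externally resolvable: " + host)
    if params.get("publishHostName") != "true":
        notes.append("publishHostName is not set to true")
    return notes
```

An empty list from host_name_notes means the file matches the example; it does not prove the host name is resolvable from outside your network.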
Advertise the correct protocol and port
Tomcat users who did not use the caGrid Installer may encounter a known Globus bug that causes the wrong port and/or protocol to be advertised. You can work around this issue by setting the parameters "defaultPort" and "defaultProtocol" in Tomcat's web.xml to match the server.xml settings.
For example, if you are using Tomcat on port 8080 and using http, edit web.xml appropriately:
<servlet>
  <servlet-name>WSRFServlet</servlet-name>
  <display-name>WSRF Container Servlet</display-name>
  <servlet-class>org.globus.wsrf.container.AxisServlet</servlet-class>
  <init-param>
    <param-name>defaultProtocol</param-name>
    <param-value>http</param-value>
  </init-param>
  <init-param>
    <param-name>defaultPort</param-name>
    <param-value>8080</param-value>
  </init-param>
  <load-on-startup>true</load-on-startup>
</servlet>
NOTE: Depending on the URL that was first used to access the service, the container (mostly Tomcat and JBoss) sometimes caches the host name used in that URL. This may cause issues when the container tries to advertise itself under a different name (via proxy settings in the server.xml file). We recommend clearing the cache on the server.
For Tomcat, these folders are the work and temp folders in $CATALINA_HOME.
For JBoss, the temporary folders are the tmp and work folders in $JBOSS_HOME/server/default: http://docs.jboss.org/jbossas/getting_started/v4/html/tour.html
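To confirm that the defaultProtocol and defaultPort values in web.xml match what you expect from server.xml, a short Python sketch such as the following can read them back. It assumes the WSRFServlet servlet name used in this section and ignores XML namespaces; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def _local(tag: str) -> str:
    """Strip any '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def servlet_init_params(web_xml: str, servlet_name: str = "WSRFServlet") -> dict:
    """Return the init-param name -> value map for the named servlet."""
    root = ET.fromstring(web_xml)
    for servlet in root.iter():
        if _local(servlet.tag) != "servlet":
            continue
        names = [(c.text or "").strip() for c in servlet
                 if _local(c.tag) == "servlet-name"]
        if servlet_name not in names:
            continue
        params = {}
        for init in servlet:
            if _local(init.tag) != "init-param":
                continue
            fields = {_local(c.tag): (c.text or "").strip() for c in init}
            params[fields.get("param-name")] = fields.get("param-value")
        return params
    return {}


def advertised_endpoint_matches(params: dict, protocol: str, port: int) -> bool:
    """Compare web.xml defaults with the protocol and port in server.xml."""
    return (params.get("defaultProtocol") == protocol
            and params.get("defaultPort") == str(port))
```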
Status in Portal
Service is in ACTIVE status
This will show up ACTIVE if the previous checks are successful.
Ping Service
If the result does not say "Successfully pinged service", please check:
- you have the correct Grid Service URL
- there is no firewall or proxy preventing the Grid from contacting your service
- your DNS server is properly configured (you may need to contact your network administrators)
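The three checks above can be approximated from a script. This Python sketch, with hypothetical function names, extracts the host and port from a service URL and attempts a TCP connection, which also exercises DNS resolution. Run it from a machine outside your firewall; a successful connection from inside your network says nothing about external reachability.

```python
import socket
from urllib.parse import urlsplit

# Fallback ports when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def host_and_port(service_url: str):
    """Return (host, port) parsed from the Grid service URL."""
    parts = urlsplit(service_url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.hostname, port


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if the name resolves and a TCP connection opens.

    DNS failures, refused connections, and timeouts all raise OSError
    (or a subclass) and are reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```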
Changing your port when using a web proxy (Apache) in front of your service
You may have a deployment environment where you cannot open up a container port directly through your firewall. If you absolutely cannot open the container port up directly, please contact the caGrid Knowledge Center for additional information. However, the following are some guidelines for what needs to be done.
If you are using Apache or another web server to redirect requests to your Grid service, you need to tell Tomcat about the external port to use. Edit Tomcat's connector settings in server.xml and set the proxyPort to your external port (e.g., 80).
The technical details are as follows:
Your connector would have something like <Connector ... port="<internal tomcat port>" proxyPort="<external proxy port>"...>
The "port" is used at startup time (outside of the context of any invocation) for the initial registration. However, when someone actually connects to your service, the registration code will attempt to renew the registration using the port the client connected on. If you set "proxyPort", Tomcat reports that port regardless of how the client connected.
JBoss configuration: Edit the file $JBOSS_HOME/server/default/deploy/jbossweb-tomcat55.sar/server.xml.
More details on JBoss Tomcat connector options can be found in the jboss docs.
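To review which connectors in server.xml carry a proxyPort, a Python sketch along these lines lists the port and proxyPort attributes of every Connector element. The function name is hypothetical, and the sketch only reports the values; it does not decide which connector your proxy forwards to.

```python
import xml.etree.ElementTree as ET


def connector_ports(server_xml: str) -> list:
    """Return (port, proxyPort) for each Connector in server.xml.

    proxyPort is None for connectors that do not set it.
    """
    root = ET.fromstring(server_xml)
    return [(connector.get("port"), connector.get("proxyPort"))
            for connector in root.iter("Connector")]
```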
If you use a firewall, check that your firewall settings are configured properly.
To repeat this check yourself, enter the service URL in a web browser that is "on the Internet" (meaning outside your firewall).
Service Metadata
If the result is not "Service Metadata retrieved successfully", please follow these instructions to check that your service is properly advertising metadata:
Make sure the service's metadata is accessible
If you've validated the "registration" process is working, there may be an aggregation problem. Preferably from an "external" (outside of your firewall) machine, request the common service metadata from your service.
On Unix-based systems, run this command (replacing your service's URL):
$GLOBUS_LOCATION/bin/wsrf-get-property -a -z none -s <YOUR_SERVICE'S URL> {gme://caGrid.caBIG/1.0/gov.nih.nci.cagrid.metadata}ServiceMetadata
On Windows-based systems, run this command (replacing your service's URL):
%GLOBUS_LOCATION%\bin\wsrf-get-property.bat -a -z none -s <YOUR_SERVICE'S URL> {gme://caGrid.caBIG/1.0/gov.nih.nci.cagrid.metadata}ServiceMetadata
Running this command should display your service's common service metadata. If you get an error, this is likely the error the Index Service is getting.
Domain Model
If your service is a Data Service and you want to check if the domain model for the service is being advertised correctly, run the following command:
On Unix-based systems, run this command (replacing your service's URL):
$GLOBUS_LOCATION/bin/wsrf-get-property -a -z none -s <YOUR_SERVICE'S URL> {gme://caGrid.caBIG/1.0/gov.nih.nci.cagrid.metadata.dataservice}DomainModel
On Windows-based systems, run this command (replacing your service's URL):
%GLOBUS_LOCATION%\bin\wsrf-get-property.bat -a -z none -s <YOUR_SERVICE'S URL> {gme://caGrid.caBIG/1.0/gov.nih.nci.cagrid.metadata.dataservice}DomainModel
Running this command will display your data service's domain model. If you get an error, this is likely the error the Index Service is getting.
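If you run these checks often, the two commands above can be wrapped in a script. The Python sketch below builds the same command lines, using only the flags shown in this section, and runs them with subprocess. The function names are hypothetical, and the GLOBUS_LOCATION value is passed in explicitly rather than read from the environment.

```python
import subprocess

# Resource property names copied from the commands above.
SERVICE_METADATA = "{gme://caGrid.caBIG/1.0/gov.nih.nci.cagrid.metadata}ServiceMetadata"
DOMAIN_MODEL = "{gme://caGrid.caBIG/1.0/gov.nih.nci.cagrid.metadata.dataservice}DomainModel"


def get_property_command(globus_location: str, service_url: str,
                         qname: str, windows: bool = False) -> list:
    """Build the wsrf-get-property command line as an argument list."""
    tool = "wsrf-get-property.bat" if windows else "wsrf-get-property"
    # Use the separator of the target platform, not of the host OS.
    separator = "\\" if windows else "/"
    executable = separator.join([globus_location.rstrip("/\\"), "bin", tool])
    return [executable, "-a", "-z", "none", "-s", service_url, qname]


def run_get_property(command: list):
    """Run the command and return (exit code, combined output)."""
    result = subprocess.run(command, capture_output=True, text=True)
    return result.returncode, result.stdout + result.stderr
```

Passing the command as a list avoids shell quoting problems with the braces in the property name.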
Step 4: Verify that your host is reachable from outside your institution
As stated before, registration is a two-phase process. Your service registers itself with the Index Service, and then the Index Service connects to your service to aggregate the service metadata. This aggregation step requires that your service be reachable from the Internet.
Please test your service endpoint with a tool on the web that will verify that it is routable from outside your network.
External Web Site Test Tools:
http://validator.w3.org/checklink
http://www.websiteoptimization.com/services/analyze/
Step 5: Check Container Logs
Check the logs of your container to identify the specific problem. Here are the locations of the log file:
- Tomcat: $CATALINA_HOME/logs/catalina.out
- JBoss: $JBOSS_HOME/server/default/log
Additional Steps
Peruse the grid service deployment checklist: Grid Service Deployment Checklist
Everything seems ok, but it's still not working
If you can't figure out the problem, contact the caBIG caGrid Knowledge Center.
Appendix
Container File Locations
server.xml
Notes: container startup information, including host and port to listen on
- Tomcat: $CATALINA_HOME/conf/server.xml
- JBoss: $JBOSS_HOME/server/default/deploy/jbossweb-tomcat55.sar/server.xml
web.xml
Notes: port that is advertised to the Index service. Must match your port in server.xml
- Tomcat: $CATALINA_HOME/webapps/wsrf/WEB-INF/web.xml
- JBoss: $JBOSS_HOME/server/default/deploy/wsrf.war/WEB-INF/web.xml
server-config.wsdd
Notes: hostname that you publish to the index service
- Tomcat: $CATALINA_HOME/webapps/wsrf/WEB-INF/etc/globus_wsrf_core/server-config.wsdd
- JBoss: $JBOSS_HOME/server/default/deploy/wsrf.war/WEB-INF/etc/globus_wsrf_core/server-config.wsdd
jndi-config.xml
Notes: performRegistration parameter to determine whether to advertise to Index service
- Tomcat: $CATALINA_HOME/webapps/wsrf/WEB-INF/etc/cagrid_<SERVICE NAME>/jndi-config.xml
- JBoss: $JBOSS_HOME/server/default/deploy/wsrf.war/WEB-INF/etc/cagrid_<SERVICE NAME>/jndi-config.xml
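To read the performRegistration setting from jndi-config.xml, a Python sketch like the following may help. It rests on an assumption not confirmed by this page: that each setting is stored as a parameter element containing name and value child elements. Check your own file before relying on it. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def jndi_parameter(jndi_xml: str, parameter_name: str):
    """Return the value of the named parameter, or None if absent.

    ASSUMPTION: parameters look like
    <parameter><name>...</name><value>...</value></parameter>.
    XML namespaces are ignored.
    """
    root = ET.fromstring(jndi_xml)
    for parameter in root.iter():
        if parameter.tag.rsplit("}", 1)[-1] != "parameter":
            continue
        fields = {child.tag.rsplit("}", 1)[-1]: (child.text or "").strip()
                  for child in parameter}
        if fields.get("name") == parameter_name:
            return fields.get("value")
    return None
```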
<SERVICE NAME>_registration.xml
Notes: the Index service URL to register with
- Tomcat: $CATALINA_HOME/webapps/wsrf/WEB-INF/etc/cagrid_<SERVICE NAME>/<SERVICE NAME>_registration.xml
- JBoss: $JBOSS_HOME/server/default/deploy/wsrf.war/WEB-INF/etc/cagrid_<SERVICE NAME>/<SERVICE NAME>_registration.xml
Other XML Files Related to Service Configuration
- Tomcat: $CATALINA_HOME/webapps/wsrf/WEB-INF/etc/cagrid_<SERVICE NAME>
- JBoss: $JBOSS_HOME/server/default/deploy/wsrf.war/WEB-INF/etc/cagrid_<SERVICE NAME>
Diagnostic Tools
If you are installing your service to a Unix/Linux OS you can use the following Perl script to pull all of your service settings from the specified files for you.
NOTE: This script is built to use the CATALINA_HOME and JBOSS_HOME environment variables. Please make sure they are defined on your system.
- Download the zip file and extract it to a directory that is on your PATH, like $HOME/bin.
- Set execute permissions on the script:
chmod u+x chkUnixContainer.pl
- Run it using one of the following (Note: if your Perl is not in /usr/bin/perl use the 2nd option):
./chkUnixContainer.pl (tomcat|jboss)
perl chkUnixContainer.pl (tomcat|jboss)
- You can easily capture the output and send it to the caBIG caGrid Knowledge Center.