Prefix Authority
Deployment Planning
Prior to deployment, the identifier prefix must be established. This involves determining the URL endpoints for the naming authority and the prefix authority as well as the PURL top-level domain mapping to the naming authority.
It is particularly important to choose an appropriate domain name for the prefix authority (PURL server) and an appropriate PURL top-level domain, because these two components make up the identifier prefix, which is not expected to change after deployment. Identifiers are permanent URIs by definition.
The naming authority URL can change at any time since it is actually hidden or protected by the prefix authority. When such a change occurs, the mapping from PURL domain to naming authority is updated to specify the new endpoint.
The rest of this guide assumes the following domains:
| Component | Value |
|---|---|
| Prefix Authority (PURL Server) End Point | http://identifiers-pa.nci.nih.gov |
| Naming Authority ID (PURL Domain) | production |
| Naming Authority Web App End Point | https://identifiers-na.nci.nih.gov/namingauthority/NamingAuthorityService |
| Naming Authority Grid Service End Point | https://identifiers-na.nci.nih.gov:8443/wsrf/services/cagrid/IdentifiersNAService |
With the settings above, the identifier prefix becomes:
http://identifiers-pa.nci.nih.gov/production/
Example identifier:
http://identifiers-pa.nci.nih.gov/production/7e82e853-c972-4d63-a891-cbe0260316c2
When such an identifier is "followed" (resolved), the prefix authority (PURL server) redirects the client to the naming authority mapped to the production domain for resolution services.
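The way an identifier is composed, and where it is expected to resolve, can be sketched in shell. This is only an illustration using the example values of this guide. The function names are hypothetical, and the sketch assumes that the PURL server appends whatever follows the prefix to the target URL, which is how partial-redirect PURLs generally behave rather than verified output from this deployment.

```shell
#!/bin/sh
# Example values taken from this guide; adjust for your deployment.
PREFIX_AUTHORITY="http://identifiers-pa.nci.nih.gov"
PURL_DOMAIN="production"
NA_TARGET="http://identifiers-na.nci.nih.gov/namingauthority/NamingAuthorityService"

# Identifier prefix = prefix authority URL + PURL domain.
identifier_prefix() {
    printf '%s/%s/' "$PREFIX_AUTHORITY" "$PURL_DOMAIN"
}

# Hypothetical helper: build a full identifier from a local name (e.g. a UUID).
make_identifier() {
    printf '%s%s' "$(identifier_prefix)" "$1"
}

# Hypothetical helper: the URL a partial redirect is ASSUMED to produce,
# i.e. the target URL followed by the part of the identifier after the prefix.
resolution_target() {
    prefix=$(identifier_prefix)
    local_name=${1#"$prefix"}
    printf '%s/%s' "$NA_TARGET" "$local_name"
}

id=$(make_identifier 7e82e853-c972-4d63-a891-cbe0260316c2)
target=$(resolution_target "$id")
echo "$id"
echo "$target"
```

Because only the prefix is fixed, changing NA_TARGET in this sketch leaves every identifier untouched, which mirrors why the naming authority URL can change after deployment.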
Prefix Authority Deployment
PURLZ is the official PURL server by OCLC. It provides a level of indirection that allows the underlying web addresses of resources to change over time without negatively affecting systems that depend on them. This capability provides continuity of references to network resources that may migrate from machine to machine.
caGrid's identifiers framework leverages PURLZ as its prefix authority.
Installation
Download PURLZ: http://persistenturls.googlecode.com/files/PURLZ-Server-1.6.2.jar
Double-click the jar file or run "java -jar PURLZ-Server-1.6.2.jar" from a terminal window to start the installer.
Click Next.
Accept the terms of the license agreement and click Next.
Specify an installation path and click Next.
Enter the host name and port number. Then click Next.
Choose "Use MySQL" and click Next.
Enter MySQL connectivity parameters and click Next.
In controlled environments such as production, it is recommended that a PURL administrator be designated to approve user- and top-level domain registrations. Click Next.
Accept defaults and click Next.
The installer proceeds to complete the installation. Click Next twice and then Done.
Configuration
Server Name
The host name identifiers-pa.nci.nih.gov has to be added to the server configuration before it can be used. Otherwise, the web interface and redirection services would only work when localhost is used in the URL.
Open /Applications/PURL-Server-1.6.1/modules/mod-purl-virtualhost/module.xml for editing and add the host name after localhost as follows (note the ".*" after the host name):
    <export>
      <!-- ***********
           Export all of host address space - note could export multiple hosts here.
           (Note have added localhost so you can test it)
           *********** -->
      <uri>
        <match>jetty://localhost.*</match>
        <match>jetty://identifiers-pa.nci.nih.gov.*</match>
        <!-- Add any other jetty://<servername> matches that you want to match. -->
        <match>ffcpl:/etc/HTTPBridgeConfig.xml</match>
      </uri>
    </export>
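After editing, it can be useful to confirm that the host entry really is in the file before restarting the server. The helper below is a minimal sketch, not part of PURLZ: the function name is hypothetical, and it only looks for the literal match line shown above, so it would miss an entry written with different spacing.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given module.xml exports the host.
# Usage: has_host_match <path to module.xml> <host name>
has_host_match() {
    # -F treats the pattern as a fixed string, so ".*" is matched literally.
    grep -F -q "<match>jetty://$2.*</match>" "$1"
}
```

For example, `has_host_match /Applications/PURL-Server-1.6.1/modules/mod-purl-virtualhost/module.xml identifiers-pa.nci.nih.gov && echo "host exported"` prints a confirmation only when the entry is present.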
Running on Port 80
During installation, port 8080 was entered along with the desired host name. In most setups this is undesirable, because the port number would then have to be part of every identifier.
The difficulty is that PURLZ seems to lack support for running on ports that operating systems such as Linux consider privileged (e.g., 80). Even if the server is started by root, which is undesirable, it exhibits other undocumented run-time issues.
PURLZ uses a jetty server internally, and there is jetty documentation pointing to a solution that allows a runtime user ID to be set after the port has been bound. This would potentially allow the server to be started as root (enabling binding to port 80), after which jetty would switch to the specified runtime user ID. The problem with this solution is that it requires rebuilding jetty from source, which is also undesirable.
Therefore, the recommended approach in this guide is to let PURLZ run on port 8080 (or another non-privileged port) and configure the operating system to redirect port 80 to the PURLZ port. The following configuration has been tested to work on CentOS Linux.
1. Create the file /etc/xinetd.d/http with the following contents:

        service http
        {
            disable        = no
            flags          = REUSE
            socket_type    = stream
            wait           = no
            user           = root
            redirect       = 127.0.0.1 8080
            log_on_failure += USERID
        }

2. Restart xinetd:

        $ /etc/init.d/xinetd restart
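Once xinetd has been restarted and PURLZ is running, the redirect can be checked by comparing what the server answers on its own port with what comes back through port 80. The following is a rough sketch that assumes curl is installed; the function names are hypothetical and 8080 is simply the port used in this guide.

```shell
#!/bin/sh
# Hypothetical helper: print the HTTP status code returned for a URL.
# curl prints 000 when no HTTP response was received at all.
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Succeeds when port 80 answers with the same status as the PURLZ port.
# Usage: ports_agree <host> [purlz port, default 8080]
ports_agree() {
    host=$1
    purlz_port=${2:-8080}
    direct=$(http_status "http://$host:$purlz_port/")
    via80=$(http_status "http://$host/")
    [ "$direct" != "000" ] && [ "$direct" = "$via80" ]
}
```

Running `ports_agree identifiers-pa.nci.nih.gov && echo "port 80 is redirected"` on a machine that can reach the server gives a quick yes/no answer. Matching status codes are only a coarse indication, not proof that both ports serve identical content.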
Startup
The server can be started in the foreground as follows:

    $ cd /Applications/PURLZ-Server-1.6.1/bin
    $ ./start.sh

On MS Windows, use startup.bat instead of start.sh.
To run PURLZ in the background as a service instead, an init script such as the following can be used:

    #!/bin/sh
    #
    # Startup script for PURLZ
    #
    # chkconfig: - 85 15
    # description: PURLZ server
    # processname: purlz
    # pidfile: /var/run/purlz.pid
    # config:
    ##############################################################################

    . /etc/init.d/functions

    JAVA_HOME=/usr/local/java
    PURLZ_HOME=/home/purlz/ext/purlz
    PURLZ_LOG=$PURLZ_HOME/log/console.log
    PURLZ_USER=purlz
    PID_FILE=/var/run/purlz.pid

    case "$1" in
      start)
        # Start PURLZ as the unprivileged user and send console output to the log.
        daemon --pidfile=$PID_FILE --user=$PURLZ_USER $PURLZ_HOME/bin/start.sh &> $PURLZ_HOME/log/console.log &
        chown $PURLZ_USER $PURLZ_HOME/log/console.log
        chgrp $PURLZ_USER $PURLZ_HOME/log/console.log
        chmod 755 $PURLZ_HOME/log/console.log
        exit $?
        ;;
      stop)
        # Stop the process recorded in the PID file.
        PID=`cat $PID_FILE`
        kill $PID
        ;;
      *)
        echo "Usage purlz start/stop"
        exit 1;;
    esac
Once the server is started, verify that it is running by pointing your browser to http://identifiers-pa.nci.nih.gov. The PURLZ web interface should be displayed.
Log on to the server as 'admin' with password 'password' and proceed to change the password. 
Top-Level Domain Creation
A PURL domain is needed to identify the target naming authority. The domain binds the identifier prefix to the naming authority. Therefore, a single prefix authority (PURL server) can serve as the authority for multiple naming authorities by defining a corresponding domain for each.
Following the deployment plan above yields the following mapping:
production => http://identifiers-na.nci.nih.gov/namingauthority/NamingAuthorityService
where production is the PURL domain and the mapping itself is a partial-redirect PURL. The domain has to be created before any PURL can be placed in it.
Log in as 'admin' and click on the Domains tab. Choose Create a new domain from the drop-down menu on the left, and enter the information as seen below. Click Submit to create the domain.
Now create a PURL that will redirect resolution of our identifiers to their corresponding naming authority host.
Click the PURLs tab. Choose Create an advanced PURL from the drop-down menu on the left, and enter the information as seen below. Note that the full Target URL is "http://identifiers-na.nci.nih.gov/namingauthority/NamingAuthorityService". Click Submit to create the PURL.
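After the PURL has been created, the redirect can be exercised from the command line without a browser. The sketch below assumes curl is available. The function names are hypothetical, and the check only confirms that the Location returned by the PURL server begins with the expected target URL. It does not assume a particular redirect status code.

```shell
#!/bin/sh
# Hypothetical helper: print "<status code> <redirect URL>" for an identifier,
# without following the redirect.
show_redirect() {
    curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$1"
}

# Succeeds when the identifier redirects to a URL that begins with the target.
# Usage: redirects_to <identifier URL> <expected target URL>
redirects_to() {
    result=$(show_redirect "$1")
    location=${result#* }   # drop the status code, keep the redirect URL
    case $location in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}
```

For instance, `redirects_to http://identifiers-pa.nci.nih.gov/production/7e82e853-c972-4d63-a891-cbe0260316c2 http://identifiers-na.nci.nih.gov/namingauthority/NamingAuthorityService && echo "PURL is in place"` reports success only if the server answers with a redirect into the naming authority.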





