Part II
Tutorial Example: Creating a Simple Server

 3 Introduction
 4 Basics of Querying Remote Catalogues
 5 Obtaining Example Files
 6 Creating a Server
  6.1 Query string
  6.2 Results returned
  6.3 Installing and testing the server
  6.4 Modifying the server
 7 Creating a Configuration File
  7.1 URL query
  7.2 Installing and testing the configuration file
  7.3 Modifying the configuration file
 8 Carrying On
  8.1 Multiple catalogue servers
  8.2 Providing range queries
  8.3 Linked configuration files
  8.4 Handling queries which return no results

3 Introduction

This part of the document is a tutorial example which describes how to create a simple ACL server. Example files are provided which illustrate the procedure. Firstly, however, the basics of using an ACL server to query a catalogue are reviewed.

4 Basics of Querying Remote Catalogues

The basics of querying a catalogue are that some criterion is specified, all the rows in the catalogue are examined and those which satisfy the criterion are returned as the list of selected rows. Many kinds of criteria are conceivable, for example a limit on the values of a column such as magnitude.

However, a very common sort of search on astronomical catalogues is the so-called ‘circular area search’ or ‘cone search’. Virtually all astronomical catalogues contain celestial coordinates, in practice Right Ascension and Declination for some equinox and epoch. In a circular area search the central coordinates and angular radius are specified. All the objects in the catalogue which are less than this angular radius from the central coordinates are selected. That is, the circular area search finds all the objects in the catalogue within a given circular patch of sky.

It is usual for ACL servers to support a circular area search and often this may be the only type of search provided. The full form of an ACL circular area search is slightly more general, with both an inner and outer radius specified, so that objects inside an annulus rather than a circle are selected. (The ‘traditional’ circular area search corresponds to setting the inner radius to zero.)

The catalogue being accessed will doubtless have other columns as well as the Right Ascension and Declination, and the ACL server may permit ‘range’ searches on some of these columns. In a range search minimum and maximum values are specified for a column and a row is selected if its value for the column falls within the given range. Any range searches specified are combined with each other and with the circular area search using a ‘logical and’. That is, for example, the objects selected would correspond to those which are both in a given area of sky and in a given magnitude range. Though this mechanism allows powerful queries to be made it still provides only a subset of all the conceivable types of queries.
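The way an annular area search and range searches combine can be sketched in a few lines. The following is only an illustration in Python (the example servers supplied with this document are Perl scripts); the row layout, the function names and the use of degrees for the catalogue coordinates are assumptions made for the sketch, as is the choice to treat the radius and range limits as inclusive.

```python
import math

def angular_separation_arcmin(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation between two positions, in minutes of arc."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    # Haversine form, which stays accurate for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 60.0

def select(rows, ra0, dec0, r_inner, r_outer, ranges):
    """Return the rows inside the annulus that also satisfy every range."""
    selected = []
    for row in rows:
        sep = angular_separation_arcmin(ra0, dec0, row["ra"], row["dec"])
        in_annulus = r_inner <= sep <= r_outer
        # All range conditions are combined with a logical 'and'.
        in_ranges = all(lo <= row[col] <= hi for col, (lo, hi) in ranges.items())
        if in_annulus and in_ranges:
            selected.append(row)
    return selected

# Hypothetical catalogue rows: coordinates in degrees, plus a magnitude.
rows = [
    {"id": "A", "ra": 150.00, "dec": 30.00, "mag": 1.5},
    {"id": "B", "ra": 150.00, "dec": 30.02, "mag": 5.0},
    {"id": "C", "ra": 151.00, "dec": 30.00, "mag": 1.5},
]
hits = select(rows, 150.0, 30.0, 0.0, 3.0, {"mag": (1.0, 2.0)})
print([row["id"] for row in hits])
```

Here row B lies inside the 3 arcminute circle but fails the magnitude range, and row C passes the range but lies outside the circle, so only row A is selected.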

Before starting work on an ACL server for a catalogue, you need to decide two things.

(1)
Are circular area searches to be supported (the answer is almost certainly yes)?
(2)
On which, if any, additional columns are range searches to be supported?

In order to implement the server you need to provide two things:

(1)
the server itself,
(2)
an entry for the server in an ACL ‘configuration file’.

An ACL configuration file defines the list of catalogues which a client, such as GAIA or SkyCat, currently knows about. It has an entry for each catalogue. The entry specifies details of the catalogue which the client needs to know: the URL of its server, the types of queries that it supports, the name by which it is to be described to the user etc.

5 Obtaining Example Files

Some simple examples of server scripts and a configuration file are provided with this document. Subsequent sections of the tutorial will describe these files and explain how to install them. You may also find them useful as templates for developing your own servers and configuration files. The files can be obtained in two different ways.

The important files used in the tutorial are:

simpleserver.cgi
a simple server script,
secondserver.cgi
a second, slightly more complicated, server script,
genfield.c
a program to generate the list of stars returned by the servers,
simpleconfig.cfg
a configuration file including the two example servers,
checkcfg
a script for checking configuration files.

File 0README.LIS gives a complete list. You will probably find it useful to print out copies of the files and have them to hand as you work through the examples. Copies of the servers are installed at Edinburgh, so the URLs given in the examples should work, though you will be accessing the Edinburgh versions rather than your own copies.

6 Creating a Server

An ACL server is a CGI script (written in Perl in the examples here) which accepts a query in a standard format, searches the corresponding catalogue to find the objects which satisfy the query and returns them to the remote client. The server simpleserver.cgi supplied with this document is more-or-less the simplest practical server, but it illustrates the structure that such servers usually have. You will find it helpful to have a copy to hand as you read through this section. simpleserver.cgi supports only circular area searches in which a central Right Ascension and Declination and an angular radius are specified. Because it is provided as an example it generates and returns an artificial list of stars, centred on, and scaled to fit, the specified area, rather than searching a real catalogue. The overall structure of simpleserver.cgi is:

obtain the query string
parse the query string to obtain the central coordinates and radius
generate the list of stars to fit in this field
(a real server would search a catalogue instead)
return the results to the client

The comments in the source code should make the detailed working obvious. However, the following notes about the query string and results returned might be useful.

6.1 Query string

(1)
If a CGI gateway is implemented as a Perl script the query string appears in the variable $ENV{'QUERY_STRING'}. In simpleserver.cgi the query is copied into a local variable by the line:

$query = $ENV{'QUERY_STRING'};

(2)
In simpleserver.cgi the query string has the format:

ra=xxx&dec=yyy&radius=zzz

For example:

ra=10:30:00&dec=-30:30.0&radius=3

where ra=, dec= and radius= have their obvious meanings. When you set up the entry for the server in the configuration file (see Section 7, below) you have considerable latitude over this format. However, the one used by simpleserver.cgi is a common one, and is as good as any. simpleserver.cgi supports only circular area searches; if additional types of search were supported then the query string would contain additional parameters.

(3)
The formats and units required for the parameters are as follows:
Right Ascension
sexagesimal or decimal hours,
Declination
sexagesimal or decimal degrees,
Radius
decimal minutes of arc.

If a sexagesimal value is entered then a colon (‘:’) should be used as a separator. The Right Ascension and Declination should be for equinox and epoch J2000. The server must accept values in these units and formats because typically the client will present the same recipe and information to the user when requesting input for any of the various catalogues available to it.
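As an illustration of the parsing step, the sketch below splits a query string of this form and converts sexagesimal values to decimal ones. It is written in Python rather than the Perl of simpleserver.cgi, and the function names are invented for the example; it is not a transcription of the supplied script.

```python
from urllib.parse import parse_qs

def sexagesimal_to_decimal(text):
    """Convert 'dd:mm:ss', 'dd:mm' or a plain decimal string to a float."""
    text = text.strip()
    # Take the sign from the string itself so that '-0:30:00' stays negative.
    sign = -1.0 if text.startswith("-") else 1.0
    parts = text.lstrip("+-").split(":")
    value = 0.0
    for power, part in enumerate(parts):
        value += float(part) / (60.0 ** power)
    return sign * value

def parse_query(query):
    """Return (RA in hours, Declination in degrees, radius in arcminutes)."""
    fields = parse_qs(query)
    ra_hours = sexagesimal_to_decimal(fields["ra"][0])
    dec_degrees = sexagesimal_to_decimal(fields["dec"][0])
    radius_arcmin = float(fields["radius"][0])
    return ra_hours, dec_degrees, radius_arcmin

print(parse_query("ra=10:30:00&dec=-30:30.0&radius=3"))
```

The same function accepts decimal input such as ra=10.0&dec=30.0&radius=3, because a value without colons is treated as a single decimal field.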

6.2 Results returned

(1)
The first line of information returned by the server should always be the MIME type. In simpleserver.cgi it is written by the line:

print "Content-type: text/plain\n\n\n";

The MIME type is not part of the table of results, but rather is used by the client or Web browser to interpret the format of the data which follows.

(2)
The list of selected objects is returned to the client as a stream of ASCII characters. The list is written in the Tab-Separated Table (TST) format (see Section 11).
(3)
The Perl script should simply write the results to standard output (whence it will be automatically forwarded to the remote client). In simpleserver.cgi the lines:

$tst = `$queryExe $ra $dec $radius`;
print "$tst";

invoke the program genfield (the variable $queryExe has previously been set to contain the name and directory specification of the executable for genfield) to generate the star list, copy the list to the variable $tst and then write the contents of $tst to standard output. The command is enclosed in backquotes, which is how Perl captures the output of an external program. (Hint: Perl has several mechanisms for invoking processes and directing their output to standard output; I found the one described to be the most suitable for use in a CGI script.)

(4)
The last line written to standard output should be:

print "[EOD]\n";

The string ‘[EOD]’ informs the client that the server has finished sending data.
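Putting the points above together, a complete reply consists of the MIME type, a blank line, the table and finally the ‘[EOD]’ line. The Python sketch below assembles such a reply; the function name is hypothetical and the table layout is simply modelled on the example output of simpleserver.cgi, so Section 11 remains the reference for the TST format itself.

```python
import sys

def build_response(title, columns, rows):
    """Assemble the MIME header, a tab-separated table and the [EOD] marker."""
    lines = ["Content-type: text/plain", ""]   # MIME type, then a blank line
    lines += [title, ""]
    lines.append("\t".join(columns))           # column names
    lines.append("\t".join("-" * len(name) for name in columns))
    for row in rows:
        lines.append("\t".join(str(value) for value in row))
    lines.append("[EOD]")                      # tells the client the data are complete
    return "\n".join(lines) + "\n"

response = build_response(
    "Example Star Field.",
    ["Id", "RA", "DEC", "mag"],
    [("Star 1", "150.026667", "30.050000", "0.58")],
)
sys.stdout.write(response)
```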

6.3 Installing and testing the server

The procedures for installing CGI scripts vary at different sites; your system manager should be able to advise on arrangements at your site. However, to install simpleserver.cgi you need to follow at least the following steps.

(1)
Copy files simpleserver.cgi and genfield.c to a suitable directory (there are likely to be restrictions on which directories can contain CGI scripts; see your system manager).
(2)
Compile program genfield.c and name the executable genfield. genfield.c is a standard (and simple) C program; any C compiler should be able to handle it.
(3)
Edit file simpleserver.cgi. Locate the line:

$queryExe = "/star/examples/ssn75/genfield";

which is towards the top of the script and change it to correspond to whichever directory you have put the files in.

(4)
Check with your system manager whether there are any other local requirements for running CGI scripts.
(5)
You are now ready to test the server. The tests are best conducted from a normal Web browser, such as Netscape. Start the browser and enter a URL similar to:
  http://www.roe.ac.uk/~acd/cgi-bin/simpleserver.cgi?ra=10.0&dec=30.0&radius=3

This string comprises the normal URL for the CGI script, followed by a ‘?’, followed by the query. This format is the normal syntax for invoking CGI scripts and passing queries to them. To invoke your version of the server you would substitute the appropriate URL in the example above. Typing in the example exactly as given should invoke a version of the server running in Edinburgh. If all is working correctly the browser should display a table similar to the one in Figure 1. If the server fails then your Web server probably maintains error logs which might contain useful diagnostics; your system manager should know where these logs are kept.


  
  Example Star Field.
  
  #column-units:          DEGREES         DEGREES         Magnitudes
  #column-types:   CHAR*8         DOUBLE  DOUBLE  REAL
  #column-formats: A8     F12.6   F12.6   F6.2
  
  Id      RA      DEC     mag
  --      --      ---     ---
  Star 1  150.026667      30.050000       0.580000
  Star 2  149.968333      29.961667       0.120000
  Star 3  149.983333      30.041667       1.640000
  Star 4  149.993333      30.005000       2.210000
  Star 5  150.000000      30.000000       1.700000
  Star 6  150.006667      29.996667       1.790000
  Star 7  149.983333      29.993333       3.380000
  Star 8  150.015000      29.951667       4.130000
  [EOD]

Figure 1: The table of values returned by the ACL server simpleserver.cgi.
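When testing it can be convenient to check the returned text with a small program rather than by eye. The following Python sketch reads a reply of the kind shown in Figure 1 and extracts the column names and rows. It assumes that fields are separated by tab characters and that the header ends at the line consisting only of dashes and tabs; the function name is invented for the example.

```python
def parse_tst(text):
    """Split a returned table into column names and rows of strings."""
    lines = text.splitlines()
    # The header ends at the line made up only of dashes and tabs.
    dash_index = next(i for i, line in enumerate(lines)
                      if line.strip() and set(line) <= set("-\t"))
    columns = lines[dash_index - 1].split("\t")
    rows = []
    for line in lines[dash_index + 1:]:
        if line.strip() == "[EOD]":      # end-of-data marker
            break
        if line.strip():
            rows.append(line.split("\t"))
    return columns, rows

reply = (
    "Example Star Field.\n"
    "\n"
    "Id\tRA\tDEC\tmag\n"
    "--\t--\t---\t---\n"
    "Star 1\t150.026667\t30.050000\t0.580000\n"
    "Star 2\t149.968333\t29.961667\t0.120000\n"
    "[EOD]\n"
)
columns, rows = parse_tst(reply)
print(columns)
print(len(rows), "rows; first:", rows[0])
```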


6.4 Modifying the server

Other servers are likely to be similar to simpleserver.cgi. To modify it to access your own catalogue you would replace the invocation of program genfield with an invocation of a program which searched your catalogue or DBMS.

simpleserver.cgi is more-or-less the simplest practical server, and is provided as an example. Additional useful features in a server include: copying the queries to a log file (so that you can monitor usage) and copying error messages to a second log file (as diagnostics in case of misadventure). Another server, secondserver.cgi, which incorporates these features, is supplied with this document. To install it, follow the same procedure as for simpleserver.cgi, except that the lines towards the top of the script which need to be modified are:

$queryExe = "/star/examples/ssn75/genfield";
$logDir = "/star/examples/ssn75/examplelogs";

When developing a server it is often useful to comment out the line:

       $query = $ENV{'QUERY_STRING'};

and un-comment the line:

  #    $query = "ra=10:30:00&dec=-30:30.0&radius=3";

so that the script has a query ‘hard-wired’. It can now be run directly from the command line rather than being invoked via a Web browser. This trick often makes debugging scripts a lot easier.
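A variation on the same trick, sketched here in Python rather than Perl, is to fall back to a fixed test query whenever the environment variable is not set, so that no lines need to be commented in and out. The names DEFAULT_QUERY and get_query are hypothetical, and this is not how the supplied scripts are written.

```python
import os

# Test query used when the script is run outside the Web server.
DEFAULT_QUERY = "ra=10:30:00&dec=-30:30.0&radius=3"

def get_query(environ=os.environ):
    """Return the CGI query string, or the fixed test query if none is set."""
    return environ.get("QUERY_STRING") or DEFAULT_QUERY

print(get_query({}))                                             # command-line run
print(get_query({"QUERY_STRING": "ra=10.0&dec=30.0&radius=3"}))  # run as a CGI script
```

One consequence of this design is that a genuinely empty query from a client would also be replaced by the test query, so a fallback of this kind is probably best removed once development is finished.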

Both the query and error log files written by secondserver.cgi are simple text files. Note, however, that file query.TXT, which is supplied with the examples, allows the query log to be accessed by CURSA[3] as an STL (Small Text List) format catalogue. Once the query log becomes large you might find it more convenient to examine it with the CURSA catalogue browser xcatview rather than Unix commands such as cat and more.

7 Creating a Configuration File

The configuration file used by GAIA etc. mediates interaction between the client and server. It is an ASCII text file containing details for each of a list of catalogues. GAIA (or whatever) reads the file and the catalogues listed in it become the ones that GAIA knows about. The details supplied for each catalogue are things like: its URL, the type of queries supported, the name that will be used to describe it to users etc. The entry for a typical simple catalogue looks something like:

  serv_type:      catalog
  long_name:      Simple example server.
  short_name:     simple@roe
  url:            http://www.roe.ac.uk/~acd/cgi-bin/simpleserver.cgi?ra=...
  symbol:         mag circle 3

This entry is taken from simpleconfig.cfg, the example configuration file supplied with this document, though the url entry has been truncated. By convention configuration files have file type ‘.cfg’. The purposes of the various items are as follows.

serv_type:
is the type of the server. For a straightforward catalogue the value required is ‘catalog’. Other values are possible, though you will probably rarely use them.
long_name:
a one-line name or short description of the catalogue. It will be presented to the user to allow him to identify the catalogue.
short_name:
a short name for the catalogue. Conventionally it has the form:

catalogue@institution

where catalogue is an abbreviation for the catalogue and institution a standardised abbreviation for the institution where the on-line version is located. By convention institution has three or four characters.

url:
the URL used to access the server. Following the usual conventions for a CGI gateway it consists of the URL corresponding to the script which constitutes the server, followed by a ‘?’ and then a string defining the query passed to the server (see Section 7.1).
symbol:
specifies how objects are to be plotted (see Section 10.6).

There are various other optional items which can be included. They are described in Section 10.
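To make the structure of an entry concrete, the Python sketch below reads ‘keyword: value’ lines into a dictionary per entry. It is only an illustration and not the parser used by GAIA or by checkcfg; in particular it assumes that a serv_type line opens each new entry. The url value is left truncated exactly as in the example above.

```python
def parse_config(text):
    """Parse 'keyword: value' lines into a list of entry dictionaries."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split at the first colon only, because URLs contain colons too.
        keyword, _, value = line.partition(":")
        keyword, value = keyword.strip(), value.strip()
        if keyword == "serv_type":     # assumed to open a new entry
            entries.append({})
        if entries:
            entries[-1][keyword] = value
    return entries

example = """
serv_type:      catalog
long_name:      Simple example server.
short_name:     simple@roe
url:            http://www.roe.ac.uk/~acd/cgi-bin/simpleserver.cgi?ra=...
symbol:         mag circle 3
"""
entry = parse_config(example)[0]
print(entry["short_name"], "->", entry["url"])
```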

7.1 URL query

The string appended to the server URL in the configuration file and which defines the type of queries supported by the catalogue has a format something like:

ra=%ra&dec=%dec&radius=%r2

It consists of simple characters and ‘tokens’. The tokens start with a ‘%’ character. When GAIA makes a query the tokens are replaced with values which correspond to the individual query and the resulting string is sent to the server. For example, tokens in the above string might be substituted to yield:

  ra=10:30:00&dec=-30:30.0&radius=3

Obviously the format of the query string appended to the URL in the configuration file must correspond to that expected by the server. Various standard tokens can be included in the query string. Some common ones are:

%ra
Right Ascension,
%dec
Declination,
%r1
inner radius,
%r2
outer radius,
%n
maximum number of objects to return.

For a complete list see Section 10.
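The substitution performed by the client can be pictured with the following Python sketch. It is an illustration of the idea only, not GAIA's implementation, and the function name fill_tokens is invented for the example.

```python
import re

def fill_tokens(template, values):
    """Replace each %token in the template with its value for this query."""
    # Longest names first, so a short token never matches inside a longer one.
    names = sorted(values, key=len, reverse=True)
    pattern = re.compile("%(" + "|".join(re.escape(name) for name in names) + ")")
    return pattern.sub(lambda match: str(values[match.group(1)]), template)

template = "ra=%ra&dec=%dec&radius=%r2"
query = fill_tokens(template, {"ra": "10:30:00", "dec": "-30:30.0", "r2": 3})
print(query)
```

Only the tokens are replaced; the parameter names ra=, dec= and radius= are ordinary characters and pass through unchanged, which is why they can be chosen to suit the server.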

7.2 Installing and testing the configuration file

Installing and testing the configuration file should be quite straightforward. The servers simpleserver.cgi and secondserver.cgi should be installed (see Section 6, above). Then proceed as follows.

(1)
Edit the configuration file and change the URLs for the servers simple@roe and second@roe to correspond to wherever you have installed the servers.
(2)
A Perl script to check a configuration file for errors is included with the example files for this document. To check that you have not introduced any errors whilst editing the example configuration file type:

/star/examples/ssn75/checkcfg   simpleconfig.cfg

(If the example files are not in their standard location on a Starlink system then obviously you need to alter the directory specification accordingly. Also, on non-Starlink systems you might need to edit the first line of checkcfg to correspond to wherever Perl is installed on your system.) If the configuration file is valid then checkcfg will report:

Configuration file parsed successfully.

Conversely, if it contains errors then messages describing the problems will be reported. checkcfg is described in Section 10.7.

(3)
Start GAIA and test the server. To import the configuration file into GAIA click on the Data-Servers button on the right hand side of the main menu bar and choose the Browse Catalog Directories option. A window showing the catalogues available should appear. Click on the File button at the left of its top menu bar and choose the Load Config file… option. A window allowing you to select the required file should then appear.

Once the configuration file has been loaded you can choose from amongst its catalogues and make selections in the normal fashion. If you make a selection from the simple example server a list of objects similar to Figure 1 should be returned.

7.3 Modifying the configuration file

When you create a new server you need to create an entry for it in your configuration file. If the server is just a simple variation of simpleserver.cgi or secondserver.cgi and only supports circular area queries then just duplicate the entry for simple@roe and change long_name, short_name and url to correspond to your server.

Additional modifications can be made as required. The following section gives some examples and the options are documented in Section 10. Remember that script checkcfg (see Section 10.7) is available for checking configuration files.

8 Carrying On

This section introduces a few additional features that are often required in servers and configuration files.

8.1 Multiple catalogue servers

simpleserver.cgi can access only a single catalogue. Often you may want to write a server which can access each of several catalogues. In this case your configuration file must contain an entry for each catalogue, not each server. An additional parameter, whose value identifies the catalogue, is added to the query. The server parses this parameter to identify the catalogue required. You can decide the syntax and values of this parameter, though the configuration file and server must agree.

For example, suppose that you are writing a server which will provide access to the SAO, PPM and Hipparcos astrometric catalogues. You might invent a parameter called ‘catalogue’ whose value identifies the catalogue. Your configuration file would then have three entries like:

  serv_type:  catalog
  long_name:  SAO (Smithsonian Astrophysical Observatory) catalog
  short_name: sao@roe
  url:        http://www.roe.ac.uk/~acd/cgi-bin/ast.cgi?catalogue=sao&ra=%ra&dec=...
  symbol:     mag square 3
  
  serv_type:  catalog
  long_name:  PPM (Positions and Proper Motions) catalogue
  short_name: ppm@roe
  url:        http://www.roe.ac.uk/~acd/cgi-bin/ast.cgi?catalogue=ppm&ra=%ra&dec=...
  symbol:     mag square 3
  
  serv_type:  catalog
  long_name:  Hipparcos catalogue
  short_name: hip@roe
  url:        http://www.roe.ac.uk/~acd/cgi-bin/ast.cgi?catalogue=hipparcos&ra=%ra&dec=...
  symbol:     mag square 3

The server would be written to parse the value of catalogue and then search the catalogue indicated.
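One way to organise such a server is a lookup table from the value of catalogue to the routine that searches the corresponding catalogue. The Python sketch below shows the idea; the three search functions are hypothetical stand-ins which merely describe the search they would perform.

```python
from urllib.parse import parse_qs

# Hypothetical search routines, one per catalogue; real ones would query data.
def search_sao(ra, dec, radius):
    return "SAO search at %s %s r=%s" % (ra, dec, radius)

def search_ppm(ra, dec, radius):
    return "PPM search at %s %s r=%s" % (ra, dec, radius)

def search_hipparcos(ra, dec, radius):
    return "Hipparcos search at %s %s r=%s" % (ra, dec, radius)

CATALOGUES = {"sao": search_sao, "ppm": search_ppm, "hipparcos": search_hipparcos}

def handle(query):
    """Pick the search routine named by the 'catalogue' parameter and run it."""
    fields = {key: values[0] for key, values in parse_qs(query).items()}
    name = fields.get("catalogue", "")
    if name not in CATALOGUES:
        raise ValueError("unknown catalogue: %r" % name)
    return CATALOGUES[name](fields["ra"], fields["dec"], fields["radius"])

print(handle("catalogue=ppm&ra=10:30:00&dec=-30:30.0&radius=3"))
```

Looking the value up in a fixed table, rather than using it directly as a file or program name, also means that an unexpected value sent by a client is rejected instead of being acted upon.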

8.2 Providing range queries

As well as selecting objects in a circular area of sky it is possible to further restrict the objects selected to only those for which the values of some column lie within a given range. Suppose selections from an optical catalogue were to be optionally restricted to also lie within a specified range of magnitudes and the column of magnitudes was named mag. To provide this facility for a catalogue its entry in the configuration file should include the keyword search_cols and the query part of its url keyword should include the token %cond.

(1)
The syntax of the search_cols keyword is:

search_cols: column-name minimum-prompt maximum-prompt

For example:

search_cols: mag "Bright limit" "Faint limit"

column-name is the name of the column. The client uses minimum-prompt and maximum-prompt as prompts when soliciting the extrema of the required range from the user. (Note that because the example column is a magnitude the "Bright limit" corresponds to the smallest numerical value and the "Faint limit" to the largest.)

(2)
The %cond token is added to the query part of the url keyword to supply details of the range query. Such a query string might look like:

ra=%ra&dec=%dec&radius=%r2&%cond

The syntax of the values substituted into the %cond string is:

column-name=minimum-value,maximum-value

For example, if a range of first to second magnitude had been specified for column mag then %cond would translate to:

mag=1.0,2.0

The server that parses the query must interpret this string and ensure that the range selection specified is applied.

See Section 10.5 for further details.
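On the server side the substituted condition arrives as one more parameter in the query string. The Python sketch below shows one way a server might extract such conditions and apply them. It assumes that every parameter other than ra, dec and radius is a range of the form minimum,maximum and that the limits are inclusive; the names are invented for the example and the magnitudes are illustrative values.

```python
from urllib.parse import parse_qsl

POSITION_KEYS = {"ra", "dec", "radius"}

def parse_ranges(query):
    """Return {column: (minimum, maximum)} for each range condition."""
    ranges = {}
    for key, value in parse_qsl(query):
        if key in POSITION_KEYS:
            continue
        low, high = (float(part) for part in value.split(","))
        ranges[key] = (low, high)
    return ranges

ranges = parse_ranges("ra=10:30:00&dec=-30:30.0&radius=3&mag=1.0,2.0")
print(ranges)

# Apply the magnitude range to a few illustrative (name, magnitude) pairs.
stars = [("Star 1", 0.58), ("Star 3", 1.64), ("Star 4", 2.21)]
low, high = ranges["mag"]
print([name for name, mag in stars if low <= mag <= high])
```

A query string with no range parameters yields an empty dictionary, in which case the server would apply the circular area search alone.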

8.3 Linked configuration files

In addition to entries for individual catalogues, configuration files can also contain entries for other configuration files. Typically when such an entry is chosen all the entries in the target configuration file are loaded into the client. In this way a tree (or rather a network, because recursion is allowed) of entries can be built up. Entries of this type are referred to as ‘directories’ (by analogy with a hierarchical file system).

For a directory entry the serv_type should be ‘directory’ and the url should be the URL of the destination configuration file. long_name and short_name have their usual meaning. Other options are unlikely to be required. An example might be:

  serv_type:      directory
  long_name:      ESO Catalogues
  short_name:     catalogs@eso
  url:            http://archive.eso.org/skycat/skycat2.0.cfg
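Because directory entries may refer back to files which have already been read, anything that walks the whole network needs to remember where it has been. The Python sketch below illustrates this with a small in-memory stand-in for fetching and parsing configuration files; the file names and the short names more@roe and back@roe are hypothetical. A real client would normally load a directory only when the user chooses it, rather than following every link eagerly as is done here.

```python
def collect_catalogues(url, load_entries, seen=None):
    """Gather the short names of catalogues reachable from a configuration file."""
    seen = set() if seen is None else seen
    if url in seen:            # already visited: the links may form a loop
        return []
    seen.add(url)
    found = []
    for entry in load_entries(url):
        if entry["serv_type"] == "directory":
            found.extend(collect_catalogues(entry["url"], load_entries, seen))
        else:
            found.append(entry["short_name"])
    return found

# Hypothetical stand-in for fetching and parsing configuration files.
FILES = {
    "top.cfg": [
        {"serv_type": "catalog", "short_name": "simple@roe", "url": "..."},
        {"serv_type": "directory", "short_name": "more@roe", "url": "more.cfg"},
    ],
    "more.cfg": [
        {"serv_type": "catalog", "short_name": "second@roe", "url": "..."},
        {"serv_type": "directory", "short_name": "back@roe", "url": "top.cfg"},
    ],
}
print(collect_catalogues("top.cfg", FILES.__getitem__))
```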

8.4 Handling queries which return no results

Sometimes users will submit a query which no objects in the catalogue satisfy. For example, the query might correspond to an empty patch of sky. In practice such queries are quite common. Unfortunately, the ACL format does not prescribe the action the server should take in this case. However, the recommended action is for the server to return an empty TST table. That is, it should return all the header information for a TST table generated from the catalogue being queried (see Section 11), down to and including the list of column names and the line of dashes and tab characters which terminates the header, but no table of values.
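A reply of this kind might be produced as in the Python sketch below, where the title and column names are taken from the example star field and the function name is hypothetical. The ‘[EOD]’ line is retained on the assumption that the client still expects the usual end-of-data marker.

```python
def empty_table(title, columns):
    """Build a table containing the header but no rows of values."""
    lines = [
        title,
        "",
        "\t".join(columns),                              # column names
        "\t".join("-" * len(name) for name in columns),  # dashes and tabs
        "[EOD]",
    ]
    return "\n".join(lines) + "\n"

reply = "Content-type: text/plain\n\n" + empty_table(
    "Example Star Field.", ["Id", "RA", "DEC", "mag"])
print(reply, end="")
```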