CAT can access catalogues held in several different formats: FITS tables, STL lists and Tab-Separated Tables. The restrictions and peculiarities associated with each of these formats are described below.
CAT determines the type of a catalogue from the ‘file type’ component of the name of the file holding the catalogue. The file types for the various formats are included in the descriptions below. If a file name is specified without a file type then it is assumed to be a FITS table of file type .FIT.
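The rule above can be sketched in a few lines of Python. This is only an illustration of the mapping from file type to format, not CAT's own code: the function name `guess_format` and the format labels are assumptions made for the example, and extension numbers in curly brackets are not handled here.

```python
import os

# File types listed in the format descriptions of this section (lower-cased).
_FORMAT_BY_FILE_TYPE = {
    ".fit": "FITS", ".fits": "FITS", ".gsc": "FITS",
    ".txt": "STL",
    ".tab": "TST",
}

def guess_format(file_name):
    """Return 'FITS', 'STL' or 'TST' for a catalogue file name (hypothetical helper)."""
    _, file_type = os.path.splitext(file_name)
    if file_type == "":
        # No file type given: assume a FITS table.
        return "FITS"
    # Lower-casing makes mixed capitalisations such as .Fit match as well.
    fmt = _FORMAT_BY_FILE_TYPE.get(file_type.lower())
    if fmt is None:
        raise ValueError(f"unrecognised file type: {file_type!r}")
    return fmt
```

For instance, `guess_format("perseus")` and `guess_format("perseus.Fit")` both give `"FITS"`, while `guess_format("subset.tab")` gives `"TST"`.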
File types: .FIT .fit .FITS .fits .GSC .gsc
Mixed capitalisations, such as .Fit, are also supported. The .GSC file types are supported in order to allow regions of the HST Guide Star Catalog to be accessed.
CAT can read both binary and formatted FITS tables. It can write only binary FITS tables. It should be able to handle most components of FITS tables, with the exception of variable length array columns. If a variable length array column is encountered a warning message will be reported and the column will be ignored.
If a column containing no data is encountered a warning message will be generated and the column will be ignored.
In common with other Starlink software, CAT does not support the COMPLEX REAL and COMPLEX DOUBLE PRECISION data types. If it encounters COMPLEX columns in a FITS table it represents them as follows:
Usually the table component of a FITS file occurs in the first FITS extension to the file. When reading an existing FITS file CAT will look for a table in the first extension. In cases where the table is located in an extension other than the first you may specify the required extension by giving its number inside curly brackets after the name of the file. For example, if the table occurred in the third extension of a FITS file called perseus.FIT you would specify:
perseus.FIT{3}
The closing curly bracket is optional. When CAT writes FITS tables the table is always written to the first extension.
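As a rough illustration of this syntax, the following Python sketch splits a catalogue specification into a file name and an extension number. The helper name `split_extension` is hypothetical and the code is not taken from CAT; it simply assumes the number follows an opening curly bracket, treats the closing bracket as optional, and defaults to the first extension.

```python
def split_extension(spec):
    """Split 'name{n}' into (name, n); n defaults to 1 (hypothetical helper)."""
    name, brace, rest = spec.partition("{")
    if not brace:
        # No extension given: the table is looked for in the first extension.
        return spec, 1
    # The closing curly bracket is optional, so drop it only if present.
    number = rest[:-1] if rest.endswith("}") else rest
    return name, int(number)
```

Both `split_extension("perseus.FIT{3}")` and `split_extension("perseus.FIT{3")` return `("perseus.FIT", 3)`.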
The textual information for a FITS table comprises the entire contents of the primary header and the appropriate table extension header of the FITS file containing the table. The entire contents of both headers are returned because this is the best way to present the maximum amount of information about the catalogue to the user in its full context. For example, a FITS table COMMENT keyword may be used to annotate other keywords and if only the COMMENT keywords were returned ‘out of context’ they would be difficult to understand, and perhaps even misleading.
In addition, CAT invents two lines of textual information of its own. The first precedes the primary header and serves to introduce it. The second is inserted between the primary header and the table extension header, and serves to introduce the table extension header. These two lines have class ‘CAT’ (because they are invented by CAT). Table 18 lists the text classes supported for FITS tables.
Access | Class | Description |
Read | COMMENT | Comment line |
Read | HISTORY | Line of history information |
Read | KEYWORD | Named FITS keyword |
Read | BLANK | FITS blank comment line |
Read | CAT | Line invented by CAT |
Write | COMMENT | Comment line |
Write | HISTORY | Line of history information |
File types: .TXT .txt
Mixed capitalisations, such as .Txt, are also supported.
CAT can read and write catalogues in the STL (Small Text List) format. Unlike the other formats which CAT can access, the STL format is specific to the CAT library. Nonetheless, the STL format exists in order to allow easy access to both private tables and versions of standard catalogues held as text files. It is usually straightforward to create an STL catalogue from a text file containing a private list or standard catalogue.
In the STL format both the table of values for the catalogue and the definitions of its columns, parameters etc. are held in simple ASCII text files. These files may be created and modified with a text editor. The information defining the catalogue is called the description of the catalogue and the file in which it is held is called the description file.
When you specify a small text list you give the name of the description file. The table of values comprising the catalogue may either be in the same file as the description or in a separate file. If the table of values occurs in a separate file then the name of this file is specified in the description file and CAT places no restrictions on this name other than those imposed by the host operating system.
The STL format is fully documented in appendices to SUN/190[3], the manual for the catalogue and table manipulation package CURSA (see footnote 18). The documentation consists of a simple tutorial, which is a good starting point, and a complete description. In addition to the basic STL format there is a variant which allows STL format files to inter-operate with applications in the KAPPA image processing package (see SUN/95[3]). The KAPPA variant is also documented in SUN/190.
The CURSA documentation refers to various example files kept in directory:
/star/share/cursa
/star/share/cat
CAT can read STL format catalogues with either a free-format or a fixed-format table of values. However, CAT can only write STL format catalogues with a free-format table. The KAPPA variant of the STL may be both read and written.
As its name implies, the Small Text List format is intended for use with relatively small catalogues and it is unsuitable for very large catalogues. Currently there is no upper limit to the size of catalogue for which it can be used. However, if you attempt to read a catalogue containing more than 15,000 rows, a warning message is issued. A large STL format catalogue may take a while to open for reading and CAT may be unable to access a very large STL catalogue (see footnote 19).
The textual information for an STL catalogue comprises the entire contents of the description. This approach makes the maximum amount of information about the catalogue available to the user in its full context. Table 19 lists the text classes supported for STL catalogues.
Access | Class | Description |
Read | COLUMN | Column definition |
Read | PARAMETER | Parameter definition |
Read | DIRECTIVE | Catalogue directive |
Read | CONTINUATION | Continuation line |
Read | NOTE | Annotation of the catalogue description |
Read | COMMENT | Comment line |
Read | BEGINTABLE | Start of table flag |
Write | COMMENT | Comment line |
Write | HISTORY | Line of history information |
The STL format provides limited support for null values; currently it does not provide all the options described in Section 8.2.
A null value for a field in an STL table is indicated by inserting the string ‘<null>’ at the appropriate place in the input file. When CAT reads this string it will interpret it as a null value. Actually, if CAT encounters any value for a field which it cannot interpret given the data type of the column (such as a string containing alphabetic characters in a field for an INTEGER column) then the field is interpreted as null. However, when preparing STL files I recommend that you indicate nulls using the string ‘<null>’. This string is recognised as indicating a null value even for CHARACTER columns.
When CAT writes an STL catalogue null fields in the table are represented by the string ‘<null>’.
Null values are not permitted in the KAPPA variant of the STL format.
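The null-handling rules for the basic STL format can be summarised with a short Python sketch. It is an illustration only, not CAT's implementation: the helper names `read_field` and `write_field` are hypothetical, and only three data types are covered to keep the example small.

```python
NULL_TOKEN = "<null>"

# Simplified converters for a few column data types (illustrative subset).
_CONVERTERS = {"INTEGER": int, "REAL": float, "CHARACTER": str}

def read_field(text, data_type):
    """Convert one STL field to a value, or None for a null (hypothetical helper)."""
    token = text.strip()
    if token == NULL_TOKEN:
        # The explicit null string is recognised for every type,
        # including CHARACTER columns.
        return None
    try:
        return _CONVERTERS[data_type](token)
    except ValueError:
        # A value that cannot be interpreted for the column type is also null.
        return None

def write_field(value):
    """Represent a value as STL text, writing nulls as '<null>'."""
    return NULL_TOKEN if value is None else str(value)
```

With these definitions `read_field("abc", "INTEGER")` and `read_field("<null>", "CHARACTER")` both yield `None`, and `write_field(None)` yields `"<null>"`.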
File types: .TAB .tab
Mixed capitalisations, such as .Tab, are also supported.
CAT can read and write catalogues in the TST (Tab-Separated Table) format. The TST format is a standard for exchanging catalogue data and is commonly used to transfer subsets extracted from remote catalogues or archives across the Internet. It is used by GAIA (see SUN/214[5]) and SkyCat (see footnote 20). The TST format is described in SSN/75[4].
Compared to the other formats supported by CAT, the TST format is somewhat deficient in the amount of metadata that it includes. In particular, the details stored for each column do not include its data type or units. Consequently, when reading a TST catalogue produced by an external program CAT deduces a data type for each column by reading the values that it contains. This procedure usually works reasonably well, though occasionally it produces bizarre results. When CAT writes a TST catalogue it includes some of the column details. These details are written in a format which CAT can interpret if it subsequently reads the catalogue. Though this enhancement is specific to CAT it is entirely consistent with the TST format and does not affect the ability of external programs to read the catalogues. The format in which the additional information is stored is documented in SSN/75.
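To give a feel for what deducing a data type from the values involves, here is a minimal Python sketch. The text does not describe the procedure CAT actually uses, so the function name `deduce_type`, the order in which types are tried and the type labels are all assumptions made for illustration.

```python
def deduce_type(values):
    """Guess a column type from its text values (illustrative only)."""
    # Empty fields carry no type information, so ignore them.
    present = [v.strip() for v in values if v.strip() != ""]
    if not present:
        return "CHARACTER"
    # Try the most restrictive type first, then fall back.
    for type_name, convert in (("INTEGER", int), ("REAL", float)):
        try:
            for v in present:
                convert(v)
            return type_name
        except ValueError:
            continue
    return "CHARACTER"
```

A scheme of this kind shows how odd results can arise: a column of identifiers that all happen to look numeric would be classified as numeric rather than as character data.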
The TST format does not support vector columns. If a catalogue containing vector columns is written as a tab-separated table each vector element is written as a scalar column.
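The effect of writing a vector column as a set of scalar columns can be pictured with the following Python sketch. The data layout and, in particular, the `NAME_1`, `NAME_2` naming of the scalar columns are hypothetical choices for this example; the names CAT gives to the resulting columns are not stated here.

```python
def flatten_vector_columns(columns):
    """Expand vector columns into one scalar column per element.

    `columns` maps a column name to its list of row values; a row value
    that is a list or tuple marks the column as a vector column.
    """
    flat = {}
    for name, rows in columns.items():
        if rows and isinstance(rows[0], (list, tuple)):
            for i in range(len(rows[0])):
                # Hypothetical naming scheme: NAME_1, NAME_2, ...
                flat[f"{name}_{i + 1}"] = [row[i] for row in rows]
        else:
            flat[name] = list(rows)
    return flat
```

For example, a catalogue with a scalar column `ID` and a two-element vector column `FLUX` would be flattened into the three scalar columns `ID`, `FLUX_1` and `FLUX_2`.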
Unsurprisingly, given its provenance as a medium for transporting subsets extracted from remote catalogues across the Internet, the tab-separated table format is intended for use with relatively small catalogues and is unsuitable for very large ones. Currently there is no upper limit to the size of catalogue for which it can be used. However, if you attempt to read a catalogue containing more than 15,000 rows, a warning message is issued. A large TST format catalogue may take a while to open for reading and CAT may be unable to access a very large TST catalogue (see footnote 21).
The textual information for a tab-separated table comprises the entire description of the table. This approach makes the maximum amount of information about the catalogue available to the user in its full context. Table 20 lists the text classes supported for TST catalogues.
Access | Class | Description |
Read | COLUMNS | List of column names |
Read | PARAMETER | Parameter definition |
Read | NOTE | Annotation of the catalogue description |
Read | COMMENT | Comment line |
Read | BEGINTABLE | Start of table flag |
Write | COMMENT | Comment line |
Write | HISTORY | Line of history information |
In a tab-separated table the values for adjacent fields in a given row are separated by a tab character. In tab-separated tables written by CAT null values are represented by two adjacent tab characters. That is, no value is included for the null field.
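The row layout just described is easy to handle in a few lines of Python. The sketch below is illustrative rather than CAT's own code; the helper names `split_row` and `join_row` are hypothetical, and treating an empty field as null on input is an assumption that mirrors the way CAT is described as writing nulls.

```python
def split_row(line):
    """Split one table row into fields; an empty field is taken as null (None)."""
    # Strip only the line terminator, so that a trailing tab (a null last
    # field) is preserved.
    fields = line.rstrip("\r\n").split("\t")
    return [None if f == "" else f for f in fields]

def join_row(fields):
    """Join fields with tabs, writing nothing at all for a null field."""
    return "\t".join("" if f is None else str(f) for f in fields)
```

Here `split_row("a\t\tc\n")` gives `["a", None, "c"]`, and `join_row(["a", None, "c"])` gives the same row back with two adjacent tab characters marking the null.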
18. CURSA uses the CAT library to access catalogues.
19. For information, the underlying reason for this behaviour is that CAT attempts to memory-map work arrays to hold the columns of an STL catalogue and then reads the table into these arrays when an input catalogue is opened. For a very large catalogue CAT may be unable to map the required arrays.
20. http://archive.eso.org/skycat/
21. For information, the underlying reason for this behaviour is that CAT attempts to memory-map work arrays to hold the columns of a TST catalogue and then reads the table into these arrays when an input catalogue is opened. For a very large catalogue CAT may be unable to map the required arrays.