DATA EXTENSIBILITY

A notable feature of the NDF data format is its extensibility. This is achieved by means of independent extensions⁷ to the format, which individual software authors can define and add to suit their own needs. A key distinction between these extensions and the other contents of an NDF dataset is that the meaning and processing rules for data held in extensions are generally unknown to writers of format conversion utilities, whereas the standard components of an NDF have well-defined and universal meanings (see SUN/33).

This has important implications. It means, for instance, that it is relatively straightforward to write a general purpose utility to change (say) IRAF format into NDF format, so long as only standard NDF components need to be considered. However, if the receiving NDF application is equipped to handle data in its own NDF extension, then converting that additional data (i.e. extracting it from the IRAF file and putting it into the NDF extension) will require specialist knowledge, and so cannot be expected of a general purpose utility.

What is required is for conversion utilities to be extensible in the same way as the NDF datasets themselves. A standard utility could then be used to convert the bulk of the data, and a more specialised utility could simply add the extension information to the converted dataset.

As will be explained below, the NDF library supports this concept, but there still remains one problem. If, for example, you had written a software package and an associated utility that extracted specialist extension information from IRAF datasets, you would probably not want to repeat this work for every other possible data format that might come along in future – you would surely prefer to use the same specialist utility to access a whole range of foreign formats. This is where the NDF’s FITS extension comes in.

4.2 The FITS Extension

An important feature of the well-known FITS⁸ data format (which was originally designed as a convenient container for the interchange of astronomical images between sites) is its “FITS header”. This, in essence, is a sequence of character strings each of which contains the name of a keyword, an associated value and (optionally) a comment.

Although rather few of the keywords that appear in a FITS header have standardised meanings, the freedom that this gives makes it a convenient place to store information about which the reader or writer may have little knowledge. A special NDF extension mirroring the properties of a FITS header can therefore provide a useful “airlock” or “staging post” for interchanging specialist information between general purpose conversion utilities (for which the information is meaningless) and specialist utilities (for which it has meaning).

To satisfy this requirement, a FITS extension, equipped to hold FITS header information, may be added to an NDF. By convention, it consists of a 1-dimensional (HDS) array of _CHAR*80 character strings which holds a sequence of header records according to FITS formatting rules (including the final ‘END’ record).

4.3 Extension Import and Export Operations

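To illustrate the function that the FITS extension performs, consider the following sequence of events in which an IRAF format file is read by an application that expects to find a CCDPACK extension present:

1. The general purpose conversion utility converts the IRAF file into an NDF, filling the standard NDF components and placing the header information from the IRAF file into the FITS extension.

2. The specialist utility for the CCDPACK extension is then invoked. It reads the information it needs from the FITS extension and uses it to create the CCDPACK extension.

3. The application accesses the resulting NDF and finds the CCDPACK extension present.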

When writing to a foreign dataset, the sequence of events is broadly similar, except that the specialist utilities are invoked first (before the general purpose one) and transfer information from their relevant extensions into the FITS extension. The general purpose conversion utility then transfers the contents of the FITS extension to the foreign dataset as part of its conversion task.

The processes of (a) creating a specialist extension from information stored in the FITS extension and (b) writing specialist extension information back into the FITS extension are referred to as importing and exporting the extension information.

Using this scheme, utilities that import and export extension information will, in many circumstances, be able to rely entirely on the contents of the FITS extension and need not access the foreign data file at all. This relieves their authors of the need to understand the foreign format, beyond knowing what FITS keywords will be used to store the information of interest. Import and export utilities are therefore easily re-used when new formats are encountered. Indeed, since FITS keywords are so widely used, there will often be conventions in place that make even a change of keywords unnecessary when adding a new format.

The following sections now describe the stages involved in setting up import and export utilities to make use of this scheme.

4.4 Defining the Extension List

It is first necessary to define the set of specialist NDF extensions that should be recognised. This is normally done via the environment variable NDF_XTN, as follows:
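
A minimal sketch of such a definition is given here. CCDPACK is the extension used as an example elsewhere in this section, while MYEXT is a purely hypothetical second extension name, added only to show the list syntax:

```csh
#  Illustrative only: MYEXT is a hypothetical extension name.
setenv NDF_XTN CCDPACK,MYEXT
```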

This is simply a comma-separated list of extension names conforming to the naming conventions described in SUN/33. It applies to all foreign format datasets, unless overridden. The order in which extensions occur in this list determines the order in which they will be imported. They will be exported in the reverse order.

On occasion, it may be necessary to use a different list of NDF extensions for a particular foreign format. Most commonly, this involves simply using an empty list for formats that do not require any extension handling (data compression of ordinary NDF data files would be an example – see §3.5). To specify a separate extension list for a particular foreign format, an environment variable is used whose name is constructed by prefixing ‘NDF_XTN_’ to the format name (in upper case). For example:
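
As a sketch (assuming, since this text does not confirm it, that a value consisting of a single blank is how an empty list is written), the definition:

```csh
#  Assumption: a single blank is taken to denote an empty extension list.
setenv NDF_XTN_COMPRESSED " "
```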

would override the normal extension list with an empty one so that no extension handling would occur when COMPRESSED format data is accessed.

4.5 Extension Import and Export Commands

The commands that perform import and export of extension data are defined in the usual way via environment variables whose names are formed by prefixing ‘NDF_IMP_’ or ‘NDF_EXP_’ to the extension name (in upper case), for example:
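
A sketch of such definitions follows. The parameter form ndf=^ndf is an assumption, modelled on the format-specific commands shown later in this section:

```csh
#  Illustrative only: the ndf=^ndf parameter form is assumed.
setenv NDF_IMP_CCDPACK 'impccd ndf=^ndf'
setenv NDF_EXP_CCDPACK 'expccd ndf=^ndf'
```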

Here, the extension name is CCDPACK (and should appear in the NDF_XTN list) and the “impccd” and “expccd” utilities are assumed to have been written to import and export information for this extension.

The commands are invoked after message token substitution has taken place, as described in §2.3. In this case, the set of tokens defined for use is as follows:


Token   Value
-----   -----
dir     Directory in which the foreign file resides
name    Foreign file name (without directory or extension)
type    Foreign file extension (with leading ‘.’)
vers    Foreign file version number (blank if not supported)
fxs     Foreign extension specifier (see §2.4)
fxscl   Clean version of fxs (all non-alphanumeric characters replaced by underscores)
fmt     Foreign format name (upper case)
ndf     Full name of the native NDF format copy of the dataset
xtn     Name of the NDF extension (upper case)

As explained earlier, the foreign format file should not normally be accessed by import and export utilities unless that is unavoidable, so one set of import and export commands will normally suffice for accessing a whole range of foreign formats.

In special cases, however, where techniques specific to a particular format are needed, an alternative set of commands may be defined to apply to that format alone. This is done via environment variables whose names are constructed by appending an underscore and the foreign format name to the usual names shown above. For instance, when importing and exporting CCDPACK extension information to FIGARO files, one might want to use:

setenv NDF_IMP_CCDPACK_FIGARO 'impccd ndf=^ndf file=^dir^name^type'
setenv NDF_EXP_CCDPACK_FIGARO 'expccd ndf=^ndf file=^dir^name^type'

If these variables were defined, they would override any defined without the ‘_FIGARO’ suffix when accessing that particular format. Any special techniques can therefore be restricted to those formats that require them.

4.6 Writing Import and Export Utilities

Before writing your own import and export utilities, you should consider using standard ones that already exist. For example, the KAPPA package (SUN/95) contains a general purpose “fitsimp” command that can be used to build a specialist NDF extension by importing information from a FITS extension. It is driven by a keyword translation table stored in a text file, so can easily be adapted for different needs. For example, it might be used in an NDF import command as follows:
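
A sketch of such a command is shown here. It assumes a hypothetical extension called MYEXT, and assumes that fitsimp accepts the NDF and the translation table through parameters named ndf and table (see SUN/95 for the actual interface):

```csh
#  Illustrative only: MYEXT is a hypothetical extension name, and the
#  parameter names ndf and table are assumed rather than confirmed.
setenv NDF_IMP_MYEXT 'fitsimp ndf=^ndf table=mine.imp'
```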

Here, mine.imp is the table that drives the importation process. This could be different for each format if necessary. An equivalent extension export utility “fitsexp” is also available.

If you find that you must write your own software for this purpose, then the IMG library (SUN/160) provides a convenient programming interface for accessing items of NDF extension information (including individual items within the FITS extension) and should make most import and export utilities straightforward to write. With a little more effort, you can, of course, also use the NDF and HDS libraries, which allow you to construct any form of extension you want.

4.7 Example: Setting Up an Extension

The following example shows typical C shell commands that might be used to allow NDF applications to handle specialist extension information derived from foreign format datasets. Normally such commands would form part of the startup sequence for the package that utilised the extension.

  #  Append the CCDPACK extension to the list of extensions to be
  #  handled.
        if ($?NDF_XTN) then
           setenv NDF_XTN $NDF_XTN,CCDPACK
        else
           setenv NDF_XTN CCDPACK
        endif

  #  Define commands for importing and exporting CCDPACK extension
  #  information.
        setenv NDF_IMP_CCDPACK 'ccdimp ndf=^ndf table=$CCDPACK_DIR/^fmt.imp'
        setenv NDF_EXP_CCDPACK 'ccdexp ndf=^ndf table=$CCDPACK_DIR/^fmt.exp'

Note that we have specified keyword translation tables here (for use in the import and export commands) which depend on the foreign data format being accessed. This would be necessary if, for instance, data in different formats followed different conventions about how their header information is stored, so that different FITS keywords were used in the FITS extension as a result. Concentrating this information in a table makes it easy to change, and users can even have their own versions if necessary.

Applications which process the converted data need only deal with the validated information stored within their own private extension. They are therefore insulated from details of the conversion process and any need to change in order to access new data formats in future. The use of a private extension also protects them from the possibility of other software inadvertently corrupting their private data.

⁷Be careful to distinguish an “extension” to an NDF data structure (which is an addition of extra data to the file) from the “file extension” (which is the end part of the file name, such as ‘.sdf’, used to identify the file’s format).

⁸FITS stands for Flexible Image Transport System.