Introduction

SMURF – the Sub-Millimetre User Reduction Facility

1 Introduction

1.1 Document conventions
1.2 Using Smurf
1.3 Data file structure
1.4 Supported coordinate systems
1.5 File sizes and disk space

This document is aimed at users who wish to perform their own customised reductions of ACSIS or SCUBA-2 data. It is expected that most users will normally prefer to use the higher-level facilities provided by the appropriate Orac-dr pipeline.

The main purpose of this document is to provide complete reference information for all facilities provided by Smurf. Thus, for instance, it contains details of all available command parameters and configuration parameters. It is not really intended to be read from start to finish as a complete document, but rather to be dipped into, as and when needed, for information about specific parameters or facilities. Most users will normally refer to this document during the course of reading the following higher-level documents:

SC/21: SCUBA-2 data reduction cookbook (covers the use of both Smurf and Orac-dr).
SUN/264: More details on using the Orac-dr SCUBA-2 pipeline.
SC/19: SCUBA-2 SRO data reduction cookbook (focuses specifically on reduction of very early SCUBA-2 data but also contains some methods and information not yet available in SC/21).
SC/20: Reducing ACSIS data using the Orac-dr ACSIS pipeline.

After an introductory section covering the mechanics common to using all Smurf commands, the rest of this document is divided into two main parts: one dedicated to processing ACSIS data ([2], see Section 2) and the other to processing SCUBA-2 data ([4], see Section 3).

1.1 Document conventions

In an attempt to make this document clearer to read, different fonts are used for specific structures:

Observing modes are denoted by all upper case body text (e.g. FLATFIELD).
Starlink package names are shown in small capitals (e.g. Smurf); individual task names are shown in sans-serif (e.g. makemap).
Content listings are shown in fixed-width type (sometimes called ‘typewriter’). Extensions and components within NDF (SUN/33) data files are shown in upper case fixed-width type (e.g. HISTORY).
Text relating to file names, key presses or entries typed at the command line are also denoted by fixed-width type (e.g. % smurf), as are command-line parameters for tasks (which are displayed in upper case - e.g. METHOD).
References to Starlink documents, i.e., Starlink User Notes (SUN), Starlink General documents (SG) and Starlink Cookbooks (SC), are given in the text using the document type and the corresponding number (e.g. SUN/95). Non-Starlink documents are cited in the text and listed in the bibliography.

1.2 Using Smurf

Smurf is a suite of Starlink ADAM tasks (SUN/101 and SG/4) and therefore requires the Starlink environment to be defined. For C shells (csh, tcsh), do:

  % setenv STARLINK_DIR <path to the starlink installation>
  % source $STARLINK_DIR/etc/login
  % source $STARLINK_DIR/etc/cshrc

before using any Starlink commands. For Bourne shells (sh, bash, zsh), do:

  % export STARLINK_DIR=<path to the starlink installation>
  % source $STARLINK_DIR/etc/profile

1.2.1 Starting Smurf

Having set up Starlink as described in the previous paragraph, the Smurf commands are made available by typing smurf at the shell prompt. The welcome message will appear as shown below:

  % smurf

          SMURF commands are now available -- (Version 1.6.1)

          Type smurfhelp for help on SMURF commands.
          Type 'showme sun258' to browse the hypertext documentation.
          Type 'showme sc21' to view the SCUBA-2 map-making cookbook

This defines aliases for each Smurf command, gives a reminder of the help command and shows the version number. You can now use Smurf routines or ask for help.

1.2.2 Getting help

Access the Smurf online help system as follows:

(1): At the prompt, type smurfhelp. The welcome message is displayed along with a list of available topics.
(2): To get information, type the name of an available topic at the help prompt. The next level of help lists information and further subtopics.
(3): To go to the next level, type the name of a subtopic.
(4): Type a question mark, ?, to re-display the available topics at the current level.
(5): To go back one level, press <Enter>.
(6): To exit the help system, press <Enter> until you return to the shell prompt.

Further help on the help system may be obtained by accessing the topic smurfhelp from within smurfhelp. If you already know the topic for which you want help, you can access it directly by specifying it on the smurfhelp command line, as in the following example:

  % smurfhelp makemap parameters

If an application prompts you for input and you do not know what the parameter means, you can use ? at the prompt for more information.

  % calcflat
  IN - Input flatfield files > ?

  CALCFLAT

    Parameters

      IN

        IN = NDF (Read)
           Input files to be processed. Must all be from the same
           observation and the same sub-array.

  IN - Input flatfield files >

1.2.3 Smurf parameters

Smurf uses named parameters to specify input and output files and other variables necessary for data processing. There are two types of named parameter which should not be confused as they are accessed and specified in very different ways:

“ADAM”, or “command line” parameters: These can be specified on the command line when running a Smurf command, in just the same way as when running other Starlink commands. If no value is supplied on the command line for a parameter, a default value will be used. If no default value is available, or if use of a default is not appropriate, then the user is prompted for a value. The reference documentation for each Smurf command includes details of each ADAM parameter, and indicates if a default will be used or not. Kappa (SUN/95) has a convenient overview of the Starlink parameter system. ADAM parameters are usually used to specify the main inputs and outputs for each command, and to select the main options to be used.
Perhaps the most difficult aspect of giving ADAM parameter values on the command line is handling shell meta-characters. If the parameter value includes any characters that would normally be interpreted and replaced by the Unix shell before invoking the requested command, such as wild-cards, dollars, commas, etc., then they must be protected in some way so that the Starlink software receives them unchanged. This can be done either by escaping each meta-character (i.e. preceding each one with a back-slash character, “\”) or by quoting the whole string. If all else fails, it may be necessary to enclose the parameter value in two layers of quotes: an inner layer of single quotes and an outer layer of double quotes. Note that the above comments only apply to ADAM parameter values that are supplied on the command line. When supplying a value in response to a prompt, the Unix shell is not involved and so shell meta-characters should not be escaped or enclosed in quotes.
Configuration parameters: These are used to fine-tune the details of specific algorithms, and are usually much more numerous than ADAM parameters. If required, a Smurf command will access an entire group of configuration parameter settings using a single ADAM parameter, usually called “CONFIG”. The configuration parameter settings can be specified directly as a comma-separated list in response to a prompt for CONFIG, or may be stored in a text file, the name of which is then supplied (preceded by a caret, ‘^’) when prompted for CONFIG. Each Smurf command that requires a group of configuration parameters will document what is needed, and how it can be supplied, in the reference documentation for CONFIG. Appendix E describes individual configuration parameters in detail. Kappa (SUN/95) has a complete description of the various ways in which groups can be specified.
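
The effect of the protection schemes described above can be previewed without running any Starlink command. The following is a minimal sketch, assuming a Bourne-type shell: it uses the Python standard library module shlex, which applies the same quote and back-slash removal as a POSIX shell but performs no wild-card expansion. The function name args_seen_by_command and the makemap command lines are illustrative only.

```python
import shlex


def args_seen_by_command(command_line):
    """Return the argument list a POSIX shell would hand to the command.

    shlex removes quotes and back-slashes as a POSIX shell does, but it
    never expands wild-cards, so it only shows the effect of the protection.
    """
    return shlex.split(command_line)


examples = [
    r"makemap IN=s8a20090620_00075_\* OUT=map",  # each meta-character escaped
    "makemap IN='s8a20090620_00075_*' OUT=map",  # whole value quoted
    "makemap IN=\"'file1,file2'\" OUT=map",      # two layers of quotes
]

for line in examples:
    print(args_seen_by_command(line))
```

In each case the value that reaches the command still contains the wild-card or the inner quotes, which is what the Starlink parameter system needs in order to interpret them itself. C shells apply slightly different quoting rules, so this sketch should not be relied upon for csh or tcsh.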

1.2.4 Message filter

All Smurf commands support the ‘message filter’ ADAM parameter (MSG_FILTER), which controls the number of messages Smurf writes to the screen when executing routines. The default setting for the message filter is normal. Table 1.2.4 lists the available values for MSG_FILTER. Be aware that specifying verbose or debug will slow down execution due to the (potentially vast) number of messages written to the terminal. It is also possible to control message output by setting the MSG_FILTER environment variable to one of the values listed in this table. To hide all messages, a quick option is to add QUIET to the command line.


Option	Description
none	No messages
quiet	Very few messages
normal	Limited messages
verbose	Full messages
debug	Some debugging messages (useful for programmers)
all	All messages regardless of debug level
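
When Smurf commands are launched from a script, the environment variable route can be convenient. The sketch below is illustrative only: the helper env_with_msg_filter is a hypothetical name, and it assumes the variable accepts the lower-case values exactly as listed in the table. It validates the requested level and returns a copy of the environment that could be passed to a child process.

```python
import os

# Values taken from the table above, ordered from least to most output.
MSG_FILTER_LEVELS = ("none", "quiet", "normal", "verbose", "debug", "all")


def env_with_msg_filter(level, base_env=None):
    """Return a copy of the environment with MSG_FILTER set to `level`."""
    level = level.lower()
    if level not in MSG_FILTER_LEVELS:
        raise ValueError(
            f"unknown MSG_FILTER value {level!r}; "
            f"expected one of {', '.join(MSG_FILTER_LEVELS)}"
        )
    # Copy so that the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["MSG_FILTER"] = level
    return env


if __name__ == "__main__":
    print(env_with_msg_filter("quiet", base_env={})["MSG_FILTER"])
```

The returned dictionary can be given as the env argument of subprocess.run when starting a Smurf command from Python.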

1.2.5 Working with data files

Smurf does not itself enforce a naming scheme on files. However, raw data from ACSIS and SCUBA-2 obey a well-defined naming scheme. The convention is as follows: the name is composed of an instrument prefix, the UT date in the form YYYYMMDD, a zero-padded five-digit observation number, followed by a two-digit sub-system number (ACSIS only) and a zero-padded four-digit sub-scan number, all separated by underscore characters. The file has an extension of .sdf. The instrument prefix for ACSIS is simply “a”. For SCUBA-2 it is a three-character string dependent on the particular sub-array from which the data were recorded. The SCUBA-2 sub-arrays are labelled a–d at each wavelength, and the wavelength is coded by a single digit (4 for 450 µm data and 8 for 850 µm data); thus the SCUBA-2 prefix is s[4|8][a-d].

Example ACSIS file name: a20090620_00023_01_0002.sdf
Example SCUBA-2 file name: s8a20090620_00075_0001.sdf
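
The naming convention can be expressed as a regular expression. The following is a minimal sketch rather than part of Smurf: the names RAW_NAME and parse_raw_name are hypothetical, and the pattern encodes only the rules stated above.

```python
import re

# prefix, UT date (YYYYMMDD), observation number, optional sub-system
# number (ACSIS only) and sub-scan number, followed by the .sdf extension.
RAW_NAME = re.compile(
    r"^(?P<prefix>a|s[48][a-d])"
    r"(?P<utdate>\d{8})_"
    r"(?P<obsnum>\d{5})_"
    r"(?:(?P<subsys>\d{2})_)?"
    r"(?P<subscan>\d{4})\.sdf$"
)


def parse_raw_name(name):
    """Split a raw ACSIS or SCUBA-2 file name into its fields."""
    match = RAW_NAME.match(name)
    if match is None:
        raise ValueError(f"{name!r} does not follow the raw-data naming scheme")
    fields = match.groupdict()
    is_acsis = fields["prefix"] == "a"
    # The sub-system number must be present for ACSIS and absent for SCUBA-2.
    if is_acsis != (fields["subsys"] is not None):
        raise ValueError(f"{name!r} has an unexpected sub-system field")
    fields["instrument"] = "ACSIS" if is_acsis else "SCUBA-2"
    if not is_acsis:
        # Second character of the prefix codes the wavelength in microns.
        fields["wavelength_um"] = {"4": 450, "8": 850}[fields["prefix"][1]]
    return fields


if __name__ == "__main__":
    for example in ("a20090620_00023_01_0002.sdf", "s8a20090620_00075_0001.sdf"):
        print(parse_raw_name(example))
```

The numeric fields are left as zero-padded strings so that they can be reassembled into a file name without reformatting.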

Files can be processed either singly or in batches. It is more efficient to process multiple files at the same time. There are three ways to specify multiple files:

(1): store the file names in a text file and then supply the file name, preceded by a caret ‘^’, as the value for ADAM parameter IN.
(2): include one or more wild-cards in the ADAM parameter value. Such wild-cards are expanded by Starlink itself, rather than the Unix shell, and so need to be protected from shell expansion using quotes or back-slashes as described earlier.
(3): list the file names explicitly, separated by commas (which need to be quoted).

For more information on specifying groups of objects for input and output, see the section Specifying Groups of Objects in the Kappa documentation (SUN/95). Examples of valid inputs (including the back-slashes and quotes required to protect the shell meta-characters) are:

  IN=s8a20090620_00075_0001.sdf
  IN=s8a20090620_00075_\*
  IN=s8a20090620_00075_00\?\?
  IN="'file1,file2'"
  IN=^myfile.lis
  OUT=\*_out

Note that if you are providing a text file containing output file names, those should be listed in the same order as the input file names, otherwise the processed data will be written under the wrong file names.
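
One way to guarantee matching order is to generate both text files from the same sorted list. The sketch below is illustrative: write_group_files is a hypothetical helper, it assumes one file name per line is acceptable in a group file, and it derives each output name by appending a suffix to the input name with its .sdf extension removed.

```python
import tempfile
from pathlib import Path


def write_group_files(input_names, in_list, out_list, suffix="_out"):
    """Write matching input and output list files, one name per line.

    Returns the values to supply for IN and OUT: each list-file name
    preceded by a caret.
    """
    inputs = sorted(input_names)
    # Derive each output from its input so line N of both files pairs up.
    outputs = [Path(name).stem + suffix for name in inputs]
    Path(in_list).write_text("\n".join(inputs) + "\n")
    Path(out_list).write_text("\n".join(outputs) + "\n")
    return f"^{in_list}", f"^{out_list}"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        in_value, out_value = write_group_files(
            ["s8a20090620_00075_0002.sdf", "s8a20090620_00075_0001.sdf"],
            Path(tmp) / "in.lis",
            Path(tmp) / "out.lis",
        )
        print("IN=" + in_value)
        print("OUT=" + out_value)
        print(Path(out_value[1:]).read_text(), end="")
```

Because both lists are written from the same sorted sequence, line N of the output list always corresponds to line N of the input list.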

1.3 Data file structure

Data files for both ACSIS and SCUBA-2 [6] use the Starlink N-dimensional Data Format (NDF, see SUN/33), a hierarchical format which allows additional data and metadata to be stored within a single file. Kappa (SUN/95) contains many commands for examining and manipulating NDF structures. A single NDF structure describes a single data array with associated metadata. NDFs are usually stored within files of type “.sdf”. In most cases (but not all), a single .sdf file will contain just one top-level NDF structure, and the NDF can be referred to simply by giving the name of the file (with or without the “.sdf” suffix). In many cases, a top-level NDF containing JCMT data will contain other “extension” NDFs buried inside it at a lower level. For instance, raw files contain a number of NDF components which store observation-specific data necessary for subsequent processing. The contents of these (and other NDF) files may be listed with Hdstrace. Each file holding raw JCMT data on disk is also known as a ‘sub-scan’.

The main components of any NDF structure are:

An array of numerical data (may have up to 7 dimensions - usually 3 for JCMT data);
An array of variance values corresponding to the numerical data values;
World Coordinate System information;
History;
Raw data units (“K” for ACSIS, “adu” (possibly as compressed integers) for raw SCUBA-2 data, “pW” for flat-fielded SCUBA-2 data, etc).

For ACSIS, the raw data are stored as $N_{chan} \times N_{receptors} \times N_{samp}$ , while SCUBA-2 data are stored as $N_{columns} \times N_{rows} \times N_{samp}$ , where $N_{samp}$ is the number of time samples in a file.

The files also contain additional NDF components common to both instruments:

JCMT State structure (the telescope pointing record) and other metadata that potentially varies for every sample;
JCMT Observatory Control System (OCS) configuration, with the contents of the XML file used to set up the observation;
A “FITS extension” containing information in the form of a set of FITS (Flexible Image Transport System) header cards, that does not change during a sub-scan.

The jcmtstate2cat command can be used to extract the time-varying metadata and store it in a tab-separated table (TST) format catalogue¹ so that it can be visualised using Topcat. An example Topcat plot of the telescope motion for a particular observation can be seen in Figure 1. The JCMTSTATE extension contains information from the telescope, secondary mirror (SMU) and real-time sequencer (RTS). ACSIS observations include environmental weather information, and SCUBA-2 observations include SCUBA-2 data (such as the mixing chamber temperature) and the water vapour monitor (WVM) raw data. jcmtstate2cat converts the telescope and SMU information to additional columns showing the tracking and AZEL offsets, and also converts raw WVM data to a tau (CSO units). Finally, SCUBA-2 data have additional low-level MCE information that can be included in the output catalogue using the --with-mce option.

Figure 1: The telescope positions during observation number 7 on 20090107. The plot is created by Topcat from the output of jcmtstate2cat, plotting the DRA column against the DDEC column. The pong scan pattern is clearly visible.

The original XML used to specify the details of the observation can be obtained from any data file using the dumpocscfg command.

The FITS extension is used to store information that either does not change or changes by a small amount during the course of an observation. Note that in the particular case of SCUBA-2 data some values in the FITS extension will change for each sub-scan (i.e. file) of a single observation. The values in the FITS extension may be viewed with the Kappa fitslist command.

Each instrument has further specific components. SCUBA-2 files contain dark SQUID data, the current flatfield solution, an image of the bolometers used for heater tracking and possibly information indicating how to uncompress the raw data. ACSIS files contain information about the receptors used including their coordinates in the focal plane.

Output files created by Smurf may contain some or all of these, plus new components with information about the output data. These are noted in the description of specific applications. All output files contain a PROVENANCE extension which provides a detailed record of the data processing history. Use the Kappa command provshow to list the contents.

1.4 Supported coordinate systems

Smurf uses AST for its astrometry and thus any coordinate system supported by AST may be used when creating images/cubes. The default behaviour is to use the system in which the observations were made (known as the TRACKING system within Smurf).

1.4.1 Moving sources

The mapping tasks makemap and makecube automatically deal with moving sources. There is no need to deal with moving sources explicitly for any processing with Smurf. Maps and cubes made from moving sources will use a coordinate system that represents offsets from the source centre, rather than absolute celestial coordinates.

1.5 File sizes and disk space

Be aware that the raw data files from both instruments may be large (tens to hundreds of megabytes). Subsequent processing of raw SCUBA-2 time-series data produces output files which are even larger for two reasons:

(1): The archived raw integer data values are compressed using “delta” compression (see Kappa task ndfcompress) and must therefore first be uncompressed before being processed². The compression ratio varies from file to file but is usually in the range 2 to 3.
(2): The uncompressed data values are then converted from four byte integer to eight byte floating point.

Thus file sizes can be expected to increase by a factor of 4 to 6. Smurf mapping tasks have the ability to restrict the size of output data files for manipulation on 32-bit operating systems. For further details, see the description of the TILEDIMS parameter in the sections on makemap and makecube. Processing SCUBA-2 data is faster on 64-bit systems due to its use of double precision for all calculations.
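
The arithmetic behind this estimate can be written out explicitly. The sketch below simply multiplies the two factors given above (a compression ratio of 2 to 3, then a doubling from four-byte integers to eight-byte floating point); the function name and its default arguments are illustrative and are not part of Smurf.

```python
def uncompressed_size_range(raw_bytes, ratio_range=(2.0, 3.0),
                            int_bytes=4, float_bytes=8):
    """Estimate the size range of processed SCUBA-2 time-series data.

    raw_bytes   -- size of the compressed raw file on disk
    ratio_range -- assumed (lowest, highest) delta compression ratio
    """
    # Converting 4-byte integers to 8-byte floats doubles the size.
    widening = float_bytes / int_bytes
    lowest, highest = ratio_range
    return raw_bytes * lowest * widening, raw_bytes * highest * widening


if __name__ == "__main__":
    low, high = uncompressed_size_range(100e6)  # a 100 MB raw file
    print(f"{low / 1e6:.0f} to {high / 1e6:.0f} MB")
```

This is useful as a rough check on free disk space before processing a batch of raw files.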

¹This is a standard format historically supported by Cursa and ESO SkyCat.

²This uncompression is performed automatically when the data is read by any Starlink command - there is no need to uncompress the data as a separate step.
