The applications use the parameter system to get the necessary information from the outside world. The source of information is not always the user’s keyboard. The specification of a parameter on the command line is slightly different from entering the value at the prompt.
A good model of the workings of the parameter system is as follows. The system is a pool of parameter values. On the command line you can pass values to the parameter system (not the application). When the application runs and needs a parameter value, it will ask the parameter system (not the user terminal). For each parameter the system has two sets of rules, one to arrive at a prompt default value and one to work out the value itself. If the value was specified on the command line, the system will just pass that as the value to the application. Otherwise the value is so far unknown to the parameter system and it will construct a prompt default and a value according to the rules. There are several possible sources for these two: the value used when the application was last run (the ‘current’ value), a value computed by the application itself (the ‘dynamic’ default), a static default from the application’s interface file, or a ‘global’ value shared with other applications.
So asking the user is only one way of getting information from the parameter system. You also see that the defaults offered—or accepted by ‘accept’ on the command line—may be arrived at in a number of ways.
There are three useful keywords the user can give on the command line to control the defaulting and prompting: ‘accept’ to take the suggested defaults without being prompted, ‘prompt’ to force a prompt even where none would normally occur, and ‘reset’ to ignore values saved from previous runs and use the original defaults.
The user is not prompted for a parameter value in all circumstances, but a value can be specified on the command line even if it would not be prompted for. Conversely, if a parameter is given on the command line, then it will not be prompted for.
On the command line, parameters can be specified by position or by name. To specify by position, they must be in the right order and all previous positions must be filled, too. E.g. for a spectrum in ‘a_file.sdf’ the following will work:
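(a sketch with illustrative values; ‘istat’ is assumed to take image, ystart, yend, xstart and xend as its first five parameters)

   % istat a_file
   % istat a_file xstart=50 xend=60
   % istat a_file min max 50 60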
In the third version, the first pair ‘min max’ is for parameters 2 and 3, ‘ystart’ and ‘yend’. Although these are not needed for a spectrum, positions 2 and 3 must be filled in order to use positions 4 and 5 for ‘xstart’ and ‘xend’.
Logical parameters usually do not have positions assigned to them. On the other hand, these can be specified by name and value, or by negated name. Consider the ‘median’ parameter of the command ‘istat’. There are two ways to set it true, and two ways to set it false:
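(standard ADAM syntax; the file name is illustrative)

   % istat a_file median=true
   % istat a_file median
   % istat a_file median=false
   % istat a_file nomedian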
Furthermore, instead of ‘true’ you can use ‘yes’, ‘t’, or ‘y’; similarly with ‘false’ and ‘no’.
There are a few vector parameters in Figaro, where the parameter value is not a single number but a vector of numbers. To specify vector values, use square brackets such as
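(for example with ‘creobj’, described below; the parameter names and the object created are illustrative, and the backslashes mask the square brackets from the Unix shell, as explained later in this section)

   % creobj type=_REAL dims=\[64,2\] object=a_file.MORE.MYTABLE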
If you set the environment variable ADAM_ABBRV, you can abbreviate the parameter names on your command line to the shortest unambiguous string. Say ‘istat’ has only one parameter beginning with ‘i’. Then ‘i=a_file’ works just as well as ‘image=a_file’.
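For example (the variable need only be set; 1 is the conventional value):

   % setenv ADAM_ABBRV 1
   % istat i=a_file min max min max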
When a prompt occurs, it consists of the parameter name, a prompt string, and possibly a prompt default. Normally the user responds with the value for the parameter. But other responses are possible:
By entering nothing and just hitting return, the prompt default is accepted as the proper value. Entering the backslash (\) accepts the default for this parameter and the respective defaults for all parameters that would subsequently be prompted for; no further prompting will occur. Sometimes it is necessary to make clear that a value is not a number: when a file name begins with a digit it may have to be given in single quotes or with an ‘at’ sign (@).
Numeric parameters have a permitted range. The user can ask for either of the extreme values to be used by entering ‘min’ or ‘max’. In most circumstances the permitted range of ‘xstart’ etc. is the extent of the data in question. The ‘splot’ command is an exception: The permitted range is close to the floating point range of the machine and grossly exceeds the extent of any reasonable spectrum. This is necessary so that you can have a plot that is wider than the data reach. ‘splot’ has the logical parameter ‘whole’ to adjust the plotted range to the data range itself.
Entering an exclamation mark (!) assigns the null value to the parameter, i.e. it makes the value undefined. In Figaro this has no meaning and the application should abort. A double exclamation mark (!!) is the proper signal for the application to abort. The question mark (?) can be used to get ‘run-time’ help, i.e. help on the parameter currently prompted for.
Parameters are not only used to pass information to the application, the application may also return information in other parameters. ‘istat’ has a number of output parameters in order that its results can be used in scripts. From ICL, output parameters can be handled quite easily:
To achieve anything like that from the Unix shell one needs detailed knowledge of how these output parameters are stored in the Unix file system. The Unix shell does not have floating point arithmetic, but one can at least pass the value from one application to the next:
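(a sketch: the output parameters live as HDS objects in $HOME/adam/istat.sdf, but the parameter name ‘stat_mean’ is assumed here and should be checked with ‘hdstrace’ on that file)

   % istat a_file min max min max
   % icdiv a_file @$HOME/adam/istat.stat_mean a_file_norm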
Here it is assumed that the environment variable ADAM_USER does not exist or points to $HOME/adam.
Some confusion arises from syntax conflicts between the Unix shell and the ADAM parameter system.
An instructive example is where the user wants to find out the value of a certain pixel in an image. In the language of the ADAM parameter system the user is interested in the object ‘a_file.DATA_ARRAY.DATA(5,7)’:
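(from ICL, say, where the parentheses pass through unharmed)

   ICL> hdstrace a_file.DATA_ARRAY.DATA(5,7)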
Now try the same from the Unix shell:
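(each meta-character masked with a backslash, as explained next)

   % hdstrace a_file.DATA_ARRAY.DATA\(5,7\)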
As a rule, we have to mask each meta-character with a backslash (\). The backslashes make sure the Unix shell passes the meta-characters on and does not interpret them. They then make it through to the ADAM parameter system, and from there the previous arguments apply.
Some people prefer other schemes such as enclosing the whole object specification in a pair of single quotes or double quotes. The advantage is that you need only two additional characters no matter how many pairs of parentheses are in the object name. The disadvantage is that you need a matching pair of characters, and that sometimes you need to pass quotes to the ADAM parameter system.
The troublesome meta-characters are parentheses (), square brackets [], single quotes (’), double quotes (") and the backslash itself (\). See SC/4 for further details.
Your run-of-the-mill Figaro application uses the NDF library to access data. But the HDS object manipulators ‘copobj’, ‘creobj’, ‘delobj’, ‘renobj’, ‘setobj’, and ‘trimfile’ use the HDS library to access data files, just like ‘hdstrace’ does. Notice the small difference between accessing data and accessing data files! The point is that two different data formats (NDF and DST) actually use the same file format (HDS). Therefore accessing such files on a low level is different to accessing the data inside the files on a higher level.
HDS files play a vital role in Figaro. Two data formats, NDF and DST, are realised in HDS files. In addition, the ADAM parameter system uses HDS files to store its information. HDS stands for Hierarchical Data System, and the main feature of HDS files is that there is no fixed order in which items are arranged within the files. Instead, HDS files contain a hierarchy of named objects; all that matters is the hierarchy tree and the names of objects, not their sequence.
The actual information is kept in ‘primitive’ objects. These are numbers or strings, scalars or arrays. The other kind of objects are structures, which can contain further objects. These are needed to build the hierarchy tree. Structures can be arrays, too.
Users can inspect and modify the contents and organisation of HDS files. For this a syntax to specify objects has been developed. To practise the syntax, you can use the ‘hdstrace’ utility (see SUN/102) to list the contents of an HDS file:
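(the output here is a sketch; the exact layout may differ)

   % hdstrace a_file

   A_FILE  <NDF>

      DATA_ARRAY     <ARRAY>         {structure}
         DATA(256,256)  <_REAL>      0.41,0.42,0.4,... 0.39

   End of Trace.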
This shows that the file contains a top-level structure of type NDF, within it a structure of type ARRAY called ‘DATA_ARRAY’, and within that a primitive array of type _REAL called ‘DATA’. We also see that this latter array contains 256 by 256 floating point numbers.
You do not have to inspect the whole file, but can concentrate on specific objects within the object tree. For this you use the component names and dots between them. To address array elements, use parentheses:
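(for example, from ICL; the file and element are illustrative)

   ICL> hdstrace @"a_file.sdf".DATA_ARRAY.DATA(100,100)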
Here the @ and the double quotes are not necessary, but they show how you can address a DST file instead. The double quotes bind the ‘.sdf’ into the file name so that it does not look like the object ‘SDF’ within the file. The @ makes clear that you talk about an HDS object and not a string or a number. An @ may be useful if you have file names that begin with a digit.
The top-level object is equivalent to the file itself and is not specified. Its name is in fact irrelevant, so long as it is a scalar structure. The other object names are case-insensitive; you can use upper case as in the example, or lower case.
HDS files have names ending with the file name extension ‘.sdf’, which stands for Starlink Data File. But this does not have to be so. Figaro’s second data format, DST, also uses HDS files, but their names end in ‘.dst’. You could use any file name extension, ‘.sdf’ is just the default that is assumed if you don’t specify it, such as:
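(illustrative; the backslashes mask the quotes from the Unix shell)

   % hdstrace a_file
   % hdstrace @\"a_file.dst\"

The first command lists ‘a_file.sdf’, the second the DST file ‘a_file.dst’.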
A remark is necessary about data types here. The primitive data types (those that are not HDS structures but scalars, vectors and arrays of numbers, strings, etc.) have names beginning with an underscore. The most common type is ‘_REAL’; others are ‘_DOUBLE’, ‘_INTEGER’, ‘_BYTE’, ‘_UBYTE’, and ‘_CHAR*n’. When an HDS structure manipulator needs to be given a type, then these are to be used. But if a data processing application such as ‘retype’ needs a type specification, then you have to use the traditional Figaro type specifications ‘FLOAT’, ‘DOUBLE’, ‘INT’, ‘BYTE’, and ‘CHAR’.
NDF, the Extensible N-Dimensional Data Format, is a data format for spectra, images, cubes etc. In the first place this is a specification of an HDS object hierarchy. Although Figaro’s old data access routines (DSA/DTA) did a reasonably good job at implementing the NDF format, Figaro now uses the official implementation in the form of the NDF subroutine library. So the term ‘NDF’ is not only a recipe for an HDS object tree, it is also the name of a subroutine library to create and access such a tree. And it is customary to call a data set in NDF format ‘an NDF’.
The minimum content of an NDF is an object with name ‘DATA_ARRAY’. It must either be a ‘primitive’ array (one that contains numbers etc.), or it must be a structure of type ‘ARRAY’ and contain a primitive array called ‘DATA’. The two variants are called a primitive NDF and a simple NDF. The following two are equivalent, the first is simple, the second is primitive:
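(sketched in the style of hdstrace output)

   A_FILE  <NDF>

      DATA_ARRAY     <ARRAY>         {structure}
         DATA(256,256)  <_REAL>      0.41,0.42,0.4,... 0.39

   A_FILE  <NDF>

      DATA_ARRAY(256,256)  <_REAL>   0.41,0.42,0.4,... 0.39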
The difference can normally be ignored, since the software will sense which variant is present and act accordingly. When you have to specify HDS objects to ‘hdstrace’, ‘creobj’ etc., you must of course check which variant is present. Simple NDFs are more flexible, and are also the standard variant.
There are a number of further components in addition to the data array that an NDF can have. There may be a variance array of the same shape as the data array, or title, label and unit strings. A relatively complete NDF might look like this:
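(an illustrative sketch, not an exhaustive trace)

   A_FILE  <NDF>

      TITLE          <_CHAR*32>      'An example object'
      LABEL          <_CHAR*32>      'Intensity'
      UNITS          <_CHAR*32>      'counts'
      DATA_ARRAY     <ARRAY>         {structure}
         DATA(256,256)  <_REAL>      0.41,0.42,0.4,... 0.39
      VARIANCE       <ARRAY>         {structure}
         DATA(256,256)  <_REAL>      0.01,0.01,0.01,... 0.01
      AXIS(2)        <AXIS>          {array of structures}

      Contents of AXIS(1)
         DATA_ARRAY(256)  <_REAL>    4000,4001.5,4003,... 4382.5
         LABEL          <_CHAR*32>   'Wavelength'
         UNITS          <_CHAR*32>   'Angstrom'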
The most remarkable addition, and quite commonly present, is the AXIS. AXIS is an array of structures, even if the data are one-dimensional. Each element AXIS(1), AXIS(2) ... is thus a structure. In fact, it is very similar to a minimal NDF. In the AXIS each pixel is given its own coordinate value. In general AXIS(i).DATA_ARRAY.DATA need be neither linear nor monotonic; however, most software may assume one or the other.
There are a few cases where Figaro treats NDFs differently from other Starlink packages. The details are as follows.
A great advantage in accessing data via the NDF library is that you can specify a section of the data rather than the full array. The section can specify a part of the array, or it might only partially overlap with the array, or they might not overlap at all. The section can also have a different dimensionality than the array itself. Here is an example specification of a seven-dimensional section, in each dimension a different feature of the section specification syntax is used:
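(an illustrative specification; the features used are explained below)

   a_file(5,3:7,:10,20:,:,1.5:2.5,100.0~5.0)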
The commas separate the dimensions (axes) from one another. Colons mean that there’s a range rather than a single value. If the start or end of a range is left out, the start or end of the NDF is used. Integer numbers are pixel indices, floating point numbers are pixel centre values (e.g. wavelengths). Instead of using a colon to separate start and end, you can also use a tilde to separate the centre of the section from the size of the section.
You can use NDF sections whenever existing data are accessed. You cannot use NDF sections to specify output data. And you cannot use NDF sections when access is to the HDS file rather than the data inside it.
Another great advantage of using the NDF library for data access is that data need not be in NDF format. Any foreign data format can be accessed, so long as a conversion to NDF is defined. An easy way to define a large number of format conversions is to start up the CONVERT package in addition to Figaro:
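(from the shell, after the usual Starlink initialisation)

   % figaro
   % convert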
At the time of writing this is the list of file name extensions and formats. The list is in order of preference of the formats.
Normally you need not concern yourself with the file name extension. The data access routines will go through the list and give the application the first data set found no matter which original format they are in. But if you want to be specific, you can say
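(illustrative; the backslashes mask the parentheses from the shell, as described earlier)

   % istat file.fit\(1:100,1:100\)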
if you want to use the disk-FITS file when there is also a ‘file.sdf’ lying around on your disk. As you can see, you can even use NDF sections on foreign data formats.
You can use foreign formats whenever data are accessed, be it input or output. You cannot use foreign formats when access is to an HDS file rather than to data. Things like ‘delobj’ can be used only on HDS files, which restricts them to NDF and DST format.
Whether you can actually store your data in a foreign format without loss is not guaranteed. Figaro occasionally uses a FITS extension, which you would lose if you store data in GIF. More relevant is that NDF and DST formats allow you to have non-linear pixel centre coordinates, and you are likely to lose such coordinates altogether if you store in FITS or IRAF format.
There is a penalty for using foreign formats rather than NDF. Accessing input and releasing output data takes that bit longer to do the format conversion. And it may take a lot longer if the disk is not local to the computer that does the processing. So if you use only Figaro and other Starlink packages, it is best to stick with NDF format. If every third application you use is from IRAF or MIDAS, you might be better off using disk-FITS.
NDF to ASCII and ASCII to NDF conversions can also be performed using the ex-SPECDRE applications ASCIN and ASCOUT. ASCIN will read axis values, pixel widths, data values, and data errors from an ASCII table into an NDF structure, while ASCOUT will create such an ASCII table from an NDF.
For more information on foreign formats see the Developer’s Guide on adding format conversion to NDF (SSN/20) and the user documentation on CONVERT (SUN/55).
One of the foreign formats is called the ‘Figaro format’—in fact it is Figaro’s old DST format. These days Figaro uses NDF format, but this has been so only since version 3.0.
Like NDF, DST is a specification of an HDS object hierarchy. Only the hierarchy is different, the same information is in different places.
The NDF shown in Section 3.2.2 would look in DST format like this:
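(again a sketch, mirroring the NDF example above; the structure type names are illustrative)

   A_FILE  <FIGARO>

      OBS            <OBS>           {structure}
         OBJECT      <_CHAR*32>      'An example object'
      Z              <IMAGE>         {structure}
         DATA(256,256)  <_REAL>      0.41,0.42,0.4,... 0.39
         ERRORS(256,256)  <_REAL>    0.1,0.1,0.1,... 0.1
         LABEL       <_CHAR*32>      'Intensity'
         UNITS       <_CHAR*32>      'counts'
      X              <AXIS>          {structure}
         DATA(256)   <_REAL>         4000,4001.5,4003,... 4382.5
         LABEL       <_CHAR*32>      'Wavelength'
         UNITS       <_CHAR*32>      'Angstrom'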
DST stores errors instead of variances. There is no ‘a_file.Y’ structure, since the second axis has default pixel coordinates. In DST format only non-default axes must actually exist.
There is also no quality array, instead the data array has bad values interspersed with good values. This is not a property of the DST format itself, but the way the format conversion treats a quality array.
Due to the flexibility of HDS object trees, it is possible to add extra information. Usually such information is recognised only by specific software. In the NDF format, Figaro uses a Figaro extension and a FITS extension. Any extension to an NDF is an object within a ‘MORE’ structure. Since axes are also a kind of NDF, they can have extensions, too. Figaro is probably the only package to use an extension to axes.
In DST format the information that would be in NDF extensions is not so obvious to locate. The following table translates between objects in an NDF extension and the equivalent in a DST file.
   DST              NDF
   ---------------  -------------------------
   .FITS            .MORE.FITS
   .OBS.TIME        .MORE.FIGARO.TIME
   .OBS.SECZ        .MORE.FIGARO.SECZ
   .Z.MAGFLAG       .MORE.FIGARO.MAGFLAG
   .OBS.OBJECT      .TITLE
   .Z.RANGE         .MORE.FIGARO.RANGE
   .extra           .MORE.FIGARO.extra
The FITS extensions in the two formats differ internally. In DST each FITS item is an HDS object. But in NDF format the whole FITS header is an array of strings, and each FITS item is a string in this array.
Needless to say, all but the FITS extension go overboard when you use other foreign data formats. You can inspect the FITS extension with ‘fitskeys’ and add or change items with ‘fitset’.
The NDF data access routines – together with the lower-level HDS routines – allow the construction of extensions to the data format. Extensions unknown to an application package are propagated, and extensions known to an application enable it to provide more functionality.
The Specdre applications use a set of extensions to the basic NDF format to communicate information between themselves. This extension is known (not unreasonably) as the Specdre Extension. It is recognised and understood by all the Specdre applications. Other Figaro applications, and applications in other Starlink packages such as KAPPA, simply propagate the Specdre extension unchanged without modifying (or understanding) it. This behaviour is not always correct, but in general it is the best that can be done. If the dataset is not processed with another Specdre application it does not matter if the Specdre extension and the basic NDF become inconsistent (because other applications will ignore the extension). If a Specdre application is presented with a dataset in which the Specdre extension is obviously inconsistent with the basic NDF it will notice and report an error.
Specdre contains a number of subroutines for easy and consistent access to the Extension.
Programmers are advised to contact the Figaro support username (e-mail figaro@star.rl.ac.uk) if they want to make use of the Specdre Extension in their programs.

All Specdre applications support version 0.7 of the Specdre Extension, which was introduced in version 0.7 of Specdre. Some applications support an advanced version 1.1 of the Extension, as introduced in version 1.1 of the package.
Design

The spectroscopic axis information is not quite complete with only label and units in the axis structure. For a frequency scale you have to know which reference frame it refers to; for wavelengths you might also be worried about the refractive index of the medium in which the wavelength is measured. And for radial velocity scales you need to know the rest frequency of the spectral line. Another thing most useful in spectroscopy would be to store continuum and line fits along with the data. The concept of extensions to an NDF provides a good means to store such information. The design of Specdre’s own extension is outlined here.
The Specdre Extension is the structure <myndf>.MORE.SPECDRE and is of type SPECDRE_EXT. <myndf> is the creation name of the main NDF, the one with the data, axes etc. <myndf>.MORE contains all the extensions to this main NDF. The Specdre Extension may contain other NDFs, thus we have to distinguish between the “main NDF” and the “Extension NDFs”.
Using the Specdre Extension

The demonstration script (demo_specdre) shows the use of the Specdre Extension. There is a tool editext to look at or manipulate the Extension. resamp will usually store some limited amount of information about the interdependence of post-resample pixels in the Extension. If you try to fitgauss such re-sampled data, that application will pick up and use the information in the Extension.
fitgauss or fitpoly will store their fit results in the Extension. In conjunction with NDF sections (see Sections 3.2.3 and B.4) you can work your way through the rows of a CCD long-slit spectrum, store each row’s fit in the Extension and fill the result storage space. The collection of all fits is again an NDF (located in the Extension) and can be fed into other applications like specplot, ascout, or indeed KAPPA’s display.
The Extension also provides a place to store a full-size array for the wavelength or frequency etc. of each pixel. This array of spectroscopic values may be created by grow, and is used by arcdisp to apply individual dispersion curves for the rows of an image. The image may be a long-slit spectrum, or the rows may be extracted fibre spectra or échelle orders.
There is some potential for confusion here: you may count pixels in your data starting at 1, you may use NDF pixel indices (which need not start at 1, e.g. in a section like ndf(-25:32,4:)), NDF pixel coordinates, a vector of pixel centre values, or an N-dimensional array of spectroscopic values. Since the array of spectroscopic values is stored in the Extension, you cannot expect KAPPA’s linplot to use it for axis labelling; Specdre’s specplot does use it, of course. You also cannot use this array to specify an NDF section in a parameter value.
.SPECAXIS is an _INTEGER scalar which defaults to 1. Its value is the number of the spectroscopic axis. This is the axis along which the one-dimensional ‘spectroscopic subsets’ extend. It must be greater than or equal to 1 and less than or equal to the number of axes in <myndf>. A change of specaxis may render other components invalid as regards their values or shapes.
.RESTFRAME is a _CHAR*32 scalar which defaults to ‘unknown’. Its value describes the rest frame used to express the observed frequency, wavelength, radial velocity, or redshift.
.INDEXREFR is a _REAL or _DOUBLE scalar which defaults to 1. Its value is the index of refraction needed to convert frequency into wavelength and vice versa.
.FREQREF is a _REAL or _DOUBLE scalar which defaults to the bad value. Its value is the reference frequency needed to convert between radial velocity and redshift on the one hand and frequency on the other hand. The value is evaluated together with .FREQUNIT:
.FREQUNIT is an _INTEGER scalar which defaults to 0. Its value is the common logarithm of the unit used for .FREQREF divided by Hertz; a value of 6, for example, means that .FREQREF is expressed in MHz (10^6 Hz). Note that this item is not of type _CHAR, _REAL or _DOUBLE.
.SPECVALS is an NDF structure with data, label and units components. .SPECVALS.DATA_ARRAY is a _REAL or _DOUBLE array which has the same shape as <myndf>. It defaults to a multiple copy of the centre array of the spectroscopic axis <myndf>.AXIS(specaxis).DATA_ARRAY. This structure contains spectroscopic axis values (either wavelength, or frequency, or velocity, or redshift) for each pixel of the data cube. Labels and units must be stored with this data structure. Their default values are copies from the spectroscopic axis <myndf>.AXIS(specaxis).LABEL, .UNITS in the main NDF, or ‘unknown’ if the axis has no label or unit. A modification of spectroscopic values may render parts of .RESULTS invalid, but no rules are formulated in this respect. This structure must not contain bad values.
.SPECWIDS is an NDF structure with only a data component. .SPECWIDS.DATA_ARRAY is a _REAL or _DOUBLE array which has the same shape as <myndf>. It defaults to (i) a derivative from .SPECVALS in the way prescribed by SUN/33, or (ii) a multiple copy of the width array of the spectroscopic axis. Just as .SPECVALS contains the multidimensional array of spectroscopic values for the pixel centres, this array contains the spectroscopic widths for the pixels. Labels and units are not to be stored with this data structure. This structure is always considered together with .SPECVALS. It must not contain bad values.
.COVRS is an NDF-type structure with only a data component. .COVRS.DATA_ARRAY is a _REAL or _DOUBLE array which has the same shape as <myndf>. .COVRS defaults to non-existence. The meaning is as follows: for some reason pixels belonging to the same spectroscopic subset may be interrelated. While no complete covariance matrix is stored, this structure holds the sum over the rows of the covariance matrix (cf. Meyerdierks 1992 [1]).

For multi-dimensional data note that this holds information only about the interrelation within any one spectroscopic subset, not between different such subsets. That means we know only about interrelation along the spectroscopic axis (within a spectrum) but not perpendicular to that axis (between spectra).
.RESULTS is an NDF-type structure with data and variance components and a number of extensions in the .MORE component. All these extensions are HDS vectors; they have either one element per component or one element per parameter. The shape of the .RESULTS structure is defined by (i) the shape of <myndf>, (ii) the total number of parameters tnpar > 0, and (iii) the number of components ncomp > 0. Each component has allocated a certain number of parameters npara(comp) >= 0. The total number of parameters must be

   tnpar = npara(1) + npara(2) + ... + npara(ncomp),

while the parameter index for any component comp runs from

   npara(1) + ... + npara(comp-1) + 1   to   npara(1) + ... + npara(comp).

For example, with ncomp = 2 and npara = (4,3), tnpar must be 7 and the parameters of the second component have indices 5 to 7.
Components are additive, i.e. the combined result is the sum of all those components that can be evaluated.
Note that the concept of a component is different from that of a transition or a kinematic component: a component is a line feature you can discern in the spectrum. Any component can in general be assigned a transition and a radial velocity. So you may have several components belonging to the same transition and several components of similar velocity belonging to different transitions. You may at any time decide that a discernible component has been misidentified and just change its identification. The concept of a component is even more general, in that it can be the continuum, in which case there is in general no laboratory frequency for identification.
There is, however, a restriction for data sets with more than one spectrum. Any component may differ from spectrum to spectrum only by the fitted values. The mathematical type and identification of components is common to all spectra.
.RESULTS.DATA_ARRAY and .RESULTS.VARIANCE are _REAL or _DOUBLE array structures which default to bad values. They have one axis more than <myndf>: the spectroscopic axis is skipped; instead new first and second axes are inserted. The first axis counts the fit parameters up to the maximum (tnpar). The second axis is of length 1 and may be used in future. All further axes are of the same length as the corresponding non-spectroscopic axes in <myndf>.
.RESULTS.MORE.LINENAME is a _CHAR*32 vector which defaults to ‘unidentified component’. There is one element for each component. Its value is a spectroscopist’s description of the component, such as ‘[OIII] 5007 v-comp #1’, ‘12CO J=1-0’, ‘nebulium’, ‘5500 K black body’. It is essential that the strings are of length 32.
.RESULTS.MORE.LABFREQ is a _REAL or _DOUBLE vector which defaults to bad values. There is one element for each component. The value is the laboratory frequency of the transition. The units used are the ones stored in .FREQUNIT. The laboratory frequency is the frequency as observed in the emitter’s rest frame. The meaning of this frequency is similar to that of .FREQREF in that the laboratory frequency of a transition is a useful value for the reference frequency of the velocity or redshift axis. The difference is that each component fitted may or may not have its own laboratory frequency. .FREQREF will usually be a copy of one of the elements of .RESULTS.MORE.LABFREQ.
.RESULTS.MORE.COMPTYPE is a _CHAR*32 vector which defaults to ‘unknown function’. There is one element for each component. Its value is a mathematician’s description of the component, such as ‘Gauss’, ‘triangle’, ‘Lorentz-Gauss’, ‘Voigt’, ‘polynomial’, ‘Chebyshev series’, ‘sine’. It is essential that the strings are of length 32.
.RESULTS.MORE.NPARA is an _INTEGER vector defaulting to INT(tnpar/ncomp). There is one element for each component. Its value is the number of parameters stored for that component. When more components are added to an existing .RESULTS structure, then the new components are allocated by default INT((tnpar - tnpar_old) / (ncomp - ncomp_old)) parameters. The numbers of parameters must be greater than or equal to zero.
.RESULTS.MORE.MASKL and .RESULTS.MORE.MASKR are _REAL or _DOUBLE vectors which default to bad values; both are of the same type. There is one element in either vector for each component. A component comp is evaluated according to type and parameters in the range of spectroscopic values between maskl(comp) and maskr(comp). The component is assumed to be zero outside this interval. Bad values indicate that the range is not restricted.
.RESULTS.MORE.PARATYPE is a _CHAR*32 vector which defaults to ‘unknown parameter’. There is one element for each parameter. Its value is a mathematician’s description of the parameter. A Gauss profile might be specified by parameters ‘centre’, ‘peak’, ‘sigma width’, ‘integral’. It is essential that the strings are of length 32.

It should be noted that there exist no rules about how to store certain components, such as ‘A Gauss profile must be called Gauss and have parameters such-and-such’. What is stored should be described by the strings so that a human reader knows what it is all about. This does not prevent pairs of applications from storing and retrieving components if they use the strings in a consistent way. The documentation of any application that writes or reads results should specify what strings are written or recognised.

Specdre Extension v. 1.1

The items of the Extension added with Specdre version 1.1 are described below. All top-level items are optional, but each is either complete or absent.
.COORD1 and .COORD2 are NDF structures each with data, label and units components. .COORDi.DATA_ARRAY are _REAL or _DOUBLE. Both have the same shape, which is similar to that of <myndf>. The only difference is that in .COORDi the spectroscopic axis is degenerate (has only one pixel). Either both or neither of .COORD1 and .COORD2 must exist. The data values default to a multiple copy of the pixel centres of the main array along the first and second non-spectroscopic axes. For example, if .SPECAXIS is 2, then .COORD1.DATA_ARRAY is a multiple copy of <myndf>.AXIS(1).DATA_ARRAY and .COORD2.DATA_ARRAY is a multiple copy of <myndf>.AXIS(3).DATA_ARRAY. In the same example, if <myndf> has shape nx by ny by nz, then both .COORDi have shape nx by 1 by nz. .COORDi store for each spectrum a two-dimensional position. This position could be used by a plot routine to position spectra according to COORDi on the plot. The values may or may not be sky positions. They could be in any coordinate system and using any units. Labels and units must be stored with both NDF structures. Their default values are copies from the relevant axes of <myndf>, or ‘unknown’ if the relevant axis has no .LABEL or .UNITS. The data components must not contain bad values.
One difference between .SPECVALS/.SPECWIDS and .COORDi is important to note. .SPECVALS and .SPECWIDS are to be used (within Specdre) as replacements of the centres and widths in <myndf>.AXIS(specaxis). But .COORDi are not intended as replacements for the corresponding axis information; these structures are used only in special circumstances. In practice the consequences are as follows. Where spectroscopic values or widths are needed, .SPECVALS/.SPECWIDS must override <myndf>.AXIS(specaxis), but .COORDi are ignored completely. Where an application needs the positional information in .COORDi, then these structures are looked for. Only in their absence are the relevant axes in <myndf> used to generate the same information.
This is how hdstrace sees the file before the Extension is added:

Now almost all components that can exist in a Specdre Extension are set, mostly to their default values. The only component missing is the sum of rows of a covariance matrix. This is because that structure usually must not exist: other structures can be assigned ‘harmless’ values, but the simple existence of .COVRS makes a difference. The Extension was actually made with the editext command, which can also list a summary of the Extension:

Using hdstrace to list the NDF now yields:
The .AXIS(3) structure will in general remain untouched. Each spectrum is given a position in COORD1/COORD2 space: a two-dimensional main NDF might actually be an arbitrary sequence of spectra, and still these two structures could help to sort each spectrum into its place on the sky.

Usually the result structure will be manipulated by applications as they find it necessary to store data in it. But most of the structure can be manipulated explicitly with the command editext.
The .PARATYPE vector has four elements, one for each parameter. The sole component provided for in the result NDF is the ‘LV component’ of the 21 cm line. Its laboratory frequency is repeated here; the value is independent of the reference frequency above, but the same unit (MHz) is used. We obviously expect that the spectral line has the shape of a Gauss curve and we want to store four parameters describing that curve. .PARATYPE indicates the meaning of all parameters. No mask is enabled to limit the velocity range where the Gauss curve applies, i.e. .MASKL and .MASKR have the bad value.
We might want to also store the results for a parabolic baseline fit. Then we would add a second spectral component with three parameters. The vectors that are now of length 1 would become of length 2; .PARATYPE would become of length 7. The additional second vector elements would be ‘baseline’, bad value, ‘polynomial of 2nd order’, 3, bad value, bad value. The fifth to seventh elements of .PARATYPE could be ‘coeff. 0’, ‘coeff. 1’, ‘coeff. 2’.
The Twodspec applications use a unique results structure to store identifications and parameters of line fits.
COMB, ARCSDI, ARC2D and LONGSLIT create a .RES structure in which to store their results. This structure enables repeat fits etc. to be performed easily, as well as making it unnecessary to do all the fits at one time. Most of this structure is mapped, and its arrays are thus accessed directly from the programs: the program obtains the address of the first element of an array in the file. Fortran passes all arguments to subroutines and functions by address, although for character data this is the address of the descriptor, which includes the address of the data. Therefore, if this address is passed to a subroutine or function (its value, not its address), the called routine will treat the data as a normal array. For character data a descriptor must be constructed and passed by address. In the interests of portability it is better to pass the address via an array in common: the element of the common array that falls at the mapped address is passed, even though that element lies outside the declared bounds of the array. For character strings the string must be passed as string(start:end). The arrays are in common so that they can be referenced with the same offset from different subroutines.
Since COMB performs fitting of continua rather than lines, the structure for COMB is different from that for the other programs in that the arrays are dimensioned in channels where other programs would dimension them in cross-sections.
The elements of the structure are listed below:
The .TRAML and .TRAMR arrays store the line positions, that is the limits considered for optimisation (assumed at the centre by ARC2D); the .IDS array stores the line identifications and .REST_WAVE their wavelengths. The .ARC array is used by ARC2D to decide which lines are to be used in the evaluation of the relationship between channel number and wavelength. If the element of .ARC is 0, the line is included under all circumstances; if it is 1, it is included for non-‘continuity corrected’ data only; otherwise it is not included at all. A value of 4 indicates that no fits are present, while 10 and 11 are the values set if the user manually deletes a line which previously had the value 0 or 1 respectively. If the value is 10 or 11 the fits can be ‘undeleted’.
The TEMPLATE structure keeps a record of the one-dimensional spectrum used for line identification. Since the axis array is also kept, it is possible to CLONE from such an array, even if the main data array has been scrunched.
The DATA_ARRAY array is used to store the results of the Gaussian fitting, and is also used by ARC2D to store the results for the continuity correction. The errors on the results are stored as variances in the VARIANCE array. The .PARAMS structure acts as an index to this (used by the program to determine where to store results). The .CONTROL array gives the fit type to be performed (in the same form as the fit status element of the .RESULTS structure). In the above example, 171 is the number of cross-sections in the image and 10 is the maximum number of lines allowed for in the structure (this can be up to 50). Originally the block number was stored, but this gave an ambiguous way of determining where the block starts and ends. Therefore the starting cross-section of the fit is now stored (it is possible to change the blocking and only alter a small number of fits).
The fit is performed on data extracted from ‘nwindow’ cross-sections starting at ‘first cross-section’.
Table 1: the coding of the fit status words.

   Element  Digit  Number  Name                         Refer to as  Meaning
   -------  -----  ------  ---------------------------  -----------  -------------------------------------------
   1        1      1       Absorption flag              FIT_ABS      0 - Emission
                                                                     1 - Absorption
            2-3    2       Profile model                FIT_MODEL    0 - none
                                                                     1 - Gaussian
                                                                     2 - Skew Gaussian
                                                                     3 - Cauchy/Gaussian
                                                                     4 - Centroid
                                                                     5 - Lorentzian
            4-5    3       Fit type                     FIT_TYPE     0 - none
                                                                     1 - single
                                                                     2 - double (fixed separation)
                                                                     3 - double (fixed width ratio)
                                                                     4 - double (fixed height ratio)
                                                                     5 - double (independent)
                                                                     6 - multiple
            6      4       Number of components         FIT_NCMP     or may act as maximum
            7      5       Weights method for profiles  FIT_WEIGH    0 - Uniform
                                                                     1 - Variance
                                                                     2 - Biweights
                                                                     3 - Huber (a robust estimator)
                                                                     4 - Cauchy
                                                                     5 to 9 - Not used
            8      6       Profile fit status           FIT_STAT     0 - No fit
                                                                     1 - Success
                                                                     2 - NAG error (not serious)
                                                                     3 - NAG error (serious)
                                                                     4 - Crash
                                                                     5 - Failed tols
                                                                     6 - Failed tols (non-serious NAG error)
   2        1      7       Manual guessing flag         FIT_MAN      0 - No manual guessing
                                                                     1 - Manual guessing (between below and fit)
            2-3    8       First guess method           FIT_GUES     1 - Centroid
                                                                     2 - Peak
                                                                     3 - Bimodef
                                                                     4 - Inherit FORWARD
                                                                     5 - Previous answer at this place
                                                                     6 - Inherit BACKWARD
                                                                     7 - REGION (2d for TAURUS, to be defined)
                                                                     8 - ROBUST estimator
                                                                     9 - MODEL (synthetic model, e.g. rotation curve)
                                                                     10 - P Cygni
            4      9       Optimization method          FIT_OPT      Choice of routines
            5      10      Component number control     FIT_STST     0 - Fit up to number of components requested
                                                                     1 - AIC after fitting
                                                                     2 - AIC before fitting (using guesses)
            6      11      Constraints method           FIT_CONSTR   0 - No constraints
                                                                     1 - Bounds only
                                                                     2 - General constraints (read from constraints structure)
                                                                     3 - EQUATIONS (read from equations structure)
            7      12      Dynamic weights flag         FIT_DYNWEI   0 - No dynamic weights
                                                                     1 - Use dynamic weights
            8-9    13      Fit group                    FIT_GROUP    Number of group
   3        1      1       Method of removal            BACK_REMOV   0 - subtract
                                                                     1 - divide
            2-3    2       GLOBAL background model      BACK_MODEL   0 - No base
                                                                     1 - Flat base
                                                                     2 - Cubic spline
                                                                     3 - Chebyshev fit NOW
                                                                     4 - FITCONT cheby
                                                                     5 - Power law
                                                                     6 - Black body
                                                                     7 - EMPIRICAL
                                                                     8 - Black body with optical depth cutoff
            4-5    3       Background order             BACK_ORDER
            6      4       Weight function to be used   BACK_WEIGH   as for entry 5 above (FIT_WEIGH)
            7      5       Optimization method          BACK_OPT
            8      6       Local fit flag               BACK_LOCAL   0-9, codes as above
            9      7       Success of fit               BACK_STAT
For background model 3 the polynomial is fitted immediately before the line profile fitting is carried out. For background model 4 the polynomial is evaluated using coefficients previously stored in the data file and subtracted before fitting.
The FIT_STATUS array contains information as to the type of fit performed and as to the success or otherwise of the fit (see table 1).
Only LONGSLIT and FIBDISP are able to fit multiple Gaussians, so ARC2D, COMB, and ARCSDI use a smaller results structure.
The last element in this direction is the density scale factor, used for scaling previous fits for use as first guesses to another fit.
The elements of the CONTROL array are the same as those of the FIT_STATUS array, except that they do not include information on the success of the fit. This array is used to set which type of fit is to be performed.
The .ITMASK array and .ITERATION are used in conjunction to prevent accidental refitting of data. Note that a check is made to prevent a fit being accidentally overwritten by a simpler fit. The order of increasing complexity is: single Gaussian, skew Gaussian, Cauchy function, double Gaussian, multiple Gaussian. Both this type of checking and the checking using .ITMASK and .ITERATION can be overridden in MANUAL mode.
The .TOLS array provides a way of retaining the values for tolerances between successive work on the same file, and also (using the FIGARO function LET, or the CLONE COPY option in LONGSLIT), provides a way of transferring these from one file to another. This retention enables tolerances to be applied in batch, since it is much simpler to specify which tolerances to apply, rather than to list all the values to use. It is also used during multiple fitting in batch to determine the number of components to fit.
The results structure for three-dimensional data arrays used by FIBDISP is similar to the above, but has some extra elements. The VARIANT element gives the form of the relationship between the array elements of the main data array and their actual positions. For type=‘HEX’ there is a .XDISP array: XDISP[IY] defines the displacement in the X direction of .Z.DATA[IX,IY], relative to the value given by X.DATA[IX] (that is, the element of the data of axis number one). ORDER (at present always ‘SORT’) indicates whether the data are arranged with the first or the last array index varying over wavelength; SORT corresponds to the first axis being wavelength. TOTAL_INTENSITY stores the total intensity integrated along the wavelength axis, and is useful for locating interesting areas of the data.
From version 5.2-0 onwards Figaro’s error-propagation capabilities have been expanded and enhanced. Consequently many data sets may now be reduced with the error information propagated through to the final result. This section describes the error-propagation features of Figaro.
For NDF format files, the default is to store the error information as an array of variance values (i.e. the uncertainties squared). If the error information in your data is not stored as variances, some routines will not propagate the errors properly (this is true of the old Figaro DST format, which stores uncertainties).
Note that in Figaro the variance array must not contain bad values. ‘goodvar’ can be used to clean up an offending data set.
To check that a variance structure exists use the command ‘hdstrace’. If your data already contains a variance array, the output from ‘hdstrace’ will look something like this:
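(a sketch for a one-dimensional spectrum; names and values are illustrative)

   FILE  <NDF>

      DATA_ARRAY     <ARRAY>         {structure}
         DATA(1024)     <_REAL>      208.92,208.12,207.4,... 210.6
      VARIANCE       <ARRAY>         {structure}
         DATA(1024)     <_REAL>      208.92,208.12,207.4,... 210.6

   End of Trace.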
This trace indicates that there is a variance array present in the file.
If the output for one of your files does not contain the VARIANCE line, your file contains no usable error information.
Only some Figaro applications propagate error values. The full list of Figaro routines which propagate the variances is:
Note that if you use an application which does not propagate the variances, you will see the following message:
What this means is that since the variance array is now no longer correct, it has been deleted to prevent its use. Running ‘hdstrace’ on the new file shows that the variance structure is gone. The obvious lesson from this is to stick to those Figaro routines which keep the variance structure intact wherever possible. Alternatively, Starlink packages such as KAPPA offer other error-propagating tasks.
Raw data from the telescope probably don’t contain a variance array. The recommended way of getting error information into a file is when the data are in their earliest stages of reduction, i.e. when de-biasing. The CCDPACK (see SUN/139) applications ‘makebias’ and ‘debias’ will create a variance array in the de-biased data files which Figaro can use in the remaining stages of reduction.
Variances are propagated using a standard equation (see e.g. equation 3.13 of Bevington and Robinson [2]). Note that covariances are not calculated, to save computational time. Thus calculated variances will not be formally correct for problems with correlated errors.
In order to avoid losing your error information during the reduction process, the recipe outlined below describes a recommended reduction path. Note that routines from both CCDPACK and KAPPA are employed. If you are working at a Starlink site, these will already be installed ready for your use. If you are not working from a Starlink site, you might wish to get these packages (e.g. from the Starlink Software Store on the World Wide Web) if you don’t already have them.
CCDPACK’s ‘makeflat’ is the recommended way of producing a master flat-field frame. However, either CCDPACK or Figaro may be used to perform the flat-fielding operation. The recommended Figaro flat-fielding routine is ‘ff’.
‘Polysky’ should be used to subtract the sky background from your data. Note that there is no direct way of obtaining the sky variance from the region containing the object spectrum. For this reason it is assumed that the variance in the object region is the average residual to the polynomial fit in the sky regions. Those outlying points not used to calculate the sky fit are not included in the calculation of the average residual.
Both the Figaro ‘profile’ and ‘optextract’ routines support error propagation. A normal extraction can be performed with ‘extract’.
The routines ‘bclean’ and ‘sclean’ now partially support propagation of error information. More specifically, bad pixels are interpolated over as before and the associated variance values are set to zero to indicate the data are ‘fudged’.
Some Figaro routines use the variance arrays to weight the data values during fitting. When a zero variance is found (e.g. because ‘bclean’, ‘sclean’ and/or ‘cset’ were used), the data will not be given infinite weight! Instead, some routines (such as ‘polysky’) will set the weight of such a point to be zero. Some routines (e.g. ‘ff’) may revert to using uniform weighting in this situation.
To be absolutely sure that the variances are used to weight a fit, one might wish to set zero variances to some high value. (The reason that a large value is not set by default is to prevent numerical overflows in later stages of the reduction process). The way this is achieved is using the KAPPA routines ‘setmagic’ and ‘nomagic’:
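For example (a sketch: the in/out/repval parameters are the usual KAPPA ones, but the comp parameter for selecting the variance component is assumed here; check SUN/95 for the exact usage):

   % setmagic in=file1 out=file2 comp=variance repval=0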
This replaces any zeros in the file called ‘file1.sdf’ with a ‘bad’ value and writes the output to ‘file2.sdf’.
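A corresponding ‘nomagic’ step (with the same caveats about parameter names):

   % nomagic in=file2 out=file3 comp=variance repval=1.0e20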
This replaces the ‘bad’ values with a high value (1.0e+20).
Now the fitting routine may be run, giving a very low weight to the previously bad pixels. Finally, the conversion is reversed (again a sketch with the same assumed parameter names):
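   % setmagic in=file3 out=file4 comp=variance repval=1.0e20
   % nomagic in=file4 out=file5 comp=variance repval=0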
This replaces the high values with a zero value. Any remaining stages of data reduction can now be carried out without the worry of encountering numerical overflow from the variances. Note that any variances which were bad for any reason other than being set as bad by the user will also be returned to zero.
Figaro has three directories which are used to keep various data files, such as flux calibration tables and arc line lists. The locations of these directories are not fixed, but rather they are referred to by environment variables. These environment variables are:
   environment variable    meaning
   --------------------    -------------------
   FIGARO_PROG_S           default or standard
   FIGARO_PROG_L           local
   FIGARO_PROG_U           user
FIGARO_PROG_S is a standard directory which is always available and contains the same files in all Figaro installations. On standard Starlink systems it corresponds to directory:
FIGARO_PROG_L contains files local to your site and FIGARO_PROG_U your own personal files. They may or may not be defined at your site. You could define your own Figaro directory by adding a line similar to the following to your ‘.login’ script (the directory name is arbitrary):
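   setenv FIGARO_PROG_U $HOME/figaro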
By default Figaro applications will search for data files in these directories in the following order: first the current working directory, then FIGARO_PROG_U, then FIGARO_PROG_L and finally FIGARO_PROG_S.
[1] H. Meyerdierks, 1992, ‘Fitting Resampled Spectra’, in P.J. Grosbøl and R.C.E. de Ruijsscher (eds), 4th ESO/ST-ECF Data Analysis Workshop, Garching, 13–14 May 1992, ESO Conference and Workshop Proceedings No. 41, Garching bei München.
[2] P.R. Bevington and D.K. Robinson, 1992, Data Reduction and Error Analysis for the Physical Sciences, second edition (McGraw-Hill: New York). See especially p. 43.