Occasionally you’ll want to work with parts of a filename, such as the path or the file type. The C-shell provides filename modifiers that select the various portions. A couple are shown in the example below.
Suppose the first argument of a script, $1, is a filename called galaxy.bdf. The value of variable type is bdf and name equates to galaxy because of the presence of the filename modifiers :e and :r. The rest of the script uses the file type to control the processing, in this case to provide a listing of the contents of a data file using the Hdstrace utility.
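A minimal sketch of such a script might run as follows; the particular action taken for each file type is illustrative only.

#!/bin/csh
# The first argument is a filename, for example galaxy.bdf.
set type = $1:e        # file type, here bdf
set name = $1:r        # name, here galaxy

# Use the file type to select the processing.
if ($type == "sdf") then
   hdstrace $name      # list the contents of the HDS data file
else
   echo "$1 is not an HDS file."
endif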
The complete list of modifiers, their meanings, and examples is presented in the table below.
Modifier | Value returned | Value for filename /star/bin/kappa/comwest.sdf |
:e | Portion of the filename following a full stop; if the filename does not contain a full stop, it returns a null string | sdf |
:r | Portion of the filename preceding a full stop; if there is no full stop present, it returns the complete filename | comwest |
:h | The path of the filename (mnemonic: h for head) | /star/bin/kappa |
:t | The tail of the file specification, excluding the path | comwest.sdf |
One of the most common things you’ll want to do, having devised a data-processing path, is to apply those operations to a series of data files. For this you need a foreach...end construct.
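Something like the following should serve, where the FITS files are assumed to have the .fit extension.

foreach file (*.fit)
   stats $file
end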
This takes all the FITS files in the current directory and computes the statistics for them using the stats command from Kappa. file is a shell variable. Each time in the loop, file is assigned the name of the next file of the list between the parentheses. The * is the familiar wildcard which matches any string. Remember when you want to use the shell variable's value you prefix it with a $. Thus $file is the filename.
Some data formats like the NDF demand that only the file name (i.e. what appears before the last dot) be given in commands. To achieve this you must first strip off the remainder (the file extension or type) with the :r file modifier.
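A sketch of such a loop follows; the accept keyword (taking the suggested parameter defaults) is included only for illustration.

foreach file (*.sdf)
   histogram $file:r accept
end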
This processes all the HDS files in the current directory and calculates a histogram for each of them using the histogram command from Kappa. It assumes that the files are NDFs. The :r instructs the shell to remove the file extension (the part of the name following the rightmost full stop). If we didn't do this, the histogram task would try to find NDFs called SDF within each of the HDS files.
You can give a list of files separated by spaces, each of which can include the various UNIX wildcards. Thus the code below would report the name of each NDF and its standard deviation. The NDFs are called ‘Z’ followed by a single character, ccd1, ccd2, ccd3, and spot.
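A sketch of such a loop is below; the wildcards are chosen to match the NDFs just described, and the exact report format is illustrative.

foreach file (Z?.sdf ccd[1-3].sdf spot.sdf)
   set ndf = $file:r
   echo "NDF: $ndf"
   echo `stats $ndf | grep "Standard deviation"`
end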
echo writes to standard output, so you can write text, including values of shell variables, to the screen or redirect it to a file. Thus the output produced by stats is piped (the | is the pipe) into the UNIX grep utility to search for the string "Standard deviation". The backquotes (` `) invoke the command, and the resulting standard deviation is substituted.
You might just want to provide an arbitrary list of NDFs as arguments to a generic script. Suppose you had a script called splotem, and you have made it executable with chmod +x splotem.
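The script itself might look something like this minimal sketch, which assumes Figaro's splot does the plotting and that accept takes the suggested parameter defaults.

#!/bin/csh
# Plot each spectrum supplied as an argument, provided the file exists.
foreach file ($argv[*])
   if (-e $file) then
      splot $file:r accept
   endif
end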
Notice the -e file-comparison operator. It tests whether the file exists or not. (Section 12.4 has a full list of the file operators.) To plot a series of spectra stored in NDFs, you just invoke it something like this.
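For example (the NDF names are invented for illustration):

% ./splotem myndf.sdf another.sdf
% ./splotem arc[a-z].sdf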
See the glossary for a list of the available wildcards such as the [a-z] in the above example.
In the splotem example from the previous section the list of NDFs on the command line required the inclusion of the .sdf file extension. Having to supply the .sdf for an NDF is abnormal. For reasons of familiarity and ease of use, you probably want your relevant scripts to accept a list of NDF names and to append the file extension automatically before the list is passed to foreach. So let's modify the previous example to do this.
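A hedged sketch of the modified script follows; as before, Figaro's splot is assumed to do the plotting.

#!/bin/csh
# Build the list of NDFs, appending the .sdf file extension to each argument.
set ndfs
set i = 1
while ($i <= $#argv)
   set ndfs = ($ndfs[*] $argv[$i]".sdf")
   @ i = $i + 1
end

# Plot each spectrum in turn.
foreach file ($ndfs[*])
   if (-e $file) then
      splot $file:r accept
   endif
end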
Each argument has the .sdf file extension appended and is added to the shell variable ndfs. The set defines a value for a shell variable; don't forget the spaces around the =. $ndfs[*] means all the elements of variable ndfs. The loop adds elements to ndfs, which is initialised without a value. Note the necessary parentheses around the expression ($ndfs[*] $argv[$i]".sdf").
On the command line the wildcards have to be passed verbatim, because the shell will try to match with files that don't have the .sdf file extension. Thus you must protect the wildcards with quotes. It's a nuisance, but the advantages of wildcards more than compensate.
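For instance (the patterns are illustrative):

% ./splotem 'arc[a-z]' '*'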
If you forget the quotes, you'll receive a No match error.
A common need is to browse through several datasets, perhaps to locate a specific one, or to determine which are acceptable for further processing. The following presents images of a series of NDFs using the display task of Kappa. The title of each plot tells you which NDF is currently displayed.
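A minimal sketch follows; the five-second pause is illustrative, and display is assumed to pick up the NDF's name or title for the plot.

foreach file (*.sdf)
   set ndf = $file:r
   echo "Displaying $ndf"
   display $ndf accept
   sleep 5
end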
sleep pauses the process for a given number of seconds, allowing you to view each image. If this is too inflexible you could add a prompt so the script displays the image once you press the return key.
You can substitute another visualisation command for display as appropriate. You can also use the graphics database to plot more than one image on the screen or to hardcopy. The script $KAPPA_DIR/multiplot.csh does the latter.
Thus far the examples have not created a new file. When you want to create an output file, you need a name for it. This could be an explicit name, one derived from the process identification number, one generated by some counter, or from the input filename. Here we deal with all but the trivial first case.
To help identify datasets and to indicate the processing steps used to generate them, their names are often created by appending suffixes to the original file name. This is illustrated below.
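A sketch of the smoothing loop follows; the smoothing parameters themselves are left to the suggested defaults via accept.

foreach file (*.sdf)
   set ndf = $file:r
   block in=$ndf out=${ndf}_sm accept
end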
This uses block from Kappa to perform block smoothing on a series of NDFs, creating new NDFs, each of which takes the name of the corresponding input NDF with a _sm suffix. The accept keyword accepts the suggested defaults for parameters that would otherwise be prompted. We use the set to assign the NDF name to variable ndf for clarity.
If a counter is preferred, this example names the output NDFs smooth1, smooth2, and so on.
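One possible form of such a counter is sketched below.

set count = 1
foreach file (*.sdf)
   block in=$file:r out=smooth$count accept
   @ count = $count + 1
end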
Whilst appending a suffix after each data-processing stage is feasible, it can generate some long names, which are tedious to handle. Instead you might want to replace part of the input name with a new string. The following creates another shell variable, ndfout, by replacing the string _flat from the input NDF name with _sm. The script pipes the input name into the sed editor which performs the substitution.
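A sketch of the substitution (the variable ndf is assumed to hold the input NDF name):

set ndfout = `echo $ndf | sed 's#_flat#_sm#'`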
The # is a delimiter for the strings being substituted; it should be a character that is not present in the strings being altered. Notice the backquotes (` `) in the assignment of ndfout. These instruct the shell to process the expression immediately, rather than treating it as a literal string. This is how you can put values output from UNIX commands and other applications into shell variables.
There is a special class of C-shell operator that lets you test the properties of a file. A file operator is used in comparison expressions of the form if (file_operator file) then. A list of file operators is given in the table below.
The most common usage is to test for a file’s existence. The following only runs cleanup if the first argument is an existing file.
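For example, where cleanup stands for whatever tidying command or script you wish to run:

if (-e $1) then
   cleanup $1
endif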
File operators
Operator | True if: |
-d | file is a directory |
-e | file exists |
-f | file is ordinary |
-o | you are the owner of the file |
-r | file is readable by you |
-w | file is writable by you |
-x | file is executable by you |
-z | file is empty |
Here are some other examples.
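These are illustrative only; the filenames and directory are invented.

# Create a scratch directory if one does not already exist.
if (! -d /tmp/$user) mkdir /tmp/$user

# Only append to the logfile if we may write to it.
if (-w logfile.lis) echo "Processing completed" >> logfile.lis

# Remove an empty output file.
if (-z output.lis) rm output.lis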
A frequent feature of scripts is redirecting the output from tasks to a text file. For instance, you might direct the output of hdstrace and fitshead to text files. The name of the first can be generated from the name of the file whose contents are being listed, so for HDS file cosmic.sdf the trace is stored in cosmic.lis. In the second case, the process identification number forms the name of the text file; you can include this special variable to generate the names of temporary files. (The :r is described in Section 12.1.)
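Sketches of the two cases, assuming file and fitsfile hold the names of the HDS and FITS files respectively:

hdstrace $file:r > $file:r.lis
fitshead $fitsfile > $$.lis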
If you intend to write more than once to a file you should first create the file with the touch command, and then append output to the file.
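A hedged sketch, matching the description that follows (fitslist lists the FITS headers of each NDF):

touch logfile.lis
foreach file (*.sdf)
   set ndf = $file:r
   echo "FITS headers for $ndf" >> logfile.lis
   fitslist $ndf >> logfile.lis
   echo " " >> logfile.lis
end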
In this example the output is collected in the file logfile.lis. There is a heading including the dataset name and a blank line between each set of headers. Notice this time we use >> to append. If you try to redirect with > to an existing file you'll receive an error message whenever you have the noclobber variable set. >! redirects regardless of noclobber.
There is an alternative—write the text file as part of the script. This is often a better way for longer files. It utilises the cat command to create the file.
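A sketch follows; the particular lines written to the file are illustrative, and refndf and distance are assumed to be existing shell variables.

cat >! catpair_par.lis <<EOF
$refndf
comparison
$distance
EOF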
This writes the text between the two EOFs to file catpair_par.lis. Note the second EOF must begin in column 1. You can choose the delimiting words; common ones are EOF and FOO. Remember the >! demands that the file be written regardless of whether the file already exists.
A handy feature is that you can embed shell variables, such as refndf and distance in the example. You can also include commands between backquotes (` `); the commands are evaluated before the file is written. However, should you want the special characters $, \, and ` to be treated literally, insert a \ before the delimiting word or a \ before each special character.
The last technique might be needed if your script writes another script, say for overnight batch processing, in which you want a command to be evaluated when the second script is run, not the first. You can also write files within this second script, provided you choose different words to delimit the file contents. Here’s an example which combines both of these techniques.
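A hedged sketch follows; the /tmp paths, the contents of the data file, and the processing step are all illustrative.

cat >! /tmp/${user}_$runno.csh <<EOF
#!/bin/csh

# Write a three-line data file when this temporary script runs.
cat >! /tmp/data_$runno.dat <<EOD
\`date\`
A literal line of data
`date`
EOD

# The overnight processing commands would go here.

# Tidy up: the temporary script removes itself and the data file.
rm /tmp/data_$runno.dat /tmp/${user}_$runno.csh
EOF

chmod +x /tmp/${user}_$runno.csh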
This excerpt writes a script in the temporary directory. The temporary script's filename includes our username ($user) and some run number stored in variable runno. The temporary script begins with the standard comment indicating that it's a C-shell script. The script's first action is to write a three-line data file. Note the different delimiter, EOD. This data file is created only when the temporary script is run. As we want the time and date information at run time to be written to the data file, the command-substitution backquotes are both escaped with a \. In contrast, the final line of the data file is evaluated before the temporary script is written. Finally, the temporary script removes itself and the data file. After making a temporary script, don't forget to give it execute permission.
There is no simple file reading facility in the C-shell. So we call upon awk again.
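One possible form, reading the whole of a file called sources.lis (the filename is illustrative):

set file = `awk '{print $0}' sources.lis`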
Variable file is a space-delimited array of the lines in file sources.lis. More useful is to extract a line at a time.
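For example (again a sketch, using awk's NR built-in variable):

set text = `awk -v ln=$j 'NR == ln {print $0}' sources.lis`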
where shell variable j is a positive integer no greater than the number of lines in the file, returns the jth line in variable text.
When reading data into your script from a text file you will often need to extract columns of data, determine the number of lines extracted, and sometimes find the number of columns or select columns by heading name. The shell does not offer file-reading commands, so we fall back heavily on our friend awk.
The simple recipe is
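something along these lines (a sketch; the variable and file names are illustrative):

set shape = `awk 'END {print FNR, NF}' sources.lis`   # $shape[1] lines, $shape[2] fields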
where FNR is the number of the records read, and NF is the number of space-separated fields in that record. If another character delimits the columns, this can be set by assigning the FS variable without reading any of the records in the file (because of the BEGIN pattern), or through the -F option. FNR, NF, and FS are called built-in variables.
There may be a header line or some schema before the actual data; in that case you can obtain the field count from the last line.
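A sketch of this approach:

set lines = `wc -l sources.lis`
set nfields = `awk -v nl=$lines[1] 'FNR == nl {print NF}' sources.lis`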
First we obtain the number of lines in the file using wc, stored in $lines[1]. This shell variable is passed into awk, as variable nl, through the -v option.
If you know the comment line begins with a hash (or can be recognised by some other regular expression) you can do something like this.
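For instance (a sketch):

set nfields = `awk 'BEGIN {i = 0} $0 !~ /^#/ {i++; if (i == 1) print NF}' sources.lis`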
Here we initialise awk variable i. Then we test that the record $0 does not match a line starting with # and increment i, and only print the number of fields for the first such occurrence.
For a simple case without any comments.
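A sketch:

set col1 = `awk '{print $1}' sources.lis`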
Variable col1 is an array of the values of the first column. If you want an arbitrary column,
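something along these lines should serve (again a sketch):

set col = `awk -v cn=$j '{print $cn}' sources.lis`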
where shell variable j is a positive integer and no more than the number of columns, returns the jth column in the shell array col.
If there are comment lines to ignore, say beginning with # or *, the following excludes those from the array of values.
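One possible form, in which lines beginning with #, *, or a letter are skipped (the choice of column is illustrative):

set col2 = `awk '$0 !~ /^[#*A-DF-Za-df-z]/ {print $2}' sources.lis`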
E and e are omitted to allow for exponents.
awk lets you select the lines to extract through boolean expressions, including ones that involve the column data themselves, or through line numbers via NR.
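A couple of sketches follow; the column numbers and the selection criteria are illustrative.

# The third column, but only for the first ten records.
set col3 = `awk 'NR <= 10 {print $3}' sources.lis`

# The fourth column, excluding values flagged with -999.
set col4 = `awk '$4 != -999 {print $4}' sources.lis`
set number = $#col4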
In the second case values flagged with -999 are excluded.
You can find out how many values were extracted through $#var, such as in the final line above.
You have the standard relational and boolean operators available, as well as ~ and !~ for match and does-not-match respectively. These last two can involve regular expressions, giving powerful selection tools.
Suppose your text file has a heading line listing the names of the columns.
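You can locate the wanted column with awk, as in this sketch (the column name FLUX and the file sources.lis are illustrative).

set name = FLUX
set colnum = `awk -v col=$name 'NR == 1 {for (i = 1; i <= NF; i++) {if ($i == col) {print i; break}}}' sources.lis`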
This passes the requested column name into the awk variable col through the -v command-line option. For the first record (NR==1), we loop through all the fields (NF), starting at the first, and if the current column name ($i) equals the requested name, the column number is printed and we break from the loop. If the field is not present, the result is null. The extra braces associate commands in the same for or if block. Note that unlike the C-shell, in awk the line break can only appear immediately after a semicolon or brace.
The above can be improved upon using the toupper function to avoid case sensitivity.
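A sketch of the case-insensitive version:

set colnum = `awk -v col=$name 'NR == 1 {for (i = 1; i <= NF; i++) {if (toupper($i) == toupper(col)) {print i; break}}}' sources.lis`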
You can also read from a text file created dynamically from within your script.
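Two sketches follow; doubleword and myprog are the programs referred to in the description below, and the contents of the here document are illustrative.

./doubleword < mynovel.txt

myprog <<EOF
Initial values
$nstars
`wc -l < brightnesses.txt`
EOF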
Here ./doubleword reads its standard input from the file mynovel.txt. The <<word construct obtains the input data from the script file itself, until there is a line beginning with word. You may also include variables and commands to execute, as the $, \, and ` characters retain their special meaning. If you want these characters to be treated literally, say to prevent substitution, insert a \ before the delimiting word. The command myprog reads from the script, substituting the value of variable nstars in the second line, and the number of lines in file brightnesses.txt in the third line.
The technical term for such files is here documents.
The output from some routines is often unwanted in scripts. In these cases redirect the standard output to a null file.
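For instance (the parameter names given to correlate are illustrative):

correlate in1=frame1 in2=frame2 out=framec > /dev/null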
Here the text output from the task correlate is disposed of to the /dev/null file. Messages from Starlink tasks, and usually Fortran channel 6, write to standard output.
When writing a data-processing pipeline connecting several applications you will often need to know some attribute of the data file, such as its number of dimensions, its shape, whether or not it may contain bad pixels, and whether it has a variance array or a specified extension. The way to access these data is with the ndftrace and parget commands from Kappa. ndftrace inquires the data, and parget communicates the information to a shell variable.
Suppose that you want to process all the two-dimensional NDFs in a directory. You would write something like this in your script.
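A sketch; ndftrace's screen output is itself discarded, and parget then recovers the NDIM result parameter.

foreach file (*.sdf)
   set ndf = $file:r
   ndftrace $ndf > /dev/null
   set ndim = `parget ndim ndftrace`
   if ($ndim == 2) then
      echo "$ndf is two-dimensional"    # process the NDF here
   endif
end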
Note that although it is called ndftrace, this command can determine the properties of foreign data formats through the automatic conversion system (SUN/55, SSN/20). Of course, other formats do not have all the facilities of an NDF.
If you want the dimensions of a FITS file supplied as the first argument you need this ingredient.
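A sketch of that ingredient:

ndftrace $1 > /dev/null
set dims = `parget dims ndftrace`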
Then $dims[i] will contain the size of the ith dimension. Similarly, a couple more parget calls will assign the pixel bounds to arrays lbnd and ubnd.
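For instance (again a sketch):

set lbnd = `parget lbound ndftrace`
set ubnd = `parget ubound ndftrace`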
Below is a complete list of the results parameters from ndftrace. If the parameter is an array, it will have one element per dimension of the data array (given by parameter NDIM); except for EXTNAME and EXTTYPE where there is one element per extension (given by parameter NEXTN). Several of the axis parameters are only set if the ndftrace input keyword fullaxis is set (not the default). To obtain, say, the data type of the axis centres of the current dataset, the code would look like this.
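A sketch (the variable ndf is assumed to hold the dataset name):

ndftrace $ndf fullaxis > /dev/null
set axtype = `parget atype ndftrace`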
Name | Array? | Meaning |
AEND | Yes | The axis upper extents of the NDF. For non-monotonic axes, zero is used. See parameter AMONO. This is not assigned if AXIS is FALSE. |
AFORM | Yes | The storage forms of the axis centres of the NDF. This is only written when parameter FULLAXIS is TRUE. |
ALABEL | Yes | The axis labels of the NDF. This is not assigned if AXIS is FALSE. |
AMONO | Yes | These are TRUE when the axis centres are monotonic, and FALSE otherwise. |
ANORM | Yes | The axis normalisation flags of the NDF. This is only written when FULLAXIS is TRUE. |
ASTART | Yes | The axis lower extents of the NDF. For non-monotonic axes, zero is used. See parameter AMONO. This is not assigned if AXIS is FALSE. |
ATYPE | Yes | The data types of the axis centres of the NDF. This is only written when FULLAXIS is TRUE. |
AUNITS | Yes | The axis units of the NDF. This is not assigned if AXIS is FALSE. |
AVARIANCE | Yes | Whether or not there are axis variance arrays present in the NDF. This is only written when FULLAXIS is TRUE. |
AXIS | | Whether or not the NDF has an axis system. |
BAD | | If TRUE, the NDF's data array may contain bad values. |
BADBITS | | The BADBITS mask. This is only valid when QUALITY is TRUE. |
CURRENT | | The integer Frame index of the current co-ordinate Frame in the WCS component. |
DIMS | Yes | The dimensions of the NDF. |
EXTNAME | Yes | The names of the extensions in the NDF. It is only written when NEXTN is positive. |
EXTTYPE | Yes | The types of the extensions in the NDF. Their order corresponds to the names in EXTNAME. It is only written when NEXTN is positive. |
FDIM | Yes | The numbers of axes in each co-ordinate Frame stored in the WCS component of the NDF. The elements in this parameter correspond to those in FDOMAIN and FTITLE. The number of elements in each of these parameters is given by NFRAME. |
FDOMAIN | Yes | The domain of each co-ordinate Frame stored in the WCS component of the NDF. The elements in this parameter correspond to those in FDIM and FTITLE. The number of elements in each of these parameters is given by NFRAME. |
FLABEL | Yes | The axis labels from the current WCS Frame of the NDF. |
FLBND | Yes | The lower bounds of the bounding box enclosing the NDF in the current WCS Frame. The number of elements in this parameter is equal to the number of axes in the current WCS Frame (see FDIM). |
FORM | | The storage form of the NDF's data array. |
FTITLE | Yes | The title of each co-ordinate Frame stored in the WCS component of the NDF. The elements in this parameter correspond to those in FDOMAIN and FDIM. The number of elements in each of these parameters is given by NFRAME. |
FUBND | Yes | The upper bounds of the bounding box enclosing the NDF in the current WCS Frame. The number of elements in this parameter is equal to the number of axes in the current WCS Frame (see FDIM). |
FUNIT | Yes | The axis units from the current WCS Frame of the NDF. |
HISTORY | | Whether or not the NDF contains HISTORY records. |
LABEL | | The label of the NDF. |
LBOUND | Yes | The lower bounds of the NDF. |
NDIM | | The number of dimensions of the NDF. |
NEXTN | | The number of extensions in the NDF. |
NFRAME | | The number of WCS domains described by FDIM, FDOMAIN and FTITLE. Set to zero if WCS is FALSE. |
QUALITY | | Whether or not the NDF contains a QUALITY array. |
TITLE | | The title of the NDF. |
TYPE | | The data type of the NDF's data array. |
UBOUND | Yes | The upper bounds of the NDF. |
UNITS | | The units of the NDF. |
VARIANCE | | Whether or not the NDF contains a VARIANCE array. |
WCS | | Whether or not the NDF has any WCS co-ordinate Frames, over and above the default GRID, PIXEL and AXIS Frames. |
WIDTH | Yes | Whether or not there are axis width arrays present in the NDF. This is only written when FULLAXIS is TRUE. |
Suppose you have an application which demands that variance information be present, say for optimal extraction of spectra. You could test for the existence of a variance array in your FITS file called dataset.fit like this.
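One hedged form, with the result captured in a shell variable called varpres:

ndftrace dataset.fit > /dev/null
set varpres = `parget variance ndftrace`
if ($varpres == "FALSE") then
   echo "dataset.fit does not contain variance information."
else
   echo "Variance present: proceed with the optimal extraction."
endif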
The value returned is either TRUE or FALSE. You merely substitute another component such as quality or axis in the parget command to test for the presence of these components.
Imagine you have an application which could not process bad pixels. You could test whether a dataset might contain bad pixels, and run some pre-processing task to remove them first. This attribute could be inquired via ndftrace. If you need to know whether or not any were actually present, you should run setbad from Kappa first.
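A sketch of such a test follows; the skip-ahead structure with a named label anticipates the discussion below, and the variable ndf is assumed to hold the dataset name.

ndftrace $ndf > /dev/null
set badpix = `parget bad ndftrace`
if ($badpix == "TRUE") then
   echo "$ndf may contain bad pixels."
   goto tidy
endif

# The main processing of $ndf would go here.

tidy:
echo "Tidying up."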
Notice the use of goto to abandon the remaining processing, either by jumping to the end of the script (goto exit), or, as here, by moving to a named label. This lets us skip over some code, and move directly to the closedown tidying operations. Notice the colon terminating the label itself, and that it is absent from the goto command.
One recipe for testing for a spectrum is to look at the axis labels (whereas a modern approach might use WCS information). Here is a longer example showing how this might be implemented. Suppose the name of the dataset being probed is stored in variable ndf.
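The sketch below gives the flavour; the particular label patterns tested, and the assumption that each label is a single word, keep it simple.

# Obtain the axis labels of the dataset.
ndftrace $ndf fullaxis > /dev/null
set axlabel = `parget alabel ndftrace`

# Look for a dispersion-like label on any axis.
set spectrum = FALSE
foreach label ($axlabel[*])
   switch ($label)
   case *[Ww]avelength*:
   case *[Ff]requency*:
      set spectrum = TRUE
      breaksw
   endsw
end

if ($spectrum == "TRUE") then
   echo "$ndf appears to be a spectrum."
endif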
Associated with FITS files and many NDFs is header information stored in 80-character ‘cards’. It is possible to use these ancillary data in your script. Each non-comment header has a keyword, by which you can reference it; a value; and usually a comment. Kappa from V0.10 has a few commands for processing FITS header information described in the following sections.
Suppose that you wanted to determine whether an NDF called image123 contains an AIRMASS keyword in its FITS headers (stored in the FITS extension).
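Assuming KAPPA's fitsexist command, which takes the NDF name and the keyword, a sketch would be:

set airpres = `fitsexist image123 airmass`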
Variable airpres would be assigned "TRUE" when the AIRMASS card was present, and "FALSE" otherwise. Remember that the backquotes (` `) cause the enclosed command to be executed.
Once we know the named header exists, we can then assign its value to a shell variable.
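For example, assuming KAPPA's fitsval command:

set airmass = `fitsval image123 airmass`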
We can also write new headers at specified locations (the default being just before the END card), or revise the value and/or comment of existing headers. As we know the header AIRMASS exists in image123, the following revises the value and comment of the AIRMASS header. It also writes a new header called FILTER immediately preceding the AIRMASS card, assigning it value B and comment Waveband.
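A sketch using fitsmod follows; the parameter names and values are indicative only, so check the fitsmod documentation for the exact syntax.

fitsmod edit=update ndf=image123 keyword=AIRMASS value=1.379 \
        comment=\"Airmass at mid-observation\"
fitsmod edit=write ndf=image123 keyword=FILTER value=B \
        comment=Waveband position=AIRMASS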
As we want the metacharacters " to be treated literally, each is preceded by a backslash.
You can manipulate data objects in HDS files, such as components of an NDF’s extension. There are several Starlink applications for this purpose including the FIGARO commands copobj, creobj, delobj, renobj, setobj; and the Kappa commands setext, and erase.
For example, if you wanted to obtain the value of the EPOCH object from an extension called IRAS_ASTROMETRY in an NDF called lmc, you could do it like this.
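A sketch using setext follows; the parameter names are indicative only, and the reported value could be captured with backquotes if required.

setext ndf=lmc xname=IRAS_ASTROMETRY option=get \
       cname=EPOCH noloop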
The noloop prevents prompting for another extension-editing operation. The single backslash is the line continuation.
If you want to define a subset or superset of a dataset, most Starlink applications recognise NDF sections (see SUN/95’s chapter called “NDF Sections”) appended after the name.
A naïve approach might expect the following to work
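(here the dataset name is assumed to be in variable ndf and the pixel bounds in lbnd and ubnd):

stats $ndf"($lbnd:$ubnd)"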
However, this generates a

Bad : modifier in $ ($).

error. That's because it is stupidly looking for a filename modifier :$ (see Section 12.1).
Instead here are some recipes that work.
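Two hedged possibilities: interpose a closing quote between the variable and the colon so that no filename modifier is looked for, or build the section as a literal string in which the colon does not immediately follow a variable substitution.

# A quote separates $lbnd from the colon, so no modifier is parsed.
stats $ndf"($lbnd":"$ubnd)"

# Alternatively assemble the section string first.
set section = "(101:256,51:100)"
stats $ndf$section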