1 Introduction

Applications often have to manage groups of strings in which each string represents some object. Typical examples of the contents of such groups are the names of data files, coordinate values, wavelengths, and the names of astronomical objects.

The GRP package provides facilities for managing the storage and retrieval of strings within such groups. Applications may explicitly specify strings to be stored within a group, or instead may specify the name of a text file from which to read such strings. The contents of a group may be written out to a text file, providing an easy means of passing groups of objects between applications.

1.1 Facilities Provided by GRP

The concept of a group within GRP may be compared with an “array” structure within a high level programming language such as Fortran. Arrays are used for storing many values in a single object, with each value associated with an integer “index”. A value can be stored in, or retrieved from, a particular element of an array by specifying the index for that element. The facilities provided by the GRP package are similar. GRP allows character strings to be stored in, or retrieved from, one of several “groups” (in this sense, a “group” is the GRP equivalent of a character array). Each element within the group has an associated index, and different groups are distinguished by different “identifiers” (similar to the way that different arrays are distinguished by having different names).

GRP also provides the following features not available through the use of Fortran arrays:

(1)
In order to store values in a normal character array, you must assign an explicit character string to each element of the array. The GRP package provides similar facilities for storing explicit strings, but also provides facilities for reading the values to be stored from a text file. This is known as indirection; instead of providing a set of strings to be stored, you provide the name of a file which contains the strings to be stored. For instance, if you were prompted for a list of data files to be processed, you could respond either with the explicit name of each data file, or with the name of a text file containing a list of the names of the data files.
(2)
An alternative method for specifying values to be stored in a group is by modification of the values already stored in another group. For instance, if values are to be assigned to elements 1 to 3, the actual strings stored would be obtained by taking the values stored in elements 1 to 3 of another group and applying some editing to them. The same editing is used for each element; it can include the addition of a suffix and/or a prefix, and the substitution of one sub-string with another. A typical use of this facility could be to specify a set of output files by describing the editing and the names of the corresponding input files. Thus, if a group describing the input files contained the three strings:

  OBJ1.DAT
  OBJ2.DAT
  OBJ3.DAT

you might specify the output files using a string such as:

  A_*|DAT|TXT|

This would cause the addition of the prefix “A_” and substitution of “TXT” for “DAT”, resulting in the output file names:

  A_OBJ1.TXT
  A_OBJ2.TXT
  A_OBJ3.TXT

(3)
In the above example, a group of values was obtained by copying them from another group, and then applying some specified editing. This technique of editing values can be combined with other ways of specifying the original values. For instance, to apply the same editing as above to each of the values stored in the text file OBJECTS.LIS the following response could be given:

   A_{^OBJECTS.LIS}|DAT|TXT|

If the same editing is to be applied to a list of literal values typed in directly at the keyboard, then a response such as the following could be given:

  A_{OBJ1.DAT,OBJ2.DAT,OBJ3.DAT}|DAT|TXT|

(4)
Groups have dynamic size (i.e. they expand in size as required to make room for new values), which is often more convenient than the fixed size of an array specified in its declaration. Thus if an application wants to process many data files, and chooses to store the file names in a GRP group, then no limit is imposed on the number of files which the user may specify.
(5)
The contents of a group may be written out to a text file using a single subroutine call. This, together with the ability to read a group’s contents from a text file, provides a convenient means of passing groups of objects between applications. For instance, an application may produce a text file holding the names of all the output data files it has created. A later application can then read this text file to obtain the names of the data files which it is to process.
(6)
Elements within a group can be copied to another group in a single call. Duplicate names may also be purged from a group in a single call.
(7)
GRP stores information about how each value within a group was obtained (i.e. whether it was given explicitly, or by indirection, or by modification). The names of indirection files are stored, as are the identifiers for groups used as the basis for “modified” elements, and all this information is available to the calling application.
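
The editing syntax described in items (2) and (3) can be modelled outside GRP. The following Python sketch is an illustration only, not part of the GRP library. The function name `apply_modification` is hypothetical, and the sketch assumes a single element of the form prefix, kernel, suffix, followed by an optional `|old|new|` substitution. The kernel is either `*` (names are taken from a basis group) or a brace-enclosed list of literal names. Indirection inside the braces is not handled.

```python
def apply_modification(spec, basis=()):
    """Toy model of a GRP modification element (not the GRP API).

    spec  -- string such as 'A_*|DAT|TXT|' or 'A_{X.DAT,Y.DAT}|DAT|TXT|'
    basis -- names used when the kernel is '*'
    """
    # Split off the optional '|old|new|' substitution part.
    parts = spec.split("|")
    template = parts[0]
    old, new = (parts[1], parts[2]) if len(parts) >= 3 else (None, None)

    # Find the kernel: a brace-enclosed literal list, or '*'.
    if "{" in template:
        start, end = template.index("{"), template.index("}")
        prefix, suffix = template[:start], template[end + 1:]
        names = [n.strip() for n in template[start + 1:end].split(",")]
    else:
        prefix, _, suffix = template.partition("*")
        names = list(basis)

    result = []
    for name in names:
        if old is not None:
            name = name.replace(old, new)      # substitute old -> new
        result.append(prefix + name + suffix)  # then add prefix and suffix
    return result


inputs = ["OBJ1.DAT", "OBJ2.DAT", "OBJ3.DAT"]
print(apply_modification("A_*|DAT|TXT|", inputs))
# ['A_OBJ1.TXT', 'A_OBJ2.TXT', 'A_OBJ3.TXT']
print(apply_modification("A_{OBJ1.DAT,OBJ2.DAT,OBJ3.DAT}|DAT|TXT|"))
# ['A_OBJ1.TXT', 'A_OBJ2.TXT', 'A_OBJ3.TXT']
```

Both calls give the same three names, because the only difference between the two forms is where the original names come from.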

GRP is a general-purpose library which makes no attempt to attach any particular meaning or properties to the strings stored in a group. It is expected that other, more specialized libraries will be written which use GRP to handle specific types of strings (e.g. coordinate values, names of data files, etc.). Such packages will provide additional features to handle the objects stored by GRP (e.g. the creation, opening and closing of data files).

1.2 An Example of a Simple GRP Application

The facilities of GRP are particularly useful for processing lists of text strings provided in response to a prompt. The user of an application can specify the strings literally, or can specify the name of a text file containing the strings, or can specify the editing to be used to derive them by modification of the strings stored somewhere else.

1.2.1 Example Code

Here is a simple example of the use of GRP which illustrates the ideas of indirection and modification. In this example, each stored string corresponds to the name of a file but obviously an application could apply other interpretations. The user is prompted for a set of input file names and then prompted again for a set of output file names. Each input file is processed in some way (by routine PROC) to produce the corresponding output file. The source code that follows is not intended to provide all the information necessary to write GRP applications, but simply to give a feeling for the way GRP works:

        SUBROUTINE GRP_TEST( STATUS )                             [1]
        IMPLICIT NONE
  
  *  Include definitions of global constants.
        INCLUDE 'SAE_PAR'                                         [2]
        INCLUDE 'GRP_PAR'                                         [3]
  
  *  Declare local variables.
        INTEGER STATUS, IGRP1, IGRP2, SIZE1, SIZE2, ADDED, I
        CHARACTER*(GRP__SZNAM) INFIL, OUTFIL
        LOGICAL FLAG
  
  *  Check inherited global status.
        IF ( STATUS .NE. SAI__OK ) RETURN                         [4]
  
  *  Create a new (empty) group to contain the names of the
  *  input files.
        CALL GRP_NEW( 'Input files', IGRP1, STATUS )              [5]
  
  *  Prompt the user for a group of input file names and place
  *  them in the group just created.
        CALL GRP_GROUP( 'IN_FILES', GRP__NOID, IGRP1, SIZE1,      [6]
       :                ADDED, FLAG, STATUS )
  
  *  Create a second group to hold output file names.
        CALL GRP_NEW( 'Output files', IGRP2, STATUS )             [7]
  
  *  Prompt the user for a group of output file names, giving
  *  the chance to specify them by modification of the input
  *  file names. Place the output file names in the new group
  *  just created.
        CALL GRP_GROUP( 'OUT_FILES', IGRP1, IGRP2, SIZE2,         [8]
       :                ADDED, FLAG, STATUS )
  
  *  Report an error and abort if the number of output files
  *  does not equal the number of input files.
        IF( SIZE2 .NE. SIZE1 .AND. STATUS .EQ. SAI__OK ) THEN     [9]
           STATUS = SAI__ERROR
           CALL ERR_REP( 'GRP_TEST_ERR1',
       :           'Incorrect number of output files specified',
       :           STATUS )
           GO TO 999
        END IF
  
  *  Loop round each input file.
        DO I = 1, SIZE1                                           [10]
  
  *  Retrieve the input file name with index given by I.
           CALL GRP_GET( IGRP1, I, 1, INFIL, STATUS )             [11]
  
  *  Retrieve the output file name with index given by I.
           CALL GRP_GET( IGRP2, I, 1, OUTFIL, STATUS )
  
  *  Process the data.
           CALL PROC( INFIL, OUTFIL, STATUS )                     [12]
  
  *  Do the next input file.
        END DO
  
  *  Delete the groups created by this application.
   999  CONTINUE
        CALL GRP_DELET( IGRP1, STATUS )                           [13]
        CALL GRP_DELET( IGRP2, STATUS )
  
        END

Programming notes:

(1)
The example is actually an ADAM A-task, and so consists of a subroutine with a single argument giving the inherited status value. See SUN/101 for further details about writing ADAM A-tasks. A “stand-alone” version of the GRP package is available which can be used with non-ADAM applications.
(2)
The first INCLUDE statement is used to define standard “symbolic constants”, such as the values SAI__OK and SAI__ERROR which are used in this routine. Starlink software makes widespread use of such constants, which should always be defined in this way rather than by using actual numerical values. The file SAE_PAR is almost always needed, and should be included as standard in every application.
(3)
The second INCLUDE statement performs a similar function to the first, but defines symbolic constants which are specific to the GRP package. Such constants are recognizable by the fact that they start with the five characters “GRP__” (such as GRP__NOID and GRP__SZNAM used in the above example). Note the double underscore “__” which distinguishes them from subroutine names.
(4)
The value of the STATUS argument is checked. This is because the application uses the Starlink error handling strategy (see SUN/104), which requires that a subroutine should do nothing unless its STATUS argument is set to the value SAI__OK on entry. Here, we simply return without action if STATUS has the wrong value.
(5)
An identifier for a new group is now obtained. The variable IGRP1 is returned holding an integer value which the GRP package uses to recognise the group just created. Initially, there are no names stored in the group. A string is stored which should be used to give a description of the type of objects stored within the group (in this case the string “Input files” is used).
(6)
The user is now prompted for a value for the parameter IN_FILES, and replies with a string, which GRP splits up into separate elements, each element being either a literal file name or the name of a text file containing other file names. As there are no other groups defined at this point, it is not possible to specify the file names using “modification” (as described in item (2) in section 1.1). For this reason, the second argument (which would normally specify the group to use as the basis for modification) is given the value GRP__NOID. This is a special identifier value used to indicate a “null” group. The names supplied by the user are stored in the group created by the previous call to GRP_NEW, and the number of names is returned in argument SIZE1. Note that no permanent association exists between the group identified by IGRP1 and the parameter IN_FILES (which is one reason why GRP_GROUP is not called GRP_ASSOC). The parameter value may be cancelled (for instance using PAR_CANCL) without affecting the contents of the group.
(7)
A second group is now created to hold the output file names. The two groups are distinguished by the fact that they have different identifiers (stored in IGRP1 and IGRP2).
(8)
The user is prompted again, this time for a value for the parameter OUT_FILES, and the names obtained are stored in the second group just created. Again the user can give literal file names and/or the names of text files holding other file names. Now that there are two groups, it is possible to use modification to specify the output files. The identifier for the group containing the input file names is given as the second argument of GRP_GROUP. This tells the GRP system that if the user specifies output file names using modification (which may not be the case, of course), then the output file names are to be derived by modifying the input file names stored in the first group.
(9)
In this particular application it is deemed necessary to have equal numbers of input and output files, but GRP imposes no restrictions on the number of strings which can be supplied when responding to a prompt from GRP_GROUP. It is therefore necessary to check that the two groups contain the same number of elements. A more sophisticated application might seek user intervention to determine how to proceed at this point (either by requesting extra output file names or by ignoring some). Note that if some other error has already been detected (as shown by STATUS having a value other than SAI__OK), then the check on the number of input and output files is irrelevant.
(10)
Having stored the input file names in one group and the output file names in another, the application now loops through each pair of input and output file names in turn. An “index” I is used to distinguish between different elements within a group. The input file name with index I is retrieved from the first group and the output file name with the same index is retrieved from the second group.
(11)
Note, there is a limit to the size of the character string which can be stored in a GRP group. This size is given by the symbolic constant GRP__SZNAM.
(12)
A routine is now called which uses the two file names; a typical routine may take data out of the input file, process it and store the results in the output file. The file handling itself would be done within the routine PROC. This example actually makes no assumptions about what the strings stored in the two groups represent (the descriptive strings stored when the two groups were created are of no significance in this application). Although, for clarity, it has been assumed that the strings correspond to file names, they could just as easily have been wavelengths, the names of astronomical objects, calendar dates, or just about anything else.
(13)
Finally, the groups created by this application are deleted. This is particularly important in applications which run as subroutines within a wider context (such as ADAM applications). There is a limited number of groups available, and if applications forget to delete the groups they have created, that limit may eventually be exceeded. Note that the groups should be deleted even if the application aborts because of an error, so the statement labelled 999 (to which a jump is made if an error is detected) comes before the calls to GRP_DELET.

1.2.2 Examples of Possible User Responses

If the example above was run, the user could respond in several ways to the prompts for parameters IN_FILES and OUT_FILES. The following paragraphs illustrate the use of indirection and modification in this context.

Indirection. When prompted for IN_FILES the user could reply with the following text:

  IC_1575_RAW, IC_4320_RAW, ^NGC_OBJECTS.LIS

This would cause the GRP package to place the two strings IC_1575_RAW and IC_4320_RAW in two elements of the first group and then search for a file called NGC_OBJECTS.LIS. If this file contains the following two lines of text:

  NGC_5128_RAW, NGC_2534_RAW
  ^OTHERS.LIS

then the strings NGC_5128_RAW and NGC_2534_RAW would be added to the same group, and a search made for the file OTHERS.LIS. If this file, in turn, contained the two lines:

  NGC_1947_FLAT
  NGC_3302_FLAT

then the final group would consist of the six strings:

  IC_1575_RAW
  IC_4320_RAW
  NGC_5128_RAW
  NGC_2534_RAW
  NGC_1947_FLAT
  NGC_3302_FLAT

This illustrates the use of indirection as a means of specifying the strings to be stored in a group, and shows it being combined with the specification of literal strings, and indirections being nested.
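
The nested expansion just described can be modelled with a short recursive function. The Python sketch below is an illustration only, not the GRP implementation. The function name `expand` is hypothetical, a dictionary stands in for the text files on disk, and the sketch assumes that elements are separated by commas and that a leading `^` marks an indirection element. It makes no attempt to detect files that refer to each other in a loop.

```python
def expand(text, files):
    """Toy model of GRP indirection (not the GRP API).

    text  -- comma-separated elements; '^NAME' means "read NAME"
    files -- dict mapping a file name to its list of lines, standing
             in for text files on disk
    """
    names = []
    for element in text.split(","):
        element = element.strip()
        if not element:
            continue
        if element.startswith("^"):
            # Indirection: expand every line of the named file in turn.
            for line in files[element[1:]]:
                names.extend(expand(line, files))
        else:
            # A literal string is stored as it stands.
            names.append(element)
    return names


files = {
    "NGC_OBJECTS.LIS": ["NGC_5128_RAW, NGC_2534_RAW", "^OTHERS.LIS"],
    "OTHERS.LIS": ["NGC_1947_FLAT", "NGC_3302_FLAT"],
}
group = expand("IC_1575_RAW, IC_4320_RAW, ^NGC_OBJECTS.LIS", files)
for name in group:
    print(name)
```

Run on the response shown above, the sketch prints the same six strings in the same order as the final group.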

Strings stored in a text file can be edited “on the fly” before being stored in a group. For instance, the user could give the following response to a prompt for IN_FILES:

  ^NGC_OBJECTS.LIS|_RAW|_CAL|

This would cause the GRP package to read the values stored in text file NGC_OBJECTS.LIS, replacing all occurrences of the string “_RAW” with the string “_CAL”. If the file NGC_OBJECTS.LIS contained the same values as before, then the editing would also be applied to the values stored in the file OTHERS.LIS.

Modification. As an example of the use of “modification”, let’s suppose that the user responds to the prompt for OUT_FILES with the string:

  A_*2|RAW|FLAT|

This would cause the application to generate six strings, based on the six strings held in the first group (see programming note (8)). The names are generated as follows:

(1)
First, a copy of the six names in the first group is made and stored in the second group.
(2)
Next, any occurrence of the string “RAW” within any of these names is replaced with the string “FLAT”. This leaves the second group holding the names:

  IC_1575_FLAT
  IC_4320_FLAT
  NGC_5128_FLAT
  NGC_2534_FLAT
  NGC_1947_FLAT
  NGC_3302_FLAT

(3)
Next, each name is substituted in turn for the “*” character in the string to the left of the first “|” character. Thus each name is prefixed by “A_” and suffixed by “2”. The second group finally holds the names:

  A_IC_1575_FLAT2
  A_IC_4320_FLAT2
  A_NGC_5128_FLAT2
  A_NGC_2534_FLAT2
  A_NGC_1947_FLAT2
  A_NGC_3302_FLAT2
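
The order of the three steps above can be traced with a few lines of Python. This is only a sketch of the behaviour described in the text, not GRP code. The variable names are hypothetical, and the response `A_*2|RAW|FLAT|` is split by hand into its template `A_*2` and its substitution of “FLAT” for “RAW”.

```python
first_group = [
    "IC_1575_RAW", "IC_4320_RAW", "NGC_5128_RAW",
    "NGC_2534_RAW", "NGC_1947_FLAT", "NGC_3302_FLAT",
]

# Step 1: copy the names held in the basis group.
second_group = list(first_group)

# Step 2: replace "RAW" with "FLAT" wherever it occurs.
second_group = [name.replace("RAW", "FLAT") for name in second_group]

# Step 3: substitute each name for "*" in the template "A_*2".
template = "A_*2"
second_group = [template.replace("*", name) for name in second_group]

for name in second_group:
    print(name)
```

Names that do not contain “RAW”, such as NGC_1947_FLAT, pass through step 2 unchanged and still receive the prefix and suffix in step 3.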

Modification can be combined with indirection and/or the specification of literal strings. For instance, the user could have replied to the prompt for OUT_FILES with the string:

  NEW_FILE,A_*2|RAW|FLAT|,^LIST.DAT

This would have caused the second group to contain not only the six names described above, but also the additional names NEW_FILE and any names read from the text file LIST.DAT. In this case, the number of output files would have exceeded the number of input files and the check described in programming note (9) would fail.