17 MERGING AND MATCHING NDF ATTRIBUTES

 17.1 The Problem
 17.2 Merging and Matching Bad-Pixel Flags
 17.3 Matching NDF Bounds
 17.4 Merging and Matching Numeric Types

17.1 The Problem

A problem which immediately surfaces when you start to write applications to process NDFs is how to cope with the wide variety of data structures which may be encountered. Taking account of all possible shapes and sizes, all feasible types and storage forms for NDF components, the presence or absence of bad pixels and the state (defined or undefined) of each component, the number of combinations is clearly enormous. So how can simple general-purpose applications be written to cope with all this?

In simple cases (e.g. only a single input NDF) it is possible for an application to enquire about certain NDF attributes and adapt to some extent to take account of them. In the example in §9.4, for instance, separate algorithms were used depending on whether bad pixels might be present or not. However, even this modest degree of adaption can rapidly become complicated if there are two or more NDFs (and hence four or more combinations of bad-pixel flags) to consider.

For general-purpose software which will be heavily used, some attempt to adapt will probably be worthwhile if it leads to significantly better performance. However, it is often necessary to write very simple or “one off” applications with little knowledge of NDF data structures, where any need to adapt to an incoming NDF would impose an unwelcome programming burden. Such applications should nevertheless make a consistently good job of processing NDF data structures.

The problem, therefore, is how to reconcile the very diverse requirements imposed by the complicated nature of NDF data structures with the variable, but generally far more modest capabilities of real applications. The solution lies in various conversion processes which allow the attributes of NDFs to be merged and matched to the capabilities of applications, to arrive at a compromise method of processing the information in any particular set of NDF data structures. As this description suggests, some loss of efficiency or information will inevitably be involved, but this will usually be acceptably small. In cases where the penalty is unacceptable, an application naturally has the option of aborting with an appropriate error message.

17.2 Merging and Matching Bad-Pixel Flags

First consider processing the data arrays of two input NDFs. Under what circumstances should checks for bad pixels be made?

The capabilities of real applications in this area usually fall into one of the following categories:

(1)
No ability to handle bad pixels at all.
(2)
The ability to check all arrays for bad pixels, regardless of the bad-pixel flag(s).
(3)
The ability to adapt the processing algorithm according to whether bad-pixels may be present or not.

It is expected that the majority of applications will fall into categories 1 and 2, since only a few general-purpose software items are likely to obtain worthwhile benefit from attempting to optimise their behaviour when bad pixels are known to be absent (category 3). Nevertheless, it is instructive to consider this last and most complicated category first.

As already noted, there are four possible combinations of bad-pixel flags if we consider the data components of a pair of NDFs (there would be considerably more if we wanted to include the variance components as well). For all practical purposes, however, the number of combinations can be reduced to two by merging the bad-pixel flags of the separate array components using a logical “OR”, for instance:

        LOGICAL BAD1, BAD2, BAD
  
        ...
  
        CALL NDF_BAD( INDF1, ’Data’, .FALSE., BAD1, STATUS )
        CALL NDF_BAD( INDF2, ’Data’, .FALSE., BAD2, STATUS )
        BAD = BAD1 .OR. BAD2
  
        ...

The resulting value of BAD could then be used to determine whether checks for bad-pixel values are necessary. For instance, if the two data arrays were to be added, the main processing algorithm might look like this:

        SUBROUTINE ADDIT( BAD, EL, A, B, C, STATUS )
        INCLUDE ’SAE_PAR’
        INCLUDE ’PRM_PAR’
  
        LOGICAL BAD
        INTEGER EL, STATUS, I
        REAL A( EL ), B( EL ), C( EL )
  
        IF ( STATUS .NE. SAI__OK ) RETURN
  
  *  No bad-pixel checks needed, so add the arrays.
        IF ( .NOT. BAD ) THEN
           DO 1 I = 1, EL
              C( I ) = A( I ) + B( I )
   1       CONTINUE
  
  *  Bad pixel checks needed, so check and assign a bad result if necessary.
        ELSE
           DO 2 I = 1, EL
              IF ( A( I ) .EQ. VAL__BADR .OR. B( I ) .EQ. VAL__BADR ) THEN
                 C( I ) = VAL__BADR
  
  *  Otherwise add the array elements normally.
              ELSE
                 C( I ) = A( I ) + B( I )
              END IF
   2       CONTINUE
        END IF
        END

Although this is the most sophisticated response to the presence of bad pixels which we will consider, two compromises have nevertheless still been made here. First, the main processing algorithm will sometimes run slightly more slowly than is absolutely necessary, because taking the logical “OR” means that checks will occasionally be made for bad pixels in arrays which do not contain any. Secondly, as a result of this, some “good” pixels may accidentally be identified as bad and a small fraction of valid pixels might therefore be lost. In practice, both of these penalties are usually acceptably small, given the alternative (of writing many versions of the main processing algorithm to cater for every combination of bad-pixel flags).

So what should happen if the main processing algorithm is very simple and cannot handle any bad pixels at all (category 1 above)? In this case, it is not possible to compromise, because attempting to ignore the situation and process an array containing bad pixels without checking for them will give a completely wrong result (as well as causing severe numerical problems which will probably cause the application to crash). In this situation, the limitations of the application are paramount, and the correct response is to report an error and abort (see §9.5). A user would then have the option of running a separate application to “repair” the data by replacing bad pixels with an acceptable alternative.

The need to handle bad-pixel flags in any of the ways described above is very common, so the prototype merging and matching routine NDF_MBAD is provided to simplify the process. This routine merges the bad-pixel flags of a pair of NDFs and matches them to the capabilities of an application. Its first (BADOK) argument is supplied with a logical value indicating whether the calling application can handle bad pixels or not, for instance:

        LOGICAL BADOK, CHECK, BAD
  
        ...
  
        CALL NDF_MBAD( BADOK, INDF1, INDF2, ’Data’, CHECK, BAD, STATUS )

If the BADOK argument is set to .TRUE., then NDF_MBAD simply returns the logical “OR” of the bad-pixel flags for each of the two NDFs via its BAD argument. In this case it behaves in most respects exactly like its simpler relative NDF_BAD (see §9.4). However, if BADOK is set to .FALSE., but the returned BAD value would nevertheless be .TRUE. (indicating that bad pixels may be present in one of the NDFs), then an error results. NDF_MBAD then makes an appropriate error report, explaining that the application cannot handle the bad pixels, and returns with its STATUS argument set to NDF__BADNS (bad values not supported), as defined in the include file NDF_ERR. In the normal course of events, the application would then abort and this message would be delivered to the user.

A call to NDF_MBAD therefore allows an application to declare whether it can handle bad pixels, and then determines whether checks for bad pixels should actually be made. At the same time, it will handle any problems which may arise.

If more than two NDF’s are involved, then the routine NDF_MBADN may be used instead of NDF_MBAD. This routine exists for merging the bad-pixel flag values of an arbitrary number of NDFs, whose identifiers are supplied in an integer array. For instance:

        INTEGER N, NDFS( N )
  
        ...
  
        CALL NDF_MBADN( .FALSE., N, NDFS, ’Data’, .TRUE., BAD, STATUS )

could be used by an application which was incapable of handling bad pixels in order to check for their existence in a whole sequence of NDFs at once.

Finally, if only a single NDF is being used, then there is a choice. The routine NDF_MBADN may be called with N set to 1, but NDF_MBAD may also be used by passing the same identifier value twice, as follows:

        CALL NDF_MBAD( BADOK, INDF1, INDF1, COMP, CHECK, BAD, STATUS )

NDF_MBAD is designed to expect this kind of usage, and will detect it and behave appropriately without any loss of efficiency. All the other equivalent matching and merging routines described in subsequent sections also have this capability.

17.3 Matching NDF Bounds

The common situation where an application takes two or more input NDFs and combines their array component values pixel-by-pixel (e.g. to multiply their data arrays together) raises a similar problem of how to cope with the diversity of NDF shapes which could be encountered. In this case the capabilities of the application are generally not an issue, because most applications will be written to process NDFs of arbitrary size. However, what steps should be taken if the NDFs supplied as input turn out to have different sizes?

A simple approach might be to check whether the shapes of the input NDFs match and to report an error and abort if they do not. This straightforward (but unfriendly) approach requires a significant number of Fortran statements and quickly becomes a chore if it has to be performed explicitly in every application. However, by making use of NDF sections (see §15.2) to change the apparent pixel-index bounds of an NDF, it is possible to do considerably better than this, and to allow processing of NDFs whose bounds may not initially match.

To understand the principle, consider two NDFs with shapes which do not match, such as:

(15:115,-5,10)

and

(1:200,1:30)

It would clearly be meaningless to attempt to multiply the data arrays of these two NDFs together pixel-by-pixel as they stand, since there are some pixels in the first one which do not exist in the second one, and vice versa. However, the pixels within the region:

(15:115,1:10)

do exist within both NDFs and can be multiplied together if we decide to discard the rest. This can be done by creating an appropriate section from each NDF which selects only these common pixels for further processing. This task, of calculating which pixels to use and of selecting the appropriate NDF sections, may be performed using the routine NDF_MBND, which matches the bounds of a pair of NDFs, as follows:

        CALL NDF_MBND( ’TRIM’, INDF1, INDF2, STATUS )

If necessary, NDF_MBND will annul the NDF identifiers supplied and replace them with identifiers to appropriate NDF sections with matching shapes. The array components of these sections may then be accessed and compared pixel-by-pixel, since they will be of equal size. If the two NDFs have no pixels in common, then NDF_MBND will report an appropriate error message and set a STATUS value.

In cases where more than two NDFs are involved, the similar routine NDF_MBNDN may be used. This will match the bounds of an arbitrary number of NDFs, whose identifiers should be held in an integer array, for instance:

        INTEGER N, NDFS( N )
  
        ...
  
        CALL NDF_MBNDN( ’TRIM’, N, NDFS, STATUS )

This routine will create sections with matching bounds from each NDF supplied, selecting only those pixels which exist in all of the NDFs. Once again, an appropriate error report will be made, and a STATUS value set, if this cannot be achieved.

If access to any of the original NDFs is still required, then the routine NDF_CLONE may be used to retain an identifier for it before calling NDF_MBND or NDF_MBNDN, since either of these latter routines may annul the identifiers passed to them. For example:

        CALL NDF_CLONE( INDF1, INDFK, STATUS )
        CALL NDF_MBND( ’TRIM’, INDF1, INDF2, STATUS )

would ensure that an identifier INDFK is retained for the NDF whose original identifier was held in INDF1, even although NDF_MBND may alter the INDF1 variable so that it refers only to a sub-section of the NDF.

An alternative method of matching NDF bounds is by padding, in which each NDF is padded out with bad pixels rather than by discarding pixels at the edges. It may be employed by specifying ‘PAD’ for the first (OPTION) argument to NDF_MBND or NDF_MBNDN. For instance:

        CALL NDF_MBND( ’PAD’, INDF1, INDF2, STATUS )

In this case, the NDF sections produced may represent super-sets of the original NDFs (see §15.4). This option is needed less often, but it has the advantage that no pixels will be lost. It is generally used in cases where an output pixel should be assigned a value even if one of the input pixels is missing (or has a bad value). For instance, an application to merge the data arrays of two non-congruent NDFs by averaging their pixel values might use the ‘PAD’ option to match the bounds of the input NDFs. It could then generate an output array using an algorithm along the following lines, where an output pixel value is generated if either or both input pixels are valid:

        INTEGER EL, I
        REAL A( EL ), B( EL ), C( EL )
        INCLUDE ’PRM_PAR’
  
        ...
  
  *  Loop to process each array element.
        DO 1 I = 1, EL
  
  *  If both input pixels are good, then take the average.
           IF ( A( I ) .NE. VAL__BADR .AND. B( I ) .NE. VAL__BADR ) THEN
              C( I ) = 0.5 * ( A( I ) + B( I ) )
  
  *  Otherwise, if the first one is good, then use it.
           ELSE IF ( A( I ) .NE. VAL__BADR ) THEN
              C( I ) = A( I )
  
  *  Otherwise use the second one.
           ELSE
              C( I ) = B( I )
           END IF
   1    CONTINUE

When using the ‘TRIM’ option, the output NDF may be smaller than either of the two input NDFs, and when using the ‘PAD’ option, it may be larger. However, the application usually need not be aware of this.

17.4 Merging and Matching Numeric Types

The variety of possible numeric types which may be used to store the values of NDF array components (see §7.1) poses another problem for the applications programmer. What type of arithmetic should be chosen to process the values?

The simplest option, and the one which is recommended for normal use, is:

Use single-precision floating-point arithmetic for calculations wherever possible...

and access all array components (other than quality) as ‘_REAL’ values, taking advantage of the implicit type conversion provided by the NDF_ routines if necessary (see §8.7).

However, when writing general-purpose software which may be heavily used, the possibility of duplicating the main processing algorithm so that calculations can be performed using several alternative types of arithmetic might also be considered. This allows applications to support the processing of unusual numeric types (e.g. double-precision), while normally using less computationally expensive arithmetic (e.g. single-precision) when that is adequate. The ability to explicitly handle values with a variety of numeric types also makes it less likely that an expensive type conversion will have to be performed implicitly by the NDF_ routines (see §8.7). Of course, the disadvantages of this approach are that extra work is involved in duplicating the main processing algorithm. The magnitude of this extra work should not be underestimated, although the use of the GENERIC compiler (SUN/7) can simplify the task to some extent.

To give a concrete example, suppose you decide to duplicate the main processing algorithm in an application so that it can perform calculations using either single- or double-precision arithmetic. Input NDFs with numeric array types of ‘_REAL’ or ‘_DOUBLE’ can then be processed directly, although with other numeric types (or with mixed types) implicit type conversion will still need to occur. This can get rather complicated to sort out, so a simple routine NDF_MTYPE is provided to make the decision about which version of an algorithm to invoke in a particular case, depending on the declared capabilities of an application, for instance:

        INCLUDE ’NDF_PAR’
        CHARACTER ITYPE * ( NDF__SZTYP ), DTYPE * ( NDF__SZFTP )
  
        ...
  
        CALL NDF_MTYPE( ’_REAL,_DOUBLE’, INDF1, INDF2, ’Data’, ITYPE, DTYPE, STATUS )

The first (TYPLST) argument is a list of the numeric types which the application can process explicitly (‘_REAL’ and ‘_DOUBLE’) supplied in order of preference, i.e. in order of increasing computational cost in this instance. NDF_MTYPE will examine the two NDFs supplied and select a numeric type from this list to which the data component values should be converted for processing so that no unnecessary loss of information occurs. The conversion from ‘_REAL’ to ‘_DOUBLE’ would be acceptable, for instance, whereas the opposite conversion process would lose information.

The resulting implementation type is returned as an upper-case character string via the ITYPE argument, and may be used both as an input argument to NDF_MAP for accessing the NDFs’ values, as well as for deciding which version of the main algorithm to invoke, for instance:

        CALL NDF_MTYPE( ’_REAL,_DOUBLE’, INDF1, INDF2, ’Data’, ITYPE, DTYPE, STATUS )
  
  *  Access the input array values using the selected implementation type.
        CALL NDF_MAP( INDF1, ’Data’, ITYPE, PNTR1, EL, STATUS )
        CALL NDF_MAP( INDF2, ’Data’, ITYPE, PNTR2, EL, STATUS )
  
  *  Invoke the appropriate version of the processing algorithm.
        IF ( ITYPE .EQ. ’_REAL’ ) THEN
           <invoke the single-precision algorithm>
        ELSE IF ( ITYPE .EQ. ’_DOUBLE’ ) THEN
           <invoke the double-precision algorithm>
        END IF

The NDF_MTYPE routine also addresses the question of how to store the results of the calculation since, ideally, the numeric type used for the output array(s) should preserve the precision of the calculation without unnecessarily increasing the storage space required. A suitable numeric type is therefore returned as an upper-case character string via the DTYPE argument and may be used as an input argument to a routine such as NDF_CREAT which creates an output NDF. More commonly, however, it will be used to change the numeric type of an NDF which has already been created by a call to NDF_PROP (see §14.4), for instance:

        CALL NDF_PROP( INDF1, ’ ’, ’OUT’, INDF3, STATUS )
        CALL NDF_MTYPE( ’_REAL,_DOUBLE’, INDF1, INDF2, ’Dat,Var’, ITYPE, DTYPE, STATUS )
        CALL NDF_STYPE( DTYPE, INDF3, ’Dat,Var’, STATUS )

Here, a new output NDF is created based on the first input NDF by using NDF_PROP, and an identifier INDF3 is obtained for it. NDF_MTYPE is then called to determine an appropriate numeric type for storing the output data and variance components, and NDF_STYPE is invoked to set the output components’ type appropriately. In a typical application, the input and output arrays would probably then be accessed and passed to the appropriate version of the main processing algorithm. A complete example of this sequence of events can be found in §A.7, where its integration with the handling of the bad-pixel flag and matching of the NDFs’ bounds is also demonstrated.

Note that an equivalent routine NDF_MTYPN also exists for merging and matching the numeric types of an arbitrary number of NDFs, whose identifiers are supplied in an integer array, as follows:

        INTEGER N, NDFS( N )
  
        ...
  
        CALL NDF_MTYPN( ’_INTEGER,_REAL’, N, NDFS, ’Data’, ITYPE, DTYPE, STATUS )

Both this routine and NDF_MTYPE will report an error and set STATUS to NDF__TYPNI (processing of data type not implemented), as defined in the include file NDF_ERR, if it is not possible to select an implementation type from the list supplied without leading to loss of information. This would be the case for instance, if one of the NDFs had a data component with a numeric type of ‘_DOUBLE’ but the application could only perform single-precision arithmetic. In this case, the values returned via the ITYPE and DTYPE arguments would be the “best compromise” values and could still be used, if required, by ignoring the error condition. For instance:

        INCLUDE ’NDF_ERR’
  
        ...
  
        CALL ERR_MARK
        CALL NDF_MTYPE( ’_REAL’, INDF1, INDF1, ’Data’, ITYPE, DTYPE, STATUS )
        IF ( STATUS .EQ. NDF__TYPNI ) CALL ERR_FLUSH( STATUS )
        CALL ERR_RLSE

Here, the error system routine ERR_FLUSH has been used to deliver the error message to the user and reset STATUS (see SUN/104). The user would then receive a warning message explaining that ‘_DOUBLE’ data could not be handled without loss of information, and that conversion to ‘_REAL’ would have to be used. The application could then perform the conversion and processing, having warned the user about the unavoidable loss if information which will occur.