12 ARRAY COMPONENT STORAGE FORM AND COMPRESSION

 12.1 General
 12.2 Obtaining the Storage Form
 12.3 Simple Storage Form
 12.4 Scaled Storage Form
 12.5 Delta compressed Storage Form
 12.6 Primitive Storage Form

12.1 General

An NDF data structure allows the values of its array components to be stored in a variety of ways within the underlying data system, HDS. The reasons for this vary, but relate to maintaining compatibility with previous data formats and optimising disk space or access time for certain kinds of information. The options are described in SGP/38, where they correspond to the various variants of the ARRAY structure, which is one of the building blocks from which an NDF is constructed.

In the present document, the terminology has been changed slightly. In particular, the term storage form is used in preference to variant to avoid possible confusion with variance, although the meaning is unchanged. Also, note that the term storage form incorporates what is often referred to as compression: different compression algorithms correspond to different storage forms.

12.2 Obtaining the Storage Form

The storage form of an NDF array component may be determined using the routine NDF_FORM. For instance:

        INCLUDE 'NDF_PAR'
        CHARACTER * ( NDF__SZFRM ) FORM

        ...

        CALL NDF_FORM( INDF, 'Data', FORM, STATUS )

will return the storage form of an NDF’s data component as an upper-case character string via the FORM argument. Note how the symbolic constant NDF__SZFRM (defined in the include file NDF_PAR) should be used to declare the size of the character variable which is to receive the returned storage form information.

The storage form is established when an NDF is first created (see §13.2). At present there is no way of explicitly changing it, but in some circumstances it may be changed implicitly (see §12.6). Only four storage forms are currently supported, so NDF_FORM can return only the values ‘SIMPLE’, ‘SCALED’, ‘DELTA’ and ‘PRIMITIVE’. These are described below.
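
Because the returned value is one of a small set of upper-case strings, an application can branch on it directly. The following is a minimal sketch only: SHOWFM is a hypothetical helper written for this illustration, and it assumes that INDF is a valid NDF identifier obtained elsewhere and that the usual SAE_PAR include file is available to define SAI__OK.

```fortran
      SUBROUTINE SHOWFM( INDF, STATUS )
*  SHOWFM is a hypothetical helper, not part of the NDF library.
      IMPLICIT NONE
      INCLUDE 'SAE_PAR'          ! Defines SAI__OK
      INCLUDE 'NDF_PAR'          ! Defines NDF__SZFRM
      INTEGER INDF               ! NDF identifier (given)
      INTEGER STATUS             ! Global status
      CHARACTER * ( NDF__SZFRM ) FORM

*  Do nothing if an error has already occurred.
      IF ( STATUS .NE. SAI__OK ) RETURN

*  Obtain the storage form of the data component.
      CALL NDF_FORM( INDF, 'Data', FORM, STATUS )

*  The returned string is upper case, so compare with upper case.
      IF ( STATUS .EQ. SAI__OK ) THEN
         IF ( FORM .EQ. 'SCALED' .OR. FORM .EQ. 'DELTA' ) THEN
            PRINT *, 'Read-only storage form: ', FORM
         ELSE
            PRINT *, 'Storage form: ', FORM
         END IF
      END IF

      END
```

The distinction drawn here reflects the restrictions on scaled and delta arrays described in §12.4 and §12.5.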

12.3 Simple Storage Form

In this form of storage, the values of an NDF’s array component are stored in their simplest possible form, i.e. as a sequence of pixels in an N-dimensional array with (optionally) a similar imaginary component. This, together with other ancillary information, is held in a single ARRAY structure within HDS.

There are no special restrictions on the use of simple arrays and most applications will not need to be aware of the use of this storage form.

12.4 Scaled Storage Form

In this form of storage, the values stored internally within an NDF’s array component are a scaled form of the external values of interest to application code. Specifically, the internal scaled values are related to the external unscaled values via:

           (unscaled value) = ZERO + (scaled value) * SCALE

where ZERO and SCALE are two constant values stored with the array. In all other respects, a scaled array is exactly like a simple array.

This storage form is commonly used as a means of compressing the data into a smaller disk file by selecting a data type for the scaled values that uses fewer bytes per value than the unscaled data values (for instance, scaling four-byte _REAL values into two-byte _WORD integers). Note that information is lost in this process, as the original unscaled values cannot be recreated exactly from the scaled values.
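
The loss of information can be seen by applying the relation above by hand. The following stand-alone program is an illustration only and does not call the NDF library: the SCALE and ZERO values are arbitrary choices made for this example, and INTEGER * 2 is assumed to be available as a two-byte integer type corresponding to _WORD.

```fortran
      PROGRAM SCLDEM
*  Illustration only: this program does not use the NDF library.
      IMPLICIT NONE

*  Hypothetical SCALE and ZERO values chosen for this example.
      REAL SCALE, ZERO
      PARAMETER ( SCALE = 0.01, ZERO = 100.0 )

      REAL VALUE                 ! Original unscaled value
      REAL BACK                  ! Value recovered from STORED
      INTEGER * 2 STORED         ! Scaled value (two bytes)

      VALUE = 123.4567

*  Invert the relation and round to the nearest integer. It is
*  this rounding that discards information.
      STORED = NINT( ( VALUE - ZERO ) / SCALE )

*  Apply the relation to recover an unscaled value.
      BACK = ZERO + REAL( STORED ) * SCALE

      PRINT *, 'Stored (scaled) value: ', STORED
      PRINT *, 'Recovered value:       ', BACK
      PRINT *, 'Rounding error:        ', VALUE - BACK

      END
```

With rounding to the nearest integer, as here, the error in a recovered value is at most about half of SCALE, so the choice of SCALE determines the precision retained.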

Support for scaled arrays is currently limited, since it is anticipated that they will only be of interest as an archive format. The following details should be noted:

(1)
Scaled arrays are “read-only”. An error will be reported if an attempt is made to map a scaled array for WRITE or UPDATE access. When mapped for READ access, the pointer returned by NDF_MAP provides access to the unscaled data values - that is, the mapped values are the result of applying the scale and zero terms to the stored (scaled) values.

Currently, the scaled (i.e. stored) data values cannot be accessed directly. If you want to change the array values in a scaled array, first take a copy of the NDF and then modify the array values in the copy[11].

(2)
The result of copying a scaled array (for instance, using NDF_PROP, etc.) will be an equivalent simple array.
(3)
Scaled arrays cannot be created directly. To create an NDF with scaled arrays, first create an NDF with simple arrays, and then copy it using NDF_ZSCAL. The output NDF created by NDF_ZSCAL is a copy of the input NDF, but stored with scaled storage form[12].
(4)
The NDF_GTSZx routine can be used to determine the scale and zero values of an existing scaled array.
(5)
Scaled arrays cannot have complex data types. An error will be reported if an attempt is made to import an HDS structure describing a complex scaled array, or to assign scale and zero values to an array with complex data values.
(6)
When applied to a scaled array, the NDF_TYPE and NDF_FTYPE routines return information about the data type of the unscaled data values. In practice, this means that they return the data type of the SCALE and ZERO constants, rather than the data type of the array holding the stored (scaled) data values. To get the data type of the stored (scaled) values, use NDF_SCTYP.
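
The enquiry routines mentioned in the list above might be combined as in the following sketch. SCLINF is a hypothetical helper written for this illustration. The argument lists shown for NDF_GTSZD (the double-precision instance of NDF_GTSZx), NDF_TYPE and NDF_SCTYP are assumptions that should be checked against the routine descriptions, as is the use of NDF__SZTYP from NDF_PAR to size the type strings.

```fortran
      SUBROUTINE SCLINF( INDF, STATUS )
*  SCLINF is a hypothetical helper, not part of the NDF library.
      IMPLICIT NONE
      INCLUDE 'SAE_PAR'          ! Defines SAI__OK
      INCLUDE 'NDF_PAR'          ! Defines NDF__SZFRM, NDF__SZTYP
      INTEGER INDF               ! NDF identifier (given)
      INTEGER STATUS             ! Global status
      CHARACTER * ( NDF__SZFRM ) FORM
      CHARACTER * ( NDF__SZTYP ) UTYPE  ! Type of unscaled values
      CHARACTER * ( NDF__SZTYP ) STYPE  ! Type of stored values
      DOUBLE PRECISION SCALE, ZERO

      IF ( STATUS .NE. SAI__OK ) RETURN

*  Only enquire about scaling if the array is stored in scaled
*  form.
      CALL NDF_FORM( INDF, 'Data', FORM, STATUS )
      IF ( STATUS .EQ. SAI__OK ) THEN
         IF ( FORM .EQ. 'SCALED' ) THEN

*  Scale and zero terms (argument order assumed).
            CALL NDF_GTSZD( INDF, 'Data', SCALE, ZERO, STATUS )

*  Data type of the unscaled values seen by applications.
            CALL NDF_TYPE( INDF, 'Data', UTYPE, STATUS )

*  Data type of the stored (scaled) values.
            CALL NDF_SCTYP( INDF, 'Data', STYPE, STATUS )

            IF ( STATUS .EQ. SAI__OK ) THEN
               PRINT *, 'SCALE = ', SCALE, '  ZERO = ', ZERO
               PRINT *, 'Unscaled type: ', UTYPE
               PRINT *, 'Stored type:   ', STYPE
            END IF
         END IF
      END IF

      END
```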

12.5 Delta compressed Storage Form

In this form of storage, the values stored internally within an NDF’s array component are a compressed form of the external values of interest to application code. Delta form provides a lossless compression scheme designed for arrays of integers in which there is at least one pixel axis along which the array value changes only slowly. It uses two methods to achieve compression:

Support for delta arrays is currently limited, since it is anticipated that they will only be of interest as an archive format. The following details should be noted:

(1)
Delta arrays are “read-only”. An error will be reported if an attempt is made to map a delta array for WRITE or UPDATE access. When mapped for READ access, the pointer returned by NDF_MAP provides access to the original uncompressed data values.
(2)
The result of copying a delta array (for instance, using NDF_PROP, etc.) will be an equivalent simple array.
(3)
Delta arrays cannot be created directly. To create an NDF with delta compressed arrays, first create an NDF with simple arrays, and then copy it using NDF_ZDELT. The output NDF created by NDF_ZDELT is a copy of the input NDF, but stored with delta storage form.
(4)
Delta form can only be used to store integer data values, but NDFs with floating point data values may be compressed indirectly, by first storing the floating point values in a scaled NDF, and then using NDF_ZDELT to create a delta compressed copy of the scaled NDF. Note that the scaled NDF must use an integer data type to store the internal (i.e. scaled) values. The use of the scaled NDF means that the compression is not lossless, since some information will have been lost in scaling the floating point values into integers.
(5)
The NDF_GTDLT routine will return details of the compression applied to a delta compressed NDF array component.
(6)
Delta arrays cannot have complex data types. An error will be reported if an attempt is made to import an HDS structure describing a complex delta array.
(7)
When applied to a delta array, the NDF_TYPE and NDF_FTYPE routines return information about the data type of the original uncompressed data values.
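
As a sketch of how the compression details might be inspected, the following hypothetical helper, DLTINF, calls NDF_GTDLT for an array already known to be in delta form. The argument list used here (compression axis, data type of the stored values and compression ratio, in that order) is an assumption made for this illustration and should be checked against the routine description before use.

```fortran
      SUBROUTINE DLTINF( INDF, STATUS )
*  DLTINF is a hypothetical helper, not part of the NDF library.
      IMPLICIT NONE
      INCLUDE 'SAE_PAR'          ! Defines SAI__OK
      INCLUDE 'NDF_PAR'          ! Defines NDF__SZFRM, NDF__SZTYP
      INTEGER INDF               ! NDF identifier (given)
      INTEGER STATUS             ! Global status
      CHARACTER * ( NDF__SZFRM ) FORM
      CHARACTER * ( NDF__SZTYP ) ZTYPE  ! Type of stored values
      INTEGER ZAXIS              ! Compression axis
      REAL ZRATIO                ! Compression ratio

      IF ( STATUS .NE. SAI__OK ) RETURN

*  Only enquire about compression if the array is in delta form.
      CALL NDF_FORM( INDF, 'Data', FORM, STATUS )
      IF ( STATUS .EQ. SAI__OK ) THEN
         IF ( FORM .EQ. 'DELTA' ) THEN

*  Argument order assumed for this sketch.
            CALL NDF_GTDLT( INDF, 'Data', ZAXIS, ZTYPE, ZRATIO,
     :                      STATUS )

            IF ( STATUS .EQ. SAI__OK ) THEN
               PRINT *, 'Compression axis:  ', ZAXIS
               PRINT *, 'Stored data type:  ', ZTYPE
               PRINT *, 'Compression ratio: ', ZRATIO
            END IF
         END IF
      END IF

      END
```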

12.6 Primitive Storage Form

This storage form is provided primarily to maintain compatibility with previous data formats. In this case, the array’s values are held as a sequence of pixels in an N-dimensional array, but in a primitive HDS object. This means that no ancillary information can be associated with such a component and this imposes a number of restrictions on the properties of such arrays:

In most situations, these restrictions are unimportant and primitive storage form may be used to maintain compatibility with existing datasets and software. In the longer term, it is expected that a gradual transition will take place, replacing primitive arrays by equivalent simple arrays and this latter approach should be taken by all new software. However, there is usually little harm in creating NDFs with primitive array components, because any change made to the NDF which would violate one of the restrictions above will cause any affected primitive array component to implicitly change its storage form to become simple. This is a straightforward change which costs little, and a subsequent call to NDF_FORM will show if this has occurred. Only one possible complication may arise: if the array is mapped for access when its storage form is implicitly changed, then an error will result. This is unlikely to be a problem in practice.

Warning – possible pitfall: A case which occasionally causes problems can arise if a primitive NDF is created (e.g. by calling NDF_CREP – see §13.4) and an array component is then mapped for access using an access mode such as ‘WRITE/ZERO’. This access mode will cause the component’s bad-pixel flag to be set to .FALSE. (see §9.7). When the component is unmapped, this, in turn, will cause its storage form to be implicitly converted to simple. This behaviour is correct, but it is not always what is expected, or wanted. It can be avoided by setting the bad-pixel flag value back to .TRUE. (see §9.6) before unmapping the component concerned, or by performing the initialisation to zero explicitly rather than via an initialisation option on the mapping mode.
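
The first of the two remedies for the pitfall described above might look like the following sketch. ZEROPR is a hypothetical helper written for this illustration; it assumes that INDF identifies a newly created primitive NDF and that NDF_SBAD is the routine used to set the bad-pixel flag.

```fortran
      SUBROUTINE ZEROPR( INDF, STATUS )
*  ZEROPR is a hypothetical helper, not part of the NDF library.
*  INDF is assumed to identify a newly created primitive NDF.
      IMPLICIT NONE
      INCLUDE 'SAE_PAR'          ! Defines SAI__OK
      INCLUDE 'NDF_PAR'          ! Defines NDF__SZFRM
      INTEGER INDF               ! NDF identifier (given)
      INTEGER STATUS             ! Global status
      INTEGER PNTR( 1 )          ! Pointer to the mapped values
      INTEGER EL                 ! Number of mapped elements
      CHARACTER * ( NDF__SZFRM ) FORM

      IF ( STATUS .NE. SAI__OK ) RETURN

*  Map with initialisation to zero. This sets the bad-pixel flag
*  of the component to .FALSE.
      CALL NDF_MAP( INDF, 'Data', '_REAL', 'WRITE/ZERO', PNTR, EL,
     :              STATUS )

*  (Values would be written to the mapped array here.)

*  Set the bad-pixel flag back to .TRUE. before unmapping, so
*  that the component is not implicitly converted to simple form.
      CALL NDF_SBAD( .TRUE., INDF, 'Data', STATUS )
      CALL NDF_UNMAP( INDF, 'Data', STATUS )

*  A call to NDF_FORM shows whether the form has changed.
      CALL NDF_FORM( INDF, 'Data', FORM, STATUS )
      IF ( STATUS .EQ. SAI__OK ) PRINT *, 'Storage form: ', FORM

      END
```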

[11] Copying a scaled array produces an equivalent simple array.

[12] Alternatively, an existing simple NDF can be converted to a scaled array by assigning scale and zero values to it using NDF_PTSZx. A typical program could create a simple array, map it for write access, store the scaled data values in the mapped simple array, unmap the array, and then associate scale and zero values with the array, thus converting it to a scaled array.