ARY
A Subroutine Library for Accessing
ARRAY Data Structures
3 Array Storage Forms
Note that at present, the ARY_ system provides full support only for the “primitive” and “simple”
forms of the ARRAY data structure.
Some support is also provided for two additional forms:
-
SCALED
- - the “scaled” form described in SGP/38. This form is the same as the “simple” form
except that two extra scalar values are included that describe a linear scaling from the
stored array values to the data values of interest to an external user. These two scalars are
referred to as SCALE and ZERO. The external (unscaled) data values are derived from
the stored (scaled) data values as follows:
unscaled = SCALE*scaled + ZERO
-
DELTA
- - this form is not currently described in SGP/38. Delta form provides a lossless
compression scheme designed for arrays of integers in which there is at least one pixel
axis along which the array value changes only slowly. For further details, see §3.1.
The following points should be noted:
-
(1)
- Scaled and delta arrays are “read-only”. An error will be reported if an attempt is made
to map a scaled or delta array for WRITE or UPDATE access. When mapped for READ
access, the pointer returned by ARY_MAP provides access to the original data values -
that is, the mapped values are the result of (for scaled arrays) applying the scale and zero
terms to the stored values, or (for delta arrays) uncompressing the compressed values.
Currently, the internal stored (i.e. scaled or compressed) data values cannot be accessed
directly.
-
(2)
- The result of copying a scaled or delta array (using ARY_COPY) will be an equivalent
simple array.
-
(3)
- Scaled and delta arrays cannot be created directly. Instead, a simple array must first be
created (using ARY_NEW), and this can then be converted to a scaled or delta array as
follows:
-
SCALED
- - storing scale and zero values in the simple array using ARY_PTSZ<T>. A
typical program would create a simple array, map it for write access, store the scaled
data values in the mapped simple array, unmap the array, and then associate scale
and zero values with the array, thus converting it to a scaled array.
-
DELTA
- - copying the simple array using ARY_DELTA. The copy will be a compressed
array stored in delta form. A typical program would create a simple array, map it
for write access, store the uncompressed data values in the mapped simple array,
unmap the array, and then copy it using ARY_DELTA. The original simple array
could then be deleted if it is no longer needed.
-
(4)
- Scaled and delta arrays cannot have complex data types. An error will be reported if
an attempt is made to import an HDS structure describing a complex scaled or
delta array, or to use ARY_PTSZ<T> or ARY_DELTA on an array with complex data
values.
-
(5)
- When applied to a scaled or delta array, the ARY_TYPE and ARY_FTYPE routines return the
data type of the external (i.e. unscaled or uncompressed) values. In practice, this means that for a
scaled array they return the data type of the SCALE and ZERO constants, rather than the data
type of the array holding the stored (scaled) data values. For a delta array they return the data
type of the original uncompressed values.
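As an illustration of points (1) and (5), the following Python sketch applies the relation unscaled = SCALE*scaled + ZERO to a handful of stored integers. It is not ARY code: the function name `unscale` and the numerical values are invented for this example. It only shows why the values seen by a caller take the type of SCALE and ZERO (floating point here) rather than the type of the stored integers.

```python
def unscale(stored, scale, zero):
    """Apply unscaled = SCALE*scaled + ZERO to each stored value."""
    return [scale * s + zero for s in stored]


# Hypothetical stored (scaled) integers, e.g. held as _WORD in the array.
stored = [0, 100, -200, 32767]

# Hypothetical floating point SCALE and ZERO terms.
SCALE, ZERO = 0.5, 10.0

external = unscale(stored, SCALE, ZERO)
print(external)
# The external values are floating point even though the stored values
# are integers, mirroring what ARY_TYPE reports for a scaled array.
print({type(v).__name__ for v in external})
```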
3.1 Delta Compressed Array Form
The DELTA storage form provides lossless compression for integer arrays. It uses two methods to
achieve compression:
- Differences between adjacent data values are stored, rather than the full data values
themselves. For many forms of astronomical data, the differences between adjacent data
values have a much smaller range than the data values themselves. This means that they
can be represented in fewer bits. For instance, if the data values are _INTEGER, then
the differences between adjacent values may fit into the range of a _WORD (-32767 to
+32767) or even a _BYTE (-127 to +127). This use of a shorter data type usually provides
the majority of the compression. However, it is not necessary for all differences to be small
- if the difference between two adjacent data values is too large for the compressed data
type, the second of the two data values will be stored explicitly using the full data type
of the original uncompressed data. The more values that must be stored in
full in this way, the lower the compression.
In the above description, the term “adjacent” means “adjacent along a specified pixel
axis”. The pixel axis along which differences are taken is referred to as the “compression
axis”. It may be specified explicitly by the calling application when ARY_DELTA is called,
or it may be left unspecified, in which case ARY_DELTA will choose the axis that gives
the best compression.
- If the uncompressed array contains runs of more than three identical values along the
compression axis, then the run of identical values is replaced by a single value (stored in
full, not as a difference) and a repetition count.
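The effect of the compression axis on the first method can be explored with a small Python sketch. This is not how ARY_DELTA is implemented. The helper names (`differences`, `rows_along`, `fraction_fitting`, `best_axis`) and the test image are invented here, and "fraction of differences that fit" is only an assumed proxy for the compression actually achieved. The limits 127 and 32767 are the nominal _BYTE and _WORD ranges quoted above; in a real delta array the usable range is slightly smaller.

```python
def differences(row):
    """Differences between adjacent values along one row of pixels."""
    return [b - a for a, b in zip(row, row[1:])]


def rows_along(image, axis):
    """Rows of a 2-D image (a list of lists, image[y][x]) parallel to an axis.

    Axis numbers are one-based: 1 is the x axis, 2 is the y axis.
    """
    if axis == 1:
        return [list(r) for r in image]
    return [list(c) for c in zip(*image)]


def fraction_fitting(rows, limit):
    """Fraction of adjacent differences whose magnitude is at most `limit`."""
    diffs = [d for row in rows for d in differences(row)]
    if not diffs:
        return 1.0
    return sum(abs(d) <= limit for d in diffs) / len(diffs)


def best_axis(image, limit):
    """Hypothetical proxy: the axis along which most differences fit."""
    return max((1, 2),
               key=lambda ax: fraction_fitting(rows_along(image, ax), limit))


# Invented test image: values change slowly along x but jump along y.
image = [
    [1000, 1001, 1003, 1002],
    [5000, 5002, 5001, 5003],
    [9000, 9001, 9002, 9004],
]

for axis in (1, 2):
    rows = rows_along(image, axis)
    print(axis, fraction_fitting(rows, 127), fraction_fitting(rows, 32767))
print("preferred axis:", best_axis(image, 127))
```

For this image every difference along axis 1 fits the _BYTE range, whereas along axis 2 none do, so axis 1 is the better compression axis if the differences are to be stored as _BYTE.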
3.1.1 Creating a Delta Array
To create a DELTA array, first store the uncompressed integer values in a simple array, and then copy
the array using ARY_DELTA. The copy produced by ARY_DELTA will be stored in DELTA
form.
Arrays of floating point values may be compressed by first storing the floating point values in a
SCALED array, and then using ARY_DELTA to create a delta compressed copy of the scaled array.
Note, the scaled array must use an integer data type to store the internal (i.e. scaled) values. The use of
the scaled array means that the compression is not lossless, since some information will have been lost
in scaling the floating point values into integers.
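The information loss mentioned above can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than part of the library: the helper `choose_scale_zero` is hypothetical (ARY leaves the choice of SCALE and ZERO to the calling program), and the target range of -32767 to +32767 assumes that the scaled values are stored as _WORD.

```python
def choose_scale_zero(values, lo=-32767, hi=32767):
    """Hypothetical helper: map [min, max] of the data onto [lo, hi]."""
    vmin, vmax = min(values), max(values)
    scale = (vmax - vmin) / (hi - lo) or 1.0   # avoid a zero scale for flat data
    zero = vmin - scale * lo
    return scale, zero


def to_scaled(values, scale, zero):
    """Integers to store: the inverse of unscaled = SCALE*scaled + ZERO."""
    return [round((v - zero) / scale) for v in values]


def to_unscaled(stored, scale, zero):
    """External values recovered from the stored integers."""
    return [scale * s + zero for s in stored]


values = [0.0, 0.1, 0.25, 1.0]             # invented floating point data
scale, zero = choose_scale_zero(values)
stored = to_scaled(values, scale, zero)    # integers suitable for ARY_DELTA
recovered = to_unscaled(stored, scale, zero)

worst = max(abs(a - b) for a, b in zip(values, recovered))
print(stored)
print("largest round-trip error:", worst, "(at most about SCALE/2 =", scale / 2, ")")
```

The rounding step in `to_scaled` is where information is lost: each recovered value can differ from the original by up to roughly half of SCALE. The later delta compression of the stored integers adds no further loss.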
3.1.2 The HDS Structure of a Delta Array
The HDS structure of a DELTA array is similar to that of a SIMPLE array, in that it will contain
VARIANT, DATA and ORIGIN components. In addition, it can contain SCALE and ZERO
terms, which, if present, are used to scale the uncompressed integers as in a SCALED array.
Uncompression happens first, producing an array of uncompressed integers, which are then
unscaled if required using SCALE and ZERO to produce the final uncompressed, unscaled,
array.
DELTA arrays cannot be used to hold complex values and so no IMAGINARY_DATA component will
be present. Also, DELTA arrays have an implicit value of .TRUE. for their bad pixel flags, and so no
BAD_PIXEL component will be present in the HDS structure.
Information is stored within a DELTA array that allows sub-sections of the compressed array to be
uncompressed without needing to uncompress the whole array.
A DELTA array is stored in an HDS structure with type DELTA_ARRAY, and contains the following
components:
-
DATA
- - This is a one-dimensional integer array holding the differences between adjacent
uncompressed integer data values along the compression axis. Its data type will be one of
_INTEGER, _WORD or _BYTE, and is specified when ARY_DELTA is called to create
the array. A few integer values (all near the maximum value allowed by the data type)
are reserved for use as flags to indicate one of the following conditions (where “MAX”
represents the largest positive integer value that can be represented using the data type
of the DATA array):
- The value MAX is reserved to indicate that the next element of the uncompressed
array is good, but could not be expressed as a difference from the previous element
because the difference would not fit into the available data range of the DATA array.
Instead, the full uncompressed value is stored in the next element of the VALUE
array.
- The value (MAX-1) is reserved to indicate that the next element of the uncompressed
array is good and is exactly equal to the following (N-1) elements. The full
uncompressed value is stored in the next element of the VALUE array. The value of
N is stored in the next element of the REPEAT array.
- The value (MAX-2) is reserved to indicate that the next element of the uncompressed
array is bad, as are the following (N-1) elements. The full uncompressed value of the
next good value following the bad values is stored in the next element of the VALUE
array. The value of N is stored in the next element of the REPEAT array.
- The value (MAX-3) is reserved to indicate that the next element of the uncompressed
array is bad, but the following element is good and its full uncompressed value is
stored in the next element of the VALUE array.
- The value (MAX-4) is reserved to indicate that the next N elements of the
uncompressed array are good but cannot be expressed as differences from the
previous element because the differences would not fit into the available data range
of the DATA array. Instead, the full uncompressed values are stored in the next N
elements of the VALUE array. The value of N is stored in the next element of the
REPEAT array.
- Any other value is taken to be (NEXT - PREVIOUS) - the difference between the next
uncompressed value and the previous uncompressed value.
Notes:
-
(1)
- The “available data range” in DATA is reduced to leave room for the above flags.
-
(2)
- The first element in each row of pixels parallel to the compression axis is always
represented using one of these flag values. This allows each row of pixel values to
be uncompressed without reference to any earlier values.
-
(3)
- Repeated runs of good or bad values are always contained within a single row
of pixels parallel to the compression axis. Runs of repeated values that cross the
boundary between adjacent rows are split into two repeated runs - one for each row.
-
FIRST_DATA
- - This is an _INTEGER array with (NDIM-1) axes which have the same order and size
as the axes of the uncompressed array, but omitting the compression axis (NDIM is
the number of axes in the uncompressed array). It holds the zero-based index into
the DATA array at which the first element of the corresponding row of values is
stored.
For instance, if the uncompressed array is a cube with bounds (1:10,1:5,1:7), and the compression
axis is axis number 2, then the FIRST_DATA array will be two-dimensional with bounds
(1:10,1:7). Element (2,3) of this array (for instance) will hold the integer index of the DATA array
element that gives the full value for pixel (2,1,3) in the uncompressed array. Elements (2,2,3),
(2,3,3), (2,4,3) and (2,5,3) of the uncompressed array are then derived from the following values
in the DATA array.
-
FIRST_REPEAT
- - This is an array with the same shape as the FIRST_DATA array. It holds the
zero-based index of the first value of the REPEAT array to be used whilst uncompressing the
corresponding row of pixels. This component will only be present in the DELTA_ARRAY
structure if the REPEAT component is present. The data type of this array will be
one of _INTEGER, _UWORD or _UBYTE, depending on the largest value stored in
it.
-
FIRST_VALUE
- - This is an array with the same shape as the FIRST_DATA array. It holds the
zero-based index of the first value of the VALUE array to be used whilst uncompressing the
corresponding row of pixels. The data type of this array will be one of _INTEGER, _UWORD or
_UBYTE, depending on the largest value stored in it.
-
ORIGIN
- - A one-dimensional _INTEGER array holding the pixel indices of the first element of the
uncompressed array. This component is optional - an origin of (1,1,1...) is assumed if the
component is not present in the DELTA_ARRAY structure.
-
REPEAT
- - A one-dimensional integer array holding the number of repetitions for each value
associated with an occurrence of (MAX-1), (MAX-2) or (MAX-4) in the DATA array. The data
type of this array will be one of _INTEGER, _UWORD or _UBYTE, depending on the largest
value stored in it. This array will not be present if there are no runs in the uncompressed data
array.
-
SCALE
- - An optional component giving a scale factor to apply to the uncompressed integer values. It
can be of any data type. If present the uncompressed array is treated like a SCALED array. In
particular, the data type of the uncompressed array will be the same as the data type of the
SCALE component, if present. If not present, the data type of the uncompressed array is given
by the data type of the VALUE array.
-
VALUE
- - A one-dimensional array with the same data type as the uncompressed array (_INTEGER,
_WORD, _UWORD, _BYTE or _UBYTE) prior to scaling by SCALE and ZERO. It holds full
uncompressed integer values for the elements that are flagged with any of the special values
listed under “DATA” above. Note, if SCALE and ZERO components are present in the DELTA
array, the VALUE array holds internal scaled values, rather than external unscaled
values.
-
VARIANT
- - The storage form of the array. This will always be set to “DELTA”.
-
ZAXIS
- - A scalar _INTEGER value giving the index of the compression axis - that is, the pixel
axis index within the uncompressed array along which differences were taken. Care
should be taken in the choice of ZAXIS since it can affect the degree of compression
achieved. If ZAXIS is not specified when compressing an array, it defaults to the
axis that gives the greatest compression. Note, the ZAXIS value is one-based, not
zero-based.
-
ZDIM
- - A scalar _INTEGER holding the length of the compression axis within the uncompressed
array. The other dimensions of the uncompressed array are given by the shape of the
FIRST_DATA array.
-
ZERO
- - An optional component giving a zero offset to add to the uncompressed integer values. It can
be of any data type. If present the uncompressed array is treated like a SCALED
array.
-
ZRATIO
- - A scalar _REAL holding the compression factor - that is, the ratio of the uncompressed
array size to the compressed array size. This is approximate as it does not include
the effects of the metadata needed to describe the extra components of a DELTA
array (i.e. the space needed to hold the HDS component names, types, dimensions,
etc).
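To show how the DATA, VALUE and REPEAT components and the FIRST_DATA, FIRST_VALUE and FIRST_REPEAT indices work together, the following Python sketch uncompresses a single row from hand-built arrays. It is one reading of the flag descriptions above, not the library's own code. The function name `decode_row`, the use of `None` as a stand-in for the bad value, and the example arrays are all invented here. In particular, the sketch assumes that the (MAX-2) and (MAX-3) flags emit the bad pixels and then the good value taken from VALUE, and that no good value is emitted if the bad pixels reach the end of the row; the text does not spell out that edge case.

```python
BAD = None  # stand-in for the bad-pixel value (an assumption of this sketch)


def decode_row(data, value, repeat, i_data, i_value, i_repeat, zdim, max_flag):
    """Uncompress one row of `zdim` pixels from zero-based starting indices."""
    out = []
    prev = None
    while len(out) < zdim:
        code = data[i_data]
        i_data += 1
        if code == max_flag:                 # one good value stored in full
            prev = value[i_value]
            i_value += 1
            out.append(prev)
        elif code == max_flag - 1:           # run of N identical good values
            prev = value[i_value]
            i_value += 1
            out.extend([prev] * repeat[i_repeat])
            i_repeat += 1
        elif code in (max_flag - 2, max_flag - 3):
            # (MAX-2): N bad values; (MAX-3): a single bad value.
            n = 1
            if code == max_flag - 2:
                n = repeat[i_repeat]
                i_repeat += 1
            out.extend([BAD] * n)
            if len(out) < zdim:              # assumed: then the next good value
                prev = value[i_value]
                i_value += 1
                out.append(prev)
        elif code == max_flag - 4:           # N good values stored in full
            n = repeat[i_repeat]
            i_repeat += 1
            out.extend(value[i_value:i_value + n])
            i_value += n
            prev = out[-1]
        else:                                # ordinary difference NEXT - PREVIOUS
            prev = prev + code
            out.append(prev)
    return out


MAX = 127  # largest value of a _BYTE DATA array
# Two invented rows of ZDIM = 10 pixels, stored one after the other.
data = [127, 2, -1, 127, 126, 124,   127, 1, 1, 125, 1, 123]
value = [1000, 5000, 7, 42,   10, 20, 500, 900]
repeat = [4,   3, 2]
first_data, first_value, first_repeat = [0, 6], [0, 4], [0, 1]

for row in (0, 1):
    print(decode_row(data, value, repeat, first_data[row],
                     first_value[row], first_repeat[row], 10, MAX))
```

Because each row starts with a flag value and has its own starting indices, the second row can be decoded without touching the first. This is the property that allows sub-sections of a delta array to be uncompressed independently.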