Calculate statistics over a group of data arrays or points

Description


This application calculates cumulative statistics over a group of NDFs. It can either generate the statistics of each corresponding pixel in the input array components and output a new NDF with array components containing the result, or calculate statistics at a single point specified in the current co-ordinate Frame of the input NDFs.

In array mode (SINGLE=FALSE), statistics are calculated for each pixel in one of the array components (DATA, VARIANCE or QUALITY) accumulated over all the input NDFs and written to an output NDF; each pixel of the output NDF is the result of combining the pixels with the same pixel co-ordinates in all the input NDFs. There is a selection of statistics available to form the output values.

The input NDFs must all have the same number of dimensions, but need not all be the same shape. The shape of the output NDF can be set to either the intersection or the union of the shapes of the input NDFs using the TRIM parameter.
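
The intersection and union of the input shapes can be illustrated with a short Python sketch. It is not the application's own code: the function name combine_bounds and the representation of a shape as a pair of inclusive lower and upper pixel-bound tuples are assumptions made for this illustration.

```python
def combine_bounds(bounds_list, trim=True):
    """Return (lower, upper) pixel bounds for the output array.

    bounds_list holds one (lower, upper) pair per input, where lower and
    upper are tuples of inclusive pixel indices, one entry per dimension.
    trim=True gives the intersection of the inputs, trim=False the union.
    """
    ndim = len(bounds_list[0][0])
    if any(len(lo) != ndim or len(hi) != ndim for lo, hi in bounds_list):
        raise ValueError("all inputs must have the same number of dimensions")

    # Regroup so that each entry lists the bounds of one axis over all inputs.
    lowers = list(zip(*(lo for lo, _ in bounds_list)))
    uppers = list(zip(*(hi for _, hi in bounds_list)))

    if trim:
        lower = tuple(max(axis) for axis in lowers)
        upper = tuple(min(axis) for axis in uppers)
        if any(lo > hi for lo, hi in zip(lower, upper)):
            raise ValueError("the inputs have no pixels in common")
    else:
        lower = tuple(min(axis) for axis in lowers)
        upper = tuple(max(axis) for axis in uppers)
    return lower, upper
```

For two hypothetical inputs with bounds (1:10, 1:10) and (5:12, -2:8), this gives (5:10, 1:8) for the intersection and (1:12, -2:10) for the union.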

In single pixel mode (SINGLE=TRUE), a position in the current co-ordinate Frame of all the NDFs is given, and the value at the pixel covering this point in each of the input NDFs is accumulated to form the results, which comprise the mean, variance, and median. These statistics are reported directly to you, together with the value of each contributing pixel if the environment variable MSG_FILTER is set to VERBOSE.
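
The single pixel accumulation might be sketched in Python as follows. This is an illustration only: the function name single_pixel_stats, the use of None to mark a bad or out-of-bounds pixel, and the choice of the population variance are all assumptions, since the text does not say which variance normalisation is used.

```python
import statistics


def single_pixel_stats(samples):
    """Summarise the values sampled at one position in each input.

    samples holds one value per input; None marks an input whose pixel
    is bad or lies outside the array (an assumed convention).
    Returns None when no input contributes a good value.
    """
    good = [value for value in samples if value is not None]
    if not good:
        return None
    return {
        "mean": statistics.mean(good),
        # Population variance is an assumption made for this sketch.
        "var": statistics.pvariance(good),
        "median": statistics.median(good),
        "ngood": len(good),
    }
```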


Usage

mstats in out [estimator]

Parameters


CLIP = _REAL (Read)
The number of standard deviations about the mean at which to clip outliers for the "Mode", "Cmean" and "Csigma" statistics (see Parameter ESTIMATOR). The application first computes statistics using all the available pixels. It then rejects all those pixels whose values lie beyond CLIP standard deviations from the mean and re-evaluates the statistics. For "Cmean" and "Csigma" there is currently only one iteration, but there can be up to seven for "Mode".

The value must be positive. [3.0]
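
The single clipping iteration described for "Cmean" and "Csigma" might look like the following Python sketch. The function name clipped_stats is hypothetical, and the use of the population standard deviation is an assumption; the iterative "Mode" estimate is not attempted here.

```python
import math


def clipped_stats(values, clip=3.0):
    """Return (clipped mean, clipped standard deviation) after one pass.

    Statistics are first formed from all the values; values further than
    clip standard deviations from the mean are then rejected and the
    statistics re-evaluated. Returns None if nothing survives.
    """
    if clip <= 0:
        raise ValueError("clip must be positive")
    if not values:
        return None

    mean = sum(values) / len(values)
    # Population standard deviation: an assumption made for this sketch.
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

    kept = [v for v in values if abs(v - mean) <= clip * sigma]
    if not kept:
        return None

    cmean = sum(kept) / len(kept)
    csigma = math.sqrt(sum((v - cmean) ** 2 for v in kept) / len(kept))
    return cmean, csigma
```

With ten values of 10 and one of 100, the outlier lies just over three standard deviations from the initial mean, so the default CLIP of 3.0 rejects it.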


COMP = LITERAL (Read)
The NDF array component to be analysed. It may be "Data", "Quality", "Variance", or "Error" (where "Error" is an alternative to "Variance" and causes the square root of the variance values to be used). If "Quality" is specified, then the quality values are treated as numerical values (in the range 0 to 255). In cases other than "Data", which is always present, a missing component will be treated as having all pixels set to the ‘bad’ value. ["Data"]

ESTIMATOR = LITERAL (Read)
The method to use for estimating the output pixel values. It can be one of the following options. The first four are more for general collapsing, and the remainder are for cube analysis.
  • "Mean" -- Mean value.

  • "WMean" -- Weighted mean in which each data value is weighted by the reciprocal of the associated variance. (2)

  • "Mode" -- Modal value. (4)

  • "Median" -- Median value. Note that this is extremely memory and CPU intensive for large datasets; use with care! If strange things happen, use "Mean". (3)

  • "Absdev" -- Mean absolute deviation from the unweighted mean. (2)

  • "Cmean" -- Sigma-clipped mean. (4)

  • "Csigma" -- Sigma-clipped standard deviation. (4)

  • "Comax" -- Co-ordinate of the maximum value.

  • "Comin" -- Co-ordinate of the minimum value.

  • "FBad" -- Fraction of bad pixel values.

  • "FGood" -- Fraction of good pixel values.

  • "Integ" -- Integrated value, being the sum of the products of the value and pixel width in world co-ordinates.

  • "Iwc" -- Intensity-weighted co-ordinate, being the sum of each value times its co-ordinate, all divided by the integrated value (see the "Integ" option).

  • "Iwd" -- Intensity-weighted dispersion of the co-ordinate, normalised like "Iwc" by the integrated value. (4)

  • "Max" -- Maximum value.

  • "Min" -- Minimum value.

  • "NBad" -- Count of bad pixel values.

  • "NGood" -- Count of good pixel values.

  • "Rms" -- Root-mean-square value. (4)

  • "Sigma" -- Standard deviation about the unweighted mean. (4)

  • "Sum" -- The total value.

Where needed, the co-ordinates are the indices of the input NDFs in the supplied order. Thus the calculations behave as if the NDFs were stacked one upon another to form an extra axis, and that axis had GRID co-ordinates. Take care when using wildcards to achieve a specific order, say for a time series, and hence to assign the desired co-ordinate to each NDF. Indirection through a text file is recommended.
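
As an illustration of how the stacking order supplies the co-ordinate, the following Python sketch evaluates "Comax", "Comin" and "Iwc" for one pixel position. The function name stack_coordinates and the use of None for a bad value are assumptions. The sketch numbers the inputs from 1 and takes the pixel width along the stacking axis as one, so that the integrated value reduces to the plain sum.

```python
def stack_coordinates(values):
    """Return (comax, comin, iwc) for one pixel position across a stack.

    values[i] is the pixel value taken from the (i + 1)-th input in the
    supplied order; None marks a bad or missing value (an assumed
    convention). Co-ordinates are the 1-based positions in the stack.
    """
    good = [(index + 1, value)
            for index, value in enumerate(values) if value is not None]
    if not good:
        return None

    comax = max(good, key=lambda pair: pair[1])[0]
    comin = min(good, key=lambda pair: pair[1])[0]

    # With unit pixel width the integrated value is just the sum.
    integ = sum(value for _, value in good)
    iwc = (sum(coord * value for coord, value in good) / integ
           if integ != 0 else None)
    return comax, comin, iwc
```

Reordering the inputs changes all three results, which is why the order in which the NDFs are supplied matters for these estimators.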

The selection is restricted if there are only a few input NDFs. For instance, measures of dispersion like "Sigma" and "Iwd" are meaningless for combining only two NDFs. The minimum number of input NDFs for each estimator is given in parentheses in the list above. Where there is no number, there is no restriction. If you supply an unavailable option, you will be informed, and presented with the available options. ["Mean"]
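
The restriction can be expressed as a lookup of the minimum counts quoted in the list above. The following Python sketch does this; the names MIN_INPUTS, ESTIMATORS and available_estimators are hypothetical and are not part of the application.

```python
# Minimum number of input NDFs for each restricted estimator, copied from
# the numbers in parentheses in the list above.
MIN_INPUTS = {
    "WMean": 2, "Mode": 4, "Median": 3, "Absdev": 2, "Cmean": 4,
    "Csigma": 4, "Iwd": 4, "Rms": 4, "Sigma": 4,
}

ESTIMATORS = [
    "Mean", "WMean", "Mode", "Median", "Absdev", "Cmean", "Csigma",
    "Comax", "Comin", "FBad", "FGood", "Integ", "Iwc", "Iwd", "Max",
    "Min", "NBad", "NGood", "Rms", "Sigma", "Sum",
]


def available_estimators(n_inputs):
    """List the estimators usable with n_inputs input NDFs."""
    # Estimators with no quoted number have no restriction.
    return [name for name in ESTIMATORS
            if MIN_INPUTS.get(name, 1) <= n_inputs]
```

With two inputs, for example, "Mean", "WMean" and "Absdev" are offered, but "Median", "Sigma" and "Iwd" are not.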

IN = GROUP (Read)
A group of input NDFs. They may have different shapes, but must all have the same number of dimensions. This should be given as a comma-separated list, in which each list element can be one of the following.
  • An NDF name, optionally containing wild-cards and/or regular expressions ("*", "?", "[a-z]" etc.);

  • the name of a text file, preceded by an up-arrow character "^". Each line in the text file should contain a comma-separated list of elements, each of which can in turn be an NDF name (with optional wild-cards, etc), or another file specification (preceded by an up-arrow). Comments can be included in the file by commencing lines with a hash character "#".

If the value supplied for this parameter ends with a minus sign "-", then you are re-prompted for further input until a value is given which does not end with a minus sign. All the NDFs given in this way are concatenated into a single group.
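
Where the order of the NDFs matters, an indirection file can be prepared in advance. The Python sketch below writes one NDF name per line, sorted so that numeric suffixes run in numerical rather than alphabetical order. The function names and the file name frames.lis are hypothetical; only the "^" and "#" conventions come from the description above.

```python
import re


def natural_key(name):
    """Split a name into text and integer chunks so frame10 follows frame9."""
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", name)]


def write_group_file(names, path):
    """Write the NDF names to a text file, one per line, in natural order."""
    ordered = sorted(names, key=natural_key)
    with open(path, "w") as handle:
        handle.write("# Input NDFs in the order they are to be stacked\n")
        for name in ordered:
            handle.write(name + "\n")
    return ordered
```

The resulting file could then be supplied as in=^frames.lis.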

OUT = NDF (Read)
The name of an NDF to receive the results. Each pixel of the DATA (and perhaps VARIANCE) component represents the statistics of the corresponding pixels of the input NDFs. Only used if SINGLE=FALSE.

POS = LITERAL (Read)
In single pixel mode (SINGLE=TRUE), this parameter gives the position in the current co-ordinate Frame at which the statistics should be calculated (supplying a colon ":" will display details of the required co-ordinate Frame). The position should be supplied as a list of formatted axis values separated by spaces or commas. The pixel covering this point in each input array, if any, will be used.

SINGLE = _LOGICAL (Read)
Whether the statistics should be calculated in single pixel mode or array mode. If SINGLE=TRUE, then the POS parameter will be used to get the point to which the statistics refer, but if SINGLE=FALSE an output NDF will be generated containing the results for all the pixels. [FALSE]

TITLE = LITERAL (Read)
Title for the output NDF. ["KAPPA - Mstats"]

TRIM = _LOGICAL (Read)
This parameter controls the shape of the output NDF. If TRIM=TRUE, then the output NDF is the shape of the intersection of all the input NDFs, i.e. only pixels which appear in all the input arrays will be represented in the output. If TRIM=FALSE, the output is the shape of the union of the inputs, i.e. every pixel which appears in the input arrays will be represented in the output. [TRUE]

VARIANCE = _LOGICAL (Read)
A flag indicating whether a variance array present in the NDF is used to weight the array values while forming the estimator’s statistic, and to derive output variance. If VARIANCE is TRUE and all the input NDFs contain a variance array, this array will be used to define the weights, otherwise all the weights will be set equal. [TRUE]
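
The choice between variance weighting and equal weighting might be sketched in Python as follows. The function name combine_pixel is hypothetical, the weights are the reciprocals of the variances as described for the "WMean" estimator, and the output variance 1/sum(weights) is the standard result for an inverse-variance weighted mean rather than something stated in this text.

```python
def combine_pixel(values, variances, use_variance=True):
    """Return (mean, variance of the mean) for one pixel across the inputs.

    variances[i] is None when input i has no variance array. Weights are
    only used when use_variance is true and every input has a variance;
    otherwise all weights are equal and no output variance is derived
    in this sketch. Variances are assumed to be strictly positive.
    """
    if use_variance and all(var is not None for var in variances):
        weights = [1.0 / var for var in variances]
        total = sum(weights)
        mean = sum(w * v for w, v in zip(weights, values)) / total
        return mean, 1.0 / total

    mean = sum(values) / len(values)
    return mean, None
```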

WLIM = _REAL (Read)
If the input NDFs contain bad pixels, then this parameter may be used to determine at a given pixel location the number of good pixels which must be present within the input NDFs before a valid output pixel is generated. It can be used, for example, to prevent output pixels from being generated in regions where there are relatively few good pixels to contribute to the result of combining the input NDFs.
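
A threshold of this kind might be applied as in the Python sketch below. It assumes that WLIM is interpreted as the minimum fraction of good input pixels, which is what the wlim=1.0 example later in this document suggests. The function name combine_with_wlim, the use of None for a bad pixel, and the choice of the mean as the statistic are all assumptions.

```python
def combine_with_wlim(values, wlim):
    """Return the mean of the good values, or None for a bad output pixel.

    values holds one pixel value per input, with None marking a bad
    pixel. wlim is taken to be the minimum fraction of inputs that must
    be good for a valid output pixel (an assumption; see the lead-in).
    """
    good = [value for value in values if value is not None]
    if not good or len(good) < wlim * len(values):
        return None
    return sum(good) / len(good)
```

With three inputs of which one is bad, a limit of 1.0 gives a bad output pixel, whereas a limit of 0.5 gives the mean of the two good values.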

Results Parameters

MEAN = _DOUBLE (Write)
The mean pixel value, if SINGLE=TRUE.
MEDIAN = _DOUBLE (Write)
The median pixel value, if SINGLE=TRUE.
VAR = _DOUBLE (Write)
The variance of the pixel values, if SINGLE=TRUE.

Examples


mstats idat* ostats
This calculates the mean of each pixel in the Data arrays of all the NDFs in the current directory with names which start "idat", and writes the result in a new NDF called ostats. The shape of ostats will be the intersection of the volumes of all the idat NDFs.
mstats idat* ostats trim=false
This does the same as the previous example, except that the output NDF will be the ‘union’ of the volumes of the input NDFs, that is a cuboid with lower bounds as low as the lowest pixel bound of the input NDFs in each dimension and with upper bounds as high as the highest pixel bound in each dimension.
mstats idat* ostats variance
This is like the first example except that any variance information present is used to weight the data values.
mstats idat* ostats comp=variance variance
This does the same as the first example except that statistics are calculated on the VARIANCE components of all the input NDFs. Thus the pixels of the VARIANCE component of ostats will be the variances of the variances of the input data.
mstats m31* single=true pos="0:42:38,40:52:20"
This example analyses the pixel brightness at the indicated sky position in a number of NDFs whose names start with "m31", which all have SKY as their current co-ordinate Frame. The mean and variance of the pixels at that position in all the NDFs are printed to the screen. If the reporting level is verbose, the command also prints the value of the sampled pixel in each of the NDFs. For those in which the pixel at the selected position is bad or falls outside the NDF, this is also indicated.
mstats in="arr1,arr2,arr3" out=middle estimator=median wlim=1.0
This example calculates the medians of the DATA components of the three named NDFs and writes them into a new NDF called middle. All input values must be good to form a non-bad output value.


Related Applications


Implementation Status: