Menu Options


PERIOD
A Time-Series Analysis Package

5 Menu Options

PERIOD is a menu-driven package. On entering PERIOD, you will be confronted with the following menu options, which are described in greater detail below.

  |**************************************************|
  |**| PERIOD  :>  A time-series analysis package |**|
  |**| Version :>  5.0 for UNIX                   |**|
  |**| Date    :>  12 December 2001               |**|
  |**************************************************|

  Options.
  --------

  INPUT    --  Input ASCII file data.
  OGIP     --  Input OGIP FITS table data
  FAKE     --  Create fake data.
  NOISE    --  Add noise to data.
  DETREND  --  Detrend the data.
  WINDOW   --  Set data points to unity.
  OPEN     --  Open a log file.
  CLOSE    --  Close the log file.
  PERIOD   --  Find periodicities.
  FIT      --  Fit sine curve to folded data.
  FOLD     --  Fold data on given period.
  SINE     --  +, -, / or * sine curves.
  PLT      --  Call PLT.
  STATUS   --  Information on stored data.
  OUTPUT   --  Output data.
  HELP     --  On-line help.
  QUIT     --  Quit PERIOD.

  PERIOD>

Any of these commands can be entered by typing anything from its shortest unambiguous abbreviation up to the full command name. For example, P is ambiguous (it matches both PERIOD and PLT), but PE is not.
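The abbreviation rule can be illustrated with a short Python sketch. This is not PERIOD's own code: the function name `resolve_command` and the case-insensitive matching are assumptions made for illustration, and only the command names are taken from the main menu above.

```python
# Main-menu command names, copied from the menu listing.
COMMANDS = ["INPUT", "OGIP", "FAKE", "NOISE", "DETREND", "WINDOW", "OPEN",
            "CLOSE", "PERIOD", "FIT", "FOLD", "SINE", "PLT", "STATUS",
            "OUTPUT", "HELP", "QUIT"]


def resolve_command(text, commands=COMMANDS):
    """Return the single command that starts with `text`.

    Returns None when the abbreviation is empty, matches nothing,
    or matches more than one command (i.e. it is ambiguous).
    Upper-casing the input is an assumption, not documented behaviour.
    """
    key = text.strip().upper()
    if not key:
        return None
    matches = [name for name in commands if name.startswith(key)]
    return matches[0] if len(matches) == 1 else None
```

With this sketch, `resolve_command("PE")` gives `"PERIOD"`, while `resolve_command("P")` gives `None` because both PERIOD and PLT match.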

`INPUT`

As described in section 3.1, this option allows you to input ASCII data into PERIOD. The routine determines the number of columns in the input file and then prompts you for the columns that hold the $x$-axis values, the $y$-axis values and the $y$-axis errors (if desired, see section 3). For example, for radial velocity data the $x$-axis would most probably be HJDs, the $y$-axis the heliocentric radial velocities, and there would most likely be an error associated with each radial velocity value. Note that the $x$-axis values must be in ascending order; otherwise INPUT will report a warning and either sort the data (if requested to do so) or abort. Note also that the $y$-axis errors are used by all options in the main PERIOD menu, but by only the CHISQ periodicity-finding option in the PERIOD_PERIOD sub-menu.

`OGIP`

As described in section 3.2, this option allows you to input data from an OGIP FITS table into PERIOD. The routine displays some information about the requested file and allows you to choose which of its available tables is to be examined. You then select which columns in the table hold the $x$-axis values, the $y$-axis values and the $y$-axis errors (if desired, see section 3).

`FAKE`

Allows you to create fake data with which to test or experiment with PERIOD. Two options are catered for: periodic data or chaotic data. The periodic data are created by summing a user-specified number of sine curves of the form:

Y = GAMMA + (AMPLITUDE * SIN( ((2.0*PI)/PERIOD) * (X - ZEROPT)))

The chaotic data are created using a simple logistic equation of the form:

X(n+1) = LAMBDA * X(n) * (1 - X(n))

(see, for example, Scargle 1990ab).
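The two kinds of fake data can be sketched in Python as follows. The function names `sine_sum` and `logistic_series` are hypothetical, and giving each sine component its own GAMMA term is an assumption; only the two formulae themselves come from the description above.

```python
import math


def sine_sum(x_values, components):
    """Sum sine curves of the form
    GAMMA + AMPLITUDE * sin((2*pi/PERIOD) * (x - ZEROPT)).

    `components` is a list of (gamma, amplitude, period, zeropt) tuples.
    """
    return [
        sum(g + a * math.sin((2.0 * math.pi / p) * (x - z))
            for g, a, p, z in components)
        for x in x_values
    ]


def logistic_series(lam, x0, n):
    """Iterate X(n+1) = LAMBDA * X(n) * (1 - X(n)), returning n values
    starting from x0."""
    series = [x0]
    for _ in range(n - 1):
        series.append(lam * series[-1] * (1.0 - series[-1]))
    return series
```

For example, `sine_sum(xs, [(0.0, 1.0, 2.5, 0.0), (0.0, 0.3, 0.7, 0.1)])` would give a two-period test signal on the sample points `xs`.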

`NOISE`

Using this option, it is possible to add noise to data or randomize data. The latter operation is carried out by specifying the [N]ew dataset option, which will construct an artificial dataset of the same mean value and the same standard deviation as the original. Selecting the [O]ld dataset allows you to apply noise to data, create errorbars on the data points, and/or add noise to the data sampling (so that, for instance, an evenly sampled dataset becomes unevenly sampled). This routine is useful, not only in creating realistic artificial datasets (in conjunction with FAKE), but also in investigating the effects of noise on a period detection.

`DETREND`

This option removes the D.C. bias from data, which, if not removed, gives rise to significant power at 0 Hz. There are two options. If the data show no long-term trends, it is best simply to subtract the mean and divide by the standard deviation (the [M] option). This gives a dataset with a mean of zero and a standard deviation of one. Otherwise, it is best to subtract a low-order polynomial fit to the data (the [P] option), since if long-term trends are not removed, a Fourier transform will inject a significant amount of power at the frequency of the long-term variations.
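A minimal Python sketch of the [M] option is given here for illustration. The name `detrend_mean` is hypothetical, and the use of the population standard deviation (dividing by N rather than N-1) is an assumption, since the description does not say which PERIOD uses.

```python
import math


def detrend_mean(y_values):
    """Subtract the mean and divide by the standard deviation,
    giving data with zero mean and unit standard deviation."""
    n = len(y_values)
    mean = sum(y_values) / n
    # Population standard deviation (divide by n): an assumption.
    std = math.sqrt(sum((y - mean) ** 2 for y in y_values) / n)
    if std == 0.0:
        raise ValueError("constant data have zero standard deviation")
    return [(y - mean) / std for y in y_values]
```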

`WINDOW`

One of the main problems with the classical periodogram¹ (see Scargle 1982 for a definition) is spectral leakage, of which there are several forms. Leakage to nearby frequencies (sidelobes) is due to the finite total interval over which the data are sampled. Leakage to distant frequencies is due to the finite size of the interval between samples. The WINDOW option sets all the $y$-axis data points to unity. A discrete Fourier transform of the resulting data (using, for example, the FT option, see below) yields the window function (or spectrum), which shows the effects of spectral leakage.
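The idea behind WINDOW can be sketched in Python: replace every $y$ value by unity and take the discrete Fourier power at each trial frequency. The function names and the 1/N² normalization (chosen so that the window is 1 at zero frequency) are assumptions for illustration, not PERIOD's own conventions.

```python
import math


def dft_power(times, values, frequencies):
    """Discrete Fourier power at each trial frequency.

    Works for arbitrary (uneven) sampling times. The 1/N**2
    normalization is an illustrative choice.
    """
    n = len(times)
    power = []
    for f in frequencies:
        re = sum(v * math.cos(2.0 * math.pi * f * t)
                 for t, v in zip(times, values))
        im = sum(v * math.sin(2.0 * math.pi * f * t)
                 for t, v in zip(times, values))
        power.append((re * re + im * im) / (n * n))
    return power


def window_function(times, frequencies):
    """Spectral window: the DFT power of data set to unity."""
    return dft_power(times, [1.0] * len(times), frequencies)
```

For four samples taken at unit spacing, this window is 1 at frequency 0 and again at frequency 1, which is the leakage to distant frequencies caused by the finite interval between samples.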

`OPEN`

It is possible to store the fits calculated by SINE and PEAKS in a log file. This option opens a new log file (if it does not already exist), or else re-opens an old log file and skips over the existing entries.

`CLOSE`

This option closes the currently open log file.

`PERIOD`

This is where all the work is done. You will be confronted by the following sub-menu:

  Options.
  --------

  SELECT   --  Select data slots.
  FREQ     --  Set/show frequency search limits.
  CHISQ    --  Chi-squared of sine fit vs frequency.
  CLEAN    --  CLEANed power spectrum.
  FT       --  Discrete Fourier power spectrum.
  PDM      --  Phase dispersion minimization.
  SCARGLE  --  Lomb-Scargle normalized periodogram.
  STRING   --  String-length vs frequency.
  PEAKS    --  Calculate period from periodogram.
  SIG      --  Enable/disable significance calc.
  HELP     --  On-line help.
  QUIT     --  Quit PERIOD_PERIOD.

  PERIOD_PERIOD>

SELECT – Selects input and output slots for processing, as described in section 3. The input slots should contain the time-series, the output slots will contain, for example, the power spectra. SELECT must be run every time a periodicity-finding option is about to be executed; although tedious, this prevents one from accidentally overwriting slots.
FREQ – Sets the frequency search parameters. You can select the minimum frequency, maximum frequency and frequency interval. Generally, there is no restriction on the number of frequencies to be stepped through in the processing. Alternatively, by entering 0’s, default values can be accepted. Note that the default values are set on entering the PERIOD package and thus the FREQ option need not be run if default frequencies are required. The default values are calculated as follows: minimum frequency = 0 (ie. infinite period), maximum frequency = 1 / (2 $\times$ Smallest Data Interval) (ie. Nyquist), frequency interval = 1 / (4 $\times$ Total Time Interval).
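
The default frequency limits quoted for FREQ can be written as a short Python sketch. The function name `default_frequency_grid` is hypothetical, and ignoring zero-length gaps between repeated time stamps is an assumption.

```python
def default_frequency_grid(times):
    """Return (minimum, maximum, interval) of the default frequency search.

    `times` must be in ascending order and span a non-zero interval.
    """
    total = times[-1] - times[0]
    # Smallest positive gap between successive samples
    # (skipping repeated time stamps is an assumption).
    smallest = min(b - a for a, b in zip(times, times[1:]) if b > a)
    f_min = 0.0                        # infinite period
    f_max = 1.0 / (2.0 * smallest)     # Nyquist frequency
    f_step = 1.0 / (4.0 * total)
    return f_min, f_max, f_step
```

For sample times 0, 1, 2 and 4 this gives a maximum frequency of 0.5 and a frequency interval of 0.0625.
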
CHISQ – This is a straightforward technique in which the input data are folded on a series of trial periods. At each trial period, the data are fitted with a sine curve. The resulting reduced-$\chi^{2}$ values are plotted as a function of trial frequency and the minima in the plot suggest the most likely periods. See Horne, Wade and Szkody (1986) for an example of the use of this method, which is ideally suited to the study of radial velocity data or any other sinusoidal variations. Note that windowed data cannot be processed by this option since no sine fit is possible.
CLEAN – The CLEAN algorithm was originally developed for use in aperture synthesis and was later applied to one-dimensional data by Roberts, Lehár and Dreher (1987). An adapted version of Lehár’s code is used here, and is particularly useful for unequally spaced data. The algorithm basically deconvolves the spectral window from the discrete Fourier power spectrum (or dirty spectrum). This produces a CLEAN spectrum, which is largely free of the many effects of spectral leakage. In order to prevent small errors from destabilizing the CLEAN procedure, the user is prompted for two parameters – the loop gain and the number of iterations. Briefly, with each iteration, some fraction (governed by the loop gain) of the window function is removed from the dirty spectrum. For convergence, the loop gain must lie between 0 and 2, typical values being between 0.1 and 1. Values at the bottom of this range require more iterations, but should provide more stability. Hence, the number of iterations should be large if the loop gain is small, typical values lying between 1 and 100. Note that an increase in the number of cleans produces a less noisy spectrum but, in general, the amplitude of the peaks is decreased, sometimes by a substantial amount. See Roberts, Lehár and Dreher (1987) for further details on choosing these parameters.
FT – This option performs a classical discrete Fourier transform on the data and sums the mean-square-amplitudes of the result to form a power spectrum (see, for example, Deeming 1975). This discrete Fourier transform is defined for arbitrary data spacing and is equal to the convolution of the true Fourier transform with a spectral window. Hence, the effects of data spacing, such as aliasing, are all contained in the spectral window, which can be generated using the WINDOW option (see above). This spectral window should be analysed in conjunction with the discrete Fourier transform generated here in order to estimate the effects of aliasing.
PDM – The phase dispersion minimization (PDM) technique is simply an automated version of the classical method of distinguishing between possible periods, in which the period producing the least observational scatter about the mean light curve (or, for example, radial velocity curve) is chosen. This technique (which is described in detail by Stellingwerf 1978) is well suited to cases in which only a few observations are available over a limited period of time, especially if the light curve is highly non-sinusoidal. The data is first folded on a series of trial frequencies. For each trial frequency, the full phase interval (0,1) is divided into a user-specified number of bins. The width of each bin is specified by the user, such that a point need not be picked (if a bin width narrower than the bin spacing is selected) or a point can belong to more than one bin (if a bin width wider than the bin spacing is selected). The variance of each of these bins (or samples) is then calculated. This gives a measure of the scatter around the mean light curve defined by the means of the data in each sample. The PDM statistic can then be calculated by dividing the overall variance of all the samples by the variance of the original (unbinned) dataset. This process is then repeated for the next trial frequency. Note that windowed data cannot be passed to this option since its variance is zero. If the trial period is not a true period, then the overall sample variance will be approximately equal to the variance of the original dataset (ie. the PDM statistic will be approximately equal to 1). If the trial period is a correct period, the PDM statistic will reach a local minimum compared with neighbouring periods, hopefully near zero.
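
The PDM statistic for a single trial frequency can be sketched in Python as below. This is an illustration under stated assumptions rather than PERIOD's implementation: the name `pdm_statistic`, the placement of bin centres, the treatment of bin edges and the skipping of bins with fewer than two points are all choices made here. The pooled variance follows the usual form in which each bin contributes its sum of squared deviations and its number of points minus one.

```python
import math


def pdm_statistic(times, values, frequency, nbins=5, width=None):
    """Ratio of the pooled within-bin variance to the overall variance."""
    if width is None:
        width = 1.0 / nbins          # bins just touch; no overlap
    n = len(values)
    mean = sum(values) / n
    total_var = sum((v - mean) ** 2 for v in values) / (n - 1)
    if total_var == 0.0:
        raise ValueError("zero variance (e.g. windowed data)")
    phases = [(t * frequency) % 1.0 for t in times]
    pooled_sum = 0.0
    pooled_dof = 0
    for j in range(nbins):
        centre = (j + 0.5) / nbins
        # Circular phase distance from the bin centre, in [-0.5, 0.5).
        sample = [v for p, v in zip(phases, values)
                  if -width / 2.0
                  <= ((p - centre + 0.5) % 1.0) - 0.5
                  < width / 2.0]
        if len(sample) < 2:
            continue                 # no variance can be formed
        m = sum(sample) / len(sample)
        pooled_sum += sum((v - m) ** 2 for v in sample)
        pooled_dof += len(sample) - 1
    if pooled_dof == 0:
        raise ValueError("no bin contains two or more points")
    return (pooled_sum / pooled_dof) / total_var
```

Evaluated over a grid of trial frequencies, the statistic should stay near 1 away from a true period and dip towards 0 at one.
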
SCARGLE – By redefining the classical periodogram (ie. the discrete Fourier periodogram) in such a manner as to make it invariant to a shift of the origin of time, Lomb (1976) and Scargle (1982) developed a novel type of periodogram analysis, quite powerful for finding, and testing the significance of, weak periodic signals in otherwise random, unevenly sampled data. Horne and Baliunas (1986) have elaborated on the method and Press and Rybicki (1989) present a fast implementation of the algorithm, a modified version of which is used here. This implementation uses FFTs to increase the speed of computation (although it is in no way equivalent to conventional FFT periodogram analysis). Note that windowed data cannot be passed to this option since it needs to calculate the variance (which is zero) to normalize the power of the periodogram.
STRING – The string-length method is an intuitively simple method, described in detail by Dworetsky (1983) and Friend et al. (1990). The data are folded on a series of trial periods and at each period the sum of the lengths of line segments joining successive points (the string-length) is calculated. Minima in a plot of string-length versus trial frequency indicate possible periods. The string-length method is especially useful in the limit of a very small number (about 20 or more) of randomly spaced observations of periodic phenomena. Note that windowed data cannot be passed to this option due to the $y$-data scaling process (see Dworetsky 1983).
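
A Python sketch of the string-length for one trial frequency follows. The name `string_length` is hypothetical. The scaling of the $y$ data to the range -0.25 to +0.25 and the closing segment that joins the last point back to the first (shifted by one phase unit) are the forms commonly quoted from Dworetsky (1983); treat both as assumptions about what PERIOD does.

```python
import math


def string_length(times, values, frequency):
    """Total length of the line segments joining phase-ordered points."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant data (such as windowed data) cannot be scaled.
        raise ValueError("cannot scale data with zero range")
    scaled = [(v - lo) / (2.0 * (hi - lo)) - 0.25 for v in values]
    points = sorted(((t * frequency) % 1.0, s)
                    for t, s in zip(times, scaled))
    # Close the loop: repeat the first point one phase unit later.
    closed = points + [(points[0][0] + 1.0, points[0][1])]
    return sum(math.hypot(p2 - p1, s2 - s1)
               for (p1, s1), (p2, s2) in zip(closed, closed[1:]))
```

The error raised for constant data mirrors the note above that windowed data cannot be scaled.
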
PEAKS – This option should be run once a periodogram has been obtained. It finds the highest peak in the periodogram (or lowest trough if it is a string-length, PDM or reduced-$\chi^{2}$ plot) between user-specified frequencies. The resulting period is calculated, along with an error. Errors on period detections are notoriously difficult to estimate. The estimate used in the previous version of PERIOD (v3.0) employed a formula derived by Kovacs (1981). The derivation assumed a single signal, Gaussian noise and even data spacing. This is clearly not the case with most astronomical datasets and the formula is hence of little use (see Horne and Baliunas 1986). Schwarzenberg-Czerny (1991) presents a detailed account of the accuracy of period determinations and advises a post-mortem analysis by measuring the widths and heights of peaks in a periodogram. Although virtually impossible to automate, it is possible to do this manually from within PERIOD using the fitting routines of QDP/PLT (see below). Therefore, for the sake of generality and to avoid uncertainties, version 4.0 of PERIOD now only outputs an error derived by calculating the half-size of a single frequency bin, centred on the peak (or trough) in a periodogram, and then converting to period units. This error gives an indication of the accuracy to which a peak can be located in a periodogram (due to the frequency sampling). Clearly, with a larger frequency search interval it is more difficult to locate a peak precisely and this is reflected in the error estimate. However, this error estimate does not take into account the fact that the peak (or trough) may not represent the true period (which can be shifted due to a number of effects) and it should therefore be regarded as a minimum error and not a formal error.
If the significance calculation is enabled (with the SIG command, see below), two false alarm probabilities are quoted alongside the period. The first (FAP1) is the probability that, given the frequency search parameters, there is no periodic component present in the data with this period. The second (FAP2) is the probability that the period is not actually equal to the quoted value but is equal to some other value. Note that FAP1 is only output if the whole frequency range is specified to be analysed in PEAKS (see below). One sigma errors on both significance values are also given. If the significance values are zero, these errors are displayed as –1, implying that the false alarm probabilities lie between 0.00 and 0.01 with 95% confidence. Clearly, the lower a significance value and its error, the more likely the quoted period is a correct one. If both the significances and errors are displayed as –1, this means that the input periodogram has not been subjected to a significance calculation (ie. the significance calculation has been disabled). Note that the results can be written to a log file if one is open. For more information on the SIG option, see below. For useful discussions on errors and significances of period determinations, see Schwarzenberg-Czerny (1991) and Nemec and Nemec (1985).
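
The period and minimum error reported by PEAKS can be sketched in Python. The name `peak_period_and_error` is hypothetical, the frequency grid is assumed to be evenly spaced, and the first-order conversion of the half-bin width from frequency to period units (dividing by the square of the frequency) is an assumption, since the text does not state how the conversion is done.

```python
def peak_period_and_error(frequencies, statistic, find_trough=False):
    """Locate the highest peak (or lowest trough) and return
    (period, minimum error on the period)."""
    choose = min if find_trough else max
    index = choose(range(len(statistic)), key=lambda i: statistic[i])
    peak_frequency = frequencies[index]
    if peak_frequency <= 0.0:
        raise ValueError("a peak at zero frequency has no finite period")
    # Half the size of one frequency bin (evenly spaced grid assumed).
    half_bin = 0.5 * (frequencies[1] - frequencies[0])
    period = 1.0 / peak_frequency
    # First-order conversion to period units: an assumption.
    period_error = half_bin / peak_frequency ** 2
    return period, period_error
```

Set `find_trough=True` for string-length, PDM or reduced-$\chi^{2}$ plots, where the best period is a minimum.
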
SIG – This option works as a switch, either turning on or turning off the significance calculation. The default on entering PERIOD is for the significance calculation to be disabled. This means that no significance values are calculated or attached to period determinations. By typing SIG, the significance calculation is enabled. You are first prompted for the number of permutations in the sample. To ensure reliable significance values, the minimum number of permutations is set to 100. You are then prompted for a seed for the random number generator – this number determines the starting point in a number series of infinite period. Therefore, entering the same seed on two calls to SIG will result in the same sequence of random numbers. If SIG is already enabled, one can disable the significance calculation by typing SIG again.
With the significance calculation enabled, every time a period-finding option is run (CHISQ, FT, SCARGLE, CLEAN, STRING, PDM) a Fisher randomization test is performed (see, for example, Nemec and Nemec 1985). This consists of calculating the periodogram as usual and loading the specified output slot. The $y$-axis data are then shuffled to form a new, randomized time-series. The periodogram of this dataset is then calculated (but not stored in the output slot, which will always contain the periodogram of the real time-series). This randomization and periodogram calculation loop is then performed for the number of permutations specified by the user. This can take a considerable amount of time, depending on the number of data points in the time-series, the frequency search parameters and the number of permutations.
Once the loop is complete, you should enter the PEAKS option to view the resulting significances. Two significance estimates are given in PEAKS. The first, denoted FAP1, represents the proportion of permutations (ie. shuffled time-series) that contained a trough lower than (in the case of the CHISQ, STRING and PDM options) or a peak higher than (in the case of the FT, SCARGLE and CLEAN options) that of the periodogram of the unrandomized dataset at any frequency. This therefore represents the probability that, given the frequency search parameters, no periodic component is present in the data with this period and it is only output in PEAKS if the whole frequency range is specified to be analysed. The second significance, denoted FAP2, represents the proportion of permutations that, at the frequency given by the period output by PEAKS, contained troughs lower than (or peaks higher than) the peak or trough in the periodogram of the real dataset. This therefore represents the probability that the period is not actually equal to the quoted value but is equal to some other value, and is quoted for any frequency range specified in PEAKS. Standard errors on both of these false alarm probabilities are also given (see Nemec and Nemec 1985).
It is perhaps worth mentioning here that significance estimates of period detections are notoriously unreliable. The methods used in the previous version of PERIOD (v3.0) suffered from a number of problems. For example, the F-test used with the PDM method (Stellingwerf 1978) has been proved to be incorrect (see, for example, Heck, Manfroid and Mersch 1985). Similarly, the theoretical minimum string-lengths quoted by Dworetsky (1983) are misleading, since they are based on evenly-spaced functions and it is possible to obtain values below this even for pure noise data with certain data spacings. The well-known SCARGLE false alarm probabilities are also incorrect, since the Horne and Baliunas (1986) equation for the number of independent frequencies has been shown to be incorrect (Christian Knigge (Oxford), private communication). Even if correct, the Horne and Baliunas formula would be incorrect to apply in a general way since it is a poor approximation to small datasets. The only reliable method of estimating significances from such non-parametric tests is by some sort of Monte Carlo or randomization method. As described above, one such method (Fisher randomization) has been implemented in this version of PERIOD (v4.0) following the prescription described by Nemec and Nemec (1985).
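
The randomization loop and the two false alarm probabilities described for SIG can be sketched in Python. The function name, its arguments and the use of Python's `random.Random` generator are assumptions; `periodogram` stands for any callable that maps a list of $y$ values to one statistic per trial frequency. Standard errors are omitted from this sketch.

```python
import random


def false_alarm_probabilities(values, periodogram, n_perm=100, seed=1,
                              lower_is_better=False):
    """Return (FAP1, FAP2) from a Fisher randomization test.

    FAP1: fraction of shuffles with a better extremum at any frequency.
    FAP2: fraction of shuffles with a better value at the frequency
          of the real extremum.
    """
    rng = random.Random(seed)      # same seed gives the same shuffles
    real = periodogram(values)
    pick = min if lower_is_better else max
    best_index = pick(range(len(real)), key=lambda i: real[i])
    best = real[best_index]

    def beats(candidate):
        return candidate < best if lower_is_better else candidate > best

    shuffled = list(values)
    hits_anywhere = 0
    hits_at_peak = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)      # times stay fixed; y values are permuted
        trial = periodogram(shuffled)
        if any(beats(p) for p in trial):
            hits_anywhere += 1
        if beats(trial[best_index]):
            hits_at_peak += 1
    return hits_anywhere / n_perm, hits_at_peak / n_perm
```

Use `lower_is_better=True` for statistics whose best period is a trough (CHISQ, STRING and PDM) and leave it `False` for those whose best period is a peak (FT, SCARGLE and CLEAN).
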
HELP – This command provides on-line help for PERIOD. Detailed information about individual commands can be obtained by typing HELP ’COMMAND’ (eg. HELP PEAKS).
QUIT (or EXIT) – This quits the PERIOD_PERIOD sub-menu and returns the user to the main PERIOD menu.

Returning to the main PERIOD menu:

`FIT`

Folds the data on a given period and zero point and then fits the data with a sine curve. The sine curve has the form: Y = GAMMA + (AMPLITUDE * SIN( ((2.0*PI)/PERIOD) * (X - ZEROPT) )). Outputs the fit parameters (which can be written to a log file) and the resulting sine curve.
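
For a known period, a sine curve of this form can be fitted by linear least squares, as in the Python sketch below. This is an illustration only: the name `fit_sine`, the unweighted fit (the $y$-axis errors are ignored) and the solution of the normal equations by Gauss-Jordan elimination are assumptions, and the zero point is returned only up to a whole number of periods.

```python
import math


def fit_sine(x_values, y_values, period):
    """Fit Y = GAMMA + AMPLITUDE * sin((2*pi/PERIOD) * (X - ZEROPT)).

    Returns (gamma, amplitude, zeropt) for the given, fixed period.
    """
    omega = 2.0 * math.pi / period
    # Linear model: y = gamma + s*sin(omega*x) + c*cos(omega*x).
    rows = [(1.0, math.sin(omega * x), math.cos(omega * x))
            for x in x_values]
    # Augmented normal equations (A^T A | A^T y).
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
         + [sum(r[i] * y for r, y in zip(rows, y_values))]
         for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("degenerate fit (too few distinct phases)")
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    gamma, s, c = (m[i][3] / m[i][i] for i in range(3))
    amplitude = math.hypot(s, c)
    # s*sin(wx) + c*cos(wx) = A*sin(wx + d) with d = atan2(c, s) = -w*zeropt.
    zeropt = -math.atan2(c, s) / omega
    return gamma, amplitude, zeropt
```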

`FOLD`

Folds the data on a given period and zero point. Hence, this option transforms the data onto a phase scale, where one phase unit is equal to one period and phase zero is defined by the zero point. If the zero point is not known, the data can be folded by taking the first data point as the zero point. This option is useful for checking whether derived periods actually give sensible results when applied to the data. In addition to normal folding, it is also possible to phase bin the data, which folds the data and then averages all the data points falling into each bin.
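
Folding and phase binning can be sketched in Python as follows. The names `fold` and `phase_bin` are hypothetical, and reporting each bin at its centre phase while omitting empty bins is an assumption.

```python
def fold(times, period, zero_point=None):
    """Convert times to phases in [0, 1); one phase unit is one period.

    If no zero point is given, the first data point is used.
    """
    if zero_point is None:
        zero_point = times[0]
    return [((t - zero_point) / period) % 1.0 for t in times]


def phase_bin(phases, values, nbins):
    """Average the values falling into each of `nbins` equal phase bins.

    Returns (bin centre, mean value) pairs; empty bins are omitted.
    """
    sums = [0.0] * nbins
    counts = [0] * nbins
    for p, v in zip(phases, values):
        # Guard against a phase that rounds up to exactly 1.0.
        j = min(int(p * nbins), nbins - 1)
        sums[j] += v
        counts[j] += 1
    return [((j + 0.5) / nbins, sums[j] / counts[j])
            for j in range(nbins) if counts[j]]
```

For example, times 10.0, 10.5, 11.0 and 12.25 folded on a period of 1.0 give phases 0.0, 0.5, 0.0 and 0.25.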

`SINE`

Adds, subtracts, multiplies or divides a sine curve from data. The sine curve has the form: Y = GAMMA + (AMPLITUDE * SIN( ((2.0*PI)/PERIOD) * (X - ZEROPT) )). This option is useful for removing or adding known periods from/to data, thus enabling or testing the detection of other periods.

`PLT`

This routine calls PGPLOT routines to display the graphs of the slots requested. The layout of the displays is fixed but output file types such as landscape postscript files can be created. This represents slightly less functionality than the original XANADU based QDP PLT routine, but no QDP PLT routine is currently available for LINUX.

In order to receive on-line help, simply type HELP at the PERIOD-PLT prompt. To exit PERIOD-PLT and return to the PERIOD menu, type EXIT.

`STATUS`

Returns information on the data slots or on the stored fits in the log file. This command is useful in order to check which slots contain which datasets and also as a means of obtaining some elementary statistics on the stored data. You can also use this option to check the fits from the SINE and PEAKS options stored in the log file without having to exit the package and read the log file.

`OUTPUT`

Writes any selected slot to an ASCII file on disk. This is the only way of saving data created by PERIOD (it does not write to FITS files), and should therefore be run before QUITing in order to store, say, a power spectrum.

`HELP`

This command provides on-line help for PERIOD. Detailed information about individual commands can be obtained by typing HELP ’COMMAND’ (eg. HELP PERIOD).

`QUIT` (or `EXIT`)

This option quits a PERIOD session. However, it does provide a last chance to stay in the package. This is essential to prevent accidental exit, since any data files created using PERIOD will be lost on exit from the package unless one OUTPUTs the data first.

¹Throughout the PERIOD package and this document, the terms power spectrum and periodogram are used interchangeably, although strictly speaking the power spectrum is a theoretical quantity defined as an integral over continuous time, of which the periodogram is merely an estimate based on a finite amount of discrete data (Scargle 1982).
