| Class | Description |
|---|---|
| FBar | |
| FDist | WARNING: Most methods in this class are deprecated. |
| GofFormat | This class contains methods used to format results of GOF test statistics, or to apply a series of tests simultaneously and format the results. |
| GofStat | This class provides methods to compute several types of EDF goodness-of-fit test statistics and to apply certain transformations to a set of observations. |
| GofStat.OutcomeCategoriesChi2 | This class helps manage the partitions of possible outcomes into categories for applying chi-square tests. |
| KernelDensity | This class provides methods to compute a kernel density estimator from a set of n individual observations x0, …, xn-1, and returns its value at m selected points. |

We are concerned here with goodness-of-fit (GOF) test statistics for testing the hypothesis H0 that a sample of N observations X1,..., XN comes from a given univariate probability distribution F. We consider tests such as those of Kolmogorov-Smirnov, Anderson-Darling, Cramér-von Mises, etc. These test statistics generally measure, in different ways, the distance between a continuous distribution function F and the empirical distribution function (EDF) $\hat{F}_N$ of X1,..., XN. They are also called EDF test statistics. The observations Xi are usually transformed into Ui = F(Xi), which satisfy 0≤Ui≤1 and which follow the U(0, 1) distribution under H0. (This is called the probability integral transformation.) Methods for applying this transformation, as well as other types of transformations, to the observations Xi or Ui are provided in umontreal.iro.lecuyer.gof.GofStat.
Then the GOF tests are applied to the Ui sorted in increasing order. The corresponding p-values are easily computed by calling the appropriate static methods in umontreal.iro.lecuyer.gof.FDist. If a GOF test statistic Y has a continuous distribution under H0 and takes the value y, its (right) p-value is defined as p = P[Y≥y | H0]. The test usually rejects H0 if p is deemed too close to 0 (for a one-sided test) or too close to 0 or 1 (for a two-sided test).
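The transformation and sorting steps described above can be sketched in plain Java as follows. This is only an illustration under stated assumptions: it does not call GofStat or FDist, the class and method names (`EdfSketch`, `toSortedUniforms`, `kolmogorovSmirnov`) are hypothetical, and the exponential distribution with rate `lambda` is just an example choice for F. The p-value computation is left out.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the gof package.
class EdfSketch {

    // Example hypothesized distribution: F(x) = 1 - exp(-lambda * x) for x > 0.
    static double exponentialCdf(double x, double lambda) {
        return x <= 0.0 ? 0.0 : 1.0 - Math.exp(-lambda * x);
    }

    // Probability integral transformation U_i = F(X_i), followed by sorting.
    static double[] toSortedUniforms(double[] x, double lambda) {
        double[] u = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            u[i] = exponentialCdf(x[i], lambda);
        }
        Arrays.sort(u);
        return u;
    }

    // Kolmogorov-Smirnov statistic: largest distance between the EDF of the
    // sorted U_i and the U(0,1) distribution function.
    static double kolmogorovSmirnov(double[] sortedU) {
        int n = sortedU.length;
        double dPlus = 0.0;   // EDF above the uniform cdf
        double dMinus = 0.0;  // EDF below the uniform cdf
        for (int i = 0; i < n; i++) {
            dPlus = Math.max(dPlus, (i + 1.0) / n - sortedU[i]);
            dMinus = Math.max(dMinus, sortedU[i] - (double) i / n);
        }
        return Math.max(dPlus, dMinus);
    }
}
```

A call such as `EdfSketch.kolmogorovSmirnov(EdfSketch.toSortedUniforms(data, 1.0))` would give the statistic for a sample `data` tested against the exponential distribution with rate 1.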
In the case where Y has a discrete distribution under H0, we distinguish the right p-value pR = P[Y≥y | H0] and the left p-value pL = P[Y≤y | H0]. We then define the p-value for a two-sided test as
$$
p = \begin{cases}
p_R & \text{if } p_R < p_L, \\
1 - p_L & \text{if } p_R \ge p_L \text{ and } p_L < 0.5, \\
0.5 & \text{otherwise.}
\end{cases}
$$
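The three-case rule above translates directly into code. The following is a minimal sketch; the class and method names are hypothetical and are not taken from the gof package.

```java
// Hypothetical helper class, not part of the gof package.
class DiscretePValueSketch {

    // Two-sided p-value for a discrete statistic, given the right p-value
    // pRight = P[Y >= y | H0] and the left p-value pLeft = P[Y <= y | H0].
    static double twoSidedPValue(double pRight, double pLeft) {
        if (pRight < pLeft) {
            return pRight;
        }
        if (pLeft < 0.5) {
            return 1.0 - pLeft;
        }
        return 0.5;
    }
}
```

With this convention, a value of p close to 0 signals an unusually large y and a value close to 1 signals an unusually small y.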
A very common type of test in the discrete case is the chi-square test, which applies when the possible outcomes are partitioned into a finite number of categories. Suppose there are k categories and that each observation belongs to category i with probability pi, for 0≤i < k. If there are n independent observations, the expected number of observations in category i is ei = npi, and the chi-square test statistic is defined as
$$
X^2 = \sum_{i=0}^{k-1} \frac{(o_i - e_i)^2}{e_i},
$$
where oi is the actual number of observations in category i.
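As a sketch, the Java method below computes the usual Pearson chi-square statistic: the sum over the k categories of (oi - ei)^2 / ei, with oi the observed count in category i and ei = npi. The class and method names are hypothetical, and the code does not use GofStat.OutcomeCategoriesChi2; it assumes every pi is strictly positive.

```java
// Hypothetical helper class, not part of the gof package.
class ChiSquareSketch {

    // observed[i] = number of observations that fell in category i,
    // prob[i]     = probability of category i under H0 (assumed > 0).
    static double chi2(int[] observed, double[] prob) {
        if (observed.length != prob.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        long n = 0;  // total number of observations
        for (int o : observed) {
            n += o;
        }
        double x2 = 0.0;
        for (int i = 0; i < observed.length; i++) {
            double expected = n * prob[i];         // e_i = n * p_i
            double diff = observed[i] - expected;  // o_i - e_i
            x2 += diff * diff / expected;
        }
        return x2;
    }
}
```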
The class umontreal.iro.lecuyer.gof.GofFormat contains methods used to format results of GOF test statistics, or to apply several such tests simultaneously to a given data set and format the results to produce a report that also contains the p-values of all these tests. A C version of this class is used extensively in the package TestU01, which applies statistical tests to random number generators [iLEC01t]. The class also provides tools to plot an empirical or theoretical distribution function, by creating a data file that contains a graphic plot in a format compatible with a given software package.
To submit a bug or ask questions, send an e-mail to Pierre L'Ecuyer.