Studentization and deriving accurate p-values
Rousseau, Judith; Fraser, Donald (2008), Studentization and deriving accurate p-values, Biometrika, 95, 1, p. 1-16. http://dx.doi.org/10.1093/biomet/asm093
Type: Article accepted for publication or published
Oxford University Press
Abstract (EN): The original Studentization was the conversion of a sample mean departure into the familiar t-statistic, plus the derivation of the corresponding Student distribution function; the observed value of the distribution function is the observed p-value, as presented in an elemental form. We examine this process in a broadly general context: a null statistical model is available together with observed data; a statistic t(y) has been proposed as a plausible measure of the location of the data relative to what is expected under the null; a modified statistic, say t̃(y), is developed that is ancillary; the corresponding distribution function is determined, exactly or approximately; and the observed value of the distribution function is the p-value or percentage position of the data with respect to the model. Such p-values have had extensive coverage in the recent Bayesian literature, with many variations and some preference for two versions labelled p_ppost and p_cpred. The bootstrap method also directly addresses this Studentization process. We use recent likelihood theory that gives a third-order factorization of a regular statistical model into a marginal density for a full-dimensional ancillary and a conditional density for the maximum likelihood variable. The full-dimensional ancillary is shown to lead to an explicit determination of the Studentized version t̃(y) together with a highly accurate approximation to its distribution function; the observed value of the distribution function is the p-value and is available numerically by direct calculation or by Markov chain Monte Carlo or by other simulations. In this paper, for any given initial or trial test statistic proposed as a location indicator for a data point, we develop: an ancillary-based p-value designated p_anc; a special version of the Bayesian p_cpred; and a bootstrap-based p-value designated p_bs.
We then show under moderate regularity that these are equivalent to third order and have uniqueness as a determination of the statistical location of the data point, as of course derived from the initial location measure. We also show that these p-values have a uniform distribution to third order, as based on calculations in the moderate-deviations region. For implementation the Bayesian and likelihood procedures would perhaps require the same numerical computations, while the bootstrap would require an order of magnitude more computation and would perhaps not be accessible. Examples are given to indicate the ease and flexibility of the approach.
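As a rough illustration of the idea that a p-value is the observed value of a statistic's null distribution function, the following Python sketch computes a simple parametric bootstrap p-value for the classical case the abstract starts from: a normal mean with unknown variance. It is not the third-order method of the paper. The function names, the simulation size, and the data are all hypothetical choices made for this example, and the left-tail convention (the distribution function evaluated at the observed statistic) is assumed from the abstract's wording.

```python
import math
import random
import statistics


def t_statistic(sample, mu0):
    """Classical Studentized departure of the sample mean from mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return (mean - mu0) / (sd / math.sqrt(n))


def bootstrap_p_value(sample, mu0, n_sim=2000, seed=0):
    """Estimate the null distribution function of t at its observed value.

    Assumption: the null model is N(mu0, sigma^2) with sigma unknown, and
    sigma is replaced by its sample estimate when simulating.
    """
    rng = random.Random(seed)
    n = len(sample)
    sigma_hat = statistics.stdev(sample)  # fitted nuisance parameter
    t_obs = t_statistic(sample, mu0)
    count = 0
    for _ in range(n_sim):
        simulated = [rng.gauss(mu0, sigma_hat) for _ in range(n)]
        if t_statistic(simulated, mu0) <= t_obs:
            count += 1
    # Proportion of simulated statistics at or below the observed one.
    return count / n_sim


data = [5.1, 4.9, 5.6, 5.8, 5.3, 5.7, 5.2, 5.5]  # made-up illustrative data
p = bootstrap_p_value(data, mu0=5.0)
print(f"t = {t_statistic(data, 5.0):.3f}, p = {p:.3f}")
```

In this normal example the t-statistic is already free of the unknown scale, so the fitted value of sigma does not affect its simulated distribution; in less regular models that is not the case, which is the situation the Studentized statistic in the abstract is meant to address.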
Subjects / Keywords: Ancillary; Bayesian; Bootstrap; Conditioning; Departure measure; Likelihood; p-value; Studentization.