
Stochastic Complexity, Histograms and Hypothesis Testing of Homogeneity

Abstract:
Information contained in a sample of quantitative data may be summarized by a nonparametric histogram density function. An interesting question is how to construct such a histogram density so that it expresses the data information with minimum stochastic complexity. Stochastic complexity is another name for Rissanen’s minimum description length (MDL), which gives the length of a sequence of decodable binary code that results from optimally encoding the data information using a code-book based on a probability distribution. Here we derive an optimal generalized histogram density estimator that provides both predictive and non-predictive coding descriptions of a data sample. We also obtain uniform and almost sure asymptotic approximations for the lengths of both descriptions. As an application of this result to statistical inference, a new procedure for testing the hypothesis of distribution homogeneity is proposed and is proved to have asymptotic power 1.
51-80
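
To make the idea of a predictive coding description in the abstract concrete, the following is a minimal sketch and not the generalized estimator derived in the paper. It assumes an equal-width histogram with m bins on a known interval [lo, hi] and an add-one (Laplace) rule for the predictive bin probabilities; the function names `predictive_code_length` and `best_bin_count` are hypothetical. The returned length is in bits and is defined only up to a constant that depends on the precision to which the data are recorded, so it can be negative.

```python
import math


def predictive_code_length(data, m, lo, hi):
    """Bits needed to encode `data` point by point with an m-bin histogram.

    Assumption (not from the paper): the point arriving after t earlier
    points is coded with the density (count_in_its_bin + 1) / (t + m) / width,
    i.e. an add-one rule over m equal-width bins on [lo, hi].
    """
    width = (hi - lo) / m
    counts = [0] * m
    bits = 0.0
    for t, x in enumerate(data):
        # Clamp so that x == hi falls into the last bin.
        j = min(int((x - lo) / width), m - 1)
        density = (counts[j] + 1) / (t + m) / width
        bits -= math.log2(density)
        counts[j] += 1
    return bits


def best_bin_count(data, lo, hi, max_bins):
    """Return the bin count in 1..max_bins with the shortest predictive code."""
    return min(
        range(1, max_bins + 1),
        key=lambda m: predictive_code_length(data, m, lo, hi),
    )


if __name__ == "__main__":
    # Toy sample concentrated in the left quarter of [0, 1].
    sample = [0.01 * i for i in range(25)] * 4
    for m in (1, 2, 4, 8):
        print(m, round(predictive_code_length(sample, m, 0.0, 1.0), 2))
    print("selected bins:", best_bin_count(sample, 0.0, 1.0, 8))
```

Choosing the bin count that minimizes this length is one simple way to read "minimum stochastic complexity" for histograms. The paper's non-predictive description and its asymptotic approximations are not reproduced here.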

