Statistical Analysis and Data Mining: The ASA Data Science Journal

Two‐sample test based on classification probability

Early View

Abstract Robust classification algorithms have been developed in recent years with great success. We take advantage of this development and recast the classical two‐sample test problem in the framework of classification. Based on the estimates of classification probabilities from a classifier trained from the samples, a test statistic is proposed. We explain why such a test can be a powerful test and compare its performance in terms of the power and efficiency with those of some other recently proposed tests with simulation and real‐life data. The test proposed is nonparametric and can be applied to complex and high‐dimensional data wherever there is a classifier that provides consistent estimate of the classification probability for such data.

Related Topics

Related Publications

Related Content

Site Footer


This website is provided by John Wiley & Sons Limited, The Atrium, Southern Gate, Chichester, West Sussex PO19 8SQ (Company No: 00641132, VAT No: 376766987)

Published features on are checked for statistical accuracy by a panel from the European Network for Business and Industrial Statistics (ENBIS)   to whom Wiley and express their gratitude. This panel are: Ron Kenett, David Steinberg, Shirley Coleman, Irena Ograjenšek, Fabrizio Ruggeri, Rainer Göb, Philippe Castagliola, Xavier Tort-Martorell, Bart De Ketelaere, Antonio Pievatolo, Martina Vandebroek, Lance Mitchell, Gilbert Saporta, Helmut Waldl and Stelios Psarakis.