Multivariate Homogeneity Testing Using an Extended Concept of Nearest Neighbors

Journal Title, Volume, Page: 
Biometrical Journal Volume 38, Issue 5, pages 605–612, 1996
Year of Publication: 
1996
Authors: 
Ali S. Barakat
Department of Mathematics, Faculty of Science, An-Najah National University, Nablus, Palestine
Current Affiliation: 
Department of Mathematics, Faculty of Science, An-Najah National University, Nablus, Palestine
Dana Quade
University of North Carolina at Chapel Hill, U.S.A.
Ibrahim A. Salama
University of North Carolina at Chapel Hill, U.S.A.
Preferred Abstract (Original): 

Given independent multivariate random samples {X_ij: j = 1, …, n_i} from F_i, for i = 1, 2, a test is desired for H_0: F_1 = F_2 against general alternatives. Consider the k · (n_1 + n_2) possible ways of choosing one observation from the combined samples and then one of its k nearest neighbors, and let S_k be the proportion of these choices in which the point and neighbor are in the same sample. Schilling (1986) proposed S_k as a test statistic, but did not indicate how to determine k. We suggest as test statistic W = N Σ k S_k, which we show is equivalent to a sum of N Wilcoxon rank sums, and also to a sum of two two-sample U-statistics of degrees (1, 2) and (2, 1). Simulation with multivariate normal data suggests that our test is generally more powerful than Schilling's test using k = 1, 2, or 3. We illustrate its use with Fisher's iris data.
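
As an illustration of the nearest-neighbor proportion described in the abstract, the following is a minimal sketch of how Schilling's statistic for a single k might be computed. It is not the authors' code. The function name `schilling_sk`, the use of Euclidean distance, and the tie-breaking rule (lowest pooled index first) are all assumptions made for this example; the abstract does not specify a metric or a treatment of ties. The combined statistic W is not computed here because the abstract as reproduced does not state the range of the sum.

```python
import numpy as np


def schilling_sk(x1, x2, k):
    """Proportion of (point, neighbor) choices in which the point and one of
    its k nearest neighbors in the pooled sample belong to the same sample.

    Assumptions (not from the abstract): Euclidean distance, and ties broken
    by the order of the points in the pooled array.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    pooled = np.vstack([x1, x2])
    labels = np.concatenate([
        np.zeros(len(x1), dtype=int),  # sample 1
        np.ones(len(x2), dtype=int),   # sample 2
    ])

    # Pairwise squared Euclidean distances between all pooled points.
    diff = pooled[:, None, :] - pooled[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor

    # Indices of the k nearest neighbors of each pooled point.
    neighbors = np.argsort(dist, axis=1, kind="stable")[:, :k]

    # Fraction of the k * (n1 + n2) choices that stay within one sample.
    same_sample = labels[neighbors] == labels[:, None]
    return float(same_sample.mean())


# Two well-separated clusters: every neighbor comes from the same sample.
a = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
b = [[10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
print(schilling_sk(a, b, k=2))
```

Values of S_k near 1 indicate that the two samples occupy different regions, while values near the level expected under random labeling are consistent with H_0. In practice a reference distribution, for example from permuting the sample labels, would be needed to turn the statistic into a test.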