The CTI_TEST function constructs a “contingency table” from an array of observed frequencies and tests the hypothesis that the rows and columns are independent using an extension of the chi-square goodness-of-fit test.
This routine is written in the IDL language. Its source code can be found in the file cti_test.pro in the lib subdirectory of the IDL distribution.
Examples
Define a 5-column and 4-row array of observed frequencies.
obfreq = [[748, 821, 786, 720, 672], $
[ 74, 60, 51, 66, 50], $
[ 31, 25, 22, 16, 15], $
[ 9, 10, 6, 5, 7]]
Test the hypothesis that the rows and columns of “obfreq” contain independent data at the 0.05 significance level.
result = CTI_TEST(obfreq, COEFF = coeff)
PRINT, result
The result should be the two-element vector [14.3953, 0.276181].
The computed probability of 0.276181 is greater than 0.05, so there is no reason to reject the hypothesis of independence at the 0.05 significance level. The coefficient of contingency returned in the variable “coeff” (coeff = 0.0584860) also indicates a lack of dependence between the rows and columns of the observed frequencies. With the CORRECTED keyword set, the function returns the two-element vector [12.0032, 0.445420] and coeff = 0.0534213, which leads to the same conclusion of independence.
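The remaining output keywords can be requested in the same call. The following is a minimal sketch that uses only the keywords listed under Syntax; the variable names (cramv, df, exfreq, residual) are arbitrary choices for illustration.

```idl
; Same observed frequencies as in the example above.
obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

; Request every optional output through a named variable.
result = CTI_TEST(obfreq, COEFF = coeff, CRAMV = cramv, DF = df, $
                  EXFREQ = exfreq, RESIDUAL = residual)

PRINT, 'Chi-square statistic: ', result[0]
PRINT, 'Probability:          ', result[1]
PRINT, 'Degrees of freedom:   ', df   ; (5 - 1) * (4 - 1) = 12 by the DF formula
HELP, exfreq, residual                 ; both should be 5-column by 4-row arrays
```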
Syntax
Result = CTI_TEST( Obfreq [, COEFF=variable] [, /CORRECTED] [, CRAMV=variable] [, DF=variable] [, EXFREQ=variable] [, RESIDUAL=variable] )
Return Value
Returns a two-element vector containing the chi-square test statistic χ² and the one-tailed probability of obtaining a value of χ² or greater.
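The second element of the result can be cross-checked against the chi-square distribution and compared with a chosen significance level. The sketch below assumes that the probability is the upper tail of a chi-square distribution with DF degrees of freedom, computed here with IDL's CHISQR_PDF function. Whether cti_test.pro computes it in exactly this way is an assumption, and the variable names are illustrative.

```idl
chi2  = 14.3953          ; test statistic from the Examples section
df    = 12               ; (5 - 1) * (4 - 1) for the 5-column, 4-row table
alpha = 0.05             ; significance level chosen by the user

; CHISQR_PDF returns P(X <= chi2), so the upper tail is its complement.
p = 1.0 - CHISQR_PDF(chi2, df)
PRINT, p

; Reject independence only when the probability falls below alpha.
IF p LT alpha THEN PRINT, 'Reject independence' $
ELSE PRINT, 'No reason to reject independence'
```

If the assumption holds, the printed probability should be close to the 0.276181 reported in the Examples section.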
Arguments
Obfreq
An m x n array containing observed frequencies. Obfreq can contain integer, single-precision floating-point, or double-precision floating-point values.
Keywords
COEFF
Set this keyword to a named variable that will contain the coefficient of contingency. The coefficient of contingency is a non-negative scalar, in the interval [0.0, 1.0], which measures the degree of dependence within a contingency table. The larger the value of COEFF, the greater the degree of dependence.
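The values in the Examples section are consistent with the usual definition of the coefficient of contingency, SQRT(χ² / (χ² + N)), where N is the overall total of observed frequencies. The sketch below applies that definition to the example numbers; treating it as the formula used inside cti_test.pro is an assumption.

```idl
chi2 = 14.3953     ; uncorrected test statistic from the Examples section
n    = 4194.0      ; sum of all 20 cells of the example obfreq array

; Coefficient of contingency from the textbook definition.
c = SQRT(chi2 / (chi2 + n))
PRINT, c           ; compare with the documented coeff = 0.0584860
```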
CORRECTED
Set this keyword to use Yates’ correction for continuity when computing the chi-square test statistic, χ². Yates’ correction always decreases the magnitude of χ². In general, this keyword should be set for small sample sizes.
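For reference, the textbook form of the continuity correction subtracts 0.5 from the absolute difference in each cell before squaring. The sketch below computes both statistics from that definition for the array used in the Examples section. Applied by hand, this form gives approximately 14.40 and 12.00, in line with the documented values, but the claim that cti_test.pro is implemented this way remains an assumption.

```idl
obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

; Expected frequencies: (column total * row total) / overall total.
expected = (TOTAL(obfreq, 2) # TOTAL(obfreq, 1)) / TOTAL(obfreq)

; Uncorrected statistic and the Yates-corrected variant.
chi2      = TOTAL((obfreq - expected)^2 / expected)
chi2_corr = TOTAL((ABS(obfreq - expected) - 0.5)^2 / expected)
PRINT, chi2, chi2_corr

; Compare with the value returned by the routine itself.
result = CTI_TEST(obfreq, /CORRECTED)
PRINT, result[0]
```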
CRAMV
Set this keyword to a named variable that will contain Cramer’s V. Cramer’s V is a non-negative scalar, in the interval [0.0, 1.0], which measures the degree of dependence within a contingency table.
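Cramer’s V is commonly defined as SQRT(χ² / (N * k)), where N is the overall total of observed frequencies and k is the smaller of (columns - 1) and (rows - 1). The sketch below evaluates that definition for the numbers in the Examples section. No reference value for CRAMV is given on this page, so agreement with the routine's output is an assumption to verify.

```idl
chi2 = 14.3953           ; uncorrected test statistic from the Examples section
n    = 4194.0            ; sum of all cells of the example obfreq array
dims = [5, 4]            ; columns and rows of the example table

; k is the smaller of (columns - 1) and (rows - 1); "<" is IDL's minimum operator.
k = (dims[0] - 1) < (dims[1] - 1)

v = SQRT(chi2 / (n * k))
PRINT, v                 ; compare with the value returned through CRAMV
```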
DF
Set this keyword to a named variable that will contain the number of degrees of freedom used to compute the probability of obtaining the value of the chi-squared test statistic or greater. DF = (n - 1) * (m - 1) where m and n are the number of columns and rows of the contingency table, respectively.
EXFREQ
Set this keyword to a named variable that will contain an array of m columns and n rows containing expected frequencies. The elements of this array are often referred to as the “cells” of the expected frequencies. The expected frequency of each cell is computed as the product of its row and column marginal frequencies divided by the overall total of observed frequencies.
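The expected-frequency rule described above can be sketched directly with array operations. The variable names are illustrative; the final line compares the hand-computed array with the one returned through EXFREQ.

```idl
obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

coltot = TOTAL(obfreq, 2)   ; 5 column totals (summed over rows)
rowtot = TOTAL(obfreq, 1)   ; 4 row totals (summed over columns)
n      = TOTAL(obfreq)      ; overall total of observed frequencies

; Outer product: expected[i, j] = coltot[i] * rowtot[j] / n
expected = (coltot # rowtot) / n

; Largest absolute difference from the routine's own output.
result = CTI_TEST(obfreq, EXFREQ = exfreq)
PRINT, MAX(ABS(exfreq - expected))
```

A difference near zero (within floating-point rounding) indicates that the two arrays agree.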
RESIDUAL
Set this keyword to a named variable that will contain an array of m columns and n rows containing signed differences between corresponding cells of observed frequencies and expected frequencies.
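A short check of the residual array is sketched below. It assumes the sign convention is observed minus expected, which this page does not state explicitly; the first PRINT tests that assumption.

```idl
obfreq = [[748, 821, 786, 720, 672], $
          [ 74,  60,  51,  66,  50], $
          [ 31,  25,  22,  16,  15], $
          [  9,  10,   6,   5,   7]]

result = CTI_TEST(obfreq, EXFREQ = exfreq, RESIDUAL = residual)

; Near zero if residual = observed - expected (assumed convention).
PRINT, MAX(ABS(residual - (obfreq - exfreq)))

; Observed and expected frequencies share the same marginal totals,
; so the residuals in every row and every column should sum to about zero.
PRINT, TOTAL(residual, 1)   ; row sums
PRINT, TOTAL(residual, 2)   ; column sums
```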
Version History
See Also
CORRELATE, M_CORRELATE, XSQ_TEST