The CORRELATE function computes the linear Pearson correlation coefficient of two vectors or the correlation matrix of an m x n array. Alternatively, with the COVARIANCE keyword set, this function computes the unbiased sample covariance of two vectors or the covariance matrix of an m x n array.
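
For reference, the two quantities named above have the following standard textbook definitions for vectors x and y of length n with sample means \bar{x} and \bar{y}. This is a sketch of the definitions only: the symbols are introduced here for illustration, and the exact sequence of operations inside correlate.pro may differ.

```latex
\operatorname{cov}(x, y) = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Equivalently, r is the sample covariance divided by the product of the two sample standard deviations.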

This routine is written in the IDL language. Its source code can be found in the file correlate.pro in the lib subdirectory of the IDL distribution.

Tip: If you are computing covariance, you may want to use the RUNNING_COVARIANCE function instead, which avoids overflow for large values, is significantly faster, uses less memory, and also allows you to combine calculations for data sets that do not fit into memory.

Examples


Define the data vectors.

X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]

Compute the linear Pearson correlation coefficient of X and Y.

PRINT, CORRELATE(X, Y)

IDL prints:

0.702652

Compute the covariance of X and Y.

PRINT, CORRELATE(X, Y, /COVARIANCE)

IDL prints:

3.66667
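
As a cross-check, both results can be reproduced from the standard definitions using basic IDL routines. This is a minimal sketch, not the code in correlate.pro; the variable names N, DX, DY, COV, and R are introduced here for illustration.

```idl
; Same data vectors as in the example above.
X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]

N  = N_ELEMENTS(X)
DX = X - MEAN(X)                 ; deviations from the sample mean of X
DY = Y - MEAN(Y)                 ; deviations from the sample mean of Y

; Unbiased sample covariance: sum of products divided by N-1.
COV = TOTAL(DX * DY) / (N - 1)
PRINT, COV

; Pearson coefficient: covariance scaled by both sample standard deviations.
R = COV / (STDDEV(X) * STDDEV(Y))
PRINT, R
```

The two printed values should agree with the CORRELATE results shown above (3.66667 and 0.702652) up to single-precision rounding.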

Define an array with x and y as its columns.

A = TRANSPOSE([[X],[Y]])

Compute the correlation matrix.

PRINT, CORRELATE(A)

IDL prints:

  1.00000    0.702652
  0.702652   1.00000
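
The same array can be passed with the COVARIANCE keyword to obtain the covariance matrix mentioned in the description. The following is a short sketch; the variable name C is introduced here for illustration, and SIZE is used only to confirm the layout of A.

```idl
; Same data vectors as in the example above.
X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]

A = TRANSPOSE([[X],[Y]])
PRINT, SIZE(A, /DIMENSIONS)      ; A has 2 columns and 12 rows

C = CORRELATE(A, /COVARIANCE)    ; 2 x 2 covariance matrix
PRINT, C
```

The off-diagonal elements should match the covariance of X and Y printed earlier, and the diagonal elements should hold the sample variances of X and Y.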

Syntax


Result = CORRELATE( X [, Y] [, /COVARIANCE] [, /DOUBLE] )

Return Value


If two vectors are specified, the result is a single correlation coefficient (or a single covariance, if COVARIANCE is set); if the vectors have unequal lengths, the longer vector is truncated to the length of the shorter vector. If an m x n array is specified, the result is an m x m array of linear Pearson correlation coefficients (or covariances, if COVARIANCE is set), in which element i,j is the correlation of the ith and jth columns of the input array.
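
The truncation rule can be illustrated with a short sketch. The vector YSHORT is a hypothetical shortened copy of Y introduced here for illustration; the comparison assumes the truncation behaves as described above.

```idl
; Data vectors from the Examples section.
X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]

YSHORT = Y[0:9]                      ; hypothetical 10-element vector

PRINT, CORRELATE(X, YSHORT)          ; X is longer, so it is truncated
PRINT, CORRELATE(X[0:9], YSHORT)     ; explicit truncation for comparison
```

If the longer vector is truncated as described, both statements should print the same value.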

Arguments


X

A vector or an m x n array. X can be integer, single-, or double-precision floating-point.

Y

An integer, single-, or double-precision floating-point vector. If X is an m x n array, Y should not be supplied.

Keywords


COVARIANCE

Set this keyword to compute the sample covariance rather than the correlation coefficient.

DOUBLE

Set this keyword to force the computation to be done in double-precision arithmetic.
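
A brief sketch of the keyword in use, with the data from the Examples section. SIZE with /TNAME is used here only to inspect the type of the result; with the keyword set, the result is expected to be of type DOUBLE, though this is an assumption rather than something stated above.

```idl
; Data vectors from the Examples section.
X = [65,63,67,64,68,62,70,66,68,67,69,71]
Y = [68,66,68,65,69,66,68,65,71,67,68,70]

PRINT, CORRELATE(X, Y)                          ; default precision
PRINT, CORRELATE(X, Y, /DOUBLE)                 ; double-precision arithmetic
PRINT, SIZE(CORRELATE(X, Y, /DOUBLE), /TNAME)   ; type name of the result
```

Forcing double precision is mainly useful when the inputs are large in magnitude or nearly collinear, where single-precision rounding can affect the trailing digits.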

Version History


Pre 4.0

Introduced

Resources and References


J. Neter, W. Wasserman, G.A. Whitmore, Applied Statistics (Third Edition), Allyn and Bacon (ISBN 0-205-10328-6).

See Also


A_CORRELATE, C_CORRELATE, M_CORRELATE, P_CORRELATE, R_CORRELATE, RUNNING_COVARIANCE