REGRESS

The REGRESS function performs a multiple linear regression fit and returns an Nterms-element column vector of coefficients, where Nterms is the number of independent variables.

REGRESS fits the function:

y_i = const + a_0 x_{0,i} + a_1 x_{1,i} + ... + a_{Nterms-1} x_{Nterms-1,i}

This routine is written in the IDL language. Its source code can be found in the file regress.pro in the lib subdirectory of the IDL distribution.

Examples

; Create two vectors of independent variable data:
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
; Combine into a 2x6 array
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
; Create a vector of dependent variable data:
Y = 5 + 3*X1 - 4*X2
; Assume Gaussian measurement errors for each point:
measure_errors = REPLICATE(0.5, N_ELEMENTS(Y))
; Compute the fit, and print the results:
result = REGRESS(X, Y, SIGMA=sigma, CONST=const, $
   MEASURE_ERRORS=measure_errors)
PRINT, 'Constant: ', const
PRINT, 'Coefficients: ', result[*]
PRINT, 'Standard errors: ', sigma

IDL prints:

Constant:    4.99999
Coefficients:    3.00000    -3.99999
Standard errors:    0.0444831    0.282038

Syntax

Result = REGRESS( X, Y [, CHISQ=variable] [, CONST=variable] [, CORRELATION=variable] [, /DOUBLE] [, FTEST=variable] [, MCORRELATION=variable] [, MEASURE_ERRORS=vector] [, SIGMA=variable] [, STATUS=variable] [, YFIT=variable] )

Return Value

REGRESS returns a 1 x Nterms array of coefficients. If the DOUBLE keyword is set, or if X or Y is double precision, the result is double precision; otherwise the result is single precision.
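
As a rough check of the shape and type described above, the following sketch inspects the result with SIZE. The data values and variable names (r_single, r_double) are made up for illustration; no particular output is asserted here.

```idl
; Illustrative data: two independent variables, six samples
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
; Made-up dependent values (not an exact linear function of X1 and X2)
Y = [8.1, 6.8, 9.15, 16.95, 37.1, 80.9]
; Single-precision input, default behavior:
r_single = REGRESS(X, Y)
; Same input, with computations forced to double precision:
r_double = REGRESS(X, Y, /DOUBLE)
; Report the dimensions and type name of each result:
PRINT, SIZE(r_single, /DIMENSIONS), '  ', SIZE(r_single, /TNAME)
PRINT, SIZE(r_double, /DIMENSIONS), '  ', SIZE(r_double, /TNAME)
```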

Arguments

X

An Nterms by Npoints array of independent variable data, where Nterms is the number of coefficients (independent variables) and Npoints is the number of samples.

Y

An Npoints-element vector of dependent variable points.
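
A common stumbling block is data stored with one variable per row rather than one variable per column. The sketch below assumes a hypothetical array named data in that transposed layout, reorders it with TRANSPOSE, and checks the dimensions before calling REGRESS. All values are made up for illustration.

```idl
; Hypothetical data laid out as Npoints x Nterms (6 columns, 2 rows),
; which is the transpose of what REGRESS expects:
data = [[1.0, 2.0, 4.0, 8.0, 16.0, 32.0], $
        [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]]
; Made-up dependent values, one per sample:
Y = [8.1, 6.8, 9.15, 16.95, 37.1, 80.9]
; Reorder to Nterms x Npoints (2 columns, 6 rows):
X = TRANSPOSE(data)
; Confirm that the second dimension of X matches the length of Y:
dims = SIZE(X, /DIMENSIONS)
IF dims[1] NE N_ELEMENTS(Y) THEN PRINT, 'X and Y do not match.'
; Compute the fit:
result = REGRESS(X, Y, CONST=const)
PRINT, 'Constant: ', const
PRINT, 'Coefficients: ', result[*]
```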

Keywords

CHISQ

Set this keyword to a named variable that will contain the value of the unreduced chi-square goodness-of-fit statistic.

CONST

Set this keyword to a named variable that will contain the constant term of the fit.

CORRELATION

Set this keyword to a named variable that will contain the vector of linear correlation coefficients.

DOUBLE

Set this keyword to force computations to be done in double-precision arithmetic.

FTEST

Set this keyword to a named variable that will contain the F-value for the goodness-of-fit test.

MCORRELATION

Set this keyword to a named variable that will contain the multiple linear correlation coefficient.
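
The goodness-of-fit keywords (CHISQ, CORRELATION, FTEST, and MCORRELATION) can be requested together in a single call. The following is a minimal sketch with made-up data; the output variable names (chisq, corr, ftest, mcorr) are arbitrary.

```idl
; Illustrative data: two independent variables, six samples
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
; Made-up dependent values and measurement errors
Y = [8.1, 6.8, 9.15, 16.95, 37.1, 80.9]
measure_errors = REPLICATE(0.5, N_ELEMENTS(Y))
; Request all four statistics in one call:
result = REGRESS(X, Y, CHISQ=chisq, CORRELATION=corr, $
   FTEST=ftest, MCORRELATION=mcorr, $
   MEASURE_ERRORS=measure_errors)
PRINT, 'Chi-square: ', chisq
PRINT, 'Linear correlation coefficients: ', corr
PRINT, 'F-value: ', ftest
PRINT, 'Multiple correlation coefficient: ', mcorr
```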

MEASURE_ERRORS

Set this keyword to a vector containing the standard measurement error for each point Y[i]. This vector must have Npoints elements, the same length as Y and as the second dimension of X.

Note: For Gaussian errors (e.g., instrumental uncertainties), MEASURE_ERRORS should be set to the standard deviations of each point in Y. For Poisson or statistical weighting, MEASURE_ERRORS should be set to SQRT(Y).
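
The Poisson case in the note above can be sketched as follows. The count values are made up and chosen to be strictly positive so that SQRT(Y) contains no zeros.

```idl
; Illustrative independent variables: two terms, six samples
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
; Made-up count data (all values positive):
Y = [9.0, 6.0, 10.0, 16.0, 38.0, 80.0]
; Poisson (statistical) weighting: the error of each count is its square root
measure_errors = SQRT(Y)
result = REGRESS(X, Y, MEASURE_ERRORS=measure_errors, $
   CONST=const, SIGMA=sigma)
PRINT, 'Constant: ', const
PRINT, 'Coefficients: ', result[*]
PRINT, 'Standard errors: ', sigma
```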

SIGMA

Set this keyword to a named variable that will contain the 1-sigma uncertainty estimates for the returned parameters.

Note: If MEASURE_ERRORS is omitted, you are assuming that the regression model is the correct model for your data, so no independent goodness-of-fit test is possible. In this case, the values returned in SIGMA are multiplied by SQRT(CHISQ/(N-M)), where N is the number of points in X and M is the number of coefficients. See section 15.2 of Numerical Recipes in C (Second Edition) for details.
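
To see the effect described in this note, one option is to fit the same data with and without MEASURE_ERRORS and compare the returned SIGMA values. This is only a sketch with made-up data; the variable names are arbitrary and no particular output is asserted.

```idl
; Illustrative data: two independent variables, six samples
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
Y = [8.1, 6.8, 9.15, 16.95, 37.1, 80.9]
measure_errors = REPLICATE(0.5, N_ELEMENTS(Y))
; Fit with explicit measurement errors:
r1 = REGRESS(X, Y, SIGMA=sigma_with, CHISQ=chisq_with, $
   MEASURE_ERRORS=measure_errors)
; Fit without measurement errors (SIGMA is rescaled as described above):
r2 = REGRESS(X, Y, SIGMA=sigma_without, CHISQ=chisq_without)
PRINT, 'SIGMA with MEASURE_ERRORS:    ', sigma_with
PRINT, 'SIGMA without MEASURE_ERRORS: ', sigma_without
PRINT, 'CHISQ with / without: ', chisq_with, chisq_without
```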

STATUS

Set this keyword to a named variable that will contain the status of the operation. Possible status values are:

0 = successful completion
1 = singular array (which indicates that the inversion is invalid)
2 = warning that a small pivot element was used and that significant accuracy was probably lost

Note: If STATUS is not specified, any error messages will be output to the screen.
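
A minimal sketch of checking the status value after a fit, using single-line IF statements so that it can be entered at the IDL command line. The data values and message strings are illustrative only.

```idl
; Illustrative data: two independent variables, six samples
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
Y = [8.1, 6.8, 9.15, 16.95, 37.1, 80.9]
; Capture the status instead of relying on printed error messages:
result = REGRESS(X, Y, CONST=const, STATUS=status)
; Act on each documented status value:
IF status EQ 0 THEN PRINT, 'Fit succeeded: ', const, result[*]
IF status EQ 1 THEN PRINT, 'Singular array: the fit is invalid.'
IF status EQ 2 THEN PRINT, 'Small pivot: accuracy was probably lost.'
```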

YFIT

Set this keyword to a named variable that will contain the vector of calculated Y values.
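
The YFIT values can be cross-checked by evaluating the fitted function by hand from CONST and the returned coefficients. The sketch below uses made-up data and a hypothetical variable named manual; it prints the largest difference rather than asserting a value.

```idl
; Illustrative data: two independent variables, six samples
X1 = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
X2 = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
X = [TRANSPOSE(X1), TRANSPOSE(X2)]
Y = [8.1, 6.8, 9.15, 16.95, 37.1, 80.9]
result = REGRESS(X, Y, CONST=const, YFIT=yfit)
; Start from the constant term, then add each coefficient times its variable:
manual = REPLICATE(const, N_ELEMENTS(Y))
FOR k = 0, N_ELEMENTS(result)-1 DO manual = manual + result[k] * REFORM(X[k, *])
; Compare with the values returned through YFIT, and print the residuals:
PRINT, 'Max difference from YFIT: ', MAX(ABS(manual - REFORM(yfit)))
PRINT, 'Residuals: ', Y - REFORM(yfit)
```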

Version History

Original	Introduced
5.4	Deprecated the Weights, Yfit, Const, Sigma, Ftest, R, Rmul, Chisq, and Status arguments, and the RELATIVE_WEIGHT keyword.