GOODPOLY

Name

goodpoly

Purpose (one Line)

Robust fitting of a polynomial to data.

Description

This is a multi-pass fitting routine that fits a fixed order polynomial
to the input data. After each pass, the scatter of the fit relative
to the fitted line is computed. Each point is examined to see if it
falls beyond THRESH sigma from the line. If is does, it is removed
from the data and the fit is tried again. This will make up to two
attempts to remove bad data.

Calling Sequence

coeff = goodpoly(x,y,order,thresh,yfit,newx,newy,newerr)

Inputs

x - Input dataset, independant values.
y - Input dataset, dependant values.
order - Order of the polynomial fit (linear = 1).
thresh - Sigma threshold for removing outliers.

Optional Input Parameters

Keyword Input Parameters

BAD - Vector of flags, same length as x and y, that indicate bad values
in the y vector. The default is that all points are considered
good at the stars. If supplied, any additional bad values found
will be marked as bad in this vector upon output. Also, any
NaN values found in either the x or y vector will be trimmed
from the data (and marked bad) prior to any processing.
EXCLUDE - Number of points to exclude from initial pass fit. This number
is rounded up to the next even number. Then EXCLUDE/2 of the
highest and lowest points are removed before the first fit.
If these numbers are reasonable, they will not be excluded
in the second pass. This helps prevent biasing the first fit
with the worst points in the array. Default is to not exclude
any points on the first pass.
MAX_VALUE - The maximum value to be fitted. If this keyword is provided,
data values greater than MAX_VALUE are treated as missing
and are not used in the fit at any pass.
MIN_VALUE - The minimum value to be fitted. If this keyword is provided,
data values greater than MIN_VALUE are treated as missing
and are not used in the fit at any pass.
YERR - Uncertainties on the y values. Default=no weighting.
The point-to-point scatter from the fit is still used for the
threshold. After all editing is done on this basis, a last
poly_fit is run using the errors on the surviving points to
do a final weighted fit. This is not a perfect solution to
this desired feature but it's reasonably close. The down side
is that large error points are more likely to be removed than
would be truly justified.

Outputs

yfit - Fitted values for y that match the input vector.
newx - X values from input that were considered good.
newy - Y values from input that were considered good.
newerr - uncertainties from input that were considered good.
Return value is the set of polynomial coefficients.

Keyword Output Parameters

COEFFSIG - Uncertainty on the final returned coefficients
SIGMA - Standard deviation of difference between good points and the
fitted curve.

Common Blocks

Side Effects

Restrictions

Procedure

Modification History

Written 1991 Feb., Marc W. Buie, Lowell Observatory
93/11/12, MWB, Program fixed to return a computed y for all input x.
95/09/20, MWB, Added EXCLUDE keyword.
98/02/09, MWB, Added SIGMA keyword.
98/06/08, MWB, Added MIN/MAX_VALUE keywords.
98/08/12, MWB, Revamped some logic plus added direct support for badflags.
The new version is probably a tad faster and more robust
than the old version.
2009/10/02, MWB, removed obsolete Poly_fit arguments
2010/04/29, MWB, added a trimrank call on the return value
2011/04/29, MWB, added COEFFSIG keyword
2013/07/19, MWB, added YERR keyword and newerr argument
2013/11/21, MWB, the previous modification to add YERR had some really bad
logic. This has been fixed now but please read the
documentation carefully to understand what it really does.