TNMIN Name
TNMIN
Author
Craig B. Markwardt, NASA/GSFC Code 662, Greenbelt, MD 20770
craigm@lheamail.gsfc.nasa.gov
UPDATED VERSIONs can be found on my WEB PAGE:
http://cow.physics.wisc.edu/~craigm/idl/idl.html
Purpose
Performs function minimization (Truncated-Newton Method)
Major Topics
Optimization and Minimization Calling Sequence
parms = TNMIN(MYFUNCT, X, FUNCTARGS=fcnargs, NFEV=nfev,
MAXITER=maxiter, ERRMSG=errmsg, NPRINT=nprint,
QUIET=quiet, XTOL=xtol, STATUS=status,
FGUESS=fguess, PARINFO=parinfo, BESTMIN=bestmin,
ITERPROC=iterproc, ITERARGS=iterargs, niter=niter)
Description
TNMIN uses the Truncated-Newton method to minimize an arbitrary IDL
function with respect to a given set of free parameters. The
user-supplied function must compute the gradient with respect to
each parameter. TNMIN is based on TN.F (TNBC) by Stephen Nash.
If you want to solve a least-squares problem, to perform *curve*
*fitting*, then you will probably want to use the routines MPFIT,
MPFITFUN and MPFITEXPR. Those routines are specifically optimized
for the least-squares problem. TNMIN is suitable for constrained
and unconstrained optimization problems with a medium number of
variables. Function *maximization* can be performed using the
MAXIMIZE keyword.
TNMIN is similar to MPFIT in that it allows parameters to be fixed,
simple bounding limits to be placed on parameter values, and
parameters to be tied to other parameters. One major difference
between MPFIT and TNMIN is that TNMIN does not compute derivatives
automatically by default. See PARINFO and AUTODERIVATIVE below for
more discussion and examples.
User Function
The user must define an IDL function which returns the desired
value as a single scalar. The IDL function must accept a list of
numerical parameters, P. Additionally, keyword parameters may be
used to pass more data or information to the user function, via the
FUNCTARGS keyword.
The user function should be declared in the following way:
FUNCTION MYFUNCT, p, dp [, keywords permitted ]
; Parameter values are passed in "p"
f = .... ; Compute function value
dp = .... ; Compute partial derivatives (optional)
return, f
END
The function *must* accept at least one argument, the parameter
list P. The vector P is implicitly assumed to be a one-dimensional
array. Users may pass additional information to the function by
composing and _EXTRA structure and passing it in the FUNCTARGS
keyword.
User functions may also indicate a fatal error condition using the
ERROR_CODE common block variable, as described below under the
TNMIN_ERROR common block definition (by setting ERROR_CODE to a
number between -15 and -1).
EXPLICIT vs. NUMERICAL DERIVATIVES
By default, the user must compute the function gradient components
explicitly using AUTODERIVATIVE=0. As explained below, numerical
derivatives can also be calculated using AUTODERIVATIVE=1.
For explicit derivatives, the user should call TNMIN using the
default keyword value AUTODERIVATIVE=0. [ This is different
behavior from MPFIT, where AUTODERIVATIVE=1 is the default. ] The
IDL user routine should compute the gradient of the function as a
one-dimensional array of values, one for each of the parameters.
They are passed back to TNMIN via "dp" as shown above.
The derivatives with respect to fixed parameters are ignored; zero
is an appropriate value to insert for those derivatives. Upon
input to the user function, DP is set to a vector with the same
length as P, with a value of 1 for a parameter which is free, and a
value of zero for a parameter which is fixed (and hence no
derivative needs to be calculated). This input vector may be
overwritten as needed.
For numerical derivatives, a finite differencing approximation is
used to estimate the gradient values. Users can activate this
feature by passing the keyword AUTODERIVATIVE=1. Fine control over
this behavior can be achieved using the STEP, RELSTEP and TNSIDE
fields of the PARINFO structure.
When finite differencing is used for computing derivatives (ie,
when AUTODERIVATIVE=1), the parameter DP is not passed. Therefore
functions can use N_PARAMS() to indicate whether they must compute
the derivatives or not. However there is no penalty (other than
computation time) for computing the gradient values and users may
switch between AUTODERIVATIVE=0 or =1 in order to test both
scenarios.
Constraining Parameter Values With The Parinfo Keyword
The behavior of TNMIN can be modified with respect to each
parameter to be optimized. A parameter value can be fixed; simple
boundary constraints can be imposed; limitations on the parameter
changes can be imposed; properties of the automatic derivative can
be modified; and parameters can be tied to one another.
These properties are governed by the PARINFO structure, which is
passed as a keyword parameter to TNMIN.
PARINFO should be an array of structures, one for each parameter.
Each parameter is associated with one element of the array, in
numerical order. The structure can have the following entries
(none are required):
.VALUE - the starting parameter value (but see the START_PARAMS
parameter for more information).
.FIXED - a boolean value, whether the parameter is to be held
fixed or not. Fixed parameters are not varied by
TNMIN, but are passed on to MYFUNCT for evaluation.
.LIMITED - a two-element boolean array. If the first/second
element is set, then the parameter is bounded on the
lower/upper side. A parameter can be bounded on both
sides. Both LIMITED and LIMITS must be given
together.
.LIMITS - a two-element float or double array. Gives the
parameter limits on the lower and upper sides,
respectively. Zero, one or two of these values can be
set, depending on the values of LIMITED. Both LIMITED
and LIMITS must be given together.
.PARNAME - a string, giving the name of the parameter. The
fitting code of TNMIN does not use this tag in any
way.
.STEP - the step size to be used in calculating the numerical
derivatives. If set to zero, then the step size is
computed automatically. Ignored when AUTODERIVATIVE=0.
.TNSIDE - the sidedness of the finite difference when computing
numerical derivatives. This field can take four
values:
0 - one-sided derivative computed automatically
1 - one-sided derivative (f(x+h) - f(x) )/h
-1 - one-sided derivative (f(x) - f(x-h))/h
2 - two-sided derivative (f(x+h) - f(x-h))/(2*h)
Where H is the STEP parameter described above. The
"automatic" one-sided derivative method will chose a
direction for the finite difference which does not
violate any constraints. The other methods do not
perform this check. The two-sided method is in
principle more precise, but requires twice as many
function evaluations. Default: 0.
.TIED - a string expression which "ties" the parameter to other
free or fixed parameters. Any expression involving
constants and the parameter array P are permitted.
Example: if parameter 2 is always to be twice parameter
1 then use the following: parinfo(2).tied = '2 * P(1)'.
Since they are totally constrained, tied parameters are
considered to be fixed; no errors are computed for them.
[ NOTE: the PARNAME can't be used in expressions. ]
Future modifications to the PARINFO structure, if any, will involve
adding structure tags beginning with the two letters "MP" or "TN".
Therefore programmers are urged to avoid using tags starting with
these two combinations of letters; otherwise they are free to
include their own fields within the PARINFO structure, and they
will be ignored.
PARINFO Example:
parinfo = replicate({value:0.D, fixed:0, limited:[0,0], $
limits:[0.D,0]}, 5)
parinfo(0).fixed = 1
parinfo(4).limited(0) = 1
parinfo(4).limits(0) = 50.D
parinfo(*).value = [5.7D, 2.2, 500., 1.5, 2000.]
A total of 5 parameters, with starting values of 5.7,
2.2, 500, 1.5, and 2000 are given. The first parameter
is fixed at a value of 5.7, and the last parameter is
constrained to be above 50. Inputs
MYFUNCT - a string variable containing the name of the function to
be minimized (see USER FUNCTION above). The IDL routine
should return the value of the function and optionally
its gradients.
X - An array of starting values for each of the parameters of the
model.
This parameter is optional if the PARINFO keyword is used.
See above. The PARINFO keyword provides a mechanism to fix or
constrain individual parameters. If both X and PARINFO are
passed, then the starting *value* is taken from X, but the
*constraints* are taken from PARINFO.
Returns
Returns the array of best-fit parameters.
Keyword Parameters
AUTODERIVATIVE - If this is set, derivatives of the function will
be computed automatically via a finite
differencing procedure. If not set, then MYFUNCT
must provide the (explicit) derivatives.
Default: 0 (explicit derivatives required)
BESTMIN - upon return, the best minimum function value that TNMIN
could find.
EPSABS - a nonnegative real variable. Termination occurs when the
absolute error between consecutive iterates is at most
EPSABS. Note that using EPSREL is normally preferable
over EPSABS, except in cases where ABS(F) is much larger
than 1 at the optimal point. A value of zero indicates
the absolute error test is not applied. If EPSABS is
specified, then both EPSREL and EPSABS tests are applied;
if either succeeds then termination occurs.
Default: 0 (i.e., only EPSREL is applied).
EPSREL - a nonnegative input variable. Termination occurs when the
relative error between two consecutive iterates is at
most EPSREL. Therefore, EPSREL measures the relative
error desired in the function. An additional, more
lenient, stopping condition can be applied by specifying
the EPSABS keyword.
Default: 100 * Machine Precision
ERRMSG - a string error or warning message is returned.
FGUESS - provides the routine a guess to the minimum value.
Default: 0
FUNCTARGS - A structure which contains the parameters to be passed
to the user-supplied function specified by MYFUNCT via
the _EXTRA mechanism. This is the way you can pass
additional data to your user-supplied function without
using common blocks.
Consider the following example:
if FUNCTARGS = { XVAL:[1.D,2,3], YVAL:[1.D,4,9]}
then the user supplied function should be declared
like this:
FUNCTION MYFUNCT, P, XVAL=x, YVAL=y
By default, no extra parameters are passed to the
user-supplied function.
ITERARGS - The keyword arguments to be passed to ITERPROC via the
_EXTRA mechanism. This should be a structure, and is
similar in operation to FUNCTARGS.
Default: no arguments are passed.
ITERDERIV - Intended to print function gradient information. If
set, then the ITERDERIV keyword is set in each call to
ITERPROC. In the default ITERPROC, parameter values
and gradient information are both printed when this
keyword is set.
ITERPROC - The name of a procedure to be called upon each NPRINT
iteration of the TNMIN routine. It should be declared
in the following way:
PRO ITERPROC, MYFUNCT, p, iter, fnorm, FUNCTARGS=fcnargs, $
PARINFO=parinfo, QUIET=quiet, _EXTRA=extra
; perform custom iteration update
END
ITERPROC must accept the _EXTRA keyword, in case of
future changes to the calling procedure.
MYFUNCT is the user-supplied function to be minimized,
P is the current set of model parameters, ITER is the
iteration number, and FUNCTARGS are the arguments to be
passed to MYFUNCT. FNORM is should be the function
value. QUIET is set when no textual output should be
printed. See below for documentation of PARINFO.
In implementation, ITERPROC can perform updates to the
terminal or graphical user interface, to provide
feedback while the fit proceeds. If the fit is to be
stopped for any reason, then ITERPROC should set the
common block variable ERROR_CODE to negative value
between -15 and -1 (see TNMIN_ERROR common block
below). In principle, ITERPROC should probably not
modify the parameter values, because it may interfere
with the algorithm's stability. In practice it is
allowed.
Default: an internal routine is used to print the
parameter values.
MAXITER - The maximum number of iterations to perform. If the
number is exceeded, then the STATUS value is set to 5
and TNMIN returns.
Default: 200 iterations
MAXIMIZE - If set, the function is maximized instead of
minimized.
MAXNFEV - The maximum number of function evaluations to perform.
If the number is exceeded, then the STATUS value is set
to -17 and TNMIN returns. A value of zero indicates no
maximum.
Default: 0 (no maximum)
NFEV - upon return, the number of MYFUNCT function evaluations
performed.
NITER - upon return, number of iterations completed.
NPRINT - The frequency with which ITERPROC is called. A value of
1 indicates that ITERPROC is called with every iteration,
while 2 indicates every other iteration, etc.
Default value: 1
PARINFO - Provides a mechanism for more sophisticated constraints
to be placed on parameter values. When PARINFO is not
passed, then it is assumed that all parameters are free
and unconstrained. Values in PARINFO are never modified
during a call to TNMIN.
See description above for the structure of PARINFO.
Default value: all parameters are free and unconstrained.
QUIET - set this keyword when no textual output should be printed
by TNMIN
STATUS - an integer status code is returned. All values greater
than zero can represent success (however STATUS EQ 5 may
indicate failure to converge). Gaps in the numbering
system below are to maintain compatibility with MPFIT.
Upon return, STATUS can have one of the following values:
-18 a fatal internal error occurred during optimization.
-17 the maximum number of function evaluations has been
reached without convergence.
-16 a parameter or function value has become infinite or an
undefined number. This is usually a consequence of
numerical overflow in the user's function, which must be
avoided.
-15 to -1
these are error codes that either MYFUNCT or ITERPROC
may return to terminate the fitting process (see
description of MPFIT_ERROR common below). If either
MYFUNCT or ITERPROC set ERROR_CODE to a negative number,
then that number is returned in STATUS. Values from -15
to -1 are reserved for the user functions and will not
clash with MPFIT.
0 improper input parameters.
1 convergence was reached.
2-4 (RESERVED)
5 the maximum number of iterations has been reached
6-8 (RESERVED)
Example
FUNCTION F, X, DF, _EXTRA=extra ;; *** MUST ACCEPT KEYWORDS
F = (X(0)-1)^2 + (X(1)+7)^2
DF = [ 2D * (X(0)-1), 2D * (X(1)+7) ] ; Gradient
RETURN, F
END
P = TNMIN('F', [0D, 0D], BESTMIN=F0)
Minimizes the function F(x0,x1) = (x0-1)^2 + (x1+7)^2.
Common Blocks
COMMON TNMIN_ERROR, ERROR_CODE
User routines may stop the fitting process at any time by
setting an error condition. This condition may be set in either
the user's model computation routine (MYFUNCT), or in the
iteration procedure (ITERPROC).
To stop the fitting, the above common block must be declared,
and ERROR_CODE must be set to a negative number. After the user
procedure or function returns, TNMIN checks the value of this
common block variable and exits immediately if the error
condition has been set. By default the value of ERROR_CODE is
zero, indicating a successful function/procedure call.
References
TRUNCATED-NEWTON METHOD, TN.F
Stephen G. Nash, Operations Research and Applied Statistics
Department
http://www.netlib.org/opt/tn
Nash, S. G. 1984, "Newton-Type Minimization via the Lanczos
Method," SIAM J. Numerical Analysis, 21, p. 770-778
Modification History
Derived from TN.F by Stephen Nash with many changes and additions,
to conform to MPFIT paradigm, 19 Jan 1999, CM
Changed web address to COW, 25 Sep 1999, CM
Alphabetized documented keyword parameters, 02 Oct 1999, CM
Changed to ERROR_CODE for error condition, 28 Jan 2000, CM
Continued and fairly major improvements (CM, 08 Jan 2001):
- calling of user procedure is now concentrated in TNMIN_CALL,
and arguments are reduced by storing a large number of them
in common blocks;
- finite differencing is done within TNMIN_CALL; added
AUTODERIVATIVE=1 so that finite differencing can be enabled,
both one and two sided;
- a new procedure to parse PARINFO fields, borrowed from MPFIT;
brought PARINFO keywords up to date with MPFIT;
- go through and check for float vs. double discrepancies;
- add explicit MAXIMIZE keyword, and support in TNMIN_CALL and
TNMIN_DEFITER to print the correct values in that case;
TNMIN_DEFITER now prints function gradient values if
requested;
- convert to common-based system of MPFIT for storing machine
constants; revert TNMIN_ENORM to simple sum of squares, at
least for now;
- remove limit on number of function evaluations, at least for
now, and until I can understand what happens when we do
numerical derivatives.
Further changes: more float vs double; disable TNMINSTEP for now;
experimented with convergence test in case of function
maximization, 11 Jan 2001, CM
TNMINSTEP is parsed but not enabled, 13 Mar 2001
Major code cleanups; internal docs; reduced commons, CM, 08 Apr
2001
Continued code cleanups; documentation; the STATUS keyword
actually means something, CM, 10 Apr 2001
Added reference to Nash paper, CM, 08 Feb 2002
Fixed 16-bit loop indices, D. Schelgel, 22 Apr 2003
Changed parens to square brackets because of conflicts with
limits function. K. Tolbert, 23 Feb 2005
Some documentation clarifications, CM, 09 Nov 2007
Ensure that MY_FUNCT returns a scalar; make it more likely that
error messages get back out to the user; fatal CATCH'd error
now returns STATUS = -18, CM, 17 Sep 2008
Fix TNMIN_CALL when parameters are TIEd (thanks to Alfred de
Wijn), CM, 22 Nov 2009
Remember to TIE the parameters before final return (thanks to
Michael Smith), CM, 20 Jan 2010
Todo
- scale derivatives semi-automatically;
- ability to scale and offset parameters;