@c gsl-1.9/doc/multifit.texi
@cindex nonlinear least squares fitting
@cindex least squares fitting, nonlinear

This chapter describes functions for multidimensional nonlinear
least-squares fitting.  The library provides low-level components for a
variety of iterative solvers and convergence tests.  These can be
combined by the user to achieve the desired solution, with full access
to the intermediate steps of the iteration.  Each class of methods uses
the same framework, so that you can switch between solvers at runtime
without needing to recompile your program.  Each instance of a solver
keeps track of its own state, allowing the solvers to be used in
multi-threaded programs.

The header file @file{gsl_multifit_nlin.h} contains prototypes for the
multidimensional nonlinear fitting functions and related declarations.

@menu
* Overview of Nonlinear Least-Squares Fitting::
* Initializing the Nonlinear Least-Squares Solver::
* Providing the Function to be Minimized::
* Iteration of the Minimization Algorithm::
* Search Stopping Parameters for Minimization Algorithms::
* Minimization Algorithms using Derivatives::
* Minimization Algorithms without Derivatives::
* Computing the covariance matrix of best fit parameters::
* Example programs for Nonlinear Least-Squares Fitting::
* References and Further Reading for Nonlinear Least-Squares Fitting::
@end menu

@node Overview of Nonlinear Least-Squares Fitting
@section Overview
@cindex nonlinear least squares fitting, overview

The problem of multidimensional nonlinear least-squares fitting requires
the minimization of the squared residuals of @math{n} functions,
@math{f_i}, in @math{p} parameters, @math{x_i},
@tex
\beforedisplay
$$
\Phi(x) = {1 \over 2} || F(x) ||^2
        = {1 \over 2} \sum_{i=1}^{n} f_i (x_1, \dots, x_p)^2
$$
\afterdisplay
@end tex
@ifinfo

@example
\Phi(x) = (1/2) || F(x) ||^2
        = (1/2) \sum_@{i=1@}^@{n@} f_i(x_1, ..., x_p)^2
@end example

@end ifinfo
@noindent
All algorithms proceed from an initial guess using the linearization,
@tex
\beforedisplay
$$
\psi(p) = || F(x+p) || \approx || F(x) + J p\, ||
$$
\afterdisplay
@end tex
@ifinfo

@example
\psi(p) = || F(x+p) || ~=~ || F(x) + J p ||
@end example

@end ifinfo
@noindent
where @math{x} is the initial point, @math{p} is the proposed step
and @math{J} is the
Jacobian matrix @c{$J_{ij} = \partial f_i / \partial x_j$}
@math{J_@{ij@} = d f_i / d x_j}.
Additional strategies are used to enlarge the region of convergence.
These include requiring a decrease in the norm @math{||F||} on each
step or using a trust region to avoid steps which fall outside the linear
regime.

To perform a weighted least-squares fit of a nonlinear model
@math{Y(x,t)} to data (@math{t_i}, @math{y_i}) with independent gaussian
errors @math{\sigma_i}, use function components of the following form,
@tex
\beforedisplay
$$
f_i = {(Y(x, t_i) - y_i) \over \sigma_i}
$$
\afterdisplay
@end tex
@ifinfo

@example
f_i = (Y(x, t_i) - y_i) / \sigma_i
@end example

@end ifinfo
@noindent
Note that the model parameters are denoted by @math{x} in this chapter
since the nonlinear least-squares algorithms are described
geometrically (i.e. finding the minimum of a surface).  The
independent variable of any data to be fitted is denoted by @math{t}.

With the definition above the Jacobian is
@c{$J_{ij} = (1 / \sigma_i) \partial Y_i / \partial x_j$}
@math{J_@{ij@} = (1 / \sigma_i) d Y_i / d x_j}, where @math{Y_i = Y(x,t_i)}.


@node Initializing the Nonlinear Least-Squares Solver
@section Initializing the Solver

@deftypefun {gsl_multifit_fsolver *} gsl_multifit_fsolver_alloc (const gsl_multifit_fsolver_type * @var{T}, size_t @var{n}, size_t @var{p})
This function returns a pointer to a newly allocated instance of a
solver of type @var{T} for @var{n} observations and @var{p} parameters.
The number of observations @var{n} must be greater than or equal to the
number of parameters @var{p}.

If there is insufficient memory to create the solver then the function
returns a null pointer and the error handler is invoked with an error
code of @code{GSL_ENOMEM}.
@end deftypefun

@deftypefun {gsl_multifit_fdfsolver *} gsl_multifit_fdfsolver_alloc (const gsl_multifit_fdfsolver_type * @var{T}, size_t @var{n}, size_t @var{p})
This function returns a pointer to a newly allocated instance of a
derivative solver of type @var{T} for @var{n} observations and @var{p}
parameters.  For example, the following code creates an instance of a
Levenberg-Marquardt solver for 100 data points and 3 parameters,

@example
const gsl_multifit_fdfsolver_type * T
  = gsl_multifit_fdfsolver_lmder;
gsl_multifit_fdfsolver * s
  = gsl_multifit_fdfsolver_alloc (T, 100, 3);
@end example

@noindent
The number of observations @var{n} must be greater than or equal to the
number of parameters @var{p}.

If there is insufficient memory to create the solver then the function
returns a null pointer and the error handler is invoked with an error
code of @code{GSL_ENOMEM}.
@end deftypefun

@deftypefun int gsl_multifit_fsolver_set (gsl_multifit_fsolver * @var{s}, gsl_multifit_function * @var{f}, gsl_vector * @var{x})
This function initializes, or reinitializes, an existing solver @var{s}
to use the function @var{f} and the initial guess @var{x}.
@end deftypefun

@deftypefun int gsl_multifit_fdfsolver_set (gsl_multifit_fdfsolver * @var{s}, gsl_multifit_function_fdf * @var{fdf}, gsl_vector * @var{x})
This function initializes, or reinitializes, an existing solver @var{s}
to use the function and derivative @var{fdf} and the initial guess
@var{x}.
@end deftypefun

@deftypefun void gsl_multifit_fsolver_free (gsl_multifit_fsolver * @var{s})
@deftypefunx void gsl_multifit_fdfsolver_free (gsl_multifit_fdfsolver * @var{s})
These functions free all the memory associated with the solver @var{s}.
@end deftypefun

@deftypefun {const char *} gsl_multifit_fsolver_name (const gsl_multifit_fsolver * @var{s})
@deftypefunx {const char *} gsl_multifit_fdfsolver_name (const gsl_multifit_fdfsolver * @var{s})
These functions return a pointer to the name of the solver.  For example,

@example
printf ("s is a '%s' solver\n",
        gsl_multifit_fdfsolver_name (s));
@end example

@noindent
would print something like @code{s is a 'lmder' solver}.
@end deftypefun

@node Providing the Function to be Minimized
@section Providing the Function to be Minimized

You must provide @math{n} functions of @math{p} variables for the
minimization algorithms to operate on.  In order to allow for
arbitrary parameters the functions are defined by the following data
types:

@deftp {Data Type} gsl_multifit_function
This data type defines a general system of functions with arbitrary parameters.

@table @code
@item int (* f) (const gsl_vector * @var{x}, void * @var{params}, gsl_vector * @var{f})
this function should store the vector result
@c{$f(x,\hbox{\it params})$}
@math{f(x,params)} in @var{f} for argument @var{x} and arbitrary parameters @var{params},
returning an appropriate error code if the function cannot be computed.

@item size_t n
the number of functions, i.e. the number of components of the
vector @var{f}.

@item size_t p
the number of independent variables, i.e. the number of components of
the vector @var{x}.

@item void * params
a pointer to the arbitrary parameters of the function.
@end table
@end deftp

@deftp {Data Type} gsl_multifit_function_fdf
This data type defines a general system of functions with arbitrary parameters and
the corresponding Jacobian matrix of derivatives,

@table @code
@item int (* f) (const gsl_vector * @var{x}, void * @var{params}, gsl_vector * @var{f})
this function should store the vector result
@c{$f(x,\hbox{\it params})$}
@math{f(x,params)} in @var{f} for argument @var{x} and arbitrary parameters @var{params},
returning an appropriate error code if the function cannot be computed.

@item int (* df) (const gsl_vector * @var{x}, void * @var{params}, gsl_matrix * @var{J})
this function should store the @var{n}-by-@var{p} matrix result
@c{$J_{ij} = \partial f_i(x,\hbox{\it params}) / \partial x_j$}
@math{J_@{ij@} = d f_i(x,params) / d x_j} in @var{J} for argument @var{x}
and arbitrary parameters @var{params}, returning an appropriate error code if the
function cannot be computed.

@item int (* fdf) (const gsl_vector * @var{x}, void * @var{params}, gsl_vector * @var{f}, gsl_matrix * @var{J})
This function should set the values of @var{f} and @var{J} as above,
for arguments @var{x} and arbitrary parameters @var{params}.  This function
provides an optimization of the separate functions for @math{f(x)} and
@math{J(x)}---it is always faster to compute the function and its
derivative at the same time.

@item size_t n
the number of functions, i.e. the number of components of the
vector @var{f}.

@item size_t p
the number of independent variables, i.e. the number of components of
the vector @var{x}.

@item void * params
a pointer to the arbitrary parameters of the function.
@end table
@end deftp

Note that when fitting a non-linear model against experimental data,
the data is passed to the functions above using the
@var{params} argument and the trial best-fit parameters through the
@var{x} argument.

@node Iteration of the Minimization Algorithm
@section Iteration

The following functions drive the iteration of each algorithm.  Each
function performs one iteration to update the state of any solver of the
corresponding type.  The same functions work for all solvers so that
different methods can be substituted at runtime without modifications to
the code.

@deftypefun int gsl_multifit_fsolver_iterate (gsl_multifit_fsolver * @var{s})
@deftypefunx int gsl_multifit_fdfsolver_iterate (gsl_multifit_fdfsolver * @var{s})
These functions perform a single iteration of the solver @var{s}.  If
the iteration encounters an unexpected problem then an error code will
be returned.  The solver maintains a current estimate of the best-fit
parameters at all times.
@end deftypefun

The solver struct @var{s} contains the following entries, which can
be used to track the progress of the solution:

@table @code
@item gsl_vector * x
The current position.

@item gsl_vector * f
The function value at the current position.

@item gsl_vector * dx
The difference between the current position and the previous position,
i.e. the last step, taken as a vector.

@item gsl_matrix * J
The Jacobian matrix at the current position (for the
@code{gsl_multifit_fdfsolver} struct only).
@end table

The best-fit information can also be accessed with the following
auxiliary functions,

@deftypefun {gsl_vector *} gsl_multifit_fsolver_position (const gsl_multifit_fsolver * @var{s})
@deftypefunx {gsl_vector *} gsl_multifit_fdfsolver_position (const gsl_multifit_fdfsolver * @var{s})
These functions return the current position (i.e. best-fit parameters)
@code{s->x} of the solver @var{s}.
@end deftypefun

@node Search Stopping Parameters for Minimization Algorithms
@section Search Stopping Parameters
@cindex nonlinear fitting, stopping parameters

A minimization procedure should stop when one of the following conditions is
true:

@itemize @bullet
@item
A minimum has been found to within the user-specified precision.

@item
A user-specified maximum number of iterations has been reached.

@item
An error has occurred.
@end itemize

@noindent
The handling of these conditions is under user control.  The functions
below allow the user to test the current estimate of the best-fit
parameters in several standard ways.

@deftypefun int gsl_multifit_test_delta (const gsl_vector * @var{dx}, const gsl_vector * @var{x}, double @var{epsabs}, double @var{epsrel})

This function tests for the convergence of the sequence by comparing the
last step @var{dx} with the absolute error @var{epsabs} and relative
error @var{epsrel} to the current position @var{x}.  The test returns
@code{GSL_SUCCESS} if the following condition is achieved,
@tex
\beforedisplay
$$
|dx_i| < \hbox{\it epsabs} + \hbox{\it epsrel\/}\, |x_i|
$$
\afterdisplay
@end tex
@ifinfo

@example
|dx_i| < epsabs + epsrel |x_i|
@end example

@end ifinfo
@noindent
for each component of @var{x} and returns @code{GSL_CONTINUE} otherwise.
@end deftypefun

@cindex residual, in nonlinear systems of equations
@deftypefun int gsl_multifit_test_gradient (const gsl_vector * @var{g}, double @var{epsabs})
This function tests the residual gradient @var{g} against the absolute
error bound @var{epsabs}.  Mathematically, the gradient should be
exactly zero at the minimum.  The test returns @code{GSL_SUCCESS} if the
following condition is achieved,
@tex
\beforedisplay
$$
\sum_i |g_i| < \hbox{\it epsabs}
$$
\afterdisplay
@end tex
@ifinfo

@example
\sum_i |g_i| < epsabs
@end example

@end ifinfo
@noindent
and returns @code{GSL_CONTINUE} otherwise.  This criterion is suitable
for situations where the precise location of the minimum, @math{x},
is unimportant provided a value can be found where the gradient is small
enough.
@end deftypefun

@deftypefun int gsl_multifit_gradient (const gsl_matrix * @var{J}, const gsl_vector * @var{f}, gsl_vector * @var{g})
This function computes the gradient @var{g} of @math{\Phi(x) = (1/2)
||F(x)||^2} from the Jacobian matrix @var{J} and the function values
@var{f}, using the formula @math{g = J^T f}.
@end deftypefun

@node Minimization Algorithms using Derivatives
@section Minimization Algorithms using Derivatives

The minimization algorithms described in this section make use of both
the function and its derivative.  They require an initial guess for the
location of the minimum.  There is no absolute guarantee of
convergence---the function must be suitable for this technique and the
initial guess must be sufficiently close to the minimum for it to work.

@comment ============================================================
@cindex Levenberg-Marquardt algorithms
@deffn {Derivative Solver} gsl_multifit_fdfsolver_lmsder
@cindex LMDER algorithm
@cindex MINPACK, minimization algorithms
This is a robust and efficient version of the Levenberg-Marquardt
algorithm as implemented in the scaled @sc{lmder} routine in
@sc{minpack}.  Minpack was written by Jorge J. Mor@'e, Burton S. Garbow
and Kenneth E. Hillstrom.

The algorithm uses a generalized trust region to keep each step under
control.  In order to be accepted a proposed new position @math{x'} must
satisfy the condition @math{|D (x' - x)| < \delta}, where @math{D} is a
diagonal scaling matrix and @math{\delta} is the size of the trust
region.  The components of @math{D} are computed internally, using the
column norms of the Jacobian to estimate the sensitivity of the residual
to each component of @math{x}.  This improves the behavior of the
algorithm for badly scaled functions.

On each iteration the algorithm attempts to minimize the linearized
system @math{|F + J p|} subject to the constraint @math{|D p| < \delta}.
The solution to this constrained linear system is found using the
Levenberg-Marquardt method.

The proposed step is then tested by evaluating the function at the
resulting point, @math{x'}.  If the step reduces the norm of the
function sufficiently, and follows the predicted behavior of the
function within the trust region, then it is accepted and the size of the
trust region is increased.  If the proposed step fails to improve the
solution, or differs significantly from the expected behavior within
the trust region, then the size of the trust region is decreased and
another trial step is computed.

The algorithm also monitors the progress of the solution and returns an
error if the changes in the solution are smaller than the machine
precision.  The possible error codes are,

@table @code
@item GSL_ETOLF
the decrease in the function falls below machine precision

@item GSL_ETOLX
the change in the position vector falls below machine precision

@item GSL_ETOLG
the norm of the gradient, relative to the norm of the function, falls
below machine precision
@end table

@noindent
These error codes indicate that further iterations are unlikely to
change the solution from its current value.

@end deffn

@deffn {Derivative Solver} gsl_multifit_fdfsolver_lmder
This is an unscaled version of the @sc{lmder} algorithm.  The elements of the
diagonal scaling matrix @math{D} are set to 1.  This algorithm may be
useful in circumstances where the scaled version of @sc{lmder} converges too
slowly, or the function is already scaled appropriately.
@end deffn

@node Minimization Algorithms without Derivatives
@section Minimization Algorithms without Derivatives

There are no algorithms implemented in this section at the moment.
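The accept/shrink/expand behavior of the trust region described above can be sketched with a gain ratio comparing the actual reduction of the squared residual norm with the reduction predicted by the linear model.  The function below is a schematic illustration only; its thresholds, factors, and name are assumptions and do not reproduce the more refined update rule used by @sc{minpack}'s @sc{lmder}:

```c
/* Schematic trust-region update based on the gain ratio
   rho = (actual reduction in |F|^2) / (reduction predicted by the
   linearized model |F + J p|^2).  The 0.25/0.75 thresholds and the
   factors of 2 are illustrative assumptions, not MINPACK's rule. */
double
update_trust_region (double actual_reduction,
                     double predicted_reduction,
                     double delta)
{
  double rho = actual_reduction / predicted_reduction;

  if (rho < 0.25)
    return 0.5 * delta;   /* step disagrees with linear model: shrink */
  else if (rho > 0.75)
    return 2.0 * delta;   /* good agreement: enlarge the trust region */
  else
    return delta;         /* acceptable agreement: keep delta */
}
```

A step is accepted when the actual reduction is positive; a shrunken trust region forces the next trial step back into the regime where the linearization is reliable.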


@node Computing the covariance matrix of best fit parameters
@section Computing the covariance matrix of best fit parameters
@cindex best-fit parameters, covariance
@cindex least squares, covariance of best-fit parameters
@cindex covariance matrix, nonlinear fits

@deftypefun int gsl_multifit_covar (const gsl_matrix * @var{J}, double @var{epsrel}, gsl_matrix * @var{covar})
This function uses the Jacobian matrix @var{J} to compute the covariance
matrix of the best-fit parameters, @var{covar}.  The parameter
@var{epsrel} is used to remove linearly dependent columns when @var{J} is
rank deficient.

The covariance matrix is given by,
@tex
\beforedisplay
$$
C = (J^T J)^{-1}
$$
\afterdisplay
@end tex
@ifinfo

@example
covar = (J^T J)^@{-1@}
@end example

@end ifinfo
@noindent
and is computed by QR decomposition of @math{J} with column-pivoting.  Any
columns of @math{R} which satisfy
@tex
\beforedisplay
$$
|R_{kk}| \leq \hbox{\it epsrel\/}\, |R_{11}|
$$
\afterdisplay
@end tex
@ifinfo

@example
|R_@{kk@}| <= epsrel |R_@{11@}|
@end example

@end ifinfo
@noindent
are considered linearly dependent and are excluded from the covariance
matrix (the corresponding rows and columns of the covariance matrix are
set to zero).

If the minimization uses the weighted least-squares function
@math{f_i = (Y(x, t_i) - y_i) / \sigma_i} then the covariance
matrix above gives the statistical error on the best-fit parameters
resulting from the gaussian errors @math{\sigma_i} on
the underlying data @math{y_i}.  This can be verified from the relation
@math{\delta f = J \delta c} and the fact that the fluctuations in @math{f}
from the data @math{y_i} are normalized by @math{\sigma_i} and
so satisfy @c{$\langle \delta f \delta f^T \rangle = I$}
@math{<\delta f \delta f^T> = I}.

For an unweighted least-squares function @math{f_i = (Y(x, t_i) -
y_i)} the covariance matrix above should be multiplied by the variance
of the residuals about the best-fit @math{\sigma^2 = \sum (y_i - Y(x,t_i))^2 / (n-p)}
to give the variance-covariance
matrix @math{\sigma^2 C}.  This estimates the statistical error on the
best-fit parameters from the scatter of the underlying data.

For more information about covariance matrices see @ref{Fitting Overview}.
@end deftypefun

@comment ============================================================

@node Example programs for Nonlinear Least-Squares Fitting
@section Examples

The following example program fits a weighted exponential model with
background to experimental data, @math{Y = A \exp(-\lambda t) + b}.  The
first part of the program sets up the functions @code{expb_f} and
@code{expb_df} to calculate the model and its Jacobian.  The appropriate
fitting function is given by,
@tex
\beforedisplay
$$
f_i = ((A \exp(-\lambda t_i) + b) - y_i)/\sigma_i
$$
\afterdisplay
@end tex
@ifinfo

@example
f_i = ((A \exp(-\lambda t_i) + b) - y_i)/\sigma_i
@end example

@end ifinfo
@noindent
where we have chosen @math{t_i = i}.  The Jacobian matrix @math{J} is
the derivative of these functions with respect to the three parameters
(@math{A}, @math{\lambda}, @math{b}).  It is given by,
@tex
\beforedisplay
$$
J_{ij} = {\partial f_i \over \partial x_j}
$$
\afterdisplay
@end tex
@ifinfo

@example
J_@{ij@} = d f_i / d x_j
@end example

@end ifinfo
@noindent
where @math{x_0 = A}, @math{x_1 = \lambda} and @math{x_2 = b}.

@example
@verbatiminclude examples/expfit.c
@end example

@noindent
The main part of the program sets up a Levenberg-Marquardt solver and
some simulated random data.  The data uses the known parameters
@math{(A, \lambda, b) = (5.0, 0.1, 1.0)} combined with gaussian noise
(standard deviation = 0.1) over a range of 40 timesteps.  The initial
guess for the parameters is chosen as (1.0, 0.0, 0.0).

@example
@verbatiminclude examples/nlfit.c
@end example

@noindent
The iteration terminates when the change in @math{x} is smaller than 0.0001, as
both an absolute and relative change.  Here are the results of running
the program:

@smallexample
iter: 0 x=1.00000000 0.00000000 0.00000000 |f(x)|=117.349
status=success
iter: 1 x=1.64659312 0.01814772 0.64659312 |f(x)|=76.4578
status=success
iter: 2 x=2.85876037 0.08092095 1.44796363 |f(x)|=37.6838
status=success
iter: 3 x=4.94899512 0.11942928 1.09457665 |f(x)|=9.58079
status=success
iter: 4 x=5.02175572 0.10287787 1.03388354 |f(x)|=5.63049
status=success
iter: 5 x=5.04520433 0.10405523 1.01941607 |f(x)|=5.44398
status=success
iter: 6 x=5.04535782 0.10404906 1.01924871 |f(x)|=5.44397
chisq/dof = 0.800996
A = 5.04536 +/- 0.06028
lambda = 0.10405 +/- 0.00316
b = 1.01925 +/- 0.03782
status = success
@end smallexample

@noindent
The approximate values of the parameters are found correctly, and the
chi-squared value indicates a good fit (the chi-squared per degree of
freedom is approximately 1).  In this case the errors on the parameters
can be estimated from the square roots of the diagonal elements of the
covariance matrix.

If the chi-squared value shows a poor fit (i.e. @c{$\chi^2/(n-p) \gg 1$}
@math{chi^2/dof >> 1}) then the error estimates obtained from the
covariance matrix will be too small.  In the example program the error estimates
are multiplied by @c{$\sqrt{\chi^2/(n-p)}$}
@math{\sqrt@{\chi^2/dof@}} in this case, a common way of increasing the
errors for a poor fit.  Note that a poor fit will result from the use of
an inappropriate model, and the scaled error estimates may then
be outside the range of validity for gaussian errors.


@iftex
@sp 1
@center @image{fit-exp,3.4in}
@end iftex

@node References and Further Reading for Nonlinear Least-Squares Fitting
@section References and Further Reading

The @sc{minpack} algorithm is described in the following article,

@itemize @asis
@item
J.J. Mor@'e, @cite{The Levenberg-Marquardt Algorithm: Implementation and
Theory}, Lecture Notes in Mathematics, v630 (1978), ed G. Watson.
@end itemize

@noindent
The following paper is also relevant to the algorithms described in this
section,

@itemize @asis
@item
J.J. Mor@'e, B.S. Garbow, K.E. Hillstrom, ``Testing Unconstrained
Optimization Software'', ACM Transactions on Mathematical Software, Vol
7, No 1 (1981), p 17--41.
@end itemize