diff options
Diffstat (limited to 'gsl-1.9/doc/sort.texi')
-rw-r--r-- | gsl-1.9/doc/sort.texi | 315 |
1 files changed, 315 insertions, 0 deletions
diff --git a/gsl-1.9/doc/sort.texi b/gsl-1.9/doc/sort.texi new file mode 100644 index 0000000..3e5f216 --- /dev/null +++ b/gsl-1.9/doc/sort.texi @@ -0,0 +1,315 @@ +@cindex sorting +@cindex heapsort +This chapter describes functions for sorting data, both directly and +indirectly (using an index). All the functions use the @dfn{heapsort} +algorithm. Heapsort is an @math{O(N \log N)} algorithm which operates +in-place and does not require any additional storage. It also provides +consistent performance, the running time for its worst-case (ordered +data) being not significantly longer than the average and best cases. +Note that the heapsort algorithm does not preserve the relative ordering +of equal elements---it is an @dfn{unstable} sort. However the resulting +order of equal elements will be consistent across different platforms +when using these functions. + +@menu +* Sorting objects:: +* Sorting vectors:: +* Selecting the k smallest or largest elements:: +* Computing the rank:: +* Sorting Examples:: +* Sorting References and Further Reading:: +@end menu + +@node Sorting objects +@section Sorting objects + +The following function provides a simple alternative to the standard +library function @code{qsort}. It is intended for systems lacking +@code{qsort}, not as a replacement for it. The function @code{qsort} +should be used whenever possible, as it will be faster and can provide +stable ordering of equal elements. Documentation for @code{qsort} is +available in the @cite{GNU C Library Reference Manual}. + +The functions described in this section are defined in the header file +@file{gsl_heapsort.h}. + +@cindex comparison functions, definition +@deftypefun void gsl_heapsort (void * @var{array}, size_t @var{count}, size_t @var{size}, gsl_comparison_fn_t @var{compare}) + +This function sorts the @var{count} elements of the array @var{array}, +each of size @var{size}, into ascending order using the comparison +function @var{compare}. The type of the comparison function is defined by, + +@example +int (*gsl_comparison_fn_t) (const void * a, + const void * b) +@end example + +@noindent +A comparison function should return a negative integer if the first +argument is less than the second argument, @code{0} if the two arguments +are equal and a positive integer if the first argument is greater than +the second argument. + +For example, the following function can be used to sort doubles into +ascending numerical order. + +@example +int +compare_doubles (const double * a, + const double * b) +@{ + if (*a > *b) + return 1; + else if (*a < *b) + return -1; + else + return 0; +@} +@end example + +@noindent +The appropriate function call to perform the sort is, + +@example +gsl_heapsort (array, count, sizeof(double), + compare_doubles); +@end example + +Note that unlike @code{qsort} the heapsort algorithm cannot be made into +a stable sort by pointer arithmetic. The trick of comparing pointers for +equal elements in the comparison function does not work for the heapsort +algorithm. The heapsort algorithm performs an internal rearrangement of +the data which destroys its initial ordering. +@end deftypefun + +@cindex indirect sorting +@deftypefun int gsl_heapsort_index (size_t * @var{p}, const void * @var{array}, size_t @var{count}, size_t @var{size}, gsl_comparison_fn_t @var{compare}) + +This function indirectly sorts the @var{count} elements of the array +@var{array}, each of size @var{size}, into ascending order using the +comparison function @var{compare}. The resulting permutation is stored +in @var{p}, an array of length @var{n}. The elements of @var{p} give the +index of the array element which would have been stored in that position +if the array had been sorted in place. The first element of @var{p} +gives the index of the least element in @var{array}, and the last +element of @var{p} gives the index of the greatest element in +@var{array}. The array itself is not changed. +@end deftypefun + +@node Sorting vectors +@section Sorting vectors + +The following functions will sort the elements of an array or vector, +either directly or indirectly. They are defined for all real and integer +types using the normal suffix rules. For example, the @code{float} +versions of the array functions are @code{gsl_sort_float} and +@code{gsl_sort_float_index}. The corresponding vector functions are +@code{gsl_sort_vector_float} and @code{gsl_sort_vector_float_index}. The +prototypes are available in the header files @file{gsl_sort_float.h} +@file{gsl_sort_vector_float.h}. The complete set of prototypes can be +included using the header files @file{gsl_sort.h} and +@file{gsl_sort_vector.h}. + +There are no functions for sorting complex arrays or vectors, since the +ordering of complex numbers is not uniquely defined. To sort a complex +vector by magnitude compute a real vector containing the magnitudes +of the complex elements, and sort this vector indirectly. The resulting +index gives the appropriate ordering of the original complex vector. + +@cindex sorting vector elements +@cindex vector, sorting elements of +@deftypefun void gsl_sort (double * @var{data}, size_t @var{stride}, size_t @var{n}) +This function sorts the @var{n} elements of the array @var{data} with +stride @var{stride} into ascending numerical order. +@end deftypefun + +@deftypefun void gsl_sort_vector (gsl_vector * @var{v}) +This function sorts the elements of the vector @var{v} into ascending +numerical order. +@end deftypefun + +@cindex indirect sorting, of vector elements +@deftypefun void gsl_sort_index (size_t * @var{p}, const double * @var{data}, size_t @var{stride}, size_t @var{n}) +This function indirectly sorts the @var{n} elements of the array +@var{data} with stride @var{stride} into ascending order, storing the +resulting permutation in @var{p}. The array @var{p} must be allocated with +a sufficient length to store the @var{n} elements of the permutation. +The elements of @var{p} give the index of the array element which would +have been stored in that position if the array had been sorted in place. +The array @var{data} is not changed. +@end deftypefun + +@deftypefun int gsl_sort_vector_index (gsl_permutation * @var{p}, const gsl_vector * @var{v}) +This function indirectly sorts the elements of the vector @var{v} into +ascending order, storing the resulting permutation in @var{p}. The +elements of @var{p} give the index of the vector element which would +have been stored in that position if the vector had been sorted in +place. The first element of @var{p} gives the index of the least element +in @var{v}, and the last element of @var{p} gives the index of the +greatest element in @var{v}. The vector @var{v} is not changed. +@end deftypefun + +@node Selecting the k smallest or largest elements +@section Selecting the k smallest or largest elements + +The functions described in this section select the @math{k} smallest +or largest elements of a data set of size @math{N}. The routines use an +@math{O(kN)} direct insertion algorithm which is suited to subsets that +are small compared with the total size of the dataset. For example, the +routines are useful for selecting the 10 largest values from one million +data points, but not for selecting the largest 100,000 values. If the +subset is a significant part of the total dataset it may be faster +to sort all the elements of the dataset directly with an @math{O(N \log +N)} algorithm and obtain the smallest or largest values that way. + +@deftypefun int gsl_sort_smallest (double * @var{dest}, size_t @var{k}, const double * @var{src}, size_t @var{stride}, size_t @var{n}) +This function copies the @var{k} smallest elements of the array +@var{src}, of size @var{n} and stride @var{stride}, in ascending +numerical order into the array @var{dest}. The size @var{k} of the subset must be +less than or equal to @var{n}. The data @var{src} is not modified by +this operation. +@end deftypefun + +@deftypefun int gsl_sort_largest (double * @var{dest}, size_t @var{k}, const double * @var{src}, size_t @var{stride}, size_t @var{n}) +This function copies the @var{k} largest elements of the array +@var{src}, of size @var{n} and stride @var{stride}, in descending +numerical order into the array @var{dest}. @var{k} must be +less than or equal to @var{n}. The data @var{src} is not modified by +this operation. +@end deftypefun + +@deftypefun int gsl_sort_vector_smallest (double * @var{dest}, size_t @var{k}, const gsl_vector * @var{v}) +@deftypefunx int gsl_sort_vector_largest (double * @var{dest}, size_t @var{k}, const gsl_vector * @var{v}) +These functions copy the @var{k} smallest or largest elements of the +vector @var{v} into the array @var{dest}. @var{k} +must be less than or equal to the length of the vector @var{v}. +@end deftypefun + +The following functions find the indices of the @math{k} smallest or +largest elements of a dataset, + +@deftypefun int gsl_sort_smallest_index (size_t * @var{p}, size_t @var{k}, const double * @var{src}, size_t @var{stride}, size_t @var{n}) +This function stores the indices of the @var{k} smallest elements of +the array @var{src}, of size @var{n} and stride @var{stride}, in the +array @var{p}. The indices are chosen so that the corresponding data is +in ascending numerical order. @var{k} must be +less than or equal to @var{n}. The data @var{src} is not modified by +this operation. +@end deftypefun + +@deftypefun int gsl_sort_largest_index (size_t * @var{p}, size_t @var{k}, const double * @var{src}, size_t @var{stride}, size_t @var{n}) +This function stores the indices of the @var{k} largest elements of +the array @var{src}, of size @var{n} and stride @var{stride}, in the +array @var{p}. The indices are chosen so that the corresponding data is +in descending numerical order. @var{k} must be +less than or equal to @var{n}. The data @var{src} is not modified by +this operation. +@end deftypefun + +@deftypefun int gsl_sort_vector_smallest_index (size_t * @var{p}, size_t @var{k}, const gsl_vector * @var{v}) +@deftypefunx int gsl_sort_vector_largest_index (size_t * @var{p}, size_t @var{k}, const gsl_vector * @var{v}) +These functions store the indices of the @var{k} smallest or largest +elements of the vector @var{v} in the array @var{p}. @var{k} must be less than or equal to the length of the vector +@var{v}. +@end deftypefun + + +@node Computing the rank +@section Computing the rank + +The @dfn{rank} of an element is its order in the sorted data. The rank +is the inverse of the index permutation, @var{p}. It can be computed +using the following algorithm, + +@example +for (i = 0; i < p->size; i++) +@{ + size_t pi = p->data[i]; + rank->data[pi] = i; +@} +@end example + +@noindent +This can be computed directly from the function +@code{gsl_permutation_inverse(rank,p)}. + +The following function will print the rank of each element of the vector +@var{v}, + +@example +void +print_rank (gsl_vector * v) +@{ + size_t i; + size_t n = v->size; + gsl_permutation * perm = gsl_permutation_alloc(n); + gsl_permutation * rank = gsl_permutation_alloc(n); + + gsl_sort_vector_index (perm, v); + gsl_permutation_inverse (rank, perm); + + for (i = 0; i < n; i++) + @{ + double vi = gsl_vector_get(v, i); + printf ("element = %d, value = %g, rank = %d\n", + i, vi, rank->data[i]); + @} + + gsl_permutation_free (perm); + gsl_permutation_free (rank); +@} +@end example + +@node Sorting Examples +@section Examples + +The following example shows how to use the permutation @var{p} to print +the elements of the vector @var{v} in ascending order, + +@example +gsl_sort_vector_index (p, v); + +for (i = 0; i < v->size; i++) +@{ + double vpi = gsl_vector_get (v, p->data[i]); + printf ("order = %d, value = %g\n", i, vpi); +@} +@end example + +@noindent +The next example uses the function @code{gsl_sort_smallest} to select +the 5 smallest numbers from 100000 uniform random variates stored in an +array, + +@example +@verbatiminclude examples/sortsmall.c +@end example +The output lists the 5 smallest values, in ascending order, + +@example +$ ./a.out +@verbatiminclude examples/sortsmall.out +@end example + +@node Sorting References and Further Reading +@section References and Further Reading + +The subject of sorting is covered extensively in Knuth's +@cite{Sorting and Searching}, + +@itemize @asis +@item +Donald E. Knuth, @cite{The Art of Computer Programming: Sorting and +Searching} (Vol 3, 3rd Ed, 1997), Addison-Wesley, ISBN 0201896850. +@end itemize + +@noindent +The Heapsort algorithm is described in the following book, + +@itemize @asis +@item Robert Sedgewick, @cite{Algorithms in C}, Addison-Wesley, +ISBN 0201514257. +@end itemize + + |