gsl-1.9/doc/fft.texi


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001

@cindex FFT
@cindex Fast Fourier Transforms, see FFT
@cindex Fourier Transforms, see FFT
@cindex Discrete Fourier Transforms, see FFT
@cindex DFTs, see FFT

This chapter describes functions for performing Fast Fourier Transforms
(FFTs).  The library includes radix-2 routines (for lengths which are a
power of two) and mixed-radix routines (which work for any length).  For
efficiency there are separate versions of the routines for real data and
for complex data.  The mixed-radix routines are a reimplementation of the
@sc{fftpack} library of Paul Swarztrauber.  Fortran code for @sc{fftpack} is
available on Netlib (@sc{fftpack} also includes some routines for sine and
cosine transforms but these are currently not available in GSL).  For
details and derivations of the underlying algorithms consult the
document @cite{GSL FFT Algorithms} (@pxref{FFT References and Further Reading})

@menu
* Mathematical Definitions::    
* Overview of complex data FFTs::  
* Radix-2 FFT routines for complex data::  
* Mixed-radix FFT routines for complex data::  
* Overview of real data FFTs::  
* Radix-2 FFT routines for real data::  
* Mixed-radix FFT routines for real data::  
* FFT References and Further Reading::  
@end menu

@node Mathematical Definitions
@section Mathematical Definitions 
@cindex FFT mathematical definition

Fast Fourier Transforms are efficient algorithms for
calculating the discrete fourier transform (DFT),
@tex
\beforedisplay
$$
x_j = \sum_{k=0}^{N-1} z_k \exp(-2\pi i j k / N) 
$$
\afterdisplay
@end tex
@ifinfo

@example
x_j = \sum_@{k=0@}^@{N-1@} z_k \exp(-2\pi i j k / N) 
@end example
@end ifinfo

The DFT usually arises as an approximation to the continuous fourier
transform when functions are sampled at discrete intervals in space or
time.  The naive evaluation of the discrete fourier transform is a
matrix-vector multiplication 
@c{$W\vec{z}$}
@math{W\vec@{z@}}. A general matrix-vector multiplication takes
@math{O(N^2)} operations for @math{N} data-points.  Fast fourier
transform algorithms use a divide-and-conquer strategy to factorize the
matrix @math{W} into smaller sub-matrices, corresponding to the integer
factors of the length @math{N}.  If @math{N} can be factorized into a
product of integers
@c{$f_1 f_2 \ldots f_n$} 
@math{f_1 f_2 ... f_n} then the DFT can be computed in @math{O(N \sum
f_i)} operations.  For a radix-2 FFT this gives an operation count of
@math{O(N \log_2 N)}.

All the FFT functions offer three types of transform: forwards, inverse
and backwards, based on the same mathematical definitions.  The
definition of the @dfn{forward fourier transform},
@c{$x = \hbox{FFT}(z)$}
@math{x = FFT(z)}, is,
@tex
\beforedisplay
$$
x_j = \sum_{k=0}^{N-1} z_k \exp(-2\pi i j k / N) 
$$
\afterdisplay
@end tex
@ifinfo

@example
x_j = \sum_@{k=0@}^@{N-1@} z_k \exp(-2\pi i j k / N) 
@end example

@end ifinfo
@noindent
and the definition of the @dfn{inverse fourier transform},
@c{$x = \hbox{IFFT}(z)$}
@math{x = IFFT(z)}, is,
@tex
\beforedisplay
$$
z_j = {1 \over N} \sum_{k=0}^{N-1} x_k \exp(2\pi i j k / N).
$$
\afterdisplay
@end tex
@ifinfo

@example
z_j = @{1 \over N@} \sum_@{k=0@}^@{N-1@} x_k \exp(2\pi i j k / N).
@end example

@end ifinfo
@noindent
The factor of @math{1/N} makes this a true inverse.  For example, a call
to @code{gsl_fft_complex_forward} followed by a call to
@code{gsl_fft_complex_inverse} should return the original data (within
numerical errors).

In general there are two possible choices for the sign of the
exponential in the transform/ inverse-transform pair. GSL follows the
same convention as @sc{fftpack}, using a negative exponential for the forward
transform.  The advantage of this convention is that the inverse
transform recreates the original function with simple fourier
synthesis.  Numerical Recipes uses the opposite convention, a positive
exponential in the forward transform.

The @dfn{backwards FFT} is simply our terminology for an unscaled
version of the inverse FFT,
@tex
\beforedisplay
$$
z^{backwards}_j = \sum_{k=0}^{N-1} x_k \exp(2\pi i j k / N).
$$
\afterdisplay
@end tex
@ifinfo

@example
z^@{backwards@}_j = \sum_@{k=0@}^@{N-1@} x_k \exp(2\pi i j k / N).
@end example

@end ifinfo
@noindent
When the overall scale of the result is unimportant it is often
convenient to use the backwards FFT instead of the inverse to save
unnecessary divisions.

@node Overview of complex data FFTs
@section Overview of complex data FFTs
@cindex FFT, complex data

The inputs and outputs for the complex FFT routines are @dfn{packed
arrays} of floating point numbers.  In a packed array the real and
imaginary parts of each complex number are placed in alternate
neighboring elements.  For example, the following definition of a packed
array of length 6,

@example
double x[3*2];
gsl_complex_packed_array data = x;
@end example

@noindent
can be used to hold an array of three complex numbers, @code{z[3]}, in
the following way,

@example
data[0] = Re(z[0])
data[1] = Im(z[0])
data[2] = Re(z[1])
data[3] = Im(z[1])
data[4] = Re(z[2])
data[5] = Im(z[2])
@end example

@noindent
The array indices for the data have the same ordering as those
in the definition of the DFT---i.e. there are no index transformations
or permutations of the data.

A @dfn{stride} parameter allows the user to perform transforms on the
elements @code{z[stride*i]} instead of @code{z[i]}.  A stride greater
than 1 can be used to take an in-place FFT of the column of a matrix. A
stride of 1 accesses the array without any additional spacing between
elements.  

To perform an FFT on a vector argument, such as @code{gsl_vector_complex
* v}, use the following definitions (or their equivalents) when calling
the functions described in this chapter:

@example
gsl_complex_packed_array data = v->data;
size_t stride = v->stride;
size_t n = v->size;
@end example

For physical applications it is important to remember that the index
appearing in the DFT does not correspond directly to a physical
frequency.  If the time-step of the DFT is @math{\Delta} then the
frequency-domain includes both positive and negative frequencies,
ranging from @math{-1/(2\Delta)} through 0 to @math{+1/(2\Delta)}.  The
positive frequencies are stored from the beginning of the array up to
the middle, and the negative frequencies are stored backwards from the
end of the array.

Here is a table which shows the layout of the array @var{data}, and the
correspondence between the time-domain data @math{z}, and the
frequency-domain data @math{x}.

@example
index    z               x = FFT(z)

0        z(t = 0)        x(f = 0)
1        z(t = 1)        x(f = 1/(N Delta))
2        z(t = 2)        x(f = 2/(N Delta))
.        ........        ..................
N/2      z(t = N/2)      x(f = +1/(2 Delta),
                               -1/(2 Delta))
.        ........        ..................
N-3      z(t = N-3)      x(f = -3/(N Delta))
N-2      z(t = N-2)      x(f = -2/(N Delta))
N-1      z(t = N-1)      x(f = -1/(N Delta))
@end example

@noindent
When @math{N} is even the location @math{N/2} contains the most positive
and negative frequencies (@math{+1/(2 \Delta)}, @math{-1/(2 \Delta)})
which are equivalent.  If @math{N} is odd then general structure of the
table above still applies, but @math{N/2} does not appear.


@node Radix-2 FFT routines for complex data
@section Radix-2 FFT routines for complex data
@cindex FFT of complex data, radix-2 algorithm
@cindex Radix-2 FFT, complex data

The radix-2 algorithms described in this section are simple and compact,
although not necessarily the most efficient.  They use the Cooley-Tukey
algorithm to compute in-place complex FFTs for lengths which are a power
of 2---no additional storage is required.  The corresponding
self-sorting mixed-radix routines offer better performance at the
expense of requiring additional working space.

All the functions described in this section are declared in the header file @file{gsl_fft_complex.h}.

@deftypefun int gsl_fft_complex_radix2_forward (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n})

@deftypefunx int gsl_fft_complex_radix2_transform (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n}, gsl_fft_direction @var{sign})

@deftypefunx int gsl_fft_complex_radix2_backward (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n})

@deftypefunx int gsl_fft_complex_radix2_inverse (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n})

These functions compute forward, backward and inverse FFTs of length
@var{n} with stride @var{stride}, on the packed complex array @var{data}
using an in-place radix-2 decimation-in-time algorithm.  The length of
the transform @var{n} is restricted to powers of two.  For the
@code{transform} version of the function the @var{sign} argument can be
either @code{forward} (@math{-1}) or @code{backward} (@math{+1}).

The functions return a value of @code{GSL_SUCCESS} if no errors were
detected, or @code{GSL_EDOM} if the length of the data @var{n} is not a
power of two.
@end deftypefun

@deftypefun int gsl_fft_complex_radix2_dif_forward (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n})

@deftypefunx int gsl_fft_complex_radix2_dif_transform (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n}, gsl_fft_direction @var{sign})

@deftypefunx int gsl_fft_complex_radix2_dif_backward (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n})

@deftypefunx int gsl_fft_complex_radix2_dif_inverse (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n})

These are decimation-in-frequency versions of the radix-2 FFT functions.

@end deftypefun


@comment @node Example of using radix-2 FFT routines for complex data
@comment @subsection Example of using radix-2 FFT routines for complex data

Here is an example program which computes the FFT of a short pulse in a
sample of length 128.  To make the resulting fourier transform real the
pulse is defined for equal positive and negative times (@math{-10}
@dots{} @math{10}), where the negative times wrap around the end of the
array.

@example
@verbatiminclude examples/fft.c
@end example

@noindent
Note that we have assumed that the program is using the default error
handler (which calls @code{abort} for any errors).  If you are not using
a safe error handler you would need to check the return status of
@code{gsl_fft_complex_radix2_forward}.

The transformed data is rescaled by @math{1/\sqrt N} so that it fits on
the same plot as the input.  Only the real part is shown, by the choice
of the input data the imaginary part is zero.  Allowing for the
wrap-around of negative times at @math{t=128}, and working in units of
@math{k/N}, the DFT approximates the continuum fourier transform, giving
a modulated sine function.
@iftex
@tex
\beforedisplay
$$
\int_{-a}^{+a} e^{-2 \pi i k x} dx = {\sin(2\pi k a) \over\pi k}
$$
\afterdisplay
@end tex

@sp 1
@center @image{fft-complex-radix2-t,2.8in} 
@center @image{fft-complex-radix2-f,2.8in}
@quotation
A pulse and its discrete fourier transform, output from
the example program.
@end quotation
@end iftex

@node Mixed-radix FFT routines for complex data
@section Mixed-radix FFT routines for complex data
@cindex FFT of complex data, mixed-radix algorithm
@cindex Mixed-radix FFT, complex data

This section describes mixed-radix FFT algorithms for complex data.  The
mixed-radix functions work for FFTs of any length.  They are a
reimplementation of Paul Swarztrauber's Fortran @sc{fftpack} library.
The theory is explained in the review article @cite{Self-sorting
Mixed-radix FFTs} by Clive Temperton.  The routines here use the same
indexing scheme and basic algorithms as @sc{fftpack}.

The mixed-radix algorithm is based on sub-transform modules---highly
optimized small length FFTs which are combined to create larger FFTs.
There are efficient modules for factors of 2, 3, 4, 5, 6 and 7.  The
modules for the composite factors of 4 and 6 are faster than combining
the modules for @math{2*2} and @math{2*3}.

For factors which are not implemented as modules there is a fall-back to
a general length-@math{n} module which uses Singleton's method for
efficiently computing a DFT. This module is @math{O(n^2)}, and slower
than a dedicated module would be but works for any length @math{n}.  Of
course, lengths which use the general length-@math{n} module will still
be factorized as much as possible.  For example, a length of 143 will be
factorized into @math{11*13}.  Large prime factors are the worst case
scenario, e.g. as found in @math{n=2*3*99991}, and should be avoided
because their @math{O(n^2)} scaling will dominate the run-time (consult
the document @cite{GSL FFT Algorithms} included in the GSL distribution
if you encounter this problem).

The mixed-radix initialization function @code{gsl_fft_complex_wavetable_alloc}
returns the list of factors chosen by the library for a given length
@math{N}.  It can be used to check how well the length has been
factorized, and estimate the run-time.  To a first approximation the
run-time scales as @math{N \sum f_i}, where the @math{f_i} are the
factors of @math{N}.  For programs under user control you may wish to
issue a warning that the transform will be slow when the length is
poorly factorized.  If you frequently encounter data lengths which
cannot be factorized using the existing small-prime modules consult
@cite{GSL FFT Algorithms} for details on adding support for other
factors.

@comment First, the space for the trigonometric lookup tables and scratch area is
@comment allocated by a call to one of the @code{alloc} functions.  We
@comment call the combination of factorization, scratch space and trigonometric
@comment lookup arrays a @dfn{wavetable}.  It contains the sine and cosine
@comment waveforms for the all the frequencies that will be used in the FFT.

@comment The wavetable is initialized by a call to the corresponding @code{init}
@comment function.  It factorizes the data length, using the implemented
@comment subtransforms as preferred factors wherever possible.  The trigonometric
@comment lookup table for the chosen factorization is also computed.

@comment An FFT is computed by a call to one of the @code{forward},
@comment @code{backward} or @code{inverse} functions, with the data, length and
@comment wavetable as arguments.

All the functions described in this section are declared in the header
file @file{gsl_fft_complex.h}.

@deftypefun {gsl_fft_complex_wavetable *} gsl_fft_complex_wavetable_alloc (size_t @var{n})
This function prepares a trigonometric lookup table for a complex FFT of
length @var{n}. The function returns a pointer to the newly allocated
@code{gsl_fft_complex_wavetable} if no errors were detected, and a null
pointer in the case of error.  The length @var{n} is factorized into a
product of subtransforms, and the factors and their trigonometric
coefficients are stored in the wavetable. The trigonometric coefficients
are computed using direct calls to @code{sin} and @code{cos}, for
accuracy.  Recursion relations could be used to compute the lookup table
faster, but if an application performs many FFTs of the same length then
this computation is a one-off overhead which does not affect the final
throughput.

The wavetable structure can be used repeatedly for any transform of the
same length.  The table is not modified by calls to any of the other FFT
functions.  The same wavetable can be used for both forward and backward
(or inverse) transforms of a given length.
@end deftypefun

@deftypefun void gsl_fft_complex_wavetable_free (gsl_fft_complex_wavetable * @var{wavetable})
This function frees the memory associated with the wavetable
@var{wavetable}.  The wavetable can be freed if no further FFTs of the
same length will be needed.
@end deftypefun

@noindent
These functions operate on a @code{gsl_fft_complex_wavetable} structure
which contains internal parameters for the FFT.  It is not necessary to
set any of the components directly but it can sometimes be useful to
examine them.  For example, the chosen factorization of the FFT length
is given and can be used to provide an estimate of the run-time or
numerical error. The wavetable structure is declared in the header file
@file{gsl_fft_complex.h}.

@deftp {Data Type} gsl_fft_complex_wavetable
This is a structure that holds the factorization and trigonometric
lookup tables for the mixed radix fft algorithm.  It has the following
components:

@table @code
@item size_t n
This is the number of complex data points

@item size_t nf
This is the number of factors that the length @code{n} was decomposed into.

@item size_t factor[64]
This is the array of factors.  Only the first @code{nf} elements are
used. 

@comment (FIXME: This is a fixed length array and therefore probably in
@comment violation of the GNU Coding Standards).

@item gsl_complex * trig
This is a pointer to a preallocated trigonometric lookup table of
@code{n} complex elements.

@item gsl_complex * twiddle[64]
This is an array of pointers into @code{trig}, giving the twiddle
factors for each pass.
@end table
@end deftp

@noindent
The mixed radix algorithms require additional working space to hold
the intermediate steps of the transform.

@deftypefun {gsl_fft_complex_workspace *} gsl_fft_complex_workspace_alloc (size_t @var{n})
This function allocates a workspace for a complex transform of length
@var{n}.
@end deftypefun

@deftypefun void gsl_fft_complex_workspace_free (gsl_fft_complex_workspace * @var{workspace})
This function frees the memory associated with the workspace
@var{workspace}. The workspace can be freed if no further FFTs of the
same length will be needed.
@end deftypefun

@comment @deftp {Data Type} gsl_fft_complex_workspace
@comment This is a structure that holds the workspace for the mixed radix fft
@comment algorithm.  It has the following components:
@comment
@comment @table @code
@comment @item gsl_complex * scratch
@comment This is a pointer to a workspace of @code{n} complex elements,
@comment capable of holding intermediate copies of the original data set.
@comment @end table
@comment @end deftp

@noindent
The following functions compute the transform,

@deftypefun int gsl_fft_complex_forward (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n}, const gsl_fft_complex_wavetable * @var{wavetable}, gsl_fft_complex_workspace * @var{work})
@deftypefunx int gsl_fft_complex_transform (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n}, const gsl_fft_complex_wavetable * @var{wavetable}, gsl_fft_complex_workspace * @var{work}, gsl_fft_direction @var{sign})
@deftypefunx int gsl_fft_complex_backward (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n}, const gsl_fft_complex_wavetable * @var{wavetable}, gsl_fft_complex_workspace * @var{work})
@deftypefunx int gsl_fft_complex_inverse (gsl_complex_packed_array @var{data}, size_t @var{stride}, size_t @var{n}, const gsl_fft_complex_wavetable * @var{wavetable}, gsl_fft_complex_workspace * @var{work})

These functions compute forward, backward and inverse FFTs of length
@var{n} with stride @var{stride}, on the packed complex array
@var{data}, using a mixed radix decimation-in-frequency algorithm.
There is no restriction on the length @var{n}.  Efficient modules are
provided for subtransforms of length 2, 3, 4, 5, 6 and 7.  Any remaining
factors are computed with a slow, @math{O(n^2)}, general-@math{n}
module. The caller must supply a @var{wavetable} containing the
trigonometric lookup tables and a workspace @var{work}.  For the
@code{transform} version of the function the @var{sign} argument can be
either @code{forward} (@math{-1}) or @code{backward} (@math{+1}).

The functions return a value of @code{0} if no errors were detected. The
following @code{gsl_errno} conditions are defined for these functions:

@table @code
@item GSL_EDOM
The length of the data @var{n} is not a positive integer (i.e. @var{n}
is zero).

@item GSL_EINVAL
The length of the data @var{n} and the length used to compute the given
@var{wavetable} do not match.
@end table
@end deftypefun

@comment @node Example of using mixed-radix FFT routines for complex data
@comment @subsection Example of using mixed-radix FFT routines for complex data

Here is an example program which computes the FFT of a short pulse in a
sample of length 630 (@math{=2*3*3*5*7}) using the mixed-radix
algorithm.

@example
@verbatiminclude examples/fftmr.c
@end example

@noindent
Note that we have assumed that the program is using the default
@code{gsl} error handler (which calls @code{abort} for any errors).  If
you are not using a safe error handler you would need to check the
return status of all the @code{gsl} routines.

@node Overview of real data FFTs
@section Overview of real data FFTs
@cindex FFT of real data
The functions for real data are similar to those for complex data.
However, there is an important difference between forward and inverse
transforms.  The fourier transform of a real sequence is not real.  It is
a complex sequence with a special symmetry:
@tex
\beforedisplay
$$
z_k = z_{N-k}^*
$$
\afterdisplay
@end tex
@ifinfo

@example
z_k = z_@{N-k@}^*
@end example

@end ifinfo
@noindent
A sequence with this symmetry is called @dfn{conjugate-complex} or
@dfn{half-complex}.  This different structure requires different
storage layouts for the forward transform (from real to half-complex)
and inverse transform (from half-complex back to real).  As a
consequence the routines are divided into two sets: functions in
@code{gsl_fft_real} which operate on real sequences and functions in
@code{gsl_fft_halfcomplex} which operate on half-complex sequences.

Functions in @code{gsl_fft_real} compute the frequency coefficients of a
real sequence.  The half-complex coefficients @math{c} of a real sequence
@math{x} are given by fourier analysis,
@tex
\beforedisplay
$$
c_k = \sum_{j=0}^{N-1} x_j \exp(-2 \pi i j k /N)
$$
\afterdisplay
@end tex
@ifinfo

@example
c_k = \sum_@{j=0@}^@{N-1@} x_j \exp(-2 \pi i j k /N)
@end example

@end ifinfo
@noindent
Functions in @code{gsl_fft_halfcomplex} compute inverse or backwards
transforms.  They reconstruct real sequences by fourier synthesis from
their half-complex frequency coefficients, @math{c},
@tex
\beforedisplay
$$
x_j = {1 \over N} \sum_{k=0}^{N-1} c_k \exp(2 \pi i j k /N)
$$
\afterdisplay
@end tex
@ifinfo

@example
x_j = @{1 \over N@} \sum_@{k=0@}^@{N-1@} c_k \exp(2 \pi i j k /N)
@end example

@end ifinfo
@noindent
The symmetry of the half-complex sequence implies that only half of the
complex numbers in the output need to be stored.  The remaining half can
be reconstructed using the half-complex symmetry condition. This works
for all lengths, even and odd---when the length is even the middle value
where @math{k=N/2} is also real.  Thus only @var{N} real numbers are
required to store the half-complex sequence, and the transform of a real
sequence can be stored in the same size array as the original data.

The precise storage arrangements depend on the algorithm, and are
different for radix-2 and mixed-radix routines.  The radix-2 function
operates in-place, which constrains the locations where each element can
be stored.  The restriction forces real and imaginary parts to be stored
far apart.  The mixed-radix algorithm does not have this restriction, and
it stores the real and imaginary parts of a given term in neighboring
locations (which is desirable for better locality of memory accesses).

@node Radix-2 FFT routines for real data
@section Radix-2 FFT routines for real data
@cindex FFT of real data, radix-2 algorithm
@cindex Radix-2 FFT for real data

This section describes radix-2 FFT algorithms for real data.  They use
the Cooley-Tukey algorithm to compute in-place FFTs for lengths which
are a power of 2. 

The radix-2 FFT functions for real data are declared in the header files
@file{gsl_fft_real.h} 

@deftypefun int gsl_fft_real_radix2_transform (double @var{data}[], size_t @var{stride}, size_t @var{n})

This function computes an in-place radix-2 FFT of length @var{n} and
stride @var{stride} on the real array @var{data}.  The output is a
half-complex sequence, which is stored in-place.  The arrangement of the
half-complex terms uses the following scheme: for @math{k < N/2} the
real part of the @math{k}-th term is stored in location @math{k}, and
the corresponding imaginary part is stored in location @math{N-k}.  Terms
with @math{k > N/2} can be reconstructed using the symmetry 
@c{$z_k = z^*_{N-k}$}
@math{z_k = z^*_@{N-k@}}. 
The terms for @math{k=0} and @math{k=N/2} are both purely
real, and count as a special case.  Their real parts are stored in
locations @math{0} and @math{N/2} respectively, while their imaginary
parts which are zero are not stored.

The following table shows the correspondence between the output
@var{data} and the equivalent results obtained by considering the input
data as a complex sequence with zero imaginary part,

@example
complex[0].real    =    data[0] 
complex[0].imag    =    0 
complex[1].real    =    data[1] 
complex[1].imag    =    data[N-1]
...............         ................
complex[k].real    =    data[k]
complex[k].imag    =    data[N-k] 
...............         ................
complex[N/2].real  =    data[N/2]
complex[N/2].imag  =    0
...............         ................
complex[k'].real   =    data[k]        k' = N - k
complex[k'].imag   =   -data[N-k] 
...............         ................
complex[N-1].real  =    data[1]
complex[N-1].imag  =   -data[N-1]
@end example
@noindent
Note that the output data can be converted into the full complex
sequence using the function @code{gsl_fft_halfcomplex_unpack}
described in the next section.

@end deftypefun
The radix-2 FFT functions for halfcomplex data are declared in the
header file @file{gsl_fft_halfcomplex.h}.

@deftypefun int gsl_fft_halfcomplex_radix2_inverse (double @var{data}[], size_t @var{stride}, size_t @var{n})
@deftypefunx int gsl_fft_halfcomplex_radix2_backward (double @var{data}[], size_t @var{stride}, size_t @var{n})

These functions compute the inverse or backwards in-place radix-2 FFT of
length @var{n} and stride @var{stride} on the half-complex sequence
@var{data} stored according the output scheme used by
@code{gsl_fft_real_radix2}.  The result is a real array stored in natural
order.

@end deftypefun


@node Mixed-radix FFT routines for real data
@section Mixed-radix FFT routines for real data
@cindex FFT of real data, mixed-radix algorithm
@cindex Mixed-radix FFT, real data

This section describes mixed-radix FFT algorithms for real data.  The
mixed-radix functions work for FFTs of any length.  They are a
reimplementation of the real-FFT routines in the Fortran @sc{fftpack} library
by Paul Swarztrauber.  The theory behind the algorithm is explained in
the article @cite{Fast Mixed-Radix Real Fourier Transforms} by Clive
Temperton.  The routines here use the same indexing scheme and basic
algorithms as @sc{fftpack}.

The functions use the @sc{fftpack} storage convention for half-complex
sequences.  In this convention the half-complex transform of a real
sequence is stored with frequencies in increasing order, starting at
zero, with the real and imaginary parts of each frequency in neighboring
locations.  When a value is known to be real the imaginary part is not
stored.  The imaginary part of the zero-frequency component is never
stored.  It is known to be zero (since the zero frequency component is
simply the sum of the input data (all real)).  For a sequence of even
length the imaginary part of the frequency @math{n/2} is not stored
either, since the symmetry 
@c{$z_k = z_{N-k}^*$}
@math{z_k = z_@{N-k@}^*} implies that this is
purely real too.

The storage scheme is best shown by some examples.  The table below
shows the output for an odd-length sequence, @math{n=5}.  The two columns
give the correspondence between the 5 values in the half-complex
sequence returned by @code{gsl_fft_real_transform}, @var{halfcomplex}[] and the
values @var{complex}[] that would be returned if the same real input
sequence were passed to @code{gsl_fft_complex_backward} as a complex
sequence (with imaginary parts set to @code{0}),

@example
complex[0].real  =  halfcomplex[0] 
complex[0].imag  =  0
complex[1].real  =  halfcomplex[1] 
complex[1].imag  =  halfcomplex[2]
complex[2].real  =  halfcomplex[3]
complex[2].imag  =  halfcomplex[4]
complex[3].real  =  halfcomplex[3]
complex[3].imag  = -halfcomplex[4]
complex[4].real  =  halfcomplex[1]
complex[4].imag  = -halfcomplex[2]
@end example

@noindent
The upper elements of the @var{complex} array, @code{complex[3]} and
@code{complex[4]} are filled in using the symmetry condition.  The
imaginary part of the zero-frequency term @code{complex[0].imag} is
known to be zero by the symmetry.

The next table shows the output for an even-length sequence, @math{n=6}
In the even case there are two values which are purely real,

@example
complex[0].real  =  halfcomplex[0]
complex[0].imag  =  0
complex[1].real  =  halfcomplex[1] 
complex[1].imag  =  halfcomplex[2] 
complex[2].real  =  halfcomplex[3] 
complex[2].imag  =  halfcomplex[4] 
complex[3].real  =  halfcomplex[5] 
complex[3].imag  =  0 
complex[4].real  =  halfcomplex[3] 
complex[4].imag  = -halfcomplex[4]
complex[5].real  =  halfcomplex[1] 
complex[5].imag  = -halfcomplex[2] 
@end example

@noindent
The upper elements of the @var{complex} array, @code{complex[4]} and
@code{complex[5]} are filled in using the symmetry condition.  Both
@code{complex[0].imag} and @code{complex[3].imag} are known to be zero.

All these functions are declared in the header files
@file{gsl_fft_real.h} and @file{gsl_fft_halfcomplex.h}.

@deftypefun {gsl_fft_real_wavetable *} gsl_fft_real_wavetable_alloc (size_t @var{n})
@deftypefunx {gsl_fft_halfcomplex_wavetable *} gsl_fft_halfcomplex_wavetable_alloc (size_t @var{n})
These functions prepare trigonometric lookup tables for an FFT of size
@math{n} real elements.  The functions return a pointer to the newly
allocated struct if no errors were detected, and a null pointer in the
case of error.  The length @var{n} is factorized into a product of
subtransforms, and the factors and their trigonometric coefficients are
stored in the wavetable. The trigonometric coefficients are computed
using direct calls to @code{sin} and @code{cos}, for accuracy.
Recursion relations could be used to compute the lookup table faster,
but if an application performs many FFTs of the same length then
computing the wavetable is a one-off overhead which does not affect the
final throughput.

The wavetable structure can be used repeatedly for any transform of the
same length.  The table is not modified by calls to any of the other FFT
functions.  The appropriate type of wavetable must be used for forward
real or inverse half-complex transforms.
@end deftypefun

@deftypefun void gsl_fft_real_wavetable_free (gsl_fft_real_wavetable * @var{wavetable})
@deftypefunx void gsl_fft_halfcomplex_wavetable_free (gsl_fft_halfcomplex_wavetable * @var{wavetable})
These functions free the memory associated with the wavetable
@var{wavetable}. The wavetable can be freed if no further FFTs of the
same length will be needed.
@end deftypefun

@noindent
The mixed radix algorithms require additional working space to hold
the intermediate steps of the transform,

@deftypefun {gsl_fft_real_workspace *} gsl_fft_real_workspace_alloc (size_t @var{n})
This function allocates a workspace for a real transform of length
@var{n}.  The same workspace can be used for both forward real and inverse
halfcomplex transforms.
@end deftypefun

@deftypefun void gsl_fft_real_workspace_free (gsl_fft_real_workspace * @var{workspace})
This function frees the memory associated with the workspace
@var{workspace}. The workspace can be freed if no further FFTs of the
same length will be needed.
@end deftypefun

@noindent
The following functions compute the transforms of real and half-complex
data,

@deftypefun int gsl_fft_real_transform (double @var{data}[], size_t @var{stride}, size_t @var{n}, const gsl_fft_real_wavetable * @var{wavetable}, gsl_fft_real_workspace * @var{work})
@deftypefunx int gsl_fft_halfcomplex_transform (double @var{data}[], size_t @var{stride}, size_t @var{n}, const gsl_fft_halfcomplex_wavetable * @var{wavetable}, gsl_fft_real_workspace * @var{work})
These functions compute the FFT of @var{data}, a real or half-complex
array of length @var{n}, using a mixed radix decimation-in-frequency
algorithm.  For @code{gsl_fft_real_transform} @var{data} is an array of
time-ordered real data.  For @code{gsl_fft_halfcomplex_transform}
@var{data} contains fourier coefficients in the half-complex ordering
described above.  There is no restriction on the length @var{n}.
Efficient modules are provided for subtransforms of length 2, 3, 4 and
5.  Any remaining factors are computed with a slow, @math{O(n^2)},
general-n module.  The caller must supply a @var{wavetable} containing
trigonometric lookup tables and a workspace @var{work}. 
@end deftypefun

@deftypefun int gsl_fft_real_unpack (const double @var{real_coefficient}[], gsl_complex_packed_array @var{complex_coefficient}, size_t @var{stride}, size_t @var{n})

This function converts a single real array, @var{real_coefficient} into
an equivalent complex array, @var{complex_coefficient}, (with imaginary
part set to zero), suitable for @code{gsl_fft_complex} routines.  The
algorithm for the conversion is simply,

@example
for (i = 0; i < n; i++)
  @{
    complex_coefficient[i].real 
      = real_coefficient[i];
    complex_coefficient[i].imag 
      = 0.0;
  @}
@end example
@end deftypefun

@deftypefun int gsl_fft_halfcomplex_unpack (const double @var{halfcomplex_coefficient}[], gsl_complex_packed_array @var{complex_coefficient}, size_t @var{stride}, size_t @var{n})

This function converts @var{halfcomplex_coefficient}, an array of
half-complex coefficients as returned by @code{gsl_fft_real_transform}, into an
ordinary complex array, @var{complex_coefficient}.  It fills in the
complex array using the symmetry 
@c{$z_k = z_{N-k}^*$}
@math{z_k = z_@{N-k@}^*}
to reconstruct the redundant elements.  The algorithm for the conversion
is,

@example  
complex_coefficient[0].real 
  = halfcomplex_coefficient[0];
complex_coefficient[0].imag 
  = 0.0;

for (i = 1; i < n - i; i++)
  @{
    double hc_real 
      = halfcomplex_coefficient[2 * i - 1];
    double hc_imag 
      = halfcomplex_coefficient[2 * i];
    complex_coefficient[i].real = hc_real;
    complex_coefficient[i].imag = hc_imag;
    complex_coefficient[n - i].real = hc_real;
    complex_coefficient[n - i].imag = -hc_imag;
  @}

if (i == n - i)
  @{
    complex_coefficient[i].real 
      = halfcomplex_coefficient[n - 1];
    complex_coefficient[i].imag 
      = 0.0;
  @}
@end example
@end deftypefun

@comment @node Example of using mixed-radix FFT routines for real data
@comment @subsection Example of using mixed-radix FFT routines for real data

Here is an example program using @code{gsl_fft_real_transform} and
@code{gsl_fft_halfcomplex_inverse}.  It generates a real signal in the
shape of a square pulse.  The pulse is fourier transformed to frequency
space, and all but the lowest ten frequency components are removed from
the array of fourier coefficients returned by
@code{gsl_fft_real_transform}.

The remaining fourier coefficients are transformed back to the
time-domain, to give a filtered version of the square pulse.  Since
fourier coefficients are stored using the half-complex symmetry both
positive and negative frequencies are removed and the final filtered
signal is also real.

@example
@verbatiminclude examples/fftreal.c
@end example

@iftex
@sp 1
@center @image{fft-real-mixedradix,3.4in}

@center Low-pass filtered version of a real pulse, 
@center output from the example program.
@end iftex

@node FFT References and Further Reading
@section References and Further Reading

A good starting point for learning more about the FFT is the review
article @cite{Fast Fourier Transforms: A Tutorial Review and A State of
the Art} by Duhamel and Vetterli,

@itemize @asis
@item
P. Duhamel and M. Vetterli.
Fast fourier transforms: A tutorial review and a state of the art.
@cite{Signal Processing}, 19:259--299, 1990.
@end itemize

@noindent
To find out about the algorithms used in the GSL routines you may want
to consult the document @cite{GSL FFT Algorithms} (it is included
in GSL, as @file{doc/fftalgorithms.tex}).  This has general information
on FFTs and explicit derivations of the implementation for each
routine.  There are also references to the relevant literature.  For
convenience some of the more important references are reproduced below.

@noindent
There are several introductory books on the FFT with example programs,
such as @cite{The Fast Fourier Transform} by Brigham and @cite{DFT/FFT
and Convolution Algorithms} by Burrus and Parks,

@itemize @asis 
@item
E. Oran Brigham.
@cite{The Fast Fourier Transform}.
Prentice Hall, 1974.

@item
C. S. Burrus and T. W. Parks.
@cite{DFT/FFT and Convolution Algorithms}.
Wiley, 1984.
@end itemize

@noindent
Both these introductory books cover the radix-2 FFT in some detail.
The mixed-radix algorithm at the heart of the @sc{fftpack} routines is
reviewed in Clive Temperton's paper,

@itemize @asis 
@item
Clive Temperton.
Self-sorting mixed-radix fast fourier transforms.
@cite{Journal of Computational Physics}, 52(1):1--23, 1983.
@end itemize

@noindent
The derivation of FFTs for real-valued data is explained in the
following two articles,

@itemize @asis
@item
Henrik V. Sorenson, Douglas L. Jones, Michael T. Heideman, and C. Sidney
Burrus.
Real-valued fast fourier transform algorithms.
@cite{IEEE Transactions on Acoustics, Speech, and Signal Processing},
ASSP-35(6):849--863, 1987.

@item
Clive Temperton.
Fast mixed-radix real fourier transforms.
@cite{Journal of Computational Physics}, 52:340--350, 1983.
@end itemize

@noindent
In 1979 the IEEE published a compendium of carefully-reviewed Fortran
FFT programs in @cite{Programs for Digital Signal Processing}.  It is a
useful reference for implementations of many different FFT
algorithms,

@itemize @asis 
@item
Digital Signal Processing Committee and IEEE Acoustics, Speech, and Signal
Processing Committee, editors.
@cite{Programs for Digital Signal Processing}.
IEEE Press, 1979.
@end itemize
@comment  @noindent
@comment  There is also an annotated bibliography of papers on the FFT and related
@comment  topics by Burrus,

@comment  @itemize @asis 
@comment  @item C. S. Burrus.  Notes on the FFT.
@comment  @end itemize
@comment  @noindent
@comment The notes are available from @url{http://www-dsp.rice.edu/res/fft/fftnote.asc}.

@noindent
For large-scale FFT work we recommend the use of the dedicated FFTW library
by Frigo and Johnson.  The FFTW library is self-optimizing---it
automatically tunes itself for each hardware platform in order to
achieve maximum performance.  It is available under the GNU GPL.

@itemize @asis
@item
FFTW Website, @uref{http://www.fftw.org/}
@end itemize

@noindent
The source code for @sc{fftpack} is available from Netlib,

@itemize @asis
@item
FFTPACK, @uref{http://www.netlib.org/fftpack/}
@end itemize