906 lines
		
	
	
		
			38 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
		
		
			
		
	
	
			906 lines
		
	
	
		
			38 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
| 
								 | 
							
								@node  Tutorial, Other Important Topics, Introduction, Top
							 | 
						||
| 
								 | 
							
								@chapter Tutorial
							 | 
						||
| 
								 | 
							
								@menu
							 | 
						||
| 
								 | 
							
								* Complex One-Dimensional DFTs::
							 | 
						||
| 
								 | 
							
								* Complex Multi-Dimensional DFTs::
							 | 
						||
| 
								 | 
							
								* One-Dimensional DFTs of Real Data::
							 | 
						||
| 
								 | 
							
								* Multi-Dimensional DFTs of Real Data::
							 | 
						||
| 
								 | 
							
								* More DFTs of Real Data::
							 | 
						||
| 
								 | 
							
								@end menu
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								This chapter describes the basic usage of FFTW, i.e., how to compute
							 | 
						||
| 
								 | 
							
								@cindex basic interface
							 | 
						||
| 
								 | 
							
								the Fourier transform of a single array.  This chapter tells the
							 | 
						||
| 
								 | 
							
								truth, but not the @emph{whole} truth. Specifically, FFTW implements
							 | 
						||
| 
								 | 
							
								additional routines and flags that are not documented here, although
							 | 
						||
| 
								 | 
							
								in many cases we try to indicate where added capabilities exist.  For
							 | 
						||
| 
								 | 
							
								more complete information, see @ref{FFTW Reference}.  (Note that you
							 | 
						||
| 
								 | 
							
								need to compile and install FFTW before you can use it in a program.
							 | 
						||
| 
								 | 
							
								For the details of the installation, see @ref{Installation and
							 | 
						||
| 
								 | 
							
								Customization}.)
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								We recommend that you read this tutorial in order.@footnote{You can
							 | 
						||
| 
								 | 
							
								read the tutorial in bit-reversed order after computing your first
							 | 
						||
| 
								 | 
							
								transform.}  At the least, read the first section (@pxref{Complex
							 | 
						||
| 
								 | 
							
								One-Dimensional DFTs}) before reading any of the others, even if your
							 | 
						||
| 
								 | 
							
								main interest lies in one of the other transform types.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Users of FFTW version 2 and earlier may also want to read @ref{Upgrading
							 | 
						||
| 
								 | 
							
								from FFTW version 2}.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@c ------------------------------------------------------------
							 | 
						||
| 
								 | 
							
								@node Complex One-Dimensional DFTs, Complex Multi-Dimensional DFTs, Tutorial, Tutorial
							 | 
						||
| 
								 | 
							
								@section Complex One-Dimensional DFTs
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@quotation
							 | 
						||
| 
								 | 
							
								Plan: To bother about the best method of accomplishing an accidental result.
							 | 
						||
| 
								 | 
							
								[Ambrose Bierce, @cite{The Enlarged Devil's Dictionary}.]
							 | 
						||
| 
								 | 
							
								@cindex Devil
							 | 
						||
| 
								 | 
							
								@end quotation
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@iftex
							 | 
						||
| 
								 | 
							
								@medskip
							 | 
						||
| 
								 | 
							
								@end iftex
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The basic usage of FFTW to compute a one-dimensional DFT of size
							 | 
						||
| 
								 | 
							
								@code{N} is simple, and it typically looks something like this code:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								#include <fftw3.h>
							 | 
						||
| 
								 | 
							
								...
							 | 
						||
| 
								 | 
							
								@{
							 | 
						||
| 
								 | 
							
								    fftw_complex *in, *out;
							 | 
						||
| 
								 | 
							
								    fftw_plan p;
							 | 
						||
| 
								 | 
							
								    ...
							 | 
						||
| 
								 | 
							
								    in = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
							 | 
						||
| 
								 | 
							
								    out = (fftw_complex*) fftw_malloc(sizeof(fftw_complex) * N);
							 | 
						||
| 
								 | 
							
								    p = fftw_plan_dft_1d(N, in, out, FFTW_FORWARD, FFTW_ESTIMATE);
							 | 
						||
| 
								 | 
							
								    ...
							 | 
						||
| 
								 | 
							
								    fftw_execute(p); /* @r{repeat as needed} */
							 | 
						||
| 
								 | 
							
								    ...
							 | 
						||
| 
								 | 
							
								    fftw_destroy_plan(p);
							 | 
						||
| 
								 | 
							
								    fftw_free(in); fftw_free(out);
							 | 
						||
| 
								 | 
							
								@}
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								You must link this code with the @code{fftw3} library.  On Unix systems,
							 | 
						||
| 
								 | 
							
								link with @code{-lfftw3 -lm}.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The example code first allocates the input and output arrays.  You can
							 | 
						||
| 
								 | 
							
								allocate them in any way that you like, but we recommend using
							 | 
						||
| 
								 | 
							
								@code{fftw_malloc}, which behaves like
							 | 
						||
| 
								 | 
							
								@findex fftw_malloc
							 | 
						||
| 
								 | 
							
								@code{malloc} except that it properly aligns the array when SIMD
							 | 
						||
| 
								 | 
							
								instructions (such as SSE and Altivec) are available (@pxref{SIMD
							 | 
						||
| 
								 | 
							
								alignment and fftw_malloc}). [Alternatively, we provide a convenient wrapper function @code{fftw_alloc_complex(N)} which has the same effect.]
							 | 
						||
| 
								 | 
							
								@findex fftw_alloc_complex
							 | 
						||
| 
								 | 
							
								@cindex SIMD
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The data is an array of type @code{fftw_complex}, which is by default a
							 | 
						||
| 
								 | 
							
								@code{double[2]} composed of the real (@code{in[i][0]}) and imaginary
							 | 
						||
| 
								 | 
							
								(@code{in[i][1]}) parts of a complex number.
							 | 
						||
| 
								 | 
							
								@tindex fftw_complex
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The next step is to create a @dfn{plan}, which is an object
							 | 
						||
| 
								 | 
							
								@cindex plan
							 | 
						||
| 
								 | 
							
								that contains all the data that FFTW needs to compute the FFT. 
							 | 
						||
| 
								 | 
							
								This function creates the plan:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_dft_1d(int n, fftw_complex *in, fftw_complex *out,
							 | 
						||
| 
								 | 
							
								                           int sign, unsigned flags);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_dft_1d
							 | 
						||
| 
								 | 
							
								@tindex fftw_plan
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The first argument, @code{n}, is the size of the transform you are
							 | 
						||
| 
								 | 
							
								trying to compute.  The size @code{n} can be any positive integer, but
							 | 
						||
| 
								 | 
							
								sizes that are products of small factors are transformed most
							 | 
						||
| 
								 | 
							
								efficiently (although prime sizes still use an @Onlogn{} algorithm).
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The next two arguments are pointers to the input and output arrays of
							 | 
						||
| 
								 | 
							
								the transform.  These pointers can be equal, indicating an
							 | 
						||
| 
								 | 
							
								@dfn{in-place} transform.
							 | 
						||
| 
								 | 
							
								@cindex in-place
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The fourth argument, @code{sign}, can be either @code{FFTW_FORWARD}
							 | 
						||
| 
								 | 
							
								(@code{-1}) or @code{FFTW_BACKWARD} (@code{+1}),
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_FORWARD
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_BACKWARD
							 | 
						||
| 
								 | 
							
								and indicates the direction of the transform you are interested in;
							 | 
						||
| 
								 | 
							
								technically, it is the sign of the exponent in the transform.  
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The @code{flags} argument is usually either @code{FFTW_MEASURE} or
							 | 
						||
| 
								 | 
							
								@cindex flags
							 | 
						||
| 
								 | 
							
								@code{FFTW_ESTIMATE}.  @code{FFTW_MEASURE} instructs FFTW to run
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_MEASURE
							 | 
						||
| 
								 | 
							
								and measure the execution time of several FFTs in order to find the
							 | 
						||
| 
								 | 
							
								best way to compute the transform of size @code{n}.  This process takes
							 | 
						||
| 
								 | 
							
								some time (usually a few seconds), depending on your machine and on
							 | 
						||
| 
								 | 
							
								the size of the transform.  @code{FFTW_ESTIMATE}, on the contrary,
							 | 
						||
| 
								 | 
							
								does not run any computation and just builds a
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_ESTIMATE
							 | 
						||
| 
								 | 
							
								reasonable plan that is probably sub-optimal.  In short, if your
							 | 
						||
| 
								 | 
							
								program performs many transforms of the same size and initialization
							 | 
						||
| 
								 | 
							
								time is not important, use @code{FFTW_MEASURE}; otherwise use the
							 | 
						||
| 
								 | 
							
								estimate.  
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@emph{You must create the plan before initializing the input}, because
							 | 
						||
| 
								 | 
							
								@code{FFTW_MEASURE} overwrites the @code{in}/@code{out} arrays.
							 | 
						||
| 
								 | 
							
								(Technically, @code{FFTW_ESTIMATE} does not touch your arrays, but you
							 | 
						||
| 
								 | 
							
								should always create plans first just to be sure.)
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Once the plan has been created, you can use it as many times as you
							 | 
						||
| 
								 | 
							
								like for transforms on the specified @code{in}/@code{out} arrays,
							 | 
						||
| 
								 | 
							
								computing the actual transforms via @code{fftw_execute(plan)}:
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								void fftw_execute(const fftw_plan plan);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								@findex fftw_execute
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The DFT results are stored in-order in the array @code{out}, with the
							 | 
						||
| 
								 | 
							
								zero-frequency (DC) component in @code{out[0]}.
							 | 
						||
| 
								 | 
							
								@cindex frequency
							 | 
						||
| 
								 | 
							
								If @code{in != out}, the transform is @dfn{out-of-place} and the input
							 | 
						||
| 
								 | 
							
								array @code{in} is not modified.  Otherwise, the input array is
							 | 
						||
| 
								 | 
							
								overwritten with the transform.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@cindex execute
							 | 
						||
| 
								 | 
							
								If you want to transform a @emph{different} array of the same size, you
							 | 
						||
| 
								 | 
							
								can create a new plan with @code{fftw_plan_dft_1d} and FFTW
							 | 
						||
| 
								 | 
							
								automatically reuses the information from the previous plan, if
							 | 
						||
| 
								 | 
							
								possible.  Alternatively, with the ``guru'' interface you can apply a
							 | 
						||
| 
								 | 
							
								given plan to a different array, if you are careful.
							 | 
						||
| 
								 | 
							
								@xref{FFTW Reference}.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								When you are done with the plan, you deallocate it by calling
							 | 
						||
| 
								 | 
							
								@code{fftw_destroy_plan(plan)}:
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								void fftw_destroy_plan(fftw_plan plan);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								@findex fftw_destroy_plan
							 | 
						||
| 
								 | 
							
								If you allocate an array with @code{fftw_malloc()} you must deallocate
							 | 
						||
| 
								 | 
							
								it with @code{fftw_free()}.  Do not use @code{free()} or, heaven
							 | 
						||
| 
								 | 
							
								forbid, @code{delete}.
							 | 
						||
| 
								 | 
							
								@findex fftw_free
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								FFTW computes an @emph{unnormalized} DFT.  Thus, computing a forward
							 | 
						||
| 
								 | 
							
								followed by a backward transform (or vice versa) results in the original
							 | 
						||
| 
								 | 
							
								array scaled by @code{n}.  For the definition of the DFT, see @ref{What
							 | 
						||
| 
								 | 
							
								FFTW Really Computes}.
							 | 
						||
| 
								 | 
							
								@cindex DFT
							 | 
						||
| 
								 | 
							
								@cindex normalization
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								If you have a C compiler, such as @code{gcc}, that supports the
							 | 
						||
| 
								 | 
							
								C99 standard, and you @code{#include <complex.h>} @emph{before}
							 | 
						||
| 
								 | 
							
								@code{<fftw3.h>}, then @code{fftw_complex} is the native
							 | 
						||
| 
								 | 
							
								double-precision complex type and you can manipulate it with ordinary
							 | 
						||
| 
								 | 
							
								arithmetic.  Otherwise, FFTW defines its own complex type, which is
							 | 
						||
| 
								 | 
							
								bit-compatible with the C99 complex type. @xref{Complex numbers}.
							 | 
						||
| 
								 | 
							
								(The C++ @code{<complex>} template class may also be usable via a
							 | 
						||
| 
								 | 
							
								typecast.)
							 | 
						||
| 
								 | 
							
								@cindex C++
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								To use single or long-double precision versions of FFTW, replace the
							 | 
						||
| 
								 | 
							
								@code{fftw_} prefix by @code{fftwf_} or @code{fftwl_} and link with
							 | 
						||
| 
								 | 
							
								@code{-lfftw3f} or @code{-lfftw3l}, but use the @emph{same}
							 | 
						||
| 
								 | 
							
								@code{<fftw3.h>} header file.
							 | 
						||
| 
								 | 
							
								@cindex precision
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Many more flags exist besides @code{FFTW_MEASURE} and
							 | 
						||
| 
								 | 
							
								@code{FFTW_ESTIMATE}.  For example, use @code{FFTW_PATIENT} if you're
							 | 
						||
| 
								 | 
							
								willing to wait even longer for a possibly even faster plan (@pxref{FFTW
							 | 
						||
| 
								 | 
							
								Reference}).
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_PATIENT
							 | 
						||
| 
								 | 
							
								You can also save plans for future use, as described by @ref{Words of
							 | 
						||
| 
								 | 
							
								Wisdom-Saving Plans}.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@c ------------------------------------------------------------
							 | 
						||
| 
								 | 
							
								@node Complex Multi-Dimensional DFTs, One-Dimensional DFTs of Real Data, Complex One-Dimensional DFTs, Tutorial
							 | 
						||
| 
								 | 
							
								@section Complex Multi-Dimensional DFTs
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Multi-dimensional transforms work much the same way as one-dimensional
							 | 
						||
| 
								 | 
							
								transforms: you allocate arrays of @code{fftw_complex} (preferably
							 | 
						||
| 
								 | 
							
								using @code{fftw_malloc}), create an @code{fftw_plan}, execute it as
							 | 
						||
| 
								 | 
							
								many times as you want with @code{fftw_execute(plan)}, and clean up
							 | 
						||
| 
								 | 
							
								with @code{fftw_destroy_plan(plan)} (and @code{fftw_free}).  
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								FFTW provides two routines for creating plans for 2d and 3d transforms,
							 | 
						||
| 
								 | 
							
								and one routine for creating plans of arbitrary dimensionality.
							 | 
						||
| 
								 | 
							
								The 2d and 3d routines have the following signature:
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_dft_2d(int n0, int n1,
							 | 
						||
| 
								 | 
							
								                           fftw_complex *in, fftw_complex *out,
							 | 
						||
| 
								 | 
							
								                           int sign, unsigned flags);
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2,
							 | 
						||
| 
								 | 
							
								                           fftw_complex *in, fftw_complex *out,
							 | 
						||
| 
								 | 
							
								                           int sign, unsigned flags);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_dft_2d
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_dft_3d
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								These routines create plans for @code{n0} by @code{n1} two-dimensional
							 | 
						||
| 
								 | 
							
								(2d) transforms and @code{n0} by @code{n1} by @code{n2} 3d transforms,
							 | 
						||
| 
								 | 
							
								respectively.  All of these transforms operate on contiguous arrays in
							 | 
						||
| 
								 | 
							
								the C-standard @dfn{row-major} order, so that the last dimension has the
							 | 
						||
| 
								 | 
							
								fastest-varying index in the array.  This layout is described further in
							 | 
						||
| 
								 | 
							
								@ref{Multi-dimensional Array Format}.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								FFTW can also compute transforms of higher dimensionality.  In order to
							 | 
						||
| 
								 | 
							
								avoid confusion between the various meanings of the the word
							 | 
						||
| 
								 | 
							
								``dimension'', we use the term @emph{rank}
							 | 
						||
| 
								 | 
							
								@cindex rank
							 | 
						||
| 
								 | 
							
								to denote the number of independent indices in an array.@footnote{The
							 | 
						||
| 
								 | 
							
								term ``rank'' is commonly used in the APL, FORTRAN, and Common Lisp
							 | 
						||
| 
								 | 
							
								traditions, although it is not so common in the C@tie{}world.}  For
							 | 
						||
| 
								 | 
							
								example, we say that a 2d transform has rank@tie{}2, a 3d transform has
							 | 
						||
| 
								 | 
							
								rank@tie{}3, and so on.  You can plan transforms of arbitrary rank by
							 | 
						||
| 
								 | 
							
								means of the following function:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_dft(int rank, const int *n,
							 | 
						||
| 
								 | 
							
								                        fftw_complex *in, fftw_complex *out,
							 | 
						||
| 
								 | 
							
								                        int sign, unsigned flags);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_dft
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Here, @code{n} is a pointer to an array @code{n[rank]} denoting an
							 | 
						||
| 
								 | 
							
								@code{n[0]} by @code{n[1]} by @dots{} by @code{n[rank-1]} transform.
							 | 
						||
| 
								 | 
							
								Thus, for example, the call
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								fftw_plan_dft_2d(n0, n1, in, out, sign, flags);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								is equivalent to the following code fragment:
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								int n[2];
							 | 
						||
| 
								 | 
							
								n[0] = n0;
							 | 
						||
| 
								 | 
							
								n[1] = n1;
							 | 
						||
| 
								 | 
							
								fftw_plan_dft(2, n, in, out, sign, flags);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								@code{fftw_plan_dft} is not restricted to 2d and 3d transforms,
							 | 
						||
| 
								 | 
							
								however, but it can plan transforms of arbitrary rank.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								You may have noticed that all the planner routines described so far
							 | 
						||
| 
								 | 
							
								have overlapping functionality.  For example, you can plan a 1d or 2d
							 | 
						||
| 
								 | 
							
								transform by using @code{fftw_plan_dft} with a @code{rank} of @code{1}
							 | 
						||
| 
								 | 
							
								or @code{2}, or even by calling @code{fftw_plan_dft_3d} with @code{n0}
							 | 
						||
| 
								 | 
							
								and/or @code{n1} equal to @code{1} (with no loss in efficiency).  This
							 | 
						||
| 
								 | 
							
								pattern continues, and FFTW's planning routines in general form a
							 | 
						||
| 
								 | 
							
								``partial order,'' sequences of
							 | 
						||
| 
								 | 
							
								@cindex partial order
							 | 
						||
| 
								 | 
							
								interfaces with strictly increasing generality but correspondingly
							 | 
						||
| 
								 | 
							
								greater complexity.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@code{fftw_plan_dft} is the most general complex-DFT routine that we
							 | 
						||
| 
								 | 
							
								describe in this tutorial, but there are also the advanced and guru interfaces,
							 | 
						||
| 
								 | 
							
								@cindex advanced interface
							 | 
						||
| 
								 | 
							
								@cindex guru interface 
							 | 
						||
| 
								 | 
							
								which allow one to efficiently combine multiple/strided transforms
							 | 
						||
| 
								 | 
							
								into a single FFTW plan, transform a subset of a larger
							 | 
						||
| 
								 | 
							
								multi-dimensional array, and/or to handle more general complex-number
							 | 
						||
| 
								 | 
							
								formats.  For more information, see @ref{FFTW Reference}.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@c ------------------------------------------------------------
							 | 
						||
| 
								 | 
							
								@node One-Dimensional DFTs of Real Data, Multi-Dimensional DFTs of Real Data, Complex Multi-Dimensional DFTs, Tutorial
							 | 
						||
| 
								 | 
							
								@section One-Dimensional DFTs of Real Data
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								In many practical applications, the input data @code{in[i]} are purely
							 | 
						||
| 
								 | 
							
								real numbers, in which case the DFT output satisfies the ``Hermitian''
							 | 
						||
| 
								 | 
							
								@cindex Hermitian
							 | 
						||
| 
								 | 
							
								redundancy: @code{out[i]} is the conjugate of @code{out[n-i]}.  It is
							 | 
						||
| 
								 | 
							
								possible to take advantage of these circumstances in order to achieve
							 | 
						||
| 
								 | 
							
								roughly a factor of two improvement in both speed and memory usage.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								In exchange for these speed and space advantages, the user sacrifices
							 | 
						||
| 
								 | 
							
								some of the simplicity of FFTW's complex transforms. First of all, the
							 | 
						||
| 
								 | 
							
								input and output arrays are of @emph{different sizes and types}: the
							 | 
						||
| 
								 | 
							
								input is @code{n} real numbers, while the output is @code{n/2+1}
							 | 
						||
| 
								 | 
							
								complex numbers (the non-redundant outputs); this also requires slight
							 | 
						||
| 
								 | 
							
								``padding'' of the input array for
							 | 
						||
| 
								 | 
							
								@cindex padding
							 | 
						||
| 
								 | 
							
								in-place transforms.  Second, the inverse transform (complex to real)
							 | 
						||
| 
								 | 
							
								has the side-effect of @emph{overwriting its input array}, by default.
							 | 
						||
| 
								 | 
							
								Neither of these inconveniences should pose a serious problem for
							 | 
						||
| 
								 | 
							
								users, but it is important to be aware of them.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The routines to perform real-data transforms are almost the same as
							 | 
						||
| 
								 | 
							
								those for complex transforms: you allocate arrays of @code{double}
							 | 
						||
| 
								 | 
							
								and/or @code{fftw_complex} (preferably using @code{fftw_malloc} or
							 | 
						||
| 
								 | 
							
								@code{fftw_alloc_complex}), create an @code{fftw_plan}, execute it as
							 | 
						||
| 
								 | 
							
								many times as you want with @code{fftw_execute(plan)}, and clean up
							 | 
						||
| 
								 | 
							
								with @code{fftw_destroy_plan(plan)} (and @code{fftw_free}).  The only
							 | 
						||
| 
								 | 
							
								differences are that the input (or output) is of type @code{double}
							 | 
						||
| 
								 | 
							
								and there are new routines to create the plan.  In one dimension:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_dft_r2c_1d(int n, double *in, fftw_complex *out,
							 | 
						||
| 
								 | 
							
								                               unsigned flags);
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_dft_c2r_1d(int n, fftw_complex *in, double *out,
							 | 
						||
| 
								 | 
							
								                               unsigned flags);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_dft_r2c_1d
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_dft_c2r_1d
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								for the real input to complex-Hermitian output (@dfn{r2c}) and
							 | 
						||
| 
								 | 
							
								complex-Hermitian input to real output (@dfn{c2r}) transforms.
							 | 
						||
| 
								 | 
							
								@cindex r2c
							 | 
						||
| 
								 | 
							
								@cindex c2r
							 | 
						||
| 
								 | 
							
								Unlike the complex DFT planner, there is no @code{sign} argument.
							 | 
						||
| 
								 | 
							
								Instead, r2c DFTs are always @code{FFTW_FORWARD} and c2r DFTs are
							 | 
						||
| 
								 | 
							
								always @code{FFTW_BACKWARD}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_FORWARD
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_BACKWARD
							 | 
						||
| 
								 | 
							
								(For single/long-double precision
							 | 
						||
| 
								 | 
							
								@code{fftwf} and @code{fftwl}, @code{double} should be replaced by
							 | 
						||
| 
								 | 
							
								@code{float} and @code{long double}, respectively.)
							 | 
						||
| 
								 | 
							
								@cindex precision
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Here, @code{n} is the ``logical'' size of the DFT, not necessarily the
							 | 
						||
| 
								 | 
							
								physical size of the array.  In particular, the real (@code{double})
							 | 
						||
| 
								 | 
							
								array has @code{n} elements, while the complex (@code{fftw_complex})
							 | 
						||
| 
								 | 
							
								array has @code{n/2+1} elements (where the division is rounded down).
							 | 
						||
| 
								 | 
							
								For an in-place transform,
							 | 
						||
| 
								 | 
							
								@cindex in-place
							 | 
						||
| 
								 | 
							
								@code{in} and @code{out} are aliased to the same array, which must be
							 | 
						||
| 
								 | 
							
								big enough to hold both; so, the real array would actually have
							 | 
						||
| 
								 | 
							
								@code{2*(n/2+1)} elements, where the elements beyond the first
							 | 
						||
| 
								 | 
							
								@code{n} are unused padding.  (Note that this is very different from
							 | 
						||
| 
								 | 
							
								the concept of ``zero-padding'' a transform to a larger length, which
							 | 
						||
| 
								 | 
							
								changes the logical size of the DFT by actually adding new input
							 | 
						||
| 
								 | 
							
								data.)  The @math{k}th element of the complex array is exactly the
							 | 
						||
| 
								 | 
							
								same as the @math{k}th element of the corresponding complex DFT.  All
							 | 
						||
| 
								 | 
							
								positive @code{n} are supported; products of small factors are most
							 | 
						||
| 
								 | 
							
								efficient, but an @Onlogn algorithm is used even for prime sizes.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								As noted above, the c2r transform destroys its input array even for
							 | 
						||
| 
								 | 
							
								out-of-place transforms.  This can be prevented, if necessary, by
							 | 
						||
| 
								 | 
							
								including @code{FFTW_PRESERVE_INPUT} in the @code{flags}, with
							 | 
						||
| 
								 | 
							
								unfortunately some sacrifice in performance.
							 | 
						||
| 
								 | 
							
								@cindex flags
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_PRESERVE_INPUT
							 | 
						||
| 
								 | 
							
								This flag is also not currently supported for multi-dimensional real
							 | 
						||
| 
								 | 
							
								DFTs (next section).
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Readers familiar with DFTs of real data will recall that the 0th (the
							 | 
						||
| 
								 | 
							
								``DC'') and @code{n/2}-th (the ``Nyquist'' frequency, when @code{n} is
							 | 
						||
| 
								 | 
							
								even) elements of the complex output are purely real.  Some
							 | 
						||
| 
								 | 
							
								implementations therefore store the Nyquist element where the DC
							 | 
						||
| 
								 | 
							
								imaginary part would go, in order to make the input and output arrays
							 | 
						||
| 
								 | 
							
								the same size.  Such packing, however, does not generalize well to
							 | 
						||
| 
								 | 
							
								multi-dimensional transforms, and the space savings are miniscule in
							 | 
						||
| 
								 | 
							
								any case; FFTW does not support it.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								An alternative interface for one-dimensional r2c and c2r DFTs can be
							 | 
						||
| 
								 | 
							
								found in the @samp{r2r} interface (@pxref{The Halfcomplex-format
							 | 
						||
| 
								 | 
							
								DFT}), with ``halfcomplex''-format output that @emph{is} the same size
							 | 
						||
| 
								 | 
							
								(and type) as the input array.
							 | 
						||
| 
								 | 
							
								@cindex halfcomplex format
							 | 
						||
| 
								 | 
							
								That interface, although it is not very useful for multi-dimensional
							 | 
						||
| 
								 | 
							
								transforms, may sometimes yield better performance.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@c ------------------------------------------------------------
							 | 
						||
| 
								 | 
							
								@node Multi-Dimensional DFTs of Real Data, More DFTs of Real Data, One-Dimensional DFTs of Real Data, Tutorial
							 | 
						||
| 
								 | 
							
								@section Multi-Dimensional DFTs of Real Data
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Multi-dimensional DFTs of real data use the following planner routines:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1,
							 | 
						||
| 
								 | 
							
								                               double *in, fftw_complex *out,
							 | 
						||
| 
								 | 
							
								                               unsigned flags);
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2,
							 | 
						||
| 
								 | 
							
								                               double *in, fftw_complex *out,
							 | 
						||
| 
								 | 
							
								                               unsigned flags);
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_dft_r2c(int rank, const int *n,
							 | 
						||
| 
								 | 
							
								                            double *in, fftw_complex *out,
							 | 
						||
| 
								 | 
							
								                            unsigned flags);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_dft_r2c_2d
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_dft_r2c_3d
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_dft_r2c
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								as well as the corresponding @code{c2r} routines with the input/output
							 | 
						||
| 
								 | 
							
								types swapped.  These routines work similarly to their complex
							 | 
						||
| 
								 | 
							
								analogues, except for the fact that here the complex output array is cut
							 | 
						||
| 
								 | 
							
								roughly in half and the real array requires padding for in-place
							 | 
						||
| 
								 | 
							
								transforms (as in 1d, above).
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								As before, @code{n} is the logical size of the array, and the
							 | 
						||
| 
								 | 
							
								consequences of this on the the format of the complex arrays deserve
							 | 
						||
| 
								 | 
							
								careful attention.
							 | 
						||
| 
								 | 
							
								@cindex r2c/c2r multi-dimensional array format
							 | 
						||
| 
								 | 
							
								Suppose that the real data has dimensions @ndims (in row-major order).
							 | 
						||
| 
								 | 
							
								Then, after an r2c transform, the output is an @ndimshalf array of
							 | 
						||
| 
								 | 
							
								@code{fftw_complex} values in row-major order, corresponding to slightly
							 | 
						||
| 
								 | 
							
								over half of the output of the corresponding complex DFT.  (The division
							 | 
						||
| 
								 | 
							
								is rounded down.)  The ordering of the data is otherwise exactly the
							 | 
						||
| 
								 | 
							
								same as in the complex-DFT case.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								For out-of-place transforms, this is the end of the story: the real
							 | 
						||
| 
								 | 
							
								data is stored as a row-major array of size @ndims and the complex
							 | 
						||
| 
								 | 
							
								data is stored as a row-major array of size @ndimshalf{}.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								For in-place transforms, however, extra padding of the real-data array
							 | 
						||
| 
								 | 
							
								is necessary because the complex array is larger than the real array,
							 | 
						||
| 
								 | 
							
								and the two arrays share the same memory locations.  Thus, for
							 | 
						||
| 
								 | 
							
								in-place transforms, the final dimension of the real-data array must
							 | 
						||
| 
								 | 
							
								be padded with extra values to accommodate the size of the complex
							 | 
						||
| 
								 | 
							
								data---two values if the last dimension is even and one if it is odd.
							 | 
						||
| 
								 | 
							
								@cindex padding
							 | 
						||
| 
								 | 
							
								That is, the last dimension of the real data must physically contain
							 | 
						||
| 
								 | 
							
								@tex
							 | 
						||
| 
								 | 
							
								$2 (n_{d-1}/2+1)$
							 | 
						||
| 
								 | 
							
								@end tex
							 | 
						||
| 
								 | 
							
								@ifinfo
							 | 
						||
| 
								 | 
							
								2 * (n[d-1]/2+1)
							 | 
						||
| 
								 | 
							
								@end ifinfo
							 | 
						||
| 
								 | 
							
								@html
							 | 
						||
| 
								 | 
							
								2 * (n<sub>d-1</sub>/2+1)
							 | 
						||
| 
								 | 
							
								@end html
							 | 
						||
| 
								 | 
							
								@code{double} values (exactly enough to hold the complex data).
							 | 
						||
| 
								 | 
							
								This physical array size does not, however, change the @emph{logical}
							 | 
						||
| 
								 | 
							
								array size---only
							 | 
						||
| 
								 | 
							
								@tex
							 | 
						||
| 
								 | 
							
								$n_{d-1}$
							 | 
						||
| 
								 | 
							
								@end tex
							 | 
						||
| 
								 | 
							
								@ifinfo
							 | 
						||
| 
								 | 
							
								n[d-1]
							 | 
						||
| 
								 | 
							
								@end ifinfo
							 | 
						||
| 
								 | 
							
								@html
							 | 
						||
| 
								 | 
							
								n<sub>d-1</sub>
							 | 
						||
| 
								 | 
							
								@end html
							 | 
						||
| 
								 | 
							
								values are actually stored in the last dimension, and
							 | 
						||
| 
								 | 
							
								@tex
							 | 
						||
| 
								 | 
							
								$n_{d-1}$
							 | 
						||
| 
								 | 
							
								@end tex
							 | 
						||
| 
								 | 
							
								@ifinfo
							 | 
						||
| 
								 | 
							
								n[d-1]
							 | 
						||
| 
								 | 
							
								@end ifinfo
							 | 
						||
| 
								 | 
							
								@html
							 | 
						||
| 
								 | 
							
								n<sub>d-1</sub>
							 | 
						||
| 
								 | 
							
								@end html
							 | 
						||
| 
								 | 
							
								is the last dimension passed to the plan-creation routine.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								For example, consider the transform of a two-dimensional real array of
							 | 
						||
| 
								 | 
							
								size @code{n0} by @code{n1}.  The output of the r2c transform is a
							 | 
						||
| 
								 | 
							
								two-dimensional complex array of size @code{n0} by @code{n1/2+1}, where
							 | 
						||
| 
								 | 
							
								the @code{y} dimension has been cut nearly in half because of
							 | 
						||
| 
								 | 
							
								redundancies in the output.  Because @code{fftw_complex} is twice the
							 | 
						||
| 
								 | 
							
								size of @code{double}, the output array is slightly bigger than the
							 | 
						||
| 
								 | 
							
								input array.  Thus, if we want to compute the transform in place, we
							 | 
						||
| 
								 | 
							
								must @emph{pad} the input array so that it is of size @code{n0} by
							 | 
						||
| 
								 | 
							
								@code{2*(n1/2+1)}.  If @code{n1} is even, then there are two padding
							 | 
						||
| 
								 | 
							
								elements at the end of each row (which need not be initialized, as they
							 | 
						||
| 
								 | 
							
								are only used for output).
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@ifhtml
							 | 
						||
| 
								 | 
							
								The following illustration depicts the input and output arrays just
							 | 
						||
| 
								 | 
							
								described, for both the out-of-place and in-place transforms (with the
							 | 
						||
| 
								 | 
							
								arrows indicating consecutive memory locations):
							 | 
						||
| 
								 | 
							
								@image{rfftwnd-for-html}
							 | 
						||
| 
								 | 
							
								@end ifhtml
							 | 
						||
| 
								 | 
							
								@ifnotinfo
							 | 
						||
| 
								 | 
							
								@ifnothtml
							 | 
						||
| 
								 | 
							
								@float Figure,fig:rfftwnd
							 | 
						||
| 
								 | 
							
								@center @image{rfftwnd}
							 | 
						||
| 
								 | 
							
								@caption{Illustration of the data layout for a 2d @code{nx} by @code{ny}
							 | 
						||
| 
								 | 
							
								real-to-complex transform.}
							 | 
						||
| 
								 | 
							
								@end float
							 | 
						||
| 
								 | 
							
								@ref{fig:rfftwnd} depicts the input and output arrays just
							 | 
						||
| 
								 | 
							
								described, for both the out-of-place and in-place transforms (with the
							 | 
						||
| 
								 | 
							
								arrows indicating consecutive memory locations):
							 | 
						||
| 
								 | 
							
								@end ifnothtml
							 | 
						||
| 
								 | 
							
								@end ifnotinfo
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								These transforms are unnormalized, so an r2c followed by a c2r
							 | 
						||
| 
								 | 
							
								transform (or vice versa) will result in the original data scaled by
							 | 
						||
| 
								 | 
							
								the number of real data elements---that is, the product of the
							 | 
						||
| 
								 | 
							
								(logical) dimensions of the real data.
							 | 
						||
| 
								 | 
							
								@cindex normalization
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								(Because the last dimension is treated specially, if it is equal to
							 | 
						||
| 
								 | 
							
								@code{1} the transform is @emph{not} equivalent to a lower-dimensional
							 | 
						||
| 
								 | 
							
								r2c/c2r transform.  In that case, the last complex dimension also has
							 | 
						||
| 
								 | 
							
								size @code{1} (@code{=1/2+1}), and no advantage is gained over the
							 | 
						||
| 
								 | 
							
								complex transforms.)
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@c ------------------------------------------------------------
							 | 
						||
| 
								 | 
							
								@node More DFTs of Real Data,  , Multi-Dimensional DFTs of Real Data, Tutorial
							 | 
						||
| 
								 | 
							
								@section More DFTs of Real Data
							 | 
						||
| 
								 | 
							
								@menu
							 | 
						||
| 
								 | 
							
								* The Halfcomplex-format DFT::
							 | 
						||
| 
								 | 
							
								* Real even/odd DFTs (cosine/sine transforms)::
							 | 
						||
| 
								 | 
							
								* The Discrete Hartley Transform::
							 | 
						||
| 
								 | 
							
								@end menu
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								FFTW supports several other transform types via a unified @dfn{r2r}
							 | 
						||
| 
								 | 
							
								(real-to-real) interface,
							 | 
						||
| 
								 | 
							
								@cindex r2r
							 | 
						||
| 
								 | 
							
								so called because it takes a real (@code{double}) array and outputs a
							 | 
						||
| 
								 | 
							
								real array of the same size.  These r2r transforms currently fall into
							 | 
						||
| 
								 | 
							
								three categories: DFTs of real input and complex-Hermitian output in
							 | 
						||
| 
								 | 
							
								halfcomplex format, DFTs of real input with even/odd symmetry
							 | 
						||
| 
								 | 
							
								(a.k.a. discrete cosine/sine transforms, DCTs/DSTs), and discrete
							 | 
						||
| 
								 | 
							
								Hartley transforms (DHTs), all described in more detail by the
							 | 
						||
| 
								 | 
							
								following sections.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The r2r transforms follow the by now familiar interface of creating an
							 | 
						||
| 
								 | 
							
								@code{fftw_plan}, executing it with @code{fftw_execute(plan)}, and
							 | 
						||
| 
								 | 
							
								destroying it with @code{fftw_destroy_plan(plan)}.  Furthermore, all
							 | 
						||
| 
								 | 
							
								r2r transforms share the same planner interface:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@example
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_r2r_1d(int n, double *in, double *out,
							 | 
						||
| 
								 | 
							
								                           fftw_r2r_kind kind, unsigned flags);
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_r2r_2d(int n0, int n1, double *in, double *out,
							 | 
						||
| 
								 | 
							
								                           fftw_r2r_kind kind0, fftw_r2r_kind kind1,
							 | 
						||
| 
								 | 
							
								                           unsigned flags);
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2,
							 | 
						||
| 
								 | 
							
								                           double *in, double *out,
							 | 
						||
| 
								 | 
							
								                           fftw_r2r_kind kind0,
							 | 
						||
| 
								 | 
							
								                           fftw_r2r_kind kind1,
							 | 
						||
| 
								 | 
							
								                           fftw_r2r_kind kind2,
							 | 
						||
| 
								 | 
							
								                           unsigned flags);
							 | 
						||
| 
								 | 
							
								fftw_plan fftw_plan_r2r(int rank, const int *n, double *in, double *out,
							 | 
						||
| 
								 | 
							
								                        const fftw_r2r_kind *kind, unsigned flags);
							 | 
						||
| 
								 | 
							
								@end example
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_r2r_1d
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_r2r_2d
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_r2r_3d
							 | 
						||
| 
								 | 
							
								@findex fftw_plan_r2r
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Just as for the complex DFT, these plan 1d/2d/3d/multi-dimensional
							 | 
						||
| 
								 | 
							
								transforms for contiguous arrays in row-major order, transforming (real)
							 | 
						||
| 
								 | 
							
								input to output of the same size, where @code{n} specifies the
							 | 
						||
| 
								 | 
							
								@emph{physical} dimensions of the arrays.  All positive @code{n} are
							 | 
						||
| 
								 | 
							
								supported (with the exception of @code{n=1} for the @code{FFTW_REDFT00}
							 | 
						||
| 
								 | 
							
								kind, noted in the real-even subsection below); products of small
							 | 
						||
| 
								 | 
							
								factors are most efficient (factorizing @code{n-1} and @code{n+1} for
							 | 
						||
| 
								 | 
							
								@code{FFTW_REDFT00} and @code{FFTW_RODFT00} kinds, described below), but
							 | 
						||
| 
								 | 
							
								an @Onlogn algorithm is used even for prime sizes.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Each dimension has a @dfn{kind} parameter, of type
							 | 
						||
| 
								 | 
							
								@code{fftw_r2r_kind}, specifying the kind of r2r transform to be used
							 | 
						||
| 
								 | 
							
								for that dimension.
							 | 
						||
| 
								 | 
							
								@cindex kind (r2r)
							 | 
						||
| 
								 | 
							
								@tindex fftw_r2r_kind
							 | 
						||
| 
								 | 
							
								(In the case of @code{fftw_plan_r2r}, this is an array @code{kind[rank]}
							 | 
						||
| 
								 | 
							
								where @code{kind[i]} is the transform kind for the dimension
							 | 
						||
| 
								 | 
							
								@code{n[i]}.)  The kind can be one of a set of predefined constants,
							 | 
						||
| 
								 | 
							
								defined in the following subsections.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								In other words, FFTW computes the separable product of the specified
							 | 
						||
| 
								 | 
							
								r2r transforms over each dimension, which can be used e.g. for partial
							 | 
						||
| 
								 | 
							
								differential equations with mixed boundary conditions.  (For some r2r
							 | 
						||
| 
								 | 
							
								kinds, notably the halfcomplex DFT and the DHT, such a separable
							 | 
						||
| 
								 | 
							
								product is somewhat problematic in more than one dimension, however,
							 | 
						||
| 
								 | 
							
								as is described below.)
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								In the current version of FFTW, all r2r transforms except for the
							 | 
						||
| 
								 | 
							
								halfcomplex type are computed via pre- or post-processing of
							 | 
						||
| 
								 | 
							
								halfcomplex transforms, and they are therefore not as fast as they
							 | 
						||
| 
								 | 
							
								could be.  Since most other general DCT/DST codes employ a similar
							 | 
						||
| 
								 | 
							
								algorithm, however, FFTW's implementation should provide at least
							 | 
						||
| 
								 | 
							
								competitive performance.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@c =========>
							 | 
						||
| 
								 | 
							
								@node The Halfcomplex-format DFT, Real even/odd DFTs (cosine/sine transforms), More DFTs of Real Data, More DFTs of Real Data
							 | 
						||
| 
								 | 
							
								@subsection The Halfcomplex-format DFT
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								An r2r kind of @code{FFTW_R2HC} (@dfn{r2hc}) corresponds to an r2c DFT
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_R2HC
							 | 
						||
| 
								 | 
							
								@cindex r2c
							 | 
						||
| 
								 | 
							
								@cindex r2hc
							 | 
						||
| 
								 | 
							
								(@pxref{One-Dimensional DFTs of Real Data}) but with ``halfcomplex''
							 | 
						||
| 
								 | 
							
								format output, and may sometimes be faster and/or more convenient than
							 | 
						||
| 
								 | 
							
								the latter.
							 | 
						||
| 
								 | 
							
								@cindex halfcomplex format
							 | 
						||
| 
								 | 
							
								The inverse @dfn{hc2r} transform is of kind @code{FFTW_HC2R}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_HC2R
							 | 
						||
| 
								 | 
							
								@cindex hc2r
							 | 
						||
| 
								 | 
							
								This consists of the non-redundant half of the complex output for a 1d
							 | 
						||
| 
								 | 
							
								real-input DFT of size @code{n}, stored as a sequence of @code{n} real
							 | 
						||
| 
								 | 
							
								numbers (@code{double}) in the format:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@tex
							 | 
						||
| 
								 | 
							
								$$
							 | 
						||
| 
								 | 
							
								r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1
							 | 
						||
| 
								 | 
							
								$$
							 | 
						||
| 
								 | 
							
								@end tex
							 | 
						||
| 
								 | 
							
								@ifinfo
							 | 
						||
| 
								 | 
							
								r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1
							 | 
						||
| 
								 | 
							
								@end ifinfo
							 | 
						||
| 
								 | 
							
								@html
							 | 
						||
| 
								 | 
							
								<p align=center>
							 | 
						||
| 
								 | 
							
								r<sub>0</sub>, r<sub>1</sub>, r<sub>2</sub>, ..., r<sub>n/2</sub>, i<sub>(n+1)/2-1</sub>, ..., i<sub>2</sub>, i<sub>1</sub>
							 | 
						||
| 
								 | 
							
								</p>
							 | 
						||
| 
								 | 
							
								@end html
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Here,
							 | 
						||
| 
								 | 
							
								@ifinfo
							 | 
						||
| 
								 | 
							
								rk
							 | 
						||
| 
								 | 
							
								@end ifinfo
							 | 
						||
| 
								 | 
							
								@tex
							 | 
						||
| 
								 | 
							
								$r_k$
							 | 
						||
| 
								 | 
							
								@end tex
							 | 
						||
| 
								 | 
							
								@html
							 | 
						||
| 
								 | 
							
								r<sub>k</sub>
							 | 
						||
| 
								 | 
							
								@end html
							 | 
						||
| 
								 | 
							
								is the real part of the @math{k}th output, and
							 | 
						||
| 
								 | 
							
								@ifinfo
							 | 
						||
| 
								 | 
							
								ik
							 | 
						||
| 
								 | 
							
								@end ifinfo
							 | 
						||
| 
								 | 
							
								@tex
							 | 
						||
| 
								 | 
							
								$i_k$
							 | 
						||
| 
								 | 
							
								@end tex
							 | 
						||
| 
								 | 
							
								@html
							 | 
						||
| 
								 | 
							
								i<sub>k</sub>
							 | 
						||
| 
								 | 
							
								@end html
							 | 
						||
| 
								 | 
							
								is the imaginary part.  (Division by 2 is rounded down.) For a
							 | 
						||
| 
								 | 
							
								halfcomplex array @code{hc[n]}, the @math{k}th component thus has its
							 | 
						||
| 
								 | 
							
								real part in @code{hc[k]} and its imaginary part in @code{hc[n-k]}, with
							 | 
						||
| 
								 | 
							
								the exception of @code{k} @code{==} @code{0} or @code{n/2} (the latter
							 | 
						||
| 
								 | 
							
								only if @code{n} is even)---in these two cases, the imaginary part is
							 | 
						||
| 
								 | 
							
								zero due to symmetries of the real-input DFT, and is not stored.
							 | 
						||
| 
								 | 
							
								Thus, the r2hc transform of @code{n} real values is a halfcomplex array of
							 | 
						||
| 
								 | 
							
								length @code{n}, and vice versa for hc2r.
							 | 
						||
| 
								 | 
							
								@cindex normalization
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Aside from the differing format, the output of
							 | 
						||
| 
								 | 
							
								@code{FFTW_R2HC}/@code{FFTW_HC2R} is otherwise exactly the same as for
							 | 
						||
| 
								 | 
							
								the corresponding 1d r2c/c2r transform
							 | 
						||
| 
								 | 
							
								(i.e. @code{FFTW_FORWARD}/@code{FFTW_BACKWARD} transforms, respectively).
							 | 
						||
| 
								 | 
							
								Recall that these transforms are unnormalized, so r2hc followed by hc2r
							 | 
						||
| 
								 | 
							
								will result in the original data multiplied by @code{n}.  Furthermore,
							 | 
						||
| 
								 | 
							
								like the c2r transform, an out-of-place hc2r transform will
							 | 
						||
| 
								 | 
							
								@emph{destroy its input} array.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Although these halfcomplex transforms can be used with the
							 | 
						||
| 
								 | 
							
								multi-dimensional r2r interface, the interpretation of such a separable
							 | 
						||
| 
								 | 
							
								product of transforms along each dimension is problematic.  For example,
							 | 
						||
| 
								 | 
							
								consider a two-dimensional @code{n0} by @code{n1}, r2hc by r2hc
							 | 
						||
| 
								 | 
							
								transform planned by @code{fftw_plan_r2r_2d(n0, n1, in, out, FFTW_R2HC,
							 | 
						||
| 
								 | 
							
								FFTW_R2HC, FFTW_MEASURE)}.  Conceptually, FFTW first transforms the rows
							 | 
						||
| 
								 | 
							
								(of size @code{n1}) to produce halfcomplex rows, and then transforms the
							 | 
						||
| 
								 | 
							
								columns (of size @code{n0}).  Half of these column transforms, however,
							 | 
						||
| 
								 | 
							
								are of imaginary parts, and should therefore be multiplied by @math{i}
							 | 
						||
| 
								 | 
							
								and combined with the r2hc transforms of the real columns to produce the
							 | 
						||
| 
								 | 
							
								2d DFT amplitudes; FFTW's r2r transform does @emph{not} perform this
							 | 
						||
| 
								 | 
							
								combination for you.  Thus, if a multi-dimensional real-input/output DFT
							 | 
						||
| 
								 | 
							
								is required, we recommend using the ordinary r2c/c2r
							 | 
						||
| 
								 | 
							
								interface (@pxref{Multi-Dimensional DFTs of Real Data}).
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@c =========>
							 | 
						||
| 
								 | 
							
								@node Real even/odd DFTs (cosine/sine transforms), The Discrete Hartley Transform, The Halfcomplex-format DFT, More DFTs of Real Data
							 | 
						||
| 
								 | 
							
								@subsection Real even/odd DFTs (cosine/sine transforms)
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The Fourier transform of a real-even function @math{f(-x) = f(x)} is
							 | 
						||
| 
								 | 
							
								real-even, and @math{i} times the Fourier transform of a real-odd
							 | 
						||
| 
								 | 
							
								function @math{f(-x) = -f(x)} is real-odd.  Similar results hold for a
							 | 
						||
| 
								 | 
							
								discrete Fourier transform, and thus for these symmetries the need for
							 | 
						||
| 
								 | 
							
								complex inputs/outputs is entirely eliminated.  Moreover, one gains a
							 | 
						||
| 
								 | 
							
								factor of two in speed/space from the fact that the data are real, and
							 | 
						||
| 
								 | 
							
								an additional factor of two from the even/odd symmetry: only the
							 | 
						||
| 
								 | 
							
								non-redundant (first) half of the array need be stored.  The result is
							 | 
						||
| 
								 | 
							
								the real-even DFT (@dfn{REDFT}) and the real-odd DFT (@dfn{RODFT}), also
							 | 
						||
| 
								 | 
							
								known as the discrete cosine and sine transforms (@dfn{DCT} and
							 | 
						||
| 
								 | 
							
								@dfn{DST}), respectively.
							 | 
						||
| 
								 | 
							
								@cindex real-even DFT
							 | 
						||
| 
								 | 
							
								@cindex REDFT
							 | 
						||
| 
								 | 
							
								@cindex real-odd DFT
							 | 
						||
| 
								 | 
							
								@cindex RODFT
							 | 
						||
| 
								 | 
							
								@cindex discrete cosine transform
							 | 
						||
| 
								 | 
							
								@cindex DCT
							 | 
						||
| 
								 | 
							
								@cindex discrete sine transform
							 | 
						||
| 
								 | 
							
								@cindex DST
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								(In this section, we describe the 1d transforms; multi-dimensional
							 | 
						||
| 
								 | 
							
								transforms are just a separable product of these transforms operating
							 | 
						||
| 
								 | 
							
								along each dimension.)
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Because of the discrete sampling, one has an additional choice: is the
							 | 
						||
| 
								 | 
							
								data even/odd around a sampling point, or around the point halfway
							 | 
						||
| 
								 | 
							
								between two samples?  The latter corresponds to @emph{shifting} the
							 | 
						||
| 
								 | 
							
								samples by @emph{half} an interval, and gives rise to several transform
							 | 
						||
| 
								 | 
							
								variants denoted by REDFT@math{ab} and RODFT@math{ab}: @math{a} and
							 | 
						||
| 
								 | 
							
								@math{b} are @math{0} or @math{1}, and indicate whether the input
							 | 
						||
| 
								 | 
							
								(@math{a}) and/or output (@math{b}) are shifted by half a sample
							 | 
						||
| 
								 | 
							
								(@math{1} means it is shifted).  These are also known as types I-IV of
							 | 
						||
| 
								 | 
							
								the DCT and DST, and all four types are supported by FFTW's r2r
							 | 
						||
| 
								 | 
							
								interface.@footnote{There are also type V-VIII transforms, which
							 | 
						||
| 
								 | 
							
								correspond to a logical DFT of @emph{odd} size @math{N}, independent of
							 | 
						||
| 
								 | 
							
								whether the physical size @code{n} is odd, but we do not support these
							 | 
						||
| 
								 | 
							
								variants.}
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The r2r kinds for the various REDFT and RODFT types supported by FFTW,
							 | 
						||
| 
								 | 
							
								along with the boundary conditions at both ends of the @emph{input}
							 | 
						||
| 
								 | 
							
								array (@code{n} real numbers @code{in[j=0..n-1]}), are:
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@itemize @bullet
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@item
							 | 
						||
| 
								 | 
							
								@code{FFTW_REDFT00} (DCT-I): even around @math{j=0} and even around @math{j=n-1}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_REDFT00
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@item
							 | 
						||
| 
								 | 
							
								@code{FFTW_REDFT10} (DCT-II, ``the'' DCT): even around @math{j=-0.5} and even around @math{j=n-0.5}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_REDFT10
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@item
							 | 
						||
| 
								 | 
							
								@code{FFTW_REDFT01} (DCT-III, ``the'' IDCT): even around @math{j=0} and odd around @math{j=n}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_REDFT01
							 | 
						||
| 
								 | 
							
								@cindex IDCT
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@item
							 | 
						||
| 
								 | 
							
								@code{FFTW_REDFT11} (DCT-IV): even around @math{j=-0.5} and odd around @math{j=n-0.5}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_REDFT11
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@item
							 | 
						||
| 
								 | 
							
								@code{FFTW_RODFT00} (DST-I): odd around @math{j=-1} and odd around @math{j=n}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_RODFT00
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@item
							 | 
						||
| 
								 | 
							
								@code{FFTW_RODFT10} (DST-II): odd around @math{j=-0.5} and odd around @math{j=n-0.5}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_RODFT10
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@item
							 | 
						||
| 
								 | 
							
								@code{FFTW_RODFT01} (DST-III): odd around @math{j=-1} and even around @math{j=n-1}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_RODFT01
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@item
							 | 
						||
| 
								 | 
							
								@code{FFTW_RODFT11} (DST-IV): odd around @math{j=-0.5} and even around @math{j=n-0.5}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_RODFT11
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@end itemize
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Note that these symmetries apply to the ``logical'' array being
							 | 
						||
| 
								 | 
							
								transformed; @strong{there are no constraints on your physical input
							 | 
						||
| 
								 | 
							
								data}.  So, for example, if you specify a size-5 REDFT00 (DCT-I) of the
							 | 
						||
| 
								 | 
							
								data @math{abcde}, it corresponds to the DFT of the logical even array
							 | 
						||
| 
								 | 
							
								@math{abcdedcb} of size 8.  A size-4 REDFT10 (DCT-II) of the data
							 | 
						||
| 
								 | 
							
								@math{abcd} corresponds to the size-8 logical DFT of the even array
							 | 
						||
| 
								 | 
							
								@math{abcddcba}, shifted by half a sample.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								All of these transforms are invertible.  The inverse of R*DFT00 is
							 | 
						||
| 
								 | 
							
								R*DFT00; of R*DFT10 is R*DFT01 and vice versa (these are often called
							 | 
						||
| 
								 | 
							
								simply ``the'' DCT and IDCT, respectively); and of R*DFT11 is R*DFT11.
							 | 
						||
| 
								 | 
							
								However, the transforms computed by FFTW are unnormalized, exactly
							 | 
						||
| 
								 | 
							
								like the corresponding real and complex DFTs, so computing a transform
							 | 
						||
| 
								 | 
							
								followed by its inverse yields the original array scaled by @math{N},
							 | 
						||
| 
								 | 
							
								where @math{N} is the @emph{logical} DFT size.  For REDFT00,
							 | 
						||
| 
								 | 
							
								@math{N=2(n-1)}; for RODFT00, @math{N=2(n+1)}; otherwise, @math{N=2n}.
							 | 
						||
| 
								 | 
							
								@cindex normalization
							 | 
						||
| 
								 | 
							
								@cindex IDCT
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Note that the boundary conditions of the transform output array are
							 | 
						||
| 
								 | 
							
								given by the input boundary conditions of the inverse transform.
							 | 
						||
| 
								 | 
							
								Thus, the above transforms are all inequivalent in terms of
							 | 
						||
| 
								 | 
							
								input/output boundary conditions, even neglecting the 0.5 shift
							 | 
						||
| 
								 | 
							
								difference.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								FFTW is most efficient when @math{N} is a product of small factors; note
							 | 
						||
| 
								 | 
							
								that this @emph{differs} from the factorization of the physical size
							 | 
						||
| 
								 | 
							
								@code{n} for REDFT00 and RODFT00!  There is another oddity: @code{n=1}
							 | 
						||
| 
								 | 
							
								REDFT00 transforms correspond to @math{N=0}, and so are @emph{not
							 | 
						||
| 
								 | 
							
								defined} (the planner will return @code{NULL}).  Otherwise, any positive
							 | 
						||
| 
								 | 
							
								@code{n} is supported.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								For the precise mathematical definitions of these transforms as used by
							 | 
						||
| 
								 | 
							
								FFTW, see @ref{What FFTW Really Computes}.  (For people accustomed to
							 | 
						||
| 
								 | 
							
								the DCT/DST, FFTW's definitions have a coefficient of @math{2} in front
							 | 
						||
| 
								 | 
							
								of the cos/sin functions so that they correspond precisely to an
							 | 
						||
| 
								 | 
							
								even/odd DFT of size @math{N}.  Some authors also include additional
							 | 
						||
| 
								 | 
							
								multiplicative factors of 
							 | 
						||
| 
								 | 
							
								@ifinfo
							 | 
						||
| 
								 | 
							
								sqrt(2)
							 | 
						||
| 
								 | 
							
								@end ifinfo
							 | 
						||
| 
								 | 
							
								@html
							 | 
						||
| 
								 | 
							
								√2
							 | 
						||
| 
								 | 
							
								@end html
							 | 
						||
| 
								 | 
							
								@tex
							 | 
						||
| 
								 | 
							
								$\sqrt{2}$
							 | 
						||
| 
								 | 
							
								@end tex
							 | 
						||
| 
								 | 
							
								for selected inputs and outputs; this makes
							 | 
						||
| 
								 | 
							
								the transform orthogonal, but sacrifices the direct equivalence to a
							 | 
						||
| 
								 | 
							
								symmetric DFT.)
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@subsubheading Which type do you need?
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Since the required flavor of even/odd DFT depends upon your problem,
							 | 
						||
| 
								 | 
							
								you are the best judge of this choice, but we can make a few comments
							 | 
						||
| 
								 | 
							
								on relative efficiency to help you in your selection.  In particular,
							 | 
						||
| 
								 | 
							
								R*DFT01 and R*DFT10 tend to be slightly faster than R*DFT11
							 | 
						||
| 
								 | 
							
								(especially for odd sizes), while the R*DFT00 transforms are sometimes
							 | 
						||
| 
								 | 
							
								significantly slower (especially for even sizes).@footnote{R*DFT00 is
							 | 
						||
| 
								 | 
							
								sometimes slower in FFTW because we discovered that the standard
							 | 
						||
| 
								 | 
							
								algorithm for computing this by a pre/post-processed real DFT---the
							 | 
						||
| 
								 | 
							
								algorithm used in FFTPACK, Numerical Recipes, and other sources for
							 | 
						||
| 
								 | 
							
								decades now---has serious numerical problems: it already loses several
							 | 
						||
| 
								 | 
							
								decimal places of accuracy for 16k sizes.  There seem to be only two
							 | 
						||
| 
								 | 
							
								alternatives in the literature that do not suffer similarly: a
							 | 
						||
| 
								 | 
							
								recursive decomposition into smaller DCTs, which would require a large
							 | 
						||
| 
								 | 
							
								set of codelets for efficiency and generality, or sacrificing a factor of 
							 | 
						||
| 
								 | 
							
								@tex
							 | 
						||
| 
								 | 
							
								$\sim 2$
							 | 
						||
| 
								 | 
							
								@end tex
							 | 
						||
| 
								 | 
							
								@ifnottex
							 | 
						||
| 
								 | 
							
								2
							 | 
						||
| 
								 | 
							
								@end ifnottex
							 | 
						||
| 
								 | 
							
								in speed to use a real DFT of twice the size.  We currently
							 | 
						||
| 
								 | 
							
								employ the latter technique for general @math{n}, as well as a limited
							 | 
						||
| 
								 | 
							
								form of the former method: a split-radix decomposition when @math{n}
							 | 
						||
| 
								 | 
							
								is odd (@math{N} a multiple of 4).  For @math{N} containing many
							 | 
						||
| 
								 | 
							
								factors of 2, the split-radix method seems to recover most of the
							 | 
						||
| 
								 | 
							
								speed of the standard algorithm without the accuracy tradeoff.}
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Thus, if only the boundary conditions on the transform inputs are
							 | 
						||
| 
								 | 
							
								specified, we generally recommend R*DFT10 over R*DFT00 and R*DFT01 over
							 | 
						||
| 
								 | 
							
								R*DFT11 (unless the half-sample shift or the self-inverse property is
							 | 
						||
| 
								 | 
							
								significant for your problem).
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								If performance is important to you and you are using only small sizes
							 | 
						||
| 
								 | 
							
								(say @math{n<200}), e.g. for multi-dimensional transforms, then you
							 | 
						||
| 
								 | 
							
								might consider generating hard-coded transforms of those sizes and types
							 | 
						||
| 
								 | 
							
								that you are interested in (@pxref{Generating your own code}).
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								We are interested in hearing what types of symmetric transforms you find
							 | 
						||
| 
								 | 
							
								most useful.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								@c =========>
							 | 
						||
| 
								 | 
							
								@node The Discrete Hartley Transform,  , Real even/odd DFTs (cosine/sine transforms), More DFTs of Real Data
							 | 
						||
| 
								 | 
							
								@subsection The Discrete Hartley Transform
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								If you are planning to use the DHT because you've heard that it is
							 | 
						||
| 
								 | 
							
								``faster'' than the DFT (FFT), @strong{stop here}.  The DHT is not
							 | 
						||
| 
								 | 
							
								faster than the DFT.  That story is an old but enduring misconception
							 | 
						||
| 
								 | 
							
								that was debunked in 1987.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The discrete Hartley transform (DHT) is an invertible linear transform
							 | 
						||
| 
								 | 
							
								closely related to the DFT.  In the DFT, one multiplies each input by
							 | 
						||
| 
								 | 
							
								@math{cos - i * sin} (a complex exponential), whereas in the DHT each
							 | 
						||
| 
								 | 
							
								input is multiplied by simply @math{cos + sin}.  Thus, the DHT
							 | 
						||
| 
								 | 
							
								transforms @code{n} real numbers to @code{n} real numbers, and has the
							 | 
						||
| 
								 | 
							
								convenient property of being its own inverse.  In FFTW, a DHT (of any
							 | 
						||
| 
								 | 
							
								positive @code{n}) can be specified by an r2r kind of @code{FFTW_DHT}.
							 | 
						||
| 
								 | 
							
								@ctindex FFTW_DHT
							 | 
						||
| 
								 | 
							
								@cindex discrete Hartley transform
							 | 
						||
| 
								 | 
							
								@cindex DHT
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								Like the DFT, in FFTW the DHT is unnormalized, so computing a DHT of
							 | 
						||
| 
								 | 
							
								size @code{n} followed by another DHT of the same size will result in
							 | 
						||
| 
								 | 
							
								the original array multiplied by @code{n}.
							 | 
						||
| 
								 | 
							
								@cindex normalization
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								The DHT was originally proposed as a more efficient alternative to the
							 | 
						||
| 
								 | 
							
								DFT for real data, but it was subsequently shown that a specialized DFT
							 | 
						||
| 
								 | 
							
								(such as FFTW's r2hc or r2c transforms) could be just as fast.  In FFTW,
							 | 
						||
| 
								 | 
							
								the DHT is actually computed by post-processing an r2hc transform, so
							 | 
						||
| 
								 | 
							
								there is ordinarily no reason to prefer it from a performance
							 | 
						||
| 
								 | 
							
								perspective.@footnote{We provide the DHT mainly as a byproduct of some
							 | 
						||
| 
								 | 
							
								internal algorithms. FFTW computes a real input/output DFT of
							 | 
						||
| 
								 | 
							
								@emph{prime} size by re-expressing it as a DHT plus post/pre-processing
							 | 
						||
| 
								 | 
							
								and then using Rader's prime-DFT algorithm adapted to the DHT.}
							 | 
						||
| 
								 | 
							
								However, we have heard rumors that the DHT might be the most appropriate
							 | 
						||
| 
								 | 
							
								transform in its own right for certain applications, and we would be
							 | 
						||
| 
								 | 
							
								very interested to hear from anyone who finds it useful.
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								If @code{FFTW_DHT} is specified for multiple dimensions of a
							 | 
						||
| 
								 | 
							
								multi-dimensional transform, FFTW computes the separable product of 1d
							 | 
						||
| 
								 | 
							
								DHTs along each dimension.  Unfortunately, this is not quite the same
							 | 
						||
| 
								 | 
							
								thing as a true multi-dimensional DHT; you can compute the latter, if
							 | 
						||
| 
								 | 
							
								necessary, with at most @code{rank-1} post-processing passes
							 | 
						||
| 
								 | 
							
								[see e.g. H. Hao and R. N. Bracewell, @i{Proc. IEEE} @b{75}, 264--266 (1987)].
							 | 
						||
| 
								 | 
							
								
							 | 
						||
| 
								 | 
							
								For the precise mathematical definition of the DHT as used by FFTW, see
							 | 
						||
| 
								 | 
							
								@ref{What FFTW Really Computes}.
							 | 
						||
| 
								 | 
							
								
							 |