2457 lines
		
	
	
		
			90 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
		
		
			
		
	
	
			2457 lines
		
	
	
		
			90 KiB
		
	
	
	
		
			Plaintext
		
	
	
	
	
	
|   | @node FFTW Reference, Multi-threaded FFTW, Other Important Topics, Top | ||
|  | @chapter FFTW Reference | ||
|  | 
 | ||
|  | This chapter provides a complete reference for all sequential (i.e., | ||
|  | one-processor) FFTW functions.  Parallel transforms are described in | ||
|  | later chapters. | ||
|  | 
 | ||
|  | @menu | ||
|  | * Data Types and Files:: | ||
|  | * Using Plans:: | ||
|  | * Basic Interface:: | ||
|  | * Advanced Interface:: | ||
|  | * Guru Interface:: | ||
|  | * New-array Execute Functions:: | ||
|  | * Wisdom:: | ||
|  | * What FFTW Really Computes:: | ||
|  | @end menu | ||
|  | 
 | ||
|  | @c ------------------------------------------------------------ | ||
|  | @node Data Types and Files, Using Plans, FFTW Reference, FFTW Reference | ||
|  | @section Data Types and Files | ||
|  | 
 | ||
|  | All programs using FFTW should include its header file: | ||
|  | 
 | ||
|  | @example | ||
|  | #include <fftw3.h> | ||
|  | @end example | ||
|  | 
 | ||
|  | You must also link to the FFTW library.  On Unix, this | ||
|  | means adding @code{-lfftw3 -lm} at the @emph{end} of the link command. | ||
|  | 
 | ||
|  | @menu | ||
|  | * Complex numbers:: | ||
|  | * Precision:: | ||
|  | * Memory Allocation:: | ||
|  | @end menu | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Complex numbers, Precision, Data Types and Files, Data Types and Files | ||
|  | @subsection Complex numbers | ||
|  | 
 | ||
|  | The default FFTW interface uses @code{double} precision for all | ||
|  | floating-point numbers, and defines a @code{fftw_complex} type to hold | ||
|  | complex numbers as: | ||
|  | 
 | ||
|  | @example | ||
|  | typedef double fftw_complex[2]; | ||
|  | @end example | ||
|  | @tindex fftw_complex | ||
|  | 
 | ||
|  | Here, the @code{[0]} element holds the real part and the @code{[1]} | ||
|  | element holds the imaginary part. | ||
|  | 
 | ||
|  | Alternatively, if you have a C compiler (such as @code{gcc}) that | ||
|  | supports the C99 revision of the ANSI C standard, you can use C's new | ||
|  | native complex type (which is binary-compatible with the typedef above). | ||
|  | In particular, if you @code{#include <complex.h>} @emph{before} | ||
|  | @code{<fftw3.h>}, then @code{fftw_complex} is defined to be the native | ||
|  | complex type and you can manipulate it with ordinary arithmetic | ||
|  | (e.g. @code{x = y * (3+4*I)}, where @code{x} and @code{y} are | ||
|  | @code{fftw_complex} and @code{I} is the standard symbol for the | ||
|  | imaginary unit); | ||
|  | @cindex C99 | ||
|  | 
 | ||
|  | 
 | ||
|  | C++ has its own @code{complex<T>} template class, defined in the | ||
|  | standard @code{<complex>} header file.  Reportedly, the C++ standards | ||
|  | committee has recently agreed to mandate that the storage format used | ||
|  | for this type be binary-compatible with the C99 type, i.e. an array | ||
|  | @code{T[2]} with consecutive real @code{[0]} and imaginary @code{[1]} | ||
|  | parts.  (See report | ||
|  | @uref{http://www.open-std.org/jtc1/sc22/WG21/docs/papers/2002/n1388.pdf | ||
|  | WG21/N1388}.)  Although not part of the official standard as of this | ||
|  | writing, the proposal stated that: ``This solution has been tested with | ||
|  | all current major implementations of the standard library and shown to | ||
|  | be working.''  To the extent that this is true, if you have a variable | ||
|  | @code{complex<double> *x}, you can pass it directly to FFTW via | ||
|  | @code{reinterpret_cast<fftw_complex*>(x)}. | ||
|  | @cindex C++ | ||
|  | @cindex portability | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Precision, Memory Allocation, Complex numbers, Data Types and Files | ||
|  | @subsection Precision | ||
|  | @cindex precision | ||
|  | 
 | ||
|  | You can install single and long-double precision versions of FFTW, | ||
|  | which replace @code{double} with @code{float} and @code{long double}, | ||
|  | respectively (@pxref{Installation and Customization}).  To use these | ||
|  | interfaces, you: | ||
|  | 
 | ||
|  | @itemize @bullet | ||
|  | 
 | ||
|  | @item | ||
|  | Link to the single/long-double libraries; on Unix, @code{-lfftw3f} or | ||
|  | @code{-lfftw3l} instead of (or in addition to) @code{-lfftw3}.  (You | ||
|  | can link to the different-precision libraries simultaneously.) | ||
|  | 
 | ||
|  | @item | ||
|  | Include the @emph{same} @code{<fftw3.h>} header file. | ||
|  | 
 | ||
|  | @item | ||
|  | Replace all lowercase instances of @samp{fftw_} with @samp{fftwf_} or | ||
|  | @samp{fftwl_} for single or long-double precision, respectively. | ||
|  | (@code{fftw_complex} becomes @code{fftwf_complex}, @code{fftw_execute} | ||
|  | becomes @code{fftwf_execute}, etcetera.) | ||
|  | 
 | ||
|  | @item | ||
|  | Uppercase names, i.e. names beginning with @samp{FFTW_}, remain the | ||
|  | same. | ||
|  | 
 | ||
|  | @item | ||
|  | Replace @code{double} with @code{float} or @code{long double} for | ||
|  | subroutine parameters. | ||
|  | 
 | ||
|  | @end itemize | ||
|  | 
 | ||
|  | Depending upon your compiler and/or hardware, @code{long double} may not | ||
|  | be any more precise than @code{double} (or may not be supported at all, | ||
|  | although it is standard in C99). | ||
|  | @cindex C99 | ||
|  | 
 | ||
|  | 
 | ||
|  | We also support using the nonstandard @code{__float128} | ||
|  | quadruple-precision type provided by recent versions of @code{gcc} on | ||
|  | 32- and 64-bit x86 hardware (@pxref{Installation and Customization}). | ||
|  | To use this type, link with @code{-lfftw3q -lquadmath -lm} (the | ||
|  | @code{libquadmath} library provided by @code{gcc} is needed for | ||
|  | quadruple-precision trigonometric functions) and use @samp{fftwq_} | ||
|  | identifiers. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Memory Allocation,  , Precision, Data Types and Files | ||
|  | @subsection Memory Allocation | ||
|  | 
 | ||
|  | @example | ||
|  | void *fftw_malloc(size_t n); | ||
|  | void fftw_free(void *p); | ||
|  | @end example | ||
|  | @findex fftw_malloc | ||
|  | @findex fftw_free | ||
|  | 
 | ||
|  | These are functions that behave identically to @code{malloc} and | ||
|  | @code{free}, except that they guarantee that the returned pointer obeys | ||
|  | any special alignment restrictions imposed by any algorithm in FFTW | ||
|  | (e.g. for SIMD acceleration).  @xref{SIMD alignment and fftw_malloc}. | ||
|  | @cindex alignment | ||
|  | 
 | ||
|  | 
 | ||
|  | Data allocated by @code{fftw_malloc} @emph{must} be deallocated by | ||
|  | @code{fftw_free} and not by the ordinary @code{free}. | ||
|  | 
 | ||
|  | These routines simply call through to your operating system's | ||
|  | @code{malloc} or, if necessary, its aligned equivalent | ||
|  | (e.g. @code{memalign}), so you normally need not worry about any | ||
|  | significant time or space overhead.  You are @emph{not required} to use | ||
|  | them to allocate your data, but we strongly recommend it. | ||
|  | 
 | ||
|  | Note: in C++, just as with ordinary @code{malloc}, you must typecast | ||
|  | the output of @code{fftw_malloc} to whatever pointer type you are | ||
|  | allocating. | ||
|  | @cindex C++ | ||
|  | 
 | ||
|  | 
 | ||
|  | We also provide the following two convenience functions to allocate | ||
|  | real and complex arrays with @code{n} elements, which are equivalent | ||
|  | to @code{(double *) fftw_malloc(sizeof(double) * n)} and | ||
|  | @code{(fftw_complex *) fftw_malloc(sizeof(fftw_complex) * n)}, | ||
|  | respectively: | ||
|  | 
 | ||
|  | @example | ||
|  | double *fftw_alloc_real(size_t n); | ||
|  | fftw_complex *fftw_alloc_complex(size_t n); | ||
|  | @end example | ||
|  | @findex fftw_alloc_real | ||
|  | @findex fftw_alloc_complex | ||
|  | 
 | ||
|  | The equivalent functions in other precisions allocate arrays of @code{n} | ||
|  | elements in that precision.  e.g. @code{fftwf_alloc_real(n)} is | ||
|  | equivalent to @code{(float *) fftwf_malloc(sizeof(float) * n)}. | ||
|  | @cindex precision | ||
|  | 
 | ||
|  | @c ------------------------------------------------------------ | ||
|  | @node Using Plans, Basic Interface, Data Types and Files, FFTW Reference | ||
|  | @section Using Plans | ||
|  | 
 | ||
|  | Plans for all transform types in FFTW are stored as type | ||
|  | @code{fftw_plan} (an opaque pointer type), and are created by one of the | ||
|  | various planning routines described in the following sections. | ||
|  | @tindex fftw_plan | ||
|  | An @code{fftw_plan} contains all information necessary to compute the | ||
|  | transform, including the pointers to the input and output arrays. | ||
|  | 
 | ||
|  | @example | ||
|  | void fftw_execute(const fftw_plan plan); | ||
|  | @end example | ||
|  | @findex fftw_execute | ||
|  | 
 | ||
|  | This executes the @code{plan}, to compute the corresponding transform on | ||
|  | the arrays for which it was planned (which must still exist).  The plan | ||
|  | is not modified, and @code{fftw_execute} can be called as many times as | ||
|  | desired. | ||
|  | 
 | ||
|  | To apply a given plan to a different array, you can use the new-array execute | ||
|  | interface.  @xref{New-array Execute Functions}. | ||
|  | 
 | ||
|  | @code{fftw_execute} (and equivalents) is the only function in FFTW | ||
|  | guaranteed to be thread-safe; see @ref{Thread safety}. | ||
|  | 
 | ||
|  | This function: | ||
|  | @example | ||
|  | void fftw_destroy_plan(fftw_plan plan); | ||
|  | @end example | ||
|  | @findex fftw_destroy_plan | ||
|  | deallocates the @code{plan} and all its associated data. | ||
|  | 
 | ||
|  | FFTW's planner saves some other persistent data, such as the | ||
|  | accumulated wisdom and a list of algorithms available in the current | ||
|  | configuration.  If you want to deallocate all of that and reset FFTW | ||
|  | to the pristine state it was in when you started your program, you can | ||
|  | call: | ||
|  | 
 | ||
|  | @example | ||
|  | void fftw_cleanup(void); | ||
|  | @end example | ||
|  | @findex fftw_cleanup | ||
|  | 
 | ||
|  | After calling @code{fftw_cleanup}, all existing plans become undefined, | ||
|  | and you should not attempt to execute them nor to destroy them.  You can | ||
|  | however create and execute/destroy new plans, in which case FFTW starts | ||
|  | accumulating wisdom information again. | ||
|  | 
 | ||
|  | @code{fftw_cleanup} does not deallocate your plans, however.  To prevent | ||
|  | memory leaks, you must still call @code{fftw_destroy_plan} before | ||
|  | executing @code{fftw_cleanup}. | ||
|  | 
 | ||
|  | Occasionally, it may useful to know FFTW's internal ``cost'' metric | ||
|  | that it uses to compare plans to one another; this cost is | ||
|  | proportional to an execution time of the plan, in undocumented units, | ||
|  | if the plan was created with the @code{FFTW_MEASURE} or other | ||
|  | timing-based options, or alternatively is a heuristic cost function | ||
|  | for @code{FFTW_ESTIMATE} plans.  (The cost values of measured and | ||
|  | estimated plans are not comparable, being in different units.  Also, | ||
|  | costs from different FFTW versions or the same version compiled | ||
|  | differently may not be in the same units.  Plans created from wisdom | ||
|  | have a cost of 0 since no timing measurement is performed for them. | ||
|  | Finally, certain problems for which only one top-level algorithm was | ||
|  | possible may have required no measurements of the cost of the whole | ||
|  | plan, in which case @code{fftw_cost} will also return 0.)  The cost | ||
|  | metric for a given plan is returned by: | ||
|  | 
 | ||
|  | @example | ||
|  | double fftw_cost(const fftw_plan plan); | ||
|  | @end example | ||
|  | @findex fftw_cost | ||
|  | 
 | ||
|  | The following two routines are provided purely for academic purposes | ||
|  | (that is, for entertainment). | ||
|  | 
 | ||
|  | @example | ||
|  | void fftw_flops(const fftw_plan plan,  | ||
|  |                 double *add, double *mul, double *fma); | ||
|  | @end example | ||
|  | @findex fftw_flops | ||
|  | 
 | ||
|  | Given a @code{plan}, set @code{add}, @code{mul}, and @code{fma} to an | ||
|  | exact count of the number of floating-point additions, multiplications, | ||
|  | and fused multiply-add operations involved in the plan's execution.  The | ||
|  | total number of floating-point operations (flops) is @code{add + mul + | ||
|  | 2*fma}, or @code{add + mul + fma} if the hardware supports fused | ||
|  | multiply-add instructions (although the number of FMA operations is only | ||
|  | approximate because of compiler voodoo).  (The number of operations | ||
|  | should be an integer, but we use @code{double} to avoid overflowing | ||
|  | @code{int} for large transforms; the arguments are of type @code{double} | ||
|  | even for single and long-double precision versions of FFTW.) | ||
|  | 
 | ||
|  | @example | ||
|  | void fftw_fprint_plan(const fftw_plan plan, FILE *output_file); | ||
|  | void fftw_print_plan(const fftw_plan plan); | ||
|  | char *fftw_sprint_plan(const fftw_plan plan); | ||
|  | @end example | ||
|  | @findex fftw_fprint_plan | ||
|  | @findex fftw_print_plan | ||
|  | 
 | ||
|  | This outputs a ``nerd-readable'' representation of the @code{plan} to | ||
|  | the given file, to @code{stdout}, or two a newly allocated | ||
|  | NUL-terminated string (which the caller is responsible for deallocating | ||
|  | with @code{free}), respectively. | ||
|  | 
 | ||
|  | @c ------------------------------------------------------------ | ||
|  | @node Basic Interface, Advanced Interface, Using Plans, FFTW Reference | ||
|  | @section Basic Interface | ||
|  | @cindex basic interface | ||
|  | 
 | ||
|  | Recall that the FFTW API is divided into three parts@footnote{@i{Gallia est | ||
|  | omnis divisa in partes tres} (Julius Caesar).}: the @dfn{basic interface} | ||
|  | computes a single transform of contiguous data, the @dfn{advanced | ||
|  | interface} computes transforms of multiple or strided arrays, and the | ||
|  | @dfn{guru interface} supports the most general data layouts, | ||
|  | multiplicities, and strides.  This section describes the basic | ||
|  | interface, which we expect to satisfy the needs of most users. | ||
|  | 
 | ||
|  | @menu | ||
|  | * Complex DFTs:: | ||
|  | * Planner Flags:: | ||
|  | * Real-data DFTs:: | ||
|  | * Real-data DFT Array Format:: | ||
|  | * Real-to-Real Transforms:: | ||
|  | * Real-to-Real Transform Kinds:: | ||
|  | @end menu | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Complex DFTs, Planner Flags, Basic Interface, Basic Interface | ||
|  | @subsection Complex DFTs | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_dft_1d(int n0, | ||
|  |                            fftw_complex *in, fftw_complex *out, | ||
|  |                            int sign, unsigned flags); | ||
|  | fftw_plan fftw_plan_dft_2d(int n0, int n1, | ||
|  |                            fftw_complex *in, fftw_complex *out, | ||
|  |                            int sign, unsigned flags); | ||
|  | fftw_plan fftw_plan_dft_3d(int n0, int n1, int n2, | ||
|  |                            fftw_complex *in, fftw_complex *out, | ||
|  |                            int sign, unsigned flags); | ||
|  | fftw_plan fftw_plan_dft(int rank, const int *n, | ||
|  |                         fftw_complex *in, fftw_complex *out, | ||
|  |                         int sign, unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_dft_1d | ||
|  | @findex fftw_plan_dft_2d | ||
|  | @findex fftw_plan_dft_3d | ||
|  | @findex fftw_plan_dft | ||
|  | 
 | ||
|  | Plan a complex input/output discrete Fourier transform (DFT) in zero or | ||
|  | more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}). | ||
|  | 
 | ||
|  | Once you have created a plan for a certain transform type and | ||
|  | parameters, then creating another plan of the same type and parameters, | ||
|  | but for different arrays, is fast and shares constant data with the | ||
|  | first plan (if it still exists). | ||
|  | 
 | ||
|  | The planner returns @code{NULL} if the plan cannot be created.  In the | ||
|  | standard FFTW distribution, the basic interface is guaranteed to return | ||
|  | a non-@code{NULL} plan.  A plan may be @code{NULL}, however, if you are | ||
|  | using a customized FFTW configuration supporting a restricted set of | ||
|  | transforms. | ||
|  | 
 | ||
|  | @subsubheading Arguments | ||
|  | @itemize @bullet | ||
|  | 
 | ||
|  | @item | ||
|  | @code{rank} is the rank of the transform (it should be the size of the | ||
|  | array @code{*n}), and can be any non-negative integer.  (@xref{Complex | ||
|  | Multi-Dimensional DFTs}, for the definition of ``rank''.)  The | ||
|  | @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a | ||
|  | @code{rank} of @code{1}, @code{2}, and @code{3}, respectively.  The rank | ||
|  | may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a | ||
|  | copy of one number from input to output. | ||
|  | 
 | ||
|  | @item | ||
|  | @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]} (as appropriate | ||
|  | for each routine) specify the size of the transform dimensions.  They | ||
|  | can be any positive integer. | ||
|  |   | ||
|  | @itemize @minus | ||
|  | @item | ||
|  | @cindex row-major | ||
|  | Multi-dimensional arrays are stored in row-major order with dimensions: | ||
|  | @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or | ||
|  | @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}. | ||
|  | @xref{Multi-dimensional Array Format}. | ||
|  | @item | ||
|  | FFTW is best at handling sizes of the form | ||
|  | @ifinfo | ||
|  | @math{2^a 3^b 5^c 7^d 11^e 13^f}, | ||
|  | @end ifinfo | ||
|  | @tex | ||
|  | $2^a 3^b 5^c 7^d 11^e 13^f$, | ||
|  | @end tex | ||
|  | @html | ||
|  | 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup> | ||
|  |         11<sup>e</sup> 13<sup>f</sup>, | ||
|  | @end html | ||
|  | where @math{e+f} is either @math{0} or @math{1}, and the other exponents | ||
|  | are arbitrary.  Other sizes are computed by means of a slow, | ||
|  | general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes).  It is possible to customize FFTW | ||
|  | for different array sizes; see @ref{Installation and Customization}. | ||
|  | Transforms whose sizes are powers of @math{2} are especially fast. | ||
|  | @end itemize | ||
|  | 
 | ||
|  | @item | ||
|  | @code{in} and @code{out} point to the input and output arrays of the | ||
|  | transform, which may be the same (yielding an in-place transform). | ||
|  | @cindex in-place | ||
|  | These arrays are overwritten during planning, unless | ||
|  | @code{FFTW_ESTIMATE} is used in the flags.  (The arrays need not be | ||
|  | initialized, but they must be allocated.) | ||
|  | 
 | ||
|  | If @code{in == out}, the transform is @dfn{in-place} and the input | ||
|  | array is overwritten. If @code{in != out}, the two arrays must | ||
|  | not overlap (but FFTW does not check for this condition). | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_FORWARD | ||
|  | @ctindex FFTW_BACKWARD | ||
|  | @code{sign} is the sign of the exponent in the formula that defines the | ||
|  | Fourier transform.  It can be @math{-1} (= @code{FFTW_FORWARD}) or | ||
|  | @math{+1} (= @code{FFTW_BACKWARD}). | ||
|  | 
 | ||
|  | @item | ||
|  | @cindex flags | ||
|  | @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | ||
|  | as defined in @ref{Planner Flags}. | ||
|  | 
 | ||
|  | @end itemize | ||
|  | 
 | ||
|  | FFTW computes an unnormalized transform: computing a forward followed by | ||
|  | a backward transform (or vice versa) will result in the original data | ||
|  | multiplied by the size of the transform (the product of the dimensions). | ||
|  | @cindex normalization | ||
|  | For more information, see @ref{What FFTW Really Computes}. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Planner Flags, Real-data DFTs, Complex DFTs, Basic Interface | ||
|  | @subsection Planner Flags | ||
|  | 
 | ||
|  | All of the planner routines in FFTW accept an integer @code{flags} | ||
|  | argument, which is a bitwise OR (@samp{|}) of zero or more of the flag | ||
|  | constants defined below.  These flags control the rigor (and time) of | ||
|  | the planning process, and can also impose (or lift) restrictions on the | ||
|  | type of transform algorithm that is employed. | ||
|  | 
 | ||
|  | @emph{Important:} the planner overwrites the input array during | ||
|  | planning unless a saved plan (@pxref{Wisdom}) is available for that | ||
|  | problem, so you should initialize your input data after creating the | ||
|  | plan.  The only exceptions to this are the @code{FFTW_ESTIMATE} and | ||
|  | @code{FFTW_WISDOM_ONLY} flags, as mentioned below. | ||
|  | 
 | ||
|  | In all  cases, if  wisdom is  available for the  given problem  that was | ||
|  | created  with equal-or-greater  planning rigor,  then the  more rigorous | ||
|  | wisdom is used.  For example, in @code{FFTW_ESTIMATE} mode any available | ||
|  | wisdom is used, whereas  in @code{FFTW_PATIENT} mode only wisdom created | ||
|  | in patient or exhaustive mode can be used.  @xref{Words of Wisdom-Saving | ||
|  | Plans}. | ||
|  | 
 | ||
|  | @subsubheading Planning-rigor flags | ||
|  | @itemize @bullet | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_ESTIMATE | ||
|  | @code{FFTW_ESTIMATE} specifies that, instead of actual measurements of | ||
|  | different algorithms, a simple heuristic is used to pick a (probably | ||
|  | sub-optimal) plan quickly.  With this flag, the input/output arrays are | ||
|  | not overwritten during planning. | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_MEASURE | ||
|  | @code{FFTW_MEASURE} tells FFTW to find an optimized plan by actually | ||
|  | @emph{computing} several FFTs and measuring their execution time. | ||
|  | Depending on your machine, this can take some time (often a few | ||
|  | seconds).  @code{FFTW_MEASURE} is the default planning option. | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_PATIENT | ||
|  | @code{FFTW_PATIENT} is like @code{FFTW_MEASURE}, but considers a wider | ||
|  | range of algorithms and often produces a ``more optimal'' plan | ||
|  | (especially for large transforms), but at the expense of several times | ||
|  | longer planning time (especially for large transforms). | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_EXHAUSTIVE | ||
|  | @code{FFTW_EXHAUSTIVE} is like @code{FFTW_PATIENT}, but considers an | ||
|  | even wider range of algorithms, including many that we think are | ||
|  | unlikely to be fast, to produce the most optimal plan but with a | ||
|  | substantially increased planning time. | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_WISDOM_ONLY | ||
|  | @code{FFTW_WISDOM_ONLY} is a special planning mode in which the plan | ||
|  | is only created if wisdom is available for the given problem, and | ||
|  | otherwise a @code{NULL} plan is returned.  This can be combined with | ||
|  | other flags, e.g. @samp{FFTW_WISDOM_ONLY | FFTW_PATIENT} creates a | ||
|  | plan only if wisdom is available that was created in | ||
|  | @code{FFTW_PATIENT} or @code{FFTW_EXHAUSTIVE} mode.  The | ||
|  | @code{FFTW_WISDOM_ONLY} flag is intended for users who need to detect | ||
|  | whether wisdom is available; for example, if wisdom is not available | ||
|  | one may wish to allocate new arrays for planning so that user data is | ||
|  | not overwritten. | ||
|  | 
 | ||
|  | @end itemize | ||
|  | 
 | ||
|  | @subsubheading Algorithm-restriction flags | ||
|  | @itemize @bullet | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_DESTROY_INPUT | ||
|  | @code{FFTW_DESTROY_INPUT} specifies that an out-of-place transform is | ||
|  | allowed to @emph{overwrite its input} array with arbitrary data; this | ||
|  | can sometimes allow more efficient algorithms to be employed. | ||
|  | @cindex out-of-place | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_PRESERVE_INPUT | ||
|  | @code{FFTW_PRESERVE_INPUT} specifies that an out-of-place transform must | ||
|  | @emph{not change its input} array.  This is ordinarily the | ||
|  | @emph{default}, except for c2r and hc2r (i.e. complex-to-real) | ||
|  | transforms for which @code{FFTW_DESTROY_INPUT} is the default.  In the | ||
|  | latter cases, passing @code{FFTW_PRESERVE_INPUT} will attempt to use | ||
|  | algorithms that do not destroy the input, at the expense of worse | ||
|  | performance; for multi-dimensional c2r transforms, however, no | ||
|  | input-preserving algorithms are implemented and the planner will return | ||
|  | @code{NULL} if one is requested. | ||
|  | @cindex c2r | ||
|  | @cindex hc2r | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_UNALIGNED | ||
|  | @cindex alignment | ||
|  | @findex fftw_malloc | ||
|  | @findex fftw_alignment_of | ||
|  | @code{FFTW_UNALIGNED} specifies that the algorithm may not impose any | ||
|  | unusual alignment requirements on the input/output arrays (i.e. no | ||
|  | SIMD may be used).  This flag is normally @emph{not necessary}, since | ||
|  | the planner automatically detects misaligned arrays.  The only use for | ||
|  | this flag is if you want to use the new-array execute interface to | ||
|  | execute a given plan on a different array that may not be aligned like | ||
|  | the original.  (Using @code{fftw_malloc} makes this flag unnecessary | ||
|  | even then.  You can also use @code{fftw_alignment_of} to detect | ||
|  | whether two arrays are equivalently aligned.) | ||
|  | 
 | ||
|  | @end itemize | ||
|  | 
 | ||
|  | @subsubheading Limiting planning time | ||
|  | 
 | ||
|  | @example | ||
|  | extern void fftw_set_timelimit(double seconds); | ||
|  | @end example | ||
|  | @findex fftw_set_timelimit | ||
|  | 
 | ||
|  | This function instructs FFTW to spend at most @code{seconds} seconds | ||
|  | (approximately) in the planner.  If @code{seconds == | ||
|  | FFTW_NO_TIMELIMIT} (the default value, which is negative), then | ||
|  | planning time is unbounded.  Otherwise, FFTW plans with a | ||
|  | progressively wider range of algorithms until the given time limit | ||
|  | is reached or the given range of algorithms is explored, returning the | ||
|  | best available plan. | ||
|  | @ctindex FFTW_NO_TIMELIMIT | ||
|  | 
 | ||
|  | 
 | ||
|  | For example, specifying @code{FFTW_PATIENT} first plans in | ||
|  | @code{FFTW_ESTIMATE} mode, then in @code{FFTW_MEASURE} mode, then | ||
|  | finally (time permitting) in @code{FFTW_PATIENT}.  If | ||
|  | @code{FFTW_EXHAUSTIVE} is specified instead, the planner will further | ||
|  | progress to @code{FFTW_EXHAUSTIVE} mode. | ||
|  | 
 | ||
|  | Note that the @code{seconds} argument specifies only a rough limit; in | ||
|  | practice, the planner may use somewhat more time if the time limit is | ||
|  | reached when the planner is in the middle of an operation that cannot | ||
|  | be interrupted.  At the very least, the planner will complete planning | ||
|  | in @code{FFTW_ESTIMATE} mode (which is thus equivalent to a time limit | ||
|  | of 0). | ||
|  | 
 | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Real-data DFTs, Real-data DFT Array Format, Planner Flags, Basic Interface | ||
|  | @subsection Real-data DFTs | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_dft_r2c_1d(int n0, | ||
|  |                                double *in, fftw_complex *out, | ||
|  |                                unsigned flags); | ||
|  | fftw_plan fftw_plan_dft_r2c_2d(int n0, int n1, | ||
|  |                                double *in, fftw_complex *out, | ||
|  |                                unsigned flags); | ||
|  | fftw_plan fftw_plan_dft_r2c_3d(int n0, int n1, int n2, | ||
|  |                                double *in, fftw_complex *out, | ||
|  |                                unsigned flags); | ||
|  | fftw_plan fftw_plan_dft_r2c(int rank, const int *n, | ||
|  |                             double *in, fftw_complex *out, | ||
|  |                             unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_dft_r2c_1d | ||
|  | @findex fftw_plan_dft_r2c_2d | ||
|  | @findex fftw_plan_dft_r2c_3d | ||
|  | @findex fftw_plan_dft_r2c | ||
|  | @cindex r2c | ||
|  | 
 | ||
|  | Plan a real-input/complex-output discrete Fourier transform (DFT) in | ||
|  | zero or more dimensions, returning an @code{fftw_plan} (@pxref{Using | ||
|  | Plans}). | ||
|  | 
 | ||
|  | Once you have created a plan for a certain transform type and | ||
|  | parameters, then creating another plan of the same type and parameters, | ||
|  | but for different arrays, is fast and shares constant data with the | ||
|  | first plan (if it still exists). | ||
|  | 
 | ||
|  | The planner returns @code{NULL} if the plan cannot be created.  A | ||
|  | non-@code{NULL} plan is always returned by the basic interface unless | ||
|  | you are using a customized FFTW configuration supporting a restricted | ||
|  | set of transforms, or if you use the @code{FFTW_PRESERVE_INPUT} flag | ||
|  | with a multi-dimensional out-of-place c2r transform (see below). | ||
|  | 
 | ||
|  | @subsubheading Arguments | ||
|  | @itemize @bullet | ||
|  | 
 | ||
|  | @item | ||
|  | @code{rank} is the rank of the transform (it should be the size of the | ||
|  | array @code{*n}), and can be any non-negative integer.  (@xref{Complex | ||
|  | Multi-Dimensional DFTs}, for the definition of ``rank''.)  The | ||
|  | @samp{_1d}, @samp{_2d}, and @samp{_3d} planners correspond to a | ||
|  | @code{rank} of @code{1}, @code{2}, and @code{3}, respectively.  The rank | ||
|  | may be zero, which is equivalent to a rank-1 transform of size 1, i.e. a | ||
|  | copy of one real number (with zero imaginary part) from input to output. | ||
|  | 
 | ||
|  | @item | ||
|  | @code{n0}, @code{n1}, @code{n2}, or @code{n[0..rank-1]}, (as appropriate | ||
|  | for each routine) specify the size of the transform dimensions.  They | ||
|  | can be any positive integer.  This is different in general from the | ||
|  | @emph{physical} array dimensions, which are described in @ref{Real-data | ||
|  | DFT Array Format}. | ||
|  |   | ||
|  | @itemize @minus | ||
|  | @item | ||
|  | FFTW is best at handling sizes of the form | ||
|  | @ifinfo | ||
|  | @math{2^a 3^b 5^c 7^d 11^e 13^f}, | ||
|  | @end ifinfo | ||
|  | @tex | ||
|  | $2^a 3^b 5^c 7^d 11^e 13^f$, | ||
|  | @end tex | ||
|  | @html | ||
|  | 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup> | ||
|  |         11<sup>e</sup> 13<sup>f</sup>, | ||
|  | @end html | ||
|  | where @math{e+f} is either @math{0} or @math{1}, and the other exponents | ||
|  | are arbitrary.  Other sizes are computed by means of a slow, | ||
|  | general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes).  (It is possible to customize FFTW | ||
|  | for different array sizes; see @ref{Installation and Customization}.) | ||
|  | Transforms whose sizes are powers of @math{2} are especially fast, and | ||
|  | it is generally beneficial for the @emph{last} dimension of an r2c/c2r | ||
|  | transform to be @emph{even}. | ||
|  | @end itemize | ||
|  | 
 | ||
|  | @item | ||
|  | @code{in} and @code{out} point to the input and output arrays of the | ||
|  | transform, which may be the same (yielding an in-place transform). | ||
|  | @cindex in-place | ||
|  | These arrays are overwritten during planning, unless | ||
|  | @code{FFTW_ESTIMATE} is used in the flags.  (The arrays need not be | ||
|  | initialized, but they must be allocated.)  For an in-place transform, it | ||
|  | is important to remember that the real array will require padding, | ||
|  | described in @ref{Real-data DFT Array Format}. | ||
|  | @cindex padding | ||
|  | 
 | ||
|  | @item | ||
|  | @cindex flags | ||
|  | @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | ||
|  | as defined in @ref{Planner Flags}. | ||
|  | 
 | ||
|  | @end itemize | ||
|  | 
 | ||
|  | The inverse transforms, taking complex input (storing the non-redundant | ||
|  | half of a logically Hermitian array) to real output, are given by: | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_dft_c2r_1d(int n0, | ||
|  |                                fftw_complex *in, double *out, | ||
|  |                                unsigned flags); | ||
|  | fftw_plan fftw_plan_dft_c2r_2d(int n0, int n1, | ||
|  |                                fftw_complex *in, double *out, | ||
|  |                                unsigned flags); | ||
|  | fftw_plan fftw_plan_dft_c2r_3d(int n0, int n1, int n2, | ||
|  |                                fftw_complex *in, double *out, | ||
|  |                                unsigned flags); | ||
|  | fftw_plan fftw_plan_dft_c2r(int rank, const int *n, | ||
|  |                             fftw_complex *in, double *out, | ||
|  |                             unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_dft_c2r_1d | ||
|  | @findex fftw_plan_dft_c2r_2d | ||
|  | @findex fftw_plan_dft_c2r_3d | ||
|  | @findex fftw_plan_dft_c2r | ||
|  | @cindex c2r | ||
|  | 
 | ||
|  | The arguments are the same as for the r2c transforms, except that the | ||
|  | input and output data formats are reversed. | ||
|  | 
 | ||
|  | FFTW computes an unnormalized transform: computing an r2c followed by a | ||
|  | c2r transform (or vice versa) will result in the original data | ||
|  | multiplied by the size of the transform (the product of the logical | ||
|  | dimensions). | ||
|  | @cindex normalization | ||
|  | An r2c transform produces the same output as a @code{FFTW_FORWARD} | ||
|  | complex DFT of the same input, and a c2r transform is correspondingly | ||
|  | equivalent to @code{FFTW_BACKWARD}.  For more information, see @ref{What | ||
|  | FFTW Really Computes}. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Real-data DFT Array Format, Real-to-Real Transforms, Real-data DFTs, Basic Interface | ||
|  | @subsection Real-data DFT Array Format | ||
|  | @cindex r2c/c2r multi-dimensional array format | ||
|  | 
 | ||
|  | The output of a DFT of real data (r2c) contains symmetries that, in | ||
|  | principle, make half of the outputs redundant (@pxref{What FFTW Really | ||
|  | Computes}).  (Similarly for the input of an inverse c2r transform.)  In | ||
|  | practice, it is not possible to entirely realize these savings in an | ||
|  | efficient and understandable format that generalizes to | ||
|  | multi-dimensional transforms.  Instead, the output of the r2c | ||
|  | transforms is @emph{slightly} over half of the output of the | ||
|  | corresponding complex transform.  We do not ``pack'' the data in any | ||
|  | way, but store it as an ordinary array of @code{fftw_complex} values. | ||
|  | In fact, this data is simply a subsection of what would be the array in | ||
|  | the corresponding complex transform. | ||
|  | 
 | ||
|  | Specifically, for a real transform of @math{d} (= @code{rank}) | ||
|  | dimensions @ndims{}, the complex data is an @ndimshalf array of | ||
|  | @code{fftw_complex} values in row-major order (with the division rounded | ||
|  | down).  That is, we only store the @emph{lower} half (non-negative | ||
|  | frequencies), plus one element, of the last dimension of the data from | ||
|  | the ordinary complex transform.  (We could have instead taken half of | ||
|  | any other dimension, but implementation turns out to be simpler if the | ||
|  | last, contiguous, dimension is used.) | ||
|  | 
 | ||
|  | @cindex out-of-place | ||
|  | For an out-of-place transform, the real data is simply an array with | ||
|  | physical dimensions @ndims in row-major order. | ||
|  | 
 | ||
|  | @cindex in-place | ||
|  | @cindex padding | ||
|  | For an in-place transform, some complications arise since the complex data | ||
|  | is slightly larger than the real data.  In this case, the final | ||
|  | dimension of the real data must be @emph{padded} with extra values to | ||
|  | accommodate the size of the complex data---two extra if the last | ||
|  | dimension is even and one if it is odd.  That is, the last dimension of | ||
|  | the real data must physically contain | ||
|  | @tex | ||
|  | $2 (n_{d-1}/2+1)$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | 2 * (n[d-1]/2+1) | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | 2 * (n<sub>d-1</sub>/2+1) | ||
|  | @end html | ||
|  | @code{double} values (exactly enough to hold the complex data).  This | ||
|  | physical array size does not, however, change the @emph{logical} array | ||
|  | size---only | ||
|  | @tex | ||
|  | $n_{d-1}$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | n[d-1] | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | n<sub>d-1</sub> | ||
|  | @end html | ||
|  | values are actually stored in the last dimension, and | ||
|  | @tex | ||
|  | $n_{d-1}$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | n[d-1] | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | n<sub>d-1</sub> | ||
|  | @end html | ||
|  | is the last dimension passed to the planner. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Real-to-Real Transforms, Real-to-Real Transform Kinds, Real-data DFT Array Format, Basic Interface | ||
|  | @subsection Real-to-Real Transforms | ||
|  | @cindex r2r | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_r2r_1d(int n, double *in, double *out, | ||
|  |                            fftw_r2r_kind kind, unsigned flags); | ||
|  | fftw_plan fftw_plan_r2r_2d(int n0, int n1, double *in, double *out, | ||
|  |                            fftw_r2r_kind kind0, fftw_r2r_kind kind1, | ||
|  |                            unsigned flags); | ||
|  | fftw_plan fftw_plan_r2r_3d(int n0, int n1, int n2, | ||
|  |                            double *in, double *out, | ||
|  |                            fftw_r2r_kind kind0, | ||
|  |                            fftw_r2r_kind kind1, | ||
|  |                            fftw_r2r_kind kind2, | ||
|  |                            unsigned flags); | ||
|  | fftw_plan fftw_plan_r2r(int rank, const int *n, double *in, double *out, | ||
|  |                         const fftw_r2r_kind *kind, unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_r2r_1d | ||
|  | @findex fftw_plan_r2r_2d | ||
|  | @findex fftw_plan_r2r_3d | ||
|  | @findex fftw_plan_r2r | ||
|  | 
 | ||
|  | Plan a real input/output (r2r) transform of various kinds in zero or | ||
|  | more dimensions, returning an @code{fftw_plan} (@pxref{Using Plans}). | ||
|  | 
 | ||
|  | Once you have created a plan for a certain transform type and | ||
|  | parameters, then creating another plan of the same type and parameters, | ||
|  | but for different arrays, is fast and shares constant data with the | ||
|  | first plan (if it still exists). | ||
|  | 
 | ||
|  | The planner returns @code{NULL} if the plan cannot be created.  A | ||
|  | non-@code{NULL} plan is always returned by the basic interface unless | ||
|  | you are using a customized FFTW configuration supporting a restricted | ||
|  | set of transforms, or for size-1 @code{FFTW_REDFT00} kinds (which are | ||
|  | not defined). | ||
|  | @ctindex FFTW_REDFT00 | ||
|  | 
 | ||
|  | @subsubheading Arguments | ||
|  | @itemize @bullet | ||
|  | 
 | ||
|  | @item | ||
|  | @code{rank} is the dimensionality of the transform (it should be the | ||
|  | size of the arrays @code{*n} and @code{*kind}), and can be any | ||
|  | non-negative integer.  The @samp{_1d}, @samp{_2d}, and @samp{_3d} | ||
|  | planners correspond to a @code{rank} of @code{1}, @code{2}, and | ||
|  | @code{3}, respectively.  A @code{rank} of zero is equivalent to a copy | ||
|  | of one number from input to output. | ||
|  | 
 | ||
|  | @item | ||
|  | @code{n}, or @code{n0}/@code{n1}/@code{n2}, or @code{n[rank]}, | ||
|  | respectively, gives the (physical) size of the transform dimensions. | ||
|  | They can be any positive integer. | ||
|  |   | ||
|  | @itemize @minus | ||
|  | @item | ||
|  | @cindex row-major | ||
|  | Multi-dimensional arrays are stored in row-major order with dimensions: | ||
|  | @code{n0} x @code{n1}; or @code{n0} x @code{n1} x @code{n2}; or | ||
|  | @code{n[0]} x @code{n[1]} x ... x @code{n[rank-1]}. | ||
|  | @xref{Multi-dimensional Array Format}. | ||
|  | @item | ||
|  | FFTW is generally best at handling sizes of the form | ||
|  | @ifinfo | ||
|  | @math{2^a 3^b 5^c 7^d 11^e 13^f}, | ||
|  | @end ifinfo | ||
|  | @tex | ||
|  | $2^a 3^b 5^c 7^d 11^e 13^f$, | ||
|  | @end tex | ||
|  | @html | ||
|  | 2<sup>a</sup> 3<sup>b</sup> 5<sup>c</sup> 7<sup>d</sup> | ||
|  |         11<sup>e</sup> 13<sup>f</sup>, | ||
|  | @end html | ||
|  | where @math{e+f} is either @math{0} or @math{1}, and the other exponents | ||
|  | are arbitrary.  Other sizes are computed by means of a slow, | ||
|  | general-purpose algorithm (which nevertheless retains @Onlogn{} performance even for prime sizes).  (It is possible to customize FFTW | ||
|  | for different array sizes; see @ref{Installation and Customization}.) | ||
|  | Transforms whose sizes are powers of @math{2} are especially fast. | ||
|  | @item | ||
|  | For a @code{REDFT00} or @code{RODFT00} transform kind in a dimension of | ||
|  | size @math{n}, it is @math{n-1} or @math{n+1}, respectively, that | ||
|  | should be factorizable in the above form. | ||
|  | @end itemize | ||
|  | 
 | ||
|  | @item | ||
|  | @code{in} and @code{out} point to the input and output arrays of the | ||
|  | transform, which may be the same (yielding an in-place transform). | ||
|  | @cindex in-place | ||
|  | These arrays are overwritten during planning, unless | ||
|  | @code{FFTW_ESTIMATE} is used in the flags.  (The arrays need not be | ||
|  | initialized, but they must be allocated.) | ||
|  | 
 | ||
|  | @item | ||
|  | @code{kind}, or @code{kind0}/@code{kind1}/@code{kind2}, or | ||
|  | @code{kind[rank]}, is the kind of r2r transform used for the | ||
|  | corresponding dimension.  The valid kind constants are described in | ||
|  | @ref{Real-to-Real Transform Kinds}.  In a multi-dimensional transform, | ||
|  | what is computed is the separable product formed by taking each | ||
|  | transform kind along the corresponding dimension, one dimension after | ||
|  | another. | ||
|  | 
 | ||
|  | @item | ||
|  | @cindex flags | ||
|  | @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | ||
|  | as defined in @ref{Planner Flags}. | ||
|  | 
 | ||
|  | @end itemize | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Real-to-Real Transform Kinds,  , Real-to-Real Transforms, Basic Interface | ||
|  | @subsection Real-to-Real Transform Kinds | ||
|  | @cindex kind (r2r) | ||
|  | 
 | ||
|  | FFTW currently supports 11 different r2r transform kinds, specified by | ||
|  | one of the constants below.  For the precise definitions of these | ||
|  | transforms, see @ref{What FFTW Really Computes}.  For a more colloquial | ||
|  | introduction to these transform kinds, see @ref{More DFTs of Real Data}. | ||
|  | 
 | ||
|  | For dimension of size @code{n}, there is a corresponding ``logical'' | ||
|  | dimension @code{N} that determines the normalization (and the optimal | ||
|  | factorization); the formula for @code{N} is given for each kind below. | ||
|  | Also, with each transform kind is listed its corrsponding inverse | ||
|  | transform.  FFTW computes unnormalized transforms: a transform followed | ||
|  | by its inverse will result in the original data multiplied by @code{N} | ||
|  | (or the product of the @code{N}'s for each dimension, in | ||
|  | multi-dimensions). | ||
|  | @cindex normalization | ||
|  | 
 | ||
|  | @itemize @bullet | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_R2HC | ||
|  | @code{FFTW_R2HC} computes a real-input DFT with output in | ||
|  | ``halfcomplex'' format, i.e. real and imaginary parts for a transform of | ||
|  | size @code{n} stored as: | ||
|  | @tex | ||
|  | $$ | ||
|  | r_0, r_1, r_2, \ldots, r_{n/2}, i_{(n+1)/2-1}, \ldots, i_2, i_1 | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | r0, r1, r2, r(n/2), i((n+1)/2-1), ..., i2, i1 | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <p align=center> | ||
|  | r<sub>0</sub>, r<sub>1</sub>, r<sub>2</sub>, ..., r<sub>n/2</sub>, i<sub>(n+1)/2-1</sub>, ..., i<sub>2</sub>, i<sub>1</sub> | ||
|  | </p> | ||
|  | @end html | ||
|  | (Logical @code{N=n}, inverse is @code{FFTW_HC2R}.) | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_HC2R | ||
|  | @code{FFTW_HC2R} computes the reverse of @code{FFTW_R2HC}, above. | ||
|  | (Logical @code{N=n}, inverse is @code{FFTW_R2HC}.) | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_DHT | ||
|  | @code{FFTW_DHT} computes a discrete Hartley transform. | ||
|  | (Logical @code{N=n}, inverse is @code{FFTW_DHT}.) | ||
|  | @cindex discrete Hartley transform | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_REDFT00 | ||
|  | @code{FFTW_REDFT00} computes an REDFT00 transform, i.e. a DCT-I. | ||
|  | (Logical @code{N=2*(n-1)}, inverse is @code{FFTW_REDFT00}.) | ||
|  | @cindex discrete cosine transform | ||
|  | @cindex DCT | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_REDFT10 | ||
|  | @code{FFTW_REDFT10} computes an REDFT10 transform, i.e. a DCT-II (sometimes called ``the'' DCT). | ||
|  | (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT01}.) | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_REDFT01 | ||
|  | @code{FFTW_REDFT01} computes an REDFT01 transform, i.e. a DCT-III (sometimes called ``the'' IDCT, being the inverse of DCT-II). | ||
|  | (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT=10}.) | ||
|  | @cindex IDCT | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_REDFT11 | ||
|  | @code{FFTW_REDFT11} computes an REDFT11 transform, i.e. a DCT-IV. | ||
|  | (Logical @code{N=2*n}, inverse is @code{FFTW_REDFT11}.) | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_RODFT00 | ||
|  | @code{FFTW_RODFT00} computes an RODFT00 transform, i.e. a DST-I. | ||
|  | (Logical @code{N=2*(n+1)}, inverse is @code{FFTW_RODFT00}.) | ||
|  | @cindex discrete sine transform | ||
|  | @cindex DST | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_RODFT10 | ||
|  | @code{FFTW_RODFT10} computes an RODFT10 transform, i.e. a DST-II. | ||
|  | (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT01}.) | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_RODFT01 | ||
|  | @code{FFTW_RODFT01} computes an RODFT01 transform, i.e. a DST-III. | ||
|  | (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT=10}.) | ||
|  | 
 | ||
|  | @item | ||
|  | @ctindex FFTW_RODFT11 | ||
|  | @code{FFTW_RODFT11} computes an RODFT11 transform, i.e. a DST-IV. | ||
|  | (Logical @code{N=2*n}, inverse is @code{FFTW_RODFT11}.) | ||
|  | 
 | ||
|  | @end itemize | ||
|  | 
 | ||
|  | @c ------------------------------------------------------------ | ||
|  | @node Advanced Interface, Guru Interface, Basic Interface, FFTW Reference | ||
|  | @section Advanced Interface | ||
|  | @cindex advanced interface | ||
|  | 
 | ||
|  | FFTW's ``advanced'' interface supplements the basic interface with four | ||
|  | new planner routines, providing a new level of flexibility: you can plan | ||
|  | a transform of multiple arrays simultaneously, operate on non-contiguous | ||
|  | (strided) data, and transform a subset of a larger multi-dimensional | ||
|  | array.  Other than these additional features, the planner operates in | ||
|  | the same fashion as in the basic interface, and the resulting | ||
|  | @code{fftw_plan} is used in the same way (@pxref{Using Plans}). | ||
|  | 
 | ||
|  | @menu | ||
|  | * Advanced Complex DFTs:: | ||
|  | * Advanced Real-data DFTs:: | ||
|  | * Advanced Real-to-real Transforms:: | ||
|  | @end menu | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Advanced Complex DFTs, Advanced Real-data DFTs, Advanced Interface, Advanced Interface | ||
|  | @subsection Advanced Complex DFTs | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_many_dft(int rank, const int *n, int howmany, | ||
|  |                              fftw_complex *in, const int *inembed, | ||
|  |                              int istride, int idist, | ||
|  |                              fftw_complex *out, const int *onembed, | ||
|  |                              int ostride, int odist, | ||
|  |                              int sign, unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_many_dft | ||
|  | 
 | ||
|  | This routine plans multiple multidimensional complex DFTs, and it | ||
|  | extends the @code{fftw_plan_dft} routine (@pxref{Complex DFTs}) to | ||
|  | compute @code{howmany} transforms, each having rank @code{rank} and size | ||
|  | @code{n}.  In addition, the transform data need not be contiguous, but | ||
|  | it may be laid out in memory with an arbitrary stride.  To account for | ||
|  | these possibilities, @code{fftw_plan_many_dft} adds the new parameters | ||
|  | @code{howmany}, @{@code{i},@code{o}@}@code{nembed}, | ||
|  | @{@code{i},@code{o}@}@code{stride}, and | ||
|  | @{@code{i},@code{o}@}@code{dist}.  The FFTW basic interface | ||
|  | (@pxref{Complex DFTs}) provides routines specialized for ranks 1, 2, | ||
|  | and@tie{}3, but the advanced interface handles only the general-rank | ||
|  | case. | ||
|  | 
 | ||
|  | @code{howmany} is the (nonnegative) number of transforms to compute.  The resulting | ||
|  | plan computes @code{howmany} transforms, where the input of the | ||
|  | @code{k}-th transform is at location @code{in+k*idist} (in C pointer | ||
|  | arithmetic), and its output is at location @code{out+k*odist}.  Plans | ||
|  | obtained in this way can often be faster than calling FFTW multiple | ||
|  | times for the individual transforms.  The basic @code{fftw_plan_dft} | ||
|  | interface corresponds to @code{howmany=1} (in which case the @code{dist} | ||
|  | parameters are ignored). | ||
|  | @cindex howmany parameter | ||
|  | @cindex dist | ||
|  | 
 | ||
|  | 
 | ||
|  | Each of the @code{howmany} transforms has rank @code{rank} and size | ||
|  | @code{n}, as in the basic interface.  In addition, the advanced | ||
|  | interface allows the input and output arrays of each transform to be | ||
|  | row-major subarrays of larger rank-@code{rank} arrays, described by | ||
|  | @code{inembed} and @code{onembed} parameters, respectively. | ||
|  | @{@code{i},@code{o}@}@code{nembed} must be arrays of length @code{rank}, | ||
|  | and @code{n} should be elementwise less than or equal to | ||
|  | @{@code{i},@code{o}@}@code{nembed}.  Passing @code{NULL} for an | ||
|  | @code{nembed} parameter is equivalent to passing @code{n} (i.e. same | ||
|  | physical and logical dimensions, as in the basic interface.) | ||
|  | 
 | ||
|  | The @code{stride} parameters indicate that the @code{j}-th element of | ||
|  | the input or output arrays is located at @code{j*istride} or | ||
|  | @code{j*ostride}, respectively.  (For a multi-dimensional array, | ||
|  | @code{j} is the ordinary row-major index.)  When combined with the | ||
|  | @code{k}-th transform in a @code{howmany} loop, from above, this means | ||
|  | that the (@code{j},@code{k})-th element is at @code{j*stride+k*dist}. | ||
|  | (The basic @code{fftw_plan_dft} interface corresponds to a stride of 1.) | ||
|  | @cindex stride | ||
|  | 
 | ||
|  | 
 | ||
|  | For in-place transforms, the input and output @code{stride} and | ||
|  | @code{dist} parameters should be the same; otherwise, the planner may | ||
|  | return @code{NULL}. | ||
|  | 
 | ||
|  | Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after | ||
|  | this function returns.  You can safely free or reuse them. | ||
|  | 
 | ||
|  | @strong{Examples}: | ||
|  | One transform of one 5 by 6 array contiguous in memory: | ||
|  | @example | ||
|  |    int rank = 2; | ||
|  |    int n[] = @{5, 6@}; | ||
|  |    int howmany = 1; | ||
|  |    int idist = odist = 0; /* unused because howmany = 1 */ | ||
|  |    int istride = ostride = 1; /* array is contiguous in memory */ | ||
|  |    int *inembed = n, *onembed = n; | ||
|  | @end example | ||
|  | 
 | ||
|  | Transform of three 5 by 6 arrays, each contiguous in memory, | ||
|  | stored in memory one after another: | ||
|  | @example | ||
|  |    int rank = 2; | ||
|  |    int n[] = @{5, 6@}; | ||
|  |    int howmany = 3; | ||
|  |    int idist = odist = n[0]*n[1]; /* = 30, the distance in memory | ||
|  |                                      between the first element | ||
|  |                                      of the first array and the | ||
|  |                                      first element of the second array */ | ||
|  |    int istride = ostride = 1; /* array is contiguous in memory */ | ||
|  |    int *inembed = n, *onembed = n; | ||
|  | @end example | ||
|  | 
 | ||
|  | Transform each column of a 2d array with 10 rows and 3 columns: | ||
|  | @example | ||
|  |    int rank = 1; /* not 2: we are computing 1d transforms */ | ||
|  |    int n[] = @{10@}; /* 1d transforms of length 10 */ | ||
|  |    int howmany = 3; | ||
|  |    int idist = odist = 1; | ||
|  |    int istride = ostride = 3; /* distance between two elements in  | ||
|  |                                  the same column */ | ||
|  |    int *inembed = n, *onembed = n; | ||
|  | @end example | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Advanced Real-data DFTs, Advanced Real-to-real Transforms, Advanced Complex DFTs, Advanced Interface | ||
|  | @subsection Advanced Real-data DFTs | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_many_dft_r2c(int rank, const int *n, int howmany, | ||
|  |                                  double *in, const int *inembed, | ||
|  |                                  int istride, int idist, | ||
|  |                                  fftw_complex *out, const int *onembed, | ||
|  |                                  int ostride, int odist, | ||
|  |                                  unsigned flags); | ||
|  | fftw_plan fftw_plan_many_dft_c2r(int rank, const int *n, int howmany, | ||
|  |                                  fftw_complex *in, const int *inembed, | ||
|  |                                  int istride, int idist, | ||
|  |                                  double *out, const int *onembed, | ||
|  |                                  int ostride, int odist, | ||
|  |                                  unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_many_dft_r2c | ||
|  | @findex fftw_plan_many_dft_c2r | ||
|  | 
 | ||
|  | Like @code{fftw_plan_many_dft}, these two functions add @code{howmany}, | ||
|  | @code{nembed}, @code{stride}, and @code{dist} parameters to the | ||
|  | @code{fftw_plan_dft_r2c} and @code{fftw_plan_dft_c2r} functions, but | ||
|  | otherwise behave the same as the basic interface. | ||
|  | 
 | ||
|  | The interpretation of @code{howmany}, @code{stride}, and @code{dist} are | ||
|  | the same as for @code{fftw_plan_many_dft}, above.  Note that the | ||
|  | @code{stride} and @code{dist} for the real array are in units of | ||
|  | @code{double}, and for the complex array are in units of | ||
|  | @code{fftw_complex}. | ||
|  | 
 | ||
|  | If an @code{nembed} parameter is @code{NULL}, it is interpreted as what | ||
|  | it would be in the basic interface, as described in @ref{Real-data DFT | ||
|  | Array Format}.  That is, for the complex array the size is assumed to be | ||
|  | the same as @code{n}, but with the last dimension cut roughly in half. | ||
|  | For the real array, the size is assumed to be @code{n} if the transform | ||
|  | is out-of-place, or @code{n} with the last dimension ``padded'' if the | ||
|  | transform is in-place. | ||
|  | 
 | ||
|  | If an @code{nembed} parameter is non-@code{NULL}, it is interpreted as | ||
|  | the physical size of the corresponding array, in row-major order, just | ||
|  | as for @code{fftw_plan_many_dft}.  In this case, each dimension of | ||
|  | @code{nembed} should be @code{>=} what it would be in the basic | ||
|  | interface (e.g. the halved or padded @code{n}). | ||
|  | 
 | ||
|  | Arrays @code{n}, @code{inembed}, and @code{onembed} are not used after | ||
|  | this function returns.  You can safely free or reuse them. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Advanced Real-to-real Transforms,  , Advanced Real-data DFTs, Advanced Interface | ||
|  | @subsection Advanced Real-to-real Transforms | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_many_r2r(int rank, const int *n, int howmany, | ||
|  |                              double *in, const int *inembed, | ||
|  |                              int istride, int idist, | ||
|  |                              double *out, const int *onembed, | ||
|  |                              int ostride, int odist, | ||
|  |                              const fftw_r2r_kind *kind, unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_many_r2r | ||
|  | 
 | ||
|  | Like @code{fftw_plan_many_dft}, this functions adds @code{howmany}, | ||
|  | @code{nembed}, @code{stride}, and @code{dist} parameters to the | ||
|  | @code{fftw_plan_r2r} function, but otherwise behave the same as the | ||
|  | basic interface.  The interpretation of those additional parameters are | ||
|  | the same as for @code{fftw_plan_many_dft}.  (Of course, the | ||
|  | @code{stride} and @code{dist} parameters are now in units of | ||
|  | @code{double}, not @code{fftw_complex}.) | ||
|  | 
 | ||
|  | Arrays @code{n}, @code{inembed}, @code{onembed}, and @code{kind} are not | ||
|  | used after this function returns.  You can safely free or reuse them. | ||
|  | 
 | ||
|  | @c ------------------------------------------------------------ | ||
|  | @node Guru Interface, New-array Execute Functions, Advanced Interface, FFTW Reference | ||
|  | @section Guru Interface | ||
|  | @cindex guru interface | ||
|  | 
 | ||
|  | The ``guru'' interface to FFTW is intended to expose as much as possible | ||
|  | of the flexibility in the underlying FFTW architecture.  It allows one | ||
|  | to compute multi-dimensional ``vectors'' (loops) of multi-dimensional | ||
|  | transforms, where each vector/transform dimension has an independent | ||
|  | size and stride. | ||
|  | @cindex vector | ||
|  | One can also use more general complex-number formats, e.g. separate real | ||
|  | and imaginary arrays. | ||
|  | 
 | ||
|  | For those users who require the flexibility of the guru interface, it is | ||
|  | important that they pay special attention to the documentation lest they | ||
|  | shoot themselves in the foot. | ||
|  | 
 | ||
|  | @menu | ||
|  | * Interleaved and split arrays:: | ||
|  | * Guru vector and transform sizes:: | ||
|  | * Guru Complex DFTs:: | ||
|  | * Guru Real-data DFTs:: | ||
|  | * Guru Real-to-real Transforms:: | ||
|  | * 64-bit Guru Interface:: | ||
|  | @end menu | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node  Interleaved and split arrays, Guru vector and transform sizes, Guru Interface, Guru Interface | ||
|  | @subsection Interleaved and split arrays | ||
|  | 
 | ||
|  | The guru interface supports two representations of complex numbers, | ||
|  | which we call the interleaved and the split format. | ||
|  | 
 | ||
|  | The @dfn{interleaved} format is the same one used by the basic and | ||
|  | advanced interfaces, and it is documented in @ref{Complex numbers}. | ||
|  | In the interleaved format, you provide pointers to the real part of a | ||
|  | complex number, and the imaginary part understood to be stored in the | ||
|  | next memory location. | ||
|  | @cindex interleaved format | ||
|  | 
 | ||
|  | 
 | ||
|  | The @dfn{split} format allows separate pointers to the real and | ||
|  | imaginary parts of a complex array. | ||
|  | @cindex split format | ||
|  | 
 | ||
|  | 
 | ||
|  | Technically, the interleaved format is redundant, because you can | ||
|  | always express an interleaved array in terms of a split array with | ||
|  | appropriate pointers and strides.  On the other hand, the interleaved | ||
|  | format is simpler to use, and it is common in practice.  Hence, FFTW | ||
|  | supports it as a special case. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Guru vector and transform sizes, Guru Complex DFTs, Interleaved and split arrays, Guru Interface | ||
|  | @subsection Guru vector and transform sizes | ||
|  | 
 | ||
|  | The guru interface introduces one basic new data structure, | ||
|  | @code{fftw_iodim}, that is used to specify sizes and strides for | ||
|  | multi-dimensional transforms and vectors: | ||
|  | 
 | ||
|  | @example | ||
|  | typedef struct @{ | ||
|  |      int n; | ||
|  |      int is; | ||
|  |      int os; | ||
|  | @} fftw_iodim; | ||
|  | @end example | ||
|  | @tindex fftw_iodim | ||
|  | 
 | ||
|  | Here, @code{n} is the size of the dimension, and @code{is} and @code{os} | ||
|  | are the strides of that dimension for the input and output arrays.  (The | ||
|  | stride is the separation of consecutive elements along this dimension.) | ||
|  | 
 | ||
|  | The meaning of the stride parameter depends on the type of the array | ||
|  | that the stride refers to.  @emph{If the array is interleaved complex, | ||
|  | strides are expressed in units of complex numbers | ||
|  | (@code{fftw_complex}).  If the array is split complex or real, strides | ||
|  | are expressed in units of real numbers (@code{double}).}  This | ||
|  | convention is consistent with the usual pointer arithmetic in the C | ||
|  | language.  An interleaved array is denoted by a pointer @code{p} to | ||
|  | @code{fftw_complex}, so that @code{p+1} points to the next complex | ||
|  | number.  Split arrays are denoted by pointers to @code{double}, in | ||
|  | which case pointer arithmetic operates in units of | ||
|  | @code{sizeof(double)}. | ||
|  | @cindex stride | ||
|  | 
 | ||
|  | 
 | ||
|  | The guru planner interfaces all take a (@code{rank}, @code{dims[rank]}) | ||
|  | pair describing the transform size, and a (@code{howmany_rank}, | ||
|  | @code{howmany_dims[howmany_rank]}) pair describing the ``vector'' size (a | ||
|  | multi-dimensional loop of transforms to perform), where @code{dims} and | ||
|  | @code{howmany_dims} are arrays of @code{fftw_iodim}.  Each @code{n} field must | ||
|  | be positive for @code{dims} and nonnegative for @code{howmany_dims}, while both | ||
|  | @code{rank} and @code{howmany_rank} must be nonnegative. | ||
|  | 
 | ||
|  | For example, the @code{howmany} parameter in the advanced complex-DFT | ||
|  | interface corresponds to @code{howmany_rank} = 1, | ||
|  | @code{howmany_dims[0].n} = @code{howmany}, @code{howmany_dims[0].is} = | ||
|  | @code{idist}, and @code{howmany_dims[0].os} = @code{odist}. | ||
|  | @cindex howmany loop | ||
|  | @cindex dist | ||
|  | (To compute a single transform, you can just use @code{howmany_rank} = 0.) | ||
|  | 
 | ||
|  | 
 | ||
|  | A row-major multidimensional array with dimensions @code{n[rank]} | ||
|  | (@pxref{Row-major Format}) corresponds to @code{dims[i].n} = | ||
|  | @code{n[i]} and the recurrence @code{dims[i].is} = @code{n[i+1] * | ||
|  | dims[i+1].is} (similarly for @code{os}).  The stride of the last | ||
|  | (@code{i=rank-1}) dimension is the overall stride of the array. | ||
|  | e.g. to be equivalent to the advanced complex-DFT interface, you would | ||
|  | have @code{dims[rank-1].is} = @code{istride} and | ||
|  | @code{dims[rank-1].os} = @code{ostride}. | ||
|  | @cindex row-major | ||
|  | 
 | ||
|  | 
 | ||
|  | In general, we only guarantee FFTW to return a non-@code{NULL} plan if | ||
|  | the vector and transform dimensions correspond to a set of distinct | ||
|  | indices, and for in-place transforms the input/output strides should | ||
|  | be the same. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Guru Complex DFTs, Guru Real-data DFTs, Guru vector and transform sizes, Guru Interface | ||
|  | @subsection Guru Complex DFTs | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_guru_dft( | ||
|  |      int rank, const fftw_iodim *dims, | ||
|  |      int howmany_rank, const fftw_iodim *howmany_dims, | ||
|  |      fftw_complex *in, fftw_complex *out, | ||
|  |      int sign, unsigned flags); | ||
|  | 
 | ||
|  | fftw_plan fftw_plan_guru_split_dft( | ||
|  |      int rank, const fftw_iodim *dims, | ||
|  |      int howmany_rank, const fftw_iodim *howmany_dims, | ||
|  |      double *ri, double *ii, double *ro, double *io, | ||
|  |      unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_guru_dft | ||
|  | @findex fftw_plan_guru_split_dft | ||
|  | 
 | ||
|  | These two functions plan a complex-data, multi-dimensional DFT | ||
|  | for the interleaved and split format, respectively. | ||
|  | Transform dimensions are given by (@code{rank}, @code{dims}) over a | ||
|  | multi-dimensional vector (loop) of dimensions (@code{howmany_rank}, | ||
|  | @code{howmany_dims}).  @code{dims} and @code{howmany_dims} should point | ||
|  | to @code{fftw_iodim} arrays of length @code{rank} and | ||
|  | @code{howmany_rank}, respectively. | ||
|  | 
 | ||
|  | @cindex flags | ||
|  | @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | ||
|  | as defined in @ref{Planner Flags}. | ||
|  | 
 | ||
|  | In the @code{fftw_plan_guru_dft} function, the pointers @code{in} and | ||
|  | @code{out} point to the interleaved input and output arrays, | ||
|  | respectively.  The sign can be either @math{-1} (= | ||
|  | @code{FFTW_FORWARD}) or @math{+1} (= @code{FFTW_BACKWARD}).  If the | ||
|  | pointers are equal, the transform is in-place. | ||
|  | 
 | ||
|  | In the @code{fftw_plan_guru_split_dft} function, | ||
|  | @code{ri} and @code{ii} point to the real and imaginary input arrays, | ||
|  | and @code{ro} and @code{io} point to the real and imaginary output | ||
|  | arrays.  The input and output pointers may be the same, indicating an | ||
|  | in-place transform.  For example, for @code{fftw_complex} pointers | ||
|  | @code{in} and @code{out}, the corresponding parameters are: | ||
|  | 
 | ||
|  | @example | ||
|  | ri = (double *) in; | ||
|  | ii = (double *) in + 1; | ||
|  | ro = (double *) out; | ||
|  | io = (double *) out + 1; | ||
|  | @end example | ||
|  | 
 | ||
|  | Because @code{fftw_plan_guru_split_dft} accepts split arrays, strides | ||
|  | are expressed in units of @code{double}.  For a contiguous | ||
|  | @code{fftw_complex} array, the overall stride of the transform should | ||
|  | be 2, the distance between consecutive real parts or between | ||
|  | consecutive imaginary parts; see @ref{Guru vector and transform | ||
|  | sizes}.  Note that the dimension strides are applied equally to the | ||
|  | real and imaginary parts; real and imaginary arrays with different | ||
|  | strides are not supported. | ||
|  | 
 | ||
|  | There is no @code{sign} parameter in @code{fftw_plan_guru_split_dft}. | ||
|  | This function always plans for an @code{FFTW_FORWARD} transform.  To | ||
|  | plan for an @code{FFTW_BACKWARD} transform, you can exploit the | ||
|  | identity that the backwards DFT is equal to the forwards DFT with the | ||
|  | real and imaginary parts swapped.  For example, in the case of the | ||
|  | @code{fftw_complex} arrays above, the @code{FFTW_BACKWARD} transform | ||
|  | is computed by the parameters: | ||
|  | 
 | ||
|  | @example | ||
|  | ri = (double *) in + 1; | ||
|  | ii = (double *) in; | ||
|  | ro = (double *) out + 1; | ||
|  | io = (double *) out; | ||
|  | @end example | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Guru Real-data DFTs, Guru Real-to-real Transforms, Guru Complex DFTs, Guru Interface | ||
|  | @subsection Guru Real-data DFTs | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_guru_dft_r2c( | ||
|  |      int rank, const fftw_iodim *dims, | ||
|  |      int howmany_rank, const fftw_iodim *howmany_dims, | ||
|  |      double *in, fftw_complex *out, | ||
|  |      unsigned flags); | ||
|  | 
 | ||
|  | fftw_plan fftw_plan_guru_split_dft_r2c( | ||
|  |      int rank, const fftw_iodim *dims, | ||
|  |      int howmany_rank, const fftw_iodim *howmany_dims, | ||
|  |      double *in, double *ro, double *io, | ||
|  |      unsigned flags); | ||
|  | 
 | ||
|  | fftw_plan fftw_plan_guru_dft_c2r( | ||
|  |      int rank, const fftw_iodim *dims, | ||
|  |      int howmany_rank, const fftw_iodim *howmany_dims, | ||
|  |      fftw_complex *in, double *out, | ||
|  |      unsigned flags); | ||
|  | 
 | ||
|  | fftw_plan fftw_plan_guru_split_dft_c2r( | ||
|  |      int rank, const fftw_iodim *dims, | ||
|  |      int howmany_rank, const fftw_iodim *howmany_dims, | ||
|  |      double *ri, double *ii, double *out, | ||
|  |      unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_guru_dft_r2c | ||
|  | @findex fftw_plan_guru_split_dft_r2c | ||
|  | @findex fftw_plan_guru_dft_c2r | ||
|  | @findex fftw_plan_guru_split_dft_c2r | ||
|  | 
 | ||
|  | Plan a real-input (r2c) or real-output (c2r), multi-dimensional DFT with | ||
|  | transform dimensions given by (@code{rank}, @code{dims}) over a | ||
|  | multi-dimensional vector (loop) of dimensions (@code{howmany_rank}, | ||
|  | @code{howmany_dims}).  @code{dims} and @code{howmany_dims} should point | ||
|  | to @code{fftw_iodim} arrays of length @code{rank} and | ||
|  | @code{howmany_rank}, respectively.  As for the basic and advanced | ||
|  | interfaces, an r2c transform is @code{FFTW_FORWARD} and a c2r transform | ||
|  | is @code{FFTW_BACKWARD}. | ||
|  | 
 | ||
|  | The @emph{last} dimension of @code{dims} is interpreted specially: | ||
|  | that dimension of the real array has size @code{dims[rank-1].n}, but | ||
|  | that dimension of the complex array has size @code{dims[rank-1].n/2+1} | ||
|  | (division rounded down).  The strides, on the other hand, are taken to | ||
|  | be exactly as specified.  It is up to the user to specify the strides | ||
|  | appropriately for the peculiar dimensions of the data, and we do not | ||
|  | guarantee that the planner will succeed (return non-@code{NULL}) for | ||
|  | any dimensions other than those described in @ref{Real-data DFT Array | ||
|  | Format} and generalized in @ref{Advanced Real-data DFTs}.  (That is, | ||
|  | for an in-place transform, each individual dimension should be able to | ||
|  | operate in place.) | ||
|  | @cindex in-place | ||
|  | 
 | ||
|  | 
 | ||
|  | @code{in} and @code{out} point to the input and output arrays for r2c | ||
|  | and c2r transforms, respectively.  For split arrays, @code{ri} and | ||
|  | @code{ii} point to the real and imaginary input arrays for a c2r | ||
|  | transform, and @code{ro} and @code{io} point to the real and imaginary | ||
|  | output arrays for an r2c transform.  @code{in} and @code{ro} or | ||
|  | @code{ri} and @code{out} may be the same, indicating an in-place | ||
|  | transform.   (In-place transforms where @code{in} and @code{io} or | ||
|  | @code{ii} and @code{out} are the same are not currently supported.) | ||
|  | 
 | ||
|  | @cindex flags | ||
|  | @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | ||
|  | as defined in @ref{Planner Flags}. | ||
|  | 
 | ||
|  | In-place transforms of rank greater than 1 are currently only | ||
|  | supported for interleaved arrays.  For split arrays, the planner will | ||
|  | return @code{NULL}. | ||
|  | @cindex in-place | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Guru Real-to-real Transforms, 64-bit Guru Interface, Guru Real-data DFTs, Guru Interface | ||
|  | @subsection Guru Real-to-real Transforms | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_guru_r2r(int rank, const fftw_iodim *dims, | ||
|  |                              int howmany_rank, | ||
|  |                              const fftw_iodim *howmany_dims, | ||
|  |                              double *in, double *out, | ||
|  |                              const fftw_r2r_kind *kind, | ||
|  |                              unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_guru_r2r | ||
|  | 
 | ||
|  | Plan a real-to-real (r2r) multi-dimensional @code{FFTW_FORWARD} | ||
|  | transform with transform dimensions given by (@code{rank}, @code{dims}) | ||
|  | over a multi-dimensional vector (loop) of dimensions | ||
|  | (@code{howmany_rank}, @code{howmany_dims}).  @code{dims} and | ||
|  | @code{howmany_dims} should point to @code{fftw_iodim} arrays of length | ||
|  | @code{rank} and @code{howmany_rank}, respectively. | ||
|  | 
 | ||
|  | The transform kind of each dimension is given by the @code{kind} | ||
|  | parameter, which should point to an array of length @code{rank}.  Valid | ||
|  | @code{fftw_r2r_kind} constants are given in @ref{Real-to-Real Transform | ||
|  | Kinds}. | ||
|  | 
 | ||
|  | @code{in} and @code{out} point to the real input and output arrays; they | ||
|  | may be the same, indicating an in-place transform. | ||
|  | 
 | ||
|  | @cindex flags | ||
|  | @code{flags} is a bitwise OR (@samp{|}) of zero or more planner flags, | ||
|  | as defined in @ref{Planner Flags}. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node 64-bit Guru Interface,  , Guru Real-to-real Transforms, Guru Interface | ||
|  | @subsection 64-bit Guru Interface | ||
|  | @cindex 64-bit architecture | ||
|  | 
 | ||
|  | When compiled in 64-bit mode on a 64-bit architecture (where addresses | ||
|  | are 64 bits wide), FFTW uses 64-bit quantities internally for all | ||
|  | transform sizes, strides, and so on---you don't have to do anything | ||
|  | special to exploit this.  However, in the ordinary FFTW interfaces, | ||
|  | you specify the transform size by an @code{int} quantity, which is | ||
|  | normally only 32 bits wide.  This means that, even though FFTW is | ||
|  | using 64-bit sizes internally, you cannot specify a single transform | ||
|  | dimension larger than | ||
|  | @ifinfo | ||
|  | 2^31-1 | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | 2<sup><small>31</small></sup>−1 | ||
|  | @end html | ||
|  | @tex | ||
|  | $2^{31}-1$ | ||
|  | @end tex | ||
|  | numbers. | ||
|  | 
 | ||
|  | We expect that few users will require transforms larger than this, but, | ||
|  | for those who do, we provide a 64-bit version of the guru interface in | ||
|  | which all sizes are specified as integers of type @code{ptrdiff_t} | ||
|  | instead of @code{int}.  (@code{ptrdiff_t} is a signed integer type | ||
|  | defined by the C standard to be wide enough to represent address | ||
|  | differences, and thus must be at least 64 bits wide on a 64-bit | ||
|  | machine.)  We stress that there is @emph{no performance advantage} to | ||
|  | using this interface---the same internal FFTW code is employed | ||
|  | regardless---and it is only necessary if you want to specify very | ||
|  | large transform sizes. | ||
|  | @tindex ptrdiff_t | ||
|  | 
 | ||
|  | 
 | ||
|  | In particular, the 64-bit guru interface is a set of planner routines | ||
|  | that are exactly the same as the guru planner routines, except that | ||
|  | they are named with @samp{guru64} instead of @samp{guru} and they take | ||
|  | arguments of type @code{fftw_iodim64} instead of @code{fftw_iodim}. | ||
|  | For example, instead of @code{fftw_plan_guru_dft}, we have | ||
|  | @code{fftw_plan_guru64_dft}. | ||
|  | 
 | ||
|  | @example | ||
|  | fftw_plan fftw_plan_guru64_dft( | ||
|  |      int rank, const fftw_iodim64 *dims, | ||
|  |      int howmany_rank, const fftw_iodim64 *howmany_dims, | ||
|  |      fftw_complex *in, fftw_complex *out, | ||
|  |      int sign, unsigned flags); | ||
|  | @end example | ||
|  | @findex fftw_plan_guru64_dft | ||
|  | 
 | ||
|  | The @code{fftw_iodim64} type is similar to @code{fftw_iodim}, with the | ||
|  | same interpretation, except that it uses type @code{ptrdiff_t} instead | ||
|  | of type @code{int}. | ||
|  | 
 | ||
|  | @example | ||
|  | typedef struct @{ | ||
|  |      ptrdiff_t n; | ||
|  |      ptrdiff_t is; | ||
|  |      ptrdiff_t os; | ||
|  | @} fftw_iodim64; | ||
|  | @end example | ||
|  | @tindex fftw_iodim64 | ||
|  | 
 | ||
|  | Every other @samp{fftw_plan_guru} function also has a | ||
|  | @samp{fftw_plan_guru64} equivalent, but we do not repeat their | ||
|  | documentation here since they are identical to the 32-bit versions | ||
|  | except as noted above. | ||
|  | 
 | ||
|  | @c ----------------------------------------------------------- | ||
|  | @node New-array Execute Functions, Wisdom, Guru Interface, FFTW Reference | ||
|  | @section New-array Execute Functions | ||
|  | @cindex execute | ||
|  | @cindex new-array execution | ||
|  | 
 | ||
|  | Normally, one executes a plan for the arrays with which the plan was | ||
|  | created, by calling @code{fftw_execute(plan)} as described in @ref{Using | ||
|  | Plans}. | ||
|  | @findex fftw_execute | ||
|  | However, it is possible for sophisticated users to apply a given plan | ||
|  | to a @emph{different} array using the ``new-array execute'' functions | ||
|  | detailed below, provided that the following conditions are met: | ||
|  | 
 | ||
|  | @itemize @bullet | ||
|  | 
 | ||
|  | @item | ||
|  | The array size, strides, etcetera are the same (since those are set by | ||
|  | the plan). | ||
|  | 
 | ||
|  | @item | ||
|  | The input and output arrays are the same (in-place) or different | ||
|  | (out-of-place) if the plan was originally created to be in-place or | ||
|  | out-of-place, respectively. | ||
|  | 
 | ||
|  | @item | ||
|  | For split arrays, the separations between the real and imaginary | ||
|  | parts, @code{ii-ri} and @code{io-ro}, are the same as they were for | ||
|  | the input and output arrays when the plan was created.  (This | ||
|  | condition is automatically satisfied for interleaved arrays.) | ||
|  | 
 | ||
|  | @item | ||
|  | The @dfn{alignment} of the new input/output arrays is the same as that | ||
|  | of the input/output arrays when the plan was created, unless the plan | ||
|  | was created with the @code{FFTW_UNALIGNED} flag. | ||
|  | @ctindex FFTW_UNALIGNED | ||
|  | Here, the alignment is a platform-dependent quantity (for example, it is | ||
|  | the address modulo 16 if SSE SIMD instructions are used, but the address | ||
|  | modulo 4 for non-SIMD single-precision FFTW on the same machine).  In | ||
|  | general, only arrays allocated with @code{fftw_malloc} are guaranteed to | ||
|  | be equally aligned (@pxref{SIMD alignment and fftw_malloc}). | ||
|  | 
 | ||
|  | @end itemize | ||
|  | 
 | ||
|  | @cindex alignment | ||
|  | The alignment issue is especially critical, because if you don't use | ||
|  | @code{fftw_malloc} then you may have little control over the alignment | ||
|  | of arrays in memory.  For example, neither the C++ @code{new} function | ||
|  | nor the Fortran @code{allocate} statement provide strong enough | ||
|  | guarantees about data alignment.  If you don't use @code{fftw_malloc}, | ||
|  | therefore, you probably have to use @code{FFTW_UNALIGNED} (which | ||
|  | disables most SIMD support).  If possible, it is probably better for | ||
|  | you to simply create multiple plans (creating a new plan is quick once | ||
|  | one exists for a given size), or better yet re-use the same array for | ||
|  | your transforms. | ||
|  | 
 | ||
|  | @findex fftw_alignment_of | ||
|  | For rare circumstances in which you cannot control the alignment of | ||
|  | allocated memory, but wish to determine where a given array is | ||
|  | aligned like the original array for which a plan was created, you can | ||
|  | use the @code{fftw_alignment_of} function: | ||
|  | @example | ||
|  | int fftw_alignment_of(double *p); | ||
|  | @end example | ||
|  | Two arrays have equivalent alignment (for the purposes of applying a | ||
|  | plan) if and only if @code{fftw_alignment_of} returns the same value | ||
|  | for the corresponding pointers to their data (typecast to @code{double*}  | ||
|  | if necessary). | ||
|  | 
 | ||
|  | If you are tempted to use the new-array execute interface because you | ||
|  | want to transform a known bunch of arrays of the same size, you should | ||
|  | probably go use the advanced interface instead (@pxref{Advanced | ||
|  | Interface})). | ||
|  | 
 | ||
|  | The new-array execute functions are: | ||
|  | 
 | ||
|  | @example | ||
|  | void fftw_execute_dft( | ||
|  |      const fftw_plan p,  | ||
|  |      fftw_complex *in, fftw_complex *out); | ||
|  | 
 | ||
|  | void fftw_execute_split_dft( | ||
|  |      const fftw_plan p,  | ||
|  |      double *ri, double *ii, double *ro, double *io); | ||
|  | 
 | ||
|  | void fftw_execute_dft_r2c( | ||
|  |      const fftw_plan p, | ||
|  |      double *in, fftw_complex *out); | ||
|  | 
 | ||
|  | void fftw_execute_split_dft_r2c( | ||
|  |      const fftw_plan p, | ||
|  |      double *in, double *ro, double *io); | ||
|  | 
 | ||
|  | void fftw_execute_dft_c2r( | ||
|  |      const fftw_plan p, | ||
|  |      fftw_complex *in, double *out); | ||
|  | 
 | ||
|  | void fftw_execute_split_dft_c2r( | ||
|  |      const fftw_plan p, | ||
|  |      double *ri, double *ii, double *out); | ||
|  | 
 | ||
|  | void fftw_execute_r2r( | ||
|  |      const fftw_plan p,  | ||
|  |      double *in, double *out); | ||
|  | @end example | ||
|  | @findex fftw_execute_dft | ||
|  | @findex fftw_execute_split_dft | ||
|  | @findex fftw_execute_dft_r2c | ||
|  | @findex fftw_execute_split_dft_r2c | ||
|  | @findex fftw_execute_dft_c2r | ||
|  | @findex fftw_execute_split_dft_c2r | ||
|  | @findex fftw_execute_r2r | ||
|  | 
 | ||
|  | These execute the @code{plan} to compute the corresponding transform on | ||
|  | the input/output arrays specified by the subsequent arguments.  The | ||
|  | input/output array arguments have the same meanings as the ones passed | ||
|  | to the guru planner routines in the preceding sections.  The @code{plan} | ||
|  | is not modified, and these routines can be called as many times as | ||
|  | desired, or intermixed with calls to the ordinary @code{fftw_execute}. | ||
|  | 
 | ||
|  | The @code{plan} @emph{must} have been created for the transform type | ||
|  | corresponding to the execute function, e.g. it must be a complex-DFT | ||
|  | plan for @code{fftw_execute_dft}.  Any of the planner routines for that | ||
|  | transform type, from the basic to the guru interface, could have been | ||
|  | used to create the plan, however. | ||
|  | 
 | ||
|  | @c ------------------------------------------------------------ | ||
|  | @node Wisdom, What FFTW Really Computes, New-array Execute Functions, FFTW Reference | ||
|  | @section Wisdom | ||
|  | @cindex wisdom | ||
|  | @cindex saving plans to disk | ||
|  | 
 | ||
|  | This section documents the FFTW mechanism for saving and restoring | ||
|  | plans from disk.  This mechanism is called @dfn{wisdom}. | ||
|  | 
 | ||
|  | @menu | ||
|  | * Wisdom Export:: | ||
|  | * Wisdom Import:: | ||
|  | * Forgetting Wisdom:: | ||
|  | * Wisdom Utilities:: | ||
|  | @end menu | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Wisdom Export, Wisdom Import, Wisdom, Wisdom | ||
|  | @subsection Wisdom Export | ||
|  | 
 | ||
|  | @example | ||
|  | int fftw_export_wisdom_to_filename(const char *filename); | ||
|  | void fftw_export_wisdom_to_file(FILE *output_file); | ||
|  | char *fftw_export_wisdom_to_string(void); | ||
|  | void fftw_export_wisdom(void (*write_char)(char c, void *), void *data); | ||
|  | @end example | ||
|  | @findex fftw_export_wisdom | ||
|  | @findex fftw_export_wisdom_to_filename | ||
|  | @findex fftw_export_wisdom_to_file | ||
|  | @findex fftw_export_wisdom_to_string | ||
|  | 
 | ||
|  | These functions allow you to export all currently accumulated wisdom | ||
|  | in a form from which it can be later imported and restored, even | ||
|  | during a separate run of the program. (@xref{Words of Wisdom-Saving | ||
|  | Plans}.)  The current store of wisdom is not affected by calling any | ||
|  | of these routines. | ||
|  | 
 | ||
|  | @code{fftw_export_wisdom} exports the wisdom to any output | ||
|  | medium, as specified by the callback function | ||
|  | @code{write_char}. @code{write_char} is a @code{putc}-like function that | ||
|  | writes the character @code{c} to some output; its second parameter is | ||
|  | the @code{data} pointer passed to @code{fftw_export_wisdom}.  For | ||
|  | convenience, the following three ``wrapper'' routines are provided: | ||
|  | 
 | ||
|  | @code{fftw_export_wisdom_to_filename} writes wisdom to a file named | ||
|  | @code{filename} (which is created or overwritten), returning @code{1} | ||
|  | on success and @code{0} on failure.  A lower-level function, which | ||
|  | requires you to open and close the file yourself (e.g. if you want to | ||
|  | write wisdom to a portion of a larger file) is | ||
|  | @code{fftw_export_wisdom_to_file}.  This writes the wisdom to the | ||
|  | current position in @code{output_file}, which should be open with | ||
|  | write permission; upon exit, the file remains open and is positioned | ||
|  | at the end of the wisdom data. | ||
|  | 
 | ||
|  | @code{fftw_export_wisdom_to_string} returns a pointer to a | ||
|  | @code{NULL}-terminated string holding the wisdom data. This string is | ||
|  | dynamically allocated, and it is the responsibility of the caller to | ||
|  | deallocate it with @code{free} when it is no longer needed. | ||
|  | 
 | ||
|  | All of these routines export the wisdom in the same format, which we | ||
|  | will not document here except to say that it is LISP-like ASCII text | ||
|  | that is insensitive to white space. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Wisdom Import, Forgetting Wisdom, Wisdom Export, Wisdom | ||
|  | @subsection Wisdom Import | ||
|  | 
 | ||
|  | @example | ||
|  | int fftw_import_system_wisdom(void); | ||
|  | int fftw_import_wisdom_from_filename(const char *filename); | ||
|  | int fftw_import_wisdom_from_string(const char *input_string); | ||
|  | int fftw_import_wisdom(int (*read_char)(void *), void *data); | ||
|  | @end example | ||
|  | @findex fftw_import_wisdom | ||
|  | @findex fftw_import_system_wisdom | ||
|  | @findex fftw_import_wisdom_from_filename | ||
|  | @findex fftw_import_wisdom_from_file | ||
|  | @findex fftw_import_wisdom_from_string | ||
|  | 
 | ||
|  | These functions import wisdom into a program from data stored by the | ||
|  | @code{fftw_export_wisdom} functions above. (@xref{Words of | ||
|  | Wisdom-Saving Plans}.)  The imported wisdom replaces any wisdom | ||
|  | already accumulated by the running program. | ||
|  | 
 | ||
|  | @code{fftw_import_wisdom} imports wisdom from any input medium, as | ||
|  | specified by the callback function @code{read_char}. @code{read_char} is | ||
|  | a @code{getc}-like function that returns the next character in the | ||
|  | input; its parameter is the @code{data} pointer passed to | ||
|  | @code{fftw_import_wisdom}. If the end of the input data is reached | ||
|  | (which should never happen for valid data), @code{read_char} should | ||
|  | return @code{EOF} (as defined in @code{<stdio.h>}).  For convenience, | ||
|  | the following three ``wrapper'' routines are provided: | ||
|  | 
 | ||
|  | @code{fftw_import_wisdom_from_filename} reads wisdom from a file named | ||
|  | @code{filename}.  A lower-level function, which requires you to open | ||
|  | and close the file yourself (e.g. if you want to read wisdom from a | ||
|  | portion of a larger file) is @code{fftw_import_wisdom_from_file}. This | ||
|  | reads wisdom from the current position in @code{input_file} (which | ||
|  | should be open with read permission); upon exit, the file remains | ||
|  | open, but the position of the read pointer is unspecified. | ||
|  | 
 | ||
|  | @code{fftw_import_wisdom_from_string} reads wisdom from the | ||
|  | @code{NULL}-terminated string @code{input_string}. | ||
|  | 
 | ||
|  | @code{fftw_import_system_wisdom} reads wisdom from an | ||
|  | implementation-defined standard file (@code{/etc/fftw/wisdom} on Unix | ||
|  | and GNU systems). | ||
|  | @cindex wisdom, system-wide | ||
|  | 
 | ||
|  | 
 | ||
|  | The return value of these import routines is @code{1} if the wisdom was | ||
|  | read successfully and @code{0} otherwise. Note that, in all of these | ||
|  | functions, any data in the input stream past the end of the wisdom data | ||
|  | is simply ignored. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Forgetting Wisdom, Wisdom Utilities, Wisdom Import, Wisdom | ||
|  | @subsection Forgetting Wisdom | ||
|  | 
 | ||
|  | @example | ||
|  | void fftw_forget_wisdom(void); | ||
|  | @end example | ||
|  | @findex fftw_forget_wisdom | ||
|  | 
 | ||
|  | Calling @code{fftw_forget_wisdom} causes all accumulated @code{wisdom} | ||
|  | to be discarded and its associated memory to be freed. (New | ||
|  | @code{wisdom} can still be gathered subsequently, however.) | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Wisdom Utilities,  , Forgetting Wisdom, Wisdom | ||
|  | @subsection Wisdom Utilities | ||
|  | 
 | ||
|  | FFTW includes two standalone utility programs that deal with wisdom.  We | ||
|  | merely summarize them here, since they come with their own @code{man} | ||
|  | pages for Unix and GNU systems (with HTML versions on our web site). | ||
|  | 
 | ||
|  | The first program is @code{fftw-wisdom} (or @code{fftwf-wisdom} in | ||
|  | single precision, etcetera), which can be used to create a wisdom file | ||
|  | containing plans for any of the transform sizes and types supported by | ||
|  | FFTW.  It is preferable to create wisdom directly from your executable | ||
|  | (@pxref{Caveats in Using Wisdom}), but this program is useful for | ||
|  | creating global wisdom files for @code{fftw_import_system_wisdom}. | ||
|  | @cindex fftw-wisdom utility | ||
|  | 
 | ||
|  | 
 | ||
|  | The second program is @code{fftw-wisdom-to-conf}, which takes a wisdom | ||
|  | file as input and produces a @dfn{configuration routine} as output.  The | ||
|  | latter is a C subroutine that you can compile and link into your | ||
|  | program, replacing a routine of the same name in the FFTW library, that | ||
|  | determines which parts of FFTW are callable by your program. | ||
|  | @code{fftw-wisdom-to-conf} produces a configuration routine that links | ||
|  | to only those parts of FFTW needed by the saved plans in the wisdom, | ||
|  | greatly reducing the size of statically linked executables (which should | ||
|  | only attempt to create plans corresponding to those in the wisdom, | ||
|  | however). | ||
|  | @cindex fftw-wisdom-to-conf utility | ||
|  | @cindex configuration routines | ||
|  | 
 | ||
|  | @c ------------------------------------------------------------ | ||
|  | @node What FFTW Really Computes,  , Wisdom, FFTW Reference | ||
|  | @section What FFTW Really Computes | ||
|  | 
 | ||
|  | In this section, we provide precise mathematical definitions for the | ||
|  | transforms that FFTW computes.  These transform definitions are fairly | ||
|  | standard, but some authors follow slightly different conventions for the | ||
|  | normalization of the transform (the constant factor in front) and the | ||
|  | sign of the complex exponent.  We begin by presenting the | ||
|  | one-dimensional (1d) transform definitions, and then give the | ||
|  | straightforward extension to multi-dimensional transforms. | ||
|  | 
 | ||
|  | @menu | ||
|  | * The 1d Discrete Fourier Transform (DFT):: | ||
|  | * The 1d Real-data DFT:: | ||
|  | * 1d Real-even DFTs (DCTs):: | ||
|  | * 1d Real-odd DFTs (DSTs):: | ||
|  | * 1d Discrete Hartley Transforms (DHTs):: | ||
|  | * Multi-dimensional Transforms:: | ||
|  | @end menu | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node The 1d Discrete Fourier Transform (DFT), The 1d Real-data DFT, What FFTW Really Computes, What FFTW Really Computes | ||
|  | @subsection The 1d Discrete Fourier Transform (DFT) | ||
|  | 
 | ||
|  | @cindex discrete Fourier transform | ||
|  | @cindex DFT | ||
|  | The forward (@code{FFTW_FORWARD}) discrete Fourier transform (DFT) of a | ||
|  | 1d complex array @math{X} of size @math{n} computes an array @math{Y}, | ||
|  | where: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ . | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) . | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-dft.png" align="top">.</center> | ||
|  | @end html | ||
|  | The backward (@code{FFTW_BACKWARD}) DFT computes: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ . | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) . | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-idft.png" align="top">.</center> | ||
|  | @end html | ||
|  | 
 | ||
|  | @cindex normalization | ||
|  | FFTW computes an unnormalized transform, in that there is no coefficient | ||
|  | in front of the summation in the DFT.  In other words, applying the | ||
|  | forward and then the backward transform will multiply the input by | ||
|  | @math{n}. | ||
|  | 
 | ||
|  | @cindex frequency | ||
|  | From above, an @code{FFTW_FORWARD} transform corresponds to a sign of | ||
|  | @math{-1} in the exponent of the DFT.  Note also that we use the | ||
|  | standard ``in-order'' output ordering---the @math{k}-th output | ||
|  | corresponds to the frequency @math{k/n} (or @math{k/T}, where @math{T} | ||
|  | is your total sampling period).  For those who like to think in terms of | ||
|  | positive and negative frequencies, this means that the positive | ||
|  | frequencies are stored in the first half of the output and the negative | ||
|  | frequencies are stored in backwards order in the second half of the | ||
|  | output.  (The frequency @math{-k/n} is the same as the frequency | ||
|  | @math{(n-k)/n}.) | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node The 1d Real-data DFT, 1d Real-even DFTs (DCTs), The 1d Discrete Fourier Transform (DFT), What FFTW Really Computes | ||
|  | @subsection The 1d Real-data DFT | ||
|  | 
 | ||
|  | The real-input (r2c) DFT in FFTW computes the @emph{forward} transform | ||
|  | @math{Y} of the size @code{n} real array @math{X}, exactly as defined | ||
|  | above, i.e. | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = \sum_{j = 0}^{n - 1} X_j e^{-2\pi j k \sqrt{-1}/n} \ . | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(-2 pi j k sqrt(-1)/n) . | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-dft.png" align="top">.</center> | ||
|  | @end html | ||
|  | This output array @math{Y} can easily be shown to possess the | ||
|  | ``Hermitian'' symmetry | ||
|  | @cindex Hermitian | ||
|  | @tex | ||
|  | $Y_k = Y_{n-k}^*$, | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[k] = Y[n-k]*, | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>Y<sub>k</sub> = Y<sub>n-k</sub></i><sup>*</sup>, | ||
|  | @end html | ||
|  | where we take @math{Y} to be periodic so that | ||
|  | @tex | ||
|  | $Y_n = Y_0$. | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[n] = Y[0]. | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>Y<sub>n</sub> = Y</i><sub>0</sub>. | ||
|  | @end html | ||
|  | 
 | ||
|  | As a result of this symmetry, half of the output @math{Y} is redundant | ||
|  | (being the complex conjugate of the other half), and so the 1d r2c | ||
|  | transforms only output elements @math{0}@dots{}@math{n/2} of @math{Y} | ||
|  | (@math{n/2+1} complex numbers), where the division by @math{2} is | ||
|  | rounded down.  | ||
|  | 
 | ||
|  | Moreover, the Hermitian symmetry implies that | ||
|  | @tex | ||
|  | $Y_0$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[0] | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>Y</i><sub>0</sub> | ||
|  | @end html | ||
|  | and, if @math{n} is even, the | ||
|  | @tex | ||
|  | $Y_{n/2}$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[n/2] | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>Y</i><sub><i>n</i>/2</sub> | ||
|  | @end html | ||
|  | element, are purely real.  So, for the @code{R2HC} r2r transform, the | ||
|  | halfcomplex format does not store the imaginary parts of these elements. | ||
|  | @cindex r2r | ||
|  | @ctindex R2HC | ||
|  | @cindex halfcomplex format | ||
|  | 
 | ||
|  | 
 | ||
|  | The c2r and @code{H2RC} r2r transforms compute the backward DFT of the | ||
|  | @emph{complex} array @math{X} with Hermitian symmetry, stored in the | ||
|  | r2c/@code{R2HC} output formats, respectively, where the backward | ||
|  | transform is defined exactly as for the complex case: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = \sum_{j = 0}^{n - 1} X_j e^{2\pi j k \sqrt{-1}/n} \ . | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | @center Y[k] = sum for j = 0 to (n - 1) of X[j] * exp(2 pi j k sqrt(-1)/n) . | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-idft.png" align="top">.</center> | ||
|  | @end html | ||
|  | The outputs @code{Y} of this transform can easily be seen to be purely | ||
|  | real, and are stored as an array of real numbers. | ||
|  | 
 | ||
|  | @cindex normalization | ||
|  | Like FFTW's complex DFT, these transforms are unnormalized.  In other | ||
|  | words, applying the real-to-complex (forward) and then the | ||
|  | complex-to-real (backward) transform will multiply the input by | ||
|  | @math{n}. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node 1d Real-even DFTs (DCTs), 1d Real-odd DFTs (DSTs), The 1d Real-data DFT, What FFTW Really Computes | ||
|  | @subsection 1d Real-even DFTs (DCTs) | ||
|  | 
 | ||
|  | The Real-even symmetry DFTs in FFTW are exactly equivalent to the unnormalized | ||
|  | forward (and backward) DFTs as defined above, where the input array | ||
|  | @math{X} of length @math{N} is purely real and is also @dfn{even} symmetry.  In | ||
|  | this case, the output array is likewise real and even symmetry. | ||
|  | @cindex real-even DFT | ||
|  | @cindex REDFT | ||
|  | 
 | ||
|  | 
 | ||
|  | @ctindex REDFT00 | ||
|  | For the case of @code{REDFT00}, this even symmetry means that | ||
|  | @tex | ||
|  | $X_j = X_{N-j}$, | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | X[j] = X[N-j], | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>X<sub>j</sub> = X<sub>N-j</sub></i>, | ||
|  | @end html | ||
|  | where we take @math{X} to be periodic so that | ||
|  | @tex | ||
|  | $X_N = X_0$. | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | X[N] = X[0]. | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>X<sub>N</sub> = X</i><sub>0</sub>. | ||
|  | @end html | ||
|  | Because of this redundancy, only the first @math{n} real numbers are | ||
|  | actually stored, where @math{N = 2(n-1)}. | ||
|  | 
 | ||
|  | The proper definition of even symmetry for @code{REDFT10}, | ||
|  | @code{REDFT01}, and @code{REDFT11} transforms is somewhat more intricate | ||
|  | because of the shifts by @math{1/2} of the input and/or output, although | ||
|  | the corresponding boundary conditions are given in @ref{Real even/odd | ||
|  | DFTs (cosine/sine transforms)}.  Because of the even symmetry, however, | ||
|  | the sine terms in the DFT all cancel and the remaining cosine terms are | ||
|  | written explicitly below.  This formulation often leads people to call | ||
|  | such a transform a @dfn{discrete cosine transform} (DCT), although it is | ||
|  | really just a special case of the DFT. | ||
|  | @cindex discrete cosine transform | ||
|  | @cindex DCT | ||
|  | 
 | ||
|  | 
 | ||
|  | In each of the definitions below, we transform a real array @math{X} of | ||
|  | length @math{n} to a real array @math{Y} of length @math{n}: | ||
|  | 
 | ||
|  | @subsubheading REDFT00 (DCT-I) | ||
|  | @ctindex REDFT00 | ||
|  | An @code{REDFT00} transform (type-I DCT) in FFTW is defined by: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = X_0 + (-1)^k X_{n-1} | ||
|  |        + 2 \sum_{j=1}^{n-2} X_j \cos [ \pi j k / (n-1)]. | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[k] = X[0] + (-1)^k X[n-1] + 2 (sum for j = 1 to n-2 of X[j] cos(pi jk /(n-1))). | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-redft00.png" align="top">.</center> | ||
|  | @end html | ||
|  | Note that this transform is not defined for @math{n=1}.  For @math{n=2}, | ||
|  | the summation term above is dropped as you might expect. | ||
|  | 
 | ||
|  | @subsubheading REDFT10 (DCT-II) | ||
|  | @ctindex REDFT10 | ||
|  | An @code{REDFT10} transform (type-II DCT, sometimes called ``the'' DCT) in FFTW is defined by: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) k / n]. | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) k / n)). | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-redft10.png" align="top">.</center> | ||
|  | @end html | ||
|  | 
 | ||
|  | @subsubheading REDFT01 (DCT-III) | ||
|  | @ctindex REDFT01 | ||
|  | An @code{REDFT01} transform (type-III DCT) in FFTW is defined by: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = X_0 + 2 \sum_{j=1}^{n-1} X_j \cos [\pi j (k+1/2) / n]. | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[k] = X[0] + 2 (sum for j = 1 to n-1 of X[j] cos(pi j (k+1/2) / n)). | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-redft01.png" align="top">.</center> | ||
|  | @end html | ||
|  | In the case of @math{n=1}, this reduces to | ||
|  | @tex | ||
|  | $Y_0 = X_0$. | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[0] = X[0]. | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>. | ||
|  | @end html | ||
|  | Up to a scale factor (see below), this is the inverse of @code{REDFT10} (``the'' DCT), and so the @code{REDFT01} (DCT-III) is sometimes called the ``IDCT''. | ||
|  | @cindex IDCT | ||
|  | 
 | ||
|  | @subsubheading REDFT11 (DCT-IV) | ||
|  | @ctindex REDFT11 | ||
|  | An @code{REDFT11} transform (type-IV DCT) in FFTW is defined by: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = 2 \sum_{j=0}^{n-1} X_j \cos [\pi (j+1/2) (k+1/2) / n]. | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[k] = 2 (sum for j = 0 to n-1 of X[j] cos(pi (j+1/2) (k+1/2) / n)). | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-redft11.png" align="top">.</center> | ||
|  | @end html | ||
|  | 
 | ||
|  | @subsubheading Inverses and Normalization | ||
|  | 
 | ||
|  | These definitions correspond directly to the unnormalized DFTs used | ||
|  | elsewhere in FFTW (hence the factors of @math{2} in front of the | ||
|  | summations).  The unnormalized inverse of @code{REDFT00} is | ||
|  | @code{REDFT00}, of @code{REDFT10} is @code{REDFT01} and vice versa, and | ||
|  | of @code{REDFT11} is @code{REDFT11}.  Each unnormalized inverse results | ||
|  | in the original array multiplied by @math{N}, where @math{N} is the | ||
|  | @emph{logical} DFT size.  For @code{REDFT00}, @math{N=2(n-1)} (note that | ||
|  | @math{n=1} is not defined); otherwise, @math{N=2n}. | ||
|  | @cindex normalization | ||
|  | 
 | ||
|  | 
 | ||
|  | In defining the discrete cosine transform, some authors also include | ||
|  | additional factors of | ||
|  | @ifinfo | ||
|  | sqrt(2) | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | √2 | ||
|  | @end html | ||
|  | @tex | ||
|  | $\sqrt{2}$ | ||
|  | @end tex | ||
|  | (or its inverse) multiplying selected inputs and/or outputs.  This is a | ||
|  | mostly cosmetic change that makes the transform orthogonal, but | ||
|  | sacrifices the direct equivalence to a symmetric DFT. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node 1d Real-odd DFTs (DSTs), 1d Discrete Hartley Transforms (DHTs), 1d Real-even DFTs (DCTs), What FFTW Really Computes | ||
|  | @subsection 1d Real-odd DFTs (DSTs) | ||
|  | 
 | ||
|  | The Real-odd symmetry DFTs in FFTW are exactly equivalent to the unnormalized | ||
|  | forward (and backward) DFTs as defined above, where the input array | ||
|  | @math{X} of length @math{N} is purely real and is also @dfn{odd} symmetry.  In | ||
|  | this case, the output is odd symmetry and purely imaginary. | ||
|  | @cindex real-odd DFT | ||
|  | @cindex RODFT | ||
|  | 
 | ||
|  | 
 | ||
|  | @ctindex RODFT00 | ||
|  | For the case of @code{RODFT00}, this odd symmetry means that | ||
|  | @tex | ||
|  | $X_j = -X_{N-j}$, | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | X[j] = -X[N-j], | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>X<sub>j</sub> = -X<sub>N-j</sub></i>, | ||
|  | @end html | ||
|  | where we take @math{X} to be periodic so that | ||
|  | @tex | ||
|  | $X_N = X_0$. | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | X[N] = X[0]. | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>X<sub>N</sub> = X</i><sub>0</sub>. | ||
|  | @end html | ||
|  | Because of this redundancy, only the first @math{n} real numbers | ||
|  | starting at @math{j=1} are actually stored (the @math{j=0} element is | ||
|  | zero), where @math{N = 2(n+1)}. | ||
|  | 
 | ||
|  | The proper definition of odd symmetry for @code{RODFT10}, | ||
|  | @code{RODFT01}, and @code{RODFT11} transforms is somewhat more intricate | ||
|  | because of the shifts by @math{1/2} of the input and/or output, although | ||
|  | the corresponding boundary conditions are given in @ref{Real even/odd | ||
|  | DFTs (cosine/sine transforms)}.  Because of the odd symmetry, however, | ||
|  | the cosine terms in the DFT all cancel and the remaining sine terms are | ||
|  | written explicitly below.  This formulation often leads people to call | ||
|  | such a transform a @dfn{discrete sine transform} (DST), although it is | ||
|  | really just a special case of the DFT. | ||
|  | @cindex discrete sine transform | ||
|  | @cindex DST | ||
|  | 
 | ||
|  | 
 | ||
|  | In each of the definitions below, we transform a real array @math{X} of | ||
|  | length @math{n} to a real array @math{Y} of length @math{n}: | ||
|  | 
 | ||
|  | @subsubheading RODFT00 (DST-I) | ||
|  | @ctindex RODFT00 | ||
|  | An @code{RODFT00} transform (type-I DST) in FFTW is defined by: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [ \pi (j+1) (k+1) / (n+1)]. | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1)(k+1) / (n+1))). | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-rodft00.png" align="top">.</center> | ||
|  | @end html | ||
|  | 
 | ||
|  | @subsubheading RODFT10 (DST-II) | ||
|  | @ctindex RODFT10 | ||
|  | An @code{RODFT10} transform (type-II DST) in FFTW is defined by: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1) / n]. | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1) / n)). | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-rodft10.png" align="top">.</center> | ||
|  | @end html | ||
|  | 
 | ||
|  | @subsubheading RODFT01 (DST-III) | ||
|  | @ctindex RODFT01 | ||
|  | An @code{RODFT01} transform (type-III DST) in FFTW is defined by: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = (-1)^k X_{n-1} + 2 \sum_{j=0}^{n-2} X_j \sin [\pi (j+1) (k+1/2) / n]. | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[k] = (-1)^k X[n-1] + 2 (sum for j = 0 to n-2 of X[j] sin(pi (j+1) (k+1/2) / n)). | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-rodft01.png" align="top">.</center> | ||
|  | @end html | ||
|  | In the case of @math{n=1}, this reduces to | ||
|  | @tex | ||
|  | $Y_0 = X_0$. | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[0] = X[0]. | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <i>Y</i><sub>0</sub> = <i>X</i><sub>0</sub>. | ||
|  | @end html | ||
|  | 
 | ||
|  | @subsubheading RODFT11 (DST-IV) | ||
|  | @ctindex RODFT11 | ||
|  | An @code{RODFT11} transform (type-IV DST) in FFTW is defined by: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = 2 \sum_{j=0}^{n-1} X_j \sin [\pi (j+1/2) (k+1/2) / n]. | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | Y[k] = 2 (sum for j = 0 to n-1 of X[j] sin(pi (j+1/2) (k+1/2) / n)). | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-rodft11.png" align="top">.</center> | ||
|  | @end html | ||
|  | 
 | ||
|  | @subsubheading Inverses and Normalization | ||
|  | 
 | ||
|  | These definitions correspond directly to the unnormalized DFTs used | ||
|  | elsewhere in FFTW (hence the factors of @math{2} in front of the | ||
|  | summations).  The unnormalized inverse of @code{RODFT00} is | ||
|  | @code{RODFT00}, of @code{RODFT10} is @code{RODFT01} and vice versa, and | ||
|  | of @code{RODFT11} is @code{RODFT11}.  Each unnormalized inverse results | ||
|  | in the original array multiplied by @math{N}, where @math{N} is the | ||
|  | @emph{logical} DFT size.  For @code{RODFT00}, @math{N=2(n+1)}; | ||
|  | otherwise, @math{N=2n}. | ||
|  | @cindex normalization | ||
|  | 
 | ||
|  | 
 | ||
|  | In defining the discrete sine transform, some authors also include | ||
|  | additional factors of | ||
|  | @ifinfo | ||
|  | sqrt(2) | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | √2 | ||
|  | @end html | ||
|  | @tex | ||
|  | $\sqrt{2}$ | ||
|  | @end tex | ||
|  | (or its inverse) multiplying selected inputs and/or outputs.  This is a | ||
|  | mostly cosmetic change that makes the transform orthogonal, but | ||
|  | sacrifices the direct equivalence to an antisymmetric DFT. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node 1d Discrete Hartley Transforms (DHTs), Multi-dimensional Transforms, 1d Real-odd DFTs (DSTs), What FFTW Really Computes | ||
|  | @subsection 1d Discrete Hartley Transforms (DHTs) | ||
|  | 
 | ||
|  | @cindex discrete Hartley transform | ||
|  | @cindex DHT | ||
|  | The discrete Hartley transform (DHT) of a 1d real array @math{X} of size | ||
|  | @math{n} computes a real array @math{Y} of the same size, where: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y_k = \sum_{j = 0}^{n - 1} X_j [ \cos(2\pi j k / n) + \sin(2\pi j k / n)]. | ||
|  | $$ | ||
|  | @end tex | ||
|  | @ifinfo | ||
|  | @center Y[k] = sum for j = 0 to (n - 1) of X[j] * [cos(2 pi j k / n) + sin(2 pi j k / n)]. | ||
|  | @end ifinfo | ||
|  | @html | ||
|  | <center><img src="equation-dht.png" align="top">.</center> | ||
|  | @end html | ||
|  | 
 | ||
|  | @cindex normalization | ||
|  | FFTW computes an unnormalized transform, in that there is no coefficient | ||
|  | in front of the summation in the DHT.  In other words, applying the | ||
|  | transform twice (the DHT is its own inverse) will multiply the input by | ||
|  | @math{n}. | ||
|  | 
 | ||
|  | @c =========> | ||
|  | @node Multi-dimensional Transforms,  , 1d Discrete Hartley Transforms (DHTs), What FFTW Really Computes | ||
|  | @subsection Multi-dimensional Transforms | ||
|  | 
 | ||
|  | The multi-dimensional transforms of FFTW, in general, compute simply the | ||
|  | separable product of the given 1d transform along each dimension of the | ||
|  | array.  Since each of these transforms is unnormalized, computing the | ||
|  | forward followed by the backward/inverse multi-dimensional transform | ||
|  | will result in the original array scaled by the product of the | ||
|  | normalization factors for each dimension (e.g. the product of the | ||
|  | dimension sizes, for a multi-dimensional DFT). | ||
|  | 
 | ||
|  | @tex | ||
|  | As an explicit example, consider the following exact mathematical | ||
|  | definition of our multi-dimensional DFT.  Let $X$ be a $d$-dimensional | ||
|  | complex array whose elements are $X[j_1, j_2, \ldots, j_d]$, where $0 | ||
|  | \leq j_s < n_s$ for all~$s \in \{ 1, 2, \ldots, d \}$.  Let also | ||
|  | $\omega_s = e^{2\pi \sqrt{-1}/n_s}$, for all ~$s \in \{ 1, 2, \ldots, d | ||
|  | \}$. | ||
|  | 
 | ||
|  | The forward transform computes a complex array~$Y$, whose | ||
|  | structure is the same as that of~$X$, defined by | ||
|  | 
 | ||
|  | $$ | ||
|  | Y[k_1, k_2, \ldots, k_d] = | ||
|  |     \sum_{j_1 = 0}^{n_1 - 1} | ||
|  |         \sum_{j_2 = 0}^{n_2 - 1} | ||
|  |            \cdots | ||
|  |               \sum_{j_d = 0}^{n_d - 1} | ||
|  |                   X[j_1, j_2, \ldots, j_d]  | ||
|  |                       \omega_1^{-j_1 k_1} | ||
|  |                       \omega_2^{-j_2 k_2} | ||
|  |                       \cdots | ||
|  |                       \omega_d^{-j_d k_d} \ . | ||
|  | $$ | ||
|  | 
 | ||
|  | The backward transform computes | ||
|  | $$ | ||
|  | Y[k_1, k_2, \ldots, k_d] = | ||
|  |     \sum_{j_1 = 0}^{n_1 - 1} | ||
|  |         \sum_{j_2 = 0}^{n_2 - 1} | ||
|  |            \cdots | ||
|  |               \sum_{j_d = 0}^{n_d - 1} | ||
|  |                   X[j_1, j_2, \ldots, j_d]  | ||
|  |                       \omega_1^{j_1 k_1} | ||
|  |                       \omega_2^{j_2 k_2} | ||
|  |                       \cdots | ||
|  |                       \omega_d^{j_d k_d} \ . | ||
|  | $$ | ||
|  | 
 | ||
|  | Computing the forward transform followed by the backward transform | ||
|  | will multiply the array by $\prod_{s=1}^{d} n_d$. | ||
|  | @end tex | ||
|  | 
 | ||
|  | @cindex r2c | ||
|  | The definition of FFTW's multi-dimensional DFT of real data (r2c) | ||
|  | deserves special attention.  In this case, we logically compute the full | ||
|  | multi-dimensional DFT of the input data; since the input data are purely | ||
|  | real, the output data have the Hermitian symmetry and therefore only one | ||
|  | non-redundant half need be stored.  More specifically, for an @ndims multi-dimensional real-input DFT, the full (logical) complex output array | ||
|  | @tex | ||
|  | $Y[k_0, k_1, \ldots, k_{d-1}]$ | ||
|  | @end tex | ||
|  | @html | ||
|  | <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ..., | ||
|  | <i>k</i><sub><i>d-1</i></sub>] | ||
|  | @end html | ||
|  | @ifinfo | ||
|  | Y[k[0], k[1], ..., k[d-1]] | ||
|  | @end ifinfo | ||
|  | has the symmetry: | ||
|  | @tex | ||
|  | $$ | ||
|  | Y[k_0, k_1, \ldots, k_{d-1}] = Y[n_0 - k_0, n_1 - k_1, \ldots, n_{d-1} - k_{d-1}]^* | ||
|  | $$ | ||
|  | @end tex | ||
|  | @html | ||
|  | <i>Y</i>[<i>k</i><sub>0</sub>, <i>k</i><sub>1</sub>, ..., | ||
|  | <i>k</i><sub><i>d-1</i></sub>] = <i>Y</i>[<i>n</i><sub>0</sub> - | ||
|  | <i>k</i><sub>0</sub>, <i>n</i><sub>1</sub> - <i>k</i><sub>1</sub>, ..., | ||
|  | <i>n</i><sub><i>d-1</i></sub> - <i>k</i><sub><i>d-1</i></sub>]<sup>*</sup> | ||
|  | @end html | ||
|  | @ifinfo | ||
|  | Y[k[0], k[1], ..., k[d-1]] = Y[n[0] - k[0], n[1] - k[1], ..., n[d-1] - k[d-1]]* | ||
|  | @end ifinfo | ||
|  | (where each dimension is periodic).  Because of this symmetry, we only | ||
|  | store the | ||
|  | @tex | ||
|  | $k_{d-1} = 0 \cdots n_{d-1}/2$ | ||
|  | @end tex | ||
|  | @html | ||
|  | <i>k</i><sub><i>d-1</i></sub> = 0...<i>n</i><sub><i>d-1</i></sub>/2+1 | ||
|  | @end html | ||
|  | @ifinfo | ||
|  | k[d-1] = 0...n[d-1]/2 | ||
|  | @end ifinfo | ||
|  | elements of the @emph{last} dimension (division by @math{2} is rounded | ||
|  | down).  (We could instead have cut any other dimension in half, but the | ||
|  | last dimension proved computationally convenient.)  This results in the | ||
|  | peculiar array format described in more detail by @ref{Real-data DFT | ||
|  | Array Format}. | ||
|  | 
 | ||
|  | The multi-dimensional c2r transform is simply the unnormalized inverse | ||
|  | of the r2c transform.  i.e. it is the same as FFTW's complex backward | ||
|  | multi-dimensional DFT, operating on a Hermitian input array in the | ||
|  | peculiar format mentioned above and outputting a real array (since the | ||
|  | DFT output is purely real). | ||
|  | 
 | ||
|  | We should remind the user that the separable product of 1d transforms | ||
|  | along each dimension, as computed by FFTW, is not always the same thing | ||
|  | as the usual multi-dimensional transform.  A multi-dimensional | ||
|  | @code{R2HC} (or @code{HC2R}) transform is not identical to the | ||
|  | multi-dimensional DFT, requiring some post-processing to combine the | ||
|  | requisite real and imaginary parts, as was described in @ref{The | ||
|  | Halfcomplex-format DFT}.  Likewise, FFTW's multidimensional | ||
|  | @code{FFTW_DHT} r2r transform is not the same thing as the logical | ||
|  | multi-dimensional discrete Hartley transform defined in the literature, | ||
|  | as discussed in @ref{The Discrete Hartley Transform}. | ||
|  | 
 |