TODO before FFTW-$2\pi$:

* figure out how to autodetect NEON at runtime

* figure out the arm cycle counter business

* Wisdom: make it clear that it is specific to the exact fftw version
  and configuration.  Report error codes when reading wisdom.  Maybe
  have multiple system wisdom files, one per version?

* DCT/DST codelets?  which kinds?

* investigate the addition-chain trig computation

* I can't believe that there isn't a closed form for the omega
  array in Rader.

* convolution problem type(s)

* Explore the idea of having n < 0 in tensors, possibly to mean
  inverse DFT.

* better estimator: possibly, let "other" cost be coef * n, where
  coef is a per-solver constant determined via some big numerical
  optimization/fit.

* vector radix, multidimensional codelets

* it may be a good idea to unify all those little loops that do
  copying, (X[i], X[n-i]) <- (X[i] + X[n-i], X[i] - X[n-i]),
  and multiplication of vectors by twiddle factors.

* Pruned FFTs (basically, a vecloop that skips zeros).

* Try FFTPACK-style back-and-forth (Stockham) FFT.  (We tried this a
  few years ago and it was slower, but perhaps matters have changed.)

* Generate assembly directly for more processors, or maybe fork gcc.  =)

* ensure that threaded solvers generate (block_size % 4 == 0)
  to allow SIMD to be used.

* memoize triggen.
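
Illustration for the "little loops" item: a minimal sketch of the
(X[i], X[n-i]) <- (X[i] + X[n-i], X[i] - X[n-i]) update as one shared
helper.  The name sum_diff_pairs, the use of double, and the in-place
contiguous layout are assumptions made for this example only; the
existing loops in the library may differ in stride, precision, and
bounds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not an existing FFTW routine.
   For every pair (i, n-i) with 0 < i < n-i, replace
   (X[i], X[n-i]) by (X[i] + X[n-i], X[i] - X[n-i]).
   X[0] and, for even n, X[n/2] have no distinct partner
   and are left unchanged. */
static void sum_diff_pairs(double *X, size_t n)
{
     size_t i;

     assert(X != NULL || n == 0);

     for (i = 1; i + i < n; ++i) {
          double a = X[i];      /* read both values before writing */
          double b = X[n - i];
          X[i] = a + b;
          X[n - i] = a - b;
     }
}
```

A unified version would presumably also need stride parameters and the
library's own real type, which this sketch leaves out.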
