This is my implementation of integer real fft computation. It was
designed for high speed analysis of audio input in real time on a
PC/AT.

The data used was 8bit but the routine does 16bit signed arithmetic.
The algorithm of choice was one which minimizes mults while keeping the
number of needed registers to a minimum (the Intel chip does not have
enough). Data if gradualy scaled in flight to avoid overflow.

A standard (2,4) split-radix was used as described in [Sorensen et al.,
IEEE ASSP-35 no 6., June 87, pp. 849-863]. Other algoriths (WFT etc.)
where found to use too complex arithmetic for efficient implementation
on a machine with a decent mult instruction.

WHAT IT DOES NOT DO:
- NO inverse fft.
- NO complex input.
- NO windowing.
- output is power spectrum WITHOUT the square root taken.
- the data array x[] is not descramled so you cannot use it directly.
  see how the power spectrum is derived if you want to generate
  phase or whatever. fft7() and fft8() function do this post-
  processing.

The method of implementaion was to have a program (fftg.c) that
generates the fft routine which is then compiled/assembled and used.
The generated program is assembly, but for portability this release
generates a C program too. I have routines for Intel assembly and ns32k
assembly, other machines can have the basic routines re-written without
much effort (but for best performance one would optimize these routines
extensively).

The tar file contains a sample program (fft2.c) that uses the fft routine. It
draws some basic pictute on a screen using move() and draw() function which
you will have to supply (it uses the microsoft c graphics library).

Originaly the program started as a BASIC program from some British mag
a few years ago which started my interest as it took a few minutes to
do one fft. The current version does a 256 point fft on a 25MHz ns32532
in 2.9ms, on a 10MHz 80286 it took 7ms (I think, it has a rather fast
int mult).

To get the generator programs 'make progs'.

To get the fft programs 'make fft'. The 'c' version uses fftsubs.h. The
'f' version uses C function calls which is a bit slower but uses far
less memory, it uses fftsubs.c. Note that the makefile generates a 256
point routine with entry point fft(), you may want to change it for your
needs or just edit the output.

fftouts.c generates ns32k assembly, fftouta.c generates Intel assembly.
These should give a good idea of how to port the assembly level
routines to another machine.

To get the test programs 'make tests' (but first fix fft2.c for the
graphics).

Please let me know of problems/difficulties you encounter with these
programs.
