Welcome to POWERpv, a set of programs
for altering and creating sounds
using Fourier analysis as their basis.
These programs build upon the
structure of the practical phase
vocoder C program described by
Richard Moore in Elements of Computer
Music. The programs work at the
level of the Unix command line,
which tends to make them quite
portable.

In order to build these programs,
first visit the PVLIB directory
and type "make". (Don't be a slob! 
Type "make clean" afterwards.) You may 
then build any of the POWERpv programs at 
your leisure by visiting the appropriate
directory and typing "make". These
programs were built on a NeXT 2.1 slab.
However, since they're written in
antsy (ANSI) C, I expect that they
should compile successfully on
a variety of platforms.

Most of the programs behave like
filters, with an input and an output.
A few of the programs are synthesizers
with no input. These modules are dedicated
to Gordon Mumma, who has been fond of building
electronic music circuits with no inputs 
since the late 1950s.

Each directory will have some discussion
of what the processor does, some more than
others, if I already had to explain it to
someone. Following are a few general
remarks about using these processors.


"OFF WITH HER HEAD!" SHOUTED THE QUEEN

These programs process a stream of
floating point samples on input and
on output. Since most soundfiles have
some header information, and are often
stored as integer samples, you will
have to strip off the header, possibly
convert to floating point values, run
them through a POWERpv processor and then
reverse the conversion and header-stripping steps.
If you have the utilities to accomplish
this, then you probably already know how 
to use them. If not, I suggest you pick
up the Sound_io_NeXT.tar.Z distribution 
from princeton.edu. (While you're at it,
treat yourself to a copy of Christopher
Penrose's phase vocoder processor kit,
pV.tar.Z if you don't already have a copy.)
The CARL Package has recently been released
at ccrma-ftp.stanford.edu so you might
get the soundfile IO you need from there
instead. As a bonus, you also get cmusic
and Moore's original pv program.
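
If you end up rolling your own conversion, the
sample format step might look roughly like the
sketch below. It is not part of POWERpv. The
function names are made up for illustration, and
it assumes headerless 16-bit samples in your
machine's native byte order, scaled so that full
scale maps to 1.0. Check what your soundfile
tools actually produce before trusting it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert 16-bit integer samples to
   floats in [-1.0, 1.0). The 1/32768 scaling is an
   assumption, not something POWERpv dictates. */
void shorts_to_floats(const short *in, float *out, size_t n)
{
    size_t i;
    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = (float)in[i] / 32768.0f;
}

/* Hypothetical helper: the reverse conversion, clipping
   anything that would overflow a 16-bit sample. */
void floats_to_shorts(const float *in, short *out, size_t n)
{
    size_t i;
    assert(in != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        float v = in[i] * 32768.0f;
        if (v > 32767.0f)
            v = 32767.0f;
        if (v < -32768.0f)
            v = -32768.0f;
        out[i] = (short)v;
    }
}
```

Wrapped in a loop of fread and fwrite calls on
stdin and stdout, either function becomes a filter
you can put in a pipeline next to a processor.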

All the processors in POWERpv are 
self-documenting to a certain degree. Example:

mitosis> temper
temper:  static spectral compander
temper   [flags] < floatsams > floatsams
        N:      fft length [1024]
        R:      sampling rate [44100]
        M:      window size in samples [2048]
        D:      decimation factor in samples [256]
        I:      interpolation factor in samples [256]
        P:      pitch factor [0 for overlap add] [1.0]
        f:      spectral companding factor [2.0]
        t:      oscillator resynthesis threshold [.001]
        s:      synthesize analysis input

The usage message lists the flags
and, in square brackets, their default values.
Since all the important values are predefined,
the temper processor could be run successfully
without any user-supplied parameters.

I will briefly discuss the parameters which are
most common throughout these programs. For more
information, see Moore, or Penrose's discussion in pV.
For discussion of processor-specific parameters,
visit the appropriate directory.

R - sampling rate of input sound.
N - number of points in the FFT analysis.
This works out to N/2 instantaneous amplitudes
and frequencies (or phases - more about that later).
At first I felt ripped off when I realized that for
a value of N FFT points, I was only getting N/2
frequency points. But I felt better after I
realized I was also getting negative frequencies. [:)]
N must be a power of 2, since we are using the
FFT. Typical values are 1024 or 2048, but I recommend
experimenting with extreme high and low values as they can
produce interesting effects. Note that higher
values of N increase frequency resolution but
decrease transient resolution. 
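
To put numbers on that tradeoff, here is a small
sketch (not taken from the POWERpv sources; the
function names are invented) that checks the
power-of-2 requirement and works out the spacing
between analysis bins, R/N, and the stretch of
time covered by N samples, N/R.

```c
#include <assert.h>

/* Nonzero if n is a power of two. */
int is_power_of_two(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Spacing in Hz between adjacent analysis bins: R / N. */
double bin_spacing_hz(double R, unsigned long N)
{
    assert(is_power_of_two(N));
    return R / (double)N;
}

/* Time in seconds spanned by N samples: N / R. */
double frame_span_sec(double R, unsigned long N)
{
    assert(R > 0.0);
    return (double)N / R;
}
```

At R = 44100, going from N = 1024 to N = 2048
halves the bin spacing and doubles the time span
of each frame.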


M - window size. If M is larger than N, blocks of
samples will be mixed together before analysis,
resulting in some loss of fidelity, but speeding
up the computation.

D - This is the number of samples skipped between
FFT analysis frames. D is conservatively set to
N/8 but it can be set as large as N under certain
conditions. Compute time is roughly proportional
to the ratio N/D. So if you want
quality, you have to pay.

I - This is the number of samples pumped out
for each FFT frame. If I != D, you have effectively
altered the duration of the sound without changing
its spectrum (much). Your own personal Springer machine.
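
The arithmetic behind D and I is simple enough to
sketch. The helper names below are hypothetical
and the figures ignore edge effects at the start
and end of the sound, so treat the results as
estimates rather than what a given processor will
report.

```c
#include <assert.h>

/* How many analysis frames overlap any one input sample,
   roughly: N / D. */
double overlap_factor(int N, int D)
{
    assert(N > 0 && D > 0);
    return (double)N / (double)D;
}

/* Approximate output length in samples when every D input
   samples produce I output samples. */
double stretched_length(double n_in, int D, int I)
{
    assert(D > 0 && I > 0);
    return n_in * (double)I / (double)D;
}
```

For instance, D = 256 with I = 512 should turn one
second of sound at 44100 into about two seconds.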

P - this is a pitch multiplier for the spectrum.
One of the interesting features of Moore's 
implementation is that the FFT frame may be
resynthesized either with the inverse FFT, or
with a bank of oscillators. Oscillator resynthesis
tends to be more computationally expensive than
inverse FFT resynthesis, but it is also necessary for
processors which radically alter the frequency
content of the spectrum. A non-zero value for
P specifies oscillator bank resynthesis.
For P = 2, the spectrum is shifted up an octave
while the duration remains the same. Groovy.
When using inverse FFT synthesis (P = 0), it may not
be necessary to convert phase to instantaneous
frequency if you pretty much leave the frequencies
alone. This efficiency measure is discussed elsewhere.
See the leanconvert subroutine for details.
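
For the curious, the phase to frequency conversion
can be sketched as follows. This is the common
textbook form, not a transcription of Moore's
code or of leanconvert. The names are invented,
and the sign convention and the handling of the
expected phase advance depend on how the analysis
window is aligned, so the real sources may differ.

```c
#include <assert.h>

#define PV_PI 3.14159265358979323846

/* Wrap an angle in radians into the range [-pi, pi]. */
static double wrap_phase(double x)
{
    while (x > PV_PI)
        x -= 2.0 * PV_PI;
    while (x < -PV_PI)
        x += 2.0 * PV_PI;
    return x;
}

/* Estimate the instantaneous frequency in Hz of bin k from
   its phase in this frame and in the previous frame.
   N = FFT length, D = samples between frames, R = rate. */
double bin_frequency(double phase, double lastphase,
                     int k, int N, int D, double R)
{
    double expected, deviation;

    assert(N > 0 && D > 0);
    /* Phase advance a sinusoid exactly at the bin centre
       would show over D samples. */
    expected = 2.0 * PV_PI * (double)k * (double)D / (double)N;
    /* Whatever is left over is the offset from bin centre. */
    deviation = wrap_phase(phase - lastphase - expected);
    return (double)k * R / (double)N
         + deviation * R / (2.0 * PV_PI * (double)D);
}
```

The estimate is only trustworthy when the true
frequency is close enough to the bin centre that
the leftover phase stays within half a turn, which
is one reason smaller values of D behave better.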

t - this is a multiplier to determine a threshold
below which frequencies are not resynthesized.
It applies only to oscillator bank resynthesis
and can speed up the processor considerably.
A threshold of .001 corresponds to about -60dB.
I've gotten away with values of .01 and higher.
Past a certain point, this creates noticeable
artifacts which you may or may not enjoy.
The actual synthesis threshold is adaptively
recalculated at each FFT frame, relative to the
highest reported amplitude.
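
A minimal sketch of that adaptive threshold, under
some assumptions: the function names are mine, the
amplitudes for one frame are taken to sit in a
plain array, and a bin is kept only if it is
strictly louder than t times the frame's peak.
The actual processors may organize this
differently. (The -60dB figure is just 20 times
the base-10 log of .001.)

```c
#include <assert.h>
#include <stddef.h>

/* Threshold for one frame: t times the loudest amplitude. */
double frame_threshold(const float *amps, size_t nbins, double t)
{
    double peak = 0.0;
    size_t i;

    assert(amps != NULL);
    for (i = 0; i < nbins; i++)
        if (amps[i] > peak)
            peak = amps[i];
    return t * peak;
}

/* Number of bins loud enough to deserve an oscillator. */
size_t count_active_bins(const float *amps, size_t nbins, double t)
{
    double thresh = frame_threshold(amps, nbins, t);
    size_t i, n = 0;

    for (i = 0; i < nbins; i++)
        if (amps[i] > thresh)
            n++;
    return n;
}
```

Fewer active bins means fewer oscillators to run
for that frame, which is where the speedup comes
from.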

s - it is possible for the processors to accept
a stored FFT analysis file. The s flag indicates
that your input is in this format. Some of the
processors only use analysis data, in which case
the s flag is disabled. I have separated the
analysis portion of Moore's program into a
standalone program, pvanal, since I didn't wish to
include the analysis code in every processor.
Please note that analysis files will be larger
than the source files, perhaps considerably
larger, and courtesy may dictate storing
these files sparingly at a shared installation.

Finally, here is an example command-line use of
the above program employing the CARL soundfile
filters fromsf and tosf:

mitosis> fromsf -H crumhorn_master | temper -R44100 -N2048 -M8192 -D1024 -I1024 \
-f.666 | tosf -R44100 -c1 pile_driver

Congratulations! You are now ready to do some
serious frequency-domain hacking. Invade the
subdirectories at will. "We encourage our clients
to make themselves as comfortable as possible."

Bug reports, tapes and denatured texts to:

Eric Lyon
eric@cmlab.keio.sfc.ac.jp