$Header: /u/drspeech/src/rasta/RCS/README,v 1.12 1996/02/27 21:24:21 bedk Exp $
****************************************************
****************************************************
        RASTA 2.2.3 - February 27, 1996 Release
****************************************************
****************************************************

Changes from 2.2.2:

1) Added the create_mapping utility to simplify the creation of the map
   files required for adaptive J-RASTA processing.

2) Updated the rasta man page to warn against doing adaptive J-RASTA
   processing on signals with less than 100 ms of leading non-speech.

****************************************************
****************************************************
        RASTA 2.2.2 - November 14, 1995 Release
****************************************************
****************************************************

Changes from 2.2.1:

1) Bug fix in rasta.c --- fvec_HPfilter could be called before all the
   variables it needed were set.  Specifically, it was called before
   init_param was called.

****************************************************
****************************************************
        RASTA 2.2.1 - July 3, 1995 Release
****************************************************
****************************************************

Changes from 2.1:

1) lpccep.c now has a more accurate routine for converting
   critical band filter outputs to autocorrelations.  This
   means that rasta2_2 will give different results than
   previous versions of the program; however, they are better
   results, so it is OK.

2) rasta2_2 does matlab MAT file I/O.  Use the -b/-B options to
   do this.  You'll need to build rasta with MATLAB=1 and a
   correct value of MATLAB_BASE to get this to work.  You'll
   also need matlab.

3) The input waveform can be filtered with an IIR highpass
   filter (cutoff at 45 Hz) to reduce DC offset.  The -F option
   enables the filter.

4) rasta2_2 can use a previously-stored estimate of the noise
   level and rasta filter history for initialization.  This
   option is recommended for use with Jah-RASTA processing if
   the input can have less than 100--200 ms. of silence at the
   beginning.  The -h and -H options control history
   maintenance.

5) The calls to irint() and log2() have been eliminated to
   improve code portability.

6) Several unused variables in the fvec and svec structures have
   been eliminated.

7) The fmat_x_fvec routine was changed to run faster.
   Thanks to Ralf W. Stephan (<ralf@ark.franken.de>) for
   suggesting changes (6) and (7).

8) rasta now has a man page!

9) Some minor bug fixes and code cleaning, including the
   reduction of a statically allocated 60 MB (!) table.

****************************************************
****************************************************
        RASTA 2.1 - Changes from 2.0
****************************************************
****************************************************

1) Can now output the critical-band filter bank trajectories
   after the cube-root compression and loudness equalization.
   Enabled by the -P option.

2) Can byte-swap input when reading binary data (either on or   
   off-line).  Enabled by the -T option.
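
   As a rough illustration of what byte-swapping 16-bit input
   involves, here is a minimal sketch.  The name swap_shorts is
   hypothetical and is not taken from the RASTA source; the code
   behind the -T option may be organized differently.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the RASTA source): swap the two bytes
   of each 16-bit sample in place, e.g. to read big-endian data on a
   little-endian machine. */
void swap_shorts(int16_t *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        uint16_t u = (uint16_t)buf[i];
        buf[i] = (int16_t)(uint16_t)((u << 8) | (u >> 8));
    }
}
```

   Swapping twice restores the original samples, so the same routine
   serves in both directions.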

****************************************************
****************************************************
        RASTA 2.0 - May 30, 1994 Release
****************************************************
****************************************************

	RASTA 2.0 is an updated version of RASTA 1.0.
RASTA 1.0 is a program for rasta-plp processing and
it supports the following front-end techniques: PLP,
RASTA, and Jah-RASTA with fixed Jah-value. The Jah-Rasta
technique handles two different types of harmful effects
for speech recognition systems, namely additive noise and
spectral distortion, simultaneously, by bandpass filtering
the temporal trajectories of a non-linearly transformed
critical band spectrum. Since the optimal form of the
nonlinearity used in Jah-Rasta is dependent on the noise
level, this non-constant nonlinearity introduces a new
source of variability into the speech recognition system.
An approach was developed for compensating for this new
source of variability by Joachim Koehler and Grace Tong.
It will be referred to as the spectral mapping method from
now on. RASTA 2.0 is an updated version of RASTA 1.0 with
the spectral mapping method included. The spectral mapping
method will be further discussed in the file rasta.c.
	The J factor depends on the SNR. The J that we have
found to work well is something like 1/(3 * noisepower). 
In order to calculate the J factor adaptively, a noise 
estimation subroutine is also included in RASTA 2.0.
In the current version, the J factor can be either kept as
a constant by entering the value at the command line or
calculated adaptively using the formula J = 1/(3 * noisepower).
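
The formula above can be written out as a small sketch.  The name
compute_j and the fallback argument are assumptions made for this
illustration; they are not routines or options of the program, and
the noise estimation itself (noise_est.c) is not shown.

```c
/* Sketch only: compute_j is a hypothetical name, not a RASTA routine.
   Returns J = 1 / (3 * noisepower).  If the noise power estimate is
   not positive, returns fallback_j instead (an assumption made here
   to avoid dividing by zero). */
double compute_j(double noisepower, double fallback_j)
{
    if (noisepower <= 0.0)
        return fallback_j;
    return 1.0 / (3.0 * noisepower);
}
```

For example, a noise power of 1.0e6 gives a J of about 3.3e-7.
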
	There are two new files, mapping.c and noise_est.c 
added to RASTA 2.0. These two files handle the spectral 
mapping and noise_estimation respectively. Besides these
two files, a number of files from RASTA 1.0 were also 
slightly modified. They were modified to incorporate the 
Jah-Rasta spectral mapping and to correct some newly found
bugs in RASTA 1.0. The files that were modified are listed
below:

1) anal.c
2) debug.c 
3) init.c  
4) io.c
5) rasta.c
6) rasta_filt.c
7) Makefile
8) functions.h
9) rasta.h

The bugs that were corrected are:

1) In init.c, get_comline(),
    the following is added:
        case 'd':/* flag for debug output */
                pptr->debug = TRUE;
                break;

2) In init.c check_arg()
    the following is added:
        if((pptr->lrasta == TRUE) && (pptr->jrasta == TRUE))
        {
                fprintf(stderr,"Can't do log rasta and jah rasta at the same time\n");
                exit(-1);
        }

3) In rasta_filt.c   struct fvec *rasta_filt()

   if(pptr->rfrac != 1.0)
   {
      outptr->values[i] =
                 pptr->rfrac * outptr->values[i]
                + (1-pptr->rfrac) * nl_audspec->values[i];
   }


4) Previously, for doing partial rasta and partial PLP, there
   were 2 switches: one as a flag to indicate that we are doing
   partial rasta, and the other to say what fraction of the output
   comes from the rasta part. Some users complained that the
   2 switches were confusing. Now we have just the latter
   switch. This makes more sense, since if you want the output
   to be blended you need to give a fractional amount anyway.
  
5) The two makefiles have been merged to form one. Use e.g. "make
   ESPS=1 ESPS_BASE=/usr/local/esps" to build the version with ESPS
   IO.
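
The blending in items (3) and (4) above can be sketched as a
standalone function.  This is an illustration only: blend_rasta and
its plain float arrays are hypothetical, whereas the program itself
works on fvec structures inside rasta_filt().

```c
/* Sketch: mix rasta-filtered values with the unfiltered ones.
   rfrac is the fraction of the output that comes from the rasta
   part; rfrac == 1.0 leaves the filtered values untouched. */
void blend_rasta(float *filtered, const float *unfiltered,
                 int n, float rfrac)
{
    int i;

    if (rfrac == 1.0f)
        return;
    for (i = 0; i < n; i++)
        filtered[i] = rfrac * filtered[i]
                    + (1.0f - rfrac) * unfiltered[i];
}
```

With rfrac set to 0.5, each output value is the average of the
filtered and unfiltered values.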

Listed below is the old README file from RASTA 1.0 : 








****************************************************
****************************************************
	RASTA 1.0 - August 13, 1993 Release
****************************************************
****************************************************
	This directory holds the current debugged (at least we think so) 
release materials for the rasta-plp program. This code comes from 
(in ancient history) a Fortran source written by Hynek Hermansky to 
implement PLP.  Later versions done by H. Hermansky and C. Wooters used 
automatically translated C code and hand modifications to add in the 
RASTA processing.  This version (by N. Morgan) is pretty much 
a complete rewrite, splitting up analysis elements into reasonably-sized 
modules. A few routines from the past linger on with little change
(e.g., the fft and the Durbin recursion) because they work fine.
Suspect data structures (e.g., most arrays with short fixed lengths)
have been changed to dynamically allocate what is needed and to
check for array bounds.

In this directory you will find 2 sub-directories, one for
source files and one for the binary. 

See the front of "rasta.c" for a description of the algorithm.

**************************************************
	WHAT IS AND ISN'T HERE
**************************************************
	What is frozen here in reasonable form
is the most debugged version of the rasta algorithm.
Some of the things that we are currently experimenting with
are not yet included, since they are not yet sufficiently
mature for release (although we will be happy to talk
about them with you). However, it should be relatively easy
for you to modify the program to play with some of these ideas.
Some of the unimplemented pieces are:

	1) General linear filters - what we have implemented is
a ``standard'' rasta filter, which consists of a delta
calculation followed by a single-pole integration.
We have used data structures in the core filtering
routine that permit a more general filtering function
consisting of numerator and denominator polynomials,
but as of this release we don't read in such files
or allow for general polynomials in the routine that calls
the filt() function. 

	2) Similarly, we currently use the same rasta
filter for each band. Putting in the modifications for (1)
will also permit more flexibility in this regard.

	3) Nonlinear modifications within the filtering
are potentially possible (such as suppression of small
changes, or median filtering) but have not yet been tested.

	4) Noise estimation - we currently do not
include any adaptive (or non-adaptive) noise estimation.
Thus the J factor that can be entered at the command line
must be separately estimated. We have successfully estimated
this quantity on-line for some of our experiments, but
the code is not yet in shape and tested for a more general
facility. The J that we have found to work well is something
like 1 / (3 * noisepower), but it is possible that
for a new type of task a different constant would be needed. 
(See the commentary in rasta.c for a discussion of the potential
pitfalls in using an adaptive J).
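
	To make item (1) concrete, here is a minimal sketch of a
delta calculation followed by a single-pole integration, applied to
one band.  Everything in it is an assumption made for illustration:
the band_state structure, the name rasta_step, and the 5-tap delta
coefficients are not copied from rasta_filt.c, and the pole is left
as an argument.  Consult the source for the values the program
actually uses.

```c
#define DELTA_TAPS 5

/* Hypothetical per-band state; not the structures in rasta_filt.c. */
struct band_state {
    float in_hist[DELTA_TAPS];  /* recent inputs, newest first */
    float prev_out;             /* previous filter output */
};

/* One time step for one band:
   y[n] = sum_k b[k] * x[n-k] + pole * y[n-1]
   The taps b[] form a slope (delta) estimate; the pole integrates. */
float rasta_step(struct band_state *s, float x, float pole)
{
    /* Illustrative delta taps (an assumption); they sum to zero. */
    static const float b[DELTA_TAPS] =
        { 0.2f, 0.1f, 0.0f, -0.1f, -0.2f };
    float acc = 0.0f;
    int k;

    /* Shift the input history and store the new sample. */
    for (k = DELTA_TAPS - 1; k > 0; k--)
        s->in_hist[k] = s->in_hist[k - 1];
    s->in_hist[0] = x;

    /* Delta calculation. */
    for (k = 0; k < DELTA_TAPS; k++)
        acc += b[k] * s->in_hist[k];

    /* Single-pole integration. */
    s->prev_out = acc + pole * s->prev_out;
    return s->prev_out;
}
```

	Because the delta taps sum to zero, a constant input decays
toward zero at the output, which is the band-pass behavior that
suppresses slowly varying components in each band.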

CAUTION: Even with the current code, you can get in trouble by
using the J-RASTA option if you don't measure the noise,
since the default J value we use is just something that worked
for one experiment at ICSI; you might have a radically different
noise, or different gain or #bits in the A/D, etc. In general,
it is best not to just run rasta with default values unless you
look at what they are and see if they match your problem
(e.g., for sampling rate). Just type the program name to
see the options.

**************************************************
	SUPPORT
**************************************************

	There is none.

	This is public domain code, and we cannot
fix your modifications or even guarantee fixing
bugs in this release. However, if you find
improvements, bug fixes, etc., please do share them
with us so that we can (if possible) include them 
in the next release. Send them to
	
	morgan@icsi.berkeley.edu

**************************************************
	A BIT OF PHILOSOPHY
**************************************************
	While some users may find the existing program useful
as is (we have put in quite a few command line options to
encourage this), this package is intended as a research tool.
Therefore it is likely that many users will want to hack at it.
The program has been split up into reasonably-sized modules for this
purpose. Some users will modify or replace various
files in the source directory, and the structure of the code has
been made as regular as possible to facilitate this, in particular
to facilitate modification of the speech analysis steps.
While we obviously can't guarantee that all such mods will work,
we can make this more likely by a couple of RASTA 1.0
coding tips:

1) Use a coding model - each of the core analysis steps
looks very similar, and if one of them is replaced,
try to follow the same model. The general pattern is that each
analysis routine is called in rastaplp(), and that each one
is passed a pointer to a read-only structure of parameter
values (such as sampling rate) and a pointer to an "fvec"
structure, which contains a float array and its integer length.
The routine then processes this data as you wish, and
returns a pointer to an output fvec. Within the routine
itself, analysis tables are initialized and space is allocated
(but only the first time the routine is called).
Note that this is always a frame-based analysis in the current
implementation. Look at nl_audspec.c, for instance, for a
simple example of such a routine.

2) Use the fvec/fmat utility routines - in fvecsubs.c are several
relevant routines. Whenever possible,
use fvecs rather than simple arrays of floats, as the relevant
routines allow you to pretty painlessly allocate space,
bounds-check array references, copy, etc.
You also can use show_vec() in debugging, which saves you
a bit of time in writing fprintf lines.
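
The coding model in tip (1) can be sketched as follows.  The two
structures are simplified stand-ins for the real definitions in
rasta.h; the field floor_value and the routine floor_step are
hypothetical and exist only for this example.  A real module would
normally use the allocation routines in fvecsubs.c rather than
calling malloc directly.

```c
#include <stdio.h>
#include <stdlib.h>

/* Simplified stand-ins for the definitions in rasta.h. */
struct fvec  { float *values; int length; };
struct param { float floor_value; };      /* hypothetical field */

/* Hypothetical analysis step: clamp each value from below.
   Takes read-only parameters and an input fvec, and returns a
   pointer to an output fvec that is allocated on the first call. */
struct fvec *floor_step(const struct param *pptr, struct fvec *in)
{
    static struct fvec *outptr = NULL;
    int i;

    if (outptr == NULL) {                 /* first call only */
        outptr = malloc(sizeof *outptr);
        if (outptr == NULL) {
            fprintf(stderr, "floor_step: out of memory\n");
            exit(-1);
        }
        outptr->values = malloc((size_t)in->length * sizeof(float));
        if (outptr->values == NULL) {
            fprintf(stderr, "floor_step: out of memory\n");
            exit(-1);
        }
        outptr->length = in->length;
    }
    if (in->length != outptr->length) {   /* frame size must not change */
        fprintf(stderr, "floor_step: length mismatch\n");
        exit(-1);
    }
    for (i = 0; i < in->length; i++)
        outptr->values[i] = (in->values[i] < pptr->floor_value)
                          ? pptr->floor_value : in->values[i];
    return outptr;
}
```

Since the output is allocated once and reused, every call returns
the same pointer; copy the values if an earlier frame is still
needed.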
	
**************************************************
	SRC FILES
**************************************************

What follows is a list of the relevant files for the rasta program,
(all found in src) along with a brief description for each.

Makefile - the usual compilation script. Type "make" to build the new
	executable, and "make print" to print out the source.
	Copy in non_ESPS_Makefile if you don't have ESPS.

**************************************
	GOOD GENERAL READING
**************************************
rasta.h - the header file, including definitions for the 2 main
	data structures used: an fvec, which is a floating point vector
	with an associated integer vector length, and a parameter
	structure, which is the collection of all analysis parameters
	such as window size. It also defines upper-case constants that
	are used throughout the routines.

functions.h - a header file with globally accessible function
	prototypes (i.e., showing the calling argument types).
	Note that fft.c, rasta.c, lpccep.c, and audspec.c all
	have some local function prototypes not listed in this file.
	
rasta.c - the main program, including a front comment that
	briefly explains the algorithm and a few more things about the
	code.

init.c  - routines to initialize things, such as setting initial
	values for parameters and reading the command line to possibly
	update these values; a usage printing routine that is accessed
	by typing the command without arguments; and a routine to
	compute a couple of required parameters such as the number of
	analysis frames.

**************************************
	ANALYSIS ROUTINES
**************************************
anal.c  - the main calling analysis routine that is used for
	each new frame, along with a couple of routines that
	do windowing and fill the frame structure with data.

powspec.c - a routine to compute the power spectrum for a frame.
	This mostly computes a couple of parameters and then calls fft,
	but it is separate so that later mods can alter this step.

audspec.c - a file with routines to compute an auditory spectrum.
	In current form, it computes critical band ranges in the fft
	and weights for the integration within band (the first time 
	it is called) and then it uses this information to integrate
	the power spectrum into equivalent critical band powers.

nl_audspec.c - a file to compute a compressive nonlinearity, used 
	to get the auditory spectrum into an appropriate 
	domain for rasta filtering.

rasta_filt.c - a file that does the rasta filtering.

inverse_nl.c - a file to compute an inverse nonlinearity, used
	to get back to the auditory-like spectral domain after
	rasta filtering.

post_audspec.c - a file with routines to equalize and compress
	 an auditory spectrum. 

lpccep.c - a file to approximate the auditory-like spectrum 
	by the spectrum of the all-pole model and
	compute its cepstrum using an inverse DFT, 
	autocorrelation-based LPC analysis, 
	and a recursion to compute cepstra. A lot of the
	core recursions are relatively unchanged from the fortran.

fft.c	- old, ugly, but efficient code to compute the fft for a vector;
	actually the current form of the routine returns the power
	spectrum in the form of the magnitude squared of the fft.

fvecsubs.c - a file with basic routines for allocation and
	some arithmetic for float vectors (fvec structure) and matrices
	(fmat structure). It includes an fvec_check routine that
	makes sure that the array element you are planning to
	access (read or write) is within the allocated space.
	This seems to take about 1/2 microsecond on a Sparc 10,
	so you might not generally call it in the inner loop
	but it is nice to use at least once for a loop when you
	can figure out what the largest array index is going to be.
	For things out of the inner loop it is simplest to
	just use it whenever you access an array element if
	the fvec pointer has been passed to you from elsewhere.
	This may seem like extra work, but it can save you
	a lot of time tracking down an overwriting-type bug.

svecsubs.c - similar to fvecsubs, but for shorts; a file with 
	basic routines for allocation and
	some arithmetic for short vectors (svec structure).
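
The advice given above for fvecsubs.c, checking the largest index
once per loop instead of on every access, can be sketched like
this.  The structure is a simplified stand-in for the one in
rasta.h, and index_ok and sum_first are hypothetical names; the
real fvec_check may take different arguments and behave
differently on failure.

```c
#include <stdio.h>

struct fvec { float *values; int length; };  /* simplified stand-in */

/* Hypothetical check in the spirit of fvec_check: returns 1 if
   index lies inside the vector, otherwise reports it and returns 0. */
int index_ok(const char *caller, const struct fvec *vec, int index)
{
    if (vec == NULL || vec->values == NULL
        || index < 0 || index >= vec->length) {
        fprintf(stderr, "%s: index %d is out of range\n",
                caller, index);
        return 0;
    }
    return 1;
}

/* Sum the first n elements, checking only the largest index, once,
   before entering the loop.  Returns 1 on success, 0 on failure. */
int sum_first(const struct fvec *vec, int n, float *sum)
{
    int i;

    *sum = 0.0f;
    if (n <= 0)
        return 1;
    if (!index_ok("sum_first", vec, n - 1))
        return 0;
    for (i = 0; i < n; i++)
        *sum += vec->values[i];
    return 1;
}
```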

**************************************
	GOTTA HAVE I/O
**************************************	
io.c    - routines to open input and output files, and to read
	them in one of a number of specified forms; as of 7/21/93,
	the input forms supported are binary shorts, ascii, and esps
	formatted files. The output can be binary floats,
	ascii, or esps files. Note: if you do not have an ESPS
	license, you will want to use the alternate Makefile,
	non_ESPS_Makefile; then if you unwittingly try to
	use ESPS input or output you will get tossed out of the program.

	If you use binary inputs from stdin, you can run in online
	mode. In this mode, there is no preliminary step of reading
	in the whole input file; instead, you just read as you process,
	and it goes on until the fread fails, for instance at
	the end of file. The routine to do this kind of
	input, get_online_bindata(), is also in this file.

debug.c - printing routines so you can look at things if something gets
	messed up. Some are rather specific to looking at particular
	structures, while show_vec() prints out the length and values
	for the fvec pointed to by the argument.
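
The online mode described for io.c above, where input is read as
it is processed until fread fails, can be sketched as follows.
This is not get_online_bindata() itself: the name count_frames,
the fixed buffer size, and the use of non-overlapping frames are
all simplifications made for this illustration.

```c
#include <stdio.h>

#define MAX_FRAME 1024

/* Hypothetical sketch: read binary shorts from fp one frame at a
   time until a full frame can no longer be read (end of file or a
   read error).  Returns the number of complete frames read, or -1
   if frame_len is not usable. */
long count_frames(FILE *fp, size_t frame_len)
{
    short buf[MAX_FRAME];
    long frames = 0;

    if (frame_len == 0 || frame_len > MAX_FRAME)
        return -1;
    while (fread(buf, sizeof(short), frame_len, fp) == frame_len) {
        /* per-frame analysis would go here */
        frames++;
    }
    return frames;
}
```

Passing stdin as fp gives the read-as-you-process behavior; no
pass over the whole input is needed before analysis starts.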
 
*****************************************************
        ACKNOWLEDGEMENTS
  
        Thanks to Bill Byrne for his help with the ESPS interface;
  to Chuck Wooters, Joachim Koehler, and especially Phil Kohn for 
  critical comments; to Jordan Cohen for providing the environment
  in which I wrote this code; to ICSI and UC Berkeley for providing
  my financial support during this time; and most of all to Hynek
  Hermansky for being at the source of all the basic ideas.
