-- (C) Copyright International Business Machines Corporation 23 January 
-- 1990.  All Rights Reserved. 
--  
-- See the file USERAGREEMENT distributed with this software for full 
-- terms and conditions of use. 
File: README
Author: Andy Lowry
SCCS Info: @(#)README	1.5 3/13/90

		How To Build the Hermes Code Generator



1. Theory

The Hermes code generator is written in Hermes.  This creates a
chicken-and-egg situation: in order to compile Hermes process modules
(including those making up the code generator), a Hermes code
generator is needed!

In order to resolve this problem, a bootstrapping procedure has been
carefully worked out that allows a code generator to be built without
a working code generator being available.  The key pieces to the
puzzle are the following:

	- An "assembler" that can produce .lo files (the "object"
	files in the Hermes world) from .li files (Hermes assembler
	sources).  The assembler is written in C, compilers for which
	have existed since before the beginning of time.
	- A mode in which the code generator can produce .li files as
	a by-product of code generation.
	- An initial code generator created by the Hermes codegen god
	(Van "Vicious" Nguyen) before the beginning of time, in the
	form of .li files.

The first Hermes code generator was generated from the .li files
handed down from god.  These sacred files have been preserved so that,
should we humans ever so totally screw up the Hermes world that it
cannot otherwise be saved, we will still be able to rebuild from
scratch.

Other than the initial build from the sacred .li files, the build
process is iterative, with each build producing the starting point
for the next build, as follows:

	1. Starting with a complete set of .li files (the sacred first
	set or the results of a prior build (see step 4)), construct a
	code generator.  This is the "phase 1" code generator.

	2. Using the phase 1 code generator, build another code
	generator from the Hermes codegen sources.  This is the "phase
	2" code generator.

	3. Using the phase 2 code generator, build another code
	generator from the Hermes codegen sources.  This is the "phase
	3" code generator.

	4. Using the phase 3 (or phase 2) code generator, generate .li
	assembler source files from all the Hermes sources, to form
	the basis of a future phase 1 build.

Normally, the phase 3 code generator would be installed as the "floor"
version.  The phase 1 code generator is not a full-featured code
generator, as it lacks some functionality and does not produce as
high-quality code as do the phase 2 and phase 3 code generators.  This
is because in step 4, non-crucial portions of the code are left out of
the translation.  This produces simpler .li files, and therefore
simplifies the task of directly debugging the .li files, should that
ever be necessary.

The phase 2 code generator is full-function, having been generated
from the Hermes sources.  However, since it was compiled by the
less capable phase 1 code generator, which produces lower-quality
code, it is probably too slow for use as a floor version.

Note that it is not always necessary to perform step 4, since the
phase 1 code generator need not be full-function.  For example, if the
Hermes codegen sources are modified to include a new sort of
optimization, the new code will probably be suppressed in step 4
anyway, so step 4 can be safely skipped.  However, since the phase 1
code generator depends on common definitions modules, step 1 often
cannot be skipped even if none of the assembler sources have changed.

2. The Directories

The build process is carried out in a self-contained tree of
directories.  We will refer to the root of the tree as 'codegen'.  In
the Hermes system as distributed, this directory is $HERMES/codegen,
where $HERMES names the Hermes system root directory.

The codegen directories are laid out as follows:

                  Hermes System Root
                           |
            ----------- codegen -----------
           /              /|\              \
          /       -------- | --------      SCCS
         p0      /         |         \
        /       p1         p2         p3
      SCCS      |
                li

The functions of the directories are as follows:


    codegen -- Contains the Hermes codegen sources (under SCCS
	control).  Object files that are shared by multiple build
	phases are also generated in this directory.  Perform a 'make'
	here before you attempt any of the individual build phases.

    codegen/p0 -- Contains the sacred assembler sources (.li files).
	Do a 'make' in this directory (following a 'make' in codegen)
	in order to produce a true bootstrap version of the code
	generator.  N.B. Be on the lookout for compatibility problems
	between the p0 files and the current interpreter and interface
	definitions!  Though we will probably attempt to prevent these
	from creeping in, p0 is out of the normal build loop, and is
	therefore prone to be out of date.

    codegen/p1 -- Contains assembler sources produced from a prior
	build.  Do a 'make' in codegen/p1 to produce the phase 1 code
	generator.

    codegen/p1/li -- This is the work directory where assembler
	sources are produced from the Hermes sources.  Do a 'make' to
	generate the .li files; do a 'make install' to move them into
	codegen/p1 in anticipation of another build.

    codegen/p2 -- Contains no source files.  This is the work
	directory where the phase 2 code generator is normally built.

    codegen/p3 -- Contains no source files.  This is the work
	directory where the phase 3 code generator is normally built.
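
The directory roles above suggest the order of commands for one full
build cycle.  The following is only a sketch under assumptions: the
function name build_cycle and the MAKE override are illustrative and
not part of the distribution, and it assumes that a plain 'make' is
the right target in p2 and p3 (the text above states this explicitly
only for codegen, p1 and p1/li).

```shell
# build_cycle ROOT: run one full build cycle under the codegen root ROOT.
# Hypothetical helper; set MAKE to substitute another make program.
build_cycle() {
    root=$1
    make_cmd=${MAKE:-make}

    # Shared objects must exist before any individual build phase.
    (cd "$root"    && "$make_cmd") || return 1
    # Step 1: phase 1 code generator, assembled from the .li files in p1.
    (cd "$root/p1" && "$make_cmd") || return 1
    # Steps 2 and 3: rebuild from the Hermes sources using the prior phase.
    (cd "$root/p2" && "$make_cmd") || return 1
    (cd "$root/p3" && "$make_cmd") || return 1
    # Step 4 (not always necessary): regenerate the .li files and
    # install them into p1 for a future phase 1 build.
    (cd "$root/p1/li" && "$make_cmd" && "$make_cmd" install) || return 1
}
```

With the standard layout this would be invoked as
'build_cycle $HERMES/codegen'.  The p0 directory is deliberately left
out, since it is outside the normal build loop.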


The p0 directory contains several hand-written source files and its
own individualized make file, all under SCCS control.  The other
subdirectories do not contain any hand-produced files, so completely
erasing their contents should never be a disaster.  The minimum
contents they require before builds can be undertaken are as
follows:

    p1:
      Makefile -- This is normally a symbolic link to the file
	codegen/Makefile.libuild, which is maintained under SCCS
	control in the codegen directory.
    p1/li:
      Makefile -- This is normally a symbolic link to the file
	codegen/Makefile.ligen, which is maintained under SCCS control
	in the codegen directory.
      phase1 -- This is a (symbolic link to a) directory where the
	generated .li files should be installed.  Normally,
	codegen/p1/li/phase1 is a symbolic link to codegen/p1.
    p2, p3:
      Makefile -- This is normally a symbolic link to the file
	codegen/Makefile.hbuild, which is maintained under SCCS
	control in the codegen directory.
      priorphase -- This should be a (symbolic link to a) directory
	containing a complete Hermes code generator.  Normally,
	codegen/p2/priorphase is a link to codegen/p1, and
	codegen/p3/priorphase is a link to codegen/p2.
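
If the work directories have been erased, the minimum contents listed
above could be recreated along the following lines.  This is a sketch,
not the distributed procedure: the function name link_codegen_dirs is
hypothetical, and the use of relative link targets is an assumption
(the text above only says what each link points to).

```shell
# link_codegen_dirs ROOT: recreate the minimum contents of p1, p1/li,
# p2 and p3 under the codegen root ROOT.  Hypothetical helper, meant
# for freshly emptied directories (it fails if a link already exists).
link_codegen_dirs() {
    root=$1
    mkdir -p "$root/p1/li" "$root/p2" "$root/p3" || return 1

    # p1: make file for building from .li files.
    ln -s ../Makefile.libuild  "$root/p1/Makefile"    || return 1
    # p1/li: make file for generating .li files, plus the install target.
    ln -s ../../Makefile.ligen "$root/p1/li/Makefile" || return 1
    ln -s ..                   "$root/p1/li/phase1"   || return 1
    # p2, p3: make file for building from Hermes sources.
    ln -s ../Makefile.hbuild   "$root/p2/Makefile"    || return 1
    ln -s ../Makefile.hbuild   "$root/p3/Makefile"    || return 1
    # Each phase builds with the code generator of the phase before it.
    ln -s ../p1 "$root/p2/priorphase" || return 1
    ln -s ../p2 "$root/p3/priorphase" || return 1
}
```

Relative targets keep the links valid if the whole codegen tree is
moved; absolute targets would work equally well for a fixed
installation.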

Note that none of the make files for any of the phases depends on the
above directory layout or naming conventions.  Each make file includes
a macro definition for the codegen root directory and other required
directories and/or files.  Thus, for example, a directory called
'testp1' could be created (as a part of the codegen directory tree or
not), and after all the files in codegen/p1 had been copied there,
building could proceed normally.  Similarly, any directory with a copy
of (or link to) codegen/Makefile.hbuild and an appropriate link named
'priorphase' could be used to build a code generator from Hermes
sources.
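
As an illustration of the last point, a test build directory outside
the standard layout might be prepared as follows.  The function name
new_hbuild_dir and the directory names in the usage note are
hypothetical; the sketch assumes only what is stated above, namely
that a link to codegen/Makefile.hbuild and a 'priorphase' link are
sufficient.

```shell
# new_hbuild_dir DIR CODEGEN PRIOR: prepare DIR for building a code
# generator from the Hermes sources.  CODEGEN is the codegen root and
# PRIOR a directory holding a complete code generator.  Hypothetical
# helper; CODEGEN and PRIOR must be absolute paths, because a relative
# link target would be resolved relative to DIR.
new_hbuild_dir() {
    dir=$1 codegen=$2 prior=$3
    case $codegen in /*) ;; *) echo "CODEGEN must be absolute" >&2; return 1 ;; esac
    case $prior   in /*) ;; *) echo "PRIOR must be absolute"   >&2; return 1 ;; esac
    mkdir -p "$dir" &&
    ln -s "$codegen/Makefile.hbuild" "$dir/Makefile" &&
    ln -s "$prior" "$dir/priorphase"
}
```

For example, 'new_hbuild_dir /tmp/testp2 $HERMES/codegen
$HERMES/codegen/p1' followed by a 'make' in /tmp/testp2 would
correspond to a phase 2 build in a nonstandard directory.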

The phase1 and priorphase links are used in the above scheme, rather
than variables in the make files, in order to minimize (and in most
cases eliminate) the customizations required in make files to build
test versions in nonstandard directories.  The only directories that
are defined by make file variables are the codegen root and various
Hermes system directories (which are assumed to be up-to-date prior
to any codegen build).


3. Shared Things

In general, any file that takes part in more than one build phase
resides in the codegen root directory.  This includes all of the
following:

    - All the Hermes sources, including process modules and
	definitions modules
    - Compiled definitions modules (.do files)
    - A single compiled process module: listuff.lo

Because the compiled definitions modules are shared, they are built
when a 'make' is performed in the codegen root directory.  Any other
phase that needs these .do files establishes symbolic links
automatically when a 'make' is performed.

The listuff.lo file is a bit of a special hack.  It belongs in the
codegen root directory because it is needed by all three phases.
However, all the necessary make file support for assembling .li files
into .lo files appears in the phase 1 make file.  Repeating all that
support in the root make file would be tedious and error-prone.
Instead, the listuff.lo file is actually generated as part of the
phase 1 build process, but it is moved into the codegen root directory
after it is built.  As with .do files, all phases establish symbolic
links to codegen/listuff.lo as part of their build procedure.  The
listuff sources, like all other handwritten sources (other than the
sacred p0 sources), are kept in the codegen root directory.
