Changes between release v1.2.1 (June 12, 1995) and v1.2:

- Changes to the Cache:
	- Added IP-based access control.
	- Added more sophisticated setting of the TTL based on HTTP headers.
	- Added support for user-configurable periodic garbage collection.
	- Added more statistics information.
	- Added support for the HTTP HEAD request method.
	- Added support for user-configurable stoplist.
	- If cached.conf lists a parent/neighbor with udp_port set to echo (7),
	  cached just bounces ECHO messages off it; a reply is treated as a hit.
    	- Added high/low water marks for disk storage.
	- Changed trace mail into a cached.conf option.
	- Changed default FTP MIME type to application/octet-stream.
	- Fixed bug preventing some URLs from working.
	- Fixed bug with DNS lookups on IP numbers.
	- Cleaned up and updated cached.conf.
	- Fixed various minor bugs.
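
The high/low water mark entry above can be illustrated with a minimal
sketch. It assumes the usual meaning of such marks: once disk usage passes
the high mark, objects are evicted until usage drops to the low mark. The
function name, the percentages, and the oldest-first eviction order are
assumptions made for illustration; they are not taken from cached.

```python
from collections import OrderedDict


def evict_to_low_water(store, capacity, high=0.90, low=0.75):
    """Evict oldest entries once usage exceeds the high water mark.

    store:    OrderedDict mapping URL -> object size in bytes, oldest first.
    capacity: total disk budget in bytes.
    Returns the list of evicted URLs.

    All names and thresholds are hypothetical, not cached's real settings.
    """
    used = sum(store.values())
    evicted = []
    if used <= high * capacity:
        return evicted  # still below the high mark: nothing to do
    # Above the high mark: free space until usage reaches the low mark.
    while store and used > low * capacity:
        url, size = store.popitem(last=False)  # remove the oldest entry
        used -= size
        evicted.append(url)
    return evicted
```

Using two marks rather than one avoids running garbage collection on every
store operation once the disk is nearly full.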

- Miscellaneous changes:
	- Incorporated fixes for FreeBSD port from ted@oz.plymouth.edu

##############################################################################
Changes between release v1.2 (April 3, 1995) and v1.1:

- Changes to the Broker:
	- Major performance improvements to the collector interface.
	- Added fast, efficient internal Gatherer ID management.
	- Added support for clients requesting attributes with #attribute.
	- Added support for log file rotation, and terse logging.
	- Added support for #operation in query manager interface.
	- Cleaned up the log file format.
	- Cleaned up the administrative interface.
	- Cleaned up the UNIX file system-based storage manager.
	- Fixed major bug with WAIS support.
	- Fixed file descriptor leaks in glimpseserver when the index
	  contained files that had since been deleted.
	- Fixed bug with overflowing lines from glimpse.
	- Fixed bug with hostname initialization.
	- Fixed memory leak with the Description-Tag attribute matching.
	- Fixed various minor bugs.

- Changes to the Cache:
	- Added httpd accelerator support.
	- Added IP number logging.
	- Added setuid() to a user when cached is run as root.
	- Added support for HTTP servers that die abruptly.
	- Added client_timeout which places a hard limit on the life
	  of incoming connections on the ascii port, or on outgoing
	  HTTP or Gopher clients.
	- Cleaner implementation for retrieving FTP URLs via ftpget.pl.
	- Tries to write cached.pid file in same directory as cached.conf.
	- Changed FTP support to sacrifice correct HTTP headers for 
	  dramatically decreased latency for large FTP objects.
	- Fixed ftpget.pl -htmlify to determine directory vs. file 
	  correctly and send HTTP header as soon as possible.
	- Fixed rare core dump during HTTP xfers.
	- Fixed how the error messages are printed.
	- Better support for larger file descriptor tables.
	- Debug levels 0 and 1 now log timestamps.
	- Cleaned and updated defaults for cached.conf.
	- When run as root, cached changes its current directory to its swap
	  directory after calling setuid().  The swap directory is almost
	  certainly writable by cached, so it can write a core file if it crashes.
    	- Minor modification of store error message.
	- Remote client connection resets are handled as soft errors.
	- Strips an extra \r\n from MIME.
	- Added hierarchy log (yet another log, but it's optional).
	- Periodically hunts for zombie processes.
	- Added more information to the stat interface.
	- Cleaned up info data for improved parsability/readability.

- Changes to the Gatherer:
	- Added support to follow HTTP redirection pointers.
	- Added support for $http_proxy environment variable in liburl.
	- Added support for summarizing SGML data.
	- Added better support for summarizing TeX data.
        - Added support for summarizing RTF and MIF data, using Rainbow
          software provided by EBT, which we make available in our new
          components distribution
	- Added support for summarizing WordPerfect 5.1 data.
        - Changed HTML summarizing to use SGML summarizer, providing more
          easily customizable results.
        - Added support for local filesystem gathering for NNTP.
	- Improved incremental gathering support, and integrated the
	  support into the Essence program (removed dbcheck program).
	- Added support for "fake" MD5 generation per SOIF object on 
	  external presentation unnesting streams (exploders) -- 
	  permits incremental gathering on data generated by an Exploder.
	- Added --memory-efficient to Essence to trade time for memory
	  efficiency; this helps users who have limited memory resources
	  but are dealing with large SOIF objects.
	- Added --confirm-host to Essence for explicit host DNS validation.
	- Added --max-refresh to Essence to limit refreshing activity.
	- RootNode enumerators generate RFC 1738 escaped URLs.
	- Improved performance of SOIF parsing.
	- Fixed bug in locating gzip in gatherd.
	- Fixed bug in the unnesting commands in Essence.
	- Fixed bug with HTTP/1.0 requests, now sends encoded URIs for GETs.
	- Fixed ftp.pl for Solaris.  Wasn't setting PF_INET correctly.
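
As an illustration of the RFC 1738 escaping mentioned above, the sketch
below percent-encodes unsafe characters in a URL path. It uses Python's
standard urllib.parse.quote, which follows the later RFC 3986 and escapes a
somewhat larger set of characters than RFC 1738 strictly requires. The
helper name is hypothetical and this is not the enumerators' own code.

```python
from urllib.parse import quote


def escape_url_path(path):
    """Percent-encode unsafe characters in a URL path.

    "/" is kept as the path separator; spaces, "#", and other unsafe
    characters become %XX escapes.  Hypothetical helper for illustration.
    """
    return quote(path, safe="/")


print(escape_url_path("/pub/my file#1.txt"))  # prints /pub/my%20file%231.txt
```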

- Changes to the Replicator:
	- Updated with USC's version from 3/15/95

- Changes to the User's Manual:
	- Added sections for new plug'n'play components: standard,
	  SGML, HTML, MIF, RTF, WordPerfect 5.1.
	- Updated support policy.
	- Added clarification in Local Gatherering section.
	- Added clarification in RootNode enumeration section.
	- Added clarification on Gatherer/Broker information flow.
	- Added clarification for some cached internals.
	- Added section on upgrading from v1.1 to v1.2.
	- Added discussion about httpd_accel for cached.
	- Updated info about software for the replicator section.
	- Updated numerous facts to v1.2.
        - Reorganized essence/content extraction customization section.
        - Added description of SGML summarizing and components distribution
          (including Rainbow software for MIF and RTF formats)
        - Added more troubleshooting comments to all sections.
        - Added more detail to cache and replication sections, including
          discussions of httpd-accelerator, CreateReplica, and some of the
          performance and failure-mode characteristics of the cache.
        - Cleared up inaccuracies and ambiguities in Gatherer RootNode
          specification section.
        - Added notes about user-contributed software.
        - Added index entries for all programs in appendices.
        - Other minor changes.

- Miscellaneous changes:
	- Reorganized the source tree to support plug'n'play components.

##############################################################################
Changes between release v1.1 (February 17, 1995) and v1.1.beta.v2:

- Changes to the Broker:
	- Added a leading protocol version header for the result set.
	- Added support for query flags during Broker-to-Broker collections.
	- Added support for limiting the lifetime of glimpse queries.
	- Fixed major bugs in Broker-to-Broker collections.
	- Fixed major bugs with deleting Registry entries during initial build.
	- Fixed memory leaks and file descriptor mgmt bugs in glimpseserver.
	- Fixed bug with -L in glimpseserver.
	- Fixed bug that increased the size of structured glimpse indexes.
	- Fixed bugs in the administrative interface and WAIS support.
	- Fixed core dump when searching the Registry during collections.
	- Fixed display SOIF links flag in BrokerQuery.pl.
	- Fixed .cgi programs so that httpd kills them cleanly after user abort.
	- Changed glimpseserver and broker so that they will not block
	  longer than 15 seconds while waiting for an incoming connection.
          This prevents SunOS from blindly swapping out the process.
	- Optimized so that a full glimpseindex will only happen if more
	  than 10% of the objects have changed.
	- Added some more logging output.
	- Fixed various minor bugs.
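
The 10% rule for full glimpseindex runs noted above amounts to a simple
threshold check. A minimal sketch follows; the function name and arguments
are hypothetical, and what the Broker does when the threshold is not
exceeded (assumed here to be an incremental update) is not stated in this
log.

```python
def needs_full_index(changed, total, threshold=0.10):
    """Return True when more than `threshold` of the objects have changed.

    changed: number of objects added, updated, or deleted since the
             last full index (hypothetical bookkeeping).
    total:   number of objects currently held.
    """
    if total == 0:
        return False  # nothing to index at all
    return changed / total > threshold


print(needs_full_index(11, 100))  # prints True
print(needs_full_index(10, 100))  # prints False
```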

- Changes to the Cache:
	- Added Gopher->HTML support. For a Mosaic proxy, you'll need to
		set gopher_proxy http://cache.server:3128/
  	  instead of 
		set gopher_proxy gopher://cache.server:3128/
	- Fixed bug with HTML-ify FTP directories using ftpget.pl.
	- Fixed bug with hierarchical refreshing.
	- Fixed bogus client error message.
	- Improved cached error messages.

- Changes to the Gatherer:
	- Generates the 'Description' attribute whenever possible.
	- Fixed bug in the expiring of objects from the PRODUCTION database.
	- Fixed bug in httpenum that wasn't cleaning up correctly.
	- Fixed newsenum to obey URL-Max limit.
	- Improved the Mail summarizer.
	- Improved the USENET support, added NewsArticle and NewsGroup.
	- Improved gatherd to speed up SEND-UPDATE timestamp computation.
	- Improved preparation for the Gatherer's database to be exported.
	- Purify'd Essence to remove memory leaks.

- Changes to the User's Manual:
	- Updated the section on the Broker's Collection.conf file.
	- Updated many minor points.
	- Improved HTML version of the manual by upgrading the latex2html program.

- Miscellaneous changes:
	- Fixed problems with Solaris' socket.ph for Perl programs.

##############################################################################
Changes between release v1.1.beta.v2 (February 3, 1995) and v1.1.beta:

- Changes to the Broker:
	- Major performance improvements while doing collections.
	- Uses the customizable BrokerQuery.pl for the WWW interface.
	- Fixed major bugs in Broker-to-Broker transfers.
	- Fixed minor bug in collections that caused unnecessary indexing.
	- Cleaned and improved the information that is logged to broker.out.
	- Changed broker to run cleanly as a daemon by disconnecting from
	  the controlling terminal.
	- glimpseserver now prints its error messages correctly.
	- Fixed various minor bugs.

- Changes to the Cache:
	- Fixed core dump bug when cached is heavily loaded.
	- Improved error messages.

- Changes to the Gatherer:
	- Site enumeration filter is based on host:port, and better argv 
	  processing for 'Gatherer' - fixes by "Albert Dvornik" <bert@MIT.EDU>
	- Major performance improvements while preparing databases.
	- Fixed Gatherer to change to Top-Directory before running.
	- Fixed Gatherer to write dummy index.html files in data/ and tmp/.
	- Fixed bug in HTTP enumeration to only extract links from HTML.
	- Fixed various minor bugs.

- Changes to the User's Manual:
	- Added detailed appendix on Harvest software layout and programs.
	- HTML version of the manual now contains the local copy of the icons.
	- Added section on customizing BrokerQuery.pl.
	- Fixed example for Filters during RootNode enumeration.
	- Added a search interface to the User's Manual using a Broker.
	- Updated index.

- Miscellaneous changes:
	- Improved log output format to be more readable.
	- Added HP-UX port/fixes from Chris Dalton (crd@hplb.hpl.hp.com).

##############################################################################
Changes between release v1.1.beta (January 26, 1995) and v1.0:

- Changes to the Broker:
	- Upgraded to Glimpse 2.1 which includes glimpseserver.
        - Added faster, more memory-efficient internal Registry lookups.
	- Added support for switching the indexing subsystem at run-time.
	- Added a statistics generator for the Broker.
	- Fixed BrokerQuery.cgi so that the rejection message from the
	  Broker while it's indexing works all of the time.
	- Fixed Broker bug that would cause the Broker to hang sometimes
	  on a pclose() after doing a collection with the gather command.
	- Immediately denies outside connections during a collection, 
          indexing, or other administrative operations.
        - Improved the HTML result set generated by BrokerQuery.
        - Pointers to content summaries in the result set are now optional.
	- Changed /brokers to /Harvest/brokers, etc.
	- Limit the time that the Glimpse search engine runs for a query.
	- Added Query.cgi which can be used to support Broker replicas.
	- Added support for minimal bookkeeping from Gatherer.
	- Fixed problems with the Broker's cleaning, added Registry compression.
	- Fixed problems with the Broker's updating of objects.
        - Fixed BrokerQuery syntax error message to point to queryhelp.html.
	- Fixed BrokerRestart for Replicator interface.
	- Fixed WWW interface to work with any document root.
        - Fixed various minor bugs.

- Changes to the Cache:
	- Fixed serious hierarchical cache bug.
	- New error messages. HTTP/1.0 compliant.
	- Nuke If-Modified-Since to work with Netscape.
	- Non-blocking DNS lookup using dnsserver program.
	- New config parameter, cache_dns_program.
	- Removed Tcl library binaries - have a precompiled version of Harvest.
	- Fixed stat for outgoing message.
	- Use multiple directories for on-disk swap storage.

- Changes to the Gatherer:
	- Added flexible support for specifying a Gatherer's workload.
	- Added support for gathering through the local file system.
	- Added support for USENET URLs.
	- Added INFO command to Gatherer for statistics.
	- Added support for generating minimal bookkeeping attributes.
        - Improved HTTP/1.0 support for MIME headers and Last-Modified headers.
	- Fixed bug with 'gather' that caused 'gunzip' decompression to fail.
        - Made automatic keyword generation and the local disk cache maximum
	  size run-time flags.
	- Added a SOIF parser in Perl.
	- Changed HTML URL extractor from HTML.sum to separate program.
        - Fixed Gopher support to have longer read timeout.
	- Consolidated GDBM utilities into the 'gdbmutil' program.
	- Fixed bug with gatherd leaving zombie children.
        - Fixed various minor bugs.

- Changes to the Replicator:
	- Replaced with USC's Replicator distribution.

- Changes to the User's Manual:
	- Added a new subsection on Extended RootNode Specifications
	- Added discussion about new Local-Mapping support
	- Fixed various typos and clarified wording in various places
	- Fixed some URLs, and added others
	- Fixed the discussion on using Glimpse with the Broker.
	- Added a new subsection on the Perl SOIF library.
	- Added more descriptions about various system components (e.g., HSR)
	- Added more index entries, and clarified some of the existing entries
	- Added a note about realtime Gatherer updates
	- Added mention of cache RAM requirements
	- Added section on Support Policy and Harvest Team Contact Information
	- Updated copyright/licensing discussion
	- Added a section about the binary-only distribution
	- Changed section names and content at beginning to make them
	  clearer and to better match the new installation.
	- Reorganized manual by subsystem
	- Added troubleshooting sections to each subsystem, and moved
	  some material there that had been in other places
	- Expanded section on supported platforms and software needed for
	  running/building Harvest
	- Clarified some parts of the ``Querying a Broker'' section
	- Added appendix on Directory layout of installed Harvest software
	- Updated to reflect new httpd reorg
	- Updated default summarizer action list
	- Noted that glimpseserver is now part of the system
	- Added more discussion to replicator section, including a figure

- Miscellaneous changes:
	- Reorganized Harvest's installed directory structure.
	- Integrated port to AIX 3.2 and AIX cc by greving@dv.go.dlr.de.
	- Integrated port to HP-UX A.09.03 by steff@csc.liv.ac.uk.
	- Integrated port to IRIX 5.3 by leclerc@ai.sri.com.
	- Integrated port to Linux 1.1.59 by hardy@cs.colorado.edu.
	- Integrated port/fixes to HP-UX 09.03 and HP ANSI C compiler A.09.69
	  by crd@hplb.hpl.hp.com.
	- Changed all Perl scripts to work under Perl 4.x or 5.0.
	- Try to use vfork rather than fork to save memory when possible.
	- Updated Copyright.

##############################################################################
Changes between release v1.0 (November 7, 1994) and v1.0-beta-1.5:

- Changes to the Broker:
        - Upgraded Glimpse from version 1.1 to 2.0.
        - Added support for Glimpse 2.0 which allows byte-level indexing,
          limiting result set sizes, arbitrary Boolean queries, and more.
        - Made case-insensitive and word matching the default for Glimpse.
        - Improved and updated queryhelp.html and adminhelp.html.
        - Added soifhelp.html to the help suite.
        - Added a reboot-broker tag to the default broker Makefile.
        - Fixed various minor bugs.

- Changes to the Gatherer:
        - Better HTTP/1.0 support, sends User-Agent and From fields.
        - Fixed a problem with cross-site Gopher RootNode enumeration.
        - Fixed bug in HTTP RootNode enumeration.
        - Generation of unique, sorted keyword list is optional in config.h.
        - Changed Gatherer program to work around Solaris 2.3 Perl 4.036 bug.
        - Fixed various minor bugs.

- Changes to the Cache:
        - Added support for the Netscape browser.
        - No longer caches /cgi-bin/ URLs.
        - Updated the Tcl/Tk/dpwish pointers for the Cache manager.

- Changes to the User's Manual:
        - Added an index with over 300 entries.
        - Added a new section about Querying a Broker.
        - Added a new section about common SOIF attribute names.
        - Added a new section on periodic gathering.
        - Added a new section on tuning Glimpse.
        - Added a new section on the WWW interface to the Broker.
        - Added a new section on integrating new search/indexing subsystems
          into the Broker, and give detailed interface description.
        - Added more detail to SOIF appendix.
        - Improved and updated the Administrating a Broker subsection.
        - Added more explanation about manual annotations.
        - Folded in content from FAQ.
        - Noted particular usefulness of the Essence-Options variable,
          e.g., for setting --full-text.
        - Added a note to the Customizing the candidate selection step
          subsection that it's particularly useful to do selection based on
          file and URL naming heuristics when gathering remote data,
          because it can avoid retrieving lots of data.
        - Added a note in the subsection on Running a Gatherer that you can
          set MAX_ENUM in src/common/include/config.h, and that in a future
          release of Harvest we will make it possible to set this limit more
          flexibly.  Also added a note about the robot guidelines.
        - Added an overview about the lib and bin directories for the Gatherer,
          including the defaults and descriptions of each file.
        - Showed RunGatherer and RunGatherd scripts and added discussion 
          of how to use them from cron and /etc/rc.local.
        - Added pointer to FAQ on setting up HTTPD in the Broker section.
        - Put the logo on the cover page.

- Miscellaneous changes:
        - Updated the COPYRIGHT and added it to all appropriate source files.
        - Updated the FAQ, and converted to HTML.
        - Fixed BSD compatibility bug in src/install.sh.

##############################################################################
Changes between release v1.0-beta-1.5 (October 14, 1994) and v1.0-beta-1.4:

- Added a user manual that is intended to help both novice and advanced
  Harvest users better use the system.  It covers the following topics:

        - Introduction to Harvest (1 page)
        - Subsystem Overview (2 pages)
        - Getting and Installing the Harvest software (1 page)
        - Making Basic Use of Harvest (3 pages)
        - Advanced Features of Harvest (5 pages)
        - References (1 page)
        - Appendix on The Summary Object Interchange Format (SOIF) (3 pages)
        - Appendix on Essence Summarizer Actions (1 page)
        - Appendix on Gatherer Examples (6 pages)
        - Appendix on Broker's Query Manager and Collector Interface (2 pages)

- Changes to the Broker:
        - Improved Broker installation, and added the CreateBroker program
          that automatically creates and configures a Harvest Broker based
          on a brief Question & Answer session with the user.
        - Improved the Mosaic interface to be more user-friendly.
        - Added support for duplicate removal based on MD5 values.
        - Made Query Manager and Administrative interface more extensible.
        - Rewrote the Broker registry to improve performance and readability.
        - Added the dumpregistry command to view the Broker's registry.
        - Added the test-broker command for simple testing of a Broker.
        - Added support for wais-8-b5, freeWAIS, commercial WAIS, and Nebula.
        - Cleaned up the admin.html and query.html files.
        - Cleaned up much of the code to make more extensible.
        - Fixed bug in the registry garbage collection.
        - Fixed major memory leak bugs.
        - Fixed various minor bugs.
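
Duplicate removal based on MD5 values, as listed above, can be sketched as
follows. This is a minimal illustration that assumes duplicates are
detected by hashing object content and keeping the first copy seen; the
function name and the (url, content) input shape are hypothetical and do
not describe the Broker's actual data structures.

```python
import hashlib


def remove_duplicates(objects):
    """Keep the first object seen for each distinct MD5 digest.

    objects: iterable of (url, content_bytes) pairs (hypothetical shape).
    Returns a list of the pairs whose content had not been seen before.
    """
    seen = set()
    unique = []
    for url, content in objects:
        digest = hashlib.md5(content).hexdigest()
        if digest in seen:
            continue  # same content already kept under another URL
        seen.add(digest)
        unique.append((url, content))
    return unique
```

Comparing fixed-size digests instead of full contents keeps the check cheap
even when the same document is reachable under several URLs.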

- Changes to the Cache:
        - Started using icp version_id 2 of the protocol.
        - Improved support for OSF/1 v2.0 on 64-bit DEC Alphas.
        - Added password support for administrative interface.
        - Fixed bug with FTP "Parent Directory", and cleaned up HTML for dirs.
        - Fixed various major bugs with hierarchical caching.
        - Fixed various minor bugs.

- Changes to the Gatherer:
        - Added support for generating a sorted, unique keyword attribute,
          based on the Description, Partial-Text, or Keywords attribute.
        - Added an "allow only these types" in the Candidate Selection step.
        - Added stub Exploder type to help users use the unnesting step.
        - Gatherer automatically creates a gatherd.cf file if needed.
        - Fixed major gatherd bug that caused.
        - Fixed various minor bugs and memory leaks.

- Changes to the Replicator:
        - Working on instrumenting the code to measure performance.
        - Fixed various bugs.

##############################################################################

$Id: ChangeLog,v 1.146 1995/05/27 19:45:30 hardy Exp $
