Subject: Conservation OnLine

From: Walter Henry <whenry>
Date: Wednesday, February 3, 1993
    **** Moderator's comments:   This is a special mailing, devoted to a
    single announcement.  This is very long, but I think it important
    enough to warrant the length.  Please save this, as it is intended
    to be part of the fundamental DistList documentation.  If you are in
    a hurry, read only sections 1 & 3, and skim 2.

    For those who would prefer to print this document, I will try to
    provide a postscript version at an anonymous ftp site soon.

    1.  Introduction
    2.  An overview of WAIS
    3.  The CoOL databases
    4.  Finding a client
    Appendix 1.  Frequently Asked Questions about WAIS
    Appendix 2.  SWAIS help

1.  Introduction

The Preservation Department of Stanford University Libraries is pleased
to announce the creation of Conservation OnLine (CoOL), a Wide Area
Information Server (WAIS) dedicated to providing Internet access to a
full text database of conservation information.  The databases cover a
wide spectrum of topics of interest to those involved with the
conservation of library, archives and museum materials.

Those of you familiar with WAIS can simply search
cool-directory-of-servers.src at port 210, seed word
"source", to find out more about CoOL.  The rest of this article will
describe the CoOL databases in more detail, will provide a short
overview of WAIS, and will offer some help in getting started using it.
Please save this article, as it will form the basic reference document
for CoOL.

The content of the CoOL databases comes from a variety of sources and we
hope that all users will consider contributing some material to the
project.  As you use the server please pay attention to lacunae that you
might be able to help fill.  As a start, I'd very much like to assemble
a collection of disaster plans.  Please send your institution's disaster
plan, in machine readable form (preferably as an ascii file) either by
email to waiscool [at] aldus__stanford__edu or on a floppy to

    Walter Henry
    Conservation Lab
    Stanford University Libraries
    Stanford, CA  94305-6004

[Please include a note telling me what format the thing is (eg "Dos/Word
Perfect" or "Mac/Microsoft Word", etc).  We can probably read anything
you throw at us if you tell us what it is]

2.  An overview of WAIS

Brewster Kahle, one of the key figures in the development of WAIS,
describes it this way:

    The Wide Area Information Servers system is a set of products
    supplied by different vendors to help end-users find and retrieve
    information over networks.  Thinking Machines, Apple Computer, and
    Dow Jones initially implemented such a system for use by business
    executives.  These products are becoming more widely available from
    various companies.
    Users on different platforms can access personal, company, and
    published information from one interface.  The information can be
    anything: text, pictures, voice, or formatted documents.  Since a
    single computer-to-computer protocol is used, information can be
    stored anywhere on different types of machines.  Anyone can use this
    system since it uses natural language questions to find relevant
    documents.  Relevant documents can be fed back to a server to refine
    the search.  This avoids complicated query languages and vendor
    specific systems.  Successful searches can be automatically run to
    alert the user when new information becomes available.

2.1 Client-Server Model
In the client-server model, an increasingly important concept for
network applications, an application entails two completely separate
components that may be created independently of each other.  The
server sits on a remote machine somewhere on the network and processes
requests for services (e.g. database queries).  These requests, and the
response from the server, may be in a format not at all comfortable to
humans, but this doesn't matter because the end-user never communicates
directly with the server; she talks only with a program called a client,
which provides a comfortable user interface, translates the user's
requests to a format that the server can understand, and processes the
response from the server to make it palatable for the user.

A common example of such a system is found in some Campus Wide
Information Systems (CWIS) which allow users to communicate with online
library catalogs at other universities.  The foreign university catalogs
may well use a completely different query language and would be
difficult for the local users to make sense of, but they need never deal
directly with the foreign catalog.  Instead, they deal only with a
client program (probably developed by programmers at the local site)
that makes all database requests look like those the users are familiar
with.  The client translates those requests into a standard form
that the server can understand.  When the server returns bibliographic
records, the client steps in again and formats those records in a
display that is comfortable to the user.

Obviously, the client-server model depends on the availability of a
standard "language" that both client and server software can agree to
understand.  In the library catalog example above, a standard known as
Z39.50 provides such a language.  Now it is common to speak of WAIS as
if it were a program, but it is really a protocol, a language understood
by clients and servers.  It is, in fact, an extended form of Z39.50.  The
programs that we deal with (clients) are just pieces of software that
happen to speak the WAIS protocol.
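
To make the division of labour concrete, here is a minimal sketch in
Python of what a client does on the user's behalf.  Everything in it is
invented for illustration: the request format (a dict with "op" and
"terms"), the function names and the two sample records are assumptions
and bear no relation to the real Z39.50 or WAIS wire format.

```python
# Toy illustration of the client-server split.  The "protocol" used here
# is hypothetical; it is NOT the real Z39.50/WAIS message format.

RECORDS = [
    {"id": 1, "title": "Notes on bookcloth"},
    {"id": 2, "title": "Mass deacidification survey"},
]


def server_handle(request):
    """Server side: understands only the terse protocol messages."""
    if request["op"] != "search":
        return {"status": "error", "hits": []}
    terms = request["terms"]
    hits = [r for r in RECORDS
            if any(t in r["title"].lower() for t in terms)]
    return {"status": "ok", "hits": hits}


def client_search(question):
    """Client side: translate the user's words into a protocol request,
    pass it to the server, and make the raw response readable."""
    request = {"op": "search", "terms": question.lower().split()}
    response = server_handle(request)  # stands in for a network call
    return ["%d. %s" % (n, hit["title"])
            for n, hit in enumerate(response["hits"], start=1)]


if __name__ == "__main__":
    for line in client_search("bookcloth"):
        print(line)
```

The user sees only client_search and its formatted lines; the dict that
travels to server_handle could change completely without the user ever
noticing, which is the point of the model.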

2.2 Full Text
CoOL is a full text database.  This means that what you retrieve is a
final product (eg an article) rather than a pointer to another product.
At first CoOL will contain only text files, but eventually it will also
contain non-text material, such as images, in standard formats.  Right
now, there aren't many clients widely available that can handle non-text
files, but when these appear, CoOL will provide material to keep them

2.3 Guide to WAIS Searching
Unlike conventional databases, WAIS does not use a specialized query
language.  That is, your question can be phrased in English, in whatever
fashion you like.  If the question doesn't produce the desired results,
you will learn this immediately and can rephrase the question.  By doing
so, one quickly learns, without any real effort, what sort of questions
get satisfactory results in a given database.  The texts that are
retrieved are returned with weights indicating the extent to which the
documents match the words in your question, a concept central to the
WAIS protocol, called relevance ranking.  Your client can then use this
to present the document list in order of relevance.

2.4 Bye-bye boolean
One of the major differences between WAIS searching and conventional
searching is that WAIS generally does not support boolean search
operations.  Whether this is a bug or a feature is largely a matter of
preference, and those of us brought up with boolean searching will need
to take a while to get used to the WAIS way of doing things.  I can
promise you though, that after a while, it does begin to feel
comfortable.  It is, by the way, not quite accurate to say that WAIS
"doesn't do boolean", since there is nothing inherent in the protocol
that prevents it, and indeed, there are experimental implementations
that do support boolean searching.
In place of boolean searching, WAIS offers natural language queries
("tell me about glues and adhesives and sticky things"), quick
retrieval, casual browsing, and relevance feedback.

2.5 Relevance feedback

One of the genuinely spiffy ideas in WAIS is relevance feedback.  The
concept is simple:  after you've asked a question, perhaps in a
less-than-optimal form, you have a set of retrieved texts that you can
browse through.  The chances are good, if the database has anything at
all in your subject, that at least one of the retrieved texts will be
the sort of thing you had in mind.  With relevance feedback, you can
repeat your search and tell WAIS that you are interested in seeing more
texts that are "like" that one, with "like" meaning "having a lot of
text in common with".  CoOL will support relevance feedback, if your
client supports it, but at this stage in WAIS's development (at least
with the noncommercial implementation that CoOL uses), it is not really
as effective as it might be.  Because of the limitations of the current
relevance ranking scheme, the "similar" documents may not seem, to a
human reader at least, to have much in common.  Nevertheless, it does
sometimes work well.  Some clients allow you to cut and paste a portion
of a retrieved document into a relevance feedback-driven query, rather
than using the entire document.  In this case, relevance feedback can be
particularly effective.
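
A rough sketch may help show what "having a lot of text in common with"
can mean in practice.  The Python below ranks texts by the number of
distinct words they share with the one you picked.  This word-overlap
measure and the sample texts are my own assumptions for illustration;
it is not the measure that CoOL's server or any other WAIS server
actually uses.

```python
import re


def words(text):
    """Lower-cased set of the words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def more_like(chosen, others):
    """Rank `others` by how many distinct words each shares with `chosen`.

    Word overlap is a deliberately crude stand-in for "having a lot of
    text in common with"; it is not what a real WAIS server computes.
    """
    target = words(chosen)
    scored = [(len(target & words(doc)), doc) for doc in others]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored if score > 0]


if __name__ == "__main__":
    chosen = "wheat starch paste for paper repair"
    others = [
        "methyl cellulose and wheat starch paste compared",
        "disaster plan for flooded stacks",
        "paper repair with heat-set tissue",
    ]
    for doc in more_like(chosen, others):
        print(doc)
```

Note that the disaster plan still turns up, at the bottom of the list,
merely because it shares the word "for" with the chosen text.  That is
the kind of "similar" document that will not seem similar to a human
reader.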

2.6 Searching
WAIS databases consist of the text files themselves and a set of indexes
in which every word, except those in a stopword list, is indexed and
weighted according to a relevance-ranking algorithm.  In principle, an
indexer should be able to discern that in an article on mass
deacidification, the words "calcium hydroxide" are probably more
important to you than the words "Thank you for your attention".  At this
stage in the game, however, things aren't that sophisticated, so you
will need to be a little bit careful about choosing your search terms.

A simple example will illustrate the point.  Since (almost) every word
in the text is indexed, if your search question for the cool database
(the archives of the Cons DistList) is

    "So, what's the latest buzz on the subject of bookcloth"

your search will be accepted, because WAIS will let you search for
anything you like and will do its best to match the retrieved text to
your query, but in practice, you will probably not be happy with what
you retrieve.  "So", "what's", "on", "of", and "the" are in the stopword
list, so they will not get in the way.  The search will actually use
only the words "latest", "buzz", "subject", "bookcloth".  Now "bookcloth"
is obviously a relevant term, but "buzz" and "subject" clearly aren't.
"Latest" is indeed part of the concept of your query, but the appearance
of the word in a text is not likely to indicate that the text is "the
latest buzz", so it is irrelevant as well.

In CoOL, because of the nature of the subject matter, "buzz" won't have
much effect (in fact, at the moment, it doesn't occur in the database).
"Subject", however, appears in every single message, often several times,
so unless other weighting factors dominate, the chances are good that
articles that contain the word "subject" will be ranked as more relevant
than the (fewer) articles that mention bookcloth.

We have tried by various means to maximize the probability that the
document ranking will reflect the subject matter of the text, but there
are severe limits on how effectively this can be done with the available
indexing tools.  For subjects that are relatively well covered in
the database, document ranking is rather decent.  For example, if, in the
above example, you substitute "mass deacidification" for "bookcloth", you
will probably find the ranking to be satisfactory.
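
The bookcloth example can be played out in a few lines of Python.  This
is only a sketch under stated assumptions: the stopword list is a
made-up fragment (the server's real list is not reproduced here), the
two sample messages are invented, and the weighting is plain word
counting, which is simpler than whatever the server really does.

```python
import re

# Hypothetical stopword list; the server's real list is not given here.
STOPWORDS = {"so", "what's", "the", "on", "of"}


def search_terms(question):
    """Words of the question that survive stopword removal."""
    found = re.findall(r"[a-z']+", question.lower())
    return [w for w in found if w not in STOPWORDS]


def score(terms, text):
    """Naive weight: total occurrences of the search terms in the text."""
    found = re.findall(r"[a-z']+", text.lower())
    return sum(found.count(t) for t in terms)


if __name__ == "__main__":
    terms = search_terms(
        "So, what's the latest buzz on the subject of bookcloth")
    print(terms)
    messages = {
        "about bookcloth": "Subject: starch-filled bookcloth",
        "off topic": ("Subject: job posting. "
                      "Subject line corrected; subject to change"),
    }
    for name, text in messages.items():
        print(name, score(terms, text))
```

With plain counting, the off-topic message outscores the one that
actually mentions bookcloth, simply because "subject" occurs in it more
often.  Asking for "bookcloth" alone avoids the problem.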

2.7 The future
WAIS as it exists today is a wonderful tool, but much of the excitement
that surrounds it has to do with its potential.  Like other tools based
on the client-server model, the richness of the application depends upon
the extent to which the server provides a rich set of services and the
client provides an effective interface to those services.  WAIS is still
very young, and both the clients and servers are undergoing improvements
so you should expect the WAIS scene to look a lot more interesting as
the work progresses.  For now, it is very exciting and quite useful, but
one ought to be a little reasonable in one's expectations.
Among the areas that are not yet as exciting as they promise to be are
relevance ranking and, to a lesser extent, relevance feedback.  At
present, most of the servers (including CoOL) use an almost trivially
simple algorithm to weight the words in the index, but work is underway
to incorporate the findings of the information sciences to produce a new
generation of servers that should provide really interesting search and
retrieval functionality.

2.8 Stopwords

When indexing texts, CoOL ignores many common words like "the", "what",
etc., as well as single letters, so they do not interfere with your
search.  However, stopwords are a function of the server, not the
client, and your client has no way of knowing which of the terms it
passed to the server were actually used for the search.  Because of
this, when the text is retrieved, if your client highlights the 'seed
words', it may well highlight "the" and "what", giving the erroneous
impression that those terms played a part in the selection of the text.

3.  The CoOL databases

In WAIS terminology, a "source" is a file that describes, for both the
client software and the user, what the database is about and how it is
used.  It includes information about where the database is (address and
port) as well as information about costs, how often the database is
changed, etc.  Your client can retrieve these source files and present
them to you whenever you want to ask a question, making the CoOL
databases an extension of your own computing environment.  Although the
terminology is a little slippery, in general I will use "database" to
refer to a collection of texts and "source" to refer to a particular
type of WAIS file (called <database>.src), as described above.

The universe of CoOL texts is subdivided into several databases (each of
which is described by a source) and the divisions are calculated to
enhance the probability that you will find a relevant answer (if not
"the" answer) to your question.  That is, it enhances searching
precision.  Since most WAIS clients allow you to search several
databases at a time (even databases on separate hosts), it is easy to
expand the scope of a search to increase the number of items retrieved
(recall).  As of this writing there are 8 databases, and more will be
added as we gather material.
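
The trade-off between precision and recall can be pictured with a small
Python sketch of what a client might do when you select more than one
database.  The result lists, scores and headlines below are invented,
and the sketch assumes that scores from different databases can be
compared directly, which may well not hold for real servers.

```python
# Hypothetical result lists: (score, headline) pairs as a client might
# hold them after querying each database separately.
RESULTS = {
    "cool": [(1000, "DistList message on bookcloth"),
             (400, "DistList message on adhesives")],
    "cool-cfl": [(900, "Article on bookcloth manufacture")],
}


def merged(databases):
    """Combine the hits from the chosen databases, best score first."""
    hits = []
    for name in databases:
        for score, headline in RESULTS.get(name, []):
            hits.append((score, name, headline))
    return sorted(hits, key=lambda hit: hit[0], reverse=True)


if __name__ == "__main__":
    for score, name, headline in merged(["cool", "cool-cfl"]):
        print(score, name, headline)
```

Searching one database gives a shorter, more focused list; adding a
second database lengthens the list at the cost of more to wade through.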

As the databases grow, expect this structure to be expanded and
refined to provide a well defined search space.

3.1 cool-directory-of-servers
    cool-directory-of-servers is a top level directory for Conservation
    OnLine (CoOL), a collection of WAIS databases containing information
    of interest to people involved with the conservation of library,
    archives and museum materials.  This is used to locate the
    individual CoOL databases, in which you will do your actual
    searching.

    To determine which CoOL database will best meet your needs, query
    cool-directory-of-servers.  To see a list of all the CoOL databases,
    use the word 'source' as your search term.  CoOL will return a list
    of all the other CoOL sources and your client can retrieve these and
    save them on your machine so that the next time you search, you will
    be able to select them as your target.  New databases will be added
    to CoOL so it will be a good idea to search
    cool-directory-of-servers regularly.

3.2 cool

    cool contains the complete archives of the Conservation DistList.
    Every message that has appeared in the DistList since its inception,
    has been reformatted and enhanced (e.g. full names added to From:
    fields, subjects regularized, spelling corrected) to increase the
    probability of your search retrieving a relevant item.  Searches
    will return individual messages rather than complete DistList

3.4 cool-cfl
    The largest of the CoOL databases, cool-cfl is the information
    workhorse, containing files on a wide variety of conservation
    topics.  Most of your searches will probably include this database.

3.5 cool-cdr
    cool-cdr contains an up-to-date version of the Conservation Email
    Directory (ConsDir).  Searches return single entries for
    individuals. You can search for any word that appears in a Directory
    entry, eg "Conservator from California Interested in book structure"
    (but see "Guide to WAIS Searching" elsewhere in this document for
    ways to improve this question).

3.6 cool-net
    cool-net contains information of a general nature, concerning
    networking, mailing lists, the Internet, etc.  It is the only
    component of CoOL that is not focused explicitly on conservation.

3.7 cool-lex

    cool-lex.src contains lexical and classification material pertaining
    to conservation and preservation, including thesauri (or
    microthesauri), glossaries, classification schemes, authority lists
    (descriptors, subject headings), etc.  These items are segregated
    from other CoOL databases in order to prevent false hits in the
    other databases:  if you search cool-cfl for "Adhesives" you are
    probably not going to be satisfied by learning that "Adhesives"
    appears in an authority list.

3.8 cool-bib and cool-ref
    cool-bib contains complete bibliographies on conservation topics.
    cool-ref is similar but returns individual citations.
    Note that in general CoOL databases return full text rather than
    literature citations.  cool-bib and cool-ref are, however,
    exceptions to this rule.  Although there is a considerable
    duplication between the two databases, the overlapping coverage is
    not complete and cool-ref will probably always contain a great many
    more citations than cool-bib.  Normally you will search cool-bib to
    find out if someone has provided thorough coverage of a topic, and
    cool-ref to find answers to specific questions.

3.9 Acceptance of material to be mounted in CoOL.

We will always be grateful to receive machine readable text to be
mounted in CoOL and hope that you will all dig through your files for
material to share.  There are only a few restrictions, and of course we
reserve the prerogative of deciding what will be mounted.

3.10 Copyright

The material must be either in the public domain or material for which
we have permission to reproduce and present in machine readable form. If
you are not the copyright holder of the material you submit, please
verify with the copyright holder that s/he is willing to permit us to
mount the material and tell us, at the time of your submission, how to
get in touch with him/her.

If you submit material for which you hold the rights, please send a note
with your submission, making explicit the nature and extent of the
permission being granted.  If you wish to include a copyright statement
in the text, you are encouraged to do so.  The following is an example
that might be appropriate, but please confer with your own legal counsel
before relying on it (we have *not* consulted an attorney about this
clause and make no claims for its appropriateness).

    Copyright <YEAR> by <NAME>.  Copying in excess of rights otherwise
    established under copyright law is permitted, without individual
    permission or payment of a fee, provided that copies are made or
    distributed for non-profit purposes and credit is given for the source.
    Abstracting with credit is permitted.

If you discover copyrighted material in CoOL that you believe may be
mounted without proper permissions, please let us know so that we can
correct the error.

3.11 Advertising
Advertising per se is not welcome, but announcements, technical
specifications and other material whose primary purpose is to provide
information about products and services, rather than to entice sales,
are welcome.

3.12 Limitations

We are not able to provide any support for client software.  Nor can we
offer help in obtaining and installing clients, beyond what is offered
in this document.  If you have trouble, please get help from someone at
your site.  If, on the other hand, you discover anomalies in the data
(e.g. missing or incorrect Headlines) or in the behaviour of the server,
we will be grateful if you would bring them to our attention.

We do not have adequate resources to go out hunting for text or, in most
cases, to scan printed text.  If you want to suggest that a given
printed text be scanned and mounted, we will be happy to record the
suggestion, but the chances are very slim that we will be able to act on
it.

If you know of an appropriate text in machine readable form, please
provide specific information about where it can be found (eg "by anonymous
ftp to abc [at] xyz__edu in directory pub/doc/foo" or "Point your gopher to
abc [at] xyz__edu and look in directory Foo" are helpful; "It's on listserv"
is not).

3.13 Acknowledgements

I would like to thank Jonathan Goldman of Thinking Machines for help
and advice on WAIS matters and for sharing with us the source code for
WAISmail, and Bill Tierney of the Stanford University Libraries Systems
Office for invaluable help in getting this thing off the ground.

4.  Finding a client

If you are directly connected to the Internet (ie you can telnet and
ftp from your machine to other places on the net) and are able to
install software on your machine

    see 4.1 Finding WAIS Clients

If you have access to Gopher

    see 4.2 Using Gopher with WAIS

If you are directly connected to the Internet (ie you can telnet and ftp
from your machine to other places on the net) but are NOT able to
install software on your machine

    see 4.3 Running SWAIS at

If you are not on the Internet (ie you are able to send mail to Internet
hosts, but are not able to telnet or ftp from your machine to other
places on the net)

    see 4.4 The WAISmail interface

4.1 Finding WAIS Clients

WAIS clients are available, free of charge, for several machines.  Some
are more sophisticated than others, and all are likely to undergo
considerable change in the next year or so.

There are two clients available for the Macintosh, and both are fairly
nice.  Both require that you have MacTCP installed.  WAIStation is a
standalone program that is available by anonymous ftp to in
the wais directory.  You will also find a WAIStation demo there,
which is a great introduction to using WAIS.  WAIStation0.63.hqx is
binhexed, so you will need binhex 4.0 or another utility that can
unbinhex files (eg Compactor Pro, Stuffit Deluxe, etc).  HyperWais
(which requires Hypercard to run) is available from
[] by anonymous ftp.  The application is located in
incoming/HyperWais.sea.hqx, and the source is in
incoming/HyperWais.src.sea.hqx. has a nice selection of clients for a variety of
computers including Dos and Windows machines, NeXt, Unix boxes, etc.
They are found in pub/wais.   If you don't find what you need there, see
the Frequently Asked Questions posting below.

4.2 Using Gopher with WAIS

If you have access to a Gopher client, you can use it to search WAIS
databases, including CoOL.  To find your way to CoOL, navigate through
the Gopher directories until you find something like "Other
Information," and look there for "WAIS Based Information".  Somewhere
below that you may find cool-directory-of-servers.  If not just look for
directory-of-servers, which is the top level directory at
Search either for the word "conservation" and you will be presented with
all the CoOL databases.

Gopher clients are available for a variety of machines and you can
obtain them by anonymous FTP to in the directory
pub/gopher.  There are of course, other sources, and you can use gopher
to find them.

If you do not have your own gopher client, there are publicly available
Gopher sites.  To use them, telnet to one near you and login using the
name indicated in the table below (taken from the Gopher Frequently
Asked Questions file)

     Non-tn3270 Public Logins:

     Hostname                  IP#              Login   Area
     ------------------------- ---------------  ------  -------------
                                                gopher  North America
                                                gopher  North America
                                                panda   North America
                                                gopher  Europe
                                                info    Australia
                                                gopher  Sweden
                                                gopher  South America
                                                gopher  Ecuador

     tn3270 Public Logins:

     Hostname                  IP#              Login   Area
     ------------------------- ---------------  ------  -------------
                                                -none-  North America

4.3 Running SWAIS at

NB in the following discussion, when the instruction says type
"something", it is understood that you will not type the quote marks.
Case *is* significant:  "B" is not the same as "b".  Remember to read
whatever instructions are displayed on the screen as they will usually
--but not always-- tell you what to do next.

Running your own client is without question the most desirable way to
use CoOL, but if you are unable to install a client, there are some
options available to you.  However, they are neither as powerful nor
as convenient as running your own client.

Thinking Machines provides a publicly available client (SWAIS) which
you can use via telnet.  To use it telnet to and login
as "wais".  You will be asked for your terminal type (if you are not
emulating VT100, the chances are good you will find quake's facility
unusable).  Detailed information for using SWAIS, whether at quake or on
your own system, is found in the SWAIS Manual below.  Note that you
cannot use the "Save" command on quake, since that would save the
retrieved text on quake instead of your own machine, but you can use "m"
to have the retrieved text mailed to your account.  While you are logged
on, you can get help by typing "?".

Once logged into quake you will be presented with a list of 'sources'.
Page down (upper case "J") or, if you are in a hurry, type


to search the source list for the cool- group of databases.  You should
see a list including cool-directory-of-servers.src, cool.src,
cool-cfl.src, etc.

Let's assume you want to search cool-cdr, which contains the
Conservation Email Directory, in order to find someone from Australia.
Move the cursor until it is on cool-cdr.src and type SPACE to tell SWAIS
that this is the database you want to search in.  (You can select more
than one database so that they will be searched simultaneously, but for
this exercise let's keep things simple).   Then type "w".  This tells
SWAIS you want to enter Words for your search.  Type "Australia"
<return>.  After the search you will see a list of names.  Use the cursor
to select one and type a SPACE to tell SWAIS to retrieve.

While you are viewing the retrieved text, you can get help by typing
"h".  SPACE will move you forward to the next screen and "b" will move
you back to the previous screen.  Type "q" when you are done reading.

Some useful keys (again, case is significant):

   ? & h        get Help.  If you try one and don't get help, try the
                other (and repeat to yourself "Unix is my friend")

   q            stop doing what you're doing and go back.  If you find
                yourself stuck, "q" should get you out of trouble.  If
                you are at the Sources list, though, "q" will quit the
                program (but you will be asked first).

   delete/bs    if your Backspace key doesn't seem to do what you want,
                try the Delete key, and vice versa.

  ^u            erase the line you typed

   J            Down a page

4.4 The WAISmail interface

Until very recently, those whose only access to the network is
electronic mail were out of luck, but now, thanks to the efforts of
Jonathan Goldman, of Thinking Machines, there is a mail interface known
as WAISmail.  To get help on using WAISmail, send a message to
waismail [at] think__com and make the first line

You will receive a detailed help file by mail.  A version of the help
file is included below, but you should get a new one, as there may be
changes as WAISmail develops.

To see how it works, let's try the same search that we did with SWAIS.
WAISmail searches are quite simple.  They consist of two separate
transactions: a search and a retrieval.  To search, send a message to
waismail [at] think__com that looks like

    search cool-cdr australia

In a very short time, you will get back a message that looks something
like this

    From daemon [at] quake__think__com  Mon Feb  1 16:42:19 1993
    Date: Mon, 1 Feb 93 16:43:51 PST
    From: WAISmail [at] quake__think__com
    To: whenry [at] lindy__Stanford__EDU (homo obsolescensis)
    Subject: Your WAIS Request: 

    Searching: cool-cdr
    Keywords: australia

    Result # 1 Score:1000 lines:  0 bytes:    421 Date:     0 Type: TEXT
    Headline: name      |Drew, Nancy
    DocID: 41139 41560 /u/wais/Cdr/CONSDIR:/u/wais/src/cool-cdr@aldus.

    Result # 2 Score:1000 lines:  0 bytes:    390 Date:     0 Type: TEXT
    Headline: name      |Spade, Sam
    DocID: 104773 105163 /u/wais/Cdr/CONSDIR:/u/wais/src/cool-cdr@aldus.

    Result # 3 Score:1000 lines:  0 bytes:    338 Date:     0 Type: TEXT
    Headline: name      |Wolfe, Nero
    DocID: 140758 141096 /u/wais/Cdr/CONSDIR:/u/wais/src/cool-cdr@aldus.

Note that these will be very long lines and may wrap on your terminal
screen, but you should not insert newlines except between items.

Now to retrieve one or more of these texts, you will need to compose a
new message to waismail [at] think__com and *include* the DocID lines for
those items you want.  If you are using Unix mail, when you are done
reading this result set, you can

 r      (reply)
~f      (forward the current message)
~v      (edit the outgoing message, cut away any items you do *not*
            want, save)
 .      (send the message)

Note that you don't have to edit away any of the headers or other
extraneous material.  Don't "quote" the included text (eg with ">").  Only
the lines that begin with "DocID:" are relevant.  Be very careful,
however, not to delete the blank lines between items and not to insert
or delete any extraneous spaces at the ends of lines.  If your system
automatically inserts a .sig or "--------Original Message________" line
around the included text, be sure to insert one or more blank lines
around the DocIDs.
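
If you would rather not trim the reply by hand, the job can be sketched
in a few lines of Python.  This is only an illustration of the rules
just described: the function name is my own, the DocID values in the
example are placeholders rather than real identifiers, and whether
WAISmail accepts exactly this layout is an assumption you should check
against its help file.

```python
def retrieval_body(reply_text):
    """Build the body of a retrieval request from a search reply.

    Keeps only the lines that begin with "DocID:", unchanged (no spaces
    added or removed), with one blank line between them.
    """
    docids = [line for line in reply_text.split("\n")
              if line.startswith("DocID:")]
    return "\n\n".join(docids) + "\n"


if __name__ == "__main__":
    # The DocID values here are placeholders, not real identifiers.
    reply = (
        "Searching: cool-cdr\n"
        "Keywords: australia\n"
        "\n"
        "Result # 1 Score:1000 lines:  0 bytes:    421 Type: TEXT\n"
        "Headline: name      |Drew, Nancy\n"
        "DocID: 0 1 /placeholder/path-one\n"
        "\n"
        "Result # 2 Score:1000 lines:  0 bytes:    390 Type: TEXT\n"
        "Headline: name      |Spade, Sam\n"
        "DocID: 2 3 /placeholder/path-two\n"
    )
    print(retrieval_body(reply), end="")
```

The sketch drops everything except the DocID lines, which is stricter
than necessary (headers may stay), but it sidesteps stray .sig lines.
It cannot repair a DocID that your mailer has already wrapped.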

Mail the message to waismail [at] think__com and with luck, in a short time
you will get the full text of your selections.  If you're not so lucky,
you will get back an error message that may help you figure out what
went wrong.  The most likely problems are extraneous characters
(especially if your editor or mailer has wrapped long lines) and missing
blank lines between the docids.  If the file comes to you uuencoded, it
means that WAISmail thinks you have requested a non-text file.  You can
recognize this situation because the file you receive will say

   begin WAIS.res 666

and be followed by what looks like garbage.  Since (at least for the
time being) all CoOL files are pure text, this is obviously an error.
Most likely your mailer has joined the last line of the docid to your
.sig or other extraneous text.  Just insert a newline after the docid (ie
after "210%TEXT") and it should work fine.

Appendix 1.  Frequently Asked Questions about WAIS

  Archive-name: wais-faq/getting-started
  Last-modified: 27 Dec 92 00:00:01 EST

  comp.infosystems.wais Frequently asked Questions [FAQ] (with answers)

      -1-  What is the purpose of this newsgroup?
      -2-  How can I search this FAQ to find the answers?
      -3-  What is WAIS?
      -4-  Where can I find more information on WAIS?
      -5-  How can I get access to WAIS?
      -6-  Where can I find WAIS software for the XYZ OS?
      -7-  Where can I pick up the list of sources (e.g. databases) for

  Please send suggested corrections and additions to: edguer [at] ces__cwru__edu


  Subject: -1-  What is the purpose of this newsgroup?
  Date: 28 Oct 92 00:00:01 EST

  From the Charter:

  comp.infosystems.wais is for discussion of WAIS, the Wide Area
  Information Servers, a networked full text retrieval system developed
  by Thinking Machines, Apple Computer, and Dow Jones.


  Subject: -2-  How can I search this FAQ to find the answers?
  Date: 19 Nov 92 00:00:01 EST

  This FAQ follows the RFC1153 recommendations for message digests and
  thus should easily be viewed by newsreaders that understand message

  This FAQ also uses the Subject: lines with the answer to each question
  and thus it should be easy to step through the answers with the "^G"
  command of rn.

  This FAQ marks each question with a "dash number dash" so that using a
  regular expression search pattern you can easily get directly to any
  question on the document.
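
  As a rough illustration of such a pattern, the Python sketch below
  picks out the "dash number dash" markers on the Subject: lines.  The
  function name and the sample text are assumptions made for the
  example; any regular expression tool would do the same job.

```python
import re

# Matches a Subject: line whose text starts with a "dash number dash" marker.
QUESTION = re.compile(r"^\s*Subject:\s+-(\d+)-\s+(.*)$", re.MULTILINE)


def find_questions(faq_text: str) -> dict[int, str]:
    """Map each question number to its title."""
    return {int(num): title.strip() for num, title in QUESTION.findall(faq_text)}


# A short excerpt in the FAQ's own layout.
sample = """\
  Subject: -1-  What is the purpose of this newsgroup?
  Date: 28 Oct 92 00:00:01 EST

  Subject: -3-  What is WAIS?
  Date: 27 Dec 92 00:00:01 EST
"""

questions = find_questions(sample)
print(questions[3])  # What is WAIS?
```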


  Subject: -3-  What is WAIS?
  Date: 27 Dec 92 00:00:01 EST

  WAIS stands for Wide Area Information Servers.

  WAIS is a networked information retrieval system.  WAIS currently uses
  TCP/IP to connect client applications to information servers.  Client
  applications are able to retrieve text or multimedia documents stored
  on the servers. Client applications request documents using keywords.
  Servers search a full text index for the documents and return a list
  of documents containing the keyword.  The client may then request the
  server to send a copy of any of the documents found.
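
  The request and response cycle described above can be imitated with a
  toy full text index in Python.  This is a sketch of the general idea
  only, not of how the WAIS server or the Z39.50 protocol is
  implemented; the function names, the document names, and the simple
  hit-count ranking are all assumptions made for the example.

```python
from collections import defaultdict


def build_index(documents: dict[str, str]) -> dict[str, dict[str, int]]:
    """Map each lowercased word to {document name: number of occurrences}."""
    index = defaultdict(lambda: defaultdict(int))
    for name, text in documents.items():
        for word in text.lower().split():
            index[word.strip(".,;:!?\"'()")][name] += 1
    return index


def search(index, keywords: str) -> list[tuple[str, int]]:
    """Return (document name, hits) for documents containing any keyword, best first."""
    hits = defaultdict(int)
    for word in keywords.lower().split():
        for name, count in index.get(word, {}).items():
            hits[name] += count
    return sorted(hits.items(), key=lambda item: (-item[1], item[0]))


# Made-up documents standing in for what a server might hold.
docs = {
    "plan.txt": "Disaster plan: water damage, mold, and water recovery.",
    "paper.txt": "Notes on paper deacidification.",
    "mold.txt": "Mold on leather bindings.",
}
index = build_index(docs)
results = search(index, "water mold")
print(results)  # [('plan.txt', 3), ('mold.txt', 1)]

# The client then asks for the full text of a document from the list.
best_name = results[0][0]
print(docs[best_name])
```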

  Although the name "Wide Area" implies the use of the large networks
  such as the Internet to connect clients to servers distributed around
  the network, WAIS can be used between a client and server on the same
  machine or a client and server on the same LAN.

  WAIS uses the Z39.50 query protocol to communicate between clients and
  servers. WAIS does not, at this time, implement the full Z39.50-1992
  specification. In particular, WAIS does not permit boolean searches
  but instead is restricted to relevance feedback.

  There are a large number of servers running currently [over 350
  databases]. Topics range from recipes and movies to bibliographies,
  technical documents, and newsgroup archives.

  WAIS is a project of Thinking Machines, Apple Computer and Dow Jones.
  WAIS is a free product available with full source to the server,
  indexing software, and many clients.


  Subject: -4-  Where can I find more information on WAIS?
  Date: 3 Dec 92 00:00:01 EST

  Depending upon the information you seek there are many options.

  Perhaps the best place to start is the WAIS white sheet available via
  anonymous FTP from in the file wais-corporate-paper.text.
  This will give you a good idea of why people got interested in WAIS
  and a very simple overview of the WAIS architecture.

  If you want to learn more about how WAIS really works or answer other
  FAQ's the best place to start is the documentation that comes with
  WAIS. The WAIS distribution is available via anonymous FTP from in the
  file /wais/wais-8-b5.1.tar.Z.  After uncompressing
  and untarring the distribution, you will find a ./doc directory that
  includes a more complete FAQ, documents for programmers, users guides,
  protocol specifications, a paper on digital librarian ethics, and a
  bibliography of WAIS articles.

  If you wish to do further reading the bibliography of articles
  published on WAIS is also available separately from in the
  file /wais/bibliography.txt.

  Next, of course, there is the newsgroup comp.infosystems.wais. The
  newsgroup is regularly visited by the authors of WAIS at and
  other experts on using both WAIS and other resources on the Internet.
  After listening in on the group for a while, you are welcome to post
  your questions if you have been unable to find an answer yourself from
  the documentation.

  Finally, there are a number of mailing lists which you can join if you
  wish to follow WAIS.

  wais-interest           Contact: wais-interest-request [at] think__com
      This is a moderated list used to announce new releases for the
      Internet environment.

  wais-discussion         Contact: wais-discussion-request [at] think__com
      WAIS-discussion is a digested, moderated list on electronic
      publishing issues in general and Wide Area Information Servers in
      particular.  There are postings every week or two.

  wais-talk               Contact: wais-talk-request [at] think__com
      WAIS-talk is an open list (interactive, not moderated) for
      implementors and developers.  This is a technical list that is not
      meant to be used as a support list.

  Z3950IW                 Contact: LISTSERV [at] nervm__nerdc__ufl__edu
      Z39.50 Implementors list for low level discussions of protocol
      details.


  Subject: -5-  How can I get access to WAIS?
  Date: 19 Nov 92 00:00:01 EST

  Perhaps the easiest way to get started, if you do not want to get a
  copy of the full distribution and build your own clients is to try
  WAIS out using the client running at Thinking Machines.  To do this
  you must use TELNET to connect to and enter the
  username "wais" [lowercase-no quotes] at the "login:" prompt.  This
  will permit you to use swais (Screen WAIS). swais is a curses based
  interface, so if you have problems, it may be due to your terminal
  setup.  If you are unsure of the commands, try using a question mark
  [?] at the prompt.


  Subject: -6-  Where can I find WAIS software for the XYZ OS?
  Date: 3 Dec 92 00:00:01 EST

  There are a number of sources for WAIS software available via
  anonymous FTP. [please try
  first, if in Europe]
      This is the main UNIX distribution.  It includes waisindex, the
      program that builds the indexes, and waisserver, the program that
      responds to client queries.  The clients include:
      waissearch - a "dumb" tty client interface
      swais - a "simple" curses based client interface
      wais.el - A GNU Emacs client interface
      xwais - an X Window System client interface
      mxwais - an OSF-Motif client interface that requires the xwais
          source.
      xwais - an OpenLook (NeWS) client interface
      sunsearch - a SunView (SunTools) client interface
      client - a VAX VMS based client interface (based on the code
          from 8-b2?)
      waisserver - a VAX VMS based server
      waisindex - a VAX VMS based indexer
      a NeXTstep based client interface for NeXT
      WAIStation - a Macintosh interface client based on MacTCP.
      MacTCP must be obtained separately.
      Source to the client in THINK C is available from
      HyperWais - A Macintosh Hypercard client interface.
      Based on MacTCP and Hypercard (which must be obtained separately).
      Source is also available from
      pcwais - An MS-DOS client interface
      Based on Borland TurboVision and the Crynwr Packet Drivers.
      oacwais - An MS-DOS client interface
      Based on FTP Software's PC/TCP.
      FTP Software's PC/TCP must be obtained separately.
      wwais - a Microsoft Windows 3.0 client interface
      Based on Visual Basic and Novell's LAN Workplace for DOS.
      LAN Workplace for DOS must be obtained separately.

  You can also use Gopher to access WAIS.  For the availability of
  Gopher clients, please visit the comp.infosystems.gopher newsgroup.


  Subject: -7-  Where can I pick up the list of sources (e.g. databases) for WAIS?
  Date: 3 Dec 92 00:00:01 EST

  The current listing of publicly advertised sources is always available
  via anonymous FTP from in the /wais directory in the file
  wais-sources.tar.Z (a compressed UNIX tar file).

  [please try first, if in Europe]

Appendix 2 SWAIS help

SWAIS(1)                UWO (1992-12-04)                 SWAIS(1)
NAME
    swais, vtswais - a simple WAIS query front-end


SYNOPSIS
    swais [-s sourcename] [-S sourcedir] [-C common sourcedir]
    [-h] [keywords]
    vtswais [-s sourcename] [-S sourcedir] [-C common sourcedir]
    [-h] [keywords]


DESCRIPTION
    swais is a curses-based, simple screen user interface for making
    WAIS queries.  This Simple WAIS interface is a basic access tool
    designed for those focused on data retrieval and not computer
    operation. It provides most of the functionality of the more
    complicated interfaces but features a simple and potentially more
    natural interface for non-bitmapped screens.  The functionality
    supported includes source selection, keyword entry, and automatic
    document retrieval.

    There is currently no provision for relevance feedback based
    questions nor is there a mechanism for storing questions to be asked
    again.  Remember, this is a simple interface! This software is
    fairly new and experimental.  You should expect a few bugs.

    vtswais is a special version of swais that forces your terminal type
    to be a VT100 variant that works around a bug or two in swais.  Use
    it on all machines when your terminal type is set to xterm as it
    fixes a problem where the last page of a document is cleared before
    it can be viewed.  You might want to try vtswais if you encounter
    some other strangeness in the swais program.

USING swais

     When swais is first started, the "Source Selection" screen is
     displayed.  The database source files displayed here are a
     combination of those in the system source area and those that you
     have copied into your own $HOME/wais-sources directory.  You must
     create this directory before trying to save any database sources.

     Select a database source (or source for short) by pressing the up
     and down arrow keys to move the reverse-video bar (if available on
     your terminal) over the source that you want and then pressing the
     space-bar.  The selected source is marked with a star.  You can
     select one or more sources for a search.  A control-V, control-D or
     K will move you down a page of sources, <esc>v, control-U or J will
     move you up a page.  The slash /<string> command can be used to
     search forward for the next line with a given string in the sources

     Type a "w" to move the cursor to the keyword list to be used in the
     search.  Enter the keywords.  Use control-U to erase them and start
     again, <rubout> to delete one character at a time.

     Once you have typed your keyword list, type a <return> to do the
     search of your selected sources using the keywords typed.

     When your search is complete and there are some matches, a "Search
     Results" screen is displayed.  It lists information headlines about
     each matched entry.  The score column gives you a 1-1000 rating for
     how "good" the match was. To retrieve an entry, move the bar over
     it using the cursor keys (or the slash search command or by number)
     and press the space bar.  The document is retrieved and displayed
     using your PAGER.  To retrieve and save or otherwise process a
     document use the vertical bar "|" command to pipe the document to a
     program.  For example the command "cat >/tmp/article" would save
     the retrieved document in the file /tmp/article and the command
     "lpr" would send the item to your default printer.  You can also
     use the "m" command to email a document.
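
     For readers unfamiliar with pipes, the effect of the "|" command
     can be imitated in Python: the document text is handed to a shell
     command on its standard input.  This is a sketch of the idea, not
     of what swais does internally; the function name is made up, and
     the example writes to a temporary directory rather than to
     /tmp/article.  It assumes a Unix-like system with "cat" available.

```python
import shlex
import subprocess
import tempfile
from pathlib import Path


def pipe_to_command(document: str, command: str) -> int:
    """Feed the document text to a shell command on its standard input."""
    completed = subprocess.run(command, shell=True, input=document, text=True)
    return completed.returncode


# Stand-in for a retrieved document.
document = "Retrieved article text\n"

# Equivalent of typing: | cat >/some/dir/article
target = Path(tempfile.mkdtemp()) / "article"
status = pipe_to_command(document, "cat > " + shlex.quote(str(target)))

print(status, target.read_text() == document)
```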

     When retrieving from a directory of source files (like the
     directory-of-servers source) you can use the "u" command to add
     it to your list of personal sources instead of the "<space>" to
     view it.  Once the source has been added, it will appear in the
     "Source Selection" screen.  (The source file will be copied into
     the $HOME/wais-sources directory (which must exist!))

     To return to select another source or to try different search
     terms, use the "s"  (source) or "w" (word) commands.

     To exit from the program, type a "q" at the source selection or
     results screen.


OPTIONS
     The command line options can be used to customize or accelerate
     your use of swais.  Most people just leave these out and interact
     directly with the program.  See the previous section for details.

     -s is followed by a WAIS database source name. That name will be
     selected and you will be immediately placed at the keyword prompt
     (if no key words were entered on the command line).  Note that only
     one database may be selected with the -s command-line option.  The
     last one mentioned on the command line is used.  (You can always
     add others once you are inside the program.)

     -S is used to specify the directory to look for your own personal
     set of WAIS sources.  This directory is also used to save (the Use
     command) any new database sources that you may discover.  If you
     don't specify this switch, $HOME/wais-sources is assumed.

     -C is used to specify the directory to use for the system-wide
     database sources.  On our machines this defaults to

     -h is used to print a summary of the command line options to

     If a list of keywords is included on the command line they will be
     used for your initial search.
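
     The option handling described above can be mirrored with Python's
     argparse module.  This sketch only illustrates the documented
     behaviour (the last -s wins, -S falls back to $HOME/wais-sources);
     it is not the swais source, and the source names on the sample
     command line are made up.

```python
import argparse
import os


def build_parser() -> argparse.ArgumentParser:
    # add_help=False frees -h so it can be declared explicitly, as in the synopsis.
    parser = argparse.ArgumentParser(prog="swais", add_help=False)
    parser.add_argument("-s", dest="sourcename",
                        help="database source to preselect")
    parser.add_argument("-S", dest="sourcedir",
                        default=os.path.join(os.path.expanduser("~"), "wais-sources"),
                        help="directory of personal sources")
    parser.add_argument("-C", dest="common_sourcedir",
                        help="directory of system-wide sources")
    parser.add_argument("-h", action="help",
                        help="print a summary of the command line options")
    parser.add_argument("keywords", nargs="*",
                        help="keywords for the initial search")
    return parser


# Two -s options: argparse keeps the last one, matching the text above.
args = build_parser().parse_args(
    ["-s", "first.src", "-s", "second.src", "disaster", "plan"]
)
print(args.sourcename)  # second.src
print(args.keywords)    # ['disaster', 'plan']
```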


FILES
     $(HOME)/wais-sources location for personal sources.
     /uwo/ccs/share/lib/wais-sources location for system-wide sources.


BUGS
     While swais works on non-VT100 screens, it isn't very good looking
     unless the screen has some sort of highlighting. The program uses
     highlighting to indicate the current selection and you will require
     a fast eye to see which selection or document is the current one if
     your screen doesn't implement reverse-video or some other form of

     If you try to save (the use command) a source without having a
     $HOME/wais-sources directory, it looks as if the source is added to
     your list, but in fact nothing is done and no error message is
     generated.  Create this directory first.
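
     Creating the directory ahead of time is a one-line job in most
     shells.  For those who script their setup, a minimal Python sketch
     follows; the function name is made up, and the demonstration uses
     a temporary directory in place of your real home directory.

```python
import tempfile
from pathlib import Path
from typing import Optional


def ensure_sources_dir(home: Optional[Path] = None) -> Path:
    """Create <home>/wais-sources if it is missing and return its path."""
    sources = (home or Path.home()) / "wais-sources"
    sources.mkdir(parents=True, exist_ok=True)
    return sources


# Demonstration against a throwaway directory.
demo_home = Path(tempfile.mkdtemp())
created = ensure_sources_dir(demo_home)
print(created.is_dir())  # True
```

     Calling ensure_sources_dir() with no argument acts on your own
     home directory and is safe to repeat.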

     The <esc>v command (for moving up a page) doesn't work on MIPS or
     Sun4 machines.  Use one of the alternative commands: J or
     control-U.


SEE ALSO
     xwaisq(1), xwais(1), waissearch(1), waisindex(1),

AUTHORS
     Program by John Curran (jcurran [at] nnsc__nsf__net).
     Third pass at a manual by Peter Marshall CCS, The University of
     Western Ontario: <peter.marshall [at] uwo__ca>.

                  Conservation DistList Instance 6:42
                Distributed: Wednesday, February 3, 1993
                        Message Id: cdl-6-42-001
Received on Wednesday, 3 February, 1993
