
<HTML>

<HEAD>
<TITLE>WORDS 1.97F (LATIN-ENGLISH DICTIONARY) PROGRAM DOCUMENTATION</TITLE>
</HEAD>
<BODY>


<H1><CENTER>WORDS Version 1.97FC</CENTER>
<CENTER>LATIN-ENGLISH DICTIONARY PROGRAM</CENTER></H1>

<BR><BR>

<A HREF="#SUMMARY"><B>SUMMARY</B></A><BR>
<BR>
<A HREF="#INSTALLATION"><B>INSTALLATION</B></A><BR>
<A HREF="#Is There a Problem">Is There a Problem?</A><BR>
<BR>

<A HREF="#INTRODUCTION"><B>INTRODUCTION</B></A><BR>
<BR>
<A HREF="#OPERATIONAL DESCRIPTION"><B>OPERATIONAL DESCRIPTION</B></A><BR>
<A HREF="#Program Operation">Program Operation</A><BR>
<A HREF="#Modes of Operation">Modes of Operation</A><BR>
<A HREF="#Command Line Operation">Command Line Operation</A><BR>
<A HREF="#Latin-to-English Examples">Latin-to-English Examples</A><BR>
<A HREF="#English-to-Latin Examples">English-to-Latin Examples</A><BR>
<A HREF="#Design of the Meaning Line">Design of the Meaning Line</A><BR>
<A HREF="#Signs and Abbreviations in Meaning">Signs and Abbreviations in Meaning</A><BR>
<BR>
<A HREF="#PROGRAM DESCRIPTION"><B>PROGRAM DESCRIPTION</B></A><BR>
<A HREF="#Codes in Inflection Line">Codes in Inflection Line</A><BR>
<A HREF="#Help for Parameters">Help for Parameters</A><BR>
<A HREF="#Special Cases">Special Cases</A><BR>
<A HREF="#Uniques">Uniques</A><BR>
<A HREF="#Tricks">Tricks</A><BR>
<A HREF="#Trimming of uncommon results">Trimming of uncommon results</A><BR>
<BR>
<A HREF="#GUIDING PHILOSOPHY"><B>GUIDING PHILOSOPHY</B></A><BR>
<A HREF="#Purpose">Purpose</A> <BR>
<A HREF="#Method">Method</A><BR>
<A HREF="#Word Meanings">Word Meanings</A><BR>
<A HREF="#Proper Names">Proper Names</A><BR>
<A HREF="#Letter Conventions (u/v, i/j, w)">Letter Conventions (u/v, i/j, w)</A><BR>
<BR>
<A HREF="#DICTIONARY"><B>DICTIONARY</B></A><BR>
<A HREF="#Dictionary Codes">Dictionary Codes</A><BR>
<A HREF="#AGE">AGE</A><BR>
<A HREF="#AREA">AREA</A><BR>
<A HREF="#GEO">GEO</A><BR>
<A HREF="#FREQ">FREQ</A><BR>
<A HREF="#SOURCE">SOURCE</A><BR>
<A HREF="#Current Distribution of DICTLINE Flags">
Current Distribution of DICTLINE Flags</A><BR>
<A HREF="#Dictionary Conventions">Dictionary Conventions</A><BR>
<A HREF="#Evolution of the Dictionary">Evolution of the Dictionary</A><BR>
<A HREF="#Text Dictionary - DICTPAGE.TXT">Text Dictionary - DICTPAGE.TXT</A><BR>
<A HREF="#Latin Spellchecking - Text Processor List - LISTALL.ZIP">
Latin Spellchecking - Text Processor List - LISTALL.ZIP</A><BR>
<BR>
<A HREF="#INFLECTIONS"><B>INFLECTIONS</B></A><BR>
<BR>
<A HREF="#ENGLISH to LATIN"><B>ENGLISH to LATIN</B></A><BR>
<A HREF="#English Parsing of Meanings">English Parsing of Meanings</A><BR>
<A HREF="#Ordering English-to-Latin Output">Ordering English-to-Latin Output</A><BR>
<BR>
<A HREF="#TESTS AND STATUS"><B>TESTS AND STATUS</B></A><BR>
<A HREF="#Testing">Testing</A><BR>
<A HREF="#Current Status and Future Plans">Current Status and Future Plans</A><BR>
<BR>
<A HREF="#USER MODIFICATIONS"><B>USER MODIFICATIONS</B></A><BR>
<A HREF="#Writing DICT.LOC and UNIQUES.LAT">Writing DICT.LOC and UNIQUES.LAT</A><BR>
<A HREF="#DICT.LOC">DICT.LOC</A><BR>
<A HREF="#UNIQUES.LAT">UNIQUES.LAT</A><BR>
<BR>
<A HREF="#DEVELOPERS AND REHOSTING"><B>DEVELOPERS AND REHOSTING</B></A><BR>
<A HREF="#Program source code and data">Program source code and data</A><BR>
<A HREF="#License">License</A><BR>
<A HREF="#Rehosting WORDS">Rehosting WORDS</A><BR>
<A HREF="#Feedback">Feedback</A><BR>
<BR><BR>


<A NAME="SUMMARY">
<H2><CENTER>SUMMARY</CENTER>
</H2></A> <BR>
<BR>

<P>
This program, WORDS, takes keyboard input or a file of Latin text lines and
provides an analysis of each word individually.  It uses the data files
INFLECT.SEC, UNIQUES.LAT, ADDONS.LAT, STEMFILE.GEN, INDXFILE.GEN, and
DICTFILE.GEN, and possibly a SPECIAL (.SPE) dictionary and DICT.LOC.
<P>
The dictionary contains over 39000 entries, as would be counted in an
ordinary dictionary.  This expands to almost twice that number of
individual stems (the count that the program may display at startup), and,
through additional word construction with hundreds of prefixes and
suffixes, may generate more, leading to many hundreds of thousands of
'words' that can be formed by declension and conjugation.  This version of
WORDS provides a tool to help in translations for the Latin student.  It
is now a large dictionary by any measure and can be helpful to advanced
users.  The dictionary will continue to grow - slowly.  <BR>
<BR>


<A NAME="INSTALLATION">
<H2><CENTER>INSTALLATION</CENTER>
</H2></A> <BR>

<P>
The WORDS program, with its accompanying data files, should run on any
machine for which it is adapted, with any monitor.  Simply download the
self-extracting EXE files or the compressed file for the appropriate
system and execute/decompress it in your chosen subdirectory on the hard
disk, creating the necessary files.  Then call/run WORDS, or do as instructed
in any README.

<P>The load includes SPQR.ICO, a possible icon for WORDS,
but just that, only an icon.
You have to install the program as per the directions
(put the downloaded files in a folder,
run them to expand to the WORDS system, then run from that folder).
However, if you are Windows-wise, you can use Explorer and
make a shortcut and put it on the desktop.
Windows will make a generic icon,
but you can change it (using Properties)
to whatever other icon you can find, for instance,
the one included with the package.  Or not.
Make sure that the Properties on the icon
has as Target the WORDS.EXE
in the folder in which the system is loaded.

<P>
See the particular page for each specific system.  <BR>
<A HREF="http://www.erols.com/whitaker/wordsdos.htm"><B>DOS</B></A><BR>
<A HREF="http://www.erols.com/whitaker/wordsw95.htm"><B>Windows 95/NT/98/2000/XP</B></A><BR>
<A HREF="http://www.erols.com/whitaker/wordslux.htm"><B>Linux and FreeBSD</B></A><BR>
<A HREF="http://www.erols.com/whitaker/wordsos2.htm"><B>OS/2</B></A><BR>
<A HREF="http://www.erols.com/whitaker/wordsmac.htm"><B>MAC OS X</B></A><BR>
<BR><BR>

<A NAME="Is There a Problem">
<H4>Is There a Problem?</H4></A>

<P>Did you download the two appropriate file(s) to your hard disk,
as listed in the download page for your system?

<P>Can you verify that they are there and full size (megabytes as indicated)?

<P>Did you execute/run/unzip these programs?

<P>If self-extracted, were you asked where to put the generated files?
(Maybe a default C:\WORDS)?
If not, did you put them in the folder/subdirectory from which you wish to operate?

<P>Can you verify that the full set of files (about 10 MB) was generated in that folder/subdirectory,
or wherever you chose? At least
WORDS.EXE, INFLECT.SEC, UNIQUES.LAT, ADDONS.LAT, STEMFILE.GEN, INDXFILE.GEN, and DICTFILE.GEN,
plus documentation.
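<P>The file check above can also be done mechanically.  The following is a
minimal sketch in Python, not part of the WORDS distribution.  The function
name <TT>missing_words_files</TT> is made up for this illustration, and the
list of names is simply the one given in the checklist above.

```python
from pathlib import Path

# File names copied from the checklist above.
REQUIRED_FILES = [
    "WORDS.EXE", "INFLECT.SEC", "UNIQUES.LAT", "ADDONS.LAT",
    "STEMFILE.GEN", "INDXFILE.GEN", "DICTFILE.GEN",
]

def missing_words_files(folder):
    """Return the required file names that are not present in folder.

    Names are compared without regard to case, since some systems
    store the files with lower-case names (an assumption of this sketch).
    """
    present = {p.name.upper() for p in Path(folder).iterdir() if p.is_file()}
    return [name for name in REQUIRED_FILES if name not in present]
```

<P>Calling <TT>missing_words_files("C:/WORDS")</TT> returns an empty list if
everything is in place, otherwise the names still to be found.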

<P>Did you run/execute WORDS in that folder/subdirectory?  e.g. <BR>
<B>C:\WORDS</B>

<P>If when you try to run there is no WORDS.EXE (or equivalent),
the system should let you know.<BR>
If there is no INFLECT.SEC, the program will say so and abort immediately.<BR>
If there are no dictionary files, the program will tell you, but will start
(you can get Roman numerals!).<BR>
If there is no ADDONS.LAT or UNIQUES.LAT, the program will tell you,
and if they are there it will tell you how many.<BR>

<BR><BR>

<A NAME="INTRODUCTION">
<H3><CENTER>INTRODUCTION</CENTER>
</H3></A><BR>
<BR>

<P>
I am no expert in Latin; indeed, my training is limited to a couple of
years in high school more than 50 years ago.  But I always felt that Latin, as
presented after two millennia, was a scientific language.  It had the
interesting property of inflection: words were constructed in a logical
manner.  I admired this feature, but could never remember the vocabulary
well enough when it came time to exercise it on tests.
<P>
I decided to automate an elementary-level Latin vocabulary list.  As a
first stage, I produced a computer program that will analyze a Latin word
and give the various possible interpretations (case, person, gender,
tense, mood, etc.), within the limitations of its dictionary.  This might
be the first step to a full parsing system, but, although just a
development tool, it is useful by itself.
<P>
<B>Please remember that this is only a computer exercise in automating a
Latin dictionary.  I am not a Latin scholar and anything in the program or
documentation is filtered by me from reading the cited Latin dictionaries.  Please
let no one go to his teacher and cite my interpretation as an authority.  </B>
<P>
While developing this initial implementation, based on different sources,
I learned (or re-learned) something that I had overlooked at the
beginning.  Latin courses, and even very large Latin dictionaries, are put
together under very strict ground rules.  Some dictionary might be based
exclusively on 'Classical' (200 BC - 200 AD) texts; it might have every
word that appears in every surviving writing of Cicero, but nothing much
before or since.  Such a dictionary will be inadequate for translating
medieval theological or scientific texts.  In another example, one
textbook might use Caesar as its main source of readings (my high school
texts did), while another might avoid Caesar and all military writings
(either for pacifist reasons, or just because the author had taught Caesar
for 30 years and had grown bored with going over the same material, year
after year).  One can imagine that the selection of words in such
different texts would differ considerably; moreover, even with the same
words, the meanings attached would be different.  This presents a problem
in the development of a dictionary for general use.
<P>
One could produce a separate dictionary for each era and application or a
universal dictionary with tags to indicate the appropriate application and
meaning for each word.  With such a tag arrangement one would not be
offered inappropriate or improbable interpretations.  The present system
has such a mechanism, but it is not fully exploited.
<P>
The Version 1.97E dictionary may be found to be of fairly general use for
the student; it has all the easy words that every text uses.  It also has the
adverbs, prepositions, and conjunctions, which are not as
sensitive to application as are the nouns and verbs.  The system also
tests a few hundred prefixes and suffixes, if the raw word cannot be
found.  Beyond that, there are a large number of TRICKS which may be applied.
These may be thought of as correcting for variations in spelling.
This allows an interpretation of many words which would otherwise
be marked unknown.  The result of this analysis is fairly straightforward
in most cases, accurate but esoteric in some others.  Some constructions
are recognized Latin words, and some are perfectly reasonable words which
may never have been used by Cicero or Caesar but might have been used by
Augustine or a monk of Jarrow.  For about 1 in 10 constructed words the
result has no relation to the normal dictionary meaning.
<P>
BE WARNED!  The program will go to great lengths if all tricks are
invoked.  If you get a word formed with an enclitic, prefix, suffix, and
syncope, be very suspicious!  It may well be right, but look carefully.
(Try siquempiamque!)
<P>
The final try is to look at the input as two words run together.  In
most cases this works out, and is especially useful for late Latin number
usage.  However, this algorithm may go very wrong.  If it is not obviously
right, it is probably incorrect.
<P>
With this facility, and a 39000 word dictionary, trials on some tested
classical texts and the Vulgate Bible give hit rates of far better than
99%, excluding proper names (there are very few proper names in this
dictionary).  (I am an old soldier so the dictionary may have
every possible word for attack or destroy.  The system is near perfect for
Caesar.) The question arises: what hit rate can be expected for a general
dictionary?  Classical Latin dictionaries have no references to the
terminology of Christian theology.  The legal documents and deeds of the
Middle Ages are a challenge of jargon and abbreviations.  These areas
require special knowledge and vocabulary, but even there the ability to
handle the non-specialized words is a large part of the effort.
<P>
The development system allows the inclusion of specialized vocabulary (for
instance a SPEcial dictionary for specialized words not wanted in most
dictionaries), and the opportunity for the user to add additional words to
a DICT.LOC.
<P>
It was initially expected that there would be special dictionaries for
special applications.  That is why there is the possibility of a SPECIAL
dictionary.  Now the general dictionary is coded by AGE and application
AREA.  Thus special words used initially/only by St Thomas Aquinas would
be Medieval (AGE code F) and Ecclesiastical (AREA code E).  Eventually
there needs to be a filter that allows one, upon setting parameters for
Medieval and Ecclesiastical, to push those words over others.  Right now
there is not enough non-classical vocabulary to rely on such a
scheme.  The problem is that one needs a very complete classical
dictionary before one can assure that new entries are uniquely Medieval,
that they are not just classical words that appear in a Medieval text.
And the update is only into the D's.  So the situation is that the
mechanism is there, but not sufficient data.  Nevertheless that is exactly
the application I had in mind when I set out to do the program.
<P>
One can set a parameter to exclude medieval words if there is a classical
word answering the same parse.  Likewise, the program can ignore rare
meanings if there is a common meaning for the parse.
<P>
The program may be larger than is necessary for the present
application.  It is still in development but some effort has now been put
into optimization.  Nevertheless there is lots of room for speeding it up.
Specifically, the program is disk-oriented in order to run on small machines,
such as DOS with the 640KB limitation.  Rejecting this limitation and assuming
that the user has tens of megabytes of memory (clearly realistic today)
would allow faster processing.  The next version may go that way.

<P>
This is a free program, which means it is proper to copy it and pass it on
to your friends.  Consider it a developmental item for which there is no
charge.  However, just for form, it is Copyrighted (c).
Permission is hereby freely given for any and all use of program and data.
You can sell it as your own, but at least tell me.

<P>
This version is distributed without obligation, but the developer would
appreciate comments and suggestions.
<P>
<BR>
William A Whitaker <BR>
PO Box 3036 <BR>
McLean VA 22103-3036 <BR>
USA <BR>
whitaker@erols.com <BR>
<BR>


<A NAME="OPERATIONAL DESCRIPTION">

<H3><CENTER>OPERATIONAL DESCRIPTION</CENTER>
</H3></A> <BR>

<P>
This write-up is rudimentary and assumes that the user is experienced with
computers, and as an example assumes a PC with a Windows OS.
Other systems operate essentially the same.
<P>
The WORDS program, Version 1.97E, with its accompanying data files, should
run on a PC in Windows 95/98/NT, with any monitor.  Simply download the
self-extracting EXE file and execute it in your chosen subdirectory/folder to
UNZIP the files into a subdirectory of a hard disk.  Then call WORDS.
<P>
There are a number of files associated with the program.  These must be in
the subdirectory/folder of the program, and the program must be run from that
subdirectory.  WORDS.EXE is the executable program.  INFLECT.SEC holds the
encoded inflection records.  STEMFILE.GEN contains the stems of the
GENERAL dictionary in a searchable form.
DICTFILE.GEN is an indexed form of the GENERAL dictionary entries with form
information and meanings.  INDXFILE.GEN contains a set of indexes into the
DICTFILE.  In some versions, there may be a set of files for a SPECIAL (.SPE)
dictionary of the same structure as the GENERAL dictionary, but there is
no SPECIAL dictionary in the present distribution.  A LOCAL dictionary may
also be used.  This is a limited dictionary of a different form, human
readable and writeable.  The knowledgeable user can augment and modify it
on-line.  It would consist of the file DICT.LOC.  UNIQUES.LAT contains
certain words which regular processing does not get.  ADDONS.LAT contains
the set of prefixes, suffixes and enclitics (-que, -ve) and the like.
Other files may be generated by the program, so run it in a configuration
that allows the creation of files.
<P>
All these files are necessary to run the program (except the optional
dictionaries SPE and LOC).  This excess of files is a consequence of the
present developmental nature of the program.  The files are very simple,
almost human-readable.  Presumably, a later version could condense and
encode them.  Nevertheless, beyond the original COPY, the user need not
worry about them.
<P>
Additionally, there are files that the program may produce on request.
All of these share the name WORD, with various extensions, and they are
all ASCII/DOS text files which can be viewed and processed with an ordinary
editor.  The casual user may not want to get involved with
these.  WORD.OUT will record the whole output; WORD.UNK will list only
words the program is unable to interpret.  These outputs are turned on
through the PARAMETERS mechanism.
<P>
PARAMETERS may be changed while running the program by inputting a line
containing a '#' mark as the only (or first) character.  Alternatively,
WORD.MOD contains the MODES that can be set by CHANGE_PARAMETERS.  If this
file does not exist, default modes will be used.  The file may be produced
or changed when changing parameters.  It can also be modified, if the user
is sufficiently confident, with an editor, or deleted, thereby reverting
to defaults.
<P>
There is another set of developer's parameters which may be set
with the input of '!'.  These MODES may be changed and saved in a
file WORD.MDV.  These are not normal user facilities; probably no one but
the developer would be interested.  In any specific release these
facilities may, or may not, work.  They are just mentioned here in case
they ever come up accidentally, and to point out that there are other
capabilities, actual and possible, which may be invoked if there is a
special need.  The user is invited to review these parameters to see
if any address an unusual need.
<P>
WORD.OUT is the file produced if the user requests
output to a file.  This output can be used for later manipulation with a
text editor, especially when the input was a text file of some length.  If
the parameter UNKNOWNS_ONLY is set, the output serves as a sort of a Latin
spell checker.  Those words it cannot match may just not be in the
dictionary, but alternatively they may be typos.  A WORD.UNK file of
unknowns can be generated.
<BR>
<BR>
<A NAME="Program Operation">
<H4>Program Operation</H4></A>

<P>
To start the program, in the subdirectory that contains all the files,
type WORDS.  A setup procedure will execute, processing files.  Then the
program will ask for a word to be keyed in.  Input the word and give a
return (ENTER).  Information about the word will be displayed.
<P>
One can input a whole line at a time, however long,
but only one line since the return
at the end of line will start the processing.  If the results would fill
more than a computer screen, the output is halted until the user responds
to the 'MORE' message with a return.  A file containing a text, a series
of lines, can be input by keying in the character '@', followed (with no
spaces) by the DOS name of the file of text.  This input file need not be
in the program subdirectory; just use the full path and name of the
file.  This is usually accompanied with the setting of the parameter
switches to create and write to an output file, WORD.OUT.
<P>
One can have a comment in the file, a terminal portion of a line that is
not parsed.  This could be an English meaning, a source where the word was
found, an indication that it may have been miscopied, etc.  A comment
begins with a double dash [--] and continues to the end of the line.  The
'--' and everything after on that line is ignored by the program.
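<P>The comment rule can be illustrated with a short Python sketch.  This is
not code from WORDS; the function name <TT>strip_comment</TT> is invented
for the example, and it only mirrors the behavior described above.

```python
def strip_comment(line):
    """Return the part of an input line that would be parsed.

    Everything from the first '--' to the end of the line is dropped,
    along with any spaces left in front of the comment.
    """
    text, _, _ = line.partition("--")
    return text.rstrip()
```

<P>For instance, <TT>strip_comment("arma virumque cano -- Aeneid 1.1")</TT>
keeps only the three Latin words.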
<P>
A simple # character input at the start of a line (that is, a line
containing only #) will permit the user to set modes to prevent the
process from trying prefixes and suffixes to get a match on an item
unknown to the dictionary, put output to a file, etc.  Going into the
CHANGE_PARAMETERS, the '?' character calls help for each entry.
<P>
Another set of parameters is invoked by the character !.  These developer parameters
are fairly specialized and are probably not required by the average user,
nevertheless they are available for special applications.
<P>
Two successive returns with no text will terminate the program (except in
text being read from an @ disk file).
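<P>The special leading characters described in this section can be
summarized as a small dispatcher.  The Python below is only a sketch of the
rules as stated here; the function name and the labels it returns are
invented for illustration and are not identifiers from the program.

```python
def classify_input(line):
    """Classify one interactive input line by its leading character.

    Returns a (kind, payload) pair. The kind labels are made up for
    this sketch.
    """
    text = line.strip()
    if text == "":
        return ("blank", "")               # two blanks in a row end the session
    if text[0] == "#":
        return ("change_parameters", "")   # user parameter menu
    if text[0] == "!":
        return ("developer_parameters", "")
    if text[0] == "@":
        return ("input_file", text[1:])    # file name follows '@' with no space
    return ("words", text)                 # ordinary line of words to analyze
```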

<A NAME="Modes of Operation">
<H4>Modes of Operation</H4></A>

<P>The mode of operation of WORDS can be specialized by setting some combination
of available parameters.  Here are a couple of example situations.

<P>If you want only meanings to show up, set the # parameter
<BR>
DO_ONLY_MEANINGS => Yes
<BR>
<P>If you do not even want to see the dictionary form (principal parts) set
# parameter
<BR>
DO_DICTIONARY_FORM => No
<BR>
<P>If you want to accept only the dictionary entry (amo, but not amas), set
the ! parameter (this is the tricky one, requiring two parameters set)
<BR>
DO_ONLY_INITIAL_WORD => Yes
<BR>
<P>This will then require you to input one entry per line, which is not
unreasonable for a dictionary look-up process.  Then you will be offered
another, otherwise unavailable, option
<BR>
FOR_WORD_LIST_CHECK => Yes
<BR>
<P>There are a large number of other options.  The user is invited
to consider all the options if needing anything more than the basic parse.

<P>Of course, for both sets of parameters, you will want to go to the end
of the parameter setting menu and save this set so you can restart with
the same situation.


<A NAME="Command Line Operation">
<H4>Command Line Operation</H4></A>


The main mode of usage for WORDS is a simple call, followed by screen interaction.
<P>
But there are other, command line, options.
WORDS may be called with arguments on the same line, in a number of different modes.
The program will execute with these arguments as input.
Remember that the saved parameter settings (in WORD.MOD and WORD.MDV)
are controlling, even for command line input.

<P>
Single argument, either a simple Latin word or an input file.

<P>
WORDS amo
<BR>which will cause it to execute for that input and then terminate.  This is
for a quick word.

<P>
WORDS infile
<BR>causes WORDS to execute with the contents of the infile.
The infile may be from any folder if the full path name is given.

<P>
With two arguments the options are: inputfile and outputfile,
two Latin words, or a language shift to English (Latin being the startup default)
and an English  word (with no part of speech).

<P>
WORDS infile outfile
<BR>The program will read as input the INFILE and write
the output to the OUTFILE (as though it were WORD.OUT).  It will then
await further input from the user.  It terminates with a return.  If the
parameters are not legal file names, the program will assume they are
Latin words to be processed as command line input.

<P>
WORDS amo amas

<P>
WORDS ^e  love
<BR>switches to English input from the default Latin and searches for love.

<P>
With three arguments there could be three Latin words or a language shift
and an English word and part of speech.

<P>
WORDS amo amas amat

<P>
WORDS ^e love v

<P>
More than three arguments must all be Latin words.

<P>
WORDS amo amas amat amamus amatis amant

<P>
There cannot be more than one English word in the argument list,
since there can only be one English word per line for WORDS input.
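<P>The argument rules above can be restated as a small Python sketch.  It
is not taken from the program.  The function name, the returned dictionary
keys, and in particular the test "an argument is a file if a file of that
name exists" are assumptions of this illustration; the text does not say
how WORDS itself decides between a file name and a Latin word.

```python
import os

def interpret_arguments(args):
    """Guess how a WORDS command line would be read, per the rules above.

    args is the argument list without the program name.
    """
    n = len(args)
    if n == 0:
        return {"mode": "interactive"}
    if args[0].lower() == "^e":
        # Language shift: one English word, optionally a part of speech.
        if n == 2:
            return {"mode": "english", "word": args[1], "pos": None}
        if n == 3:
            return {"mode": "english", "word": args[1], "pos": args[2]}
        raise ValueError("expected one English word, optionally a part of speech")
    if n in (1, 2) and os.path.isfile(args[0]):
        # Assumption: an existing file is treated as the input file.
        outfile = args[1] if n == 2 else None
        return {"mode": "file", "infile": args[0], "outfile": outfile}
    # Anything else is taken as a list of Latin words.
    return {"mode": "latin", "words": list(args)}
```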


<P>
An input file (either from interactive with @ or from command line)
can have changes of language, but the ^E or ^L must be on a separate line.
Note that this capability can create confusing situations.
An input file that starts off Latin then switches to English will be
correctly processed.  But if it is followed by a similar input file, the
second file will start off English (from the setting in the earlier file) and fail
on the Latin input.  Thus even submitting the same file twice in a run
will give different results.  This problem can be alleviated by starting each
input file with an explicit language instruction, but this will not normally be
the situation.
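<P>One way to apply that remedy is to add the language line to a text before
it is saved as an input file.  The Python below is a minimal sketch under
that assumption; the function name is invented, and only the ^L and ^E
markers come from the description above.

```python
def with_language_header(text, language="L"):
    """Return text with an explicit ^L or ^E line at the top if missing."""
    marker = language.upper()
    if marker not in ("L", "E"):
        raise ValueError("language must be 'L' or 'E'")
    lines = text.splitlines()
    first = lines[0].strip().upper() if lines else ""
    if first in ("^L", "^E"):
        return text                      # already starts with a language line
    return "^" + marker + "\n" + text    # language shift on its own line
```

<P>A file prepared this way starts in a known language no matter what an
earlier file in the same run left behind.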


<A NAME="Latin-to-English Examples">
<H4>Latin-to-English Examples</H4></A>
<P>
Following are annotated examples of output.  Examination of these will
give a good idea of the system.  The present version may not match these
examples exactly - things are changing - but the principle is there.  A
recent modification is the output of dictionary forms or 'principal parts'
(shown below for some examples).

<PRE><TT>=>agricolarum
agricol.arum         N      1 1 GEN P M
agricola, agricolae  N    M   [XAXBO]
farmer, cultivator, gardener, agriculturist; plowman, countryman, peasant;
</TT></PRE>

<P>
This is a simple first declension noun, and a unique interpretation.  The
'1 1' means it is first declension, with variant 1. This is an internal
coding of the program, and may not correspond exactly with the grammatical
numbering.  The 'N' means it is a noun.  It is the form for genitive
(GEN), plural ('P').  The stem is masculine (M).
The stem is given as 'agricol' and the ending is
'arum'.  The stem is normal in this case, but is a product of the program,
and may not always correspond to conventional usage.

<P>On the next line is given the expansion of the form that one might find
in a paper dictionary, the nominative and genitive (agricola, agricolae).
The [XAXBO] is an internal code of the program and is documented below as Dictionary Codes.
Several codes are associated with each dictionary entry (presently AGE, AREA, GEO, FREQ, SOURCE).
These provide some information to enhance the interpretation of the dictionary entry.
In this case, the interesting piece is the B, which signifies
that this word is found frequently in texts, in the top 10 percent.
The O says it has been verified in the Oxford Latin Dictionary.
The A says it is an agricultural word.
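<P>The bracketed code can be split into its five flags mechanically.  The
Python below is a sketch, not program code.  The function name is invented,
and the field order (AGE, AREA, GEO, FREQ, SOURCE) is inferred from the list
above and from the reading of [XAXBO] in this example.

```python
import re

# Field order as listed in the text; assumed to match the letter positions.
CODE_FIELDS = ("AGE", "AREA", "GEO", "FREQ", "SOURCE")

def dictionary_codes(entry_line):
    """Extract the five one-letter flags from a dictionary form line.

    Returns a dict keyed by field name, or None if the line carries
    no bracketed five-letter code.
    """
    match = re.search(r"\[([A-Z]{5})\]", entry_line)
    if match is None:
        return None
    return dict(zip(CODE_FIELDS, match.group(1)))
```

<P>For the agricola line this yields AREA A, FREQ B and SOURCE O, with X in
the AGE and GEO positions.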

<P>The declension/conjugation numbers for nouns and verbs are
essentially arbitrary (but will be familiar to Latin students).
The variants are complete inventions.
They have no real meaning, just codes for the program.

<P>(In the case of adjectives, they are even more arbitrary,
although a Latin student might see how I came by them.
Again they are only codes for the program.
The initial release of the program did not put these out,
but there is some interest on the part of students, so they are now included.
The user may ignore them altogether.
There is no relation between the declension/variant codes of a noun
and the accompanying adjective.
They only agree in case, number, and gender (NOM S N),
which are listed in the output.)
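<P>For anyone post-processing the output, the noun inflection line of the
agricolarum example can be taken apart as follows.  This Python sketch
assumes the whitespace-separated layout shown in that example; the function
name and dictionary keys are invented, and other parts of speech have
different layouts that it does not handle.

```python
def parse_noun_line(line):
    """Split a noun inflection line into its labelled parts."""
    fields = line.split()
    if len(fields) != 7 or fields[1] != "N":
        raise ValueError("not a noun inflection line: " + repr(line))
    # The first field is stem and ending joined by a period.
    stem, _, ending = fields[0].partition(".")
    return {
        "stem": stem,
        "ending": ending,
        "pos": fields[1],
        "declension": int(fields[2]),   # internal program code
        "variant": int(fields[3]),      # internal program code
        "case": fields[4],
        "number": fields[5],
        "gender": fields[6],
    }
```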


<PRE><TT>=>feminae
femin.ae             N      1 1 GEN S F
femin.ae             N      1 1 DAT S F
femin.ae             N      1 1 NOM P F
femin.ae             N      1 1 VOC P F
femina, feminae  N    F   [XXXAX]
woman; female;
</TT></PRE>

<P>
This word has several possible interpretations in case and number
(Singular and Plural).  The gender is Feminine.  Presumably, the user can
examine the adjoining words and reduce the set of possibilities.

<PRE><TT>=>cornu
corn.u               N      4 1 ABL S F
cornus, cornus  N    F   [XXXCO]
cornel-cherry-tree (Cornus mas); cornel wood; javelin (of cornel wood);
corn.u               N      4 2 NOM S N
corn.u               N      4 2 DAT S N
corn.u               N      4 2 ABL S N
corn.u               N      4 2 ACC S N
cornu, cornus  N    N   [XXXAO]
horn; hoof; beak/tusk/claw; bow; horn/trumpet; end, wing of army; mountain top;
*
</TT></PRE>

<P>
Here is an example of another declension and two variants.  The
Masculine (and a few Feminine) (-us) nouns of the declension are '4 1' and the Neuter
(-u) nouns are coded as '4 2'.  This word has both.
The horn parse is very frequent (A), while the cornel option (C) is
less so but still common.


<PRE><TT>=>ego
ego                  PRON   5 1 NOM S C
 [XXXAX]
I, me; myself;</TT></PRE>

<P>
A pronoun is much like a noun.  The gender is common (C), that is, it may
be masculine or feminine.  For some odd words, especially including pronouns,
there is no dictionary form given.

<PRE><TT>=>illud
ill.ud               PRON   6 1 NOM S N
ill.ud               PRON   6 1 ACC S N
ille, illa, illud  PRON   [XXXAX]
that; those (pl.); also DEMONST; that person/thing; the well known; the former;
*
</TT></PRE>
<P>The asterisk means that there are other, less probable forms which have been
trimmed, but which may be recovered by running with the TRIM parameter reset.
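<P>When scanning saved output, the trimming marker is easy to detect.  The
Python below is a small sketch that assumes the asterisk appears alone on
its own line, as in the examples here; the function name is invented.

```python
def was_trimmed(output_text):
    """Return True if a block of output contains a lone '*' line."""
    return any(line.strip() == "*" for line in output_text.splitlines())
```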

<PRE><TT>=>hic
h.ic                 PRON   3 1 NOM S M
hic, haec, hoc  PRON   [XXXAX]
this; these (pl.); also DEMONST;
hic                  ADV    POS
hic                 ADV   [XXXCX]
here, in this place; in the present circumstances;</TT></PRE>

<P>
In this case there is an adjectival/demonstrative pronoun, or it may be an
adverb.  The POS means that the comparison of the adverb is positive.

<PRE><TT>=>bonum
bon.um               N      2 1 ACC S M
bonus, boni  N    M   [XXXCO]
good/moral/honest/brave man; man of honor, gentleman; better/rich people (pl.);
bon.um               N      2 2 NOM S N
bon.um               N      2 2 ACC S N
bonum, boni  N    N   [XXXAO]
good, good thing, profit, advantage; goods (pl.), possessions, wealth, estate;
bon.um               ADJ    1 1 NOM S N POS
bon.um               ADJ    1 1 ACC S M POS
bon.um               ADJ    1 1 ACC S N POS
bonus, bona -um, melior -or -us, optimus -a -um  ADJ   [XXXAO]
good, honest, brave, noble, kind, pleasant, right, useful; valid; healthy;
*
</TT></PRE>

<P>
Here we have an adjective, but it might also be a noun.  The
interpretation of the adjective says that it is POSitive, and that
is the meaning listed, as is the convention for all dictionaries.
The user must generate from this the meanings for other comparisons.
Check the comparison value before deciding on the real meaning.
Again, there is an asterisk, indicating further inflected forms were trimmed out.

<PRE><TT>=>facile
facil.e              ADJ    3 2 NOM S N POS
facil.e              ADJ    3 2 ABL S X POS
facil.e              ADJ    3 2 ACC S N POS
facilis, facile, facilior -or -us, facillimus -a -um  ADJ   [XXXAX]
easy, easy to do, without difficulty, ready, quick, good natured, courteous;
facile               ADV    POS
facile, facilius, facillime  ADV   [XXXBO]
easily, readily, without difficulty; generally, often; willingly; heedlessly;
*
</TT></PRE>

<P>
Here is an adjective or an adverb.  Although they are related in meaning,
they are different words.

<PRE><TT>=>acerrimus
acerri.mus           ADJ    3 3 NOM S M SUPER
acer, acris -e, acrior -or -us, acerrimus -a -um  ADJ   [XXXAO]
sharp, bitter, pointed, piercing, shrill; sagacious, keen; severe, vigorous;
</TT></PRE>

<P>
Here we have an adjective in the SUPERlative.  The meanings are all
POSitive and the user must add the -est by himself.

<PRE><TT>=>optime
optime               ADV    SUPER
bene, melius, optime  ADV   [XXXAO]
well, very, quite, rightly, agreeably, cheaply, in good style; better; best;
opti.me              ADJ    1 1 VOC S M SUPER
bonus, bona -um, melior -or -us, optimus -a -um  ADJ   [XXXAO]
good, honest, brave, noble, kind, pleasant, right, useful; valid; healthy;
</TT></PRE>

<P>Here is an adjective or an adverb; both are SUPERlative.

<PRE><TT>=>monuissemus
monu.issemus         V      2 1 PLUP ACTIVE  SUB 1 P
moneo, monere, monui, monitus  V   [XXXAX]
remind, advise, warn; teach; admonish; foretell, presage;
</TT></PRE>

<P>
Here is a verb for which the form is PLUPerfect, ACTIVE, SUBjunctive, 1st
person, Plural.  It is 2nd conjugation, variant 1.

<PRE><TT>=>amat
am.at                V      1 1 PRES ACTIVE  IND 3 S
amo, amare, amavi, amatus  V   [XXXAO]
love, like; fall in love with; be fond of; have a tendency to;
</TT></PRE>

<P>
Another regular verb, PRESent, ACTIVE, INDicative.

<PRE><TT>=>amatus
amat.us              VPAR   1 1 NOM S M PERF PASSIVE PPL
amo, amare, amavi, amatus  V   [XXXAO]
love, like; fall in love with; be fond of; have a tendency to;
amat.us              ADJ    1 1 NOM S M POS
amatus, amata, amatum  ADJ   [XXXEO]    uncommon
loved, beloved;
</TT></PRE>

<P>
Here we have the PERFect, PASSIVE ParticiPLe, in the NOMinative, Singular,
Masculine.   In addition, there is the ADJective that is formed from
this participle.  If the ADJective is common, it will likely have its own
dictionary entry.  Sometimes there may be a special or idiomatic meaning
not obvious from the verb, or the meaning may stray from the original.
In this case, the verb is very frequent, but the use as an adjective is uncommon.

<PRE><TT>=>amatu
amat.u               SUPINE 1 1 ABL S N
amo, amare, amavi, amatus  V   [XXXAO]
love, like; fall in love with; be fond of; have a tendency to;
</TT></PRE>

<P>
Here is the SUPINE of the verb in the ABLative Singular.

<PRE><TT>=>orietur
ori.etur             V      4 1 FUT          IND 3 S
orior, oriri, oritus sum  V    DEP   [XXXAO]
rise (sun/river); arise/emerge, crop up; get up (wake); begin; originate from;
be born/created; be born of, decend/spring from; proceed/be derived (from);
ori.etur             V      3 1 FUT          IND 3 S
orior, ori, ortus sum  V    DEP   [XXXBO]
rise (sun/river); arise/emerge, crop up; get up (wake); begin; originate from;
be born/created; be born of, decend/spring from; proceed/be derived (from);
</TT></PRE>

<P>
For DEPonent verbs the passive form is to be translated as if it were
active voice, so there is no VOICE given in the output.

<PRE><TT>=>ab
ab                   PREP   ABL
ab  PREP  ABL   [XXXAO]
by (agent), from (departure, cause, remote origin/time); after (reference);
</TT></PRE>

<P>
Here is a PREPosition that takes an ABLative for an object.

<PRE><TT>=>sine
sin.e                N      2 2 NOM P N
sin.e                N      2 2 ACC P N
sinum, sini  N    N   [XXXCX]
bowl for serving wine, etc;
sin.e                V      3 1 PRES ACTIVE  IMP 2 S
sino, sinere, sivi, situs  V   [XXXAX]
allow, permit;
sine                 PREP   ABL
sine  PREP  ABL   [XXXAX]
without;
*
</TT></PRE>

<P>
Here is a PREPosition that might also be a Verb or a Noun.
While as a preposition it is so common that it is unlikely
that any other use would occur, there is no way to indicate that.
Just be reminded that the frequency given for a verb is for the
sum of all the couple of hundred forms of the verb, not just
the one form that is parsed.

<PRE><TT>=>contra
contra               ADV    POS
contra              ADV   [XXXAO]
facing, face-to-face, in the eyes; towards/up to; across; in opposite direction;
against, opposite, opposed/hostile/contrary/in reply to; directly over/level;
otherwise, differently; conversely; on the contrary; vice versa;
contra               PREP   ACC
contra  PREP  ACC   [XXXAO]
against, facing, opposite; weighed against; as against; in resistance/reply to;
contrary to, not in conformance with; the reverse of; otherwise than;
towards/up to, in direction of;  directly over/level with; to detriment of;
</TT></PRE>

<P>
Here is a PREPosition that might also be an ADVerb.  This is a very common
situation, with the meanings being much the same.

<PRE><TT>=>et
et                   CONJ
et                  CONJ   [XXXAX]
and, and even; also, even;  (et ... et = both ... and);
</TT></PRE>

<P>
Here is a straight CONJunction.

<PRE><TT>=>vae
vae                  INTERJ
vae                 INTERJ   [XXXBX]
alas, woe, ah; oh dear;  (Vae, puto deus fio - Vespasian); Bah!, Curses!;
</TT></PRE>

<P>
Here is a straight INTERJection.

<PRE><TT>=>septem
septem               NUM    2 0 X   X X CARD
septem, septimus -a -um, septeni -ae -a, septie(n)s  NUM   [XXXAX]
 7 - (CARD answers 'how many');</TT></PRE>

<P>
Numbers are recognized as such and given a value.
An additional provision is the attempt to recognize and display the value
of Roman numerals, even combinations of appropriate letters that do not
parse conventionally to a value but may be ill-formed Roman numerals.

<PRE><TT>=>VII
VII                  NUM    2 0 X   X X CARD
7  as a ROMAN NUMERAL;
</TT></PRE>
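
<P>The evaluation of Roman numerals described above can be sketched roughly
as follows.  This is a minimal illustration in Python, not the Ada code of
WORDS; the name roman_value and the lenient add/subtract rule are assumptions
made for the example.

```python
# Illustrative sketch only; not the WORDS (Ada) implementation.
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}


def roman_value(text):
    """Return the value of a string of Roman numeral letters, or None.

    A letter smaller than its right-hand neighbour is subtracted,
    otherwise it is added, so loosely formed numerals still get a value.
    """
    letters = text.upper()
    if not letters or any(ch not in ROMAN_VALUES for ch in letters):
        return None
    total = 0
    for i, ch in enumerate(letters):
        value = ROMAN_VALUES[ch]
        if i + 1 != len(letters) and value < ROMAN_VALUES[letters[i + 1]]:
            total -= value
        else:
            total += value
    return total


print(roman_value("VII"))   # 7
print(roman_value("XIV"))   # 14
print(roman_value("IIII"))  # 4 (unconventional, but still evaluated)
print(roman_value("amo"))   # None (not made of numeral letters)
```

<P>Because no well-formedness check is made, a string such as IIII still
receives a value, which is in the spirit of the lenient treatment described above.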


<P>Beyond simple dictionary entry words, the program
can construct additional words with prefixes, suffixes and other ADDONS.

<PRE><TT>=>populusque
que                  TACKON
-que = and (enclitic, translated before attached word); completes plerus/uter;
popul.us             N      2 1 NOM S M
populus, populi  N    M   [XXXAO]
people, nation, State; public/populace/multitude/crowd; a following;
members of a society/sex; region/district (L+S); army (Bee);
</TT></PRE>

<P>Here the input word is recognized as a combination of a base word
and an enclitic (-que) tacked on.  This particular enclitic is
extremely common and its omission, or the omission of the process
that handles it, would result in a very large number of UNKNOWNs
in the output.
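
<P>A rough sketch of how an enclitic might be split off is given here.
It is written in Python for illustration only; the short TACKONS list and
the function name split_tackon are assumptions, not the program's own data
or code.

```python
# Hypothetical, abbreviated tackon list; WORDS keeps its own ADDONS data.
TACKONS = ("que", "ne", "ve")


def split_tackon(word):
    """Return every (base, tackon) split that the spelling allows.

    The caller must still look the base up in the dictionary;
    a split is only a candidate, not a confirmed analysis.
    """
    lowered = word.lower()
    candidates = []
    for tackon in TACKONS:
        if lowered.endswith(tackon) and len(lowered) != len(tackon):
            candidates.append((lowered[: len(lowered) - len(tackon)], tackon))
    return candidates


print(split_tackon("populusque"))  # [('populus', 'que')]
print(split_tackon("sine"))        # [('si', 'ne')]
```

<P>The second call shows why the base must be checked: 'sine' can be cut
mechanically into si + ne, yet it is also a word in its own right.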


<PRE><TT>=>pseudochristus
pseudo               PREFIX
false, fallacious, deceitful; sperious; imitation of;
christ.us            N      2 1 NOM S M
Christus, Christi  N    M   [XEXAO]
Christ;</TT></PRE>

<P>Here there is a prefix and a base.  The user must make the combination
into a word or phrase.

<P>
Generally, the meaning is given for the base word, as is usual for
dictionaries.  For the verb, it will be a present meaning, even when the
tense given is perfect.  For a noun, it will be the singular, and the user
must interpret when the form is plural.

<P>For an adjective, the positive meaning is given,
even if a comparative or superlative form is output.
The user is invited to expand to
comparative (-er) and superlative (-est).
For a few adjectives, the only stem in the dictionary is COMP or SUPER.
When there is just one comparison,
the WORDS dictionary gives that expanded meaning.
This might be considered inconsistent,
in that it expects the user to observe the FORM to interpret the meaning,
but it is consistent with ordinary dictionary practice.


<P>Initially there were more defective adjective entries.
I had accepted assertions in OLD or L+S and others like
'comparative does not exist'.
Later on I went over to the position that
even if Cicero did not use it, someone might.
I started generating COMP and SUPER where it seemed reasonable.
One can also count on a suffix to correct most omissions, and it will.

<P>Sometimes a word is constructed from a suffix and a stem of a different
part of speech.
Thus an adverb may be constructed from its adjective.
It will show the base adjective meaning and an indication of how to
make the adverb in English.  The user must make the proper interpretation.

<P>
In some cases an adjective will be found that is a participle of a verb
that is also found.  The participle meaning, as inferred by the user from
the verb meaning, is not superseded by the explicit adjective entry, but
supplemented by it with possible specialized meanings.  <BR>
 <BR>


<A NAME="English-to-Latin Examples">
<H4>English-to-Latin Examples</H4></A>

<P>~E (tilde E/e plus Enter/CR)
changes mode from Latin-to-English to English-to-Latin.  ~L changes back.

<P>A single input English word is followed by the desired part of speech.
Omitting the part of speech defaults to all, which is not recommended
for any word which can be ambiguous.  Since the program is looking for a
part of speech, it would be inconvenient to support the input of several
English words on a line.  While a (@) file of words can be processed in the
English mode, it must be one word per line.

<P>Output looks much like a paper dictionary entry, with form, part of speech,
gender, etc.  Also included are the WORDS coded declension/conjugation and the
TRANS flags, which give age, frequency and source, information for the user
in selecting the best translation.  The output may also contain a vertical bar
leading the meaning.  This is a continuation symbol which states that there
are other meanings for the Latin word.  The user might want to run the Latin
phase of WORDS to get the full set of meanings so that no unintended conflicts
appear.


<PRE><TT>
love v

amo, amare, amavi, amatus  V     1 1 [XXXAO]
love, like; fall in love with; be fond of; have a tendency to;

diligo, diligere, dilexi, dilectus  V     3 1 [XXXAX]
select, pick, single out; love, value, esteem; approve, aspire to, appreciate;

amo, amare, additional, forms  V     9 1 [BXXEO]
love, like; fall in love with; be fond of; have a tendency to;

ardeo, ardere, arsi, arsus  V     2 1 [XXXAO]
be on fire; burn, blaze; flash; glow, sparkle; rage; be in a turmoil/love;

adamo, adamare, adamavi, adamatus  V     1 1  TRANS   [XXXBO]
fall in love/lust with; love passionately/adulterously; admire greatly; covet;

deamo, deamare, deamavi, deamatus  V     1 1  TRANS   [XXXCO]
love dearly; be passionately/desperately in love with; be delighted with/obliged
*

in prep

in  PREP  ABL    [XXXAX]
in, on, at (space); in accordance with/regard to/the case of; within (time);

ante  PREP  ACC    [XXXAO]
in front/presence of, in view; before (space/time/degree); over against, facing;

super  PREP  ABL    [XXXAX]
over (space), above, upon, in addition to; during (time); concerning; beyond;

in  PREP  ACC    [XXXAX]
into; about, in the mist of; according to, after (manner); for; to, among;

prae  PREP  ABL    [XXXAX]
before, in front; in view of, because of;

praeter  PREP  ACC    [XXXAX]
besides, except, contrary to; beyond (rank), in front of, before; more than;
*

in

intro               ADV    [XXXAX]
within, in; to the inside, indoors;

in  PREP  ABL    [XXXAX]
in, on, at (space); in accordance with/regard to/the case of; within (time);

gener, generi  N     2 3  M   [XXXBX]
son-in-law;

baro, baronis  N     3 1  M   [XXXBL]
baron; magnate; tenant-in-chief (of crown/earl); burgess; official; husband;

sororius, sorori(i)  N     2 4  M   [XXXCX]
sister's husband, brother-in-law;

socrus, socrus  N     4 1  M   [XXXCX]
father-in-law; spouse's grandfather/great grandfather;
*

kill v

occido, occidere, occidi, occisus  V     3 1 [XXXAX]
kill, murder, slaughter, slay; cut/knock down; weary, be the death/ruin of;

interficio, interficere, interfeci, interfectus  V     3 1 [XWXAX]
kill; destroy;

consumo, consumere, consumpsi, consumptus  V     3 1  TRANS   [XXXAO]
burn up, destroy/kill; put end to; reduce/wear away; annul; extinguish (right);

perago, peragere, peregi, peractus  V     3 1 [XXXAX]
disturb; finish; kill; carry through to the end, complete;

dejicio, dejicere, dejeci, dejectus  V     3 1  TRANS   [XXXAS]
|overthrow, bring down, depose; kill, destroy; shoot/strike down; fell (victim);

deicio, deicere, dejeci, dejectus  V     3 1  TRANS   [XXXAO]
|overthrow, bring down, depose; kill, destroy; shoot/strike down; fell (victim);
*

death n

mors, mortis  N     3 3  F   [XXXAX]
death; corpse; annihilation;

fatum, fati  N     2 2  N   [XPXAX]
utterance, oracle; fate, destiny; natural term of life; doom, death, calamity;

funus, funeris  N     3 2  N   [XXXAX]
burial, funeral; funeral rites; ruin; corpse; death;

nex, necis  N     3 1  F   [XXXBX]
death; murder;

letum, leti  N     2 2  N   [XXXBX]
death, ruin, annihilation; death and destruction;

Orcus, Orci  N     2 1  M   [XXXBX]
god of the underworld, Dis; death; the underworld;
*

destruction n

cinis, cineris  N     3 1  C   [XXXAO]
ashes; embers, spent love/hate; ruin, destruction; the grave/dead, cremation;

pestis, pestis  N     3 3  F   [XXXBX]
plague, pestilence, curse, destruction;

exitium, exiti(i)  N     2 4  N   [XXXBX]
destruction, ruin; death; mischief;

ruina, ruinae  N     1 1  F   [XXXBX]
fall; catastrophe; collapse, destruction;

interitus, interitus  N     4 1  M   [XXXBX]
ruin; violent/untimely death, extinction; destruction, dissolution;

excidium, excidi(i)  N     2 4  N   [XXXCX]
ruin, destruction, military destruction; overthrow;
*
</TT></PRE>


<P>While six prioritized translations may seem like enough,
and they will likely cover the needs of a student, the full set
(setting # parameter to not TRIM) contains much valuable information
for the advanced translator.  For instance, for the verb 'live', vivo
usually works, but there are other options associated with specific
situations: cohabito means live together, ruror means live in the country,
adjaceo means live near, judaizo means live in the Jewish manner keeping the law.
These sorts of meanings are often conveyed in Latin by a single word,
while in English one might just use live and a modifying word or phrase.

<A NAME="Design of the Meaning Line">
<H4>Design of the Meaning Line</H4></A>

<P>The role and complexity of the WORDS meaning line has evolved over time.
Initially it reflected an elementary, back-of-the-book, textbook dictionary
with a single word or two for each entry.
Nevertheless, the size of the MEAN element was set at 80 characters
(as God, Hollerith and IBM decreed),
as appropriate for a standard computer screen in text mode.
(Depending on the system and mode of display, the output
may be limited to 78 or 79 characters, but the traditional 80
characters of the century-old IBM card was chosen.
They will likely appear on printed output.)


<P>With expansion of the dictionary beyond a few thousand elementary entries
and the extensive inclusion of the Oxford dictionaries,
a much larger set of possible interpretations surfaced for many words,
filling and exceeding the 80 character limit.
A certain discipline was introduced to structure the line.


<P>Through the many phases of development of the
dictionary, standards were developed and modified and
rigor was not always maintained, therefore the rules
below are generally, but not universally, observed.
Evolution of the dictionary is bringing it more closely
in line with these rules.
<BR><BR>


<P>A decision was made to include as many meanings and synonyms as
convenient.  The OLD will sometimes list a dozen or more meaning
groups with notably different senses, each with several similar meanings.
Presumably these different meanings were the product of different
translations of the Latin word, different translators,
different context, and different eras.
The WORDS dictionary includes many of these synonyms, and specifically adds
some more modern ones, in order to give the user inspiration
for his translation.
Further, it is important to give the user the full flavor of the word
that various translations employ.  A word with a nominal meaning of
respect may be found to also mean fear (which may be the basis of all
respect for the Romans), and that will certainly color the interpretation
of a passage.
Going the other way, one might not want to
apply it to a description of Mother Teresa.
Also one should be warned if an otherwise simple word also is used as a rude
reference to female anatomy.

<P>There are a couple of other factors that may influence the user in determining
the appropriate meaning from the list.  Some words have different meanings depending
on the age.  If one is reading a text written recently in modern Latin, one must
consider hints about the meaning.  While the classical meaning, the WORDS default,
may be appropriate, if there is a line with a late AGE code or an indication
of a modern dictionary source (e.g., Cal), the user should take this into consideration.
<BR><BR>

<A NAME="Signs and Abbreviations in Meaning">
<H5>Signs and Abbreviations in Meaning</H5></A>

, [comma] is used to separate meanings that are similar.  The philosophy
has been to list a number of synonyms just to key the reader in making his
translation.<BR>
<BR>
; [semicolon] is used to separate sets of meanings that differ in intent.
This is just a general tendency and is not always rigorously enforced.  <BR>
<BR>
: [colon] is used with an AREA code to specify a single special meaning
appropriate for that AREA in a series of general meanings.  For example,
L: has the same impact as (legal) before or after a definition in meaning.
This supplements the use of the AREA code in the set of flags, which
implies that all or most of the meanings are associated with that area.<BR>
<BR>
/ [solidus] means 'or' or gives an alternative word.  It sometimes
replaces the comma and is often used to compress the meaning into a short
line.  <BR>
<BR>
(...) [parentheses] set off an optional word or modifier, e.g., '(nearly)
white' means 'white' or 'nearly white', (matter in) dispute means either
the matter in dispute or the dispute itself.  They are also used to set
off an explanation, further information about the word or meaning, or an
example of a translation or a word combination.  <BR>
<BR>
?  [question mark] in a meaning implies a doubt about the interpretation,
or even about the existence of the word at all.  For the purposes of this
program, it does not matter much.  If the dubious word does not exist, no
one will ask for it.  If it appears in his text, the reader is warned that
the interpretation may be questionable to some degree, but is what is
available.  May indicate somewhat more doubt than (perh.).  <BR>
<BR>
~ [tilde] stands for the stem or word in question.  Usually it does not
have an ending affixed, as is the convention in other dictionaries, but
represents the word with whatever ending is proper.  It is just a space
saving shorthand or abbreviation.  <BR>
<BR>
(~ [tilde] also is the flag for changing the language base.  ~E (plus Enter/CR)
changes from Latin-to-English to English-to-Latin.  ~L changes back.)<BR>
<BR>
=&gt; in meaning this indicates a translation example.  <BR>
<BR>
abb.  abbreviation.  <BR>
<BR>
(Dif) - [Diferrari] is used to indicate an additional meaning taken from A
Latin-English Dictionary of St. Thomas Aquinas by Roy J. Diferrari.  This
is singled out because of the importance of Aquinas.  The reference is to
be applied from the last semicolon before the mark.  It is likely that the
meaning diverges from the base by being medieval and ecclesiastical, but
not so overwhelming as to deserve a separate entry.  <BR>
<BR>
(Douay) is used to designate those words for which the meaning has been
derived or modified by examination of the Douay translation of the Latin
Vulgate Bible of St Jerome.  <BR>
<BR>
(eccl.) ecclesiastical - designating a special church meaning in a list of
conventional meanings, an additional meaning not sufficient to justify a
separate entry with an ecclesiastical code.  <BR>
<BR>
esp.  [especially] - indicates a significant association, but is only
advisory.  <BR>
<BR>
(King James) or (KJames) is used to designate those words for which the
meaning has been derived or modified by examination of the King James
Bible in connection with the Latin Vulgate Bible of St Jerome.  <BR>
<BR>
(KLUDGE) This indicates that the particular form is distorted in order to
make it come out correctly.  This usually takes the form of a special
conjugational form applied to a few words, not applicable to other words
of the same conjugation or declension.  The user can expect the form and
meaning to be correct, but the numerical coding will be odd.  <BR>
<BR>
(L+S) [Lewis and Short] is used to indicate that the meaning starting from
the previous semicolon is information from Lewis and Short 'A Latin
Dictionary' that differs from, or significantly expands on, the meaning in
the 'Oxford Latin Dictionary' (OLD) which is the baseline for this
program.  This is not to imply that the meaning listed is otherwise taken
directly from the OLD, just that it is not inconsistent with OLD, but either the
L+S information is inconsistent (likely OLD knows better) or Lewis and
Short has included meanings appropriate for late Latin writers beyond the
scope of OLD.  The program is just warning the reader that there may be
some difference.  There are cases in which this indication occurs in
entries that have Lewis and Short as the source.  In those cases, the
basic word is in OLD but the entry is a variant form or spelling not cited
there.  There are cases where OLD and L+S give somewhat different
spellings and meanings for the 'same' word (same in the sense that both
dictionaries point to the same citation).  In these cases a combination of
meanings are given for both entries with the (L+S) code distinction and
the entries of different spelling or declension have the SOURCE coded.  <BR>
<BR>
NT [New Testament] is a reference in the Bible.
<BR>
(OLD) [Oxford Latin Dictionary] is used to indicate an additional meaning
taken from the Oxford Latin Dictionary in an entry that is otherwise
attributed.  While it is usually true that if a classical word has other
than OLD as the listed source then it does not appear in that form in OLD,
this is not always the case.  On occasion some other dictionary gives a
much better or more complete and understandable definition and the honor
of source is thereto given.  <BR>
<BR>
OT [Old Testament] is a reference in the Bible.
<BR>
Other source indicators are occasionally used and are indicated
in the general description of SOURCE below.
<BR><BR>
(PASS) [passive] - indicates a special, unexpected meaning for the passive
form of the verb, not easily associated with the active meaning.
In addition this is often used to remind the user that compounds of facio
form the passive by using the active of fio.  Ex: calefio (calefacio PASS).
There may be more translation information in the base word cited and
the user is encouraged to refer to it.<BR>
<BR>
perh.  [perhaps] - denotes an additional uncertainty, but not as strong as
(?).  <BR>
<BR>
(pl.) [plural] means that the Latin word is believed by scholars to be
used (almost) always in the plural form, with the meaning stated, even
though that meaning in English may be singular.  If it appears in the
beginning of the meaning, before the first comma, it applies to all the
meanings.  If it appears later, it applies only to that and later
meanings.  For the purpose of this program, this is only advisory.  While
it is used by some tools to find the expected dictionary entry, the
program does not necessarily exclude a singular form in the output.  While it may be
true that in good, classical Latin it is never used in the singular, this
does not mean that some text somewhere might not use the singular, nor
that it is uncommon in later Latin. The TRIM_OUTPUT option may cause only plural
forms to appear, with no TRIM_OUTPUT the singular will be shown. <BR>
<BR>
prob.  [probably] - denotes some uncertainty, but not as much as
(perh.).  <BR>
<BR>
pure Latin ...  indicates a pure Latin term for a word which is derived
from another language (almost certainly Greek).  <BR>
<BR>
(rude) - indicates that this meaning was used in a rude, vulgar, coarse,
or obscene manner, not what one should hear in polite company.  Such use
is likely from graffiti or epigrams, or in plays in which the dialogue is
to indicate that the characters are low or crude.  Meanings given by the
program for these words are more polite, and the user is invited to
substitute the current street language or obscenity of his choice to get
the flavor of text.  <BR>
<BR>
(sg.) [singular] means that the Latin word is believed by scholars to be
used always in the singular.  If it appears in the beginning of the
meaning, before the first comma, it applies to all the meanings.  If it
appears later, it applies only to that and later meanings.  For the
purpose of this program, this is only advisory.  <BR>
<BR>
usu.  [usually] is weakly advisory.  (usu.  pl.) is even weaker than (pl.)
and may imply that the plural tendency occurred only during certain periods.
<BR>
<BR>
w/ means 'with'.
<BR>
 <BR>
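
<P>The comma and semicolon conventions listed above lend themselves to a
simple mechanical split.  The following Python sketch is only an
illustration; the function name split_meaning is an assumption, and it does
not try to handle commas that fall inside parentheses.

```python
def split_meaning(meaning):
    """Split a meaning line into groups of similar meanings.

    Semicolons separate groups that differ in intent;
    commas separate similar meanings inside one group.
    """
    groups = []
    for group in meaning.split(";"):
        senses = [sense.strip() for sense in group.split(",") if sense.strip()]
        if senses:
            groups.append(senses)
    return groups


line = "remind, advise, warn; teach; admonish; foretell, presage;"
for senses in split_meaning(line):
    print(senses)
# ['remind', 'advise', 'warn']
# ['teach']
# ['admonish']
# ['foretell', 'presage']
```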


<A NAME="PROGRAM DESCRIPTION">
<H3><CENTER>PROGRAM DESCRIPTION</CENTER>
</H3></A> <BR>

<P>
An effect of the program is to derive the structure and meaning of
individual Latin words.  A procedure was devised to: examine the ending of
a word, compare it with the standard endings, derive the possible stems
that could be consistent, compare those stems with a dictionary of stems,
eliminate those for which the ending is inconsistent with the dictionary
stem (e.g., a verb ending with a noun dictionary item), if unsuccessful,
it tries with a large set of prefixes and suffixes, and various tackons
(e.g., -que), finally it tries various 'tricks' (e.g., 'ae' may be
replaced by 'e', 'inp...' by 'imp...', syncope, etc.), and it reports any
resulting matches as possible interpretations.
<P>
With the input of a word, or several words in a line, the program returns
information about the possible accidence, if it can find an agreeable
stem in its dictionary.

<PRE><TT>=>amo
am.o               V       1  1 PRES ACTIVE  IND  1 S
love, like; fall in love with; be fond of; have a tendency to</TT></PRE>

<P>
To support this method, an INFLECT.SEC data file was constructed
containing possible Latin endings encoded by a structure that identifies
the part of speech, declension, conjugation, gender, person, number, etc.
This is a pure computer encoding for a 'brute force' search.  No
sophisticated knowledge of Latin is used at this point.  Rules of thumb
(e.g., the fact, always noted early in any Latin course, that a neuter
noun has the same ending in the nominative and accusative, with a final -a
in the plural) are not used in the search.  However, it is convenient to
combine several identical endings with a general encoding (e.g., the
endings of the perfect tenses are the same for all verbs, and are so
encoded, not replicated for every conjugation and variant).
<P>
Many of the distinguishing differences identifying conjugations come from
the voiced length of stem vowels (e.g., between the present, imperfect and
future tenses of a third conjugation I-stem verb and a fourth conjugation
verb).  These aural differences, the features that make Latin 'sound
right' to one who speaks it, are not relevant in the analysis of written
endings.
<P>
The endings for the verb conjugations are the result of trying to minimize
the number of individual endings records, while yet keeping the structure
of the inflections data file fairly readable.  There is no claim that the
resulting arrangement is consonant with any grammarian's view of Latin,
nor should it be examined from that viewpoint.  While it started from the
conjugations in text books, it can only be viewed as some fuzzy
intermediate step along a path to a mathematically minimal number of
encoded verb endings.  Later versions of the program might improve the
system.
<P>
There are some egregious liberties taken in the encoding.  With the
inclusion of two present stems, the third conjugation I-stem verbs may
share the endings of the regular third conjugation.  The fourth
conjugation has disappeared altogether, and is represented internally as a
variant of the third conjugation (3, 4), but this is
replaced for the user in output by 4 1. There is an artificial fifth
conjugation for esse and others, a sixth for eo, and a seventh for other
irregularities.
<P>
As an example, a verb ending record has the structure:
<BR>PART -- the part code for a verb = V;
<BR>CONjugation -- consisting of two parts:
<BR>WHICH -- a conjugation identifier - range 0..9 and
<BR>VAR -- a variant identifier on WHICH - range 0..9;
<BR>TENSE -- an enumeration type - range PRES..FUTP + X;
<BR>VOICE -- an enumeration type - range ACTIVE..PASSIVE + X;
<BR>MOOD -- an enumeration type - range IND..PPL + X;
<BR>PERSON -- person, first to third - range 1..3 + 0;
<BR>NUMBER -- an enumeration type - range S..P + X;
<BR>KEY -- which stem to be used - range 1..4;
<BR>SIZE -- number of characters - range 0..9;
<BR>ENDING -- the ending as a string of SIZE characters;
<BR>AGE and FREQ flags which are not usually visible to the user.
<P>
Thus, the entry for the ending appropriate to 'amo' (with STEM = am) is:

<PRE><TT>V 1 1 PRES IND ACTIVE 1 S X 1 o</TT></PRE>

<P>
The elements are straightforward and generally use the
abbreviations that are common in any Latin text.  An X or 0 represents the
'don't know' or 'don't care' for enumeration or numeric types.  Details
are documented below in the CODES section.
<P>
A verb dictionary record has the structure:
<BR>STEMS -- for a verb there are 4 stems;
<BR>PART --  part code for a verb = V
<BR>WHICH -- a conjugation identifier - range 0..9
<BR>VAR -- a variant identifier - range 0..9;
<BR>KIND -- enumeration type of verb - range TO_BE..PERFDEF + X;
<BR>AGE, AREA, GEO, FREQ, and SOURCE flags
<BR>MEANING -- text for English translations (up to 80 characters).
<P>
Thus, an entry corresponding to 'amo amare amavi amatus' is:

<PRE><TT>am am amav amat
V 1 1 X            X X X X X
love, like; fall in love with; be fond of; have a tendency to</TT></PRE>
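
<P>For illustration, the three-line record shown above can be read into a
small structure as sketched below.  This is Python, not the program's Ada;
the class name VerbEntry and the function parse_verb_entry are assumptions,
and the field order simply follows the list given above.

```python
from dataclasses import dataclass


@dataclass
class VerbEntry:
    stems: list   # the four verb stems
    part: str     # part code, V for a verb
    which: int    # conjugation identifier
    var: int      # variant identifier
    kind: str     # kind of verb, X when not specified
    age: str
    area: str
    geo: str
    freq: str
    source: str
    meaning: str


def parse_verb_entry(text):
    """Parse a stems line, a codes line and a meaning line into a VerbEntry."""
    stem_line, code_line, meaning = [ln.strip() for ln in text.strip().splitlines()]
    part, which, var, kind, age, area, geo, freq, source = code_line.split()
    return VerbEntry(stem_line.split(), part, int(which), int(var), kind,
                     age, area, geo, freq, source, meaning)


record = """am am amav amat
V 1 1 X            X X X X X
love, like; fall in love with; be fond of; have a tendency to"""

entry = parse_verb_entry(record)
print(entry.stems)                        # ['am', 'am', 'amav', 'amat']
print(entry.part, entry.which, entry.var)  # V 1 1
```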


<P>
Endings may not uniquely determine which stem, and therefore the right
meaning.  'portas' could be the accusative plural of 'gate', or the second
person, singular, present indicative active of 'carry'.  In both cases the
stem is 'port'.  All possibilities are reported.

<PRE><TT>portas
port.as V 1 1 PRES IND ACTIVE 2 S X
carry, bring

port.as N 1 1 ACC P F T
gate, entrance; city gates; door; avenue;</TT></PRE>

<P>
And note that the same stem (port) has other uses (portus = harbor).


<PRE><TT>portum
port.um N 4 1 ACC S M T
port, harbor; refuge, haven, place of refuge</TT></PRE>
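
<P>The brute-force matching of endings against stems, as in the 'portas'
and 'portum' examples, can be sketched in a few lines.  The tables below are
toy data invented for illustration (the real inflection and dictionary files
are far larger), and the function name parse is an assumption; the sketch is
in Python rather than the program's Ada.

```python
# Toy tables for illustration only.
ENDINGS = [
    # (ending, part, which, var, form)
    ("as", "V", 1, 1, "PRES ACTIVE IND 2 S"),
    ("as", "N", 1, 1, "ACC P F"),
    ("um", "N", 4, 1, "ACC S M"),
    ("o", "V", 1, 1, "PRES ACTIVE IND 1 S"),
]
STEMS = [
    # (stem, part, which, var, meaning)
    ("port", "V", 1, 1, "carry, bring"),
    ("port", "N", 1, 1, "gate, entrance; city gates; door; avenue;"),
    ("port", "N", 4, 1, "port, harbor; refuge, haven, place of refuge"),
    ("am", "V", 1, 1, "love, like; fall in love with; be fond of; have a tendency to"),
]


def parse(word):
    """Report every ending/stem pair whose part and declension codes agree."""
    results = []
    for ending, e_part, e_which, e_var, form in ENDINGS:
        if not word.endswith(ending):
            continue
        stem = word[: len(word) - len(ending)]
        for d_stem, d_part, d_which, d_var, meaning in STEMS:
            if (d_stem, d_part, d_which, d_var) == (stem, e_part, e_which, e_var):
                results.append((stem + "." + ending, e_part, form, meaning))
    return results


for hit in parse("portas"):
    print(hit)
# ('port.as', 'V', 'PRES ACTIVE IND 2 S', 'carry, bring')
# ('port.as', 'N', 'ACC P F', 'gate, entrance; city gates; door; avenue;')
```

<P>No knowledge of Latin enters into the search: an ending is accepted
whenever its codes agree with those of a stem, so both readings of 'portas'
are reported.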

<P>
PLEASE NOTE: It is certainly possible for the program to find a valid
Latin construction that fits the input word and to have that
interpretation be entirely wrong in the context.  It is even possible to
interpret a number, in Roman numerals, as a word!  (But the number would
be reported also.)

<P>
For the case of defective verbs, the process does not necessarily have to
be precise.  Since the purpose is only to translate from Latin, even if
there are unused forms included in the algorithm these will not come up
in any real Latin text.  The endings for the verb conjugations are the
result of trying to minimize the number of individual endings records,
while keeping the structure of the base INFLECTIONS data file fairly
readable.
<P>
In general the program will try to construct a match with the inflections
and the dictionaries.  There are some specific checks to reject
certain mathematically correct combinations that do not appear in the
language, but these checks are relatively few.  The philosophy has been to
allow a generous interpretation.  A remark in a text or dictionary that a
particular form does not exist must be tempered with the realization that
the author probably means that it has not been observed in the surviving
classical literature.  This body of reference is minuscule compared to the
total use of Latin, even limited to the classical period.  Who is to say
that further examples would not turn up such an example, even if it might
not have been approved of by Cicero.  It is also possible that such
reasonable, if 'improper', constructs might occur in later writings by
less educated, or just different, authors.  Certainly English shows this
sort of variation over time.
<P>
If the exact stem is not found in the dictionary, there are rules for the
construction of words which any student would try.  The simplest situation
is a known stem to which a prefix or suffix has been attached.  The method used
by the program (if DO_FIXES is on, default is Yes) is to try any fixes that fit,
to see if their removal results in an identifiable remainder.  Then the
meaning is mechanically implied from the meaning of the fix and the
stem.  The user may need to interpret with a more conventional English
usage.  This technique improves the hit performance significantly.  However,
in about 40% of the instances in which there is a hit, the derivation is
correct but the interpretation takes some imagination.  In something less
than 10% of the cases, the inferred fix is just wrong, so the user must
take some care to see if the interpretation makes any sense.
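
<P>The idea of removing a fix and looking for an identifiable remainder can
be sketched as below.  This is a simplified Python illustration, not the
program's method in detail: it matches whole words rather than stems, and
the tables PREFIXES and KNOWN_WORDS and the function try_prefix are
assumptions made for the example.

```python
# Toy tables for illustration only.
PREFIXES = {"pseudo": "false, fallacious, deceitful"}
KNOWN_WORDS = {"christus": "Christ", "populus": "people, nation, State"}


def try_prefix(word, known_words):
    """Strip each prefix that fits and keep splits with a known remainder."""
    hits = []
    for prefix, prefix_meaning in PREFIXES.items():
        remainder = word[len(prefix):]
        if word.startswith(prefix) and remainder in known_words:
            hits.append((prefix, prefix_meaning, remainder, known_words[remainder]))
    return hits


print(try_prefix("pseudochristus", KNOWN_WORDS))
# [('pseudo', 'false, fallacious, deceitful', 'christus', 'Christ')]
```

<P>The two meanings are returned side by side; combining them into
idiomatic English is left to the user, as described above.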
<P>
This method is complicated by the tendency for prefixes to be modified
upon attachment (ab+fero = aufero, sub+fero = suffero).  The program's
'tricks' take many such instances into account.  Ideally, one should look
inside the stem for identifiable fragments.  One would like to start with
the smallest possible stem, and that is most frequently the correct one.
While it is mathematically possible that the stem of 'actorum' is 'actor'
with the common inflection 'um', no intuitive first semester Latin student
would fail to opt for the genitive plural 'orum', and probably be right.
To first order, the procedure ignores such hints and may report this word in
both forms, as well as a verb participle.  However, it can use certain
generally applicable rules, like the superlative characteristic 'issim',
to further guess.
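<P>
The 'actorum' ambiguity can be pictured with a short Python sketch that
lists every stem/ending split permitted by a table of endings.  The ENDINGS
tuple here is a made-up sample, not the INFLECTIONS file, and ordering the
results by shortest stem is only the heuristic suggested in the text, not
necessarily what the program does.

```python
# Hypothetical sample of endings; the real inflection data is far larger.
ENDINGS = ("um", "orum", "arum", "ibus")


def candidate_splits(word):
    """Return all (stem, ending) pairs allowed by ENDINGS,
    with the shortest stem listed first."""
    splits = [
        (word[:-len(ending)], ending)
        for ending in ENDINGS
        if word.endswith(ending) and len(word) > len(ending)
    ]
    return sorted(splits, key=lambda pair: len(pair[0]))


print(candidate_splits("actorum"))
```

Both splits are mechanically valid, which is why the word may be reported
in more than one form.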
<P>
In addition, there is the capability to examine the word for such common
techniques as syncope, the omission of the 've' or 'vi' in certain verb
perfect forms (audivissem = audissem).
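<P>
A minimal Python sketch of the syncope check, under the assumption that it
amounts to re-inserting a dropped 'vi' or 've' and testing the result.  The
KNOWN_FORMS set stands in for the full dictionary and inflection processing,
and the function names are invented for the example.

```python
# Hypothetical stand-in for the program's full word analysis.
KNOWN_FORMS = {"audivissem", "amaverunt"}


def syncope_candidates(word):
    """Return every string made by inserting 'vi' or 've' inside word."""
    candidates = []
    for position in range(1, len(word)):
        for dropped in ("vi", "ve"):
            candidates.append(word[:position] + dropped + word[position:])
    return candidates


def restore_syncope(word):
    """Keep only the candidates that are recognized forms."""
    return [c for c in syncope_candidates(word) if c in KNOWN_FORMS]


print(restore_syncope("audissem"))
print(restore_syncope("amarunt"))
```

Every candidate has to be checked in full, which is why this kind of
processing can be costly on a large text.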
<P>
If the program cannot identify a matching stem in the dictionary, it may be possible to
derive a stem from 'nearby' stems (an adverb from an adjective is the most
common example) and infer a meaning.  If all else fails, a portion of the
possible dictionary stems can be listed, from which the user can draw in
making his own guess.  <BR>
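<P>
Listing 'nearby' stems can be sketched in Python as a lookup in an
alphabetically sorted list.  The stem list, the window size, and the
function name are assumptions made for this illustration; how the program
actually selects the portion it displays is not specified here.

```python
import bisect

# Hypothetical sorted sample of dictionary stems.
SORTED_STEMS = sorted(["bell", "ben", "bene", "benefic", "benign", "bib"])


def nearby_stems(unknown, window=2):
    """Return up to `window` stems on each side of the position where
    the unknown stem would sit in alphabetical order."""
    position = bisect.bisect_left(SORTED_STEMS, unknown)
    start = max(0, position - window)
    return SORTED_STEMS[start:position + window]


print(nearby_stems("benev"))
```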

<A NAME="Codes in Inflection Line">

<H4>Codes in Inflection Line</H4></A>
<P>
For completeness, the enumeration codes used in the output are listed here
from the Ada statements.  Simple numbers are used for person, declension,
conjugations, and their variants.  Not all the facilities implied by these
values are developed or used in the program or the dictionary.  This list
is only for Version 1.97E.  Other versions may be somewhat different.  This
may make their dictionaries incompatible with the present program.
<P>
NOTE: in print dictionaries certain information is conveyed by font
encoding, e.g., the use of bold face or italics.  There is no system
independent method of displaying such on computers (although individual
programs can handle these, each in its own unique way).  WORDS uses capital
letters to express some such differences, which method is system independent
in present usage.

<PRE><TT>
 type PART_OF_SPEECH_TYPE
          X,         --  all, none, or unknown
          N,         --  Noun
          PRON,      --  PRONoun
          PACK,      --  PACKON -- artificial for code
          ADJ,       --  ADJective
          NUM,       --  NUMeral
          ADV,       --  ADVerb
          V,         --  Verb
          VPAR,      --  Verb PARticiple
          SUPINE,    --  SUPINE
          PREP,      --  PREPosition
          CONJ,      --  CONJunction
          INTERJ,    --  INTERJection
          TACKON,    --  TACKON --  artificial for code
          PREFIX,    --  PREFIX --  here artificial for code
          SUFFIX     --  SUFFIX --  here artificial for code

  type GENDER_TYPE
          X,         --  all, none, or unknown
          M,         --  Masculine
          F,         --  Feminine
          N,         --  Neuter
          C          --  Common (masculine and/or feminine)

  type CASE_TYPE
          X,         --  all, none, or unknown
          NOM,       --  NOMinative
          VOC,       --  VOCative
          GEN,       --  GENitive
          LOC,       --  LOCative
          DAT,       --  DATive
          ABL,       --  ABLative
          ACC        --  ACCusative

  type NUMBER_TYPE
          X,         --  all, none, or unknown
          S,         --  Singular
          P          --  Plural

  type PERSON_TYPE is range 0..3;

  type COMPARISON_TYPE
          X,         --  all, none, or unknown
          POS,       --  POSitive
          COMP,      --  COMParative
          SUPER      --  SUPERlative

  type NUMERAL_SORT_TYPE
         X,          --  all, none, or unknown
         CARD,       --  CARDinal
         ORD,        --  ORDinal
         DIST,       --  DISTributive
         ADVERB      --  numeral ADVERB

  type TENSE_TYPE
          X,         --  all, none, or unknown
          PRES,      --  PRESent
          IMPF,      --  IMPerFect
          FUT,       --  FUTure
          PERF,      --  PERFect
          PLUP,      --  PLUPerfect
          FUTP       --  FUTure Perfect

  type VOICE_TYPE
          X,         --  all, none, or unknown
          ACTIVE,    --  ACTIVE
          PASSIVE    --  PASSIVE

  type MOOD_TYPE
          X,         --  all, none, or unknown
          IND,       --  INDicative
          SUB,       --  SUBjunctive
          IMP,       --  IMPerative
          INF,       --  INFinitive
          PPL        --  ParticiPLe

  type NOUN_KIND_TYPE
          X,            --  unknown, nondescript
          S,            --  Singular "only"           --  not really used
          M,            --  plural or Multiple "only" --  not really used
          A,            --  Abstract idea
          G,            --  Group/collective Name -- Roman(s)
          N,            --  proper Name
          P,            --  a Person
          T,            --  a Thing
          L,            --  Locale, name of country/city
          W             --  a place Where

  type PRONOUN_KIND_TYPE
          X,            --  unknown, nondescript
          PERS,         --  PERSonal
          REL,          --  RELative
          REFLEX,       --  REFLEXive
          DEMONS,       --  DEMONStrative
          INTERR,       --  INTERRogative
          INDEF,        --  INDEFinite
          ADJECT        --  ADJECTival

   type VERB_KIND_TYPE
          X,         --  all, none, or unknown
          TO_BE,     --  only the verb TO BE (esse)
          TO_BEING,  --  compounds of the verb to be (esse)
          GEN,       --  verb taking the GENitive
          DAT,       --  verb taking the DATive
          ABL,       --  verb taking the ABLative
          TRANS,     --  TRANSitive verb
          INTRANS,   --  INTRANSitive verb
          IMPERS,    --  IMPERSonal verb (implied subject 'it', 'they', 'God')
                     --  agent implied in action, subject in predicate
          DEP,       --  DEPonent verb
                     --  only passive form but with active meaning
          SEMIDEP,   --  SEMIDEPonent verb (forms perfect as deponent)
                     --  (perfect passive has active force)
          PERFDEF    --  PERFect DEFinite verb
                     --  having only perfect stem, but with present force

</TT></PRE>
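<P>
To show how the codes listed above might be expanded when reading output,
here is a small Python sketch covering three of the types.  The code
letters come from the lists above; the lower-case English names, the table
names, and the function are assumptions for the example and are not part
of the program.

```python
# One table per type, because the same letter can mean different things
# in different types (N is Noun as a part of speech but Neuter as a gender).
CASE_NAMES = {
    "X": "unknown case", "NOM": "nominative", "VOC": "vocative",
    "GEN": "genitive", "LOC": "locative", "DAT": "dative",
    "ABL": "ablative", "ACC": "accusative",
}
NUMBER_NAMES = {"X": "unknown number", "S": "singular", "P": "plural"}
GENDER_NAMES = {
    "X": "unknown gender", "M": "masculine", "F": "feminine",
    "N": "neuter", "C": "common",
}


def describe_noun_codes(case, number, gender):
    """Spell out a case/number/gender triple such as GEN P M."""
    return " ".join(
        [CASE_NAMES[case], NUMBER_NAMES[number], GENDER_NAMES[gender]]
    )


print(describe_noun_codes("GEN", "P", "M"))
```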

<P>
The KIND_TYPEs represent various aspects of a word which may be useful to
some program, not necessarily the present one.  They were put in for
various reasons, and later versions may change the selection and use.
Some of the KIND flags are never used.  In some cases more than one KIND
flag might be appropriate, but only one is selected.  Some seemed to be a
good idea at one time, but have not since proved out.  The lists above are
just for completeness.
<P>
NOUN KIND, when set, is used in trimming the output and removing possibly
spurious cases (locative for a person, but preserving the vocative).
<P>
VERB KIND allows examples (when set) to give a more reasonable meaning.  A
DEP flag allows the example to reflect active meaning for passive form.
It also allows the dictionary form to be constructed properly from stems.
TRANS/INTRANS were included to allow a further program a hint as to what
kind of object it should expect.  This flag is only now being fixed during
the update.  There are some verbs which, although mostly used in one way,
might be either.  These are assigned X rather than breaking into two
entries.  This would be of no particular use at this point since it would
not allow the object to be determined.  GEN/DAT/ABL flags have a related
function, but are almost absent.  TO_BE is used to indicate that a form of
esse may be part of a compound verb tense with a participle.  TO_BEING
indicates a verb related to esse (e.g., abesse) which has no object,
nor is it used to form compounds.  IMPERS is used to weed out persons
and forms inappropriate to an impersonal verb, and to insert a special
meaning distinct from a general form associated with the same verb stem.

<P>There is a problem in that not all values for this parameter are orthogonal.
DEP is a different sort of thing from INTRANS.  There ought to be a
KIND_1 and KIND_2 to separate the different classes.  However, this would
be overkill considering the use made of this parameter, so far.


<P>There is a more difficult DEP problem.
'Good Latin' requires that the DEP be recognized and
processed to eliminate active forms.
In some cases there are dictionary examples, mostly medieval,
of the deponency being violated.
Some of those cases have been recognized with a separate entry.
This is not something that a suffix can handle appropriately,
even if mechanically it can function.
A better way might be to include the perfect form but still have the DEP flag,
thereby allowing the trimming in most cases. This has not been done yet.
But an active form would be recognized if input, especially if the text is medieval.

<P>
NUMERAL KIND and VALUE are used by the program in constructing the meaning line.
<BR>

<A NAME="Help for Parameters">
<H4>Help for Parameters</H4></A>

<P>
One can CHANGE_PARAMETERS by inputting a '#' [number sign] character (ASCII
35) as the input word, followed by a return.  (Note that this has changed
from early versions in which '?' was used.) Each parameter is listed and
the user is offered the opportunity to change it from the current value by
answering Y or N (any case).  For each parameter there is some explanation
or help.  This is displayed by inputting a '?' [question mark], followed
by a return.  HINT: While going down the list if one has made all the
changes desired, one need not continue to the end.  Just enter a space and
then give a return.  The program will interpret this as an illegal entry
(not Y or N) and will cancel the rest of the list, while retaining any
changes made to that point.
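<P>
The dialogue just described can be sketched in Python as follows.  This is
only a model of the behavior as documented: it assumes that Y and N are the
new value of the parameter, that '?' shows the help and asks again, and
that any other entry cancels the rest of the list.  The function name and
the way replies are supplied are invented for the example.

```python
def change_parameters(params, help_text, answers):
    """Walk through params in order, consuming one reply per prompt.

    params    -- dict mapping parameter name to True/False
    help_text -- dict mapping parameter name to its help display
    answers   -- the user's replies, in the order they are typed
    Returns the updated parameters and the help displays that were shown.
    """
    updated = dict(params)
    replies = iter(answers)
    shown_help = []
    for name in params:
        while True:
            reply = next(replies, " ").strip().upper()
            if reply == "?":
                shown_help.append(help_text.get(name, ""))
                continue  # ask again for the same parameter
            break
        if reply == "Y":
            updated[name] = True
        elif reply == "N":
            updated[name] = False
        else:
            break  # illegal entry: keep changes so far, skip the rest
    return updated, shown_help


current = {"TRIM_OUTPUT": True, "HAVE_OUTPUT_FILE": False, "DO_FIXES": True}
helps = {"HAVE_OUTPUT_FILE": "creates the output file WORD.OUT"}
print(change_parameters(current, helps, ["n", "?", "y", " "]))
```

In the sample run the final space cancels the list, so DO_FIXES keeps its
current value while the two earlier changes are retained.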

<P>Some parameters may not function in the English mode, nor is the documentation
necessarily complete.

<P>
The various help displays are listed here:

<PRE><TT>

TRIM_OUTPUT_HELP
   This option instructs the program to remove from the output list of
   possible constructs those which are least likely.  There is now a fair
   amount of trimming, killing LOC and VOC plus removing Uncommon and
   non-classical (Archaic/Medieval) when more common results are found
   and this action is requested (turn it off in MDV (!) parameters).
   When a TRIM has been done, the output is followed by an asterisk (*).
   There certainly is no absolute assurance that the items removed are
   not correct, just that they are statistically less likely.
   Note that poets are likely to employ unusual words and inflections for
   various reasons.  These may be trimmed out if this parameter is on.
   When in English mode, trim just reduces the output to the top six
   results, if there are that many.  An asterisk means there are more.
                                                   The default is Y(es)

HAVE_OUTPUT_FILE_HELP
   This option instructs the program to create a file which can hold the
   output for later study, otherwise the results are just displayed on
   the screen.  The output file is named  WORD.OUT
   This means that one run will necessarily overwrite a previous run,
   unless the previous results are renamed or copied to a file of another
   name.  This is available if the METHOD is INTERACTIVE, no parameters.
   The default is N(o), since this prevents the program from overwriting
   previous work unintentionally.  Y(es) creates the output file.

WRITE_OUTPUT_TO_FILE_HELP
   This option instructs the program, when HAVE_OUTPUT_FILE is on, to
   write results to the WORD.OUT file.
   This option may be turned on and off during running of the program,
   thereby capturing only certain desired results.  If the option
   HAVE_OUTPUT_FILE is off, the user will not be given a chance to turn
   this one on.  Only for INTERACTIVE running.         Default is N(o).
   This works in English mode, but the output is somewhat different so far.

DO_UNKNOWNS_ONLY_HELP
   This option instructs the program to only output those words that it
   cannot resolve.  Of course, it has to do processing on all words, but
   those that are found (with prefix/suffix, if that option is on) will
   be ignored.  The purpose of this option is to allow a quick look to
   determine if the dictionary and process is going to do an acceptable
   job on the current text.  It also allows the user to assemble a list
   of unknown words to look up manually, and perhaps augment the system
   dictionary.  For those purposes, the system is usually run with the
   MINIMIZE_OUTPUT option, just producing a list.  Another use is to run
   without MINIMIZE to an output file.  This gives a list of the input
   text with the unknown words, by line.  This functions as a spelling
   checker for Latin texts.  The default is N(o).
   This does not work in English mode, but may in the future.

WRITE_UNKNOWNS_TO_FILE_HELP
   This option instructs the program to write all unresolved words to a
   UNKNOWNS file named  WORD.UNK
   With this option on, the file of unknowns is written, even though
   the main output contains both known and unknown (unresolved) words.
   One may wish to save the unknowns for later analysis, testing, or to
   form the basis for dictionary additions.  When this option is turned
   on, the UNKNOWNS file is written, destroying any file from a previous
   run.  However, the write may be turned on and off during a single run
   without destroying the information written in that run.
   This option is for specialized use, so its default is N(o).
   This does not work in English mode, but may in the future.

IGNORE_UNKNOWN_NAMES_HELP
   This option instructs the program to assume that any capitalized word
   longer than three letters is a proper name.  As no dictionary can be
   expected to account for many proper names, many such occur that would
   be called UNKNOWN.  This contaminates the output in most cases, and
   it is often convenient to ignore these spurious UNKNOWN hits.  This
   option implements that mode, and calls such words proper names.
   Any proper names that are in the dictionary are handled in the normal
   manner.                                The default is Y(es).

IGNORE_UNKNOWN_CAPS_HELP
   This option instructs the program to assume that any all caps word
   is a proper name or similar designation.  This convention is often
   used to designate speakers in a discussion or play.  No dictionary can
   claim to be exhaustive on proper names, so many such occur that would
   be called UNKNOWN.  This contaminates the output in most cases, and
   it is often convenient to ignore these spurious UNKNOWN hits.  This
   option implements that mode, and calls such words names.  Any similar
   designations that are in the dictionary are handled in the normal
   manner, as are normal words in all caps.    The default is Y(es).

DO_COMPOUNDS_HELP
   This option instructs the program to look ahead for the verb TO_BE (or
   iri) when it finds a verb participle, with the expectation of finding
   a compound perfect tense or periphrastic.  This option can also be a
   trimming of the output, in that VPAR that do not fit (not NOM) will be
   excluded, so possible interpretations are lost.  Default choice is Y(es).
   This processing is turned off with the choice of N(o).

DO_FIXES_HELP
   This option instructs the program, when it is unable to find a proper
   match in the dictionary, to attach various prefixes and suffixes and
   try again.  This effort is successful in about a quarter of the cases
   which would otherwise give UNKNOWN results, or so it seems in limited
   tests.  For those cases in which a result is produced, about half give
   easily interpreted output; many of the rest are etymologically true,
   but not necessarily obvious; about a tenth give entirely spurious
   derivations.  The user must proceed with caution.
   The default choice is Y(es), since the results are generally useful.
   This processing can be turned off with the choice of N(o).

DO_TRICKS_HELP
   This option instructs the program, when it is unable to find a proper
   match in the dictionary, and after various prefixes and suffixes, to
   try every dirty Latin trick it can think of, mainly common letter
   replacements like cl -> cul, vul -> vol, ads -> ass, inp -> imp, etc.
   Together these tricks are useful, but may give false positives (>10%).
   They provide for recognized variants in classical spelling.  Most of
   the texts with which this program will be used have been well edited
   and standardized in spelling.  Now, moreover,  the dictionary is being
   populated to such a state that the hit rate on tricks has fallen to a
   low level.  It is very seldom productive, and it is always expensive.
   The only excuse for keeping it as default is that now the dictionary
   is quite extensive and misses are rare.         Default is now Y(es).

DO_DICTIONARY_FORMS_HELP
   This option instructs the program to output a line with the forms
   normally associated with a dictionary entry (NOM and GEN of a noun,
   the four principal parts of a verb, M-F-N NOM of an adjective, ...).
   This occurs when there is other output (i.e., not with UNKNOWNS_ONLY).
   The default choice is N(o), but it can be turned on with a Y(es).

SHOW_AGE_HELP
   This option causes a flag, like '<Late>' to appear for inflection or
   form in the output.  The AGE indicates when this word/inflection was
   in use, at least from indications in dictionary citations.  It is
   just an indication, not controlling, useful when there are choices.
   No indication means that it is common throughout all periods.
   The default choice is Y(es), but it can be turned off with a N(o).

SHOW_FREQUENCY_HELP
   This option causes a flag, like '<rare>' to appear for inflection or
   form in the output.  The FREQ indicates the relative usage of the
   word or inflection, from indications in dictionary citations.  It is
   just an indication, not controlling, useful when there are choices.
   No indication means that it is common throughout all periods.
   The default choice is Y(es), but it can be turned off with a N(o).

DO_EXAMPLES_HELP
   This option instructs the program to provide examples of usage of the
   cases/tenses/etc. that were constructed.  The default choice is N(o).
   This produces lengthy output and is turned on with the choice Y(es).

DO_ONLY_MEANINGS_HELP
   This option instructs the program to only output the MEANING for a
   word, and omit the inflection details.  This is primarily used in
   analyzing new dictionary material, comparing with the existing.
   However it may be of use for the translator who knows most all of
   the words and just needs a little reminder for a few.
   The default choice is N(o), but it can be turned on with a Y(es).

DO_STEMS_FOR_UNKNOWN_HELP
   This option instructs the program, when it is unable to find a proper
   match in the dictionary, and after various prefixes and suffixes, to
   list the dictionary entries around the unknown.  This will likely
   catch a substantive for which only the ADJ stem appears in dictionary,
   an ADJ for which there is only a N stem, etc.  This option should
   probably only be used with individual UNKNOWN words, and off-line
   from full translations, therefore the default choice is N(o).
   This processing can be turned on with the choice of Y(es).

SAVE_PARAMETERS_HELP
   This option instructs the program to save the current parameters, as
   just established by the user, in a file WORD.MOD.  If such a file
   exists, the program will load those parameters at the start.  If no
   such file can be found in the current subdirectory, the program will
   start with a default set of parameters.  Since this parameter file is
   human-readable ASCII, it may also be created with a text editor.  If
   the file found has been improperly created, is in the wrong format, or
   otherwise uninterpretable by the program, it will be ignored and the
   default parameters used, until a proper parameter file is written by
   the program.  Since one may want to make temporary changes during a
   run, but revert to the usual set, the default is N(o).

</TT></PRE>
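<P>
The letter replacements named in DO_TRICKS_HELP above can be pictured with
a short Python sketch.  It assumes that each arrow means "replace the
left-hand spelling in the input with the right-hand one and try again";
the TRICKS list holds only the four examples quoted in the help text, and
the KNOWN_WORDS set is a made-up stand-in for the dictionary search.

```python
# The four replacements quoted in the help text; the real list is longer.
TRICKS = [("cl", "cul"), ("vul", "vol"), ("ads", "ass"), ("inp", "imp")]

# Hypothetical stand-in for a successful dictionary lookup.
KNOWN_WORDS = {"periculum", "impono"}


def try_tricks(word):
    """Apply each replacement at each place it fits and return the
    respelled candidates that are recognized."""
    found = []
    for old, new in TRICKS:
        start = word.find(old)
        while start != -1:
            candidate = word[:start] + new + word[start + len(old):]
            if candidate in KNOWN_WORDS and candidate not in found:
                found.append(candidate)
            start = word.find(old, start + 1)
    return found


print(try_tricks("periclum"))
print(try_tricks("inpono"))
```

Because every candidate respelling needs its own lookup, the approach is
expensive, and a respelling that happens to match an unrelated word gives
a false positive.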

<P>
There is also a set of DEVELOPER_PARAMETERS that are unlikely to be of
interest to the normal user.  Some of these facilities may be disconnected
or not work for other reasons.  Additional parameters may be included
without notice or documentation.  The HELP may be the most reliable
source of information.  These parameters are mostly for use in the
development process.  They may be changed or examined by a similar
change procedure, by inputting a '!' [exclamation mark] character, followed
by a return.

<PRE><TT>
HAVE_STATISTICS_FILE_HELP
   This option instructs the program to create a file which can hold
   certain statistical information about the process.  The file is
   overwritten for each new invocation of the program, so old data must be
   explicitly saved if it is to be retained.  The statistics are in TEXT
   format.     The statistics file is named  WORD.STA
   This information is only of development use, so the default is N(o).

WRITE_STATISTICS_FILE_HELP
   This option instructs the program, with HAVE_STATISTICS_FILE, to put
   derived statistics in a file named  WORD.STA
   This option may be turned on and off during running of the program,
   thereby capturing only certain desired results.  The file is reset at
   each invocation of the program, if the HAVE_STATISTICS_FILE is set.
   If the option HAVE_STATISTICS_FILE is off, the user will not be given
   a chance to turn this one on.                Default is N(o).

SHOW_DICTIONARY_HELP
   This option causes a flag, like 'GEN>' to be put before the meaning
   in the output.  While this is useful for certain development purposes,
   it forces off a few characters from the meaning, and is really of no
   interest to most users.
   The default choice is N(o), but it can be turned on with a Y(es).

SHOW_DICTIONARY_LINE_HELP
   This option causes the number of the dictionary line for the current
   meaning to be output.  This is of use to no one but the dictionary
   maintainer.  The default choice is N(o).  It is activated by Y(es).

SHOW_DICTIONARY_CODES_HELP
   This option causes the codes for the dictionary entry for the current
   meaning to be output.  This may not be useful to any but the most
   involved user.  The default choice is N(o).  It is activated by Y(es).

DO_PEARSE_CODES_HELP
   This option causes special codes to be output flagging the different
   kinds of output lines.  01 for forms, 02 for dictionary forms, and
   03 for meaning. The default choice is N(o).  It is activated by Y(es).
   There are no Pearse codes in English mode.

DO_ONLY_INITIAL_WORD_HELP
   This option instructs the program to only analyze the initial word on
   each line submitted.  This is a tool for checking and integrating new
   dictionary input, and will be of no interest to the general user.
   The default choice is N(o), but it can be turned on with a Y(es).

FOR_WORD_LIST_CHECK_HELP
   This option works in conjunction with DO_ONLY_INITIAL_WORD to allow
   the processing of scanned dictionaries or text word lists.  It accepts
   only the forms common in dictionary entries, like NOM S for N or ADJ,
   or PRES ACTIVE IND 1 S for V.  It is to be used only with DO_ONLY_INITIAL_WORD.
   The default choice is N(o), but it can be turned on with a Y(es).

DO_ONLY_FIXES_HELP
   This option instructs the program to ignore the normal dictionary
   search and to go direct to attach various prefixes and suffixes before
   processing. This is a pure research tool.  It allows one to examine
   the coverage of pure stems and dictionary primary compositions.
   This option is only available if DO_FIXES is turned on.
   This is entirely a development and research tool, not to be used in
   conventional translation situations, so the default choice is N(o).
   This processing can be turned on with the choice of Y(es).

DO_FIXES_ANYWAY_HELP
   This option instructs the program to do both the normal dictionary
   search and then process for the various prefixes and suffixes too.
   This is a pure research tool allowing one to consider the possibility
   of strange constructions, even in the presence of conventional
   results, e.g., alte => deeply (ADV), but al+t+e => wing+ed (ADJ VOC)
   (If multiple suffixes were supported this could also be wing+ed+ly.)
   This option is only available if DO_FIXES is turned on.
   This is entirely a development and research tool, not to be used in
   conventional translation situations, so the default choice is N(o).
   This processing can be turned on with the choice of Y(es).
         ------    PRESENTLY NOT IMPLEMENTED    ------

USE_PREFIXES_HELP
   This option instructs the program to implement prefixes from ADDONS
   whenever and wherever FIXES are called for.  The purpose of this
   option is to allow some flexibility while the program is running to
   select various combinations of fixes, to turn them on and off,
   individually as well as collectively.  This is an option usually
   employed by the developer while experimenting with the ADDONS file.
   This option is only effective in connection with DO_FIXES.
   This is primarily a development tool, so the conventional user should
   probably maintain the default  choice of Y(es).

USE_SUFFIXES_HELP
   This option instructs the program to implement suffixes from ADDONS
   whenever and wherever FIXES are called for.  The purpose of this
   option is to allow some flexibility while the program is running to
   select various combinations of fixes, to turn them on and off,
   individually as well as collectively.  This is an option usually
   employed by the developer while experimenting with the ADDONS file.
   This option is only effective in connection with DO_FIXES.
   This is primarily a development tool, so the conventional user should
   probably maintain the default  choice of Y(es).

USE_TACKONS_HELP
   This option instructs the program to implement TACKONS from ADDONS
   whenever and wherever FIXES are called for.  The purpose of this
   option is to allow some flexibility while the program is running to
   select various combinations of fixes, to turn them on and off,
   individually as well as collectively.  This is an option usually
   employed by the developer while experimenting with the ADDONS file.
   This option is only effective in connection with DO_FIXES.
   This is primarily a development tool, so the conventional user should
   probably maintain the default  choice of Y(es).

DO_MEDIEVAL_TRICKS_HELP
   This option instructs the program, when it is unable to find a proper
   match in the dictionary, and after various prefixes and suffixes, and
   trying every Classical Latin trick it can think of, to go to a few that
   are usually only found in medieval Latin, replacements of caul -> col,
   st -> est, z -> di, ix -> is, nct -> nt.  It also tries some things
   like replacing doubled consonants in classical with a single one.
   Together these tricks are useful, but may give false positives (>20%).
   This option is only available if the general DO_TRICKS is chosen.
   If the text is late or medieval, this option is much more useful than
   tricks for classical.  The dictionary can never contain all spelling
   variations found in medieval Latin, but some constructs are common.
   The default choice is N(o), since the results are iffy, medieval only,
   and expensive.  This processing is turned on with the choice of Y(es).

DO_SYNCOPE_HELP
   This option instructs the program to postulate that syncope of
   perfect stem verbs may have occurred (e.g., aver -> ar in the perfect),
   and to try various possibilities for the insertion of a removed 'v'.
   To do this it has to fully process the modified candidates, which can
   have a considerable impact on the speed of processing a large file.
   However, this trick seldom produces a false positive, and syncope is
   very common in Latin (first year texts excepted).  Default is Y(es).
   This processing is turned off with the choice of N(o).

DO_TWO_WORDS_HELP
   There are some few common Latin expressions that combine two inflected
   words (e.g. respublica, paterfamilias).  There are numerous examples
   of numbers composed of two words combined together.
   Sometimes a text or inscription will have words run together.
   When WORDS is unable to reach a satisfactory solution with all other
   tricks, as a last stab it will try to break the input into two words.
   This most often fails.  Even if mechanically successful, the result is
   usually false and must be examined by the user.  If the result is
   correct, it is probably clear to the user.  Otherwise,  beware.
   This problem will not occur for a well edited text, such as one will
   find on your Latin exam, but sometimes with raw text.
   Since this is a last chance and infrequent, the default is Y(es).
   This processing is turned off with the choice of N(o).

INCLUDE_UNKNOWN_CONTEXT_HELP
   This option instructs the program, when writing to an UNKNOWNS file,
   to put out the whole context of the UNKNOWN (the whole input line on
   which the UNKNOWN was found).  This is appropriate for processing
   large text files in which it is expected that there will be relatively
   few UNKNOWNS.    The main use at the moment is to provide display
   of the input line on the output file in the case of UNKNOWNS_ONLY.

NO_MEANINGS_HELP
   This option instructs the program to omit putting out meanings.
   This is only useful for certain dictionary maintenance procedures.
   The combination not DO_DICTIONARY_FORMS, MEANINGS_ONLY, NO_MEANINGS
   results in no visible output, except spacing lines.    Default is N(o).

OMIT_ARCHAIC_HELP
   THIS OPTION CAN ONLY BE ACTIVE IF WORDS_MODE(TRIM_OUTPUT) IS SET!
   This option instructs the program to omit inflections and dictionary
   entries with an AGE code of A (Archaic).  Archaic results are rarely
   of interest in general use.  If there is no other possible form, then
   the Archaic (roughly defined) will be reported.  The default is Y(es).

OMIT_MEDIEVAL_HELP
   THIS OPTION CAN ONLY BE ACTIVE IF WORDS_MODE(TRIM_OUTPUT) IS SET!
   This option instructs the program to omit inflections and dictionary
   entries with AGE codes of E or later, those not in use in Roman times.
   While later forms and words are a significant application, most users
   will not want them.  If there is no other possible form, then the
   Medieval (roughly defined) will be reported.   The default is Y(es).

OMIT_UNCOMMON_HELP
   THIS OPTION CAN ONLY BE ACTIVE IF WORDS_MODE(TRIM_OUTPUT) IS SET!
   This option instructs the program to omit inflections and dictionary
   entries with FREQ codes indicating that the selection is uncommon.
   While these forms are a significant feature of the program, many users
   will not want them.  If there is no other possible form, then the
   uncommon (roughly defined) will be reported.   The default is Y(es).

DO_I_FOR_J_HELP
   This option instructs the program to modify the output so that the j/J
   is represented as i/I.  The consonant i was written as j in cursive in
   Imperial times and called i longa, and often rendered as j in medieval
   times.  The capital is usually rendered as I, as in inscriptions.
   If this is NO/FALSE, the output will have the same character as input.
   The program default, and the dictionary convention is to retain the j.
   Reset if this is unsuitable for your application. The default is N(o).

DO_U_FOR_V_HELP
   This option instructs the program to modify the output so that the v
   is represented as u.  The consonant u was written sometimes as uu.
   The pronunciation was as current w, and important for poetic meter.
   With the printing press came the practice of distinguishing consonant
   u with the character v, and this was common for centuries.  The practice
   of using only u has been adopted in some 20th century publications (OLD),
   but it is confusing to many modern readers.  The capital is commonly
   V in any case, as it was and is in inscriptions (easier to chisel).
   If this is NO/FALSE, the output will have the same character as input.
   The program default, and the dictionary convention is to retain the v.
   Reset if this is unsuitable for your application. The default is N(o).

PAUSE_IN_SCREEN_OUTPUT_HELP
   This option instructs the program to pause in output on the screen
   after about 16 lines so that the user can read the output, otherwise
   it would just scroll off the top.  A RETURN/ENTER gives another page.
   If the program is waiting for a return, it cannot take other input.
   This option is active only for keyboard entry or command line input,
   and only when there is no output file.  It is moot if only single word
   input or brief output.                 The default is Y(es).

NO_SCREEN_ACTIVITY_HELP
   This option instructs the program not to keep a running screen of the
   input.  This is probably only to be used by the developer to calibrate
   run times for large text file input, removing the time necessary to
   write to screen.                       The default is N(o).

UPDATE_LOCAL_DICTIONARY_HELP
   This option instructs the program to invite the user to input a new
   word to the local dictionary on the fly.  This is only active if the
   program is not using an (@) input file!  If an UNKNOWN is discovered,
   the program asks for STEM, PART, and MEAN, the basic elements of a
   dictionary entry.  These are put into the local dictionary right then,
   and are available for the rest of the session, and all later sessions.
   The use of this option requires a detailed knowledge of the structure
   of dictionary entries, and is not for the average user.  If the entry
   is not valid, reloading the dictionary will raise an exception, and
   the invalid entry will be rejected, but the program will continue
   without that word.  Any invalid entries can be corrected or deleted
   off-line with a text editor on the local dictionary file.  If one does
   not want to enter a word when this option is on, a simple RETURN at
   the STEM=> prompt will ignore and continue the program.  This option
   is only for very experienced users and should normally be off.
                                             The default is N(o).
         ------    NOT AVAILABLE IN THIS VERSION   -------

UPDATE_MEANINGS_HELP
   This option instructs the program to invite the user to modify the
   meaning displayed on a word translation.  This is only active if the
   program is not using an (@) input file!  These changes are put into
   the dictionary right then and permanently, and are available from
   then on, in this session, and all later sessions.   Unfortunately,
   these changes will not survive the replacement of the dictionary by a
   new version from the developer.  Changes can only be recovered by
   considerable processing by the developer, and should be left there.
   This option is only for experienced users and should remain off.
                                             The default is N(o).
         ------    NOT AVAILABLE IN THIS VERSION   -------

MINIMIZE_OUTPUT_HELP
   This option instructs the program to minimize the output.  This is a
   somewhat flexible term, but the use of this option will probably lead
   to less output.                        The default is Y(es).

SAVE_PARAMETERS_HELP
   This option instructs the program to save the current parameters, as
   just established by the user, in a file WORD.MDV.  If such a file
   exists, the program will load those parameters at the start.  If no
   such file can be found in the current subdirectory, the program will
   start with a default set of parameters.  Since this parameter file is
   human-readable ASCII, it may also be created with a text editor.  If
   the file found has been improperly created, is in the wrong format, or
   otherwise uninterpretable by the program, it will be ignored and the
   default parameters used, until a proper parameter file is written by
   the program.  Since one may want to make temporary changes during a
   run, but revert to the usual set, the default is N(o).
</TT></PRE>

<A NAME="Special Cases">
<H4>Special Cases</H4></A>
<P>
Some adjectives have no conventional positive forms (either missing or
undeclined), or the POS forms have more than one COMP/SUPER.  In these few
cases, the individual COMP or SUPER form is entered separately.  Since it
is not directly connected with a POS form, and only the POS forms have
different numbered declensions, the special form is given a declension of
(0, 0).  An additional consequence is that the dictionary form in output
is only for the COMP/SUPER, and does not reflect all comparisons.

<A NAME="Uniques">
<H4>Uniques</H4></A>
<P>
There are some irregular situations which are not convenient to handle
through the general algorithms.  For these a UNIQUES file and procedure
were established.  The number of these special cases is fewer than one
hundred, but may increase as new situations arise, and decrease as
algorithms provide better coverage.  The user will not see much
difference, except in that no dictionary forms are available for these
unique words.

<A NAME="Tricks">
<H4>Tricks</H4></A>
<P>
There are a number of situations in Latin writing where certain
modifications or conventions regularly are found.  While often found,
these are not the normal classical forms.  If a conventional match is not
found, the program may be instructed to TRY_TRICKS.  Below is a partial
list of current tricks.  The syncopated form of the perfect often drops
the 'v' and loses the vowel.  An initial 'a' followed by a double letter
often is used for an 'ad' prefix, likewise an initial 'ad' prefix is often
replaced by an 'a' followed by a double letter.  An initial 'i' followed
by a double letter often is used for an 'in' prefix, likewise an initial
'in' prefix is often replaced by an 'i' followed by a double letter.  A
leading 'inp' could be an 'imp'.  A leading 'obt' could be an 'opt'.  An
initial 'har...' or 'hal...' may be rendered by an 'ar' or 'al', likewise
the dictionary entry may have 'ar'/'al' and the trial word begin with
'ha...'.  An initial 'c' could be a 'k', or the dictionary entry uses 'c'
for 'k'.  A nonterminal 'ae' is often rendered by an 'e'.  An initial 'E'
can replace an 'Ae'.  An 'iis...' beginning some forms of 'eo' may be
contracted to 'is...'.  A nonterminal 'ii' is often replaced by just 'i';
including 'ji', since in this program and dictionary all 'j' are made 'i'.
A 'cl' could be a 'cul'.  A 'vul' could be a 'vol'.  There are many others,
including a procedure to try to break the input word into two.
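<P>
As an illustration only (this is not the program's own code), a few of the
tricks listed above can be written as rewrite rules that turn an unmatched
input word into alternative spellings to retry.  The function name
trick_candidates and the choice of rules are assumptions made for this
sketch; the real list of tricks is longer.
<PRE><TT>
```python
def trick_candidates(word):
    """Return alternative spellings to try, in order, for an unmatched word."""
    w = word.lower()
    out = []

    # Initial 'a' + double letter may stand for an 'ad' prefix:
    # 'alligo' -> 'adligo'.
    if len(w) >= 3 and w[0] == "a" and w[1] == w[2]:
        out.append("ad" + w[2:])

    # Initial 'i' + double letter may stand for an 'in' prefix:
    # 'illatus' -> 'inlatus'.
    if len(w) >= 3 and w[0] == "i" and w[1] == w[2]:
        out.append("in" + w[2:])

    # A leading 'inp' could be an 'imp'; a leading 'obt' could be an 'opt'.
    if w.startswith("inp"):
        out.append("imp" + w[3:])
    if w.startswith("obt"):
        out.append("opt" + w[3:])

    # A nonterminal 'e' may render an 'ae': 'seculorum' -> 'saeculorum'.
    # An 'e' already preceded by 'a' is left alone.
    for i, ch in enumerate(w[:-1]):
        if ch == "e" and (i == 0 or w[i - 1] != "a"):
            out.append(w[:i] + "ae" + w[i + 1:])

    return out


if __name__ == "__main__":
    for sample in ("alligo", "inpono", "seculorum"):
        print(sample, trick_candidates(sample))
```
</TT></PRE>
Each candidate would then be run through the normal lookup; a candidate
is only a guess until a dictionary match confirms it.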
<P>
Various manipulations of 'u' and 'v' are possible: 'v' could be replaced
by 'u', like the new Oxford Latin Dictionary, leading 'U' could be
replaced by 'V', checking capitalization, all 'U's could have been
replaced by 'V', like stone cutting.  Previous versions had various
kludges attempting to calculate the correct interpretation.  They were
surprisingly good, but philosophically baseless and certainly failed in a
number of cases.  The present version simply considers 'u' and 'v' as the
same letter in parsing the word.  However, the dictionary entries make the
distinction and this is reflected in the output.
<P>
Various combinations of these tricks are attempted, and each try that
results in a possible hit is run against the full dictionary, which can
make these efforts time consuming.  That is a good reason to make the
dictionary as large as possible, rather than counting on a smaller number
of roots and doing the maximum word formation.
<P>
Finally, while the program could succeed on a word that requires two or
three of these tricks to work in combination, there are limits.  Some
words for which all the modifications are supported will fail, if there
are just too many.  In fact, it is probably better that that be the case,
otherwise one will generate too many false positives.  Testing so far does
not seem to show excessive zeal on the part of the program, but the user
should examine the results, especially when several tricks are involved.
<P>
There is a basic conflict here.  At the state of the 1.97E dictionary there
are so few words that both fail the main program and are caught by tricks
that this option could be defaulted to No.  However, one could argue that
there will be very few occasions for trying TRICKS, so that the cost is
minimal.  Unfortunately the degree of completeness of the dictionary for
classical Latin does not carry over to medieval Latin.  With the hope that
the program will become more useful in that area, the default has been
set to Yes, reflecting the philosophy early in the development
for classical Latin.

<A NAME="Trimming of uncommon results">
<H4>Trimming of uncommon results</H4></A>
<P>
Trimming has an impact on output.  If the TRIM_OUTPUT parameter is set, and
specific parameters set in the MDEV, the program will deprecate those
possible forms which come from archaic or medieval (non-classical) stems
or inflections, also stems or inflections which are relatively uncommon.
It will report such if no classical/common solutions are found.  The
default is set for this, expecting that most users are students and
unlikely to encounter rare forms.  Other users can set the parameters
appropriately for their situation.
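<P>
The trimming rule, including its fallback when nothing classical or common
remains, might be sketched as follows.  The Parse record, the function name
trim, and the set of FREQ codes treated as uncommon are assumptions for
illustration; only the AGE tests (A for archaic, E or later for medieval)
are taken from the parameter help text.
<PRE><TT>
```python
from collections import namedtuple

# Hypothetical parse record: a form plus its AGE and FREQ codes.
Parse = namedtuple("Parse", "form age freq")

MEDIEVAL_AGES = set("EFGH")   # "AGE codes of E or later"
UNCOMMON_FREQS = set("EF")    # assumption: which FREQ codes count as uncommon


def trim(parses, omit_archaic=True, omit_medieval=True, omit_uncommon=True):
    """Drop deprecated parses, but report them if nothing else survives."""

    def deprecated(p):
        return ((omit_archaic and p.age == "A")
                or (omit_medieval and p.age in MEDIEVAL_AGES)
                or (omit_uncommon and p.freq in UNCOMMON_FREQS))

    kept = [p for p in parses if not deprecated(p)]
    # If every parse was deprecated, fall back to the untrimmed list.
    return kept if kept else list(parses)


if __name__ == "__main__":
    sample = [Parse("x", "C", "A"), Parse("y", "A", "C"), Parse("z", "F", "B")]
    print(trim(sample))
```
</TT></PRE>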
<P>
This capability is preliminary.  It is just becoming useful in that the
factors are set for about half the dictionary entries.  There are still a
large number of entries and inflections that are not set and will continue
to be reported until determination of rarity is made.
<BR>
<BR>

<A NAME="GUIDING PHILOSOPHY">
<H3><CENTER>GUIDING PHILOSOPHY</CENTER></H3></A>

<A NAME="Purpose">
<H4>Purpose</H4></A>
<P>
The dictionary is intended as a help to someone who knows roughly enough
Latin for the document under study.  It gives the accidence and meanings
possible for an input Latin word.  It is for someone reading Latin text.

<P>
This is a translation dictionary.  Mostly it provides individual words in
English that correspond to, and might be used in a translation of, words
in Latin text.  The program assumes a fair command of English.  This is in
contrast to a conventional same-language desktop dictionary which would
explain the meanings of words in the same language.  The distinction may
be obvious but it is important.  A Latin dictionary in medieval times
would have explanations in Latin of Latin words.
<P>
There are various approaches to the preparation of a dictionary.  The most
scholarly might be to select only proper and correct entries, only correct
derivations, grammar, and spelling.  This would be a dictionary for one
who wished to write 'correct' Latin.  (Correct being defined as the way
Cicero, or your favorite writer or grammarian, used it.) The current
project has a different goal.  This program is successful if a word found in
text is given an appropriate meaning, whether or not that word is spelled
in the generally approved way, or is 'good Latin'.  Thus the program
includes various words and forms that may have been rejected by recent
scholars, but still appear in some texts.  Philosophically, this program
deals with Latin as it was, not as it should have been.  I make no
corrections to Cicero, which some might have been tempted to do if
producing an academic dictionary instead of a program.  Moreover I make no
corrections of St Jerome.  If your copy of the Vulgate has a particular
spelling, that may be recognized by the program, either through a TRICK or
as a dictionary entry that I have generated.
<P>
A philosophical difference from many dictionary projects is that this one
has no firm model of the user or application.  It is not limited to
classical Latin, or to 'good practice', or to common words, or to words
appearing in certain texts.  As a result there will be a lot of chaff in
the output.  Some of this may be trimmed out automatically if desired, but
it is there and available.
<P>
However inadequately, I hope to document decisions that went into the
arrangement of the program and dictionary.  I am surprised that there is
little or no such information to the user of published dictionaries.  If
others generate similar products, or use the data from this one, they can
do so in knowledge of how and why processes and forms were constructed.
<P>
I make few value judgments and those are mechanical, not scholarly, and
are documented herein.  Nevertheless some may be inappropriate, in spite of
good intentions.
<BR>

<A NAME="Method">

<H4>Method</H4></A>
<P>
The program subtracts possible endings from an input word and searches a
list of stems, trying to make a match.  If no exact match is possible, it
tries various modifications, beginning with prefixes and suffixes, and
eventually involving various regular spelling variations (or 'tricks')
common in classical and medieval Latin.
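<P>
The first step, subtracting endings and looking up what is left, can be
sketched in a few lines.  The tiny STEMS and ENDINGS tables and the function
name parse are invented for this example; the sketch also ignores whether an
ending actually belongs to the declension or conjugation of the stem, which
the real program must check.
<PRE><TT>
```python
# Tiny illustrative tables; the real lists are far larger and carry
# part-of-speech and declension data that this sketch omits.
STEMS = {"aqu": "water", "am": "love", "reg": "king"}
ENDINGS = ["", "a", "ae", "am", "arum", "o", "at", "is", "em"]


def parse(word):
    """Return (stem, ending, meaning) for every split whose stem is listed."""
    w = word.lower()
    hits = []
    for ending in ENDINGS:
        if not w.endswith(ending):
            continue
        stem = w[:len(w) - len(ending)]
        if stem in STEMS:
            hits.append((stem, ending, STEMS[stem]))
    return hits


if __name__ == "__main__":
    for sample in ("aquam", "amo", "regis", "xyz"):
        print(sample, parse(sample))
```
</TT></PRE>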
<P>
A choice was made that the base was classical Latin as defined by the
Oxford Latin Dictionary (OLD).  Their primary time period is, somewhat
arbitrarily, roughly 100 BC to 100 AD.
<P>
The classical form of words is taken as the base.  Modifications are made
in such a way as to correct to this base.  Further additions to local
dictionaries should keep this in mind.  Modifications are made to the
input words, not to the dictionary stems.  It could be done the other way,
but the present situation was initially much easier.  There are some
consequences of this approach.  For instance, it is easy to remove an 'h'
from an input word to match with a stem.  It is much more difficult (but
not impossible) to add 'h' in all possible positions to check against
stems.
<P>
It would be possible to match most words with a relatively smaller list of
stems (or roots) and generous application of word construction.  This
approach is not followed.  One difficulty is that while words may be
constructed correctly, and the underlying meaning to be found from this
construction, the common usage may be obscured by a formal interpretation
of the parts.  In practice this occurs in 20-40% of the cases.  This
method is still very useful in approaching a word for which there has been
no dictionary interpretation, but it puts a considerable burden on the
normal user.  Further, in about 10% of constructions, the result is just
wrong.
<P>
In normal usage, if the program finds a simple match, it does not go
further and consider what constructed words might also be valid.  (One can
override and force prefix/suffix construction with a switch, but one might
not want to force all possible tricks.)
<P>
For instance, if there is an adjective that matches, a corresponding
identically spelled, logically valid noun will not be reported unless it
is explicitly found in the dictionary, even though it could be constructed
or inferred from the adjective or constructed with a suffix from a verb in
the dictionary.
<P>
An exception to this is that enclitics (e.g., -que) are always considered.
Coloque can be a verb or collo-que.  The latter is in Virgil and should
not be omitted.  Verb syncope is also favored.  In the vast majority of
cases, if there is a possible syncope it is the correct parse.  This is
given preference over word construction with suffix.  Audii is syncope of
audivi, but it could also be aud-i-i.  The latter is considered very
unlikely.
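<P>
A sketch of the enclitic rule: even when the whole word matches, every
split into base plus enclitic is also reported.  The enclitic list, the
function name readings, and the is_known callback standing in for the
dictionary search are assumptions for this example.
<PRE><TT>
```python
ENCLITICS = ("que", "ne", "ve")   # assumption: the program's list may differ


def readings(word, is_known):
    """Collect the plain reading and every enclitic split is_known accepts."""
    w = word.lower()
    found = []
    if is_known(w):
        found.append((w, ""))
    for enc in ENCLITICS:
        base = w[:len(w) - len(enc)]
        # Unlike prefix/suffix construction, this is tried even after a hit.
        if w.endswith(enc) and base and is_known(base):
            found.append((base, enc))
    return found


if __name__ == "__main__":
    known = {"itaque", "ita", "populus"}
    print(readings("itaque", known.__contains__))
    print(readings("populusque", known.__contains__))
```
</TT></PRE>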
<P>
There are a large number of paths and possibilities.  Choices have been
made in the code that result in the exclusion of some.  It is hoped that
they were the best choices.  The method was constructed by taking a number
of primary procedures and combining/assembling them in such a way as to
give reasonable parses for a number of test cases.  Basically, this is
hacking, but it might be considered an empirical starting point from
which one could construct a logical rationale.
<P>
Therefore, the philosophy is to populate the stem list as densely as
possible.  Even easily resolved differences are included redundantly
(adligo as well as alligo; ad- accounts for most of the duplicates).  The advantage is
that while regular single-letter modifications are fairly easy, and two
letter differences are possible (but more expensive), further deviations
are problematical.  The better populated the stem list, the better the
chance of a result.
<P>
Even in easy cases the overpopulation is helpful.  Antebasis is easily
parsed as ante-basis ('pedestal before', which is reasonable), but
inclusion as a separate word allows the additional information that it is
the hindmost pillar of the pedestal of a ballista.
<P>
The stem list is also populated with variants suggested by different
sources.  The problem is that the remains of classical Latin have gone
through many monks along the way.  These copyists may have made simple
mistakes (typos!), or have made what they thought were proper corrections
(spell checkers!).  And twenty centuries later scholars work hard to
reassemble the best Latin to present in the dictionary.  But a particular
document in the form presented to the reader may have a variety of
spellings for exactly the same word in the same referenced passage
(Pliny's Natural History is often subject to this problem).  (It may even
be that modern texts and dictionaries have misprints!) All forms found in
various dictionaries can be included, with the exception of those
explicitly labeled 'misread' (and the argument probably could mandate
their inclusion also).  However, a single example of a variant in one case
will not be included as a dictionary entry.  If such a word is
sufficiently important, if it is used frequently or by several authors, it
will be entered as a UNIQUE.
<P>
Lewis and Short seem to be more willing than the more recent Oxford Latin
Dictionary to raise a few examples of variation to an entry (at least an
alternate).  Generally, I make an entry if some dictionary does so.  But
within an entry I generate additional possible stems not noted elsewhere,
e.g., I expand first declension verbs with '-av' perfect stems, even
though no example exists in classical Latin.  This is often the practice
in other dictionaries also.

<P>
Verb parts omitted from source dictionaries are mechanically added where it is clear,
(e.g., where the base verb is documented, but parts are omitted in compounds).
Whether Cicero used them or not, some later text might.

<P>
In some cases I also have expanded adjectives and adverbs to include comparative
and superlative stems where they seem reasonable or have corresponding English
instances, even when there is no specific dictionary citation.
This effort was motivated primarily by finding examples of such comparisons
in processing of large amounts of text beyond the classical
works upon which authoritative dictionaries are based, but even classical
works yielded examples.  The point is that, while these forms would usually be
caught by the word formation (prefix/suffix) process in the program,
the process is limited to how many operations can be done serially.
Having more/expanded stems allows another level of word modification to be
implemented.

<P>Adjectives are extrapolated to COMP and SUPER where it makes sense
(when those meanings are reasonable, and in many cases they are not)
even if the source dictionary only lists POS.
They are expanded fully, especially when the source lists a COMP but no SUPER.

<P>
Perhaps a bit out of context, consider the common question of SECLORUM in
the Great Seal of the USA.  This pure word is not in any dictionary I know of,
not the OLD or L+S.  A simple trick gives seculorum (seculum = world),
but the favored translation is from the twice modified saeculorum (saeculum = age),
which would not be found by a minimalistic system.
<P>
It is often the practice in paper dictionaries to double up on an entry
that may be either adjective or noun, usually by leading with the
adjective and mentioning its use as a noun.  A much larger set of
adjective/noun pairs is favored with separate entries.  It is the
philosophy of this program to make separate entries whenever there is an
example in any reference dictionary.  This might facilitate the task of a
larger translation program which would handle phrases or sentences.
However there has been no effort to explicitly generate such pair
expansion if there is no precedent, and the user must still recognize the
possibility of unexpanded multiple possibilities for substantives.
<P>
An argument against a large stem list is that it increases the storage
required (but this is extremely modest by current standards) and increases
processing time for search of the stems (this is far offset by the
processing which would be required to construct or analyze words working
from a smaller stem list).
<P>
A significant objection is that artificially generated stems may conflict with
real/common ones and produce false output confusing to the user.  A certain
amount of this is eliminated by trimming the output to emphasize the most
probable results, but it is still a problem.
<P>
Perhaps a counterexample would be an inferred fourth stem to no/nare (swim).
Natus conflicts with the fourth stem of nascor (be produced/born) and the
nouns and adjectives stemming from it.  The nare natus does not appear in
dictionaries, nor does it occur in compounds of nare, so it has been omitted
from the WORDS dictionary.
<P>
Additional parts of verbs are included (first conjugation is easily filled
out, even eccentric verbs if they are compounds of known parts), although
they may not have been found in any well known texts.  Cases can be
logically constructed that are 'missing' in classical Latin.  Verbs with
prefix can be expanded when the base is known.  That a form has not been
found in surviving copies of classical texts does not mean that it was
not on the lips of every centurion and his girl friend, or that it might
not find its way into medieval texts.
<P>
It may be argued in some cases that forms are missing because their
pronunciation would be awkward.  This may well be true when Cicero is the
arbiter, but others may not be so elegant.  Moreover, much of the texts
are represented by medieval documents, Latin that was written but may not
have been spoken, so the problem did not arise.
However, I might be willing to accept this argument for considering carefully
some perfect stems of first conjugation verbs which otherwise would end
in -avav.  In the end, the only one I found that I could not support
was lavo (wash), and its compounds, for which the perfect is lavi.
<P>
In some cases there are good reasons not to do the mathematical expansion,
and these are pointedly avoided.  There is no mechanical generation of,
for instance, conl- words for every coll- word, unless there is some
citation or reasonable rationale.  They may be paired in almost every
case, but, for instance, collis and collyra are not.  However, forms that
are mentioned in dictionaries explicitly, or implicitly by being derived
from words having variant forms, are included in order to reduce the
dependence on 'tricks'.  OLD has a conp- for almost every comp- (except
derivatives from como).  Rare exceptions seem to be rare words for which
few examples (or only one) exist.  Even in some of these cases, OLD
(mechanically?) gives two forms.  L+S follows the same pattern, except for
words of late Latin (which would not be found in OLD).  It is presumed
that the general practice in later times was always to use comp-, and the
program dictionary follows that.  There are many acc-/adc- pairs, but OLD
has a fair number of acc- words without mention of a corresponding adc-,
and so the possible generation of these words has been resisted.  If an
example turns up in text, the appropriate trick procedure should suffice.
<P>
One suspects that some amount of analytical expansion is present even in
the best dictionaries.  Otherwise how can one explain four alternate
spellings for a word which apparently only appears in citation as a single
inscription?
<P>
In some few cases I have inferred a declension to certain very obscure Greek words
which other dictionaries have treated as indeclinable
(having only a single classical example of their use).
My argument is that some later writer, using this word, might attempt
to decline it in a conventional manner,
no matter what Vitruvius thought.  I have indicated the indecl. option in the meaning.
<P>
Adjectives from participles are included if an entry is found in some
reference dictionary.  In some cases the adjective has a special meaning
not obvious from the verb.  The program will return both the adjective and
the participle with its verb meaning.  The user should give some
additional consideration to the adjective meaning in this case.  If the
adjective is marked rare while the verb is common, it is likely there is
reference to a special meaning.
<P>
Tricks are expensive in processing time.  Each possible modification is
made, then the resulting word goes through the full recognition process.
If it passes, that is reported as the answer.  If it fails, another trick
is tried.  This is effective if very few words get this far.  It is
expected that application of single tricks will solve most of the
resolvable difficulties.  It would be impractical to mechanically apply
several tricks in series to a word.  A large stem population reduces the
likelihood of multiple tricks being required.  If the dictionary is heavily and
redundantly populated, tricks are rarely necessary (and therefore not an
overall processing burden) and largely successful (if the input word is a
valid, but unusual, variant/construction).
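<P>
The control flow described here, one trick at a time with each result sent
through the full recognition process, can be sketched as below.  The names
resolve, recognize, and tricks are placeholders invented for the example;
recognize stands in for the whole lookup and returns nothing on failure.
<PRE><TT>
```python
def resolve(word, recognize, tricks):
    """Try the word as given, then single tricks; stop at the first hit."""
    result = recognize(word)
    if result:
        return word, result
    for trick in tricks:
        changed = trick(word)
        if changed == word:
            continue                  # this trick does not apply
        result = recognize(changed)   # full recognition on each attempt
        if result:
            return changed, result
    return word, None                 # no single trick was enough


if __name__ == "__main__":
    known = {"impono": "put on"}

    def inp_to_imp(w):
        return "imp" + w[3:] if w.startswith("inp") else w

    print(resolve("inpono", known.get, [inp_to_imp]))
```
</TT></PRE>
Because every attempt costs a full lookup, the loop is cheap only when few
words reach it, which is the case for a densely populated stem list.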
<P>
Further, a conventional dictionary, especially one that wishes to set a
standard for proper language, excludes words that may not meet criteria of
propriety, slang, misspellings, etc.  This may place the onus on the
reader to convert words.  A computer dictionary ought to relieve the
reader as much as possible.  The present program may be a far way from
complete, but its goal is to strive for that.

<A NAME="Word Meanings">
<H4>Word Meanings</H4></A>
<P>
The meanings listed are generally those in the literature/dictionaries.
In the case of common words, there is general agreement among authors.
Some uncommon words display convoluted interpretations.
<P>
Generally, the meaning is given for the base word, as is usual for
dictionaries.  For the verb, it will be a present meaning, even when the
tense input is perfect.  For an adjective, the positive meaning is given,
even if a comparative or superlative form is shown.  This is also so when
a word is constructed with a suffix, thus an adverb constructed from its
adjective will show the base adjective meaning and an indication of how to
make the adverb in English.
<P>
For the level of usage for this program, and for convenience in coding,
the meaning field has been fixed at 80 characters.  It is possible to have
multiple 80 character lines for an entry, but this is only necessary for the
most common words.  In order to conserve space, extraneous helpers like
'a', 'the', 'to', which sometimes appear in dictionary definitions, are
generally omitted.  The solidus ('/') is used both to separate equivalent
English meanings and to conserve space.
<P>
I have taken it upon myself to add some interpretations and synonyms, and
propose common usage for otherwise complex descriptive definitions.  The
idea is to prompt the reader, expecting that the text may not be that from
which some dictionary copied the meaning (from some 18th century
translator!).

<P>
In the meanings I only use words of which I know the meaning.
I find that in some cases the Oxford Latin Dictionary uses English
that is not in the Oxford English Dictionary.

<P>
Where available, the Linnean or 'scientific Latin' name is given in
parentheses, mostly for plants.  This is not a classical Latin name, but a
modern designation.  Similarity of this designation to some Latin word may
not be historically significant.
<P>
The spelling of the English meanings is US (plow not plough, color not
colour, and English corn is rendered as grain or wheat), in spite of the
fact that most of the Latin dictionaries that I have are British and use
British spelling.  The reason for this is (besides uniformity in the
program) that there is much computer processing and checking of the
dictionary data, including spell-checking of the English.  (This is not to
say that everything is correct, but it is much better than it would be
without the computer checking.) All my programs speak US English, so I can
count on it.  Only some are available in UK English, and I do not have all
of those versions.
<P>
Latin dictionaries seem to be locked into the 19th century.  The
English terms seem stilted, even by current British usage.  This is
probably because much work in translation was started then and later work
tended to copy from the previous dictionaries.  While this dictionary has
done some modernization, some of the previous obscurities have been
preserved.  This was done in order that certain machine processes could
compare the results of automatic translation with existing published work.

<P>
In addition, I have given US meanings to some terms that seem to be
literally translated from the Latin (or German!) (a person who
steals/drives off cattle is a rustler in the US).
<P>
Most dictionaries have an etymological approach; they are driven by the
derivation of words to distinguish with separate entries words that may be
identical in spelling but different derivations.  But they can lump
entirely different, even contradictory, meanings in a single entry if
there is some common derivation.  Philosophically, this dictionary is
usually not sensitive to derivations, but sometimes supports multiple
entries for vastly different meanings, application areas, or eras.  <BR>

<P>
In a very small number of cases a source, such as OLD, will have an entry for which no English meaning is provided.
Instead, a few words of Latin text containing the word are given.
If they cannot figure it out, I certainly cannot.
Such a source entry is usually omitted from WORDS.


<A NAME="Proper Names">
<H4>Proper Names</H4></A>
<P>
Only a very few proper names are included, many just for test purposes,
others that users have requested.  The number of proper names is almost
limitless but very few are applicable to a particular document, and if it
is an obscure document it is unlikely that the names would be found in any
dictionary.
<P>
Meaning for proper names may cite a likely example of a person with that
name.  This is just an example; there are lots of others with that name.
<P>
There is a switch (defaulted to Yes) that allows the program to assume
that any capitalized unknown word is a proper name, and to ignore it.
Also, one can make up a local dictionary of names for one's particular
application.

<A NAME="Letter Conventions (u/v, i/j, w)">
<H4>Letter Conventions (u/v, i/j, w)</H4></A>

<H5>U and/or V</H5>
<P>
Strictly speaking, Latin did not have a V, just a consonant U, or a U
character that was easier in capitals (the way Latin was written by the
Romans) to write or chisel in stone as V. However, many modern texts and
dictionaries (with the important exception of the OLD) make the
distinction with two characters (u and v).  It appeared most appropriate
in a computer context (never destroy information) to make the distinction
and follow the common practice.  So all dictionary entries maintain the
V/v.  However, an input word following the U convention will be found.  At
an earlier version, an algorithm was kludged to convert where necessary.
While this worked in most cases, there were difficulties.  The present
system processes the dictionary and the input word as though U and V were
the same letter, although the basic dictionary maintains the distinction
and the output reflects this.  There is no need for the user to
set modes for this process.

<H5>I and/or J</H5>
<P>
A similar situation arises with I, and its consonant form, J. In this
instance, the common practice is to use only I, but there are many
counter-examples, both texts and dictionaries.  (Lewis+Short uses J, but
OLD does not.) Because of common practice, the program started out as a
pure-I dictionary with conversion of J-to-I on input.  It remained that
way through many versions, in spite of the logical inconsistency with U-V.
The technique worked perfectly, but eventually the aesthetic of
consistency won out and the U/V technique described above was extended to
I/J.
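<P>
The matching rule for U/V and I/J can be sketched as follows.  This is only an
illustration in Python, not the program's own Ada code; the names
<TT>match_key</TT> and <TT>same_word</TT> are hypothetical.  The idea is that
comparison is done on a folded key while the stored spelling is left alone, so
the output can still show V and J.

```python
# Hypothetical sketch: compare words as though U/V and I/J were the same letter.
# Only the comparison key is folded; the dictionary spelling is never altered.
_FOLD = str.maketrans("vVjJ", "uUiI")


def match_key(word: str) -> str:
    """Return a comparison key in which v folds to u and j folds to i."""
    return word.translate(_FOLD)


def same_word(input_word: str, dictionary_form: str) -> bool:
    """True when the two spellings differ at most in u/v or i/j."""
    return match_key(input_word) == match_key(dictionary_form)
```

With this, an input of <TT>uinum</TT> matches a dictionary entry spelled
<TT>vinum</TT>, and no mode setting is needed from the user.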

<H5>W</H5>
<P>
While the letter W does not exist in classical Latin,
there are examples of W in medieval Latin.  I have not directly
faced this, and have few words in the dictionary yet with W.   The W
problem is not analogous to U/V.  While W sometimes could correspond to V
or UU, in most cases it is a valid letter, reflecting a Germanic origin of
the word.  It will be treated as a real letter, and tricks employed as useful.

<A NAME="DICTIONARY">
<H3><CENTER>DICTIONARY</CENTER>
</H3></A> <BR>


<A NAME="Dictionary Codes">
<H4>Dictionary Codes</H4></A>
<P>
Several codes are associated with each dictionary entry (e.g., AGE,
AREA, GEO, FREQ, SOURCE).  Initially these were provided against the possibility of
the program using them to make a better interpretation.  However,
this additional information may also be of some help to the reader.
It is carried in codes because it is not available to the program in any
other way.  Other codes, like the KIND code for nouns, may be
used, others may not.  The program is still in development and these are
put in to experiment with a possible capability.  Later versions may use
them, omit them, or provide others.
<P>
The program covers a combination of time periods and application areas.
This is certainly not the way in which dictionaries are usually prepared.
Usually there is a clear limit to the time or area of coverage, and with
good reason.  A computer dictionary may have capabilities that mitigate
those reasons.  Time or area can be coded into each entry, so that one
could return only classical words, even though matching medieval entries
existed.  (The program has that capability now, but it is not yet clear
how to apply it.)
<P>
There is some measure of period and frequency that can be used to
discriminate between identical forms, but if there is only one possible
match to an input word, it will be displayed no matter its era or rarity.
The user can choose to display age and frequency warnings associated with
stems and meanings, but the present default is not to, although inflections
are so identified by default.
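<P>
The behavior described above might be sketched like this.  It is an
illustration only: the function name <TT>trim</TT>, the tuple layout, and the
two cut-off sets are assumptions made for the example, not the program's actual
rules.

```python
# Hypothetical sketch of trimming: drop rare or out-of-period interpretations
# only when a more ordinary interpretation of the same form remains.
RARE_FREQ = {"E", "F", "I", "M", "N"}      # assumed cut-off, not the program's
OUT_OF_PERIOD_AGE = {"A", "F", "G", "H"}   # assumed choice for a classical text


def trim(candidates):
    """candidates: list of (meaning, age, freq) tuples for one input word."""
    if len(candidates) == 1:
        return list(candidates)   # a sole match is shown whatever its codes
    kept = [c for c in candidates
            if c[1] not in OUT_OF_PERIOD_AGE and c[2] not in RARE_FREQ]
    return kept or list(candidates)   # never trim away every interpretation
```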
<P>
So far these codes have not been of much use, especially since the only
significant exercises have been with classical Latin.  Other situations
may change this.  Perhaps the only impact now is for those words which
have different meanings in different applications or periods.  For these
the warning may be useful.  Otherwise, if there is only one interpretation
for a word, that is given.
<P>
Rare and age-specific inflection forms are also displayed, but there is a
warning associated with each.  <A NAME="AGE">

<H5>AGE</H5>
<P>
The designation of time period is very rough.  It is presently based on
dictionary information.  If the quotes cited are in the 4th century, and
none earlier, then the word is assumed to be late Latin, and one might
conclude that it was not current earlier.  One flaw in this argument could
be that the citation given was just the best illustration from a large
number covering a wide period.  On the other hand, the word could have
been well known in classical times but did not appear in any surviving
classical writings.  In such a case, it is reasonable to warn the reader
of Cicero that this is not likely the correct interpretation for his
example.  This capability is still developmental, and its usefulness is
still an open question.
<P>
If there is a classical citation, the word could be designated as
classical, but unless there is some reason to conclude otherwise, it is
expected that classical words are valid for use in all periods (X), are
universal for well considered (published) Latin.
<P>
A designation of Early (B) means that there are not classical citations,
except for poetry, in which the poet is invoking the past (or just
straining for meter).  Obsolete words occur similarly in English
literature and poetry.
<P>
Much which is designated late or medieval may be vulgar Latin, in common
use in classical times but not thought suitable for literary works.
<P>
In all periods the target is Latin.  Archaic Latin, for purposes of the
program, is still Latin, not Etruscan or Greek.  Medieval Latin is that
which was written by scholars as the universal Latin, not versions of
early French or Italian.

<PRE><TT>  type AGE_TYPE is (
    X,   --              --  In use throughout the ages/unknown -- the default
    A,   --  archaic     --  Very early forms, obsolete by classical times
    B,   --  early       --  Early Latin, pre-classical, used for effect/poetry
    C,   --  classical   --  Limited to classical (~150 BC - 200 AD)
    D,   --  late        --  Late, post-classical (3rd-5th centuries)
    E,   --  later       --  Latin not in use in Classical times (6-10) Christian
    F,   --  medieval    --  Medieval (11th-15th centuries)
    G,   --  scholar     --  Latin post 15th - Scholarly/Scientific   (16-18)
    H    --  modern      --  Coined recently, words for new things (19-20)
                             );</TT></PRE>
<A NAME="AREA">

<H5>AREA</H5>
<P>
While the reader can make his own interpretation of the area of
application from the given meaning, there may be some cases in which the
program can also use that information (which it can only get from a direct
coding).  This has not yet been used in the program, but the possibility
exists.  If the reader were doing a medical text, then higher priority
should be given to words coded B, if a farming book, then A coded words
should be given preference.
<P>
The area need not apply to all the meanings, just that there is some part
of the meaning that is specialized to or applies specifically to that area
and so is called out.

<PRE><TT>type AREA_TYPE is (
                        X,      --  All or none
                        A,      --  Agriculture, Flora, Fauna, Land, Equipment, Rural
                        B,      --  Biological, Medical, Body Parts
                        D,      --  Drama, Music, Theater, Art, Painting, Sculpture
                        E,      --  Ecclesiastic, Biblical, Religious
                        G,      --  Grammar, Rhetoric, Logic, Literature, Schools
                        L,      --  Legal, Government, Tax, Financial, Political, Titles
                        P,      --  Poetic
                        S,      --  Science, Philosophy, Mathematics, Units/Measures
                        T,      --  Technical, Architecture, Topography, Surveying
                        W,      --  War, Military, Naval, Ships, Armor
                        Y       --  Mythology
                             );</TT></PRE>
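<P>
Since the program does not yet use AREA, the following is only a sketch of how
a preference might work, assuming each candidate is a (meaning, area code) pair.
The function name <TT>prefer_area</TT> and the sample meanings are hypothetical.

```python
# Hypothetical sketch: put interpretations coded for the reader's subject area
# first, leaving the relative order of all other interpretations unchanged.
def prefer_area(candidates, area):
    """Stable sort: entries coded with `area` first, then everything else."""
    return sorted(candidates, key=lambda entry: entry[1] != area)


# A reader of a medical text asks for AREA B to be given priority.
ranked = prefer_area([("plough", "A"), ("vein", "B"), ("and", "X")], "B")
```

Nothing is discarded here; the preferred area only changes the order, which
fits the remark that the area need not apply to every meaning of an entry.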
<A NAME="GEO">

<H5>GEO</H5>
<P>
This code was included to enable the program to distinguish between
different usages of a word depending on where it was used or what country
was the subject of the text.  This is a dual usage, origin or subject.

<PRE><TT>type GEO_TYPE is (
                       X,      --  All or none
                       A,      --  Africa
                       B,      --  Britain
                       C,      --  China
                       D,      --  Scandinavia
                       E,      --  Egypt
                       F,      --  France, Gaul
                       G,      --  Germany
                       H,      --  Greece
                       I,      --  Italy, Rome
                       J,      --  India
                       K,      --  Balkans
                       N,      --  Netherlands
                       P,      --  Persia
                       Q,      --  Near East
                       R,      --  Russia
                       S,      --  Spain, Iberia
                       U       --  Eastern Europe
                       );
</TT></PRE>
<A NAME="FREQ">

<H5>FREQ</H5>
<P>
There is an indication of relative frequency for each entry.  These codes
also apply to inflections, with somewhat different meaning.  If there were
several matches to an input word, this key may be used to sort the output,
or to exclude rare interpretations.  The first problem is to provide the
score.  The initial method is to grade each word by how much column space
is allocated to it in the Oxford Latin Dictionary, or the number of
citations, on the assumption that many citations mean a word is common.
This is not the main intent of the compilers of existing dictionaries, but it
is almost the only indication of frequency that can be inferred from the
dictionaries.  In many cases it seems to be a reasonable guess, certainly
for those most common words, and for those that are very rare.

<P>FREQ guessed from the relative number of citations given by sources
need not be valid, but seems to work.
If the compiler's purpose were just to give sufficient
examples to clarify the use of the word,
perhaps a single reference would serve for a simple word.
However one might observe that dictionary people seem to be enamored
with filling up this section whenever possible.
('et' has more than a page in OLD.)
If there is only one citation, they could only find one.
(This assertion can now easily be verified by searching the texts
available on the Internet.)


With the
understanding that adjustments can be made when additional information is
available, the initial numeric criteria are:

<PRE>
A   full column or more, more than 50 citations - very frequent
B   half column, more than 20 citations - frequent
C   more than 5 citations - common
D   4-5 citations - lesser
E   2-3 citations - uncommon
F   only 1 citation - very rare
</PRE>
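<P>
The citation-count half of these criteria can be written down directly.  The
sketch below is an illustration, not code from WORDS; it ignores the column-space
criterion, and returning X for a word with no citations is an assumption.

```python
# Hypothetical sketch: initial FREQ code from the number of citations alone.
# Each pair is (minimum number of citations, code), checked from the top down.
_THRESHOLDS = [(51, "A"), (21, "B"), (6, "C"), (4, "D"), (2, "E"), (1, "F")]


def freq_from_citations(citations: int) -> str:
    """Map a citation count to the initial FREQ code described in the table."""
    for minimum, code in _THRESHOLDS:
        if citations >= minimum:
            return code
    return "X"   # assumed: no citations means unknown or unspecified
```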


<P>
In the case of late Latin in Lewis and Short, these frequencies may be
significant underestimates, since the volume of applicable texts
considered seems to be much smaller than for classical Latin resulting in
fewer opportunities for citations.  Nevertheless, barring additional
information, the system is generally followed.
<P>
For the situation where there are several slightly different spellings
given for a word, they all are given the same initial frequency.  The
theory is that the spelling is author's choice while the frequency is
attached to the word no matter how it is spelled.  I presume that for a
specific text the author always spells the word the same way, that there
is no distribution of spellings within an individual text.  One exception
to this rule is the case where a variant spelling is cited only for
inscriptions.  There may be some significance to this and a FREQ of I is
assigned.  The logic of this choice is debatable.  However, for some
variations there is clearly a difference in application and this can be
reflected in the frequency code.  Likewise, there are situations wherein
words of the same spelling but different meanings may have different
frequencies.  This may help to select the most likely interpretation.
<P>
One has a check against the frequency list of Diederich for the most
common, and those are probably the only ones that matter.  But the
frequency depends on the application, and it should be possible to run a
new set of frequencies if one had a reasonable volume of applicable text.
The mechanical verification of word frequency codes is a long-term goal of
the development, but must wait until the dictionary data is complete.
<P>
Inscription and Graffiti are designations of frequency only in that the
only citations found were of that nature.  One might suppose that if
literary examples were known they would have been used.  So one might
expect that such words would not be found in a student's text.  There is
no implication that they were not common in the spoken language.
<P>
A very special case has been created for 'N' words, words for which the
only dictionary citation is Pliny's Natural History.  It seems, from
reading of dictionaries, that this work may be the only source for these
words, that they do not appear in any other surviving texts.  They are
usually names for animals, plants or stones, many without identification.
Such words may appear only in Lewis and Short and the Oxford Latin
Dictionary, the unabridged Latin classical dictionaries.  These words are
omitted from most other Latin dictionaries and, although they fall in the
classical period and are from a very well known writer, there is no
mention of the omission.  So there may be an argument to disparage these
words, unless one is reading Pliny.
<P>
Most of these words are of Greek origin (although that is also true for
much of Latin).  For many, the dictionaries report different forms or
declensions for the word giving the same citation.  Often one dictionary
will give a Greek-like form (-os, -on) where another gives a Latinized
form (-us).  There is no consistency.  Both OLD and L+S disagree on Latin
and Greek forms, with no overwhelming favoritism to one form attached to
either dictionary.  This may be a reflection of the fact that the
dictionaries grew over a long time with several editors, many workers, and
no rigid enforcement of standards.
<P>
I have made it a point to try to complete (give M, F, N) Greek adjectives
where other dictionaries give only a single form.  To do this I have referred
to the base Greek in Liddell + Scott Greek-English Lexicon, assuming that
any Roman scholar pedantic enough to use a Greek form knew the Greek and
would draw on that knowledge.
<P>
There is another problem that is found chiefly in connection with
Pliny-type words.  Since the literature is very sparse on examples, it is
often uncertain whether a particular usage is appropriately listed as a
noun, as an adjective, or as adjective used as a substantive.  The present
dictionary, in blessed innocence, records all forms without bias.

<PRE><TT>    type FREQUENCY_TYPE is (     --  For dictionary entries
    X,    --              --  Unknown or unspecified
    A,    --  very freq   --  Very frequent, in all Elementary Latin books, top 1000+ words
    B,    --  frequent    --  Frequent, next 2000+ words
    C,    --  common      --  For Dictionary, in top 10,000 words
    D,    --  lesser      --  For Dictionary, in top 20,000 words
    E,    --  uncommon    --  2 or 3 citations
    F,    --  very rare   --  Having only single citation in OLD or L+S
    I,    --  inscription --  Only citation is inscription
    M,    --  graffiti    --  Presently not much used
    N     --  Pliny       --  Things that appear only in Pliny Natural History
                      );</TT></PRE>

<P>
For inflections, the same type is used with different weights

<PRE><TT>
--  X,    --              --  Unknown or unspecified
--  A,    --  most freq   --  Very frequent, the most common
--  B,    --  sometimes   --  sometimes, a not unusual VARIANT
--  C,    --  uncommon    --  occasionally seen
--  D,    --  infrequent  --  recognizable variant, but unlikely
--  E,    --  rare        --  for a few cases, very unlikely
--  F,    --  very rare   --  singular examples,
--  I,    --              --  Presently not used
--  M,    --              --  Presently not used
--  N     --              --  Presently not used

</TT></PRE>
<A NAME="SOURCE">

<H5>SOURCE</H5>
<P>
Source is the dictionary or grammar which is the source of the
information, not the Cicero or Caesar text in which it is found.

<P>
For a number of entries, X is now given as Source.  This is primarily from
the vocabulary (about 13000 words) which was in place before the Source
parameter was put in, and some have not been updated.  They are
from no particular Source, just general vocabulary picked up in various
texts and readings.  Although, during the dictionary update beginning in
1998, all entries are being checked against sources, it may be improper to
credit (blame?) a Source when that was not the origin of the entry,
remembering that the actual entries are of my generation entirely and may
not correspond exactly to any other view.  However, in the second pass (as
far as it has progressed) all classical entries have been verified with
the Oxford Latin Dictionary (OLD).  (By that I mean that I have checked,
not to imply that I have not made errors.) This does not mean that the
entry copies or agrees with the OLD, but that I read the OLD entry with
great respect and put down what I did anyway.  Newer entries, added in
this process, and those checked later in the process, if found in the OLD,
have the O code.  Words added from Lewis and Short, but not in OLD, have
the S code, etc.

All entries for which there is a Source will be found in
some form in that Source, but the details of the interpretation of
declension and meaning are mine.
Each entry is
my responsibility alone, and there are significant differences and
elaborations.  They may not necessarily be found as
primary entries, or even directly referenced, but they will have been
constructed from information in that source.  For instance, the remark 'adp see app'
in a source dictionary may generate 'adp' WORDS entries that are not explicitly mentioned in the
source dictionary.
There might be occasions where the source gives a noun
but on my own initiative I have also introduced the corresponding adjective
(or the converse), particularly if that usage was found in a text.
In such a case the source would be the same.  If I have
done a proper job, the reader will not often be surprised.

<P>An important implication of the SOURCE is age.  OLD contains words
from the classical period of Latin, and these are carried forward to all ages.
Thus AGE for OLD entries will be X (all ages).  Those in L+S, but not in OLD,
might be checked against the premise that they were late/post-classical Latin (D),
citations being the determining factor.  Souter (SOURCE=P) is a wordlist of
later Latin (AGE=E), so his entries might be presumed not to be common in
classical times.  Other sources, indicated by AGE flags or by parenthesized
comments, may also indicate to the user the age appropriate for the entry.
Calepinus Novus (Cal) (SOURCE=K) is especially noteworthy  in that it is
of modern, 20th century Latin and its meanings should probably not be applied to
earlier texts.

<P>
OLD is taken as the most authoritative source,
and if a word is in OLD then it was used in classical times within a very limited period.
An entry with source O will have AGE X (or C if it is unique to classical).
This also defines good Latin and the usage should be valid for all ages.

<P>
Lewis and Short (S) is next in authority and also somewhat in time.
It covers, in addition to classical, a later period.  That a word appears in S but not in O
may mean it is a somewhat later usage.
If that point is well established, the AGE is D.
But most often the main source is OLD and there are additional meanings
indicated as L+S.  The user is warned that this may be a case of modified meaning
coming into use at a later age.
But it may be that, after review of L+S, OLD differs for reason and has a better interpretation.

<P>
A formation from a classical word with the natural meaning is usually assumed
to also be appropriately classical/general - X.
Such a word with an enhanced, specialized or modified meaning might
indicate a later usage, and is so labeled.
It may be that the word was in use earlier but no reference is available.
In some cases, an additional meaning is identified as (L+S)
just to give credit, without implying anything further.

<P>
In time, Souter (P) is next.
Again, if a word is in Souter but not in O or S, it is very likely later Latin.
The date may reflect this, but the source is a hint to the user, not a firm promise.

<P>
Next in line in time is Latham (M) for medieval Latin.

<P>
Souter and Latham are poorly represented.  There is no attempt to include these sources
with the thoroughness of the OLD and L+S effort.
Entries from these sources come up only when a particular
word is submitted from a text and no other source serves, giving credence to the assumption that
such entries belong to a later AGE.

<P>
Stelten (Ecc) is more fully represented (goal to complete) since it specializes in an area not well
covered by other sources.  While it is a complete dictionary, with all the general words, it has
a number of entries specifically or solely applicable to the Christian Church.
These are from later (non-classical) times, chiefly medieval.

<P>
Licoppe (K) is modern.  An additional meaning on a word from an earlier AGE is likely to be uniquely modern.
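<P>
The way a SOURCE hints at an age can be summarized in a small lookup.  This is a
sketch only: the table and the name <TT>presumed_age</TT> are hypothetical, the
S entry applies only to words found in L+S but not in OLD, and an explicit AGE
code on the entry always takes precedence over the hint.

```python
# Hypothetical sketch: the age a SOURCE code suggests when no AGE is coded.
PRESUMED_AGE = {
    "O": "X",   # OLD: classical, carried forward to all ages
    "S": "D",   # L+S but not OLD: possibly late, if citations bear it out
    "P": "E",   # Souter: later Latin
    "M": "F",   # Latham: medieval
    "K": "H",   # Calepinus Novus (Licoppe): modern
}


def presumed_age(source: str, coded_age: str = "X") -> str:
    """Prefer an explicit AGE code; otherwise fall back on the source hint."""
    if coded_age != "X":
        return coded_age
    return PRESUMED_AGE.get(source, "X")
```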

<P>
Note that there are examples in which different sources at different ages give contrary meanings.
This may reflect a real and not uncommon shift in meaning, or there may be errors in the sources.
At least in such cases the sources (and their implied ages) are identified.

<P>The list of sources goes far beyond what has been directly used so far.
There should be no expectation at this point in the development that
all these sources have even been used.  They are listed as I have copies
and as they might be consulted.  They are encoded so that the program might
recognize and process the source should it come up.
I have sought and received permission for those which have been
extensively used.  Others have only been used for an occasional check
(fair use).


<PRE><TT>  type SOURCE_TYPE is (
    X,      --  General or unknown or too common to say
    A,
    B,      --  C.H.Beeson, A Primer of Medieval Latin, 1925 (Bee)
    C,      --  Charles Beard, Cassell's Latin Dictionary 1892 (CAS)
    D,      --  J.N.Adams, Latin Sexual Vocabulary, 1982 (Sex)
    E,      --  L.F.Stelten, Dictionary of Eccles. Latin, 1995 (Ecc)
    F,      --  Roy J. Deferrari, Dictionary of St. Thomas Aquinas, 1960 (DeF)
    G,      --  Gildersleeve + Lodge, Latin Grammar 1895 (G+L)
    H,      --  Collatinus Dictionary by Yves Ouvrard
    I,      --  Leverett, F.P., Lexicon of the Latin Language, Boston 1845
    J,
    K,      --  Calepinus Novus, modern Latin, by Guy Licoppe (Cal)
    L,      --  Lewis, C.T., Elementary Latin Dictionary 1891
    M,      --  Latham, Revised Medieval Word List, 1980
    N,      --  Lynn Nelson, Wordlist
    O,      --  Oxford Latin Dictionary, 1982 (OLD)
    P,      --  Souter, A Glossary of Later Latin to 600 A.D., Oxford 1949
    Q,      --  Other, cited or unspecified dictionaries
    R,      --  Plater & White, A Grammar of the Vulgate, Oxford 1926
    S,      --  Lewis and Short, A Latin Dictionary, 1879 (L+S)
    T,      --  Found in a translation  --  no dictionary reference
    U,      --  Du Cange
    V,      --  Vademecum in opus Saxonis - Franz Blatt (Saxo)
    W,      --  My personal guess
    Y,      --  Temp special code
    Z       --  Sent by user --  no dictionary reference
            --  Mostly John White of Blitz Latin

    --  Consulted but used only indirectly
    --  Liddell + Scott Greek-English Lexicon

    --  Consulted but used only occasionally, separately referenced
    --  Allen + Greenough, New Latin Grammar, 1888 (A+G)
    --  Harrington/Pucci/Elliott, Medieval Latin 2nd Ed 1997 (Harr)
    --  C.C./C.L. Scanlon Latin Grammar/Second Latin, TAN 1976 (SCANLON)
    --  W. M. Lindsay, Short Historical Latin Grammar, 1895 (Lindsay)
                        );
</TT></PRE>


<A NAME="Current Distribution of DICTLINE Flags">
<H4>Current Distribution of DICTLINE Flags</H4></A>
<PRE><TT>
Number of lines in DICTLINE GENERAL  1.97F    39187

AGE
X         28858
A         61
B         446
C         58
D         3937
E         1718
F         1996
G         1920
H         193

AREA
X         29181
A         2955
B         912
D         410
E         1916
G         504
L         1221
P         181
S         730
T         382
W         722
Y         73

GEO
X         38147
A         64
B         52
C         1
D         3
E         49
F         67
G         20
H         278
I         141
J         4
K         6
N         8
P         9
Q         312
R         1
S         25
U         0

FREQ
X         11
A         2133
B         2711
C         10757
D         2678
E         11218
F         7982
I         424
M         0
N         1273

SOURCE
X         7554
A         0
B         41
C         1751
D         14
E         1417
F         119
G         59
H         0
I         4
J         116
K         2100
L         60
M         759
N         84
O         16039
P         296
Q         24
R         12
S         8094
T         88
U         0
V         47
W         316
Y         35
Z         158
</TT></PRE>


<A NAME="Dictionary Conventions">
<H4>Dictionary Conventions</H4></A>
<P>
There are a few special conventions in setting codes.
<P>
Proper Names
<P>
Proper names are often identified by the AGE in which the person lived,
not the age of the text in which he is referenced, the AREA of his fame or
occupation, and the GEO from which he hailed.  This refers to some
most-likely person of this name.  A name may be shared by others in
different ages.  Thus Jason, the Argonaut, is Archaic, Myth, Greek (A Y
H).  (It is not likely that a Latin text would refer to a TV star.)
Tertullian, an early 3rd century Church Father from Carthage, author of
the first Christian writings in Latin, is Late, Ecclesiastic, Africa (D E
A).  Jupiter is (A E I), which is a bit sloppy since he is present later.
Today he may be a myth, but then he was a god.  But even gods are not
eternal (X) in language, and an initial place is found for them.  Place
names are likewise coded, although with less confidence.
<P>
Vertical Bar
<P>
While not visible to the user, the dictionary contains certain meanings
starting with a vertical bar (|).  This is a code used to identify meanings
that run beyond the conventional 80 characters.  One or more vertical bars
leading the meaning allows tools to recognize that they are additional
meanings to an entry already encountered, usually the entry immediately
before when the sort is for that reason.  This is only of concern to those
dealing with the raw dictionary who have asked.  <BR>
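<P>
For those working with the raw dictionary, a tool might fold such continuation
meanings back into the entry they extend.  The sketch below assumes the rows
have already been parsed into (headword, meaning) pairs in file order and that a
continuation directly follows its entry; the function name and the sample rows
are hypothetical, and the real DICTLINE record layout is not shown here.

```python
# Hypothetical sketch: join meanings that start with one or more vertical bars
# onto the preceding row for the same headword.
def merge_continuations(rows):
    """rows: (headword, meaning) pairs in file order; returns merged pairs."""
    merged = []
    for headword, meaning in rows:
        if meaning.startswith("|") and merged and merged[-1][0] == headword:
            previous_head, previous_meaning = merged[-1]
            extra = meaning.lstrip("|").strip()
            merged[-1] = (previous_head, previous_meaning + " " + extra)
        else:
            merged.append((headword, meaning))   # ordinary row, or an orphan
    return merged


rows = [("amo", "love, like;"), ("amo", "|be fond of;"),
        ("aqua", "water;"), ("aqua", "||sea, lake;")]
merged = merge_continuations(rows)
```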

<BR><A NAME="Evolution of the Dictionary">
<H4>Evolution of the Dictionary</H4></A>
<P>
The stem list was originally put together from what might be called
'common knowledge', those words that most Latin texts have.  The first
version had about 5000 dictionary entries, giving up to 95% coverage of
simple classical texts.  This grew to about 13000 entries with specific
additions when gaps were found.  With this number it was possible to get
better than a 99% hit rate on Caesar (an area from which the dictionary
was built).  Parse of other works fell to 95-97%, which may be
mathematically attractive but leaves a lot to be desired in a dictionary,
since a translator is usually familiar with the vast bulk of the language
and just needs help on the obscure words.  Having just the common words is
not enough, indeed not much help at all.  So an attempt is made to make
the dictionary as complete as possible.  All possible spellings found in
dictionaries are included.
<P>
Starting with the 13000, the expansion project beginning in 1998 sought to
verify the existing words and supplement with any new found ones.  Thus
all classical Latin words are consistent with the OLD (not to say taken
from, because most were not, but checked against).  Any significant
deviation is indicated, either as from another source, or in the
definition itself.
<P>
L+S is used for later Latin and to check OLD work.  This started with the
thought that if a word was in L+S but not in OLD it must be later Latin,
beyond the range of OLD.  I was surprised at how many words with classical
citations were in L+S but not in OLD, and how many are of different
spelling.
<P>
The refinement is proceeding one letter at a time, as is the tradition for
all great dictionaries.  First stage refinement has proceeded through DI.
<BR>
<BR>

<BR><A NAME="Text Dictionary - DICTPAGE.TXT">
<H4>Text Dictionary - DICTPAGE.TXT</H4></A>

<P>In response to many requests, a simple ASCII text list has been created of
the WORDS dictionary, in what might be called the paper dictionary form.
Each coded dictionary entry has been expanded to its dictionary form
(nominative and genitive for nouns, four principal parts for verbs, etc.).
In content it is like a paper dictionary, but each entry is on one long line
and the headwords are in all capitals, convenient for case-sensitive search.
The headwords are listed alphabetically (not the same as the coded file)
and offered in an ASCII/DOS text file
<A HREF="http://www/erols/com/whitaker/dictpage.txt"><B>DICTPAGE.TXT</B></A>
which may be searched from the user's browser,
or best downloaded and searched by any editor off-line.
To make it possible to search on-line, the file is not compressed and so is
about 3 MB.
<BR><BR>

<BR><A NAME="Latin Spellchecking - Text Processor List - LISTALL.ZIP">
<H4>Latin Spellchecking - Text Processor List - LISTALL.ZIP</H4></A>

<P>I have done a lot of Latin spell checking directly with WORDS.
All you have to do is put the text in a file,
run WORDS with a text file (@) input,
and require output of a WORD.UNK file (see # parameters).
It is sometimes useful to run without FIXes and TRICKS first,
then run the resulting first-pass UNKNOWNs
and look at the full WORD.OUT to make sure the modifications are reasonable.


<P>There are other techniques.
As I understand it, WORD2000 and other processors take a simple list of valid spellings
and use that for spellchecking.
I am speaking on secondhand information.
I have not tried to do the WORD2000 job.
However several people have proposed to use my dictionary files to do so.

<P>Since Latin is an inflected language, each dictionary entry expands
to many "words", often hundreds.
The present WORDS raw dictionary would expand to an enormous number of simple words,
but that is not the end of it.
Each of those words might have attached prefixes and suffixes, enclitics, and spelling variations.
Literally billions of different words can be parsed and analyzed by the WORDS program.
These are legal Latin words, whether any Roman actually spoke them.
Of course, one could make a list of all the words in Cicero, or in the Vulgate,
and make a dictionary of those (and we are close to that),
but the body of medieval Latin is enormously greater
than that of classical Latin on which most dictionaries are based.


<P>In response to several requests, a simple ASCII text list has been created of
the two million primary words
that the WORDS program  and dictionary can form by adding inflections to stems.
This list has been reduced to half by eliminating duplicates.
The downloadable
<A HREF="http://www/erols/com/whitaker/listall.zip"><B>ZIP</B></A>
of this file is over 2 MB.
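<P>
The reduction by duplicates is easy to see on a single noun.  The sketch below
is an illustration, not the tool that built the list: it attaches the ten
first-declension endings (written without vowel length) to one stem and keeps
each spelling once.  The function names are hypothetical.

```python
# Hypothetical sketch: form words by adding inflections to stems, then drop
# duplicate spellings such as the identical dative and ablative plural.
FIRST_DECLENSION_ENDINGS = ["a", "ae", "ae", "am", "a",        # singular
                            "ae", "arum", "is", "as", "is"]    # plural


def expand(stem, endings):
    """Every stem + ending combination, duplicates included."""
    return [stem + ending for ending in endings]


def unique_forms(stems, endings):
    """The distinct spellings only, sorted as a spell-check list would be."""
    return sorted({form for stem in stems for form in expand(stem, endings)})


all_forms = expand("aqu", FIRST_DECLENSION_ENDINGS)
word_list = unique_forms(["aqu"], FIRST_DECLENSION_ENDINGS)
```

Here ten inflected forms collapse to six distinct spellings, which is in line
with the overall reduction to about half.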

<P>The purpose of such a list is to provide data for conventional
word processor spell checking.

<P>Currently there are some omissions.

<P>1) Latin has a widely used enclitic, -que, also -ne and -ve.
In principle these could be tacked on to almost any word.
If the spell checking system had the capability of recognizing
them, that would be the most convenient way of handling this problem.
Otherwise, completeness would require their addition to every word,
quadrupling the size of the list.

<P>2) Many Latin verb forms are subject to syncope, contracting the
form for pronunciation.  In WORDS this is handled by a process.
For the list another method must be used and the contracted words
generated by modifying both stem and ending.

<P>3) There are some common combined words in Latin in which the first
part of the word is declined, followed by a fixed form.  Unlike the
enclitic situation, these forms are limited and should be generated
separately (quidam).  Other qu- pronouns are handled separately
in WORDS and need special processing here also.

<P>4) Uniques have not yet been added.  This is a trivial
matter.

<P>5) There is the problem of prefixes and suffixes.  WORDS provides for
hundreds of these.  It would be impractical to multiply the list
by mechanically including all such possibilities.  Fortunately,
this may not be a significant problem.  The philosophy for the
dictionary has been to include all words, even those which could
be easily generated by a base and fixes, as they occur or are found
in sources.  This means that the most common compound words are
in the system, but that coverage is mostly concentrated on classical Latin.

<P>6) In later times especially,
there came some more or less common spelling variations.
These are handled in WORDS by TRICKS.
They can be relatively expensive, but are only applied to words
which otherwise have failed, and these are becoming rarer.
This process, if generally applied, would not only expand the
list enormously, but the added words would not advance the goal
of spell checking.  They are, in some sense, misspelled words.
For a reader, it can be useful to have a guess at the word.
He can examine the form and context and judge whether it makes
sense.  It is not a process to be applied mechanically.

<P>7) There is a divergence in the way editors treat the non-Latin
characters J and V.  These are the consonant forms of I and U.
They are explicit in English, so for convenience, familiarity, and
pronunciation, general practice in the past has been to use them.
More recently, some academic purists have rejected this and eliminated
J and V altogether.  (Note that the same purists use lower case
letters, in spite of the fact that the Romans had only the upper case.)
WORDS keeps the variant characters in the dictionary and maps them
to a single character in processing.  A list could include both
expressions, and it would only add a few percent in size.  However,
that would allow inconsistent spelling choices in a text.  This
seems to be contrary to the goals of a spell checker.
It is probably better eventually to offer two separate lists so that the user
may select the option appropriate for his work.


<P>All the above factors are applied by processes in the WORDS program.
Running WORDS looking for UNKNOWNS will give a superior spell check,
but the list can be useful in conjunction with common editors.
Experience will determine its effectiveness.
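<P>As an illustration of omission 1 above, here is a minimal sketch of how a
spell checker working from the flat list might recognize the enclitics -que, -ne and -ve
instead of storing every word four times.
The function name, the sample list, and the one-enclitic-per-word rule are assumptions
made for this example; none of it is taken from the WORDS code.

```python
# Hypothetical sketch: none of these names come from the WORDS source.
ENCLITICS = ("que", "ne", "ve")  # the three enclitics named in item 1


def is_known(word, wordlist):
    """Return True if word, or word minus one trailing enclitic, is in wordlist."""
    w = word.lower()
    if w in wordlist:
        return True
    for enc in ENCLITICS:
        # Require a non-empty base so that a bare "que" is not accepted.
        if w.endswith(enc) and len(w) > len(enc) and w[: -len(enc)] in wordlist:
            return True
    return False


# Tiny stand-in for the downloadable list (made up for the example).
sample_list = {"arma", "virum", "cano", "populus", "senatus"}
print(is_known("populusque", sample_list))
print(is_known("populusquo", sample_list))
```

<P>The cost of this shortcut is that any listed word followed by one of the three strings is accepted,
whether or not the combination is sensible Latin.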
<BR><BR>


<A NAME="INFLECTIONS">
<H3><CENTER>INFLECTIONS</CENTER>
</H3></A> <BR>

<P>
Inflections for WORDS are in a human-readable file called INFLECTS.LAT.
Presently there are almost 1800 separate entries.
This data is processed to produce a file INFLECTS.SEC used by the code.
The format of INFLECTS.LAT is simple, as for example:

<PRE><TT>N     1 1 NOM S C                 1 1 a             X A

V     1 1 PRES  ACTIVE  IND  1 P  2 4 amus          X A

PREP  ACC                         1 0               X A</TT></PRE>

<P>
The part of speech is given, along with the appropriate characteristics
for a particular inflection.  The inflection/ending is specified by
the stem to which it is attached, a number of characters, and the ending string.
There is an AGE and FREQ for each entry.
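<P>The layout just described can be read mechanically.
The following is a rough sketch, not the program's own reader: it assumes that fields are
separated by blanks, that the last two fields are AGE and FREQ, and that they are preceded by
the stem number, the ending length, and (when the length is not zero) the ending itself.
The type and function names are invented for the example.

```python
from typing import NamedTuple, Tuple


class Inflection(NamedTuple):
    pos: str                 # part of speech, e.g. N, V, PREP
    traits: Tuple[str, ...]  # declension, case, tense, etc., left uninterpreted
    stem_key: int            # which stem the ending attaches to
    ending_size: int         # number of characters in the ending
    ending: str              # the ending string ("" when ending_size is 0)
    age: str
    freq: str


def parse_inflection(line):
    """Parse one INFLECTS.LAT-style line under the assumptions stated above."""
    tokens = line.split()
    pos, age, freq = tokens[0], tokens[-2], tokens[-1]
    body = tokens[1:-2]
    if body[-1].isdigit():
        # No ending string present: the last two fields are stem number and size.
        stem_key, size, ending = int(body[-2]), int(body[-1]), ""
        traits = body[:-2]
    else:
        stem_key, size, ending = int(body[-3]), int(body[-2]), body[-1]
        traits = body[:-3]
    if len(ending) != size:
        raise ValueError("ending length does not match size field: " + line)
    return Inflection(pos, tuple(traits), stem_key, size, ending, age, freq)


for sample in (
    "N     1 1 NOM S C                 1 1 a             X A",
    "V     1 1 PRES  ACTIVE  IND  1 P  2 4 amus          X A",
    "PREP  ACC                         1 0               X A",
):
    print(parse_inflection(sample))
```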


<BR>
<BR>


<A NAME="ENGLISH to LATIN">
<H3><CENTER>ENGLISH to LATIN</CENTER>
</H3></A> <BR>


<P>A fairly new application for the WORDS dictionary has been an attempt to
go English to Latin.
Up to now there is no satisfactory computer facility for this.
The best on the net is a search of the Perseus dictionary,
finding all uses of the English word in the text of the dictionary.
One can do the same with the WORDS dictionary,
and DICTPAGE.TXT is a convenient form for that purpose.
In the present release of WORDS, a primitive English-to-Latin
facility has been implemented, based on this inverted dictionary method.

<P>However, except for very simple situations,
the resulting raw output can be excessive and often spurious.
It is necessary to TRIM the output for the general user.
In order to do this, one needs to be able to computer parse the MEAN field
and prioritize the significance of a word appearing therein.
This is a more rigorous requirement than the one applied hitherto,
that MEAN should be human-readable.  Now it must be computer parsed.
Therein lies the reason for a formal set of rules for constructing MEAN.
These rules are new and certainly have not yet been applied throughout
the dictionary.  Further,
they may change in the future if more powerful ordering algorithms evolve.

<P>The primary rule is that nothing should surprise or
inconvenience the casual user of WORDS.
Further, for system independence,
the MEAN line should be readable by anyone in ASCII,
without special characters or fonts.


<P>I have just begun to work on an English-to-Latin capability.
Initially this is just an inversion of the WORDS Latin dictionary,
extracting all the English words in the MEAN field of the WORDS dictionary
and associating these with the corresponding Latin entry.
A real English-to-Latin dictionary is much more than that.
To construct from first principles,
one should take a set of English words and find the Latin equivalent,
not the reverse.
Nevertheless, WORDS now has some primitive capability.

<P>The raw inversion produces almost 200_000 English words.
WEEDing them by the present algorithms (eliminate a, the, to, ...,
and a number of common modifiers when included in meanings not of
their part of speech)
reduces this number only by a third.
But this finally results in only somewhat over 20_000 unique words,
less than the number of Latin entries!
This probably reflects more on dictionary makers than on the languages.
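<P>The inversion and WEEDing described above can be pictured with a small sketch.
It is only an illustration under simplifying assumptions: the weed list here holds just the
three words named above (the real list is longer and takes part of speech into account),
words are taken to be runs of letters, and the function name is invented.
The three sample entries reuse stems and meanings shown elsewhere in this document.

```python
import re

# Only the words named in the text; the real WEED list is longer.
WEED = {"a", "the", "to"}


def invert(entries):
    """Map each English word in a MEAN to the numbers of the entries containing it."""
    index = {}
    for number, (_stems, mean) in enumerate(entries, start=1):
        seen = set()
        for word in re.findall(r"[a-z]+", mean.lower()):
            if word in WEED or word in seen:
                continue  # weeded out, or already recorded for this entry
            seen.add(word)
            index.setdefault(word, []).append(number)
    return index


entries = [
    ("aqu", "water;"),
    ("am", "love;"),
    ("sat", "enough, sufficient; enough and some to spare; one of sufficient power"),
]
index = invert(entries)
print(index["enough"])        # entry numbers whose MEAN contains 'enough'
print(index.get("kill", []))  # exact match only: nothing found
```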


<P>English is certainly a far richer language than Latin, measured
by the number of individual words.
WORDS has about 40_000 Latin entries, and the corresponding inversion to
English yields only 22_000 unique English words.


<P>The reason seems to be that,
while English may have lots of words for love or hate,
in making up a Latin
dictionary one will opt to give a simple translation.
So while love is a proper translation of a number of Latin words,
and one could as well replace it with any of dozens of English synonyms,
a dictionary compiler will usually take the simplest English word
that provides the reader with the meaning.   That is what the reader usually wants.


<P>Starting from an English basis to produce an English-to-Latin dictionary
gives an entirely different outcome.
In that case, the full power of English can be invoked,
and it is the Latin that will seem simple by comparison.

<P>In many cases, an English-to-Latin dictionary bound in the same volume
as a Latin-to-English will have been
developed by a different author, and sometimes they are not consistent.
At least the inversion procedure assures basic consistency.

<P>One problem with the inversion method is that one needs
to weed out a lot of the chaff before presenting to the user.
And even then there are a lot of choices for the user.
GOLD occurs 120 times; COPPER, 57 times; ABANDON, 24 times,
plus several times for ABANDONED, ABANDONS, ABANDONING, and ABANDONMENT.
Further trimming has to get very severe!


<P>If the program is run with TRIM_OUTPUT parameter set
(this parameter works on both Latin and English output),
the six highest priority (by FREQ or whatever the current algorithm is)
will be listed.  This should serve for the general user.
Turning off this parameter allows the program to list all instances
found in the Latin dictionary,
which were not removed by WEEDing in the data preparation.

<P>Finally there is the problem that most paper Latin dictionaries harken back
to the 19th century or earlier, even those published more recently.
Their base English may not be current.
Take a purely hypothetical example.  On the first page of every English-Latin
dictionary is <B>abase</B>.  This is a good 18th century word.  Today one is
more likely to see humble, degrade or humiliate, and those are the words the
user is more likely to request.  But the dictionaries from which WORDS draws
may be fonder of abase as the meaning of a Latin word which could serve for
any of these.  The user may want to try some synonyms, but this can be
a considerable burden.  A built-in thesaurus could mechanically generate
a broad range of words to include, but this is surely overkill and will
generate so many inappropriate results as to render the search excessively
cumbersome.  The user is advised to check the meanings returned for suggestions
as to what other words might be tried, if the immediate result does not
seem satisfactory.


<P>One important point is that the program mechanically searches the Latin dictionary.
If one is looking for an adjective, presently one will find all adjectives for which the
MEAN contains the search word, no more.  However one should be aware that
participles of appropriate verbs can also serve as adjectives and may be
a better choice.

<P>At the present time there is no complex construction/deconstruction
of the English input.  Thus if the input is 'kill', only Latin entries
with the exact word 'kill' in their MEAN will be selected.  The suffixed words
'kills'/'killing'/'killed'/'killer'/etc. will not be found.  They must
be queried separately.
Likewise, unlike the Latin phase of WORDS, prefixes are not extracted.
It may be desirable in the future to provide such additional capabilities.
This would be value added over simple search by the program.


<BR>
<BR>


<A NAME="English Parsing of Meanings">
<H4>English Parsing of Meanings</H4></A>
<BR>


<P>Punctuation in meanings is now formalized, in order
to allow computer processing of the text.
Deviation from these rules would make parsing
of the English very difficult, so they must be enforced.
There is nothing which will mislead the user,
but it goes beyond standard text practice.


<P>The semicolon separator has greater significance.
Various groups of meanings may have varying frequency or likelihood.
The most likely are placed first and thereby prioritized.
Within a semicolon group (SEMI) of meaning/synonyms separated by commas or slashes,
their probability is assumed to be the same.
Where possible, a PURE word (e.g., 'perhaps') should lead,
followed by compound meanings (e.g., 'it may be').
There is much work to do before this ordering is complete.

<P>Any PURE meaning (one not involving modifiers) set off by
commas or slashes is assigned a higher priority on output than
a modified/compound meaning in the same MEAN SEMI.

<P>Semicolons separate meaning groups that have a different
flavor/sense.
Initially the interpretation and selection among these were left
to the user, as in paper dictionaries.
Recent requirements demand an ordering of these groups.
The order of the semicolon groups (called SEMIs in the code) should
indicate the frequency or probability of that meaning
among different groups, where this inference can be made.
This ideal is not yet rigorously enforced, even in recent entries,
and less so in those earlier in the update.

<P>Commas separate meanings that are roughly equivalent -
synonyms. In parsing, a COMMA consists of the words between commas.
There is no inherent logical order within a SEMI.  However,
to support another application for the dictionary,
full sentence Latin-to-English translation,
it is desirable to be able to pick a single,
simple, modern English word that is most likely to be the translation.
This should be the first word of the meaning.

<P>Question marks and exclamation points may appear as
an integral part of the meaning.  They do not replace
the comma/semicolon separator, as in normal text.

<P>The solidus/slash (/) does the work of 'or' in many cases.
It is used solely to conserve space,
to compress the meaning line to no more than 80 characters.
It separates (generally close) synonyms and
also alternative options (jump up/out = jump up; jump out).

<P>Plus (+) is used in the dictionary, as well as in this documentation,
in place of ampersand, for compatibility with HTML.
It is a full separator, between two words, each recognized separately.

<P>Hyphen (-) should be used in the dictionary only to break
into two words in the parse, each recognized separately.
Thus, book-keeper will appear in the English parse as two words.
But it is likely that a user looking for an accountant would search
for bookkeeper, rather than book or keeper.
The dictionary has not yet been scrubbed for this situation.

<P>Parentheses set off both possible supporting words
(go (down) = go; go down) and explanatory information.
Since parenthesized words are excluded from the extraction process,
they are a way to further reduce clutter in the English dictionary.
Words in the meaning that should not find this entry when searched
can be excluded from the English dictionary tables by parenthesizing.
(NOTE: two sets of parentheses not separated by a comma or semicolon
can cause processing troubles and should be avoided.)

<P>Square brackets enclose translation examples or idioms,
a Latin expression to English equivalent.
The English translation of the Latin is introduced by =>.
The parser expects this (=>) token.
A bracketed expression is always
at the end of the meanings line so that it may be
extracted before spellchecking, otherwise the spellcheck
will fail on the Latin and there are an inconveniently large
number of these examples.
Brackets should never be used where parentheses are appropriate.
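<P>To make the separator rules above concrete, here is a simplified sketch of how a MEAN
might be broken into SEMIs and their comma/slash-separated meanings.
It is not the MAKEEWDS logic: bracketed examples and parenthesized words are simply dropped,
a slash is treated as a plain separator (so 'jump up/out' is not expanded to 'jump up; jump out'),
and a PURE meaning is assumed to be a single word.
Both function names are invented for the example.

```python
import re


def parse_mean(mean):
    """Split a MEAN string into SEMIs, each a list of comma/slash-separated meanings."""
    text = re.sub(r"\[[^\]]*\]", " ", mean)  # drop bracketed translation examples
    text = re.sub(r"\([^()]*\)", " ", text)  # parenthesized words are not extracted
    semis = []
    for semi in text.split(";"):
        parts = [" ".join(p.split()) for p in re.split(r"[,/]", semi)]
        parts = [p for p in parts if p]
        if parts:
            semis.append(parts)
    return semis


def is_pure(meaning):
    """Assumption: a PURE meaning is a single word with no modifiers."""
    return len(meaning.split()) == 1


for semi in parse_mean("sufficiently, adequately; quite, well enough; fairly, (moderately)"):
    print([(meaning, is_pure(meaning)) for meaning in semi])
```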

<P>Generally, articles (a, an, the) are omitted in meanings.
While this compresses the line, it also reflects the fact
that Latin does not distinguish between those uses.
To define agricola as 'a' farmer would disparage the possibility
of the proper translation being 'the' farmer.  Most dictionaries
report nouns without an article.  This one goes further and
avoids the use of articles almost everywhere.

<P>Some dictionaries prefix verb meanings with TO.
This is superfluous, except in the case of a list of meanings
not distinguished by part of speech (to cut, a cut), not
the situation for this dictionary.

<P>Vertical bars at the beginning indicate continuation meaning lines.
There may be several continuation lines,
numbered/ordered by the number of leading vertical bars.
For words with a large number of meanings,
additional meaning lines are provided by another entry for the same stems
and part with what amounts to a continuation line for MEAN.
In order to associate the resulting series of meaning lines,
a vertical bar (|) is placed at the beginning of the
first continuation MEAN, two bars for the second, etc.
The dictionary is sorted so as to assure that
these entries are grouped and ordered.
This allows checking  of the dictionary for spurious duplicate entries
without flagging intended continuation entries.
Further it facilitates compression
of the WORDS output by combining the inflection output for the several
identical parts followed by the group of meaning lines.
The STEMS and PART are identical for the base (no |) and all extensions.
They are all the same word, however they may have different flags,
that is, there may be different meanings for different AGE or AREA.

<P>The bar is a code for MEAN continuation seen only in the raw DICTLINE.
Bars are removed before WORDS output and are not visible to the user.
There are also some entries with identical STEMS and PART which are
really different words,
different derivation and completely different meanings.
These will not be | coded and will be reported separately in output.
(NOTE: The vertical bar should not appear anywhere in meanings except
at the beginning as a continuation flag.)
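<P>The bar convention lends itself to a very small piece of code.
The sketch below assumes it is handed the MEAN lines of entries that share the same STEMS and PART,
already sorted as described above, and it groups each base line with its continuations
while removing the bars for output.
The function names and the placeholder meanings are invented for the example.

```python
def continuation_level(mean):
    """Number of leading vertical bars: 0 for a base line, 1 for the first continuation, etc."""
    return len(mean) - len(mean.lstrip("|"))


def group_meanings(means):
    """Group sorted MEAN lines into [base, continuation, ...] lists, with bars removed."""
    groups = []
    for mean in means:
        if continuation_level(mean) == 0 or not groups:
            groups.append([mean.lstrip("|")])   # a line without bars starts a new word
        else:
            groups[-1].append(mean.lstrip("|"))  # bars mark more meanings of the same word
    return groups


# Placeholder text, not real dictionary entries.
lines = ["base meaning;", "|more meanings;", "||still more;", "different word;"]
print(group_meanings(lines))
```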

<P>Correct use of symbols/codes in MEAN is very important.
One must not use them 'free form'.
They are used in the parsing of MEAN and improper use can defeat a processing program.
While some main programs have many built-in checks,
there are a number of secondary tools which are not so 'fool proof'.
MAKEEWDS is a complicated program which I did not do well.
If it hits something strange it might well fail to properly
parse that MEAN.  The program will still complete and the
output will only lose a part of the strange MEAN, affecting only the English mode,
and may or may not give a report on the failure.


<A NAME="Ordering English-to-Latin Output">
<H4>Ordering English-to-Latin Output</H4></A>
<BR>


<P>Essentially we start by associating English words in the dictionary entry meaning
with the entry number (line number in DICTLINE).
The list of English words (EWDSLIST) is sorted so that all occurrences of a particular word are together.
Then, upon inquiry, a list of the associated Latin dictionary entries is output.
Unfortunately this list could be large (a hundred or more for some common words) and thereby user-unfriendly.
The task is to order the list and reduce the output to a few most likely


<P>Priorities for display are based on frequencies.  Besides the basic
FREQ assigned to the entry, it is presumed that the frequency is
greater for those meanings in the first SEMIs, with gradually
decreasing frequency assigned to later SEMIs and to bar flagged continuations.
The algorithm presently used is summarized below,
but it is subject to modification in future versions.


<P>Each English word found is given a numerical RANK/priority/weight based on the algorithm below.
The numerical values of each consideration are added or subtracted to give the priority of the entry.


<H5>FREQ</H5>

<P>The obvious choice for frequency weights might be the comparative paper dictionary citations,
which would be roughly:

<PRE><TT>
A=>50
B=>25
C=>10
D=> 5
E=> 3
F=> 1</TT></PRE>


<P>However these would weight the A frequency so heavily
that it would be impossible to overcome with anything
that could be applied to lower frequencies.  So we must reject this scale
for a more manageable set:

<PRE><TT>
A=>70
B=>60
C=>50
D=>40
E=>30
F=>20
etc.</TT></PRE>


<P>(N is a special case: add 25 after the formula.)

<H5>Compounds</H5>

<P>Compound words ('very tall' vs. 'tall') are often useful,
indeed the user may be looking for components to make up a compound translation,
however generally they should be disparaged relative to the pure/simple word.
A compound A FREQ might be no better than a pure D.
<BR>
<BR>
Compound Yes=> 0<BR>
Compound No (Pure) =>  10<BR>
<BR>

Which SEMI (a SEMI is a part of MEAN set off by semicolons)<BR>
<BR>

-3 per SEMI  after 1

<P>Further, a word on a continuation line is disparaged by 3 SEMIs (-9).


<P>The words in the first SEMI are enhanced in the expectation that they are
the primary meaning.  This follows the tenuous idea that there is a single simple
translation for each English word.  At least the first SEMI is emphasized. <BR>
<BR>

If PURE and 1st SEMI => 5<BR>

<P>Priority = FREQ value + Compound value + SEMI value + Continuation value + First SEMI value <BR>
<BR>


Example: for lamp - lanum N 3 2 N

<PRE><TT>
FREQ A => 70
Compound No => 10
Semi 2 => -3
Continuation Line No => 0
Pure 1st No => 0
RANK/priority => 77</TT></PRE>
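<P>The arithmetic of the example can be written out as a short sketch.
This is an illustration of the weights listed above, not the code used in WORDS:
only FREQ values A through F are handled (the values hidden behind 'etc.' and the special
treatment of N are not spelled out here, so they are left out), the continuation penalty is
applied once as a flat -9, and the function and parameter names are invented.

```python
# Weights from the 'more manageable set' above; values beyond F are not given.
FREQ_VALUE = {"A": 70, "B": 60, "C": 50, "D": 40, "E": 30, "F": 20}


def priority(freq, compound, semi, continuation=False):
    """Sum the weights for one English word found in a MEAN.

    freq         -- FREQ flag of the entry, 'A' through 'F'
    compound     -- True if the word is part of a compound meaning, False if PURE
    semi         -- 1-based number of the SEMI in which the word occurs
    continuation -- True if the word is on a bar-flagged continuation line
    """
    if freq not in FREQ_VALUE:
        raise ValueError("FREQ value not covered by this sketch: " + freq)
    rank = FREQ_VALUE[freq]
    rank += 0 if compound else 10   # PURE words get 10, compounds get 0
    rank -= 3 * (semi - 1)          # -3 per SEMI after the first
    if continuation:
        rank -= 9                   # disparaged by 3 SEMIs
    if not compound and semi == 1:
        rank += 5                   # PURE and in the first SEMI
    return rank


# The worked example above: FREQ A, not compound, second SEMI, no continuation.
print(priority("A", compound=False, semi=2))  # 77
```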


<A NAME="TESTS AND STATUS">
<H3><CENTER>TESTS AND STATUS</CENTER>
</H3></A> <BR>

<A NAME="Testing">
<H4>Testing</H4></A>

<P>
The program has been run against common classical texts.  Initially
this was mostly a check of the process and reliability of the program.  It
is now possible to run real texts and get valid statistics.  Relatively
few texts have been run multiple times in order to understand exactly
where failure occurred and to regression test the solutions.  Such testing
has taken place on texts totaling well over a million words.  The best
results come from those which have been run the most times.  Caesar and
the Vulgate are essentially without unknowns (excluding proper names),
Suetonius and Virgil are at the 0.1% level, Varro and Pliny have somewhat
more than 1% unknowns due to their specialized vocabulary.  While this is
a mechanical test and does not assure that the form and meaning reported
by the program is always correct, the actual number of misses found by
limited detailed examination is vanishingly small.

<P>
A far larger test (with feedback) has been made by John White in the development
of his Blitz Latin.  While not using WORDS, he has a program from much the
same basis, incorporating approximately the WORDS dictionary.  He has run
a much larger set of texts, including both classical and medieval, to the
extent of 20 million Latin words, and provided significant unknowns
to be included in WORDS.

<P>
The hardest test is against another dictionary.  While getting a 97+% hit
rate on long classical texts, a run against a large dictionary might fall
to 85-90%, the missing words being in those letters which the update has
not reached.  This is to be expected, since we both have the 10000 most
common words and have made somewhat different additions beyond that.  So
large electronic wordlists are a check on the program, and have been reserved
for that purpose, not simply incorporated as such.

<P>We have gone so far that this is no longer significant and wordlists can be
integrated.  The only real impact has been the inclusion of modern Latin words
which come from such lists, and not from scans of texts.

<BR>
<BR>


<H5>English-to-Latin Tests</H5>

<P>So far there has been no formal validation of the English-to-Latin capability.
There have been numerous individual checks and anecdotal testing, as well
as some mechanical performance tests, but nothing fundamental.

<P>The first test proposed is to take a small English-to-Latin dictionary,
say from the back of an introductory textbook, and check that the Latin
suggested for each entry is found in the top six returned by WORDS.
It is expected that there will be a high correspondence (to be shown).
Taking a much larger example may give a different result.
It may be that the Latin words chosen by WORDS are not the same as
the paper dictionary.
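<P>The proposed check could be scored along the following lines.
This is a hypothetical harness only: the lookup function stands in for whatever returns the
ordered Latin suggestions for an English word, and the canned answers and word pairs below
are made up to exercise the code, not results from WORDS.

```python
def top_n_hit_rate(pairs, lookup, limit=6):
    """Fraction of (english, latin) pairs whose Latin word is among the first `limit` suggestions."""
    hits = 0
    misses = []
    for english, latin in pairs:
        if latin in lookup(english)[:limit]:
            hits += 1
        else:
            misses.append((english, latin))
    rate = hits / len(pairs) if pairs else 0.0
    return rate, misses


# Stand-in for the real English-to-Latin query; the answers are invented.
canned = {"water": ["aqua", "unda"], "love": ["amor"]}


def fake_lookup(english):
    return canned.get(english, [])


rate, misses = top_n_hit_rate([("water", "aqua"), ("love", "amo")], fake_lookup)
print(rate, misses)
```

<P>Keeping the list of misses, rather than only the rate, shows where the paper dictionary
and the program simply prefer different Latin words.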

<BR><BR>


<A NAME="Current Status and Future Plans">

<H4>Current Status and Future Plans</H4></A>
<P>
The present phase of refinement has incorporated the Oxford Latin
Dictionary and Lewis and Short entries into <B>D</B> (about a fourth).
Periodically, when I need a change of task, I run a major author
to check the
effectiveness of the code.  I may then include some words which turn up
frequently as unknowns, but this is done as the spirit moves me.  Smaller
sections of later authors may also be processed, giving some growth in
medieval Latin entries.  Recently I have worked the Vulgate of St. Jerome.

<P>John White in support of his Blitz Latin program has run a very large
body of Latin text, including many medieval legal documents.  He provides
input to the dictionary as he finds significant unknowns.

<P>
I will continue to refine the dictionary and the program.  The major goal
is to complete the inclusion of OLD and L+S, and this may take years.
Along the way, and later, I will expand to medieval Latin.  I am not so
unrealistic as to believe that I will 'finish', indeed, this is a hobby
and there is no advantage to finishing.
<P>
An eventual outcome would be to have some institution, with real Latin
capability, provide an exhaustive and authoritative program of this
nature.  Until then, I and other individuals will make available our
programs.  <BR>
<BR><BR>

<A NAME="USER MODIFICATIONS">
<H3><CENTER>USER MODIFICATIONS</CENTER>
</H3></A>
<BR>
<A NAME="Writing DICT.LOC and UNIQUES.LAT">
<H4>Writing DICT.LOC and UNIQUES.LAT</H4></A>
<P>
To make the dictionary files used by the program is not difficult, but it
takes several auxiliary programs for checking and ordering which are best
handled by one center.  These are available to anyone who needs them, but
it is better that any general additions to the dictionary be handled
centrally so that they can be included in the public release for everyone.
<P>
However, it is possible for a user to enhance the dictionary for special
situations.  This may be accomplished either by providing new dictionary
entries in a DICT.LOC file, those to be processed in the regular manner,
or by adding a unique (single case/number/gender/...) in a text file called
UNIQUES.LAT.  <A NAME="DICT.LOC">

<H4>DICT.LOC</H4></A>
<P>
A dictionary entry for WORDS (in the simplest, editable form as read in a
DICT.LOC) is

<PRE><TT>
aqu   aqu
N    1 1 F T     X X X X X
water;

</TT></PRE>

<P>
For a noun there are two stems.  The definition of STEM is inherent in
the coding of inflections in the program.  Different grammars have
different definitions.  There is no formal connection with any other
usage.
<P>
To these stems are applied, as appropriate, the endings

<PRE><TT>
         S       P
NOM      a       ae
GEN      ae      arum
DAT      ae      is
ACC      am      as
ABL      a       is
</TT></PRE>

<P>
Or rather, the input word is analyzed for possible endings, and when these
are subtracted a match is sought with the dictionary stems.  A file
(INFLECTS.LAT) gives all the endings.
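<P>As a rough picture of that subtract-and-match step, the sketch below tries each N 1 1
ending from the table above against the input word and looks for the remaining stem in a
one-entry dictionary.
It is a toy, not the WORDS algorithm: the data structures and names are invented, and the
choice of the first stem for NOM S and the second stem for every other form is an assumption
of the example.

```python
# (case, number, ending, stem number) for N 1 1, taken from the table above.
# Assumption: NOM S attaches to the first stem, all other forms to the second.
ENDINGS_N_1_1 = [
    ("NOM", "S", "a", 1), ("NOM", "P", "ae", 2),
    ("GEN", "S", "ae", 2), ("GEN", "P", "arum", 2),
    ("DAT", "S", "ae", 2), ("DAT", "P", "is", 2),
    ("ACC", "S", "am", 2), ("ACC", "P", "as", 2),
    ("ABL", "S", "a", 2), ("ABL", "P", "is", 2),
]

# One-entry stand-in for the dictionary: (stem 1, stem 2) mapped to the meaning.
DICTIONARY = {("aqu", "aqu"): "water;"}


def analyze(word):
    """Return (case, number, meaning) for every ending that leaves a known stem."""
    results = []
    for case, number, ending, stem_key in ENDINGS_N_1_1:
        if not word.endswith(ending):
            continue
        stem = word[: len(word) - len(ending)]
        for stems, mean in DICTIONARY.items():
            if stems[stem_key - 1] == stem:
                results.append((case, number, mean))
    return results


print(analyze("aquam"))
print(analyze("aquae"))
```

<P>A form such as aquae matches several endings, which is why the program reports more than one
possible parse for many words.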

<P>
In this example, the first line
<PRE><TT>
aqu   aqu
</TT></PRE>
contains the two noun stems for the word found in printed dictionaries as

<PRE><TT>
aqua, -ae
</TT></PRE>

<P>
The second line

<PRE><TT>
N    1 1 F T     X X X X X
</TT></PRE>
says it is a noun (N), of the first declension, first variant, is feminine
(F), and is a thing (T), as opposed to a person, location, etc.  The X X X
X X represents coding about the age in which it is applicable, the
geographic and application area of the word, its frequency of use, and the
dictionary source of the entry.  None of this is necessary in a DICT.LOC
although something must be filled in and X X X X X is always satisfactory.

<P>
The last line is the English definition.  It can be as long as 80
characters.

<PRE><TT>
water;
</TT></PRE>

<P>
The case and exact spacing of the stems and codes is unimportant, as long
as they are separated by at least one blank.
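<P>For anyone preparing DICT.LOC entries by script, the three-line form can be split apart as in
the sketch below.
It is an illustration, not the program's reader: it assumes that the last five codes on the
second line are the five flag fields, treats everything between the part of speech and those flags as
part-specific codes, and, since case is unimportant, folds stems to lower case and codes to upper case.
The function name and the dictionary keys are invented.

```python
def parse_entry(lines):
    """Split a three-line DICT.LOC-style entry into its pieces."""
    stems_line, part_line, mean_line = lines
    fields = part_line.upper().split()   # spacing and case do not matter
    return {
        "stems": stems_line.lower().split(),
        "pos": fields[0],
        "details": fields[1:-5],         # declension, gender, kind, etc.
        "flags": fields[-5:],            # the five trailing codes, X X X X X by default
        "mean": mean_line.strip(),
    }


noun = parse_entry(["aqu   aqu", "N    1 1 F T     X X X X X", "water;"])
verb = parse_entry(["am  am  amav  amat", "V 1 1 X    X X X A O", "love;"])
print(noun)
print(verb)
```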
<P>
The PART_OF_SPEECH_TYPE values that you are most interested in are (X, N, ADJ,
ADV, V).  X is always a valid entry.  It stands for none, or all, or
unknown.  0 has the same function for numeric types.
<P>
The others in the type (PRON, PACK, VPAR, SUPINE, PREP, CONJ, INTERJ, NUM,
TACKON, PREFIX, SUFFIX) are either less interesting or artificial, used
only internally to the code.
<P>
A noun or a verb has a DECN_RECORD consisting of two small integers.  The
first is the declension/conjugation, and the second is a variant within
that.
<P>
N 1 1 is the conventional first declension.  But there are variants (6, 7,
8) which model Greek-like declensions.  (Greek-like variants start at 6.)
<P>
N 2 1 is the regular -us, -i second declension.
<P>
N 2 2 is the regular -um, -i neuter form.
<P>
There is a N 2 3 for 'r' forms like puer, pueri.  In this case there is
the possibility of a difference in stems (ager, agri has stems coded as
ager, agr).
<P>
Again there are Greek-like variants (6, 7, 8, 9).
<P>
N 3 1 is regular third declension (lex, legis - lex, leg) for masculine
and feminine.
<P>
N 3 2 is for neuter (iter, itineris - iter, itiner).
<P>
Variants 3 and 4 are for I-stems.  And so it goes.
<P>
Each noun has a GENDER_TYPE (X, M, F, N, C).  X for unknown (something I
avoid for gender - guess if you have to) or all genders (useful in the
code but not in a dictionary), and C for common (M + F).
<P>
There is also a

<PRE><TT>
NOUN_KIND_TYPE (X,            --  unknown, nondescript
                N,            --  proper Name
                L,            --  Locale, country, city
                W,            --  a place Where
                P,            --  a Person type
                T)            --  a Thing
</TT></PRE>
which you probably do not care about either.  Most entries will be
Thing.
<P>
Other codes are enumerated in the body of this document.
<P>
Verbs are done likewise, but there are four stems, as described below.  An
example is

<PRE><TT>
am  am  amav  amat
V 1 1 X    X X X A O
love;
</TT></PRE>

<P>
Now comes the hard part.  When starting from a dictionary one has all the
information to decide the values.  Just having a single instance of the
word lacks a lot.  Consider some examples from a user.
<P>
Elytris is surely from the Greek for sheath.  The question is how
Latinized did it get.  I suspect that by the 17th century it was
completely Latinized.  Even in classical times there was very little left
in the way of Greek forms (elythris (or -es), elythris (N 3 3)), but it
could be a Greek-like form (N 3 9).  I do not even know what case I
started with, if NOM, then it must be -is, -is, if GEN then -es, -is is
reasonable.  Then again, if it is DAT P we might have a N 1 1.
<P>
All this seems very uncertain, and, in the absence of a real dictionary
entry, it is.  However you can make the choices such that the result (the
output of the code) matches exactly what you have.  If you have more
information, lots of examples, the uncertainty shrinks.  If you have just
a single isolated example, there are limits.  (But if you do 100 and have
more information about some, you can make better guesses about the rest.)
<P>
Next we need a gender.  It may not make much difference (if M or F, or C)
in this case, but sometimes it matters.  You might be able to figure that
out from the text.
<P>
It is a thing (T), but X will work for your purposes.  For the rest, X X X
X X works fine.
<P>
So we have

<PRE><TT>
elythris   elythr
N   3 3  F T       X X X X X
elytra, wing cover of beetles
</TT></PRE>

<P>
Sat, I happen to know, is an abbreviated form of satis, so it is easy.  If
you want the adverb form, as you indicate:

<PRE><TT>
Sat
ADV POS     X X X X X
sufficiently, adequately; quite, well enough; fairly, (moderately)
</TT></PRE>

<P>
Adverbs have a comparison parameter (X, POS, COMP, SUPER).  Most will be
POS.
<P>
It also is an indeclinable (N 9 9) substantive:

<PRE><TT>
sat
N 9 9 N T     X X X X X
enough, sufficient; enough and some to spare; one of sufficient power

</TT></PRE>

<P>
Deplanata seems to be a 1-2 declension adjective, the -us, -a, -um form.
It also seems to be derived from the verb deplanto (V 1 1) - break off/sever
(branch/shoot).

<PRE><TT>
deplanat   deplanat
ADJ 1 1 POS     X X X X X
broken off/severed (branch/shoot); (flattened)
</TT></PRE>

<P>
Adjectives have a DECN and a comparison.
<P>
The following were not at the time in the dictionary, but were in the OLD.

<PRE><TT>
alat  alat
ADJ 1 1 POS     X X X X X
winged, having wings; having a broad/expanded margin


(punct - ul - at  -> hole/prick/puncture - small - having)

punctulat   punctulat
ADJ 1 1 POS    X X X X X
punctured; having small holes/pricks/stabs/punctures

appendiculat   appendiculat
ADJ 1 1 POS    X X X X X
appendiculate; having/fringed by small appendages/bodies


acetabul   acetabul
N 2 2 N T     X X X X X
small cup (vinegar), 1/8 pint; cupped part (plant); sucker; socket, (cavity)


ruf  ruf
ADJ   1 1 POS     X X X X X
red (various); tawny; red-haired (persons); strong yellow/moderate orange


testace   testace
ADJ  1 1 POS    X X X X X
bricks; resembling bricks (esp. color); having hard covering/shell (animals)
</TT></PRE>

<P>
This one had no classical correspondence.

<PRE><TT>
brunne   brunne
ADJ 1 1  POS     X X X X X
brown
</TT></PRE>

<P>
There is one other remark.  It is probably wise to include in the
definition a more complete English meaning.  Just saying the meaning of appendiculatus is
appendiculate is not as interesting as it might be.
<P>
All the inflections are in a file called INFLECTS.LAT, now a part of the
general distribution of <A HREF="http://www.erols.com/whitaker/wordsall.zip">source code and data files</A>.  <BR>

<P>
Here is a quick reference for the most common types.

<PRE><TT>

--  All first declension nouns  - N 1 1
--  Ex: aqua aquae  =>  aqu aqu

--  Second declension nouns in "us"  - N 2 1
--  Ex: amicus amici  =>  amic amic

--  Second declension neuter nouns - N 2 2
--  Ex: verbum verbi  =>  verb verb

--  Second declension nouns in "er" whether or not the "er" is in base - N 2 3
--  Ex: puer pueri  =>  puer puer
--  Ex: ager agri   =>  ager agr

--  Early (BC) 2nd declension nouns in ius/ium (not filius-like)  - N 2 4
--  for the most part formed GEN S in 'i', not 'ii'   --  G+L 33 R 1
--  Dictionaries often show as ...(i)i
--  N 2 4 uses GENDER discrimination to reduce to single VAR
--  Ex: radius rad(i)i  => radi radi        M
--  Ex: atrium atr(i)i  =>  atri atri       N

--  Third declension M or F nouns whose stems end in a consonant - N 3 1
--  Ex: miles militis  =>  miles milit
--  Ex: lex legis  =>  lex leg
--  Ex: frater fratris  =>  frater fratr
--  Ex: soror sororis  =>  soror soror
--  All third declension that have the endings -udo, -io, -tas, -x
--  Ex: pulcritudo pulcritudinis  =>  pulcritudo pulcritudin
--  Ex: legio legionis  =>  legio legion
--  Ex: varietas varietatis  =>  varietas varietat
--  Ex: radix radicis  =>  radix  radic

--  Third declension  N nouns with stems ending in a consonant - N 3 2
--  Ex: nomen nominis  =>  nomen nomin
--  Ex: iter itineris =>  iter itiner
--  Ex: tempus temporis  =>  tempus  tempor

--  Third declension nouns  I-stems (M + F)     - N 3 3
--  Ex: hostis hostis  =>  hostis host
--  Ex: finis finis  =>  finis fin
--  Consonant i-stems
--  Ex: urbs urbis  =>  urbs urb
--  Ex: mons montis  =>  mons mont
--  Also use this for present participles (-ns) used as substantives in M + F

--  Third declension nouns  I-stems (N)    - N 3 4
--  Ex: mare maris  =>  mare mar                       --  ending in "e"
--  Ex: animal animalis  =>  animal animal             --  ending in "al"
--  Ex: exemplar exemplaris  =>  exemplar exemplar     --  ending in "ar"
--  Also use this for present participles (-ns) used as substantives in N

--  Fourth declension nouns M + F in "us"  - N 4 1
--  Ex: passus passus  =>  pass pass
--  Ex: manus manus  =>  man man

--  Fourth declension nouns N in "u"  - N 4 2
--  Ex: genu genus  =>  gen gen
--  Ex: cornu cornus  =>  corn corn

--  All fifth declension nouns  - N 5 1
--  Ex: dies diei  =>  di di
--  Ex: res rei  =>  r r


--  Adjectives will mostly only be POS and have only the first two stems
--  ADJ X have four stems, zzz stands for any unknown/non-existent stem

--  Adjectives of first and second declension (-us in NOM S M)  - ADJ 1 1
--  Two stems for POS, third is for COMP, fourth for SUPER
--  Ex: malus mala malum  => mal mal pei pessi
--  Ex: altus alta altum  => alt alt alti altissi

--  Adjectives of first and second declension (-er) - ADJ 1 2
--  Ex: miser misera miserum  =>  miser miser miseri miserri
--  Ex: sacer sacra sacrum  =>  sacer sacr zzz  sacerri     --  no COMP
--  Ex: pulcher pulchri  =>  pulcher pulchr pulchri pulcherri

--  Adjectives of third declension - one ending  - ADJ 3 1
--  Ex: audax (gen) audacis  =>  audax audac audaci audacissi
--  Ex: prudens prudentis  =>  prudens prudent prudenti prudentissi

--  Adjectives of third declension - two endings   - ADJ 3 2
--  Ex: brevis breve  =>  brev brev brevi brevissi
--  Ex: facilis facile  =>  facil facil facili facilli

--  Adjectives of third declension - three endings  - ADJ 3 3
--  Ex: celer celeris  celere  =>  celer celer celeri celerri
--  Ex: acer acris acre  =>  acer acr acri acerri


--  Verbs are mostly TRANS or INTRANS, but X works fine
--  Deponent verbs must have DEP
--  Verbs have four stems
--  The first stem is the first principal part (dictionary entry) - less 'o'
--  For 2nd conj, the 'e' is omitted, for 3rd conj i-stem, the 'i' is included
--  Third principal part always ends in 'i', this is omitted in stem
--  Fourth part in dictionary ends in -us (or -um), this is omitted
--  DEP verbs omit (have zzz) the third stem

--  Verbs of the first conjugation  --  V 1 1
--  Ex: voco vocare vocavi vocatus  =>  voc voc vocav vocat
--  Ex: porto portare portavi portatus  =>  port port portav portat

--  Verbs of the second conjugation   -  V 2 1
--  The characteristic 'e' is in the inflection, not carried in the stem
--  Ex:  moneo monere monui monitum  =>  mon mon monu monit
--  Ex:  habeo habere habui habitus  =>  hab hab habu habit
--  Ex:  deleo delere delevi deletus  =>  del del delev delet
--  Ex:  iubeo iubere iussi iussus  =>   iub iub iuss iuss
--  Ex:  video videre vidi visus  =>  vid vid vid vis

--  Verbs of the third conjugation, variant 1  - V 3 1
--  Ex: rego regere rexi rectum  =>  reg reg rex rect
--  Ex: pono ponere posui positus  =>  pon pon posu posit
--  Ex: capio capere cepi captus  => capi cap cep capt   --  I-stem too w/KEY

--  Verbs of the fourth conjugation are coded as a variant of third - V 3 4
--  Ex: audio audire audivi auditus  =>  audi aud audiv audit

--  Verbs like to be - coded as V 5 1
--  Ex: sum esse fui futurus  =>  s . fu fut
--  Ex: adsum adesse adfui adfuturus  =>  ads ad adfu adfut

</TT></PRE>
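<P>
As an illustration of the verb stem rules above, the following Python sketch
derives the four stems of a regular first or second conjugation verb from its
principal parts.  It is not part of WORDS.  The names strip_ending and
verb_stems are invented for this example, and third and fourth conjugation
verbs, deponents, and irregular verbs are deliberately left out.

<PRE>
# Hypothetical helper, not part of WORDS: derive the four dictionary stems
# of a regular first or second conjugation verb from its principal parts.

def strip_ending(form, endings):
    """Return form without the first matching ending; raise if none match."""
    for ending in endings:
        if form.endswith(ending):
            return form[:len(form) - len(ending)]
    raise ValueError("unexpected ending in " + form)


def verb_stems(parts, conjugation):
    """parts is a string such as 'voco vocare vocavi vocatus'."""
    present, infinitive, perfect, participle = parts.split()
    if conjugation == 1:
        first = strip_ending(present, ["o"])
        second = strip_ending(infinitive, ["are"])
    elif conjugation == 2:
        # The characteristic 'e' belongs to the inflection, not the stem.
        first = strip_ending(present, ["eo"])
        second = strip_ending(infinitive, ["ere"])
    else:
        raise ValueError("only conjugations 1 and 2 are sketched here")
    # Third principal part ends in 'i'; fourth ends in -us or -um.
    third = strip_ending(perfect, ["i"])
    fourth = strip_ending(participle, ["us", "um"])
    return [first, second, third, fourth]


print(" ".join(verb_stems("voco vocare vocavi vocatus", 1)))
print(" ".join(verb_stems("moneo monere monui monitum", 2)))
</PRE>

<P>
The two calls reproduce the stems given for voco and moneo in the quick
reference.  A stem set built this way should still be checked against a
dictionary before it is added.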


<A NAME="UNIQUES.LAT">
<H4>UNIQUES.LAT</H4></A>
<P>
There are a few Latin words that cannot be represented with the scheme of
stems and endings used by the program.  For these very few cases, the
program invokes a unique procedure.  The file UNIQUES.LAT contains a list of
such words and is read in when the program loads.  This is a simple
ASCII/DOS text file which the user can augment.  It is expected that there
will be very few occasions to do so; indeed, the tendency has been that
better processing has allowed uniques to be removed.  If a user finds an
important word that should be included, please communicate that to the
author.
<P>
The UNIQUES record has essentially the form one would see in the output
if the word were processed normally.  There are also some additional
fields that the program presently expects.  While these could be
eliminated, it is convenient for the program not to make the UNIQUES a
special case.  So a noun form

<PRE><TT>
N 3 1 ACC S F T
</TT></PRE>
is followed by two zeros and an X

<PRE><TT>
N 3 1 ACC S F T  0 0                            X
</TT></PRE>
and then the five X's or, more properly, the dictionary codes.

<PRE><TT>
N 3 1 ACC S F T  0 0                            X        X  X  X  B  O
</TT></PRE>

<P>
These pro forma codes are absolutely necessary, but have no further
impact.
<P>
The program is written in Ada and uses Ada techniques.  Ada is designed
for high reliability systems (there is no claim that WORDS was developed
with all the other safeguards that that implies!) and as a consequence is
unforgiving.  The exact form is required.  If you want to be sloppy you
have to deliberately program that in.
<P>
The following examples, and an examination of the UNIQUES.LAT file, should
allow the user to insert any unique necessary.

<PRE><TT>
requiem
N 3 1 ACC S F T  0 0                            X        X  X  X  B  O
rest (from labor), respite; intermission, pause, break; amusement, hobby;
bobus
N 3 1 DAT P C T  0 0                            X        X  X  X  C  X
ox, bull; cow; cattle (pl.)
quicquid
PRON 1 6 NOM S N INDEF   0 0                    X        X  X  X  B  X
whatever, whatsoever; everything which; each one; each; everything; anything
mavis
V     6 2 PRES  ACTIVE  IND  2 S X 0 0          X        X  X  X  B  X
prefer
cette
V    3 1 PRES ACTIVE IMP  2 P TRANS    0 0      X        X  X  X  B  O
give/bring here!/hand over, come (now/here); tell/show us, out with it! behold!
</TT></PRE>
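<P>
The record layout shown above can be checked before a new entry is added.
The following Python sketch is not part of WORDS.  It assumes each entry
occupies exactly three lines (word, form line, meaning), that the form line
ends with two zeros, an X, and five dictionary codes as in the examples, and
that the five codes appear in the order AGE, AREA, GEO, FREQ, SOURCE.  The
function name parse_uniques and the field names are invented for this example.

<PRE>
# Hypothetical reader for UNIQUES.LAT-style text; the field names are
# illustrative and are not taken from the Ada source.

CODE_NAMES = ["AGE", "AREA", "GEO", "FREQ", "SOURCE"]  # assumed order


def parse_uniques(text):
    """Split three-line records (word, form line, meaning) into dicts."""
    lines = [line.rstrip() for line in text.splitlines() if line.strip()]
    if len(lines) % 3 != 0:
        raise ValueError("expected three lines per entry")
    entries = []
    for start in range(0, len(lines), 3):
        word, form_line, meaning = lines[start:start + 3]
        fields = form_line.split()
        # Reading from the right: five dictionary codes, one X, two zeros.
        codes = fields[-5:]
        filler = fields[-8:-5]
        if filler != ["0", "0", "X"]:
            raise ValueError("missing pro forma fields for " + word)
        entries.append({
            "word": word.strip(),
            "form": fields[:-8],
            "codes": dict(zip(CODE_NAMES, codes)),
            "meaning": meaning.strip(),
        })
    return entries


SAMPLE = """\
requiem
N 3 1 ACC S F T  0 0                            X        X  X  X  B  O
rest (from labor), respite; intermission, pause, break; amusement, hobby;
bobus
N 3 1 DAT P C T  0 0                            X        X  X  X  C  X
ox, bull; cow; cattle (pl.)
"""

for entry in parse_uniques(SAMPLE):
    print(entry["word"], entry["form"], entry["codes"]["FREQ"])
</PRE>

<P>
A check of this kind only confirms that the expected fields are present.
The Ada program remains the authority on the exact form it will accept.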

<P>
<BR>
<BR>
<A NAME="DEVELOPERS AND REHOSTING">

<H3><CENTER>DEVELOPERS AND REHOSTING</CENTER>
</H3></A>


<A NAME="Program source code and data">

<H4>Program source code and data</H4></A>
<P>
The program is written in Ada, and is machine independent.  Ada source
code is available for compiling onto other machines. <BR>
<BR>

<A NAME="Licence">
<H4>Licence</H4></A>
<P>
<B>All parts of the WORDS system, source code and data files, are made freely
available to anyone who wishes to use them, for whatever purpose.</B><BR>
<BR>

<A NAME="Rehosting WORDS">
<H4>Rehosting WORDS</H4></A>
<P>
There is a <A HREF="wordsall.zip"><B>wordsall.zip</B></A>
archive containing all the Ada source files needed to port WORDS, together with the
support programs and data that generate the necessary dictionaries and
inflections for re-hosting the WORDS Latin dictionary
parsing/translation system on any machine with an Ada 95 compiler.  (It
can be made to work with Ada 83 also by replacing just the short driver routine.)
<P>
This is a console program (keyboard entry), without fancy Windows GUI, and has
thereby been made system independent.
<P>
wordsall.zip contains the Ada source files for WORDS and complete details for rehosting,
summarized below:
<P>
Ada source files for the WORDS system are:
<PRE>
strings_package.ads
strings_package.adb
latin_file_names.ads
latin_file_names.adb
config.ads
preface.ads
word_parameters.ads
developer_parameters.ads
preface.adb
put_stat.adb
word_parameters.adb
inflections_package.adb
inflections_package.ads
dictionary_package.ads
dictionary_package.adb
addons_package.ads
addons_package.adb
uniques_package.ads
word_support_package.ads
word_support_package.adb
english_support_package.ads
english_support_package.adb
word_package.ads
line_stuff.ads
line_stuff.adb
developer_parameters.adb
tricks_package.ads
word_package.adb
tricks_package.adb
list_package.ads
list_sweep.adb
dictionary_form.adb
search_english.adb
put_example_line.adb
list_package.adb
parse.adb
words.adb
</PRE>

<P>
There are four supporting programs

<PRE>
makedict.adb
makestem.adb
makeinfl.adb
makeefil.adb
</PRE>

<P>
and DOS ASCII data files for them to act upon to produce WORDS data files

<PRE>
DICTLINE.GEN
STEMLIST.GEN
EWDSLIST.GEN
INFLECTS.LAT
</PRE>

<P>
and other WORDS DOS ASCII supporting files

<PRE>
ADDONS.LAT
UNIQUES.LAT
</PRE>

<P>
The process is to download wordsall.zip and unzip it into a
subdirectory.  (If the zip form is unsuitable for your system, I can
provide the files in an uncompressed form.) The long file names are for
compliance with the restrictions of the GNAT system.  They may be renamed,
and I can provide an alternative.  However, the long file names demand an
UNZIP that preserves them, if GNAT is to be used.

<P>For example, in a GNAT
environment (-O3 optimizes if your system supports it):

<PRE>
gnatmake -O3 words
gnatmake makedict
gnatmake makestem
gnatmake makeefil
gnatmake makeinfl
</PRE>

<P>
This produces executables for WORDS, MAKEDICT, MAKESTEM, MAKEEFIL, and MAKEINFL.
Run the latter four against the following inputs, respectively

<PRE>
DICTLINE.GEN
STEMLIST.GEN
EWDSLIST.GEN
INFLECTS.LAT
</PRE>

<P>
(when they ask for DICTIONARY say G) producing

<PRE>
DICTFILE.GEN
STEMFILE.GEN
INDXFILE.GEN
EWDSFILE.GEN
INFLECTS.SEC
</PRE>

<P>
Along with ADDONS.LAT and UNIQUES.LAT, this is the set of data for WORDS.
<P>
The only problem that has appeared on porting so far is that one must be
careful of file names.  Problems sometimes turn up but have been easily
rectified by inspection.  All of my systems are case-independent on file
names.  If one is running in a case-dependent system (UNIX), this is a
point to check.  Note that the data files are capitalized, source files
are not.
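<P>
On a case-dependent system the file name check can be scripted.  The
following Python sketch is not part of the WORDS distribution.  It looks in
one directory for the data files named in this section and reports any that
are missing or that differ from the expected name only in letter case.  The
function name check_data_files is invented for this example.

<PRE>
import os

# File names as listed in this section; the function itself is a
# hypothetical helper, not part of the WORDS distribution.
DATA_FILES = [
    "DICTFILE.GEN", "STEMFILE.GEN", "INDXFILE.GEN", "EWDSFILE.GEN",
    "INFLECTS.SEC", "ADDONS.LAT", "UNIQUES.LAT",
]


def check_data_files(directory):
    """Return (missing, miscased) lists for the WORDS data files."""
    present = os.listdir(directory)
    by_upper = {name.upper(): name for name in present}
    missing, miscased = [], []
    for wanted in DATA_FILES:
        if wanted in present:
            continue
        if wanted in by_upper:
            # Same name apart from letter case, e.g. dictfile.gen
            miscased.append((by_upper[wanted], wanted))
        else:
            missing.append(wanted)
    return missing, miscased


if __name__ == "__main__":
    missing, miscased = check_data_files(".")
    for found, wanted in miscased:
        print("rename", found, "to", wanted)
    for name in missing:
        print("missing", name)
</PRE>

<P>
Run from the directory holding the WORDS executable, this lists the files to
rename or regenerate.  It makes no changes itself.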


<P>The source is in Ada and therefore very readable, which is not claimed for the
logic, which is my fault, not Ada's.  The source and data are freely available
for anyone to use for any purpose.  They may be converted to other
languages, used in pieces, or modified in any way without further permission
or notification.

<P>There is one oddity that the reader may remark upon.  The code is loaded
with PUT/print statements which are now commented out.   These were used at
some time for debug purposes and were just left in.  They (mostly) are left
justified and may fairly easily be removed for a cleaner presentation.
Further there are many blocks of code which during development have been
moved or removed, but have in their previous place been left commented.
This is also messy.  I cannot really justify not having fixed this, but there it is.
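<P>
For a reader who wants the cleaner presentation, the left-justified debug
lines can be filtered out mechanically.  The following Python sketch is not
part of WORDS and rests on an assumption: a debug leftover is taken to be a
comment that starts in column one and whose text begins with PUT, PUT_LINE,
or NEW_LINE (optionally prefixed by TEXT_IO.).  Indented comments and
commented-out blocks of other code are left alone.  The function name
strip_debug_comments and the sample procedure are invented for this example.

<PRE>
import re

# Assumption: a debug leftover is a comment that starts in column one and
# whose text begins with a PUT-style call. Indented comments are kept.
DEBUG_COMMENT = re.compile(r"^--\s*(TEXT_IO\.)?(PUT|PUT_LINE|NEW_LINE)\b",
                           re.IGNORECASE)


def strip_debug_comments(source):
    """Return Ada source text without left-justified commented-out PUTs."""
    kept = [line for line in source.splitlines()
            if not DEBUG_COMMENT.match(line)]
    return "\n".join(kept) + "\n"


SAMPLE = """\
procedure DEMO is
begin
--PUT_LINE("entering DEMO");
   -- Explain the next step
   STEP;
--TEXT_IO.PUT("done");
end DEMO;
"""

print(strip_debug_comments(SAMPLE), end="")
</PRE>

<P>
It is prudent to work on a copy of the source and to compare the result with
the original before compiling, since the pattern is only a guess at what the
leftover lines look like.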


<A NAME="Feedback">
<H4>Feedback</H4></A>
<P>
Feedback is invited.  If there is a problem in installing or operating, in
the results or their display, or if your favorite word is omitted from the
dictionary, please let me know.
<P>
All comments are appreciated.  Check back for new version releases at<BR>
<BR>
<A HREF="http://www.erols.com/whitaker/words.htm">
http://www.erols.com/whitaker/words.htm</A>
<P>
Contact e-mail <A HREF="mailto:whitaker@erols.com"> <B>whitaker@erols.com</B></A>,
<P>
or William Whitaker, PO Box 51225 Midland TX 79710 USA.  <BR>


</BODY>
</HTML>