Brave GNU World standard questions:
-----------------------------------

INTRODUCTION:

 These are the standard questions for the Brave GNU World column that
 can be found online at http://brave-gnu-world.org. 

 It has become customary to fill out these questions and mail them
 back to <column@gnu.org> in order to give me a good
 feeling/impression about a project. This also applies to
 non-technical projects, btw. 

 Please feel free to be as verbose as you want as more information
 will allow me to gain a more complete picture which I can then use to
 write a more complete feature.


QUESTIONS:

**********************************************************************
* What is it?

GNU Aspell is a Free and Open Source spell checker designed to
eventually replace Ispell. It can either be used as a library or as an
independent spell checker. Its main feature is that it does a much
better job of coming up with possible suggestions than just about any
other spell checker out there for the English language, including
Ispell and Microsoft Word. It also has many other technical
enhancements over Ispell such as using shared memory for dictionaries
and intelligently handling personal dictionaries when more than one
Aspell process is open at once.

**********************************************************************
* Who would use it?

Anyone who needs to spell check their writings.  (Who doesn't?)

**********************************************************************
* Why would they use it instead of similar projects?
* Special features/strengths?

From the manual:

1.1 Comparison to other spell checker engines

+------------------------------------------------------------------------+
|                        | Aspell | Ispell |  Netscape  | Microsoft Word |
|                        |        |        |    4.0     |       97       |
|------------------------+--------+--------+------------+----------------|
|            Open Source | x      |   x    |            |                |
|------------------------+--------+--------+------------+----------------|
|             Suggestion | 88-98  |   54   |   55-70?   |       71       |
|           Intelligence |        |        |            |                |
|------------------------+--------+--------+------------+----------------|
|       Personal part of | x      |   x    |     x      |                |
|            Suggestions |        |        |            |                |
|------------------------+--------+--------+------------+----------------|
| Alternate Dictionaries | x      |   x    |     ?      |       ?        |
|------------------------+--------+--------+------------+----------------|
|  International Support | x      |   x    |     ?      |       ?        |
+------------------------------------------------------------------------+

The suggestion Intelligence is based on a small test kernel of misspelled/
correct word pairs. Go to http://aspell.net/test for more info and how you
can help contribute to the test kernel. The current scores for Aspell are
88 in fast mode, 93 in normal mode, and 98 in bad spellers mode; see
section 4.4.4 for more information about the various suggestion modes.

If you have any other information you would like to add to this chart
please contact me at kevina at users sourceforge net.

1.1.1 Comparison to Ispell

1.1.1.1 Features that only Aspell has

  * Does a much better job with coming up with suggestions than Ispell
    does or for that matter any other spell checker I have seen. If you
    know a spell checker that does a better job please let me know.
  * Can learn from users' misspellings.
  * Is an actual library that other programs can link to instead of
    having to use it through a pipe.
  * Is multiprocess intelligent. When a personal dictionary (or
    replacement list) is saved it will now first update the list against
    the dictionary on disk in case another process modified it.
  * Can share the memory used in the main word list between processes.
  * Support for detachable dictionaries so that more than one aspell class
    can use the same dictionary.
  * Support for multiple personal dictionaries as well as support for
    special auxiliary dictionaries.
  * Better support for run-together words.
  * Ability to use multiple dictionaries by simply specifying it on the
    command line or in the configuration files.
  * A better, more complete word list for the English language. Word
    lists are provided for American, British, and Canadian
    spelling. Special care has been taken to only include one spelling
    for each word in any particular word list. The word list included
    in Ispell by contrast only includes support for American and
    British and also tends to include multiple spellings for a word
    which can mask some spelling errors.

1.1.1.2 Things that, currently, only Ispell has

  * Support for affix compression (However this should change soon once
    Kevin Hendricks' affix code is integrated)
  * Lower memory footprint
  * Perhaps better support for spell checking (La)TeX files.
  * Support for spell checking Nroff files.


**********************************************************************
* (Programming) language used in this project?

Standard C++.  Requires GCC 2.95 or better.  Should work with other
compilers but I only work with GCC.

**********************************************************************
* Special problems?

**********************************************************************
* Who is working on it?

So far just me :(

**********************************************************************
* Why did the project start?

I started this project because I needed a better spell checker for
GNU/Linux.  Ispell's suggestion strategy was far from adequate,
especially since I was used to using Microsoft Word.

I don't really remember the details but somehow I stumbled across
Lawrence Philips' Metaphone Algorithm and it worked great.  Thus the
first version of Aspell (then named Kspell) was born.

*********************************************************************
* History of the project?

Significant Dates:
  1998-09 Kspell 0.10
  1998-10 Aspell 0.20
  2000-02 Aspell 0.29
  2000-04 Aspell 0.30
  2000-07 Aspell 0.32
  2001-01 Aspell 0.33
  2002-08 GNU Aspell 0.50

Sometime around September 1998 I released the first version of Aspell;
at the time, however, it was called Kspell.  It was able to check
documents, but the interface was horrible.  It could also act as a
replacement for Ispell when used in pipe mode "aspell -a".  The main
feature of Kspell was the fact that it could come up with suggestions
better than any other spell checker I knew of including Microsoft
Word, which I originally thought was pretty good.  In version .20 I
changed the name to Aspell to avoid confusion with the KDE spell
checker, which was also called Kspell.

I then continued to improve the suggestion strategy up until around
Aspell .29, where the score for Aspell improved from 82 to 93 based on
a small test kernel of misspelled words. In Aspell .29 I added a new
suggestion mode, "fast", which was several times faster than Aspell's
normal suggestion mode but still got decent results, with a score of
87.  In Aspell .33 I added support for giving higher priority to
suggestions such as "the" for "teh", where the misspelling is likely
to be due to a typo.  Unfortunately, this increased the time it took
for Aspell to come up with suggestions by several magnitudes. See the
relevant information section at the end of this document for more info
on Aspell's suggestion strategy.

In Aspell .29 I added rudimentary filter support which will
automatically skip over stuff that should not be spell checked such as
URLs and TeX and HTML formatting commands.  The filter interface,
however, was a complete mess.  It was such a mess that I refused to
even touch it after a while.  In Aspell 0.50 I totally rewrote the
interface to make it much simpler to add new filters to Aspell.

At the same time as Aspell .30 I released the first version of Pspell.
The goal of Pspell was to provide a generic interface to spell
checker libraries installed on the system.  However, in actuality, it
turned out just to make everyone's life more difficult.  So in Aspell
0.50 I decided to merge Pspell into Aspell.

I completely redid the word list for Aspell 0.32.  The resulting word
list in my view was far superior to the original word list I used
which came from Ispell.  The word list involved the careful combination
of a large number of other free word lists using a very complex
makefile, a lot of "sort"s and "comm"s, and a few helper utilities.
Great care was taken when creating this new word list so that only one
spelling for any particular word was included depending on which
spelling region was chosen.  The new word list provided support for
American, British, and Canadian spelling. The result of all this work
can be found at http://wordlist.sourceforge.net/.

A nice new curses interface was provided with Aspell .33 to replace
the dumb terminal interface everyone was bitching about.

**********************************************************************
* Plans for the close and distant future?

I plan to release the first version of GNU Aspell by the end of
August.  The main reasons for doing this release are to officially
retire the old Aspell/Pspell combination, to get GNU Aspell into the
next version of various GNU/Linux and other free operating systems
(such as FreeBSD), and to generate publicity for GNU Aspell by
creating a press release for GNU Aspell 0.50.  With a few exceptions
it should be at least as functional and stable as the old
Aspell/Pspell combination.  The main exception is that there will be
no ispell module for GNU Aspell since Kevin Hendricks'
yet-to-be-integrated affix code should make the need for Ispell
obsolete.  From an end user's standpoint GNU Aspell also has a few
minor enhancements over the old Aspell/Pspell combination such as the
ability to use Ispell key bindings when checking documents.

I really don't know what direction GNU Aspell will take after the
release of Aspell 0.50.  I am really hoping that some other people
will pitch in and help.  Someday Aspell will be ready for the big 1.0.
But I have no idea when that is.  Maybe never, unless I get some
assistance. (See next question)

**********************************************************************
* Do you need help? If so: of what kind?

Most definitely.  From the manual:

2.4 Helping Out

I have very little time to work on Aspell so I desperately need other
people to help with the development of Aspell. There are a lot of things
that need to be done before I consider Aspell complete. See section B.1
for a list of them. I would really appreciate some help with doing them.
If you are interested in helping with one of them please let me know.

I am also looking for someone to eventually take over the maintenance and
development of Aspell. If you are interested please contact me
directly.

B.1 Things that need to be done

These items need to be done before I consider Aspell finished.
Unfortunately, I do not have the time to do them all, so if you are
interested in helping me with one of these tasks please email me. Good C++
skills are needed for most of the tasks that involve coding.

  * Convert manual from LyX/LaTeX to Texinfo.
  * Clean up copyright notices and bring the Aspell package up to GNU
    Standards.
  * Add gettext support to Aspell.
  * Allow Aspell to check documents which are in UTF-8. The main thing to
    be done is to get Aspell to display UTF-8 correctly. This will involve
    being able to get the length of UTF-8 strings and display UTF-8 using
    the curses library.
  * Make Aspell thread safe. Even though Aspell itself is not
    multi-threaded I would like it to be thread safe so that it can be
    used by multi-threaded programs. There are several areas of Aspell
    that are potentially thread unsafe (such as accessing a global pool)
    and several classes which have the potential of being used by
    more than one thread (such as the personal dictionary).
  * Allow filters to be loaded at run-time. At the moment all filters must
    be compiled in.
  * Integrate Kevin Hendricks' affix compression code into Aspell. His code
    is already in use in OpenOffice as part of the lingucomponent
    component. More information can be found at
    http://whiteboard.openoffice.org/lingucomponent/dictionary.html. The
    latest version of his code is also available there.
  * Create a C++ interface for Aspell, possibly on top of the C one.
  * Enhance ispell.el so that it will work better with the new Aspell.

FYI:

Affix compression is the act of combining several words with a common
base word into one word which consists of the base word and a list of
affixes to apply.  (Affix is the generic term for prefix, suffix or
infix).  For example "alarm alarms alarmed alarming" will become
"alarm/SDG" where SDG stands for the suffixes of alarm.  This can make
a huge difference in space for languages which have extensive
affixation such as German.
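
As a rough illustration of the idea (not Ispell's or Kevin Hendricks'
actual code), the sketch below expands and compresses the example above
in Python.  The flag table is an assumption made for the example: it
treats S, D and G as plain "s", "ed" and "ing" suffixes, whereas real
affix files attach conditions to each rule.

```python
# Hypothetical flag table for this example only.
SUFFIXES = {"S": "s", "D": "ed", "G": "ing"}


def expand(entry):
    """Expand a 'base/FLAGS' entry into the list of words it stands for."""
    base, _, flags = entry.partition("/")
    return [base] + [base + SUFFIXES[flag] for flag in flags]


def compress(words, base):
    """Fold the words derived from base into a single 'base/FLAGS' entry."""
    flags = "".join(flag for flag, suffix in SUFFIXES.items()
                    if base + suffix in words)
    return base + "/" + flags if flags else base


print(expand("alarm/SDG"))
print(compress({"alarm", "alarms", "alarmed", "alarming"}, "alarm"))
```

Four dictionary entries collapse into one here, which is where the space
saving comes from.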

**********************************************************************
* Interesting/fun stories that might juice up the story?

Sorry, none that I can think of :|.

**********************************************************************
* Website/FTP addresses?

http://aspell.net

**********************************************************************
* License?!

LGPL.  Naturally RMS wasn't happy with it.  After several emails here
is what he finally agreed to (in late July, 2002)

  The LGPL gives everyone permission to release a GPL-covered version.
  In fact, we could do this today.  If everyone else can do this, it
  isn't fair to say that the FSF alone can never do this.  At some point
  we have to have as much rights as everyone else.

  So I'm proposing that the FSF will agree, for a year, to respect
  your preference about the license we use, but after that we would
  no longer be bound.  We could still talk about it with you but would
  not have to follow your wishes.

  If you are working actively on the program, that provides a reason to
  pay attention to your views about the license for it.  If you continue
  working on it (which would be a good thing), then that reason
  continues; but if you really do stop, eventually that reason fades
  away, and we should be free to do whatever the license allows and our
  principles support.

One of the main reasons I want it to be LGPL is so that projects like
OpenOffice and Mozilla will use it.  If it were GPL they would only be
allowed to use it in the "Free" version and not the commercial
version.  This means that it would probably not be worth their effort
to use GNU Aspell.

I also want Aspell to eventually become the de facto spell checker
for GNU/Linux and other free operating systems.  The only way I see
this happening is if it is under a permissive enough license to allow
non-free programs to use it as a library, which is why I chose to use
the LGPL rather than the GPL.  Although non-free programs can
technically use Aspell through its pipe interface (in the same way
they can use Ispell) this is less than ideal in my view.

Also, since Aspell is already released under the LGPL, I fear that
changing it to GPL may cause an undesirable fork of Aspell: One under
the GPL by the FSF and another under the LGPL by someone else.

**********************************************************************
* Standard documents to read in this context?

None that I know of.

However, it may be worthwhile to look at Ispell if you are not already
familiar with it.  Aspell gets many of its ideas from Ispell and my
eventual goal is to make the need for Ispell obsolete.

**********************************************************************
* Anything you would like to see mentioned?

In case you're interested, from the manual:

8. How Aspell Works

The magic behind my spell checker comes from merging Lawrence Philips'
excellent Metaphone algorithm and Ispell's near miss strategy which is
inserting a space or hyphen, interchanging two adjacent letters, changing
one letter, deleting a letter, or adding a letter.
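
A minimal Python sketch of the near miss half of that idea, assuming a
plain set of words as the dictionary and a lowercase ASCII alphabet.  The
function names are made up for this example and are not taken from Ispell
or Aspell.

```python
import string


def near_misses(word, alphabet=string.ascii_lowercase):
    """Return every string that is one near miss edit away from word."""
    results = set()
    for i in range(len(word) + 1):
        left, right = word[:i], word[i:]
        if left and right:
            results.add(left + " " + right)  # insert a space
            results.add(left + "-" + right)  # insert a hyphen
        if len(right) > 1:
            # interchange two adjacent letters
            results.add(left + right[1] + right[0] + right[2:])
        if right:
            results.add(left + right[1:])  # delete a letter
            for letter in alphabet:
                results.add(left + letter + right[1:])  # change one letter
        for letter in alphabet:
            results.add(left + letter + right)  # add a letter
    results.discard(word)
    return results


def suggest(word, dictionary):
    """Keep the near misses whose parts are all in the dictionary."""
    found = []
    for candidate in near_misses(word):
        parts = candidate.replace("-", " ").split()
        if parts and all(part in dictionary for part in parts):
            found.append(candidate)
    return sorted(found)


words = {"the", "ten", "tea", "spell", "checker"}
print(suggest("teh", words))
print(suggest("spellchecker", words))
```

On its own this strategy only finds words that are a single edit away,
which is why it is combined with the soundslike lookup described next.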

The process goes something like this.

 1. Convert the misspelled word to its soundslike equivalent (its
    Metaphone for English words).
 2. Find all words that have a soundslike within one or two edit distances
    from the original word's soundslike. The edit distance is the total
    number of deletions, insertions, exchanges, or adjacent swaps needed
    to make one string equivalent to the other. When set to only look for
    soundslikes within one edit distance it tries all possible soundslike
    combinations and checks if each one is in the dictionary. When set to
    find all soundslikes within two edit distances it scans through the
    entire dictionary and quickly scores each soundslike. The scoring is
    quick because it will give up if the two soundslikes are more than two
    edit distances apart.
 3. Find misspelled words that have a correctly spelled replacement by the
    same criteria as steps 1 and 2. That is, the misspelled word in
    the word pair (such as teh -> the) would appear in the suggestions
    list as if it were a correct spelling.
 4. Score the result list and return the words with the lowest score. The
    score is roughly the weighted average of the weighted edit distance of
    the word to the misspelled word and of the soundslike equivalents of
    the two words. The weighted edit distance is like the edit distance
    except that the various edits have weights attached to them.
 5. Replace the misspelled words that have correctly spelled replacements
    with their replacements and remove any duplicates that might arise
    because of this.

Please note that the soundslike equivalent is a rough approximation of how
the word sounds. It is not the phoneme of the word by any means. For more
details about exactly how each step is performed please see the file 
suggest.cc. For more information on the metaphone algorithm please see the
data file english_phonet.dat.
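
The edit distance described in step 2 can be sketched as follows.  This is
an illustrative Python version of a restricted edit distance with an early
cutoff, not the code in suggest.cc; the function names and the sample
soundslike strings are invented for the example.

```python
def edit_distance(a, b, limit=2):
    """Count deletions, insertions, exchanges and adjacent swaps.

    Returns None as soon as the distance is known to exceed limit.
    """
    if abs(len(a) - len(b)) > limit:
        return None
    prev2 = None
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost)  # exchange
            if (i > 1 and j > 1 and a[i - 1] == b[j - 2]
                    and a[i - 2] == b[j - 1]):
                cur[j] = min(cur[j], prev2[j - 2] + 1)  # adjacent swap
        if min(cur) > limit:
            return None  # give up: no later row can get back under limit
        prev2, prev = prev, cur
    return prev[len(b)] if prev[len(b)] <= limit else None


def scan(soundslike, dictionary_soundslikes, limit=2):
    """Score every dictionary soundslike, keeping those within limit."""
    hits = {}
    for candidate in dictionary_soundslikes:
        distance = edit_distance(soundslike, candidate, limit)
        if distance is not None:
            hits[candidate] = distance
    return hits


print(edit_distance("teh", "the"))
print(scan("ALRM", ["ALRM", "ALRMT", "ARLM", "KMPTR"]))
```

The early return is what keeps a full dictionary scan cheap: most entries
are rejected after only a few rows of the table have been filled in.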


**********************************************************************
* Answer to a question I forgot?

None that I can think of.

**********************************************************************
**********************************************************************
* Relevant Information

**********************************************************************
* Notes on the Different Suggestion Modes

[From the manual]

In order to understand what these suggestion modes do, a basic understanding
of how Aspell works is required. See section 8 for that. The suggestion modes
are as follows.

ultra
    This method will use the fastest method available to come up with decent
    suggestions. This currently means that it will look for soundslikes
    within one edit distance apart without doing any typo analysis. It is
    slower than Ispell by a factor of 1.5 to 2 when a single word list is
    used. Its speed is only slightly affected by the size of the word list,
    if at all, but it is strongly affected by the number of word lists used.
    In this mode Aspell gets about 87% of the words from my small test
    kernel of misspelled words. (Go to http://aspell.sourceforge.net/test
    for more info on the test kernel as well as comparisons of this version
    of Aspell with previous versions and other spell checkers.)
fast
    This method is like ultra except that it also performs typo analysis
    unless it is turned off by setting the keyboard to none. The typo
    analysis brings words which are likely to be due to typos to the
    beginning of the list but slows things down by a factor of about two.
    This mode should get around the same number of words that the ultra
    method does.
normal
    This method looks for soundslikes within two edit distances apart and
    performs typo-analysis unless it is turned off. It is around 10 times
    slower than fast mode with the English word list but returns better
    suggestions. Its speed is directly proportional to the size of the word
    list. This mode gets 93% of the words.
bad-spellers
    This method also looks for soundslikes within two edit distances apart
    but is more tailored for the bad speller whereas fast or normal are more
    tailored to strike a good balance between typos and true misspellings. This
    mode never performs typo-analysis and returns a huge number of words for
    the really bad spellers who can't seem to get the spelling anything close
    to what it should be. If the misspelled word looks anything like the
    correct spelling it is bound to be found somewhere on the list of 100 or
    more suggestions. This mode gets 98% of the words.
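
The differences between the four modes can be summarised in a small table.
The Python sketch below only restates what the descriptions above say; the
type and field names are invented for the example and do not correspond to
Aspell's configuration keys.

```python
from typing import NamedTuple


class SuggestionMode(NamedTuple):
    max_edit_distance: int  # how far apart two soundslikes may be
    typo_analysis: bool     # whether typo analysis is performed by default


MODES = {
    "ultra": SuggestionMode(1, False),
    "fast": SuggestionMode(1, True),
    "normal": SuggestionMode(2, True),
    "bad-spellers": SuggestionMode(2, False),
}


def lookup_strategy(mode_name):
    """Describe how soundslikes are searched in the given mode (section 8)."""
    if MODES[mode_name].max_edit_distance == 1:
        return "try every soundslike one edit away"
    return "scan the entire dictionary"


for name, mode in MODES.items():
    print(name, mode.max_edit_distance, mode.typo_analysis,
          lookup_strategy(name))
```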


**********************************************************************
* http://aspell.net/test/

Spell Checker Test Kernel Results

-----------------------------------------------------------------------------

Scores

+-------------------------------------------------------------------------+
|             |       | Total | Total |     |     | 1 - | 1 - | 1 - |     |
|             | Score |  Not  | Found |First|1 - 5| 10  | 25  | 50  | Any |
|             |       | Found |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell .20   |   82.8|     88|    424| 58.4| 78.1| 81.6| 82.8| 82.8| 82.8|
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell .21   |   80.9|     98|    414| 58.0| 76.6| 79.7| 80.7| 80.9| 80.9|
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell .22 - |   85.7|     73|    439| 55.5| 77.0| 81.1| 85.0| 85.5| 85.7|
|.24          |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell .25 - |   83.2|     86|    426| 56.2| 78.1| 80.9| 83.2| 83.2| 83.2|
|.28          |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell .29.0 |   93.9|     31|    481| 59.0| 85.5| 89.8| 93.6| 93.8| 93.9|
|/ Normal     |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell .29.1 |       |       |       |     |     |     |     |     |     |
|- .30 /      |   92.8|     37|    475| 59.0| 85.5| 89.8| 92.6| 92.8| 92.8|
|Normal       |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell       |       |       |       |     |     |     |     |     |     |
|Dmetaph /    |   93.8|     32|    480| 59.2| 85.9| 91.4| 93.6| 93.8| 93.8|
|Normal       |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell .29 - |   87.3|     65|    447| 59.0| 81.8| 85.0| 87.1| 87.3| 87.3|
|.30 / Fast   |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell       |       |       |       |     |     |     |     |     |     |
|Dmetaph /    |   88.1|     61|    451| 58.6| 82.6| 86.5| 87.9| 88.1| 88.1|
|Fast         |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell .29   |       |       |       |     |     |     |     |     |     |
|-.30 / Bad   |   98.0|     10|    502| 59.6| 82.8| 87.3| 94.5| 96.5| 98.0|
|Spellers     |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Aspell       |       |       |       |     |     |     |     |     |     |
|Dmetaph / Bad|   98.2|      9|    503| 59.8| 83.0| 89.5| 95.7| 97.1| 98.2|
|Spellers     |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Ispell 3.1.20|   54.7|    232|    280| 39.3| 52.1| 54.1| 54.7| 54.7| 54.7|
|w/ -S option |       |       |       |     |     |     |     |     |     |
|-------------+-------+-------+-------+-----+-----+-----+-----+-----+-----|
|Word 97      |   70.7|    150|    362| 57.2| 69.1| 70.7| 70.7| 70.7| 70.7|
+-------------------------------------------------------------------------+
Note: Only data for which the correct spelling was found in all three
dictionaries was counted.

The Score is: (Total Found)/(Total)*100
First is: (Total Found First On List)/(Total)*100, 1-5 is: (Total Found 1st -
5th)/(Total)*100, etc...
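
For example, the Score column can be reproduced from the two Total columns.
The short Python check below assumes that Total is simply Total Found plus
Total Not Found (512 for every row of the table); the function name is made
up for the example.

```python
def score(found, not_found):
    """Percentage of test words whose correct spelling was suggested."""
    return found / (found + not_found) * 100


# (Total Found, Total Not Found) copied from three rows of the table.
rows = {
    "Aspell .20": (424, 88),
    "Ispell 3.1.20": (280, 232),
    "Word 97": (362, 150),
}

for name, (found, not_found) in rows.items():
    print(f"{name}: {score(found, not_found):.1f}")
# Aspell .20: 82.8
# Ispell 3.1.20: 54.7
# Word 97: 70.7
```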

Dmetaph is Aspell .30.1 which uses the Double Metaphone algorithm. Please see
my Metaphone page for more information. To see which words Aspell with the
Double Metaphone algorithm got and Aspell .30.1 did not, and vice versa,
please see this file.

[scores]
Graph created with Ploticus

-----------------------------------------------------------------------------

Number of Results Returned

+---------------------------------------------------------------------------+
|                    |  Min  |  5%  |  25%  |  50%  |  75%  |  95%  |  Max  |
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell .20          |      0|     3|      3|      5|      8|     16|     38|
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell .21          |      0|     3|      3|      5|     11|     26|     76|
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell .22 - .24    |      0|     3|      3|      5|     11|     23|     60|
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell .25 - .28    |      0|     2|      3|      5|      8|     18|     32|
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell .29.0 /      |      3|     5|     10|     17|     31|     82|    100|
|Normal              |       |      |       |       |       |       |       |
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell .29.1 - .30 /|      3|     4|      8|     12|     20|     36|    100|
|Normal              |       |      |       |       |       |       |       |
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell Dmetaph /    |      3|     4|      8|     13|     19|     37|     77|
|Normal              |       |      |       |       |       |       |       |
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell .29 - .30 /  |      0|     2|      5|     10|     17|     33|     96|
|Fast                |       |      |       |       |       |       |       |
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell Dmetaph /    |      0|     3|      5|      9|     17|     33|     77|
|Fast                |       |      |       |       |       |       |       |
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell .29 -.30 /   |      3|    11|     29|     74|    173|    470|   1000|
|Bad Spellers        |       |      |       |       |       |       |       |
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Aspell Dmetaph / Bad|      4|     9|     31|     75|    168|    414|   1000|
|Spellers            |       |      |       |       |       |       |       |
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Ispell 3.1.20 w/ -S |      0|     0|      0|      1|      3|      9|     25|
|option              |       |      |       |       |       |       |       |
|--------------------+-------+------+-------+-------+-------+-------+-------|
|Word 97             |      0|     0|      1|      2|      4|     10|     20|
+---------------------------------------------------------------------------+
-----------------------------------------------------------------------------

The Raw Data

  * Test Data
  * Results, Scripts and Raw Data




**********************************************************************
* http://wordlist.sourceforge.net/

Kevin's Word Lists Page

This page contains links to various Word Lists and related information.

--------------------------------------------------------------------------

News

June 7, 2001 A new version of the official 12Dicts package is now
available. It differs from previous versions primarily by replacement of
one dictionary with another dictionary from the same publisher.
Additionally, there have been many error corrections, especially to the
2of12inf list.

May 20, 2001 Finally updated the webpage to reflect the fact that a new
version of SCOWL, and AGID was released in the beginning of April.

Added a link to an online version of a Longman dictionary.

Older News
--------------------------------------------------------------------------

Packaged by Me

"tar.gz" files have Unix EOL characters and "zip" files have DOS/Windows
EOL characters. Older releases can be found here.

SCOWL

SCOWL (Spell Checker Oriented Word Lists) is a collection of word lists
split up in various sizes, and other categories, intended to be suitable
for use in spell checkers. However, I am sure it will have numerous other
uses as well.

readme, tar.gz, zip (rev 4a, 2.2 MB) (release notes & changelog)

AGID

AGID is an Automatically Generated Inflection Database from an insanely
large word list. My goal is for the non-questionable entries to be 100%
accurate.

readme, tar.gz, zip (rev 3a, 0.9 MB) (release notes & changelog)

VarCon

VarCon (Variant Conversion Info) contains tables to convert between
American, British, and Canadian spellings and vocabulary as well as
a table listing the equivalent forms of other variants.

readme, tar.gz, zip (rev 2, 0.1 MB) (release notes & changelog)

Part Of Speech Database

The Part Of Speech Database is a combination of "Moby (tm) Part-of-Speech
II" and the WordNet database.

readme, tar.gz, zip (1.2 MB)

Unofficial Jargon File Word Lists

The Unofficial Jargon File Word Lists is a collection of useful Word Lists
created from the Jargon file.

readme, tar.gz

Ispell English Word Lists

This package contains the contents of the Ispell (ver 3.1.20) word list
after being expanded from the affix-compressed form used by Ispell.

readme, tar.gz, zip (0.4 MB)

Unofficial Alternate 12 Dicts Package

The Unofficial Alternate 12 Dicts Package contains almost all the
information in the official 12Dicts package but in a different format as
well as a good deal of additional information. However it is not meant as
a replacement for the official 12Dicts package. It simply offers the
information in a different way.

readme, readme-infl, tar.gz, zip (rev 2b, 0.6 MB) (release notes &
changelog)

--------------------------------------------------------------------------

Official 12Dicts Package

The methodology of the project is to record and correlate the words listed
in a number of small dictionaries. The number of dictionaries so recorded
is now 12, comprising 8 ESL (English as a Second Language) dictionaries
and 4 "desk dictionaries". The dictionaries chosen vary widely by
publisher, by style, by completeness and by depth. In this version of
12dicts, all of them are dictionaries of American English (three from
British publishers). The smallest of them contains about 20,000 entries,
and the largest 46,000. (All totalled, there are about 75,000 entries,
many of which appear in only a single dictionary.) All but two of them
were published in the last six years.

12dicts-3.0.zip (ver 3.0, 473 KB)

--------------------------------------------------------------------------

Links to other General Word Lists

  * ENABLE, YAWL
  * The UK Advanced Cryptics Dictionary (UKACD), plus others
  * UK English wordlist with frequency classification, plus links to many
    other lists with frequency info.
  * BNC Corpus Frequency Info
  * Moby, Alternate Site
  * english.words DEC-SRC-92-04-05, Alternate Site
  * Large Collection of Word Lists
  * National Puzzlers' League -- Word Lists

Links to Online Dictionaries

  * Merriam-Webster
  * American Heritage
  * Random House
  * Webster's New World (Search Currently Broken)
  * Encarta
  * Wordsmyth
  * WordNet
  * Cambridge (ESL)
  * Longman (ESL)
  * Macquarie (Australian)
  * 1913 Webster's: ARTFL, Bibliomania, dict.org, Ralph
  * OneLook (Meta)
  * More: YourDictionary, Randy's Links

Links to Slang Word Lists and Dictionaries

  * American
      + Common American Slang
      + American-Australian Slang Dictionary
      + ESL Slang Page
      + Canadian/US Slang
  * British
      + A dictionary of slang
      + British Isles Slang
      + British Slang Glossary
  * Australian
      + Australian Slang
  * Cajun
      + Cajun Slang
  * Computer/Internet
      + Computer/Internet Slang
  * Miscellaneous
      + Miscellaneous Slang
  * The Alternative English Dictionary

Links to Specialty Word Lists and Dictionaries

  * Jargon File (Also known as The New Hacker's Dictionary)
  * The Free On-Line Dictionary of Computing
  * U.S. Census Bureau Names Files with frequency info.

Other Links

  * Aspell
  * The American-British British-American Dictionary

--------------------------------------------------------------------------

This site is maintained by Kevin Atkinson (kevina at users sourceforge
net). Be sure to check out my home page.








NOTE:

 Everything you want me to read should be sent to me by
 mail because very often I will take my mail with me and 
 read it where I don't have access to the net.