Dictionaries for International Ispell
Many people have contributed dictionaries and affix files to allow
ispell to be used with their native languages.
These are listed below. Unlike the languages/Where file
in the distribution, this list includes only dictionaries that are
actually intended for use with ispell.
Some of the affix files distributed with ispell are identical to or
are based on the affix files at the ftp sites listed below.
Older affix files from some of these sites will require changes to the
defstringtype and altstringtype statements to
specify a deformatter before they will work with ispell 3.1. This is
done by adding a new quoted string, "TeX" or
"nroff", after the first quoted string in the statement.
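As a rough illustration of the change described above, the following sketch inserts a deformatter name after the first quoted string of each defstringtype and altstringtype statement. The function name add_deformatter and the sample statement are assumptions made for this example and are not taken from any particular affix file. The sketch also applies the same deformatter to every statement, which may not suit every file.

```shell
# Sketch only: add a deformatter after the first quoted string of each
# defstringtype/altstringtype statement. "add_deformatter" is a
# hypothetical helper name, not part of ispell.
add_deformatter() {
    # $1: deformatter name to insert, either TeX or nroff
    sed -e 's/^\(defstringtype[[:space:]]*"[^"]*"\)/\1 "'"$1"'"/' \
        -e 's/^\(altstringtype[[:space:]]*"[^"]*"\)/\1 "'"$1"'"/'
}

# Illustrative old-style statement (made up for this example)
printf '%s\n' 'defstringtype "tex" ".tex" ".bib"' | add_deformatter TeX
# prints: defstringtype "tex" "TeX" ".tex" ".bib"
```

Running the helper twice on the same file would insert the string twice, so it should only be applied to affix files that still lack a deformatter.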
SPECIAL NOTE REGARDING "OBSCURE" LANGUAGES: Kevin
Scannell of St. Louis University is interested in
developing word lists for minority languages.
If you don't find a word list below, it is worth visiting his site
to check for the latest status and updates.
Please do not write me asking where to find dictionaries that are not
listed on this page. Everything I know is posted here. On the other
hand, if you find an error or broken link in the page, or if you know
of a dictionary that is not listed, please do send mail to
geoff@cs.hmc.edu so that I can
update this page.
Following is a canonical list of everything I know about:
Afrikaans Dictionaries and Affix Files
Albanian Dictionaries and Affix Files
Belarusian Dictionaries and Affix Files
Unfortunately, all known Belarusian dictionaries have disappeared
from the Internet. Please let me know if you are aware of a
working Belarusian dictionary for ispell.
Bulgarian Dictionaries and Affix Files
Catalan Dictionaries and Affix Files
Chichewa Dictionaries and Affix Files
Croatian Dictionaries and Affix Files
- A Croatian dictionary is discussed on a
Web page
from the University of Split; apparently you have to
e-mail someone to get a copy.
- A Croatian
dictionary and affix file with over 200 000 words, courtesy of
Denis Lackovic. Search the Web page for "ISPELL" to find
the links. A Hungarian dictionary is available on the
same page.
Czech Dictionaries and Affix Files
- An updated version of the Cermak dictionary and affix file,
submitted by Petr Kolar. This dictionary accepts over 3 times more
word forms than the earlier dictionary. The README file (in Czech)
is very detailed and should tell you all you need to know.
- A dictionary and affix file by Tomas Cermak (this page is in Czech).
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Danish Dictionaries and Affix Files
Dutch Dictionaries and Affix Files
The Dutch dictionaries that used to be hosted at ftp.tue.nl have
disappeared. However, a site called
smartocr.com
appears to have archived them and many more.
Variant English Spellings
Specialized Dictionaries for Specific Fields
- A dictionary of
geological terms from Thomas L. Moore.
Esperanto Dictionaries and Affix Files
Estonian Dictionaries and Affix Files
- An advanced dictionary and affix file produced by Jaak Pruulmann
in cooperation with the Estonian Language Institute.
I'm told that this is the best Estonian dictionary to date
(late 2005).
- A dictionary
and affix file
created by Pearu Peterson in 2001.
There is also a
Debian
package created by Andres Soolo. However, Andres
informs me that this package is outdated, and that users should
download Jaak Pruulmann's work instead.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Finnish Dictionaries and Affix Files
- A good dictionary
and affix file
originally written by Martin Vermeer, and with major
extensions and improvements made by Pauli Virtanen.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
French Dictionaries and Affix Files
Frisian Dictionaries and Affix Files
- Eeltje de Vries and Kevin Scannell have created a
Frisian
dictionary using web-crawled corpora and statistical
methods.
Galician Dictionaries and Affix Files
- A Galician (Minimos) dictionary and affix file from Ramon Flores.
There are descriptive Web pages in Galician and English.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
German Dictionaries and Affix Files
Perhaps because German was my first test language, there are plenty of
German dictionaries available. I don't have a lot of
hints for choosing between them, but I believe that Heinz Knutzen's
version, being the most recent, is probably the best.
Incidentally, if you choose a dictionary that uses the
"wrong" umlaut notation of `"a' instead of the
`a"' that ispell expects, you can convert it with the
following ugly sed command:
sed 's/"\(.\)/\1"/g' wrong-dictionary > proper-dictionary
(I'd suggest that you use cut-and-paste directly out of this Web page
to make sure you get it right.)
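To make the effect of the command concrete, here is a small sketch that runs the same sed expression on a few sample words. The wrapper name fix_umlauts and the sample words are assumptions chosen for illustration; they are not taken from any of the dictionaries listed here.

```shell
# Sketch: wrap the sed expression from the text in a small filter.
# "fix_umlauts" is a hypothetical name used only for this example.
fix_umlauts() {
    # Move each double quote from before the following character to after it
    sed 's/"\(.\)/\1"/g'
}

# Illustrative input in the "wrong" notation, one word per line
printf '%s\n' 'M"adchen' 'gr"un' 'Haus' | fix_umlauts
# prints:
#   Ma"dchen
#   gru"n
#   Haus
```

Words without a double quote pass through unchanged, so the filter can be applied to a whole word list at once.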
Greek Dictionaries and Affix Files
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Hebrew Dictionaries and Affix Files
- There is also a specialized spelling checker for Hebrew,
called hspell. It is distributed by the Ivrix Project.
Interlingua Dictionaries and Affix Files
Italian Dictionaries and Affix Files
Irish Dictionaries and Affix Files
Kinyarwanda Dictionaries and Affix Files
Kurdish Dictionaries and Affix Files
Latin Dictionaries and Affix Files
- A very specialized
Latin
dictionary that was created to check the full edition
(about 500 pages) of the works of the Renaissance
mathematician Maurolico. The README is in French. This
dictionary might serve as a useful starting point for
other scholars.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Lithuanian Dictionaries and Affix Files
- A small
Lithuanian
dictionary and affix file. It does not currently have any
English READMEs.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Maltese Dictionaries and Affix Files
Manx Gaelic Dictionaries and Affix Files
Norwegian Dictionaries and Affix Files
- The
most recent version of a Norwegian dictionary
identifies itself as being for ispell 3.1.20, but it should
work with the current version as well.
- An older
page about the above dictionary still contains useful
information, even though it is not being maintained.
- The Norwegian
spelling project is extending the dictionary created
by Rune Kleveland. If you click on "Files", you will find
links that you can use to download the current release.
As well as Bokmål (ispell-nb), you can build a Nynorsk
(ispell-nn) dictionary. RPMs may also be available from
rpmfind.net,
and there is a Debian package that can be acquired with apt-get.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Polish Dictionaries and Affix Files
Portuguese Dictionaries and Affix Files
Quechua Dictionaries and Affix Files
- Amos Batto has produced a comprehensive Quechua package.
(In case you don't know, Quechua is an indigenous language of the
Andes, spoken in Bolivia and neighboring countries.)
He provides descriptions in Spanish and English, a hash file
(which may or may not work on your machine, due to the unfortunate
nonportability of ispell hash files), and a ZIP archive containing
the affix file and word list (which you can use to create a working
hash file with buildhash).
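Because a downloaded hash file may not work on your machine, rebuilding it locally from the word list and affix file is often the safer route. The sketch below assumes the usual buildhash argument order (word list, affix file, output hash file); check the manual page of your ispell version before relying on it. The function name rebuild_hash and the file names in the comment are placeholders, not the names used in the Quechua archive.

```shell
# Sketch: rebuild a hash file from a word list and an affix file.
# "rebuild_hash" and all file names are hypothetical placeholders.
rebuild_hash() {
    wordlist=$1
    affix=$2
    hashfile=$3
    # Refuse to run if either input file is missing or unreadable
    for f in "$wordlist" "$affix"; do
        [ -r "$f" ] || { echo "cannot read $f" >&2; return 1; }
    done
    # buildhash ships with ispell; bail out if it is not installed
    command -v buildhash >/dev/null 2>&1 || {
        echo "buildhash not found" >&2
        return 1
    }
    buildhash "$wordlist" "$affix" "$hashfile"
}

# Example call (file names are assumptions):
#   rebuild_hash quechua.dict quechua.aff quechua.hash
```

The resulting hash file is specific to the machine and ispell build that produced it, which is why it is built locally here and not copied from elsewhere.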
Romanian Dictionaries and Affix Files
- Ionuț Păduraru has created a
dictionary
that covers nearly every Romanian word.
He has provided a brief
explanation
(in Romanian).
- George Pauliuc has taken over Mihai Budiu's Romanian dictionary.
There is a new tar file available from sourceforge.net.
- Mihai Budiu has kindly provided what he describes as a "modest"
Romanian dictionary for use with ispell. The compressed tar file
includes an affix file and some auxiliary scripts. There is also
a README file in Romanian, which is unfortunately stored on an
unreliable server.
The tar file is also mirrored on
ftp://ftp.pub.ro/pub/spell-roman/spell-roman.taz,
which Mihai tells me is on the same unreliable server.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Russian Dictionaries and Affix Files
- A constantly-improving Russian
dictionary
from Alexander Lebedev,
based on Neal Dalton's version (below). This dictionary
contains 120,400 roots and recognizes over 1,149,000 words. The
author says "This package seems to be the only one that
supports the right spelling of words with the Russian letter
'yo' (other dictionaries simply replace the letter 'yo' by
'ye')."
An
earlier version of this dictionary is mirrored at
sunsite.
- A 200,000-word Russian
dictionary and affix file from
Constantine
Knizhnik, and based on Neal Dalton's version (below).
Constantine's page also includes an incremental ispell mode
(similar to MS-Word) for Emacs.
- A very preliminary 50,000-word dictionary, from sunsite.unc.edu,
created by Neal Dalton.
The tar file includes an affix file, which needs to be
corrected before it will work.
You must comment out the second wordchars line (the one just after
the comment about GOSTCII-8).
The dictionary is also mirrored at ftp.cs.umd.edu.
- A dictionary for the
old
(pre-1918) Russian orthography, from Serge Winitzki.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
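The correction to Neal Dalton's affix file mentioned above (commenting out the second wordchars line) can be sketched as a small filter. This assumes that affix-file comments begin with "#" and that each wordchars statement starts its own line, possibly after blanks. The function name comment_second_wordchars and the file names in the comment are hypothetical.

```shell
# Sketch: comment out the second "wordchars" statement in an affix file.
# "comment_second_wordchars" is a hypothetical helper name.
comment_second_wordchars() {
    # Count wordchars lines; prefix only the second one with "#"
    awk '/^[ \t]*wordchars/ && ++n == 2 { print "#" $0; next } { print }'
}

# Example call (file names are assumptions):
#   comment_second_wordchars < russian.aff > russian-fixed.aff
```

Writing to a new file keeps the original affix file intact in case the result needs to be compared or redone.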
Sardinian Dictionaries and Affix Files
Setswana Dictionaries and Affix Files
Slovakian Dictionaries and Affix Files
Slovenian Dictionaries and Affix Files
Spanish Dictionaries and Affix Files
- A
dictionary
maintained by Santiago Rodriguez and Jesus
Carretero at the University of Madrid.
The tar file includes an affix file, instructions, and
TeX hyphenation rules for Spanish.
More information is available on their Web page.
- A
number
of useful dictionaries from Grady Ward's Moby Project.
These lists are also available from Project Gutenberg.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Swedish Dictionaries and Affix Files
- Sadly, all of the Swedish dictionaries I used to link to
have disappeared from the Internet. I did find a
Debian page that contains copies of a Swedish
dictionary, but I don't know how easy it will be for
non-Debian users to install them.
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Tagalog Dictionaries and Affix Files
Tartar Dictionaries and Affix Files
- For Windows users, Luzius Schneider has provided a
complete
package containing ispell along with a large
collection of languages.
Tetum Dictionaries and Affix Files
Ukrainian Dictionaries and Affix Files
Vietnamese Dictionaries and Affix Files
Walloon Dictionaries and Affix Files
- A
Walloon
dictionary and affix file. The dictionary
uses the new normalized orthography, and currently has
8,705 entries that expand to between 70,000 and 100,000
forms.
This page is maintained by Geoff Kuenning.