From hyphen@ibmpcug.co.uk Sat Apr 16 08:38:48 1994
Received: from oxmail by black.ox.ac.uk; Sat, 16 Apr 1994 08:38:46 +0100
Received: from kate.ibmpcug.co.uk by oxmail.ox.ac.uk with SMTP (PP) 
          id <17426-0@oxmail.ox.ac.uk>; Sat, 16 Apr 1994 08:38:41 +0100
To: pcl@ox.ac.uk
Subject: hungarian readme file
Organization: The PC User Group, UK
Cc: 
Date: Sat, 16 Apr 94 8:38:33 BST
From: D Fawthrop <hyphen@ibmpcug.co.uk>
Sender: hyphen@ibmpcug.co.uk
Message-Id: <9404160838.aa28556@kate.ibmpcug.co.uk>
Status: RO


This is a list of 18,000 Hungarian words collected in early 1994 from
soc.culture.hungary.  We believe that the quality of the text used was poor and
that spelling errors are therefore common.  One obvious problem, however, is
that these words are in "Computer Hungarian", without accents or the special
characters.  Although Internet headers and signature blocks were removed during
collection, a lot of computerese and loan words are still left in the list.  It
really requires someone to correct the words, which we cannot do.

Each word is followed by the frequency of its occurrence in the text examined,
in a pale imitation of the Brown corpus.  The counting was performed using two
utilities, "one word" and "uniq_num", which I have placed in the Public Domain
and which are to be found in the directory "utilities".
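
For readers without access to those utilities, the following is a minimal
Python sketch of the same kind of processing: split the text into words, then
count how often each one occurs.  It is not the code of "one word" or
"uniq_num", whose exact behaviour is not described here.  The function names,
the lowercasing, the restriction to unaccented letters a-z and the
"word count" output layout are all assumptions made for illustration.

```python
import re
from collections import Counter


def one_word_per_line(text):
    """Hypothetical stand-in for "one word": return the words of the text.

    Assumption: words are runs of unaccented letters a-z, folded to
    lowercase, which suits accent-free "Computer Hungarian".
    """
    return re.findall(r"[a-z]+", text.lower())


def count_words(words):
    """Hypothetical stand-in for "uniq_num": count each distinct word."""
    return Counter(words)


def format_wordlist(counts):
    """Return 'word count' lines in alphabetical order (assumed layout)."""
    return [f"{word} {n}" for word, n in sorted(counts.items())]


if __name__ == "__main__":
    sample = "Szia vilag! Szia mindenki, ez egy pelda."
    for line in format_wordlist(count_words(one_word_per_line(sample))):
        print(line)
```

Punctuation and digits are simply dropped by this sketch, so computerese such
as host names would be split into fragments rather than filtered out.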

This is the result of a project which required only some tens of thousands of
words, and which is now complete.  Should anyone want to continue the project,
good luck, get on with it!  soc.culture.hungary contains text from many authors
on many subjects, so there are many more words available.

I am interested in wordlists for any unusual language and am collecting several.
If you know of any, please let me know.

Dave Fawthrop <hyphen@ibmpcug.co.uk> Hyphen House, 8 Cooper Grove,
Shelf, Halifax, HX3 7RF, England. Phone/Fax/Answer : +44 274 691092
-   God loved the World so much that he gave his only Son, so that    -
- anyone who believes in him shall not perish, but have eternal life. -


