Source: http://ftp.sunet.se/pub/etext/ota/public/dicts/1192/
------------------------------------------------------------------------
OXFORD TEXT ARCHIVE
MACHINE-READABLE DICTIONARIES revised Jul 1993
------------------------------------------------------------------------
0. Introduction
The Oxford Text Archive has for several years maintained copies of
a number of machine-readable dictionaries along with its extensive (if
unsystematic) collections of other machine-readable texts. This
document gives some further details of the various dictionaries
available, and summarises the conditions under which copies of them
are currently distributed.
The Oxford Text Archive Shortlist (available on request via electronic
mail and by FTP) gives brief, up-to-date details of all texts held in
the Archive. Send electronic mail to ARCHIVE@VAX.OXFORD.AC.UK. For
anonymous FTP, look in the directory ota on ota.ox.ac.uk (129.67.1.165).
In the following extract from the shortlist, *** indicates that
more detail is given below.
English
============
***U-1192-E The CED Prolog Factbase.
-----------------------------------------------------------------------------
2.1 Text no 1192: The CED Prolog Fact Base
We are currently able to distribute all or parts of a set of Prolog facts
derived from the first published edition of the Collins English Dictionary
(CED) by a team headed by Ed Fox and Robert France at Virginia Tech as a
part of the CODER project (see E.A. Fox, 'Development of the CODER system',
Information Processing and Management 23:4, 1987). The text was originally
produced as part of an MS Thesis by Robert C. Wohlwend.
It consists of 20 files, one for each relation identified in the structure
of the dictionary, and a smaller number of files of statistics gathered
from the full text. Each relation file consists of ground facts in
Edinburgh standard Prolog syntax, one fact per line. In each case (except
for the headword relation, which has no accompanying data), the facts are of
the form name(descriptor,data), where name identifies the relation,
descriptor specifies both the associated entry and the depth within it at
which the fact is bound, and data represents the data stored for that
relation. As the samples below show, a few relations carry a further
argument: c_MORPH adds a part of speech code, and c_DEF and c_USAGE add a
block number. A full description of each relation is given in Fox et al,
"Building the CODER Lexicon" (Technical Report TR-86-23, Virginia Tech
Department of Computer Science).
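As an illustration only, facts of this shape can also be read outside
Prolog with a small tokenizer. The Python sketch below is not part of the
distributed text; the names tokenize and parse_fact are hypothetical. It
assumes that each fact occupies one physical line, that a doubled quote
inside a string stands for a literal quote (as in "KO""d"), and that
arguments are limited to strings, integers, unquoted atoms and lists, as in
the samples shown in this document.

```python
import re

# One token: quoted string, integer, unquoted atom, or punctuation.
_TOKEN = re.compile(r'''\s*(?:
      "(?P<str>(?:[^"]|"")*)"           # quoted string; "" is an escaped quote
    | (?P<int>\d+)                       # integer
    | (?P<atom>[A-Za-z_][A-Za-z0-9_]*)   # relation name or unquoted atom
    | (?P<punct>[\[\](),.])              # structural punctuation
)''', re.VERBOSE)


def tokenize(line):
    """Split one fact line into (kind, value) tokens."""
    line = line.strip()
    tokens, pos = [], 0
    while pos < len(line):
        m = _TOKEN.match(line, pos)
        if m is None:
            raise ValueError(f"unexpected text at column {pos}: {line[pos:]!r}")
        kind = m.lastgroup
        value = m.group(kind)
        if kind == "str":
            value = value.replace('""', '"')
        elif kind == "int":
            value = int(value)
        tokens.append((kind, value))
        pos = m.end()
    return tokens


def parse_fact(line):
    """Parse 'name(arg, ...).' into (name, args).

    Lists become Python lists, integers become int, and both quoted
    strings and unquoted atoms become str (the distinction is dropped).
    """
    tokens = tokenize(line)
    pos = 0

    def expect(value):
        nonlocal pos
        if tokens[pos] != ("punct", value):
            raise ValueError(f"expected {value!r} in {line!r}")
        pos += 1

    def term():
        nonlocal pos
        kind, value = tokens[pos]
        pos += 1
        if (kind, value) == ("punct", "["):
            items = []
            while tokens[pos] != ("punct", "]"):
                items.append(term())
                if tokens[pos] == ("punct", ","):
                    pos += 1
            pos += 1  # consume the closing "]"
            return items
        if kind == "punct":
            raise ValueError(f"unexpected {value!r} in {line!r}")
        return value

    try:
        kind, name = tokens[pos]
        if kind != "atom":
            raise ValueError(f"fact must start with a relation name: {line!r}")
        pos += 1
        expect("(")
        args = [term()]
        while tokens[pos] == ("punct", ","):
            pos += 1
            args.append(term())
        expect(")")
        expect(".")
    except IndexError:
        raise ValueError(f"truncated fact: {line!r}") from None
    return name, args
```

For example, parse_fact('c_CATEGORY([ "-bashing",1,2,1 ], informal ).')
yields the name 'c_CATEGORY' with the descriptor list ['-bashing', 1, 2, 1]
and the data value 'informal'.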
There follow the first few lines of each of the relation files, together
with a brief description of its content. Brief documentation is also
supplied with the text.
relations/category :- 28492 lines: indicating domain, register or region.
c_CATEGORY([ "-bashing",1,2,1 ], informal ).
c_CATEGORY([ "-bashing",1,2,1 ], slang ).
c_CATEGORY([ "-eme",1,1,1 ], linguistics ).
c_CATEGORY([ "-ish",1,1,2 ], often_derogatory ).
c_CATEGORY([ "-ium",1,0,1 ], sometimes ).
c_CATEGORY([ "-ji",1,0,1 ], indian ).
relations/abbrev :- 551 abbreviations
c_ABBREV([ "Alabama",1,1,1 ], "(with zip code)AL" ).
c_ABBREV([ "Alabama",1,1,1 ], "Ala|" ).
c_ABBREV([ "Alaska",1,1,1 ], "(with zip code)AK" ).
c_ABBREV([ "Alaska",1,1,1 ], "Alas|" ).
c_ABBREV([ "Alberta",1,1,1 ], "Alta|" ).
c_ABBREV([ "American Indian Movement",1,1,1 ], "AIM|" ).
relations/also.called :- 5342 alternate lexemes
c_ALSO_CALLED([ "-ine",2,1,2 ], "-in" ).
c_ALSO_CALLED([ "-wise",1,1,1 ], "-ways" ).
c_ALSO_CALLED([ "A",1,1,8 ], "at" ).
c_ALSO_CALLED([ "ACTH",1,1,1 ], "corticotropin" ).
c_ALSO_CALLED([ "Aborigine",1,1,1 ], "native Australian").
c_ALSO_CALLED([ "Abu Simbel",1,1,1 ], "Ipsambul" ).
relations/def.num :- 172944 lines : number of blocks in definition text
c_DEF_NUM([ "'elan vital",1,1,1 ], 2 ).
c_DEF_NUM([ "'em",1,1,1 ], 1 ).
c_DEF_NUM([ "'eminence grise",1,1,1 ], 1 ).
c_DEF_NUM([ "'epatant",1,1,1 ], 1 ).
c_DEF_NUM([ "'Alborg",1,1,1 ], 1 ).
relations/headword :- 83896 headwords
c_HEADWORD([ "'A",1 ]).
c_HEADWORD([ "'Abo",1 ]).
c_HEADWORD([ "'Aland Islands",1 ]).
c_HEADWORD([ "'Alborg",1 ]).
c_HEADWORD([ "'Alesund",1 ]).
relations/irreg.verb :- 2522 irregular verb forms
c_IRREGULAR([ "Crescendo",1 ], "doed" ).
c_IRREGULAR([ "Frenchify",1 ], "fied" ).
c_IRREGULAR([ "Frenchify",1 ], "fying" ).
c_IRREGULAR([ "Frlein",1 ], "lein" ).
c_IRREGULAR([ "K|O|",1 ], "KO""d" ).
relations/morph.var :- 28151 spellings and part of speech codes for
morphological variants
c_MORPH([ "-able",1 ], "-ability" , suffn ).
c_MORPH([ "-able",1 ], "-ably" , suffadv ).
c_MORPH([ "-agogue",1 ], "-agogic" , adjcomb ).
c_MORPH([ "-agogue",1 ], "-agogy" , ncomb ).
c_MORPH([ "-algia",1 ], "-algic" , adjcomb ).
c_MORPH([ "-archy",1 ], "-archic" , adjcomb ).
c_MORPH([ "-archy",1 ], "-archist" , ncomb ).
relations/name.cont :- 3051 continuations of proper names
c_NLAST([ "'Angstr",1,1,1 ], "Anders Jonas" ).
c_NLAST([ "Aalto",1,1,1 ], "Alvar" ).
c_NLAST([ "Abelard",1,1,1 ], "Peter" ).
c_NLAST([ "Aberdeen",2,1,1 ], "George Hamilton-Gordon" ).
c_NLAST([ "Academy",1,1,1 ], "the" ).
relations/irreg.plural :- 8263 irregular plural forms
c_PLURAL([ "'eminence grise",1 ], "eminences grises" ).
c_PLURAL([ "Abo",1 ], "Abos" ).
c_PLURAL([ "Achaemenid",1 ], "Achaemenidae" ).
c_PLURAL([ "Achaemenid",1 ], "Achaemenides" ).
c_PLURAL([ "Achaemenid",1 ], "Achaemenids" ).
c_PLURAL([ "Afro",1 ], "ros" ).
relations/p.o.s :- 100565 parts of speech
c_POS([ "'A",1,1 ], symbol ).
c_POS([ "'Abo",1,1 ], n ).
c_POS([ "'Aland Islands",1,1 ], npl ).
c_POS([ "'Alborg",1,1 ], n ).
c_POS([ "'Alesund",1,1 ], n ).
c_POS([ "'Angstr",1,1 ], n ).
c_POS([ "'Arhus",1,1 ], n ).
c_POS([ "'elan vital",1,1 ], n ).
c_POS([ "'em",1,1 ], pron ).
c_POS([ "'eminence grise",1,1 ], n ).
relations/rel.adj :- 155 related adjectives
c_RELADJ([ "China",1,1,2 ],[ "Sinitic",1 ]).
c_RELADJ([ "Descartes",1,1,2 ],[ "Cartesian",1 ]).
c_RELADJ([ "Easter",1,1,3 ],[ "paschal",1 ]).
c_RELADJ([ "England",1,1,1 ],[ "Anglican",1 ]).
c_RELADJ([ "France",1,1,1 ],[ "Gallic",1 ]).
relations/samp.use :- 20644 usage samples provided in sense
c_SAMP([ "'m",1,1,1 ], "yes'm" ).
c_SAMP([ "'re",1,1,1 ], "we're" ).
c_SAMP([ "-'s",1,1,6 ], "where's he live what's he do" ).
c_SAMP([ "-ad",1,1,1 ], "triad" ).
c_SAMP([ "-ad",1,1,2 ], "Dunciad" ).
relations/syll :- 83946 syllabifications
c_SYLL([ "'A",1 ],[ "'A" ]).
c_SYLL([ "'Abo",1 ],[ "'A_bo" ]).
c_SYLL([ "'Aland Islands",1 ],[ "'A_land Is_lands" ]).
c_SYLL([ "'Alborg",1 ],[ "'Al_borg" ]).
c_SYLL([ "'Alesund",1 ],[ "'A_le_sund" ]).
c_SYLL([ "'Angstr",1 ],[ "'Ang_str" ]).
c_SYLL([ "'Arhus",1 ],[ "'Ar_hus" ]).
c_SYLL([ "'elan vital",1 ],[ "'e_lan vi_tal" ]).
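Judging from the samples above, an underscore appears to mark each
syllable boundary, with spaces still separating words. Under that
assumption (it is an inference from the samples, not a statement from the
accompanying documentation), a syllabified form can be split as in the
following Python sketch; the function name split_syllables is hypothetical.

```python
def split_syllables(form):
    """Return one list of syllables per space-separated word in `form`.

    Assumes "_" marks a syllable boundary and " " separates words;
    hyphens and apostrophes are left inside the syllable they occur in.
    """
    return [word.split("_") for word in form.split(" ")]


if __name__ == "__main__":
    for sample in ["'A", "'A_le_sund", "'A_land Is_lands"]:
        print(sample, "->", split_syllables(sample))
```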
relations/usage.text :- 285 blocks (up to 80 bytes) of usage notes
c_USAGE([ "-ful",1 ], "Where the amount held by a spoon, etc|, is
used as a rough unit of" ,1 ).
c_USAGE([ "-ful",1 ], "measurement, the correct form is spoonful,
etc|" ,2 ).
c_USAGE([ "-ize",1 ], "-ise is equally acceptable in British
English|Certain words are, however, always" ,2 ).
c_USAGE([ "-ize",1 ], "In the U|S| and in Britain, -ize is the
standard ending for many verbs, but" ,1 ).
c_USAGE([ "-ize",1 ], "spelt with -ise in both the U|S| and in
Britain" ,3 ).
relations/usage.num :- 94 lines indicating number of blocks of usage text
c_USAGE_NUM([ "-ful",1 ], 2 ).
c_USAGE_NUM([ "-ize",1 ], 3 ).
c_USAGE_NUM([ "-wise",1 ], 3 ).
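The usage.text blocks are not necessarily stored in reading order (see the
"-ize" sample, where block 2 precedes block 1), so a reader has to sort
them by their final argument before joining them, and usage.num gives a
count against which the result can be checked. The Python sketch below
illustrates this on the sample facts above, already converted to tuples by
hand. The function names are hypothetical, and joining blocks with a single
space is an assumption based on the samples, in which blocks appear to
break at word boundaries. The "|" characters are left untouched.

```python
from collections import defaultdict

# (descriptor, text, block number), transcribed from the c_USAGE samples.
USAGE_BLOCKS = [
    (("-ful", 1), "Where the amount held by a spoon, etc|, is used as a rough unit of", 1),
    (("-ful", 1), "measurement, the correct form is spoonful, etc|", 2),
    (("-ize", 1), "-ise is equally acceptable in British English|Certain words are, however, always", 2),
    (("-ize", 1), "In the U|S| and in Britain, -ize is the standard ending for many verbs, but", 1),
    (("-ize", 1), "spelt with -ise in both the U|S| and in Britain", 3),
]

# descriptor -> expected number of blocks, from the c_USAGE_NUM samples.
USAGE_NUM = {("-ful", 1): 2, ("-ize", 1): 3}


def reassemble(blocks):
    """Join the text blocks of each descriptor in block-number order."""
    grouped = defaultdict(list)
    for descriptor, text, block_no in blocks:
        grouped[tuple(descriptor)].append((block_no, text))
    return {
        descriptor: " ".join(text for _, text in sorted(parts))
        for descriptor, parts in grouped.items()
    }


def count_mismatches(blocks, expected):
    """Return {descriptor: (expected, found)} where the block counts differ."""
    found = defaultdict(int)
    for descriptor, _text, _block_no in blocks:
        found[tuple(descriptor)] += 1
    return {
        descriptor: (n, found.get(descriptor, 0))
        for descriptor, n in expected.items()
        if found.get(descriptor, 0) != n
    }


if __name__ == "__main__":
    for descriptor, text in reassemble(USAGE_BLOCKS).items():
        print(descriptor, text)
    print("mismatches:", count_mismatches(USAGE_BLOCKS, USAGE_NUM))
```

The same approach should apply to def.text and def.num, whose facts have
the same shape but a longer descriptor.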
relations/irreg.sing :- 43 irregular singular forms
c_SINGULAR([ "Carbonari",1,1,1 ], "naro" ).
c_SINGULAR([ "Chasidim",1,1,1 ], "Chasid" ).
c_SINGULAR([ "Chasidim",1,1,1 ], "Chassid" ).
c_SINGULAR([ "Danaides",1,1,1 ], "Danaid" ).
relations/var.spell :- 4768 variant spellings
c_VAR_SPELL([ "'Alesund",1 ], "Aalesund" ).
c_VAR_SPELL([ "'gainst",1 ], "gainst" ).
c_VAR_SPELL([ "'twixt",1 ], "twixt" ).
c_VAR_SPELL([ "-aemia",1 ], "-haemia" ).
relations/var.syll :- 6713 syllabifications for variant spellings
c_VAR_SYLL([ "Abu-Bakr" ] , [ "A_bu-Bakr" ]).
c_VAR_SYLL([ "Achaia" ] , [ "A_cha_ia" ]).
c_VAR_SYLL([ "Achaian" ] , [ "A_chai_an" ]).
c_VAR_SYLL([ "Achitophel" ] , [ "A_chit_o_phel" ]).
c_VAR_SYLL([ "Adirondacks" ] , [ "Ad_i_ron_dacks" ]).
relations/def.text :- 230210 blocks (up to 80 chars) of definition text
c_DEF([ "'A",1,1,1 ], "angstrom unit" ,1 ).
c_DEF([ "'Abo",1,1,1 ], "the Swedish name for Turku" ,1 ).
c_DEF([ "'Aland Islands",1,1,1 ], "Bothnia| Capital: Mariehamn|
Pop|: 21,500 (1968)| Finnish name: Ahvenanmaa" ,2 ).
c_DEF([ "'Aland Islands",1,1,1 ], "a group of over 6000 islands
under Finnish administration, in the Gulf of" ,1 ).
c_DEF([ "'Alborg",1,1,1 ], "a variant spelling of Aalborg" ,1 ).
c_DEF([ "'Alesund",1,1,1 ], "a port and market town in W Norway, on
an island between Bergen and Trondheim:" ,1 ).
relations/compare :- 7978 cross-references indicated by 'compare'
c_COMPARE([ "'elan vital",1,1,1 ],[ "Bergsonism" ]).
c_COMPARE([ "-ance",1,1,1 ],[ "-ence" ]).
c_COMPARE([ "-ation",1,1,1 ],[ "-ion" ]).
c_COMPARE([ "-ation",1,1,1 ],[ "-tion" ]).
=============================================================================
Appendix 1: costs and conditions
Unless otherwise stated, all texts are deposited in and maintained by the
Archive according to the same conditions, which are summarised in the "User
Declaration" reprinted at the end of this document. Two conditions of
particular importance in connexion with Dictionary texts are (a) that copies
can only be made for individual scholars for use in private non-profit
scholarly research and (b) that any revised or reformatted versions of the
dictionaries produced as a part of such work should be made freely available to
other such scholars by re-depositing new versions of the texts with the
Archive. For use of the dictionaries not covered by these conditions, users are
referred to the dictionary publishers, whose rights in the texts the Archive
(and all users of the texts) are legally obliged to respect.
A small number of dictionaries are now available for anonymous FTP: this is
indicated explicitly in the list above. For all other texts, the following
procedure applies. Orders must be prepaid and must be given on our standard
order form, which is available on request, or by FTP. In some cases, letters
of permission must also accompany the order.
Current prices for dictionaries supplied on magnetic tape or cartridge:
    Private individuals        £20 (UK and Europe)   £30 (outside Europe)
    Academic Institutions      £100
    Commercial Institutions    £500
These prices represent a flat charge for each dictionary which covers the
cost of media, such documentation as is available and despatch by the
fastest route (usually airmail). The above prices are given in sterling; if
payment is made in some other currency, a 10 pound conversion fee should be
added. Payment should be made by money order or cheque payable to the
account of Oxford University Computing Services, and must accompany the
order. To keep our running costs low, we are obliged to insist on
prepayment.
Academic or Commercial Institutions are requested to ensure that any
additional users of the text other than the original signatory to the User
Declaration should also sign a copy of the Declaration, and return it to
the Archive.
Commercial Institutions are also reminded that the Archive is not licensed
to distribute the dictionaries on a commercial basis, and that any
commercial use, application or development of them is an infringement of
the User Declaration. For permission to use the texts commercially, an
approach should be made to the publishers of the dictionaries.
Appendix 2: User Declaration
I hereby undertake:-
PURPOSE To use the text for purposes of private scholarly research only
and not for profit (this shall not preclude the publication in a
scholarly context of analyses or interpretations derived from
the text).
ACKNOWLEDGEMENT
To acknowledge in any work, published or unpublished, based in
whole or in part on analyses made of the texts both the original
depositors of the text and the Archive.
COPYRIGHT Not to copy in whole or in part the text, except insofar as this
may be necessary for security purposes or for my own personal
use. Not to distribute the text to third parties nor to publish
or reproduce it in any way. Copyright of all machine-readable
texts issued by the Archive is reserved.
ACCESS To give access to the text only to persons directly associated
with me or working under my control and to require of such
persons signed undertakings neither to use the text except in
connexion with my academic purposes nor to give access to the
text to others; these signed undertakings to be made available to
the Archive on request.
ERRORS Not to hold the Archive liable for any errors of transcription
which may be found in the text, but to notify the Archive of such
errors wherever possible.
REVISIONS Where substantial revisions or reformatting of the text is
carried out as a part of my research, to inform the Archive of the
nature of such revisions and to make available a copy of any such
revised version to the Archive.
Oxford Text Archive February 1992