An Arabic lexicon to support information retrieval, parsing, and text generation
General Material Designation
[Thesis]
First Statement of Responsibility
Alsamara, Khalid Said
Subsequent Statement of Responsibility
M. Evens
PUBLICATION, DISTRIBUTION, ETC.
Name of Publisher, Distributor, etc.
Illinois Institute of Technology
Date of Publication, Distribution, etc.
1996
PHYSICAL DESCRIPTION
Specific Material Designation and Extent of Item
81-81 p.
DISSERTATION (THESIS) NOTE
Dissertation or thesis details and type of degree
Ph.D.
Body granting the degree
Illinois Institute of Technology
Text preceding or following the note
1996
SUMMARY OR ABSTRACT
Text of Note
We developed an Arabic lexical database to support information retrieval, text generation, and parsing. It contains information about 12,500 words in the computer sublanguage. The database has a main table containing all words and separate tables for nouns, adjectives, verbs, and particles. The main table contains basic information for each Arabic word in a corpus of 242 abstracts: part of speech (noun, verb, particle, adjective), gender (masculine or feminine), number (singular, dual, plural), and person (1st, 2nd, 3rd). The lexical entry for a noun contains gender (masculine or feminine), person (1st, 2nd, 3rd), and number (singular, dual, plural). It also places the noun in a number of syntactic/semantic categories: inert or derived, concrete or abstract, structured or declined, denuded or augmented, animate or inanimate, and human or nonhuman. The lexical entry for a verb tells whether it is complete or deficient; transitive or intransitive; denuded or augmented; sound, defective, or mixed; and imperfect, perfect, or imperative. The lexical entry for each particle tells whether it acts on nouns, on verbs, or on both. Particles that act on nouns are classified as letters of reduction or annulment, vocatives, letters of exclusion, or conjunctions. Particles that act on verbs are classified as different kinds of letters of elusion or opening. Particles that act on both nouns and verbs are particles of attraction. The lexical entry for an adjective tells whether it is animate or inanimate and, for animate adjectives, whether it is human or nonhuman. We implemented a lexical database system on Arabic Windows using the Microsoft Access DBMS and a Graphical User Interface (GUI). It runs on IBM PCs and compatibles.
It is designed to be used by both programs and human end users, with the goal of supporting natural language processing systems, ongoing research at the Arabic Language Processing Laboratory at Illinois Institute of Technology, and future research in the Arabic language.
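Illustrative note: the layout described in the abstract (one main table for all words, plus a separate table per part of speech) might be sketched roughly as follows. This is only a sketch under stated assumptions, not the thesis's actual schema: every table name, column name, and helper function here is hypothetical, SQLite stands in for Microsoft Access, only the verb and particle tables are shown, and the transliterated sample word is an illustration rather than an entry taken from the lexicon.

```python
import sqlite3

# Hypothetical schema: all names are illustrative assumptions, and SQLite
# is used here in place of the Microsoft Access DBMS named in the abstract.
SCHEMA = """
CREATE TABLE word (
    word_id INTEGER PRIMARY KEY,
    form    TEXT NOT NULL,
    pos     TEXT NOT NULL
            CHECK (pos IN ('noun', 'verb', 'particle', 'adjective')),
    gender  TEXT CHECK (gender IN ('masculine', 'feminine')),
    number  TEXT CHECK (number IN ('singular', 'dual', 'plural')),
    person  INTEGER CHECK (person IN (1, 2, 3))
);
CREATE TABLE verb (
    word_id    INTEGER PRIMARY KEY REFERENCES word(word_id),
    complete   INTEGER NOT NULL CHECK (complete IN (0, 1)),
    transitive INTEGER NOT NULL CHECK (transitive IN (0, 1)),
    augmented  INTEGER NOT NULL CHECK (augmented IN (0, 1)),
    soundness  TEXT NOT NULL
               CHECK (soundness IN ('sound', 'defective', 'mixed')),
    aspect     TEXT NOT NULL
               CHECK (aspect IN ('imperfect', 'perfect', 'imperative'))
);
CREATE TABLE particle (
    word_id INTEGER PRIMARY KEY REFERENCES word(word_id),
    acts_on TEXT NOT NULL CHECK (acts_on IN ('nouns', 'verbs', 'both'))
);
"""


def build_db():
    """Create an in-memory database holding the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_verb(conn, form, gender, number, person, **features):
    """Insert a verb into the main table, then into the verb table."""
    cur = conn.execute(
        "INSERT INTO word (form, pos, gender, number, person) "
        "VALUES (?, 'verb', ?, ?, ?)",
        (form, gender, number, person),
    )
    word_id = cur.lastrowid
    conn.execute(
        "INSERT INTO verb (word_id, complete, transitive, augmented, "
        "soundness, aspect) VALUES (:word_id, :complete, :transitive, "
        ":augmented, :soundness, :aspect)",
        {"word_id": word_id, **features},
    )
    return word_id


def lookup(conn, form):
    """Return the main-table fields plus verb features (None if not a verb)."""
    return conn.execute(
        "SELECT w.pos, w.gender, w.number, w.person, v.transitive, v.aspect "
        "FROM word w LEFT JOIN verb v ON v.word_id = w.word_id "
        "WHERE w.form = ?",
        (form,),
    ).fetchone()
```

A program could then call, for example, `add_verb(conn, "kataba", "masculine", "singular", 3, complete=1, transitive=1, augmented=0, soundness="sound", aspect="perfect")` and retrieve the combined entry with `lookup(conn, "kataba")`. Keeping the shared fields in the main table and the category-specific features in per-category tables mirrors the split the abstract describes.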