Title

A text mining approach for Arabic question answering systems

Author

Sadek, J.

Subject

Classification

Library

Center and Library of Islamic Studies in European Languages

Location

Province: Qom, City: Qom

Library contact: 32910706-025

NATIONAL BIBLIOGRAPHY NUMBER

Number

TLets644761

TITLE AND STATEMENT OF RESPONSIBILITY

Title Proper

A text mining approach for Arabic question answering systems

General Material Designation

[Thesis]

First Statement of Responsibility

Sadek, J.

PUBLICATION, DISTRIBUTION, ETC.

Name of Publisher, Distributor, etc.

University of Salford

Date of Publication, Distribution, etc.

2014

DISSERTATION (THESIS) NOTE

Dissertation or thesis details and type of degree

Thesis (Ph.D.)

Text preceding or following the note

2014

SUMMARY OR ABSTRACT

Text of Note

As most of the electronic information available on the web today is stored as text, developing Question Answering systems (QAS) has been the focus of many individual researchers and organizations. Relatively few studies, however, have addressed extracting answers to "why" and "how to" questions. One reason for this neglect is that, beyond sentence boundaries, deriving text structure is a very time-consuming and complex process. This thesis explores a new strategy for dealing with the exponentially large space associated with the text derivation task. To our knowledge, no system to date has attempted to address this type of question for the Arabic language. We propose two analytical models. The first is the Pattern Recognizer, which employs a set of approximately 900 linguistic patterns targeting relationships that hold within sentences. This model is enhanced with three independent algorithms to discover the causal/explanatory role indicated by the justification particles. The second is the Text Parser, which approaches text from a discourse perspective in the framework of Rhetorical Structure Theory (RST). This model is meant to break away from the sentence limit. The Text Parser is built on top of the output produced by the Pattern Recognizer and incorporates a set of heuristic scores to produce the most suitable structure representing the whole text. The two models are combined to allow the development of an Arabic QAS that deals with "why" and "how to" questions. The Pattern Recognizer achieved an overall recall of 81% and a precision of 78%, while our question answering system found the correct answer for 68% of the test questions. Our results reveal that the justification particles play a key role in indicating intrasentential relations.

PERSONAL NAME - PRIMARY RESPONSIBILITY

Sadek, J.

CORPORATE BODY NAME - SECONDARY RESPONSIBILITY

University of Salford

ELECTRONIC LOCATION AND ACCESS

Electronic name