Title

Reducing out-of-vocabulary in morphology to improve the accuracy in Arabic dialects speech recognition

Author

Almeman, Khalid Abdulrahman

Subject

P Philology. Linguistics ; PJ Semitic ; QA75 Electronic computers. Computer science

Classification

Library

Library of Islamic Studies in European Languages

Location

Province: Qom, City: Qom

Library contact: 025-32910706

TLets642395

[Thesis]

University of Birmingham

2015

Thesis (Ph.D.)

This thesis has two aims: developing resources for Arabic dialects and improving speech recognition for those dialects. Two important components are considered: the Pronunciation Dictionary (PD) and the Language Model (LM). The work comprises six parts, which cover building and evaluating dialect resources and improving the performance of speech recognition systems for dialects. Three resources were built and evaluated: one tool and two corpora. The multi-dialect morphology analyser was built by proposing and evaluating linguistic and statistical bases, and it achieved an overall accuracy of 94%. The dialect text corpora cover four sub-dialects and contain more than 50 million tokens. The multi-dialect speech corpora contain 32 hours of speech collected from 52 participants, amounting to more than 67,000 speech files. The main objective is to improve the PDs and LMs of Arabic dialects. An incremental methodology made it possible to check orthography and phonology rules incrementally and so to identify the rules that positively affected the PDs. The Word Error Rate (WER) improved by 5.3% in Modern Standard Arabic (MSA) and by 5% in Levantine. Three levels of morphemes were used to improve the LMs of dialects: stem, prefix+stem and stem+suffix. The three forms were tested with two different types of LM. Eighteen experiments were carried out on MSA, the Gulf dialect and the Egyptian dialect; all yielded positive results, with WER reductions of 0.5% to 6.8%.
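The abstract reports its results as Word Error Rate (WER) without defining the metric. As a reading aid, the following is a minimal sketch of the standard WER definition: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. It is not taken from the thesis; the function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration, and the thesis may have used a different scoring tool.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance / number of reference words.

    Illustrative only: tokenisation by whitespace is an assumption,
    not the scoring procedure used in the thesis.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

Under this definition a lower WER is better, so the improvements quoted in the abstract correspond to reductions in this value.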

276903
