Title

Sentiment Analysis for the Low-Resourced Latinised Arabic 'Arabizi'

Author

Tobaili, Taha

Subject

Language, Linguistics, Sociolinguistics

Classification

Library

Center and Library for Islamic Studies in European Languages

Location

Province: Qom, City: Qom

Library contact: 025-32910706

National bibliography number

Number

TLpq2487824237

Language of the work

Language of the written or spoken text, etc.

English

Title and statement of responsibility

Title proper

Sentiment Analysis for the Low-Resourced Latinised Arabic 'Arabizi'

General material designation

[Thesis]

First author

Tobaili, Taha

Other contributors

Fernandez, Miriam

Publication, distribution, etc.

Name of publisher, distributor, etc.

Open University (United Kingdom)

Date of publication, distribution, etc.

2020

Physical description

Specific material designation and extent

228

Thesis notes

Thesis details and degree type

Ph.D.

Degree-granting body

Open University (United Kingdom)

Date of degree

2020

Summary or abstract notes

Text of note

The expansion of digital communication from private mobile messaging into the public sphere through social media has given data science research and industry an opportunity to mine the resulting big data for automated information extraction. A popular information extraction task is sentiment analysis, which aims to extract opinion polarity (positive, negative, or neutral) from written natural language. Sentiment analysis has helped organisations better understand public opinion towards events, news, public figures, and products. However, it has advanced further for English than for Arabic. While sentiment analysis for Arabic is developing in the Natural Language Processing (NLP) literature, Arabizi, a popular variety of Arabic, has been overlooked. Arabizi is an informal transcription of spoken dialectal Arabic in Latin script, used for social texting. It is known to be common among Arab youth, yet efforts on Arabic sentiment analysis have passed it over because of its linguistic complexities. Like Arabic, Arabizi is rich in inflectional morphology, but it is also codeswitched with English or French and transcribed without adhering to a standard orthography. Rich morphology, inconsistent orthography, and codeswitching compound one another and multiply the lexical sparsity of the language: each Arabizi word can be spelled in many ways, and other languages are mixed into the same textual context. The resulting high degree of lexical sparsity defies the very basics of sentiment analysis, the classification of positive and negative words. Arabizi also faces a severe shortage of the data resources required for any sentiment analysis approach. In this thesis, we address this gap by conducting research on sentiment analysis for Arabizi.
We addressed the sparsity challenge by harvesting Arabizi data from multilingual social media text, using deep learning, to build Arabizi resources for sentiment analysis. We developed six new morphologically and orthographically rich Arabizi sentiment lexicons and set the baseline for Arabizi sentiment analysis on social media.

Subject (topical name or noun phrase)

Uncontrolled subject term

Language

Uncontrolled subject term

Linguistics

Uncontrolled subject term

Sociolinguistics

Personal name as main entry (primary intellectual responsibility)

Unverified personal name authority

Fernandez, Miriam

Unverified personal name authority

Tobaili, Taha

Electronic location and access

Electronic name

Publication status

Publication format

Bibliographic record information

Material type

[Thesis]

Worksheet code

276903

Record access information

Access level

Completed
