Title

Efficient and Interpretable Machine Learning Algorithms for Predictive Analyses in Metagenomic Data

Author

Rahman, Mohammad Arifur

Subject

Artificial intelligence, Bioinformatics, Computer science, Epidemiology, Genetics, Microbiology

Classification

Library

Center and Library of Islamic Studies in European Languages

Location

Province: Qom - City: Qom

Library contact: 025-32910706

National bibliography number

Number

TL55649

Language of the work

Language of the text, soundtrack, etc.

English

Title and statement of responsibility

Title proper

Efficient and Interpretable Machine Learning Algorithms for Predictive Analyses in Metagenomic Data

General material designation

[Thesis]

First statement of responsibility

Rahman, Mohammad Arifur

Subsequent statement of responsibility

Rangwala, Huzefa

Publication, distribution, etc.

Name of publisher, distributor, etc.

George Mason University

Date of publication, distribution, etc.

2020

General note

Text of note

164 p.

Dissertation (thesis) note

Dissertation details and type of degree

Ph.D.

Degree-granting body

George Mason University

Date of degree

2020

Summary or abstract note

Text of note

Advancements in DNA sequencing technologies have enabled the direct investigation of the microbiome. The microbiome refers to all the microorganisms, e.g., bacteria and viruses, present as a community in a host. Researchers and clinicians have embarked on studying the role of these microorganisms in human health and disease. Most existing approaches first identify the microbial abundance in a sample using sequence databases of known microorganisms and then use the abundance values as features for predicting diseases, e.g., liver cirrhosis and type 2 diabetes. Taxonomic profiling and abundance quantification are computationally expensive, create a bias in subsequent predictions, and ignore a large amount of the data that comes from Next Generation Sequencing (NGS) technologies. Moreover, most microbes have not been laboratory-cultured and thus remain unknown, and existing approaches do not account for novel and unknown microorganisms. The lack of efficient analytical methods that overcome these limitations impedes the identification of the presence and functions of microbial organisms within different clinical and environmental samples. Hence, there is a need to develop scalable analytical algorithms for large-scale DNA sequence data, i.e., metagenomic data, to discover the microbiome, perform taxonomic profiling, quantify species abundance, and predict diseases. In this thesis, I develop Multiple Instance Learning (MIL) based algorithms to predict diseases from large-scale metagenomic data. MIL is a supervised classification approach that considers a single sample as a group of relevant data instances rather than as one single instance. In addition to predicting diseases, our proposed approaches can identify the individual microbial DNA sequences that are indicative of the diseases.
We hypothesize that an optimized solution to the MIL formulation of the problem will predict diseases more accurately than existing approaches by utilizing the available DNA sequence data and avoiding the inherent bias of the microbial profiling process. To ensure that the proposed algorithms can scale to the large volume of input sequences obtained from a metagenomic sample, we propose efficient canopy-based clustering solutions that can be integrated within the prediction pipeline. We evaluate the proposed algorithms on several clinical benchmarks and show improved prediction performance in terms of identifying clinical phenotypes, reporting interpretable results for clinicians, and ensuring scalable implementations.

Uncontrolled subject terms

Subject term

Artificial intelligence

Subject term

Bioinformatics

Subject term

Computer science

Subject term

Epidemiology

Subject term

Genetics

Subject term

Microbiology

Personal name as main entry (primary intellectual responsibility)

Personal name authority: unverified

Rahman, Mohammad Arifur

Personal name (secondary intellectual responsibility)

Personal name authority: unverified

Rangwala, Huzefa

Added entry (corporate body)

Corporate body name authority: unverified

George Mason University

Electronic location and access

Electronic name

Publication status

Publication format

Bibliographic record information

Material type

[Thesis]

Worksheet code

276903

Record access information

Access level

Completed
