Title

Developing Efficient Algorithms for Data Mining Large Scale High Dimensional Data

Author

Zakaria, Jesin

Subject

Classification

Library

Center and Library for Islamic Studies in European Languages

Location

Province: Qom, City: Qom

Library contact: 025-32910706

National bibliography number

Number

TL660316zp

Language of the work

Language of the text, soundtrack, etc.

English

Title and statement of responsibility

Title proper

Developing Efficient Algorithms for Data Mining Large Scale High Dimensional Data

General material designation

[Thesis]

First statement of responsibility

Zakaria, Jesin

Subsequent statement of responsibility

Keogh, Eamonn

Publication, distribution, etc.

Name of publisher, distributor, etc.

UC Riverside

Date of publication, distribution, etc.

2013

Dissertation (thesis) note

Degree-granting body

UC Riverside

Text rating

2013

Summary or abstract note

Text of note

Data mining and knowledge discovery have attracted a great deal of attention in information technology in recent years. The rapid progress of computer hardware technology over the past three decades has greatly benefited the database and information industry. The size and complexity of real-world data are increasing dramatically with the growth of hardware technology. Although new efficient algorithms to deal with such data are constantly being proposed, the mining of large-scale, high-dimensional data still presents many challenges. In this dissertation, several novel algorithms are proposed to handle such challenges. These algorithms are applied to domains as diverse as electrocardiography (ECG), stock market data, geospatial data, power supply data, audio data, and image data. This dissertation contributes to the data mining community in the following three ways:
Firstly, we propose a novel algorithm for clustering time series data efficiently in the presence of noise or extraneous data. Most existing methods for time series clustering rely on distances calculated from the entire raw data. As a consequence, most work on time series clustering only considers the clustering of individual time series "behaviors," e.g., individual heartbeats, and contrives the time series in some way to make them all equal in length. However, for any real-world problem, formatting the data in such a way is often a harder task than the clustering itself. To remove these unrealistic assumptions, we have developed a new primitive called the unsupervised shapelet, or u-shapelet, and shown its utility for clustering time series.
Secondly, to speed up the discovery of u-shapelets and make it scalable, we have proposed two optimization techniques, each of which can speed up unsupervised shapelet discovery independently of the other. Moreover, combining the two optimization procedures results in a superlinear speedup. In addition, we can also cast our u-shapelet discovery algorithm as an anytime algorithm.
In our final contribution, we have developed a novel and robust algorithm for mining mouse vocalizations with a symbolized representation. Our algorithm processes large-scale, high-dimensional, noisy mouse vocalization data by dimensionality reduction and cardinality reduction, making it suitable for knowledge discovery tasks such as classification, clustering, similarity search, motif discovery, and contrast set mining.

Personal name as main entry (primary intellectual responsibility)

Unverified personal name authority

Zakaria, Jesin

Personal name (secondary intellectual responsibility)

Unverified personal name authority

Keogh, Eamonn

Added entry (corporate body)

Unverified corporate body name authority

UC Riverside

Electronic location and access

Electronic name

Publication status

Publication format

Bibliographic record information

Material type

[Thesis]

Worksheet code

276903

Record access information

Access level

Completed
