Title

A hybrid unsupervised approach to feature selection in text using population-based optimization algorithms

Creator

/ Mahdieh Labani

Subject

Feature selection, Dimensionality reduction, Text categorization, Multivariate technique, Multi-objective optimization algorithm

Classification

Library

Central library and document university of Kurdistan

Location

Province: Kurdistan, City: Sanandaj

Library contact: 08733624006-9 and 08733664600


NATIONAL BIBLIOGRAPHY NUMBER

Number

۲۵۰۲پ

LANGUAGE OF THE ITEM

Language of Text, Soundtrack etc.

Persian

Language of Original Work

Persian

TITLE AND STATEMENT OF RESPONSIBILITY

Title Proper

A hybrid unsupervised approach to feature selection in text using population-based optimization algorithms

General Material Designation

[Thesis]

First Statement of Responsibility

/ Mahdieh Labani

PUBLICATION, DISTRIBUTION, ETC.

Place of Publication, Distribution, etc.

Sanandaj

Name of Publisher, Distributor, etc.

: University of Kurdistan, Faculty of Engineering

Date of Publication, Distribution, etc.

, 1395 (Solar Hijri calendar; 2016/2017)

PHYSICAL DESCRIPTION

Specific Material Designation and Extent of Item

د, 108 p.

Other Physical Details

: illustrated, tables

Accompanying Material

+ compact disc

GENERAL NOTES

Text of Note

Abstract in Persian and English

INTERNAL BIBLIOGRAPHIES/INDEXES NOTE

Text of Note

Bibliography: pp. 91-96

CONTENTS NOTE

Text of Note

Appendix

DISSERTATION (THESIS) NOTE

Dissertation or thesis details and type of degree

Master's degree

Discipline of degree

Artificial Intelligence and Robotics

Body granting the degree

Kurdistan

Text preceding or following the note

20

SUMMARY OR ABSTRACT

Text of Note

With the ever-increasing advance of Internet technology, the number of electronic documents has grown dramatically. Text categorization plays an important role in making this huge volume of data easier to access. One of the problems of text categorization is the high dimensionality of the feature space. In high-dimensional datasets, many features are irrelevant and redundant and can have a negative effect on the performance of the classification system. Feature selection is an important approach to overcoming this problem; its goal is to select a subset of relevant features from the initial feature set. By reducing the dimensionality of the problem, feature selection therefore lowers computational complexity and improves the generalization ability of the classification algorithm. In this thesis, three new feature selection methods are presented. The first proposed method focuses on feature selection using the concept of minimum redundancy among features and maximum relevance to the target class in text categorization. In this method, irrelevant and redundant features are removed effectively, but because features are chosen greedily during the selection process, it produces locally optimal solutions. To address this weakness, the second proposed approach presents a multi-objective algorithm based on mutual information that aims to reduce redundancy among features and increase relevance to the class. The third proposed method builds on the first: it applies the relevance and redundancy criteria of the first method within a multi-objective evolutionary algorithm to select the best features. The main advantage of the second and third proposed methods is the use of multi-objective evolutionary algorithms in the feature selection process. The performance of the proposed methods has been compared with several feature selection methods on different classifiers. The experimental results show the effectiveness of the proposed methods and an improvement over earlier feature selection methods.
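The greedy minimum-redundancy, maximum-relevance selection described for the first method can be illustrated with a minimal sketch. It is not the thesis's algorithm: it assumes the generic criterion "relevance minus mean redundancy" with mutual information as both measures, whereas the thesis describes scoring based on document frequencies and a correlation metric. The names `mutual_information`, `greedy_mrmr`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def greedy_mrmr(columns, labels, k):
    """Greedily pick k feature indices; columns[j] holds feature j's values.

    Assumed score (generic mRMR, not the thesis's exact measure):
    relevance to the labels minus mean redundancy with the chosen features.
    """
    relevance = [mutual_information(col, labels) for col in columns]
    selected = []
    remaining = set(range(len(columns)))
    while remaining and len(selected) < k:
        def score(j):
            if not selected:
                return relevance[j]
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        # sorted() makes ties resolve to the lowest index
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy term-presence data for 8 documents (illustrative values only).
labels = [0, 0, 0, 1, 1, 1, 1, 1]
columns = [
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 0: informative term
    [0, 0, 0, 0, 1, 1, 1, 1],  # feature 1: exact duplicate of feature 0
    [0, 0, 1, 1, 0, 0, 1, 1],  # feature 2: weaker, but independent of feature 0
]
chosen = greedy_mrmr(columns, labels, 2)
print(chosen)  # → [0, 2]
```

Ranking by relevance alone would keep features 0 and 1, which carry the same information; the redundancy penalty makes the greedy step prefer feature 2 instead. Because each step commits to the locally best feature, the result is not guaranteed to be the globally best subset, which is the weakness the abstract notes.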

Text of Note

With the rapid advance of internet technologies, the number of electronic documents has increased drastically worldwide. Automatic text categorization is becoming more and more important for dealing with this massive volume of data. However, the major problem of text categorization is the high dimensionality of the feature space. An excessive number of features not only increases computation time but also degrades classification accuracy. In a high-dimensional dataset, many features are typically irrelevant and/or redundant for a given learning task, which harms performance. Feature selection is the main approach to reducing the dimensionality of the text feature space: it selects the most informative features while still retaining sufficient information for the classification task. This reduction also lowers the computational cost and speeds up the learning process.
In this thesis, three novel feature selection methods are proposed. The first method focuses on reducing redundant features using the minimal-redundancy maximal-relevance concept. To this end, it takes the document frequency of each term into account when estimating the term's usefulness. The method not only selects the most relevant features but also accounts for the redundancy between them using a correlation metric. However, the algorithm uses greedy search to select features incrementally, which usually yields locally optimal solutions. The other two methods build on the first. To address this weakness, the second method is a novel multi-objective algorithm based on mutual information for feature selection; it identifies minimally redundant features that have maximum relevance to the target class. The third method applies the relevance and redundancy criteria of the first method within a multi-objective evolutionary algorithm to choose the best features. The main advantage of the second and third methods is their use of a multi-objective evolutionary algorithm in the feature selection process.
The performance of the proposed methods is compared with several well-known feature selection methods using different classifiers. The experimental results show the efficiency and effectiveness of the proposed methods as well as improvements over previous related methods.
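The multi-objective formulation mentioned in the abstract treats relevance and redundancy as two separate objectives rather than folding them into one score. A minimal sketch of the underlying comparison, Pareto dominance, follows. It shows only how candidate feature subsets could be compared and is not the evolutionary algorithm used in the thesis; the function names and the candidate scores are hypothetical, illustrative values.

```python
def dominates(a, b):
    """True if point a Pareto-dominates point b.

    Each point is (relevance, redundancy): relevance is maximized,
    redundancy is minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(points):
    """Return the names of candidates that no other candidate dominates."""
    return [
        name
        for name, p in points.items()
        if not any(dominates(q, p) for other, q in points.items() if other != name)
    ]


# Hypothetical candidate feature subsets with made-up objective values.
candidates = {
    "A": (0.90, 0.60),
    "B": (0.70, 0.20),
    "C": (0.60, 0.40),  # dominated by B: less relevant and more redundant
    "D": (0.40, 0.05),
    "E": (0.85, 0.70),  # dominated by A: less relevant and more redundant
}
front = pareto_front(candidates)
print(front)  # → ['A', 'B', 'D']
```

A multi-objective evolutionary algorithm would typically evolve a population of such candidate subsets and keep the non-dominated ones, leaving the final trade-off between relevance and redundancy to a later selection step.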

UNCONTROLLED SUBJECT TERMS

Subject Term

Feature selection

Subject Term

Dimensionality reduction

Subject Term

Text categorization

Subject Term

Multivariate technique

Subject Term

Multi-objective optimization algorithm

PERSONAL NAME - PRIMARY RESPONSIBILITY

Entry Element

Labani,

Part of Name Other than Entry Element

Mahdieh

Relator Code

Creator

PERSONAL NAME - SECONDARY RESPONSIBILITY

Entry Element

Moradi,

Part of Name Other than Entry Element

Parham

Relator Code

Supervisor

Entry Element

Ahmadizar,

Part of Name Other than Entry Element

Fardin

Relator Code

Advisor

CORPORATE BODY NAME - SECONDARY RESPONSIBILITY

Entry Element

University of Kurdistan

Subdivision

. Faculty of Engineering

ORIGINATING SOURCE

Country

Iran

Agency

Central Library, University of Kurdistan

Date of Transaction

20170816

LOCATION AND CALL NUMBER

Call Number

EAI2667 1395, Central Library

92029
