Use of Dispersion Ratio with Ferrer Diagram for Classification
General Material Designation
[Thesis]
First Statement of Responsibility
Shah, Yash Alpeshkumar
Subsequent Statement of Responsibility
Khan, Maleq
PUBLICATION, DISTRIBUTION, ETC.
Name of Publisher, Distributor, etc.
Texas A&M University - Kingsville
Date of Publication, Distribution, etc.
2019
PHYSICAL DESCRIPTION
Specific Material Designation and Extent of Item
32
DISSERTATION (THESIS) NOTE
Dissertation or thesis details and type of degree
M.S.
Body granting the degree
Texas A&M University - Kingsville
Text preceding or following the note
2019
SUMMARY OR ABSTRACT
Text of Note
Classification is one of the most useful techniques in data mining. It makes predictions by assigning a class label to a given data instance, and it is used in applications such as biological classification, document classification, drug discovery, and pattern recognition. Data mining extracts hidden information and finds unknown patterns in collected data using techniques such as classification, clustering, and regression. A vast amount of data is collected worldwide, but little of it is used for data mining. Much research has been done to improve the accuracy and efficiency of these techniques. Guggari et al. [1] suggested using the Ferrer diagram feature selection technique to improve accuracy. Roy et al. [2] suggested using the dispersion ratio as the splitting criterion instead of the Gini index. This thesis combines the two approaches to improve accuracy: it uses the Ferrer diagram technique and replaces the Gini index used in it with the dispersion ratio. Rigorous experiments were performed on datasets covering heart disease, diabetes, wine quality, and vehicle identification. Of the experiments performed, 80% show an improvement in accuracy ranging from 0.3% to 4.16%.