1. Quick Overview.- 1.1 The Classifier Design Problem.- 1.2 Single Layer and Multilayer Perceptrons.- 1.3 The SLP as the Euclidean Distance and the Fisher Linear Classifiers.- 1.4 The Generalisation Error of the EDC and the Fisher DF.- 1.5 Optimal Complexity - The Scissors Effect.- 1.6 Overtraining in Neural Networks.- 1.7 Bibliographical and Historical Remarks.- 2. Taxonomy of Pattern Classification Algorithms.- 2.1 Principles of Statistical Decision Theory.- 2.2 Four Parametric Statistical Classifiers.- 2.2.1 The Quadratic Discriminant Function.- 2.2.2 The Standard Fisher Linear Discriminant Function.- 2.2.3 The Euclidean Distance Classifier.- 2.2.4 The Anderson-Bahadur Linear DF.- 2.3 Structures of the Covariance Matrices.- 2.3.1 A Set of Standard Assumptions.- 2.3.2 Block Diagonal Matrices.- 2.3.3 The Tree Type Dependence Models.- 2.3.4 Temporal Dependence Models.- 2.4 The Bayes Predictive Approach to Design Optimal Classification Rules.- 2.4.1 A General Theory.- 2.4.2 Learning the Mean Vector.- 2.4.3 Learning the Mean Vector and CM.- 2.4.4 Qualities and Shortcomings.- 2.5. 
Modifications of the Standard Linear and Quadratic DF.- 2.5.1 A Pseudo-Inversion of the Covariance Matrix.- 2.5.2 Regularised Discriminant Analysis (RDA).- 2.5.3 Scaled Rotation Regularisation.- 2.5.4 Non-Gaussian Densities.- 2.5.5 Robust Discriminant Analysis.- 2.6 Nonparametric Local Statistical Classifiers.- 2.6.1 Methods Based on Mixtures of Densities.- 2.6.2 Piecewise-Linear Classifiers.- 2.6.3 The Parzen Window Classifier.- 2.6.4 The k-NN Rule and a Calculation Speed.- 2.6.5 Polynomial and Potential Function Classifiers.- 2.7 Minimum Empirical Error and Maximal Margin Linear Classifiers.- 2.7.1 The Minimum Empirical Error Classifier.- 2.7.2 The Maximal Margin Classifier.- 2.7.3 The Support Vector Machine.- 2.8 Piecewise-Linear Classifiers.- 2.8.1 Multimodal Density Based Classifiers.- 2.8.2 Architectural Approach to Design of the Classifiers.- 2.8.3 Decision Tree Classifiers.- 2.9 Classifiers for Categorical Data.- 2.9.1 Multinomial Classifiers.- 2.9.2 Estimation of Parameters.- 2.9.3 Decision Tree and the Multinomial Classifiers.- 2.9.4 Linear Classifiers.- 2.9.5 Nonparametric Local Classifiers.- 2.10 Bibliographical and Historical Remarks.- 3. Performance and the Generalisation Error.- 3.1 Bayes, Conditional, Expected, and Asymptotic Probabilities of Misclassification.- 3.1.1 The Bayes Probability of Misclassification.- 3.1.2 The Conditional Probability of Misclassification.- 3.1.3 The Expected Probability of Misclassification.- 3.1.4 The Asymptotic Probability of Misclassification.- 3.1.5 Learning Curves: An Overview of Different Analysis Methods.- 3.1.6 Error Estimation.- 3.2 Generalisation Error of the Euclidean Distance Classifier.- 3.2.1 The Classification Algorithm.- 3.2.2 Double Asymptotics in the Error Analysis.- 3.2.3 The Spherical Gaussian Case.- 3.2.3.1 The Case N2 = N1.- 3.2.3.2 The Case N2 ≠ 
N1.- 3.3 Most Favourable and Least Favourable Distributions of the Data.- 3.3.1 The Non-Spherical Gaussian Case.- 3.3.2 The Most Favourable Distributions of the Data.- 3.3.3 The Least Favourable Distributions of the Data.- 3.3.4 Intrinsic Dimensionality.- 3.4 Generalisation Errors for Modifications of the Standard Linear Classifier.- 3.4.1 The Standard Fisher Linear DF.- 3.4.2 The Double Asymptotics for the Expected Error.- 3.4.3 The Conditional Probability of Misclassification.- 3.4.4 A Standard Deviation of the Conditional Error.- 3.4.5 Favourable and Unfavourable Distributions.- 3.4.6 Theory and Real-World Problems.- 3.4.7 The Linear Classifier D for the Diagonal CM.- 3.4.8 The Pseudo-Fisher Classifier.- 3.4.9 The Regularised Discriminant Analysis.- 3.5 Common Parameters in Different Competing Pattern Classes.- 3.5.1 The Generalisation Error of the Quadratic DF.- 3.5.2 The Effect of Common Parameters in Two Competing Classes.- 3.5.3 Unequal Sample Sizes in Plug-In Classifiers.- 3.6 Minimum Empirical Error and Maximal Margin Classifiers.- 3.6.1 Favourable Distributions of the Pattern Classes.- 3.6.2 VC Bounds for the Conditional Generalisation Error.- 3.6.3 Unfavourable Distributions for the Euclidean Distance and Minimum Empirical Error Classifiers.- 3.6.4 Generalisation Error in the Spherical Gaussian Case.- 3.6.5 Intrinsic Dimensionality.- 3.6.6 The Influence of the Margin.- 3.6.7 Characteristics of the Learning Curves.- 3.7 Parzen Window Classifier.- 3.7.1 The Decision Boundary of the PW Classifier with Spherical Kernels.- 3.7.2 The Generalisation Error.- 3.7.3 Intrinsic Dimensionality.- 3.7.4 Optimal Value of the Smoothing Parameter.- 3.7.5 The k-NN Rule.- 3.8 Multinomial Classifier.- 3.9 Bibliographical and Historical Remarks.- 4. 
Neural Network Classifiers.- 4.1 Training Dynamics of the Single Layer Perceptron.- 4.1.1 The SLP and its Training Rule.- 4.1.2 The SLP as Statistical Classifier.- 4.1.2.1 The Euclidean Distance Classifier.- 4.1.2.2 The Regularised Discriminant Analysis.- 4.1.2.3 The Standard Linear Fisher Classifier.- 4.1.2.4 The Pseudo-Fisher Classifier.- 4.1.2.5 Dynamics of the Magnitudes of the Weights.- 4.1.2.6 The Robust Discriminant Analysis.- 4.1.2.7 The Minimum Empirical Error Classifier.- 4.1.2.8 The Maximum Margin (Support Vector) Classifier.- 4.1.3 Training Dynamics and Generalisation.- 4.2 Non-linear Decision Boundaries.- 4.2.1 The SLP in Transformed Feature Space.- 4.2.2 The MLP Classifier.- 4.2.3 Radial Basis-Function Networks.- 4.2.4 Learning Vector Quantisation Networks.- 4.3 Training Peculiarities of the Perceptrons.- 4.3.1 Cost Function Surfaces of the SLP Classifier.- 4.3.2 Cost Function Surfaces of the MLP Classifier.- 4.3.3 The Gradient Minimisation of the Cost Function.- 4.4 Generalisation of the Perceptrons.- 4.4.1 Single Layer Perceptron.- 4.4.1.1 Theoretical Background.- 4.4.1.2 The Experiment Design.- 4.4.1.3 The SLP and Parametric Classifiers.- 4.4.1.4 The SLP and Structural (Nonparametric) Classifiers.- 4.4.2 Multilayer Perceptron.- 4.4.2.1 Weights of the Hidden Layer Neurones are Common for all Outputs.- 4.4.2.2 Intrinsic Dimensionality Problems.- 4.4.2.3 An Effective Capacity of the Network.- 4.5 Overtraining and Initialisation.- 4.5.1 Overtraining.- 4.5.2 Effect of Initial Values.- 4.6 Tools to Control Complexity.- 4.6.1 The Number of Iterations.- 4.6.2 The Weight Decay Term.- 4.6.3 The Antiregularisation Technique.- 4.6.4 Noise Injection.- 4.6.4.1 Noise Injection into Inputs.- 4.6.4.2 Noise Injection into the Weights and into the Outputs of the Network.- 4.6.4.3 "Coloured" Noise Injection into Inputs.- 4.6.5 Control of Target Values.- 4.6.6 The Learning Step.- 4.6.7 Optimal Values of the Training Parameters.- 4.6.8 Learning Step in the Hidden Layer 
of MLP.- 4.6.9 Sigmoid Scaling.- 4.7 The Co-Operation of the Neural Networks.- 4.7.1 The Boss Decision Rule.- 4.7.2 Small Sample Problems and Regularisation.- 4.8 Bibliographical and Historical Remarks.- 5. Integration of Statistical and Neural Approaches.- 5.1 Statistical Methods or Neural Nets?.- 5.2 Positive and Negative Attributes of Statistical Pattern Recognition.- 5.3 Positive and Negative Attributes of Artificial Neural Networks.- 5.4 Merging Statistical Classifiers and Neural Networks.- 5.4.1 Three Key Points in the Solution.- 5.4.2 Data Transformation or Statistical Classifier?.- 5.4.3 The Training Speed and Data Whitening Transformation.- 5.4.4 Dynamics of the Classifier after the Data Whitening Transformation.- 5.5 Data Transformations for the Integrated Approach.- 5.5.1 Linear Transformations.- 5.5.2 Non-linear Transformations.- 5.5.3 Performance of the Integrated Classifiers in Solving Real-World Problems.- 5.6 The Statistical Approach in Multilayer Feed-forward Networks.- 5.7 Concluding and Bibliographical Remarks.- 6.
Model Selection.- 6.1 Classification Errors and their Estimation Methods.- 6.1.1 Types of Classification Error.- 6.1.2 Taxonomy of Error Rate Estimation Methods.- 6.1.2.1 Methods for Splitting the Design Set into Training and Validation Sets.- 6.1.2.2 Practical Aspects of using the Leave-One-Out Method.- 6.1.2.3 Pattern Error Functions.- 6.2 Simplified Performance Measures.- 6.2.1 Performance Criteria for Feature Extraction.- 6.2.1.1 Unsupervised Feature Extraction.- 6.2.1.2 Supervised Feature Extraction.- 6.2.2 Performance Criteria for Feature Selection.- 6.2.3 Feature Selection Strategies.- 6.3 Accuracy of Performance Estimates.- 6.3.1 Error Counting Estimates.- 6.3.1.1 The Hold-Out Method.- 6.3.1.2 The Resubstitution Estimator.- 6.3.1.3 The Leaving-One-Out Estimator.- 6.3.1.4 The Bootstrap Estimator.- 6.3.2 Parametric Estimators for the Linear Fisher Classifier.- 6.3.3 Associations Between the Classification Performance Measures.- 6.4 Feature Ranking and the Optimal Number of Features.- 6.4.1 The Complexity of the Classifiers.- 6.4.2 Feature Ranking.- 6.4.3 Determining the Optimal Number of Features.- 6.5 The Accuracy of the Model Selection.- 6.5.1 True, Apparent and Ideal Classification Errors.- 6.5.2 An Effect of the Number of Variants.- 6.5.3 Evaluation of the Bias.- 6.6 Additional Bibliographical Remarks.- Appendices.- A.1 Elements of Matrix Algebra.- A.2 The First Order Tree Type Dependence Model.- A.3 Temporal Dependence Models.- A.4 Pikelis Algorithm for Evaluating Means and Variances of the True, Apparent and Ideal Errors in Model Selection.- A.5 Matlab Codes (the Non-Linear SLP Training, the First Order Tree Dependence Model, and Data Whitening Transformation).- References.
Library of Congress Classification
Class number
TK7882.P3
Book number
S278 2001
Personal name as main entry (primary intellectual responsibility)