Synthesis lectures on artificial intelligence and machine learning,
Volume Designation
#38
ISSN of Series
1939-4616 ;
INTERNAL BIBLIOGRAPHIES/INDEXES NOTE
Text of Note
Includes bibliographical references (pages 159-186).
CONTENTS NOTE
Text of Note
1. Introduction -- 1.1 Classic machine learning paradigm -- 1.2 Motivating examples -- 1.3 A brief history of lifelong learning -- 1.4 Definition of lifelong learning -- 1.5 Types of knowledge and key challenges -- 1.6 Evaluation methodology and role of big data -- 1.7 Outline of the book -- 2. Related learning paradigms -- 2.1 Transfer learning -- 2.1.1 Structural correspondence learning -- 2.1.2 Naïve Bayes transfer classifier -- 2.1.3 Deep learning in transfer learning -- 2.1.4 Difference from lifelong learning -- 2.2 Multi-task learning -- 2.2.1 Task relatedness in multi-task learning -- 2.2.2 GO-MTL: multi-task learning using latent basis -- 2.2.3 Deep learning in multi-task learning -- 2.2.4 Difference from lifelong learning -- 2.3 Online learning -- 2.3.1 Difference from lifelong learning -- 2.4 Reinforcement learning -- 2.4.1 Difference from lifelong learning -- 2.5 Meta learning -- 2.5.1 Difference from lifelong learning -- 2.6 Summary -- 3. Lifelong supervised learning -- 3.1 Definition and overview -- 3.2 Lifelong memory-based learning -- 3.2.1 Two memory-based learning methods -- 3.2.2 Learning a new representation for lifelong learning -- 3.3 Lifelong neural networks -- 3.3.1 MTL net -- 3.3.2 Lifelong EBNN -- 3.4 ELLA: an efficient lifelong learning algorithm -- 3.4.1 Problem setting -- 3.4.2 Objective function -- 3.4.3 Dealing with the first inefficiency -- 3.4.4 Dealing with the second inefficiency -- 3.4.5 Active task selection -- 3.5 Lifelong naive Bayesian classification -- 3.5.1 Naïve Bayesian text classification -- 3.5.2 Basic ideas of LSC -- 3.5.3 LSC technique -- 3.5.4 Discussions -- 3.6 Domain word embedding via meta-learning -- 3.7 Summary and evaluation datasets -- 4. 
Continual learning and catastrophic forgetting -- 4.1 Catastrophic forgetting -- 4.2 Continual learning in neural networks -- 4.3 Learning without forgetting -- 4.4 Progressive neural networks -- 4.5 Elastic weight consolidation -- 4.6 iCaRL: incremental classifier and representation learning -- 4.6.1 Incremental training -- 4.6.2 Updating representation -- 4.6.3 Constructing exemplar sets for new classes -- 4.6.4 Performing classification in iCaRL -- 4.7 Expert gate -- 4.7.1 Autoencoder gate -- 4.7.2 Measuring task relatedness for training -- 4.7.3 Selecting the most relevant expert for testing -- 4.7.4 Encoder-based lifelong learning -- 4.8 Continual learning with generative replay -- 4.8.1 Generative adversarial networks -- 4.8.2 Generative replay -- 4.9 Evaluating catastrophic forgetting -- 4.10 Summary and evaluation datasets -- 5. Open-world learning -- 5.1 Problem definition and applications -- 5.2 Center-based similarity space learning -- 5.2.1 Incrementally updating a CBS learning model -- 5.2.2 Testing a CBS learning model -- 5.2.3 CBS learning for unseen class detection -- 5.3 DOC: deep open classification -- 5.3.1 Feed-forward layers and the 1-vs.-rest layer -- 5.3.2 Reducing open-space risk -- 5.3.3 DOC for image classification -- 5.3.4 Unseen class discovery -- 5.4 Summary and evaluation datasets -- 6. Lifelong topic modeling -- 6.1 Main ideas of lifelong topic modeling -- 6.2 LTM: a lifelong topic model -- 6.2.1 LTM model -- 6.2.2 Topic knowledge mining -- 6.2.3 Incorporating past knowledge -- 6.2.4 Conditional distribution of Gibbs sampler -- 6.3 AMC: a lifelong topic model for small data -- 6.3.1 Overall algorithm of AMC -- 6.3.2 Mining must-link knowledge -- 6.3.3 Mining cannot-link knowledge -- 6.3.4 Extended Pólya Urn model -- 6.3.5 Sampling distributions in Gibbs sampler -- 6.4 Summary and evaluation datasets -- 7. 
Lifelong information extraction -- 7.1 NELL: a never-ending language learner -- 7.1.1 NELL architecture -- 7.1.2 Extractors and learning in NELL -- 7.1.3 Coupling constraints in NELL -- 7.2 Lifelong opinion target extraction -- 7.2.1 Lifelong learning through recommendation -- 7.2.2 AER algorithm -- 7.2.3 Knowledge learning -- 7.2.4 Recommendation using past knowledge -- 7.3 Learning on the job -- 7.3.1 Conditional random fields -- 7.3.2 General dependency feature -- 7.3.3 The L-CRF algorithm -- 7.4 Lifelong-RL: lifelong relaxation labeling -- 7.4.1 Relaxation labeling -- 7.4.2 Lifelong relaxation labeling -- 7.5 Summary and evaluation datasets -- 8. Continuous knowledge learning in chatbots -- 8.1 LiLi: lifelong interactive learning and inference -- 8.2 Basic ideas of LiLi -- 8.3 Components of LiLi -- 8.4 A running example -- 8.5 Summary and evaluation datasets -- 9. Lifelong reinforcement learning -- 9.1 Lifelong reinforcement learning through multiple environments -- 9.1.1 Acquiring and incorporating bias -- 9.2 Hierarchical Bayesian lifelong reinforcement learning -- 9.2.1 Motivation -- 9.2.2 Hierarchical Bayesian approach -- 9.2.3 MTRL algorithm -- 9.2.4 Updating hierarchical model parameters -- 9.2.5 Sampling an MDP -- 9.3 PG-ELLA: lifelong policy gradient reinforcement learning -- 9.3.1 Policy gradient reinforcement learning -- 9.3.2 Policy gradient lifelong learning setting -- 9.3.3 Objective function and optimization -- 9.3.4 Safe policy search for lifelong learning -- 9.3.5 Cross-domain lifelong reinforcement learning -- 9.4 Summary and evaluation datasets -- 10. Conclusion and future directions -- Bibliography -- Authors' biographies.
SUMMARY OR ABSTRACT
Text of Note
"This is an introduction to an advanced machine learning paradigm that continuously learns by accumulating past knowledge that it then uses in future learning and problem solving. In contrast, the current dominant machine learning paradigm learns in isolation: given a training dataset, it runs a machine learning algorithm on the dataset to produce a model that is then used in its intended application. It makes no attempt to retain the learned knowledge and use it in subsequent learning. Unlike this isolated system, humans learn effectively with only a few examples precisely because our learning is very knowledge-driven: the knowledge learned in the past helps us learn new things with little data or effort. Lifelong learning aims to emulate this capability, because without it, an AI system cannot be considered truly intelligent. Research in lifelong learning has developed significantly in the relatively short time since the first edition of this book was published. The purpose of this second edition is to expand the definition of lifelong learning, update the content of several chapters, and add a new chapter about continual learning in deep neural networks--which has been actively researched over the past two or three years. A few chapters have also been reorganized to make each of them more coherent for the reader. Moreover, the authors want to propose a unified framework for the research area. Currently, there are several research topics in machine learning that are closely related to lifelong learning--most notably, multi-task learning, transfer learning, and metalearning--because they also employ the idea of knowledge sharing and transfer. This book brings all these topics under one roof and discusses their similarities and differences. Its goal is to introduce this emerging machine learning paradigm and present a comprehensive survey and review of the important research results and latest ideas in the area. 
This book is thus suitable for students, researchers, and practitioners who are interested in machine learning, data mining, natural language processing, or pattern recognition. Lecturers can readily use the book for courses in any of these related fields."--Provided by publisher.