1 Introduction -- 2 Foundations of R -- 3 Managing Data in R -- 4 Data Visualization -- 5 Linear Algebra & Matrix Computing -- 6 Dimensionality Reduction -- 7 Lazy Learning: Classification Using Nearest Neighbors -- 8 Probabilistic Learning: Classification Using Naive Bayes -- 9 Decision Tree Divide and Conquer Classification -- 10 Forecasting Numeric Data Using Regression Models -- 11 Black Box Machine-Learning Methods: Neural Networks and Support Vector Machines -- 12 Apriori Association Rules Learning -- 13 k-Means Clustering -- 14 Model Performance Assessment -- 15 Improving Model Performance -- 16 Specialized Machine Learning Topics -- 17 Variable/Feature Selection -- 18 Regularized Linear Modeling and Controlled Variable Selection -- 19 Big Longitudinal Data Analysis -- 20 Natural Language Processing/Text Mining -- 21 Prediction and Internal Statistical Cross Validation -- 22 Function Optimization -- 23 Deep Learning Neural Networks -- 24 Summary -- 25 Glossary -- 26 Index -- 27 Errata.
Over the past decade, Big Data have become ubiquitous in all economic sectors, scientific disciplines, and human activities. They have led to striking technological advances, affecting all human experiences. Our ability to manage, understand, interrogate, and interpret such extremely large, multisource, heterogeneous, incomplete, multiscale, and incongruent data has not kept pace with the rapid increase in the volume, complexity, and proliferation of digital information. There are three reasons for this shortfall. First, the volume of data is increasing much faster than the corresponding rise in our computational processing power (Kryder's law > Moore's law). Second, traditional disciplinary boundaries inhibit expeditious progress. Third, our education and training activities have fallen behind the accelerated trend of scientific, information, and communication advances. There are very few rigorous instructional resources, interactive learning materials, and dynamic training environments that support active data science learning. The textbook balances the mathematical foundations with dexterous demonstrations and examples of data, tools, modules, and workflows that serve as pillars for the urgently needed bridge to close the supply-and-demand gap in predictive analytic skills. Exposing the enormous opportunities presented by the tsunami of Big Data, this textbook aims to identify specific knowledge gaps, educational barriers, and workforce readiness deficiencies. Specifically, it focuses on the development of a transdisciplinary curriculum integrating modern computational methods, advanced data science techniques, innovative biomedical applications, and impactful health analytics. The content of this graduate-level textbook fills a substantial gap in integrating modern engineering concepts, computational algorithms, mathematical optimization, statistical computing, and biomedical inference.
Big Data analytic techniques and predictive scientific methods demand broad transdisciplinary knowledge, appeal to an extremely wide spectrum of readers and learners, and provide incredible opportunities for engagement throughout the academy, industry, and regulatory and funding agencies.
Springer Nature
com.springer.onix. 9783319723471
9783319723464
Big data.
Mathematical statistics.
Medical records -- Data processing.
R (Computer program language)
Big Data.
Big Data/Analytics.
Data Mining and Knowledge Discovery.
Health Informatics.
Probability and Statistics in Computer Science.
Business & Economics -- Industries -- Computer Industry.