Machine Learning, Lecture 12: Regularization and Model Selection (download)
1努力加油1
2019-03-05 06:56:30
Introduces some feature-selection methods in machine learning, as well as methods for evaluating learned models.
Related download link:
//download.csdn.net/download/shinian1987/8542313?utm_source=bbsseo
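For readers who want to try the lecture's central technique right away, here is a minimal sketch of choosing a regularization strength by k-fold cross-validation. It is illustrative only (not taken from the lecture notes): it assumes scikit-learn is installed, and the dataset and candidate C values are placeholders.

```python
# Model selection sketch: pick the inverse regularization strength C
# for logistic regression by 5-fold cross-validation.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)  # any labeled dataset works

best_C, best_score = None, -np.inf
for C in [0.01, 0.1, 1.0, 10.0]:            # candidate regularization levels
    model = make_pipeline(StandardScaler(),
                          LogisticRegression(C=C, max_iter=1000))
    score = cross_val_score(model, X, y, cv=5).mean()  # mean 5-fold accuracy
    if score > best_score:
        best_C, best_score = C, score

print(f"selected C={best_C} with mean CV accuracy {best_score:.3f}")
```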
Stanford University Machine Learning open-course slides (.rar)
Lecture slides from Stanford University's machine learning open course:
- Lecture notes 1 (ps) (pdf): Supervised Learning, Discriminative Algorithms
- Lecture notes 2 (ps) (pdf): Generative Algorithms
- Lecture notes 3 (ps) (pdf): Support Vector Machines
- Lecture notes 4 (ps) (pdf): Learning Theory
- Lecture notes 5 (ps) (pdf): Regularization and Model Selection
- Lecture notes 6 (ps) (pdf): Online Learning and the Perceptron Algorithm (optional reading)
- Lecture notes 7a (ps) (pdf): Unsupervised Learning, k-means clustering
- Lecture notes 7b (ps) (pdf): Mixture of Gaussians
- Lecture notes 8 (ps) (pdf): The EM Algorithm
- Lecture notes 9 (ps) (pdf): Factor Analysis
- Lecture notes 10 (ps) (pdf): Principal Components Analysis
- Lecture notes 11 (ps) (pdf): Independent Components Analysis
- Lecture notes 12 (ps) (pdf): Reinforcement Learning and Control
Section notes:
- Section notes 1 (pdf): Linear Algebra Review and Reference
- Section notes 2 (pdf): Probability Theory Review
- Files for the Matlab tutorial: sigmoid.m, logistic_grad_ascent.m, matlab_session.m
- Section notes 4 (ps) (pdf): Convex Optimization Overview, Part I
- Section notes 5 (ps) (pdf): Convex Optimization Overview, Part II
- Section notes 6 (ps) (pdf): Hidden Markov Models
- Section notes 7 (pdf): The Multivariate Gaussian Distribution
- Section notes 8 (pdf): More on Gaussian Distribution
- Section notes 9 (pdf): Gaussian Processes
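The Matlab tutorial files listed above (sigmoid.m, logistic_grad_ascent.m) are not reproduced here, but the technique they implement, gradient ascent on the logistic regression log-likelihood, can be sketched in Python. This is an assumed equivalent, not a translation of the actual .m files; the synthetic data, learning rate, and iteration count are illustrative.

```python
# Batch gradient ascent on the logistic regression log-likelihood.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def logistic_grad_ascent(X, y, lr=0.1, n_iters=1000):
    """Maximize sum_i y_i*log h(x_i) + (1-y_i)*log(1-h(x_i))."""
    theta = np.zeros(X.shape[1])
    for _ in range(n_iters):
        h = sigmoid(X @ theta)
        theta += lr * X.T @ (y - h) / len(y)  # gradient of the log-likelihood
    return theta

# Synthetic data with a bias column, labels drawn from the true model.
rng = np.random.default_rng(0)
X = np.hstack([np.ones((200, 1)), rng.normal(size=(200, 2))])
true_theta = np.array([-0.5, 2.0, -1.0])
y = (sigmoid(X @ true_theta) > rng.random(200)).astype(float)
print(logistic_grad_ascent(X, y))  # should land near true_theta
```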
Python Machine Learning By Example-Packt Publishing(2017).epub
Data science and machine learning are among the top buzzwords in the technical world today. The resurging interest in machine learning is due to the same factors that have made data mining and Bayesian analysis more popular than ever. This book is your entry point to machine learning.
- Chapter 1, Getting Started with Python and Machine Learning, is the starting point for anyone looking to enter the field of ML with Python. You will get familiar with the basics of Python and ML in this chapter and set up the software on your machine.
- Chapter 2, Exploring the 20 Newsgroups Dataset with Text Analysis Algorithms, explains important concepts such as getting the data, its features, and pre-processing. It also covers the dimension-reduction technique principal component analysis and the k-nearest neighbors algorithm.
- Chapter 3, Spam Email Detection with Naive Bayes, covers classification, naive Bayes and its in-depth implementation, classification performance evaluation, model selection and tuning, and cross-validation. Examples such as spam email detection are demonstrated (see the sketch after this list).
- Chapter 4, News Topic Classification with Support Vector Machine, covers multiclass classification, Support Vector Machines, and how they are applied in topic classification. Other important concepts, such as kernel machines, overfitting, and regularization, are discussed as well.
- Chapter 5, Click-Through Prediction with Tree-Based Algorithms, explains decision trees and random forests in depth over the course of solving an advertising click-through rate problem.
- Chapter 6, Click-Through Prediction with Logistic Regression, explains the logistic regression classifier in depth, along with concepts such as categorical-variable encoding, L1 and L2 regularization, feature selection, online learning, and stochastic gradient descent.
- Chapter 7, Stock Price Prediction with Regression Algorithms, analyzes predicting stock market prices using Yahoo/Google Finance data and maybe addit…
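As a taste of the Chapter 3 material, here is a minimal naive Bayes spam-detection sketch. It uses scikit-learn rather than the book's own code, and the four-email corpus and labels are invented for illustration.

```python
# Bag-of-words naive Bayes spam classifier on a toy corpus.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

emails = ["win a free prize now", "meeting at noon tomorrow",
          "free money click now", "lunch with the project team"]
labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham

vec = CountVectorizer()
X = vec.fit_transform(emails)          # word-count features
clf = MultinomialNB().fit(X, labels)   # P(word|class) with Laplace smoothing
print(clf.predict(vec.transform(["free prize meeting"])))
```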
Gaussian Processes for Machine Learning
This is a book on Gaussian process regression and classification; anyone interested in newer machine learning methods may find it worth a look. (A minimal GP-regression code sketch follows the table of contents.)
Contents
Series Foreword
Preface
Symbols and Notation
1 Introduction
  1.1 A Pictorial Introduction to Bayesian Modelling
  1.2 Roadmap
2 Regression
  2.1 Weight-space View
    2.1.1 The Standard Linear Model
    2.1.2 Projections of Inputs into Feature Space
  2.2 Function-space View
  2.3 Varying the Hyperparameters
  2.4 Decision Theory for Regression
  2.5 An Example Application
  2.6 Smoothing, Weight Functions and Equivalent Kernels
  2.7 Incorporating Explicit Basis Functions
    2.7.1 Marginal Likelihood
  2.8 History and Related Work
  2.9 Exercises
3 Classification
  3.1 Classification Problems
    3.1.1 Decision Theory for Classification
  3.2 Linear Models for Classification
  3.3 Gaussian Process Classification
  3.4 The Laplace Approximation for the Binary GP Classifier
    3.4.1 Posterior
    3.4.2 Predictions
    3.4.3 Implementation
    3.4.4 Marginal Likelihood
  3.5 Multi-class Laplace Approximation
    3.5.1 Implementation
  3.6 Expectation Propagation
    3.6.1 Predictions
    3.6.2 Marginal Likelihood
    3.6.3 Implementation
  3.7 Experiments
    3.7.1 A Toy Problem
    3.7.2 One-dimensional Example
    3.7.3 Binary Handwritten Digit Classification Example
    3.7.4 10-class Handwritten Digit Classification Example
  3.8 Discussion
  3.9 Appendix: Moment Derivations
  3.10 Exercises
4 Covariance Functions
  4.1 Preliminaries
    4.1.1 Mean Square Continuity and Differentiability
  4.2 Examples of Covariance Functions
    4.2.1 Stationary Covariance Functions
    4.2.2 Dot Product Covariance Functions
    4.2.3 Other Non-stationary Covariance Functions
    4.2.4 Making New Kernels from Old
  4.3 Eigenfunction Analysis of Kernels
    4.3.1 An Analytic Example
    4.3.2 Numerical Approximation of Eigenfunctions
  4.4 Kernels for Non-vectorial Inputs
    4.4.1 String Kernels
    4.4.2 Fisher Kernels
  4.5 Exercises
5 Model Selection and Adaptation of Hyperparameters
  5.1 The Model Selection Problem
  5.2 Bayesian Model Selection
  5.3 Cross-validation
  5.4 Model Selection for GP Regression
    5.4.1 Marginal Likelihood
    5.4.2 Cross-validation
    5.4.3 Examples and Discussion
  5.5 Model Selection for GP Classification
    5.5.1 Derivatives of the Marginal Likelihood for Laplace's Approximation
    5.5.2 Derivatives of the Marginal Likelihood for EP
    5.5.3 Cross-validation
    5.5.4 Example
  5.6 Exercises
6 Relationships between GPs and Other Models
  6.1 Reproducing Kernel Hilbert Spaces
  6.2 Regularization
    6.2.1 Regularization Defined by Differential Operators
    6.2.2 Obtaining the Regularized Solution
    6.2.3 The Relationship of the Regularization View to Gaussian Process Prediction
  6.3 Spline Models
    6.3.1 A 1-d Gaussian Process Spline Construction
  6.4 Support Vector Machines
    6.4.1 Support Vector Classification
    6.4.2 Support Vector Regression
  6.5 Least-squares Classification
    6.5.1 Probabilistic Least-squares Classification
  6.6 Relevance Vector Machines
  6.7 Exercises
7 Theoretical Perspectives
  7.1 The Equivalent Kernel
    7.1.1 Some Specific Examples of Equivalent Kernels
  7.2 Asymptotic Analysis
    7.2.1 Consistency
    7.2.2 Equivalence and Orthogonality
  7.3 Average-case Learning Curves
  7.4 PAC-Bayesian Analysis
    7.4.1 The PAC Framework
    7.4.2 PAC-Bayesian Analysis
    7.4.3 PAC-Bayesian Analysis of GP Classification
  7.5 Comparison with Other Supervised Learning Methods
  7.6 Appendix: Learning Curve for the Ornstein-Uhlenbeck Process
  7.7 Exercises
8 Approximation Methods for Large Datasets
  8.1 Reduced-rank Approximations of the Gram Matrix
  8.2 Greedy Approximation
  8.3 Approximations for GPR with Fixed Hyperparameters
    8.3.1 Subset of Regressors
    8.3.2 The Nyström Method
    8.3.3 Subset of Datapoints
    8.3.4 Projected Process Approximation
    8.3.5 Bayesian Committee Machine
    8.3.6 Iterative Solution of Linear Systems
    8.3.7 Comparison of Approximate GPR Methods
  8.4 Approximations for GPC with Fixed Hyperparameters
  8.5 Approximating the Marginal Likelihood and its Derivatives
  8.6 Appendix: Equivalence of SR and GPR Using the Nyström Approximate Kernel
  8.7 Exercises
9 Further Issues and Conclusions
  9.1 Multiple Outputs
  9.2 Noise Models with Dependencies
  9.3 Non-Gaussian Likelihoods
  9.4 Derivative Observations
  9.5 Prediction with Uncertain Inputs
  9.6 Mixtures of Gaussian Processes
  9.7 Global Optimization
  9.8 Evaluation of Integrals
  9.9 Student's t Process
  9.10 Invariances
  9.11 Latent Variable Models
  9.12 Conclusions and Future Directions
Appendix A Mathematical Background
  A.1 Joint, Marginal and Conditional Probability
  A.2 Gaussian Identities
  A.3 Matrix Identities
    A.3.1 Matrix Derivatives
    A.3.2 Matrix Norms
  A.4 Cholesky Decomposition
  A.5 Entropy and Kullback-Leibler Divergence
  A.6 Limits
  A.7 Measure and Integration
    A.7.1 Lp Spaces
  A.8 Fourier Transforms
  A.9 Convexity
Appendix B Gaussian Markov Processes
  B.1 Fourier Analysis
    B.1.1 Sampling and Periodization
  B.2 Continuous-time Gaussian Markov Processes
    B.2.1 Continuous-time GMPs on R
    B.2.2 The Solution of the Corresponding SDE on the Circle
  B.3 Discrete-time Gaussian Markov Processes
    B.3.1 Discrete-time GMPs on Z
    B.3.2 The Solution of the Corresponding Difference Equation on PN
  B.4 The Relationship Between Discrete-time and Sampled Continuous-time GMPs
  B.5 Markov Processes in Higher Dimensions
Appendix C Datasets and Code
Bibliography
Author Index
Subject Index
Sections marked by an asterisk contain advanced material that may be omitted on a first reading.
C. E. Rasmussen & C. K. I. Williams, Gaussian Processes for Machine Learning, The MIT Press, 2006, ISBN 026218253X. © 2006 Massachusetts Institute of Technology. www.GaussianProcess.org/gpml
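As a small companion to the book's opening chapters, here is a minimal NumPy sketch in the style of its Cholesky-based GP regression predictor (Algorithm 2.1): posterior mean, pointwise variance, and log marginal likelihood under an RBF kernel. The kernel, hyperparameters, noise level, and data below are illustrative choices, not taken from the book.

```python
# GP regression: predict at test inputs Xs from noisy training data (X, y).
import numpy as np

def rbf(A, B, ell=1.0, sf=1.0):
    """Squared-exponential kernel with lengthscale ell and signal scale sf."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf**2 * np.exp(-0.5 * d2 / ell**2)

def gp_predict(X, y, Xs, sigma_n=0.1):
    K = rbf(X, X) + sigma_n**2 * np.eye(len(X))   # noisy train covariance
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # K^{-1} y via Cholesky
    Ks = rbf(X, Xs)
    mu = Ks.T @ alpha                              # posterior mean
    v = np.linalg.solve(L, Ks)
    var = np.diag(rbf(Xs, Xs)) - (v**2).sum(0)     # pointwise posterior variance
    lml = (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
           - 0.5 * len(X) * np.log(2 * np.pi))     # log marginal likelihood
    return mu, var, lml

X = np.linspace(0, 5, 20)[:, None]
y = np.sin(X).ravel() + 0.1 * np.random.default_rng(0).normal(size=20)
mu, var, lml = gp_predict(X, y, np.linspace(0, 5, 50)[:, None])
print(lml)
```

The log marginal likelihood computed here is exactly the quantity Chapter 5 uses for model selection: maximizing it over ell, sf, and sigma_n tunes the hyperparameters.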