Stanford University Machine Learning, Lecture 14


Dimensionality Reduction
Motivation I: Data Compression

Reduce data from 2D to 1D (figure axes: inches, cm).
Reduce data from 3D to 2D.

Dimensionality Reduction
Motivation II: Data Visualization

| Country | GDP (trillions of US$) | Per capita GDP (thousands of intl. $) | Human Development Index | Life expectancy | Poverty Index (Gini as percentage) | Mean household income (thousands of US$) |
| Canada | 1.577 | 39.17 | 0.908 | 80.7 | 32.6 | 67.293 |
| China | 5.878 | 7.54 | 0.687 | 73 | 46.9 | 10.22 |
| India | 1.632 | 3.41 | 0.547 | 64.7 | 36.8 | 0.735 |
| Russia | 1.48 | 19.84 | 0.755 | 65.5 | 39.9 | 0.72 |
| Singapore | 0.223 | 56.69 | 0.866 | 80 | 42.5 | 67.1 |
| USA | 14.527 | 46.86 | 0.91 | 78.3 | 40.8 | 84.3 |

Dimensionality Reduction
Principal Component Analysis problem formulation

Reduce from 2-dimension to 1-dimension: Find a direction (a vector $u^{(1)} \in \mathbb{R}^n$) onto which to project the data so as to minimize the projection error.
Reduce from n-dimension to k-dimension: Find k vectors $u^{(1)}, u^{(2)}, \dots, u^{(k)}$ onto which to project the data, so as to minimize the projection error.

Dimensionality Reduction
Principal Component Analysis algorithm

Data preprocessing
Training set: $x^{(1)}, x^{(2)}, \dots, x^{(m)}$
Preprocessing (feature scaling/mean normalization): compute $\mu_j = \frac{1}{m} \sum_{i=1}^{m} x_j^{(i)}$ and replace each $x_j^{(i)}$ with $x_j^{(i)} - \mu_j$.
If different features are on different scales (e.g., size of house, number of bedrooms), scale features to have a comparable range of values.

Principal Component Analysis (PCA) algorithm
Reduce data from n-dimensions to k-dimensions.
Compute the "covariance matrix": $\Sigma = \frac{1}{m} \sum_{i=1}^{m} x^{(i)} (x^{(i)})^T$
Compute the "eigenvectors" of matrix $\Sigma$:
    [U,S,V] = svd(Sigma);
From [U,S,V] = svd(Sigma), we get the n x n matrix U whose columns are $u^{(1)}, \dots, u^{(n)}$. Its first k columns are the directions onto which the data is projected.

Principal Component Analysis (PCA) algorithm summary
After mean normalization (ensure every feature has zero mean) and optionally feature scaling:
    Sigma = $\frac{1}{m} \sum_{i=1}^{m} x^{(i)} (x^{(i)})^T$
    [U,S,V] = svd(Sigma);
    Ureduce = U(:,1:k);
    z = Ureduce'*x;

Dimensionality Reduction
Reconstruction from compressed representation

Dimensionality Reduction
Choosing the number of principal components

Average squared projection error: $\frac{1}{m} \sum_{i=1}^{m} \| x^{(i)} - x_{approx}^{(i)} \|^2$, where $x_{approx}^{(i)}$ is the reconstruction of $x^{(i)}$ from its projection.
Total variation in the data: $\frac{1}{m} \sum_{i=1}^{m} \| x^{(i)} \|^2$
Typically, choose k to be the smallest value so that the ratio of the average squared projection error to the total variation is at most 0.01 (1%), i.e. "99% of variance is retained".

Algorithm:
Try PCA with k = 1, 2, ...
Compute Ureduce, $z^{(1)}, \dots, z^{(m)}$, and $x_{approx}^{(1)}, \dots, x_{approx}^{(m)}$.
Check if the ratio above is at most 0.01.

Alternatively, call [U,S,V] = svd(Sigma) once and pick the smallest value of k for which $\frac{\sum_{i=1}^{k} S_{ii}}{\sum_{i=1}^{n} S_{ii}} \ge 0.99$ (99% of variance retained).

Dimensionality Reduction
Advice for applying PCA

Supervised learning speedup
Training set: $(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})$
Extract inputs: Unlabeled dataset: $x^{(1)}, \dots, x^{(m)}$, which PCA maps to $z^{(1)}, \dots, z^{(m)}$
New training set: $(z^{(1)}, y^{(1)}), \dots, (z^{(m)}, y^{(m)})$
Note: The mapping $x^{(i)} \to z^{(i)}$ should be defined by running PCA only on the training set. This mapping can be applied as well to the examples $x_{cv}^{(i)}$ and $x_{test}^{(i)}$ in the cross validation and test sets.

Application of PCA
- Compression
  - Reduce memory/disk needed to store data
  - Speed up learning algorithm
- Visualization

Bad use of PCA: To prevent overfitting
Use $z^{(i)}$ instead of $x^{(i)}$ to reduce the number of features to $k < n$. Thus, fewer features, less likely to overfit.
This might work OK, but isn't a good way to address overfitting. Use regularization instead.

PCA is sometimes used where it shouldn't be
Design of ML system:
- Get training set $\{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}$
- Run PCA to reduce $x^{(i)}$ in dimension to get $z^{(i)}$
- Train logistic regression on $\{(z^{(1)}, y^{(1)}), \dots, (z^{(m)}, y^{(m)})\}$
- Test on test set: Map $x_{test}^{(i)}$ to $z_{test}^{(i)}$. Run $h_\theta(z)$ on $\{(z_{test}^{(1)}, y_{test}^{(1)}), \dots\}$
How about doing the whole thing without using PCA?
Before impl
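
The PCA steps in these slides (mean normalization, covariance matrix, svd, projection onto the first k columns of U, reconstruction, and choosing k by retained variance) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses Python with NumPy rather than the Octave calls shown in the slides, the function names `pca_fit`, `choose_k`, `project` and `reconstruct` are hypothetical, the synthetic inches/cm data is made up for the demo, and adding the mean back during reconstruction is a convenience choice not stated in the slides.

```python
import numpy as np

def pca_fit(X):
    """Fit PCA on the rows of X (shape m x n); hypothetical helper."""
    mu = X.mean(axis=0)                  # per-feature mean
    Xc = X - mu                          # mean normalization
    Sigma = (Xc.T @ Xc) / X.shape[0]     # n x n covariance matrix
    U, S, _ = np.linalg.svd(Sigma)       # columns of U: principal directions
    return mu, U, S

def choose_k(S, retained=0.99):
    """Smallest k whose leading values of S keep `retained` of the variance."""
    ratio = np.cumsum(S) / np.sum(S)
    k = int(np.searchsorted(ratio, retained)) + 1
    return min(k, len(S))                # guard against rounding at the top end

def project(X, mu, U, k):
    """z = Ureduce' * x for every row of X, using the training-set mean."""
    return (X - mu) @ U[:, :k]

def reconstruct(Z, mu, U, k):
    """x_approx = Ureduce * z, then undo the centering (an assumption here)."""
    return Z @ U[:, :k].T + mu

# Synthetic, made-up data: one length recorded in inches and (noisily) in cm.
rng = np.random.default_rng(0)
inches = rng.normal(size=200)
cm = 2.54 * inches + 0.01 * rng.normal(size=200)
X_train = np.column_stack([inches, cm])
X_test = np.column_stack([[1.0, 2.0], [2.54, 5.08]])

mu, U, S = pca_fit(X_train)              # mapping defined on the training set only
k = choose_k(S)
Z_train = project(X_train, mu, U, k)
Z_test = project(X_test, mu, U, k)       # same mapping reused for held-out data
X_approx = reconstruct(Z_train, mu, U, k)

# Ratio of average squared projection error to total variation.
err = np.mean(np.sum((X_train - X_approx) ** 2, axis=1))
total = np.mean(np.sum((X_train - mu) ** 2, axis=1))
err_ratio = err / total
```

Because the two synthetic features are almost perfectly correlated, a single component should retain well over 99% of the variance here. `err_ratio` is the quantity the slides compare against 0.01, and it should agree with the share of `S` left out of the first `k` entries.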
