




Gradient, Directional Derivatives

Gradient Descent (GD)
- Also known as steepest descent (SD)
- Goal: minimize a function iteratively based on its gradient
- Formula for GD: vanilla version, normalized version, and version with a momentum term (the slide images are reconstructed in the sketch below)
- Step size, or learning rate
- Quiz! Vanilla GD or ...
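The update-rule images from the original slides are not reproduced in this dump; below is a reconstruction of the standard forms they refer to, with η the step size and β an assumed momentum coefficient:

```latex
\begin{align*}
\theta_{k+1} &= \theta_k - \eta\,\nabla f(\theta_k)                         && \text{vanilla GD} \\
\theta_{k+1} &= \theta_k - \eta\,\frac{\nabla f(\theta_k)}{\lVert\nabla f(\theta_k)\rVert} && \text{normalized gradient} \\
\Delta\theta_k &= -\eta\,\nabla f(\theta_k) + \beta\,\Delta\theta_{k-1},
\quad \theta_{k+1} = \theta_k + \Delta\theta_k                              && \text{with momentum}
\end{align*}
```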
Example of Single-Input Functions
- If n = 1, GD reduces to the problem of going left or right (see the sketch after this list).
- Example; Animation: /?p=gradient.descent
- Each point/region with zero gradient has a basin of attraction (Basin of Attraction in 1D)
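A minimal 1-D sketch (the objective f(x) = x^2 - 4x, the starting point, and the step size are assumptions for illustration, not from the slides): the sign of the derivative decides whether to step left or right.

```matlab
% 1-D gradient descent: the sign of f'(x) decides "left or right".
f      = @(x) x.^2 - 4*x;       % assumed example objective
fprime = @(x) 2*x - 4;          % its derivative
x   = 5;                        % assumed starting point
eta = 0.1;                      % step size (learning rate)
for k = 1:100
    x = x - eta*fprime(x);      % step opposite to the derivative
end
fprintf('x = %.4f, f(x) = %.4f\n', x, f(x));   % approaches the minimizer x = 2
```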
Example of Two-Input Functions: “Peaks” Functions (1/2)
- If n = 2, GD needs to find a direction in the 2D plane.
- Example: the “peaks” function in MATLAB; Animation: gradientDescentDemo.m
- The gradient is perpendicular to the contours. Why?
- 3 local maxima, 3 local minima

“Peaks” Functions (2/2)
Gradient of the “peaks” function (a descent sketch using these derivatives follows below):
- dz/dx = -6*(1-x)*exp(-x^2-(y+1)^2) - 6*(1-x)^2*x*exp(-x^2-(y+1)^2) - 10*(1/5 - 3*x^2)*exp(-x^2-y^2) + 20*(1/5*x - x^3 - y^5)*x*exp(-x^2-y^2) - 1/3*(-2*x - 2)*exp(-(x+1)^2-y^2)
- dz/dy = 3*(1-x)^2*(-2*y - 2)*exp(-x^2-(y+1)^2) + 50*y^4*exp(-x^2-y^2) + 20*(1/5*x - x^3 - y^5)*y*exp(-x^2-y^2) + 2/3*y*exp(-(x+1)^2-y^2)
- d(dz/dx)/dx = 36*x*exp(-x^2-(y+1)^2) - 18*x^2*exp(-x^2-(y+1)^2) - 24*x^3*exp(-x^2-(y+1)^2) + 12*x^4*exp(-x^2-(y+1)^2) + 72*x*exp(-x^2-y^2) - 148*x^3*exp(-x^2-y^2) - 20*y^5*exp(-x^2-y^2) + 40*x^5*exp(-x^2-y^2) + 40*x^2*exp(-x^2-y^2)*y^5 - 2/3*exp(-(x+1)^2-y^2) - 4/3*exp(-(x+1)^2-y^2)*x^2 - 8/3*exp(-(x+1)^2-y^2)*x
- Each point/region with zero gradient has a basin of attraction (Basin of Attraction in 2D)
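The referenced gradientDescentDemo.m is not included in this dump; the sketch below runs plain gradient descent on the peaks surface using the partial derivatives listed above, with an assumed starting point and step size.

```matlab
% Gradient descent on the MATLAB "peaks" surface, using the dz/dx and dz/dy above.
z    = @(x,y) 3*(1-x).^2.*exp(-x.^2-(y+1).^2) ...
       - 10*(1/5*x - x.^3 - y.^5).*exp(-x.^2-y.^2) - 1/3*exp(-(x+1).^2-y.^2);
dzdx = @(x,y) -6*(1-x).*exp(-x.^2-(y+1).^2) - 6*(1-x).^2.*x.*exp(-x.^2-(y+1).^2) ...
       - 10*(1/5 - 3*x.^2).*exp(-x.^2-y.^2) + 20*(1/5*x - x.^3 - y.^5).*x.*exp(-x.^2-y.^2) ...
       - 1/3*(-2*x - 2).*exp(-(x+1).^2-y.^2);
dzdy = @(x,y) 3*(1-x).^2.*(-2*y - 2).*exp(-x.^2-(y+1).^2) + 50*y.^4.*exp(-x.^2-y.^2) ...
       + 20*(1/5*x - x.^3 - y.^5).*y.*exp(-x.^2-y.^2) + 2/3*y.*exp(-(x+1).^2-y.^2);

p   = [0; -2];      % assumed starting point (x; y)
eta = 0.02;         % assumed step size
for k = 1:500
    g = [dzdx(p(1), p(2)); dzdy(p(1), p(2))];   % gradient at the current point
    p = p - eta*g;                              % steepest-descent step
end
fprintf('reached (%.3f, %.3f), z = %.3f\n', p(1), p(2), z(p(1), p(2)));
```

The end point depends on which basin of attraction the starting point lies in, which is exactly the "Basin of Attraction in 2D" remark above.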
Rosenbrock Function
- The Rosenbrock function (the standard form is reconstructed below)
- More about this function; Animation: /?p=gradient.descent
- Document on how to optimize this function
- Justification for using momentum terms
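The slide does not spell out the function; the standard two-variable Rosenbrock ("banana") function and its gradient are:

```latex
\begin{align*}
f(x, y) &= (1 - x)^2 + 100\,(y - x^2)^2, \\
\nabla f(x, y) &= \begin{pmatrix} -2(1 - x) - 400\,x\,(y - x^2) \\ 200\,(y - x^2) \end{pmatrix},
\qquad f(1, 1) = 0 \ \text{(global minimum)}.
\end{align*}
```

Its long, curved, nearly flat valley makes plain GD zig-zag slowly, which is the usual justification for adding a momentum term.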
Properties of Gradient Descent

Properties
- No guarantee of reaching the global optimum
- Feasible for differentiable objective functions (which may have a finite number of non-differentiable points)
- Performance depends heavily on the starting point and the step size

Variants
- Use adaptive step sizes
- Normalize the gradient by its length
- Use a momentum term to reduce zig-zag paths (see the sketch after this list)
- Use line minimization at each iteration

Quiz!
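A sketch of the momentum variant on the Rosenbrock function above; the starting point, step size, and momentum coefficient are assumptions for illustration, not from the slides.

```matlab
% Gradient descent with a momentum term on the standard Rosenbrock function.
grad = @(p) [ -2*(1 - p(1)) - 400*p(1)*(p(2) - p(1)^2);   % df/dx
              200*(p(2) - p(1)^2) ];                      % df/dy
p    = [-1.5; 2];     % assumed starting point
eta  = 1e-4;          % step size
beta = 0.9;           % momentum coefficient
step = [0; 0];
for k = 1:100000
    step = -eta*grad(p) + beta*step;   % accumulated step damps the zig-zag
    p    = p + step;
end
fprintf('reached (%.4f, %.4f)\n', p(1), p(2));   % global minimum is at (1, 1)
```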
Comparisons of Gradient-based Optimization
- Gradient descent (GD): treat all parameters as nonlinear
- Hybrid learning of GD + LSE: distinguish between linear and nonlinear parameters
- Conjugate gradient descent: try to reach the minimizing point by assuming the objective function is quadratic
- Gauss-Newton (GN) method: linearize the objective function to treat all parameters as linear
- Levenberg-Marquardt (LM) method: switch smoothly between SD and GN

Gauss-Newton Method
- Synonyms: linearization method, extended Kalman filter method
- Concept: general nonlinear model y = f(x, θ); linearization at θ = θ_now:
  y = f(x, θ_now) + a1*(θ1 - θ1,now) + a2*(θ2 - θ2,now) + ...
- LSE solution (spelled out below): θ_next = θ_now + η (A^T A)^(-1) A^T B
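Spelled out under an assumed reading of the slide's notation: for training pairs (x_i, y_i), A is the Jacobian of the model with respect to the parameters at θ_now and B collects the residuals, so the linearized least-squares step is:

```latex
\begin{align*}
A_{ij} &= \left.\frac{\partial f(x_i, \theta)}{\partial \theta_j}\right|_{\theta = \theta_{\mathrm{now}}},
\qquad
B_i = y_i - f(x_i, \theta_{\mathrm{now}}), \\
\theta_{\mathrm{next}} &= \theta_{\mathrm{now}} + \eta\,(A^{\mathsf T}A)^{-1}A^{\mathsf T}B .
\end{align*}
```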
Levenberg-Marquardt Method
- Formula: θ_next = θ_now + η (A^T A + λI)^(-1) A^T B (a one-parameter sketch follows at the end of this section)
- Effects of λ: small λ → Gauss-Newton method; large λ → gradient descent
- How to update λ: greedy policy → make λ small; cautious policy → make λ large

Quiz!
- Can we use GD to find the minimum of f(x) = |x|?
- What is the gradient of the sigmoid function? Can you express the gradient using the original function?
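A minimal one-parameter sketch of the Levenberg-Marquardt update with the greedy/cautious λ policies described above; the model y = exp(-θx), the synthetic data, and all constants are assumptions for illustration.

```matlab
% One-parameter Levenberg-Marquardt: damped least-squares step plus lambda update.
x = (0:0.5:3)';  y = exp(-1.5*x);        % synthetic data, true theta = 1.5
theta = 0.5;  lambda = 0.01;  eta = 1;   % initial guess, damping, step size
for k = 1:50
    r  = y - exp(-theta*x);              % residual vector (B in the slide notation)
    A  = -x .* exp(-theta*x);            % Jacobian d f / d theta (column vector)
    dtheta = (A'*A + lambda) \ (A'*r);   % damped step; lambda*I is just lambda here
    if sum((y - exp(-(theta + dtheta)*x)).^2) < sum(r.^2)
        theta  = theta + eta*dtheta;     % step helps: accept it and ...
        lambda = lambda / 10;            % ... be greedy (move toward Gauss-Newton)
    else
        lambda = lambda * 10;            % step hurts: be cautious (move toward GD)
    end
end
fprintf('estimated theta = %.4f\n', theta);
```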