Floating-Point Arithmetic Unit (PPT Courseware)

Uploaded by: 海*** | IP location: Guangdong | Upload date: 2022-04-22 | Format: PPT | Pages: 48 | Size: 1.69 MB


Document summary

Floating-Point Arithmetic Unit (slides dated 2021/3/9)

Floating-Point Arithmetic
- Floating-Point Numbers
- IEEE 754 Floating-Point Standard
- Floating-Point Addition and Subtraction
- Floating-Point Multiplication

Floating-point number formats inside the computer

Floating-point number: $X = M_s\,E_s\,E_{m-1} \cdots E_1 E_0\,M_{-1} M_{-2} \cdots M_{-n}$

Format                     Sign bits   Exponent bits   Mantissa bits   Total bits
Short floating point       1           8               23              32
Long floating point        1           11              52              64
Temporary floating point   1           15              64              80

IEEE standard: the exponent is stored in biased (excess) code with base 2, and the mantissa is stored in sign-magnitude form, so that $X = M_X \times 2^{E_X}$.

The number of exponent bits determines the representable range; the number of mantissa bits determines the precision.

Floating-point numbers are a subset of the real numbers. Each one is a pure fraction multiplied by a power of the base. Inside the computer the pure fraction is called the mantissa. For a non-zero floating-point number, the absolute value of the mantissa must be $\ge 1/2$; a number that satisfies this requirement is said to be in normalized form. Turning a mantissa that does not satisfy the requirement into one that does is called normalization, and it is done by shifting the mantissa and adjusting the exponent accordingly.

Under the IEEE standard the mantissa is stored in sign-magnitude form: the sign bit $M_s$ is 0 for positive and 1 for negative, and the most significant magnitude bit $M_{-1}$ of a non-zero mantissa must be 1 for the number to be normalized. Because that bit is always 1 for a non-zero value, it is dropped before the number is written to memory, so the same number of mantissa bits holds one more binary digit and precision improves. This scheme is called the hidden bit technique. When such a number is loaded back into the arithmetic unit, the hidden bit must be restored before any operation is performed.

Floating Point

Floating-point number formats inside the computer

$X = M_s\,E_s\,E_{m-1} \cdots E_1 E_0\,M_{-1} M_{-2} \cdots M_{-n}$

IEEE standard: the exponent is stored in biased code with base 2, and $X = M_X \times 2^{E_X}$.

Under the IEEE standard the exponent is an integer, stored in biased code, and used as the power of base 2. Because the base is always 2, it does not need to appear in the format; only the exponent value is stored. Biased code is used only for integers and only in the exponent field of a floating-point number. Its definition is similar to that of two's complement for integers; the difference lies in the sign bit.
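To make the field layout, the hidden bit, and the biased exponent concrete, the following Python sketch decodes a 32-bit short floating-point pattern. It assumes the IEEE 754 binary32 conventions (significand 1.f, bias 127), which differ slightly from the 0.1f normalization described above. The function names `decode_binary32` and `float_to_bits` are illustrative and do not come from the slides, and special encodings (exponent field 0 or 255) are rejected rather than handled.

```python
import struct
from fractions import Fraction

BIAS = 127  # assumed IEEE 754 binary32 exponent bias


def float_to_bits(x: float) -> int:
    """Return the 32-bit pattern of x after rounding it to single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def decode_binary32(bits: int):
    """Split a 32-bit pattern into its fields and compute the exact value.

    Returns (sign, biased_exponent, fraction, value) for normalized numbers.
    """
    sign = (bits >> 31) & 0x1          # 1 sign bit
    biased_exp = (bits >> 23) & 0xFF   # 8 exponent bits, biased code
    fraction = bits & 0x7FFFFF         # 23 stored mantissa bits
    if biased_exp in (0, 255):
        raise ValueError("zero, denormal, infinity, or NaN: outside this sketch")
    # Restore the hidden bit: the significand is 1.f, i.e. (2**23 + f) / 2**23.
    significand = Fraction((1 << 23) | fraction, 1 << 23)
    value = (-1) ** sign * significand * Fraction(2) ** (biased_exp - BIAS)
    return sign, biased_exp, fraction, value


print(decode_binary32(float_to_bits(-6.5)))
# (1, 129, 5242880, Fraction(-13, 2))
```

Here -6.5 is -1.101 (binary) times 2^2, so the stored exponent field is 2 + 127 = 129 and the stored fraction holds only the bits after the leading 1.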

The sign bit of the biased code is 0 for negative and 1 for positive, exactly the opposite of two's complement. The name reflects the fact that the machine number is shifted along the number line relative to the true value. The value bits of the biased code are identical to those of two's complement.

Floating-point format: about biased code

Biased code represents integers and is used in the exponent field of a floating-point number. For one sign bit and n value bits it is defined as

$[E]_{\text{biased}} = 2^n + E, \quad -2^n \le E < 2^n$

Range of machine numbers: 00000000 to 11111111, with negative values in the lower half, 0 at 10000000, and positive values in the upper half.

For comparison, two's complement is defined as

$[X]_{\text{comp}} = X$ for $0 \le X < 2^n$, and $[X]_{\text{comp}} = 2^{n+1} + X$ for $-2^n \le X < 0$

Biased code supports only addition and subtraction of two numbers and increment/decrement by 1. After an addition or subtraction, the computed sign bit must be inverted to obtain the correct sign of the result.
Note: when double sign bits are used, 00 stands for negative and 01 for positive; 11 does not stand for positive.

An 8-bit exponent can represent -128 to +127. When the exponent is -128, its biased code is 00000000 and the absolute value of the floating-point number is less than $2^{-128}$. By convention the value of such a number is taken to be zero: if the mantissa is not 0 it is cleared to 0, and the result is called machine zero.
In 8-bit biased code, the machine number is the true value shifted right by 128 positions on the number line, covering -128 to +127.

Biased Exponent
- Value of exponent = val(E) = E - Bias (Bias is a constant)
- 8 bits for single precision
- E can be in the range 0 to 255
- E = 0 and E = 255 are reserved for special use
- E = 1 to 254 are used for normalized floating-point numbers
- Bias = 127 (half of 254), val(E) = E - 127
- val(E=1) = -126, val(E=127) = 0, val(E=254) = 127

Example of Exponent

Exponent   Adjusted (exponent + 127)   Binary
+5         132                         10000100
0          127                         01111111
-10        117                         01110101
+128       255                         11111111
-127       0                           00000000
-1         126                         01111110

Example of Normalized Mantissa

Binary value   Normalized as   Exponent
1101.101       1.101101        3
0.00101        1.01            -3
1.0001         1.0001          0
10000011       1.0000011       7

[Figure slides, titles only: Biased Exponent; Example of Floating Point; Largest Normalized Float; Smallest Normalized Float; Zero, Infinity, NaN; Denormalized numbers; Zero & Infinity]

NaN
- The value NaN (Not a Number) represents a value that is not a real number.
- NaN is a special value encoded with the maximum E and F ≠ 0.
- It results from exceptional situations, such as 0/0 or sqrt(negative).
- Any operation on a NaN yields NaN: Op(X, NaN) = NaN.
- QNaN denotes indeterminate operations; SNaN denotes invalid operations.

Special values, positive (b is the bias)

Sign   Exponent (e)       Fraction (f)       Value
0      00..00             00..00             +0
0      00..00             00..01 to 11..11   Positive denormalized real: $0.f \times 2^{-b+1}$
0      00..01 to 11..10   XX..XX             Positive normalized real: $1.f \times 2^{e-b}$
0      11..11             00..00             +Infinity
0      11..11             00..01 to 01..11   SNaN
0      11..11             10..00 to 11..11   QNaN

Special values, negative (b is the bias)

Sign   Exponent (e)       Fraction (f)       Value
1      00..00             00..00             -0
1      00..00             00..01 to 11..11   Negative denormalized real: $-0.f \times 2^{-b+1}$
1      00..01 to 11..10   XX..XX             Negative normalized real: $-1.f \times 2^{e-b}$
1      11..11             00..00             -Infinity
1      11..11             00..01 to 01..11   SNaN
1      11..11             10..00 to 11..11   QNaN

Operation                   Result
n ÷ ±Infinity               0
±Infinity × ±Infinity       ±Infinity
±nonzero ÷ 0                ±Infinity
Infinity + Infinity         Infinity
±0 ÷ ±0                     NaN
Infinity - Infinity         NaN
±Infinity ÷ ±Infinity       NaN
±Infinity × 0               NaN

[Figure slides, titles only: FP Add; Floating Point Subtraction Example; Extra bits; Guard bit]

Rounding Mode: nearest
- In this mode, inexact results are rounded to the nearer of the two possible result values. If neither possibility is nearer, the even alternative is chosen. This form of rounding is also called round to even. A value is "even" when its least significant bit is 0.
- Examples, rounding to two fraction bits:

Value    Binary      Rounded   Action         Rounded value
2 3/32   10.00011    10.00     (< 1/2, down)  2
2 3/16   10.00110    10.01     (> 1/2, up)    2 1/4
2 7/8    10.11100    11.00     (1/2, up)      3
2 5/8    10.10100    10.10     (1/2, down)    2 1/2

[Figure slide, title only: Rounding Mode]

Steps in Addition/Subtraction of Floating-Point Numbers
- Step 1: Calculate the difference d of the two exponents: $d = |E_1 - E_2|$
- Step 2: Shift the significand of the smaller number d base-β positions to the right (β is the radix)
- Step 3: Add the aligned significands and set the exponent of the result to the exponent of the larger operand
- Step 4: Normalize the resulting significand and adjust the exponent if necessary
- Step 5: Round the resulting significand and adjust the exponent if necessary

[Figure slide, title only: Addition/Subtraction Structure]

Addition/Subtraction
- $E_1 \ge E_2$: the exponent of the larger number is not decreased, because doing so would require a larger significand adder.
- Addition: the resulting significand M (the sum of the two aligned significands) is in the range $1/\beta \le M < 2$. If M > 1, a postnormalization step is required: the significand is shifted to the right to yield M3 and the exponent is increased by one (an exponent overflow may occur).

Addition/Subtraction Normalization
- Subtraction: the resulting significand M is in the range $0 \le |M| < 1$. A postnormalization step, shifting the significand to the left and decreasing the exponent, is required if $M < 1/\beta$ [...]
- [...] 1): only a pre-alignment shift may be needed

CLOSE Case
- The exponent difference is predicted from the two least significant bits of the operands' exponents, which allows the subtraction of significands to start as soon as possible
  - If 0: the subtraction is executed with no alignment
  - If ±1: the significand of the smaller operand is shifted once to the right (using a multiplexor) and then subtracted from the other significand
- In parallel, the true exponent difference is calculated
  - If > 1: the procedure is aborted and the FAR procedure is followed
  - If ≤ 1: the CLOSE procedure continues
- In parallel with the subtraction, the number of leading zeros is predicted to determine the number of shift positions in postnormalization

CLOSE Case: Normalization and Rounding
- Next: normalization of the significand and the corresponding exponent adjustment
- Last: rounding, by precomputing sum and sum+1 and selecting the one that is properly rounded; negation of the result may be necessary
- The result of the subtraction is usually positive, so negation is not required
- Only when the exponents are equal can the result of the significand subtraction be negative (in two's complement), requiring a negation step
- The negation and rounding steps are mutually exclusive

FAR Case
- First: the exponent difference is calculated
- Next: the significand of the smaller operand is shifted to the right for alignment
- The shifted-out bits are used to set the sticky bit
- The smaller significand is subtracted from the larger; the result is either already normalized or needs at most a one-position left shift
- Last step: rounding

Leading Zeros Prediction Circuit
- Predicts the position of the leading non-zero bit in the result of the subtraction before the subtraction is completed
- This allows the postnormalization shift to be executed immediately after the subtraction
- The bits of the operands of the subtraction are examined serially, starting with the most significant bits, to determine the position of the first 1
- This serial operation can be accelerated using a parallel scheme similar to carry-look-ahead
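The five addition/subtraction steps listed in the slides, together with round-to-nearest-even, can be sketched as a small software model. This is a minimal illustration under stated assumptions, not the two-path (CLOSE/FAR) hardware design: operands are normalized triples (sign, unbiased exponent, 24-bit significand including the hidden bit), three extra low-order bits serve as guard, round, and sticky positions, and zero, denormals, infinities, NaN, and exponent overflow or underflow are ignored. The names `P`, `EXTRA`, `fp_add`, and `to_fraction` are hypothetical.

```python
from fractions import Fraction

P = 24      # significand bits including the hidden bit (binary32), assumed
EXTRA = 3   # guard, round, and sticky positions, assumed


def fp_add(a, b):
    """Add two normalized numbers given as (sign, exponent, significand).

    value = (-1)**sign * significand * 2**(exponent - (P - 1)),
    with 2**(P-1) <= significand < 2**P.
    Returns None on exact cancellation (zero is outside this sketch).
    """
    # Step 1: exponent difference; let `a` be the operand with the larger exponent.
    if a[1] < b[1]:
        a, b = b, a
    (sa, ea, ma), (sb, eb, mb) = a, b
    d = ea - eb

    # Step 2: append the extra bits, then shift the smaller significand right
    # by d, OR-ing all shifted-out bits into the lowest (sticky) position.
    ma <<= EXTRA
    mb <<= EXTRA
    lost = mb & ((1 << d) - 1)
    mb = (mb >> d) | (1 if lost else 0)

    # Step 3: add the signed, aligned significands; keep the larger exponent.
    total = (-ma if sa else ma) + (-mb if sb else mb)
    if total == 0:
        return None
    sign, mag, exp = int(total < 0), abs(total), ea

    # Step 4: normalize so the leading 1 sits at bit P + EXTRA - 1.
    while mag >= 1 << (P + EXTRA):      # carry-out: shift right, keep sticky
        mag = (mag >> 1) | (mag & 1)
        exp += 1
    while mag < 1 << (P + EXTRA - 1):   # cancellation: shift left
        mag <<= 1
        exp -= 1

    # Step 5: round to nearest, ties to even, using the EXTRA low bits.
    rest, mag = mag & ((1 << EXTRA) - 1), mag >> EXTRA
    half = 1 << (EXTRA - 1)
    if rest > half or (rest == half and mag & 1):
        mag += 1
        if mag == 1 << P:               # rounding carried out of the significand
            mag >>= 1
            exp += 1
    return sign, exp, mag


def to_fraction(x):
    """Exact value of a (sign, exponent, significand) triple."""
    sign, exp, sig = x
    return (-1) ** sign * Fraction(sig) * Fraction(2) ** (exp - (P - 1))


one = (0, 0, 1 << 23)        # 1.0
tie = (0, -24, 1 << 23)      # 2**-24, exactly half a unit in the last place of 1.0
print(fp_add(one, tie))      # (0, 0, 8388608): the tie rounds to the even significand
```

In the usage line the exact sum lies halfway between 1.0 and the next representable value, so the result stays at 1.0, whose least significant bit is 0. Hardware would replace the normalization loops with a shifter driven by a leading-zero predictor, as the last slides describe.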
