




HEVC Core Coding Techniques Explained (English text with translated notes)

HEVC Core Coding Techniques, Part 1: Picture Partitioning for Coding

IV. HEVC Video Coding Techniques

As in all prior ITU-T and ISO/IEC JTC 1 video coding standards since H.261 [2], the HEVC design follows the classic block-based hybrid video coding approach (as depicted in Fig. 1). The basic source-coding algorithm is a hybrid of interpicture prediction to exploit temporal statistical dependences, intrapicture prediction to exploit spatial statistical dependences, and transform coding of the prediction residual signals to further exploit spatial statistical dependences. There is no single coding element in the HEVC design that provides the majority of its significant improvement in compression efficiency in relation to prior video coding standards. It is, rather, a plurality of smaller improvements that add up to the significant gain.

Summary: Like the video coding standards since H.261, HEVC follows the classic block-based hybrid video coding approach (see Fig. 1). The basic source-coding algorithm uses interpicture prediction for temporal statistical dependences, intrapicture prediction for spatial statistical dependences, and transform coding of the prediction residual to remove further spatial dependences. HEVC compresses better than earlier standards, and the gain comes from many small improvements rather than from one dominant tool.

Fig. 1. Typical HEVC video encoder (with decoder modeling elements shaded in light gray).

A. Sampled Representation of Pictures

For representing color video signals, HEVC typically uses a tristimulus YCbCr color space with 4:2:0 sampling (although extension to other sampling formats is straightforward, and is planned to be defined in a subsequent version). This separates a color representation into three components called Y, Cb, and Cr. The Y component is also called luma, and represents brightness. The two chroma components Cb and Cr represent the extent to which the color deviates from gray toward blue and red, respectively. Because the human visual system is more sensitive to luma than chroma, the 4:2:0 sampling structure is typically used, in which each chroma component has one fourth of the number of samples of the luma component (half the number of samples in both the horizontal and vertical dimensions). Each sample for each component is typically represented with 8 or 10 bits of precision, and the 8-bit case is the more typical one. In the remainder of this paper, we focus our attention on the typical use: YCbCr components with 4:2:0 sampling and 8 bits per sample for the representation of the encoded input and decoded output video signal.

Summary: HEVC typically represents color video in the tristimulus YCbCr color space with 4:2:0 sampling. Y is luma; Cb and Cr are the chroma components, which indicate how far the color deviates from gray toward blue and red, respectively. Each sample is represented with 8 or 10 bits.
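The "one fourth of the number of samples" relationship can be checked with a small calculation. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and storing 10-bit samples in 16-bit words is an assumption about the buffer layout, not something HEVC prescribes.

```python
def plane_sizes_420(width, height):
    """Return ((luma_w, luma_h), (chroma_w, chroma_h)) for 4:2:0 sampling."""
    if width % 2 or height % 2:
        # Assumption of this sketch: even luma dimensions only.
        raise ValueError("this sketch assumes even luma dimensions")
    # Each chroma plane has half the samples horizontally and vertically.
    return (width, height), (width // 2, height // 2)


def frame_bytes_420(width, height, bit_depth=8):
    """Bytes for one uncompressed 4:2:0 frame (Y plane plus Cb and Cr planes)."""
    (lw, lh), (cw, ch) = plane_sizes_420(width, height)
    samples = lw * lh + 2 * cw * ch          # Y + Cb + Cr
    # Assumption: samples wider than 8 bits are stored in 16-bit words.
    bytes_per_sample = 1 if bit_depth <= 8 else 2
    return samples * bytes_per_sample


luma, chroma = plane_sizes_420(1920, 1080)
print(luma, chroma, frame_bytes_420(1920, 1080))
```

Because each chroma plane holds a quarter of the luma sample count, a 4:2:0 frame carries 1.5 samples per luma position in total.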
The video pictures are typically progressively sampled with rectangular picture sizes W×H, where W is the width and H is the height of the picture in terms of luma samples. Each chroma component array, with 4:2:0 sampling, is then W/2×H/2. Given such a video signal, the HEVC syntax partitions the pictures further as described below.

Summary: Video pictures are usually progressively scanned with size W×H, where W and H are the width and height in luma samples. With 4:2:0 sampling, each chroma component is W/2 wide and H/2 high.

B. Division of the Picture into Coding Tree Units

A picture is partitioned into coding tree units (CTUs), which each contain luma CTBs and chroma CTBs. A luma CTB covers a rectangular picture area of L×L samples of the luma component and the corresponding chroma CTBs cover each L/2×L/2 samples of each of the two chroma components. The value of L may be equal to 16, 32, or 64 as determined by an encoded syntax element specified in the SPS. Compared with the traditional macroblock using a fixed array size of 16×16 luma samples, as used by all previous ITU-T and ISO/IEC JTC 1 video coding standards since H.261 (that was standardized in 1990), HEVC supports variable-size CTBs selected according to needs of encoders in terms of memory and computational requirements. The support of larger CTBs than in previous standards is particularly beneficial when encoding high-resolution video content. The luma CTB and the two chroma CTBs together with the associated syntax form a CTU. The CTU is the basic processing unit used in the standard to specify the decoding process.

Summary: A picture is partitioned into CTUs, each containing a luma CTB and chroma CTBs. Each luma CTB covers an L×L block of luma samples, and the two corresponding chroma CTBs are each L/2×L/2. L can be 16, 32, or 64 and is signaled by a syntax element in the SPS. Unlike the traditional fixed 16×16 macroblock, the CTB size in HEVC is variable and is chosen according to the encoder's memory and computational resources. Larger CTBs than in earlier standards are particularly effective for high-resolution content. A CTU consists of one luma CTB, two chroma CTBs, and the associated syntax, and it is the basic processing unit of the decoding process.
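To make the effect of the CTB size L concrete, the sketch below counts how many CTUs cover a picture for each allowed value of L. The function name is hypothetical, and rounding up at the right and bottom edges is an assumption of this sketch (a CTU that is only partly inside the picture still counts as one CTU).

```python
def ctu_grid(width, height, ctb_size=64):
    """Return (columns, rows) of CTUs needed to cover a width x height luma picture."""
    if ctb_size not in (16, 32, 64):
        raise ValueError("L must be 16, 32, or 64")
    # Ceiling division: partially covered CTUs at the edges still count.
    cols = -(-width // ctb_size)
    rows = -(-height // ctb_size)
    return cols, rows


for size in (16, 32, 64):
    cols, rows = ctu_grid(1920, 1080, size)
    print(f"L={size}: {cols} x {rows} = {cols * rows} CTUs")
```

For a 1920×1080 picture this gives 8160 units at L = 16 (the macroblock size of earlier standards) against 510 at L = 64, which illustrates why larger CTBs reduce per-block overhead on high-resolution content.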
C. Division of the CTB into CBs

The blocks specified as luma and chroma CTBs can be directly used as CBs or can be further partitioned into multiple CBs. Partitioning is achieved using tree structures. The tree partitioning in HEVC is generally applied simultaneously to both luma and chroma, although exceptions apply when certain minimum sizes are reached for chroma.

Summary: Luma and chroma CTBs can be used directly as CBs or be further partitioned into multiple CBs. Partitioning uses a tree structure, and the tree partitioning is normally applied to luma and chroma together.

The CTU contains a quadtree syntax that allows for splitting the CBs to a selected appropriate size based on the signal characteristics of the region that is covered by the CTB. The quadtree splitting process can be iterated until the size for a luma CB reaches a minimum allowed luma CB size that is selected by the encoder using syntax in the SPS and is always 8×8 or larger (in units of luma samples).

Summary: The CTU carries a quadtree syntax that lets the region covered by the CTB be split into CBs of a size suited to the signal characteristics. The quadtree split can be iterated until the luma CB reaches the minimum allowed size, which is signaled in the SPS and is always 8×8 or larger.

The boundaries of the picture are defined in units of the minimum allowed luma CB size. As a result, at the right and bottom edges of the picture, some CTUs may cover regions that are partly outside the boundaries of the picture. This condition is detected by the decoder, and the CTU quadtree is implicitly split as necessary to reduce the CB size to the point where the entire CB will fit into the picture.

Summary: The picture dimensions are an integer multiple of the minimum allowed luma CB size. At the right and bottom edges of the picture, some CTUs may therefore extend beyond the picture boundary. The decoder detects this case, and the CTU quadtree is implicitly split until every CB fits entirely inside the picture.
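The implicit boundary split can be sketched as a short recursion. This is an illustrative sketch under stated assumptions, not the normative decoding process: the function name is hypothetical, quadrants are visited in z-order, and a block that already fits inside the picture is left unsplit (a real encoder may still choose to split it further).

```python
def boundary_cbs(x0, y0, size, pic_w, pic_h, min_cb=8):
    """Return the CBs (x, y, size) that the picture boundary forces for one block.

    Blocks fully inside the picture are returned unsplit; blocks crossing the
    right or bottom edge are split into quadrants until each piece fits.
    """
    if x0 >= pic_w or y0 >= pic_h:
        return []                        # quadrant lies entirely outside the picture
    if x0 + size <= pic_w and y0 + size <= pic_h:
        return [(x0, y0, size)]          # fits: no implicit split needed
    if size <= min_cb:
        raise ValueError("picture size must be a multiple of the minimum CB size")
    half = size // 2
    cbs = []
    for dy in (0, half):                 # z-order: top-left, top-right,
        for dx in (0, half):             # bottom-left, bottom-right
            cbs += boundary_cbs(x0 + dx, y0 + dy, half, pic_w, pic_h, min_cb)
    return cbs


# Bottom-left CTU of a 1920x1080 picture with 64x64 CTBs: rows 1024..1087,
# of which only rows 1024..1079 are inside the picture.
for cb in boundary_cbs(0, 1024, 64, 1920, 1080):
    print(cb)
```

In this example the 64×64 CTB is reduced to two 32×32, four 16×16, and eight 8×8 CBs, which together cover exactly the 64×56 region inside the picture.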
Fig. 3. Modes for splitting a CB into PBs, subject to certain size constraints. For intrapicture-predicted CBs, only M×M and M/2×M/2 are supported.

D. PBs and PUs

The prediction mode for the CU is signaled as being intra or inter, according to whether it uses intrapicture (spatial) prediction or interpicture (temporal) prediction.

Summary: A CU is classified as intra or inter according to whether it uses intrapicture or interpicture prediction.

When the prediction mode is signaled as intra, the PB size, which is the block size at which the intrapicture prediction mode is established, is the same as the CB size for all block sizes except for the smallest CB size that is allowed in the bitstream. For the latter case, a flag is present that indicates whether the CB is split into four PB quadrants that each have their own intrapicture prediction mode. The reason for allowing this split is to enable distinct intrapicture prediction mode selections for blocks as small as 4×4 in size. When the luma intrapicture prediction operates with 4×4 blocks, the chroma intrapicture prediction also uses 4×4 blocks (each covering the same picture region as four 4×4 luma blocks). The actual region size at which the intrapicture prediction operates (which is distinct from the PB size, at which the intrapicture prediction mode is established) depends on the residual coding partitioning that is described as follows.

Summary: In intra mode, the PB size (the block size at which the intra prediction mode is set) equals the CB size for every block size except the smallest CB size allowed in the bitstream. In that case, a flag indicates whether the CB is split into four PBs, each with its own intra prediction mode. The purpose of this split is to allow intra mode selection for blocks as small as 4×4. When luma prediction uses 4×4 blocks, chroma prediction also uses 4×4 blocks, each covering the region of four 4×4 luma blocks. The region size at which intra prediction actually operates is distinct from the PB size and depends on the residual coding partitioning described below.
When the prediction mode is signaled as inter, it is specified whether the luma and chroma CBs are split into one, two, or four PBs. The splitting into four PBs is allowed only when the CB size is equal to the minimum allowed CB size, using an equivalent type of splitting as could otherwise be performed at the CB level of the design rather than at the PB level.

Summary: In inter mode, it is signaled whether the luma and chroma CBs are split into one, two, or four PBs. A CB can be split into four PBs only when its size equals the minimum allowed CB size.

When a CB is split into four PBs, each PB covers a quadrant of the CB. When a CB is split into two PBs, six types of this splitting are possible. The partitioning possibilities for interpicture-predicted CBs are depicted in Fig. 3. The upper partitions illustrate the cases of not splitting the CB of size M×M, of splitting the CB into two PBs of size M×M/2 or M/2×M, or splitting it into four PBs of size M/2×M/2.

Summary: When a CB is split into four PBs, each PB covers one quadrant of the CB. When a CB is split into two PBs, six split types are possible. Fig. 3 shows the partitioning options for inter-predicted CBs; its upper row shows the unsplit M×M CB, the split into two PBs of size M×M/2 or M/2×M, and the split into four PBs of size M/2×M/2.

The lower four partition types in Fig. 3 are referred to as asymmetric motion partitioning (AMP), and are only allowed when M is 16 or larger for luma. One PB of the asymmetric partition has the height or width M/4 and width or height M, respectively, and the other PB fills the rest of the CB by having a height or width of 3M/4 and width or height M.

Summary: The four partition types in the lower row of Fig. 3 are called asymmetric motion partitioning (AMP) and are allowed only when M is 16 or larger for luma. One PB of an asymmetric partition has height (or width) M/4 and width (or height) M; the other PB fills the rest of the CB with height (or width) 3M/4 and width (or height) M.

Each interpicture-predicted PB is assigned one or two motion vectors and reference picture indices. To minimize worst-case memory bandwidth, PBs of luma size 4×4 are not allowed for interpicture prediction, and PBs of luma sizes 4×8 and 8×4 are restricted to unipredictive coding. The interpicture prediction process is further described as follows.

Summary: Each inter-predicted PB has one or two motion vectors and reference picture indices. To limit worst-case memory bandwidth, luma PBs of size 4×4 are not allowed for interpicture prediction, and luma PBs of size 4×8 and 8×4 may not use biprediction. The interpicture prediction process is described in more detail later.
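The inter partitioning rules above (symmetric splits, AMP, and the small-block restrictions) can be collected into a lookup. This is a minimal sketch: the function names are hypothetical, and the mode labels for the two-PB and asymmetric cases (`2NxN`, `2NxnU`, and so on) are labels assumed here for illustration, since the excerpt itself only describes the shapes.

```python
AMP_MODES = ("2NxnU", "2NxnD", "nLx2N", "nRx2N")


def inter_pb_sizes(mode, m, min_cb=8):
    """Return the luma PB sizes (width, height) for an inter CB of size m x m."""
    q = m // 4
    table = {
        "2Nx2N": [(m, m)],                        # no split
        "2NxN": [(m, m // 2)] * 2,                # two M x M/2 PBs
        "Nx2N": [(m // 2, m)] * 2,                # two M/2 x M PBs
        "NxN": [(m // 2, m // 2)] * 4,            # four quadrants
        "2NxnU": [(m, q), (m, m - q)],            # AMP: M/4-high PB on top
        "2NxnD": [(m, m - q), (m, q)],            # AMP: M/4-high PB at bottom
        "nLx2N": [(q, m), (m - q, m)],            # AMP: M/4-wide PB on the left
        "nRx2N": [(m - q, m), (q, m)],            # AMP: M/4-wide PB on the right
    }
    pbs = table[mode]
    if mode == "NxN" and m != min_cb:
        raise ValueError("four-PB split only at the minimum CB size")
    if mode in AMP_MODES and m < 16:
        raise ValueError("AMP requires M >= 16")
    if (4, 4) in pbs:
        raise ValueError("4x4 luma PBs are not allowed for inter prediction")
    return pbs


def bipred_allowed(width, height):
    """4x8 and 8x4 luma PBs are restricted to uniprediction."""
    return (width, height) not in ((4, 8), (8, 4))


print(inter_pb_sizes("2NxnU", 32))    # [(32, 8), (32, 24)]
print(bipred_allowed(4, 8))           # False
```

Note that with a minimum CB size of 8, the four-PB split would produce 4×4 PBs and is therefore rejected by the last check; it only becomes usable when the minimum CB size is 16 or larger.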
The luma and chroma PBs, together with the associated prediction syntax, form the PU.

Summary: A PU consists of the luma and chroma PBs together with the associated prediction syntax.

Fig. 4. Subdivision of a CTB into CBs and transform blocks (TBs). Solid lines indicate CB boundaries and dotted lines indicate TB boundaries. (a) CTB with its partitioning. (b) Corresponding quadtree.

E. Tree-Structured Partitioning Into Transform Blocks and Units

For residual coding, a CB can be recursively partitioned into transform blocks (TBs). The partitioning is signaled by a residual quadtree.

Summary: For residual coding, a CB can be recursively partitioned into transform blocks (TBs); the partitioning is signaled by a residual quadtree.

Only square CB and TB partitioning is specified, where a block can be recursively split into quadrants, as illustrated in Fig. 4. For a given luma CB of size M×M, a flag signals whether it is split into four blocks of size M/2×M/2. If further splitting is possible, as signaled by a maximum depth of the residual quadtree indicated in the SPS, each quadrant is assigned a flag that indicates whether it is split into four quadrants. The leaf node blocks resulting from the residual quadtree are the transform blocks that are further processed by transform coding. The encoder indicates the maximum and minimum luma TB sizes that it will use. Splitting is implicit when the CB size is larger than the maximum TB size. Not splitting is implicit when splitting would result in a luma TB size smaller than the indicated minimum. The chroma TB size is half the luma TB size in each dimension, except when the luma TB size is 4×4, in which case a single 4×4 chroma TB is used for the region covered by four 4×4 luma TBs. In the case of intrapicture-predicted CUs, the decoded samples of the nearest-neighboring TBs (within or outside the CB) are used as reference data for intrapicture prediction.

Summary: Only square CB and TB partitioning is specified. As Fig. 4 shows, a block can be recursively split into quadrants. For a luma CB of size M×M, a flag signals whether it is split into four M/2×M/2 blocks. If further splitting is possible, as limited by the maximum residual quadtree depth indicated in the SPS, each quadrant has its own flag indicating whether it is split into four quadrants. The leaf blocks of the residual quadtree are the transform blocks, which are then transform coded. The encoder indicates the maximum and minimum luma TB sizes it will use. A split is implicit when the CB is larger than the maximum TB size, and no split is implicit when splitting would make the luma TB smaller than the minimum. The chroma TB is normally half the luma TB size in each dimension, except that when the luma TB is 4×4 a single 4×4 chroma TB covers the region of four 4×4 luma TBs. In intra-predicted CUs, the decoded samples of the nearest neighboring TBs are used as reference data for intrapicture prediction.
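The implicit and explicit split rules for the residual quadtree can be sketched as a small decision function. This is a simplified illustration, not the normative syntax: the function names, the default size limits, and the default depth are assumptions chosen for the example, and interactions with the PB partitioning are ignored.

```python
def tb_split_decision(size, depth, max_tb=32, min_tb=4, max_depth=3):
    """Decide how the split of a luma block of the given size is determined.

    Returns "split (implicit)", "no split (implicit)", or "flag" when the
    encoder signals the choice explicitly. Default limits are assumptions.
    """
    if size > max_tb:
        return "split (implicit)"          # block exceeds the maximum TB size
    if size // 2 < min_tb or depth >= max_depth:
        return "no split (implicit)"       # children would be too small or too deep
    return "flag"


def chroma_tb_size(luma_tb):
    """Chroma TB size for 4:2:0: half the luma size, but never below 4x4."""
    return 4 if luma_tb == 4 else luma_tb // 2


print(tb_split_decision(64, 0))   # split (implicit)
print(tb_split_decision(16, 1))   # flag
print(chroma_tb_size(4))          # 4
```

The 4×4 special case in `chroma_tb_size` reflects the rule that one 4×4 chroma TB serves the region of four 4×4 luma TBs, so no chroma transform smaller than 4×4 is needed.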
In contrast to previous standards, the HEVC design allows a TB to span across multiple PBs for interpicture-predicted CUs to maximize the potential coding efficiency benefits of the quadtree-structured TB partitioning.

Summary: Unlike earlier standards, HEVC allows a TB to span multiple PBs in inter-predicted CUs, so that the quadtree-structured TB partitioning can deliver its full coding efficiency benefit.

F. Slices and Tiles

Slices are a sequence of CTUs that are processed in the order of a raster scan. A picture may be split into one or several slices as shown in Fig. 5(a) so that a picture is a collection of one or more slices. Slices are self-contained in the sense that, given the availability of the active sequence and picture parameter sets, their syntax elements can be parsed from the bitstream and the values of the samples in the area of the picture that the slice represents can be correctly decoded (except with regard to the effects of in-loop filtering near the edges of the slice) without the use of any data from other slices in the same picture. This means that prediction within the picture (e.g., intrapicture spatial signal prediction or prediction of motion vectors) is not performed across slice boundaries. Some information from other slices may, however, be needed to apply the in-loop filtering across slice boundaries. Each slice can be coded using different coding types as follows.

Summary: A slice is a sequence of CTUs processed in raster-scan order. As shown in Fig. 5(a), a picture can be split into one or more slices, so a picture is a collection of one or more slices. Given the active SPS and PPS, a slice is self-contained: its syntax elements can be parsed from the bitstream and its samples decoded without data from other slices of the same picture (except for in-loop filtering near the slice edges). Prediction within the picture is therefore not performed across slice boundaries, although in-loop filtering across slice boundaries may need some information from other slices. Each slice can be coded using one of the coding types below.
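Fig. 5. Subdivision of a picture into (a) slices and (b) tiles. (c) Illustration of wavefront parallel processing.

1) I slice: A slice in which all CUs of the slice are coded using only intrapicture prediction.
Summary: All coding units (CUs) in the slice use intrapicture prediction.

2) P slice: In addition to the coding types of an I slice, some CUs of a P slice can also be coded using interpicture prediction with at most one motion-compensated prediction signal per PB (i.e., uniprediction). P slices only use reference picture list 0.
Summary: Besides the I-slice coding types, a P slice can use interpicture prediction with a single prediction signal per PB, and it uses only reference picture list 0.

3) B slice: In addition to the coding types available in a P slice, some CUs of the B slice can also be coded using interpicture prediction with at most two motion-compensated prediction signals per PB (i.e., biprediction). B slices use both reference picture list 0 and list 1.
Summary: A B slice can additionally use interpicture prediction with up to two motion-compensated prediction signals per PB, and it uses reference picture lists 0 and 1.

The main purpose of slices is resynchronization after data losses. Furthermore, slices are often restricted to use a maximum number of bits, e.g., for packetized transmission. Therefore, slices may often contain a highly varying number of CTUs per slice in a manner dependent on the activity in the video scene. In addition to slices, HEVC also defines tiles, which are self-contained and independently decodable rectangular regions of the picture. The main purpose of tiles is to enable the use of parallel processing architectures for encoding and decoding. Multiple tiles may share header information by being contained in the same slice. Alternatively, a single tile may contain multiple slices. A tile consists of a rectangular arranged group of CTUs (typically, but not necessarily, with all of them containing about the same number of CTUs), as shown in Fig. 5(b).

Summary: The main purpose of slices is resynchronization after data loss. Slices are often limited to a maximum number of bits, for example for packetized transmission, so the number of CTUs per slice can vary widely with the activity in the scene. In addition to slices, HEVC defines tiles, which are self-contained, independently decodable rectangular regions of the picture. The main purpose of tiles is to enable parallel processing in the encoder and decoder. Multiple tiles can share header information by being contained in the same slice; alternatively, one tile can contain multiple slices. A tile is a rectangular group of CTUs (typically, but not necessarily, with about the same number of CTUs in every tile), as shown in Fig. 5(b).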
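One simple way to obtain tiles with "about the same number of CTUs" is to spread the CTU columns and rows as evenly as integer arithmetic allows. The sketch below is illustrative only: the function names are hypothetical, and the integer-division rule is an assumption of this example, not something stated in the text above.

```python
def uniform_tile_sizes(num_ctus, num_tiles):
    """Split num_ctus CTU columns (or rows) into num_tiles nearly equal parts."""
    if not 1 <= num_tiles <= num_ctus:
        raise ValueError("need between 1 and num_ctus tiles")
    # Boundary i sits at floor(i * num_ctus / num_tiles); sizes are the gaps.
    bounds = [(i * num_ctus) // num_tiles for i in range(num_tiles + 1)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def tile_ctu_counts(ctu_cols, ctu_rows, tile_cols, tile_rows):
    """CTU count of every tile, listed row by row."""
    widths = uniform_tile_sizes(ctu_cols, tile_cols)
    heights = uniform_tile_sizes(ctu_rows, tile_rows)
    return [[w * h for w in widths] for h in heights]


# A 1920x1080 picture with 64x64 CTBs is 30 x 17 CTUs; use a 4 x 2 tile grid.
print(uniform_tile_sizes(30, 4))        # [7, 8, 7, 8]
print(tile_ctu_counts(30, 17, 4, 2))
```

Because 30 columns and 17 rows do not divide evenly, the tiles differ slightly in size, which matches the "typically, but not necessarily" wording of the paragraph above.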
To assist with the granularity of data packetization, dependent slices are additionally defined. Finally, with WPP, a slice is divided into rows of CTUs. The decoding of each row can be begun as soon as a few decisions that are needed for prediction and adaptation of the entropy coder have been made in the preceding row. This supports parallel processing of rows of CTUs by using several processing threads in the encoder or decoder (or both). An example is shown in Fig. 5(c). For design simplicity, WPP is not allowed to be used in combination with tiles (although these features could, in principle, work properly together).

Summary: To give finer control over data packetization, dependent slices are additionally defined. With wavefront parallel processing (WPP), a slice is divided into rows of CTUs, and each row can start decoding before the previous row has been fully decoded. This allows rows of CTUs to be processed in parallel by several threads in the encoder, the decoder, or both, as shown in Fig. 5(c). To keep the design simple, WPP may not be used in combination with tiles.
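The wavefront dependency can be illustrated with a small schedule simulation. This is a sketch under stated assumptions: the function name is hypothetical, every CTU is assumed to take one time step, one thread per row is assumed to be available, and the lag of two CTUs behind the row above is an assumption standing in for the "few decisions" mentioned in the text.

```python
def wpp_start_times(rows, cols, lag=2):
    """Earliest start step of each CTU when rows are decoded as a wavefront.

    A CTU waits for its left neighbor and for the CTU `lag - 1` columns to
    the right in the row above (clamped to the last column).
    """
    start = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            t = 0
            if c > 0:
                t = start[r][c - 1] + 1                 # left neighbor finished
            if r > 0:
                dep = min(c + lag - 1, cols - 1)
                t = max(t, start[r - 1][dep] + 1)       # row above far enough ahead
            start[r][c] = t
    return start


schedule = wpp_start_times(3, 4)
for row in schedule:
    print(row)
total_steps = schedule[-1][-1] + 1
print("parallel steps:", total_steps, "sequential steps:", 3 * 4)
```

Under these assumptions each row starts two steps after the row above, so a 3×4 grid of CTUs finishes in 8 steps instead of 12.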
HEVC Core Coding Techniques, Part 2: Intrapicture Prediction

G. Intrapicture Prediction

Fig. 6. Modes and directional orientations for intrapicture prediction.

Intrapicture prediction operates according to the TB size, and previously decoded boundary samples from spatially neighboring TBs are used to form the prediction signal. Directional prediction with 33 different directional orientations is defined for (square) TB sizes from 4×4 up to 32×32. The possible prediction directions are shown in Fig. 6. Alternatively, planar prediction (assuming an amplitude surface with a horizontal and vertical slope derived from the boundaries) and DC prediction (a flat surface with a value matching the mean value of the boundary samples) can also be used. For chroma, the horizontal, vertical, planar, and DC prediction modes can be explicitly signaled, or the chroma prediction mode can be indicated to be the same as the luma prediction mode (and, as a special case to avoid redundant signaling, when one of the first four choices is indicated and is the same as the luma prediction mode, the Intra_Angular[34] mode is applied instead).

Summary: Intrapicture prediction operates at the TB size and uses previously decoded boundary samples of spatially neighboring TBs to form the prediction signal. For TB sizes from 4×4 to 32×32, 33 prediction directions are defined, as shown in Fig. 6. Planar and DC prediction can also be used. For chroma, the horizontal, vertical, planar, and DC modes can be signaled explicitly, or the chroma mode can simply follow the luma prediction mode.
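DC prediction, described above as a flat surface matching the mean of the boundary samples, is easy to sketch. This is a simplified illustration, not the normative process: the function name is hypothetical, the rounding rule is an assumption, and any smoothing a real codec may apply along the block edges is omitted.

```python
def intra_dc_predict(top, left):
    """Predict an N x N block as the rounded mean of its top and left neighbors.

    `top` and `left` each hold N previously decoded boundary samples.
    """
    n = len(top)
    if len(left) != n or n not in (4, 8, 16, 32):
        raise ValueError("expected N boundary samples per side, N in 4..32")
    # Rounded integer mean over the 2N boundary samples (assumed rounding).
    dc = (sum(top) + sum(left) + n) // (2 * n)
    return [[dc] * n for _ in range(n)]


block = intra_dc_predict([100, 100, 100, 100], [104, 104, 104, 104])
for row in block:
    print(row)
```

Every sample of the predicted block takes the same value, so this mode suits smooth regions where the residual after prediction is small.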
Each CB can be coded by one of several coding types, depending on the slice type. Similar to H.264/MPEG-4 AVC, intrapicture predictive coding is supported in all slice types. HEVC supports various intrapicture predictive coding methods referred to as Intra_Angular, Intra_Planar, and Intra_DC. The following subsections present a brief further explanation of these and several techniques to be applied in common.

Summary: Depending on the slice type, each CB is coded with one of several coding types. As in H.264/MPEG-4 AVC, intrapicture predictive coding is supported in all slice types. HEVC supports several intrapicture prediction methods: angular, planar, and DC. The following subsections explain these and the techniques they share.

1) PB Partitioning: An intrapicture-predicted CB of size M×M may have one of two types of PB partitions referred to as PART_2N×2N and PART_N×N, the first of which in...