This article documents the zigzag scanning technique used in x264.
In x264, the zigzag functions are defined in the `x264_zigzag_init` function in `dct.c`.
First, here is how the source code defines zigzag:
[Code listing not preserved: `x264_zigzag_init` from `dct.c`]
The source shows that zigzag scans can be classified in two ways: by transform block size, into 8x8 and 4x4 scans, and by picture type, into `frame` and `field` scans.
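The two classification axes can be pictured as a small lookup table indexed by block size and picture type. The following is only a sketch of that idea: every name in it (`select_scan`, `scan_variant`, the enum values) is hypothetical and is not taken from the x264 source.

```c
/* Hypothetical names throughout; this only mirrors the two axes described
 * in the text and is not x264's actual API. */
typedef enum { BLOCK_4X4 = 0, BLOCK_8X8 = 1 } block_size;
typedef enum { PICTURE_FRAME = 0, PICTURE_FIELD = 1 } picture_type;

typedef struct {
    const char *name;   /* human-readable label of the scan variant */
    int         n_coef; /* number of coefficients the scan produces */
} scan_variant;

/* One entry per (block size, picture type) combination: four in total. */
static const scan_variant scan_variants[2][2] = {
    { { "4x4 frame scan", 16 }, { "4x4 field scan", 16 } },
    { { "8x8 frame scan", 64 }, { "8x8 field scan", 64 } },
};

/* Pick the variant for a given block size and picture type. */
static const scan_variant *select_scan(block_size size, picture_type type)
{
    return &scan_variants[size][type];
}
```

Calling `select_scan(BLOCK_8X8, PICTURE_FIELD)` returns the entry labelled "8x8 field scan" with 64 coefficients. A real encoder would store function pointers in such a table instead of labels, so that the choice is made once at initialization.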
Start with the simplest case, the frame scan of a 4x4 block:
[Code listing not preserved: 4x4 frame scan]
Expanding the definition above gives:
[Code listing not preserved: expanded 4x4 frame scan]
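The code above simply scans the two-dimensional array of a 4x4 block into a one-dimensional linear array, but it does not make the scan pattern itself easy to visualize.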
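To make the pattern easier to see, here is a minimal table-driven sketch of a 4x4 frame zigzag scan. It assumes the block is stored in row-major order (index = row * 4 + column). The names `zigzag4_frame` and `scan_4x4_frame` are illustrative, and x264's own implementation may use a different memory layout and macro-based code.

```c
#include <stdint.h>

/* Raster index (row * 4 + column) visited at each scan position.
 * This is the standard H.264 4x4 zigzag (frame) order. */
static const uint8_t zigzag4_frame[16] = {
     0,  1,  4,  8,
     5,  2,  3,  6,
     9, 12, 13, 10,
     7, 11, 14, 15
};

/* Copy a row-major 4x4 block into a linear array in zigzag order.
 * Illustrative helper, not an x264 function. */
static void scan_4x4_frame(int16_t level[16], const int16_t block[16])
{
    for (int i = 0; i < 16; i++)
        level[i] = block[zigzag4_frame[i]];
}
```

For a block with rows `{9,5,1,0}`, `{4,2,0,0}`, `{1,0,0,0}`, `{0,0,0,0}`, the scan yields `9, 5, 4, 1, 2, 1` followed by ten zeros: the path runs right, then diagonally down-left, then down, and so on, so the coefficients near the top-left corner come out first.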
The H.264 Advanced Video Compression Standard describes it as follows:
Blocks of transform coefficients are scanned, i.e. converted to linear array, prior to entropy coding. The scan order is intended to group together significant coefficients, i.e. non-zero quantized coefficients. In a typical block in a progressive frame, non-zero coefficients tend to be clustered around the top-left 'DC' coefficient. In this case, a zigzag scan order may be the most efficient, shown in 4x4 and 8x8 blocks. After scanning the block in a zigzag order, the coefficients are placed in a linear array in which most of the non-zero coefficients tend to occur near the start of the array.
However, in an interlaced field or a field of a progressive frame converted from interlaced content, vertical frequencies in each block tend to dominate because the field is vertically sub-sampled from the original scene. This means that non-zero coefficients tend to occur at the top and towards the left side of the block. A block in a field macroblock is therefore scanned in a modified field scan order.
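As the description shows, after scanning, the non-zero coefficients are concentrated in the first few positions of the one-dimensional linear array. An example figure:
[Figure: scan order]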
The figure above shows the scan order. The corresponding x264 source is similar to the 4x4 frame case, so it is not repeated here.
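For comparison with the frame case, the field scan of a 4x4 block can be sketched in the same table-driven style. This again assumes a row-major block. The names `zigzag4_field`, `scan_4x4_field` and `last_nonzero` are illustrative and not taken from x264.

```c
#include <stdint.h>

/* Raster index (row * 4 + column) visited at each scan position.
 * This is the H.264 4x4 field scan order, which walks down the
 * left columns early to pick up vertical frequencies first. */
static const uint8_t zigzag4_field[16] = {
     0,  4,  1,  8,
    12,  5,  9, 13,
     2,  6, 10, 14,
     3,  7, 11, 15
};

/* Copy a row-major 4x4 block into a linear array in field-scan order. */
static void scan_4x4_field(int16_t level[16], const int16_t block[16])
{
    for (int i = 0; i < 16; i++)
        level[i] = block[zigzag4_field[i]];
}

/* Index of the last non-zero entry in the scanned array, or -1 if all zero. */
static int last_nonzero(const int16_t level[16])
{
    for (int i = 15; i >= 0; i--)
        if (level[i] != 0)
            return i;
    return -1;
}
```

Take a block whose energy lies entirely in the left column, with rows `{7,0,0,0}`, `{5,0,0,0}`, `{3,0,0,0}`, `{1,0,0,0}`. The field scan produces `7, 5, 0, 3, 1` followed by zeros, so `last_nonzero` returns 4. Under the frame zigzag order the bottom-left coefficient (raster index 12) would only appear at scan position 9, which illustrates why field content gets its own scan order.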