X264 Source Code Analysis: The x264_zigzag_init Function

This article documents the zigzag scanning technique used in X264.

In X264, the zigzag functions are set up in x264_zigzag_init, which is defined in dct.c.

First, here is how the source sets them up:

void x264_zigzag_init( int cpu, x264_zigzag_function_t *pf_progressive, x264_zigzag_function_t *pf_interlaced )
{
    pf_interlaced->scan_8x8   = zigzag_scan_8x8_field;
    pf_progressive->scan_8x8  = zigzag_scan_8x8_frame;
    pf_interlaced->scan_4x4   = zigzag_scan_4x4_field;
    pf_progressive->scan_4x4  = zigzag_scan_4x4_frame;
    pf_interlaced->sub_8x8    = zigzag_sub_8x8_field;
    pf_progressive->sub_8x8   = zigzag_sub_8x8_frame;
    pf_interlaced->sub_4x4    = zigzag_sub_4x4_field;
    pf_progressive->sub_4x4   = zigzag_sub_4x4_frame;
    pf_interlaced->sub_4x4ac  = zigzag_sub_4x4ac_field;
    pf_progressive->sub_4x4ac = zigzag_sub_4x4ac_frame;
    ...
}
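To illustrate the dispatch pattern used above, here is a minimal sketch. Everything in it (coef_t, zigzag_table_sketch_t, the *_stub functions, run_scan_sketch) is a hypothetical name made up for this example and is not part of X264; the stubs only write a marker instead of doing a real scan. It only shows the idea of filling one function table per picture type and then calling through whichever table applies.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t coef_t; /* hypothetical stand-in for dctcoef */

/* Hypothetical, reduced function table: one entry instead of five. */
typedef struct
{
    void (*scan_4x4)( coef_t level[16], coef_t dct[16] );
} zigzag_table_sketch_t;

/* Placeholder bodies: each writes a marker so the two variants can be
 * told apart. They do NOT perform a real scan. */
static void scan_4x4_frame_stub( coef_t level[16], coef_t dct[16] )
{
    (void)dct;
    level[0] = 1;
}

static void scan_4x4_field_stub( coef_t level[16], coef_t dct[16] )
{
    (void)dct;
    level[0] = 2;
}

/* Same shape as x264_zigzag_init: fill one table per picture type. */
static void zigzag_init_sketch( zigzag_table_sketch_t *pf_progressive,
                                zigzag_table_sketch_t *pf_interlaced )
{
    pf_progressive->scan_4x4 = scan_4x4_frame_stub;
    pf_interlaced->scan_4x4  = scan_4x4_field_stub;
}

/* The table is chosen once; the call site is identical for both types. */
static void run_scan_sketch( int b_interlaced, coef_t level[16], coef_t dct[16] )
{
    zigzag_table_sketch_t progressive, interlaced;
    assert( level && dct );
    zigzag_init_sketch( &progressive, &interlaced );
    const zigzag_table_sketch_t *pf = b_interlaced ? &interlaced : &progressive;
    pf->scan_4x4( level, dct );
}
```

Calling run_scan_sketch with b_interlaced set to 0 goes through the frame stub, and with 1 through the field stub, without any branch at the point where the scan is invoked.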

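The source shows that zigzag scans can be classified in two ways: by block size, into 8x8 and 4x4 scans, and by picture type, into frame scans and field scans.
Let us start with the simplest case, the frame scan of a 4x4 block: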

#define ZIGZAG4_FRAME\
    ZIGDC( 0,0,0) ZIG( 1,0,1) ZIG( 2,1,0) ZIG( 3,2,0)\
    ZIG( 4,1,1) ZIG( 5,0,2) ZIG( 6,0,3) ZIG( 7,1,2)\
    ZIG( 8,2,1) ZIG( 9,3,0) ZIG(10,3,1) ZIG(11,2,2)\
    ZIG(12,1,3) ZIG(13,2,3) ZIG(14,3,2) ZIG(15,3,3)

#define ZIG(i,y,x) level[i] = dct[x*4+y];
#define ZIGDC(i,y,x) ZIG(i,y,x)

static void zigzag_scan_4x4_frame( dctcoef level[16], dctcoef dct[16] )
{
    ZIGZAG4_FRAME
}

Expanding the macros above gives:

level[0] = dct[0*4 + 0];
level[1] = dct[1*4 + 0];
level[2] = dct[0*4 + 1];
level[3] = dct[0*4 + 2];
level[4] = dct[1*4 + 1];
level[5] = dct[2*4 + 0];
level[6] = dct[3*4 + 0];
level[7] = dct[2*4 + 1];
level[8] = dct[1*4 + 2];
level[9] = dct[0*4 + 3];
level[10] = dct[1*4 + 3];
level[11] = dct[2*4 + 2];
level[12] = dct[3*4 + 1];
level[13] = dct[3*4 + 2];
level[14] = dct[2*4 + 3];
level[15] = dct[3*4 + 3];

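The code above simply scans the two-dimensional array of a 4x4 block into a one-dimensional linear array, but the scan pattern itself is hard to picture from the code alone.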
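One way to make the pattern visible is to reuse the same ZIGZAG4_FRAME list with a different ZIG definition, so that it records scan positions instead of copying coefficients. The sketch below does that; zigzag4_frame_grid and print_grid are hypothetical helper names for this example, and reading y as the row and x as the column is an assumption made for display purposes.

```c
#include <assert.h>
#include <stdio.h>

/* Scan list copied from the article: ZIG( scan position i, y, x ). */
#define ZIGZAG4_FRAME\
    ZIGDC( 0,0,0) ZIG( 1,0,1) ZIG( 2,1,0) ZIG( 3,2,0)\
    ZIG( 4,1,1) ZIG( 5,0,2) ZIG( 6,0,3) ZIG( 7,1,2)\
    ZIG( 8,2,1) ZIG( 9,3,0) ZIG(10,3,1) ZIG(11,2,2)\
    ZIG(12,1,3) ZIG(13,2,3) ZIG(14,3,2) ZIG(15,3,3)

/* Redefined for this sketch: store the scan position at (y, x). */
#define ZIG(i,y,x) grid[y][x] = i;
#define ZIGDC(i,y,x) ZIG(i,y,x)

/* Fill grid[y][x] with the scan position of the coefficient at (y, x). */
static void zigzag4_frame_grid( int grid[4][4] )
{
    assert( grid );
    ZIGZAG4_FRAME
}

/* Print the grid, one row of the block per output line. */
static void print_grid( int grid[4][4] )
{
    for( int y = 0; y < 4; y++ )
        printf( "%2d %2d %2d %2d\n", grid[y][0], grid[y][1], grid[y][2], grid[y][3] );
}
```

Filling a grid with zigzag4_frame_grid and passing it to print_grid gives the familiar zigzag layout:

 0  1  5  6
 2  4  7 12
 3  8 11 13
 9 10 14 15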

The book The H.264 Advanced Video Compression Standard describes it as follows:

Blocks of transform coefficients are scanned, i.e. converted to linear array, prior to entropy coding. The scan order is intended to group together significant coefficients, i.e. non-zero quantized coefficients. In a typical block in a progressive frame, non-zero coefficients tend to be clustered around the top left 'DC' coefficient. In this case, a zigzag scan order may be the most efficient, shown in 4x4 and 8x8 blocks. After scanning the block in a zigzag order, the coefficients are placed in a linear array in which most of the non-zero coefficients tend to occur near the start of the array.
However, in an interlaced field or a field of a progressive frame converted from interlaced content, vertical frequencies in each block tend to dominate because the field is vertically sub-sampled from the original scene. This means that non-zero coefficients tend to occur at the top and towards the left side of the block. A block in a field macroblock is therefore scanned in a modified field scan order.
Figure: Block scan orders

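As this description shows, after scanning, the non-zero coefficients are concentrated in the first few positions of the one-dimensional linear array. An example figure follows: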

The figure above gives the scan order. The X264 source for the other scans is similar to the 4x4 frame case, so it is not repeated here.
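The clustering effect described in the quoted passage can be checked with a small experiment. The sketch below is a table-driven equivalent of the 4x4 frame scan (the index table is just x*4+y for each entry of the ZIG list) applied to an artificial block in which only the six lowest-frequency coefficients are non-zero. The names coef_t, zigzag4_frame_idx, scan_4x4_frame_table, trailing_zeros and make_low_freq_block are hypothetical, and the test block is made-up data, not encoder output.

```c
#include <assert.h>
#include <stdint.h>

typedef int16_t coef_t; /* hypothetical stand-in for dctcoef */

/* zigzag4_frame_idx[i] is the dct[] index read for level[i],
 * i.e. x*4+y taken from each ZIG(i,y,x) entry of the frame list. */
static const uint8_t zigzag4_frame_idx[16] =
{
    0, 4, 1, 2, 5, 8, 12, 9, 6, 3, 7, 10, 13, 14, 11, 15
};

/* Table-driven 4x4 frame scan; input and output must not alias. */
static void scan_4x4_frame_table( coef_t level[16], const coef_t dct[16] )
{
    assert( level != dct );
    for( int i = 0; i < 16; i++ )
        level[i] = dct[zigzag4_frame_idx[i]];
}

/* Number of consecutive zeros at the end of a 16-entry array. */
static int trailing_zeros( const coef_t v[16] )
{
    int n = 0;
    for( int i = 15; i >= 0 && v[i] == 0; i-- )
        n++;
    return n;
}

/* Artificial block stored as dct[x*4+y]: only the coefficients with
 * x + y <= 2 (the six lowest frequencies) are non-zero. */
static void make_low_freq_block( coef_t dct[16] )
{
    for( int x = 0; x < 4; x++ )
        for( int y = 0; y < 4; y++ )
            dct[x*4+y] = ( x + y <= 2 ) ? 1 : 0;
}
```

For this block, the array in storage order ends with 7 zeros and has zeros mixed in among the non-zero values, while the scanned array holds all six non-zero values in level[0] to level[5] followed by 10 zeros, which is the shape an entropy coder benefits from.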

懒人李冰
