懒人李冰

Notes on my life and studies

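X264 Source Code Analysis: The x264_zigzag_init Function

This post documents the zigzag scanning technique used in X264.

In X264, the zigzag functions are defined in dct.c and are registered by x264_zigzag_init.

First, here is how the source sets up the zigzag functions: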

void x264_zigzag_init( int cpu, x264_zigzag_function_t *pf_progressive, x264_zigzag_function_t *pf_interlaced )
{
    pf_interlaced->scan_8x8   = zigzag_scan_8x8_field;
    pf_progressive->scan_8x8  = zigzag_scan_8x8_frame;
    pf_interlaced->scan_4x4   = zigzag_scan_4x4_field;
    pf_progressive->scan_4x4  = zigzag_scan_4x4_frame;
    pf_interlaced->sub_8x8    = zigzag_sub_8x8_field;
    pf_progressive->sub_8x8   = zigzag_sub_8x8_frame;
    pf_interlaced->sub_4x4    = zigzag_sub_4x4_field;
    pf_progressive->sub_4x4   = zigzag_sub_4x4_frame;
    pf_interlaced->sub_4x4ac  = zigzag_sub_4x4ac_field;
    pf_progressive->sub_4x4ac = zigzag_sub_4x4ac_frame;
    ...
}
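The pattern here is a function-pointer table that is filled once and then called through. Below is a minimal sketch of that idea, not x264's actual code: the `demo_` types and functions are hypothetical stand-ins, the frame order is derived from the ZIGZAG4_FRAME macro shown later in this post, and the field function is only a placeholder that copies its input.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins for x264's types (not the real definitions). */
typedef int16_t demo_coef;

typedef struct {
    void (*scan_4x4)(demo_coef level[16], const demo_coef dct[16]);
} demo_zigzag_function_t;

/* Frame order derived from ZIGZAG4_FRAME: level[i] = dct[x*4 + y]. */
void demo_scan_4x4_frame(demo_coef level[16], const demo_coef dct[16])
{
    static const uint8_t order[16] = {
        0, 4, 1, 2, 5, 8, 12, 9, 6, 3, 7, 10, 13, 14, 11, 15
    };
    for (int i = 0; i < 16; i++)
        level[i] = dct[order[i]];
}

/* Placeholder only: copies straight through. This is NOT the real field order. */
void demo_scan_4x4_field(demo_coef level[16], const demo_coef dct[16])
{
    for (int i = 0; i < 16; i++)
        level[i] = dct[i];
}

/* Same shape as x264_zigzag_init: fill both tables once. */
void demo_zigzag_init(demo_zigzag_function_t *pf_progressive,
                      demo_zigzag_function_t *pf_interlaced)
{
    pf_progressive->scan_4x4 = demo_scan_4x4_frame;
    pf_interlaced->scan_4x4  = demo_scan_4x4_field;
}

/* A caller picks one table and then calls through the pointer. */
void demo_scan_block(int interlaced, demo_coef level[16], const demo_coef dct[16])
{
    demo_zigzag_function_t progressive, field;
    demo_zigzag_init(&progressive, &field);
    const demo_zigzag_function_t *zz = interlaced ? &field : &progressive;
    zz->scan_4x4(level, dct);
}
```

The benefit of this layout is that the per-block code never has to branch on frame versus field. The choice is made once, when the table is selected.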

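The source shows that zigzag scans can be classified in two ways: by block size, into 8x8 and 4x4 scans, and by picture type, into frame (progressive) and field (interlaced) scans.
Let's start with the simplest case, the 4x4 frame scan: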

#define ZIGZAG4_FRAME\
    ZIGDC( 0,0,0) ZIG( 1,0,1) ZIG( 2,1,0) ZIG( 3,2,0)\
    ZIG( 4,1,1) ZIG( 5,0,2) ZIG( 6,0,3) ZIG( 7,1,2)\
    ZIG( 8,2,1) ZIG( 9,3,0) ZIG(10,3,1) ZIG(11,2,2)\
    ZIG(12,1,3) ZIG(13,2,3) ZIG(14,3,2) ZIG(15,3,3)

#define ZIG(i,y,x) level[i] = dct[x*4+y];
#define ZIGDC(i,y,x) ZIG(i,y,x)

static void zigzag_scan_4x4_frame( dctcoef level[16], dctcoef dct[16] )
{
    ZIGZAG4_FRAME
}

Expanding the macros above gives:

level[0] = dct[0*4 + 0];
level[1] = dct[1*4 + 0];
level[2] = dct[0*4 + 1];
level[3] = dct[0*4 + 2];
level[4] = dct[1*4 + 1];
level[5] = dct[2*4 + 0];
level[6] = dct[3*4 + 0];
level[7] = dct[2*4 + 1];
level[8] = dct[1*4 + 2];
level[9] = dct[0*4 + 3];
level[10] = dct[1*4 + 3];
level[11] = dct[2*4 + 2];
level[12] = dct[3*4 + 1];
level[13] = dct[3*4 + 2];
level[14] = dct[2*4 + 3];
level[15] = dct[3*4 + 3];
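The same sixteen assignments can also be written as a lookup table, which makes it easy to check an expansion by machine instead of by eye. The sketch below keeps the macro version from this post next to a table version. The table and the `_macro` / `_table` function names are mine, and `dctcoef` is assumed to be a 16-bit integer here.

```c
#include <stdint.h>

typedef int16_t dctcoef;   /* assumption: 16-bit coefficients */

#define ZIGZAG4_FRAME\
    ZIGDC( 0,0,0) ZIG( 1,0,1) ZIG( 2,1,0) ZIG( 3,2,0)\
    ZIG( 4,1,1) ZIG( 5,0,2) ZIG( 6,0,3) ZIG( 7,1,2)\
    ZIG( 8,2,1) ZIG( 9,3,0) ZIG(10,3,1) ZIG(11,2,2)\
    ZIG(12,1,3) ZIG(13,2,3) ZIG(14,3,2) ZIG(15,3,3)

#define ZIG(i,y,x) level[i] = dct[x*4+y];
#define ZIGDC(i,y,x) ZIG(i,y,x)

/* Macro version, as in the post. */
void zigzag_scan_4x4_frame_macro(dctcoef level[16], const dctcoef dct[16])
{
    ZIGZAG4_FRAME
}

/* dct[] index read for each level[i], i.e. x*4+y of each ZIG(i,y,x) entry. */
const uint8_t zigzag4_frame_index[16] = {
    0, 4, 1, 2, 5, 8, 12, 9, 6, 3, 7, 10, 13, 14, 11, 15
};

/* Table version: should produce the same output as the macro version. */
void zigzag_scan_4x4_frame_table(dctcoef level[16], const dctcoef dct[16])
{
    for (int i = 0; i < 16; i++)
        level[i] = dct[zigzag4_frame_index[i]];
}
```

Running both functions on the same input and comparing the two outputs element by element is a quick sanity check for a hand-written expansion.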

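The code above simply scans the two-dimensional array of a 4x4 block into a one-dimensional linear array, but the listing alone does not make the scan pattern easy to picture.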
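One way to make the pattern visible is to print, for every position in the block, the step at which the scan visits it. This is a small sketch of my own, assuming that in ZIG(i,y,x) the argument y is the row and x is the column. The helper names are hypothetical.

```c
#include <stdio.h>

/* (y, x) pairs copied from ZIGZAG4_FRAME, in scan order. */
const int zigzag4_frame_yx[16][2] = {
    {0,0}, {0,1}, {1,0}, {2,0},
    {1,1}, {0,2}, {0,3}, {1,2},
    {2,1}, {3,0}, {3,1}, {2,2},
    {1,3}, {2,3}, {3,2}, {3,3}
};

/* grid[y][x] receives the scan position i of the coefficient at row y, column x. */
void zigzag4_frame_grid(int grid[4][4])
{
    for (int i = 0; i < 16; i++)
        grid[zigzag4_frame_yx[i][0]][zigzag4_frame_yx[i][1]] = i;
}

/* Prints the grid, one row per line:
 *    0  1  5  6
 *    2  4  7 12
 *    3  8 11 13
 *    9 10 14 15
 */
void print_zigzag4_frame_grid(void)
{
    int grid[4][4];
    zigzag4_frame_grid(grid);
    for (int y = 0; y < 4; y++)
        printf("%2d %2d %2d %2d\n", grid[y][0], grid[y][1], grid[y][2], grid[y][3]);
}
```

Following the numbers 0, 1, 2, ... through the grid traces the zigzag: it starts at the top-left corner and sweeps back and forth along the anti-diagonals toward the bottom-right corner.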

The book The H.264 Advanced Video Compression Standard describes it as follows:

Blocks of transform coefficients are scanned, i.e. converted to linear array, prior to entropy coding. The scan order is intended to group together significant coefficients, i.e. non-zero quantized coefficients. In a typical block in a progressive frame, non-zero coefficients tend to be clustered around the top left 'DC' coefficient. In this case, a zigzag scan order may be the most efficient, shown in 4x4 and 8x8 blocks. After scanning the block in a zigzag order, the coefficients are placed in a linear array in which most of the non-zero coefficients tend to occur near the start of the array.

However, in an interlaced field or a field of a progressive frame converted from interlaced content, vertical frequencies in each block tend to dominate because the field is vertically sub-sampled from the original scene. This means that non-zero coefficients tend to occur at the top and towards the left side of the block. A block in a field macroblock is therefore scanned in a modified field scan order.

Block scan orders

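As the description shows, after scanning, the non-zero coefficients are concentrated in the first few positions of the one-dimensional linear array. An example figure follows: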

The figure above shows the scan orders. The X264 source for the other scans is similar to the 4x4 frame case, so it is not repeated here.
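To put numbers on the clustering claim, here is a rough sketch that measures where the last non-zero coefficient lands under different scan orders. Everything in it is illustrative: the two blocks are made-up data, the blocks are stored row-major (index y*4 + x) for readability rather than with the dct[x*4+y] addressing used by the X264 macro, and the 4x4 field order is written from my recollection of the H.264 field scan, so please check it against the specification before relying on it.

```c
#include <stdint.h>

/* Row-major source index (y*4 + x) visited at each scan position. */
const uint8_t raster_order[16] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
const uint8_t frame_order[16]  = { 0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15 };
/* Assumption: 4x4 field scan order as recalled from the H.264 specification. */
const uint8_t field_order[16]  = { 0, 4, 1, 8, 12, 5, 9, 13, 2, 6, 10, 14, 3, 7, 11, 15 };

/* Made-up quantized block: energy clustered around the top-left DC coefficient. */
const int16_t typical_block[16] = {
    7, 3, 1, 0,
    2, 1, 0, 0,
    1, 0, 0, 0,
    0, 0, 0, 0
};

/* Made-up quantized block: energy running down the left column (vertical frequencies). */
const int16_t vertical_block[16] = {
    5, 1, 0, 0,
    4, 0, 0, 0,
    3, 0, 0, 0,
    2, 0, 0, 0
};

/* Returns the scan position of the last non-zero coefficient, or -1 for an all-zero block. */
int last_nonzero_in_scan(const int16_t block[16], const uint8_t order[16])
{
    int last = -1;
    for (int i = 0; i < 16; i++)
        if (block[order[i]] != 0)
            last = i;
    return last;
}
```

For `typical_block`, a plain raster read leaves the last non-zero value at position 8, while the frame zigzag pulls it in to position 5. For `vertical_block`, the frame zigzag leaves it at position 9 and the field order brings it to position 4. The earlier the last non-zero coefficient, the longer the run of trailing zeros left for the entropy coder.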