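#AV1 specification Japanese translation (2018-03-26)
- Does not use the MV information of reference frames (does not use motion field MVs)
- Does not inherit the learned probability distributions
- Does not use loop filters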
7.2. Large Scale Tile Decoding Process
The large scale tile decoding process allows a decoder to extract only the section of interest in a frame without decompressing the entire frame, which helps reduce decoder complexity.
This feature is extremely useful for VR applications (e.g. light fields), which render a single section of the frame following the viewer’s head movement.
One expected use case will have a mixture of anchor frames and camera frames.
Anchor frames can be decoded with the general decoding process, while camera frames are decoded with the large scale tile decoding process.
The entire light field at a particular time will be represented by a small number of anchor frames (representing the view from a particular viewing position and gaze direction), plus a larger number of camera frames (representing the view from a larger number of positions and directions near to the anchor frame).
Depending on the viewer’s head position and direction, the application will first decode the whole of an anchor frame, and then just the portions of the camera frame that are required to generate the viewable part of the display.
The container format for such applications is outside the scope of this specification.
Note: The reference decoder implements a particular way of sending a modified frame header and sizes that allow the required tiles to be efficiently retrieved.
This format may be useful for conformance testing purposes, but these modifications are not specified as part of the normative decoding process and may change in the future.
The aim of this process is to define a small set of requirements on a decoder implementation that will ensure that a compliant decoder will be able to support such use cases.
The inputs to this process are:
- contents of all syntax elements and variables normally produced when parsing a sequence header OBU,
- contents of all syntax elements and variables normally produced when parsing a frame header OBU,
- contents of all the values normally stored by the reference frame update process specified in section 7.19,
- a bitstream corresponding to the tile data for a single tile,
- a variable tileWidth representing the width of the tile in superblocks,
- a variable tileHeight representing the height of the tile in superblocks,
- variables startRowSb and startColSb specifying the top-left location of the tile in units of superblocks,
- a variable tileBytes representing the size in bytes of the tile data.
The output of this process is the decoded samples for the tile.
Decoders shall produce output samples that are identical in all respects to those produced by this decoding process.
In large scale tile decoding, the decode process is defined for one tile at a time.
The expectation is that the reference buffers have been produced using the general decoding process,
but this is not a normative requirement and applications can choose to provide the reference buffers in alternative manners as well.
It is a requirement of bitstream conformance that the following conditions are met:
- use_ref_frame_mvs is equal to 0
- disable_frame_end_update_cdf is equal to 1
- FrameRestorationType[ plane ] is equal to RESTORE_NONE for plane equal to 0, 1, and 2
- cdef_bits is equal to 0
- delta_lf_present is equal to 0
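The five conditions above can be checked before the process is invoked. The following is a minimal sketch, assuming the parsed frame header is available as a plain dictionary keyed by syntax element name. The dictionary layout, the function name `large_scale_tile_violations`, and the numeric value used for RESTORE_NONE are illustrative assumptions, not something this section defines.

```python
# Assumed placeholder value; the real constant is defined elsewhere in the specification.
RESTORE_NONE = 0


def large_scale_tile_violations(hdr):
    """Return the list of conformance conditions that `hdr` violates.

    `hdr` is a hypothetical dict holding parsed frame header values.
    An empty list means all five conditions are met.
    """
    violations = []
    if hdr["use_ref_frame_mvs"] != 0:
        violations.append("use_ref_frame_mvs must be 0")
    if hdr["disable_frame_end_update_cdf"] != 1:
        violations.append("disable_frame_end_update_cdf must be 1")
    for plane in range(3):
        if hdr["FrameRestorationType"][plane] != RESTORE_NONE:
            violations.append(
                f"FrameRestorationType[{plane}] must be RESTORE_NONE")
    if hdr["cdef_bits"] != 0:
        violations.append("cdef_bits must be 0")
    if hdr["delta_lf_present"] != 0:
        violations.append("delta_lf_present must be 0")
    return violations
```

Returning the list of violated conditions, rather than a single boolean, makes it easier to report why a stream was rejected.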
The decoding process defined here does not invoke the normal post-processing steps of deblock, cdef, superres, loop restoration and reference buffer update.
Implementations may choose to implement this process by using the general decode process with these tools disabled.
The process is specified as:
init_symbol( tileBytes )
clear_above_context( )
sbSize = use_128x128_superblock ? BLOCK_128X128 : BLOCK_64X64
sbSize4 = Num_4x4_Blocks_Wide[ sbSize ]
MiColStart = startColSb * sbSize4
MiColEnd = Min( MiCols, MiColStart + tileWidth * sbSize4 )
MiRowStart = startRowSb * sbSize4
MiRowEnd = Min( MiRows, MiRowStart + tileHeight * sbSize4 )
for ( r = MiRowStart; r < MiRowEnd; r += sbSize4 ) {
  clear_left_context( )
  for ( c = MiColStart; c < MiColEnd; c += sbSize4 ) {
    ReadDeltas = delta_q_present
    clear_block_decoded_flags( c < ( MiColEnd - 1 ) )
    decode_partition( r, c, sbSize )
  }
}
exit_symbol( 0 )
w = (MiColEnd - MiColStart) * MI_SIZE
h = (MiRowEnd - MiRowStart) * MI_SIZE
x0 = MiColStart * MI_SIZE
y0 = MiRowStart * MI_SIZE
subX = subsampling_x
subY = subsampling_y
xC0 = ( MiColStart * MI_SIZE ) >> subX
yC0 = ( MiRowStart * MI_SIZE ) >> subY
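The geometry arithmetic in the process above (everything except the symbol decoding itself) can be sketched in Python as follows. This is an illustration only: the function name `tile_geometry` and its snake_case parameters are hypothetical, and the values MI_SIZE = 4 and Num_4x4_Blocks_Wide = 16 or 32 for 64x64 and 128x128 superblocks are assumed from the constant tables of the specification rather than stated in this section.

```python
MI_SIZE = 4  # assumed: one mode info unit spans 4 luma samples


def tile_geometry(start_row_sb, start_col_sb, tile_width, tile_height,
                  mi_rows, mi_cols, use_128x128_superblock,
                  subsampling_x, subsampling_y):
    """Compute the tile bounds and output origin for one tile."""
    # Assumed values of Num_4x4_Blocks_Wide[ sbSize ].
    sb_size4 = 32 if use_128x128_superblock else 16

    # Tile extent in mode info units, clamped to the frame.
    mi_col_start = start_col_sb * sb_size4
    mi_col_end = min(mi_cols, mi_col_start + tile_width * sb_size4)
    mi_row_start = start_row_sb * sb_size4
    mi_row_end = min(mi_rows, mi_row_start + tile_height * sb_size4)

    # Tile size and origin in luma samples, and origin in chroma samples.
    x0 = mi_col_start * MI_SIZE
    y0 = mi_row_start * MI_SIZE
    return {
        "w": (mi_col_end - mi_col_start) * MI_SIZE,
        "h": (mi_row_end - mi_row_start) * MI_SIZE,
        "x0": x0,
        "y0": y0,
        "xC0": x0 >> subsampling_x,
        "yC0": y0 >> subsampling_y,
    }
```

For example, a tile two superblocks wide and one superblock high, starting at superblock row 1 and column 2 of a frame with MiRows = 30 and MiCols = 100, is clamped vertically by the frame edge, so its height is 56 luma samples rather than 64.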
The intention is that the same decoding process for tile data can be used as for the general decoding process.
It is a requirement of bitstream conformance that:
- w is less than or equal to 4096
- h is less than or equal to 4096
- at most 512 tiles are decoded for each frame (the frame may include more than 512 tiles, but at most 512 will be required for each render from a single frame)
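The limits above apply per render, not per frame. As a minimal sketch, an application could validate the set of tiles it intends to decode for one render as shown below. The function name `check_render_limits` and the representation of tiles as (w, h) pairs are assumptions made for illustration.

```python
MAX_TILE_DIM = 4096          # upper bound on w and h, in luma samples
MAX_TILES_PER_RENDER = 512   # upper bound on tiles decoded per render


def check_render_limits(tile_sizes):
    """Return True if the tiles requested for one render satisfy the limits.

    `tile_sizes` is a list of (w, h) pairs, one per tile to be decoded
    from a single frame for a single render.
    """
    if len(tile_sizes) > MAX_TILES_PER_RENDER:
        return False
    return all(w <= MAX_TILE_DIM and h <= MAX_TILE_DIM
               for w, h in tile_sizes)
```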
Arrays OutY, OutU, OutV (representing the decoded samples for the tile) are specified as:
- The array OutY is w samples across by h samples down and the sample at location x samples across and y samples down is given by OutY[ y ][ x ] = CurrFrame[ 0 ][ y0 + y ][ x0 + x ] with x = 0..w - 1 and y = 0..h - 1.
- The array OutU is (w + subX) >> subX samples across by (h + subY) >> subY samples down and the sample at location x samples across and y samples down is given by OutU[ y ][ x ] = CurrFrame[ 1 ][ yC0 + y ][ xC0 + x ] with x = 0..(w >> subX) - 1 and y = 0..(h >> subY) - 1.
- The array OutV is (w + subX) >> subX samples across by (h + subY) >> subY samples down and the sample at location x samples across and y samples down is given by OutV[ y ][ x ] = CurrFrame[ 2 ][ yC0 + y ][ xC0 + x ] with x = 0..(w >> subX) - 1 and y = 0..(h >> subY) - 1.
If mono_chrome is equal to 0, the output of this process is arrays OutY, OutU, OutV representing the Y, U, and V samples.
Otherwise (mono_chrome is equal to 1), the output of this process is array OutY.
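The array definitions and the mono_chrome rule above can be sketched as a copy out of the current frame. This is an illustration under stated assumptions: `curr_frame` is taken to be a list of planes, each a list of rows, indexed as CurrFrame[ plane ][ y ][ x ], and the function name `extract_tile_output` is hypothetical. The chroma ranges follow the index ranges given above, x = 0..(w >> subX) - 1 and y = 0..(h >> subY) - 1.

```python
def extract_tile_output(curr_frame, x0, y0, w, h, sub_x, sub_y, mono_chrome=0):
    """Copy the decoded samples of one tile out of the current frame.

    Returns (OutY, OutU, OutV); OutU and OutV are None when mono_chrome is 1.
    """
    out_y = [[curr_frame[0][y0 + y][x0 + x] for x in range(w)]
             for y in range(h)]
    if mono_chrome:
        return out_y, None, None

    # Chroma origin and size follow from the luma values and the subsampling.
    xc0, yc0 = x0 >> sub_x, y0 >> sub_y
    cw, ch = w >> sub_x, h >> sub_y
    out_u = [[curr_frame[1][yc0 + y][xc0 + x] for x in range(cw)]
             for y in range(ch)]
    out_v = [[curr_frame[2][yc0 + y][xc0 + x] for x in range(cw)]
             for y in range(ch)]
    return out_y, out_u, out_v
```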
The bitdepth of each output sample is given by BitDepth.