
Reading the AV1 specification, 2018-03-26 .. 04-17

Last updated at 2018-04-28. Posted at 2017-08-07.

AOMedia Video 1 (AV1) is an open, royalty-free video compression codec designed for video delivery over the Internet. It aims to replace Google's VP9 and MPEG's HEVC/H.265.

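The AV1 specification was frozen on March 28, 2018.
"Frozen" notwithstanding, the document is still being updated, and the reference implementation is still being debugged and optimized.
It is probably best read as "no more major changes or additions to the spec."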

The AV1 specification is published on GitHub.

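I am writing a Japanese translation to help myself understand the AV1 specification.
For now, I am working from the 2018-03-26 version.
(The spec is updated almost daily, so I will update this when I feel like it.)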

First, a brief description of what each chapter covers.
The chapter-by-chapter Japanese translations are on separate pages.

1. Scope

Defines the scope of the document.

2. Terms and Definitions

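Definitions of terms.
We must not read a standard by feel.
Some terms carry a meaning beyond that of the ordinary word, so it is worth consulting this chapter from time to time.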

3. Symbols and Abbreviated Terms

Abbreviations and constant definitions.
(Even with the constants defined in one place, I have forgotten them by the time they are referenced...)

4. Conventions

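Defines the arithmetic, logical, and relational operators, bitwise operations, assignment, mathematical functions, bitstream notation, function notation, and so on.
You can mostly get by on intuition here, so read the details when you feel like it.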
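
To give a feel for the kind of mathematical function this chapter defines, here is a minimal sketch in Python. The names `clip3`, `round2`, and `floor_log2` and their definitions follow my recollection of the helpers used in VP9/AV1-style specifications; treat them as assumptions and check the exact definitions in chapter 4 before relying on them.

```python
def clip3(lo: int, hi: int, x: int) -> int:
    """Clamp x into the inclusive range [lo, hi] (assumed argument order)."""
    if x < lo:
        return lo
    if x > hi:
        return hi
    return x


def round2(x: int, n: int) -> int:
    """Divide a non-negative x by 2**n, rounding to nearest (assumed definition)."""
    if n == 0:
        return x
    return (x + (1 << (n - 1))) >> n


def floor_log2(x: int) -> int:
    """Floor of log2(x) for x >= 1."""
    return x.bit_length() - 1


# Example: clamp a reconstructed sample into the 8-bit range.
pixel = clip3(0, 255, 300)  # 255
```

Helpers like these show up all over the decoding process, which is why the details are worth reading eventually.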

5. Syntax Structures

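Defines the syntax structure of a bitstream that conforms to the AV1 standard.
It is given in tabular form, which takes some getting used to.
(The AVC and HEVC standards are written in almost the same format.)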

The syntax definition and the decoding process are described together.
(Personally, I think separating them would make it easier to understand.)

5.1. OBU Syntax
5.2. Reserved OBU Syntax
5.3. Sequence Header OBU Syntax
5.4. Temporal Delimiter OBU Syntax
5.5. Padding OBU Syntax
5.6. Metadata OBU Syntax
5.7. Frame Header OBU Syntax
5.8. Frame OBU Syntax
5.9. Tile Group OBU Syntax

6. Syntax Structures Semantics

Explains the meaning of each element of the syntax structures defined in chapter 5.

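Explanations of parameters make little sense when you do not yet know the processing that uses them.
The actual decoding process is defined from chapter 7 onward, so I recommend reading that first.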

6.1. OBU Semantics
6.2. Reserved OBU Semantics
6.3. Sequence Header OBU Semantics
6.4. Temporal Delimiter OBU Semantics
6.5. Padding OBU Semantics
6.6. Metadata OBU Semantics
6.7. Frame Header OBU Semantics
6.8. Frame OBU Semantics
6.9. Tile Group OBU Semantics

7. Decoding Process

Describes the process of decoding images using the variables defined in chapters 5 and 6.

[7.1. General Decoding Process](https://qiita.com/srmfsan/items/1819e572dc10b4d742c2)
[7.2. Large Scale Tile Decoding Process](https://qiita.com/srmfsan/items/26628c2dea796a2bfb14)
[7.3. Decode Frame Process](https://qiita.com/srmfsan/items/5d4f95c811c8443155c0)
[7.4. Ordering of OBUs](https://qiita.com/srmfsan/items/4cc31732354da362f7fc)
7.5. Random Access Decoding
[7.6. CDF Update Process](https://qiita.com/srmfsan/items/045ddfd13405e3815eca)
[7.7. Set Frame Refs Process](https://qiita.com/srmfsan/items/43f955c7627e9d968380)
[7.8. Motion Field Estimation Process](https://qiita.com/srmfsan/items/5bc65a32e55e9c337604)
- 7.8.1. Projection Process
- 7.8.2. Get MV Projection Process
- 7.8.3. Get Block Position Process
7.9. Motion Vector Prediction Processes
- 7.9.1. Find MV Stack Process
  - 7.9.1.1. Setup Zero MV Process
  - 7.9.1.2. Scan Row Process
  - 7.9.1.3. Scan Col Process
  - 7.9.1.4. Scan Point Process
  - 7.9.1.5. Temporal Scan Process
  - 7.9.1.6. Temporal Sample Process
  - 7.9.1.7. Add Reference Motion Vector Process
  - 7.9.1.8. Search Stack Process
  - 7.9.1.9. Compound Search Stack Process
  - 7.9.1.10. Lower Precision Process
  - 7.9.1.11. Sorting Process
  - 7.9.1.12. Extra Search Process
  - 7.9.1.13. Add Extra Mv Candidate Process
  - 7.9.1.14. Context and Clamping Process
- 7.9.2. Has Overlappable Candidates Process
- 7.9.3. Find Warp Samples Process
7.10. Prediction Processes
- [7.10.1. Intra Prediction Process](https://qiita.com/srmfsan/items/4e3936243baaf4975b82)
  - [7.10.1.1. Basic Intra Prediction Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.2. Recursive Intra Prediction Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.3. Directional Intra Prediction Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.4. DC Intra Prediction Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.5. Smooth Intra Prediction Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.6. Filter Corner Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.7. Intra Filter Type Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.8. Intra Edge Filter Strength Selection Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.9. Intra Edge Upsample Selection Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.10. Intra Edge Upsample Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
  - [7.10.1.11. Intra Edge Filter Process](https://qiita.com/srmfsan/items/f2b02f3f251d612474f8)
- [7.10.2. Inter Prediction Process](https://qiita.com/srmfsan/items/b08bf2683974fe1dbd7f)
  - [7.10.2.1. Rounding Variables Derivation Process](https://qiita.com/srmfsan/items/8da4c762cb4f92f24dda)
  - [7.10.2.2. Motion Vector Scaling Process](https://qiita.com/srmfsan/items/6c3518505cad79417187)
  - [7.10.2.3. Block Inter Prediction Process](https://qiita.com/srmfsan/items/e83f5ad467bee11fd157)
  - [7.10.2.4. Block Warp Process](https://qiita.com/srmfsan/items/e975bc1ff0f5fcbd2147)
  - [7.10.2.5. Setup Shear Process](https://qiita.com/srmfsan/items/7a1c60dbd49605230858)
  - [7.10.2.6. Resolve Divisor Process](https://qiita.com/srmfsan/items/c955ee8b1d6bd63a20eb)
  - 7.10.2.7. Warp Estimation Process
  - 7.10.2.8. Overlapped Motion Compensation Process
  - 7.10.2.9. Overlap Blending Process
  - 7.10.2.10. Wedge Mask Process
  - 7.10.2.11. Segment Mask Process
  - 7.10.2.12. Inter Intra Mask Process
  - 7.10.2.13. Mask Blend Process
  - 7.10.2.14. Distance Weights Process
- 7.10.3. Palette Prediction Process
- 7.10.4. Predict Chroma From Luma Process
7.11. Reconstruction and Dequantization
- 7.11.1. Dequantization Functions
- 7.11.2. Reconstruct Process
7.12. Inverse Transform Process
7.13. Loop Filter Process
7.14. CDEF Process
7.15. Upscaling Process
7.16. Loop Restoration Process
7.17. Output Process
7.18. Motion Field Motion Vector Storage Process
7.19. Reference Frame Update Process
7.20. Reference Frame Loading Process

8. Parsing Process

Chapter 8 explains how values are extracted from an AV1 bitstream.

Headers and parameters use fixed-length and variable-length codes.
(With some effort, you can encode and decode them by eye.)
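
To make "decodable by eye" concrete, here is a minimal sketch of a bit reader in Python. `f(n)` reads an n-bit unsigned value most significant bit first, which is how I understand the f(n) descriptor of section 8.1. The class name `BitReader` and the method `read_exp_golomb` are hypothetical: the latter is a generic Exp-Golomb-style code included only to show what a variable-length code looks like, and I am not claiming it matches any descriptor in this draft.

```python
class BitReader:
    """Reads bits from a byte string, most significant bit first."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position, counted in bits

    def read_bit(self) -> int:
        byte = self.data[self.pos >> 3]
        bit = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return bit

    def f(self, n: int) -> int:
        """Fixed-length code: n bits as an unsigned integer."""
        x = 0
        for _ in range(n):
            x = 2 * x + self.read_bit()
        return x

    def read_exp_golomb(self) -> int:
        """Variable-length code: count leading zeros, then read that many bits."""
        leading_zeros = 0
        while self.read_bit() == 0:
            leading_zeros += 1
        return self.f(leading_zeros) + (1 << leading_zeros) - 1


# Bits: 101 | 1 | 001 01 | padding
reader = BitReader(bytes([0b10110010, 0b10000000]))
a = reader.f(3)               # 5
b = reader.read_exp_golomb()  # 0
c = reader.read_exp_golomb()  # 4
```

Every value starts and ends on a bit boundary, so you can follow along with a hex dump and a pencil.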

Image data (transform coefficients, motion vector values) is coded with arithmetic coding.
(A human cannot interpret the bit sequence.)
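
One way to see why arithmetic-coded data cannot be read by eye is that a symbol no longer occupies a whole number of bits. The sketch below is not the AV1 symbol decoder. It only takes a cumulative distribution function (CDF) and computes the ideal cost of each symbol in bits. The function name `symbol_bit_costs`, the example CDFs, and the total of 32768 are my own assumptions for illustration.

```python
import math


def symbol_bit_costs(cdf, total=32768):
    """Return the ideal cost in bits, -log2(p), of each symbol.

    cdf[i] is the cumulative count up to and including symbol i,
    and the last entry is expected to equal total.
    """
    costs = []
    prev = 0
    for c in cdf:
        p = (c - prev) / total
        costs.append(-math.log2(p))
        prev = c
    return costs


# Probabilities 1/2, 1/4, 1/4: costs are whole bits.
even = symbol_bit_costs([16384, 24576, 32768])  # [1.0, 2.0, 2.0]

# Probabilities 31/32 and 1/32: the likely symbol costs far less than one bit.
skewed = symbol_bit_costs([31744, 32768])
```

When a likely symbol costs only a small fraction of a bit, several symbols share a single bit of the stream, so there are no per-symbol boundaries to pick out by hand.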

8.1. Parsing Process for f(n)
8.2. Parsing Process for Symbol Decoder
8.3. Parsing process for CDF encoded syntax elements

9. Additional Tables

Definitions of the various tables used in the processes.
(I wonder whether they really needed to be included in the standard...)

10. Annex A: Profiles and Levels

Probably...

Profile: constrains which of the coding tools defined in AV1 may be used
Level: constrains the value ranges of parameters

(Apparently still under construction.)

11. Annex B: Bitstream Format

Explains how coded information is arranged into a bitstream.

12. Annex C: Included Experiments

Lists the coding tools from the AV1 development process.
(I expect this will disappear eventually...)

13. Bibliography

References
