
Quantization Papers at ICLR 2025 (1)


Overview

This article introduces quantization papers from ICLR 2025.[^1]

CBQ: Cross-Block Quantization for Large Language Models

  • Summary: a post-training quantization (PTQ) method for LLMs that learns quantization parameters by exploiting dependencies across blocks (Fig. 2)
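As a rough illustration of the cross-block idea (this is not CBQ's actual algorithm; the two-layer toy model, the grid search, and the scale range are all made up for this sketch), one can tune quantization scales against the joint output of several consecutive blocks instead of each block's local output:

```python
import numpy as np

def quantize(w, scale, bits=4):
    """Uniform symmetric quantization of weights with a given scale."""
    qmax = 2 ** (bits - 1) - 1
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale

def cross_block_error(w1, w2, s1, s2, x):
    """Reconstruction error of TWO stacked linear blocks, quantized jointly."""
    ref = x @ w1 @ w2                       # full-precision reference output
    out = x @ quantize(w1, s1) @ quantize(w2, s2)
    return np.mean((ref - out) ** 2)

rng = np.random.default_rng(0)
w1 = rng.normal(size=(8, 8))
w2 = rng.normal(size=(8, 8))
x = rng.normal(size=(32, 8))               # calibration activations

# Grid-search the two scales jointly (cross-block), rather than tuning
# each block's scale against its own local reconstruction error.
scales = np.linspace(0.05, 0.5, 10)
best = min(((cross_block_error(w1, w2, s1, s2, x), s1, s2)
            for s1 in scales for s2 in scales), key=lambda t: t[0])
```

The joint search captures how an error introduced in one block propagates through the next, which a block-local objective cannot see.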

SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning

  • Summary: zero-shot quantization (i.e., no real training data is used)
  • Key idea: suppress noise in the synthetic data with a low-pass filter; the synthetic data itself is trained as in Fig. 4
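A minimal sketch of the low-pass filtering step (the FFT-based filter, the cutoff value, and the random stand-in image are assumptions for illustration, not taken from SynQ):

```python
import numpy as np

def lowpass(img, cutoff=0.25):
    """Zero out high-frequency FFT components of a 2-D image.

    `cutoff` is the fraction of the centered spectrum radius to keep;
    everything outside that radius is treated as high-frequency noise.
    """
    f = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    mask = r <= cutoff * min(h, w) / 2
    return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))

rng = np.random.default_rng(0)
synth = rng.normal(size=(32, 32))   # stand-in for a synthesized image
smoothed = lowpass(synth)
# Only the low-frequency structure survives, so high-frequency noise
# in the synthetic sample is removed before calibration.
```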

ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation

  • Summary: a quantization method for DiT (Diffusion Transformer) models
  • Key idea: changing the quantization grouping, adjusting the quantization range, and mixed-precision quantization
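One way the grouping and range-adjustment ideas can be sketched (the per-token grouping, the bit width, and the synthetic activations are assumptions, not ViDiT-Q's exact scheme) is dynamic per-token quantization, where each token gets its own scale instead of one static scale for the whole tensor:

```python
import numpy as np

def quantize_per_token(acts, bits=8):
    """Quantize activations with a separate dynamic scale per token (row)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(acts).max(axis=1, keepdims=True) / qmax  # per-token range
    scale = np.where(scale == 0, 1.0, scale)                # avoid div by zero
    deq = np.round(acts / scale) * scale
    return deq, scale

rng = np.random.default_rng(0)
# Tokens whose magnitudes vary wildly, as in diffusion transformers.
tokens = rng.normal(size=(16, 64)) * rng.uniform(0.1, 10, size=(16, 1))
deq, scale = quantize_per_token(tokens)
err_dynamic = np.mean((tokens - deq) ** 2)

# Baseline: a single static scale shared by every token.
qmax = 127
s = np.abs(tokens).max() / qmax
err_static = np.mean((tokens - np.round(tokens / s) * s) ** 2)
```

Because token magnitudes differ by orders of magnitude, the per-token scales give a much lower reconstruction error than the shared static scale.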

DGQ: Distribution-Aware Group Quantization for Text-to-Image Diffusion Models

  • Summary: quantization for diffusion models
  • Key idea: determine quantization parameters per layer by examining per-pixel and per-channel outliers
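The outlier-aware grouping can be sketched roughly as follows (the magnitude-sorted grouping, group count, and bit width here are illustrative assumptions, not DGQ's published procedure): channels with similar outlier levels share a scale, so one extreme channel no longer inflates the step size for everything else.

```python
import numpy as np

def group_quantize(w, n_groups=4, bits=4):
    """Group channels by outlier magnitude; each group gets its own scale."""
    qmax = 2 ** (bits - 1) - 1
    ch_max = np.abs(w).max(axis=0)        # per-channel outlier level
    order = np.argsort(ch_max)            # cluster channels of similar range
    out = np.empty_like(w)
    for g in np.array_split(order, n_groups):
        scale = np.abs(w[:, g]).max() / qmax
        out[:, g] = np.round(w[:, g] / scale) * scale
    return out

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 32))
w[:, 0] *= 50                             # inject one outlier channel

# Baseline: a single 4-bit scale for the whole tensor.
s = np.abs(w).max() / 7
err_tensor = np.mean((w - np.round(w / s) * s) ** 2)
err_group = np.mean((w - group_quantize(w)) ** 2)
```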

LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid

  • Summary: a PTQ method for determining the optimal quantization grid
  • Key idea: choose the quantization grid so that the loss error caused by quantization is minimized. Based on Eq. (5), the grid is optimized with Eq. (6) in the non-uniform case and with Eq. (7) in the uniform case
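Without reproducing Eqs. (5)-(7), the loss-error-aware, non-uniform flavor of the idea can be sketched as a sensitivity-weighted k-means over the weights (the weighted k-means formulation and the uniform random sensitivities standing in for a diagonal-Hessian estimate are assumptions of this sketch, not LeanQuant's exact objective):

```python
import numpy as np

def weighted_kmeans_grid(w, sens, k=16, iters=30):
    """Pick k non-uniform quantization levels minimizing weighted error.

    `sens` plays the role of a loss-sensitivity weight (e.g. a diagonal
    Hessian estimate); heavily weighted values pull grid points toward them.
    """
    centers = np.quantile(w, np.linspace(0, 1, k))   # spread initial grid
    for _ in range(iters):
        # Assign each weight to its nearest grid point ...
        idx = np.argmin(np.abs(w[:, None] - centers[None, :]), axis=1)
        # ... then move each grid point to the sensitivity-weighted mean.
        for j in range(k):
            m = idx == j
            if m.any():
                centers[j] = np.average(w[m], weights=sens[m])
    return centers, idx

rng = np.random.default_rng(0)
w = rng.normal(size=2000)
sens = rng.uniform(0.1, 5.0, size=2000)   # stand-in loss sensitivities
centers, idx = weighted_kmeans_grid(w, sens)
weighted_err = np.average((w - centers[idx]) ** 2, weights=sens)
```

The grid ends up denser where sensitive weights cluster, which is the qualitative effect a loss-error-aware objective aims for.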