This is a summary of post-quantization model sizes and notes. It may serve as a reference when you are unsure which quantized model to use.
The larger the fraction of the model that fits in GPU memory, the faster it runs; ideally, the entire model fits in GPU memory. More quantization bits mean higher precision, but for large language models roughly 4 bits (q4) is commonly said to be sufficient for most uses.
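As a rough illustration of that fit check, here is a minimal Python sketch. It assumes llama-cpp-python is installed and that a GGUF file is already on disk; the file path, VRAM figure, and overhead margin are placeholder assumptions, not measured values.

```python
import os

from llama_cpp import Llama  # pip install llama-cpp-python

# Rough fit check: the GGUF file size approximates the memory the weights
# need. The KV cache and runtime buffers add more on top; the ~15% margin
# below is an assumed placeholder, not a measured figure.
model_path = "aya-23-8B-Q4_K_M.gguf"  # hypothetical local file from the tables below
gpu_vram_gb = 8.0                     # assumed VRAM of the target GPU
overhead = 1.15                       # assumed margin for KV cache and buffers

file_gb = os.path.getsize(model_path) / 1024**3
fits = file_gb * overhead <= gpu_vram_gb
print(f"{file_gb:.2f} GB on disk -> {'fits' if fits else 'does not fit'} in {gpu_vram_gb:.0f} GB of VRAM")

# n_gpu_layers=-1 offloads every layer to the GPU; a smaller value splits
# the model between GPU and CPU, which still works but runs slower.
llm = Llama(model_path=model_path, n_gpu_layers=-1 if fits else 20)
```

The fallback value of 20 offloaded layers is only an example; the right split depends on the model and GPU.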
Aya 8B
Source site
https://huggingface.co/bartowski/aya-23-8B-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
aya-23-8B-Q8_0.gguf | Q8_0 | 8.54GB | Extremely high quality, generally unneeded but max available quant. |
aya-23-8B-Q6_K.gguf | Q6_K | 6.59GB | Very high quality, near perfect, recommended. |
aya-23-8B-Q5_K_M.gguf | Q5_K_M | 5.80GB | High quality, recommended. |
aya-23-8B-Q5_K_S.gguf | Q5_K_S | 5.66GB | High quality, recommended. |
aya-23-8B-Q4_K_M.gguf | Q4_K_M | 5.05GB | Good quality, uses about 4.83 bits per weight, recommended. |
aya-23-8B-Q4_K_S.gguf | Q4_K_S | 4.82GB | Slightly lower quality with more space savings, recommended. |
aya-23-8B-IQ4_NL.gguf | IQ4_NL | 4.81GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
aya-23-8B-IQ4_XS.gguf | IQ4_XS | 4.60GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
aya-23-8B-Q3_K_L.gguf | Q3_K_L | 4.52GB | Lower quality but usable, good for low RAM availability. |
aya-23-8B-Q3_K_M.gguf | Q3_K_M | 4.22GB | Even lower quality. |
aya-23-8B-IQ3_M.gguf | IQ3_M | 3.99GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
aya-23-8B-IQ3_S.gguf | IQ3_S | 3.88GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
aya-23-8B-Q3_K_S.gguf | Q3_K_S | 3.87GB | Low quality, not recommended. |
aya-23-8B-IQ3_XS.gguf | IQ3_XS | 3.72GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
aya-23-8B-IQ3_XXS.gguf | IQ3_XXS | 3.41GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
aya-23-8B-Q2_K.gguf | Q2_K | 3.43GB | Very low quality but surprisingly usable. |
aya-23-8B-IQ2_M.gguf | IQ2_M | 3.08GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
aya-23-8B-IQ2_S.gguf | IQ2_S | 2.89GB | Very low quality, uses SOTA techniques to be usable. |
aya-23-8B-IQ2_XS.gguf | IQ2_XS | 2.79GB | Very low quality, uses SOTA techniques to be usable. |
aya-23-8B-IQ2_XXS.gguf | IQ2_XXS | 2.58GB | Lower quality, uses SOTA techniques to be usable. |
aya-23-8B-IQ1_M.gguf | IQ1_M | 2.35GB | Extremely low quality, not recommended. |
aya-23-8B-IQ1_S.gguf | IQ1_S | 2.20GB | Extremely low quality, not recommended. |
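To fetch a single file from the table above rather than the whole repository, the huggingface_hub client works well. A minimal sketch; the Q4_K_M filename is just one example row:

```python
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

# Download one quant file from the repository linked above. Any filename
# from the table can be substituted; Q4_K_M is a common starting point.
path = hf_hub_download(
    repo_id="bartowski/aya-23-8B-GGUF",
    filename="aya-23-8B-Q4_K_M.gguf",
)
print(path)  # local cache path of the downloaded GGUF file
```

The same pattern applies to every repository listed in this section; only repo_id and filename change.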
Llama3 8B
Source site
https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Meta-Llama-3-8B-Instruct-Q8_0.gguf | Q8_0 | 8.54GB | Extremely high quality, generally unneeded but max available quant. |
Meta-Llama-3-8B-Instruct-Q6_K.gguf | Q6_K | 6.59GB | Very high quality, near perfect, recommended. |
Meta-Llama-3-8B-Instruct-Q5_K_M.gguf | Q5_K_M | 5.73GB | High quality, recommended. |
Meta-Llama-3-8B-Instruct-Q5_K_S.gguf | Q5_K_S | 5.59GB | High quality, recommended. |
Meta-Llama-3-8B-Instruct-Q4_K_M.gguf | Q4_K_M | 4.92GB | Good quality, uses about 4.83 bits per weight, recommended. |
Meta-Llama-3-8B-Instruct-Q4_K_S.gguf | Q4_K_S | 4.69GB | Slightly lower quality with more space savings, recommended. |
Meta-Llama-3-8B-Instruct-IQ4_NL.gguf | IQ4_NL | 4.67GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
Meta-Llama-3-8B-Instruct-IQ4_XS.gguf | IQ4_XS | 4.44GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
Meta-Llama-3-8B-Instruct-Q3_K_L.gguf | Q3_K_L | 4.32GB | Lower quality but usable, good for low RAM availability. |
Meta-Llama-3-8B-Instruct-Q3_K_M.gguf | Q3_K_M | 4.01GB | Even lower quality. |
Meta-Llama-3-8B-Instruct-IQ3_M.gguf | IQ3_M | 3.78GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
Meta-Llama-3-8B-Instruct-IQ3_S.gguf | IQ3_S | 3.68GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
Meta-Llama-3-8B-Instruct-Q3_K_S.gguf | Q3_K_S | 3.66GB | Low quality, not recommended. |
Meta-Llama-3-8B-Instruct-IQ3_XS.gguf | IQ3_XS | 3.51GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
Meta-Llama-3-8B-Instruct-IQ3_XXS.gguf | IQ3_XXS | 3.27GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
Meta-Llama-3-8B-Instruct-Q2_K.gguf | Q2_K | 3.17GB | Very low quality but surprisingly usable. |
Meta-Llama-3-8B-Instruct-IQ2_M.gguf | IQ2_M | 2.94GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
Meta-Llama-3-8B-Instruct-IQ2_S.gguf | IQ2_S | 2.75GB | Very low quality, uses SOTA techniques to be usable. |
Meta-Llama-3-8B-Instruct-IQ2_XS.gguf | IQ2_XS | 2.60GB | Very low quality, uses SOTA techniques to be usable. |
Meta-Llama-3-8B-Instruct-IQ2_XXS.gguf | IQ2_XXS | 2.39GB | Lower quality, uses SOTA techniques to be usable. |
Meta-Llama-3-8B-Instruct-IQ1_M.gguf | IQ1_M | 2.16GB | Extremely low quality, not recommended. |
Meta-Llama-3-8B-Instruct-IQ1_S.gguf | IQ1_S | 2.01GB | Extremely low quality, not recommended. |
Llama3 70B
Source site
https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-old-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Meta-Llama-3-70B-Instruct-Q8_0.gguf | Q8_0 | 74.97GB | Extremely high quality, generally unneeded but max available quant. |
Meta-Llama-3-70B-Instruct-Q6_K.gguf | Q6_K | 57.88GB | Very high quality, near perfect, recommended. |
Meta-Llama-3-70B-Instruct-Q5_K_M.gguf | Q5_K_M | 49.94GB | High quality, recommended. |
Meta-Llama-3-70B-Instruct-Q5_K_S.gguf | Q5_K_S | 48.65GB | High quality, recommended. |
Meta-Llama-3-70B-Instruct-Q4_K_M.gguf | Q4_K_M | 42.52GB | Good quality, uses about 4.83 bits per weight, recommended. |
Meta-Llama-3-70B-Instruct-Q4_K_S.gguf | Q4_K_S | 40.34GB | Slightly lower quality with more space savings, recommended. |
Meta-Llama-3-70B-Instruct-IQ4_NL.gguf | IQ4_NL | 40.05GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
Meta-Llama-3-70B-Instruct-IQ4_XS.gguf | IQ4_XS | 37.90GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
Meta-Llama-3-70B-Instruct-Q3_K_L.gguf | Q3_K_L | 37.14GB | Lower quality but usable, good for low RAM availability. |
Meta-Llama-3-70B-Instruct-Q3_K_M.gguf | Q3_K_M | 34.26GB | Even lower quality. |
Meta-Llama-3-70B-Instruct-IQ3_M.gguf | IQ3_M | 31.93GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
Meta-Llama-3-70B-Instruct-IQ3_S.gguf | IQ3_S | 30.91GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
Meta-Llama-3-70B-Instruct-Q3_K_S.gguf | Q3_K_S | 30.91GB | Low quality, not recommended. |
Meta-Llama-3-70B-Instruct-IQ3_XS.gguf | IQ3_XS | 29.30GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf | IQ3_XXS | 27.46GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
Meta-Llama-3-70B-Instruct-Q2_K.gguf | Q2_K | 26.37GB | Very low quality but surprisingly usable. |
Meta-Llama-3-70B-Instruct-IQ2_M.gguf | IQ2_M | 24.11GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
Meta-Llama-3-70B-Instruct-IQ2_S.gguf | IQ2_S | 22.24GB | Very low quality, uses SOTA techniques to be usable. |
Meta-Llama-3-70B-Instruct-IQ2_XS.gguf | IQ2_XS | 21.14GB | Very low quality, uses SOTA techniques to be usable. |
Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf | IQ2_XXS | 19.09GB | Lower quality, uses SOTA techniques to be usable. |
Meta-Llama-3-70B-Instruct-IQ1_M.gguf | IQ1_M | 16.75GB | Extremely low quality, not recommended. |
Meta-Llama-3-70B-Instruct-IQ1_S.gguf | IQ1_S | 15.34GB | Extremely low quality, not recommended. |
Phi-3 Medium
Source site
https://huggingface.co/bartowski/Phi-3-medium-4k-instruct-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Phi-3-medium-4k-instruct-Q8_0.gguf | Q8_0 | 14.83GB | Extremely high quality, generally unneeded but max available quant. |
Phi-3-medium-4k-instruct-Q6_K.gguf | Q6_K | 11.45GB | Very high quality, near perfect, recommended. |
Phi-3-medium-4k-instruct-Q5_K_M.gguf | Q5_K_M | 10.07GB | High quality, recommended. |
Phi-3-medium-4k-instruct-Q5_K_S.gguf | Q5_K_S | 9.62GB | High quality, recommended. |
Phi-3-medium-4k-instruct-Q4_K_M.gguf | Q4_K_M | 8.56GB | Good quality, uses about 4.83 bits per weight, recommended. |
Phi-3-medium-4k-instruct-Q4_K_S.gguf | Q4_K_S | 7.95GB | Slightly lower quality with more space savings, recommended. |
Phi-3-medium-4k-instruct-IQ4_NL.gguf | IQ4_NL | 7.89GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
Phi-3-medium-4k-instruct-IQ4_XS.gguf | IQ4_XS | 7.46GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
Phi-3-medium-4k-instruct-Q3_K_L.gguf | Q3_K_L | 7.49GB | Lower quality but usable, good for low RAM availability. |
Phi-3-medium-4k-instruct-Q3_K_M.gguf | Q3_K_M | 6.92GB | Even lower quality. |
Phi-3-medium-4k-instruct-IQ3_M.gguf | IQ3_M | 6.47GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
Phi-3-medium-4k-instruct-IQ3_S.gguf | IQ3_S | 6.06GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
Phi-3-medium-4k-instruct-Q3_K_S.gguf | Q3_K_S | 6.06GB | Low quality, not recommended. |
Phi-3-medium-4k-instruct-IQ3_XS.gguf | IQ3_XS | 5.80GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
Phi-3-medium-4k-instruct-IQ3_XXS.gguf | IQ3_XXS | 5.45GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
Phi-3-medium-4k-instruct-Q2_K.gguf | Q2_K | 5.14GB | Very low quality but surprisingly usable. |
Phi-3-medium-4k-instruct-IQ2_M.gguf | IQ2_M | 4.71GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
Phi-3-medium-4k-instruct-IQ2_S.gguf | IQ2_S | 4.33GB | Very low quality, uses SOTA techniques to be usable. |
Phi-3-medium-4k-instruct-IQ2_XS.gguf | IQ2_XS | 4.12GB | Very low quality, uses SOTA techniques to be usable. |
Phi-3-medium-4k-instruct-IQ2_XXS.gguf | IQ2_XXS | 3.71GB | Lower quality, uses SOTA techniques to be usable. |
Phi-3-medium-4k-instruct-IQ1_M.gguf | IQ1_M | 3.24GB | Extremely low quality, not recommended. |
Phi-3-medium-4k-instruct-IQ1_S.gguf | IQ1_S | 2.95GB | Extremely low quality, not recommended. |
Mixtral 8x7B
Source site
https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF
Name | Quant method | Bits | Size | Max RAM required | Use case |
---|---|---|---|---|---|
mixtral-8x7b-instruct-v0.1.Q2_K.gguf | Q2_K | 2 | 15.64 GB | 18.14 GB | smallest, significant quality loss - not recommended for most purposes |
mixtral-8x7b-instruct-v0.1.Q3_K_M.gguf | Q3_K_M | 3 | 20.36 GB | 22.86 GB | very small, high quality loss |
mixtral-8x7b-instruct-v0.1.Q4_0.gguf | Q4_0 | 4 | 26.44 GB | 28.94 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf | Q4_K_M | 4 | 26.44 GB | 28.94 GB | medium, balanced quality - recommended |
mixtral-8x7b-instruct-v0.1.Q5_0.gguf | Q5_0 | 5 | 32.23 GB | 34.73 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf | Q5_K_M | 5 | 32.23 GB | 34.73 GB | large, very low quality loss - recommended |
mixtral-8x7b-instruct-v0.1.Q6_K.gguf | Q6_K | 6 | 38.38 GB | 40.88 GB | very large, extremely low quality loss |
mixtral-8x7b-instruct-v0.1.Q8_0.gguf | Q8_0 | 8 | 49.62 GB | 52.12 GB | very large, extremely low quality loss - not recommended |
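Note that the Max RAM required column assumes CPU-only inference with no GPU offload; offloading layers to the GPU shifts that memory from RAM to VRAM. In this table each figure is simply the file size plus a constant 2.50 GB of runtime overhead, which can be read directly off the rows above. A small sketch of that arithmetic; the same pattern holds for the other tables in this format below:

```python
# The "Max RAM required" column above is file size plus a constant 2.50 GB
# of assumed runtime overhead, with no layers offloaded to the GPU.
OVERHEAD_GB = 2.50  # constant implied by the rows above

def max_ram_gb(file_size_gb: float) -> float:
    return file_size_gb + OVERHEAD_GB

assert round(max_ram_gb(15.64), 2) == 18.14  # Q2_K row
assert round(max_ram_gb(26.44), 2) == 28.94  # Q4_K_M row
```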
Starling-LM 7B
Source site
https://huggingface.co/bartowski/Starling-LM-7B-beta-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Starling-LM-7B-beta-Q8_0.gguf | Q8_0 | 7.69GB | Extremely high quality, generally unneeded but max available quant. |
Starling-LM-7B-beta-Q6_K.gguf | Q6_K | 5.94GB | Very high quality, near perfect, recommended. |
Starling-LM-7B-beta-Q5_K_M.gguf | Q5_K_M | 5.13GB | High quality, very usable. |
Starling-LM-7B-beta-Q5_K_S.gguf | Q5_K_S | 4.99GB | High quality, very usable. |
Starling-LM-7B-beta-Q5_0.gguf | Q5_0 | 4.99GB | High quality, older format, generally not recommended. |
Starling-LM-7B-beta-Q4_K_M.gguf | Q4_K_M | 4.36GB | Good quality, similar to 4.25 bpw. |
Starling-LM-7B-beta-Q4_K_S.gguf | Q4_K_S | 4.14GB | Slightly lower quality with small space savings. |
Starling-LM-7B-beta-IQ4_NL.gguf | IQ4_NL | 4.15GB | Good quality, similar to Q4_K_S, new method of quanting. |
Starling-LM-7B-beta-IQ4_XS.gguf | IQ4_XS | 3.94GB | Decent quality, new method with similar performance to Q4. |
Starling-LM-7B-beta-Q4_0.gguf | Q4_0 | 4.10GB | Decent quality, older format, generally not recommended. |
Starling-LM-7B-beta-IQ3_M.gguf | IQ3_M | 3.28GB | Medium-low quality, new method with decent performance. |
Starling-LM-7B-beta-IQ3_S.gguf | IQ3_S | 3.18GB | Lower quality, new method with decent performance, recommended over Q3 quants. |
Starling-LM-7B-beta-Q3_K_L.gguf | Q3_K_L | 3.82GB | Lower quality but usable, good for low RAM availability. |
Starling-LM-7B-beta-Q3_K_M.gguf | Q3_K_M | 3.51GB | Even lower quality. |
Starling-LM-7B-beta-Q3_K_S.gguf | Q3_K_S | 3.16GB | Low quality, not recommended. |
Starling-LM-7B-beta-Q2_K.gguf | Q2_K | 2.71GB | Extremely low quality, not recommended. |
WizardLM2 7B
Source site
https://huggingface.co/bartowski/WizardLM-2-7B-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
WizardLM-2-7B-Q8_0.gguf | Q8_0 | 7.69GB | Extremely high quality, generally unneeded but max available quant. |
WizardLM-2-7B-Q6_K.gguf | Q6_K | 5.94GB | Very high quality, near perfect, recommended. |
WizardLM-2-7B-Q5_K_M.gguf | Q5_K_M | 5.13GB | High quality, recommended. |
WizardLM-2-7B-Q5_K_S.gguf | Q5_K_S | 4.99GB | High quality, recommended. |
WizardLM-2-7B-Q4_K_M.gguf | Q4_K_M | 4.36GB | Good quality, uses about 4.83 bits per weight, recommended. |
WizardLM-2-7B-Q4_K_S.gguf | Q4_K_S | 4.14GB | Slightly lower quality with more space savings, recommended. |
WizardLM-2-7B-IQ4_NL.gguf | IQ4_NL | 4.12GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
WizardLM-2-7B-IQ4_XS.gguf | IQ4_XS | 3.90GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
WizardLM-2-7B-Q3_K_L.gguf | Q3_K_L | 3.82GB | Lower quality but usable, good for low RAM availability. |
WizardLM-2-7B-Q3_K_M.gguf | Q3_K_M | 3.51GB | Even lower quality. |
WizardLM-2-7B-IQ3_M.gguf | IQ3_M | 3.28GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
WizardLM-2-7B-IQ3_S.gguf | IQ3_S | 3.18GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
WizardLM-2-7B-Q3_K_S.gguf | Q3_K_S | 3.16GB | Low quality, not recommended. |
WizardLM-2-7B-IQ3_XS.gguf | IQ3_XS | 3.01GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
WizardLM-2-7B-IQ3_XXS.gguf | IQ3_XXS | 2.82GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
WizardLM-2-7B-Q2_K.gguf | Q2_K | 2.71GB | Very low quality but surprisingly usable. |
WizardLM-2-7B-IQ2_M.gguf | IQ2_M | 2.50GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
WizardLM-2-7B-IQ2_S.gguf | IQ2_S | 2.31GB | Very low quality, uses SOTA techniques to be usable. |
WizardLM-2-7B-IQ2_XS.gguf | IQ2_XS | 2.19GB | Very low quality, uses SOTA techniques to be usable. |
WizardLM-2-7B-IQ2_XXS.gguf | IQ2_XXS | 1.99GB | Lower quality, uses SOTA techniques to be usable. |
WizardLM-2-7B-IQ1_M.gguf | IQ1_M | 1.75GB | Extremely low quality, not recommended. |
WizardLM-2-7B-IQ1_S.gguf | IQ1_S | 1.61GB | Extremely low quality, not recommended. |
Codestral 22B
Source site
https://huggingface.co/bartowski/Codestral-22B-v0.1-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Codestral-22B-v0.1-Q8_0.gguf | Q8_0 | 23.64GB | Extremely high quality, generally unneeded but max available quant. |
Codestral-22B-v0.1-Q6_K.gguf | Q6_K | 18.25GB | Very high quality, near perfect, recommended. |
Codestral-22B-v0.1-Q5_K_M.gguf | Q5_K_M | 15.72GB | High quality, recommended. |
Codestral-22B-v0.1-Q5_K_S.gguf | Q5_K_S | 15.32GB | High quality, recommended. |
Codestral-22B-v0.1-Q4_K_M.gguf | Q4_K_M | 13.34GB | Good quality, uses about 4.83 bits per weight, recommended. |
Codestral-22B-v0.1-Q4_K_S.gguf | Q4_K_S | 12.66GB | Slightly lower quality with more space savings, recommended. |
Codestral-22B-v0.1-IQ4_XS.gguf | IQ4_XS | 11.93GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
Codestral-22B-v0.1-Q3_K_L.gguf | Q3_K_L | 11.73GB | Lower quality but usable, good for low RAM availability. |
Codestral-22B-v0.1-Q3_K_M.gguf | Q3_K_M | 10.75GB | Even lower quality. |
Codestral-22B-v0.1-IQ3_M.gguf | IQ3_M | 10.06GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
Codestral-22B-v0.1-Q3_K_S.gguf | Q3_K_S | 9.64GB | Low quality, not recommended. |
Codestral-22B-v0.1-IQ3_XS.gguf | IQ3_XS | 9.17GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
Codestral-22B-v0.1-IQ3_XXS.gguf | IQ3_XXS | 8.59GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
Codestral-22B-v0.1-Q2_K.gguf | Q2_K | 8.27GB | Very low quality but surprisingly usable. |
Codestral-22B-v0.1-IQ2_M.gguf | IQ2_M | 7.61GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
Codestral-22B-v0.1-IQ2_S.gguf | IQ2_S | 7.03GB | Very low quality, uses SOTA techniques to be usable. |
Codestral-22B-v0.1-IQ2_XS.gguf | IQ2_XS | 6.64GB | Very low quality, uses SOTA techniques to be usable. |
DeepSeek Coder 33B
Source site
https://huggingface.co/TheBloke/deepseek-coder-33B-base-GGUF
Name | Quant method | Bits | Size | Max RAM required | Use case |
---|---|---|---|---|---|
deepseek-coder-33b-base.Q2_K.gguf | Q2_K | 2 | 14.03 GB | 16.53 GB | smallest, significant quality loss - not recommended for most purposes |
deepseek-coder-33b-base.Q3_K_S.gguf | Q3_K_S | 3 | 14.42 GB | 16.92 GB | very small, high quality loss |
deepseek-coder-33b-base.Q3_K_M.gguf | Q3_K_M | 3 | 16.07 GB | 18.57 GB | very small, high quality loss |
deepseek-coder-33b-base.Q3_K_L.gguf | Q3_K_L | 3 | 17.56 GB | 20.06 GB | small, substantial quality loss |
deepseek-coder-33b-base.Q4_0.gguf | Q4_0 | 4 | 18.82 GB | 21.32 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
deepseek-coder-33b-base.Q4_K_S.gguf | Q4_K_S | 4 | 18.89 GB | 21.39 GB | small, greater quality loss |
deepseek-coder-33b-base.Q4_K_M.gguf | Q4_K_M | 4 | 19.94 GB | 22.44 GB | medium, balanced quality - recommended |
deepseek-coder-33b-base.Q5_0.gguf | Q5_0 | 5 | 22.96 GB | 25.46 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
deepseek-coder-33b-base.Q5_K_S.gguf | Q5_K_S | 5 | 22.96 GB | 25.46 GB | large, low quality loss - recommended |
deepseek-coder-33b-base.Q5_K_M.gguf | Q5_K_M | 5 | 23.54 GB | 26.04 GB | large, very low quality loss - recommended |
deepseek-coder-33b-base.Q6_K.gguf | Q6_K | 6 | 27.36 GB | 29.86 GB | very large, extremely low quality loss |
deepseek-coder-33b-base.Q8_0.gguf | Q8_0 | 8 | 35.43 GB | 37.93 GB | very large, extremely low quality loss - not recommended |
DeepSeek Coder 6.7B
Source site
https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF
Name | Quant method | Bits | Size | Max RAM required | Use case |
---|---|---|---|---|---|
deepseek-coder-6.7b-instruct.Q2_K.gguf | Q2_K | 2 | 2.83 GB | 5.33 GB | smallest, significant quality loss - not recommended for most purposes |
deepseek-coder-6.7b-instruct.Q3_K_S.gguf | Q3_K_S | 3 | 2.95 GB | 5.45 GB | very small, high quality loss |
deepseek-coder-6.7b-instruct.Q3_K_M.gguf | Q3_K_M | 3 | 3.30 GB | 5.80 GB | very small, high quality loss |
deepseek-coder-6.7b-instruct.Q3_K_L.gguf | Q3_K_L | 3 | 3.60 GB | 6.10 GB | small, substantial quality loss |
deepseek-coder-6.7b-instruct.Q4_0.gguf | Q4_0 | 4 | 3.83 GB | 6.33 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
deepseek-coder-6.7b-instruct.Q4_K_S.gguf | Q4_K_S | 4 | 3.86 GB | 6.36 GB | small, greater quality loss |
deepseek-coder-6.7b-instruct.Q4_K_M.gguf | Q4_K_M | 4 | 4.08 GB | 6.58 GB | medium, balanced quality - recommended |
deepseek-coder-6.7b-instruct.Q5_0.gguf | Q5_0 | 5 | 4.65 GB | 7.15 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
deepseek-coder-6.7b-instruct.Q5_K_S.gguf | Q5_K_S | 5 | 4.65 GB | 7.15 GB | large, low quality loss - recommended |
deepseek-coder-6.7b-instruct.Q5_K_M.gguf | Q5_K_M | 5 | 4.79 GB | 7.29 GB | large, very low quality loss - recommended |
deepseek-coder-6.7b-instruct.Q6_K.gguf | Q6_K | 6 | 5.53 GB | 8.03 GB | very large, extremely low quality loss |
deepseek-coder-6.7b-instruct.Q8_0.gguf | Q8_0 | 8 | 7.16 GB | 9.66 GB | very large, extremely low quality loss - not recommended |
WizardCoder 33B
Source site
https://huggingface.co/TheBloke/WizardCoder-33B-V1.1-GGUF
Name | Quant method | Bits | Size | Max RAM required | Use case |
---|---|---|---|---|---|
wizardcoder-33b-v1.1.Q2_K.gguf | Q2_K | 2 | 14.03 GB | 16.53 GB | smallest, significant quality loss - not recommended for most purposes |
wizardcoder-33b-v1.1.Q3_K_S.gguf | Q3_K_S | 3 | 14.42 GB | 16.92 GB | very small, high quality loss |
wizardcoder-33b-v1.1.Q3_K_M.gguf | Q3_K_M | 3 | 16.07 GB | 18.57 GB | very small, high quality loss |
wizardcoder-33b-v1.1.Q3_K_L.gguf | Q3_K_L | 3 | 17.56 GB | 20.06 GB | small, substantial quality loss |
wizardcoder-33b-v1.1.Q4_0.gguf | Q4_0 | 4 | 18.82 GB | 21.32 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
wizardcoder-33b-v1.1.Q4_K_S.gguf | Q4_K_S | 4 | 18.89 GB | 21.39 GB | small, greater quality loss |
wizardcoder-33b-v1.1.Q4_K_M.gguf | Q4_K_M | 4 | 19.94 GB | 22.44 GB | medium, balanced quality - recommended |
wizardcoder-33b-v1.1.Q5_0.gguf | Q5_0 | 5 | 22.96 GB | 25.46 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
wizardcoder-33b-v1.1.Q5_K_S.gguf | Q5_K_S | 5 | 22.96 GB | 25.46 GB | large, low quality loss - recommended |
wizardcoder-33b-v1.1.Q5_K_M.gguf | Q5_K_M | 5 | 23.54 GB | 26.04 GB | large, very low quality loss - recommended |
wizardcoder-33b-v1.1.Q6_K.gguf | Q6_K | 6 | 27.36 GB | 29.86 GB | very large, extremely low quality loss |
wizardcoder-33b-v1.1.Q8_0.gguf | Q8_0 | 8 | 35.43 GB | 37.93 GB | very large, extremely low quality loss - not recommended |