
Summary of Quantized Model Sizes and Notes

Posted at 2024-05-22

This is a summary of quantized model sizes and their accompanying notes. It may be a useful reference when you are unsure which quantized model to use.

The larger the fraction of the model that fits in GPU memory, the faster it runs; ideally, the entire model fits in GPU memory. For quantization, more bits means higher precision, but for large language models it is commonly said that around 4-bit (q4) quantization causes no practical problems in most cases.
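As a concrete illustration of the "does it fit in VRAM?" question, the sketch below compares a GGUF file's size, plus an allowance for the KV cache and runtime buffers, against available GPU memory. This is a minimal Python sketch; the 2 GB overhead figure and the file path are assumptions for illustration, not measured values.

```python
import os

def fits_in_vram(gguf_path: str, vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """Rough check: model file size plus headroom vs. available VRAM.

    overhead_gb is an assumed allowance for the KV cache, activations,
    and runtime buffers; the real figure depends on context length and backend.
    """
    file_gb = os.path.getsize(gguf_path) / 1024**3
    return file_gb + overhead_gb <= vram_gb

# Example: the Q4_K_M quant of an 8B model (~4.9 GB) on a 12 GB GPU.
print(fits_in_vram("Meta-Llama-3-8B-Instruct-Q4_K_M.gguf", vram_gb=12.0))
```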

Aya 8B

Source:
https://huggingface.co/bartowski/aya-23-8B-GGUF

| Filename | Quant type | File Size | Description |
| --- | --- | --- | --- |
| aya-23-8B-Q8_0.gguf | Q8_0 | 8.54GB | Extremely high quality, generally unneeded but max available quant. |
| aya-23-8B-Q6_K.gguf | Q6_K | 6.59GB | Very high quality, near perfect, recommended. |
| aya-23-8B-Q5_K_M.gguf | Q5_K_M | 5.80GB | High quality, recommended. |
| aya-23-8B-Q5_K_S.gguf | Q5_K_S | 5.66GB | High quality, recommended. |
| aya-23-8B-Q4_K_M.gguf | Q4_K_M | 5.05GB | Good quality, uses about 4.83 bits per weight, recommended. |
| aya-23-8B-Q4_K_S.gguf | Q4_K_S | 4.82GB | Slightly lower quality with more space savings, recommended. |
| aya-23-8B-IQ4_NL.gguf | IQ4_NL | 4.81GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
| aya-23-8B-IQ4_XS.gguf | IQ4_XS | 4.60GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| aya-23-8B-Q3_K_L.gguf | Q3_K_L | 4.52GB | Lower quality but usable, good for low RAM availability. |
| aya-23-8B-Q3_K_M.gguf | Q3_K_M | 4.22GB | Even lower quality. |
| aya-23-8B-IQ3_M.gguf | IQ3_M | 3.99GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| aya-23-8B-IQ3_S.gguf | IQ3_S | 3.88GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
| aya-23-8B-Q3_K_S.gguf | Q3_K_S | 3.87GB | Low quality, not recommended. |
| aya-23-8B-IQ3_XS.gguf | IQ3_XS | 3.72GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| aya-23-8B-IQ3_XXS.gguf | IQ3_XXS | 3.41GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
| aya-23-8B-Q2_K.gguf | Q2_K | 3.43GB | Very low quality but surprisingly usable. |
| aya-23-8B-IQ2_M.gguf | IQ2_M | 3.08GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
| aya-23-8B-IQ2_S.gguf | IQ2_S | 2.89GB | Very low quality, uses SOTA techniques to be usable. |
| aya-23-8B-IQ2_XS.gguf | IQ2_XS | 2.79GB | Very low quality, uses SOTA techniques to be usable. |
| aya-23-8B-IQ2_XXS.gguf | IQ2_XXS | 2.58GB | Lower quality, uses SOTA techniques to be usable. |
| aya-23-8B-IQ1_M.gguf | IQ1_M | 2.35GB | Extremely low quality, not recommended. |
| aya-23-8B-IQ1_S.gguf | IQ1_S | 2.20GB | Extremely low quality, not recommended. |
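Each quant in these tables is a single file in the Hugging Face repo, so you can fetch just the one you need rather than the whole repository. A minimal sketch using the huggingface_hub library (the Q4_K_M choice here is only an example):

```python
from huggingface_hub import hf_hub_download

# Download a single quant file instead of cloning the whole repository.
path = hf_hub_download(
    repo_id="bartowski/aya-23-8B-GGUF",
    filename="aya-23-8B-Q4_K_M.gguf",
)
print(path)  # local cache path of the downloaded .gguf file
```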

Llama3 8B

Source:
https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF

| Filename | Quant type | File Size | Description |
| --- | --- | --- | --- |
| Meta-Llama-3-8B-Instruct-Q8_0.gguf | Q8_0 | 8.54GB | Extremely high quality, generally unneeded but max available quant. |
| Meta-Llama-3-8B-Instruct-Q6_K.gguf | Q6_K | 6.59GB | Very high quality, near perfect, recommended. |
| Meta-Llama-3-8B-Instruct-Q5_K_M.gguf | Q5_K_M | 5.73GB | High quality, recommended. |
| Meta-Llama-3-8B-Instruct-Q5_K_S.gguf | Q5_K_S | 5.59GB | High quality, recommended. |
| Meta-Llama-3-8B-Instruct-Q4_K_M.gguf | Q4_K_M | 4.92GB | Good quality, uses about 4.83 bits per weight, recommended. |
| Meta-Llama-3-8B-Instruct-Q4_K_S.gguf | Q4_K_S | 4.69GB | Slightly lower quality with more space savings, recommended. |
| Meta-Llama-3-8B-Instruct-IQ4_NL.gguf | IQ4_NL | 4.67GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
| Meta-Llama-3-8B-Instruct-IQ4_XS.gguf | IQ4_XS | 4.44GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| Meta-Llama-3-8B-Instruct-Q3_K_L.gguf | Q3_K_L | 4.32GB | Lower quality but usable, good for low RAM availability. |
| Meta-Llama-3-8B-Instruct-Q3_K_M.gguf | Q3_K_M | 4.01GB | Even lower quality. |
| Meta-Llama-3-8B-Instruct-IQ3_M.gguf | IQ3_M | 3.78GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| Meta-Llama-3-8B-Instruct-IQ3_S.gguf | IQ3_S | 3.68GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
| Meta-Llama-3-8B-Instruct-Q3_K_S.gguf | Q3_K_S | 3.66GB | Low quality, not recommended. |
| Meta-Llama-3-8B-Instruct-IQ3_XS.gguf | IQ3_XS | 3.51GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| Meta-Llama-3-8B-Instruct-IQ3_XXS.gguf | IQ3_XXS | 3.27GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
| Meta-Llama-3-8B-Instruct-Q2_K.gguf | Q2_K | 3.17GB | Very low quality but surprisingly usable. |
| Meta-Llama-3-8B-Instruct-IQ2_M.gguf | IQ2_M | 2.94GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
| Meta-Llama-3-8B-Instruct-IQ2_S.gguf | IQ2_S | 2.75GB | Very low quality, uses SOTA techniques to be usable. |
| Meta-Llama-3-8B-Instruct-IQ2_XS.gguf | IQ2_XS | 2.60GB | Very low quality, uses SOTA techniques to be usable. |
| Meta-Llama-3-8B-Instruct-IQ2_XXS.gguf | IQ2_XXS | 2.39GB | Lower quality, uses SOTA techniques to be usable. |
| Meta-Llama-3-8B-Instruct-IQ1_M.gguf | IQ1_M | 2.16GB | Extremely low quality, not recommended. |
| Meta-Llama-3-8B-Instruct-IQ1_S.gguf | IQ1_S | 2.01GB | Extremely low quality, not recommended. |

Llama3 70B

Source:
https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-old-GGUF

| Filename | Quant type | File Size | Description |
| --- | --- | --- | --- |
| Meta-Llama-3-70B-Instruct-Q8_0.gguf | Q8_0 | 74.97GB | Extremely high quality, generally unneeded but max available quant. |
| Meta-Llama-3-70B-Instruct-Q6_K.gguf | Q6_K | 57.88GB | Very high quality, near perfect, recommended. |
| Meta-Llama-3-70B-Instruct-Q5_K_M.gguf | Q5_K_M | 49.94GB | High quality, recommended. |
| Meta-Llama-3-70B-Instruct-Q5_K_S.gguf | Q5_K_S | 48.65GB | High quality, recommended. |
| Meta-Llama-3-70B-Instruct-Q4_K_M.gguf | Q4_K_M | 42.52GB | Good quality, uses about 4.83 bits per weight, recommended. |
| Meta-Llama-3-70B-Instruct-Q4_K_S.gguf | Q4_K_S | 40.34GB | Slightly lower quality with more space savings, recommended. |
| Meta-Llama-3-70B-Instruct-IQ4_NL.gguf | IQ4_NL | 40.05GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
| Meta-Llama-3-70B-Instruct-IQ4_XS.gguf | IQ4_XS | 37.90GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| Meta-Llama-3-70B-Instruct-Q3_K_L.gguf | Q3_K_L | 37.14GB | Lower quality but usable, good for low RAM availability. |
| Meta-Llama-3-70B-Instruct-Q3_K_M.gguf | Q3_K_M | 34.26GB | Even lower quality. |
| Meta-Llama-3-70B-Instruct-IQ3_M.gguf | IQ3_M | 31.93GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| Meta-Llama-3-70B-Instruct-IQ3_S.gguf | IQ3_S | 30.91GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
| Meta-Llama-3-70B-Instruct-Q3_K_S.gguf | Q3_K_S | 30.91GB | Low quality, not recommended. |
| Meta-Llama-3-70B-Instruct-IQ3_XS.gguf | IQ3_XS | 29.30GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf | IQ3_XXS | 27.46GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
| Meta-Llama-3-70B-Instruct-Q2_K.gguf | Q2_K | 26.37GB | Very low quality but surprisingly usable. |
| Meta-Llama-3-70B-Instruct-IQ2_M.gguf | IQ2_M | 24.11GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
| Meta-Llama-3-70B-Instruct-IQ2_S.gguf | IQ2_S | 22.24GB | Very low quality, uses SOTA techniques to be usable. |
| Meta-Llama-3-70B-Instruct-IQ2_XS.gguf | IQ2_XS | 21.14GB | Very low quality, uses SOTA techniques to be usable. |
| Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf | IQ2_XXS | 19.09GB | Lower quality, uses SOTA techniques to be usable. |
| Meta-Llama-3-70B-Instruct-IQ1_M.gguf | IQ1_M | 16.75GB | Extremely low quality, not recommended. |
| Meta-Llama-3-70B-Instruct-IQ1_S.gguf | IQ1_S | 15.34GB | Extremely low quality, not recommended. |

Phi-3 Medium

Source:
https://huggingface.co/bartowski/Phi-3-medium-4k-instruct-GGUF

| Filename | Quant type | File Size | Description |
| --- | --- | --- | --- |
| Phi-3-medium-4k-instruct-Q8_0.gguf | Q8_0 | 14.83GB | Extremely high quality, generally unneeded but max available quant. |
| Phi-3-medium-4k-instruct-Q6_K.gguf | Q6_K | 11.45GB | Very high quality, near perfect, recommended. |
| Phi-3-medium-4k-instruct-Q5_K_M.gguf | Q5_K_M | 10.07GB | High quality, recommended. |
| Phi-3-medium-4k-instruct-Q5_K_S.gguf | Q5_K_S | 9.62GB | High quality, recommended. |
| Phi-3-medium-4k-instruct-Q4_K_M.gguf | Q4_K_M | 8.56GB | Good quality, uses about 4.83 bits per weight, recommended. |
| Phi-3-medium-4k-instruct-Q4_K_S.gguf | Q4_K_S | 7.95GB | Slightly lower quality with more space savings, recommended. |
| Phi-3-medium-4k-instruct-IQ4_NL.gguf | IQ4_NL | 7.89GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
| Phi-3-medium-4k-instruct-IQ4_XS.gguf | IQ4_XS | 7.46GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| Phi-3-medium-4k-instruct-Q3_K_L.gguf | Q3_K_L | 7.49GB | Lower quality but usable, good for low RAM availability. |
| Phi-3-medium-4k-instruct-Q3_K_M.gguf | Q3_K_M | 6.92GB | Even lower quality. |
| Phi-3-medium-4k-instruct-IQ3_M.gguf | IQ3_M | 6.47GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| Phi-3-medium-4k-instruct-IQ3_S.gguf | IQ3_S | 6.06GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
| Phi-3-medium-4k-instruct-Q3_K_S.gguf | Q3_K_S | 6.06GB | Low quality, not recommended. |
| Phi-3-medium-4k-instruct-IQ3_XS.gguf | IQ3_XS | 5.80GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| Phi-3-medium-4k-instruct-IQ3_XXS.gguf | IQ3_XXS | 5.45GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
| Phi-3-medium-4k-instruct-Q2_K.gguf | Q2_K | 5.14GB | Very low quality but surprisingly usable. |
| Phi-3-medium-4k-instruct-IQ2_M.gguf | IQ2_M | 4.71GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
| Phi-3-medium-4k-instruct-IQ2_S.gguf | IQ2_S | 4.33GB | Very low quality, uses SOTA techniques to be usable. |
| Phi-3-medium-4k-instruct-IQ2_XS.gguf | IQ2_XS | 4.12GB | Very low quality, uses SOTA techniques to be usable. |
| Phi-3-medium-4k-instruct-IQ2_XXS.gguf | IQ2_XXS | 3.71GB | Lower quality, uses SOTA techniques to be usable. |
| Phi-3-medium-4k-instruct-IQ1_M.gguf | IQ1_M | 3.24GB | Extremely low quality, not recommended. |
| Phi-3-medium-4k-instruct-IQ1_S.gguf | IQ1_S | 2.95GB | Extremely low quality, not recommended. |

Mixtral 8x7B

Source:
https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF

| Name | Quant method | Bits | Size | Max RAM required | Use case |
| --- | --- | --- | --- | --- | --- |
| mixtral-8x7b-instruct-v0.1.Q2_K.gguf | Q2_K | 2 | 15.64 GB | 18.14 GB | smallest, significant quality loss - not recommended for most purposes |
| mixtral-8x7b-instruct-v0.1.Q3_K_M.gguf | Q3_K_M | 3 | 20.36 GB | 22.86 GB | very small, high quality loss |
| mixtral-8x7b-instruct-v0.1.Q4_0.gguf | Q4_0 | 4 | 26.44 GB | 28.94 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf | Q4_K_M | 4 | 26.44 GB | 28.94 GB | medium, balanced quality - recommended |
| mixtral-8x7b-instruct-v0.1.Q5_0.gguf | Q5_0 | 5 | 32.23 GB | 34.73 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf | Q5_K_M | 5 | 32.23 GB | 34.73 GB | large, very low quality loss - recommended |
| mixtral-8x7b-instruct-v0.1.Q6_K.gguf | Q6_K | 6 | 38.38 GB | 40.88 GB | very large, extremely low quality loss |
| mixtral-8x7b-instruct-v0.1.Q8_0.gguf | Q8_0 | 8 | 49.62 GB | 52.12 GB | very large, extremely low quality loss - not recommended |

Starling-LM 7B

Source:
https://huggingface.co/bartowski/Starling-LM-7B-beta-GGUF

| Name | Quant method | Size | Use case |
| --- | --- | --- | --- |
| Starling-LM-7B-beta-Q8_0.gguf | Q8_0 | 7.69GB | Extremely high quality, generally unneeded but max available quant. |
| Starling-LM-7B-beta-Q6_K.gguf | Q6_K | 5.94GB | Very high quality, near perfect, recommended. |
| Starling-LM-7B-beta-Q5_K_M.gguf | Q5_K_M | 5.13GB | High quality, very usable. |
| Starling-LM-7B-beta-Q5_K_S.gguf | Q5_K_S | 4.99GB | High quality, very usable. |
| Starling-LM-7B-beta-Q5_0.gguf | Q5_0 | 4.99GB | High quality, older format, generally not recommended. |
| Starling-LM-7B-beta-Q4_K_M.gguf | Q4_K_M | 4.36GB | Good quality, similar to 4.25 bpw. |
| Starling-LM-7B-beta-Q4_K_S.gguf | Q4_K_S | 4.14GB | Slightly lower quality with small space savings. |
| Starling-LM-7B-beta-IQ4_NL.gguf | IQ4_NL | 4.15GB | Good quality, similar to Q4_K_S, new method of quanting. |
| Starling-LM-7B-beta-IQ4_XS.gguf | IQ4_XS | 3.94GB | Decent quality, new method with similar performance to Q4. |
| Starling-LM-7B-beta-Q4_0.gguf | Q4_0 | 4.10GB | Decent quality, older format, generally not recommended. |
| Starling-LM-7B-beta-IQ3_M.gguf | IQ3_M | 3.28GB | Medium-low quality, new method with decent performance. |
| Starling-LM-7B-beta-IQ3_S.gguf | IQ3_S | 3.18GB | Lower quality, new method with decent performance, recommended over Q3 quants. |
| Starling-LM-7B-beta-Q3_K_L.gguf | Q3_K_L | 3.82GB | Lower quality but usable, good for low RAM availability. |
| Starling-LM-7B-beta-Q3_K_M.gguf | Q3_K_M | 3.51GB | Even lower quality. |
| Starling-LM-7B-beta-Q3_K_S.gguf | Q3_K_S | 3.16GB | Low quality, not recommended. |
| Starling-LM-7B-beta-Q2_K.gguf | Q2_K | 2.71GB | Extremely low quality, not recommended. |

WizardLM2 7B

Source:
https://huggingface.co/bartowski/WizardLM-2-7B-GGUF

| Filename | Quant type | File Size | Description |
| --- | --- | --- | --- |
| WizardLM-2-7B-Q8_0.gguf | Q8_0 | 7.69GB | Extremely high quality, generally unneeded but max available quant. |
| WizardLM-2-7B-Q6_K.gguf | Q6_K | 5.94GB | Very high quality, near perfect, recommended. |
| WizardLM-2-7B-Q5_K_M.gguf | Q5_K_M | 5.13GB | High quality, recommended. |
| WizardLM-2-7B-Q5_K_S.gguf | Q5_K_S | 4.99GB | High quality, recommended. |
| WizardLM-2-7B-Q4_K_M.gguf | Q4_K_M | 4.36GB | Good quality, uses about 4.83 bits per weight, recommended. |
| WizardLM-2-7B-Q4_K_S.gguf | Q4_K_S | 4.14GB | Slightly lower quality with more space savings, recommended. |
| WizardLM-2-7B-IQ4_NL.gguf | IQ4_NL | 4.12GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
| WizardLM-2-7B-IQ4_XS.gguf | IQ4_XS | 3.90GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| WizardLM-2-7B-Q3_K_L.gguf | Q3_K_L | 3.82GB | Lower quality but usable, good for low RAM availability. |
| WizardLM-2-7B-Q3_K_M.gguf | Q3_K_M | 3.51GB | Even lower quality. |
| WizardLM-2-7B-IQ3_M.gguf | IQ3_M | 3.28GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| WizardLM-2-7B-IQ3_S.gguf | IQ3_S | 3.18GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
| WizardLM-2-7B-Q3_K_S.gguf | Q3_K_S | 3.16GB | Low quality, not recommended. |
| WizardLM-2-7B-IQ3_XS.gguf | IQ3_XS | 3.01GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| WizardLM-2-7B-IQ3_XXS.gguf | IQ3_XXS | 2.82GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
| WizardLM-2-7B-Q2_K.gguf | Q2_K | 2.71GB | Very low quality but surprisingly usable. |
| WizardLM-2-7B-IQ2_M.gguf | IQ2_M | 2.50GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
| WizardLM-2-7B-IQ2_S.gguf | IQ2_S | 2.31GB | Very low quality, uses SOTA techniques to be usable. |
| WizardLM-2-7B-IQ2_XS.gguf | IQ2_XS | 2.19GB | Very low quality, uses SOTA techniques to be usable. |
| WizardLM-2-7B-IQ2_XXS.gguf | IQ2_XXS | 1.99GB | Lower quality, uses SOTA techniques to be usable. |
| WizardLM-2-7B-IQ1_M.gguf | IQ1_M | 1.75GB | Extremely low quality, not recommended. |
| WizardLM-2-7B-IQ1_S.gguf | IQ1_S | 1.61GB | Extremely low quality, not recommended. |

Codestral 22B

Source:
https://huggingface.co/bartowski/Codestral-22B-v0.1-GGUF

| Filename | Quant type | File Size | Description |
| --- | --- | --- | --- |
| Codestral-22B-v0.1-Q8_0.gguf | Q8_0 | 23.64GB | Extremely high quality, generally unneeded but max available quant. |
| Codestral-22B-v0.1-Q6_K.gguf | Q6_K | 18.25GB | Very high quality, near perfect, recommended. |
| Codestral-22B-v0.1-Q5_K_M.gguf | Q5_K_M | 15.72GB | High quality, recommended. |
| Codestral-22B-v0.1-Q5_K_S.gguf | Q5_K_S | 15.32GB | High quality, recommended. |
| Codestral-22B-v0.1-Q4_K_M.gguf | Q4_K_M | 13.34GB | Good quality, uses about 4.83 bits per weight, recommended. |
| Codestral-22B-v0.1-Q4_K_S.gguf | Q4_K_S | 12.66GB | Slightly lower quality with more space savings, recommended. |
| Codestral-22B-v0.1-IQ4_XS.gguf | IQ4_XS | 11.93GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
| Codestral-22B-v0.1-Q3_K_L.gguf | Q3_K_L | 11.73GB | Lower quality but usable, good for low RAM availability. |
| Codestral-22B-v0.1-Q3_K_M.gguf | Q3_K_M | 10.75GB | Even lower quality. |
| Codestral-22B-v0.1-IQ3_M.gguf | IQ3_M | 10.06GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
| Codestral-22B-v0.1-Q3_K_S.gguf | Q3_K_S | 9.64GB | Low quality, not recommended. |
| Codestral-22B-v0.1-IQ3_XS.gguf | IQ3_XS | 9.17GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
| Codestral-22B-v0.1-IQ3_XXS.gguf | IQ3_XXS | 8.59GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
| Codestral-22B-v0.1-Q2_K.gguf | Q2_K | 8.27GB | Very low quality but surprisingly usable. |
| Codestral-22B-v0.1-IQ2_M.gguf | IQ2_M | 7.61GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
| Codestral-22B-v0.1-IQ2_S.gguf | IQ2_S | 7.03GB | Very low quality, uses SOTA techniques to be usable. |
| Codestral-22B-v0.1-IQ2_XS.gguf | IQ2_XS | 6.64GB | Very low quality, uses SOTA techniques to be usable. |

DeepSeek Coder 33B

Source:
https://huggingface.co/TheBloke/deepseek-coder-33B-base-GGUF

| Name | Quant method | Bits | Size | Max RAM required | Use case |
| --- | --- | --- | --- | --- | --- |
| deepseek-coder-33b-base.Q2_K.gguf | Q2_K | 2 | 14.03 GB | 16.53 GB | smallest, significant quality loss - not recommended for most purposes |
| deepseek-coder-33b-base.Q3_K_S.gguf | Q3_K_S | 3 | 14.42 GB | 16.92 GB | very small, high quality loss |
| deepseek-coder-33b-base.Q3_K_M.gguf | Q3_K_M | 3 | 16.07 GB | 18.57 GB | very small, high quality loss |
| deepseek-coder-33b-base.Q3_K_L.gguf | Q3_K_L | 3 | 17.56 GB | 20.06 GB | small, substantial quality loss |
| deepseek-coder-33b-base.Q4_0.gguf | Q4_0 | 4 | 18.82 GB | 21.32 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| deepseek-coder-33b-base.Q4_K_S.gguf | Q4_K_S | 4 | 18.89 GB | 21.39 GB | small, greater quality loss |
| deepseek-coder-33b-base.Q4_K_M.gguf | Q4_K_M | 4 | 19.94 GB | 22.44 GB | medium, balanced quality - recommended |
| deepseek-coder-33b-base.Q5_0.gguf | Q5_0 | 5 | 22.96 GB | 25.46 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| deepseek-coder-33b-base.Q5_K_S.gguf | Q5_K_S | 5 | 22.96 GB | 25.46 GB | large, low quality loss - recommended |
| deepseek-coder-33b-base.Q5_K_M.gguf | Q5_K_M | 5 | 23.54 GB | 26.04 GB | large, very low quality loss - recommended |
| deepseek-coder-33b-base.Q6_K.gguf | Q6_K | 6 | 27.36 GB | 29.86 GB | very large, extremely low quality loss |
| deepseek-coder-33b-base.Q8_0.gguf | Q8_0 | 8 | 35.43 GB | 37.93 GB | very large, extremely low quality loss - not recommended |

DeepSeek Coder 6.7B

Source:
https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF

| Name | Quant method | Bits | Size | Max RAM required | Use case |
| --- | --- | --- | --- | --- | --- |
| deepseek-coder-6.7b-instruct.Q2_K.gguf | Q2_K | 2 | 2.83 GB | 5.33 GB | smallest, significant quality loss - not recommended for most purposes |
| deepseek-coder-6.7b-instruct.Q3_K_S.gguf | Q3_K_S | 3 | 2.95 GB | 5.45 GB | very small, high quality loss |
| deepseek-coder-6.7b-instruct.Q3_K_M.gguf | Q3_K_M | 3 | 3.30 GB | 5.80 GB | very small, high quality loss |
| deepseek-coder-6.7b-instruct.Q3_K_L.gguf | Q3_K_L | 3 | 3.60 GB | 6.10 GB | small, substantial quality loss |
| deepseek-coder-6.7b-instruct.Q4_0.gguf | Q4_0 | 4 | 3.83 GB | 6.33 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| deepseek-coder-6.7b-instruct.Q4_K_S.gguf | Q4_K_S | 4 | 3.86 GB | 6.36 GB | small, greater quality loss |
| deepseek-coder-6.7b-instruct.Q4_K_M.gguf | Q4_K_M | 4 | 4.08 GB | 6.58 GB | medium, balanced quality - recommended |
| deepseek-coder-6.7b-instruct.Q5_0.gguf | Q5_0 | 5 | 4.65 GB | 7.15 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| deepseek-coder-6.7b-instruct.Q5_K_S.gguf | Q5_K_S | 5 | 4.65 GB | 7.15 GB | large, low quality loss - recommended |
| deepseek-coder-6.7b-instruct.Q5_K_M.gguf | Q5_K_M | 5 | 4.79 GB | 7.29 GB | large, very low quality loss - recommended |
| deepseek-coder-6.7b-instruct.Q6_K.gguf | Q6_K | 6 | 5.53 GB | 8.03 GB | very large, extremely low quality loss |
| deepseek-coder-6.7b-instruct.Q8_0.gguf | Q8_0 | 8 | 7.16 GB | 9.66 GB | very large, extremely low quality loss - not recommended |

WizardCoder 33B

Source:
https://huggingface.co/TheBloke/WizardCoder-33B-V1.1-GGUF

| Name | Quant method | Bits | Size | Max RAM required | Use case |
| --- | --- | --- | --- | --- | --- |
| wizardcoder-33b-v1.1.Q2_K.gguf | Q2_K | 2 | 14.03 GB | 16.53 GB | smallest, significant quality loss - not recommended for most purposes |
| wizardcoder-33b-v1.1.Q3_K_S.gguf | Q3_K_S | 3 | 14.42 GB | 16.92 GB | very small, high quality loss |
| wizardcoder-33b-v1.1.Q3_K_M.gguf | Q3_K_M | 3 | 16.07 GB | 18.57 GB | very small, high quality loss |
| wizardcoder-33b-v1.1.Q3_K_L.gguf | Q3_K_L | 3 | 17.56 GB | 20.06 GB | small, substantial quality loss |
| wizardcoder-33b-v1.1.Q4_0.gguf | Q4_0 | 4 | 18.82 GB | 21.32 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
| wizardcoder-33b-v1.1.Q4_K_S.gguf | Q4_K_S | 4 | 18.89 GB | 21.39 GB | small, greater quality loss |
| wizardcoder-33b-v1.1.Q4_K_M.gguf | Q4_K_M | 4 | 19.94 GB | 22.44 GB | medium, balanced quality - recommended |
| wizardcoder-33b-v1.1.Q5_0.gguf | Q5_0 | 5 | 22.96 GB | 25.46 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
| wizardcoder-33b-v1.1.Q5_K_S.gguf | Q5_K_S | 5 | 22.96 GB | 25.46 GB | large, low quality loss - recommended |
| wizardcoder-33b-v1.1.Q5_K_M.gguf | Q5_K_M | 5 | 23.54 GB | 26.04 GB | large, very low quality loss - recommended |
| wizardcoder-33b-v1.1.Q6_K.gguf | Q6_K | 6 | 27.36 GB | 29.86 GB | very large, extremely low quality loss |
| wizardcoder-33b-v1.1.Q8_0.gguf | Q8_0 | 8 | 35.43 GB | 37.93 GB | very large, extremely low quality loss - not recommended |
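Finally, a usage sketch for running any of these GGUF files with partial GPU offload, assuming the llama-cpp-python bindings; the model path, layer count, and prompt are placeholders, not recommendations:

```python
from llama_cpp import Llama

# n_gpu_layers controls how many transformer layers are offloaded to the GPU;
# -1 offloads all of them (ideal when the whole model fits in VRAM).
llm = Llama(
    model_path="wizardcoder-33b-v1.1.Q4_K_M.gguf",
    n_gpu_layers=20,  # lower this if you run out of VRAM
    n_ctx=4096,
)
out = llm("Write hello world in Python.", max_tokens=128)
print(out["choices"][0]["text"])
```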