This is a summary of post-quantization model sizes and notes. It may serve as a reference when you are unsure which quantized model to use.
The larger the fraction of the model that fits in GPU memory, the faster it runs; ideally, the entire model fits in GPU memory. More quantization bits mean higher precision, but for large language models roughly 4 bits (q4) is commonly said to be sufficient for most uses.
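As a rough illustration of that fit check, here is a minimal Python sketch. It assumes llama-cpp-python is installed and that a GGUF file is already on disk; the file path, VRAM figure, and overhead margin are placeholder assumptions, not measured values.

```python
import os

from llama_cpp import Llama  # pip install llama-cpp-python

# Rough fit check: the GGUF file size approximates the memory the weights
# need. The KV cache and runtime buffers add more on top; the ~15% margin
# below is an assumed placeholder, not a measured figure.
model_path = "aya-23-8B-Q4_K_M.gguf"  # hypothetical local file from the tables below
gpu_vram_gb = 8.0                     # assumed VRAM of the target GPU
overhead = 1.15                       # assumed margin for KV cache and buffers

file_gb = os.path.getsize(model_path) / 1024**3
fits = file_gb * overhead <= gpu_vram_gb
print(f"{file_gb:.2f} GB on disk -> {'fits' if fits else 'does not fit'} in {gpu_vram_gb:.0f} GB of VRAM")

# n_gpu_layers=-1 offloads every layer to the GPU; a smaller value splits
# the model between GPU and CPU, which still works but runs slower.
llm = Llama(model_path=model_path, n_gpu_layers=-1 if fits else 20)
```

The fallback value of 20 offloaded layers is only an example; the right split depends on the model and GPU.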
Aya 8B
Source site
https://huggingface.co/bartowski/aya-23-8B-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
aya-23-8B-Q8_0.gguf | Q8_0 | 8.54GB | Extremely high quality, generally unneeded but max available quant. |
aya-23-8B-Q6_K.gguf | Q6_K | 6.59GB | Very high quality, near perfect, recommended. |
aya-23-8B-Q5_K_M.gguf | Q5_K_M | 5.80GB | High quality, recommended. |
aya-23-8B-Q5_K_S.gguf | Q5_K_S | 5.66GB | High quality, recommended. |
aya-23-8B-Q4_K_M.gguf | Q4_K_M | 5.05GB | Good quality, uses about 4.83 bits per weight, recommended. |
aya-23-8B-Q4_K_S.gguf | Q4_K_S | 4.82GB | Slightly lower quality with more space savings, recommended. |
aya-23-8B-IQ4_NL.gguf | IQ4_NL | 4.81GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
aya-23-8B-IQ4_XS.gguf | IQ4_XS | 4.60GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
aya-23-8B-Q3_K_L.gguf | Q3_K_L | 4.52GB | Lower quality but usable, good for low RAM availability. |
aya-23-8B-Q3_K_M.gguf | Q3_K_M | 4.22GB | Even lower quality. |
aya-23-8B-IQ3_M.gguf | IQ3_M | 3.99GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
aya-23-8B-IQ3_S.gguf | IQ3_S | 3.88GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
aya-23-8B-Q3_K_S.gguf | Q3_K_S | 3.87GB | Low quality, not recommended. |
aya-23-8B-IQ3_XS.gguf | IQ3_XS | 3.72GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
aya-23-8B-IQ3_XXS.gguf | IQ3_XXS | 3.41GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
aya-23-8B-Q2_K.gguf | Q2_K | 3.43GB | Very low quality but surprisingly usable. |
aya-23-8B-IQ2_M.gguf | IQ2_M | 3.08GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
aya-23-8B-IQ2_S.gguf | IQ2_S | 2.89GB | Very low quality, uses SOTA techniques to be usable. |
aya-23-8B-IQ2_XS.gguf | IQ2_XS | 2.79GB | Very low quality, uses SOTA techniques to be usable. |
aya-23-8B-IQ2_XXS.gguf | IQ2_XXS | 2.58GB | Lower quality, uses SOTA techniques to be usable. |
aya-23-8B-IQ1_M.gguf | IQ1_M | 2.35GB | Extremely low quality, not recommended. |
aya-23-8B-IQ1_S.gguf | IQ1_S | 2.20GB | Extremely low quality, not recommended. |
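To fetch a single file from the table above rather than the whole repository, the huggingface_hub client works well. A minimal sketch; the Q4_K_M filename is just one example row:

```python
from huggingface_hub import hf_hub_download  # pip install huggingface_hub

# Download one quant file from the repository linked above. Any filename
# from the table can be substituted; Q4_K_M is a common starting point.
path = hf_hub_download(
    repo_id="bartowski/aya-23-8B-GGUF",
    filename="aya-23-8B-Q4_K_M.gguf",
)
print(path)  # local cache path of the downloaded GGUF file
```

The same pattern applies to every repository listed in this section; only repo_id and filename change.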
Llama3 8B
Source site
https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Meta-Llama-3-8B-Instruct-Q8_0.gguf | Q8_0 | 8.54GB | Extremely high quality, generally unneeded but max available quant. |
Meta-Llama-3-8B-Instruct-Q6_K.gguf | Q6_K | 6.59GB | Very high quality, near perfect, recommended. |
Meta-Llama-3-8B-Instruct-Q5_K_M.gguf | Q5_K_M | 5.73GB | High quality, recommended. |
Meta-Llama-3-8B-Instruct-Q5_K_S.gguf | Q5_K_S | 5.59GB | High quality, recommended. |
Meta-Llama-3-8B-Instruct-Q4_K_M.gguf | Q4_K_M | 4.92GB | Good quality, uses about 4.83 bits per weight, recommended. |
Meta-Llama-3-8B-Instruct-Q4_K_S.gguf | Q4_K_S | 4.69GB | Slightly lower quality with more space savings, recommended. |
Meta-Llama-3-8B-Instruct-IQ4_NL.gguf | IQ4_NL | 4.67GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
Meta-Llama-3-8B-Instruct-IQ4_XS.gguf | IQ4_XS | 4.44GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
Meta-Llama-3-8B-Instruct-Q3_K_L.gguf | Q3_K_L | 4.32GB | Lower quality but usable, good for low RAM availability. |
Meta-Llama-3-8B-Instruct-Q3_K_M.gguf | Q3_K_M | 4.01GB | Even lower quality. |
Meta-Llama-3-8B-Instruct-IQ3_M.gguf | IQ3_M | 3.78GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
Meta-Llama-3-8B-Instruct-IQ3_S.gguf | IQ3_S | 3.68GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
Meta-Llama-3-8B-Instruct-Q3_K_S.gguf | Q3_K_S | 3.66GB | Low quality, not recommended. |
Meta-Llama-3-8B-Instruct-IQ3_XS.gguf | IQ3_XS | 3.51GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
Meta-Llama-3-8B-Instruct-IQ3_XXS.gguf | IQ3_XXS | 3.27GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
Meta-Llama-3-8B-Instruct-Q2_K.gguf | Q2_K | 3.17GB | Very low quality but surprisingly usable. |
Meta-Llama-3-8B-Instruct-IQ2_M.gguf | IQ2_M | 2.94GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
Meta-Llama-3-8B-Instruct-IQ2_S.gguf | IQ2_S | 2.75GB | Very low quality, uses SOTA techniques to be usable. |
Meta-Llama-3-8B-Instruct-IQ2_XS.gguf | IQ2_XS | 2.60GB | Very low quality, uses SOTA techniques to be usable. |
Meta-Llama-3-8B-Instruct-IQ2_XXS.gguf | IQ2_XXS | 2.39GB | Lower quality, uses SOTA techniques to be usable. |
Meta-Llama-3-8B-Instruct-IQ1_M.gguf | IQ1_M | 2.16GB | Extremely low quality, not recommended. |
Meta-Llama-3-8B-Instruct-IQ1_S.gguf | IQ1_S | 2.01GB | Extremely low quality, not recommended. |
Llama3 70B
Source site
https://huggingface.co/bartowski/Meta-Llama-3-70B-Instruct-old-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Meta-Llama-3-70B-Instruct-Q8_0.gguf | Q8_0 | 74.97GB | Extremely high quality, generally unneeded but max available quant. |
Meta-Llama-3-70B-Instruct-Q6_K.gguf | Q6_K | 57.88GB | Very high quality, near perfect, recommended. |
Meta-Llama-3-70B-Instruct-Q5_K_M.gguf | Q5_K_M | 49.94GB | High quality, recommended. |
Meta-Llama-3-70B-Instruct-Q5_K_S.gguf | Q5_K_S | 48.65GB | High quality, recommended. |
Meta-Llama-3-70B-Instruct-Q4_K_M.gguf | Q4_K_M | 42.52GB | Good quality, uses about 4.83 bits per weight, recommended. |
Meta-Llama-3-70B-Instruct-Q4_K_S.gguf | Q4_K_S | 40.34GB | Slightly lower quality with more space savings, recommended. |
Meta-Llama-3-70B-Instruct-IQ4_NL.gguf | IQ4_NL | 40.05GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
Meta-Llama-3-70B-Instruct-IQ4_XS.gguf | IQ4_XS | 37.90GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
Meta-Llama-3-70B-Instruct-Q3_K_L.gguf | Q3_K_L | 37.14GB | Lower quality but usable, good for low RAM availability. |
Meta-Llama-3-70B-Instruct-Q3_K_M.gguf | Q3_K_M | 34.26GB | Even lower quality. |
Meta-Llama-3-70B-Instruct-IQ3_M.gguf | IQ3_M | 31.93GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
Meta-Llama-3-70B-Instruct-IQ3_S.gguf | IQ3_S | 30.91GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
Meta-Llama-3-70B-Instruct-Q3_K_S.gguf | Q3_K_S | 30.91GB | Low quality, not recommended. |
Meta-Llama-3-70B-Instruct-IQ3_XS.gguf | IQ3_XS | 29.30GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
Meta-Llama-3-70B-Instruct-IQ3_XXS.gguf | IQ3_XXS | 27.46GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
Meta-Llama-3-70B-Instruct-Q2_K.gguf | Q2_K | 26.37GB | Very low quality but surprisingly usable. |
Meta-Llama-3-70B-Instruct-IQ2_M.gguf | IQ2_M | 24.11GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
Meta-Llama-3-70B-Instruct-IQ2_S.gguf | IQ2_S | 22.24GB | Very low quality, uses SOTA techniques to be usable. |
Meta-Llama-3-70B-Instruct-IQ2_XS.gguf | IQ2_XS | 21.14GB | Very low quality, uses SOTA techniques to be usable. |
Meta-Llama-3-70B-Instruct-IQ2_XXS.gguf | IQ2_XXS | 19.09GB | Lower quality, uses SOTA techniques to be usable. |
Meta-Llama-3-70B-Instruct-IQ1_M.gguf | IQ1_M | 16.75GB | Extremely low quality, not recommended. |
Meta-Llama-3-70B-Instruct-IQ1_S.gguf | IQ1_S | 15.34GB | Extremely low quality, not recommended. |
Phi-3 Medium
Source site
https://huggingface.co/bartowski/Phi-3-medium-4k-instruct-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Phi-3-medium-4k-instruct-Q8_0.gguf | Q8_0 | 14.83GB | Extremely high quality, generally unneeded but max available quant. |
Phi-3-medium-4k-instruct-Q6_K.gguf | Q6_K | 11.45GB | Very high quality, near perfect, recommended. |
Phi-3-medium-4k-instruct-Q5_K_M.gguf | Q5_K_M | 10.07GB | High quality, recommended. |
Phi-3-medium-4k-instruct-Q5_K_S.gguf | Q5_K_S | 9.62GB | High quality, recommended. |
Phi-3-medium-4k-instruct-Q4_K_M.gguf | Q4_K_M | 8.56GB | Good quality, uses about 4.83 bits per weight, recommended. |
Phi-3-medium-4k-instruct-Q4_K_S.gguf | Q4_K_S | 7.95GB | Slightly lower quality with more space savings, recommended. |
Phi-3-medium-4k-instruct-IQ4_NL.gguf | IQ4_NL | 7.89GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
Phi-3-medium-4k-instruct-IQ4_XS.gguf | IQ4_XS | 7.46GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
Phi-3-medium-4k-instruct-Q3_K_L.gguf | Q3_K_L | 7.49GB | Lower quality but usable, good for low RAM availability. |
Phi-3-medium-4k-instruct-Q3_K_M.gguf | Q3_K_M | 6.92GB | Even lower quality. |
Phi-3-medium-4k-instruct-IQ3_M.gguf | IQ3_M | 6.47GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
Phi-3-medium-4k-instruct-IQ3_S.gguf | IQ3_S | 6.06GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
Phi-3-medium-4k-instruct-Q3_K_S.gguf | Q3_K_S | 6.06GB | Low quality, not recommended. |
Phi-3-medium-4k-instruct-IQ3_XS.gguf | IQ3_XS | 5.80GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
Phi-3-medium-4k-instruct-IQ3_XXS.gguf | IQ3_XXS | 5.45GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
Phi-3-medium-4k-instruct-Q2_K.gguf | Q2_K | 5.14GB | Very low quality but surprisingly usable. |
Phi-3-medium-4k-instruct-IQ2_M.gguf | IQ2_M | 4.71GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
Phi-3-medium-4k-instruct-IQ2_S.gguf | IQ2_S | 4.33GB | Very low quality, uses SOTA techniques to be usable. |
Phi-3-medium-4k-instruct-IQ2_XS.gguf | IQ2_XS | 4.12GB | Very low quality, uses SOTA techniques to be usable. |
Phi-3-medium-4k-instruct-IQ2_XXS.gguf | IQ2_XXS | 3.71GB | Lower quality, uses SOTA techniques to be usable. |
Phi-3-medium-4k-instruct-IQ1_M.gguf | IQ1_M | 3.24GB | Extremely low quality, not recommended. |
Phi-3-medium-4k-instruct-IQ1_S.gguf | IQ1_S | 2.95GB | Extremely low quality, not recommended. |
Mixtral 8x7B
Source site
https://huggingface.co/TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF
Name | Quant method | Bits | Size | Max RAM required | Use case |
---|---|---|---|---|---|
mixtral-8x7b-instruct-v0.1.Q2_K.gguf | Q2_K | 2 | 15.64 GB | 18.14 GB | smallest, significant quality loss - not recommended for most purposes |
mixtral-8x7b-instruct-v0.1.Q3_K_M.gguf | Q3_K_M | 3 | 20.36 GB | 22.86 GB | very small, high quality loss |
mixtral-8x7b-instruct-v0.1.Q4_0.gguf | Q4_0 | 4 | 26.44 GB | 28.94 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
mixtral-8x7b-instruct-v0.1.Q4_K_M.gguf | Q4_K_M | 4 | 26.44 GB | 28.94 GB | medium, balanced quality - recommended |
mixtral-8x7b-instruct-v0.1.Q5_0.gguf | Q5_0 | 5 | 32.23 GB | 34.73 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
mixtral-8x7b-instruct-v0.1.Q5_K_M.gguf | Q5_K_M | 5 | 32.23 GB | 34.73 GB | large, very low quality loss - recommended |
mixtral-8x7b-instruct-v0.1.Q6_K.gguf | Q6_K | 6 | 38.38 GB | 40.88 GB | very large, extremely low quality loss |
mixtral-8x7b-instruct-v0.1.Q8_0.gguf | Q8_0 | 8 | 49.62 GB | 52.12 GB | very large, extremely low quality loss - not recommended |
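Note that the Max RAM required column assumes CPU-only inference with no GPU offload; offloading layers to the GPU shifts that memory from RAM to VRAM. In this table each figure is simply the file size plus a constant 2.50 GB of runtime overhead, which can be read directly off the rows above. A small sketch of that arithmetic; the same pattern holds for the other tables in this format below:

```python
# The "Max RAM required" column above is file size plus a constant 2.50 GB
# of assumed runtime overhead, with no layers offloaded to the GPU.
OVERHEAD_GB = 2.50  # constant implied by the rows above

def max_ram_gb(file_size_gb: float) -> float:
    return file_size_gb + OVERHEAD_GB

assert round(max_ram_gb(15.64), 2) == 18.14  # Q2_K row
assert round(max_ram_gb(26.44), 2) == 28.94  # Q4_K_M row
```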
Starling-LM 7B
Source site
https://huggingface.co/bartowski/Starling-LM-7B-beta-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Starling-LM-7B-beta-Q8_0.gguf | Q8_0 | 7.69GB | Extremely high quality, generally unneeded but max available quant. |
Starling-LM-7B-beta-Q6_K.gguf | Q6_K | 5.94GB | Very high quality, near perfect, recommended. |
Starling-LM-7B-beta-Q5_K_M.gguf | Q5_K_M | 5.13GB | High quality, very usable. |
Starling-LM-7B-beta-Q5_K_S.gguf | Q5_K_S | 4.99GB | High quality, very usable. |
Starling-LM-7B-beta-Q5_0.gguf | Q5_0 | 4.99GB | High quality, older format, generally not recommended. |
Starling-LM-7B-beta-Q4_K_M.gguf | Q4_K_M | 4.36GB | Good quality, similar to 4.25 bpw. |
Starling-LM-7B-beta-Q4_K_S.gguf | Q4_K_S | 4.14GB | Slightly lower quality with small space savings. |
Starling-LM-7B-beta-IQ4_NL.gguf | IQ4_NL | 4.15GB | Good quality, similar to Q4_K_S, new method of quanting. |
Starling-LM-7B-beta-IQ4_XS.gguf | IQ4_XS | 3.94GB | Decent quality, new method with similar performance to Q4. |
Starling-LM-7B-beta-Q4_0.gguf | Q4_0 | 4.10GB | Decent quality, older format, generally not recommended. |
Starling-LM-7B-beta-IQ3_M.gguf | IQ3_M | 3.28GB | Medium-low quality, new method with decent performance. |
Starling-LM-7B-beta-IQ3_S.gguf | IQ3_S | 3.18GB | Lower quality, new method with decent performance, recommended over Q3 quants. |
Starling-LM-7B-beta-Q3_K_L.gguf | Q3_K_L | 3.82GB | Lower quality but usable, good for low RAM availability. |
Starling-LM-7B-beta-Q3_K_M.gguf | Q3_K_M | 3.51GB | Even lower quality. |
Starling-LM-7B-beta-Q3_K_S.gguf | Q3_K_S | 3.16GB | Low quality, not recommended. |
Starling-LM-7B-beta-Q2_K.gguf | Q2_K | 2.71GB | Extremely low quality, not recommended. |
WizardLM2 7B
Source site
https://huggingface.co/bartowski/WizardLM-2-7B-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
WizardLM-2-7B-Q8_0.gguf | Q8_0 | 7.69GB | Extremely high quality, generally unneeded but max available quant. |
WizardLM-2-7B-Q6_K.gguf | Q6_K | 5.94GB | Very high quality, near perfect, recommended. |
WizardLM-2-7B-Q5_K_M.gguf | Q5_K_M | 5.13GB | High quality, recommended. |
WizardLM-2-7B-Q5_K_S.gguf | Q5_K_S | 4.99GB | High quality, recommended. |
WizardLM-2-7B-Q4_K_M.gguf | Q4_K_M | 4.36GB | Good quality, uses about 4.83 bits per weight, recommended. |
WizardLM-2-7B-Q4_K_S.gguf | Q4_K_S | 4.14GB | Slightly lower quality with more space savings, recommended. |
WizardLM-2-7B-IQ4_NL.gguf | IQ4_NL | 4.12GB | Decent quality, slightly smaller than Q4_K_S with similar performance, recommended. |
WizardLM-2-7B-IQ4_XS.gguf | IQ4_XS | 3.90GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
WizardLM-2-7B-Q3_K_L.gguf | Q3_K_L | 3.82GB | Lower quality but usable, good for low RAM availability. |
WizardLM-2-7B-Q3_K_M.gguf | Q3_K_M | 3.51GB | Even lower quality. |
WizardLM-2-7B-IQ3_M.gguf | IQ3_M | 3.28GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
WizardLM-2-7B-IQ3_S.gguf | IQ3_S | 3.18GB | Lower quality, new method with decent performance, recommended over Q3_K_S quant, same size with better performance. |
WizardLM-2-7B-Q3_K_S.gguf | Q3_K_S | 3.16GB | Low quality, not recommended. |
WizardLM-2-7B-IQ3_XS.gguf | IQ3_XS | 3.01GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
WizardLM-2-7B-IQ3_XXS.gguf | IQ3_XXS | 2.82GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
WizardLM-2-7B-Q2_K.gguf | Q2_K | 2.71GB | Very low quality but surprisingly usable. |
WizardLM-2-7B-IQ2_M.gguf | IQ2_M | 2.50GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
WizardLM-2-7B-IQ2_S.gguf | IQ2_S | 2.31GB | Very low quality, uses SOTA techniques to be usable. |
WizardLM-2-7B-IQ2_XS.gguf | IQ2_XS | 2.19GB | Very low quality, uses SOTA techniques to be usable. |
WizardLM-2-7B-IQ2_XXS.gguf | IQ2_XXS | 1.99GB | Lower quality, uses SOTA techniques to be usable. |
WizardLM-2-7B-IQ1_M.gguf | IQ1_M | 1.75GB | Extremely low quality, not recommended. |
WizardLM-2-7B-IQ1_S.gguf | IQ1_S | 1.61GB | Extremely low quality, not recommended. |
Codestral 22B
Source site
https://huggingface.co/bartowski/Codestral-22B-v0.1-GGUF
Filename | Quant type | File Size | Description |
---|---|---|---|
Codestral-22B-v0.1-Q8_0.gguf | Q8_0 | 23.64GB | Extremely high quality, generally unneeded but max available quant. |
Codestral-22B-v0.1-Q6_K.gguf | Q6_K | 18.25GB | Very high quality, near perfect, recommended. |
Codestral-22B-v0.1-Q5_K_M.gguf | Q5_K_M | 15.72GB | High quality, recommended. |
Codestral-22B-v0.1-Q5_K_S.gguf | Q5_K_S | 15.32GB | High quality, recommended. |
Codestral-22B-v0.1-Q4_K_M.gguf | Q4_K_M | 13.34GB | Good quality, uses about 4.83 bits per weight, recommended. |
Codestral-22B-v0.1-Q4_K_S.gguf | Q4_K_S | 12.66GB | Slightly lower quality with more space savings, recommended. |
Codestral-22B-v0.1-IQ4_XS.gguf | IQ4_XS | 11.93GB | Decent quality, smaller than Q4_K_S with similar performance, recommended. |
Codestral-22B-v0.1-Q3_K_L.gguf | Q3_K_L | 11.73GB | Lower quality but usable, good for low RAM availability. |
Codestral-22B-v0.1-Q3_K_M.gguf | Q3_K_M | 10.75GB | Even lower quality. |
Codestral-22B-v0.1-IQ3_M.gguf | IQ3_M | 10.06GB | Medium-low quality, new method with decent performance comparable to Q3_K_M. |
Codestral-22B-v0.1-Q3_K_S.gguf | Q3_K_S | 9.64GB | Low quality, not recommended. |
Codestral-22B-v0.1-IQ3_XS.gguf | IQ3_XS | 9.17GB | Lower quality, new method with decent performance, slightly better than Q3_K_S. |
Codestral-22B-v0.1-IQ3_XXS.gguf | IQ3_XXS | 8.59GB | Lower quality, new method with decent performance, comparable to Q3 quants. |
Codestral-22B-v0.1-Q2_K.gguf | Q2_K | 8.27GB | Very low quality but surprisingly usable. |
Codestral-22B-v0.1-IQ2_M.gguf | IQ2_M | 7.61GB | Very low quality, uses SOTA techniques to also be surprisingly usable. |
Codestral-22B-v0.1-IQ2_S.gguf | IQ2_S | 7.03GB | Very low quality, uses SOTA techniques to be usable. |
Codestral-22B-v0.1-IQ2_XS.gguf | IQ2_XS | 6.64GB | Very low quality, uses SOTA techniques to be usable. |
DeepSeek Coder 33B
Source site
https://huggingface.co/TheBloke/deepseek-coder-33B-base-GGUF
Name | Quant method | Bits | Size | Max RAM required | Use case |
---|---|---|---|---|---|
deepseek-coder-33b-base.Q2_K.gguf | Q2_K | 2 | 14.03 GB | 16.53 GB | smallest, significant quality loss - not recommended for most purposes |
deepseek-coder-33b-base.Q3_K_S.gguf | Q3_K_S | 3 | 14.42 GB | 16.92 GB | very small, high quality loss |
deepseek-coder-33b-base.Q3_K_M.gguf | Q3_K_M | 3 | 16.07 GB | 18.57 GB | very small, high quality loss |
deepseek-coder-33b-base.Q3_K_L.gguf | Q3_K_L | 3 | 17.56 GB | 20.06 GB | small, substantial quality loss |
deepseek-coder-33b-base.Q4_0.gguf | Q4_0 | 4 | 18.82 GB | 21.32 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
deepseek-coder-33b-base.Q4_K_S.gguf | Q4_K_S | 4 | 18.89 GB | 21.39 GB | small, greater quality loss |
deepseek-coder-33b-base.Q4_K_M.gguf | Q4_K_M | 4 | 19.94 GB | 22.44 GB | medium, balanced quality - recommended |
deepseek-coder-33b-base.Q5_0.gguf | Q5_0 | 5 | 22.96 GB | 25.46 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
deepseek-coder-33b-base.Q5_K_S.gguf | Q5_K_S | 5 | 22.96 GB | 25.46 GB | large, low quality loss - recommended |
deepseek-coder-33b-base.Q5_K_M.gguf | Q5_K_M | 5 | 23.54 GB | 26.04 GB | large, very low quality loss - recommended |
deepseek-coder-33b-base.Q6_K.gguf | Q6_K | 6 | 27.36 GB | 29.86 GB | very large, extremely low quality loss |
deepseek-coder-33b-base.Q8_0.gguf | Q8_0 | 8 | 35.43 GB | 37.93 GB | very large, extremely low quality loss - not recommended |
DeepSeek Coder 6.7B
Source site
https://huggingface.co/TheBloke/deepseek-coder-6.7B-instruct-GGUF
Name | Quant method | Bits | Size | Max RAM required | Use case |
---|---|---|---|---|---|
deepseek-coder-6.7b-instruct.Q2_K.gguf | Q2_K | 2 | 2.83 GB | 5.33 GB | smallest, significant quality loss - not recommended for most purposes |
deepseek-coder-6.7b-instruct.Q3_K_S.gguf | Q3_K_S | 3 | 2.95 GB | 5.45 GB | very small, high quality loss |
deepseek-coder-6.7b-instruct.Q3_K_M.gguf | Q3_K_M | 3 | 3.30 GB | 5.80 GB | very small, high quality loss |
deepseek-coder-6.7b-instruct.Q3_K_L.gguf | Q3_K_L | 3 | 3.60 GB | 6.10 GB | small, substantial quality loss |
deepseek-coder-6.7b-instruct.Q4_0.gguf | Q4_0 | 4 | 3.83 GB | 6.33 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
deepseek-coder-6.7b-instruct.Q4_K_S.gguf | Q4_K_S | 4 | 3.86 GB | 6.36 GB | small, greater quality loss |
deepseek-coder-6.7b-instruct.Q4_K_M.gguf | Q4_K_M | 4 | 4.08 GB | 6.58 GB | medium, balanced quality - recommended |
deepseek-coder-6.7b-instruct.Q5_0.gguf | Q5_0 | 5 | 4.65 GB | 7.15 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
deepseek-coder-6.7b-instruct.Q5_K_S.gguf | Q5_K_S | 5 | 4.65 GB | 7.15 GB | large, low quality loss - recommended |
deepseek-coder-6.7b-instruct.Q5_K_M.gguf | Q5_K_M | 5 | 4.79 GB | 7.29 GB | large, very low quality loss - recommended |
deepseek-coder-6.7b-instruct.Q6_K.gguf | Q6_K | 6 | 5.53 GB | 8.03 GB | very large, extremely low quality loss |
deepseek-coder-6.7b-instruct.Q8_0.gguf | Q8_0 | 8 | 7.16 GB | 9.66 GB | very large, extremely low quality loss - not recommended |
WizardCoder 33B
Source site
https://huggingface.co/TheBloke/WizardCoder-33B-V1.1-GGUF
Name | Quant method | Bits | Size | Max RAM required | Use case |
---|---|---|---|---|---|
wizardcoder-33b-v1.1.Q2_K.gguf | Q2_K | 2 | 14.03 GB | 16.53 GB | smallest, significant quality loss - not recommended for most purposes |
wizardcoder-33b-v1.1.Q3_K_S.gguf | Q3_K_S | 3 | 14.42 GB | 16.92 GB | very small, high quality loss |
wizardcoder-33b-v1.1.Q3_K_M.gguf | Q3_K_M | 3 | 16.07 GB | 18.57 GB | very small, high quality loss |
wizardcoder-33b-v1.1.Q3_K_L.gguf | Q3_K_L | 3 | 17.56 GB | 20.06 GB | small, substantial quality loss |
wizardcoder-33b-v1.1.Q4_0.gguf | Q4_0 | 4 | 18.82 GB | 21.32 GB | legacy; small, very high quality loss - prefer using Q3_K_M |
wizardcoder-33b-v1.1.Q4_K_S.gguf | Q4_K_S | 4 | 18.89 GB | 21.39 GB | small, greater quality loss |
wizardcoder-33b-v1.1.Q4_K_M.gguf | Q4_K_M | 4 | 19.94 GB | 22.44 GB | medium, balanced quality - recommended |
wizardcoder-33b-v1.1.Q5_0.gguf | Q5_0 | 5 | 22.96 GB | 25.46 GB | legacy; medium, balanced quality - prefer using Q4_K_M |
wizardcoder-33b-v1.1.Q5_K_S.gguf | Q5_K_S | 5 | 22.96 GB | 25.46 GB | large, low quality loss - recommended |
wizardcoder-33b-v1.1.Q5_K_M.gguf | Q5_K_M | 5 | 23.54 GB | 26.04 GB | large, very low quality loss - recommended |
wizardcoder-33b-v1.1.Q6_K.gguf | Q6_K | 6 | 27.36 GB | 29.86 GB | very large, extremely low quality loss |
wizardcoder-33b-v1.1.Q8_0.gguf | Q8_0 | 8 | 35.43 GB | 37.93 GB | very large, extremely low quality loss - not recommended |