Running qwen3-coder:30b while using about half of an NVIDIA GeForce RTX 5070 12GB

Environment

GPU: NVIDIA GeForce RTX 5070 12GB
CPU: Core i7-14700F (2.1GHz-5.3GHz, 20 cores / 28 threads)
Memory: 64GB
OS: Ubuntu 22.04 on WSL2 on Windows 11

Example commands

# 1) Pull the model (it is large, so watch download time and disk space)
ollama pull qwen3-coder:30b

# 2) Modelfile (use only part of the GPU + keep output short)
cat > Modelfile <<'EOF'
FROM qwen3-coder:30b

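# --- Compromise settings for 12GB of VRAM ---
# num_gpu: number of layers offloaded to the GPU; tune between 12 and 16 (higher is faster but puts more load on the GPU)
PARAMETER num_gpu 14
# num_ctx: larger contexts get heavy, so start at 2K
PARAMETER num_ctx 2048
# num_predict: keep replies short to avoid timeouts
PARAMETER num_predict 256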

PARAMETER temperature 0.2
PARAMETER top_p 0.9
PARAMETER top_k 40

SYSTEM """
You are a concise coding assistant. Prefer TypeScript/JavaScript examples unless specified.
"""
EOF

# 3) Create and test
ollama create qwen3-coder:30b-cline-lite -f Modelfile
ollama run qwen3-coder:30b-cline-lite "Output just one TypeScript function that shuffles an array"
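
When tuning num_gpu, a back-of-the-envelope estimate of how much VRAM the offloaded layers take can help. The sketch below is only a rough approximation: it assumes all layers are about the same size and leaves out the KV cache and runtime overhead, so actual usage will be higher. `estimate_vram_mib` is a hypothetical helper (not part of Ollama), and the model size and layer count passed to it are placeholder values that should be replaced with the real figures for your model.

```shell
#!/usr/bin/env bash
# Rough VRAM estimate for the layers offloaded to the GPU.
# Assumes equally sized layers; ignores KV cache and runtime overhead.
# estimate_vram_mib is a hypothetical helper, not an Ollama command.
estimate_vram_mib() {
  local model_mib=$1     # total model size in MiB (placeholder, check your model)
  local total_layers=$2  # number of layers in the model (placeholder, check your model)
  local gpu_layers=$3    # the num_gpu value from the Modelfile
  # Integer arithmetic: the share of the model that lands on the GPU.
  echo $(( model_mib * gpu_layers / total_layers ))
}

# Example with placeholder numbers: 19000 MiB model, 48 layers, num_gpu 14.
estimate_vram_mib 19000 48 14   # prints 5541
```

With those placeholder numbers the estimate comes out to a bit under half of a 12GB card, which leaves room for the context and other overhead.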

Even with these compromise settings, the 30b model sometimes takes about a minute to respond, but I'm satisfied to have confirmed that it is usable.
