How to Deploy DeepSeek On-Premises

Posted at 2025-02-26

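Introduction

Recent advances in large language models (LLMs) have made it possible for companies and research institutions to run their own AI models. Among them, DeepSeek is a high-performance open-source LLM that can also be operated in an on-premises environment.

Running DeepSeek on-premises lets you use AI while keeping confidential data inside your own infrastructure. This article walks through the concrete steps for operating DeepSeek in an on-premises environment.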

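1. Checking Hardware Requirements

Running a large language model such as DeepSeek requires high-performance hardware. The specifications below serve as a baseline.

1.1 GPU Requirements

NVIDIA A100 (80GB) x2 or more recommended
or RTX 4090 / 3090 x2
Support for recent versions of CUDA and cuDNN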
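
As a rough sanity check on the GPU figures above, the memory needed just to hold the weights can be estimated as parameter count times bytes per parameter. The sketch below assumes a round figure of 7 billion parameters (in line with the deepseek-llm-7b-base checkpoint downloaded later in this article) and ignores the KV cache and activations, which add overhead on top. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed to store the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumption: 7e9 parameters as a round figure for a "7B" model.
for dtype_name, nbytes in [("float32", 4), ("float16/bfloat16", 2), ("int8", 1)]:
    print(f"{dtype_name}: {weight_memory_gib(7e9, nbytes):.1f} GiB")
```

Under these assumptions the weights come to roughly 26 GiB in float32 and 13 GiB in 16-bit precision, which is why the choice of data type matters when sizing GPUs.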

1.2 CPU and Memory Requirements

CPU: Intel Xeon / AMD EPYC series
RAM: 128GB or more recommended (at least 64GB)
Storage: SSD of 2TB or more (for storing and processing models)

1.3 Network Requirements

Internet connection (only for the initial model download)
Sufficient local network bandwidth (to speed up data processing)
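
Because the internet connection is only needed for the first download, later runs can be pinned to the files already on disk. A minimal sketch, assuming the Hugging Face libraries used later in this article: the `HF_HUB_OFFLINE` and `TRANSFORMERS_OFFLINE` environment variables tell them not to contact the Hub. The helper name `enable_offline_mode` is hypothetical.

```python
import os


def enable_offline_mode() -> None:
    """Ask the Hugging Face libraries to use only files already on disk."""
    os.environ["HF_HUB_OFFLINE"] = "1"
    os.environ["TRANSFORMERS_OFFLINE"] = "1"


enable_offline_mode()
# Import transformers only after this point so the settings take effect.
```

Setting the same variables in the shell or in the service definition has the same effect and avoids depending on import order.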

2. Preparing the Software

Several pieces of software need to be set up before DeepSeek can run.

2.1 Installing the OS and Drivers

Install Ubuntu 22.04 LTS

Install a recent NVIDIA driver (version 535 in this example)

sudo apt update
sudo apt install -y nvidia-driver-535
sudo reboot

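Install CUDA and cuDNN

# Register NVIDIA's CUDA apt repository through the cuda-keyring package
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
# Install only the toolkit so the driver installed above is left untouched
sudo apt install -y cuda-toolkit-12-2
# cuDNN ships with the PyTorch wheels installed in section 2.3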

2.2 Setting Up the Python Environment

Install Miniconda

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
source ~/.bashrc

Create a Python virtual environment

conda create -n deepseek python=3.10
conda activate deepseek

2.3 Installing the Required Libraries

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install transformers accelerate optimum
pip install huggingface_hub
pip install flask  # needed by the API server in section 4.2
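
Before downloading any weights, it can help to confirm that the libraries are importable and that PyTorch can see a GPU. The following is a minimal sketch rather than an official check: the function name `check_environment` and the `REQUIRED` list are hypothetical, and the list simply mirrors the packages installed above.

```python
import importlib.util

# Assumption: these are the packages the rest of this article relies on.
REQUIRED = ["torch", "transformers", "accelerate", "huggingface_hub"]


def check_environment() -> dict:
    """Report which packages are installed and whether CUDA is usable."""
    report = {name: importlib.util.find_spec(name) is not None for name in REQUIRED}
    report["cuda_available"] = False
    if report["torch"]:
        import torch  # imported lazily so the check works without torch too

        report["cuda_available"] = torch.cuda.is_available()
    return report


for key, value in check_environment().items():
    print(f"{key}: {value}")
```

If `cuda_available` prints `False` on a GPU machine, the driver installation from section 2.1 is the first thing to revisit.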

3. Obtaining the DeepSeek Model

Download the model from the deepseek-ai organization on Hugging Face.

# Logging in is optional for public models such as this one
huggingface-cli login
mkdir -p ~/deepseek_model
# Fetch the whole repository: config, tokenizer files, and all weight files
huggingface-cli download deepseek-ai/deepseek-llm-7b-base --local-dir ~/deepseek_model
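
A partial download is a common reason for load failures, so a quick look at the model directory before loading can save time. The sketch below assumes that a usable checkpoint has at least `config.json`, `tokenizer.json`, and one weight file ending in `.safetensors` or `.bin`; the function name `missing_model_files` is hypothetical.

```python
from pathlib import Path


def missing_model_files(model_dir: str) -> list[str]:
    """Return a list of expected items that are absent from model_dir."""
    root = Path(model_dir).expanduser()
    if not root.is_dir():
        return [f"directory {root}"]
    missing = [
        name for name in ("config.json", "tokenizer.json")
        if not (root / name).is_file()
    ]
    # Weights may be a single file or several shards, in either format.
    has_weights = any(root.glob("*.safetensors")) or any(root.glob("*.bin"))
    if not has_weights:
        missing.append("weights (*.safetensors or *.bin)")
    return missing


print(missing_model_files("~/deepseek_model"))
```

An empty list means the basic files are in place; it does not verify that the files are complete or uncorrupted.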

4. Running DeepSeek Locally

4.1 Loading the Model

Run the following Python script to load the DeepSeek model locally and generate text.

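import os

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Python does not expand "~" on its own, so resolve it explicitly
model_path = os.path.expanduser("~/deepseek_model")
tokenizer = AutoTokenizer.from_pretrained(model_path)
# Load in bfloat16 and let accelerate place the weights on the available GPUs
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

input_text = "Hello, what shall we talk about today?"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
# max_new_tokens limits only the generated text, not prompt plus output
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))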

4.2 Serving the Model as an API

Starting an API server in the local environment lets other systems integrate with the model.

`app.py`

import os

from flask import Flask, request, jsonify
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

app = Flask(__name__)

# Resolve "~" explicitly; from_pretrained does not expand it
model_path = os.path.expanduser("~/deepseek_model")
tokenizer = AutoTokenizer.from_pretrained(model_path)
# Load once at startup, in bfloat16, onto the available GPUs
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

def generate_response(prompt):
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=200)
    return tokenizer.decode(output[0], skip_special_tokens=True)

@app.route("/generate", methods=["POST"])
def generate():
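    # silent=True yields None instead of raising on a non-JSON body
    data = request.get_json(silent=True) or {}
    prompt = data.get("prompt", "")
    if not prompt:
        return jsonify({"error": "prompt is required"}), 400
    response = generate_response(prompt)
    return jsonify({"response": response})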

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)

Starting the Flask API

export FLASK_APP=app.py
flask run --host=0.0.0.0 --port=5000
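
Once the server is running, other systems can call the `/generate` endpoint over HTTP. The client below is a minimal sketch using only the standard library. It assumes the server is reachable at `http://localhost:5000` and returns JSON with a `response` field, as in `app.py`; the names `API_URL`, `build_request`, and `ask` are hypothetical.

```python
import json
import urllib.request

# Assumption: the Flask server from app.py runs on this host and port.
API_URL = "http://localhost:5000/generate"


def build_request(prompt: str, url: str = API_URL) -> urllib.request.Request:
    """Build the JSON POST request expected by the /generate endpoint."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str, url: str = API_URL, timeout: float = 120.0) -> str:
    """Send the prompt to the server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example (requires the server to be running):
# print(ask("Hello, what shall we talk about today?"))
```

The generous timeout is a deliberate choice, since generating 200 tokens can take a while on a busy GPU.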

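5. Summary

This article walked through the steps for deploying DeepSeek in an on-premises environment.

Preparing the required hardware and software environment
Obtaining and setting up the DeepSeek model
Running the model locally and exposing it as an API

Use these steps as a reference to run DeepSeek securely in the local environment of your company or research institution.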
