Trying out an LLM (gemma3) with Ollama on Colab

Posted at 2026-01-29

Introduction

I tried running an LLM through Ollama on Google Colaboratory.
Colab sessions can be disconnected at any time, so this is strictly a trial run.
Since it is only a trial, I chose a small model size, 4B.

Setup

First, install zstd, which is not present on Colaboratory.
Without it, the Ollama installer fails with the following error:
ERROR: This version requires zstd for extraction. Please install zstd and try again:

!apt-get update
!apt-get install -y zstd

Install Ollama.

!curl -fsSL https://ollama.com/install.sh | sh

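The Ollama installer prints WARNINGs, but they do not prevent the installation from completing. However,
as the message "Unable to detect NVIDIA/AMD GPU" suggests, the GPU apparently cannot be used in this state.
The installer says to install lspci or lshw in order to use the GPU, but to keep things simple I left it as is this time.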
Install lspci or lshw to automatically detect and install GPU dependencies.

>>> Installing ollama to /usr/local
>>> Downloading ollama-linux-amd64.tar.zst
######################################################################## 100.0%
>>> Creating ollama user...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
WARNING: systemd is not running
WARNING: Unable to detect NVIDIA/AMD GPU. Install lspci or lshw to automatically detect and install GPU dependencies.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.

Start the server in the background so that it does not block the notebook.

!nohup ollama serve > /dev/null 2>&1 &
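
Because the server is started in the background, the next cell may run before it is actually listening. The following is a minimal sketch of a readiness check. It assumes the address printed by the installer (127.0.0.1:11434) and that the server answers a plain GET on its root URL with HTTP 200 once it is up. The names ollama_is_up and wait_until are hypothetical helpers, not part of Ollama.

```python
import time
import urllib.request

# Assumption: the address printed by the installer, reached via localhost.
OLLAMA_URL = "http://localhost:11434/"


def ollama_is_up(url=OLLAMA_URL):
    """Return True if an HTTP server answers at url with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except OSError:
        # URLError, HTTPError and connection errors are all OSError subclasses.
        return False


def wait_until(probe, timeout=30.0, interval=0.5):
    """Call probe() repeatedly until it returns True or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In a notebook cell, calling wait_until(ollama_is_up) right after starting the server returns True as soon as the server responds, or False if it has not responded within the timeout.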

Pull gemma3:4b with the following command.

!ollama pull gemma3:4b

Now run the LLM. Replace the ###...### placeholder with your question and execute the cell.

import requests

r = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "gemma3:4b",
        "messages": [
            {"role":"user","content":"###############"}
        ],
        "stream": False
    }
)

print(r.json()["message"]["content"])
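
If the model name is mistyped or the model has not been pulled yet, the response contains no "message" key and the print line above fails with a KeyError that hides the real cause. The sketch below shows one way to surface the server's message instead. It assumes that failures come back as a JSON object with an "error" field. build_chat_payload and extract_reply are hypothetical helper names.

```python
def build_chat_payload(model, messages, stream=False):
    """Build the same JSON body as the request above."""
    return {"model": model, "messages": list(messages), "stream": stream}


def extract_reply(data):
    """Return the assistant text from a decoded /api/chat response.

    Raises RuntimeError when the body carries an "error" field
    (assumed error format) or has no message content.
    """
    if isinstance(data, dict) and "error" in data:
        raise RuntimeError(f"Ollama returned an error: {data['error']}")
    try:
        return data["message"]["content"]
    except (KeyError, TypeError) as exc:
        raise RuntimeError(f"unexpected response shape: {data!r}") from exc
```

With these helpers, the last line of the script could become print(extract_reply(r.json())), and the json= argument could be built with build_chat_payload("gemma3:4b", [{"role": "user", "content": "..."}]).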

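However, this means editing and re-running the script for every question, which is tedious.
The following script instead lets you ask questions one after another.
Because the whole history is sent with each request, it feels a little more like a chat.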

import requests

history = []

while True:
    q = input("You: ")
    history.append({"role":"user","content":q})

    r = requests.post(
        "http://localhost:11434/api/chat",
        json={"model":"gemma3:4b","messages":history,"stream":False}
    ).json()

    a = r["message"]["content"]
    print("LLM:", a)

    history.append({"role":"assistant","content":a})

Below is an exchange with gemma3:4b.

You: hello!

LLM: Hello there! How's your day going so far? 😊 
Is there anything you’d like to chat about, or were you just saying hello?

You: Is LiDAR getting popular in autonomous driving car?

LLM: Okay, let's talk about LiDAR and its role in autonomous driving! The short answer is: **yes, LiDAR is becoming increasingly popular, but it’s not the *only* technology and it's a complex situation.**

Here’s a breakdown of where things stand:
.....
Do you want me to delve deeper into any specific aspect of this, such as:
.....

You: Tesla’s approach to autonomous driving?

LLM: Okay, let’s dive into Tesla’s approach to autonomous driving 
.....

Interrupting the program displays the following.

KeyboardInterrupt: Interrupted by user
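
The loop above has no exit condition, so interrupting the cell is the only way to stop it. The following is a minimal sketch of a variant that ends cleanly when the user types a quit word or interrupts the input prompt. The function chat_loop, its parameters, and the quit words are hypothetical. reply_fn stands in for the requests.post call and must return the assistant's text for a given history.

```python
def chat_loop(reply_fn, read=input, write=print, quit_words=("exit", "quit")):
    """Run a chat loop until the user types a quit word or interrupts.

    reply_fn(history) must return the assistant's text for that history.
    Returns the accumulated message history.
    """
    history = []
    while True:
        try:
            q = read("You: ").strip()
        except (KeyboardInterrupt, EOFError):
            # Interrupting while waiting for input ends the chat quietly.
            write("(chat ended)")
            break
        if q.lower() in quit_words:
            break
        if not q:
            continue  # ignore empty input instead of sending it to the model
        history.append({"role": "user", "content": q})
        a = reply_fn(history)
        write("LLM:", a)
        history.append({"role": "assistant", "content": a})
    return history
```

An interrupt that arrives while reply_fn is waiting for the model is not caught here and still propagates as before.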

To pull qwen3:4b instead, run the following.

!ollama pull qwen3:4b

To use qwen3:4b, change the model name on the json= line.

import requests

history = []

while True:
    q = input("You: ")
    history.append({"role":"user","content":q})

    r = requests.post(
        "http://localhost:11434/api/chat",
        json={"model":"qwen3:4b","messages":history,"stream":False}
    ).json()

    a = r["message"]["content"]
    print("LLM:", a)

    history.append({"role":"assistant","content":a})
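
Since the two scripts differ only in the model name, the name can also be passed in as a parameter rather than edited in place. The sketch below is one possible way to do that, not the approach used in this article. It uses only the standard library instead of requests, and post_json and make_reply_fn are hypothetical helper names.

```python
import json
import urllib.request

# Same endpoint as in the scripts above.
CHAT_URL = "http://localhost:11434/api/chat"


def post_json(url, payload, timeout=300):
    """POST payload as JSON and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def make_reply_fn(model, post=post_json, url=CHAT_URL):
    """Return a function that sends a message history to the given model."""
    def reply(history):
        data = post(url, {"model": model, "messages": history, "stream": False})
        return data["message"]["content"]
    return reply
```

For example, gemma = make_reply_fn("gemma3:4b") and qwen = make_reply_fn("qwen3:4b") give two functions that take the same history list, so the same conversation can be sent to either model.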

Conclusion

Trying out an LLM turned out to be surprisingly easy.
Once I get a good PC, I would like to try running a local LLM as well.
