Azure Web AppsにデプロイしたChainlitでファイル添付対応

Last updated at 2025-07-08Posted at 2025-07-08

以下の記事で実装した基本的なChainlitでのLLM使用に対して、ファイル添付対応しました。PDFだけしか対応しておらず、Wordなど個別にロジック追加で書く必要あります。
以下の記事の追加実装です。

実装アプリ

ファイル添付してプロンプト入力して実行すると、使用したモデルと添付ファイル読込結果をまずは返すようにしています。ファイルを複数添付することも可能です。

でプロンプト指示結果を返します。

Step

0. 前提

種類	Version	備考
OS	Ubuntu22.04.5 LTS	WSL2で動かしています
Python	3.13.2
Poetry	2.1.3
VS Code	1.101.2

VSCode 拡張

種類	Version	備考
Azure App Service	0.26.2

Python パッケージ

種類	Version	備考
chainlit	2.5.5
openai	1.93.0
pypdf	5.7.0	前記事からの追加
python-dotenv	1.1.1
semantic-kernel	1.34.0

その他

以下の記事の実装。

1. Web Apps 作成

主スクリプト全体です。

app.py

import os

import chainlit as cl
import semantic_kernel as sk
from chainlit.input_widget import Select
from dotenv import load_dotenv
from pypdf import PdfReader
from semantic_kernel.connectors.ai import FunctionChoiceBehavior
from semantic_kernel.connectors.ai.open_ai import (
    AzureChatCompletion,
    OpenAIChatPromptExecutionSettings,
)
from semantic_kernel.contents import ChatHistory

request_settings = OpenAIChatPromptExecutionSettings(
    function_choice_behavior=FunctionChoiceBehavior.Auto(
        filters={"excluded_plugins": ["ChatBot"]}
    )
)

load_dotenv(override=True)
MODELS = ["gpt-4.1-mini", "gpt-4o-mini", "o3"]
ENDPOINT = os.getenv("AZURE_OPENAI_ENDPOINT")
deployment = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME")
API_KEY = os.getenv("AZURE_OPENAI_KEY")


@cl.on_chat_start
async def on_chat_start():
    # Setup Semantic Kernel
    kernel = sk.Kernel()
    ai_service = AzureChatCompletion(
        endpoint=ENDPOINT,
        deployment_name=MODELS[0],
        api_key=API_KEY,
        api_version="2024-12-01-preview",
    )
    kernel.add_service(ai_service)

    # Instantiate and add the Chainlit filter to the kernel
    # This will automatically capture function calls as Steps
    cl.SemanticKernelFilter(kernel=kernel)

    await cl.ChatSettings(
        [
            Select(
                id="model_choice",
                label="使用するモデルを選択してください",
                values=MODELS,
                initial_index=0,
            )
        ]
    ).send()

    # モデルの初期設定
    cl.user_session.set("model_name", MODELS[0])
    cl.user_session.set("kernel", kernel)
    cl.user_session.set("chat_history", ChatHistory())


# モデル設定変更時のイベントハンドラ
@cl.on_settings_update
async def on_settings_update(settings):
    model_name = settings.get("model_choice")
    cl.user_session.set("model_name", model_name)

    kernel = cl.user_session.get("kernel")
    kernel.remove_all_services()
    ai_service = AzureChatCompletion(
        endpoint=ENDPOINT,
        deployment_name=model_name,
        api_key=API_KEY,
        api_version="2024-12-01-preview",
    )
    kernel.add_service(ai_service)
    cl.user_session.set("kernel", kernel)
    cl.user_session.set("model_name", model_name)


async def extract_file(elements) -> str:
    # pdfのみだが、対応するファイルをここで拡張
    supported_mimes = [
        "application/pdf",
        #        "application/msword",  # .doc
        #        "application/vnd.openxmlformats-officedocument.wordprocessingml.document"  # .docx
    ]

    # msg.elements の中からファイル要素を抽出
    files = [
        e for e in (elements or []) if hasattr(e, "mime") and e.mime in supported_mimes
    ]
    text = ""

    for file in files:
        # ファイル拡張子を取得
        _, ext = os.path.splitext(file.name.lower())
        if file.mime == "application/pdf" or ext == ".pdf":
            reader = PdfReader(file.path)
            file_text = ""
            for page in reader.pages:
                # extract_text() が None のときは、代わりに空文字列 "" を使用
                file_text += page.extract_text() or ""
            print(f"Extracted text from {file.name}: {text[:100]}...")  # Debug output
            await cl.Message(content=f"# 添付ファイル内容 \n{file_text}").send()
            text += file_text
    return text


@cl.on_message
async def on_message(message: cl.Message):
    print(f"{message.content=}")
    kernel = cl.user_session.get("kernel")  # type: sk.Kernel
    model_name = cl.user_session.get("model_name")
    chat_history = cl.user_session.get("chat_history")  # type: ChatHistory
    ai_service = kernel.get_service()

    # メッセージが邪魔ならコメントアウト
    await cl.Message(content=f"モデル {model_name} を使用").send()

    text = await extract_file(message.elements)

    if text == "":
        user_input = message.content

    # 添付ファイルがある場合は、ユーザプロンプトをシステムプロンプトに入れ、添付ファイル内容をユーザプロンプトに設定
    else:
        user_input = text
        chat_history.add_developer_message(message.content)

    chat_history.add_user_message(user_input)

    # Create a Chainlit message for the response stream
    answer = cl.Message(content="")

    async for msg in ai_service.get_streaming_chat_message_content(
        chat_history=chat_history,
        user_input=user_input,  # 必要か不明
        settings=request_settings,
        kernel=kernel,
    ):
        if msg.content:
            await answer.stream_token(msg.content)

    # Add the full assistant response to history
    chat_history.add_assistant_message(answer.content)

    # Send the final message
    await answer.send()

1.1. pypdf インポート

pypdfからPdfReaderをインポート。

インポート部分

from pypdf import PdfReader

1.2. PDF内容抽出

PDF内容を抽出する関数extract_fileを定義

extract_file

async def extract_file(elements) -> str:
    # pdfのみだが、対応するファイルをここで拡張
    supported_mimes = [
        "application/pdf",
        #        "application/msword",  # .doc
        #        "application/vnd.openxmlformats-officedocument.wordprocessingml.document"  # .docx
    ]

    # msg.elements の中からファイル要素を抽出
    files = [
        e for e in (elements or []) if hasattr(e, "mime") and e.mime in supported_mimes
    ]
    text = ""

    for file in files:
        # ファイル拡張子を取得
        _, ext = os.path.splitext(file.name.lower())
        if file.mime == "application/pdf" or ext == ".pdf":
            reader = PdfReader(file.path)
            file_text = ""
            for page in reader.pages:
                # extract_text() が None のときは、代わりに空文字列 "" を使用
                file_text += page.extract_text() or ""
            print(f"Extracted text from {file.name}: {text[:100]}...")  # Debug output
            await cl.Message(content=f"# 添付ファイル内容 \n{file_text}").send()
            text += file_text
    return text

1.3. PDF読込内容のハンドリング

関数extract_fileを呼び出し、PDF読込内容を受け取ります。その後に、システムプロンプト/ユーザープロンプトに渡す内容制御をしています。

on_message 抜粋

    text = await extract_file(message.elements)

    if text == "":
        user_input = message.content

    # 添付ファイルがある場合は、ユーザプロンプトをシステムプロンプトに入れ、添付ファイル内容をユーザプロンプトに設定
    else:
        user_input = text
        chat_history.add_developer_message(message.content)

    chat_history.add_user_message(user_input)

    # Create a Chainlit message for the response stream
    answer = cl.Message(content="")

    async for msg in ai_service.get_streaming_chat_message_content(
        chat_history=chat_history,
        user_input=user_input,  # 必要か不明
        settings=request_settings,
        kernel=kernel,
    ):

2. 環境変数追加

.envに以下の環境変数を追加しました。Chainlitはファイル添付すると.filesディレクトリに一時的にファイルを置きます。ですが、Azure Web Appsではデフォルトで書込不可な場所なので、それを変更するための対策です(詳しく調べていないのでAIで調べたことを鵜呑みにしています)。

.env

WEBSITE_RUN_FROM_PACKAGE=0

これは本来的に良くない方法な気もしますが、本来どうするべきか調べ切れていません。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up