Running a RAG Q&A bot on Databricks, UI included

Posted at 2023-09-07

This post is a roundup of several earlier articles on this topic.

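Putting them together makes the following possible:

An LLM can be served even on Databricks in the Japan region, where serverless model serving has not been released.
Streamlit, which provides the UI, can also be served from Databricks.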

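Note
Use the approach described in this article for development purposes only. For production use, please wait for serverless model serving endpoints and Lakehouse Apps, which are due to be released in Japan.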

The source code described in this article is published here.

Building the vector DB

Follow the steps in this article.

Serving the LLM with Driver Proxy

See here for details.

from flask import Flask, jsonify, request

app = Flask("rag-chatbot")
app.config["JSON_AS_ASCII"] = False # Return Japanese text as-is instead of ASCII escapes

@app.route('/', methods=['POST'])
def serve_rag_chatbot():
  # Read the prompt from the request body
  prompt = request.get_json(force = True).get('prompt','')
  # Generate the answer
  resp = qabot.get_answer(prompt)
  return jsonify(resp)
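
The handler above depends on a `qabot` object that is built elsewhere and is not shown in this post. As a minimal sketch, assuming only that `get_answer` returns a dict with `answer` and `source` keys (the keys the client code in this post reads), a stand-in like the following can exercise the same logic before the real bot is wired in. `StubQABot`, `answer_payload`, and the placeholder URL are hypothetical names introduced for illustration.

```python
class StubQABot:
    """Hypothetical stand-in for the real qabot; returns a fixed-shape response."""

    def get_answer(self, prompt):
        # Assumed response shape: a dict with "answer" and "source" keys.
        return {
            "answer": f"(stub) received: {prompt}",
            "source": "https://example.com/placeholder",
        }


def answer_payload(payload, qabot):
    """Follow the handler body: read 'prompt' (default "") and ask the bot.

    Unlike the Flask handler, a missing payload is treated as an empty dict.
    """
    prompt = (payload or {}).get("prompt", "")
    return qabot.get_answer(prompt)


result = answer_payload({"prompt": "What is Databricks?"}, StubQABot())
print(result["answer"])
print(result["source"])
```

Keeping the bot behind a single `get_answer` call means the Flask route does not need to change when the stub is swapped for the real object.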

from dbruntime.databricks_repl_context import get_context
ctx = get_context()

port = "7777"
driver_proxy_api = f"https://{ctx.browserHostName}/driver-proxy-api/o/0/{ctx.clusterId}/{port}"

print(f"""
driver_proxy_api = '{driver_proxy_api}'
cluster_id = '{ctx.clusterId}'
port = {port}
""")

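To make the URL layout above explicit, here is a small sketch of a helper that assembles the same Driver Proxy URL from its parts. The function name `build_driver_proxy_url` and its `org_id` parameter are hypothetical; the path layout is copied from the f-string above, and the host and cluster ID in the usage lines are made-up placeholders.

```python
def build_driver_proxy_url(host, cluster_id, port, org_id="0"):
    """Assemble a Driver Proxy URL using the same layout as the f-string above."""
    port = int(port)  # accepts "7777" as well as 7777
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # Drop stray slashes so the host joins cleanly with the path.
    host = host.strip("/")
    return f"https://{host}/driver-proxy-api/o/{org_id}/{cluster_id}/{port}"


# Placeholder values for illustration only.
url = build_driver_proxy_url("example.cloud.databricks.com", "0123-456789-abcdefgh", "7777")
print(url)
```
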
Starting the Flask app serves the model on the cluster.

app.run(host="0.0.0.0", port=port, debug=True, use_reloader=False)

 * Serving Flask app "rag-chatbot" (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
 * Running on all addresses.
   WARNING: This is a development server. Do not use it in a production deployment.
 * Running on http://10.0.5.15:7777/ (Press CTRL+C to quit)

Connection check

I verified that it works with this.

import requests
import json

def request_rag_chatbot(prompt, temperature=1.0, max_new_tokens=1024):
  token = "" # TODO: a personal access token is required when calling from a different cluster than the one running the Driver Proxy, or from a local machine
  url = "http://127.0.0.1:7777/" # Use the port the Driver Proxy is listening on
  
  headers = {
      "Content-Type": "application/json",
      "Authentication": f"Bearer {token}"
  }
  data = {
    "prompt": prompt,
    "temperature": temperature,
    "max_new_tokens": max_new_tokens,
  }

  response = requests.post(url, headers=headers, data=json.dumps(data))
  return response

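# Send a request to the Driver Proxy
response = request_rag_chatbot("Databricksとは？")  # "What is Databricks?"
response_json = response.json()

print("Answer: " + response_json["answer"])
print("Source: " + response_json["source"])

Answer: Databricks is a company that provides a unified analytics platform specialized for big data analytics and machine learning. It provides sample notebooks and explanatory material (blog posts) published on the internet so that solutions created through co-creation with customers can also be used by people at other companies.
Source: https://qiita.com/taka_yayoi/items/24f23ba228a4656c0b80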
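
The check above indexes the response directly, which raises a bare `KeyError` if the bot returns something unexpected. As a rough sketch, assuming the response is a JSON object with `answer` and `source` keys as in the output above, a small validation step can give a clearer error. The function name `extract_answer` is hypothetical and the sample payload is a placeholder.

```python
def extract_answer(payload):
    """Return (answer, source) from a chatbot response dict, or raise ValueError."""
    if not isinstance(payload, dict):
        raise ValueError(f"expected a JSON object, got {type(payload).__name__}")
    # Report every missing key at once rather than failing on the first.
    missing = [key for key in ("answer", "source") if key not in payload]
    if missing:
        raise ValueError(f"response is missing keys: {missing}")
    return str(payload["answer"]), str(payload["source"])


# Placeholder payload for illustration only.
answer, source = extract_answer({"answer": "placeholder answer", "source": "https://example.com"})
print("Answer: " + answer)
print("Source: " + source)
```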

Building the UI

I used this as a reference. Once again, thank you to the person who pointed me to it!

client_streamlit.py

import streamlit as st 
import numpy as np 
import json
import requests

st.title('Databricks Q&A bot')
#st.header('Databricks Q&A bot')

def generate_answer(question):
  # Set a personal access token when calling the Driver Proxy from a different cluster or from a local machine
  token = "" 
  url = "http://127.0.0.1:7777/"

  headers = {
      "Content-Type": "application/json",
      "Authentication": f"Bearer {token}"
  }
  data = {
    "prompt": question
  }

  response = requests.post(url, headers=headers, data=json.dumps(data))
  if response.status_code != 200:
    raise Exception(
       f"Request failed with status {response.status_code}, {response.text}"
    )
  
  response_json = response.json()
  return response_json

question = st.text_input("**Question**")

if question != "":
  response = generate_answer(question)

  answer = response["answer"]
  source = response["source"]

  st.write(f"**Answer:** {answer}")
  st.write(f"**Source:** [{source}]({source})")

Launching Streamlit

See here for details.

streamlit_file = "/Workspace/Users/takaaki.yayoi@databricks.com/20230908_llmbot_w_driver_proxy/client_streamlit.py"

!streamlit run {streamlit_file} --server.port {PORT}

It works!

Data preparation, model building, model serving, and even serving the UI were all completed within Databricks. For a PoC or similar, these features should be sufficient.

For production use, as noted above, please wait a little longer for the new features to be released.

Databricks Quickstart Guide

Databricks Free Trial
