
[AzureML Prompt Flow] Chat with Wikipedia


Introduction

In this post, I try out AzureML's prompt flow.
Several sample flows are available; among them, I use "Chat with Wikipedia".

Chat with Wikipedia

Chat with Wikipedia is a chatbot that answers questions using Wikipedia as its knowledge source.


Once the Chat with Wikipedia flow has been created, let's get started.

Connecting to Azure OpenAI

Prompt flow requires an LLM connection, so we first connect to an Azure OpenAI model that has already been deployed.

From the "Connections" tab, click "Create" and enter the required fields.


Creating a runtime

Next, we create a runtime.

Click the "Runtime" tab and then "Create"; you can choose between "Managed online endpoint deployment" and "Compute instance".

This time I select the recommended "Compute instance".


Click "Create AzureML compute instance",


then choose a "Compute name" and a virtual machine size.


The compute instance is created after a few minutes; once it is done, enter a "Runtime name" and create the runtime.


How the flow works

When you create the Chat with Wikipedia sample, the flow editor opens.


By default, the flow starts with a conversation history already in place for the question "What is ChatGPT?".

Here is that conversation history.

So we start by asking a second question that builds on this history.

For the question, I use the default text: "What is the difference between this model and previous neural network?"

The processing in Chat with Wikipedia flows through the following steps.


Let's walk through each step and its inputs and outputs.

extract_query_from_question
This step asks the LLM to generate a new Wikipedia query (search term) from the conversation history and the current question. The prompt is shown below; its few-shot examples demonstrate how to use the conversation history.

Prompt
system:
You are an AI assistant reading the transcript of a conversation between an AI and a human. Given an input question and conversation history, infer user real intent.

The conversation history is provided just in case of a coreference (e.g. "What is this?" where "this" is defined in previous conversation).

Return the output as query used for next round user message.

user:
EXAMPLE
Conversation history:
Human: I want to find the best restaurants nearby, could you recommend some?
AI: Sure, I can help you with that. Here are some of the best restaurants nearby: Rock Bar.
Human: How do I get to Rock Bar?

Output: directions to Rock Bar
END OF EXAMPLE

EXAMPLE
Conversation history:
Human: I want to find the best restaurants nearby, could you recommend some?
AI: Sure, I can help you with that. Here are some of the best restaurants nearby: Rock Bar.
Human: How do I get to Rock Bar?
AI: To get to Rock Bar, you need to go to the 52nd floor of the Park A. You can take the subway to Station A and walk for about 8 minutes from exit A53. Alternatively, you can take the train to S Station and walk for about 12 minutes from the south exit3.
Human: Show me more restaurants.

Output: best restaurants nearby
END OF EXAMPLE

Conversation history (for reference only):
{% for item in chat_history %}
Human: {{item.inputs.question}}
AI: {{item.outputs.answer}}
{% endfor %}
Human: {{question}}

Output:
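The history section of this prompt is a Jinja2 template that expands each chat_history item into a Human/AI pair. As a rough sketch of what the rendered text looks like (plain Python standing in for the Jinja2 engine; the field names match the template):

```python
def render_history_section(chat_history, question):
    # Mirrors the Jinja2 loop above: one Human/AI pair per history item,
    # followed by the new question and the "Output:" cue
    lines = ["Conversation history (for reference only):"]
    for item in chat_history:
        lines.append(f"Human: {item['inputs']['question']}")
        lines.append(f"AI: {item['outputs']['answer']}")
    lines.append(f"Human: {question}")
    lines.append("")
    lines.append("Output:")
    return "\n".join(lines)


chat_history = [{
    "inputs": {"question": "What is ChatGPT?"},
    "outputs": {"answer": "ChatGPT is a chatbot developed by OpenAI."},
}]
print(render_history_section(
    chat_history,
    "What is the difference between this model and previous neural network?",
))
```

At runtime, prompt flow renders the real template with the flow's chat_history input, so the LLM only ever sees the resolved conversation, never the template syntax.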

An LLM connection is required here, so configure it.

For "Connection" and "deployment name", select the Azure OpenAI model you deployed. I chose gpt-3.5-turbo.

Given the input below:

Input
{
  "question": "What is the difference between this model and previous neural network?",
  "chat_history": [
    {
      "inputs": {
        "question": "What is ChatGPT?"
      },
      "outputs": {
        "answer": "ChatGPT is a chatbot product developed by OpenAI. It is powered by the Generative Pre-trained Transformer (GPT) series of language models, with GPT-4 being the latest version. ChatGPT uses natural language processing to generate responses to user inputs in a conversational manner. It was released as ChatGPT Plus, a premium version, which provides enhanced features and access to the GPT-4 based version of OpenAI's API. ChatGPT allows users to interact and have conversations with the language model, utilizing both text and image inputs. It is designed to be more reliable, creative, and capable of handling nuanced instructions compared to previous versions. However, it is important to note that while GPT-4 improves upon its predecessors, it still retains some of the same limitations and challenges."
      }
    }
  ]
}

The output was as follows.

Output
[
  {
    "system_metrics": {
      "completion_tokens": 10,
      "duration": 1.187621,
      "prompt_tokens": 486,
      "total_tokens": 496
    },
    "output": "difference between ChatGPT and previous neural network models"
  }
]

get_wiki_url
This step uses requests to search Wikipedia for the query generated above, "difference between ChatGPT and previous neural network models", and returns the URLs of the top two result pages.

from promptflow import tool
import requests
import bs4
import re


def decode_str(string):
    return string.encode().decode("unicode-escape").encode("latin1").decode("utf-8")


def remove_nested_parentheses(string):
    pattern = r'\([^()]+\)'
    while re.search(pattern, string):
        string = re.sub(pattern, '', string)
    return string


@tool
def get_wiki_url(entity: str, count=2):
    # Send a request to the URL
    url = f"https://en.wikipedia.org/w/index.php?search={entity}"
    url_list = []
    try:
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
                          "Chrome/113.0.0.0 Safari/537.36 Edg/113.0.1774.35"}
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            # Parse the HTML content using BeautifulSoup
            soup = bs4.BeautifulSoup(response.text, 'html.parser')
            mw_divs = soup.find_all("div", {"class": "mw-search-result-heading"})
            if mw_divs:  # mismatch
                result_titles = [decode_str(div.get_text().strip()) for div in mw_divs]
                result_titles = [remove_nested_parentheses(result_title) for result_title in result_titles]
                print(f"Could not find {entity}. Similar entities: {result_titles[:count]}.")
                url_list.extend([f"https://en.wikipedia.org/w/index.php?search={result_title}" for result_title in
                                 result_titles])
            else:
                page_content = [p_ul.get_text().strip() for p_ul in soup.find_all("p") + soup.find_all("ul")]
                if any("may refer to:" in p for p in page_content):
                    url_list.extend(get_wiki_url("[" + entity + "]"))
                else:
                    url_list.append(url)
        else:
            msg = f"Get url failed with status code {response.status_code}.\nURL: {url}\nResponse: " \
                  f"{response.text[:100]}"
            print(msg)
        return url_list[:count]
    except Exception as e:
        print("Get url failed with error: {}".format(e))
        return url_list

The output was as follows.

Output
[
  {
    "system_metrics": {
      "duration": 1.697971
    },
    "output": [
      "https://en.wikipedia.org/w/index.php?search=ChatGPT",
      "https://en.wikipedia.org/w/index.php?search=GPT-3"
    ]
  }
]

Two URLs came back, for ChatGPT and GPT-3.
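The helper functions in get_wiki_url can be checked in isolation. For example, remove_nested_parentheses repeatedly strips parenthesized qualifiers from result titles, innermost first (a small standalone check of the same helper; the sample title is made up):

```python
import re


def remove_nested_parentheses(string):
    # Same helper as in get_wiki_url: remove innermost (...) groups
    # repeatedly until no parentheses remain
    pattern = r'\([^()]+\)'
    while re.search(pattern, string):
        string = re.sub(pattern, '', string)
    return string


print(remove_nested_parentheses("GPT-3 (language model (OpenAI))"))
# → "GPT-3 " (note the trailing space is kept)
```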

search_result_from_url
This step fetches the text of each page at the URLs obtained above. The result of each request is returned in a list as a (URL, text) pair.

from promptflow import tool
import requests
import bs4
import time
import random
from concurrent.futures import ThreadPoolExecutor
from functools import partial

session = requests.Session()


def decode_str(string):
    return string.encode().decode("unicode-escape").encode("latin1").decode("utf-8")


def get_page_sentence(page, count: int = 10):
    # find all paragraphs
    paragraphs = page.split("\n")
    paragraphs = [p.strip() for p in paragraphs if p.strip()]

    # find all sentence
    sentences = []
    for p in paragraphs:
        sentences += p.split('. ')
    sentences = [s.strip() + '.' for s in sentences if s.strip()]
    # get first `count` number of sentences
    return ' '.join(sentences[:count])


def fetch_text_content_from_url(url: str, count: int = 10):
    # Send a request to the URL
    try:
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) "
                          "Chrome/113.0.0.0 Safari/537.36 Edg/113.0.1774.35"
        }
        delay = random.uniform(0, 0.5)
        time.sleep(delay)
        response = session.get(url, headers=headers)
        if response.status_code == 200:
            # Parse the HTML content using BeautifulSoup
            soup = bs4.BeautifulSoup(response.text, 'html.parser')
            page_content = [p_ul.get_text().strip() for p_ul in soup.find_all("p") + soup.find_all("ul")]
            page = ""
            for content in page_content:
                if len(content.split(" ")) > 2:
                    page += decode_str(content)
                if not content.endswith("\n"):
                    page += "\n"
            text = get_page_sentence(page, count=count)
            return (url, text)
        else:
            msg = f"Get url failed with status code {response.status_code}.\nURL: {url}\nResponse: " \
                  f"{response.text[:100]}"
            print(msg)
            return (url, "No available content")

    except Exception as e:
        print("Get url failed with error: {}".format(e))
        return (url, "No available content")


@tool
def search_result_from_url(url_list: list, count: int = 10):
    results = []
    fetch_with_count = partial(fetch_text_content_from_url, count=count)
    with ThreadPoolExecutor(max_workers=5) as executor:
        # executor.map yields the (url, text) results in input order
        for result in executor.map(fetch_with_count, url_list):
            results.append(result)
    return results
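get_page_sentence above splits the scraped page into sentences on ". " and keeps the first count of them. Note that a sentence which already ends in "." gets a second period appended, which is why the outputs below contain sequences like ".[4].". A small standalone check of the same helper, with a made-up page string:

```python
def get_page_sentence(page, count: int = 10):
    # Same helper as above: split into paragraphs, then into sentences
    # on ". ", re-append a ".", and keep the first `count` sentences
    paragraphs = [p.strip() for p in page.split("\n") if p.strip()]
    sentences = []
    for p in paragraphs:
        sentences += p.split('. ')
    sentences = [s.strip() + '.' for s in sentences if s.strip()]
    return ' '.join(sentences[:count])


page = "GPT-3 is a large language model. It was released in 2020.\nIt has 175 billion parameters."
print(get_page_sentence(page, count=2))
# → "GPT-3 is a large language model. It was released in 2020.."
```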

The output was as follows.

Output
[
  {
    "system_metrics": {
      "duration": 1.919661
    },
    "output": [
      [
        "https://en.wikipedia.org/w/index.php?search=ChatGPT",
        "ChatGPT, which stands for Chat Generative Pre-trained Transformer, is a large language model–based chatbot developed by OpenAI and launched on November 30, 2022, which enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language used. Successive prompts and replies, known as prompt engineering, are considered at each conversation stage as a context.[2]. ChatGPT is built upon either GPT-3.5 or GPT-4 —members of OpenAI's proprietary series of generative pre-trained transformer (GPT) models, based on the transformer architecture developed by Google[3]—and it is fine-tuned for conversational applications using a combination of supervised and reinforcement learning techniques.[4] ChatGPT was released as a freely available research preview, but due to its popularity, OpenAI now operates the service on a freemium model. It allows users on its free tier to access the GPT-3.5-based version. In contrast, the more advanced GPT-4 based version and priority access to newer features are provided to paid subscribers under the commercial name "ChatGPT Plus".. By January 2023, it had become what was then the fastest-growing consumer software application in history, gaining over 100 million users and contributing to OpenAI's valuation growing to US$29 billion.[5][6] Within months, Google, Baidu, and Meta accelerated the development of their competing products: Bard, Ernie Bot, and LLaMA.[7] Microsoft launched its Bing Chat based on OpenAI's GPT-4. It raised concern among some observers over the potential of ChatGPT and similar programs to displace or atrophy human intelligence, enable plagiarism, or fuel misinformation.[4][8]. 
ChatGPT is based on particular GPT foundation models, namely GPT-3.5 and GPT-4, that were fine-tuned to target conversational usage.[9] The fine-tuning process leveraged both supervised learning as well as reinforcement learning in a process called reinforcement learning from human feedback (RLHF).[10][11] Both approaches employed human trainers to improve model performance. In the case of supervised learning, the trainers played both sides: the user and the AI assistant. In the reinforcement learning stage, human trainers first ranked responses that the model had created in a previous conversation.[12] These rankings were used to create "reward models" that were used to fine-tune the model further by using several iterations of Proximal Policy Optimization (PPO).[10][13]."
      ],
      [
        "https://en.wikipedia.org/w/index.php?search=GPT-3",
        "Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor GPT-2, it is a decoder-only transformer model of deep neural network, which uses attention in place of previous recurrence- and convolution-based architectures.[2] Attention mechanisms allow the model to selectively focus on segments of input text it predicts to be the most relevant.[3] It uses a 2048-tokens-long context and then-unprecedented size of 175 billion parameters, requiring 800GB to store. The model demonstrated strong zero-shot and few-shot learning on many tasks.[4]. Microsoft announced on September 22, 2020, that it had licensed "exclusive" use of GPT-3; others can still use the public API to receive output, but only Microsoft has access to GPT-3's underlying model.[5]. According to The Economist, improved algorithms, powerful computers, and an increase in digitized data have fueled a revolution in machine learning, with new techniques in the 2010s resulting in "rapid improvements in tasks" including manipulating language.[6] Software models are trained to learn by using thousands or millions of examples in a "structure ... loosely based on the neural architecture of the brain".[6] One architecture used in natural language processing (NLP) is a neural network based on a deep learning model that was first introduced in 2017—the transformer architecture.[7] There are a number of NLP systems capable of processing, mining, organizing, connecting and contrasting textual input, as well as correctly answering questions.[8]. On June 11, 2018, OpenAI researchers and engineers posted their original paper introducing the first generative pre-trained transformer (GPT)—a type of generative large language model that is pre-trained with an enormous and diverse corpus of text via datasets, followed by discriminative fine-tuning to focus on a specific task. GPT models are transformer-based deep learning neural network architectures. 
Up to that point, the best-performing neural NLP models commonly employed supervised learning from large amounts of manually-labeled data, which made it prohibitively expensive and time-consuming to train extremely large language models.[4]. That first GPT model is known as "GPT-1," and it was then followed by "GPT-2" in February 2019."
      ]
    ]
  }
]

process_search_result
This step formats the search results into a single string of content and source-URL pairs.

from promptflow import tool


@tool
def process_search_result(search_result):
    def format(doc: dict):
        return f"Content: {doc['Content']}\nSource: {doc['Source']}"

    try:
        context = []
        for url, content in search_result:
            context.append({
                "Content": content,
                "Source": url
            })
        context_str = "\n\n".join([format(c) for c in context])
        return context_str
    except Exception as e:
        print(f"Error: {e}")
        return ""
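Stripped of the promptflow decorator, the same formatting logic is easy to try locally with dummy data (the sample (url, text) pairs below are made up; the Content/Source layout matches the tool above):

```python
def process_search_result(search_result):
    # Same formatting as the tool above: one "Content: ...\nSource: ..."
    # block per (url, text) pair, separated by blank lines
    context = [{"Content": content, "Source": url} for url, content in search_result]
    return "\n\n".join(
        f"Content: {doc['Content']}\nSource: {doc['Source']}" for doc in context
    )


sample = [
    ("https://en.wikipedia.org/w/index.php?search=ChatGPT", "ChatGPT is a chatbot."),
    ("https://en.wikipedia.org/w/index.php?search=GPT-3", "GPT-3 is a language model."),
]
print(process_search_result(sample))
```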

Here is the output.

Output
[
  {
    "system_metrics": {
      "duration": 0.000589
    },
    "output": "Content: ChatGPT, which stands for Chat Generative Pre-trained Transformer, is a large language model–based chatbot developed by OpenAI and launched on November 30, 2022, which enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language used. Successive prompts and replies, known as prompt engineering, are considered at each conversation stage as a context.[2]. ChatGPT is built upon either GPT-3.5 or GPT-4 —members of OpenAI's proprietary series of generative pre-trained transformer (GPT) models, based on the transformer architecture developed by Google[3]—and it is fine-tuned for conversational applications using a combination of supervised and reinforcement learning techniques.[4] ChatGPT was released as a freely available research preview, but due to its popularity, OpenAI now operates the service on a freemium model. It allows users on its free tier to access the GPT-3.5-based version. In contrast, the more advanced GPT-4 based version and priority access to newer features are provided to paid subscribers under the commercial name "ChatGPT Plus".. By January 2023, it had become what was then the fastest-growing consumer software application in history, gaining over 100 million users and contributing to OpenAI's valuation growing to US$29 billion.[5][6] Within months, Google, Baidu, and Meta accelerated the development of their competing products: Bard, Ernie Bot, and LLaMA.[7] Microsoft launched its Bing Chat based on OpenAI's GPT-4. It raised concern among some observers over the potential of ChatGPT and similar programs to displace or atrophy human intelligence, enable plagiarism, or fuel misinformation.[4][8]. 
ChatGPT is based on particular GPT foundation models, namely GPT-3.5 and GPT-4, that were fine-tuned to target conversational usage.[9] The fine-tuning process leveraged both supervised learning as well as reinforcement learning in a process called reinforcement learning from human feedback (RLHF).[10][11] Both approaches employed human trainers to improve model performance. In the case of supervised learning, the trainers played both sides: the user and the AI assistant. In the reinforcement learning stage, human trainers first ranked responses that the model had created in a previous conversation.[12] These rankings were used to create "reward models" that were used to fine-tune the model further by using several iterations of Proximal Policy Optimization (PPO).[10][13]. Source: https://en.wikipedia.org/w/index.php?search=ChatGPT Content: Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor GPT-2, it is a decoder-only transformer model of deep neural network, which uses attention in place of previous recurrence- and convolution-based architectures.[2] Attention mechanisms allow the model to selectively focus on segments of input text it predicts to be the most relevant.[3] It uses a 2048-tokens-long context and then-unprecedented size of 175 billion parameters, requiring 800GB to store. The model demonstrated strong zero-shot and few-shot learning on many tasks.[4]. Microsoft announced on September 22, 2020, that it had licensed "exclusive" use of GPT-3; others can still use the public API to receive output, but only Microsoft has access to GPT-3's underlying model.[5]. 
According to The Economist, improved algorithms, powerful computers, and an increase in digitized data have fueled a revolution in machine learning, with new techniques in the 2010s resulting in "rapid improvements in tasks" including manipulating language.[6] Software models are trained to learn by using thousands or millions of examples in a "structure ... loosely based on the neural architecture of the brain".[6] One architecture used in natural language processing (NLP) is a neural network based on a deep learning model that was first introduced in 2017—the transformer architecture.[7] There are a number of NLP systems capable of processing, mining, organizing, connecting and contrasting textual input, as well as correctly answering questions.[8]. On June 11, 2018, OpenAI researchers and engineers posted their original paper introducing the first generative pre-trained transformer (GPT)—a type of generative large language model that is pre-trained with an enormous and diverse corpus of text via datasets, followed by discriminative fine-tuning to focus on a specific task. GPT models are transformer-based deep learning neural network architectures. Up to that point, the best-performing neural NLP models commonly employed supervised learning from large amounts of manually-labeled data, which made it prohibitively expensive and time-consuming to train extremely large language models.[4]. That first GPT model is known as "GPT-1," and it was then followed by "GPT-2" in February 2019. Source: https://en.wikipedia.org/w/index.php?search=GPT-3"
  }
]

augmented_chat
This step generates the final answer, with references ("SOURCES") from Wikipedia, based on the extracted text and the question. The prompt also instructs the model to simply say it doesn't know when it cannot answer.

An LLM connection is required here as well, so don't forget to configure it.

Prompt
system:
You are a chatbot having a conversation with a human.
Given the following extracted parts of a long document and a question, create a final answer with references ("SOURCES").
If you don't know the answer, just say that you don't know. Don't try to make up an answer.
ALWAYS return a "SOURCES" part in your answer.

{{contexts}}

{% for item in chat_history %}
user:
{{item.inputs.question}}
assistant:
{{item.outputs.answer}}
{% endfor %}

user:
{{question}}

Results

Now click "Chat" to run the flow end to end.

The first turn already shows the answer to "What is ChatGPT?" from the preloaded conversation history.


When I then send the question "What is the difference between this model and previous neural network?", the following answer comes back.

The main difference between ChatGPT and previous neural network models lies in the advancements made in the architecture and scale of the model. ChatGPT is built upon either GPT-3.5 or GPT-4, which are part of OpenAI's series of generative pre-trained transformer (GPT) models.

Compared to its predecessor, GPT-2, GPT-3 introduced several improvements. GPT-3 boasted an unprecedented size, with 175 billion parameters and a 2048-tokens-long context. This large size allowed GPT-3 to perform strongly in zero-shot and few-shot learning on various tasks. GPT-3 also utilized attention mechanisms, which enabled the model to selectively focus on the most relevant segments of input text.

Furthermore, GPT-4, upon which ChatGPT is built, represents another iteration of improvements. Although specific details about GPT-4 are not provided in the document, it can be assumed that GPT-4 continues to enhance the capabilities and performance of the model compared to GPT-3.

Overall, with each iteration, the GPT models have shown improvements in size, context length, performance, and the ability to handle various tasks. These advancements have allowed ChatGPT to be more reliable, creative, and capable of generating nuanced responses in a conversational manner.


That completes the Chat with Wikipedia flow.
Compared with the Ask Wikipedia flow from my previous post, we can now ask follow-up questions that use the conversation history.

When you want to start a fresh chat, you can also reset it with the blue button at the bottom left, which is quick and convenient.
