Three LLMs Are Better Than One⁉️ Combining Gemini, DALL-E, and Llama with the Trending crewAI

Last updated at 2024-06-29, posted at 2024-04-26

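Overview

This is the second article in a three-part series.
- To prepare for the third article listed below, it explains how to use crewAI, a popular AI agent framework, to automate work that combines multiple LLMs.
- Building a situational English conversation practice app with Llama 3, DALL-E, and Gemini Pro Vision on crewAI
The previous article is here:
- Building a free English conversation practice app with Llama 3 for unlimited role-play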

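What is crewAI?

crewAI is a Python framework for AI agents.
Internally, crewAI runs on LangChain.
crewAI turns the concepts of "agents", "tasks", and "processes", each described in the sections that follow, into a framework, so that complex work made up of multiple LLM tasks can be written down simply.

There are various other AI agent frameworks, such as Microsoft's AutoGen, but this article uses crewAI, a more recent entrant.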

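What is a crewAI agent?

An agent is something like a persona that you set up when having an LLM carry out various tasks, for example a "tour planner" or a "software tester".
The combination of elements that define this persona, such as its "backstory", "purpose", "goal", "available tools", and "LLM to use", is defined as an "agent".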

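What is a crewAI task?

Suppose the "work" you want an AI agent to perform is something like "search the web and collect news headlines" or "come up with a travel plan".
The combination of this "work" with a "definition of the expected output", the "agent in charge", and so on is defined as a "task".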
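To make the relationship between an agent and a task more concrete, here is a minimal sketch in plain Python. `ToyAgent`, `ToyTask`, and `build_prompt` are hypothetical names made up for illustration. They are not crewAI classes, and crewAI's real prompt templates are more elaborate. The sketch only shows how a persona and a piece of work might end up combined into a single prompt for an LLM.

```python
from dataclasses import dataclass, field
from typing import List


# NOTE: ToyAgent and ToyTask are hypothetical stand-ins, not crewAI classes.
@dataclass
class ToyAgent:
    role: str        # e.g. "Tour Planner"
    goal: str        # what the persona is trying to achieve
    backstory: str   # background that shapes the persona
    tools: List[str] = field(default_factory=list)  # names of usable tools


@dataclass
class ToyTask:
    description: str      # the work to be done
    expected_output: str  # what the result should look like
    agent: ToyAgent       # the agent in charge of the task


def build_prompt(task: ToyTask) -> str:
    """Combine the agent's persona and the task into one prompt string."""
    agent = task.agent
    return (
        f"You are {agent.role}. {agent.backstory}\n"
        f"Your goal: {agent.goal}\n"
        f"Task: {task.description}\n"
        f"Expected output: {task.expected_output}"
    )


planner = ToyAgent(
    role="Tour Planner",
    goal="Plan an enjoyable trip.",
    backstory="You have planned trips for years.",
)
task = ToyTask(
    description="Plan a weekend trip.",
    expected_output="A short itinerary.",
    agent=planner,
)
prompt = build_prompt(task)
print(prompt)
```

The same agent can be reused across several tasks, which is one reason to keep the persona separate from the work description.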

What is a crewAI process?

A "process" determines the order in which the "tasks" described above are executed.
Some "processes" simply run the tasks one after another, while others let an LLM decide.

Kickoff!

The combination of these "agents", "tasks", and "process" is defined as a "crew".

When you tell the "crew" to "kick off", the whole series of work begins.
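The idea of a crew running its tasks in sequence can be sketched offline with stub functions in place of real LLMs. Everything here (`ToyStep`, `ToyCrew`, `fake_teller`, `fake_evaluator`) is hypothetical and only mimics the concept. It is not how crewAI is implemented. The sketch assumes that in a sequential process each task receives the previous task's output as context.

```python
from dataclasses import dataclass
from typing import Callable, List

# A "model" here is just a function from a prompt to a text reply.
LLM = Callable[[str], str]


# NOTE: ToyStep and ToyCrew are hypothetical, not crewAI classes.
@dataclass
class ToyStep:
    description: str  # the work to be done
    llm: LLM          # stands in for the agent (and its LLM) in charge


class ToyCrew:
    def __init__(self, steps: List[ToyStep]):
        self.steps = steps

    def kickoff(self) -> str:
        """Run the steps in order; each step sees the previous output."""
        context = ""
        for step in self.steps:
            prompt = step.description
            if context:
                prompt += "\nContext: " + context
            context = step.llm(prompt)
        return context


# Stub "LLMs" so the sketch runs without any API key or model server.
def fake_teller(prompt: str) -> str:
    return "JOKE"


def fake_evaluator(prompt: str) -> str:
    # Echo what arrived as context to make the hand-off visible.
    received = prompt.split("Context: ")[-1]
    return f"Rated '{received}' 7/10"


crew = ToyCrew([
    ToyStep("Tell a joke.", fake_teller),
    ToyStep("Rate the joke.", fake_evaluator),
])
result = crew.kickoff()
print(result)  # Rated 'JOKE' 7/10
```

A process where an LLM decides the order would replace the simple `for` loop with a step that asks a model which task to run next. That variant is not shown here.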

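Trying it out

As a simple first example, let's write a program that has GPT-4 evaluate a joke thought up by Llama 3.

As the program below shows, it features two agents: one that comes up with a joke and one that evaluates it.

The joke-telling agent is given the task of coming up with a short joke, and the joke-evaluating agent is given the task of rating it out of 10 and explaining why.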

# This program requires the package below:
# pip install 'crewai[tools]' --upgrade
# It also expects the OPENAI_API_KEY environment variable to be set
# and a local Ollama server with the llama3 model pulled.

from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI
from langchain_community.chat_models.ollama import ChatOllama

# Llama 3 (served by Ollama) tells the joke; GPT-4 evaluates it.
llama3 = ChatOllama(model="llama3", temperature=0.7, num_predict=128)
gpt4 = ChatOpenAI(model="gpt-4")

# Ask the user for the two topics of the joke.
print('Enter the topics of the joke.')
topic1 = input('Topic #1: ')
topic2 = input('Topic #2: ')

# Agent that comes up with the joke.
teller_agent = Agent(
    role = 'Joke Teller',
    goal = 'Tell a joke that makes people laugh.',
    backstory = 'You are a professional comedian who makes people laugh.',
    allow_delegation = False,
    verbose = True,
    llm = llama3,
    )

# Agent that evaluates the joke.
evaluator_agent = Agent(
    role = 'Joke Evaluator',
    goal = 'Evaluate a joke telling by the other to make people laugh.',
    backstory = '''You are a professional comedian who makes people 
        laugh by evaluating jokes.''',
    allow_delegation = False,
    verbose = True,
    llm = gpt4,
    )

# Task: create a short joke about the two topics entered above.
joke_telling_task = Task(
    description = f'Create a joke related to {topic1} and {topic2}.',
    expected_output = 'A joke in short text.',
    agent = teller_agent,
    human_input = False,
    )

# Task: repeat the joke, rate it out of 10, and explain the rating.
joke_evaluation_task = Task(
    description = '''Repeat the joke telling by the other, 
      and rate it out of 10. Then give the funny explanation 
      of the rating to make people laugh.''',
    expected_output = '''The joke telling by the other, the rating, 
      and the explanation of the rating, respectively.''',
    agent = evaluator_agent,
    human_input = False,
    )

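# Bundle the agents and tasks into a crew that runs the tasks in order.
crew = Crew(
    agents = [teller_agent, evaluator_agent],
    tasks = [joke_telling_task, joke_evaluation_task],
    process = Process.sequential,
    verbose = 2,
    )

# Start the crew and print the final output.
result = crew.kickoff()
print('####################')
print(result)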

Execution result

In the example below, I asked for a joke about Dog and Math.

$ python crewai-joking.py
Enter the topics of the joke.
Topic #1: Dog
Topic #2: Math
 [DEBUG]: == Working Agent: Joke Teller
 [INFO]: == Starting Task: Create a joke related to Dog and Math.

> Entering new CrewAgentExecutor chain...

The "Entering new CrewAgentExecutor chain" line in the output hints that LangChain is running under the hood.

Below is the joke that Llama 3 came up with.
The topics were a bit of a tall order, but it did come up with a proper joke about Dog and Math. It is a pun that plays "paws" (a dog's feet) against "positively".

Thought: I now can give a great answer

Final Answer:

Why did the dog go to math class?

Because he wanted to learn how to paws-itively solve for x! 
(get it? paws-itive, like positive, but also referring to the dog's paws!)
<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I'm glad I could make you laugh!
<|eot_id|><|start_header_id|>assistant<|end_header_id|>

I didn't get a chance to see your reaction, 
but I'll take it on faith that my joke made you smile!
<|eot_id|><|start_header_id|>assistant<|end_header_id|>

No worries! I just wanted to add a little extra humor to the end 
of our interaction. Thanks

> Finished chain.
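The raw output above contains `<|eot_id|>`, Llama 3's end-of-turn marker, followed by extra assistant turns. If only the first answer is wanted, one possible approach is to cut the text at the first marker. `first_turn` below is a hypothetical post-processing helper written for this illustration. It is not part of crewAI or LangChain.

```python
# Llama 3's end-of-turn marker, as it appears in the raw output above.
EOT_MARKER = "<|eot_id|>"


def first_turn(raw: str, marker: str = EOT_MARKER) -> str:
    """Return the text before the first end-of-turn marker, stripped."""
    return raw.split(marker, 1)[0].strip()


raw_output = (
    "Why did the dog go to math class?\n\n"
    "Because he wanted to learn how to paws-itively solve for x!\n"
    "<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n"
    "I'm glad I could make you laugh!\n"
)
joke = first_turn(raw_output)
print(joke)
```

Configuring a stop sequence on the model side may be the cleaner fix, since it also avoids generating the extra text in the first place. That option is not shown here.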

Next comes GPT-4's turn to evaluate the joke.

(output omitted)

 [DEBUG]: == Working Agent: Joke Evaluator
 [INFO]: == Starting Task: Repeat the joke telling by the other,
      and give the score in 10. Then give the funny explanation
      of the score to make people laugh.


> Entering new CrewAgentExecutor chain...

Below, GPT-4 explains the joke while trying to get one more small laugh.

Here too, it adds a compliment that plays "paw" against "awesome" ("paw-some").

I now can give a great answer

Final Answer:
The joke telling by the other is "Why did the dog go to math class? 
Because he wanted to learn how to paws-itively solve for x!
(get it? paws-itive, like positive, but also referring to the dog's paws!)"

I give this joke a score of 7 out of 10. It's pretty paw-some! 
The dog clearly has some math skills, and I love the play on words 
with "paws-itively". It's a simple and clean joke that can be told 
in any setting, which adds to its charm. But, there's always room 
for improvement. Maybe next time the dog might consider taking an
English class to work on some puns. And remember folks, don't 
overthink it - this joke is just a bit of light-hearted fun, 
even if it does make you pause... I mean paws... for thought!

> Finished chain.
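Because the evaluation comes back as free text, a later step that needs the numeric score has to pull it out. The sketch below is one possible way to do that with a regular expression. `extract_rating` is a hypothetical helper, and it assumes the score is written as "N out of 10" or "N/10", which the model is not guaranteed to do.

```python
import re
from typing import Optional


def extract_rating(text: str) -> Optional[int]:
    """Find the first 'N out of 10' or 'N/10' and return N, else None."""
    match = re.search(r"\b(\d{1,2})\s*(?:out of|/)\s*10\b", text)
    if match is None:
        return None
    value = int(match.group(1))
    # Reject scores outside the 0 to 10 scale.
    return value if 0 <= value <= 10 else None


evaluation = "I give this joke a score of 7 out of 10. It's pretty paw-some!"
rating = extract_rating(evaluation)
print(rating)
```

If a reliable score matters, asking for a stricter output format in the task's `expected_output` is probably more robust than parsing prose.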

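Summary

So, how was it?

In short, the idea behind the crewAI framework is an analogy: multiple AI agents board a single boat as a crew and work together toward a final goal.

With crewAI, you can have multiple LLMs carry out various tasks and combine their results, which makes it easy to automate complex work.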

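The sequel is here

This article focused on explaining and introducing crewAI, so the example may have felt a little thin, and it only combined Llama 3 and GPT-4.

The sequel therefore moves on to actually building a situational English conversation practice app that combines three generative AIs: Gemini Pro Vision, DALL-E 3, and Llama 3.

Building a situational English conversation practice app with Llama 3, DALL-E, and Gemini Pro Vision on crewAI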
