[Generative AI] Trying out OpenAI's Agents SDK

Last updated at 2025-03-27 / Posted at 2025-03-12

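1. Motivation

OpenAI has released the Agents SDK.
It is described as the production-ready version of Swarm, which had been released as an experiment.

I had only briefly tried Swarm, and because it ran so easily it left a good impression. However, it carried a notice saying that it was experimental and should not be used in production, so I did not dig any deeper.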

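2. First step: the tutorial in the SDK documentation

The "Hello world example" at the top of the SDK documentation is a sample that generates a haiku. It ran with the steps below. It is nice that installation is this simple.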

OpenAI Agents SDK

#1. Install the module
pip install openai-agents

#2. Set the environment variable OPENAI_API_KEY
...

#3. Run the Python program
(.venv) PS C:\Users\yamaz\Dropbox\work\20250312_agents_sdk> python .\01_hello_world.py
Code calls to itself,
Infinite loops define paths—
Boundless logic spins.
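
The run in step 3 depends on the key set in step 2, so a small pre-flight check can make a missing key obvious. The following is a minimal sketch using only the standard library; the helper name `require_api_key` is hypothetical and is not part of the SDK.

```python
import os


def require_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    This helper is a hypothetical illustration, not an SDK function.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; set it before running the sample")
    return key
```

Calling `require_api_key()` at the top of a script such as 01_hello_world.py would stop early with a readable message instead of failing later inside the API call.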

The program in the SDK's Quickstart is an agent that answers homework questions, with a guardrail configured so that it does not answer anything other than homework.

Quickstart - OpenAI Agents SDK

The result of running it looks like this.
The first question, "who was the first president of the united states?", gets an answer, but for the second question, "what is life", the guardrail check raises an exception (agents.exceptions.InputGuardrailTripwireTriggered).

(.venv) PS C:\Users\yamaz\Dropbox\work\20250312_agents_sdk> python .\02_quick_start.py
The first President of the United States was George Washington. He served from 1789 to 1797. Washington is known for leading the Continental Army to victory during the American Revolutionary War and for presiding over the convention that drafted the U.S. Constitution. His leadership set many precedents for the new nation, including the tradition of a two-term presidency. Washington's presidency helped establish a sense of national unity and respect for the office.
Traceback (most recent call last):
File "C:\Users\yamaz\Dropbox\work\20250312_agents_sdk\02_quick_start.py", line 55, in <module>
asyncio.run(main())
File "C:\Users\yamaz\AppData\Local\Programs\Python\Python312\Lib\asyncio\runners.py", line 194, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "C:\Users\yamaz\AppData\Local\Programs\Python\Python312\Lib\asyncio\runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\yamaz\AppData\Local\Programs\Python\Python312\Lib\asyncio\base_events.py", line 686, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "C:\Users\yamaz\Dropbox\work\20250312_agents_sdk\02_quick_start.py", line 51, in main
result = await Runner.run(triage_agent, "what is life")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\yamaz\Dropbox\work\20250312_agents_sdk.venv\Lib\site-packages\agents\run.py", line 210, in run
input_guardrail_results, turn_result = await asyncio.gather(
^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\yamaz\Dropbox\work\20250312_agents_sdk.venv\Lib\site-packages\agents\run.py", line 805, in _run_input_guardrails
raise InputGuardrailTripwireTriggered(result)
agents.exceptions.InputGuardrailTripwireTriggered: Guardrail InputGuardrail triggered tripwire
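
The traceback above ends the program because nothing catches the exception. The tripwire idea can be sketched in plain Python as follows. This is a simplified illustration, not the SDK's implementation: `GuardrailResult`, `GuardrailTripped`, `homework_guardrail`, `run_with_guardrail`, and `ask` are hypothetical names, the keyword rule stands in for the LLM-based check, and the check runs before the answer rather than concurrently with it.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class GuardrailResult:
    """Outcome of a guardrail check (hypothetical stand-in type)."""
    tripwire_triggered: bool
    reason: str


class GuardrailTripped(Exception):
    """Hypothetical stand-in for the SDK's InputGuardrailTripwireTriggered."""


async def homework_guardrail(text: str) -> GuardrailResult:
    # Toy rule standing in for the model-based homework check.
    is_homework = "president" in text.lower()
    reason = "homework" if is_homework else "not homework"
    return GuardrailResult(tripwire_triggered=not is_homework, reason=reason)


async def run_with_guardrail(text: str) -> str:
    # Raise when the tripwire fires, otherwise produce an answer.
    result = await homework_guardrail(text)
    if result.tripwire_triggered:
        raise GuardrailTripped(result.reason)
    return f"answer to: {text}"


async def ask(text: str) -> str:
    # Catch the exception so a rejected question does not end the program.
    try:
        return await run_with_guardrail(text)
    except GuardrailTripped as exc:
        return f"refused ({exc})"
```

With the real SDK, the same shape would presumably apply by wrapping the `Runner.run(...)` call in `try` / `except InputGuardrailTripwireTriggered`, the exception class named in the traceback.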

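As the Quickstart page notes, you can inspect what was sent and received on the OpenAI site. This feature is very helpful.
Trace viewer in the OpenAI Dashboard

Below is the trace of the first exchange in the Quickstart tutorial.

Below is the second exchange, which was rejected by the guardrail.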

3. Next step: the examples published on GitHub

openai/openai-agents-python

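The examples folder contains a variety of samples, compactly organized by feature, which is very helpful. The table below summarizes the samples provided.

Category	Sample	Summary
basic	hello_world.py	Just a simple agent call.
basic	agent_lifecycle_example.py	Hooks the events at each stage of an agent call's lifecycle.
basic	lifecycle_example.py	Same as above, but also retrieves the input and output token counts.
basic	dynamic_system_prompt.py	Changes the system prompt.
basic	stream_item.py	Shows how to handle a streamed AI response (processing that depends on event.type and event.item).
basic	stream_text.py	Shows how to handle a streamed AI response (processing the streamed text output so that it appears to flow character by character).
agent_patterns	deterministic.py	Several agents write a story together. Processing stops if the generated outline is of poor quality or the genre is not science fiction. The check result can be returned as instance variables of a class, which I thought was a nice mechanism.
agent_patterns	agents_as_tools.py	Defines and calls tools.
agent_patterns	input_guardrails.py	Input guardrail sample. The description says the agent answers homework other than math, but it was actually set up to answer only math homework.
agent_patterns	output_guardrails.py	Output guardrail sample. It replaces information that should be kept secret when such information appears.
agent_patterns	llm_as_a_judge.py	Has a story-writing agent and a quality-checking agent, and retries until the result passes. As written, the sample looks like it could loop forever, so in practice you may need to add a guard. The output is shown below the table; it went through four rounds of revision.
agent_patterns	parallelization.py	Passes the same question to multiple agents and selects the best result.
agent_patterns	routing.py	Routes to the agent that handles English, French, or Spanish.
tools	web_search.py	Web search sample.
tools	file_search.py	File search sample.
tools	computer_use.py	Browser automation sample. Requires the playwright module.
handoffs	message_filter.py message_filter_streaming.py	An agent hands off when Spanish is used in the middle of an English conversation. The best way to understand the behavior is to look at the trace screen on the web side.
customer_service	customer_service	Calls the agent suited to the question. You can see the request being passed around between agents.
research_bot	research_bot	A research agent sample. I was impressed that something this capable is only a sample. It uses the rich module, so you need to run "pip install rich". Start it from the top folder of the cloned repository with "python -m examples.research_bot.main". An example run is shown below.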
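
The parallelization pattern in the table (the same question goes to several agents, then the best result is chosen) can be sketched with plain asyncio. This is only an illustration under assumptions: `candidate_answer`, `pick_best`, `parallel_then_pick`, the style labels, and the shortest-wins rule are hypothetical stand-ins, not code from parallelization.py.

```python
import asyncio


async def candidate_answer(text: str, style: str) -> str:
    # Stand-in for one agent run; a real version would call a model here.
    await asyncio.sleep(0)
    return f"[{style}] {text}"


def pick_best(candidates: list[str]) -> str:
    # Toy selection rule: prefer the shortest candidate.
    # In an agent setup this choice could itself be made by another agent.
    return min(candidates, key=len)


async def parallel_then_pick(text: str, styles: list[str]) -> str:
    # Run all candidates concurrently; gather keeps results in input order.
    candidates = await asyncio.gather(
        *(candidate_answer(text, style) for style in styles)
    )
    return pick_best(list(candidates))
```

The point of the design is that the candidate runs are independent, so they can be awaited together rather than one after another.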

Example run of llm_as_a_judge.py

(.venv) PS C:\Users\yamaz\Dropbox\work\openai-agents-python\examples\agent_patterns> python .\llm_as_a_judge.py
What kind of story would you like to hear? Please write a comedy called "Swim, Melos!", based on Osamu Dazai's "Run, Melos!".
Story outline generated
Evaluator score: needs_improvement
Re-running with feedback
Story outline generated
Evaluator score: needs_improvement
Re-running with feedback
Story outline generated
Evaluator score: needs_improvement
Re-running with feedback
Story outline generated
Evaluator score: needs_improvement
Re-running with feedback
Story outline generated
Evaluator score: pass
Story outline is good enough, exiting.
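Final story outline: Title: Swim, Melos!

Genre: Comedy

Synopsis:

Ancient Greece, a lively seaside town in Sicily. To liven up the wedding of his best friend Selinuntius, Melos takes on an adventure across the sea. Comical events await him.

Plot points:

Melos and Selinuntius: Holding on to their childhood adventures and promises, Melos resolves to put everything into a humorous wedding speech.
A bustling town: Melos gets tangled in an octopus's tentacles at the market. As he struggles desperately, the townspeople watch and burst out laughing.
Poseidon's eccentric trial: Poseidon plans a synchronized swimming routine with sea creatures. He puts Melos in charge, but Melos's odd instructions cause chaos.
Comical adventure: Felix tries to teach him the "mid-air jump" while telling an exaggerated fairy tale. Melos gives it a try and fails in slapstick fashion.
Chase with the seagulls: The seagulls steal bread and try to snatch Melos's hat. Angry, Melos slips spectacularly and ends up covered in mud.
Comedic battle with the waves: Melos's silly sea song, "How to Stop a Storm", rings out, and the waves sway comically as they calm down. Even Poseidon is surprised.
A mini performance of friendship: Selinuntius appears, and the two improvise to support Poseidon's synchronized show. Their slapstick moves while being drenched by waves draw laughs.
Happy ending: Melos succeeds and makes everyone laugh with a humor-filled wedding speech. The bond of friendship and the power of humor win the day.

Theme: Friendship and laughter are the strongest combination, able to overcome even a crisis.

This comedy is a humorous adventure story that weaves together comical events and friendship.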
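
The run above needed several rounds before the evaluator returned "pass", and nothing in the pattern itself guarantees that it ever will. One way to add a guard is to cap the number of rounds. The following is a minimal sketch in plain Python, assuming hypothetical `generate` and `evaluate` callables in place of the two agents; `refine_until_pass` and `max_rounds` are illustrative names, not part of llm_as_a_judge.py.

```python
from typing import Callable, Optional, Tuple


def refine_until_pass(
    generate: Callable[[Optional[str]], str],
    evaluate: Callable[[str], Tuple[str, str]],
    max_rounds: int = 5,
) -> Tuple[str, int, bool]:
    """Regenerate with feedback until the evaluator passes or the cap is hit.

    generate(feedback) returns a draft; evaluate(draft) returns (score, feedback).
    Returns (last_draft, rounds_used, passed).
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback: Optional[str] = None
    draft = ""
    for round_no in range(1, max_rounds + 1):
        draft = generate(feedback)
        score, feedback = evaluate(draft)
        if score == "pass":
            return draft, round_no, True
    # Cap reached: hand back the last draft and let the caller decide.
    return draft, max_rounds, False
```

Returning the `passed` flag rather than raising leaves the caller free to accept the last draft, ask the user, or abort.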

Example run of research_bot

(.venv) PS C:\Users\yamaz\Dropbox\work\openai-agents-python> python -m examples.research_bot.main
What would you like to research? AI Agents SDK
View trace: https://platform.openai.com/traces/trace_f98e... (omitted) ...
Starting research...
✅ Will perform 11 searches
✅ Searching... 11/11 completed
✅ Cleaning up formatting...
✅ Report summary

This comprehensive report delves into the landscape of AI Agents SDKs, exploring their definitions, functionalities, and applications across various
industries. The report examines notable SDKs and frameworks such as LangChain, Microsoft 365 Agents SDK, and emerging platforms like SpinAI and
Inferable. It also discusses the technical aspects, multi-language support, and challenges associated with developing AI agents, culminating in an
outlook on future trends in autonomous AI technology.

=====REPORT=====

Report: # Comprehensive Review of AI Agents SDKs

This report provides an in-depth analysis of AI Agents SDKs (Software Development Kits), with a focus on the latest developments, leading frameworks, industry-oriented use cases, and challenges associated with building and deploying autonomous AI agents. Over the next several sections, we will explore various aspects of AI Agents SDKs and their implications in modern software development and enterprise applications.

1. Introduction

In recent years, the emergence of autonomous AI agents has revolutionized how businesses, developers, and researchers approach automation, interaction, and data processing. AI Agents SDKs serve as comprehensive toolkits—providing libraries, frameworks, and sample code—that simplify the development of sophisticated agents capable of performing complex and context-aware tasks with minimal human intervention.

These SDKs are becoming indispensable in a wide range of applications from healthcare and finance to customer service and education. Driven by the rapid improvements in machine learning and natural language processing (NLP), AI agents are not only reshaping traditional workflows but are also creating new paradigms for personalized user experiences. This document reviews the variety of SDKs available, their underlying technology stacks, associated industry use cases, and challenges developers face when integrating these tools into existing systems.

2. Overview of AI Agents SDKs

An AI Agent SDK provides packages of tools that enhance the productivity of developers by offering ready-to-use libraries, API integrations, and infrastructure support necessary to build and deploy autonomous agents. These agents are designed to:

Interact with users: Handling complex queries, managing dialogues, and providing intelligent responses.
Orchestrate requests: Coordinating various background tasks like database querying, processing external API calls, and integrating with other services.
Collaborate with multiple agents: Facilitating inter-agent communication and coordination in multi-agent environments.

2.1 Core Components

Agent Workflows: Frameworks such as LangChain provide structured methods to design workflows with modular components, ensuring that different tasks can be composed and managed effectively.
Tool Integration: Several SDKs enable seamless integration with third-party services (e.g., Microsoft Graph, Azure OpenAI, blockchain data via Covalent), thus extending the functionality of the autonomous agents.
Human-in-the-loop Options: Many frameworks support scenarios where human oversight is crucial, blending automation with oversight to maintain reliability and ethical standards.

3. Notable SDKs and Frameworks in the Market

Numerous AI Agents SDKs have emerged, each catering to distinct development needs and technical specialties. Below is a detailed comparison of some of the most prominent offerings:

3.1 Open-Source Frameworks

LangChain & LangGraph:
- LangChain is celebrated for its modular approach, enabling developers to easily manage prompts, chains of tasks, and memory systems.
- LangGraph, an extension within the LangChain ecosystem, focuses on stateful, multi-agent applications and supports complex agent interactions with cyclic graph structures.
Microsoft AutoGen:
Designed for developing multi-agent conversation frameworks, it allows for interaction and role-based collaboration among agents, easing the complexity of building systems that handle variable inputs.
CrewAI:
A framework that emphasizes ease of use, CrewAI lets non-technical users set up multiple agents quickly by offering simple prompt-based designs and limited customization options for less demanding applications.

3.2 Proprietary and Enterprise-Focused SDKs

Microsoft 365 Agents SDK:
This toolkit simplifies the process of building full-stack, multichannel agents designed for platforms such as Microsoft 365, Teams, and Copilot Studio. It integrates well with third-party services including Facebook Messenger and Slack, and provides secure, guided development with comprehensive documentation.
Covalent's AI Agent SDK:
By offering integration with blockchain data, this SDK enables new kinds of decentralized applications. Its capabilities include interacting directly with smart contracts, verifying transactions, and facilitating on-chain operations.

3.3 Emerging Platforms

SpinAI & Inferable:
- SpinAI is a promising TypeScript framework that provides rapid development for AI agents along with features like built-in logging and cost tracking.
- Inferable offers a code-first platform designed to build production-ready agentic automations with durable execution, emphasizing robustness and scalability.

4. Industry-Specific Use Cases

The flexibility of AI Agents SDKs allows them to be applied across several industries, often transforming traditional processes by automating tasks that once required significant human supervision.

4.1 Healthcare

Patient Care Management: Autonomous agents can consolidate data from multiple sources to improve diagnostics, manage appointment scheduling, and even predict patient outcomes.
Virtual Health Assistants: Providing timely responses, these agents help in triaging patient needs and offering initial guidance, thereby reducing the load on human medical staff.

4.2 Retail

Personalized Shopping Experiences: By analyzing customer behavior and preferences, AI agents can recommend products and deliver personalized marketing messages.
Inventory and Pricing Management: Advanced prediction algorithms help in managing stock levels, forecasting demand, and optimizing pricing strategies based on seasonal trends and consumer data.

4.3 Financial Services

Risk Assessment and Underwriting: AI agents sift through vast datasets to evaluate risk, process transactions, monitor compliance, and detect fraudulent activities.
Automated Trading: Some SDKs are integrated into trading platforms to automate market analysis and execute trades, reducing the margin for human error.

4.4 Education

Personalized Learning Paths: Agents analyze student performance, generate tailored learning plans, and even grade assignments, thus enhancing educational outcomes.
Virtual Tutoring: They serve as round-the-clock assistants, providing explanations and contextual help on educational content which allows human educators to focus on more complex teaching tasks.

4.5 Customer Service

Automated Support: Chatbots powered by AI agents provide instant solutions to customer queries, troubleshoot issues, and sometimes even diagnose technical problems.
Network Monitoring: In sectors like IT and telecommunications, agents can predict maintenance needs and monitor network performance to ensure high availability and reduced downtime.

5. Technical Specifications and Multi-Language Support

5.1 Language Support Overview

Modern AI agent SDKs typically support multiple programming languages to cater to a broad range of developers:

Python: Widely used for its rich libraries and simplicity. SDKs like PySpur and Anthropic’s Python SDK empower rapid prototyping and experimentation.
JavaScript/TypeScript: Frameworks such as Retell AI SDK, Fetch.ai’s AI Engine SDK, and Anthropic’s TypeScript SDK are designed for web-based applications, providing robust support for asynchronous processing.
Java: Via SDKs for Anthropic, enabling integration into enterprise environments that rely on Java for backend operations.

5.2 Integration with Existing Technologies

Many of these SDKs are built to integrate smoothly with legacy systems:

API-Driven Architectures: Use of standardized APIs to facilitate integration with external services, custom business logic, and cloud-native architectures.
Modular and Extensible Frameworks: Examples include the Microsoft Semantic Kernel, which acts as a middleware layer allowing developers to plug in AI capabilities into existing applications with minimal disruptions.

6. Recent Trends and Developments

6.1 Evolving Capabilities

Significant strides in AI technology have led to emerging trends, such as:

Low-Code and No-Code Platforms: These democratize development, enabling even non-programmers to build and deploy AI agents by providing simplified interfaces and pre-built templates.
Hybrid Systems: Integration of various AI paradigms that combine both natural language processing and computer vision, among others, to create more versatile agents.
Explainability and Ethical AI: With a growing emphasis on transparency, several frameworks are focusing on making AI decisions more explainable and ensuring that agents are held accountable for their actions.

6.2 Competition and Innovation

High-profile releases, such as OpenAI's Responses API, represent the next wave of development in AI agents. The new tool aims to simplify the creation of agents capable of executing sophisticated tasks autonomously, while also replacing older models like the Assistants API. Competitive dynamics are evident with Chinese startups, such as Monica and their Manus agent, which push the boundaries of performance and integration.

7. Challenges and Limitations

While AI Agents SDKs offer compelling benefits, several challenges persist:

7.1 Computational Resource Limitations

High Resource Requirement: Training and deploying advanced AI models can be resource-intensive, often requiring access to advanced hardware or scalable cloud services.
Mitigation Strategies: Techniques such as model compression and leveraging third-party cloud platforms can help counter this issue.

7.2 Bias and Fairness

Inherited Biases: AI agents based on flawed training data can produce biased or unfair outcomes.
Solutions: Utilizing diverse datasets and implementing regular fairness audits are essential corrective measures.

7.3 Integration Complexity

Legacy Systems: Coupling new AI agents with existing legacy infrastructures may present compatibility issues.
API Standardization: Adopting standardized integration patterns early in the development process can minimize these challenges.

7.4 Scalability and Performance

Dynamic Workloads: While agents may perform admirably in controlled environments, they might struggle under heavy production loads.
Design Considerations: Cloud services and scalable architectures should be incorporated from the outset.

7.5 Security and Transparency

Security Vulnerabilities: Issues like model poisoning and adversarial attacks necessitate regular security assessments and robust fallback mechanisms.
Explainability: The opaque nature of some AI models can erode trust, underscoring the importance of explainable AI techniques.

7.6 Ethical and Legal Aspects

Privacy Concerns: Deploying agents in sensitive environments raises questions about data privacy, accountability, and consent.
Compliance: Developers need to integrate ethical guidelines and compliance standards into the design phase to ensure responsible AI deployment.

8. Conclusion and Future Outlook

The rapid evolution of AI Agents SDKs is indicative of a broader trend toward more autonomous, integrated, and intelligent systems. With platforms like Microsoft 365 Agents SDK and emerging frameworks like SpinAI and Inferable, the future of agent-based software development appears poised for major breakthroughs. As these technologies continue to develop, we expect to see:

Broader Adoption Across Industries: From healthcare to finance, the ability to build robust, autonomous agents will further transform traditional workflows.
Increased Emphasis on Ethical AI: With growing public and regulatory scrutiny, ensuring fairness, transparency, and accountability will be more critical than ever.
Integration of Multi-Modal Capabilities: Future agents are anticipated to incorporate not only NLP but also computer vision, enabling richer, context-aware interactions.
Enhanced Developer Tools: New APIs and SDKs will continue to lower the barrier to entry for developing sophisticated AI agents, further driving innovation and market competition.

The trajectory of AI Agents SDKs suggests a promising future where technology not only automates routine tasks but also empowers businesses to make insightful, data-driven decisions with greater agility and precision.

9. Follow-Up Questions for Further Research

What specific challenges are faced when integrating AI Agents SDKs with legacy systems in large enterprises?
How are emerging ethical and regulatory concerns being addressed by major SDK providers?
What performance benchmarks are most critical when deploying AI agents at scale in real-time applications?
How can the evolving landscape of low-code and no-code platforms further democratize AI development?
What future trends can we expect regarding multi-modal agent capabilities (e.g., integration of vision and language)?
How might ongoing improvements in GPU and cloud infrastructure influence the scalability of these AI agent solutions?

This report highlights the importance of AI Agents SDKs as a transformative toolset for modern software development. By addressing both the technological opportunities and the inherent challenges, developers and organizations are well-positioned to harness the power of autonomous AI to drive innovation and improve operational efficiencies across a variety of domains.

=====FOLLOW UP QUESTIONS=====

Follow up questions: How will AI Agents SDKs address the increasing demand for real-time data processing in autonomous systems?
What strategies can be implemented to mitigate bias and ensure fairness in AI agent decisions?
How can developers prepare for the integration of AI Agents with legacy systems, particularly in large-scale enterprise environments?
What are the implications of emerging low-code/no-code platforms on the future development of AI agents?
How might advancements in hardware and cloud infrastructure affect the scalability and performance of AI agent solutions?

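4. Summary

・I tried the tutorial provided by OpenAI and the samples on GitHub.
・The samples let you try the basic features and are very easy to follow.
・It is good that tracing comes built in as standard.
・It looks like agents can be built easily.
・~~Still, I suppose models other than OpenAI's can't be used after all.~~
→ According to a related article, models such as Claude and Gemini can apparently be used. Wonderful.

2025.03.13 Created by Yamazaki
2025.03.14 Revised by Yamazaki (fixed an incorrect link to the research_bot sample)
2025.03.27 Revised by Yamazaki (the Agents SDK can apparently use models other than OpenAI's as well)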
