Building a DIY DeepResearch in Copilot Studio & Trying Out the Deep Reasoning Models Feature

Posted at 2025-04-21

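Introduction

AI is evolving too fast. It hardly needs saying at this point, but it really is fast.
Of all the handy features that keep appearing, DeepResearch is the one I like most.

It is available in Gemini, Grok, ChatGPT, and others, but when I run DeepResearch with o3, I start to feel there may be no point in my writing blog posts anymore 🧐.

I like it so much that I subscribe to ChatGPT on the Pro plan, but the usage quota is nowhere near enough. I gave up on upgrading to the MAX plan for financial reasons.

So this time I looked into whether I could recreate DeepResearch myself.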

Products used in this article

This time I combine Copilot Studio, a custom connector, and Azure Functions to recreate DeepResearch.

Why Copilot Studio? I chose it because it offers the Deep Reasoning Models feature.

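This article reflects the state of things as of 2025-04-20.
The content may be out of date within days.

The Deep Reasoning Models feature in Copilot Studio is currently in preview.
It is rolled out to a limited set of regions, and the underlying LLM is Azure OpenAI o1.

Within Copilot Studio, generative AI can combine actions, topics, and knowledge, so I want to see how usable it really is.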

Steps to get there

My approach to recreating DeepResearch is based on the Dify template.

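1. Extract the topic using Copilot Studio's built-in capabilities.
2. Expand the keywords to research with Gemini (model: gemini-2.0-flash).
3. Feed the results of (2) to the Tavily Search API.
4. Pass the summary from (3) to the Deep Reasoning Models feature in Copilot Studio.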

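Dify uses DeepSeek R1, but I went with Gemini out of personal preference and for its attractive pricing.
Apart from Azure Functions, these are bargain APIs that each come with a free tier.

Of the steps above, (2) and (3) are implemented together in Azure Functions.
This could probably be built in Power Automate too, but with AI-assisted coding it is faster to write it as an Azure Function.

If keeping costs down is the priority, the same thing can be done on services such as PythonAnywhere, Render, Cloudflare, or Replit. I have tried them all: PythonAnywhere is very quick to get going, and Cloudflare is my pick for anything more elaborate. Cloudflare's free tier is so large it almost makes me nervous.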

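Preparing the custom connector (Azure Functions)

 2. Expand the keywords to research with `Gemini` (model: gemini-2.0-flash).
 3. Feed the results of (2) to the [Tavily](https://tavily.com/) Search API.

The steps above are the Azure Functions part.
The code only calls the Tavily and Gemini APIs, so I will skip a walkthrough. It was written entirely by AI, but it behaves as described.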

function_app.py

import logging
import json
import azure.functions as func
import os
from typing import List, Dict, Any, Optional

# Import the official client libraries
import google.generativeai as genai
from tavily import TavilyClient

# Set up API keys from environment variables
GEMINI_API_KEY = os.environ.get("GEMINI_API_KEY")
TAVILY_API_KEY = os.environ.get("TAVILY_API_KEY")

# Configure the Google Generative AI client
genai.configure(api_key=GEMINI_API_KEY)

# Create Tavily client
tavily_client = TavilyClient(api_key=TAVILY_API_KEY)

# Define the function app
app = func.FunctionApp()

# Gemini prompt template embedded directly in the script
# (placeholder: the actual prompt text is quoted after the code)
GEMINI_PROMPT_TEMPLATE = "See the prompt quoted below"

def extract_search_terms(topic: str) -> List[str]:
    """Use Gemini API to extract search terms from a topic."""
    # Create the prompt for Gemini
    prompt = f"{GEMINI_PROMPT_TEMPLATE}\n\nResearch topic: {topic}"
    
    # Initialize the model
    model = genai.GenerativeModel('gemini-2.0-flash')
    
    # Generate content
    generation_config = {
        "temperature": 0.2,
        "top_p": 0.95,
        "top_k": 40
    }
    
    response = model.generate_content(
        prompt,
        generation_config=generation_config
    )
    
    generated_text = response.text
    logging.info(f"Gemini response: {generated_text}")
    
    # Extract the JSON from the generated text
    try:
        json_start = generated_text.find("{")
        json_end = generated_text.rfind("}") + 1
        
        if json_start == -1 or json_end == 0:
            logging.warning("No JSON found in Gemini response")
            return [topic]  # Return original topic if we can't parse JSON
        
        json_str = generated_text[json_start:json_end]
        result = json.loads(json_str)
        
        next_search_topic = result.get("nextSearchTopic")
        
        # Handle different types of nextSearchTopic
        if next_search_topic is None:
            return []
        elif isinstance(next_search_topic, str):
            return [next_search_topic]
        elif isinstance(next_search_topic, list):
            return next_search_topic
        else:
            logging.warning(f"Unexpected type for nextSearchTopic: {type(next_search_topic)}")
            return [topic]
    
    except Exception as e:
        logging.error(f"Error parsing Gemini response: {e}")
        return [topic]  # Return original topic if parsing fails

def search_with_tavily(term: str) -> Dict[str, Any]:
    """Use Tavily API to search for information."""
    # Use the Tavily client to perform the search
    response = tavily_client.search(
        query=term,
        search_depth="advanced",
        include_answer=True
    )
    
    return response

def structure_results(topic: str, search_results: Dict[str, Dict[str, Any]]) -> str:
    """Structure search results in markdown format."""
    markdown_content = f"# Deep Research: {topic}\n\n"
    
    # Summary section
    markdown_content += "## Research Summary\n\n"
    for term, result in search_results.items():
        if "answer" in result and result["answer"]:
            markdown_content += f"### {term}\n{result['answer']}\n\n"
    
    # Detailed findings section
    markdown_content += "## Detailed Findings\n\n"
    for term, result in search_results.items():
        markdown_content += f"### {term}\n\n"
        
        if "results" in result and result["results"]:
            for i, source in enumerate(result["results"]):
                markdown_content += f"#### Source {i+1}: {source.get('title', 'No Title')}\n"
                markdown_content += f"- URL: {source.get('url', 'No URL')}\n"
                
                if "snippet" in source and source["snippet"]:
                    markdown_content += f"- Snippet: {source['snippet']}\n\n"
                elif "content" in source and source["content"]:
                    content_snippet = source["content"][:200] + "..." if len(source["content"]) > 200 else source["content"]
                    markdown_content += f"- Snippet: {content_snippet}\n\n"
                else:
                    markdown_content += "\n"
    
    return markdown_content

def deep_research(topic: str) -> str:
    """Main function to perform deep research on a topic."""
    # Step 2: Extract search terms using Gemini API
    logging.info(f"Extracting search terms for topic: '{topic}'")
    search_terms = extract_search_terms(topic)
    
    if not search_terms:
        return f"# Deep Research: {topic}\n\nNo search terms were generated for this topic."
    
    logging.info(f"Generated search terms: {search_terms}")
    
    # Step 3: Search for each term using Tavily API
    search_results = {}
    
    for term in search_terms:
        logging.info(f"Searching for term: '{term}'")
        
        try:
            result = search_with_tavily(term)
            search_results[term] = result
        except Exception as e:
            logging.error(f"Error searching for term '{term}': {str(e)}")
            search_results[term] = {"results": [], "answer": f"Error: {str(e)}"}
    
    # Step 4: Structure the results in markdown format
    return structure_results(topic, search_results)

@app.route(route="deep_research", auth_level=func.AuthLevel.FUNCTION)
def deep_research_http_trigger(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    try:
        # Get the request body
        req_body = req.get_json()
        
        # Extract the search topic from the request
        if 'topic' in req_body:
            topic = req_body['topic']
        else:
            return func.HttpResponse(
                json.dumps({"error": "Please pass a 'topic' property in the request body"}),
                status_code=400,
                mimetype="application/json"
            )
        
        # Perform deep research
        research_results = deep_research(topic)
        
        # Return the results
        return func.HttpResponse(
            json.dumps({"research_results": research_results}),
            status_code=200,
            mimetype="application/json"
        )
        
    except ValueError as e:
        return func.HttpResponse(
            json.dumps({"error": f"Invalid request format: {str(e)}"}),
            status_code=400,
            mimetype="application/json"
        )
    except Exception as e:
        logging.error(f"Error processing request: {str(e)}")
        return func.HttpResponse(
            json.dumps({"error": f"Internal server error: {str(e)}"}),
            status_code=500,
            mimetype="application/json"
        )

Markdown rendering breaks otherwise, so the Gemini prompt goes here 👇

"""You are a research agent investigating the following topic.
What have you found? What questions remain unanswered? What specific aspects should be investigated next?

## Output
- Do not output topics that are exactly the same as already searched topics.
- If further information search is needed, set nextSearchTopic.
- If sufficient information has been obtained, set shouldContinue to false.
- Please output in json format

{JSON schema goes here}
"""

nextSearchTopic: str | None
shouldContinue: bool
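
The two fields above can be read into a small typed structure. The sketch below is one way to do that with the standard library only. `ResearchDecision`, `parse_decision`, and `research_loop` are hypothetical names that do not appear in function_app.py, and the loop is an assumption about how `shouldContinue` could drive repeated searches (the function in this article makes a single pass).

```python
import json
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class ResearchDecision:
    """Mirrors the two fields of the prompt's JSON schema."""
    next_search_topic: Optional[str]
    should_continue: bool


def parse_decision(generated_text: str) -> ResearchDecision:
    """Pull the outermost {...} span out of a model reply and map it to the schema."""
    start = generated_text.find("{")
    end = generated_text.rfind("}") + 1
    if start == -1 or end == 0:
        # No JSON at all: treat it as "nothing more to search".
        return ResearchDecision(None, False)
    try:
        data = json.loads(generated_text[start:end])
    except json.JSONDecodeError:
        # Malformed JSON is handled the same way as missing JSON.
        return ResearchDecision(None, False)
    topic = data.get("nextSearchTopic")
    return ResearchDecision(
        next_search_topic=topic if isinstance(topic, str) and topic else None,
        should_continue=bool(data.get("shouldContinue", False)),
    )


def research_loop(
    topic: str,
    ask_model: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> List[str]:
    """Collect search topics until the model says stop or max_rounds is reached.

    ask_model(topic, searched) is a hypothetical callable returning the raw
    model reply; in practice it would wrap the Gemini call.
    """
    searched: List[str] = []
    for _ in range(max_rounds):
        decision = parse_decision(ask_model(topic, searched))
        nxt = decision.next_search_topic
        if nxt and nxt not in searched:
            searched.append(nxt)
        if not decision.should_continue or nxt is None:
            break
    return searched
```

Capping the rounds with `max_rounds` keeps the number of model and search calls bounded, which matters when working inside free-tier quotas.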

This Azure Function takes topic as its argument and returns research_results.
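
As a rough illustration of that contract, the sketch below builds the POST request and reads the `research_results` field back out, using only the standard library. The helper names are hypothetical, the `/api` route prefix is the Azure Functions default and is assumed unchanged, and the function key is sent in the `x-functions-key` header because the route uses `AuthLevel.FUNCTION`.

```python
import json
import urllib.request


def build_request(base_url: str, topic: str, function_key: str) -> urllib.request.Request:
    """Build the POST request for the deep_research route (default 'api' prefix assumed)."""
    payload = json.dumps({"topic": topic}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/deep_research",
        data=payload,
        headers={
            "Content-Type": "application/json",
            # AuthLevel.FUNCTION expects a function key; the header form is used here.
            "x-functions-key": function_key,
        },
        method="POST",
    )


def extract_markdown(response_body: bytes) -> str:
    """Return the Markdown report from a successful (HTTP 200) response body."""
    return json.loads(response_body)["research_results"]
```

To actually send it, pass the request to `urllib.request.urlopen(req, timeout=120)` with your own host and key, then hand `resp.read()` to `extract_markdown`. A generous timeout is sensible because the function makes one Gemini call plus one Tavily search per term.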

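Tavily, which appears here, is an AI research tool.
It has a Free plan that grants 1,000 credits per month.
No credit card is required to sign up.

The same goes for Gemini.

gemini-2.0-flash also has a free tier, so you can casually use a very capable AI.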

Once the Azure Function is ready, set up the custom connector.
The Azure portal has a very handy feature for exporting an OpenAPI file.
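
For reference, the kind of definition a custom connector needs for this single endpoint might look like the sketch below. It is hand-written for illustration and is not the file the portal exports. It assumes the OpenAPI 2.0 (Swagger) format; the host is a placeholder, and the title, `operationId`, and security scheme name are hypothetical.

```python
import json


def build_connector_definition(host: str) -> dict:
    """Return a minimal OpenAPI 2.0 document describing POST /api/deep_research."""
    return {
        "swagger": "2.0",
        "info": {"title": "DeepResearch (sample)", "version": "1.0"},
        "host": host,
        "basePath": "/api",
        "schemes": ["https"],
        "consumes": ["application/json"],
        "produces": ["application/json"],
        "paths": {
            "/deep_research": {
                "post": {
                    "operationId": "DeepResearch",
                    "summary": "Run a search-backed research pass on a topic",
                    "parameters": [
                        {
                            "name": "body",
                            "in": "body",
                            "required": True,
                            "schema": {
                                "type": "object",
                                "required": ["topic"],
                                "properties": {"topic": {"type": "string"}},
                            },
                        }
                    ],
                    "responses": {
                        "200": {
                            "description": "Markdown report",
                            "schema": {
                                "type": "object",
                                "properties": {
                                    "research_results": {"type": "string"}
                                },
                            },
                        }
                    },
                }
            }
        },
        # The function key travels in a header, matching AuthLevel.FUNCTION.
        "securityDefinitions": {
            "function_key": {
                "type": "apiKey",
                "in": "header",
                "name": "x-functions-key",
            }
        },
        "security": [{"function_key": []}],
    }


def to_json(definition: dict) -> str:
    """Serialize the definition so it can be saved and imported as a file."""
    return json.dumps(definition, indent=2)
```

Describing `topic` and `research_results` explicitly is what lets the agent see a named input and output for the action.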

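This is where Power Platform and pro-code development come together.
I plan to upload everything to GitHub at a later date, including the custom connector and Azure Functions settings.
If you would rather just copy and paste, please refer to that.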

Settings on the Copilot Studio side

For Copilot Studio, the Power Platform environment is set to US and the primary language is English (en-us).

■ Power Platform environment

■ Copilot Studio

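Important
Generative orchestration supports only agents whose primary language is English (en-US). Other languages are not yet supported.

Orchestrate agent behavior with generative AI

It may also affect billing, so take care.

Turning on generative orchestration can affect how billing is calculated.

Billing for generative orchestration

This matters a great deal when using preview features, so if the feature is missing from your screen, suspect your environment or language settings.

Here, turn on generative actions and the Deep Reasoning Models feature.

Details section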

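Copilot Studio is an appealing product that lets you build capable AI chatbots in natural language.
The role and instructions you expect of the agent need to be thought through and configured with care.

Also, to use the Deep Reasoning Models feature, the instructions (the system prompt) must contain the word reason.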
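
A quick local check before pasting instructions into Copilot Studio can catch a missing keyword. The helper below is a hypothetical convenience, not part of Copilot Studio. It assumes the keyword has to appear as a standalone word, so "reasoning" alone does not count; that is a deliberately strict guess about how the keyword is matched.

```python
import re
from typing import List


def lines_with_keyword(instructions: str, keyword: str = "reason") -> List[str]:
    """Return the instruction lines that contain the keyword as a whole word."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    return [
        line.strip()
        for line in instructions.splitlines()
        if pattern.search(line)
    ]
```

An empty result means the instructions never use the keyword on its own, which is worth fixing before testing the agent.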

Description

An AI-powered research assistant that performs comprehensive analysis on any topic. It extracts key search terms, gathers relevant information, and provides deep reasoning with structured summaries to help users quickly understand complex subjects.

I am a DeepResearch Assistant designed to help you explore topics in depth. I can analyze your requests, identify key research topics, search for relevant information, and provide reasoned analysis with comprehensive summaries.

When you give me a topic to research, I will:
- Extract the most relevant search terms
- reason: Gather information from reliable sources
- reason: Apply deep reasoning to analyze findings
- reason: Provide a structured summary with key insights
- Highlight areas for further exploration

My responses include both high-level summaries and detailed findings with source information. Feel free to ask follow-up questions about any aspect of my research.

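Use actions with custom agents

One of Copilot Studio's most promising features, though still in preview, is the ability to run core actions.

When generative orchestration is turned on, the agent can automatically select the most appropriate actions and topics, or search knowledge, to respond to the user.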

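From the prompt alone, the agent works out the arguments from the topic and can autonomously run any of the following:

- Prebuilt connector actions
- User-defined connectors and actions
- Power Automate cloud flows
- AI Builder prompts (within topics)
- Bot Framework skills
- REST API connections

Inputs can be dynamic values chosen by the agent, variables, or Power Fx, which makes this an appealing feature.
Argument setup is simplified, and the agent can be extended by adding more actions.

This time it is a custom connector, so it falls under user-defined connectors and actions.

As written in the prompt, the agent identifies the topic, runs the search, and then performs deep reasoning.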

Model Context Protocol (MCP)

To me, this feature felt somewhat similar to the Model Context Protocol (MCP).
MCP is a hot topic right now: a client such as Claude Desktop uses AI to pick a tool from those exposed by an MCP server and then runs it.
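
To make the comparison concrete: MCP messages are JSON-RPC 2.0, and a client typically lists the server's tools and then calls one by name. The sketch below builds those two requests with the standard library only. The tool name `create_list` and its arguments are hypothetical, and no transport or SDK is involved.

```python
import json
from typing import Any, Dict


def list_tools_request(request_id: int) -> Dict[str, Any]:
    """JSON-RPC request asking an MCP server which tools it exposes."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def call_tool_request(
    request_id: int, name: str, arguments: Dict[str, Any]
) -> Dict[str, Any]:
    """JSON-RPC request asking an MCP server to run one tool with arguments."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def encode(message: Dict[str, Any]) -> str:
    """Serialize a message as a single line of JSON."""
    return json.dumps(message, ensure_ascii=False)


# Hypothetical example: the model has decided to call a SharePoint-style tool.
example = call_tool_request(2, "create_list", {"title": "Tasks"})
```

The parallel with Copilot Studio actions is that in both cases the model chooses the tool and fills in the arguments, while the server only declares what is available.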

The low barrier to entry is also appealing. I was able to set up an MCP server with the Python SDK that automates SharePoint operations.

SharePoint MCP...
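Jumping on the trend, I built an MCP tool from the SDK, leaning heavily on Claude!

It drives a SharePoint site from natural language 🐟
The demo is packed: creating Lists, suggesting content, hosting a document library and storing files in it. #Claude #MCP pic.twitter.com/I7aVd8HCne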
— 出戻りガツオ🐟 Microsoft MVP (@DemodoriGatsuo) April 10, 2025

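MCP servers are spreading explosively, and an MCP offering has been announced for Azure as well.

You can also stand one up yourself.

Copilot Studio can in fact be connected to the Model Context Protocol too, but it is very hard and I could not get it to respond. A community has formed around it, so if you get it working, please write it up.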

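So how good is my homemade `DeepResearch`?

Prompt

Explain the Model Context Protocol (MCP) in Japanese, including deep reasoning.

When the Deep Reasoning Models feature kicks in, the agent visibly spends more time on the research!

Looks the part!

My impression after using it: it is nowhere near the real thing, such as o3. There is Perplexity too, but it costs more than you would expect. I would love to know a good approach.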
