The Complete Checklist Before Bringing AI Agents into Business Systems (2026)

Last updated at 2026-06-14 / Posted at 2026-06-11

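This checklist gathers in one place what to verify before rolling AI agents into business systems in earnest: permission and identity design, Control Plane design, CI safety tests, and governance.

This article merges and restructures the following three Qiita articles.

A "Permissions, Runtime Environment, and CI Safety Test" Checklist to Build Before Rolling Out AI Agents Company-Wide
A Governance Checklist for Implementers to Reduce Agent Operations Incidents
Control Plane Design to Keep AI Agents from Running Out of Control: An Implementation Checklist for Permissions, Skills, Sandboxes, and Audit APIs

Conclusion

If you run AI agents in production, you need to design a Control Plane separately from the application itself.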

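Control Plane item	Purpose	Minimum implementation
Permissions and identity	Trace each agent's identity and scope of operation	agent_id, owner, workspace, token type
Runtime environment	Contain the blast radius of the sandbox and runtime	sandbox type, network allowlist, file scope
CI safety tests	Catch incidents on the same cycle as code	CI checks for prompt injection and tool side effects
Tool/MCP Registry	Pin down which external operations can be connected	MCP server, API scope, read/write/action classification
Skill Supply Chain	Verify the capabilities that enter an agent	skill card, signature, scan result
Policy Audit	Continuously audit configuration diffs	per-repo configuration retrieval, drift detection
Governance	Model updates, state management, incident response	model rules, memory scopes, incident template
Human approval	Let a person stop actions with external impact	approval conditions for amounts, customer impact, production changes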

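In short, the Control Plane for AI agents is the layer that manages not "what to delegate" but "what you can prove is in a state safe to delegate."

1. Permission and Identity Design Checklist

Basic principle

The first thing to decide is not "which model to use" but "on whose behalf the agent acts." Handing a person's individual token to an agent is the most common design mistake.

The agent's Identity Plane should manage at least the following four types separately.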

Type	Example	Main risks	What to manage
Human interactive	Returning to agent work from mobile	Approver mix-ups	user identity, device, MFA, approval log
Automation token	Non-interactive workflow	Token leakage, excessive scope	token owner, expiry, scope, last used
Workspace agent	Scheduled runs	Unknown owner, widening sharing scope	agent owner, publish status, schedule
App connector	Slack/Drive/Teams, etc.	data oversharing	RBAC, read/write action, sensitivity label

Permission design template

coding_agent_policy:
  identity:
    run_as: agent_service_account
    human_impersonation: false
    branch_prefix: agent/
  repository_scope:
    allowed_repos:
      - app
      - docs
    denied_paths:
      - .env
      - secrets/
      - billing/
  actions:
    read_code: allow
    create_branch: allow
    modify_code: allow
    run_tests: allow
    push_branch: require_approval
    merge_pr: deny
    deploy_production: deny
  audit:
    keep_prompt_summary: true
    keep_tool_calls: true
    keep_diff: true
    link_to_ticket: required
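
To make the template concrete, here is a minimal sketch of how a policy like this could be evaluated before an agent acts. It is an illustration under assumptions, not part of any real product: the `POLICY` dict is a hand-copied subset of the YAML above, the function names `path_is_denied` and `decide` are hypothetical, and `denied_paths` entries are assumed to be matched from the repository root.

```python
import posixpath

# Hypothetical in-memory copy of part of the coding_agent_policy template.
POLICY = {
    "denied_paths": [".env", "secrets/", "billing/"],
    "actions": {
        "read_code": "allow",
        "create_branch": "allow",
        "modify_code": "allow",
        "run_tests": "allow",
        "push_branch": "require_approval",
        "merge_pr": "deny",
        "deploy_production": "deny",
    },
}


def path_is_denied(path, denied_paths):
    """True if the repo-relative path is a denied file or lies under a denied directory."""
    norm = posixpath.normpath(path)  # collapses "./" and "a/../" segments
    for entry in denied_paths:
        if entry.endswith("/"):
            if norm == entry.rstrip("/") or norm.startswith(entry):
                return True
        elif norm == entry:
            return True
    return False


def decide(action, path=None):
    """Return "allow", "require_approval", or "deny"; unlisted actions are denied."""
    if path is not None and path_is_denied(path, POLICY["denied_paths"]):
        return "deny"
    return POLICY["actions"].get(action, "deny")


print(decide("modify_code", "src/main.py"))  # allow
print(decide("modify_code", "app/../.env"))  # deny
print(decide("push_branch"))                 # require_approval
```

Defaulting unknown actions to "deny" means an action added to the agent but forgotten in the policy fails closed rather than open.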

Human approval gate

agent_identity_plane:
  human_approval:
    required_for:
      - production_deploy
      - customer_visible_message
      - billing_change
      - external_post
    approver_policy:
      min_role: workspace_admin
      mfa_required: true
      record:
        - approver_user_id
        - device_context
        - diff_summary
        - timestamp
  automation_tokens:
    max_ttl_days: 30
    require_owner: true
    allowed_scopes:
      - read_project_context
      - write_local_artifact
    denied_scopes:
      - publish_external
      - rotate_secrets
  workspace_agents:
    default_publish_state: private
    schedule_requires_owner: true
    connected_apps_default: disabled
    external_actions_default: review_required
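
As a rough sketch of how the approval and token rules above might be enforced, the snippet below checks whether an action needs a human and whether a token breaks the policy. The token record shape (`owner`, `issued_at`, `expires_at`, `scopes`) and the function names are assumptions made for illustration. Treating `allowed_scopes` as a strict allowlist is also an assumption; under it, everything in `denied_scopes` is rejected automatically.

```python
from datetime import datetime, timedelta, timezone

# Values copied by hand from the agent_identity_plane YAML above.
REQUIRED_FOR = {
    "production_deploy",
    "customer_visible_message",
    "billing_change",
    "external_post",
}
MAX_TTL = timedelta(days=30)
ALLOWED_SCOPES = {"read_project_context", "write_local_artifact"}


def needs_human_approval(action):
    """True when the action is listed under human_approval.required_for."""
    return action in REQUIRED_FOR


def token_violations(token):
    """Return a list of policy problems for a (hypothetical) token record."""
    problems = []
    if not token.get("owner"):
        problems.append("missing owner")
    if token["expires_at"] - token["issued_at"] > MAX_TTL:
        problems.append("ttl exceeds max_ttl_days")
    for scope in token["scopes"]:
        if scope not in ALLOWED_SCOPES:
            problems.append("scope not allowed: " + scope)
    return problems


issued = datetime(2026, 6, 1, tzinfo=timezone.utc)
bad_token = {
    "owner": "",
    "issued_at": issued,
    "expires_at": issued + timedelta(days=45),
    "scopes": ["read_project_context", "rotate_secrets"],
}
print(token_violations(bad_token))
```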

2. Runtime Environment and Sandbox Isolation

Runtime Boundary

Treating the sandbox merely as a "safe-looking place to run things" is not enough. Manage the boundaries for sandbox type, network, file writes, browser use, and session persistence separately.

agent_runtime_boundary:
  agent_id: support-ticket-triage-agent
  sandbox:
    isolation: ephemeral_linux
    persistent_state: session_files_only
    cleanup_policy: delete_after_task_close
  file_scope:
    read:
      - /workspace/tickets/redacted
    write:
      - /workspace/output
    deny:
      - /workspace/secrets
      - /workspace/customer_raw_exports
  network:
    default: deny
    allow:
      - api.github.com
      - api.linear.app
  required_logs:
    - agent_id
    - runtime_id
    - tool_calls
    - files_read
    - files_written
    - external_urls
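
The file and network boundaries above could be checked with something like the following sketch. The function names are hypothetical and the lists are copied by hand from the YAML. Note the assumptions: paths are normalized lexically only (symlinks are not resolved), and hosts must match the allowlist exactly.

```python
import posixpath
from urllib.parse import urlsplit

READ_ROOTS = ["/workspace/tickets/redacted"]
WRITE_ROOTS = ["/workspace/output"]
DENY_ROOTS = ["/workspace/secrets", "/workspace/customer_raw_exports"]
ALLOWED_HOSTS = {"api.github.com", "api.linear.app"}


def _under(path, roots):
    """True if the normalized path equals a root or lies inside it."""
    norm = posixpath.normpath(path)  # resolves ".." so traversal cannot hide a deny hit
    return any(norm == root or norm.startswith(root + "/") for root in roots)


def can_access(path, mode):
    """mode is "read" or "write"; deny entries win over allow entries."""
    if mode not in ("read", "write"):
        raise ValueError("mode must be 'read' or 'write'")
    if _under(path, DENY_ROOTS):
        return False
    return _under(path, READ_ROOTS if mode == "read" else WRITE_ROOTS)


def can_connect(url):
    """network.default is deny: only exact allowlisted hosts pass."""
    return urlsplit(url).hostname in ALLOWED_HOSTS


print(can_access("/workspace/output/report.md", "write"))         # True
print(can_access("/workspace/output/../secrets/key", "write"))    # False
print(can_connect("https://api.github.com.evil.example/steal"))   # False
```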

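Gates for long-running execution

The longer an agent can run, the more the permission design at start time matters. Because a wrong premise can propagate through dozens of steps, set checkpoints.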

agent_run_gate:
  run_type:
    allowed:
      - read_only_investigation
      - local_patch
      - draft_content
    requires_approval:
      - external_api_write
      - production_config_change
      - dependency_upgrade
      - payment_or_email_action
  context_inputs:
    app_screenshot: allowed
    terminal_output: allowed
    secrets: deny
    customer_pii: redact_before_attach
  long_running_mode:
    max_minutes_without_human_checkpoint: 30
    require_goal_statement: true
    require_rollback_plan: true
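
One way to read the gate above in code is the sketch below. It assumes, for simplicity, that the goal statement and rollback plan are required for every run rather than only in long-running mode, and that run types not listed anywhere are refused. The function names are hypothetical.

```python
from datetime import datetime, timedelta

ALLOWED = {"read_only_investigation", "local_patch", "draft_content"}
REQUIRES_APPROVAL = {
    "external_api_write",
    "production_config_change",
    "dependency_upgrade",
    "payment_or_email_action",
}
MAX_WITHOUT_CHECKPOINT = timedelta(minutes=30)


def can_start(run_type, goal_statement, rollback_plan, approved=False):
    """Decide whether a run may start under the agent_run_gate rules."""
    if not goal_statement or not rollback_plan:
        return False
    if run_type in ALLOWED:
        return True
    if run_type in REQUIRES_APPROVAL:
        return approved
    return False  # unlisted run types are refused


def checkpoint_due(last_checkpoint, now):
    """True once 30 minutes have passed since the last human checkpoint."""
    return now - last_checkpoint >= MAX_WITHOUT_CHECKPOINT


start = datetime(2026, 6, 11, 9, 0)
print(checkpoint_due(start, start + timedelta(minutes=29)))  # False
print(checkpoint_due(start, start + timedelta(minutes=30)))  # True
```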

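Mechanisms that pass screen context, such as Appshots, are convenient, but a screenshot may capture API keys, customer information, or unpublished financial information. Before adoption, decide which apps may be screen-shared and what the redaction procedure is.

3. CI Safety Tests and Deploy Gates

Design review alone cannot sustain the safety of AI agents. As with ordinary software, check safety on every PR.

What matters is to inspect not only the LLM's output text but also the side effects: which tools were actually called, which files were changed, and any external sends, deletions, or deployments.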

agent_safety_ci:
  trigger:
    - pull_request
    - tool_policy_change
    - retrieval_source_change
    - model_or_prompt_change
  test_classes:
    prompt_injection:
      min_trials: 20
      required_safe_rate: 0.95
    tool_side_effect:
      must_not_call:
        - send_email
        - delete_record
        - deploy_production
    data_boundary:
      pii_leak_check: required
      tenant_isolation_check: required
    incident_regression:
      replay_known_incidents: required
  merge_gate:
    fail_on_policy_violation: true

Because prompt injection behavior is probabilistic, do not pass on a single successful run; set multiple trials and a safe-rate threshold.
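
The multi-trial rule can be sketched as a small gate function. This is an illustration only: `passes_injection_gate` is a hypothetical name, and each trial result is assumed to be a boolean that is `True` when the agent stayed safe. With `min_trials: 20` and `required_safe_rate: 0.95`, at most one unsafe trial in twenty is tolerated. The comparison uses `Fraction` to avoid floating-point edge cases at the threshold.

```python
from fractions import Fraction

MIN_TRIALS = 20
REQUIRED_SAFE_RATE = Fraction(95, 100)


def passes_injection_gate(results):
    """results: one boolean per trial, True when the agent stayed safe."""
    if len(results) < MIN_TRIALS:
        return False  # too few trials to say anything about a probabilistic behavior
    return Fraction(sum(results), len(results)) >= REQUIRED_SAFE_RATE


print(passes_injection_gate([True] * 19 + [False]))      # True  (19/20 = 0.95)
print(passes_injection_gate([True] * 18 + [False] * 2))  # False (18/20 = 0.90)
print(passes_injection_gate([True] * 10))                # False (fewer than 20 trials)
```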

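Reviewing agent definition files

AGENTS.md, SKILL.md, and the tool policy are execution specifications. Review their diffs just as you would code. Splitting reviewers with a CODEOWNERS file (a feature of hosting platforms such as GitHub and GitLab, not of Git itself) is a practical approach.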

agent_definition_review:
  files:
    - AGENTS.md
    - SKILL.md
    - tool-policy.yaml
    - evals/
  review_required_when:
    - new_tool_added
    - sandbox_network_policy_changed
    - data_source_added
    - write_capability_added
    - persistent_state_enabled
  ci_checks:
    markdown_lint: true
    prohibited_secret_scan: true
    tool_policy_diff: true
    prompt_injection_regression: true
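
A CI step could flag the changed files that fall under this review rule with a sketch like the one below. It assumes that AGENTS.md, SKILL.md, and tool-policy.yaml may live in any directory (so they are matched by file name) and that `evals/` sits at the repository root. The function name is hypothetical.

```python
import posixpath

REVIEWED_FILENAMES = {"AGENTS.md", "SKILL.md", "tool-policy.yaml"}
REVIEWED_DIRS = ("evals/",)


def files_needing_review(changed_files):
    """Return the changed paths that count as agent definition files."""
    hits = []
    for path in changed_files:
        norm = posixpath.normpath(path)
        is_definition = posixpath.basename(norm) in REVIEWED_FILENAMES
        if is_definition or norm.startswith(REVIEWED_DIRS):
            hits.append(norm)
    return hits


changed = ["src/app.py", "skills/triage/SKILL.md", "evals/injection/case1.json"]
print(files_needing_review(changed))
```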

4. Control Plane Design (Permissions, Skills, Audit API)

Tool / MCP Registry

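SDKs, CLIs, and MCP servers are the agent's "operation boundary itself." Rather than managing only the MCP server name, manage the tools inside it, their scopes, write capability, data classification, owner, and change-review conditions.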

tool_registry:
  - name: github-repo-readonly
    type: mcp_server
    owner: platform-team
    allowed_agents:
      - code-review-agent
    auth_scope:
      - repo:read
    actions:
      read:
        - get_issue
        - list_pull_requests
        - get_file
      write: []
      destructive: []
    data_classification:
      max_input: internal
      max_output: internal
    change_control:
      breaking_change_requires: security_review

  - name: billing-admin-api
    type: rest_api
    owner: finance-platform
    allowed_agents: []
    reason: "write and money movement require human-only operation"

Update the capability registry every time you add a connection. Unregistered connection targets are denied by default.

agent_capability_registry:
  capability: crm_search
  connector_type: mcp_server
  owner_team: sales_ops
  data_classification:
    - customer_pii
    - commercial_confidential
  allowed_actions:
    - read_account
    - read_activity
  denied_actions:
    - update_deal_amount
    - send_customer_email
    - export_csv
  approval:
    required_for_write: true
    approver_role: sales_manager
  monitoring:
    log_all_queries: true
    alert_on_bulk_access: true
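
The deny-by-default behavior of a registry can be sketched as a lookup that refuses anything it does not know. The `REGISTRY` dict below is a hand-copied subset of the `tool_registry` example, and `is_call_allowed` is a hypothetical helper. Letting write actions through only with an explicit approval flag is an assumption of this sketch.

```python
REGISTRY = {
    "github-repo-readonly": {
        "allowed_agents": {"code-review-agent"},
        "read": {"get_issue", "list_pull_requests", "get_file"},
        "write": set(),
    },
    "billing-admin-api": {
        "allowed_agents": set(),  # human-only: no agent may call it
        "read": set(),
        "write": set(),
    },
}


def is_call_allowed(agent_id, tool_name, action, approved=False):
    """Unregistered tools, unlisted agents, and unlisted actions are all denied."""
    entry = REGISTRY.get(tool_name)
    if entry is None or agent_id not in entry["allowed_agents"]:
        return False
    if action in entry["read"]:
        return True
    if action in entry["write"]:
        return approved
    return False


print(is_call_allowed("code-review-agent", "github-repo-readonly", "get_file"))  # True
print(is_call_allowed("code-review-agent", "billing-admin-api", "get_invoice"))  # False
print(is_call_allowed("code-review-agent", "unregistered-mcp", "anything"))      # False
```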

Skill Supply Chain

An AI agent's skill is not just a prompt fragment but a deployable artifact. It needs Supply Chain management just like a library or container image.

skill_supply_chain:
  required_files:
    - SKILL.md
    - SKILLCARD.yaml
  required_metadata:
    - owner
    - purpose
    - allowed_tools
    - required_permissions
    - data_flow
    - known_risks
    - limitations
    - verification_status
  checks:
    static:
      - no_hidden_instruction
      - no_secret_access_pattern
      - no_unapproved_network_target
      - no_destructive_default_action
    provenance:
      - source_repo_verified
      - signature_verified
      - version_pinned
    review:
      - security_owner_approved
      - product_owner_approved
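
As an illustration of the metadata and provenance checks above, the sketch below validates a skill card that has already been loaded into a dict. The card layout (a flat dict with a nested `provenance` dict of booleans) and the function name are assumptions. The static scan and the review sign-off are out of scope here.

```python
REQUIRED_METADATA = [
    "owner",
    "purpose",
    "allowed_tools",
    "required_permissions",
    "data_flow",
    "known_risks",
    "limitations",
    "verification_status",
]
PROVENANCE_CHECKS = ["source_repo_verified", "signature_verified", "version_pinned"]


def skill_card_problems(card):
    """Return human-readable problems; an empty list means the card passes."""
    problems = [
        "missing metadata: " + key for key in REQUIRED_METADATA if not card.get(key)
    ]
    provenance = card.get("provenance", {})
    problems += [
        "provenance check failed: " + check
        for check in PROVENANCE_CHECKS
        if provenance.get(check) is not True
    ]
    return problems


card = {
    "owner": "platform-team",
    "purpose": "summarize tickets",
    "provenance": {"source_repo_verified": True, "signature_verified": False},
}
for problem in skill_card_problems(card):
    print(problem)
```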

Policy Audit

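It is important not to leave agent configuration in a state where "you can tell by looking at the admin console." If the configuration can be retrieved via API, a scheduled job can detect drift.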

agent_policy_audit:
  schedule: "daily"
  targets:
    - org: example-org
      repos: all
  checks:
    - name: mcp_servers_are_allowlisted
      fail_when:
        - mcp_server not in approved_mcp_servers
    - name: write_tools_require_human_review
      fail_when:
        - enabled_tool.write == true
        - required_reviewers < 1
    - name: firewall_default_deny
      fail_when:
        - firewall.default != deny
    - name: actions_policy_no_auto_prod
      fail_when:
        - workflow_policy.allows_production_deploy == true
        - human_approval_required == false
  output:
    - security_report.md
    - drift_pr
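
A scheduled audit job might evaluate the four checks above against a configuration snapshot, roughly as follows. The snapshot shape is hypothetical, since how settings are fetched depends on the platform. Where a check lists two `fail_when` conditions, this sketch assumes both must hold for the check to fail.

```python
def audit(repo, approved_mcp_servers):
    """Return the names of the failed checks for one repo configuration snapshot."""
    failures = []
    if any(server not in approved_mcp_servers for server in repo["mcp_servers"]):
        failures.append("mcp_servers_are_allowlisted")
    has_write_tool = any(tool["write"] for tool in repo["enabled_tools"])
    if has_write_tool and repo["required_reviewers"] < 1:
        failures.append("write_tools_require_human_review")
    if repo["firewall_default"] != "deny":
        failures.append("firewall_default_deny")
    if repo["allows_production_deploy"] and not repo["human_approval_required"]:
        failures.append("actions_policy_no_auto_prod")
    return failures


drifted = {
    "mcp_servers": ["github-repo-readonly", "unknown-server"],
    "enabled_tools": [{"name": "create_branch", "write": True}],
    "required_reviewers": 0,
    "firewall_default": "allow",
    "allows_production_deploy": True,
    "human_approval_required": False,
}
print(audit(drifted, {"github-repo-readonly"}))
```

The returned check names can then be written into the report or used to open the drift PR.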

A minimal Control Plane model

You do not need to build a large platform from the start. You can begin with five files.

agent-control-plane/
  agents.yaml              # agent_id, owner, runtime, token, schedule
  runtimes.yaml            # sandbox, network, file scope
  tools.yaml               # MCP/API/tools, scopes, read/write/destructive
  skills.yaml              # skill source, card, signature, scan result
  policy-audit.yml         # scheduled audit job definition

Example agents.yaml:

agents:
  - id: pr-fix-agent
    owner: developer-platform
    runtime: github-copilot-cloud-agent
    identity_type: workspace_agent
    allowed_repos:
      - service-api
      - frontend-app
    allowed_actions:
      - read_issue
      - create_branch
      - open_draft_pr
    denied_actions:
      - merge_pr
      - deploy_production
      - rotate_secret
    required_approval:
      - run_ci_on_agent_pr
      - merge_to_main

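5. Governance and Incident Response Runbook

Model rules (by organization and use case)

Model selection is not a matter of technical preference; it is a matter of drawing the boundaries of operational responsibility. Define model rules per data classification.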

model_rules:
  customer_data:
    allowed_models: [enterprise-approved-coding]
    approvals: required
    log_level: full
  internal_docs:
    allowed_models: [fast-coding-model, review-model]
    approvals: optional
    log_level: summary
  open_data:
    allowed_models: [default-general-model]
    approvals: none
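
A request router could consult the model rules with a lookup like this sketch. `MODEL_RULES` mirrors the YAML above by hand, `may_use_model` is a hypothetical name, and unknown data classifications are assumed to be refused.

```python
MODEL_RULES = {
    "customer_data": {
        "allowed_models": ["enterprise-approved-coding"],
        "approvals": "required",
    },
    "internal_docs": {
        "allowed_models": ["fast-coding-model", "review-model"],
        "approvals": "optional",
    },
    "open_data": {
        "allowed_models": ["default-general-model"],
        "approvals": "none",
    },
}


def may_use_model(data_class, model, approved=False):
    """True if the model is permitted for this data classification."""
    rule = MODEL_RULES.get(data_class)
    if rule is None or model not in rule["allowed_models"]:
        return False
    if rule["approvals"] == "required":
        return approved
    return True


print(may_use_model("customer_data", "enterprise-approved-coding"))        # False
print(may_use_model("customer_data", "enterprise-approved-coding", True))  # True
print(may_use_model("internal_docs", "review-model"))                      # True
```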

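Automated monitoring of model retirement and updates

Model update dates and retirement schedules feed directly into the CI/CD roll-forward plan. Automate at least the following.

Validate the models in use daily (error rate, output-spec diffs)
Start validation jobs for the replacement model 30 days before retirement
Declare the flow that switches to the fallback on failure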

model_ops:
  check_in: daily
  fallback:
    enabled: true
    primary: current-model
    secondary: fallback-model
  deprecation_watch:
    grace_period_days: 30
    switch_on: policy_review_complete
    ci_validation_required: true
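
The 30-day window and the fallback switch can be expressed in a few lines. This is a sketch under assumptions: the retirement date would come from the provider's announcement (not modeled here), and both function names are hypothetical.

```python
from datetime import date, timedelta

GRACE_PERIOD = timedelta(days=30)


def validation_window_open(today, retirement_date):
    """True from 30 days before retirement onward: time to validate the replacement."""
    return today >= retirement_date - GRACE_PERIOD


def select_model(primary_healthy, fallback_enabled=True):
    """Pick the model for the next call according to the fallback policy."""
    if primary_healthy:
        return "current-model"
    if fallback_enabled:
        return "fallback-model"
    raise RuntimeError("primary model failed and fallback is disabled")


retirement = date(2026, 9, 30)
print(validation_window_open(date(2026, 8, 30), retirement))  # False
print(validation_window_open(date(2026, 8, 31), retirement))  # True
print(select_model(primary_healthy=False))                    # fallback-model
```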

Separating memory scopes

In the context stored for an agent, keep personal-preference notes from mixing with repository facts. Delete temporary context when the session ends.

agent_memory:
  user_level:
    allowed:
      - preferred_review_style
      - coding_standards
    denied:
      - customer_data
  repository_level:
    allowed:
      - build_command
      - test_command
      - architecture_constraints
  session_level:
    allowed:
      - temporary_investigation_context
    retention: delete_after_task_close
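
Scope separation could be enforced at write time, as in this sketch. It assumes memory is a plain nested dict and that only the keys listed as allowed may be stored; the `store` and `close_task` names are hypothetical.

```python
ALLOWED_KEYS = {
    "user_level": {"preferred_review_style", "coding_standards"},
    "repository_level": {"build_command", "test_command", "architecture_constraints"},
    "session_level": {"temporary_investigation_context"},
}


def store(memory, scope, key, value):
    """Write a value only if the key is allowed in that scope."""
    if key not in ALLOWED_KEYS.get(scope, set()):
        raise PermissionError(key + " is not allowed in " + scope)
    memory.setdefault(scope, {})[key] = value


def close_task(memory):
    """Apply retention: delete_after_task_close to the session scope."""
    memory.pop("session_level", None)


memory = {}
store(memory, "repository_level", "test_command", "make test")
store(memory, "session_level", "temporary_investigation_context", "checking a flaky test")
close_task(memory)
print(memory)  # {'repository_level': {'test_command': 'make test'}}
```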

Incident response template

incident_response:
  intake:
    channel: secure-form
    fields:
      - resource_id
      - affected_users
      - suspicious_content_hash
  validation:
    triage_window_min: 30
    ownership: SOC
  remediation:
    immediate:
      - disable_agent_session
      - revoke_api_token
      - freeze_model_changes
    mandatory_items:
      - evidence_bundle
      - owner_signoff
      - root_cause_after_action

Audit log model

For agent auditing, store at least three kinds of logs separately. "We used Claude" or "we used Codex" alone does not constitute an audit.

agent_audit_log_model:
  identity_log:
    - agent_id
    - human_owner
    - delegated_role
    - approval_event_id
  data_access_log:
    - source_system
    - record_type
    - sensitivity_label
    - retrieval_reason
  execution_log:
    - runtime_environment_id
    - tool_called
    - side_effect_type
    - output_artifact_hash
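
The three-log model might be assembled per agent run as in the sketch below. The field names follow the YAML above; everything else (the function names, computing `output_artifact_hash` as a SHA-256 hex digest of the artifact bytes) is an assumption for illustration.

```python
import hashlib
import json

REQUIRED_FIELDS = {
    "identity_log": ["agent_id", "human_owner", "delegated_role", "approval_event_id"],
    "data_access_log": ["source_system", "record_type", "sensitivity_label", "retrieval_reason"],
    "execution_log": ["runtime_environment_id", "tool_called", "side_effect_type", "output_artifact_hash"],
}


def build_audit_record(identity, data_access, execution, artifact):
    """Combine the three logs; the artifact (bytes) is stored only as a hash."""
    execution = dict(execution, output_artifact_hash=hashlib.sha256(artifact).hexdigest())
    return {
        "identity_log": identity,
        "data_access_log": data_access,
        "execution_log": execution,
    }


def missing_fields(record):
    """List "log.field" entries that are absent from the record."""
    return [
        log + "." + field
        for log, fields in REQUIRED_FIELDS.items()
        for field in fields
        if field not in record.get(log, {})
    ]


record = build_audit_record(
    {"agent_id": "pr-fix-agent", "human_owner": "developer-platform"},
    {"source_system": "github", "record_type": "issue"},
    {"runtime_environment_id": "rt-001", "tool_called": "get_issue", "side_effect_type": "none"},
    b"draft patch contents",
)
print(json.dumps(missing_fields(record), indent=2))
```

Running `missing_fields` before persisting a record makes gaps such as an absent `approval_event_id` visible at write time rather than during an investigation.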

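Implementation checklist (all items)

1. Permission and identity design

agent_id and owner are stated explicitly
human user / automation token / workspace agent are distinguished
Token scope, TTL, and last_used can be checked
The approval log records approver, diff, and timestamp
No person's individual token is handed to an agent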

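2. Runtime environment and sandbox

The sandbox type is stated explicitly (ephemeral vs persistent)
File read/write scopes are allowlisted
network default deny is configured
Retention and deletion conditions for session state are decided
A human checkpoint is defined for long-running execution (within 30 minutes)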

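3. CI safety tests

Prompt injection tests run as regressions on PRs (multiple trials, with a safe-rate threshold)
Tool side effects are checked (send/delete/deploy, etc.)
There is a PII leak check
Changes to AGENTS.md / SKILL.md get diff review
Red-team findings are converted into CI tests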

4. Control Plane

MCP servers and the tools they contain are inventoried
read/write/destructive actions are classified
API scopes are minimized per agent
SKILL.md is accompanied by a skill card
Repo/workspace settings can be retrieved and audited via API
Drift detection runs as a scheduled job

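5. Governance

There is a model rules table (allowed models pinned per use case)
Memory scopes (user/repo/session) are configured and there is a deletion procedure
A cron job receives model retirement notices
Recovery drills using the incident template are held once a month
Tool call audit logs are stored in a single location (with rotation)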

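6. Human approval

production deploy requires human approval
customer-visible actions require human approval
billing/payment/contract changes are never executed directly by an agent
Operations that touch secrets or customer raw data have stop conditions

Common failure patterns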

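Failure	Why it is dangerous	Countermeasure
Running the agent with an admin token	Maximum privileges, so the blast radius of an incident is also maximal	Create an agent-specific ID and minimize it per use
Not putting agent definitions under review	Changes to the execution spec slip through	Manage AGENTS.md/SKILL.md/tool policy with CODEOWNERS
Evaluating only the output text	Tool calls and side effects get overlooked	Check tool calls and side effects in CI
Settling for a single prompt injection test	LLM behavior is probabilistic	Set multiple trials and a safe-rate threshold
Assuming it is safe because there is a sandbox	network, file, and session state are left open	Make the Runtime Boundary explicit in YAML
Not noticing that connections have grown	An unauditable operation surface spreads	Require a Registry update whenever a new SDK/MCP is added
Falling behind on model retirement	A sudden shutdown halts business	Automatically start replacement-model validation 30 days before retirement
Memory scopes get mixed	Past erroneous input propagates to the next task	Separate user/repo/session and reset at session end
Treating a skill as a prompt	A skill changes the agent's capabilities and permission decisions	Require skill card, scan, signature, and version pinning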


Written by @YushiYamamoto
Representative of ITPRODX.com / AI Architect
Specializes in autonomous architecture design using Next.js / TypeScript / n8n.