Self-Built Proxy for Codex CLI, Operations Edition: Multi-Tenant Support and Access Control

Posted at 2025-09-18

1. Introduction

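In the previous articles we put in place (1) the basic implementation, (2) streaming support and a retry strategy, and (3) monitoring and metrics collection. This article covers multi-tenant support and access control (AuthN/AuthZ), and walks through design and implementation patterns for a proxy that multiple teams and projects can share safely.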


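2. Requirements

  • Tenant isolation: keep the requests, logs, metrics, and budgets of tenants A and B clearly separated
  • Authentication/authorization: authenticate with an API key, JWT, or mTLS, and control allowed models, maximum tokens, regions, and so on per tenant
  • Cost/rate control: per-tenant daily/monthly caps, rate limits, and quotas, with per-model unit prices reflected
  • Auditing/observability: attach a tenant label to every metric and log for dashboards and alerts
  • Key management: issuance, rotation, and revocation (immediate invalidation)
  • Data protection: handling of PII and confidential data, with masking/anonymization options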

3. Architecture Overview


4. Data Model (PostgreSQL Example)

-- tenants: basic tenant information
CREATE TABLE tenants (
  id UUID PRIMARY KEY,
  name TEXT NOT NULL UNIQUE,
  status TEXT NOT NULL DEFAULT 'active', -- active/suspended
  daily_budget_usd NUMERIC(12,4) DEFAULT 50.0,
  monthly_budget_usd NUMERIC(12,4) DEFAULT 500.0,
  created_at TIMESTAMP DEFAULT now()
);

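-- api_keys: authentication keys (multiple keys per tenant, supports rotation)
CREATE TABLE api_keys (
  id UUID PRIMARY KEY,
  tenant_id UUID REFERENCES tenants(id),
  key_hash TEXT NOT NULL UNIQUE,    -- store only the hash; UNIQUE also indexes lookups by hash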
  label TEXT,
  expires_at TIMESTAMP,
  revoked BOOLEAN DEFAULT FALSE,
  created_at TIMESTAMP DEFAULT now()
);
CREATE INDEX ON api_keys(tenant_id);

-- policies: authorization/limits (model allowlist, max_tokens, etc.)
CREATE TABLE policies (
  id UUID PRIMARY KEY,
  tenant_id UUID REFERENCES tenants(id),
  model_allowlist TEXT[] NOT NULL,  -- e.g. ['gpt-4.1-mini','gpt-oss']
  max_tokens INTEGER DEFAULT 4096,
  regions TEXT[] DEFAULT ARRAY['*'],
  created_at TIMESTAMP DEFAULT now()
);

-- quotas: rate and concurrency limits
CREATE TABLE quotas (
  id UUID PRIMARY KEY,
  tenant_id UUID REFERENCES tenants(id),
  rpm INTEGER DEFAULT 120,          -- requests per minute
  rps INTEGER DEFAULT 5,            -- requests per second
  concurrent_limit INTEGER DEFAULT 4,
  created_at TIMESTAMP DEFAULT now()
);
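
As a rough illustration of how these tables might be consumed by the proxy, the sketch below mirrors the api_keys and tenants rows as TypeScript types and combines the revoked, expires_at, and status columns into one check. The names ApiKeyRow, TenantRow, and isKeyUsable are hypothetical and do not appear elsewhere in this article.

```typescript
// Row shapes mirroring the tables above (column names kept as-is).
interface ApiKeyRow {
  id: string
  tenant_id: string
  key_hash: string
  label: string | null
  expires_at: Date | null
  revoked: boolean
}

interface TenantRow {
  id: string
  name: string
  status: 'active' | 'suspended'
}

// Hypothetical helper: a key is usable only if it is not revoked,
// not expired, and its tenant is still active.
export function isKeyUsable(key: ApiKeyRow, tenant: TenantRow, now: Date = new Date()): boolean {
  if (key.revoked) return false
  if (key.expires_at !== null && key.expires_at.getTime() <= now.getTime()) return false
  return tenant.status === 'active'
}
```

A lookup such as getTenantByApiKeyHash could apply a check like this (or the equivalent WHERE clause) so that revoked or expired keys never authenticate.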

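5. Choosing an Authentication Method

  • API key (simple, for machines): Authorization: Bearer <key>, stored in the DB as a hash, rotated regularly
  • JWT (mixed human/service use, carries permissions): verify the signature, then apply RBAC from the tenant / roles claims
  • mTLS (zero trust, internal only): map the client certificate CN to a tenant
  • IP allowlist (supplementary): strict control of source addresses

Start small with API keys, and keep the design extensible to JWT/mTLS later.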
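
To make the JWT option more concrete, here is a minimal sketch of signature verification followed by extraction of the tenant / roles claims, using only Node's built-in crypto module. It assumes HS256 with a shared secret, which this article does not specify; verifyHs256, signHs256, and TenantClaims are hypothetical names. In production a vetted JWT library would normally be preferable to hand-rolled verification.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto'

interface TenantClaims {
  tenant: string
  roles: string[]
  exp?: number // expiry, seconds since epoch
}

// Verify an HS256-signed JWT and return its claims, or null if anything is off.
export function verifyHs256(
  token: string,
  secret: string,
  nowSec: number = Math.floor(Date.now() / 1000),
): TenantClaims | null {
  const parts = token.split('.')
  if (parts.length !== 3) return null
  const [header, payload, signature] = parts

  // Accept only the algorithm we expect (never trust "none" or a different alg)
  let alg: unknown
  try { alg = JSON.parse(Buffer.from(header, 'base64url').toString('utf8')).alg } catch { return null }
  if (alg !== 'HS256') return null

  // Recompute the signature and compare in constant time
  const expected = createHmac('sha256', secret).update(`${header}.${payload}`).digest()
  const actual = Buffer.from(signature, 'base64url')
  if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null

  let claims: any
  try { claims = JSON.parse(Buffer.from(payload, 'base64url').toString('utf8')) } catch { return null }
  if (claims === null || typeof claims !== 'object') return null
  if (typeof claims.tenant !== 'string' || !Array.isArray(claims.roles)) return null
  if (typeof claims.exp === 'number' && claims.exp <= nowSec) return null
  return claims as TenantClaims
}

// Helper used only to build a token for trying the verifier out.
export function signHs256(claims: TenantClaims, secret: string): string {
  const enc = (o: object) => Buffer.from(JSON.stringify(o)).toString('base64url')
  const head = `${enc({ alg: 'HS256', typ: 'JWT' })}.${enc(claims)}`
  return `${head}.${createHmac('sha256', secret).update(head).digest('base64url')}`
}
```

The returned tenant claim would then be mapped to a tenants row, and roles could drive finer-grained RBAC decisions.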


6. Middleware Implementation (TypeScript/Hono)

6.1 Authentication (API Key/JWT)

// src/auth.ts
import { createHash } from 'crypto'
import { getTenantByApiKeyHash, getTenantByJwt } from './repo'

export async function authMiddleware(c: any, next: any) {
  const auth = c.req.header('authorization') || ''
  const token = auth.startsWith('Bearer ') ? auth.slice(7) : null

  // Try the API key first
  if (token) {
    const keyHash = createHash('sha256').update(token).digest('hex')
    const tenant = await getTenantByApiKeyHash(keyHash)
    if (tenant) { c.set('tenant', tenant); return next() }
  }

  // JWT (example)
  const jwt = c.req.header('x-jwt')
  if (jwt) {
    const tenant = await getTenantByJwt(jwt)
    if (tenant) { c.set('tenant', tenant); return next() }
  }

  return c.json({ error: 'Unauthorized' }, 401)
}

6.2 Authorization (RBAC/Model Allowlist)

// src/rbac.ts
import { getPolicyByTenant } from './repo'

export async function rbacMiddleware(c: any, next: any) {
  const tenant = c.get('tenant')
  const body = await c.req.json()
  c.req.bodyCache = body // cache the parsed body so later handlers can reuse it

  const policy = await getPolicyByTenant(tenant.id)
  if (!policy.model_allowlist.includes(body.model)) {
    return c.json({ error: 'model_not_allowed' }, 403)
  }
  if (body.max_tokens && body.max_tokens > policy.max_tokens) {
    body.max_tokens = policy.max_tokens
  }
  return next()
}
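
The checks in the middleware above can also be expressed as a pure function, which makes them easy to unit test without an HTTP context. The sketch below is one possible shape, not the article's implementation: evaluatePolicy and the Decision type are hypothetical, and the region argument is an assumption because this article does not say how a request's region is determined. Unlike the middleware, it also applies the tenant ceiling when the client omits max_tokens.

```typescript
interface Policy {
  model_allowlist: string[]
  max_tokens: number
  regions: string[] // ['*'] allows any region
}

interface ChatRequest {
  model: string
  max_tokens?: number
}

type Decision =
  | { allowed: true; body: ChatRequest }
  | { allowed: false; reason: 'model_not_allowed' | 'region_not_allowed' }

// Hypothetical pure version of the RBAC checks: no I/O, no mutation of the input.
export function evaluatePolicy(body: ChatRequest, policy: Policy, region: string): Decision {
  if (!policy.model_allowlist.includes(body.model)) {
    return { allowed: false, reason: 'model_not_allowed' }
  }
  if (!policy.regions.includes('*') && !policy.regions.includes(region)) {
    return { allowed: false, reason: 'region_not_allowed' }
  }
  // Clamp to the tenant ceiling, and apply it even when max_tokens is omitted
  const max_tokens = Math.min(body.max_tokens ?? policy.max_tokens, policy.max_tokens)
  return { allowed: true, body: { ...body, max_tokens } }
}
```

A middleware could call this function and translate a rejected Decision into a 403 response with the reason as the error code.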

6.3 Guardrails (Prompt Inspection and Size Limits)

// src/guardrails.ts
export function sanitizePrompt(text: string) {
  // Implement simple masking of sensitive data here (e.g. employee IDs, credit card numbers)
  return text.replace(/\b\d{4}-\d{4}-\d{4}-\d{4}\b/g, '****-****-****-****')
}

export function enforcePromptLimits(body: any, maxChars = 20000) {
  // Check every user message, not only the first one
  for (const msg of body.messages ?? []) {
    if (msg.role !== 'user') continue
    const content = typeof msg.content === 'string' ? msg.content : JSON.stringify(msg.content)
    if (content.length > maxChars) {
      throw new Error('prompt_too_long')
    }
    // Mask only plain-string content so structured (array) content keeps its shape
    if (typeof msg.content === 'string') msg.content = sanitizePrompt(content)
  }
}

7. Rate Limiting/Quotas (Redis + Token Bucket)

// src/rate.ts
import { RateLimiterRedis } from 'rate-limiter-flexible'
import { createClient } from 'redis'

const redis = createClient({ url: process.env.REDIS_URL })
await redis.connect()

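export const limiterPerTenant = new RateLimiterRedis({
  storeClient: redis,
  useRedisPackage: true, // needed when storeClient comes from the `redis` (node-redis v4+) package
  keyPrefix: 'rl_tenant',
  points: 120, // 120 requests per minute
  duration: 60,
})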

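export async function rateMiddleware(c: any, next: any) {
  const tenant = c.get('tenant')
  try {
    await limiterPerTenant.consume(tenant.id)
  } catch (rejected) {
    // consume() rejects with an Error on store failures; any other rejection means the limit was hit
    if (rejected instanceof Error) throw rejected
    return c.json({ error: 'too_many_requests' }, 429)
  }
  return next()
}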
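
Since the heading mentions a token bucket, the algorithm itself can be sketched independently of any library. The in-memory version below is only an illustration: TokenBucket and allow are hypothetical names, state lives in a single process, and a deployment with several replicas would still need a shared store such as Redis.

```typescript
// In-memory token bucket: holds up to `capacity` tokens, refilled at `refillPerSec`.
export class TokenBucket {
  private readonly capacity: number
  private readonly refillPerSec: number
  private tokens: number
  private last: number

  constructor(capacity: number, refillPerSec: number, nowMs: number = Date.now()) {
    this.capacity = capacity
    this.refillPerSec = refillPerSec
    this.tokens = capacity // start full so short bursts are allowed
    this.last = nowMs
  }

  // Returns true and spends one token if the request may proceed.
  tryConsume(nowMs: number = Date.now()): boolean {
    const elapsedSec = Math.max(0, nowMs - this.last) / 1000
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec)
    this.last = nowMs
    if (this.tokens < 1) return false
    this.tokens -= 1
    return true
  }
}

// One bucket per tenant id, sized from that tenant's requests-per-minute quota.
const buckets = new Map<string, TokenBucket>()

export function allow(tenantId: string, rpm: number, nowMs: number = Date.now()): boolean {
  let bucket = buckets.get(tenantId)
  if (!bucket) {
    bucket = new TokenBucket(rpm, rpm / 60, nowMs)
    buckets.set(tenantId, bucket)
  }
  return bucket.tryConsume(nowMs)
}
```

Sizing the bucket from quotas.rpm rather than a hard-coded value is what lets each tenant have its own limit.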

8. Routing and Execution (OpenAI/Ollama)

// src/routes.ts
import { Context } from 'hono'
import { enforcePromptLimits } from './guardrails'
import { callOpenAI } from './providers/openai'
import { callOllama } from './providers/ollama'

export async function chatCompletions(c: Context) {
  const tenant = c.get('tenant')
  const body = c.req.bodyCache || await c.req.json()

  enforcePromptLimits(body, 20000)

  const isOllama = body.model.includes('gpt-oss') || body.model.includes('ollama')
  return isOllama ? callOllama(c, body, tenant) : callOpenAI(c, body, tenant)
}

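9. Key Issuance, Rotation, and Revocation

9.1 Issuance Flow (Admin API)

  • POST /admin/tenants/:id/api-keys → generate a new key and store its hash; the plaintext key is shown only once
  • POST /admin/api-keys/:id/revoke → revoke immediately
  • POST /admin/api-keys/:id/rotate → migrate while letting the old and new keys coexist for a grace period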
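
The issuance and rotation steps above could be sketched as follows. The hash matches the SHA-256 hex digest computed in authMiddleware; the function names, the "cxp_" prefix, and the grace-period helper are assumptions made for this illustration.

```typescript
import { createHash, randomBytes } from 'node:crypto'

interface IssuedKey {
  plaintext: string // returned to the caller once, never stored
  keyHash: string   // stored in api_keys.key_hash
  last4: string     // safe to display in an admin UI
}

// Same hashing as the auth middleware: SHA-256, hex encoded.
export function hashApiKey(plaintext: string): string {
  return createHash('sha256').update(plaintext).digest('hex')
}

// The "cxp_" prefix is an arbitrary choice for this sketch.
export function issueApiKey(): IssuedKey {
  const plaintext = 'cxp_' + randomBytes(32).toString('base64url')
  return { plaintext, keyHash: hashApiKey(plaintext), last4: plaintext.slice(-4) }
}

// Rotation: instead of revoking the old key at once, give it an expiry
// `graceHours` from now so old and new keys coexist during migration.
export function rotationExpiry(now: Date, graceHours: number): Date {
  return new Date(now.getTime() + graceHours * 3_600_000)
}
```

A rotate endpoint would then insert the new key's hash and set expires_at on the old row to rotationExpiry(new Date(), 24), for example, while revoke would simply set revoked to true.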

9.2 OpenAPI Snippet

paths:
  /admin/tenants/{id}/api-keys:
    post:
      summary: Issue new API key
  /admin/api-keys/{id}/revoke:
    post:
      summary: Revoke API key

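10. Linking Multi-Tenancy with Cost Management

  • Attach a tenant label to every metric
  • When a budget overrun is projected, automatically downgrade the model / shrink max_tokens
  • Deliver a daily billing summary via Slack/email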
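
One way the automatic downgrade could look is sketched below. Everything here is an assumption for illustration: the 80% soft threshold, the 1024-token cap, the downgrade map (which treats gpt-oss as the cheaper model because it is routed to Ollama in this article), and the names applyBudgetGuard and BudgetState.

```typescript
interface BudgetState {
  spentUsd: number  // spend so far in the current period
  budgetUsd: number // e.g. tenants.daily_budget_usd
}

interface ChatRequest {
  model: string
  max_tokens?: number
}

// Hypothetical downgrade map; model names are taken from the examples in this article.
const DOWNGRADE: Record<string, string> = { 'gpt-4.1-mini': 'gpt-oss' }
const REDUCED_MAX_TOKENS = 1024

// Returns the request to forward, or null when the budget is exhausted.
export function applyBudgetGuard(body: ChatRequest, budget: BudgetState, softRatio = 0.8): ChatRequest | null {
  if (budget.budgetUsd <= 0 || budget.spentUsd >= budget.budgetUsd) return null // hard stop
  if (budget.spentUsd / budget.budgetUsd < softRatio) return body // plenty of budget left
  // Near the limit: switch to a cheaper model and shrink max_tokens
  return {
    ...body,
    model: DOWNGRADE[body.model] ?? body.model,
    max_tokens: Math.min(body.max_tokens ?? REDUCED_MAX_TOKENS, REDUCED_MAX_TOKENS),
  }
}
```

A null result would typically map to a 402 or 429 response, depending on how the proxy chooses to signal budget exhaustion.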

11. Deployment/Configuration Management (Helm values Example)

image: your/proxy:latest
env:
  OPENAI_BASE_URL: https://api.openai.com/v1
  OLLAMA_BASE_URL: http://ollama:11434/v1
  REDIS_URL: redis://redis:6379
  PROXY_API_KEY_SIGNING: "..."
resources:
  limits: { cpu: "500m", memory: "512Mi" }
  requests: { cpu: "100m", memory: "256Mi" }
service:
  type: ClusterIP
  port: 8787

12. Test Examples (curl)

# Use gpt-oss with tenant A's API key
curl -H "Authorization: Bearer <TENANT_A_KEY>" \
     -H "Content-Type: application/json" \
     -d '{"model":"gpt-oss","messages":[{"role":"user","content":"Write a haiku."}]}' \
     http://localhost:8787/v1/chat/completions

# Try a model that is not allowed (403)
curl -H "Authorization: Bearer <TENANT_A_KEY>" \
     -H "Content-Type: application/json" \
     -d '{"model":"gpt-4o","messages":[{"role":"user","content":"..."}]}' \
     http://localhost:8787/v1/chat/completions

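13. Summary

  • Design tenant isolation, authentication/authorization, rate limits/quotas, auditing, and key management as one package
  • Start small and strengthen in stages (API key → JWT/mTLS, static limits → dynamic policies)
  • Integrate with the existing monitoring and cost visibility (previous article) to operate safely and within budget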

14. Sample: Codex Proxy Admin UI Mock

Code sample
import React, { useMemo, useState } from "react";

// Admin UI mock (single-file version)
// - Tabs: Tenants / API Keys / Policies
// - Modals: key issuance, policy editing
// - Dummy data: generated with useMemo
// - Styling: Tailwind (works in the ChatGPT Canvas React preview)

export default function AdminConsole() {
  type Tenant = { id: string; name: string; status: "active" | "suspended"; daily: number; monthly: number };
  type ApiKey = { id: string; tenantId: string; label: string; last4: string; expiresAt?: string; revoked?: boolean };
  type Policy = { id: string; tenantId: string; allow: string[]; maxTokens: number; regions: string[] };

  const [activeTab, setActiveTab] = useState<"tenants" | "keys" | "policies">("tenants");
  const [showKeyModal, setShowKeyModal] = useState(false);
  const [showPolicyModal, setShowPolicyModal] = useState<null | Policy>(null);

  const tenants = useMemo<Tenant[]>(() => [
    { id: "tnt_a", name: "Team A", status: "active", daily: 50, monthly: 500 },
    { id: "tnt_b", name: "Team B", status: "active", daily: 75, monthly: 750 },
    { id: "tnt_c", name: "Team C", status: "suspended", daily: 0, monthly: 0 },
  ], []);

  const keys = useMemo<ApiKey[]>(() => [
    { id: "key1", tenantId: "tnt_a", label: "backend-ci", last4: "9F3C", expiresAt: "2025-12-31" },
    { id: "key2", tenantId: "tnt_a", label: "local-dev", last4: "1A2B" },
    { id: "key3", tenantId: "tnt_b", label: "ml-job", last4: "77C1", revoked: true },
  ], []);

  const policies = useMemo<Policy[]>(() => [
    { id: "pol_a", tenantId: "tnt_a", allow: ["gpt-oss", "gpt-4.1-mini"], maxTokens: 4096, regions: ["*"] },
    { id: "pol_b", tenantId: "tnt_b", allow: ["gpt-oss"], maxTokens: 2048, regions: ["ap-northeast-1"] },
  ], []);

  const [filterTenant, setFilterTenant] = useState<string>("all");

  const filteredKeys = keys.filter(k => filterTenant === "all" || k.tenantId === filterTenant);
  const filteredPolicies = policies.filter(p => filterTenant === "all" || p.tenantId === filterTenant);

  return (
    <div className="min-h-screen bg-gray-50 p-6">
      <header className="mb-6">
        <h1 className="text-2xl font-bold">Codex Proxy Admin Console</h1>
        <p className="text-gray-600 text-sm">Operations mock for tenants, API keys, and policies. For a real implementation, connect it to the backend API.</p>
      </header>

      <div className="mb-4 flex items-center gap-3">
        <div className="inline-flex rounded-xl shadow-sm overflow-hidden border bg-white">
          <button onClick={() => setActiveTab("tenants")} className={`px-4 py-2 text-sm ${activeTab === "tenants" ? "bg-gray-900 text-white" : "text-gray-700 hover:bg-gray-100"}`}>Tenants</button>
          <button onClick={() => setActiveTab("keys")} className={`px-4 py-2 text-sm ${activeTab === "keys" ? "bg-gray-900 text-white" : "text-gray-700 hover:bg-gray-100"}`}>API Keys</button>
          <button onClick={() => setActiveTab("policies")} className={`px-4 py-2 text-sm ${activeTab === "policies" ? "bg-gray-900 text-white" : "text-gray-700 hover:bg-gray-100"}`}>Policies</button>
        </div>

        <div className="ml-auto flex items-center gap-2">
          <select value={filterTenant} onChange={(e) => setFilterTenant(e.target.value)} className="border rounded-lg px-3 py-2 text-sm bg-white">
            <option value="all">All Tenants</option>
            {tenants.map(t => <option key={t.id} value={t.id}>{t.name}</option>)}
          </select>
          {activeTab === "keys" && (
            <button onClick={() => setShowKeyModal(true)} className="px-3 py-2 text-sm rounded-lg bg-blue-600 text-white shadow">Issue Key</button>
          )}
        </div>
      </div>

      {activeTab === "tenants" && (
        <section className="bg-white rounded-2xl shadow p-4">
          <h2 className="text-lg font-semibold mb-3">Tenants</h2>
          <div className="overflow-x-auto">
            <table className="min-w-full text-sm">
              <thead>
                <tr className="text-left text-gray-500 border-b">
                  <th className="py-2 pr-4">Name</th>
                  <th className="py-2 pr-4">Status</th>
                  <th className="py-2 pr-4">Daily Budget (USD)</th>
                  <th className="py-2 pr-4">Monthly Budget (USD)</th>
                  <th className="py-2 pr-4">Actions</th>
                </tr>
              </thead>
              <tbody>
                {tenants.map(t => (
                  <tr key={t.id} className="border-b last:border-0">
                    <td className="py-2 pr-4 font-medium">{t.name}</td>
                    <td className="py-2 pr-4">
                      <span className={`px-2 py-1 rounded text-xs ${t.status === "active" ? "bg-green-100 text-green-700" : "bg-red-100 text-red-700"}`}>{t.status}</span>
                    </td>
                    <td className="py-2 pr-4">{t.daily.toFixed(2)}</td>
                    <td className="py-2 pr-4">{t.monthly.toFixed(2)}</td>
                    <td className="py-2 pr-4 flex gap-2">
                      <button className="px-2 py-1 text-xs rounded bg-gray-100">Edit</button>
                      <button className="px-2 py-1 text-xs rounded bg-gray-100">Suspend</button>
                    </td>
                  </tr>
                ))}
              </tbody>
            </table>
          </div>
        </section>
      )}

      {activeTab === "keys" && (
        <section className="bg-white rounded-2xl shadow p-4">
          <h2 className="text-lg font-semibold mb-3">API Keys</h2>
          <div className="overflow-x-auto">
            <table className="min-w-full text-sm">
              <thead>
                <tr className="text-left text-gray-500 border-b">
                  <th className="py-2 pr-4">Tenant</th>
                  <th className="py-2 pr-4">Label</th>
                  <th className="py-2 pr-4">Key (last4)</th>
                  <th className="py-2 pr-4">Expires</th>
                  <th className="py-2 pr-4">Status</th>
                  <th className="py-2 pr-4">Actions</th>
                </tr>
              </thead>
              <tbody>
                {filteredKeys.map(k => {
                  const t = tenants.find(x => x.id === k.tenantId)!;
                  return (
                    <tr key={k.id} className="border-b last:border-0">
                      <td className="py-2 pr-4">{t.name}</td>
                      <td className="py-2 pr-4 font-medium">{k.label}</td>
                      <td className="py-2 pr-4">•••• •••• •••• {k.last4}</td>
                      <td className="py-2 pr-4">{k.expiresAt ?? "-"}</td>
                      <td className="py-2 pr-4">{k.revoked ? <span className="text-red-600">revoked</span> : "active"}</td>
                      <td className="py-2 pr-4 flex gap-2">
                        <button className="px-2 py-1 text-xs rounded bg-gray-100">Reissue</button>
                        <button className="px-2 py-1 text-xs rounded bg-gray-100">Revoke</button>
                      </td>
                    </tr>
                  );
                })}
              </tbody>
            </table>
          </div>
        </section>
      )}

      {activeTab === "policies" && (
        <section className="bg-white rounded-2xl shadow p-4">
          <h2 className="text-lg font-semibold mb-3">Policies</h2>
          <div className="overflow-x-auto">
            <table className="min-w-full text-sm">
              <thead>
                <tr className="text-left text-gray-500 border-b">
                  <th className="py-2 pr-4">Tenant</th>
                  <th className="py-2 pr-4">Allow Models</th>
                  <th className="py-2 pr-4">max_tokens</th>
                  <th className="py-2 pr-4">Regions</th>
                  <th className="py-2 pr-4">Actions</th>
                </tr>
              </thead>
              <tbody>
                {filteredPolicies.map(p => {
                  const t = tenants.find(x => x.id === p.tenantId)!;
                  return (
                    <tr key={p.id} className="border-b last:border-0">
                      <td className="py-2 pr-4">{t.name}</td>
                      <td className="py-2 pr-4">{p.allow.join(", ")}</td>
                      <td className="py-2 pr-4">{p.maxTokens}</td>
                      <td className="py-2 pr-4">{p.regions.join(", ")}</td>
                      <td className="py-2 pr-4 flex gap-2">
                        <button className="px-2 py-1 text-xs rounded bg-gray-100" onClick={() => setShowPolicyModal(p)}>Edit</button>
                      </td>
                    </tr>
                  );
                })}
              </tbody>
            </table>
          </div>
        </section>
      )}

      {showKeyModal && (
        <div className="fixed inset-0 bg-black/40 flex items-center justify-center p-4">
          <div className="bg-white rounded-2xl shadow-xl w-full max-w-md p-4">
            <h3 className="text-lg font-semibold mb-3">Issue Key</h3>
            <div className="space-y-3">
              <label className="block text-sm">Tenant</label>
              <select className="w-full border rounded-lg px-3 py-2 text-sm bg-white">
                {tenants.map(t => <option key={t.id} value={t.id}>{t.name}</option>)}
              </select>
              <label className="block text-sm">Label</label>
              <input className="w-full border rounded-lg px-3 py-2 text-sm" placeholder="backend-ci" />
            </div>
            <div className="mt-4 flex justify-end gap-2">
              <button className="px-3 py-2 text-sm rounded-lg bg-gray-100" onClick={() => setShowKeyModal(false)}>Cancel</button>
              <button className="px-3 py-2 text-sm rounded-lg bg-blue-600 text-white">Issue</button>
            </div>
          </div>
        </div>
      )}

      {showPolicyModal && (
        <div className="fixed inset-0 bg-black/40 flex items-center justify-center p-4">
          <div className="bg-white rounded-2xl shadow-xl w-full max-w-lg p-4">
            <h3 className="text-lg font-semibold mb-3">Edit Policy</h3>
            <div className="grid grid-cols-1 md:grid-cols-2 gap-3">
              <div>
                <label className="block text-sm">Allow Models</label>
                <input defaultValue={showPolicyModal.allow.join(", ")} className="w-full border rounded-lg px-3 py-2 text-sm" />
              </div>
              <div>
                <label className="block text-sm">max_tokens</label>
                <input type="number" defaultValue={showPolicyModal.maxTokens} className="w-full border rounded-lg px-3 py-2 text-sm" />
              </div>
              <div className="md:col-span-2">
                <label className="block text-sm">Regions</label>
                <input defaultValue={showPolicyModal.regions.join(", ")} className="w-full border rounded-lg px-3 py-2 text-sm" />
              </div>
            </div>
            <div className="mt-4 flex justify-end gap-2">
              <button className="px-3 py-2 text-sm rounded-lg bg-gray-100" onClick={() => setShowPolicyModal(null)}>Cancel</button>
              <button className="px-3 py-2 text-sm rounded-lg bg-blue-600 text-white" onClick={() => setShowPolicyModal(null)}>Save</button>
            </div>
          </div>
        </div>
      )}
    </div>
  );
}

15. Sample: Codex Proxy Infrastructure Templates (Terraform / Helm)

1. Prerequisites

  • Assumes the self-built proxy runs on Kubernetes
  • Metrics are collected with Prometheus and visualized in Grafana
  • Redis is used for rate limiting

2. Helm Chart Skeleton

charts/codex-proxy/
  Chart.yaml
  values.yaml
  templates/
    deployment.yaml
    service.yaml
    configmap-env.yaml
    secret-env.yaml
    servicemonitor.yaml   # when using Prometheus Operator
Chart.yaml
apiVersion: v2
name: codex-proxy
version: 0.1.0
appVersion: "latest"
values.yaml (example)
image:
  repository: ghcr.io/your-org/codex-proxy
  tag: latest
  pullPolicy: IfNotPresent

replicaCount: 2

service:
  type: ClusterIP
  port: 8787

resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 256Mi

env:
  OPENAI_BASE_URL: https://api.openai.com/v1
  OLLAMA_BASE_URL: http://ollama:11434/v1
  REDIS_URL: redis://redis:6379
  # PROXY_API_KEY is injected from a Secret

prometheus:
  enabled: true
  scrape: true
templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "codex-proxy.fullname" . }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ include "codex-proxy.name" . }}
  template:
    metadata:
      labels:
        app: {{ include "codex-proxy.name" . }}
      annotations:
        {{- if .Values.prometheus.scrape }}
        prometheus.io/scrape: "true"
        prometheus.io/port: "{{ .Values.service.port }}"
        prometheus.io/path: "/metrics"
        {{- end }}
    spec:
      containers:
        - name: proxy
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - containerPort: {{ .Values.service.port }}
          env:
            - name: OPENAI_BASE_URL
              value: {{ .Values.env.OPENAI_BASE_URL | quote }}
            - name: OLLAMA_BASE_URL
              value: {{ .Values.env.OLLAMA_BASE_URL | quote }}
            - name: REDIS_URL
              value: {{ .Values.env.REDIS_URL | quote }}
            - name: PROXY_API_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ include "codex-proxy.fullname" . }}-secret
                  key: PROXY_API_KEY
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          readinessProbe:
            httpGet: { path: /healthz, port: {{ .Values.service.port }} }
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet: { path: /healthz, port: {{ .Values.service.port }} }
            initialDelaySeconds: 10
            periodSeconds: 20
templates/service.yaml
apiVersion: v1
kind: Service
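metadata:
  name: {{ include "codex-proxy.fullname" . }}
  labels:
    app: {{ include "codex-proxy.name" . }}  # lets the ServiceMonitor selector match this Service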
spec:
  type: {{ .Values.service.type }}
  selector:
    app: {{ include "codex-proxy.name" . }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: {{ .Values.service.port }}
      protocol: TCP
      name: http
templates/secret-env.yaml
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "codex-proxy.fullname" . }}-secret
stringData:
  PROXY_API_KEY: "change-me"
templates/servicemonitor.yaml (Prometheus Operator)
{{- if .Values.prometheus.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ include "codex-proxy.fullname" . }}
spec:
  selector:
    matchLabels:
      app: {{ include "codex-proxy.name" . }}
  endpoints:
    - port: http
      path: /metrics
      interval: 15s
{{- end }}

3. Terraform Templates

3.1 Kubernetes Provider and Secret/ConfigMap

terraform {
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.23.0"
    }
    helm = {
      source  = "hashicorp/helm"
      version = ">= 2.12.1"
    }
  }
}

provider "kubernetes" {
  config_path = var.kubeconfig
}

provider "helm" {
  kubernetes {
    config_path = var.kubeconfig
  }
}

variable "kubeconfig" { type = string }
variable "namespace" {
  type    = string
  default = "codex"
}
variable "proxy_api_key" {
  type      = string
  sensitive = true
}

resource "kubernetes_namespace" "codex" {
  metadata { name = var.namespace }
}

resource "kubernetes_secret" "proxy" {
  metadata {
    name      = "codex-proxy-secret"
    namespace = kubernetes_namespace.codex.metadata[0].name
  }
  # The provider base64-encodes `data` values itself, so pass the plain value
  data = {
    PROXY_API_KEY = var.proxy_api_key
  }
}

3.2 Deploying the Chart with Helm

resource "helm_release" "codex_proxy" {
  name       = "codex-proxy"
  namespace  = var.namespace
  chart      = "./charts/codex-proxy"

  values = [
    yamlencode({
      image = {
        repository = "ghcr.io/your-org/codex-proxy"
        tag        = "latest"
        pullPolicy = "IfNotPresent"
      }
      env = {
        OPENAI_BASE_URL = "https://api.openai.com/v1"
        OLLAMA_BASE_URL = "http://ollama.codex:11434/v1"
        REDIS_URL       = "redis://redis.codex:6379"
      }
      service = { port = 8787 }
      resources = {
        limits = { cpu = "500m", memory = "512Mi" }
        requests = { cpu = "100m", memory = "256Mi" }
      }
      prometheus = { enabled = true, scrape = true }
    })
  ]
}

3.3 Addition: Redis (Bitnami Chart Example)

resource "helm_release" "redis" {
  name       = "redis"
  namespace  = var.namespace
  repository = "https://charts.bitnami.com/bitnami"
  chart      = "redis"
  version    = "19.0.0"
  values = [
    yamlencode({
      architecture = "standalone",
      master = { persistence = { enabled = false } },
      auth = { enabled = false }
    })
  ]
}

4. Grafana Dashboard (JSON Excerpt)

{
  "title": "Codex Proxy Overview",
  "panels": [
    { "type": "timeseries", "title": "Requests/min", "targets": [{ "expr": "sum(rate(llm_requests_total[1m]))" }] },
    { "type": "timeseries", "title": "Avg Latency (s)", "targets": [{ "expr": "rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m])" }] },
    { "type": "timeseries", "title": "Cost USD (1h)", "targets": [{ "expr": "sum(increase(llm_cost_usd_total[1h])) by (tenant)" }] }
  ]
}
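
The queries above only work if the proxy exports counters carrying a tenant label. As a rough sketch of what that export could look like, the class below keeps per-tenant counts and renders them in the Prometheus text exposition format. TenantCounter is a hypothetical name; the metric names are the ones used in the dashboard. A real deployment would more likely rely on an existing metrics client library.

```typescript
// Minimal counter keyed by tenant, rendered in the Prometheus text exposition format.
class TenantCounter {
  private readonly name: string
  private readonly help: string
  private readonly values = new Map<string, number>()

  constructor(name: string, help: string) {
    this.name = name
    this.help = help
  }

  // Add `by` to the counter for one tenant.
  inc(tenant: string, by: number = 1): void {
    this.values.set(tenant, (this.values.get(tenant) ?? 0) + by)
  }

  // One sample line per tenant, preceded by the HELP and TYPE lines.
  render(): string {
    const lines = [`# HELP ${this.name} ${this.help}`, `# TYPE ${this.name} counter`]
    this.values.forEach((value, tenant) => {
      // Escape backslash, double quote, and newline in the label value
      const label = tenant.replace(/\\/g, '\\\\').replace(/"/g, '\\"').replace(/\n/g, '\\n')
      lines.push(`${this.name}{tenant="${label}"} ${value}`)
    })
    return lines.join('\n') + '\n'
  }
}

export const llmRequests = new TenantCounter('llm_requests_total', 'Total LLM requests')
export const llmCostUsd = new TenantCounter('llm_cost_usd_total', 'Accumulated LLM cost in USD')
```

A /metrics handler could return llmRequests.render() + llmCostUsd.render() with a text/plain content type, and the request path would call inc(tenant.id) after each upstream call.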

5. Usage

  1. Place charts/codex-proxy in the repository
  2. Create the namespace and Secret with Terraform
  3. Install the proxy itself and Redis with helm_release
  4. Scrape /metrics from Prometheus and visualize it in Grafana

Add Ingress/Nginx/cert-manager as needed to expose the proxy externally or to add mTLS.
