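1. Introduction
In the previous articles we put in place (1) the basic implementation, (2) streaming support and a retry strategy, and (3) monitoring and metrics collection. This article covers multi-tenancy and access control (AuthN/AuthZ), and walks through design and implementation patterns for a proxy that multiple teams and projects can share safely.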
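2. Requirements
- Tenant isolation: keep the requests, logs, metrics, and budgets of tenants A and B clearly separated
- Authentication/authorization: authenticate with an API key, a JWT, or mTLS; control allowed models, maximum tokens, regions, and so on per tenant
- Cost/rate control: per-tenant daily and monthly caps, rate limits, quotas, and per-model unit pricing
- Audit/observability: attach a tenant label to every metric and log line, and use it for dashboards and alerts
- Key management: issuance, rotation, and revocation (immediate invalidation)
- Data protection: handling of PII and confidential data, with masking/anonymization options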
3. Architecture overview
4. Data model (PostgreSQL example)
-- tenants: basic tenant information
CREATE TABLE tenants (
  id UUID PRIMARY KEY,
  name TEXT NOT NULL UNIQUE,
  status TEXT NOT NULL DEFAULT 'active', -- active/suspended
  daily_budget_usd NUMERIC(12,4) DEFAULT 50.0,
  monthly_budget_usd NUMERIC(12,4) DEFAULT 500.0,
  created_at TIMESTAMP DEFAULT now()
);
-- api_keys: authentication keys (multiple keys per tenant, rotation-friendly)
CREATE TABLE api_keys (
  id UUID PRIMARY KEY,
  tenant_id UUID REFERENCES tenants(id),
  key_hash TEXT NOT NULL UNIQUE, -- store only the hash; UNIQUE also indexes the lookup by hash
  label TEXT,
  expires_at TIMESTAMP,
  revoked BOOLEAN DEFAULT FALSE,
  created_at TIMESTAMP DEFAULT now()
);
CREATE INDEX ON api_keys(tenant_id);
-- policies: authorization and limits (allowed models, max_tokens, and so on)
CREATE TABLE policies (
  id UUID PRIMARY KEY,
  tenant_id UUID REFERENCES tenants(id),
  model_allowlist TEXT[] NOT NULL, -- e.g. ['gpt-4.1-mini','gpt-oss']
  max_tokens INTEGER DEFAULT 4096,
  regions TEXT[] DEFAULT ARRAY['*'],
  created_at TIMESTAMP DEFAULT now()
);
-- quotas: rate and concurrency limits
CREATE TABLE quotas (
  id UUID PRIMARY KEY,
  tenant_id UUID REFERENCES tenants(id),
  rpm INTEGER DEFAULT 120, -- requests per minute
  rps INTEGER DEFAULT 5, -- requests per second
  concurrent_limit INTEGER DEFAULT 4,
  created_at TIMESTAMP DEFAULT now()
);
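The status, expires_at, and revoked columns imply a validity check whenever a key is looked up. The following is a minimal sketch of that check, assuming row shapes that mirror the tables above; the TenantRow and ApiKeyRow types and the isKeyUsable function are illustrative names, not part of the original repository code.

```typescript
// Hypothetical row shapes mirroring the tenants / api_keys tables.
type TenantRow = { id: string; status: "active" | "suspended" };
type ApiKeyRow = {
  tenantId: string;
  keyHash: string;
  expiresAt: Date | null; // null means the key never expires
  revoked: boolean;
};

// Returns true only when the key belongs to the tenant, the tenant is active,
// and the key is neither revoked nor expired at `now`.
export function isKeyUsable(key: ApiKeyRow, tenant: TenantRow, now: Date = new Date()): boolean {
  if (key.tenantId !== tenant.id) return false;
  if (tenant.status !== "active") return false;
  if (key.revoked) return false;
  if (key.expiresAt !== null && key.expiresAt.getTime() <= now.getTime()) return false;
  return true;
}
```

A lookup such as getTenantByApiKeyHash could apply the same conditions in its SQL WHERE clause instead, which avoids loading unusable keys at all.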
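5. Choosing an authentication method
- API key (simple, machine-oriented): Authorization: Bearer <key>; store only a hash in the DB and rotate regularly
- JWT (mixed human/service callers, carries permissions): verify the signature, then apply RBAC from the tenant / roles claims
- mTLS (zero trust, internal only): map the client certificate's CN to a tenant
- IP allowlist (supplementary): strict control over source addresses
Start small with API keys, and keep the design open to extension to JWT/mTLS later.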
6. Middleware implementation (TypeScript/Hono)
6.1 Authentication (API key/JWT)
// src/auth.ts
import { createHash } from 'crypto'
import { getTenantByApiKeyHash, getTenantByJwt } from './repo'
export async function authMiddleware(c: any, next: any) {
  const auth = c.req.header('authorization') || ''
  const token = auth.startsWith('Bearer ') ? auth.slice(7) : null
  // Try the API key first: hash the presented key and look it up
  if (token) {
    const keyHash = createHash('sha256').update(token).digest('hex')
    const tenant = await getTenantByApiKeyHash(keyHash)
    if (tenant) { c.set('tenant', tenant); return next() }
  }
  // JWT (example): read from a separate header and resolve the tenant from its claims
  const jwt = c.req.header('x-jwt')
  if (jwt) {
    const tenant = await getTenantByJwt(jwt)
    if (tenant) { c.set('tenant', tenant); return next() }
  }
  return c.json({ error: 'Unauthorized' }, 401)
}
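The middleware leaves getTenantByJwt undefined. As one illustration of the signature check it would need, here is a minimal sketch of HS256 verification using only Node's crypto module. The claim names (tenant, roles, exp), the shared-secret setup, and both function names are assumptions for this example; a production proxy would more likely rely on a maintained JWT library and asymmetric keys.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical claim set: `tenant` identifies the tenant, `roles` feeds RBAC.
type TenantClaims = { tenant: string; roles?: string[]; exp?: number };

function b64url(text: string): string {
  return Buffer.from(text, "utf8").toString("base64url");
}

// Verifies an HS256-signed JWT and returns its claims, or null when anything is off.
export function verifyHs256(
  token: string,
  secret: string,
  nowSec: number = Math.floor(Date.now() / 1000),
): TenantClaims | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [headerB64, payloadB64, sigB64] = parts;
  try {
    const header = JSON.parse(Buffer.from(headerB64, "base64url").toString("utf8"));
    if (header.alg !== "HS256") return null; // reject "none" and unexpected algorithms
    const expected = createHmac("sha256", secret).update(`${headerB64}.${payloadB64}`).digest();
    const actual = Buffer.from(sigB64, "base64url");
    // Constant-time comparison; lengths must match before calling timingSafeEqual
    if (actual.length !== expected.length || !timingSafeEqual(actual, expected)) return null;
    const claims = JSON.parse(Buffer.from(payloadB64, "base64url").toString("utf8")) as TenantClaims;
    if (typeof claims.tenant !== "string") return null;
    if (typeof claims.exp === "number" && claims.exp <= nowSec) return null; // expired
    return claims;
  } catch {
    return null; // malformed base64 or JSON
  }
}

// Helper used only to build a token for trying out the verifier.
export function signHs256(claims: TenantClaims, secret: string): string {
  const head = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const body = b64url(JSON.stringify(claims));
  const sig = createHmac("sha256", secret).update(`${head}.${body}`).digest("base64url");
  return `${head}.${body}.${sig}`;
}
```

With something like this in place, getTenantByJwt could verify the token first and then load the tenant named in the tenant claim.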
6.2 Authorization (RBAC / model allowlist)
// src/rbac.ts
import { getPolicyByTenant } from './repo'
export async function rbacMiddleware(c: any, next: any) {
  const tenant = c.get('tenant')
  const body = await c.req.json()
  c.req.bodyCache = body // cache the parsed body so later handlers can reuse it
  const policy = await getPolicyByTenant(tenant.id)
  // Deny when the tenant has no policy or the model is not on its allowlist
  if (!policy || !policy.model_allowlist.includes(body.model)) {
    return c.json({ error: 'model_not_allowed' }, 403)
  }
  // Clamp max_tokens to the tenant's ceiling
  if (body.max_tokens && body.max_tokens > policy.max_tokens) {
    body.max_tokens = policy.max_tokens
  }
  return next()
}
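The policy decision itself is easier to unit-test when separated from the HTTP layer. The sketch below extracts it into a pure function; the Policy, ChatBody, and Decision types and the applyPolicy name are assumptions for illustration. Unlike the middleware, it also sets max_tokens when the client omits it, on the assumption that the provider default should not be allowed to exceed the tenant's ceiling.

```typescript
// Hypothetical shapes; field names follow the policies table in camelCase.
type Policy = { modelAllowlist: string[]; maxTokens: number };
type ChatBody = { model: string; max_tokens?: number; [key: string]: unknown };
type Decision =
  | { ok: true; body: ChatBody }
  | { ok: false; status: 403; error: string };

// Pure policy check: no I/O, returns a new body instead of mutating the input.
export function applyPolicy(body: ChatBody, policy: Policy): Decision {
  if (!policy.modelAllowlist.includes(body.model)) {
    return { ok: false, status: 403, error: "model_not_allowed" };
  }
  // Use the requested value when present, otherwise fall back to the ceiling,
  // then clamp so the result never exceeds the policy.
  const requested = typeof body.max_tokens === "number" ? body.max_tokens : policy.maxTokens;
  return { ok: true, body: { ...body, max_tokens: Math.min(requested, policy.maxTokens) } };
}
```

The middleware would then translate a failed Decision into c.json({ error }, status) and pass the returned body downstream.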
6.3 Guardrails (prompt inspection and size limits)
// src/guardrails.ts
export function sanitizePrompt(text: string) {
  // Simple masking of sensitive values (e.g. employee IDs, credit card numbers) goes here
  return text.replace(/\b\d{4}-\d{4}-\d{4}-\d{4}\b/g, '****-****-****-****')
}
export function enforcePromptLimits(body: any, maxChars = 20000) {
  // Check every user message, not only the first one
  for (const msg of body.messages ?? []) {
    if (msg.role !== 'user') continue
    const content = typeof msg.content === 'string' ? msg.content : JSON.stringify(msg.content)
    if (content.length > maxChars) {
      throw new Error('prompt_too_long')
    }
    // Rewrite only plain-string content so structured content keeps its shape
    if (typeof msg.content === 'string') msg.content = sanitizePrompt(msg.content)
  }
}
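The pattern in sanitizePrompt only matches 16-digit numbers written with hyphens. One possible refinement, sketched here as a heuristic rather than a complete PII filter, is to accept spaces or no separators and to mask only digit runs that pass the Luhn checksum, which reduces false positives on order numbers and similar IDs. The passesLuhn and maskCardNumbers names are illustrative.

```typescript
// Luhn checksum: true for digit strings that pass the check used by payment card numbers.
export function passesLuhn(digits: string): boolean {
  let sum = 0;
  let doubleNext = false;
  for (let i = digits.length - 1; i >= 0; i--) {
    let d = digits.charCodeAt(i) - 48; // '0' is char code 48
    if (doubleNext) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    doubleNext = !doubleNext;
  }
  return digits.length > 0 && sum % 10 === 0;
}

// Masks runs of 13 to 19 digits (optionally separated by single spaces or hyphens)
// that pass the Luhn check; separators are preserved, digits become '*'.
export function maskCardNumbers(text: string): string {
  return text.replace(/\b\d(?:[ -]?\d){12,18}\b/g, (match) => {
    const digits = match.replace(/[ -]/g, "");
    return passesLuhn(digits) ? match.replace(/\d/g, "*") : match;
  });
}
```

A number directly preceded by other digits and a space can be merged into one longer run and escape masking, so this should be treated as a best-effort layer.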
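7. Rate limiting / quotas (Redis + Token Bucket)
// src/rate.ts
import { RateLimiterRedis } from 'rate-limiter-flexible'
import { createClient } from 'redis'
const redis = createClient({ url: process.env.REDIS_URL })
await redis.connect()
export const limiterPerTenant = new RateLimiterRedis({
  storeClient: redis,
  useRedisPackage: true, // required when storeClient comes from the `redis` (node-redis v4+) package
  keyPrefix: 'rl_tenant',
  points: 120, // 120 requests per minute (same as the quotas.rpm default)
  duration: 60,
})
export async function rateMiddleware(c: any, next: any) {
  const tenant = c.get('tenant')
  try {
    await limiterPerTenant.consume(tenant.id)
  } catch (err) {
    // A real failure (e.g. Redis is down) rejects with an Error; a limit hit rejects with a result object
    if (err instanceof Error) throw err
    return c.json({ error: 'too_many_requests' }, 429)
  }
  return next()
}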
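For reference, the token bucket named in the heading can be sketched in a few lines. This version is in-memory and single-process, independent of rate-limiter-flexible and Redis, and meant only to show the refill arithmetic; the TokenBucket class and its parameters are assumptions for illustration.

```typescript
// Minimal in-memory token bucket: holds up to `capacity` tokens and refills
// continuously at `refillPerSec` tokens per second.
export class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSec: number;
  private tokens: number;
  private lastMs: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastMs = nowMs;
  }

  // Returns true and deducts `cost` tokens when enough are available.
  tryConsume(cost: number = 1, nowMs: number = Date.now()): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastMs = nowMs;
    if (this.tokens < cost) return false;
    this.tokens -= cost;
    return true;
  }
}

// One bucket per tenant, created lazily on first use.
const buckets = new Map<string, TokenBucket>();
export function allowRequest(tenantId: string, rpm: number, nowMs: number = Date.now()): boolean {
  let bucket = buckets.get(tenantId);
  if (!bucket) {
    bucket = new TokenBucket(rpm, rpm / 60, nowMs);
    buckets.set(tenantId, bucket);
  }
  return bucket.tryConsume(1, nowMs);
}
```

With several proxy replicas the bucket state has to live in shared storage such as Redis, which is why the article's implementation delegates to a Redis-backed limiter.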
8. Routing and execution (OpenAI/Ollama)
// src/routes.ts
import { Context } from 'hono'
import { enforcePromptLimits } from './guardrails'
import { callOpenAI } from './providers/openai'
import { callOllama } from './providers/ollama'
export async function chatCompletions(c: Context) {
  const tenant = c.get('tenant')
  const body = c.req.bodyCache || await c.req.json()
  // enforcePromptLimits throws when a prompt is too long; answer with a client error instead of a 500
  try {
    enforcePromptLimits(body, 20000)
  } catch {
    return c.json({ error: 'prompt_too_long' }, 413)
  }
  // Route by model name: gpt-oss / ollama models go to Ollama, everything else to OpenAI
  const isOllama = body.model.includes('gpt-oss') || body.model.includes('ollama')
  return isOllama ? callOllama(c, body, tenant) : callOpenAI(c, body, tenant)
}
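Substring matching on the model name can misroute a model whose name merely contains gpt-oss or ollama. One alternative, sketched here with a hypothetical routing table, is to resolve the provider by exact model name and treat unknown models as an error. The table contents reuse the model names that appear in this article; resolveProvider is an illustrative name.

```typescript
type Provider = "openai" | "ollama";

// Hypothetical routing table keyed by exact model name.
const MODEL_PROVIDERS = new Map<string, Provider>([
  ["gpt-oss", "ollama"],
  ["gpt-4.1-mini", "openai"],
]);

// Returns the provider for an exact model name, or null when the model is unknown.
export function resolveProvider(model: string): Provider | null {
  return MODEL_PROVIDERS.get(model) ?? null;
}
```

The handler could then return a 400 for a null result instead of silently falling through to OpenAI.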
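9. Key issuance, rotation, and revocation
9.1 Issuance flow (admin API)
- POST /admin/tenants/:id/api-keys → generate a new key, store its hash, and show the plaintext key only once
- POST /admin/api-keys/:id/revoke → revoke immediately
- POST /admin/api-keys/:id/rotate → migrate while the old and new keys overlap for a grace period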
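The three operations above can be sketched as pure functions over a key record. This is a minimal illustration, assuming an in-memory StoredKey shape that mirrors the api_keys table; the px_ prefix, the function names, and the grace-period parameter are all assumptions, and persistence is left out.

```typescript
import { createHash, randomBytes, randomUUID } from "node:crypto";

// Hypothetical in-memory shape of an api_keys row.
type StoredKey = {
  id: string;
  keyHash: string;
  last4: string;
  expiresAt: Date | null;
  revoked: boolean;
};

export function sha256Hex(value: string): string {
  return createHash("sha256").update(value).digest("hex");
}

// Issue: the plaintext is returned once; only its hash and last 4 characters are kept.
export function issueKey(): { plaintext: string; stored: StoredKey } {
  const plaintext = "px_" + randomBytes(32).toString("hex"); // "px_" prefix is an assumption
  const stored: StoredKey = {
    id: randomUUID(),
    keyHash: sha256Hex(plaintext),
    last4: plaintext.slice(-4),
    expiresAt: null,
    revoked: false,
  };
  return { plaintext, stored };
}

// Revoke: flips the flag; immediate effect assumes the lookup checks it on every request.
export function revokeKey(key: StoredKey): StoredKey {
  return { ...key, revoked: true };
}

// Rotate: issue a replacement and let the old key expire after the grace period,
// keeping an earlier expiry if the old key already had one.
export function rotateKey(old: StoredKey, graceMs: number, now: Date = new Date()) {
  const graceEnd = new Date(now.getTime() + graceMs);
  const keepEarlier = old.expiresAt !== null && old.expiresAt.getTime() < graceEnd.getTime();
  const retiring: StoredKey = { ...old, expiresAt: keepEarlier ? old.expiresAt : graceEnd };
  return { retiring, ...issueKey() };
}
```

During the grace period both the retiring and the new key authenticate, so clients can switch over without downtime.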
9.2 OpenAPI snippet
paths:
  /admin/tenants/{id}/api-keys:
    post:
      summary: Issue new API key
  /admin/api-keys/{id}/revoke:
    post:
      summary: Revoke API key
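10. Tying multi-tenancy to cost management
- Attach a tenant label to every metric
- When a budget overrun is projected, automatically downgrade the model or shrink max_tokens
- Deliver a daily billing summary via Slack/email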
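The automatic downgrade can be expressed as a small decision function. The sketch below is one possible policy, not the article's implementation: the 80% soft threshold, the halving of max_tokens, the gpt-oss fallback, and the names adjustForBudget and BudgetState are all assumptions chosen for illustration.

```typescript
type BudgetState = { spentUsd: number; dailyBudgetUsd: number };
type Adjusted = { model: string; maxTokens: number; action: "pass" | "downgrade" | "reject" };

// Decides how to treat a request given today's spend:
// - below the soft threshold: pass through unchanged
// - at or above the soft threshold: switch to the fallback model and halve max_tokens
// - at or above the budget (or no budget at all): reject
export function adjustForBudget(
  model: string,
  maxTokens: number,
  state: BudgetState,
  fallbackModel: string = "gpt-oss",
  softRatio: number = 0.8,
): Adjusted {
  const ratio = state.dailyBudgetUsd > 0 ? state.spentUsd / state.dailyBudgetUsd : 1;
  if (ratio >= 1) return { model, maxTokens: 0, action: "reject" };
  if (ratio >= softRatio) {
    return { model: fallbackModel, maxTokens: Math.max(1, Math.floor(maxTokens / 2)), action: "downgrade" };
  }
  return { model, maxTokens, action: "pass" };
}
```

The spend figure would come from the per-tenant cost metric collected in the previous article, and the fallback model still has to be on the tenant's allowlist.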
11. Deployment / configuration management (Helm values example)
image: your/proxy:latest
env:
  OPENAI_BASE_URL: https://api.openai.com/v1
  OLLAMA_BASE_URL: http://ollama:11434/v1
  REDIS_URL: redis://redis:6379
  PROXY_API_KEY_SIGNING: "..."
resources:
  limits: { cpu: "500m", memory: "512Mi" }
  requests: { cpu: "100m", memory: "256Mi" }
service:
  type: ClusterIP
  port: 8787
12. Test examples (curl)
# Use gpt-oss with tenant A's API key
curl -H "Authorization: Bearer <TENANT_A_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-oss","messages":[{"role":"user","content":"Write a haiku."}]}' \
  http://localhost:8787/v1/chat/completions
# Try a model that is not allowed (expect 403)
curl -H "Authorization: Bearer <TENANT_A_KEY>" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"..."}]}' \
  http://localhost:8787/v1/chat/completions
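13. Summary
- Design tenant isolation, authentication/authorization, rate limits/quotas, auditing, and key management as one package
- Start small and harden in stages (API key → JWT/mTLS, static limits → dynamic policies)
- Combine this with the monitoring and cost visibility from the previous article to operate safely and within budget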
14. Sample: Codex Proxy Admin UI Mock
Code sample
import React, { useMemo, useState } from "react";
// Admin UI mock (single-file version)
// - Tabs: Tenants / API Keys / Policies
// - Modals: key issuance, policy editing
// - Dummy data: generated with useMemo
// - Styling: Tailwind (can be previewed in ChatGPT Canvas's React preview)
export default function AdminConsole() {
type Tenant = { id: string; name: string; status: "active" | "suspended"; daily: number; monthly: number };
type ApiKey = { id: string; tenantId: string; label: string; last4: string; expiresAt?: string; revoked?: boolean };
type Policy = { id: string; tenantId: string; allow: string[]; maxTokens: number; regions: string[] };
const [activeTab, setActiveTab] = useState<"tenants" | "keys" | "policies">("tenants");
const [showKeyModal, setShowKeyModal] = useState(false);
const [showPolicyModal, setShowPolicyModal] = useState<null | Policy>(null);
const tenants = useMemo<Tenant[]>(() => [
{ id: "tnt_a", name: "Team A", status: "active", daily: 50, monthly: 500 },
{ id: "tnt_b", name: "Team B", status: "active", daily: 75, monthly: 750 },
{ id: "tnt_c", name: "Team C", status: "suspended", daily: 0, monthly: 0 },
], []);
const keys = useMemo<ApiKey[]>(() => [
{ id: "key1", tenantId: "tnt_a", label: "backend-ci", last4: "9F3C", expiresAt: "2025-12-31" },
{ id: "key2", tenantId: "tnt_a", label: "local-dev", last4: "1A2B" },
{ id: "key3", tenantId: "tnt_b", label: "ml-job", last4: "77C1", revoked: true },
], []);
const policies = useMemo<Policy[]>(() => [
{ id: "pol_a", tenantId: "tnt_a", allow: ["gpt-oss", "gpt-4.1-mini"], maxTokens: 4096, regions: ["*"] },
{ id: "pol_b", tenantId: "tnt_b", allow: ["gpt-oss"], maxTokens: 2048, regions: ["ap-northeast-1"] },
], []);
const [filterTenant, setFilterTenant] = useState<string>("all");
const filteredKeys = keys.filter(k => filterTenant === "all" || k.tenantId === filterTenant);
const filteredPolicies = policies.filter(p => filterTenant === "all" || p.tenantId === filterTenant);
return (
<div className="min-h-screen bg-gray-50 p-6">
<header className="mb-6">
<h1 className="text-2xl font-bold">Codex Proxy Admin Console</h1>
<p className="text-gray-600 text-sm">Operations mock for tenants, API keys, and policies. In a real implementation, replace the dummy data with calls to the backend API.</p>
</header>
<div className="mb-4 flex items-center gap-3">
<div className="inline-flex rounded-xl shadow-sm overflow-hidden border bg-white">
<button onClick={() => setActiveTab("tenants")} className={`px-4 py-2 text-sm ${activeTab === "tenants" ? "bg-gray-900 text-white" : "text-gray-700 hover:bg-gray-100"}`}>Tenants</button>
<button onClick={() => setActiveTab("keys")} className={`px-4 py-2 text-sm ${activeTab === "keys" ? "bg-gray-900 text-white" : "text-gray-700 hover:bg-gray-100"}`}>API Keys</button>
<button onClick={() => setActiveTab("policies")} className={`px-4 py-2 text-sm ${activeTab === "policies" ? "bg-gray-900 text-white" : "text-gray-700 hover:bg-gray-100"}`}>Policies</button>
</div>
<div className="ml-auto flex items-center gap-2">
<select value={filterTenant} onChange={(e) => setFilterTenant(e.target.value)} className="border rounded-lg px-3 py-2 text-sm bg-white">
<option value="all">All Tenants</option>
{tenants.map(t => <option key={t.id} value={t.id}>{t.name}</option>)}
</select>
{activeTab === "keys" && (
<button onClick={() => setShowKeyModal(true)} className="px-3 py-2 text-sm rounded-lg bg-blue-600 text-white shadow">Issue key</button>
)}
</div>
</div>
{activeTab === "tenants" && (
<section className="bg-white rounded-2xl shadow p-4">
<h2 className="text-lg font-semibold mb-3">Tenants</h2>
<div className="overflow-x-auto">
<table className="min-w-full text-sm">
<thead>
<tr className="text-left text-gray-500 border-b">
<th className="py-2 pr-4">Name</th>
<th className="py-2 pr-4">Status</th>
<th className="py-2 pr-4">Daily Budget (USD)</th>
<th className="py-2 pr-4">Monthly Budget (USD)</th>
<th className="py-2 pr-4">Actions</th>
</tr>
</thead>
<tbody>
{tenants.map(t => (
<tr key={t.id} className="border-b last:border-0">
<td className="py-2 pr-4 font-medium">{t.name}</td>
<td className="py-2 pr-4">
<span className={`px-2 py-1 rounded text-xs ${t.status === "active" ? "bg-green-100 text-green-700" : "bg-red-100 text-red-700"}`}>{t.status}</span>
</td>
<td className="py-2 pr-4">{t.daily.toFixed(2)}</td>
<td className="py-2 pr-4">{t.monthly.toFixed(2)}</td>
<td className="py-2 pr-4 flex gap-2">
<button className="px-2 py-1 text-xs rounded bg-gray-100">Edit</button>
<button className="px-2 py-1 text-xs rounded bg-gray-100">Suspend</button>
</td>
</tr>
))}
</tbody>
</table>
</div>
</section>
)}
{activeTab === "keys" && (
<section className="bg-white rounded-2xl shadow p-4">
<h2 className="text-lg font-semibold mb-3">API Keys</h2>
<div className="overflow-x-auto">
<table className="min-w-full text-sm">
<thead>
<tr className="text-left text-gray-500 border-b">
<th className="py-2 pr-4">Tenant</th>
<th className="py-2 pr-4">Label</th>
<th className="py-2 pr-4">Key (last4)</th>
<th className="py-2 pr-4">Expires</th>
<th className="py-2 pr-4">Status</th>
<th className="py-2 pr-4">Actions</th>
</tr>
</thead>
<tbody>
{filteredKeys.map(k => {
const t = tenants.find(x => x.id === k.tenantId)!;
return (
<tr key={k.id} className="border-b last:border-0">
<td className="py-2 pr-4">{t.name}</td>
<td className="py-2 pr-4 font-medium">{k.label}</td>
<td className="py-2 pr-4">•••• •••• •••• {k.last4}</td>
<td className="py-2 pr-4">{k.expiresAt ?? "-"}</td>
<td className="py-2 pr-4">{k.revoked ? <span className="text-red-600">revoked</span> : "active"}</td>
<td className="py-2 pr-4 flex gap-2">
<button className="px-2 py-1 text-xs rounded bg-gray-100">Reissue</button>
<button className="px-2 py-1 text-xs rounded bg-gray-100">Revoke</button>
</td>
</tr>
);
})}
</tbody>
</table>
</div>
</section>
)}
{activeTab === "policies" && (
<section className="bg-white rounded-2xl shadow p-4">
<h2 className="text-lg font-semibold mb-3">Policies</h2>
<div className="overflow-x-auto">
<table className="min-w-full text-sm">
<thead>
<tr className="text-left text-gray-500 border-b">
<th className="py-2 pr-4">Tenant</th>
<th className="py-2 pr-4">Allow Models</th>
<th className="py-2 pr-4">max_tokens</th>
<th className="py-2 pr-4">Regions</th>
<th className="py-2 pr-4">Actions</th>
</tr>
</thead>
<tbody>
{filteredPolicies.map(p => {
const t = tenants.find(x => x.id === p.tenantId)!;
return (
<tr key={p.id} className="border-b last:border-0">
<td className="py-2 pr-4">{t.name}</td>
<td className="py-2 pr-4">{p.allow.join(", ")}</td>
<td className="py-2 pr-4">{p.maxTokens}</td>
<td className="py-2 pr-4">{p.regions.join(", ")}</td>
<td className="py-2 pr-4 flex gap-2">
<button className="px-2 py-1 text-xs rounded bg-gray-100" onClick={() => setShowPolicyModal(p)}>Edit</button>
</td>
</tr>
);
})}
</tbody>
</table>
</div>
</section>
)}
{showKeyModal && (
<div className="fixed inset-0 bg-black/40 flex items-center justify-center p-4">
<div className="bg-white rounded-2xl shadow-xl w-full max-w-md p-4">
<h3 className="text-lg font-semibold mb-3">Issue key</h3>
<div className="space-y-3">
<label className="block text-sm">Tenant</label>
<select className="w-full border rounded-lg px-3 py-2 text-sm bg-white">
{tenants.map(t => <option key={t.id} value={t.id}>{t.name}</option>)}
</select>
<label className="block text-sm">Label</label>
<input className="w-full border rounded-lg px-3 py-2 text-sm" placeholder="backend-ci" />
</div>
<div className="mt-4 flex justify-end gap-2">
<button className="px-3 py-2 text-sm rounded-lg bg-gray-100" onClick={() => setShowKeyModal(false)}>Cancel</button>
<button className="px-3 py-2 text-sm rounded-lg bg-blue-600 text-white">Issue</button>
</div>
</div>
</div>
)}
{showPolicyModal && (
<div className="fixed inset-0 bg-black/40 flex items-center justify-center p-4">
<div className="bg-white rounded-2xl shadow-xl w-full max-w-lg p-4">
<h3 className="text-lg font-semibold mb-3">Edit policy</h3>
<div className="grid grid-cols-1 md:grid-cols-2 gap-3">
<div>
<label className="block text-sm">Allow Models</label>
<input defaultValue={showPolicyModal.allow.join(", ")} className="w-full border rounded-lg px-3 py-2 text-sm" />
</div>
<div>
<label className="block text-sm">max_tokens</label>
<input type="number" defaultValue={showPolicyModal.maxTokens} className="w-full border rounded-lg px-3 py-2 text-sm" />
</div>
<div className="md:col-span-2">
<label className="block text-sm">Regions</label>
<input defaultValue={showPolicyModal.regions.join(", ")} className="w-full border rounded-lg px-3 py-2 text-sm" />
</div>
</div>
<div className="mt-4 flex justify-end gap-2">
<button className="px-3 py-2 text-sm rounded-lg bg-gray-100" onClick={() => setShowPolicyModal(null)}>Cancel</button>
<button className="px-3 py-2 text-sm rounded-lg bg-blue-600 text-white" onClick={() => setShowPolicyModal(null)}>Save</button>
</div>
</div>
</div>
)}
</div>
);
}
15. Sample: Codex Proxy infrastructure templates (Terraform / Helm)
1. Prerequisites
- The self-built proxy runs on Kubernetes
- Prometheus collects the metrics and Grafana visualizes them
- Redis is used for rate limiting
2. Helm chart skeleton
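charts/codex-proxy/
  Chart.yaml
  values.yaml
  templates/
    _helpers.tpl          # defines the codex-proxy.name / codex-proxy.fullname helpers used by the templates
    deployment.yaml
    service.yaml
    configmap-env.yaml
    secret-env.yaml
    servicemonitor.yaml   # when using Prometheus Operator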
Chart.yaml
apiVersion: v2
name: codex-proxy
version: 0.1.0
appVersion: "latest"
values.yaml (example)
image:
  repository: ghcr.io/your-org/codex-proxy
  tag: latest
  pullPolicy: IfNotPresent
replicaCount: 2
service:
  type: ClusterIP
  port: 8787
resources:
  limits:
    cpu: 500m
    memory: 512Mi
  requests:
    cpu: 100m
    memory: 256Mi
env:
  OPENAI_BASE_URL: https://api.openai.com/v1
  OLLAMA_BASE_URL: http://ollama:11434/v1
  REDIS_URL: redis://redis:6379
  # PROXY_API_KEY is injected via a Secret
prometheus:
  enabled: true
  scrape: true
templates/deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "codex-proxy.fullname" . }}
spec:
  replicas: {{ .Values.replicaCount }}
  selector:
    matchLabels:
      app: {{ include "codex-proxy.name" . }}
  template:
    metadata:
      labels:
        app: {{ include "codex-proxy.name" . }}
      annotations:
        {{- if .Values.prometheus.scrape }}
        prometheus.io/scrape: "true"
        prometheus.io/port: "{{ .Values.service.port }}"
        prometheus.io/path: "/metrics"
        {{- end }}
    spec:
      containers:
        - name: proxy
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          ports:
            - containerPort: {{ .Values.service.port }}
          env:
            - name: OPENAI_BASE_URL
              value: {{ .Values.env.OPENAI_BASE_URL | quote }}
            - name: OLLAMA_BASE_URL
              value: {{ .Values.env.OLLAMA_BASE_URL | quote }}
            - name: REDIS_URL
              value: {{ .Values.env.REDIS_URL | quote }}
            - name: PROXY_API_KEY
              valueFrom:
                secretKeyRef:
                  name: {{ include "codex-proxy.fullname" . }}-secret
                  key: PROXY_API_KEY
          resources:
            {{- toYaml .Values.resources | nindent 12 }}
          readinessProbe:
            httpGet: { path: /healthz, port: {{ .Values.service.port }} }
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet: { path: /healthz, port: {{ .Values.service.port }} }
            initialDelaySeconds: 10
            periodSeconds: 20
templates/service.yaml
apiVersion: v1
kind: Service
metadata:
  name: {{ include "codex-proxy.fullname" . }}
  # A ServiceMonitor selects Services by label, so the Service itself needs the app label
  labels:
    app: {{ include "codex-proxy.name" . }}
spec:
  type: {{ .Values.service.type }}
  selector:
    app: {{ include "codex-proxy.name" . }}
  ports:
    - port: {{ .Values.service.port }}
      targetPort: {{ .Values.service.port }}
      protocol: TCP
      name: http
templates/secret-env.yaml
apiVersion: v1
kind: Secret
metadata:
  name: {{ include "codex-proxy.fullname" . }}-secret
stringData:
  PROXY_API_KEY: "change-me"
templates/servicemonitor.yaml (Prometheus Operator)
{{- if .Values.prometheus.enabled }}
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: {{ include "codex-proxy.fullname" . }}
spec:
  selector:
    matchLabels:
      app: {{ include "codex-proxy.name" . }}
  endpoints:
    - port: http
      path: /metrics
      interval: 15s
{{- end }}
3. Terraform templates
3.1 Kubernetes provider and Secret/ConfigMap
terraform {
  required_providers {
    kubernetes = {
      source  = "hashicorp/kubernetes"
      version = ">= 2.23.0"
    }
    helm = {
      source  = "hashicorp/helm"
      # The nested kubernetes { } block below is the 2.x provider syntax
      version = ">= 2.12.1, < 3.0.0"
    }
  }
}
provider "kubernetes" {
  config_path = var.kubeconfig
}
provider "helm" {
  kubernetes {
    config_path = var.kubeconfig
  }
}
variable "kubeconfig" { type = string }
variable "namespace" {
  type    = string
  default = "codex"
}
variable "proxy_api_key" {
  type      = string
  sensitive = true
}
resource "kubernetes_namespace" "codex" {
  metadata { name = var.namespace }
}
resource "kubernetes_secret" "proxy" {
  metadata {
    name      = "codex-proxy-secret"
    namespace = kubernetes_namespace.codex.metadata[0].name
  }
  # The provider base64-encodes `data` values itself, so pass the plain value
  data = {
    PROXY_API_KEY = var.proxy_api_key
  }
}
3.2 Deploying the chart with Helm
resource "helm_release" "codex_proxy" {
  name      = "codex-proxy"
  namespace = var.namespace
  chart     = "./charts/codex-proxy"
  values = [
    yamlencode({
      image = {
        repository = "ghcr.io/your-org/codex-proxy"
        tag        = "latest"
        pullPolicy = "IfNotPresent"
      }
      env = {
        OPENAI_BASE_URL = "https://api.openai.com/v1"
        OLLAMA_BASE_URL = "http://ollama.codex:11434/v1"
        # The Bitnami chart in standalone mode exposes Redis as <release>-master
        REDIS_URL = "redis://redis-master.codex:6379"
      }
      service = { port = 8787 }
      resources = {
        limits   = { cpu = "500m", memory = "512Mi" }
        requests = { cpu = "100m", memory = "256Mi" }
      }
      prometheus = { enabled = true, scrape = true }
    })
  ]
}
3.3 Extra: Redis (Bitnami chart example)
resource "helm_release" "redis" {
  name       = "redis"
  namespace  = var.namespace
  repository = "https://charts.bitnami.com/bitnami"
  chart      = "redis"
  version    = "19.0.0"
  values = [
    yamlencode({
      architecture = "standalone",
      master       = { persistence = { enabled = false } },
      auth         = { enabled = false }
    })
  ]
}
4. Grafana dashboard (JSON excerpt)
{
  "title": "Codex Proxy Overview",
  "panels": [
    { "type": "timeseries", "title": "Requests/min", "targets": [{ "expr": "sum(rate(llm_requests_total[1m]))" }] },
    { "type": "timeseries", "title": "Avg Latency (s)", "targets": [{ "expr": "rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m])" }] },
    { "type": "timeseries", "title": "Cost USD (1h)", "targets": [{ "expr": "sum(increase(llm_cost_usd_total[1h])) by (tenant)" }] }
  ]
}
5. Usage
- Place charts/codex-proxy in your repository
- Create the namespace and Secret with Terraform
- Install the proxy and Redis with helm_release
- Scrape /metrics with Prometheus and visualize it in Grafana
Add Ingress/Nginx/cert-manager as needed to expose the service externally or to introduce mTLS.