Trying out the new OpenAI API features released alongside GPT-5

Posted at 2025-09-04

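Intro

Reception of the GPT-5 release has been mixed, but separately from model quality, the OpenAI API itself is sometimes updated when a new model ships. New features were added this time as well, and since some of them were hard to understand from the documentation alone, I tried them out.

The article I am following is here. It also covers some features that already existed.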

reasoning and verbosity settings

These are two separate concepts:

reasoning controls the thinking output
verbosity controls the answer output

verbosity

verbosity apparently did not exist before, so let's look at it first.

The official documentation describes it as follows.

High verbosity: Use when you need the model to provide thorough explanations of documents or perform extensive code refactoring.
Low verbosity: Best for situations where you want concise answers or simple code generation, such as SQL queries.

Let's look at an example right away.

verbosity=high

Input

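from openai import OpenAI

# Assumes OPENAI_API_KEY is set in the environment
client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="現在のJST時間を教えてください。",  # "Please tell me the current time in JST."
    text={"format": {"type": "text"}, "verbosity": "high"},
    tools=[
        {
            "type": "custom",
            "name": "code_exec",
            "description": "Executes python code",
        }
    ]
)
# In this run, output[1] is the custom tool call; its input field holds the generated code
print(response.output[1].input)
print("-"*20)
response.usage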

Output

# Get current time in Japan Standard Time (JST, UTC+9) using Python
from datetime import datetime, timezone, timedelta
try:
    from zoneinfo import ZoneInfo
    now_jst = datetime.now(ZoneInfo("Asia/Tokyo"))
except Exception:
    # Fallback: compute from UTC
    now_jst = datetime.now(timezone.utc) + timedelta(hours=9)

result = {
    "iso": now_jst.isoformat(),
    "date": now_jst.strftime("%Y-%m-%d"),
    "time": now_jst.strftime("%H:%M:%S"),
    "weekday_ja": ["月","火","水","木","金","土","日"][now_jst.weekday()],
    "utc_offset": "+09:00",
}

result
--------------------
ResponseUsage(input_tokens=47, 
input_tokens_details=InputTokensDetails(cached_tokens=0), 
output_tokens=437, 
output_tokens_details=OutputTokensDetails(reasoning_tokens=256), 
total_tokens=484)

verbosity=low

Input

response = client.responses.create(
    model="gpt-5",
    input="現在のJST時間を教えてください。",  # "Please tell me the current time in JST."
    text={"format": {"type": "text"},"verbosity": "low"},
    tools=[
        {
            "type": "custom",
            "name": "code_exec",
            "description": "Executes python code",
        }
    ]
)
print(response.output[1].input)
print("-"*20)
response.usage

Output

from datetime import datetime, timezone, timedelta

jst = timezone(timedelta(hours=9))
now_jst = datetime.now(jst)
now_jst.strftime("%Y-%m-%d %H:%M:%S (JST)")
--------------------
ResponseUsage(input_tokens=47, 
input_tokens_details=InputTokensDetails(cached_tokens=0), 
output_tokens=255, 
output_tokens_details=OutputTokensDetails(reasoning_tokens=192), total_tokens=302)

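Both runs are the same code generation request:

with verbosity=high, output_tokens=437 (total_tokens=484) and the code is relatively long
with verbosity=low, output_tokens=255 (total_tokens=302) and the code is relatively short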
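Since output_tokens includes reasoning tokens, the difference in the visible answer is easier to see after subtracting them. A minimal sketch using the usage figures reported in the two runs; the helper name visible_tokens is hypothetical, and it assumes reasoning_tokens is counted inside output_tokens (which is consistent with total_tokens = input_tokens + output_tokens in both runs).

```python
# Usage figures copied from the two runs above
runs = {
    "high": {"output_tokens": 437, "reasoning_tokens": 256},
    "low": {"output_tokens": 255, "reasoning_tokens": 192},
}


def visible_tokens(usage: dict) -> int:
    """Tokens left for the visible answer after subtracting reasoning tokens."""
    return usage["output_tokens"] - usage["reasoning_tokens"]


for name, usage in runs.items():
    print(name, visible_tokens(usage))
```

This prints 181 for high and 63 for low, so in these two runs the visible part of the answer shrinks by roughly two thirds while the reasoning part shrinks much less.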

reasoning

Separately from verbosity, the reasoning effort can also be set. Until now there were only low, medium, and high, but this time minimal was added below low, so I set it together with verbosity.

effort=minimal

Input

response = client.responses.create(
    model="gpt-5",
    input="現在のJST時間を教えてください。",  # "Please tell me the current time in JST."
    text={"format": {"type": "text"},"verbosity": "low"},
    reasoning = {
        "effort": "minimal"
    },
    tools=[
        {
            "type": "custom",
            "name": "code_exec",
            "description": "Executes python code",
        }
    ]
)
print(response.output[1].input)
print("-"*20)
response.usage

Output

import datetime, pytz

tz = pytz.timezone('Asia/Tokyo')
now = datetime.datetime.now(tz)
now.isoformat()
--------------------
ResponseUsage(input_tokens=47, 
input_tokens_details=InputTokensDetails(cached_tokens=0), 
output_tokens=45, 
output_tokens_details=OutputTokensDetails(reasoning_tokens=0), 
total_tokens=92)

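We can confirm that reasoning_tokens=0. The number of output tokens dropped considerably, but according to what I have read online, GPT-5 without thinking is said to be weaker than or about as capable as GPT-4.1, so in practice it is not well suited to code generation tasks. This run is only a check of the behavior.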
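Because verbosity and reasoning effort are set independently, it can help to build both settings in one place. A minimal sketch: build_gpt5_options is a hypothetical helper, not part of the OpenAI SDK, and the medium level for verbosity is my assumption since this article only exercises low and high.

```python
VERBOSITY_LEVELS = ("low", "medium", "high")  # "medium" is assumed here
EFFORT_LEVELS = ("minimal", "low", "medium", "high")


def build_gpt5_options(verbosity: str = "medium", effort: str = "medium") -> dict:
    """Return keyword arguments for client.responses.create (hypothetical helper)."""
    if verbosity not in VERBOSITY_LEVELS:
        raise ValueError(f"unknown verbosity: {verbosity!r}")
    if effort not in EFFORT_LEVELS:
        raise ValueError(f"unknown effort: {effort!r}")
    return {
        "text": {"format": {"type": "text"}, "verbosity": verbosity},
        "reasoning": {"effort": effort},
    }


# Same combination as the effort=minimal run
options = build_gpt5_options(verbosity="low", effort="minimal")
print(options)
```

The resulting dictionary can be unpacked into the call with **options next to model, input, and tools, which keeps a typo such as "minimum" from reaching the API.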

Context‑Free Grammar

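Yet another feature is output format specification. A JSON format could already be specified before, but now more complex formats can be specified with Regex or Lark.

Regex

First I define a regex, then pass in HTML taken from a stock quote page and try to get output in the form
price | price change | % change

Input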

quote_grammar_definition = r"^[+-]?\d+(?:\.\d+)? \| [+-]?\d+(?:\.\d+)? \| [+-]?\d+(?:\.\d+)?%$"

html_prompt = """
<div class="container yf-16vvaki"><div style="display: contents; --_font-weight:var(--font-bold); --_font-size:var(--font-6xl);">
<span class="down1 yf-ipw1h0 base" data-testid="qsp-price">174.64</span></div>

<div style="display: contents; --_font-weight:var(--font-bold); --_font-size:var(--font-2xl);">
<span class="down2 txt-negative yf-ipw1h0 base" data-testid="qsp-price-change">-0.76</span></div>

<div style="display: contents; --_font-weight:var(--font-bold); --_font-size:var(--font-2xl);">
<span class="up2 txt-negative yf-ipw1h0 base" data-testid="qsp-price-change-percent">(-0.43%)</span></div></div>
"""

response = client.responses.create(
    model="gpt-5",
    input=html_prompt,
    text={"format": {"type": "text"}},
    tools=[
        {
            "type": "custom",
            "name": "quote_grammar",
            "description": "Extracts and formats 'price | change | percent change' from HTML.",
            "format": {
                "type": "grammar",
                "syntax": "regex",
                "definition": quote_grammar_definition
            }
        },
    ],
    parallel_tool_calls=False
)

print(response.output[1].input)
print("-"*20)
response.usage

Output

174.64 | -0.76 | -0.43%
--------------------
ResponseUsage(input_tokens=414, 
input_tokens_details=InputTokensDetails(cached_tokens=0), 
output_tokens=414, 
output_tokens_details=OutputTokensDetails(reasoning_tokens=384), 
total_tokens=828)

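The output succeeded, but the number of tokens used is not small (384 of the 414 output tokens are reasoning tokens), so depending on the use case this may not be cost-effective.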
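Whether a string satisfies the pattern can also be checked locally with Python's re module, which costs no tokens. A minimal sketch reusing the same pattern; matches_quote_format is a hypothetical helper, and it assumes the API's regex dialect and Python's re agree on this pattern, which only uses basic constructs.

```python
import re

quote_grammar_definition = r"^[+-]?\d+(?:\.\d+)? \| [+-]?\d+(?:\.\d+)? \| [+-]?\d+(?:\.\d+)?%$"
pattern = re.compile(quote_grammar_definition)


def matches_quote_format(text: str) -> bool:
    """True if text has the form 'price | change | percent change%'."""
    return pattern.fullmatch(text) is not None


print(matches_quote_format("174.64 | -0.76 | -0.43%"))    # the model output: True
print(matches_quote_format("174.64 | -0.76 | (-0.43%)"))  # parentheses as in the HTML: False
```

The second case shows what the grammar is enforcing: the model has to strip the parentheses found in the source HTML to produce a valid string.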

Lark

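Next is Lark, which is more complex still. I am not very familiar with it myself. Extracting information from logs is a common LLM use case, but I could not think of many cases for the reverse, converting some data into logs or a fixed format, so I deliberately picked a somewhat extreme example.

The input is analysis data from an overseas weather model, such as cloud and wind information. The output is a TAF message for the corresponding area. The use case I want to illustrate is converting the raw data of a complex model into the data format of another industry. Because the data is raw, wind is given as an east-west component and a north-south component, which must be converted into the direction-and-speed form used in TAF, and some data may be missing depending on the situation, so the processing should be fairly involved. (This is only an example; please note that it differs from how the processing is actually done in the industry.)

Input

The Lark grammar definition is very long, so it is set apart here.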

import textwrap

grammar = textwrap.dedent(r"""
?start: taf_report

// A TAF report is composed of a header and the main forecast body
taf_report: report_type icao issuance_time validity_period forecast

// 1. HEADER SECTION
// ------------------
report_type: "TAF" ("AMD" | "COR")?
icao: LETTER{4}
issuance_time: DIGIT{2} DIGIT{2} DIGIT{2} "Z"
validity_period: DIGIT{2} DIGIT{2} "/" DIGIT{2} DIGIT{2}

// 2. FORECAST BODY
// ----------------
// The forecast contains initial conditions followed by optional change groups
forecast: base_conditions change_group*

// Base conditions are the initial forecast state
base_conditions: wind (cavok | (visibility? weather* clouds* wind_shear?))

// Change groups describe how the weather will evolve
change_group: tempo_group | becmg_group | from_group | prob_group

tempo_group: "TEMPO" time_period changed_conditions
becmg_group: "BECMG" time_period changed_conditions
from_group: "FM" from_time changed_conditions
prob_group: "PROB" probability time_period changed_conditions

// Time definitions for change groups
time_period: DIGIT{2} DIGIT{2} "/" DIGIT{2} DIGIT{2}
from_time: DIGIT{2} DIGIT{2} DIGIT{2}
probability: "30" | "40"

// 3. CORE WEATHER ELEMENTS
// ------------------------
// These rules define the individual meteorological components

// Wind: Direction, speed, optional gusts, and units
wind: (DIGIT{3} | "VRB") DIGIT{2,3} ("G" DIGIT{2,3})? wind_unit
wind_unit: "KT" | "MPS"

// Visibility: 9999 for >=10km, or specific meter values
visibility: DIGIT{4}

// Clouds: Amount, height, and optional significant type (CB/TCU)
// Also handles No Significant Cloud (NSC) and Vertical Visibility (VV)
clouds: (cloud_group | "NSC" | vertical_visibility)
cloud_group: cloud_amount DIGIT{3} cloud_type?
cloud_amount: "FEW" | "SCT" | "BKN" | "OVC"
cloud_type: "CB" | "TCU"
vertical_visibility: "VV" DIGIT{3}

// Significant Weather Phenomena
weather: INTENSITY? DESCRIPTOR? PHENOMENON+
INTENSITY: "-" | "+" | "VC"
DESCRIPTOR: "MI" | "PR" | "BC" | "DR" | "BL" | "SH" | "TS" | "FZ"
PHENOMENON: "DZ" | "RA" | "SN" | "SG" | "IC" | "PL" | "GR" | "GS" | "UP" | "BR" | "FG" | "FU" | "VA" | "DU" | "SA" | "HZ" | "PY" | "PO" | "SQ" | "FC" | "SS" | "DS"

// Special Keywords
cavok: "CAVOK" // "Ceiling and Visibility OK"
wind_shear: "WS" DIGIT{3} "/" wind

// 4. TERMINALS AND CONFIGURATION
// ------------------------------
%import common.DIGIT
%import common.LETTER
%import common.WS
%ignore WS // Ignore whitespace between tokens
""")

input_prompt = '''
3:11796818:18Z19mar2025:VGRD V-Component of Wind [m/s]:lvl1=(103,10) lvl2=(255,missing):10 m above ground:7200 min fcst::lon=139.039062,lat=35.008774,i=252206,ix=252206,iy=1,val=0.223282
3:11796818:18Z19mar2025:UGRD U-Component of Wind [m/s]:lvl1=(103,10) lvl2=(255,missing):10 m above ground:7200 min fcst::lon=139.039062,lat=35.008774,i=252206,ix=252206,iy=1,val=2.04112
3:11796818:18Z19mar2025:CDCC Cloud Cover [%]:lvl1=(100,0) lvl2=(100,40000):0-400 mb:7200 min fcst::lon=139.039062,lat=35.008774,i=252206,ix=252206,iy=1,val=0
3:11796818:18Z19mar2025:CDCC Cloud Cover [%]:lvl1=(100,40000) lvl2=(100,80000):400-800 mb:7200 min fcst::lon=139.039062,lat=35.008774,i=252206,ix=252206,iy=1,val=0
3:11796818:18Z19mar2025:CDCC Cloud Cover [%]:lvl1=(100,80000) lvl2=(1,0):800 mb - surface:7200 min fcst::lon=139.039062,lat=35.008774,i=252206,ix=252206,iy=1,val=47.0547
---
From the model data, transform the wind part and the cloud SCT part to TAF, ignore other parts if you cannot find the data from above.
'''

response_transform = client.responses.create(
    model="gpt-5",
    input=input_prompt,
    text={"format": {"type": "text"}},
    reasoning = {
        "effort": "high"
    },
    tools=[
        {
            "type": "custom",
            "name": "taf_grammar",
            "description": "YOU MUST REASON HEAVILY TO MAKE SURE IT OBEYS THE GRAMMAR.",
            "format": {
                "type": "grammar",
                "syntax": "lark",
                "definition": grammar
            }
        },
    ],
    parallel_tool_calls=False
)

print(response_transform.output[1].input)
print("-"*20)
response_transform.usage

Output

TAF RJTT 191800Z 1918/2018 26404KT 9999 SCT020
--------------------
ResponseUsage(input_tokens=1435, 
input_tokens_details=InputTokensDetails(cached_tokens=0), 
output_tokens=7016, output_tokens_details=OutputTokensDetails(reasoning_tokens=6976), 
total_tokens=8451)

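This confirms that the model copes with the missing data and the wind conversion and outputs a TAF message. Of course, going this far uses a lot of tokens (6,976 reasoning tokens here), so in most cases you probably would not use this approach.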
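The wind and cloud groups in the output can be cross-checked without a model. A minimal sketch, not the industry procedure: wind_group and cloud_amount are hypothetical helpers, the percent-to-okta rounding and the use of NSC for zero cover are my assumptions, and calm or variable winds are not handled.

```python
import math

MS_TO_KT = 1.943844  # knots per metre per second


def wind_group(u: float, v: float) -> str:
    """Format U/V wind components (m/s) as a TAF-style wind group in knots."""
    speed_kt = round(math.hypot(u, v) * MS_TO_KT)
    # Meteorological direction is where the wind blows FROM, clockwise from north
    direction = round(math.degrees(math.atan2(-u, -v))) % 360
    return f"{direction:03d}{speed_kt:02d}KT"


def cloud_amount(cover_percent: float) -> str:
    """Map cloud cover (%) to an amount code via oktas (assumed rounding)."""
    oktas = round(cover_percent / 12.5)
    if oktas == 0:
        return "NSC"  # assumption: treat zero cover as no significant cloud
    if oktas <= 2:
        return "FEW"
    if oktas <= 4:
        return "SCT"
    if oktas <= 7:
        return "BKN"
    return "OVC"


# UGRD, VGRD and low-level CDCC values from the input prompt
print(wind_group(u=2.04112, v=0.223282))
print(cloud_amount(47.0547))
```

With the sample values this prints 26404KT and SCT, matching the wind group and cloud amount in the model's output.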
