Writing the conversion of Express's `/users/:id` into a regex by hand: reimplementing path-to-regexp in 100 lines

Posted at 2026-05-15

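When you pass app.get('/users/:id', ...) to Express, path-to-regexp internally converts the path into a regex along the lines of /^\/users\/([^/]+)$/. How hard is it to write that conversion yourself? This article does it in 100 lines, including the ? modifier and inline regex (:id(\d+)), with tests for 23 edge cases. The result works as a browser tool without changes, so I built a /users/:id tester: enter a pattern and some URLs, and it shows the match result and the extracted params in real time.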

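[Screenshot of regex-route: the pattern field contains /users/:id(\d+), and below it the generated regex ^\/users\/(\d+)$ and the key :id are shown. The test URL textarea lists 7 URLs, including /users/42, /users/abc, /users, and /users/42?include=author. In the results table, numeric IDs are shown as match (id=42), while letters and empty segments are shown as no match.]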

🌐 Demo: https://sen.ltd/portfolio/regex-route/
📦 GitHub: https://github.com/sen-ltd/regex-route

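Why rewrite path-to-regexp?

If you write app.get('/users/:id', ...) in production Express code, you normally have no need to think about how /users/:id is converted into a regex. Then one day you hit a bug such as "I had assumed /users/:id would also match /users/42/extra" or "/users/:id? does not work for /users". The fastest way to understand the cause is to look at what happens inside.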
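The first misconception is easy to check by hand. The sketch below runs the regex quoted at the top of this article against a few paths using only the built-in RegExp. The helper name captureId is illustrative and not part of the tool.

```javascript
// The regex this article says `/users/:id` compiles to.
const usersId = /^\/users\/([^/]+)$/;

// Illustrative helper: return the captured id, or null when the path does not match.
function captureId(path) {
  const m = usersId.exec(path);
  return m === null ? null : m[1];
}

console.log(captureId("/users/42"));       // exactly one segment after /users
console.log(captureId("/users/42/extra")); // `[^/]+` cannot cross a "/", so no match
console.log(captureId("/users"));          // the slash and the segment are both required
```

Because the pattern is anchored with ^ and $ and the capture excludes "/", only the first path matches.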

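path-to-regexp has made breaking changes between major versions, and the pattern syntax has shifted with them. Express 4 depends on the old 0.1.x line, while Express 5 depends on 8.x, which no longer accepts inline regex such as :id(\d+) or the :id? modifier (optional parts are written with braces instead, as in /users{/:id}). As a result, some apps find part of their production routing broken after upgrading from Express 4 to Express 5.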

Writing the conversion yourself is one of the most reliable ways to avoid this trap.

A small parser with four cases

The parser reads the pattern string one character at a time and handles each character in one of four cases:

while (i < pattern.length) {
  const ch = pattern[i];

  if (ch === ":") {
    // Named parameter: `:name`, `:name(regex)`, `:name?`, etc.
    // ...
  } else if (ch === "*") {
    // Wildcard: the entire rest of the path
    re += "(.*)";
    keys.push({ name: "wild", modifier: "*" });
  } else if (REGEX_META.includes(ch)) {
    // Regex metacharacter inside static text: escape it
    re += "\\" + ch;
  } else {
    // Ordinary character (including `/`)
    re += ch === "/" ? "\\/" : ch;
  }
  i++;
}

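The key points:

- Always escape regex metacharacters in static text. If the . in the pattern /foo.bar is not escaped, the regex also matches /fooXbar (a silent bug).
- Escape / explicitly as well. For new RegExp(...), / and \/ behave the same, but with \/ the generated source can be shown in the debug display and pasted into a /.../ regex literal as-is.
- Don't split the logic into too many states. A linear else-if chain, rather than a switch, keeps the whole thing within 100 lines.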
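The silent bug from the first point can be reproduced in a few lines. This is a minimal sketch: escapeStatic is a hypothetical helper, and the contents of REGEX_META here are an assumption, since the article does not list the actual set.

```javascript
// Assumed metacharacter set; the real REGEX_META in route.js may differ.
// (`*` and `?` are pattern syntax in the parser and are handled before this check there.)
const REGEX_META = ".*+?^$|[]{}()\\";

// Hypothetical helper: escape a run of static text the way the loop above does.
function escapeStatic(text) {
  let out = "";
  for (const ch of text) {
    if (REGEX_META.includes(ch)) out += "\\" + ch;
    else out += ch === "/" ? "\\/" : ch;
  }
  return out;
}

const unescaped = new RegExp("^/foo.bar$");
const escaped = new RegExp("^" + escapeStatic("/foo.bar") + "$");

console.log(unescaped.test("/fooXbar")); // the bare `.` matches any character
console.log(escaped.test("/fooXbar"));   // `\.` only matches a literal dot
console.log(escaped.test("/foo.bar"));
```

Nothing throws in the unescaped version, which is what makes the bug easy to miss: the route simply accepts more URLs than intended.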

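The pitfall of the `:name?` modifier

"Optional" sounds simple, but the catch is that the slash in front of the segment has to become optional too:

- Take the pattern /users/:id?
- It should match both /users and /users/42
- If you naively append (?:([^/]+))?, the generated regex is ^\/users\/(?:([^/]+))?$, which matches /users/ (with the trailing slash) and /users/42 but not /users

The correct approach is, on seeing ?, to absorb the preceding \/ into the optional group: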

if (modifier === "?") {
  if (re.endsWith("\\/")) {
    re = re.slice(0, -2) + `(?:\\/(${seg}))?`;
  } else {
    re += `(${seg})?`;
  }
}

With this, /users/:id? becomes ^\/users(?:\/([^/]+))?$, which matches both /users and /users/42.

Pin the behavior with a test:
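The difference between the two generated regexes is easiest to see side by side. The sketch below hard-codes both regex sources from this section and runs them against three paths; the variable names are illustrative.

```javascript
// Naive: only the segment is optional, the slash before it is still required.
const naive = new RegExp("^\\/users\\/(?:([^/]+))?$");
// Fixed: the slash is absorbed into the optional group.
const fixed = new RegExp("^\\/users(?:\\/([^/]+))?$");

for (const path of ["/users", "/users/", "/users/42"]) {
  console.log(path, "naive:", naive.test(path), "fixed:", fixed.test(path));
}
```

The naive regex rejects /users, while the fixed one accepts /users and /users/42. Note that the fixed regex rejects /users/ with a trailing slash, because the optional group needs at least one character after the slash.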

test("compilePath: :param? makes the segment + leading slash optional", () => {
  const c = compilePath("/users/:id?");
  const r1 = matchPath(c, "/users");
  const r2 = matchPath(c, "/users/42");
  assert.equal(r1.params.id, null);
  assert.equal(r2.params.id, "42");
});

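Balancing parentheses in inline regex

For an inline regex such as :id(\d+), the parser sees the opening parenthesis, finds the matching closing parenthesis, and extracts what is in between. A naive indexOf(")") breaks on nested parentheses, as in :date((\d{4})-(\d{2})-(\d{2})).

Instead, track nesting with a depth counter and respect backslash escapes: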

export function findMatchingParen(s, start) {
  if (s[start] !== "(") return -1;
  let depth = 0;
  for (let i = start; i < s.length; i++) {
    if (s[i] === "\\") { i++; continue; }  // skip the escaped char
    if (s[i] === "(") depth++;
    else if (s[i] === ")") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

The key point is that \) is not treated as a closing parenthesis. A user who wants a literal ) inside an inline regex writes \), which follows path-to-regexp convention.

Test:

test("findMatchingParen respects backslash escapes", () => {
  // (\)) is a string of length 4. The inner `\)` is not treated as a closing
  // parenthesis; the final `)` is the matching one.
  assert.equal(findMatchingParen("(\\))", 0), 3);
});
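To make the nested case concrete, the sketch below compares indexOf(")") with findMatchingParen on the nested date pattern mentioned earlier. The function is copied from above so the snippet runs on its own; the other variable names are illustrative.

```javascript
// Copy of findMatchingParen from this section, so the snippet is self-contained.
function findMatchingParen(s, start) {
  if (s[start] !== "(") return -1;
  let depth = 0;
  for (let i = start; i < s.length; i++) {
    if (s[i] === "\\") { i++; continue; } // skip the escaped char
    if (s[i] === "(") depth++;
    else if (s[i] === ")") {
      depth--;
      if (depth === 0) return i;
    }
  }
  return -1;
}

const pattern = ":date((\\d{4})-(\\d{2})-(\\d{2}))";
const open = pattern.indexOf("(");

const naiveEnd = pattern.indexOf(")", open);          // stops at the first `)`
const balancedEnd = findMatchingParen(pattern, open); // stops at the matching `)`

console.log(pattern.slice(open + 1, naiveEnd));    // truncated, unbalanced fragment
console.log(pattern.slice(open + 1, balancedEnd)); // the full inline regex
```

The naive slice ends inside the first nested group, so passing it to new RegExp would fail with an unterminated group. One limitation of this approach: a ) inside a character class such as [)] is still counted as a closing parenthesis unless it is escaped.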

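Stripping the `?query` and `#fragment`

Suppose you want /users/42?include=author to match /users/:id. Express strips the query string from req.url before matching, but path-to-regexp itself has no mechanism for extracting just the path from a URL. This tool therefore cuts off everything from ? onward and from # onward before matching: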

const hashPos = url.indexOf("#");
const noHash = hashPos === -1 ? url : url.slice(0, hashPos);
const qPos = noHash.indexOf("?");
const pathPart = qPos === -1 ? noHash : noHash.slice(0, qPos);
const query = qPos === -1 ? "" : noHash.slice(qPos + 1);

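The query is returned as a separate field and shown in the UI as a label such as "query: ?include=author". It has no effect on the match result, which keeps the behavior easy to reason about.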
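Wrapped in a function, the stripping step can be tried on a couple of inputs. This is a sketch: splitUrl is a hypothetical name, and the body is the snippet above with the two results returned as an object.

```javascript
// Hypothetical wrapper around the stripping snippet above.
function splitUrl(url) {
  // Drop the fragment first, so a "?" inside the fragment is never read as a query.
  const hashPos = url.indexOf("#");
  const noHash = hashPos === -1 ? url : url.slice(0, hashPos);
  const qPos = noHash.indexOf("?");
  return {
    pathPart: qPos === -1 ? noHash : noHash.slice(0, qPos),
    query: qPos === -1 ? "" : noHash.slice(qPos + 1),
  };
}

console.log(splitUrl("/users/42?include=author#comments"));
console.log(splitUrl("/users/42#a?b"));
```

The second input shows why the order matters: the ? comes after the #, so it belongs to the fragment and the query stays empty.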

URL decoding and defending against malformed input

In /users/%E5%B1%B1%E7%94%B0, the %E5%B1%B1%E7%94%B0 part is the UTF-8 percent-encoding of 山田. Following Express convention, the match result returns the decoded value rather than the raw one:

try {
  params[key.name] = decodeURIComponent(raw);
} catch {
  // malformed (e.g. a lone "%"): return the raw value
  params[key.name] = raw;
}

decodeURIComponent throws a URIError on invalid percent-encoding, so the call is guarded with try/catch. When a user types a half-finished input such as /users/foo%bar into the URL text area, the whole page should not crash.

test("matchPath: malformed percent-encoding returns the raw value", () => {
  const c = compilePath("/users/:name");
  const r = matchPath(c, "/users/foo%bar");
  assert.ok(r !== null);                  // the match itself succeeds
  assert.equal(r.params.name, "foo%bar"); // the raw, undecoded value comes back
});
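The decoding fallback can also be tried in isolation. In this sketch, safeDecode is a hypothetical helper that wraps the try/catch shown above.

```javascript
// Hypothetical helper: decode a captured segment, falling back to the raw text.
function safeDecode(raw) {
  try {
    return decodeURIComponent(raw);
  } catch {
    // decodeURIComponent throws URIError on malformed input
    return raw;
  }
}

console.log(safeDecode("%E5%B1%B1%E7%94%B0")); // 山田
console.log(safeDecode("foo%bar"));            // %ba is not valid UTF-8 on its own, so the raw text is kept
console.log(safeDecode("%"));                  // a lone "%" is also kept as-is
```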

All the finished functions

// route.js (~100 lines)
export class CompileError extends Error { ... }
export function findMatchingParen(s, start) { ... }
export function compilePath(pattern) { ... }     // → {regex, keys, source, generated}
export function matchPath(compiled, url) { ... } // → {url, pathPart, query, params} | null
export function testRoute(pattern, url) { ... }  // shortcut
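For readers who want to see the pieces connected, here is a condensed sketch of how compilePath and matchPath could fit together. It is not the actual route.js from the repository: the body of the `:` branch, the REGEX_META set, the error messages, and the reduced return shapes are all assumptions, and it only works when inline regexes contain no capturing groups of their own.

```javascript
// Condensed, hypothetical sketch; NOT the actual route.js from the repository.
// Assumption: inline regexes contain no capturing groups of their own,
// so capture group N always belongs to keys[N - 1].
const REGEX_META = ".*+?^$|[]{}()\\"; // assumed metacharacter set

function findMatchingParen(s, start) {
  let depth = 0;
  for (let i = start; i < s.length; i++) {
    if (s[i] === "\\") { i++; continue; } // skip the escaped char
    if (s[i] === "(") depth++;
    else if (s[i] === ")" && --depth === 0) return i;
  }
  return -1;
}

function compilePath(pattern) {
  let re = "^";
  const keys = [];
  let i = 0;
  while (i < pattern.length) {
    const ch = pattern[i];
    if (ch === ":") {
      // 1. Parameter name: a run of word characters after the colon.
      let j = i + 1;
      while (j < pattern.length && /\w/.test(pattern[j])) j++;
      const name = pattern.slice(i + 1, j);
      if (name === "") throw new Error(`missing parameter name at index ${i}`);
      // 2. Optional inline regex, e.g. `:id(\d+)`; the default is one path segment.
      let seg = "[^/]+";
      if (pattern[j] === "(") {
        const end = findMatchingParen(pattern, j);
        if (end === -1) throw new Error(`unbalanced parenthesis at index ${j}`);
        seg = pattern.slice(j + 1, end);
        j = end + 1;
      }
      // 3. Optional `?` modifier, absorbing the preceding slash if there is one.
      const optional = pattern[j] === "?";
      if (optional) j++;
      if (optional && re.endsWith("\\/")) {
        re = re.slice(0, -2) + `(?:\\/(${seg}))?`;
      } else {
        re += optional ? `(${seg})?` : `(${seg})`;
      }
      keys.push({ name, modifier: optional ? "?" : "" });
      i = j;
      continue;
    }
    if (ch === "*") {
      re += "(.*)";
      keys.push({ name: "wild", modifier: "*" });
    } else if (REGEX_META.includes(ch)) {
      re += "\\" + ch;
    } else {
      re += ch === "/" ? "\\/" : ch;
    }
    i++;
  }
  return { regex: new RegExp(re + "$"), keys };
}

function matchPath(compiled, url) {
  // Strip the fragment and the query before matching.
  const pathPart = url.split("#")[0].split("?")[0];
  const m = compiled.regex.exec(pathPart);
  if (m === null) return null;
  const params = {};
  compiled.keys.forEach((key, idx) => {
    const raw = m[idx + 1];
    if (raw === undefined) { params[key.name] = undefined; return; }
    try {
      params[key.name] = decodeURIComponent(raw);
    } catch {
      params[key.name] = raw; // malformed percent-encoding: keep the raw text
    }
  });
  return { pathPart, params };
}

console.log(matchPath(compilePath("/users/:id(\\d+)"), "/users/42?include=author"));
console.log(matchPath(compilePath("/users/:id?"), "/users"));
console.log(matchPath(compilePath("/files/*"), "/files/a/b.txt"));
```

In this sketch an absent optional parameter is left as undefined, and matching groups to keys by position is what breaks once an inline regex brings its own capturing groups. Handling that case would need named groups or group counting.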

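script.js wires these together:

- Pattern input → 80 ms debounce → compilePath → display the generated regex and keys
- URL input (one URL per line) → matchPath on each line → display a status chip and the extracted params in the table

In under 200 lines, this became a dev tool that makes it visible how an Express route is turned into a regex.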

Try it

At https://sen.ltd/portfolio/regex-route/ you can try the sample buttons (/users/:id, /users/:id(\d+), /users/:id?, /api/:version(v\d+)/users/:id, /posts/:year/:month/:day, /files/*).

Source: https://github.com/sen-ltd/regex-route. MIT license, ~350 lines of JS in total, 23 unit tests, no build step, zero dependencies.

🛠 This article covers one of a set of small developer tools published by SEN 合同会社. The others are listed in the portfolio.
