Haskellでミニマムな正規表現のmatch関数を実装してみる

Posted at 2025-06-30

「ビューティフルコード」¹（オーライリー・ジャパン）のしょっぱなの章で、
カーニハン先生がロブ・パイクさんのミニマムな正規表現のmatch関数のソースコードを紹介していた。

たった３０行ほどのC言語コードで、下記正規表現をサポートする。

文字	意味
c	文字cそのもの
.	任意の一文字
^	文字列の先頭にマッチ
$	文字列の末尾にマッチ
*	直前の文字の0回以上の反復

95％の用途はこれで足りると書いてあったが、全くその通りだと思う。
短くて日常のほとんどのケースで用が足せる簡潔なプログラムは、まさにUNIX精神そのもの。

この本のパイクさんのコードに触発されて、Haskellで同等のものを書いてみた。

ミニマム正規表現match関数Haskell版

-- match regex string 
match :: [Char] -> [Char] -> Bool
match _ [] = False
match (x:xs) (y:ys)
    | x == '^' = match_here xs (y:ys)
    | match_here (x:xs) (y:ys) = True
    | otherwise = match (x:xs) ys


match_here :: [Char] -> [Char] -> Bool
match_here [] _ = True
match_here (x:xs) []
    | x == '$' && xs == [] = True
    | otherwise = False
match_here (x:xs) (y:ys)
    | x == y || x == '.' = if xs /= [] && head xs == '*' 
                           then match_repeat x (tail xs) ys
                           else match_here xs ys
    | otherwise = False


match_repeat :: Char -> [Char] -> [Char] -> Bool
match_repeat _ [] _ = True
match_repeat _ _ [] = False 
match_repeat c (x:xs) (y:ys)
    | x == y = match_here xs ys
    | c == y || c == '.' = match_repeat c (x:xs) ys
    | otherwise = match_here xs (y:ys)

使用例

*Main> match "^a.*1" "abccfb1"
True
*Main> match "abc" "abccfb1"
True
*Main> match "c.*1$" "abccfb1"
True
*Main> match "ccd" "abccfb1"
False

ロブ・パイクさんのコードは、再帰とポインタを巧みに使っていて、
ほとんどC言語で関数型プログラミングをしているようなもの。

プログラム構成のアイディアはそのまま使った感じだが、
コードを見ながら移植だとつまらないので、
一度みた構成を思い出しながら、Haskellで実装するというスタイルをとった。

実際書いてみるとHaskell言語はかなり見通しがよいと思った。
パターンマッチング、条件分岐や関数宣言の簡潔な記法、が効いているな。

*この記事は7年くらい前に自分がブログに書いたものを多少リバイスして転記。

https://www.oreilly.co.jp/books/9784873113630/ ↩

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up