
Regular expressions: backreferences cannot be used inside a character class

Last updated at 2021-07-23, posted at 2021-07-23

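I'm currently studying regular expressions.

I got a bit stuck on the "Negative Lookahead" problem in HackerRank's Regex domain (https://www.hackerrank.com/domains/regex), so this is a memo for future reference.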

The problem is:

Write a regex which can match all characters which are not immediately followed by that same character.

The correct answer is
  (.)(?!\1)
but

  (.)[^\1]
also came to mind...

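What I meant by it was: any single character other than a newline, followed by one character that is not the character matched by group 1.
But it does not match the way I expected.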
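To make the difference concrete, here is a minimal sketch that runs both patterns on a sample string. It assumes Python's `re` module (the original problem is not tied to Python), and the sample text `"aabc"` and the variable names are only for illustration.

```python
import re

# Sample input chosen for illustration: the first "a" is followed by another "a".
text = "aabc"

# Negative lookahead: capture one character, then assert that the next
# character is not that same character. The lookahead consumes nothing.
lookahead = re.compile(r"(.)(?!\1)")

# Character-class attempt: inside [...] the \1 is not a backreference,
# so this just means "any character, then one more character that is
# not whatever literal the engine reads \1 as".
char_class = re.compile(r"(.)[^\1]")

lookahead_matches = [m.group(0) for m in lookahead.finditer(text)]
char_class_matches = [m.group(0) for m in char_class.finditer(text)]

print(lookahead_matches)   # ['a', 'b', 'c']
print(char_class_matches)  # ['aa', 'bc']
```

The character-class version happily matches `aa`, which is exactly the case the problem wants to exclude, and it also consumes two characters per match instead of one.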

After quite a lot of googling in Japanese, I finally found the following.

Outside a character class, a backslash followed by a number of 1 or greater (possibly multiple digits) is a back reference to a capturing subpattern earlier (that is, to its left) in the pattern.
(Quoted from https://www.php.net/manual/ja/regexp.reference.back-references.php)

"Outside a character class"...
I wanted to know a little more, so I searched in English and found the following.

"In Perl regexes, expressions like \1, \2, etc. are usually interpreted as "backreferences" to previously captured groups, but not so when the \1, \2, etc. appear within a character class. In the latter case, the \ is treated as an escape character (and therefore \1 is just 1, etc.)."
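(Quoted from https://stackoverflow.com/questions/18242119/general-approach-for-equivalent-of-backreferences-within-character-class)
Inside a character class, \ is treated as an escape character, so \1 is just a literal character (the character 1 according to the quote; some engines instead read it as an octal escape for the character with code 1) and never a backreference to the match.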
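A small probe can check how a given engine reads `\1` inside a character class. This is only a sketch assuming Python's `re` module, where numeric escapes inside `[...]` are treated as characters; the pattern `(.)[\1]` and the variable names are hypothetical, and other engines may behave differently.

```python
import re

# Hypothetical probe: capture one character, then a class containing \1.
probe = re.compile(r"(.)[\1]")

# Would match if \1 inside the class were a backreference to group 1.
same_char = probe.fullmatch("aa")

# Would match if \1 inside the class meant the digit "1".
digit_one = probe.fullmatch("a1")

# Matches if \1 inside the class is an octal escape for character code 1.
code_one = probe.fullmatch("a\x01")

print(same_char is None)     # True: not a backreference
print(digit_one is None)     # True: not the digit 1 in Python
print(code_one is not None)  # True: read as the character with code 1
```

Whichever literal the engine picks, the point stands: inside a character class `\1` never refers back to the captured group, so the negative lookahead `(?!\1)` is the way to express "not followed by the same character".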

And with that, it finally made sense.
