
Using capture groups in JavaScript regular expressions

Posted at 2018-04-29

Suppose you want to extract the https://...png parts from a string like the following.

start"Url":"https://a/b/aaa.png","Url":"https://a/b/bbb.png","Url":"https://a/b/ccc.png"end

In PHP, preg_match_all stores the captured values as an array in $result[1].

$str = 'start"Url":"https://a/b/aaa.png","Url":"https://a/b/bbb.png","Url":"https://a/b/ccc.png"end';

preg_match_all('/"(https.+?)"/', $str, $result);

print_r($result);

=>
Array
(
    [0] => Array
        (
            [0] => "https://a/b/aaa.png"
            [1] => "https://a/b/bbb.png"
            [2] => "https://a/b/ccc.png"
        )

    [1] => Array
        (
            [0] => https://a/b/aaa.png
            [1] => https://a/b/bbb.png
            [2] => https://a/b/ccc.png
        )

)

I assumed JavaScript would make this just as easy with str.match, but...

let str = 'start"Url":"https://a/b/aaa.png","Url":"https://a/b/bbb.png","Url":"https://a/b/ccc.png"end';

let result = str.match(/"(https.+?)"/g);
console.log(result);

=>
[ '"https://a/b/aaa.png"',
  '"https://a/b/bbb.png"',
  '"https://a/b/ccc.png"']

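With the g flag, match returns only the full matches, so the text you want comes back still wrapped in double quotes and the capture group is not returned.
So use RegExp.exec instead.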

let str = 'start"Url":"https://a/b/aaa.png","Url":"https://a/b/bbb.png","Url":"https://a/b/ccc.png"end';

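// The g flag is required: without it exec never advances and this loop never ends.
const regexp = /"(https.+?)"/g;
let match;
let matches = [];
while ((match = regexp.exec(str)) !== null) {
  matches.push(match[1]); // match[1] is the first capture group
}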
console.log(matches);

=>
[ 'https://a/b/aaa.png',
  'https://a/b/bbb.png',
  'https://a/b/ccc.png' ]
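
On newer runtimes the same result can probably be had without a hand-written loop. The following is a minimal sketch, not something from the original post: it assumes an ES2020+ environment where String.prototype.matchAll is available, and extractUrls is a hypothetical helper name chosen for illustration.

```javascript
// Hypothetical helper (not from the original post).
// Assumes an ES2020+ runtime; matchAll throws a TypeError if the regex lacks the g flag.
function extractUrls(str) {
  const regexp = /"(https.+?)"/g;
  // matchAll yields one match array per hit; index 1 is the first capture group.
  return Array.from(str.matchAll(regexp), (m) => m[1]);
}

const sample = 'start"Url":"https://a/b/aaa.png","Url":"https://a/b/bbb.png","Url":"https://a/b/ccc.png"end';
console.log(extractUrls(sample));
// => [ 'https://a/b/aaa.png', 'https://a/b/bbb.png', 'https://a/b/ccc.png' ]
```

Because matchAll works on its own copy of the regular expression, it does not leave a modified lastIndex behind on the object you pass in, which may make it easier to reason about than the exec loop.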

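At first I could not understand why repeating exec returns a different value each time.
According to the Mozilla documentation, when the regular expression has the g (or y) flag, the search starts from the value of the regular expression object's lastIndex property. In other words, every call to exec updates lastIndex on the regular expression object.
I wish the documentation had spelled that out too...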
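
To make the lastIndex behavior visible, here is a small sketch that records lastIndex after every exec call. It reuses the sample string and pattern from this post; the steps array is just a name picked for this illustration.

```javascript
const str = 'start"Url":"https://a/b/aaa.png","Url":"https://a/b/bbb.png","Url":"https://a/b/ccc.png"end';
const regexp = /"(https.+?)"/g;

// Record the captured URL and where the next search will start.
const steps = [];
let match;
while ((match = regexp.exec(str)) !== null) {
  // After a successful exec, lastIndex points just past the end of the match.
  steps.push({ url: match[1], lastIndex: regexp.lastIndex });
}
console.log(steps);

// When exec finally fails and returns null, it resets lastIndex to 0.
console.log(regexp.lastIndex);
```

Each entry should show a larger lastIndex than the one before it, which is why the next exec call picks up the following URL instead of the first one again. This also hints at a pitfall: reusing the same g-flagged regular expression object on another string starts from whatever lastIndex was left behind.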

MDN: developer.mozilla.org
