
Python Regular Expression Notes

Python

Last updated at 2019-03-10, posted at 2017-01-01

Introduction

Python has no regular expression literal syntax like Perl or Ruby.
To use regular expressions, import the re module.

import re

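Raw strings

Prefixing a string literal with r disables escape-sequence processing in that string, so each backslash is kept as a literal character.
(This is similar to @"..." in C#.)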

For example,

"C:\\My Document\\fire\\test.txt"

can be written as

r"C:\My Document\fire\test.txt"

The prefix may also be an uppercase R.

R"C:\My Document\fire\test.txt"
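As a quick check of the above, the following sketch compares a normal literal with its raw equivalent. The variable names are only for illustration.

```python
# A normal literal needs doubled backslashes; the raw literal does not.
normal = "C:\\My Document\\fire\\test.txt"
raw = r"C:\My Document\fire\test.txt"

print(normal == raw)    # -> True

# In a raw literal, \t stays as two characters (backslash and t), not a tab.
print(len("\t"))        # -> 1
print(len(r"\t"))       # -> 2
```

Raw strings are handy for regular expressions because patterns such as \s and \d can be written without doubling the backslash.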

Matching

import re

m = re.match(r"FIRE[0-9]+", "FIRE123")

if m:
    print("matched")
else:
    print("did not match")

Besides match, there is also search.
match checks whether the pattern matches at the beginning of the string, while search scans the string and finds a match anywhere in it.

m = re.match(r"FIRE[0-9]+", "Python FIRE123")     # does not match
m = re.search(r"FIRE[0-9]+", "Python FIRE123")    # matches
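The difference can be seen by running both on the same input. This is a minimal sketch; re.fullmatch is not covered in these notes and is included only for comparison, and the variable names are illustrative.

```python
import re

pattern = r"FIRE[0-9]+"
text = "Python FIRE123"

at_start = re.match(pattern, text)          # None: text does not start with FIRE
anywhere = re.search(pattern, text)         # finds "FIRE123" in the middle
whole = re.fullmatch(pattern, "FIRE123")    # the entire string must fit the pattern

print(at_start)             # -> None
print(anywhere.group(0))    # -> FIRE123
print(whole is not None)    # -> True
```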

Getting the matched string

m = re.search(r"([a-z]+)\s*=\s*([0-9]+)", "index = 123")

if m:
	print(m.group(0))                # ->  "index = 123"
	print(m.group(1))                # ->  "index"
	print(m.group(2))                # ->  "123"
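When all groups are needed at once, groups() may be more convenient than calling group(n) repeatedly. A small sketch, with key and value as illustrative names:

```python
import re

m = re.search(r"([a-z]+)\s*=\s*([0-9]+)", "index = 123")

if m:
    # groups() returns a tuple of all capture groups, excluding group 0.
    key, value = m.groups()
    print(key)          # -> index
    print(int(value))   # -> 123
```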

Named groups are written as (?P<name> ... )

m = re.search(r"(?P<name>[a-z]+)\s*=\s*(?P<value>[0-9]+)", "index = 123")

if m:
	print(m.group(0))                # ->  "index = 123"
	print(m.group(1))                # ->  "index"
	print(m.group(2))                # ->  "123"
	print(m.group('name'))           # ->  "index"
	print(m.group('value'))          # ->  "123"
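Named groups can also be retrieved together as a dictionary with groupdict(). A minimal sketch, where fields is an illustrative name:

```python
import re

m = re.search(r"(?P<name>[a-z]+)\s*=\s*(?P<value>[0-9]+)", "index = 123")

if m:
    # groupdict() maps each group name to the text it captured.
    fields = m.groupdict()
    print(fields["name"])     # -> index
    print(fields["value"])    # -> 123
```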

Getting the match position

m = re.match(r"abc([0-9]+)\.([0-9]+)", "abc123.456")

# --- Whole match
print(m.start())                # -> 0
print(m.end())                  # -> 10  (end is the index just past the last matched character)
print(m.span())                 # -> (0, 10)  start and end as a tuple

# --- Group 1
print(m.start(1))               # -> 3
print(m.end(1))                 # -> 6
print(m.span(1))                # -> (3, 6)


# --- Group 2
print(m.start(2))               # -> 7
print(m.end(2))                 # -> 10
print(m.span(2))                # -> (7, 10)
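Because end is one past the last matched character, start and end can be used directly as slice bounds. The sketch below checks this against group(); the dot is escaped so that it matches only a literal period, and the variable names are illustrative.

```python
import re

text = "abc123.456"
m = re.match(r"abc([0-9]+)\.([0-9]+)", text)

if m:
    start1, end1 = m.span(1)
    start2, end2 = m.span(2)
    print(text[start1:end1])                  # -> 123
    print(text[start2:end2])                  # -> 456
    print(text[start1:end1] == m.group(1))    # -> True
```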

Compiling a regular expression

reg = re.compile(r"ABC([0-9]+)")

m = reg.match("ABC888")
print(m.group(1))               # -> "888"
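A compiled pattern is typically reused across many inputs. A minimal sketch, assuming a hypothetical list of sample strings:

```python
import re

reg = re.compile(r"ABC([0-9]+)")

lines = ["ABC888", "XYZ1", "ABC42"]   # hypothetical sample input
numbers = []
for line in lines:
    m = reg.match(line)
    if m:                             # skip lines that do not start with ABC + digits
        numbers.append(m.group(1))

print(numbers)                        # -> ['888', '42']
```

The compiled object offers the same operations as the module-level functions (match, search, findall and so on), without passing the pattern each time.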

Returning all non-overlapping matches as a list of strings

match_list = re.findall(r'([0-9]+)', "123a456b789")
print(match_list)               # -> ['123', '456', '789']
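The shape of the findall result depends on the number of groups in the pattern. The sketch below shows the two-group case, plus re.finditer (not covered in these notes) for when positions are needed; the variable names are illustrative.

```python
import re

text = "123a456b789"

# With two or more groups, findall returns a list of tuples.
pairs = re.findall(r"([0-9]+)([a-z])", text)
print(pairs)    # -> [('123', 'a'), ('456', 'b')]

# finditer yields match objects, so span() is available for each match.
spans = [m.span() for m in re.finditer(r"[0-9]+", text)]
print(spans)    # -> [(0, 3), (4, 7), (8, 11)]
```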
