
No.036 [Python] Regular expressions: the re module

Posted at 2019-02-18

This post covers re, Python's regular expression module.

■ Checking whether the start of a string matches a pattern: match()

>>> import re
>>> 
>>> w = "one two one two"
>>> 
>>> # Check whether the start of the string matches the pattern: match()
>>> 
>>> # re.match() performs this check
>>> 
>>> m = re.match("one",w)
>>> 
>>> print(m)
<re.Match object; span=(0, 3), match='one'>
>>> # ↑ On a match, a match object is returned

>>> # A match object has the following methods:
>>> # group(), start(), end(), span(), etc.
>>> 
>>> import re
>>> 
>>> w = "one two one two"
>>> 
>>> m = re.match("one",w)
>>> print(m)
<re.Match object; span=(0, 3), match='one'>
>>> 
>>> print(m.group())
one
>>> 
>>> print(m.start())
0
>>> 
>>> print(m.end())
3
>>> 
>>> print(m.span())
(0, 3)

>>> # group(): returns the entire text matched by the pattern
>>> # groups(): returns a tuple of the substrings matched by each parenthesized group
>>> 
>>> import re
>>> w = "one two one two"
>>> 
>>> m = re.match("(one) (two)",w)
>>> 
>>> print(m)
<re.Match object; span=(0, 7), match='one two'>
>>> 
>>> print(m.group())
one two
>>> 
>>> print(m.groups())
('one', 'two')
>>> 
>>> # If the start of the string does not match, None is returned
>>> 
>>> m = re.match("two", w)
>>> 
>>> print(m)
None
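
Because re.match() returns None when the start of the string does not match, calling group() on the result directly would raise AttributeError. A minimal sketch of guarding against that follows; the helper name match_at_start is made up for illustration and is not part of the re module.

```python
import re


def match_at_start(pattern, text):
    """Return the text matched at the start of `text`, or None if nothing matches.

    Hypothetical helper for illustration only.
    """
    m = re.match(pattern, text)
    # re.match() gives None on failure, so test before calling group().
    if m is None:
        return None
    return m.group()


w = "one two one two"
print(match_at_start("one", w))  # one
print(match_at_start("two", w))  # None
```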

■ Finding a pattern anywhere in the string: search()

>>> # search() can also find text that is not at the start of the string
>>> # Like re.match(), it returns a match object on a match
>>> 
>>> import re
>>> w = "one two one two"
>>> 
>>> m = re.search("one",w)
>>> 
>>> print(m)
<re.Match object; span=(0, 3), match='one'>
>>> 
>>> m = re.search("two",w)
>>> 
>>> print(m)
<re.Match object; span=(4, 7), match='two'>
>>> # ↑ Even if the pattern occurs several times, only the first match is returned
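
To make the difference between the two functions concrete, here is a small side-by-side sketch using the same string as above. The variable names at_start and anywhere are only illustrative.

```python
import re

w = "one two one two"

at_start = re.match("two", w)   # only tries position 0
anywhere = re.search("two", w)  # scans forward and stops at the first hit

print(at_start)         # None
print(anywhere.span())  # (4, 7)

# start() and end() can be used to slice the original string.
print(w[anywhere.start():anywhere.end()])  # two
```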

■ Returning all matches as a list: findall()

>>> # Returns all matches as a list
>>> # What it returns is not match objects
>>> 
>>> import re
>>> 
>>> w = "one two one two"
>>> 
>>> m = re.findall("one",w)
>>> 
>>> print(m)
['one', 'one']
>>> 
>>> m = re.findall("one two",w)
>>> 
>>> print(m)
['one two', 'one two']
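
One detail worth knowing is that the shape of the findall() result changes when the pattern contains parenthesized groups. The sketch below reuses the string from above; the variable names are only illustrative.

```python
import re

w = "one two one two"

no_group = re.findall("one two", w)       # whole matches
one_group = re.findall("(one) two", w)    # only the text of the single group
two_groups = re.findall("(one) (two)", w) # one tuple per match

print(no_group)    # ['one two', 'one two']
print(one_group)   # ['one', 'one']
print(two_groups)  # [('one', 'two'), ('one', 'two')]
```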

■ Returning all matches as an iterator: finditer()

>>> # re.finditer(): returns the matches as an iterator of match objects
>>> # Unlike re.findall(), it gives you match objects
>>> 
>>> 
>>> import re
>>> w = "one two one two"
>>> 
>>> m = re.finditer('one', w)
>>> 
>>> print(m)
<callable_iterator object at 0x107b79fd0>
>>> 
>>> for match in m:
...     print(match)
... 
<re.Match object; span=(0, 3), match='one'>
<re.Match object; span=(8, 11), match='one'>
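
Since finditer() yields match objects, the position of every occurrence is available, which findall() cannot provide. A short sketch, using the same string as above, also shows that the iterator can be consumed only once.

```python
import re

w = "one two one two"

# Collect the matched text together with its start and end positions.
positions = [(m.group(), m.start(), m.end()) for m in re.finditer("one", w)]
print(positions)  # [('one', 0, 3), ('one', 8, 11)]

# The iterator is exhausted after one pass.
it = re.finditer("one", w)
first_pass = [m.span() for m in it]
second_pass = [m.span() for m in it]
print(first_pass)   # [(0, 3), (8, 11)]
print(second_pass)  # []
```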

■ Replacing matches: sub() / subn()

>>> import re
>>> w = "one two one two"
>>> 
>>> m = re.sub("one", "ONE",w)
>>> 
>>> print(m)
ONE two ONE two
>>> 
>>> m = re.sub("one two", "ONE TWO", w)
>>> 
>>> print(m)
ONE TWO ONE TWO
>>> 
>>> # Wrapping part of the pattern in parentheses lets the replacement string refer to the matched text
>>> 
>>> m = re.sub("(one) (two)", "\\1X\\2",w)
>>> 
>>> print(m)
oneXtwo oneXtwo
>>> 
>>> m = re.sub('(one) (two)', r'\1X\2', w)
>>> 
>>> print(m)
oneXtwo oneXtwo
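
The heading mentions subn(), which the session above does not exercise. As a brief sketch: subn() performs the same replacement as sub() but returns a tuple of the new string and the number of replacements. The count argument and the function-as-replacement form shown here are also standard features of re.sub(); the variable names are only illustrative.

```python
import re

w = "one two one two"

# subn() returns (new_string, number_of_replacements).
new_text, n = re.subn("one", "ONE", w)
print(new_text)  # ONE two ONE two
print(n)         # 2

# count limits how many occurrences are replaced.
first_only = re.sub("one", "ONE", w, count=1)
print(first_only)  # ONE two one two

# The replacement may be a function that receives each match object.
upper = re.sub("one|two", lambda m: m.group().upper(), w)
print(upper)  # ONE TWO ONE TWO
```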

■ Splitting a string by a pattern: split()

>>> import re
>>> w = "one two one two"
>>> m = re.split(" ", w)
>>> 
>>> print(m)
['one', 'two', 'one', 'two']
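
Splitting on a single space could also be done with str.split(), so re.split() is mainly useful when the separator is a real pattern. The sketch below assumes a made-up input string with mixed separators; it also shows maxsplit and the effect of a capturing group in the pattern.

```python
import re

# Hypothetical input with commas, semicolons and runs of spaces.
text = "one,  two;one   two"
parts = re.split(r"[,;\s]+", text)
print(parts)  # ['one', 'two', 'one', 'two']

w = "one two one two"

# maxsplit stops after the given number of splits.
head_rest = re.split(" ", w, maxsplit=1)
print(head_rest)  # ['one', 'two one two']

# A capturing group keeps the separators in the result.
with_seps = re.split("( )", "one two")
print(with_seps)  # ['one', ' ', 'two']
```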

■ Compiling a regular expression object: compile()

>>> # re.compile(): when the same pattern is used repeatedly,
>>> # it is better to compile it once into a regular expression object
>>> 
>>> import re
>>> w = "one two one two"
>>> 
>>> com = re.compile("one")
>>> 
>>> m = com.match(w)
>>> print(m)
<re.Match object; span=(0, 3), match='one'>
>>> 
>>> m = com.findall(w)
>>> print(m)
['one', 'one']
>>> 
>>> m = com.sub("ONE", w)
>>> print(m)
ONE two ONE two
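
As a rough sketch of the repeated-use case, a pattern can be compiled once and then applied to many strings. The sample list lines is invented for illustration.

```python
import re

# Compile once, outside the loop.
pattern = re.compile("(one) (two)")

# Hypothetical sample data.
lines = ["one two one two", "two one", "one two"]

hits = []
for line in lines:
    m = pattern.search(line)
    # search() returns None when the line has no match.
    hits.append(m.groups() if m else None)

print(hits)  # [('one', 'two'), None, ('one', 'two')]

# The original pattern string stays available on the compiled object.
print(pattern.pattern)  # (one) (two)
```

The re module also caches recently used patterns internally, so for a handful of calls the benefit of compile() is often more about readability than speed.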

I will keep updating these articles, so please check back regularly.

If you have any requests regarding this article, feel free to send me a message! by You-Tarin

I also plan to move the content I have posted on Qiita to my blog over time.
