
[Python] I wrote a function for bulk string replacement that also supports regular expressions.

Python

Posted at 2020-10-22

#Introduction
When you replace several patterns in a string, the code tends to become verbose, like the examples below.

# For example, suppose you want replacements like these:
# " <-> '
# abc...z -> *
# ABC...Z -> AA, BB, CC, ..., ZZ

text = "'abc'" + '"ABC"'

# Pattern 1
replaced_text = text.replace('"', '#').replace("'", '"').replace('#', "'").replace("a", "*"). .......

# Pattern 2
trdict = str.maketrans({'"': "'", "'": '"', "a": "*", "b": "*", .......})
replaced_text = text.translate(trdict)

# Pattern 3
import re
# "([A-Z])" needs the capture group so that \1 has something to refer to
replaced_text = re.sub("#", '"', re.sub('"', "'", re.sub("'", '#', re.sub("[a-z]", "*", re.sub("([A-Z])", "\\1\\1", text)))))

#etc...

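In addition, with approaches like patterns 1 and 3 the replacements are applied one after another, so you also have to account for unintended results, such as an already-replaced character being replaced again.
On the other hand, if you cannot use regular expressions the way pattern 3 does, you end up doing a great deal of needless work, such as listing every character by hand.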
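To make the "replaced again" problem concrete, here is a minimal sketch of a quote swap. The variable names `naive`, `table`, and `swapped` are mine, for illustration only.

```python
text = "'abc'" + '"ABC"'  # the string  'abc'"ABC"

# Chained replace: by the time the second replace runs, the original
# double quotes have already been turned into single quotes, so every
# quote ends up as a double quote.
naive = text.replace('"', "'").replace("'", '"')
print(naive)  # "abc""ABC"

# str.translate maps each character exactly once, so the swap works,
# but the table can only hold single characters, not regular expressions.
table = str.maketrans({'"': "'", "'": '"'})
swapped = text.translate(table)
print(swapped)  # "abc"'ABC'
```

This is why pattern 1 above goes through a temporary `#` placeholder, which in turn breaks as soon as the text itself contains a `#`.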

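To get rid of these frustrations, I wrote a function that

supports regular expressions,
takes all the replacement patterns together as a dictionary, and
performs every replacement at the same time.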

#What I made

import re
from typing import Dict

def replaces(text: str, trdict: Dict[str, str]) -> str:
    """
    IN:
        Source text
        Replacement dictionary
    OUT:
        Replaced text
        
    NOTE:
        You can use regular expressions.
        If more than one pattern is matched, 
        the pattern closest to the front of the dictionary takes precedence.
    
    EXAMPLE:
        text = "'abc'" + '"ABC"'
        replaces(text, {"'": '"', '"': "'", "[a-z]": "*", "([A-Z])": "\\1\\1"})
        
        ---> "***"'AABBCC'
    """
    return re.sub(
        "|".join(trdict.keys()), lambda m: next(
            (re.sub(pattern, trdict[pattern], m.group(0)) for pattern in trdict
             if re.fullmatch(pattern, m.group(0)))), text)
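Since the one-liner is dense, here is an unrolled sketch of what I understand it to do. The name `replaces_verbose` and the helper `substitute` are hypothetical and used only for this illustration. One deliberate difference: where the original would raise `StopIteration` if no key matched the whole matched text, this sketch returns the matched text unchanged.

```python
import re
from typing import Dict


def replaces_verbose(text: str, trdict: Dict[str, str]) -> str:
    # Join every key into one alternation so the text is scanned only once.
    combined = re.compile("|".join(trdict.keys()))

    def substitute(match: re.Match) -> str:
        matched = match.group(0)
        # Look for the first key (in dictionary order) that matches
        # the whole matched text.
        for pattern, replacement in trdict.items():
            if re.fullmatch(pattern, matched):
                # Re-apply that single pattern so that group references
                # such as \1 in the replacement are resolved.
                return re.sub(pattern, replacement, matched)
        # Assumption of this sketch: leave the text alone if no key fits.
        return matched

    return combined.sub(substitute, text)


result = replaces_verbose(
    "'abc'" + '"ABC"',
    {"'": '"', '"': "'", "[a-z]": "*", "([A-Z])": "\\1\\1"},
)
print(result)  # "***"'AABBCC'
```

Because each piece of text is matched once by the combined pattern and replaced once, the output of one replacement is never fed into another.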

#Usage
1st argument: the source string
2nd argument: the replacement dictionary {before: after}
Return value: the replaced string

text = "'abc'" + '"ABC"'
trdict = {"'": '"', '"': "'", "[a-z]": "*", "([A-Z])": "\\1\\1"}
replaces(text, trdict)
# ---> "***"'AABBCC'

When the text matches more than one pattern in the dictionary, the pattern nearer the front takes precedence.
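A small sketch of that precedence rule, using the `replaces` function from this article (repeated here so the snippet runs on its own). The sample patterns and the names `specific_first` and `general_first` are only for illustration. It relies on dictionaries preserving insertion order, which Python guarantees from 3.7 onward.

```python
import re
from typing import Dict


def replaces(text: str, trdict: Dict[str, str]) -> str:
    # Same body as the function defined in this article.
    return re.sub(
        "|".join(trdict.keys()), lambda m: next(
            (re.sub(pattern, trdict[pattern], m.group(0)) for pattern in trdict
             if re.fullmatch(pattern, m.group(0)))), text)


# "a" comes first, so it wins over "[a-z]" for the letter a.
specific_first = replaces("abc", {"a": "1", "[a-z]": "*"})
print(specific_first)  # 1**

# "[a-z]" comes first, so the "a" entry is never used.
general_first = replaces("abc", {"[a-z]": "*", "a": "1"})
print(general_first)  # ***
```

So it seems safest to put the more specific patterns ahead of the more general ones.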
