More than 5 years have passed since last update.

listの中から最小のPrefixのみ抽出したい

Python3

Last updated at 2019-09-24Posted at 2019-09-22

例えば以下のデータがあるとする。


prefix_list = [
    'hogehoge',
    'foo',
    'foo1',
    'hoge',
    'ho'
]

抽出したい文字は、'ho'と'foo'である。hoはhoge, hogehogeのprefix、fooはfoo1のprefix。このように、最小のprefixをlistから抽出するアルゴリズムを考える。

sortとlist内包表記を使ってみる

自分が書いたコードは以下である。


def find_smallest_subset_prefix(prefix_list: List[str]) -> List[str]:
    result = []
    for prefix in sorted(prefix_list):
        if not [data for data in result if prefix.startswith(data)]:
            result.append(prefix)

    return result

list内包表記の部分が何とかならないかな、と思ったのでfilterも試してみたが速度がいまいちだった。
もっと効率が良いコードはないかと思い、webサイトを検索したところ他のコードも見つかった。(https://codereview.stackexchange.com/questions/150467/find-smallest-subset-prefixesから引用)


def find_smallest_subset_prefix_2(prefix_list: List[str]) -> Set[str]:
    prefixes = set()

    for w in prefix_list:
        if any(p for p in prefixes if w.startswith(p)):
            # A prefix exists in the prefixes set.
            continue

        # Re-build the set with all elements not matching the set.
        prefixes = set(p for p in prefixes if not p.startswith(w))
        prefixes.add(w)

    return prefixes

正直このコードより自分が書いたコードの方が効率がいいと思ったのだが、実際はこちらの方が早い場合もあるようだ。ただ、コードの可読性は自分のコードの方がよいように思う。

実は・・・

一番最初に自分が書いたコードはもっと効率の悪いコードだった。pythonに詳しいおっちゃんに教えてもらい最初に比べるとかなり可読性、速度共に改善されたが、自分としては不甲斐ない結果となってしまった。

もっといいコードあるぜ！という方はコメントお待ちしております～。でわでわ。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up