Splitting a string according to registered words (Python)

Posted at 2020-07-27

I wrote a class that splits a string using a set of registered words.

#Overview

The class splits a string into the words that have been registered in advance.

When the words "かも", "かもめ", and "めだか" are registered,
the string "かもめだか" is split into "かも/めだか".

It is not split as "かもめ/だか", because "だか" is not a registered word.
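
The behavior above amounts to a backtracking search: try a word, and if the remainder cannot be split, go back and try another word. The following is a minimal sketch of that idea only. `split_words` is a hypothetical helper, not part of the class in this article, and it ignores delimiters.

```python
from typing import List, Optional


def split_words(text: str, words: List[str]) -> Optional[List[str]]:
    """Split text into registered words, or return None if that is impossible."""
    if text == "":
        return []  # nothing left, so the split succeeded
    for word in words:
        if text.startswith(word):
            rest = split_words(text[len(word):], words)
            if rest is not None:
                return [word] + rest  # this word leads to a complete split
            # otherwise backtrack and try the next candidate word
    return None


words = ["かもめ", "かも", "めだか"]  # "かもめ" is deliberately tried first
print(split_words("かもめだか", words))  # ['かも', 'めだか']
```

Here "かもめ" matches first, but the remainder "だか" cannot be split, so the search falls back to "かも" and succeeds with "めだか".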

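Note that the input is still recognized when it contains a delimiter (separator character), as in "かも/めだか".
Strictly speaking, every character that is not the first character of a registered word is treated as a delimiter.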
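
The delimiter rule can be sketched on its own as follows. This is an illustration under the assumption that only the set of first characters matters; `skip_delimiters` is a hypothetical helper name, not a method of the class.

```python
from typing import Set


def skip_delimiters(text: str, initials: Set[str]) -> str:
    """Drop leading characters until one that can start a registered word."""
    pos = 0
    while pos < len(text) and text[pos] not in initials:
        pos += 1
    return text[pos:]


words = ["かも", "かもめ", "めだか"]
initials = {w[0] for w in words}  # the characters 'か' and 'め'

print(skip_delimiters("/めだか", initials))  # めだか
print(skip_delimiters("だか", initials))     # か
```

The second call shows a consequence of the rule: "だ" is not the first character of any registered word, so it is skipped just like "/".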

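If the word 'だか' were also registered,
the string could be split in two ways, "かもめ/だか" and "かも/めだか".
In that case the split whose leading word is longer, "かもめ/だか", is chosen.
(Basically, inputs that can be split in two different ways are not something this class was designed for.)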
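
One way to get this longest-first preference is to order the candidate words so that a longer word comes before any word that is its prefix. The sketch below is an assumption about how that ordering could be arranged; `order_candidates` is a hypothetical helper, not part of the class.

```python
from typing import List


def order_candidates(words: List[str]) -> List[str]:
    """Return words sorted so that a longer word precedes its own prefixes."""
    # In reverse lexicographic order "かもめ" sorts before "かも",
    # because a string always compares greater than its proper prefix.
    return sorted(words, reverse=True)


print(order_candidates(["かも", "めだか", "かもめ", "だか"]))
# ['めだか', 'だか', 'かもめ', 'かも']
```

Trying candidates in this order means "かもめ" is attempted before "かも", so "かもめ/だか" is found first when 'だか' is registered.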

#Usage

s = StringChopper()

s.Register('かもめ')
s.Register('かも')
s.Register('めだか')

chop = s.Chopper('かもめだか')
if chop is not None:
    print('/'.join(chop))  # prints かも/めだか
else:
    print('split failed')

#Source


from typing import Dict, List, Optional


class StringChopper:
    def __init__(self):
        # Registered words, grouped by their first character.
        self.initialdic: Dict[str, List[str]] = {}
    
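    def Register(self, name):
        # Ignore empty strings; they cannot be indexed by a first character.
        if len(name) == 0: return

        page = self.initialdic.get(name[0])
        if page is None:
            self.initialdic[name[0]] = [name]
            return

        # Keep the page in descending order so that a longer word is tried
        # before any shorter word that is its prefix ('かもめ' before 'かも').
        for i in range(len(page)):
            if name == page[i]: return  # already registered
            if name > page[i]:
                page.insert(i, name)
                return
        page.append(name)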
    
    def Print(self):
        for value in self.initialdic.values():
            for name in value:
                print(name)
    
    def Serialize(self) -> List[str]:
        result = []
        for value in self.initialdic.values():
            for name in value:
                result.append(name)

        return result
    
    def Deserialize(self, datalist):
        for data in datalist:
            self.Register(data)

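    def Chopper(self, namelist: str) -> Optional[List[str]]:
        # An empty input is a successful split with no words.
        if len(namelist) == 0: return []
        tmpname = namelist

        # Skip delimiters: drop characters until one that starts a registered word.
        page = None
        while 0 < len(tmpname):
            page = self.initialdic.get(tmpname[0])
            if page is not None:
                break
            tmpname = tmpname[1:]
        # Only delimiters were left, so no word can be matched here.
        if len(tmpname) == 0: return None

        # Try each candidate word and recurse on the remainder; backtrack on failure.
        for s in page:
            if tmpname.startswith(s):
                result = [s]
                a = self.Chopper(tmpname[len(s):])
                if a is None: continue
                result.extend(a)
                return result

        return None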
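
To picture the dictionary that Register builds, the following sketch groups words by their first character. It is only an illustration: `build_index` is a hypothetical helper, and it keeps insertion order instead of the sorted order that Register maintains.

```python
from collections import defaultdict
from typing import Dict, List


def build_index(words: List[str]) -> Dict[str, List[str]]:
    """Group words by first character, skipping empty strings and duplicates."""
    index = defaultdict(list)
    for word in words:
        if word and word not in index[word[0]]:
            index[word[0]].append(word)
    return dict(index)


index = build_index(["かもめ", "かも", "めだか"])
print(index)  # {'か': ['かもめ', 'かも'], 'め': ['めだか']}
```

Looking up the first character of the remaining text then yields only the words that could possibly match at that position.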
