
Mapping two strings onto each other evenly [Python]

Posted at 2021-05-30

#Overview
I wrote a function that pairs up two strings in a balanced way, looking only at their lengths.
For example, given "abcdefg" and "123", the goal is a mapping such as [["abc","1"],["de","2"],["fg","3"]].
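#Input and output
The input is two strings.
The output is a list of pairs of substrings.
When the length of the longer string is exactly n times the length of the shorter one, each character of the shorter string is paired with n consecutive characters of the longer string.
When the lengths do not divide evenly, the remainder is absorbed by making the chunks closest to the start one character longer each.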
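
The rule above can be sketched for the lengths alone. This is a minimal illustration, not the article's implementation: `chunk_sizes` is a hypothetical helper name, and it assumes both lengths are positive integers with `longer_len >= shorter_len`.

```python
def chunk_sizes(longer_len, shorter_len):
    """Hypothetical helper: sizes of the chunks cut from the longer string."""
    # quotient = base chunk size, remainder = number of chunks that get one extra character
    quotient, remainder = divmod(longer_len, shorter_len)
    return [quotient + 1 if i < remainder else quotient
            for i in range(shorter_len)]


print(chunk_sizes(6, 3))  # [2, 2, 2]
print(chunk_sizes(7, 3))  # [3, 2, 2]
```

With 7 and 3 the quotient is 2 and the remainder is 1, so only the first chunk grows to 3 characters, which matches the "abc", "de", "fg" split in the overview.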

Concretely, the goal is to pass the following tests.

#(omitted)
def balancedAllocate(text1,text2):
  #(omitted)
  return output

if __name__ == "__main__":
  import unittest
  class TestAllocator(unittest.TestCase):
    def test_balancedAllocate(self):
      self.assertEqual(balancedAllocate("abcdef","123"),[["ab","1"],["cd","2"],["ef","3"]])
      self.assertEqual(balancedAllocate("abcdefg","123"),[["abc","1"],["de","2"],["fg","3"]])
      self.assertEqual(balancedAllocate("123","abcdef"),[["1","ab"],["2","cd"],["3","ef"]])
      self.assertEqual(balancedAllocate("123","abcdefg"),[["1","abc"],["2","de"],["3","fg"]])
  unittest.main()

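#Implementation
First, write a function that, given only the two character counts, returns how many characters to pair up at each step from the start.
Basically, compute how many times the shorter length fits into the longer one (the quotient); if there is a remainder r, add 1 to the first r entries of the list.
Since we do not know whether the first or the second argument is the longer (larger) one, that part is implemented so that either order works.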

def getAllocatedLength(num1,num2):
  #Track which argument is longer and which is shorter by index
  num = [num1,num2]
  if num1 > num2:
    longer, shorter = 0,1
  else:
    longer, shorter = 1,0
  #Quotient and remainder
  unit = num[longer]//num[shorter]
  mod = num[longer]%num[shorter]
  
  length = []
  for i in range(num[shorter]):
    l = [0,0]
    l[shorter] = 1
    l[longer] = unit
    #Add 1 while the position is below the remainder
    if i < mod: l[longer] += 1
    length.append(l)
  return length
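
To make the return value concrete, here is a condensed sketch that is intended to behave like `getAllocatedLength` for positive integers. The name `get_allocated_length_sketch` and the use of `divmod` are illustrative assumptions, not part of the original code.

```python
def get_allocated_length_sketch(num1, num2):
    # Hypothetical condensed variant, for illustration only.
    # Assumes both arguments are positive integers.
    # On a tie the second argument is treated as the "longer" one,
    # mirroring the `num1 > num2` check in the original.
    swapped = num1 <= num2
    longer_len, shorter_len = (num2, num1) if swapped else (num1, num2)
    unit, mod = divmod(longer_len, shorter_len)

    pairs = []
    for i in range(shorter_len):
        size = unit + 1 if i < mod else unit  # first `mod` entries get one extra
        pairs.append([1, size] if swapped else [size, 1])
    return pairs


print(get_allocated_length_sketch(7, 3))  # [[3, 1], [2, 1], [2, 1]]
print(get_allocated_length_sketch(3, 7))  # [[1, 3], [1, 2], [1, 2]]
```

Each inner list holds the number of characters to take from the first and the second string, in that order, so swapping the arguments swaps the columns.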

After that, all that remains is to slice the strings according to these lengths.

def balancedAllocate(text1,text2):
  #Guard: a length of 0 would cause a division by zero in getAllocatedLength, so handle empty strings here
  if not all([text1,text2]):
    return [[text1,text2]]

  length = getAllocatedLength(len(text1),len(text2))
  index1,index2=0,0
  tokens = []
  for len1,len2 in length:
    token = [text1[index1:index1+len1], text2[index2:index2+len2]]
    tokens.append(token)
    index1 += len1
    index2 += len2
  return tokens
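
The empty-string guard is easy to overlook, so the sketch below shows it together with the slicing step in one function. `balanced_allocate_sketch` is a hypothetical compact variant written for illustration; it assumes the same behaviour as the code above, namely that an empty input yields a single pair containing both strings unchanged.

```python
def balanced_allocate_sketch(text1, text2):
    # Hypothetical compact variant, for illustration only.
    # If either string is empty there is nothing to divide by,
    # so return the two inputs as one pair.
    if not text1 or not text2:
        return [[text1, text2]]

    swapped = len(text1) <= len(text2)
    longer, shorter = (text2, text1) if swapped else (text1, text2)
    unit, mod = divmod(len(longer), len(shorter))

    tokens, start = [], 0
    for i, ch in enumerate(shorter):
        end = start + unit + (1 if i < mod else 0)  # first `mod` chunks are one longer
        chunk = longer[start:end]
        tokens.append([ch, chunk] if swapped else [chunk, ch])
        start = end
    return tokens


print(balanced_allocate_sketch("abcdefg", "123"))  # [['abc', '1'], ['de', '2'], ['fg', '3']]
print(balanced_allocate_sketch("", "abc"))         # [['', 'abc']]
```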

#Test

def getAllocatedLength(num1,num2):
  #Track which argument is longer and which is shorter by index
  num = [num1,num2]
  if num1 > num2:
    longer, shorter = 0,1
  else:
    longer, shorter = 1,0
  #Quotient and remainder
  unit = num[longer]//num[shorter]
  mod = num[longer]%num[shorter]
  
  length = []
  for i in range(num[shorter]):
    l = [0,0]
    l[shorter] = 1
    l[longer] = unit
    #Add 1 while the position is below the remainder
    if i < mod: l[longer] += 1
    length.append(l)
  return length

def balancedAllocate(text1,text2):
  #Guard: a length of 0 would cause a division by zero in getAllocatedLength, so handle empty strings here
  if not all([text1,text2]):
    return [[text1,text2]]

  length = getAllocatedLength(len(text1),len(text2))
  index1,index2=0,0
  tokens = []
  for len1,len2 in length:
    token = [text1[index1:index1+len1], text2[index2:index2+len2]]
    tokens.append(token)
    index1 += len1
    index2 += len2
  return tokens

if __name__ == "__main__":
  import unittest
  class TestAllocator(unittest.TestCase):
    def test_balancedAllocate(self):
      self.assertEqual(balancedAllocate("abcdef","123"),[["ab","1"],["cd","2"],["ef","3"]])
      self.assertEqual(balancedAllocate("abcdefg","123"),[["abc","1"],["de","2"],["fg","3"]])
      self.assertEqual(balancedAllocate("123","abcdef"),[["1","ab"],["2","cd"],["3","ef"]])
      self.assertEqual(balancedAllocate("123","abcdefg"),[["1","abc"],["2","de"],["3","fg"]])
  unittest.main()

##Result

.
----------------------------------------------------------------------
Ran 1 test in 0.000s

OK
