More than 3 years have passed since last update.

組み合わせ問題をスクラッチ開発してみた

Last updated at 2021-12-04Posted at 2021-11-24

プログラミングを学び初め、ある程度の知識を取得したと思うので、身近な題材にアプローチしてみた。

その題材とはこれ！！
“とあるスーパー、過去3ヵ月以内のレシートを合計N万円分集めると500円券をN枚貰える“

なぜこの題材なのか、、、
計算が億劫で溜まってしまったレシート、迫る期限、端数を最小限にしたいドケチ精神、だからと言って欲しい金額になるように買い物するのは無駄な買い物しそうで嫌だ

このような理由から、自動で最適な組み合わせを教えてくれるコードを作ることにした。余談ですが、最近、車輪の再発明という慣用句を知りました。

構想

レシート5枚の時、1枚組は5通り、2枚組は10通り、3枚組は10通り、4枚組は5通り、5枚組は1通りとなる。20枚とか計算する気にならない、、、
どうやってコードにすればええんやと悩み続けたところ、こんな方法を思いついた。

〇をレシート、番号をレシートの古い順、棒線を加算とした時、左上から順に探索すれば全パターンの合計値がわかるはずである。これをデータフレームで表し、各要素に加算していき、目標の金額を超えた段階で抽出するという方法である。

コード

構想を形にするべく破壊と創造を繰り返し、数多のエラーを潜り抜けた末、以下のコードが完成した。

import pandas as pd
import numpy as np

class ReceiptCal():
    def __init__(self,List_money,money):
        self.List_money=List_money
        self.money=money
        
    def Cal(self):
        List_columns=[]
        for columns in range(len(self.List_money)):
            List_columns.append(columns)
        List_index=[]
        for index in range(2**(len(self.List_money)-2)):
            List_index.append(index)
        
        #表を作成（df1：合計値、df2：使用したレシート番号）
        df1=pd.DataFrame(0,index=List_index ,columns=List_columns)
        df2=pd.DataFrame(str(0),index=List_index ,columns=List_columns)
        #N万円を超えた要素を格納するところ
        df_result1=pd.DataFrame(columns=["合計"])
        df_result2=pd.DataFrame(columns=["番号"])
        #------------------------ここから計算------------------------
        #カラム毎に計算（1層目を実行）
        for col in List_columns:
            df1.iloc[0:2**max(0,len(List_columns[col:])-2),col]+=List_money[col]
            df2.iloc[0:2**max(0,len(List_columns[col:])-2),col]=str(col)
            
            #層毎に計算（2層目以降）
            for i in range(len(List_columns[col:])-1):
                i+=1+col
                a=0
                
                #次の層に行く条件（処理中のカラムの最大indexまで処理が済むまで）
                while a!=2**max(0,len(List_columns[col:])-2):
                    count=2**max(0,len(List_columns[i:])-2)
                    List=[]
                    for c in range(max(1,count)):
                        List.append(","+str(List_columns[i]))
                        
                    df1.iloc[a:count+a,col]+=List_money[i]
                    df2.iloc[a:count+a,col]+=List
                    a+=count
                    i+=1
                    
                    #直前の処理でN万円超えた要素があれば、抽出し、dfは-1000000にする
                    if sum(df1[col]>=self.money)>0:
                        df_result1=df_result1.append(
                            df1[df1[col]>=self.money][[col]].rename(columns={col:"合計"})
                        )
                        df_result2=df_result2.append(
                            df2[df1[col]>=self.money][[col]].rename(columns={col:"番号"})
                        )
                        df1[col]=df1[col].where(df1[col]<self.money,-1000000)
                        
                    #ひとつ前の要素が処理最後ならば 
                    if (max([int(n) for n in df2.iloc[a-1,col].split(",")])
                        ==max(List_columns)):
                        #そこが最大indexでない及び処理最後であれば次のindexへ
                        while (a!=2**max(0,len(List_columns[col:])-2) 
                               and (max([int(n) for n in df2.iloc[a,col].split(",")])
                                    ==max(List_columns))):
                            a+=1    
                        #そこが最大indexでないならば、処理を行うために加算値を代入 
                        if a<2**max(0,len(List_columns[col:])-2):
                            i=max([int(n) for n in df2.iloc[a,col].split(",")])+1
                            
        df_result1=pd.concat(
            [df_result1,df_result2],axis=1
        ).drop_duplicates(subset=["番号"]).reset_index(drop=True)
        print("古い順",df_result1,sep="\n")
        print("お得順",df_result1.sort_values("合計"),sep="\n")

すごく見にくいです。申し訳ありません。

実行結果（レシート6枚）

まずは、手持ちのレシート6枚だけで実行してみた。

List_money=[347,2599,1706,1560,2405,5638]
money=10000
ReceiptCal(List_money=List_money,money=money).Cal()

結果

古い順                       お得順
     合計         番号           合計         番号
0   10290      0,1,2,5       3   10096      0,2,4,5
1   10144      0,1,3,5       1   10144      0,1,3,5
2   10989      0,1,4,5       0   10290      0,1,2,5
3   10096      0,2,4,5       9   10642        1,4,5
4   11850    0,1,2,3,5       2   10989      0,1,4,5
5   12695    0,1,2,4,5       14  11309      2,3,4,5
6   12549    0,1,3,4,5       10  11503      1,2,3,5
7   11656    0,2,3,4,5       7   11656    0,2,3,4,5
8   14255  0,1,2,3,4,5       4   11850    0,1,2,3,5
9   10642        1,4,5       12  12202      1,3,4,5
10  11503      1,2,3,5       11  12348      1,2,4,5
11  12348      1,2,4,5       6   12549    0,1,3,4,5
12  12202      1,3,4,5       5   12695    0,1,2,4,5
13  13908    1,2,3,4,5       13  13908    1,2,3,4,5
14  11309      2,3,4,5       8   14255  0,1,2,3,4,5

できてます！！！！
因みに、処理後のdf1とdf2はこうなってます。

df1
          0        1        2     3     4     5
0  -1000000 -1000000 -1000000  9603  8043  5638
1  -1000000 -1000000     8904  7198     0     0
2  -1000000 -1000000     9749     0     0     0
3  -1000000     9943     7344     0     0     0
4  -1000000 -1000000        0     0     0     0
5  -1000000     9797        0     0     0     0
6  -1000000 -1000000        0     0     0     0
7      8584     8237        0     0     0     0
8  -1000000        0        0     0     0     0
9      9251        0        0     0     0     0
10 -1000000        0        0     0     0     0
11     7691        0        0     0     0     0
12     9950        0        0     0     0     0
13     7545        0        0     0     0     0
14     8390        0        0     0     0     0
15     5985        0        0     0     0     0

df2 
              0          1        2      3    4  5
0   0,1,2,3,4,5  1,2,3,4,5  2,3,4,5  3,4,5  4,5  5
1     0,1,2,3,5    1,2,3,5    2,3,5    3,5    0  0
2     0,1,2,4,5    1,2,4,5    2,4,5      0    0  0
3       0,1,2,5      1,2,5      2,5      0    0  0
4     0,1,3,4,5    1,3,4,5        0      0    0  0
5       0,1,3,5      1,3,5        0      0    0  0
6       0,1,4,5      1,4,5        0      0    0  0
7         0,1,5        1,5        0      0    0  0
8     0,2,3,4,5          0        0      0    0  0
9       0,2,3,5          0        0      0    0  0
10      0,2,4,5          0        0      0    0  0
11        0,2,5          0        0      0    0  0
12      0,3,4,5          0        0      0    0  0
13        0,3,5          0        0      0    0  0
14        0,4,5          0        0      0    0  0
15          0,5          0        0      0    0  0

実行結果（レシート20枚）

手持ちのレシート全て計算させてみました。

import time
start_time = time.process_time()
List_money=[347,2599,1706,1560,2405,5638,
           2876,3375,2261,2746,2579,2067,
           2296,1239,534,1677,1985,989,
           4565,562
           ]
money=40000
ReceiptCal(List_money=List_money,money=money).Cal()
end_time = time.process_time()
p_time = end_time - start_time
print(p_time)

結果

古い順
       合計                                         　番号
0    40244          0,1,2,3,4,5,6,7,8,9,10,11,12,13,16,18
1    40682          0,1,2,3,4,5,6,7,8,9,10,11,12,15,16,18
2    40361         0,1,2,4,5,6,7,8,9,10,11,12,13,15,16,18
3    40111         0,1,2,4,5,6,7,8,9,10,11,12,15,16,17,18
4    40215         0,1,3,4,5,6,7,8,9,10,11,12,13,15,16,18
..     ...                                            ...
151  43097   1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18
152  40284  1,2,3,4,5,6,8,9,10,11,12,13,14,15,16,17,18,19
153  40498     2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18
154  40071     2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,18,19
155  40526     2,3,4,5,6,7,8,9,10,11,12,13,15,16,17,18,19

お得順
       合計                                     　     番号
85   40004    0,1,3,4,5,6,7,8,9,10,11,13,14,15,16,17,18,19
93   40014            1,2,4,5,6,7,8,9,10,11,12,13,15,16,18
143  40015      1,2,3,5,6,7,8,9,10,11,12,14,15,16,17,18,19
67   40021    0,1,2,3,4,5,6,7,8,10,11,12,14,15,16,17,18,19
147  40032      1,2,4,5,6,7,8,9,10,12,13,14,15,16,17,18,19
..     ...                                             ...
51   42455     0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,18
123  42563       1,2,3,4,5,6,7,8,9,10,11,12,13,15,16,17,18
55   42910     0,1,2,3,4,5,6,7,8,9,10,11,12,13,15,16,17,18
151  43097    1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18
91   43444  0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18

16633.125

出力されました。計算も合ってそう。でも感動が薄い。待ってる間の約5時間、不安に苛まれていたからです。

まとめ

今回、全てを計算する方法を取っています。というのも、古いレシートから使用したいことと、できる限りお得になるようにしたいことを、厳密に定める方法が思いつかなかったからです。そのため、この2パターンの結果から、最適解を主観で選択する方法に落ち着きました。参考資料無しで、頭の中の構想をそのままコードにしたために、非効率的な処理だと思います。一方で、とてもありがたいことに、世の中には組み合わせ問題の様々なアルゴリズムが発明されています。それらを活用すれば容易に処理時間の短縮が実現できるはずです。今後、研鑽を積み、より良くしていく所存です。

追記：こちらが実用的にした方法をまとめた記事になります。

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up