More than 1 year has passed since last update.

希望調査をもとに参加者のグループ配属を最適化する（主に制約条件の書き方のメモ）

Last updated at 2022-12-07Posted at 2022-11-17

この記事は？

pythonのpulpソルバを使用してグループ分けを実現するためのtipsを備忘録として残しておく．
本記事の内容は個人の学習記録であり，所属する組織とは無関係である．

シチュエーション

内容の異なる複数のグループワークがあり，参加者はどれか一つだけに参加するシチュエーションを考える．
参加者の全てのグループの志望度（例えば10段階評価）の調査が完了している．

目的

参加者全体の達成志望度が最大になる班分けの実行．
その際に，各グループの定員などの制約条件も満たすようにする．

方法

pythonのpulpパッケージを使用する．
使用方法等の説明は分かりやすい記事に譲る．

データ

参加者IDをindexに持ち，各グループの希望度やグループ分けに使用する参加者の属性が格納された下記のようなデータフレーム(df 表1)．下記条件のものを想定．
・各参加者の希望度は０以上の実数で合計値が参加者間で一定に調整されている．
・参加者の属性はバイナリデータである．

表1. 想定データ(df)

ID	group_A	group_B	...	group_X	flag_0	flag_1	...
0000	0	8	...	0	1	0	...
0001	2	3	...	3	1	1	...
0002	3	5	...	0	0	1	...
...	...	...	...	...	...	...	...

目的関数・問題設定

指定した制約条件下でD・t(X)が最大となるようにXを設定する．

D: 参加者の志望度行列(表1の左部分)
X: 参加者配属行列(配属する場合は1, 配属しない場合は0が格納された行列, 表2)

表2. 参加者配属行列

ID	group_A	group_B	...	group_X
0000	0	1	...	0
0001	0	0	...	1
0002	1	0	...	0
...	...	...	...	...

import pulp

# 問題の宣言
problem = pulp.LpProblem('GroupAssignmentProblem',
                         pulp.LpMaximize)

# 参加者のリスト
participants_list = df['ID'].tolist()
# グループのリスト
group_list = ["group_A","group_B",...,"group_X"]
# 参加者とグループのペアのリスト
participant_group = [(p,g) for p in participants_list for g in group_list]

# 参加者をどのクラスに割り当てるかを変数xとして定義する
# 各参加者が各グループに属するか否かを 0 or 1 で最適化するので、Binaryを指定
x = pulp.LpVariable.dicts('x', participant_group, cat='Binary')

# 目的関数の追加
problem += pulp.lpSum([x[p,g] * df.loc[df["ID"]==p,g].values for p in participants_list for g in groups])

制約条件の追加

線形な数式で表せる条件でないと実装できない

各参加者は必ずどれか一つのグループに所属する

全ての参加者pにおいて，sum(x[p,:]) == 1

for p in participants_list:
    problem.addConstraint(
        pulp.lpSum([x[p,g] for g in groups])==1)

全てのグループの定員は偏らないようにする

参加者数をグループ数で割った下図の商 ~ +1 の範囲内に収める

num_of_groups = len(groups)
min_num_of_member = math.floor(len(df)/num_of_groups)
max_num_of_member = math.ceil(len(df)/num_of_groups)

for g in groups:
    problem.addConstraint(
        pulp.lpSum([x[p,g] for p in participants_list]) >= min_num_of_member)
    problem.addConstraint(
        pulp.lpSum([x[p,g] for p in participants_list]) <= max_num_of_member)

(オプション)ある特定の参加者属性が偏らないようにする

上記と同様

min_num_of_flag_0 = math.floor(sum(df["flag_0"]==1)/num_of_groups)
max_num_of_flag_0 = math.ceil(sum(df["flag_0"]==1)/num_of_groups)

flag_0_list = df.loc[df["flag_0"]==1,"ID"].to_list()

for g in groups:
    problem.addConstraint(
        pulp.lpSum([x[p,g] for p in flag_0_list]) >= min_num_of_flag_0)
    problem.addConstraint(
        pulp.lpSum([x[p,g] for p in flag_0_list]) <= max_num_of_flag_0)

割合を一定範囲で収めたい，等のゆるい制約を貸す場合は場合は以下のように書ける

flag_1_list = df.loc[df["flag_1"]==1,:].index.to_list()
th_flag_1 = np.ceil(min_num_of_member * 0.4)

for g in groups:
    problem.addConstraint(
        pulp.lpSum([x[p,g] for p in flag_1_list]) >= th_flag_1)

(オプション)ある特定の参加者同士が同じグループにならないようにする

同じグループにならない = 両者の要素が互いに1になるグループが存在しない = 各グループ足しても1以下

ID1, ID2 = "0001", "0002"
for g in groups:
    problem += x[ID1, g] + x[ID2, g] <= 1

(オプション)ある特定の参加者同士が同じグループになるようにする

同じグループになる = 1になるグループが同じ = 各グループの差が0

ID1, ID3 = "0001", "0003"
for g in groups:
    problem += x[ID1, g] - x[ID3, g] == 0

最適化問題を解く

status = problem.solve()

結果の抜き出し

result_array = [[key[0],key[1]] for key in x.keys() if x[key].value()==1]
df_result = pd.DataFrame(result_array, columns=['ID','groups'])
df_result = df_result.set_index('ID')

多くの参加者の希望が第一希望になっているかを確認する

RANK = []
for id in df_result.index:
    R = df_result.loc[id, groups].rank(ascending=False,method='min')[df_result.loc[id].groups]
    RANK.append(R)

import matplotlib.pyplot as plt
plt.rcParams['pdf.fonttype'] = 42
plt.rcParams['font.family'] = 'Arial'
plt.grid()
plt.hist(RANK)

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up