How to duplicate and modify one row into two rows in Polars

Posted at 2025-07-15

Goal

Consider the following df.

import polars as pl

df = pl.DataFrame({
    "Path": ["A/B", "C/D"],
    "Score": [12, 43],
})
df

shape: (2, 2)
┌──────┬───────┐
│ Path ┆ Score │
│ ---  ┆ ---   │
│ str  ┆ i64   │
╞══════╪═══════╡
│ A/B  ┆ 12    │
│ C/D  ┆ 43    │
└──────┴───────┘

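This article shows several ways to transform this df into the following: each row becomes two rows, with Path split at "/" and Score split into its tens digit and ones digit.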

shape: (4, 2)
┌──────┬───────┐
│ Path ┆ Score │
│ ---  ┆ ---   │
│ str  ┆ i64   │
╞══════╪═══════╡
│ A    ┆ 1     │
│ B    ┆ 2     │
│ C    ┆ 4     │
│ D    ┆ 3     │
└──────┴───────┘

Preparation

First, prepare an object that bundles the column expressions used below.

from types import SimpleNamespace

col = SimpleNamespace(
    Path=pl.col("Path"),
    Score=pl.col("Score"),
    index=pl.col("index"),
)

The straightforward approach

This approach can be written almost mechanically, one row at a time.

rows = []
for row in df.rows(named=True):
    path1, path2 = row["Path"].split("/")
    score1, score2 = divmod(row["Score"], 10)
    rows.append({"Path": path1, "Score": score1})
    rows.append({"Path": path2, "Score": score2})
pl.DataFrame(rows)

The code is easy to read, but it feels verbose.

The concatenation approach

This approach is a little more Polars-like.

df1 = df.select(
    col.Path.str.split("/").list[0],
    col.Score // 10,
)
df2 = df.select(
    col.Path.str.split("/").list[1],
    col.Score % 10,
)
pl.concat([df1, df2]).sort(col.Path)

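It does redundant work (Path is split twice, and two intermediate DataFrames are built, concatenated, and sorted), and it is not very memory-efficient either.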

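The long approach

This approach chains the following steps.

Turn each value into a list
Duplicate each row into two rows
Add an index
Select a list element depending on whether the row is even or odd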

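def divmod10(x):
    # Return [tens digit, ones digit] as a list
    return [*divmod(x, 10)]

df.select(
    # Turn each value into a two-element list
    col.Path.str.split("/"),
    col.Score.map_elements(divmod10, return_dtype=list[int]),
).select(
    # Wrap each list in a list of two copies...
    pl.all().repeat_by(2),
).explode(
    # ...and explode it, so every row appears twice
    pl.all()
).with_row_index(
    # Adds a 0-based "index" column (the default name)
).select(
    # Even rows take element 0, odd rows take element 1
    Path=col.Path.list[col.index % 2],
    Score=col.Score.list[col.index % 2],
)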

It is long.
