[bokeh] `ValueError: cannot insert ~, already exists` is raised when creating a ColumnDataSource instance

Posted at 2021-12-27

Environment

  • Python 3.9
  • bokeh 2.4.2
  • pandas 1.3.3

What I want to do

I want to generate a plot with bokeh, a visualization library for Python.

The error

Creating an instance of bokeh.models.ColumnDataSource raised the error ValueError: cannot insert x, already exists.

foo.py
from bokeh.plotting import figure, output_file, save
from bokeh.models import ColumnDataSource
import pandas

index = pandas.Series([0, 1, 2, 3, 4], name="x")
data = {"x": [1, 2, 3, 4, 5], "y": [6, 7, 2, 3, 6]}
df = pandas.DataFrame(data=data, index=index)

source = ColumnDataSource(data=df)

p = figure()
p.circle(x="x", y="y", source=source)

output_file("foo.html")
save(p)

$ python foo.py
Traceback (most recent call last):
  File "foo.py", line 9, in <module>
    source = ColumnDataSource(data=df)
  File "/home/vagrant/.pyenv/versions/3.8.6/lib/python3.8/site-packages/bokeh/models/sources.py", line 227, in __init__
    raw_data = self._data_from_df(raw_data)
  File "/home/vagrant/.pyenv/versions/3.8.6/lib/python3.8/site-packages/bokeh/models/sources.py", line 272, in _data_from_df
    _df.reset_index(inplace=True)
  File "/home/vagrant/.pyenv/versions/3.8.6/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/home/vagrant/.pyenv/versions/3.8.6/lib/python3.8/site-packages/pandas/core/frame.py", line 5799, in reset_index
    new_obj.insert(0, name, level_values)
  File "/home/vagrant/.pyenv/versions/3.8.6/lib/python3.8/site-packages/pandas/core/frame.py", line 4414, in insert
    raise ValueError(f"cannot insert {column}, already exists")
ValueError: cannot insert x, already exists

Cause

The ColumnDataSource constructor calls pandas.DataFrame.reset_index internally.
https://github.com/bokeh/bokeh/blob/d8ac28b6997b969fbb5f808927a09a1a57e659b9/bokeh/models/sources.py#L271
At that point the index name x collided with the column name x, which caused the error above.
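
The collision can be reproduced with pandas alone, without bokeh. The following is a minimal sketch that rebuilds the DataFrame from foo.py and calls reset_index directly. The last part is an assumption on my side, not something from the bokeh documentation: renaming the index (the name "row" is arbitrary) is one possible way to avoid the collision.

```python
import pandas

# Same DataFrame as in foo.py: the index is named "x" and a column "x" also exists.
index = pandas.Series([0, 1, 2, 3, 4], name="x")
data = {"x": [1, 2, 3, 4, 5], "y": [6, 7, 2, 3, 6]}
df = pandas.DataFrame(data=data, index=index)

# reset_index() tries to insert the index as a new column named "x" and fails,
# because a column with that name already exists.
try:
    df.reset_index()
    message = None
except ValueError as e:
    message = str(e)
print(message)

# Assumption: giving the index a different name ("row" is an arbitrary choice)
# removes the collision, so reset_index() succeeds.
renamed = df.rename_axis("row")
columns = list(renamed.reset_index().columns)
print(columns)
```

Since the ColumnDataSource constructor runs the same reset_index call, passing a DataFrame like `renamed` should not hit this error.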

If the DataFrame has a named index, ColumnDataSource tries to create a column with that name. I think that is why it uses reset_index.

  • If the DataFrame has a named index column, the ColumnDataSource will also have a column with this name.
  • If the index name is None, the ColumnDataSource will have a generic name: either index (if that name is available) or level_0.

Quoted from https://docs.bokeh.org/en/latest/docs/user_guide/data.html#using-a-pandas-dataframe
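
The naming rules in the quote match what pandas.DataFrame.reset_index does on its own. The sketch below only exercises pandas, on the assumption that ColumnDataSource takes its column names from reset_index as described above. The DataFrames and the index name "t" are made up for illustration.

```python
import pandas

# Named index: reset_index() produces a column with the index name.
named = pandas.DataFrame({"y": [6, 7, 2]}, index=pandas.Index([0, 1, 2], name="t"))
named_columns = list(named.reset_index().columns)

# Unnamed index: the generic name "index" is used.
unnamed = pandas.DataFrame({"y": [6, 7, 2]})
unnamed_columns = list(unnamed.reset_index().columns)

# Unnamed index, but a column called "index" already exists: "level_0" is used.
taken = pandas.DataFrame({"index": [1, 2, 3], "y": [6, 7, 2]})
taken_columns = list(taken.reset_index().columns)

print(named_columns, unnamed_columns, taken_columns)
```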
