@Tiger140304 posted at 2024-04-18

Python Jupyter ValueError: cannot set a row with mismatched columns

Q&A

Closed

What I want to solve

In Jupyter, I want to fetch the data from "https://en.wikipedia.org/wiki/List_of_cryptocurrencies", so I would like to resolve the error raised by df.loc[length] = individual_row_data.

The error that occurs

ValueError                                Traceback (most recent call last)
Cell In[326], line 6
      3 individual_row_data = [data.text.strip() for data in row_data]
      5 length = len(df)
----> 6 df.loc[length] = individual_row_data

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pandas\core\indexing.py:911, in _LocationIndexer.__setitem__(self, key, value)
    908 self._has_valid_setitem_indexer(key)
    910 iloc = self if self.name == "iloc" else self.obj.iloc
--> 911 iloc._setitem_with_indexer(indexer, value, self.name)

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pandas\core\indexing.py:1932, in _iLocIndexer._setitem_with_indexer(self, indexer, value, name)
   1929     indexer, missing = convert_missing_indexer(indexer)
   1931     if missing:
-> 1932         self._setitem_with_indexer_missing(indexer, value)
   1933         return
   1935 if name == "loc":
   1936     # must come after setting of missing

File ~\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\pandas\core\indexing.py:2306, in _iLocIndexer._setitem_with_indexer_missing(self, indexer, value)
   2303     if is_list_like_indexer(value):
   2304         # must have conforming columns
   2305         if len(value) != len(self.obj.columns):
-> 2306             raise ValueError("cannot set a row with mismatched columns")
...
   2310 if not len(self.obj):
   2311     # We will ignore the existing dtypes instead of using
   2312     #  internals.concat logic

ValueError: cannot set a row with mismatched columns
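
The length check at the bottom of the traceback (`len(value) != len(self.obj.columns)`) can be reproduced without any scraping. The following is a minimal sketch, not the code from the question; the column names and row values are made up for illustration.

```python
import pandas as pd

# Hypothetical three-column frame; the names are placeholders.
df = pd.DataFrame(columns=["Currency", "Symbol", "Notes"])

good_row = ["Bitcoin", "BTC", "example"]  # 3 values for 3 columns
bad_row = ["Litecoin", "LTC"]             # only 2 values

df.loc[len(df)] = good_row  # lengths match, so the row is appended

error_message = None
try:
    df.loc[len(df)] = bad_row  # lengths differ, so pandas refuses
except ValueError as exc:
    error_message = str(exc)

# Comparing the two lengths is the quickest way to spot the mismatch.
print(len(df.columns), len(bad_row))
print(error_message)
```

Printing `len(df.columns)` next to the length of the row being assigned, just before the failing line, shows directly which side is off.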




### Relevant source code
```python
from bs4 import BeautifulSoup
import requests

url = "https://en.wikipedia.org/wiki/List_of_cryptocurrencies"
page = requests.get(url)
soup = BeautifulSoup(page.text, "html")
print(soup)

soup.find('table', class_ = "wikitable sortable" )

table = soup.find_all('table')[1]

print(table)

world_titles = soup.find_all('th')

world_titles

world_table_titles = [title.text.strip() for title in world_titles]
print(world_table_titles)

df = pd.DataFrame(columns = world_table_titles)
df

for row in column_data[1:]:
    row_data = row.find_all('td')
    individual_row_data = [data.text.strip() for data in row_data]

    length = len(df)
    df.loc[length] = individual_row_data
    df
```

What I tried

I searched around and tried the fixes I found for this error.

4 Answers

@nak435 posted at 2024-04-18

for row in column_data[1:]:
    row_data = row.find_all('td')
    individual_row_data = [data.text.strip() for data in row_data]
   
    length = len(df)
    df.loc[length] = individual_row_data

↑ Is this all of the code?
I cannot investigate from an excerpt alone, so please post the whole thing.

Comments

@Tiger140304
Questioner
Understood.

@nak435

NameError                                 Traceback (most recent call last)
<ipython-input-1-5ddd5aa84ecb> in <cell line: 22>()
     20 print(world_table_titles)
     21 
---> 22 df = pd.DataFrame(columns = world_table_titles)
     23 df
     24 

NameError: name 'pd' is not defined

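It fails earlier than that, with a different error. Is this really the code you ran?

pd is indeed undefined, so it was probably dropped when you excerpted the code.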

@nak435

import pandas as pd
I added the line above, got further, and then hit the following error.

Empty DataFrame
Columns: [Year of introduction, Currency, Symbol, Founder(s), Hash algorithm, Programming language of implementation, Consensus mechanism, Notes, Release, Currency, Symbol, Founder(s), Hash algorithm, Programming language of implementation, Cryptocurrency blockchain (PoS, PoW, or other), Notes, 2014, 2017, 2018, vteCryptocurrencies, Technology, Consensus mechanisms, Proof of work currencies, SHA-256-based, Ethash-based, Scrypt-based, Equihash-based, RandomX-based, X11-based, Other, Proof of stake currencies, ERC-20 tokens, Stablecoins, Other currencies, Inactive currencies, Cryptocurrency exchanges, Defunct, Crypto service companies, Related topics]
Index: []
[0 rows x 39 columns]
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-3-c9612b4a2ccd> in <cell line: 26>()
     24 print(4, df)
     25 
---> 26 for row in column_data[1:]:
     27     row_data = row.find_all('td')
     28     individual_row_data = [data.text.strip() for data in row_data]

NameError: name 'column_data' is not defined

column_data is undefined, so this also looks like a mistake made when excerpting.

Could the Empty DataFrame (0 rows?) also be a problem?

@Tiger140304
Questioner
column_data = table.find_all('tr')
for row in column_data[1:]:
    row_data = row.find_all('td')
    individual_row_data = [data.text.strip() for data in row_data]
    length = len(df)
    df.loc[length] = individual_row_data
I changed it to the above, but got the error described earlier.
@nak435
After adding column_data = table.find_all('tr'), I was finally able to reproduce the error.

The cause is that
df = pd.DataFrame(columns = world_table_titles) sets up 39 columns, whereas
individual_row_data = [data.text.strip() for data in row_data] has only 8 elements per row, so the counts do not match.

You are probably expecting the first 8 columns of world_table_titles, so change the code as follows.
- df = pd.DataFrame(columns = world_table_titles)
+ df = pd.DataFrame(columns = world_table_titles[:8])
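
An alternative to slicing `world_table_titles[:8]` is to read the header cells from the same `table` element that the rows come from, so the column count matches by construction. The sketch below uses a small inline HTML string instead of the Wikipedia page, so the table contents and the `sample_html` name are illustrative assumptions. Rows whose cell count differs from the header are skipped rather than appended.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Made-up stand-in for the page: a second, unrelated table adds an extra <th>.
sample_html = """
<table class="wikitable">
  <tr><th>Release</th><th>Currency</th><th>Symbol</th></tr>
  <tr><td>2009</td><td>Bitcoin</td><td>BTC</td></tr>
  <tr><td>2011</td><td>Litecoin</td><td>LTC</td></tr>
  <tr><td>malformed row</td></tr>
</table>
<table><tr><th>Related topics</th></tr></table>
"""

soup = BeautifulSoup(sample_html, "html.parser")
table = soup.find("table", class_="wikitable")

# Headers come from the first row of this table only,
# not from every <th> on the page.
header_row = table.find("tr")
titles = [th.text.strip() for th in header_row.find_all("th")]

rows = []
for tr in table.find_all("tr")[1:]:
    cells = [td.text.strip() for td in tr.find_all("td")]
    if len(cells) == len(titles):  # skip rows that do not fit the header
        rows.append(cells)

# Build the frame once instead of appending row by row with df.loc.
df = pd.DataFrame(rows, columns=titles)
print(df)
```

Collecting the rows in a list and constructing the DataFrame once also avoids growing the frame one `df.loc[length]` assignment at a time.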
@Tiger140304
Questioner

That solved it, but the display in Excel ends up looking like this.
@nak435
Are you copying and pasting the printed output of df into Excel?

Write the final df to a CSV file and open that file in Excel.
df.to_csv("sample.csv")
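
When the CSV is meant for Excel, two `to_csv` options are often useful: `index=False` leaves out the row-number column, and `encoding="utf-8-sig"` writes a byte order mark, which Excel uses to recognize UTF-8. This is a sketch under assumptions: the frame contents and the temporary output location are made up, and whether Excel splits the columns on a plain double-click can still depend on the system's list separator setting.

```python
import os
import tempfile

import pandas as pd

# Made-up data standing in for the scraped table.
df = pd.DataFrame(
    {"Currency": ["Bitcoin", "Litecoin"], "Symbol": ["BTC", "LTC"]}
)

# Hypothetical output location; replace with your own path.
csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")

# index=False omits the 0, 1, 2, ... index column.
# utf-8-sig prepends a BOM so Excel can detect the encoding.
df.to_csv(csv_path, index=False, encoding="utf-8-sig")

# Read the file back to confirm it is plain comma-separated text.
with open(csv_path, encoding="utf-8-sig") as f:
    csv_text = f.read()
print(csv_text)
```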

@Tiger140304 posted at 2024-04-19

df.to_csv(r'C:\Users\OOOOO\OneDrive\デスクトップ\csv\crypto.data.csv')
I tried this, but it turned out as shown above.

Comments

@nak435

df.to_csv(r'C:\Users\OOOOO\OneDrive\デスクトップ\csv\crypto.data.csv')
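I tried this, but it turned out as shown above.

In the earlier screenshot the file name was crypto.data, so I think the comma separation was not being handled correctly.
Please show a screenshot of crypto.data.csv opened in Excel.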

@Tiger140304 posted at 2024-04-19

I tried it.

Comments

@nak435
↑ Is this a CSV file? It is not comma-separated.

Please paste the contents of this file into a code block.

@Tiger140304 posted at 2024-04-19

It worked when I used Excel's option to get data from a CSV file.

Thank you very much.

Comments

@nak435
If this is resolved, please close this Q&A.
