About Pandas

More than 1 year has passed since last update.

Let's try out Pandas.

import pandas as pd

Open the files

df_white = pd.read_csv("./winequality-white.csv", sep=";")
df_red = pd.read_csv("./winequality-red.csv", sep=";")
df_white
df_red
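
As a small illustration of what sep=";" does, here is a sketch that parses a semicolon-separated text held in memory. The two data rows and the name df_sample are made up for the example and are not taken from the wine-quality files.

```python
import io
import pandas as pd

# Hypothetical two-row sample in a semicolon-separated layout
# (illustrative values, not the real wine-quality data).
sample_csv = io.StringIO(
    "fixed acidity;pH;quality\n"
    "7.0;3.00;6\n"
    "6.3;3.30;5\n"
)

# sep=";" tells read_csv to split columns on semicolons instead of commas.
df_sample = pd.read_csv(sample_csv, sep=";")

print(df_sample.shape)   # (number of rows, number of columns)
print(df_sample.dtypes)  # column types inferred by pandas
print(df_sample.head())  # first rows only, handy for large files
```

Without sep=";" the whole line would be read as a single column, so checking shape right after loading is a cheap sanity test.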

Add a column named type to each data frame.
Fill the type column with 0 for white wine and 1 for red wine.

df_white["type"] = 0
df_red["type"] = 1

df_white
df_red.head()
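
Assigning a single value to a new column fills every row with that value. The sketch below shows this on two tiny made-up frames (white and red are hypothetical names), along with assign, which returns a new frame instead of modifying the original.

```python
import pandas as pd

# Hypothetical miniature frames standing in for the wine data.
white = pd.DataFrame({"pH": [3.0, 3.3, 3.26]})
red_raw = pd.DataFrame({"pH": [3.51, 3.2]})

# In-place: the scalar 0 is broadcast to all three rows.
white["type"] = 0

# Not in-place: assign returns a copy with the extra column,
# leaving red_raw untouched.
red = red_raw.assign(type=1)

print(white)
print(red)
print(list(red_raw.columns))
```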

Combine df_white and df_red into a single data frame named df by stacking their rows. Also reset the index when combining them.

# DataFrame.append was removed in pandas 2.0; pd.concat stacks the rows instead.
df = pd.concat([df_white, df_red]).reset_index(drop=True)
df
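
Why the index needs resetting is easier to see on a tiny example. This sketch uses made-up frames (white, red) and pd.concat; ignore_index=True is an alternative way to get fresh row labels in one step.

```python
import pandas as pd

# Hypothetical miniature frames; the scalar in "type" is broadcast to every row.
white = pd.DataFrame({"pH": [3.0, 3.3], "type": 0})
red = pd.DataFrame({"pH": [3.51], "type": 1})

# Plain stacking keeps each frame's own row labels, so labels repeat.
stacked = pd.concat([white, red])

# ignore_index=True renumbers the rows from 0, like reset_index(drop=True).
renumbered = pd.concat([white, red], ignore_index=True)

print(stacked.index.tolist())
print(renumbered.index.tolist())
```

Duplicate labels make label-based lookups such as loc return several rows, which is usually not what is wanted after combining data sets.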

Check for missing values

df.isnull()

Check for missing values per feature (column)

df.isnull().any()
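
any() only says whether a column contains a missing value. To see how many there are, summing the boolean mask is a common follow-up. A minimal sketch on a made-up frame (toy is a hypothetical name):

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing pH value.
toy = pd.DataFrame({"pH": [3.0, np.nan, 3.3], "quality": [6, 5, 7]})

has_missing = toy.isnull().any()   # True/False per column
per_column = toy.isnull().sum()    # number of missing values per column
total = int(per_column.sum())      # number of missing values overall

print(has_missing)
print(per_column)
print(total)
```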

Check the number of records for each quality value

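# Number of non-null values in every column, per quality value
df.groupby("quality").count()
# The same counts built by hand: one entry per distinct quality value
tmp_set = set(df["quality"])
tmp_dict = {}
for s in tmp_set:
    cnt = sum(df["quality"] == s)
    tmp_dict[s] = cnt
print(tmp_dict)
# Boolean mask of the rows whose quality is 3
df["quality"] == 3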
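
The hand-written loop can also be expressed with value_counts or groupby(...).size(), which return one number per quality value rather than one per column. A sketch on made-up data (toy is a hypothetical name):

```python
import pandas as pd

# Hypothetical quality scores.
toy = pd.DataFrame({"quality": [5, 6, 6, 7, 5, 6]})

# value_counts orders by frequency, so sort_index puts the keys in order.
counts = toy["quality"].value_counts().sort_index()

# size() counts rows per group and gives the same numbers.
sizes = toy.groupby("quality").size()

print(counts.to_dict())
print(sizes.to_dict())
```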

Check the number of records for each combination of quality and type

df.groupby(["quality", "type"]).count()
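
When only the row count per combination is of interest, size() followed by unstack gives a compact quality-by-type table. This is a sketch on made-up data (toy is a hypothetical name), not the article's own output.

```python
import pandas as pd

# Hypothetical records: quality score and wine type (0 = white, 1 = red).
toy = pd.DataFrame({
    "quality": [5, 6, 6, 7, 5, 6],
    "type":    [0, 0, 1, 1, 1, 0],
})

# One count per (quality, type) pair that actually occurs.
pair_counts = toy.groupby(["quality", "type"]).size()

# Pivot type into columns; pairs that never occur become 0.
table = pair_counts.unstack(fill_value=0)

print(pair_counts)
print(table)
```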

Using matplotlib's hist, draw histograms to check the distribution of pH for each type

%matplotlib inline
import matplotlib.pyplot as plt
# Plot pH (the column this step asks about) for white and red side by side
plt.hist([df_white["pH"], df_red["pH"]], label=["white", "red"])
#plt.hist(df_white["pH"], label="white", rwidth=0.4)
#plt.hist(df_red["pH"], label="red", rwidth=0.4)
plt.legend()
plt.show()
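
To read the numbers behind the bars, the same kind of binning can be reproduced with NumPy. The sketch below assumes shared bin edges computed over both groups together, so that the two sets of counts are comparable; the pH values and the choice of 5 bins are made up.

```python
import numpy as np

# Hypothetical pH values for the two wine types.
white_ph = np.array([3.0, 3.1, 3.2, 3.2, 3.3])
red_ph = np.array([3.3, 3.4, 3.5])

# Common bin edges spanning both groups.
edges = np.histogram_bin_edges(np.concatenate([white_ph, red_ph]), bins=5)

# Count each group against the same edges.
white_counts, _ = np.histogram(white_ph, bins=edges)
red_counts, _ = np.histogram(red_ph, bins=edges)

print(edges)
print(white_counts)
print(red_counts)
```

Because the groups differ in size, comparing raw counts mostly shows which group is larger; passing density=True to the histogram call is one way to compare shapes instead.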

For each type, check how many records fall into each 0.1-wide step of pH

import numpy as np
df["round_pH"] =  np.round(df["pH"], 1)
#df["round_pH"] =  df["pH"].apply(lambda x: round(x, 1))
df.groupby(["type", "round_pH"]).count()
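
count() reports the non-null count of every remaining column, which repeats the same number many times. A sketch that returns a single count per (type, rounded pH) pair, using made-up data and Series.round as a stand-in for the rounding step:

```python
import pandas as pd

# Hypothetical records: wine type (0 = white, 1 = red) and pH.
toy = pd.DataFrame({
    "type": [0, 0, 0, 1, 1],
    "pH":   [3.01, 3.04, 3.26, 3.51, 3.49],
})

# Round pH to one decimal place so nearby values share a bucket.
toy["round_pH"] = toy["pH"].round(1)

# One number per (type, round_pH) pair, sorted by the group keys.
bin_counts = toy.groupby(["type", "round_pH"]).size()

print(bin_counts)
```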

That's all for today!