
How to add column values from a list in pandas


Written as a personal memo.

#References

Introduction

##Getting the data

finance_data.ipynb
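# fix_yahoo_finance has been renamed to yfinance
import yfinance as yf
import pandas as pd

# Download daily SPLK prices; the end date is exclusive
data = yf.download('SPLK', start='2021-05-01', end='2021-05-22')
data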

I want to fetch three weeks of stock prices and plot them week by week.

Date Open High Low Close Adj Close Volume
2021-05-03 126.720001 127.000000 122.970001 122.970001 122.970001 1656000
2021-05-04 121.570000 121.570000 117.589996 119.849998 119.849998 2094300
2021-05-05 120.559998 121.300003 116.459999 116.839996 116.839996 2805200
2021-05-06 116.830002 117.500000 113.849998 116.779999 116.779999 2210800
2021-05-07 118.040001 120.559998 116.839996 117.820000 117.820000 1644000
2021-05-10 117.000000 117.040001 113.680000 113.739998 113.739998 2132200
2021-05-11 110.500000 119.620003 110.279999 119.250000 119.250000 2994800
2021-05-12 116.209999 118.180000 115.099998 115.690002 115.690002 2125700
2021-05-13 115.949997 117.680000 111.500000 111.989998 111.989998 2420200
2021-05-14 113.690002 116.459999 112.709999 116.220001 116.220001 1349600
2021-05-17 116.000000 116.430000 112.820000 114.370003 114.370003 1307200
2021-05-18 115.239998 117.019997 114.290001 115.410004 115.410004 2410600
2021-05-19 112.580002 114.010002 112.110001 112.769997 112.769997 2416000
2021-05-20 114.510002 116.669998 114.029999 116.260002 116.260002 2823200
2021-05-21 119.000000 120.089996 117.070000 118.050003 118.050003 1645800

##Processing the data
Label each week.

week_add1.ipynb
data = data.assign(
    wday=data.index.day_name(),
    week=data.index.isocalendar().week,
    # Number of weeks back from the latest week in the data (0 = latest)
    week_n=lambda df: df.week.max() - df.week,
    # Pick each label out of a list by position
    week_s=lambda df: df.week_n.apply(lambda s: ['current', 'weekago', '2weekago'][s])
).drop(['week'], axis=1)

data
Date Open High Low Close Adj Close Volume wday week_n week_s
2021-05-03 126.720001 127.000000 122.970001 122.970001 122.970001 1656000 Monday 2 2weekago
2021-05-04 121.570000 121.570000 117.589996 119.849998 119.849998 2094300 Tuesday 2 2weekago
2021-05-05 120.559998 121.300003 116.459999 116.839996 116.839996 2805200 Wednesday 2 2weekago
2021-05-06 116.830002 117.500000 113.849998 116.779999 116.779999 2210800 Thursday 2 2weekago
2021-05-07 118.040001 120.559998 116.839996 117.820000 117.820000 1644000 Friday 2 2weekago
2021-05-10 117.000000 117.040001 113.680000 113.739998 113.739998 2132200 Monday 1 weekago
2021-05-11 110.500000 119.620003 110.279999 119.250000 119.250000 2994800 Tuesday 1 weekago
2021-05-12 116.209999 118.180000 115.099998 115.690002 115.690002 2125700 Wednesday 1 weekago
2021-05-13 115.949997 117.680000 111.500000 111.989998 111.989998 2420200 Thursday 1 weekago
2021-05-14 113.690002 116.459999 112.709999 116.220001 116.220001 1349600 Friday 1 weekago
2021-05-17 116.000000 116.430000 112.820000 114.370003 114.370003 1307200 Monday 0 current
2021-05-18 115.239998 117.019997 114.290001 115.410004 115.410004 2410600 Tuesday 0 current
2021-05-19 112.580002 114.010002 112.110001 112.769997 112.769997 2416000 Wednesday 0 current
2021-05-20 114.510002 116.669998 114.029999 116.260002 116.260002 2823200 Thursday 0 current
2021-05-21 119.000000 120.089996 117.070000 118.050003 118.050003 1645800 Friday 0 current
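
To try the week numbering without downloading anything, here is a minimal sketch on a synthetic business-day index. The frame `demo` and its `Close` values are made up for illustration; only the dates match the table above.

```python
import pandas as pd

# Synthetic stand-in for the downloaded data: business days only, made-up values
idx = pd.bdate_range('2021-05-03', '2021-05-21', name='Date')
demo = pd.DataFrame({'Close': range(len(idx))}, index=idx)

# ISO week number of each date, as a Series indexed by the dates
week = idx.isocalendar().week

# Weeks back from the latest week in the data (0 = latest week)
demo['week_n'] = (week.max() - week).astype(int)

print(demo['week_n'].unique())
```

This sketch assumes that all dates fall within a single ISO year. A range that crosses a year boundary would need the year taken into account as well.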

##Another approach

week_add2.ipynb
data = data.assign(
    wday=data.index.day_name(),
    week=data.index.isocalendar().week,
    # Number of weeks back from the latest week in the data (0 = latest)
    week_n=lambda df: df.week.max() - df.week,
    # Look each label up in a dictionary keyed by week_n
    week_s=lambda df: df.week_n.map({0: 'current', 1: 'weekago', 2: '2weekago'})
).drop(['week'], axis=1)

data

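###Explanation

  • Approach 1 picks each label out of a list by index, while approach 2 looks it up in a dictionary.
  • assign() adds the columns.
    • Inside DataFrame.assign(), each lambda is first called with the DataFrame itself.
    • The lambda selects a column, and apply() with a second lambda then pulls out the value for each element of that Series.
    • Approach 2 uses map() to look values up by key in a dictionary. It is easier to read.
      - When the values are generated dynamically and then looked up, I find approach 1 easier.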
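
The two approaches can be compared side by side on a toy frame. This is only a sketch: the frame `df`, the list `labels`, and the column names `by_list` and `by_dict` are hypothetical and do not come from the notebooks above.

```python
import pandas as pd

# Hypothetical toy frame standing in for the article's data
df = pd.DataFrame({'week_n': [2, 1, 0, 1]})

labels = ['current', 'weekago', '2weekago']

out = df.assign(
    # Approach 1: the outer lambda receives the DataFrame,
    # and apply() calls the inner lambda once per value of the Series
    by_list=lambda d: d.week_n.apply(lambda n: labels[n]),
    # Approach 2: map() treats each value as a key of the dictionary
    by_dict=lambda d: d.week_n.map(dict(enumerate(labels))),
)

print(out)
```

Here `dict(enumerate(labels))` builds the dictionary `{0: 'current', 1: 'weekago', 2: '2weekago'}` from the list, so the same labels feed both approaches.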

#Plotting
Plot with seaborn.

lineplot.ipynb
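import matplotlib.pyplot as plt
import seaborn as sns

fig = plt.figure(figsize=(8, 8))
ax = fig.subplots()
# One line per week label, with the weekday on the x-axis
g = sns.lineplot(data=data, x='wday', y='Close', hue='week_s', linewidth=8, ax=ax)

ax.legend().set_title('Weeks')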

graphline.png

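#Summary
When you build a list first and want to pick items from it by a column's values,
df['new_col']=[some_list][df['col']] looks like it should work, but it does not, because a Python list cannot be indexed with a Series.

DataFrame -(lambda)-> Series -(apply() lambda)-> value
The value sits two levels down, so each level has to be unwrapped in turn before it can be read. I want to keep that in mind.

I suspect there is an easier way to do this :sweat: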
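
One possibly easier route for this kind of list lookup, offered as a sketch rather than a definitive answer: a NumPy array, unlike a plain list, can be indexed with an array of integer positions. The frame `df`, the column `col`, and the contents of `some_list` below are made-up examples.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'col' holds positions into some_list
df = pd.DataFrame({'col': [2, 0, 1, 0]})
some_list = ['current', 'weekago', '2weekago']

# A plain list rejects a Series as an index,
# but a NumPy array accepts an array of integer positions
df['new_col'] = np.array(some_list)[df['col'].to_numpy()]

# Alternative without NumPy: turn the list into a position -> value dictionary
df['new_col2'] = df['col'].map(dict(enumerate(some_list)))

print(df)
```

Both lines avoid apply() with a lambda. The NumPy version assumes every value in `col` is a valid integer position in the list.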
