②Portfolio using Python : Obtaining Data, Visualization in Python

Posted at 2023-01-22

Objective

I would like to try building something with Python right away, rather than studying the language in depth first.

What I am doing here:

  • obtaining historical data from the stock market
  • visualizing the data (plots)

1: Getting stock market data

There are several sources of historical daily price and volume data for the stock market.
I use stooq (https://stooq.com/) this time.
(I do not use Yahoo Finance because pandas_datareader no longer works with it.)

Installing the pandas_datareader module via pip

!pip install pandas_datareader
from pandas_datareader import data
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

The S&P 500 is used in this practice run.
The stooq ticker is "^SPX", and 'stooq' is passed as the data source.

start = '2006-03-15'
end = '2023-01-21'

df = data.DataReader('^SPX', 'stooq', start, end)
df.head(10)

Running the code above should display the following table.

	Open	High	Low	Close	Volume
Date					
2023-01-20	3909.04	3972.96	3897.86	3972.61	2.699404e+09
2023-01-19	3911.84	3922.94	3885.54	3898.85	2.550553e+09
2023-01-18	4002.25	4014.16	3926.59	3928.86	2.644401e+09
2023-01-17	3999.28	4015.39	3984.57	3990.97	2.561165e+09
2023-01-13	3960.60	4003.95	3947.67	3999.09	2.305645e+09
2023-01-12	3977.57	3997.76	3937.56	3983.17	2.468086e+09
2023-01-11	3932.35	3970.07	3928.54	3969.61	2.353913e+09
2023-01-10	3888.57	3919.83	3877.29	3919.25	2.140006e+09
2023-01-09	3910.82	3950.57	3890.42	3892.09	2.498159e+09
2023-01-06	3823.37	3906.19	3809.56	3895.08	2.462500e+09
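
The table above lists the newest date first. Calculations that look back in time, such as moving averages, expect the oldest row first, so it helps to sort the index. A minimal sketch with a small hand-made DataFrame (the three Close values are copied from the table; the names sample and sample_sorted are only for illustration):

```python
import pandas as pd

# Three rows copied from the table above, newest first like the stooq output
sample = pd.DataFrame(
    {"Close": [3972.61, 3898.85, 3928.86]},
    index=pd.to_datetime(["2023-01-20", "2023-01-19", "2023-01-18"]),
)
sample.index.name = "Date"

# sort_index() returns a new DataFrame ordered from oldest to newest
sample_sorted = sample.sort_index()
print(sample_sorted)
```

The original sample is left untouched, so the sorted result has to be assigned to a name.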

To visualize the table above, you can use matplotlib's plot function as shown below.

date = df.index
price=df['Close']
plt.figure(figsize=(30, 10))
plt.plot(date,price,label='S&P500')
plt.title('S&P500',color='blue',backgroundcolor='white',size=40, loc='center')
plt.xlabel('date', color='black', size=30)
plt.ylabel('price', color='black', size=30)
plt.legend()

[Screenshot: S&P 500 closing price]

Let me improve the plot by adding two moving averages (50-day and 200-day) with appropriate labels.

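# Compute the moving averages
span01=50
span02=200

# stooq returns the newest rows first; sort ascending so each window looks back in time
df = df.sort_index()
date = df.index
price = df['Close']

df['sma01'] = price.rolling(window=span01).mean()
df['sma02'] = price.rolling(window=span02).mean()
pd.set_option('display.max_rows', None)
df.head(100)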
plt.figure(figsize=(30, 10))
plt.plot(date,price,label='S&P500')
plt.plot(date,df['sma01'],label='SMA(50)')
plt.plot(date,df['sma02'],label='SMA(200)')

plt.title('S&P500',color='blue',backgroundcolor='white',size=40, loc='center')
plt.xlabel('date', color='black', size=30)
plt.ylabel('price', color='black', size=30)
plt.legend()

[Screenshot: S&P 500 closing price with 50-day and 200-day moving averages]
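
To see what rolling(...).mean() does, here is a minimal sketch on a tiny series. The prices are made up, and a window of 3 is used instead of 50 or 200 so the result can be checked by hand:

```python
import pandas as pd

# Hypothetical prices, oldest first (not real market data)
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0])

# 3-period simple moving average: mean of the current value and the two before it
sma3 = prices.rolling(window=3).mean()
print(sma3)
# The first two entries are NaN because fewer than 3 values are available there
```

For the same reason, a 200-day line only starts 199 rows after the first date in the data.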

That is a lot better, but something still seems to be missing...

Trading volume!

Let's add it to the plot.

plt.figure(figsize=(30, 15))
plt.bar(date,df['Volume'],label='Volume',color='grey')

plt.legend()

[Screenshot: S&P 500 trading volume bar chart]

In Matplotlib, we can draw multiple graphs in a single figure.
I use the subplot() function.

plt.figure(figsize=(30, 15))
plt.subplot(2,1,1)

plt.plot(date,price,label='S&P500')
plt.plot(date,df['sma01'],label='SMA(50)')
plt.plot(date,df['sma02'],label='SMA(200)')
plt.legend()  # legend for the price panel

plt.subplot(2,1,2)
plt.bar(date,df['Volume'],label='Volume',color='grey')

plt.legend()  # legend for the volume panel

[Screenshot: S&P 500 price with moving averages (top) and trading volume (bottom)]
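
As an alternative to calling plt.subplot() repeatedly, plt.subplots() returns the figure and its axes as objects. A minimal sketch, assuming made-up data in place of the stooq DataFrame (the names demo, ax_price, and ax_volume are only for illustration; the Agg backend line is there so the sketch runs outside a notebook and can be dropped when using %matplotlib inline):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the stooq DataFrame
dates = pd.date_range("2023-01-02", periods=5, freq="D")
demo = pd.DataFrame(
    {"Close": [100.0, 101.5, 99.8, 102.3, 103.1],
     "Volume": [1200, 1500, 900, 1700, 1300]},
    index=dates,
)

# Two rows, one column; sharex=True keeps the date axes of both panels aligned
fig, (ax_price, ax_volume) = plt.subplots(2, 1, figsize=(10, 6), sharex=True)
ax_price.plot(demo.index, demo["Close"], label="Close")
ax_price.legend()
ax_volume.bar(demo.index, demo["Volume"], label="Volume", color="grey")
ax_volume.legend()
```

Working with explicit axes objects makes it clear which panel each plot and legend call applies to.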

I'm done for today.
I know this is just the beginning, but Python feels much easier to me than Java, which was my first programming language.

p.s.

I wanted to create another plot for an individual stock.
I chose JPMorgan Chase because its Q4 earnings beat the consensus estimate.

# JPMorgan Q4
df = data.DataReader('JPM.US', 'stooq')
df = df.sort_index()
df = df[(df.index>='2020-01-01 00:00:00') & (df.index<='2023-01-23 00:00:00')]
date=df.index
price=df['Close']

span01=50
span02=200

df['sma01'] = price.rolling(window=span01).mean()
df['sma02'] = price.rolling(window=span02).mean()

plt.figure(figsize=(30, 15))
plt.subplot(2,1,1)

plt.plot(date,price,label='JP Morgan')
plt.plot(date,df['sma01'],label='SMA(50)')
plt.plot(date,df['sma02'],label='SMA(200)')
plt.legend()  # legend for the price panel

plt.subplot(2,1,2)
plt.bar(date,df['Volume'],label='Volume',color='grey')

plt.legend()  # legend for the volume panel

[Screenshot: JPMorgan Chase price with moving averages (top) and trading volume (bottom)]
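
The date filter in the code above uses a boolean mask on the index. On a sorted DatetimeIndex the same range can also be selected with a label slice. A minimal sketch with made-up values (demo, by_mask, and by_slice are illustrative names, not part of the original code):

```python
import pandas as pd

# Hypothetical daily closes, oldest first (not real JPM data)
idx = pd.date_range("2019-12-30", periods=6, freq="D")
demo = pd.DataFrame({"Close": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}, index=idx)

start, end = "2020-01-01", "2020-01-03"

# Boolean-mask filter, in the same style as the code above
by_mask = demo[(demo.index >= start) & (demo.index <= end)]

# Label-based slice on a sorted DatetimeIndex; both ends are inclusive
by_slice = demo.loc[start:end]

print(by_mask.equals(by_slice))
```

Either form works; the slice is shorter, while the mask does not require the index to be sorted.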

The code should be refactored to use variables and functions.
I will leave that task for next time!

Appendix (adding Variables and Functions)

It is tedious to rewrite all of the code above every time I look up a stock.
Therefore, I wrap it in a function.

Defining a function using def

# Function definition

def Individual_Stock(start,end,Company_Ticker):
    df = df[(df.index>=start) & (df.index<=end)]
    df = data.DataReader(Company_Ticker, 'stooq')


    date=df.index
    price=df['Close']

    span01=50
    span02=200

    df['sma01'] = price.rolling(window=span01).mean()
    df['sma02'] = price.rolling(window=span02).mean()

    plt.figure(figsize=(20, 10))
    plt.subplot(2,1,1)

    plt.plot(date,price,label=Company_Ticker)
    plt.plot(date,df['sma01'],label='SMA(50)')
    plt.plot(date,df['sma02'],label='SMA(200)')

    plt.subplot(2,1,2)
    plt.bar(date,df['Volume'],label='Volume',color='grey')

    plt.legend()

Calling the function with arguments:

Individual_Stock('2020-01-1','2023-01-23','JPM.US')

Something seems to be wrong with my code.

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
/var/folders/f7/1shbvbps5872h62m_8_6r1pc0000gn/T/ipykernel_2050/2333707301.py in <module>
----> 1 Individual_Stock('2020-01-1','2023-01-23','JPM.US')

/var/folders/f7/1shbvbps5872h62m_8_6r1pc0000gn/T/ipykernel_2050/2780953359.py in Individual_Stock(start, end, Company_Ticker)
      2 
      3 def Individual_Stock(start,end,Company_Ticker):
----> 4     df = df[(df.index>=start) & (df.index<=end)]
      5     df = data.DataReader(Company_Ticker, 'stooq')
      6 

UnboundLocalError: local variable 'df' referenced before assignment

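The message says "local variable 'df' referenced before assignment". Because df is assigned inside the function, Python treats it as a local variable, and the first line tries to filter df before anything has been assigned to it.
The data has to be loaded first and filtered afterwards, as below.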

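def Individual_Stock(start,end,Company_Ticker):
    df = data.DataReader(Company_Ticker, 'stooq')
    df = df.sort_index()  # oldest first, as in the JPM example, so the moving averages look back in time
    df = df[(df.index>=start) & (df.index<=end)]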
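
The same error can be reproduced without pandas. A minimal sketch (the names value, broken, and fixed are made up for illustration):

```python
value = 10  # module-level name, comparable to a df created in an earlier cell

def broken():
    # Assigning to `value` anywhere in the function makes it local for the
    # whole body, so reading it on the right-hand side fails.
    value = value + 1
    return value

def fixed():
    # Give the local name a value first, then transform it.
    value = 10
    value = value + 1
    return value

try:
    broken()
except UnboundLocalError as exc:
    print("broken():", exc)

print("fixed():", fixed())
```

The exact wording of the error message depends on the Python version, but the exception type is UnboundLocalError in both cases.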

Calling the function again after the modification:

[Screenshot: output of Individual_Stock for JPM.US]

Done.
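
One possible next step is to separate the calculation from the plotting, so the moving averages can be checked without drawing anything. This is only a sketch under my own assumptions: add_moving_averages is a hypothetical helper that does not appear in the code above, and the data is made up so the example runs without a network connection.

```python
import pandas as pd

def add_moving_averages(df, spans=(50, 200), column="Close"):
    """Hypothetical helper: return a sorted copy of df with one SMA column per span."""
    out = df.sort_index()  # oldest first; sort_index() returns a new DataFrame
    for span in spans:
        out[f"sma{span}"] = out[column].rolling(window=span).mean()
    return out

# Made-up prices, deliberately newest first like the stooq output
idx = pd.date_range("2023-01-02", periods=6, freq="D")[::-1]
demo = pd.DataFrame({"Close": [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]}, index=idx)

# Small spans so the result is easy to verify by hand
result = add_moving_averages(demo, spans=(2, 3))
print(result)
```

Because the helper sorts the index itself, the caller cannot forget that step, and the input DataFrame is not modified.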
