Are some ticker symbols more profitable than others?

Does performance differ by ticker symbol name?

I previously wrote a post titled "Which letters do US stock ticker symbols most often start with?".

Building on that, this is a just-for-fun post asking whether performance varies with the symbol name.

from quantopian.pipeline.data.builtin import USEquityPricing
from quantopian.research import run_pipeline
from quantopian.pipeline import Pipeline
from quantopian.pipeline.factors import DailyReturns,SimpleMovingAverage
from quantopian.pipeline.experimental import QTradableStocksUS

from collections import Counter
import pandas as pd

def make_pipeline():
    # Test only the stocks in QTradableStocksUS, the tradable universe that Quantopian provides. For details, see
    # https://www.quantopian.com/posts/working-on-our-best-universe-yet-qtradablestocksus
    universe = QTradableStocksUS()
    dayreturn = DailyReturns(inputs = [USEquityPricing.close])
    # Also exclude stocks whose 30-day average close is 10 dollars or below. Without this filter the results might differ.
    sma30 = SimpleMovingAverage(inputs = [USEquityPricing.close],window_length=30)
    not_penny = sma30 > 10

    pipe = Pipeline()
    pipe.add(dayreturn, 'dayreturn')
    pipe.set_screen(universe & not_penny)
    return pipe

# Check roughly the past 8 years
results = run_pipeline(make_pipeline(),start_date="2010-1-1", end_date="2018-2-22" )

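First, let's check the symbol names that existed during this period. Last time I counted every stock available on Quantopian; this time I include only stocks that are tradable and priced above 10 dollars. Also, last time I counted only the stocks that existed on January 2, 2018, whereas this time I counted every symbol that existed on any trading day over the past 8 years.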

syms = [sym.symbol for sym in results.index.get_level_values(1).unique()]
data = [sym[0] for sym in syms]

counter = Counter(data)
df_initials = pd.DataFrame(counter.most_common(), columns=["initial", "count"])
df_initials["pct"] = df_initials["count"] / df_initials["count"].sum()
df_initials.sort_values(by="pct", ascending=False)

Screenshot from 2018-02-26 21-40-54.png

Once the universe is narrowed to tradable stocks, it is a little surprising that symbols beginning with C and A account for 20% of the total.

Checking daily performance by initial letter

results["initial"] = [sym.symbol[0]  for sym in results.index.get_level_values(1)]
results["count"] = [len(sym.symbol) for sym in results.index.get_level_values(1)]
by_initial = results.groupby(by="initial")
by_initial["dayreturn"].median().plot(kind="bar")

Screenshot from 2018-02-27 15-06-35.png

Annualized, this looks like it would add up to a fairly large difference in performance.
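To get a feel for how a small daily gap grows over a year, here is a minimal sketch that compounds a constant daily return. The 252 trading days per year and the sample daily returns are assumptions for illustration, not values read from the chart, and compounding a median is only a rough sense of scale rather than a realized return.

```python
TRADING_DAYS_PER_YEAR = 252  # assumption: a common convention for US equities


def annualize_daily_return(daily_return, periods=TRADING_DAYS_PER_YEAR):
    """Compound a constant daily return over `periods` trading days."""
    return (1.0 + daily_return) ** periods - 1.0


# Hypothetical median daily returns (not taken from the chart above)
for daily in (0.0002, 0.0004, 0.0006):
    annual = annualize_daily_return(daily)
    print("{:.4%} per day -> {:.2%} per year".format(daily, annual))
```

Even a difference of a few hundredths of a percent per day becomes several percentage points once compounded over a year.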

Is a shorter symbol an advantage?

by_count = results.groupby(by="count")
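# Select the return column explicitly so the non-numeric "initial" column is not aggregated
by_count["dayreturn"].median().plot(kind="bar")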

Screenshot from 2018-02-27 15-08-35.png

As expected, shorter symbols appear to have the edge.
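The grouping step can also be tried outside Quantopian on a small hand-made table. The sketch below assumes a DataFrame indexed by (date, symbol) with a `dayreturn` column, like the pipeline output above; the symbols and returns are made up, and the symbols are plain strings here rather than Quantopian equity objects, so `.symbol` is not needed. It also reports how many rows fall into each group, since a median over a handful of rows is far less stable than one over thousands.

```python
import pandas as pd

# Hypothetical symbols and returns, for illustration only
dates = pd.to_datetime(["2018-02-21", "2018-02-22"])
symbols = ["A", "AA", "AAPL", "C", "CAT", "F"]
index = pd.MultiIndex.from_product([dates, symbols], names=["date", "symbol"])
toy = pd.DataFrame(
    {"dayreturn": [0.010, -0.020, 0.005, 0.000, 0.015, -0.010,
                   0.020, 0.010, -0.005, 0.010, -0.015, 0.030]},
    index=index,
)

# Derive the grouping keys from the symbol level of the index
symbol_level = toy.index.get_level_values("symbol")
toy["initial"] = [s[0] for s in symbol_level]
toy["count"] = [len(s) for s in symbol_level]

# Median daily return and number of rows for each symbol length
summary = toy.groupby("count")["dayreturn"].agg(["median", "size"])
print(summary)
```

Swapping `"count"` for `"initial"` in the `groupby` call gives the same summary by first letter.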

Pipeline is handy

All the same, Pipeline is almost too convenient. I want to put it to much heavier use.
Also, pandas groupby is the best \(^o^)/.
