
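I tried to predict BTC trade data obtained from BitMEX with an LSTM in PyTorch, but gave up after hitting NaN problems that I suspect are caused by the vanishing gradient problem. After regrouping, I backtested a different trading strategy!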

Posted on 2020-12-26 (last updated 2020-12-26)

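Title

Background

To acquire the skills needed to develop systems and solutions that use machine learning and deep learning, I first studied Python.
As the culmination of that study, this article summarizes the results of running through the whole workflow once: fetching price data for cryptocurrency (a hot topic lately), analyzing it, and validating a strategy.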

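Why PyTorch? Why BTC?

Among the deep learning frameworks I expect to use from now on, TensorFlow felt somewhat mature and settled. PyTorch, on the other hand, has been talked about a lot recently and I had never touched it, so I wanted to sit down and work with it properly.
I chose BTC because I expect technical innovation in this area to continue, and because the APIs are well developed, which makes it easy to experiment and to obtain data.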

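Conclusions and remaining items

Teacher forcing is difficult, and with the current model inference does not converge.
A model can be built for sine-wave-like patterns, but with the recent steadily rising price pattern, NaN values appear, which I believe is due to the "vanishing gradient problem".
With a simple LSTM model, the backtest results are not good.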

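Outline

BitMEX data acquisition
Data preprocessing
Building the LSTM model
Training
Inference
Evaluation
Discussion
Building another LSTM model
Backtest
Required modules
Source code 1
Execution log 1
Execution log 2
Source code 2
Execution log 3
References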

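BitMEX data acquisition

BitMEX seems to have lost popularity lately because trading from Japan is no longer possible, but the API is still usable, so I made use of it.
In practice, however, tick data is published at https://public.bitmex.com/?prefix=data/trade/ , so I downloaded the files and used those.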

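Data preprocessing

VWAP calculation

Compute the volume-weighted average price (VWAP) at 1-minute intervals.
Trades are grouped into predefined time intervals because trading activity in the market varies over time.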
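The grouping step can be sketched as follows. This is a minimal illustration, not the exact code of the listing further down: the function name `minute_vwap` and the tiny sample frame are made up for the example, while `price` and `foreignNotional` are the column names found in the BitMEX trade files. Note that a minute without any trade has zero volume, so its VWAP is undefined; such gaps are one possible (unconfirmed) source of NaN values reaching the model, and forward-filling them is an assumption made here, not something the article does.

```python
import numpy as np
import pandas as pd


def minute_vwap(trades: pd.DataFrame) -> pd.Series:
    """Volume-weighted average price per 1-minute bar.

    Expects a DatetimeIndex and the columns `price` and `foreignNotional`.
    """
    grouper = pd.Grouper(freq="1min")
    weighted_sum = (trades["price"] * trades["foreignNotional"]).groupby(grouper).sum()
    volume = trades["foreignNotional"].groupby(grouper).sum()
    # Minutes without trades have zero volume; mark them as NaN explicitly
    # instead of dividing 0 by 0.
    return weighted_sum / volume.replace(0, np.nan)


# Hypothetical sample: two trades in minute 00:00, none in 00:01, one in 00:02.
trades = pd.DataFrame(
    {"price": [100.0, 102.0, 110.0], "foreignNotional": [1.0, 3.0, 2.0]},
    index=pd.to_datetime(
        ["2019-08-01 00:00:10", "2019-08-01 00:00:50", "2019-08-01 00:02:05"]
    ),
)
vwap = minute_vwap(trades)
vwap_filled = vwap.ffill()  # assumption: carry the last VWAP over empty minutes
print(vwap)
print("NaN bars:", int(vwap.isna().sum()))
```

Checking `vwap.isna().sum()` before scaling is a cheap way to rule out missing bars as the cause of NaN losses.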

Creating the train/val/test data

Split the volume-weighted average price (VWAP) data into training data (0.65), validation data (0.08), and test data (0.27).
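As a small sketch of this split (the helper name `chronological_split` is hypothetical), the data is cut in time order without shuffling, so the validation and test parts always lie after the training part. The example length of 69120 matches the `df_vwap.shape` value printed in execution log 1.

```python
import numpy as np


def chronological_split(series, train_frac=0.65, val_frac=0.08):
    """Split a time-ordered array into train/val/test without shuffling."""
    train_len = round(len(series) * train_frac)
    val_len = round(len(series) * val_frac)
    train = series[:train_len]
    val = series[train_len:train_len + val_len]
    test = series[train_len + val_len:]  # the remainder, roughly 0.27
    return train, val, test


prices = np.arange(69120, dtype=float)  # placeholder values, one per minute bar
train, val, test = chronological_split(prices)
print(len(train), len(val), len(test))
```

With 69120 bars the training part has 44928 rows, which agrees with `df_train.shape` in the log.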

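Scaling

Scale the data so that the LSTM model converges faster.
Large input values can slow down training.
Use StandardScaler from the sklearn library.
The scaler is fitted on the training set and then used to transform the unseen trade data in the validation and test sets.
If the scaler were fitted on all of the data, information from the validation and test sets would leak into training: the model would overfit, giving good results on this data but worse performance on real data.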
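To make the "fit on train, transform the rest" rule concrete, here is a NumPy-only sketch of the same standardization that StandardScaler performs (mean and population standard deviation). The arrays are made-up numbers; the article itself uses `StandardScaler.fit_transform` and `transform`.

```python
import numpy as np

train = np.array([10.0, 12.0, 14.0])  # hypothetical training prices
val = np.array([16.0, 18.0])          # hypothetical later prices

# "Fit": the statistics come from the training split only.
mean = train.mean()
std = train.std()  # population std (ddof=0), as StandardScaler uses

train_scaled = (train - mean) / std
val_scaled = (val - mean) / std  # "transform" only, no refit on validation data


def inverse(scaled):
    """Map scaled values back to prices, like scaler.inverse_transform."""
    return scaled * std + mean


print(train_scaled, val_scaled, inverse(val_scaled))
```

Because the statistics come from the training period, later prices that trend upward land outside the range seen during training (here `val_scaled` is well above every value in `train_scaled`), which is worth keeping in mind for a steadily rising market.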

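Converting to time-bar sequences

After scaling, convert the data into a format suitable for modeling with an LSTM.
Convert the long data sequence into many short sequences (100 time bars each), each shifted by a single time bar.
The plots below show the first, second, and third sequences of the training set.
Each sequence is 100 time bars long.
In each sequence the target is almost identical to the feature; they differ only in the first and last time bars.
How does the LSTM use the sequences during the training phase?
First, focus on the first sequence.
The model takes the feature of the time bar at index 0 and tries to predict the target of the time bar at index 1.
Next, it takes the feature of the time bar at index 1 and tries to predict the target of the time bar at index 2, and so on.
The features of the second sequence are shifted by one time bar from those of the first sequence, and the features of the third sequence are shifted by one time bar from those of the second.
This procedure yields many short sequences, each shifted by a single time bar.
Note that in classification or regression tasks there is usually a set of features and a separate target to predict.
In this LSTM example, the features and the target come from the same sequence; the only difference is that the target is shifted by one time bar.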
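The windowing described above can be sketched on a toy array. The helper name `make_windows` is hypothetical; it follows the same idea as `transform_data` in source code 1 but stays in NumPy and uses a window of 4 instead of 100 so the result is easy to inspect.

```python
import numpy as np


def make_windows(arr, seq_len):
    """Return (x, y) where each y row is the x row shifted forward by one bar."""
    arr = np.asarray(arr, dtype=float).reshape(-1)
    n_windows = len(arr) - seq_len
    x = np.stack([arr[i:i + seq_len] for i in range(n_windows)])
    y = np.stack([arr[i + 1:i + seq_len + 1] for i in range(n_windows)])
    return x, y


x, y = make_windows(np.arange(10), seq_len=4)
print(x[0], y[0])  # feature window and its target, one bar later
print(x[1])        # the second window starts one bar after the first
```

Each target row equals the next feature row, which is exactly the "shifted by one time bar" relationship.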

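Model building and training

Training the LSTM
Train an LSTM with 21 hidden units.
Keeping the number of units small makes it less likely that the LSTM simply memorizes the sequences.
Training uses the mean squared error loss function and the Adam optimizer.
The learning rate is set to 0.001 and decays every 5 epochs (source code 1 as listed passes lr=1e-4 to Adam).
Train for 15 epochs with 100 sequences per batch.
The plot shows that the training and validation losses converge at epoch 6.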
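The decay schedule can be written out by hand. The sketch below reproduces the step decay that `StepLR(step_size=5, gamma=0.1)` in source code 1 applies when `scheduler.step()` is called once per epoch; the function name `step_lr` is hypothetical, and the initial rate of 0.001 follows the text above.

```python
def step_lr(initial_lr, epoch, step_size=5, gamma=0.1):
    """Learning rate in effect during `epoch` (0-based) under step decay."""
    return initial_lr * gamma ** (epoch // step_size)


for epoch in range(15):
    print("epoch %2d  lr=%.0e" % (epoch + 1, step_lr(1e-3, epoch)))
```

Over 15 epochs the rate therefore takes three values: the initial rate for epochs 1 to 5, one tenth of it for epochs 6 to 10, and one hundredth for epochs 11 to 15.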

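Model evaluation

Test set

Evaluate the model on the test set.
The future parameter is set to 5.
The model outputs the VWAP it expects over the next 5 time bars (5 minutes in this example).
This should make a price change visible a few minutes before it occurs.
In the plot, the predicted values closely match the actual VWAP values.
However, since the future parameter is set to 5, the orange line should react before a spike occurs rather than merely overlapping it.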

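Zooming in

Zooming in on the spikes (one at the start and one at the end of the time series) shows that the predicted values mimic the actual values.
When the actual values change direction, the predicted values follow, which is of no use for prediction.
The same happens when the future parameter is increased (it has no effect on the prediction line).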
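One way to quantify "the prediction just follows the price" is to look for the lag at which the predicted series lines up best with the actual one. This is a sketch of a diagnostic that the article does not run; `best_lag` is a hypothetical helper and the data is a synthetic random walk.

```python
import numpy as np


def mse(a, b):
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(np.mean((a - b) ** 2))


def best_lag(actual, predicted, max_lag=10):
    """Lag (in bars) at which `predicted` matches `actual` best.

    A result greater than 0 means the prediction resembles past values,
    i.e. it follows the price instead of leading it.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    errors = []
    for lag in range(max_lag + 1):
        n = len(actual) - lag
        errors.append(mse(predicted[lag:], actual[:n]))  # predicted[t] vs actual[t - lag]
    return int(np.argmin(errors))


rng = np.random.default_rng(0)
actual = 10000 + np.cumsum(rng.normal(size=500))  # synthetic random-walk "price"
lagged = np.concatenate([actual[:2], actual[:-2]])  # a "prediction" that trails by 2 bars
print(best_lag(actual, lagged))
```

Applied to the `actual` and `predicted` columns of the result frame, a best lag above zero would confirm numerically what the zoomed-in plots suggest.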

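Inference with the model

Use the model to generate 1000 time bars for the first test sequence, and compare the predicted values, the generated values, and the actual VWAP.
While the model is outputting predicted values, they stay close to the actual values.
However, once it starts generating values, the output looks almost like a sine wave.
After a while, the value converges to 9600.
This behavior probably occurs because the model was trained only on real inputs and never on generated inputs.
When the model produces outputs from generated inputs, it becomes worse at generating the next value.
I try to remedy this problem with teacher forcing.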

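Teacher forcing

Teacher forcing is a method for training recurrent neural networks in which the ground-truth value from the previous time step, rather than the model's own output, is fed in as the input.
A trained RNN can generate a sequence by using its previous output as the current input.
The same process can be used during training, but the model can become unstable or fail to converge.
Teacher forcing is an approach for dealing with these problems during training.
It is commonly used in language models.

This time I use an extension of teacher forcing called scheduled sampling.
During training, the model uses its own generated output as input with a certain probability.
At first the probability that the model sees its generated output is small, and it gradually increases during training.
Note that this example uses a random probability that does not increase during training.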
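The difference between a fixed probability and a scheduled one can be sketched in a few lines. Both helper names are hypothetical, and the linear ramp is chosen only for illustration (the scheduled sampling paper discusses several decay shapes); source code 1 instead uses a fixed threshold of 0.5 for every step.

```python
import random


def generated_input_prob(epoch, n_epochs, start=0.0, end=0.5):
    """Probability of feeding back the model's own output, raised linearly per epoch."""
    if n_epochs <= 1:
        return end
    return start + (end - start) * epoch / (n_epochs - 1)


def next_input(ground_truth, generated, p_generated, rng=random):
    """Pick the next-step input: the model's output with probability p_generated."""
    return generated if rng.random() < p_generated else ground_truth


rng = random.Random(1)
for epoch in range(3):
    p = generated_input_prob(epoch, n_epochs=3)
    picks = [next_input("truth", "generated", p, rng) for _ in range(1000)]
    print(epoch, p, picks.count("generated"))
```

Early epochs then train almost entirely on real inputs, and the share of self-generated inputs grows as the model improves.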

Train a model with teacher forcing enabled, using the same parameters as before.
After 7 epochs, the training and validation losses converge.

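Model evaluation

The predicted sequence looks similar to the previous one.
Zooming in on the spikes shows the same behavior, with the predicted values mimicking the actual values.
Teacher forcing did not solve the problem...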

Inference with the model

Generate 1000 time bars for the first test sequence using the model trained with teacher forcing.

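Discussion

One observation about the generated sequences is that the values generated by the model trained with teacher forcing take longer to converge.
Another is that when the sequence is increasing, it keeps increasing up to a certain point and then starts decreasing; this pattern repeats until the sequence converges, and it looks like a sine wave with decreasing amplitude.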

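Conclusion

The validation shows that the model's predictions mimic the actual values of the sequence.
Neither the first model nor the second model detects price changes before they occur.
Adding another feature (such as volume) might help the model detect price changes before they occur, but the model would then have to generate two features so that its outputs can be fed back as inputs in the next step, which makes the model more complex.
As the plots above show, the model is already able to predict the VWAP time series, so a more complex model (multiple LSTMCells, more hidden units) might not help.
A more advanced teacher forcing method might improve the model's sequence generation ability...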

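Building another LSTM model

Set the constants
Create the supervised training data
Normalize the prices
Split the data and convert it to Torch tensors
Build the LSTM model for training
First, randomly shuffle the indices of the train data. The first time_steps entries are not used.
For each batch size, take the corresponding indices from perm_idx
Prepare the time-series data for LSTM input
Train the PyTorch LSTM
Evaluate on the validation data
Save the model if the validation result is good
Predict with the best model.
Run a backtest with a simple strategy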
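The shuffling and batching steps listed above might look roughly like this. It is only a sketch under assumptions: `perm_idx` and `time_steps` are names taken from the list, but the function `iter_batches`, the choice of the next value as the target, and the array shapes are guesses made for illustration, since source code 2 is not shown in this part of the article.

```python
import numpy as np


def iter_batches(prices, time_steps, batch_size, rng):
    """Yield (x, y): windows of `time_steps` past values and the value that follows."""
    prices = np.asarray(prices, dtype=float)
    # Indices before `time_steps` have no complete history, so they are skipped.
    perm_idx = rng.permutation(np.arange(time_steps, len(prices)))
    for start in range(0, len(perm_idx), batch_size):
        batch_idx = perm_idx[start:start + batch_size]
        x = np.stack([prices[i - time_steps:i] for i in batch_idx])
        y = prices[batch_idx]
        yield x[..., np.newaxis], y  # x has shape (batch, time_steps, 1)


rng = np.random.default_rng(0)
batches = list(iter_batches(np.arange(20), time_steps=5, batch_size=4, rng=rng))
print(len(batches), batches[0][0].shape, batches[0][1])
```

Shuffling the window start indices, rather than the prices themselves, keeps each window in time order while still randomizing the order in which windows are seen.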

Backtest

Scores

Start                     2020-11-24 13:20:00
End                       2020-12-23 23:50:00
Duration                     29 days 10:30:00
Exposure Time [%]                     10.2358
Equity Final [$]                       110980
Equity Peak [$]                        111948
Return [%]                            10.9804
Buy & Hold Return [%]                 20.5915
Return (Ann.) [%]                     255.219
Volatility (Ann.) [%]                 52.4708
Sharpe Ratio                          4.86403
Sortino Ratio                         30.6017
Calmar Ratio                          127.727
Max. Drawdown [%]                    -1.99816
Avg. Drawdown [%]                   -0.526232
Max. Drawdown Duration        7 days 22:30:00
Avg. Drawdown Duration        0 days 16:50:00
# Trades                                   26
Win Rate [%]                          73.0769
Best Trade [%]                        1.00127
Worst Trade [%]                      -1.00668
Avg. Trade [%]                       0.453453
Max. Trade Duration           0 days 10:40:00
Avg. Trade Duration           0 days 02:37:00
Profit Factor                         2.69042
Expectancy [%]                       0.457399
SQN                                   2.53193
_strategy                    myCustomStrategy
_equity_curve                             ...
_trades                       Size  EntryB...
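A few of the reported figures can be cross-checked with simple arithmetic. The sketch below only reuses numbers from the table above; the relations it tests (Sharpe as annualized return over annualized volatility with no risk-free adjustment, Calmar as annualized return over maximum drawdown) are assumptions about how the backtesting library derives them, which the matching results merely make plausible.

```python
stats = {
    "return_pct": 10.9804,
    "buy_hold_return_pct": 20.5915,
    "return_ann_pct": 255.219,
    "volatility_ann_pct": 52.4708,
    "max_drawdown_pct": -1.99816,
    "n_trades": 26,
    "win_rate_pct": 73.0769,
}

sharpe = stats["return_ann_pct"] / stats["volatility_ann_pct"]
calmar = stats["return_ann_pct"] / abs(stats["max_drawdown_pct"])
winning_trades = round(stats["n_trades"] * stats["win_rate_pct"] / 100)
excess_over_buy_hold = stats["return_pct"] - stats["buy_hold_return_pct"]

print("Sharpe  %.3f" % sharpe)
print("Calmar  %.3f" % calmar)
print("Winning trades: %d of %d" % (winning_trades, stats["n_trades"]))
print("Return minus buy & hold: %.4f points" % excess_over_buy_hold)
```

The last line is the important one for judging the strategy: over this period it returned about 9.6 percentage points less than simply holding, despite the high win rate and small drawdown.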

Plot

Required modules (install the following modules with pip)

pip install numpy pandas matplotlib python-dateutil scikit-learn torch skorch backtesting bitmex

Source code 1 (code that tackles data scaling and teacher forcing with a PyTorch LSTM)

btc_prediction_by_lstm_pytorch.py

# -*- coding: utf-8 -*-
'''
btc_prediction_by_lstm_pytorch.py

Copyright (C) 2020 HIROSE Ken-ichi (hirosenokensan@gmail.com) 
                                                 All rights reserved.
 This is free software with ABSOLUTELY NO WARRANTY.
 
 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 2 of the License, or
 (at your option) any later version.
 
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 
 You should have received a copy of the GNU General Public License
 along with this program; if not, write to the Free Software
 Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
 02111-1307, USA
'''

import glob
import warnings

import os
import math
import time
import random
# import pprint
from dateutil import parser
from datetime import timedelta, datetime

import numpy as np
import pandas as pd
# import pandas_datareader.data as web

import matplotlib
import matplotlib.pyplot as plt

# import sklearn
from sklearn.preprocessing import StandardScaler

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
# import skorch

from backtesting import Backtest, Strategy
from backtesting.lib import plot_heatmaps

# import bitmex

class Model(nn.Module):
    def __init__(self, input_size, hidden_size, output_size):
        super(Model, self).__init__()
        self.input_size = input_size
        self.hidden_size = hidden_size
        self.output_size = output_size
        self.lstm = nn.LSTMCell(self.input_size, self.hidden_size)
        self.linear = nn.Linear(self.hidden_size, self.output_size)

    def forward(self, input, future=0, y=None):
        outputs = []

        # h_t = torch.zeros(input.size(0), self.hidden_size, dtype=torch.float32)
        h_t = torch.zeros(input.size(0), self.hidden_size, dtype=torch.float32, device=cuda_device)
        # c_t = torch.zeros(input.size(0), self.hidden_size, dtype=torch.float32)
        c_t = torch.zeros(input.size(0), self.hidden_size, dtype=torch.float32, device=cuda_device)

        for i, input_t in enumerate(input.chunk(input.size(1), dim=1)):
            h_t, c_t = self.lstm(input_t, (h_t, c_t))
            # print("c_t:{}".format(c_t)) # has turned into NaN here
            # print("h_t:{}".format(h_t)) # has turned into NaN here
            output = self.linear(h_t)
            # print("output:{}".format(output)) # has turned into NaN here
            outputs += [output]
            # print("e-outputs:{}".format(outputs)) # has turned into NaN here

        for i in range(future):
            if y is not None and random.random() > 0.5:
                output = y[:, [i]]  # teacher forcing
            h_t, c_t = self.lstm(output, (h_t, c_t))
            output = self.linear(h_t)
            outputs += [output]
        outputs = torch.stack(outputs, 1).squeeze(2)
        # print("outputs:{}".format(outputs)) # has turned into NaN here
        return outputs

class Optimization:
    def __init__(self, model, loss_fn, optimizer, scheduler):
        self.model = model
        self.loss_fn = loss_fn
        self.optimizer = optimizer
        self.scheduler = scheduler
        self.train_losses = []
        self.val_losses = []
        self.futures = []

    @staticmethod
    def generate_batch_data(x, y, batch_size):
        for batch, i in enumerate(range(0, len(x) - batch_size, batch_size)):
            x_batch = x[i : i + batch_size]
            y_batch = y[i : i + batch_size]
            yield x_batch, y_batch, batch

    def train(
        self,
        x_train,
        y_train,
        x_val=None,
        y_val=None,
        batch_size=100,
        n_epochs=15,
        do_teacher_forcing=None,
    ):
        seq_len = x_train.shape[1]
        for epoch in range(n_epochs):
            startup = time.time()
            self.futures = []

            # with torch.autograd.detect_anomaly():
            train_loss = 0
            for x_batch, y_batch, batch in self.generate_batch_data(x_train, y_train, batch_size):
                y_pred = self._predict(x_batch, y_batch, seq_len, do_teacher_forcing)
                self.optimizer.zero_grad()
                loss = self.loss_fn(y_pred, y_batch)
                # print("tloss:{}".format(loss)) # has turned into NaN here
                loss.backward()
                # nn.utils.clip_grad_norm_(self.model.parameters(), 0.25) # https://pytorch.org/docs/stable/_modules/torch/nn/utils/clip_grad.html
                self.optimizer.step()
                train_loss += loss.item()
            self.scheduler.step()
            train_loss /= batch + 1  # batch is the 0-based index of the last batch
            self.train_losses.append(train_loss)
        
            self._validation(x_val, y_val, batch_size)
        
            elapsed = time.time() - startup
            print(
                "Epoch %d Train loss: %.2f. Validation loss: %.2f. Avg future: %.2f. Elapsed time: %.2fs."
                 % (epoch + 1, train_loss, self.val_losses[-1], np.average(self.futures), elapsed)
            )

    def _predict(self, x_batch, y_batch, seq_len, do_teacher_forcing):
        # print("x_batch:{}".format(x_batch)) 
        if do_teacher_forcing:
            # randint needs integer bounds, so use integer division
            future = random.randint(1, seq_len // 2)
            limit = x_batch.size(1) - future
            y_pred = self.model(x_batch[:, :limit], future=future, y=y_batch[:, limit:])
            # print("if-y_pred:{}".format(y_pred)) 
        else:
            # print("x_batch:{}".format(x_batch)) # has turned into NaN here
            future = 0
            y_pred = self.model(x_batch)
            # print("else-y_pred:{}".format(y_pred)) # has turned into NaN here
        self.futures.append(future)
        return y_pred

    def _validation(self, x_val, y_val, batch_size):
        if x_val is None or y_val is None:
            return
        with torch.no_grad():
            val_loss = 0
            batch = 0
            for x_batch, y_batch, batch in self.generate_batch_data(x_val, y_val, batch_size):
                y_pred = self.model(x_batch)
                loss = self.loss_fn(y_pred, y_batch)
                # print("vloss:{}".format(loss)) # has turned into NaN here
                val_loss += loss.item()
            val_loss /= batch + 1  # batch is the 0-based index of the last batch
            self.val_losses.append(val_loss)

    def evaluate(self, x_test, y_test, batch_size, future=1):
        with torch.no_grad():
            test_loss = 0
            actual, predicted = [], []
            for x_batch, y_batch, batch in self.generate_batch_data(x_test, y_test, batch_size):
                y_pred = self.model(x_batch, future=future)
                # keep only the last seq_len columns so that y_pred lines up with y_batch
                y_pred = (
                    y_pred[:, -y_batch.shape[1] :] if y_pred.shape[1] > y_batch.shape[1] else y_pred
                )
                loss = self.loss_fn(y_pred, y_batch)
                # print("eloss:{}".format(loss)) # has turned into NaN here
                test_loss += loss.item()
                actual += torch.squeeze(y_batch[:, -1]).data.cpu().numpy().tolist()
                predicted += torch.squeeze(y_pred[:, -1]).data.cpu().numpy().tolist()
            test_loss /= batch + 1  # batch is the 0-based index of the last batch
            return actual, predicted, test_loss

    def plot_losses(self):
        plt.plot(self.train_losses, lw=1, label="Training loss")
        plt.plot(self.val_losses, lw=1, label="Validation loss")
        plt.legend()
        plt.title("Losses")

def transform_data(arr, seq_len):
    x, y = [], []
    for i in range(len(arr) - seq_len):
        x_i = arr[i : i + seq_len]
        y_i = arr[i + 1 : i + seq_len + 1]
        x.append(x_i)
        y.append(y_i)
    x_arr = np.array(x).reshape(-1, seq_len)
    y_arr = np.array(y).reshape(-1, seq_len)
    x_var = Variable(torch.from_numpy(x_arr).float().to(cuda_device))
    y_var = Variable(torch.from_numpy(y_arr).float().to(cuda_device))
    return x_var, y_var

def plot_sequence(axes, i, x_train, y_train):
    axes[i].set_title("%d. Sequence" % (i + 1))
    axes[i].set_xlabel("Time bars")
    axes[i].set_ylabel("Scaled VWAP")
    axes[i].plot(range(seq_len), x_train[i].cpu().numpy(), color="r", lw=1, label="Feature")
    axes[i].plot(range(1, seq_len + 1), y_train[i].cpu().numpy(), color="b", lw=1, label="Target")
    axes[i].legend()

def generate_sequence(scaler, model, x_sample, future=1000):
    y_pred_tensor = model(x_sample, future=future)
    y_pred = y_pred_tensor.cpu().tolist()
    y_pred = scaler.inverse_transform(y_pred)
    return y_pred

def to_dataframe(actual, predicted):
    return pd.DataFrame({"actual": actual, "predicted": predicted})

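def inverse_transform(scaler, df, columns):
    for col in columns:
        # StandardScaler expects a 2D array, so reshape the column and flatten the result
        df[col] = scaler.inverse_transform(df[col].to_numpy().reshape(-1, 1)).ravel()
    return df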

def minutes_of_new_data(symbol, kline_size, data):
    if len(data) > 0:
        old = parser.parse(data["timestamp"].iloc[-1])
    else:
        old = bitmex_client.Trade.Trade_getBucketed(symbol=symbol, 
                binSize=kline_size, count=1, reverse=False).result()[0][0]['timestamp']
    new = bitmex_client.Trade.Trade_getBucketed(symbol=symbol, 
                binSize=kline_size, count=1, reverse=True).result()[0][0]['timestamp']
    return old, new

def get_all_bitmex(symbol, kline_size, save = False):
    filename = 'data/%s-%s-data.csv' % (symbol, kline_size)
    if os.path.isfile(filename):
        data_df = pd.read_csv(filename)
    else:
        data_df = pd.DataFrame()
    oldest_point, newest_point = minutes_of_new_data(symbol, kline_size, data_df)
    delta_min = (newest_point - oldest_point).total_seconds()/60
    available_data = math.ceil(delta_min/binsizes[kline_size])
    rounds = math.ceil(available_data / batch_size)
    if rounds > 0:
        for round_num in range(rounds):
            time.sleep(1)
            new_time = (oldest_point + timedelta(minutes = round_num * batch_size * binsizes[kline_size]))
            data = bitmex_client.Trade.Trade_getBucketed(symbol=symbol, 
                    binSize=kline_size, count=batch_size, startTime = new_time).result()[0]
            temp_df = pd.DataFrame(data)
            data_df = pd.concat([data_df, temp_df])  # DataFrame.append was removed in pandas 2.0
    data_df.set_index('timestamp', inplace=True)
    if save and rounds > 0:
        data_df.to_csv(filename)
    return data_df

if __name__ == '__main__':
    os.chdir(os.path.dirname(os.path.abspath(__file__)))
    ownprefix = os.path.basename(__file__)

    warnings.simplefilter('ignore')
    pd.set_option('display.max_columns', 100)
    np.set_printoptions(precision=3, suppress=True, formatter={'float': '{: 0.2f}'.format}) # align the number of digits

    start_time = time.perf_counter()
    print("start time: ", datetime.now().strftime("%H:%M:%S"))
    
    print("pandas==%s" % pd.__version__)
    print("numpy==%s" % np.__version__)
    print("torch==%s" % torch.__version__)
    print("matplotlib==%s" % matplotlib.__version__)
    
    cuda_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print("cuda_device:",cuda_device)
    if cuda_device.type != "cpu":  # compare the device type, not the device object, with a string
        print("devicde_name:",torch.cuda.get_device_name(torch.cuda.current_device()))
        torch.cuda.manual_seed(1)
    np.random.seed(1)
    random.seed(1)
    torch.manual_seed(1)
    
    if os.path.exists('{}.pickle'.format(ownprefix)):
        print("read_pickle:")
        df = pd.read_pickle('{}.pickle'.format(ownprefix))
    else:
        print("get_from_bitmex:")
        ## bitmex API
        # bitmex_api_key = ''    #Enter your own API-key here
        # bitmex_api_secret = '' #Enter your own API-secret here
        # binsizes = {"1m": 1, "5m": 5, "1h": 60, "1d": 1440}
        # batch_size = 750
        # bitmex_client = bitmex(test=False, api_key=bitmex_api_key, api_secret=bitmex_api_secret)
        # df = get_all_bitmex("XBTUSD","5m",save=True)
        ##
        # https://public.bitmex.com/?prefix=data/trade/
        files = sorted(glob.glob('./data/2019*.csv.gz'))
        # files = sorted(glob.glob('./data/2020*.csv.gz'))
        print("files:",files)
        df = pd.concat(map(pd.read_csv, files))
        df = df[df.symbol == 'XBTUSD']
        df.timestamp = pd.to_datetime(df.timestamp.str.replace('D', 'T'))
        df = df.sort_values('timestamp')
        df.set_index('timestamp', inplace=True)
        df.to_pickle('{}.pickle'.format(ownprefix))
        df.to_csv('{}.csv'.format(ownprefix))
    
    print("df.shape:",df.shape)
    print("df.tail:",df.tail(-5))
    

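    '''
    Compute the volume-weighted average price (VWAP) at 1-minute intervals.
    Trades are grouped into predefined time intervals because trading activity in the market varies over time.
    '''
    # pd.np was removed in pandas 2.0, so call numpy directly
    df_vwap = df.groupby(pd.Grouper(freq="1Min")).apply(
                            lambda row: np.sum(row.price * row.foreignNotional) 
                                        / np.sum(row.foreignNotional))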
    print("df_vwap.shape:",df_vwap.shape)
    
    df_vwap.plot(figsize=(14, 7))    
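    # save before show(); the figure may be empty once the window has been closed
    plt.savefig('{}1-df_vwap.plot.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()
    
    '''
    Split the volume-weighted average price (VWAP) data into training data (0.65), validation data (0.08), and test data (0.27).
    '''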
    train_len = round(len(df_vwap)*0.65)
    val_len = round(len(df_vwap)*0.08)
    
    df_train = df_vwap[:train_len].to_frame(name="vwap")
    print("df_train.shape:",df_train.shape)
    print("df_train.tail:",df_train.tail(-5))

    df_val = df_vwap[train_len:(train_len + val_len)].to_frame(name="vwap")
    print("df_val.shape:",df_val.shape)
    print("df_val.tail:",df_val.tail(-5))

    df_test = df_vwap[(train_len + val_len):].to_frame(name='vwap')
    print("df_test.shape:",df_test.shape)
    print("df_test.tail:",df_test.tail(-5))
    
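    '''
    Scale the data so that the LSTM model converges faster.
    Large input values can slow down training.
    Use StandardScaler from the sklearn library.
    The scaler is fitted on the training set and then used to transform the unseen trade data in the validation and test sets.
    If the scaler were fitted on all of the data, information from the validation and test sets would leak into training:
    the model would overfit, giving good results on this data but worse performance on real data.
    '''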
    scaler = StandardScaler()
    print("scaler type:",type(scaler),"\n",scaler)

    train_arr = scaler.fit_transform(df_train)
    print("train_arr.shape:",train_arr.shape,"train_arr type:",type(train_arr),"\n",train_arr)

    val_arr = scaler.transform(df_val)
    print("val_arr.shape:",val_arr.shape,"val_arr type:",type(val_arr),"\n",val_arr)

    test_arr = scaler.transform(df_test)
    print("test_arr.shape:",test_arr.shape,"test_arr type:",type(test_arr),"\n",test_arr)

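    '''
    After scaling, convert the data into a format suitable for modeling with an LSTM.
    Convert the long data sequence into many short sequences (100 time bars each), each shifted by a single time bar.
    The plots below show the first, second, and third sequences of the training set.
    Each sequence is 100 time bars long.
    In each sequence the target is almost identical to the feature; they differ only in the first and last time bars.
    How does the LSTM use the sequences during the training phase?
    First, focus on the first sequence.
    The model takes the feature of the time bar at index 0 and tries to predict the target of the time bar at index 1.
    Next, it takes the feature of the time bar at index 1 and tries to predict the target of the time bar at index 2, and so on.
    The features of the second sequence are shifted by one time bar from those of the first sequence,
    and the features of the third sequence are shifted by one time bar from those of the second.
    This procedure yields many short sequences, each shifted by a single time bar.
    Note that in classification or regression tasks there is usually a set of features and a separate target to predict.
    In this LSTM example, the features and the target come from the same sequence; the only difference is that the target is shifted by one time bar.
    '''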
    seq_len = 100
    x_train, y_train = transform_data(train_arr, seq_len)
    print("x_train.shape:",x_train.shape,"x_train type:",type(x_train),"\n",x_train)
    print("y_train.shape:",y_train.shape,"y_train type:",type(y_train),"\n",y_train)

    x_val, y_val = transform_data(val_arr, seq_len)
    print("x_val.shape:",x_val.shape,"x_val type:",type(x_val),"\n",x_val)
    print("y_val.shape:",y_val.shape,"y_val type:",type(y_val),"\n",y_val)

    x_test, y_test = transform_data(test_arr, seq_len)
    print("x_test.shape:",x_test.shape,"x_test type:",type(x_test),"\n",x_test)
    print("y_test.shape:",y_test.shape,"y_test type:",type(y_test),"\n",y_test)
    
    fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(14, 7))
    plot_sequence(axes, 0, x_train, y_train)
    plot_sequence(axes, 1, x_train, y_train)    
    plot_sequence(axes, 2, x_train, y_train)    
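    plt.savefig('{}2-plot_sequence.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()
    
    '''
    Training the LSTM
    Train an LSTM with 21 hidden units.
    Keeping the number of units small makes it less likely that the LSTM simply memorizes the sequences.
    Training uses the mean squared error loss function and the Adam optimizer.
    The learning rate decays every 5 epochs (the text says 0.001; the optimizer below is created with lr=1e-4).
    Train for 15 epochs with 100 sequences per batch.
    The plot shows that the training and validation losses converge at epoch 6.
    '''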
    # model_1 = Model(input_size=1, hidden_size=21, output_size=1)
    model_1 = Model(input_size=1, hidden_size=21, output_size=1).to(cuda_device)
    print("model_1 type:",type(model_1),"\n",model_1)

    loss_fn_1 = nn.MSELoss()
    # loss_fn_1 = nn.BCELoss()
    # loss_fn_1 = nn.BCEWithLogitsLoss()
    print("loss_fn_1 type:",type(loss_fn_1),"\n",loss_fn_1)

    optimizer_1 = optim.Adam(model_1.parameters(), lr=1e-4)
    print("optimizer_1 type:",type(optimizer_1),"\n",optimizer_1)

    scheduler_1 = optim.lr_scheduler.StepLR(optimizer_1, step_size=5, gamma=0.1)
    # scheduler_1 = torch.optim.lr_scheduler.MultiStepLR(optimizer_1, milestones=[2, 6], gamma=0.1)
    print("scheduler_1 type:",type(scheduler_1),"\n",scheduler_1)

    optimization_1 = Optimization(model_1, loss_fn_1, optimizer_1, scheduler_1)
    print("optimization_1 type:",type(optimization_1),"\n",optimization_1)
    
    optimization_1.train(x_train, y_train, x_val, y_val, do_teacher_forcing=False)
    print("optimization_1 type:",type(optimization_1),"\n",optimization_1)

    optimization_1.plot_losses()    
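    plt.savefig('{}3-optimization_1.plot_losses.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()
    
    '''
    Evaluate the model on the test set.
    The future parameter is set to 5.
    The model outputs the VWAP it expects over the next 5 time bars (5 minutes in this example).
    This should make a price change visible a few minutes before it occurs.
    In the plot, the predicted values closely match the actual VWAP values.
    However, since the future parameter is set to 5, the orange line should react before a spike occurs rather than merely overlapping it.
    '''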
    actual_1, predicted_1, test_loss_1 = optimization_1.evaluate(x_test, y_test, batch_size=100, future=5)
    print("Test loss %.4f" % test_loss_1)
    df_result_1 = to_dataframe(actual_1, predicted_1) 
    df_result_1 = inverse_transform(scaler, df_result_1, ['actual', 'predicted'])

    df_result_1.plot(figsize=(14*2, 7), lw=0.3)    
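    plt.savefig('{}4-df_result_1.plot.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()

    '''
    Zooming in on the spikes (one at the start and one at the end of the time series) shows that the predicted values mimic the actual values.
    When the actual values change direction, the predicted values follow, which is of no use for prediction.
    The same happens when the future parameter is increased (it has no effect on the prediction line).
    '''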
    fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(14, 7))
    df_result_1.iloc[2350:2450].plot(ax=axes[0], figsize=(14, 7), lw=0.3)
    df_result_1.iloc[16000:17500].plot(ax=axes[1], figsize=(14, 7), lw=0.3)    
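    plt.savefig('{}5-df_result_1.iloc.plot.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()

    '''
    Use the model to generate 1000 time bars for the first test sequence, and compare the predicted values, the generated values, and the actual VWAP.
    While the model is outputting predicted values, they stay close to the actual values.
    However, once it starts generating values, the output looks almost like a sine wave.
    After a while, the value converges to 9600.
    '''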
    x_sample = x_test[0].reshape(1, -1)
    y_sample = df_test.vwap[:1100]     
    y_pred1 = generate_sequence(scaler, optimization_1.model, x_sample)
    
    plt.figure(figsize=(14, 7))
    plt.plot(range(100), y_pred1[0][:100], color="blue", lw=1, label="Predicted VWAP")
    plt.plot(range(100, 1100), y_pred1[0][100:], "--", color="blue", lw=1, label="Generated VWAP")
    plt.plot(range(0, 1100), y_sample, color="red", lw=1, label="Actual VWAP")
    plt.legend()    
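    plt.savefig('{}6-generate_sequence.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()

    '''
    This behavior probably occurs because the model was trained only on real inputs and never on generated inputs.
    When the model produces outputs from generated inputs, it becomes worse at generating the next value.
    I try to remedy this problem with teacher forcing.
    
    [Teacher forcing](https://machinelearningmastery.com/teacher-forcing-for-recurrent-neural-networks/) is
    a method for training recurrent neural networks in which the ground-truth value from the previous time step,
    rather than the model's own output, is fed in as the input.
    A trained RNN can generate a sequence by using its previous output as the current input.
    The same process can be used during training, but the model can become unstable or fail to converge.
    Teacher forcing is an approach for dealing with these problems during training.
    It is commonly used in language models.
    
    This time I use an extension of teacher forcing called [Scheduled sampling](https://arxiv.org/abs/1506.03099).
    During training, the model uses its own generated output as input with a certain probability.
    At first the probability that the model sees its generated output is small, and it gradually increases during training.
    Note that this example uses a random probability that does not increase during training.
    
    Train a model with teacher forcing enabled, using the same parameters as before.
    After 7 epochs, the training and validation losses converge.
    '''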
    # model_2 = Model(input_size=1, hidden_size=21, output_size=1)
    model_2 = Model(input_size=1, hidden_size=21, output_size=1).to(cuda_device)
    loss_fn_2 = nn.MSELoss()
    optimizer_2 = optim.Adam(model_2.parameters(), lr=1e-4)
    scheduler_2 = optim.lr_scheduler.StepLR(optimizer_2, step_size=5, gamma=0.1)
    optimization_2 = Optimization(model_2, loss_fn_2,  optimizer_2, scheduler_2)
    optimization_2.train(x_train, y_train, x_val, y_val, do_teacher_forcing=True)

    optimization_2.plot_losses()
    plt.savefig('{}7-optimization_2.plot_losses.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()
    
    
    '''
    The predicted sequence looks similar to the previous one.
    Zooming in on the spikes shows the same behavior, with the predicted values mimicking the actual values.
    Teacher forcing did not solve the problem...
    '''
    actual_2, predicted_2, test_loss_2 = optimization_2.evaluate(x_test, y_test, batch_size=100, future=5)
    print("Test loss %.4f" % test_loss_2)
    df_result_2 = to_dataframe(actual_2, predicted_2)
    df_result_2 = inverse_transform(scaler, df_result_2, ["actual", "predicted"])

    df_result_2.plot(figsize=(14*2, 7), lw=0.3)
    plt.savefig('{}8-df_result_2.plot.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()

    fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(14, 7))
    df_result_2.iloc[2350:2450].plot(ax=axes[0], figsize=(14, 7), lw=0.3)
    df_result_2.iloc[16000:17500].plot(ax=axes[1], figsize=(14, 7), lw=0.3)    
    plt.savefig('{}9-df_result_2.iloc.plot.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()
    
    
    '''
    Generate 1000 time bars for the first test sequence using the model trained with teacher forcing.
    '''
    y_pred2 = generate_sequence(scaler, optimization_2.model, x_sample)
    
    plt.figure(figsize=(14, 7))
    plt.plot(range(100), y_pred2[0][:100], color="blue", lw=1, label="Predicted VWAP")
    plt.plot(range(100, 1100), y_pred2[0][100:], "--", color="blue", lw=1, label="Generated VWAP")
    plt.plot(range(0, 1100), y_sample, color="red", lw=1, label="Actual VWAP")
    plt.legend()
    plt.savefig('{}10-generate_sequence.png'.format(ownprefix), dpi=175, bbox_inches='tight')
    plt.show()
    plt.close()
    
    
    end_time = time.perf_counter()
    print("end time: ", datetime.now().strftime("%H:%M:%S"))
    
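    elapsed = end_time - start_time  # separate name so the time module is not shadowed
    print("end_time - start_time:%f" % (elapsed))
    
    '''
    One interesting observation about the generated sequences is that the values generated by the model trained with teacher forcing take longer to converge.
    Another is that when the sequence is increasing, it keeps increasing up to a certain point and then starts decreasing; this pattern repeats until the sequence converges.
    The pattern looks like a sine wave with decreasing amplitude.
    
    ## Conclusion
    The validation shows that the model's predictions mimic the actual values of the sequence.
    Neither the first model nor the second model detects price changes before they occur.
    Adding another feature (such as volume) might help the model detect price changes before they occur,
    but the model would then have to generate two features so that its outputs can be fed back as inputs in the next step, which makes the model more complex.
    As the plots above show, the model is already able to predict the VWAP time series,
    so a more complex model (multiple LSTMCells, more hidden units) might not help.
    A more advanced teacher forcing method might improve the model's sequence generation ability...
    
    ## References
     - [Time sequence prediction](https://github.com/pytorch/examples/tree/master/time_sequence_prediction)
     - [Understanding LSTM Networks](https://colah.github.io/posts/2015-08-Understanding-LSTMs/)
     - [What is Teacher Forcing for Recurrent Neural Networks?](https://machinelearningmastery.com/teacher-forcing-for-recurrent-neural-networks/)
     - [Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks](https://arxiv.org/abs/1506.03099)
    '''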

Execution log 1 (log from running source code 1 on the 2019 BTC data)

$ python3 btc_prediction_by_lstm_pytorch.py
start time:  19:56:49
pandas==1.1.4
numpy==1.19.2
torch==1.5.0
matplotlib==3.3.3
cuda_device: cuda
devicde_name: GeForce RTX 2070
get_from_bitmex:
files: ['./data/20190801.csv.gz', './data/20190802.csv.gz', './data/20190803.csv.gz', './data/20190804.csv.gz', './data/20190805.csv.gz', './data/20190806.csv.gz', './data/20190807.csv.gz', './data/20190808.csv.gz', './data/20190809.csv.gz', './data/20190810.csv.gz', './data/20190811.csv.gz', './data/20190812.csv.gz', './data/20190813.csv.gz', './data/20190814.csv.gz', './data/20190815.csv.gz', './data/20190816.csv.gz', './data/20190817.csv.gz', './data/20190818.csv.gz', './data/20190819.csv.gz', './data/20190820.csv.gz', './data/20190821.csv.gz', './data/20190822.csv.gz', './data/20190823.csv.gz', './data/20190824.csv.gz', './data/20190825.csv.gz', './data/20190826.csv.gz', './data/20190827.csv.gz', './data/20190828.csv.gz', './data/20190829.csv.gz', './data/20190830.csv.gz', './data/20190831.csv.gz', './data/20190901.csv.gz', './data/20190902.csv.gz', './data/20190903.csv.gz', './data/20190904.csv.gz', './data/20190905.csv.gz', './data/20190906.csv.gz', './data/20190907.csv.gz', './data/20190908.csv.gz', './data/20190909.csv.gz', './data/20190910.csv.gz', './data/20190911.csv.gz', './data/20190912.csv.gz', './data/20190913.csv.gz', './data/20190914.csv.gz', './data/20190915.csv.gz', './data/20190916.csv.gz', './data/20190917.csv.gz']
df.shape: (36708098, 9)
df.tail:                             symbol  side   size    price  tickDirection  \
timestamp
2019-08-01 00:00:03.950526  XBTUSD   Buy     35  10089.0   ZeroPlusTick
2019-08-01 00:00:03.950526  XBTUSD   Buy     35  10089.0   ZeroPlusTick
2019-08-01 00:00:03.950526  XBTUSD   Buy     40  10089.0   ZeroPlusTick
2019-08-01 00:00:03.950526  XBTUSD   Buy   3117  10089.0   ZeroPlusTick
2019-08-01 00:00:03.956035  XBTUSD   Buy  18670  10089.0   ZeroPlusTick
...                            ...   ...    ...      ...            ...
2019-09-17 23:59:59.189310  XBTUSD  Sell   2000  10184.5  ZeroMinusTick
2019-09-17 23:59:59.189310  XBTUSD  Sell  15000  10184.5  ZeroMinusTick
2019-09-17 23:59:59.189310  XBTUSD  Sell  45383  10184.5  ZeroMinusTick
2019-09-17 23:59:59.517938  XBTUSD  Sell  10000  10184.5  ZeroMinusTick
2019-09-17 23:59:59.531223  XBTUSD  Sell      1  10184.5  ZeroMinusTick

                                                      trdMatchID  grossValue  \
timestamp
2019-08-01 00:00:03.950526  cec26c94-563c-7fcb-6194-d03ae2f41b92      346920
2019-08-01 00:00:03.950526  607b403e-4211-abda-8392-4003ad9f9ad0      346920
2019-08-01 00:00:03.950526  4d5cff30-eb11-43fe-caf4-1e150bc0bc18      396480
2019-08-01 00:00:03.950526  ab4806f1-2fac-6949-a5bc-07b4da4048f0    30895704
2019-08-01 00:00:03.956035  63205fcd-5165-3ac9-35d8-a39c632a5e60   185057040
...                                                          ...         ...
2019-09-17 23:59:59.189310  f2385097-5527-3075-3498-0b660dd39e4c    19638000
2019-09-17 23:59:59.189310  af7e0ceb-f670-b863-abb3-0feb76066711   147285000
2019-09-17 23:59:59.189310  b954d196-c63a-f86c-2599-cb86b409bbb6   445615677
2019-09-17 23:59:59.517938  69391f3f-65bc-c3bc-548d-963ffe5501df    98190000
2019-09-17 23:59:59.531223  6926bf37-72b7-3ac8-2c8a-3cdf22688ec0        9819

                            homeNotional  foreignNotional
timestamp
2019-08-01 00:00:03.950526      0.003469             35.0
2019-08-01 00:00:03.950526      0.003469             35.0
2019-08-01 00:00:03.950526      0.003965             40.0
2019-08-01 00:00:03.950526      0.308957           3117.0
2019-08-01 00:00:03.956035      1.850570          18670.0
...                                  ...              ...
2019-09-17 23:59:59.189310      0.196380           2000.0
2019-09-17 23:59:59.189310      1.472850          15000.0
2019-09-17 23:59:59.189310      4.456157          45383.0
2019-09-17 23:59:59.517938      0.981900          10000.0
2019-09-17 23:59:59.531223      0.000098              1.0

[36708093 rows x 9 columns]
df_vwap.shape: (69120,)
df_train.shape: (44928, 1)
df_train.tail:                              vwap
timestamp
2019-08-01 00:05:00  10108.710704
2019-08-01 00:06:00  10122.709866
2019-08-01 00:07:00  10122.340347
2019-08-01 00:08:00  10126.617881
2019-08-01 00:09:00  10147.380407
...                           ...
2019-09-01 04:43:00   9625.001311
2019-09-01 04:44:00   9625.045365
2019-09-01 04:45:00   9625.063765
2019-09-01 04:46:00   9626.437824
2019-09-01 04:47:00   9630.077496

[44923 rows x 1 columns]
df_val.shape: (5530, 1)
df_val.tail:                              vwap
timestamp
2019-09-01 04:53:00   9630.231584
2019-09-01 04:54:00   9629.291041
2019-09-01 04:55:00   9626.054305
2019-09-01 04:56:00   9626.067363
2019-09-01 04:57:00   9626.485287
...                           ...
2019-09-05 00:53:00  10545.065878
2019-09-05 00:54:00  10545.195416
2019-09-05 00:55:00  10542.311323
2019-09-05 00:56:00  10538.363399
2019-09-05 00:57:00  10537.195479

[5525 rows x 1 columns]
df_test.shape: (18662, 1)
df_test.tail:                              vwap
timestamp
2019-09-05 01:03:00  10522.959503
2019-09-05 01:04:00  10521.220404
2019-09-05 01:05:00  10517.852199
2019-09-05 01:06:00  10520.261375
2019-09-05 01:07:00  10520.428804
...                           ...
2019-09-17 23:55:00  10191.031001
2019-09-17 23:56:00  10194.615079
2019-09-17 23:57:00  10193.758451
2019-09-17 23:58:00  10187.193670
2019-09-17 23:59:00  10184.758720

[18657 rows x 1 columns]
scaler type: <class 'sklearn.preprocessing._data.StandardScaler'>
 StandardScaler()
train_arr.shape: (44928, 1) train_arr type: <class 'numpy.ndarray'>
 [[-0.71]
 [-0.69]
 [-0.71]
 ...
 [-1.37]
 [-1.36]
 [-1.36]]
val_arr.shape: (5530, 1) val_arr type: <class 'numpy.ndarray'>
 [[-1.36]
 [-1.36]
 [-1.36]
 ...
 [-0.09]
 [-0.10]
 [-0.10]]
test_arr.shape: (18662, 1) test_arr type: <class 'numpy.ndarray'>
 [[-0.10]
 [-0.10]
 [-0.11]
 ...
 [-0.58]
 [-0.59]
 [-0.59]]
x_train.shape: torch.Size([44828, 100]) x_train type: <class 'torch.Tensor'>
 tensor([[-0.7070, -0.6901, -0.7066,  ..., -0.8349, -0.8319, -0.8261],
        [-0.6901, -0.7066, -0.7005,  ..., -0.8319, -0.8261, -0.8263],
        [-0.7066, -0.7005, -0.6831,  ..., -0.8261, -0.8263, -0.8259],
        ...,
        [-1.3670, -1.3665, -1.3665,  ..., -1.3646, -1.3651, -1.3651],
        [-1.3665, -1.3665, -1.3640,  ..., -1.3651, -1.3651, -1.3651],
        [-1.3665, -1.3640, -1.3614,  ..., -1.3651, -1.3651, -1.3631]],
       device='cuda:0')
y_train.shape: torch.Size([44828, 100]) y_train type: <class 'torch.Tensor'>
 tensor([[-0.6901, -0.7066, -0.7005,  ..., -0.8319, -0.8261, -0.8263],
        [-0.7066, -0.7005, -0.6831,  ..., -0.8261, -0.8263, -0.8259],
        [-0.7005, -0.6831, -0.6940,  ..., -0.8263, -0.8259, -0.8228],
        ...,
        [-1.3665, -1.3665, -1.3640,  ..., -1.3651, -1.3651, -1.3651],
        [-1.3665, -1.3640, -1.3614,  ..., -1.3651, -1.3651, -1.3631],
        [-1.3640, -1.3614, -1.3612,  ..., -1.3651, -1.3631, -1.3581]],
       device='cuda:0')
x_val.shape: torch.Size([5430, 100]) x_val type: <class 'torch.Tensor'>
 tensor([[-1.3576, -1.3577, -1.3575,  ..., -1.3784, -1.3783, -1.3783],
        [-1.3577, -1.3575, -1.3579,  ..., -1.3783, -1.3783, -1.3783],
        [-1.3575, -1.3579, -1.3580,  ..., -1.3783, -1.3783, -1.3773],
        ...,
        [-0.0342, -0.0332, -0.0311,  ..., -0.0853, -0.0886, -0.0884],
        [-0.0332, -0.0311, -0.0231,  ..., -0.0886, -0.0884, -0.0924],
        [-0.0311, -0.0231, -0.0397,  ..., -0.0884, -0.0924, -0.0979]],
       device='cuda:0')
y_val.shape: torch.Size([5430, 100]) y_val type: <class 'torch.Tensor'>
 tensor([[-1.3577, -1.3575, -1.3579,  ..., -1.3783, -1.3783, -1.3783],
        [-1.3575, -1.3579, -1.3580,  ..., -1.3783, -1.3783, -1.3773],
        [-1.3579, -1.3580, -1.3579,  ..., -1.3783, -1.3773, -1.3774],
        ...,
        [-0.0332, -0.0311, -0.0231,  ..., -0.0886, -0.0884, -0.0924],
        [-0.0311, -0.0231, -0.0397,  ..., -0.0884, -0.0924, -0.0979],
        [-0.0231, -0.0397, -0.0419,  ..., -0.0924, -0.0979, -0.0995]],
       device='cuda:0')
x_test.shape: torch.Size([18562, 100]) x_test type: <class 'torch.Tensor'>
 tensor([[-0.1027, -0.1037, -0.1098,  ..., -0.0839, -0.0890, -0.0888],
        [-0.1037, -0.1098, -0.1119,  ..., -0.0890, -0.0888, -0.0886],
        [-0.1098, -0.1119, -0.1121,  ..., -0.0888, -0.0886, -0.0886],
        ...,
        [-0.5124, -0.5140, -0.5179,  ..., -0.5837, -0.5798, -0.5748],
        [-0.5140, -0.5179, -0.5202,  ..., -0.5798, -0.5748, -0.5760],
        [-0.5179, -0.5202, -0.5229,  ..., -0.5748, -0.5760, -0.5851]],
       device='cuda:0')
y_test.shape: torch.Size([18562, 100]) y_test type: <class 'torch.Tensor'>
 tensor([[-0.1037, -0.1098, -0.1119,  ..., -0.0890, -0.0888, -0.0886],
        [-0.1098, -0.1119, -0.1121,  ..., -0.0888, -0.0886, -0.0886],
        [-0.1119, -0.1121, -0.1192,  ..., -0.0886, -0.0886, -0.0886],
        ...,
        [-0.5140, -0.5179, -0.5202,  ..., -0.5798, -0.5748, -0.5760],
        [-0.5179, -0.5202, -0.5229,  ..., -0.5748, -0.5760, -0.5851],
        [-0.5202, -0.5229, -0.5224,  ..., -0.5760, -0.5851, -0.5885]],
       device='cuda:0')
model_1 type: <class '__main__.Model'>
 Model(
  (lstm): LSTMCell(1, 21)
  (linear): Linear(in_features=21, out_features=1, bias=True)
)
loss_fn_1 type: <class 'torch.nn.modules.loss.MSELoss'>
 MSELoss()
optimizer_1 type: <class 'torch.optim.adam.Adam'>
 Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.0001
    weight_decay: 0
)
scheduler_1 type: <class 'torch.optim.lr_scheduler.StepLR'>
 <torch.optim.lr_scheduler.StepLR object at 0x7fb3c80daac8>
optimization_1 type: <class '__main__.Optimization'>
 <__main__.Optimization object at 0x7fb3c80daa58>
Epoch 1 Train loss: 1.03. Validation loss: 0.71. Avg future: 0.00. Elapsed time: 14.63s.
Epoch 2 Train loss: 0.66. Validation loss: 0.22. Avg future: 0.00. Elapsed time: 14.95s.
Epoch 3 Train loss: 0.29. Validation loss: 0.16. Avg future: 0.00. Elapsed time: 14.95s.
Epoch 4 Train loss: 0.14. Validation loss: 0.08. Avg future: 0.00. Elapsed time: 14.70s.
Epoch 5 Train loss: 0.08. Validation loss: 0.05. Avg future: 0.00. Elapsed time: 14.71s.
Epoch 6 Train loss: 0.08. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 15.09s.
Epoch 7 Train loss: 0.06. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 15.07s.
Epoch 8 Train loss: 0.06. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 14.74s.
Epoch 9 Train loss: 0.06. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 15.76s.
Epoch 10 Train loss: 0.05. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 15.07s.
Epoch 11 Train loss: 0.05. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 14.75s.
Epoch 12 Train loss: 0.05. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 14.30s.
Epoch 13 Train loss: 0.05. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 15.01s.
Epoch 14 Train loss: 0.05. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 15.27s.
Epoch 15 Train loss: 0.05. Validation loss: 0.03. Avg future: 0.00. Elapsed time: 15.07s.
optimization_1 type: <class '__main__.Optimization'>
 <__main__.Optimization object at 0x7fb3c80daa58>
Test loss 0.0038
Epoch 1 Train loss: 0.89. Validation loss: 0.46. Avg future: 25.75. Elapsed time: 14.54s.
Epoch 2 Train loss: 0.51. Validation loss: 0.17. Avg future: 25.30. Elapsed time: 15.22s.
Epoch 3 Train loss: 0.15. Validation loss: 0.08. Avg future: 24.66. Elapsed time: 15.31s.
Epoch 4 Train loss: 0.08. Validation loss: 0.05. Avg future: 24.25. Elapsed time: 15.34s.
Epoch 5 Train loss: 0.06. Validation loss: 0.04. Avg future: 25.27. Elapsed time: 15.14s.
Epoch 6 Train loss: 0.07. Validation loss: 0.02. Avg future: 24.95. Elapsed time: 15.41s.
Epoch 7 Train loss: 0.06. Validation loss: 0.02. Avg future: 24.85. Elapsed time: 15.41s.
Epoch 8 Train loss: 0.05. Validation loss: 0.02. Avg future: 25.62. Elapsed time: 14.70s.
Epoch 9 Train loss: 0.05. Validation loss: 0.02. Avg future: 24.58. Elapsed time: 14.90s.
Epoch 10 Train loss: 0.05. Validation loss: 0.02. Avg future: 26.30. Elapsed time: 14.72s.
Epoch 11 Train loss: 0.05. Validation loss: 0.02. Avg future: 25.90. Elapsed time: 14.56s.
Epoch 12 Train loss: 0.05. Validation loss: 0.02. Avg future: 26.14. Elapsed time: 15.48s.
Epoch 13 Train loss: 0.04. Validation loss: 0.02. Avg future: 25.35. Elapsed time: 15.84s.
Epoch 14 Train loss: 0.04. Validation loss: 0.02. Avg future: 25.82. Elapsed time: 15.05s.
Epoch 15 Train loss: 0.04. Validation loss: 0.02. Avg future: 24.90. Elapsed time: 15.24s.
Test loss 0.0024
end time:  20:10:52
end_time - start_time:843.225909
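
The tensor shapes in the log above follow from the windowing step: each split of N one-minute bars is cut into overlapping 100-bar sequences, and the target is the same series shifted by one bar, so N bars yield N - 100 sequence pairs (44928 → 44828 for the training split). Below is a minimal NumPy sketch of that conversion. The helper name `make_sequences` and its implementation are hypothetical and only for illustration; they are not taken from Source code 1.

```python
import numpy as np

def make_sequences(series, seq_len=100):
    """Cut a 1-D series into overlapping windows shifted by one bar.

    Returns (x, y) where y[i] is x[i] moved one bar into the future.
    Hypothetical helper for illustration only.
    """
    series = np.asarray(series, dtype=np.float32).reshape(-1)
    n_seq = len(series) - seq_len  # one extra bar is needed for the shifted target
    if n_seq <= 0:
        raise ValueError("series must be longer than seq_len")
    # Row i of idx selects series[i : i + seq_len + 1].
    idx = np.arange(seq_len + 1)[None, :] + np.arange(n_seq)[:, None]
    windows = series[idx]
    x = windows[:, :-1]  # bars i .. i+seq_len-1 (features)
    y = windows[:, 1:]   # bars i+1 .. i+seq_len (targets)
    return x, y

# Same length as the training split in the log (dummy values, not real prices).
x, y = make_sequences(np.arange(44928), seq_len=100)
print(x.shape, y.shape)
```

With 44928 input bars this gives two arrays of shape (44828, 100), matching `x_train.shape` and `y_train.shape` in the log.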

Execution log 2 (Source code 1 run on 2020 BTC data; the loss becomes NaN)

$ python3 btc_prediction_by_lstm_pytorch.py
start time:  20:14:13
pandas==1.1.4
numpy==1.19.2
torch==1.5.0
matplotlib==3.3.3
cuda_device: cuda
devicde_name: GeForce RTX 2070
get_from_bitmex:
files: ['./data/20201031.csv.gz', './data/20201101.csv.gz', './data/20201102.csv.gz', './data/20201103.csv.gz', './data/20201104.csv.gz', './data/20201105.csv.gz', './data/20201106.csv.gz', './data/20201107.csv.gz', './data/20201108.csv.gz', './data/20201109.csv.gz', './data/20201110.csv.gz', './data/20201111.csv.gz', './data/20201112.csv.gz', './data/20201113.csv.gz', './data/20201114.csv.gz', './data/20201115.csv.gz', './data/20201116.csv.gz', './data/20201117.csv.gz', './data/20201118.csv.gz', './data/20201119.csv.gz', './data/20201120.csv.gz', './data/20201121.csv.gz', './data/20201122.csv.gz', './data/20201123.csv.gz', './data/20201124.csv.gz', './data/20201125.csv.gz', './data/20201126.csv.gz', './data/20201127.csv.gz', './data/20201128.csv.gz', './data/20201129.csv.gz', './data/20201130.csv.gz', './data/20201201.csv.gz', './data/20201202.csv.gz', './data/20201203.csv.gz', './data/20201204.csv.gz', './data/20201205.csv.gz', './data/20201206.csv.gz', './data/20201207.csv.gz', './data/20201208.csv.gz', './data/20201209.csv.gz', './data/20201210.csv.gz', './data/20201211.csv.gz', './data/20201212.csv.gz', './data/20201213.csv.gz', './data/20201214.csv.gz', './data/20201215.csv.gz', './data/20201216.csv.gz', './data/20201217.csv.gz', './data/20201218.csv.gz', './data/20201219.csv.gz', './data/20201220.csv.gz', './data/20201221.csv.gz', './data/20201222.csv.gz', './data/20201223.csv.gz']
df.shape: (18271747, 9)
df.tail:                             symbol  side   size    price tickDirection  \
timestamp
2020-10-31 00:00:02.626781  XBTUSD   Buy      1  13559.0      PlusTick
2020-10-31 00:00:02.748137  XBTUSD   Buy  24061  13560.5  ZeroPlusTick
2020-10-31 00:00:02.748137  XBTUSD   Buy   1414  13560.5  ZeroPlusTick
2020-10-31 00:00:02.748137  XBTUSD   Buy    150  13560.5  ZeroPlusTick
2020-10-31 00:00:02.748137  XBTUSD   Buy    100  13560.5      PlusTick
...                            ...   ...    ...      ...           ...
2020-12-23 23:59:58.571018  XBTUSD  Sell    420  23245.0     MinusTick
2020-12-23 23:59:58.580506  XBTUSD   Buy     13  23244.5     MinusTick
2020-12-23 23:59:58.593966  XBTUSD   Buy     10  23243.5     MinusTick
2020-12-23 23:59:58.597077  XBTUSD  Sell    447  23243.0     MinusTick
2020-12-23 23:59:58.646200  XBTUSD   Buy     11  23241.0     MinusTick

                                                      trdMatchID  grossValue  \
timestamp
2020-10-31 00:00:02.626781  eac8256e-bbe9-ad3c-9e63-17719295a974        7375
2020-10-31 00:00:02.748137  5581c2ae-b0ad-5121-858a-ce9c44d74943   177425814
2020-10-31 00:00:02.748137  e8a864fb-aa79-35e1-84af-8cbcaa127694    10426836
2020-10-31 00:00:02.748137  813afd63-06a0-6f79-f1b5-ee7643ed1eec     1106100
2020-10-31 00:00:02.748137  80a4d7f2-7a83-e7a9-4e9f-bb287c408b44      737400
...                                                          ...         ...
2020-12-23 23:59:58.571018  a12bbffc-d9d7-083c-cbd5-dbadb88cb0a0     1806840
2020-12-23 23:59:58.580506  f1e97f20-6739-4145-715a-fe0914393f20       55926
2020-12-23 23:59:58.593966  b57033c9-4dd2-9c96-04e5-7bf702e36806       43020
2020-12-23 23:59:58.597077  35a0a31a-4a9d-34e9-208f-5af533a576c5     1922994
2020-12-23 23:59:58.646200  b3cb9ba2-cebc-cdf5-528c-ce7c7b79269b       47333

                            homeNotional  foreignNotional
timestamp
2020-10-31 00:00:02.626781      0.000074              1.0
2020-10-31 00:00:02.748137      1.774258          24061.0
2020-10-31 00:00:02.748137      0.104268           1414.0
2020-10-31 00:00:02.748137      0.011061            150.0
2020-10-31 00:00:02.748137      0.007374            100.0
...                                  ...              ...
2020-12-23 23:59:58.571018      0.018068            420.0
2020-12-23 23:59:58.580506      0.000559             13.0
2020-12-23 23:59:58.593966      0.000430             10.0
2020-12-23 23:59:58.597077      0.019230            447.0
2020-12-23 23:59:58.646200      0.000473             11.0

[18271742 rows x 9 columns]
df_vwap.shape: (77760,)
df_train.shape: (50544, 1)
df_train.tail:                              vwap
timestamp
2020-10-31 00:05:00  13554.231586
2020-10-31 00:06:00  13557.871331
2020-10-31 00:07:00  13559.282798
2020-10-31 00:08:00  13557.649806
2020-10-31 00:09:00  13560.874972
...                           ...
2020-12-05 02:19:00  18741.511471
2020-12-05 02:20:00  18730.342480
2020-12-05 02:21:00  18724.025946
2020-12-05 02:22:00  18720.223295
2020-12-05 02:23:00  18721.599737

[50539 rows x 1 columns]
df_val.shape: (6221, 1)
df_val.tail:                              vwap
timestamp
2020-12-05 02:29:00  18745.846928
2020-12-05 02:30:00  18758.132585
2020-12-05 02:31:00  18765.631351
2020-12-05 02:32:00  18774.092713
2020-12-05 02:33:00  18788.076590
...                           ...
2020-12-09 10:00:00  18047.134980
2020-12-09 10:01:00  18026.450682
2020-12-09 10:02:00  18011.715896
2020-12-09 10:03:00  17995.496554
2020-12-09 10:04:00  18005.457240

[6216 rows x 1 columns]
df_test.shape: (20995, 1)
df_test.tail:                              vwap
timestamp
2020-12-09 10:10:00  17984.900355
2020-12-09 10:11:00  17991.922489
2020-12-09 10:12:00  17983.009223
2020-12-09 10:13:00  17979.464915
2020-12-09 10:14:00  17977.227506
...                           ...
2020-12-23 23:55:00  23248.030879
2020-12-23 23:56:00  23206.786654
2020-12-23 23:57:00  23264.388328
2020-12-23 23:58:00  23264.377437
2020-12-23 23:59:00  23268.652009

[20990 rows x 1 columns]
scaler type: <class 'sklearn.preprocessing._data.StandardScaler'>
 StandardScaler()
train_arr.shape: (50544, 1) train_arr type: <class 'numpy.ndarray'>
 [[-1.72]
 [-1.72]
 [-1.72]
 ...
 [ 1.06]
 [ 1.06]
 [ 1.06]]
val_arr.shape: (6221, 1) val_arr type: <class 'numpy.ndarray'>
 [[ 1.06]
 [ 1.07]
 [ 1.07]
 ...
 [ 0.67]
 [ 0.67]
 [ 0.67]]
test_arr.shape: (20995, 1) test_arr type: <class 'numpy.ndarray'>
 [[ 0.67]
 [ 0.66]
 [ 0.67]
 ...
 [ 3.51]
 [ 3.51]
 [ 3.51]]
x_train.shape: torch.Size([50444, 100]) x_train type: <class 'torch.Tensor'>
 tensor([[-1.7223, -1.7214, -1.7227,  ..., -1.6951, -1.6975, -1.6995],
        [-1.7214, -1.7227, -1.7272,  ..., -1.6975, -1.6995, -1.7009],
        [-1.7227, -1.7272, -1.7283,  ..., -1.6995, -1.7009, -1.6961],
        ...,
        [ 1.0484,  1.0509,  1.0464,  ...,  1.0668,  1.0681,  1.0621],
        [ 1.0509,  1.0464,  1.0463,  ...,  1.0681,  1.0621,  1.0587],
        [ 1.0464,  1.0463,  1.0454,  ...,  1.0621,  1.0587,  1.0566]],
       device='cuda:0')
y_train.shape: torch.Size([50444, 100]) y_train type: <class 'torch.Tensor'>
 tensor([[-1.7214, -1.7227, -1.7272,  ..., -1.6975, -1.6995, -1.7009],
        [-1.7227, -1.7272, -1.7283,  ..., -1.6995, -1.7009, -1.6961],
        [-1.7272, -1.7283, -1.7272,  ..., -1.7009, -1.6961, -1.6963],
        ...,
        [ 1.0509,  1.0464,  1.0463,  ...,  1.0681,  1.0621,  1.0587],
        [ 1.0464,  1.0463,  1.0454,  ...,  1.0621,  1.0587,  1.0566],
        [ 1.0463,  1.0454,  1.0409,  ...,  1.0587,  1.0566,  1.0574]],
       device='cuda:0')
x_val.shape: torch.Size([6121, 100]) x_val type: <class 'torch.Tensor'>
 tensor([[1.0619, 1.0737, 1.0740,  ..., 1.1188, 1.1161, 1.1076],
        [1.0737, 1.0740, 1.0740,  ..., 1.1161, 1.1076, 1.1094],
        [1.0740, 1.0740, 1.0698,  ..., 1.1076, 1.1094, 1.1188],
        ...,
        [0.5756, 0.5710, 0.5978,  ..., 0.6945, 0.6939, 0.6828],
        [0.5710, 0.5978, 0.5860,  ..., 0.6939, 0.6828, 0.6748],
        [0.5978, 0.5860, 0.5710,  ..., 0.6828, 0.6748, 0.6661]],
       device='cuda:0')
y_val.shape: torch.Size([6121, 100]) y_val type: <class 'torch.Tensor'>
 tensor([[1.0737, 1.0740, 1.0740,  ..., 1.1161, 1.1076, 1.1094],
        [1.0740, 1.0740, 1.0698,  ..., 1.1076, 1.1094, 1.1188],
        [1.0740, 1.0698, 1.0705,  ..., 1.1094, 1.1188, 1.1192],
        ...,
        [0.5710, 0.5978, 0.5860,  ..., 0.6939, 0.6828, 0.6748],
        [0.5978, 0.5860, 0.5710,  ..., 0.6828, 0.6748, 0.6661],
        [0.5860, 0.5710, 0.5541,  ..., 0.6748, 0.6661, 0.6715]],
       device='cuda:0')
x_test.shape: torch.Size([20895, 100]) x_test type: <class 'torch.Tensor'>
 tensor([[0.6657, 0.6632, 0.6724,  ..., 0.8106, 0.8074, 0.8068],
        [0.6632, 0.6724, 0.6659,  ..., 0.8074, 0.8068, 0.8062],
        [0.6724, 0.6659, 0.6675,  ..., 0.8068, 0.8062, 0.8041],
        ...,
        [3.3850, 3.3734, 3.3574,  ..., 3.5053, 3.4966, 3.4744],
        [3.3734, 3.3574, 3.3327,  ..., 3.4966, 3.4744, 3.5054],
        [3.3574, 3.3327, 3.2934,  ..., 3.4744, 3.5054, 3.5054]],
       device='cuda:0')
y_test.shape: torch.Size([20895, 100]) y_test type: <class 'torch.Tensor'>
 tensor([[0.6632, 0.6724, 0.6659,  ..., 0.8074, 0.8068, 0.8062],
        [0.6724, 0.6659, 0.6675,  ..., 0.8068, 0.8062, 0.8041],
        [0.6659, 0.6675, 0.6604,  ..., 0.8062, 0.8041, 0.8065],
        ...,
        [3.3734, 3.3574, 3.3327,  ..., 3.4966, 3.4744, 3.5054],
        [3.3574, 3.3327, 3.2934,  ..., 3.4744, 3.5054, 3.5054],
        [3.3327, 3.2934, 3.2572,  ..., 3.5054, 3.5054, 3.5077]],
       device='cuda:0')
model_1 type: <class '__main__.Model'>
 Model(
  (lstm): LSTMCell(1, 21)
  (linear): Linear(in_features=21, out_features=1, bias=True)
)
loss_fn_1 type: <class 'torch.nn.modules.loss.MSELoss'>
 MSELoss()
optimizer_1 type: <class 'torch.optim.adam.Adam'>
 Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.0001
    weight_decay: 0
)
scheduler_1 type: <class 'torch.optim.lr_scheduler.StepLR'>
 <torch.optim.lr_scheduler.StepLR object at 0x7fae7cf75828>
optimization_1 type: <class '__main__.Optimization'>
 <__main__.Optimization object at 0x7faead82ebe0>
Epoch 1 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 18.00s.
Epoch 2 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 18.02s.
Epoch 3 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 18.04s.
Epoch 4 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 18.03s.
Epoch 5 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 18.00s.
Epoch 6 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 17.75s.
Epoch 7 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 17.87s.
Epoch 8 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 17.92s.
Epoch 9 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 17.73s.
Epoch 10 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 17.97s.
Epoch 11 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 17.93s.
Epoch 12 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 17.84s.
Epoch 13 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 17.92s.
Epoch 14 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 18.05s.
Epoch 15 Train loss: nan. Validation loss: nan. Avg future: 0.00. Elapsed time: 17.68s.
optimization_1 type: <class '__main__.Optimization'>
 <__main__.Optimization object at 0x7faead82ebe0>
Test loss nan
Epoch 1 Train loss: nan. Validation loss: nan. Avg future: 25.69. Elapsed time: 18.89s.
Epoch 2 Train loss: nan. Validation loss: nan. Avg future: 25.42. Elapsed time: 19.03s.
Epoch 3 Train loss: nan. Validation loss: nan. Avg future: 23.72. Elapsed time: 18.80s.
Epoch 4 Train loss: nan. Validation loss: nan. Avg future: 24.97. Elapsed time: 18.47s.
Epoch 5 Train loss: nan. Validation loss: nan. Avg future: 25.43. Elapsed time: 18.58s.
Epoch 6 Train loss: nan. Validation loss: nan. Avg future: 24.85. Elapsed time: 18.69s.
Epoch 7 Train loss: nan. Validation loss: nan. Avg future: 25.38. Elapsed time: 18.40s.
Epoch 8 Train loss: nan. Validation loss: nan. Avg future: 24.75. Elapsed time: 18.42s.
Epoch 9 Train loss: nan. Validation loss: nan. Avg future: 26.19. Elapsed time: 18.63s.
Epoch 10 Train loss: nan. Validation loss: nan. Avg future: 25.97. Elapsed time: 18.22s.
Epoch 11 Train loss: nan. Validation loss: nan. Avg future: 25.67. Elapsed time: 17.62s.
Epoch 12 Train loss: nan. Validation loss: nan. Avg future: 25.30. Elapsed time: 17.29s.
Epoch 13 Train loss: nan. Validation loss: nan. Avg future: 26.18. Elapsed time: 16.94s.
Epoch 14 Train loss: nan. Validation loss: nan. Avg future: 25.29. Elapsed time: 17.39s.
Epoch 15 Train loss: nan. Validation loss: nan. Avg future: 24.92. Elapsed time: 16.77s.
Test loss nan
end time:  20:26:48
end_time - start_time:754.422876
$
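
In this log the loss is NaN from the very first epoch. Besides the vanishing-gradient hypothesis, NaN this early can also come from non-finite values in the input data or from exploding gradients, and the log alone does not settle which. One cheap first check is to count NaN/inf entries in the scaled arrays before training. The sketch below assumes the arrays are available as NumPy arrays; the helper name `report_non_finite` and the toy values are hypothetical. As one example of how NaN could enter: if a one-minute bar contained no trades, a VWAP computed as sum(price*size)/sum(size) would be 0/0 for that bar.

```python
import numpy as np

def report_non_finite(name, arr):
    """Count NaN and +/-inf entries in an array and print a one-line summary."""
    arr = np.asarray(arr, dtype=np.float64)
    n_nan = int(np.isnan(arr).sum())
    n_inf = int(np.isinf(arr).sum())
    print(f"{name}: shape={arr.shape} nan={n_nan} inf={n_inf}")
    return n_nan, n_inf

# Toy data standing in for a scaled VWAP series (not real BitMEX values).
vwap = np.array([1.06, 1.07, np.nan, 0.67, 0.66])
n_nan, n_inf = report_non_finite("train_arr", vwap)

# One possible repair (an assumption, not the author's method):
# carry the last valid value forward over the gaps.
# This simple loop assumes the first element is finite.
filled = vwap.copy()
for i in range(1, len(filled)):
    if not np.isfinite(filled[i]):
        filled[i] = filled[i - 1]
```

If the counts are zero, gradient clipping with `torch.nn.utils.clip_grad_norm_` before `optimizer.step()` is a common next thing to try.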

Source code 2 (a simple PyTorch LSTM model, taken all the way through backtesting)

btc_prediction_and_backtest_by_pytorch.py

# -*- coding: utf-8 -*-
'''
btc_prediction_and_backtest_by_pytorch.py

Copyright (C) 2020 HIROSE Ken-ichi (hirosenokensan@gmail.com) 
                                                 All rights reserved.
 This is free software with ABSOLUTELY NO WARRANTY.
 
 This program is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 2 of the License, or
 (at your option) any later version.
 
 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.
 
 You should have received a copy of the GNU General Public License
 along with this program; if not, write to the Free Software
 Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA
 02111-1307, USA
'''

import glob
import warnings

import os
import math  # needed by get_all_bitmex (math.ceil)
import time
import random
# import pprint
# from dateutil import parser
from datetime import timedelta, datetime

import numpy as np
import pandas as pd
# import pandas_datareader.data as web

import matplotlib
import matplotlib.pyplot as plt

import sklearn
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score

import torch
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable
import skorch

from backtesting import Backtest, Strategy
from backtesting.lib import plot_heatmaps

# import bitmex

class LSTMClassifier(nn.Module):
    def __init__(self, lstm_input_dim, lstm_hidden_dim, target_dim):
        super(LSTMClassifier, self).__init__()
        self.input_dim = lstm_input_dim
        self.hidden_dim = lstm_hidden_dim
        self.lstm = nn.LSTM(input_size=lstm_input_dim, 
                            hidden_size=lstm_hidden_dim,
                            num_layers=1, #default
                            #dropout=0.2,
                            batch_first=True
                            )
        self.dense = nn.Linear(lstm_hidden_dim, target_dim)

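    def forward(self, X_input):
        # nn.LSTM returns (output, (h_n, c_n)); only the final hidden state h_n is used.
        _, lstm_out = self.lstm(X_input)
        # h_n has shape (num_layers, batch, hidden); flatten to (batch, hidden) for the dense layer.
        linear_out = self.dense(lstm_out[0].view(X_input.size(0), -1))
        return torch.sigmoid(linear_out)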

def prep_feature_data(batch_idx, time_steps, X_data, feature_num, cuda_device):  
    feats = torch.zeros((len(batch_idx), time_steps, feature_num), dtype=torch.float, device=cuda_device)
    for b_i, b_idx in enumerate(batch_idx):
        b_slc = slice(b_idx + 1 - time_steps, b_idx + 1) # store the past N bars as the time-step data
        feats[b_i, :, :] = X_data[b_slc, :]        
    return feats

def plot_losses(train_losses, val_losses):
    plt.plot(train_losses, lw=1, label="Training loss")
    plt.plot(val_losses, lw=1, label="Validation loss")
    plt.legend()
    plt.title("Losses")

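def get_all_bitmex(symbol, kline_size, save = False):
    # Fetch bucketed trades through the BitMEX API (the call in __main__ is commented out).
    # Expects binsizes, batch_size, bitmex_client and minutes_of_new_data to exist at module level.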
    filename = 'data/%s-%s-data.csv' % (symbol, kline_size)
    if os.path.isfile(filename):
        data_df = pd.read_csv(filename)
    else:
        data_df = pd.DataFrame()
    oldest_point, newest_point = minutes_of_new_data(symbol, kline_size, data_df)
    delta_min = (newest_point - oldest_point).total_seconds()/60
    available_data = math.ceil(delta_min/binsizes[kline_size])
    rounds = math.ceil(available_data / batch_size)
    if rounds > 0:
        for round_num in range(rounds):
            time.sleep(1)
            new_time = (oldest_point + timedelta(minutes = round_num * batch_size * binsizes[kline_size]))
            data = bitmex_client.Trade.Trade_getBucketed(symbol=symbol, 
                    binSize=kline_size, count=batch_size, startTime = new_time).result()[0]
            temp_df = pd.DataFrame(data)
            data_df = data_df.append(temp_df)
    data_df.set_index('timestamp', inplace=True)
    if save and rounds > 0:
        data_df.to_csv(filename)
    return data_df

class myCustomStrategy(Strategy):
    def init(self):
        self.model = LSTMClassifier(feature_num, lstm_hidden_dim, target_dim).to(cuda_device) # build the LSTM classifier
        self.model.load_state_dict(torch.load('{}.mdl'.format(ownprefix), map_location=torch.device(cuda_device))) # load the trained weights

    def next(self): 
        # Skip until enough bars have accumulated (moving_average_num + time_steps).
        # Trade only once per day: act every future_num bars (144 ten-minute bars = 1 day).
        if len(self.data) >= moving_average_num + time_steps and len(self.data) % future_num == 0:
            # 2. Prepare the data for inference
            x_array = self.prepare_data()
            x_tensor = torch.tensor(x_array, dtype=torch.float, device=cuda_device)
            # 3. Run the prediction
            with torch.no_grad():
                y_pred = self.predict(x_tensor.view(1, time_steps, feature_num))

            # 4. buy() if the prediction is 1 (up), otherwise sell()
            if y_pred == 1:
                self.buy(sl=self.data.Close[-1]*0.99, 
                         tp=self.data.Close[-1]*1.01)
            else:
                self.sell(sl=self.data.Close[-1]*1.01, 
                         tp=self.data.Close[-1]*0.99)

    def prepare_data(self):
        # Convert to a pandas DataFrame first
        tmp_df = pd.concat([
                    self.data.Open.to_series(), 
                    self.data.High.to_series(), 
                    self.data.Low.to_series(), 
                    self.data.Close.to_series(), 
                    self.data.Volume.to_series(), 
                    ], axis=1)

        # Express each value as a ratio to its 500-bar moving average.
        cols = tmp_df.columns
        for col in cols:
            tmp_df['Roll_' + col] = tmp_df[col].rolling(window=moving_average_num, min_periods=moving_average_num).mean()
            tmp_df[col] = tmp_df[col] / tmp_df['Roll_' + col] - 1

        # Return only the last time_steps rows
        return tmp_df.tail(time_steps)[cols].values

    def predict(self, x_array):
        y_score = self.model(x_array) 
        return np.round(y_score.view(-1).to('cpu').numpy())[0]

class mySimpleStrategy(Strategy):
    def init(self):
        pass

    def next(self): 
        # Go long after an up bar (Close > Open), otherwise go short.
        if self.data.Close[-1] > self.data.Open[-1]:
            self.buy()
        else:
            self.sell()


if __name__ == '__main__':
    os.chdir(os.path.dirname(os.path.abspath(__file__)))
    ownprefix = os.path.basename(__file__)

    warnings.simplefilter('ignore')
    pd.set_option('display.max_columns', 100)
    np.set_printoptions(precision=3, suppress=True, formatter={'float': '{: 0.2f}'.format}) # align the printed digits

    start_time = time.perf_counter()
    print("start time: ", datetime.now().strftime("%H:%M:%S"))
    
    print("pandas==%s" % pd.__version__)
    print("numpy==%s" % np.__version__)
    print("torch==%s" % torch.__version__)
    print("matplotlib==%s" % matplotlib.__version__)
    
    cuda_device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    print("cuda_device:",cuda_device)
    if cuda_device.type != "cpu": # compare the device type, not the torch.device object, with the string
        print("devicde_name:",torch.cuda.get_device_name(torch.cuda.current_device()))
        torch.cuda.manual_seed(1)
    np.random.seed(1)
    random.seed(1)
    torch.manual_seed(1)
    
    if os.path.exists('{}.pickle'.format(ownprefix)):
        print("read_pickle:")
        df = pd.read_pickle('{}.pickle'.format(ownprefix))
    else:
        print("get_from_bitmex:")
        ## bitmex API
        # bitmex_api_key = ''    #Enter your own API-key here
        # bitmex_api_secret = '' #Enter your own API-secret here
        # binsizes = {"1m": 1, "5m": 5, "1h": 60, "1d": 1440}
        # batch_size = 750
        # bitmex_client = bitmex(test=False, api_key=bitmex_api_key, api_secret=bitmex_api_secret)
        # df = get_all_bitmex("XBTUSD","5m",save=True)
        ##
        # https://public.bitmex.com/?prefix=data/trade/
        # files = sorted(glob.glob('./data/2019*.csv.gz'))
        files = sorted(glob.glob('./data/2020*.csv.gz'))
        print("files:",files)
        df = pd.concat(map(pd.read_csv, files))
        df = df[df.symbol == 'XBTUSD']
        df.timestamp = pd.to_datetime(df.timestamp.str.replace('D', 'T'))
        df = df.sort_values('timestamp')
        df.set_index('timestamp', inplace=True)
        df.to_pickle('{}.pickle'.format(ownprefix))
        df.to_csv('{}.csv'.format(ownprefix))
    
    print("df.shape:",df.shape)
    print("df.tail:",df.tail(-5)) # note: tail(-5) prints every row except the first 5

    # resample() frequency codes: H (hour), T (minute), S (second), B (business day, Mon-Fri), W (week)
    df_ohlcv = df['price'].resample('10T', label='left', closed='left').ohlc().assign(
                                    volume=df['foreignNotional'].resample('10T').sum().values)
    df_ohlcv.rename(columns={'timestamp':'Datetime','open':'Open','high':'High',
                    'low':'Low','close':'Close','volume':'Volume'}, inplace=True)
    print("df_ohlcv.shape:",df_ohlcv.shape,"df_ohlcv type:",type(df_ohlcv),"\n",df_ohlcv)

    '''
    # 1. Constants
    '''
    future_num = 144 # how many bars ahead to predict
    feature_num = 5 # five features: open, high, low, close, volume
    batch_size = 64 # batch_size = 128
    time_steps = 50 # LSTM time steps
    moving_average_num = 500 # number of candles in the moving average
    n_epocs = 30 
    
    lstm_hidden_dim = 16
    target_dim = 1

    '''
    # 2. Build the training labels
    '''
    future_price = df_ohlcv.iloc[future_num:]['Close'].values
    curr_price = df_ohlcv.iloc[:-future_num]['Close'].values
    y_data_tmp = future_price - curr_price
    print("future_price:",future_price,"\ncurr_price:",curr_price,"\ny_data_tmp:",y_data_tmp)

    y_data = np.zeros_like(y_data_tmp)
    y_data[y_data_tmp > 0] = 1
    y_data = y_data[moving_average_num:]
    print("y_data.shape:",y_data.shape,"y_data type:",type(y_data),"\n",y_data)

    '''
    # 3. Normalize prices
    '''
    cols = df_ohlcv.columns # cols = df.columns
    for col in cols:
        df_ohlcv['Roll_' + col] = df_ohlcv[col].rolling(window=moving_average_num, min_periods=moving_average_num).mean()
        df_ohlcv[col] = df_ohlcv[col] / df_ohlcv['Roll_' + col] - 1    
    print("df_ohlcv.shape:",df_ohlcv.shape,"df_ohlcv type:",type(df_ohlcv))
    print("df_ohlcv.tail:",df_ohlcv.tail(-5))  # note: tail(-5) drops the first five rows rather than selecting the last five

    X_data = df_ohlcv.iloc[moving_average_num:-future_num][cols].values # drop the first 500 candles (no moving average yet) and the last 144 (no future price to predict)
    print("X_data.shape:",X_data.shape,"X_data type:",type(X_data),"\n",X_data)

    '''
    # 4. Split the data and convert to Torch tensors
    '''
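    # Note: the split indices are computed from len(df_ohlcv) but applied to X_data, which is shorter.
    val_idx_from = round(len(df_ohlcv)*0.65) # indices used to split the data into train / val / test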
    test_idx_from = round(len(df_ohlcv)*0.65) + round(len(df_ohlcv)*0.08)
    print("val_idx_from:",val_idx_from,"test_idx_from:",test_idx_from)
   
    X_train = torch.tensor(X_data[:val_idx_from], dtype=torch.float, device=cuda_device) # training data
    y_train = torch.tensor(y_data[:val_idx_from], dtype=torch.float, device=cuda_device)
    print("X_train.shape:",X_train.shape,"X_train type:",type(X_train),"\n",X_train)
    print("y_train.shape:",y_train.shape,"y_train type:",type(y_train),"\n",y_train)
    
    X_val   = torch.tensor(X_data[val_idx_from:test_idx_from], dtype=torch.float, device=cuda_device) # validation data
    y_val   = y_data[val_idx_from:test_idx_from]
    print("X_val.shape:",X_val.shape,"X_val type:",type(X_val),"\n",X_val)
    print("y_val.shape:",y_val.shape,"y_val type:",type(y_val),"\n",y_val)
    
    X_test  = torch.tensor(X_data[test_idx_from:], dtype=torch.float, device=cuda_device) # test data
    y_test  = y_data[test_idx_from:]
    print("X_test.shape:",X_test.shape,"X_test type:",type(X_test),"\n",X_test)
    print("y_test.shape:",y_test.shape,"y_test type:",type(y_test),"\n",y_test)

    '''
    # 5. Build the LSTM model
    '''
    model = LSTMClassifier(feature_num, lstm_hidden_dim, target_dim).to(cuda_device)
    print("model type:",type(model),"\n",model)

    loss_function = nn.BCELoss()
    print("loss_function type:",type(loss_function),"\n",loss_function)

    optimizer= optim.Adam(model.parameters(), lr=1e-4)
    print("optimizer type:",type(optimizer),"\n",optimizer)

    
    train_size = X_train.size(0)
    print("train_size:",train_size)

    best_acc_score = 0
    for epoch in range(n_epocs):
        '''
        # 1. Shuffle the training indices; the first time_steps entries are not used.
        '''
        perm_idx = np.random.permutation(np.arange(time_steps, train_size))
        # print("perm_idx.shape:",perm_idx.shape,"perm_idx type:",type(perm_idx),"\n",perm_idx)
        '''
        # 2. Take the indices for each batch from perm_idx
        '''
        for t_i in range(0, len(perm_idx), batch_size):
            batch_idx = perm_idx[t_i:(t_i + batch_size)]
            '''
            # 3. Prepare the time-series input for the LSTM
            '''
            feats = prep_feature_data(batch_idx, time_steps, X_train, feature_num, cuda_device)
            y_target = y_train[batch_idx]
            '''
            # 4. Run one PyTorch LSTM training step
            '''
            model.zero_grad()
            train_scores = model(feats) # batch size x time steps x feature_num
            loss = loss_function(train_scores, y_target.view(-1, 1))
            loss.backward()
            optimizer.step()
    
        '''
        # 5. Evaluate on the validation data
        '''
        with torch.no_grad():
            feats_val = prep_feature_data(np.arange(time_steps, X_val.size(0)), time_steps, X_val, feature_num, cuda_device)
            val_scores = model(feats_val)
            tmp_scores = val_scores.view(-1).to('cpu').numpy()
            bi_scores = np.round(tmp_scores)
            acc_score = accuracy_score(y_val[time_steps:], bi_scores)
            roc_score = roc_auc_score(y_val[time_steps:], tmp_scores)
            print('EPOCH:',str(epoch),'loss:',loss.item(),'Val ACC Score:',acc_score,'ROC AUC Score:',roc_score)
    
        '''
        # 6. Save the model when the validation score improves
        '''
        if acc_score > best_acc_score:
            best_acc_score = acc_score
            torch.save(model.state_dict(),'{}.mdl'.format(ownprefix))
            print('best score updated, Pytorch model was saved!!')
    
    '''
    # 7. Predict with the best model.
    '''
    model.load_state_dict(torch.load('{}.mdl'.format(ownprefix)))
    with torch.no_grad():
        feats_test = prep_feature_data(np.arange(time_steps, X_test.size(0)), time_steps, X_test, feature_num, cuda_device)
        val_scores = model(feats_test)
        tmp_scores = val_scores.view(-1).to('cpu').numpy()
        bi_scores = np.round(tmp_scores)
        acc_score = accuracy_score(y_test[time_steps:], bi_scores)
        roc_score = roc_auc_score(y_test[time_steps:], tmp_scores)
        print('Test ACC Score:',acc_score,'ROC AUC Score:',roc_score)

    '''
    # 8. Backtest with a simple strategy
    '''
    # resample() frequency codes: H (hours), T (minutes), S (seconds), B (business days, Mon-Fri), W (weeks)
    df_ohlcv = df['price'].resample('10T', label='left', closed='left').ohlc().assign(
                                    volume=df['foreignNotional'].resample('10T').sum().values)
    df_ohlcv.rename(columns={'timestamp':'Datetime','open':'Open','high':'High',
                    'low':'Low','close':'Close','volume':'Volume'}, inplace=True)
    print("df_ohlcv.shape:",df_ohlcv.shape,"df_ohlcv type:",type(df_ohlcv),"\n",df_ohlcv)

    bt = Backtest(df_ohlcv[8000:], myCustomStrategy, cash=100000, commission=.00004)
    print(bt.run())
    bt.plot(filename='{}'.format(ownprefix), open_browser=False)
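
The sizes that appear in the run log (12240 candles, 11596 samples, split points 7956 and 8935) follow from the constants and slicing in the listing above. The following is a minimal sketch of that bookkeeping in plain Python. It is not part of the original script: `sample_counts` and `make_labels` are hypothetical helper names introduced for illustration, and the toy close prices are made up.

```python
def sample_counts(n_candles, future_num, moving_average_num,
                  train_frac=0.65, val_frac=0.08):
    """Reproduce the sample bookkeeping of steps 2 to 4 with plain integers."""
    # Usable rows: drop the first moving_average_num candles (no rolling mean yet)
    # and the last future_num candles (no future close to compare against).
    n_samples = n_candles - moving_average_num - future_num
    # As in the listing, the split indices come from the candle count,
    # not from the number of usable samples.
    val_idx_from = round(n_candles * train_frac)
    test_idx_from = val_idx_from + round(n_candles * val_frac)
    sizes = {
        "train": val_idx_from,
        "val": test_idx_from - val_idx_from,
        "test": n_samples - test_idx_from,
    }
    fractions = {name: size / n_samples for name, size in sizes.items()}
    return n_samples, sizes, fractions


def make_labels(closes, future_num):
    """Return 1.0 where the close future_num candles ahead is higher, else 0.0."""
    return [1.0 if closes[i + future_num] > closes[i] else 0.0
            for i in range(len(closes) - future_num)]


if __name__ == "__main__":
    n_samples, sizes, fractions = sample_counts(12240, 144, 500)
    print(n_samples, sizes)
    print({name: round(frac, 3) for name, frac in fractions.items()})
    # Toy closes (made up) to show how one label is built per candle.
    print(make_labels([10, 11, 9, 12, 8], future_num=2))
```

With the values used here, the training share works out to roughly 0.69 of the usable samples rather than 0.65, because the split indices are computed from the candle count before the warm-up and look-ahead rows are dropped.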

Execution log 3 (log from running source code 2 on the 2020 BTC data)

$ python3 btc_prediction_and_backtest_by_pytorch.py
start time:  23:52:58
pandas==1.1.4
numpy==1.19.2
torch==1.5.0
matplotlib==3.3.3
cuda_device: cuda
devicde_name: GeForce RTX 2070
get_from_bitmex:
files: ['./data/20200930.csv.gz', './data/20201001.csv.gz', './data/20201002.csv.gz', './data/20201003.csv.gz', './data/20201004.csv.gz', './data/20201005.csv.gz', './data/20201006.csv.gz', './data/20201007.csv.gz', './data/20201008.csv.gz', './data/20201009.csv.gz', './data/20201010.csv.gz', './data/20201011.csv.gz', './data/20201012.csv.gz', './data/20201013.csv.gz', './data/20201014.csv.gz', './data/20201015.csv.gz', './data/20201016.csv.gz', './data/20201017.csv.gz', './data/20201018.csv.gz', './data/20201019.csv.gz', './data/20201020.csv.gz', './data/20201021.csv.gz', './data/20201022.csv.gz', './data/20201023.csv.gz', './data/20201024.csv.gz', './data/20201025.csv.gz', './data/20201026.csv.gz', './data/20201027.csv.gz', './data/20201028.csv.gz', './data/20201029.csv.gz', './data/20201030.csv.gz', './data/20201031.csv.gz', './data/20201101.csv.gz', './data/20201102.csv.gz', './data/20201103.csv.gz', './data/20201104.csv.gz', './data/20201105.csv.gz', './data/20201106.csv.gz', './data/20201107.csv.gz', './data/20201108.csv.gz', './data/20201109.csv.gz', './data/20201110.csv.gz', './data/20201111.csv.gz', './data/20201112.csv.gz', './data/20201113.csv.gz', './data/20201114.csv.gz', './data/20201115.csv.gz', './data/20201116.csv.gz', './data/20201117.csv.gz', './data/20201118.csv.gz', './data/20201119.csv.gz', './data/20201120.csv.gz', './data/20201121.csv.gz', './data/20201122.csv.gz', './data/20201123.csv.gz', './data/20201124.csv.gz', './data/20201125.csv.gz', './data/20201126.csv.gz', './data/20201127.csv.gz', './data/20201128.csv.gz', './data/20201129.csv.gz', './data/20201130.csv.gz', './data/20201201.csv.gz', './data/20201202.csv.gz', './data/20201203.csv.gz', './data/20201204.csv.gz', './data/20201205.csv.gz', './data/20201206.csv.gz', './data/20201207.csv.gz', './data/20201208.csv.gz', './data/20201209.csv.gz', './data/20201210.csv.gz', './data/20201211.csv.gz', './data/20201212.csv.gz', './data/20201213.csv.gz', './data/20201214.csv.gz', 
'./data/20201215.csv.gz', './data/20201216.csv.gz', './data/20201217.csv.gz', './data/20201218.csv.gz', './data/20201219.csv.gz', './data/20201220.csv.gz', './data/20201221.csv.gz', './data/20201222.csv.gz', './data/20201223.csv.gz']
df.shape: (25878209, 9)
df.tail:                             symbol  side   size    price tickDirection  \
timestamp
2020-09-30 00:00:02.771822  XBTUSD   Buy  12055  10839.5  ZeroPlusTick
2020-09-30 00:00:02.885748  XBTUSD  Sell   4500  10839.0     MinusTick
2020-09-30 00:00:02.989378  XBTUSD   Buy   3499  10839.5      PlusTick
2020-09-30 00:00:02.992595  XBTUSD   Buy     87  10839.5  ZeroPlusTick
2020-09-30 00:00:02.998145  XBTUSD   Buy   2383  10839.5  ZeroPlusTick
...                            ...   ...    ...      ...           ...
2020-12-23 23:59:58.571018  XBTUSD  Sell    420  23245.0     MinusTick
2020-12-23 23:59:58.580506  XBTUSD   Buy     13  23244.5     MinusTick
2020-12-23 23:59:58.593966  XBTUSD   Buy     10  23243.5     MinusTick
2020-12-23 23:59:58.597077  XBTUSD  Sell    447  23243.0     MinusTick
2020-12-23 23:59:58.646200  XBTUSD   Buy     11  23241.0     MinusTick

                                                      trdMatchID  grossValue  \
timestamp
2020-09-30 00:00:02.771822  ddc9e2a6-40b5-b5bf-715b-60cf18ab847a   111219430
2020-09-30 00:00:02.885748  938ba483-0bd9-b498-c5fb-162c2cc72acb    41517000
2020-09-30 00:00:02.989378  be1bab97-78a9-b0db-1ecd-be70eb7bdb99    32281774
2020-09-30 00:00:02.992595  29f68ab4-cc4f-291d-5cfb-a96baacac448      802662
2020-09-30 00:00:02.998145  be2e5e02-b1c8-88da-262a-1d23ffc62b32    21985558
...                                                          ...         ...
2020-12-23 23:59:58.571018  a12bbffc-d9d7-083c-cbd5-dbadb88cb0a0     1806840
2020-12-23 23:59:58.580506  f1e97f20-6739-4145-715a-fe0914393f20       55926
2020-12-23 23:59:58.593966  b57033c9-4dd2-9c96-04e5-7bf702e36806       43020
2020-12-23 23:59:58.597077  35a0a31a-4a9d-34e9-208f-5af533a576c5     1922994
2020-12-23 23:59:58.646200  b3cb9ba2-cebc-cdf5-528c-ce7c7b79269b       47333

                            homeNotional  foreignNotional
timestamp
2020-09-30 00:00:02.771822      1.112194          12055.0
2020-09-30 00:00:02.885748      0.415170           4500.0
2020-09-30 00:00:02.989378      0.322818           3499.0
2020-09-30 00:00:02.992595      0.008027             87.0
2020-09-30 00:00:02.998145      0.219856           2383.0
...                                  ...              ...
2020-12-23 23:59:58.571018      0.018068            420.0
2020-12-23 23:59:58.580506      0.000559             13.0
2020-12-23 23:59:58.593966      0.000430             10.0
2020-12-23 23:59:58.597077      0.019230            447.0
2020-12-23 23:59:58.646200      0.000473             11.0

[25878204 rows x 9 columns]
df_ohlcv.shape: (12240, 5) df_ohlcv type: <class 'pandas.core.frame.DataFrame'>
                         Open     High      Low    Close      Volume
timestamp
2020-09-30 00:00:00  10839.5  10842.0  10827.5  10828.0  13500945.0
2020-09-30 00:10:00  10828.5  10829.0  10822.0  10829.0   4477779.0
2020-09-30 00:20:00  10829.0  10829.0  10816.5  10819.5   3589041.0
2020-09-30 00:30:00  10819.5  10820.0  10814.5  10814.5   4523661.0
2020-09-30 00:40:00  10815.0  10820.0  10812.0  10820.0   3463389.0
...                      ...      ...      ...      ...         ...
2020-12-23 23:10:00  23307.0  23400.0  23264.0  23315.5  14089639.0
2020-12-23 23:20:00  23315.5  23485.5  23315.0  23382.0  40956253.0
2020-12-23 23:30:00  23382.0  23420.0  23333.0  23365.0  15243013.0
2020-12-23 23:40:00  23365.5  23376.0  23264.0  23281.0  13256178.0
2020-12-23 23:50:00  23281.5  23288.0  23190.0  23241.0  16298938.0

[12240 rows x 5 columns]
future_price: [ 10817.50  10799.50  10798.50 ...  23365.00  23281.00  23241.00]
curr_price: [ 10828.00  10829.00  10819.50 ...  23736.00  23757.50  23835.00]
y_data_tmp: [-10.50 -29.50 -21.00 ... -371.00 -476.50 -594.00]
y_data.shape: (11596,) y_data type: <class 'numpy.ndarray'>
 [ 1.00  1.00  1.00 ...  0.00  0.00  0.00]
df_ohlcv.shape: (12240, 10) df_ohlcv type: <class 'pandas.core.frame.DataFrame'>
df_ohlcv.tail:                          Open      High       Low     Close    Volume  \
timestamp
2020-09-30 00:50:00       NaN       NaN       NaN       NaN       NaN
2020-09-30 01:00:00       NaN       NaN       NaN       NaN       NaN
2020-09-30 01:10:00       NaN       NaN       NaN       NaN       NaN
2020-09-30 01:20:00       NaN       NaN       NaN       NaN       NaN
2020-09-30 01:30:00       NaN       NaN       NaN       NaN       NaN
...                       ...       ...       ...       ...       ...
2020-12-23 23:10:00 -0.002501 -0.000952 -0.001648 -0.002115 -0.256627
2020-12-23 23:20:00 -0.002115  0.002709  0.000556  0.000742  1.154715
2020-12-23 23:30:00  0.000742 -0.000075  0.001342  0.000026 -0.198753
2020-12-23 23:40:00  0.000048 -0.001942 -0.001603 -0.003553 -0.303641
2020-12-23 23:50:00 -0.003531 -0.005679 -0.004760 -0.005249 -0.144335

                     Roll_Open  Roll_High   Roll_Low  Roll_Close   Roll_Volume
timestamp
2020-09-30 00:50:00        NaN        NaN        NaN         NaN           NaN
2020-09-30 01:00:00        NaN        NaN        NaN         NaN           NaN
2020-09-30 01:10:00        NaN        NaN        NaN         NaN           NaN
2020-09-30 01:20:00        NaN        NaN        NaN         NaN           NaN
2020-09-30 01:30:00        NaN        NaN        NaN         NaN           NaN
...                        ...        ...        ...         ...           ...
2020-12-23 23:10:00  23365.435  23422.301  23302.411   23364.911  1.895367e+07
2020-12-23 23:20:00  23364.912  23422.039  23302.050   23364.672  1.900774e+07
2020-12-23 23:30:00  23364.673  23421.745  23301.725   23364.382  1.902411e+07
2020-12-23 23:40:00  23364.384  23421.476  23301.362   23364.003  1.903641e+07
2020-12-23 23:50:00  23364.007  23421.018  23300.902   23363.645  1.904826e+07

[12235 rows x 10 columns]
X_data.shape: (11596, 5) X_data type: <class 'numpy.ndarray'>
 [[-0.01 -0.01 -0.01 -0.01  0.11]
 [-0.01 -0.01 -0.01 -0.01 -0.54]
 [-0.01 -0.01 -0.01 -0.01 -0.69]
 ...
 [ 0.01  0.02  0.02  0.02  0.02]
 [ 0.02  0.02  0.02  0.02 -0.51]
 [ 0.02  0.02  0.02  0.02  0.18]]
val_idx_from: 7956 test_idx_from: 8935
X_train.shape: torch.Size([7956, 5]) X_train type: <class 'torch.Tensor'>
 tensor([[-0.0103, -0.0091, -0.0107, -0.0114,  0.1141],
        [-0.0113, -0.0116, -0.0109, -0.0117, -0.5357],
        [-0.0116, -0.0115, -0.0108, -0.0110, -0.6896],
        ...,
        [-0.0780, -0.0754, -0.0748, -0.0734, -0.2527],
        [-0.0734, -0.0760, -0.0737, -0.0763, -0.6506],
        [-0.0763, -0.0767, -0.0739, -0.0763, -0.3977]], device='cuda:0')
y_train.shape: torch.Size([7956]) y_train type: <class 'torch.Tensor'>
 tensor([1., 1., 1.,  ..., 1., 1., 1.], device='cuda:0')
X_val.shape: torch.Size([979, 5]) X_val type: <class 'torch.Tensor'>
 tensor([[-0.0763, -0.0761, -0.0738, -0.0761, -0.5517],
        [-0.0760, -0.0784, -0.0737, -0.0767, -0.7224],
        [-0.0767, -0.0780, -0.0767, -0.0791, -0.4797],
        ...,
        [-0.0186, -0.0175, -0.0194, -0.0182,  3.2428],
        [-0.0183, -0.0203, -0.0228, -0.0219,  2.7683],
        [-0.0219, -0.0187, -0.0194, -0.0167,  1.5095]], device='cuda:0')
y_val.shape: (979,) y_val type: <class 'numpy.ndarray'>
 [ 1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  0.00  1.00  1.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  1.00  0.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  1.00  1.00
  0.00  0.00  0.00  0.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  0.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.00
  0.00  0.00  0.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  0.00  0.00
  0.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  1.00  0.00  0.00  1.00  1.00  1.00  1.00  1.00
  1.00  1.00  1.00  1.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
  0.00  0.00  0.00  0.00  0.00  1.00  1.00  1.00  1.00  1.00  0.00  0.00
  1.00  0.00  1.00  1.00  1.00  1.00  1.00]
X_test.shape: torch.Size([2661, 5]) X_test type: <class 'torch.Tensor'>
 tensor([[-0.0168, -0.0148, -0.0143, -0.0127,  0.9034],
        [-0.0128, -0.0124, -0.0111, -0.0121,  0.2539],
        [-0.0121, -0.0102, -0.0112, -0.0080,  0.7918],
        ...,
        [ 0.0135,  0.0165,  0.0155,  0.0151,  0.0248],
        [ 0.0151,  0.0152,  0.0175,  0.0159, -0.5096],
        [ 0.0159,  0.0181,  0.0175,  0.0192,  0.1800]], device='cuda:0')
y_test.shape: (2661,) y_test type: <class 'numpy.ndarray'>
 [ 1.00  1.00  1.00 ...  0.00  0.00  0.00]
model type: <class '__main__.LSTMClassifier'>
 LSTMClassifier(
  (lstm): LSTM(5, 16, batch_first=True)
  (dense): Linear(in_features=16, out_features=1, bias=True)
)
loss_function type: <class 'torch.nn.modules.loss.BCELoss'>
 BCELoss()
optimizer type: <class 'torch.optim.adam.Adam'>
 Adam (
Parameter Group 0
    amsgrad: False
    betas: (0.9, 0.999)
    eps: 1e-08
    lr: 0.0001
    weight_decay: 0
)
train_size: 7956
EPOCH: 0 loss: 0.691195547580719 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5107399425287357
best score updated, Pytorch model was saved!!
EPOCH: 1 loss: 0.6914775967597961 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5501719006568144
EPOCH: 2 loss: 0.6868444085121155 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5867456896551724
EPOCH: 3 loss: 0.6009563207626343 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.6160483374384237
EPOCH: 4 loss: 0.6663740277290344 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.6489326765188834
EPOCH: 5 loss: 0.6851885318756104 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.682473830049261
EPOCH: 6 loss: 0.6095741391181946 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.703132697044335
EPOCH: 7 loss: 0.7387176156044006 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.7015419745484401
EPOCH: 8 loss: 0.690645694732666 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.6566810344827586
EPOCH: 9 loss: 0.6829000115394592 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.622172619047619
EPOCH: 10 loss: 0.738982617855072 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5912587233169129
EPOCH: 11 loss: 0.6995638608932495 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5838156814449917
EPOCH: 12 loss: 0.7947514057159424 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5787972085385878
EPOCH: 13 loss: 0.6817842125892639 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5798465722495894
EPOCH: 14 loss: 0.6924517154693604 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5760827175697866
EPOCH: 15 loss: 0.5952319502830505 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5766369047619048
EPOCH: 16 loss: 0.6980494260787964 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5808010057471265
EPOCH: 17 loss: 0.6621570587158203 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5822839696223316
EPOCH: 18 loss: 0.6212792992591858 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.586889367816092
EPOCH: 19 loss: 0.6528978943824768 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5857194170771757
EPOCH: 20 loss: 0.7154173254966736 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5881773399014778
EPOCH: 21 loss: 0.7460910677909851 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.5952534893267653
EPOCH: 22 loss: 0.6252413988113403 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.6035714285714285
EPOCH: 23 loss: 0.6823109984397888 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.6014983579638752
EPOCH: 24 loss: 0.6286243200302124 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.6065219622331691
EPOCH: 25 loss: 0.6132022142410278 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.6133697660098522
EPOCH: 26 loss: 0.5485300421714783 Val ACC Score: 0.6555435952637244 ROC AUC Score: 0.6092980295566501
EPOCH: 27 loss: 0.5204343795776367 Val ACC Score: 0.6609257265877287 ROC AUC Score: 0.6157327586206897
best score updated, Pytorch model was saved!!
EPOCH: 28 loss: 0.6037634611129761 Val ACC Score: 0.6749192680301399 ROC AUC Score: 0.6112941297208538
best score updated, Pytorch model was saved!!
EPOCH: 29 loss: 0.6126266717910767 Val ACC Score: 0.7158234660925726 ROC AUC Score: 0.6105398193760263
best score updated, Pytorch model was saved!!
Test ACC Score: 0.5947912677135198 ROC AUC Score: 0.45556905274916193
df_ohlcv.shape: (12240, 5) df_ohlcv type: <class 'pandas.core.frame.DataFrame'>
                         Open     High      Low    Close      Volume
timestamp
2020-09-30 00:00:00  10839.5  10842.0  10827.5  10828.0  13500945.0
2020-09-30 00:10:00  10828.5  10829.0  10822.0  10829.0   4477779.0
2020-09-30 00:20:00  10829.0  10829.0  10816.5  10819.5   3589041.0
2020-09-30 00:30:00  10819.5  10820.0  10814.5  10814.5   4523661.0
2020-09-30 00:40:00  10815.0  10820.0  10812.0  10820.0   3463389.0
...                      ...      ...      ...      ...         ...
2020-12-23 23:10:00  23307.0  23400.0  23264.0  23315.5  14089639.0
2020-12-23 23:20:00  23315.5  23485.5  23315.0  23382.0  40956253.0
2020-12-23 23:30:00  23382.0  23420.0  23333.0  23365.0  15243013.0
2020-12-23 23:40:00  23365.5  23376.0  23264.0  23281.0  13256178.0
2020-12-23 23:50:00  23281.5  23288.0  23190.0  23241.0  16298938.0

[12240 rows x 5 columns]
Start                     2020-11-24 13:20:00
End                       2020-12-23 23:50:00
Duration                     29 days 10:30:00
Exposure Time [%]                     10.2358
Equity Final [$]                       110980
Equity Peak [$]                        111948
Return [%]                            10.9804
Buy & Hold Return [%]                 20.5915
Return (Ann.) [%]                     255.219
Volatility (Ann.) [%]                 52.4708
Sharpe Ratio                          4.86403
Sortino Ratio                         30.6017
Calmar Ratio                          127.727
Max. Drawdown [%]                    -1.99816
Avg. Drawdown [%]                   -0.526232
Max. Drawdown Duration        7 days 22:30:00
Avg. Drawdown Duration        0 days 16:50:00
# Trades                                   26
Win Rate [%]                          73.0769
Best Trade [%]                        1.00127
Worst Trade [%]                      -1.00668
Avg. Trade [%]                       0.453453
Max. Trade Duration           0 days 10:40:00
Avg. Trade Duration           0 days 02:37:00
Profit Factor                         2.69042
Expectancy [%]                       0.457399
SQN                                   2.53193
_strategy                    myCustomStrategy
_equity_curve                             ...
_trades                       Size  EntryB...
dtype: object
$
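
In the log above, Val ACC Score stays at exactly 0.6555435952637244 from epoch 0 through epoch 26 while the loss and ROC AUC keep moving. With 979 validation rows and time_steps = 50, 929 labels are scored, and 609 / 929 gives that same value. This is consistent with the rounded predictions being a single class for every sample, although the log does not print the predictions, so this remains an inference. A minimal sketch of the majority-class baseline that such a classifier can be compared against follows; the function name and the toy labels are assumptions for illustration.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent label."""
    if not labels:
        raise ValueError("labels must not be empty")
    _, count = Counter(labels).most_common(1)[0]
    return count / len(labels)


if __name__ == "__main__":
    # Toy labels only: 609 ones and 320 zeros are chosen so that the ratio
    # matches the constant validation accuracy seen in the log.
    toy_labels = [1.0] * 609 + [0.0] * 320
    print(majority_baseline_accuracy(toy_labels))
```

When a model's accuracy equals this baseline, accuracy alone says little about whether anything has been learned, so a threshold-free metric such as the ROC AUC printed alongside it is the more informative number.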

References
