
Identifying Aircraft with SSD (Object Detection)

Last updated at 2021-10-04 / Posted at 2021-10-04

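Introduction

I studied Python for about six months at Aidemy.
The coursework focused mainly on app development.
For my final project, I decided to build an app that uses object detection
rather than image classification.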

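Why I Chose Object Detection

Image classification learns what appears in an image as a whole, whereas
object detection learns what appears and where it appears.
I referred to the blog post below.
https://qiita.com/DeepTama/items/aab46729d2aa51a8954d
The post is quite advanced, but it left me with the impression that ordinary
image classification would not capture the distinguishing features of aircraft
well enough to reach good accuracy. While looking for object detection methods,
I found SSD (Single Shot MultiBox Detector).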

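Environment

I had both a Mac and a Windows machine, so I used both.
App development was done mainly on the Mac, and training on the dataset was done on Windows.
I also set up an environment on GCP, including Jupyter Notebook.
The two machines do not share files natively, so I used Google Drive to share files between Windows and the Mac.
Each environment had to be set up separately, which took considerable effort; I describe this below.

Versions:
Python 3.7.11
tensorflow 2.1.0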

About the App

The app identifies three types of aircraft:
Airbus (airbus), Boeing (boeing), and military aircraft (military).
Airbus and Boeing are passenger aircraft.

Collecting Images

First, I collected images from the web with a script, following the blog post below.
https://qiita.com/ichii731/items/d55c53a49fb3b63670e9

googleimagesdownload --keywords "apple" --limit 20

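The --keywords option takes the search term, and --limit is the number of images to download.
I also used the dataset below as training data.
https://www.kaggle.com/seryouxblaster764/fgvc-aircraft
I collected 400 Boeing, 600 Airbus, and 200 military images,
about 1,200 in total, for training.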

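Using an Annotation Tool

Annotation means tagging the prepared images with bounding boxes and
labeling each aircraft as Boeing, Airbus, or military.
Tagging lets the model focus on the region inside the box when recognizing the aircraft.
I chose this approach because it can give higher accuracy than plain image classification.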

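Setting Up the Annotation Tool

To tag the collected images with bounding boxes in VoTT,
I referred to the blog post below.
https://sleepless-se.net/2019/06/21/how-to-use-vott/
I chose VoTT simply because it installed smoothly (lol).
After installation, however, the project setup did not go well and
errors kept occurring.
Looking into the cause, iCloud on the Mac appeared to be responsible.
While I was configuring files, locally created files could not be
found, and files I had already deleted were shown as still
present.
I therefore turned off iCloud sync.
After that, VoTT started and could be configured without problems.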

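Using VoTT (BBoxUtility)

I tagged each of the roughly 1,200 images collected earlier with bounding boxes, one at a time.
This took about 6 hours in total.
Because passenger aircraft look so alike, checking the tags also took time (lol).
The SSD model implemented here assumes an input image size of 300×300.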

Aircraft Detection with Keras SSD

I referred to the blog post below for using SSD.
https://qiita.com/chicken_data_analyst/items/f05c345e6a090bd3ee7f

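However, the blog post I referred to uses an old TensorFlow version, so I decided to
move from TensorFlow 1.x to 2.x.
My reasons were that 2.x is now the mainstream version and that Keras was integrated into TensorFlow in 2.x, so I naively assumed that switching would be straightforward (lol).
Even so, most of the object detection repositories I found were written for 1.x.
I eventually found the blog post below, which describes the required changes.
https://techblog.cccmk.co.jp/entry/2020/05/11/094834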

# Convert the XML files to pickle


import sys
sys.path.append('ssd_keras')

# Import the required packages
import cv2
from tensorflow.keras.applications.imagenet_utils import preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing import image
import matplotlib.pyplot as plt
import numpy as np
import pickle
from random import shuffle
# from scipy.misc import imread
# from scipy.misc import imresize
from imageio import imread
from PIL import Image
import tensorflow as tf
from ssd import SSD300
from ssd_training import MultiboxLoss
from ssd_utils import BBoxUtility
from PIL import Image
from lxml import etree
from xml.etree import ElementTree
import math
import os
import glob
# from google.colab.patches import cv2_imshow
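
The conversion step named in the heading is not shown in the article. As a minimal sketch, assuming the annotations were exported from VoTT in Pascal VOC XML format and that the class order is boeing, airbus, military (mirroring the label list used later), each XML file could be turned into rows of normalized box coordinates followed by a one-hot class vector. The function name voc_xml_to_rows and the sample annotation are hypothetical, and plain lists are used instead of NumPy arrays to keep the sketch dependency-free.

```python
import pickle
import xml.etree.ElementTree as ET

# Assumed class order; it mirrors the label list used later in the article.
CLASS_NAMES = ["boeing", "airbus", "military"]


def voc_xml_to_rows(xml_text):
    """Return (filename, rows); each row is
    [xmin, ymin, xmax, ymax, one-hot class...] with coordinates scaled to 0-1."""
    root = ET.fromstring(xml_text.strip())
    width = float(root.find("size/width").text)
    height = float(root.find("size/height").text)
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        coords = [
            float(box.find("xmin").text) / width,
            float(box.find("ymin").text) / height,
            float(box.find("xmax").text) / width,
            float(box.find("ymax").text) / height,
        ]
        one_hot = [1.0 if obj.find("name").text == name else 0.0
                   for name in CLASS_NAMES]
        rows.append(coords + one_hot)
    return root.find("filename").text, rows


# Hypothetical annotation used only to exercise the function.
SAMPLE_XML = """
<annotation>
  <filename>sample.jpg</filename>
  <size><width>600</width><height>400</height><depth>3</depth></size>
  <object>
    <name>airbus</name>
    <bndbox><xmin>60</xmin><ymin>100</ymin><xmax>540</xmax><ymax>300</ymax></bndbox>
  </object>
</annotation>
"""

filename, rows = voc_xml_to_rows(SAMPLE_XML)
gt_sketch = {filename: rows}
# Round-trip through pickle in memory; a real run would write to a .pkl file.
restored = pickle.loads(pickle.dumps(gt_sketch))
print(restored)
```

In the actual pipeline the rows for each image would presumably be stored as a NumPy array, since the Generator class calls array methods on them.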

Importing Libraries and Downloading the Pretrained Model

Importing the Required Modules



import cv2
import tensorflow
import tensorflow.keras
from tensorflow.keras.applications.imagenet_utils import preprocess_input
# from tensorflow.keras.backend.tensorflow_backend import set_session
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing import image
import matplotlib.pyplot as plt
import numpy as np
import pickle
from random import shuffle
# from scipy.misc import imread
# from scipy.misc import imresize
from imageio import imread
from PIL import Image
import tensorflow as tf

from ssd import SSD300
from ssd_training import MultiboxLoss
from ssd_utils import BBoxUtility

import os
os.environ['KMP_DUPLICATE_LIB_OK']='True'

plt.rcParams['figure.figsize'] = (8, 8)
plt.rcParams['image.interpolation'] = 'nearest'

np.set_printoptions(suppress=True)

# config = tf.ConfigProto()
# config.gpu_options.per_process_gpu_memory_fraction = 0.9
# set_session(tf.Session(config=config))

Defining the Classes


# some constants
# Three aircraft types plus background, so there are 4 classes
NUM_CLASSES = 4
# Input images are resized to 300x300
input_shape = (300, 300, 3)

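Loading the Ground Truth Data

Load the ground truth data that was tagged with bounding boxes in VoTT.
Extract the unique keys of the images tied to the ground truth and
split them into train and val.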


priors = pickle.load(open('prior_boxes_ssd300.pkl', 'rb'))
bbox_util = BBoxUtility(NUM_CLASSES, priors)


gt = pickle.load(open('aircraft_cl4.pkl', 'rb'))
keys = sorted(gt.keys())
num_train = int(round(0.8 * len(keys)))
train_keys = keys[:num_train]
val_keys = keys[num_train:]
num_val = len(val_keys)
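
The split above takes the first 80% of the sorted keys. If the file names happen to be grouped by class, the validation set could end up dominated by one class. A minimal sketch, assuming that shuffling with a fixed seed before splitting is acceptable; split_keys and the file names are hypothetical.

```python
import random


def split_keys(keys, train_ratio=0.8, seed=0):
    """Shuffle a copy of the keys reproducibly, then split train/validation."""
    shuffled = sorted(keys)
    random.Random(seed).shuffle(shuffled)
    num_train = int(round(train_ratio * len(shuffled)))
    return shuffled[:num_train], shuffled[num_train:]


# Hypothetical file names standing in for gt.keys().
example_keys = ["img_{:04d}.jpg".format(i) for i in range(1200)]
train_keys_sketch, val_keys_sketch = split_keys(example_keys)
print(len(train_keys_sketch), len(val_keys_sketch))
```

With about 1,200 images this gives 960 training keys and 240 validation keys, the same sizes as the unshuffled split.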

Checking the Ground Truth Data



import pandas as pd

list(gt.values())

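Defining the Generator

This is a Keras-style generator that supplies the data for fit. Because of memory limits and the image processing involved, fit_generator is used.
The generator controls how the training data is fed to the model.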
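
The core idea, stripped of the image processing, can be sketched as a plain Python generator that loops forever and yields one batch at a time, so that only a single batch needs to be in memory. The name batch_generator and its seed parameter are hypothetical; the sketch batches keys rather than images.

```python
import itertools
import random


def batch_generator(keys, batch_size, seed=0):
    """Yield lists of `batch_size` keys forever, reshuffling on every pass."""
    rng = random.Random(seed)
    keys = list(keys)
    while True:
        rng.shuffle(keys)
        batch = []  # a leftover partial batch from the last pass is dropped
        for key in keys:
            batch.append(key)
            if len(batch) == batch_size:
                yield batch
                batch = []


# Take three batches of four from ten keys: two from the first pass,
# one from the second pass.
first_three = list(itertools.islice(batch_generator(range(10), 4), 3))
print(first_three)
```

Because the generator never terminates, the caller has to say how many batches make up one epoch, which is what the steps argument of fit_generator is for.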



class Generator(object):
   def __init__(self, gt, bbox_util,
                batch_size, path_prefix,
                train_keys, val_keys, image_size,
                saturation_var=0.5,
                brightness_var=0.5,
                contrast_var=0.5,
                lighting_std=0.5,
                hflip_prob=0.5,
                vflip_prob=0.5,
                do_crop=True,
                crop_area_range=[0.75, 1.0],
                aspect_ratio_range=[3./4., 4./3.]):
       self.gt = gt
       self.bbox_util = bbox_util
       self.batch_size = batch_size
       self.path_prefix = path_prefix
       self.train_keys = train_keys
       self.val_keys = val_keys
       self.train_batches = len(train_keys)
       self.val_batches = len(val_keys)
       self.image_size = image_size
       self.color_jitter = []
       if saturation_var:
           self.saturation_var = saturation_var
           self.color_jitter.append(self.saturation)
       if brightness_var:
           self.brightness_var = brightness_var
           self.color_jitter.append(self.brightness)
       if contrast_var:
           self.contrast_var = contrast_var
           self.color_jitter.append(self.contrast)
       self.lighting_std = lighting_std
       self.hflip_prob = hflip_prob
       self.vflip_prob = vflip_prob
       self.do_crop = do_crop
       self.crop_area_range = crop_area_range
       self.aspect_ratio_range = aspect_ratio_range
 
# Image preprocessing methods
   # Convert to grayscale, as the name says
   def grayscale(self, rgb):
       return rgb.dot([0.299, 0.587, 0.114])
   # Adjust saturation (vividness)
   def saturation(self, rgb):
       gs = self.grayscale(rgb)
       alpha = 2 * np.random.random() * self.saturation_var 
       alpha += 1 - self.saturation_var
       rgb = rgb * alpha + (1 - alpha) * gs[:, :, None]
       return np.clip(rgb, 0, 255)
   # Adjust brightness
   def brightness(self, rgb):
       alpha = 2 * np.random.random() * self.brightness_var 
       alpha += 1 - self.brightness_var
       rgb = rgb * alpha
       return np.clip(rgb, 0, 255)
   # Adjust contrast
   def contrast(self, rgb):
       gs = self.grayscale(rgb).mean() * np.ones_like(rgb)
       alpha = 2 * np.random.random() * self.contrast_var 
       alpha += 1 - self.contrast_var
       rgb = rgb * alpha + (1 - alpha) * gs
       return np.clip(rgb, 0, 255)

   def lighting(self, img):
       cov = np.cov(img.reshape(-1, 3) / 255.0, rowvar=False)
       eigval, eigvec = np.linalg.eigh(cov)
       noise = np.random.randn(3) * self.lighting_std
       noise = eigvec.dot(eigval * noise) * 255
       img += noise
       return np.clip(img, 0, 255)
   # Flip the image horizontally
   def horizontal_flip(self, img, y):
       if np.random.random() < self.hflip_prob:
           img = img[:, ::-1]
           y[:, [0, 2]] = 1 - y[:, [2, 0]]
       return img, y
   # Flip the image vertically (upside down)
   def vertical_flip(self, img, y):
       if np.random.random() < self.vflip_prob:
           img = img[::-1]
           y[:, [1, 3]] = 1 - y[:, [3, 1]]
       return img, y
   # Crop the image to a random size and aspect ratio
   def random_sized_crop(self, img, targets):
       img_w = img.shape[1]
       img_h = img.shape[0]
       img_area = img_w * img_h
       random_scale = np.random.random()
       random_scale *= (self.crop_area_range[1] -
                        self.crop_area_range[0])
       random_scale += self.crop_area_range[0]
       target_area = random_scale * img_area
       random_ratio = np.random.random()
       random_ratio *= (self.aspect_ratio_range[1] -
                        self.aspect_ratio_range[0])
       random_ratio += self.aspect_ratio_range[0]
       w = np.round(np.sqrt(target_area * random_ratio))     
       h = np.round(np.sqrt(target_area / random_ratio))
       if np.random.random() < 0.5:
           w, h = h, w
       w = min(w, img_w)
       w_rel = w / img_w
       w = int(w)
       h = min(h, img_h)
       h_rel = h / img_h
       h = int(h)
       x = np.random.random() * (img_w - w)
       x_rel = x / img_w
       x = int(x)
       y = np.random.random() * (img_h - h)
       y_rel = y / img_h
       y = int(y)
       img = img[y:y+h, x:x+w]
       new_targets = []
       for box in targets:
           cx = 0.5 * (box[0] + box[2])
           cy = 0.5 * (box[1] + box[3])
           if (x_rel < cx < x_rel + w_rel and
               y_rel < cy < y_rel + h_rel):
               xmin = (box[0] - x_rel) / w_rel
               ymin = (box[1] - y_rel) / h_rel
               xmax = (box[2] - x_rel) / w_rel
               ymax = (box[3] - y_rel) / h_rel
               xmin = max(0, xmin)
               ymin = max(0, ymin)
               xmax = min(1, xmax)
               ymax = min(1, ymax)
               box[:4] = [xmin, ymin, xmax, ymax]
               new_targets.append(box)
       new_targets = np.asarray(new_targets).reshape(-1, targets.shape[1])
       return img, new_targets

# Run generate using the methods above
   def generate(self, train=True):
       while True:
           if train:
               shuffle(self.train_keys)
               keys = self.train_keys
           else:
               shuffle(self.val_keys)
               keys = self.val_keys
           inputs = []
           targets = []
           for key in keys:            
               img_path = self.path_prefix + key
               img = imread(img_path).astype('float32')
               y = self.gt[key].copy()
               if train and self.do_crop:
                   img, y = self.random_sized_crop(img, y)
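               # imread already yields 0-255 values, so cast straight to uint8
               # (multiplying by 255 first would overflow and corrupt the image)
               img = np.array(Image.fromarray(img.astype(np.uint8)).resize(self.image_size)).astype('float32')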
               if train:
                   shuffle(self.color_jitter)
                   for jitter in self.color_jitter:
                       img = jitter(img)
                   if self.lighting_std:
                       img = self.lighting(img)
                   if self.hflip_prob > 0:
                       img, y = self.horizontal_flip(img, y)
                   if self.vflip_prob > 0:
                       img, y = self.vertical_flip(img, y)
               y = self.bbox_util.assign_boxes(y)
               inputs.append(img)                
               targets.append(y)
               if len(targets) == self.batch_size:
                   tmp_inp = np.array(inputs)
                   tmp_targets = np.array(targets)
                   inputs = []
                   targets = []
                   yield preprocess_input(tmp_inp), tmp_targets

path_prefix = 'JPEGImages/'
gen = Generator(gt, bbox_util, 16, path_prefix,
               train_keys, val_keys,
               (input_shape[0], input_shape[1]), do_crop=False)
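
In horizontal_flip, the boxes are updated with y[:, [0, 2]] = 1 - y[:, [2, 0]]. The same arithmetic on a single normalized box, as a small sketch with plain lists (hflip_box is a hypothetical helper, not part of the ssd_keras code):

```python
def hflip_box(box):
    """Mirror a normalized [xmin, ymin, xmax, ymax] box left to right."""
    xmin, ymin, xmax, ymax = box
    # The old right edge becomes the new left edge, and vice versa.
    return [1.0 - xmax, ymin, 1.0 - xmin, ymax]


box = [0.1, 0.2, 0.4, 0.6]
flipped = hflip_box(box)
print(flipped)
```

Swapping the two x columns while subtracting from 1 keeps xmin smaller than xmax after the flip; the vertical flip does the same with the y columns.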

Using SSD

Load the pretrained SSD300.


model = SSD300(input_shape, num_classes=NUM_CLASSES)
# Load the pretrained weights
model.load_weights('weights_SSD300.hdf5', by_name=True)

Freezing the Parameters of Pretrained Layers

# Layers that will not be retrained
freeze = ['input_1', 'conv1_1', 'conv1_2', 'pool1',
         'conv2_1', 'conv2_2', 'pool2',
         'conv3_1', 'conv3_2', 'conv3_3', 'pool3']#,
#           'conv4_1', 'conv4_2', 'conv4_3', 'pool4']
# Mark them as non-trainable
for L in model.layers:
   if L.name in freeze:
       L.trainable = False

def schedule(epoch, decay=0.9):
   return base_lr * decay**(epoch)

Compiling the Model

The learning rate changes every epoch.
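
The schedule function defined above multiplies the learning rate by 0.9 once per epoch. A standalone sketch that prints a few values, using the same base rate of 3e-4 (schedule_sketch is a hypothetical copy so the example runs without Keras):

```python
BASE_LR = 3e-4  # same value as base_lr in the article


def schedule_sketch(epoch, decay=0.9):
    """Exponential decay: the rate is multiplied by `decay` once per epoch."""
    return BASE_LR * decay ** epoch


for epoch in (0, 1, 5, 50, 99):
    print(epoch, "{:.3e}".format(schedule_sketch(epoch)))
```

By the last of 100 epochs the rate has dropped below 1e-8, so the later epochs of a long run update the weights only very slightly.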



callbacks = [tensorflow.keras.callbacks.ModelCheckpoint('./checkpoints/weights.{epoch:02d}-{val_loss:.2f}.hdf5',
                                            verbose=1,
                                            save_weights_only=False),
            tensorflow.keras.callbacks.LearningRateScheduler(schedule)]
base_lr = 3e-4
optim = tensorflow.keras.optimizers.Adam(lr=base_lr)
# optim = tensorflow.keras.optimizers.RMSprop(lr=base_lr)
# optim = tensorflow.keras.optimizers.SGD(lr=base_lr, momentum=0.9, decay=decay, nesterov=True)
model.compile(optimizer=optim,
             loss=MultiboxLoss(NUM_CLASSES, neg_pos_ratio=2.0).compute_loss)

Running the Training

This time I trained for 100 epochs to look for the best setting.


nb_epoch = 100
history = model.fit_generator(gen.generate(True), gen.train_batches,
                             nb_epoch, verbose=1,
                             callbacks=callbacks,
                             validation_data=gen.generate(False),
                             validation_steps=gen.val_batches,
                             workers=1)

model.save_weights('./airplane_SSD300_100ep_weights.hdf5')

plt.plot(range(1,nb_epoch+1),history.history["loss"],label="train")
plt.plot(range(1,nb_epoch+1),history.history["val_loss"],label="valid")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()

Training Process

Training ran on a GTX 1650 GPU.
It took close to 12 hours (lol).
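
One possible contributor to the long run, offered as an observation rather than a diagnosis: gen.train_batches is set to len(train_keys) and passed as the number of steps per epoch, while each step consumes a batch of 16 images. The arithmetic below uses the approximate figure of 1,200 images from this article.

```python
import math

num_images = 1200  # approximate total from the article
batch_size = 16    # batch size passed to Generator
num_train = int(round(0.8 * num_images))

steps_as_written = num_train                        # value of gen.train_batches
images_per_epoch = steps_as_written * batch_size    # images consumed per epoch
steps_one_pass = math.ceil(num_train / batch_size)  # steps for a single pass

print(steps_as_written, images_per_epoch, steps_one_pass)
```

Under these assumptions each epoch processes 15,360 images, about 16 passes over the training set, whereas 60 steps would cover it once.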

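Visualizing the Training Process

For reference, the graph for a 5-epoch run is shown below.
The longer run showed signs of overfitting, so I went with 5 epochs.

Inference on Training/Validation Data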


inputs = []
images = []
for val_key in sorted(val_keys)[0:]:
   img_path = path_prefix + val_key
   img = image.load_img(img_path, target_size=(300, 300))
   img = image.img_to_array(img)
   images.append(imread(img_path))
   inputs.append(img.copy())
inputs = preprocess_input(np.array(inputs))

preds = model.predict(inputs, batch_size=1, verbose=1)
results = bbox_util.detection_out(preds)

for i, img in enumerate(images):
   # Parse the outputs.
   det_label = results[i][:, 0]
   det_conf = results[i][:, 1]
   det_xmin = results[i][:, 2]
   det_ymin = results[i][:, 3]
   det_xmax = results[i][:, 4]
   det_ymax = results[i][:, 5]

   # Get detections with confidence higher than 0.6.
   top_indices = [i for i, conf in enumerate(det_conf) if conf >= 0.6]

   top_conf = det_conf[top_indices]
   top_label_indices = det_label[top_indices].tolist()
   top_xmin = det_xmin[top_indices]
   top_ymin = det_ymin[top_indices]
   top_xmax = det_xmax[top_indices]
   top_ymax = det_ymax[top_indices]

   colors = plt.cm.hsv(np.linspace(0, 1, 4)).tolist()

   plt.imshow(img / 255.)
   currentAxis = plt.gca()

   for i in range(top_conf.shape[0]):
       xmin = int(round(top_xmin[i] * img.shape[1]))
       ymin = int(round(top_ymin[i] * img.shape[0]))
       xmax = int(round(top_xmax[i] * img.shape[1]))
       ymax = int(round(top_ymax[i] * img.shape[0]))
       score = top_conf[i]
       label = int(top_label_indices[i])
#         label_name = voc_classes[label - 1]
       display_txt = '{:0.2f}, {}'.format(score, ["","boeing","airbus","military"][label])
       coords = (xmin, ymin), xmax-xmin+1, ymax-ymin+1
       color = colors[label]
       currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor=color, linewidth=2))
       currentAxis.text(xmin, ymin, display_txt, bbox={'facecolor':color, 'alpha':0.5})
   
   plt.show()
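
The filtering and coordinate conversion in the loop above can be isolated into two small functions. This is a sketch with hypothetical names (filter_detections, to_pixels) and made-up rows; the column order [label, conf, xmin, ymin, xmax, ymax] follows the way the results are indexed in the code above.

```python
def filter_detections(detections, threshold=0.6):
    """Keep rows [label, conf, xmin, ymin, xmax, ymax] with conf >= threshold."""
    return [row for row in detections if row[1] >= threshold]


def to_pixels(row, width, height):
    """Convert the normalized box of one row to integer pixel coordinates."""
    _, _, xmin, ymin, xmax, ymax = row
    return (int(round(xmin * width)), int(round(ymin * height)),
            int(round(xmax * width)), int(round(ymax * height)))


# Hypothetical detections for one image.
sample = [
    [2, 0.91, 0.10, 0.20, 0.80, 0.70],
    [1, 0.58, 0.12, 0.22, 0.79, 0.69],
    [3, 0.05, 0.00, 0.00, 0.10, 0.10],
]
kept = filter_detections(sample)
pixel_box = to_pixels(kept[0], width=640, height=480)
print(kept, pixel_box)
```

Only the first row passes the 0.6 threshold here; lowering the threshold to 0.5 would also let the second, overlapping row through.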

Evaluation Method

Looking into evaluation methods, I learned that there are metrics such as IoU.
Reference blog: https://qiita.com/panchokyutech/items/2c972ede2bc883597a87

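How Object Detection Is Performed

IoU apparently expresses how well the box the model predicted for an aircraft matches the box of the actual aircraft: the correctly recognized region (TP), where the two boxes overlap,
divided by the area the two boxes cover together.
However, the blog post I referred to runs detection over the whole image, which differed from my bounding-box setup, so I judged that I could not implement it.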
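
For two axis-aligned boxes, IoU can be computed directly from the corner coordinates. A minimal sketch, assuming boxes in [xmin, ymin, xmax, ymax] form; this is the standard definition, not code from the article.

```python
def iou(box_a, box_b):
    """Intersection over Union of two [xmin, ymin, xmax, ymax] boxes."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    # Width and height are clamped at 0 so disjoint boxes give no overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


truth = [0.0, 0.0, 2.0, 2.0]
pred = [1.0, 1.0, 3.0, 3.0]
print(iou(truth, pred))  # overlap 1, union 7
```

A detection is commonly counted as correct when its IoU with a ground-truth box of the same class is at least some threshold such as 0.5.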

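Skipped This Time

Several evaluation metrics exist for object detection, but implementing them rigorously was difficult, so
this time I implemented accuracy only.

Accuracy

Accuracy was 59.6%, which struck me as better than expected, but some issues also surfaced; the results are described in detail below.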
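
The article does not say how the 59.6% was counted. One plausible reading, stated here purely as an assumption, is image-level accuracy: an image counts as correct when the label of its highest-confidence detection equals the true label, and as wrong when nothing is detected. The function names and the data below are hypothetical.

```python
def top_label(detections):
    """Label of the highest-confidence (label, conf) pair, or None if empty."""
    if not detections:
        return None
    return max(detections, key=lambda row: row[1])[0]


def image_level_accuracy(true_labels, detections_per_image):
    """Fraction of images whose top detection carries the true label."""
    correct = sum(
        1 for truth, dets in zip(true_labels, detections_per_image)
        if top_label(dets) == truth
    )
    return correct / len(true_labels)


# Hypothetical ground truth and detections as (label, confidence) pairs.
true_labels = [2, 1, 3, 2]
detections_per_image = [
    [(2, 0.90)],
    [(2, 0.70), (1, 0.65)],  # top detection has the wrong label
    [(3, 0.80)],
    [],                      # nothing detected
]
accuracy = image_level_accuracy(true_labels, detections_per_image)
print(accuracy)
```

This measure ignores where the box is, so it is weaker than IoU-based metrics such as mAP.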

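Inference on Training Data

Correct detections: Airbus

Correct detections: Boeing

Correct detections: military

Images That Were Not Recognized

The model may also have learned color during training;
the scene was dark and the aircraft itself was black, which is probably why it was not recognized.

Misidentified Data

An Airbus (A320) was recognized as a Boeing.
The two airframes look alike, which probably caused the mistake.

Completely Wrong Identification

The aircraft was far away and appeared small, so it may have been mistaken for a fighter.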

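Inference Results on Validation Data

Correct detections: Airbus

Correct detections: Boeing

Correct detections: military

Data Detected Twice
Again, because the airframes look alike, the aircraft may have been detected as both classes.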
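
Overlapping boxes with different labels can survive if non-maximum suppression is applied separately per class, which I assume is what detection_out does in the ssd_keras code used here. A sketch of one possible remedy, a greedy suppression that ignores labels; the function names and sample rows are hypothetical.

```python
def _iou(a, b):
    """IoU of two [xmin, ymin, xmax, ymax] boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def suppress_across_classes(detections, iou_threshold=0.5):
    """Greedy NMS that ignores labels: visit rows from most to least
    confident and keep a row only if it does not overlap a kept row
    by more than the threshold."""
    kept = []
    for det in sorted(detections, key=lambda row: row[1], reverse=True):
        if all(_iou(det[2:6], k[2:6]) <= iou_threshold for k in kept):
            kept.append(det)
    return kept


# Hypothetical rows: [label, conf, xmin, ymin, xmax, ymax].
dets = [
    [1, 0.72, 0.10, 0.20, 0.80, 0.70],  # boeing
    [2, 0.81, 0.11, 0.21, 0.79, 0.69],  # airbus, almost the same box
    [3, 0.65, 0.85, 0.05, 0.98, 0.15],  # military, elsewhere in the image
]
result = suppress_across_classes(dets)
print([row[0] for row in result])
```

The trade-off is that two genuinely different aircraft that overlap heavily in the image would also be merged into one detection.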

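Data That Was Not Recognized

The A380 is highly distinctive, yet it was not recognized well.
Few of the collected images were taken from this angle, which is
probably why it was missed.

Recognized, but in Two Places

The model seems to have matched both the gear-retracted appearance and the on-ground appearance.

Airbus Recognized as Boeing

A four-engine A340 was mistakenly recognized as a Boeing.
A few more A340 images might have prevented this.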

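Takeaways from These Results

I ran SSD with about 1,200 images, but that was still too few
training images.
Passenger aircraft also look very similar; telling them apart requires features such as window frames and engine shape, but
the 300×300 input constraint probably made those features hard to capture.
To improve accuracy, I think collecting more data and using larger images
are the key.

Building the App

Finally, aiming to publish on Heroku, I built the app.
I confirmed that it runs locally.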


import os
import shutil

from flask import Flask, request, redirect, render_template, flash
from werkzeug.utils import secure_filename
import tensorflow
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.imagenet_utils import preprocess_input
import base64
import numpy as np
from ssd import SSD300
from ssd_training import MultiboxLoss
from ssd_utils import BBoxUtility
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
import pickle 
import time
from imageio import imread

# Class labels
classes = ['','boeing','airbus','military']
image_size = 300

UPLOAD_FOLDER = "uploads"
ALLOWED_EXTENSIONS = set(['png', 'jpg', 'jpeg', 'gif'])

# app = Flask(__name__)
app = Flask(__name__, static_folder='response_image')

def allowed_file(filename):
   return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS

model = SSD300(input_shape=(image_size, image_size, 3),num_classes=4)  # Build the SSD300 model (trained weights are loaded below)
optim = tensorflow.keras.optimizers.Adam(lr=3e-4)
model.compile(optimizer=optim,loss=MultiboxLoss(4,neg_pos_ratio=2.0).compute_loss)
model.load_weights('airplane_SSD300_5ep_weights.hdf5')

priors = pickle.load(open('prior_boxes_ssd300.pkl','rb'))  # prior boxes, as in training (not the ground-truth pickle)
bbox_util = BBoxUtility(4,priors)

@app.route('/', methods=['GET', 'POST'])
def upload_file():
   if request.method == 'POST':
       if 'file' not in request.files:
           flash('ファイルがありません')  # "No file was provided"
           return redirect(request.url)
       file = request.files['file']
       if file.filename == '':
           flash('ファイルがありません')  # "No file was provided"
           return redirect(request.url)
       if file and allowed_file(file.filename):
           filename = secure_filename(file.filename)
           file.save(os.path.join(UPLOAD_FOLDER, filename))
           filepath = os.path.join(UPLOAD_FOLDER, filename)

           shutil.rmtree('response_image',ignore_errors=True)
           os.makedirs('response_image',exist_ok=True)

           # Load the uploaded image and convert it to a NumPy array
           img = image.load_img(filepath, grayscale=False, target_size=(image_size,image_size))
           img = image.img_to_array(img)
           data =preprocess_input(np.array([img]))
           # Pass the converted data to the model and predict
           result = bbox_util.detection_out(model.predict(data))[0]
           # Generate and save an image from the inference results
           
           # Parse the outputs.
           det_label = result[:, 0]
           det_conf = result[:, 1]
           det_xmin = result[:, 2]
           det_ymin = result[:, 3]
           det_xmax = result[:, 4]
           det_ymax = result[:, 5]
           # Get detections with confidence higher than 0.6.
           top_indices = [i for i, conf in enumerate(det_conf) if conf >= 0.6]
           # Keep only detections with confidence of 0.6 or higher
           top_conf = det_conf[top_indices]
           top_label_indices = det_label[top_indices].tolist()
           top_xmin = det_xmin[top_indices]
           top_ymin = det_ymin[top_indices]
           top_xmax = det_xmax[top_indices]
           top_ymax = det_ymax[top_indices]
           colors = plt.cm.hsv(np.linspace(0, 1, 4)).tolist()
           img = imread(filepath)
           plt.imshow(img / 255.)
           currentAxis = plt.gca()
           for i in range(top_conf.shape[0]):
               xmin = int(round(top_xmin[i] * img.shape[1]))
               ymin = int(round(top_ymin[i] * img.shape[0]))
               xmax = int(round(top_xmax[i] * img.shape[1]))
               ymax = int(round(top_ymax[i] * img.shape[0]))
               score = top_conf[i]
               label = int(top_label_indices[i])
               #label_name = voc_classes[label - 1]
               display_txt = '{:0.2f}, {}'.format(score, ["","boeing","airbus","military"][label])
               coords = (xmin, ymin), xmax-xmin+1, ymax-ymin+1
               color = colors[label]
               currentAxis.add_patch(plt.Rectangle(*coords, fill=False, edgecolor=color, linewidth=2))
               currentAxis.text(xmin, ymin, display_txt, bbox={'facecolor':color, 'alpha':0.5})
           
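           fig_name=str(int(time.time()))+'.png'
           plt.savefig('response_image/'+fig_name)
           # Close the figure so boxes from this request are not drawn again on the next one
           plt.close()


           # Return the image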
           

           return render_template("index.html",answer="../response_image/"+fig_name)

   return render_template("index.html",answer="")


if __name__ == "__main__":
   port = int(os.environ.get('PORT', 8080))
   app.run(host ='0.0.0.0',port = port)

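It Worked Locally, but Publishing on Heroku Was Another Matter

I tried to publish on Heroku, but a memory error occurred no matter how many times I tried.
After trimming the package versions and dependencies to the bare minimum, the page finally opened, but
an error occurred as soon as an image was submitted.
Apparently there was not enough capacity when the app was loaded.
Heroku may not be well suited to machine learning workloads that need a lot of capacity.

Looking for Another Server

Considering cost and other factors, and since I had already used GCP, I decided to publish
on GAE.
Reference URL
https://cloud.google.com/appengine/docs/flexible/custom-runtimes/configuring-your-app-with-app-yaml?hl=ja

Publishing the App on Google App Engine

HTML code for the published app
Publishing took about three weeks.
The main problem was that the submitted image was supposed to come back but was not displayed properly.
I went through repeated trial and error, changing code and redeploying many times.
In the end, a stray " ," had slipped into the code and was preventing the image from being displayed.
Finding it took a long time, and it drove home how a tiny mistake can
grow into a big problem.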

<!DOCTYPE html>
<html lang='ja'>
<head>
   <meta charset='UTF-8'>
   <meta name='viewport' content="width=device-width, initial-scale=1.0">
   <meta http-equiv='X-UA-Compatible' content="ie=edge">
   <title>airplane Classifier</title>
   <link rel='stylesheet' href="./static/stylesheet.css">
</head>
<body>
   <header>   
       
       <a class='header-logo' href="#">airplane Classifier</a>
   </header>

   <div class='main'>    
       <h2> AIが送信された画像の機体を識別します</h2>
       <p>画像を送信してください</p>
       <form method='POST' enctype="multipart/form-data">
           <input class='file_choose' type="file" name="file">
           <input class='btn' value="submit!" type="submit">
       </form>
       <img src="data:image/jpeg;base64,{{answer}}" />
       <!-- <img src="{{answer}}"> -->
   </div>

   <footer>
       
       <small> 2021,okuda  </small>   
   </footer>
</body>
</html>
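
The template embeds {{answer}} in a data: URI, which expects Base64 text, while the Flask code shown earlier passes a file path. As a sketch, assuming the Base64 route is the intended one, the bytes of the saved figure could be encoded as follows before being handed to render_template. The helper name to_data_uri_payload is hypothetical, and the 8-byte PNG signature merely stands in for a real image file.

```python
import base64


def to_data_uri_payload(image_bytes):
    """Base64-encode raw image bytes into ASCII text for a data: URI."""
    return base64.b64encode(image_bytes).decode("ascii")


# The PNG file signature stands in for the contents of a saved figure.
png_signature = b"\x89PNG\r\n\x1a\n"
payload = to_data_uri_payload(png_signature)
data_uri = "data:image/png;base64," + payload
print(data_uri)
```

Since the figure is saved as a PNG, image/png would be the matching MIME type in the img tag. Passing a path instead would pair with the commented-out img tag in the template.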

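App Successfully Published

The app is published, but sending images all at once causes a server error,
which I believe is a capacity problem.
Increasing the capacity further would not be cost-effective, so I gave up on that.
Still, I managed to get it running. It took two months to build and the accuracy is by no means good, but the sense of accomplishment was great.
The image shows the app recognizing an F-2 fighter, although the prediction came out in triplicate (lol).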
