[Python] Identifying Boeing Aircraft Models with Image Recognition

Posted at 2024-02-04

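Introduction

I come from a humanities background and am a complete beginner at Python, but I have long been interested in programming, so I started studying in the hope of being able to write simple programs on my own. Being middle-aged, I will also admit to a bit of vanity: I wanted to get a little closer to AI, which everyone is talking about.
I enjoy both flying on and looking at airplanes, so I thought I could use the image recognition I had studied to identify aircraft, and built this application.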

The program I built

This application distinguishes between the Boeing 747 and the Boeing 787.
https://aideny-boeing-aircraft-app.onrender.com

Execution environment
Image collection
Running the model
Creating the HTML
FLASK
RENDER
Future use
Conclusion

1. Execution environment
・Visual Studio Code
・Google Colaboratory

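2. Image collection
For image collection, I followed this YouTube video (the title roughly translates to "Anyone can do it: even with no programming experience, a clear explanation of how to collect large numbers of images from the web"):
「【だれでもできる】プログラミングが未経験でも大丈夫。Webから大量画像を収集する方法をわかりやすく解説します。」
https://www.youtube.com/watch?v=hRB104ik6pQ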

a) To store images on Google Drive, the drive has to be mounted first, so run the following in Google Colaboratory:

from google.colab import drive
drive.mount('/content/drive')

b) Installing icrawler

pip install icrawler

c) Importing the Bing crawler

from icrawler.builtin import BingImageCrawler

d) Collecting 747 aircraft images

# The folder name uses an underscore so that it matches the path passed to zip in the next step
bing_crawler = BingImageCrawler(downloader_threads=4,
                                storage={'root_dir': '747_aircraft_data'})
bing_crawler.crawl(keyword='747 aircraft', filters=None, offset=0, max_num=500)

e) Zipping the collected 747 images, downloading them, and saving them to a folder

!zip -r /content/747_download.zip /content/747_aircraft_data

I downloaded the ZIP file and saved its contents to the following Google Drive folder:
My Drive > Colab > Boeing_aircraft > 747_aircraft_data

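f) Screening the images
The downloaded images included cabin interiors, seat maps, and shots containing several aircraft, so I deleted those and kept only photos showing the exterior of the airplane.
I collected Boeing 787 images in the same way, ending up with roughly 300 images for each model.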

3. Running the model
I based this on the code from the handwritten character recognition app I built earlier.

import os
import cv2
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.layers import Dense, Dropout, Flatten, Input
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras import optimizers

# List the image files stored on Google Drive
path_747 = os.listdir('/content/drive/MyDrive/Colab/Boeing_aircraft/747_aircraft_data/')
path_787 = os.listdir('/content/drive/MyDrive/Colab/Boeing_aircraft/787_aircraft_data/')

img_747 = []
img_787 = []

for i in range(len(path_747)):
    img = cv2.imread('/content/drive/MyDrive/Colab/Boeing_aircraft/747_aircraft_data/' + path_747[i])
    b,g,r = cv2.split(img)
    img = cv2.merge([r,g,b])
    img = cv2.resize(img, (150,150))
    img_747.append(img)

for i in range(len(path_787)):
    img = cv2.imread('/content/drive/MyDrive/Colab/Boeing_aircraft/787_aircraft_data/' + path_787[i])
    b,g,r = cv2.split(img)
    img = cv2.merge([r,g,b])
    img = cv2.resize(img, (150,150))
    img_787.append(img)

X = np.array(img_747 + img_787)
y =  np.array([0]*len(img_747) + [1]*len(img_787))

rand_index = np.random.permutation(np.arange(len(X)))
X = X[rand_index]
y = y[rand_index]
y = to_categorical(y)

# Split the data into training (80%) and test (20%) sets
X_train = X[:int(len(X)*0.8)]
y_train = y[:int(len(y)*0.8)]
X_test = X[int(len(X)*0.8):]
y_test = y[int(len(y)*0.8):]

# Create the VGG16 instance
input_tensor = Input(shape=(150, 150, 3))
vgg16 = VGG16(include_top=False, weights='imagenet', input_tensor=input_tensor)

top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
#top_model.add(Dense(256, activation='relu'))
top_model.add(Dense(256, activation="sigmoid"))

# Dropout
#---------------------------
top_model.add(Dropout(rate=0.5))
#---------------------------

top_model.add(Dense(2, activation='softmax'))


# Connect VGG16 and the top model
model = Model(inputs=vgg16.input, outputs=top_model(vgg16.output))

# Freeze the VGG16 weights (input layer + 13 conv layers + 5 pooling layers = 19 layers)
for layer in model.layers[:19]:
    layer.trainable = False

model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(learning_rate=1e-4, momentum=0.9),  # `lr` is a deprecated alias
              metrics=['accuracy'])

model.fit(X_train, y_train, batch_size=100, epochs=150)


# Takes one image and returns whether it is a 747 or a 787
def pred_boeing(img):
  img = cv2.resize(img, (150,150))
  img = img.reshape(1, 150, 150, 3)
  pred = np.argmax(model.predict(img))
  # Return the label so that the caller can print it
  if pred == 1:
    return "787"
  else:
    return "747"

model.summary()

# Evaluate accuracy on the test set
scores = model.evaluate(X_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

# Pass aircraft photos to pred_boeing and predict the model type
for i in range(5):
    img = cv2.imread('/content/drive/MyDrive/Colab/Boeing_aircraft/787_aircraft_data/' + path_787[i])
    b,g,r = cv2.split(img)
    img = cv2.merge([r,g,b])
    plt.imshow(img)
    plt.show()
    print(pred_boeing(img))

# Create the results directory
result_dir = 'results'
if not os.path.exists(result_dir):
    os.mkdir(result_dir)
# Save the trained model (architecture and weights)
model.save(os.path.join(result_dir, 'my_model48.h5'))

Accuracy evaluation

 Layer (type)                Output Shape              Param #   
=================================================================
 input_3 (InputLayer)        [(None, 150, 150, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 150, 150, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 150, 150, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 75, 75, 64)        0         
                                                                 
 block2_conv1 (Conv2D)       (None, 75, 75, 128)       73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 75, 75, 128)       147584    
                                                                 
 block2_pool (MaxPooling2D)  (None, 37, 37, 128)       0         
                                                                 
 block3_conv1 (Conv2D)       (None, 37, 37, 256)       295168    
                                                                 
 block3_conv2 (Conv2D)       (None, 37, 37, 256)       590080    
                                                                 
 block3_conv3 (Conv2D)       (None, 37, 37, 256)       590080    
                                                                 
 block3_pool (MaxPooling2D)  (None, 18, 18, 256)       0         
                                                                 
 block4_conv1 (Conv2D)       (None, 18, 18, 512)       1180160   
                                                                 
 block4_conv2 (Conv2D)       (None, 18, 18, 512)       2359808   
                                                                 
 block4_conv3 (Conv2D)       (None, 18, 18, 512)       2359808   
                                                                 
 block4_pool (MaxPooling2D)  (None, 9, 9, 512)         0         
                                                                 
 block5_conv1 (Conv2D)       (None, 9, 9, 512)         2359808   
                                                                 
 block5_conv2 (Conv2D)       (None, 9, 9, 512)         2359808   
                                                                 
 block5_conv3 (Conv2D)       (None, 9, 9, 512)         2359808   
                                                                 
 block5_pool (MaxPooling2D)  (None, 4, 4, 512)         0         
                                                                 
 sequential_2 (Sequential)   (None, 2)                 2097922   
                                                                 
=================================================================
Total params: 16812610 (64.14 MB)
Trainable params: 2097922 (8.00 MB)
Non-trainable params: 14714688 (56.13 MB)
_________________________________________________________________
7/7 [==============================] - 1s 66ms/step - loss: 0.2793 - accuracy: 0.8977
Test loss: 0.2792545258998871
Test accuracy: 0.8976744413375854

The final test accuracy was about 90%.
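
A single accuracy figure does not show whether the 747 or the 787 is misclassified more often. As a minimal sketch of how that could be checked, the helper below counts correct predictions per class. The function names and the label lists are hypothetical and used only for illustration (0 = 747, 1 = 787); they are not results from this article.

```python
from collections import Counter

CLASSES = ["747", "787"]  # same order as the labels used for training


def confusion_counts(y_true, y_pred):
    """Count (true label, predicted label) pairs."""
    return Counter((CLASSES[t], CLASSES[p]) for t, p in zip(y_true, y_pred))


def per_class_accuracy(y_true, y_pred):
    """Fraction of correctly predicted images for each true class."""
    counts = confusion_counts(y_true, y_pred)
    result = {}
    for name in CLASSES:
        total = sum(n for (t, _), n in counts.items() if t == name)
        result[name] = counts[(name, name)] / total if total else None
    return result


# Hypothetical labels for illustration only, not the article's test data
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]
print(per_class_accuracy(y_true, y_pred))
```

With the model above, `y_true` could be obtained as `np.argmax(y_test, axis=1)` and `y_pred` as `np.argmax(model.predict(X_test), axis=1)`.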

a) Experiment 1 (model definition)

top_model = Sequential()
top_model.add(Flatten(input_shape=vgg16.output_shape[1:]))
#top_model.add(Dense(256, activation='relu'))
top_model.add(Dense(256, activation="sigmoid"))

With ReLU the accuracy was only about 60%, so I adopted sigmoid.

b) Experiment 2 (image size)
With an image size of 50, the accuracy was a little under 80%.
I therefore set the size to 150.
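
One plausible reason for the gap (not something I verified) is how little spatial information survives VGG16 at the smaller size. The five max-pooling layers each halve the width and height, so the arithmetic can be sketched as follows. The function names are my own, and the result for 150 can be checked against the model summary above (a 4 x 4 x 512 output and 2,097,922 parameters in the top model).

```python
def vgg16_feature_shape(image_size, num_pools=5, channels=512):
    """Output shape of VGG16 (include_top=False) for a square input image."""
    size = image_size
    for _ in range(num_pools):
        size //= 2  # each 2x2 max-pooling layer halves the size, rounding down
    return (size, size, channels)


def top_model_params(image_size, hidden_units=256, num_classes=2):
    """Parameter count of Flatten -> Dense(256) -> Dropout -> Dense(2)."""
    h, w, c = vgg16_feature_shape(image_size)
    flat = h * w * c
    dense1 = flat * hidden_units + hidden_units          # weights + biases
    dense2 = hidden_units * num_classes + num_classes    # weights + biases
    return dense1 + dense2


for s in (50, 150):
    print(s, vgg16_feature_shape(s), top_model_params(s))
```

At size 50 the feature map shrinks to a single 1 x 1 position, while at size 150 the classifier still sees a 4 x 4 grid.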

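c) Experiment 3 (number of images)
Initially there were about 300 images for each model.
I added about 150 more, but the accuracy barely changed
(I suspect because some of the added images were duplicates of ones I already had).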
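
To check that suspicion, byte-identical files can be found by hashing each file's contents. This is a minimal sketch under my own assumptions: `find_duplicate_files` is a hypothetical helper, and it only catches exact copies, not the same photo saved at a different size or compression level.

```python
import hashlib
import os


def find_duplicate_files(folder):
    """Return groups of file paths in `folder` whose contents are byte-identical."""
    by_hash = {}
    for name in sorted(os.listdir(folder)):
        path = os.path.join(folder, name)
        if not os.path.isfile(path):
            continue  # skip subdirectories
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        by_hash.setdefault(digest, []).append(path)
    # Keep only the hashes shared by two or more files
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

It could be called on an image folder, for example `find_duplicate_files('/content/drive/MyDrive/Colab/Boeing_aircraft/747_aircraft_data/')`, and all but one file in each returned group deleted.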

d) Experiment 4 (number of epochs)

model.fit(X_train, y_train, batch_size=100, epochs=150)

I tried 50, 100, and 150 epochs, and all of them gave about 90%.
The model did not appear to be overfitting even at 150 epochs, so I left it at 150.
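
The training call above does not track validation metrics, so overfitting was judged only from the final test accuracy. Keras `fit` accepts a `validation_split` argument and returns a history object whose `history['val_loss']` list holds the validation loss per epoch. As a rough sketch, assuming such a list is available, the helpers below locate the best epoch and flag a run whose validation loss has stopped improving. The function names and the numbers are hypothetical, not measurements from this article.

```python
def best_epoch(val_losses):
    """Return the 1-based epoch with the lowest validation loss."""
    best_index = min(range(len(val_losses)), key=lambda i: val_losses[i])
    return best_index + 1


def looks_overfit(val_losses, patience=10):
    """True if validation loss has not improved in the last `patience` epochs."""
    return len(val_losses) - best_epoch(val_losses) >= patience


# Hypothetical validation losses for illustration only
val_losses = [0.69, 0.55, 0.41, 0.33, 0.30, 0.29, 0.31, 0.34]
print(best_epoch(val_losses), looks_overfit(val_losses, patience=2))
```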

4. Creating the HTML

<!DOCTYPE html>
<html lang="ja">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <meta http-equiv="X-UA-Compatible" content="ie=edge">
    <title>Boeing Aircraft Classifier</title>
    <link rel="stylesheet" href="./static/stylesheet.css">
</head>
<body>
    <header>   
        <img class="header_img" src="https://aidemyexstorage.blob.core.windows.net/aidemycontents/1621500180546399.png" alt="Aidemy">
        <a class="header-logo" href="#">Boeing Aircraft Classifier</a>
    </header>

    <div class="main">    
        <h2> AI identifies the Boeing model (747 or 787) in the image you submit</h2>
        <p>Please submit an image</p>
        <form method="POST" enctype="multipart/form-data">
            <input class="file_choose" type="file" name="file">
            <input class="btn" value="submit!" type="submit">
        </form>
        <div class="answer">{{answer}}</div>
    </div>

    <footer>
        <img class="footer_img" src="https://aidemyexstorage.blob.core.windows.net/aidemycontents/1621500180546399.png" alt="Aidemy">
        <small>&copy; 2024 Aidemy, inc.</small>   
    </footer>
</body>
</html>

Stylesheet

header {
    background-color: #76B55B;
    height: 60px;
    margin: -8px;
    display: flex;
    flex-direction: row-reverse;
    justify-content: space-between;
}

.header-logo {
    color: #fff;
    font-size: 25px;
    margin: 15px 25px;
}

.header_img {
    height: 25px;
    margin: 15px 25px;
}

.main {
    height: 370px;
}

h2 {
    color: #444444;
    margin: 90px 0px;
    text-align: center;
}

p {
    color: #444444;
    margin: 70px 0px 30px 0px;
    text-align: center;
}

.answer {
    color: #444444;
    margin: 70px 0px 30px 0px;
    text-align: center;
}

form {
    text-align: center;
}

footer {
    background-color: #F7F7F7;
    height: 110px;
    margin: -8px;
    position: relative;
}

.footer_img {
    height: 25px;
    margin: 15px 25px;
}

small {
    margin: 15px 25px;
    position: absolute;
    left: 0;
    bottom: 0;
}

Both files are based on the code from Aidemy's introductory Flask course.

5. FLASK

import os
from flask import Flask, request, redirect, render_template, flash
from werkzeug.utils import secure_filename
from tensorflow.keras.models import Sequential, load_model
from tensorflow.keras.preprocessing import image

import numpy as np


classes = ["747","787"]
image_size = 150

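UPLOAD_FOLDER = "uploads"
os.makedirs(UPLOAD_FOLDER, exist_ok=True)  # make sure the upload folder exists
ALLOWED_EXTENSIONS = set(['png', 'jpg', 'jpeg', 'gif'])

app = Flask(__name__)
app.secret_key = os.urandom(24)  # flash() needs a secret key for the session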

def allowed_file(filename):
    return '.' in filename and filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS

model = load_model('./my_model48.h5',compile=False)  # load the trained model


@app.route('/', methods=['GET', 'POST'])
def upload_file():
    if request.method == 'POST':
        if 'file' not in request.files:
            flash('No file was provided')
            return redirect(request.url)
        file = request.files['file']
        if file.filename == '':
            flash('No file was provided')
            return redirect(request.url)
        if file and allowed_file(file.filename):
            filename = secure_filename(file.filename)
            file.save(os.path.join(UPLOAD_FOLDER, filename))
            filepath = os.path.join(UPLOAD_FOLDER, filename)

            # Load the received image and convert it to a NumPy array
            img = image.load_img(filepath, grayscale=False, target_size=(image_size,image_size))
            img = image.img_to_array(img)
            data = np.array([img])
            # Pass the converted data to the model and predict
            result = model.predict(data)[0]
            predicted = result.argmax()
            pred_answer = "This is a " + classes[predicted]

            return render_template("index.html",answer=pred_answer)

    return render_template("index.html",answer="")


if __name__ == "__main__":
    port = int(os.environ.get('PORT', 8080))
    app.run(host ='0.0.0.0',port = port)

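I wrote this by referring to the code from the handwritten character recognition app.
At first I got an error, but I was able to fix it myself by reading the log.
The cause was in these lines: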

img = image.load_img(filepath, grayscale=False, target_size=(image_size,image_size))
img = image.img_to_array(img)

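I had initially set grayscale=True, which loads the image with a single channel, while this model expects 3-channel color input. Since the images are in color this time, changing it to grayscale=False resolved the error.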

6. RENDER
I deployed the app the same way as the handwritten character recognition app, but got an error:

  Getting requirements to build wheel: finished with status 'error'
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [20 lines of output]

RuntimeError: Python version 2.7 or 3.4+ is required.

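I could not fix this myself, so I consulted a counselor.
The error was caused by a version mismatch, and was resolved by:
1. Removing the version numbers written to the right of each package in the requirements file
2. Setting PYTHON_VERSION in Render's Environment Variables to the Python version I was actually using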
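
To find the value for step 2, the snippet below can be run in the environment where the app already works (for example Colab or the local machine). It is only a small sketch: `python_version_string` is a hypothetical helper name, and whether Render supports the exact version it prints should be confirmed in Render's documentation.

```python
import sys


def python_version_string():
    """Return the running interpreter's version as 'major.minor.patch'."""
    v = sys.version_info
    return f"{v.major}.{v.minor}.{v.micro}"


# Copy the printed value into the PYTHON_VERSION environment variable
print("PYTHON_VERSION =", python_version_string())
```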

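7. Future use
For now the app only distinguishes two Boeing models (the 747 in particular is distinctive and easy to recognize). I would like to improve it so that it can identify other models as well.
I also think image-based classification can be applied to many kinds of images, not just airplanes.
More accurate classification will require further study, but I expect it to be useful in everyday settings.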

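8. Conclusion
Taking the course taught me the basics of Python programming and the fundamentals of AI, such as deep learning. Above all, writing code myself with the course materials as a guide and getting programs to run through trial and error was a very rewarding way to study.
Along the way I ran into problems with the system environment and various errors. I tried to solve them on my own by searching the web, but often could not, and being able to ask questions in counseling sessions was a great help. I was deeply impressed by the experience and skill of the counselors, who could resolve issues on the spot.

Python and AI still run deep, so I want to keep building my knowledge little by little, without overdoing it, and create some new model.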

Note:
"This blog post is published as part of the Aidemy Premium curriculum in order to meet the course completion requirements."
