More than 5 years have passed since last update.

I compared the six Core ML models (.mlmodel) that Apple distributes on the page below.

Machine Learning - Apple Developer

Note that the Inputs and Outputs entries are written in the format "name / type - description". (I got these values by adding each .mlmodel file to an Xcode project.)
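The Inputs and Outputs entries in the sections that follow can be split back into their three fields mechanically. The following is a minimal Python sketch, not something from Apple or Xcode: `Feature` and `parse_entry` are hypothetical names introduced here, and the pattern assumes the " - " separator that the entries in this article use before the description.

```python
import re
from typing import NamedTuple


class Feature(NamedTuple):
    """One Inputs/Outputs entry as written in this article (hypothetical type)."""
    name: str
    type: str
    description: str


# "name / type - description", with an optional leading bullet.
# Assumption: the type never contains " - ", which holds for the entries listed here.
_ENTRY = re.compile(
    r"^\s*(?:•\s*)?(?P<name>\S+)\s*/\s*(?P<type>.+?)\s+-\s+(?P<description>.+)$"
)


def parse_entry(line: str) -> Feature:
    """Split one entry line into name, type, and description."""
    match = _ENTRY.match(line)
    if match is None:
        raise ValueError(f"not a 'name / type - description' entry: {line!r}")
    return Feature(match["name"], match["type"], match["description"].strip())


entry = parse_entry("image / Image (Color 224 x 224) - Input image to be classified")
print(entry.name)  # the feature name used when calling the model
print(entry.type)  # the expected type, including the image size
```

Keeping the type as a single string is a deliberate simplification; the image size could be parsed out further if needed.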

MobileNet

MobileNets are based on a streamlined architecture that have depth-wise separable convolutions to build lightweight, deep neural networks.

Detects the dominant objects present in an image from a set of 1000 categories such as trees, animals, food, vehicles, people, and more.

  • Size: 17.1 MB
  • License: Apache License. Version 2.0 http://www.apache.org/licenses/LICENSE-2.0
  • Inputs
    • image / Image (Color 224 x 224) - Input image to be classified
  • Outputs
    • classLabelProbs / Dictionary (String → Double) - Probability of each category
    • classLabel / String - Most likely image category

SqueezeNet

Detects the dominant objects present in an image from a set of 1000 categories such as trees, animals, food, vehicles, people, and more.

With an overall footprint of only 5 MB, SqueezeNet has a similar level of accuracy as AlexNet but with 50 times fewer parameters.

  • Size: 5 MB
  • License: BSD License. More information available at https://github.com/DeepScale/SqueezeNet/blob/master/LICENSE
  • Inputs
    • image / Image (Color 227 x 227) - Input image to be classified
  • Outputs
    • classLabelProbs / Dictionary (String → Double) - Probability of each category
    • classLabel / String - Most likely image category

Places205-GoogLeNet

Detects the scene of an image from 205 categories such as an airport terminal, bedroom, forest, coast, and more.

  • Size: 24.8 MB
  • License: Creative Common License. More information available at http://places.csail.mit.edu
  • Inputs
    • sceneImage / Image (Color 224 x 224) - Input image of scene to be classified
  • Outputs
    • sceneLabelProbs / Dictionary (String → Double) - Probability of each scene
    • sceneLabel / String - Most likely scene label
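The two outputs are related: `sceneLabel` is the label with the highest probability, while `sceneLabelProbs` holds a probability for every scene. When more than the single best guess is wanted, the dictionary can be ranked. The following is a minimal Python sketch of that ranking only; `top_k` is a hypothetical helper, and the probabilities are made-up values for illustration, not real model output.

```python
def top_k(probs, k=3):
    """Return the k (label, probability) pairs with the highest probability."""
    return sorted(probs.items(), key=lambda pair: pair[1], reverse=True)[:k]


# Made-up values standing in for a sceneLabelProbs dictionary.
scene_label_probs = {
    "lobby": 0.12,
    "airport_terminal": 0.71,
    "waiting_room": 0.05,
    "corridor": 0.09,
}

# The first ranked entry corresponds to what sceneLabel would report.
best_label, best_prob = top_k(scene_label_probs, k=1)[0]

for label, prob in top_k(scene_label_probs):
    print(f"{label}: {prob:.2f}")
```

The same ranking applies to the `classLabelProbs` output of the other five models.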

ResNet50

Detects the dominant objects present in an image from a set of 1000 categories such as trees, animals, food, vehicles, people, and more.

The top-5 error from the original publication is 7.8%.

  • Size: 102.6 MB
  • License: MIT License. More information available at https://github.com/fchollet/keras/blob/master/LICENSE
  • Inputs
    • image / Image (Color 224 x 224) - Input image to be classified
  • Outputs
    • classLabelProbs / Dictionary (String → Double) - Probability of each category
    • classLabel / String - Most likely image category

Inception v3

Detects the dominant objects present in an image from a set of 1000 categories such as trees, animals, food, vehicles, people, and more.

The top-5 error from the original publication is 5.6%.

  • Size: 94.7 MB
  • License: MIT License. More information available at https://github.com/fchollet/keras/blob/master/LICENSE
  • Inputs
    • image / Image (Color 299 x 299) - Input image to be classified
  • Outputs
    • classLabelProbs / Dictionary (String → Double) - Probability of each category
    • classLabel / String - Most likely image category

VGG16

Detects the dominant objects present in an image from a set of 1000 categories such as trees, animals, food, vehicles, people, and more.

The top-5 error from the original publication is 7.4%.

  • Size: 553.5 MB
  • License: Creative Commons Attribution 4.0 International (CC BY 4.0). More information available at https://creativecommons.org/licenses/by/4.0/
  • Inputs
    • image / Image (Color 224 x 224) - Input image to be classified
  • Outputs
    • classLabelProbs / Dictionary (String → Double) - Probability of each category
    • classLabel / String - Most likely image category

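Five of the six classify the 1000 ImageNet classes

I had vaguely assumed that a wide variety of models was being distributed, but on closer inspection, no fewer than five of the six are models that classify general objects (trees, animals, food, vehicles, people, etc.) into the 1000 ImageNet classes.

The table below lists where these five differ: model size, input image size, and license.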

| Model | Model size | Input image size | License |
|---|---|---|---|
| MobileNet | 17.1 MB | 224 x 224 | Apache 2.0 |
| SqueezeNet | 5 MB | 227 x 227 | BSD |
| ResNet50 | 102.6 MB | 224 x 224 | MIT |
| Inception v3 | 94.7 MB | 299 x 299 | MIT |
| VGG16 | 553.5 MB | 224 x 224 | CC BY 4.0 |
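The size and input-size columns can also be held as data, for example to shortlist the models that fit an app-size budget. The following is a minimal Python sketch, not part of the original comparison: `ModelInfo` and `models_within` are hypothetical names, and the numbers are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    """Size and input resolution of one ImageNet classifier (hypothetical type)."""
    name: str
    size_mb: float
    input_side: int  # inputs are square, so one side in pixels is enough


# Values copied from the comparison table.
IMAGENET_MODELS = [
    ModelInfo("MobileNet", 17.1, 224),
    ModelInfo("SqueezeNet", 5.0, 227),
    ModelInfo("ResNet50", 102.6, 224),
    ModelInfo("Inception v3", 94.7, 299),
    ModelInfo("VGG16", 553.5, 224),
]


def models_within(budget_mb):
    """Return the models whose file size fits the budget, smallest first."""
    fitting = [m for m in IMAGENET_MODELS if m.size_mb <= budget_mb]
    return sorted(fitting, key=lambda m: m.size_mb)


for model in models_within(20):
    print(f"{model.name}: {model.size_mb} MB, {model.input_side} x {model.input_side}")
```

File size is only one criterion; accuracy and inference speed are not covered by this table and would need to be measured separately.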

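Places205-GoogLeNet: scene classification

The one model that stands apart from the rest is Places205-GoogLeNet (model file name GoogLeNetPlaces). It detects one of 205 "scenes" in the input image, such as an airport, a station, or a forest.

The label file is available at the URL below.

To show at a glance which scenes can be recognized, the full list is included here.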

abbey
airport_terminal
alley
amphitheater
amusement_park
aquarium
aqueduct
arch
art_gallery
art_studio
assembly_line
attic
auditorium
apartment_building/outdoor
badlands
ballroom
bamboo_forest
banquet_hall
bar
baseball_field
basement
basilica
bayou
beauty_salon
bedroom
boardwalk
boat_deck
bookstore
botanical_garden
bowling_alley
boxing_ring
bridge
building_facade
bus_interior
butchers_shop
butte
bakery/shop
cafeteria
campsite
candy_store
canyon
castle
cemetery
chalet
classroom
closet
clothing_store
coast
cockpit
coffee_shop
conference_center
conference_room
construction_site
corn_field
corridor
cottage_garden
courthouse
courtyard
creek
crevasse
crosswalk
cathedral/outdoor
church/outdoor
dam
dining_room
dock
dorm_room
driveway
desert/sand
desert/vegetation
dinette/home
doorway/outdoor
engine_room
excavation
fairway
fire_escape
fire_station
food_court
forest_path
forest_road
formal_garden
fountain
field/cultivated
field/wild
galley
game_room
garbage_dump
gas_station
gift_shop
golf_course
harbor
herb_garden
highway
home_office
hospital
hospital_room
hot_spring
hotel_room
hotel/outdoor
ice_cream_parlor
iceberg
igloo
islet
ice_skating_rink/outdoor
inn/outdoor
jail_cell
kasbah
kindergarden_classroom
kitchen
kitchenette
laundromat
lighthouse
living_room
lobby
locker_room
mansion
marsh
martial_arts_gym
mausoleum
medina
motel
mountain
mountain_snowy
music_studio
market/outdoor
monastery/outdoor
museum/indoor
nursery
ocean
office
office_building
orchard
pagoda
palace
pantry
parking_lot
parlor
pasture
patio
pavilion
phone_booth
picnic_area
playground
plaza
pond
pulpit
racecourse
raft
railroad_track
rainforest
reception
residential_neighborhood
restaurant
restaurant_kitchen
restaurant_patio
rice_paddy
river
rock_arch
rope_bridge
ruin
runway
sandbar
schoolhouse
sea_cliff
shed
shoe_shop
shopfront
shower
ski_resort
ski_slope
sky
skyscraper
slum
snowfield
staircase
supermarket
swamp
stadium/baseball
stadium/football
stage/indoor
subway_station/platform
swimming_pool/outdoor
television_studio
topiary_garden
tower
train_railway
tree_farm
trench
temple/east_asia
temple/south_asia
track/outdoor
train_station/platform
underwater/coral_reef
valley
vegetable_garden
veranda
viaduct
volcano
waiting_room
water_tower
watering_hole
wheat_field
wind_farm
windmill
yard

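Other models

This comparison covers only the officially distributed models. Nearly a year has passed since Core ML was announced, and many third-party models have been published. Next time I would like to dig up a range of them and compare them.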
