[Python] I want to imread() straight from a URL. While I'm at it, I published it on pip.

Last updated at 2022-01-21 / Posted at 2021-12-14

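This article is the day 15 entry of the OpenCV Advent Calendar 2021.

Introduction

As far as I know, people have been coming up with this same idea every now and then for about ten years 👀

Still, when you want to grab some arbitrary test image in Google Colaboratory, Jupyter Notebook, or JupyterLab, this kind of helper is handy (especially in Colaboratory).

Copy-pasting the code around every time is a bit of a hassle, though, so I registered it as a pip package.
Here is what it looks like in use on Colaboratory.

However, while writing this post I suddenly remembered that "python-opencv-utils" exists, and when I checked, sure enough, it already has "urlread()"... 😇
Ahhh...
Oh well... 👀
For reference, here is what that one looks like in use.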

Addendum (as of January 21, 2022)

In some cases the following error occurs at runtime.

Error occurred during loading data. Trying to use cache server https://fake-useragent.herokuapp.com/browsers/0.1.11
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/dist-packages/fake_useragent/utils.py", line 154, in load
    for item in get_browsers(verify_ssl=verify_ssl):
  File "/usr/local/lib/python3.7/dist-packages/fake_useragent/utils.py", line 99, in get_browsers
    html = html.split('<table class="w3-table-all notranslate">')[1]
IndexError: list index out of range

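If that happens, either run imread_from_url() again,
or install the older version with !pip install imread_from_url==0.1.2.

Implementation

The implementation is shown below.
In addition to OpenCV and NumPy, it depends on requests and Pillow.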
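
The "run it again" workaround can be wrapped in a small helper. The following is only a rough sketch: call_with_retry is a hypothetical name that is not part of imread_from_url, and it assumes the failure shows up either as a raised exception or as a None return value.

```python
import time


def call_with_retry(func, *args, attempts=2, wait_seconds=1.0, **kwargs):
    """Call func until it returns something other than None.

    Hypothetical helper for illustration; not part of imread_from_url.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            result = func(*args, **kwargs)
        except Exception as error:
            # Remember the error and try again on the next iteration.
            last_error = error
        else:
            if result is not None:
                return result
        # Wait a little before the next attempt (skipped after the last one).
        if attempt < attempts - 1:
            time.sleep(wait_seconds)
    # Every attempt failed: re-raise the last error if there was one.
    if last_error is not None:
        raise last_error
    return None
```

With the package installed, usage would look something like image = call_with_retry(imread_from_url, url).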

import requests
from io import BytesIO

import cv2
import numpy as np
from PIL import Image


def imread_from_url(url, seek_index=0, debug=False):
    image = None

    # Fetch the image from the URL
    response = requests.get(url)

    # Open the downloaded image with PIL
    try:
        image = Image.open(BytesIO(response.content))
    except Exception as e:
        print(e)
        return None

    if debug:
        print(image)
        print(image.size)
        print(dir(image))

    # Seek within multi-frame images such as animated GIFs
    if 'n_frames' in dir(image):
        if debug:
            print('n_frames', image.n_frames)
            print('seek_index', seek_index)

        if 0 <= seek_index < image.n_frames:
            image.seek(seek_index)
        elif seek_index < 0:
            print('The index when seeking must be a positive integer')
            print('seek_index:', seek_index)
            image.seek(0)
        else:
            print('An index outside the seek range was specified')
            print('n_frames:', image.n_frames)
            print('seek_index:', seek_index)
            image.seek(image.n_frames - 1)
        image = image.convert('RGB')

    # RGB -> BGR
    image = cv2.cvtColor(np.array(image, dtype=np.uint8), cv2.COLOR_RGB2BGR)

    return image

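Converting the response to a bytearray and calling imdecode(), the way urlread() does, would be shorter and cleaner...
but for my own purposes I also wanted to read animated GIFs, so I go through Pillow instead.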

import cv2
import numpy as np
from urllib.request import urlopen

def urlread(url, flags=cv2.IMREAD_UNCHANGED):
    response = urlopen(url)
    img = np.asarray(bytearray(response.read()), dtype=np.uint8)
    img = cv2.imdecode(img, flags)
    return img
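
To show the Pillow route on its own, without any network access, here is a minimal sketch that decodes in-memory bytes. decode_to_bgr is a hypothetical helper, not the package's code. Unlike the implementation above, it always converts to RGB, so that grayscale and palette images also come out with three channels, and it swaps channels with a NumPy slice instead of cv2.cvtColor().

```python
from io import BytesIO

import numpy as np
from PIL import Image


def decode_to_bgr(data, seek_index=0):
    """Decode encoded image bytes with Pillow and return a BGR uint8 array.

    Hypothetical helper for illustration; not part of imread_from_url.
    """
    image = Image.open(BytesIO(data))
    # Single-frame formats may not expose n_frames, so fall back to 1.
    n_frames = getattr(image, 'n_frames', 1)
    # Clamp the requested frame into the valid range [0, n_frames - 1].
    image.seek(min(max(seek_index, 0), n_frames - 1))
    # Convert any mode (P, L, RGBA, ...) to 3-channel RGB first.
    rgb = np.array(image.convert('RGB'), dtype=np.uint8)
    # Reverse the channel axis (RGB -> BGR) and return a contiguous copy.
    return np.ascontiguousarray(rgb[:, :, ::-1])
```

The bytes can come from anywhere, for example response.content from requests or the result of urlopen(url).read().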

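It is not much of a code base, but the source is published on GitHub.

Usage example

The scripts from here on are also available in Qiita-AdventCalendar-20211215-OpenCV.ipynb.

① Install with pip

pip install imread_from_url

② Find the URL of the image

Example: right-click the image in Chrome and choose "Copy image address", and so on.

③ Load the URL with imread_from_url()

Load it and use it 👾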

from imread_from_url import imread_from_url

image = imread_from_url(
    'https://raw.githubusercontent.com/Kazuhito00/Kazuhito00/master/image/icon200.jpg'
)

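Other: reading an animated GIF

Reading an animated GIF like the one below loads its first frame.
Note: Pillow's animated GIF support appears to be an experimental implementation at the moment. Reading an animated GIF made from live-action footage, for example, may produce a corrupted image.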

gif_image = imread_from_url(
    'https://user-images.githubusercontent.com/37477845/145334396-8995ec38-a886-4641-8636-fc855c1fcc32.gif'
)

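By default the first frame (0) is read, but passing the seek_index argument reads the specified frame instead.
If the value is equal to or greater than the number of frames, the last frame is read; a negative value falls back to the first frame.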

gif_image = imread_from_url(
    'https://user-images.githubusercontent.com/37477845/145334396-8995ec38-a886-4641-8636-fc855c1fcc32.gif',
    seek_index=50,
)
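
The frame selection rule can be summarised as a tiny pure function. This is a sketch that mirrors the if / elif / else branches in the implementation shown earlier; resolve_seek_index is a hypothetical name, not something the package exports.

```python
def resolve_seek_index(seek_index, n_frames):
    """Return the frame index that actually gets read.

    Hypothetical helper mirroring the seek logic of imread_from_url().
    """
    if 0 <= seek_index < n_frames:
        # In range: use the requested frame as-is.
        return seek_index
    if seek_index < 0:
        # Negative index: fall back to the first frame.
        return 0
    # Index at or past the end: fall back to the last frame.
    return n_frames - 1
```

For a GIF with 10 frames, resolve_seek_index(50, 10) gives 9, so seek_index=50 would read the last frame.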

Passing True as the debug argument prints information such as the format detected by Pillow, the number of frames, and the requested frame index.

gif_image = imread_from_url(
    'https://user-images.githubusercontent.com/37477845/145334396-8995ec38-a886-4641-8636-fc855c1fcc32.gif',
    seek_index=50,
    debug=True,
)

That's all.
