
Trimming Images and Converting Them to Monochrome (Taking Screenshots of Kindle for PC, Part 2)

Posted at 2022-05-23

Preface

https://qiita.com/dengax/items/d2f64d462715a640184b
This is a continuation of the article above.

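Two months since the last post

That took a bit too long. The script itself was finished fairly early on, but I kept wondering what to do with two-page spreads when capturing single pages, and whether to split the captures into single pages when capturing in spread view. I also wanted to improve the screenshot part a little more, and while I was looking into all of that, the post ended up sitting untouched.
Also, there wasn't much to write about...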

Source code

trim.py

import cv2
import numpy as np
import os
import pathlib

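gray_check_margins = (0, 0, 0, 0)  # margins skipped by the grayscale check (left, top, right, bottom)
trim_margins = (1, 1, 1, 1)  # margins skipped when auto-detecting the trim width (left, top, right, bottom)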

processing_dir = 'd:\\kss'

def imread(filename, flags=cv2.IMREAD_COLOR, dtype=np.uint8):
    try:
        n = np.fromfile(filename, dtype)
        img = cv2.imdecode(n, flags)
        return img
    except Exception as e:
        print(e)
        return None

def imwrite(filename, img, params=None):
    try:
        ext = os.path.splitext(filename)[1]
        result, n = cv2.imencode(ext, img, params)

        if result:
            with open(filename, mode='w+b') as f:
                n.tofile(f)
            return True
        else:
            return False
    except Exception as e:
        print(e)
        return False


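def glayscale_detect(img: np.ndarray, margin):
    # True if the three channels are identical everywhere inside the margins,
    # i.e. the image is effectively grayscale. margin is (left, top, right, bottom).
    sx, sy = img.shape[1], img.shape[0]
    nx, ny = margin[0], margin[1]
    xx, xy = sx - margin[2], sy - margin[3]
    region = img[ny:xy, nx:xx]
    return bool((region[:, :, 0] == region[:, :, 1]).all() and (region[:, :, 0] == region[:, :, 2]).all())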


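def comic_detect(img: np.ndarray, checkpoints: list, color):
    # True if every checkpoint (x, y) has exactly the given color
    for i in checkpoints:
        # For a color image the comparison is per channel, so reduce it with any();
        # using the bare array in an if statement raises ValueError
        if np.any(img[i[1], i[0]] != color):
            return False
    return True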


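def trim_check(img: np.ndarray, color, margin):
    # Find the horizontal extent of the content, i.e. the columns that contain
    # a pixel different from the background color.
    # margin is (left, top, right, bottom).
    sx, sy = img.shape[1], img.shape[0]
    nx, ny = margin[0], margin[1]
    xx, xy = sx - margin[2], sy - margin[3]
    def cmps(img, xrange, yrange, color, xdef):
        # Return the first x in xrange whose column differs from color, else xdef
        rt = xdef
        for x in xrange:
            if (img[yrange[0]:yrange[1], x] != color).any():
                rt = x
                break
        return rt
    lm = cmps(img, range(nx, xx), (ny, xy), color, sx)
    if lm == nx:
        lm = 0  # content touches the left margin, so keep the margin too
    rm = cmps(img, reversed(range(nx, xx)), (ny, xy), color, -1)
    if rm == xx - 1:
        rm = sx  # content touches the right margin, so keep the margin too
    else:
        rm += 1  # exclusive end for slicing; "not found" (-1) becomes 0
    return lm, rm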


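dirs = pathlib.Path(processing_dir)

for d in [x for x in dirs.iterdir() if x.is_dir()]:
    # glob() returns a generator, so build the file list once and reuse it
    files = [p for p in d.glob('**/*') if p.is_file()]
    if not files:
        continue
    hm = []
    for i in files:
        print(i)
        img = imread(str(i))
        if img is None:  # not a readable image
            continue
        hm += [trim_check(img, img[1, 1], trim_margins)]
    if not hm:
        continue
    # Use one trim range for the whole directory so every page keeps the same width
    ml = min([x[0] for x in hm])
    mr = max([x[1] for x in hm])
    print(str(d), ':trim =', ml, mr)
    if ml >= mr:  # every page is blank, nothing to trim
        continue
    for i in files:
        img = imread(str(i))
        if img is None:
            continue
        img = img[:, ml:mr]
        if img.ndim == 3 and img.shape[2] != 1:
            if glayscale_detect(img, gray_check_margins):
                img = img[:, :, 1]
                print(i, 'is grayscale')
        imwrite(str(i), img)  # overwrites the original file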

Usage

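gray_check_margins: pixels excluded from the grayscale check, as (left, top, right, bottom)
trim_margins: pixels excluded from the trim check, as (left, top, right, bottom)
processing_dir: the directory to process
Edit these values and run the script. It trims every image under each subdirectory of the specified directory, checks whether it is black and white, and overwrites the file in place.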

I did write a function that detects whether a page is a comic, but it is not actually used. Novels produce odd results, so please use this script only on comics.

About the script

OpenCV's imread/imwrite cannot handle Japanese file names well

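Last time I completely forgot about this and forced my way around it by changing the current directory. This time I used the code from
https://qiita.com/SKYS/items/cbde3775e2143cad7455
as is.
Personally I think there is no need to catch the exceptions, or that it would be better to re-raise them with raise e, but I left the code unchanged.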

Avoid for loops with ndarray as much as possible

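An ndarray can be sliced inside a comparison expression as well, so it is better to lean on ndarray features for this kind of check whenever possible. For example, glayscale_detect compares whole channel slices with == and reduces the result with .all() instead of looping over pixels.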
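As an illustration of the same idea, the column search in trim_check could also be written without a Python loop. This is only a sketch: content_columns is a hypothetical helper, not part of trim.py, and it ignores the margin handling of the original.

```python
import numpy as np


def content_columns(img, color):
    """Return (left, right) so that img[:, left:right] covers every column
    containing a non-background pixel, or None if the image is blank."""
    diff = img != color               # broadcast comparison against the color
    if diff.ndim == 3:
        diff = diff.any(axis=2)       # a pixel differs if any channel differs
    cols = np.flatnonzero(diff.any(axis=0))  # x indices of columns with content
    if cols.size == 0:
        return None
    return int(cols[0]), int(cols[-1]) + 1


# White 8-pixel-wide page with a black block in columns 2..5
page = np.full((5, 8, 3), 255, dtype=np.uint8)
page[1:4, 2:6] = 0
bounds = content_columns(page, page[0, 0])
print(bounds)  # (2, 6)
```

The loop version can stop at the first column with content, while this version always scans the whole image, so which one is faster depends on the pages.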

Python's built-in any and NumPy's any are slightly different

Python's built-in any and all are functions, whereas on an ndarray they are methods. It is a little confusing.
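A small sketch of where the difference shows up in practice. The array mask is made up for the example. On a two-dimensional array the built-in any iterates over rows, and each row is itself an array whose truth value is ambiguous, so it raises ValueError, while the method reduces over every element.

```python
import numpy as np

mask = np.array([[False, False],
                 [False, True]])

# ndarray methods reduce over every element of the array
any_method = bool(mask.any())
all_method = bool(mask.all())

# With axis they reduce along one dimension: here one result per column
per_column = mask.any(axis=0)

# The built-in any() iterates over rows; bool() of a row with more than
# one element is ambiguous, so NumPy raises ValueError
try:
    any(mask)
    builtin_failed = False
except ValueError:
    builtin_failed = True

print(any_method, all_method, per_column.tolist(), builtin_failed)
```

On a one-dimensional array the built-in functions do work, but the methods are the safer habit for image data.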
