
Let Python handle your repetitive tasks with PyAutoGui

Last updated at 2021-01-05, posted at 2018-06-12

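Introduction

PyAutoGui, a GUI automation module for Python, looked useful for all sorts of things, so I tried it out.
(2018/11/26: added a note that the threshold (confidence) is available only when the opencv_python module can be used)
(2019/06/05: verified on Python 3.7; the same steps work)
(2019/08/09: made try ~ except ImageNotFoundException usable)
(2020/11/04: confirmed that PyAutoGUI does not work on Python 3.9; stay on 3.8 for the time being)
(2021/01/04: confirmed that it now works on Python 3.9. One caveat: when running on Windows, the character encoding apparently has to be specified strictly.)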

PyAutoGui
https://pyautogui.readthedocs.io/en/latest/

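If you are interested, see also my articles on automating Windows with PyAutoGui and other tools:
Replacing UWSC with Python (1): Environment setup
Replacing UWSC with Python (2): Replacing functions
Replacing UWSC with Python (3): Cheat sheet [1]
Replacing UWSC with Python (4): Cheat sheet [2]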

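What it can do

Mouse control (move / buttons / scroll)
Keyboard control (text input / key presses)
Alert window control (ordinary application windows cannot be manipulated)
Image matching (bmp/jpg/png supported). Note: fuzzy matching is available when the opencv_python, pillow, and Image modules are installed
Pixel color retrieval (color at a specific coordinate or within a region)
Screenshots (full screen / region)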

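What's good

Cross-platform

Rough conclusion

Because it automates by image recognition, I recommend combining it with other libraries such as selenium.
On its own, it is limited to tasks that can be handled by matching images on screen.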

Installation

Linux (Debian-based)

sudo apt install python3 scrot python3-tk python3-dev
pip3 install python3-xlib
pip3 install Image
pip3 install pillow
pip3 install pyscreeze
pip3 install PyTweening
pip3 install opencv_python
pip3 install pyautogui

Note: on Red Hat-based systems the IUS repository provides Python 3.6, so either install that or build Python 3.7 from the source tarball.

MacOSX

brew install python
pip3 install pyobjc-core
pip3 install pyobjc
pip3 install Image
pip3 install pillow
pip3 install pyscreeze
pip3 install PyTweening
pip3 install opencv_python
pip3 install pyautogui

# If installing pyobjc and friends fails on OS X 10.11 or later, the following may help
# Change the MACOSX_DEPLOYMENT_TARGET value to match your OS version:
# 10.11 (El Capitan), 10.12 (Sierra), 10.13 (High Sierra), 10.14 (Mojave), 10.15 (Catalina)

MACOSX_DEPLOYMENT_TARGET=10.11 pip install pyobjc

Windows

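Download Python-3.7.3.exe from the link below and run it, setting the install location to C:\Python37\
https://www.python.org/downloads/

Everything else can be installed with pip, so run the following from Command Prompt or PowerShell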

C:\Python37\Scripts\pip.exe install pyautogui

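On Windows it may also be worth installing the COM/Win32 modules, because they let you find out where a window is. pywin32 should already include win32gui, but on Windows 10 window handling mysteriously failed unless I also installed win32gui separately (are they actually different packages?).
Note: if you do not attach to Windows COM, win32gui turned out to be unnecessary.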

C:\Python37\Scripts\pip.exe install pywin32
C:\Python37\Scripts\pip.exe install win32gui
C:\Python37\Scripts\pip.exe install Image
C:\Python37\Scripts\pip.exe install pillow
C:\Python37\Scripts\pip.exe install pyscreeze
C:\Python37\Scripts\pip.exe install PyTweening
C:\Python37\Scripts\pip.exe install opencv_python
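
After running the install commands, it can be handy to confirm which modules Python can actually see. The following is a minimal sketch using only the standard library; the helper name `missing_modules` and the `MODULES` list are illustrative assumptions, and the list uses import names rather than pip names (`cv2` for opencv_python, `PIL` for pillow).

```python
import importlib.util

# Import names (not pip names) of the modules installed above.
# "cv2" is provided by opencv_python and "PIL" by pillow.
MODULES = ["pyautogui", "pyscreeze", "pytweening", "cv2", "PIL"]


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(MODULES)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All modules found")
```

The check only looks the modules up without importing them, so it also works on a machine without a display.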

Try it in interpreter mode

On Mac/Linux, open a terminal and run

python

On Windows, open PowerShell or Command Prompt and run

C:\Python37\python.exe

Either way you get the Python prompt, which should look like this:

>>>

Enter the following lines there.

import pyautogui
import sys
import time
screen_x,screen_y = pyautogui.size()
curmus_x,curmus_y = pyautogui.position()
print (u"The u prefix marks a unicode string; it helps when multibyte text gets garbled on Python 2 and is optional on Python 3")
print (u"Screen size [" + str(screen_x) + "]/[" + str(screen_y) + "]")
print (u"Current mouse position [" + str(curmus_x) + "]/[" + str(curmus_y) + "]")
center_x = screen_x / 2
center_y = screen_y / 2
print (u"Screen center [" + str(center_x) + "]/[" + str(center_y) + "]")
pyautogui.moveTo(center_x, center_y, duration=2)
print (u"Did the mouse move to the center over 2 seconds? duration= is the time the move takes [sec]")

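Entering the lines one at a time retrieves the screen size and mouse coordinates, moves the pointer, does the arithmetic, and prints the text.
To leave interpreter mode, press Ctrl+D (Mac/Linux) or Ctrl+Z followed by Enter (Windows), or type exit(). Ctrl+C only interrupts the current input.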
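
One detail of the session above: in Python 3 the `/` operator always returns a float, so the center is printed as something like 960.0. If integer pixel coordinates are preferred, floor division is one option. The helper name `screen_center` and the 1920x1080 size below are assumptions for illustration, not part of PyAutoGUI.

```python
def screen_center(width, height):
    """Return the center of a width x height screen as integer pixels."""
    return width // 2, height // 2


# With PyAutoGUI this would be called as screen_center(*pyautogui.size());
# a fixed, hypothetical resolution is used here so the example runs anywhere.
center_x, center_y = screen_center(1920, 1080)
print("Screen center [" + str(center_x) + "]/[" + str(center_y) + "]")
# prints: Screen center [960]/[540]
```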

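Creating and running a script file

Next, instead of typing at the prompt, let's write a script file and run it.
Any editor will do, but remember to save the file with UTF-8 encoding and LF line endings.
UTF-8 also covers characters outside the Basic Multilingual Plane, which take at most 4 bytes each. Do not save scripts as UTF-16LE or UTF-32, because Python cannot read source files in those encodings.
On Windows the script also ran when saved as SJIS, but retrieved strings were sometimes garbled.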
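
The same care applies to files the script reads or writes at run time. On Windows the default encoding of `open()` follows the system locale, so one way to avoid garbled text is to pass `encoding` explicitly. This is a minimal sketch; the helper names `write_text` and `read_text` and the file name are hypothetical.

```python
import os
import tempfile


def write_text(path, text):
    """Write text as UTF-8 with LF line endings, regardless of the platform default."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        f.write(text)


def read_text(path):
    """Read the file back, again naming the encoding explicitly."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


if __name__ == "__main__":
    # Hypothetical file name in the system temp directory
    path = os.path.join(tempfile.gettempdir(), "pyautogui_encoding_check.txt")
    write_text(path, "電卓\n")
    print(read_text(path) == "電卓\n")
```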

mouse_action.py

# -*- coding: utf-8 -*-

## PyAutoGUI module
import pyautogui

## OS-related modules for controlling processes
import re
import os
import subprocess
import sys
import time
import array

## Win32 modules for UI information and control
import win32api
import win32gui
import win32con

## On Mac or Linux, instead of the three modules above, use
# import re
# import subprocess
# as follows (xwininfo is an X11 tool)
#
# from subprocess import Popen, PIPE
# cmd = "xwininfo -name (window name)"
# p = Popen(cmd.split(' '),stdout = PIPE, stderr = PIPE)
# ret = str(p.communicate())
# coord = re.search(r'X:\s+(\d+)[^Y]+Y:\s+(\d+)',ret)
# appwin_x,appwin_y = coord.groups()
#
## This gives you the top-left corner (appwin_x,appwin_y) of the application window
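# Main routine
if __name__ == "__main__":
    # Wait before starting (seconds)
    time.sleep(1)
    # Get the screen size
    screen_x,screen_y = pyautogui.size()
    
    # Park the mouse at (1,1)
    pyautogui.moveTo(1, 1, duration=1)
    
    # Look up the window by its title with win32gui
    # FindWindow('class name', 'window title') returns the window handle; pass None if the class is unknown
    # A well-known example would be ('#32770', "Save As")
    # The title depends on the OS language: "電卓" on Japanese Windows, "Calculator" on English Windows
    window_title = "電卓"
    parent_handle = win32gui.FindWindow(None, window_title)

    # If no handle was found, launch Calculator
    if parent_handle == 0 :
        cmd = r'start C:\Windows\System32\calc.exe'
        subprocess.Popen(cmd, shell=True)
        time.sleep(3)
        parent_handle = win32gui.FindWindow(None, window_title)
    
    if parent_handle == 0 :
        print(u"The application does not seem to have started; aborting")
        sys.exit()

    # With a handle, get the top-left and bottom-right coordinates and bring the window to the front
    # Buttons and input fields inside the application can also be reached with some effort, but it gets too complicated with win32gui
    # So stick to the application coordinates and top-level window information
    if parent_handle > 0 :
        win_x1,win_y1,win_x2,win_y2 = win32gui.GetWindowRect(parent_handle)
        print(u"Application position:"+str(win_x1)+"/"+str(win_y1))
        apw_x = win_x2 - win_x1
        apw_y = win_y2 - win_y1
        print(u"Application window size:"+str(apw_x)+"/"+str(apw_y))
        print(u"Bringing the application to the front")
        win32gui.SetForegroundWindow(parent_handle)
        # Fetch the window's full title and class name for reference
        titlebar = win32gui.GetWindowText(parent_handle)
        classname = win32gui.GetClassName(parent_handle)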

    # There are drag functions, but mouseDown plus moveTo actually works better
    # An application's mouse-over behavior does not react to dragTo/dragRel (conversely, use dragTo/dragRel when you do not want it to react)
    print(u"move")
    pyautogui.moveTo(win_x1+40,win_y1+4, duration=1)
    print(u"grab")
    pyautogui.mouseDown(win_x1+40,win_y1+4, button='left')
    print(u"move")
    pyautogui.moveTo(100,200, duration=1)
    print(u"move")
    pyautogui.moveTo(110,100, duration=1)
    print(u"move")
    pyautogui.moveTo(120,300, duration=1)
    print(u"move")
    pyautogui.moveTo(130,100, duration=1)
    print(u"move")
    pyautogui.moveTo(140,300, duration=1)
    print(u"move")
    pyautogui.moveTo(150,100, duration=1)
    print(u"move")
    pyautogui.moveTo(160,300, duration=1)
    print(u"move")
    pyautogui.moveTo(170,100, duration=1)
    print(u"move")
    pyautogui.moveTo(180,300, duration=1)
    print(u"move")
    pyautogui.moveTo(190,100, duration=1)
    print(u"move")
    pyautogui.moveTo(200,300, duration=1)
    print(u"release")
    pyautogui.mouseUp(210,200, button='left')
    print(u"Done. You can close the window now")
    time.sleep(30)

So this is just a simple script that launches Calculator, grabs its window, and wiggles it around.

python ./mouse_action.py

Run it with a command like the one above.
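
The ten alternating moveTo calls in mouse_action.py (x from 110 to 200, y switching between 100 and 300) could also be generated in a loop. The sketch below is one possible refactoring, not the original script; `zigzag_points` and `drag_along` are hypothetical helpers, and the move function is passed in so the waypoint logic can be checked without a display.

```python
def zigzag_points(x_start=110, x_end=200, x_step=10, y_top=100, y_bottom=300):
    """Return (x, y) waypoints whose y alternates between y_top and y_bottom."""
    points = []
    for i, x in enumerate(range(x_start, x_end + 1, x_step)):
        y = y_top if i % 2 == 0 else y_bottom
        points.append((x, y))
    return points


def drag_along(points, move_to):
    """Call move_to(x, y) for each waypoint; the caller supplies move_to."""
    for x, y in points:
        move_to(x, y)


if __name__ == "__main__":
    # Dry run: print the waypoints instead of moving the mouse
    drag_along(zigzag_points(), lambda x, y: print("move", x, y))
```

In the script, the call would sit between mouseDown and mouseUp, for example `drag_along(zigzag_points(), lambda x, y: pyautogui.moveTo(x, y, duration=1))`.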

Enabling ImageNotFoundException

Make ImageNotFoundException catchable with try ~ except so that the script can react when an image is not found.
It is used like this:

try_search_image.py

import sys
import os
import time
import pyautogui as pg

# pyscreeze setting (raise "ImageNotFoundException" when the image is not found)
import pyscreeze
from pyscreeze import ImageNotFoundException
pyscreeze.USE_IMAGE_NOT_FOUND_EXCEPTION = True

# Keep a log
sys.stdout = open("pyautogui.log", "w")

# Function that returns the on-screen coordinates of an image file
def get_locate_from_filename(filename):
    locate = None
    while locate is None:
        time.sleep(0.1)
        # Search in grayscale (match at 95% confidence)
        locate = pg.locateCenterOnScreen(filename, grayscale=True,confidence=0.950)
        # Search in full color (slower)
        #locate = pg.locateCenterOnScreen(filename)
    return locate

# Main routine
if __name__ == "__main__":

    # Get the screen size
    screen_x,screen_y = pg.size()

    # Park the mouse at (1,1)
    pg.moveTo(1, 1, duration=1)

    # Search for the image file on screen and click it
    nsec    = 0
    timeout = 5
    while True:
        try:
            button_position = get_locate_from_filename('search.png')
            pg.click(button_position)
            break
        except ImageNotFoundException: 
            time.sleep(1)
            nsec += 1
            if nsec > timeout:
                pg.alert(text='Timeout', button='OK')
                break

With this, processing can continue even when the image is not found.
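
The retry-until-timeout pattern in try_search_image.py can also be factored into a reusable helper. The following is a sketch under assumptions: `wait_for_image` is a hypothetical name, the locate function and the exception type are passed in by the caller, and the deadline is measured with `time.monotonic()` instead of counting loop iterations.

```python
import time


def wait_for_image(locate, timeout=5.0, interval=1.0, not_found=Exception, sleep=time.sleep):
    """Call locate() until it returns a position or `timeout` seconds have passed.

    Returns the position, or None on timeout. Exceptions of type `not_found`
    are treated as "not on screen yet"; the default catches any Exception,
    so pass a narrower type where one is available.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            position = locate()
            if position is not None:
                return position
        except not_found:
            pass
        if time.monotonic() >= deadline:
            return None
        sleep(interval)
```

With the imports from try_search_image.py, a call might look like `wait_for_image(lambda: pg.locateCenterOnScreen('search.png', grayscale=True, confidence=0.95), not_found=ImageNotFoundException)`, followed by `pg.click(...)` when the result is not None.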

A rough cheat sheet

These are notes I jotted down while testing. For accurate information, read the official documentation.
They are only personal memos.

cheatsheet.py


# Get the screen resolution
# As separate values
max_x,max_y = pyautogui.size()
# As a single tuple
pos         = pyautogui.size()

# Get the current mouse position
cur_x,cur_y = pyautogui.position()
pos         = pyautogui.position()

# Image recognition
img_x,img_y = pyautogui.locateCenterOnScreen('search.png')
pos         = pyautogui.locateCenterOnScreen('search.png')

# Move the mouse: give the destination (x,y) and the time to take (duration) in seconds
# Note: the top-left corner is (0,0); with FAILSAFE enabled, moving there raises FailSafeException, so park the pointer at (1,1) instead
pyautogui.moveTo(img_x, img_y, duration=2)

# Move relative to the current position; duration is again in seconds
pyautogui.moveRel(xOffset, yOffset, duration=num_seconds)

# Drag and drop
# Give the drop destination (x,y) and the time to take (duration) in seconds
# dragTo presses the button, moves, and releases it in a single call
pyautogui.dragTo(x, y, duration=num_seconds)

# Drag relative to the current position by the given offset
# dragRel also releases the button at the end of the call
pyautogui.dragRel(xOffset, yOffset, duration=num_seconds)

# Mouse click: a left click under default settings; this depends on the system, so take care when the mouse buttons are swapped
pyautogui.click(x,y)

# Mouse click with repeat settings: besides the position, you can specify the number of clicks, the interval between them, and the button
pyautogui.click(x,y, clicks=num_of_clicks, interval=secs_between_clicks, button='left')

# Other mouse clicks
pyautogui.rightClick(x, y)
pyautogui.middleClick(x, y)
pyautogui.doubleClick(x, y)
pyautogui.tripleClick(x, y)

# Mouse wheel
pyautogui.scroll(amount_to_scroll, x, y)

# Press and hold, then release
pyautogui.mouseDown(x, y, button='left')
pyautogui.mouseUp(x, y, button='left')

# Sending a keyboard shortcut
pyautogui.hotkey('ctrl', 'c')
pyautogui.hotkey('ctrl', 'v')

# Pressing and releasing a key
pyautogui.keyDown(key_name)
pyautogui.keyUp(key_name)

# ■■■■ Usage samples ■■■■

# Check whether the coordinates are inside the screen
if pyautogui.onScreen(img_x, img_y):
    pyautogui.moveTo(img_x, img_y, duration=2)
    # pyautogui.moveRel(xOffset, yOffset, duration=num_seconds)

if pyautogui.locateCenterOnScreen('search.png'):
    # Drag and drop if the image is found
    pyautogui.dragTo(x, y, duration=num_seconds)
    #pyautogui.dragRel(xOffset, yOffset, duration=num_seconds)

if pyautogui.locateCenterOnScreen('search.png'):
    # Click if the image is found
    pyautogui.click(img_x, img_y)
    #pyautogui.click(x, y, clicks=num_of_clicks, interval=secs_between_clicks, button='left')

# The threshold (confidence) can be specified only when the opencv module is installed (pip install opencv_python)
# In the example below, a 95% match counts as found
# Specifying grayscale=True matches in grayscale, which is roughly 30% faster
if pyautogui.locateCenterOnScreen('search.png',grayscale=True,confidence=0.950):
    # Click if the image is found
    pyautogui.click(img_x, img_y)

# Pause 2.5 seconds after every PyAutoGUI call
pyautogui.PAUSE = 2.5

# With FAILSAFE enabled, moving the cursor to screen coordinate [0,0] deliberately raises FailSafeException
pyautogui.FAILSAFE = True

# Mouse movement
pyautogui.moveTo(700, 500) #X, Y
pyautogui.moveTo(None, 100, 2) # keep the current X and move to Y=100 over 2 seconds
pyautogui.moveRel(-50, 0) # relative coordinates
pyautogui.dragTo(415, 508, button = 'left')

# Drag from the current position by the given offset and release
pyautogui.dragRel(-20, -20, button = 'left')
pyautogui.click()
pyautogui.click(button='right', clicks=2, interval=0.25)
pyautogui.doubleClick() 
pyautogui.mouseDown(button='left') 
pyautogui.mouseUp(button='left') 
pyautogui.scroll(10)
pyautogui.hscroll(10)

# Press shift twice at 0.5-second intervals
pyautogui.press('shift', presses=2, interval=0.5)
pyautogui.keyDown('shift')
pyautogui.keyUp('shift')
pyautogui.hotkey('ctrl','c') # pressed together
pyautogui.typewrite('hello world!')

# Key names that can be specified (also available as pyautogui.KEYBOARD_KEYS)
KEYBOARD_KEYS = ['\t', '\n', '\r', ' ', '!', '"', '#', '$', '%', '&', "'", '(',
')', '*', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', ':', ';', '<', '=', '>', '?', '@', '[', '\\', ']', '^', '_', '`',
'a', 'b', 'c', 'd', 'e','f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o',
'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '{', '|', '}', '~',
'accept', 'add', 'alt', 'altleft', 'altright', 'apps', 'backspace',
'browserback', 'browserfavorites', 'browserforward', 'browserhome',
'browserrefresh', 'browsersearch', 'browserstop', 'capslock', 'clear',
'convert', 'ctrl', 'ctrlleft', 'ctrlright', 'decimal', 'del', 'delete',
'divide', 'down', 'end', 'enter', 'esc', 'escape', 'execute', 'f1', 'f10',
'f11', 'f12', 'f13', 'f14', 'f15', 'f16', 'f17', 'f18', 'f19', 'f2', 'f20',
'f21', 'f22', 'f23', 'f24', 'f3', 'f4', 'f5', 'f6', 'f7', 'f8', 'f9',
'final', 'fn', 'hanguel', 'hangul', 'hanja', 'help', 'home', 'insert', 'junja',
'kana', 'kanji', 'launchapp1', 'launchapp2', 'launchmail',
'launchmediaselect', 'left', 'modechange', 'multiply', 'nexttrack',
'nonconvert', 'num0', 'num1', 'num2', 'num3', 'num4', 'num5', 'num6',
'num7', 'num8', 'num9', 'numlock', 'pagedown', 'pageup', 'pause', 'pgdn',
'pgup', 'playpause', 'prevtrack', 'print', 'printscreen', 'prntscrn',
'prtsc', 'prtscr', 'return', 'right', 'scrolllock', 'select', 'separator',
'shift', 'shiftleft', 'shiftright', 'sleep', 'space', 'stop', 'subtract', 'tab',
'up', 'volumedown', 'volumemute', 'volumeup', 'win', 'winleft', 'winright', 'yen',
'command', 'option', 'optionleft', 'optionright']

## Message boxes
pyautogui.alert(text='', title='', button='OK')
pyautogui.confirm(text='', title='', buttons=['OK', 'Cancel'])
pyautogui.prompt(text='', title='' , default='')
pyautogui.password(text='', title='', default='', mask='*')

## Taking screenshots
# Full screen
pyautogui.screenshot('my_screenshot.png')
# Rectangular region
pyautogui.screenshot(region=(0,0, 300, 400))

## Image recognition
# As separate values
pos_x,pos_y,size_x,size_y = pyautogui.locateOnScreen('target.png')
# As a single tuple
pos                       = pyautogui.locateOnScreen('target.png')

# Get the center coordinates of the matched box (center takes the box, not a file name)
pos_x,pos_y = pyautogui.center(pos)

## Restricting the search area (region)
# pyautogui.locateOnScreen('someButton.png', region=(0,0, 300, 400))

## Ignoring color (grayscale)
# pyautogui.locateOnScreen('someButton.png', grayscale=True, region=(0,0, 300, 400))

## Match threshold (confidence) [0.0-1.0], requires opencv_python; locateOnScreen does not take a tolerance argument
# pyautogui.locateOnScreen('someButton.png', grayscale=True, confidence=0.9, region=(0,0, 300, 400))
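
The region argument expects (left, top, width, height), whereas win32gui.GetWindowRect in mouse_action.py returns (left, top, right, bottom). A small conversion like the one below could be used to limit an image search to one application window. `rect_to_region` is a hypothetical helper and the rectangle is made-up sample data.

```python
def rect_to_region(rect):
    """Convert (left, top, right, bottom) into (left, top, width, height)."""
    left, top, right, bottom = rect
    return (left, top, right - left, bottom - top)


# Hypothetical rectangle, standing in for win32gui.GetWindowRect(parent_handle)
region = rect_to_region((100, 50, 400, 300))
print(region)
# prints: (100, 50, 300, 250)
```

The result could then be passed along as `pyautogui.locateOnScreen('someButton.png', region=region)`.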

Having tried it, my impression is that it is lightweight, so it suits simple automation. I hope this helps!
