
How does Selenium actually take a screenshot?

Last updated at 2016-05-16, posted at 2015-05-03

How it started

While testing a site built with framesets, I decided to take screenshots to verify each operation.

File file = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
FileUtils.copyFile(file, new File(path));

Or:

 TakesScreenshot screen = (TakesScreenshot)driver;
 Path capture = captureDirectory.resolve(fileName);
 Files.write(capture, screen.getScreenshotAs(OutputType.BYTES));

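The only difference is whether you copy the image that was written to a temporary file or write the bytes out yourself.

When I checked the screenshots taken this way, the lower part of the page was missing: only the area visible in the browser window had been captured!

On pages that do not use a frameset, the entire page is captured.
(Both were taken at the same window size.)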

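Environment

Selenium 2.45.0
Firefox 37.0

It is well known that IE and Chrome show various quirks here, but I had assumed that Firefox, of all browsers, would capture the page normally, so this came as a shock.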

Investigation, part 1

I started from the basic question: how is the height of the image decided in the first place?
First, I looked up the method I was calling.

getScreenshotAs
public <X> X getScreenshotAs(OutputType<X> target)
Description copied from interface: TakesScreenshot
Capture the screenshot and store it in the specified location.
For WebDriver extending TakesScreenshot, this makes a best effort depending on the browser to return the following in order of preference:

・Entire page
・Current window
・Visible portion of the current frame
・The screenshot of the entire display containing the browser

For WebElement extending TakesScreenshot, this makes a best effort depending on the browser to return the following in order of preference:

・The entire content of the HTML element
・The visible portion of the HTML element

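Wait, "a best effort"?! (Loosely: "we'll try as hard as we can.") Really?!

Thinking it over

With either of the Java approaches above, all the Java side does is write the returned image data to a file; the test script never handles the image itself. So who does produce the image?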

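Investigation, part 2

The relationship between the test script, Selenium, and the browser is laid out clearly here:

From "入門、Selenium Seleniumの仕組み" (Introduction to Selenium: How Selenium Works)
https://app.codegrid.net/entry/selenium-1

I see. The API calls written in the test script talk to the browser's driver over the JsonWireProtocol, and the driver then operates the browser through "browser extensions or native OS features".

So how does the JsonWireProtocol define this?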

JsonWireProtocol
https://github.com/SeleniumHQ/selenium/wiki/JsonWireProtocol

/session/:sessionId/screenshot

GET /session/:sessionId/screenshot

Take a screenshot of the current page.
URL Parameters:
:sessionId - ID of the session to route the command to.
Returns:
{string} The screenshot as a base64 encoded PNG.
Potential Errors:
NoSuchWindow - If the currently selected window has been closed.

I see: what comes back is a PNG encoded as base64.
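To make "a base64 encoded PNG" concrete, here is a minimal Java sketch of what the client side has to do with such a string. It does not talk to a real driver: the `encode` method is only a stand-in that produces a payload of the shape the protocol describes, and the class and method names are my own (hypothetical), not part of Selenium.

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Base64;
import javax.imageio.ImageIO;

final class ScreenshotPayload {

    /** Decodes a base64 string (as described by the protocol) into an image. */
    static BufferedImage decode(String base64Png) {
        byte[] pngBytes = Base64.getDecoder().decode(base64Png);
        try {
            BufferedImage image = ImageIO.read(new ByteArrayInputStream(pngBytes));
            if (image == null) {
                throw new IllegalArgumentException("Payload is not a readable image");
            }
            return image;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stand-in for the driver side: PNG-encodes an image and wraps it in base64. */
    static String encode(BufferedImage image) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            ImageIO.write(image, "png", out);
            return Base64.getEncoder().encodeToString(out.toByteArray());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A blank 800x600 image plays the role of the screenshot here.
        String payload = encode(new BufferedImage(800, 600, BufferedImage.TYPE_INT_RGB));
        BufferedImage decoded = decode(payload);
        System.out.println(decoded.getWidth() + "x" + decoded.getHeight());
    }
}
```

The point is that the width and height are already baked into the PNG by the time it reaches the test script; nothing on the Java side decides them.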

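Thinking it over

That means the image is created by whoever sits at the other end of this connection. Who is that?
"Browser extensions" presumably means something like an add-on, in Firefox terms.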

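Investigation, part 3

I checked which extensions are installed in a Firefox instance launched through WebDriver.

There it is! FirefoxWebDriver is installed as an add-on!!

Unlike a normal launch, the profile used at run time is created as a temporary one, so I looked through about:config for the directory that had been created.

I found a likely candidate, so let's look inside.
As far as I know, Firefox add-ons are basically written in JavaScript, so I should be able to read at least some of it.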

$ pwd
/var/folders/fc/4_l4xmy914sfl5lsgc56j8880000gn/T/anonymous6720768172872236647webdriver-profile/extensions
$ ls -l
total 0
drwxr-xr-x  8 xxxxx  staff  272  5  3 21:49 fxdriver@googlecode.com
drwxr-xr-x  3 xxxxx  staff  102  5  3 21:49 webdriver-staging

This is the actual body of the extension.

$ cd fxdriver\@googlecode.com/
$ ls -l
total 24
-rw-r--r--   1 xxxxx  staff  4173  5  3 21:49 chrome.manifest
drwxr-xr-x  22 xxxxx  staff   748  5  3 21:49 components
drwxr-xr-x   6 xxxxx  staff   204  5  3 21:49 content
-rw-r--r--   1 xxxxx  staff  1301  5  3 21:49 install.rdf
drwxr-xr-x   5 xxxxx  staff   170  5  3 21:49 platform
drwxr-xr-x   5 xxxxx  staff   170  5  3 21:49 resource

Looking for the logic that does the work

$ cd components/
$ ls -l
total 6072
-rw-r--r--  1 xxxxx  staff  173172  5  3 21:49 bad-cert-listener.js
-rw-r--r--  1 xxxxx  staff  513787  5  3 21:49 command-processor.js
-rw-r--r--  1 xxxxx  staff  506863  5  3 21:49 driver-component.js
-rw-r--r--  1 xxxxx  staff  154612  5  3 21:49 httpd.js
-rw-r--r--  1 xxxxx  staff  367681  5  3 21:49 modifier-keys.js
-rw-r--r--  1 xxxxx  staff     197  5  3 21:49 nsICommandProcessor.xpt
-rw-r--r--  1 xxxxx  staff    1594  5  3 21:49 nsIHttpServer.xpt
-rw-r--r--  1 xxxxx  staff     220  5  3 21:49 nsINativeEvents.xpt
-rw-r--r--  1 xxxxx  staff     299  5  3 21:49 nsINativeIME.xpt
-rw-r--r--  1 xxxxx  staff     151  5  3 21:49 nsINativeKeyboard.xpt
-rw-r--r--  1 xxxxx  staff     265  5  3 21:49 nsINativeMouse.xpt
-rw-r--r--  1 xxxxx  staff     152  5  3 21:49 nsIResponseHandler.xpt
-rw-r--r--  1 xxxxx  staff  360083  5  3 21:49 prompt-service.js
-rw-r--r--  1 xxxxx  staff  366014  5  3 21:49 session-store.js
-rw-r--r--  1 xxxxx  staff  171974  5  3 21:49 session.js
-rw-r--r--  1 xxxxx  staff  435961  5  3 21:49 synthetic-mouse.js
-rw-r--r--  1 xxxxx  staff     214  5  3 21:49 wdICoordinate.xpt
-rw-r--r--  1 xxxxx  staff     326  5  3 21:49 wdIModifierKeys.xpt
-rw-r--r--  1 xxxxx  staff     412  5  3 21:49 wdIMouse.xpt
-rw-r--r--  1 xxxxx  staff     153  5  3 21:49 wdIStatus.xpt

Which one is it? There are not that many files with a .js extension. The JsonWireProtocol defined the command as screenshot, so I searched for that keyword.
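For reference, that kind of keyword search can also be scripted. The following is a rough Java sketch, not what I actually ran: the class and method names are hypothetical, and it only looks at .js files directly under the given directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class KeywordSearch {

    /** Returns the sorted names of the .js files directly under dir whose text contains keyword. */
    static List<String> jsFilesContaining(Path dir, String keyword) {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".js"))
                    .filter(p -> contains(p, keyword))
                    .map(p -> p.getFileName().toString())
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static boolean contains(Path file, String keyword) {
        try {
            // ISO-8859-1 maps every byte to a char, so large minified files never fail to decode.
            String text = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
            return text.contains(keyword);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass the extension's components directory as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        jsFilesContaining(dir, "screenshot").forEach(System.out::println);
    }
}
```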

I found it in command-processor.js.

fxdriver.screenshot = {};
fxdriver.screenshot.grab = function(a) {
  var b = a.document, c = b.documentElement;
  if (!c) {
    throw Error("Page is not loaded yet, try later");
  }
  var d = b.getElementById("fxdriver-screenshot-canvas");
  null == d && (d = b.createElement("canvas"), d.id = "fxdriver-screenshot-canvas", d.style.display = "none", c.appendChild(d));
  var e = c.scrollWidth;
  b.body && b.body.scrollWidth > e && (e = b.body.scrollWidth);
  c = c.scrollHeight;
  b.body && b.body.scrollHeight > c && (c = b.body.scrollHeight);
  32767 <= e && (e = 32766);
  32767 <= c && (c = 32766);
  d.width = e;
  d.height = c;
  try {
    var f = d.getContext("2d");
  } catch (g) {
    throw Error("Unable to get context - " + g);
  }
  try {
    f.drawWindow(a, 0, 0, e, c, "rgb(255,255,255)");
  } catch (h) {
    throw Error("Unable to draw window - " + h);
  }
  return d;
};

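It appears that a screenshot is produced by creating a canvas element, drawing the rendered window into it, and turning that into image data.

And the image height is apparently decided by this logic: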

var b = a.document, c = b.documentElement;
// (omitted)
c = c.scrollHeight;
b.body && b.body.scrollHeight > c && (c = b.body.scrollHeight);
32767 <= c && (c = 32766);
d.height = c;
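To make the effect of this logic easier to see, here is a minimal Java re-statement of the size calculation. It is only a sketch of my reading of the minified code: the class and method names are hypothetical, and the sample numbers in `main` are made up.

```java
final class CanvasSize {

    /** Values of 32767 and above are replaced by 32766 in the add-on code. */
    static final int LIMIT = 32767;

    /**
     * Mirrors the width/height calculation in fxdriver.screenshot.grab.
     *
     * @param rootScroll scrollWidth or scrollHeight of document.documentElement
     * @param bodyScroll the same property of document.body, or null if document.body is absent
     */
    static int dimension(int rootScroll, Integer bodyScroll) {
        int size = rootScroll;
        if (bodyScroll != null && bodyScroll > size) {
            size = bodyScroll; // the larger of the two wins
        }
        if (size >= LIMIT) {
            size = LIMIT - 1;  // clamp oversized pages
        }
        return size;
    }

    public static void main(String[] args) {
        // Made-up numbers: a 900px-tall viewport.
        System.out.println(dimension(900, 4000));   // body reports the full page height
        System.out.println(dimension(900, 900));    // body reports only the viewport height
        System.out.println(dimension(40000, null)); // oversized value gets clamped
    }
}
```

Whatever `scrollHeight` the two elements report is all this calculation has to go on, so if both report only the visible area, the canvas cannot be any taller than that.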

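Is this logic what the "a best effort" in the API documentation refers to?

When a frameset tag is used there is no body tag, so I looked at the frameset's scrollHeight in the DOM tab of Firebug.

It is the size of the browser's display area, not the size of the page.

Thinking it over

If this value had been the page size, there would have been no problem.

(My own guess) Was it done this way because the maximum page height is hard to determine when frames are used?

Who defines this value?
It is presumably set when the DOM is built, so Gecko?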

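Workarounds

Option 1) Overwrite this property just before taking the screenshot?

On closer inspection, this property is read-only. orz

Option 2) Implement this logic myself?

MDN: Drawing graphics with canvas

Rendering Web content into a canvas

This feature is only available when running code with Chrome privileges. It is not allowed in normal HTML pages. Read the source for the reasons.
Mozilla's canvas is extended with the drawWindow() method. This method draws a snapshot of the contents of a DOM window into the canvas. For example:

ctx.drawWindow(window, 0, 0, 100, 200, "rgb(255,255,255)");

"Only when running privileged code." I see...

A fundamental fix looks difficult, so it seems the only option is a surgical workaround, such as scrolling with JavaScript, capturing each part, and stitching the pieces together.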
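As a rough idea of the "stitching" half of such a workaround, here is a minimal Java sketch that stacks already-captured images vertically. It is not tied to Selenium: the scrolling and capturing steps are left out, the class and method names are hypothetical, and it assumes the tiles do not overlap (a real capture of the last, partially scrolled section would need cropping first).

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;
import java.util.Arrays;
import java.util.List;

final class TileStitcher {

    /** Stacks the tiles top to bottom; the result is as wide as the widest tile. */
    static BufferedImage stitchVertically(List<BufferedImage> tiles) {
        if (tiles.isEmpty()) {
            throw new IllegalArgumentException("no tiles to stitch");
        }
        int width = 0;
        int height = 0;
        for (BufferedImage tile : tiles) {
            width = Math.max(width, tile.getWidth());
            height += tile.getHeight();
        }
        BufferedImage result = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = result.createGraphics();
        try {
            int y = 0;
            for (BufferedImage tile : tiles) {
                g.drawImage(tile, 0, y, null);
                y += tile.getHeight();
            }
        } finally {
            g.dispose();
        }
        return result;
    }

    /** Helper for the demo: a single-colour tile standing in for one captured section. */
    static BufferedImage solid(int width, int height, Color color) {
        BufferedImage tile = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = tile.createGraphics();
        try {
            g.setColor(color);
            g.fillRect(0, 0, width, height);
        } finally {
            g.dispose();
        }
        return tile;
    }

    public static void main(String[] args) {
        BufferedImage page = stitchVertically(Arrays.asList(
                solid(100, 50, Color.RED), solid(100, 30, Color.BLUE)));
        System.out.println(page.getWidth() + "x" + page.getHeight());
    }
}
```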

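Addendum (2016/05/16)

meganetaaan has written an article with concrete code for the surgical workaround mentioned above, so, belatedly, I am adding a pointer to it here.

"Seleniumで撮ったスクリーンショットがブラウザごとにばらばら問題" (The problem of Selenium screenshots differing from browser to browser)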
