
WebGL Advent Calendar 2017

GPGPU with WebGL (gpu.js / turbo.js / deeplearn.js / WebDNN)

Last updated at 2018-04-10, posted at 2017-12-16

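WebGL is often assumed to be useful only for computer graphics, but it can also be repurposed for general-purpose computing on GPUs (GPGPU), such as numerical computation that has nothing to do with graphics. This post introduces several libraries that implement GPGPU on top of WebGL. All of them use WebGL internally, but ~~since digging deeper would expose my ignorance~~ here I will stick to an overview of each library and how to use it.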

gpu.js

In gpu.js, you perform computation by writing a kernel function, a function that is executed on the GPU.

const gpu = new GPU();
const matMul = gpu.createKernel(function(a, b) {
    var sum = 0;
    for (var i = 0; i < this.constants.size; i++) {
        sum += a[this.thread.y][i] * b[i][this.thread.x];
    }
    return sum;
}, {
  constants: { size: 512 },
  output: [512, 512],
});

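gpu.js parses the function passed to createKernel (as a string) and generates GLSL code from its contents. The fragment shader template can be found in shader-frag.js. In my tests, let and const worked inside a kernel function, but arrow functions and the like did not appear to be supported.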
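Setting the GLSL generation aside, the execution model can be imitated on the CPU: the kernel body runs once per output element, with this.thread telling it which element it is responsible for. The following is only a rough sketch of that idea. createCpuKernel is a hypothetical helper written for this illustration, not part of gpu.js, and it loops sequentially where the GPU would run in parallel.

```javascript
// CPU-only sketch of the kernel model; createCpuKernel is a hypothetical
// helper, not part of gpu.js.
function createCpuKernel(fn, { output, constants = {} }) {
  const [width, height] = output;
  return (...args) => {
    const result = [];
    for (let y = 0; y < height; y++) {
      const row = [];
      for (let x = 0; x < width; x++) {
        // On the GPU every (x, y) runs in parallel; here we simply loop.
        const ctx = { thread: { x, y }, constants };
        row.push(fn.apply(ctx, args));
      }
      result.push(row);
    }
    return result;
  };
}

// Same kernel body as matMul above, shrunk to 2x2 so it is easy to verify.
const matMul2 = createCpuKernel(function (a, b) {
  let sum = 0;
  for (let i = 0; i < this.constants.size; i++) {
    sum += a[this.thread.y][i] * b[i][this.thread.x];
  }
  return sum;
}, { constants: { size: 2 }, output: [2, 2] });

const product = matMul2([[1, 2], [3, 4]], [[5, 6], [7, 8]]);
console.log(product); // [[19, 22], [43, 50]]
```

Because the kernel body only reads its inputs and returns one value, every element can be computed independently, which is what makes it a good fit for a fragment shader.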

The resulting kernel function is then called like an ordinary function.

const a = [];
const b = [];
for (let i = 0; i < 512; i++) {
  const p = [];
  const q = [];
  for (let j = 0; j < 512; j++) {
    p.push(Math.random());  // fill row i of a
    q.push(Math.random());  // fill row i of b
  }
  a.push(p);
  b.push(q);
}
const c = matMul(a, b);
console.log(c);

Combining kernels


const add = gpu.createKernel(function(a, b) {
  return a[this.thread.x] + b[this.thread.x];
}).setOutput([4]);

const multiply = gpu.createKernel(function(a, b) {
  return a[this.thread.x] * b[this.thread.x];
}).setOutput([4]);

const superKernel = gpu.combineKernels(add, multiply, function(a, b, c) {
  return multiply(add(a, b), c);
});
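For reference, here is what superKernel should compute, written with plain arrays. This is a minimal CPU sketch with illustrative names (addCpu, multiplyCpu, superKernelCpu), not gpu.js code; it only shows the element-wise result (a + b) * c that the combined kernel is expected to produce.

```javascript
// Plain-array equivalents of the add and multiply kernels (illustrative names).
const addCpu = (a, b) => a.map((v, i) => v + b[i]);
const multiplyCpu = (a, b) => a.map((v, i) => v * b[i]);

// Same composition as the function passed to combineKernels.
const superKernelCpu = (a, b, c) => multiplyCpu(addCpu(a, b), c);

const combined = superKernelCpu([1, 2, 3, 4], [1, 1, 1, 1], [2, 2, 2, 2]);
console.log(combined); // [4, 6, 8, 10]
```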

Rendering to a canvas

~~Wouldn't Three.js do just as well?~~

index.html

<canvas id="canvas" width="128" height="128"></canvas>

const canvas = document.getElementById('canvas');
const gpu = new GPU({ canvas });

const render = gpu.createKernel(function() {
  this.color(this.thread.x / 128, this.thread.y / 128, 0.5, 1);
}).setOutput([128, 128])
  .setGraphical(true);
    
render();

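This renders a 128×128 gradient: red increases with thread.x, green increases with thread.y, and blue is fixed at 0.5.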

turbo.js

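turbo.js is also a library for doing GPGPU through WebGL, but it is a much thinner layer than gpu.js: the source code is only about 200 lines. If you are comfortable with GLSL, this one may be easier to write for.

alloc allocates the memory (a Float32Array).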

const matSize = 512;
const mem = turbojs.alloc(matSize ** 2);
for (let i = 0; i < matSize ** 2; i++) {
  mem.data[i*4+0] = i + 1;  // matrix A
  mem.data[i*4+1] = i + 1;  // matrix B
}
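The loop above relies on each element occupying four consecutive floats (one vec4), so two matrices can share one buffer by using different channels. A small sketch of that layout with a plain Float32Array follows. packMatrices and unpackChannel are hypothetical helpers made up for this illustration, not turbo.js functions.

```javascript
// Pack two flat, row-major matrices into one buffer with 4 floats per element.
function packMatrices(matA, matB) {
  const data = new Float32Array(matA.length * 4);
  for (let i = 0; i < matA.length; i++) {
    data[i * 4 + 0] = matA[i]; // channel 0 (.r in GLSL): matrix A
    data[i * 4 + 1] = matB[i]; // channel 1 (.g in GLSL): matrix B
    // channels 2 and 3 stay 0 and remain free for results
  }
  return data;
}

// Read one channel back out as an ordinary array.
function unpackChannel(data, channel) {
  const out = [];
  for (let i = channel; i < data.length; i += 4) out.push(data[i]);
  return out;
}

const packed = packMatrices([1, 2, 3, 4], [5, 6, 7, 8]);
console.log(unpackChannel(packed, 1)); // [5, 6, 7, 8]
```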

run executes the computation. You simply toss in the GLSL code you want to run as a string.

turbojs.run(mem, `
    float getMatrixA(vec2 p) {
        return texture2D(u_texture, p).r;
    }
    float getMatrixB(vec2 p) {
        return texture2D(u_texture, p).g;
    }
    float mul(vec2 pos) {
        float result = 0.;
        for (int i = 0; i < ${matSize}; i++) {
            float q = (float(i) + 0.5) / ${matSize}.;
            float a = getMatrixA(vec2(pos.r, q));
            float b = getMatrixB(vec2(q, pos.g));
            result += a * b;
        }
        return result;
    }
    void main() {
        vec4 ipt = read();
        float res = mul(pos);
        commit(vec4(ipt.rg, res, 0));
    }
`);

const c = [];
for (let i=0; i < matSize**2; i++) {
  c.push(mem.data[i*4+2]);  // matrix C
}
console.log(c);

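The code passed to run is executed inside the fragment shader. In addition to this code, a few helpers are provided: vec2 pos, the coordinate being computed in the current GLSL invocation; vec4 read(void), which reads from memory; and void commit(vec4 val), which writes the result back to memory.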
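To see how the normalized coordinates in the shader map onto buffer indices, the shader above can be mimicked on the CPU. This is a sketch under assumptions: it assumes the buffer is viewed as an n×n texture, that pos is the centre of the current texel in the range 0 to 1, and that texture2D uses nearest-neighbour lookup. emulateRun is a made-up name, not a turbo.js function.

```javascript
// CPU sketch of the shader above; emulateRun is illustrative, not turbo.js API.
function emulateRun(data, n) {
  // Nearest-neighbour stand-in for texture2D(u_texture, p), with p in [0, 1).
  const texel = (px, py) => {
    const x = Math.min(n - 1, Math.floor(px * n));
    const y = Math.min(n - 1, Math.floor(py * n));
    return (y * n + x) * 4; // offset of that texel's vec4 in the flat buffer
  };
  const out = Float32Array.from(data);
  for (let y = 0; y < n; y++) {
    for (let x = 0; x < n; x++) {
      const pos = [(x + 0.5) / n, (y + 0.5) / n]; // assumed meaning of `pos`
      let result = 0;
      for (let i = 0; i < n; i++) {
        const q = (i + 0.5) / n;
        const a = data[texel(pos[0], q) + 0]; // getMatrixA: .r channel
        const b = data[texel(q, pos[1]) + 1]; // getMatrixB: .g channel
        result += a * b;
      }
      // commit(vec4(ipt.rg, res, 0)): keep r and g, write result to b, 0 to a
      out[(y * n + x) * 4 + 2] = result;
      out[(y * n + x) * 4 + 3] = 0;
    }
  }
  return out;
}

const n = 2;
const texture = new Float32Array(n * n * 4);
for (let i = 0; i < n * n; i++) {
  texture[i * 4 + 0] = i + 1; // matrix A
  texture[i * 4 + 1] = i + 1; // matrix B
}
const emulated = emulateRun(texture, n);
console.log([0, 1, 2, 3].map((i) => emulated[i * 4 + 2])); // [7, 10, 15, 22]
```

The (i + 0.5) / n terms are what the `(float(i) + 0.5) / ${matSize}.` expression in the shader does: they aim at the centre of texel i so that the lookup lands on exactly one element.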

deeplearn.js

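Addendum (4/11): As announced on the official site, deeplearn.js itself has been renamed tfjs-core, and development is moving over to TensorFlow.js. Since the library was built by Google, the same developer, to resemble TensorFlow in the first place, migrating should not be too confusing even though the API names differ. What follows is the original deeplearn.js content, kept as-is for reference.

deeplearn.js is a GPGPU library specialized for DNN computation. The official site offers plenty of Examples, and they are fun just to browse.

Data in deeplearn.js can be defined as anything from a Scalar to Array1D through Array4D.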

import { Scalar, Array1D, Array2D, Array3D, Array4D } from 'deeplearn';

const a = Array2D.new([2, 2], [1., 2., 3., 4.]);
const b = Array2D.new([2, 2], [0., 2., 4., 6.]);

// Scalar.new(data, dtype)
// Array1D.new(data, dtype)
// Array2D.new(shape: [number, number], data, dtype)
// Array3D.new(shape: [number, number, number], data, dtype)
// Array4D.new(shape: [number, number, number, number], data, dtype)

The following example computes the mean squared error between a and b.

import { NDArrayMathGPU } from 'deeplearn';

const math = new NDArrayMathGPU();
const diff = math.sub(a, b);
const squaredDiff = math.elementWiseMul(diff, diff);
const sum = math.sum(squaredDiff);
const size = Scalar.new(a.size);
const average = math.divide(sum, size);

console.log('MSE: ' + await average.val());

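You have probably noticed the await. In deeplearn.js, a computation is not evaluated until val() is called and the work is sent to the GPU. The same code can also be written as follows (the examples used to be presented this way).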

const math = new NDArrayMathGPU();
math.scope((keep, track) => {
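  // Create the inputs inside the scope; `const a = track(a)` would read
  // `a` before it is initialized and throw a ReferenceError.
  const a = track(Array2D.new([2, 2], [1., 2., 3., 4.]));
  const b = track(Array2D.new([2, 2], [0., 2., 4., 6.]));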

  const diff = math.sub(a, b);
  const squaredDiff = math.elementWiseMul(diff, diff);
  const sum = math.sum(squaredDiff);
  const size = Scalar.new(a.size);
  const average = math.divide(sum, size);

  console.log('MSE: ' + average.get());
});
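Both versions should print the same value. As a sanity check, here is the same calculation in plain JavaScript using the values of a and b from above. It is a CPU reference only and uses none of the deeplearn.js API.

```javascript
// Mean squared error over two flat arrays of equal length.
function meanSquaredError(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return sum / a.length;
}

// Same values as the two Array2D instances, flattened.
const mse = meanSquaredError([1, 2, 3, 4], [0, 2, 4, 6]);
console.log('MSE: ' + mse); // MSE: 1.5
```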

Graph

With the Graph API of deeplearn.js, you can describe a model in a TensorFlow-like style. The following sample classifies MNIST (a dataset of handwritten digits 0 to 9 drawn on 28×28 pixels).

import { Array1D, CheckpointLoader, Graph, NDArrayMathGPU, Session } from 'deeplearn';

const varLoader = new CheckpointLoader('.');
const vars = await varLoader.getAllVariables();

const g = new Graph();
const input = g.placeholder('input', [784]);  // 784 = 28 * 28
const hidden1W = g.constant(vars['hidden1/weights']);
const hidden1B = g.constant(vars['hidden1/biases']);
const hidden1 = g.relu(g.add(g.matmul(input, hidden1W), hidden1B));
const hidden2W = g.constant(vars['hidden2/weights']);
const hidden2B = g.constant(vars['hidden2/biases']);
const hidden2 = g.relu(g.add(g.matmul(hidden1, hidden2W), hidden2B));
const softmaxW = g.constant(vars['softmax_linear/weights']);
const softmaxB = g.constant(vars['softmax_linear/biases']);
const logits = g.add(g.matmul(hidden2, softmaxW), softmaxB);
const probs = g.argmax(logits);

const data = [0.0, 0.706, 0.996, ... ];  // 784-dimensional data
const math = new NDArrayMathGPU();
const sess = new Session(g, math);
math.scope(() => {
  const inputData = Array1D.new(data);
  const probsVal = sess.eval(probs, [{ tensor: input, data: inputData }]);
  console.log(`Number is ${probsVal.get()}`);
});

Given the directory where manifest.json is stored, CheckpointLoader loads the trained model for you. manifest.json has the following format.

manifest.json

{
  "hidden1/biases": {
    "filename": "hidden1_biases",
    "shape": [128]
  },
  "hidden1/weights": {
    "filename": "hidden1_weights",
    "shape": [784, 128]
  },
  ......
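The shape entries are what tie the manifest to the model code: a [784] input multiplied by a [784, 128] weight matrix yields 128 values, matching the [128] bias. As a small illustration, the number of values each variable holds can be derived from its shape. The object below copies only the two entries shown above, and the sizes variable is introduced just for this sketch.

```javascript
// Excerpt of the manifest above; only the two entries shown in the article.
const manifest = {
  'hidden1/biases': { filename: 'hidden1_biases', shape: [128] },
  'hidden1/weights': { filename: 'hidden1_weights', shape: [784, 128] },
};

// Number of values in each variable = product of its shape dimensions.
const sizes = Object.fromEntries(
  Object.entries(manifest).map(([name, { shape }]) =>
    [name, shape.reduce((count, dim) => count * dim, 1)])
);

console.log(sizes);
// { 'hidden1/biases': 128, 'hidden1/weights': 100352 }
```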

...that is the example I presented two months ago, but looking at the official sample now, the source code looks like the following. This does look easier to write.

const hidden1W = vars['hidden1/weights'];
const hidden1B = vars['hidden1/biases'];
const hidden2W = vars['hidden2/weights'];
const hidden2B = vars['hidden2/biases'];
const softmaxW = vars['softmax_linear/weights'];
const softmaxB = vars['softmax_linear/biases'];

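// x is assumed to be the 784-element input Array1D.
// await is not allowed inside a non-async callback, so the NDArray is
// returned from the scope (assumed to be kept alive) and awaited outside.
const label = math.scope(() => {
  const hidden1 = math.relu(math.add(
    math.vectorTimesMatrix(x, hidden1W), hidden1B
  ));
  const hidden2 = math.relu(math.add(
    math.vectorTimesMatrix(hidden1, hidden2W), hidden2B
  ));
  const logits = math.add(
    math.vectorTimesMatrix(hidden2, softmaxW), softmaxB
  );
  return math.argMax(logits);
});
const predictedLabel = Math.round(await label.val());
console.log(`Number is ${predictedLabel}`);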
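What the chain of vectorTimesMatrix, add, relu and argMax calls computes can be spelled out with plain arrays. The following is a CPU sketch only: the helper functions are stand-ins written for this illustration rather than the deeplearn.js implementations, and the weights are tiny made-up numbers, not a trained MNIST model.

```javascript
// Plain-array stand-ins for the math.* calls (illustrative, not deeplearn.js).
const vectorTimesMatrix = (v, m) =>
  m[0].map((_, j) => v.reduce((sum, vi, i) => sum + vi * m[i][j], 0));
const add = (a, b) => a.map((v, i) => v + b[i]);
const relu = (a) => a.map((v) => Math.max(0, v));
const argMax = (a) => a.indexOf(Math.max(...a));

// Forward pass: affine + ReLU for hidden layers, affine only for the last.
function predict(x, layers) {
  let h = x;
  layers.forEach(({ w, b }, idx) => {
    h = add(vectorTimesMatrix(h, w), b);
    if (idx < layers.length - 1) h = relu(h); // no ReLU on the logits
  });
  return argMax(h);
}

// Made-up weights: 2 inputs -> 2 hidden units -> 3 classes.
const toyLayers = [
  { w: [[1, -1], [0, 2]], b: [0, 0] },
  { w: [[1, 0, -1], [0, 1, 1]], b: [0, 0, 0.5] },
];

console.log(predict([1, 2], toyLayers)); // 1
```

The MNIST model above has the same structure with three weight/bias pairs (hidden1, hidden2, softmax_linear) and a 784-element input.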

WebDNN

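WebDNN is another library that runs DNN computation on WebGL. Unlike deeplearn.js, its main focus seems to be converting existing models (built with other libraries) and running them fast. Depending on browser support, it speeds things up by using WebAssembly or WebGPU as a backend in addition to WebGL (impressive).

Converting a model

Caffe, Chainer, Keras, PyTorch, and TensorFlow are supported, and converter scripts are provided for Caffe and Keras.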

python bin/convert_keras.py your_model.h5 --input_shape '(1,224,224,3)' --out output

Running

const runner = await WebDNN.load('./output');

runner.getInputViews()[0].set(
  await WebDNN.Image.getImageArray('./input_image.png')
);
await runner.run();
const pred = WebDNN.Math.argmax(runner.getOutputViews()[0].toActual());
console.log('Output', pred);

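To be honest, I have not properly tried this one out. Sorry...

Summary

Thanks to these libraries, I was grateful to be able to write high-performance code easily without knowing WebGL in detail. On the other hand, to understand the parallel computation specific to GPUs, I suspect the shortest path is to study WebGL (and GLSL) directly, for example by reading the source of these libraries.

I would like to master WebGL. Someday, surely, eventually...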
