
GaussianBlur in OpenCV.js was too slow, so I built my own in WebAssembly

Posted at 2021-12-01 (last updated at 2021-12-01)

~Hey, OpenCV, try a little harder! (Gaussian filter edition)~

Introduction

This post is the Day 2 entry of the OpenCV Advent Calendar 2021.

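I wanted to try image processing on the web, but the official precompiled OpenCV.js turned out to be slow, so as a first step I made the Gaussian filter faster. The code has not been cleaned up at all.
This article assumes that a WebAssembly build environment is already set up.
See here for the environment setup.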

Preparation

Environment

OS: Windows 10
WSL: Ubuntu 20.04.3 LTS (GNU/Linux 4.4.0-19041-Microsoft x86_64)
CPU: Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz 3.60 GHz
Test browser: Mozilla Firefox 92.0
OpenCV: 4.5.2 (https://docs.opencv.org/4.5.2/opencv.js)

Preparing OpenCV.js

For reference, here is how to download it and how to build it.

wget https://docs.opencv.org/4.5.2/opencv.js # official precompiled js file <- the baseline in this article
python ./opencv/platforms/js/build_js.py --emscripten $HOME/emsdk/upstream/emscripten --simd build_simd # SIMD build
python ./opencv/platforms/js/build_js.py --emscripten $HOME/emsdk/upstream/emscripten --build_wasm build_wasm # WebAssembly build

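So how fast is OpenCV in the first place?

What does "making it faster" even mean? Is OpenCV really slow? You may well be wondering.

Let me answer that. I measured it with lenna.png ($512\times 512$)!
The result was 310 ms (with $\sigma = 15$; a fairly large value makes the differences a little easier to see).

Left (or top, depending on the layout): original image. Right (or bottom): OpenCV output image.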

How OpenCV was measured

function cvGF(input, cvs_id, sigma) {
    let src = cv.imread(input);
    let dst = new cv.Mat();
    let size = new cv.Size(2 * 3 * sigma + 1, 2 * 3 * sigma + 1);
    cv.GaussianBlur(src, dst, size, sigma, sigma, cv.BORDER_DEFAULT);
    cv.imshow(cvs_id, dst);
    src.delete();
    dst.delete();
}
const cv_start = Date.now();
cvGF("input_image", "cv_Image", sigma);
const cv_end = Date.now();

Note: imshow ends up inside the function because of how the rest of the program is written, so the time measured here includes imshow as well.

Background (an aside)

WebAssembly

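WebAssembly is a binary bytecode format produced by compiling source code written in languages such as C, C++, and Rust.
It is supported by the major web browsers and can run in Chrome, Firefox, and others.
Put much more simply, WebAssembly is a compilation target designed to run at close to native-code speed.

If you are wondering why WebAssembly is fast in the first place (JavaScript being slow is one thing), see the references.
For a detailed description of WebAssembly, see the wiki.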

Gaussian Filter

$$
\bar{\boldsymbol{I}}(\boldsymbol{p}) = \frac{\sum_{\boldsymbol{q}\in N(\boldsymbol{p})}\exp{\left(\frac{-\|\boldsymbol{p}-\boldsymbol{q}\|_{2}^{2}}{2\sigma_{s}^2}\right)}\boldsymbol{I}(\boldsymbol{q})}{\sum_{\boldsymbol{q}\in N(\boldsymbol{p})}\exp{\left(\frac{-\|\boldsymbol{p}-\boldsymbol{q}\|_{2}^{2}}{2\sigma_{s}^2}\right)}}
$$

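$\boldsymbol{I}(\boldsymbol{p})$ is the intensity value at pixel $\boldsymbol{p}$, and $N(\boldsymbol{p})$ is the set of pixels inside the kernel window around $\boldsymbol{p}$.
You will often see $2\pi\sigma_s^2$ used as the normalizing denominator. That is right for the ideal continuous Gaussian, but filtering is done on a discrete, truncated window, so the formula here divides by the sum of the weights instead.
(Out of laziness, every method except the SIMD one uses the $2\pi$ form.)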
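To make the normalization point concrete, here is a minimal sketch that builds the truncated 2D kernel both ways and sums the weights. The function names `make_kernel_2d` and `kernel_sum` are hypothetical and do not appear in the code of this article; the radius is assumed to be $3\sigma$ as elsewhere in the post.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Build a (2r+1) x (2r+1) Gaussian kernel truncated at radius r.
// use_two_pi == true : divide by the continuous constant 2*pi*sigma^2.
// use_two_pi == false: divide by the sum of the sampled weights.
std::vector<float> make_kernel_2d(float sigma, int r, bool use_two_pi)
{
    assert(sigma > 0.f && r >= 0);
    const double kPi = 3.14159265358979323846;
    const int ksize = 2 * r + 1;
    std::vector<float> k(ksize * ksize);
    double sum = 0.0;
    for (int j = -r; j <= r; j++) {
        for (int i = -r; i <= r; i++) {
            const float w = std::exp(-(float)(j * j + i * i) / (2.f * sigma * sigma));
            k[(j + r) * ksize + (i + r)] = w;
            sum += w;
        }
    }
    const double norm = use_two_pi ? 2.0 * kPi * sigma * sigma : sum;
    for (float& w : k) w = (float)(w / norm);
    return k;
}

// Sum of all weights; an image of constant brightness keeps its brightness
// only if this is 1.
double kernel_sum(const std::vector<float>& k)
{
    double s = 0.0;
    for (float w : k) s += w;
    return s;
}
```

With `sigma = 15` and `r = 45`, the $2\pi$ variant sums to slightly less than 1 because the tails beyond $3\sigma$ are cut off, so the output comes out marginally darker; the sum-normalized variant sums to 1 up to rounding.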

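There are several ways to speed this up; this time I used the separable Gaussian filter.

Still, let's measure the plain Gaussian filter as well, just in case.
The result: a whopping 4164 ms... Even allowing for the fact that my implementation is garbage, that is quite something. Incidentally, it uses a LUT for the kernel weights.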

GaussianfilterWithoutSIMD

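// Headers this excerpt relies on (the original listing omits them).
#include <cmath>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <emscripten.h>

// Assumption: the full source defines this global elsewhere. It hands the output
// pointer and its size back to JavaScript via getResultPtr/getResultSize.
int result[2];

EMSCRIPTEN_KEEPALIVE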
extern "C" void GaussianFilterwithoutSIMD(float* src, int width, int height, int channels, float sigma)
{
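	size_t size = width * height * channels * sizeof(float); // input size in bytes
	// Heap-allocate the 8-bit output: its address is handed back to JavaScript through
	// result[], so it must outlive this call (a stack array would not). calloc also
	// zeroes the unfiltered border. The caller is responsible for freeing it.
	uint8_t* dst_array = (uint8_t*)calloc(size / sizeof(float), 1);
	uint8_t* dst = &dst_array[0];
	int r = (int)(3 * sigma); // kernel radius: 3 sigma, truncated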
	float* begin = src;
	float kernel[2 * r + 1][2 * r + 1];
	for (int j = -r; j <= r; j++)
	{
		for (int i = -r; i <= r; i++)
		{
			kernel[j + r][i + r] = expf(-(j * j + i * i) / (2 * sigma * sigma)) / (2 * M_PI * sigma * sigma);
		}
	}
	int wstep = channels * width; // number of floats per image row
	for (int c = 0; c < channels-1; c++) // color channels only; alpha is filled in afterwards
	{
		for (int j = r; j < height - r; j++)
		{
			float* src_p = src + c + j * wstep + channels * r;
			dst = &dst_array[0] + c + j * wstep + channels * r;
			for (int i = r; i < width - r; i++)
			{
				float sum = 0;
				for (int l = -r; l <= r; l++)
				{
					for (int k = -r; k <= r; k++)
					{
						sum += kernel[l + r][k + r] * (*(src_p + channels * k + l * wstep));
					}
				}
				*dst = (uint8_t)sum;
				src_p += channels;
				dst += channels;
			}
		}
	}

	
	// Set the alpha channel (index 3 of each RGBA pixel) to fully opaque.
	dst = &dst_array[0] + 3;
	for (int j = 0; j < height * width; j++)
	{
		*dst = 255;
		dst += channels;
	}
	result[0] = (int)&dst_array[0];
	result[1] = size / sizeof(float); // the output holds 1 byte per channel value
}

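What is a separable Gaussian filter?

See the references for details, but in short the idea is to split the filtering into one pass along the vertical direction and one pass along the horizontal direction.
The nice part is the cost: with kernel radius r, ordinary filtering costs $O(r^2)$ per pixel, whereas this method only runs a convolution of $2r+1$ taps twice, so the cost drops to $O(r)$.
Either order is valid, but in this article I filter horizontally first and then vertically.
(Apparently this approach is nonsense when it comes to optimization... I noticed too late, so I settled on horizontal-then-vertical order throughout.)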
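The reason the split is valid is that the 2D Gaussian weight factors into a product of two 1D weights, since $\exp(-(x^2+y^2)/2\sigma^2) = \exp(-x^2/2\sigma^2)\exp(-y^2/2\sigma^2)$. The following is a minimal sketch that checks this numerically and counts the multiply-adds per pixel; all function names are hypothetical and are not part of the code in this article.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// 1D Gaussian weights for offsets -r..r, normalized so that they sum to 1.
std::vector<float> gaussian_1d(float sigma, int r)
{
    assert(sigma > 0.f && r >= 0);
    std::vector<float> k(2 * r + 1);
    double sum = 0.0;
    for (int i = -r; i <= r; i++) {
        k[i + r] = std::exp(-(float)(i * i) / (2.f * sigma * sigma));
        sum += k[i + r];
    }
    for (float& w : k) w = (float)(w / sum);
    return k;
}

// Largest absolute difference between the sum-normalized 2D kernel and the
// outer product of two 1D kernels. It should be at rounding-error level.
float max_separability_error(float sigma, int r)
{
    const std::vector<float> k1 = gaussian_1d(sigma, r);
    double sum2d = 0.0;
    for (int j = -r; j <= r; j++)
        for (int i = -r; i <= r; i++)
            sum2d += std::exp(-(float)(j * j + i * i) / (2.f * sigma * sigma));
    float max_err = 0.f;
    for (int j = -r; j <= r; j++) {
        for (int i = -r; i <= r; i++) {
            const float w2d =
                (float)(std::exp(-(float)(j * j + i * i) / (2.f * sigma * sigma)) / sum2d);
            max_err = std::fmax(max_err, std::fabs(w2d - k1[j + r] * k1[i + r]));
        }
    }
    return max_err;
}

// Multiply-adds per output pixel and channel for each approach.
int taps_2d(int r) { return (2 * r + 1) * (2 * r + 1); }
int taps_separable(int r) { return 2 * (2 * r + 1); }
```

For the setting used here ($\sigma = 15$, so $r = 45$), `taps_2d(45)` is 8281 while `taps_separable(45)` is 182, which is where most of the gap between the two non-SIMD timings should come from.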

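The code is listed below.
Border handling is not the point here, so it is left out.
The result: 134 ms.
Wait, did I win!? Not quite: the borders are not processed at all.
(Aside: I rewrote this just before writing the article, and I am fairly sure the version I wrote long ago was slower... What happened?)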
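For reference, border handling mostly comes down to mapping an out-of-range index back into the image. The sketch below shows two common reflection rules. `cv.BORDER_DEFAULT`, used in the OpenCV measurement above, is the same as `BORDER_REFLECT_101` in OpenCV. The helper names `reflect` and `reflect101` are hypothetical and are not taken from the code in this article.

```cpp
#include <cassert>

// BORDER_REFLECT style:     f e d c b a | a b c d e f | f e d c b a
// The edge sample itself is repeated.
int reflect(int idx, int len)
{
    assert(len > 0);
    while (idx < 0 || idx >= len) {
        if (idx < 0) idx = -idx - 1;
        else idx = 2 * len - idx - 1;
    }
    return idx;
}

// BORDER_REFLECT_101 style: g f e d c b | a b c d e f g | f e d c b a
// The edge sample is the mirror axis and is not repeated.
int reflect101(int idx, int len)
{
    assert(len > 0);
    if (len == 1) return 0; // nothing to mirror around
    while (idx < 0 || idx >= len) {
        if (idx < 0) idx = -idx;
        else idx = 2 * (len - 1) - idx;
    }
    return idx;
}
```

A border-aware pass would read `row[reflect101(x + k - r, width)]` instead of `row[x + k - r]` for the pixels within `r` of an edge, and keep the plain indexing for the interior so the fast path stays branch-free.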

SeparableGaussianfilterWithoutSIMD

EMSCRIPTEN_KEEPALIVE
extern "C" void SeparableGaussianFilterwithoutSIMD(float* src, int width, int height, int channels, float sigma)
{
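	size_t size = width * height * channels * sizeof(float); // input size in bytes
	// Heap-allocate the 8-bit output: its address is handed back to JavaScript through
	// result[], so it must outlive this call (a stack array would not). calloc also
	// zeroes the unfiltered border. The caller is responsible for freeing it.
	uint8_t* dst_array = (uint8_t*)calloc(size / sizeof(float), 1);
	// Intermediate buffer for the first pass, initialized with a copy of the input.
	float* tmp_p;
	tmp_p = (float*)malloc(size);
	memcpy(tmp_p, src, size);
	uint8_t* dst = &dst_array[0];
	int r = (int)(3 * sigma); // kernel radius: 3 sigma, truncated
	float* begin = src;
	float* tmp_begin = &tmp_p[0];
	float kernel[2 * r + 1]; // 1D weight table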

	for (int j = -r; j <= r; j++)
	{
		kernel[j + r] = expf(-((float)j * j) / (2 * sigma * sigma)) / powf(2 * M_PI * sigma * sigma, 0.5);
	}

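	for (int c = 0; c < channels - 1; c++) // color channels only; alpha is filled in afterwards
	{
		// First pass: each tap steps by one full row (channels * width floats),
		// so this pass convolves along the vertical direction.
		src = begin + c + r * (width + 1) * channels;
		tmp_p = tmp_begin + c + r * (width + 1) * channels;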
		for (int j = r; j < height - r; j++)
		{
			src = begin + c + (j * width + r) * channels;
			tmp_p = tmp_begin + c + (j * width + r) * channels;
			for (int i = r; i < width - r; i++)
			{
				float sum = 0;
				for (int k = -r; k <= r; k++)
				{
					sum += kernel[r + k] * (*(src + channels * k * width));
				}
				*tmp_p = sum;
				tmp_p += channels;
				src += channels;
			}
			tmp_p += 2 * r * channels;
			src += 2 * r * channels;
		}

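		// Second pass: each tap steps by one pixel (channels floats) within a row,
		// so this pass convolves along the horizontal direction.
		tmp_p = tmp_begin + c + r * (width + 1) * channels;
		dst = &dst_array[0] + c + r * (width + 1) * channels;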
		for (int j = r; j < height - r; j++)
		{
			for (int i = r; i < width - r; i++)
			{
				float sum = 0;

				for (int k = -r; k <= r; k++)
				{
					sum += kernel[k + r] * (*(tmp_p + channels * k));
				}

				*dst = (uint8_t)sum;
				tmp_p += channels;
				dst += channels;
			}
			tmp_p += 2 * r * channels;
			dst += 2 * r * channels;
		}
	}

	// Set the alpha channel (index 3 of each RGBA pixel) to fully opaque.
	dst = &dst_array[0] + 3;
	for (int j = 0; j < height * width; j++)
	{
		*dst = 255;
		dst += channels;
	}
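	free(tmp_begin); // release the intermediate buffer
	result[0] = (int)&dst_array[0];
	result[1] = size / sizeof(float); // the output holds 1 byte per channel value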
}

EMSCRIPTEN_KEEPALIVE
extern "C" int getResultPtr()
{
	return result[0];
}

EMSCRIPTEN_KEEPALIVE
extern "C" int getResultSize()
{
	return result[1];
}
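One detail worth noting in the listings above: results are stored with a plain `(uint8_t)sum` cast, which drops the fractional part, and converting a float outside the 0 to 255 range that way is undefined behavior in C++. It is harmless as long as the weights sum to at most 1 and the input is in range, but a rounding, clamping conversion is safer. The sketch below is one way to write it; `to_u8` is a hypothetical helper name and is not part of the original code.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Convert a filtered float value to an 8-bit channel value:
// round to nearest and clamp to [0, 255] instead of truncating.
uint8_t to_u8(float v)
{
    if (!(v > 0.f)) return 0;      // negative values, zero, and NaN
    if (v >= 255.f) return 255;    // clamp the upper end
    return (uint8_t)std::lround(v);
}
```

The store in the filter would then read `*dst = to_u8(sum);`. The extra comparisons cost a little, so whether it is worth it inside the timed loop is a judgment call.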

Test code

javascript

function separableGFwithoutSIMD(cvs, sigma){
    const height = cvs.height;
    const width = cvs.width;
    var ctx = cvs.getContext("2d");

    const data = ctx.getImageData(0, 0, width, height);
    const bytes = 4; // sizeof(type)
    const b = data.data.length * bytes;
    const buf = Module._mm_malloc(b);
    Module.HEAPF32.set(data.data, buf / 4);
    //Module._GaussianFilter(buf, width, height, 4, sigma);
    //Module._GaussianFilterwithoutSIMD(buf, width, height, 4, sigma);
    Module._SeparableGaussianFilterwithoutSIMD(buf, width, height, 4, sigma);
    let result_ptr = Module._getResultPtr();
    let size = Module._getResultSize();
    show(output_canvas, result_ptr, width, height, size);
    // memory is released after this point (omitted)
}

const start = Date.now();
separableGFwithoutSIMD(canvas, sigma);
const end = Date.now();

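Separable Gaussian filter with SIMD

This is the main part.
Sorry it took so long to get here.
Everything up to now was just playing around; this is where I put in real effort.
The loop structure can still be improved a great deal, and I will get to it someday, but since it is already faster than the versions above I am publishing it now.

The loops run a horizontal filter followed by a vertical filter. Someone close to me has tested roughly 60 variants of the separable Gaussian filter (I could never think of that many...), and when I asked for advice I was told this still has a long way to go.

Left (or top, depending on the layout): my implementation. Right (or bottom): OpenCV.
Naturally, they look the same.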

GaussianFilterwithSIMD

inline int border_s(const int val) { return (val >= 0) ? val : -val - 1; }
inline int border_sv(const int val) { return (val >= 0) ? val : -val - 4; }
inline int border_e(const int val, const int maxval) { return (val <= maxval) ? val : 2 * maxval - val + 1; }
inline int border_ev(const int val, const int maxval) { return (val <= maxval) ? val : 2 * maxval - val + 4; }
inline int get_simd_ceil(int val, int simdwidth)
{
	int v = (val % simdwidth == 0) ? val : (val / simdwidth + 1) * simdwidth;
	return v;
}

EMSCRIPTEN_KEEPALIVE
extern "C" void GaussianFilter(float* src, int width, int height, int channels, float sigma)
{
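	const size_t size = width * height * channels; // number of channel values
	const int r = (int)(3 * sigma); // kernel radius: 3 sigma, truncated
	const int ksize = 2 * r + 1;
	const float norm = -1.f / (2.f * sigma * sigma);
	// src_mat is only an alias of the caller's buffer. Allocating a separate block
	// here and then overwriting the pointer would leak that block.
	float* src_mat = src;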
	float* kernel = (float*)_mm_malloc(sizeof(float) * ksize, 32);
	uint8_t* dst = (uint8_t*)_mm_malloc(sizeof(uint8_t) * size, 32);
	float* dst_f = (float*)_mm_malloc(sizeof(float) * size, 32);
	uint8_t* ptr = &dst[0]; // output
	float* s = src;
	const int wstep = width * 4;
	const int wstep0 = 0 * wstep;
	const int wstep1 = 1 * wstep;
	const int wstep2 = 2 * wstep;
	const int wstep3 = 3 * wstep;
	const int wstep4 = 4 * wstep;
	double sum = 0.0;
	for (int j = -r, index = 0; j <= r; j++)
	{
		float v = exp((j * j) * norm);
		sum += (double)v;
		kernel[index++] = v;
	}
	for (int j = -r, index = 0; j <= r; j++)
	{
		kernel[index] = (float)(kernel[index] / sum);
		index++;
	}

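	// r rounded up to a multiple of 4, and the same width in floats (4 floats per RGBA pixel).
	// Note: the horizontal pass offsets its window by R, so the kernel is centered on the
	// output pixel only when r is already a multiple of 4.
	const int r_ = get_simd_ceil(r, 4);
	const int R = 4*r_;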

	// h filter
	float* d = &dst_f[0];
	for (int j = 0; j < height; j++)
	{

		{
			float* si = s;
			for (int i = 0; i < R; i += 4)
			{
				__m128 mv = wasm_f32x4_const(0.f, 0.f, 0.f, 0.f);
				for (int k = 0; k < ksize; k++)
				{
					int idx = border_sv(i + 4 * k - R);
					__m128 ms = (idx >= 0) ? wasm_v128_load(si + idx) : wasm_v128_load(si);
					__m128 mg = wasm_f32x4_splat(kernel[k]);
					__m128 tmp = wasm_f32x4_mul(ms, mg);
					mv = wasm_f32x4_add(tmp, mv);
				}
				d[i] = mv[0];
				d[i + 1] = mv[1];
				d[i + 2] = mv[2];
			}
		}
		for (int i = R; i < 4 * width - R; i += 4)
		{
			__m128 mv = wasm_f32x4_const(0.f, 0.f, 0.f, 0.f);
			float* si = s + i - R;
			for (int k = 0; k < ksize; k++)
			{
				__m128 ms = wasm_v128_load(si); // b, g, r, alpha
				__m128 mg = wasm_f32x4_splat(kernel[k]);
				__m128 tmp = wasm_f32x4_mul(ms, mg);
				mv = wasm_f32x4_add(tmp, mv);
				si += 4;
			}
			wasm_v128_store(d + i, mv);
		}
		{
			float* si = s;
			for (int i = wstep - R; i < wstep; i += 4)
			{
				__m128 mv = wasm_f32x4_const(0.f, 0.f, 0.f, 0.f);
				for (int k = 0; k < ksize; k++)
				{
					int idx = border_ev(i + 4 * k - R, wstep - 4);
					__m128 ms = (idx >= 0) ? wasm_v128_load(si + idx) : wasm_v128_load(si + wstep - 4);
					__m128 mg = wasm_f32x4_splat(kernel[k]);
					__m128 tmp = wasm_f32x4_mul(ms, mg);
					mv = wasm_f32x4_add(tmp, mv);
				}
				d[i + 0] = mv[0];
				d[i + 1] = mv[1];
				d[i + 2] = mv[2];
			}
		}

		s += wstep;
		d += wstep;
	}

	// v filter
	// A buffer of width `height` and height 1, i.e. an array of `height` elements
	float* buffer_line_rows = (float*)_mm_malloc(sizeof(float) * height, 32);
	float* b = &buffer_line_rows[0];

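	// i advances by 4 per iteration in total: the three i++ in the body (one per
	// color channel) plus this increment, which steps past the alpha channel.
	for (int i = 0; i < width * 4; i += 1)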
	{
		for (int j = 0; j < height; j++) b[j] = dst_f[j * wstep + i];
		ptr = &dst[i];
		for (int j = 0; j < r_; j++)
		{
			float v = 0.f;
			float* si = &b[0];
			for (int k = 0; k < ksize; k++)
			{
				int idx = border_s(j + k - r);
				v += (idx >= 0) ? kernel[k] * si[idx] : kernel[k] * b[0];
			}
			*ptr = v;
			ptr += wstep;
		}
		for (int j = r_; j < height - r_; j += 4)
		{
			__m128 mv = wasm_f32x4_const(0.f, 0.f, 0.f, 0.f);
			float* bi = b + j - r;
			for (int k = 0; k < ksize; k++)
			{
				__m128 ms = wasm_v128_load(bi);
				__m128 mg = wasm_f32x4_splat(kernel[k]);
				__m128 tmp = wasm_f32x4_mul(ms, mg);
				mv = wasm_f32x4_add(tmp, mv);
				bi++;
			}

			ptr[wstep0] = (uint8_t)mv[0];
			ptr[wstep1] = (uint8_t)mv[1];
			ptr[wstep2] = (uint8_t)mv[2];
			ptr[wstep3] = (uint8_t)mv[3];
			ptr += wstep4;
		}
		for (int j = height - r_; j < height; j++)
		{
			float v = 0.f;
			float* si = &b[0];
			for (int k = 0; k < ksize; k++)
			{
				int idx = border_e(j + k - r, height-1);
				v += (idx >= 0) ? kernel[k] * si[idx] : kernel[k] * b[0];
			}
			*ptr = v;
			ptr += wstep;
		}
		i++;
		for (int j = 0; j < height; j++) b[j] = dst_f[j * wstep + i];
		ptr = &dst[i];
		for (int j = 0; j < r_; j++)
		{
			float v = 0.f;
			float* si = &b[0];
			for (int k = 0; k < ksize; k++)
			{
				int idx = border_s(j + k - r);
				v += (idx >= 0) ? kernel[k] * si[idx] : kernel[k] * b[0];
			}
			*ptr = v;
			ptr += wstep;
		}
		for (int j = r_; j < height - r_; j += 4)
		{
			__m128 mv = wasm_f32x4_const(0.f, 0.f, 0.f, 0.f);
			float* bi = b + j - r;
			for (int k = 0; k < ksize; k++)
			{
				__m128 ms = wasm_v128_load(bi);
				__m128 mg = wasm_f32x4_splat(kernel[k]);
				__m128 tmp = wasm_f32x4_mul(ms, mg);
				mv = wasm_f32x4_add(tmp, mv);
				bi++;
			}

			ptr[wstep0] = (uint8_t)mv[0];
			ptr[wstep1] = (uint8_t)mv[1];
			ptr[wstep2] = (uint8_t)mv[2];
			ptr[wstep3] = (uint8_t)mv[3];
			ptr += wstep4;
		}
		for (int j = height - r_; j < height; j++)
		{
			float v = 0.f;
			float* si = &b[0];
			for (int k = 0; k < ksize; k++)
			{
				int idx = border_e(j + k - r, height-1);
				v += (idx >= 0) ? kernel[k] * si[idx] : kernel[k] * b[0];
			}
			*ptr = v;
			ptr += wstep;
		}
		i++;
		for (int j = 0; j < height; j++) b[j] = dst_f[j * wstep + i];
		ptr = &dst[i];
		for (int j = 0; j < r_; j++)
		{
			float v = 0.f;
			float* si = &b[0];
			for (int k = 0; k < ksize; k++)
			{
				int idx = border_s(j + k - r);
				v += (idx >= 0) ? kernel[k] * si[idx] : kernel[k] * b[0];
			}
			*ptr = v;
			ptr += wstep;
		}
		for (int j = r_; j < height - r_; j += 4)
		{
			__m128 mv = wasm_f32x4_const(0.f, 0.f, 0.f, 0.f);
			float* bi = b + j - r;
			for (int k = 0; k < ksize; k++)
			{
				__m128 ms = wasm_v128_load(bi);
				__m128 mg = wasm_f32x4_splat(kernel[k]);
				__m128 tmp = wasm_f32x4_mul(ms, mg);
				mv = wasm_f32x4_add(tmp, mv);
				bi++;
			}

			ptr[wstep0] = (uint8_t)mv[0];
			ptr[wstep1] = (uint8_t)mv[1];
			ptr[wstep2] = (uint8_t)mv[2];
			ptr[wstep3] = (uint8_t)mv[3];
			ptr += wstep4;
		}
		for (int j = height - r_; j < height; j++)
		{
			float v = 0.f;
			float* si = &b[0];
			for (int k = 0; k < ksize; k++)
			{
				int idx = border_e(j + k - r, height - 1);
				v += (idx >= 0) ? kernel[k] * si[idx] : kernel[k] * b[0];
			}
			*ptr = v;
			ptr += wstep;
		}
		i++;
		ptr = &dst[i];

		for (int j = 0; j < height; j += 4)
		{
			ptr[wstep0] = 255;
			ptr[wstep1] = 255;
			ptr[wstep2] = 255;
			ptr[wstep3] = 255;
			ptr += wstep4;
		}
	}

	_mm_free(kernel);
	_mm_free(buffer_line_rows);
	_mm_free(dst_f);
	_mm_free(src_mat); // src_mat aliases src, so this releases the input buffer allocated by the caller

	usingMemory();
	result[0] = (int)&dst[0];
	result[1] = size * sizeof(uint8_t);
}
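The core idea of the horizontal pass above is that one interleaved RGBA pixel fills exactly one 4-lane vector, so each kernel tap is a single load, a splat of the weight, a multiply, and an add. The following portable sketch mimics that with a plain struct standing in for the 128-bit register (roughly the roles of `wasm_v128_load`, `wasm_f32x4_splat`, `wasm_f32x4_mul`, and `wasm_f32x4_add` in the listing). `Vec4`, `load`, `splat`, `mul_add`, and `hfilter_row` are hypothetical names, and the sketch only handles interior pixels.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a 128-bit SIMD register holding one RGBA pixel as 4 floats.
struct Vec4 {
    float lane[4];
};

inline Vec4 load(const float* p) { return {{p[0], p[1], p[2], p[3]}}; }
inline Vec4 splat(float v) { return {{v, v, v, v}}; }
inline Vec4 mul_add(Vec4 a, Vec4 b, Vec4 acc)
{
    for (int c = 0; c < 4; c++) acc.lane[c] += a.lane[c] * b.lane[c];
    return acc;
}

// Horizontal pass over one row of interleaved RGBA floats.
// kernel has 2*r+1 taps. Pixels closer than r to either edge are left untouched.
void hfilter_row(const float* src, float* dst, int width, const std::vector<float>& kernel)
{
    const int r = (int)kernel.size() / 2;
    assert((int)kernel.size() == 2 * r + 1);
    for (int x = r; x < width - r; x++) {
        Vec4 acc = splat(0.f);
        const float* p = src + 4 * (x - r); // leftmost pixel of the window
        for (int k = 0; k <= 2 * r; k++, p += 4)
            acc = mul_add(load(p), splat(kernel[k]), acc);
        for (int c = 0; c < 4; c++) dst[4 * x + c] = acc.lane[c];
    }
}
```

Here the window starts at `x - r`, so tap `k = r` always lands on the output pixel. The vertical pass cannot reuse this layout directly because consecutive rows are `4 * width` floats apart, which is why the listing first copies each column into a contiguous line buffer.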

Summary of results

| | OpenCV | GF without SIMD | Separable without SIMD | Separable with SIMD |
|---|---|---|---|---|
| Time (ms) | 310 | 4164 | 134 | 65 |

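Thoughts

I am thinking about building filters other than the Gaussian one too, but I cannot promise I will feel like it, so please bear with me.
There is still code cleanup to do (turning things into classes and so on) and plenty of room for improvement, so I would like to keep fixing things.
Advice... would make me happy! Complaints... please be gentle.

(One thing I felt strongly while running this experiment: OpenCV cannot really be this slow, so it is probably not being computed with SIMD... Sorry...
In the worst case, is it compiled to asm.js? It is a binary so I cannot really tell, but I will build it in various configurations and try them.)

Also, I did not parallelize with threads this time, but since OpenCV has a build option for it, I plan to parallelize my code so that I can compare against that as well.

The GitHub repository, under active development, is here.

There is an interesting-looking paper on the differences between Chrome and Firefox and on the gap to native code, so let me share it. Above all, the title is a wonderful hook!
##### Not So Fast!
I want to use that line somewhere myself.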

References

Speeding up image processing (box filter)
Why WebAssembly is fast
SIMD-related

Memory leaks

Multithreading

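Addendum

Just to be sure, I checked the build information with cv.getBuildInformation(), and this is what it showed.
It is exactly what I suspected: Baseline under CPU/HW features is empty, Parallel framework is none, and the flags include -s USE_PTHREADS=0, which suggests the official build uses neither SIMD nor threads.
Before building other filters, I will write an article comparing against OpenCV built with SIMD and with multithreading.

Build information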

General configuration for OpenCV 4.5.2 =====================================
  Version control:               4.5.2

  Platform:
    Timestamp:                   2021-04-02T11:31:57Z
    Host:                        Linux 4.4.0-197-generic x86_64
    Target:                      Emscripten 1 x86
    CMake:                       3.10.2
    CMake generator:             Unix Makefiles
    CMake build tool:            /usr/bin/make
    Configuration:               Release

  CPU/HW features:
    Baseline:

  C/C++:
    Built as dynamic libs?:      NO
    C++ standard:                11
    C++ Compiler:                /opt/emsdk-portable/upstream/emscripten/em++  (ver 10.0.0)
    C++ flags (Release):         -s USE_PTHREADS=0    -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winconsistent-missing-override -Wno-delete-non-virtual-dtor -Wno-unnamed-type-template-args -Wno-comment -fdiagnostics-show-option -Qunused-arguments -ffunction-sections -fdata-sections  -fvisibility=hidden -fvisibility-inlines-hidden -DNDEBUG -O2  -DNDEBUG
    C++ flags (Debug):           -s USE_PTHREADS=0    -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winconsistent-missing-override -Wno-delete-non-virtual-dtor -Wno-unnamed-type-template-args -Wno-comment -fdiagnostics-show-option -Qunused-arguments -ffunction-sections -fdata-sections  -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
    C Compiler:                  /opt/emsdk-portable/upstream/emscripten/emcc
    C flags (Release):           -s USE_PTHREADS=0    -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winconsistent-missing-override -Wno-delete-non-virtual-dtor -Wno-unnamed-type-template-args -Wno-comment -fdiagnostics-show-option -Qunused-arguments -ffunction-sections -fdata-sections  -fvisibility=hidden -fvisibility-inlines-hidden -DNDEBUG -O2  -DNDEBUG
    C flags (Debug):             -s USE_PTHREADS=0    -fsigned-char -W -Wall -Werror=return-type -Werror=non-virtual-dtor -Werror=address -Werror=sequence-point -Wformat -Werror=format-security -Wmissing-declarations -Wmissing-prototypes -Wstrict-prototypes -Wundef -Winit-self -Wpointer-arith -Wshadow -Wsign-promo -Wuninitialized -Winconsistent-missing-override -Wno-delete-non-virtual-dtor -Wno-unnamed-type-template-args -Wno-comment -fdiagnostics-show-option -Qunused-arguments -ffunction-sections -fdata-sections  -fvisibility=hidden -fvisibility-inlines-hidden -g  -O0 -DDEBUG -D_DEBUG
    Linker flags (Release):      -Wl,--gc-sections -O2 
    Linker flags (Debug):        -Wl,--gc-sections  
    ccache:                      NO
    Precompiled headers:         NO
    Extra dependencies:
    3rdparty dependencies:       zlib libprotobuf quirc

  OpenCV modules:
    To be built:                 calib3d core dnn features2d flann imgproc js objdetect photo video
    Disabled:                    highgui imgcodecs ml stitching videoio world
    Disabled by dependency:      -
    Unavailable:                 gapi java python2 python3 ts
    Applications:                -
    Documentation:               js
    Non-free algorithms:         NO

  GUI: 

  Media I/O: 
    ZLib:                        build (ver 1.2.11)
    JPEG 2000:                   build (ver 2.4.0)
    HDR:                         YES
    SUNRASTER:                   YES
    PXM:                         YES
    PFM:                         YES

  Video I/O:

  Parallel framework:            none

  Other third-party libraries:
    VA:                          NO
    Custom HAL:                  NO
    Protobuf:                    build (3.5.1)

  Python (for build):            /usr/bin/python

  Install to:                    /build/master-contrib_docs-lin64/build/js/install
-----------------------------------------------------------------
