
Miscellaneous CUDA notes

CUDA

Last updated at 2018-12-11, posted at 2016-07-22

List of compute capabilities

Timing

Measure elapsed time using CUDA events.

#include <cstdio>
#include <cuda_runtime.h>

// Measures elapsed GPU time between start() and stop() using CUDA events.
struct Timer {
	float elapsed;
	cudaEvent_t m_start_event;
	cudaEvent_t m_stop_event;
	Timer() {
		elapsed = 0.f;
		cudaEventCreate(&m_start_event);
		cudaEventCreate(&m_stop_event);
	}
	~Timer()
	{
		cudaEventDestroy(m_start_event);
		cudaEventDestroy(m_stop_event);
	}
	void start()
	{
		// Record the start event on the default stream.
		cudaEventRecord(m_start_event, 0);
	}
	void stop()
	{
		// Record the stop event, wait for it, then read the elapsed time.
		cudaEventRecord(m_stop_event, 0);
		cudaEventSynchronize(m_stop_event);
		cudaEventElapsedTime(&elapsed, m_start_event, m_stop_event);
	}
	float elapsedInMs() const
	{
		return elapsed;
	}
};
void benchmark()
{
	Timer t;
	t.start();
	// Launch the kernel to be timed here.
	t.stop();
	printf("time %f msec\n", t.elapsedInMs());
}
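
As a self-contained variation on the timer above, here is a minimal sketch that times one kernel launch with error checking and a warm-up launch. The kernel `scale`, the `CHECK` macro, and the function name `timeScaleKernel` are hypothetical names made up for this illustration; they are not part of the original notes.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a runtime API call fails.
#define CHECK(call)                                              \
	do {                                                         \
		cudaError_t err_ = (call);                               \
		if (err_ != cudaSuccess) {                               \
			fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,   \
			        cudaGetErrorString(err_));                   \
			exit(EXIT_FAILURE);                                  \
		}                                                        \
	} while (0)

// Hypothetical kernel, used only as something to time.
__global__ void scale(float* data, int n, float factor)
{
	int i = blockIdx.x * blockDim.x + threadIdx.x;
	if (i < n) data[i] *= factor;
}

// Returns the GPU time of one launch of scale() in milliseconds.
float timeScaleKernel(int n)
{
	float* d_data = nullptr;
	CHECK(cudaMalloc(&d_data, n * sizeof(float)));
	CHECK(cudaMemset(d_data, 0, n * sizeof(float)));

	cudaEvent_t start, stop;
	CHECK(cudaEventCreate(&start));
	CHECK(cudaEventCreate(&stop));

	const int block = 256;
	const int grid = (n + block - 1) / block;

	// Warm-up launch so that one-time initialization is not measured.
	scale<<<grid, block>>>(d_data, n, 2.0f);
	CHECK(cudaDeviceSynchronize());

	// Timed launch, bracketed by events on the default stream.
	CHECK(cudaEventRecord(start, 0));
	scale<<<grid, block>>>(d_data, n, 2.0f);
	CHECK(cudaGetLastError());
	CHECK(cudaEventRecord(stop, 0));
	CHECK(cudaEventSynchronize(stop));

	float ms = 0.f;
	CHECK(cudaEventElapsedTime(&ms, start, stop));

	CHECK(cudaEventDestroy(start));
	CHECK(cudaEventDestroy(stop));
	CHECK(cudaFree(d_data));
	return ms;
}
```

Calling `timeScaleKernel(1 << 20)` from `main` and printing the result gives the time of a single launch; averaging over several launches would give a steadier number.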

Runtime API and driver API

runtime API
driver API

runtime API: Simple. This is what you would normally use.
driver API: Complex. It can do anything, and some things are possible only with the driver API.

Difference between the driver and runtime APIs

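Context management is possible only through the driver API.
The runtime API decides for itself which context to use. If a context has been made current to the calling thread through the driver API, the runtime uses it; otherwise it uses the primary context.
The primary context is created on demand, one per device per process, and is reference-counted.
Within one process, all users of the runtime API share the primary context, unless a context has been made current to the thread.
The context used by the runtime can be synchronized with cudaDeviceSynchronize() and destroyed with cudaDeviceReset().
Using the primary context from the runtime API can be somewhat risky. For example, when several plugins run in one process, they share the context but have no way to communicate with each other, so if any one of them calls cudaDeviceReset(), everything is gone. It is therefore safer to create a context with the driver API and run the runtime API on it. This matters when using libraries built on the runtime API, such as cuBLAS and cuFFT.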
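
The advice above can be sketched roughly as follows: create a private context with the driver API, which makes it current on the calling thread, then make ordinary runtime API calls on that thread. The function name `runOnPrivateContext` is hypothetical. The sketch also assumes the three-argument form of cuCtxCreate; newer toolkits may expect an additional creation-parameters argument, so check the driver API reference for your version.

```cuda
#include <cuda.h>
#include <cuda_runtime.h>

// Creates a private context with the driver API, makes runtime API calls
// on it, then destroys it. Returns true if every step succeeded.
bool runOnPrivateContext()
{
	if (cuInit(0) != CUDA_SUCCESS) return false;

	CUdevice device;
	if (cuDeviceGet(&device, 0) != CUDA_SUCCESS) return false;

	// Assumption: three-argument cuCtxCreate(pctx, flags, dev).
	// The new context also becomes current on this thread.
	CUcontext ctx = nullptr;
	if (cuCtxCreate(&ctx, 0, device) != CUDA_SUCCESS) return false;

	// Runtime API calls on this thread now use ctx, since it is current.
	void* d_buf = nullptr;
	bool ok = (cudaMalloc(&d_buf, 1024) == cudaSuccess);
	if (ok) ok = (cudaFree(d_buf) == cudaSuccess);

	// Confirm that our context is still the current one after the runtime calls.
	CUcontext current = nullptr;
	ok = ok && (cuCtxGetCurrent(&current) == CUDA_SUCCESS) && (current == ctx);

	cuCtxDestroy(ctx);
	return ok;
}
```

The program has to be linked against the driver library (for example with `-lcuda`) in addition to the runtime.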

From kernel creation to execution with the CUDA Driver API
 Interoperability between the runtime and driver APIs
 Convert cudaStream_t object to CUStream object
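
For the stream conversion mentioned in the last link, a minimal sketch: cudaStream_t and CUstream are the same underlying pointer type, so a stream created with the runtime API can be passed to driver API functions directly. The function name `syncRuntimeStreamWithDriver` is hypothetical.

```cuda
#include <cuda.h>
#include <cuda_runtime.h>

// Creates a stream with the runtime API and synchronizes it with the driver API.
bool syncRuntimeStreamWithDriver()
{
	cudaStream_t rt_stream = nullptr;
	if (cudaStreamCreate(&rt_stream) != cudaSuccess) return false;

	// cudaStream_t and CUstream are the same type, so no cast is needed.
	CUstream drv_stream = rt_stream;
	bool ok = (cuStreamSynchronize(drv_stream) == CUDA_SUCCESS);

	cudaStreamDestroy(rt_stream);
	return ok;
}
```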
