As a programming environment for heterogeneous computing, OpenMP gained device constructs in version 4.0, which makes it possible to write offload code that looks much like OpenACC.
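As a rough illustration of that similarity, here is a minimal sketch of the same SAXPY loop written once with OpenACC directives and once with OpenMP 4.0 device constructs. The function names `saxpy_acc` and `saxpy_omp` are made up for this example. A compiler without OpenACC or OpenMP offload support ignores the pragmas and runs both loops serially on the host, so the numerical result is the same either way.

```c
#include <stddef.h>

/* OpenACC: offload the loop; x is copied to the device, y is copied in and out. */
void saxpy_acc(int n, float a, const float *x, float *y)
{
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* OpenMP 4.0: the target construct offloads the region, and the map
   clauses play the role of the OpenACC data clauses above. */
void saxpy_omp(int n, float a, const float *x, float *y)
{
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Whether either version actually runs on an accelerator depends on the compiler and its flags; the directives only express the request.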
That said, the directions in which OpenMP and OpenACC are heading differ subtly.
“The real issue is which one, OpenACC or OpenMP, solves the issue for the users,” explains Wolfe. “OpenMP is richer and has more features than OpenACC, we make no apologies about that. OpenACC is targeting scalable parallelism, OpenMP is targeting more general parallelism including things like tasks, which in some senses are inherently not scalable. OpenMP has a lot more synchronization primitives, and if you are talking about scalable parallelism, that is just a way to slow your program down. The important differences are performance portability – and at SC15 you heard that it is either important or impossible – and we are saying that it is not only important and possible, but that we are demonstrating this today.”
(Wolfe's role: technical chair for OpenACC)
(http://www.nextplatform.com/wp-content/uploads/2015/11/openacc-versus-openmp.jpg)
OpenMP (official)
OMPAPI.Relatives.01 How does OpenMP relate to OpenACC ?
OpenMP and OpenACC are actively merging their specification while continuing to evolve. A first step at merging has been made with the release of OpenMP 4.0. OpenACC implementations can be considered to be a beta test of the OpenMP accelerator specification. They give early implementation experience.
OpenACC has been created and implemented by several members of the OpenMP ARB in order to address their immediate customer needs. These members are NVIDIA, PGI, Cray, and CAPS.
OpenACC (official)
What does it take to port an application to OpenACC versus OpenMP?
OpenACC and OpenMP require a similar approach to parallelizing code, so no developer investment is wasted.
OpenACC allows the developer to express the parallelism in the code while relying on an OpenACC compiler to map that parallelism to the hardware. This enables the developer to write parallel code that is performance portable to any architecture.
OpenMP relies on the developer to explicitly parallelize their code, which makes OpenMP simpler for a compiler to implement but more difficult to make portable to different architectures.
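The contrast drawn in the OpenACC FAQ above can be sketched roughly as follows. In the OpenACC version the `kernels` construct marks a region and leaves it to the compiler to decide how the loop nest is mapped onto the hardware. In the OpenMP version the developer spells out each level of parallelism (`teams`, `distribute`, `parallel for`) and the loop collapsing. The function names `add2d_acc` and `add2d_omp` are hypothetical, and what a given compiler actually generates for either version is implementation-dependent.

```c
/* OpenACC: describe the region and the data movement; the compiler chooses
   how (and whether) to parallelize the loops. restrict tells it that a and b
   do not alias, which helps it prove the iterations are independent. */
void add2d_acc(int rows, int cols, const float *restrict a, float *restrict b)
{
    #pragma acc kernels copyin(a[0:rows*cols]) copy(b[0:rows*cols])
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            b[i * cols + j] += a[i * cols + j];
}

/* OpenMP: the developer states the parallelization explicitly: offload,
   create a league of teams, distribute the collapsed iteration space
   across teams, and share each chunk among the threads of a team. */
void add2d_omp(int rows, int cols, const float *restrict a, float *restrict b)
{
    #pragma omp target teams distribute parallel for collapse(2) \
        map(to: a[0:rows*cols]) map(tofrom: b[0:rows*cols])
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < cols; ++j)
            b[i * cols + j] += a[i * cols + j];
}
```

Both functions compute the same result; the difference is in how much of the mapping decision is written down by the developer rather than left to the compiler.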