synchronization with levels

11 Feb 2020

Tags: gpgpu, synchronization

Cooperative Groups

A flexible model for synchronization and communication within groups of threads

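Before CUDA 9.0, ‘__syncthreads()’ was the standard way to synchronize at block level. Starting from CUDA 9.0, Cooperative Groups lets you define groups of threads at several levels and synchronize each group explicitly, for example the whole thread block and fixed-size tiles within it: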

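using namespace cooperative_groups;           // from <cooperative_groups.h>
thread_block g = this_thread_block();         // all threads launched in the current thread block
thread_group tile = tiled_partition(g, 32);   // tiles of 32 threads within the thread block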
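To see the two levels working together, here is a minimal sketch of a sum reduction: each 32-thread tile reduces its values with shuffles, then the block synchronizes before one thread combines the per-tile results. The names tile_sum, block_sum and gpu_sum are made up for this illustration, the block size is assumed to be a multiple of 32 and at most 1024, and error checking is left out for brevity.

```cuda
#include <cuda_runtime.h>
#include <cooperative_groups.h>

namespace cg = cooperative_groups;

// Tile level: sum the values held by the 32 threads of one tile with shuffles.
// The complete sum ends up in the thread with tile.thread_rank() == 0.
__device__ int tile_sum(cg::thread_block_tile<32> tile, int val) {
    for (int offset = 16; offset > 0; offset /= 2) {
        val += tile.shfl_down(val, offset);
    }
    return val;
}

// Block level: combine the per-tile sums, then add the block total to *out.
// Assumption: blockDim.x is a multiple of 32 and no larger than 1024.
__global__ void block_sum(const int *in, int *out, int n) {
    __shared__ int partial[32];  // one slot per tile in the block

    cg::thread_block block = cg::this_thread_block();
    cg::thread_block_tile<32> tile = cg::tiled_partition<32>(block);

    int i = blockIdx.x * blockDim.x + threadIdx.x;
    int val = (i < n) ? in[i] : 0;  // out-of-range threads contribute 0

    int sum = tile_sum(tile, val);
    int tile_id = static_cast<int>(block.thread_rank()) / 32;
    if (tile.thread_rank() == 0) {
        partial[tile_id] = sum;
    }

    block.sync();  // block-wide barrier, same effect as __syncthreads()

    if (block.thread_rank() == 0) {
        int num_tiles = static_cast<int>(block.size()) / 32;
        int total = 0;
        for (int t = 0; t < num_tiles; ++t) {
            total += partial[t];
        }
        atomicAdd(out, total);
    }
}

// Hypothetical host helper: sums n ints on the GPU and returns the result.
int gpu_sum(const int *host_in, int n) {
    int *d_in = nullptr;
    int *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, host_in, n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, sizeof(int));

    const int threads = 256;  // multiple of 32, as assumed by block_sum
    const int blocks = (n + threads - 1) / threads;
    block_sum<<<blocks, threads>>>(d_in, d_out, n);

    int result = 0;
    cudaMemcpy(&result, d_out, sizeof(int), cudaMemcpyDeviceToHost);  // waits for the kernel
    cudaFree(d_in);
    cudaFree(d_out);
    return result;
}
```

Calling gpu_sum on a host array should return the same total as a plain CPU loop over that array. The compile-time tiled_partition<32> is used here instead of the runtime tiled_partition(g, 32) because only the templated tile type provides shfl_down.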

http://on-demand.gputechconf.com/gtc/2017/presentation/s7622-Kyrylo-perelygin-robust-and-scalable-cuda.pdf