cuda-kat
CUDA kernel author's tools
basic.cuh File Reference

Simpler / more basic utility code for working with shared memory, not involving any actual computation. More...

#include <kat/on_device/grid_info.cuh>
#include <kat/on_device/ptx.cuh>

Namespaces

 kat::shared_memory::dynamic::warp_specific
 

Typedefs

using kat::shared_memory::offset_t = int
 
using kat::shared_memory::size_t = unsigned
 

Functions

KAT_FD size_t kat::shared_memory::size ()
 Obtain the total size, in bytes, of the (per-block) shared memory for the running kernel (static plus dynamic). More...
 
KAT_FD size_t kat::shared_memory::static_::size ()
 Obtain the size in bytes of the (per-block) static shared memory for the running kernel. More...
 
template<typename T = unsigned char>
KAT_FD size_t kat::shared_memory::dynamic::size ()
 Obtain the size of the (per-block) dynamic shared memory for the running kernel. More...
 
template<typename T >
KAT_DEV T * kat::shared_memory::dynamic::proxy ()
 This gadget is necessary for using dynamically-sized shared memory in templated kernels. More...
 
template<typename T >
KAT_FD T * kat::shared_memory::dynamic::warp_specific::contiguous (unsigned num_elements_per_warp, offset_t base_offset=0)
 Accesses the calling thread's warp-specific dynamic shared memory, assuming the warps voluntarily divvy up the shared memory beyond some point amongst themselves into contiguous per-warp areas (see the sketch below). More...
 
template<typename T >
KAT_FD T * kat::shared_memory::dynamic::warp_specific::strided (offset_t base_offset=0)
 Accesses the calling thread's warp-specific dynamic shared memory, assuming the warps voluntarily divvy up the shared memory beyond some point amongst themselves using striding. More...
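
For orientation, a rough sketch (not taken from the library's documentation) of how contiguous() might be used: each warp obtains its own slice of the dynamic shared memory and zero-initializes it. The element type float, the per-warp element count, and a launch configuration that provides enough dynamic shared memory are all assumptions of the sketch.

__global__ void per_warp_scratch(unsigned elements_per_warp)
{
    // Pointer to this warp's own contiguous slice of the dynamic shared memory;
    // the launch is assumed to pass at least num_warps * elements_per_warp floats.
    float* warp_area =
        kat::shared_memory::dynamic::warp_specific::contiguous<float>(elements_per_warp);
    unsigned lane = threadIdx.x % warpSize;
    for (unsigned i = lane; i < elements_per_warp; i += warpSize) {
        warp_area[i] = 0.0f; // e.g. zero the warp's scratch area
    }
    __syncwarp();
    // ... per-warp work using warp_area ...
}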
 

Detailed Description

Simpler / more basic utility code for working with shared memory, not involving any actual computation.

Function Documentation

§ proxy()

template<typename T >
KAT_DEV T* kat::shared_memory::dynamic::proxy ( )

This gadget is necessary for using dynamically-sized shared memory in templated kernels, i.e. shared memory whose size is set by the launch parameters rather than being fixed at compile time.

Use of such memory requires an extern __shared__ array variable of unspecified size; however, the way nvcc works, you cannot declare two such variables of different types in your program, even if they are in different scopes. That means we would either need a different variable name for each type (which would lead us into preprocessor-macro hell), or we can just always use the same type and reinterpret it as the type we need, which is what this gadget does.

Note
All threads get the same address when calling this function, so you need to add different offsets for different threads if you want a warp-specific or thread-specific pointer.
See also https://stackoverflow.com/questions/27570552/
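
A minimal sketch of the pattern, assuming this header is included and the launch passes n * sizeof(T) bytes of dynamic shared memory; the kernel and variable names are illustrative:

template <typename T>
__global__ void stage_through_shared(const T* in, T* out, unsigned n)
{
    // Every thread gets the same base address; per-thread offsets are up to us.
    T* shmem = kat::shared_memory::dynamic::proxy<T>();
    if (threadIdx.x < n) {
        shmem[threadIdx.x] = in[threadIdx.x];
    }
    __syncthreads();
    if (threadIdx.x < n) {
        out[threadIdx.x] = shmem[threadIdx.x];
    }
}

// Hypothetical launch; the third launch-configuration argument is the dynamic
// shared memory size in bytes:
//   stage_through_shared<float><<<1, 256, 256 * sizeof(float)>>>(d_in, d_out, 256);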

§ size() [1/3]

KAT_FD size_t kat::shared_memory::size ( )

Obtain the total size, in bytes, of the (per-block) shared memory for the running kernel (static plus dynamic).

Note
This requires special-register access, which is not particularly cheap.
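
A minimal sketch of a use, with thread 0 of each block recording the figure once; the output array and its sizing (one entry per block) are illustrative assumptions:

__global__ void record_shared_memory_totals(kat::shared_memory::size_t* totals)
{
    if (threadIdx.x == 0) {
        // One special-register read per block; avoid calling this in a hot loop.
        totals[blockIdx.x] = kat::shared_memory::size();
    }
}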

§ size() [2/3]

KAT_FD size_t kat::shared_memory::static_::size ( )

Obtain the size in bytes of the (per-block) static shared memory for the running kernel.

Note
This requires special-register access, which is not particularly cheap.
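
A sketch contrasting the static figure with the total from kat::shared_memory::size(); the __shared__ buffer and the use of device-side printf are illustrative assumptions:

__global__ void print_shared_memory_breakdown()
{
    __shared__ int scratch[128];                     // contributes to the static figure
    scratch[threadIdx.x % 128] = (int) threadIdx.x;  // keep the buffer from being elided
    if (blockIdx.x == 0 && threadIdx.x == 0) {
        auto static_size = kat::shared_memory::static_::size();
        auto total_size  = kat::shared_memory::size();
        printf("static: %u bytes, dynamic: %u bytes\n",
            (unsigned) static_size, (unsigned) (total_size - static_size));
    }
}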

§ size() [3/3]

template<typename T = unsigned char>
KAT_FD size_t kat::shared_memory::dynamic::size ( )

Obtain the size of the (per-block) dynamic shared memory for the running kernel.

Note
Without a template argument (i.e. with the default unsigned char), the returned size is in bytes.
This requires special-register access, which is not particularly cheap.
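
A sketch combining this with proxy(), under the assumption that an explicit element type T yields the figure in units of sizeof(T) rather than in bytes:

__global__ void zero_dynamic_shared_memory()
{
    float* shmem = kat::shared_memory::dynamic::proxy<float>();
    // Assumed to be the number of float elements; size<>() with the default
    // template argument would give the figure in bytes instead.
    auto num_elements = kat::shared_memory::dynamic::size<float>();
    for (unsigned i = threadIdx.x; i < num_elements; i += blockDim.x) {
        shmem[i] = 0.0f;
    }
}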