cuda-kat
CUDA kernel author's tools
Type-generic wrappers for CUDA atomic operations.
#include <type_traits>
#include <climits>
#include <cuda_runtime_api.h>
Typedefs | |
using | kat::grid_dimension_t = decltype(dim3::x) |
CUDA kernels are launched in grids of blocks of threads, in 3 dimensions. | |
using | kat::grid_block_dimension_t = grid_dimension_t |
CUDA kernels are launched in grids of blocks of threads, in 3 dimensions. | |
using | kat::native_word_t = unsigned |
template<typename Size > | |
using | kat::promoted_size_t = typename std::common_type< Size, native_word_t >::type |
A size type no smaller than a native word. | |
using | kat::lane_mask_t = unsigned |
A mask with one bit for each lane in a warp. | |
Enumerations | |
enum | : native_word_t { warp_size = 32 } |
enum | : native_word_t { log_warp_size = 5 } |
enum | : lane_mask_t { kat::full_warp_mask = 0xFFFFFFFF, kat::empty_warp_mask = 0x0 } |
Functions | |
template<typename T > | |
constexpr std::size_t | kat::size_in_bits () |
The number of bits in the representation of a value of type T. | |
template<typename T > | |
constexpr std::size_t | kat::size_in_bits (const T &) |
The number of bits in the representation of a value of type T. | |
template<typename Interpreted , typename Original > | |
KAT_FHD Interpreted | kat::reinterpret (Original &x) |
Type-generic wrappers for CUDA atomic operations.
Some basic type and constant definitions used by all device-side CUDA KAT code.
CUDA's "primitive" atomic functions are non-generic C functions, defined only for some specific types, and sometimes only for some of the types of the same size for which the semantics are identical. This file contains type-generic variants of these functions, with functionality extended as much as possible: either by recasting, or by using the compare-and-swap (compare-and-exchange) primitive to implement functions for types that are not directly supported.
Additionally, the wrapper used for emulating atomics on arbitrary types is made available here, so that users can do the same for arbitrary functions.
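The compare-and-swap emulation described above can be illustrated on the host with `std::atomic`. This is only a sketch of the general retry-loop technique, not cuda-kat's device-side implementation: `atomic_apply` and `atomic_multiply` are hypothetical names introduced for this example, and `std::atomic` stands in for CUDA's `atomicCAS` primitive.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not part of cuda-kat): atomically replaces the value
// in `target` with f(current value, operand), using a compare-and-exchange
// retry loop. Host-side analogue of emulating an atomic via compare-and-swap.
template <typename T, typename BinaryFunction>
T atomic_apply(std::atomic<T>& target, T operand, BinaryFunction f)
{
    T expected = target.load();
    // If another thread changed `target` in the meantime, the exchange fails,
    // `expected` is refreshed with the current value, and f is recomputed.
    while (!target.compare_exchange_weak(expected, f(expected, operand))) { }
    return expected; // the value observed just before our update took effect
}

// Hypothetical example: an atomic multiplication built on the helper above.
inline float atomic_multiply(std::atomic<float>& target, float factor)
{
    return atomic_apply(target, factor, [](float a, float b) { return a * b; });
}

inline void example_usage()
{
    std::atomic<float> x{2.0f};
    float previous = atomic_multiply(x, 3.0f);
    assert(previous == 2.0f);
    assert(x.load() == 6.0f);
}
```

Returning the previous value mirrors the convention of CUDA's built-in atomic functions, which return the old contents of the memory location.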
using kat::grid_block_dimension_t = grid_dimension_t
CUDA kernels are launched in grids of blocks of threads, in 3 dimensions.
In each of these dimensions, the number of threads per block is specified in this type.
using kat::grid_dimension_t = decltype(dim3::x)
CUDA kernels are launched in grids of blocks of threads, in 3 dimensions.
In each of these dimensions, the number of blocks per grid is specified in this type.
using kat::lane_mask_t = unsigned
A mask with one bit for each lane in a warp.
Used to indicate which threads meet a certain criterion or need to have some action applied to them.
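As an illustration of how such a mask encodes a per-lane criterion, here is a host-side sketch. It redeclares stand-ins for the types and constants listed on this page, so that it compiles without CUDA headers, and emulates a warp-wide vote with a plain loop. The helpers `lanes_satisfying` and `lane_is_set` are hypothetical and are not part of cuda-kat.

```cpp
#include <cassert>

// Stand-ins mirroring the definitions listed on this page, redeclared here
// so that the sketch is self-contained on the host.
using lane_mask_t = unsigned;
enum : unsigned { warp_size = 32 };
enum : lane_mask_t { full_warp_mask = 0xFFFFFFFF, empty_warp_mask = 0x0 };

// Hypothetical helper: builds a mask whose bit i is set exactly when the
// predicate holds for lane i. On a device, a warp vote would do this in one
// step; here a loop over the lane indices emulates it.
template <typename Predicate>
lane_mask_t lanes_satisfying(Predicate pred)
{
    lane_mask_t mask = empty_warp_mask;
    for (unsigned lane = 0; lane < warp_size; ++lane) {
        if (pred(lane)) { mask |= (1u << lane); }
    }
    return mask;
}

// Hypothetical helper: tests whether a given lane's bit is set in the mask.
inline bool lane_is_set(lane_mask_t mask, unsigned lane)
{
    return ((mask >> lane) & 1u) != 0;
}

inline void example_usage()
{
    lane_mask_t even_lanes = lanes_satisfying([](unsigned lane) { return lane % 2 == 0; });
    assert(even_lanes == 0x55555555u);
    assert(lane_is_set(even_lanes, 4));
    assert(!lane_is_set(even_lanes, 5));
}
```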
template<typename Size> using kat::promoted_size_t = typename std::common_type<Size, native_word_t>::type
A size type no smaller than a native word.
Sometimes, in device code, a size type only needs to cover a small range of values. Even so, it is often more efficient to use a full native word than to risk extra instructions that enforce the limits of sub-native-word values. This may not help much, or the difference may be optimized away, but it is the safer choice.
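The effect of the promotion can be checked at compile time. The sketch below redeclares the two aliases as they are listed on this page (outside the `kat` namespace, so that it compiles without the library) and assumes the usual platform where `unsigned` is 32 bits wide. The function `sum_below` is a hypothetical example.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Redeclared from the listing on this page, so the sketch is self-contained.
using native_word_t = unsigned;

template <typename Size>
using promoted_size_t = typename std::common_type<Size, native_word_t>::type;

// Assumption: unsigned is 32 bits wide, as on the platforms CUDA targets.
static_assert(sizeof(native_word_t) == 4, "this sketch assumes a 32-bit native word");

// Types narrower than a native word are promoted to the native word...
static_assert(std::is_same<promoted_size_t<std::uint8_t>, unsigned>::value, "");
static_assert(std::is_same<promoted_size_t<std::uint16_t>, unsigned>::value, "");
// ...while wider types are left as they are.
static_assert(std::is_same<promoted_size_t<std::uint64_t>, std::uint64_t>::value, "");

// Hypothetical example: the caller passes a narrow size, but the loop counter
// and accumulator use the promoted type, so no sub-word wrap-around occurs.
inline promoted_size_t<std::uint8_t> sum_below(std::uint8_t n)
{
    using size_type = promoted_size_t<std::uint8_t>;
    size_type total = 0;
    for (size_type i = 0; i < n; ++i) { total += i; }
    return total;
}

inline void example_usage()
{
    // 0 + 1 + ... + 254 does not fit in 8 bits, but fits in a native word.
    assert(sum_below(255) == 32385u);
}
```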
anonymous enum : lane_mask_t
template<typename Interpreted, typename Original> KAT_FHD Interpreted kat::reinterpret(Original & x)
template<typename T> constexpr std::size_t kat::size_in_bits()
The number of bits in the representation of a value of type T.
template<typename T> constexpr std::size_t kat::size_in_bits(const T &)
The number of bits in the representation of a value of type T.
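A minimal sketch of how these two overloads might be written follows. The function bodies are not shown on this page, so the `sizeof(T) * CHAR_BIT` formula is an assumption, suggested by the `<climits>` include in the header. The functions are declared outside the `kat` namespace so that the sketch stands alone.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Assumed implementation: the object size in bytes times the bits per byte.
template <typename T>
constexpr std::size_t size_in_bits()
{
    return sizeof(T) * CHAR_BIT;
}

// Overload taking a value, so the type can be deduced from an expression.
// The argument itself is never read.
template <typename T>
constexpr std::size_t size_in_bits(const T&)
{
    return sizeof(T) * CHAR_BIT;
}

// Both overloads are usable in constant expressions.
static_assert(size_in_bits<std::uint32_t>() == 32, "");
static_assert(size_in_bits<char>() == CHAR_BIT, "");

inline void example_usage()
{
    std::uint64_t wide = 0;
    assert(size_in_bits(wide) == 64);
    assert(size_in_bits(wide) == size_in_bits<std::uint64_t>());
}
```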