cuda-kat
CUDA kernel author's tools
Non-templated wrappers for PTX "video" instructions, for which NVIDIA does not provide wrappers through the CUDA <device_functions.h> header.
#include "detail/define_macros.cuh"
#include <kat/on_device/common.cuh>
#include <type_traits>
#include "detail/undefine_macros.cuh"
Namespaces
kat::ptx
Code exposing CUDA's PTX intermediate representation instructions to C++ code.
Macros
#define CUDA_KAT_PTX_VIDEO_INSTRUCTIONS_CUH_
#define DEFINE_SHIFT_AND_OP(direction, second_op)
Bit shift, then apply a binary operator.
"Video" instructions are not really about video (although they are probably used for video processing somehow). Essentially, they are instructions which follow the main operation with a secondary operation that takes an additional operand; additionally, they offer variants with all sorts of saturation, wraparound, sign-extension and similar bells and whistles.
These instructions (at least, the "scalar" ones) are:
vadd - addition
vsub - subtraction
vabsdiff - absolute difference
vmin - minimum
vmax - maximum
vshl - shift left
vshr - shift right
vmad - multiply-and-add
vset - comparison (equality check and the like)
For now, we won't implement most of these instructions, and even for the ones we do implement, we'll only cover some of the variants.
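To make the "main operation, then secondary operation with an extra operand" idea concrete, here is a minimal host-side sketch in plain C++. It only emulates the general shape of such an instruction for unsigned 32-bit values; it is not the PTX instruction itself, it ignores the saturation and sign-extension variants, and all function names are hypothetical rather than part of cuda-kat.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical host-side emulation; none of these names exist in cuda-kat.

// The "main" operation: absolute difference of two unsigned values.
inline std::uint32_t emulated_absdiff(std::uint32_t a, std::uint32_t b)
{
    return a > b ? a - b : b - a;
}

// Main operation on (a, b), followed by a secondary "add" with a third operand c.
// Unsigned arithmetic wraps around on overflow (no saturation is emulated here).
inline std::uint32_t emulated_absdiff_then_add(
    std::uint32_t a, std::uint32_t b, std::uint32_t c)
{
    return emulated_absdiff(a, b) + c;
}

// Same main operation, but with "max" as the secondary operation.
inline std::uint32_t emulated_absdiff_then_max(
    std::uint32_t a, std::uint32_t b, std::uint32_t c)
{
    return std::max(emulated_absdiff(a, b), c);
}
```

With the "add" variant, feeding a running total in as the third operand accumulates a sum of absolute differences in one step per element, which is presumably the kind of use that earned these instructions their name.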
#define DEFINE_SHIFT_AND_OP(direction, second_op)
Bit shift, then apply a binary operator.
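As an illustration of how a macro with this signature could stamp out a family of "shift, then apply a binary operator" functions, here is a host-side sketch in plain C++. It is an assumption-laden emulation, not the library's definition: the macro name EMU_DEFINE_SHIFT_AND_OP, the emu_ helpers, the direction tokens l and r, the operator tokens add, min and max, and the choice to clamp over-long shifts to zero are all hypothetical.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helpers: shifts whose amount is clamped, so that shifting
// a 32-bit value by 32 or more bits yields 0 instead of undefined behavior.
inline std::uint32_t emu_shl_clamped(std::uint32_t x, std::uint32_t amount)
{
    return amount >= 32 ? 0u : x << amount;
}
inline std::uint32_t emu_shr_clamped(std::uint32_t x, std::uint32_t amount)
{
    return amount >= 32 ? 0u : x >> amount;
}

// Hypothetical secondary (binary) operators.
inline std::uint32_t emu_add(std::uint32_t a, std::uint32_t b) { return a + b; }
inline std::uint32_t emu_min(std::uint32_t a, std::uint32_t b) { return std::min(a, b); }
inline std::uint32_t emu_max(std::uint32_t a, std::uint32_t b) { return std::max(a, b); }

// Generates emu_vsh<direction>_<second_op>(x, shift_amount, extra_operand):
// shift x in the given direction, then combine the result with extra_operand.
#define EMU_DEFINE_SHIFT_AND_OP(direction, second_op) \
inline std::uint32_t emu_vsh##direction##_##second_op( \
    std::uint32_t x, std::uint32_t shift_amount, std::uint32_t extra_operand) \
{ \
    return emu_##second_op( \
        emu_sh##direction##_clamped(x, shift_amount), extra_operand); \
}

EMU_DEFINE_SHIFT_AND_OP(l, add)  // defines emu_vshl_add
EMU_DEFINE_SHIFT_AND_OP(r, max)  // defines emu_vshr_max

#undef EMU_DEFINE_SHIFT_AND_OP
```

For instance, emu_vshl_add(1, 4, 3) shifts 1 left by 4 bits and then adds 3. A device-side version would presumably replace the function body with an inline PTX asm statement, which this sketch does not attempt.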