cuda-kat
CUDA kernel author's tools
Non-templated wrappers for PTX "video" instructions, for which NVIDIA does not provide wrappers through the CUDA <device_functions.h> header.
#include "detail/define_macros.cuh"
#include <kat/on_device/common.cuh>
#include <type_traits>
#include "detail/undefine_macros.cuh"
Namespaces
kat::ptx
    Code exposing CUDA's PTX intermediate representation instructions to C++ code.
Macros
#define CUDA_KAT_PTX_VIDEO_INSTRUCTIONS_CUH_
#define DEFINE_SHIFT_AND_OP(direction, second_op)
    Bit shift, then apply a binary operator.
"Video" instructions are not really about video (although they are probably used for video processing somehow). Essentially, they are instructions which apply a second operation, with an additional operand, to the result of the main operation; additionally, they offer variants with various saturation, wraparound and sign-extension options.
These instructions (at least, the "scalar" ones) are:
vadd - addition
vsub - subtraction
vabsdiff - absolute difference
vmin - minimum
vmax - maximum
vshl - shift left
vshr - shift right
vmad - multiply-and-add
vset - comparison (e.g. an equality check)
For now, we do not implement most of these instructions, and even for the ones we do implement, we only cover some of the variants.
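To make the "main operation, then a second operation with an extra operand" idea concrete, here is a rough host-side C++ model of a "shift left, then add" instruction. It is only a sketch under stated assumptions: operands are unsigned 32-bit values, no saturation is applied, the shifted value is truncated to 32 bits, and the two shift-amount modes behave as "clamp to 32" and "keep the low 5 bits". The names `shift_mode` and `model_shift_left_then_add` are hypothetical and are not part of cuda-kat or CUDA.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical names: a host-side model for illustration, not the device wrappers.
enum class shift_mode { clamp, wrap };

// Models "shift left, then add" on unsigned 32-bit values, without saturation.
inline std::uint32_t model_shift_left_then_add(
    std::uint32_t x, std::uint32_t shift_by, std::uint32_t addend, shift_mode mode)
{
    // Assumed mode semantics:
    //   clamp: shift amounts above 32 behave like 32
    //   wrap:  only the low 5 bits of the shift amount are used
    std::uint32_t amount = (mode == shift_mode::clamp)
        ? std::min<std::uint32_t>(shift_by, 32u)
        : (shift_by & 0x1fu);

    // Shift in 64 bits so that an amount of 32 is well-defined in C++,
    // then keep only the low 32 bits.
    std::uint32_t shifted = static_cast<std::uint32_t>(
        (static_cast<std::uint64_t>(x) << amount) & 0xffffffffu);

    // The secondary operation: here, ordinary wrapping 32-bit addition.
    return shifted + addend;
}
```

Under this model, shifting 1 by 32 and adding 5 gives 5 in clamp mode (everything is shifted out) but 6 in wrap mode (the shift amount becomes 0).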
#define DEFINE_SHIFT_AND_OP(direction, second_op)
Bit shift, then apply a binary operator.
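The macro's body is not shown on this page, so the following is only a guess at the general pattern: a two-parameter macro that uses token pasting to stamp out one function per (direction, second operation) pair. Everything here is hypothetical, including the macro name `MODEL_SHIFT_AND_OP`, the helper functions, the argument spellings `l`/`r` and `add`/`min`/`max`, and the "shift amounts of 32 or more yield 0" behaviour. It runs on the host and does not emit PTX.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical host-side helpers, selected by token pasting in the macro below.
// Shift amounts of 32 or more are assumed to yield 0 (clamp-like behaviour).
inline std::uint32_t model_shift_l(std::uint32_t x, std::uint32_t n) { return n >= 32 ? 0u : x << n; }
inline std::uint32_t model_shift_r(std::uint32_t x, std::uint32_t n) { return n >= 32 ? 0u : x >> n; }

inline std::uint32_t model_op_add(std::uint32_t a, std::uint32_t b) { return a + b; }
inline std::uint32_t model_op_min(std::uint32_t a, std::uint32_t b) { return std::min(a, b); }
inline std::uint32_t model_op_max(std::uint32_t a, std::uint32_t b) { return std::max(a, b); }

// Illustrative stand-in for a "shift, then apply a binary operator" generator.
// The real DEFINE_SHIFT_AND_OP may differ in signature and generated code.
#define MODEL_SHIFT_AND_OP(direction, second_op)                                   \
    inline std::uint32_t model_vsh##direction##_##second_op(                       \
        std::uint32_t x, std::uint32_t shift_by, std::uint32_t extra_operand)      \
    {                                                                              \
        return model_op_##second_op(                                               \
            model_shift_##direction(x, shift_by), extra_operand);                  \
    }

// Each invocation defines one function, e.g. model_vshl_add.
MODEL_SHIFT_AND_OP(l, add)
MODEL_SHIFT_AND_OP(l, min)
MODEL_SHIFT_AND_OP(r, add)
MODEL_SHIFT_AND_OP(r, max)
```

A macro of this shape keeps the per-variant code in one place, so adding another combination is a single extra line.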