cuda-kat
CUDA kernel author's tools
Wrapper functions, implemented with inline PTX assembly, for single PTX instructions which are not already available in the official CUDA includes.
#include "ptx/special_registers.cuh"
#include "ptx/miscellany.cuh"
#include "ptx/video_instructions.cuh"
Namespaces
kat::ptx
Code exposing CUDA's PTX intermediate representation instructions to C++ code.
CUDA provides many "intrinsic" functions, each of which wraps a single PTX instruction, e.g. __ldg or __funnelshift_l from sm_32_intrinsics.h. However, CUDA does not provide such functions for all of the PTX instruction set. The files included from this master include file contain single-line assembly wrapper functions for different categories of the missing PTX instructions.
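To illustrate what such a single-instruction wrapper can look like, here is a minimal sketch that reads the PTX special register %laneid through inline assembly. The function name lane_index and the demo kernel are hypothetical and chosen for illustration only; they are not necessarily the names or the exact form used in the included files.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical wrapper (name is an assumption): returns the calling
// thread's lane index within its warp by reading the PTX special
// register %laneid with a single mov instruction.
__device__ __forceinline__ unsigned lane_index()
{
    unsigned ret;
    // "%%" escapes the percent sign of the register name in inline PTX;
    // "=r" binds ret to a 32-bit output register.
    asm volatile ("mov.u32 %0, %%laneid;" : "=r"(ret));
    return ret;
}

// Demo kernel: each thread prints its own lane index.
__global__ void print_lane_indices()
{
    printf("thread %u: lane %u\n", threadIdx.x, lane_index());
}

int main()
{
    print_lane_indices<<<1, 4>>>();
    // Wait for the kernel and report failure through the exit code.
    return cudaDeviceSynchronize() == cudaSuccess ? 0 : 1;
}
```

The wrapper body is a single asm statement, so the compiler can inline it and the call costs no more than the instruction itself.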
Unlike on_device/builtins.cuh, the functions here are not templated, and do not necessarily have the same name for different parameter types. The functions in on_device/builtins.cuh do, however, use PTX wrapper functions as their implementation.
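The layering described here can be sketched as follows, using the PTX bfind instruction (the bit position of the most significant set bit) as the example. All names below (bfind_u32, bfind_u64, find_leading_one) are hypothetical and only illustrate the pattern of non-templated, type-specific wrappers beneath a templated front end; they are not taken from the library.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical non-templated wrappers: one distinct name per operand type.
__device__ __forceinline__ unsigned bfind_u32(unsigned x)
{
    unsigned ret;
    asm ("bfind.u32 %0, %1;" : "=r"(ret) : "r"(x));
    return ret;
}

__device__ __forceinline__ unsigned bfind_u64(unsigned long long x)
{
    unsigned ret; // bfind writes a 32-bit result even for 64-bit input
    asm ("bfind.u64 %0, %1;" : "=r"(ret) : "l"(x));
    return ret;
}

// Hypothetical templated front end: one name for all supported types,
// with each specialization forwarding to the matching wrapper.
template <typename T>
__device__ unsigned find_leading_one(T x);

template <>
__device__ inline unsigned find_leading_one<unsigned>(unsigned x)
{
    return bfind_u32(x);
}

template <>
__device__ inline unsigned find_leading_one<unsigned long long>(unsigned long long x)
{
    return bfind_u64(x);
}

__global__ void demo()
{
    // The same call syntax works for both operand widths.
    printf("%u %u\n", find_leading_one(40u), find_leading_one(1ull << 40));
}

int main()
{
    demo<<<1, 1>>>();
    return cudaDeviceSynchronize() == cudaSuccess ? 0 : 1;
}
```

Keeping the lower layer untemplated means each wrapper maps to exactly one PTX instruction variant, while the templated layer gives generic code a uniform name to call.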