Non-templated wrappers for PTX "video" instructions, for which NVIDIA provides no wrappers in the CUDA <device_functions.h> header.

#include "detail/define_macros.cuh"
#include <kat/on_device/common.cuh>
#include <type_traits>
#include "detail/undefine_macros.cuh"

Namespaces
	kat::ptx
	Code exposing CUDA's PTX intermediate representation instructions to C++ code.

Macros
#define	CUDA_KAT_PTX_VIDEO_INSTRUCTIONS_CUH_

#define	DEFINE_SHIFT_AND_OP(direction, second_op)
	Bit shift, then apply a binary operator.

Functions
	kat::ptx::DEFINE_SHIFT_AND_OP (l, add)
	kat::ptx::DEFINE_SHIFT_AND_OP (l, min)
	kat::ptx::DEFINE_SHIFT_AND_OP (l, max)
	kat::ptx::DEFINE_SHIFT_AND_OP (r, add)
	kat::ptx::DEFINE_SHIFT_AND_OP (r, min)
	kat::ptx::DEFINE_SHIFT_AND_OP (r

Detailed Description

Non-templated wrappers for PTX "video" instructions, for which NVIDIA provides no wrappers in the CUDA <device_functions.h> header.

"Video" instructions are not really about video (although they are probably used for video processing somehow). Essentially, each one performs a main operation and can then apply a secondary operation, with an additional operand, to the main operation's result. They also come in variants offering saturation, wraparound, sign-extension and similar bells and whistles.

These instructions (at least, the "scalar" ones) are:

vadd - addition
vsub - subtraction
vabsdiff - absolute difference
vmin - minimum
vmax - maximum
vshl - shift left
vshr - shift right
vmad - multiply-and-add
vset - comparison (equality, inequality or ordering)

For now, most of these instructions are not implemented here, and those that are implemented are provided in only some of their variants.
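To make the "saturation" and "wraparound" variants and the "main operation, then secondary operation" pattern described above more concrete, here is a minimal host-side sketch in plain C++. The helper names (add_wrap, add_saturate, add_then_min) are hypothetical and exist only for this illustration; they are not part of cuda-kat or CUDA, and they model the general concepts rather than the exact behavior of any particular PTX instruction variant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical host-side helpers for illustration only.

// Wraparound: the sum is reduced modulo 2^32, which is what unsigned
// 32-bit arithmetic does in C++.
inline std::uint32_t add_wrap(std::uint32_t a, std::uint32_t b)
{
    return a + b;
}

// Saturation: the sum is computed in a wider type, then clamped to the
// largest value representable in 32 bits.
inline std::uint32_t add_saturate(std::uint32_t a, std::uint32_t b)
{
    std::uint64_t wide = std::uint64_t{a} + b;
    return static_cast<std::uint32_t>(std::min<std::uint64_t>(wide, UINT32_MAX));
}

// Main operation followed by a secondary operation: add a and b, then
// take the minimum of that result and a third operand c.
inline std::uint32_t add_then_min(std::uint32_t a, std::uint32_t b, std::uint32_t c)
{
    return std::min(add_wrap(a, b), c);
}
```

With these helpers, add_wrap(0xFFFFFFFF, 2) wraps around to 1, while add_saturate(0xFFFFFFFF, 2) stays at 0xFFFFFFFF.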

Macro Definition Documentation

§ DEFINE_SHIFT_AND_OP

#define DEFINE_SHIFT_AND_OP(direction, second_op)

Value:

KAT_FD uint32_t \
vsh##direction##_##second_op ( \
    uint32_t x, \
    uint32_t shift_amount, \
    uint32_t extra_operand) \
{ \
    uint32_t ret; \
    asm ("vsh" PTX_STRINGIFY(direction) ".u32.u32.u32.clamp." PTX_STRINGIFY(second_op) " %0, %1, %2, %3;" \
        : "=r"(ret)  \
        : "r"(x) \
        , "r"(shift_amount) \
        , "r"(extra_operand) \
    ); \
    return ret; \
}
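The macro body above relies on two preprocessor mechanisms: token pasting (##) to build the function name, and stringification plus adjacent string-literal concatenation to build the instruction text. The following host-side sketch shows what those pieces would produce for the (l, add) and (r, max) arguments. It assumes PTX_STRINGIFY is a plain single-level stringification macro; the real definition lives in detail/define_macros.cuh and may differ. SHIFT_AND_OP_INSTRUCTION and SHIFT_AND_OP_NAME are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <string>

// Assumption: the real PTX_STRINGIFY is defined in detail/define_macros.cuh
// and may be implemented differently.
#define PTX_STRINGIFY(x) #x

// Hypothetical helper: builds only the instruction text that the macro
// above passes to asm(). Adjacent string literals are concatenated.
#define SHIFT_AND_OP_INSTRUCTION(direction, second_op) \
    "vsh" PTX_STRINGIFY(direction) ".u32.u32.u32.clamp." PTX_STRINGIFY(second_op) " %0, %1, %2, %3;"

// Hypothetical helper: builds the generated function's name as a string,
// using the same token pasting as the macro above.
#define SHIFT_AND_OP_NAME(direction, second_op) \
    PTX_STRINGIFY(vsh##direction##_##second_op)

inline std::string vshl_add_instruction() { return SHIFT_AND_OP_INSTRUCTION(l, add); }
inline std::string vshl_add_name()        { return SHIFT_AND_OP_NAME(l, add); }
inline std::string vshr_max_instruction() { return SHIFT_AND_OP_INSTRUCTION(r, max); }
inline std::string vshr_max_name()        { return SHIFT_AND_OP_NAME(r, max); }
```

Under these assumptions, DEFINE_SHIFT_AND_OP(l, add) would define a function named vshl_add whose inline assembly is the instruction text returned by vshl_add_instruction().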

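Defines a function vsh<direction>_<second_op> which bit-shifts x by shift_amount (left for direction l, right for direction r), then applies the binary operator second_op to the result and extra_operand.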
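As a rough host-side reference for what the add variants generated by this macro might compute, here is a minimal sketch. It rests on one reading of the PTX .clamp shift mode, namely that shift amounts above 32 are clamped to 32, so a shift by 32 or more yields 0 for a 32-bit value, and that the addition wraps around since no saturation modifier is used. That reading is an assumption and should be checked against the PTX ISA documentation. The ref_* names are hypothetical and not part of kat::ptx.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side models; the ref_ prefix keeps them apart from
// the actual kat::ptx functions.

// Assumed .clamp behavior: shift amounts above 32 are treated as 32.
inline std::uint32_t ref_clamp_shift(std::uint32_t shift_amount)
{
    return shift_amount > 32 ? 32 : shift_amount;
}

// Model of vshl_add: shift left, then add the extra operand (wrapping).
inline std::uint32_t ref_vshl_add(std::uint32_t x, std::uint32_t shift_amount, std::uint32_t extra_operand)
{
    std::uint32_t s = ref_clamp_shift(shift_amount);
    // Shifting a 32-bit value by 32 is undefined in C++, so handle it explicitly.
    std::uint32_t shifted = (s >= 32) ? 0u : (x << s);
    return shifted + extra_operand;
}

// Model of vshr_add: shift right, then add the extra operand (wrapping).
inline std::uint32_t ref_vshr_add(std::uint32_t x, std::uint32_t shift_amount, std::uint32_t extra_operand)
{
    std::uint32_t s = ref_clamp_shift(shift_amount);
    std::uint32_t shifted = (s >= 32) ? 0u : (x >> s);
    return shifted + extra_operand;
}
```

A model like this can serve as the expected-value side of a device-side unit test, comparing its results with those of the corresponding kat::ptx function for the same inputs.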
