cuda-kat
CUDA kernel author's tools
math.cuh File Reference

Templatized mathematical function definitions for integer and floating-point types.

#include "common.cuh"
#include "constexpr_math.cuh"
#include <kat/on_device/builtins.cuh>
#include <type_traits>

Functions

template<typename I >
KAT_FD unsigned kat::log2_of_power_of_2 (I p)
 compute the base-two logarithm of a number known to be a power of 2.
 
template<typename T , typename S >
KAT_FD T kat::div_by_power_of_2_rounding_up (const T &dividend, const S &divisor)
 A variant of div_rounding_up (which you can find in constexpr_math.cuh), which has (non-constexpr, unfortunately) optimizations based on the knowledge that the divisor is a power of 2.
 
template<typename I , typename P >
constexpr KAT_FD I kat::div_by_power_of_2 (I dividend, P power_of_2)
 
template<typename T >
constexpr KAT_FD T kat::gcd (T u, T v)
 compute the greatest common divisor (gcd) of two values.
 
template<typename I >
KAT_FD I kat::lcm (I u, I v)
 compute the least common multiple (LCM) of two integer values
 
template<typename I >
KAT_FD int kat::detail::count_leading_zeros (I x)
 Return the number of consecutive bits, beginning from the most-significant, which are all 0 ("leading" zeros)
 
template<typename I >
KAT_FD unsigned kat::log2 (I x)
 compute the (integral) base-two logarithm of a number
 
template<typename T >
KAT_FD T kat::detail::minimum (std::integral_constant< bool, false >, T x, T y)
 
template<typename T >
KAT_FD T kat::detail::minimum (std::integral_constant< bool, true >, T x, T y)
 
template<typename T >
KAT_FD T kat::detail::maximum (std::integral_constant< bool, false >, T x, T y)
 
template<typename T >
KAT_FD T kat::detail::maximum (std::integral_constant< bool, true >, T x, T y)
 
template<typename T >
KAT_FD T kat::detail::absolute_value (std::integral_constant< bool, false >, T x)
 
template<typename T >
KAT_FD T kat::detail::absolute_value (std::integral_constant< bool, true >, T x)
 
template<typename T >
KAT_FD T kat::minimum (T x, T y)
 
template<typename T >
KAT_FD T kat::maximum (T x, T y)
 
template<typename T >
KAT_FD T kat::absolute_value (T x)
 

Detailed Description

Templatized mathematical function definitions for integer and floating-point types.

CUDA has many mathematical primitives, which are already found in builtins.cuh. However, they are often not defined for all types, and some functions are missing (e.g. gcd()) or can benefit from specialization (e.g. division by a power of 2). This file offers the wider selection of functions, using a primitive (from builtins::) when one is relevant and a multi-instruction implementation otherwise.

Note
Including this file is sufficient for accessing all functions in constexpr_math.cuh.

Function Documentation

§ count_leading_zeros()

template<typename I >
KAT_FD int kat::detail::count_leading_zeros (I x)
delete

Return the number of consecutive bits, beginning from the most-significant, which are all 0 ("leading" zeros)

Returns
The number of leading zeros, between 0 and the size of I in bits.
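To make the return range concrete, here is a minimal host-side sketch of counting leading zeros with a plain bit scan. The name `count_leading_zeros_sketch` is hypothetical and this is not the library's code. On the device, CUDA provides the `__clz` and `__clzll` intrinsics for this purpose; how this file maps onto them is not shown on this page.

```cpp
#include <cassert>
#include <climits>
#include <type_traits>

// Hypothetical host-side reference, not the cuda-kat implementation.
template <typename I>
int count_leading_zeros_sketch(I x)
{
    static_assert(std::is_integral<I>::value, "I must be an integral type");
    using U = typename std::make_unsigned<I>::type;
    const int num_bits = static_cast<int>(sizeof(I) * CHAR_BIT);
    const U bits = static_cast<U>(x);

    int count = 0;
    // Scan from the most-significant bit down; stop at the first set bit.
    for (int pos = num_bits - 1; pos >= 0; --pos) {
        if ((bits >> pos) & 1) { break; }
        ++count;
    }
    // Matches the documented range: between 0 and the size of I in bits.
    assert(count >= 0 && count <= num_bits);
    return count;
}
```

An input of 0 yields the full bit width of `I`, and any input with its top bit set yields 0.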

§ div_by_power_of_2_rounding_up()

template<typename T , typename S >
KAT_FD T kat::div_by_power_of_2_rounding_up ( const T &  dividend,
const S &  divisor 
)

A variant of div_rounding_up (which you can find in constexpr_math.cuh), which has (non-constexpr, unfortunately) optimizations based on the knowledge that the divisor is a power of 2.

Returns
The quotient dividend / divisor rounded up, i.e. the smallest integer not less than dividend / divisor
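The kind of optimization a power-of-2 divisor allows can be sketched as follows: the division becomes a right shift, and the remainder test becomes a bit mask. This is an illustration only, assuming a non-negative dividend; the names ending in `_sketch` are hypothetical and the library's actual implementation may differ.

```cpp
#include <cassert>

// Hypothetical helper: true when v has exactly one bit set.
template <typename S>
bool is_power_of_2_sketch(S v)
{
    return v > 0 && (v & (v - 1)) == 0;
}

// Hypothetical host-side sketch; assumes dividend >= 0 and a power-of-2 divisor.
template <typename T, typename S>
T div_by_power_of_2_rounding_up_sketch(T dividend, S divisor)
{
    assert(is_power_of_2_sketch(divisor));

    // The low bits selected by (divisor - 1) are exactly the remainder.
    const T mask = static_cast<T>(divisor - 1);
    const T round_up = ((dividend & mask) != 0) ? 1 : 0;

    // Find the exponent of the divisor, so that dividing becomes shifting.
    unsigned shift = 0;
    while ((S{1} << shift) != divisor) { ++shift; }

    return (dividend >> shift) + round_up;
}
```

For instance, 10 divided by 4 has a non-zero remainder, so the shifted quotient 2 is bumped to 3, while 8 divided by 4 stays at 2.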

§ gcd()

template<typename T >
constexpr KAT_FD T kat::gcd ( T  u,
T  v 
)

compute the greatest common divisor (gcd) of two values.

Parameters
u  One integral value (prefer making this the larger one)
v  Another integral value (prefer making this the smaller one)
Returns
the largest T value d such that d divides u and d divides v.
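One common way to compute a gcd is Euclid's algorithm, sketched below for non-negative inputs. Whether kat::gcd uses this exact method is an assumption; the names `gcd_sketch` and `gcd_sketch_demo` are hypothetical. With this method, passing the smaller value first only costs one extra step that swaps the operands, which is one plausible reason for the parameter-order advice above.

```cpp
#include <cassert>

// Hypothetical host-side sketch of Euclid's algorithm for non-negative values.
template <typename T>
T gcd_sketch(T u, T v)
{
    while (v != 0) {
        const T remainder = u % v;  // if u < v, this step just swaps u and v
        u = v;
        v = remainder;
    }
    return u;
}

// Small self-check of the documented contract.
inline void gcd_sketch_demo()
{
    assert(gcd_sketch(48, 18) == 6);
    assert(gcd_sketch(18, 48) == 6);  // operand order does not change the result
}
```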

§ lcm()

template<typename I >
KAT_FD I kat::lcm ( I  u,
I  v 
)

compute the least common multiple (LCM) of two integer values

Template Parameters
I  an integral (or integral-number-like) type
Parameters
u  One of the numbers which must divide the result
v  Another one of the numbers which must divide the result
Returns
The smallest positive I value which is a multiple of both u and v.
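The LCM is usually obtained from the gcd through the identity lcm(u, v) = u / gcd(u, v) * v. The sketch below illustrates that identity for non-negative inputs; it is not taken from the library, and the `_sketch` names and the choice to return 0 when either input is 0 are assumptions.

```cpp
#include <cassert>

// Hypothetical helper: Euclid's algorithm for non-negative values.
template <typename I>
I gcd_sketch(I u, I v)
{
    while (v != 0) {
        const I remainder = u % v;
        u = v;
        v = remainder;
    }
    return u;
}

// Hypothetical host-side sketch of lcm via gcd.
template <typename I>
I lcm_sketch(I u, I v)
{
    if (u == 0 || v == 0) { return I{0}; }  // assumed convention for zero inputs
    // Divide before multiplying to keep the intermediate value small.
    return (u / gcd_sketch(u, v)) * v;
}

// Small self-check: 12 is the smallest value that both 4 and 6 divide.
inline void lcm_sketch_demo()
{
    assert(lcm_sketch(4, 6) == 12);
}
```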

§ log2()

template<typename I >
KAT_FD unsigned kat::log2 (I x)

compute the (integral) base-two logarithm of a number

Note
Yes, this is trivial to do, but:
  1. This says what you're doing, not how you do it (e.g. left-shifting bits and such)
  2. There's a device-side optimization here (which isn't constexpr)
Parameters
x  a non-negative value
Returns
floor(log2(x)), i.e. the greatest exponent l such that 2^l <= x
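As an illustration of the return value, here is a minimal host-side sketch that computes the integral base-two logarithm by repeated right-shifting. The name `log2_sketch` is hypothetical, and the sketch assumes a strictly positive input, since the logarithm of 0 is undefined. The device-side optimization mentioned in the note is not reproduced here; it can presumably be derived from a count-leading-zeros primitive.

```cpp
#include <cassert>

// Hypothetical host-side sketch; assumes x > 0.
template <typename I>
unsigned log2_sketch(I x)
{
    assert(x > 0);
    unsigned exponent = 0;
    // Each halving that leaves a value of at least 1 adds one to the exponent.
    while (x > 1) {
        x >>= 1;
        ++exponent;
    }
    return exponent;
}
```

With this sketch, 1000 maps to 9, because 2^9 = 512 <= 1000 < 1024 = 2^10.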

§ log2_of_power_of_2()

template<typename I >
KAT_FD unsigned kat::log2_of_power_of_2 (I p)

compute the base-two logarithm of a number known to be a power of 2.

Note
Yes, this is trivial to do, but:
  1. This says what you're doing, not how you do it (e.g. left-shifting bits and such)
  2. There's a device-side optimization here (which isn't constexpr)
Parameters
p  an integral power of 2
Returns
the exponent l such that 2^l equals p
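When the input is known to be a power of 2, it has exactly one set bit, so the exponent equals the number of zero bits below that bit. The host-side sketch below shows this under that assumption; the name `log2_of_power_of_2_sketch` is hypothetical, and the library's device-side version may use a hardware primitive instead of a loop.

```cpp
#include <cassert>

// Hypothetical host-side sketch; assumes p is a power of 2.
template <typename I>
unsigned log2_of_power_of_2_sketch(I p)
{
    assert(p > 0 && (p & (p - 1)) == 0);  // exactly one bit is set
    unsigned exponent = 0;
    // Count the zero bits below the single set bit.
    while ((p & I{1}) == 0) {
        p >>= 1;
        ++exponent;
    }
    return exponent;
}
```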