cuda-api-wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
cuda::device_t Class Reference

Proxy class for a CUDA device. More...

#include <device.hpp>

Classes

class  global_memory_t
 A class to create a faux member in a device_t, in lieu of an in-class namespace (which C++ does not support); whenever you see a call my_dev.memory().foo(), think of it as my_dev::memory::foo(). More...
 

Public Types

using properties_t = device::properties_t
 
using attribute_value_t = device::attribute_value_t
 
using resource_limit_t = size_t
 
using shared_memory_bank_size_t = cudaSharedMemConfig
 
using resource_id_t = cudaLimit
 

Public Member Functions

bool can_access (device_t peer) const
 Determine whether this device can access the global memory of another CUDA device. More...
 
void enable_access_to (device_t peer)
 Enable access by this device to the global memory of another device. More...
 
void disable_access_to (device_t peer)
 Disable access by this device to the global memory of another device. More...
 
global_memory_t memory ()
 Obtains a proxy for the device's global memory.
 
properties_t properties () const
 Obtains the (mostly) non-numeric properties for this device.
 
::std::string name () const
 Obtains this device's human-readable name, e.g. More...
 
device::pci_location_t pci_id () const
 Obtains this device's location on the PCI express bus in terms of domain, bus and device id, e.g. More...
 
device::compute_architecture_t architecture () const
 Obtains the device's hardware architecture generation's numeric designator; see cuda::device::compute_architecture_t.
 
device::compute_capability_t compute_capability () const
 Obtains the device's compute capability; see cuda::device::compute_capability_t.
 
attribute_value_t get_attribute (device::attribute_t attribute) const
 Obtain a numeric-value attribute of the device. More...
 
bool supports_concurrent_managed_access () const
 Determine whether this device can coherently access managed memory concurrently with the CPU.
 
resource_limit_t get_resource_limit (resource_id_t resource) const
 Obtains the upper limit on the amount of a certain kind of resource this device offers. More...
 
void set_resource_limit (resource_id_t resource, resource_limit_t new_limit)
 Set the upper limit of one of the named numeric resources on this device.
 
void synchronize ()
 Waits for all previously-scheduled tasks on all streams (= queues) on this device to conclude. More...
 
void reset ()
 Invalidates all memory allocations and resets all state regarding this CUDA device on the current operating system process. More...
 
void set_cache_preference (multiprocessor_cache_preference_t preference)
 Controls the balance between L1 space and shared memory space for kernels executing on this device. More...
 
multiprocessor_cache_preference_t cache_preference () const
 Determines the balance between L1 space and shared memory space set for kernels executing on this device.
 
void set_shared_memory_bank_size (shared_memory_bank_size_t new_bank_size)
 Sets the shared memory bank size, described in this Parallel-for-all blog entry More...
 
shared_memory_bank_size_t shared_memory_bank_size () const
 Returns the shared memory bank size, as described in this Parallel-for-all blog entry More...
 
device::id_t id () const
 Return the proxied device's ID. More...
 
stream_t default_stream () const noexcept
 
stream_t create_stream (bool will_synchronize_with_default_stream, stream::priority_t priority=cuda::stream::default_priority)
 
event_t create_event (bool uses_blocking_sync=event::sync_by_busy_waiting, bool records_timing=event::do_record_timings, bool interprocess=event::not_interprocess)
 See cuda::event::create()
 
template<typename KernelFunction , typename ... KernelParameters>
void launch (bool thread_block_cooperativity, KernelFunction kernel_function, launch_configuration_t launch_configuration, KernelParameters ... parameters)
 
template<typename KernelFunction , typename ... KernelParameters>
void launch (const KernelFunction &kernel_function, launch_configuration_t launch_configuration, KernelParameters ... parameters)
 
device::stream_priority_range_t stream_priority_range () const
 Determines the range of possible priorities for streams on this device. More...
 
host_thread_synch_scheduling_policy_t synch_scheduling_policy () const
 
void set_synch_scheduling_policy (host_thread_synch_scheduling_policy_t new_policy)
 
bool keeping_larger_local_mem_after_resize () const
 
void keep_larger_local_mem_after_resize (bool keep=true)
 
void dont_keep_larger_local_mem_after_resize ()
 
bool can_map_host_memory () const
 Can we allocate mapped pinned memory on this device?
 
void enable_mapping_host_memory (bool allow=true)
 Control whether this device will support allocation of mapped pinned memory.
 
void disable_mapping_host_memory ()
 See enable_mapping_host_memory.
 
device_t & make_current ()
 Makes this device the CUDA Runtime API's current device. More...
 
 device_t (device_t &&other) noexcept=default
 
 device_t (const device_t &other) noexcept=default
 
device_t & operator= (const device_t &other) noexcept=default
 
device_t & operator= (device_t &&other) noexcept=default
 

Static Public Member Functions

static device_t choose_best_match (const properties_t &properties)
 

Friends

device_t device::get (device::id_t) noexcept
 

Detailed Description

Proxy class for a CUDA device.

Use this class - built around a device ID, or for the current device - to perform almost all, if not all, device-related operations, instead of passing the device ID around all the time.

Note
this is one of the three main classes in the Runtime API wrapper library, together with cuda::stream_t and cuda::event_t
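As a sketch of typical use (assuming the library's headers are on the include path; the exact header to include may differ by version and build setup):

```cpp
#include <device.hpp>  // per this page; your build may use an umbrella header instead
#include <iostream>

int main()
{
    // device::get() is the friend factory listed below; 0 is the usual default device ID
    cuda::device_t device = cuda::device::get(0);

    std::cout << "Working on device " << device.id()
              << " (" << device.name() << ")\n";

    // All further device-related operations go through the proxy,
    // rather than passing the raw device ID around
    device.synchronize();
}
```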

Member Function Documentation

◆ can_access()

bool cuda::device_t::can_access ( device_t  peer) const
inline

Determine whether this device can access the global memory of another CUDA device.

Parameters
peer    the device which is to be accessed
Returns
true iff access is possible
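A minimal peer-access sketch, assuming a machine with (at least) two CUDA devices:

```cpp
// Devices 0 and 1 are assumed to be present on this system
cuda::device_t dev0 = cuda::device::get(0);
cuda::device_t dev1 = cuda::device::get(1);

if (dev0.can_access(dev1)) {
    dev0.enable_access_to(dev1);
    // ... kernels running on dev0 may now dereference pointers
    //     into dev1's global memory ...
    dev0.disable_access_to(dev1);
}
```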

◆ disable_access_to()

void cuda::device_t::disable_access_to ( device_t  peer)
inline

Disable access by this device to the global memory of another device.

Parameters
peer    the device to which to disable access

◆ enable_access_to()

void cuda::device_t::enable_access_to ( device_t  peer)
inline

Enable access by this device to the global memory of another device.

Parameters
peer    the device to which to enable access

◆ get_attribute()

attribute_value_t cuda::device_t::get_attribute ( device::attribute_t  attribute) const
inline

Obtain a numeric-value attribute of the device.

Note
See device::attribute_t for explanation about attributes, properties and flags.
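Assuming device::attribute_t aliases the Runtime API's cudaDeviceAttr enumerators, a query might look like:

```cpp
// Number of multiprocessors on the device, using the raw Runtime API enumerator
auto num_sms = device.get_attribute(cudaDevAttrMultiProcessorCount);
```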

◆ get_resource_limit()

resource_limit_t cuda::device_t::get_resource_limit ( resource_id_t  resource) const
inline

Obtains the upper limit on the amount of a certain kind of resource this device offers.

Parameters
resource    which resource's limit to obtain
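Since resource_id_t is an alias for cudaLimit (see the public types above), the Runtime API's limit enumerators can be passed directly; a sketch:

```cpp
// Query, then raise, the per-thread stack size limit (illustrative values only)
auto stack_limit = device.get_resource_limit(cudaLimitStackSize);
device.set_resource_limit(cudaLimitStackSize, 2 * stack_limit);
```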

◆ id()

device::id_t cuda::device_t::id ( ) const
inline

Return the proxied device's ID.

◆ make_current()

device_t& cuda::device_t::make_current ( )
inline

Makes this device the CUDA Runtime API's current device.

Note
even after this device has been made current, its methods will still expressly set the current device themselves before doing anything(!)
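A short sketch of what making a device current affects:

```cpp
cuda::device_t device = cuda::device::get(1);  // assuming a second device exists
device.make_current();
// Raw Runtime API calls made by this thread now target device 1.
// The wrapper's own methods do not rely on this; they expressly set
// the current device themselves, as the note above says.
```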

◆ name()

::std::string cuda::device_t::name ( ) const
inline

Obtains this device's human-readable name, e.g.

"GeForce GTX 650 Ti BOOST".

◆ pci_id()

device::pci_location_t cuda::device_t::pci_id ( ) const
inline

Obtains this device's location on the PCI express bus in terms of domain, bus and device id, e.g.

(0, 1, 0)

◆ reset()

void cuda::device_t::reset ( )
inline

Invalidates all memory allocations and resets all state regarding this CUDA device on the current operating system process.

Todo:
Determine whether this actually performs a hardware reset or not

◆ set_cache_preference()

void cuda::device_t::set_cache_preference ( multiprocessor_cache_preference_t  preference)
inline

Controls the balance between L1 space and shared memory space for kernels executing on this device.

Parameters
preference    the preferred balance between L1 and shared memory

◆ set_shared_memory_bank_size()

void cuda::device_t::set_shared_memory_bank_size ( shared_memory_bank_size_t  new_bank_size)
inline

Sets the shared memory bank size, described in this Parallel-for-all blog entry

Parameters
new_bank_size    the shared memory bank size to set, in bytes

◆ shared_memory_bank_size()

shared_memory_bank_size_t cuda::device_t::shared_memory_bank_size ( ) const
inline

Returns the shared memory bank size, as described in this Parallel-for-all blog entry

Returns
the shared memory bank size in bytes

◆ stream_priority_range()

device::stream_priority_range_t cuda::device_t::stream_priority_range ( ) const
inline

Determines the range of possible priorities for streams on this device.

Returns
a priority range, whose semantics are a bit confusing; see priority_range_t. If the device does not support stream priorities, a 'trivial' range of priority values will be returned.

◆ synchronize()

void cuda::device_t::synchronize ( )
inline

Waits for all previously-scheduled tasks on all streams (= queues) on this device to conclude.

Depending on the host_thread_synch_scheduling_policy_t set for this device, the thread calling this method will either yield, spin or block until all tasks previously scheduled on this device have concluded.
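A sketch of the usual launch-then-wait pattern (my_kernel, launch_config and some_arg are assumed to be defined elsewhere):

```cpp
// Enqueue a kernel on this device, then block the host thread until it completes
device.launch(my_kernel, launch_config, some_arg);
device.synchronize();  // yields, spins or blocks, per the device's scheduling policy
```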


The documentation for this class was generated from the following files: