cuda-api-wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
cuda::context_t Class Reference

Wrapper class for a CUDA context. More...

#include <context.hpp>

Inheritance diagram for cuda::context_t:

Classes

class  global_memory_type
 A class to create a faux member in a context_t, in lieu of an in-class namespace (which C++ does not support); whenever you see a function my_context.memory::foo(), think of it as my_context::memory::foo(). More...
 

Public Member Functions

context::handle_t handle () const noexcept
 
device::id_t device_id () const noexcept
 
device_t device () const
 
bool is_owning () const noexcept
 
size_t total_memory () const
 The amount of total global device memory available to this context, including memory already allocated.
 
size_t free_memory () const
 The amount of global device memory available to this context which has not yet been allocated. More...
 
stream_t default_stream () const
 
template<typename Kernel , typename ... KernelParameters>
void launch (Kernel kernel, launch_configuration_t launch_configuration, KernelParameters... parameters) const
 
multiprocessor_cache_preference_t cache_preference () const
 Determines the balance between L1 space and shared memory space set for kernels executing within this context.
 
size_t stack_size () const
 
context::limit_value_t printf_buffer_size () const
 
context::limit_value_t memory_allocation_heap_size () const
 
context::limit_value_t maximum_depth_of_child_grid_sync_calls () const
 
global_memory_type memory () const
 Get a wrapper object for this context's associated device-global memory.
 
context::limit_value_t maximum_outstanding_kernel_launches () const
 
context::shared_memory_bank_size_t shared_memory_bank_size () const
 Returns the shared memory bank size, as described in this Parallel-for-all blog entry More...
 
bool is_current () const
 
bool is_primary () const
 
context::stream_priority_range_t stream_priority_range () const
 Get the range of priority values one can set for streams in this context.
 
context::limit_value_t get_limit (context::limit_t limit_id) const
 Get one of the configurable limits for this context (and the events, streams, kernels, etc. defined in it). More...
 
version_t api_version () const
 Returns a version number corresponding to the capabilities of this context, which can be used to direct work targeting a specific API version among contexts. More...
 
context::host_thread_sync_scheduling_policy_t sync_scheduling_policy () const
 Gets the synchronization policy to be used for threads synchronizing with this CUDA context. More...
 
bool keeping_larger_local_mem_after_resize () const
 
stream_t create_stream (bool will_synchronize_with_default_stream, stream::priority_t priority=cuda::stream::default_priority) const
 Create a new stream within this context; see cuda::stream::create() for details regarding the parameters.
 
event_t create_event (bool uses_blocking_sync=event::sync_by_busy_waiting, bool records_timing=event::do_record_timings, bool interprocess=event::not_interprocess) const
 Create a new event within this context; see cuda::event::create() for details regarding the parameters.
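A minimal usage sketch for these two factory functions; how the ctx wrapper is obtained is assumed and not shown, and the umbrella header path is an assumption:

```cpp
#include <cuda/api.hpp> // umbrella header of cuda-api-wrappers (assumed path)

void create_stream_and_event(const cuda::context_t& ctx)
{
    // A stream which will not synchronize with the context's default stream,
    // created at the library's default priority
    auto stream = ctx.create_stream(false);

    // An event with the documented defaults: sync by busy-waiting,
    // timing recorded, not shared between processes
    auto event = ctx.create_event();

    // ... enqueue work on `stream`, record `event`, etc. ...
}
```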
 
void enable_access_to (const context_t &peer) const
 Allow kernels and memory operations within this context to involve memory allocated in a peer context.
 
void disable_access_to (const context_t &peer) const
 Prevent kernels and memory operations within this context from involving memory allocated in a peer context.
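Peer access is directional, so the pair above is typically applied symmetrically when two contexts need to share allocations. A hedged sketch; both contexts are assumed valid and on peer-capable devices, which is not checked here:

```cpp
// Allow each context's kernels and memory operations to involve
// memory allocated in the other context
void link_peers(const cuda::context_t& ctx_a, const cuda::context_t& ctx_b)
{
    ctx_a.enable_access_to(ctx_b);
    ctx_b.enable_access_to(ctx_a);
}

// Revoke that mutual access again
void unlink_peers(const cuda::context_t& ctx_a, const cuda::context_t& ctx_b)
{
    ctx_a.disable_access_to(ctx_b);
    ctx_b.disable_access_to(ctx_a);
}
```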
 
void reset_persisting_l2_cache () const
 Clear the L2 cache memory which persists between invocations of kernels.
 
void set_shared_memory_bank_size (context::shared_memory_bank_size_t bank_size) const
 Sets the shared memory bank size, described in this Parallel-for-all blog entry More...
 
void set_cache_preference (multiprocessor_cache_preference_t preference) const
 Controls the balance between L1 space and shared memory space for kernels executing within this context. More...
 
void set_limit (context::limit_t limit_id, context::limit_value_t new_value) const
 Set one of the configurable limits for this context (and the events, streams, kernels, etc. defined in it). More...
 
void stack_size (context::limit_value_t new_value) const
 Set the limit on the size of the stack a kernel thread can use when running. More...
 
void printf_buffer_size (context::limit_value_t new_value) const
 
void memory_allocation_heap_size (context::limit_value_t new_value) const
 
void set_maximum_depth_of_child_grid_sync_calls (context::limit_value_t new_value) const
 
void set_maximum_outstanding_kernel_launches (context::limit_value_t new_value) const
 
void synchronize () const
 Avoid executing any additional instructions on this thread until all work on all streams in this context has been concluded. More...
 
 context_t (const context_t &other)
 
 context_t (context_t &&other) noexcept
 
context_t & operator= (const context_t &)=delete
 
context_t & operator= (context_t &&other) noexcept
 
template<typename ContiguousContainer , cuda::detail_::enable_if_t< detail_::is_kinda_like_contiguous_container< ContiguousContainer >::value, bool > = true>
module_t create_module (ContiguousContainer module_data, const link::options_t &link_options) const
 Create a new module of kernels and global memory regions within this context; see also cuda::module::create()
 
template<typename ContiguousContainer , cuda::detail_::enable_if_t< detail_::is_kinda_like_contiguous_container< ContiguousContainer >::value, bool > = true>
module_t create_module (ContiguousContainer module_data) const
 

Detailed Description

Wrapper class for a CUDA context.

Use this class - built around a context id - to perform all context-related operations the CUDA Driver (or, in fact, Runtime) API is capable of.

Note
By default, this class has RAII semantics, i.e. it creates a context on construction and destroys it on destruction, rather than being merely an ephemeral wrapper one could apply and discard; however, that second kind of semantics is also supported, via the context_t::owning_ field.
A context is specific to a single device; see, therefore, also cuda::device_t.
This class is a "reference type", not a "value type". Therefore, making changes to properties of the context is a const-respecting operation on this class.
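To illustrate the reference-type semantics described in these notes, here is a sketch which queries a context through a const reference; obtaining the context itself is assumed to happen elsewhere:

```cpp
#include <iostream>

void inspect(const cuda::context_t& ctx)
{
    // All of these are const member functions, even though some of the
    // information they report changes as work is done on the device
    std::cout
        << "Device id:       " << ctx.device_id()                   << '\n'
        << "Owning wrapper:  " << (ctx.is_owning() ? "yes" : "no")  << '\n'
        << "Total memory:    " << ctx.total_memory() << " bytes"    << '\n'
        << "Primary context: " << (ctx.is_primary() ? "yes" : "no") << '\n';
}
```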

Member Function Documentation

◆ api_version()

version_t cuda::context_t::api_version ( ) const
inline

Returns a version number corresponding to the capabilities of this context, which can be used to direct work targeting a specific API version among contexts.

Note
The versions returned by this call are not the same as the driver version.

◆ free_memory()

size_t cuda::context_t::free_memory ( ) const
inline

The amount of global device memory available to this context which has not yet been allocated.

Note
It is not guaranteed that this entire amount can actually be successfully allocated.
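Combined with total_memory(), this allows a rough used-memory estimate. A sketch; note the caveat above, i.e. the "free" figure is not a guarantee of allocatability:

```cpp
#include <cstddef>

// A rough estimate only: other threads or processes may allocate or
// free memory between the two calls, and free_memory() does not
// guarantee the reported amount can be allocated in one piece.
std::size_t approximate_used_memory(const cuda::context_t& ctx)
{
    return ctx.total_memory() - ctx.free_memory();
}
```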

◆ get_limit()

context::limit_value_t cuda::context_t::get_limit ( context::limit_t  limit_id) const
inline

Get one of the configurable limits for this context (and the events, streams, kernels, etc. defined in this context).
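Assuming context::limit_t aliases the driver API's CUlimit enumeration (a plausible reading, but not stated above), a generic query might look like:

```cpp
#include <cuda.h> // driver API's CU_LIMIT_* enumerators (assumed relevant)

void query_some_limits(const cuda::context_t& ctx)
{
    // Generic form; presumably equivalent to the dedicated accessors
    // ctx.stack_size() and ctx.printf_buffer_size()
    auto stack_limit  = ctx.get_limit(CU_LIMIT_STACK_SIZE);
    auto printf_limit = ctx.get_limit(CU_LIMIT_PRINTF_FIFO_SIZE);
    (void) stack_limit; (void) printf_limit;
}
```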

◆ is_current()

bool cuda::context_t::is_current ( ) const
inline
Returns
True if this context is the current CUDA context for this thread (i.e. the top item in the context stack)

◆ is_owning()

bool cuda::context_t::is_owning ( ) const
inlinenoexcept
Returns
True if this wrapper is the one responsible for having the wrapped CUDA context destroyed eventually

◆ is_primary()

bool cuda::context_t::is_primary ( ) const
inline
Returns
True if this context is the primary context for its associated device.

◆ maximum_depth_of_child_grid_sync_calls()

context::limit_value_t cuda::context_t::maximum_depth_of_child_grid_sync_calls ( ) const
inline
Returns
the maximum grid depth at which a thread can issue the device runtime call cudaDeviceSynchronize() / cuda::device::synchronize() to wait on child grid launches to complete.
Todo:
Is this really a feature of the context? Not of the device?

◆ maximum_outstanding_kernel_launches()

context::limit_value_t cuda::context_t::maximum_outstanding_kernel_launches ( ) const
inline
Returns
maximum number of outstanding device runtime launches that can be made from this context.
Todo:
Is this really a feature of the context? Not of the device?

◆ memory_allocation_heap_size()

context::limit_value_t cuda::context_t::memory_allocation_heap_size ( ) const
inline
Returns
the size in bytes of the heap available to malloc() & free() calls in device-side code, of kernels within this context

◆ printf_buffer_size()

context::limit_value_t cuda::context_t::printf_buffer_size ( ) const
inline
Returns
the size of the FIFO (first-in, first-out) buffer used by the printf() function available in device-side code, of kernels within this context

◆ set_cache_preference()

void cuda::context_t::set_cache_preference ( multiprocessor_cache_preference_t  preference) const
inline

Controls the balance between L1 space and shared memory space for kernels executing within this context.

Parameters
preference  the preferred balance between L1 and shared memory

◆ set_limit()

void cuda::context_t::set_limit ( context::limit_t  limit_id,
context::limit_value_t  new_value 
) const
inline

Set one of the configurable limits for this context (and the events, streams, kernels, etc. defined in this context).
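A sketch of adjusting a limit via both the generic setter and the dedicated stack_size() overload documented below; the CU_LIMIT_* enumerator assumes context::limit_t aliases the driver's CUlimit:

```cpp
#include <cuda.h> // driver API's CU_LIMIT_* enumerators (assumed relevant)

void grow_kernel_stacks(const cuda::context_t& ctx)
{
    // Generic form
    ctx.set_limit(CU_LIMIT_STACK_SIZE, 8 * 1024);

    // Dedicated setter (presumably in bytes; see the stack_size()
    // documentation's own caveat on units)
    ctx.stack_size(8 * 1024);
}
```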

◆ set_shared_memory_bank_size()

void cuda::context_t::set_shared_memory_bank_size ( context::shared_memory_bank_size_t  bank_size) const
inline

Sets the shared memory bank size, described in this Parallel-for-all blog entry

Parameters
bank_size  the shared memory bank size to set

◆ shared_memory_bank_size()

context::shared_memory_bank_size_t cuda::context_t::shared_memory_bank_size ( ) const
inline

Returns the shared memory bank size, as described in this Parallel-for-all blog entry

Returns
the shared memory bank size in bytes

◆ stack_size() [1/2]

size_t cuda::context_t::stack_size ( ) const
inline
Returns
the stack size in bytes of each GPU thread when running kernels within this context

◆ stack_size() [2/2]

void cuda::context_t::stack_size ( context::limit_value_t  new_value) const
inline

Set the limit on the size of the stack a kernel thread can use when running.

Todo:
Verify this is in bytes!

◆ sync_scheduling_policy()

context::host_thread_sync_scheduling_policy_t cuda::context_t::sync_scheduling_policy ( ) const
inline

Gets the synchronization policy to be used for threads synchronizing with this CUDA context.

Note
see context::host_thread_sync_scheduling_policy_t for a description of the various policies.

◆ synchronize()

void cuda::context_t::synchronize ( ) const
inline

Avoid executing any additional instructions on this thread until all work on all streams in this context has been concluded.

Note
The synchronization will occur using this context's sync_scheduling_policy()
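A typical pattern is to launch work and then block the host thread on the whole context; kernel, config and the arguments below are placeholders passed in by the caller, not part of the documented API:

```cpp
template <typename Kernel, typename... Args>
void run_and_wait(
    const cuda::context_t&        ctx,
    Kernel                        kernel,
    cuda::launch_configuration_t  config,
    Args...                       args)
{
    ctx.launch(kernel, config, args...);
    ctx.synchronize(); // returns only once every stream in ctx is idle
}
```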
