cuda-api-wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
cuda::context_t Class Reference

Wrapper class for a CUDA context. More...

#include <context.hpp>

Inheritance diagram for cuda::context_t:

Classes

class  global_memory_type
 A class to create a faux member in a context_t, in lieu of an in-class namespace (which C++ does not support); whenever you see a function my_context.memory::foo(), think of it as my_context::memory::foo(). More...
 

Public Member Functions

context::handle_t handle () const noexcept
 
device::id_t device_id () const noexcept
 
device_t device () const
 
bool is_owning () const noexcept
 
size_t total_memory () const
 The amount of total global device memory available to this context, including memory already allocated.
 
size_t free_memory () const
 The amount of global device memory available to this context which has not yet been allocated. More...
 
stream_t default_stream () const
 
template<typename Kernel , typename ... KernelParameters>
void launch (Kernel kernel, launch_configuration_t launch_configuration, KernelParameters... parameters) const
 
multiprocessor_cache_preference_t cache_preference () const
 Determines the balance between L1 space and shared memory space set for kernels executing within this context.
 
size_t stack_size () const
 
context::limit_value_t printf_buffer_size () const
 
context::limit_value_t memory_allocation_heap_size () const
 
context::limit_value_t maximum_depth_of_child_grid_sync_calls () const
 
global_memory_type memory () const
 Get a wrapper object for this context's associated device-global memory.
 
context::limit_value_t maximum_outstanding_kernel_launches () const
 
context::shared_memory_bank_size_t shared_memory_bank_size () const
 Returns the shared memory bank size, as described in this Parallel-for-all blog entry More...
 
bool is_current () const
 
bool is_primary () const
 
context::stream_priority_range_t stream_priority_range () const
 Get the range of priority values one can set for streams in this context.
 
context::limit_value_t get_limit (context::limit_t limit_id) const
 Get one of the configurable limits for this context (and the events, streams, kernels, etc. defined in it). More...
 
version_t api_version () const
 Returns a version number corresponding to the capabilities of this context, which can be used to direct work targeting a specific API version among contexts. More...
 
context::host_thread_sync_scheduling_policy_t sync_scheduling_policy () const
 Gets the synchronization policy to be used for threads synchronizing with this CUDA context. More...
 
bool keeping_larger_local_mem_after_resize () const
 
stream_t create_stream (bool will_synchronize_with_default_stream, stream::priority_t priority=cuda::stream::default_priority) const
 Create a new stream within this context; see cuda::stream::create() for details regarding the parameters.
 
event_t create_event (bool uses_blocking_sync=event::sync_by_busy_waiting, bool records_timing=event::do_record_timings, bool interprocess=event::not_interprocess) const
 Create a new event within this context; see cuda::event::create() for details regarding the parameters.
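A minimal usage sketch for these two factory functions; how the ctx wrapper is obtained is assumed and not shown, and the umbrella header path is an assumption:

```cpp
#include <cuda/api.hpp> // umbrella header of cuda-api-wrappers (assumed path)

void create_stream_and_event(const cuda::context_t& ctx)
{
    // A stream which will not synchronize with the context's default stream,
    // created at the library's default priority
    auto stream = ctx.create_stream(false);

    // An event with the documented defaults: sync by busy-waiting,
    // timing recorded, not shared between processes
    auto event = ctx.create_event();

    // ... enqueue work on `stream`, record `event`, etc. ...
}
```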
 
void enable_access_to (const context_t &peer) const
 Allow kernels and memory operations within this context to involve memory allocated in a peer context.
 
void disable_access_to (const context_t &peer) const
 Prevent kernels and memory operations within this context from involving memory allocated in a peer context.
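Peer access is directional, so the pair above is typically applied symmetrically when two contexts need to share allocations. A hedged sketch; both contexts are assumed valid and on peer-capable devices, which is not checked here:

```cpp
// Allow each context's kernels and memory operations to involve
// memory allocated in the other context
void link_peers(const cuda::context_t& ctx_a, const cuda::context_t& ctx_b)
{
    ctx_a.enable_access_to(ctx_b);
    ctx_b.enable_access_to(ctx_a);
}

// Revoke that mutual access again
void unlink_peers(const cuda::context_t& ctx_a, const cuda::context_t& ctx_b)
{
    ctx_a.disable_access_to(ctx_b);
    ctx_b.disable_access_to(ctx_a);
}
```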
 
void reset_persisting_l2_cache () const
 Clear the L2 cache memory which persists between invocations of kernels.
 
void set_shared_memory_bank_size (context::shared_memory_bank_size_t bank_size) const
 Sets the shared memory bank size, described in this Parallel-for-all blog entry More...
 
void set_cache_preference (multiprocessor_cache_preference_t preference) const
 Controls the balance between L1 space and shared memory space for kernels executing within this context. More...
 
void set_limit (context::limit_t limit_id, context::limit_value_t new_value) const
 Set one of the configurable limits for this context (and the events, streams, kernels, etc. defined in it). More...
 
void stack_size (context::limit_value_t new_value) const
 Set the limit on the size of the stack a kernel thread can use when running. More...
 
void printf_buffer_size (context::limit_value_t new_value) const
 
void memory_allocation_heap_size (context::limit_value_t new_value) const
 
void set_maximum_depth_of_child_grid_sync_calls (context::limit_value_t new_value) const
 
void set_maximum_outstanding_kernel_launches (context::limit_value_t new_value) const
 
void synchronize () const
 Avoid executing any additional instructions on this thread until all work on all streams in this context has been concluded. More...
 
 context_t (const context_t &other)
 
 context_t (context_t &&other) noexcept
 
context_t & operator= (const context_t &)=delete
 
context_t & operator= (context_t &&other) noexcept
 
template<typename ContiguousContainer , cuda::detail_::enable_if_t< detail_::is_kinda_like_contiguous_container< ContiguousContainer >::value, bool > = true>
module_t create_module (ContiguousContainer module_data, const link::options_t &link_options) const
 Create a new module of kernels and global memory regions within this context; see also cuda::module::create()
 
template<typename ContiguousContainer , cuda::detail_::enable_if_t< detail_::is_kinda_like_contiguous_container< ContiguousContainer >::value, bool > = true>
module_t create_module (ContiguousContainer module_data) const
 

Detailed Description

Wrapper class for a CUDA context.

Use this class - built around a context id - to perform all context-related operations the CUDA Driver (or, in fact, Runtime) API is capable of.

Note
By default, this class has RAII semantics, i.e. it creates a context on construction and destroys it on destruction, rather than being merely an ephemeral wrapper one could apply and discard; however, that second kind of semantics is also supported, via the context_t::owning_ field.
A context is specific to a single device; see, therefore, also cuda::device_t.
This class is a "reference type", not a "value type". Therefore, making changes to properties of the context is a const-respecting operation on this class.
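To illustrate the reference-type semantics described in these notes, here is a sketch which queries a context through a const reference; obtaining the context itself is assumed to happen elsewhere:

```cpp
#include <iostream>

void inspect(const cuda::context_t& ctx)
{
    // All of these are const member functions, even though some of the
    // information they report changes as work is done on the device
    std::cout
        << "Device id:       " << ctx.device_id()                   << '\n'
        << "Owning wrapper:  " << (ctx.is_owning() ? "yes" : "no")  << '\n'
        << "Total memory:    " << ctx.total_memory() << " bytes"    << '\n'
        << "Primary context: " << (ctx.is_primary() ? "yes" : "no") << '\n';
}
```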

Member Function Documentation

◆ api_version()

version_t cuda::context_t::api_version ( ) const
inline

Returns a version number corresponding to the capabilities of this context, which can be used to direct work targeting a specific API version among contexts.

Note
The versions returned by this call are not the same as the driver version.

◆ free_memory()

size_t cuda::context_t::free_memory ( ) const
inline

The amount of global device memory available to this context which has not yet been allocated.

Note
It is not guaranteed that this entire amount can actually be successfully allocated.
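Combined with total_memory(), this allows a rough used-memory estimate. A sketch; note the caveat above, i.e. the "free" figure is not a guarantee of allocatability:

```cpp
#include <cstddef>

// A rough estimate only: other threads or processes may allocate or
// free memory between the two calls, and free_memory() does not
// guarantee the reported amount can be allocated in one piece.
std::size_t approximate_used_memory(const cuda::context_t& ctx)
{
    return ctx.total_memory() - ctx.free_memory();
}
```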

◆ get_limit()

context::limit_value_t cuda::context_t::get_limit ( context::limit_t  limit_id) const
inline

Get one of the configurable limits for this context (and the events, streams, kernels, etc. defined in this context).
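Assuming context::limit_t aliases the driver API's CUlimit enumeration (a plausible reading, but not stated above), a generic query might look like:

```cpp
#include <cuda.h> // driver API's CU_LIMIT_* enumerators (assumed relevant)

void query_some_limits(const cuda::context_t& ctx)
{
    // Generic form; presumably equivalent to the dedicated accessors
    // ctx.stack_size() and ctx.printf_buffer_size()
    auto stack_limit  = ctx.get_limit(CU_LIMIT_STACK_SIZE);
    auto printf_limit = ctx.get_limit(CU_LIMIT_PRINTF_FIFO_SIZE);
    (void) stack_limit; (void) printf_limit;
}
```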

◆ is_current()

bool cuda::context_t::is_current ( ) const
inline
Returns
True if this context is the current CUDA context for this thread (i.e. the top item in the context stack)

◆ is_owning()

bool cuda::context_t::is_owning ( ) const
inlinenoexcept
Returns
True if this wrapper is the one responsible for having the wrapped CUDA context destroyed eventually

◆ is_primary()

bool cuda::context_t::is_primary ( ) const
inline
Returns
True if this context is the primary context for its associated device.

◆ maximum_depth_of_child_grid_sync_calls()

context::limit_value_t cuda::context_t::maximum_depth_of_child_grid_sync_calls ( ) const
inline
Returns
the maximum grid depth at which a thread can issue the device runtime call cudaDeviceSynchronize() / cuda::device::synchronize() to wait on child grid launches to complete.
Todo:
Is this really a feature of the context? Not of the device?

◆ maximum_outstanding_kernel_launches()

context::limit_value_t cuda::context_t::maximum_outstanding_kernel_launches ( ) const
inline
Returns
maximum number of outstanding device runtime launches that can be made from this context.
Todo:
Is this really a feature of the context? Not of the device?

◆ memory_allocation_heap_size()

context::limit_value_t cuda::context_t::memory_allocation_heap_size ( ) const
inline
Returns
the size in bytes of the heap available to malloc() & free() calls in device-side code, of kernels within this context

◆ printf_buffer_size()

context::limit_value_t cuda::context_t::printf_buffer_size ( ) const
inline
Returns
the size of the FIFO (first-in, first-out) buffer used by the printf() function available in device-side code, of kernels within this context

◆ set_cache_preference()

void cuda::context_t::set_cache_preference ( multiprocessor_cache_preference_t  preference) const
inline

Controls the balance between L1 space and shared memory space for kernels executing within this context.

Parameters
preference  the preferred balance between L1 and shared memory

◆ set_limit()

void cuda::context_t::set_limit ( context::limit_t  limit_id,
context::limit_value_t  new_value 
) const
inline

Set one of the configurable limits for this context (and the events, streams, kernels, etc. defined in this context).
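A sketch of adjusting a limit via both the generic setter and the dedicated stack_size() overload documented below; the CU_LIMIT_* enumerator assumes context::limit_t aliases the driver's CUlimit:

```cpp
#include <cuda.h> // driver API's CU_LIMIT_* enumerators (assumed relevant)

void grow_kernel_stacks(const cuda::context_t& ctx)
{
    // Generic form
    ctx.set_limit(CU_LIMIT_STACK_SIZE, 8 * 1024);

    // Dedicated setter (presumably in bytes; see the stack_size()
    // documentation's own caveat on units)
    ctx.stack_size(8 * 1024);
}
```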

◆ set_shared_memory_bank_size()

void cuda::context_t::set_shared_memory_bank_size ( context::shared_memory_bank_size_t  bank_size) const
inline

Sets the shared memory bank size, described in this Parallel-for-all blog entry

Parameters
bank_size  the shared memory bank size to set

◆ shared_memory_bank_size()

context::shared_memory_bank_size_t cuda::context_t::shared_memory_bank_size ( ) const
inline

Returns the shared memory bank size, as described in this Parallel-for-all blog entry

Returns
the shared memory bank size in bytes

◆ stack_size() [1/2]

size_t cuda::context_t::stack_size ( ) const
inline
Returns
the stack size in bytes of each GPU thread when running kernels within this context

◆ stack_size() [2/2]

void cuda::context_t::stack_size ( context::limit_value_t  new_value) const
inline

Set the limit on the size of the stack a kernel thread can use when running.

Todo:
Verify this is in bytes!

◆ sync_scheduling_policy()

context::host_thread_sync_scheduling_policy_t cuda::context_t::sync_scheduling_policy ( ) const
inline

Gets the synchronization policy to be used for threads synchronizing with this CUDA context.

Note
see context::host_thread_sync_scheduling_policy_t for a description of the various policies.

◆ synchronize()

void cuda::context_t::synchronize ( ) const
inline

Avoid executing any additional instructions on this thread until all work on all streams in this context has been concluded.

Note
The synchronization will occur using this context's sync_scheduling_policy()
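A typical pattern is to launch work and then block the host thread on the whole context; kernel, config and the arguments below are placeholders passed in by the caller, not part of the documented API:

```cpp
template <typename Kernel, typename... Args>
void run_and_wait(
    const cuda::context_t&        ctx,
    Kernel                        kernel,
    cuda::launch_configuration_t  config,
    Args...                       args)
{
    ctx.launch(kernel, config, args...);
    ctx.synchronize(); // returns only once every stream in ctx is idle
}
```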
