Representation, allocation and manipulation of CUDA-related memory of different kinds.

Namespaces
	device
	CUDA-Device-global memory on a single device (not accessible from the host)

	host
	Host-side (= system) memory which is "pinned".

	managed
	Paged memory accessible in both device-side and host-side code by triggering transfers of pages between physical system memory and physical device memory.

	mapped
	Memory regions appearing in both the host-side and device-side address spaces, with the regions in both spaces mapped to each other.

	shared
	A memory space whose contents are shared by all threads in a CUDA kernel block, but specific to each kernel block separately.

Classes
struct	allocation_options
	Options accepted by CUDA's allocator of memory with a host-side aspect (host-only or managed memory).

struct	copy_parameters_t
	A builder-ish subclass template around the basic 2D or 3D copy parameters which CUDA's complex copying API actually takes.

class	pointer_t
	A convenience wrapper around a raw pointer "known" to the CUDA runtime and which thus has various kinds of associated information which this wrapper allows access to.

class	unique_region
	A class for holding a region_t of memory owned "uniquely" by its creator - similar to how `::std::unique_ptr` holds a uniquely-owned pointer.

Enumerations
enum	endpoint_t { source, destination }
	Type for choosing between endpoints of copy operations.

enum	portability_across_contexts : bool { isnt_portable = false, is_portable = true }
	A memory allocation setting: Can the allocated memory be used in other CUDA driver contexts (in addition to the implicit default context we have with the Runtime API).

enum	cpu_write_combining : bool { without_wc = false, with_wc = true }
	A memory allocation setting: Should the allocated memory be configured as write-combined?

enum	type_t : ::std::underlying_type< CUmemorytype >::type { host_ = CU_MEMORYTYPE_HOST, device_ = CU_MEMORYTYPE_DEVICE, array = CU_MEMORYTYPE_ARRAY, unified_ = CU_MEMORYTYPE_UNIFIED, managed_ = CU_MEMORYTYPE_UNIFIED, non_cuda = ~(::std::underlying_type<CUmemorytype>::type{0}) }
	The CUDA execution ecosystem involves different memory spaces in their relation to a GPU device or their treatment by the CUDA driver; this type distinguishes among them.

Functions
copy_parameters_t< 3 >::intra_context_type	as_intra_context_parameters (const copy_parameters_t< 3 > &params)

void	set (void *ptr, int byte_value, size_t num_bytes, optional_ref< const stream_t > stream={})
	Sets a number of bytes in memory to a fixed value.

void	set (region_t region, int byte_value, optional_ref< const stream_t > stream={})
	Sets all bytes in a region of memory to a fixed value.

void	zero (region_t region, optional_ref< const stream_t > stream={})
	Sets all bytes in a region of memory to 0 (zero).

void	zero (void *ptr, size_t num_bytes, optional_ref< const stream_t > stream={})
	Zero-out a region of memory.

template<typename T >
void	zero (T *ptr)
	Sets all bytes of a single pointed-to value to 0.

template<dimensionality_t NumDimensions>
void	copy (copy_parameters_t< NumDimensions > params, optional_ref< const stream_t > stream={})
	An almost-generalized-case memory copy, taking a rather complex structure of copy parameters - wrapping the CUDA driver's own most-generalized-case copy.

template<typename T , dimensionality_t NumDimensions>
void	copy (const array_t< T, NumDimensions > &destination, const context_t &source_context, const T *source, optional_ref< const stream_t > stream={})
	Copies data into a CUDA array from non-array memory, synchronously when no stream is passed.

template<typename T , dimensionality_t NumDimensions>
void	copy (array_t< T, NumDimensions > &destination, const T *source, optional_ref< const stream_t > stream={})
	Copies data into a CUDA array from non-array memory, synchronously when no stream is passed.

template<typename T , dimensionality_t NumDimensions>
void	copy (const array_t< T, NumDimensions > &destination, span< T const > source, optional_ref< const stream_t > stream={})
	Copies a contiguous sequence of elements in memory into a CUDA array.

template<typename T , dimensionality_t NumDimensions>
void	copy (const context_t &context, T *destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream={})
	Copies data from a CUDA array into non-array memory, synchronously when no stream is passed.

template<typename T , dimensionality_t NumDimensions>
void	copy (T *destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream={})
	Copies data from a CUDA array into non-array memory, synchronously when no stream is passed.

template<typename T , dimensionality_t NumDimensions>
void	copy (span< T > destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream={})
	Copies the contents of a CUDA array into a sequence of contiguous elements in memory.

template<typename T , dimensionality_t NumDimensions>
void	copy (const array_t< T, NumDimensions > &destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream)
	Copies the contents of one CUDA array to another.

template<typename T , dimensionality_t NumDimensions>
void	copy (region_t destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream={})
	Copies the contents of a CUDA array into a region of memory.

template<typename T , dimensionality_t NumDimensions>
void	copy (array_t< T, NumDimensions > &destination, const_region_t source, optional_ref< const stream_t > stream={})
	Copies the contents of a region of memory into a CUDA array.

template<typename T >
void	copy_single (T *destination, const T *source, optional_ref< const stream_t > stream={})
	Copies a single (typed) value between two memory locations, synchronously when no stream is passed.

void	copy (void *destination, void const *source, size_t num_bytes, optional_ref< const stream_t > stream={})
	Asynchronously copies data between memory spaces or within a memory space.

template<typename T , size_t N>
void	copy (c_array< T, N > &destination, const_region_t source, optional_ref< const stream_t > stream={})
	Copy the contents of a memory region into a C-style array, interpreting the memory as a sequence of elements of the array's element type.

template<typename T , size_t N>
void	copy (region_t destination, c_array< const T, N > const &source, optional_ref< const stream_t > stream={})

void	copy (region_t destination, const_region_t source, size_t num_bytes, optional_ref< const stream_t > stream={})
	Asynchronously copies data between memory spaces or within a memory space.

void	copy (region_t destination, const_region_t source, optional_ref< const stream_t > stream={})

void	copy (region_t destination, void *source, optional_ref< const stream_t > stream={})
	Copy memory between memory regions.

void	copy (region_t destination, void *source, size_t num_bytes, optional_ref< const stream_t > stream={})
	Copy one region of memory into another.

void	copy (void *destination, const_region_t source, size_t num_bytes, optional_ref< const stream_t > stream={})
	Copy one region of memory to another location.

void	copy (void *destination, const_region_t source, optional_ref< const stream_t > stream={})

template<typename T >
unique_span< T >	make_unique_span (const context_t &context, size_t size)
	See `device::make_unique_span(const context_t& context, size_t size)`

template<typename T >
unique_span< T >	make_unique_span (const device_t &device, size_t size)
	See `device::make_unique_span(const context_t& context, size_t num_elements)`

template<typename T , dimensionality_t NumDimensions>
void	copy (array_t< T, NumDimensions > &destination, span< T const > source, optional_ref< const stream_t > stream)

context_t	context_of (void const *ptr)
	Obtain (a non-owning wrapper for) the CUDA context with which a memory address is associated (e.g. being the result of an allocation or mapping in that context).

memory::type_t	type_of (const void *ptr)
	Determine the type of memory at a given address vis-a-vis the CUDA ecosystem: Was it allocated by the CUDA driver? Does it reside solely on a GPU device? Solely on the host? Is it movable between locations? etc.

void *	as_pointer (device::address_t address) noexcept

device::unique_region	make_unique_region (const context_t &context, size_t num_elements)
	See device::make_unique_region(const context_t& context, size_t num_elements)

device::unique_region	make_unique_region (const device_t &device, size_t num_elements)
	See device::make_unique_region(const device_t& device, size_t num_elements)


template<typename T , size_t N>
void	copy (span< T > destination, c_array< const T, N > const &source, optional_ref< const stream_t > stream={})
	Copy the contents of a C-style array into a span of same-type elements.

template<typename T , size_t N>
void	copy (c_array< T, N > &destination, span< T const > source, optional_ref< const stream_t > stream={})
	Copy the contents of a span into a C-style array.

template<typename T , size_t N>
void	copy (void *destination, c_array< const T, N > const &source, optional_ref< const stream_t > stream={})
	Copy the contents of a C-style array to another location in memory.

template<typename T , size_t N>
void	copy (c_array< T, N > &destination, T *source, optional_ref< const stream_t > stream={})
	Copy memory into a C-style array.

Detailed Description

Representation, allocation and manipulation of CUDA-related memory of different kinds.

Enumeration Type Documentation

◆ cpu_write_combining

enum cuda::memory::cpu_write_combining : bool

A memory allocation setting: Should the allocated memory be configured as write-combined, i.e. a write may not be immediately applied to the allocated region and propagated (e.g. to caches, over the PCIe bus)? Instead, writes will be applied as convenient, possibly in batch.

Write-combining memory frees up the host's L1 and L2 cache resources, making more cache available to the rest of the application. In addition, write-combining memory is not snooped during transfers across the PCI Express bus, which can improve transfer performance.

Reading from write-combining memory from the host is prohibitively slow, so write-combining memory should in general be used for memory that the host only writes to.
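
The two allocation settings on this page are `bool`-backed enumerations, so call sites read as named choices rather than bare `true`/`false`. The following is a minimal sketch of that idiom, not the library's implementation: the enumerations mirror the declarations listed above, while `allocation_options_sketch`, its field names and `options_for` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>

// Mirrors the two bool-backed enumerations listed on this page.
enum portability_across_contexts : bool { isnt_portable = false, is_portable = true };
enum cpu_write_combining : bool { without_wc = false, with_wc = true };

// Hypothetical stand-in for allocation_options; the field names are assumptions.
struct allocation_options_sketch {
    portability_across_contexts portability;
    cpu_write_combining write_combining;
};

// Hypothetical helper: request write-combining only for buffers the host never
// reads, since host reads from write-combined memory are very slow.
inline allocation_options_sketch options_for(bool host_reads_buffer)
{
    return { isnt_portable, host_reads_buffer ? without_wc : with_wc };
}

// Usage: a staging buffer which the host only fills and the device consumes.
inline void usage_example()
{
    allocation_options_sketch staging = options_for(false);
    assert(staging.write_combining == with_wc);
}
```

Because the underlying type is `bool`, each value still converts to the flag it stands for, while a call such as `options_for(false)` cannot silently swap the two settings inside the returned structure.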

Function Documentation

◆ as_pointer()

void* cuda::memory::as_pointer ( device::address_t address )

inline noexcept

Returns: a cast of a numeric address in device memory space (which, in recent CUDA versions, is just a part of the unified all-system memory space) into a proper pointer.
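
As a rough illustration of what such a cast amounts to, here is a host-only sketch. It assumes that the address type is an unsigned integer wide enough to hold a pointer; `address_sketch_t`, `as_pointer_sketch` and `as_address_sketch` are hypothetical names, and `std::uintptr_t` stands in for `device::address_t`.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: stand-in for device::address_t, an unsigned integer that can hold an address.
using address_sketch_t = std::uintptr_t;

// Reinterpret a numeric address as an untyped pointer.
inline void* as_pointer_sketch(address_sketch_t address) noexcept
{
    return reinterpret_cast<void*>(address);
}

// The inverse direction: obtain the numeric address of a pointer.
inline address_sketch_t as_address_sketch(const void* ptr) noexcept
{
    return reinterpret_cast<address_sketch_t>(ptr);
}

// Usage: converting a pointer to an address and back yields the same pointer.
inline void usage_example()
{
    int value = 0;
    assert(as_pointer_sketch(as_address_sketch(&value)) == &value);
}
```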

◆ context_of()

context_t cuda::memory::context_of ( void const * ptr )

inline

Obtain (a non-owning wrapper for) the CUDA context with which a memory address is associated (e.g. being the result of an allocation or mapping in that context).

◆ copy() [1/23]

template<typename T , size_t N>

void cuda::memory::copy	(	span< T >	destination,
		c_array< const T, N > const &	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copy the contents of a C-style array into a span of same-type elements.

Note: Since we assume Compute Capability >= 2.0, all devices support the Unified Virtual Address Space, so the CUDA driver can determine where the data behind each pointer passed to a copy function is located, and one does not have to specify this. Sources and destinations may be in any memory space addressable in the unified virtual address space: host-side memory, device global memory, device constant memory etc.

Parameters

destination	A span of elements to overwrite with the array contents.
source	A fixed-size C-style array from which to copy data into `destination`. As this is taken by reference rather than by address of the first element, there is no array decay.
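
To make the "no array decay" remark concrete, here is a host-only sketch in which the array is taken by reference, so its length `N` is deduced at compile time instead of being passed separately. It uses `std::memcpy` rather than any CUDA call; `span_sketch` and `copy_array_to_span_sketch` are hypothetical names, and the definition of `c_array` as an alias for a built-in array type is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Assumption: c_array<T, N> is an alias for the built-in array type T[N].
template <typename T, std::size_t N>
using c_array = T[N];

// Hypothetical minimal stand-in for span<T>: a pointer plus an element count.
template <typename T>
struct span_sketch {
    T* data;
    std::size_t size;
};

// Host-only analogue of the overload above. The source is bound by reference,
// so N is part of its type and is deduced from the argument.
template <typename T, std::size_t N>
std::size_t copy_array_to_span_sketch(span_sketch<T> destination, c_array<const T, N> const& source)
{
    static_assert(std::is_trivially_copyable<T>::value, "a byte-wise copy needs a trivially copyable T");
    assert(destination.size >= N);  // the span must have room for the whole array
    std::memcpy(destination.data, source, N * sizeof(T));
    return N;  // number of elements copied
}
```

Had `source` been declared as `const T*`, the length would be lost at the call site and would need its own parameter.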

◆ copy() [2/23]

template<typename T , size_t N>

void cuda::memory::copy	(	c_array< T, N > &	destination,
		span< T const >	source,
		optional_ref< const stream_t >	stream = `{}`
	)

Copy the contents of a span into a C-style array.

Parameters

destination	A fixed-size C-style array to which to copy the data in `source`, of size at least that of `source`. As it is taken by reference rather than by address of the first element, there is no array decay.
source	A span of the same element type as the destination array, containing the data to be copied.

◆ copy() [3/23]

template<typename T , size_t N>

void cuda::memory::copy	(	void *	destination,
		c_array< const T, N > const &	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copy the contents of a C-style array to another location in memory.

Parameters

destination	The starting address of a sequence of `N` values of type `T` to overwrite with the array contents.
source	A fixed-size C-style array from which to copy data into `destination`. As this is taken by reference rather than by address of the first element, there is no array decay.

◆ copy() [4/23]

template<typename T , size_t N>

void cuda::memory::copy	(	c_array< T, N > &	destination,
		T *	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copy memory into a C-style array.

When a stream is passed, the data is copied asynchronously on that stream.

Parameters

destination	A fixed-size C-style array to which to copy the data in `source`. As it is taken by reference rather than by address of the first element, there is no array decay.
source	The starting address of a sequence of `N` elements to copy.
stream	Schedule the copy operation in this CUDA stream.

◆ copy() [5/23]

template<dimensionality_t NumDimensions>

void cuda::memory::copy	(	copy_parameters_t< NumDimensions >	params,
		optional_ref< const stream_t >	stream = `{}`
	)

An almost-generalized-case memory copy, taking a rather complex structure of copy parameters - wrapping the CUDA driver's own most-generalized-case copy.

Template Parameters

NumDimensions The number of dimensions of the parameter structure.

Parameters

params A parameter structure with details regarding the copy source and destination, including CUDA context specifications, which must have been set in advance. This function will not verify the structure's validity, but merely pass it on to the CUDA driver.

◆ copy() [6/23]

template<typename T , dimensionality_t NumDimensions>

void cuda::memory::copy	(	const array_t< T, NumDimensions > &	destination,
		const context_t &	source_context,
		const T *	source,
		optional_ref< const stream_t >	stream = `{}`
	)

Copies data into a CUDA array from non-array memory, synchronously when no stream is passed.

Template Parameters

NumDimensions	the number of array dimensions; only 2 and 3 are supported values
T	array element type

Parameters

destination	A `NumDimensions`-dimensional CUDA array, including a specification of the context in which the array is defined.
source_context	The context in which the source memory was allocated, possibly different from the target array's context.
source	A pointer to a region of contiguous memory holding `destination.size()` values of type `T`. The memory may be located either on a CUDA device or in host memory.

◆ copy() [7/23]

template<typename T , dimensionality_t NumDimensions>

void cuda::memory::copy	(	array_t< T, NumDimensions > &	destination,
		const T *	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copies data into a CUDA array from non-array memory, synchronously when no stream is passed. When a stream is passed, the copy is scheduled on it and performed asynchronously.

Template Parameters

NumDimensions	the number of array dimensions; only 2 and 3 are supported values
T	array element type

Parameters

destination	A `NumDimensions`-dimensional CUDA array to copy data into.
source	A pointer to a region of contiguous memory holding `destination.size()` values of type `T`, i.e. of size `destination.size() * sizeof(T)`. The memory may be located either on a CUDA device or in host memory.
stream	Schedule the copy operation into this CUDA stream.

◆ copy() [8/23]

template<typename T , dimensionality_t NumDimensions>

void cuda::memory::copy	(	const array_t< T, NumDimensions > &	destination,
		span< T const >	source,
		optional_ref< const stream_t >	stream = `{}`
	)

Copies a contiguous sequence of elements in memory into a CUDA array.

Template Parameters

T	a trivially-copy-constructible, trivially-destructible type of array elements

Note: only as many elements as fit in the array are copied, and any extra elements in the source span are ignored
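
The clamping described in this note can be sketched on the host as follows. This is an illustration of the stated behavior only, not the library's code: `array_sketch` is a hypothetical stand-in for a CUDA array, `copy_clamped` is a hypothetical name, and the handling of a source shorter than the array (copying just what is available) is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for a CUDA array holding a fixed number of elements.
template <typename T>
struct array_sketch {
    std::vector<T> elements;
    std::size_t size() const { return elements.size(); }
};

// Copy at most destination.size() elements; surplus source elements are ignored.
template <typename T>
std::size_t copy_clamped(array_sketch<T>& destination, const T* source, std::size_t source_size)
{
    assert(source != nullptr || source_size == 0);
    const std::size_t count = std::min(destination.size(), source_size);
    if (count > 0) {
        std::memcpy(destination.elements.data(), source, count * sizeof(T));
    }
    return count;  // number of elements actually copied
}
```

With a three-element array and a five-element source, the first three source elements are copied and the last two are left untouched.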

◆ copy() [9/23]

template<typename T , dimensionality_t NumDimensions>

void cuda::memory::copy	(	const context_t &	context,
		T *	destination,
		const array_t< T, NumDimensions > &	source,
		optional_ref< const stream_t >	stream = `{}`
	)

Copies data from a CUDA array into non-array memory, synchronously when no stream is passed.

Template Parameters

NumDimensions	the number of array dimensions; only 2 and 3 are supported values
T	array element type

Parameters

destination	A pointer to a region of contiguous memory with room for `source.size()` values of type `T`. The memory may be located either on a CUDA device or in host memory.
source	A `NumDimensions`-dimensional CUDA array.

◆ copy() [10/23]

template<typename T , dimensionality_t NumDimensions>

void cuda::memory::copy	(	T *	destination,
		const array_t< T, NumDimensions > &	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copies data from a CUDA array into non-array memory, synchronously when no stream is passed. When a stream is passed, the copy is scheduled on it and performed asynchronously.

Template Parameters

NumDimensions	the number of array dimensions; only 2 and 3 are supported values
T	array element type

Parameters

destination	A pointer to a region of contiguous memory with room for `source.size()` values of type `T`, i.e. of size `source.size() * sizeof(T)`. The memory may be located either on a CUDA device or in host memory.
source	A `NumDimensions`-dimensional CUDA array (`cuda::array_t`).
stream	Schedule the copy operation into this CUDA stream.

◆ copy() [11/23]

template<typename T , dimensionality_t NumDimensions>

void cuda::memory::copy	(	span< T >	destination,
		const array_t< T, NumDimensions > &	source,
		optional_ref< const stream_t >	stream = `{}`
	)

Copies the contents of a CUDA array into a sequence of contiguous elements in memory.

Template Parameters

T	a trivially-copy-constructible, trivially-destructible type of array elements

Note: The destination span must be at least as large as the volume of the array.

◆ copy() [12/23]

template<typename T , dimensionality_t NumDimensions>

void cuda::memory::copy	(	const array_t< T, NumDimensions > &	destination,
		const array_t< T, NumDimensions > &	source,
		optional_ref< const stream_t >	stream
	)

Copies the contents of one CUDA array to another.

Template Parameters

T	a trivially-copy-constructible type of array elements

Note: The destination array must be at least as large in each dimension as the source array.
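
The requirement is per dimension, not on total volume. A small host-only sketch of such a check, using hypothetical names (`dimensions_sketch`, `fits_within`) and `std::array` of extents as an assumed representation of array dimensions:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical representation of array extents, one entry per dimension.
template <std::size_t NumDimensions>
using dimensions_sketch = std::array<std::size_t, NumDimensions>;

// True only if the source is no larger than the destination in every dimension.
template <std::size_t NumDimensions>
bool fits_within(const dimensions_sketch<NumDimensions>& source,
                 const dimensions_sketch<NumDimensions>& destination)
{
    for (std::size_t i = 0; i < NumDimensions; ++i) {
        if (source[i] > destination[i]) { return false; }
    }
    return true;
}

// Usage: a 2 x 16 source has fewer elements than an 8 x 8 destination,
// yet does not fit, because its second extent exceeds the destination's.
inline void usage_example()
{
    dimensions_sketch<2> source{{2, 16}};
    dimensions_sketch<2> destination{{8, 8}};
    assert(!fits_within(source, destination));
}
```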

◆ copy() [13/23]

template<typename T , dimensionality_t NumDimensions>

void cuda::memory::copy	(	region_t	destination,
		const array_t< T, NumDimensions > &	source,
		optional_ref< const stream_t >	stream = `{}`
	)

Copies the contents of a CUDA array into a region of memory.

Template Parameters

T	a trivially-copy-constructible type of array elements

Note: the destination region must be large enough to hold all elements of the array, i.e. at least `source.size() * sizeof(T)` bytes, and may also be larger.

When a stream is passed, the copy is scheduled on it and performed asynchronously.

Parameters

destination	A memory region of size at least `source.size() * sizeof(T)`.
source	A CUDA array (`cuda::array_t`).
stream	Schedule the copy operation in this CUDA stream.

◆ copy() [14/23]

template<typename T , dimensionality_t NumDimensions>

void cuda::memory::copy	(	array_t< T, NumDimensions > &	destination,
		const_region_t	source,
		optional_ref< const stream_t >	stream = `{}`
	)

Copies the contents of a region of memory into a CUDA array.

Template Parameters

T	a trivially-copy-constructible type of array elements

Note: only as many elements as fit in the array are copied, while the source region may be larger than what they take up.

Parameters

destination	A CUDA array to copy data into
source	A memory region of size `destination.size() * sizeof(T)`
stream	schedule the copy operation into this CUDA stream (or leave empty for a synchronous copy)

◆ copy() [15/23]

void cuda::memory::copy	(	void *	destination,
		void const *	source,
		size_t	num_bytes,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Asynchronously copies data between memory spaces or within a memory space.

Note: Since we assume Compute Capability >= 2.0, all devices support the Unified Virtual Address Space, so the CUDA driver can determine, for each pointer, where the data is located, and one does not have to specify this. This is the asynchronous version of `memory::copy(void*, void const*, size_t)`.

Parameters

destination	A pointer to a memory region of size `num_bytes`, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
source	A pointer to a memory region of size `num_bytes`, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
num_bytes	The number of bytes to copy from `source` to `destination`
stream	A stream on which to enqueue the copy operation

◆ copy() [16/23]

template<typename T , size_t N>

void cuda::memory::copy	(	c_array< T, N > &	destination,
		const_region_t	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copy the contents of a memory region into a C-style array, interpreting the memory as a sequence of elements of the array's element type. When a stream is passed, the data is copied asynchronously on that stream.

Parameters

destination	A fixed-size C-style array to which to copy the data in `source`. As it is taken by reference rather than by address of the first element, there is no array decay.
source	A region of at least `sizeof(T)*N` bytes with whose data to fill the `destination` array.
stream	Schedule the copy operation in this CUDA stream.
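
The size relationship between the region and the array can be sketched on the host as below. This is an illustration under stated assumptions rather than the library's code: `const_region_sketch` is a hypothetical stand-in for `const_region_t` (a start address plus a size in bytes), `copy_region_to_array_sketch` is a hypothetical name, and `std::memcpy` replaces the CUDA copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for const_region_t: untyped start address and size in bytes.
struct const_region_sketch {
    const void* start;
    std::size_t size;
};

// Fill a C-style array from raw bytes, reading them as N consecutive values of type T.
template <typename T, std::size_t N>
void copy_region_to_array_sketch(T (&destination)[N], const_region_sketch source)
{
    static_assert(std::is_trivially_copyable<T>::value, "a byte-wise copy needs a trivially copyable T");
    assert(source.size >= sizeof(T) * N);  // the region must cover the whole array
    std::memcpy(destination, source.start, sizeof(T) * N);
}
```

Only the first `sizeof(T)*N` bytes of the region are read, so a larger region is acceptable in this sketch.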

◆ copy() [17/23]

template<typename T , size_t N>

void cuda::memory::copy	(	region_t	destination,
		c_array< const T, N > const &	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copies data from an array into a memory region. When a stream is passed, the copy is enqueued on it and performed asynchronously.

Note: Since we assume Compute Capability >= 2.0, all devices support the Unified Virtual Address Space, so the CUDA driver can determine where the data behind each pointer passed to a copy function is located, and one does not have to specify this. Sources and destinations may be in any memory space addressable in the unified virtual address space: host-side memory, device global memory, device constant memory etc.

Parameters

destination	A region of memory to which to copy the data in `source`, of size at least that of `source`, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
source	A plain array whose contents are to be copied, either in host memory or on any CUDA device's global memory.
stream	A stream on which to enqueue the copy operation.

◆ copy() [18/23]

void cuda::memory::copy	(	region_t	destination,
		const_region_t	source,
		size_t	num_bytes,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Asynchronously copies data between memory spaces or within a memory space.

Parameters

destination	A memory region of size no less than `num_bytes`, either in host memory or on any CUDA device's global memory. Must be registered with, or visible in, the same context as `stream`.
source	A memory region of size `num_bytes`, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
num_bytes	The number of bytes to copy from `source` to `destination`
stream	A stream on which to enqueue the copy operation

◆ copy() [19/23]

void cuda::memory::copy	(	region_t	destination,
		const_region_t	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copies data between memory regions. When a stream is passed, the copy is enqueued on it and performed asynchronously.

Parameters

destination	A region of memory to which to copy the data in `source`, of size at least that of `source`, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
source	A region whose contents are to be copied, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
stream	A stream on which to enqueue the copy operation.

◆ copy() [20/23]

void cuda::memory::copy	(	region_t	destination,
		void *	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copy memory between memory regions. When a stream is passed, the copy is enqueued on it and performed asynchronously.

Parameters

destination	A target region of memory into which to copy, either in host memory or on any CUDA device's global memory; enough memory will be copied to fill this region. Must be defined in the same context as the stream.
source	A pointer to the beginning of a region of memory from which to copy, of size like that of `destination`, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
stream	A stream on which to enqueue the copy operation.

◆ copy() [21/23]

void cuda::memory::copy	(	region_t	destination,
		void *	source,
		size_t	num_bytes,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copy one region of memory into another. When a stream is passed, the copy is enqueued on it and performed asynchronously.

Parameters

destination	A region of memory to which to copy the data in `source`, of size at least `num_bytes`, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
source	A pointer to the beginning of a memory region of size `num_bytes`.
num_bytes	The number of bytes to copy from `source` to `destination`.
stream	A stream on which to enqueue the copy operation.

◆ copy() [22/23]

void cuda::memory::copy	(	void *	destination,
		const_region_t	source,
		size_t	num_bytes,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copy one region of memory to another location. When a stream is passed, the copy is enqueued on it and performed asynchronously.

Parameters

destination	The beginning of a target region of memory (of size at least `num_bytes`) into which to copy, either in host memory or on any CUDA device's global memory. Must be registered with, or visible in, the same context as `stream`.
source	A region of memory from which to copy, of size at least `num_bytes`, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
num_bytes	The number of bytes to copy from `source` to `destination`.
stream	A stream on which to enqueue the copy operation.

◆ copy() [23/23]

void cuda::memory::copy	(	void *	destination,
		const_region_t	source,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Copies data between memory regions. When a stream is passed, the copy is enqueued on it and performed asynchronously.

Parameters

destination	The beginning of a memory region of the same size as `source` into which to copy data, either in host memory or on any CUDA device's global memory. The memory must be registered in, or visible within, the same context as `stream`.
source	A region whose contents are to be copied, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
stream	A stream on which to enqueue the copy operation.

◆ copy_single()

template<typename T >

void cuda::memory::copy_single	(	T *	destination,
		const T *	source,
		optional_ref< const stream_t >	stream = `{}`
	)

Copies a single (typed) value between two memory locations, synchronously when no stream is passed. When a stream is passed, the copy is enqueued on it.

Parameters

destination	a value residing either in host memory or on any CUDA device's global memory
source	a value residing either in host memory or on any CUDA device's global memory
stream	The CUDA command queue on which this copying will be enqueued

◆ set() [1/2]

void cuda::memory::set	(	void *	ptr,
		int	byte_value,
		size_t	num_bytes,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Sets a number of bytes in memory to a fixed value.

Note: The equivalent of ::std::memset - for any and all CUDA-related memory spaces

Parameters

ptr	Address of the first byte in memory to set. May be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
byte_value	value to set the memory region to
num_bytes	The amount of memory to set to `byte_value`
stream	A stream on which to schedule this action; may be omitted.
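
Since the note compares this function to `::std::memset`, a host-only analogue may help show the relationship between the pointer-based and region-based forms. It is a sketch under assumptions: `region_sketch` is a hypothetical stand-in for `region_t`, the function names are invented for illustration, and the code calls `std::memset` rather than any CUDA facility.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for region_t: untyped start address and size in bytes.
struct region_sketch {
    void* start;
    std::size_t size;
};

// Pointer form: set num_bytes bytes, starting at ptr, to byte_value.
inline void set_sketch(void* ptr, int byte_value, std::size_t num_bytes)
{
    assert(ptr != nullptr || num_bytes == 0);
    // std::memset converts byte_value to unsigned char before writing it.
    std::memset(ptr, byte_value, num_bytes);
}

// Region form: the region supplies both the address and the size.
inline void set_sketch(region_sketch region, int byte_value)
{
    set_sketch(region.start, byte_value, region.size);
}

// Zeroing is the special case of setting every byte to 0.
inline void zero_sketch(region_sketch region)
{
    set_sketch(region, 0);
}
```

In this sketch, as with `std::memset`, only the low 8 bits of `byte_value` end up in memory; whether the CUDA-side functions treat out-of-range values the same way is not stated on this page.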

◆ set() [2/2]

void cuda::memory::set	(	region_t	region,
		int	byte_value,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Sets all bytes in a region of memory to a fixed value.

Note: The equivalent of ::std::memset - for any and all CUDA-related memory spaces

Parameters

region	the memory region to set; may be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
byte_value	value to set the memory region to
stream	A stream on which to schedule this action; may be omitted.

◆ zero() [1/3]

void cuda::memory::zero	(	region_t	region,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Sets all bytes in a region of memory to 0 (zero)

Parameters

region	the memory region to zero-out; may be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
stream	A stream on which to schedule this action; may be omitted.

◆ zero() [2/3]

void cuda::memory::zero	(	void *	ptr,
		size_t	num_bytes,
		optional_ref< const stream_t >	stream = `{}`
	)

inline

Zero-out a region of memory.

Parameters

ptr	the beginning of a region of memory to zero-out; may be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
num_bytes	the size in bytes of the region of memory to zero-out
stream	A stream on which to schedule this action; may be omitted.

◆ zero() [3/3]

template<typename T >

void cuda::memory::zero ( T * ptr )

inline

Sets all bytes of a single pointed-to value to 0.

Parameters

ptr	pointer to a single element of a certain type, which may be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
stream	A stream on which to schedule this action; may be omitted.
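
For the typed form, the number of bytes to clear follows from the pointed-to type. A host-only sketch of that idea, with the hypothetical name `zero_single_sketch` and `std::memset` in place of a CUDA call:

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Zero every byte of the single object ptr points to; the size comes from T.
template <typename T>
void zero_single_sketch(T* ptr)
{
    static_assert(std::is_trivially_copyable<T>::value, "byte-wise zeroing needs a trivially copyable T");
    assert(ptr != nullptr);
    std::memset(ptr, 0, sizeof(T));
}

// Usage: no byte count is passed, only the typed pointer.
inline void usage_example()
{
    long counter = 42;
    zero_single_sketch(&counter);
    assert(counter == 0);
}
```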
