cuda-api-wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
cuda::memory Namespace Reference

Representation, allocation and manipulation of CUDA-related memory, of different kinds. More...

Namespaces

 device
 CUDA-Device-global memory on a single device (not accessible from the host)
 
 host
 Host-side (= system) memory which is "pinned", i.e. page-locked.
 
 managed
 Paged memory accessible in both device-side and host-side code by triggering transfers of pages between physical system memory and physical device memory.
 
 mapped
 Memory regions appearing in both the host-side and device-side address spaces, with the regions in both spaces mapped to each other.
 
 shared
 A memory space whose contents are shared by all threads in a CUDA kernel block, but are specific to each kernel block.
 

Classes

struct  allocation_options
 options accepted by CUDA's allocator of memory with a host-side aspect (host-only or managed memory). More...
 
struct  copy_parameters_t
 A builder-ish subclass template around the basic 2D or 3D copy parameters which CUDA's complex copying API actually takes. More...
 
class  pointer_t
 A convenience wrapper around a raw pointer "known" to the CUDA runtime and which thus has various kinds of associated information which this wrapper allows access to. More...
 
class  unique_region
 A class for holding a region_t of memory owned "uniquely" by its creator - similar to how ::std::unique_ptr holds a uniquely-owned pointer. More...
 

Enumerations

enum  endpoint_t {
  source,
  destination
}
 Type for choosing between endpoints of copy operations.
 
enum  portability_across_contexts : bool {
  isnt_portable = false,
  is_portable = true
}
 A memory allocation setting: Can the allocated memory be used in other CUDA driver contexts (in addition to the implicit default context we have with the Runtime API).
 
enum  cpu_write_combining : bool {
  without_wc = false,
  with_wc = true
}
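 A memory allocation setting: Should the allocated memory be configured as write-combined, i.e. a write may not be immediately applied to the allocated region and propagated (e.g. to caches, over the PCIe bus)? More...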
 
enum  type_t : ::std::underlying_type< CUmemorytype >::type {
  host_ = CU_MEMORYTYPE_HOST,
  device_ = CU_MEMORYTYPE_DEVICE,
  array = CU_MEMORYTYPE_ARRAY,
  unified_ = CU_MEMORYTYPE_UNIFIED,
  managed_ = CU_MEMORYTYPE_UNIFIED,
  non_cuda = ~(::std::underlying_type<CUmemorytype>::type{0})
}
 The CUDA execution ecosystem involves different memory spaces in their relation to a GPU device or their treatment by the CUDA driver; this type distinguishes among them.
 

Functions

copy_parameters_t< 3 >::intra_context_type as_intra_context_parameters (const copy_parameters_t< 3 > &params)
 
void set (void *ptr, int byte_value, size_t num_bytes, optional_ref< const stream_t > stream={})
 Sets a number of bytes in memory to a fixed value. More...
 
void set (region_t region, int byte_value, optional_ref< const stream_t > stream={})
 Sets all bytes in a region of memory to a fixed value. More...
 
void zero (region_t region, optional_ref< const stream_t > stream={})
 Sets all bytes in a region of memory to 0 (zero) More...
 
void zero (void *ptr, size_t num_bytes, optional_ref< const stream_t > stream={})
 Zero-out a region of memory. More...
 
template<typename T >
void zero (T *ptr)
 Sets all bytes of a single pointed-to value to 0. More...
 
template<dimensionality_t NumDimensions>
void copy (copy_parameters_t< NumDimensions > params, optional_ref< const stream_t > stream={})
 An almost-generalized-case memory copy, taking a rather complex structure of copy parameters - wrapping the CUDA driver's own most-generalized-case copy. More...
 
template<typename T , dimensionality_t NumDimensions>
void copy (const array_t< T, NumDimensions > &destination, const context_t &source_context, const T *source, optional_ref< const stream_t > stream={})
 Synchronously copies data into a CUDA array from non-array memory. More...
 
template<typename T , dimensionality_t NumDimensions>
void copy (array_t< T, NumDimensions > &destination, const T *source, optional_ref< const stream_t > stream={})
 Synchronously copies data into a CUDA array from non-array memory. More...
 
template<typename T , dimensionality_t NumDimensions>
void copy (const array_t< T, NumDimensions > &destination, span< T const > source, optional_ref< const stream_t > stream={})
 Copies a contiguous sequence of elements in memory into a CUDA array. More...
 
template<typename T , dimensionality_t NumDimensions>
void copy (const context_t &context, T *destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream={})
 Synchronously copies data from a CUDA array into non-array memory. More...
 
template<typename T , dimensionality_t NumDimensions>
void copy (T *destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream={})
 Synchronously copies data from a CUDA array into non-array memory. More...
 
template<typename T , dimensionality_t NumDimensions>
void copy (span< T > destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream={})
 Copies the contents of a CUDA array into a sequence of contiguous elements in memory. More...
 
template<typename T , dimensionality_t NumDimensions>
void copy (const array_t< T, NumDimensions > &destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream)
 Copies the contents of one CUDA array to another. More...
 
template<typename T , dimensionality_t NumDimensions>
void copy (region_t destination, const array_t< T, NumDimensions > &source, optional_ref< const stream_t > stream={})
 Copies the contents of a CUDA array into a region of memory. More...
 
template<typename T , dimensionality_t NumDimensions>
void copy (array_t< T, NumDimensions > &destination, const_region_t source, optional_ref< const stream_t > stream={})
 Copies the contents of a region of memory into a CUDA array. More...
 
template<typename T >
void copy_single (T *destination, const T *source, optional_ref< const stream_t > stream={})
 Synchronously copies a single (typed) value between two memory locations. More...
 
void copy (void *destination, void const *source, size_t num_bytes, optional_ref< const stream_t > stream={})
 Asynchronously copies data between memory spaces or within a memory space. More...
 
template<typename T , size_t N>
void copy (c_array< T, N > &destination, const_region_t source, optional_ref< const stream_t > stream={})
 Copy the contents of memory region into a C-style array, interpreting the memory as a sequence of elements of the array's element type. More...
 
template<typename T , size_t N>
void copy (region_t destination, c_array< const T, N > const &source, optional_ref< const stream_t > stream={})
 
void copy (region_t destination, const_region_t source, size_t num_bytes, optional_ref< const stream_t > stream={})
 Asynchronously copies data between memory spaces or within a memory space. More...
 
void copy (region_t destination, const_region_t source, optional_ref< const stream_t > stream={})
 
void copy (region_t destination, void *source, optional_ref< const stream_t > stream={})
 Copy memory between memory regions. More...
 
void copy (region_t destination, void *source, size_t num_bytes, optional_ref< const stream_t > stream={})
 Copy one region of memory into another. More...
 
void copy (void *destination, const_region_t source, size_t num_bytes, optional_ref< const stream_t > stream={})
 Copy one region of memory to another location. More...
 
void copy (void *destination, const_region_t source, optional_ref< const stream_t > stream={})
 
template<typename T >
unique_span< T > make_unique_span (const context_t &context, size_t size)
 See device::make_unique_span(const context_t& context, size_t size)
 
template<typename T >
unique_span< T > make_unique_span (const device_t &device, size_t size)
 See device::make_unique_span(const device_t& device, size_t size)
 
template<typename T , dimensionality_t NumDimensions>
void copy (array_t< T, NumDimensions > &destination, span< T const > source, optional_ref< const stream_t > stream)
 
context_t context_of (void const *ptr)
 Obtain (a non-owning wrapper for) the CUDA context with which a memory address is associated (e.g. being the result of an allocation or mapping in that context). More...
 
memory::type_t type_of (const void *ptr)
 Determine the type of memory at a given address vis-a-vis the CUDA ecosystem: Was it allocated by the CUDA driver? Does it reside solely on a GPU device? Solely on the host? Is it movable between locations? etc.
 
void * as_pointer (device::address_t address) noexcept
 
device::unique_region make_unique_region (const context_t &context, size_t num_elements)
 See device::make_unique_region(const context_t& context, size_t num_elements)
 
device::unique_region make_unique_region (const device_t &device, size_t num_elements)
 See device::make_unique_region(const device_t& device, size_t num_elements)
 
template<typename T , size_t N>
void copy (span< T > destination, c_array< const T, N > const &source, optional_ref< const stream_t > stream={})
 Copy the contents of a C-style array into a span of same-type elements. More...
 
template<typename T , size_t N>
void copy (c_array< T, N > &destination, span< T const > source, optional_ref< const stream_t > stream={})
 Copy the contents of a span into a C-style array. More...
 
template<typename T , size_t N>
void copy (void *destination, c_array< const T, N > const &source, optional_ref< const stream_t > stream={})
 Copy the contents of a C-style array to another location in memory. More...
 
template<typename T , size_t N>
void copy (c_array< T, N > &destination, T *source, optional_ref< const stream_t > stream={})
 Copy memory into a C-style array. More...
 

Detailed Description

Representation, allocation and manipulation of CUDA-related memory, of different kinds.

Enumeration Type Documentation

◆ cpu_write_combining

A memory allocation setting: Should the allocated memory be configured as write-combined, i.e. a write may not be immediately applied to the allocated region and propagated (e.g. to caches, over the PCIe bus). Instead, writes will be applied as convenient, possibly in batch.

Write-combining memory frees up the host's L1 and L2 cache resources, making more cache available to the rest of the application. In addition, write-combining memory is not snooped during transfers across the PCI Express bus, which can improve transfer performance.

Reading from write-combining memory from the host is prohibitively slow, so write-combining memory should in general be used for memory that the host only writes to.
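
As an illustration of how boolean-backed settings such as cpu_write_combining and portability_across_contexts might be folded into a single flags word for the allocator, here is a minimal sketch. The structure host_allocation_options, its field names and the helper make_flags are hypothetical; the two numeric constants are local copies of the CUDA driver's CU_MEMHOSTALLOC_PORTABLE and CU_MEMHOSTALLOC_WRITECOMBINED values, so that the sketch compiles without CUDA.

```cpp
#include <cassert>

// Re-declared locally so the sketch compiles without CUDA; the enumerator
// names follow the listing in this reference.
enum portability_across_contexts : bool { isnt_portable = false, is_portable = true };
enum cpu_write_combining : bool { without_wc = false, with_wc = true };

// Hypothetical stand-in for an options structure (field names are assumptions).
struct host_allocation_options {
    portability_across_contexts portability;
    cpu_write_combining write_combining;
};

// Local copies of the values of CU_MEMHOSTALLOC_PORTABLE and
// CU_MEMHOSTALLOC_WRITECOMBINED from the CUDA driver API header.
constexpr unsigned flag_portable       = 0x01;
constexpr unsigned flag_write_combined = 0x04;

// Hypothetical helper: fold the two settings into one flags word.
constexpr unsigned make_flags(host_allocation_options options)
{
    return (options.portability     == is_portable ? flag_portable       : 0u)
         | (options.write_combining == with_wc     ? flag_write_combined : 0u);
}
```

Keeping each setting as its own enumeration, rather than a bare bool, makes call sites such as make_flags({is_portable, with_wc}) self-describing.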

Function Documentation

◆ as_pointer()

void* cuda::memory::as_pointer ( device::address_t  address)
inline noexcept
Returns
a cast of a numeric address in device memory space (which, in recent CUDA versions, is just a part of the unified all-system memory space) into a proper pointer.
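
The cast described above can be sketched as follows. This is a host-only illustration: address_t is assumed here to be an unsigned integer wide enough to hold a pointer (the library's device::address_t may be defined differently), and as_address is a hypothetical helper added only so the round trip can be checked.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a device address is modeled as an unsigned integer wide enough
// to hold a pointer; the library's device::address_t may be defined differently.
using address_t = std::uintptr_t;

// Sketch of the cast the text describes: numeric address -> untyped pointer.
inline void* as_pointer(address_t address) noexcept
{
    return reinterpret_cast<void*>(address);
}

// The reverse direction, added here only to make the round trip checkable.
inline address_t as_address(const void* ptr) noexcept
{
    return reinterpret_cast<address_t>(ptr);
}
```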

◆ context_of()

context_t cuda::memory::context_of ( void const *  ptr)
inline

Obtain (a non-owning wrapper for) the CUDA context with which a memory address is associated (e.g. being the result of an allocation or mapping in that context).

◆ copy() [1/23]

template<typename T , size_t N>
void cuda::memory::copy ( span< T >  destination,
c_array< const T, N > const &  source,
optional_ref< const stream_t >  stream = {}
)
inline

Copy the contents of a C-style array into a span of same-type elements.

Note
Since we assume Compute Capability >= 2.0, all devices support the Unified Virtual Address Space, so the CUDA driver can determine where the data behind each pointer passed to a copy function is located; one does not have to specify this.
The sources and destinations may all be in any memory space addressable in the unified virtual address space, which could be host-side memory, device global memory, device constant memory etc.
Parameters
destination: A span of elements to overwrite with the array contents.
source: A fixed-size C-style array from which to copy data into destination. As it is taken by reference rather than by the address of its first element, there is no array decay.
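
To illustrate the remark about array decay, here is a minimal host-only sketch. It assumes c_array is a plain alias for a built-in array type, and copy_from_c_array is a hypothetical stand-in which uses std::memcpy rather than any CUDA call. Because the source is bound by reference, the element count N is deduced from the argument's type and no separate length is passed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumption: c_array is modeled here as a plain alias for a built-in array type.
template <typename T, std::size_t N>
using c_array = T[N];

// Host-only stand-in for the copy: because `source` is a reference to an
// array, N is deduced from the argument's type and no length is passed.
template <typename T, std::size_t N>
std::size_t copy_from_c_array(T* destination, c_array<const T, N> const& source)
{
    assert(destination != nullptr);
    std::memcpy(destination, source, sizeof(T) * N);
    return N; // number of elements copied
}
```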

◆ copy() [2/23]

template<typename T , size_t N>
void cuda::memory::copy ( c_array< T, N > &  destination,
span< T const >  source,
optional_ref< const stream_t >  stream = {}
)

Copy the contents of a span into a C-style array.

Parameters
destination: A fixed-size C-style array to which to copy the data in source, of size at least that of source. As it is taken by reference rather than by the address of its first element, there is no array decay.
source: A span of the same element type as the destination array, containing the data to be copied.

◆ copy() [3/23]

template<typename T , size_t N>
void cuda::memory::copy ( void *  destination,
c_array< const T, N > const &  source,
optional_ref< const stream_t >  stream = {}
)
inline

Copy the contents of a C-style array to another location in memory.

Parameters
destination: The starting address of a sequence of N values of type T to overwrite with the array contents.
source: A fixed-size C-style array from which to copy data into destination. As it is taken by reference rather than by the address of its first element, there is no array decay.

◆ copy() [4/23]

template<typename T , size_t N>
void cuda::memory::copy ( c_array< T, N > &  destination,
T *  source,
optional_ref< const stream_t >  stream = {}
)
inline

Copy memory into a C-style array; if a stream is specified, the copy is scheduled asynchronously on that stream.

Parameters
destination: A fixed-size C-style array to which to copy the data in source. As it is taken by reference rather than by the address of its first element, there is no array decay.
source: The starting address of a sequence of N elements to copy.
stream: Schedule the copy operation in this CUDA stream.

◆ copy() [5/23]

template<dimensionality_t NumDimensions>
void cuda::memory::copy ( copy_parameters_t< NumDimensions >  params,
optional_ref< const stream_t >  stream = {}
)

An almost-generalized-case memory copy, taking a rather complex structure of copy parameters - wrapping the CUDA driver's own most-generalized-case copy.

Template Parameters
NumDimensions: The number of dimensions of the parameter structure.
Parameters
params: A parameter structure with details regarding the copy source and destination, including CUDA context specifications, which must have been set in advance. This function will not verify its validity, but rather merely pass it on to the CUDA driver.

◆ copy() [6/23]

template<typename T , dimensionality_t NumDimensions>
void cuda::memory::copy ( const array_t< T, NumDimensions > &  destination,
const context_t &  source_context,
const T *  source,
optional_ref< const stream_t >  stream = {}
)

Synchronously copies data into a CUDA array from non-array memory.

Template Parameters
NumDimensions: The number of array dimensions; only 2 and 3 are supported values.
T: The array element type.
Parameters
destination: A NumDimensions-dimensional CUDA array, including a specification of the context in which the array is defined.
source_context: The context in which the source memory was allocated - possibly different than the target array context.
source: A pointer to a region of contiguous memory holding destination.size() values of type T. The memory may be located either on a CUDA device or in host memory.

◆ copy() [7/23]

template<typename T , dimensionality_t NumDimensions>
void cuda::memory::copy ( array_t< T, NumDimensions > &  destination,
const T *  source,
optional_ref< const stream_t >  stream = {}
)
inline

Synchronously copies data into a CUDA array from non-array memory; if a stream is specified, the copy is instead scheduled asynchronously on that stream.

Template Parameters
NumDimensions: The number of array dimensions; only 2 and 3 are supported values.
T: The array element type.
Parameters
destination: A NumDimensions-dimensional CUDA array to copy data into.
source: A pointer to a region of contiguous memory holding destination.size() values of type T, i.e. of size destination.size() * sizeof(T) bytes. The memory may be located either on a CUDA device or in host memory.
stream: Schedule the copy operation into this CUDA stream.

◆ copy() [8/23]

template<typename T , dimensionality_t NumDimensions>
void cuda::memory::copy ( const array_t< T, NumDimensions > &  destination,
span< T const >  source,
optional_ref< const stream_t >  stream = {}
)

Copies a contiguous sequence of elements in memory into a CUDA array.

Template Parameters
T: A trivially-copy-constructible, trivially-destructible type of array elements.
Note
Only as many elements as fit in the array are copied; any extra elements in the source span are ignored.
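
The clamping behavior described in the note can be sketched on host memory as follows. The function copy_clamped and its parameters are hypothetical: a pointer plus capacity stands in for the CUDA array, a pointer plus length stands in for the source span, and std::memcpy stands in for the actual copy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Host-only stand-in: `capacity` plays the role of the array's element count
// and a plain pointer + length plays the role of the source span.
template <typename T>
std::size_t copy_clamped(T* array_storage, std::size_t capacity,
                         const T* source, std::size_t source_length)
{
    assert(array_storage != nullptr && source != nullptr);
    // Copy only what fits; surplus source elements are ignored.
    const std::size_t count = std::min(capacity, source_length);
    std::memcpy(array_storage, source, count * sizeof(T));
    return count;
}
```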

◆ copy() [9/23]

template<typename T , dimensionality_t NumDimensions>
void cuda::memory::copy ( const context_t context,
T *  destination,
const array_t< T, NumDimensions > &  source,
optional_ref< const stream_t stream = {} 
)

Synchronously copies data into a CUDA array from non-array memory.

Template Parameters
NumDimensionsthe number of array dimensions; only 2 and 3 are supported values
Tarray element type
Parameters
destinationA pointer to a region of contiguous memory holding destination.size() values of type
Template Parameters
T.The memory may be located either on a CUDA device or in host memory.
Parameters
sourceA {
Template Parameters
NumDimensions}-dimensionalCUDA array

◆ copy() [10/23]

template<typename T , dimensionality_t NumDimensions>
void cuda::memory::copy ( T *  destination,
const array_t< T, NumDimensions > &  source,
optional_ref< const stream_t >  stream = {}
)
inline

Synchronously copies data from a CUDA array into non-array memory; if a stream is specified, the copy is instead scheduled asynchronously on that stream.

Template Parameters
NumDimensions: The number of array dimensions; only 2 and 3 are supported values.
T: The array element type.
Parameters
destination: A pointer to a region of contiguous memory able to hold source.size() values of type T, i.e. of size source.size() * sizeof(T) bytes. The memory may be located either on a CUDA device or in host memory.
source: A NumDimensions-dimensional CUDA array (cuda::array_t).
stream: Schedule the copy operation into this CUDA stream.

◆ copy() [11/23]

template<typename T , dimensionality_t NumDimensions>
void cuda::memory::copy ( span< T >  destination,
const array_t< T, NumDimensions > &  source,
optional_ref< const stream_t >  stream = {}
)

Copies the contents of a CUDA array into a sequence of contiguous elements in memory.

Template Parameters
T: A trivially-copy-constructible, trivially-destructible type of array elements.
Note
The destination span must be at least as large as the volume of the array.

◆ copy() [12/23]

template<typename T , dimensionality_t NumDimensions>
void cuda::memory::copy ( const array_t< T, NumDimensions > &  destination,
const array_t< T, NumDimensions > &  source,
optional_ref< const stream_t >  stream
)

Copies the contents of one CUDA array to another.

Template Parameters
T: A trivially-copy-constructible type of array elements.
Note
The destination array must be at least as large in each dimension as the source array.

◆ copy() [13/23]

template<typename T , dimensionality_t NumDimensions>
void cuda::memory::copy ( region_t  destination,
const array_t< T, NumDimensions > &  source,
optional_ref< const stream_t >  stream = {}
)

Copies the contents of a CUDA array into a region of memory; if a stream is specified, the copy is scheduled asynchronously on that stream.

Template Parameters
T: A trivially-copy-constructible type of array elements.
Note
The destination region must be large enough to hold all elements of the array, and may also be larger.
Parameters
destination: A memory region of size at least source.size() * sizeof(T) bytes.
source: A CUDA array (cuda::array_t).
stream: Schedule the copy operation in this CUDA stream.

◆ copy() [14/23]

template<typename T , dimensionality_t NumDimensions>
void cuda::memory::copy ( array_t< T, NumDimensions > &  destination,
const_region_t  source,
optional_ref< const stream_t >  stream = {}
)

Copies the contents of a region of memory into a CUDA array.

Template Parameters
T: A trivially-copy-constructible type of array elements.
Note
Only as many elements as fit in the array are copied, while the source region may be larger than what they take up.
Parameters
destination: A CUDA array to copy data into.
source: A memory region of size at least destination.size() * sizeof(T) bytes.
stream: Schedule the copy operation into this CUDA stream (or leave empty for a synchronous copy).
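
The difference between passing a stream and leaving it empty can be sketched with a toy model. Everything here is hypothetical and host-only: fake_stream is a queue of deferred operations rather than the library's stream_t, a nullable pointer stands in for optional_ref, and copy_sketch uses std::memcpy. The point is only that, with a stream, the destination's contents are not yet valid when the call returns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-in for a stream: a queue of deferred operations which
// run only when synchronize() is called. Not the library's stream_t.
struct fake_stream {
    std::vector<std::function<void()>> pending;
    void synchronize()
    {
        for (auto& operation : pending) { operation(); }
        pending.clear();
    }
};

// With no stream the copy completes before returning; with a stream it is
// merely enqueued, so the destination is not valid until synchronization.
inline void copy_sketch(void* destination, const void* source, std::size_t num_bytes,
                        fake_stream* stream = nullptr)
{
    assert(destination != nullptr && source != nullptr);
    if (stream == nullptr) {
        std::memcpy(destination, source, num_bytes);
        return;
    }
    stream->pending.push_back([=] { std::memcpy(destination, source, num_bytes); });
}
```

In this model the caller must keep both buffers alive until synchronize() has run, which mirrors the usual requirement for asynchronous copies.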

◆ copy() [15/23]

void cuda::memory::copy ( void *  destination,
void const *  source,
size_t  num_bytes,
optional_ref< const stream_t >  stream = {}
)
inline

Asynchronously copies data between memory spaces or within a memory space.

Note
Since we assume Compute Capability >= 2.0, all devices support the Unified Virtual Address Space, so the CUDA driver can determine, for each pointer, where the data is located, and one does not have to specify this.
This is the asynchronous version of memory::copy(void*, void const*, size_t).
Parameters
destination: A pointer to a memory region of size num_bytes, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
source: A pointer to a memory region of size num_bytes, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
num_bytes: The number of bytes to copy from source to destination.
stream: A stream on which to enqueue the copy operation.

◆ copy() [16/23]

template<typename T , size_t N>
void cuda::memory::copy ( c_array< T, N > &  destination,
const_region_t  source,
optional_ref< const stream_t >  stream = {}
)
inline

Copy the contents of a memory region into a C-style array, interpreting the memory as a sequence of elements of the array's element type. If a stream is specified, the copy is scheduled asynchronously on that stream.

Parameters
destination: A fixed-size C-style array to which to copy the data in source. As it is taken by reference rather than by the address of its first element, there is no array decay.
source: A region of at least sizeof(T)*N bytes with whose data to fill the destination array.
stream: Schedule the copy operation in this CUDA stream.
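
The size requirement on the source region can be sketched on host memory as follows. The names const_region and copy_into_array are hypothetical stand-ins for const_region_t and this overload, and throwing std::invalid_argument on a short region is a choice made for this sketch only; the reference does not say how the library reacts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical stand-in for const_region_t: a start address and a size in bytes.
struct const_region {
    const void* start;
    std::size_t size;
};

// Host-only sketch: reinterpret the first sizeof(T) * N bytes of `source` as
// N elements of type T. Throwing on a short region is this sketch's choice.
template <typename T, std::size_t N>
void copy_into_array(T (&destination)[N], const_region source)
{
    if (source.size < sizeof(T) * N) {
        throw std::invalid_argument("source region is smaller than the destination array");
    }
    std::memcpy(destination, source.start, sizeof(T) * N);
}
```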

◆ copy() [17/23]

template<typename T , size_t N>
void cuda::memory::copy ( region_t  destination,
c_array< const T, N > const &  source,
optional_ref< const stream_t >  stream = {}
)
inline

Copies data from a C-style array into a memory region; if a stream is specified, the copy is scheduled asynchronously on that stream.

Note
Since we assume Compute Capability >= 2.0, all devices support the Unified Virtual Address Space, so the CUDA driver can determine where the data behind each pointer passed to a copy function is located; one does not have to specify this.
The sources and destinations may all be in any memory space addressable in the unified virtual address space, which could be host-side memory, device global memory, device constant memory etc.
Parameters
destination: A region of memory to which to copy the data in source, of size at least that of source, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
source: A plain array whose contents are to be copied, either in host memory or on any CUDA device's global memory.
stream: A stream on which to enqueue the copy operation.

◆ copy() [18/23]

void cuda::memory::copy ( region_t  destination,
const_region_t  source,
size_t  num_bytes,
optional_ref< const stream_t >  stream = {}
)
inline

Asynchronously copies data between memory spaces or within a memory space.

Parameters
destination: A memory region of size no less than num_bytes, either in host memory or on any CUDA device's global memory. Must be registered with, or visible in, the same context as stream.
source: A memory region of size num_bytes, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
num_bytes: The number of bytes to copy from source to destination.
stream: A stream on which to enqueue the copy operation.

◆ copy() [19/23]

void cuda::memory::copy ( region_t  destination,
const_region_t  source,
optional_ref< const stream_t >  stream = {}
)
inline

Copies data between memory regions; if a stream is specified, the copy is scheduled asynchronously on that stream.

Parameters
destination: A region of memory to which to copy the data in source, of size at least that of source, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
source: A region whose contents are to be copied, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
stream: A stream on which to enqueue the copy operation.

◆ copy() [20/23]

void cuda::memory::copy ( region_t  destination,
void *  source,
optional_ref< const stream_t >  stream = {}
)
inline

Copy memory between memory regions; if a stream is specified, the copy is scheduled asynchronously on that stream.

Parameters
destination: A target region of memory into which to copy, either in host memory or on any CUDA device's global memory; enough memory will be copied to fill this region. Must be defined in the same context as the stream.
source: The beginning of a region of memory from which to copy, of the same size as destination, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
stream: A stream on which to enqueue the copy operation.

◆ copy() [21/23]

void cuda::memory::copy ( region_t  destination,
void *  source,
size_t  num_bytes,
optional_ref< const stream_t >  stream = {}
)
inline

Copy one region of memory into another; if a stream is specified, the copy is scheduled asynchronously on that stream.

Parameters
destination: A region of memory of size at least num_bytes to which to copy the data in source, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
source: A pointer to the beginning of a memory region of size num_bytes.
num_bytes: The number of bytes to copy from source to destination.
stream: A stream on which to enqueue the copy operation.

◆ copy() [22/23]

void cuda::memory::copy ( void *  destination,
const_region_t  source,
size_t  num_bytes,
optional_ref< const stream_t >  stream = {}
)
inline

Copy one region of memory to another location; if a stream is specified, the copy is scheduled asynchronously on that stream.

Parameters
destination: The beginning of a target region of memory (of size at least num_bytes) into which to copy, either in host memory or on any CUDA device's global memory. Must be registered with, or visible in, the same context as stream.
source: A region of memory from which to copy, of size at least num_bytes, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
num_bytes: The number of bytes to copy from source to destination.
stream: A stream on which to enqueue the copy operation.

◆ copy() [23/23]

void cuda::memory::copy ( void *  destination,
const_region_t  source,
optional_ref< const stream_t >  stream = {}
)
inline

Copies the contents of a memory region to another location; if a stream is specified, the copy is scheduled asynchronously on that stream.

Parameters
destination: The beginning of a memory region of the same size as source into which to copy data, either in host memory or on any CUDA device's global memory. The memory must be registered in, or visible within, the same context as stream.
source: A region whose contents are to be copied, either in host memory or on any CUDA device's global memory. Must be defined in the same context as the stream.
stream: A stream on which to enqueue the copy operation.

◆ copy_single()

template<typename T >
void cuda::memory::copy_single ( T *  destination,
const T *  source,
optional_ref< const stream_t >  stream = {}
)

Synchronously copies a single (typed) value between two memory locations; if a stream is specified, the copy is instead enqueued asynchronously on that stream.

Parameters
destination: A value residing either in host memory or on any CUDA device's global memory.
source: A value residing either in host memory or on any CUDA device's global memory.
stream: The CUDA command queue on which this copying will be enqueued.
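
A single-value copy amounts to copying sizeof(T) bytes. The following host-only sketch shows this with std::memcpy; the static_assert restricting T to trivially copyable types is an assumption of this sketch, not a documented requirement of the library.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Host-only stand-in: a single typed value is copied as sizeof(T) raw bytes,
// which is only meaningful for trivially copyable types.
template <typename T>
void copy_single(T* destination, const T* source)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "byte-wise copying requires a trivially copyable type");
    assert(destination != nullptr && source != nullptr);
    std::memcpy(destination, source, sizeof(T));
}
```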

◆ set() [1/2]

void cuda::memory::set ( void *  ptr,
int  byte_value,
size_t  num_bytes,
optional_ref< const stream_t >  stream = {}
)
inline

Sets a number of bytes in memory to a fixed value.

Note
The equivalent of ::std::memset - for any and all CUDA-related memory spaces.
Parameters
ptr: Address of the first byte in memory to set. May be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
byte_value: The value to set each byte of the memory region to.
num_bytes: The amount of memory to set to byte_value.
stream: A stream on which to schedule this action; may be omitted.

◆ set() [2/2]

void cuda::memory::set ( region_t  region,
int  byte_value,
optional_ref< const stream_t >  stream = {}
)
inline

Sets all bytes in a region of memory to a fixed value.

Note
The equivalent of ::std::memset - for any and all CUDA-related memory spaces.
Parameters
region: The memory region to set; may be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
byte_value: The value to set each byte of the memory region to.
stream: A stream on which to schedule this action; may be omitted.
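
Since both set() overloads are described as equivalents of ::std::memset, their relationship can be sketched on host memory. The names region and set_bytes are hypothetical stand-ins for region_t and set(). Note that std::memset converts the int value to unsigned char, so only its low 8 bits are used; this is a property of the host stand-in, shown here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for region_t: a start address and a size in bytes.
struct region {
    void* start;
    std::size_t size;
};

// Pointer-and-size form, delegating to ::std::memset on host memory.
inline void set_bytes(void* ptr, int byte_value, std::size_t num_bytes)
{
    assert(ptr != nullptr);
    std::memset(ptr, byte_value, num_bytes);
}

// Region form: the size comes from the region itself.
inline void set_bytes(region r, int byte_value)
{
    set_bytes(r.start, byte_value, r.size);
}
```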

◆ zero() [1/3]

void cuda::memory::zero ( region_t  region,
optional_ref< const stream_t >  stream = {}
)
inline

Sets all bytes in a region of memory to 0 (zero).

Parameters
region: The memory region to zero-out; may be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
stream: A stream on which to schedule this action; may be omitted.

◆ zero() [2/3]

void cuda::memory::zero ( void *  ptr,
size_t  num_bytes,
optional_ref< const stream_t >  stream = {}
)
inline

Zero-out a region of memory.

Parameters
ptr: The beginning of a region of memory to zero-out; may be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
num_bytes: The size in bytes of the region of memory to zero-out.
stream: A stream on which to schedule this action; may be omitted.

◆ zero() [3/3]

template<typename T >
void cuda::memory::zero ( T *  ptr)
inline

Sets all bytes of a single pointed-to value to 0.

Parameters
ptr: A pointer to a single element of a certain type, which may be in host-side memory, global CUDA-device-side memory or CUDA-managed memory.
stream: A stream on which to schedule this action; may be omitted.
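
The relationship between the single-value form and the pointer-and-size form of zero() can be sketched on host memory. The name zero_out is a hypothetical stand-in, and std::memset replaces whatever the library does for device or managed memory; the sketch only shows that the byte count is taken from the pointed-to type.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Host-only stand-in for zeroing `num_bytes` bytes starting at `ptr`.
inline void zero_out(void* ptr, std::size_t num_bytes)
{
    assert(ptr != nullptr);
    std::memset(ptr, 0, num_bytes);
}

// Single-value form: the byte count is deduced from the pointed-to type.
template <typename T>
void zero_out(T* ptr)
{
    zero_out(static_cast<void*>(ptr), sizeof(T));
}
```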