cuda-api-wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
cuda::event Namespace Reference

Definitions and functionality related to CUDA events (not including the event wrapper type event_t itself) More...

Typedefs

using duration_t = ::std::chrono::duration< float, ::std::milli >
 The type used by the CUDA Runtime API to represent the time difference between pairs of events.
 
using handle_t = CUevent
 The CUDA Runtime API's numeric handle for events.
 

Enumerations

enum  : bool {
  sync_by_busy_waiting = false,
  sync_by_blocking = true
}
 Synchronization option for cuda::event_t's. More...
 
enum  : bool {
  dont_record_timings = false,
  do_record_timings = true
}
 Should the CUDA Runtime API record timing information for events as it schedules them?
 
enum  : bool {
  not_interprocess = false,
  interprocess = true,
  single_process = not_interprocess
}
 IPC usability option for cuda::event_t's. More...
 

Functions

event_t wrap (device::id_t device_id, context::handle_t context_handle, handle_t event_handle, bool take_ownership=false) noexcept
 Wrap an existing CUDA event in an event_t instance. More...
 
::std::string identify (const event_t &event)
 
duration_t time_elapsed_between (const event_t &start, const event_t &end)
 Determine (inaccurately) the elapsed time between two events. More...
 
event_t create (device_t &device, bool uses_blocking_sync=sync_by_busy_waiting, bool records_timing=do_record_timings, bool interprocess=not_interprocess)
 Creates a new event on a device. More...
 
event_t create (const context_t &context, bool uses_blocking_sync, bool records_timing, bool interprocess)
 

Detailed Description

Definitions and functionality related to CUDA events (not including the event wrapper type event_t itself)

Enumeration Type Documentation

◆ anonymous enum

anonymous enum : bool

Synchronization option for cuda::event_t's.

Enumerator
sync_by_busy_waiting 

The thread calling event_.synchronize() will enter a busy-wait loop; this may minimize the delay between the conclusion of kernel execution and control returning to the thread, but it is very wasteful of CPU time.

sync_by_blocking 

The thread calling event_.synchronize() will block: it yields control of the CPU, and only becomes ready for execution again after the kernel has completed its execution, at which point it has to wait its turn among the other ready threads.

This does not waste CPU computing time, but results in a longer delay.
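
For illustration, here is a minimal sketch of selecting the blocking synchronization behavior when creating an event. It assumes the library's umbrella header is <cuda/api.hpp> and that a device proxy can be obtained via cuda::device::current::get(); both are defined outside this namespace.

    #include <cuda/api.hpp>   // assumed umbrella header of cuda-api-wrappers

    int main()
    {
        auto device = cuda::device::current::get();  // assumed device-proxy accessor

        // An event whose synchronize() blocks the calling thread
        // rather than busy-waiting on it:
        auto event = cuda::event::create(
            device,
            cuda::event::sync_by_blocking,   // yield the CPU while waiting
            cuda::event::do_record_timings,  // keep timing information
            cuda::event::not_interprocess);  // usable only within this process

        // ... record the event on a stream and enqueue work here ...

        event.synchronize();  // returns once the event has occurred on the GPU
    }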

◆ anonymous enum

anonymous enum : bool

IPC usability option for cuda::event_t's.

Enumerator
not_interprocess 

Can only be used by the process which created it.

interprocess 

Can be shared between processes. Must not be able to record timings.
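
The corresponding choice for inter-process use might look like the following sketch, under the same assumptions as the previous snippet (umbrella header <cuda/api.hpp>, device proxy via cuda::device::current::get()); note that, as stated above, timing recording must be disabled for such an event.

    #include <cuda/api.hpp>   // assumed umbrella header of cuda-api-wrappers

    int main()
    {
        auto device = cuda::device::current::get();  // assumed device-proxy accessor

        // An event intended for sharing with other processes must not record timings:
        auto ipc_event = cuda::event::create(
            device,
            cuda::event::sync_by_busy_waiting,
            cuda::event::dont_record_timings,  // required for interprocess events
            cuda::event::interprocess);
        (void) ipc_event;
    }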

Function Documentation

◆ create()

event_t cuda::event::create ( device_t &  device,
bool  uses_blocking_sync = sync_by_busy_waiting,
bool  records_timing = do_record_timings,
bool  interprocess = not_interprocess 
)
inline

Creates a new event on a device.

Parameters
device              The device on which to create the new event
uses_blocking_sync  When synchronizing on this new event, shall a thread busy-wait for it, or block?
records_timing      Can this event be used to record time values (e.g. the duration between events)?
interprocess        Can multiple processes work with the constructed event?
Returns
The constructed event proxy
Note
Creating an event
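
A short usage sketch, assuming a device proxy obtained via cuda::device::current::get() (defined outside this namespace): every parameter other than the device has a default, so the simplest call creates a busy-wait-synchronized, timing-enabled, single-process event.

    #include <cuda/api.hpp>   // assumed umbrella header of cuda-api-wrappers

    int main()
    {
        auto device = cuda::device::current::get();  // assumed device-proxy accessor

        // Defaults: sync_by_busy_waiting, do_record_timings, not_interprocess
        auto ev_default = cuda::event::create(device);

        // The same call with every option spelled out:
        auto ev_explicit = cuda::event::create(
            device,
            cuda::event::sync_by_busy_waiting,
            cuda::event::do_record_timings,
            cuda::event::not_interprocess);
        (void) ev_default; (void) ev_explicit;
    }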

◆ time_elapsed_between()

duration_t cuda::event::time_elapsed_between ( const event_t &  start,
const event_t &  end 
)
inline

Determine (inaccurately) the elapsed time between two events.

Note
Q: Why the weird output type? A: This is what the CUDA Runtime API itself returns
Parameters
start  first timepoint event
end    second, later, timepoint event
Returns
the difference in the (inaccurately) measured time, in msec
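
A sketch of the two-event timing pattern this function supports. It assumes a stream proxy obtained via device.default_stream() and the event_t::record(stream) member, both documented outside this namespace; both events must have been created with timing recording enabled (the default).

    #include <cuda/api.hpp>   // assumed umbrella header of cuda-api-wrappers
    #include <iostream>

    int main()
    {
        auto device = cuda::device::current::get();  // assumed device-proxy accessor
        auto stream = device.default_stream();       // assumed stream accessor

        auto start = cuda::event::create(device);
        auto end   = cuda::event::create(device);

        start.record(stream);                        // assumed event_t member
        // ... enqueue the work to be timed on the same stream ...
        end.record(stream);
        end.synchronize();                           // wait until 'end' has occurred

        cuda::event::duration_t elapsed =
            cuda::event::time_elapsed_between(start, end);
        std::cout << "Elapsed: " << elapsed.count() << " ms\n";
    }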

◆ wrap()

event_t cuda::event::wrap ( device::id_t  device_id,
context::handle_t  context_handle,
handle_t  event_handle,
bool  take_ownership = false 
)
inlinenoexcept

Wrap an existing CUDA event in an event_t instance.

Note
This is a named constructor idiom, used instead of direct access to the constructor of the same signature, to emphasize that a new event is not created here.
Parameters
device_id       ID of the device to which the context and the event belong
context_handle  Handle of the context in which this event was created
event_handle    Handle of the pre-existing event
take_ownership  When set to false, the CUDA event will not be destroyed along with the proxy; use this setting when temporarily working with an event which exists irrespective of the current context and outlasts it. When set to true, the proxy class will act as it usually does, destroying the event when it is itself destructed.
Returns
an event wrapper associated with the specified event
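
A sketch of adopting, without taking ownership, an event which was created elsewhere (for example, directly through the driver API). The device_id and context_handle values are placeholders for whatever the surrounding code already has on hand.

    #include <cuda/api.hpp>   // assumed umbrella header of cuda-api-wrappers
    #include <cuda.h>

    // device_id and context_handle are assumed to be known to the caller.
    cuda::event_t adopt_event(
        cuda::device::id_t      device_id,
        cuda::context::handle_t context_handle,
        CUevent                 raw_event)
    {
        // take_ownership == false: destroying the wrapper will not destroy raw_event
        return cuda::event::wrap(device_id, context_handle, raw_event,
            /* take_ownership = */ false);
    }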