cuda-api-wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
cuda::rtc::common_ptx_compilation_options_t Struct Reference

A subset of the options for compiling PTX code into SASS, usable both with the CUDA driver and with NVIDIA's PTX compilation library.

#include <common_ptx_compilation_options.hpp>

Public Member Functions

virtual optional< caching_mode_t< memory_operation_t::load > > & default_load_caching_mode ()
 see default_load_caching_mode_
 
virtual optional< caching_mode_t< memory_operation_t::load > > default_load_caching_mode () const
 

Public Attributes

optional< ptx_register_count_t > max_num_registers_per_thread {}
 Limit the number of registers which a kernel thread may use.
 
optional< grid::block_dimension_t > min_num_threads_per_block {}
 The minimum number of threads per block which the compiler should target.
 
optional< optimization_level_t > optimization_level {}
 Compilation optimization level (as in -O1, -O2 etc.)
 
optional< device::compute_capability_t > specific_target
 Which NVIDIA physical architecture to generate SASS code for.
 
bool generate_source_line_info {false}
 Generate indications of which PTX/SASS instructions correspond to which lines of the source code, within the compiled output.
 
bool generate_debug_info {false}
 Generate debugging information associating SASS instructions to locations in the source, embedding it within the compilation output (-g)
 
optional< caching_mode_t< memory_operation_t::load > > default_load_caching_mode_
 Which of the memory-load-instruction caching modes (see caching_mode_t) to use by default, when no caching mode is specified in a PTX instruction.
 
bool generate_relocatable_device_code { false }
 Generate relocatable code that can be linked with other relocatable device code.
 

Detailed Description

A subset of the options for compiling PTX code into SASS, usable both with the CUDA driver and with NVIDIA's PTX compilation library.
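
Most of the attributes listed above are optionals initialized with empty braces, so they start out disengaged until a caller assigns them. The following is a minimal sketch of that pattern, not the library's actual definition: `ptx_options_sketch`, `explicitly_set` and `example_usage` are hypothetical names, and plain `int` stands in for the library's own types such as `ptx_register_count_t`.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in, NOT the library's struct: it mirrors only a few of
// the documented members, with int replacing the library's own types.
struct ptx_options_sketch {
    std::optional<int> max_num_registers_per_thread {};
    std::optional<int> min_num_threads_per_block {};
    std::optional<int> optimization_level {};
    bool generate_source_line_info { false };
    bool generate_debug_info { false };
};

// Lists the names of the options the caller has explicitly engaged.
inline std::vector<std::string> explicitly_set(const ptx_options_sketch& o)
{
    std::vector<std::string> names;
    if (o.max_num_registers_per_thread) names.push_back("max_num_registers_per_thread");
    if (o.min_num_threads_per_block)    names.push_back("min_num_threads_per_block");
    if (o.optimization_level)           names.push_back("optimization_level");
    if (o.generate_source_line_info)    names.push_back("generate_source_line_info");
    if (o.generate_debug_info)          names.push_back("generate_debug_info");
    return names;
}

inline void example_usage()
{
    ptx_options_sketch opts;               // every optional starts disengaged
    assert(explicitly_set(opts).empty());

    opts.max_num_registers_per_thread = 32;
    opts.generate_debug_info = true;
    assert(explicitly_set(opts).size() == 2);
}
```

Keeping unset options disengaged lets the code that consumes the struct tell "no preference" apart from an explicit value, which a plain default value could not express.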

Member Data Documentation

◆ default_load_caching_mode_

optional<caching_mode_t<memory_operation_t::load> > cuda::rtc::common_ptx_compilation_options_t::default_load_caching_mode_

Which of the memory-load-instruction caching modes (see caching_mode_t) to use by default, when no caching mode is specified in a PTX instruction.
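
The member function list pairs this attribute with two virtual accessors of the same name without the trailing underscore. The sketch below shows how such a pair presumably behaves, under stated assumptions: `load_caching_sketch` is a hypothetical, simplified stand-in for `caching_mode_t<memory_operation_t::load>`, with enumerators named after the PTX `ld` cache operators, and `caching_options_sketch` is not the library's struct.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the library's load caching mode type; the
// enumerators follow the PTX ld cache operators .ca, .cg, .cs, .lu and .cv.
enum class load_caching_sketch { ca, cg, cs, lu, cv };

// Hypothetical stand-in struct showing the attribute and its accessor pair.
struct caching_options_sketch {
    std::optional<load_caching_sketch> default_load_caching_mode_ {};

    // Non-const accessor: returns a reference, so callers can assign through it.
    virtual std::optional<load_caching_sketch>& default_load_caching_mode()
    {
        return default_load_caching_mode_;
    }

    // Const accessor: returns a copy of the current setting.
    virtual std::optional<load_caching_sketch> default_load_caching_mode() const
    {
        return default_load_caching_mode_;
    }

    virtual ~caching_options_sketch() = default;
};

inline void caching_example()
{
    caching_options_sketch opts;
    assert(!opts.default_load_caching_mode().has_value()); // no default chosen yet

    opts.default_load_caching_mode() = load_caching_sketch::cg; // set via the reference

    const caching_options_sketch& view = opts;
    assert(view.default_load_caching_mode() == load_caching_sketch::cg);
}
```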

◆ generate_relocatable_device_code

bool cuda::rtc::common_ptx_compilation_options_t::generate_relocatable_device_code { false }

Generate relocatable code that can be linked with other relocatable device code.

Note
For NVRTC, setting this option is equivalent to specifying "--device-c"; leaving it unset is equivalent to specifying "--device-w".
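
The mapping described in the note can be sketched as a small helper. This is an illustration only: `rdc_options_sketch` and `nvrtc_rdc_flag` are hypothetical names, and the library's own translation of options into NVRTC flags may be organized differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in holding only the one documented attribute.
struct rdc_options_sketch {
    bool generate_relocatable_device_code { false };
};

// Returns the NVRTC flag the note associates with the attribute's value.
inline std::string nvrtc_rdc_flag(const rdc_options_sketch& o)
{
    return o.generate_relocatable_device_code ? "--device-c" : "--device-w";
}

inline void rdc_example()
{
    rdc_options_sketch opts;                      // defaults to false
    assert(nvrtc_rdc_flag(opts) == "--device-w");

    opts.generate_relocatable_device_code = true;
    assert(nvrtc_rdc_flag(opts) == "--device-c");
}
```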

The documentation for this struct was generated from the following file: