cuda-api-wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
A subset of the options for compiling PTX code into SASS, usable both with the CUDA driver and with NVIDIA's PTX compilation library.
#include <common_ptx_compilation_options.hpp>
Public Member Functions | |
virtual optional< caching_mode_t< memory_operation_t::load > > & | default_load_caching_mode () |
Accessor for the default load caching mode; see default_load_caching_mode_. | |
virtual optional< caching_mode_t< memory_operation_t::load > > | default_load_caching_mode () const |
Public Attributes | |
optional< ptx_register_count_t > | max_num_registers_per_thread {} |
Limit the number of registers which a kernel thread may use. | |
optional< grid::block_dimension_t > | min_num_threads_per_block {} |
The minimum number of threads per block which the compiler should target. | |
optional< optimization_level_t > | optimization_level {} |
Compilation optimization level (as in -O1, -O2 etc.) | |
optional< device::compute_capability_t > | specific_target |
Which NVIDIA physical architecture to generate SASS code for. | |
bool | generate_source_line_info {false} |
Generate indications of which PTX/SASS instructions correspond to which lines of the source code, within the compiled output. | |
bool | generate_debug_info {false} |
Generate debugging information associating SASS instructions to locations in the source, embedding it within the compilation output (-g) | |
optional< caching_mode_t< memory_operation_t::load > > | default_load_caching_mode_ |
Which of the memory-load-instruction caching modes (see caching_mode_t) to use by default, when no caching mode is specified in a PTX instruction. | |
bool | generate_relocatable_device_code { false } |
Generate relocatable code that can be linked with other relocatable device code. | |
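The attributes listed above are mostly optionals and booleans, so an options object can be filled in field by field, leaving unset anything the compiler should decide on its own. The following is a minimal sketch of that usage pattern, not the library's actual definition: the `sketch` namespace, the struct `ptx_options_sketch`, the underlying integer types, and the helper functions are all assumptions made so the example compiles without CUDA installed. Only the member names are taken from the listing above.

```cpp
#include <cassert>
#include <optional>

namespace sketch {

// Hypothetical stand-ins for the library's types (assumptions, not the real definitions).
using ptx_register_count_t = int;
using block_dimension_t    = unsigned;
using optimization_level_t = int;
struct compute_capability_t { unsigned major_, minor_; };

// Mirrors the documented public attributes; std::optional stands in for the library's optional.
struct ptx_options_sketch {
    std::optional<ptx_register_count_t> max_num_registers_per_thread {};
    std::optional<block_dimension_t>    min_num_threads_per_block {};
    std::optional<optimization_level_t> optimization_level {};
    std::optional<compute_capability_t> specific_target {};
    bool generate_source_line_info { false };
    bool generate_debug_info { false };
    bool generate_relocatable_device_code { false };
};

// Sets a few fields and leaves the rest at their "unspecified" defaults.
inline ptx_options_sketch make_example_options()
{
    ptx_options_sketch opts;
    opts.max_num_registers_per_thread = 64;             // cap register usage per thread
    opts.optimization_level = 3;                        // as in -O3
    opts.specific_target = compute_capability_t{8, 0};  // illustrative target value
    opts.generate_source_line_info = true;
    return opts;
}

// Unset optionals remain empty, which is how "no preference" is expressed here.
inline void check_example_options()
{
    const auto opts = make_example_options();
    assert(opts.optimization_level.has_value());
    assert(!opts.min_num_threads_per_block.has_value());
}

} // namespace sketch
```

With the real library, the same assignments would presumably be made on a `cuda::rtc::common_ptx_compilation_options_t` object (or a class derived from it) rather than on this stand-in.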
optional<caching_mode_t<memory_operation_t::load> > cuda::rtc::common_ptx_compilation_options_t::default_load_caching_mode_
Which of the memory-load-instruction caching modes (see caching_mode_t) to use by default, when no caching mode is specified in a PTX instruction.
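The class pairs this data member with two virtual accessors of the same name minus the trailing underscore: a non-const overload returning a reference and a const overload returning a copy. The sketch below illustrates that accessor shape under stated assumptions: `load_caching_mode` is a hypothetical stand-in for `caching_mode_t<memory_operation_t::load>`, its enumerator names are illustrative only, and `base_options` is not the library's class.

```cpp
#include <cassert>
#include <optional>

namespace sketch {

// Hypothetical stand-in for caching_mode_t<memory_operation_t::load>.
enum class load_caching_mode { ca, cg };

struct base_options {
    // Empty means "no default specified; let the compiler choose".
    std::optional<load_caching_mode> default_load_caching_mode_ {};

    virtual ~base_options() = default;

    // Non-const accessor: returns a reference, so callers can assign through it.
    virtual std::optional<load_caching_mode>& default_load_caching_mode()
    {
        return default_load_caching_mode_;
    }

    // Const accessor: returns a copy of the current setting.
    virtual std::optional<load_caching_mode> default_load_caching_mode() const
    {
        return default_load_caching_mode_;
    }
};

// Assigns through the non-const accessor, then reads back through the const one.
inline void demo_accessors()
{
    base_options opts;
    assert(!opts.default_load_caching_mode().has_value());
    opts.default_load_caching_mode() = load_caching_mode::cg;
    const base_options& view = opts;
    assert(view.default_load_caching_mode() == load_caching_mode::cg);
}

} // namespace sketch
```

Because the accessors are virtual, a derived options class can override them, for instance to store or validate the value differently, while code written against the base class keeps working.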
bool cuda::rtc::common_ptx_compilation_options_t::generate_relocatable_device_code { false }
Generate relocatable code that can be linked with other relocatable device code.
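Several of these options have familiar command-line counterparts in `ptxas`, which can help in reading what each field asks for. The sketch below renders a stand-in options struct as `ptxas`-style flags. It is an illustration under assumptions, not the library's own translation logic: `options_t`, `target_t`, and `to_ptxas_style_flags` are hypothetical names, and the exact strings the library passes to the driver or to the PTX compilation library may differ.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

namespace sketch {

struct target_t { unsigned major_, minor_; };  // hypothetical compute-capability stand-in

struct options_t {  // hypothetical subset of the documented attributes
    std::optional<int>      max_num_registers_per_thread {};
    std::optional<int>      optimization_level {};
    std::optional<target_t> specific_target {};
    bool generate_source_line_info { false };
    bool generate_debug_info { false };
    bool generate_relocatable_device_code { false };
};

// Emits one flag per option that was actually set; unset optionals emit nothing.
inline std::vector<std::string> to_ptxas_style_flags(const options_t& o)
{
    std::vector<std::string> flags;
    if (o.max_num_registers_per_thread) {
        flags.push_back("--maxrregcount=" + std::to_string(*o.max_num_registers_per_thread));
    }
    if (o.optimization_level) {
        flags.push_back("-O" + std::to_string(*o.optimization_level));
    }
    if (o.specific_target) {
        flags.push_back("--gpu-name=sm_" + std::to_string(o.specific_target->major_)
                        + std::to_string(o.specific_target->minor_));
    }
    if (o.generate_source_line_info)        { flags.push_back("-lineinfo"); }
    if (o.generate_debug_info)              { flags.push_back("-g"); }
    if (o.generate_relocatable_device_code) { flags.push_back("-c"); }
    return flags;
}

// A default-constructed options object sets nothing, so it yields no flags.
inline void demo_flags()
{
    assert(to_ptxas_style_flags(options_t{}).empty());
}

} // namespace sketch
```

Representing unset values as empty optionals keeps "use the compiler's default" distinct from any explicit value, which is why only the fields that were set show up in the result.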