cuda-api-wrappers
Thin C++-flavored wrappers for the CUDA Runtime API
A convenience class for holding, setting and inspecting options for a CUDA binary code linking process, which may also involve PTX compilation.
#include <link_options.hpp>


Public Attributes | |
| struct { | |
| optional< span< char > > info | |
| Non-error information regarding the linking process (i.e. its "standard output" stream) | |
| optional< span< char > > error | |
| Information regarding errors in the linking process (i.e. its "standard error" stream) | |
| bool verbose | |
| Control whether the info and error logging will be verbose. | |
| } | logs |
| Options related to logging the link process | |
| bool | obtain_target_from_cuda_context { true } |
| Instead of using the explicitly specified binary target from common_ptx_compilation_options_t::specific_target, use the device of the current CUDA context as the target for binary generation. | |
| optional< fallback_strategy_for_binary_code_t > | fallback_strategy_for_binary_code |
| Strategy to use, if any, for obtaining fully compiled binary code when it is not directly available in the input to the link process. | |
Public Attributes inherited from cuda::rtc::common_ptx_compilation_options_t | |
| optional< ptx_register_count_t > | max_num_registers_per_thread {} |
| Limit the number of registers which a kernel thread may use. | |
| optional< grid::block_dimension_t > | min_num_threads_per_block {} |
| The minimum number of threads per block which the compiler should target. | |
| optional< optimization_level_t > | optimization_level {} |
| Compilation optimization level (as in -O1, -O2 etc.) | |
| optional< device::compute_capability_t > | specific_target |
| Which NVIDIA physical architecture to generate SASS code for. | |
| bool | generate_source_line_info {false} |
| Generate indications of which PTX/SASS instructions correspond to which lines of the source code, within the compiled output. | |
| bool | generate_debug_info {false} |
| Generate debugging information associating SASS instructions with locations in the source, embedding it within the compilation output (-g). | |
| optional< caching_mode_t< memory_operation_t::load > > | default_load_caching_mode_ |
| Which of the memory-load-instruction caching modes (see caching_mode_t) to use by default, when no caching mode is specified in a PTX instruction. | |
| bool | generate_relocatable_device_code { false } |
| Generate relocatable code that can be linked with other relocatable device code. | |
Additional Inherited Members | |
Public Member Functions inherited from cuda::rtc::common_ptx_compilation_options_t | |
| virtual optional< caching_mode_t< memory_operation_t::load > > & | default_load_caching_mode () |
| see default_load_caching_mode_ | |
| virtual optional< caching_mode_t< memory_operation_t::load > > | default_load_caching_mode () const |
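The interplay between obtain_target_from_cuda_context and specific_target, and the use of optional<> for "unset" options, may be easier to follow in code. The following is a minimal sketch only: `compute_capability_sketch`, `link_options_sketch`, `effective_target` and `explicitly_set_options` are hypothetical stand-ins written for this illustration and are not part of cuda-api-wrappers. It assumes that the context flag, when set, takes precedence over specific_target, and that an unset optional leaves the choice to the linker.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for device::compute_capability_t (illustration only).
struct compute_capability_sketch {
    int major_version;
    int minor_version;
};

// Hypothetical stand-in mirroring a few of the documented members; this is
// NOT the library's class definition.
struct link_options_sketch {
    // Same default as documented above: { true }.
    bool obtain_target_from_cuda_context { true };
    // Assumption: an unset optional means "let the linker decide".
    std::optional<compute_capability_sketch> specific_target {};
    std::optional<int> max_num_registers_per_thread {};
    std::optional<int> optimization_level {};
    bool generate_debug_info { false };
};

// Assumed precedence: when the flag is set, the current context's device is
// the target; otherwise specific_target is used, and it may itself be unset.
inline std::optional<compute_capability_sketch> effective_target(
    const link_options_sketch& opts,
    compute_capability_sketch context_device)
{
    if (opts.obtain_target_from_cuda_context) {
        return context_device;
    }
    return opts.specific_target;
}

// Collects the names of the optional-valued options the caller has set,
// i.e. the ones that would need to be passed on explicitly.
inline std::vector<std::string> explicitly_set_options(
    const link_options_sketch& opts)
{
    std::vector<std::string> names;
    if (opts.specific_target) {
        names.push_back("specific_target");
    }
    if (opts.max_num_registers_per_thread) {
        names.push_back("max_num_registers_per_thread");
    }
    if (opts.optimization_level) {
        names.push_back("optimization_level");
    }
    return names;
}
```

With a default-constructed `link_options_sketch`, `effective_target` returns the context's device. Clearing the flag without setting specific_target yields an empty optional, which a caller would presumably need to treat as "no explicit target requested".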