rocPRIM
The block_discontinuity class is a block-level parallel primitive that provides methods for flagging discontinuities within an ordered set of items partitioned across the threads of a block.
#include <block_discontinuity.hpp>


Public Types | |
| using | storage_type = detail::raw_storage< storage_type_ > |
| Struct used to allocate the temporary memory required for thread communication during operations provided by the related parallel primitive. | |
Public Member Functions | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_INLINE void | flag_heads (Flag(&head_flags)[ItemsPerThread], const T(&input)[ItemsPerThread], FlagOp flag_op, storage_type &storage) |
Tags head_flags that indicate discontinuities between items partitioned across the thread block, where the first item has no reference and is always flagged. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_FORCE_INLINE void | flag_heads (Flag(&head_flags)[ItemsPerThread], const T(&input)[ItemsPerThread], FlagOp flag_op) |
| This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_INLINE void | flag_heads (Flag(&head_flags)[ItemsPerThread], T tile_predecessor_item, const T(&input)[ItemsPerThread], FlagOp flag_op, storage_type &storage) |
Tags head_flags that indicate discontinuities between items partitioned across the thread block, where the first item of the first thread is compared against a tile_predecessor_item. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_FORCE_INLINE void | flag_heads (Flag(&head_flags)[ItemsPerThread], T tile_predecessor_item, const T(&input)[ItemsPerThread], FlagOp flag_op) |
| This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_INLINE void | flag_tails (Flag(&tail_flags)[ItemsPerThread], const T(&input)[ItemsPerThread], FlagOp flag_op, storage_type &storage) |
Tags tail_flags that indicate discontinuities between items partitioned across the thread block, where the last item has no reference and is always flagged. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_FORCE_INLINE void | flag_tails (Flag(&tail_flags)[ItemsPerThread], const T(&input)[ItemsPerThread], FlagOp flag_op) |
| This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_INLINE void | flag_tails (Flag(&tail_flags)[ItemsPerThread], T tile_successor_item, const T(&input)[ItemsPerThread], FlagOp flag_op, storage_type &storage) |
Tags tail_flags that indicate discontinuities between items partitioned across the thread block, where the last item of the last thread is compared against a tile_successor_item. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_FORCE_INLINE void | flag_tails (Flag(&tail_flags)[ItemsPerThread], T tile_successor_item, const T(&input)[ItemsPerThread], FlagOp flag_op) |
| This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_INLINE void | flag_heads_and_tails (Flag(&head_flags)[ItemsPerThread], Flag(&tail_flags)[ItemsPerThread], const T(&input)[ItemsPerThread], FlagOp flag_op, storage_type &storage) |
Tags both head_flags and tail_flags that indicate discontinuities between items partitioned across the thread block. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_FORCE_INLINE void | flag_heads_and_tails (Flag(&head_flags)[ItemsPerThread], Flag(&tail_flags)[ItemsPerThread], const T(&input)[ItemsPerThread], FlagOp flag_op) |
| This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_INLINE void | flag_heads_and_tails (Flag(&head_flags)[ItemsPerThread], Flag(&tail_flags)[ItemsPerThread], T tile_successor_item, const T(&input)[ItemsPerThread], FlagOp flag_op, storage_type &storage) |
Tags both head_flags and tail_flags that indicate discontinuities between items partitioned across the thread block, where the last item of the last thread is compared against a tile_successor_item. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_FORCE_INLINE void | flag_heads_and_tails (Flag(&head_flags)[ItemsPerThread], Flag(&tail_flags)[ItemsPerThread], T tile_successor_item, const T(&input)[ItemsPerThread], FlagOp flag_op) |
| This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_INLINE void | flag_heads_and_tails (Flag(&head_flags)[ItemsPerThread], T tile_predecessor_item, Flag(&tail_flags)[ItemsPerThread], const T(&input)[ItemsPerThread], FlagOp flag_op, storage_type &storage) |
Tags both head_flags and tail_flags that indicate discontinuities between items partitioned across the thread block, where the first item of the first thread is compared against a tile_predecessor_item. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_FORCE_INLINE void | flag_heads_and_tails (Flag(&head_flags)[ItemsPerThread], T tile_predecessor_item, Flag(&tail_flags)[ItemsPerThread], const T(&input)[ItemsPerThread], FlagOp flag_op) |
| This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_INLINE void | flag_heads_and_tails (Flag(&head_flags)[ItemsPerThread], T tile_predecessor_item, Flag(&tail_flags)[ItemsPerThread], T tile_successor_item, const T(&input)[ItemsPerThread], FlagOp flag_op, storage_type &storage) |
Tags both head_flags and tail_flags that indicate discontinuities between items partitioned across the thread block, where the first item of the first thread is compared against a tile_predecessor_item and the last item of the last thread is compared against a tile_successor_item. | |
| template<unsigned int ItemsPerThread, class Flag , class FlagOp > | |
| ROCPRIM_DEVICE ROCPRIM_FORCE_INLINE void | flag_heads_and_tails (Flag(&head_flags)[ItemsPerThread], T tile_predecessor_item, Flag(&tail_flags)[ItemsPerThread], T tile_successor_item, const T(&input)[ItemsPerThread], FlagOp flag_op) |
| This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself. | |
The block_discontinuity class is a block-level parallel primitive that provides methods for flagging discontinuities within an ordered set of items partitioned across the threads of a block.
| T | - the input type. |
| BlockSize | - the number of threads in a block. |
In the examples, the discontinuity operation is performed on a block of 128 threads, using type int.
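The following host-side sketch emulates what a block of 128 threads computes when flagging heads on int items. It is plain sequential C++, not rocPRIM device code. The name model_block_flag_heads, the blocked arrangement (thread t holds items t * ItemsPerThread onward), and the exchange of each thread's last item through a last_items array are assumptions made for illustration; the actual implementation may differ.

```cpp
#include <array>
#include <cstddef>

// Host-side model of a block-wide flag_heads (illustrative, NOT rocPRIM code).
// input[t] holds the ItemsPerThread items owned by emulated thread t.
template <class T, std::size_t BlockSize, std::size_t ItemsPerThread, class FlagOp>
std::array<std::array<bool, ItemsPerThread>, BlockSize>
model_block_flag_heads(const std::array<std::array<T, ItemsPerThread>, BlockSize>& input,
                       FlagOp flag_op)
{
    // Step 1: every thread publishes its last item. On the device this role
    // would be played by the temporary storage shared by the block.
    std::array<T, BlockSize> last_items{};
    for (std::size_t t = 0; t < BlockSize; ++t)
    {
        last_items[t] = input[t][ItemsPerThread - 1];
    }

    // Step 2: every thread flags its own items against their predecessors.
    std::array<std::array<bool, ItemsPerThread>, BlockSize> head_flags{};
    for (std::size_t t = 0; t < BlockSize; ++t)
    {
        for (std::size_t i = 0; i < ItemsPerThread; ++i)
        {
            if (i > 0)
                head_flags[t][i] = flag_op(input[t][i - 1], input[t][i]);
            else if (t > 0)
                head_flags[t][0] = flag_op(last_items[t - 1], input[t][0]);
            else
                head_flags[0][0] = true; // first item of the block: always flagged
        }
    }
    return head_flags;
}
```

With a "not equal" flag operator, a flag is set exactly where a new run of equal values starts, including at boundaries between threads.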
| using block_discontinuity< T, BlockSizeX, BlockSizeY, BlockSizeZ >::storage_type = detail::raw_storage<storage_type_> |
Struct used to allocate the temporary memory required for thread communication during operations provided by the related parallel primitive.
Depending on the implementation, the operations exposed by a parallel primitive may require temporary storage for thread communication. The storage should be allocated using the __shared__ keyword. It can be aliased to externally allocated memory, or be part of a union type with other storage types to increase shared memory reusability.
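The union approach can be illustrated on the host with stand-in types. In this sketch, discontinuity_storage_model and scan_storage_model are hypothetical placeholders for the storage_type of two different primitives; their members and sizes are invented for illustration and say nothing about the real layouts.

```cpp
#include <cstddef>

// Hypothetical stand-ins for the storage_type of two block primitives.
struct discontinuity_storage_model
{
    int first_items[128];
    int last_items[128];
};

struct scan_storage_model
{
    int partials[128];
};

// Members of a union share the same bytes, so the footprint is that of the
// largest member rather than the sum of all members.
union shared_storage_model
{
    discontinuity_storage_model discontinuity;
    scan_storage_model          scan;
};

static_assert(sizeof(shared_storage_model)
                  == (sizeof(discontinuity_storage_model) > sizeof(scan_storage_model)
                          ? sizeof(discontinuity_storage_model)
                          : sizeof(scan_storage_model)),
              "union is as large as its largest member");
```

In a kernel, such a union would be declared with __shared__, and a synchronization barrier is needed between uses of different members because they overlap in memory.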
|
inline |
Tags head_flags that indicate discontinuities between items partitioned across the thread block, where the first item has no reference and is always flagged.
| ItemsPerThread | - [inferred] the number of items to be processed by each thread. |
| Flag | - [inferred] the flag type. |
| FlagOp | - [inferred] type of binary function used for flagging. |
| [out] | head_flags | - array that contains the head flags. |
| [in] | input | - array that data is loaded from. |
| [in] | flag_op | - binary operation function object that will be used for flagging. The signature of the function should be equivalent to one of the following: bool f(const T &a, const T &b); or bool f(const T &a, const T &b, unsigned int b_index);. The signature does not need to have const &, but the function object must not modify the objects passed to it. |
| [in] | storage | - reference to a temporary storage object of type storage_type. |
A synchronization barrier should be placed before storage is reused or repurposed: __syncthreads() or rocprim::syncthreads().
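The three-argument form of flag_op can be sketched with a sequential host-side model. The helper model_flag_heads_indexed and the functor differs_or_every_fourth are hypothetical names, and the sketch assumes that b_index is the position of item b in the block-wide ordering; it is not the rocPRIM implementation.

```cpp
#include <cstddef>
#include <vector>

// Sequential model of flag_heads with an index-aware flag operator
// (illustrative only). `input` lists the block's items in block-wide order.
template <class T, class FlagOp>
std::vector<bool> model_flag_heads_indexed(const std::vector<T>& input, FlagOp flag_op)
{
    std::vector<bool> head_flags(input.size());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        if (i == 0)
            head_flags[i] = true; // no predecessor: always flagged
        else
            head_flags[i] = flag_op(input[i - 1], input[i], static_cast<unsigned int>(i));
    }
    return head_flags;
}

// Hypothetical operator: flags a change in value, and additionally forces a
// flag at every fourth position. It only reads its arguments.
struct differs_or_every_fourth
{
    bool operator()(const int& a, const int& b, unsigned int b_index) const
    {
        return a != b || b_index % 4 == 0;
    }
};
```

An operator of this kind can be useful when segments must be cut at fixed positions in addition to value changes.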
|
inline |
This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself.
Note that this does NOT decrease the shared memory requirements of a kernel using this function.
|
inline |
Tags head_flags that indicate discontinuities between items partitioned across the thread block, where the first item of the first thread is compared against a tile_predecessor_item.
| ItemsPerThread | - [inferred] the number of items to be processed by each thread. |
| Flag | - [inferred] the flag type. |
| FlagOp | - [inferred] type of binary function used for flagging. |
| [out] | head_flags | - array that contains the head flags. |
| [in] | tile_predecessor_item | - item that the first item of the first thread is compared against. |
| [in] | input | - array that data is loaded from. |
| [in] | flag_op | - binary operation function object that will be used for flagging. The signature of the function should be equivalent to one of the following: bool f(const T &a, const T &b); or bool f(const T &a, const T &b, unsigned int b_index);. The signature does not need to have const &, but the function object must not modify the objects passed to it. |
| [in] | storage | - reference to a temporary storage object of type storage_type. |
A synchronization barrier should be placed before storage is reused or repurposed: __syncthreads() or rocprim::syncthreads().
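The effect of tile_predecessor_item can be seen in a sequential host-side model. The helper model_flag_heads_with_predecessor is a hypothetical name used for illustration only; it treats the block's items as one ordered sequence and is not the rocPRIM implementation.

```cpp
#include <cstddef>
#include <vector>

// Sequential model of flag_heads with a tile predecessor (illustrative only).
// The first item is compared against tile_predecessor_item instead of being
// flagged unconditionally.
template <class T, class FlagOp>
std::vector<bool> model_flag_heads_with_predecessor(T tile_predecessor_item,
                                                    const std::vector<T>& input,
                                                    FlagOp flag_op)
{
    std::vector<bool> head_flags(input.size());
    T previous = tile_predecessor_item;
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        head_flags[i] = flag_op(previous, input[i]);
        previous      = input[i];
    }
    return head_flags;
}
```

If the predecessor equals the first item under a "not equal" operator, the first item is not flagged, which lets a run of equal values continue across tile boundaries.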
|
inline |
This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself.
Note that this does NOT decrease the shared memory requirements of a kernel using this function.
|
inline |
Tags both head_flags and tail_flags that indicate discontinuities between items partitioned across the thread block.
| ItemsPerThread | - [inferred] the number of items to be processed by each thread. |
| Flag | - [inferred] the flag type. |
| FlagOp | - [inferred] type of binary function used for flagging. |
| [out] | head_flags | - array that contains the head flags. |
| [out] | tail_flags | - array that contains the tail flags. |
| [in] | input | - array that data is loaded from. |
| [in] | flag_op | - binary operation function object that will be used for flagging. The signature of the function should be equivalent to one of the following: bool f(const T &a, const T &b); or bool f(const T &a, const T &b, unsigned int b_index);. The signature does not need to have const &, but the function object must not modify the objects passed to it. |
| [in] | storage | - reference to a temporary storage object of type storage_type. |
A synchronization barrier should be placed before storage is reused or repurposed: __syncthreads() or rocprim::syncthreads().
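A sequential host-side model shows how head and tail flags relate to each other. The names head_tail_flags_model and model_flag_heads_and_tails are hypothetical and the sketch is not the rocPRIM implementation; it assumes the two-argument form of flag_op.

```cpp
#include <cstddef>
#include <vector>

// Result of the model: one head flag and one tail flag per item.
struct head_tail_flags_model
{
    std::vector<bool> head;
    std::vector<bool> tail;
};

// Sequential model of flag_heads_and_tails without predecessor or successor
// items (illustrative only). The first item is always a head and the last
// item is always a tail.
template <class T, class FlagOp>
head_tail_flags_model model_flag_heads_and_tails(const std::vector<T>& input, FlagOp flag_op)
{
    const std::size_t     n = input.size();
    head_tail_flags_model flags{std::vector<bool>(n), std::vector<bool>(n)};
    for (std::size_t i = 0; i < n; ++i)
    {
        flags.head[i] = (i == 0) ? true : flag_op(input[i - 1], input[i]);
        flags.tail[i] = (i + 1 == n) ? true : flag_op(input[i], input[i + 1]);
    }
    return flags;
}
```

An item flagged as both head and tail forms a run of length one.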
|
inline |
This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself.
Note that this does NOT decrease the shared memory requirements of a kernel using this function.
|
inline |
Tags both head_flags and tail_flags that indicate discontinuities between items partitioned across the thread block, where the last item of the last thread is compared against a tile_successor_item.
| ItemsPerThread | - [inferred] the number of items to be processed by each thread. |
| Flag | - [inferred] the flag type. |
| FlagOp | - [inferred] type of binary function used for flagging. |
| [out] | head_flags | - array that contains the head flags. |
| [out] | tail_flags | - array that contains the tail flags. |
| [in] | tile_successor_item | - item that the last item of the last thread is compared against. |
| [in] | input | - array that data is loaded from. |
| [in] | flag_op | - binary operation function object that will be used for flagging. The signature of the function should be equivalent to one of the following: bool f(const T &a, const T &b); or bool f(const T &a, const T &b, unsigned int b_index);. The signature does not need to have const &, but the function object must not modify the objects passed to it. |
| [in] | storage | - reference to a temporary storage object of type storage_type. |
A synchronization barrier should be placed before storage is reused or repurposed: __syncthreads() or rocprim::syncthreads().
|
inline |
This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself.
Note that this does NOT decrease the shared memory requirements of a kernel using this function.
|
inline |
Tags both head_flags and tail_flags that indicate discontinuities between items partitioned across the thread block, where the first item of the first thread is compared against a tile_predecessor_item.
| ItemsPerThread | - [inferred] the number of items to be processed by each thread. |
| Flag | - [inferred] the flag type. |
| FlagOp | - [inferred] type of binary function used for flagging. |
| [out] | head_flags | - array that contains the head flags. |
| [in] | tile_predecessor_item | - item that the first item of the first thread is compared against. |
| [out] | tail_flags | - array that contains the tail flags. |
| [in] | input | - array that data is loaded from. |
| [in] | flag_op | - binary operation function object that will be used for flagging. The signature of the function should be equivalent to one of the following: bool f(const T &a, const T &b); or bool f(const T &a, const T &b, unsigned int b_index);. The signature does not need to have const &, but the function object must not modify the objects passed to it. |
| [in] | storage | - reference to a temporary storage object of type storage_type. |
A synchronization barrier should be placed before storage is reused or repurposed: __syncthreads() or rocprim::syncthreads().
|
inline |
This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself.
Note that this does NOT decrease the shared memory requirements of a kernel using this function.
|
inline |
Tags both head_flags and tail_flags that indicate discontinuities between items partitioned across the thread block, where the first item of the first thread is compared against a tile_predecessor_item and the last item of the last thread is compared against a tile_successor_item.
| ItemsPerThread | - [inferred] the number of items to be processed by each thread. |
| Flag | - [inferred] the flag type. |
| FlagOp | - [inferred] type of binary function used for flagging. |
| [out] | head_flags | - array that contains the head flags. |
| [in] | tile_predecessor_item | - item that the first item of the first thread is compared against. |
| [out] | tail_flags | - array that contains the tail flags. |
| [in] | tile_successor_item | - item that the last item of the last thread is compared against. |
| [in] | input | - array that data is loaded from. |
| [in] | flag_op | - binary operation function object that will be used for flagging. The signature of the function should be equivalent to one of the following: bool f(const T &a, const T &b); or bool f(const T &a, const T &b, unsigned int b_index);. The signature does not need to have const &, but the function object must not modify the objects passed to it. |
| [in] | storage | - reference to a temporary storage object of type storage_type. |
A synchronization barrier should be placed before storage is reused or repurposed: __syncthreads() or rocprim::syncthreads().
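Predecessor and successor items are typically what makes it possible to process a long sequence tile by tile. The sequential host-side sketch below models one tile; tile_flags_model and model_flag_tile are hypothetical names, the parameter order is chosen for the sketch and does not mirror the member function, and the code is not the rocPRIM implementation.

```cpp
#include <cstddef>
#include <vector>

// Head and tail flags of one tile.
struct tile_flags_model
{
    std::vector<bool> head;
    std::vector<bool> tail;
};

// Sequential model of flag_heads_and_tails with both a tile predecessor and
// a tile successor (illustrative only). The boundary items are compared
// against the neighbouring tiles' items instead of being flagged
// unconditionally.
template <class T, class FlagOp>
tile_flags_model model_flag_tile(const T&              tile_predecessor_item,
                                 const T&              tile_successor_item,
                                 const std::vector<T>& input,
                                 FlagOp                flag_op)
{
    const std::size_t n = input.size();
    tile_flags_model  flags{std::vector<bool>(n), std::vector<bool>(n)};
    for (std::size_t i = 0; i < n; ++i)
    {
        const T& prev = (i == 0) ? tile_predecessor_item : input[i - 1];
        const T& next = (i + 1 == n) ? tile_successor_item : input[i + 1];
        flags.head[i] = flag_op(prev, input[i]);
        flags.tail[i] = flag_op(input[i], next);
    }
    return flags;
}
```

For example, the sequence 1 1 2 2 2 3 3 4 split into the tiles {1, 1, 2, 2} and {2, 3, 3, 4} keeps the run of 2s intact when the first tile is given the successor 2 and the second tile the predecessor 2.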
|
inline |
This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself.
Note that this does NOT decrease the shared memory requirements of a kernel using this function.
|
inline |
Tags tail_flags that indicate discontinuities between items partitioned across the thread block, where the last item has no reference and is always flagged.
| ItemsPerThread | - [inferred] the number of items to be processed by each thread. |
| Flag | - [inferred] the flag type. |
| FlagOp | - [inferred] type of binary function used for flagging. |
| [out] | tail_flags | - array that contains the tail flags. |
| [in] | input | - array that data is loaded from. |
| [in] | flag_op | - binary operation function object that will be used for flagging. The signature of the function should be equivalent to one of the following: bool f(const T &a, const T &b); or bool f(const T &a, const T &b, unsigned int b_index);. The signature does not need to have const &, but the function object must not modify the objects passed to it. |
| [in] | storage | - reference to a temporary storage object of type storage_type. |
A synchronization barrier should be placed before storage is reused or repurposed: __syncthreads() or rocprim::syncthreads().
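Tail flagging mirrors head flagging: each item is compared against the item that follows it. The sequential host-side helper model_flag_tails below is a hypothetical name used for illustration; it assumes the two-argument form of flag_op and is not the rocPRIM implementation.

```cpp
#include <cstddef>
#include <vector>

// Sequential model of flag_tails (illustrative only). `input` lists the
// block's items in block-wide order.
template <class T, class FlagOp>
std::vector<bool> model_flag_tails(const std::vector<T>& input, FlagOp flag_op)
{
    const std::size_t n = input.size();
    std::vector<bool> tail_flags(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        // The last item has no successor, so it is always flagged.
        tail_flags[i] = (i + 1 == n) ? true : flag_op(input[i], input[i + 1]);
    }
    return tail_flags;
}
```

With a "not equal" operator, a tail flag marks the last item of every run of equal values.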
|
inline |
This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself.
Note that this does NOT decrease the shared memory requirements of a kernel using this function.
|
inline |
Tags tail_flags that indicate discontinuities between items partitioned across the thread block, where the last item of the last thread is compared against a tile_successor_item.
| ItemsPerThread | - [inferred] the number of items to be processed by each thread. |
| Flag | - [inferred] the flag type. |
| FlagOp | - [inferred] type of binary function used for flagging. |
| [out] | tail_flags | - array that contains the tail flags. |
| [in] | tile_successor_item | - item that the last item of the last thread is compared against. |
| [in] | input | - array that data is loaded from. |
| [in] | flag_op | - binary operation function object that will be used for flagging. The signature of the function should be equivalent to one of the following: bool f(const T &a, const T &b); or bool f(const T &a, const T &b, unsigned int b_index);. The signature does not need to have const &, but the function object must not modify the objects passed to it. |
| [in] | storage | - reference to a temporary storage object of type storage_type. |
A synchronization barrier should be placed before storage is reused or repurposed: __syncthreads() or rocprim::syncthreads().
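The role of tile_successor_item can be sketched with a sequential host-side model. The helper model_flag_tails_with_successor is a hypothetical name used for illustration only and is not the rocPRIM implementation.

```cpp
#include <cstddef>
#include <vector>

// Sequential model of flag_tails with a tile successor (illustrative only).
// The last item is compared against tile_successor_item instead of being
// flagged unconditionally.
template <class T, class FlagOp>
std::vector<bool> model_flag_tails_with_successor(T tile_successor_item,
                                                  const std::vector<T>& input,
                                                  FlagOp flag_op)
{
    const std::size_t n = input.size();
    std::vector<bool> tail_flags(n);
    for (std::size_t i = 0; i < n; ++i)
    {
        const T& next = (i + 1 == n) ? tile_successor_item : input[i + 1];
        tail_flags[i] = flag_op(input[i], next);
    }
    return tail_flags;
}
```

If the successor equals the last item under a "not equal" operator, the last item is not flagged, so a run that continues into the next tile is not closed early.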
|
inline |
This is an overloaded member function, provided for convenience. It differs from the above function only in the argument(s) it accepts. This overload does not take a reference to temporary storage; instead, the storage is declared inside the function itself.
Note that this does NOT decrease the shared memory requirements of a kernel using this function.