WarpReduceSmem provides shared-memory (smem) based variants of parallel reduction of items partitioned across a CUDA thread warp.
Template parameter PTX_ARCH: the PTX compute capability for which to specialize this collective.
Definition at line 55 of file warp_reduce_smem.cuh.
Data Structures | |
struct | _TempStorage |
Shared memory storage layout type (1.5 warps-worth of elements for each warp) | |
struct | TempStorage |
Public Types | |
enum | { IS_ARCH_WARP = (LOGICAL_WARP_THREADS == CUB_WARP_THREADS(PTX_ARCH)) , IS_POW_OF_TWO = PowerOfTwo<LOGICAL_WARP_THREADS>::VALUE , STEPS = Log2<LOGICAL_WARP_THREADS>::VALUE , HALF_WARP_THREADS = 1 << (STEPS - 1) , WARP_SMEM_ELEMENTS = LOGICAL_WARP_THREADS + HALF_WARP_THREADS , UNSET = 0x0 , SET = 0x1 , SEEN = 0x2 } |
typedef unsigned char | SmemFlag |
Shared memory flag type. | |
Public Member Functions | |
__device__ __forceinline__ | WarpReduceSmem (TempStorage &temp_storage) |
Constructor. | |
template<bool ALL_LANES_VALID, typename ReductionOp , int STEP> | |
__device__ __forceinline__ T | ReduceStep (T input, int valid_items, ReductionOp reduction_op, Int2Type< STEP >) |
template<bool ALL_LANES_VALID, typename ReductionOp > | |
__device__ __forceinline__ T | ReduceStep (T input, int valid_items, ReductionOp, Int2Type< STEPS >) |
template<bool HEAD_SEGMENTED, typename FlagT , typename ReductionOp > | |
__device__ __forceinline__ T | SegmentedReduce (T input, FlagT flag, ReductionOp reduction_op, Int2Type< true >) |
template<bool HEAD_SEGMENTED, typename FlagT , typename ReductionOp > | |
__device__ __forceinline__ T | SegmentedReduce (T input, FlagT flag, ReductionOp reduction_op, Int2Type< false >) |
template<bool ALL_LANES_VALID, typename ReductionOp > | |
__device__ __forceinline__ T | Reduce (T input, int valid_items, ReductionOp reduction_op) |
template<bool HEAD_SEGMENTED, typename FlagT , typename ReductionOp > | |
__device__ __forceinline__ T | SegmentedReduce (T input, FlagT flag, ReductionOp reduction_op) |
Data Fields | |
_TempStorage & | temp_storage |
unsigned int | lane_id |
unsigned int | member_mask |
typedef unsigned char cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::SmemFlag |
Shared memory flag type.
Definition at line 85 of file warp_reduce_smem.cuh.
Definition at line 61 of file warp_reduce_smem.cuh.
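The enumerators listed under Public Types can be checked by hand for a concrete warp size. The sketch below recomputes them on the host for a 32-thread logical warp. `Log2Floor` and `WarpSmemLayout` are hypothetical names introduced for this illustration (a stand-in for `Log2<LOGICAL_WARP_THREADS>::VALUE`, assumed here to be used only with power-of-two sizes); they are not part of CUB.

```cpp
// Hypothetical constexpr stand-in for Log2<N>::VALUE.
// Assumption: only called with power-of-two n, where floor and ceil agree.
constexpr int Log2Floor(int n) { return n <= 1 ? 0 : 1 + Log2Floor(n / 2); }

// Hypothetical helper that mirrors the enum arithmetic shown in Public Types.
template <int LOGICAL_WARP_THREADS>
struct WarpSmemLayout {
    // Number of tree-reduction steps: log2 of the logical warp size.
    static constexpr int STEPS = Log2Floor(LOGICAL_WARP_THREADS);
    // Half a logical warp: the largest peer offset used by the last step.
    static constexpr int HALF_WARP_THREADS = 1 << (STEPS - 1);
    // 1.5 warps' worth of elements, as described for _TempStorage.
    static constexpr int WARP_SMEM_ELEMENTS = LOGICAL_WARP_THREADS + HALF_WARP_THREADS;
};

// For a 32-thread logical warp: 5 steps, 16-thread half warp, 48 elements.
static_assert(WarpSmemLayout<32>::STEPS == 5, "log2(32) == 5");
static_assert(WarpSmemLayout<32>::HALF_WARP_THREADS == 16, "1 << 4 == 16");
static_assert(WarpSmemLayout<32>::WARP_SMEM_ELEMENTS == 48, "32 + 16 == 48");
```

The extra half warp of storage is consistent with the largest peer offset being `HALF_WARP_THREADS`: a lane may read up to that many slots past its own index.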
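__device__ __forceinline__ cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::WarpReduceSmem (TempStorage &temp_storage) [inline]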
Constructor.
Definition at line 112 of file warp_reduce_smem.cuh.
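template<bool ALL_LANES_VALID, typename ReductionOp >
__device__ __forceinline__ T cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::Reduce (T input, int valid_items, ReductionOp reduction_op) [inline]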
Reduction
[in] | input | Calling thread's input |
[in] | valid_items | Total number of valid items across the logical warp |
[in] | reduction_op | Reduction operator |
Definition at line 343 of file warp_reduce_smem.cuh.
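template<bool ALL_LANES_VALID, typename ReductionOp , int STEP>
__device__ __forceinline__ T cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::ReduceStep (T input, int valid_items, ReductionOp reduction_op, Int2Type< STEP >) [inline]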
Reduction step
[in] | input | Calling thread's input |
[in] | valid_items | Total number of valid items across the logical warp |
[in] | reduction_op | Reduction operator |
Definition at line 141 of file warp_reduce_smem.cuh.
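To make the role of `valid_items` and the per-step peer offset concrete, here is a minimal host-side sketch of a shared-memory tree reduction for one logical warp. It is a sequential simulation written for this page, not the code in warp_reduce_smem.cuh: `SimulateWarpReduce` is a hypothetical name, the `std::vector` stands in for the per-warp shared buffer, and the assumption is that each step pairs a lane with the lane `OFFSET = 1 << STEP` above it and skips peers at or beyond `valid_items`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side simulation of a shared-memory warp reduction.
// Assumptions: lanes is non-empty and 0 < valid_items <= lanes.size().
template <typename T, typename ReductionOp>
T SimulateWarpReduce(std::vector<T> lanes, int valid_items, ReductionOp reduction_op) {
    const int warp_threads = static_cast<int>(lanes.size());
    assert(warp_threads > 0 && valid_items > 0 && valid_items <= warp_threads);

    std::vector<T> smem(warp_threads);  // stands in for the shared-memory buffer

    // Each iteration plays the role of one reduction step with the given offset.
    for (int offset = 1; offset < warp_threads; offset <<= 1) {
        // Phase 1: every lane publishes its current partial result.
        for (int lane = 0; lane < warp_threads; ++lane) smem[lane] = lanes[lane];

        // Phase 2: a lane folds in its peer only if the peer holds a valid item.
        for (int lane = 0; lane < warp_threads; ++lane) {
            if (lane + offset < valid_items) {
                lanes[lane] = reduction_op(lanes[lane], smem[lane + offset]);
            }
        }
    }
    return lanes[0];  // only lane 0 is guaranteed to hold the full aggregate
}
```

For example, with lane inputs 1 through 8 and a sum operator, `valid_items = 8` yields 36 in lane 0, while `valid_items = 5` yields 15 because lanes 5 through 7 are never read as peers. The two loops per step correspond to the write and read phases that a device implementation would separate with a warp-level synchronization.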
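template<bool ALL_LANES_VALID, typename ReductionOp >
__device__ __forceinline__ T cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::ReduceStep (T input, int valid_items, ReductionOp, Int2Type< STEPS >) [inline]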
Reduction step (terminate)
[in] | input | Calling thread's input |
[in] | valid_items | Total number of valid items across the logical warp |
Definition at line 173 of file warp_reduce_smem.cuh.
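template<bool HEAD_SEGMENTED, typename FlagT , typename ReductionOp >
__device__ __forceinline__ T cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::SegmentedReduce (T input, FlagT flag, ReductionOp reduction_op) [inline]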
Segmented reduction
[in] | input | Calling thread's input |
[in] | flag | Whether or not the current lane is a segment head/tail |
[in] | reduction_op | Reduction operator |
Definition at line 359 of file warp_reduce_smem.cuh.
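template<bool HEAD_SEGMENTED, typename FlagT , typename ReductionOp >
__device__ __forceinline__ T cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::SegmentedReduce (T input, FlagT flag, ReductionOp reduction_op, Int2Type< false >) [inline]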
Smem-based segmented reduce
[in] | input | Calling thread's input |
[in] | flag | Whether or not the current lane is a segment head/tail |
[in] | reduction_op | Reduction operator |
Definition at line 254 of file warp_reduce_smem.cuh.
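template<bool HEAD_SEGMENTED, typename FlagT , typename ReductionOp >
__device__ __forceinline__ T cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::SegmentedReduce (T input, FlagT flag, ReductionOp reduction_op, Int2Type< true >) [inline]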
Ballot-based segmented reduce
[in] | input | Calling thread's input |
[in] | flag | Whether or not the current lane is a segment head/tail |
[in] | reduction_op | Reduction operator |
Definition at line 195 of file warp_reduce_smem.cuh.
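The idea behind a ballot-style, head-segmented reduction can be sketched on the host as follows. This is an illustrative simulation under stated assumptions, not the library's implementation: `SimulateHeadSegmentedReduce` is a hypothetical name, the 32-bit mask emulates what a warp ballot of the `flag` arguments would return, and each lane is assumed to stop combining at the next flagged lane above it (or at the end of the warp).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side simulation of a head-segmented warp reduction.
// Assumptions: at most 32 lanes; head_flags[i] is true when lane i starts a segment.
template <typename T, typename ReductionOp>
std::vector<T> SimulateHeadSegmentedReduce(std::vector<T> lanes,
                                           const std::vector<bool>& head_flags,
                                           ReductionOp reduction_op) {
    const int warp_threads = static_cast<int>(lanes.size());
    assert(warp_threads <= 32 && head_flags.size() == lanes.size());

    // Emulate a warp ballot: bit i is set when lane i is a segment head.
    std::uint32_t warp_flags = 0;
    for (int lane = 0; lane < warp_threads; ++lane)
        if (head_flags[lane]) warp_flags |= (std::uint32_t{1} << lane);

    // For each lane, locate the next head strictly above it (default: warp end).
    std::vector<int> next_flag(warp_threads, warp_threads);
    for (int lane = 0; lane < warp_threads; ++lane)
        for (int peer = lane + 1; peer < warp_threads; ++peer)
            if (warp_flags & (std::uint32_t{1} << peer)) { next_flag[lane] = peer; break; }

    // Tree reduction in which a lane never reads a peer from the next segment.
    std::vector<T> smem(warp_threads);
    for (int offset = 1; offset < warp_threads; offset <<= 1) {
        for (int lane = 0; lane < warp_threads; ++lane) smem[lane] = lanes[lane];
        for (int lane = 0; lane < warp_threads; ++lane)
            if (lane + offset < next_flag[lane])
                lanes[lane] = reduction_op(lanes[lane], smem[lane + offset]);
    }
    return lanes;  // entries at segment heads hold their segment's aggregate
}
```

With inputs 1 through 8 and heads at lanes 0, 3, and 6, the head lanes end up holding 6, 15, and 15 under a sum operator. Values left in non-head lanes are partial results and should not be relied on.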
unsigned int cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::lane_id |
Definition at line 103 of file warp_reduce_smem.cuh.
unsigned int cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::member_mask |
Definition at line 104 of file warp_reduce_smem.cuh.
_TempStorage& cub::WarpReduceSmem< T, LOGICAL_WARP_THREADS, PTX_ARCH >::temp_storage |
Definition at line 102 of file warp_reduce_smem.cuh.