OffsetT: signed integer type for global offsets.
Utility class for dispatching the appropriately-tuned kernels for DeviceScan
Definition at line 149 of file dispatch_scan.cuh.
Data Structures | | |
| struct | KernelConfig | |
| struct | Policy100 | SM10 |
| struct | Policy130 | SM13 |
| struct | Policy200 | SM20 |
| struct | Policy300 | SM30 |
| struct | Policy350 | SM35 |
| struct | Policy520 | SM520 |
| struct | Policy600 | SM600 |
| struct | PtxAgentScanPolicy | |
Public Types | |
| enum | { INIT_KERNEL_THREADS = 128 } |
| typedef If<(Equals< typename std::iterator_traits< OutputIteratorT >::value_type, void >::VALUE), typename std::iterator_traits< InputIteratorT >::value_type, typename std::iterator_traits< OutputIteratorT >::value_type >::Type | OutputT |
| typedef ScanTileState< OutputT > | ScanTileStateT |
| typedef Policy100 | PtxPolicy |
Static Public Member Functions | |
| template<typename KernelConfig > | |
| CUB_RUNTIME_FUNCTION static __forceinline__ void | InitConfigs (int ptx_version, KernelConfig &scan_kernel_config) |
| template<typename ScanInitKernelPtrT , typename ScanSweepKernelPtrT > | |
| CUB_RUNTIME_FUNCTION static __forceinline__ cudaError_t | Dispatch (void *d_temp_storage, size_t &temp_storage_bytes, InputIteratorT d_in, OutputIteratorT d_out, ScanOpT scan_op, InitValueT init_value, OffsetT num_items, cudaStream_t stream, bool debug_synchronous, int, ScanInitKernelPtrT init_kernel, ScanSweepKernelPtrT scan_kernel, KernelConfig scan_kernel_config) |
| Internal dispatch routine for computing a device-wide prefix scan using the specified kernel functions. | |
| CUB_RUNTIME_FUNCTION static __forceinline__ cudaError_t | Dispatch (void *d_temp_storage, size_t &temp_storage_bytes, InputIteratorT d_in, OutputIteratorT d_out, ScanOpT scan_op, InitValueT init_value, OffsetT num_items, cudaStream_t stream, bool debug_synchronous) |
| typedef If<(Equals< typename std::iterator_traits< OutputIteratorT >::value_type, void >::VALUE), typename std::iterator_traits< InputIteratorT >::value_type, typename std::iterator_traits< OutputIteratorT >::value_type >::Type cub::DispatchScan< InputIteratorT, OutputIteratorT, ScanOpT, InitValueT, OffsetT >::OutputT |
Definition at line 163 of file dispatch_scan.cuh.
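The OutputT typedef above chooses the output iterator's value type unless that type is void, in which case it falls back to the input iterator's value type. The following is a minimal host-only sketch of that selection rule using standard-library traits in place of CUB's If and Equals. The name OutputTypeSketch is hypothetical and is not part of CUB.

```cpp
#include <iterator>
#include <type_traits>
#include <vector>

// Hypothetical helper (not a CUB type): reproduces the selection rule of the
// OutputT typedef with std::conditional and std::is_void.
template <typename InputIteratorT, typename OutputIteratorT>
struct OutputTypeSketch
{
    using InputValueT  = typename std::iterator_traits<InputIteratorT>::value_type;
    using OutputValueT = typename std::iterator_traits<OutputIteratorT>::value_type;

    // Use the input value type only when the output iterator reports void.
    using Type = typename std::conditional<
        std::is_void<OutputValueT>::value, InputValueT, OutputValueT>::type;
};

// A raw pointer reports its pointee type, so the output type is selected.
static_assert(
    std::is_same<OutputTypeSketch<const int*, double*>::Type, double>::value,
    "output value type is used when it is not void");

// std::back_insert_iterator reports value_type == void, so the input type is selected.
static_assert(
    std::is_same<OutputTypeSketch<const int*,
                                  std::back_insert_iterator<std::vector<long> > >::Type,
                 int>::value,
    "input value type is used when the output value type is void");
```

The fallback matters for write-only output iterators, which commonly report void as their value type and therefore cannot name the type of the running scan value on their own.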
| typedef Policy100 cub::DispatchScan< InputIteratorT, OutputIteratorT, ScanOpT, InitValueT, OffsetT >::PtxPolicy |
Definition at line 286 of file dispatch_scan.cuh.
| typedef ScanTileState<OutputT> cub::DispatchScan< InputIteratorT, OutputIteratorT, ScanOpT, InitValueT, OffsetT >::ScanTileStateT |
Definition at line 166 of file dispatch_scan.cuh.
| anonymous enum |
Definition at line 155 of file dispatch_scan.cuh.
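| CUB_RUNTIME_FUNCTION static __forceinline__ cudaError_t cub::DispatchScan< InputIteratorT, OutputIteratorT, ScanOpT, InitValueT, OffsetT >::Dispatch (void *d_temp_storage, size_t &temp_storage_bytes, InputIteratorT d_in, OutputIteratorT d_out, ScanOpT scan_op, InitValueT init_value, OffsetT num_items, cudaStream_t stream, bool debug_synchronous) | inline static |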
Internal dispatch routine
| [in] | d_temp_storage | Device-accessible allocation of temporary storage. When NULL, the required allocation size is written to temp_storage_bytes and no work is done. |
| [in,out] | temp_storage_bytes | Reference to size in bytes of d_temp_storage allocation |
| [in] | d_in | Pointer to the input sequence of data items |
| [out] | d_out | Pointer to the output sequence of data items |
| [in] | scan_op | Binary scan functor |
| [in] | init_value | Initial value to seed the exclusive scan |
| [in] | num_items | Total number of input items (i.e., the length of d_in) |
| [in] | stream | [optional] CUDA stream to launch kernels within. Default is stream 0. |
| [in] | debug_synchronous | [optional] Whether or not to synchronize the stream after every kernel launch to check for errors. Also causes launch configurations to be printed to the console. Default is false. |
Definition at line 514 of file dispatch_scan.cuh.
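The parameter table above describes a two-phase calling convention: a first call with d_temp_storage set to NULL only reports the required size in temp_storage_bytes, and a second call with a real allocation does the work. The sketch below illustrates that convention, along with exclusive-scan semantics for scan_op and init_value, on the host with a sequential loop. ExclusiveScanSketch and RunExclusiveSumSketch are hypothetical names, the int return code stands in for cudaError_t, and the scratch size of one element is an assumption made for illustration. None of this is CUB's implementation.

```cpp
#include <cstddef>
#include <functional>
#include <new>
#include <vector>

// Hypothetical host-only stand-in (not a CUB function). Returns 0 on success.
template <typename T, typename ScanOpT>
int ExclusiveScanSketch(void* d_temp_storage, std::size_t& temp_storage_bytes,
                        const T* d_in, T* d_out, ScanOpT scan_op,
                        T init_value, int num_items)
{
    // Assumed scratch requirement for this sketch: one running total.
    const std::size_t required_bytes = sizeof(T);

    // Phase 1: a null allocation means "report the size and do no work".
    if (d_temp_storage == nullptr) {
        temp_storage_bytes = required_bytes;
        return 0;
    }
    if (temp_storage_bytes < required_bytes) {
        return 1;  // allocation too small
    }

    // Phase 2: exclusive scan, so out[i] combines init_value with in[0..i-1].
    T* running = new (d_temp_storage) T(init_value);
    for (int i = 0; i < num_items; ++i) {
        const T item = d_in[i];  // read first so d_in == d_out is safe
        d_out[i] = *running;
        *running = scan_op(*running, item);
    }
    return 0;
}

// Typical caller: query the size, allocate, then run.
inline std::vector<int> RunExclusiveSumSketch(const std::vector<int>& in)
{
    std::vector<int> out(in.size());
    const int num_items = static_cast<int>(in.size());

    std::size_t temp_storage_bytes = 0;
    ExclusiveScanSketch(nullptr, temp_storage_bytes, in.data(), out.data(),
                        std::plus<int>(), 0, num_items);

    std::vector<unsigned char> temp_storage(temp_storage_bytes);
    ExclusiveScanSketch(temp_storage.data(), temp_storage_bytes, in.data(),
                        out.data(), std::plus<int>(), 0, num_items);
    return out;
}
```

With this sketch, an input of 1, 2, 3, 4 and an init_value of 0 yields 0, 1, 3, 6. Splitting the call in two lets the caller own the allocation and reuse it across invocations.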
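| template<typename ScanInitKernelPtrT , typename ScanSweepKernelPtrT > CUB_RUNTIME_FUNCTION static __forceinline__ cudaError_t cub::DispatchScan< InputIteratorT, OutputIteratorT, ScanOpT, InitValueT, OffsetT >::Dispatch (void *d_temp_storage, size_t &temp_storage_bytes, InputIteratorT d_in, OutputIteratorT d_out, ScanOpT scan_op, InitValueT init_value, OffsetT num_items, cudaStream_t stream, bool debug_synchronous, int, ScanInitKernelPtrT init_kernel, ScanSweepKernelPtrT scan_kernel, KernelConfig scan_kernel_config) | inline static |
ScanSweepKernelPtrT: function type of cub::DeviceScanKernelPtrT.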
Internal dispatch routine for computing a device-wide prefix scan using the specified kernel functions.
| [in] | d_temp_storage | Device-accessible allocation of temporary storage. When NULL, the required allocation size is written to temp_storage_bytes and no work is done. |
| [in,out] | temp_storage_bytes | Reference to size in bytes of d_temp_storage allocation |
| [in] | d_in | Pointer to the input sequence of data items |
| [out] | d_out | Pointer to the output sequence of data items |
| [in] | scan_op | Binary scan functor |
| [in] | init_value | Initial value to seed the exclusive scan |
| [in] | num_items | Total number of input items (i.e., the length of d_in) |
| [in] | stream | CUDA stream to launch kernels within. Default is stream 0. |
| [in] | debug_synchronous | Whether or not to synchronize the stream after every kernel launch to check for errors. Also causes launch configurations to be printed to the console. Default is false. |
| [in] | init_kernel | Kernel function pointer to parameterization of cub::DeviceScanInitKernel |
| [in] | scan_kernel | Kernel function pointer to parameterization of cub::DeviceScanKernel |
| [in] | scan_kernel_config | Dispatch parameters that match the policy that scan_kernel was compiled for |
Definition at line 381 of file dispatch_scan.cuh.
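To make the role of scan_kernel_config and INIT_KERNEL_THREADS more concrete, the sketch below shows one plausible way launch sizes could be derived from the dispatch parameters. It assumes that the configuration carries a block size and a per-thread item count, that each thread block of the scan kernel processes one tile, and that the init kernel uses one thread per tile. The struct and field names are hypothetical, and the arithmetic is an illustration under those assumptions, not a transcription of dispatch_scan.cuh.

```cpp
// Hypothetical mirror of the dispatch parameters; field names are assumptions.
struct ScanLaunchConfigSketch
{
    int block_threads;     // threads per block of the scan kernel
    int items_per_thread;  // items each thread processes
};

// Derived launch quantities for this sketch.
struct LaunchShapeSketch
{
    int tile_size;       // items covered by one thread block
    int num_tiles;       // tiles needed to cover the input
    int init_grid_size;  // blocks for the init kernel (one thread per tile)
};

// Value taken from the anonymous enum in the Public Types table.
enum { INIT_KERNEL_THREADS_SKETCH = 128 };

// Integer division that rounds up; assumes d > 0 and n >= 0.
inline long long DivideRoundUpSketch(long long n, long long d)
{
    return (n + d - 1) / d;
}

inline LaunchShapeSketch ComputeLaunchShapeSketch(long long num_items,
                                                  ScanLaunchConfigSketch config)
{
    LaunchShapeSketch shape;
    shape.tile_size = config.block_threads * config.items_per_thread;
    shape.num_tiles =
        static_cast<int>(DivideRoundUpSketch(num_items, shape.tile_size));
    shape.init_grid_size = static_cast<int>(
        DivideRoundUpSketch(shape.num_tiles, INIT_KERNEL_THREADS_SKETCH));
    return shape;
}
```

For example, 1,000,000 items with 128 threads and 12 items per thread give a tile size of 1536, 652 tiles, and an init grid of 6 blocks. The configuration has to match the policy the kernel was compiled for because the tile size fixes how many items each block reads.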
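| template<typename KernelConfig > CUB_RUNTIME_FUNCTION static __forceinline__ void cub::DispatchScan< InputIteratorT, OutputIteratorT, ScanOpT, InitValueT, OffsetT >::InitConfigs (int ptx_version, KernelConfig &scan_kernel_config) | inline static |
Initialize kernel dispatch configurations with the policies corresponding to the PTX assembly we will use.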
Definition at line 303 of file dispatch_scan.cuh.
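The idea of matching a kernel configuration to the PTX version can be sketched as a descending chain of version checks over the policies listed under Data Structures. This is a host-only illustration: the Sketch types are hypothetical, the tuning numbers are placeholders invented for the example and are not CUB's values, and the ordering of the checks is an assumption.

```cpp
// Hypothetical configuration record; field names are assumptions.
struct ScanConfigSketch
{
    int block_threads;
    int items_per_thread;
};

// Placeholder policies, one per architecture in the Data Structures table.
// The numbers below are made up for illustration and are NOT CUB's tunings.
struct Policy100Sketch { enum { BLOCK_THREADS = 64,  ITEMS_PER_THREAD = 4  }; };
struct Policy130Sketch { enum { BLOCK_THREADS = 96,  ITEMS_PER_THREAD = 6  }; };
struct Policy200Sketch { enum { BLOCK_THREADS = 128, ITEMS_PER_THREAD = 8  }; };
struct Policy300Sketch { enum { BLOCK_THREADS = 128, ITEMS_PER_THREAD = 9  }; };
struct Policy350Sketch { enum { BLOCK_THREADS = 128, ITEMS_PER_THREAD = 10 }; };
struct Policy520Sketch { enum { BLOCK_THREADS = 128, ITEMS_PER_THREAD = 11 }; };
struct Policy600Sketch { enum { BLOCK_THREADS = 128, ITEMS_PER_THREAD = 12 }; };

// Copy a policy's compile-time constants into the runtime configuration.
template <typename PolicyT>
void InitFromPolicySketch(ScanConfigSketch& config)
{
    config.block_threads    = PolicyT::BLOCK_THREADS;
    config.items_per_thread = PolicyT::ITEMS_PER_THREAD;
}

// Pick the newest policy whose version does not exceed ptx_version
// (for example, 350 for SM35); fall back to the oldest policy otherwise.
inline void InitConfigsSketch(int ptx_version, ScanConfigSketch& config)
{
    if      (ptx_version >= 600) InitFromPolicySketch<Policy600Sketch>(config);
    else if (ptx_version >= 520) InitFromPolicySketch<Policy520Sketch>(config);
    else if (ptx_version >= 350) InitFromPolicySketch<Policy350Sketch>(config);
    else if (ptx_version >= 300) InitFromPolicySketch<Policy300Sketch>(config);
    else if (ptx_version >= 200) InitFromPolicySketch<Policy200Sketch>(config);
    else if (ptx_version >= 130) InitFromPolicySketch<Policy130Sketch>(config);
    else                         InitFromPolicySketch<Policy100Sketch>(config);
}
```

Under this scheme a device whose PTX version falls between two policies, such as 610, reuses the nearest older tuning, here the 600 policy.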