DeviceSpmv provides device-wide parallel operations for performing sparse-matrix * dense-vector multiplication (SpMV).
Definition at line 70 of file device_spmv.cuh.
Static Public Member Functions
CSR matrix operations
template<typename ValueT>
static CUB_RUNTIME_FUNCTION cudaError_t CsrMV (void *d_temp_storage, size_t &temp_storage_bytes, ValueT *d_values, int *d_row_offsets, int *d_column_indices, ValueT *d_vector_x, ValueT *d_vector_y, int num_rows, int num_cols, int num_nonzeros, cudaStream_t stream=0, bool debug_synchronous=false)
This function performs the matrix-vector operation y = A*x.
ValueT | [inferred] Matrix and vector value type (e.g., float, double, etc.) |
[in] | d_temp_storage | Device-accessible allocation of temporary storage. When NULL, the required allocation size is written to temp_storage_bytes and no work is done. |
[in,out] | temp_storage_bytes | Reference to the size in bytes of the d_temp_storage allocation. |
[in] | d_values | Pointer to the array of num_nonzeros values of the corresponding nonzero elements of matrix A. |
[in] | d_row_offsets | Pointer to the array of num_rows + 1 offsets demarcating the start of every row in d_column_indices and d_values (with the final entry being equal to num_nonzeros). |
[in] | d_column_indices | Pointer to the array of num_nonzeros column indices of the corresponding nonzero elements of matrix A. (Indices are zero-based.) |
[in] | d_vector_x | Pointer to the array of num_cols values corresponding to the dense input vector x |
[out] | d_vector_y | Pointer to the array of num_rows values corresponding to the dense output vector y |
[in] | num_rows | Number of rows of matrix A. |
[in] | num_cols | Number of columns of matrix A. |
[in] | num_nonzeros | Number of nonzero elements of matrix A. |
[in] | stream | [optional] CUDA stream to launch kernels within. Default is stream 0. |
[in] | debug_synchronous | [optional] Whether or not to synchronize the stream after every kernel launch to check for errors. May cause significant slowdown. Default is false. |
Definition at line 132 of file device_spmv.cuh.
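
To illustrate the CSR layout that the parameters above describe, here is a minimal host-side sketch of the same y = A*x computation. It is not CUB's implementation: CsrMV runs on the device and takes device pointers, whereas this loop runs on the CPU over std::vector. The names csr_mv_reference and csr_mv_example are hypothetical, and the 3x4 matrix is made up for the example. The argument names mirror those of CsrMV without the d_ prefix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference for y = A*x with A in CSR form.
// This is an illustrative sketch, not part of CUB.
template <typename ValueT>
std::vector<ValueT> csr_mv_reference(const std::vector<ValueT>& values,
                                     const std::vector<int>& row_offsets,
                                     const std::vector<int>& column_indices,
                                     const std::vector<ValueT>& vector_x,
                                     int num_rows, int num_cols, int num_nonzeros)
{
    // Array sizes as stated in the parameter descriptions.
    assert(row_offsets.size() == static_cast<std::size_t>(num_rows) + 1);
    assert(row_offsets[num_rows] == num_nonzeros);
    assert(values.size() == static_cast<std::size_t>(num_nonzeros));
    assert(column_indices.size() == static_cast<std::size_t>(num_nonzeros));
    assert(vector_x.size() == static_cast<std::size_t>(num_cols));

    std::vector<ValueT> vector_y(static_cast<std::size_t>(num_rows), ValueT(0));
    for (int row = 0; row < num_rows; ++row) {
        ValueT sum = ValueT(0);
        // Nonzeros of this row occupy [row_offsets[row], row_offsets[row + 1]).
        for (int k = row_offsets[row]; k < row_offsets[row + 1]; ++k) {
            sum += values[k] * vector_x[column_indices[k]];
        }
        vector_y[row] = sum;
    }
    return vector_y;
}

// Made-up 3x4 example matrix (row 1 is empty):
//   [ 1 0 2 0 ]
//   [ 0 0 0 0 ]
//   [ 0 3 0 4 ]
std::vector<float> csr_mv_example()
{
    std::vector<float> values         = {1.0f, 2.0f, 3.0f, 4.0f};
    std::vector<int>   row_offsets    = {0, 2, 2, 4};   // num_rows + 1 entries
    std::vector<int>   column_indices = {0, 2, 1, 3};   // zero-based columns
    std::vector<float> vector_x       = {1.0f, 1.0f, 1.0f, 1.0f};
    return csr_mv_reference(values, row_offsets, column_indices, vector_x,
                            /*num_rows=*/3, /*num_cols=*/4, /*num_nonzeros=*/4);
}
```

An empty row appears as two equal consecutive entries in row_offsets (here 2, 2), and the final entry equals num_nonzeros. A reference of this kind can be useful for checking device results on small inputs.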