A simple caching allocator for device memory allocations.
The allocator behaves as follows:

- Allocations from the allocator are associated with an active_stream. Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
- Bin limits progress geometrically in accordance with the growth factor bin_growth provided during construction. Unused device allocations within a larger bin cache are not reused for allocation requests that categorize to smaller bin sizes.
- Allocation requests below (bin_growth ^ min_bin) are rounded up to (bin_growth ^ min_bin).
- Allocations above (bin_growth ^ max_bin) are not rounded up to the nearest bin and are simply freed when they are deallocated instead of being returned to a bin-cache.
- If the total storage of cached allocations on a given device will exceed max_cached_bytes, allocations for that device are simply freed when they are deallocated instead of being returned to their bin-cache.

For example, the default-constructed CachingDeviceAllocator is configured with:

- bin_growth = 8
- min_bin = 3
- max_bin = 7
- max_cached_bytes = 6MB - 1B

Definition at line 101 of file util_allocator.cuh.
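The rounding rules above can be sketched on the host without any CUDA calls. The helper below is hypothetical (the name `BinnedSize` is not part of CUB) and only illustrates how a request size might map to a bin under the documented rules; it assumes a finite `max_bin` and a `bin_growth` of at least 2.

```cpp
#include <cstddef>

// Hypothetical helper, not part of CUB: returns the number of bytes that
// would be reserved for a request under the binning rules described above.
// Assumes bin_growth >= 2 and a finite max_bin.
std::size_t BinnedSize(std::size_t bytes,
                       unsigned int bin_growth,
                       unsigned int min_bin,
                       unsigned int max_bin)
{
    // Start at the smallest bin, bin_growth ^ min_bin.
    std::size_t bin_bytes = 1;
    for (unsigned int i = 0; i < min_bin; ++i) bin_bytes *= bin_growth;

    // Walk up the geometric progression until the request fits.
    for (unsigned int bin = min_bin; bin <= max_bin; ++bin) {
        if (bytes <= bin_bytes) return bin_bytes;
        bin_bytes *= bin_growth;
    }

    // Larger than the largest bin: the request is not rounded up
    // (and, per the rules above, would not be cached when freed).
    return bytes;
}
```

With the default configuration (8, 3, 7), a 1-byte request lands in the 512B bin and a 513-byte request in the 4KB bin, while a request one byte above 2MB is left unrounded.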
Data Structures | |
| struct | BlockDescriptor |
| class | TotalBytes |
Public Types | |
| typedef bool(* | Compare) (const BlockDescriptor &, const BlockDescriptor &) |
| BlockDescriptor comparator function interface. | |
| typedef std::multiset< BlockDescriptor, Compare > | CachedBlocks |
| Set type for cached blocks (ordered by size) | |
| typedef std::multiset< BlockDescriptor, Compare > | BusyBlocks |
| Set type for live blocks (ordered by ptr) | |
| typedef std::map< int, TotalBytes > | GpuCachedBytes |
| Map type of device ordinals to the number of bytes cached by each device. | |
Public Member Functions | |
| void | NearestPowerOf (unsigned int &power, size_t &rounded_bytes, unsigned int base, size_t value) |
| CachingDeviceAllocator (unsigned int bin_growth, unsigned int min_bin=1, unsigned int max_bin=INVALID_BIN, size_t max_cached_bytes=INVALID_SIZE, bool skip_cleanup=false, bool debug=false) | |
| Constructor. | |
| CachingDeviceAllocator (bool skip_cleanup=false, bool debug=false) | |
| Default constructor. | |
| cudaError_t | SetMaxCachedBytes (size_t max_cached_bytes) |
| Sets the limit on the number of bytes this allocator is allowed to cache per device. | |
| cudaError_t | DeviceAllocate (int device, void **d_ptr, size_t bytes, cudaStream_t active_stream=0) |
| Provides a suitable allocation of device memory for the given size on the specified device. | |
| cudaError_t | DeviceAllocate (void **d_ptr, size_t bytes, cudaStream_t active_stream=0) |
| Provides a suitable allocation of device memory for the given size on the current device. | |
| cudaError_t | DeviceFree (int device, void *d_ptr) |
| Frees a live allocation of device memory on the specified device, returning it to the allocator. | |
| cudaError_t | DeviceFree (void *d_ptr) |
| Frees a live allocation of device memory on the current device, returning it to the allocator. | |
| cudaError_t | FreeAllCached () |
| Frees all cached device allocations on all devices. | |
| virtual | ~CachingDeviceAllocator () |
| Destructor. | |
Static Public Member Functions | |
| static unsigned int | IntPow (unsigned int base, unsigned int exp) |
Data Fields | |
| cub::Mutex | mutex |
| Mutex for thread-safety. | |
| unsigned int | bin_growth |
| Geometric growth factor for bin-sizes. | |
| unsigned int | min_bin |
| Minimum bin enumeration. | |
| unsigned int | max_bin |
| Maximum bin enumeration. | |
| size_t | min_bin_bytes |
| Minimum bin size. | |
| size_t | max_bin_bytes |
| Maximum bin size. | |
| size_t | max_cached_bytes |
| Maximum aggregate cached bytes per device. | |
| const bool | skip_cleanup |
| Whether or not to skip a call to FreeAllCached() when destructor is called. (The CUDA runtime may have already shut down for statically declared allocators) | |
| bool | debug |
| Whether or not to print (de)allocation events to stdout. | |
| GpuCachedBytes | cached_bytes |
| Map of device ordinal to aggregate cached bytes on that device. | |
| CachedBlocks | cached_blocks |
| Set of cached device allocations available for reuse. | |
| BusyBlocks | live_blocks |
| Set of live device allocations currently in use. | |
Static Public Attributes | |
| static const unsigned int | INVALID_BIN = (unsigned int) -1 |
| Out-of-bounds bin. | |
| static const size_t | INVALID_SIZE = (size_t) -1 |
| Invalid size. | |
| static const int | INVALID_DEVICE_ORDINAL = -1 |
| Invalid device ordinal. | |
| typedef std::multiset<BlockDescriptor, Compare> cub::CachingDeviceAllocator::BusyBlocks |
Set type for live blocks (ordered by ptr)
Definition at line 188 of file util_allocator.cuh.
| typedef std::multiset<BlockDescriptor, Compare> cub::CachingDeviceAllocator::CachedBlocks |
Set type for cached blocks (ordered by size)
Definition at line 185 of file util_allocator.cuh.
| typedef bool(* cub::CachingDeviceAllocator::Compare) (const BlockDescriptor &, const BlockDescriptor &) |
BlockDescriptor comparator function interface.
Definition at line 175 of file util_allocator.cuh.
| typedef std::map<int, TotalBytes> cub::CachingDeviceAllocator::GpuCachedBytes |
Map type of device ordinals to the number of bytes cached by each device.
Definition at line 191 of file util_allocator.cuh.
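| cub::CachingDeviceAllocator::CachingDeviceAllocator (unsigned int bin_growth, unsigned int min_bin=1, unsigned int max_bin=INVALID_BIN, size_t max_cached_bytes=INVALID_SIZE, bool skip_cleanup=false, bool debug=false) | inline |
Constructor.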
| bin_growth | Geometric growth factor for bin-sizes |
| min_bin | Minimum bin (default is bin_growth ^ 1) |
| max_bin | Maximum bin (default is no max bin) |
| max_cached_bytes | Maximum aggregate cached bytes per device (default is no limit) |
| skip_cleanup | Whether or not to skip a call to FreeAllCached() when the destructor is called (default is to deallocate) |
| debug | Whether or not to print (de)allocation events to stdout (default is no output) |
Definition at line 276 of file util_allocator.cuh.
| cub::CachingDeviceAllocator::CachingDeviceAllocator (bool skip_cleanup=false, bool debug=false) | inline |
Default constructor.
Configured with:
- bin_growth = 8
- min_bin = 3
- max_bin = 7
- max_cached_bytes = ((bin_growth ^ max_bin) * 3) - 1 = 6,291,455 bytes

which delineates five bin-sizes: 512B, 4KB, 32KB, 256KB, and 2MB and sets a maximum of 6,291,455 cached bytes per device.
Definition at line 310 of file util_allocator.cuh.
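The arithmetic behind the default configuration can be checked with a small host-side sketch. The helper names `BinSizes` and `DefaultMaxCachedBytes` are hypothetical and are not part of CUB; they simply evaluate the expressions quoted above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of CUB: lists the bin sizes
// bin_growth ^ min_bin ... bin_growth ^ max_bin in ascending order.
std::vector<std::size_t> BinSizes(unsigned int bin_growth,
                                  unsigned int min_bin,
                                  unsigned int max_bin)
{
    std::vector<std::size_t> sizes;
    std::size_t bytes = 1;  // bin_growth ^ 0
    for (unsigned int bin = 0; bin <= max_bin; ++bin) {
        if (bin >= min_bin) sizes.push_back(bytes);
        bytes *= bin_growth;
    }
    return sizes;
}

// Evaluates the documented default ceiling: ((bin_growth ^ max_bin) * 3) - 1.
std::size_t DefaultMaxCachedBytes(unsigned int bin_growth, unsigned int max_bin)
{
    std::size_t largest_bin = 1;
    for (unsigned int i = 0; i < max_bin; ++i) largest_bin *= bin_growth;
    return largest_bin * 3 - 1;
}
```

For (8, 3, 7) this gives the five sizes 512, 4096, 32768, 262144, and 2097152 bytes, and a ceiling of 6,291,455 bytes, which is one byte short of three 2MB blocks.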
| virtual cub::CachingDeviceAllocator::~CachingDeviceAllocator () | inline virtual |
Destructor.
Definition at line 694 of file util_allocator.cuh.
| cudaError_t cub::CachingDeviceAllocator::DeviceAllocate (int device, void **d_ptr, size_t bytes, cudaStream_t active_stream=0) | inline |
Provides a suitable allocation of device memory for the given size on the specified device.
Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
| [in] | device | Device on which to place the allocation |
| [out] | d_ptr | Reference to pointer to the allocation |
| [in] | bytes | Minimum number of bytes for the allocation |
| [in] | active_stream | The stream to be associated with this allocation |
Definition at line 357 of file util_allocator.cuh.
| cudaError_t cub::CachingDeviceAllocator::DeviceAllocate (void **d_ptr, size_t bytes, cudaStream_t active_stream=0) | inline |
Provides a suitable allocation of device memory for the given size on the current device.
Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
| [out] | d_ptr | Reference to pointer to the allocation |
| [in] | bytes | Minimum number of bytes for the allocation |
| [in] | active_stream | The stream to be associated with this allocation |
Definition at line 530 of file util_allocator.cuh.
| cudaError_t cub::CachingDeviceAllocator::DeviceFree (int device, void *d_ptr) | inline |
Frees a live allocation of device memory on the specified device, returning it to the allocator.
Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
Definition at line 546 of file util_allocator.cuh.
| cudaError_t cub::CachingDeviceAllocator::DeviceFree (void *d_ptr) | inline |
Frees a live allocation of device memory on the current device, returning it to the allocator.
Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
Definition at line 630 of file util_allocator.cuh.
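The stream-reuse rule stated for freed allocations can be expressed as a simple predicate. The sketch below is a host-only model with hypothetical names (`CachedBlockModel`, `IsReusable`); it uses a plain integer in place of a `cudaStream_t` and a boolean in place of any real completion query, and it does not reproduce how the allocator actually tracks completion.

```cpp
// Hypothetical model of a cached block; CUB's BlockDescriptor is not
// reproduced here.
struct CachedBlockModel {
    int  associated_stream;  // stand-in for the stream given at allocation
    bool prior_work_done;    // true once all work submitted to that stream
                             // before the free has completed
};

// Returns true when the freed block may be handed out to a request
// arriving on requesting_stream, per the rule described above.
bool IsReusable(const CachedBlockModel& block, int requesting_stream)
{
    // Same stream: available immediately after the free.
    if (block.associated_stream == requesting_stream) return true;

    // Other streams: only once prior work on the associated stream is done.
    return block.prior_work_done;
}
```

The asymmetry reflects stream ordering: work later submitted to the same stream cannot overtake earlier work that still uses the block, whereas another stream offers no such guarantee.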
| cudaError_t cub::CachingDeviceAllocator::FreeAllCached () | inline |
Frees all cached device allocations on all devices.
Definition at line 640 of file util_allocator.cuh.
| static unsigned int cub::CachingDeviceAllocator::IntPow (unsigned int base, unsigned int exp) | inline static |
Integer pow function for unsigned base and exponent.
Definition at line 201 of file util_allocator.cuh.
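| void cub::CachingDeviceAllocator::NearestPowerOf (unsigned int &power, size_t &rounded_bytes, unsigned int base, size_t value) | inline |
Round value up to the nearest power of base.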
Definition at line 221 of file util_allocator.cuh.
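As an illustration of what the two helpers compute, here is a host-only sketch that follows the documented signatures. The `Sketch` suffix marks these as stand-ins, not the library's implementation; the use of exponentiation by squaring and the assumption that `base` is at least 2 are choices made for this example.

```cpp
#include <cstddef>

// Integer power via exponentiation by squaring (illustrative only).
unsigned int IntPowSketch(unsigned int base, unsigned int exp)
{
    unsigned int result = 1;
    while (exp > 0) {
        if (exp & 1u) result *= base;  // fold in the current bit's factor
        base *= base;                  // square for the next bit
        exp >>= 1;
    }
    return result;
}

// Finds the smallest power such that base ^ power >= value, returning the
// exponent in power and base ^ power in rounded_bytes. Assumes base >= 2.
void NearestPowerOfSketch(unsigned int& power,
                          std::size_t& rounded_bytes,
                          unsigned int base,
                          std::size_t value)
{
    power = 0;
    rounded_bytes = 1;
    while (rounded_bytes < value) {
        rounded_bytes *= base;
        ++power;
    }
}
```

For base 8, a value of 1000 rounds up to 4096 with exponent 4, while a value of exactly 512 stays at 512 with exponent 3.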
| cudaError_t cub::CachingDeviceAllocator::SetMaxCachedBytes (size_t max_cached_bytes) | inline |
Sets the limit on the number of bytes this allocator is allowed to cache per device.
Changing the ceiling of cached bytes does not cause any allocations (in-use or cached-in-reserve) to be freed. See FreeAllCached().
Definition at line 333 of file util_allocator.cuh.
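The effect of the ceiling on a deallocation can be sketched as a predicate: a freed block goes back to its bin-cache only if doing so keeps the per-device cached total within `max_cached_bytes`. The helper name `ShouldCacheOnFree` and its parameters are hypothetical, and the comparison is an assumption drawn from the documented behavior rather than a copy of the library's code.

```cpp
#include <cstddef>

// Hypothetical helper, not part of CUB: decides whether a freed block is
// returned to its bin-cache (true) or released to the device (false).
bool ShouldCacheOnFree(std::size_t cached_bytes_on_device,
                       std::size_t block_bytes,
                       std::size_t max_cached_bytes,
                       bool fits_a_bin)
{
    // Blocks larger than the largest bin are never cached.
    if (!fits_a_bin) return false;

    // Already at or above the ceiling (e.g. after the ceiling was lowered).
    if (cached_bytes_on_device > max_cached_bytes) return false;

    // Cache only if the new total would not exceed the ceiling.
    // Written as a subtraction to avoid overflow in the sum.
    return block_bytes <= max_cached_bytes - cached_bytes_on_device;
}
```

Under the default ceiling of 6,291,455 bytes, two 2MB blocks can sit in the cache on one device, but a third would push the total to 6,291,456 bytes and is freed instead.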
| unsigned int cub::CachingDeviceAllocator::bin_growth |
Geometric growth factor for bin-sizes.
Definition at line 252 of file util_allocator.cuh.
| CachedBlocks cub::CachingDeviceAllocator::cached_blocks |
Set of cached device allocations available for reuse.
Definition at line 264 of file util_allocator.cuh.
| GpuCachedBytes cub::CachingDeviceAllocator::cached_bytes |
Map of device ordinal to aggregate cached bytes on that device.
Definition at line 263 of file util_allocator.cuh.
| bool cub::CachingDeviceAllocator::debug |
Whether or not to print (de)allocation events to stdout.
Definition at line 261 of file util_allocator.cuh.
| const unsigned int cub::CachingDeviceAllocator::INVALID_BIN = (unsigned int) -1 | static |
Out-of-bounds bin.
Definition at line 109 of file util_allocator.cuh.
| const int cub::CachingDeviceAllocator::INVALID_DEVICE_ORDINAL = -1 | static |
Invalid device ordinal.
Definition at line 117 of file util_allocator.cuh.
| const size_t cub::CachingDeviceAllocator::INVALID_SIZE = (size_t) -1 | static |
Invalid size.
Definition at line 112 of file util_allocator.cuh.
| BusyBlocks cub::CachingDeviceAllocator::live_blocks |
Set of live device allocations currently in use.
Definition at line 265 of file util_allocator.cuh.
| unsigned int cub::CachingDeviceAllocator::max_bin |
Maximum bin enumeration.
Definition at line 254 of file util_allocator.cuh.
| size_t cub::CachingDeviceAllocator::max_bin_bytes |
Maximum bin size.
Definition at line 257 of file util_allocator.cuh.
| size_t cub::CachingDeviceAllocator::max_cached_bytes |
Maximum aggregate cached bytes per device.
Definition at line 258 of file util_allocator.cuh.
| unsigned int cub::CachingDeviceAllocator::min_bin |
Minimum bin enumeration.
Definition at line 253 of file util_allocator.cuh.
| size_t cub::CachingDeviceAllocator::min_bin_bytes |
Minimum bin size.
Definition at line 256 of file util_allocator.cuh.
| cub::Mutex cub::CachingDeviceAllocator::mutex |
Mutex for thread-safety.
Definition at line 250 of file util_allocator.cuh.
| const bool cub::CachingDeviceAllocator::skip_cleanup |
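Whether or not to skip a call to FreeAllCached() when destructor is called. (The CUDA runtime may have already shut down for statically declared allocators)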
Definition at line 260 of file util_allocator.cuh.