A simple caching allocator for device memory allocations.
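Allocations from the allocator are associated with an active_stream. Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
Allocations are categorized and cached by bin size. A new allocation request of a given size only considers cached allocations within the corresponding bin.
Bin limits progress geometrically in accordance with the growth factor bin_growth provided during construction. Unused device allocations within a larger bin cache are not reused for allocation requests that categorize to smaller bin sizes.
Allocation requests below (bin_growth ^ min_bin) are rounded up to (bin_growth ^ min_bin).
Allocations above (bin_growth ^ max_bin) are not rounded up to the nearest bin and are simply freed when they are deallocated instead of being returned to a bin-cache.
If the total storage of cached allocations on a given device would exceed max_cached_bytes, allocations for that device are simply freed when they are deallocated instead of being returned to their bin-cache.
For example, the default-constructed CachingDeviceAllocator is configured with:
bin_growth = 8
min_bin = 3
max_bin = 7
max_cached_bytes = 6MB - 1B
Definition at line 101 of file util_allocator.cuh.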
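The binning rules above can be sketched on the host without any CUDA calls. The following is a minimal illustration, not the library's implementation: `BinChoice` and `ChooseBin` are hypothetical names introduced here, `^` in the text is read as exponentiation, and the sketch assumes `bin_growth >= 2` and sizes small enough that the multiplications do not overflow.

```cpp
#include <cstddef>

// Hypothetical helper type (not part of CUB): the outcome of binning a request.
struct BinChoice {
    bool         cacheable;  // false when the request exceeds the largest bin
    unsigned int bin;        // exponent of the chosen bin (meaningful only if cacheable)
    std::size_t  bin_bytes;  // bytes that would be reserved for the request
};

// Rounds `bytes` up to bin_growth^k for the smallest k in [min_bin, max_bin].
// Requests larger than bin_growth^max_bin keep their exact size and are
// reported as not cacheable.
inline BinChoice ChooseBin(std::size_t bytes, unsigned int bin_growth,
                           unsigned int min_bin, unsigned int max_bin)
{
    std::size_t  bin_bytes = 1;
    unsigned int bin = 0;

    // Start at the smallest bin: bin_growth^min_bin.
    for (; bin < min_bin; ++bin) bin_bytes *= bin_growth;

    // Grow geometrically until the request fits or the bins run out.
    while (bin_bytes < bytes && bin < max_bin) {
        bin_bytes *= bin_growth;
        ++bin;
    }

    if (bin_bytes < bytes) {
        BinChoice oversized = { false, 0, bytes };
        return oversized;
    }
    BinChoice chosen = { true, bin, bin_bytes };
    return chosen;
}
```

With the default configuration (8, 3, 7), a 513-byte request lands in bin 4 (4KB), while a request one byte larger than 2MB falls outside every bin.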
Data Structures | |
struct | BlockDescriptor |
class | TotalBytes |
Public Types | |
typedef bool(* | Compare) (const BlockDescriptor &, const BlockDescriptor &) |
BlockDescriptor comparator function interface. | |
typedef std::multiset< BlockDescriptor, Compare > | CachedBlocks |
Set type for cached blocks (ordered by size) | |
typedef std::multiset< BlockDescriptor, Compare > | BusyBlocks |
Set type for live blocks (ordered by ptr) | |
typedef std::map< int, TotalBytes > | GpuCachedBytes |
Map type of device ordinals to the number of bytes cached by each device. | |
Public Member Functions | |
void | NearestPowerOf (unsigned int &power, size_t &rounded_bytes, unsigned int base, size_t value) |
CachingDeviceAllocator (unsigned int bin_growth, unsigned int min_bin=1, unsigned int max_bin=INVALID_BIN, size_t max_cached_bytes=INVALID_SIZE, bool skip_cleanup=false, bool debug=false) | |
Constructor. | |
CachingDeviceAllocator (bool skip_cleanup=false, bool debug=false) | |
Default constructor. | |
cudaError_t | SetMaxCachedBytes (size_t max_cached_bytes) |
Sets the limit on the number of bytes this allocator is allowed to cache per device. | |
cudaError_t | DeviceAllocate (int device, void **d_ptr, size_t bytes, cudaStream_t active_stream=0) |
Provides a suitable allocation of device memory for the given size on the specified device. | |
cudaError_t | DeviceAllocate (void **d_ptr, size_t bytes, cudaStream_t active_stream=0) |
Provides a suitable allocation of device memory for the given size on the current device. | |
cudaError_t | DeviceFree (int device, void *d_ptr) |
Frees a live allocation of device memory on the specified device, returning it to the allocator. | |
cudaError_t | DeviceFree (void *d_ptr) |
Frees a live allocation of device memory on the current device, returning it to the allocator. | |
cudaError_t | FreeAllCached () |
Frees all cached device allocations on all devices. | |
virtual | ~CachingDeviceAllocator () |
Destructor. | |
Static Public Member Functions | |
static unsigned int | IntPow (unsigned int base, unsigned int exp) |
Data Fields | |
cub::Mutex | mutex |
Mutex for thread-safety. | |
unsigned int | bin_growth |
Geometric growth factor for bin-sizes. | |
unsigned int | min_bin |
Minimum bin enumeration. | |
unsigned int | max_bin |
Maximum bin enumeration. | |
size_t | min_bin_bytes |
Minimum bin size. | |
size_t | max_bin_bytes |
Maximum bin size. | |
size_t | max_cached_bytes |
Maximum aggregate cached bytes per device. | |
const bool | skip_cleanup |
Whether or not to skip a call to FreeAllCached() when destructor is called. (The CUDA runtime may have already shut down for statically declared allocators) | |
bool | debug |
Whether or not to print (de)allocation events to stdout. | |
GpuCachedBytes | cached_bytes |
Map of device ordinal to aggregate cached bytes on that device. | |
CachedBlocks | cached_blocks |
Set of cached device allocations available for reuse. | |
BusyBlocks | live_blocks |
Set of live device allocations currently in use. | |
Static Public Attributes | |
static const unsigned int | INVALID_BIN = (unsigned int) -1 |
Out-of-bounds bin. | |
static const size_t | INVALID_SIZE = (size_t) -1 |
Invalid size. | |
static const int | INVALID_DEVICE_ORDINAL = -1 |
Invalid device ordinal. | |
typedef std::multiset<BlockDescriptor, Compare> cub::CachingDeviceAllocator::BusyBlocks |
Set type for live blocks (ordered by ptr)
Definition at line 188 of file util_allocator.cuh.
typedef std::multiset<BlockDescriptor, Compare> cub::CachingDeviceAllocator::CachedBlocks |
Set type for cached blocks (ordered by size)
Definition at line 185 of file util_allocator.cuh.
typedef bool(* cub::CachingDeviceAllocator::Compare) (const BlockDescriptor &, const BlockDescriptor &) |
BlockDescriptor comparator function interface.
Definition at line 175 of file util_allocator.cuh.
typedef std::map<int, TotalBytes> cub::CachingDeviceAllocator::GpuCachedBytes |
Map type of device ordinals to the number of bytes cached by each device.
Definition at line 191 of file util_allocator.cuh.
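The typedefs above combine a function-pointer comparator with `std::multiset`. The sketch below shows that pattern in isolation. `Block`, `SizeLess`, `MakeDemoCache`, and `CountCachedOfSize` are hypothetical stand-ins introduced for illustration; the real BlockDescriptor carries more fields and the library's own comparators and lookup logic may differ.

```cpp
#include <cstddef>
#include <set>

// Hypothetical stand-in for BlockDescriptor; only the fields needed here.
struct Block {
    void*       ptr;
    std::size_t bytes;
};

// Comparator interface: a plain function pointer, as in the Compare typedef.
typedef bool (*BlockCompare)(const Block&, const Block&);

// Orders blocks by size only (an assumption made for this sketch).
inline bool SizeLess(const Block& a, const Block& b) { return a.bytes < b.bytes; }

// A multiset keyed by a runtime-supplied comparator, allowing equal-size blocks.
typedef std::multiset<Block, BlockCompare> BlocksBySize;

// Builds a small demo cache holding blocks of 512B, 4KB, 4KB, and 32KB.
inline BlocksBySize MakeDemoCache()
{
    BlocksBySize cache(SizeLess);  // the comparator is passed at construction
    const std::size_t sizes[] = { 4096, 512, 32768, 4096 };
    for (std::size_t i = 0; i < 4; ++i) {
        Block b = { nullptr, sizes[i] };
        cache.insert(b);
    }
    return cache;
}

// Counts cached blocks whose size is equivalent to `bytes` under SizeLess.
inline std::size_t CountCachedOfSize(const BlocksBySize& cache, std::size_t bytes)
{
    Block key = { nullptr, bytes };
    return cache.count(key);
}
```

Because the comparator is a function pointer rather than a type, the same multiset type can serve both the cached set (ordered by size) and the live set (ordered by pointer), which is why CachedBlocks and BusyBlocks share one underlying typedef.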
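cub::CachingDeviceAllocator::CachingDeviceAllocator (unsigned int bin_growth, unsigned int min_bin=1, unsigned int max_bin=INVALID_BIN, size_t max_cached_bytes=INVALID_SIZE, bool skip_cleanup=false, bool debug=false) | inline |
Constructor.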
bin_growth | Geometric growth factor for bin-sizes |
min_bin | Minimum bin (default is bin_growth ^ 1) |
max_bin | Maximum bin (default is no max bin) |
max_cached_bytes | Maximum aggregate cached bytes per device (default is no limit) |
skip_cleanup | Whether or not to skip a call to FreeAllCached() when the destructor is called (default is to deallocate) |
debug | Whether or not to print (de)allocation events to stdout (default is no output) |
Definition at line 276 of file util_allocator.cuh.
cub::CachingDeviceAllocator::CachingDeviceAllocator (bool skip_cleanup=false, bool debug=false) | inline |
Default constructor.
Configured with:
bin_growth = 8
min_bin = 3
max_bin = 7
max_cached_bytes = ((bin_growth ^ max_bin) * 3) - 1 = 6,291,455 bytes
which delineates five bin-sizes: 512B, 4KB, 32KB, 256KB, and 2MB and sets a maximum of 6,291,455 cached bytes per device.
Definition at line 310 of file util_allocator.cuh.
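The arithmetic behind the default configuration can be checked with a few lines of host code. This is a sketch only: `BinSizes` and `DefaultMaxCachedBytes` are hypothetical helpers, not members of the class, and `^` in the description is taken to mean exponentiation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: lists the bin sizes bin_growth^min_bin .. bin_growth^max_bin.
inline std::vector<std::size_t> BinSizes(unsigned int bin_growth,
                                         unsigned int min_bin,
                                         unsigned int max_bin)
{
    std::vector<std::size_t> sizes;
    std::size_t bytes = 1;  // bin_growth^0
    for (unsigned int bin = 0; bin <= max_bin; ++bin) {
        if (bin >= min_bin) sizes.push_back(bytes);
        bytes *= bin_growth;
    }
    return sizes;
}

// Hypothetical helper: the ceiling stated above, ((bin_growth ^ max_bin) * 3) - 1.
inline std::size_t DefaultMaxCachedBytes(unsigned int bin_growth, unsigned int max_bin)
{
    return BinSizes(bin_growth, max_bin, max_bin).front() * 3 - 1;
}
```

For (8, 3, 7) the helper yields 512, 4096, 32768, 262144, and 2097152 bytes, and the ceiling evaluates to 3 * 2,097,152 - 1 = 6,291,455.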
virtual cub::CachingDeviceAllocator::~CachingDeviceAllocator () | inline virtual |
Destructor.
Definition at line 694 of file util_allocator.cuh.
cudaError_t cub::CachingDeviceAllocator::DeviceAllocate (int device, void **d_ptr, size_t bytes, cudaStream_t active_stream=0) | inline |
Provides a suitable allocation of device memory for the given size on the specified device.
Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
[in] | device | Device on which to place the allocation |
[out] | d_ptr | Reference to pointer to the allocation |
[in] | bytes | Minimum number of bytes for the allocation |
[in] | active_stream | The stream to be associated with this allocation |
Definition at line 357 of file util_allocator.cuh.
cudaError_t cub::CachingDeviceAllocator::DeviceAllocate (void **d_ptr, size_t bytes, cudaStream_t active_stream=0) | inline |
Provides a suitable allocation of device memory for the given size on the current device.
Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
[out] | d_ptr | Reference to pointer to the allocation |
[in] | bytes | Minimum number of bytes for the allocation |
[in] | active_stream | The stream to be associated with this allocation |
Definition at line 530 of file util_allocator.cuh.
cudaError_t cub::CachingDeviceAllocator::DeviceFree (int device, void *d_ptr) | inline |
Frees a live allocation of device memory on the specified device, returning it to the allocator.
Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
Definition at line 546 of file util_allocator.cuh.
cudaError_t cub::CachingDeviceAllocator::DeviceFree (void *d_ptr) | inline |
Frees a live allocation of device memory on the current device, returning it to the allocator.
Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
Definition at line 630 of file util_allocator.cuh.
cudaError_t cub::CachingDeviceAllocator::FreeAllCached () | inline |
Frees all cached device allocations on all devices.
Definition at line 640 of file util_allocator.cuh.
static unsigned int cub::CachingDeviceAllocator::IntPow (unsigned int base, unsigned int exp) | inline static |
Integer pow function for unsigned base and exponent.
Definition at line 201 of file util_allocator.cuh.
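An integer power function for unsigned operands is commonly written with exponentiation by squaring. The sketch below shows that technique under the name `IntPowSketch`, which is a hypothetical stand-in; it is not claimed to match the library's code line for line, and it assumes the result fits in `unsigned int`.

```cpp
// Hypothetical sketch of an unsigned integer power via exponentiation by squaring.
// Overflow wraps modulo 2^N, as usual for unsigned arithmetic.
inline unsigned int IntPowSketch(unsigned int base, unsigned int exp)
{
    unsigned int result = 1;
    while (exp > 0) {
        if (exp & 1u) {
            result *= base;  // fold in the current power of base for a set bit
        }
        base *= base;        // square the base for the next bit
        exp >>= 1;           // move on to the next exponent bit
    }
    return result;
}
```

For the default configuration this gives the smallest and largest bin sizes: 8 to the power 3 is 512 and 8 to the power 7 is 2,097,152.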
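void cub::CachingDeviceAllocator::NearestPowerOf (unsigned int &power, size_t &rounded_bytes, unsigned int base, size_t value) | inline |
Rounds value up to the nearest power of base, writing the exponent to power and the rounded size to rounded_bytes.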
Definition at line 221 of file util_allocator.cuh.
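Rounding a size up to a power of a given base can be sketched as a simple multiply-and-count loop. The version below is an illustration under stated assumptions, not the member function itself: `PowerResult` and `NearestPowerOfSketch` are hypothetical names, the result is returned by value instead of through reference parameters, and `base` is assumed to be at least 2.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical result type: the exponent and the rounded size.
struct PowerResult {
    unsigned int power;
    std::size_t  rounded_bytes;
};

// Finds the smallest base^power that is >= value.
// If the next multiplication would overflow size_t, rounded_bytes saturates
// at the maximum size_t value and power keeps the last exponent reached.
inline PowerResult NearestPowerOfSketch(unsigned int base, std::size_t value)
{
    PowerResult r = { 0, 1 };  // base^0 == 1
    const std::size_t max_size = std::numeric_limits<std::size_t>::max();
    const std::size_t limit = max_size / base;

    while (r.rounded_bytes < value) {
        if (r.rounded_bytes > limit) {
            r.rounded_bytes = max_size;  // saturate instead of wrapping around
            break;
        }
        r.rounded_bytes *= base;
        ++r.power;
    }
    return r;
}
```

With base 8, a 1000-byte request rounds up to 4096 bytes at exponent 4, and an exact 512-byte request stays at 512 bytes with exponent 3.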
cudaError_t cub::CachingDeviceAllocator::SetMaxCachedBytes (size_t max_cached_bytes) | inline |
Sets the limit on the number of bytes this allocator is allowed to cache per device.
Changing the ceiling of cached bytes does not cause any allocations (in-use or cached-in-reserve) to be freed. See FreeAllCached().
Definition at line 333 of file util_allocator.cuh.
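The effect of the ceiling is easiest to see at deallocation time: per the class overview, a freed block is returned to its bin cache only if doing so keeps the device's cached total within max_cached_bytes. The function below is a hypothetical, host-only sketch of that decision (`WouldCacheOnFree` is not a CUB function, and the exact comparison used by the library is an assumption here).

```cpp
#include <cstddef>

// Hypothetical sketch: decides whether a freed block goes back to the cache.
//   cached_bytes     - bytes currently cached for the device
//   max_cached_bytes - ceiling on cached bytes for the device
//   block_bytes      - size of the block being freed
//   in_bin           - false for oversized blocks that belong to no bin
inline bool WouldCacheOnFree(std::size_t cached_bytes, std::size_t max_cached_bytes,
                             std::size_t block_bytes, bool in_bin)
{
    if (!in_bin) return false;                         // oversized blocks are never cached
    if (cached_bytes > max_cached_bytes) return false; // ceiling was lowered below the current total
    // Written as a subtraction so the check cannot overflow.
    return block_bytes <= max_cached_bytes - cached_bytes;
}
```

Under the default ceiling of 6,291,455 bytes, a 2MB block is cached when the cache is empty, but a third 2MB block is not, because 3 * 2,097,152 exceeds the ceiling by one byte. Lowering the ceiling only affects this decision for later frees; blocks that are already cached stay cached until FreeAllCached() is called.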
unsigned int cub::CachingDeviceAllocator::bin_growth |
Geometric growth factor for bin-sizes.
Definition at line 252 of file util_allocator.cuh.
CachedBlocks cub::CachingDeviceAllocator::cached_blocks |
Set of cached device allocations available for reuse.
Definition at line 264 of file util_allocator.cuh.
GpuCachedBytes cub::CachingDeviceAllocator::cached_bytes |
Map of device ordinal to aggregate cached bytes on that device.
Definition at line 263 of file util_allocator.cuh.
bool cub::CachingDeviceAllocator::debug |
Whether or not to print (de)allocation events to stdout.
Definition at line 261 of file util_allocator.cuh.
const unsigned int cub::CachingDeviceAllocator::INVALID_BIN = (unsigned int) -1 | static |
Out-of-bounds bin.
Definition at line 109 of file util_allocator.cuh.
const int cub::CachingDeviceAllocator::INVALID_DEVICE_ORDINAL = -1 | static |
Invalid device ordinal.
Definition at line 117 of file util_allocator.cuh.
const size_t cub::CachingDeviceAllocator::INVALID_SIZE = (size_t) -1 | static |
Invalid size.
Definition at line 112 of file util_allocator.cuh.
BusyBlocks cub::CachingDeviceAllocator::live_blocks |
Set of live device allocations currently in use.
Definition at line 265 of file util_allocator.cuh.
unsigned int cub::CachingDeviceAllocator::max_bin |
Maximum bin enumeration.
Definition at line 254 of file util_allocator.cuh.
size_t cub::CachingDeviceAllocator::max_bin_bytes |
Maximum bin size.
Definition at line 257 of file util_allocator.cuh.
size_t cub::CachingDeviceAllocator::max_cached_bytes |
Maximum aggregate cached bytes per device.
Definition at line 258 of file util_allocator.cuh.
unsigned int cub::CachingDeviceAllocator::min_bin |
Minimum bin enumeration.
Definition at line 253 of file util_allocator.cuh.
size_t cub::CachingDeviceAllocator::min_bin_bytes |
Minimum bin size.
Definition at line 256 of file util_allocator.cuh.
cub::Mutex cub::CachingDeviceAllocator::mutex |
Mutex for thread-safety.
Definition at line 250 of file util_allocator.cuh.
const bool cub::CachingDeviceAllocator::skip_cleanup |
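Whether or not to skip a call to FreeAllCached() when destructor is called. (The CUDA runtime may have already shut down for statically declared allocators)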
Definition at line 260 of file util_allocator.cuh.