OpenFPM_pdata  4.1.0
Project that contains the implementation of distributed structures
 
cub::CachingDeviceAllocator Struct Reference

A simple caching allocator for device memory allocations.

Detailed Description

A simple caching allocator for device memory allocations.

Overview
The allocator is thread-safe and stream-safe and is capable of managing cached device allocations on multiple devices. It behaves as follows:
  • Allocations from the allocator are associated with an active_stream. Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
  • Allocations are categorized and cached by bin size. A new allocation request of a given size will only consider cached allocations within the corresponding bin.
  • Bin limits progress geometrically in accordance with the growth factor bin_growth provided during construction. Unused device allocations within a larger bin cache are not reused for allocation requests that categorize to smaller bin sizes.
  • Allocation requests below (bin_growth ^ min_bin) are rounded up to (bin_growth ^ min_bin).
  • Allocations above (bin_growth ^ max_bin) are not rounded up to the nearest bin and are simply freed when they are deallocated instead of being returned to a bin-cache.
  • If the total storage of cached allocations on a given device will exceed max_cached_bytes, allocations for that device are simply freed when they are deallocated instead of being returned to their bin-cache.
For example, the default-constructed CachingDeviceAllocator is configured with:
  • bin_growth = 8
  • min_bin = 3
  • max_bin = 7
  • max_cached_bytes = 6MB - 1B
which delineates five bin-sizes (512B, 4KB, 32KB, 256KB, and 2MB) and sets a maximum of 6,291,455 cached bytes per device.
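
The arithmetic behind the default configuration can be checked with a minimal host-side sketch. The helper names int_pow, bin_sizes, and default_max_cached_bytes are hypothetical and are not part of CUB; the sketch only reproduces the numbers quoted above and does not touch the device.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of CUB): integer power base^exp.
inline std::size_t int_pow(std::size_t base, unsigned int exp) {
    std::size_t result = 1;
    for (unsigned int i = 0; i < exp; ++i) result *= base;
    return result;
}

// Hypothetical helper: the bin sizes bin_growth^min_bin .. bin_growth^max_bin.
inline std::vector<std::size_t> bin_sizes(unsigned int bin_growth,
                                          unsigned int min_bin,
                                          unsigned int max_bin) {
    std::vector<std::size_t> sizes;
    for (unsigned int bin = min_bin; bin <= max_bin; ++bin)
        sizes.push_back(int_pow(bin_growth, bin));
    return sizes;
}

// Default cache ceiling as documented: ((bin_growth ^ max_bin) * 3) - 1.
inline std::size_t default_max_cached_bytes(unsigned int bin_growth,
                                            unsigned int max_bin) {
    return int_pow(bin_growth, max_bin) * 3 - 1;
}
```

With bin_growth = 8, min_bin = 3, and max_bin = 7, bin_sizes yields 512, 4096, 32768, 262144, and 2097152 bytes, and default_max_cached_bytes yields 6291455.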

Definition at line 101 of file util_allocator.cuh.

Data Structures

struct  BlockDescriptor
 
class  TotalBytes
 

Public Types

typedef bool(* Compare) (const BlockDescriptor &, const BlockDescriptor &)
 BlockDescriptor comparator function interface.
 
typedef std::multiset< BlockDescriptor, Compare > CachedBlocks
 Set type for cached blocks (ordered by size)
 
typedef std::multiset< BlockDescriptor, Compare > BusyBlocks
 Set type for live blocks (ordered by ptr)
 
typedef std::map< int, TotalBytes > GpuCachedBytes
 Map type of device ordinals to the number of bytes cached by each device.
 

Public Member Functions

void NearestPowerOf (unsigned int &power, size_t &rounded_bytes, unsigned int base, size_t value)
 
 CachingDeviceAllocator (unsigned int bin_growth, unsigned int min_bin=1, unsigned int max_bin=INVALID_BIN, size_t max_cached_bytes=INVALID_SIZE, bool skip_cleanup=false, bool debug=false)
 Constructor.
 
 CachingDeviceAllocator (bool skip_cleanup=false, bool debug=false)
 Default constructor.
 
cudaError_t SetMaxCachedBytes (size_t max_cached_bytes)
 Sets the limit on the number of bytes this allocator is allowed to cache per device.
 
cudaError_t DeviceAllocate (int device, void **d_ptr, size_t bytes, cudaStream_t active_stream=0)
 Provides a suitable allocation of device memory for the given size on the specified device.
 
cudaError_t DeviceAllocate (void **d_ptr, size_t bytes, cudaStream_t active_stream=0)
 Provides a suitable allocation of device memory for the given size on the current device.
 
cudaError_t DeviceFree (int device, void *d_ptr)
 Frees a live allocation of device memory on the specified device, returning it to the allocator.
 
cudaError_t DeviceFree (void *d_ptr)
 Frees a live allocation of device memory on the current device, returning it to the allocator.
 
cudaError_t FreeAllCached ()
 Frees all cached device allocations on all devices.
 
virtual ~CachingDeviceAllocator ()
 Destructor.
 

Static Public Member Functions

static unsigned int IntPow (unsigned int base, unsigned int exp)
 

Data Fields

cub::Mutex mutex
 Mutex for thread-safety.
 
unsigned int bin_growth
 Geometric growth factor for bin-sizes.
 
unsigned int min_bin
 Minimum bin enumeration.
 
unsigned int max_bin
 Maximum bin enumeration.
 
size_t min_bin_bytes
 Minimum bin size.
 
size_t max_bin_bytes
 Maximum bin size.
 
size_t max_cached_bytes
 Maximum aggregate cached bytes per device.
 
const bool skip_cleanup
 Whether or not to skip a call to FreeAllCached() when destructor is called. (The CUDA runtime may have already shut down for statically declared allocators)
 
bool debug
 Whether or not to print (de)allocation events to stdout.
 
GpuCachedBytes cached_bytes
 Map of device ordinal to aggregate cached bytes on that device.
 
CachedBlocks cached_blocks
 Set of cached device allocations available for reuse.
 
BusyBlocks live_blocks
 Set of live device allocations currently in use.
 

Static Public Attributes

static const unsigned int INVALID_BIN = (unsigned int) -1
 Out-of-bounds bin.
 
static const size_t INVALID_SIZE = (size_t) -1
 Invalid size.
 
static const int INVALID_DEVICE_ORDINAL = -1
 Invalid device ordinal.
 

Member Typedef Documentation

◆ BusyBlocks

Set type for live blocks (ordered by ptr)

Definition at line 188 of file util_allocator.cuh.

◆ CachedBlocks

Set type for cached blocks (ordered by size)

Definition at line 185 of file util_allocator.cuh.

◆ Compare

typedef bool(* cub::CachingDeviceAllocator::Compare) (const BlockDescriptor &, const BlockDescriptor &)

BlockDescriptor comparator function interface.

Definition at line 175 of file util_allocator.cuh.
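
To illustrate how a function-pointer comparator of this shape can drive a std::multiset, here is a minimal host-side sketch. The Block struct, the size_compare ordering (device first, then size), and has_block_of_size are assumptions made for illustration; the real BlockDescriptor carries more fields and the allocator's actual lookup logic is not reproduced here.

```cpp
#include <cstddef>
#include <set>

// Hypothetical stand-in for BlockDescriptor; the real struct has more fields.
struct Block {
    int device;
    std::size_t bytes;
};

// Same shape as the Compare typedef: a plain function pointer.
typedef bool (*BlockCompare)(const Block &, const Block &);

// Assumed ordering for this sketch: by device ordinal first, then by size.
inline bool size_compare(const Block &a, const Block &b) {
    if (a.device != b.device) return a.device < b.device;
    return a.bytes < b.bytes;
}

// The comparator is passed to the constructor at run time.
typedef std::multiset<Block, BlockCompare> BlockSet;

// True if the set holds a block of exactly `bytes` on `device`.
inline bool has_block_of_size(const BlockSet &cache, int device,
                              std::size_t bytes) {
    Block key = {device, bytes};
    BlockSet::const_iterator it = cache.lower_bound(key);
    return it != cache.end() && it->device == device && it->bytes == bytes;
}
```

A set is created with `BlockSet cache(size_compare);`. Because the comparator is a run-time value rather than a template parameter, one multiset type can serve different orderings, which is presumably why CachedBlocks and BusyBlocks share the same underlying type.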

◆ GpuCachedBytes

Map type of device ordinals to the number of bytes cached by each device.

Definition at line 191 of file util_allocator.cuh.

Constructor & Destructor Documentation

◆ CachingDeviceAllocator() [1/2]

cub::CachingDeviceAllocator::CachingDeviceAllocator ( unsigned int  bin_growth,
unsigned int  min_bin = 1,
unsigned int  max_bin = INVALID_BIN,
size_t  max_cached_bytes = INVALID_SIZE,
bool  skip_cleanup = false,
bool  debug = false 
)
inline

Constructor.

Parameters
bin_growth  Geometric growth factor for bin-sizes
min_bin  Minimum bin (default is bin_growth ^ 1)
max_bin  Maximum bin (default is no max bin)
max_cached_bytes  Maximum aggregate cached bytes per device (default is no limit)
skip_cleanup  Whether or not to skip a call to FreeAllCached() when the destructor is called (default is to deallocate)
debug  Whether or not to print (de)allocation events to stdout (default is no output)

Definition at line 276 of file util_allocator.cuh.

◆ CachingDeviceAllocator() [2/2]

cub::CachingDeviceAllocator::CachingDeviceAllocator ( bool  skip_cleanup = false,
bool  debug = false 
)
inline

Default constructor.

Configured with:

  • bin_growth = 8
  • min_bin = 3
  • max_bin = 7
  • max_cached_bytes = ((bin_growth ^ max_bin) * 3) - 1 = 6,291,455 bytes

which delineates five bin-sizes (512B, 4KB, 32KB, 256KB, and 2MB) and sets a maximum of 6,291,455 cached bytes per device.

Definition at line 310 of file util_allocator.cuh.

◆ ~CachingDeviceAllocator()

virtual cub::CachingDeviceAllocator::~CachingDeviceAllocator ( )
inline virtual

Destructor.

Definition at line 694 of file util_allocator.cuh.

Member Function Documentation

◆ DeviceAllocate() [1/2]

cudaError_t cub::CachingDeviceAllocator::DeviceAllocate ( int  device,
void **  d_ptr,
size_t  bytes,
cudaStream_t  active_stream = 0 
)
inline

Provides a suitable allocation of device memory for the given size on the specified device.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.

Parameters
[in]  device  Device on which to place the allocation
[out]  d_ptr  Reference to pointer to the allocation
[in]  bytes  Minimum number of bytes for the allocation
[in]  active_stream  The stream to be associated with this allocation

Definition at line 357 of file util_allocator.cuh.

◆ DeviceAllocate() [2/2]

cudaError_t cub::CachingDeviceAllocator::DeviceAllocate ( void **  d_ptr,
size_t  bytes,
cudaStream_t  active_stream = 0 
)
inline

Provides a suitable allocation of device memory for the given size on the current device.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.

Parameters
[out]  d_ptr  Reference to pointer to the allocation
[in]  bytes  Minimum number of bytes for the allocation
[in]  active_stream  The stream to be associated with this allocation

Definition at line 530 of file util_allocator.cuh.

◆ DeviceFree() [1/2]

cudaError_t cub::CachingDeviceAllocator::DeviceFree ( int  device,
void *  d_ptr 
)
inline

Frees a live allocation of device memory on the specified device, returning it to the allocator.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.

Definition at line 546 of file util_allocator.cuh.

◆ DeviceFree() [2/2]

cudaError_t cub::CachingDeviceAllocator::DeviceFree ( void *  d_ptr)
inline

Frees a live allocation of device memory on the current device, returning it to the allocator.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.

Definition at line 630 of file util_allocator.cuh.

◆ FreeAllCached()

cudaError_t cub::CachingDeviceAllocator::FreeAllCached ( )
inline

Frees all cached device allocations on all devices.

Definition at line 640 of file util_allocator.cuh.

◆ IntPow()

static unsigned int cub::CachingDeviceAllocator::IntPow ( unsigned int  base,
unsigned int  exp 
)
inline static

Integer pow function for unsigned base and exponent.

Definition at line 201 of file util_allocator.cuh.
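
One common way to implement an integer power for unsigned operands is exponentiation by squaring. The following is a minimal sketch under that assumption; the name int_pow_sketch is hypothetical, and this page does not confirm which algorithm IntPow itself uses.

```cpp
// Sketch: integer power by repeated squaring.
// Unsigned arithmetic wraps on overflow, so callers are assumed
// to pass operands whose result fits in unsigned int.
inline unsigned int int_pow_sketch(unsigned int base, unsigned int exp) {
    unsigned int result = 1;
    while (exp > 0) {
        if (exp & 1u) result *= base;  // fold in the current bit of exp
        base *= base;                  // square for the next bit
        exp >>= 1;
    }
    return result;
}
```

For the default configuration this gives int_pow_sketch(8, 3) == 512 for the smallest bin and int_pow_sketch(8, 7) == 2097152 for the largest.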

◆ NearestPowerOf()

void cub::CachingDeviceAllocator::NearestPowerOf ( unsigned int &  power,
size_t &  rounded_bytes,
unsigned int  base,
size_t  value 
)
inline

Rounds value up to the nearest power of base, returning the exponent in power and the rounded size in rounded_bytes.

Definition at line 221 of file util_allocator.cuh.
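
The rounding can be pictured as repeatedly multiplying by base until the running product reaches value. Below is a minimal host-side sketch of that idea; nearest_power_of_sketch is a hypothetical name, and the saturation behavior for degenerate bases or very large values is a choice made for this sketch, not a documented guarantee of NearestPowerOf.

```cpp
#include <cstddef>

// Sketch: round `value` up to a power of `base`.
// On normal return, rounded_bytes == base^power and rounded_bytes >= value.
inline void nearest_power_of_sketch(unsigned int &power,
                                    std::size_t &rounded_bytes,
                                    unsigned int base,
                                    std::size_t value) {
    const std::size_t max_size = static_cast<std::size_t>(-1);
    power = 0;
    rounded_bytes = 1;
    // Sketch-specific guard: saturate instead of looping forever when the
    // base cannot grow or the product would overflow size_t.
    if (base < 2 || value > max_size / base) {
        power = static_cast<unsigned int>(sizeof(std::size_t) * 8);
        rounded_bytes = max_size;
        return;
    }
    while (rounded_bytes < value) {
        rounded_bytes *= base;
        ++power;
    }
}
```

With base = 8, a request of 1000 bytes rounds to power 4 and 4096 bytes, while a request of exactly 512 bytes stays at power 3 and 512 bytes.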

◆ SetMaxCachedBytes()

cudaError_t cub::CachingDeviceAllocator::SetMaxCachedBytes ( size_t  max_cached_bytes)
inline

Sets the limit on the number of bytes this allocator is allowed to cache per device.

Changing the ceiling of cached bytes does not cause any allocations (in-use or cached-in-reserve) to be freed. See FreeAllCached().

Definition at line 333 of file util_allocator.cuh.
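
The effect of the ceiling on deallocation, as described in the overview, can be modeled with a small host-side predicate. This is a sketch of the documented policy only; should_cache_on_free and its parameters are hypothetical and do not correspond to a function in util_allocator.cuh.

```cpp
#include <cstddef>

// Sketch of the documented policy: a freed block is returned to its
// bin-cache only if it fits within the largest bin and caching it would
// not push the device's cached total past max_cached_bytes.
// Otherwise the block is simply freed.
inline bool should_cache_on_free(std::size_t block_bytes,
                                 std::size_t cached_bytes_on_device,
                                 std::size_t max_bin_bytes,
                                 std::size_t max_cached_bytes) {
    if (block_bytes > max_bin_bytes) return false;  // above the largest bin
    if (cached_bytes_on_device > max_cached_bytes) return false;
    // Written as a subtraction to avoid overflow in the sum.
    return block_bytes <= max_cached_bytes - cached_bytes_on_device;
}
```

Under the default limits (largest bin 2,097,152 bytes, ceiling 6,291,455 bytes), this model caches a freed 2MB block when 2MB is already cached on the device, but not when 4MB is already cached, because the total would reach 6,291,456 bytes.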

Field Documentation

◆ bin_growth

unsigned int cub::CachingDeviceAllocator::bin_growth

Geometric growth factor for bin-sizes.

Definition at line 252 of file util_allocator.cuh.

◆ cached_blocks

CachedBlocks cub::CachingDeviceAllocator::cached_blocks

Set of cached device allocations available for reuse.

Definition at line 264 of file util_allocator.cuh.

◆ cached_bytes

GpuCachedBytes cub::CachingDeviceAllocator::cached_bytes

Map of device ordinal to aggregate cached bytes on that device.

Definition at line 263 of file util_allocator.cuh.

◆ debug

bool cub::CachingDeviceAllocator::debug

Whether or not to print (de)allocation events to stdout.

Definition at line 261 of file util_allocator.cuh.

◆ INVALID_BIN

const unsigned int cub::CachingDeviceAllocator::INVALID_BIN = (unsigned int) -1
static

Out-of-bounds bin.

Definition at line 109 of file util_allocator.cuh.

◆ INVALID_DEVICE_ORDINAL

const int cub::CachingDeviceAllocator::INVALID_DEVICE_ORDINAL = -1
static

Invalid device ordinal.

Definition at line 117 of file util_allocator.cuh.

◆ INVALID_SIZE

const size_t cub::CachingDeviceAllocator::INVALID_SIZE = (size_t) -1
static

Invalid size.

Definition at line 112 of file util_allocator.cuh.

◆ live_blocks

BusyBlocks cub::CachingDeviceAllocator::live_blocks

Set of live device allocations currently in use.

Definition at line 265 of file util_allocator.cuh.

◆ max_bin

unsigned int cub::CachingDeviceAllocator::max_bin

Maximum bin enumeration.

Definition at line 254 of file util_allocator.cuh.

◆ max_bin_bytes

size_t cub::CachingDeviceAllocator::max_bin_bytes

Maximum bin size.

Definition at line 257 of file util_allocator.cuh.

◆ max_cached_bytes

size_t cub::CachingDeviceAllocator::max_cached_bytes

Maximum aggregate cached bytes per device.

Definition at line 258 of file util_allocator.cuh.

◆ min_bin

unsigned int cub::CachingDeviceAllocator::min_bin

Minimum bin enumeration.

Definition at line 253 of file util_allocator.cuh.

◆ min_bin_bytes

size_t cub::CachingDeviceAllocator::min_bin_bytes

Minimum bin size.

Definition at line 256 of file util_allocator.cuh.

◆ mutex

cub::Mutex cub::CachingDeviceAllocator::mutex

Mutex for thread-safety.

Definition at line 250 of file util_allocator.cuh.

◆ skip_cleanup

const bool cub::CachingDeviceAllocator::skip_cleanup

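Whether or not to skip a call to FreeAllCached() when destructor is called. (The CUDA runtime may have already shut down for statically declared allocators)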

Definition at line 260 of file util_allocator.cuh.


The documentation for this struct was generated from the following file: