As I said elsewhere, I've been trying to make heap allocation almost as fast as stack allocation for a realtime scripting language I'm making. Here is the allocator code.
I was going to make this a tutorial, but instead I'm going to try to get something about this published somewhere like the C++ Users Journal.
For some info on metaprogramming have a look at http://osl.iu.edu/~tveldhui/papers/Template-Metaprograms/meta-art.html and/or do some googling. There is a fair amount of info out there for such an exotic technique, and there are even some libraries like the Boost MPL.
Enjoy [:)]
[code]
#ifndef __METAPOOL_HH__
#define __METAPOOL_HH__
/********************************************************************************
** A cross-platform template metaprogrammed memory pool allocation system. **
** **
** Copyright (C) 2003-2005 Lorien Dunn and The Applied Computing Research **
** Institute, La Trobe University. **
** **
** Released at http://sumea.com.au/forum/topic.asp?TOPIC_ID=2716 under **
** the Boost license (reproduced below) on the 7th of Feb 2005. The API for **
** normal use is at the end of this file. **
** **
** In addition to the boost license I ask that bug fixes, optimisations **
** and other improvements be posted on sumea at the URL given above. **
** And if you use this and meet me perhaps you could buy me a beer :) **
** **
** This system manages a variable number of pools, each of which caters for **
** a different size allocation. The metaprogramming is used to find which **
** pool best fits a particular size allocation request at compile time. **
** A runtime search is also provided (for when a size can't be known at **
** compile time). **
** **
** Pool chunkSizes are arranged like this: **
** 16 24 32 48 64 96 128 192 or **
** 24 32 48 64 96 128 192 256 **
** depending on whether the value defined by METAPOOL_MIN_CHUNK_SIZE **
** is a power of two or a power of two + 1/2 times the same power of two. **
** **
** Compared to previous versions this release vastly reduces memory **
** wastage and improves scalability. METAPOOL_HEAP_SIZE_IN_K can be set to 4 **
** megs and METAPOOL_MAX_CHUNK_SIZE to 2 megs without problem. There is a **
** small runtime penalty for this scalability, and synthetic benchmarks that **
** allocate tons of memory probably won't perform hugely better than malloc. **
** It should perform much better in real programs. **
** **
** It should be 64-bit clean, providing you change one macro, but I've not **
** been able to verify this yet. It will not work on 16-bit CPUs. **
** **
** Note pools whose chunkSizes are multiples of 16 bytes will return **
** 16-byte-aligned allocations, making them suitable for use with SSE **
** vector operations. **
** **
** Normal use API appears at the end of this file. It is very similar to **
** the c standard library: **
** **
** template <size_t Size> void* alloc(); **
** template <typename T> void* alloc(); **
** void* alloc(size_t size); **
** template <size_t Size> void* realloc(void* mem); **
** void* realloc(void* mem, size_t newSize); **
** void free(void* mem); **
** **
** The Allocator classes from the previous versions are gone, but a base **
** class allocator for use with the "curiously recursive template pattern" **
** is provided, as are preprocessor macros. **
** **
** Forget about using this with MSVC 6.0 (even with SP 5). 6.0 is way too **
** broken (particularly in templates), and the headers are missing the memory **
** alignment functions. AFAIK the only version of MSVC this code will compile **
** with is .net 2003 because it requires partial template specialisation. **
** If you are using MinGW make sure it is >= 3.4.0 because the memory **
** alignment functions are missing from the headers in 3.3.x **
** Works fine with Intel C++ 7.1 on win32 with MSVC6 headers, but you have **
** to hack alignedAlloc and alignedFree to use normal malloc/free :( **
** I have no idea if it works with Borland compilers, and to be honest I **
** don't care- I've had nothing but trouble with them. **
** **
********************************************************************************/
/********************************************************************************
** Boost Software License - Version 1.0 - August 17th, 2003 **
** **
** Permission is hereby granted, free of charge, to any person or organization **
** obtaining a copy of the software and accompanying documentation covered by **
** this license (the "Software") to use, reproduce, display, distribute, **
** execute, and transmit the Software, and to prepare derivative works of the **
** Software, and to permit third-parties to whom the Software is furnished to **
** do so, all subject to the following: **
** **
** The copyright notices in the Software and this entire statement, including **
** the above license grant, this restriction and the following disclaimer, **
** must be included in all copies of the Software, in whole or in part, and **
** all derivative works of the Software, unless such copies or derivative **
** works are solely in the form of machine-executable object code generated by **
** a source language processor. **
** **
** THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR **
** IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, **
** FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT **
** SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE **
** FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, **
** ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER **
** DEALINGS IN THE SOFTWARE. **
********************************************************************************/
#include <vector>
#include <new> //for std::bad_alloc
//not using the <cstring> and <cassert> headers because they cause
//compatibility issues when including c header files.
#include <string.h> //for memcpy
#include <assert.h>
//Re-implementation of the Loki library static assert.
//Makes a compile error if the condition is false.
template <bool> struct CTAssert;
template <> struct CTAssert<true> {};
//sizeof forces CTAssert<false> (which is incomplete) to be a compile error,
//and the enum avoids defining an object in the header.
#define CTASSERT(expr,msg) enum { ERROR_##msg = sizeof(CTAssert<((expr) != false)>) }
//All memory will be aligned on 16 byte boundaries. Do not change.
#define BYTE_ALIGNMENT 16
//Define METAPOOL_USE_MALLOC to fall back to plain malloc/free
//(losing the guaranteed 16-byte alignment).
#ifdef METAPOOL_USE_MALLOC
#warning "METAPOOL_USE_MALLOC defined- using plain malloc/free"
#endif //METAPOOL_USE_MALLOC
//change this to 0 on a 64 bit system (untested atm). I wish
//sizeof would work in the preprocessor!
#ifdef THIRTY_TWO_BIT
#warning "THIRTY_TWO_BIT overridden"
#else
#define THIRTY_TWO_BIT 1
#endif //THIRTY_TWO_BIT
//You might want to change these according to your needs.
//Chunk sizes should be powers of two or memory alignment
//will likely be completely wrong. Chunk sizes smaller than
//16 bytes on 32-bit systems (32 bytes on 64-bit systems)
//will mean more space is wasted on housekeeping than allocated
//for the Pools with the smallest chunkSizes.
//A warning is given if any of these are overridden.
#ifdef METAPOOL_MAX_SPARE_HEAPS
#warning "METAPOOL_MAX_SPARE_HEAPS overridden"
#else
#define METAPOOL_MAX_SPARE_HEAPS 20
#endif //METAPOOL_MAX_SPARE_HEAPS
#if THIRTY_TWO_BIT == 1
#ifdef METAPOOL_HEAP_SIZE_IN_K
#warning "METAPOOL_HEAP_SIZE_IN_K overridden"
#else
#define METAPOOL_HEAP_SIZE_IN_K 1024
#endif //METAPOOL_HEAP_SIZE_IN_K
#ifdef METAPOOL_MIN_CHUNK_SIZE
#warning "METAPOOL_MIN_CHUNK_SIZE overridden"
#else
#define METAPOOL_MIN_CHUNK_SIZE 16
#endif //METAPOOL_MIN_CHUNK_SIZE
#else //64-bit
#ifdef METAPOOL_HEAP_SIZE_IN_K
#warning "METAPOOL_HEAP_SIZE_IN_K overridden"
#else
#define METAPOOL_HEAP_SIZE_IN_K 2048
#endif //METAPOOL_HEAP_SIZE_IN_K
#ifdef METAPOOL_MIN_CHUNK_SIZE
#warning "METAPOOL_MIN_CHUNK_SIZE overridden"
#else
#define METAPOOL_MIN_CHUNK_SIZE 32
#endif //METAPOOL_MIN_CHUNK_SIZE
#endif //THIRTY_TWO_BIT == 1
#ifdef METAPOOL_MAX_CHUNK_SIZE
#warning "METAPOOL_MAX_CHUNK_SIZE overridden"
#else
#define METAPOOL_MAX_CHUNK_SIZE ((METAPOOL_HEAP_SIZE_IN_K*1024)/2)
#endif //METAPOOL_MAX_CHUNK_SIZE
//do a little preprocessor sanity checking.
//Real power-of-2 and power-of-2 + 1/2-power-of-2 sanity
//checking is done with a template metaprogram.
#if METAPOOL_HEAP_SIZE_IN_K % 2 != 0
#error "METAPOOL_HEAP_SIZE_IN_K must be even"
#endif //METAPOOL_HEAP_SIZE_IN_K % 2 != 0
#if METAPOOL_MAX_CHUNK_SIZE < METAPOOL_MIN_CHUNK_SIZE
#error "METAPOOL_MAX_CHUNK_SIZE < METAPOOL_MIN_CHUNK_SIZE"
#endif //METAPOOL_MAX_CHUNK_SIZE < METAPOOL_MIN_CHUNK_SIZE
#if METAPOOL_MIN_CHUNK_SIZE % 2 != 0
#error "METAPOOL_MIN_CHUNK_SIZE must be even"
#endif //METAPOOL_MIN_CHUNK_SIZE % 2 != 0
#if METAPOOL_MAX_CHUNK_SIZE % 2 != 0
#error "METAPOOL_MAX_CHUNK_SIZE must be even"
#endif //METAPOOL_MAX_CHUNK_SIZE % 2 != 0
#if METAPOOL_MAX_CHUNK_SIZE / 1024 > METAPOOL_HEAP_SIZE_IN_K
#error "METAPOOL_MAX_CHUNK_SIZE / 1024 > METAPOOL_HEAP_SIZE_IN_K"
#endif //METAPOOL_MAX_CHUNK_SIZE / 1024 > METAPOOL_HEAP_SIZE_IN_K
#if THIRTY_TWO_BIT == 1
#if METAPOOL_MIN_CHUNK_SIZE < 16
#warning "minimum chunk size is less than housekeeping size"
#endif //METAPOOL_MIN_CHUNK_SIZE < 16
#else //64 bit
#if METAPOOL_MIN_CHUNK_SIZE < 32
#warning "minimum chunk size is less than housekeeping size"
#endif //METAPOOL_MIN_CHUNK_SIZE < 32
#endif //THIRTY_TWO_BIT
#define MAX_CHUNKS_PER_HEAP ((METAPOOL_HEAP_SIZE_IN_K*1024)/METAPOOL_MIN_CHUNK_SIZE)
//You should probably get rid of this and use a
//better Singleton class because static storage
//is rather slow to access.
//I plan on using a thread-local storage singleton,
//so each thread gets its own MetaPool. This will
//make everything threadsafe without using any
//mutexes. Providing the same thread that allocates
//a given piece of memory also frees it...
template <typename T>
class Singleton
{
public:
inline static T* instance()
{
static T instance;
return &instance;
}
};
//Portability functions for allocating memory aligned on a
//particular bytes boundary. This allocation system always
//uses 16-byte alignment.
#ifdef unix
#include <stdlib.h> //cstdlib doesn't have posix_memalign!
//WARNING: arg order reversed from prev versions
inline void* alignedAlloc(size_t bytes, size_t align=BYTE_ALIGNMENT)
{
void *rtn = NULL;
#ifdef METAPOOL_USE_MALLOC
    rtn = malloc(bytes);
#else
    posix_memalign(&rtn,align,bytes);
#endif
if(!rtn) throw std::bad_alloc();
return rtn;
}
inline void alignedFree(void *mem)
{
assert(mem);
free(mem);
}
#elif defined WIN32
#include <malloc.h> //for _aligned_malloc/_aligned_free
//WARNING: arg order reversed from prev versions
inline void* alignedAlloc(size_t bytes,size_t align=BYTE_ALIGNMENT)
{
#ifdef METAPOOL_USE_MALLOC
void* rtn = malloc(bytes);
#else
void* rtn = _aligned_malloc(bytes,align);
#endif
if(!rtn) throw std::bad_alloc();
return rtn;
}
inline void alignedFree(void *mem)
{
assert(mem);
#ifdef METAPOOL_USE_MALLOC
free(mem);
#else
_aligned_free(mem);
#endif
}
#else
#error "Either unix or WIN32 was not defined, or you have an unsupported system"
#endif //unix
namespace Memory
{
class MetaPool;
namespace Details
{
class Heap;
////////////////////////////////////////////////////////////////////////
// //
// Struct Node. A header for a chunk and also a linked //
// list node. The list structure eliminates searching for //
// free chunks. //
// //
////////////////////////////////////////////////////////////////////////
struct Node
{
//Holds the actual size of the allocation to speed up realloc
//and test if free
size_t mAllocSize;
//next node in this heap
Node* mNext;
//heap that owns this node (BIG freeChunk speedup)
Heap* mHeap;
//To be used for reference counted smart pointers.
//Also ensures sizeof(Node) is 16 (32 on 64-bit systems) bytes to
//eliminate memory alignment issues
size_t mRefCount;
};
CTASSERT(sizeof(char)==1,CharSizeIsNot1Byte);
#if THIRTY_TWO_BIT == 1
CTASSERT(sizeof(Node)==16,NodeSizeIsNot16Bytes);
#else
CTASSERT(sizeof(Node)==32,NodeSizeIsNot32Bytes);
#endif //THIRTY_TWO_BIT
/*
** NB I use c-style casts in places because it is much
** easier to read. IMHO the reinterpret_cast syntax
** was never designed for such crazy casting.
*/
//Finds the mHeap member in the node that this memory
//belongs to.
inline Heap* getOwnerHeap(void* mem)
{
// *(Heap**) (((char*)mem) - HeapOffsetIntoNodeInBytes);
return *(Heap**) (((char*)mem) - sizeof(size_t) - sizeof(Heap*));
}
//returns the Node that owns this memory.
inline Node* getNode(void* mem)
{
return (Node*)(((char*)mem) - sizeof(Node));
}
class Pool;
////////////////////////////////////////////////////////////////////////
// //
// Class Heap. Uses a very strange and (eventually) tangled //
// linked list of nodes and chunks that get allocated as //
// one block of ram, arranged like this: //
// //
// |--------------| The tangle comes from nodes and chunks //
// | Node | being reused, and the linked list means //
// |--------------| there is no searching through the heap //
// | Allocated | for a free node/chunk combination. //
// | chunk | The nodes are headers for the chunks //
// |--------------| //
// | Next Node | Strangely this class becomes much slower //
// |--------------| if I make this block of memory part of //
// |Next Allocated| the instance rather than dynamically //
// | chunk | allocating it. //
// |--------------| //
// //
////////////////////////////////////////////////////////////////////////
class Heap
{
friend class Pool;
//the mem that gets split into Nodes and chunks
void* mMemory;
//sizeof(Node) + chunkSize
size_t mNodeSize;
//the next unused node
Node* mNextFree;
//the last node in the heap
Node* mLastNode;
//the size of the mem chunks this heap contains
size_t mChunkSize;
//the noOfChunks in this heap that have been allocated
size_t mAllocCount;
//the max allocs this heap can hold
size_t mMaxAllocs;
//used in cleanup phase
Pool* mParent;
//the requested heapSize
const size_t mSizeInBytes;
//size of the heap + sizeof the max possible no of nodes
const size_t mRealSize;
inline void setup(Pool* parent, size_t chunkSize)
{
mParent = parent;
mChunkSize = chunkSize;
mNodeSize = chunkSize + sizeof(Node);
mMaxAllocs = mRealSize / mNodeSize;
mAllocCount = 0;
        //Set all the pointers. This loop will cause a little
        //lag for small chunkSize heaps, but it can't be avoided :(
        Node *n;
        for(size_t i=0;i<mMaxAllocs;i++)
        {
            n = (*this)[i];
            n->mHeap = this;
            n->mNext = (*this)[i+1]; //this will go past the end, but it gets
                                     //set to NULL on the last Node below
            n->mAllocSize = 0;
        }
// set pointer to first free node.
mNextFree = reinterpret_cast<Node*>(mMemory);
// set pointer to last free Node.
mLastNode = (*this)[mMaxAllocs-1];
mLastNode->mAllocSize = 0;
mLastNode->mHeap = this;
mLastNode->mNext = NULL;
}
public:
//make absolutely sure this thing throws bad_alloc
//independent of compiler stupidities
inline void* operator new(size_t size)
{
void* mem = malloc(size);
if(!mem)
throw std::bad_alloc();
return mem;
}
inline void operator delete(void* mem)
{
assert(mem);
::free(mem);
}
        inline Heap(Pool* parent, size_t chunkSize, size_t size) :
            mMemory(NULL),
            mChunkSize(chunkSize),
            mSizeInBytes(size),
            mRealSize(size+(MAX_CHUNKS_PER_HEAP*sizeof(Node)))
{
mMemory = alignedAlloc(mRealSize);
setup(parent, chunkSize);
}
inline ~Heap()
{
if(mMemory)
alignedFree(mMemory);
}
//makes a node and chunk appear as one block of memory,
//and a heap as an array of them. mNodeSize is sizeof(Node)
//plus the size of the chunk.
inline Node* operator[](size_t idx) const
{
return reinterpret_cast<Node*>((char*)mMemory + (idx*mNodeSize));
}
inline size_t chunkSize() const { return mChunkSize; }
inline size_t sizeInBytes() const { return mSizeInBytes; }
inline size_t allocCount() const { return mAllocCount; }
inline size_t maxAllocs() const { return mMaxAllocs; }
inline bool full() const { return !mNextFree; }
inline bool empty() const { return mAllocCount == 0; }
//Returns a free chunk from this heap and sets up the next
//allocation. This method will never be called if mNextFree
//is NULL
inline void* allocChunk(size_t allocSize)
{
void *chunk = NULL;
mNextFree->mAllocSize = allocSize;
mNextFree->mRefCount = 0;
chunk = (void*)(((char*)mNextFree) + sizeof(Node));
mNextFree = mNextFree->mNext;
mAllocCount++;
return chunk;
}
//Frees a chunk of memory and sets up all the pointers so no
//searching for free chunks is needed. This is what causes
//the tangles in the list. When the heap this belongs to is
//empty it gets passed to the parent pool for re-use.
//Implemented after Pool because it needs to call a Pool method.
inline void freeChunk(void *mem);
};
//A wrapper function for alignedAlloc that places a Node structure
//prior to the usable memory. Used when the requested allocation
//size will not fit in a Pool. The Node structure makes the allocation
//able to be freed and realloced by Memory::free and Memory::realloc.
inline void* bigAlloc(size_t size)
{
void* mem = alignedAlloc(size+sizeof(Details::Node));
Details::Node* n = reinterpret_cast<Details::Node*>(mem);
n->mAllocSize = size;
n->mNext = NULL;
n->mHeap = NULL;
n->mRefCount = 0;
return (void*)((char*)mem+sizeof(Details::Node));
}
////////////////////////////////////////////////////////////////////////
// //
// Class Pool. Manages a vector of (equal chunkSize) heaps //
// for allocation, also facilitates recycling of heaps both //
// within itself and though MetaPool. //
// //
////////////////////////////////////////////////////////////////////////
class Pool
{
protected:
friend class Heap;
friend class MetaPool;
Heap* mCurrentHeap;
Heap* mSpareHeap; //to avoid calling Heap::setup when there is no need
size_t mChunkSize;
MetaPool* mParent;
const size_t mDefaultHeapSize;
typedef std::vector<Heap*> HeapVector;
HeapVector mHeaps;
//When a heap is completely empty try putting it in the
//mSpareHeap member (so setup doesn't have to be called).
//If mSpareHeap is already used, add the heap to a LIFO
//in MetaPool for re-use elsewhere.
void cleanup(Heap* h);
public:
inline Pool(MetaPool* parent, size_t chunkSize, size_t defaultHeapSize) :
mCurrentHeap(NULL),
mSpareHeap(NULL),
mChunkSize(chunkSize),
mParent(parent),
mDefaultHeapSize(defaultHeapSize)
{
mHeaps.reserve(100);
}
inline ~Pool()
{
HeapVector::iterator i;
for(i=mHeaps.begin();i!=mHeaps.end();i++)
{
delete *i;
}
}
        //Find a heap with a reasonable number of free slots. If none
        //are free it allocates a new heap as a last resort, hence could
        //throw std::bad_alloc. Implemented after MetaPool.
void findHeap();
        //Returns a free chunk. Can throw std::bad_alloc.
        //Contains the only branch I can't remove when memory is available.
        //Will trigger plenty more branches if mCurrentHeap is full though.
inline void *allocChunk(size_t allocSize)
{
if(!mCurrentHeap || mCurrentHeap->full())
findHeap();
return mCurrentHeap->allocChunk(allocSize);
}
inline size_t chunkSize() const { return mChunkSize; }
inline size_t noOfHeaps() const { return mHeaps.size(); }
inline size_t defaultHeapSize() const { return mDefaultHeapSize; }
};
/*
** This Heap method is here because it has to call a Pool
** method.
*/
//Frees a chunk of memory and sets up all the pointers so no
//searching for free chunks is needed. This is what causes
//the tangles in the list. When the heap this belongs to is
//empty it gets passed to the parent pool for re-use.
inline void Heap::freeChunk(void *mem)
{
Node* n = getNode(mem);
n->mRefCount = 0;
n->mAllocSize = 0;
//set up the pointers
Node* prevLast = mLastNode;
mLastNode = n;
prevLast->mNext = mLastNode;
mLastNode->mNext = NULL;
if(!mNextFree)
mNextFree = mLastNode;
mAllocCount--;
if(mAllocCount == 0)
mParent->cleanup(this);
}
/***********************************************************************
** **
** Begin template metaprogramming **
** **
***********************************************************************/
//////////////////////////////////////////////////////////////////////
// A compile-time loop to test if a number is a power of 2 //
//////////////////////////////////////////////////////////////////////
    template <int N>
    struct PowerOf2Test
    {
        enum
        {
            value = !(N%2) ? PowerOf2Test<(N/2)>::value : false
        };
    };
//it is a power of two
template <>
struct PowerOf2Test<2>
{
enum { value = true };
};
//it is not a power of two
template <>
struct PowerOf2Test<0>
{
enum { value = false };
};
//////////////////////////////////////////////////////////////////////
    // A compile-time test of whether a number is a power of 2 plus    //
    // half of itself. This figures out that the next power of two     //
    // from 48 is 64, for example.                                     //
//////////////////////////////////////////////////////////////////////
template <int Size>
struct HalfwayTest
{
enum
{
//perhaps this first PowerOf2Test is redundant, but icc
//complains without it
value = PowerOf2Test<Size>::value ? false : PowerOf2Test<(Size+(Size/3))>::value
};
};
//////////////////////////////////////////////////////////////////////
// A metaprog "if" statement //
//////////////////////////////////////////////////////////////////////
    template <bool Condition, typename TCase, typename FCase> struct If;
    //the true case
    template <typename TCase, typename FCase>
    struct If<true, TCase, FCase>
    {
        typedef TCase result;
    };
    //the false case
    template <typename TCase, typename FCase>
    struct If<false, TCase, FCase>
    {
        typedef FCase result;
    };
//////////////////////////////////////////////////////////////////////
// True and False boolean wrapper types //
//////////////////////////////////////////////////////////////////////
struct True
{
enum { value = 1 };
};
struct False
{
enum { value = 0 };
};
#define MIN_CHUNKSIZE METAPOOL_MIN_CHUNK_SIZE
#define MAX_CHUNKSIZE METAPOOL_MAX_CHUNK_SIZE
///////////////////////////////////////////////////////////////////////
//Compile time sanity checks for heapSize, and min and max chunkSizes//
///////////////////////////////////////////////////////////////////////
    typedef If<PowerOf2Test<METAPOOL_HEAP_SIZE_IN_K>::value,
        True, HalfwayTest<METAPOOL_HEAP_SIZE_IN_K> >::result HeapSizeCondition;
    CTASSERT(HeapSizeCondition::value, BadHeapSize);
    typedef If<PowerOf2Test<MIN_CHUNKSIZE>::value,
        True, HalfwayTest<MIN_CHUNKSIZE> >::result MinChunkCondition;
    CTASSERT(MinChunkCondition::value, BadMinChunkSize);
    typedef If<PowerOf2Test<MAX_CHUNKSIZE>::value,
        True, HalfwayTest<MAX_CHUNKSIZE> >::result MaxChunkCondition;
CTASSERT(MaxChunkCondition::value, BadMaxChunkSize);
//////////////////////////////////////////////////////////////////////
// Calculate (compile time) the number of pools required //
//////////////////////////////////////////////////////////////////////
    template <int N>
    struct PoolCount
    {
        enum
        {
            //each family of chunkSizes (power-of-2 and halfway) steps
            //down by a factor of 2
            value = PoolCount<(N/2)>::value + 1
        };
    };
    //These terminate the above loop
    template <>
    struct PoolCount<MIN_CHUNKSIZE>
    {
        enum { value = 1 };
    };
template <>
struct PoolCount<(MIN_CHUNKSIZE+(MIN_CHUNKSIZE/2))>
{
enum { value = 1 };
};
template <>
struct PoolCount<(MIN_CHUNKSIZE+(MIN_CHUNKSIZE/3))>
{
enum { value = 1 };
};
    //provides easy access to the compile-time calculated number of
    //pools
struct NoOfPools
{
enum
{
max = MAX_CHUNKSIZE,
maxIsPowerOf2 = PowerOf2Test<max>::value,
pwrOf2 = PoolCount<(maxIsPowerOf2 ? max : (max-(max/3)))>::value,
notPwrOf2 = PoolCount<(maxIsPowerOf2 ? (max-(max/4)) : max)>::value,
value = pwrOf2 + notPwrOf2
};
};
///////////////////////////////////////////////////////////////////////
// Calculate (compile time) the chunkSize of the pool at Idx //
///////////////////////////////////////////////////////////////////////
template <int Idx>
class ChunkSize
{
struct MinChunkPowerOf2
{
enum
{
powerOf2 = !(Idx%2),
value = powerOf2 ? ChunkSize<Idx+1>::value - ChunkSize<Idx+1>::value/3 : ChunkSize<Idx+1>::value - ChunkSize<Idx+1>::value/4
};
};
struct MinChunkNotPowerOf2
{
enum
{
powerOf2 = Idx%2,
value = powerOf2 ? ChunkSize<Idx+1>::value - ChunkSize<Idx+1>::value/3 : ChunkSize<Idx+1>::value - ChunkSize<Idx+1>::value/4
};
};
public:
enum
{
value = If<PowerOf2Test<MIN_CHUNKSIZE>::value, MinChunkPowerOf2, MinChunkNotPowerOf2>::result::value
};
};
//This terminates the above loop
template<>
class ChunkSize<NoOfPools::value>
{
struct MaxChunkPowerOf2
{
enum { value = MAX_CHUNKSIZE + (MAX_CHUNKSIZE/2) };
};
struct MaxChunkNotPowerOf2
{
enum { value = MAX_CHUNKSIZE + (MAX_CHUNKSIZE/3) };
};
public:
enum
{
value = If<PowerOf2Test<MAX_CHUNKSIZE>::value, MaxChunkPowerOf2, MaxChunkNotPowerOf2>::result::value
};
};
///////////////////////////////////////////////////////////////////////
// Calculate (compile time) the best pool Idx to allocate Size bytes //
///////////////////////////////////////////////////////////////////////
    template <size_t Size, int Idx>
    struct PoolIdxSearch
    {
        enum
        {
            //will the requested alloc fit in a chunk?
            inRange = Size <= MAX_CHUNKSIZE ? true : false,
            //does the alloc fit in a pool, and will it fit in the current pool?
            fitsHereToo = (inRange && (Size <= ChunkSize<Idx>::value)) ? true : false,
            //if it does return the current index, otherwise keep recursing
            value = fitsHereToo ? Idx : PoolIdxSearch<Size,Idx+1>::value
        };
    };
    //terminate the loop at NoOfPools::value, which means we have to use
    //the run-time allocator because there isn't a pool large enough
    template <size_t Size>
    struct PoolIdxSearch<Size,NoOfPools::value>
    {
        enum { value = -1 };
    };
#undef MIN_CHUNKSIZE
#undef MAX_CHUNKSIZE
/***********************************************************************
** **
** End template metaprogramming **
** **
***********************************************************************/
} //namespace Details
////////////////////////////////////////////////////////////////////////
// //
// Class MetaPool. Encapsulates multiple pools of different //
// chunkSizes, handles runtime searching for the best size //
// pool for an alloc, and manages recylcling of heaps between //
// pools with different chunkSizes. //
// //
////////////////////////////////////////////////////////////////////////
class MetaPool
{
friend class Details::Pool;
typedef Details::NoOfPools NoOfPools;
//holds all the pools
Details::Pool* mPools[NoOfPools::value];
//some space to store empty heaps for re-use between
//pools. Using a vector as a LIFO because of constant-time
//pop_back() and ability to reserve space (which deque can't)
typedef std::vector<Details::Heap*> HeapVector;
HeapVector mSpareHeaps;
inline void addSpareHeap(Details::Heap* h)
{
if(mSpareHeaps.size() > METAPOOL_MAX_SPARE_HEAPS)
{
delete h;
return;
}
mSpareHeaps.push_back(h);
}
inline Details::Heap* findSpareHeap(size_t chunkSize)
{
Details::Heap* h = NULL;
            if(mSpareHeaps.empty())
            {
                //no spares of our own- try pinching another pool's mSpareHeap
                for(int i=0;i<NoOfPools::value;i++)
                {
                    if(mPools[i]->mSpareHeap && !(mPools[i]->mSpareHeap->sizeInBytes() % chunkSize))
                    {
                        h = mPools[i]->mSpareHeap;
                        mPools[i]->mSpareHeap = NULL;
                        return h;
                    }
                }
                return NULL;
            }
HeapVector::iterator it = mSpareHeaps.begin();
for(;it!=mSpareHeaps.end();it++)
{
if(!((*it)->sizeInBytes() % chunkSize))
{
h = (*it);
mSpareHeaps.erase(it);
return h;
}
}
return NULL;
}
public:
inline Details::Pool* pool(int idx)
{
return mPools[idx];
}
MetaPool()
{
const size_t heapSizeInK = METAPOOL_HEAP_SIZE_IN_K;
const size_t minChunkSize = METAPOOL_MIN_CHUNK_SIZE;
const size_t maxChunkSize = METAPOOL_MAX_CHUNK_SIZE;
//make sure there is plenty of room so the vector
//doesn't have to realloc
mSpareHeaps.reserve(METAPOOL_MAX_SPARE_HEAPS);
        bool minChunkPwrOf2 = Details::PowerOf2Test<METAPOOL_MIN_CHUNK_SIZE>::value;
        bool heapSizePwrOf2 = Details::PowerOf2Test<METAPOOL_HEAP_SIZE_IN_K>::value;
        size_t pwrOf2HeapSize = heapSizePwrOf2 ? heapSizeInK : heapSizeInK-(heapSizeInK/3);
        size_t notPwrOf2HeapSize = heapSizePwrOf2 ? heapSizeInK-(heapSizeInK/4) : heapSizeInK;
        pwrOf2HeapSize *= 1024;
        notPwrOf2HeapSize *= 1024;
        size_t currentSize = minChunkSize;
        for(int i=0;i<NoOfPools::value;i++)
        {
            //chunkSizes alternate between the power-of-2 and halfway
            //sizes, and each pool gets a heapSize its chunkSize divides evenly
            bool pwrOf2Pool = minChunkPwrOf2 ? ((i%2)==0) : ((i%2)!=0);
            if(pwrOf2Pool)
            {
                mPools[i] = new Details::Pool(this,currentSize,pwrOf2HeapSize);
                currentSize += currentSize/2; //e.g. 16 -> 24
            }
            else
            {
                mPools[i] = new Details::Pool(this,currentSize,notPwrOf2HeapSize);
                currentSize += currentSize/3; //e.g. 24 -> 32
            }
        }
        assert(currentSize > maxChunkSize);
    }
    //Allocate size bytes from the best-fitting pool. Requests too big
    //for any pool go to bigAlloc. Can throw std::bad_alloc.
    inline void* alloc(size_t size)
    {
        if(size > METAPOOL_MAX_CHUNK_SIZE)
            return Details::bigAlloc(size);
        for(int i=0;i<NoOfPools::value;i++)
        {
            if(size <= mPools[i]->chunkSize())
                return mPools[i]->allocChunk(size);
        }
throw std::bad_alloc();
return NULL;
}
        //Copies an allocation to a better-fitting chunk (when it grows, or
        //shrinks by more than half) and frees the old alloc.
        //Can throw std::bad_alloc.
inline void* realloc(void *mem, size_t newSize)
{
assert(mem);
void* newMem = NULL;
Details::Node* n = Details::getNode(mem);
Details::Heap* h = n->mHeap;
if(!h)
{
if(newSize > n->mAllocSize || newSize < (n->mAllocSize/2))
{
newMem = alloc(newSize);
                memcpy(newMem,mem,newSize < n->mAllocSize ? newSize : n->mAllocSize);
                free(mem); //h is NULL here (bigAlloc memory), so free() handles it
return newMem;
}
return mem;
}
if(newSize > h->chunkSize() || newSize < (h->chunkSize()/2))
{
void *newChunk = alloc(newSize);
memcpy(newChunk,mem,newSize < n->mAllocSize ? newSize : n->mAllocSize);
h->freeChunk(mem);
return newChunk;
}
return mem;
}
//Free an allocation. This WILL crash if you pass it memory
//that was not allocated through this allocation system.
inline void free(void *mem)
{
assert(mem);
Details::Heap *h = Details::getOwnerHeap(mem);
if(!h)
{
alignedFree(Details::getNode(mem));
return;
}
h->freeChunk(mem);
}
};
/*
** These Pool methods are here because they rely on calling
** MetaPool methods
*/
//When a heap is completely empty try putting it in the
//mSpareHeap member (so setup doesn't have to be called).
//If mSpareHeap is already used, add the heap to a list
//in MetaPool for re-use elsewhere (though setup will
//have to be called).
void Details::Pool::cleanup(Details::Heap* h)
{
if(h==mCurrentHeap && !mSpareHeap)
{
mSpareHeap = h;
mCurrentHeap = NULL;
return;
}
if(h!=mCurrentHeap)
{
mParent->addSpareHeap(h);
//need to remove h from mHeaps!!!
assert(!mHeaps.empty());
HeapVector::iterator it = mHeaps.end();
it--;
while(it!=mHeaps.begin())
{
if((*it) == h)
{
mHeaps.erase(it);
return;
}
it--;
}
assert(mHeaps[0]==h);
mHeaps.erase(it);
}
}
//Find a heap with reasonable number of free chunks. If none are suitable
//allocates a new heap, hence could throw std::bad_alloc.
void Details::Pool::findHeap()
{
if(mCurrentHeap)
{
mHeaps.push_back(mCurrentHeap);
mCurrentHeap = NULL;
}
if(mSpareHeap)
{
mCurrentHeap = mSpareHeap;
mSpareHeap = NULL;
return;
}
    //Fall back to looking for a recent Heap with a reasonable number
    //of free chunks. Looping backwards to exploit any short-lived
    //objects that have been freed.
    int size = mHeaps.size();
    int threshold = 1;
    int stopAt = 0;
    if(size)
    {
        Heap* h;
        stopAt = size > 5 ? size-5 : 0; //only look at the last few heaps
        for(int i=size-1;i>=stopAt;i--)
        {
            h = mHeaps[i];
            threshold = h->maxAllocs() - (h->maxAllocs() / 4);
            if(h->allocCount() <= (size_t)threshold)
            {
                mCurrentHeap = h;
                mHeaps.erase(mHeaps.begin()+i);
                return;
            }
        }
    }
//Fallback to asking MetaPool for a spare Heap and calling setup if found
Heap* fallback = mParent->findSpareHeap(mChunkSize);
if(fallback)
{
fallback->setup(this,mChunkSize);
mCurrentHeap = fallback;
return;
}
//Fallback to allocating a new one
mCurrentHeap = new Heap(this,mChunkSize,mDefaultHeapSize);
}
typedef Singleton<MetaPool> MetaSingleton;
//This struct will be the result of an If template if the
//requested size will fit in a pool.
template <int Idx>
struct CTAllocHelper
{
inline static void* alloc(size_t size)
{
return MetaSingleton::instance()->pool(Idx)->allocChunk(size);
}
};
//This struct will be the result of an If template if the
//requested size will not fit in a pool.
    template <size_t Size>
    struct RTAllocHelper
    {
        inline static void* alloc(size_t size)
        {
            return Details::bigAlloc(size);
        }
    };
};
/*
** Normal use API follows
*/
    template <size_t Size>
    inline void* alloc()
    {
        const int idx = Details::PoolIdxSearch<Size,0>::value;
        return Details::If<(idx != -1), CTAllocHelper<idx>, RTAllocHelper<Size> >::result::alloc(Size);
    }
    template <typename T>
    inline void* alloc()
    {
        return alloc<sizeof(T)>();
    }
//Allocate size bytes (from a pool if possible). Throws std::bad_alloc
//if the request cannot be fulfilled. Uses a runtime search for the
//best pool.
inline void* alloc(size_t size)
{
return MetaSingleton::instance()->alloc(size);
}
template <size_t Size>
inline void* realloc(void* mem)
{
void* newMem = NULL;
Details::Node* n = Details::getNode(mem);
Details::Heap* h = n->mHeap;
if(!h)
{
if(Size > n->mAllocSize || Size < (n->mAllocSize/2))
{
            newMem = alloc<Size>();
            memcpy(newMem,mem,Size < n->mAllocSize ? Size : n->mAllocSize);
Memory::free(mem);
return newMem;
}
return mem;
}
if(Size > h->chunkSize() || Size < (h->chunkSize()/2))
{
        newMem = alloc<Size>();
        memcpy(newMem,mem,Size < n->mAllocSize ? Size : n->mAllocSize);
Memory::free(mem);
return newMem;
}
return mem;
}
//Copies an allocation to a better-fitting chunk when it needs to grow
//(or can shrink by more than half) and frees the old alloc.
//Uses pools if possible.
//Throws std::bad_alloc if the request cannot be fulfilled.
//Uses a runtime search for the best pool.
inline void* realloc(void *mem, size_t newSize)
{
return MetaSingleton::instance()->realloc(mem, newSize);
}
//Free memory allocated with this system only. Will crash if used on
//memory allocated with malloc/calloc/realloc or the default new
//operators.
inline void free(void* mem)
{
MetaSingleton::instance()->free(mem);
}
//usage: class Test : public AddAllocator<Test>
template <typename T>
struct AddAllocator
{
inline void* operator new(size_t rtSize)
{
assert(sizeof(T)==rtSize);
return Memory::alloc<T>();
}
inline void operator delete(void* mem)
{
Memory::free(mem);
}
inline void* operator new[](size_t size)
{
return Memory::alloc(size);
}
inline void operator delete[](void* mem)
{
Memory::free(mem);
}
};
#define METAPOOL_CT_SINGLE_NEW(klass) inline void* operator new(size_t rtSize){ assert(sizeof(klass)==rtSize); return Memory::alloc<klass>();}
#define METAPOOL_RT_SINGLE_NEW inline void* operator new(size_t size){ return Memory::alloc(size);}
#define METAPOOL_SINGLE_DELETE inline void operator delete(void* mem){ Memory::free(mem);}
#define METAPOOL_ARRAY_NEW inline void* operator new[](size_t size){ return Memory::alloc(size);}
#define METAPOOL_ARRAY_DELETE inline void operator delete[](void* mem){ Memory::free(mem);}
#define METAPOOL_CT_ALLOCATORS(klass) METAPOOL_CT_SINGLE_NEW(klass) METAPOOL_SINGLE_DELETE METAPOOL_ARRAY_NEW METAPOOL_ARRAY_DELETE
#define METAPOOL_RT_ALLOCATORS METAPOOL_RT_SINGLE_NEW METAPOOL_SINGLE_DELETE METAPOOL_ARRAY_NEW METAPOOL_ARRAY_DELETE
}//namespace Memory
//clean up the macros
#undef CTASSERT
#undef BYTE_ALIGNMENT
#undef MAX_CHUNKS_PER_HEAP
#endif //__METAPOOL_HH__
[/code]
Don't we already have something like this in the coder section of Sumea?
Why not add the links there instead, or haven't you found that feature yet?