Friday, February 25, 2011

STL Allocator with low-fragmentation heap

The STL provides a series of data structure, like std::list or std::vector, known as containers. These containers are capable to change their sizes during application run-time, for that it uses dynamic allocators, or simply allocators. The STL already provides a standard allocator for general use, however it's also possible to write your own allocator to be used with containers. The disadvantage of using the standard allocator is because they uses new/delete, the same as malloc/free, which has performance implications that could not be ignored.

Versão em português desse artigo: STL Allocator com low-fragmentation heap.

There are some allocators available on web that could be used in any application like that one from, that I always like to mention, Boost library, the fast_pool_allocator. But, taking advantage of technic presented in article Memory management in Windows applications, we are providing a dynamic allocator based on HeapSpace class that was presented:

#include <iostream>
#include <windows.h>
#include "HeapSpace.h"

template <typename T>
class HeapAllocator {
public:
    HeapSpace* HeapHandler;
    int* Copies;

public:
    // type definitions
    typedef T        value_type;
    typedef T*       pointer;
    typedef const T* const_pointer;
    typedef T&       reference;
    typedef const T& const_reference;
    typedef std::size_t    size_type;
    typedef std::ptrdiff_t difference_type;

    // rebind allocator to type U
    template <typename U>
    struct rebind {
        typedef HeapAllocator<U> other;
    };

    // return address of values
    pointer address (reference value) const {
        return &value;
    }
    const_pointer address (const_reference value) const {
        return &value;
    }

    /* constructors and destructor
    */
    HeapAllocator() throw() {
        HeapHandler = new HeapSpace(50*sizeof(T));
        Copies = (int*)HeapHandler->alloc(sizeof(Copies));
        *Copies = 1;
    }
    HeapAllocator(HeapAllocator const& arg) throw() {
        Copies = arg.Copies;
        HeapHandler = arg.HeapHandler;
        (*Copies)++;
    }
    template <typename U>
    HeapAllocator (HeapAllocator<U> const& arg) throw() {
        Copies = arg.Copies;
        HeapHandler = arg.HeapHandler;
        (*Copies)++;
    }
    ~HeapAllocator() throw() {
        (*Copies)--;
        if (*Copies == 0)
            delete HeapHandler;
    }

    // return maximum number of elements that can be allocated
    size_type max_size () const throw() {
        return size_t(-1) / sizeof(value_type);
    }

    // allocate but don't initialize num elements of type T
    pointer allocate (size_type num, const void* = 0) {
        return (pointer) this->HeapHandler->alloc((unsigned long)(num*sizeof(T)));
    }

    // initialize elements of allocated storage p with value value
    void construct (pointer p, const T& value) {
        new((void*)p)T(value);
    }

    // destroy elements of initialized storage p
    void destroy (pointer p) {
        p->~T();
    }

    // deallocate storage p of deleted elements
    void deallocate (pointer p, size_type num) {
        this->HeapHandler->free((void*)p);
    }
};

// return that all specializations of this allocator are interchangeable
template <typename T1, typename T2>
bool operator==(HeapAllocator<T1> const& Type1, HeapAllocator<T2> const& Type2) throw()
{
    return Type1.HeapHandler == Type2.HeapHandler;
}

template <typename T1, typename T2>
bool operator!=(HeapAllocator<T1> const& Type1, HeapAllocator<T2> const& Type2) throw()
{
    return Type1.HeapHandler != Type2.HeapHandler;
}

Note that this allocator keeps the same memory manager instance when it is recreated with another type, in this way it will use only one memory manager instance, with their own heap space, for each container instance. Before continue, let's see a comparison:


Observe that the LFH allocator has proved advantageous to be applied both with std::queue, as with std::deque. However was with std::queue the it had the best performance, performing the task in about half the time. std::list, std::vector and other ones was also tested, cases that gain was also achieved, but with too different scales, taking dozen of times more, but was discarded for a better comprehension of chart. These measurements was made individually, one container by each time, obtaining the average time of a long sequence of insertions and deletions from queue being made by 10 threads simultaneously. Remembering that this test just simulates an artificial condition of forced stress that may not represent the truth for another applications, it's always recommended specific performance tests in applications with a critical response time.

Some examples of methods to configure containers for this allocator:

//using std::queue (faster FIFO)
std::queue<int,std::deque<int,HeapAllocator<int>>> Queue;

//for std::deque
std::deque<int,HeapAllocator<int>> Queue;

//std::vector
std::vector<int,HeapAllocator<int>> Queue;

//std::list
std::list<int,HeapAllocator<int>> Queue;

No comments:

Post a Comment