In the Memory Pool, why create a new linked List instead of using std::Vectors?

    This site uses cookies. By continuing to browse this site, you are agreeing to our Cookie Policy.

    • In the Memory Pool, why create a new linked List instead of using std::Vectors?

      Hello! I understand the need for memory managemente, and i understood the concepts behind the linked lists and the Memory pool, but i was thinking: Mike said, and i quote, "Whatever happens, don't get caught writting your own linked-list class or tree when STL would have worked.". And i am pretty sure a vector of a class with the same elements the memory pool had would have worked just fine. So why did Mark choose to write his own linked list class? Are Vectors slower by any chance??

      Thank you for the attention.

      The post was edited 1 time, last by vchavauty ().

    • RE: In the Memory Pool, why create a new linked List instead of using std::Vectors?

      There is really only one data structure here, which is a singly linked list. I implemented it using a dynamically allocated array of the size (D + H) * S, where D is the size of the data payload, H is the size of a pointer, and S is the number of elements initially allowed. Really, the S component is just a way to keep from allocating a bunch of nodes at runtime.

      The reason I needed to write my own linked list is because I had a very special implementation that isn't covered in the STL. I add, remove, and sort nodes in a specialized way. None of this is covered by the STL's implementation of a singly linked list, so it was necessary to write my own.

      As for the dynamically allocated array, I could have used an STL vector but there are three main reasons I didn't. The first is simple enough: performance. While STL vectors are typically very fast, they're not going to be as fast as accessing memory directly. Furthermore, the way they handle reallocation isn't what I wanted. When you run out of memory in an STL vector, it will allocate a new block of memory then copy everything over. This changes its location in memory, which is a complete deal-breaker for me. I needed to guarantee that a chunk of memory handed off to the program with Alloc() would always have the exact same memory address. Second, this allocate -> copy -> free cycle is REALLY slow, but it's necessary because an STL vector guarantees that all memory is contiguous. I don't care that much if my memory blocks are contiguous, so I just allocate a new chunk and link it to the previous one, which is a lot faster.

      Second, I wanted to have complete and utter control over the entire allocated block because I was doing wonky things. Specifically, I was hijacking the first four bytes to use as the "next" pointer and using the remaining bytes as the allocated chunk passed back from Alloc(). I could have done this with a vector of allocated chunks, but that would have required S allocations since each element would need to be allocated separately. Again, this is much slower.

      Third, why should I use an STL vector? Mike's point was really about reinventing the wheel. Don't write a linked list when the std::list will work just fine. This is totally correct, but it's really important to realize that the opposite is true. Don't use a std::vector when a simple dynamically allocated (or even better: a statically allocated) array will do. The only reason to use std::vector instead of an array is because you need some of the functionality and/or safety that std::vector gives you.

      -Rez

      (PS: Your post referred to the author of the memory pool as Mark. Who is that?)
    • I've been using this memory pool because it's pretty freaking awesome. I've been wondering if anyone's included any debugging goodies in it like adding __FILE__, __FUNCTION__, __LINE__ into the header of the linked list, leak checking, adding red zones to detect memory errors, checking memory alignment, etc? I think the book actually mentions some of those things but I don't think they're actually in the code (If I remember correctly GCC is using something else for leak/error detection).

      Something else I found pretty cool was to include this guys global new and delete overloads which you can make to re-direct to 2-3 different sized GCC memory pools: oroboro.com/overloading-operator-new/

      It actually makes everything including third party libraries use the memory pool. The performance improvement was noticeable, although I didn't do very robust testing.

      I would share mine but I'm sure it can be improved and I don't want to spread crappy code around. I'm interesting in seeing if anyone else has done something similar :)
    • I have some extra debugging code in the memory pool that I use for my own games. It's mostly a tracking system so I know how many times I have to allocate new blocks and how much memory I'm actually using.

      I don't globally override new or delete. In general, that can be dangerous because literally everything will go through those overridden functions. This is rarely what you actually want. This is especially true if you're piping everything through a memory pool. Memory pools are great for small allocations that are allocated and destroyed frequently, but they are very wasteful for large allocations or allocations that only happen once or twice.

      For example, for every object that I have to render in my game, I build up a list of render commands. These objects contain all the data I need to render that object and nothing else. I allocate a bunch every single frame, but they go through the memory pool so it's lightning fast.

      On the other hand, I very rarely need to destroy a loaded map and when I do, there's a loading screen. There's no reason I need to pool it or anything inside of it since it only happens once. The overhead isn't worth it and ends up being wasteful.

      Rather than overload the global new and delete operators, I would strongly suggest that you overload the class-specific new & delete operators and use STL allocators for any STL data structures you want pooled.

      One exception: If you're overriding global new and delete to add debugging information in your debug build, that's totally fine.

      -Rez
    • I have done both and I totally understand what you're talking about with wasting memory.

      But what about classes with heap memory? Lets say I have a matrix that determines its size at runtime. The matrix pointer will be in the pool, but not the data that I allocate, correct? And even if it did go there the size is almost certainly completely wrong.

      Source Code

      1. class Matrix
      2. {
      3. public:
      4. Matrix(int aRows, int aCols) : mRows(aRows), mCols(aCols)
      5. {
      6. mData = new double[aRows*aCols]; // Not in the class specific pool! Correct?
      7. // or mData.resize(aRows*aCols);
      8. }
      9. // Other functions
      10. private:
      11. int mRows;
      12. int mCols;
      13. double* mData; // or std::vector<double> mData;
      14. };
      Display All


      The allocation of the data and the actually matrix storage happen just as frequently but I don't know how to direct both of them to a pool without global overloads of new. Maybe I will do some more robust profiling with the global pool once I get to a good stopping point again.

      And just to be clear, I'm not picking a single chunk size pool. I parse an XML file with an option for how many pools and their chunk size. So for example I can have a set of pools with size 128 bytes, 512 bytes, and 4000 bytes. The new operator figures out which one to use, and if it's more than the max it calls malloc.

      any comments? :)

      The post was edited 3 times, last by DieselMaxPower ().

    • Yes, the new call in the constructor will be allocated from the heap, which is why you pool that as well.

      What you're describing is effectively what I use in my own games, where anything that's too large to go through a memory pool goes through the standard allocator. The difference is that I don't override global new/delete, I explicitly choose which system to use per allocation.

      Again, it's mostly a matter of taste. I prefer to explicitly choose my allocator because I have more control. For example, my engine has two instances of the ProcessManager, one for the logic and one for engine systems (the logic ones are stopped when the game is paused, the system ones aren't). Do these two managers need to be pooled? Nope. Doing so would not improve performance at all and would only waste memory. Why? Because they are instantiated at the start of the program and not destroyed until the very end. The actual memory used is very small since each Process is allocated separately, so in your system those allocations would go through your memory pools.

      The amount of memory wasted in my ProcessManager example is probably pretty small, but how many places am I doing these kinds of allocations? When I looked at my own game, there were only a couple of dozen places where using a memory pool made a lot of sense because I was doing lots of small, frequent allocations. The rest of the engine was just fine with standard new/delete.

      If you're going to replace new & delete, you have more work to do than just having it split based on the max pool size. You probably want to add a parameter to allow it to override the pool behavior.

      -Rez
    • I see. So a pooled allocation would be like g_pool->Alloc(size) instead of new?

      I think right now I'm OK with wasting some memory. The syntactical convenience of only using new (or GCC_NEW) rather than a custom allocation call is kind of nice. But you gave me some nice things to keep in mind.

      The post was edited 1 time, last by DieselMaxPower ().

    • Not exactly. I wrote a macro that overrides new & delete per class, so I can pool specific classes. I have pools allocated with chunks of the exact right size so I can waste as little as possible. All I have to do is add the macro to the class and it's automatically pooled when I call new.

      The generic pool I have, which is really a bunch of pools of various sizes, is only used when I don't know the size of the allocations before hand. I think the place I use this the most is with the STL allocator I wrote that's used in most STL data structures.

      When using a pool, the syntax is exactly the same. I just have to decorate each class a bit, or tweak the typedef for STL data structures. I never explicitly call Alloc().

      -Rez
    • My bad. I understand the class macros (they are super nice). If I added those to the matrix example `new Matrix()` would allocate from the pool. But what about the mData = new double[aRows*aCols]; part? Does that become mData = g_pool->Alloc(aRows*aCols);? And if I wanted this to be a vector I would write an allocator that uses the pool?

      I was worried about custom allocators because I know there are some issues with mismatching allocators that I didn't want to deal with yet.
    • Originally posted by DieselMaxPower
      But what about the mData = new double[aRows*aCols]; part? Does that become mData = g_pool->Alloc(aRows*aCols);?

      It's funny, I don't have this case at all in my game. I do have nested allocations, but they're always other classes rather than primitives. For example, my AnimatedSprite component has an array of ResourceProxy objects that represent the frames of animation. These objects are also pooled.

      Anyway, the simple answer to your question is yes, I would just manually call Alloc() in that case. The more complex answer is that I wouldn't build a matrix like that. I would use a template instead, like this:

      Source Code

      1. template <int WIDTH, int HEIGHT>
      2. class Matrix
      3. {
      4. public:
      5. Matrix()
      6. {
      7. // clear the matrix
      8. memset(&mData, 0, sizeof(mData));
      9. }
      10. // Other functions
      11. private:
      12. double mData[WIDTH][HEIGHT];
      13. };
      Display All



      And if I wanted this to be a vector I would write an allocator that uses the pool?

      Yep!


      I was worried about custom allocators because I know there are some issues with mismatching allocators that I didn't want to deal with yet.

      I have one STL allocator that I use for all STL data structures. I've never run into any mismatching issues.

      -Rez
    • The matrix was just an example. The library eigen (eigen.tuxfamily.org/index.php?title=Main_Page) is actually a very robust and tested implementation of templated matrices that people looking at that might be interested in. I guess gaming will usually use directx matrices though.

      Anyways, I remember reading in one of Scott Meyer's books a couple of issues on allocators that went way over my head, and I didn't want to even look at it. Maybe it's finally time to tackle that problem.

      Any chance you'd be willing to share one of your allocators? I promise I'll buy the next edition of the book if it ever comes out!
    • Originally posted by DieselMaxPower
      The matrix was just an example. The library eigen (eigen.tuxfamily.org/index.php?title=Main_Page) is actually a very robust and tested implementation of templated matrices that people looking at that might be interested in.

      I've seen this, but never used it in a game. One of my friends called it the STL of math. ;)


      I guess gaming will usually use directx matrices though.

      Nope! Most games use their own matrix implementation and convert to DX or OpenGL when necessary. Cross-platform code is your friend.


      Anyways, I remember reading in one of Scott Meyer's books a couple of issues on allocators that went way over my head, and I didn't want to even look at it. Maybe it's finally time to tackle that problem.

      Yup! The book you're thinking about is Effective C++.

      Things like alignment are a big issue on certain architectures. This problem exists in the current memory pool as well; I've just never gotten around to fixing it. The problem is that some CPU architectures require your memory to be aligned based on the type. Even with architectures that don't require this, you can run into major performance issues. new and malloc() both conform to the appropriate alignments, but the problem is that I'm using a 4-byte header. I'm not returning a memory-aligned pointer, I'm returning a pointer shifted by four bytes. The problem is actually worse because I'm allocating it all in one chunk, so if the size of each memory block is some wonky value, it'll be even worse.

      The real way to solve this is to set up the allocations in such a way that the actual pointer returned by Alloc() is always appropriately aligned. That's a TODO.

      There are other issues that he talks about with implementing your own memory manager. If I recall, there's a whole section in that book regarding overriding new & delete.


      Any chance you'd be willing to share one of your allocators? I promise I'll buy the next edition of the book if it ever comes out!

      Sure. It's a little ugly because I use a macro to generate the class. This was necessary (or at least easier) because of where these systems exist in my engine layering scheme. Still, you should be able to extract out the useful bit without too much trouble.

      Like the other memory pool stuff, it uses a macro to generate the allocator class.

      Brainfuck Source Code

      1. //----------------------------------------------------------------------------------------------------------------------------------
      2. // This macro generates STL allocator classes STL containers that wish to use with the bleach allocator.
      3. // - _StlAllocatorName_: This is the name of the class. It must be unique.
      4. // - _pBleachAllocator_: A pointer to the BleachAllocator object that will handle memory allocation.
      5. //----------------------------------------------------------------------------------------------------------------------------------
      6. #define BLEACH_MEMORY_GENERATE_STL_ALLOCATOR(_StlAllocatorName_, _pBleachAllocator) \
      7. template <class Type> \
      8. class _StlAllocatorName_ \
      9. { \
      10. public: \
      11. typedef size_t size_type; \
      12. typedef ptrdiff_t difference_type; \
      13. typedef Type* pointer; \
      14. typedef const Type* const_pointer; \
      15. typedef Type& reference; \
      16. typedef const Type& const_reference; \
      17. typedef Type value_type; \
      18. template <class OtherType> \
      19. struct rebind { typedef _StlAllocatorName_<OtherType> other; }; \
      20. _StlAllocatorName_() { } \
      21. _StlAllocatorName_(const _StlAllocatorName_&) { } \
      22. template <class Other> _StlAllocatorName_(const _StlAllocatorName_<Other>&) { } \
      23. ~_StlAllocatorName_() { } \
      24. pointer address(reference x) const { return &x; } \
      25. const_pointer address(const_reference x) const { return &x; } \
      26. pointer allocate(size_type size, _StlAllocatorName_<void>::const_pointer hint = 0) \
      27. { \
      28. UNUSED_PARAM(hint); \
      29. _CHECK(_pBleachAllocator); \
      30. return static_cast<pointer>(_pBleachAllocator->Alloc(size*sizeof(Type))); \
      31. } \
      32. void deallocate(pointer p, size_type n) \
      33. { \
      34. UNUSED_PARAM(n); \
      35. _CHECK(_pBleachAllocator); \
      36. _pBleachAllocator->Free(p); \
      37. } \
      38. size_type max_size() const \
      39. { \
      40. size_type maxSize = std::numeric_limits<size_type>::max() / sizeof(Type); \
      41. return maxSize; \
      42. } \
      43. void construct(pointer pPtr, const Type& val) \
      44. { \
      45. new(pPtr) Type(val); \
      46. } \
      47. void destroy(pointer pPtr) \
      48. { \
      49. UNUSED_PARAM(pPtr); /* Not sure why pPtr is considered an unused variable */ \
      50. pPtr->~Type(); \
      51. } \
      52. bool operator==(const _StlAllocatorName_<Type>& right) \
      53. { \
      54. return this == (&right); \
      55. } \
      56. }; \
      57. template <> \
      58. class _StlAllocatorName_<void> \
      59. { \
      60. public: \
      61. typedef void* pointer; \
      62. typedef const void* const_pointer; \
      63. typedef void value_type; \
      64. template <class OtherType> struct rebind { typedef _StlAllocatorName_<OtherType> other; }; \
      65. }; \
      Display All


      The _pBleachAllocator parameter is a pointer to my BleachAllocator class, which handles the memory pool distribution. It's interface has two functions: Alloc() and Free(). Alloc() will find the appropriate memory pool based on the size of the request and return the results of that pool's Alloc() call.

      As I said, the reason I jump through these hoops is because of layering. My memory management code lives on the Utility layer, which is the lowest layer I have and reusable for every project. The actual pool instances are created on the engine layer, which is used on every game project. The thing that binds them together is as follows:

      Source Code

      1. extern BleachAllocator* g_pMainThreadAllocator;
      2. BLEACH_MEMORY_GENERATE_STL_ALLOCATOR(MainThreadStlAllocator, g_pMainThreadAllocator)


      Now that that's all done, I can use the allocator in any STL data structure. Here's an example from my pathing system, which shows a list of PathPlanNode objects:

      Source Code

      1. typedef std::list< PathPlanNode*, MainThreadStlAllocator<PathPlanNode*> > PathPlanNodeList;


      You have to be careful because it's not always the second parameter. For some data structures, you have to supply a bunch of stuff you typically just leave as defaults. For example, here's a hash table from my object manager:

      Source Code

      1. typedef std::unordered_map< EntityId, StrongPtr<Entity>,
      2. std::hash<EntityId>,
      3. std::equal_to<EntityId>,
      4. MainThreadStlAllocator< std::pair<EntityId, StrongPtr<Entity> > > > EntityMap;


      Let me know if you have any questions.

      -Rez
    • YESSS! Cool, I won't be going in to this blind. I appreciate it.

      On the alignment issue - I was worrying about this early on. My best guess at the time of implementing my version of the memory pool was to do the following when calculating the header size:

      Source Code

      1. // I'm not exactly sure how alignment works, but I am moving the pointers to the next 8byte aligned address. I'm not sure if this is even useful
      2. const static size_t chunkAlignment = 8;
      3. const static size_t chunkRemainder = ((sizeof(SingleByte*)) % chunkAlignment);
      4. const static size_t CHUNK_HEADER_SIZE = (sizeof(SingleByte*)) + (chunkRemainder > 0 ? chunkAlignment - chunkRemainder : 0);


      From my earlier research the pointer returned by operator new or malloc "should be aligned for any type". I unfortunately don't remember where I read that, but it stuck in my brain. So my primitive guess was to take the largest alignment necessary and force the header to be that size. I have no idea if this is doing anything or complete nonsense. I also don't know if it would be good to use 4, 8, 16, or something else. What I do know is that it's wasteful, haha.

      I found the part that scared me off in Scott Meyer's "Effective STL" item 10 and 11. There's too much to type here but his basic message is "allocators are weird". That is actually the first sentence of item 10!

      Well cool, I will probably try to get this in on the weekend. Thanks Rez!

      The post was edited 1 time, last by DieselMaxPower ().