Life Without Malloc

Introduction

Many programmers don’t like working in low level environments. Memory management certainly contributes to this feeling. In most high level languages, allocating memory is extremely easy (e.g. “new”) and freeing memory isn’t a consideration since the language is garbage collected. Unfortunately, these conveniences come at a cost of performance in both apparent and subtle ways. Trying to mimic dynamically allocating memory in low level languages often leads to complex webs of memory that needs to be freed in a complicated order.       

What order should this graph be freed in?

I argue that dynamically allocating everything little thing using malloc or another general purpose heap allocator is not the route to take. There are plenty of alternative allocators each with their own benefits and tradeoffs compared to malloc. Usually the benefits are simplicity and performance. The downsides include less flexibility, which might seem bad but you may find that the flexibility of malloc isn’t necessary. In order to figure out how to make memory management easier, we need to assess why malloc exists in the first place.

Lifetimes

The lifetime of a particular chunk of memory is the time between its first allocation, and its respective release. By using malloc everytime you need some memory and freeing once you don’t need it anymore, you are generating a different lifetime for each allocation. Keeping track of every allocation you’ve made increases program complexity and contributes to the disdain of low level memory management as the number of allocations increases.

Many overlapping lifetimes can be confusing.

Understanding how your memory is structured and finding commonalities between sets of allocations and releases is important for identifying common lifetimes. Figuring out the lifetimes of your memory is crucial when deciding which type of allocator to use. For instance, you might find a specific set of memory is needed for the entire time the application is running, or that some memory is needed for a file upload and is all released once that completes.

The Stack

The call stack is a part of memory managed by the compiler so the programmer doesn’t need to think about it the way they think about heap memory. The stack –generally– works by allocating all the memory needed for a function when it is called. When the function returns, all the memory is released at once. Because all the variables for the function are allocated and freed at the same time, they all have the same lifetime. This makes the local variables both simple to think about and fast to allocate and release. Call stack memory can also be dynamically allocated such as with _alloca, which can be very useful. 

The call stack is an example of a stack allocator because you can only allocate and release from the top of the stack. This idea of a stack allocator, however, is not only for the call stack. It is possible to implement your own stack allocator by mimicking the functionality of the call stack. It all starts with allocating a large chunk of memory using something like VirtualAlloc or mmap. Now you just need to keep track of the top of your stack which starts at the beginning of your newly allocated memory. Allocating memory in the stack just involves returning the top of the stack, and moving it by the amount of memory requested. Releasing memory works by simply setting the head of the stack to the memory pointer passed into the function.

Treating memory as an array allows you to make your own stack.

Arena

Stack allocators may be relatable but not always that useful. One allocator that is very useful is actually a simpler version a stack allocator called an arena allocator. The difference is that the only way to release memory in an arena is by releasing all of it at the same time. This might seem restrictive but thats where the idea of lifetimes comes in. 

In a game I’m working on, I have an arena that represents memory useful for just the current frame. It is used to communicate temporary information such as which entities the player is colliding with. At the end of every frame, this arena is cleared because the information it holds is out of date. This makes the arena available to be used for the next frame. The benefits of using this arena is that I can easily, and quickly, dynamically allocate memory as long as I know it is only pertinent for one frame. I can use the arena without worrying about releasing the memory manually as I know the game system that I wrote does that automatically at the end of each frame.

Free List

Sometimes you do have a bunch of memory allocations with different lifetimes. Arena allocators aren’t suited for this situation as everything in the arena is supposed to have the same lifetime. Free lists can help alleviate this issue by allowing you to dynamically allocate and release memory just like malloc but much faster. This comes with the caveat that all your allocations need to be of the same size if you don’t want any internal fragmentation. If you have multiple sizes of allocations, you can just pretend that you are always allocating the max size.

The way a free list works is by allocating a large chunk of memory just like the other two allocators. Then you specify a block size, such as the size of a struct, and split the chunk of memory into pieces of that block size. Each block starts with a pointer to the next block to form a linked list and in the free list header we store a pointer to the start of the linked list. In order to allocate memory, we just return the start of the free list and make it equal the the next element in the list. Releasing memory works by treating the newly released memory as a node in the free list, setting its next node to the start of the free list, then setting the start of the free list to the released node. It basically works like a stack of free memory slots implemented with a linked list.

Linked lists aren’t that bad after all.

Conclusion

Memory can be a little tricky if you only use general purpose heap allocators, which are designed to be able to handle any situation. Analyzing the patterns of your memory accesses can simplify your program as well as provide a slight performance boost through use of different allocators. I was able to identify four common lifetimes in a game I’m making and an arena is used for each. For the instances where I need something a little more flexible, I use a free list like for the nodes in a red-black tree. See if you can fit these allocators into your program like when you have to release a bunch of memory at the same time or if you’re allocating a bunch of objects of the same type.

Previous
Previous

Beyond Dijkstra’s

Next
Next

Water Simulation in the Blink of an Eye