Play by Play: Effective Memory Management
By Brian Hixon, Daniel Martin, Rob Moore, Greg Schaefer, and Richard Wifall
Gamasutra, August 2, 2002
URL: http://www.gamasutra.com/features/20020802/hixon_01.htm
Back before Al Gore invented the Internet, back when 64KB was more memory than any computer would ever need, there was a time when memory managers didn’t exist. But gradually, new computer systems came out with larger amounts of memory (Figure 1). Game designers discovered ways to eat up RAM faster than any system could dish it out. It became obvious that managing all this data with a simple static memory map was too much work. Programmers needed a way to create memory maps that could change dynamically at run time depending on the state of the program. Thus, the memory manager was born.
For an excellent introduction to how a basic memory manager works, check out www.memorymanagement.org/articles/begin.html. This article will assume that basic knowledge and focus instead on things to think about when writing your own memory manager, and point out some of the problems you may encounter in using one. This discussion is based on our experiences in writing and rewriting the memory manager for Madden NFL 97 through Madden NFL 2002. This article is also slanted toward the C language as written for console game machines. The same principles apply to C++ and Windows programming, but the details are likely different.
Growing Your Own
So you know that memory managers are great, but what’s wrong with just using malloc and free? They are part of the standard C
library, so they should be the best, right? One major problem is that you don’t
really know what they are doing. If you write your own memory manager, you can
generate statistics, add debug code, and add advanced features that might not be
found in the default memory manager — features like handles, heap compaction,
and virtual memory. The ultimate weapon of game programmers is context. You know
the needs of your game and can tailor the memory manager accordingly. Developing
a successful memory manager requires addressing five different issues: ease of
use, performance, memory overhead, fragmentation control, and debugging
support. Unfortunately, these attributes are all interconnected, and optimizing
one can result in substandard results for another. The memory manager designer
must govern a delicate balancing act.
At a lower level, memory manager design also requires paying attention to platform-specific requirements. In addition, it may be possible to utilize hardware support to assist the memory manager.
Ease of Use
With
respect to a memory manager, the ease of use consideration really comes down to
a single question: Should the memory manager support pointers, handles, or both?
When designing a memory manager and dealing with the problem of fragmentation,
the use of handles can be very appealing. Unfortunately, while handles provide a
straightforward solution to the fragmentation problem, they are much more
difficult to use than pointers. A memory manager that supports only handles is
essentially pushing its internal complexity back onto the user.
While
supporting both handles and pointers is possible, the resulting memory manager
is more complicated than one that supports a single method. Madden used
to support both handles and pointers until an analysis showed that pointers
were being used 99 percent of the time. Not surprisingly, when given a choice,
programmers used pointers, since they were the easiest solution. Therefore, we
simplified the latest Madden memory manager by removing handle support
and optimizing the pointer support.
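To make the distinction concrete, the sketch below contrasts a pointer-style interface with a handle-style one. The function and type names (MemAlloc, MemHandleAlloc, MemLock, and so on) are illustrative assumptions, not the actual Madden API.

```c
/* Hypothetical interfaces; names are illustrative, not the Madden API. */
#include <stddef.h>

/* Pointer-based: the caller gets a stable address it can keep. */
void *MemAlloc(size_t size);
void  MemFree(void *ptr);

/* Handle-based: the caller gets an opaque token and must lock it to
 * obtain a (temporary) address, which lets the manager move the
 * underlying block during heap compaction. */
typedef struct MemHandle_s *MemHandle;

MemHandle MemHandleAlloc(size_t size);
void     *MemLock(MemHandle h);    /* pin the block, return its address  */
void      MemUnlock(MemHandle h);  /* the address may become invalid again */
void      MemHandleFree(MemHandle h);
```

The extra lock/unlock discipline is exactly the complexity that gets pushed onto the user when a memory manager supports only handles.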
Performance
Performance must be
addressed both in terms of speed and consistency. A memory manager that is fast
most of the time but slows down dramatically at unpredictable times is
unacceptable in a gaming environment. Therefore it is important to understand
the issues that contribute to performance. From the memory manager user’s point
of view, two operations will impact the game: allocations and recycling.
Allocation performance is determined by allocation policy, free list
management, and the use of garbage collection. The most popular allocation
policies are best fit, first fit, and next fit. By organizing the free blocks as
a linked list, best fit has consistent O(n) performance, while first fit
and next fit have worst-case O(n) and average-case O(n/2). By
organizing the free blocks as a size-sorted binary tree, best fit has worst-case
O(n) and average-case O(log n). By organizing the free
blocks as a balanced binary tree, best fit has consistent O(log n)
performance. Garbage collection applies only to memory managers with handle
support, and generally involves a fairly significant performance penalty when it
occurs, as it essentially makes a copy of the entire memory heap during
compaction.
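To make the cost differences concrete, here is a minimal first-fit scan over a singly linked free list. The FreeBlock structure and function name are assumptions for the sketch, not taken from the Madden code.

```c
#include <stddef.h>

/* Hypothetical free-list node; a real header would carry more state. */
typedef struct FreeBlock {
    size_t            size;
    struct FreeBlock *next;
} FreeBlock;

/* First fit: walk the list and take the first block that is large enough.
 * Worst case O(n). Best fit over the same list must always visit every
 * block to be sure it found the tightest match, hence consistent O(n). */
FreeBlock *first_fit(FreeBlock *free_list, size_t size)
{
    for (FreeBlock *b = free_list; b != NULL; b = b->next) {
        if (b->size >= size)
            return b;
    }
    return NULL; /* allocation failure */
}
```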
Figure 1: Main memory growth for console game machines.
Recycling
performance is based on free list management and the use of free block
coalescing. Free block coalescing requires the ability to locate the blocks
immediately preceding and following any arbitrary block. Some memory
structures, such as boundary tag (see Donald Knuth’s Fundamental
Algorithms under References for more information on the boundary tag), allow
this in O(1) time, while those that don’t require an O(n) scan of the
free block list. The current Madden memory manager uses boundary tags to
allow O(1) access to previous/subsequent blocks and organizes the free list as
an unbalanced binary tree. The result is allocation and recycling performance
both in the O(log n) to O(n) range.
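The sketch below shows one way a 16-byte boundary-tag header can give O(1) access to a block's physical neighbors. The field names and layout are assumptions for illustration, not the published Madden structures.

```c
#include <stdint.h>

/* Hypothetical 16-byte boundary-tag header; fields are illustrative. */
typedef struct BlockHeader {
    uint32_t prev_size;   /* size of the block immediately before this one */
    uint32_t size;        /* size of this block, including the header      */
    uint32_t flags;       /* e.g. allocated/free bit                       */
    uint32_t pad;         /* keep the header at 16 bytes                   */
} BlockHeader;

/* O(1) navigation to the physically adjacent blocks. */
static BlockHeader *next_block(BlockHeader *b)
{
    return (BlockHeader *)((uint8_t *)b + b->size);
}

static BlockHeader *prev_block(BlockHeader *b)
{
    return (BlockHeader *)((uint8_t *)b - b->prev_size);
}
```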
Memory Overhead
Memory overhead is
the cost we have to pay to manage memory, since each allocation will cost some
memory. Memory overhead is an important consideration, especially if the memory
manager will be handling a large number of objects. The first decision is
whether the memory map state should be stored internally or externally to the
memory being managed. If it is stored internally, then the maximum number of
allocated blocks does not need to be known in advance, but memory that is not
directly CPU-addressable cannot be managed. If it is stored externally, you must
know the maximum allocated and free blocks in advance and set aside memory for
this state, but address spaces that are not directly CPU-addressable can be
managed.
Madden previously used external storage for the memory
state, but this required additional overhead because it was impossible to
predict accurately the maximum number of allocated and free blocks. We had to
include a “safety factor,” which turned directly into wasted memory. The new
memory manager uses internal storage as shown in Figure 2, thus avoiding the
entire issue. All allocations are 16-byte aligned and each block has 16-byte
overhead. By limiting allocations to a minimum of 16 bytes, every block is
guaranteed to have this much storage available. Therefore, when an allocated
block is released, those 16 bytes can be used to organize the block into the
free list.
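A sketch of that arrangement follows: requests are rounded up to the 16-byte granularity with a minimum size of 16 bytes, and a freed block's own payload is reused to hold its free-list links. The macro and structure names are assumptions, not the published Madden layout.

```c
#include <stddef.h>

/* Round a request up to the 16-byte allocation granularity. */
#define ALIGN16(n)  (((n) + 15u) & ~(size_t)15u)
#define MIN_ALLOC   16u

static size_t adjust_request(size_t n)
{
    if (n < MIN_ALLOC)
        n = MIN_ALLOC;          /* guarantee room for free-list links */
    return ALIGN16(n);
}

/* When a block is released, its (at least) 16 payload bytes can hold the
 * links that place it into the free structure, so the free list costs no
 * memory beyond the per-block header. Hypothetical link layout. */
typedef struct FreeLinks {
    struct FreeLinks *left;     /* e.g. children in a size-sorted tree */
    struct FreeLinks *right;
} FreeLinks;
```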
It is worthwhile to digress slightly and consider the
management of non-CPU-addressable memory. Because consoles are designed for low
cost and high performance, they sometimes incorporate special-purpose memory
that is not directly CPU-addressable. For example, the Gamecube has 16MB of
auxiliary memory (ARAM) that supports only DMA access. A memory manager that
stores its state internal to the memory it is managing cannot be used in such
cases, while a memory manager that stores its state externally can. While it may
seem appealing to use external state storage in order to support all kinds of
memory, our experience with Madden has shown this to be a mistake. Memory
that is not directly CPU-addressable is normally used only for special types of
objects, due to the complexity of using the memory, and often contains only a
small number of objects. Therefore, Madden now uses a single,
general-purpose internal storage memory manager for all CPU-addressable memory
and an additional, customized external storage memory manager for any
special-purpose case, such as the Gamecube’s ARAM.
Fragmentation Control
Figure 2: Memory structure.
Memory
fragmentation is a condition that occurs over time as memory is allocated and
released and isolated free blocks form. This is not usually an immediate
problem, but as more fragments form and they get smaller, the opportunity for an
allocation failure due to fragmentation increases. Eventually, if fragmentation
gets severe enough, an allocation may fail because no single free block is large
enough to accommodate the request even though the total amount of free memory is
sufficient (Figure 3). Therefore, a good memory manager needs to either take steps to limit
the amount of fragmentation that occurs or be able to consolidate fragmentation
periodically through garbage collection. An allocation failure is often a lethal
error for game code and must be avoided at all costs. While it is generally
fairly easy to determine the maximum memory required by an application,
fragmentation can make such a calculation meaningless. Allocation policy, free
block coalescing and multiple heaps all play a part in minimizing fragmentation.
Free memory coalescing is a straightforward technique for limiting
fragmentation that attempts to merge a newly released memory block with its
neighbors. If the block immediately preceding or following the newly released
block is also free, merging the blocks together results in a single, larger free
block. This particular technique is almost mandatory, as without it,
fragmentation occurs very quickly. In addition, this limits the size of the free
list to the minimum number of blocks, which generally has a positive performance
impact on allocation. Since this technique has no impact on how the memory
manager is used externally, it is incorporated into most memory manager
designs.
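A sketch of coalescing on release, building on the hypothetical BlockHeader, next_block, and prev_block sketched earlier (again an illustration, not the actual Madden code): when a block is freed, its physical neighbors are checked and merged if they are also free.

```c
/* Assumed free-list helpers, declared here for the sketch. */
void remove_from_free_list(BlockHeader *b);
void add_to_free_list(BlockHeader *b);

enum { BLOCK_FREE = 1 };

void release_block(BlockHeader *b)
{
    BlockHeader *next = next_block(b);
    BlockHeader *prev = prev_block(b);

    b->flags |= BLOCK_FREE;

    if (next->flags & BLOCK_FREE) {        /* merge with the block after  */
        remove_from_free_list(next);
        b->size += next->size;
    }
    if (prev->flags & BLOCK_FREE) {        /* merge with the block before */
        remove_from_free_list(prev);
        prev->size += b->size;
        b = prev;
    }
    next_block(b)->prev_size = b->size;    /* keep the boundary tags in sync */
    add_to_free_list(b);
    /* Checks for the first/last block of the heap are omitted for brevity. */
}
```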
The choice of allocation algorithm has a definite impact on
fragmentation. Unfortunately, because the size and timing of allocations and
releases vary for every application and are often different from run to run of
an application, it is impossible to say that a particular algorithm is always
better or worse than another. However, in general, it has been found that “best
fit” algorithms tend to exhibit the least fragmentation. See Johnstone and
Wilson’s “The Memory Fragmentation Problem: Solved?” in References for more
information on the impact of allocation techniques on
fragmentation.
Figure 3: Although 128 free bytes are available, the largest possible allocation is only 64 bytes due to fragmentation.
A technique that
requires more involvement from the memory manager user is to allocate blocks
with similar lifetimes close to each other. If the lifetime of a memory block is
known when it is being allocated, the allocation algorithm can attempt to place
it by other blocks with a similar lifetime. When these blocks are released, they
will be merged into a single large free block. The problem is that this requires
the caller to specify both the lifetime of each block as well as the total size
of the group of blocks of similar lifetime. As a result, a simpler version of
this technique is usually implemented through the use of multiple heaps. By
allocating blocks with similar lifetimes within their own heap, a similar effect
is achieved, though there are generally practical limitations on the number of
heaps that can be effectively utilized.
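A minimal sketch of the multiple-heap idea (the heap_create/heap_alloc names are placeholders, not a real console SDK or the Madden API): objects that live exactly as long as a level come from a dedicated level heap, so releasing them cannot fragment the persistent heap.

```c
#include <stddef.h>

/* Hypothetical multi-heap interface; names are illustrative only. */
typedef struct Heap Heap;

Heap *heap_create(void *base, size_t size);  /* carve a heap from raw memory  */
void *heap_alloc(Heap *h, size_t size);
void  heap_free(Heap *h, void *ptr);
void  heap_destroy(Heap *h);                 /* releases every block at once  */

/* Usage sketch: level-lifetime allocations are grouped in one heap. */
void load_level(Heap *level_heap)
{
    void *nav_mesh = heap_alloc(level_heap, 64 * 1024);
    void *crowd    = heap_alloc(level_heap, 32 * 1024);
    (void)nav_mesh; (void)crowd;
}
```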
Debugging Support
One of the main benefits of writing a memory manager is the
ability to add capabilities that are not provided in the supplied memory manager.
By adding some debugging features you can make sure that you are managing the
memory rather than stepping in the heap. To avoid stepping in the heap, you’ll
need to be able to see the heap. Adding information and tools to help visualize
memory usage can be a big help when managing memory. This can be done in a
number of ways.
The debugging
techniques used by the Madden memory manager include the ability to audit
memory usage, collect usage statistics, check for consistency, and follow the
memory map from a debugger. Auditing is really nothing more than being able to
mark a group of blocks when they are allocated and later check to see if they
were all released. Usage statistics, such as maximum number of allocated and
free blocks, as well as maximum allocated memory, are valuable for evaluating
the memory utilization of the application.
Consistency checking goes
through the memory map and makes sure there are no detectable errors in it. This
means verifying block sizes, checking boundary tags, ensuring matching block
counts, and so on. By performing consistency checking prior to every allocation
and release (in a debug build), certain forms of memory corruption can be
detected quickly. Also, the memory map contains ASCII debugging information that
can be viewed from within the debugger. Prefixing every allocated block with a
text string indicating the module and line number of the caller that allocated
the memory greatly assists when debugging after a crash. Naming the memory
blocks is a necessity. Imagine going to a store where every item was in a plain
box that only had its UPC code and box size printed on it.
Finding the
item you wanted to buy would be a big challenge. Likewise, if you have a list of
memory blocks that only contains their address and size, locating memory blocks
is going to be difficult unless you give those memory blocks names. Using the
source file name and line number where the allocation occurred would be a good
start. In C, this can be accomplished quite easily through the use of macros and
the C preprocessor. Now you can recognize your blocks in a crowd, but it sure
would be nice to know who they hang out with. By adding a group label or flags,
you can group your allocations at the conceptual level rather than being limited
to grouping your blocks by source file. This way you can know that a memory
block that was allocated is really being used by your main character, even
though the actual memory allocation occurred in your texture
library.
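Here is one way to capture the file, line, and a group label with the preprocessor. The mem_alloc_named entry point and macro names are invented for the sketch, not the actual Madden functions.

```c
#include <stddef.h>

/* Hypothetical tagged-allocation entry point; not the Madden API. */
void *mem_alloc_named(size_t size, const char *name, const char *group);

/* Stringize the line number so "file.c:225" can be stored in the block header. */
#define STR2(x) #x
#define STR(x)  STR2(x)

/* Every call site is tagged with its source location automatically;
 * the group label is supplied by the caller at the conceptual level. */
#define MEM_ALLOC(size, group) \
    mem_alloc_named((size), __FILE__ ":" STR(__LINE__), (group))

/* Example: the texture library allocating on behalf of the main character. */
/* void *skin = MEM_ALLOC(256 * 1024, "MainCharacter"); */
```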
Figure 4 shows an example memory dump as seen by a (slightly
drunk) debugger. Including debugging information in textual form within the
memory map allows you to make sense of it from the debugger. For example, if
you had an invalid address access at 0x007cfea4, you could find that address in
the debugger and page around it to see that it occurred in the Joystick Driver
library and that the memory in question was allocated by pJoyDrvr.c at line 225.
Figure 4: Memory blocks with debug information as viewed in the debugger.
With all this information available, you will need to find ways to view that information that can help you when managing your game. A linear list of memory blocks can be useful for spotting potential memory fragmentation, while a list of memory blocks sorted by name can be useful when you have memory leaks.
If you do find a
memory block that has leaked, you will know from its name exactly where the
allocation occurred. With group labels you can print out compact memory maps
that show how each conceptual module in your code is using resources. By
tracking this throughout the project, you can easily spot modules that are using
more or less memory than you had budgeted in your original design. You can also
create functions to check whether groups of memory allocations have been freed.
This can help prevent memory leaks if you know that in certain situations some
groups have no allocations.
Keeping track of worst-case situations is
also important. Have the memory manager save the lowest amount of free memory it
has ever encountered (update this every time an allocation occurs). If you are
using some sort of node system to keep track of your memory blocks, keep track
of the highest number of nodes that the memory manager has ever used so that you
know if you are close to running out of memory blocks. Sentinels added to the
beginning and end of memory allocations can help detect overflow situations that
would normally corrupt other data in memory. If the sentinels don’t check
correctly when the memory is freed, then some code has been bad.
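A sketch of the sentinel idea (the guard value and helper names are assumptions for illustration): a known word is written just before and just after the payload at allocation time and verified when the block is freed.

```c
#include <stdint.h>
#include <assert.h>

#define SENTINEL 0xDEADBEEFu   /* hypothetical guard value */

/* Write guard words on either side of the payload at allocation time. */
static void stamp_sentinels(uint32_t *head_guard, uint32_t *tail_guard)
{
    *head_guard = SENTINEL;
    *tail_guard = SENTINEL;
}

/* Check them when the block is freed; a mismatch means some code wrote
 * outside the bounds of its allocation. */
static void check_sentinels(const uint32_t *head_guard, const uint32_t *tail_guard)
{
    assert(*head_guard == SENTINEL && "underflow: bytes before the block were overwritten");
    assert(*tail_guard == SENTINEL && "overflow: bytes after the block were overwritten");
}
```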
Filling
memory with recognizable patterns can be extremely useful. We use a different
pattern for unallocated, allocated, and freed memory. This way, we can tell in
the debugger at a glance what the situation of a particular piece of memory is.
When someone forgets to initialize memory they allocated properly, the
“allocated” pattern shows up.
You can also have functions that scan
free/unallocated memory and make sure that it all still matches the prespecified
patterns. If it doesn’t match, some code out there is incorrectly writing to
memory locations that it doesn’t own. Finally, make sure that you set up an
extra heap for debug memory and put all this extra debug information there. You
want your game memory to be as similar as possible between a debug and final
build.
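For example (the byte values below are arbitrary choices for the sketch, not the actual Madden patterns):

```c
#include <stddef.h>

/* Hypothetical fill patterns; any recognizable bytes will do. */
#define FILL_UNALLOCATED 0xAA   /* never handed out                   */
#define FILL_ALLOCATED   0xCD   /* handed out but not yet initialized */
#define FILL_FREED       0xDD   /* released back to the manager       */

/* Verify that a region still carries its expected fill pattern;
 * a mismatch means some code wrote to memory it does not own. */
static int pattern_intact(const unsigned char *p, size_t n, unsigned char fill)
{
    for (size_t i = 0; i < n; i++) {
        if (p[i] != fill)
            return 0;
    }
    return 1;
}
```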
Platform Specifics
A memory manager presents a logical view of memory in the same
way that a file system provides a logical view of a disk drive. Most often,
memory managers are concerned with managing dynamic RAM of some sort. Some
console makers like to make things more interesting by providing a relatively
large main memory but also scattering other smaller chunks of RAM around the
system.
The memory manager
allows us to abstract away the physical details of the memory subsystem and deal
instead with a nice, logical model of memory. For example, on the PS2 we don’t
necessarily need to know that the first megabyte of RAM is reserved for the
operating system. It’s enough that the memory manager knows. By abstracting away
some of the details of the physical memory system, our game can become more
platform independent.
Most console hardware has alignment requirements
that, not so surprisingly, differ from platform to platform. Many platforms
require anywhere from 4- to 64-byte alignment for buffers in graphics rendering
or file IO. Each type of hardware might need the memory manager to be tweaked to
better fit the needs and abilities of the platform. Often this information can
be passed to the memory manager using a heap setup structure.
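A sketch of such a setup structure (the field names are invented for illustration, not a particular console SDK):

```c
#include <stddef.h>

/* Hypothetical per-platform heap configuration. */
typedef struct HeapSetup {
    void  *base;        /* start of the memory this heap will manage          */
    size_t size;        /* size of that region in bytes                        */
    size_t alignment;   /* e.g. 64 for DMA/rendering buffers, 16 otherwise     */
    size_t min_alloc;   /* smallest request the heap will accept               */
} HeapSetup;
```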
Finally,
you should be wary of third-party libraries that may use malloc and free,
effectively bypassing your memory manager’s interface. The printf function in
the PS2 standard C library uses malloc and free; our solution was to write our
own printf function.
Hardware Support
On most modern computers, the issue of fragmentation has been
greatly reduced by the use of a hardware-based paged memory management unit (PMMU).
Obviously, the fact that virtual memory provides an application with
lots of addressable memory means that even an inefficient memory manager can be
used. However, the more interesting point is that the PMMU without any secondary
storage can dramatically help with fragmentation. The PMMU takes a very large
address space (larger than the physical RAM it represents) and maps it onto
physical memory. Obviously, this mapping is not one-to-one, but rather it maps a
subset of the memory space onto a “working set” of memory pages.
The key
impact of using a PMMU in terms of fragmentation is that when a memory block is
released, any portion of that block completely spanning one or more PMMU pages
can be remapped by the PMMU. The result is actually two forms of fragmentation:
address-space fragmentation and memory fragmentation. While this effect might
seem to make a bad problem worse, it actually simplifies things. Because the
PMMU provides a large address space, the address-space fragmentation can be
largely ignored. Instead, the allocation algorithm concentrates on minimizing
memory fragmentation by first attempting to utilize the unused portions of
allocated memory pages (pages that contain some allocation but are not
completely full) before remapping unused memory pages back into
memory.
Managing Your Memory Manager
Memory managers are like most real-life managers. You have to
keep them under control, or else they tend to wander off without adequate
information and make strange decisions. Your memory manager sometimes needs your
help to stay on the right track too. Let’s look at some common problems that can
occur and how to work around them.
Fragged again. One of the major side
effects of hand grenades and memory managers is fragmentation. Previously, we
discussed fragmentation in the context of designing the memory manager to avoid
or reduce the effects of fragmentation. However, there are also some
application- level techniques that can reduce fragmentation.
The use of
static allocation (memory defined within a module with storage allocated within
the application image) avoids the memory manager completely and thus avoids
fragmentation. Of course, this usage really only makes sense when a single
object will be represented and when the lifetime of that object is approximately
the duration of the entire application. In such cases, static allocation can
provide a benefit by limiting utilization of the dynamic memory manager to those
memory objects that are truly dynamic.
Another strategy that relies
entirely on the memory manager user is to perform submanagement of memory based
on specific program knowledge. For example, if a module knows that it needs to
allocate X objects of size Y, it may be far more efficient for the user to
allocate a single block of X * Y bytes and perform its own management within
that larger block. By reducing the number of blocks that the memory manager has
to deal with, the user has generally made the memory manager's job easier. There is, of course, a caveat. Depending
on the amount of fragmentation and the size of X * Y, it is possible that the
application could find itself in the situation where an allocation of X * Y
fails due to fragmentation, whereas X allocations of Y would have succeeded. We
also try to discourage this practice when possible, as there is code-maintenance
overhead.
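A sketch of this kind of submanagement as a simple fixed-size pool (illustrative code; count and object_size correspond to X and Y, and malloc stands in for your own allocator):

```c
#include <stddef.h>
#include <stdlib.h>

/* One allocation of count * object_size bytes, sub-managed as a
 * free list of fixed-size slots. Illustrative only. */
typedef struct Pool {
    void  *storage;    /* the single large block from the memory manager  */
    void  *free_head;  /* singly linked list threaded through free slots  */
    size_t object_size;
} Pool;

static int pool_init(Pool *p, size_t count, size_t object_size)
{
    if (object_size < sizeof(void *))
        object_size = sizeof(void *);          /* room for the free link */
    p->storage = malloc(count * object_size);  /* stands in for MemAlloc */
    if (p->storage == NULL)
        return 0;
    p->object_size = object_size;
    p->free_head = NULL;
    for (size_t i = 0; i < count; i++) {       /* thread the free list */
        void *slot = (char *)p->storage + i * object_size;
        *(void **)slot = p->free_head;
        p->free_head = slot;
    }
    return 1;
}

static void *pool_alloc(Pool *p)
{
    void *slot = p->free_head;
    if (slot != NULL)
        p->free_head = *(void **)slot;
    return slot;
}

static void pool_free(Pool *p, void *slot)
{
    *(void **)slot = p->free_head;
    p->free_head = slot;
}
```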
One way to help avoid memory fragmentation is to always free
memory in the opposite order from which it was allocated. In fact, if you were
always able to allocate and release in such an order, you would never have
memory fragmentation. Realistically, it’s not possible to follow this practice
all the time, but doing it as much as possible will help. Memory fragmentation
is going to occur, and at some point you will probably run into a situation
where it causes a problem in your game. You might have fragmentation that is
occurring in one part of your game that is causing a memory allocation to fail
in a totally unrelated area. Fragmentation might even manifest itself as a
situation where your game will fail only when arriving at part of your game
through a specific path (that causes the fragmentation). If you don’t have some
advanced form of garbage collection, you are going to have to use other, more
crude methods to limit this problem. One possibility is to change the code that
is causing the fragmentation to use a different allocator so that it doesn’t
cause fragmentation. A common way to accomplish this is to have an allocator
that allocates from the other end of memory.
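A sketch of the "allocate from the other end" idea as a two-ended bump region (illustrative only; a real allocator would need per-block headers so individual blocks can be freed):

```c
#include <stddef.h>

/* Hypothetical two-ended region: long-lived data grows from the bottom,
 * short-lived or fragmentation-prone data grows from the top. */
typedef struct Region {
    char *lo;   /* next free byte from the bottom         */
    char *hi;   /* one past the last free byte at the top */
} Region;

static void *alloc_low(Region *r, size_t n)
{
    if ((size_t)(r->hi - r->lo) < n)
        return NULL;
    void *p = r->lo;
    r->lo += n;
    return p;
}

static void *alloc_high(Region *r, size_t n)
{
    if ((size_t)(r->hi - r->lo) < n)
        return NULL;
    r->hi -= n;
    return r->hi;
}
```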
Depending on your game,
fragmentation can become more problematic over time (especially if your
allocations don’t have close locality of lifespan). You can use brute force to
minimize these effects, such as shutting down and restarting code modules
between levels as a brute-force garbage collection technique.
Release builds. When running a release build, there isn’t any debugging information in
the game, as it will consume extra memory. But you still need a way to know
where the game runs out of memory. For Madden, we assume the game will fit in memory. If
the game does run out, it will crash and optionally dump some memory info. With
a special build, we display some partial information about the memory manager
and the game state so that we can determine if it ran out of memory because of
fragmentation or other reasons.
Getting on the Field
When talking about memory management, programmers often resort
to words more often associated with cow pastures than games. Terms like heaps,
stacks, and chunks are thrown around like flies buzzing around you-know-what. To
see how important a memory manager is to a game, you have to get past the
abstract poo. A good memory manager allows you to have more animations, more
characters, more textures, more sounds — in short, more of everything that your
game-playing customers love.
In this article we have described some of
the issues that may come up in writing and using your own memory manager. After
years of writing and rewriting memory managers for the Madden series, one
piece of advice that bears highlighting is simply to make sure you schedule
adequate design time on this very important piece of your system; you’ll be glad
you did.
References
The Memory Management Reference Beginner’s Guide: www.memorymanagement.org/articles/begin.html
Johnstone,
Mark S., and Paul R. Wilson. “The Memory Fragmentation Problem: Solved?” In
Proceedings of the International Symposium on Memory Management. ACM Press, 1998.
pp. 26–36. www.cs.utexas.edu/users/wilson/papers/fragsolved.pdf
Knuth,
Donald E. The Art of Computer Programming, vol. 1, Fundamental
Algorithms. Reading, Mass.: Addison-Wesley, 1973. pp. 435–444.
Copyright © 2003 CMP Media Inc. All rights reserved.