On this page:
Getting Started
How to Work on the Lab
Support Functions
Trace-based Driver Program
Programming Rules
Evaluation
Heap-Check Tips
More Tips

Malloc Assignment

This malloc assignment is based on the one by Bryant and O’Hallaron for Computer Systems: A Programmer’s Perspective, Third Edition

Due: Wednesday, November 24, 11:59pm

In this lab, you’ll write a dynamic storage allocator for C programs, i.e., your own version of the malloc and free functions. Your allocator will provide debugging facilities, somewhat like Valgrind, to help clients detect misuses of the allocator. Specifically, your library will provide mm_malloc and mm_free analogous to malloc and free, and it will provide mm_can_free and mm_check to let a client sanity-check its use of the library.

The mmap.zip lab material may be helpful as a set of warm-up exercises.

For example, assuming that mm_init has been called already, then

  char *p = mm_malloc(10);

  strcpy(p, "123456789");

  mm_free(p);

is a correct use of mm_malloc and mm_free, since the 10 bytes allocated by mm_malloc are enough to hold the string that is copied into the allocated memory, and then mm_free is called once on the allocated pointer.

The same client enriched with mm_check and mm_can_free calls should still run successfully:

  char *p = mm_malloc(10);

  strcpy(p, "123456789");

  if (!mm_check()) abort(); /* client is well behaved, so no abort */

  if (!mm_can_free(p)) abort(); /* p can be freed, so no abort */

  mm_free(p);

In contrast, the following program might abort, because it writes more bytes into p than the space allocated, and so mm_check might return 0:

  char *p = mm_malloc(5);

  strcpy(p, "123456789"); /* might crash, but if not... */

  if (!mm_check()) abort(); /* client is misbehaved, so might abort */

  mm_free(p); /* if we get here, must not crash */

The mm_check function is not obligated to return 0, because it is not obligated to detect all possible misbehavior by clients. In other words, mm_check is meant to be a debugging aid, not a safety enforcer.

However, if mm_check returns 1, then the mm_free call afterward must succeed without crashing. Additional correct uses of mm_malloc and mm_free must all work, too, if the client is well-behaved thereafter. While the mm_check function is not required to detect arbitrary misbehavior by a client, it is obligated to detect any misbehavior that would cause the allocator to later crash with a segmentation fault, produce a bad allocation, or incorrectly reject deallocation of newly allocated memory.

The following program, which uses the mm_can_free function, must abort:

  char *p = mm_malloc(5);

  mm_free(p);

  if (!mm_can_free(p)) abort(); /* p already freed, so must abort */

The mm_can_free function must return 1 only for an argument that corresponds to a value returned by mm_malloc and not yet passed to mm_free. As a corollary, mm_can_free must return 0 for a pointer that was previously (and correctly) freed when no further calls to mm_malloc have been made—but freed memory might be recycled by mm_malloc, so the freed address may become allocated again after a call to mm_malloc. Note that multiple calls to mm_free can happen before a pointer is checked with mm_can_free, and mm_can_free must certainly return 0 for any of the freed addresses until mm_malloc is called again:

  char *p1 = mm_malloc(5);

  char *p2 = mm_malloc(5);

  mm_free(p1);

  mm_free(p2);

  if (!mm_can_free(p1)) abort();  /* p1 already freed, so must abort */

Although mm_can_free has specific obligations when the client has not misbehaved, it is still a debugging aid like mm_check: it is not required to detect potential misuses of mm_free after misbehavior. At the same time, if mm_check returns 1 and mm_can_free returns 1 for a given address, then calling mm_free on the address must work without crashing and must leave the allocator in a state where mm_check succeeds.

  char *p1 = mm_malloc(5);

  char *p2;

  mm_free(p1);

  p2 = mm_malloc(10);

  p2[100] = 0; /* misbehavior! */

  if (!mm_check()) abort(); /* may or may not abort */

  if (!mm_can_free(p1)) abort(); /* may or may not abort */

  mm_free(p1); /* must not crash if program gets here */

  if (!mm_check()) abort(); /* must not abort, either */

Client misbehavior is not restricted to writing just beyond the end of an allocated pointer. The following program includes arbitrary misbehavior:

  char *p1 = mm_malloc(5);

  mm_free(p1); /* ok */

  mm_free(p1); /* might crash, but if not... */

  p1[random() % (1<<14)] = 1; /* might crash, but if not... */

  p1[-(random() % (1<<14))] = 2; /* might crash, but if not... */

  if (!mm_check()) abort(); /* may or may not not abort */

  if (!mm_can_free(p1)) abort(); /* may or may not not abort */

  mm_free(p1); /* must not crash if program gets here */

  p1 = mm_malloc(10); /* must not crash, either */

When testing your allocator’s mm_check and mm_can_free, we’ll limit misbehavior to random writing in pages that are mapped by the allocator, so the client won’t crash directly or mangle global variables.

The mm_check and mm_can_free functions do not need to be fast, and so the best strategy will be to scan the entire heap and make sure it is well-formed and that the pointer to free (in the case of mm_can_free) is consistent with that state. Not coincidentally, that strategy will also help you make the allocator work correctly for well-behaved clients, since checking the heap structure can detect bugs in the allocator, too.

The mm_malloc and mm_free functions are meant to be reasonably fast and have low space overhead. While the bulk of your grade for this assignment will be based on correctness of the implementation, some points depend on performance. You are encouraged to explore the design space creatively and implement an allocator that is correct, space-efficient, and fast.

Getting Started

Start by unpacking malloclab-handout.zip. The only file you will be modifying and handing in is "mm.c". The "mdriver.c" program is a driver program that allows you to evaluate the performance of your solution. Use make to generate the driver code and run it as

  $ ./mdriver

See Trace-based Driver Program for information about command-line flags to mdriver.

When you have completed the lab, you will hand in only one file, "mm.c", which contains your solution.

How to Work on the Lab

Your dynamic storage allocator will consist of the following five functions, which are declared in "mm.h" and defined in "mm.c":

  int   mm_init(void);

  void *mm_malloc(size_t size);

  void  mm_free(void *ptr);

  

  int   mm_check(void);

  int   mm_can_free(void *ptr);

The "mm.c" file that we have given you implements the simplest but still functionally correct malloc implementation that we could think of, except that it does not properly implement mm_can_free. Using this as a starting place, modify these functions (and possibly define other private, static functions), so that they obey the following semantics:

Beyond correctness, your goal is to produce an allocator that performs well in time and space. That is, the mm_malloc and mm_free functions should work as quickly as possible, and the total amount of memory used by your allocator should stay as close as possible to the amount of memory needed to hold the payload of mm_malloc calls not yet balanced by mm_free calls.

Support Functions

The "memlib.c" module provides a wrapper on the operating system’s virtual-memory system. Your allocator will need to use these functions to obtain pages of memory that it can allocate from. When mdriver runs your allocator, it resets the memory system (i.e., frees all pages) before calling mm_init to start a new benchmark run.

You can invoke the following functions from memlib.c:

While your allocator will obviously need to call mem_map to obtain memory for allocation, you can improve space-efficiency for this assignment using mem_unmap to avoid retaining pages that are not needed as allocated blocks are freed.

The mem_is_mapped function is useful for implementing mm_check and mm_can_free, since those functions may need to check whether an arbitrary address can be safely read.

You can assume that your allocator is the only part of a program that calls mem_map and mem_unmap, except that all pages are reset before mm_init is called.

Trace-based Driver Program

The driver program "mdriver.c" tests your "mm.c" implementation for correctness, defensiveness, space utilization, and throughput:

The driver program is controlled by a set of trace files that are included in the malloclab-handout.zip archive. Each trace file contains a sequence of allocate, free, and reallocate (i.e., allocate plus free) directions that instruct the driver to call your mm_malloc and mm_free functions in some sequence. The driver and the trace files are the same ones we will use when we grade your handin "mm.c" file, although we may change the order in which the traces are used.

The provided makefile combines "mdriver.c" with your "mm.c" to create mdriver, which accepts the following command line arguments:

We will test your program using a flag like -r 1000 to try many different instances of random chaos for the defensiveness pass.

You may find it useful to see the shape of memory use created by a trace file. Run

  $ racket plot.rkt tracefile

to see a plot of allocated memory over time for tracefile.

Programming Rules
Evaluation

Your grade will be calculated as follows:

Heap-Check Tips

Dynamic memory allocators are notoriously tricky beasts to program correctly and efficiently. They are difficult to program correctly, because they involve a lot of untyped pointer manipulation. Even if the assignment did not require mm_check and mm_can_free, you would find it very helpful to write a heap checker that scans the heap and checks it for consistency.

Some examples of what a heap checker might check are:

Since a misbehaved client can corrupt memory pages used by the allocator in arbitrary ways, you cannot assume that pointers set up by your allocator will remain valid when mm_check is called. Before dereferencing a pointer, make sure that the dereference will succeed by using mem_is_mapped. Since mem_is_mapped works at the level of pages, you’ll probably find it best to implement a ptr_is_mapped helper function that takes an address and size and makes sure that the address range is on mapped pages:

  /* rounds up to the nearest multiple of mem_pagesize() */

  #define PAGE_ALIGN(sz) (((sz) + (mem_pagesize()-1)) & ~(mem_pagesize()-1))

  

  /* rounds down to the nearest multiple of mem_pagesize() */

  #define ADDRESS_PAGE_START(p) ((void *)(((size_t)p) & ~(mem_pagesize()-1)))

  

  int ptr_is_mapped(void *p, size_t len) {

    void *s = ADDRESS_PAGE_START(p);

    return mem_is_mapped(s, PAGE_ALIGN((p + len) - s));

  }

A heap checker that scans all allocated pages once in linear time will be fast enough. Beware that a heap checker that is somehow quadratic in the size of allocated pages will be impractically slow. A heap checker that attempts to take a snapshot of the heap state and compare it later is unlikely to work, since the snapshot data will have to be recorded in an allocated page, and a chaotic client can modify the snapshot, too.

Random chaos to check your implementation’s defensiveness will modify only mapped pages, and not your allocator’s global variables—but that doesn’t help much, since the programming rules constrain your allocator’s use of global variables.

More Tips