Dynamic memory allocation; linked lists
Goals
- To understand that
malloc
andfree
allocate and de-allocate memory from the heap. - To use dynamic memory to build a linked list.
Terminal script from today’s class.
Background
We’ll be working with C code that builds and manipulates linked lists. You learned about linked lists in CS10, and may want to review the CS10 linked-list notes and implementation:
- slides in pdf or Powerpoint
- SinglyLinked.java
- SimpleList.java
Dynamic memory allocation
Before we dive in, check out this animated explanation of pointers. Fun!
C does not have language support for dynamically allocating new ‘things’.
Instead, the programmer has to call a library function called malloc()
to allocate a new chunk of memory from the heap segment, and later call free()
to return that chunk of memory to the heap.
The programmer has to remember to initialize the chunk of bytes received from malloc()
– which otherwise should be assumed to contain random data.
The programmer has to be careful allocate a large enough chunk to hold the data she intends to store there, and not to use pointers to write “outside” that chunk of memory.
Lots of flexibility and power - but as with any great power, you must take great care in using it.
In Java, you can use
new
to dynamically create a new object, anddelete
to discard an object created withnew
, but for the most part the Java compiler and runtime handles object deletion and memory recovery automatically - it’s called ‘garbage collection.’
There are four related functions you should understand:
- malloc
p = malloc(n)
- allocatesn
bytes of heap memory; the memory contents remain uninitialized. - calloc
p = calloc(count, size)
allocatescount*size
bytes of heap memory and initializes it all to zero; this call is appropriate when you want to allocate an array ofcount
items, each ofsize
bytes. - realloc
p = realloc(p, n)
- wherep
is a pointer to heap memory - expands (or shrinks) its allocation ton
bytes. - free
free(p)
- wherep
is a pointer to heap memory - releases that portion of heap memory for future use.
Our examples today show how to use malloc
to allocate space to store a string, and later, a struct
holding aggregate types.
For an example using calloc
and realloc
, read about how readlinep() works.
Memory leaks and stray pointers
For every malloc
there must be a matching free
.
- If you allocate space but never free it, you’ve created a memory leak; as the program runs, the process memory size will grow and grow, and eventually run out of memory.
- If you free a pointer you’ve never allocated - or free the same pointer twice - you may cause the heap manager to corrupt its data structures and trigger a segmentation fault at some future time - often a seemingly random time.
Whenever you write malloc
, write the corresponding free
call; or, document whose responsibility is to free
that pointer.
Whenever you call free(p)
, it’s good practice to immediately set p=NULL
.
Not because you have to - it makes no difference to the heap - but because that step will prevent you from accidentally re-using that now-defunct pointer p
- you’ll get a clean segmentation fault instead of unpredictable behavior.
When the process exits, all its memory is free - the four segments (code, global, stack, and heap) disappear.
Coding style - memory allocation
Avoid sprinking calls to malloc()
and free()
throughout your code.
Instead, think about the kinds of things you need to create and destroy, and write type-specific wrapper for each such type.
For example, if your program manipulates things of type struct listnode
, you would write two functions:
struct listnode *listnode_new(...);
void listnode_free(struct listnode *node);
The first function calls malloc(sizeof(struct listnode))
and initializes all of its contents, perhaps using parameters passed by the caller.
The second calls free(node)
.
Both involve careful error-checking code.
See example names7.c.
There are many advantages to this approach:
- The mainline code is more readable, because it’s clear what
listnode_new()
is doing. - The
new
function acts like a ‘constructor’ in object-oriented languages and can ensure the newly returned memory is initialized, or at least, not random bytes. - Code involving
malloc
can sometimes be tricky, and you isolate that trickiness in one spot and focus on getting it right once. - Some new types might need multiple
malloc
calls, as in our linked-list examples names5.c names6.c names7.c. All thosemalloc
calls (and correspondingfree
calls) can be in thenew
/free
functions. - You can insert debugging output or reference-counting logic, or set debugger breakpoints, in these
new
/free
functions and immediately have that feature apply to all occurrences of your program’s work with this type.
Data structures in C
C is not an object-oriented language like Java, Python, or C++. It does not provide any language features to declare a ‘class’ or create and manipulate object ‘instances’ of that class.
But C does provide two kinds of aggregate data: arrays and structures.
Arrays
We’ve seen examples with arrays of characters (aka strings) and arrays of character pointers, and even two-dimensional arrays of characters. But one can also define arrays of other types:
char names[NumStudents][NameLength];
int ages[NumStudents];
float grades[NumStudents];
and so forth.
Sometimes you need to understand that an array name is, in effect, a pointer to the first spot in memory where that array’s data lives; thus,
ages
is of typeint*
and is a pointer to the first entry in the array, butages[0]
is of typeint
and is the value of that first entry in the array. This concept comes up mostly in arrays of characters;names
is of typechar**
, and evennames[0]
is of typechar*
, andnames[0][1]
is of typechar
. In effect,names
is an array of strings, andnames[0]
is the first string in the array; like other strings,names[0]
is of typechar*
.
Structs
C allows you to define an aggregate called a ‘struct’; like an object, it has data members. Unlike an object, it has no methods (function members).
Continuing our above example, we could have defined a struct
for each student, and then an array of structs
, as follows:
struct student {
char name[NameLength];
int age;
float grade;
};
struct student students[NumStudents];
Now we can refer to the first student as student[0]
, the second student as student[1]
, and so forth.
We can refer to the name of the first student as student[0].name
and the age of the second student as student[1].age
.
In our classtime examples, names5.c, names6.c, and names7.c, we declare a struct listnode
and use it to build a linked list.
There we see a different notation: instead of .
to refer to a member of a struct, we have to use ->
to dereference the pointer and refer to member of the pointee.
struct student alice; // a single struct representing alice
struct student *sp = &alice; // a pointer, initialized to point at alice
// the following three all do the same thing.
int age = alice.age;
int age = sp->age;
int age = (*sp).age;
Linked lists
In the examples below we construct a simple linked list comprising nodes that each have a pointer to a string and a pointer to the next node.
The list includes a dummy head node to simplify list management.
The variable head
is a pointer to that dummy node.
// A structure for each node in linked list
struct listnode {
char *name;
struct listnode *next;
};
struct listnode head = {NULL, NULL}; // dummy node at head of empty list
After adding three nodes, the list might look like this:
Examples
We first live-coded a simple revision of the names
program, using an array of pointers instead of a two-dimensional array of characters:
- names3.c: array of pointers; use of
readlinep
Here, the two-dimensional names
array is replaced by a one-dimensional array of string pointers, which is more common in C, and which allows the names to be of any length - each array entry points to a dynamically allocated memory from the heap, big enough to hold that name.
We use a new freadlinep
function; after the next lecture, I recommend you read about its implementation.
We don’t have time in class, but we can extend this code to sort the array using qsort()
quicksort:
- names4.c: sort names;
void*
; casting
Then we live-coded a linked-list version of the names
program:
- names5.c - revises names3.c to use linked list instead of array
You should look at two more versions of the names
program:
- names6.c - extends names5.c to keep the list sorted
- names7.c - extends names6.c to remove duplicates
Activity
Today’s activity deletes the list.