Skip to content

๐Ÿ’พ Memory Allocation - 42 project {Mmap, memory allocation algorithm, Munmap & free}

Notifications You must be signed in to change notification settings

gaeduron/Malloc

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

Malloc

Introduction

โš ๏ธTHIS REPO IS A WORK IN PROGRESS AT THIS POINT

This is simple implementation of malloc. This malloc is a way to get familiar with memory allocation in C. If you are looking for a more advanced malloc implementation you can take a look at the glibc implementation.

๐Ÿ“– Summary

  1. Guideline
  2. Data-Structures overview
  3. Data-Structures details
    1. Chunk
    2. Bin
    3. Zone
    4. Global zone storage
  4. Functions Overview
    1. Malloc
    2. Free
    3. Realloc
  5. Documentation

๐Ÿ“ƒ Guideline

This malloc implementation is different from the glibc. The constrain for this project are:

  • You can only use some allowed functions:
mmap(2)
munmap(2)
getpagesize(3)
getrlimit(2)
The authorized functions within your libft (write(2) par exemple ;-) )
The functions from libpthread
  • The project must be written in accordance with the 42 Norm
  • You have to โ€œpre-allocateโ€ some memory zones to store your โ€œsmallโ€ and โ€œmediumโ€ malloc.
  • Each zone must contain at least 100 allocations.
โ€œTINYโ€ mallocs, from 1 to n bytes, will be stored in N bytes big zones.<br>
โ€œSMALLโ€ mallocs, from (n+1) to m bytes, will be stored in M bytes big zones.<br>
โ€œLARGEโ€ mallocs, fron (m+1) bytes and more, will be stored out of zone,<br>
which simply means with mmap(), they will be in a zone on their own.<br>

๐Ÿ”ญ Data-Structures overview

Chunk

Each chunk has a payload (where the user data is stored) and meta-data about how big it is (via a size field in the chunk header).
Chunks payload address is what is returned to the user.
When a chunk is in use by the application, the only data that's "remembered" is the size of the chunk.
You can use this size to find the next chunk in a bin.

Bin

Bins contains multiples chunks.
They contain only a specific type of chunks, either only SMALL chunks or only TINY chunks.
A bin can contain at least 100 chunks.
Each bins are the same size (e.g.: MAX_TINY_CHUNK_SIZE * 100 + headers).

Zone

Zones contains multipes bins.
Each zone is a freelist of all the bins in that specific zone. In this implementation the zones are [TINY, SMALL, LARGE], but you could have more zones.

Global zone storage

This were all the zones are stored. This is a global variable. It can be access by malloc and free.

๐Ÿ”ฌ Data-Structures details

๐Ÿ“„ Chunk

Chunks are 8 bytes aligned The minimum size of a chunk is 2*sizeof(void*). Usually sizeof(void*) is sizeof(size_t). You can find the size of your size_t with this command:

echo | gcc -E -xc -include 'stddef.h' - | grep size_t

A chunk in use:

- schema -
|---|---|---|---|---|---|---|---|
|   size_of_the_chunk       |  P|    P = Previous chunk is in use
|---|---|---|---|---|---|---|---|
|            payload            |
|                               |
|                               |
|                               |
|---|---|---|---|---|---|---|---|
	
- example -
|---|---|---|---|---|---|---|---|
|x00  00  00  00  00  00  20| 01|    Size = 32 bytes, P = true 
|---|---|---|---|---|---|---|---|
|            payload            |    Payload = User data
|                               |
|                               |
|                               |
|---|---|---|---|---|---|---|---|

In a free chunk we also have at the end the size of the chunk. That way we can navigate back in our bin. We will use this way of navigation during the memory defragmentation process. Where we want to combine free chunk with adjacent free chunks to "coalesce" them into larger chunks.

A free chunk:

- schema -
|---|---|---|---|---|---|---|---|
|   size_of_the_chunk       |  P|
|---|---|---|---|---|---|---|---|
|            payload            |
|                               |
|                               |
|---|---|---|---|---|---|---|---|
|       size_of_the_chunk       |
|---|---|---|---|---|---|---|---|

- example -
|---|---|---|---|---|---|---|---|
|x00  00  00  00  00  00  20| 01|    Size = 32 bytes, P = true 
|---|---|---|---|---|---|---|---|
|            payload            |    Payload = User data
|                               |
|                               |
|---|---|---|---|---|---|---|---|
|x00  00  00  00  00  00  20| 00|
|---|---|---|---|---|---|---|---|

Now let's see an exemple with 3 adjacent chunks

|---|---|---|---|---|---|---|---|
|x00  00  00  00  00  00  20| 01|    | Chunk 1
|---|---|---|---|---|---|---|---|    | Size = 32 bytes, previous chunk is used
|            payload            |    | Is now free
|                               |    |
|                               |    |
|---|---|---|---|---|---|---|---|
|x00  00  00  00  00  00  20| 01|
|---|---|---|---|---|---|---|---|
|x00  00  00  00  00  00  18| 00|    | Chunk 2 
|---|---|---|---|---|---|---|---|    | Size = 24 bytes, previous chunk is free
|            payload            |    | Is now in use
|                               |    |
|                               |    |
|---|---|---|---|---|---|---|---|
|x00  00  00  00  00  00  08| 03|    | Chunk 3
|---|---|---|---|---|---|---|---|    | Size = 24 bytes, previous chunk is used
|            payload            |    | Is the last chunk of the bin
|---|---|---|---|---|---|---|---|

In the last chunk we have P = 3, this mean that it's the last chunk of the bin. Because our chunk are aligned on 8 bytes, we can use the three least significant bits of our size to set some flags.

Chunk 1 header in binary
                                                                    | LP
|--------|--------|--------|--------|--------|--------|--------|--------|
|00000000|00000000|00000000|00000000|00000000|00000000|00100000|00000001|
|--------|--------|--------|--------|--------|--------|--------|--------|

Chunk 3 header in binary
                                                                    | LP			
|--------|--------|--------|--------|--------|--------|--------|--------|
|00000000|00000000|00000000|00000000|00000000|00000000|00001000|00000011|
|--------|--------|--------|--------|--------|--------|--------|--------|

So the L flag mean that this is the last chunk in a bin.

๐Ÿ“‚ Bin

Bins are multiples of get_page_size(), so they are large enough to be allocated with mmap.
They have a header which contain two pointers.
One pointing to the next bin in that zone and one pointing to the previous bin.

Bins are created with only two chunk at first.
Chunkd will be divided at each new allocation needed.
The second chunk is the last chunk of the bin. This chunk will never hold user data

The first chunk in a bin will always have the flag F (first) set as True.
The last chunk in a bin will always have the flag L (last) set as True.
That way we will know when we hit the end of a bin when searching for memory.

When coalescing our chunks, if we stumble upon the first chunk we will do the following:

  • Test if it's free.
  • Test if it's adjacent to the last chunk of the bin. If the two conditions are true, this bin is empty and we can use munmap() to give it back to the system.

Let's see a more visual explanation of a bin lifecycle:


void *ptr = (void*)malloc(8);

// create a bin 

|    BIN HEADER   |                      BIN PAYLOAD                             |
|                 |   first chunk                                       |lastchnk|
|--------|--------|--------|--------|--------|--------|--------|--------|--------|
|*bck_ptr|*nxt_ptr| MAX|101|               free memory to use  | MAX|101|   0|010|
|--------|--------|--------|--------|--------|--------|--------|--------|--------|

// create a chunk of size 8

|    BIN HEADER   |                      BIN PAYLOAD                             |
|                 |   first chunk                     |  second chunk   |lastchnk|
|--------|--------|--------|--------|--------|--------|--------|--------|--------|
|*bck_ptr|*nxt_ptr|  24|101| free memory     |  24|101|   8|000| payload|   0|011|
|--------|--------|--------|--------|--------|--------|--------|--------|--------|

free(ptr);

// set the chunk as free

|    BIN HEADER   |                      BIN PAYLOAD                             |
|                 |   first chunk                     |  second chunk   |lastchnk|
|--------|--------|--------|--------|--------|--------|--------|--------|--------|
|*bck_ptr|*nxt_ptr|  24|101| free memory     |  24|101|   8|000|   8|000|   0|010|
|--------|--------|--------|--------|--------|--------|--------|--------|--------|

// defragment the bin

|    BIN HEADER   |                      BIN PAYLOAD                             |
|                 |   first chunk                                       |lastchnk|
|--------|--------|--------|--------|--------|--------|--------|--------|--------|
|*bck_ptr|*nxt_ptr| MAX|101| free memory                       | MAX|101|   0|010|
|--------|--------|--------|--------|--------|--------|--------|--------|--------|

// The first chunk is free and is adjacent to the last chunk, so we can give this bin to munmap()

๐Ÿ—ƒ Zone

A zone is a freelist where each nodes is a bin.

|    BIN HEADER   | PAYLOAD |
|--------|--------|---------|
|*bck_ptr|*nxt_ptr| chunks  |
|--------|--------|---------|


|----------------------------------------------ZONE A-----------------------------------------------|

|bin1_address = 0x001...            |bin2_address = 0x002...            |bin3_address = 0x003...  
|    BIN HEADER   | PAYLOAD |       |    BIN HEADER   | PAYLOAD |       |    BIN HEADER   | PAYLOAD |
|--------|--------|---------|       |--------|--------|---------|       |--------|--------|---------|
|       0|0x002...| chunks  |  <->  |0x001...|0x003...| chunks  |  <->  |0x002...|       0| chunks  |
|--------|--------|---------|       |--------|--------|---------|       |--------|--------|---------|

๐Ÿ—„ Global zone storage

It's a simple array with the first pointer of each zone.

enum zone {TINY, SMALL, LARGE, ZONE_COUNT};

extern void g_zones[ZONE_COUNT];

// [TINY_PTR, SMALL_PTR, LARGE_PTR]

๐Ÿ›  Functions Overview

Malloc

Here is an overview of how this malloc implementation will work

1 - malloc
2 - find_space
3 - search_in_zone
4 - create_bin
5 - set_chunk_header
6 - set_bin_headers

Free

1 - free
1 - defragment
1 - bin_is_empty

Realloc

Realloc will be really simple in this implementation.
It will just call malloc to get a new space of the right size in memory.
Then we will do a memcopy from the previous memory to the new one.
Now we just free the old memory.
Then return the new pointer.

๐Ÿ“š Documentation

Glibc implementation of malloc()

https://sourceware.org/glibc/wiki/MallocInternals

Really well documented implementation of malloc from glibc

https://github.com/lattera/glibc/blob/master/malloc/malloc.c

Good slides on mmap, sbrk, malloc

https://pubweb.eng.utah.edu/~cs4400/malloc.pdf

About

๐Ÿ’พ Memory Allocation - 42 project {Mmap, memory allocation algorithm, Munmap & free}

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published