C++ – Disk Based Dynamic Memory Allocation


I have a program in which I want to be able to store certain data (dynamically allocated blocks), on disk for reduced memory usage and persistence.

My first thought was to write my own custom allocator which managed the content of files on the disk, but I want to see what alternatives there are too.

I've looked into custom memory allocators and topics on object serialization but there are subtle differences, both good and bad, when adapting those principles to managing the address space of a file.

In this situation:

  1. Memory is accessed only through IO (read/write) functions rather than directly

  2. No objects(methods/pointers) are stored, only data.

  3. The size of a file is not static, so it should grow when needed rather than being large and static

  4. For my uses, it is acceptable to re-map existing pointers after defragmentation

Because the data is not of a fixed size, most database implementations seem not well suited.

I ask, what is the best approach for this problem? Should I implement a simple memory allocator which treats a file as the heap?

For reference, im using C++ on embedded devices.

Edit: I've implemented my own memory manager which uses buddy memory allocation and block sizes of powers of two. I'm satisfied that it is correct and does not leak, coalesces free blocks, and can do a 'stop the world' defragmentation.

The problem is, as expected, there is quite a bit of internal and external fragmentation. I'm not an expert in this field and although I find it fascinating (I'm still a student), I wonder if there are any other implementations that have done the same or similar thing? Surely I cant be the only one?

Some helpful but so far incompatible topics are:

tbh I havent used mmap but, it addresses the file IO, but not the management of the file address space.

BOOST:serialization I have a (probably unjustified) reluctance to use boost libraries at the moment.

STXXL Interesting but doesnt address variable size memory allocation

Doug Lea Memory Allocator Has very good insights into issues with memory allocators, but I'm not in a position to try and make my own implementation

Best Solution

Your two goals are to reduce memory usage and persist your data. That definitely sounds like a job for a database. But then you say

Because the data is not of a fixed size, most database implementations seem not well suited.

I think you'll be interested in this distinctive feature of SQLite (a very lightweight cross-platform database with public domain source code):

Variable-length records


SQLite, in contrast, use only the amount of disk space actually needed to store the information in a row. If you store a single character in a VARCHAR(100) column, then only a single byte of disk space is consumed. (Actually two bytes - there is some overhead at the beginning of each column to record its datatype and length.)

It also is a good choice for embedded development:

Embedded devices and applications

Because an SQLite database requires little or no administration, SQLite is a good choice for devices or services that must work unattended and without human support. SQLite is a good fit for use in cellphones, PDAs, set-top boxes, and/or appliances. It also works well as an embedded database in downloadable consumer applications.