C++ – Tips for writing a DBMS


I am taking a graduate-level course that consists of one big project: writing a DBMS.

The objective is not to reinvent the wheel and make an enterprise DBMS to rival Oracle. Only a small subset of SQL commands need to be supported. Nor is the objective to create some fancy hybrid model DBMS for storing multimedia or something. It has to be a traditional RDBMS.

The main goal of the project is to use programming techniques that take advantage of modern architectures (multicore processors) to build a high-performance database (in terms of speed and load).

I was just wondering if there were any resources on query evaluations, optimizers, data structures ideal for DBMSes or basically anything that could help me create a standout project. The professor was throwing around terms like metaprogramming for example.

The project must be done entirely in C++.

Thanks for the replies so far! I cannot optimize an existing DBMS such as MySQL, as the project requires you to build your own DBMS from scratch. Yes, I know this is pretty much reinventing the wheel for the most part, but there is scope for some novel query evaluation and optimization algorithms. If you know any good resources or books dealing with this specific area, then please tell me!

Best Solution

First you need to learn relational calculus and write a compiler that translates SQL into it. Thankfully SQL is an easy language to parse, so this part is not too bad.
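To make the "compile SQL into relational operators" idea concrete, here is a minimal sketch of what the compiler's output could look like for `SELECT name FROM people WHERE age > 30`. It uses the common pull-based iterator model; the names `Operator`, `Scan`, `Select`, `Project` and `run_example` are all hypothetical, rows are plain vectors of strings, and the column positions (name = 0, age = 1) are assumed. The parser itself is left out. It needs C++17 for `std::optional`.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// A row is a flat list of string values; a real system would use typed columns.
using Row = std::vector<std::string>;

// Every relational operator hands out rows one at a time (iterator model).
struct Operator {
    virtual ~Operator() = default;
    virtual std::optional<Row> next() = 0;  // std::nullopt means "no more rows"
};

// Scan: reads every row of an in-memory table, in order.
class Scan : public Operator {
public:
    explicit Scan(const std::vector<Row>& table) : table_(table) {}
    std::optional<Row> next() override {
        if (pos_ >= table_.size()) return std::nullopt;
        return table_[pos_++];
    }
private:
    const std::vector<Row>& table_;
    std::size_t pos_ = 0;
};

// Select (sigma): passes through only the rows that satisfy the predicate.
class Select : public Operator {
public:
    Select(std::unique_ptr<Operator> child, std::function<bool(const Row&)> pred)
        : child_(std::move(child)), pred_(std::move(pred)) {}
    std::optional<Row> next() override {
        while (auto row = child_->next()) {
            if (pred_(*row)) return row;
        }
        return std::nullopt;
    }
private:
    std::unique_ptr<Operator> child_;
    std::function<bool(const Row&)> pred_;
};

// Project (pi): keeps only the listed column positions of each row.
class Project : public Operator {
public:
    Project(std::unique_ptr<Operator> child, std::vector<std::size_t> cols)
        : child_(std::move(child)), cols_(std::move(cols)) {}
    std::optional<Row> next() override {
        auto row = child_->next();
        if (!row) return std::nullopt;
        Row out;
        for (std::size_t c : cols_) out.push_back((*row)[c]);
        return out;
    }
private:
    std::unique_ptr<Operator> child_;
    std::vector<std::size_t> cols_;
};

// Hand-built plan for: SELECT name FROM people WHERE age > 30
// Assumes column 0 is the name and column 1 is the age (stored as text).
std::vector<Row> run_example(const std::vector<Row>& people) {
    auto plan = std::make_unique<Project>(
        std::make_unique<Select>(
            std::make_unique<Scan>(people),
            [](const Row& r) { return std::stoi(r[1]) > 30; }),
        std::vector<std::size_t>{0});
    std::vector<Row> result;
    while (auto row = plan->next()) result.push_back(*row);
    return result;
}
```

A query optimizer then becomes a function that rewrites one such operator tree into a cheaper equivalent one, for example by pushing a `Select` below a join.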

Then get familiar with B-trees and their variants (B+-trees, B*-trees) for your indexes. Then build a commit and rollback space, and that is pretty much all there is to it. It's not rocket science compared to other projects you might undertake, but it's definitely something you had better start right away if you want a good result by the end of the semester/year.
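One possible reading of a "commit and rollback space" is an undo log: before each change inside a transaction, record the old value so it can be restored later. The sketch below is only an illustration under that assumption. The `TinyStore` class and its methods are hypothetical, everything lives in memory, and there is no durability or concurrency control, both of which a real DBMS would need.

```cpp
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory key/value store with single-transaction undo logging.
class TinyStore {
public:
    // Start a transaction: from now on, changes are recorded in the undo log.
    void begin() {
        undo_.clear();
        in_txn_ = true;
    }

    // Write a value, remembering the previous one (or its absence) first.
    void put(const std::string& key, const std::string& value) {
        if (in_txn_) undo_.emplace_back(key, get(key));
        data_[key] = value;
    }

    // Delete a key, remembering the previous value (or its absence) first.
    void erase(const std::string& key) {
        if (in_txn_) undo_.emplace_back(key, get(key));
        data_.erase(key);
    }

    // Look up a key; std::nullopt means the key is not present.
    std::optional<std::string> get(const std::string& key) const {
        auto it = data_.find(key);
        if (it == data_.end()) return std::nullopt;
        return it->second;
    }

    // Commit: the changes stay, so the undo log is simply discarded.
    void commit() {
        undo_.clear();
        in_txn_ = false;
    }

    // Rollback: replay the undo log newest-first to restore the old state.
    void rollback() {
        for (auto it = undo_.rbegin(); it != undo_.rend(); ++it) {
            if (it->second) {
                data_[it->first] = *it->second;  // key existed: restore value
            } else {
                data_.erase(it->first);          // key did not exist: remove it
            }
        }
        undo_.clear();
        in_txn_ = false;
    }

private:
    std::map<std::string, std::string> data_;
    // Each entry: (key, value before the change, or nullopt if it was absent).
    std::vector<std::pair<std::string, std::optional<std::string>>> undo_;
    bool in_txn_ = false;
};
```

Replaying the log in reverse order matters when the same key is changed several times in one transaction: the oldest recorded value is applied last, so it is the one that survives.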

edit: Oh, and as far as modern architecture goes, trees don't usually benefit much from multithreading. Neither do disk reads. On the other hand, it's crucial for high performance to use all of the machine's memory through OS-level calls, not just the memory normally addressable in a process.