You can store the tree in memory, or you can produce the required output code directly. The intermediate form is normally stored so that the code can be processed at a higher level before the output is generated.
In your case, for example, it would be simple to discover that your expression contains no variables and that the result is therefore a fixed number. This is not possible, however, when looking at only one node at a time. To be more explicit: if, after looking at "2*", you generate machine code for computing the double of something, that code is wasted when the other operand is, for example, "3". Your program will load "3" and then double it every time it runs, while just loading "6" would be equivalent but shorter and faster.
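To make the point about higher-level processing concrete, here is a minimal sketch of constant folding on a stored tree. The `Node` type and the helpers `literal`, `variable`, `binary` and `fold` are hypothetical names invented for this illustration, and the node layout is only one of many possible choices.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical node type for this sketch: a literal, a variable or a binary operation.
struct Node {
    char op = 'n';                   // 'n' literal, 'v' variable, or one of + - * /
    int value = 0;                   // used when op == 'n'
    std::string name;                // used when op == 'v'
    std::unique_ptr<Node> lhs, rhs;  // used for binary operations
};

std::unique_ptr<Node> literal(int v) {
    auto n = std::make_unique<Node>();
    n->op = 'n';
    n->value = v;
    return n;
}

std::unique_ptr<Node> variable(const std::string& name) {
    auto n = std::make_unique<Node>();
    n->op = 'v';
    n->name = name;
    return n;
}

std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->op = op;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// Folds constant sub-trees bottom-up and returns the (possibly replaced) node.
std::unique_ptr<Node> fold(std::unique_ptr<Node> n) {
    assert(n != nullptr);
    if (n->op == 'n' || n->op == 'v') return n;  // leaves cannot be simplified
    n->lhs = fold(std::move(n->lhs));
    n->rhs = fold(std::move(n->rhs));
    if (n->lhs->op != 'n' || n->rhs->op != 'n') return n;  // a variable is involved
    int a = n->lhs->value, b = n->rhs->value;
    switch (n->op) {
        case '+': return literal(a + b);
        case '-': return literal(a - b);
        case '*': return literal(a * b);
        case '/': if (b != 0) return literal(a / b); break;  // keep division by zero for run time
    }
    return n;
}
```

With these helpers, `fold(binary('*', literal(2), literal(3)))` returns a single literal node holding 6, so a code generator walking the folded tree would emit one load instead of a load followed by a multiplication.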
If you want to generate machine code, you first need to know what kind of machine the code is going to be generated for... the simplest model uses a stack-based approach. In this case you need no register allocation logic, and it's easy to compile directly to machine code without the intermediate representation. Consider this small example that handles just integers, the four operations, unary negation and variables... you will notice that no data structure is used at all: source code characters are read and x86 assembly instructions are written to the output...
#include <stdio.h>
#include <stdlib.h>
// Report a fatal error and stop.
void error(const char *what) {
    fprintf(stderr, "ERROR: %s\n", what);
    exit(1);
}
// Parse a decimal integer and load it into eax.
void compileLiteral(const char *& s) {
    int v = 0;
    while (*s >= '0' && *s <= '9') {
        v = v*10 + *s++ - '0';
    }
    printf(" mov eax, %i\n", v);
}
// Parse an identifier and load the variable it names into eax.
void compileSymbol(const char *& s) {
    printf(" mov eax, dword ptr ");
    while ((*s >= 'a' && *s <= 'z') ||
           (*s >= 'A' && *s <= 'Z') ||
           (*s >= '0' && *s <= '9') ||
           (*s == '_')) {
        putchar(*s++);
    }
    printf("\n");
}
void compileExpression(const char *&);
// Compile a number, a variable, a negated term or a parenthesized expression.
void compileTerm(const char *& s) {
    if (*s >= '0' && *s <= '9') {
        // Number
        compileLiteral(s);
    } else if ((*s >= 'a' && *s <= 'z') ||
               (*s >= 'A' && *s <= 'Z') ||
               (*s == '_')) {
        // Variable
        compileSymbol(s);
    } else if (*s == '-') {
        // Unary negation
        s++;
        compileTerm(s);
        printf(" neg eax\n");
    } else if (*s == '(') {
        // Parenthesized sub-expression
        s++;
        compileExpression(s);
        if (*s != ')')
            error("')' expected");
        s++;
    } else {
        error("Syntax error");
    }
}
// Compile a sequence of terms joined by '*' or '/'.
void compileMulDiv(const char *& s) {
    compileTerm(s);
    for (;;) {
        if (*s == '*') {
            s++;
            printf(" push eax\n");
            compileTerm(s);
            printf(" mov ebx, eax\n");
            printf(" pop eax\n");
            printf(" imul ebx\n");
        } else if (*s == '/') {
            s++;
            printf(" push eax\n");
            compileTerm(s);
            printf(" mov ebx, eax\n");
            printf(" pop eax\n");
            // idiv divides edx:eax, so eax must be sign-extended into edx first
            printf(" cdq\n");
            printf(" idiv ebx\n");
        } else break;
    }
}
// Compile a sequence of products joined by '+' or '-'.
void compileAddSub(const char *& s) {
    compileMulDiv(s);
    for (;;) {
        if (*s == '+') {
            s++;
            printf(" push eax\n");
            compileMulDiv(s);
            printf(" mov ebx, eax\n");
            printf(" pop eax\n");
            printf(" add eax, ebx\n");
        } else if (*s == '-') {
            s++;
            printf(" push eax\n");
            compileMulDiv(s);
            printf(" mov ebx, eax\n");
            printf(" pop eax\n");
            printf(" sub eax, ebx\n");
        } else break;
    }
}
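void compileExpression(const char *& s) {
    compileAddSub(s);
}
int main(int argc, const char *argv[]) {
    if (argc != 2) error("Syntax: simple-compiler <expr>");
    const char *s = argv[1];
    compileExpression(s);
    // Reject input that is not fully consumed, e.g. "1+2)"
    if (*s != '\0') error("Unexpected trailing characters");
    return 0;
}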
For example, running the compiler with 1+y*(-3+x) as input produces the following output:
mov eax, 1
push eax
mov eax, dword ptr y
push eax
mov eax, 3
neg eax
push eax
mov eax, dword ptr x
mov ebx, eax
pop eax
add eax, ebx
mov ebx, eax
pop eax
imul ebx
mov ebx, eax
pop eax
add eax, ebx
However, this approach to writing compilers doesn't scale well to an optimizing compiler.
While it's possible to get some optimization by adding a "peephole" optimizer in the output stage, many useful optimizations are possible only when looking at the code from a higher point of view.
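As a rough sketch of what such a peephole pass might look like, the function below scans a list of emitted instructions and rewrites one wasteful pattern that the simple compiler produces whenever the right operand is a plain load. The names `peephole` and `startsWith` are hypothetical, and the sketch assumes each instruction is held as a string with leading whitespace already trimmed.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns true if `line` starts with `prefix`.
static bool startsWith(const std::string& line, const std::string& prefix) {
    return line.compare(0, prefix.size(), prefix) == 0;
}

// Rewrites the four-instruction window
//     push eax / mov eax, X / mov ebx, eax / pop eax
// into the single instruction
//     mov ebx, X
// Both forms leave eax and the stack unchanged and put X in ebx.
std::vector<std::string> peephole(const std::vector<std::string>& in) {
    std::vector<std::string> out;
    const std::string load = "mov eax, ";
    for (std::size_t i = 0; i < in.size(); ) {
        if (i + 3 < in.size() &&
            in[i] == "push eax" &&
            startsWith(in[i + 1], load) &&
            in[i + 2] == "mov ebx, eax" &&
            in[i + 3] == "pop eax") {
            // Load the operand straight into ebx instead of going through eax
            out.push_back("mov ebx, " + in[i + 1].substr(load.size()));
            i += 4;
        } else {
            out.push_back(in[i++]);
        }
    }
    return out;
}
```

Applied to the listing above, this would rewrite the one sequence that loads x, shrinking the output from 17 instructions to 14. It still cannot see that the whole expression could be rearranged, which is the kind of decision that needs the higher-level view.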
Even bare machine code generation could benefit from seeing more of the code, for example to decide which register to assign to which value, or which of the possible assembler implementations is most convenient for a specific code pattern.
For example, the same expression could be compiled by an optimizing compiler to
mov eax, dword ptr x
sub eax, 3
imul dword ptr y
inc eax
One of the Roslyn engineers who specializes in understanding optimization of stack usage took a look at this and reports to me that there seems to be a problem in the interaction between the way the C# compiler generates local variable stores and the way the JIT compiler does register scheduling in the corresponding x86 code. The result is suboptimal code generation on the loads and stores of the locals.
For some reason unclear to all of us, the problematic code generation path is avoided when the JITter knows that the block is in a try-protected region.
This is pretty weird. We'll follow up with the JITter team and see whether we can get a bug entered so that they can fix this.
Also, we are working on improvements for Roslyn to the C# and VB compilers' algorithms for determining when locals can be made "ephemeral" -- that is, just pushed and popped on the stack, rather than allocated a specific location on the stack for the duration of the activation. We believe that the JITter will be able to do a better job of register allocation and whatnot if we give it better hints about when locals can be made "dead" earlier.
Thanks for bringing this to our attention, and apologies for the odd behaviour.
Best Solution
The Roslyn project is in Visual Studio 2010 and gives you programmatic access to the Syntax Tree, among other things.