R – Converting an expression to conjunctive normal form with a twist


I've got a library that I have to interface with which acts basically as a data source. When retrieving data, I can pass special "filter expressions" to that library, which later get translated to SQL WHERE part. These expressions are pretty limited. They must be in conjunctive normal form. Like:

(A or B or C) and (D or E or F) and ...

This of course isn't very comfortable for programming. So I want to make a little wrapper which can parse arbitrary expressions and translate them to this normal form. Like:

(A and (B or C) and D) or E

would get translated to something like:

(A or E) and (B or C or E) and (D or E)

I can parse an expression to a tree with the Irony library. Now I need to normalize it, but I don't know how… Oh, also, here's the twist:

  • The final expression may not contain the NOT operator. However, I can inverse the individual terms by replacing the operators with the inverse operators. So, this is OK:

    (not A or not B) AND (not C or not D)

    but this is not:

    not (A or B) and not (C or D)

  • I would like to make the expression as simple as possible, because it will be translated to a practically identical SQL WHERE statement, so a complex statement would most likely slow down execution speed.

Best Solution

I'd use two iterations over the tree, although it's probably possible in one.

First iteration: get rid of your NOT Nodes by walking through the tree and using De Morgan's law (wikipedia link) and remove double negation wherever applicable.

Second iteration (the NOT are now only directly before a leaf node) Go through your tree:

Case "AND NODE":
    fine, inspect the children
Case "OR NODE":
    if there is a child which is neither a Leaf nor a NOT node
        apply the distributive law.
        start from parent of current node again
        fine, inspect children

After that you should be done.

Related Question