Why is NaN not equal to NaN?


The relevant IEEE standard defines a numeric constant NaN (not a number) and prescribes that NaN should compare as not equal to itself. Why is that?

All the languages I'm familiar with implement this rule. But it often causes significant problems, for example unexpected behavior when NaN is stored in a container, when NaN is in the data that is being sorted, etc. Not to mention, the vast majority of programmers expect any object to be equal to itself (before they learn about NaN), so surprising them adds to the bugs and confusion.
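To make the container and sorting problems concrete, here is a minimal Python sketch. The variable names and the helper `sort_with_nans_last` are made up for illustration; other languages show similar surprises, but the details differ.

```python
import math

nan_a = float("nan")
nan_b = float("nan")   # a second, separately created NaN

# NaN compares unequal to everything, including itself.
self_equal = (nan_a == nan_a)        # False

# Container surprise: searching a list for "a NaN" by value fails.
# (Python checks identity before equality, so only the very same object is found.)
values = [1.0, nan_a, 2.0]
found_other_nan = nan_b in values    # False

# Container surprise: every new NaN becomes a distinct dict key.
counts = {}
counts[nan_a] = 1
counts[nan_b] = 2
distinct_keys = len(counts)          # 2

# Sorting: every ordering comparison with NaN is False, so a plain sort
# gives no useful guarantee. One workaround is to sort NaNs explicitly.
def sort_with_nans_last(xs):
    """Hypothetical helper: sort floats ascending, placing all NaNs at the end."""
    return sorted(xs, key=lambda x: (math.isnan(x), x))

ordered = sort_with_nans_last([3.0, nan_a, 1.0])
print(self_equal, found_other_nan, distinct_keys, ordered)
```

Testing with `math.isnan` rather than `==` is the usual way to detect NaN in Python.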

IEEE standards are well thought out, so I am sure there is a good reason why NaN comparing as equal to itself would be bad. I just can't figure out what it is.

Edit: please refer to "What is the rationale for all comparisons returning false for IEEE754 NaN values?" as the authoritative answer.

Best Solution

The accepted answer is 100% without question WRONG. Not halfway wrong or even slightly wrong. I fear this issue is going to confuse and mislead programmers for a long time to come when this question pops up in searches.

NaN is designed to propagate through all calculations, infecting them like a virus, so that if a NaN turns up somewhere deep in a complex calculation, you don't bubble out a seemingly sensible answer. If NaN compared equal to itself, then by identity NaN/NaN should equal 1, along with all the other consequences, such as (NaN/NaN)==1 and (NaN*1)==NaN. If your calculation went wrong somewhere (say, rounding produced a 0/0, yielding NaN), you could then get wildly incorrect (or worse: subtly incorrect) results with no obvious indicator as to why.
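A small Python sketch of that propagation. The NaN here is produced by `inf - inf`, which is just one convenient invalid operation; the variable names are illustrative only.

```python
import math

# An invalid operation: infinity minus infinity has no sensible value.
nan = float("inf") - float("inf")

# Once a NaN enters a calculation, every later result is NaN as well.
total = (nan * 1.0 + 5.0) / 2.0
ratio = nan / nan

print(math.isnan(total))   # True
print(math.isnan(ratio))   # True: NaN/NaN is NaN, not 1
print(ratio == 1.0)        # False
print(nan == nan)          # False
```

Because the final value is still NaN, the caller can tell that something went wrong upstream instead of receiving a plausible-looking number.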

There are also really good reasons for NaNs in calculations when probing the value of a mathematical function; one of the examples given in the linked document is finding the zeros() of a function f(). It is entirely possible that, while probing the function with guess values, you will hit one where f() yields no sensible result. Returning NaN for that probe allows zeros() to see the NaN and continue its work.
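As a rough sketch of that idea (not the algorithm from the linked document), the following Python code scans an interval for a sign change, skips any probe where the function returns NaN, and then bisects. Both `f` and `find_zero` are hypothetical names chosen for this example, and the sketch assumes the function is defined everywhere inside the bracket it finds.

```python
import math

def f(x):
    """Example function: sqrt(x) - 1, reported as NaN outside its domain."""
    if x < 0:
        return float("nan")
    return math.sqrt(x) - 1.0

def find_zero(func, lo, hi, samples=64, tol=1e-12):
    """Scan [lo, hi] for a sign change, ignoring NaN probes, then bisect."""
    step = (hi - lo) / samples
    prev_x = prev_y = None
    for i in range(samples + 1):
        x = lo + i * step
        y = func(x)
        if math.isnan(y):
            # No sensible value here: forget the previous point and move on.
            prev_x = prev_y = None
            continue
        if y == 0.0:
            return x
        if prev_y is not None and (prev_y < 0) != (y < 0):
            # Sign change between prev_x and x: refine by bisection.
            a, fa, b = prev_x, prev_y, x
            while b - a > tol:
                mid = (a + b) / 2
                fm = func(mid)
                if fm == 0.0:
                    return mid
                if (fm < 0) == (fa < 0):
                    a, fa = mid, fm
                else:
                    b = mid
            return (a + b) / 2
        prev_x, prev_y = x, y
    return None  # no sign change found

root = find_zero(f, -3.0, 4.0)
print(root)
```

The search starts well outside the domain of `f`, yet no exception handling is needed: the NaN probes are simply skipped.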

The alternative to NaN is to trigger an exception (also called a signal or a trap) as soon as an illegal operation is encountered. Besides the massive performance penalties you might encounter, at the time there was no guarantee that CPUs would support it in hardware or that the OS or language would support it in software; everyone was their own unique snowflake in handling floating point. The IEEE committee decided to handle it explicitly in software, as NaN values, so that it would be portable across any OS or programming language. Correct floating-point algorithms are generally correct across all floating-point implementations, whether that be node.js or COBOL (hah).

In theory, you don't have to set specific #pragma directives, set crazy compiler flags, catch the correct exceptions, or install special signal handlers to make what appears to be the identical algorithm actually work correctly. Unfortunately some language designers and compiler writers have been really busy undoing this feature to the best of their abilities.
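Python happens to show both styles side by side: float division by zero raises `ZeroDivisionError` (the trapping style), while other invalid operations such as `inf - inf` quietly return NaN. The sketch below contrasts the two. `quiet_divide` is a hypothetical helper written for this example that emulates the non-trapping IEEE 754 result for a zero divisor.

```python
import math

def trapping_divide(a, b):
    """Plain Python division: raises ZeroDivisionError when b is zero."""
    return a / b

def quiet_divide(a, b):
    """Hypothetical helper: return the IEEE 754 default result instead of raising."""
    try:
        return a / b
    except ZeroDivisionError:
        if a == 0.0 or math.isnan(a):
            return float("nan")              # 0/0 and NaN/0 are NaN
        # Nonzero / zero is an infinity whose sign combines both operand signs.
        return math.copysign(float("inf"), a) * math.copysign(1.0, b)

try:
    trapping_divide(0.0, 0.0)
    trapped = False
except ZeroDivisionError:
    trapped = True          # the caller must remember to catch this

quiet = quiet_divide(0.0, 0.0)   # NaN flows on through later arithmetic
print(trapped, math.isnan(quiet))
```

With the trapping style, every call site needs its own error handling; with the quiet style, one `math.isnan` check at the end of the calculation is enough.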

Please read some of the information about the history of IEEE 754 floating point. Also see this answer to a similar question, where a member of the committee responded: "What is the rationale for all comparisons returning false for IEEE754 NaN values?"

"An Interview with the Old Man of Floating-Point"

"History of IEEE Floating-Point Format"

"What Every Computer Scientist Should Know About Floating-Point Arithmetic"