C++ – Exceptions with Unicode what()

c++ c++11

Or, "how do Russians throw exceptions?"

The definition of std::exception is:

namespace std {
  class exception {
  public:
    exception() throw();
    exception(const exception&) throw();
    exception& operator=(const exception&) throw();
    virtual ~exception() throw();
    virtual const char* what() const throw();
  };
}

A popular school of thought for designing exception hierarchies is to derive from std::exception:

Generally, it's best to throw objects,
not built-ins. If possible, you should
throw instances of classes that derive
(ultimately) from the std::exception
class. By making your exception class
inherit (ultimately) from the standard
exception base-class, you are making
life easier for your users (they have
the option of catching most things via
std::exception), plus you are probably
providing them with more information
(such as the fact that your particular
exception might be a refinement of
std::runtime_error or whatever).

But in the face of Unicode, it seems to be impossible to design an exception hierarchy that achieves both of the following:

  • Derives ultimately from std::exception for ease of use at the catch site
  • Provides Unicode compatibility so that diagnostics are not sliced or gibberish

Coming up with an exception class that can be constructed with Unicode strings is simple enough. But the standard dictates that what() must return a const char*, so at some point the input strings must be converted to ASCII. Whether that is done at construction time or when what() is called, if the source string uses characters not representable in 7-bit ASCII, it might be impossible to format the message without loss of fidelity.

How do you design an exception hierarchy that combines the seamless integration of a std::exception-derived class with lossless Unicode diagnostics?

Best Answer

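char* does not mean ASCII. The standard only says that what() returns a const char*; it does not say which encoding those bytes use. You could use an 8-bit Unicode encoding such as UTF-8. char could also be 16 bits or wider on some platforms, in which case you could use UTF-16.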
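As a minimal sketch of the UTF-8 approach, the class below derives from std::runtime_error, accepts a UTF-32 string, and stores its UTF-8 encoding so that what() returns it unchanged. The names unicode_error, to_utf8, append_utf8, and caught_message are hypothetical and made up for this illustration; the encoder is hand-written so that the example has no dependencies beyond the standard library.

```cpp
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical helper: append one Unicode code point to `out` as UTF-8.
// Surrogates and values above U+10FFFF are replaced with U+FFFD.
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
        cp = 0xFFFD;
    }
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Hypothetical helper: encode a whole UTF-32 string as UTF-8.
inline std::string to_utf8(const std::u32string& text) {
    std::string out;
    for (char32_t cp : text) {
        append_utf8(out, cp);
    }
    return out;
}

// Hypothetical exception type: still catchable as std::exception,
// but what() yields UTF-8 bytes rather than ASCII.
class unicode_error : public std::runtime_error {
public:
    explicit unicode_error(const std::u32string& message)
        : std::runtime_error(to_utf8(message)) {}
};

// Throws a unicode_error, catches it through the std::exception base,
// and returns the bytes that what() reported.
inline std::string caught_message() {
    std::string result;
    try {
        throw unicode_error(U"\u041E\u0448\u0438\u0431\u043A\u0430");  // "Ошибка"
    } catch (const std::exception& e) {
        result = e.what();
    }
    return result;
}
```

The catch site needs no knowledge of unicode_error, which keeps the ease of use that deriving from std::exception is meant to provide. Whether the bytes display correctly depends on the consumer treating them as UTF-8; on a platform whose console or logging API expects another encoding, a conversion at the output boundary would probably still be needed.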