C++ – floating point literal suffix in C++ to make a number double precision

cgccliteralsprecision

I'm currently working on a C++ project which does numerical calculations. The vast, vast majority of the code uses single precision floating point values and works perfectly fine with that. Because of this I use compiler flags to make basic floating point literals single precision instead of the double precision, which is the default. I find that this makes expressions easier to read and I don't have to worry about forgetting a 'f' somewhere. However, every now and then I need the extra precision offered by double precision calculations and my question is how I can get a double precision literal into such an expression. Every way I've tried so far first store the value in a single precision variable and the converts the truncated value to a double precision value. Not what I want.

Some ways I've tried so far is given below.

#include <iostream>

int main()
{
  std::cout << sizeof(1.0E200) << std::endl;
  std::cout << 1.0E200 << std::endl;

  std::cout << sizeof(1.0E200L) << std::endl;
  std::cout << 1.0E200L << std::endl;

  std::cout << sizeof(double(1.0E200)) << std::endl;
  std::cout << double(1.0E200) << std::endl;

  std::cout << sizeof(static_cast<double>(1.0E200)) << std::endl;
  std::cout << static_cast<double>(1.0E200) << std::endl;

  return 0;
}

A run with single precision constants give the following results.

~/path$ g++ test.cpp -fsingle-precision-constant && ./a.out
test.cpp:6:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:7:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:12:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:13:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:15:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
test.cpp:16:3: warning: floating constant exceeds range of ‘float’ [-Woverflow]
4
inf
16
1e+200
8
inf
8
inf

It is my understanding that the 8 bytes provided by the last two cases should be enough to hold 1.0E200, a theory supported by the following output, where the same program is compiled without -fsingle-precision-constant.

~/path$ g++ test.cpp  && ./a.out
8
1e+200
16
1e+200
8
1e+200
8
1e+200

A possible workaround suggested by the above examples is to use quadruple precision floating point literals everywhere I originally intended to use double precision, and cast to double precision whenever required by libraries and such. However, this feels a bit wasteful.

What else can I do?

Best Answer

Like Mark said, the standard says that its a double unless its followed by an f.

There are good reasons behind the standard and using compiler flags to get around it for convenience is bad practice.

So, the correct approach would be:

  1. Remove the compiler flag
  2. Fix all the warnings about loss of precision when storing double values in floating point variables (add in all the f suffixes)
  3. When you need double, omit the f suffix.

Its probably not the answer you were looking for, but it is the approach you should use if you care about the longevity of your code base.