strings/string_basic.hpp
String Formatting of Basic Types

Introduction

This is a set of functions that perform string conversions and printing routines. There are functions here for converting integers to/from string representations as well as floating-point types to/from string representations. There are supporting functions for printing the same types.

There are two separate header files for including these functions:

The string formatting functions in string_basic.hpp are controlled by a set of enumeration types which have been separated out into the header format_types.hpp.

Radix Handling

C is a bit limited in its ability to display different radix (base) numbers. Basically you can print in decimal (e.g. 1234), octal (e.g. 01234) or hexadecimal (e.g. 0x1234) only. Sometimes you need other radices - in particular, binary and base 13.

To improve on this situation, the string formatting functions for integer types (but not for floating-point types) support all radices from base 2 to base 36 using the character set [0-9a-z]. They offer three different formatting options for showing the radix: no radix indication, C style and hash style.

Hash-style format starts with the base in decimal, a '#', then the number in the specified base. For example, 16#ff is 255 in hexadecimal. The advantage of this format is that it can be applied to any base. Thus 36#zz is a valid hash-style number. Its value is left as an exercise for the reader.

Hash-style format is a sign-magnitude format. A negative value has the sign after the # character: 16#-ff represents -255 in hexadecimal.
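
The hash-style rules above can be sketched with the standard library alone. This is an illustration and not the STLplus implementation; the helper name to_hash_style is hypothetical.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of STLplus: formats value in hash style as
// <radix in decimal>#<sign><digits>, using the digit set [0-9a-z].
std::string to_hash_style(long value, unsigned radix)
{
  if (radix < 2 || radix > 36)
    throw std::invalid_argument("radix must be in the range 2-36");
  static const char digits[] = "0123456789abcdefghijklmnopqrstuvwxyz";
  // Work on the magnitude as unsigned so that the most negative value is safe.
  unsigned long magnitude = value < 0 ? 0UL - static_cast<unsigned long>(value)
                                      : static_cast<unsigned long>(value);
  std::string text;
  do
  {
    // Digits are produced least significant first, then reversed.
    text += digits[magnitude % radix];
    magnitude /= radix;
  } while (magnitude != 0);
  std::reverse(text.begin(), text.end());
  // Sign-magnitude: the sign goes after the '#', before the digits.
  return std::to_string(radix) + "#" + (value < 0 ? "-" : "") + text;
}
```

With this sketch, to_hash_style(255, 16) gives 16#ff and to_hash_style(-255, 16) gives 16#-ff.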

Each integer formatting function takes as an argument an enumeration of type stlplus::radix_display_t specifying which formatting style to use for the output. The type has the following values:

stlplus::radix_none
Just print the number with no radix indicated
stlplus::radix_hash_style
none for decimal, hash style for all others
stlplus::radix_hash_style_all
hash style for all radices including decimal
stlplus::radix_c_style
C style for hex and octal, none for others
stlplus::radix_c_style_or_hash
C style for hex and octal, none for decimal, hash style for others

Note that the only styles that are guaranteed to give a value that can be correctly converted back to an integer again are: radix_hash_style, radix_hash_style_all and radix_c_style_or_hash. The last of these is the recommended style for all printing since it is the most natural combination - decimal is printed as a number (e.g. 1234), binary, octal and hex are in familiar C-style (e.g. 0b0100100, 01234 or 0x1234) and all other bases are in hash style (e.g. 4#3210). Indeed, radix_c_style_or_hash is the default format for all the string formatting functions.
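
The way each style selects a prefix can be summarised in a short sketch. The enumeration below is a local copy of the documented values so that the code compiles without STLplus, and radix_prefix is a hypothetical helper. Treating binary as C style (0b) is an assumption taken from the 0b0100100 example above.

```cpp
#include <stdexcept>
#include <string>

// Local mirror of the documented values, so the sketch compiles without STLplus.
enum radix_display_t
{
  radix_none,
  radix_hash_style,
  radix_hash_style_all,
  radix_c_style,
  radix_c_style_or_hash
};

// Hypothetical helper: returns the radix prefix implied by each display style.
std::string radix_prefix(unsigned radix, radix_display_t display)
{
  if (radix < 2 || radix > 36)
    throw std::invalid_argument("radix must be in the range 2-36");
  const std::string hash = std::to_string(radix) + "#";
  // Assumption: binary counts as C style ("0b"), as in the example in the text.
  std::string c_style;
  if (radix == 16) c_style = "0x";
  else if (radix == 8) c_style = "0";
  else if (radix == 2) c_style = "0b";
  switch (display)
  {
  case radix_none:
    return std::string();
  case radix_hash_style:
    return radix == 10 ? std::string() : hash;
  case radix_hash_style_all:
    return hash;
  case radix_c_style:
    return c_style;
  case radix_c_style_or_hash:
    if (radix == 10) return std::string();
    if (!c_style.empty()) return c_style;
    return hash;
  }
  throw std::invalid_argument("invalid display style");
}
```

Note that radix_c_style returns an empty prefix for base 4, which is why a value printed that way cannot be reliably converted back without being told the radix.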

Real Format Handling

When formatting real numbers as strings, there are three formats supported. These are controlled by the enumeration type stlplus::real_display_t. It has the following values:

stlplus::display_fixed
This displays the real number as a fixed-point value; that is, it has no exponent (Ennn) part, just the mantissa. This is equivalent to the C format "%f".
stlplus::display_floating
This displays the number as a floating-point value for any value; that is, it always has an exponent part, even if the exponent is zero. This is equivalent to the C format "%e".
stlplus::display_mixed
This selects whichever of the above formats is most appropriate for the value. For small exponents, it will use fixed point format, whilst for large exponents (positive or negative) it will use the floating point format. This is equivalent to the C format "%g".
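
The three equivalent C formats can be compared side by side with std::snprintf. This is a sketch using only the standard library; format_real is a hypothetical helper and not an STLplus function.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper, not part of STLplus: formats value with one of the
// printf conversions 'f', 'e' or 'g' (anything else is treated as 'g').
std::string format_real(double value, char conversion, int precision = 6)
{
  const char* format = "%.*g";
  if (conversion == 'f') format = "%.*f";
  else if (conversion == 'e') format = "%.*e";
  // First call measures the output, second call writes it.
  const int length = std::snprintf(nullptr, 0, format, precision, value);
  if (length < 0) return std::string();
  std::string text(static_cast<std::size_t>(length) + 1, '\0');
  std::snprintf(&text[0], text.size(), format, precision, value);
  text.resize(static_cast<std::size_t>(length));
  return text;
}
```

For the value 1234.5 this yields 1234.500000 with 'f', 1.234500e+03 with 'e' and 1234.5 with 'g'.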

Conversion from Integer to String

There is a whole family of functions called type_to_string which take an integer type and format it into a std::string. The parameter profile of these functions is:

std::string stlplus::type_to_string(type i,
                                    unsigned radix = 10,
                                    stlplus::radix_display_t display = stlplus::radix_c_style_or_hash,
                                    unsigned width = 0);

In this case, type is any integer type - namely bool, short, unsigned short, int, unsigned, long and unsigned long. For two-word types such as "unsigned long", the function name uses an underscore to make the function name unsigned_long_to_string.

The width parameter specifies the minimum number of digits to use to represent the value. The result may be larger than this if the value doesn't fit in the specified width. The default of 0 means use the minimum number of digits to represent the value. Any prefix that indicates the radix is in addition to this, so if you ask for, for example, zero in hexadecimal using C style with a width of 4, you will get 0x0000. Using hash style will give 16#0000.
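
The width rule can be illustrated for a single case, C-style hexadecimal. The helper hex_with_width below is hypothetical and only mimics the behaviour described above; it is not the STLplus code.

```cpp
#include <string>

// Hypothetical helper, not part of STLplus: C-style hexadecimal with a
// minimum number of digits. The "0x" prefix is added on top of the width.
std::string hex_with_width(unsigned long value, unsigned width)
{
  static const char digits[] = "0123456789abcdef";
  std::string text;
  do
  {
    // Build the digits from least significant to most significant.
    text.insert(text.begin(), digits[value % 16]);
    value /= 16;
  } while (value != 0);
  // Pad with leading zeros only when the value needs fewer digits than width.
  if (text.size() < width)
    text = std::string(width - text.size(), '0') + text;
  return "0x" + text;
}
```

Here hex_with_width(0, 4) gives 0x0000, while a value that needs more than four digits is returned in full rather than truncated.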

The exception std::invalid_argument will be thrown if the radix is not in the range 2-36 or the display enumeration is illegal.

The default values mean that the functions can be used with just a single parameter:

std::string s = stlplus::int_to_string(i);

In this case, the output will be in decimal with no formatting codes (since radix_c_style_or_hash prints decimal as just a simple number).

There is one last form of to_string in this set that is worth noting:

std::string stlplus::address_to_string(const void*,
                                       unsigned radix = 16,
                                       radix_display_t display = radix_c_style_or_hash,
                                       unsigned width = 0);

This prints out an address as a number (any address, since in C any pointer can be treated as a void*). The default radix is set to 16 because most people expect addresses to be in hex.
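
One portable way to treat an address as a number is to convert it to std::uintptr_t. The sketch below shows the idea with a hypothetical helper, address_as_hex; how STLplus does this internally is not stated here.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical helper, not part of STLplus: renders a pointer as C-style hex.
std::string address_as_hex(const void* address)
{
  // std::uintptr_t is an unsigned integer type able to hold a pointer value.
  const std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(address);
  std::ostringstream out;
  out << "0x" << std::hex << bits;
  return out.str();
}
```

Any object pointer converts implicitly to const void*, so the helper can be called as address_as_hex(&object) without a cast.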

Conversion from String to Integer

These functions do the reverse conversion, taking a string as an argument and returning the integer value represented. They recognise the normal C-style formatting and the hash-style formatting so can read a string written in any base.

The integer conversion functions are of the form:

type string_to_type(const std::string& value, unsigned radix = 0);

where type is bool, short, unsigned short, int, unsigned int, long or unsigned long.

A radix of 0 means that the radix is worked out from the string, falling back to decimal if the string gives no indication of its radix. Any other value makes that radix the fallback instead. Thus if you have a number which has been printed using radix_none but with a radix of 32, you can convert it back to an integer by specifying a radix of 32. However, any number printed using the default radix_c_style_or_hash will be read correctly without specifying a conversion radix.

The exception std::invalid_argument will be thrown if a radix is specified outside the range 2-36.
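
The radix detection described above can be sketched as follows. This is a simplified illustration and not the STLplus parser: parse_integer is a hypothetical name, C-style prefixes are only detected when the radix argument is 0, the 0b prefix for binary is an assumption, and overflow is not checked.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical parser, not the STLplus implementation. A radix of 0 means
// "detect the radix from the text, falling back to decimal".
long parse_integer(const std::string& text, unsigned radix = 0)
{
  if (radix != 0 && (radix < 2 || radix > 36))
    throw std::invalid_argument("radix must be 0 or in the range 2-36");
  unsigned base = radix == 0 ? 10 : radix;
  std::size_t pos = 0;
  const std::size_t hash = text.find('#');
  if (hash != std::string::npos)
  {
    // Hash style: the radix is written in decimal before the '#'.
    base = static_cast<unsigned>(std::stoul(text.substr(0, hash)));
    if (base < 2 || base > 36)
      throw std::invalid_argument("radix in text must be in the range 2-36");
    pos = hash + 1;
  }
  // In hash style the sign follows the '#'; otherwise it leads the text.
  bool negative = false;
  if (pos < text.size() && (text[pos] == '-' || text[pos] == '+'))
  {
    negative = text[pos] == '-';
    ++pos;
  }
  if (hash == std::string::npos && radix == 0)
  {
    // C-style prefixes (simplification: only tried when no radix is forced).
    if (text.compare(pos, 2, "0x") == 0) { base = 16; pos += 2; }
    else if (text.compare(pos, 2, "0b") == 0) { base = 2; pos += 2; }
    else if (text.size() > pos + 1 && text[pos] == '0') { base = 8; pos += 1; }
  }
  if (pos >= text.size())
    throw std::invalid_argument("no digits found");
  long value = 0;
  for (; pos < text.size(); ++pos)
  {
    const char c = text[pos];
    unsigned digit = 36; // sentinel: not a valid digit in any supported radix
    if (c >= '0' && c <= '9') digit = static_cast<unsigned>(c - '0');
    else if (c >= 'a' && c <= 'z') digit = static_cast<unsigned>(c - 'a') + 10;
    else if (c >= 'A' && c <= 'Z') digit = static_cast<unsigned>(c - 'A') + 10;
    if (digit >= base)
      throw std::invalid_argument("invalid digit for radix");
    value = value * static_cast<long>(base) + static_cast<long>(digit);
  }
  return negative ? -value : value;
}
```

In this sketch, text without any radix indication, such as a base-32 number printed with radix_none, is only read correctly when the caller passes the matching radix.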

And finally, there is the reverse conversion for addresses:

void* string_to_address(const std::string& value, unsigned radix = 0);

Conversion from Real to String

There are two type_to_string functions which format the two C++ real types to a string representation. These are:

std::string stlplus::float_to_string(float f,
                                     stlplus::real_display_t display = stlplus::display_mixed,
                                     unsigned width = 0,
                                     unsigned precision = 6);

std::string stlplus::double_to_string(double f,
                                      stlplus::real_display_t display = stlplus::display_mixed,
                                      unsigned width = 0,
                                      unsigned precision = 6);

The default values are chosen to give reasonable displays for most applications. The default format is display_mixed (equivalent to "%g") with a precision of 6 and no field width, which gives the minimum field width. See dprintf.hpp for the meanings of the precision and field width for floating-point numbers.

Conversion from String to Real

Once again there are two conversions from string to real types, one for each C++ real type. These are:

float stlplus::string_to_float(const std::string& value);
double stlplus::string_to_double(const std::string& value);

These conversions will accept strings formatted in any of the formats which can be used by the real to_string functions, so there is symmetry here.
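
The symmetry can be demonstrated with the standard library, which also accepts fixed, exponent and mixed notation. The helper parse_real is hypothetical and uses std::stod; it is a stand-in for illustration, not the STLplus conversion.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of STLplus: parses fixed, exponent or mixed
// notation with std::stod and rejects trailing characters.
double parse_real(const std::string& text)
{
  std::size_t consumed = 0;
  // std::stod throws std::invalid_argument if no number can be read.
  const double value = std::stod(text, &consumed);
  if (consumed != text.size())
    throw std::invalid_argument("unexpected characters after the number");
  return value;
}
```

The strings 1234.500000, 1.234500e+03 and 1234.5 all parse to the same value, 1234.5.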

Printing Functions

In parallel with the set of string conversion routines, there is a set of print routines for the same set of types. Typically these will have the same parameters as the string-formatting functions, but take an extra first argument which is the IOStream output device to print to.

Printing Basic Types

The integer print routines have the following profile:

void stlplus::print_type(std::ostream& str, type value,
                         unsigned radix = 10,
                         stlplus::radix_display_t display = stlplus::radix_c_style_or_hash,
                         unsigned width = 0);

In this case, type is any integer type - namely bool, short, unsigned short, int, unsigned, long and unsigned long.

The extra parameters have the same meaning as for the to_string functions.

Floating-point types are handled similarly, where type is float or double:

void stlplus::print_type(std::ostream& str, type f,
                         stlplus::real_display_t display = stlplus::display_mixed,
                         unsigned width = 0,
                         unsigned precision = 6);
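
The relationship between a print routine and its string-formatting counterpart can be sketched as a thin wrapper that formats and then streams. Both function names below are hypothetical, and the sketch covers only the mixed ("%g") format.

```cpp
#include <cstdio>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical wrapper, not part of STLplus: formats value with "%g" and
// writes the result to the given output stream.
void print_real_mixed(std::ostream& str, double value, int precision = 6)
{
  char buffer[64];
  // snprintf never writes past the buffer; very large precisions would truncate.
  std::snprintf(buffer, sizeof(buffer), "%.*g", precision, value);
  str << buffer;
}

// Usage: any std::ostream works, including an in-memory string stream.
std::string print_to_string(double value)
{
  std::ostringstream out;
  print_real_mixed(out, value);
  return out.str();
}
```

Taking the stream as the first argument means the same routine serves std::cout, file streams and string streams alike.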