portability/inf.hpp
An Infinite-Precision Integer Type

Introduction

This component is an infinite-precision integer type which, through operator overloading, has been made to look as much as possible like a built-in integer type. Thus you can add infs with a + b and increment it using i++. It provides a full set of arithmetic, logical and shifting operations, type conversions to and from all the C++ integer types, bitwise manipulation and string representation and conversion in any base.

The inf class has been written for portability more than performance. If you need a high-performance numeric library it is recommended that the Gnu Multi-Precision (GMP) library is used; it is seriously fast.

Declaring and Initialising inf objects

The inf object looks a bit like an integer type and can be declared and initialised in a similar way. However, there is one thing you cannot do, and that is simply assign an integer value to it on creation. In other words, the following will not work:

inf b = 0;  // illegal

The reason for this is that, although the inf type has constructors that take integer arguents, I've made them explicit which prevents them being used by C++ to perform implicit type conversions (all 1-parameter constructors are normally considered candidates for type conversions). I was concerned that C++ compilers can be over-keen on such type conversions and create nasty surprises for users. Therefore, to initialise to an integer value, use the C++ constructor form:

inf c(0);

You may not realise this, but you can use this notation even with the standard built-in integer types.

The other significant difference is that an uninitialised inf will have the value zero - there is no such thing as an unknown value.

For example, here is a way of creating an inf with the value zero:

inf a;

An inf can be initialised from an expression using any C++ integer type - int, short, long and the unsigned variants of any of these. It can also be initialised from another inf.

Typically, when using very large numbers, you will need to use string representations to handle the initial value. The inf type will take a string containing an integer value and convert it into an inf. The format of the string can be any of the formats defined in string_utilities, so it can be a straight decimal integer ("12345"), an octal integer ("012345"), a hexadecimal integer ("0x12345") or a hash-style integer ("12#12345"). Here's how you'd initialise an inf with a googol (10100):

inf googol("10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000");

Assignments

A full set of assignments is supported, from the six built-in integer types, from a string representation and from another inf. These assignments have the same characteristics as the initialisations just discussed.

Note that assignments are handled differently from initialisation by C++, so you get the following strange effect:

inf d = 0; // illegal
d = 0;     // legal

This seems strange until you realise that the first '=' operator is an initialisation and in fact calls a constructor and not an assignment operator, whilst the second is a true assignment.

Constant Values

The last sections showed how an inf can be initialised or assigned from a built-in integer type. However, sometimes an intermediate value of type inf is required. I have disallowed implicit type conversion from in integer type to an inf because, from experience, I know that allowing implicit type conversions from integer types can cause the compiler to become overly fond of type conversions to inf. You lose a lot of compiler checking as a result. Therefore, intermediate values must be explicitly created. For example, you cannot do this:

a += 25;    // illegal

This is not legal because the += operator requires an inf as the argument and in this case we have an int. The solution is to use the temporary-constructor notation to form a temporary of type inf:

a += inf(25);

This notation can be seen as the creation of an anonymous temporary object of type inf which is initialised with an int argument. The resultant temporary can then be passed as the right-hand argument of the += operator.

Converting an inf to an integer type

The inf class has a set of member functions which convert back to the basic C++ integer types. To convert to an int, use the to_int member function:

inf a = "12345";
int i = a.to_int();

The full set of six conversions are available (for short, int and long and their unsigned variants):

Note that the names to_unsigned and to_uint are synonyms - my preference is to think of unsigned as a type rather than as a variant of int (I write "unsigned" not "unsigned int"). I provided the to_uint for those who think of unsigned as a modifier to int.

All of these methods have a bool argument which tells the method whether to range-check the result before the conversion. For example, if the value is too big for type short, then to_short has two possible responses. If checking is on, then it will throw the exception std::overflow_error. If checking is off then it simply truncates the value to fit the type, changing its value. The default is for checking to be off. The method actually looks like:

short inf::to_short(bool check = false) const throw(std::overflow_error)

Bit Manipulation

An inf has a 2's-complement representation with any number of bits. It follows the convention that bit 0 is the l.s.b. and the uppermost bit is the sign bit. It is possible to access the bits of an inf and to change them. First, you can get the number of bits in the representation and you can also set the number of bits:

unsigned bits = a.bits();
a.resize(bits + 1);

Furthermore, you can reduce the representation to the minimum size necessary to represent the value stored:

a.reduce();

The bits can be examined with either the bit function or the [] operator.

for (unsigned i = 0; i < a.bits(); i++)
  std::cout << "bit " << i << " = " << a[i] << std::endl;

The result of the [] operator is a bool. The one difference with the array index operator is that [] cannot be used as the target of an assignment. In other words you cannot do this:

a[0] = true;  // illegal

Instead, the bits of an inf can be set, cleared or preset to a bool value using the set, clear and preset functions. So here are two ways of setting bit 0 to true:

a.set(0);
a.preset(0, true);

The preset function is particularly useful when the bool value is the result of an expression rather than a constant as here.

Value Tests

There are a number of short-cut tests for the value of an inf so that you can determine whether it is zero, natural, positive or negative. These are member functions returning a bool:

if (a.negative())
  ...

There are also implicit tests for zero and non-zero for use in conditionals - either in if statements or in while loops (including the second field of a for loop):

if (a)  // tests for a having a value, i.e. non-zero
  ...
if (!b) // tests for b not having a value, i.e. zero
  ...

Comparisons

The inf type has a full set of comparison operators - ==, !=, <, <=, > and >=. Note however the comment earlier on the creation of intermediate values, so this is illegal:

if (a == 0)
  ...

You can fix this by either using the test for zero from the last section, or (especially for other non-zero values) using the temporary-constructor notation:

if (a == inf(0))
  ...

Bitwise Logical Operations

The inf type has a full set of bitwise logical operators which are divided into two sets. The first set are the composite operator-assignments or self-modifying operators such as &= which modify the inf:

a &= b;

This makes a the result of bitwise-anding the values of a and b.

The second set are the simple operators or value-returning opoerators which return the result of the logical operation but do not modify either argument:

c = a & b;

This makes c the result of the bitwise-anding of the values of a and b.

These operators all normalise the result to the maximum length of the two arguments, sign extends the shorter argument to the length of the longer one and then performs the bitwise operation on these normalised values. Thus, the bitwise-and of an 8-bit inf with a 16-bit inf is a 16-bit inf.

The self-modifying operations are: invert, &=, |= and ^= (xor). The value-returning operators are: ~, &, | and ^.

Shift Operators

The inf type can be shifted left or right. The shift distance is in fact represented by an unsigned, not an inf. As with the logical operators, there are both self-modifying and value-returning sets of these operators. The self-modifying operators are: <<= and >>=. The value-returning operators are: << and >>.

Thus an inf can be shifted left by two bits using:

a << 2;

Arithmetic Operations

The inf type has a full set of arithmetic operations. Once again (yawn) they are divided into self-modifying and value-returning functions. The self-modifying operators are: negate, abs, +=, -=, *=, /= and %=. The value-returning operators are: unary -, +, -, *, / and %.

It is worthwhile differentiating between the two versions of the abs operation. The self-modifying form is called as a member function:

a.abs();

This turns a into its absolute value. The value-returning form is called as a non-member function:

b = abs(a);

This makes b the absolute value of a without modifying a.

Note that the / and % operators can throw the inf::divide_by_zero exception.

Increment and Decrement Operators

The inf type can of course be used for mundane tasks such as loop counters and for this purpose has a full set of increment and decrement operators. There are two variants of these - pre-increment/decrement (e.g. ++i) and post-increment/decrement (e.g. i++). The pre-increment operator increments the value and returns the incremented value. the post-increment increments the value but returns the previous non-incremented value.

String Representation and I/O

An important part of the inf type is its string representation which allows the type to be printed to the screen and saved to and restored from a file. The string representation is consistent with the string representation of built-in integers as provided by the string_utilities package. This supports the standard C formats of decimal ("12345"), octal ("012345"), binary ("0b010001010") and hexadecimal ("0x12345"). It also supports the STLplus-specific hash format ("12#12345") which allows the use of any base from base 2 to base 36. There are conversions to/from string:

std::string to_string(const inf& i,
                      unsigned radix = 10,
                      radix_display_t display = radix_c_style_or_hash,
                      unsigned width = 0)
  throw(std::invalid_argument);

inf to_inf(const std::string&, unsigned radix = 0)
  throw(std::invalid_argument);

Note that these are non-member functions. See the string_utilities documentation for the meaning of the extra parameters. The default behaviour is a simple decimal number.

There are also IOStream input (>>) and output (<<) operators for the inf type so that integer values can be written to and read from a file or other I/O device:

inf a = "12345";
std::cout << "the value is " << a << std::endl;

and

inf a;
std::cin >> a;

As an example, here's the code to generate and display a googol. As you may know, a googol is a one followed by a hundred zeroes:

inf googol(1);
for (unsigned j = 0; j < 100; j++)
  googol *= inf(10);
std::cout << "a googol is:" << std::endl;
for (unsigned radix = 2; radix <= 36; radix++)
  std::cout << "base " << radix << " = " << to_string(googol, radix) << std::endl;

This code prints the googol in every radix from 2 to 36. The output looks like this:

a googol is:
base 2 = 0b0100100100100110101101001001011001010011000011011111001110101100001011001001111000010011000100110011100000101111110011100010101100111001000000100011100010000100011010011111001010101010110010010000110000100010101000001011101000111100010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
base 3 = 3#122012210112120112111212010011100001101211222101110010100012001010011011021010111212020100220020021122002200200010101000112122102122010002012010000000120120022011020201122101010221121011200012121021202022020101
base 4 = 4#10210212231021121103003133032230023021320103010303200233303202230321000203202010122133022222302100300202220023220330100000000000000000000000000000000000000000000000000
base 5 = 5#102414221203323202133113331031102220100330010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
base 6 = 6#225421520555002405205021514331412332420034405541334340545305400514013453222020504101024233010145312244402025311325050123301452144
base 7 = 7#16201341553122251063252024261246503522112115506446252526241360534151125226544036056624134325461423451523416401660341314
base 8 = 00444465511312303371654131170230463405763425471004342043237125262206042501350742000000000000000000000000000000000
base 9 = 9#565715515455104301354871403305033134233455210806248080603330478378102163000516264221571127534605537668211
base 10 = 10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
base 11 = 11#107611186a648903471813259295629384570114251063297959276735639698066525944868937609960a3a7a17391a1
base 12 = 12#52377002841b241b07a24b908b19a6959ab7a08120693b00991a1299b018ab70b11144a6506633388a9046ba11054
base 13 = 13#72c7a1051523285a7244587a82c6184b210239756b3033791a1381197a005ccbbc9a54461668a53a8589393273
base 14 = 14#1d15ba97ca20a0b65db2c537911c9cb06dc7c31084bd85a25bdb6540a6a9b3b1347c6055a2b633cbadb09b44
base 15 = 15#11251839c078059010e1ae8598a337dcecbec6ceb0034ed2d436ec35cac6406a2de1e31c5b3d007905ce6a
base 16 = 0x1249ad2594c37ceb0b2784c4ce0bf38ace408e211a7caab24308a82e8f10000000000000000000000000
base 17 = 17#22b12972d2793de45bcaaae50d2a3fgcaacgd5e6d5c4937d52g8fb8d33a2a8c6d5bgbec4e420780ff4
base 18 = 18#6ec1bcf4688309gh806h932adcc44eeg6de0fe9fahde66dgh108c9g3623e045a0h7a95ab594ce99a
base 19 = 19#1f6dcb713d74f8gac9c5agi2e0ecb55da61fi9ae08e7hchf2b236672e128h29gfgccc8874hhc8a9
base 20 = 20#d4dj27751e811e02jc8j525841500000000000000000000000000000000000000000000000000
base 21 = 21#6h307dg64gac07d579f617g0014878b0gja0e6j69ddh5h541k343520iii6f48e5cdg26dfi454
base 22 = 22#4cflf34gchk5jcc43fa5flhb96c2jf5f3572g5afh42dj31a12ga9hdjk94ah2ck1ccl4i9j0ac
base 23 = 23#3l6adc1kjjfb6j7eh7e9d20gdg43l07ccad1a8aea3gdebi38g5da3hhkcb9aj9e7a8jca817d
base 24 = 24#453j58dkc2beg2k1i3m4c67bba9732e6k4jeed4eg5dh5ndfenc1f0i3301c320c0dm7adf2g
base 25 = 25#5e9c73hh28g8i5g5ca53f100000000000000000000000000000000000000000000000000
base 26 = 26#8oikgc2dn19n3hd0gd7ch09157j49md2mcejjlj2ak237pn1p3he8p0khnn9bomijgk20gg
base 27 = 27#h5lefedn3491amqac395134473dn69o67h2ii3a0ehbh325300ff846jha3pg4i5g7k86a
base 28 = 28#1b5eegphn0cfn4ae1la8logkfjiq3gk2e6nik0hfnmn26i9gckdk7bh7n1q8g3i9nepk94
base 29 = 29#3hg0h7pl6s4qb96hp43oepi8o7helm5rf6i6ngekhbsmln09c6qle0k78ee7f8mlkjmsg
base 30 = 30#anhmc58j7lglcg33hodnk7sndpa3aj70j1rppi9f83ktbpktjjg6oft86bqa44opkb3a
base 31 = 31#16532c98ncbeel7rel86l7845f0bhkq3cg6dq10e9o3deb14sbclrn8efmm462nsapg5
base 32 = 32#4i9lkip9grstc5if164po5v72me827226jslap462585q7h00000000000000000000
base 33 = 33#jq9lr1s39q0wlawtn6vwll8ob7qaskkiupbe9fhge89wfh2tn7rllf8f5afpm2g7i1
base 34 = 34#2sn7biwwnggi76tqbhc2muect6iraeb1f6nbcr0ahi3wxxe2hm8hxdtjqeegp6ttg4
base 35 = 35#f4aydl5wlj4rsxp7lpc21v0vb700c5bdkncfb881jck8oo7d2x7t6m0bjhouw1cfp
base 36 = 36#2hqbczu2ow52bala8lgc3s5y9mm5tiy0vo9tke25466gfi6ax8gs22x7kuu8l1tds

Note how hash format has been used except for base 10, which uses no prefix, base 2 which uses 0b, base 8 which uses 0 and base 16 which uses 0x.

Exceptions

The inf class can throw exceptions, though of course only under exceptional conditions. The following exceptions can be thrown:

std::out_of_range
Thrown by bit-index operations if the indexed bit is out of range of the number of bits.
std::overflow_error
Thrown by conversion functions when converting to C++ integer types if the inf's value will not fit
std::invalid_argument
Thrown by the string conversion functions if the string does not have the right format for an integer value
inf::divide_by_zero
Thrown by the divide (/) and remainder (%) operators if the right-hand argument is zero.

Note that all but the last of these are defined in the std:: namespace. This is because they are standard exceptions and are defined in the standard library header <stdexcept>. The last exception is declared within the inf class and is therefore in the inf:: namespace. This was defined because there isn't an appropriate standard exception for this job.

All of the exceptions that can be thrown by inf are derivatives of the baseclass std::logic_error, so you can catch any of them with a single catch clause:

try
{
  ...
}
catch(std::logic_error& except)
{
  ...
}

All logic error exceptions store a string with an error message in it. This can be used to give more information on the cause of the error:

catch (std::logic_error& e)
{
  std::cout << "caught exception " << e.what() << std::endl;
}