error_handler - A General-Purpose Message Handler

Introduction

The error handler subsystem is a message handler which performs all message printing in an application. It controls the routing of messages to the current output device and does all the message formatting according to a set of message formats for each kind of message. The formatting is fully configurable.

Messages are stored in a file rather than hard-coded into the application. Each message is referred to by a unique ID, which is a string. When a request to print a message is received, the message is retrieved by its ID from the message file, formatted according to the formatting rules in place at the time, has arguments substituted and then is printed to the current output device.

The error handler supports both simple messages and positional messages. A positional message is one that relates to a particular position in a file; a compiler, for example, would use one to show where a syntax error occurs in a source file. Simple messages are not related to a file; they are just plain message text.

There are four levels of message: information, warning, error and fatal. These have well-defined meanings so there should be no ambiguity as to which one to use.

For applications such as compilers, the error handler can keep count of the number of errors that have been reported and throw an exception when an error limit has been reached. This is user-friendly in that it prevents the user becoming swamped with error messages.

For applications where processing goes through a number of deeper and deeper stages such that an error message would be so far removed from the user input it would be meaningless, the error handler supports the concept of a context stack. For example, if you are writing an assembler (I wonder where I get my examples from) and inlining a function, you could push the original function call onto the context stack. Then, any error messages will be printed with supplementary messages showing the context of the error.

Initialisation

The error handler is implemented as a class. You create an object of that class and then use that object to manage all your messages. Typically, you would create the error handler in main() and then pass it to all functions that need to do message printing.

The constructor takes 1-4 arguments to initialise the error handler. Start with the simplest constructor:

error_handler::error_handler(
  otext& device,
  unsigned limit = 0,
  bool show = true)
    throw();

The device argument is a TextIO output device which is used as the destination for all messages. For command-line tools it will probably be fout (standard output) or ferr (standard error) as defined in FileIO.

The limit argument is the error limit. If set to zero (the default) there is no limit to the number of errors that can be printed. Otherwise, if the stated limit of error messages is reached, the error handler will throw an error_handler_limit_error exception which can be caught later to terminate an operation.

The show argument controls whether positional messages will also display the original source text with an indicator showing where the error occurred. This will typically be true for command-line tools and false for GUI commands. For example, a VHDL to Icode assembler might generate the following error with show set to true:

"test10.vhdl" (18,12) : error : this builtin equality operator doesn't have a module mapping
      if p1 = null then
            ^

Setting show to false gives a reduced form of the message:

"test10.vhdl" (18,12) : error : this builtin equality operator doesn't have a module mapping

This reduced form only makes sense within a context-sensitive editor, as might be found in a GUI, which would highlight line 18, column 12 of the source file "test10.vhdl".

So far, no message file has been loaded. There is a modified form of the constructor that does this for a single file, or message files (plural) can be added to the error handler explicitly. The latter option will be handled in the next section. The modified constructor looks like this:

error_handler::error_handler(
  otext& device,
  const std::string& message_file,
  unsigned limit = 0,
  bool show = true)
    throw(error_handler_read_error);

The only difference is that this constructor takes the name of a message file as its second argument. Note also that the constructor can throw an error_handler_read_error exception in the case where reading the message file fails, either because the file doesn't exist or because there was a format error in the file. This will be dealt with in more detail in the section on Handling Message Files.

Handling Message Files

The message handler can read in messages from any number of message files. Each file is added to the message handler by the add_message_file function:

void error_handler::add_message_file(const std::string& message_file)
  throw(error_handler_read_error);

The message file will be read into the error handler and the messages stored ready for use. If any errors are found, the error_handler_read_error exception is thrown.

The error_handler_read_error exception contains an error_position field, which stores the file position of the error. This includes the filename, the line number and the column number. If the file was missing, the line number will be zero. Otherwise, the line number refers to the message file line number, numbered from 1. The column number refers to the character position in the line, numbered from 0. Incidentally, the numbering of lines from 1 and columns from 0 is a common, almost standard, convention.

The message file format is very simple. Each message declaration is in the form:

<ID> <spaces> <text>

An <ID> is a unique mnemonic for the error message. It starts with an alphabetic character and may contain alphanumerics and underscores only. It must be unique across all the message files loaded into the error handler. If a duplicate message ID is found, an error_handler_read_error exception will be thrown containing the position of the second declaration.

The <spaces> can be one or more space or tab characters.

The <text> is the remainder of the line up to the newline and is plain text (not a quoted string).

All lines starting with a non-alphabetic character are assumed to be comments and are ignored.

Here's an example, trivial message:

HELLO           Hello World!

This defines a message with an ID of "HELLO" and with message text "Hello World!".
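
The line format above is simple enough to sketch in a few lines of code. The following is only an illustration of the rules as stated, not the error handler's own parser: the names message_entry and parse_message_line are invented for this example, and unlike the real handler it does not diagnose malformed lines or duplicate IDs.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical result type for one parsed line of a message file.
struct message_entry
{
  bool is_message;    // false for comment (or empty) lines
  std::string id;
  std::string text;
};

// Splits a line into <ID> <spaces> <text>. Lines that do not start with an
// alphabetic character are treated as comments, following the rules above.
message_entry parse_message_line(const std::string& line)
{
  message_entry entry{false, "", ""};
  if (line.empty() || !std::isalpha(static_cast<unsigned char>(line[0])))
    return entry;
  // The ID is the leading run of alphanumerics and underscores
  std::string::size_type i = 0;
  while (i < line.size() &&
         (std::isalnum(static_cast<unsigned char>(line[i])) || line[i] == '_'))
    ++i;
  entry.id = line.substr(0, i);
  // Skip the spaces and tabs separating the ID from the text
  while (i < line.size() && (line[i] == ' ' || line[i] == '\t'))
    ++i;
  // Everything left on the line is the message text
  entry.text = line.substr(i);
  entry.is_message = true;
  return entry;
}
```

Passing the HELLO line from the example to parse_message_line gives an entry with id "HELLO" and text "Hello World!".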

Simple messages like this are not much use. Most messages need to have arguments inserted into them. Arguments are specified in the message file by the substitution strings "@0", "@1" etc. The index number (0, 1 etc.) refers to the order in which the arguments are passed to the message printing functions. Note that they are numbered from zero, not from one. There is no limit to the number of arguments that can be substituted and the index number can have more than one digit (e.g. "@10"). The use of numbered arguments was chosen in preference to C-style substitutions because it allows a message to be reworded in a different order without changing the source code. This becomes particularly useful if we ever get round to multi-lingual support since different human languages will be worded with arguments in a different order.

A trivial message with a single argument is:

IAM             My name is @0!

And in the French version:

IAM             Je m'appelle @0!
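
To make the substitution rule concrete, here is a minimal sketch of "@N" expansion. It is a stand-in written for this document, not the error handler's implementation: the function name substitute is invented, and it reports a bad index with std::out_of_range where the real handler would throw its own format exception.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: replaces each "@N" in format with args[N].
// Throws std::out_of_range if N is not a valid index into args.
std::string substitute(const std::string& format,
                       const std::vector<std::string>& args)
{
  std::string result;
  for (std::string::size_type i = 0; i < format.size(); )
  {
    if (format[i] == '@' && i + 1 < format.size() &&
        std::isdigit(static_cast<unsigned char>(format[i + 1])))
    {
      // Read every digit so that "@10" means argument 10, not "@1" then "0"
      std::string::size_type index = 0;
      for (++i; i < format.size() &&
                std::isdigit(static_cast<unsigned char>(format[i])); ++i)
        index = index * 10 + static_cast<std::string::size_type>(format[i] - '0');
      if (index >= args.size())
        throw std::out_of_range("argument index exceeds arguments supplied");
      result += args[index];
    }
    else
    {
      // Ordinary character (or an '@' not followed by a digit): copy as-is
      result += format[i++];
    }
  }
  return result;
}
```

Because arguments are selected by index, the English and French versions of IAM can be swapped without touching the calling code, and a message such as "@1 before @0" is free to reorder its arguments.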

Format Control

The format of messages is controlled by built-in format strings which can be customised for a particular application. The format strings use the same argument substitution rules as the messages themselves. It is possible to individually customise all message types: information, supplement, context, warning, error and fatal messages (see later for an explanation of context messages). There is yet another format string for positional messages.

The default message formats are:

information
@0
context
context: @0
supplement
supplement: @0
warning
warning: @0
error
error: @0
fatal
FATAL: @0

The information/context/supplement/warning/error/fatal formats take a single argument which is the formatted message. The @0 argument will be replaced by the message from the message file with the arguments substituted.

For positional messages, the simple message text created by the rules above is further substituted into a positional format string. The positional format string takes up to 4 arguments:

@0
the formatted message text
@1
the source file name
@2
the line number
@3
the column number

You can miss out a part of this (e.g. the column number) by simply not including the argument number in the format string. For example, use the position format string "file: @1, line: @2: @0" to eliminate the column information. I can't think of any legitimate reason for excluding any field apart from the column number.

The default formats can be overridden by the set_****_format functions:

void error_handler::set_information_format(const std::string& format) throw();
void error_handler::set_context_format(const std::string& format) throw();
void error_handler::set_supplement_format(const std::string& format) throw();
void error_handler::set_warning_format(const std::string& format) throw();
void error_handler::set_error_format(const std::string& format) throw();
void error_handler::set_fatal_format(const std::string& format) throw();
void error_handler::set_position_format(const std::string& format) throw();

Note: at present there is no sanity check on the format strings. It is up to you to get them right. Mistakes here could lead to error_handler_format_error being thrown when you try to write a message.

Format strings can be inserted into the message file itself. The defaults are equivalent to having the following messages in a message file:

INFORMATION  @0
CONTEXT      context: @0
SUPPLEMENT   supplement: @0
WARNING      warning: @0
ERROR        error: @0
FATAL        FATAL: @0
POSITION     "@1" (@2,@3) : @0
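
The two-stage formatting of a positional message can be sketched as follows, using the default ERROR and POSITION formats listed above. This is an illustration only: expand and positional_error are names invented here, and expand is a deliberately simplified substitution that handles single-digit indices and leaves out-of-range ones untouched.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Simplified stand-in for the handler's substitution: single-digit "@N" only.
// An index with no matching argument is copied through unchanged.
std::string expand(const std::string& format,
                   const std::vector<std::string>& args)
{
  std::string out;
  for (std::string::size_type i = 0; i < format.size(); ++i)
  {
    if (format[i] == '@' && i + 1 < format.size() &&
        format[i + 1] >= '0' && format[i + 1] <= '9' &&
        static_cast<std::string::size_type>(format[i + 1] - '0') < args.size())
      out += args[static_cast<std::string::size_type>(format[++i] - '0')];
    else
      out += format[i];
  }
  return out;
}

// Builds a positional error line from the default formats.
std::string positional_error(const std::string& text, const std::string& file,
                             unsigned line, unsigned column)
{
  const std::string error_format = "error: @0";
  const std::string position_format = "\"@1\" (@2,@3) : @0";
  // Stage 1: wrap the message text in the severity format
  const std::string message = expand(error_format, {text});
  // Stage 2: substitute that result and the position into the position format
  return expand(position_format,
                {message, file, std::to_string(line), std::to_string(column)});
}
```

Calling positional_error("undefined symbol", "test10.vhdl", 18, 12) yields the line "test10.vhdl" (18,12) : error: undefined symbol, with the file name in double quotes.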

Source File Position Specifications

Many functions in the error handler require file positions to be specified. A file position means the combination of the source file name, a line number and a column number. This is encapsulated in a class called error_position:

class error_position
{
public:
  error_position(const std::string& filename, unsigned line, unsigned column);

  const std::string& filename(void) const;
  unsigned line(void) const;
  unsigned column(void) const;

  friend otext& operator << (otext&, const error_position&);
};

Source Code Display Control

This feature controls whether the error handler shows positional messages with two lines of supplementary text showing the source code line in error and an arrow showing the column in which the error was found. This is a boolean switch controlled by two functions:

void error_handler::show_position(void);
void error_handler::hide_position(void);

Printing Messages

There are 4 classes of message: information, warning, error, fatal. They have the following interpretation and behaviour:

information
Used for progress messages, status messages etc.
warning
Indicates that a problem has been found but there is a sensible, well-defined way of proceeding.
error
Indicates that a problem has been found and the operation will fail - processing may continue but only to find further errors. May throw an error_handler_limit_error if the error limit is reached.
fatal
An internal (programming) error has been found and the operation is stopping NOW. It does this by throwing error_handler_fatal_error.

In addition to this behaviour, be aware that all print functions could throw either an error_handler_id_error or an error_handler_format_error.

There are also 2 kinds of message, simple and positional:

simple
just a text message
positional
a message relating to a source file and a specific position in that file

This gives 8 variants. For each variant there is a general-purpose print function which takes a vector of strings as a parameter. This vector of strings is used as the arguments to substitute into the message. For example, here's the general form of the simple informational message handler:

bool error_handler::information(
  const std::string& id, 
  const std::vector<std::string>& args)
    throw(error_handler_id_error,error_handler_format_error);

The first parameter is the ID of the message to print and the second is the set of arguments to substitute into that message. Typically the vector will be built up by using the vector::push_back function.

This general-purpose function is a bit clumsy in practice but has the advantage that it has no limit on the number of arguments that can be passed. To make life easier in the majority of cases, short-cut forms are provided for 0 to 3 argument messages. For informational messages these are:

bool error_handler::information(const std::string& id)
  throw(error_handler_id_error,error_handler_format_error);
bool error_handler::information(const std::string& id,
                                const std::string& arg1)
  throw(error_handler_id_error,error_handler_format_error);
bool error_handler::information(const std::string& id,
                                const std::string& arg1,
                                const std::string& arg2)
  throw(error_handler_id_error,error_handler_format_error);
bool error_handler::information(const std::string& id,
                                const std::string& arg1,
                                const std::string& arg2,
                                const std::string& arg3)
  throw(error_handler_id_error,error_handler_format_error);

Each of these functions also has a positional equivalent. The position is placed before the id parameter.

For example:

bool error_handler::information(const position&,
                                const std::string& id,
                                const std::vector<std::string>& args)
  throw(error_handler_id_error,error_handler_format_error);

This set of permutations is repeated for the warning, error and fatal message handlers.

The error functions can throw an extra error_handler_limit_error exception if the error limit is reached. The fatal functions always throw an error_handler_fatal_error exception.

Printing Plain Text

Not all text output is in the form of messages. Sometimes you just need to print plain text, for example to reprint the source text of a program to show where an error was found. It is useful to be able to direct this plain text to the same device as is being used for message reporting. This is done through the plaintext function:

bool plaintext (const std::string& text);

Each plaintext function call adds a newline to the text printed (in other words, do not terminate the text with a newline because the function will do it).

Handling Error Counts and Error Limit

The section on exceptions deals with what happens when the error limit is reached (an exception is thrown). This section deals with setting error counts and resetting the current count.

Initially, the error limit is set by the constructor. If this is set to 0, there is no error limit and an error_handler_limit_error exception can never be thrown. Any other value sets the error limit.

The error handler keeps count of how many error messages have been printed. This number can be used at the end of the program to determine whether the program succeeded (count = 0) or failed (count > 0). It can even be used as the exit status of the program since the convention there is to use 0 to mean success and any other value to mean failure.

If a program needs to perform a series of tasks in isolation, it is useful to be able to reset the error count before each stage.

In all, there are four functions relating to the error count and the error limit:

void error_handler::set_error_limit(unsigned limit);
unsigned error_handler::error_limit(void) const;
void error_handler::reset_error_count(void);
unsigned error_handler::error_count(void) const;
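
The bookkeeping behind these four functions can be modelled with a small class. This is a sketch under assumptions, not the handler's code: error_counter, record_error and limit_reached are invented names, and the choice to throw as soon as the count equals the limit is my reading of "when the error limit is reached".

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for error_handler_limit_error.
class limit_reached : public std::runtime_error
{
public:
  explicit limit_reached(unsigned limit) :
    std::runtime_error("error limit reached"), m_limit(limit) {}
  unsigned limit() const { return m_limit; }
private:
  unsigned m_limit;
};

// Hypothetical model of the count and limit; a limit of 0 means no limit.
class error_counter
{
public:
  explicit error_counter(unsigned limit = 0) : m_limit(limit), m_count(0) {}
  void set_error_limit(unsigned limit) { m_limit = limit; }
  unsigned error_limit() const { return m_limit; }
  void reset_error_count() { m_count = 0; }
  unsigned error_count() const { return m_count; }
  // Called once per error message printed (assumed behaviour: the error is
  // counted first, then the limit is checked)
  void record_error()
  {
    ++m_count;
    if (m_limit != 0 && m_count >= m_limit)
      throw limit_reached(m_limit);
  }
private:
  unsigned m_limit;
  unsigned m_count;
};
```

A program that runs several independent stages would call reset_error_count before each stage and inspect error_count afterwards to decide whether that stage succeeded.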

Context Stack

The idea of a context stack applies to situations where a problem requires recursion and where an error could become meaningless due to that recursion because any errors are so far removed from the original problem that they don't make sense. The example I used before was that of inlining a function, where an error in the inlining is far removed from the user's original source code. In that case, you can build up a context stack to print out with each message so that the path to the error can be traced. In this example, the context would be a positional message showing the original function call that is being inlined.

The context stack is maintained through push and pop functions. The push functions are similar to the message print functions above, but they don't print anything. Instead, the formatted message is stored in the context stack for future use. Then, when any message is printed, the context stack is also printed. Note that the context messages have a different format to information/warning/error/fatal messages and that format is configurable.

The context stack functions look like the message functions with the name push_context. As before, there is a general-purpose function which takes a vector of string arguments and then convenience functions that take 0-3 arguments. Each comes in two kinds, simple and positional, giving 10 variants in all.

For example:

void error_handler::push_context (
  const std::string& id,
  const std::vector<std::string>& args)
    throw(error_handler_id_error,error_handler_format_error);

This is the general form for simple messages.

The push_context method is called when entering the context. There is also a pop_context method which must be called on leaving the context:

bool error_handler::pop_context(void);

It is your responsibility to ensure that every push_context is paired with a pop_context.

This can be difficult. In fact it may not be possible if exceptions can be thrown. The solution is to use the auto_push_context methods. These take exactly the same arguments as the push_context methods, but they return an error_context object. This object will automatically pop the context stack when it goes out of scope. An example of the auto_push_context method is:

error_context error_handler::auto_push_context(
  const std::string& id,
  const std::vector<std::string>& args)
    throw(error_handler_id_error,error_handler_format_error);

In use, the returned object should be assigned to a local variable. When that local variable goes out of scope, the context stack is popped.
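
The reason this works even when exceptions are thrown is that destructors of local objects run during stack unwinding. The sketch below shows the idea with invented stand-ins (context_stack, scoped_context and inline_function are not part of the error handler). Note one difference: the real auto_push_context returns its object by value, whereas this guard is constructed directly and is not copyable.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the handler's context stack.
class context_stack
{
public:
  void push(const std::string& message) { m_stack.push_back(message); }
  void pop() { assert(!m_stack.empty()); m_stack.pop_back(); }
  std::vector<std::string>::size_type depth() const { return m_stack.size(); }
private:
  std::vector<std::string> m_stack;
};

// Hypothetical guard playing the role of error_context: pushes on
// construction and pops in the destructor.
class scoped_context
{
public:
  scoped_context(context_stack& stack, const std::string& message) :
    m_stack(stack)
  { m_stack.push(message); }
  ~scoped_context() { m_stack.pop(); }
  scoped_context(const scoped_context&) = delete;
  scoped_context& operator=(const scoped_context&) = delete;
private:
  context_stack& m_stack;
};

// Enters a context and then fails; the guard still pops while the exception
// propagates out of the function.
void inline_function(context_stack& stack)
{
  scoped_context guard(stack, "inlining call to f");
  throw std::runtime_error("error while inlining");
}
```

After inline_function exits via the exception, the stack depth is back to what it was before the call, with no explicit pop anywhere in the calling code.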

Supplementary Messages

The concept of supplement messages is similar to the context stack. This is a set of messages that are printed along with a main information/warning/error/fatal message to give extra information. The supplement messages are loaded before the main message is printed so the whole is printed in one block of text. The main difference is that supplement messages have no memory - they apply only to the next main message and then are discarded.

An example use of supplement messages would be in a compiler. Say you have more than one possible solution to an expression because the types are ambiguous. The main message would say that the types are ambiguous. Then the supplement messages would list the possible types. There would be one supplement message per type.

Here's some example code using supplement messages:

for (unsigned i = 0; i < candidates.size(); i++)
  errors.push_supplement(candidates[i].position(), "VDK_POSSIBLE_SOLUTION");
errors.error(current_position, "VDK_COMPONENT_AMBIGUOUS");

Exceptions Thrown by Error Handler

When errors are discovered within the error handler, it resorts to throwing an exception to indicate that error. The following exceptions can be thrown:

error_handler_read_error
Thrown when an error is found reading a message file.
error_handler_format_error
Thrown when trying to write a message and a formatting error is found in the message text.
error_handler_id_error
Thrown when a message ID is requested which is not present in the message file(s).
error_handler_limit_error
Thrown when the error limit is reached (only enabled if the error limit is set to > 0).
error_handler_fatal_error
Thrown after a message of severity fatal is printed. Fatal errors always throw this exception.

These exceptions can be caught individually or as a group. To catch them individually, you end up with quite a large catch block:

try
{
}
catch(error_handler::error_handler_read_error& exception)
{
}
catch(error_handler::error_handler_format_error& exception)
{
}
catch(error_handler::error_handler_id_error& exception)
{
}
catch(error_handler::error_handler_limit_error& exception)
{
}
catch(error_handler::error_handler_fatal_error& exception)
{
}

The exceptions can be caught as a group because they are all derivatives of std::runtime_error which in turn is a derivative of std::exception. It is good practice anyway to catch std::exception.

This cruder form of exception handling only requires one catch block:

try
{
}
catch(std::exception& exception)
{
}

A good compromise is to catch specific exceptions which require special handling and then catch all the others in one go. For example, an error_handler_limit_error exception might require different handling from all the others. This is done by putting the catch block for the specific errors before the general catch block:

try
{
}
catch(error_handler_limit_error& exception)
{
}
catch(std::exception& exception)
{
}

Note that catching an exception by reference preserves the derivative information (see "The C++ Programming Language", Bjarne Stroustrup, p359) so that dynamic casts could be used on a std::exception to get the derivative information.
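
Both points, the ordering of catch blocks and the recovery of the derived type from a reference, can be tried out without the error handler itself. In this sketch limit_error is an invented stand-in for error_handler_limit_error, and limit_of and classify are hypothetical helpers written for the example.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for error_handler_limit_error.
class limit_error : public std::runtime_error
{
public:
  explicit limit_error(unsigned limit) :
    std::runtime_error("error limit reached"), m_limit(limit) {}
  unsigned limit() const { return m_limit; }
private:
  unsigned m_limit;
};

// Recovers the derived type from a base-class reference using dynamic_cast;
// returns 0 if the exception is not a limit_error.
unsigned limit_of(const std::exception& exception)
{
  const limit_error* specific = dynamic_cast<const limit_error*>(&exception);
  return specific ? specific->limit() : 0;
}

// The specific handler comes first; placed after the general one it would
// never be reached.
std::string classify(bool hit_limit)
{
  try
  {
    if (hit_limit) throw limit_error(20);
    throw std::runtime_error("some other failure");
  }
  catch (const limit_error&)
  {
    return "limit";
  }
  catch (const std::exception&)
  {
    return "other";
  }
}
```

Here classify(true) returns "limit" and classify(false) returns "other", while limit_of applied to a limit_error held as a std::exception reference still reports the limit it was constructed with.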

Each specific exception contains extra information on the nature of the exception.

An error_handler_read_error contains a position field:

class error_handler_read_error : public std::runtime_error
{
public:
  error_handler_read_error(const position& pos);
  const position& where(void) const;
};

The position information contains the name of the message file in which the read failed, and the line and column numbers where the read failed.

A format error also contains position information showing the exact line and column in the file where the format is wrong (typically a substitution that refers to an argument index beyond the number of arguments passed to the print function). It also returns the format string and the offset into that string of the error. This is most useful for built-in formats such as the error format or the position format, where the file position will be empty. It tends to be redundant information for an error in the message file.

class error_handler_format_error : public std::runtime_error
{
public:
  error_handler_format_error(const std::string& format, unsigned offset);
  error_handler_format_error(const position& pos, const std::string& format, unsigned offset);
  const position& where(void) const;
  const std::string& format(void) const;
  unsigned offset(void) const;
};

An error_handler_id_error contains the name of the message ID which could not be found in the message file:

class error_handler_id_error : public std::runtime_error
{
public:
  error_handler_id_error(const std::string& id);
  const std::string& id(void) const;
};

An error_handler_limit_error simply contains the error limit that has been reached:

class error_handler_limit_error : public std::runtime_error
{
public:
  error_handler_limit_error(unsigned limit);
  unsigned limit(void) const;
};

Finally, error_handler_fatal_error does not contain any extra information over the base class:

class error_handler_fatal_error : public std::runtime_error
{
public:
  error_handler_fatal_error(void);
};

In all cases, the std::exception method what() can be used to get a string (well, actually a const char*) describing the error in words. This includes textual forms of the extra information above - for example the limit error will state that the error limit was reached and what the limit was. The string returned by what() can be printed via the error_handler's plaintext method so that it is printed in the same place as all other error_handler messages:

  error_handler errors(...);
  try
  {
    ...
  }
  catch(std::exception& exception)
  {
    errors.plaintext(std::string("exception: failed with ") + std::string(exception.what()));
  }