subsystems/message_handler.hpp
A General-Purpose Message Handler

Introduction

The message handler subsystem is a class which can be used to perform all message printing in an application. It controls the routing of messages to the current output device and does all the message formatting according to a set of message formats for each kind of message. The formatting is fully configurable.

Messages are stored in a file rather than hard-coded into the application. Each message is referred to by a unique ID, which is a string. When a request to print a message is received, the message is retrieved by its ID from the message file, formatted according to the formatting rules in place at the time, has arguments substituted and then is printed to the current output device.

This separation of the messages into a file means that it is easy to implement multi-lingual applications. Just select an appropriate message file according to the locale.

The message handler supports both simple messages and positional messages. A positional message is one that relates to a particular position in a file; a compiler, for example, would use one to show where a syntax error occurs in a source file. Simple messages are not related to a file; they are just plain message text.

There are four levels of message: information, warning, error and fatal. These have well-defined meanings so there should be no ambiguity as to which one to use.

For applications such as compilers, the message handler can keep count of the number of errors that have been reported and throw an exception when an error limit has been reached. This is user-friendly in that it prevents the user becoming swamped with error messages.

For applications where processing goes through a number of deeper and deeper stages such that an error message would be so far removed from the user input it would be meaningless, the message handler supports the concept of a context stack. For example, if you are writing an assembler (I wonder where I get my examples from) and inlining a function, you could push the original function call onto the context stack. Then, any error messages will be printed with supplementary messages showing the context of the error.

Initialisation

The message handler is implemented as a class. You create an object of that class and then use that object to manage all your messages. Typically, you would create the message handler in main() and then pass it to all functions that need to do message printing.

The constructor takes 1-4 arguments to initialise the message handler. Start with the simplest constructor:

message_handler::message_handler(std::ostream& device,
                                 unsigned limit = 0,
                                 bool show = true);

The device argument is an IOStream output device which is used as the destination for all messages. For command-line tools it will probably be std::cout (standard output) or std::cerr (standard error).

The limit argument is the error limit. If set to zero (the default) there is no limit to the number of errors that can be printed. Otherwise, if the stated limit of error messages is reached, the message handler will throw an stlplus::message_handler_limit_error exception which can be caught later to terminate an operation.

The show argument controls whether positional messages will also display the original source text with an indicator showing where the error occurred. This will typically be true for command-line tools and false for GUI commands. For example, a compiler might generate the following error with show set to true:

"test10.cpp" (18,12) : error : this builtin equality operator doesn't have a body
if p1 = null then
      ^

Setting show to false gives a reduced form of the message:

"test10.cpp" (18,12) : error : this builtin equality operator doesn't have a body

This reduced form only makes sense within a context-sensitive editor, as might be found in a GUI, which would highlight line 18, column 12 of the source file "test10.cpp".
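As a rough illustration of the two extra lines printed when show is true, here is a small sketch. It is not the library's code: the function name source_indicator is hypothetical, and it assumes the column is a zero-based offset into the source line.

```cpp
#include <string>

// Hypothetical helper, not part of the message handler's interface.
// Builds the two supplementary lines: the source line itself, then a
// caret placed under the given zero-based column.
std::string source_indicator(const std::string& source_line, unsigned column)
{
  std::string result = source_line;
  result += '\n';
  result += std::string(column, ' ');
  result += "^\n";
  return result;
}
```

A real implementation would also have to decide how to treat tab characters in the source line, which this sketch ignores.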

So far, no message file has been loaded. There is a modified form of the constructor that does this for a single file, or message files (plural) can be added to the message handler explicitly. The latter option will be handled in the next section. The modified constructor looks like this:

message_handler::message_handler(std::ostream& device,
                                 const std::string& message_file,
                                 unsigned limit = 0,
                                 bool show = true);

The only difference is that this constructor takes the name of a message file as its second argument. Note also that the constructor can throw an stlplus::message_handler_read_error exception in the case where reading the message file fails, either because the file doesn't exist or because there was a format error in the file. This will be dealt with in more detail in the section on Handling Message Files.

Handling Message Files

The message handler can read in messages from any number of message files. Each file is added to the message handler by the add_message_file function:

void message_handler::add_message_file(const std::string& message_file);

The message file will be read into the message handler and the messages stored ready for use. If any errors are found, the stlplus::message_handler_read_error exception is thrown.

The stlplus::message_handler_read_error exception contains a message_position field, which stores the file position of the error. This includes the filename, the line number and the column number. If the file was missing, the line number will be zero. Otherwise, the line number refers to the message file line number, numbered from 1. The column number refers to the character position in the line, numbered from 0. Incidentally, the numbering of lines from 1 and columns from 0 is a common, almost standard, convention.

The message file format is very simple. Each message declaration is in the form:

<ID> <spaces> <text>

An <ID> is a unique mnemonic for the error message. It starts with an alphabetic character and may contain alphanumerics and underscores only. It must be unique across all the message files loaded into the message handler. If a duplicate message ID is found, an stlplus::message_handler_read_error exception will be thrown containing the position of the second declaration.

The <spaces> can be one or more space or tab characters.

The <text> is the remainder of the line up to the newline and is plain text (not a quoted string).

All lines starting with a non-alphabetic character are assumed to be comments and are ignored.

Here's an example, trivial message:

HELLO           Hello World!

This defines a message with an ID of "HELLO" and with message text "Hello World!".
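The file format rules above can be sketched as a small parser. This is only an illustration under stated assumptions: parse_messages is a hypothetical name, it takes the file contents as a string, it reports duplicates with a plain std::runtime_error rather than stlplus::message_handler_read_error, and it does not check that the ID contains only alphanumerics and underscores.

```cpp
#include <cctype>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical sketch of the message file rules; not the library's parser.
std::map<std::string, std::string> parse_messages(const std::string& contents)
{
  std::map<std::string, std::string> messages;
  std::istringstream input(contents);
  std::string line;
  while (std::getline(input, line))
  {
    // lines that do not start with a letter (including blank lines) are comments
    if (line.empty() || !std::isalpha(static_cast<unsigned char>(line[0])))
      continue;
    // the ID runs up to the first space or tab
    std::string::size_type id_end = line.find_first_of(" \t");
    std::string id = line.substr(0, id_end);
    // the text starts after the run of spaces/tabs and runs to the end of the line
    std::string::size_type text_start = line.find_first_not_of(" \t", id_end);
    std::string text =
      (text_start == std::string::npos) ? std::string() : line.substr(text_start);
    // a second declaration of the same ID is an error
    if (!messages.insert(std::make_pair(id, text)).second)
      throw std::runtime_error("duplicate message ID: " + id);
  }
  return messages;
}
```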

Simple messages like this are not much use. Most messages need to have arguments inserted into them. Arguments are specified in the message file by the substitution strings "@0", "@1" etc. The index number (0, 1 etc.) refers to the order in which the arguments are passed to the message printing functions. Note that they are numbered from zero, not from one. There is no limit to the number of arguments that can be substituted and the index number can have more than one digit (e.g. "@10"). The use of numbered arguments was chosen in preference to C-style substitutions because it allows a message to be reworded in a different order without changing the source code. This becomes particularly useful for multi-lingual support since different human languages will be worded with arguments in a different order.

A trivial message with a single argument is:

IAM             My name is @0!

And in the French version:

IAM             Je m'appelle @0!
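To make the substitution rule concrete, here is a minimal sketch of "@N" replacement. The function name substitute is hypothetical and this is not the library's implementation. It assumes that an index with no matching argument is an error (reported here with std::out_of_range) and that an "@" not followed by a digit is copied through unchanged.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical sketch: replaces each "@<digits>" in format with args[<digits>].
std::string substitute(const std::string& format, const std::vector<std::string>& args)
{
  std::string result;
  std::size_t i = 0;
  while (i < format.size())
  {
    if (format[i] == '@' && i + 1 < format.size() &&
        std::isdigit(static_cast<unsigned char>(format[i + 1])))
    {
      // read a possibly multi-digit index such as "@10"
      std::size_t index = 0;
      ++i;
      while (i < format.size() && std::isdigit(static_cast<unsigned char>(format[i])))
      {
        index = index * 10 + static_cast<std::size_t>(format[i] - '0');
        ++i;
      }
      // assumption: an index beyond the supplied arguments is an error
      if (index >= args.size())
        throw std::out_of_range("message argument index out of range");
      result += args[index];
    }
    else
    {
      // ordinary character (or an "@" with no digit after it): copy through
      result += format[i];
      ++i;
    }
  }
  return result;
}
```

Because the arguments are selected by index, a translated message can reorder them (for example "@1 ... @0") without any change to the calling code.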

Format Control

The format of messages is controlled by built-in format strings which can be customised for a particular application. The format strings use the same argument substitution rules as the messages themselves. It is possible to individually customise all message types: information, supplement, context, warning, error and fatal messages (see later for an explanation of context messages). There is yet another format string for positional messages.

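The default message formats are:

information  @0
context      context: @0
supplement   supplement: @0
warning      warning: @0
error        error: @0
fatal        FATAL: @0
position     "@1" (@2,@3) : @0

The information/context/supplement/warning/error/fatal formats take a single argument which is the formatted message. The @0 argument will be replaced by the message from the message file with the arguments substituted.

For positional messages, the simple message text created by the rules above is further substituted into a positional format string. The positional format string takes up to 4 arguments:

@0  the simple message text
@1  the filename
@2  the line number
@3  the column number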

You can miss out a part of this (e.g. the column number) by simply not including the argument number in the format string. For example, use the position format string "file: @1, line: @2: @0" to eliminate the column information. I can't think of any legitimate reason for excluding any field apart from the column number.
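The layering of formats can be sketched as follows: the message text goes into the severity format, and the result goes into the position format. Both function names are hypothetical, the helper fill is a deliberately simplified stand-in for the real substitution (single-digit indices only), and the two format strings are the defaults hard-coded for the sake of the example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-in for argument substitution: handles "@0".."@9" only
// and leaves any index without a matching argument untouched.
std::string fill(const std::string& format, const std::vector<std::string>& args)
{
  std::string result;
  for (std::size_t i = 0; i < format.size(); ++i)
  {
    if (format[i] == '@' && i + 1 < format.size() &&
        format[i + 1] >= '0' && format[i + 1] <= '9' &&
        static_cast<std::size_t>(format[i + 1] - '0') < args.size())
    {
      result += args[static_cast<std::size_t>(format[i + 1] - '0')];
      ++i; // skip the digit
    }
    else
      result += format[i];
  }
  return result;
}

// Layers the formats: message text, then the error format,
// then the position format.
std::string format_positional_error(const std::string& text,
                                    const std::string& filename,
                                    unsigned line, unsigned column)
{
  std::vector<std::string> severity_args(1, text);
  std::string message = fill("error: @0", severity_args);

  std::vector<std::string> position_args;
  position_args.push_back(message);                 // @0
  position_args.push_back(filename);                // @1
  position_args.push_back(std::to_string(line));    // @2
  position_args.push_back(std::to_string(column));  // @3
  return fill("\"@1\" (@2,@3) : @0", position_args);
}
```

Dropping "@3" from the position format string in this sketch removes the column from the output without touching the calling code, which is the point of the scheme.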

The default formats can be overridden by the set_****_format functions:

void message_handler::set_information_format(const std::string& format);
void message_handler::set_context_format(const std::string& format);
void message_handler::set_supplement_format(const std::string& format);
void message_handler::set_warning_format(const std::string& format);
void message_handler::set_error_format(const std::string& format);
void message_handler::set_fatal_format(const std::string& format);
void message_handler::set_position_format(const std::string& format);

Note: at present there is no sanity check on the format strings. It is up to you to get them right. Mistakes here could lead to stlplus::message_handler_format_error being thrown when you try to write a message.

Format strings can be inserted into the message file itself. The defaults are equivalent to having the following messages in a message file:

INFORMATION  @0
CONTEXT      context: @0
SUPPLEMENT   supplement: @0
WARNING      warning: @0
ERROR        error: @0
FATAL        FATAL: @0
POSITION     "@1" (@2,@3) : @0

These can be overridden in your own message files. This will be necessary for multi-lingual support unless you are happy to have English keywords starting each message.

Source File Position Specifications

Many functions in the message handler require file positions to be specified. A file position means the combination of the source file name, a line number and a column number. This is encapsulated in a class called stlplus::message_position:

class message_position
{
public:
  message_position(const std::string& filename, unsigned line, unsigned column);

  const std::string& filename(void) const;
  unsigned line(void) const;
  unsigned column(void) const;

  friend std::ostream& operator << (std::ostream&, const message_position&);
};

Source Code Display Control

This feature controls whether the message handler shows positional messages with two lines of supplementary text showing the source code line and an arrow showing the column that the message relates to. This is a boolean switch controlled by two functions:

void message_handler::show_position(void);
void message_handler::hide_position(void);

Printing Messages

There are 4 classes of message: information, warning, error, fatal. They have the following interpretation and behaviour:

information
Used for progress messages, status messages etc.
warning
Indicates that a problem has been found but there is a sensible, well defined, way of proceeding
error
Indicates that a problem has been found and the operation will fail - processing may continue but only to find further errors. May throw an stlplus::message_handler_limit_error if the error limit is reached.
fatal
An internal (programming) error has been found and the operation is stopping NOW. It does this by throwing stlplus::message_handler_fatal_error.

In addition to this behaviour, be aware that all print functions could throw either an stlplus::message_handler_id_error or a stlplus::message_handler_format_error.

There are also 2 kinds of message: simple, positional:

simple
just a text message
positional
a message relating to a source file and a specific position in that file

This gives 8 variants. For each variant there is a general-purpose print method which takes a vector of strings as a parameter. This vector of strings is used as the arguments to substitute into the message. For example, here's the general form of the simple informational message handler:

bool message_handler::information(const std::string& id,
                                  const std::vector<std::string>& args);

The first field is the ID of the message to print and the second is the set of arguments to substitute into that message. Typically this will be built up by using the vector::push_back function.

This general-purpose method is a bit clumsy in practice but has the advantage that it has no limit on the number of arguments that can be passed. To make life easier in the majority of cases, short-cut forms are provided for 0 to 3 argument messages. For informational messages these are:

bool message_handler::information(const std::string& id);

bool message_handler::information(const std::string& id,
                                  const std::string& arg1);

bool message_handler::information(const std::string& id,
                                  const std::string& arg1,
                                  const std::string& arg2);

bool message_handler::information(const std::string& id,
                                  const std::string& arg1,
                                  const std::string& arg2,
                                  const std::string& arg3);

Each of these functions also has a positional equivalent. The position is placed before the id parameter.

For example:

bool message_handler::information(const stlplus::message_position&,
                                  const std::string& id,
                                  const std::vector<std::string>& args);

There are similar methods for 0, 1, 2 and 3 arguments as above.

This set of permutations is repeated for the warning, error and fatal message handlers.

The error functions can throw an extra stlplus::message_handler_limit_error exception if the error limit is reached. The fatal functions always throw a stlplus::message_handler_fatal_error exception.

Printing Plain Text

Not all text output is in the form of messages. Sometimes you just need to print plain text, for example to reprint the source text of a program to show where an error was found. However, it is also useful to be able to redirect this plain text to the same device as is being used for message reporting. This is done through the plaintext method:

bool message_handler::plaintext (const std::string& text);

Each plaintext method call adds a newline to the text printed (in other words, do not terminate the text with a newline because the method will do it).

Handling Error Counts and Error Limit

The section on exceptions deals with what happens when the error limit is reached (an exception is thrown). This section deals with setting error counts and resetting the current count.

Initially, the error limit is set by the constructor. If this is set to 0, there is no error limit and an stlplus::message_handler_limit_error exception can never be thrown. Any other value sets the error limit.

The message handler keeps count of how many error messages have been printed. This number can be used at the end of the program to determine whether the program succeeded (count = 0) or failed (count > 0). It can even be used as the exit status of the program since the convention there is to use 0 to mean success and any other value to mean failure.

If a program needs to perform a series of tasks in isolation, it is useful to be able to reset the error count before each stage.

In all, there are four functions relating to the error count and the error limit:

void message_handler::set_error_limit(unsigned limit);
unsigned message_handler::error_limit(void) const;
void message_handler::reset_error_count(void);
unsigned message_handler::error_count(void) const;
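The bookkeeping behind these four functions can be sketched with a small stand-in class. This is not the message handler itself: error_counter and count_error are hypothetical names, a plain std::runtime_error stands in for stlplus::message_handler_limit_error, and I am assuming that "reached" means the count has become equal to the limit.

```cpp
#include <stdexcept>

// Hypothetical stand-in for the handler's error bookkeeping.
class error_counter
{
public:
  explicit error_counter(unsigned limit = 0) : m_limit(limit), m_count(0) {}

  void set_error_limit(unsigned limit) { m_limit = limit; }
  unsigned error_limit() const { return m_limit; }
  void reset_error_count() { m_count = 0; }
  unsigned error_count() const { return m_count; }

  // Called once per error message printed. A limit of 0 means "no limit".
  // Assumption: the exception is thrown as soon as the count equals the limit.
  void count_error()
  {
    ++m_count;
    if (m_limit > 0 && m_count >= m_limit)
      throw std::runtime_error("error limit reached");
  }

private:
  unsigned m_limit;
  unsigned m_count;
};
```

A program running several independent stages would call reset_error_count() before each stage and inspect error_count() afterwards to decide whether that stage succeeded.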

Context Stack

The idea of a context stack applies to situations where a problem requires recursion and where an error could become meaningless due to that recursion because any errors are so far removed from the original problem that they don't make sense. The example I used before was that of inlining a function, where an error in the inlining is far removed from the user's original source code. In that case, you can build up a context stack to print out with each message so that the path to the error can be traced. In this example, the context would be a positional message showing the original function call that is being inlined.

The context stack is maintained through push and pop functions. The push functions are similar to the message print functions above, but they don't print anything. Instead, the formatted message is stored in the context stack for future use. Then, when any message is printed, the context stack is also printed. Note that the context messages have a different text format to information/warning/error/fatal messages and that format is configurable.

The context stack functions look like the message functions with the name push_context. As before, there is a general purpose function which takes a vector of string arguments and then convenience functions that take 0-3 arguments. There are two permutations: simple and positional. This gives 10 permutations.

For example:

void message_handler::push_context (const std::string& id,
                                    const std::vector<std::string>& args);

This is the general form for simple messages.

The push_context method is called when entering the context. There is also a pop_context method which must be called on leaving the context:

bool message_handler::pop_context(void);

It is your responsibility to ensure that every push_context is paired with a pop_context.

This can be difficult. In fact it may not be possible if exceptions can be thrown. The solution is to use the auto_push_context methods. These take exactly the same arguments as the push_context methods, but they return an stlplus::message_context object. This object will automatically pop the context stack when it goes out of scope. An example of the auto_push_context method is:

stlplus::message_context message_handler::auto_push_context(const std::string& id,
                                                            const std::vector<std::string>& args);

In use the return object should be assigned to a local variable. When that local variable goes out of scope, the context stack is popped.

Supplementary Messages

The concept of supplement messages is similar to the context stack. This is a set of messages that are printed along with a main information/warning/error/fatal message to give extra information. The supplement messages are loaded before the main message is printed so the whole is printed in one block of text. The main difference is that supplement messages have no memory - they apply only to the next main message and then are discarded.

An example use of supplement messages would be in a compiler. Say you have more than one possible solution to an expression because the types are ambiguous. The main message would say that the types are ambiguous. Then the supplement messages would list the possible types. There would be one supplement message per type.

Here's some example code using supplement messages:

for (unsigned i = 0; i < candidates.size(); i++)
  errors.push_supplement(candidates[i].position(), "VDK_POSSIBLE_SOLUTION");
errors.error(current_position, "VDK_COMPONENT_AMBIGUOUS");

Although the supplemental messages are loaded before the main message, they are printed after it, since this is a more natural order:

"test.cpp" (13,6) error: this component is ambiguous
"comp1.cpp" (16,4) supplement: this is one possible solution
"comp2.cpp" (24,8) supplement: this is one possible solution
"comp3.cpp" (48,8) supplement: this is one possible solution

Exceptions Thrown by Message Handler

When errors are discovered within a message handler, it resorts to throwing an exception to indicate that error. The following exceptions can be thrown:

stlplus::message_handler_read_error
Thrown when an error is found reading a message file.
stlplus::message_handler_format_error
Thrown when trying to write a message and a formatting error is found in the message text.
stlplus::message_handler_id_error
Thrown when a message ID is requested which is not present in the message file(s).
stlplus::message_handler_limit_error
Thrown when the error limit is reached (only enabled if the error limit is set to > 0).
stlplus::message_handler_fatal_error
Thrown after a message of severity fatal has been printed. Fatal errors always throw this exception.

These exceptions can be caught individually or as a group. To catch them individually, you end up with quite a large catch block:

try
{
}
catch(stlplus::message_handler_read_error& exception)
{
}
catch(stlplus::message_handler_format_error& exception)
{
}
catch(stlplus::message_handler_id_error& exception)
{
}
catch(stlplus::message_handler_limit_error& exception)
{
}
catch(stlplus::message_handler_fatal_error& exception)
{
}

The exceptions can be caught as a group because they are all derivatives of std::runtime_error which in turn is a derivative of std::exception. It is good practice anyway to catch std::exception.

This cruder form of exception handling only requires one catch block:

try
{
}
catch(std::exception& exception)
{
}

A good compromise is to catch specific exceptions which require special handling and then catch all the others in one go. For example, an stlplus::message_handler_limit_error exception might require different handling from all the others. This is done by putting the catch block for the specific errors before the general catch block:

try
{
}
catch(stlplus::message_handler_limit_error& exception)
{
}
catch(std::exception& exception)
{
}

Note that catching an exception by reference preserves the subclass information (see "The C++ Programming Language", Bjarne Stroustrup, p359) so that dynamic casts could be used on a std::exception to get the subclass information.
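Here is a minimal sketch of recovering the subclass from a std::exception caught by reference. The class limit_error and the function limit_from_exception are hypothetical stand-ins written for this example, not the stlplus types.

```cpp
#include <exception>
#include <stdexcept>

// Hypothetical exception type standing in for a handler exception
// that carries extra information.
class limit_error : public std::runtime_error
{
public:
  explicit limit_error(unsigned limit)
    : std::runtime_error("error limit reached"), m_limit(limit) {}
  unsigned limit() const { return m_limit; }
private:
  unsigned m_limit;
};

// Runs the operation, catches any std::exception by reference, then
// recovers the subclass. Returns the limit carried by a limit_error,
// or 0 if nothing was thrown or the exception was of another type.
unsigned limit_from_exception(void (*operation)())
{
  try
  {
    operation();
  }
  catch (std::exception& exception)
  {
    // the pointer form of dynamic_cast yields null on a mismatch
    // instead of throwing std::bad_cast
    if (limit_error* specific = dynamic_cast<limit_error*>(&exception))
      return specific->limit();
  }
  return 0;
}
```

Catching by value instead would slice the object down to std::exception and the cast would always fail.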

Each specific exception contains extra information on the nature of the exception.

An stlplus::message_handler_read_error contains a position field:

class stlplus::message_handler_read_error : public std::runtime_error
{
public:
  message_handler_read_error(const stlplus::message_position& pos);
  const stlplus::message_position& where(void) const;
};

The position information contains the name of the message file in which the read failed, and the line and column numbers where the read failed.

A format error also contains position information showing the exact line and column in the file where the format is wrong (typically a substitution that refers to an argument index beyond the number of arguments passed to the print function). It also returns the format string and the offset into that string of the error. This is most useful for built-in formats such as the error format or the position format where the file position will be empty. It tends to be redundant information for an error in the message file.

class stlplus::message_handler_format_error : public std::runtime_error
{
public:
  message_handler_format_error(const std::string& format, unsigned offset);
  message_handler_format_error(const stlplus::message_position& pos, const std::string& format, unsigned offset);
  const stlplus::message_position& where(void) const;
  const std::string& format(void) const;
  unsigned offset(void) const;
};

An stlplus::message_handler_id_error contains the name of the message ID which could not be found in the message file:

class stlplus::message_handler_id_error : public std::runtime_error
{
public:
  message_handler_id_error(const std::string& id);
  const std::string& id(void) const;
};

An stlplus::message_handler_limit_error simply contains the error limit that has been reached:

class stlplus::message_handler_limit_error : public std::runtime_error
{
public:
  message_handler_limit_error(unsigned limit);
  unsigned limit(void) const;
};

Finally, the stlplus::message_handler_fatal_error does not contain any extra information over the baseclass:

class stlplus::message_handler_fatal_error : public std::runtime_error
{
public:
  message_handler_fatal_error(void);
};

In all cases, the std::exception method what() can be used to get a string (well actually a const char*) describing the error in words. This includes textual forms of the extra information above - for example the limit error will state that the error limit was reached and what the limit was. The what() field can be printed via the message_handler's plaintext method so that it is printed in the same place as all other message_handler messages:

stlplus::message_handler errors(...);
try
{
  ...
}
catch(std::exception& exception)
{
  errors.plaintext(std::string("exception: failed with ") + std::string(exception.what()));
}