A Directed Graph Container

This component is an STL-like template class which implements a directed-graph data structure. A directed graph is a graph in which all arcs have a direction associated with them - so can be interrogated as to which node they come from and which they go to.

This document assumes you know what a graph is and only deals with how to use the digraph class.

The digraph template takes two types:

template<typename NT, typename AT> class digraph

The first type is the type to be stored on the nodes of the graph, whilst the second is the type to be stored on the arcs (yes, you can store data on the arcs!). You must specify both types - so if you are not going to be storing data on, say, the arcs, then it is recommended that you put a bool or a char in that space, since these are the smallest types. Sadly, you cannot use void as a zero-byte type here (shame!).

For example, here is how to create a variable called graph which is a digraph with a string on the node and an int on the arc:

digraph<string,int> graph;

The digraph class is written in the style of all STL containers. It is a templatised class which is traversed and manipulated using iterators. The main difference with digraph is that it contains two different kinds of object - the node and the arc - and each one has its own iterator. The convention throughout the class is to place the prefix arc_ on any function or iterator which acts on an arc, but there is no prefix on node functions or iterators. Thus, the following four iterators are provided:

- iterator - a node iterator
- const_iterator - a node iterator for const graphs
- arc_iterator - an arc iterator
- const_arc_iterator - an arc iterator for const graphs

The iterators act very much like list iterators - they can be traversed forwards using the ++ operator from the begin() to the end() to visit every node or arc in the graph. The traversal order, however, is undefined. You may find it is the creation order but this is a coincidence and may change if I redesign the class, so do not rely on this coincidence in your code.

Thus, to visit every node, you use a standard iteration loop, STL-style:

for (digraph<string,int>::iterator node = graph.begin(); node != graph.end(); node++) std::cerr << *node << std::endl;

The other point about this loop is that the graph's node type (NT), in this case a string, is accessed by simply dereferencing the iterator as if it was a pointer, as with any other STL iterator.

The iterators remain valid when you add nodes to a graph - again, like the list class. Furthermore, removing nodes only invalidates iterators pointing to the removed node. However, be aware that deleting a node also deletes all arcs attached to it. You cannot have unconnected arcs in a digraph.

A similar loop can be used to visit all the arcs in the graph:

for (digraph<string,int>::arc_iterator arc = graph.arc_begin(); arc != graph.arc_end(); arc++) std::cerr << *arc << std::endl;

Note that the arc_iterator and the access functions arc_begin() and arc_end() have the arc_ prefix. In this case, dereferencing the arc_iterator returns the NA type, in this case an int.

When an iterator is first declared, but not assigned to, it is known as a *null*
iterator. Then, when it is assigned to so that it points to a node in the tree, it is known as a
*valid* iterator. If the iterator is made to point off the edge of the tree - for example the
parent of the root node, then it becomes an *end* iterator.

This concept is similar to STL iterators like the list iterator, where an iterator that walks off the end of the list becomes an end iterator. However, the STL has no concept of a null iterator and therefore no direct equivalent of a valid iterator.

The iterator has a set of functions for testing these conditions:

bool iterator::null(void) const; bool iterator::end(void) const; bool iterator::valid(void) const;

Note that these functions are members of the iterator class, not the digraph class.

Exactly one of these conditions will always be true, since the iterator must be one of null, end or valid.

Only a valid iterator can be dereferenced to access the node type T pointed to by the iterator. Any attempt to dereference an end or null iterator will result in an exception being thrown. See the section below on Exceptions to see which.

STLplus iterators are slightly different to STL iterators in that they know which graph object they belong to. The graph is referred to as the "owner" of the iterator. An iterator can only be used with it's owner. Using an iterator with the wrong graph throws an exception (specifically, the stlplus::wrong_object exception).

You can test whether a graph owns an iterator with the owns method:

bool digraph::owns(iterator);

Many of the functions in digraph return data structures or take data structures as arguments. These data structures are all STL vectors of iterators or other data structures. This section simply defines these types and explains how they are used.

typedef std::vector<arc_iterator> arc_vector; typedef std::vector<const_arc_iterator> const_arc_vector;

An arc_vector is a collection of arcs, used to represent either the set of arcs entering or leaving a node or a path between nodes. When used in the first form - as a set of arcs - the ordering is irrelevant, it should be treated just as a set. When used as a path, the ordering is significant because it represents the sequence of arcs that need to be traversed to get from one node to the other.

Note that a const_arc_vector is a vector of const_arc_iterator: it does not mean that the vector itself is constant.

To convert a const_arc_vector to an arc_vector or vice versa, two functions are provided:

const_arc_vector constify(const arc_vector&) const; arc_vector deconstify(const const_arc_vector&) const;

So, the constify function adds the const_ to the data type, whilst the deconstify function removes it.

A path_vector is a set of paths, each of which is a set of arcs represented as an arc_vector as defined above. So this means that a path_vector is a vector of vectors of arc_iterator. The type declarations are:

typedef std::vector<arc_vector> path_vector; typedef std::vector<const_arc_vector> const_path_vector;

These types are only used to store paths as the name suggests. In this case the vector is again acting as a set in that there is no particular ordering to the paths. For example, the all_paths function returns a path_vector containing all the paths between two nodes in no particular order.

As before, the const and non-const forms can be inter-converted.

const_path_vector constify(const path_vector&) const; path_vector deconstify(const const_path_vector&) const;

A node vector is a collection of nodes. Again, it is most usually treated as a set of nodes. For example, the adjacent_inputs function gives the set of all nodes connected to the inputs of a node as a node_vector. The declarations are:

typedef std::vector<iterator> node_vector; typedef std::vector<const_iterator> const_node_vector; const_node_vector constify(const node_vector&) const; node_vector deconstify(const const_node_vector&) const;

This final supplementary type declaration in digraph is a bit different because it represents the type of a callback function:

typedef bool (*arc_select_fn) (const digraph<NT,AT>&, const_arc_iterator);

The arc_select_fn type defines the type profile for an 'arc select' function which is used in the path algorithms to restrict the search for paths to a set of arcs. Basically, the select function is called for each arc which is considered a candidate for a path. If the select function returns true, then the arc is traversed in trying to form a path. If, however, the select function returns false, the arc will be ignored and therefore cannot form part of a path.

The arguments to the select function are the graph and an arc iterator. If the arc is tagged with selection information, that information can be accessed by simply dereferencing the arc iterator. For instance, say we have an arc type that has a select() member function that returns a bool to indicate whether that arc is selected. Then the select function would be written like this:

bool select_arc(const digraph<my_node,my_arc>& graph,digraph<my_node,my_arc>::const_arc_iterator arc) { return arc->select(); }

Sometimes the selection
information is on the nodes instead. Since the select function has access to the
graph as well as the iterator, it is possible to access either the *to* or
the *from* nodes of the arc and retrieve the selection information from
there. For example, lets say we have a node data structure with a member
function called select() which simply returns a bool. Lets also say that an arc
is selected only if it is connected *to* a selected node. This is how you'd
write the selection function:

bool select_arc_from_node(const digraph<my_node,my_arc>& graph,digraph<my_node,my_arc>::const_arc_iterator arc) { return graph.arc_to(arc)->select(); }

Once you have defined a select function, it can be used as the final argument in any of the path functions. If you omit the function or set the pointer (because the arc_select_fn type defines a function pointer) to null, then all arcs are selected. To pass a function pointer as an argument, simply use the name of the function with no parentheses. For example:

digraph<my_node,my_arc>::path_vector paths = graph.all_paths(from,to,select_arc_from_node);

The only operations on a whole graph are the copy constructor, assignment and the clear function:

digraph(const digraph<NT,AT>&); digraph<NT,AT>& operator=(const digraph<NT,AT>&); void clear(void);

The copy constructor and assignment both make copies of the graph, maintaining the connectivity of the arcs. Clever huh? The clear function, as its name suggests, trashes the entire contents of the graph.

Usually the first function you will use apart from the constructors. You create a node using the insert member function which takes as its argument the NT type of the graph - in this example a string. The return value of the insert function is an iterator pointing to the newly created node. This can then be used as an argument for the arc_insert function - see later.

iterator insert(const NT& node_data);

For example:

digraph<string,int>::iterator node1 = graph.insert("node1"); digraph<string,int>::iterator node2 = graph.insert("node2");

When a node is tired of life, or when you believe a node's misdemeanors justify euthanasia, you delete it using the erase function:

iterator erase(iterator);

The iterator passed to the erase function is invalidated by the erasure, as are any other iterators pointing to that node. The erase function returns the iterator to the next node in the iteration sequence or end() if it is the last one. This is so that you can continue an iteration even after erasing a node (again, this is just like the list). For example:

for (digraph<string,int>::iterator node = graph.begin(); node != graph.end(); ) { if (<some condition>) node = graph.erase(node); else node++; }

Note that there is no increment in the for loop control part, but the increment is deferred to the body of the loop. If the node is deleted, then the return iterator from the erase function is the next node, otherwise the ++ operator is used to increment to the next node.

Deleting a node also deletes all arcs connected to it. It is illegal to have unconnected arcs in a digraph.

When a digraph is full of junk, just delete all the nodes using the clear function. Since deleting a node deletes all arcs connected to it, this function has the side-effect of deleting all arcs too.

void clear(void);

Sometimes you need to know whether a graph has any nodes in it and if so, how many. Easy - just use one or both of these two functions:

bool empty(void) const; unsigned size(void) const;

These functions have no other use - the digraph is not like a vector, so you cannot get a node by its unsigned offset. Note that empty() is not equivalent to size()==0 in terms of performance (it is of course logically equivalent). The empty() function is more efficient.

The use of iterators has already been covered. The first and last nodes in the iteration sequence are obtained using the begin() and end() functions:

const_iterator begin(void) const; iterator begin(void); const_iterator end(void) const; iterator end(void);

Note that these functions have been overloaded so that begin() on a const digraph gives you a const_iterator whereas a begin() on a non-const digraph just gives you an iterator. You need to match the iterator to the context. For example, when passing a digraph to a function which cannot modify it, the convention is to pass it as a const reference. Therefore, within that function, only const_iterator can be used. This const_iterator can only be used to iterate through the graph or to access the contents of a node by dereferencing - and even then it returns a const reference to the contents so you cannot modify the node data. Here's an example function which prints the nodes of a graph and shows how the const-ness is preserved:

void print_nodes(const digraph<string,int>& graph) { std::cout << " empty: " << graph.empty() << std::endl; std::cout << " size: " << graph.size() << std::endl; for (digraph<string,int>::const_iterator node = graph.begin(); node != graph.end(); node++) std::cout << " - " << *node; }

Sometimes you need to store an iterator in another data structure - it is recommended that you always store an iterator and not a const_iterator. The problem then is that you might actually want it to be a const_iterator. It is possible to convert between iterator and const_iterator by using the constify and deconstify methods of the iterator class (Notice that they are methods of the iterator, not the digraph):

const_iterator iterator::constify(void) const; iterator const_iterator::deconstify(void) const;

In use, you use them like this:

digraph<string,int>::iterator i = ...; digraph<string,int>::const_iterator ci = i.constify();

or:

digraph<string,int>::const_iterator ci = ...; digraph<string,int>::iterator i = ci.deconstify();

The digraph class implements a directed graph, so the arcs connected to each node are split into two groups - the inputs and outputs. There can be any number of inputs and outputs to a node. The number of inputs to a node is the node's fanin and the number of outputs its fanout. Inputs and outputs are acccessed by their integer offset. The following functions are provided for inputs:

unsigned fanin(const_iterator) const; unsigned fanin(iterator); const_arc_iterator input(const_iterator, unsigned) const; arc_iterator input(iterator, unsigned);

The fanin function gives the number of inputs, which can then be used in a loop which counts from 0 to fanin-1. i.e.:

for (unsigned i = 0; i < graph.fanin(node); i++) { arc_iterator arc = graph.input(node, i); ... }

Note that an arc_iterator is retrieved from a non-const digraph, whilst a const_arc_iterator is retrieved from a const digraph.

Similarly, the following set of functions can be used to access the output arcs of a node:

unsigned fanout(const_iterator) const; unsigned fanout(iterator); const_arc_iterator output(const_iterator, unsigned) const; arc_iterator output(iterator, unsigned);

Another way of obtaining the inputs and outputs of a node is as an arc vector. The type declaration for arc_vector is a member of digraph and there are two functions inputs(node) and outputs(node) to get these lists. The function profiles are:

arc_vector inputs(const_iterator) const; arc_vector outputs(const_iterator) const;

Once you have an arc_vector, it can be iteratoed through using the STL list iterator. Remember that each arc in the arc_vector is represented by an arc iterator, so the arc itself is accessed by dereferencing that iterator.

The remaining arc access functions are utility functions that allow you to find the fanin or fanout index of an arc from a node.

unsigned output_offset(const_iterator from, const_arc_iterator arc) const; unsigned output_offset(iterator from, arc_iterator arc); unsigned input_offset(const_iterator to, const_arc_iterator arc) const; unsigned input_offset(iterator to, arc_iterator arc);

The value returned is an index in the range 0..fanin-1 for inputs, 0..fanout-1 for outputs. If the arc is not an input or output of the specified node, then the value npos() is returned (that is, digraph<NT,AT>::npos()). Don't blame me for the name, I adopted the name used in the STL string template which serves a similar purpose.

To avoid name conflicts, arc functions have the arc_ prefix. Arcs, like
nodes, are referred to by a list iterator which is returned by the arc_insert
function. They may also be visited from arc_begin() to arc_end(). Each arc has a
*from* field and a *to* field which contain the node iterators of the endpoints of
the arc. Of course, the arc data can be accessed by simply dereferencing the
iterator.

An arc is created using the arc_insert function. Since it is illegal to have
an unconnected arc in a digraph, the nodes it is connected to must be created
first. Then the arc is connected at the time of its creation by passing the node
iterators to the arc_insert function. Since arcs have a direction, the nodes it
is connected to are referred to as the *from* node and the *to* node.
The final argument of the arc_insert function is the data to be stored on the
arc. Since it is common to store data only on the nodes and not on the arcs, the
arc data field is optional:

arc_iterator arc_insert(iterator from, iterator to, const AT& arc_data = AT());

As with the node iterator, the returned arc_iterator points to the newly-created arc.

An arc is deleted using the arc_erase function. As with the node erase function, it returns an arc_iterator to the next arc in the iteration sequence:

arc_iterator arc_erase(arc_iterator);

Deleting the last arc connected to a node does **not** cause the node to
be deleted - it is perfectly legal to have floating nodes in a digraph.

Obvious really:

void arc_clear(void);

This leaves all the nodes in the digraph floating aimlessly in the void.

The number of arcs can be obtained with the arc_size function or alternatively, the special case of zero arcs can be tested for with the arc_empty function:

bool arc_empty (void) const; unsigned arc_size(void) const;

As with the node functions, remember that it is more efficient to test for empty() than to test for size()==0 and that !empty() is more efficient than size()>0.

The first and last arcs in the iteration sequence are obtained using the arc_begin() and arc_end() functions:

const_arc_iterator arc_begin(void) const; arc_iterator arc_begin(void); const_arc_iterator arc_end(void) const; arc_iterator arc_end(void);

You then step through the arcs using the ++ operator on the iterator.

Note that these functions have been overloaded so that arc_begin() on a const digraph gives you a const_arc_iterator whereas an arc_begin() on a non-const digraph just gives you an arc_iterator, just as with nodes. Here's an example function which prints the arcs of a graph and shows how the const-ness is preserved:

void print_arcs(const digraph<string,int>& graph) { std::cout << " empty: " << graph.arc_empty() << std::endl; std::cout << " size: " << graph.arc_size() << std::endl; for (digraph<string,int>::const_arc_iterator arc = graph.arc_begin(); arc != graph.arc_end(); arc++) std::cout << " - " << *arc << std::endl; }

You can go from an arc to its connected nodes using the arc_from and arc_to
functions. Remember that an arc is directed and so goes *from* a node
*to* another. The arc_from function, obviously (Doh) gives the *from*
node and so on...

const_iterator arc_from(const_arc_iterator) const; iterator arc_from(arc_iterator); const_iterator arc_to(const_arc_iterator) const; iterator arc_to(arc_iterator);

You can change the nodes that an arc is connected to using these functions.

void arc_move(arc_iterator arc, iterator from, iterator to); void arc_move_from(arc_iterator arc, iterator from); void arc_move_to(arc_iterator arc, iterator to); void arc_flip(arc_iterator arc);

The arc_move function disconnects the arc from its current from/to nodes and reconnects to a new from/to pair. The arc_move_from function only changes the from node of the arc, leaving the to node untouched. Similarly the arc_move_to function only changes the to node and leaves the from node untouched. Finally, the arc_flip function swaps the from and to nodes so as to flip the arc's direction.

The whole-graph operations allow large-scale manipulations of graphs. At present there is only one whole-graph operation:

void move(digraph& source);

This moves the source graph's nodes and arcs into the target graph. The source will be left empty by this move. All iterators to the source graph will become owned by the target graph, so the iterators will continue working and pointing to the same node or arc.

Nodes are adjacent if they are connected by an arc. Since digraph is a directed graph, the direction of the arc is significant. For example, if there is an arc from London to Manchester, London is considered adjacent to Manchester, but Manchester is not considered adjacent to London (a principle well understood by the owner of Virgin trains). The following functions allow operations on adjacent nodes. They are simply composite oprations which combine basic node and arc access functions to give a slightly more abstract approach. To put it another way, you can manipulate nodes without having to go via arcs all the time.

The first function is the test for whether two nodes are adjacent:

bool adjacent(const_iterator from, const_iterator to) const; bool adjacent(iterator from, iterator to);

The next set of functions find which arc makes two nodes adjacent. There is no need to test for adjacency first though, because if the lookup fails, these functions simply return arc_end().

const_arc_iterator adjacent_arc(const_iterator from, const_iterator to) const; arc_iterator adjacent_arc(iterator from, iterator to);

There may be any number of arcs connecting two nodes. The following functions return the set of all arcs connecting two nodes. If there are no arcs then the return vector will be zero length.

const_arc_vector adjacent_arcs(const_iterator from, const_iterator to) const; arc_vector adjacent_arcs(iterator from, iterator to);

The final set of functions access the set of nodes adjacent to a particular node as a node list. This list can then be traversed using the normal STL list iterators.

const_node_vector input_adjacencies(const_iterator to) const; node_vector input_adjacencies(iterator to); const_node_vector output_adjacencies(const_iterator from) const; node_vector output_adjacencies(iterator from);

A topographical sort is an ordering of a graph's nodes such that each node is visited after all of its inputs have been visited. It is purely a function of the connectivity of the graph and nothing to do with the data sored in the graph.

There are four methods provided for this:

std::pair<const_node_vector,const_arc_vector> sort(arc_select_fn = 0) const; std::pair<node_vector,arc_vector> sort(arc_select_fn = 0); const_node_vector dag_sort(arc_select_fn = 0) const; node_vector dag_sort(arc_select_fn = 0);

The `sort`

methods are the full form of the algorithm. It returns two vectors: the
first is the sorted set of node iterators in topographical order. The second is the set of iterators
to error arcs: these are the arcs that had to be broken to perform the sort. This is because the
sort is otherwise impossible in the presence of loops - you cannot visit every node after all of its
inputs. If the error arcs vector is empty, that means the graph had no loops and is a Directed
Acyclic Graph or DAG.

The `dag_sort`

methods are for cases where you already know that a graph is a DAG, for
example because of the way it was constructed. In that case the extra checking of the error arcs is
unnecessary. This returns a vector of node iterators in topographical order. If it turns out that
the graph is not a DAG (i.e. there are loops), it returns an empty vector.

The sort methods just presented and all the path algorithms in later sections take an arc_select callback which allows arcs to be selected or ignored for consideration in the sort or path. The selection callback function is applied to each arc in the traversal. If no function is provided (i.e. a null pointer is passed, which is the default) all of the arcs in the graph are selected.

If you want to use arc selection you should create a function with the type profile given by the arc_select_fn type. The select function is passed both the graph and the arc iterator so that it is possible to select an arc on the basis of the nodes it is connected to. Within the function, write code that returns false if the arc is to be ignored and true when selected.

For example, consider a graph with an integer value on each arc:

typedef digraph<string,int> sgraph;

You could make forward ards natural and backward arcs negative. The following selection callback will then select only the forward arcs:

bool select_natural (const sgraph& graph, sgraph::const_arc_iterator arc) { return *arc >= 0; }

This is applied by using the function name as a parameter to the sort methods:

sgraph::node_vector sort = graph.dag_sort(select_natural);

After this call, the node vector will be the topographical sort order of the forward arcs only.

Note: I used a callback because the STL-like predicate idea wasn't working for me... I kept hitting against the limitations of templates, or at least their limitations as implemented by gcc and by VC++.

A path is a series of arcs - you can use arc_from and arc_to to convert that into a series of nodes. The arc_from of the first arc in the path is the start node of the path, whilst the arc_to of the last arc is the end node of the path. A path must have at least one arc in it, but an empty arc_vector can be returned by some of these functions to indicate an error.

As with the topographical sorts in the last section, arcs can be selected or ignored as candidates for forming paths by using a select callback which is passed as an optional parameter to all of the path functions.

The simplest path function is the test of whether two nodes are connected by a path:

bool path_exists(const_iterator from, const_iterator to, arc_select_fn = 0) const; bool path_exists(iterator from, iterator to, arc_select_fn = 0);

Nuff said?

You can also get the set of all paths between two nodes.

const_path_vector all_paths(const_iterator from, const_iterator to, arc_select_fn = 0) const; path_vector all_paths(iterator from, iterator to, arc_select_fn = 0);

The return value is a vector of paths, where an empty return vector means that there are no paths between the nodes.

Note: path_exists() is faster than calling all_paths() and then checking for an empty result. However, calling both functions is only a good idea if there is a high false rate expected. In other words if you write this:

... if (graph.path_exists(from,to)) { digraph<my_node,my_arc>::path_vector paths = graph.all_paths(from,to); ... }

This is worthwhile if there are few paths so that the path_exists nearly always returns false, but inefficient if the path is likely to exist. In that case it is better to rewrite:

... digraph<my_node,mo_arc>::path_vector paths = graph.all_paths(from,to); if (!paths.empty()) { ... }

The next set of path algorithms are functions which calculate the set of all nodes in the graph which are reachable from a particular node (i.e. there exists an output path) and a complementary set which calculate the set of nodes which can reach that node (i.e. there exists a path to the node's inputs). I refer to the former set as the reachable nodes and the latter as the reaching nodes.

const_node_vector reachable_nodes(const_iterator from, arc_select_fn = 0) const; node_vector reachable_nodes(iterator from, arc_select_fn = 0); const_node_vector reaching_nodes(const_iterator to, arc_select_fn = 0) const; node_vector reaching_nodes(iterator to, arc_select_fn = 0);

The shortest path algorithms analyse the paths between nodes and calculate the shortest. This is an unweighted algorith which simply counts the number of arcs to get the path length. However, it is more efficient than calling all_nodes and then selecting the shortest.

const_arc_vector shortest_path(const_iterator from, const_iterator to, arc_select_fn = 0) const; arc_vector shortest_path(iterator from, iterator to, arc_select_fn = 0);

The functions return a single path. If that path is empty, then no paths were found between the nodes specified. Otherwise, the size of the vector is the path length.

A variant of the shortest path function is the shortest_paths function (notice the plural). This calculates the set of all reachable nodes form the specified node and then calculates the shortest path for each of those nodes. The algorithm is, however, more efficient than calling reachable_nodes followed by repeated calls to shortest_path.

const_path_vector shortest_paths(const_iterator from, arc_select_fn = 0) const; path_vector shortest_paths(iterator from, arc_select_fn = 0);

The result is a vector of paths in no particular order. To find which reachable node each path referes to, you can find the arc_to node of the last arc in the path. There is one path in the path_vector for each reachable nodes, so the set of arc_to nodes for the path_vector is also the set of all reachable nodes.

There are three exceptions that can be thrown by digraph, all indicating a misuse of the iterators:

**null_dereference**- logic_error- This is used to indicate that an iterator which is null has been dereferenced. You should always
be sure that iterators are non-null before using the
`*`

or`->`

operators. **end_dereference**- logic_error- This is used to indicate that an iterator that is pointing to an end element has been dereferenced. In line with STL conventions, the end iterator points to the element after the last element in a container, so dereferencing such an iterator is illegal (the memory pointed to will be unconstructed).
**wrong_object**- logic_error- This is used to indicate that an iterator which was created by one container has been used in a different container. For example, if you have two graph objects in your program, create an iterator from one graph and then use it in another, then this will throw the wrong_object exception.

These exceptions are common to all STLplus iterators and are explained in the STLplus exceptions policy.

Here's an example which just shows how all the bits fit together:

#include "digraph.hpp" #include "persistent_contexts.hpp" #include "persistent_digraph.hpp" #include "persistent_vector.hpp" #include "persistent_string.hpp" #include "persistent_int.hpp" #include "persistent_shortcuts.hpp" #include "build.hpp" #include <vector> #include <iostream> //////////////////////////////////////////////////////////////////////////////// typedef stlplus::digraph<std::string,int> string_int_graph; //////////////////////////////////////////////////////////////////////////////// std::ostream& operator<<(std::ostream& output, string_int_graph::iterator i) { return output << ((long)((void*)&*i)) << "->" << *i; } std::ostream& operator<<(std::ostream& output, string_int_graph::arc_iterator i) { return output << ((long)((void*)&*i)) << "->" << *i; } std::ostream& operator<<(std::ostream& output, string_int_graph::arc_vector arcs) { for (unsigned i = 0; i < arcs.size(); i++) { if (i > 0) output << " : "; output << arcs[i]; } return output; } std::ostream& operator<<(std::ostream& output, string_int_graph::path_vector paths) { for (unsigned i = 0; i < paths.size(); i++) { if (i > 0) output << " - "; output << paths[i]; } return output; } std::ostream& operator<<(std::ostream& output, string_int_graph::node_vector nodes) { for (unsigned i = 0; i < nodes.size(); i++) { if (i > 0) output << " - "; output << nodes[i]; } return output; } std::ostream& operator<<(std::ostream& output, string_int_graph& graph) { output << "nodes:" << std::endl; for (string_int_graph::iterator n = graph.begin(); n != graph.end(); ++n) output << n << " inputs: " << graph.inputs(n) << " outputs: " << graph.outputs(n) << std::endl; output << "arcs:" << std::endl; for (string_int_graph::arc_iterator a = graph.arc_begin(); a != graph.arc_end(); ++a) output << a << " from " << graph.arc_from(a) << " to " << graph.arc_to(a) << std::endl; return output; } //////////////////////////////////////////////////////////////////////////////// bool select_less_than_4 (const string_int_graph& graph, string_int_graph::const_arc_iterator arc) { return abs(*arc) < 4; } bool select_natural (const string_int_graph& graph, string_int_graph::const_arc_iterator arc) { return *arc >= 0; } static void test (string_int_graph& graph) { for (string_int_graph::iterator i = graph.begin(); i != graph.end(); i++) { for (string_int_graph::iterator j = graph.begin(); j != graph.end(); j++) { if (!graph.adjacent(i, j)) std::cout << " " << *i << " is NOT adjacent to " << *j << std::endl; else std::cout << " " << *i << " is adjacent to " << *j << std::endl; if (!graph.path_exists(i, j)) std::cout << " " << *i << " does NOT have a path to " << *j << std::endl; else { std::cout << " " << *i << " has a path to " << *j << std::endl; std::cout << " " << "paths from " << *i << " to " << *j << " are: " << graph.all_paths(i, j) << std::endl; std::cout << " " << "shortest path from " << *i << " to " << *j << " is: " << graph.shortest_path(i, j) << std::endl; } } std::cout << " " << "shortest paths for " << *i << " less than " << 4 << " are: " << graph.shortest_paths(i, select_less_than_4) << std::endl; std::cout << " " << "reachable nodes from " << *i << " are: " << graph.reachable_nodes(i) << std::endl; std::cout << " " << "nodes which can reach " << *i << " are: " << graph.reaching_nodes(i) << std::endl; } std::pair<string_int_graph::node_vector,string_int_graph::arc_vector> sort = graph.sort(select_natural); std::cout << " " << "topographical sort: " << sort.first; if (!sort.second.empty()) { std::cout << ", errors = " << sort.second; } std::cout << std::endl; std::cout << " " << "DAG sort: " << graph.dag_sort(select_natural) << std::endl; } void dump_string_int_graph(stlplus::dump_context& context, const string_int_graph& graph) { stlplus::dump_digraph(context, graph, stlplus::dump_string, stlplus::dump_int); } void restore_string_int_graph(stlplus::restore_context& context, string_int_graph& graph) { stlplus::restore_digraph(context, graph, stlplus::restore_string, stlplus::restore_int); } //////////////////////////////////////////////////////////////////////////////// typedef std::vector<string_int_graph::iterator> node_vector; void dump_string_int_graph_iterator(stlplus::dump_context& context, const string_int_graph::iterator& node) { stlplus::dump_digraph_iterator(context, node); } void dump_node_vector(stlplus::dump_context& context, const node_vector& nodes) { stlplus::dump_vector(context, nodes, dump_string_int_graph_iterator); } void restore_string_int_graph_iterator(stlplus::restore_context& context, string_int_graph::iterator& node) { stlplus::restore_digraph_iterator(context, node); } void restore_node_vector(stlplus::restore_context& context, node_vector& nodes) { stlplus::restore_vector(context, nodes, restore_string_int_graph_iterator); } //////////////////////////////////////////////////////////////////////////////// typedef std::vector<string_int_graph::arc_iterator> arc_vector; void dump_string_int_graph_arc_iterator(stlplus::dump_context& context, const string_int_graph::arc_iterator& arc) { stlplus::dump_digraph_arc_iterator(context, arc); } void dump_arc_vector(stlplus::dump_context& context, const arc_vector& arcs) { stlplus::dump_vector(context, arcs, dump_string_int_graph_arc_iterator); } void restore_string_int_graph_arc_iterator(stlplus::restore_context& context, string_int_graph::arc_iterator& arc) { stlplus::restore_digraph_arc_iterator(context, arc); } void restore_arc_vector(stlplus::restore_context& context, arc_vector& arcs) { stlplus::restore_vector(context, arcs, restore_string_int_graph_arc_iterator); } //////////////////////////////////////////////////////////////////////////////// class test_graph { public: string_int_graph m_graph; node_vector m_nodes; arc_vector m_arcs; string_int_graph::iterator insert(const std::string& node_data) { string_int_graph::iterator result = m_graph.insert(node_data); m_nodes.push_back(result); return result; } string_int_graph::arc_iterator arc_insert(string_int_graph::iterator from, string_int_graph::iterator to, int arc_data) { string_int_graph::arc_iterator result = m_graph.arc_insert(from,to,arc_data); m_arcs.push_back(result); return result; } void dump(stlplus::dump_context& context) const { dump_string_int_graph(context,m_graph); dump_node_vector(context,m_nodes); dump_arc_vector(context,m_arcs); } void restore(stlplus::restore_context& context) { restore_string_int_graph(context,m_graph); restore_node_vector(context,m_nodes); restore_arc_vector(context,m_arcs); } bool report (void) { bool result = true; std::cout << "== Graph ====================" << std::endl; std::cout << m_graph; std::cout << "== Nodes ====================" << std::endl; std::cout << m_nodes << std::endl; std::cout << "== Arcs =====================" << std::endl; std::cout << m_arcs << std::endl; std::cout << "== Tests ====================" << std::endl; for (unsigned n = 0; n < m_nodes.size(); n++) { if (m_nodes[n].owner() != &m_graph) { std::cout << "ERROR: node " << m_nodes[n] << " doesn't belong to graph" << std::endl; result = false; } } for (unsigned a = 0; a < m_arcs.size(); a++) { if (m_arcs[a].owner() != &m_graph) { std::cout << "ERROR: arc " << m_arcs[a] << " doesn't belong to graph" << std::endl; result = false; } } std::cout << "=============================" << std::endl; return result; } }; //////////////////////////////////////////////////////////////////////////////// void dump_test_graph(stlplus::dump_context& context, const test_graph& graph) { graph.dump(context); } void restore_test_graph(stlplus::restore_context& context, test_graph& graph) { graph.restore(context); } //////////////////////////////////////////////////////////////////////////////// int main(int argc, char* argv[]) { std::cerr << stlplus::build() << std::endl; bool result = true; try { test_graph graph; // Note: backward arcs are labelled with negative values string_int_graph::iterator node1 = graph.insert("node1"); string_int_graph::iterator node2 = graph.insert("node2"); string_int_graph::iterator node3 = graph.insert("node3"); string_int_graph::iterator node4 = graph.insert("node4"); string_int_graph::iterator node5 = graph.insert("node5"); graph.arc_insert(node1, node2, 1); graph.arc_insert(node2, node4, 2); graph.arc_insert(node4, node3, 3); string_int_graph::arc_iterator arc4 = graph.arc_insert(node1, node3, 4); graph.arc_insert(node3, node5, 5); graph.arc_insert(node5, node4, -6); graph.arc_insert(node4, node4, -7); std::cout << "### initial graph:" << std::endl; result &= graph.report(); test(graph.m_graph); std::cout << "dumping to file" << std::endl; stlplus::dump_to_file(graph, "test.tmp", dump_test_graph, 0); test_graph graph2; std::cout << "restoring from file" << std::endl; stlplus::restore_from_file("test.tmp", graph2, restore_test_graph, 0); std::cout << "after restore from file:" << std::endl; result &= graph2.report(); test(graph2.m_graph); } catch(const std::exception& exception) { std::cout << "caught exception: " << exception.what() << std::endl; result = false; } if (result) std::cout << "... test PASSED" << std::endl; else std::cout << "... test FAILED" << std::endl; return result ? 0 : 1; }

And here's the output:

STLplus version 3.2, Windows, gcc v3.4.5 (mingw-vista special r3), debug ### initial graph: == Graph ==================== nodes: 4079404->node1 inputs: outputs: 4073164->1 : 4073412->4 4072772->node2 inputs: 4073164->1 outputs: 4073212->2 4072860->node3 inputs: 4073316->3 : 4073412->4 outputs: 4073468->5 4072988->node4 inputs: 4073212->2 : 4073540->-6 : 4073604->-7 outputs: 4073316->3 : 4073604->-7 4073092->node5 inputs: 4073468->5 outputs: 4073540->-6 arcs: 4073164->1 from 4079404->node1 to 4072772->node2 4073212->2 from 4072772->node2 to 4072988->node4 4073316->3 from 4072988->node4 to 4072860->node3 4073412->4 from 4079404->node1 to 4072860->node3 4073468->5 from 4072860->node3 to 4073092->node5 4073540->-6 from 4073092->node5 to 4072988->node4 4073604->-7 from 4072988->node4 to 4072988->node4 == Nodes ==================== 4079404->node1 - 4072772->node2 - 4072860->node3 - 4072988->node4 - 4073092->node5 == Arcs ===================== 4073164->1 : 4073212->2 : 4073316->3 : 4073412->4 : 4073468->5 : 4073540->-6 : 4073604->-7 == Tests ==================== ============================= node1 is NOT adjacent to node1 node1 does NOT have a path to node1 node1 is adjacent to node2 node1 has a path to node2 paths from node1 to node2 are: 4073164->1 shortest path from node1 to node2 is: 4073164->1 node1 is adjacent to node3 node1 has a path to node3 paths from node1 to node3 are: 4073164->1 : 4073212->2 : 4073316->3 - 4073164->1 : 4073212->2 : 4073604->-7 : 4073316->3 - 4073412->4 shortest path from node1 to node3 is: 4073412->4 node1 is NOT adjacent to node4 node1 has a path to node4 paths from node1 to node4 are: 4073164->1 : 4073212->2 - 4073412->4 : 4073468->5 : 4073540->-6 shortest path from node1 to node4 is: 4073164->1 : 4073212->2 node1 is NOT adjacent to node5 node1 has a path to node5 paths from node1 to node5 are: 4073164->1 : 4073212->2 : 4073316->3 : 4073468->5 - 4073164->1 : 4073212->2 : 4073604->-7 : 4073316->3 : 4073468->5 - 4073412->4 : 4073468->5 shortest path from node1 to node5 is: 4073412->4 : 4073468->5 shortest paths for node1 less than 4 are: 4073164->1 - 4073164->1 : 4073212->2 : 4073316->3 - 4073164->1 : 4073212->2 reachable nodes from node1 are: 4072772->node2 - 4072860->node3 - 4072988->node4 - 4073092->node5 nodes which can reach node1 are: node2 is NOT adjacent to node1 node2 does NOT have a path to node1 node2 is NOT adjacent to node2 node2 does NOT have a path to node2 node2 is NOT adjacent to node3 node2 has a path to node3 paths from node2 to node3 are: 4073212->2 : 4073316->3 - 4073212->2 : 4073604->-7 : 4073316->3 shortest path from node2 to node3 is: 4073212->2 : 4073316->3 node2 is adjacent to node4 node2 has a path to node4 paths from node2 to node4 are: 4073212->2 shortest path from node2 to node4 is: 4073212->2 node2 is NOT adjacent to node5 node2 has a path to node5 paths from node2 to node5 are: 4073212->2 : 4073316->3 : 4073468->5 - 4073212->2 : 4073604->-7 : 4073316->3 : 4073468->5 shortest path from node2 to node5 is: 4073212->2 : 4073316->3 : 4073468->5 shortest paths for node2 less than 4 are: 4073212->2 : 4073316->3 - 4073212->2 reachable nodes from node2 are: 4072860->node3 - 4072988->node4 - 4073092->node5 nodes which can reach node2 are: 4079404->node1 node3 is NOT adjacent to node1 node3 does NOT have a path to node1 node3 is NOT adjacent to node2 node3 does NOT have a path to node2 node3 is NOT adjacent to node3 node3 has a path to node3 paths from node3 to node3 are: 4073468->5 : 4073540->-6 : 4073316->3 - 4073468->5 : 4073540->-6 : 4073604->-7 : 4073316->3 shortest path from node3 to node3 is: 4073468->5 : 4073540->-6 : 4073316->3 node3 is NOT adjacent to node4 node3 has a path to node4 paths from node3 to node4 are: 4073468->5 : 4073540->-6 shortest path from node3 to node4 is: 4073468->5 : 4073540->-6 node3 is adjacent to node5 node3 has a path to node5 paths from node3 to node5 are: 4073468->5 shortest path from node3 to node5 is: 4073468->5 shortest paths for node3 less than 4 are: reachable nodes from node3 are: 4072988->node4 - 4073092->node5 nodes which can reach node3 are: 4072772->node2 - 4072988->node4 - 4073092->node5 - 4079404->node1 node4 is NOT adjacent to node1 node4 does NOT have a path to node1 node4 is NOT adjacent to node2 node4 does NOT have a path to node2 node4 is adjacent to node3 node4 has a path to node3 paths from node4 to node3 are: 4073316->3 - 4073604->-7 : 4073316->3 shortest path from node4 to node3 is: 4073316->3 node4 is adjacent to node4 node4 has a path to node4 paths from node4 to node4 are: 4073316->3 : 4073468->5 : 4073540->-6 - 4073604->-7 shortest path from node4 to node4 is: 4073604->-7 node4 is NOT adjacent to node5 node4 has a path to node5 paths from node4 to node5 are: 4073316->3 : 4073468->5 - 4073604->-7 : 4073316->3 : 4073468->5 shortest path from node4 to node5 is: 4073316->3 : 4073468->5 shortest paths for node4 less than 4 are: 4073316->3 reachable nodes from node4 are: 4072860->node3 - 4073092->node5 nodes which can reach node4 are: 4072772->node2 - 4072860->node3 - 4073092->node5 - 4079404->node1 node5 is NOT adjacent to node1 node5 does NOT have a path to node1 node5 is NOT adjacent to node2 node5 does NOT have a path to node2 node5 is NOT adjacent to node3 node5 has a path to node3 paths from node5 to node3 are: 4073540->-6 : 4073316->3 - 4073540->-6 : 4073604->-7 : 4073316->3 shortest path from node5 to node3 is: 4073540->-6 : 4073316->3 node5 is adjacent to node4 node5 has a path to node4 paths from node5 to node4 are: 4073540->-6 shortest path from node5 to node4 is: 4073540->-6 node5 is NOT adjacent to node5 node5 has a path to node5 paths from node5 to node5 are: 4073540->-6 : 4073316->3 : 4073468->5 - 4073540->-6 : 4073604->-7 : 4073316->3 : 4073468->5 shortest path from node5 to node5 is: 4073540->-6 : 4073316->3 : 4073468->5 shortest paths for node5 less than 4 are: reachable nodes from node5 are: 4072860->node3 - 4072988->node4 nodes which can reach node5 are: 4072772->node2 - 4072860->node3 - 4072988->node4 - 4079404->node1 topographical sort: 4079404->node1 - 4072772->node2 - 4072988->node4 - 4072860->node3 - 4073092->node5 DAG sort: 4079404->node1 - 4072772->node2 - 4072988->node4 - 4072860->node3 - 4073092->node5 dumping to file restoring from file after restore from file: == Graph ==================== nodes: 4073772->node1 inputs: outputs: 4089004->1 : 4089268->4 4088500->node2 inputs: 4089004->1 outputs: 4089100->2 4088612->node3 inputs: 4089164->3 : 4089268->4 outputs: 4089372->5 4088820->node4 inputs: 4089100->2 : 4089476->-6 : 4089596->-7 outputs: 4089164->3 : 4089596->-7 4088932->node5 inputs: 4089372->5 outputs: 4089476->-6 arcs: 4089004->1 from 4073772->node1 to 4088500->node2 4089100->2 from 4088500->node2 to 4088820->node4 4089164->3 from 4088820->node4 to 4088612->node3 4089268->4 from 4073772->node1 to 4088612->node3 4089372->5 from 4088612->node3 to 4088932->node5 4089476->-6 from 4088932->node5 to 4088820->node4 4089596->-7 from 4088820->node4 to 4088820->node4 == Nodes ==================== 4073772->node1 - 4088500->node2 - 4088612->node3 - 4088820->node4 - 4088932->node5 == Arcs ===================== 4089004->1 : 4089100->2 : 4089164->3 : 4089268->4 : 4089372->5 : 4089476->-6 : 4089596->-7 == Tests ==================== ============================= node1 is NOT adjacent to node1 node1 does NOT have a path to node1 node1 is adjacent to node2 node1 has a path to node2 paths from node1 to node2 are: 4089004->1 shortest path from node1 to node2 is: 4089004->1 node1 is adjacent to node3 node1 has a path to node3 paths from node1 to node3 are: 4089004->1 : 4089100->2 : 4089164->3 - 4089004->1 : 4089100->2 : 4089596->-7 : 4089164->3 - 4089268->4 shortest path from node1 to node3 is: 4089268->4 node1 is NOT adjacent to node4 node1 has a path to node4 paths from node1 to node4 are: 4089004->1 : 4089100->2 - 4089268->4 : 4089372->5 : 4089476->-6 shortest path from node1 to node4 is: 4089004->1 : 4089100->2 node1 is NOT adjacent to node5 node1 has a path to node5 paths from node1 to node5 are: 4089004->1 : 4089100->2 : 4089164->3 : 4089372->5 - 4089004->1 : 4089100->2 : 4089596->-7 : 4089164->3 : 4089372->5 - 4089268->4 : 4089372->5 shortest path from node1 to node5 is: 4089268->4 : 4089372->5 shortest paths for node1 less than 4 are: 4089004->1 - 4089004->1 : 4089100->2 : 4089164->3 - 4089004->1 : 4089100->2 reachable nodes from node1 are: 4088500->node2 - 4088612->node3 - 4088820->node4 - 4088932->node5 nodes which can reach node1 are: node2 is NOT adjacent to node1 node2 does NOT have a path to node1 node2 is NOT adjacent to node2 node2 does NOT have a path to node2 node2 is NOT adjacent to node3 node2 has a path to node3 paths from node2 to node3 are: 4089100->2 : 4089164->3 - 4089100->2 : 4089596->-7 : 4089164->3 shortest path from node2 to node3 is: 4089100->2 : 4089164->3 node2 is adjacent to node4 node2 has a path to node4 paths from node2 to node4 are: 4089100->2 shortest path from node2 to node4 is: 4089100->2 node2 is NOT adjacent to node5 node2 has a path to node5 paths from node2 to node5 are: 4089100->2 : 4089164->3 : 4089372->5 - 4089100->2 : 4089596->-7 : 4089164->3 : 4089372->5 shortest path from node2 to node5 is: 4089100->2 : 4089164->3 : 4089372->5 shortest paths for node2 less than 4 are: 4089100->2 : 4089164->3 - 4089100->2 reachable nodes from node2 are: 4088612->node3 - 4088820->node4 - 4088932->node5 nodes which can reach node2 are: 4073772->node1 node3 is NOT adjacent to node1 node3 does NOT have a path to node1 node3 is NOT adjacent to node2 node3 does NOT have a path to node2 node3 is NOT adjacent to node3 node3 has a path to node3 paths from node3 to node3 are: 4089372->5 : 4089476->-6 : 4089164->3 - 4089372->5 : 4089476->-6 : 4089596->-7 : 4089164->3 shortest path from node3 to node3 is: 4089372->5 : 4089476->-6 : 4089164->3 node3 is NOT adjacent to node4 node3 has a path to node4 paths from node3 to node4 are: 4089372->5 : 4089476->-6 shortest path from node3 to node4 is: 4089372->5 : 4089476->-6 node3 is adjacent to node5 node3 has a path to node5 paths from node3 to node5 are: 4089372->5 shortest path from node3 to node5 is: 4089372->5 shortest paths for node3 less than 4 are: reachable nodes from node3 are: 4088820->node4 - 4088932->node5 nodes which can reach node3 are: 4073772->node1 - 4088500->node2 - 4088820->node4 - 4088932->node5 node4 is NOT adjacent to node1 node4 does NOT have a path to node1 node4 is NOT adjacent to node2 node4 does NOT have a path to node2 node4 is adjacent to node3 node4 has a path to node3 paths from node4 to node3 are: 4089164->3 - 4089596->-7 : 4089164->3 shortest path from node4 to node3 is: 4089164->3 node4 is adjacent to node4 node4 has a path to node4 paths from node4 to node4 are: 4089164->3 : 4089372->5 : 4089476->-6 - 4089596->-7 shortest path from node4 to node4 is: 4089596->-7 node4 is NOT adjacent to node5 node4 has a path to node5 paths from node4 to node5 are: 4089164->3 : 4089372->5 - 4089596->-7 : 4089164->3 : 4089372->5 shortest path from node4 to node5 is: 4089164->3 : 4089372->5 shortest paths for node4 less than 4 are: 4089164->3 reachable nodes from node4 are: 4088612->node3 - 4088932->node5 nodes which can reach node4 are: 4073772->node1 - 4088500->node2 - 4088612->node3 - 4088932->node5 node5 is NOT adjacent to node1 node5 does NOT have a path to node1 node5 is NOT adjacent to node2 node5 does NOT have a path to node2 node5 is NOT adjacent to node3 node5 has a path to node3 paths from node5 to node3 are: 4089476->-6 : 4089164->3 - 4089476->-6 : 4089596->-7 : 4089164->3 shortest path from node5 to node3 is: 4089476->-6 : 4089164->3 node5 is adjacent to node4 node5 has a path to node4 paths from node5 to node4 are: 4089476->-6 shortest path from node5 to node4 is: 4089476->-6 node5 is NOT adjacent to node5 node5 has a path to node5 paths from node5 to node5 are: 4089476->-6 : 4089164->3 : 4089372->5 - 4089476->-6 : 4089596->-7 : 4089164->3 : 4089372->5 shortest path from node5 to node5 is: 4089476->-6 : 4089164->3 : 4089372->5 shortest paths for node5 less than 4 are: reachable nodes from node5 are: 4088612->node3 - 4088820->node4 nodes which can reach node5 are: 4073772->node1 - 4088500->node2 - 4088612->node3 - 4088820->node4 topographical sort: 4073772->node1 - 4088500->node2 - 4088820->node4 - 4088612->node3 - 4088932->node5 DAG sort: 4073772->node1 - 4088500->node2 - 4088820->node4 - 4088612->node3 - 4088932->node5 ... test PASSED