A Developer's Notebook -

Trees - Part I

tre

render(‘/image.*’, caption: ‘Bright green tree - Waikat’, src: ‘//upload.wikimedia.org/wikipedia/commons/thumb/f/f6/Bright_green_tree_-Waikato.jpg/512px-Bright_green_tree-Waikato.jpg)](http://commons.wikimedia.org/wiki/File%3ABright_green_tree-_Waikato.jpg)

We used trees to build the heap data structure before, but we didn’t bother with the theory behind trees, which are abstract and concrete data structures themselves. There’s a huge range of material to cover so I’ll split this in several posts.

In this first post we’ll cover the basic theory and implement a binary search tree (BST), which provides O(h) time search, insert and delete operations (h is the tree height). First, the basics:

Trees are graphs with a few extra properties and interpretations/conventions. * Trees have height (longest branch length) and depth (distance to root). * The uppermost level consists of at most one node (the tree root). * All nodes may have children. * There are no edges other than parent-child edges.

Trees are classified according to some of those properties above and some others we’ll mention later. Most commonly, there is a constraint to the maximum number of children per node -e.g. the binary tree limits children to 2 per node.

Graph

Mathematically, a graph is a set of vertices and edges, thus a graph G is usually written as G(V,E). Besides linking vertices in the graph, edges can also carry a specific value which may be interpreted as cost, weight, distance etc.

graph viewed with BurgerGF

In computer science, we’re interested in the (abstract) data structure used to implement the graph mathematical concept. Let’s first discuss the basic elements in a graph - vertices and edges:

typedef struct vertex
{
 unsigned long id;
 int status;
 double x,y;
 void* data;
} vertex;

Vertices should be able to hold any kind of data, so we’ll just throw in a void pointer for that. Other than that we have an id, status (marked or unmarked - more on that later) and 2D coordinates so we can draw the vertices somewhere.

typedef struct edge
{
 vertex* from, *to;
 int cost;
} edge;

Edges consist of just pointers to the vertices they link and an optional value used as weight, distance, cost etc. Strictly speaking we could use a void pointer for that value as well, as long as we also defined a comparison function. But let’s save the hassle and just use an integer instead - most algorithms will be fine with that.

Shortest path, part I - Dijkstra's algorithm

Now that we have a way to represent graphs, we can discuss one of the most important problems in graph theory: the shortest path problem (SPP). More or less formally, we’ll define SPP as:

Given a weighted graph G(V,E), find the sequence P = {v0, v1, v2, …, v(n-1)}, vi ∈ V, from vertex V0 to vertex V(n-1), such that the list of edges EP = {(v0,v1), (v1,v2), … (v(n-2), v(n-1))} exists and the summation of costs of all elements e ∈ EP is the smallest possible.

In other words, find the less expensive (ergo “shortest”) path between two vertices.

The trivial solution is using BFS starting at vertex A and stopping when it reaches vertex B. However, BFS doesn’t look at the edge costs: it calculates the path with least edges, not the path with least total cost.

Although not necessarily the fastest, Dijkstra’s algorithm is probably the most popular way to solve the shortest path problem due to its simplicity and elegance. The algorithm relies heavily on priority queues, so make sure to take a look at that if you haven’t already.

Pseudocode

dist[from] = 0
for v : G
      if v != source
            dist[v] = infinity
      prev[v] = -1
      PQ.add(v, dist[v])
while PQ.hasNext()
      u = PQ.pop()
      for each neighbor v of u
            alt = dist[u] + length(u, v)
            if alt < dist[v]
                  dist[v] = alt
                  prev[v] = u
                  PQ.decrease_key(v,alt)
return prev

A Developer's Notebook

Posts tagged "graph"

Trees - Part I

Graph

Shortest path, part I - Dijkstra's algorithm