Graphs and minimum spanning trees
par Eric Fusy et Karthik Bhargavan

 Login :  Mot de passe :

In this TD, we will implement a particular type of graph, where nodes represent cities on a map and arcs are drawn between neighbouring cities (adjacencies are not determined by road connections, but instead by the geodesic distance, precisely two cities are considered as adjacent if their geodesic distance is smaller than a certain maximal value). We will then implement two classical (greedy) algorithms, Kruskal's algorithm and Prim's algorithm, to find the minimum spanning tree in each connected component of the graph of cities, whose arcs are weighted by the geodesic distance. (You will normally be expected to complete at least up to question 6.)

Example of a Graph and its minimum spanning tree

Left: a graph with two connected components. Right: the arcs of the graph are assigned weights; the arcs of the two minimum spanning trees (one for each connected component) are shown bolder.

Java standard library

We will use different interfaces and classes coming from the Java standard library:

You are encouraged to look at those libraries to see which methods are available on each data structure.

Getting started

To begin with, create a new project TD5. We give you partial code, available at td_5.zip. Download it and unzip it within the TD5 project (in the src/ folder if the project has one). This will create three packages called arbres, carte and graphe, and a Test class (remember to refresh within Eclipse using "F5").

The package arbres contains the squeletons of three classes: ArcComparator, UnionFind, and SpanningTree, which will have to be completed in exercises 3 to 8 (in order to compute minimum spanning trees of proximity graphs of cities).

The package carte contains two classes, Carte and Visualisation, which are used to load and visualise a map in a graphical window. These classes should not be modified.

The package graphe contains four classes:

You are encouraged to look at the Ville and Arc classes before starting.

We will test the programs on the maps of Mayotte and France. Download the files mf.txt and fr.txt, and install these two files in the main directory TD5 of the TD. If you would like to work on files for other countries, you can find them here.

Proximity graphs

We will construct graphs of cities, where each Ville will be connected by an Arc to a number of other Ville that we call its neighbours.

The class GraphVille

First, consider the incomplete class GrapheVille given in package graphe. It contains a field toNeighbours of type Map<Ville, List<Arc>> that maps each ville v to an ordered collection of Arcs, each of which has v as source and one of v's neighbours as target.

public class GrapheVille {
  private Map<Ville, List<Arc>> toNeighbours;

  // Initialize toNeighbours
  public GrapheVille() {...}

  // Is Ville v in the graph?
  boolean contient(Ville v){...}

  // All the Villes in the graph
  Set<Ville> villes(){...}

  // Add a Ville to the graph
  void ajouterVille(Ville v){...}

  // Add an Arc to the graph
  void ajouterArc(Arc arc){..}

  // All the arcs that go out from Ville v to its neighbours
  List<Arc> arcsSortant(Ville v){...}

}

Complete the class GrapheVille by implementing the methods shown here.

For now, the body of method connectNeighbours must be left empty.

Test with the method test1 in Test.java. The result should look as follows:

There are 87 nodes in this graph.
Accua	Latitude: -12.7225	Longitude: 45.0563889
Andrema	Latitude: -12.6786111	Longitude: 45.0977778
Apandzo	Latitude: -12.8305556	Longitude: 45.1302778
Bambo Est	Latitude: -12.9261111	Longitude: 45.1738889
Bambo Ouest	Latitude: -12.9219444	Longitude: 45.0872222
Bandraboua	Latitude: -12.7016667	Longitude: 45.1205556
Bandrajou	Latitude: -12.7452778	Longitude: 45.2186111
Bandrani	Latitude: -12.845	Longitude: 45.1019444
Bandrele	Latitude: -12.9066667	Longitude: 45.1913889
There are 86 arcs in this graph.

and the window:

resultat mf aretes

Upload GrapheVille.java.

Connecting Neighbours

Next, we construct graphs where neighbours are any two cities within a certain distance of each other. Now fill in the method connectNeighbours that takes an argument maxDistance and adds an Arcs between any two distinct Ville (using equals) that are less than or equal to maxDistance away from each other.

public void connectNeighbours(double maxDistance){...}

You can assume that the graph already contains all cities as nodes but no arc. You will use the method double distance(Ville dest) of the class Ville that computes the distance between the current Ville and any other Ville. You will also use the method ajouterArc of the class GrapheVille to add a new Arc between neighbouring Ville.

Test with test2. The result should be:

There are 166 arcs in this graph.

and you should obtain the following window, where the color of each arc is a function of its length field:

Proximity graph for Mayotte

To compute the proximity graph for France, replace mf.txt in initMayotte with fr.txt (and the two cities with Palaiseau and Montanceix). What do you obtain? Since the time taken by connectNeighbours is quadratic in the number of cities, you will find that constructing this graph takes a long time. Instead, from this exercise onwards, all our tests will use the function initFrance that uses a more efficient graph construction (implemented by the GrapheEuclidean in package graphe).

Upload GrapheVille.java.

Minimum spanning trees/forests

We now focus on the problem of computing minimum spanning trees of a graph G. Since G can be made of several connected components G1,...,Gk (as with the proximity graph when the maximal value for the distances becomes small), we consider more generally the problem of computing the minimum ''connecting spanning forest'' of G (a connecting spanning forest of G is a forest of trees T1,...,Tk such that each Ti is a spanning tree of Gi, for each i in [1..k]), as shown in the introductory figure of the TD.

Comparing arcs

Since Kruskal's and Prim's algorithm consider the arcs in increasing order, we need to have a comparator for arcs. Fill in the method compare of ArcComparator.java, which compares two Arc objects according to the length (which is an attribute of the class Arc).

Test with test3. The result should be:

-1
1
1
0

Upload ArcComparator.java.

Union-find

For Kruskal's algorithm, we will also need a union-find structure for Arc objects, which is to be implemented in the class UnionFind.java. We would like the UnionFind to rely on a HashMap mechanism. Precisely, there is an attribute parent of type HashMap<Ville,Ville> such that the instruction parent.get(v) returns the parent of v (possibly null), and the instruction parent.put(v,u) assigns u as the parent of v. Fill in the constructor and the methods union and find in the class UnionFind.java. Use the fact that parent.get(v) returns null as no parent is assigned to v. As HashMap uses equals for testing identity, you should do the same for consistency.

Remark: the union operations will be applied to arcs of a geographic map that are considered in the order given by the geodesic distance; it is reasonable to assume that the components along the way will look quite "random", so that the underlying trees of the union-find structures are expected to be quite well balanced. Therefore you do not have to use size comparison as in TD2, but you should use iterative path compression in the find method.

Test with test4. The result should be:

true
true
false

Upload UnionFind.java.

Kruskal's algorithm, all components together

We can now implement a first version of Kruskal's algorithm. At first we remind you how Kruskal's algorithm works:

  1. the arcs are sorted in increasing order of weight (you can use an instruction as Collections.sort(col,ac), where col is of type Collection<Arc>, and ac is of type ArcComparator),
  2. the arcs e1,e2,...,ek are considered one by one in increasing order, and at each step i one decides if ei is to be selected: ei is selected if and only if it satisfies the property that its two extremities are not connected using the previously selected arcs (testing this property is handled by the union-find structure).
The output (the collection of selected arcs) forms the minimum connecting spanning forest of the graph. Note that this forest will not contain some trivial components which are formed from a single vertex and which are therefore without an arc.
To initially collect all the arcs of the GraphVille g (which are then to be sorted at the beginning of the method kruskal), create an auxiliary method public List<Arc> tousLesArcs() in the class GrapheVille.

Upload GrapheVille.java.

Fill in the method public static Collection<Arc> kruskal(UnionFind u, GrapheVille g) in the class SpanningTree.java, which is to return the collection of the arcs that forms the minimum connecting spanning forest. The UnionFind argument u is initially empty and, at return time, it should contain a description of the non-trivial connected components of g. This side-effect will be useful in the next exercise.

Remark: beware that Collection<E> (with E the type of elements in the collection) is not a class but an interface. Therefore, it has to be instantiated using an implementing class, for instance with an instruction such as Collection<E> col=new LinkedList<E>().

Test with test5. The result should be (the computation time depending on device/implementation):

Computation time (Kruskal first version) : 613 ms
Total length:  : 110191528 m
49390 connected cities

and you should obtain the following window, where the arcs in the minimum connecting spanning forest F are colored orange, and arcs not in F are light gray:

Proximity graph for Mayotte

Upload SpanningTree.java.

Separating the components in Kruskal's algorithm

Using the first kruskal method, fill in the method public static Collection<Collection<Arc>> kruskal(GrapheVille g), which is to return the list of collections of arcs (one collection for each non-trivial tree-component) of the minimum connecting spanning forest of g.

Remark: it is recommended to use a HashMap<Ville,Collection<Arc>> listes that maps each root of the UnionFind (a root for each connected component) to the list of arcs of the minimum spanning tree of the corresponding connected component, and to return listes.values() at the end of the method.

Test with test6. The result should be:

Computation time (Kruskal second version) : 707 ms
Total length  : 110191528 m
49390 connected cities
3330 components in the forest

and you should obtain the following window, where the arcs in the minimum connecting spanning forest F are colored (a color is randomly assigned to each tree-component), and arcs not in F are light gray:

Proximity graph for Mayotte

Upload SpanningTree.java.

Prim's algorithm for the component of a starting Ville

At first we recall how Prim's algorithm works. It uses a priority queue q (here, a queue of arcs, with the priority given by having smaller distance) as an auxiliary structure. One starts at a Ville v, puts all arcs at v into q, and remove v from the set of non-visited elements. Then as long as q is not empty,

  1. pick up the next element from q (instruction q.poll()), which is an Arc a,
  2. if the target u of a is in the non-visited elements, add to q all the arcs with u as source, and remove u from the set of non-visited elements.
Note that Prim's algorithm only visits nodes (Ville objects) that are in the connected component C containing v; its output is the minimum spanning tree of C.

Fill in the method public static Collection<Arc> primTree(HashSet nonVisited, GrapheVille g, Ville start), which is to return (in the form of a collection of arcs) the minimum spanning tree of the connected component C (of g) containing the Ville object start. During the process, the HashSet object nonVisited is to be modified so that, at the end of the execution, all the Ville in C have been taken out of nonVisited (this side effect will be useful for the next exercise).

Test with test7. The result should be:

Computation time (Prim first version) : 109 ms
Total length:  : 44973288 m
20002 connected cities

and you should obtain the following window, where the arcs in the minimum spanning tree T of the component containing ''Palaiseau'' are colored orange, and arcs not in T are light gray:

Proximity graph for Mayotte

Upload SpanningTree.java.

Prim's algorithm at each component

Using the primTree method, fill in the method public static Collection<Collection<Arc>> primForest(GrapheVille g), which is to return the list of collections of arcs (one collection for each tree-component) of the minimum connecting spanning forest of g. Again, note that this forest should not contain the trivial components which are formed from a single vertex and empty sub-collections are not allowed in the ouput of primForest.

To peek (without removing) one element from a collection or a set, one can paste the following generic method inside class SpanningTree and use it properly:

  public static <E> E peekOne(Collection<E> c) {
    for (E e : c)
      return e;
    return null;
  }

Test with test8. The result should be:

Computation time (Prim second version) : 506 ms
Total length  : 110191528 m
49390 connected cities
3330 components in the forest

and you should obtain the following window, where the arcs in the minimum connecting spanning forest F are colored (a color is randomly assigned to each tree-component), and arcs not in F are light gray:

Proximity graph for Mayotte

Upload SpanningTree.java.

Testing the tree structure of a collection of arcs

Add to the class SpanningTree a method public static boolean testTree(Collection<Arc> arcs), which tests if the graph formed by the arcs in arcs and by the Ville objects incident to these arcs is a tree. As a convention, for null as the parameter, testTree will return false and, with an empty collection as the parameter, testTree will return true. You can add to SpanningTree any auxiliary method you want. Modify class Test to apply this method to the outputs of the Kruskal/Prim's methods. You should also test testTree on some typical collections of arcs that don't define a tree.

Upload SpanningTree.java.