# TD 9 - Mapper

The goal of this lab session is to implement the Mapper algorithm and to explore various datasets.

## 0. Goal

Our goal here is to implement the algorithm for building Mapper from a point cloud. We provide a skeleton code in C++. Your task is to fill in the blanks in this code, and to test it on various datasets. We will proceed in several steps.

## 1. Data Structure

Q1. Fill the "Point" class. You will describe a point with:

• an integer ID,
• a filter value,
• a vector of coordinates.
You should also implement a constructor that takes a vector of double (the coordinates) as input and a method to compute the Euclidean distance.

Q2. Fill the "Cover" class. You will describe a cover with a vector of pair of double. You should also implement a constructor that takes an int (the resolution) and a double (the gain) as input.

• read_cloud that builds a Cloud (i.e. a vector of Points) from the name of a file.
• read_function_from_file that takes the name of a file and a Cloud (i.e. a vector of Points) as inputs, and assigns filter values to the Points based on the values given in the file.
• read_coordinate that takes a number k and a Cloud as inputs, and uses the kth coordinate to give filter values to the Points.

## 2. Functions

We provide the build_neighborhood_graph function, that outputs a Graph (i.e. a map from Point (nodes) to vectors of Point (neighbors)) from the Cloud and from a parameter delta, which is a ratio of the diameter of the Cloud (i.e. the maximal pairwise Euclidean distance).

Q3. Fill the count_cc function, which takes as input a graph G and an index id, and which outputs the set of connected components (cc) of G, indexed starting from id. The output is stored in a ConnectedComp, i.e. a map from integers (IDs of cc) to vectors of Point.
Hint: use a depth first search.

Q4. Fill the MapperElts function, which takes as input a graph G and a Cover I, and which outputs a vector of ConnectedComp, each element of the vector being the set of cc of a particular subgraph of the Cloud given by the preimage of a specific interval (or intersection of a pair of intervals) of the Cover.

Q5. Fill the build_mapper_graph function, which connects the nodes given by MapperElts and outputs the Mapper (i.e. a map from integers (IDs of nodes) to vectors of integers (IDs of neighbors)). A particular neighbor of a node should appear as many times in the node vector as there are cc in the corresponding intersection.

We provide the main function, which calls the previous functions to do the calculations. Note that it has another parameter, the so-called mask, which removes all nodes that contain fewer than a specified number of Cloud points. The output is a .dot file (standard format for graphs), called mapper.dot.

You can use the Graphviz library to visualize the .dot file. You need to download the library Download page (executable packages are available for Linux, Mac and Windows). Once the library installed, you can type neato -Tpdf mapper.dot -o mapper.pdf in a terminal to produce a .pdf of the visualized graph.

Note: at this stage you will have no color in the output graph. If you want colors, you have to assign a color to each node, depending on (say) the average of the filter values over the set of data points represented by the node.

## 3. Exploration

Q6. Use the code and find a good cover to compute a skeleton of the double torus that has the correct vertical homology for the z-coordinate function (3rd column). Try also to compute a skeleton of one of the 3D shapes of TD6, using the z-coordinate function again (2nd column). Here are two possible skeletons:

Q7. Test various covers and neighborhood graphs to retrieve the topological structure of the crater of TD1. You can use the x-coordinate or the density function as filter. Here are two possible skeletons. Can you guess which function was used as a filter for each image, and explain the shape of the Mapper graph?

Q8. Test various parameters and filters to explore the topological structure of the natural images and the NBA datasets from TD1.

Here is our solution.