# TD 2b - Clustering via Mapper

The goal of this lab session is to study the Mapper of point clouds
and to gain some insight into parameter selection, both theoretically and experimentally.

## The Mapper computed on continuous spaces and point clouds

### Mapper on continuous spaces

Compute the Mappers of the following spaces.
Infer a rule of thumb to ensure that the Mapper is a graph.

Persistent homology is defined in general for tame functions on topological spaces.
As you have seen during the lecture, computing the degree-0 persistent homology of a function amounts
to counting the connected components of its sublevel sets.
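Since the height functions in these exercises live on curves, their degree-0 persistence can be illustrated on a discretization. The sketch below (the function name and the path-graph sampling are assumptions for illustration, not part of the exercise) sweeps the sublevel sets of a function sampled along a path graph and pairs connected components using the elder rule:

```python
import numpy as np

def sublevel_persistence_path(f):
    """Degree-0 persistence pairs of a function f sampled on a path graph
    (vertices 0..n-1, edges between consecutive indices), computed from
    the sublevel-set filtration with the elder rule."""
    order = np.argsort(f, kind="stable")
    parent = {}  # union-find forest; a root is the oldest vertex of its component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    pairs = []
    for i in order:          # add vertices in increasing function value
        i = int(i)
        parent[i] = i
        for j in (i - 1, i + 1):   # neighbors already in the sublevel set
            if j in parent:
                ri, rj = find(i), find(j)
                if ri == rj:
                    continue
                if f[ri] > f[rj]:  # elder rule: the younger component dies
                    ri, rj = rj, ri
                pairs.append((float(f[rj]), float(f[i])))
                parent[rj] = ri
    # the oldest component never dies
    root = find(int(order[0]))
    pairs.append((float(f[root]), float("inf")))
    # drop zero-persistence pairs created by non-critical vertices
    return [(b, d) for (b, d) in pairs if b < d]
```

For instance, a function with two local minima separated by one local maximum yields one finite pair (younger minimum, merging maximum) plus the essential pair of the global minimum.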

Compute the degree-0 persistence diagrams of the previous height functions.
Can you predict which degree-0 topological features (i.e. branches) are missing in the Mapper using these diagrams?
The general solution for any topological feature is the topic of the last lecture.

### Mapper on discrete spaces

Compute the Mapper of the point cloud on the left with neighborhood size `delta`,
using the corresponding `delta`-neighborhood graph built on top of the cloud on the right.
Compute the Mapper of the neighborhood graph, seen as a metric space.
What do you notice?
Infer a rule of thumb for the choice of the resolution `r` and the gain `g`, given a
neighborhood size `delta`.
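As a starting point for the discrete case, here is a minimal sketch (hypothetical helper names; Euclidean metric assumed) that builds the `delta`-neighborhood graph of a point cloud and computes its connected components:

```python
import numpy as np

def delta_neighborhood_graph(X, delta):
    """Boolean adjacency matrix of the delta-neighborhood graph:
    two distinct points are linked iff their Euclidean distance is <= delta."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    A = D <= delta
    np.fill_diagonal(A, False)
    return A

def connected_components(A):
    """Component label of each vertex, via depth-first search."""
    n = len(A)
    labels = -np.ones(n, dtype=int)
    c = 0
    for s in range(n):
        if labels[s] >= 0:
            continue
        stack = [s]
        labels[s] = c
        while stack:
            u = stack.pop()
            for v in np.flatnonzero(A[u]):
                if labels[v] < 0:
                    labels[v] = c
                    stack.append(v)
        c += 1
    return labels
```

Two well-separated clusters with `delta` smaller than the inter-cluster gap come out as two components, which is the clustering step Mapper applies inside each preimage.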

## Mapper algorithm and parameter selection

We provide code for the computation of Mapper for discrete metric spaces.
If you use Linux, or a Linux-like environment (such as MinGW for Windows users),
you can compile the code with `g++` by typing

`g++ mapper.cpp -o mapper -std=c++11`

in a terminal. Then, you can use the code in the following way:

`./mapper point_cloud_file_name function_file_name delta norm resolution gain`

where `norm` is either 0 or 1 and specifies whether the data should be normalized. It produces a `dot` file
that you can visualize with any graph visualization software, e.g. Graphviz.
If you use Python, you can instead use the following code, which comes from
KeplerMapper.
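The KeplerMapper snippet is not reproduced here, but the pipeline it implements can be sketched in plain `numpy`. Everything below (the function name, the cover convention where each base interval is enlarged by a fraction `gain` of its length on each side, and `delta`-neighborhood clustering in the preimages) is an illustrative assumption, not KeplerMapper's actual API:

```python
import numpy as np

def mapper_graph(X, f, delta, resolution, gain):
    """Minimal Mapper sketch: cover the range of the filter f with
    `resolution` overlapping intervals, cluster each preimage with the
    connected components of its delta-neighborhood graph, and link
    clusters that share points."""
    fmin, fmax = f.min(), f.max()
    length = (fmax - fmin) / resolution
    nodes = []  # each node is a set of point indices
    for k in range(resolution):
        # base interval [fmin + k*length, fmin + (k+1)*length],
        # enlarged by gain*length on each side
        lo = fmin + k * length - gain * length
        hi = fmin + (k + 1) * length + gain * length
        idx = np.flatnonzero((f >= lo) & (f <= hi))
        if idx.size == 0:
            continue
        # connected components of the delta-neighborhood graph on the preimage
        D = np.linalg.norm(X[idx, None, :] - X[None, idx, :], axis=-1)
        A = D <= delta
        seen = np.zeros(idx.size, dtype=bool)
        for s in range(idx.size):
            if seen[s]:
                continue
            stack, comp = [s], {s}
            seen[s] = True
            while stack:
                u = stack.pop()
                for v in np.flatnonzero(A[u]):
                    if not seen[v]:
                        seen[v] = True
                        comp.add(v)
                        stack.append(v)
            nodes.append(set(idx[list(comp)]))
    # one edge per pair of clusters sharing at least one point
    edges = [(a, b) for a in range(len(nodes)) for b in range(a + 1, len(nodes))
             if nodes[a] & nodes[b]]
    return nodes, edges
```

On a uniformly sampled circle with the height coordinate as filter, this produces a cycle-shaped graph, as expected from the continuous computation in the first section.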

There is a general procedure to automatically compute good parameters for point clouds that are samplings of manifolds.
- The rules of thumb you derived in the previous section give insight into how to tune the resolution
`r` and the gain `g`.
- To tune `delta`, one can use subsampling. If `X` is your point cloud of size `n`,
  let `Y_1, ..., Y_N` be subsets of `X` of size `m = n / log(n)^(1+epsilon)`, where `N < C(n, m)`.
  Then set `delta = max_{1 <= i <= N} d_H(X, Y_i)`, where `d_H` is the Hausdorff distance:
  `d_H(X, Y) = max(max_{x in X} min_{y in Y} d(x, y), max_{y in Y} min_{x in X} d(x, y))`.
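For a Euclidean point cloud, the Hausdorff distance between two finite clouds can be read off the pairwise distance matrix; a minimal sketch (function name is an assumption):

```python
import numpy as np

def hausdorff(X, Y):
    """Hausdorff distance between two finite Euclidean point clouds:
    max of (largest distance from a point of X to its nearest point of Y)
    and the same quantity with X and Y swapped."""
    D = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return max(D.min(axis=1).max(), D.min(axis=0).max())
```

Note that when `Y` is a subset of `X`, as in the subsampling heuristic, the second term vanishes and `d_H(X, Y)` is just the covering radius of `Y` in `X`.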

Write code in the language of your choice that takes a point cloud file and a function file as arguments and outputs the parameters
`r`, `g` and `delta`. Here is a solution.

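One possible implementation is sketched below; it is an illustrative sketch, not the official solution. `delta` follows the subsampling heuristic above; the resolution comes from a crude Lipschitz estimate of the filter (the assumption being that intervals of length about `lip * delta` cover the image of a `delta`-ball), and the gain is fixed below 1/2 so that the Mapper stays a graph:

```python
import numpy as np

def select_parameters(X, f, N=100, epsilon=0.1, gain=0.33, seed=0):
    """Sketch of automatic parameter selection for Mapper on a Euclidean
    point cloud X with filter values f (one value per point)."""
    n = len(X)
    rng = np.random.default_rng(seed)
    m = max(1, int(n / np.log(n) ** (1 + epsilon)))
    delta = 0.0
    for _ in range(N):
        sub = rng.choice(n, size=m, replace=False)
        D = np.linalg.norm(X[:, None, :] - X[None, sub, :], axis=-1)
        # d_H(X, Y_i) reduces to a max-min since Y_i is a subset of X
        delta = max(delta, D.min(axis=1).max())
    # crude Lipschitz-constant estimate of f over delta-close pairs
    Dx = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    mask = (Dx > 0) & (Dx <= delta)
    lip = (np.abs(f[:, None] - f[None, :])[mask] / Dx[mask]).max() if mask.any() else 1.0
    # intervals of length ~ lip * delta, hence this many of them
    length = max(lip * delta, 1e-12)
    resolution = max(1, int(np.ceil((f.max() - f.min()) / length)))
    return resolution, gain, delta
```

Reading the point cloud and function files and printing `r`, `g`, `delta` is then a matter of wrapping this function in a small command-line script.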
## Test on various shapes

Test your code on the following point clouds. For each one,
- compute the Mappers for various choices of filters (coordinates, principal components, eccentricity) and parameters,
- infer some rules of thumb on the shape of the Mapper for extreme values of the parameters,
- run your code to find good candidate parameters and compute the Mapper with them.

### Toy examples

### Real life examples

- 3D shapes. Mapper is often used to compute skeleta of shapes, with applications in computer graphics.
Even though the parameters are most often chosen by hand, the selection procedure directly outputs good ones.
- A line of the COIL dataset.