# Hierarchical Clustering

Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters. We apply a “bottom up” approach: each observation starts in its own cluster, and pairs of clusters are successively merged.

The merges are chosen greedily. We start by constructing a pairwise distance matrix; then, at each iteration, the two clusters that are closest to each other are merged.
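
The greedy procedure can be sketched in a few lines of plain Python. This is only an illustration, not Shogun's implementation: the function name `agglomerate`, the use of single linkage (cluster distance taken as the closest pair of members), and the convention that each new cluster receives the next unused integer label are all assumptions made for the example.

```python
from itertools import combinations
from math import dist


def agglomerate(points, merges):
    """Greedy bottom-up clustering; single linkage is an assumed choice."""
    n = len(points)
    # Pairwise distance matrix between all observations.
    D = [[dist(p, q) for q in points] for p in points]
    # Every observation starts in its own cluster, labelled by its index.
    clusters = {i: [i] for i in range(n)}
    next_id = n  # assumed labelling: new clusters get n, n + 1, ...
    history = []  # one (cluster_a, cluster_b, distance) entry per merge

    def linkage(pair):
        # Single linkage: distance of the closest pair of members.
        a, b = pair
        return min(D[i][j] for i in clusters[a] for j in clusters[b])

    for _ in range(merges):
        if len(clusters) < 2:
            break
        # Greedy step: pick the two clusters that are currently closest.
        a, b = min(combinations(sorted(clusters), 2), key=linkage)
        history.append((a, b, linkage((a, b))))
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return history, clusters


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 3.0)]
history, clusters = agglomerate(points, merges=2)
print(history)
print(clusters)
```

With these four points, the first merge joins observations 0 and 1 and the second joins observations 2 and 3, leaving two clusters. Other linkage rules (complete, average) would only change the `linkage` helper.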

## Example

Imagine we have files with the training data. We create CDenseFeatures (here 64-bit floats, aka RealFeatures) as:

features_train = RealFeatures(f_feats_train)

features_train = RealFeatures(f_feats_train);

RealFeatures features_train = new RealFeatures(f_feats_train);

features_train = Shogun::RealFeatures.new f_feats_train

features_train <- RealFeatures(f_feats_train)

features_train = shogun.RealFeatures(f_feats_train)

RealFeatures features_train = new RealFeatures(f_feats_train);

auto features_train = some<CDenseFeatures<float64_t>>(f_feats_train);


To run CHierarchical, we need to choose a distance, for example CEuclideanDistance or another subclass of CDistance. The distance is initialized with the data we want to cluster.

distance = EuclideanDistance(features_train, features_train)

distance = EuclideanDistance(features_train, features_train);

EuclideanDistance distance = new EuclideanDistance(features_train, features_train);

distance = Shogun::EuclideanDistance.new features_train, features_train

distance <- EuclideanDistance(features_train, features_train)

distance = shogun.EuclideanDistance(features_train, features_train)

EuclideanDistance distance = new EuclideanDistance(features_train, features_train);

auto distance = some<CEuclideanDistance>(features_train, features_train);
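
To make concrete what a Euclidean distance over a data set amounts to, here is a minimal plain-Python sketch that builds the full pairwise matrix. It is not Shogun code: the helper name `pairwise_euclidean` and the one-row-per-observation layout are assumptions made for this example.

```python
from math import sqrt


def pairwise_euclidean(rows):
    """Return the matrix of Euclidean distances between all rows."""
    def d(x, y):
        # Square root of the summed squared coordinate differences.
        return sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
    return [[d(x, y) for y in rows] for x in rows]


data = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
matrix = pairwise_euclidean(data)
for row in matrix:
    print(row)
```

The resulting matrix is symmetric with zeros on the diagonal, so only the entries above the diagonal carry information for clustering.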


We then create an instance of CHierarchical, specifying the number of merge steps we expect in training together with the distance.

merges = 3
hierarchical = Hierarchical(merges, distance)

merges = 3;
hierarchical = Hierarchical(merges, distance);

int merges = 3;
Hierarchical hierarchical = new Hierarchical(merges, distance);

merges = 3
hierarchical = Shogun::Hierarchical.new merges, distance

merges <- 3
hierarchical <- Hierarchical(merges, distance)

merges = 3
hierarchical = shogun.Hierarchical(merges, distance)

int merges = 3;
Hierarchical hierarchical = new Hierarchical(merges, distance);

auto merges = 3;
auto hierarchical = some<CHierarchical>(merges, distance);


For each merge step, we can extract the pair of clusters that was merged, as well as the distance between them:

d = hierarchical.get_merge_distances()
cp = hierarchical.get_cluster_pairs()

d = hierarchical.get_merge_distances();
cp = hierarchical.get_cluster_pairs();

DoubleMatrix d = hierarchical.get_merge_distances();
DoubleMatrix cp = hierarchical.get_cluster_pairs();

d = hierarchical.get_merge_distances
cp = hierarchical.get_cluster_pairs

d <- hierarchical$get_merge_distances()
cp <- hierarchical$get_cluster_pairs()

d = hierarchical:get_merge_distances()
cp = hierarchical:get_cluster_pairs()

double[] d = hierarchical.get_merge_distances();
int[,] cp = hierarchical.get_cluster_pairs();

auto d = hierarchical->get_merge_distances();
auto cp = hierarchical->get_cluster_pairs();
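
A list of merged pairs is enough to recover the clusters that remain after the recorded merges. The plain-Python sketch below replays such a list. The labelling convention is an assumption for illustration (observations are numbered 0 to n - 1 and the k-th merge creates cluster n + k); it is not confirmed here to match the values returned by `get_cluster_pairs`, and `replay_merges` is a hypothetical helper.

```python
def replay_merges(n, pairs):
    """Return the clusters left after applying `pairs` in order."""
    # Start with one singleton cluster per observation.
    clusters = {i: {i} for i in range(n)}
    for step, (a, b) in enumerate(pairs):
        # Assumed convention: the merged cluster is labelled n + step.
        clusters[n + step] = clusters.pop(a) | clusters.pop(b)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical merge record for five observations and three merges.
pairs = [(0, 1), (2, 3), (4, 5)]
remaining = replay_merges(5, pairs)
print(remaining)
```

In this made-up record the third merge joins observation 4 with cluster 5, which was created by the first merge, so two clusters remain.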


## References

Wikipedia: Hierarchical clustering