17.8. Evaluation

There are two possible approaches for evaluation of a clustering algorithm.

Full reference
No reference

In full reference evaluation, the ground truth about an ideal clustering of the data is given. For example, suppose our goal is to cluster a set of facial images of different persons. The ground truth in this case is the name or id of the person associated with each image.

17.8.1. Full Reference Evaluation

Let $Y$ be a given set of data points. Assume that the ideal/reference clustering of $Y$ is given beforehand.

Assume that the dataset $Y$ can be divided into $K$ clusters.
Let these clusters be named $Y_{1}, \dots, Y_{K}$.
Assume that it is known which point belongs to which cluster.

In general, a clustering $C$ of a set $Y$ constructed by a clustering algorithm is a set $\{C_{1}, \dots, C_{C}\}$ of non-empty disjoint subsets of $Y$ such that their union equals $Y$. Clearly, $| C_{c} | > 0$ for every $c$.

The clustering process may make a number of mistakes.

It may identify an incorrect number of clusters, so that $C$ is not equal to $K$.
Moreover, even if $K = C$, the data points may be placed in the wrong clusters.

Ideally, we want $K = C$ and $C_{c} = Y_{k}$ with a bijective mapping between $1 \leq c \leq C$ and $1 \leq k \leq K$. In practice, a clustering algorithm estimates the number of clusters $C$ and assigns a label $l_{s}$, $1 \leq s \leq S$, to each vector $y_{s}$, where $1 \leq l_{s} \leq C$.
All the labels can be put in a label vector $L$ where $L \in \{1, \dots, C\}^{S}$. The permutation matrix $\Gamma$ can be easily obtained from $L$.

Following [85], we will quickly establish the commonly used measures for clustering performance.

We have a reference clustering of vectors in $Y$ given by $B = \{Y_{1}, \dots, Y_{K}\}$ which is known to us in advance (either by construction in synthetic experiments or as ground truth with real-life data sets).
The clustering obtained by the algorithm is given by $C = \{C_{1}, \dots, C_{C}\}$.

For two arbitrary points $y_{i}, y_{j} \in Y$, there are four possibilities:

they belong to the same cluster in both $B$ and $C$ (true positive),
they are in the same cluster in $B$ but in different clusters in $C$ (false negative),
they are in different clusters in $B$ but in the same cluster in $C$ (false positive),
they are in different clusters in both $B$ and $C$ (true negative).
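The four cases above can be tallied directly from two label vectors. The following is a minimal sketch, assuming the reference and estimated clusterings are each given as a list of integer labels of equal length; the function name `pair_counts` and the example labels are illustrative and not part of the text.

```python
from itertools import combinations


def pair_counts(ref_labels, est_labels):
    """Classify every unordered pair of points into TP, FN, FP, or TN.

    ref_labels[s] is the reference cluster of point s (clustering B),
    est_labels[s] is the cluster assigned by the algorithm (clustering C).
    """
    if len(ref_labels) != len(est_labels):
        raise ValueError("label vectors must have the same length")
    tp = fn = fp = tn = 0
    for i, j in combinations(range(len(ref_labels)), 2):
        same_ref = ref_labels[i] == ref_labels[j]
        same_est = est_labels[i] == est_labels[j]
        if same_ref and same_est:
            tp += 1  # same cluster in both B and C
        elif same_ref:
            fn += 1  # same in B, different in C
        elif same_est:
            fp += 1  # different in B, same in C
        else:
            tn += 1  # different in both
    return tp, fn, fp, tn


# Hypothetical example: 6 points, K = 2 reference clusters, C = 3 found clusters.
ref = [0, 0, 0, 1, 1, 1]
est = [0, 0, 1, 1, 2, 2]
print(pair_counts(ref, est))  # (2, 4, 1, 8)
```

The loop visits all $S (S - 1) / 2$ pairs, so this sketch is only suitable for small data sets.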

Consider some cluster $Y_{i} \in B$ and $C_{j} \in C$ .

The elements common to $Y_{i}$ and $C_{j}$ are given by $Y_{i} \cap C_{j}$.
We define

$\text{precision}_{i j} \triangleq \frac{| Y_{i} \cap C_{j} |}{| C_{j} |} .$
We define the overall precision for $C_{j}$ as

$\text{precision} (C_{j}) \triangleq \max_{i} (\text{precision}_{i j}) .$
We define $\text{recall}_{i j} \triangleq \frac{| Y_{i} \cap C_{j} |}{| Y_{i} |}$.
We define the overall recall for $Y_{i}$ as

$\text{recall} (Y_{i}) \triangleq \max_{j} (\text{recall}_{i j}) .$
We define the $F$ score as

$F_{i j} \triangleq \frac{2 \, \text{precision}_{i j} \, \text{recall}_{i j}}{\text{precision}_{i j} + \text{recall}_{i j}} .$
We define the overall $F$-score for $Y_{i}$ as

$F (Y_{i}) \triangleq \max_{j} (F_{i j}) .$
We note that the cluster $C_{j}$ for which the maximum is achieved is the best matching cluster for $Y_{i}$.
Finally, we define the overall $F$-score for the clustering as

$F (B, C) \triangleq \frac{1}{S} \sum_{i = 1}^{K} | Y_{i} | F (Y_{i})$

where $S$ is the total number of vectors in $Y$ .
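The definitions above translate into a short computation. The following is a minimal sketch, assuming both clusterings are given as label lists of equal length and that the sum in $F(B, C)$ runs over all reference clusters $Y_{i}$; the function name `overall_f_score` is illustrative.

```python
def overall_f_score(ref_labels, est_labels):
    """Compute F(B, C): the size-weighted mean of the best F-score per reference cluster."""
    if len(ref_labels) != len(est_labels):
        raise ValueError("label vectors must have the same length")
    S = len(ref_labels)

    # Build the clusters Y_i (reference) and C_j (estimated) as sets of point indices.
    ref_clusters, est_clusters = {}, {}
    for s, (r, e) in enumerate(zip(ref_labels, est_labels)):
        ref_clusters.setdefault(r, set()).add(s)
        est_clusters.setdefault(e, set()).add(s)

    total = 0.0
    for Y_i in ref_clusters.values():
        best = 0.0  # F(Y_i) = max_j F_ij
        for C_j in est_clusters.values():
            common = len(Y_i & C_j)
            if common == 0:
                continue  # precision and recall are both 0, so F_ij is taken as 0
            precision = common / len(C_j)
            recall = common / len(Y_i)
            best = max(best, 2 * precision * recall / (precision + recall))
        total += len(Y_i) * best
    return total / S


# Hypothetical example: each reference cluster is best matched with F_ij = 0.8.
ref = [0, 0, 0, 1, 1, 1]
est = [0, 0, 1, 1, 2, 2]
print(overall_f_score(ref, est))
```

Because the score depends only on set overlaps, it is unaffected by how the cluster labels are numbered; a perfect clustering with permuted labels still scores 1.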

We also define a clustering ratio given by the factor

$\eta \triangleq \frac{C}{K} .$

There are different ways to define clustering error. For the special case where the number of clusters is known in advance, and we ensure that the data set is divided into exactly that many clusters, it is possible to define the subspace clustering error as follows:

$\text{clustering error} = \frac{\text{number of misclassified points}}{\text{total number of points}} .$

The definition is adopted from [32] for comparing the results in this paper with their results. This definition can be used after a proper one-to-one mapping between the original labels and the cluster labels assigned by the clustering algorithm has been identified. We can compute this mapping by comparing $F$-scores.
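As an illustration of the clustering error, the sketch below assumes $K = C$ and label lists of equal length. Note that it does not use the $F$-score comparison mentioned above to find the mapping: as a simpler stand-in, it searches exhaustively over all one-to-one label mappings and keeps the one with the fewest misclassified points, which is practical only for a small number of clusters. The function name `clustering_error` is illustrative.

```python
from itertools import permutations


def clustering_error(ref_labels, est_labels):
    """Fraction of misclassified points under the best one-to-one label mapping.

    Assumption: the number of estimated clusters equals the number of
    reference clusters. The mapping is found by brute force over all
    permutations (a stand-in for the F-score based matching).
    """
    if len(ref_labels) != len(est_labels):
        raise ValueError("label vectors must have the same length")
    ref_ids = sorted(set(ref_labels))
    est_ids = sorted(set(est_labels))
    if len(ref_ids) != len(est_ids):
        raise ValueError("this sketch assumes K == C")

    S = len(ref_labels)
    best_correct = 0
    for perm in permutations(ref_ids):
        mapping = dict(zip(est_ids, perm))  # estimated label -> reference label
        correct = sum(mapping[e] == r for r, e in zip(ref_labels, est_labels))
        best_correct = max(best_correct, correct)
    return (S - best_correct) / S


# Hypothetical example: labels are swapped and one point is misplaced.
ref = [0, 0, 0, 1, 1, 1]
est = [1, 1, 0, 0, 0, 0]
print(clustering_error(ref, est))  # 1 of 6 points is misclassified
```

For larger $K$, the exhaustive search would typically be replaced by an assignment solver, since the number of permutations grows as $K!$.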
