Now we will talk about another
concept in networks called clustering.
The idea of clustering is
to measure,
if you were a node,
the extent to which your friends
are also friends with one another.
So let's define clustering with
respect to a particular node 'V'.
It's the fraction of pairs of V's neighbors
that are connected to one another.
For example, look at this node here:
It has 2 neighbors, and its neighbors
are connected to each other.
So its clustering is one; all of its
neighbors are connected to each other.
So the fraction of pairs of neighbors
that are connected to each other is one.
Whereas this node has 3 neighbors,
but not all of its neighbors are connected to each other.
These 2 are connected, but this one is not connected
to its other neighbor. Mathematically, we know that if
a node V has K_v neighbors, there are
K_v * (K_v - 1)/2 pairs of neighbors.
If you are not mathematically inclined,
you don't have to worry about how this is derived,
just believe it.
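For anyone who does want the derivation, here is a short sketch. Each of the K_v neighbors can be paired with any of the other K_v - 1 neighbors, and that way of counting includes every pair twice (once from each side), so we divide by 2:

```latex
\[
\binom{K_v}{2} \;=\; \frac{K_v\,(K_v - 1)}{2}
\]
```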
In terms of this,
we define C_v as the fraction of all
the possible pairs that are linked.
So let's look at that.
This node only has 1 neighbor.
And so by definition, if you only have 1 neighbor,
your clustering is 0,
since you don't have any pair of neighbors.
This node has 3 neighbors:
this one,
this one,
and this one.
Ok, so that means that it has
3*2/2 possible pairs of neighbors;
that is 6/2, which equals 3.
But only one of its pairs of neighbors
is linked. That is, it's missing links between
this pair of neighbors and this pair of neighbors.
So only 1 out of the 3 possible pairs
is linked. So its clustering is 1/3.
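As a rough sketch of that calculation in code: the function below counts the linked pairs among a node's neighbors and divides by the number of possible pairs. The function name `local_clustering`, the dictionary-of-sets representation, and the small example graph are assumptions made for illustration; they are not the network drawn in the lecture.

```python
from itertools import combinations

def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are themselves linked."""
    neighbors = adj[v]
    k = len(neighbors)
    if k < 2:
        # Fewer than two neighbors means there are no pairs,
        # so the clustering is taken to be 0.
        return 0.0
    possible = k * (k - 1) // 2
    linked = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return linked / possible

# Hypothetical graph: "v" has neighbors a, b, c; only a and b are linked.
adj = {
    "v": {"a", "b", "c"},
    "a": {"v", "b"},
    "b": {"v", "a"},
    "c": {"v"},
}
print(local_clustering(adj, "v"))  # 1 of the 3 possible pairs is linked
print(local_clustering(adj, "c"))  # only one neighbor, so no pairs
```

Here "v" plays the role of the node with 3 neighbors and one linked pair, and "c" plays the role of a node with a single neighbor.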
As we said before,
the clustering with respect to this node is one,
because its 2 neighbors have a link
between them, so all possible
pairs of its neighbors are linked.
This one, similar to this one,
has a clustering of one third.
This one is 0; it has 2 neighbors,
but they are not linked,
so none of its possible pairs
of neighbors are linked.
And similarly, this one
only has one neighbor,
so it doesn't have any pairs of neighbors,
so its clustering is 0.
And now we can define
the clustering coefficient
of the entire network,
which is simply the average clustering
with respect to each node.
So the clustering coefficient is
the average C_v over all nodes:
we add these up,
divide by the number of nodes,
and we get a clustering coefficient of 0.278.
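To check that average, here is a minimal sketch. The six-node edge list is an assumption: it is one graph whose per-node values match the ones read off in the walkthrough (0, 1/3, 1, 1/3, 0, 0), not necessarily the exact picture on the slide. The labels A to F and the function names are also made up for illustration.

```python
from itertools import combinations

def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are linked (0 if no pairs)."""
    pairs = list(combinations(adj[v], 2))
    if not pairs:
        return 0.0
    return sum(1 for a, b in pairs if b in adj[a]) / len(pairs)

def average_clustering(adj):
    """Clustering coefficient of the network: mean of C_v over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

# Hypothetical edge list consistent with the values in the walkthrough.
edges = [("A", "B"), ("B", "C"), ("B", "D"),
         ("C", "D"), ("D", "E"), ("E", "F")]

# Build an undirected adjacency dictionary from the edge list.
adj = {}
for u, w in edges:
    adj.setdefault(u, set()).add(w)
    adj.setdefault(w, set()).add(u)

for v in sorted(adj):
    print(v, round(local_clustering(adj, v), 3))
print(round(average_clustering(adj), 3))  # 0.278
```

The per-node values sum to 5/3, and dividing by 6 nodes gives about 0.278.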
Now we can contrast a network like this one
with a network like this that
is completely connected, that is,
every node is connected to every other node.
So everybody's friends are also friends with each other,
and with that you get a clustering coefficient of 1.
Or take a network like this one,
in which each node is connected
to 2 other nodes,
but none of the pairs of neighboring nodes
are connected.
And this one has a clustering coefficient of 0.
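These two extremes can be sketched in a few lines. The choice of 5 nodes and the names `complete` and `ring` are assumptions for illustration; the ring needs at least 4 nodes, since a ring of 3 is itself completely connected.

```python
from itertools import combinations

def average_clustering(adj):
    """Mean over all nodes of the fraction of linked neighbor pairs."""
    total = 0.0
    for v, neighbors in adj.items():
        pairs = list(combinations(neighbors, 2))
        if pairs:  # nodes with no pairs of neighbors contribute 0
            total += sum(1 for a, b in pairs if b in adj[a]) / len(pairs)
    return total / len(adj)

n = 5  # assumed size, chosen only for the example

# Completely connected: every node is linked to every other node.
complete = {i: {j for j in range(n) if j != i} for i in range(n)}

# Ring: each node is linked to exactly 2 others, and those 2 are not linked.
ring = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}

print(average_clustering(complete))  # 1.0
print(average_clustering(ring))      # 0.0
```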
The clustering coefficient can be indicative
of things like how long it takes information to travel
from one part of the network to another,
or how badly the system will fall apart
if one node is deleted.
Let's stop here for a quick quiz on clustering.