In this unit, we're going to cover a 
different aspect of network structure

often called "scale-free structure, " but 
more generally known as

"long-tailed network structure."

Here's the idea. Here's a picture of 
typical structure of parts of the world

wide web. Where nodes here are web pages 
and links are links between pages and you

can see that they are directional; that
one page links to another page, but not

necessarily in reverse. And you can also
see that there's hubs in this kind of

network. Pages that link to many other
pages and there's also hubs that are

pages that many other pages link to,
like right here.

Ok, so let's look at in contrast a random 
network. So this is what a random network

might look like if we just randomly threw
down a bunch of nodes and put random

links between them. So let's look at the
degree distribution of these different

types of networks.

Here's what the degree distribution of a
random network looks like. This is a plot

where I'm not putting numbers on these, 
but you can sort of see the shape, this

bell curve shape. This is also called a 
"normal distribution." Or a "Gaussian

distribution." Or, to be more exact, this
is really what's called a

"Poisson distribution."

But let's not worry about that. Just be
aware of the general shape of this.

What's interesting about the random
network's distribution is that most nodes

have the same number of links to them. 
If we look here in this middle. This

middle degree, sort of average degree
here, most nodes have that degree. There's

a small number of nodes that have very
low degree. A small number of nodes that

have very high degree. And after a certain
numbers out here they quickly go to zero.

Now let's contrast that with the degree
distribution of the world wide web.

Approximately. So here I've drawn a
picture of the degree distribution.

Similarly it has these bars except the
bars here are so narrow that it looks

like a solid black mass. But you can think
of it as this kind of plot, bar graph. But

the bars are really really narrow and
there's a huge number of them. So this

distribution looks really different from
this distribution. There's a very large

number of nodes here that have low
degree and a very small number that

have high degree. There's also some
other interesting things about this shape.

If I plot this from say, a thousand to ten
thousand, and suppose I take this little

square here. Starting at ten thousand
and now plot this little part all the way

to a hundred thousand, but I'm now
going to blow this up to be about the

same size as this. So let's do that.

So here's the ten thousand to a
hundred thousand and I blew the

whole thing up so what we were
seeing before is this little tiny

patch over here becomes larger
here. Now I have this, of course,

is much lower number of nodes
than in the previous plot, but I

just want you to concentrate
on the shape. If we blow up a

little part of this the shape
remains the same. Similarly if I

started at a hundred thousand
and plotted all the way to a million

the shape remains the same.

This might ring some bells in terms of
fractals. We have, a fractal, remember,

is something where if you take a small
part of it and blow it up it looks like

the whole original object. It's self
similar at all levels. And a power law

distribution is in fact a fractal object
in the same way we looked at

cauliflower and trees and so on. In
the sense that if you take a small

part of it and blow it up it has the
same shape as the original. So this

is also called a "scale- free 
distribution"

and the reason is that this idea of
scale. At all scales it looks the same

and so it's called "scale-free."

Here's the approximate degree 
distribution of the web. This is

empirically what's been reported. That 
the number of nodes with degree k is

proportional to 1/(k^2). This kind of 
mathematical expression is called a

"power law." Because it involves a 
variable raised to some power. Here we

have 1/(k^2), but that's equivalent to 
k^(-2). If you ever hear of the term

"scale-free distribution"

you just know that that's 
equivalent to a power law distribution.

And what's important to know is what that 
exponent is.

So the definition of a "scale-free 
network" is a network with a "scale-

free" or "power law" degree distribution.

And we can see that scale-free networks 
are fractal-like. Here's an abstract

scale-free network. It's fractal-like in 
that you get hubs that have lots of

outgoing links or incoming links. And if 
we scale down here we see at this level

little hubs. At each scale we can see 
smaller and smaller hubs that have the

same shape as the original network 
roughly. So we have self-similarity to

some extent in this network. I'm just 
calling it "fractal-like." And that's why

it has a power law-like distribution. And 
I say "power law-like" because in the

real world, networks aren't really 
scale-free. Just like in the real world

objects aren't really fractals. They're 
not mathematical fractals. They're

approximations to fractals. And the same 
thing with network degree distributions.

I should point out that scale-free 
networks indeed have the small world

property. That is, low average distances 
between nodes and high clustering.

In 1999, Albert Lazlo Barabasi and Reka 
Albert published a paper in the Journal

Science called "Emergence of Scaling and 
Random Networks", in which they introduced

this idea of scale-free networks to a 
large audience and this really, similar

to the Watts and Strogatz paper I 
mentioned before, this paper sparked a

huge amount of interest in the nascent 
field of network science. Barabasi and

Albert looked at many different networks,
looked at the empirical data and proposed

that scale-free degree distributions are 
very common in real world complex

networks.

Other papers, such as this one by 
Clauset, Shalizi, and Newman, have been a

little bit more skeptical of that claim 
that the data is power law distributed.

So I just want you to be aware that there 
is some controversy about whether certain

networks are actually scale-free or not.

Evelyn Fox Keller, a biologist who 
studies networks, said in 2005, "current

assessments of the commonality of power 
laws are probably over estimates" and

that's probably true.

But in general what most people do
accept is that many real world networks,

most real world networks probably, have
long-tailed degree distributions where

"long-tail" just means that it has this
kind of shape. With many nodes having

relatively low degree compared to the 
number of nodes that have high degree.

And this is called a tail of a 
distribution in probability theory.

Because it kind of looks like a tail. 
This is sort of the body and

this is the tail.

And this is called a "long-tailed 
distribution" because it's tail keeps

going on and on and on. And that is very 
different from what we call "Bell curve"

or normal distributions.

Here's a picture of a normal distribution 
of human height. This is a nice example

because people are lined up here by their 
heights with the shortest people over

here and the tallest people over here. 
And you can see that most people cluster

in the middle. There's a few people who 
are very short. There's a few people who

are very tall. Most people are average. 
This normal distribution does not have a

long tail. This goes to zero very quickly 
on either side. A power law distribution

or a long-tailed distribution does not go 
to zero. It has a long tail that drops

off and this has some really important 
implications for real world networks.