In this unit, we're going to cover a different aspect of network structure often called "scale-free structure, " but more generally known as "long-tailed network structure." Here's the idea. Here's a picture of typical structure of parts of the world wide web. Where nodes here are web pages and links are links between pages and you can see that they are directional; that one page links to another page, but not necessarily in reverse. And you can also see that there's hubs in this kind of network. Pages that link to many other pages and there's also hubs that are pages that many other pages link to, like right here. Ok, so let's look at in contrast a random network. So this is what a random network might look like if we just randomly threw down a bunch of nodes and put random links between them. So let's look at the degree distribution of these different types of networks. Here's what the degree distribution of a random network looks like. This is a plot where I'm not putting numbers on these, but you can sort of see the shape, this bell curve shape. This is also called a "normal distribution." Or a "Gaussian distribution." Or, to be more exact, this is really what's called a "Poisson distribution." But let's not worry about that. Just be aware of the general shape of this. What's interesting about the random network's distribution is that most nodes have the same number of links to them. If we look here in this middle. This middle degree, sort of average degree here, most nodes have that degree. There's a small number of nodes that have very low degree. A small number of nodes that have very high degree. And after a certain numbers out here they quickly go to zero. Now let's contrast that with the degree distribution of the world wide web. Approximately. So here I've drawn a picture of the degree distribution. Similarly it has these bars except the bars here are so narrow that it looks like a solid black mass. But you can think of it as this kind of plot, bar graph. But the bars are really really narrow and there's a huge number of them. So this distribution looks really different from this distribution. There's a very large number of nodes here that have low degree and a very small number that have high degree. There's also some other interesting things about this shape. If I plot this from say, a thousand to ten thousand, and suppose I take this little square here. Starting at ten thousand and now plot this little part all the way to a hundred thousand, but I'm now going to blow this up to be about the same size as this. So let's do that. So here's the ten thousand to a hundred thousand and I blew the whole thing up so what we were seeing before is this little tiny patch over here becomes larger here. Now I have this, of course, is much lower number of nodes than in the previous plot, but I just want you to concentrate on the shape. If we blow up a little part of this the shape remains the same. Similarly if I started at a hundred thousand and plotted all the way to a million the shape remains the same. This might ring some bells in terms of fractals. We have, a fractal, remember, is something where if you take a small part of it and blow it up it looks like the whole original object. It's self similar at all levels. And a power law distribution is in fact a fractal object in the same way we looked at cauliflower and trees and so on. In the sense that if you take a small part of it and blow it up it has the same shape as the original. So this is also called a "scale- free distribution" and the reason is that this idea of scale. At all scales it looks the same and so it's called "scale-free." Here's the approximate degree distribution of the web. This is empirically what's been reported. That the number of nodes with degree k is proportional to 1/(k^2). This kind of mathematical expression is called a "power law." Because it involves a variable raised to some power. Here we have 1/(k^2), but that's equivalent to k^(-2). If you ever hear of the term "scale-free distribution" you just know that that's equivalent to a power law distribution. And what's important to know is what that exponent is. So the definition of a "scale-free network" is a network with a "scale- free" or "power law" degree distribution. And we can see that scale-free networks are fractal-like. Here's an abstract scale-free network. It's fractal-like in that you get hubs that have lots of outgoing links or incoming links. And if we scale down here we see at this level little hubs. At each scale we can see smaller and smaller hubs that have the same shape as the original network roughly. So we have self-similarity to some extent in this network. I'm just calling it "fractal-like." And that's why it has a power law-like distribution. And I say "power law-like" because in the real world, networks aren't really scale-free. Just like in the real world objects aren't really fractals. They're not mathematical fractals. They're approximations to fractals. And the same thing with network degree distributions. I should point out that scale-free networks indeed have the small world property. That is, low average distances between nodes and high clustering. In 1999, Albert Lazlo Barabasi and Reka Albert published a paper in the Journal Science called "Emergence of Scaling and Random Networks", in which they introduced this idea of scale-free networks to a large audience and this really, similar to the Watts and Strogatz paper I mentioned before, this paper sparked a huge amount of interest in the nascent field of network science. Barabasi and Albert looked at many different networks, looked at the empirical data and proposed that scale-free degree distributions are very common in real world complex networks. Other papers, such as this one by Clauset, Shalizi, and Newman, have been a little bit more skeptical of that claim that the data is power law distributed. So I just want you to be aware that there is some controversy about whether certain networks are actually scale-free or not. Evelyn Fox Keller, a biologist who studies networks, said in 2005, "current assessments of the commonality of power laws are probably over estimates" and that's probably true. But in general what most people do accept is that many real world networks, most real world networks probably, have long-tailed degree distributions where "long-tail" just means that it has this kind of shape. With many nodes having relatively low degree compared to the number of nodes that have high degree. And this is called a tail of a distribution in probability theory. Because it kind of looks like a tail. This is sort of the body and this is the tail. And this is called a "long-tailed distribution" because it's tail keeps going on and on and on. And that is very different from what we call "Bell curve" or normal distributions. Here's a picture of a normal distribution of human height. This is a nice example because people are lined up here by their heights with the shortest people over here and the tallest people over here. And you can see that most people cluster in the middle. There's a few people who are very short. There's a few people who are very tall. Most people are average. This normal distribution does not have a long tail. This goes to zero very quickly on either side. A power law distribution or a long-tailed distribution does not go to zero. It has a long tail that drops off and this has some really important implications for real world networks.