Hello and welcome to unit 5. This unit is on empirical power-law distributions. The central question that we will consider is as follows: suppose you are studying some phenomenon and you have a set of data. How can you figure out whether that data is well described by a power law? And if so, what's the best way to estimate the power-law exponent? It turns out that these questions can be surprisingly subtle, and it's easy to approach them in the wrong way.

I'll begin in the next sub-unit by looking at different ways to represent data. We'll talk about histograms, but also about cumulative distribution functions and rank-frequency plots. I'll then give a brief survey of some of the phenomena that are thought to be distributed according to power laws, and then we'll start looking at some of these questions of inference: given that we suspect that something is a power law, what's the best way to estimate the parameters that describe the distribution?

We'll also need to talk about alternatives. By that I mean there may be other distributions that describe your data better than a power law, and there are some distributions that are often confused with power laws, in particular stretched exponentials and log-normals. So what I'll do is say a little bit about those other distributions, compare and contrast them with power laws, and then talk about some statistical techniques for choosing among those different options.

Before I get started, another word or two about this unit. I will frequently be referring to a couple of really key papers on empirical power laws and inference with power laws, and I've put a list of those papers, along with links to PDFs, in a section called Additional Resources. You can find all the papers that I mention right there, so you shouldn't have to look around for them. Additionally, I'm going to be mentioning a bunch of statistical techniques, and we're not going to get so involved that I show you step by step how to do them.
I think that's too much statistics and beyond the scope of this course, but I suspect, and I hope, that some of you will want to try these things out on your own. There's some good software out there that's been developed in the last five years or so, and links to that software are also in the Additional Resources section. There's some Python code, some R code, and a little bit of Matlab code as well. If you want to try some of that stuff out, those are pretty good places to start.

Ok, so let's begin thinking about power-law distributions by looking at how to represent data. I'll do so with a simple example that will lead us through the construction of a cumulative distribution function and a rank-frequency plot. You'll see that it's easier to calculate than it is to say. So, let's get started.
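
As a rough preview of the calculation just mentioned, here is a minimal Python sketch of how one might tabulate a cumulative distribution and a rank-frequency list from raw data. It is not the example from the lecture: the function names `empirical_ccdf` and `rank_frequency` and the sample values are made up for illustration, and it assumes the complementary form of the cumulative distribution (the fraction of observations greater than or equal to each value), which is one common convention.

```python
import numpy as np


def empirical_ccdf(data):
    """Return the distinct values and the fraction of observations >= each one.

    Hypothetical helper for illustration; assumes the complementary
    (">=") convention for the cumulative distribution.
    """
    x = np.asarray(data, dtype=float)
    # Distinct values in ascending order, with how often each occurs.
    values, counts = np.unique(x, return_counts=True)
    # Number of observations greater than or equal to each distinct value.
    n_at_least = np.cumsum(counts[::-1])[::-1]
    return values, n_at_least / x.size


def rank_frequency(data):
    """Return ranks 1..n and the observations sorted from largest to smallest.

    Hypothetical helper for illustration.
    """
    sorted_desc = np.sort(np.asarray(data, dtype=float))[::-1]
    ranks = np.arange(1, sorted_desc.size + 1)
    return ranks, sorted_desc


if __name__ == "__main__":
    sample = [1, 1, 2, 3, 10]  # made-up data, not from the lecture
    values, ccdf = empirical_ccdf(sample)
    ranks, sizes = rank_frequency(sample)
    for v, p in zip(values, ccdf):
        print(f"fraction of observations >= {v:g}: {p:.2f}")
    for r, s in zip(ranks, sizes):
        print(f"rank {r}: {s:g}")
```

In this sketch, the largest rank held by a given value, divided by the number of observations, equals the fraction of observations at least that large, so the two tabulations carry the same information in different forms.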