Welcome back to the course on Algorithmic Information Dynamics. Finally, we are about to learn what Algorithmic Information Dynamics is and how new methods may provide fresh and hopefully better means to deal with causation. You will see how we start putting together everything you have learnt so far, from information theory to graphs and networks, and from computability to dynamical systems. In this lecture we will contrast computable measures such as Shannon entropy with algorithmic complexity when it comes to characterising an object, in particular a special graph that we created for this purpose.

We saw in previous units how entropy quantifies randomness in a different way than algorithmic randomness does, and how entropy misses algorithmic properties of objects such as sequences when these lack statistical regularities. In a strict sense, algorithmic randomness is a generalisation of the typical application of entropy. Examples where entropy as a tool fails to shed light on the nature of an object include the mathematical constant pi, the Champernowne constant and the Copeland-Erdos constant, to mention a few. Another example that we also covered was the Thue-Morse sequence. Entropy can only find statistical regularities unless some other method to infer the nature of the source is available, a method such as algorithmic complexity which, while difficult to estimate, does not depend on assumptions such as an underlying probability distribution to which we rarely have full access or full knowledge. Of course, one can design all sorts of fancier measures on top of Shannon entropy, but as long as they rely on the same principles they will all have exactly the same shortcomings, so we need new tools to complement statistical tools such as Shannon entropy. One such tool to complement entropy in inferring the nature of the source, and potentially the source itself in the form of a mechanistic model, is to introduce computation into the challenge. The idea is that if an object has an algorithmic regularity, it can in principle be quantified by a universal, non-computable measure such as algorithmic complexity and, as we saw before, this comes in the form of asking whether the object or sequence can be characterised as random or not random according to the length of the computer programs that can generate it.

One major challenge in modern physics is the proper characterisation of network systems for use in fields ranging from physics to chemistry. A common problem is the description of order parameters with which to characterise the "complexity of a network." Graph complexity has traditionally been characterised using graph-theoretic measures such as the degree distribution, clustering coefficient, edge density, and community or modular structure. For graphs, too, the limitations of Shannon entropy, and of any other computable measure, require you first to define a feature of interest in order to apply the measure. Here, for example, is an attempt at a very general measure of graph entropy, where G is a graph and i is a property of G; a property can be, for instance, how many 1s and 0s its adjacency matrix contains, or what the degree distribution of the graph looks like. We will see how H depends on the choice of i, making entropy quite fragile with respect to the chosen feature of interest and therefore not a graph invariant.
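To make this dependence on the chosen feature concrete, here is a small Python sketch (our own illustration, not part of the original course materials; the four-node path graph is just an arbitrary example) that computes the empirical Shannon entropy of two different lossless descriptions of the same graph and obtains two different values:

    import math
    from collections import Counter

    def shannon_entropy(symbols):
        """Empirical Shannon entropy (bits per symbol) of a sequence,
        taking each symbol's probability to be its observed frequency."""
        counts = Counter(symbols)
        n = len(symbols)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    # A 4-node path graph described by two different lossless features:
    adjacency_bits = [0, 1, 0, 0,
                      1, 0, 1, 0,
                      0, 1, 0, 1,
                      0, 0, 1, 0]     # flattened adjacency matrix
    degree_sequence = [1, 2, 2, 1]    # degree of each node

    print(shannon_entropy(adjacency_bits))   # about 0.95 bits for the 0/1 entries
    print(shannon_entropy(degree_sequence))  # 1.0 bit for the degree values
    # Same graph, two descriptions, two different entropy values: H depends on i.

Which of these values counts as "the" entropy of the graph depends entirely on which feature we decided to measure.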
We also saw before how, when we have no access to any probability distribution but only to G itself, one traditionally assumes the underlying probability distribution of G for a given feature i, denoted by P(G), to be uniform, which makes entropy behave as a simple counting function: it counts how many times feature i occurs as we inspect G precisely for that property.

We introduced a method to build a family of recursive graphs with maximal entropy but low algorithmic complexity, hence graphs that appear statistically random but are nonetheless of low algorithmic randomness and thus causally (recursively) generated. Moreover, these graphs may have maximal entropy for some lossless descriptions but minimal entropy for other lossless descriptions of exactly the same objects, with both descriptions characterising the same object and only that object, thereby demonstrating how entropy fails to unequivocally and unambiguously characterise a graph independently of a particular feature of interest.

We denote by ZK a graph (unequivocally) constructed as follows. Let a node labelled 1 be connected to another node labelled 2; this is the starting graph G. If a node with label n has degree n, we call it a core node; otherwise, we call it a supportive node. Iteratively add a node n + 1 to G such that the number of core nodes in G is maximised. That is, we aim at adding nodes whose number of edges equals their labels. So we keep adding nodes, trying to connect each node in such a way that every node eventually gets the same number of edges as its label determines: node 2 would have 2 edges, node 3 would have 3 edges, node 4 would have 4 edges, and so on. Here we can see core nodes coloured in red once they have reached the number of edges that their label establishes, while the nodes that are still missing edges are coloured in blue and are called supportive nodes. We then keep adding nodes and edges in an iterative fashion, one node at a time but with as many edges as needed to complete previously added nodes and turn them into core (red) nodes.

As it turns out, this simple algorithm determines a unique graph and constitutes a method that can be used to build a family of recursive graphs with maximal entropy but low algorithmic complexity, that is, graphs that appear statistically random but are of low algorithmic randomness and thus causally (recursively) generated, just like this one. Indeed, the degree sequence of the graph follows the Champernowne constant in base 10, a transcendental real whose decimal expansion, as we saw before, is Borel normal and thus of maximal entropy. This is because core node 1 has 1 link, core node 2 has 2 links, and so on, so the degree sequence of this graph is of near-maximal entropy except for a smaller number of supportive nodes. Notice that one can also reconstruct exactly the same graph from its degree sequence, as we have also proven. Interestingly, the sequence of the number of edges follows a recurrence relation built upon previous iteration values between core and supportive nodes, defined by a beautiful formula involving the golden ratio; the brackets in that formula represent the floor function.
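As a quick illustration of that point about the Champernowne constant (our own Python sketch, not part of the original course materials), one can check numerically that the digits obtained by concatenating 1, 2, 3, ... have an empirical entropy approaching the maximum of log2(10) bits per digit:

    import math
    from collections import Counter

    # Decimal expansion of the base-10 Champernowne constant: 0.123456789101112...
    digits = "".join(str(k) for k in range(1, 100001))

    counts = Counter(digits)
    n = len(digits)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    print(round(entropy, 4), round(math.log2(10), 4))  # empirical vs. maximal digit entropy

So a degree sequence that enumerates 1, 2, 3, ... in this way looks, to entropy, like a maximally random sequence of digits, even though it is generated by a trivial rule.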
But more importantly, when you look at the adjacency matrix of this graph, it turns out to be very sparse; that is, it has far fewer edges than the greatest possible number of edges among all its nodes. The number of possible edges grows quadratically with the number of nodes while the number of actual edges grows only linearly, so the edge density of this graph vanishes, going to 0, and the adjacency matrix contains almost nothing but 0s, which also means that the adjacency matrix is of close to minimal entropy.

This is a striking feature by design, because both the adjacency matrix of the graph and its degree sequence can uniquely reconstruct the graph from scratch, so they are lossless representations; yet, without knowing the generating source, one description of the graph suggests that the graph is almost maximally random while the other suggests that it is minimally random, hence yielding divergent entropy values for different features. This helps us see how one can choose an arbitrary feature of a graph when assigning it an entropy value, and how this differs from algorithmic complexity, which is invariant to the object's description and actually considers the generating mechanisms, in the form of computer programs, before making a final call. Yet, as we know, the graph can be generated recursively by a short computer program that always produces exactly the same graph and is thus deterministic and of low algorithmic complexity. Here is a very compact computer program written in the Wolfram Language that produces the graph.

In spite of its trivial construction, the ZK graph displays all sorts of interesting graph-theoretic, dynamical and complexity properties. For example, the clustering coefficient of the undirected graph asymptotically converges to 0.65, some properties increase or decrease linearly while others do so polynomially, and the different eigenvalues behave in non-trivial ways. And, as we just explained, the entropy values of different graph descriptions (even for fully accurate descriptions, and not because of a lack of information on the observer's part) diverge and become trivially dependent on other simple functions (e.g., edge density or degree-sequence normality). While useful for quantifying specific features of the graph that may appear interesting, no graph-theoretic or entropic measure can account for the low (algorithmic) randomness, and therefore the (high) causal content, of the network. And while algorithmic complexity is difficult to apply and may sometimes not produce the desired results for lack of computational power, it can in principle deal with these cases, and we will see how we can actually make good estimations with new methods based on algorithmic probability.
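Before closing, here is one more short Python sketch (again our own illustration, not from the original course materials) that makes concrete the earlier point about entropy collapsing into a trivial function of a single feature: when Shannon entropy is applied to the 0/1 entries of an adjacency matrix with empirical frequencies, it reduces to the binary entropy of the fraction of 1s, which is essentially the edge density:

    import math

    def binary_entropy(p):
        """Shannon entropy (bits) of a Bernoulli(p) source."""
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    # Entropy of the adjacency-matrix entries as a function of edge density:
    for density in (0.5, 0.1, 0.01, 0.001):
        print(density, round(binary_entropy(density), 4))
    # As the density vanishes, as it does for the ZK graph, this entropy goes
    # to 0, regardless of whatever algorithmic structure the graph may have.

In other words, for this description the entropy value tells us only about the edge density and nothing about the causal, recursive mechanism that generated the graph, which is exactly the gap that the algorithmic approach is meant to fill.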