In the previous lesson we saw how to trace in detail the algorithmic information trajectory of objects and events evolving over time. Here we will see how to move from being observers to being drivers of how systems change. For this, we will introduce a causal calculus based on algorithmic information dynamics. It is based upon perturbing objects or systems according to a sequence of interventions that lead to different possible trajectories. For example, let S be a non-random binary file containing the string:

10101010101010101010101010101010101010101010101010101010101010

Clearly, S is algorithmically compressible, with pS = "Print(10) 31 times" a short computer program generating S. Let S' be equal to S except for a bitwise operation (bitwise NOT, or complement) in, say, position 23:

10101010101010101010100010101010101010101010101010101010101010

A short computer program that generates S' can be written in terms of pS, which already accounts for most of the string; pS' only has to account, in addition, for the perturbed bit. A candidate program is pS' = "Print(10) 11 times; Print(00); Print(10) 19 times". Clearly, the length in bits of the computer program pS generating S is upper bounded by |pS'|, the length, in bits, of the computer program pS' generating S'. The relationship between their algorithmic complexities C is:

C(S) < C(S')

for any single-bit mutation of S, where C(S) = |pS| and C(S') = |pS'| are the lengths of the shortest computer programs generating S and S'. Notice that here we have induced a point mutation by flipping a bit from its original value. But if this sequence were a dynamical system, each pointwise mutation could have been a state transition in a dynamic space, so these mutations also cover the different possible one-bit orbits or trajectories of the object, and in principle we can cover many more by continuing to make perturbations.
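This comparison can be illustrated numerically. As a crude, computable stand-in for the uncomputable C, the sketch below uses the zlib-compressed length of the string (the course uses CTM/BDM for actual estimates; the helper name `approx_C` is our own):

```python
import zlib

def approx_C(s: str) -> int:
    # Length in bytes of the compressed string: a rough upper bound
    # standing in for the (uncomputable) algorithmic complexity C(s)
    return len(zlib.compress(s.encode()))

S = "10" * 31                     # the regular string S from the text
S_prime = S[:22] + "0" + S[23:]   # flip the bit at position 23 (1-indexed)

# The perturbed string needs at least as long a description as the original
print(approx_C(S), approx_C(S_prime))
```

On this example the proxy already reflects the ordering: the regular string compresses at least as well as its perturbed version, while both compress far below their raw length of 62 characters.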
So this is not only a causal interventionist calculus; it also considers the changes of complexity across different orbits of the same object. Perturbations can be of two opposite types with respect to the complexity of the underlying model explaining the data: perturbations that move an object towards randomness, and perturbations that move it away from randomness. To illustrate this, consider an example where the perturbation applied is, again, bit deletion and the starting object is a binary string. At every step we produce all possible single-bit mutations of the previous step and keep only the mutated string that moved the string closer to, or farther away from, randomness. You can see how pushing a string composed of a simple substring followed by a random-looking string towards and away from randomness using BDM actually extracts the correct substring. This process of interventions is at the core of the causal algorithmic calculus and of algorithmic information dynamics as applied, for example, to strings, before moving on to more sophisticated objects such as networks.

There is another type of perturbation, one that minimises the shift of the object towards or away from randomness. We call these neutral perturbations because they maximise the preservation of the main features in the data. You can see how the shorter string in the last step still preserves some features of the original string, such as the alternation between 0s and 1s, even though it has been significantly reduced in length.

Random data reacts differently to guided perturbations. Let S be algorithmically incompressible, that is, algorithmically random. This means that S has no generating mechanism shorter than the string itself and can only be generated by a computer program of the form pS = Print(S), which is not much shorter than S itself. Let S' be the result of negating any bit of S, just as we did before. Then pS' = Print(S').
There are two possible relations between |pS'| and |pS| after a single-bit perturbation (bit negation): either |pS| < |pS'| or |pS| > |pS'|, depending on whether S' moves towards or away from C(S). However, perturbing only a single bit cannot result in a (much) less random S', because pS = Bitwise(Print(S'), n) can be used to reverse S' into S, which is random; both Bitwise and n (the index of the bit to be changed) are of fixed, small length and so do not contribute much to the already high algorithmic complexity of the original random string (which can be of any length). So all perturbations are neutral when we face a random object! This means that perturbations to an algorithmically random object have a lower impact on their new candidate generating mechanisms (or lack thereof) than perturbations to non-random objects, relative to the generating mechanisms of their unperturbed states, and this effect can be exploited to estimate and infer the causal content of causal and non-causal objects. In an algorithmically random object, any change goes unnoticed, because no perturbation can lead to a dramatic change of its (already high) algorithmic complexity. Real-world cases will fall in an intermediate region between determinism and randomness.

The principle on which this causal calculus and algorithmic information dynamics are based is one we have mentioned before: the conservation of algorithmic complexity in deterministic isolated systems evolving over time. In other words, the rate of algorithmic complexity change is constant; it changes by at most a small logarithmic term as a function of time, needed only to specify a particular time step. Otherwise the algorithmic complexity of such a system is always exactly the same, because everything is generated by the same generating program.
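Both ideas so far, the guided perturbations and the relative neutrality of random objects, can be sketched numerically. The snippet below again uses compressed length as a crude proxy for algorithmic complexity (the course uses BDM; the example strings and helper names are our own illustrative choices):

```python
import zlib
import random

def approx_C(s: str) -> int:
    # Compressed length as a crude stand-in for algorithmic complexity
    return len(zlib.compress(s.encode()))

def push(s: str, steps: int, towards_randomness: bool) -> str:
    # Greedy interventionist calculus: at each step try every single-bit
    # deletion and keep the mutant that the proxy judges most (or least)
    # random, moving the string towards (or away from) randomness
    for _ in range(steps):
        mutants = [s[:i] + s[i + 1:] for i in range(len(s))]
        s = (max if towards_randomness else min)(mutants, key=approx_C)
    return s

s0 = "10" * 10 + "011010001110"   # simple prefix + random-looking tail
simplified = push(s0, 8, towards_randomness=False)
randomised = push(s0, 8, towards_randomness=True)

def flip(s: str, i: int) -> str:
    return s[:i] + ("0" if s[i] == "1" else "1") + s[i + 1:]

def relative_impact(s: str) -> float:
    # Largest change in the proxy over all single-bit flips, relative to
    # the proxy complexity of the unperturbed string
    base = approx_C(s)
    return max(abs(approx_C(flip(s, i)) - base) for i in range(len(s))) / base

random.seed(0)
regular = "10" * 500
rnd = "".join(random.choice("01") for _ in range(1000))
# Flips barely dent the already long model of the pseudo-random string,
# but noticeably disturb the short model of the regular one
print(relative_impact(regular), relative_impact(rnd))
```

The comparison at the end mirrors the argument above: relative to the length of its (candidate) generating program, a single-bit perturbation matters far less for a random-looking string than for a highly regular one.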
Here, for example, is the Elementary Cellular Automaton rule 30, which is deterministic and, once started with an initial condition, can be isolated. No matter where the evolution stops, the algorithmic complexity of the cellular automaton is always essentially the same: the rule never changes, and only the time is needed to reproduce exactly the same object. So after 1 step, 2 steps, 100 steps or any number of steps, the algorithmic complexity of the object remains almost the same, except for a logarithmic term that for all practical purposes can be neglected. But if a system is perturbed from the outside, and such a perturbation cannot be explained by the system's rules, then the algorithmic complexity of the system from that point on will be different, or will suffer a noticeable change at that precise point. So we can extract important information, detecting every time a system deviates from its normal course, is perturbed, or has parts that behave randomly, precisely by looking at how far removed it is from its original algorithmic complexity.

Networks can be seen as generated by computer programs too. Here on the left we have some examples of computer pseudo-code able to describe the networks on the right, and we can see how applying perturbations to a network, such as deleting an edge or a node, would produce changes in its generative model, depending on the algorithmic complexity of the network. For example, in a complete graph, deleting a node does not change the generating program, because the resulting graph is still a complete graph and we can regenerate the original graph with the same computer program, with no need to make any modification.
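Returning for a moment to the rule 30 example: a minimal sketch of its evolution makes the conservation principle concrete. The whole orbit is fixed by the rule, the initial condition and the number of steps, so the description length of any state is essentially constant, plus a log-sized term for the time:

```python
def rule30_step(row: str) -> str:
    # One step of Elementary Cellular Automaton rule 30 on a 0-padded row:
    # new cell = left XOR (centre OR right)
    padded = "00" + row + "00"
    return "".join(
        str(int(padded[i - 1]) ^ (int(padded[i]) | int(padded[i + 1])))
        for i in range(1, len(padded) - 1)
    )

row = "1"                 # initial condition: a single black cell
for t in range(3):
    row = rule30_step(row)
print(row)                # state after 3 steps: "1101111"
```

Whatever the stopping time, the pair (rule30_step, "1") plus the step count regenerates the state exactly, which is why an isolated deterministic system cannot gain algorithmic complexity beyond the logarithmic cost of specifying the time.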
However, in a random graph, one in which the description of every edge is necessary to describe the network, every change to the network produces a significant change in the computer program; yet such a change is small compared to the long length of the computer program describing the original random graph, and so the model remains of almost the same length. Finally, when the network is neither trivial nor random, the computer program is of small size, but modifications to the data, in this case the network, would require a major change to the explaining model. This means that, with a rough estimation of the complexity of the original object, we can perform perturbations on it and tell apart trivial and random cases from complex cases by observing the effect of the perturbations on the generative model.

In this way, we can colour every element of a network by its contribution to the generating model. Just as we said, in the complete graph shown in figure A, all nodes are neutral because they have no noticeable effect on the generating mechanism that can regenerate the graph; the original and mutated graphs are both complete graphs. But removing an edge from a complete graph produces a major effect on the model explaining the original graph, so all edges are coloured red in figure B, because their removal moves the object towards randomness. In a random graph like the one in figure E, all nodes and edges have very little effect on the already long model explaining the random graph. And sometimes, in less trivial, non-random graphs, some changes move the graph towards simplicity: for example, removing colliding edges from an otherwise unidirectional path graph, as in figure D, moves the graph towards simplicity, or away from randomness, and thus those edges are coloured blue.
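The complete-graph case can be checked directly: deleting a node from the complete graph on n nodes simply yields the complete graph on n - 1 nodes, so the same one-line generating program (with a smaller n) still explains the data. A small sketch, with helper names of our own choosing:

```python
from itertools import combinations

def complete_graph(n: int) -> set:
    # Edge set of the complete graph on nodes 0..n-1
    return set(combinations(range(n), 2))

def delete_node(edges: set, v: int) -> set:
    # Remove a node by dropping every edge incident to it
    return {e for e in edges if v not in e}

K10 = complete_graph(10)
perturbed = delete_node(K10, 9)

# The perturbed graph is still a complete graph, so the short program
# "complete graph on n nodes" still generates it, now with n = 9
print(perturbed == complete_graph(9))
```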
But we do not have to colour elements in binary or ternary, with only two or three colours; we can use the whole colour spectrum and obtain a finer-grained picture of the way in which perturbing the elements of an object such as a graph changes its causal properties, as dictated by its underlying generating mechanism. And we can classify networks by whether they are blue- or red-shifted, meaning that most elements of the graph move it towards or away from randomness, giving us a vector of interesting features that effectively provides a ranking of the ways in which an object can naturally or artificially be moved.

The spectrum can also be plotted as a curve, denoted by sigma(G), that we call an information signature: it consists of the values of the elements of an object ordered from positive to negative contribution to the original object. Positive elements move the object away from randomness, while negative elements move it towards randomness. Different graphs have different characteristic signatures. For example, in a perfectly algorithmically random graph all elements would be neutral, but in pseudo-random graphs the elements will accumulate around 0, with most of the spread on the positive side and a small negative tail. This is because it is, in general, easier to move a random graph away from randomness than further towards randomness, assuming the graph was already random. In a simple graph, such as a complete graph or a fully disconnected graph, all elements, such as nodes, would be neutral, and thus the signature is a constant around 0. Finally, complex networks will have elements on both sides, and if they produce a sharp inverted-S shape the network can be considered critical, because basically any perturbation dramatically changes the nature of the object, moving it towards or away from randomness.
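An information signature can be sketched with the same compression proxy used earlier: compute each element's contribution as the change in proxy complexity when that element is deleted, then sort from most positive to most negative. (The course computes signatures with BDM; this proxy and the function names are illustrative assumptions.)

```python
import zlib

def approx_C(s: str) -> int:
    # Compressed length as a crude stand-in for algorithmic complexity
    return len(zlib.compress(s.encode()))

def signature(s: str) -> list:
    # Contribution of element i: how much the proxy complexity changes
    # when that element is deleted; sorted from positive to negative
    base = approx_C(s)
    contrib = [base - approx_C(s[:i] + s[i + 1:]) for i in range(len(s))]
    return sorted(contrib, reverse=True)

# A simple string: its signature stays flat, close to a constant around 0,
# as the text describes for simple objects
sig = signature("10" * 16)
print(sig)
```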
Notice also that while such critical signatures can exist for algorithmic probability, they cannot for measures such as logical depth, because dramatic changes of logical depth obtained by changing only one element at a time are impossible. This is due to a law of slow change, which comes from the fact that one cannot shortcut the large amount of computation needed to either increase or decrease the logical depth of an object in a few steps.

So now we can ask whether moving these networks in algorithmic complexity space can move them in other ways in an educated or predictable manner. Questions such as what kind of topology is changed or preserved when pushing or pulling objects towards or away from randomness are very interesting; so is the question of what it would mean to move networks in algorithmic complexity space if such networks had dynamics defined on them, such as Boolean networks.

Here are some examples of the way in which we can numerically move networks with algorithmic information dynamics and our causal interventionist calculus. In figure (a), for example, a complete graph, denoted by K, is pushed away from randomness by edge deletion, and, starting from the complete graph with 10 nodes, produces other complete graphs after 10 and 19 steps, as expected. In figure (b) we show an example of pushing a network towards randomness: pushing a complete graph towards randomness produces an Erdős–Rényi (E-R) graph approaching edge density 0.5, as theoretically expected. In figure (c), pushing a random graph towards simplicity reveals, after 32 and 41 steps, structured subgraphs contained in the original random graph, thereby revealing structure in randomness through simple artificial perturbations.
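Experiments in the spirit of figures (a)-(c) can be imitated at a small scale. Below, the greedy calculus deletes edges from a 10-node complete graph, steering it away from randomness according to the compression proxy (again a stand-in for the BDM estimates used in the course; the adjacency-string encoding and function names are our assumptions):

```python
import zlib
from itertools import combinations

def approx_C(s: str) -> int:
    # Compressed length as a crude stand-in for algorithmic complexity
    return len(zlib.compress(s.encode()))

def adj_string(edges: set, n: int) -> str:
    # Upper-triangular adjacency matrix flattened into a bit string
    return "".join("1" if (i, j) in edges else "0"
                   for i, j in combinations(range(n), 2))

def push_graph(edges: set, n: int, steps: int, towards_randomness: bool) -> set:
    # At each step delete the edge whose removal moves the graph the most
    # towards (or away from) randomness under the proxy
    edges = set(edges)
    for _ in range(steps):
        pick = (max if towards_randomness else min)(
            edges, key=lambda e: approx_C(adj_string(edges - {e}, n)))
        edges.remove(pick)
    return edges

K10 = set(combinations(range(10), 2))   # complete graph on 10 nodes, 45 edges
simpler = push_graph(K10, 10, 5, towards_randomness=False)
print(len(simpler), approx_C(adj_string(simpler, 10)))
```

This is only a toy version of the calculus: with a better complexity estimator and more steps, the deletions are what produce the trajectories towards simplicity or randomness shown in the figures.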
In the next unit we will start with examples showing how we can reprogram networks, not only in this sense, but by moving them in the algorithmic complexity plane so that systems endowed with dynamics actually behave in different and rather controlled ways.