In the previous lesson we saw how to trace in detail the algorithmic information trajectory of objects and events evolving over time. Here we will see how to move from being observers to being drivers of how systems change. For this, we will introduce a causal calculus based on algorithmic information dynamics. It is based upon perturbing objects or systems according to a sequence of interventions that lead to different possible trajectories. For example, let S be a non-random binary file containing the string:

10101010101010101010101010101010101010101010101010101010101010

Clearly, S is algorithmically compressible, with pS = "Print(10) 31 times" a short computer program generating S. Let S' be equal to S except for a bitwise operation (bitwise NOT, or complement) in, say, position 23:

10101010101010101010100010101010101010101010101010101010101010

A short computer program that generates S' can be written in terms of pS, which already accounts for most of the string; pS' only has to account, in addition, for the perturbed bit. A candidate program is pS' = "Print(10) 11 times; Print(00); Print(10) 19 times". Clearly, the length in bits of the computer program pS generating S is upper bounded by |pS'|, the length, in bits, of the computer program pS' generating S'. The relationship between their algorithmic complexities C is:

C(S) < C(S')

for any single-bit mutation of S, where C(S) = |pS| and C(S') = |pS'| are the lengths of the shortest computer programs generating S and S'. Notice that here we have induced a point mutation by flipping a bit from its original value. But if this sequence were a dynamical system, each pointwise mutation could have been a state transition in a dynamic space, so these mutations also cover the different possible one-bit orbits or trajectories of the object, and in principle we can cover many more by continuing to make perturbations.
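This comparison can be illustrated numerically. As a crude, computable stand-in for the uncomputable C, the sketch below uses the zlib-compressed length of the string (the course uses CTM/BDM for actual estimates; the helper name `approx_C` is our own):

```python
import zlib

def approx_C(s: str) -> int:
    # Length in bytes of the compressed string: a rough upper bound
    # standing in for the (uncomputable) algorithmic complexity C(s)
    return len(zlib.compress(s.encode()))

S = "10" * 31                     # the regular string S from the text
S_prime = S[:22] + "0" + S[23:]   # flip the bit at position 23 (1-indexed)

# The perturbed string needs at least as long a description as the original
print(approx_C(S), approx_C(S_prime))
```

On this example the proxy already reflects the ordering: the regular string compresses at least as well as its perturbed version, while both compress far below their raw length of 62 characters.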
So this is not only a causal interventionist calculus; it also considers the changes of complexity across different orbits of the same object. Perturbations can be of two opposite types with respect to the complexity of the underlying model explaining the data: perturbations that move an object towards randomness, and perturbations that move it away from randomness. To illustrate this, consider an example where the perturbation applied is, again, bit deletion and the starting object is a binary string. At every step we produce all possible single-bit mutations of the previous step and keep only the mutated string that moved the string closer to, or farther away from, randomness. You can see how pushing a string composed of a simple substring followed by a random-looking string towards and away from randomness using BDM actually extracts the correct substring. This process of interventions is at the core of the causal algorithmic calculus and of algorithmic information dynamics as applied, for example, to strings, before moving on to more sophisticated objects such as networks.

There is another type of perturbation, one that minimises the shift of the object towards or away from randomness. We call these neutral perturbations because they maximise the preservation of the main features in the data. You can see how the shorter string in the last step still preserves some features of the original string, such as the alternation between 0s and 1s, even though it has been significantly reduced in length.

Random data reacts differently to guided perturbations. Let S be algorithmically incompressible, that is, algorithmically random. This means that S has no generating mechanism shorter than the string itself and can only be generated by a computer program of the form pS = Print(S), which is not much shorter than S itself. Let S' be the result of negating any bit of S, just as we did before. Then pS' = Print(S').
There are two possible relations between |pS'| and |pS| after a single-bit perturbation (bit negation): either |pS| < |pS'| or |pS| > |pS'|, depending on whether S' moves towards or away from C(S). However, perturbing only a single bit cannot result in a (much) less random S', because pS = Bitwise(Print(S'), n) can be used to reverse S' into S, which is random; both Bitwise and n (the index of the bit to be changed) are of fixed, small length and so do not contribute much to the already high algorithmic complexity of the original random string (which can be of any length). So all perturbations are neutral when we face a random object! This means that perturbations to an algorithmically random object have a lower impact on their new candidate generating mechanisms (or lack thereof) than perturbations to non-random objects, relative to the generating mechanisms of their unperturbed states, and this effect can be exploited to estimate and infer the causal content of causal and non-causal objects. In an algorithmically random object, any change goes unnoticed, because no perturbation can lead to a dramatic change of its (already high) algorithmic complexity. Real-world cases will fall in an intermediate region between determinism and randomness.

The principle on which this causal calculus and algorithmic information dynamics are based is one we have mentioned before: the conservation of algorithmic complexity in deterministic isolated systems evolving over time. In other words, the rate of algorithmic complexity change is constant; it changes by at most a small logarithmic term as a function of time, needed only to specify a particular time step. Otherwise the algorithmic complexity of such a system is always exactly the same, because everything is generated by the same generating program.
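Both ideas so far, the guided perturbations and the relative neutrality of random objects, can be sketched numerically. The snippet below again uses compressed length as a crude proxy for algorithmic complexity (the course uses BDM; the example strings and helper names are our own illustrative choices):

```python
import zlib
import random

def approx_C(s: str) -> int:
    # Compressed length as a crude stand-in for algorithmic complexity
    return len(zlib.compress(s.encode()))

def push(s: str, steps: int, towards_randomness: bool) -> str:
    # Greedy interventionist calculus: at each step try every single-bit
    # deletion and keep the mutant that the proxy judges most (or least)
    # random, moving the string towards (or away from) randomness
    for _ in range(steps):
        mutants = [s[:i] + s[i + 1:] for i in range(len(s))]
        s = (max if towards_randomness else min)(mutants, key=approx_C)
    return s

s0 = "10" * 10 + "011010001110"   # simple prefix + random-looking tail
simplified = push(s0, 8, towards_randomness=False)
randomised = push(s0, 8, towards_randomness=True)

def flip(s: str, i: int) -> str:
    return s[:i] + ("0" if s[i] == "1" else "1") + s[i + 1:]

def relative_impact(s: str) -> float:
    # Largest change in the proxy over all single-bit flips, relative to
    # the proxy complexity of the unperturbed string
    base = approx_C(s)
    return max(abs(approx_C(flip(s, i)) - base) for i in range(len(s))) / base

random.seed(0)
regular = "10" * 500
rnd = "".join(random.choice("01") for _ in range(1000))
# Flips barely dent the already long model of the pseudo-random string,
# but noticeably disturb the short model of the regular one
print(relative_impact(regular), relative_impact(rnd))
```

The comparison at the end mirrors the argument above: relative to the length of its (candidate) generating program, a single-bit perturbation matters far less for a random-looking string than for a highly regular one.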
Here, for example, is the Elementary Cellular Automaton rule 30, which is deterministic and, once started with an initial condition, can be isolated. No matter where the evolution stops, the algorithmic complexity of the cellular automaton is always essentially the same: the rule never changes, and only the time is needed to reproduce exactly the same object. So after 1 step, 2 steps, 100 steps or any number of steps, the algorithmic complexity of the object remains almost the same, except for a logarithmic term that for all practical purposes can be neglected. But if a system is perturbed from the outside, and such a perturbation cannot be explained by the system's rules, then the algorithmic complexity of the system from that point on will be different, or will suffer a noticeable change at that precise point. So we can extract important information, detecting every time a system deviates from its normal course, is perturbed, or has parts that behave randomly, precisely by looking at how far removed it is from its original algorithmic complexity.

Networks can be seen as generated by computer programs too. Here on the left we have some examples of computer pseudo-code able to describe the networks on the right, and we can see how applying perturbations to a network, such as deleting an edge or a node, would produce changes in its generative model, depending on the algorithmic complexity of the network. For example, in a complete graph, deleting a node does not change the generating program, because the resulting graph is still a complete graph and we can regenerate the original graph with the same computer program, with no need to make any modification.
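Returning for a moment to the rule 30 example: a minimal sketch of its evolution makes the conservation principle concrete. The whole orbit is fixed by the rule, the initial condition and the number of steps, so the description length of any state is essentially constant, plus a log-sized term for the time:

```python
def rule30_step(row: str) -> str:
    # One step of Elementary Cellular Automaton rule 30 on a 0-padded row:
    # new cell = left XOR (centre OR right)
    padded = "00" + row + "00"
    return "".join(
        str(int(padded[i - 1]) ^ (int(padded[i]) | int(padded[i + 1])))
        for i in range(1, len(padded) - 1)
    )

row = "1"                 # initial condition: a single black cell
for t in range(3):
    row = rule30_step(row)
print(row)                # state after 3 steps: "1101111"
```

Whatever the stopping time, the pair (rule30_step, "1") plus the step count regenerates the state exactly, which is why an isolated deterministic system cannot gain algorithmic complexity beyond the logarithmic cost of specifying the time.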
However, in a random graph, one in which the description of every edge is necessary to describe the network, every change to the network produces a significant change in the computer program; yet such a change is small compared to the long length of the computer program describing the original random graph, and so the model remains of almost the same length. Finally, when the network is neither trivial nor random, the computer program is of small size, but modifications to the data, in this case the network, would require a major change to the explaining model. This means that, with a rough estimation of the complexity of the original object, we can perform perturbations on it and tell apart trivial and random cases from complex cases by observing the effect of the perturbations on the generative model.

In this way, we can colour every element of a network by its contribution to the generating model. Just as we said, in the complete graph shown in figure A, all nodes are neutral because they have no noticeable effect on the generating mechanism that can regenerate the graph; the original and mutated graphs are both complete graphs. But removing an edge from a complete graph produces a major effect on the model explaining the original graph, so all edges are coloured red in figure B, because their removal moves the object towards randomness. In a random graph like the one in figure E, all nodes and edges have very little effect on the already long model explaining the random graph. And sometimes, in less trivial, non-random graphs, some changes move the graph towards simplicity: for example, removing colliding edges from an otherwise unidirectional path graph, as in figure D, moves the graph towards simplicity, or away from randomness, and thus those edges are coloured blue.
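The complete-graph case can be checked directly: deleting a node from the complete graph on n nodes simply yields the complete graph on n - 1 nodes, so the same one-line generating program (with a smaller n) still explains the data. A small sketch, with helper names of our own choosing:

```python
from itertools import combinations

def complete_graph(n: int) -> set:
    # Edge set of the complete graph on nodes 0..n-1
    return set(combinations(range(n), 2))

def delete_node(edges: set, v: int) -> set:
    # Remove a node by dropping every edge incident to it
    return {e for e in edges if v not in e}

K10 = complete_graph(10)
perturbed = delete_node(K10, 9)

# The perturbed graph is still a complete graph, so the short program
# "complete graph on n nodes" still generates it, now with n = 9
print(perturbed == complete_graph(9))
```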
But we do not have to colour elements in binary or ternary, with only two or three colours; we can use the whole colour spectrum and obtain a finer-grained picture of the way in which perturbing the elements of an object such as a graph changes its causal properties, as dictated by its underlying generating mechanism. And we can classify networks by whether they are blue- or red-shifted, meaning that most elements of the graph move it towards or away from randomness, giving us a vector of interesting features that effectively provides a ranking of the ways in which an object can naturally or artificially be moved.

The spectrum can also be plotted as a curve, denoted by sigma(G), that we call an information signature: it consists of the values of the elements of an object ordered from positive to negative contribution to the original object. Positive elements move the object away from randomness, while negative elements move it towards randomness. Different graphs have different characteristic signatures. For example, in a perfectly algorithmically random graph all elements would be neutral, but in pseudo-random graphs the elements will accumulate around 0, with most of the spread on the positive side and a small negative tail. This is because it is, in general, easier to move a random graph away from randomness than further towards randomness, assuming the graph was already random. In a simple graph, such as a complete graph or a fully disconnected graph, all elements, such as nodes, would be neutral, and thus the signature is a constant around 0. Finally, complex networks will have elements on both sides, and if they produce a sharp inverted-S shape the network can be considered critical, because basically any perturbation dramatically changes the nature of the object, moving it towards or away from randomness.
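An information signature can be sketched with the same compression proxy used earlier: compute each element's contribution as the change in proxy complexity when that element is deleted, then sort from most positive to most negative. (The course computes signatures with BDM; this proxy and the function names are illustrative assumptions.)

```python
import zlib

def approx_C(s: str) -> int:
    # Compressed length as a crude stand-in for algorithmic complexity
    return len(zlib.compress(s.encode()))

def signature(s: str) -> list:
    # Contribution of element i: how much the proxy complexity changes
    # when that element is deleted; sorted from positive to negative
    base = approx_C(s)
    contrib = [base - approx_C(s[:i] + s[i + 1:]) for i in range(len(s))]
    return sorted(contrib, reverse=True)

# A simple string: its signature stays flat, close to a constant around 0,
# as the text describes for simple objects
sig = signature("10" * 16)
print(sig)
```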
Notice also that while such critical signatures can exist for algorithmic probability, they cannot for measures such as logical depth, because dramatic changes of logical depth obtained by changing only one element at a time are impossible. This is due to a law of slow change, which comes from the fact that one cannot shortcut the large amount of computation needed to either increase or decrease the logical depth of an object in a few steps.

So now we can ask whether moving these networks in algorithmic complexity space can move them in other ways in an educated or predictable manner. Questions such as what kind of topology is changed or preserved when pushing or pulling objects towards or away from randomness are very interesting; so is the question of what it would mean to move networks in algorithmic complexity space if such networks had dynamics defined on them, such as Boolean networks.

Here are some examples of the way in which we can numerically move networks with algorithmic information dynamics and our causal interventionist calculus. In figure (a), for example, a complete graph, denoted by K, is pushed away from randomness by edge deletion, and, starting from the complete graph with 10 nodes, produces other complete graphs after 10 and 19 steps, as expected. In figure (b) we show an example of pushing a network towards randomness: pushing a complete graph towards randomness produces an Erdős–Rényi (E-R) graph approaching edge density 0.5, as theoretically expected. In figure (c), pushing a random graph towards simplicity reveals, after 32 and 41 steps, structured subgraphs contained in the original random graph, thereby revealing structure in randomness through simple artificial perturbations.
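Experiments in the spirit of figures (a)-(c) can be imitated at a small scale. Below, the greedy calculus deletes edges from a 10-node complete graph, steering it away from randomness according to the compression proxy (again a stand-in for the BDM estimates used in the course; the adjacency-string encoding and function names are our assumptions):

```python
import zlib
from itertools import combinations

def approx_C(s: str) -> int:
    # Compressed length as a crude stand-in for algorithmic complexity
    return len(zlib.compress(s.encode()))

def adj_string(edges: set, n: int) -> str:
    # Upper-triangular adjacency matrix flattened into a bit string
    return "".join("1" if (i, j) in edges else "0"
                   for i, j in combinations(range(n), 2))

def push_graph(edges: set, n: int, steps: int, towards_randomness: bool) -> set:
    # At each step delete the edge whose removal moves the graph the most
    # towards (or away from) randomness under the proxy
    edges = set(edges)
    for _ in range(steps):
        pick = (max if towards_randomness else min)(
            edges, key=lambda e: approx_C(adj_string(edges - {e}, n)))
        edges.remove(pick)
    return edges

K10 = set(combinations(range(10), 2))   # complete graph on 10 nodes, 45 edges
simpler = push_graph(K10, 10, 5, towards_randomness=False)
print(len(simpler), approx_C(adj_string(simpler, 10)))
```

This is only a toy version of the calculus: with a better complexity estimator and more steps, the deletions are what produce the trajectories towards simplicity or randomness shown in the figures.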
In the next unit we will start with examples showing how we can reprogram networks, not only in this sense, but by moving them in the algorithmic complexity plane so that systems endowed with dynamics actually behave in different and rather controlled ways.