Welcome back to the course on Algorithmic Information Dynamics. Finally, we are about to learn what Algorithmic Information Dynamics is and how new methods may provide fresh and hopefully better means to deal with causation. You will see how we start putting together everything you have learnt so far, from information theory to graphs and networks, and from computability to dynamical systems. In this lecture we will contrast computable measures such as Shannon entropy with algorithmic complexity when it comes to characterising an object, in particular a special graph that we created for this purpose.

We saw in previous units how entropy quantifies randomness in a different way than algorithmic randomness does, and how entropy misses algorithmic properties of objects such as sequences when these lack statistical regularities. In a strict sense, algorithmic randomness is a generalisation of the typical application of entropy. Examples where entropy as a tool fails to shed light on the nature of an object include the mathematical constant pi, the Champernowne constant and the Copeland-Erdos constant, to mention a few. Another example that we also covered was the Thue-Morse sequence. Entropy can only find statistical regularities unless some other method to infer the nature of the source is available, a method such as algorithmic complexity which, while difficult to estimate, does not depend on assumptions such as an underlying probability distribution to which we rarely have full access or full knowledge. Of course, one can design all sorts of fancier measures on top of Shannon entropy, but as long as they rely on the same principles they will all have exactly the same shortcomings, so we need new tools to complement statistical tools such as Shannon entropy. One such tool to complement entropy in inferring the nature of the source, and potentially the source itself in the form of a mechanistic model, is to introduce computation into the challenge. The idea is that if an object has an algorithmic regularity, it can in principle be quantified by a universal, non-computable measure such as algorithmic complexity and, as we saw before, this comes in the form of asking whether the object or sequence can be characterised as random or not random according to the length of the computer programs that can generate it.

One major challenge in modern physics is the proper characterisation of network systems for use in fields ranging from physics to chemistry. A common problem is the description of order parameters with which to characterise the "complexity of a network." Graph complexity has traditionally been characterised using graph-theoretic measures such as the degree distribution, clustering coefficient, edge density, and community or modular structure. For graphs, too, the limitations of Shannon entropy, and of any other computable measure, require you first to define a feature of interest in order to apply the measure. Here, for example, is an attempt at a very general measure of graph entropy, where G is a graph and i is a property of G; a property can be, for instance, how many 1s and 0s its adjacency matrix contains, or what the degree distribution of the graph looks like. We will see how H depends on the choice of i, making entropy quite fragile with respect to the chosen feature of interest and therefore not a graph invariant.
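To make this dependence on the chosen feature concrete, here is a small Python sketch (our own illustration, not part of the original course materials; the four-node path graph is just an arbitrary example) that computes the empirical Shannon entropy of two different lossless descriptions of the same graph and obtains two different values:

    import math
    from collections import Counter

    def shannon_entropy(symbols):
        """Empirical Shannon entropy (bits per symbol) of a sequence,
        taking each symbol's probability to be its observed frequency."""
        counts = Counter(symbols)
        n = len(symbols)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    # A 4-node path graph described by two different lossless features:
    adjacency_bits = [0, 1, 0, 0,
                      1, 0, 1, 0,
                      0, 1, 0, 1,
                      0, 0, 1, 0]     # flattened adjacency matrix
    degree_sequence = [1, 2, 2, 1]    # degree of each node

    print(shannon_entropy(adjacency_bits))   # about 0.95 bits for the 0/1 entries
    print(shannon_entropy(degree_sequence))  # 1.0 bit for the degree values
    # Same graph, two descriptions, two different entropy values: H depends on i.

Which of these values counts as "the" entropy of the graph depends entirely on which feature we decided to measure.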
We also saw before how, when we have no access to any probability distribution but only to G itself, one traditionally assumes the underlying probability distribution of G for a given feature i, denoted by P(G), to be uniform, which makes entropy behave as a simple counting function: it counts how many times feature i occurs as we inspect G precisely for that property.

We introduced a method to build a family of recursive graphs with maximal entropy but low algorithmic complexity, hence graphs that appear statistically random but are nonetheless of low algorithmic randomness and thus causally (recursively) generated. Moreover, these graphs may have maximal entropy for some lossless descriptions but minimal entropy for other lossless descriptions of exactly the same objects, with both descriptions characterising the same object and only that object, thereby demonstrating how entropy fails to unequivocally and unambiguously characterise a graph independently of a particular feature of interest.

We denote by ZK a graph (unequivocally) constructed as follows. Let a node labelled 1 be connected to another node labelled 2; this is the starting graph G. If a node with label n has degree n, we call it a core node; otherwise, we call it a supportive node. Iteratively add a node n + 1 to G such that the number of core nodes in G is maximised. That is, we aim at adding nodes whose number of edges equals their labels. So we keep adding nodes, trying to connect each node in such a way that every node eventually gets the same number of edges as its label determines: node 2 would have 2 edges, node 3 would have 3 edges, node 4 would have 4 edges, and so on. Here we can see core nodes coloured in red once they have reached the number of edges that their label establishes, while the nodes that are still missing edges are coloured in blue and are called supportive nodes. We then keep adding nodes and edges in an iterative fashion, one node at a time but with as many edges as needed to complete previously added nodes and turn them into core (red) nodes.

As it turns out, this simple algorithm determines a unique graph and constitutes a method that can be used to build a family of recursive graphs with maximal entropy but low algorithmic complexity, that is, graphs that appear statistically random but are of low algorithmic randomness and thus causally (recursively) generated, just like this one. Indeed, the degree sequence of the graph follows the Champernowne constant in base 10, a transcendental real whose decimal expansion, as we saw before, is Borel normal and thus of maximal entropy. This is because core node 1 has 1 link, core node 2 has 2 links, and so on, so the degree sequence of this graph is of near-maximal entropy except for a smaller number of supportive nodes. Notice that one can also reconstruct exactly the same graph from its degree sequence, as we have also proven. Interestingly, the sequence of the number of edges follows a recurrence relation built upon previous iteration values between core and supportive nodes, defined by a beautiful formula involving the golden ratio; the brackets in that formula represent the floor function.
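As a quick illustration of that point about the Champernowne constant (our own Python sketch, not part of the original course materials), one can check numerically that the digits obtained by concatenating 1, 2, 3, ... have an empirical entropy approaching the maximum of log2(10) bits per digit:

    import math
    from collections import Counter

    # Decimal expansion of the base-10 Champernowne constant: 0.123456789101112...
    digits = "".join(str(k) for k in range(1, 100001))

    counts = Counter(digits)
    n = len(digits)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    print(round(entropy, 4), round(math.log2(10), 4))  # empirical vs. maximal digit entropy

So a degree sequence that enumerates 1, 2, 3, ... in this way looks, to entropy, like a maximally random sequence of digits, even though it is generated by a trivial rule.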
But more importantly, when you look at the adjacency matrix of this graph, it turns out to be very sparse; that is, it has far fewer edges than the greatest possible number of edges among all its nodes. The number of possible edges grows quadratically with the number of nodes while the number of actual edges grows only linearly, so the edge density of this graph vanishes, going to 0, and the adjacency matrix contains almost nothing but 0s, which also means that the adjacency matrix is of close to minimal entropy.

This is a striking feature by design, because both the adjacency matrix of the graph and its degree sequence can uniquely reconstruct the graph from scratch, so they are lossless representations; yet, without knowing the generating source, one description of the graph suggests that the graph is almost maximally random while the other suggests that it is minimally random, hence yielding divergent entropy values for different features. This helps us see how one can choose an arbitrary feature of a graph when assigning it an entropy value, and how this differs from algorithmic complexity, which is invariant to the object's description and actually considers the generating mechanisms, in the form of computer programs, before making a final call. Yet, as we know, the graph can be generated recursively by a short computer program that always produces exactly the same graph and is thus deterministic and of low algorithmic complexity. Here is a very compact computer program written in the Wolfram Language that produces the graph.

In spite of its trivial construction, the ZK graph displays all sorts of interesting graph-theoretic, dynamical and complexity properties. For example, the clustering coefficient of the undirected graph asymptotically converges to 0.65, some properties increase or decrease linearly while others do so polynomially, and the different eigenvalues behave in non-trivial ways. And, as we just explained, the entropy values of different graph descriptions (even for fully accurate descriptions, and not because of a lack of information on the observer's part) diverge and become trivially dependent on other simple functions (e.g., edge density or degree-sequence normality). While useful for quantifying specific features of the graph that may appear interesting, no graph-theoretic or entropic measure can account for the low (algorithmic) randomness, and therefore the (high) causal content, of the network. And while algorithmic complexity is difficult to apply and may sometimes not produce the desired results for lack of computational power, it can in principle deal with these cases, and we will see how we can actually make good estimations with new methods based on algorithmic probability.
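Before closing, here is one more short Python sketch (again our own illustration, not from the original course materials) that makes concrete the earlier point about entropy collapsing into a trivial function of a single feature: when Shannon entropy is applied to the 0/1 entries of an adjacency matrix with empirical frequencies, it reduces to the binary entropy of the fraction of 1s, which is essentially the edge density:

    import math

    def binary_entropy(p):
        """Shannon entropy (bits) of a Bernoulli(p) source."""
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    # Entropy of the adjacency-matrix entries as a function of edge density:
    for density in (0.5, 0.1, 0.01, 0.001):
        print(density, round(binary_entropy(density), 4))
    # As the density vanishes, as it does for the ZK graph, this entropy goes
    # to 0, regardless of whatever algorithmic structure the graph may have.

In other words, for this description the entropy value tells us only about the edge density and nothing about the causal, recursive mechanism that generated the graph, which is exactly the gap that the algorithmic approach is meant to fill.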