In all fairness, dealing with causality is extremely difficult, and we have long made do with tools such as traditional statistics and classical probability theory. A major development related to causality has been information theory, which is in some ways a clever extension of traditional statistics. It is one that has sometimes been dangerous, because it may give the impression of dealing with causality in ways different from traditional statistics and probability theory, but it does not. The concept of entropy was developed in the context of statistical mechanics in the early 20th century as an attempt to understand the average behavior of small objects such as particles, atoms, and molecules. In the late 1940s, Claude Shannon formulated a version of entropy similar to the one adopted in statistical mechanics. It was intended to help evaluate the capacity of communication channels while Shannon was working at Bell Labs, the research arm of the Bell telephone system concerned with communication lines, which today is the giant AT&T.

But it was Andrey Kolmogorov who, standing on the shoulders of others such as von Mises and Alan Turing, soon found out that probability theory was too limited to deal with fundamental aspects of randomness, and thus of causality. Yet we have so far made little use of what we know today about mathematical randomness, as opposed to statistical randomness, in areas such as data science and science in general to help tackle the challenge of causal discovery. Characterizing randomness is key in the study of causation because randomness is exactly what one wants to discount from observation and modeling: in order to come up with a candidate mechanism that explains the data, one does not want to attribute to chance what is not chance, or the other way around.

The founders of the theory of algorithmic complexity, namely Andrey Kolmogorov, Gregory Chaitin, Ray Solomonoff, and Leonid Levin, came up with mathematical solutions to the challenge of randomness, causal discovery, and inference in a fundamental way, even if they are difficult to calculate. Solomonoff and Levin showed that the concept of algorithmic probability is the ultimate solution to the problem of causal inference, suggesting that intelligence is likely simply computation. The two concepts, algorithmic complexity and algorithmic probability, are two sides of the same coin, deeply related to each other. However, while some applied scientists may have heard of algorithmic randomness, and likely fewer of algorithmic probability, very few, if any, use them to tackle the challenge of causation. The advantages of algorithmic complexity and algorithmic probability in data science have been largely underused, if not ignored, because, unlike simpler measures such as those from statistics, including Shannon entropy, measures of algorithmic complexity are much more difficult to estimate. One thing to bear in mind is that there will always be some statistics involved, for example in validating or finally testing our models, because the only way to have full certainty about causation is not only to witness a process of interest first hand, but also to isolate the system and have access to all the processes involved at all scales, something difficult, if not impossible, in the real world. Here we will introduce what we think is an exciting new approach, and hopefully you will find it exciting too, to tackling the problem of causality based on algorithmic complexity by way of algorithmic probability.
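(For reference, and in our own notation rather than the lecture's: the Shannon entropy of a random variable X with distribution p is H(X) = -sum_x p(x) log2 p(x); the algorithmic, or Kolmogorov-Chaitin, complexity of a string s is K(s) = min { |p| : U(p) = s }, the length in bits of the shortest program p that outputs s on a universal Turing machine U; and the algorithmic probability of s is m(s) = sum over all programs p with U(p) = s of 2^(-|p|). Levin's coding theorem connects the two, -log2 m(s) = K(s) + O(1), so the most probable explanation of s is also its shortest description. Unlike H, which can be computed directly from an estimated distribution, K and m are uncomputable and can only be approximated, which is the difficulty referred to above.)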
We call this new area algorithmic information dynamics, or algorithmic dynamics for short. Here is what Marvin Minsky, a founding father of the field of artificial intelligence, said about algorithmic probability: "It seems to me that the most important discovery since Gödel was the discovery by Chaitin, Solomonoff, and Kolmogorov of a concept called algorithmic probability, which is the fundamental new theory of how to make predictions given a collection of experiences. And this is a beautiful theory, everybody should learn it, but it's got one problem, which is that you can't actually calculate what this theory predicts because it's too hard and requires an infinite amount of work. However, it should be possible to make practical approximations to the Chaitin, Kolmogorov, Solomonoff theory that will make better predictions than anything we have today. And everybody should learn all about that and spend the rest of their lives working on it."

In this course we offer a model-driven approach to causal discovery, based on this theory of algorithmic probability, to help discover and produce candidate generative models of observed data. We have made some progress in applying these new tools to data to deal with causation, and the approach makes use of the progress made throughout history, from logic to statistics, probability, dynamical systems, and computation, including perturbation analysis, but it does move traditional statistics, still used underneath to estimate probability distributions, away from the center. We think that algorithmic information dynamics may be part of a shift away from the dominance that statistics and even equations have long exerted over science, towards computation and machine learning, but a different machine learning rooted in algorithmic approaches to data rather than the current state of machine learning, which is based mostly on traditional statistics. We will see in the last module how algorithmic information dynamics can help predict and even reconstruct the phase space and space-time behavior of complex nonlinear dynamical systems with limited information. First, though, in the next modules we will cover many more topics needed to understand algorithmic information dynamics, now that we have given a brief overview of the motivation, the significance, and some of the many aspects and challenges involved in causality.