<h1>Iteration with Optics</h1>
<p><em>2024-02-22</em></p>
<p>In this post I’ll describe the theory of how to add iteration to categories of optics. Iteration is required for almost all applications of categorical cybernetics beyond game theory, and is something we’ve been handling only semi-formally for some time. The only tool we need is already one we have inside the categorical cybernetics framework: parametrisation weighted by a lax monoidal functor. I’ll end with a conjecture that this is an instance of a general procedure to force states in a symmetric monoidal category.</p>
<p>This post is strongly inspired by the account of Moore machines in <a href="http://davidjaz.com/">David Jaz Myers</a>’ book <a href="http://davidjaz.com/Papers/DynamicalBook.pdf">Categorical Systems Theory</a>, and <a href="https://matteocapucci.wordpress.com/">Matteo</a>’s enthusiasm for it. There’s probably a big connection to things like <a href="https://arxiv.org/abs/1903.01093">Delayed trace categories</a>, but I don’t understand it yet.</p>
<p>The diagrams in this post are made with <a href="https://q.uiver.app/">Quiver</a> and <a href="https://varkor.github.io/tangle/">Tangle</a>.</p>
<h1>The iteration functor</h1>
<p>For the purposes of this post, we’ll be working with a symmetric monoidal category $\mathcal C$, and the category $\mathbf{Optic} (\mathcal C)$ of monoidal optics over it. Objects of $\mathbf{Optic} (\mathcal C)$ are pairs of objects of $\mathcal C$, and morphisms are given by the coend formula</p>
\[\mathbf{Optic} (\mathcal C) \left( \binom{X}{X'}, \binom{Y}{Y'} \right) = \int_{M : \mathcal C} \mathcal C (X, M \otimes Y) \times \mathcal C (M \otimes Y', X')\]
<p>which amounts to saying that an optic $\binom{X}{X’} \to \binom{Y}{Y’}$ is an equivalence class of triples</p>
\[(M : \mathcal C, f : X \to M \otimes Y, f' : M \otimes Y' \to X')\]
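<p>Concretely, in the cartesian case $\mathcal C = \mathbf{Set}$ (with Haskell types standing in for sets), such a triple can be encoded as an existential datatype. This is a standard encoding sketch, not code from the post; <code>compOptic</code> and <code>runOptic</code> are illustrative names:</p>

```haskell
{-# LANGUAGE GADTs #-}

-- An optic from (x, x') to (y, y') in the cartesian case: an existentially
-- quantified residual m, a forwards pass and a backwards pass.
data Optic x x' y y' where
  Optic :: (x -> (m, y)) -> ((m, y') -> x') -> Optic x x' y y'

-- Composition tensors the residuals: the composite residual is a pair.
compOptic :: Optic x x' y y' -> Optic y y' z z' -> Optic x x' z z'
compOptic (Optic f f') (Optic g g') =
  Optic (\x -> let (m, y) = f x; (n, z) = g y in ((m, n), z))
        (\((m, n), z') -> f' (m, g' (n, z')))

-- Run an optic: forwards on an input, backwards on a response.
runOptic :: Optic x x' y y' -> x -> y' -> (y, x')
runOptic (Optic f f') x y' = let (m, y) = f x in (y, f' (m, y'))
```

The equivalence relation of the coend is invisible in this encoding: two values of <code>Optic</code> that differ by sliding a morphism across the residual denote the same optic.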
<p>I’m pretty sure everything in this post works for other categories of bidirectional processes such as mixed optics and dependent lenses; this is just a setting to write it down which is both convenient and not at all obvious.</p>
<p>The <strong>iteration functor</strong> is a functor $\mathrm{Iter} : \mathbf{Optic} (\mathcal C) \to \mathbf{Set}$ defined on objects by</p>
\[\mathrm{Iter} \binom{X}{X'} = \int_{M : \mathcal C} \mathcal C (I, M \otimes X) \times \mathcal C (M \otimes X', M \otimes X)\]
<p>We refer to elements of $\mathrm{Iter} \binom{X}{X’}$ as <em>iteration data</em> for $\binom{X}{X’}$. We call the object $M$ the <em>state space</em>, the morphism $x_0 : I \to M \otimes X$ the <em>initial state</em> and the morphism $i : M \otimes X’ \to M \otimes X$ the <em>iterator</em>.</p>
<p>Note that in the common case that $\mathcal C$ is cartesian monoidal, we can eliminate the coend to obtain a simpler characterisation:</p>
\[\mathrm{Iter} \binom{X}{X'} = \mathcal C (1, X) \times \mathcal C (X', X)\]
<p>Given an optic $f : \binom{X}{X’} \to \binom{Y}{Y’}$ represented by the triple $f = (N, f : X \to N \otimes Y, f’ : N \otimes Y’ \to X’)$, we get a function</p>
\[\mathrm{Iter} (f) : \mathrm{Iter} \binom{X}{X'} \to \mathrm{Iter} \binom{Y}{Y'}\]
<p>Namely, writing $M$ for the state space of the input iteration data, the resulting state space is $M \otimes N$, the initial state is</p>
\[I \overset{x_0}\longrightarrow M \otimes X \xrightarrow{M \otimes f} M \otimes N \otimes Y\]
<p>and the iterator is</p>
\[M \otimes N \otimes Y' \xrightarrow{M \otimes f'} M \otimes X' \overset{i}\longrightarrow M \otimes X \xrightarrow{M \otimes f} M \otimes N \otimes Y\]
<p>This is evidently functorial. Funnily enough, although the action of $\mathrm{Iter}$ on objects when $\mathcal C$ is cartesian is easier to understand, its action on morphisms is less obvious and is not <em>evidently</em> functorial, instead demanding a small proof.</p>
<h1>Pairing iterators and continuations</h1>
<p>We have an existing functor $K : \mathbf{Optic} (\mathcal C)^{\mathrm{op}} \to \mathbf{Set}$, given on objects by $K \binom{X}{X’} = \mathcal C (X, X’)$. This is the <em>continuation functor</em>, and it is the contravariant functor represented by the monoidal unit $\binom{I}{I}$. (This functor first appeared in <a href="https://arxiv.org/abs/1711.07059">Morphisms of Open Games</a>.)</p>
<p>For the remainder of this section I’ll specialise to the case $\mathcal C = \mathbf{Set}$, in which case an optic $\binom{X}{X’} \to \binom{Y}{Y’}$ is determined by a pair of functions $f : X \to Y$ and $f’ : X \times Y’ \to X’$, and iteration data $i : \mathrm{Iter} \binom{X}{X’}$ is determined by an initial value $x_0 : X$ and a function $i : X’ \to X$.</p>
<p>Given iteration data and a continuation that agree on their common boundary, we know enough to run the iteration and produce an infinite stream of values:</p>
\[\langle - \mid - \rangle : \mathrm{Iter} \binom{X}{X'} \times K \binom{X}{X'} \to X^\omega\]
<p>Namely, this stream is defined corecursively by</p>
\[\langle x_0, i \mid k \rangle = x_0 : \langle i (k (x_0)), i \mid k \rangle\]
<p>This operation is natural (technically, <em>dinatural</em>): for any iteration data $i : \mathrm{Iter} \binom{X}{X’}$, optic $f : \binom{X}{X’} \to \binom{Y}{Y’}$ and continuation $k : K \binom{Y}{Y’}$, we have</p>
\[\langle i \mid K (f) (k) \rangle = f^\omega \left( \langle \mathrm{Iter} (f) (i) \mid k \rangle \right)\]
<p>where $f^\omega (-) : X^\omega \to Y^\omega$ means applying the forwards pass of $f$ to every element of the stream. As a commuting diagram,</p>
<p><img src="/assetsPosts/2024-02-20-iteration-optics/dinaturality.png" alt="Dinaturality" /></p>
<p>Here’s a tiny implementation of the iteration functor and the pairing operator in Haskell:</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">{-# LANGUAGE RankNTypes #-}</span>
<span class="kr">import</span> <span class="nn">Control.Lens</span>

<span class="kr">data</span> <span class="kt">Iterator</span> <span class="n">s</span> <span class="n">t</span> <span class="o">=</span> <span class="kt">Iterator</span> <span class="p">{</span>
  <span class="n">initialState</span> <span class="o">::</span> <span class="n">s</span><span class="p">,</span>
  <span class="n">updateState</span> <span class="o">::</span> <span class="n">t</span> <span class="o">-></span> <span class="n">s</span>
<span class="p">}</span>

<span class="n">mapIterator</span> <span class="o">::</span> <span class="kt">Lens</span> <span class="n">s</span> <span class="n">t</span> <span class="n">a</span> <span class="n">b</span> <span class="o">-></span> <span class="kt">Iterator</span> <span class="n">s</span> <span class="n">t</span> <span class="o">-></span> <span class="kt">Iterator</span> <span class="n">a</span> <span class="n">b</span>
<span class="n">mapIterator</span> <span class="n">l</span> <span class="p">(</span><span class="kt">Iterator</span> <span class="n">s</span> <span class="n">f</span><span class="p">)</span> <span class="o">=</span> <span class="kt">Iterator</span> <span class="p">(</span><span class="n">s</span> <span class="o">^#</span> <span class="n">l</span><span class="p">)</span> <span class="p">(</span><span class="nf">\</span><span class="n">b</span> <span class="o">-></span> <span class="p">(</span><span class="n">f</span> <span class="p">(</span><span class="n">s</span> <span class="o">&</span> <span class="n">l</span> <span class="o">.~</span> <span class="n">b</span><span class="p">))</span> <span class="o">^#</span> <span class="n">l</span><span class="p">)</span>

<span class="n">runIterator</span> <span class="o">::</span> <span class="kt">Iterator</span> <span class="n">s</span> <span class="n">t</span> <span class="o">-></span> <span class="kt">Lens</span> <span class="n">s</span> <span class="n">t</span> <span class="nb">()</span> <span class="nb">()</span> <span class="o">-></span> <span class="p">[</span><span class="n">s</span><span class="p">]</span>
<span class="n">runIterator</span> <span class="p">(</span><span class="kt">Iterator</span> <span class="n">s</span> <span class="n">f</span><span class="p">)</span> <span class="n">l</span> <span class="o">=</span> <span class="n">s</span> <span class="o">:</span> <span class="n">runIterator</span> <span class="p">(</span><span class="kt">Iterator</span> <span class="p">(</span><span class="n">f</span> <span class="p">(</span><span class="n">s</span> <span class="o">&</span> <span class="n">l</span> <span class="o">.~</span> <span class="nb">()</span><span class="p">))</span> <span class="n">f</span><span class="p">)</span> <span class="n">l</span>
</code></pre></div></div>
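<p>As a usage sketch, here is a dependency-free variant of the same code, with a hypothetical concrete lens record (a view and an update function) standing in for the lens-library type, plus a closed loop that counts upwards:</p>

```haskell
-- A concrete lens: a view and an update (standing in for the lens-library type).
data Lens s t a b = Lens { view :: s -> a, update :: s -> b -> t }

data Iterator s t = Iterator { initialState :: s, updateState :: t -> s }

mapIterator :: Lens s t a b -> Iterator s t -> Iterator a b
mapIterator l (Iterator s f) =
  Iterator (view l s) (\b -> view l (f (update l s b)))

runIterator :: Iterator s t -> Lens s t () () -> [s]
runIterator (Iterator s f) l =
  s : runIterator (Iterator (f (update l s ())) f) l

-- A closed control loop: count upwards from 0.
counter :: Iterator Int Int
counter = Iterator 0 (+ 1)

-- The identity-like lens that discards the unit boundary.
unitLens :: Lens s s () ()
unitLens = Lens (const ()) (\s () -> s)
```

<p>Here <code>take 5 (runIterator counter unitLens)</code> evaluates to <code>[0,1,2,3,4]</code>.</p>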
<h1>The category of elements of Iterator</h1>
<p>The next step is to form the category of elements $\int \mathrm{Iter}$, also known as the discrete Grothendieck construction. This is a category whose objects are tuples $\left( \binom{X}{X’}, i \right)$ of an object $\binom{X}{X’}$ of $\mathbf{Optic} (\mathcal C)$ and a choice of iteration data $i : \mathrm{Iter} \binom{X}{X’}$. A morphism $\left( \binom{X}{X’}, i \right) \to \left( \binom{Y}{Y’}, j \right)$ is an optic $f : \binom{X}{X’} \to \binom{Y}{Y’}$ with the property that $\mathrm{Iter} (f) (i) = j$, that is to say, the iteration data on the left and right boundary have to agree.</p>
<p>The functor $\mathrm{Iter} : \mathbf{Optic} (\mathcal C) \to \mathbf{Set}$ is lax monoidal: there is an evident natural way to combine pairs of iteration data into iteration data for pairs:</p>
\[\nabla : \mathrm{Iter} \binom{X}{X'} \times \mathrm{Iter} \binom{Y}{Y'} \to \mathrm{Iter} \binom{X \otimes Y}{X' \otimes Y'}\]
<p>This means that the tensor product of $\mathbf{Optic} (\mathcal C)$ lifts to $\int \mathrm{Iter}$, by</p>
\[\left( \binom{X}{X'}, i \right) \otimes \left( \binom{Y}{Y'}, j \right) = \left( \binom{X \otimes Y}{X' \otimes Y'}, i \nabla j \right)\]
<p>The category $\int \mathrm{Iter}$ can essentially already describe iteration with optics, although in a slightly awkward way. Suppose we draw a string diagram that not coincidentally resembles a control loop:</p>
<p><img src="/assetsPosts/2024-02-20-iteration-optics/closed-control-loop.png" alt="Control loop" /></p>
<p>Here, $f$ and $f’$ denote some morphisms $f : X \to Y$ and $f’ : Y \to X$ in our underlying category, and $x_0$ represents an initial state $x_0 : I \to X$.</p>
<p>Normally string diagrams denote morphisms of a monoidal category, but we make a cut just to the right of the backwards-to-forwards turning point, and consider that everything left of that is describing a boundary object. Namely in this case, we have the object $\left( \binom{X}{X}, i \right)$ where the iteration data $i : \mathrm{Iter} \binom{X}{X}$ is given by the state space $I$, the initial state $x_0 : I \to I \otimes X$ and the iterator $\mathrm{id} : I \otimes X \to I \otimes X$.</p>
<p><img src="/assetsPosts/2024-02-20-iteration-optics/cut-control-loop.png" alt="Control loop" /></p>
<p>The remainder of the string diagram to the right of the cut denotes an ordinary optic $f : \binom{X}{X} \to \binom{I}{I}$, namely the one given by $f = (Y, f, f’)$, with forwards pass $f : X \to Y \otimes I$ and backwards pass $f’ : Y \otimes I \to X$. This boils down to describing the composite morphism $f; f’ : X \to X$.</p>
<p>Overall, we can read this diagram as denoting a morphism $f$ in $\int \mathrm{Iter}$ of type $f : \left( \binom{X}{X}, i \right) \to \left( \binom{I}{I}, \mathrm{Iter} (f) (i) \right)$. The iteration data on the right boundary is $\mathrm{Iter} (f) (i) : \mathrm{Iter} \binom{I}{I}$, which concretely has state space $Y$, the initial state $x_0; f : I \to Y$ and iterator $f’; f : Y \to Y$.</p>
<p>This works in principle, but splitting the diagram between denoting an object and denoting a morphism is very non-standard. So far, this amounts to doing for the iteration functor what we did for the selection functions functor in section 6 of <a href="https://arxiv.org/abs/2105.06332">Towards Foundations of Categorical Cybernetics</a>.</p>
<h1>The full theory of iteration</h1>
<p>Now we take the final step to fix the slight clunkiness of using $\int \mathrm{Iter}$ as a model of iteration. This continues the firmly established pattern that categorical cybernetics contains only two ideas that get combined in more and more intricate ways: optics and parametrisation.</p>
<p>There is a strong monoidal functor $\pi : \int \mathrm{Iter} \to \mathbf{Optic} (\mathcal C)$ that forgets the iteration data, namely the discrete fibration $\pi \left( \binom{X}{X’}, i \right) = \binom{X}{X’}$. This functor generates an action of the monoidal category $\int \mathrm{Iter}$ on $\mathbf{Optic} (\mathcal C)$, namely</p>
\[\left( \binom{X}{X'}, i \right) \bullet \binom{Y}{Y'} = \binom{X \otimes Y}{X' \otimes Y'}\]
<p>See section 5.5 of <a href="https://arxiv.org/abs/2203.16351">Actegories for the Working Mathematician</a> for far too much information about actegories of this form.</p>
<p>We now take the category $\mathbf{Para}_{\int \mathrm{Iter}} (\mathbf{Optic} (\mathcal C))$ of parametrised morphisms generated by this action. We also refer to this kind of thing (parametrisation for the action generated by a discrete fibration) as the Para construction <em>weighted</em> by $\mathrm{Iter}$, $\mathbf{Para}^\mathrm{Iter} (\mathbf{Optic} (\mathcal C))$ - the name comes from it being a kind of <a href="https://ncatlab.org/nlab/show/weighted+limit">weighted limit</a> and I think the reference for this is <a href="https://www.brunogavranovic.com/">Bruno</a>’s PhD thesis, which is sadly unreleased as I’m writing this.</p>
<p>Working things through: an object of $\mathbf{Para}^\mathrm{Iter} (\mathbf{Optic} (\mathcal C))$ is still a pair $\binom{X}{X’}$, but a morphism $\binom{X}{X’} \to \binom{Y}{Y’}$ consists of three things: another pair of objects $\binom{Z}{Z’}$, iteration data $i : \mathrm{Iter} \binom{Z}{Z’}$, and an optic $\binom{X \otimes Z}{X’ \otimes Z’} \to \binom{Y}{Y’}$.</p>
<p>Now suppose we have a diagram of an open control loop, that is to say, a control loop that is open-as-in-systems (not to be confused with an <a href="https://en.wikipedia.org/wiki/Open-loop_controller">open loop controller</a>, which is unrelated):</p>
<p><img src="/assetsPosts/2024-02-20-iteration-optics/open-control-loop.png" alt="Open control loop" /></p>
<p>Here the primitive morphisms in the diagram are $f : A \otimes X \to B \otimes Y$, $f’ : B’ \otimes Y \to A’ \otimes X$, and an initial state $x_0 : I \to X$. The idea is that $f$ is the forwards pass, $f’$ is the backwards pass, and after the backwards pass comes another forwards pass but one time step in the future.</p>
<p>To make formal sense of this diagram, we imagine that we deform the backwards-to-forwards bend upwards, treating the state as a parameter, and then cut the diagram as we did before:</p>
<p><img src="/assetsPosts/2024-02-20-iteration-optics/cut-open-control-loop.png" alt="Cut open control loop" /></p>
<p>Now we can read this off as a morphism $\binom{X}{X’} \to \binom{Y}{Y’}$ in $\mathbf{Para}^\mathrm{Iter} (\mathbf{Optic} (\mathcal C))$. The (weighted) Para construction makes everything go smoothly, so this is an entirely standard string diagram with no funny stuff.</p>
<p>Technically categories of parametrised morphisms are always bicategories (or better, double categories), and I think this is a rare case where we actually want to quotient out all morphisms in the vertical direction, i.e. identify $\left( f : \binom{X \otimes Z}{X’ \otimes Z’} \to \binom{Y}{Y’}, i : \mathrm{Iter} \binom{Z}{Z’} \right)$ with $\left( g : \binom{X \otimes W}{X’ \otimes W’} \to \binom{Y}{Y’}, j : \mathrm{Iter} \binom{W}{W’} \right)$ whenever there is <em>any</em> optic $h : \binom{Z}{Z’} \to \binom{W}{W’}$ making $\mathrm{Iter} (h) (i) = j$ and commuting with $f$ and $g$. Coming back to our earlier picture of cutting a string diagram, this exactly says that we identify all of the different ways we could make the cut. In order to do this we change the base of enrichment along the functor $\pi_0 : \mathbf{Cat} \to \mathbf{Set}$ taking each category to its set of connected components.</p>
<p>One final note: Almost everything in this post used nothing but the fact that $\mathrm{Iter}$ is a lax monoidal functor $\mathbf{Optic} (\mathcal C) \to \mathbf{Set}$. With minimal translation, I think the entire thing works as a story about “forcing states in a symmetric monoidal category”: given any symmetric monoidal category $\mathcal C$ and a lax monoidal functor $F : \mathcal C \to \mathbf{Set}$, the category $\mathbf{Para}^F (\mathcal C)$ is equivalently described as $\mathcal C$ freely extended with a morphism $x : I \to X$ for every $x : F (X)$. I’ll leave this as a conjecture for somebody else to prove.</p>
<p><em>Jules Hedges</em></p>
<h1>Passive Inference is Compositional, Active Inference is Emergent</h1>
<p><em>2024-02-06</em></p>
<p>This post is a writeup of a talk I gave at the <a href="https://amcs-community.org/events/causal-cognition-humans-machines/">Causal Cognition in Humans and Machines</a> workshop in Oxford, about some work in progress I have with <a href="https://tsmithe.net/">Toby Smithe</a>. To a large extent this is my take on the theoretical work in Toby’s PhD thesis, with the emphasis shifted from category theory and neuroscience to numerical computation and AI. In the last section I will outline my proposal for how to build AGI.</p>
<h2>Markov kernels</h2>
<p>The starting point is the concept of a <a href="https://en.wikipedia.org/wiki/Markov_kernel">Markov kernel</a>, which is a synonym for <a href="https://en.wikipedia.org/wiki/Conditional_probability_distribution">conditional probability distribution</a> that sounds unnecessarily fancy but, crucially, contains only 30% as many syllables. If $X$ and $Y$ are some sets then a Markov kernel $\varphi$ from $X$ to $Y$ is a conditional probability distribution $\mathbb P_\varphi [y \mid x]$. Most of this post will be agnostic to what exactly “probability distribution” can mean, but in practice it will <em>probably</em> eventually mean “Gaussian”, in order to <a href="https://knowyourmeme.com/memes/money-printer-go-brrr">go <em>brrr</em></a>, by which I mean <em>effective in practice at the expense of theoretical compromise</em>. (I blatantly stole this usage of that meme from <a href="https://www.brunogavranovic.com/">Bruno</a>.)</p>
<p>There are two different perspectives on how Markov kernels can be implemented. They could be <em>exact</em>, for example, they could be represented as a stochastic matrix (in the finite support case) or as a tensor containing a mean vector and covariance matrix for each input (in the Gaussian case). Alternatively they could be <a href="https://en.wikipedia.org/wiki/Monte_Carlo_method">Monte Carlo</a>, that is, implemented as a function from $X$ to $Y$ that may call a pseudorandom number generator. If we send the same input repeatedly then the outputs are samples from the distribution we want. Importantly these functions satisfy the <a href="https://en.wikipedia.org/wiki/Markov_property">Markov property</a>: the distribution on the output depends only on the current input and not on any internal state.</p>
<p>An important fact about Markov kernels is that they can be composed. Given a Markov kernel $\mathbb P_\varphi [y \mid x]$ and another $\mathbb P_\psi [z \mid y]$, there is a composite kernel $\mathbb P_{\varphi; \psi} [z \mid x]$ obtained by integrating out $y$:</p>
\[\mathbb P_{\varphi; \psi} [z \mid x] = \int \mathbb P_\varphi [y \mid x] \cdot \mathbb P_\psi [z \mid y] \, dy\]
<p>This formula is sometimes given the unnecessarily fancy name <a href="https://en.wikipedia.org/wiki/Chapman%E2%80%93Kolmogorov_equation">Chapman-Kolmogorov equation</a>. If we represent kernels by stochastic matrices, then this is exactly matrix multiplication; if they are Gaussian tensors, then it’s a similar but slightly more complicated operation. Doing exact probability for anything more complicated is extremely hard in practice because of the <a href="https://en.wikipedia.org/wiki/Curse_of_dimensionality">curse of dimensionality</a>.</p>
<p>If we represent kernels by Monte Carlo functions, then composition is literally just function composition, which is extremely convenient. That is, we can just send particles through a chain of functions and they’ll come out with the right distribution - this fact is basically what the term “Monte Carlo” actually means.</p>
<p>A special case of this is an ordinary (non-conditional) probability distribution, which can be usefully thought of as a Markov kernel whose domain is a single point. Given a distribution $\mathbb P_\pi [x]$ and a kernel $\mathbb P_\varphi [y \mid x]$ we can obtain a distribution $\pi; \varphi$ on $y$, known as the <em>pushforward distribution</em>, by integrating out $x$:</p>
\[\mathbb P_{\pi; \varphi} [y] = \int \mathbb P_\pi [x] \cdot \mathbb P_\varphi [y \mid x] \, dx\]
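<p>As a concrete sketch of the exact finite-support case (Haskell, with illustrative names; not code from the original post), composition and pushforward are a few lines each:</p>

```haskell
-- Exact finite-support representation: a distribution is a list of
-- outcome/probability pairs, and a kernel sends each input to one.
type Dist a = [(a, Double)]
type Kernel x y = x -> Dist y

-- Merge duplicated outcomes by summing their probabilities.
collect :: Eq a => Dist a -> Dist a
collect [] = []
collect ((a, p) : rest) =
  (a, p + sum [ q | (b, q) <- rest, b == a ])
    : collect [ e | e@(b, _) <- rest, b /= a ]

-- Chapman-Kolmogorov: integrate out the middle variable.  On stochastic
-- matrices this is exactly matrix multiplication.
composeK :: Eq z => Kernel x y -> Kernel y z -> Kernel x z
composeK phi psi x = collect [ (z, p * q) | (y, p) <- phi x, (z, q) <- psi y ]

-- Pushforward of a distribution along a kernel: the same formula, with the
-- distribution viewed as a kernel out of a one-point domain.
pushforward :: Eq y => Dist x -> Kernel x y -> Dist y
pushforward pi phi = collect [ (y, p * q) | (x, p) <- pi, (y, q) <- phi x ]
```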
<h2>Bayesian inversion</h2>
<p>Suppose we have a Markov kernel $\mathbb P_\varphi [y \mid x]$ and we are shown a sample of its output, but we can’t see what the input was. What can we say about the input? To do this, we must start from some initial belief about how the input was distributed: a <em>prior</em> $\mathbb P_\pi [x]$. After observing $y$, <a href="https://en.wikipedia.org/wiki/Bayes%27_theorem">Bayes’ law</a> tells us how we should modify our belief to a <em>posterior distribution</em> that accounts for the new evidence. The formula is</p>
\[\mathbb P [x \mid y] = \frac{\mathbb P_\varphi [y \mid x] \cdot \mathbb P_\pi [x]}{\mathbb P_{\pi; \varphi} [y]}\]
<p>The problem of computing posterior distributions in practice is called <a href="https://en.wikipedia.org/wiki/Bayesian_inference">Bayesian inference</a>, and is very hard and very well studied.</p>
<p>If we fix $\pi$, it turns out that the previous formula for $\mathbb P [x \mid y]$ defines a Markov kernel from $Y$ to $X$, giving the posterior distribution for each possible observation. We call this the <em>Bayesian inverse</em> of $\varphi$ with respect to $\pi$, and write $\mathbb P_{\varphi^\dagger_\pi} [x \mid y]$.</p>
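<p>In an exact finite-support representation (distributions as outcome/probability lists; the types are repeated so the snippet stands alone, and the names are illustrative), the Bayesian inverse is a direct transcription of the formula: weight each $x$ by $\mathbb P_\varphi [y \mid x] \cdot \mathbb P_\pi [x]$ and divide by the evidence $\mathbb P_{\pi; \varphi} [y]$:</p>

```haskell
type Dist a = [(a, Double)]
type Kernel x y = x -> Dist y

-- Bayesian inverse of phi with respect to the prior: for each observation y,
-- weight each x by P[y|x] * P[x] and normalise by the marginal likelihood P[y].
-- (Assumes the observation y has nonzero probability under the prior.)
invert :: Eq y => Dist x -> Kernel x y -> Kernel y x
invert prior phi y =
  let joint    = [ (x, p * sum [ q | (y', q) <- phi x, y' == y ])
                 | (x, p) <- prior ]
      evidence = sum (map snd joint)
  in [ (x, p / evidence) | (x, p) <- joint ]
```

<p>For example, with a uniform prior on a boolean and a noisy sensor that reports <code>1</code> with probability 0.9 when true and 0.2 when false, observing <code>1</code> yields posterior probability $0.45 / 0.55 = 9/11$ that the input was true.</p>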
<p>The reason we can have $y$ as the input of the kernel but we had to pull out $\pi$ as a parameter is that the formula for Bayes’ law is <em>linear</em> in $y$ but <em>nonlinear</em> in $\pi$. This nonlinearity is really the thing that makes Bayesian inference hard.</p>
<p>Technically, Bayes’ law only considers <em>sharp</em> evidence, that is, we observe a particular point $y$. Considering inverse Markov kernels also gives us a way of handling <em>noisy</em> evidence, such as stochastic uncertainty in a measurement, by pushing forward a distribution $\mathbb P_\rho [y]$ to obtain $\mathbb P_{\rho; \varphi^\dagger_\pi} [x]$. This way of handling noisy evidence is sometimes called a <em>Jeffreys update</em>, and contrasted with a different formula called a <em>Pearl update</em> - see <a href="https://arxiv.org/abs/1807.05609">this paper</a> by <a href="https://www.cs.ru.nl/B.Jacobs/">Bart Jacobs</a>. Pearl updates have very different properties and I don’t know how they fit into this story, if at all. Provisionally, I consider the story of this post as evidence that Jeffreys updates are “right” in some sense.</p>
<h2>Deep inference</h2>
<p>So far we’ve introduced 2 operations on Markov kernels: composition and Bayesian inversion. Are they related to each other? The answer is a resounding <em>yes</em>: they are related by the formula</p>
\[(\varphi; \psi)^\dagger_\pi = \psi^\dagger_{\pi; \varphi}; \varphi^\dagger_\pi\]
<p>We call this the <em>chain rule</em> for Bayesian inversion, because of its extremely close resemblance to the chain rule for transpose Jacobians that underlies backpropagation in neural networks and differentiable programming:</p>
\[J^\top_x (f; g) = J^\top_{f (x)} (g) \cdot J^\top_x (f)\]
<p>The Bayesian chain rule is <em>extremely</em> folkloric. I conjectured it in 2019 while talking to Toby, and he proved it a few months later, writing it down in his unpublished preprint <a href="https://arxiv.org/abs/2006.01631">Bayesian Updates Compose Optically</a>. It’s definitely not new - <em>some</em> people already know this fact - but extremely few, and we failed to find it written down in a single place. (I feel like it should have been known by the 1950s at the latest, when things like dynamic programming were being worked out. Perhaps it’s one of the things that was well known in the Soviet Union but wasn’t discovered in the West until much later.) The first place Toby <em>published</em> this fact was in <a href="https://arxiv.org/abs/2305.06112">The Compositional Structure of Bayesian Inference</a> with <a href="https://dylanbraithwaite.github.io/about.html">Dylan Braithwaite</a> and me, which fixed a minor problem to do with zero-probability observations in a nice way.</p>
<p>What this formula tells us is that if we have a Markov kernel with a known factorisation, we can compute Bayesian posteriors efficiently if we already know the Bayesian inverse of each factor. Since this is exactly the same form as differentiable programming, we have good evidence that it can go <em>brrr</em>. At first I thought it was completely obvious that this must be how compilers for probabilistic programming languages work, but it turns out this is not the case at all: probabilistic programming languages are monolithic. I’ve given this general methodology for computing posteriors compositionally the catchy name <em>deep inference</em>, because of its very close structural resemblance to deep learning.</p>
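<p>The chain rule can be checked numerically on a small example. The following self-contained sketch (repeating the finite-support kernel representation, with made-up example kernels) verifies that both sides agree for a composite of kernels on $\{0, 1\}$:</p>

```haskell
type Dist a = [(a, Double)]
type Kernel x y = x -> Dist y

collect :: Eq a => Dist a -> Dist a
collect [] = []
collect ((a, p) : rest) =
  (a, p + sum [ q | (b, q) <- rest, b == a ])
    : collect [ e | e@(b, _) <- rest, b /= a ]

composeK :: Eq z => Kernel x y -> Kernel y z -> Kernel x z
composeK phi psi x = collect [ (z, p * q) | (y, p) <- phi x, (z, q) <- psi y ]

pushforward :: Eq y => Dist x -> Kernel x y -> Dist y
pushforward pi phi = collect [ (y, p * q) | (x, p) <- pi, (y, q) <- phi x ]

invert :: Eq y => Dist x -> Kernel x y -> Kernel y x
invert prior phi y =
  let joint    = [ (x, p * sum [ q | (y', q) <- phi x, y' == y ]) | (x, p) <- prior ]
      evidence = sum (map snd joint)
  in [ (x, p / evidence) | (x, p) <- joint ]

prob :: Eq a => Dist a -> a -> Double
prob d a = sum [ p | (b, p) <- d, b == a ]

-- Example prior and kernels on {0,1}, chosen arbitrarily.
pi0 :: Dist Int
pi0 = [(0, 0.5), (1, 0.5)]

phi, psi :: Kernel Int Int
phi 0 = [(0, 0.7), (1, 0.3)]
phi _ = [(0, 0.4), (1, 0.6)]
psi 0 = [(0, 0.8), (1, 0.2)]
psi _ = [(0, 0.1), (1, 0.9)]

-- Both sides of the chain rule, as kernels from Z to X.
lhs, rhs :: Kernel Int Int
lhs = invert pi0 (composeK phi psi)
rhs = composeK (invert (pushforward pi0 phi) psi) (invert pi0 phi)

chainRuleHolds :: Bool
chainRuleHolds =
  and [ abs (prob (lhs z) x - prob (rhs z) x) < 1e-9 | z <- [0, 1], x <- [0, 1] ]
```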
<h2>Variational inference</h2>
<p>I wrote “we can compute Bayesian posteriors efficiently if we already know the Bayesian inverse of each factor”, but this is still a big <em>if</em>: computing posteriors even of simple functions is still hard if the dimensionality is high. Numerical methods are used in practice to approximate the posterior, and we would like to make use of these while still exploiting compositional structure.</p>
<p>The usual way of approximating a Bayesian inverse $\varphi^\dagger_\pi$ is to cook up a functional form $\varphi^\prime_\pi (p)$ that depends on some parameters $p \in \mathbb R^N$. Then we find a loss function on the parameters with the property that minimising it causes the approximate inverse to converge to the exact inverse, i.e. $\varphi^\prime_\pi (p) \longrightarrow \varphi^\dagger_\pi$. This is called <em>variational inference</em>.</p>
<p>There are many ways to do this. Probably the most common loss function in practice is <a href="https://en.wikipedia.org/wiki/Kullback%E2%80%93Leibler_divergence">KL divergence</a> (aka <em>relative entropy</em>),</p>
\[\mathbf{KL} (\varphi^\dagger_\pi, \varphi^\prime_\pi (p)) = \int \mathbb P_{\varphi^\dagger_\pi} [x \mid y] \log \frac{\mathbb P_{\varphi^\dagger_\pi} [x \mid y]}{\mathbb P_{\varphi^\prime_\pi (p)} [x \mid y]} \, dx\]
<p>This expression is a function of $y$, which can optionally also be integrated over (but the next paragraph reveals a better way to use it). A closely related alternative is <a href="https://en.wikipedia.org/wiki/Evidence_lower_bound">variational free energy</a>, which despite being more complicated to define is more computationally tractable.</p>
<p>Ideally, we would like to use a functional form for which we can derive an analytic formula that tells us exactly how we should update our parameters to decrease the loss, given (possibly batched) Monte Carlo samples that are assumed to be drawn from a distribution in a certain class, such as Gaussians.</p>
<p>Of course in 2024 if you are <em>serious</em> then the functional form you use is a deep neural network, and you replace your favourite loss function by its derivative. I refer to this version as <em>deep variational inference</em>. There is no fundamental difference in theory, but in practice deep variational inference is necessary in order to go <em>brrr</em>.</p>
<h2>Passive inference is compositional</h2>
<p>Now, suppose we have two Markov kernels $\mathbb P_\varphi [y \mid x]$ and $\mathbb P_\psi [z \mid y]$ which we compose. Suppose we have a prior $\mathbb P_\pi [x]$ for $\varphi$, which pushes forward to a prior $\mathbb P_{\pi; \varphi} [y]$ for $\psi$. We pick a functional form for approximating each Bayesian inverse, which we call $\mathbb P_{\varphi^\prime_\pi (p)} [x \mid y]$ and $\mathbb P_{\psi^\prime_{\pi; \varphi} (q)} [y \mid z]$.</p>
<p>Doing this requires a major generalisation of our loss function. This was found by Toby Smithe in <a href="https://arxiv.org/abs/2109.04461">Compositional active inference 1</a>. The method he developed comes straight from <a href="https://arxiv.org/abs/1603.04641">compositional game theory</a>, and this appearance of virtually identical structure in game theory and Bayesian inference is absolutely one of the most core ideas of <a href="https://cybercat-institute.github.io/2022/05/29/what-is-categorical-cybernetics/">categorical cybernetics</a> as I envision it.</p>
<p>The idea is to define the loss of an approximate inverse to a kernel $\varphi : X \to Y$ in a <em>context</em> that includes not only a prior distribution on $X$, but also a (generally nonlinear) function $k$ called the <em>continuation</em>, that transforms probability distributions on $Y$. The continuation is a black box that describes how predictions transform into observations. Then when $y$ appears free in the expressions for KL divergence and variational free energy, we integrate it over the distribution $k (\pi; \varphi)$.</p>
<p>So for our composite kernel $\varphi; \psi$, as well as the prior $\pi$ on $X$ we also have a continuation $k$ that transforms distributions on $Z$. In order to optimise the parameters $(p, q)$ in this context, we divide them into two sub-problems:</p>
<ul>
<li>Optimise the parameters $p$ for $\varphi$ in the context given by the prior $\pi$ on $X$ and the continuation $k’$ on $Y$ given by $k’ (\sigma) = k (\sigma; \psi); \psi’_\sigma (q)$</li>
<li>Optimise the parameters $q$ for $\psi$ in the context given by the prior $\pi; \varphi$ on $Y$ and the continuation $k$ on $Z$</li>
</ul>
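<p>As a rough type-level sketch of the first sub-problem (illustrative names, finite-support distributions, and the approximate inverse of $\psi$ passed in as a function of its prior), the induced context for $\varphi$ can be written as:</p>

```haskell
type Dist a = [(a, Double)]
type Kernel x y = x -> Dist y

-- A context for inverting a kernel out of x with outputs in y: a prior on x
-- and a continuation transforming distributions on y.
data Context x y = Context { prior :: Dist x, continuation :: Dist y -> Dist y }

collect :: Eq a => Dist a -> Dist a
collect [] = []
collect ((a, p) : rest) =
  (a, p + sum [ q | (b, q) <- rest, b == a ])
    : collect [ e | e@(b, _) <- rest, b /= a ]

pushforward :: Eq y => Dist x -> Kernel x y -> Dist y
pushforward pi phi = collect [ (y, p * q) | (x, p) <- pi, (y, q) <- phi x ]

-- Context for the first factor phi of a composite phi ; psi, given the context
-- of the composite and the current approximate inverse psi' of psi (which
-- depends on its own prior): k'(sigma) = k(sigma ; psi) ; psi'_sigma.
contextForFirst :: (Eq y, Eq z)
                => Context x z -> Kernel y z -> (Dist y -> Kernel z y)
                -> Context x y
contextForFirst (Context pi k) psi psi' =
  Context pi (\sigma -> pushforward (k (pushforward sigma psi)) (psi' sigma))
```

<p>The second sub-problem needs no new machinery: its context is just the pushforward prior $\pi; \varphi$ together with the original continuation $k$.</p>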
<p>Notice that the optimisation step for $p$ involves the current value of $q$, but not vice versa. It is easy to prove that this method correctly converges to the total Bayesian inverse by a dynamic programming argument, if we first optimise $q$ to convergence and then optimise $p$. However, Toby and I conjecture that this procedure also converges if $p$ and $q$ are optimised asynchronously, which means the procedure can be parallelised.</p>
<h2>Active inference is emergent</h2>
<p>The convergence conjecture in the previous section crucially relies on the fact that the prediction kernels $\varphi$ and $\psi$ are fixed, and we are only trying to approximate their Bayesian inverses. That is why I referred to it as <em>passive inference</em>. The term <em>active inference</em> means several different things (more on this in the next section) but one thing it should mean is that we simultaneously learn to do both prediction and inference.</p>
<p>Toby and I think that if we do this, the compositionality result breaks. In particular, if we also have a parametrised family of prediction kernels $\varphi (p)$ which converge to our original kernel $\varphi$, it is <em>not</em> the case that</p>
\[\psi^\prime_{\pi; \varphi (p)} (q); \varphi^\prime_\pi (p) \longrightarrow (\varphi; \psi)^\dagger_\pi\]
<p>Specifically, we think that the nonlinear dependency of $\psi^\prime_{\pi; \varphi (p)} (q)$ on $\varphi^\prime (p)$ causes things to go wrong.</p>
<p>One way of saying this negative conjecture is: <em>compositional active inference can fail to converge to true beliefs, even in a stationary environment</em>. The main reason you’d want to do this anyway, even at the expense of getting the wrong answer, is that it might go <em>brrr</em> - but whether this is really true remains to be seen.</p>
<p>We can, however, put a positive spin on this negative result. I am known for the idea that <em>the opposite of compositionality is emergence</em>, from <a href="https://julesh.com/2017/04/22/on-compositionality/">this blog post</a>. A compositional active inference system does not behave like the sum of its parts. The interaction between components can prevent them from learning true beliefs, but can it do anything positive for us? So far we know nothing about how this emergent learning dynamics behaves, but our optimistic hope is that it could be responsible for what is normally called things like <em>intelligence</em> and <em>creativity</em> - on the basis that there aren’t many other places that they could be hiding.</p>
<h2>How to build a brain</h2>
<p>Boosted by the last paragraph, we now fully depart the realm of mathematical conjecture and enter the outer wilds of hot takes, increasing in temperature towards the end.</p>
<p>So far I’ve talked about active inference but not mentioned what is probably the most important thing in the cloud of ideas around the term: conflating <em>prediction</em> and <em>control</em>. Ordinarily, we would think of $\mathbb P_{\pi; \varphi} [y]$ as <em>prediction</em> and $\mathbb P_{\varphi^\dagger_\pi} [x \mid y]$ as <em>inference</em>. However it has been proposed (I believe the idea is due to <a href="https://www.fil.ion.ucl.ac.uk/~karl/">Karl Friston</a>) that in the end $\mathbb P_{\pi; \varphi} [y]$ is interpreted as a command: at the end of a chain of prediction-inference devices comes an actuator designed to act on the external environment in order to (try to) make the prediction true. That is, a prediction like “my arm will rise” is <em>the same thing</em> as the command “lift my arm” when connected to my arm muscles.</p>
<p>This lets us add one more piece to the puzzle, namely <em>reinforcement learning</em>. A deep active inference system can interact with an environment (either the real world or a simulated environment), by interpreting its ultimate predictions as commands, effecting those commands into the environment, and responding with fresh observations. Over time, the system should learn to predict the response of the environment, that is to say, it will learn an <em>internal model</em> of its environment. If several different active inference systems interact with the same environment, then we should consider the environment of each to contain the others, and expect each to learn a model of the others, recursively.</p>
<p>I am not a neuroscientist, but I understand it is at least plausible that the compositional structure of the mammalian cortex exactly reflects the compositional structure of deep active inference. The cortex is shaped (in the sense of connectivity) approximately like a pyramid, with both sensory and motor areas at the bottom. In particular, the brain is <em>not</em> a <a href="https://en.wikipedia.org/wiki/Series_of_tubes">series of tubes</a> with sensory signals going in at one end and motor signals coming out at the other end. Obviously the basic pyramid shape is overlaid with endless ad-hoc modifications at every scale, developed by evolution for various tasks. So following Hofstadter’s <a href="http://bert.stuy.edu/pbrooks/fall2014/materials/HumanReasoning/Hofstadter-PreludeAntFugue.pdf">Ant Fugue</a>, I claim <em>the cortex is shaped like an anthill</em>.</p>
<p>The idea is that the hierarchical structure is roughly an <em>abstraction</em> hierarchy. Predictions (aka commands) $\mathbb P_\varphi [y \mid x]$ travel down the hierarchy (towards sensorimotor areas), transforming predictions at a higher level of abstraction $\mathbb P_\pi [x]$ into predictions at a lower level of abstraction $\mathbb P_{\pi; \varphi} [y]$. Inferences $\mathbb P_{\varphi^\dagger_\pi} [x \mid y]$ travel up the hierarchy (away from sensorimotor areas), transforming observations at a lower level of abstraction $\mathbb P_\rho [y]$ into observations at a higher level of abstraction $\mathbb P_{\rho; \varphi^\dagger_\pi} [x]$.</p>
<p>Given this circularity, with observations depending on predictions recursively through many layers, I expect that the system will learn to predict <em>sequences</em> of inputs (as any recurrent neural network does, and notably <em>transformers</em> do extremely successfully) - and also <em>sequences of sequences</em> and so on. I predict that stability will increase up the hierarchy - that is, updates will usually be smaller at higher levels - so that at least conceptually, higher levels run on a slower timescale than lower levels. This comes back to ideas I first read almost 15 years ago in the book <a href="https://us.macmillan.com/books/9780805078534/onintelligence">On Intelligence</a> by Jeff Hawkins and Sandra Blakeslee.</p>
<p>Conceptually, this is exactly the same idea I wrote about in <a href="https://link.springer.com/chapter/10.1007/978-3-031-08020-3_9">chapter 9</a> of <a href="https://link.springer.com/book/10.1007/978-3-031-08020-3">The Road to General Intelligence</a> - the main difference is that now I think I have a good idea how to actually compute commands and observations in practice, whereas back then I hand-crafted a toy proof of concept.</p>
<p>If both sensory and motor areas are at the bottom of the hierarchy, this raises the obvious question of what is at the <em>top</em>. It probably has something to do with long term memory formation, but it is almost impossible to not be thinking about <em>consciousness</em> at this point. I’m going to step back from this so that the hot takes in this post don’t reach their ignition temperature before the next paragraph.</p>
<p>The single hottest take that I genuinely believe is that <em>deep variational reinforcement learning is all you need</em>, and is the only conceptually plausible route to what is sometimes sloppily called “AGI” and what I refer to in private as “true intelligence”.</p>
<p>I should mention that none of my collaborators is as optimistic as me that <em>deep variational reinforcement sequence learning is all you need</em>. Uniquely among my collaborators, I am a hardcore connectionist and I believe good old fashioned symbolic methods have no essential role to play. Time will tell.</p>
<p>My long term goal is <em>obviously</em> to build this, if it works. My short term goal is to build some baby prototypes starting with passive inference, to verify and demonstrate that what works in theory also works in practice. So watch this space, because the future might be wild…</p>
<p><em>Jules Hedges: This post is a writeup of a talk I gave at the Causal Cognition in Humans and Machines workshop in Oxford, about some work in progress I have with Toby Smithe. To a large extent this is my take on the theoretical work in Toby's PhD thesis, with the emphasis shifted from category theory and neuroscience to numerical computation and AI. In the last section I will outline my proposal for how to build AGI.</em></p>
<h1><a href="https://cybercat-institute.github.io//2024/01/16/How%20to%20Stay%20Locally%20Safe%20in%20a%20Global%20World">How to Stay Locally Safe in a Global World</a> (2024-01-16)</h1>
<p>Cross-posted from <a href="https://jadeedenstarmaster.wordpress.com/">Jade’s blog</a>: parts <a href="https://jadeedenstarmaster.wordpress.com/2023/12/06/how-to-stay-locally-safe-in-a-global-world/">1</a>, <a href="https://jadeedenstarmaster.wordpress.com/2023/12/17/how-to-stay-locally-safe-in-a-global-world-part-ii-defining-a-world-and-stating-the-problem/">2</a>, <a href="https://jadeedenstarmaster.wordpress.com/2023/12/17/how-to-stay-locally-safe-in-a-global-world-part-iii-the-global-safety-poset/">3</a></p>
<h2>Introduction</h2>
<p>Suppose your name is $x$ and you have a very important state machine $N_x : S_x \times \Sigma \to \mathcal{P}(S_x)$ that you cherish with all your heart. Because you love this state machine so much, you don’t want it to malfunction, and you have a subset $P \subseteq S_x$ which you consider to be safe. If your state machine ever leaves this safe space you are in big trouble, so you ask the following question: if you start in some subset $I \subseteq P$, will your state machine $N_x$ ever leave $P$? In math, you ask whether</p>
\[\mu (\blacksquare(-) \cup I) \subseteq P\]
<p>where $\mu$ is the least fixed point and $\blacksquare(-)$ indicates the next-time operator of the cherished state machine. What is the next-time operator?</p>
<p>Definition: For a function $N : X \times \Sigma \to \mathcal{P}(Y)$ there is a monotone function $\blacksquare_N : \mathcal{P}(X) \to \mathcal{P}(Y)$ given by</p>
\[\blacksquare_N(A) = \bigcup_{a \in A} \bigcup_{s \in \Sigma} N(a,s)\]
<p>In layspeak, the next-time operator sends a set of states to the set of all possible successors of those states.</p>
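When states and alphabet are finite, the next-time operator is easy to compute directly. A minimal sketch (the machine and alphabet below are invented for illustration):

```python
from typing import Callable, FrozenSet, Iterable

def next_time(N: Callable[[int, str], FrozenSet[int]],
              sigma: Iterable[str],
              A: FrozenSet[int]) -> FrozenSet[int]:
    """The next-time operator: the union of N(a, s) over all a in A, s in sigma."""
    out: set = set()
    for a in A:
        for s in sigma:
            out |= N(a, s)
    return frozenset(out)

# Toy machine on states {0, 1, 2}: each state steps to its successor, capped at 2.
def N(a: int, s: str) -> FrozenSet[int]:
    return frozenset({min(a + 1, 2)})

print(next_time(N, {"t"}, frozenset({0, 1})))  # the successor set {1, 2}
```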
<p>In a perfect world you could use these definitions to ensure safety using the formula</p>
\[\mu (\blacksquare(-) \cup I) = \bigcup_{n=0}^{\infty} (\blacksquare ( - ) \cup I)^n (\emptyset)\]
<p>or at least check safety up to an arbitrary time-step $n$ by computing this infinite union one step at a time.</p>
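This Kleene iteration can be run directly: starting from the empty set, repeatedly apply $A \mapsto \blacksquare(A) \cup I$ until nothing new appears, then check the result against $P$. A self-contained sketch with a made-up four-state machine (it terminates whenever the state space is finite):

```python
def reachable(box, I):
    """Least fixed point of A ↦ box(A) ∪ I, computed by Kleene iteration from ∅."""
    A = frozenset()
    while True:
        A_next = box(A) | I
        if A_next == A:
            return A
        A = A_next

# ■ for a toy machine on {0, 1, 2, 3}: each state steps to its successor, capped at 3.
def box(A):
    return frozenset(min(a + 1, 3) for a in A)

I = frozenset({0})
P = frozenset({0, 1, 2})
print(reachable(box, I))        # all of {0, 1, 2, 3} is reachable from I
print(reachable(box, I) <= P)   # False: state 3 is reachable but unsafe
```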
<p>Unfortunately there is a big problem with this method! Your state machine does not exist in isolation. You have a friend whose name is $y$ with their own state machine $N_y : S_y \times \Sigma \to \mathcal{P} (S_y)$. $y$ has the personal freedom to run their state machine how they like but there are functions</p>
\[N_{xy} : S_x \times \Sigma \to \mathcal{P}(S_y)\]
<p>and</p>
\[N_{yx} : S_y \times \Sigma \to \mathcal{P}(S_x)\]
<p>which allow states of your friend’s machine to change the states of your own and vice-versa. Making matters worse, there is a whole graph $G$ whose vertices are your friends and whose edges indicate that the corresponding state machines may affect each other. How can you be expected to ensure safety under these conditions?</p>
<p>But don’t worry, category theory comes to the rescue. In the next sections I will:</p>
<ul>
<li>State my model of the world and the local-to-global safety problem for this model (Part II)</li>
<li>Propose a solution to the local-to-global safety problem based on an enriched version of the Grothendieck construction (Part III)</li>
</ul>
<h2>Defining a World and Stating the Problem</h2>
<p>Suppose we have a directed graph $G=(V(G),E(G))$ representing our world. The vertices of this graph are the different agents in our world and an edge represents a connection between these agents. The semantics of this graph will be the following:</p>
<p>Definition: Let $\mathsf{Mach}$ be the directed graph whose vertices are sets and where there is an edge $e : X \to Y$ for every function</p>
\[e : X \times \Sigma \to \mathcal{P}(Y)\]
<p>A world is a morphism of directed graphs $W : G \to \mathsf{Mach}$.</p>
<p>A world has a set $S_x$ for each vertex $x$ called the local state over $\mathbf{x}$ and for each edge $e :x \to y$ a function $W(e) : S_x \times \Sigma_e \to \mathcal{P}(S_y)$ representing the state machine connecting the local state over $x$ to the local state over $y$. Note that self edges are ordinary state machines from a local state to itself. An example world may be drawn as follows:</p>
<p><img src="/assetsPosts/2023-12-18-How to Stay Locally Safe in a Global World/World.png" alt="Example World" /></p>
<p>Definition: Given a world $W: G \to \mathsf{Mach}$, the total machine of $W$ is the state machine
$\int W : \sum_{x \in V(G)} S_x \times \sum_{e \in E(G)} \Sigma_e \to \mathcal{P}( \sum_{x \in V(G)} S_x )$</p>
<p>given by</p>
\[( (s,x),(\tau,d)) \mapsto \bigcup_{e: x \to y} W(e) (s, \tau)\]
<p>The notation $\int$ is used based on the belief that this is some version of the Grothendieck construction. Exactly which flavor will be left to future work. The transition function of this state machine comes from unioning the transition functions of all the state machines associated to edges originating in a vertex.</p>
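The union formula above can be sketched concretely: represent a world as a list of edge machines with source and target vertices, and build the total machine on the tagged union of local states. The two-agent example below is invented for illustration.

```python
from collections import namedtuple

# An edge machine W(e) : S_src × Σ_e → P(S_tgt), tagged with its endpoints.
Edge = namedtuple("Edge", ["name", "src", "tgt", "step"])

def total_machine(edges):
    """((s, x), (tau, _)) ↦ union over edges e : x → y of W(e)(s, tau),
    with total states given by tagged pairs (local state, vertex)."""
    def step(state, inp):
        s, x = state
        tau, _ = inp
        out = set()
        for e in edges:
            if e.src == x:
                out |= {(t, e.tgt) for t in e.step(s, tau)}
        return frozenset(out)
    return step

# Two agents x and y: x's self-loop increments mod 2, and an edge x → y copies the state.
edges = [
    Edge("loop_x", "x", "x", lambda s, tau: {(s + 1) % 2}),
    Edge("x_to_y", "x", "y", lambda s, tau: {s}),
]
step = total_machine(edges)
print(step((0, "x"), ("tick", None)))  # the two successors (1, 'x') and (0, 'y')
```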
<p>Definition: Given a world $W : G \to \mathsf{Mach}$, a vertex $x \in V(G)$, and subsets $I,P \subset S_x$, we say that $I$ is locally safe in a global context if</p>
\[\mu (\blacksquare_{\int W} (-) \cup I) \subseteq P\]
<p>where $\blacksquare_{\int W}$ is the next-time operator of the state machine $\int W$.</p>
<p>The state machine $\int W$ may be large enough to make computing this least fixed point by brute force impractical. Therefore, we must leverage the compositional structure of $W$. We will see how to do this in the next post.</p>
<h2>The Global Safety Poset</h2>
<p>In this section we will give a compositional solution to the local safety problem in a global context in two steps:</p>
<ul>
<li>First by turning the world into a functor $\hat{W} : FG \to \mathsf{Poset}$</li>
<li>Then by gluing this functor into a single poset $\int \hat{W}$ whose inequalities solve the problem of interest.</li>
</ul>
<p>First we define the functor.</p>
<p>Given a world $W : G \to \mathsf{Mach}$, there is a functor</p>
\[\hat{W} : FG \to \mathsf{Poset}\]
<p>where</p>
<ul>
<li>$FG$ is the free category on the graph $G$,</li>
<li>$\mathsf{Poset}$ is the category whose objects are posets and whose morphisms are monotone functions.</li>
</ul>
<p>Functors from a free category are uniquely defined by their image on vertices and generating edges.</p>
<ul>
<li>For a vertex $x \in V(G)$, $\hat{W}(x) = \mathcal{P}(S_x)$,</li>
<li>for an edge $e : x \to y$, we define $\hat{W}(e): \mathcal{P}(S_x) \to \mathcal{P}(S_y)$ by $A \mapsto \blacksquare_{W(e)}(A)$</li>
</ul>
<p>Now for step two.</p>
<p>Given a functor $\hat{W} : FG \to \mathsf{Poset}$ defined from a world $W$, the <strong>global safety poset</strong> is a poset $\int \hat{W}$ where</p>
<ul>
<li>elements are pairs $(x \in V(G), A \subseteq S_x)$,</li>
<li>$(x, A) \leq (y, B) \iff \bigwedge_{f: x \to y \in FG} \hat{W} (f) (A) \subseteq B$</li>
</ul>
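The order relation above quantifies over all morphisms of $FG$, i.e. all paths in $G$, but it can still be decided: since each $\mathcal{P}(S_v)$ is finite, only finitely many pairs (vertex, subset) are reachable from $(x, A)$, so saturating over them terminates. A hedged sketch (the edge representation and example machine are invented):

```python
def poset_leq(edges, x, A, y, B):
    """Decide (x, A) ≤ (y, B): check that Ŵ(f)(A) ⊆ B for every path f : x → y
    in the free category FG, including the identity path when x == y.
    edges: list of (src, tgt, box) with box : P(S_src) → P(S_tgt)."""
    seen = {(x, A)}
    frontier = [(x, A)]
    while frontier:
        v, S = frontier.pop()
        if v == y and not (S <= B):
            return False
        for src, tgt, box in edges:
            if src == v:
                nxt = (tgt, box(S))
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append(nxt)
    return True

# One self-edge on vertex x whose ■ increments states mod 3.
edges = [("x", "x", lambda S: frozenset((s + 1) % 3 for s in S))]
print(poset_leq(edges, "x", frozenset({0}), "x", frozenset({0, 1, 2})))  # True
print(poset_leq(edges, "x", frozenset({0}), "x", frozenset({0, 1})))     # False
```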
<p>Theorem: Given a world $W : G \to \mathsf{Mach}$, a vertex $x \in V(G)$, and subsets $I, P \subseteq S_x$, $I$ is locally safe in a global context if and only if there is an inequality
$(x,I) \leq (x,P)$ in the global safety poset $\int \hat{W}$.</p>
<p>My half-completed proof of this theorem involves a square of functors</p>
<p><img src="/assetsPosts/2023-12-18-How to Stay Locally Safe in a Global World/commsquare.png" alt="Correctness Square" /></p>
<p>Going from right and then down, the first functor uses a Grothendieck construction to turn a world into a total state machine and then turns that state machine into its global safety poset. Going down and then right follows the construction detailed in the last two sections. The commutativity of this diagram should verify correctness. I will explain all of this in more detail later. Thanks for tuning in today!</p>
<p><em>Jade Master</em></p>
<h1><a href="https://cybercat-institute.github.io//2023/12/11/ai-safety-meets-value-chain-integrity">AI Safety Meets Value Chain Integrity</a> (2023-12-11)</h1>
<p><strong>tl;dr - Advanced AI making economic decisions in supply chains and markets creates poorly-understood risks, especially by undermining the fundamental concept of individuality of agents. We propose to research these risks by building and simulating models.</strong></p>
<p>For many years, AI has been routinely used for economic decision making. Two major roles it has traditionally played are high frequency trading and algorithmic pricing. Traditionally these are quite simple, at the level of tabular Q-learning agents. Even these comparatively simple algorithms can behave in unexpected ways due to emergent interactions in an economic environment. Probably the most infamous of these events was the <a href="https://en.wikipedia.org/wiki/2010_flash_crash">flash crash</a>, for which algorithmic high speed trading was a major contributing cause. Much less well known is the subtle issue of <em>implicit collusion</em> in pricing algorithms, which are ubiquitous in several markets such as airline tickets and Amazon: <a href="https://www.aeaweb.org/articles?id=10.1257/aer.20190623">a widely cited 2020 paper</a> found that even very simple tabular Q-learning will converge to prices higher than the Nash equilibrium price - but <a href="https://arxiv.org/abs/2201.00345">our research</a> found that this depends sensitively on the exact method of training, and the effect vanishes when the algorithms are trained independently in simulated markets.</p>
<p>Besides markets, AI is also already used for making decisions in supply chains (see for example [<a href="https://www.thomsonreuters.com/en-us/posts/technology/ai-supply-chains/">1</a> <a href="https://www.mckinsey.com/capabilities/operations/our-insights/autonomous-supply-chain-planning-for-consumer-goods-companies">2</a> <a href="https://www.forbes.com/sites/forbestechcouncil/2023/08/08/ais-role-in-supply-chain-management-and-how-organizations-can-get-started/">3</a> <a href="https://www.accenture.com/us-en/blogs/business-functions-blog/generative-ai-why-smarter-supply-chains-are-here">4</a>]), and surely will be more so in the future. Contemporary supply chains are extraordinarily complex. A typical modern technology product can have hundreds of thousands of components sourced from ten thousand suppliers across half a dozen tiers, which need to be shipped across the globe for final assembly. A single five-dollar part can stop an assembly line, which in industries like automotive can cost millions per hour of downtime. The worst type of inventory a company can carry is a 99.9% finished product it cannot sell. Over time, supply chains have been hyper-optimised at the expense of integrity, so that a metaphorical perfect storm in the shape of an <a href="https://en.wikipedia.org/wiki/2010_eruptions_of_Eyjafjallaj%C3%B6kull">Icelandic volcano named Eyjafjallajökull erupting</a> or a <a href="https://en.wikipedia.org/wiki/2021_Suez_Canal_obstruction">container ship named <em>Ever Given</em> getting stuck in the Suez Canal</a> caused massive disruption that inevitably led to delayed goods, spoiled perishables, lawsuits and contested insurance claims easily in the ten digits.
The <a href="https://www.ey.com/en_gl/supply-chain/how-covid-19-impacted-supply-chains-and-what-comes-next">COVID-19 pandemic</a> was a business school case study for all the types of havoc supply chain disruptions can wreak, oscillating wildly from not enough containers to too many containers in port, obstructing the handling of cargo, from COVID-related work shutdowns in China to sudden shifts in consumer behavior in Western countries, leading to layoffs in hospitality industries and labour shortages in production and transportation. Beyond these knock-on effects, which can explode planning horizons for procurement and shift the delicate power balance from buyer to supplier, another major problem in supply chains is the knock-off effect: fashion brands and pharmaceutical companies alike fight the problem of counterfeit products being introduced into the supply chain when no one is looking, leading to multi-million dollar losses along with reputational damage and, especially in pharmaceuticals, posing a hazard to health and life for many. Supply chain integrity depends crucially on transparency across a multitude of participants who are typically less than eager to share confidential data.</p>
<p>Moving forward from these events, the delicate tradeoff between efficiency and integrity is a perfect use-case for the integrated and inter-connected decision-making that is afforded by AI.</p>
<p>This brings us to the issue of economic decisions being deferred to large language models such as GPT-4. The well known examples are not “natively economic”, but many people are adapting transformer architectures to operate on various types of data besides linguistic data, and it is only a matter of time before there are “economics LLMs”. In the meantime, GPT is entirely capable of making economic decisions with the right prompting - although virtually nothing is known about its performance on these types of tasks. We do not recommend using GPT to make investment decisions for you, but we expect it to become widespread anyway, if it isn’t already. Similarly, we expect large parts of complex supply chains to be almost entirely deferred to AI, extending the existing automation and its associated benefits and risks.</p>
<h2>AI undermines individuality in economics</h2>
<p>The traditional (tabular Q-learning) and contemporary (LLMs) situations are very different in many ways, but they have a subtle and crucial point in common. This is that decisions that look independent are secretly connected. There are two ways this could happen: one is that human decision-makers defer to off-the-shelf software that comes from the same upstream supplier - as is the case for algorithmic pricing in the airline industry for example. The other is that there really is a single instance of the AI system in the world and everybody is calling into it - as is the case with GPT.</p>
<p>For off-the-shelf implementations of tabular Q-learning for algorithmic pricing, there is some evidence that having a single upstream supplier has a significant impact on the behaviour of the market, and this is something that regulators are actively investigating. For LLMs virtually nothing is known, but we expect that the situation is worse. At the very least, the situation will certainly be more unpredictable, and we expect the compounding of implicit biases to be worse as these systems become ubiquitous and deeply embedded into decision-making. We plan to research this, by building economic simulations where decisions are made by advanced AIs and studying their behaviour.</p>
<p>A further possibility is more hypothetical, but we expect it to become a reality within the next few years. Right now the technology behind large language models - generative transformers - mainly operates on textual data, but it is actively being adapted for other types of data, and for other tasks besides text generation. Making economic decisions is very similar to playing games, and so there is an obvious analogy to the wildly successful application of deep reinforcement learning to strategically complex game playing tasks such as Go and StarCraft 2 by DeepMind. Combining this with generative transformer architectures could be immensely powerful, and it is not hard to believe such a system could surpass human performance on the task of economic decision-making.</p>
<h2>Modelling for harm prevention</h2>
<p>Compositional game theory - a technology that we <a href="https://arxiv.org/abs/1603.04641">developed</a> and <a href="https://github.com/CyberCat-Institute/open-game-engine">implemented</a> - is currently the state of the art for implementing complex meso-scale microeconomic models. The way things are traditionally done, models are written first in mathematics and are later converted into computational models in general purpose languages (traditionally Fortran, but increasingly in modern languages such as Python), a process that is very slow and very prone to introducing hard-to-detect errors. We use a <em>model is code</em> paradigm, where both the mathematical and computational languages are modified to bring them very close to each other - most commonly we build our models directly in code, with a clean separation of concerns between the economic and computational parts. Our models are not inherently more accurate, but they are 2 orders of magnitude faster and cheaper to build, and this unlocks our secret weapon: <em>rapid prototyping models</em>. By iterating quickly, and continuously obtaining feedback from data and stakeholders, we reach a better model than could be built monolithically.</p>
<p>Why do we want to build these models? The bigger picture is, we want to inform the discussion about regulation of AI. This discussion is already widespread at the highest level of governments around the world, but is currently heavily lacking in evidence one way or the other. There’s a good reason for this: the domain of LLMs is language, and it is extremely difficult to make convincing predictions about the possible harms that can happen mediated by linguistic communication. More restricted domains, such as the behaviours of API bots, are easier to reason about. We have identified the general realm of economic decision-making as a critically under-explored part of the general AI safety question, which our tools are well-placed to explore through modelling and simulations.</p>
<p>Our implementation of compositional game theory allows modularly switching the algorithm that each player uses for making decisions. Normally when doing applied game theory we use a Monte Carlo optimiser for every player. But we also have <a href="https://github.com/CyberCat-Institute/open-games-RLib">a version</a> that calls a Python implementation of Q-learning over a web socket. We could also easily switch it to calls to an open source LLM, or API calls to a GPT API bot or similar.</p>
<p>What’s more, this is emphatically <em>not</em> a mere hack that we bolt on top of game theory. At the core of our whole approach is our discovery, as seen in <a href="https://arxiv.org/abs/2105.06332">this paper</a>, that the foundations of compositional game theory and several branches of machine learning are extremely closely related - this foundation is what we call <a href="https://cybercat.institute/2022/05/29/what-is-categorical-cybernetics/">categorical cybernetics</a>. This foundation is what guides us and tells us that what we are doing is really meaningful. More than that, though, it opens a realistic possibility that we can know <em>qualitative</em> things about the behaviour of AIs making economic decisions, a much higher level of confidence than making inferences from simulation results. And when it comes to informing the discussion on regulation when the stakes are as high as they are, more certainty is always better.</p>
<h2>What if?</h2>
<p>So far we have focussed on the likely negative <em>accidental</em> impacts AI is likely to have on markets and supply chains, where they perform their intended purpose locally but interact in unforeseen ways. This is already concerning, but there is another side to the issue. What if decisions that should be independent are made by a single AI that has “gone rogue”, i.e. has a goal that is not the intended one? Depending on your personal assessment of the likelihood of this situation you could read this section as a fun thought experiment or a warning.</p>
<p>Being handed direct control of markets and supply chains gives perhaps the most powerful leverage over the physical world that an AI could have. Since it can <em>collude with itself</em>, it can easily create behaviours that would never be possible when decisions are made by agents that are independent and at least somewhat rational.</p>
<p>By far the most straightforward outcome of this situation is chaos. Markets and supply chains are so deeply interconnected that it would take very little intentional damage to create a recession deep enough to bring society to its knees. However, by virtually destroying the institutions that it controls this makes it a one-time event, which while extremely bad, would be easily recoverable for humanity as a whole.</p>
<p>Much worse would be the ability of a rogue AI to subtly direct real-world resources towards a secret goal of its own over a long period of time. It isn’t a hypothetical that complex supply chains can easily hide parts of themselves: consider how widespread is modern slavery in the supply chains of consumer electronics, or how the US government secretly procured the resources needed to build the first nuclear weapons at a time when supply chains were much simpler.</p>
<h2>Conclusion</h2>
<p>It is arguable exactly how extensive the risks of allowing AIs to interact with economic systems are, and the scenarios described in the previous section remain hypothetical. However, it is undeniable that some serious risks do exist, including already-observed events such as flash crashes and implicit collusion. We have identified that the specific factor of decision-makers using the same upstream provider of decision-making software leads to poorly-understood emergent behaviours of supply chains and markets.</p>
<p>Our theoretical framework, compositional game theory, and our implementation of it, the open game engine, are the perfect tools for building and simulating models of economic situations with AI decision-makers. The goal of creating these models is to produce evidence leading to a better-informed debate on issues around the regulation of AI.</p>
<p><em>Jules Hedges</em></p>
<h1><a href="https://cybercat-institute.github.io//2023/11/26/test-post">About the CyberCat Institute blog</a> (2023-11-26)</h1>
<p>The Cybercat blog website is based on the <a href="https://jekyllthemes.io/theme/whiteglass">Whiteglass</a> theme.</p>
<h2>TOC <!-- omit in toc --></h2>
<ul>
<li><a href="#workflow">Workflow</a>
<ul>
<li><a href="#previewing">Previewing</a></li>
</ul>
</li>
<li><a href="#post-preamble">Post preamble</a></li>
<li><a href="#latex">Latex</a>
<ul>
<li><a href="#theorem-environments">Theorem environments</a>
<ul>
<li><a href="#referencing">Referencing</a></li>
</ul>
</li>
<li><a href="#typesetting-diagrams">Typesetting diagrams</a>
<ul>
<li><a href="#quiver">Quiver</a></li>
<li><a href="#tikz">Tikz</a></li>
<li><a href="#referencing-1">Referencing</a></li>
</ul>
</li>
</ul>
</li>
<li><a href="#images">Images</a>
<ul>
<li><a href="#referencing-2">Referencing</a></li>
</ul>
</li>
<li><a href="#code">Code</a></li>
</ul>
<h2>Workflow</h2>
<p>Standard github workflow:</p>
<ul>
<li>Clone this repo</li>
<li>Create a branch</li>
<li>Write your post</li>
<li>Make a PR</li>
<li>Wait for approval</li>
</ul>
<p>The blog will be automatically rebuilt once your PR is merged.</p>
<h3>Previewing</h3>
<p>Since the blog uses Jekyll, you will need to <a href="https://jekyllrb.com/docs/installation/">install it</a> or use the included nix flake devshell (just run <code class="language-plaintext highlighter-rouge">nix develop</code> with flakes-enabled nix installed) to be able to preview your contents. Once the installation is complete, just navigate to the repo folder and run <code class="language-plaintext highlighter-rouge">bundle exec jekyll serve</code>. Jekyll will spawn a local server (usually at <code class="language-plaintext highlighter-rouge">127.0.0.1:4000</code>) that will allow you to preview the blog locally.</p>
<h2>Post preamble</h2>
<p>Posts must be placed in the <code class="language-plaintext highlighter-rouge">_posts</code> folder. Post filenames follow the convention <code class="language-plaintext highlighter-rouge">yyyy-mm-dd-title.md</code>. Post assets (such as images) go in the folder <code class="language-plaintext highlighter-rouge">assetsPosts</code>, where you should create a folder with the same name as the post.</p>
<p>Each post should start with the following preamble:</p>
<div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nn">---</span>
<span class="na">layout</span><span class="pi">:</span> <span class="s">post</span>
<span class="na">title</span><span class="pi">:</span> <span class="s">the title of your post</span>
<span class="na">author</span><span class="pi">:</span> <span class="s">your name</span>
<span class="na">categories</span><span class="pi">:</span> <span class="s">keyword or a list of keywords [keyword1, keyword2, keyword3]</span>
<span class="na">excerpt</span><span class="pi">:</span> <span class="s">A short summary of your post</span>
<span class="na">image</span><span class="pi">:</span> <span class="s">assetsPosts/yourPostFolder/imageToBeUsedAsThumbnails.png This is optional, but useful if e.g. you share the post on Twitter.</span>
<span class="na">usemathjax</span><span class="pi">:</span> <span class="kc">true</span><span class="s"> (omit this line if you don't need to typeset math)</span>
<span class="na">thanks</span><span class="pi">:</span> <span class="s">A short acknowledgement message. It will be shown immediately above the content of your post.</span>
<span class="nn">---</span>
</code></pre></div></div>
<p>As for the content of the post, it should be typeset in markdown.</p>
<h2>Latex</h2>
<ul>
<li>Inline math is shown by using <code class="language-plaintext highlighter-rouge">$ ... $</code>. Notice that some expressions such as <code class="language-plaintext highlighter-rouge">a_b</code> typeset correctly, while expressions like <code class="language-plaintext highlighter-rouge">a_{b}</code> or <code class="language-plaintext highlighter-rouge">a_\command</code> sometimes do not. This is presumably because MathJax expects <code class="language-plaintext highlighter-rouge">_</code> to be followed by a literal.</li>
<li>Display math is shown by using <code class="language-plaintext highlighter-rouge">$$ ... $$</code>. The problem above doesn’t show up in this case, but you gotta be careful:
<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code> text
$$ ... $$
text
</code></pre></div> </div>
<p>does not typeset correctly, whereas:</p>
<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code> text
$$
...
$$
text
</code></pre></div> </div>
<p>does. You can also use environments, as in:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> $$
\begin{align*}
...
\end{align*}
$$
</code></pre></div> </div>
</li>
</ul>
<h3>Theorem environments</h3>
<p>We provide the following theorem environments: Definition, Proposition, Lemma, Theorem and Corollary. Numbering is automatic. If you need others, just ask. The way these work is as follows:</p>
<div class="language-latex highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="p">{</span><span class="c">% def %}</span>
A *definition* is a blabla, such that: <span class="p">$</span><span class="nb">...</span><span class="p">$</span>. Furthermore, it is:
<span class="p">$$</span><span class="nb">
...
</span><span class="p">$$</span>
<span class="p">{</span><span class="c">% enddef %}</span>
</code></pre></div></div>
<p>This gets rendered as follows:</p>
<div class="definition">
<p>A <em>definition</em> is a blabla, such that: $…$. Furthermore, it is:</p>
\[...\]
</div>
<p>Numbering is automatic. Use the tags:</p>
<div class="language-latex highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="p">{</span><span class="c">% def %}</span>
For your definitions
<span class="p">{</span><span class="c">% enddef %}</span>
<span class="p">{</span><span class="c">% not %}</span>
For your notations
<span class="p">{</span><span class="c">% endnot %}</span>
<span class="p">{</span><span class="c">% ex %}</span>
For your examples
<span class="p">{</span><span class="c">% endex %}</span>
<span class="p">{</span><span class="c">% diag %}</span>
For your diagrams
<span class="p">{</span><span class="c">% enddiag %}</span>
<span class="p">{</span><span class="c">% prop %}</span>
For your propositions
<span class="p">{</span><span class="c">% endprop %}</span>
<span class="p">{</span><span class="c">% lem %}</span>
For your lemmas
<span class="p">{</span><span class="c">% endlem %}</span>
<span class="p">{</span><span class="c">% thm %}</span>
For your theorems
<span class="p">{</span><span class="c">% endthm %}</span>
<span class="p">{</span><span class="c">% cor %}</span>
For your corollaries
<span class="p">{</span><span class="c">% endcor %}</span>
</code></pre></div></div>
<h4>Referencing</h4>
<p>If you need to reference results just append a <code class="language-plaintext highlighter-rouge">{"id":"your_reference_tag"}</code> after the tag, where <code class="language-plaintext highlighter-rouge">your_reference_tag</code> works like a LaTeX label. For example:</p>
<div class="language-latex highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="p">{</span><span class="c">% def {"id":"your_reference_tag"} %}</span>
A *definition* is a blabla, such that: <span class="p">$</span><span class="nb">...</span><span class="p">$</span>. Furthermore, it is:
<span class="p">$$</span><span class="nb">
...
</span><span class="p">$$</span>
<span class="p">{</span><span class="c">% enddef %}</span>
</code></pre></div></div>
<p>Then you can reference this by doing:</p>
<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code>As we remarked in <span class="p">[</span><span class="nv">Reference description</span><span class="p">](</span><span class="sx">#your_reference_tag</span><span class="p">)</span>, we are awesome...
</code></pre></div></div>
<h3>Typesetting diagrams</h3>
<p>We support two types of diagrams: quiver and TikZ.</p>
<h4>Quiver</h4>
<p>You can render <a href="https://q.uiver.app/">quiver</a> diagrams by enclosing quiver exported iframes between <code class="language-plaintext highlighter-rouge">quiver</code> tags:</p>
<ul>
<li>On <a href="https://q.uiver.app/">quiver</a>, click on <code class="language-plaintext highlighter-rouge">Export: Embed code</code></li>
<li>Copy the code</li>
<li>In the blog, put it between delimiters as follows:</li>
</ul>
<div class="language-html highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
{% quiver %}
<span class="c"><!-- https://q.uiver.app/codecodecode--></span>
<span class="nt"><iframe</span> <span class="na">codecodecode</span><span class="nt">></iframe></span>
{% endquiver %}
</code></pre></div></div>
<p>They get rendered as follows:</p>
<div class="quiver">
<!-- https://q.uiver.app/#q=WzAsMyxbMCwwLCJYIl0sWzEsMiwiQiJdLFsyLDAsIkEiXSxbMCwxLCJnIiwxXSxbMiwxLCJmIiwxXSxbMCwyLCJoIiwxXV0= -->
<iframe class="quiver-embed" src="https://q.uiver.app/#q=WzAsMyxbMCwwLCJYIl0sWzEsMiwiQiJdLFsyLDAsIkEiXSxbMCwxLCJnIiwxXSxbMiwxLCJmIiwxXSxbMCwyLCJoIiwxXV0=&embed" width="432" height="432" style="border-radius: 8px; border: none;"></iframe>
</div>
<p><strong>Should the picture come out cropped, select <code class="language-plaintext highlighter-rouge">fixed size</code> when exporting the quiver diagram, and choose some suitable parameters.</strong></p>
<h4>Tikz</h4>
<p>You can render tikz diagrams by enclosing tikz code between <code class="language-plaintext highlighter-rouge">tikz</code> tags, as follows:</p>
<div class="language-latex highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="p">{</span><span class="c">% tikz %}</span>
<span class="nt">\begin{tikzpicture}</span>
<span class="k">\draw</span> (0,0) circle (1in);
<span class="nt">\end{tikzpicture}</span>
<span class="p">{</span><span class="c">% endtikz %}</span>
</code></pre></div></div>
<p>Tikz renders as follows:</p>
<div class="tikz"><script type="text/tikz">
\rotatebox{0}{
\scalebox{1}{
\begin{tikzpicture}
\node[circle, fill, minimum size=5pt, inner sep=0pt, label=left:{$1$}] (al1) at (-2,0) {};
\node[circle, fill, minimum size=5pt, inner sep=0pt, label=right:{$1$}] (ar1) at (0,0) {};
\node[circle, fill, minimum size=5pt, inner sep=0pt, label=right:{$2$}] (ar2) at (0,-1) {};
\node[circle, fill, minimum size=5pt, inner sep=0pt, label=right:{$3$}] (ar3) at (0,-2) {};
\draw[thick] (al1) to (ar1);
\draw[thick, out=180, in=180, looseness=2] (ar2) to (ar3);
\end{tikzpicture}
}
}
</script></div>
<p>Notice that at the moment tikz rendering:</p>
<ul>
<li>Supports anything you would put after <code class="language-plaintext highlighter-rouge">\begin{document}</code> in a <code class="language-plaintext highlighter-rouge">.tex</code> file. So you can use this to include anything you’d typeset with LaTeX (but we STRONGLY advise against it).</li>
<li>Does not support anything that should go in the LaTeX preamble, that is, before <code class="language-plaintext highlighter-rouge">\begin{document}</code>. This includes external tikz libraries such as <code class="language-plaintext highlighter-rouge">calc</code>, <code class="language-plaintext highlighter-rouge">arrows</code>, etc., and packages such as <code class="language-plaintext highlighter-rouge">tikz-cd</code>. Should you need <code class="language-plaintext highlighter-rouge">tikz-cd</code>, use quiver as explained above. If you need fancier stuff, you’ll have to render the tikz diagrams yourself and import them as images (see below).</li>
</ul>
<h4>Referencing</h4>
<p>Referencing also works for the quiver and tikz tags, as in:</p>
<div class="language-latex highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="p">{</span><span class="c">% tikz {"id":"your_reference_tag"} %}</span>
...
<span class="p">{</span><span class="c">% endtikz %}</span>
</code></pre></div></div>
<p>This automatically creates a numbered ‘Figure’ caption under the figure, as in:</p>
<div class="quiverCaption" id="example"><div class="quiver">
<!-- https://q.uiver.app/#q=WzAsMyxbMCwwLCJYIl0sWzEsMiwiQiJdLFsyLDAsIkEiXSxbMCwxLCJnIiwxXSxbMiwxLCJmIiwxXSxbMCwyLCJoIiwxXV0= -->
<iframe class="quiver-embed" src="https://q.uiver.app/#q=WzAsMyxbMCwwLCJYIl0sWzEsMiwiQiJdLFsyLDAsIkEiXSxbMCwxLCJnIiwxXSxbMiwxLCJmIiwxXSxbMCwyLCJoIiwxXV0=&embed" width="432" height="432" style="border-radius: 8px; border: none;"></iframe>
</div></div>
<p>Whenever possible, we encourage you to enclose diagrams in definitions/propositions/etc. should you need to reference them.</p>
<h2>Images</h2>
<p>Images are included via standard markdown syntax:</p>
<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">![</span><span class="nv">image description</span><span class="p">](</span><span class="sx">image_path</span><span class="p">)</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">image_path</code> can be a remote link. Should you need to upload images for your post, proceed as follows:</p>
<ul>
<li>Create a folder in <code class="language-plaintext highlighter-rouge">assetsPosts</code> with the same name as the blog post file. So if the blog post file is <code class="language-plaintext highlighter-rouge">yyyy-mm-dd-title.md</code>, create the folder <code class="language-plaintext highlighter-rouge">assetsPosts/yyyy-mm-dd-title</code></li>
<li>Place your images there</li>
<li>Reference the images by doing:
<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code> !<span class="p">[</span><span class="nv">image description</span><span class="p">](</span><span class="sx">../assetsPosts/yyyy-mm-dd-title/image</span><span class="p">)</span>
</code></pre></div> </div>
</li>
</ul>
<p>Whenever possible, we recommend that images be in <code class="language-plaintext highlighter-rouge">.png</code> format, <code class="language-plaintext highlighter-rouge">800</code> pixels in width, with a <strong>transparent</strong> background. Ideally, they should be easily readable on the light gray background of the blog website. You can stray from these guidelines if you have no alternative, but our definition and your definition of ‘I had no alternative’ may be different, and <em>we may complain</em>.</p>
<h4>Referencing</h4>
<p>Referencing works exactly as for diagrams:</p>
<div class="language-latex highlighter-rouge"><div class="highlight"><pre class="highlight"><code>
<span class="p">{</span><span class="c">% figure {"id":"your_reference_tag"} %}</span>
![image description](image<span class="p">_</span>path)
<span class="p">{</span><span class="c">% endfigure %}</span>
</code></pre></div></div>
<h2>Code</h2>
<p>CyberCat blog offers support for code snippets:</p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">print_hi</span><span class="p">(</span><span class="nb">name</span><span class="p">)</span>
<span class="nb">puts</span> <span class="s2">"Hi, </span><span class="si">#{</span><span class="nb">name</span><span class="si">}</span><span class="s2">"</span>
<span class="k">end</span>
<span class="n">print_hi</span><span class="p">(</span><span class="s1">'Tom'</span><span class="p">)</span>
<span class="c1">#=> prints 'Hi, Tom' to STDOUT.</span>
</code></pre></div></div>
<p>To include a code snippet, just write:</p>
<div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">```</span><span class="nl">language the snippet is written in
</span><span class="sb">your code</span>
<span class="p">```</span>
</code></pre></div></div>
<p>Check out the <a href="https://jekyllrb.com/docs/home">Jekyll docs</a> for more info on how to get the most out of Jekyll. File all bugs/feature requests at <a href="https://github.com/jekyll/jekyll">Jekyll’s GitHub repo</a>. If you have questions, you can ask them on <a href="https://talk.jekyllrb.com/">Jekyll Talk</a>.</p>
<p><em>Fabrizio Genovese</em></p>
<h1><a href="https://cybercat-institute.github.io//2022/06/24/a-software-engine-for-game-theoretic-modelling-part-2">A Software Engine For Game Theoretic Modelling - Part 2</a></h1>
<p>2022-06-24</p>
<h2>Introduction</h2>
<p>Some time ago, in a <a href="https://statebox.org/blog/compositional-game-engine/">previous blog post</a>, we introduced our software engine for game theoretic modelling. In this post, we expand more on how to apply the engine to use cases relevant for the Ethereum ecosystem. We will consider an analysis of a simplified staking protocol. Our focus will be on compositionality – what this means from the perspective of representing protocols and from the perspective of analyzing protocols.</p>
<p>We end with an outlook on the further development of the engine, what its current limitations are and how we work on overcoming them.</p>
<p>The codebase of the example discussed can be found <a href="https://github.com/20squares/block-validation">here</a>. If you have never seen the engine before, we advise you to go back to our earlier post. Also note that there exists a basic <a href="https://github.com/philipp-zahn/open-games-engine/blob/master/Tutorial/TUTORIAL.md">tutorial</a> that explains how the engine works. Lastly, here is a recent <a href="https://www.youtube.com/watch?v=fucygCyCyo8">presentation</a> Philipp gave at the <a href="https://ef-events.notion.site/ETHconomics-Devconnect-676d73f791684e18bfae35bbc9e1fa90">Ethconomics workshop at DevConnect Amsterdam</a>.</p>
<h2>Preliminaries</h2>
<p>Consider a simplified model of a staking protocol. The staking protocol is motivated by <a href="https://ethereum.org/en/developers/docs/consensus-mechanisms/pos/">Ethereum proof of stake</a>. The model we introduce is relevant because, even though simple, it sheds light on how a previous version of the staking protocol was subject to reorg attacks, as discussed in this <a href="https://arxiv.org/abs/2110.10086">paper</a>. We thank Barnabé Monnot for pointing us to the problem in the first place and helping us with the specification and modelling.</p>
<p>In what follows, we give a short verbal summary of the protocol.</p>
<p>To begin with, we model a chain as a (compositional) relation. The chain contains blocks with unique identifiers as well as voting weights. The weights correspond to votes by validators on the specific blocks contained in the chain. Here is an example of such a chain in the case of two validators:</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/chain.png" alt="Example chain for two validators" /></p>
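<p>To make this concrete, here is a minimal Haskell sketch of a chain with voting weights, an illustration only and not the engine’s actual representation: a rose tree whose nodes carry a block id and its votes, together with a function computing the head of the heaviest (longest weighted) chain. The names <code class="language-plaintext highlighter-rouge">Chain</code> and <code class="language-plaintext highlighter-rouge">heaviestHead</code> are assumptions made for this sketch.</p>

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Illustrative chain type: each node is a block with an id and the
-- number of votes attesting to it; children are competing successors.
data Chain = Node Int Int [Chain]

-- Head of the heaviest chain: recursively pick the subtree whose best
-- path accumulates the most votes, and return that head's id together
-- with the total weight along the path.
heaviestHead :: Chain -> (Int, Int)
heaviestHead (Node i w [])       = (i, w)
heaviestHead (Node _ w children) =
  let (hid, hw) = maximumBy (comparing snd) (map heaviestHead children)
  in  (hid, w + hw)
```

<p>On a fork where one branch has accumulated more votes than the other, <code class="language-plaintext highlighter-rouge">heaviestHead</code> returns the tip of the heavier branch.</p>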
<p>The staking protocol consists of episodes. Within each episode, which lasts for several time steps, a <em>proposer</em> can decide whether or not to extend the chain by a further block. If the proposer extends the chain, he chooses which block to build on. Consider the following example, where the proposer extends the above chain:</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/extendedchain.png" alt="Example chain for two validators and a new block proposed" /></p>
<p>The new block he generates initially has no votes attesting to it being the legitimate successor. This assessment is conducted by two validators.</p>
<p>These two validators observe the state of the chain from before their episode started, as well as any change the proposer makes to the chain within their episode. The validators can then vote on the block which they view as the legitimate successor. Here is the continued example from above:</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/extendedandvoted.png" alt="Example chain for two validators, new block proposed, and voted on" /></p>
<p>Both the proposer’s and the validators’ choices will be evaluated in the next episode. If the decisions they made, i.e. the proposer’s building on a specific block and the validators’ voting, are on the path to the longest weighted chain, they will receive a reward.</p>
<p>From a modelling perspective, this is an important feature. The agents’ remuneration in episode $t$ will be determined in episode $(t+1)$. We will come back to this feature.</p>
<p>So far, the setup seems simple enough. However, the picture is complicated by possible network issues. Messages may be delayed. For instance, the two validators might not observe a message by the proposer in their episode simply due to the network being partitioned.</p>
<p>Hence, in this specific case, when a message does <em>not</em> reach them, the validators cannot be sure whether the message was actually never sent, or whether they just have not received it yet.</p>
<p>Real world network issues like delay complicate the incentives. They also open avenues for malicious agents. Modelling the arising incentive problems in game-theoretic terms is a formidable challenge as the timing of moves and information is itself affected by the moves of players. For instance, in the reorg attack mentioned in the beginning, a malicious proposer might want to wait with sending information until the next episode has started. In that way he might draw validators away from the honest proposer of that episode and instead have them vote on his block that he created late.</p>
<p>The practical modelling of such interactions is not obvious (and in fact motivated a new research project on our end). Here, we dramatically simplify the problem by getting rid of time completely. Instead, we leverage a key feature of our approach: games are defined as open systems, open to their environment and waiting for information.</p>
<p>Through the environment we can feed in specific information we want. Concretely, we can expose the proposer and validators in a given episode exactly to the kind of reorg scenario mentioned above: Proposer and validators are facing differing information regarding the state of the chain.</p>
<p>Besides simplifying the model, proceeding in this way has a further advantage. The analysis of optimal moves is static and only relative to the context. It thereby becomes much simpler.<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup></p>
<h2>Representing the protocol as a compositional game</h2>
<p>In order to construct a game-theoretic model of the protocol, we will build up the protocol from the bottom up using building blocks.</p>
<h3>Building blocks</h3>
<p>We begin with the boring but necessary parts that describe the mechanics of the protocol. These components are mostly functions lifted into games as computations. In order not to introduce too much clutter in this post, we focus on the open games representations and hide the details of the auxiliary function implementations. These functions are straightforward, and it should hopefully be clear from context what they do.</p>
<h4>Auxiliary components</h4>
<p>Given a chain, <code class="language-plaintext highlighter-rouge">determineHeadOfChain</code> produces the head of the current chain:</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">determineHeadOfChain</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : chain ;
feedback : ;
:-----:
inputs : chain ;
feedback : ;
operation : forwardFunction $ determineHead ;
outputs : head ;
returns : ;
:-----:
outputs : head ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
<p>Given the old chain from $(t-1)$ and the head of the chain from $(t-2)$, <code class="language-plaintext highlighter-rouge">oldProposerAddedBlock</code> determines whether the proposer actually did send a new block in $(t-1)$. It also outputs the head of the chain for period $(t-1)$ - as this is needed in the next period.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">oldProposerAddedBlock</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : chainOld, headOfChainIdT2 ;
feedback : ;
:-----:
inputs : chainOld, headOfChainIdT2 ;
feedback : ;
operation : forwardFunction $ uncurry wasBlockSent ;
outputs : correctSent, headOfChainIdT1 ;
returns : ;
:-----:
outputs : correctSent, headOfChainIdT1 ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
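<p>A minimal sketch of what the lifted function could look like, using an illustrative linear-chain representation (the types, and the treatment of forks, are assumptions of this sketch, not the actual implementation):</p>

```haskell
-- Illustrative types: a linear chain of (block id, votes) pairs,
-- ordered from genesis to head.
type Block = (Int, Int)
type Chain = [Block]

-- Given the chain from (t-1) and the head id observed in (t-2), decide
-- whether the proposer sent a block in (t-1), and also return the head
-- of (t-1), which is needed in the next period.
wasBlockSent :: Chain -> Int -> (Bool, Int)
wasBlockSent chain headIdT2 =
  let headIdT1 = fst (last chain)
  in  (headIdT1 /= headIdT2, headIdT1)
```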
<p>Given the decision by the proposer to either wait or to send a head, <code class="language-plaintext highlighter-rouge">addBlock</code> creates a new chain: either the old chain is copied unchanged, or it is extended by a new block.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">addBlock</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : chainOld, chosenIdOrWait ;
feedback : ;
:-----:
inputs : chainOld, chosenIdOrWait ;
feedback : ;
operation : forwardFunction $
uncurry addToChainWait ;
outputs : chainNew ;
returns : ;
:-----:
outputs : chainNew ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
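<p>A hedged sketch of the lifted function, again with an illustrative linear-chain type that ignores forks (both the types and the id-assignment scheme are assumptions made for this sketch):</p>

```haskell
type Block = (Int, Int)          -- (block id, votes so far)
type Chain = [Block]             -- genesis first; forks are ignored here

-- The proposer either waits or chooses a block id to build on.
data Decision = Wait | Build Int

-- On Wait the old chain is copied unchanged; otherwise a fresh block
-- with zero votes is appended on top.
addToChainWait :: Chain -> Decision -> Chain
addToChainWait chain Wait      = chain
addToChainWait chain (Build _) =
  let newId = 1 + maximum (map fst chain)
  in  chain ++ [(newId, 0)]
```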
<p>The following diagram summarizes the information flow in these building blocks.</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/auxiliary.png" alt="Information flow in the building blocks" /></p>
<h4>Decisions</h4>
<p>Given the old chain from $(t-1)$, the proposer decides whether or not to append a new block to a node. Conditional on that decision, a new chain is created.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">proposer</span> <span class="n">name</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : chainOld;
feedback : ;
:-----:
inputs : chainOld ;
feedback : ;
operation : dependentDecision name
alternativesProposer;
outputs : decisionProposer ;
returns : 0;
inputs : chainOld, decisionProposer ;
feedback : ;
operation : addBlock ;
outputs : chainNew;
returns : ;
//
:-----:
outputs : chainNew ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
<p>Given a newly proposed chain and the old chain from $(t-1)$, the validator then decides which node to attest as the head.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">validator</span> <span class="n">name</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : chainNew,chainOld ;
feedback : ;
:-----:
inputs : chainNew,chainOld ;
feedback : ;
operation : dependentDecision name
(\(chainNew, chainOld) ->
[1, vertexCount chainNew]) ;
outputs : attestedIndex ;
returns : 0 ;
// ^ NOTE the payoff for the validator comes from the next period
:-----:
outputs : attestedIndex ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
<p>This open game is parameterized by a specific player (<code class="language-plaintext highlighter-rouge">name</code>). The information flow of the decision open games is depicted in the next diagram:</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/decisions.png" alt="Add block flow" /></p>
<h4>Payoffs</h4>
<p>The central aspect of the protocol is how the payoffs of the different players are determined. For both proposers and validators we split the payoff components into two parts. First, we create open games which are mere accounting devices, i.e. they just update a player’s payoff.</p>
<p><code class="language-plaintext highlighter-rouge">updatePayoffValidator</code>:</p>
<ol>
<li>determines the value that a validator should receive conditional on his action being assessed as correct and</li>
<li>updates the value for a specific validator. This open game is parameterized by a specific player (<code class="language-plaintext highlighter-rouge">name</code>).</li>
</ol>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">updatePayoffValidator</span> <span class="n">name</span> <span class="n">fee</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : bool ;
feedback : ;
:-----:
inputs : bool ;
feedback : ;
operation : forwardFunction $ validatorPayoff fee ;
outputs : value ;
returns : ;
// ^ Determines the value
inputs : value ;
feedback : ;
operation : addPayoffs name ;
outputs : ;
returns : ;
:-----:
outputs : ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
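<p>The lifted payoff function itself could be as simple as the following sketch (the concrete types are an assumption; the point is only that the fee is conditional on correctness):</p>

```haskell
-- A validator earns the fee only if the earlier attestation was judged
-- correct, and nothing otherwise.
validatorPayoff :: Double -> Bool -> Double
validatorPayoff fee correct = if correct then fee else 0
```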
<p><code class="language-plaintext highlighter-rouge">updatePayoffProposer</code> works analogously to the validators’. First, determine the value the proposer should receive depending on his action. Second, do the book-keeping and add the payoff to <code class="language-plaintext highlighter-rouge">name</code>’s account.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">updatePayoffProposer</span> <span class="n">name</span> <span class="n">reward</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : bool ;
feedback : ;
:-----:
inputs : bool ;
feedback : ;
operation : forwardFunction $ proposerPayoff reward;
outputs : value ;
returns : ;
// ^ Determines the value
inputs : value ;
feedback : ;
operation : addPayoffs name ;
outputs : ;
returns : ;
:-----:
outputs : ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">proposerPayment</code> embeds <code class="language-plaintext highlighter-rouge">updatePayoffProposer</code> into a larger game whose first stage includes a function, <code class="language-plaintext highlighter-rouge">proposedCorrect</code>, lifted into the open game. That function does what its name suggests: given the latest chain and a Boolean indicating whether the proposer actually added a block, it determines whether the proposer proposed correctly according to the protocol.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">proposerPayment</span> <span class="n">name</span> <span class="n">reward</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : blockAddedInT1, chainNew ;
feedback : ;
:-----:
inputs : blockAddedInT1, chainNew ;
feedback : ;
operation : forwardFunction $ uncurry
proposedCorrect ;
outputs : correctSent ;
returns : ;
// ^ This determines whether the proposer was
correct in period (t-1)
inputs : correctSent ;
feedback : ;
operation : updatePayoffProposer name reward;
outputs : ;
returns : ;
// ^ Updates the payoff of the proposer given
decision in period (t-1)
:-----:
outputs : ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
<p>This last game already showcases a pattern that we will see repeatedly from now on: using the primitive components, we build up larger games. All the needed “molding” parts have been put on the table; all that follows is about composing those elements.</p>
<p>Let us consider another example for composition.</p>
<p><code class="language-plaintext highlighter-rouge">validatorsPayment</code> groups the payments for the validators included (here, two) into one game.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">validatorsPayment</span> <span class="n">name1</span> <span class="n">name2</span> <span class="n">fee</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : validatorHashMap, chainNew, headId;
feedback : ;
:-----:
inputs : validatorHashMap, chainNew, headId ;
feedback : ;
operation : forwardFunction $ uncurry3 $
attestedCorrect name1 ;
outputs : correctAttested1 ;
returns : ;
// ^ This determines whether validator 1 was
correct in period (t-1) using the latest
hash and the old information
inputs : validatorHashMap, chainNew, headId ;
feedback : ;
operation : forwardFunction $ uncurry3 $
attestedCorrect name2 ;
outputs : correctAttested2 ;
returns : ;
// ^ This determines whether validator 2 was
correct in period (t-1)
inputs : correctAttested1 ;
feedback : ;
operation : updatePayoffValidator name1 fee ;
outputs : ;
returns : ;
// ^ Updates the payoff of validator 1 given
decision in period (t-1)
inputs : correctAttested2 ;
feedback : ;
operation : updatePayoffValidator name2 fee ;
outputs : ;
returns : ;
// ^ Updates the payoff of validator 2 given
decision in period (t-1)
:-----:
outputs : ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
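<p>For intuition, here is a sketch of what a function like <code class="language-plaintext highlighter-rouge">attestedCorrect</code> could compute, again over the illustrative linear-chain types used above (the types and the way “on the path to the head” is tested are assumptions of this sketch):</p>

```haskell
import qualified Data.Map.Strict as Map

-- Illustrative linear chain of (block id, votes) pairs, genesis first.
type Chain = [(Int, Int)]

-- A validator attested correctly if the block id they voted for lies on
-- the path from genesis to the current head. In this linear sketch,
-- block ids grow along the chain, so membership reduces to id <= headId.
attestedCorrect :: String -> Map.Map String Int -> Chain -> Int -> Bool
attestedCorrect name votes chain headId =
  case Map.lookup name votes of
    Nothing  -> False
    Just bid -> bid `elem` [i | (i, _) <- chain, i <= headId]
```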
<p>This concludes the blocks for generating payments. The information flow of these components is depicted in the following diagram:</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/payments.png" alt="Grouping of validators" /></p>
<p><code class="language-plaintext highlighter-rouge">validatorsGroupDecision</code> groups all the validators’ decisions into one game. The output of this game is a map (in the programming sense) connecting the name of each validator with her/his decision.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">validatorsGroupDecision</span> <span class="n">name1</span> <span class="n">name2</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : chainNew,chainOld, validatorsHashMapOld ;
feedback : ;
:-----:
inputs : chainNew, chainOld ;
feedback : ;
operation : validator name1 ;
outputs : attested1 ;
returns : ;
// ^ Validator1 makes a decision
inputs : chainNew, chainOld ;
feedback : ;
operation : validator name2 ;
outputs : attested2 ;
returns : ;
// ^ Validator2 makes a decision
inputs : [(name1,attested1),(name2,attested2)],
validatorsHashMapOld ;
feedback : ;
operation : forwardFunction $ uncurry
newValidatorMap ;
outputs : validatorHashMap ;
returns : ;
// ^ Creates a map of which validator voted for
//   which index
inputs : chainNew, [attested1,attested2] ;
feedback : ;
operation : forwardFunction $ uncurry updateVotes ;
outputs : chainNewUpdated;
returns : ;
// ^ Updates the chain with the relevant votes
:-----:
outputs : validatorHashMap, chainNewUpdated;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
<p>The group of validators is not especially exciting in itself, but it serves to illustrate a general point. Nesting smaller games into larger games is mostly about establishing clear interfaces. As long as we do not change the interfaces, we can change the internal behavior. This is very helpful when we build our model in several steps and refine it over time. Here, for instance, the payment for an individual validator might change. But such a change is only required in one place - assuming the interaction with the outside world does not change - and will not affect the wider construction of the game. In other words, it reduces the effort of rewriting games.</p>
<p>Similarly, we chose the output type of the grouped validators with the intention that it would be easy to add more validators while keeping the interface, the mapping of validators to their decisions, intact.</p>
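<p>As a small illustration of that interface, the validator map can be sketched in a few lines. This is purely hypothetical code using <code class="language-plaintext highlighter-rouge">Data.Map</code>; the engine’s actual <code class="language-plaintext highlighter-rouge">newValidatorMap</code> may be implemented differently.</p>

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical sketch: a map from validator names to their decisions.
-- The real `newValidatorMap` in the engine may differ.
type ValidatorMap = Map.Map String Bool

-- Insert (or overwrite) the latest decisions into the existing map,
-- so adding more validators only extends the input list.
updateValidatorMap :: [(String, Bool)] -> ValidatorMap -> ValidatorMap
updateValidatorMap decisions old = foldr (uncurry Map.insert) old decisions
```

<p>Because the interface is just this map, adding a third validator only means adding one more entry to the input list; nothing downstream needs to change.</p>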
<p>The next diagram illustrates the composition of components and the information flow.</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/groupdecision.png" alt="Information flow in the validators' group decision" /></p>
<h3>Integrating the components towards one episode</h3>
<p>Having assembled all the necessary components, we can now turn to a model of an episode of the complete protocol.</p>
<p>Given the previous chain $(t-1)$, the block which was the head of the chain in $(t-2)$, and the voting decisions of the previous validators, this game puts all the decisions together.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">oneEpisode</span> <span class="n">p0</span> <span class="n">p1</span> <span class="n">a10</span> <span class="n">a20</span> <span class="n">a11</span> <span class="n">a21</span> <span class="n">reward</span> <span class="n">fee</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : chainOld, headOfChainIdT2,
validatorsHashMapOld ;
// ^ chainOld is the old hash
feedback : ;
:-----:
inputs : chainOld ;
feedback : ;
operation : proposer p1 ;
outputs : chainNew ;
returns : ;
// ^ Proposer makes a decision, a new hash is
//   proposed
inputs : chainNew,chainOld, validatorsHashMapOld;
feedback : ;
operation : validatorsGroupDecision a11 a21 ;
outputs : validatorHashMapNew, chainNewUpdated ;
returns : ;
// ^ Validators make a decision
inputs : chainNewUpdated ;
feedback : ;
operation : determineHeadOfChain ;
outputs : headOfChainId ;
returns : ;
// ^ Determines the head of the chain
inputs : validatorsHashMapOld, chainNewUpdated,
headOfChainId ;
feedback : ;
operation : validatorsPayment a10 a20 fee ;
outputs : ;
returns : ;
// ^ Determines whether validators from period (t-1)
//   were correct and get rewarded
inputs : chainOld, headOfChainIdT2 ;
feedback : ;
operation : oldProposerAddedBlock ;
outputs : blockAddedInT1, headOfChainIdT1;
returns : ;
// ^ This determines whether the proposer from
//   period (t-1) did actually add a block or not
inputs : blockAddedInT1, chainNewUpdated ;
feedback : ;
operation : proposerPayment p0 reward ;
outputs : ;
returns : ;
// ^ This determines whether the proposer from
//   period (t-1) was correct and triggers payments
//   accordingly
:-----:
outputs : chainNewUpdated, headOfChainIdT1,
validatorHashMapNew ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
<p>For clarity, the diagram below illustrates the interaction of the different components and their information flow.</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/oneepisode.png" alt="Information flow" /></p>
<p>One important thing to note is that this game representation has no inherent dynamics. This follows from a general principle behind the theory of open games: they have no built-in notion of time.<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup></p>
<p>This is a limitation in the sense that we cannot watch the dynamics unfold. It also has advantages, though: the incentive analysis has no side effects; in functional programming terms, it acts like a pure function and is referentially transparent relative to some state of the game.</p>
<h3>More models from here on</h3>
<p>Once we have the one-episode model, we have choices. We can work with that model directly, and we will do so in the next section. But we can also construct “larger models”: either by manually combining several episodes into a new multi-episode model or by embedding the single episode into a Markov game structure.</p>
<p>We do not cover the construction proper or the analysis of the Markov game in this post. But the idea is simple: the stage game corresponds to a state in a Markov game, where the state is fully captured by the inputs to the stage game. A Markov strategy then determines the move in the stage game. This, in turn, allows us to derive the next state of the Markov game. To analyze such a game, we can approximate the future payoff from the point of view of a single player under the assumption that the other players keep playing their strategies. In that way we can also assess unilateral deviations for the player in focus.</p>
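<p>The shape of this idea can be sketched in a few lines of Haskell. This is a simplified, hypothetical sketch, not the engine’s Markov game implementation: the state stands in for the inputs of the stage game, and a Markov strategy picks a move profile from the current state.</p>

```haskell
-- Minimal, hypothetical sketch (not the engine's actual API): a stage
-- game as a state transition, where the state stands for its inputs.
type Stage s a = s -> a -> s

-- Play the stage game for n episodes under a fixed Markov strategy,
-- which picks a move profile from the current state.
iterateEpisodes :: Int -> Stage s a -> (s -> a) -> s -> s
iterateEpisodes 0 _ _ s = s
iterateEpisodes n step strat s =
  iterateEpisodes (n - 1) step strat (step s (strat s))

-- Approximate the future payoff by a discounted sum along the
-- trajectory induced by the fixed strategies.
approxPayoff :: Int -> Double -> Stage s a -> (s -> a) -> (s -> Double) -> s -> Double
approxPayoff 0 _ _ _ _ _ = 0
approxPayoff n gamma step strat payoff s =
  payoff s + gamma * approxPayoff (n - 1) gamma step strat payoff (step s (strat s))
```

<p>Assessing a unilateral deviation then amounts to comparing such discounted sums: one trajectory where the player in focus deviates, against one where everyone keeps playing the fixed strategies.</p>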
<h2>Analysis</h2>
<p>Having established a model of the protocol, let us turn to its analysis. The point, of course, is not merely to represent the games but to learn something about the incentives of the agents involved.</p>
<p>It is important to note, here, that the model we arrived at above is just <em>one</em> possible way to represent the situation. Obviously, the engine cannot guarantee that you end up with a useful model. But what it should guarantee is that you can adapt the model quickly and iterate through a range of models. <em>The</em> “one true model” rarely exists. Instead, being able to adapt and to consider many different scenarios is the default.</p>
<p>We will illustrate two analyses. The first shows that, as the protocol intends, agents who follow it truthfully end up in an equilibrium. The second shows that, in its current form, the protocol has problems if a proposer chooses to delay his message strategically, which makes it susceptible to attacks.</p>
<h3>Honest behavior</h3>
<p>We will first illustrate that the protocol works as intended if all agents involved are honest. They all observe the current head of the chain. Proposers then build a new block on top of that head; validators validate that head. The analysis can be found in <code class="language-plaintext highlighter-rouge">HonestBehavior.hs</code>.</p>
<p>We use the <code class="language-plaintext highlighter-rouge">oneEpisode</code> model. That is, we slice the protocol into one period and supply the initial information at which that round begins, together with a continuation for how the game continues in the next round. Recall that the rewards for the proposer and the validators in period $t$ are determined in period $(t+1)$. This information, the initialization, and the continuation are fed in through <code class="language-plaintext highlighter-rouge">initialContextLinear</code>, where <em>linear</em> signifies that we consider a non-forked chain.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">initialContextLinear</span> <span class="n">p</span> <span class="n">a1</span> <span class="n">a2</span> <span class="n">reward</span> <span class="n">successFee</span> <span class="o">=</span>
<span class="kt">StochasticStatefulContext</span>
<span class="p">(</span><span class="n">pure</span> <span class="p">(</span><span class="nb">()</span><span class="p">,(</span><span class="n">initialChainLinear</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">initialMap</span><span class="p">)))</span>
<span class="p">(</span><span class="nf">\</span><span class="kr">_</span> <span class="n">x</span> <span class="o">-></span> <span class="n">feedPayoffs</span> <span class="n">p</span> <span class="n">a1</span> <span class="n">a2</span> <span class="n">reward</span> <span class="n">successFee</span> <span class="n">x</span><span class="p">)</span>
</code></pre></div></div>
<p>This expression looks more complicated than it actually is. The first part, <code class="language-plaintext highlighter-rouge">pure ((), (initialChainLinear, 3, initialMap))</code>, determines the starting conditions of the situation we consider. That is, we provide the input parameters which <code class="language-plaintext highlighter-rouge">oneEpisode</code> expects from us. Among other things, this contains the initial chain we start with. Here it is, replicated from above as a reminder:</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/chain.png" alt="Example chain for two validators" /></p>
<p>The second part, <code class="language-plaintext highlighter-rouge">(\_ x -> feedPayoffs p a1 a2 reward successFee x)</code>, describes a function which computes the payoffs resulting from the current actions in the next period. The details of how these payoffs are determined can be found in the implementation of <code class="language-plaintext highlighter-rouge">feedPayoffs</code>.</p>
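<p>Conceptually, these two parts form a context: a starting state plus a continuation. The sketch below is a simplification for illustration only; the engine’s <code class="language-plaintext highlighter-rouge">StochasticStatefulContext</code> additionally handles probability and state threading.</p>

```haskell
-- Simplified, hypothetical picture of a context for a one-episode game:
-- an initial state together with a continuation computing future payoffs.
data Context s payoff = Context
  { initialState :: s
  , continuation :: s -> payoff
  }

-- Closing an episode: run it on the initial state, then hand the
-- resulting state to the continuation.
closeEpisode :: (s -> s) -> Context s p -> p
closeEpisode episode (Context s k) = k (episode s)
```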
<p>Again, the way we approach this problem is by exploiting the key feature of open games: the one-episode model is like a pipe expecting some inflows and outflows. Once we have them defined, we can analyze what is going on inside of that “pipe”.</p>
<p>The last element needed for our analysis is the strategies. We define “honest” strategies for both proposer and validators: <code class="language-plaintext highlighter-rouge">strategyProposer</code> and <code class="language-plaintext highlighter-rouge">strategyValidator</code>.</p>
<p>Both types of agents observe previous information, for instance the past chain, and then build on the head of the chain (proposer) or attest to the head of the chain (validators).</p>
<p>Note that we include a condition in the strategies that deals with the scenario in which there is no unique head of the chain. In the analysis we focus on here, where everyone behaves honestly, we never reach this case. However, once not all agents are honest, there may be scenarios in which the head is not unique. This will be important in the second case we analyze below.</p>
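<p>As an illustration, such a strategy might look as follows. The types and the tie-breaking rule here are hypothetical, a sketch rather than the engine’s actual <code class="language-plaintext highlighter-rouge">strategyValidator</code>.</p>

```haskell
-- Hypothetical sketch of an honest attestation rule. A chain is
-- represented as a list of (node id, votes) pairs.
attestHead :: [(Int, Int)] -> Int
attestHead chain =
  case candidates of
    [h] -> h           -- unique head: attest to it
    hs  -> minimum hs  -- no unique head: fall back to a deterministic rule
  where
    maxVotes   = maximum (map snd chain)
    candidates = [ i | (i, v) <- chain, v == maxVotes ]
```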
<p>Once we have defined the strategies, there is only one thing left to do: initialize the game with some parameters, specifically the rewards and fees for the proposer and the validators, respectively.</p>
<p>In the file <code class="language-plaintext highlighter-rouge">HonestBehavior.hs</code> you can find one such parameterization, <code class="language-plaintext highlighter-rouge">analyzeScenario</code>:</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">analyzeScenario</span> <span class="o">=</span> <span class="n">eqOneEpisodeGame</span> <span class="s">"p0"</span> <span class="s">"p1"</span> <span class="s">"a10"</span> <span class="s">"a20"</span> <span class="s">"a11"</span> <span class="s">"a21"</span> <span class="mi">2</span> <span class="mi">2</span> <span class="n">strategyOneEpisode</span> <span class="p">(</span><span class="n">initialContextLinear</span> <span class="s">"p1"</span> <span class="s">"a11"</span> <span class="s">"a21"</span> <span class="mi">2</span> <span class="mi">2</span><span class="p">)</span>
</code></pre></div></div>
<p>This game employs the honest strategies. If we query it, we see that the proposer as well as the validators have no incentive to deviate. These strategies form an equilibrium - as intended in the design of the protocol.</p>
<h3>Identifying attacks - zooming in</h3>
<p>Let us turn to a second analysis. This analysis can be found in <code class="language-plaintext highlighter-rouge">Attacker.hs</code>.</p>
<p>In this analysis we continue to consider the behavior of honest agents. However, these agents start out on a chain that, we assume, has been intentionally delayed by the proposer in the previous episode. This is achieved by adding an additional input, <code class="language-plaintext highlighter-rouge">chainManipulated</code>, to <code class="language-plaintext highlighter-rouge">oneEpisodeAttack</code>, which is otherwise equivalent to <code class="language-plaintext highlighter-rouge">oneEpisode</code>: we as analysts can manipulate the chain that the proposer and the validators see.</p>
<div class="language-haskell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">oneEpisodeAttack</span> <span class="n">p0</span> <span class="n">p1</span> <span class="n">a10</span> <span class="n">a20</span> <span class="n">a11</span> <span class="n">a21</span> <span class="n">reward</span> <span class="n">fee</span> <span class="o">=</span> <span class="o">[</span><span class="n">opengame</span><span class="o">|</span>
inputs : chainOld, headOfChainIdT2,
validatorsHashMapOld, chainManipulated ;
// ^ chainOld is the old hash
feedback : ;
:-----:
inputs : chainOld ;
feedback : ;
operation : proposer p1 ;
outputs : chainNew ;
returns : ;
// ^ Proposer makes a decision, a new hash is
//   proposed
inputs : chainNew, chainManipulated ;
feedback : ;
operation : mergeChain ;
outputs : mergedChain ;
returns : ;
// ^ Merges the two chains into a new chain for the
//   validators
inputs : mergedChain,chainOld,
validatorsHashMapOld;
feedback : ;
operation : validatorsGroupDecision a11 a21 ;
outputs : validatorHashMapNew, chainNewUpdated ;
returns : ;
// ^ Validators make a decision
inputs : chainNewUpdated ;
feedback : ;
operation : determineHeadOfChain ;
outputs : headOfChainId ;
returns : ;
// ^ Determines the head of the chain
inputs : validatorsHashMapOld, chainNewUpdated,
headOfChainId ;
feedback : ;
operation : validatorsPayment a10 a20 fee ;
outputs : ;
returns : ;
// ^ Determines whether validators from period (t-1)
//   were correct and get rewarded
inputs : chainOld, headOfChainIdT2 ;
feedback : ;
operation : oldProposerAddedBlock ;
outputs : blockAddedInT1, headOfChainIdT1;
returns : ;
// ^ This determines whether the proposer from
//   period (t-1) did actually add a block or not
inputs : blockAddedInT1, chainNewUpdated ;
feedback : ;
operation : proposerPayment p0 reward ;
outputs : ;
returns : ;
// ^ This determines whether the proposer from
//   period (t-1) was correct and triggers payments
//   accordingly
:-----:
outputs : chainNewUpdated, headOfChainIdT1,
validatorHashMapNew ;
returns : ;
<span class="o">|]</span>
</code></pre></div></div>
<p>This simulates the situation where the malicious proposer from the previous episode sends a block after the honest proposer from this episode has added his own block. As a result there are now two nodes in the chain with 0 votes on them; in other words, there are two contenders for the head of the chain. The chain at this point in time looks like this:</p>
<p><img src="/assetsPosts/2022-06-24-a-software-engine-for-game-theoretic-modelling-part-2/attackchain.png" alt="Forked chain for two validators" /></p>
<p>The next steps are analogous to the previous analysis: we define the inputs and how the game continues. Lastly, we need to define strategies.</p>
<p>We consider two strategies for the validators, adapted to this specific scenario: either they vote with the honest proposer, i.e. for node 4 (<code class="language-plaintext highlighter-rouge">strategyValidator4</code>), or they vote with the attacker’s block, i.e. for node 5 (<code class="language-plaintext highlighter-rouge">strategyValidator5</code>). We assume the proposer behaves honestly, as before.</p>
<p>If we run the equilibrium check on these two scenarios, <code class="language-plaintext highlighter-rouge">analyzeScenario4</code> and <code class="language-plaintext highlighter-rouge">analyzeScenario5</code>, we see that <em>both</em> constitute an equilibrium. That is, in both cases none of the players has an incentive to deviate. Obviously, the scenario in which the validators vote for the malicious proposer is not an equilibrium we want from the design perspective of the protocol.</p>
<p>We can shed further light on what is going on here. So far we have assumed that the validators coordinate on one node: they either both choose node 4 or both choose node 5. The key issue is that they observe two candidate nodes for the new head of the chain. We can also consider the case where the validators randomize when facing a tie (<code class="language-plaintext highlighter-rouge">analyzeScenarioRandom</code>). In that case, the result is not an equilibrium: both validators would profit from voting for another block. The reason is simple: they are not coordinated. If each randomly draws one of the two heads, there is a chance that the validators output mutually contradictory information, which means they will not be rewarded.</p>
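<p>A back-of-the-envelope calculation makes the coordination problem concrete. The payoff assumption here is for illustration only, not taken from the model: validators are rewarded only when they agree, and each picks node 4 or node 5 independently with probability 1/2.</p>

```haskell
-- Probability that two independently randomizing validators agree:
-- both pick node 4, or both pick node 5.
probAgree :: Double
probAgree = 0.5 * 0.5 + 0.5 * 0.5

-- Expected fee under randomization (assumed payoff rule): half of the
-- coordinated fee is lost to miscoordination.
expectedFee :: Double -> Double
expectedFee fee = probAgree * fee
</code></pre> -->
expectedFee fee = probAgree * fee
```

<p>Since each validator could secure the full fee by matching the other’s vote for certain, the randomized profile cannot be an equilibrium.</p>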
<h2>Outlook</h2>
<p>The development of the engine is ongoing. Protocols which involve a timing choice - for instance, a proposer waiting to send information and thereby potentially learning something about the validators’ behavior in the meantime - pose a challenge for the current implementation. One should add that they also pose a challenge for classical game representations such as the extensive form. As we have shown, it is still entirely possible to represent such games in the engine. However, such modelling puts the burden on the modeller to make reasonable choices. It would be nicer to start with an actual protocol and extract a game-theoretic model out of it. Extending the underlying theory and the engine to better accommodate such scenarios is at the top of our todo list.</p>
<div class="footnotes" role="doc-endnotes">
<ol>
<li id="fn:1" role="doc-endnote">
<p>This is not the only way to model the protocol in the current implementation. It is also possible to consider a timer explicitly as a state variable. This <a href="https://github.com/20squares/block-validation/tree/timer">branch</a> contains such a model. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
<li id="fn:2" role="doc-endnote">
<p>We should be more precise: In the current theory of open games there is always a clear notion of causality - who moves when and what is observed when by whom. The relevant “events” can be organized in a relation. This follows the overall categorical structure in which open games are embedded. We are working on a version of the theory where time - or other underlying structures like networks - are what open games are based on. <a href="#fnref:2" class="reversefootnote" role="doc-backlink">↩</a></p>
</li>
</ol>
</div>

<p>Philipp Zahn</p>

<p>Some time ago, in a previous blog post, we introduced our software engine for game theoretic modelling. In this post, we expand more on how to apply the engine to use cases relevant for the Ethereum ecosystem. We will consider an analysis of a simplified staking protocol. Our focus will be on compositionality – what this means from the perspective of representing protocols and from the perspective of analyzing protocols.</p>

<h1>What is Categorical Cybernetics?</h1>

<p>2022-05-29 · <a href="https://cybercat-institute.github.io//2022/05/29/what-is-categorical-cybernetics">https://cybercat-institute.github.io//2022/05/29/what-is-categorical-cybernetics</a></p>

<p><strong>Categorical cybernetics</strong>, or <strong>CyberCat</strong> to its friends, is – no surprise – the application of methods of (applied) category theory to cybernetics. The “<strong>category theory</strong>” part is clear enough, but the term “<strong>cybernetics</strong>” is notoriously fluid, and throughout history has meant more or less whatever the writer wanted it to mean. So, let’s lay down some boundaries.</p>
<p>I first proposed CyberCat, both as a field and as a term, in <a href="https://julesh.com/2019/11/27/categorical-cybernetics-a-manifesto/">this 2019 blog post</a> (for which this one is partly an update). There I fixed a definition that I still like: <strong>cybernetics is the control theory of complex systems</strong>. That is, cybernetics is the interaction of control theory and systems theory.</p>
<p>We add to this <a href="https://www.appliedcategorytheory.org/">applied category theory</a>, which has some generic benefits. Most importantly we have <a href="https://julesh.com/2017/04/22/on-compositionality/">compositionality</a> by default, and a more precise way of talking about it than in fields like machine learning where it is present but informal. Compositionality also gets us half way to computer implementation by default, by making our models similar to programs. Finally category theory gives us a disciplined way to talk about interaction between models in different fields.</p>
<p>It turns out - and this fact is at the heart of CyberCat - that the category-theoretic study of control has a huge amount of overlap with things like <strong>learning</strong> and <strong>strategic analysis</strong>. Those were also historically part of cybernetics, and can be seen as aspects of control theory with a certain amount of squinting, so we also include them.</p>
<p>On top of that definition, a cultural aspect of the historical cybernetics movement that we want to retain is that <strong>cybernetics is inherently interdisciplinary</strong>. Cybernetics is not just the theory but the practice: in engineering, artificial intelligence, economics, ecology, political science, and anywhere else where it might be useful. (Part of the reason we created the Institute – more on that in a future post – is to make this cross-cutting collaboration easier than in a university.)</p>
<p>Cybernetics has been an academic dirty word for many decades now: in the 60s and 70s it went through a hype cycle, things were over-claimed, and the field eventually fell apart. As founders of the CyberCat Institute we believe that <strong>the time is right to reclaim the word cybernetics</strong>. Apart from anything else, the word is just too cool not to use. More importantly, the objects of study – and the interdisciplinary approach to studying them – are even more important now than 50 years ago.</p>
<p>Having laid out what CyberCat could potentially be, I will now narrow the scope. At the Institute we are focussing on not just any applications of category theory to cybernetics, but to a small set of very closely interrelated tools. These are, roughly, things that have a family resemblance to <strong>open games</strong>.</p>
<p>This post isn’t the place to go into technical details, but what these things have in common is that they model <strong>bidirectional processes</strong>: they are processes (that is, they have an extent in time) in which some information appears to flow backwards (I described the idea in more detail in <a href="https://julesh.com/2017/09/29/a-first-look-at-open-games/">this post</a>). The best known of these is <strong>backpropagation</strong>, where the backwards pass goes backwards. A key technical idea behind CyberCat is the observation that many other important processes in cybernetics have a lot in common with backprop, once you take the right perspective. The category-theoretic tool used to model these processes is <strong>optics</strong>.</p>
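<p>To give a flavour of what an optic is, here is the simplest case, a lens, written as a pair of a forwards and a backwards pass. This is an illustrative sketch, not any particular library’s definition.</p>

```haskell
-- Illustrative sketch (not a specific library's API): a lens is a
-- forwards pass extracting a focus, plus a backwards pass writing
-- feedback back into the whole.
data Lens s t a b = Lens
  { view   :: s -> a        -- forwards
  , update :: s -> b -> t   -- backwards
  }

-- Lenses compose end-to-end; the backwards passes run in reverse order,
-- which is exactly the shape of backprop.
compose :: Lens s t a b -> Lens a b x y -> Lens s t x y
compose outer inner = Lens
  { view   = view inner . view outer
  , update = \s y -> update outer s (update inner (view outer s) y)
  }
```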
<p>Besides backprop, the things we have put on a uniform mathematical foundation using optics are value iteration, Bayesian inference, filtering, and the unnamed process that is the secret sauce of compositional game theory.</p>
<p>This is the academic foundation that we start from. The question that comes next is, so what? How can this knowledge be exploited to solve actual problems? This is where the CyberCat Institute comes in, but I want to leave that for a future post. In the meantime, you can look at our <a href="/projects">projects page</a> to see the kinds of things we are working on right now.</p>

<p>Jules Hedges</p>