
cadCAD Specification (DRAFT)

Suppose $X$ is some object with data structure $\mathcal{X}$: literally anything we can define, but in practice we use JSON-like objects. ($\mathcal{X}$ is really just a space; here we are talking about representing that space in a computer.)

$X^+ = h(X)$

The system model is a generalized dynamical system if $h:\mathcal{X} \rightarrow \mathcal{X}$, i.e. if there is a map $h$ with domain and codomain $\mathcal{X}$.

Our set of generalized dynamical systems is $\mathcal{D}$; a specific one can be expressed as $d=(h, \mathcal{X})\in \mathcal{D}$. This is the primitive concept here, and we can treat these things as closed under composition. That is to say, for any $d_1,d_2\in \mathcal{D}$ we have $d_1 \circ d_2 \in \mathcal{D}$.

Now let's suppose that we wish to decompose the operator $h$ into inputs and updates that are a consequence of those inputs, as is standard in control engineering:

$u = g(X)$

$X^+ = f(X, u)$

such that

$h(X) = f(X, g(X))$

Define a state transition function (aka mechanism) as $(f,\mathcal{X},U)$, where the admissible input map $U:\mathcal{X}\rightarrow \wp(\mathcal{U})$ gives us the set of admissible actions $\mathcal{U}$ under the map $f:\mathcal{X} \times \mathcal{U} \rightarrow \mathcal{X}$.

For any selection $g:\mathcal{X} \rightarrow \mathcal{U}$ we have a system of the type $f \circ g \in \mathcal{D}$. This gives us a single "state update block".
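
To make this concrete, here is a minimal Python sketch of a single state update block, using a hypothetical token-minting state; the field names and logic are illustrative only.

def g(X):
    # input (policy) function: decide how many tokens to mint this step
    return {'mint': 1 if X['supply'] < X['cap'] else 0}

def f(X, u):
    # state update function: apply the input without mutating X in place
    X_next = dict(X)
    X_next['supply'] = X['supply'] + u['mint']
    return X_next

def h(X):
    # the composed state transition map h(X) = f(X, g(X))
    return f(X, g(X))

X = {'supply': 10, 'cap': 100}
X_plus = h(X)   # {'supply': 11, 'cap': 100}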

Define a set of state update blocks $h_i$ for $i\in\{0,1,\ldots,m\}$ where each $h_i$ has $f_i$ and $g_i$ satisfying the conditions above for $f$ and $g$ respectively.


Sidebar: $h_i:\mathcal{X}_i \rightarrow \mathcal{Y}_i$ where $\mathcal{X}_i,\mathcal{Y}_i \subset \mathcal{X}$. Suppose each $h_i$ operates on $x_i \subseteq X$ and mutates a subset $y_i \subseteq X$; then we can examine the set of chains of composition which produce equivalent results by considering the intersections of the various subsets $x_i,y_i$ for all $i$.


Then we will call each $h_i\in \mathcal{D}$ a partial state update block, and the entire state update is given by $H\in \mathcal{D}$

$H(X)= h_m(h_{m-1}(\cdots h_1(h_0(X))\cdots))$

where $H:\mathcal{X} \rightarrow \mathcal{X}$ is also a generalized dynamical system, but the value of $X$ passes through intermediate states, so we will denote these substeps as $X_i$ where $X_m=X^+$.

$X = X_m^-$

$X_0 = h_0(X) = f_0(X, g_0(X))$

$X_1 = h_1(X_0) = f_1(X_0, g_1(X_0))$

$\vdots$

$X_{m-1} = h_{m-1}(X_{m-2}) = f_{m-1}(X_{m-2}, g_{m-1}(X_{m-2}))$

$X_m = h_m(X_{m-1}) = f_m(X_{m-1}, g_m(X_{m-1}))$

$X^+ = X_m$

A run is a sequence $X_{t,i}$, indexed by timestep $t$ and substep $i$, where there is some initial condition $X=\mathbf{X_0}$ and we apply the first update

$X_{1,0} = h_0(X)= h_0(\mathbf{X_0})$

this gives us a loop:


for $t \in \{1,\ldots,T\}$:

$\qquad$ for $i \in \{0,\ldots,m\}$:

$\qquad\qquad u = g_i(X_{t,i-1})$

$\qquad\qquad X_{t,i} = f_i(X_{t,i-1}, u)$

$\qquad \mathbf{X_t} = X_{t,m}$

with the convention that $X_{t,-1} = \mathbf{X_{t-1}}$, i.e. each timestep starts from the previous timestep's final state.
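
As a schematic Python version of the same loop (assuming `f_blocks` and `g_blocks` are lists of the $f_i$ and $g_i$ defined above; the names are illustrative):

def run(X0, f_blocks, g_blocks, T):
    # returns the retained full-state snapshots X_0, X_1, ..., X_T
    trajectory = [X0]
    X = X0
    for t in range(T):
        for f_i, g_i in zip(f_blocks, g_blocks):
            u = g_i(X)        # u = g_i(X_{t,i-1})
            X = f_i(X, u)     # X_{t,i} = f_i(X_{t,i-1}, u)
        trajectory.append(X)  # X_t = X_{t,m} after the last substep
    return trajectory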


What is actually retained?

In the most expressive version we can store $X_{t,i}$ for all $i$ and $t$.

The next, less expressive, option would be to store only $\mathbf{X_t}$.

Even more parsimonious would be $\mathbf{Y_t} \subset \mathbf{X_t}$.

And if desired, one could choose a different middle ground, storing specific outputs $Y_{t,i}\subset X_{t,i}$.


A potential future feature is to allow flagging of partial state updates to have their outputs stored, and/or flagging specific states to have their outputs stored even after the simulation itself no longer references them directly.


Relation to testing procedures

If we think of cadCAD as similar to ML algorithms, in the sense that there is a valid sequence of transformers, we can define classes of:

  • functions $h\in \mathcal{D}$
  • data structures $\wp(\mathcal{X})$
  • functions $g$ and $f$
  • assertions of the form $A: \mathcal{X} \rightarrow \{0,1\}$ which select a subspace $\mathcal{X}^C \subset \mathcal{X}$
  • assertions of the form $a: \mathcal{X} \times \mathcal{X} \rightarrow \{0,1\}$ which verify the validity of a state transition $X^+ = h(X)$, e.g. $V(X^+)==V(X)$.

(suggestions)

  • State objects (json-like with nested key value pairs, and optionally assertions on the relationships between values, e.g. sum of agent tokens = supply of token)
  • State update functions (operators of type $h:\mathcal{X} \rightarrow \mathcal{X}$), which are created by combining input operators $u=g(X)$ and $X^+=f(X, u)$ where the domains and ranges match, such that $X^+=f(X, g(X))$ maps $\mathcal{X}$ to $\mathcal{X}$.
  • State transition functions $f(X,u)$ take a prior state $X\in \mathcal{X}$ and an input $u\in\mathcal{U}$ and return a posterior state $X^+\in\mathcal{X}$. There may be state, time or actor dependent restrictions captured by the set $\mathcal{U}$. The mechanism is passive if applying an empty action returns the prior state: $X^+=f(X, \emptyset ) = X$.
  • Input functions $u = g(X)$ which take the state object as their domain and whose codomain is a space $\mathcal{U}$ which may also have restrictions. This operator has a variety of special cases:
    • i) Environmental processes are generally modeled as random variables which drive stochastic processes in the associated functions $f(X,u)$.
    • ii) Behavioral models use $u$ to represent actions taken by actors outside the control of the systems engineer; depending on the application, the admissible action set $\mathcal{U}$ may be restricted in a state dependent manner.
    • iii) Control policies are feedback processes which adapt the system according to rules defined by the systems engineer, with the intent to enforce an invariant or steer the system with respect to an objective.
    • iv) Input functions may simply be reads of external data sets or streams from other runtimes.
  • A sequence of operations which are individually state update functions may be bundled into a pipeline which is collectively a valid state update function; this pipeline is what we call the "partial state update blocks", and applying the whole sequence is the state update function.

Note that for testing purposes, nearly everything about a simulation can be checked using pre and post conditions on the state objects $X\in\mathcal{X}$ and on the input signals $u\in \mathcal{U}$ (associated with a specific $f(X,u)$). Therefore the testing framework could work by defining classes of assertions:

  • General assertions about the state object itself, which could be checked in debugging or automated testing mode at every point an object $X\in \mathcal{X}$ is either referenced or returned.
  • Specific assertions about local invariants $V(X^+)=V(X)$ (or others such as $V(X^+)< V(X)$, or any boolean) under a particular operation $h(X)$.
  • Specific assertions about the admissibility of a particular action $u\in \mathcal{U}$, most commonly constructed as checking whether the image of some action $u$ results in $f(X,u)\in \mathcal{X}$. In theory other assertions are possible as well.
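
As a minimal sketch of how these assertion classes could be wired into a run in a debugging or testing mode (all names, and the 'token_supply' field, are illustrative rather than part of cadCAD):

def V(X):
    # a hypothetical conserved quantity, e.g. total token supply
    return X['token_supply']

# general assertion A: X -> {0,1}, selecting a subspace of valid states
def A_nonnegative_supply(X):
    return X['token_supply'] >= 0

# transition assertion a: (X, X+) -> {0,1}, e.g. the invariant V(X+) == V(X)
def a_supply_conserved(X, X_plus):
    return V(X_plus) == V(X)

def checked_step(h, X, state_assertions, transition_assertions):
    # check pre-conditions, apply h, then check post-conditions
    assert all(A(X) for A in state_assertions), 'pre-condition failed'
    X_plus = h(X)
    assert all(A(X_plus) for A in state_assertions), 'post-condition failed'
    assert all(a(X, X_plus) for a in transition_assertions), 'transition assertion failed'
    return X_plus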

Broadly speaking, since we are operating over arbitrary data structures, another family of assertions we could explore is unit checking, provided that fields of the state can be said to have units and that the units themselves have well defined relations.

Example: Simple Schema Declaration and Check

This example is drawn from the credCastle model being worked on by one cadCAD community "study group".

Define the state object's top-level keys and assign them data types:

state_schema = {
    'castle': int,
    'castle_token_price': float,
    'castle_token_supply': int,
    'castle_token_holdings': list
}

Then create a genesis state:

import numpy as np
n = 10  # number of agents (illustrative value)
init_holdings = [np.random.randint(0, 2) for i in range(n)]
sum_of_init_holdings = sum(init_holdings)

genesis_state = {
    'castle': 1,
    'castle_token_price': 1.2,
    'castle_token_supply':sum_of_init_holdings,
    'castle_token_holdings': init_holdings
}

Our state is defined by a set of keys declared in the schema.

keys = state_schema.keys()
print(keys)
dict_keys(['castle', 'castle_token_price', 'castle_token_supply', 'castle_token_holdings'])

Our genesis state has the same keys!

print(genesis_state.keys()==keys)
True

We also see that our genesis state respects the schema!

for k in keys:
    print(type(genesis_state[k])==state_schema[k])
    
True
True
True
True

This simple example demonstrates the idea that we can declare boolean functions over our state objects to verify that they are in fact in our statespace.

Also possible

  • more sophisticated schemas (including classes and nested structures)
  • logics that run contingent on the violation of these rules, which project back into the domain $\mathcal{X}$ or cause actions to fail and revert to the prior state.
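
For instance, a minimal sketch of the "fail and revert to the prior state" idea, reusing the state_schema above (the wrapper is illustrative, not a cadCAD feature):

def respects_schema(X, schema):
    # boolean function selecting the subspace of states matching the declared schema
    return X.keys() == schema.keys() and all(
        isinstance(X[k], schema[k]) for k in schema
    )

def guarded(h, schema):
    # wrap a state update h so that schema violations cause a revert to the prior state
    def h_guarded(X):
        X_plus = h(X)
        return X_plus if respects_schema(X_plus, schema) else X
    return h_guarded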

cadCAD System in "Plain English"

In the cadCAD simulation methodology, we operate on four layers: Policies, Mechanisms, States, and Metrics. Information flows do not have explicit feedback loops unless noted. Policies determine the inputs into the system dynamics, and can come from user input, observations from the exogenous environment, or algorithms. Mechanisms are functions that take the policy decisions and update the States to reflect the policy level changes. States are variables that represent the system quantities at the given point in time, and Metrics are computed from state variables to assess the health of the system. Metrics can often be thought of as KPIs, or Key Performance Indicators.

At a more granular level, to set up a model, there are system conventions and configurations that must be followed.

The way to think of cadCAD modeling is analogous to machine learning pipelines, which normally consist of multiple steps when training and running a deployed model. There is preprocessing, which includes segregating features between continuous and categorical, transforming or imputing data, and then instantiating, training, and running a machine learning model with specified hyperparameters. cadCAD modeling can be thought of in the same way: states, roughly translating into features, are fed into pipelines that have built-in logic to direct traffic between different mechanisms, such as scaling and imputation. Accuracy scores, ROC, etc. are analogous to the metrics that can be configured on a cadCAD model, specifying how well a given model is doing in meeting its objectives. The parameter sweeping capability of cadCAD can be thought of as a grid search, or a way to find the optimal hyperparameters for a system by running through alternative scenarios. The A/B style testing that cadCAD enables is used in the same way machine learning models are A/B tested, except out of the box, providing a side-by-side comparison of multiple different models to compare and contrast performance. Utilizing the field of Systems Identification, dynamical systems models can be used to "online learn" by providing a feedback loop to generative system mechanisms.

The flexibility of cadCAD also enables the embedding of machine learning models into behavior policies or mechanisms for complex systems with a machine learning prediction component.

Examples

System Dynamics (SD)

System Dynamics is a modeling paradigm used to model the nonlinear behavior of complex systems using flows, stocks, and feedback loops. System Dynamics modeling is very useful in modeling population flows, financial statements, etc., but has a limited ability to represent complex agent and system interactions.

Predator Prey model formulation

An example model for understanding dynamical systems is the commonly used Lotka–Volterra predator-prey model, a pair of first order nonlinear differential equations used to describe the dynamics of two interacting species, one a predator and the other its prey. We can model the population changes over time.

It is based on the following [2,3]:

$$\begin{aligned}\frac{dx}{dt} &= \alpha x - \beta xy,\\ \frac{dy}{dt} &= \delta xy - \gamma y,\end{aligned}$$

Where:

  • $x$ is the number of prey
  • $y$ is the number of some predator
  • $\frac{dx}{dt}$ and $\frac{dy}{dt}$ represent the instantaneous growth rates of the two populations
  • $t$ represents time
  • α, β, γ, δ are positive real parameters describing the interaction of the two species.

The most prominent feature of this model is the existence, depending on the choice of parameters, of a repeatable cycle around a fixed point, which creates a dynamical equilibrium between the number of prey and predators in the system.

partial_state_update_block = [
    {
        'policies': {
            'reproduce_prey': p_reproduce_prey,
            'reproduce_predators': p_reproduce_predators,
            'eliminate_prey': p_eliminate_prey,
            'eliminate_predators': p_eliminate_predators
        },
        'variables': {
            'prey_population': s_prey_population,
            'predator_population': s_predator_population            
        }
    }

]
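
The referenced policy and state update functions are not reproduced here; the following is a minimal sketch, assuming a forward-Euler discretization of the equations above, the usual cadCAD policy and state update function signatures, and parameters alpha, beta, gamma, delta, and dt held in params. It relies on cadCAD's default aggregation (summation) of policy signals that share a key to net the reproduction and elimination terms.

def p_reproduce_prey(params, substep, state_history, prev_state):
    # dx/dt contribution: +alpha * x
    return {'prey_delta': params['alpha'] * prev_state['prey_population'] * params['dt']}

def p_eliminate_prey(params, substep, state_history, prev_state):
    # dx/dt contribution: -beta * x * y
    return {'prey_delta': -params['beta'] * prev_state['prey_population']
            * prev_state['predator_population'] * params['dt']}

def p_reproduce_predators(params, substep, state_history, prev_state):
    # dy/dt contribution: +delta * x * y
    return {'predator_delta': params['delta'] * prev_state['prey_population']
            * prev_state['predator_population'] * params['dt']}

def p_eliminate_predators(params, substep, state_history, prev_state):
    # dy/dt contribution: -gamma * y
    return {'predator_delta': -params['gamma'] * prev_state['predator_population'] * params['dt']}

def s_prey_population(params, substep, state_history, prev_state, policy_input):
    return 'prey_population', prev_state['prey_population'] + policy_input['prey_delta']

def s_predator_population(params, substep, state_history, prev_state, policy_input):
    return 'predator_population', prev_state['predator_population'] + policy_input['predator_delta']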

System Dynamics paradigm (macroscopic view) advantages
  • Fast-performing, allowing a very large number of timesteps and simulations
  • Easy to prototype and to add/modify mechanisms
  • Easy to insert a multitude of complex factors
  • The output is usually easy to visualize

Agent Based Modeling (ABM)

Agent based modeling is a modeling paradigm used to simulate the interaction of autonomous agents and their effects on the underlying system. An example of agent based modeling is modeling secondary market behavior of individual actors, such as traders, long-term investors, and liquidity providers.

Using the same predator-prey model defined above in the System Dynamics example, we'll adopt a model based on a grid world, on which prey and predators take the following actions at each timestep of their lives:

  • Food is grown on every site.
  • All agents digest some of the food in their stomachs and get older.
  • All agents move (if possible) to an available random neighboring location.
  • The agents reproduce themselves if there is an available partner nearby.
  • The prey agents feed on the available food.
  • The predator agents hunt nearby prey.
  • All sufficiently old agents die.

This model is inherently stochastic: every time you run it, you'll get a completely different result for the same parameters. But we can see that there is a sort of random equilibrium that converges to the dynamical equilibrium presented in the System Dynamics simulation.

partial_state_update_block = [
    {
        'policies': {
            'grow_food': p_grow_food
        },
        'variables': {
            'sites': s_update_food
        }
    },
    {
        'policies': {
            'increase_agent_age': p_digest_and_olden
        },
        'variables': {
            'agents': s_agent_food_age
        }
    },
    {
        'policies': {
            'move_agent': p_move_agents
        },
        'variables': {
            'agents': s_agent_location
        }
    },
    {
        'policies': {
            'reproduce_agents': p_reproduce_agents
        },
        'variables': {
            'agents': s_agent_create
        }
    },
    {
        'policies': {
            'feed_prey': p_feed_prey
        },
        'variables': {
            'agents': s_agent_food,
            'sites': s_site_food
        }
    },
    {
        'policies': {
            'hunt_prey': p_hunt_prey
        },
        'variables': {
            'agents': s_agent_food
        }
    },
    {
        'policies': {
            'natural_death': p_natural_death
        },
        'variables': {
            'agents': s_agent_remove
        }
    }
]
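
The agent and site state objects, and the remaining policy/state update pairs, are not shown here; a minimal sketch of one of them (growing food), under the assumption that 'sites' maps grid locations to food levels and 'agents' maps agent ids to attribute dicts (all names and values are illustrative):

import random

def p_grow_food(params, substep, state_history, prev_state):
    # grow one unit of food on every site, capped at a maximum level
    grown = {loc: min(food + 1, params['max_food'])
             for loc, food in prev_state['sites'].items()}
    return {'site_food': grown}

def s_update_food(params, substep, state_history, prev_state, policy_input):
    return 'sites', policy_input['site_food']

genesis_state = {
    'sites': {(i, j): random.randint(0, 3) for i in range(10) for j in range(10)},
    'agents': {
        0: {'type': 'prey', 'location': (2, 3), 'food': 5, 'age': 0},
        1: {'type': 'predator', 'location': (7, 1), 'food': 5, 'age': 0}
    }
}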

Agent-based modeling paradigm (microscopic view) advantages

  • Are conceptually closer to experience, making it easier to explain to someone with no previous background
  • Easier to generate complex behavior with simple rules
  • Generates more granular and detailed information

Networked Models

Grassroots Economics has created a Community Currency to help alleviate the liquidity crisis of rural Kenya. BlockScience created a graph based dynamical system model in order to provide a scaffold for Grassroots' economy planning, a subset of which is discussed below as an illustration of networked model types.

For networked graph models evolving over time, assume we have a directed graph $\mathcal{G}(\mathcal{V},\mathcal{E})$ with subpopulations as vertices or nodes, $\mathcal{V} = \{1,\dots,V\}$, and edges $\mathcal{E} \subseteq \mathcal{V} \times \mathcal{V}$. Demand, utility, and spend are edges connecting the subpopulations, with spend used to denote desired flow between agents $(i,j) \in \mathcal{E}$. Technically, the graph is a weighted, directed multigraph with more than one edge $i \longrightarrow j$ allowed for any pair of vertices $i,j \in \mathcal{V}$, each with weight $w_{i,j}$. In this example, we have a state update block, as shown below, with two partial state update blocks, choose_agents and spend_allocation.

partial_state_update_block = {
    'Behaviors': {
        'policies': {
            'action': choose_agents
        },
        'variables': {
            'network': update_agent_activity,
            'outboundAgents': update_outboundAgents,
            'inboundAgents': update_inboundAgents
        }
    },
    'Spend allocation': {
        'policies': {
            'action': spend_allocation
        },
        'variables': {
            'network': update_node_spend
        }
    }
}
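
Before these state update blocks run, the underlying graph state itself has to be constructed. A minimal sketch of such a graph object using networkx is shown below; the node names, balances, and edge weights are purely illustrative.

import networkx as nx

# weighted, directed multigraph: subpopulations as nodes, with separate
# 'demand', 'utility', and 'spend' edges allowed between the same pair (i, j)
G = nx.MultiDiGraph()
G.add_nodes_from(['farmers', 'merchants', 'laborers'], balance=100.0)

G.add_edge('farmers', 'merchants', key='demand', weight=30.0)
G.add_edge('farmers', 'merchants', key='utility', weight=0.8)
G.add_edge('farmers', 'merchants', key='spend', weight=0.0)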

In this example, during spend_allocation, based on the desired interacting agents' demand, utility, and liquidity constraints, we iterate through the desired demand and allocate based on a stack ranking of utility $v_{i,j}$ over demand, $\frac{v_{i,j}}{d_{i,j}}$, until all demand for each agent is met or agent $i$ runs out of funds. There are several assertions we may want to test, such as:

  • Agent $i$ does not go negative in their funds.
  • All edges that agent $i$ is connected to have been stack-ranked by utility over demand.
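
A minimal sketch of the stack-ranked allocation and these two assertions is shown below; the data layout and names are illustrative, and in the actual model the demands and utilities live on the graph edges.

def allocate_spend(funds_i, edges_i):
    # edges_i maps each outgoing edge (i, j) to its demand d_ij and utility v_ij;
    # rank by utility over demand, highest first (assumes d_ij > 0)
    ranked = sorted(edges_i.items(),
                    key=lambda kv: kv[1]['utility'] / kv[1]['demand'],
                    reverse=True)
    allocation, remaining = {}, funds_i
    for edge, e in ranked:
        spend = min(e['demand'], remaining)
        allocation[edge] = spend
        remaining -= spend
    # assertion: agent i does not go negative in their funds
    assert remaining >= 0
    # assertion: every edge agent i is connected to was considered in the ranking
    assert set(allocation) == set(edges_i)
    return allocation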

Networked Models Advantages

  • Represent complex relationships containing interaction data between multiple agents
  • Networked models are an object type; as a result, they can be used in conjunction with ABM and multiscale modeling approaches for modeling detailed interactions efficiently.

Multiscale Modeling

Multiscale modeling is a type of modeling that describes a system over multiple scales of time or space, i.e. spatio-temporal scales. An example of a multiscale model is Conviction Voting, a novel decision making process where voters express their preference for which proposals they would like to see approved in a continuous rather than discrete way. The longer the community keeps a preference on an individual proposal, the "stronger" the proposal conviction becomes. In the conviction voting model a graph structure is used to record the introduction and removal of participants, candidates, proposals, and their outcomes. This combination of complexity and different scales of representation is what cadCAD is able to model.

Multiscale Modeling Advantages

  • Ability to model on multiple spatio-temporal scales
  • Nonlinear dynamics and feedback effects with emergent properties
  • Realistic system complexity in engineering, control systems, and economics models.

Glossary

This section contains terms and definitions

Table Term/Concept, Math notation, Definition/description

carefully define some extra terms

  • object (as in arbitrary data structure)
  • a sequence of objects is a stream (general discrete)
  • special case is a point (as in a vectorspace)
  • with special case sequence called a signal (has both continuous and discrete variants)

need to work on a breakout on "time" representations

  • continuous time
  • event sequences (partial orders)
  • discrete time (sampled continuous time versus strict order of events)

Basic Definitions

| Term | Notation | Definition | Relations |
| --- | --- | --- | --- |
| State | $X$ | an object or point representing the current configuration of the system | $X\in \mathcal{X}$ |
| Statespace | $\mathcal{X}$ | a data structure or space containing all possible values of $X$ | $\hbox{Image}(f) \subseteq \mathcal{X}$ |
| State Update Map | $f$ | a map which takes the current state $X\in\mathcal{X}$ and an input object $u \in\mathcal{U}$ and returns a new object in $\mathcal{X}$ | $f:\mathcal{X}\times\mathcal{U}\rightarrow \mathcal{X}$ |
| Input | $u$ | an object representing an input to the system triggering a state update | $f(X,u)\in \mathcal{X}$ |
| Admissible Input Space | $\mathcal{U}$ | a data structure or space containing all possible values of $u$ given the current state $X\in \mathcal{X}$ | $u\in\mathcal{U}$ |
| Input Space | $\mathcal{V}^\ddagger$ | a data structure or space containing all possible values of $u$ | $\mathcal{U}\in\wp(\mathcal{V})^\dagger$ |
| Admissible Input Map | $U$ | a map which takes the current state $X$ and returns the Admissible Input Space $\mathcal{U}$ | $U:\mathcal{X}\rightarrow\wp(\mathcal{V})$ |
| Input Map | $g$ | a map which takes the current state $X$ and returns an admissible input $u$ | $g:\mathcal{X}\rightarrow \mathcal{U}$ |
| State Transition Map | $h$ | a map which takes the current state $X$ to a point in the same statespace $\mathcal{X}$ | $h = f\circ g$, $h:\mathcal{X}\rightarrow\mathcal{X}$ |
| Posterior State | $X^+$ | the state the system arrives at after applying the state transition function $h$ to state $X$ | $X^+=h(X)$ |
| Trajectory | $X_0, X_1,\ldots\in \mathcal{X}$ | a sequence of points in $\mathcal{X}$ achieved by recursively applying the map $h$ starting at initial object $X_0$ | $X_{t+1}=h(X_t)$ |
| Generalized Dynamical System | $(h,\mathcal{X})$ | a map $h$ and an associated statespace $\mathcal{X}$ such that $h:\mathcal{X} \rightarrow \mathcal{X}$, generating trajectories for any initial object $X_0$ | $X_1 = h(X_0)$, $X_2 = h(X_1)$, $\ldots$, $X_t = h(X_{t-1})$, $\ldots$ |

$^\dagger$ The notation $\wp(\mathcal{S})$ denotes the power set of the set $\mathcal{S}$.

$^\ddagger$ The notation $\mathcal{V}$ denotes the universal set (set of all sets or set of all possible elements in a given context/universe? need citation).

Basic Definitions, an Example: King on Chess Board

Suppose a system comprised of a King moving on an empty chess board. The chess board is 8x8 squares, and the king can move one square in any direction (horizontally, vertically, or diagonally).

  • $X$ is a tuple representing the king's position on the board. Let's define a convention where:
    • (0,0): bottom left square
    • (0,7): upper left square
    • (7,0): bottom right square
    • (7,7): upper right square
  • $\mathcal{X}$, being the set of all possible values of $X$, is the set of all the 64 squares of the board: {(0,0),(0,1),...(0,7),(1,0),(1,1)...(7,6),(7,7)}
  • $u$ is the king's move. Let's define a convention where the move is described by a tuple where each element is the number of squares the king moves along the corresponding axis, e.g.
    • (0,1): move up
    • (-1,0): move left
    • (1,1): move diagonally to the upper right
  • $\mathcal{V}$ is the set of all 9 possible values of $u$: {(-1,-1),(0,-1),(1,-1),(-1,0),(0,0),(1,0),(-1,1),(0,1),(1,1)}
  • $\mathcal{U}$, the admissible input space, depends on the current position of the king ($X$). For example, if the king is at (0,0) it can't move to the left or down; its admissible input space is restricted to {(0,0),(0,1),(1,1),(1,0)}. The admissible input map $U$ formalizes such restrictions for all possible $X$.
  • $g$, the input function, selects one of the possible $u$ from $\mathcal{U}$
  • $f$, the state update function, computes the new position of the king after a move. If we define $X=(x_i,x_j)$ and $u=(\Delta{x_i},\Delta{x_j})$, then $f(X,u) = (x_i+\Delta{x_i},x_j+\Delta{x_j})$. Notice that because $u$ is selected from $\mathcal{U}$, no further checks on the validity of $u$ need to be performed when $f$ is evaluated.
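
Putting the pieces together, a minimal Python sketch of this system (with $g$ chosen to pick a uniformly random admissible move) might look like:

import random

# V: all 9 candidate moves, including staying put
V = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)]

def U(X):
    # admissible input map: only moves that keep the king on the 8x8 board
    return [u for u in V
            if 0 <= X[0] + u[0] <= 7 and 0 <= X[1] + u[1] <= 7]

def g(X):
    # input function: select one admissible move, here at random
    return random.choice(U(X))

def f(X, u):
    # state update function: apply the move
    return (X[0] + u[0], X[1] + u[1])

X = (0, 0)
for t in range(5):
    X = f(X, g(X))   # a short trajectory of the generalized dynamical system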

Chaining Partial State-Update Blocks

| Term | Notation | Definition |
| --- | --- | --- |
| Partial state update block | $h_i$ | the $i$-th state update block, composed of $f_i$ and $g_i$ |
| Block domain | $\mathcal{X}_i$ | the subspace of $\mathcal{X}$ on which $h_i$ operates |
| Block codomain | $\mathcal{Y}_i$ | the subspace of $\mathcal{X}$ into which $h_i$ maps |
| Read subset | $x_i$ | the subset of the state $X$ read by $h_i$ |
| Written subset | $y_i$ | the subset of the state $X$ mutated by $h_i$ |
| Block input map | $g_i$ | the input map of block $i$ |
| Block state update map | $f_i$ | the state update map of block $i$ |
| Full state update | $H$ | the composition of all partial state update blocks, $H(X)= h_m(h_{m-1}(\cdots h_0(X)\cdots))$ |

Defining and Enforcing Domain Restrictions

| Term | Notation | Definition |
| --- | --- | --- |
| Transition assertion | $a$ | an assertion $a: \mathcal{X} \times \mathcal{X} \rightarrow \{0,1\}$ verifying the validity of a state transition $X^+ = h(X)$ |
| Restricted subspace | $\mathcal{X}^C$ | the subspace $\mathcal{X}^C \subset \mathcal{X}$ selected by a state assertion $A$ |
| Invariant quantity | $V$ | a quantity computed from the state and used in invariant assertions, e.g. $V(X^+) = V(X)$ |
| State assertion | $A$ | an assertion $A: \mathcal{X} \rightarrow \{0,1\}$ selecting a subspace $\mathcal{X}^C \subset \mathcal{X}$ |

+ invariants, and other more advanced concepts, e.g. $V$, $A$, $a$, $\mathcal{X}^C$

+ fixed points and neighborhoods, design for convergence properties (e.g. estimation, sensemaking)

+ games, composed games, and path planning problems

References