Free Energy Principle
The Free Energy Principle

The Free Energy Principle states that an adaptive agent resists surprise by jointly adjusting its actions, internal states, and model so as to minimize variational free energy:

$$ a, \mu, m = \arg\min F(\tilde{s}, \mu \mid m) $$

Where:
- $F(\tilde{s}, \mu \mid m)$: the variational free energy, a quantity that upper-bounds surprise, $-\ln p(\tilde{s} \mid m)$ (made concrete in the sketch after this list).
- $\tilde{s}$: sensory inputs (possibly generalized coordinates of sensations).
- $\mu$: internal states (like beliefs or expectations).
- $m$: the model or structure used to generate predictions.
- $a$: actions that can influence the sensory input.
- $\arg\min$: denotes the values of $a, \mu, m$ that minimize the free energy.
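To make the bound concrete, here is a minimal numerical sketch (not from the original text, with made-up probabilities). It assumes a discrete generative model $p(\tilde{s}, \vartheta) = p(\tilde{s} \mid \vartheta)\,p(\vartheta)$ over hidden causes $\vartheta$, with the internal state encoding a belief $q(\vartheta)$; then $F = D_{\mathrm{KL}}(q \,\|\, p(\vartheta \mid \tilde{s})) - \ln p(\tilde{s})$, so $F$ never falls below the surprise and equals it exactly when $q$ is the true posterior.

```python
# Minimal sketch (hypothetical numbers): F upper-bounds surprise -log p(s).
import numpy as np

prior = np.array([0.5, 0.3, 0.2])        # p(theta) over three hidden causes
likelihood = np.array([0.1, 0.7, 0.4])   # p(s_obs | theta) for the observed s

joint = likelihood * prior               # p(s_obs, theta)
evidence = joint.sum()                   # p(s_obs)
surprise = -np.log(evidence)             # -log p(s_obs)
posterior = joint / evidence             # exact p(theta | s_obs)

def free_energy(q):
    """F(q) = E_q[log q(theta) - log p(s_obs, theta)]."""
    return np.sum(q * (np.log(q) - np.log(joint)))

for q in [np.full(3, 1 / 3), np.array([0.2, 0.5, 0.3]), posterior]:
    print(f"F = {free_energy(q):.4f}   surprise = {surprise:.4f}")
# F > surprise for the first two beliefs; F = surprise when q is the posterior.
```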
This means that an agent (e.g. a brain) is constantly trying to do three things, the first two of which are sketched in code after this list:
- Perceive the causes of its sensations correctly (update beliefs $\mu$),
- Act to bring the world in line with its expectations (change $a$),
- Adapt its generative model of the world $m$ to better predict future inputs.
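The loop below is a toy sketch of the first two processes (adapting the model $m$ is omitted), assuming a one-dimensional Gaussian generative model with hypothetical parameters: a prior $\mathcal{N}(v_p, \sigma_p^2)$ over the hidden cause, a likelihood $\mathcal{N}(\mu, \sigma_s^2)$ for the sensation, and an action that shifts the sensory input, $\tilde{s} = s_0 + a$. Both perception and action are gradient descent on $F$.

```python
# Toy sketch: perception and action as gradient descent on free energy.
# Up to additive constants,
#   F = (mu - v_p)**2 / (2 * sig_p2) + (s - mu)**2 / (2 * sig_s2).
v_p, sig_p2 = 3.0, 1.0   # prior mean/variance over the hidden cause
sig_s2 = 0.5             # sensory noise variance
s0 = 8.0                 # raw sensory input before acting
mu, a = 0.0, 0.0         # internal estimate and action
lr = 0.05                # gradient step size

for _ in range(500):
    s = s0 + a                                        # action moves the input
    dF_dmu = (mu - v_p) / sig_p2 - (s - mu) / sig_s2  # perception gradient
    dF_da = (s - mu) / sig_s2                         # action gradient via s
    mu -= lr * dF_dmu                                 # update beliefs...
    a -= lr * dF_da                                   # ...and act on the world

print(f"mu = {mu:.2f}, s = {s0 + a:.2f}")  # both settle near the prior mean 3
```

Note the second arrow of the principle at work here: the agent does not only revise $\mu$ toward the data, it also acts so that the data come to match its expectations.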
The Bayesian brain hypothesis
$$ \mu = \arg\min_{\mu} \, D_{\mathrm{KL}} \left( q(\vartheta) \,\|\, p(\vartheta \mid \tilde{s}) \right) $$

Here $q(\vartheta)$ is the approximate posterior over hidden causes $\vartheta$ encoded by the internal states, and $p(\vartheta \mid \tilde{s})$ is the true posterior; minimizing free energy drives this divergence toward zero, so perception becomes approximate Bayesian inference.
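As a sanity check, here is a small sketch (made-up numbers, binary hidden state): scanning over every candidate $q$ confirms that the KL divergence is minimized exactly at the Bayesian posterior.

```python
# Sketch: the q minimizing KL(q || p(theta | s)) is the exact posterior.
import numpy as np

prior = np.array([0.7, 0.3])        # p(theta), hypothetical
likelihood = np.array([0.2, 0.9])   # p(s_obs | theta), hypothetical
posterior = prior * likelihood
posterior /= posterior.sum()        # exact p(theta | s_obs) by Bayes' rule

def kl(q, p):
    return float(np.sum(q * np.log(q / p)))

# Scan all beliefs q = [x, 1 - x] on a fine grid.
grid = np.linspace(0.001, 0.999, 999)
best = min(grid, key=lambda x: kl(np.array([x, 1 - x]), posterior))
print(f"argmin q = [{best:.3f}, {1 - best:.3f}]  posterior = {posterior.round(3)}")
```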
The Infomax principle

$$ \mu = \arg\max \left\{ I(\tilde{s}, \mu) - H(\mu) \right\} $$

Where:

- $\tilde{s}$: sensory data (observed input)
- $\mu$: internal representation (e.g. neural encoding or beliefs)
- $I(\tilde{s}, \mu)$: mutual information between sensory input and internal representations — how much knowing $\mu$ reduces uncertainty about $\tilde{s}$
- $H(\mu)$: entropy of the internal representation — how complex or redundant $\mu$ is
Minimizing free energy is thus closely related to maximizing the mutual information between sensations and representations (as in Infomax) while penalizing complexity in the representation (via the entropy and KL-divergence terms above); under simplifying assumptions the two objectives coincide.
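The short sketch below (hypothetical distributions throughout) evaluates the Infomax objective $I(\tilde{s}, \mu) - H(\mu)$ for two discrete encoders $p(\mu \mid \tilde{s})$, one that tracks the input faithfully and one that barely does; the faithful code scores higher.

```python
# Sketch: scoring two hypothetical encoders p(mu | s) by I(s, mu) - H(mu).
import numpy as np

p_s = np.array([0.5, 0.5])  # distribution over sensory inputs s

def infomax_objective(p_mu_given_s):
    joint = p_mu_given_s * p_s[:, None]     # p(s, mu) = p(mu | s) p(s)
    p_mu = joint.sum(axis=0)                # marginal p(mu)
    # I(s; mu) = sum_{s,mu} p(s,mu) * log[ p(s,mu) / (p(s) p(mu)) ]
    mi = np.sum(joint * np.log(joint / (p_s[:, None] * p_mu[None, :])))
    entropy = -np.sum(p_mu * np.log(p_mu))  # H(mu)
    return mi - entropy

faithful = np.array([[0.95, 0.05], [0.05, 0.95]])  # mu tracks s closely
noisy = np.array([[0.55, 0.45], [0.45, 0.55]])     # mu barely tracks s
print(infomax_objective(faithful), infomax_objective(noisy))
# The faithful encoder scores higher: more shared information for the
# same representational entropy (both marginals p(mu) are uniform here).
```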