This is a short technical note written in the process of working out
some definitions for a paper about using an approximate minimax regret
training objective to mitigate goal misgeneralisation in advanced deep
reinforcement learning systems. I define three different approximate
relaxations of the minimax objective, and show that the definitions are
related under certain assumptions.
Thanks to my colleagues Karim Abdel Sadek and Michael Dennis for helping me
understand the necessary game theory and decision theory to figure this
out.
Background
This note is a sequel to *minimax, three ways*; see that note for additional background. This note references results and definitions from the previous note by number, and closely follows those definitions and propositions in order to define three approximate relaxations of the minimax objective and study their relationships.
What do I mean by an approximate relaxation of an objective?¹ First, let me formalise optimisation. Suppose we have a function $f : X \to \mathbb{R}$. To minimise this function is to find an argument $x \in X$ that achieves the minimum possible value, $f(x) = \min_{x' \in X} f(x')$, where the RHS is assumed to exist. We define the set of minimisers
$$\arg\min_{x' \in X} f(x') = \left\{ x \in X \;\middle|\; f(x) = \min_{x' \in X} f(x') \right\}.$$
This criterion is very strict, making it difficult to find such minimisers in practice. It's slightly more realistic to assume that we might be able to find some $x \in X$ that is within some finite approximation threshold $\varepsilon \ge 0$ of the minimum possible value. We thus define the set of approximate minimisers
$$\arg\min^{\varepsilon}_{x' \in X} f(x') = \left\{ x \in X \;\middle|\; f(x) \le \min_{x' \in X} f(x') + \varepsilon \right\}.$$
We recover exact minimisation with $\varepsilon = 0$. Similarly, we define maximisers
$$\arg\max_{x' \in X} f(x') = \left\{ x \in X \;\middle|\; f(x) = \max_{x' \in X} f(x') \right\},$$
and approximate maximisers given approximation threshold $\delta \ge 0$,
$$\arg\max^{\delta}_{x' \in X} f(x') = \left\{ x \in X \;\middle|\; f(x) \ge \max_{x' \in X} f(x') - \delta \right\}.$$
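For a finite domain, these definitions translate directly into code. Here is a minimal Python sketch of the approximate optimiser sets (the helper names `approx_argmin` and `approx_argmax` are my own, invented for illustration):

```python
# Approximate minimisers over a finite domain X, following the definition
# above: all x with f(x) <= min f + eps. (Helper names are invented.)
def approx_argmin(f, X, eps=0.0):
    best = min(f(x) for x in X)
    return {x for x in X if f(x) <= best + eps}

# Approximate maximisers: all x with f(x) >= max f - delta.
def approx_argmax(f, X, delta=0.0):
    best = max(f(x) for x in X)
    return {x for x in X if f(x) >= best - delta}

# Example: f(x) = x^2 on {-2, ..., 2}.
f = lambda x: x * x
X = range(-2, 3)
exact = approx_argmin(f, X)             # {0}: only x = 0 attains the minimum
relaxed = approx_argmin(f, X, eps=1.0)  # {-1, 0, 1}: within 1 of the minimum
```

With `eps=0` we recover exact minimisation, matching the definition above.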
Three approximate relaxations…
In the previous note, we explored three different definitions of the
minimax objective: minimax(1), minimax(2), and
minimax(3σ).
The idea was that each definition would suggest a different approximate
relaxation. Here I state the relaxed definitions.
As in the previous note, suppose we have a function $r : A \times S \to \mathbb{R}$. It doesn't matter what the function represents, or what the sets $A$ and $S$ are, but the notation is chosen to suggest that $a \in A$ is an action chosen by a decision maker, $s \in S$ is a state of the world, and $r(a, s)$ is the regret arising from taking action $a$ in state $s$. The only assumption I actually make on $r$, $A$, and $S$ is that all the maxima and minima I refer to exist.
Definition 1 defines minimax actions as those that achieve the minimum value of a function, namely the worst-case value over states for each action. There is no room to relax the internal maximisation, as we need a concrete value to minimise over. For now, we just ignore the inner optimisation and approximate only the outer one.
**Relaxation 1:** Let $\varepsilon \ge 0$ be an approximation threshold. An action $a \in A$ is approximinimax(1) at resolution $\varepsilon$ if
$$a \in \arg\min^{\varepsilon}_{a' \in A} \max_{s' \in S} r(a', s').$$
We denote the set of approximinimax(1) actions at resolution $\varepsilon$ by $A^{(1)}_{\varepsilon}$.
Definition 2 defines minimax actions as those that are part of a Nash
equilibrium of a two-player zero-sum game, where the agent minimises the
function and the adversary maximises it. We relax this definition by
replacing the Nash equilibrium condition with an approximate
Nash equilibrium.
**Relaxation 2:** Let $\varepsilon, \delta \ge 0$ be approximation thresholds. An action $a \in A$ is approximinimax(2) at resolution $(\varepsilon, \delta)$ if there exists $s \in S$ such that both
$$s \in \arg\max^{\delta}_{s' \in S} r(a, s') \quad\text{and}\quad a \in \arg\min^{\varepsilon}_{a' \in A} r(a', s).$$
We denote the set of approximinimax(2) actions at resolution $(\varepsilon, \delta)$ by $A^{(2)}_{\varepsilon, \delta}$.
This definition clearly allows us to relax the maximum as well as the minimum. However, at a given resolution, not all games have approximate Nash equilibria; in such cases, the set of approximinimax(2) actions is empty. In contrast, approximinimax(1) actions exist for any $r$, $A$, and $S$, as long as the maxima and minima exist. Intuitively, we shouldn't need there to be an equilibrium in order to talk about minimising the worst possible response.
Definition 3 is designed to overcome the shortcomings of each of definitions 1 and 2. There, we defined minimax actions as minimisers of a function involving a concrete max map, a deterministic function from actions to worst-case states. We can relax this definition by relaxing both the maximisation criterion in the max map and the minimisation criterion in the definition of minimax actions.
**Relaxation 3.1:** Let $\delta \ge 0$ be an approximation threshold. An approximate max map at resolution $\delta$ is any function $\sigma : A \to S$ such that for all $a \in A$,
$$\sigma(a) \in \arg\max^{\delta}_{s' \in S} r(a, s').$$
**Relaxation 3.2:** Let $\varepsilon, \delta \ge 0$ be approximation thresholds. Let $\sigma$ be any approximate max map at resolution $\delta$. An action $a \in A$ is approximinimax($3\sigma$) at resolution $\varepsilon$ if
$$a \in \arg\min^{\varepsilon}_{a' \in A} r(a', \sigma(a')).$$
We denote the set of approximinimax($3\sigma$) actions at resolution $\varepsilon$ by $A^{(3\sigma)}_{\varepsilon}$.
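To make the three relaxations concrete, here is a sketch in Python for a small finite game (the payoff table `r` and all helper names are invented for illustration):

```python
# A toy finite game to illustrate the three relaxed solution sets.
# r[a][s] is the regret of action a in state s.
r = {
    "a1": {"s1": 0.0, "s2": 3.0},
    "a2": {"s1": 2.0, "s2": 1.0},
}
A = list(r)
S = ["s1", "s2"]

def approximinimax1(eps):
    # Actions whose worst-case regret is within eps of the minimax value.
    worst = {a: max(r[a][s] for s in S) for a in A}
    best = min(worst.values())
    return {a for a in A if worst[a] <= best + eps}

def approximinimax2(eps, delta):
    # Actions that take part in an approximate equilibrium (a, s).
    found = set()
    for a in A:
        for s in S:
            near_max = r[a][s] >= max(r[a][t] for t in S) - delta
            near_min = r[a][s] <= min(r[b][s] for b in A) + eps
            if near_max and near_min:
                found.add(a)
    return found

def approximinimax3(sigma, eps):
    # Actions that approximately minimise r(a, sigma(a)).
    vals = {a: r[a][sigma[a]] for a in A}
    best = min(vals.values())
    return {a for a in A if vals[a] <= best + eps}

# An exact max map (resolution 0), represented as a dict.
sigma = {a: max(S, key=lambda s: r[a][s]) for a in A}
```

In this game, `approximinimax1(0.0)` and `approximinimax3(sigma, 0.0)` both return `{"a2"}`, while `approximinimax2(0.0, 0.0)` is empty: the game has no exact equilibrium of the kind definition 2 requires. The duality gap here is $\Delta = 2 - 1 = 1$, and indeed `approximinimax2(1.0, 1.0)` recovers `{"a2"}`.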
… and their relations
Given $\varepsilon = \delta = 0$, we recover definitions 1, 2, and 3 from the previous note, and therefore the results of propositions 1 through 6. Suppose for simplicity that $A^{(2)}_{0,0}$ is non-empty (equivalently, that an equilibrium exists). Let $\sigma$ be any approximate max map at resolution 0 (that is, an exact max map). Then we have
$$A^{(1)}_{0} = A^{(2)}_{0,0} = A^{(3\sigma)}_{0}.$$
It remains to discover what relations hold between these sets in the case where $\varepsilon, \delta > 0$. In this section, we derive the following relations.
**Theorem 1:** Let $\varepsilon, \delta, \eta \ge 0$ be approximation thresholds. Let $\sigma : A \to S$ be any approximate max map at resolution $\eta$. Let
$$\Delta = \min_{a' \in A} \max_{s' \in S} r(a', s') - \max_{s' \in S} \min_{a' \in A} r(a', s').$$
Then we have the following relations.

1. $A^{(2)}_{\varepsilon,\delta} \subseteq A^{(1)}_{\varepsilon+\delta}$.
2. $A^{(1)}_{\varepsilon} \subseteq A^{(2)}_{\varepsilon+\Delta,\,\varepsilon+\Delta}$.
3. $A^{(1)}_{\varepsilon} \subseteq A^{(3\sigma)}_{\varepsilon+\eta}$.
4. $A^{(3\sigma)}_{\varepsilon} \subseteq A^{(1)}_{\varepsilon+\eta}$.
5. $A^{(3\sigma)}_{\varepsilon} \subseteq A^{(2)}_{\varepsilon+\eta+\Delta,\,\varepsilon+\eta+\Delta}$.
6. $A^{(2)}_{\varepsilon,\delta} \subseteq A^{(3\sigma)}_{\varepsilon+\delta+\eta}$.
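Before the proofs, here is a brute-force numeric spot-check of relation 1 on random finite games (a sketch; the enumeration below is my own, not part of the formal development):

```python
import random

# Spot-check of relation 1 on random finite games: every
# approximinimax(2) action at resolution (eps, delta) is
# approximinimax(1) at resolution eps + delta.
random.seed(1)
eps, delta = 0.1, 0.2
for trial in range(200):
    nA, nS = 3, 4
    r = [[random.random() for _ in range(nS)] for _ in range(nA)]
    worst = [max(row) for row in r]
    minimax = min(worst)
    # Approximinimax(1) actions at resolution eps + delta.
    A1 = {a for a in range(nA) if worst[a] <= minimax + eps + delta}
    # Approximinimax(2) actions at resolution (eps, delta).
    A2 = {
        a
        for a in range(nA)
        for s in range(nS)
        if r[a][s] >= worst[a] - delta
        and r[a][s] <= min(r[b][s] for b in range(nA)) + eps
    }
    assert A2 <= A1  # relation 1 holds on this game
```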
We prove each part of this theorem after a brief detour to explain the role of $\Delta$ in the above statement.
Aside: Approximate equilibria of zero-sum games
To generalise propositions 2 and 5, we need an appropriate generalisation of lemma 2. The following lemma shows that if $A^{(2)}_{\varepsilon,\delta}$ is non-empty, then the max–min inequality is approximately an equality (with approximation threshold $\varepsilon + \delta$). Note that this assumption is weaker than the assumption of lemma 2, namely that $A^{(2)}_{0,0}$ is non-empty.
**Lemma 3:** Suppose there exists $(a, s) \in A \times S$ such that $s \in \arg\max^{\delta}_{s' \in S} r(a, s')$ and $a \in \arg\min^{\varepsilon}_{a' \in A} r(a', s)$. Then
$$\max_{s' \in S} \min_{a' \in A} r(a', s') \ge \min_{a' \in A} \max_{s' \in S} r(a', s') - \delta - \varepsilon.$$

*Proof.*
$$\begin{aligned}
\max_{s' \in S} \min_{a' \in A} r(a', s')
&\ge \min_{a' \in A} r(a', s) && \text{(by definition of max)} \\
&\ge r(a, s) - \varepsilon && \text{(by definition of } a\text{)} \\
&\ge \max_{s' \in S} r(a, s') - \delta - \varepsilon && \text{(by definition of } s\text{)} \\
&\ge \min_{a' \in A} \max_{s' \in S} r(a', s') - \delta - \varepsilon. && \text{(by definition of min)}
\end{aligned}$$
□
Compare this proof to that of lemma 2 in the previous note. Observe that we used the same argument, except that instead of invoking the optimality of $a$ and $s$, we invoke their approximate optimality and introduce a compensatory $\varepsilon$ or $\delta$. The rest of the proofs in this note are generalisations of the proofs in the previous note along very similar lines.
We could proceed under the assumption that some set of approximate equilibria is non-empty. However, we'll get a more precise bound if we dig a little deeper into what's going on here. The proof of lemma 3 holds for every approximate equilibrium $(a, s)$ and every $\varepsilon$ and $\delta$. The tightest bound comes from using the approximate equilibrium that is closest to being an exact equilibrium among all approximate equilibria that are available. The following lemma shows that this 'tightest bound' is as tight as possible, because it captures the difference between $\max_{s' \in S} \min_{a' \in A} r(a', s')$ and $\min_{a' \in A} \max_{s' \in S} r(a', s')$.
**Lemma 4:** Let
$$\Delta = \min_{a' \in A} \max_{s' \in S} r(a', s') - \max_{s' \in S} \min_{a' \in A} r(a', s').$$
There exist approximation thresholds $\varepsilon, \delta \ge 0$, an action $a \in A$, and a state $s \in S$ such that (1) $a \in \arg\min^{\varepsilon}_{a' \in A} r(a', s)$; (2) $s \in \arg\max^{\delta}_{s' \in S} r(a, s')$; and (3) $\varepsilon + \delta = \Delta$.

*Proof.* Let $a \in \arg\min_{a' \in A} \max_{s' \in S} r(a', s')$ and $s \in \arg\max_{s' \in S} \min_{a' \in A} r(a', s')$. Let $\varepsilon = r(a, s) - \min_{a' \in A} r(a', s)$ and $\delta = \max_{s' \in S} r(a, s') - r(a, s)$. We immediately have (1) and (2). For (3), observe
$$\begin{aligned}
\varepsilon + \delta
&= \max_{s' \in S} r(a, s') - \min_{a' \in A} r(a', s) && \text{(by definition of } \varepsilon, \delta\text{)} \\
&= \min_{a' \in A} \max_{s' \in S} r(a', s') - \max_{s' \in S} \min_{a' \in A} r(a', s') && \text{(by definition of } a, s\text{)} \\
&= \Delta. && \text{(by definition of } \Delta\text{)}
\end{aligned}$$
□
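The construction in this proof is easy to check numerically. The following sketch (my own, for illustration) draws random finite games, picks $a$ and $s$ as in the proof, and confirms that the resulting $\varepsilon + \delta$ equals $\Delta$ up to floating point:

```python
import random

# Check the construction of lemma 4 on random finite games: with a
# minimax action a and a maximin state s, the eps and delta defined in
# the proof sum exactly to the duality gap Delta.
random.seed(0)
for trial in range(100):
    nA, nS = 4, 5
    r = [[random.random() for _ in range(nS)] for _ in range(nA)]
    minimax = min(max(row) for row in r)
    maximin = max(min(r[i][j] for i in range(nA)) for j in range(nS))
    Delta = minimax - maximin
    # a achieves the minimax value; s achieves the maximin value.
    a = min(range(nA), key=lambda i: max(r[i]))
    s = max(range(nS), key=lambda j: min(r[i][j] for i in range(nA)))
    eps = r[a][s] - min(r[i][s] for i in range(nA))
    delta = max(r[a]) - r[a][s]
    assert eps >= 0 and delta >= 0
    assert abs((eps + delta) - Delta) < 1e-9
```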
Intuitively, by expressing parts 2 and 5 of theorem 1 in terms of $\Delta$, we bypass the need to assume that a particular set of approximinimax(2) actions is non-empty. We instead connect our results to the pair of approximation thresholds with the smallest sum such that an approximate equilibrium exists. In particular, when $\varepsilon, \eta \ll \Delta$, we need to relax the approximation thresholds a lot to contain $A^{(1)}_{\varepsilon}$ or $A^{(3\sigma)}_{\varepsilon}$, since otherwise the approximinimax(2) set will be empty!
Proof of theorem 1 (six parts)
The proofs for each part of this theorem follow. They are very similar to the proofs of propositions 1 through 6 from the previous note. Each proof follows essentially the same structure, except for introducing $\varepsilon$, $\delta$, and $\eta$ terms when we make use of the assumption that an argument is an approximate optimiser rather than an exact optimiser.
*Proof (part 1).* Suppose $a \in A^{(2)}_{\varepsilon,\delta}$, that is, there exists $s \in S$ such that both $s \in \arg\max^{\delta}_{s' \in S} r(a, s')$ and $a \in \arg\min^{\varepsilon}_{a' \in A} r(a', s)$. We want to show that $a \in A^{(1)}_{\varepsilon+\delta}$, that is,
$$a \in \arg\min^{\varepsilon+\delta}_{a' \in A} \max_{s' \in S} r(a', s').$$
Observe:
$$\begin{aligned}
\max_{s' \in S} r(a, s')
&\le r(a, s) + \delta && (s \in \arg\max^{\delta}_{s' \in S} r(a, s')) \\
&\le \min_{a' \in A} r(a', s) + \varepsilon + \delta && (a \in \arg\min^{\varepsilon}_{a' \in A} r(a', s)) \\
&\le \max_{s' \in S} \min_{a' \in A} r(a', s') + \varepsilon + \delta && \text{(by definition of max)} \\
&\le \min_{a' \in A} \max_{s' \in S} r(a', s') + \varepsilon + \delta. && \text{(max–min inequality, lemma 1)}
\end{aligned}$$
□
*Proof (part 2).* Suppose $a \in A^{(1)}_{\varepsilon}$, that is, $a \in \arg\min^{\varepsilon}_{a' \in A} \max_{s' \in S} r(a', s')$. We want to show that $a \in A^{(2)}_{\varepsilon+\Delta,\,\varepsilon+\Delta}$, that is, there exists $s \in S$ such that both
$$a \in \arg\min^{\varepsilon+\Delta}_{a' \in A} r(a', s) \quad\text{and}\quad s \in \arg\max^{\varepsilon+\Delta}_{s' \in S} r(a, s').$$
Let's start by putting $s \in \arg\max_{s' \in S} \min_{a' \in A} r(a', s')$. Note that here we are proving the existence of such an $s$, so there is no need to furnish an approximate optimiser; that would just lead to a weaker result.

Unlike for proposition 2, let's prove the two conditions one at a time, so that we can more easily keep track of the approximations involved. First, the condition on $a$:
$$\begin{aligned}
r(a, s)
&\le \max_{s' \in S} r(a, s') && \text{(by definition of max)} \\
&\le \min_{a' \in A} \max_{s' \in S} r(a', s') + \varepsilon && (a \in \arg\min^{\varepsilon}_{a' \in A} \max_{s' \in S} r(a', s')) \\
&= \max_{s' \in S} \min_{a' \in A} r(a', s') + \Delta + \varepsilon && \text{(by definition of } \Delta\text{)} \\
&= \min_{a' \in A} r(a', s) + \Delta + \varepsilon. && (s \in \arg\max_{s' \in S} \min_{a' \in A} r(a', s'))
\end{aligned}$$
Now for the condition on $s$, we reason backwards through the same chain of terms:
$$\begin{aligned}
r(a, s)
&\ge \min_{a' \in A} r(a', s) && \text{(by definition of min)} \\
&= \max_{s' \in S} \min_{a' \in A} r(a', s') && (s \in \arg\max_{s' \in S} \min_{a' \in A} r(a', s')) \\
&= \min_{a' \in A} \max_{s' \in S} r(a', s') - \Delta && \text{(by definition of } \Delta\text{)} \\
&\ge \max_{s' \in S} r(a, s') - \varepsilon - \Delta. && (a \in \arg\min^{\varepsilon}_{a' \in A} \max_{s' \in S} r(a', s'))
\end{aligned}$$
□
For the remaining parts, recall that $\sigma$ is assumed to be an arbitrary approximate max map at resolution $\eta$, that is, for all $a \in A$, we have $\sigma(a) \in \arg\max^{\eta}_{s' \in S} r(a, s')$.
*Proof (part 3).* Suppose $a \in A^{(1)}_{\varepsilon}$, that is, $a \in \arg\min^{\varepsilon}_{a' \in A} \max_{s' \in S} r(a', s')$. We want to show that $a \in A^{(3\sigma)}_{\varepsilon+\eta}$, that is,
$$a \in \arg\min^{\varepsilon+\eta}_{a' \in A} r(a', \sigma(a')).$$
Observe
$$\begin{aligned}
r(a, \sigma(a))
&\le \max_{s' \in S} r(a, s') && \text{(by definition of max)} \\
&\le \min_{a' \in A} \max_{s' \in S} r(a', s') + \varepsilon && (a \in \arg\min^{\varepsilon}_{a' \in A} \max_{s' \in S} r(a', s')) \\
&\le \min_{a' \in A} \left( r(a', \sigma(a')) + \eta \right) + \varepsilon && \text{(by definition of } \sigma\text{, min)} \\
&= \min_{a' \in A} r(a', \sigma(a')) + \eta + \varepsilon. && (\eta \text{ constant wrt } a')
\end{aligned}$$
□
*Proof (part 4).* Suppose $a \in A^{(3\sigma)}_{\varepsilon}$, that is, $a \in \arg\min^{\varepsilon}_{a' \in A} r(a', \sigma(a'))$. We want to show that $a \in A^{(1)}_{\varepsilon+\eta}$, that is,
$$a \in \arg\min^{\varepsilon+\eta}_{a' \in A} \max_{s' \in S} r(a', s').$$
Observe
$$\begin{aligned}
\max_{s' \in S} r(a, s')
&\le r(a, \sigma(a)) + \eta && \text{(by definition of } \sigma\text{)} \\
&\le \min_{a' \in A} r(a', \sigma(a')) + \varepsilon + \eta && (a \in \arg\min^{\varepsilon}_{a' \in A} r(a', \sigma(a'))) \\
&\le \min_{a' \in A} \max_{s' \in S} r(a', s') + \varepsilon + \eta. && \text{(by definition of max, min)}
\end{aligned}$$
□
Parts 5 and 6 follow from the above bounds. I hypothesised in the
previous note that by following a direct proof, we might get a tighter
bound. This was not the case. The proofs of propositions 5 and 6 only
eliminated trivial redundancies that didn’t affect the bound. For
completeness, I include direct proofs of parts 5 and 6 below.
*Proof (part 5).* Suppose $a \in A^{(3\sigma)}_{\varepsilon}$, that is, $a \in \arg\min^{\varepsilon}_{a' \in A} r(a', \sigma(a'))$. We want to show that $a \in A^{(2)}_{\varepsilon+\eta+\Delta,\,\varepsilon+\eta+\Delta}$, that is, there exists $s \in S$ such that both
$$a \in \arg\min^{\varepsilon+\eta+\Delta}_{a' \in A} r(a', s) \quad\text{and}\quad s \in \arg\max^{\varepsilon+\eta+\Delta}_{s' \in S} r(a, s').$$
Like in part 2, let's start by putting $s \in \arg\max_{s' \in S} \min_{a' \in A} r(a', s')$. Unlike for proposition 5, let's prove the two conditions one at a time, so that we can more easily keep track of the approximations involved. First, the condition on $a$:
$$\begin{aligned}
r(a, s)
&\le \max_{s' \in S} r(a, s') && \text{(by definition of max)} \\
&\le r(a, \sigma(a)) + \eta && \text{(by definition of } \sigma\text{)} \\
&\le \min_{a' \in A} r(a', \sigma(a')) + \varepsilon + \eta && (a \in \arg\min^{\varepsilon}_{a' \in A} r(a', \sigma(a'))) \\
&\le \min_{a' \in A} \max_{s' \in S} r(a', s') + \varepsilon + \eta && \text{(by definition of max, min)} \\
&= \max_{s' \in S} \min_{a' \in A} r(a', s') + \Delta + \varepsilon + \eta && \text{(by definition of } \Delta\text{)} \\
&= \min_{a' \in A} r(a', s) + \Delta + \varepsilon + \eta. && (s \in \arg\max_{s' \in S} \min_{a' \in A} r(a', s'))
\end{aligned}$$
Now for the condition on $s$, we reason backwards through the same chain of terms:
$$\begin{aligned}
r(a, s)
&\ge \min_{a' \in A} r(a', s) && \text{(by definition of min)} \\
&= \max_{s' \in S} \min_{a' \in A} r(a', s') && (s \in \arg\max_{s' \in S} \min_{a' \in A} r(a', s')) \\
&= \min_{a' \in A} \max_{s' \in S} r(a', s') - \Delta && \text{(by definition of } \Delta\text{)} \\
&\ge \min_{a' \in A} r(a', \sigma(a')) - \Delta && \text{(by definition of max, min)} \\
&\ge r(a, \sigma(a)) - \varepsilon - \Delta && (a \in \arg\min^{\varepsilon}_{a' \in A} r(a', \sigma(a'))) \\
&\ge \max_{s' \in S} r(a, s') - \eta - \varepsilon - \Delta. && \text{(by definition of } \sigma\text{)}
\end{aligned}$$
□
*Proof (part 6).* Suppose $a \in A^{(2)}_{\varepsilon,\delta}$, that is, there exists $s \in S$ such that both $s \in \arg\max^{\delta}_{s' \in S} r(a, s')$ and $a \in \arg\min^{\varepsilon}_{a' \in A} r(a', s)$. We want to show $a \in A^{(3\sigma)}_{\varepsilon+\delta+\eta}$, that is,
$$a \in \arg\min^{\varepsilon+\delta+\eta}_{a' \in A} r(a', \sigma(a')).$$
Observe
$$\begin{aligned}
r(a, \sigma(a))
&\le \max_{s' \in S} r(a, s') && \text{(by definition of max)} \\
&\le r(a, s) + \delta && (s \in \arg\max^{\delta}_{s' \in S} r(a, s')) \\
&\le \min_{a' \in A} r(a', s) + \varepsilon + \delta && (a \in \arg\min^{\varepsilon}_{a' \in A} r(a', s)) \\
&\le \max_{s' \in S} \min_{a' \in A} r(a', s') + \varepsilon + \delta && \text{(by definition of max)} \\
&\le \min_{a' \in A} \max_{s' \in S} r(a', s') + \varepsilon + \delta && \text{(max–min inequality, lemma 1)} \\
&\le \min_{a' \in A} \left( r(a', \sigma(a')) + \eta \right) + \varepsilon + \delta && \text{(by definition of } \sigma\text{, min)} \\
&= \min_{a' \in A} r(a', \sigma(a')) + \eta + \varepsilon + \delta. && (\eta \text{ constant wrt } a')
\end{aligned}$$
□
Conclusion
There you have it: three different approximate relaxations of the minimax objective, and their approximate equivalence. These relations are not as strong as in the exact case, since the theorem does not establish equality except when $\varepsilon = \delta = \eta = 0$. However, we do recover an interesting network of relations establishing that we can convert between the relaxed objectives modulo a linear deterioration in approximation quality. This tells us that the families of solution sets (for varying approximation thresholds) are closely related, satisfying something like a bi-Lipschitz equivalence property. Importantly, in the limit as $\varepsilon, \delta, \eta \to 0$, the approximate solution sets get closer and closer to being actually equal.
I have not thought very much about whether these relations are optimally tight. However, I didn't see any places where the proofs could be streamlined and fewer terms introduced, without imposing further assumptions on $r$, $A$, and $S$.
That’s all for now—it’s time to get back to work using these
definitions to analyse the robustness of minimax regret training
methods!
Edmund Lau points out that there are other ways of 'relaxing' the notion of optimisation. For example, instead of taking the above approach of relaxation through discretisation, we could pursue a probabilistic relaxation whereby we cast exact optimisation as sampling from a probability distribution concentrated on the set of minimisers, with density given by
$$p^{\star}(x) \propto \left[\!\left[ x \in \arg\min_{x' \in X} f(x') \right]\!\right],$$
and we relax this procedure by instead sampling from a Boltzmann distribution with inverse temperature $\beta \ge 0$, with density given by
$$p_{\beta}(x) \propto \exp(-\beta f(x)).$$
This is the so-called softmin operation (if we were maximising, we'd have the better-known softmax). We recover exact optimisation in the limit $\beta \to \infty$. I haven't considered what a probabilistic relaxation of minimax might look like, but I suppose such a concept has probably been studied in game theory.↩︎