far.in.net


Approximate minimax, three ways

Tuesday, February 25th, 2025.

This is a short technical note written in the process of working out some definitions for a paper about using an approximate minimax regret training objective to mitigate goal misgeneralisation in advanced deep reinforcement learning systems. I define three different approximate relaxations of the minimax objective, and show that the definitions are related under certain assumptions.

Thanks to my colleagues Karim Abdel Sadek and Michael Dennis for helping me understand the necessary game theory and decision theory to figure this out.

Background

This note is a sequel to minimax, three ways. See that note for additional background. This note references the results and definitions from the previous note by number. In this note, I closely follow those definitions and propositions from the previous note in order to define three approximate relaxations of the minimax objective and study their relationships.

What do I mean by an approximate relaxation of an objective?1 First let me formalise optimisation. Suppose we have a function f:XRf : \mathcal{X} \to \mathbb{R}. To minimise this function is to find an argument xXx \in \mathcal{X} that achieves the minimum possible value f(x)=argminxXf(x)f(x) = \operatornamewithlimits{\vphantom{arg\,}min}_{x' \in \mathcal{X}} f(x') where the RHS is assumed to exist. We define the set of minimisers arg minxXf(x)={xX|f(x)=argminxXf(x)}. \operatornamewithlimits{arg\,min}_{x'\in\mathcal{X}} f(x') = \left\{ x \in \mathcal{X} \,\middle|\, f(x) = \operatornamewithlimits{\vphantom{arg\,}min}_{x' \in \mathcal{X}} f(x') \right\}.

This criterion is very strict, making it difficult to find such minimisers in practice. It’s slightly more realistic to assume that we might be able to find some xXx \in \mathcal{X} that is within some finite approximation threshold ε0\varepsilon\geq 0 of the minimum possible value. We thus define the set of approximate minimisers argεminxXf(x)={xX|f(x)argminxXf(x)+ε}. \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_{x'\in\mathcal{X}} f(x') = \left\{ x \in \mathcal{X} \,\middle|\, f(x) \leq \operatornamewithlimits{\vphantom{arg\,}min}_{x' \in \mathcal{X}} f(x') + \varepsilon \right\}. We recover exact minimisation with ε=0\varepsilon= 0.

Similarly, we define maximisers arg maxxXf(x)={xX|f(x)=argmaxxXf(x)}\operatornamewithlimits{arg\,max}_{x'\in\mathcal{X}} f(x') = \left\{x \in \mathcal{X}\,\middle|\,f(x) = \operatornamewithlimits{\vphantom{arg\,}max}_{x' \in \mathcal{X}} f(x')\right\}, and approximate maximisers given approximation threshold δ0\delta \geq 0, argδmaxxXf(x)={xX|f(x)argmaxxXf(x)δ}. \operatornamewithlimits{\text{arg\,$\delta$\,max}}_{x'\in\mathcal{X}} f(x') = \left\{ x \in \mathcal{X} \,\middle|\, f(x) \geq \operatornamewithlimits{\vphantom{arg\,}max}_{x' \in \mathcal{X}} f(x') - \delta \right\}.

Three approximate relaxations…

In the previous note, we explored three different definitions of the minimax objective: minimax(1), minimax(2), and minimax(3σ\sigma). The idea was that each definition would suggest a different approximate relaxation. Here I state the relaxed definitions.

As in the previous note, suppose we have a function r:A×SRr : \mathcal{A}\times \mathcal{S}\to \mathbb{R}. It doesn’t matter what the function represents, or what the sets A\mathcal{A} and S\mathcal{S} are, but the notation is chosen to suggest that aAa \in \mathcal{A} is an action chosen by a decision maker, sSs \in \mathcal{S} is a state of the world, and r(a,s)r(a, s) is regret arising from taking action aa in state ss. The only assumption I actually make on rr, A\mathcal{A}, and S\mathcal{S} is that all the maxima and minima I refer to exist.

Definition 1 defines minimax actions as those that achieve the minimum value of a function, and that function is given by the maximum value given a choice of state that depends on the action. There is no room to relax the internal maximisation, as we need a concrete value to minimise over. For now, we just ignore the inner optimisation and we only approximate the outer one.

Relaxation 1: Let ε0\varepsilon\geq 0 be an approximation threshold. An action aAa \in \mathcal{A} is approximinimax(1) at resolution ε\varepsilon if aargεminaAargmaxsSr(a,s). a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s'). We denote the set of approximinimax(1) actions at resolution ε\varepsilon by Aε(1)\mathcal{A}^{(1)}_{\varepsilon}.

Definition 2 defines minimax actions as those that are part of a Nash equilibrium of a two-player zero-sum game, where the agent minimises the function and the adversary maximises it. We relax this definition by replacing the Nash equilibrium condition with an approximate Nash equilibrium.

Relaxation 2: Let ε,δ0\varepsilon, \delta \geq 0 be approximation thresholds. An action aAa \in \mathcal{A} is approximinimax(2) at resolution (ε,δ)(\varepsilon, \delta) if there exists sSs \in \mathcal{S} such that both sargδmaxsSr(a,s)andaargεminaAr(a,s). s \in \operatornamewithlimits{\text{arg\,$\delta$\,max}}_ {s'\in\mathcal{S}}r(a, s') \quad\textit{and}\quad a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', s). We denote the set of approximinimax(2) actions at resolution (ε,δ)(\varepsilon,\delta) by Aε,δ(2)\mathcal{A}^{(2)}_{\varepsilon,\delta}.

This definition clearly allows us to relax the maximum as well as the minimum, however, at a given resolution, not all games have approximate Nash equilibria. In such cases, minimax(2) is not well defined. In contrast, minimax(1) is well-defined for any rr, A\mathcal{A}, S\mathcal{S} as long as the maxima and minima exist. Intuitively, we shouldn’t need there to be an equilibrium for us to be able to talk about minimising the worst possible response.

Definition 3 is designed to overcome the shortcomings of each of definitions 1 and 2. We defined minimax actions as minima of a function involving a concrete max map, a deterministic function actions to worst-case states. We can relax this definition by relaxing both the maximisation criterion in the max map and the minimisation criterion in the definition of minimax actions.

Relaxation 3.1: Let δ0\delta \geq 0 be an approximation threshold. An approximate max map at resolution δ\delta is any function σ:AS\sigma: \mathcal{A}\to \mathcal{S} such that for all aAa \in \mathcal{A}, σ(a)argδmaxsSr(a,s). \sigma(a) \in \operatornamewithlimits{\text{arg\,$\delta$\,max}}_ {s'\in\mathcal{S}}r(a, s') .

Relaxation 3.2: Let ε,δ0\varepsilon, \delta \geq 0 be approximation thresholds. Let σ\sigma be any approximate max map at resolution δ\delta. An action aAa \in \mathcal{A} is approximinimax(3σ\sigma) at resolution ε\varepsilon if aargεminaAr(a,σ(a)). a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', \sigma(a')) . We denote the set of approximinimax(3σ\sigma) actions at resolution ε\varepsilon as Aε(3σ)\mathcal{A}^{(3\sigma)}_{\varepsilon}.

… and their relations

Given ε=δ=0\varepsilon= \delta = 0, we recover definitions 1, 2, and 3 from the previous note, and therefore the results of propositions 1 through 6. Suppose for simplicity that A0,0(2)\mathcal{A}^{(2)}_{0,0} is non-empty (equivalently, any equilibrium exists). Let σ\sigma be any approximate max map at resolution 0. Then we have A0(1)=A0,0(2)=A0(3σ). \mathcal{A}^{(1)}_{0} = \mathcal{A}^{(2)}_{0,0} = \mathcal{A}^{(3\sigma)}_{0} .

It remains to discover what relations hold between these sets in the case where ε,δ>0\varepsilon, \delta > 0. In this section, we derive the following relations.

Theorem 1: Let ε,δ,η0\varepsilon, \delta, \eta \geq 0 be approximation thresholds. Let σ:AS\sigma: \mathcal{A}\to \mathcal{S} be any approximate max map at resolution η\eta. Let Δ=argminaAargmaxsSr(a,s)argmaxsSargminaAr(a,s). \Delta = \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a',s') - \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a',s') . Then we have the following relations.

  1. Aε,δ(2)Aε+δ(1)\mathcal{A}^{(2)}_{\varepsilon,\delta} \subseteq \mathcal{A}^{(1)}_{\varepsilon+\delta}.

  2. Aε(1)Aε+Δ,ε+Δ(2)\mathcal{A}^{(1)}_{\varepsilon} \subseteq \mathcal{A}^{(2)}_{\varepsilon+\Delta,\varepsilon+\Delta}.

  3. Aε(1)Aε+η(3σ)\mathcal{A}^{(1)}_{\varepsilon} \subseteq \mathcal{A}^{(3\sigma)}_{\varepsilon+\eta}.

  4. Aε(3σ)Aε+η(1)\mathcal{A}^{(3\sigma)}_{\varepsilon} \subseteq \mathcal{A}^{(1)}_{\varepsilon+\eta}.

  5. Aε(3σ)Aε+η+Δ,ε+η+Δ(2)\mathcal{A}^{(3\sigma)}_{\varepsilon} \subseteq \mathcal{A}^{(2)}_{\varepsilon+\eta+\Delta,\varepsilon+\eta+\Delta}.

  6. Aε,δ(2)Aε+δ+η(3σ)\mathcal{A}^{(2)}_{\varepsilon,\delta} \subseteq \mathcal{A}^{(3\sigma)}_{\varepsilon+\delta+\eta}.

We prove each part of this theorem after a brief detour to explain the role of Δ\Delta in the above statement.

Aside: Approximate equilibria of zero-sum games

To generalise propositions 2 and 5, we need an appropriate generalisation of lemma 2. The following lemma shows that if Aε,δ(2)\mathcal{A}^{(2)}_{\varepsilon,\delta} is non-empty, then the max–min inequality is approximately an equality (with approximation threshold ε+δ\varepsilon+ \delta). Note, this assumption is weaker than the assumption of lemma 2, that A0,0(2)\mathcal{A}^{(2)}_{0,0} is non-empty.

Lemma 3. Suppose there exists (a,s)A×S(a, s) \in \mathcal{A}\times \mathcal{S} such that sargδmaxsSr(a,s)s \in \operatornamewithlimits{\text{arg\,$\delta$\,max}}_ {s'\in\mathcal{S}}r(a, s') and aargεminaAr(a,s)a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', s). Then argmaxsSargminaAr(a,s)argminaAargmaxsSr(a,s)δε. \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') \geq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') - \delta - \varepsilon .

Proof. argmaxsSargminaAr(a,s)argminaAr(a,s)(by definition of max)r(a,s)ε(by definition of a)argmaxsSr(a,s)δε(by definition of s)argminaAargmaxsSr(a,s)δε.(by definition of min)\begin{align*} \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') &\geq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s) &\text{(by definition of max)} \\&\geq r(a, s) - \varepsilon &\text{(by definition of $a$)} \\&\geq \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') - \delta - \varepsilon &\text{(by definition of $s$)} \\&\geq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') - \delta - \varepsilon. &\text{(by definition of min)} \tag*{$\square$} \end{align*}

Compare this proof to that of lemma 2 in the previous note. Observe that we used the same argument, except that instead of invoking the optimality of aa and ss, we invoke the approximate optimality and introduce a compensatory ε\varepsilon or δ\delta. The rest of the proofs in this note are generalisations of the proofs in the previous note along very similar lines.

We could proceed the assumption that some set of approximate equilibria is non-empty. However, we’ll get a more precise bound if we dig a little deeper into what’s going on here. The proof of lemma 3 holds for every approximate equilibrium (a,s)(a, s) and every ε\varepsilon and δ\delta. The tightest bound comes from using the approximate equilibrium that is closest to being an exact equilibrium among all approximate equilibria that are available. The following lemma shows that this ‘tightest bound’ is as tight as possible, because it captures the difference between argmaxsSargminaAr(a,s)\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') and argminaAargmaxsSr(a,s)\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s').

Lemma 4: Let Δ=argminaAargmaxsSr(a,s)argmaxsSargminaAr(a,s). \Delta = \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a',s') - \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a',s') . There exist approximation thresholds ε,δ0\varepsilon, \delta \geq 0, an action aAa \in \mathcal{A}, and a state sSs \in \mathcal{S} such that (1) aargεminaAr(a,s)a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', s); (2) sargδmaxsSr(a,s)s \in \operatornamewithlimits{\text{arg\,$\delta$\,max}}_ {s'\in\mathcal{S}}r(a, s'); and (3) ε+δ=Δ\varepsilon+ \delta = \Delta.

Proof. Let aarg minaAargmaxsSr(a,s)a \in \operatornamewithlimits{arg\,min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') and sarg maxsSargminaAr(a,s)s \in \operatornamewithlimits{arg\,max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s'). Let ε=r(a,s)argminaAr(a,s)\varepsilon= r(a, s) - \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s) and δ=argmaxsSr(a,s)r(a,s)\delta = \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') - r(a, s). We immediately have (1) and (2). For (3), observe ε+δ=argmaxsSr(a,s)argminaAr(a,s)(by definition of εδ)=argminaAargmaxsSr(a,s)argmaxsSargminaAr(a,s)(by definition of as)=Δ.(by definition of Δ)\begin{align*} \varepsilon+ \delta % &= \left(r(a, s) - \min r(a', s)\right) % + \left(\max r(a, s') - r(a, s)\right) &= \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') - \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s) &\text{(by definition of $\varepsilon$, $\delta$)} \\&= \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') - \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') &\hspace{-4em}\text{(by definition of $a$, $s$)} \\&= \Delta. &\text{(by definition of $\Delta$)} \tag*{$\square$} \end{align*}

Intuitively, by expressing parts 2 and 5 of theorem 1 in terms of Δ\Delta, we bypass the need to assume that a particular set of approximinimax(2) actions is nonempty. We instead connect our results to the pair of approximation thresholds with the smallest sum such that an approximate equilibrium exists.

In particular, when ε,ηΔ\varepsilon,\eta \ll \Delta, we need to relax the approximation thresholds a lot to contain Aε(1)\mathcal{A}^{(1)}_{\varepsilon} or Aε(3σ)\mathcal{A}^{(3\sigma)}_{\varepsilon}, since otherwise approximinimax(2) will be empty!

Proof of theorem 1 (six parts)

The proofs for each part of this theorem follow. They are very similar to the proofs of propositions 1 through 6 from the previous note. Each proof follows essentially the same structure, except for introducing ε\varepsilon and δ\delta terms when we make use of the assumption that an argument is an approximate optimiser rather than an exact optimiser.

Proof (part 1). Suppose aAε,δ(2)a \in \mathcal{A}^{(2)}_{\varepsilon,\delta}, that is, there exists sSs \in \mathcal{S} such that both sargδmaxsSr(a,s)s \in \operatornamewithlimits{\text{arg\,$\delta$\,max}}_ {s'\in\mathcal{S}}r(a, s') and aargεminaAr(a,s)a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', s). We want to show that aAε+δ(1)a \in \mathcal{A}^{(1)}_{\varepsilon+\delta}, that is, aarg(ε+δ)minaAargmaxsSr(a,s).a \in \operatornamewithlimits{\text{arg\,$(\varepsilon{+}\delta)$\,min}}_ {a'\in\mathcal{A}} \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s'). Observe: argmaxsSr(a,s)r(a,s)+δ(sargδmaxsSr(a,s))argminaAr(a,s)+ε+δ(aargεminaAr(a,s))argmaxsSargminaAr(a,s)+ε+δ(by definition of max)argminaAargmaxsSr(a,s)+ε+δ.(max–min inequality, lemma 1)\begin{align*} \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') &\leq r(a, s) + \delta &\text{($\displaystyle s \in \operatornamewithlimits{\text{arg\,$\delta$\,max}}_ {s'\in\mathcal{S}}r(a,s')$)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s) + \varepsilon+ \delta &\text{($\displaystyle a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', s)$)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') + \varepsilon+ \delta &\text{(by definition of max)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') + \varepsilon+ \delta. &\text{(max--min inequality, lemma 1)} \tag*{$\square$} \end{align*}

Proof (part 2). Suppose aAε(1)a \in \mathcal{A}^{(1)}_{\varepsilon}, that is, aargεminaAargmaxsSr(a,s)a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s'). We want to show that aAε+Δ,ε+Δ(2)a \in \mathcal{A}^{(2)}_{\varepsilon+\Delta,\varepsilon+\Delta}, that is, there exists sSs \in \mathcal{S} such that both aarg(ε+Δ)minaAr(a,s)andsarg(ε+Δ)maxsSr(a,s). a \in \operatornamewithlimits{\text{arg\,$(\varepsilon{+}\Delta)$\,min}}_ {a'\in\mathcal{A}} r(a', s) \quad\text{and}\quad s \in \operatornamewithlimits{\text{arg\,$(\varepsilon{+}\Delta)$\,max}}_ {s'\in\mathcal{S}} r(a, s'). Let’s start by putting sarg maxsSargminaAr(a,s)s \in \operatornamewithlimits{arg\,max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s'). Note that here we are proving the existence of such an ss, so there is no need to furnish an approximate optimiser—that would just lead to a weaker result.

Unlike for proposition 2, let’s prove the two conditions one at a time, so that we can more easily keep track of the approximations involved. First, the condition on aa: r(a,s)argmaxsSr(a,s)(by definition of max)argminaAargmaxsSr(a,s)+ε(aargεminaAargmaxsSr(a,s))=argmaxsSargminaAr(a,s)+Δ+ε(by definition of Δ)=argminaAr(a,s)+Δ+ε.(sarg maxsSargminaAr(a,s))\begin{align*} r(a,s) &\leq \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') &\text{(by definition of max)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') + \varepsilon &\text{($\displaystyle a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s')$)} \\&= \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') + \Delta + \varepsilon &\text{(by definition of $\Delta$)} \\&= \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') + \Delta + \varepsilon. &\text{($\displaystyle s \in \operatornamewithlimits{arg\,max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a',s')$)} \end{align*} Now for the condition on ss, we reason backwards through the same chain of terms: r(a,s)argminaAr(a,s)(by definition of min)=argmaxsSargminaAr(a,s)(sarg maxsSargminaAr(a,s))=argminaAargmaxsSr(a,s)Δ(by definition of Δ)argmaxsSr(a,s)εΔ.(aarg minaAargmaxsSr(a,s))\begin{align*} r(a,s) &\geq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s) &\text{(by definition of min)} \\&= \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') &\text{($\displaystyle s \in \operatornamewithlimits{arg\,max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s')$)} \\&= \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') - \Delta &\text{(by definition of $\Delta$)} \\&\geq \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') - \varepsilon- \Delta. &\text{($\displaystyle a \in \operatornamewithlimits{arg\,min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a',s')$)} \tag*{$\square$} \end{align*}

For the remaining parts, recall that σ\sigma is assumed to be an arbitrary approximate max map at resolution η\eta, that is, for all aAa \in \mathcal{A}, we have σ(a)argδmaxsSr(a,s)\sigma(a) \in \operatornamewithlimits{\text{arg\,$\delta$\,max}}_ {s'\in\mathcal{S}}r(a, s').

Proof (part 3). Suppose aAε(1)a \in \mathcal{A}^{(1)}_{\varepsilon}, that is, aargεminaAargmaxsSr(a,s)a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s'). We want to show that aAε+η(3σ)a \in \mathcal{A}^{(3\sigma)}_{\varepsilon+\eta}, that is, aarg(ε+η)minaAr(a,σ(a)).a \in \operatornamewithlimits{\text{arg\,$(\varepsilon{+}\eta)$\,min}}_ {a'\in\mathcal{A}} r(a', \sigma(a')). Observe r(a,σ(a))argmaxsSr(a,s)(by definition of max)argminaAargmaxsSr(a,s)+ε(aargεminaAargmaxsSr(a,s))argminaA(r(a,σ(a))+η)+ε(by definition of σ, min)argminaAr(a,σ(a))+η+ε.(η constant wrt. a)\begin{align*} r(a, \sigma(a)) &\leq \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') &\text{(by definition of max)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') + \varepsilon &\text{($\displaystyle a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s')$)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\Bigl( r(a', \sigma(a')) + \eta \Bigr) + \varepsilon &\text{(by definition of $\sigma$, min)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', \sigma(a')) + \eta + \varepsilon. &\text{($\eta$ constant wrt. $a'$)} \tag*{$\square$} \end{align*}

Proof (part 4). Suppose aAε(3σ)a \in \mathcal{A}^{(3\sigma)}_{\varepsilon}, that is, aargεminaAr(a,σ(a))a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', \sigma(a')). We want to show that aAε+η(1)a \in \mathcal{A}^{(1)}_{\varepsilon+\eta}, that is, aarg(ε+η)minaAargmaxsSr(a,s).a \in \operatornamewithlimits{\text{arg\,$(\varepsilon{+}\eta)$\,min}}_ {a'\in\mathcal{A}} \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s'). Observe argmaxsSr(a,s)r(a,σ(a))+η(by definition of σ)argminaAr(a,σ(a))+ε+η(aargεminaAr(a,σ(a)))argminaAargmaxsSr(a,s)+ε+η.(by definition of max, min)\begin{align*} \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') &\leq r(a, \sigma(a)) + \eta &\text{(by definition of $\sigma$)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', \sigma(a')) + \varepsilon+ \eta &\text{($a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', \sigma(a'))$)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') + \varepsilon+ \eta. &\text{(by definition of max, min)} \tag*{$\square$} \end{align*}

Parts 5 and 6 follow from the above bounds. I hypothesised in the previous note that by following a direct proof, we might get a tighter bound. This was not the case. The proofs of propositions 5 and 6 only eliminated trivial redundancies that didn’t affect the bound. For completeness, I include direct proofs of parts 5 and 6 below.

Proof (part 5). Suppose aAε(3σ)a \in \mathcal{A}^{(3\sigma)}_{\varepsilon}, that is, aargεminaAr(a,σ(a))a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', \sigma(a')). We want to show that aAε+η+Δ,ε+η+Δ(2)a \in \mathcal{A}^{(2)}_{\varepsilon+\eta+\Delta,\varepsilon+\eta+\Delta}, that is, there exists sSs \in \mathcal{S} such that both aarg(ε+η+Δ)minaAr(a,s).andsarg(ε+η+Δ)maxsSr(a,s) a \in \operatornamewithlimits{\text{arg\,$(\varepsilon{+}\eta{+}\Delta)$\,min}}_ {a'\in\mathcal{A}} r(a', s). \quad\text{and}\quad s \in \operatornamewithlimits{\text{arg\,$(\varepsilon{+}\eta{+}\Delta)$\,max}}_ {s'\in\mathcal{S}} r(a, s') Like in part 2, let’s start by putting sarg maxsSargminaAr(a,s)s \in \operatornamewithlimits{arg\,max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s'). Unlike for proposition 5, let’s prove the two conditions one at a time, so that we can more easily keep track of the approximations involved. First, the condition on aa: r(a,s)argmaxsSr(a,s)(by definition of max)r(a,σ(a))+η(by definition of σ)argminaAr(a,σ(a))+ε+η(aargεminaAr(a,σ(a)))argminaAargmaxsSr(a,s)+ε+η(by definition of max, min)=argmaxsSargminaAr(a,s)+Δ+ε+η(by definition of Δ)=argminaAr(a,s)+Δ+ε+η.(sarg maxsSargminaAr(a,s))\begin{align*} r(a, s) &\leq \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') &\text{(by definition of max)} \\&\leq r(a, \sigma(a)) + \eta &\text{(by definition of $\sigma$)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', \sigma(a')) + \varepsilon+ \eta &\text{($\displaystyle a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', \sigma(a'))$)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') + \varepsilon+ \eta &\text{(by definition of max, min)} \\&= \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') + \Delta + \varepsilon+ \eta &\text{(by definition of $\Delta$)} \\&= \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s) + \Delta + \varepsilon+ \eta. &\text{($\displaystyle s \in \operatornamewithlimits{arg\,max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s')$)} \end{align*} Now for the condition on ss, we reason backwards through the same chain of terms: r(a,s)argminaAr(a,s)(by definition of min)=argmaxsSargminaAr(a,s)(sarg maxsSargminaAr(a,s))=argminaAargmaxsSr(a,s)Δ(by definition of Δ)argminaAr(a,σ(a))Δ(by definition of max, min)r(a,σ(a))εΔ(aargεminaAr(a,σ(a)))argmaxsSr(a,s)ηεΔ.(by definition of σ)\begin{align*} r(a,s) &\geq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s) &\text{(by definition of min)} \\&= \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') &\text{($\displaystyle s \in \operatornamewithlimits{arg\,max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s')$)} \\&= \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') - \Delta &\text{(by definition of $\Delta$)} \\&\geq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', \sigma(a')) - \Delta &\text{(by definition of max, min)} \\&\geq r(a, \sigma(a)) - \varepsilon- \Delta &\text{($\displaystyle a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', \sigma(a'))$)} \\&\geq \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') - \eta - \varepsilon- \Delta. &\text{(by definition of $\sigma$)} \tag*{$\square$} \end{align*}

Proof (part 6). Suppose aAε,δ(2)a \in \mathcal{A}^{(2)}_{\varepsilon,\delta}, that is, there exists sSs \in \mathcal{S} such that both sargδmaxsSr(a,s)s \in \operatornamewithlimits{\text{arg\,$\delta$\,max}}_ {s'\in\mathcal{S}}r(a, s') and aargεminaAr(a,s)a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', s). We want to show aAε+δ+η(3σ)a \in \mathcal{A}^{(3\sigma)}_{\varepsilon+\delta+\eta}, that is, aarg(ε+δ+η)minaAr(a,σ(a)).a \in \operatornamewithlimits{\text{arg\,$(\varepsilon{+}\delta{+}\eta)$\,min}}_ {a'\in\mathcal{A}} r(a', \sigma(a')). Observe r(a,σ(a))argmaxsSr(a,s)(by definition of max)r(a,s)+δ(sargδmaxsSr(a,s))argminaAr(a,s)+ε+δ(aargεminaAr(a,s))argmaxsSargminaAr(a,s)+ε+δ(by definition of max)argminaAargmaxsSr(a,s)+ε+δ(by min–max inequality, lemma 1)argminaA(r(a,σ(a))+η)+ε+δ(by definition of σ, min)argminaAr(a,σ(a))+η+ε+δ.(η constant wrt. a)\begin{align*} r(a, \sigma(a)) &\leq \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a, s') &\text{(by definition of max)} \\&\leq r(a, s) + \delta &\text{($s \in \operatornamewithlimits{\text{arg\,$\delta$\,max}}_ {s'\in\mathcal{S}}r(a, s')$)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s) + \varepsilon+ \delta &\text{($a \in \operatornamewithlimits{\text{arg\,$\varepsilon$\,min}}_ {a'\in\mathcal{A}}r(a', s)$)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}\operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', s') + \varepsilon+ \delta &\text{(by definition of max)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\operatornamewithlimits{\vphantom{arg\,}max}_ {s'\in\mathcal{S}}r(a', s') + \varepsilon+ \delta &\text{(by min--max inequality, lemma 1)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}\Bigr( r(a', \sigma(a')) + \eta \Bigl) + \varepsilon+ \delta &\text{(by definition of $\sigma$, min)} \\&\leq \operatornamewithlimits{\vphantom{arg\,}min}_ {a'\in\mathcal{A}}r(a', \sigma(a')) + \eta + \varepsilon+ \delta. &\text{($\eta$ constant wrt. $a'$)} \tag*{$\square$} \end{align*}

Conclusion

There you have it—three different approximate relaxations of the minimax objective, and their approximate equivalence.

These relations are not as strong as in the exact case—this theorem does not establish equality except in the case ε=δ=η=0\varepsilon= \delta = \eta = 0. However, we do recover an interesting network of relations that establish that we can convert between relaxed objectives modulo a linear deterioration in approximation quality. This tells us that the families of solution sets (for varying approximation thresholds) are closely related, satisfying something like a bi-Lipschitz equivalence property. Importantly, in the limit as ε,δ,η0\varepsilon, \delta, \eta \to 0, the approximate solution sets get closer and closer to being actually equal.

I have not thought very much about whether these relations are optimally tight. However, I didn’t see any places where the proofs could be streamlined and fewer terms introduced, without imposing further assumptions on rr, A\mathcal{A}, and S\mathcal{S}.

That’s all for now—it’s time to get back to work using these definitions to analyse the robustness of minimax regret training methods!


  1. Edmund Lau points out that there are other ways of ‘relaxing’ the notion of optimisation. For example, instead of taking the above approach of relaxation through discretisation, we could pursue a probabilistic relaxation whereby we cast exact optimisation as sampling from a probability distribution concentrated on the set of minimisers with density given by p(x)[ ⁣ ⁣[xarg minxXf(x)] ⁣ ⁣], p^\star(x) \propto \left[\!\!\left[ x \in \operatornamewithlimits{arg\,min}_{x'\in\mathcal{X}} f(x') \right]\!\!\right], and we relax this procedure by instead sampling from a Boltzmann distribution with inverse temperature β0\beta \geq 0, with density given by pβ(x)exp(βf(x)). p^\beta(x) \propto \exp(-\beta f(x)). This is the so-called softmin operation (if we were maximising, we’d have the more well-known softmax). We recover exact optimisation in the limit β\beta \to \infty. I haven’t considered what a probabilistic relaxation of minimax might look like, but I suppose such a concept has probably been studied in game theory.↩︎