Expectiles are simple to compute

Wednesday, June 11th, 2025

Expectiles are a class of summary statistics generalising the well-known expected value. They have been relatively neglected since their introduction. Perhaps this is because the expectiles of a sample lack a well-known, simple and efficient calculation procedure?

This note derives a simple and efficient algorithm for computing the expectiles of a discrete distribution, or “sample expectiles” of a finite sample from an arbitrary distribution.

Defining expectiles

Expectiles generalise the expected value by introducing an asymmetry parameter $\tau \in (0, 1)$, representing a degree of optimism: how much to weight better-than-expected data compared to worse-than-expected data.

Formally, given a scalar random variable $X$ with finite second moment, the $\tau$-expectile of $X$, written $\epsilon_X(\tau)$, is the minimiser of an asymmetric version of the expected squared distance, weighting squared positive distances by $\tau$ and squared negative distances by $1-\tau$:
$$\epsilon_X(\tau) = \mathop{\mathrm{arg\,min}}_\epsilon \mathbb{E}\left[ [\![X > \epsilon]\!]^{\tau}_{1-\tau} \cdot (\epsilon - X)^2 \right].$$
Here, $[\![P]\!]^a_b$ is a generalised Iverson bracket, evaluating to $a$ if $P$ is a true proposition, or to $b$ otherwise.
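
As a quick check on this definition: with $\tau = \frac12$, both weights are equal and the objective reduces to (half) the ordinary expected squared distance, which is minimised by the expected value:
$$\epsilon_X\left(\tfrac12\right) = \mathop{\mathrm{arg\,min}}_\epsilon \tfrac12\, \mathbb{E}\left[ (\epsilon - X)^2 \right] = \mathbb{E}[X].$$
So the $\frac12$-expectile recovers the mean, and other values of $\tau$ tilt the summary up or down from there.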

For a discrete random variable $X$ taking on a finite set of values $x_1, \ldots, x_N$ with respective probabilities $p_1, \ldots, p_N$, this definition simplifies to
$$\epsilon_X(\tau) = \mathop{\mathrm{arg\,min}}_\epsilon \sum_{i=1}^N p_i \cdot [\![x_i > \epsilon]\!]^{\tau}_{1-\tau} \cdot (\epsilon - x_i)^2.$$

Similarly, if we take a finite sample of size $N$, $\vec{x} = (x_1, \ldots, x_N)$, where $x_i \sim X$, then we can define the sample expectile $\epsilon_{\vec x}(\tau)$ as
$$\epsilon_{\vec x}(\tau) = \mathop{\mathrm{arg\,min}}_\epsilon \frac{1}{N} \sum_{i=1}^N [\![x_i > \epsilon]\!]^{\tau}_{1-\tau} \cdot (\epsilon - x_i)^2.$$
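
To make the definition concrete, here is a naive way to compute a sample expectile: minimise the objective directly with a generic numerical optimiser. This is a minimal sketch (the name expectile_bruteforce and the use of SciPy's minimize_scalar are my own choices, not part of the method derived below):

import numpy as np
from scipy.optimize import minimize_scalar


def expectile_bruteforce(sample: np.ndarray, tau: float) -> float:
    # the asymmetric squared-error objective from the definition above
    def objective(eps: float) -> float:
        weights = np.where(sample > eps, tau, 1 - tau)
        return np.mean(weights * (eps - sample) ** 2)

    # the minimiser always lies within the range of the sample
    result = minimize_scalar(
        objective, bounds=(sample.min(), sample.max()), method="bounded"
    )
    return result.x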

In the rest of this note, I derive a simple algorithm for computing sample expectiles. This method can easily be extended to computing the expectiles of an arbitrary discrete distribution with finite support.

The sample expectile objective

Computing the sample $\tau$-expectile given a sample $\vec x$ involves finding the value of $\epsilon$ that minimises the (rescaled) sample expectile objective
$$S_{\vec x}(\epsilon) = \frac{1}{2} \sum_{i=1}^N [\![x_i > \epsilon]\!]^{\tau}_{1-\tau} \cdot (\epsilon - x_i)^2.$$

The following method is based on three observations:

  1. As a sum of piecewise quadratic functions, the objective is piecewise quadratic. Therefore, its gradient is piecewise linear.

  2. As a sum of functions that are continuously differentiable with respect to $\epsilon$, the objective is continuously differentiable with respect to $\epsilon$.¹ That is, its gradient is a continuous function.

  3. As a sum of strictly convex functions, the objective is strictly convex. Therefore, its gradient is strictly increasing.

In summary, $S'_{\vec x}(\epsilon)$ is piecewise linear, continuous, and strictly increasing in $\epsilon$.
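
To see why observation 2 holds despite the discontinuous brackets, it helps to differentiate the contribution of a single data point $x_i$ (this is the calculation behind the footnote):
$$\frac{\mathrm{d}}{\mathrm{d}\epsilon} \left[ \frac12\, [\![x_i > \epsilon]\!]^{\tau}_{1-\tau} \cdot (\epsilon - x_i)^2 \right] = \begin{cases} \tau \cdot (\epsilon - x_i) & \text{if } \epsilon < x_i, \\ (1-\tau) \cdot (\epsilon - x_i) & \text{if } \epsilon \geq x_i. \end{cases}$$
Both linear pieces vanish at $\epsilon = x_i$, so they meet there and the gradient is continuous.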

Solving for the minimum

The above implies we can find the minimum of $S_{\vec x}(\epsilon)$ by finding the root of $S'_{\vec x}(\epsilon)$, and we can find this root in turn by checking each linear “piece” of $S'_{\vec x}(\epsilon)$ to find the one that crosses zero.

Let’s start by finding those linear pieces. Differentiating the objective with respect to $\epsilon$ yields
\begin{align*}
S'_{\vec x}(\epsilon)
  &= \sum_{i=1}^{N} [\![x_i > \epsilon]\!]^{\tau}_{1-\tau} \cdot (\epsilon - x_i)
\\&= (1-\tau) \sum_{i : x_i \leq \epsilon} (\epsilon - x_i)
    + \tau \sum_{i : x_i > \epsilon} (\epsilon - x_i)
\\&= \left( (1-\tau) \sum_{i : x_i \leq \epsilon} 1
    + \tau \sum_{i : x_i > \epsilon} 1 \right) \cdot \epsilon
    - \left( (1-\tau) \sum_{i : x_i \leq \epsilon} x_i
    + \tau \sum_{i : x_i > \epsilon} x_i \right)
\\&= A_{\vec x}(\epsilon) \cdot \epsilon - B_{\vec x}(\epsilon)
\end{align*}
where
\begin{align*}
A_{\vec x}(\epsilon) &= (1-\tau) \sum_{i : x_i \leq \epsilon} 1 + \tau \sum_{i : x_i > \epsilon} 1,
\\
B_{\vec x}(\epsilon) &= (1-\tau) \sum_{i : x_i \leq \epsilon} x_i + \tau \sum_{i : x_i > \epsilon} x_i.
\end{align*}

Observe that, while $A_{\vec x}(\epsilon)$ and $B_{\vec x}(\epsilon)$ depend on $\epsilon$, the dependence arises only through comparison with the $x_i$ in the summation bounds. Therefore, these functions are piecewise constant with boundaries at each distinct $x_i$.

This leads to the following efficient algorithm for finding the root of $S'_{\vec x}(\epsilon)$:

  1. Sort $\vec x$. Hence, assume that $x_1 \leq \ldots \leq x_N$.

  2. For $i = 1, \ldots, N$, compute $A_i = A_{\vec x}(x_i)$ and $B_i = B_{\vec x}(x_i)$. All values can be computed in linear time by sharing the partial sums $F_i = \sum_{j : x_j \leq x_i} 1$ and $M_i = \sum_{j : x_j \leq x_i} x_j$, since $A_i = (1-\tau) F_i + \tau (N - F_i)$ and $B_i = (1-\tau) M_i + \tau (M_N - M_i)$.

  3. For $i = 1, \ldots, N$, compute $S'_i = S'_{\vec x}(x_i) = A_i x_i - B_i$. Note that the $i$th linear segment of $S'_{\vec x}(\epsilon)$ runs from $(x_i, S'_i)$ up to $(x_{i+1}, S'_{i+1})$.

  4. Check each $i$ to identify one such that $S'_i \leq 0 < S'_{i+1}$. This linear segment crosses zero at $\epsilon_\star = B_i / A_i$.

The resulting $\epsilon_\star$ satisfies $S'_{\vec x}(\epsilon_\star) = 0$. Since $S'_{\vec x}(\epsilon)$ is strictly increasing, this stationary point is the global minimum of the objective, and therefore
$$\epsilon_{\vec x}(\tau) = \epsilon_\star.$$
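
As a tiny worked example (my own illustration), take the two-point sample $\vec x = (0, 1)$ and any $\tau$. Then $F = (1, 2)$ and $M = (0, 1)$, so $A_1 = (1-\tau) + \tau = 1$ and $B_1 = \tau$, while $A_2 = 2(1-\tau)$ and $B_2 = 1-\tau$. Hence $S'_1 = A_1 x_1 - B_1 = -\tau \leq 0$ and $S'_2 = A_2 x_2 - B_2 = 1 - \tau > 0$, so the first segment crosses zero, at $\epsilon_\star = B_1 / A_1 = \tau$. The $\tau$-expectile of this sample is simply $\tau$.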

Python/NumPy implementation

Here is a simplified NumPy implementation of the above algorithm.

import numpy as np


def expectile(sample: np.ndarray, tau: float) -> float:
    # step 1: pre-sort the sample
    x = np.sort(sample)
    
    # step 2: compute linear segment coefficients
    # i. precompute partial sums at key points
    N = x.size
    F = np.arange(N) + 1
    M = np.cumsum(x)
    
    # ii. compute the coefficients from these partial sums
    A = ((1 - tau) * F + tau * (F[-1] - F)) # slopes
    B = ((1 - tau) * M + tau * (M[-1] - M)) # offsets

    # step 3: compute starting point of each segment
    G = A * x - B
    
    # step 4: find the segment crossing zero and its root
    # i. find the segment with G_i <= 0 and G_i+1 > 0
    start_below_0 = (G <= 0)[:-1]
    stop_above_0  = (G > 0)[1:]
    i = np.nonzero(start_below_0 & stop_above_0)[0][0]
    
    # ii. interpolate to get the root
    eps = B[i]/A[i]

    # done!
    return eps

This implementation is meant for pedagogical purposes. A version that vectorises the tau input and is more careful about corner cases and potential numerical issues is available in this repository.
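
As a quick sanity check (using the hypothetical expectile_bruteforce helper sketched earlier), the following compares the fast implementation against direct numerical minimisation, and confirms that tau = 0.5 recovers the sample mean:

rng = np.random.default_rng(seed=0)
sample = rng.normal(size=1_000)

# tau = 1/2 recovers the ordinary sample mean
assert np.isclose(expectile(sample, tau=0.5), sample.mean())

# other values of tau agree with generic numerical minimisation
for tau in (0.1, 0.25, 0.75, 0.9):
    assert np.isclose(
        expectile(sample, tau),
        expectile_bruteforce(sample, tau),
        atol=1e-4,
    )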

Algorithmic efficiency

An advantage of the above method is that it doesn’t involve resorting to numerical methods for minimising the expectile objective. It’s not quite a closed-form solution as it still involves sorting the data and searching iteratively for the segment of the gradient that crosses zero.

The above implementation runs in $O(N \log N)$ time, dominated by the sort; the partial sums and the search for the zero-crossing segment take only linear time. It could nonetheless be made even more efficient.²
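
For instance (a sketch of one possible tweak, not necessarily the improvements alluded to in the footnote): since G is increasing, the vectorised scan in step 4 can be replaced by a binary search, though the sort still dominates the overall running time.

# drop-in replacement for step 4 of expectile(): since G is
# (weakly) increasing, the last index with G[i] <= 0 can be
# found by binary search instead of a scan
i = np.searchsorted(G, 0.0, side="right") - 1
eps = B[i] / A[i]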

I derived this algorithm in 2020 during a coursework research project. Back then, I remember being disappointed that the implementation I saw for computing sample expectiles worked by applying generic root-finding algorithms to the gradient, without taking advantage of its special structure. I also looked around and found that standard statistics libraries (SciPy, R) didn’t appear to have tools for computing sample expectiles.

Since then, there have been some developments in tools for computing sample expectiles. SciPy added a method for computing sample expectiles, though it too uses a generic root finder. More excitingly, Daouia, Stupfler, and Usseglio-Carleve published “An expectile computation cookbook” (submitted 2023), which discusses methods similar to mine for computing exact expectiles for discrete distributions as well as for other kinds of distributions. I’m glad to see expectiles getting some love!


  1. It’s not obvious that the objective is continuous or differentiable, let alone continuously differentiable, given the use of (discontinuous) Iverson brackets. However, these brackets always occur multiplied by a term of the form $(x - \epsilon)^2$, which is zero and has gradient zero at the discontinuity of the Iverson bracket.

  2. I have to thank Gemini 2.5 Pro (preview), which pointed out these directions for efficiency improvements when I showed it an earlier draft of this post that conjectured prematurely that the above method was optimal.