This article is translated from a Chinese article on my Zhihu account. The original article was posted on 2019-10-12 18:56 +0800.


Some stipulations:

  • Unless stated otherwise, all vectors appearing in this article are $n$-dimensional vectors, $n\in\mathbb N$;
  • The iteration variable $k$ always ranges over $\left[0,n\right)\cap\mathbb Z$;
  • $\operatorname{sum}\vec\xi\coloneqq\sum_k\xi_k$;
  • $\operatorname{prod}\vec\xi\coloneqq\prod_k\xi_k$;
  • If the independent and dependent variables of a function $f$ are both scalars, then define $f\!\left(\vec\xi\right)\coloneqq\left(f\!\left(\xi_0\right),f\!\left(\xi_1\right),\ldots,f\!\left(\xi_{n-1}\right)\right)$; for example, $\vec\xi^p=\left(\xi_0^p,\ldots,\xi_{n-1}^p\right)$ for a scalar $p$;
  • $\vec\xi^{\vec\eta}\coloneqq\prod_k\xi_k^{\eta_k}$;
  • $\min\vec\xi\coloneqq\min_k\xi_k$;
  • $\max\vec\xi\coloneqq\max_k\xi_k$;
  • $\delta_{\xi,\eta}\coloneqq\begin{cases}1,&\xi=\eta,\\0,&\xi\ne\eta;\end{cases}$
  • We say that $\vec\xi$ is congruent if all components of $\vec\xi$ are equal to each other.

Definition 1. Suppose we have samples $\vec x\in\left(\mathbb R^+\right)^n$, weights $\vec w\in\left\{\vec\xi\in\left(\mathbb R^+\right)^n\,\middle|\,\operatorname{sum}\vec\xi=1\right\}$, and a parameter $p\in\left[-\infty,+\infty\right]$. Define the Hölder mean by
$$M_{p,\vec w}\!\left(\vec x\right)\coloneqq\left(\vec w\cdot\vec x^p\right)^{\frac 1p}.$$

Note. The expression is undefined when $p\in\left\{-\infty,0,+\infty\right\}$, but the following limits exist:
$$\lim_{p\to0}M_{p,\vec w}\!\left(\vec x\right)=\vec x^{\vec w},$$
$$\lim_{p\to-\infty}M_{p,\vec w}\!\left(\vec x\right)=\min\vec x,$$
$$\lim_{p\to+\infty}M_{p,\vec w}\!\left(\vec x\right)=\max\vec x.$$
These limits are proved as theorems later. We use them to define the Hölder mean for $p\in\left\{-\infty,0,+\infty\right\}$.
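To make Definition 1 and the three limiting cases concrete, here is a minimal Python sketch. The function name `holder_mean` and the sample data are illustrative choices, not part of the original article, and the code assumes the inputs already satisfy the conditions of Definition 1 (positive samples, positive weights summing to 1).

```python
import math

def holder_mean(x, w, p):
    """Weighted Hölder mean M_{p,w}(x); assumes x > 0, w > 0, sum(w) == 1."""
    if p == 0:
        # Limit case p -> 0: the weighted geometric mean prod_k x_k^{w_k}.
        return math.prod(xk ** wk for xk, wk in zip(x, w))
    if p == -math.inf:
        # Limit case p -> -infinity.
        return min(x)
    if p == math.inf:
        # Limit case p -> +infinity.
        return max(x)
    # Generic case from Definition 1: (w . x^p)^(1/p).
    return sum(wk * xk ** p for xk, wk in zip(x, w)) ** (1 / p)

x = [1.0, 4.0]
w = [0.5, 0.5]
for p in (-math.inf, -1, 0, 1, 2, math.inf):
    print(p, holder_mean(x, w, p))
```

With equal weights, the values for $p=-1,0,1,2$ are the harmonic, geometric, arithmetic, and quadratic means of the samples.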

Theorem 1.
$$\lim_{p\to0}M_{p,\vec w}\!\left(\vec x\right)=\vec x^{\vec w}.$$

Proof. As $p\to0$ we have $\ln\!\left(\vec w\cdot\vec x^p\right)\to\ln\operatorname{sum}\vec w=0$, so L'Hôpital's rule applies to the quotient below. The product $\vec x^p\ln\vec x$ is taken componentwise.
$$\begin{aligned}
\lim_{p\to0}M_{p,\vec w}\!\left(\vec x\right)
&=\lim_{p\to0}\left(\vec w\cdot\vec x^p\right)^{\frac 1p} &\text{(Definition 1)}\\
&=\lim_{p\to0}\exp\frac{\ln\!\left(\vec w\cdot\vec x^p\right)}p\\
&=\exp\lim_{p\to0}\frac{\ln\!\left(\vec w\cdot\vec x^p\right)}p\\
&=\exp\lim_{p\to0}\frac{\vec w\cdot\left(\vec x^p\ln\vec x\right)}{\vec w\cdot\vec x^p} &\text{(L'Hôpital's rule)}\\
&=\exp\!\left(\vec w\cdot\ln\vec x\right)\\
&=\vec x^{\vec w}.
\end{aligned}$$
$\square$
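As a numerical sanity check of Theorem 1 (not a substitute for the proof), the following sketch evaluates the power mean for small nonzero $p$ and compares it with the weighted geometric mean $\exp\!\left(\vec w\cdot\ln\vec x\right)$. The helper names and the sample data are assumptions made for illustration.

```python
import math

def power_mean(x, w, p):
    """(w . x^p)^(1/p) for finite nonzero p, as in Definition 1."""
    return sum(wk * xk ** p for xk, wk in zip(x, w)) ** (1 / p)

def geometric_mean(x, w):
    """exp(w . ln x), the limit claimed by Theorem 1."""
    return math.exp(sum(wk * math.log(xk) for xk, wk in zip(x, w)))

x = [2.0, 3.0, 7.0]
w = [0.2, 0.3, 0.5]
target = geometric_mean(x, w)

# The gap to the geometric mean should shrink as |p| gets smaller.
for p in (1e-1, 1e-3, 1e-5, -1e-5, -1e-3):
    print(p, power_mean(x, w, p) - target)
```

The printed differences are positive for $p>0$ and negative for $p<0$, and they shrink with $\left|p\right|$.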

Theorem 2.
$$M_{p,\vec w}\!\left(\vec x\right)=M_{-p,\vec w}\!\left(\vec x^{-1}\right)^{-1}.$$

Proof.
$$\begin{aligned}
M_{p,\vec w}\!\left(\vec x\right)
&=\left(\vec w\cdot\vec x^p\right)^{\frac 1p} &\text{(Definition 1)}\\
&=\left(\left(\vec w\cdot\left(\vec x^{-1}\right)^{-p}\right)^{-\frac1p}\right)^{-1}\\
&=M_{-p,\vec w}\!\left(\vec x^{-1}\right)^{-1}. &\text{(Definition 1)}
\end{aligned}$$
$\square$
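The identity of Theorem 2 can be spot-checked numerically. The sketch below compares both sides for a few finite nonzero values of $p$; the helper name `power_mean` and the sample data are illustrative assumptions.

```python
import math

def power_mean(x, w, p):
    """(w . x^p)^(1/p) for finite nonzero p, as in Definition 1."""
    return sum(wk * xk ** p for xk, wk in zip(x, w)) ** (1 / p)

x = [0.5, 2.0, 8.0]
w = [0.25, 0.25, 0.5]
x_inv = [1 / xk for xk in x]  # componentwise reciprocal, i.e. x^{-1}

results = []
for p in (-3, -0.5, 1, 2.5):
    lhs = power_mean(x, w, p)
    rhs = 1 / power_mean(x_inv, w, -p)
    results.append((lhs, rhs))
    print(p, lhs, rhs)
```

Both columns agree up to floating-point rounding.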

Theorem 3.
$$\lim_{p\to+\infty}M_{p,\vec w}\!\left(\vec x\right)=\max\vec x.$$

Proof. Because $\forall k:\frac{x_k}{\max\vec x}\le1$, we have $\lim_{p\to+\infty}\left(\frac{\vec x}{\max\vec x}\right)^p=\delta_{\max\vec x,\vec x}$. The weights are positive and at least one component attains the maximum, so $\vec w\cdot\delta_{\max\vec x,\vec x}>0$ and raising it to the power $0$ below is well defined.
$$\begin{aligned}
\lim_{p\to+\infty}M_{p,\vec w}\!\left(\vec x\right)
&=\lim_{p\to+\infty}\left(\vec w\cdot\vec x^p\right)^{\frac 1p} &\text{(Definition 1)}\\
&=\left(\max\vec x\right)\lim_{p\to+\infty}\left(\vec w\cdot\left(\frac{\vec x}{\max\vec x}\right)^p\right)^{\frac 1p}\\
&=\left(\max\vec x\right)\left(\vec w\cdot\lim_{p\to+\infty}\left(\frac{\vec x}{\max\vec x}\right)^p\right)^{\lim_{p\to+\infty}\frac 1p}\\
&=\left(\max\vec x\right)\left(\vec w\cdot\delta_{\max\vec x,\vec x}\right)^0\\
&=\max\vec x.
\end{aligned}$$
$\square$
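The factorization used in this proof, pulling $\max\vec x$ out of the sum, is also a convenient way to evaluate the mean for large $p$ in floating point, because every ratio $\frac{x_k}{\max\vec x}$ lies in $\left(0,1\right]$ and cannot overflow. The following sketch illustrates the convergence; the function name and data are assumptions for illustration only.

```python
import math

def power_mean_scaled(x, w, p):
    """M_{p,w}(x) for p > 0, computed as max(x) * (w . (x / max x)^p)^(1/p)."""
    m = max(x)
    # Each ratio is in (0, 1], so its p-th power may underflow to 0.0
    # but never overflows (a direct 5.0 ** 10000 raises OverflowError).
    scaled_sum = sum(wk * (xk / m) ** p for xk, wk in zip(x, w))
    return m * scaled_sum ** (1 / p)

x = [1.0, 2.0, 5.0]
w = [0.3, 0.3, 0.4]
for p in (1, 10, 100, 1000, 10000):
    print(p, power_mean_scaled(x, w, p))
```

The printed values increase towards $\max\vec x=5$ as $p$ grows.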

Theorem 4.
$$\lim_{p\to-\infty}M_{p,\vec w}\!\left(\vec x\right)=\min\vec x.$$

Proof.
$$\begin{aligned}
\lim_{p\to-\infty}M_{p,\vec w}\!\left(\vec x\right)
&=\lim_{p\to-\infty}M_{-p,\vec w}\!\left(\vec x^{-1}\right)^{-1} &\text{(Theorem 2)}\\
&=\lim_{p\to+\infty}M_{p,\vec w}\!\left(\vec x^{-1}\right)^{-1}\\
&=\left(\max\vec x^{-1}\right)^{-1} &\text{(Theorem 3)}\\
&=\min\vec x.
\end{aligned}$$
$\square$

Theorem 5. If $p>q$, then
$$M_{p,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right),$$
where equality holds iff $\vec x$ is congruent.

Proof. Case 1: $p>q>0$.

Let $f:\mathbb R^+\to\mathbb R^+:\xi\mapsto\xi^{\frac pq}$. Its second derivative is
$$\frac{\mathrm d^2f\!\left(\xi\right)}{\mathrm d\xi^2}=\frac pq\left(\frac pq-1\right)\xi^{\frac pq-2}.$$
Because $p>q>0$, we have $\frac pq\left(\frac pq-1\right)>0$, hence $\frac{\mathrm d^2f}{\mathrm d\xi^2}>0$, i.e. $f$ is strictly convex. Therefore, by Jensen's inequality,
$$\vec w\cdot f\!\left(\vec x^q\right)\ge f\!\left(\vec w\cdot\vec x^q\right),$$
i.e.
$$\vec w\cdot\vec x^p\ge\left(\vec w\cdot\vec x^q\right)^{\frac pq}.$$
Raise both sides of the inequality to the power $\frac1p$. Since $p>0$, this does not change the direction of the inequality sign, and we have
$$\left(\vec w\cdot\vec x^p\right)^{\frac1p}\ge\left(\vec w\cdot\vec x^q\right)^{\frac1q},$$
i.e. (by Definition 1)
$$M_{p,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right).$$
By the equality condition of Jensen's inequality, equality holds iff $\vec x$ is congruent.

Case 2: $p>q=0$.

Because the logarithm function is concave, by Jensen's inequality,
$$\ln\!\left(\vec w\cdot\vec x^p\right)\ge\vec w\cdot\ln\vec x^p.$$
Exponentiate both sides of the inequality, and we have
$$\vec w\cdot\vec x^p\ge\vec x^{p\vec w}.$$
Raise both sides to the power $\frac1p$. Since $p>0$, this does not change the direction of the inequality sign, and we have
$$\left(\vec w\cdot\vec x^p\right)^{\frac1p}\ge\vec x^{\vec w},$$
i.e. (by Definition 1, with $M_{0,\vec w}\!\left(\vec x\right)=\vec x^{\vec w}$)
$$M_{p,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right).$$
By the equality condition of Jensen's inequality, equality holds iff $\vec x$ is congruent.

Case 3: $p=0>q$.

$$\begin{aligned}
M_{q,\vec w}\!\left(\vec x\right)
&=M_{-q,\vec w}\!\left(\vec x^{-1}\right)^{-1} &\text{(Theorem 2)}\\
&\le M_{0,\vec w}\!\left(\vec x^{-1}\right)^{-1} &\text{(Case 2)}\\
&=M_{0,\vec w}\!\left(\vec x\right). &\text{(Theorem 2)}
\end{aligned}$$
Equality holds iff $\vec x$ is congruent (Case 2).

Case 4: $0>p>q$.

Because $-q>-p>0$, we have
$$\begin{aligned}
M_{q,\vec w}\!\left(\vec x\right)
&=M_{-q,\vec w}\!\left(\vec x^{-1}\right)^{-1} &\text{(Theorem 2)}\\
&\le M_{-p,\vec w}\!\left(\vec x^{-1}\right)^{-1} &\text{(Case 1)}\\
&=M_{p,\vec w}\!\left(\vec x\right). &\text{(Theorem 2)}
\end{aligned}$$
Equality holds iff $\vec x$ is congruent (Case 1).

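The remaining finite case $p>0>q$ follows by chaining Case 2 and Case 3: $M_{p,\vec w}\!\left(\vec x\right)\ge M_{0,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right)$, with equality iff $\vec x$ is congruent. The cases $p=+\infty$ and $q=-\infty$ follow by taking limits (Theorems 3 and 4): for instance, $M_{+\infty,\vec w}\!\left(\vec x\right)=\lim_{r\to+\infty}M_{r,\vec w}\!\left(\vec x\right)\ge M_{q+1,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right)$, where the last inequality is strict unless $\vec x$ is congruent.

Combining all the cases, the original proposition is proved. $\square$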
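Theorem 5 says that the Hölder mean is strictly increasing in $p$ unless all samples are equal. A quick numerical illustration, under the assumption that the helper `holder_mean` below (a hypothetical name, written here only for this check) implements Definition 1 together with the three limiting cases:

```python
import math

def holder_mean(x, w, p):
    """Weighted Hölder mean M_{p,w}(x); assumes x > 0, w > 0, sum(w) == 1."""
    if p == 0:
        return math.prod(xk ** wk for xk, wk in zip(x, w))
    if p == -math.inf:
        return min(x)
    if p == math.inf:
        return max(x)
    return sum(wk * xk ** p for xk, wk in zip(x, w)) ** (1 / p)

ps = [-math.inf, -5, -1, -0.5, 0, 0.5, 1, 2, 5, math.inf]
w = [0.3, 0.3, 0.4]

# Non-congruent samples: the means should strictly increase with p.
varied = [holder_mean([1.0, 2.0, 5.0], w, p) for p in ps]
print(varied)

# Congruent samples: every mean should equal the common value (up to rounding).
constant = [holder_mean([3.0, 3.0, 3.0], w, p) for p in ps]
print(constant)
```

The first list is strictly increasing from $\min\vec x$ to $\max\vec x$, while the second stays at the common value $3$ up to rounding.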

Corollary (HM-GM-AM-QM inequalities).
$$\min\vec x\le n\left(\operatorname{sum}\vec x^{-1}\right)^{-1}
\le\left(\operatorname{prod}\vec x\right)^{\frac1n}
\le\frac{\operatorname{sum}\vec x}n
\le\sqrt{\frac{\operatorname{sum}\vec x^2}n}
\le\max\vec x,$$
where equality holds in each inequality iff $\vec x$ is congruent.

Proof. Let $\vec w=\left(\frac1n,\dots,\frac1n\right)$. Then by Theorem 5,
$$M_{-\infty,\vec w}\!\left(\vec x\right)
\le M_{-1,\vec w}\!\left(\vec x\right)
\le M_{0,\vec w}\!\left(\vec x\right)
\le M_{1,\vec w}\!\left(\vec x\right)
\le M_{2,\vec w}\!\left(\vec x\right)
\le M_{+\infty,\vec w}\!\left(\vec x\right),$$
i.e. (by Definition 1)
$$\min\vec x\le n\left(\operatorname{sum}\vec x^{-1}\right)^{-1}
\le\left(\operatorname{prod}\vec x\right)^{\frac1n}
\le\frac{\operatorname{sum}\vec x}n
\le\sqrt{\frac{\operatorname{sum}\vec x^2}n}
\le\max\vec x,$$
where equality holds in each inequality iff $\vec x$ is congruent (Theorem 5). $\square$
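As a small worked instance of the corollary, the sketch below computes the six quantities directly from their closed forms for the sample $\vec x=\left(1,2,4\right)$. The variable names and the sample are chosen here for illustration.

```python
import math

x = [1.0, 2.0, 4.0]
n = len(x)

hm = n / sum(1 / xk for xk in x)              # harmonic mean, p = -1
gm = math.prod(x) ** (1 / n)                  # geometric mean, p = 0
am = sum(x) / n                               # arithmetic mean, p = 1
qm = math.sqrt(sum(xk ** 2 for xk in x) / n)  # quadratic mean, p = 2

chain = [min(x), hm, gm, am, qm, max(x)]
print(chain)
```

For this sample the harmonic mean is $\frac{12}7$, the geometric mean is $2$, the arithmetic mean is $\frac73$, and the quadratic mean is $\sqrt7$, so the chain is strictly increasing, as expected for a non-congruent $\vec x$.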