This article is translated from a Chinese article on my Zhihu account. The original article was posted on 2019-10-12 at 18:56 +0800.
Some conventions:
Unless stated otherwise, all vectors appearing in this article are $n$-dimensional vectors, where $n\in\mathbb N$;
The iteration variable $k$ always ranges over $\left[0,n\right)\cap\mathbb Z$;
$\operatorname{sum}\vec\xi\coloneqq\sum_k\xi_k$;
$\operatorname{prod}\vec\xi\coloneqq\prod_k\xi_k$;
If the independent and dependent variables of a function $f$ are both scalars, then define $f\!\left(\vec\xi\right)\coloneqq\left(f\!\left(\xi_0\right),f\!\left(\xi_1\right),\ldots,f\!\left(\xi_{n-1}\right)\right)$;
$\vec\xi^{\vec\eta}\coloneqq\prod_k\xi_k^{\eta_k}$;
$\min\vec\xi\coloneqq\min_k\xi_k$;
$\max\vec\xi\coloneqq\max_k\xi_k$;
$\delta_{\xi,\eta}\coloneqq\begin{cases}1,&\xi=\eta,\\0,&\xi\ne\eta;\end{cases}$
We say that $\vec\xi$ is congruent if all components of $\vec\xi$ are equal to each other.
Definition 1. Suppose we have samples $\vec x\in\left(\mathbb R^+\right)^n$, weights $\vec w\in\left\{\vec\xi\in\left(\mathbb R^+\right)^n\,\middle|\,\operatorname{sum}\vec\xi=1\right\}$, and a parameter $p\in\left[-\infty,+\infty\right]$. Define the Hölder mean by
$$M_{p,\vec w}\!\left(\vec x\right)\coloneqq\left(\vec w\cdot\vec x^p\right)^{\frac 1p}.$$
Note. The expression is undefined when $p\in\left\{-\infty,0,+\infty\right\}$, but the following limits exist:
$$\lim_{p\to0}M_{p,\vec w}\!\left(\vec x\right)=\vec x^{\vec w},$$
$$\lim_{p\to-\infty}M_{p,\vec w}\!\left(\vec x\right)=\min\vec x,$$
$$\lim_{p\to+\infty}M_{p,\vec w}\!\left(\vec x\right)=\max\vec x.$$
These limits are proved as theorems below. We use them to define the Hölder mean for $p\in\left\{-\infty,0,+\infty\right\}$.
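To make Definition 1 and the three limiting cases concrete, here is a minimal Python sketch. The function name `holder_mean` and its input checks are my own choices for illustration and are not part of the definition. The special values of $p$ are handled with the limits stated in the note above.

```python
import math

def holder_mean(x, w, p):
    """Weighted Hölder mean M_{p,w}(x) for positive samples x and
    positive weights w that sum to 1 (hypothetical helper)."""
    if len(x) != len(w) or not x:
        raise ValueError("x and w must be non-empty and of equal length")
    if any(xk <= 0 for xk in x) or any(wk <= 0 for wk in w):
        raise ValueError("samples and weights must be positive")
    if not math.isclose(sum(w), 1.0):
        raise ValueError("weights must sum to 1")
    if p == 0:
        # Limit p -> 0: the weighted geometric mean x^w.
        return math.exp(sum(wk * math.log(xk) for xk, wk in zip(x, w)))
    if p == math.inf:
        # Limit p -> +infinity: the largest sample.
        return max(x)
    if p == -math.inf:
        # Limit p -> -infinity: the smallest sample.
        return min(x)
    # Generic case: (w . x^p)^(1/p).
    return sum(wk * xk ** p for xk, wk in zip(x, w)) ** (1.0 / p)
```

For example, `holder_mean([1.0, 4.0], [0.5, 0.5], 1)` is the ordinary arithmetic mean of the two samples, and passing `p=0` gives their geometric mean.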
Theorem 1.
$$\lim_{p\to0}M_{p,\vec w}\!\left(\vec x\right)=\vec x^{\vec w}.$$
Proof.
$$\begin{aligned}
\lim_{p\to0}M_{p,\vec w}\!\left(\vec x\right)
&=\lim_{p\to0}\left(\vec w\cdot\vec x^p\right)^{\frac 1p}
&\text{(Definition 1)}\\
&=\lim_{p\to0}\exp\frac{\ln\!\left(\vec w\cdot\vec x^p\right)}p\\
&=\exp\lim_{p\to0}\frac{\ln\!\left(\vec w\cdot\vec x^p\right)}p\\
&=\exp\lim_{p\to0}\frac{\vec w\cdot\left(\vec x^p\ln\vec x\right)}{\vec w\cdot\vec x^p}
&\text{(L'H\^opital's rule)}\\
&=\exp\!\left(\vec w\cdot\ln\vec x\right)\\
&=\vec x^{\vec w}.
\end{aligned}$$
$\square$
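A quick numerical sanity check of Theorem 1, as a sketch only: the helper names `power_mean` and `geometric_mean` and the sample data are assumptions made for illustration. The gap between $M_{p,\vec w}\!\left(\vec x\right)$ and $\vec x^{\vec w}$ should shrink as $p$ approaches $0$.

```python
import math

def power_mean(x, w, p):
    # (w . x^p)^(1/p) for finite, nonzero p.
    return sum(wk * xk ** p for xk, wk in zip(x, w)) ** (1.0 / p)

def geometric_mean(x, w):
    # x^w = exp(w . ln x), the claimed limit as p -> 0.
    return math.exp(sum(wk * math.log(xk) for xk, wk in zip(x, w)))

x = [1.0, 2.0, 8.0]      # illustrative samples
w = [0.5, 0.25, 0.25]    # illustrative weights, summing to 1

target = geometric_mean(x, w)
# Distance to the limit for a shrinking sequence of exponents.
gaps = [abs(power_mean(x, w, p) - target) for p in (1e-1, 1e-2, 1e-3)]
print(target, gaps)
```

This does not prove anything, but it is a cheap way to catch a sign or exponent mistake when reimplementing the formula.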
Theorem 2.
$$M_{p,\vec w}\!\left(\vec x\right)=M_{-p,\vec w}\!\left(\vec x^{-1}\right)^{-1}.$$
Proof.
$$\begin{aligned}
M_{p,\vec w}\!\left(\vec x\right)
&=\left(\vec w\cdot\vec x^p\right)^{\frac 1p}
&\text{(Definition 1)}\\
&=\left(\left(\vec w \cdot\left(\vec x^{-1}\right)^{-p}\right)^{-\frac1p}\right)^{-1}\\
&=M_{-p,\vec w}\!\left(\vec x^{-1}\right)^{-1}.
&\text{(Definition 1)}
\end{aligned}$$
$\square$
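The identity in Theorem 2 is easy to check numerically. The following sketch uses a hypothetical helper `power_mean` and arbitrary sample data. It compares $M_{p,\vec w}\!\left(\vec x\right)$ with the reciprocal of $M_{-p,\vec w}$ applied to the component-wise reciprocals.

```python
import math

def power_mean(x, w, p):
    # (w . x^p)^(1/p) for finite, nonzero p.
    return sum(wk * xk ** p for xk, wk in zip(x, w)) ** (1.0 / p)

x = [0.5, 2.0, 3.0]   # illustrative samples
w = [0.2, 0.3, 0.5]   # illustrative weights, summing to 1
p = 1.7               # any finite nonzero exponent works here

lhs = power_mean(x, w, p)
# Right-hand side: M_{-p,w}(x^{-1})^{-1}, with x^{-1} taken component-wise.
rhs = 1.0 / power_mean([1.0 / xk for xk in x], w, -p)
identity_holds = math.isclose(lhs, rhs)
print(lhs, rhs, identity_holds)
```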
Theorem 3.
$$\lim_{p\to+\infty}M_{p,\vec w}\!\left(\vec x\right)=\max\vec x.$$
Proof. Because $\forall k:\frac{x_k}{\max\vec x}\le1$, we have $\lim_{p\to+\infty}\left(\frac{\vec x}{\max\vec x}\right)^p=\delta_{\max\vec x,\vec x}$. Every weight is positive and at least one component of $\vec x$ attains the maximum, so $\vec w\cdot\delta_{\max\vec x,\vec x}>0$ and its zeroth power is $1$.
$$\begin{aligned}
\lim_{p\to+\infty}M_{p,\vec w}\!\left(\vec x\right)
&=\lim_{p\to+\infty}\left(\vec w\cdot\vec x^p\right)^{\frac 1p}
&\text{(Definition 1)}\\
&=\left(\max\vec x\right)\lim_{p\to+\infty}\left(\vec w\cdot\left(\frac{\vec x}{\max\vec x}\right)^p\right)^{\frac 1p}\\
&=\left(\max\vec x\right)\left(\vec w\cdot\lim_{p\to+\infty}\left(\frac{\vec x}{\max\vec x}\right)^p\right)^{\lim_{p\to+\infty}\frac 1p}\\
&=\left(\max\vec x\right)\left(\vec w\cdot\delta_{\max\vec x,\vec x}\right)^0\\
&=\max\vec x.
\end{aligned}$$
$\square$
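The factoring step in this proof is also a convenient way to evaluate the mean for large $p$ in floating point, because every ratio $\frac{x_k}{\max\vec x}$ lies in $\left(0,1\right]$ and its powers cannot overflow. The sketch below is only an illustration under that assumption. The name `power_mean_scaled` and the sample data are hypothetical.

```python
import math

def power_mean_scaled(x, w, p):
    """M_{p,w}(x) for p > 0, evaluated as
    max(x) * (w . (x / max x)^p)^(1/p), mirroring the proof."""
    m = max(x)
    # Each ratio xk / m is in (0, 1], so the powers stay bounded by 1.
    return m * sum(wk * (xk / m) ** p for xk, wk in zip(x, w)) ** (1.0 / p)

x = [1.0, 3.0, 5.0]   # illustrative samples
w = [0.2, 0.3, 0.5]   # illustrative weights, summing to 1

# The values should increase towards max(x) = 5 as p grows.
values = [power_mean_scaled(x, w, p) for p in (1.0, 10.0, 100.0, 1000.0)]
print(values)
```

A direct evaluation of `xk ** p` with the same data can exceed the floating-point range for large `p`, which is why the scaled form is used here.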
Theorem 4.
$$\lim_{p\to-\infty}M_{p,\vec w}\!\left(\vec x\right)=\min\vec x.$$
Proof.
$$\begin{aligned}
\lim_{p\to-\infty}M_{p,\vec w}\!\left(\vec x\right)
&=\lim_{p\to-\infty}M_{-p,\vec w}\!\left(\vec x^{-1}\right)^{-1}
&\text{(Theorem 2)}\\
&=\lim_{p\to+\infty}M_{p,\vec w}\!\left(\vec x^{-1}\right)^{-1}\\
&=\left(\max\vec x^{-1}\right)^{-1}
&\text{(Theorem 3)}\\
&=\min\vec x.
\end{aligned}$$
$\square$
Theorem 5. If $p>q$, then
$$M_{p,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right),$$
where the equality holds iff $\vec x$ is congruent.
Proof. Case 1: $p>q>0$.
Let $f:\mathbb R^+\to\mathbb R^+:\xi\mapsto\xi^{\frac pq}$. Its second derivative is
$$\frac{\mathrm d^2f\!\left(\xi\right)}{\mathrm d\xi^2}=\frac pq\left(\frac pq-1\right)\xi^{\frac pq-2}.$$
Because $p>q>0$, we have $\frac pq\left(\frac pq-1\right)>0$, and hence $\frac{\mathrm d^2f}{\mathrm d\xi^2}>0$, i.e. $f$ is strictly convex. Therefore, according to Jensen's inequality,
$$\vec w\cdot f\!\left(\vec x^q\right)\ge f\!\left(\vec w\cdot\vec x^q\right),$$
i.e.
$$\vec w\cdot\vec x^p\ge\left(\vec w\cdot\vec x^q\right)^{\frac pq}.$$
Raise both sides of the inequality to the power $\frac1p$. Because $p>0$, this does not change the direction of the inequality sign, and we have
$$\left(\vec w\cdot\vec x^p\right)^{\frac1p}\ge\left(\vec w\cdot\vec x^q\right)^{\frac1q},$$
i.e. (according to Definition 1)
$$M_{p,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right).$$
According to the condition for equality in Jensen's inequality, the equality holds iff $\vec x$ is congruent.
Case 2: $p>q=0$.
Because the logarithm function is concave, according to Jensen's inequality,
$$\ln\!\left(\vec w\cdot\vec x^p\right)\ge\vec w\cdot\ln\vec x^p.$$
Take the exponential of both sides of the inequality, and we have
$$\vec w\cdot\vec x^p\ge\vec x^{p\vec w}.$$
Raise both sides of the inequality to the power $\frac1p$. Because $p>0$, this does not change the direction of the inequality sign, and we have
$$\left(\vec w\cdot\vec x^p\right)^{\frac1p}\ge\vec x^{\vec w},$$
i.e. (according to Definition 1)
$$M_{p,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right).$$
According to the condition for equality in Jensen's inequality, the equality holds iff $\vec x$ is congruent.
Case 3: $p=0>q$.
$$\begin{align*}
M_{q,\vec w}\!\left(\vec x\right)
&=M_{-q,\vec w}\!\left(\vec x^{-1}\right)^{-1}
&\text{(Theorem 2)}\\
&\le M_{0,\vec w}\!\left(\vec x^{-1}\right)^{-1}
&\text{(Case 2)}\\
&=M_{0,\vec w}\!\left(\vec x\right).
&\text{(Theorem 2)}
\end{align*}$$
The equality holds iff $\vec x$ is congruent (Case 2).
Case 4: $0>p>q$.
Because $-q>-p>0$, we have
$$\begin{align*}
M_{q,\vec w}\!\left(\vec x\right)
&=M_{-q,\vec w}\!\left(\vec x^{-1}\right)^{-1}
&\text{(Theorem 2)}\\
&\le M_{-p,\vec w}\!\left(\vec x^{-1}\right)^{-1}
&\text{(Case 1)}\\
&=M_{p,\vec w}\!\left(\vec x\right).
&\text{(Theorem 2)}
\end{align*}$$
The equality holds iff $\vec x$ is congruent (Case 1).
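The remaining finite case $p>0>q$ follows by chaining Case 2 and Case 3: $M_{p,\vec w}\!\left(\vec x\right)\ge M_{0,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right)$, where the equality holds iff $\vec x$ is congruent. If $p=+\infty$ or $q=-\infty$, choose finite $r>s$ with $p\ge r>s\ge q$; Theorems 3 and 4 together with the finite cases give $M_{p,\vec w}\!\left(\vec x\right)\ge M_{r,\vec w}\!\left(\vec x\right)\ge M_{s,\vec w}\!\left(\vec x\right)\ge M_{q,\vec w}\!\left(\vec x\right)$, and the middle inequality is strict unless $\vec x$ is congruent.
With all cases covered, the original proposition is proved. $\square$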
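As an informal check of Theorem 5, the following sketch evaluates the mean on an increasing grid of exponents, once for a non-congruent sample vector and once for a congruent one. The helper `holder_mean`, the grid `ps`, and the data are assumptions chosen for illustration only.

```python
import math

def holder_mean(x, w, p):
    # Finite p only; p == 0 uses the geometric-mean limit from Theorem 1.
    if p == 0:
        return math.exp(sum(wk * math.log(xk) for xk, wk in zip(x, w)))
    return sum(wk * xk ** p for xk, wk in zip(x, w)) ** (1.0 / p)

ps = [-3.0, -1.0, -0.5, 0.0, 0.5, 1.0, 3.0]   # increasing exponents
w = [0.5, 0.3, 0.2]                           # illustrative weights

# Non-congruent samples: the means should be strictly increasing in p.
means = [holder_mean([1.0, 2.0, 4.0], w, p) for p in ps]
strictly_increasing = all(a < b for a, b in zip(means, means[1:]))

# Congruent samples: every mean should equal the common value 2.
flat = [holder_mean([2.0, 2.0, 2.0], w, p) for p in ps]
all_equal = all(math.isclose(v, 2.0) for v in flat)

print(strictly_increasing, all_equal)
```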
Corollary (HM-GM-AM-QM inequalities).
$$\min\vec x\le n\left(\sum\vec x^{-1}\right)^{-1}
\le\left(\prod\vec x\right)^\frac1n
\le\frac{\sum\vec x}n
\le\sqrt{\frac{\sum\vec x^2}{n}}
\le\max\vec x,$$
where the equality holds iff $\vec x$ is congruent.
Proof. Let $\vec w=\left(\frac1n,\dots,\frac1n\right)$. Then according to Theorem 5,
$$M_{-\infty,\vec w}\!\left(\vec x\right)
\le M_{-1,\vec w}\!\left(\vec x\right)
\le M_{0,\vec w}\!\left(\vec x\right)
\le M_{1,\vec w}\!\left(\vec x\right)
\le M_{2,\vec w}\!\left(\vec x\right)
\le M_{+\infty,\vec w}\!\left(\vec x\right),$$
i.e. (according to Definition 1)
$$\min\vec x\le n\left(\sum\vec x^{-1}\right)^{-1}
\le\left(\prod\vec x\right)^\frac1n
\le\frac{\sum\vec x}n
\le\sqrt{\frac{\sum\vec x^2}{n}}
\le\max\vec x,$$
where the equality holds iff $\vec x$ is congruent (Theorem 5). $\square$
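For a concrete instance of the corollary, the sketch below computes the six quantities for one arbitrary sample vector with uniform weights. The variable names are my own. `math.prod` requires Python 3.8 or later.

```python
import math

x = [1.0, 2.0, 4.0, 8.0]   # illustrative, non-congruent samples
n = len(x)

hm = n / sum(1.0 / xk for xk in x)            # harmonic mean, p = -1
gm = math.prod(x) ** (1.0 / n)                # geometric mean, p = 0
am = sum(x) / n                               # arithmetic mean, p = 1
qm = math.sqrt(sum(xk * xk for xk in x) / n)  # quadratic mean, p = 2

# min <= HM <= GM <= AM <= QM <= max, strictly here since x is not congruent.
chain = [min(x), hm, gm, am, qm, max(x)]
chain_is_increasing = all(a < b for a, b in zip(chain, chain[1:]))
print(chain, chain_is_increasing)
```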