Ulysses' trip
Here we are at the awesome (awful) blog written by UlyssesZhan!

# The distribution when indistinguishable balls are put into boxes (2023-05-09)

If there are 200 typographical errors randomly distributed in a 500-page manuscript, find the probability that a given page contains exactly 3 errors.

We can abstract this type of problem as follows:

Suppose there are $n$ distinguishable boxes and $k$ indistinguishable balls. Now, we randomly put the balls into the boxes. For each of the boxes, what is the probability that it contains $m$ balls?

For example, if the first page contains 3 errors, the second page contains 197 errors, and the rest of the pages contain no errors, then this corresponds to the situation where the first box contains 3 balls, the second box contains 197 balls, and the rest of the boxes contain no balls. The balls are indistinguishable because we can only determine how many errors are on each page, not which errors they are.

To deal with the problem, we simply need to find these two numbers:

• the number of ways to put $k$ indistinguishable balls into $n$ distinguishable boxes, and
• the number of ways to put $k-m$ indistinguishable balls into $n-1$ distinguishable boxes.

The latter corresponds to the number of ways to put the balls into the boxes given that the particular box contains $m$ balls. After we find these two numbers, the ratio of the latter to the former is the probability in question.

To find the number of ways to put $k$ indistinguishable balls into $n$ distinguishable boxes, we can use the stars and bars method. To see this, consider an example with $n=4$ and $k=6$:

${}|{}\star{}\star{}|{}\star{}|{}\star{}\star{}\star{},$

which corresponds to the distribution $0,2,1,3$. We can see that there are $n-1$ bars and $k$ stars. The number of ways to put the balls is therefore the same as the number of ways to choose the $k$ positions of the stars among the $n+k-1$ positions, so the number of ways is

$N_{n,k}=\binom{n+k-1}{k}=\frac{\left(n+k-1\right)!}{k!\left(n-1\right)!}.$
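
The count can be sanity-checked by brute force (a small script I added for illustration, in the same spirit as the Ruby code later in this post):

```ruby
# Binomial coefficient via a falling factorial
def binomial n, k
  (n - k + 1..n).reduce(1, :*) / (1..k).reduce(1, :*)
end

# Count the distributions of k indistinguishable balls into n distinguishable
# boxes by recursing on the number of balls in the first box
def count_distributions n, k
  return 1 if n == 1
  (0..k).sum { |first| count_distributions n - 1, k - first }
end

p count_distributions(4, 6) # => 84
p binomial(4 + 6 - 1, 6)    # => 84
```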

Therefore, the final probability of the given box containing $m$ balls is

$P_{n,k}(m)=\frac{N_{n-1,k-m}}{N_{n,k}} =\frac{\left(n-1\right)k!\left(n+k-m-2\right)!}{\left(k-m\right)!\left(n+k-1\right)!}.$

Another easy way to derive this result is by using the generating function. The number $N_{n,k}$ is just the coefficient of $x^k$ in the expansion of the generating function $\left(1+x+x^2+\cdots\right)^n$. The generating function is just $\left(1-x\right)^{-n}$, which can be easily expanded by using the binomial theorem.
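
To see the generating-function route concretely, here is a small check I added (not from the original post): multiplying a truncated power series by $1+x+x^2+\cdots$ turns its coefficient list into partial sums, so we take partial sums $n$ times and read off the coefficient of $x^k$:

```ruby
# Coefficient of x^k in (1 + x + x^2 + ...)^n, truncated at degree k.
# Each multiplication by 1/(1-x) replaces the coefficients by their
# partial sums.
def gf_coefficient n, k
  poly = [1] + [0] * k
  n.times { poly = (0..k).map { |d| (0..d).sum { |j| poly[j] } } }
  poly[k]
end

p gf_coefficient(4, 6) # => 84, the same as the stars-and-bars count
```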

We are now interested in the limit $n,k\to\infty$ with $\lambda:=k/n$ fixed. By Stirling’s approximation, we have

$P_{n,k}(m)\sim\left(n-1\right) \frac{k^{k+1/2}\left(n+k-m-2\right)^{n+k-m-2+1/2}}{\left(k-m\right)^{k-m+1/2}\left(n+k-1\right)^{n+k-1+1/2} } \mathrm e^{k-m+n+k-1-k-n-k+m+2}.$

The $1/2$'s in the exponents can simply be dropped: if we extract the corresponding factors, each of them tends to unity in the limit. The exponential factor is just the constant $\mathrm e$. Therefore, we have

\begin{align*} P_{n,k}(m)&\sim\left(n-1\right) \frac{\left(\lambda n\right)^{\lambda n}\left(n+\lambda n-m-2\right)^{n+\lambda n-m-2} } {\left(\lambda n-m\right)^{\lambda n-m}\left(n+\lambda n-1\right)^{n+\lambda n-1}}\mathrm e\\ &=\left(\tfrac{n+\lambda n-m-2}{n+\lambda n-1}\right)^n \left(\tfrac{\left(n+\lambda n-m-2\right)\lambda n}{\left(\lambda n-m\right)\left(n+\lambda n-1\right)}\right)^{\lambda n} \left(\tfrac{\lambda n-m}{n+\lambda n-m-2}\right)^m \tfrac{\left(n-1\right)\left(n+\lambda n-1\right)}{\left(n+\lambda n-m-2\right)^2}\mathrm e\\ &\to\mathrm e^{-\frac{m+1}{\lambda+1}}\,\mathrm e^m\, \mathrm e^{-\frac{m+1}{\lambda+1}\lambda}\left(\tfrac\lambda{\lambda+1}\right)^m\tfrac1{\lambda+1}\mathrm e\\ &=\left(\tfrac\lambda{\lambda+1}\right)^m\tfrac1{\lambda+1}. \end{align*}

This is just the geometric distribution with parameter $p=1/(\lambda+1)=n/(k+n)$.
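
Returning to the opening manuscript problem ($n=500$ pages, $k=200$ errors, $m=3$), the exact formula and the geometric approximation can be evaluated directly (a numeric check I added; it restates the closed form above as code):

```ruby
# Exact P_{n,k}(m) = (n-1) k! (n+k-m-2)! / ((k-m)! (n+k-1)!),
# written with falling factorials to keep the integers small
def exact m, n, k
  (n - 1) * (k - m + 1..k).reduce(1, :*) / (n + k - m - 1..n + k - 1).reduce(1, :*).to_f
end

# Geometric approximation with p = n / (n + k)
def geometric m, n, k
  p = n.to_f / (n + k)
  p * (1 - p)**m
end

p exact(3, 500, 200)     # => 0.0166...
p geometric(3, 500, 200) # => 0.0166...
```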

If you want to simulate the number of balls in a box, here is a simple way to do this. First, because each box is the same, we can just focus on the first box without loss of generality. Then, we just need to randomly generate the positions of the $n-1$ bars among the $n+k-1$ positions, and then return the index of the first bar (which is the number of balls in the first box).

We can then write the following Ruby code to simulate the number of balls in the first box:

```ruby
def simulate n, k
  # Running minimum of n-1 samples, the i-th drawn uniformly from
  # 0...(n+k-1-i). This minimum has the same distribution as the smallest
  # of n-1 bar positions chosen without replacement from the n+k-1 slots,
  # which is exactly the number of balls in the first box.
  (n-1).times.inject(npkm1 = n+k-1) { |bar, i| [rand(npkm1 - i), bar].min }
end
```


Compare the simulated result with the theoretical result:

```ruby
def frequency m, n, k, trials
  trials.times.count { simulate(n, k) == m } / trials.to_f
end

def truth m, n, k
  (n-1) * (k-m+1..k).reduce(1, :*) / (n+k-m-1..n+k-1).reduce(1, :*).to_f
end

def approx m, n, k
  n * k**m / ((n+k)**(m+1)).to_f
end

srand 1108
m, n, k = 3, 5000, 8000
p frequency m, n, k, 10000 # => 0.0902
p truth m, n, k            # => 0.08965012972626446
p approx m, n, k           # => 0.08963271594131858
```

# Labeled break, next, and redo in Ruby (2023-05-07)

Many languages support breaking out of nested loops. There are some typical ways of doing this:

• Some languages can name loops by providing a label for the loop. In those languages, you can use break together with a label to specify which loop to break out of. Examples: Perl, Java, JavaScript, and some others.
• Some languages can specify the number of layers of loops to break out of. In those languages, you can use break together with a number to specify how many layers of loops to break out of. The only example that I know is C#.
• Some languages have goto statements. You can easily break from loops to wherever you want by using goto (in fact, breaking out of nested loops is one of the few cases where using goto is commonly recommended). Examples: C, C++.

However, in most other languages, it is not easy to break out of nested loops. A typical solution is this:

```ruby
outer_loop do
  break_outer = false
  inner_loop do
    if condition
      break_outer = true
      break
    end
  end
  break if break_outer
end
```


In languages with exceptions, another possible workaround is to use exceptions (the catch–throw control flow):

```ruby
catch :outer_loop do
  outer_loop do
    inner_loop do
      throw :outer_loop if condition
    end
  end
end
```


I wrote a simple module to better use this workaround.

```ruby
class JumpLabel < StandardError
  attr_reader :reason, :arg # needed so that the wrapper can read them below

  {break: true, next: true, redo: false}.each do |reason, has_args|
    define_method reason do |*args|
      @reason = reason
      @arg = args.size == 1 ? args.first : args if has_args
      raise self
    end
  end
end

class Module
  def register_label *method_names
    method_names.each do |name|
      old = instance_method name
      define_method name do |*args, **opts, &block|
        return old.bind_call self, *args, **opts unless block
        old.bind_call self, *args, **opts do |*bargs, **bopts, &bblock|
          block.call(*bargs, **bopts, jump_label: label = JumpLabel.new, &bblock)
        rescue JumpLabel => caught_label
          raise caught_label unless caught_label == label
          case label.reason
          when :break then break label.arg
          when :next then next label.arg
          when :redo then redo
          end
        end
      end
    end
  end
end
```


Example usage:

```ruby
Integer.register_label :upto, :downto
1.upto 520 do |i, jump_label:|
  print i
  1.downto -1314 do |j|
    print j
    jump_label.break 8 if j == 0
  end
end.tap { puts _1 }
# => 1108
```


## The general model

The model is as follows. There are $n$ agents (nations); they can trade some type of good, and they use the same currency. Every agent may produce or consume the good. The benefit function of the $j$th agent is $B_j$, and the cost function is $C_j$. The net amount that the $j$th agent imports from the $k$th agent is $T_{j,k}$ (so $T_{j,k}=-T_{k,j}$). The trade cost incurred by the $j$th agent is $S_j$. Now, we want to find the amount $Q_j$ that every agent produces and the amounts $T_{j,k}$ that every agent imports from other agents. Assume that $S$ depends only on $T$ and not on $Q$. Also, assume that there is no externality (i.e., whenever $j\ne k$, $\partial_kB_j=0$ and $\partial_kC_j=0$). Also, assume that every agent is rational and has perfect information.

Now, consider the profit $\Pi_j$ of the $j$th agent. Subtract the cost from the benefit, and we have

$\textstyle \Pi_j=B_j\!\left(Q_j+\sum_kT_{j,k}\right)-C_j\!\left(Q_j\right)-S_j\!\left(T\right).$

According to the fundamental theorem of welfare economics, $T$ and $Q$ are Pareto optimal under market equilibrium. We assume that this happens at a stationary point of the social benefit, which is the sum of the profits of all agents. We can then get the equations

\begin{align*} &0=\frac{\partial}{\partial Q_l}\sum_j\Pi_j =B_l'\!\left(Q_l+\sum_kT_{l,k}\right)-C_l'\!\left(Q_l\right),\quad\forall l;\\ &0=\frac{\partial}{\partial T_{l,m}}\sum_j\Pi_j =B_l'\!\left(Q_l+\sum_kT_{l,k}\right)-B_m'\!\left(Q_m+\sum_kT_{m,k}\right) -\sum_j\frac{\partial S_j}{\partial T_{l,m}}\!\left(T\right),\quad\forall l<m. \end{align*}

These are $n+\frac{n\left(n-1\right)}2$ equations, and $Q$ and $T$ together have exactly $n+\frac{n\left(n-1\right)}2$ degrees of freedom (note that $T$ is anti-symmetric). In principle, we can solve for $Q$ and $T$.
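
As a concrete illustration, here is a sketch (my own, using a hypothetical linear specification not taken from this post) that solves the $n=2$, $S=0$ case numerically, with marginal benefit $B_j'(x)=a_j-b_jx$ and marginal cost $C_j'(x)=c_j+d_jx$:

```ruby
require 'matrix'

# For n = 2 agents with zero trade cost, writing T := T_{1,2} for agent 1's
# net import, the stationarity conditions are three linear equations in
# (Q_1, Q_2, T):
#   B_1'(Q_1 + T) = C_1'(Q_1)
#   B_2'(Q_2 - T) = C_2'(Q_2)
#   B_1'(Q_1 + T) = B_2'(Q_2 - T)
def solve_two_agents a1, b1, c1, d1, a2, b2, c2, d2
  m = Matrix[
    [-(b1 + d1), 0,          -b1        ],
    [0,          -(b2 + d2), b2         ],
    [-b1,        b2,         -(b1 + b2) ]
  ]
  v = Vector[c1 - a1, c2 - a2, a2 - a1]
  (m.inverse * v).to_a.map(&:to_f)
end

q1, q2, t = solve_two_agents 10, 1, 2, 1, 8, 1, 1, 1
p [q1, q2, t]    # => [3.25, 4.25, 1.5]
p 10 - (q1 + t)  # => 5.25, the common marginal value B_1' = C_1' = B_2' = C_2'
```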

For the case where there is no trade cost, we can see that the domestic prices are all equal, and the price may be called the world price.

However, given $S=0$, the equations above are not independent. Actually, there are only $2n-1$ independent equations (all $2n$ components of $B'$ and $C'$ are equal). This means that, for $n>2$, free trade with zero trade cost is an indeterminate system.

This phenomenon looks counter-intuitive, but it is actually understandable: under zero trade cost, any two agents may trade an arbitrary amount of goods at the same world price, and this provides extra degrees of freedom to the model. To be specific, if $(Q,T)$ is a solution to the model, then $(Q,T+\Delta T)$ is also a solution, where the anti-symmetric matrix $\Delta T$ satisfies

$\sum_k\Delta T_{j,k}=0,\quad\forall j,$

of which only $n-1$ of the $n$ equations are independent. Therefore, the total number of degrees of freedom in the solution of the model is

$n+\frac{n\left(n-1\right)}2-\left(\frac{n\left(n-1\right)}2-\left(n-1\right)\right)=2n-1.$

Now, the useful quantities that we can solve for are the production and the net import $T_j:=\sum_kT_{j,k}$ of every agent. Note that the net import actually has only $n-1$ degrees of freedom because of the restriction $\sum_jT_j=0$.

## The middleman (re-exportation)

It is worth pointing out that the existence of the middleman (re-exportation) is entirely due to the presence of trade cost. Here we consider a simplified problem: there are three agents, acting respectively as the producer, the retailer, and the customer. The producer does not consume (its benefit is $0$); the customer does not produce (its cost and marginal cost are infinite); and the retailer neither produces nor consumes. Assume that the trade between any two of them does not bring cost to the third one. Then, the social benefit is

$\Pi=B\!\left(T_{\mathrm c,\mathrm r}+T_{\mathrm c,\mathrm p}\right) -C\!\left(T_{\mathrm c,\mathrm p}+T_{\mathrm r,\mathrm p}\right) -S_\mathrm c\!\left(T_{\mathrm c,\mathrm r},T_{\mathrm c,\mathrm p}\right) -S_\mathrm r\!\left(T_{\mathrm c,\mathrm r},T_{\mathrm r,\mathrm p}\right) -S_\mathrm p\!\left(T_{\mathrm c,\mathrm p},T_{\mathrm r,\mathrm p}\right).$
# A measure-theoretic formulation of statistical ensembles (part 2) (2023-05-01)

This article follows part 1.

## Introduction

In part 2, I will focus on non-thermal ensembles.

Before I proceed, I need to clarify that almost all ensembles that we actually use in physics are thermal ensembles, including the microcanonical ensemble, the canonical ensemble, and the grand canonical ensemble (the microcanonical ensemble can be considered a special case of a thermal ensemble where $\vec W^\parallel$ is trivial).

The theory of thermal ensembles is built by letting the system in question be in thermal contact with a bath. Similarly, if we let the system in question be in non-thermal contact with a bath, we can get the theory of non-thermal ensembles. An example of non-thermal ensembles that is actually used in physics is the isoenthalpic–isobaric ensemble, where we let the system in question be in non-thermal contact with a pressure bath.

However, we will see that it is harder to measure-theoretically develop the theory of non-thermal ensembles if we continue to use the same method as in the theory of thermal ensembles.

## Introducing non-thermal contact with an example

A thermal contact is a contact between thermal systems that conducts heat (while possibly exchanging some extensive quantities). A non-thermal contact is a contact between thermal systems that does not conduct heat (while exchanging some extensive quantities). For reversible processes, thermodynamically and mathematically, heat is equivalent to a form of work, with the entropy as the displacement and the temperature as the force. However, this is not true for irreversible processes because of the Clausius theorem. This should have something to do with the fact that entropy is different from other extensive quantities (as is illustrated in part 1).

First, let me introduce how we cope with reversible processes of two subsystems in non-thermal contact in thermodynamics. As an example, consider a tank of monatomic ideal gas separated into two parts by a thermally non-conductive, massless, incompressible plate in the middle that can move. The two parts can then adiabatically exchange energy ($U$) and volume ($V$) but not number of particles ($N$). For one of the parts, we have

$0=\delta Q=\mathrm dU+p\,\mathrm dV=\mathrm dU+\frac{2U}{3V}\,\mathrm dV,$
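
The coefficient $2U/3V$ comes from the equations of state of the monatomic ideal gas, $pV=Nk_{\mathrm B}T$ and $U=\frac32Nk_{\mathrm B}T$ (a one-line check):

$p=\frac{Nk_{\mathrm B}T}{V}=\frac23\cdot\frac{\frac32Nk_{\mathrm B}T}{V}=\frac{2U}{3V}.$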

which is good and easy to deal with because it is simply a differential 1-form.

However, this convenience is not available for irreversible processes, because then we do not have the simple relation $p=2U/3V$. Actually, the pressure is only well-defined for equilibrium states, and it is impossible to define a pressure that makes sense throughout a whole irreversible process, which involves non-equilibrium states. Therefore, although it seems that the “thermally non-conductive” condition imposes a stronger restriction on which states the composite system can reach without external sources, it actually does not, because the energy exchanged by the subsystems when they exchange volume is arbitrary (as long as it does not violate the second law of thermodynamics) if the process is not reversible.

The possible states of the non-thermally composite system then cannot be simply described by a vector subspace of $W^{(1)}\times W^{(2)}$. If we try to use the same approach as constructing the thermally composite system to construct the non-thermally composite system, the attempt will fail.

Let us continue with our example of a tank of gas. Although the pressure is not determined in an irreversible process, one thing is certain: the pressure on the plate by the gas on one side is equal to the pressure on the plate by the gas on the other side. This is because the plate is massless (otherwise its kinetic energy would be an external source of energy; also, remember that it is incompressible, which means that it cannot be an external source of volume). Therefore, the relation between the volume exchanged and the energy exchanged is determined as long as at least one side of the plate undergoes a reversible process, because then the reversible side has a determined pressure, which determines the pressure of the other side.

This is the key idea of formulating the non-thermal ensembles without formulating the non-thermally composite system. In a thermal or non-thermal ensemble, the composite system consists of two subsystems, one of which is the system in question and the other of which is the bath, which we control. We can let the bath have zero relaxation time (the time it takes to reach thermal equilibrium) so that every process it undergoes is reversible. Then the pressure (or, generally, any other intensive quantity that we control, times the temperature) is determined (and actually constant), and we can express the non-conductivity restriction as

$\mathrm dU+p\,\mathrm dV=0,$

where $p$ is the pressure, which is a constant. This is a homogeneous linear equation on $\vec W^{\parallel(1)}$ (whose vectors are denoted as $(\mathrm dU,\mathrm dV)$ in our case) which defines a vector subspace of $\vec W^{\parallel(1)}$, which we call $\vec W^{\parallel\parallel(1)}$. The dimension of $\vec W^{\parallel\parallel(1)}$ is that of $\vec W^{\parallel(1)}$ minus one. The physical meaning of $\vec W^{\parallel\parallel(1)}$ in this example is the hyperplane of fixed enthalpy.
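
Indeed, since $p$ is constant along the process, the non-conductivity constraint integrates directly to a conserved enthalpy:

$0=\mathrm dU+p\,\mathrm dV=\mathrm d\!\left(U+pV\right)=\mathrm dH.$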

Noting that our bath actually has the fixed intensive quantities $i=\left(1/T,p/T\right)\in\vec W^{\parallel(1)\prime}$, we can rewrite the above equation as

$\begin{equation} \label{eq: W star parallel} \vec W^{\parallel\parallel(1)} =\left\{s_1\in\vec W^{\parallel(1)}\,\middle|\,i\!\left(s_1\right)=0\right\}. \end{equation}$

Wait! What does $T$ do here? It is supposed to mean the temperature of the bath, but the temperature of the bath should be irrelevant since the contact is non-thermal. Indeed, it is irrelevant: the temperature of the bath serves only as an overall constant factor of $i$, which does not affect $\vec W^{\parallel\parallel(1)}$ as long as it is not zero or infinite. So far, this means that the temperature of the bath is not necessarily fixed, so the actual number of fixed intensive quantities is the dimension of $\vec W^{\parallel(1)\prime}$ minus one, which is the same as the dimension of $\vec W^{\parallel\parallel(1)}$. Later we will see that anything relevant to the temperature of the bath is ultimately irrelevant to our problem. This seems magical, but you will see the sense in it after we introduce, later, another way of developing the non-thermal ensembles (one that does not involve baths or non-thermal contact).

We can define a complement of $\vec W^{\parallel\parallel(1)}$ in $\vec W^{\parallel(1)}$ as $\vec W^{\parallel\perp(1)}$. Then, we have $\vec W^{\parallel(1)}=\vec W^{\parallel\parallel(1)}+\vec W^{\parallel\perp(1)}$. The space $\vec W^{\parallel\perp(1)}$ is a one-dimensional vector space.

For convenience, define $W^{\star\perp(1)}:=W^{\perp(1)}+\vec W^{\parallel\perp(1)}$. The vector space $\vec W^{\star\perp(1)}$ associated with it is a complement of $\vec W^{\parallel\parallel(1)}$ in $\vec W^{(1)}$. To make the notation look more consistent, we can use $\vec W^{\star\parallel(1)}$ as an alias of $\vec W^{\parallel\parallel(1)}$. They are the same vector space, but $\vec W^{\star\parallel(1)}$ emphasizes that it is a subspace of $\vec W^{(1)}$, and $\vec W^{\parallel\parallel(1)}$ emphasizes that it is a subspace of $\vec W^{\parallel(1)}$. Then, we have $W^{(1)}=W^{\star\perp(1)}+\vec W^{\star\parallel(1)}$. Every point in $W^{(1)}$ can be uniquely written as a sum of a point in $W^{\star\perp(1)}$ and a vector in $\vec W^{\star\parallel(1)}$. We can describe the decomposition by a projection $\pi^{\star(1)}:W^{(1)}\to W^{\star\perp(1)}$.

We will heavily use the “$\star$” in the superscripts of symbols. Any symbol labeled “$\star$” is dependent on $i$ (but independent of an overall constant factor on $i$). You can regard those symbols as having an invisible “$i$” in the subscript to keep in mind that they depend on $i$.

Example. Suppose we have a tank of gas with three extensive quantities $U,V,N$. It is in non-thermal contact with a pressure bath with pressure $p$ so that it can exchange $U$ and $V$ with the bath. Then, the projection $\pi^{\star(1)}$ projects macrostates with the same enthalpy and number of particles onto the same point. Because a complement of a vector subspace is not unique, there are multiple possible ways of constructing the projection. One possible way is

$\pi^{\star(1)}\!\left(U,V,N\right):=\left(U+pV,0,N\right).$

Here the fixed intensive quantity $p$ is involved. Note that this projection is still valid for different temperatures of the bath, so an overall constant factor of $i$ does not affect the projection.
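
As a sanity check, this map is idempotent, as a projection must be (the second component of the image is already $0$):

$\pi^{\star(1)}\!\left(\pi^{\star(1)}\!\left(U,V,N\right)\right) =\pi^{\star(1)}\!\left(U+pV,0,N\right) =\left(U+pV+p\cdot0,0,N\right) =\pi^{\star(1)}\!\left(U,V,N\right).$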

## Non-thermal contact with a bath

Now, after introducing non-thermal contact with an example, we can now formulate the non-thermal contact with a bath.

Suppose we have a system $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$. The main approach is constructing a composite system out of the composite system for the $\vec W^{\parallel(1)}$-ensemble.

The composite system for the $\vec W^{\parallel(1)}$-ensemble was introduced in part 1. We denote the bath that is in contact with our system as $\left(\mathcal E^{(2)},\mathcal M^{(2)}\right)$.

Consider this projection $\pi^\star:W\to W^{\star\perp}$ (where $W^{\star\perp}$ is an affine subspace of $W$ and the range of $\pi^\star$):

$\begin{equation} \label{eq: pi star} \pi^\star\!\left(e_1,e_2\right) :=\left(\pi^{\star(1)}\!\left(e_1\right), \rho_{\pi(e_1,e_2)}\!\left(\pi^{\star(1)}\!\left(e_1\right)\right)\right). \end{equation}$

To ensure that it is well-defined, we need to guarantee that $\pi^{\star(1)}\!\left(e_1\right)\in W^{\parallel(1)}_{\pi(e_1,e_2)}$ for any $e_1,e_2$, and this is true.

The two spaces $W^{\star\perp}$ and $W^{\perp}$ do not have any direct relation. The only relation between them is that the dimension of $W^{\star\perp}$ is one plus the dimension of $W^{\perp}$ (if they are finite-dimensional).

What is good about the projection $\pi^\star$ is that it satisfies $\vec W^{\star\parallel(1)}=\vec c^{(1)}\!\left(\vec\pi^\star(0)\right)$. This makes our notation consistent if we construct another composite system out of $\pi^\star$. Now, consider the composite system of $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ and $\left(\mathcal E^{(2)},\mathcal M^{(2)}\right)$ under the projection $\pi^\star$. In the notation of the spaces and mappings that are involved in the newly constructed composite system, we write “$\star$” in the superscript.

Just like how $\vec W^{\star\parallel(1)}$ is a subspace of $\vec W^{(1)}$, $\vec W^{\star\parallel(2)}$ is also a subspace of $\vec W^{(2)}$. This means that both $\vec\rho^{-1}\circ\vec\rho^\star$ and $\vec\rho\circ\vec\rho^{\star-1}$ are well-defined. The former maps $\vec W^{\star\parallel(1)}$ to another subspace of $\vec W^{(1)}$, and the latter maps $\vec W^{\star\parallel(2)}$ to another subspace of $\vec W^{(2)}$.

We can think of the construction of the new composite system as replacing the “plate” between the subsystems in the original composite system: a “thermally conductive plate” is swapped for a “thermally non-conductive plate”. Suppose that in the new situation, the intensive quantities “felt” by subsystem 1 are $i^\star\in\vec W^{\star\parallel(1)\prime}$. Then, because the bath is still the same bath in the two situations, we have

$-i^\star\circ\vec\rho^{\star-1}=-i\circ\vec\rho^{-1}.$

Therefore,

$\begin{equation} \label{eq: i star} i^\star:=i\circ\vec\rho^{-1}\circ\vec\rho^\star \end{equation}$

would be a good definition of $i^\star$. However, actually $i^\star$ is trivial:

$\begin{equation} \label{eq: i star = 0} i^\star=0. \end{equation}$

This is because Equation \ref{eq: pi star} shows that $\rho\!\left(W^{\star\parallel(1)}_e\right)=W^{\star\parallel(2)}_e$, and thus

$\vec\rho^{-1}\!\left(\vec\rho^\star\!\left(\vec W^{\star\parallel(1)}\right)\right) =\vec W^{\star\parallel(1)},$

which is the kernel of $i$ by definition.

Because $i^\star$ is trivial, it is independent of the temperature of the bath: it is zero no matter what temperature the bath is at.

Example. Suppose a system described by $U_1,V_1,N_1$ is in non-thermal contact with a pressure bath, and they can exchange energy and volume. The projection $\pi$ is

$\pi\!\left(U_1,V_1,N_1,U_2,V_2,N_2\right) =\left(\frac{U_1+U_2}2,\frac{V_1+V_2}2,N_1,\frac{U_1+U_2}2,\frac{V_1+V_2}2,N_2\right).$

Then, the projection $\pi^\star$ can be

$\pi^\star\!\left(U_1,V_1,N_1,U_2,V_2,N_2\right) =\left(U_1+pV_1,0,N_1,U_2-pV_1,V_1+V_2,N_2\right).$

By choosing a different $\pi^{\star(1)}$ or a different $\pi$, we can get a different $\pi^\star$. They physically mean the same composite system.

The space $W^\perp$ is four-dimensional, and the space $W^{\star\perp}$ is five-dimensional. We can denote the five degrees of freedom as $U,V,H_1,N_1,N_2$, where $U:=U_1+U_2$ is the total energy, $V:=V_1+V_2$ is the total volume, and $H_1:=U_1+pV_1$ is the enthalpy of subsystem 1. Then, the projection $\pi^\star$ can be written as

$\pi^\star\!\left(U_1,V_1,N_1,U_2,V_2,N_2\right) =\left(H_1,0,N_1,U-H_1,V,N_2\right).$

We can get $W^{\star\parallel}_e$ by finding the preimage under the projection, where $e:=\left(H_1,0,N_1,U-H_1,V,N_2\right)$:

$W^{\star\parallel}_e:=\pi^{\star-1}\!\left(e\right) =\left\{\left(H_1-pV_1,V_1,N_1,U-H_1+pV_1,V-V_1,N_2\right)\middle|\,V_1\in\mathbb R\right\}.$

Because it is parameterized by one real parameter $V_1$, it is a one-dimensional affine subspace of $W$. Projecting it under $c^{(1)}$ and $c^{(2)}$ will respectively give us $W^{\star\parallel(1)}_e$ and $W^{\star\parallel(2)}_e$:

$W^{\star\parallel(1)}_e :=\left\{\left(H_1-pV_1,V_1,N_1\right)\middle|\,V_1\in\mathbb R\right\},$ $W^{\star\parallel(2)}_e :=\left\{\left(U-H_1+pV_1,V-V_1,N_2\right)\middle|\,V_1\in\mathbb R\right\}.$

The affine isomorphism $\rho^\star_e$ is then naturally

$\rho^\star_e\!\left(H_1-pV_1,V_1,N_1\right)=\left(U-H_1+pV_1,V-V_1,N_2\right).$

Its vectoric form is then

$\vec\rho^\star\!\left(-p\,\mathrm dV_1,\mathrm dV_1,0\right) =\left(p\,\mathrm dV_1,-\mathrm dV_1,0\right).$

Our fixed intensive quantities are $i$, which is defined as $i\!\left(\mathrm dU_1,\mathrm dV_1,0\right)=\frac1T\,\mathrm dU_1+\frac pT\,\mathrm dV_1$. We can then get $i^\star$ by

$i^\star:=i\circ\vec\rho^{-1}\circ\vec\rho^\star =\left(-p\,\mathrm dV_1,\mathrm dV_1,0\right)\mapsto0.$

This is consistent with Equation \ref{eq: i star = 0}.

## Non-thermal ensembles (bath version)

Now, we can define the non-thermal contact with a bath to be the same as the thermal contact with a bath under $\pi^\star$. Utilizing this definition, we can define the composite system for non-thermal ensembles.

Definition. A composite system for the non-thermal $\vec W^{\parallel(1)}$-ensemble of the system $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ with fixed intensive quantities $i$ is the same as the composite system for the thermal $\vec W^{\star\parallel(1)}$-ensemble with fixed intensive quantities $i^\star=0$ (given by Equation \ref{eq: i star = 0}), where $\vec W^{\star\parallel(1)}$ is defined by Equation \ref{eq: W star parallel}.

This definition looks very neat. Also, just like how we define the domain of fixed intensive quantities of a thermal ensemble, we can define the domain of fixed intensive quantities of a non-thermal ensemble to consist of those values that make the integral in the definition of the partition function converge.

Because in part 1 we already derived a formula for the partition function that no longer involves any information about the bath, we can drop the “$(1)$” in the superscripts. The partition function of the non-thermal ensemble is then

$Z^\star\!\left(e,i^\star\right)=\int_{s\in\vec E^{\star\parallel}_e} \Omega\!\left(e+s\right) \mathrm e^{-i^\star\left(s\right)}\,\mathrm d\lambda^{\parallel}\!\left(s\right),\quad e\in E^{\star\perp},\quad i^\star\in I^\star_e\subseteq\vec W^{\star\parallel\prime}.$

Here, $i^\star$ is not fixed at the trivial value $0$ (I abused the notation here) but is actually an independent variable serving as one of the arguments of the partition function, taking values in $I^\star_e$ (which is not the domain of fixed intensive quantities of the non-thermal ensemble mentioned above). However, the only meaningful information about this non-thermal ensemble is in the behavior of $Z^\star$ at $i^\star=0$ rather than at an arbitrary $i^\star\in I^\star_e$, and we do not know whether $0\in I^\star_e$ or not. This is then a criterion to judge whether $i$ is in the domain of fixed intensive quantities of the non-thermal ensemble or not. To be clear, we define

$J:=\left\{i\in\vec W^{\parallel\prime}\,\middle|\, \exists e\in E^{\star\perp}:0\in I^\star_{e}\right\}.$

A problem about this formulation is that it is possible to have two $i$s that share the same thermal equilibrium state. In that case, the non-thermal ensemble is not defined.

Because $i^\star=0$, the observed extensive quantities in thermal equilibrium are just

$\begin{equation} \label{eq: epsilon^circ} \varepsilon^\circ =e+\left.\frac{\partial\ln Z^\star\!\left(e,i^\star\right)}{\partial i^\star}\right|_{i^\star=0} =e+\frac{\int_{s\in\left(E-e\right)\cap\vec W^{\star\parallel}} s\Omega\!\left(e+s\right)\mathrm d\lambda^{\parallel}\!\left(s\right)} {\int_{s\in\left(E-e\right)\cap\vec W^{\star\parallel}} \Omega\!\left(e+s\right)\mathrm d\lambda^{\parallel}\!\left(s\right)}, \end{equation}$

and the entropy in thermal equilibrium is just

$\begin{equation} \label{eq: S^circ} S^\circ=\ln Z^\star\!\left(e,0\right) =\ln\int_{s\in\left(E-e\right)\cap\vec W^{\star\parallel}} \Omega\!\left(e+s\right)\mathrm d\lambda^{\parallel}\!\left(s\right). \end{equation}$

We can cancel the parameter $e$ by Equation \ref{eq: epsilon^circ} and \ref{eq: S^circ} to get

$\begin{equation} \label{eq: S^circ vs epsilon^circ} S^\circ=\ln Z^\star\!\left(\pi^\star\!\left(\varepsilon^\circ\right),0\right) =\ln\int_{s\in\left(E-\varepsilon^\circ\right)\cap\vec W^{\star\parallel}} \Omega\!\left(\varepsilon^\circ+s\right)\mathrm d\lambda^{\parallel}\!\left(s\right). \end{equation}$

What is interesting about Equation \ref{eq: S^circ vs epsilon^circ} is that it actually does not guarantee that the intensive variables are defined in $\vec W^\parallel$. Physically, this means that the temperature is not necessarily defined, unlike the case of thermal ensembles (there, the thermal contact makes the temperature the same as that of the bath and thus defined). What is guaranteed is that the intensive variables are defined in $\vec W^{\star\parallel}$ and that they must be zero. Therefore, whenever the intensive variables are defined in $\vec W^\parallel$, they must be parallel to $i$ (and remain the same if we scale $i$ by an arbitrary non-zero factor). Physically, this means that the system must have the same intensive variables as the bath, up to a different temperature.

## Non-thermal ensembles (non-bath version)

It may seem surprising that we can define non-thermal ensembles without a bath. How is it possible to fix some features of the intensive variables without a bath? The inspiration comes from looking at Equation \ref{eq: W star parallel}. We can make a guess here: if we contract the system along $\vec W^{\star\parallel}$, the contraction satisfies the equal a priori probability principle. We make this guess because of the following arguments:

• Mathematically, a contraction is a legitimate new system, so it should also satisfy the axioms that we proposed before.
• Physically, because the temperature of the bath is arbitrary, the different accessible macrostates should not be too different, because otherwise the temperature would matter (as it appears in the expression of the partition function).

After finding the equilibrium state of the contraction, we can use the contractional pullback to find the equilibrium state of the original system.

If you do it right, you should get the same answer as Equation \ref{eq: S^circ vs epsilon^circ}.

## Summary

The only axiom that we used is the equal a priori probability principle. Then, we formulated three types of ensembles: microcanonical, thermal, and non-thermal.

]]>
UlyssesZhan
Summarizing my methods for studying2023-04-05T11:38:17-07:002023-04-05T11:38:17-07:00https://ulysseszh.github.io/misc/2023/04/05/study-methodsThis post is in Chinese.

## Audience

### The beginner stage

• The field has a very systematic body of knowledge. You can divide the field into multiple subfields, each with a fairly fixed order of learning. There is no fixed order of learning between subfields, but knowledge from different subfields can “unlock” each other.
• Almost all of the academic problems you encounter (homework, exams, etc., excluding things like projects) have no essential complexity. Most problems can be solved within a day.
• Learning resources are abundant. You can find plenty of textbooks. Because the literature is ample, Wikipedia usually has fairly complete entries. Because so many people study the subject, the questions you will run into have usually already been asked on Stack Exchange, and asked questions usually get answered.
• Most knowledge is “memorizable”. Of course, in principle any knowledge is memorizable if you are willing to rote-learn it, but here I mean memorization through understanding: in the long term, if you once understood a piece of knowledge, you can still understand it much later; in the short term, if you understand a piece of knowledge, you can probably remember it without deliberately memorizing it.
• The knowledge you learn is basically told to you by others, rather than created or discovered by yourself.

## The pace of learning

### Doing problems (homework and exams)

“Forgetting about exams” is not something that is easier said than done. It is easy to say, and also easy to do: you simply ignore the existence of “exams”. There are a few sayings that have been repeated to death: you go to school to learn, not to take exams; exams exist to check your learning, not as the goal of learning. These sayings are not wrong, and they do hit the point, but people have heard them so many times that they now sound hollow and meaningless. If you are a student trapped in exams, you had better re-examine what these sayings mean, and not let yourself be driven by the inertia of thinking carried over from primary and secondary school.

## Intuition

### Notations and symbols

• Symbolic language is mostly used for reading and writing, while natural language is mostly used for listening and speaking. In other words, symbolic language is non-oral. This key difference means that you cannot “communicate” with yourself through symbolic language.
• Symbolic language was invented by experts, while natural language emerged and was shaped spontaneously by all the humans in a society communicating with each other.
• The correspondence between signifier and signified is more fixed in symbolic language than in natural language, so its meaning is clearer.
• Symbolic language is written non-linearly, while natural language is written linearly. Here “linear” means arranging characters along a single line; those who know line notation in chemistry should know what this means.

## Life and entertainment

]]>
UlyssesZhan
A measure-theoretic formulation of statistical ensembles (part 1)2023-03-30T21:49:51-07:002023-03-30T21:49:51-07:00https://ulysseszh.github.io/physics/2023/03/30/measure-ensembleI feel that the process of using statistical ensembles to find properties of thermal systems is not rigorous enough. There are some operations that need to be defined precisely. Also, the treatment is not general enough. Currently, the only generally used statistical ensembles are the microcanonical ensemble, the canonical ensemble, and the grand canonical ensemble, but there are actually other possible ensembles that are potentially useful. Therefore, I feel it necessary to attempt a mathematical formulation.

## Mathematical tools and notations

Suppose $(\Omega,\sigma(\Omega),P)$ is a probability space. Suppose $W$ is an affine space. For some map $f:\Omega\to W$, we define the $P$-expectation of $f$ as

$\mathrm E_P\!\left[f\right]:=\int_{x\in\Omega}\left(f(x)-e_0\right)\mathrm dP(x)+e_0,$

where $e_0\in W$ is arbitrary. Here the integral is a Pettis integral. The expectation is defined whenever the Pettis integral is defined, and it is then well-defined in that it is independent of the $e_0$ we choose.
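
The independence from $e_0$ can be checked concretely in a finite setting. Below is a minimal sketch of my own (the helper name `expectation` is hypothetical, not from the post), assuming a discrete one-dimensional distribution:

```python
# Sketch of the affine expectation E_P[f] = ∫ (f − e0) dP + e0 in a
# discrete one-dimensional setting: the result does not depend on the
# base point e0 because the probabilities sum to 1.
def expectation(values, probs, e0):
    return e0 + sum(p * (v - e0) for v, p in zip(values, probs))

vals = [1.0, 3.0, 8.0]
probs = [0.2, 0.5, 0.3]

# Any choice of e0 gives the same expectation (here 4.1).
assert abs(expectation(vals, probs, 0.0) - expectation(vals, probs, 100.0)) < 1e-9
```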

Suppose $X,Y$ are Polish spaces. Suppose $(Y,\sigma(Y),\mu),(X,\sigma(X),\nu)$ are measure spaces, where $\mu$ and $\nu$ are σ-finite Borel measures. Suppose $\pi:Y\to X$ is a measurable map so that

$\forall A\in\sigma(X):\nu(A)=0\Rightarrow\mu\!\left(\pi^{-1}\!\left(A\right)\right)=0.$

Then, for each $x\in X$, there exists a Borel measure $\mu_x$ on the measurable subspace $\left(\pi^{-1}(x),\sigma\!\left(\pi^{-1}(x)\right)\right)$, such that for any integrable function $f$ on $Y$,

$\int_{y\in Y}f\!\left(y\right)\mathrm d\mu(y) =\int_{x\in X}\mathrm d\nu(x)\int_{y\in\pi^{-1}(x)}f\!\left(y\right)\mathrm d\mu_x(y).$
Proof

Proof. Because $\mu$ is σ-finite, we have a countable covering of $Y$ by pairwise disjoint measurable sets of finite $\mu$-measure, denoted as $\left\{Y_i\right\}$. Each $Y_i$ inherits the σ-algebra from $Y$, and $\left(Y_i,\sigma\!\left(Y_i\right),\mu\right)$ is a measure space.

Define $\pi_i:Y_i\to X$ as the restriction of $\pi$ to $Y_i$, then $\pi_i$ is automatically a measurable map from $Y_i$ to $X$, and for any $x\in X$,

$\pi^{-1}(x)=\bigcup_i\pi_i^{-1}(x),$

and the sets in this union are pairwise disjoint.

Let $\nu_i$ be a measure on $X$ defined as

$\nu_i(A):=\mu\!\left(\pi_i^{-1}\!\left(A\right)\right).$

This is a measure because $\pi_i$ is a measurable map. According to the disintegration theorem, there exists a family of Borel measures $\left\{\mu_{i,x}\right\}_{x\in X}$ on $Y_i$ such that for $\nu_i$-almost all $x\in X$, $\mu_{i,x}$ is concentrated on $\pi_i^{-1}(x)$ (in other words, $\mu_{i,x}\!\left(Y_i\setminus\pi_i^{-1}(x)\right)=0$); and for any integrable function $f$ on $Y_i$,

$\int_{y\in Y_i}f\!\left(y\right)\mathrm d\mu(y) =\int_{x\in X}\mathrm d\nu_i(x)\int_{y\in\pi_i^{-1}(x)}f\!\left(y\right)\mathrm d\mu_{i,x}(y).$

From the condition in the original proposition, we can easily prove that $\nu_i$ is absolutely continuous w.r.t. $\nu$. Therefore, we have their Radon–Nikodym derivative

$\varphi_i(x):=\frac{\mathrm d\nu_i(x)}{\mathrm d\nu(x)}.$

For each $x\in X$, define the measure $\mu_x$ on $\pi^{-1}(x)$ as

$\mu_x(A):=\sum_i\varphi_i\!\left(x\right)\mu_{i,x}\!\left(A\cap Y_i\right).$

This is a well-defined measure because the sets $A\cap Y_i$ are pairwise disjoint, and $\mu_{i,x}$ is a well-defined measure on $Y_i$.

Then, for any integrable function $f$ on $Y$,

\begin{align*} \int_{y\in Y}f\!\left(y\right)\mathrm d\mu(y) &=\sum_i\int_{y\in Y_i}f\!\left(y\right)\mathrm d\mu(y)\\ &=\sum_i\int_{x\in X}\mathrm d\nu_i(x)\int_{y\in\pi_i^{-1}(x)}f\!\left(y\right)\mathrm d\mu_{i,x}(y)\\ &=\sum_i\int_{x\in X}\varphi_i\!\left(x\right)\mathrm d\nu(x) \int_{y\in\pi_i^{-1}(x)}f\!\left(y\right)\mathrm d\mu_{i,x}(y)\\ &=\int_{x\in X}\mathrm d\nu(x)\sum_i\int_{y\in\pi_i^{-1}(x)}f\!\left(y\right)\mathrm d\mu_x(y)\\ &=\int_{x\in X}\mathrm d\nu(x)\int_{y\in\pi^{-1}(x)}f\!\left(y\right)\mathrm d\mu_x(y).&\square \end{align*}

Here, the family of measures $\left\{\mu_x\right\}$ is called the disintegration of $\mu$ w.r.t. $\pi$ and $\nu$.

For two vector spaces $\vec W_1,\vec W_2$, we denote $\vec W_1\times\vec W_2$ as the direct sum of them. Also, rather than calling the new vector space their direct sum, I prefer to call it the product vector space of them (not to be confused with the tensor product) so that it is consistent with the notion of product affine spaces, product measure spaces, product topology, etc. Those product spaces are all notated by “$\times$” in this article.

Also, “$\vec W_1$” can be an abbreviation of $\vec W_1\times\left\{0_2\right\}$, where $0_2$ is the zero vector in $\vec W_2$.

Suppose $W$ is an affine space associated with the vector space $\vec W$. For any $A\subseteq W$ and $B\subseteq\vec W$, we denote $A+B$ as the Minkowski sum of $A$ and $B$, i.e.,

$A+B:=\left\{a+b\,\middle|\,a\in A,\,b\in B\right\}.$

This extends the definition of usual Minkowski sums for affine spaces.

By the way, because of the abbreviating “$\vec W_1$” meaning $\vec W_1\times\left\{0_2\right\}$ above, we can abuse the notation and write

$\vec W_1+\vec W_2=\vec W_1\times\vec W_2,$

where “$+$” denotes the Minkowski sum. This is true for any two vector spaces $\vec W_1,\vec W_2$ that do not share a non-trivial vector subspace.

In general, it is not necessarily possible to decompose a topology as a product of two topologies. However, for a locally convex Hausdorff TVS, we can always decompose the topology as the product of the topologies on a pair of complementary vector subspaces, one of which is finite-dimensional. This works because every finite-dimensional subspace of such a space is topologically complemented. The complete statement is the following:

Let $\vec W$ be a locally convex Hausdorff TVS. For any finite-dimensional subspace $\vec W^\parallel$ of $\vec W$, there is a complement $\vec W^\perp$ of it such that the topology $\tau\!\left(\vec W\right)$ is the product topology of $\tau\!\left(\vec W^\parallel\right)$ and $\tau\!\left(\vec W^\perp\right)$.

This decomposition is also valid for affine spaces. If an affine space $W$ is associated with a locally convex Hausdorff TVS $\vec W$, then for any finite-dimensional vector subspace $\vec W^\parallel$ of $\vec W$, we can topologically decompose $W$ into $W^\perp+\vec W^\parallel$.

Because the product topology of subspace topologies is the same as the subspace topology of the product topology, we can also decompose $E^\perp+\vec W^\parallel$ as the product topological space of $E^\perp$ and $\vec W^\parallel$ if $E^\perp\subseteq W^\perp$.

Such decompositions are useful because they allow us to disintegrate Borel measures. If we already have a σ-finite Borel measure on $E^\perp+\vec W^\parallel$ and we can define a σ-finite Borel measure on $\vec W^\parallel$, then we can define a measure on $E^\perp$ by disintegration, and the disintegration is guaranteed to be σ-finite and Borel.

When I want to use multi-index notations, I will use “$\bullet$” to denote the indices. For example,

$\Sigma\alpha_\bullet:=\sum_\bullet\alpha_\bullet.$ $\alpha_\bullet\beta_\bullet:=\sum_\bullet\alpha_\bullet\beta_\bullet.$ $\alpha_\bullet^{\beta_\bullet}:=\prod_\bullet\alpha_\bullet^{\beta_\bullet}.$ $\alpha_\bullet!:=\prod_\bullet\alpha_\bullet!.$
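As a quick illustration of these conventions (a sketch of my own, not from the post), take $\alpha=(2,3)$ and $\beta=(1,2)$:

```python
import math

# Illustration of the multi-index conventions with alpha = (2, 3), beta = (1, 2).
alpha = [2, 3]
beta = [1, 2]

sigma_alpha = sum(alpha)                                # Σα• = 2 + 3 = 5
dot = sum(a * b for a, b in zip(alpha, beta))           # α•β• = 2·1 + 3·2 = 8
power = math.prod(a ** b for a, b in zip(alpha, beta))  # α•^β• = 2¹·3² = 18
fact = math.prod(math.factorial(a) for a in alpha)      # α•! = 2!·3! = 12
```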

## Extensive quantities and macrostates

First, I need to point out that the most central state function of a thermal system is not its energy, but its entropy. The energy is regarded as the central state function in thermodynamics, which can be seen from the fundamental equation of thermodynamics

$\mathrm dU=-p\,\mathrm dV+T\,\mathrm dS+\mu\,\mathrm dN.$

We also always perform Legendre transformations on the potential function $U$ to get other potential functions instead of doing the transformation on other extensive quantities. All such practices make us think that $S$ is just some quantity similar to $V$ and $N$, and that mathematically we can just regard it as an extensive quantity whose change is a way of doing work.

However, this is not the case. The entropy $S$ is different from $U,V,N$ in the following sense:

• The entropy is a derived quantity due to a mathematical construction from the second law of thermodynamics, while $U,V,N$ are observable quantities that have solid physical meanings before we introduce anything about thermodynamics.
• The entropy may change in an isolated system, while $U,V,N$ do not.
• We may have an intuitive understanding of how different systems in contact may exchange $U,V,N$ with each other, but $S$ cannot be “exchanged” in such a sense.
• In statistical mechanics, $U,V,N$ restrict what microstates are possible for a thermal system, but $S$ serves as a totally different role: it represents something about the probability distribution over all the possible microstates.

Therefore, I would rather rewrite the fundamental equation of thermodynamics as

$\begin{equation} \label{eq: fundamental} \mathrm dS=\frac1T\,\mathrm dU+\frac pT\,\mathrm dV-\frac\mu T\,\mathrm dN. \end{equation}$

Equation \ref{eq: fundamental} embodies more clearly how different quantities serve different roles, but its own physical meaning becomes vague. Does it mean different ways of changing the entropy in quasi-static processes? Both mathematically and physically, yes, but that is not a useful interpretation. Because what we are doing is a mathematical formulation of physical theories, we do not need to try to assign physical meanings to everything we construct. This new equation is purely mathematical, and the only way we use it is to relate intensive variables to derivatives of $S$ w.r.t. extensive quantities.

From now on, I will call quantities like $U,V,N$ the extensive quantities, not including $S$. However, this is not a good statement as part of our mathematical formulation. Considering that there is a good notion of how different systems may exchange values of extensive quantities and that we can scale a system by multiplying the extensive quantities by a factor, we require that the extensive quantities must support at least linear operations… do we?

Well, actually we will see that if we require the space to be a vector space, things would be a little complicated, because sometimes we need to construct a new space of extensive quantities out of an affine subspace of an existing one, which is not a vector space by nature. If we require the space to be a vector space, we need to translate that affine subspace to make it pass through the zero element of the vector space, which is possible but gives no insight into the physics and only adds complication to our construction. Therefore, I will not require the space of extensive quantities to be a vector space, but an affine space.

You may ask, OK then, but how do we “add” or “scale” extensive quantities if they live on an affine space? First, regarding the addition operation, we will use an abstraction for such operations so that the actual implementation of how we combine the summands is hidden under the abstraction. We will see that this abstraction is useful because it also applies to other scenarios or useful operations that do not necessarily involve any meaningful addition. Regarding the scaling operation, I would argue that we do not need it for now. I have generalized the notion of extensive quantities so that it now includes some quantities that are not really extensive in any traditional sense. They are no longer meant to be scaled because they simply cannot be. Actually, rather than calling them extensive quantities, I would like to call them a macrostate, the only difference from the general notion of a macrostate being that it has an affine structure so that I can take the ensemble average of it to get its macroscopic value. I will stick to the term “extensive quantities” because they are actual extensive quantities in all my examples and because this name is a good way to understand the physical meaning, but you need to keep in mind that what I actually refer to is a macrostate.

There is another difficulty. If we look closely, Equation \ref{eq: fundamental} actually does not make much sense in that $N$ is quantized (and so is $U$ if we are doing quantum mechanics). If we work over the real numbers, we can always translate a quantized quantity to a value that is not allowed, which means that we cannot have the full set of operations on the allowed values of the extensive quantities. Therefore, we need to specify a subset of the affine space to represent the allowed values of the extensive quantities.

We also see that Equation \ref{eq: fundamental} is a relation between differentials. Do we need to require a differential structure on the space of extensive quantities? Not yet, because it is actually somewhat difficult. The same difficulty about the quantized quantities applies. The clever way is to just avoid using the differentials. (Mathematicians are always skeptical about differentiating something while physicists just assume everything is differentiable…) It may seem surprising, but differentials are actually avoidable in our mathematical formulation if you do not require intensive variables to be well-defined inside the system itself (actually, they are indeed not well-defined except when you have a system in thermal equilibrium and take the thermodynamic limit).

If we have to use differentials, we can use the Gateaux derivative. It is general enough to be defined on any locally convex TVS, and it is intuitive when it is linear and continuous.

Although differential structure is not necessary, there is an inevitable structure on the space of extensive quantities. Remember that in canonical and grand canonical ensembles, we allow $U$ or $N$ to fluctuate, so we should be able to describe such fluctuations on our space of extensive quantities. To do this, I think it is safe to assume that we can have some topology on the allowed subset to make it a Polish space, just like how probabilists often assume about the probability space they are working on.

A final point. Here is a difference in how physicists and mathematicians describe probability distributions: physicists would use a probability density function while mathematicians would use a probability measure. Mathematically, to have a probability density function, we need an underlying measure on our space to provide a notion of “volume”, and then we can define the probability density function as the Radon–Nikodym derivative of the probability measure w.r.t. the underlying volume measure. Also, for the Radon–Nikodym derivative to exist, the probability measure must be absolutely continuous w.r.t. the volume measure, which means that we have to sacrifice all the probability distributions that are not absolutely continuous if we take the probability density function approach. It may then seem that the probability density function approach introduces an excess measure structure on the space of extensive quantities and loses some generality, but it will turn out that the extra structure is useful. Therefore, I will use the probability density function approach.

Here is our final definition of the space of extensive quantities:

Definition. A space of extensive quantities is a tuple $(W,E,\lambda)$, where

• $W$ is an affine space associated with a reflexive vector space $\vec W$ over $\mathbb R$, and it is equipped with topology $\tau(W)$ that is naturally constructed from the topology $\tau\!\left(\vec W\right)$ on $\vec W$;
• $E\subseteq W$ is a topological subspace of $W$, and its topology $\tau(E)$ makes $E$ a Polish space; and
• $\lambda:\sigma(E)\to[0,+\infty]$ is a non-trivial σ-finite Borel measure, where $\sigma(E)\supseteq\mathfrak B(E)$ is a σ-algebra on $E$ that contains the Borel σ-algebra on $E$.

Here, I also added a requirement of σ-finiteness. This is necessary when constructing product measures. At first I also wanted to require that $\lambda$ have some translational invariance, but I then realized that it is not necessary, so I removed it from the definition (but we will see that we need it as a property of baths).

Example. Here is an example of a space of extensive quantities.

\begin{align*} W&:=\mathbb R^3,\\ E&:=(0,+\infty)\times(0,+\infty)\times\mathbb Z^+,\\ \lambda(A)&:=\sum_{N\in\mathbb Z^+}\operatorname{area}(A\cap(0,+\infty)\times(0,+\infty)\times\{N\}). \end{align*}

Physically we may think of this as the extensive quantities of the system of ideal gas. The three dimensions of $W$ are energy, volume, and number of particles.

Example. Here is another example of a space of extensive quantities.

\begin{align*} W&:=\mathbb R^2,\\ E&:=\{(3N/2+n,N)\,|\,N\in\mathbb Z^+,n\in\mathbb N\},\\ \lambda(A)&:=\operatorname{card}A. \end{align*}

Physically we may think of this as the extensive quantities of the system of Einstein solid with $\hbar\omega=1$. The two dimensions of $W$ are energy and number of particles.
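The accessible set of this example is easy to check computationally. Here is a minimal sketch (the helper names `in_E` and `lam` are mine), assuming $\hbar\omega=1$ as above:

```python
# Sketch of the accessible set E = {(3N/2 + n, N) | N in Z+, n in N}
# of the Einstein-solid example, with lambda the counting measure.
def in_E(U, N):
    """Check whether (U, N) is an accessible value of extensive quantities."""
    if N < 1 or int(N) != N:
        return False
    n = U - 1.5 * N  # number of energy quanta above the ground level
    return n >= 0 and abs(n - round(n)) < 1e-9

def lam(A):
    """lambda(A) = card A, the counting measure on E."""
    assert all(in_E(U, N) for (U, N) in A)
    return len(A)

# With hbar*omega = 1, one particle has ground-state energy 3/2.
assert in_E(1.5, 1) and in_E(2.5, 1) and not in_E(2.0, 1)
assert lam({(1.5, 1), (2.5, 1), (3.0, 2)}) == 3
```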

## Thermal systems and the number of microstates

Remember I said above that, in statistical mechanics, $U,V,N$ restrict what microstates are possible for a thermal system. We can translate this as such: for each possible value of the extensive quantities, denoted as $e\in E$, there is a set of possible microstates, denoted as $M_e$ (you can then see why we excluded the entropy from the extensive quantities: otherwise we could not do such a classification of microstates).

Now the problem is what structures should we add to $M_e$ for each $e\in E$. Recall that in statistical mechanics, we study probability distribution over all possible microstates. Therefore, we need to be able to have a probability measure on $M_e$. In other words, $M_e$ should be a measurable space. As said before, we can either use a probability measure directly, or use a volume measure together with a probability density function. This time, we seem to have no choice but the probability density function approach because there is a natural notion of volume on $M_e$: the number of microstates.

Wait! There is a problem. Recall that in the microcanonical ensemble, we allow the energy to fluctuate. The number of microstates at exactly a certain energy is actually zero in most cases, so we actually consider those microstates within some small range of energy. In other words, we are considering the microstate density: the number of microstates inside a unit range of energy. Similarly, we should define a measure on $M_e$ to represent the microstate density, which is the number of microstates inside a unit volume of extensive quantities, where the “volume” is measured by the measure $\lambda$ in the space of extensive quantities.

This makes our formulation a little bit different from the microcanonical ensemble: our formulation would allow all extensive quantities to fluctuate while the microcanonical ensemble would only allow the energy to fluctuate. This is inevitable because we are treating extensive quantities like energy, volume, and number of particles as the same kind of quantity. It is not preferable to separate a subspace out from our affine space $W$ to say “these are the quantities that may fluctuate, and those are not.” Therefore, we need to justify why we may allow all extensive quantities to fluctuate. The justification is: mathematically, we are actually not allowing any extensive quantities to fluctuate. There is no actual fluctuation, and we are directly considering the microstate density without involving any change in the extensive quantities. In other words, using the language of microcanonical ensemble, we are considering the area of the surface of the energy shell instead of the volume of the energy shell with a small thickness.

Another important point is that we must make sure that specifying all the extensive quantities is enough to restrict the system to a finite number of microstates. In other words, the total microstate density should be finite for any possible $e\in E$. Also, there should be at least some possible microstates in $M_e$, so the total microstate density should not be zero.

We may then sum up the above discussion to give $M_e$ enough structure to make it the set of microstates of a thermal system with the given extensive quantities $e$. Then, the disjoint union of all of them (the family of measure spaces) is the thermal system.

Definition. A thermal system is a pair $\left(\mathcal E,\mathcal M\right)$, where

• $\mathcal E:=\left(W,E,\lambda\right)$ is a space of extensive quantities;
• $\mathcal M:=\bigsqcup_{e\in E}M_e$ is a family of measure spaces; and
• For each $e\in E$, $M_e$ is a measure space equipped with a measure $\mu_e$ such that $\mu_e\!\left(M_e\right)$ is finite and nonzero.

From now on, I will use a pair $(e,m)\in\mathcal M$ to specify a single microstate, where $e\in E$ and $m\in M_e$.

Example. For the thermal system of a solid consisting of spin-$\frac12$ particles, where each particle has two possible states with energy $0$ and $1$, we can construct

\begin{align*} W&:=\mathbb R^2,\\ E&:=\left\{\left(U,N\right)\in\mathbb N\times\mathbb Z^+\,\middle|\,U\le N\right\},\\ \lambda(A)&:=\operatorname{card}A,\\ M_{U,N}&:=\left\{n\in\left\{0,1\right\}^N\,\middle|\,\sum_in_i=U\right\},\\ \mu_{U,N}(A)&:=\operatorname{card}A. \end{align*}

This should be the simplest example of a thermal system.
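For this system everything is finite, so the construction can be checked by brute force. A minimal sketch of my own (the function name is hypothetical): enumerating $\left\{0,1\right\}^N$ shows that $\mu_{U,N}\!\left(M_{U,N}\right)$ is the binomial coefficient $\binom NU$.

```python
from itertools import product
from math import comb

def microstate_count(U, N):
    """mu_{U,N}(M_{U,N}): number of N-spin configurations with total energy U."""
    return sum(1 for n in product((0, 1), repeat=N) if sum(n) == U)

# The count equals the binomial coefficient C(N, U) ...
assert microstate_count(2, 4) == comb(4, 2) == 6
# ... and summing over all accessible U at fixed N recovers 2^N.
assert sum(microstate_count(U, 3) for U in range(4)) == 2 ** 3
```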

Example. We may complete the example of the system of ideal gas. Suppose we are considering the system of ideal atomic gas inside a cubic box. The construction of the space of extensive quantities is the same as before. Denote possible values of extensive quantities in coordinates $e=(U,V,N)$. Now the measure spaces $M_e$ may be constructed as such:

\begin{align*} M_{U,V,N}&:=\left\{\left(\ldots\right)\in \left(\left[0,\sqrt[3]V\right]^3\times\mathbb R^3\right)^N \,\middle|\,\text{lexicographic order, }\sum_i\frac{\left|\mathbf p_i\right|^2}{2m}=U\right\},\\ \mu_{U,V,N}(A)&:=\frac{H^{6N-1}(A)}{h^{3N}}. \end{align*}

The “lexicographic order” here means that only those configurations where the particle indices coincide with the lexicographic order are included in $M_e$. This is because the particles are indistinguishable, and the order of particles is irrelevant. The lexicographic-order restriction is the same as taking the quotient of the $N$-fold Cartesian product by permutation actions, but then defining $\mu_e$ would be difficult. Alternatively, we may still keep them ordered but divide the result by $N!$ in the definition of $\mu_e$; however, this way is less clear in its physical meaning.

Here $H^d$ is the $d$-dimensional Hausdorff measure. Intuitively, the expression $H^{6N-1}(A)$ is just the $(6N-1)$-dimensional “volume” of $A$.

Since we have the microstate density, why not also have the true number of microstates? We can define a measure on $\mathcal M$ to represent the number of microstates.

Definition. The measure of number of microstates is a measure $\mu:\sigma(\mathcal M)\to\left[0,+\infty\right]$, where

$\sigma(\mathcal M):=\left\{\bigsqcup_{e\in A}B_e\,\middle|\,A\in\sigma(E),\,B_e\in\sigma(M_e)\right\},$

and the measure is defined by

$\mu(A):=\iint\limits_{(e,m)\in A}\mathrm d\mu_e(m)\,\mathrm d\lambda(e).$

The uniqueness of $\mu$ is guaranteed by the σ-finiteness of $\lambda$ and $\mu_e$. The expression $\mu(A)$ is called the number of microstates in $A$.
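When $\lambda$ and every $\mu_e$ are counting measures, the double integral reduces to a double sum. A minimal sketch using the spin-$\frac12$ solid above (the function name `mu` is mine), for sets $A$ that are unions of whole fibers $\left\{e\right\}\times M_e$:

```python
from math import comb

def mu(A):
    """Number of microstates mu(A) for the spin-1/2 solid, where A is a
    set of accessible extensive-quantity values (U, N). Both lambda and
    mu_e are counting measures, so the double integral reduces to
    mu(A) = sum over (U, N) in A of card(M_{U,N}) = sum of C(N, U)."""
    return sum(comb(N, U) for (U, N) in A)

# Taking all accessible values with N = 3 recovers the total 2^3 = 8.
assert mu({(U, 3) for U in range(4)}) == 8
```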

## States and the entropy

Here is a central idea in statistical ensembles: a state is a probability distribution on the microstates of a thermal system. It is among the ideas upon which the whole theory of statistical ensembles is built. I will take this idea, too.

As said before, I have taken the probability density approach of defining a probability distribution. Therefore, a state is just a probability density function.

Definition. A state of a thermal system $(\mathcal E,\mathcal M)$ is a function $p:\mathcal M\to\left[0,+\infty\right]$ such that $(\mathcal M,\sigma(\mathcal M),P)$ is a probability space, where $P:\sigma(\mathcal M)\to\left[0,1\right]$ is defined by

$\begin{equation} \label{eq: probability measure} P(A):=\int_Ap\,\mathrm d\mu. \end{equation}$

Two states are the same if they are equal $\mu$-almost everywhere.

A probability space is just a measure space with a normalized measure, and here the physical meaning of $p$ is the probability density on $\mathcal M$, and $P(A)$ is the probability of finding a microstate in $A$.

Note that a state is not necessarily an equilibrium state (thermal state). We will introduce the concept of equilibrium states later.

Now we may introduce the concept of entropy.

I need to clarify that the entropy that we are talking about here is just the entropy in statistical mechanics. The reason I add this clarification is that we may also formally define an entropy in the language of measure theory, which is defined for any probability space and does not depend on any so-called probability density function or a “volume” measure (which is the number of microstates in our case). The definition of this entropy is (if anyone is interested)

$S^{\mathrm{info}}:=\sup_\Pi\sum_{A\in\Pi}-P(A)\ln P(A),$

where $P$ is the probability measure on the probability space, and the supremum is taken over all $P$-almost partitions $\Pi$ of the probability space ($\Pi$ is a subset of the σ-algebra such that $P(\bigcup_{A\in\Pi}A)=1$ and $P(A\cap B)=0$ for distinct $A,B\in\Pi$). This definition looks intuitive and nice, and not surprisingly it is… not consistent with the entropy in statistical mechanics. The discrepancy happens when we are doing classical statistical mechanics because the entropy defined above diverges to infinity for “continuous” probability distributions. A quick check is that the entropy of the uniform distribution over $[0,1]$ is $+\infty$.
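The quick check can be made explicit: for the uniform distribution on $[0,1]$, the partition into $n$ equal intervals has entropy $\ln n$, which grows without bound as the partition refines. A minimal sketch (function name mine):

```python
import math

def partition_entropy(n):
    """Entropy -Σ P(A) ln P(A) of the partition of [0, 1] into n equal
    intervals under the uniform distribution: each has probability 1/n."""
    p = 1.0 / n
    return -sum(p * math.log(p) for _ in range(n))

# The partition entropy is exactly ln n, so the supremum over all
# partitions is +infinity.
assert abs(partition_entropy(1000) - math.log(1000)) < 1e-9
```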

Definition. The entropy of a state $p$ is defined by

$S[p]:=\int_\mathcal M-p\ln p\,\mathrm d\mu.$

Different from extensive quantities, the entropy is a functional of $p$. The entropy here is consistent with the entropy in thermodynamics or statistical mechanics.

This definition of entropy is called the Gibbs entropy formula. It agrees with the entropy defined in thermodynamics, but we are unable to show that at this stage because we have not defined temperature or heat yet.

Note that the base of the logarithm is not important, and it is just a matter of unit system. In SI units, the base would be $\exp k_\mathrm B^{-1}$, where $k_\mathrm B$ is the Boltzmann constant.
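In the simplest case where $\mu$ is a counting measure on finitely many microstates, the integral becomes a finite sum, and the uniform state recovers Boltzmann’s $S=\ln\Omega$. A minimal sketch (function name mine):

```python
import math

def gibbs_entropy(p):
    """S[p] = Σ -p ln p over microstates (counting-measure case);
    terms with p = 0 contribute 0 by convention."""
    return sum(-pi * math.log(pi) for pi in p if pi > 0)

Omega = 6
# The uniform state p = 1/Omega gives S = ln Omega.
assert abs(gibbs_entropy([1.0 / Omega] * Omega) - math.log(Omega)) < 1e-12

# A non-uniform state over the same microstates has strictly smaller entropy.
assert gibbs_entropy([0.5, 0.3, 0.1, 0.05, 0.03, 0.02]) < math.log(Omega)
```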

Physically, the extensive quantities may be measured macroscopically. The actual values that we get when we measure them are postulated to be the ensemble average. Therefore, for a given state $p$, we can define the measured values of extensive quantities by taking the $P$-expectation of the extensive quantities.

Definition. For a thermal system $(\mathcal E,\mathcal M)$ and a state $p$ of it, the measured value of extensive quantities of the state $p$ is the $P$-expectation of the $E$-valued random variable $(e,m)\mapsto e$. Explicitly, the definition is

$\varepsilon[p]:=\mathrm E_P\!\left[\left(e,m\right)\mapsto e\right],$

where the probability measure $P$ on $\mathcal M$ is defined in Equation \ref{eq: probability measure}.

The definition involves taking the $P$-expectation of a $W$-valued function, i.e., a Pettis integral, which I claim to exist. It exists because the map $(e,m)\mapsto e-e_0$ must be weakly $P$-measurable, and such a function must be Pettis-integrable on a reflexive space.

Note that $\varepsilon[p]\in W$, and it is not necessarily in $E$.
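In a discrete setting the $P$-expectation is just a $P$-weighted average, which makes it easy to see why $\varepsilon[p]$ can land outside $E$. A minimal sketch (names mine), with extensive quantities $(U,N)$:

```python
# measured_value sketches eps[p] = E_P[(e, m) -> e] in the discrete case:
# `states` is a list of (e, P) pairs, where P is the probability of the
# whole fiber {e} x M_e.
def measured_value(states):
    dim = len(states[0][0])
    return tuple(sum(P * e[k] for e, P in states) for k in range(dim))

# Two accessible values of (U, N) with equal probability: the measured
# U is 1.5, which lies in W but not in E if U is quantized to integers.
assert measured_value([((1, 3), 0.5), ((2, 3), 0.5)]) == (1.5, 3.0)
```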

The use of the measured value of extensive quantities is to get the fundamental equation of a thermal system, which describes the relationship between the extensive quantities and the entropy at any equilibrium state. Suppose that we postulate a family of states $p_t^\circ$ of the thermal system (or its slices, which will be introduced below), labeled by different $t$’s, and call them the possible equilibrium states. Then, we have the following two equations:

$\begin{equation} \label{eq: fundamental equation before} \begin{cases} S^\circ=S\!\left[p_t^\circ\right],\\ \varepsilon^\circ=\varepsilon\!\left[p_t^\circ\right]. \end{cases} \end{equation}$

By eliminating $t$ from the two equations (which may be impossible but is assumed to be possible), we can get the fundamental equation in this form:

$\begin{equation} \label{eq: fundamental equation} S^\circ=S^\circ\!\left(\varepsilon^\circ\right). \end{equation}$

Then we get a function $S^\circ:E^\circ\to\mathbb R$, where $E^\circ$ is the subset of $W$ consisting of all possible measured values of extensive quantities among equilibrium states. If we can define some differential structure on $E^\circ$ so that we can take the differential of $S^\circ$ and write something sensible like

$\mathrm dS^\circ=i^\circ\!\left(\varepsilon^\circ\right)(\mathrm d\varepsilon^\circ),$

where $i^\circ\!\left(\varepsilon^\circ\right)\in\vec W'$ is a continuous linear functional, then we can define $i^\circ\!\left(\varepsilon^\circ\right)$ to be the intensive quantities at $\varepsilon^\circ$. A proper comparison with differential geometry is that we may analogously call $i^\circ$ a covector field on $E^\circ$, defined as the differential of the scalar field $S^\circ$.

However, as I have said before, I did not postulate there to be any differential structure on $E^\circ$, so the intensive quantities should not be generally defined in this way.

## Slicing

A good notion about thermal systems is that we can get new thermal systems from existing ones (although they are physically essentially the same system, they have different mathematical structure and contain different amount of information about them). There are two ways of constructing new thermal systems from existing ones:

• By fixing some extensive quantities. I call this way slicing.
• By allowing some extensive quantities to change freely. I call this way contracting.

I chose the words “slicing” and “contracting”. They are not present in actual physics textbooks, but I found the notion of them necessary.

Slicing fixes extensive quantities. We do it by picking out a subset of $E$ and making it our new set of accessible values of extensive quantities. I find one special way of picking out such a subset especially useful: picking it from an affine subspace of $W$. In this way, we can use a smaller affine space as the underlying space of our new thermal system. Then we see why I chose the word “slicing”: we slice the original affine space into parallel pieces, pick one piece as our new affine space, and pick the corresponding accessible values of extensive quantities and possible microstates within that piece to form our new thermal system.

Definition. A slicing of a space of extensive quantities $\left(W,E,\lambda\right)$ is a pair $\left(W^\parallel,\lambda^\parallel\right)$, where

• $W^\parallel\subseteq W$ is an affine subspace of $W$;
• $E^\parallel:=E\cap W^\parallel$ is non-empty, and it is Polish as a topological subspace of $E$; and
• $\lambda^\parallel:\sigma\!\left(E^\parallel\right)\to\left[0,+\infty\right]$ is a non-trivial σ-finite Borel measure on $E^\parallel$, where $\sigma\!\left(E^\parallel\right)\supseteq\mathfrak B\!\left(E^\parallel\right)$ is a σ-algebra on $E^\parallel$ that contains the Borel σ-algebra on $E^\parallel$.

This constructs a new space of extensive quantities $\left(W^\parallel,E^\parallel,\lambda^\parallel\right)$, called a slice of the original space of extensive quantities $\left(W,E,\lambda\right)$.

Definition. A slice of a thermal system $\left(\mathcal E,\mathcal M\right)$ defined by the slicing $\left(W^\parallel,\lambda^\parallel\right)$ of $\mathcal E$ is a new thermal system $\left(\mathcal E^\parallel,\mathcal M^\parallel\right)$ constructed as such:

• $\mathcal E^\parallel:=\left(W^\parallel,E^\parallel,\lambda^\parallel\right)$ is the slice of $\mathcal E$ corresponding to the given slicing; and
• $\mathcal M^\parallel:=\bigsqcup_{e\in E^\parallel}M_e$.

The idea behind slicing is to make some extensive quantities extrinsic parameters rather than part of the system itself. Physically, it means fixing some extensive quantities. However, here is a problem: if we fix some extensive quantities, the dimension (“dimension” as in “dimensional analysis”) of the volume element in the space of extensive quantities is changed. In other words, the dimension of $\lambda$ does not agree with that of $\lambda^\parallel$. This seems physically undesirable because we want the number of microstates to be dimensionless so that its logarithm does not depend on the units we use. However, this is actually not a problem, for the following reason: in any physical construction of a thermal system, it is fine to have a non-dimensionless number of microstates, at the cost that the model is not valid at low temperature; in a mathematical construction, dimensions are never an issue, so we do not need to worry at all. At low temperature, we must use quantum statistical mechanics, where all quantities are quantized, so that the number of microstates is literally a count of microstates, which must be dimensionless. At high temperature, we do not need the third law of thermodynamics, which is the only law that restricts how we should choose the zero (ground level) of the entropy, and in this case we may freely change our units because doing so only shifts the entropy by an additive constant.

Example. In the example of a system of ideal gas, we may slice the space of extensive quantities to the slice $V=1$ to fix the volume.

## Isolations and the microcanonical ensemble

Here is a special type of slicing. Because a single point is a (zero-dimensional) affine subspace, it may define a slicing. Such a slicing fixes all of the extensive quantities. We may call it an isolating.

A thermal system with a zero-dimensional space of extensive quantities is called an isolated system. The physical meaning of such a system is that it is isolated from the outside so that it cannot exchange any extensive quantities with the outside. We may construct an isolated system out of an existing thermal system by the process of isolating.

Definition. An isolating (at $e^\circ$) of a space of extensive quantities $\left(W,E,\lambda\right)$ is a slicing $\left(W^\parallel,\lambda^\parallel\right)$ of it, constructed as

\begin{align*} W^\parallel&:=\left\{e^\circ\right\},\\ \lambda^\parallel(A)&:=\begin{cases}1,&A=\left\{e^\circ\right\},\\0,&A=\varnothing,\end{cases} \end{align*}

where $e^\circ\in E$.

Definition. An isolated system is a thermal system whose underlying affine space of its space of extensive quantities is a single-element set.

Definition. An isolation (at $e^\circ$) of a thermal system $\left(\mathcal E,\mathcal M\right)$ is the slice of it corresponding to the isolating at $e^\circ$ of $\mathcal E$.

Here is an obvious property of isolated systems: the measured value of extensive quantities of any state of an isolated system is $e^\circ$, the only possible value of the extensive quantities.

After introducing isolated systems, we can now introduce the equal a priori probability postulate. Although we may alternatively use other sets of axioms to develop the theory of statistical ensembles, using the equal a priori probability postulate is a simple and traditional way to do it. Most importantly, it does not require us to define concepts like temperature beforehand, which is good for a mathematical formulation because it requires fewer mathematical structures or objects that are hard to define well at this stage.

Axiom (the equal a priori probability postulate). The equilibrium state of an isolated system is the uniform distribution.

Actually, instead of calling this an axiom, we might say that formally it is a definition of equilibrium states. However, I still prefer to call it an axiom because it only defines the equilibrium state of isolated systems rather than that of arbitrary thermal systems.

The equilibrium state of an isolated system $\left(\mathcal E,\mathcal M\right)$ may be written mathematically as

$p^\circ\!\left(\cdot\right):=\frac1{\mu\!\left(\mathcal M\right)}.$

(The circle in the superscript denotes equilibrium state.) After writing this out, we have successfully derived the microcanonical ensemble. We can then calculate the entropy of the state, which is

$\begin{equation} \label{eq: microcanonical entropy} S^\circ:=S\!\left[p^\circ\right]=\ln\mu(\mathcal M). \end{equation}$

Speaking of the entropy, a notable feature of the equilibrium state of an isolated system is that it is the state of the system that has the maximum entropy; any state different from it has a strictly lower entropy.

Theorem. For an isolated system, for any state $p$ of it,

$S[p]\le S^\circ,$

where $S^\circ$ is the entropy of the equilibrium state of it. The equality holds iff $p$ is the same state as the equilibrium state.

Proof

Proof. Define a probability measure $P^\circ$ on $\mathcal M$ by

$P^\circ(A):=\frac{\mu(A)}{\mu(\mathcal M)},$

then $\left(\mathcal M,\sigma\!\left(\mathcal M\right),P^\circ\right)$ is a probability space. Any state $p$, as a function on $\mathcal M$, can be regarded as a random variable in the probability space $\left(\mathcal M,\sigma\!\left(\mathcal M\right),P^\circ\right)$.

Define the real function

$\varphi(x):=\begin{cases} x\ln x,&x\in\left(0,+\infty\right),\\ 0,&x=0. \end{cases}$

It is a convex function, so according to the probabilistic form of Jensen’s inequality,

$\varphi\!\left(\mathrm E_{P^\circ}\!\left[p\right]\right) \le\mathrm E_{P^\circ}\!\left[\varphi\circ p\right].$

In other words,

$\frac1{\mu(\mathcal M)}\ln\frac1{\mu(\mathcal M)} \le\int_{m\in\mathcal M}p\!\left(m\right)\ln p\!\left(m\right) \,\frac{\mathrm d\mu\!\left(m\right)}{\mu(\mathcal M)}.$

Then, it follows immediately that $S[p]\le S^\circ$. The equality holds iff $\varphi$ is linear on a convex set $A\subseteq\left[0,+\infty\right)$ such that the value of the random variable $p$ is $P^\circ$-almost surely in $A$. However, because $\varphi$ is strictly convex and thus non-linear on any convex set with more than one point, the only possibility is that the value of $p$ is $P^\circ$-almost surely a constant, which means that the probability distribution defined by the probability density function $p$ is equal to the uniform distribution $\mu$-almost everywhere. Therefore, the equality holds iff $p$ is the same state as the equilibrium state. $\square$

This theorem is the well-known relation between the entropy and the equilibrium state.
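The maximum-entropy property can be checked numerically in a toy discrete setting. Here is a minimal sketch (not part of the formalism above, just an illustration with a counting measure), comparing the entropy of the uniform state with that of an arbitrary state:

```python
import math
import random

# Toy isolated system: Omega microstates with counting measure.
# By the equal a priori probability postulate, the equilibrium state is
# uniform, with entropy S° = ln μ(M) = ln Omega; the theorem says any
# other state has strictly lower entropy.
Omega = 8
uniform = [1.0 / Omega] * Omega

def entropy(p):
    # Gibbs entropy S[p] = -∫ p ln p dμ; with counting measure the
    # integral is a sum over microstates.
    return -sum(x * math.log(x) for x in p if x > 0)

S_eq = entropy(uniform)
assert abs(S_eq - math.log(Omega)) < 1e-12  # S° = ln Omega

# Any non-uniform state has strictly lower entropy:
random.seed(0)
w = [random.random() for _ in range(Omega)]
p = [x / sum(w) for x in w]
assert entropy(p) < S_eq
```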

By Equation \ref{eq: microcanonical entropy}, we can now derive the relationship between the entropy and the extensive quantities at equilibrium states by the process of isolating. Define a family of states $\left\{p^\circ_e\right\}_{e\in E}$, where each state $p^\circ_e$ is the equilibrium state of the system isolated at $e$. Then, we have the fundamental equation

$\begin{equation} \label{eq: mce fundamental eq} S^\circ(e)=\ln\Omega(e), \end{equation}$

where $\Omega(e):=\mu_e\!\left(M_e\right)$ is called the counting function (I invented the phrase), which is the microscopic characteristic function of microcanonical ensembles. This defines a function $S^\circ:E\to\mathbb R$, which may be used to give a fundamental equation in the form of Equation \ref{eq: fundamental equation}, and it is the macroscopic characteristic function of microcanonical ensembles.

We will encounter microscopic or macroscopic characteristic functions for other ensembles later.

Example. In the example of a system of a tank of ideal atomic gas, we have the fundamental equation

$S^\circ=\ln\!\left(\frac1{h^{3N}N!}V^NS_{3N-1}\!\left(\sqrt{2mU}\right)\right),$

where $S_n(r)$ is the surface area of an $n$-sphere with radius $r$, which is proportional to $r^n$. Taking its derivative w.r.t. $U,V,N$ and taking the thermodynamic limit will recover familiar results.
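As a numerical sanity check of this fundamental equation, we can differentiate $S^\circ$ w.r.t. $U$ by finite differences. The sketch below assumes units with $h=m=1$ and the convention $S_n(r)=2\pi^{(n+1)/2}r^n/\Gamma\!\left(\frac{n+1}2\right)$; at finite $N$ (no thermodynamic limit taken), we expect $\partial S^\circ/\partial U=(3N-1)/(2U)$, which tends to the familiar $3N/(2U)=1/T$:

```python
import math

def log_sphere_area(n, r):
    # ln S_n(r), with S_n(r) = 2 pi^((n+1)/2) r^n / Gamma((n+1)/2)
    return (math.log(2) + (n + 1) / 2 * math.log(math.pi)
            + n * math.log(r) - math.lgamma((n + 1) / 2))

def S(U, V, N):
    # S° = ln( V^N S_{3N-1}(sqrt(2 m U)) / (h^(3N) N!) ), in units h = m = 1
    return (N * math.log(V)
            + log_sphere_area(3 * N - 1, math.sqrt(2 * U))
            - math.lgamma(N + 1))

U, V, N = 5.0, 2.0, 10
eps = 1e-6
dSdU = (S(U + eps, V, N) - S(U - eps, V, N)) / (2 * eps)  # central difference
assert abs(dSdU - (3 * N - 1) / (2 * U)) < 1e-5  # 1/T = (3N-1)/(2U) at finite N
```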

## Contracting

I have previously mentioned that the other way of deriving a new system from an existing one is called contracting. We should introduce this concept now because it is very useful when we need to define the contact between subsystems of a composite system (whose definition will be given later).

The idea behind contracting is also to reduce the dimension of the space of extensive quantities. However, rather than making some of the extensive quantities extrinsic parameters, it makes them “intrinsic” within the space of microstates. A vivid analogy is this: imagine a thermal system as many boxes of microstates, with each box labeled by specific values of extensive quantities; then we partition those boxes to classify them and put all the boxes in each part of the partition into one larger box. The new set of larger boxes is labeled by specific values of fewer extensive quantities, and it is the so-called contraction of the original set of boxes.

I call it contracting because it is like contracting the affine space of extensive quantities onto a flat sheet, one of its subspaces. The way we do this should be described by a projection. A projection in an affine space maps the whole space onto one of its affine subspaces, and the preimage of each point in the subspace is another affine subspace of the original space. The preimages form a family of parallel affine subspaces labeled by their image under the projection. This family of affine subspaces may be used to define a family of slices of the space of extensive quantities or of the thermal system, which are useful when defining the contraction of the space of extensive quantities or of the system.

Definition. A contracting of a space of extensive quantities $\left(W,E,\lambda\right)$ is given by a tuple $\left(\pi,\lambda^\perp\right)$, where

• $\pi:W\to W^\perp$ is a projection map from $W$ to an affine subspace $W^\perp$ of $W$;
• $E^\perp:=\pi(E)$, the image of $E$ under $\pi$, is equipped with the finest topology $\tau\!\left(E^\perp\right)$ that makes $\pi$ continuous (the quotient topology), and this topology makes $E^\perp$ Polish;
• $\lambda^\perp:\sigma\!\left(E^\perp\right)\to\left[0,+\infty\right]$ is a non-trivial σ-finite Borel measure on $E^\perp$, where $\sigma\!\left(E^\perp\right)\supseteq\mathfrak B\!\left(E^\perp\right)$ is a σ-algebra of $E^\perp$ that contains the Borel σ-algebra of $E^\perp$; and
• For any $A\in\sigma\!\left(E^\perp\right)$, $\lambda^{\perp}(A)=0$ iff $\lambda\!\left(\pi^{-1}(A)\right)=0$.

This contracting defines a new space of extensive quantities $\left(W^\perp,E^\perp,\lambda^\perp\right)$, called a contraction of the original space of extensive quantities $\left(W,E,\lambda\right)$.

Definition. The contractive slicings of a space of extensive quantities $\left(W,E,\lambda\right)$ defined by a contracting $\left(\pi,\lambda^\perp\right)$ of it are a family of slicings $\bigsqcup_{e\in E^\perp}\left(W^\parallel_e,\lambda^\parallel_e\right)$, where

• $W^\parallel_e:=\pi^{-1}(e)$ is the preimage of $\left\{e\right\}$ under $\pi$, an affine subspace of $W$; and
• $\lambda_e^\parallel:\sigma\!\left(E_e^\parallel\right)\to\left[0,+\infty\right]$ is a Borel measure; the family of measures is the disintegration of $\lambda$ w.r.t. $\pi$ and $\lambda^\perp$.

Definition. A contraction of a thermal system $\left(\mathcal E,\mathcal M\right)$ defined by the contracting $\left(\pi,\lambda^\perp\right)$ of $\mathcal E$ is a new thermal system $\left(\mathcal E^\perp,\mathcal M^\perp\right)$ constructed as such:

• $\mathcal E^\perp:=\left(W^\perp,E^\perp,\lambda^\perp\right)$ is the contraction of $\mathcal E$ corresponding to the given contracting;
• $\mathcal M^\perp:=\bigsqcup_{e\in E^\perp}M_e^\perp$, where for each $e\in E^\perp$, $M_e^\perp:=\mathcal M_e^\parallel$; the family of systems $\left(\mathcal E_e^\parallel,\mathcal M_e^\parallel\right)$ (labeled by $e\in E^\perp$) are slices of $\left(\mathcal E,\mathcal M\right)$ corresponding to the contractive slicings of $\mathcal E$ defined by the contracting $\left(\pi,\lambda^\perp\right)$; the measure equipped on $\mathcal M_e^\parallel$ is the measure of number of microstates of $\left(\mathcal E_e^\parallel,\mathcal M_e^\parallel\right)$.

If the total number of microstates in $\mathcal M^\parallel_e$ is infinite for some $e$, then the contraction is not defined.

Example. For the thermal system of a solid consisting of spin-$\frac12$ particles, define a contracting $\left(\pi,\lambda^\perp\right)$ by

\begin{align*} \pi\!\left(U,N\right)&:=N,\\ \lambda^\perp\!\left(A\right)&:=\operatorname{card}A. \end{align*}

Then the corresponding contraction of the thermal system may be written as a thermal system $\left(\left(W,E,\lambda\right),\bigsqcup_{e\in E}M_e\right)$, where

\begin{align*} W&:=\mathbb R,\\ E&:=\mathbb Z^+,\\ \lambda\!\left(A\right)&:=\operatorname{card}A,\\ M_N&:=\left\{0,1\right\}^N,\\ \mu_N\!\left(A\right)&:=\operatorname{card}A. \end{align*}
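We can make the box analogy concrete for this example with a small enumeration. The sketch below (assuming, purely for illustration, that a microstate with $k$ up-spins has $U=k$; this convention is not fixed by the definitions above) checks that the microstates of the contraction at fixed $N$ are exactly the union of the slices at fixed $\left(U,N\right)$, with counting measures adding up:

```python
from itertools import product
from math import comb

N = 6
M_N = list(product((0, 1), repeat=N))  # M_N = {0,1}^N with counting measure

# Slices of the original system: boxes labeled by (U, N), where U is the
# number of up-spins (hypothetical energy convention for illustration):
slices = {U: [m for m in M_N if sum(m) == U] for U in range(N + 1)}

assert len(M_N) == 2 ** N
assert all(len(slices[U]) == comb(N, U) for U in slices)
# Contracting pours all the small boxes into one big box per N:
assert sum(len(s) for s in slices.values()) == len(M_N)
```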

Different from a slice of a system, a contraction of a system does not have the problem with the dimension (“dimension” as in “dimensional analysis”) of the measure on the space of extensive quantities. Although the dimension of $\lambda^\perp$ is different from that of $\lambda$, the dimension of $\mu^\perp_e$ (the measure on $M^\perp_e$) is also different from that of $\mu$, and they change together in such a way that the resultant $\mu^\perp$ (the measure of number of microstates on $\mathcal M^\perp$) has the same dimension as $\mu$.

This fact actually suggests that a contraction of a thermal system is essentially the same as the original thermal system, in the sense that the microstates of the two systems are in a natural one-to-one correspondence. Indeed, the natural bijection from $\mathcal M$ to $\mathcal M^\perp$ is given by $\left(e,m\right)\mapsto\left(\pi(e),\left(e,m\right)\right)$. It is obvious that for any measurable function $f$ on $\mathcal M^\perp$ we have

$\int_{\left(e,m\right)\in\mathcal M}f\!\left(\pi(e),(e,m)\right)\mathrm d\mu(e,m) =\int_{\left(e,m\right)\in\mathcal M^\perp}f\!\left(e,m\right)\mathrm d\mu^\perp(e,m).$

Using this map, we can pull back any function $f^\perp$ on $\mathcal M^\perp$ to become a function on $\mathcal M$ by

$f\!\left(e,m\right):=f^\perp\!\left(\pi(e),\left(e,m\right)\right)$

and the other way around. I want to call $f$ the contractional pullback of $f^\perp$ under $\pi$ and call $f^\perp$ the contractional pushforward of $f$ under $\pi$. In particular, we may pull back any state $p^\perp$ of a contraction to become a state $p$ of the original thermal system. We will see that pullbacks of states are rather useful.

Obviously, the family of affine subspaces $\left\{W^\parallel_e\right\}_{e\in W^\perp}$ are parallel to each other. Therefore, their associated vector subspaces are the same vector subspace $\vec W^\parallel$ of $\vec W$, which is a complement of the vector subspace $\vec W^\perp$, the vector space that $W^\perp$ is associated with. We can write

$\vec W=\vec W^\perp+\vec W^\parallel,\quad W=W^\perp+\vec W^\parallel.$

Each point in $W$ can be written in the form $e+s$, where $e\in W^\perp$ and $s\in\vec W^\parallel$. Furthermore, for any $e\in W^\perp$, the map $s\mapsto e+s$ is a bijection from $\vec W^\parallel$ to $W^\parallel_e$. This bijection can then push forward linear operations from $\vec W^\parallel$ to $W^\parallel_e$. For example, we can define the action of some continuous linear functional $i\in\vec W^{\parallel\prime}$ on a point $e'\in W^\parallel_e$ as

$\begin{equation} \label{eq: linear op on affine} i\!\left(e'\right):=i\!\left(e'-\pi\!\left(e'\right)\right), \end{equation}$

where $\pi\!\left(e'\right)$ is just $e$.

However, we need to remember that there is generally no physically meaningful linear structure on $W^\parallel_e$. The linear structure that we have constructed is just for notational convenience.

An interesting fact about slicing, isolating, and contracting is that: an isolation of a contraction is a contraction of a slice.

Suppose we have a thermal system $\left(\mathcal E,\mathcal M\right)$, and by a contracting $\left(\pi,\lambda^\perp\right)$ we derive its contraction $\left(\mathcal E^\perp,\mathcal M^\perp\right)$.

Now, consider one of its contractive slices $\left(\mathcal E^\parallel_{e^\circ},\mathcal M^\parallel_{e^\circ}\right)$, where $e^\circ\in E^\perp$. Then, we contract this slice by the contracting $\left(\pi,\lambda^{\perp\prime}\right)$, where $\pi$ is the same $\pi$ as used above but whose domain is restricted to $W^\parallel_{e^\circ}$, and $\lambda^{\perp\prime}$ is the counting measure. Because the whole $W^\parallel_{e^\circ}$ is mapped to $e^\circ$ under $\pi$, the contraction becomes an isolated system whose only possible value of extensive quantities is $e^\circ$. Its spaces of microstates consist of only one measure space, which is $\mathcal M^\parallel_{e^\circ}$.

On the other hand, consider isolating $\left(\mathcal E^\perp,\mathcal M^\perp\right)$ at $e^\circ$. Its isolation at $e^\circ$ is an isolated system whose only possible value of extensive quantities is $e^\circ$. Its spaces of microstates consist of only one measure space, $M^\perp_{e^\circ}$, which is the same as $\mathcal M^\parallel_{e^\circ}$.

Therefore, an isolation of a contraction is a contraction of a slice.

This fact is useful because it enables us to find the equilibrium state of a slice. Using the microcanonical ensemble, we can already find the equilibrium state of any isolated system, so we can find the equilibrium state of an isolation of a contraction. That state is then the equilibrium state of a contraction of a slice; by the contractional pullback, it gives the equilibrium state of the slice.
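As a toy illustration of this chain of constructions (again with the spin-$\frac12$ system, counting measures, and the hypothetical convention that $U$ is the number of up-spins): slicing at fixed $N$ and contracting everything away yields an isolated system, whose equilibrium state is uniform over all microstates of the slice; pulling it back gives the equilibrium state of the slice.

```python
from itertools import product

N = 4
# Microstates of the slice at fixed N, labeled (e, m) with e = (U, N):
slice_microstates = [((sum(m), N), m) for m in product((0, 1), repeat=N)]

# Contracting the slice to a point gives an isolated system, whose
# equilibrium state is uniform (microcanonical ensemble):
Omega = len(slice_microstates)
p_eq = {em: 1.0 / Omega for em in slice_microstates}

assert Omega == 2 ** N
assert abs(sum(p_eq.values()) - 1.0) < 1e-12  # a valid state
```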

## Thermal contact

Composite systems are systems composed of other systems. This is a useful concept because it allows us to treat multiple systems as a whole. The motivation for developing this concept is that we will use it to derive the canonical ensemble and the grand canonical ensemble. In those ensembles, the system is not isolated but in contact with a bath. To consider them as a whole system, we need to define composite systems.

The simplest case of a composite system is where the subsystems are independent of each other. Physically, this means that the subsystems do not have any thermodynamic contact with each other. I would like to call this simplest case a product thermal system, just as mathematicians name product spaces constructed out of existing spaces.

Definition. The product space of extensive quantities of two spaces of extensive quantities $\left(W^{(1)},E^{(1)},\lambda^{(1)}\right)$ and $\left(W^{(2)},E^{(2)},\lambda^{(2)}\right)$ is a space of extensive quantities $\left(W,E,\lambda\right)$ constructed as such:

• $W:=W^{(1)}\times W^{(2)}$ is the product affine space of $W^{(1)}$ and $W^{(2)}$;
• $E:=E^{(1)}\times E^{(2)}$ is the product topological space as well as the product measure space of $E^{(1)}$ and $E^{(2)}$; and
• $\lambda$ is the product measure of $\lambda^{(1)}$ and $\lambda^{(2)}$, whose uniqueness is guaranteed by the σ-finiteness of $\lambda^{(1)}$ and $\lambda^{(2)}$.

Definition. The product thermal system of two thermal systems $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ and $\left(\mathcal E^{(2)},\mathcal M^{(2)}\right)$ is a thermal system $\left(\mathcal E,\mathcal M\right)$ constructed as such:

• $\mathcal E:=\left(W,E,\lambda\right)$ is the product space of extensive quantities of $\mathcal E^{(1)}$ and $\mathcal E^{(2)}$; and
• $\mathcal M:=\bigsqcup_{(e_1,e_2)\in E}M_{e_1,e_2}$, where $M_{e_1,e_2}:=M^{(1)}_{e_1}\times M^{(2)}_{e_2}$ is the product measure space of $M^{(1)}_{e_1}$ and $M^{(2)}_{e_2}$, equipped with measure $\mu_{e_1,e_2}$, the product measure of $\mu^{(1)}_{e_1}$ and $\mu^{(2)}_{e_2}$.

By this definition, $\mathcal M$ is naturally identified with $\mathcal M^{(1)}\times\mathcal M^{(2)}$, and the measure of number of microstates $\mu$ on $\mathcal M$ is in this sense the same as the product measure of $\mu^{(1)}$ and $\mu^{(2)}$ (the measures of number of microstates on $\mathcal M^{(1)}$ and $\mathcal M^{(2)}$). We can project elements in $\mathcal M$ back into $\mathcal M^{(1)}$ and $\mathcal M^{(2)}$ by the map $(e_1,e_2,m_1,m_2)\mapsto(e_1,m_1)$ and the map $(e_1,e_2,m_1,m_2)\mapsto(e_2,m_2)$.

This suggests that a probability distribution on $\mathcal M$ (which may be given by a state $p$ of $(\mathcal E,\mathcal M)$) can be viewed as the joint probability distribution of two random variables on $\mathcal M$: $(e_1,e_2,m_1,m_2)\mapsto(e_1,m_1)$ and $(e_1,e_2,m_1,m_2)\mapsto(e_2,m_2)$. As we all know, a joint distribution encodes conditional distributions and marginal distributions. Therefore, given any state of a product thermal system, we can define the conditional states and marginal states of its subsystems. Conditional states are not very useful because they are not physically observed states of subsystems. The physically observed states of subsystems are the marginal states, so marginal states are of special interest.

Definition. Given a state $p$ of the product thermal system $(\mathcal E,\mathcal M)$ of $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ and $\left(\mathcal E^{(2)},\mathcal M^{(2)}\right)$, its marginal state of the subsystem $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ is a state $p^{(1)}$ of the system $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ defined by

$p^{(1)}\!\left(e_1,m_1\right):=\int_{\left(e_2,m_2\right)\in\mathcal M^{(2)}} p\!\left(e_1,e_2,m_1,m_2\right)\mathrm d\mu^{(2)}\!\left(e_2,m_2\right).$
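In a discrete toy setting (counting measures, finitely many microstates; the labels below are hypothetical), the integral defining the marginal state reduces to a sum over the microstates of the other subsystem:

```python
# Microstates of the two subsystems (toy labels):
M1 = ["a", "b"]
M2 = ["x", "y", "z"]

# A joint state p on M1 x M2 (nonnegative, total probability 1):
p = {("a", "x"): 0.1, ("a", "y"): 0.2, ("a", "z"): 0.1,
     ("b", "x"): 0.3, ("b", "y"): 0.2, ("b", "z"): 0.1}
assert abs(sum(p.values()) - 1.0) < 1e-12

# Marginal state of subsystem 1: integrate p over M2 (here a plain sum):
p1 = {m1: sum(p[(m1, m2)] for m2 in M2) for m1 in M1}
assert abs(sum(p1.values()) - 1.0) < 1e-12  # p1 is again a state
```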

Physically, if a product thermal system is in equilibrium, then each of its subsystems is in equilibrium as well. Therefore, if $p^\circ$ is an equilibrium state of the product thermal system, then the marginal states of $p^\circ$ are equilibrium states of the subsystems.

Now, we need to consider how to describe the thermodynamic contact between subsystems. In the simplest case, where there is no thermodynamic contact between subsystems, the composite system is just the product thermal system of the subsystems, and the dimension of its space of extensive quantities is the sum of those of the subsystems’. If there is some thermal contact between the subsystems, then the dimension of the space of extensive quantities of the composite system will be less than that of the product thermal system. For example, if the subsystems are allowed to exchange energy, then two original extensive quantities (the energy of the first subsystem and that of the second subsystem) are replaced by a single extensive quantity (the total energy of the composite system). Such a reduction in the dimension of the space of extensive quantities is exactly the contracting that we defined above. Therefore, we can define a thermally composite system as a contraction of the product thermal system. Denote the projection map of the contracting as $\pi:W\to W^\perp:(e_1,e_2)\mapsto e$. (From now on in this section, composite systems refer to thermally composite systems. I will introduce non-thermally composite systems later (in part 2), which describe non-thermal contacts between subsystems and are more complicated.)

Besides being the contraction of the product thermal system, there is an additional requirement. Given the extensive quantities of the composite system and those of one of the subsystems, we should be able to deduce those of the other subsystem. For example, if the subsystems are allowed to exchange energy, then the total energy of the composite system minus the energy of one of the subsystems should be the energy of the other subsystem, which is uniquely determined (if it is an allowed energy). Mathematically, this means that for any $e_1\in W^{(1)}$ and $e_2\in W^{(2)}$, the two maps $\pi\!\left(e_1,\cdot\right)$ and $\pi\!\left(\cdot,e_2\right)$ are both injections.

Definition. A (thermally) composite thermal system of two thermal systems is the contraction of their product thermal system corresponding to a contracting $(\pi,\lambda^\perp)$, where $\pi:W\to W^\perp:(e_1,e_2)\mapsto e$ satisfies that for any $e_1\in W^{(1)}$ and $e_2\in W^{(2)}$, the two maps $\pi\!\left(e_1,\cdot\right)$ and $\pi\!\left(\cdot,e_2\right)$ are both injections.

We may define projection maps to get the extensive quantities of the subsystems from those of the composite system:

$c^{(1)}:W\to W^{(1)}:(e_1,e_2)\mapsto e_1,\quad c^{(2)}:W\to W^{(2)}:(e_1,e_2)\mapsto e_2.$

Then, for each $e\in W^\perp$, the two spaces

$W^{\parallel(1)}_e:=c^{(1)}\!\left(W_e^\parallel\right),\quad W^{\parallel(2)}_e:=c^{(2)}\!\left(W_e^\parallel\right)$

are respectively affine subspaces of $W^{(1)}$ and $W^{(2)}$, where $W_e^\parallel:=\pi^{-1}\!\left(e\right)$. The two affine subspaces are actually isomorphic to each other because of our additional requirement on the projection map $\pi$. Because $\pi\!\left(e_1,\cdot\right)$ is an injection, for any $e_1\in W^{\parallel(1)}_e$ there is a unique $e_2\in W^{\parallel(2)}_e$ such that $\pi\!\left(e_1,e_2\right)=e$, and vice versa. This gives a correspondence between the two affine subspaces. In other words, for each $e\in W^\perp$, there is a unique bijection $\rho_e:W^{\parallel(1)}_e\to W^{\parallel(2)}_e$ such that

$\begin{equation} \label{eq: pi and rho_e} \forall e_1\in W^{\parallel(1)}_e: \pi\!\left(e_1,e_2\right)=e\Leftrightarrow e_2=\rho_e\!\left(e_1\right). \end{equation}$

The bijection $\rho_e$ is an affine isomorphism from $W^{\parallel(1)}_e$ to $W^{\parallel(2)}_e$.

What is more, $c^{(1)}$ is an affine isomorphism from $W^{\parallel}_e$ to $W^{\parallel(1)}_e$, and $c^{(2)}$ is an affine isomorphism from $W^{\parallel}_e$ to $W^{\parallel(2)}_e$. The three affine spaces $W^{\parallel}_e,W^{\parallel(1)}_e,W^{\parallel(2)}_e$ are then mutually isomorphic.

Example. Suppose we have two thermal systems, each of which has two extensive quantities: the energy and the number of particles. We write them as $\left(U_1,N_1\right)$ and $\left(U_2,N_2\right)$. The systems are in thermal contact so that they can exchange energy but not particles. Then, the extensive quantities of the composite system may be written as $\left(U/2,U/2,N_1,N_2\right)$, with $\pi:\left(U_1,U_2\right)\mapsto\left(U/2,U/2\right)$ defined as

$\pi\!\left(U_1,U_2\right):=\left(\frac{U_1+U_2}2,\frac{U_1+U_2}2\right).$

The isomorphism $\rho_{U/2,U/2,N_1,N_2}$ is then

$\rho_{U/2,U/2,N_1,N_2}\!\left(U_1,N_1\right)=\left(U-U_1,N_2\right).$

The contracting is not unique. For example, $\left(U_1,U_2\right)\mapsto\left(3U/4,U/4\right)$ is another valid projection for constructing the composite thermal system, and it has exactly the same physical meaning as the one I constructed above.
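A quick sketch of this example’s maps, checking the defining relation of $\rho_e$ (that $\pi\!\left(e_1,e_2\right)=e$ iff $e_2=\rho_e\!\left(e_1\right)$); the function names are of course mine, not standard:

```python
def pi(U1, U2):
    # The contracting projection of the example: only the total energy survives.
    return ((U1 + U2) / 2, (U1 + U2) / 2)

def rho(U, U1):
    # rho_e at the slice with total energy U: the other subsystem gets the rest.
    return U - U1

U1, U2 = 3.0, 7.0
e = pi(U1, U2)
U = e[0] + e[1]                 # total energy recovered from e
assert rho(U, U1) == U2         # e2 = rho_e(e1)
assert pi(U1, rho(U, U1)) == e  # pi(e1, rho_e(e1)) = e
# Note that rho(U, U1) = U - U1 is affine in U1 with linear part s -> -s,
# independent of the slice e, as the theorem below asserts.
```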

The isomorphism $c^{(1)}$ from $W^{\parallel}_e$ to $W^{\parallel(1)}_e$ can push forward the measure $\lambda^\parallel_e$ on $E^\parallel_e$ to a new measure $\lambda^{\parallel(1)}_e$ on $E^{\parallel(1)}_e$. Then, $\left(W^{\parallel(1)}_e,\lambda^{\parallel(1)}_e\right)$ is a slicing of $\left(W^{(1)},E^{(1)},\lambda^{(1)}\right)$, and we can get a slice $\left(\mathcal E^{\parallel(1)}_e,\mathcal M^{\parallel(1)}_e\right)$ of $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ out of this slicing. I would like to call this slice the compositing slice of $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ at $e$. Similarly, we define compositing slices of $\left(\mathcal E^{(2)},\mathcal M^{(2)}\right)$, denoted $\left(\mathcal E^{\parallel(2)}_e,\mathcal M^{\parallel(2)}_e\right)$.

Similarly to how we defined marginal states of subsystems of a product thermal system, we can define marginal states of the compositing slices given a state of a contractive slice of the composite system. However, this time there is a key difference: the subsystems (compositing slices) have isomorphic and completely dependent (deterministic) extensive quantities instead of completely independent ones. Taking this into account, we can define marginal states of compositing slices as follows:

$\begin{equation} \label{eq: slice marginal state} p^{\parallel(1)}\!\left(e_1,m_1\right) :=\int_{m_2\in M^{(2)}_{\rho_e(e_1)}}p^\parallel\!\left(e_1,\rho_e(e_1),m_1,m_2\right) \mathrm d\mu^{(2)}_{\rho_e(e_1)}\!\left(m_2\right), \end{equation}$

where $p^{\parallel(1)}$ is a state of $\left(\mathcal E^{\parallel(1)}_e,\mathcal M^{\parallel(1)}_e\right)$, and $p^\parallel$ is a state of $\left(\mathcal E^{\parallel}_e,\mathcal M^{\parallel}_e\right)$ (a contractive slice of the composite system).

There is an additional property that $\rho_e$ has.

As we all know, an affine map is a linear map combined with a translation:

$\begin{equation} \label{eq: rho_e and vec rho} \rho_e\!\left(e_1\right)=\vec\rho\!\left(e_1-e_0\right)+\rho_e\!\left(e_0\right), \end{equation}$

where $e_0$ is a fixed point in $W^{\parallel(1)}_e$, and $\vec\rho:\vec W^{\parallel(1)}_e\to \vec W^{\parallel(2)}_e$ is a linear map that is independent of the choice of $e_0$. Because $\rho_e$ is a bijection, $\vec\rho$ is also a bijection, and is thus a linear isomorphism from $\vec W^{\parallel(1)}_e$ to $\vec W^{\parallel(2)}_e$.

Because different slices $W^{\parallel(1)}_e$ with different $e$ are parallel to each other, actually $\vec W^{\parallel(1)}_e$ is the same vector subspace of $\vec W^{(1)}$ for any $e\in W^\perp$. We can write it as $\vec W^{\parallel(1)}$. Similarly, $\vec W^{\parallel(2)}_e$ is the same vector subspace $\vec W^{\parallel(2)}$ of $\vec W^{(2)}$ for any $e\in W^\perp$. Therefore, we can say $\vec\rho$ is a linear isomorphism from $\vec W^{\parallel(1)}$ to $\vec W^{\parallel(2)}$.

Then, here is the interesting claim:

Theorem. The linear map $\vec\rho$ defined above is independent of the choice of $e$.

Proof

Proof. Because $\pi$ is an affine map, we have

$\pi\!\left(e_1,e_2\right) =\vec\pi\!\left(e_1-e_0,e_2-\rho_e\!\left(e_0\right)\right)+\pi\!\left(e_0,\rho_e\!\left(e_0\right)\right),$

where $e\in W^\perp$ is fixed, $e_0\in W^{\parallel(1)}_e$ is also fixed, and $\vec\pi:\vec W\to\vec W^\perp$ is a linear map that is independent of the choice of $e$ and $e_0$.

Let $e_2:=\rho_e\!\left(e_1\right)$ in the equation above, and we have

$\pi\!\left(e_1,\rho_e\!\left(e_1\right)\right) =\vec\pi\!\left(e_1-e_0,\rho_e\!\left(e_1\right)-\rho_e\!\left(e_0\right)\right) +\pi\!\left(e_0,\rho_e\!\left(e_0\right)\right).$

According to Equation \ref{eq: pi and rho_e} and \ref{eq: rho_e and vec rho}, we have

$e=\vec\pi\!\left(e_1-e_0,\vec\rho\!\left(e_1-e_0\right)\right)+e.$

In other words,

$\begin{equation} \label{eq: pi(s1, rho(s1))=0} \vec\pi\!\left(s_1,\vec\rho\!\left(s_1\right)\right)=0, \end{equation}$

where $s_1\in\vec W^{\parallel(1)}$ is an arbitrary vector.

Prove by contradiction. Assume that $\vec\rho$ depends on the choice of $e$; then there exist two choices of $e$ that give two different $\vec\rho$’s, denoted as $\vec\rho$ and $\vec\rho'$. Because they are different maps, there exists an $s_1\in\vec W^{\parallel(1)}$ such that $\vec\rho(s_1)\ne\vec\rho'(s_1)$.

On the other hand, we have

$\vec\pi\!\left(s_1,\vec\rho\!\left(s_1\right)\right)=0,\quad \vec\pi\!\left(s_1,\vec\rho'\!\left(s_1\right)\right)=0.$

Subtract the two equations, and because of the linearity of $\vec\pi$, we have

$\vec\pi\!\left(0,\delta\right)=0,$

where $\delta:=\vec\rho(s_1)-\vec\rho'(s_1)$ is a nonzero vector. Then, we have

$\pi\!\left(e_1,e_2+\delta\right)-\pi\!\left(e_1,e_2\right)=\vec\pi(0,\delta)=0,$

which contradicts the requirement that $\pi\!\left(e_1,\cdot\right)$ is injective. $\square$

Besides, because $\vec\rho$ is a linear isomorphism from $\vec W^{\parallel(1)}$ to $\vec W^{\parallel(2)}$, the map $i_1\mapsto i_1\circ\vec\rho^{-1}$ is a linear isomorphism from $\vec W^{\parallel(1)\prime}$ to $\vec W^{\parallel(2)\prime}$. The inverse of this isomorphism is $i_2\mapsto i_2\circ\vec\rho$.

As we know, $i_1$ and $i_2$ are actually intensive quantities. The physical meaning of them being each other’s image/preimage under this isomorphism is that, if the two subsystems in thermal contact have intensive quantities $-i_1$ and $i_2$ respectively, then they are in equilibrium with each other. Therefore, I would like to call such a pair of intensive quantities anticonsistent.

Since we have a family of slices called the compositing slices of a subsystem, can we make them the contractive slices of some contracting of the subsystem? Well, it depends. The first difficulty is that $W^{\parallel(1)}_e$ may be the same subspace of $W^{(1)}$ for different $e\in W^\perp$, and yet the corresponding $E^{\parallel(1)}_e$ may be equipped with different measures for those different $e$.

Anyway, ignore this at this stage. Let me first construct a subspace $W^{\perp(1)}$ and a projection $\pi^{(1)}:W^{(1)}\to W^{\perp(1)}$ so that $W^{\parallel(1)}_e$ are preimages of points in $W^{\perp(1)}$, and then see what will happen.

Since any vector subspace has a complement, we can pick a subspace of $\vec W^{(1)}$ that is a complement of $\vec W^{\parallel(1)}$ and call it $\vec W^{\perp(1)}$. Any vector in $\vec W^{(1)}$ can be uniquely decomposed into the sum of a vector in $\vec W^{\perp(1)}$ and a vector in $\vec W^{\parallel(1)}$.

Then, we pick some fixed $e_1\in W^{(1)}$, and it can be used to generate an affine subspace $W^{\perp(1)}:=e_1+\vec W^{\perp(1)}$ of $W^{(1)}$. Then, each point in $W^{(1)}$ can be uniquely decomposed into the sum of a point in $W^{\perp(1)}$ and a vector in $\vec W^{\parallel(1)}$. Such unique decompositions can be encoded into a projection map $\pi^{(1)}:W^{(1)}\to W^{\perp(1)}$.
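To see how such a decomposition can be computed in coordinates, here is a numeric sketch with assumed subspaces of $\mathbb R^3$ (the columns of `P` span $\vec W^{\parallel(1)}$ and the columns of `C` span the chosen complement):

```python
import numpy as np

# Assumed data: W^(1) = R^3, vec W^parallel(1) = x-axis, complement = y-z
# plane, and W^perp(1) = e1_base + complement.
P = np.array([[1.0], [0.0], [0.0]])                  # spans vec W^parallel(1)
C = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])   # spans the complement
e1_base = np.array([2.0, 0.0, 0.0])                  # base point of W^perp(1)

x = np.array([5.0, -1.0, 3.0])
# Solve x - e1_base = C @ a + P @ b for the unique coefficients (a, b):
coeffs = np.linalg.solve(np.hstack([C, P]), x - e1_base)
perp_part = e1_base + C @ coeffs[:2]   # the projection pi^(1)(x)
para_part = P @ coeffs[2:]             # the component in vec W^parallel(1)

assert np.allclose(perp_part + para_part, x)
assert np.allclose(perp_part, [2.0, -1.0, 3.0])
```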

It seems that we are now halfway to the construction of our contracting. However, before we proceed, I would like to prove a property of $W^{\perp(1)}$ we construct:

Theorem. The map $\pi$ is an affine isomorphism from the product affine space $W^{\perp(1)}\times W^{(2)}$ to $W^\perp$.

Proof

Proof. The map $\pi$ is itself affine, so we just need to prove that it is injective and surjective.

To prove it is injective, suppose that we have two points $(e_1,e_2)$ and $(e_1',e_2')$ in $W^{\perp(1)}\times W^{(2)}$, such that

$\pi\!\left(e_1,e_2\right)=\pi\!\left(e_1',e_2'\right)=:e.$

Then, we have

$\left(e_1,e_2\right),\left(e_1',e_2'\right)\in W^\parallel_e.$

Therefore, $e_1,e_1'\in W^{\parallel(1)}_e$, so

$e_1-e_1'\in\vec W^{\parallel(1)}.$

On the other hand, because $e_1,e_1'\in W^{\perp(1)}$, we have

$e_1-e_1'\in\vec W^{\perp(1)}.$

Because $\vec W^{\perp(1)}$ is a complement of $\vec W^{\parallel(1)}$, the only possible case is that $e_1=e_1'$. Then, due to $\pi\!\left(e_1,\cdot\right)$ being injective, $e_2=e_2'$. Therefore, $\left(e_1,e_2\right)=\left(e_1',e_2'\right)$. Therefore, $\pi$ is injective if its domain is restricted to $W^{\perp(1)}\times W^{(2)}$.

To prove it is surjective, suppose $e\in W^\perp$. Because $\pi$ is surjective from $W$ to $W^\perp$, there exists some $\left(e_1',e_2'\right)\in W$ such that

$\pi\!\left(e_1',e_2'\right)=e.$

According to Equation \ref{eq: pi and rho_e}, this is equivalently

$e_2'=\rho_e\!\left(e_1'\right).$

We can uniquely decompose $e_1'\in W^{(1)}$ into the sum of a point $e_1\in W^{\perp(1)}$ and a vector $\delta\in\vec W^{\parallel(1)}$. Then, according to Equation \ref{eq: rho_e and vec rho}, we have

$e_2'=\rho_e\!\left(e_1+\delta\right)=\rho_e\!\left(e_1\right)+\vec\rho\!\left(\delta\right).$

Thus $e_2:=e_2'-\vec\rho\!\left(\delta\right)=\rho_e\!\left(e_1\right)$. According to Equation \ref{eq: pi and rho_e}, this is equivalently

$\pi\!\left(e_1,e_2\right)=e.$

Therefore, $\left(e_1,e_2\right)\in W^{\perp(1)}\times W^{(2)}$ is the desired point in $W^{\perp(1)}\times W^{(2)}$ that is mapped to $e$ under $\pi$. Therefore, $\pi$ is surjective if its domain is restricted to $W^{\perp(1)}\times W^{(2)}$. $\square$

Then, it seems that if we want a measure on $E^{\perp(1)}$ that is consistent with our theory, the product measure of it and the measure on $E^{(2)}$ should be equal to the measure on $E^\perp$. However, it is not always possible to find such a measure. This is our second difficulty.

Therefore, in order to construct a contracting, we need the following assumptions:

• For different $e\in E^\perp$, $\lambda^{\parallel(1)}_e$ is the same measure whenever $W^{\parallel(1)}_e$ is the same subspace.
• There exists a measure $\lambda^{\perp(1)}$ on $E^{\perp(1)}$ so that $\lambda^\perp$ is the pushforward of the product measure of $\lambda^{\perp(1)}$ and $\lambda^{(2)}$ under $\pi$.

Given those assumptions, if we define $\lambda^{\parallel(1)\prime}_{e_1}$ to be the measures from the disintegration of $\lambda^{(1)}$ w.r.t. $\pi^{(1)}$ and $\lambda^{\perp(1)}$ (just the way we constructed the measures in constructive slicings), then we can verify that they are actually the same as $\lambda^{\parallel(1)}_e$ defined before, for any $e$ in the image of $\pi\!\left(e_1,\cdot\right)$. You can verify this easily by the following check (not a rigorous proof), where $\otimes$ denotes product measures or integration:

$\lambda=\lambda^{\perp}\otimes\left\{\lambda^\parallel_e\right\} =\lambda^{\perp(1)}\otimes\lambda^{(2)}\otimes\left\{\lambda^\parallel_e\right\}.$

On the other hand,

$\lambda=\lambda^{(1)}\otimes\lambda^{(2)} =\lambda^{\perp(1)}\otimes\left\{\lambda^{\parallel(1)\prime}_{e_1}\right\}\otimes\lambda^{(2)}.$

Comparing them, we have

$\left\{\lambda^{\parallel(1)\prime}_{e_1}\right\}=\left\{\lambda^\parallel_e\right\} =\left\{\lambda^{\parallel(1)}_e\right\}.$

An explicit verification is more tedious and is omitted here.

Those assumptions are very strong, so we do not want to assume them. Without those assumptions, we still have a well-constructed $W^{\perp(1)}$ and $\pi^{(1)}$ so that $W^{\parallel(1)}_e$ are preimages of points in $W^{\perp(1)}$ under $\pi$. Then, we can use similar tricks as Equation \ref{eq: linear op on affine} to define the action of any continuous linear functional $i_1\in\vec W^{\parallel(1)\prime}$ on a point $e_1\in W^{(1)}$ as

$i_1\!\left(e_1\right):=i_1\!\left(e_1-\pi^{(1)}\!\left(e_1\right)\right).$

We can also do the same thing on $W^{(2)}$. Then, an interesting thing to notice is that if we have $e_1\in W^{(1)}$ and $e_2\in W^{(2)}$ such that

$e:=\pi\!\left(e_1,e_2\right) =\pi\!\left(\pi^{(1)}\!\left(e_1\right),\pi^{(2)}\!\left(e_2\right)\right),$

then we have

$i_1\!\left(e_1\right)=i_2\!\left(e_2\right),$

where $i_1\in\vec W^{\parallel(1)\prime}$ and $i_2\in\vec W^{\parallel(2)\prime}$ are anticonsistent to each other.

Example. In the example of two thermal systems that can exchange energy but not number of particles, we may choose

$\pi^{(1)}\!\left(U_1,N_1\right):=\left(0,N_1\right),\quad \pi^{(2)}\!\left(U_2,N_2\right):=\left(0,N_2\right).$

Such projections are not unique, but this is the simplest one and also the most natural one considering their physical meanings.
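Continuing this example, here is a numeric sketch of the relation $i_1\!\left(e_1\right)=i_2\!\left(e_2\right)$ for anticonsistent $i_1,i_2$. All concrete numbers are assumed; $\vec\rho$ sends $(s,0)\mapsto(-s,0)$ because energy given up by subsystem 1 is gained by subsystem 2, and the condition $\pi\!\left(e_1,e_2\right)=\pi\!\left(\pi^{(1)}\!\left(e_1\right),\pi^{(2)}\!\left(e_2\right)\right)$ is taken to mean $U_1+U_2=0$ (assuming $\pi$ collects the total energy):

```python
import numpy as np

beta = 1.7  # an assumed inverse temperature

# i1 acts on (dU1, dN1) in vec W^parallel(1); vec rho maps (s, 0) to (-s, 0).
i1 = lambda s: beta * s[0]
vec_rho_inv = lambda s: np.array([-s[0], 0.0])
i2 = lambda s: i1(vec_rho_inv(s))        # the anticonsistent partner of i1

# pi^(1)(U1, N1) = (0, N1) as above; pi^(2) has the same form here.
pi1 = lambda e: np.array([0.0, e[1]])
i1_point = lambda e1: i1(e1 - pi1(e1))   # i1 acting on a point of W^(1)
i2_point = lambda e2: i2(e2 - pi1(e2))

# Assumed points with U1 + U2 = 0:
e1 = np.array([0.9, 4.0])
e2 = np.array([-0.9, 7.0])
assert np.isclose(i1_point(e1), i2_point(e2))
```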

We have newly defined some vector spaces. There are interesting relations between them:

Theorem.

$\vec W^{\perp\parallel}:=\vec\pi\!\left(\vec W^{\parallel(1)}\times\vec W^{\parallel(2)}\right) =\vec\pi\!\left(\vec W^{\parallel(1)}\right)=\vec\pi\!\left(\vec W^{\parallel(2)}\right).$
Proof

Proof. Obviously $\vec\pi\!\left(\vec W^{\parallel(2)}\right)\subseteq \vec\pi\!\left(\vec W^{\parallel(1)}\times\vec W^{\parallel(2)}\right)$, so we just need to prove that $\vec\pi\!\left(\vec W^{\parallel(1)}\times\vec W^{\parallel(2)}\right) \subseteq\vec\pi\!\left(\vec W^{\parallel(2)}\right)$. To prove this, we just need to prove that for any

$s:=\vec\pi\!\left(s_1,s_2\right)\in\vec\pi\!\left(\vec W^{\parallel(1)}\times\vec W^{\parallel(2)}\right),$

where $s_1\in\vec W^{\parallel(1)}$ and $s_2\in\vec W^{\parallel(2)}$, we have $s\in\vec\pi\!\left(\vec W^{\parallel(2)}\right)$. To prove this, subtract Equation \ref{eq: pi(s1, rho(s1))=0} from the definition of $s$, and we have

$s=\vec\pi\!\left(0,s_2-\vec\rho\!\left(s_1\right)\right)\in\vec\pi\!\left(\vec W^{\parallel(2)}\right).$

Therefore, $\vec\pi\!\left(\vec W^{\parallel(1)}\times\vec W^{\parallel(2)}\right) \subseteq\vec\pi\!\left(\vec W^{\parallel(2)}\right)$. Similarly, $\vec\pi\!\left(\vec W^{\parallel(1)}\times\vec W^{\parallel(2)}\right) \subseteq\vec\pi\!\left(\vec W^{\parallel(1)}\right)$. Therefore, we proved the theorem. $\square$

Here we defined a new vector space $\vec W^{\perp\parallel}$. Obviously it is a subspace of $\vec W^\perp$. Because $\vec\pi(s_1,\cdot)$ and $\vec\pi(\cdot,s_2)$ are injective, $\vec\pi$ is a linear isomorphism from $\vec W^{\parallel(1)}$ to $\vec W^{\perp\parallel}$ and a linear isomorphism from $\vec W^{\parallel(2)}$ to $\vec W^{\perp\parallel}$.

Theorem. Suppose $e,e'\in W^\perp$. Then $e'-e\in\vec W^{\perp\parallel}$ if and only if $W^{\parallel(1)}_e=W^{\parallel(1)}_{e'}$ and $W^{\parallel(2)}_e=W^{\parallel(2)}_{e'}$.

Proof

Proof. First, prove the “if” direction.

Because $W^{\parallel(1)}_e=W^{\parallel(1)}_{e'}$, we have $c^{(1)}\!\left(\pi^{-1}\!\left(e\right)\right)=c^{(1)}\!\left(\pi^{-1}\!\left(e'\right)\right)$. In other words,

$\forall x\in\pi^{-1}(e):\exists s_2\in\vec W^{(2)}:x+\left(0,s_2\right)\in\pi^{-1}(e').$

Equivalently, this means

$\pi(x)=e\Rightarrow\exists s_2\in\vec W^{(2)}:\pi\!\left(x+\left(0,s_2\right)\right)=e'.$

Note that $\pi\!\left(x+\left(0,s_2\right)\right)=\pi(x)+\vec\pi\!\left(0,s_2\right)$, which is just $e+\vec\pi\!\left(0,s_2\right)$, and we have

$\exists s_2\in\vec W^{(2)}:e'-e=\vec\pi\!\left(0,s_2\right).$

Similarly,

$\exists s_1\in\vec W^{(1)}:e'-e=\vec\pi\!\left(s_1,0\right).$

Subtract the two equations, and we have

$0=\vec\pi\!\left(s_1,-s_2\right),$

which means

$\left(s_1,-s_2\right)\in\vec\pi^{-1}(0)=\vec W^\parallel.$

Therefore,

$s_1\in c^{(1)}\!\left(\vec W^\parallel\right)=\vec W^{\parallel(1)}.$

Therefore,

$e'-e=\vec\pi\!\left(s_1,0\right)\in\vec\pi\!\left(\vec W^{\parallel(1)}\right) =\vec W^{\perp\parallel}.$

Now, prove the “only if” direction.

Because $e'-e\in\vec W^{\perp\parallel}=\vec\pi\!\left(\vec W^{\parallel(2)}\right)$, there exists $s_2\in\vec W^{\parallel(2)}$ such that

$e'=e+\vec\pi\!\left(0,s_2\right).$

Therefore, obviously we have $c^{(1)}\!\left(\pi^{-1}\!\left(e\right)\right)=c^{(1)}\!\left(\pi^{-1}\!\left(e'\right)\right)$, and thus $W^{\parallel(1)}_e=W^{\parallel(1)}_{e'}$.

Similarly, we can prove that $W^{\parallel(2)}_e=W^{\parallel(2)}_{e’}$. $\square$

This means that, given both $W^{\parallel(1)}_e$ and $W^{\parallel(2)}_e$, we can determine $e$ up to a vector in $\vec W^{\perp\parallel}$.

Because we already have $\vec W^{\perp\parallel}$, we can define a new affine subspace $W^{\perp\perp}:=\pi\!\left(W^{\perp(1)}\times W^{\perp(2)}\right)$ so that $W^\perp=W^{\perp\perp}+\vec W^{\perp\parallel}$, and each point in $W^\perp$ can be uniquely decomposed as a sum of a point in $W^{\perp\perp}$ and a vector in $\vec W^{\perp\parallel}$. We can prove this easily. Such decomposition can be encoded into a projection $\pi^\perp:W^\perp\to W^{\perp\perp}$ so that for any $e\in W^\perp$, we have $e-\pi^\perp(e)\in\vec W^{\perp\parallel}$. Also, we can easily prove that $\pi$ is an affine isomorphism from $W^{\perp(1)}\times W^{\perp(2)}$ to $W^{\perp\perp}$.

Now that we have defined many affine spaces and vector spaces, here is a diagram of the relation between (some of) them (powered by quiver):

Diagram

Example. In the example of two thermal systems that can exchange energy but not number of particles, we may have

$\pi^\perp\!\left(\frac U2,\frac U2,N_1,N_2\right)=\left(0,0,N_1,N_2\right).$

## Baths

Baths are a special class of thermal systems. They are systems that have some of their intensive quantities well-defined and constant.

According to Equation \ref{eq: mce fundamental eq}, to make the intensive quantities constant, $\ln\Omega(e)$ should be linear in $e$. If we only require some of the intensive quantities to be constant, we only need it to be linear when $e$ moves along directions in a certain vector subspace.

The requirement above comes from the microcanonical ensemble, which does not involve changes in extensive quantities. An intuitive additional requirement is that $\lambda$ is also translationally invariant in such directions.

Then, here comes the definition of a bath:

Definition. A thermal system $(\mathcal E,\mathcal M)$ is called a $\left(\vec W^\parallel,i\right)$-bath, where $\mathcal E=(W,E,\lambda)$ and $\mathcal M=\bigsqcup_{e\in W}M_e$, if

• $\vec W^\parallel$ is a vector subspace of $\vec W$ and is a Polish reflexive space;
• For any $e\in E$ and $s\in\vec W^\parallel$, $e+s\in E$;
• $\lambda$ is invariant under translations in $\vec W^\parallel$; in other words, for any $s\in\vec W^\parallel$ and $A\in\sigma(E)$, we have $\lambda(A+s)=\lambda(A)$;
• $i\in\vec W^{\parallel\prime}$ is a continuous linear functional on $\vec W^\parallel$, called the constant intensive quantities of the bath; and
• For any $e\in E$ and $s\in\vec W^\parallel$,
$\ln\mu_{e+s}\!\left(M_{e+s}\right)=i(s)+\ln\mu_e\!\left(M_e\right).$
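A one-dimensional toy case may help: take $\vec W^\parallel=\mathbb R$ (energy) and $\ln\mu_e\!\left(M_e\right)$ linear in $e$ with slope $\beta$, so that the constant intensive quantity is $i(s)=\beta s$. The numbers below are assumed purely for illustration:

```python
import math

# Assumed one-dimensional bath: ln Omega(E) = beta*E + const, so i(s) = beta*s
# is the constant intensive quantity (physically, an inverse temperature).
beta = 2.0
ln_Omega = lambda E: beta * E + 1.3   # 1.3 is an arbitrary constant offset
i = lambda s: beta * s

E, s = 0.7, 0.4
# The defining property of the bath: ln Omega(E+s) = i(s) + ln Omega(E)
assert math.isclose(ln_Omega(E + s), i(s) + ln_Omega(E))
```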

An important remark is that $\vec W^\parallel$ must be finite-dimensional because a metrizable TVS with a non-trivial σ-finite translationally quasi-invariant Borel measure must be finite-dimensional (Feldman, 1966).

We can then define the non-trivial σ-finite translationally invariant Borel measure on $\vec W^\parallel$, denoted as $\lambda^\parallel$. It is unique up to a positive constant factor.

We may construct an affine subspace $W^\perp$ for the bath so that every point in $W$ can be uniquely decomposed into the sum of a point in $W^\perp$ and a vector in $\vec W^\parallel$. Then, we have a projection map $\pi:W\to W^\perp$ so that for any $e\in W$ we have $e-\pi(e)\in\vec W^\parallel$. Then, obviously, $\mu_e\!\left(M_e\right)$ must be in the form

$\begin{equation} \label{eq: Omega of bath} \mu_e\!\left(M_e\right)=f\!\left(\pi(e)\right)\mathrm e^{i(e-\pi(e))}, \end{equation}$

where $f:W^\perp\to\mathbb R^+$ is some function. The explicit formula of $f$ is $f(e):=\mu_e\!\left(M_e\right)$.

Further, we may require that $W^\perp$ is associated with a topological complement of $\vec W^\parallel$ (this is because $\vec W$ is locally convex and Hausdorff and $\vec W^\parallel$ is finite-dimensional). Then, by the mathematical tools that were introduced in the beginning, we can disintegrate the measure $\lambda$ w.r.t. $\lambda^\parallel$ to get a measure $\lambda^\perp$ on $W^\perp$ (it is the same for any element in $\vec W^\parallel$ because $\lambda$ is $\vec W^\parallel$-translationally invariant). Then, $\lambda$ is the product measure of $\lambda^\perp$ and $\lambda^\parallel$. In other words, for any measurable function $f:E\to\mathbb R$, we have

$\int_Ef\,\mathrm d\lambda= \int_{e\in E^\perp}\int_{s\in\vec W^\parallel}f\!\left(e+s\right) \mathrm d\lambda^\perp\!\left(e\right)\mathrm d\lambda^\parallel\!\left(s\right).$

## Thermal ensembles

Different from microcanonical ensembles, thermal ensembles are ensembles where the system we study is in thermal contact with a bath. For example, canonical ensembles and grand canonical ensembles are thermal ensembles. There are also non-thermal ensembles, which will be introduced later after we introduce non-thermal contacts (in part 2).

The thermal ensemble of a thermal system is the ensemble of the composite system of the system in question (subsystem 1) and a $\left(\vec W^{\parallel(2)},-i\circ\vec\rho^{-1}\right)$-bath (subsystem 2), where $i\in\vec W^{\parallel(1)\prime}$ is a parameter, with an extra requirement:

$\begin{equation} \label{eq: W2 translationally invariant} \forall s_2\in\vec W^{\parallel(2)},A\in\sigma(E): \lambda^\perp\!\left(\pi\!\left(A+s_2\right)\right)=\lambda^\perp\!\left(\pi\!\left(A\right)\right). \end{equation}$

The physical meaning of $i$ is the intensive variables that the system is fixed at by contacting the bath.

This composite system is called the composite system for the $\vec W^{\parallel(1)}$-ensemble. It is called that because we will see that the only important thing that distinguishes different thermal ensembles is the choice of $\vec W^{\parallel(1)}$, and the choices of $\pi,\lambda^\perp,W^{\perp(1)},W^{\perp(2)}$ are not important.

Definition. The composite system for the $\vec W^{\parallel(1)}$-ensemble of the system $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ is the composite system of $\left(\mathcal E^{(1)},\mathcal M^{(1)}\right)$ and $\left(\mathcal E^{(2)},\mathcal M^{(2)}\right)$, where

• $\left(\mathcal E^{(2)},\mathcal M^{(2)}\right)$ is a $\left(\vec W^{\parallel(2)},-i\circ\vec\rho^{-1}\right)$-bath, where $i\in\vec W^{\parallel(1)\prime}$ is a parameter called the fixed intensive quantities;
• Equation \ref{eq: W2 translationally invariant} holds.

From the properties of a bath, we can derive a useful property of $\lambda^{\parallel(1)}_e$.

Because $\lambda^{\parallel(1)}_e$ is the pullback of $\lambda^{\parallel(2)}_e$ under $\rho_e$, and $\lambda^{\parallel(2)}_e$ is just the same $\lambda^{\parallel(2)}$ for all $e$ (although $\lambda^{\parallel(2)}_e$ is defined on $W^{\parallel(2)}_e$ while $\lambda^{\parallel(2)}$ is defined on $\vec W^{\parallel(2)}$), the measure $\lambda^{\parallel(1)}_e$ is the same as long as $W^{\parallel(1)}_e$ is the same. This means that we are able to stay consistent with different compositing slices of our subsystem.

As we have claimed before, the isolation of a contraction is the same as the full contraction of a contractive slice. Therefore, we can use the microcanonical ensemble to find the equilibrium state of any contractive slice. Then, we can use the marginal state of each contractive slice to get the equilibrium state of each compositing slice in the subsystem.

Because of the equal a priori probability postulate, the equilibrium state $p^{\parallel\circ}_e$ on the contractive slice $$\left(\mathcal E^\parallel_e,\mathcal M^\parallel_e\right)$$ is

$p^{\parallel\circ}_e\!\left(e_1,e_2,m_1,m_2\right) =\frac1{\mu^\parallel_e\!\left(\mathcal M^\parallel_e\right)}\propto1,$

where $\mu^\parallel_e$ is the measure of the number of microstates on $\mathcal M^\parallel_e$. Here $\propto$ means that the factor is only related to $e$. We just need “$\propto$” instead of “$=$” because we can always normalize a probability density function.

Substitute this into Equation \ref{eq: slice marginal state}, and we get that the equilibrium state $p^{\parallel\circ(1)}_e$ on the compositing slice $$\left(\mathcal E^{\parallel(1)}_e,\mathcal M^{\parallel(1)}_e\right)$$ is

\begin{align} p^{\parallel\circ(1)}_e\!\left(e_1,m_1\right) &\propto\mu^{(2)}_{\rho_e(e_1)}\!\left(M^{(2)}_{\rho_e(e_1)}\right) \nonumber\\ &=f\!\left(\pi^{(2)}\!\left(\rho_e\!\left(e_1\right)\right)\right) \mathrm e^{\left(-i\circ\vec\rho^{-1}\right)\left(\rho_e(e_1)-\pi^{(2)}(\rho_e(e_1))\right)} \nonumber\\ &\propto\mathrm e^{-i(e_1)}. \label{eq: p^(1) propto e^-i(e1)} \end{align}

Here we utilized Equation \ref{eq: Omega of bath} and the fact that for any $e_1\in W^{\parallel(1)}_e$, $\pi^{(2)}\!\left(\rho_e(e_1)\right)=\pi^{(2)}\!\left(W^{\parallel(2)}_e\right)$ is the same and is only related to $e$. Note that we have already illustrated that $\lambda^{\parallel(1)}_e$ is the same as long as $W^{\parallel(1)}_e$ is the same, so we can normalize $p^{\parallel\circ(1)}_e$ to get the same state as long as $W^{\parallel(1)}_e$ is the same, avoiding any inconsistency.

Before we proceed to normalize $p^{\parallel\circ(1)}_e$, I would like to talk about what is just enough information to determine $\lambda^{\parallel(1)}_e$. First, we need to know how different $e$ can still make $W^{\parallel(1)}_e$ the same. We already know that $W^\perp$ is just $W^{\perp\perp}+\vec W^{\perp\parallel}$, and the component in $\vec W^{\perp\parallel}$ does not affect $W^{\parallel(1)}_e$ and $W^{\parallel(2)}_e$, so we only need to know no more than $\pi^\perp(e)$. Then, because $W^{\perp\perp}$ is isomorphic to $W^{\perp(1)}\times W^{\perp(2)}$ but the corresponding change in $W^{\perp(2)}$ does not affect $W^{\parallel(1)}_e$, we only need to know the component $\pi^{(1)}\!\left(e_1\right)=\pi^{(1)}\!\left(\pi^{-1}(e)\right)$, where $e_1$ is just the $e_1$ in Equation \ref{eq: p^(1) propto e^-i(e1)}. The space $W^{\parallel(1)}_e$ is just $\pi^{(1)-1}\!\left(e_1\right)$.

Besides those useless pieces of information (components of $e$), there is other useless information. I have previously mentioned that the choices of $\lambda^\perp$, $\lambda^{\perp(2)}$ etc. are also irrelevant. We can see this by noting that $\lambda^{\parallel(1)}$ is always the non-trivial translationally invariant σ-finite Borel measure on $W^{\parallel(1)}_e$, which is unique up to a constant positive factor (and exists because the space is finite-dimensional). This is not related to the choices of $\lambda^\perp$, $\lambda^{\perp(2)}$ etc. By this, we have reduced everything we need to care about to three measures: $\lambda^{(1)}$, $\lambda^{\perp(1)}$, and $\lambda^{\parallel(1)}$, whose relation is given by the following:

$\int_{E^{(1)}}f\,\mathrm d\lambda^{(1)}= \int_{e_1\in E^{\perp(1)}}\mathrm d\lambda^{\perp(1)}\!\left(e_1\right) \int_{s_1\in\vec E^{\parallel(1)}_{e_1}} f\!\left(e_1+s_1\right)\mathrm d\lambda^{\parallel(1)}\!\left(s_1\right),$

where $E^{\perp(1)}:=\pi^{(1)}\!\left(E^{(1)}\right)$ and $\vec E^{\parallel(1)}_{e_1}:=\left(E^{(1)}-e_1\right)\cap\vec W^{\parallel(1)}$ is the region of $s_1\in\vec W^{\parallel(1)}$ in which $e_1+s_1$ is in $E^{(1)}$.

Next, what we need to do is to normalize Equation \ref{eq: p^(1) propto e^-i(e1)}. The denominator in the normalization factor, which we could call the partition function $Z:\bigsqcup_{e_1\in E^{\perp(1)}}I^{(1)}_{e_1}\to\mathbb R$, is

\begin{align*} Z\!\left(e_1,i\right)&:=\int_{s_1\in\vec E^{\parallel(1)}_{e_1}} \int_{m_1\in M^{(1)}_{e_1+s_1}} \mathrm e^{-i\left(s_1\right)}\,\mathrm d\lambda^{\parallel(1)}\!\left(s_1\right) \mathrm d\mu^{(1)}_{e_1+s_1}\!\left(m_1\right)\\ &=\int_{s_1\in\vec E^{\parallel(1)}_{e_1}} \Omega^{(1)}\!\left(e_1+s_1\right) \mathrm e^{-i\left(s_1\right)}\,\mathrm d\lambda^{\parallel(1)}\!\left(s_1\right), \end{align*}

where $I^{(1)}_{e_1}\subseteq\vec W^{\parallel(1)\prime}$ is the region of $i$ in which the integral converges. It is possible that $I^{(1)}_{e_1}=\varnothing$ for all $e_1\in E^{\perp(1)}$, and in this case the thermal ensemble is not defined.

Because we have got rid of arguments about the bath and the composite system, we can now define the partition function without the “$(1)$” superscript:

$Z\!\left(e,i\right)=\int_{s\in\vec E^{\parallel}_e} \Omega\!\left(e+s\right) \mathrm e^{-i\left(s\right)}\,\mathrm d\lambda^{\parallel}\!\left(s\right),\quad e\in E^\perp,\quad i\in I_e\subseteq\vec W^{\parallel\prime}.$

By looking at the definition, we may see that the partition function is just the partial Laplace transform of $\Omega$.
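For instance, in an assumed one-dimensional toy case with density of states $\Omega(E)=E^2$ on $[0,\infty)$ and $i(s)=\beta s$, the partition function is the Laplace transform $Z(\beta)=\int_0^\infty E^2\mathrm e^{-\beta E}\,\mathrm dE=2/\beta^3$, which we can check numerically:

```python
import numpy as np

# Numeric check that the toy partition function is the Laplace transform of
# Omega. Everything here is assumed: Omega(E) = E**2 on [0, inf), beta = 1.5.
beta = 1.5
h = 1e-4
E = (np.arange(600_000) + 0.5) * h          # midpoint grid truncated at E = 60
Z_numeric = np.sum(E**2 * np.exp(-beta * E)) * h
Z_exact = 2.0 / beta**3                     # analytic Laplace transform of E**2
assert abs(Z_numeric - Z_exact) / Z_exact < 1e-6
```

The truncation at $E=60$ is harmless because the integrand decays like $\mathrm e^{-\beta E}$.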

Note that the partition function is unique only up to a positive constant factor because we can choose another $\lambda^\parallel$ by multiplying a positive constant factor.

The partition function has very good properties.

Theorem. For any $e\in E^\perp$, $I_e$ is convex.

Proof

Proof. Suppose $i,i'\in I_e$. The functional $i'-i$ defines a hyperplane $H:=\operatorname{Ker}\!\left(i'-i\right)$. The hyperplane separates $\vec W^\parallel$ into two half-spaces $H^+$ and $H^-$ defined as

$H^\pm:=\left\{s\in\vec W^\parallel\,\middle|\,i'\!\left(s\right)-i\!\left(s\right)\gtrless0\right\}.$

By definition, $Z\!\left(e,i\right)$ and $Z\!\left(e,i'\right)$ both converge. Let $t\in\left[0,1\right]$, and we have

\begin{align*} Z\!\left(e,i+t\left(i'-i\right)\right) &=\left(\int_{s\in\vec E^{\parallel}_e\cap H^+}+\int_{s\in\vec E^{\parallel}_e\cap H^-}\right) \Omega\!\left(e+s\right) \mathrm e^{-i(s)-t(i'(s)-i(s))}\,\mathrm d\lambda^{\parallel}\!\left(s\right)\\ &\le\int_{s\in\vec E^{\parallel}_e\cap H^+}\Omega\!\left(e+s\right) \mathrm e^{-i(s)}\,\mathrm d\lambda^{\parallel}\!\left(s\right) +\int_{s\in\vec E^{\parallel}_e\cap H^-}\Omega\!\left(e+s\right) \mathrm e^{-i'(s)}\,\mathrm d\lambda^{\parallel}\!\left(s\right)\\ &<\infty. \end{align*}

Therefore, $Z\!\left(e,i+t\left(i'-i\right)\right)$ converges. $\square$

Being convex is good because it means that $I_e$ is not too shattered. It is connected, and its interior $\operatorname{Int}I_e$ and closure $\operatorname{Cl}I_e$ look very much like $I_e$ itself. Also, every point in $I_e$ is a limit point of $I_e$. This makes it possible to talk about the limits and derivatives of $Z\!\left(e,i\right)$ w.r.t. $i$.

Since $I_e$ is a region in a finite-dimensional space $\vec W^{\parallel\prime}$, we may define the derivatives w.r.t. $i$ in terms of partial derivatives to components of $i$. To define the components of $i$, we need first a basis on $\vec W^\parallel$, which sets a coordinate system although actually we should finally derive coordinate-independent conclusions.

Suppose we have a basis on $\vec W^\parallel$. Then, for any $s\in\vec W^\parallel$, we can write its components as $s_\bullet$, and for any $i\in\vec W^{\parallel\prime}$, we can write its components as $i_\bullet$. The subscript “$\bullet$” here can act as dummy indices (for multi-index notation). For example, we can write $i(s)=i_\bullet s_\bullet$. I do not use superscript and subscript to distinguish vectors and linear functionals because it is just for multi-index notation and because I am going to use them to label multi-index objects that are neither vectors nor linear functionals.

Theorem. For any $e\in E^\perp$, $Z\!\left(e,i\right)$ is $C^\infty$ w.r.t. $i$ on $\operatorname{Int}I_e$.

Proof

Proof. By the definition of the interior of a region, for any $i\in\operatorname{Int}I_e$ and any $p\in\vec W^{\parallel\prime}$, there exists $\delta_{i,p}>0$ such that $i+\delta_{i,p}p\in I_e$.

By Leibniz’s integral rule, the partial derivatives of $Z\!\left(e,i\right)$ w.r.t. $i$ (if existing) are given by

\begin{align*} \frac{\partial^{\Sigma\alpha_\bullet}Z\!\left(e,i\right)}{\partial^{\alpha_\bullet}i_\bullet} &=\int_{s\in\vec E^{\parallel}_e} \Omega\!\left(e+s\right)\left(-s_\bullet\right)^{\alpha_\bullet} \mathrm e^{-i\left(s\right)}\,\mathrm d\lambda^{\parallel}\!\left(s\right)\\ &\le\int_{s\in\vec E^{\parallel}_e} \Omega\!\left(e+s\right)\left|s_\bullet\right|^{\alpha_\bullet} \mathrm e^{-i\left(s\right)}\,\mathrm d\lambda^{\parallel}\!\left(s\right) \end{align*}

where $\alpha_\bullet$ are natural numbers indexed by $\bullet$. Now we just need to prove that this integral converges for any $i\in\operatorname{Int}I_e$.

Because of the inequality

$a\ln x-bx\le a\left(\ln\frac ab-1\right),\quad a,b,x>0,$

where the equality holds when $x=a/b$, we have

$\left|s_\bullet\right|^{\alpha_\bullet} \le\left(\frac{\alpha_\bullet}{\mathrm eb}\right)^{\alpha_\bullet}\mathrm e^{b\Sigma\left|s_\bullet\right|}, \quad b>0.$

There are $2^{\dim\vec W^\parallel}$ orthants in $\vec W^\parallel$. We can label each of them by a string $\sigma_\bullet$ of $\pm1$ of length $\dim\vec W^\parallel$. Then, each orthant can be denoted as $O_\sigma$. Then, we have

$\forall s\in O_\sigma:\sigma_\bullet s_\bullet=\Sigma\left|s_\bullet\right|.$

Therefore,

$\forall s\in O_\sigma:\left|s_\bullet\right|^{\alpha_\bullet} \le\left(\frac{\alpha_\bullet}{\mathrm eb}\right)^{\alpha_\bullet}\mathrm e^{b\sigma_\bullet s_\bullet}, \quad b>0.$

Let $b:=\delta_{i,-\sigma}$, where $\sigma:s\mapsto\sigma_\bullet s_\bullet$ is a linear functional. Then,

$\forall s\in O_\sigma:\left|s_\bullet\right|^{\alpha_\bullet}\mathrm e^{-i(s)} \le\left(\frac{\alpha_\bullet}{\mathrm e\delta_{i,-\sigma}}\right)^{\alpha_\bullet} \mathrm e^{-\left(i-\delta_{i,-\sigma}\sigma\right)(s)}.$

Because $i-\delta_{i,-\sigma}\sigma\in I_e$, we have

$\frac{\partial^{\Sigma\alpha_\bullet}Z\!\left(e,i\right)}{\partial^{\alpha_\bullet}i_\bullet} \le\sum_\sigma\left(\frac{\alpha_\bullet}{\mathrm e\delta_{i,-\sigma}}\right)^{\alpha_\bullet} \int_{s\in\vec E^{\parallel}_e\cap O_\sigma}\Omega\!\left(e+s\right) \mathrm e^{-\left(i-\delta_{i,-\sigma}\sigma\right)(s)}\, \mathrm d\lambda^{\parallel}\!\left(s\right)<\infty.$

Therefore, the partial derivatives exist. $\square$

The next step is to find the macroscopic quantities. The equilibrium states are

$p_e^{\parallel\circ}\!\left(e,m\right) =\frac{\mathrm e^{-i\left(e\right)}}{Z\!\left(\pi(e),i\right)},$

where $Z$ is the partition function. Here the role of $e$ becomes the label parameter in Equation \ref{eq: fundamental equation before}. The measured value of extensive quantities under equilibrium is then

\begin{align*} \varepsilon^\circ &=\frac1{Z\!\left(e,i\right)}\int_{s\in\vec E^{\parallel}_e} \left(e+s\right)\mathrm e^{-i\left(s\right)} \Omega\!\left(e+s\right)\mathrm d\lambda^{\parallel}\!\left(s\right)\\ &=e+\frac1{Z\!\left(e,i\right)}\int_{s\in\vec E^{\parallel}_e} s\mathrm e^{-i\left(s\right)} \Omega\!\left(e+s\right)\mathrm d\lambda^{\parallel}\!\left(s\right)\\ &=e-\frac{\partial\ln Z\!\left(e,i\right)}{\partial i}. \end{align*}

The entropy under equilibrium is then

\begin{align*} S^\circ &=-\int_{s\in\vec E^{\parallel}_e} \frac{\mathrm e^{-i(s)}}{Z\!\left(e,i\right)}\ln\frac{\mathrm e^{-i(s)}}{Z\!\left(e,i\right)} \Omega\!\left(e+s\right)\mathrm d\lambda^{\parallel}\!\left(s\right)\\ &=\frac1{Z\!\left(e,i\right)}\int_{s\in\vec E^{\parallel}_e} i\!\left(s\right)\mathrm e^{-i\left(s\right)} \Omega\!\left(e+s\right)\mathrm d\lambda^{\parallel}\!\left(s\right) +\ln Z\!\left(e,i\right)\\ &=-i\!\left(\frac{\partial\ln Z\!\left(e,i\right)}{\partial i}\right)+\ln Z\!\left(e,i\right). \end{align*}

From these two equations, we can eliminate the parameter $e$ and get the fundamental equation in the form of Equation \ref{eq: fundamental equation}:

$S^\circ=i\!\left(\varepsilon^\circ\right)+\ln Z\!\left(\pi\!\left(\varepsilon^\circ\right),i\right).$

We can see that $S^\circ$ decouples into two terms, one of which is only related to the $\vec W^\parallel$ component of $\varepsilon^\circ$, and the other of which is only related to the $W^\perp$ component of $\varepsilon^\circ$. What is good is that we have a well-defined notion of the derivative of $S^\circ$ w.r.t. the first component, and it is $i$. Therefore, the intensive quantities corresponding to changes of extensive quantities in the subspace $\vec W^\parallel$ are well defined and constantly equal to $i$, which is just what we have been calling the fixed intensive quantities. The other components of the intensive quantities are not guaranteed to be well defined because $Z\!\left(\cdot,i\right)$ is not guaranteed to have good enough properties.
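As a sanity check of the fundamental equation $S^\circ=i\!\left(\varepsilon^\circ\right)+\ln Z$, here is the familiar discrete special case: a canonical ensemble with assumed energy levels, where $i$ reduces to a single number $\beta$ and the relation reads $S=\beta\langle E\rangle+\ln Z$:

```python
import numpy as np

# Discrete canonical ensemble with assumed energy levels; Z(beta) is the sum
# of Boltzmann weights, and S = beta*<E> + ln Z holds identically.
energies = np.array([0.0, 1.0, 2.5])
beta = 0.8

weights = np.exp(-beta * energies)
Z = weights.sum()
p = weights / Z                      # equilibrium (Boltzmann) probabilities

E_mean = (p * energies).sum()
S = -(p * np.log(p)).sum()           # Gibbs entropy

assert np.isclose(S, beta * E_mean + np.log(Z))

# <E> = -d(ln Z)/d(beta), checked by a central finite difference:
d = 1e-6
lnZ = lambda b: np.log(np.exp(-b * energies).sum())
assert np.isclose(E_mean, -(lnZ(beta + d) - lnZ(beta - d)) / (2 * d))
```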

UlyssesZhan
# The core of a voting system is the intersection of Pareto sets

*2023-03-25T23:28:53-07:00 · https://ulysseszh.github.io/economics/2023/03/25/voting-pareto*

A voting system is a concept in political science. Here I give the mathematical definition of a voting system.

A (binary) voting system is a tuple $(P,V,q)$, where $P$ is any set, called the set of proposals; $V$ is a finite set of preference relations on $P$, called the set of voters; and $q$ is an integer between $0$ and $\left|V\right|$ (inclusive), called the quota.

For each voter $v\in V$ and two proposals $x,y\in P$, we denote “$v$ prefers $x$ to $y$” by

$x\succeq_vy.$

A proposal $x\in P$ is a defeat of $y\in P$ if

$\left|\left\{v\in V\,\middle|\,x\succeq_vy\right\}\right|\geq q,$

denoted as $x\succsim_{V,q}y$ (despite this notation, $\succsim_{V,q}$ is not necessarily a preference relation on $P$ because it is not transitive generally, which is actually a well-known example of irrationality).

The core $\mathcal C(P,V,q)$ of the voting system is the set of all elements $x\in P$ such that $x$ does not have any defeat other than $x$ itself (any non-trivial defeat).

Pareto sets are common concepts in economics. To clarify, I also give the mathematical definition of them here.

Let $P$ be a set and $Q$ be a family of preference relations on $P$. Then, $x\in P$ is called a (weak) $Q$-Pareto improvement of $y\in P$ if $\forall v\in Q:x\succeq_vy$, denoted as $x\succsim_Qy$ (despite the notation, $\succsim_Q$ is not necessarily a preference relation on $P$).

The Pareto set $\mathcal P(P,Q)$ is the set of all elements $x\in P$ such that $x$ does not have any $Q$-Pareto improvement other than $x$ itself (any non-trivial $Q$-Pareto improvement).

Here is the main result. For a voting system $(P,V,q)$,

$\mathcal C(P,V,q)=\bigcap_{Q\subseteq V,\left|Q\right|=q}\mathcal P(P,Q).$

Proof. To prove this, we need to show that $x\in P$ does not have any non-trivial Pareto improvement for any $q$ voters iff $x$ does not have any non-trivial defeat.

To prove the forward direction, suppose that $x\in P$ does not have any non-trivial Pareto improvement for any $q$ voters. Let $y\in P$ such that $y\ne x$, and the goal is to prove that $y$ is not a defeat of $x$.

Let

$Y:=\left\{v\in V\,\middle|\,y\succeq_vx\right\}.$

Then, $y$ is a $Y$-Pareto improvement of $x$, so we have $\left|Y\right|<q$ (because otherwise there is a subset of $Y$ with $q$ voters for which $y$ is a Pareto improvement of $x$). Therefore, $y$ is not a defeat of $x$.

To prove the backward direction, suppose that $x\in P$ has a non-trivial $Q$-Pareto improvement, where $Q\subseteq V$ and $\left|Q\right|=q$. Denote the improvement as $y$. Let

$Y:=\left\{v\in V\,\middle|\,y\succeq_vx\right\}.$

Because $y$ is a $Q$-Pareto improvement of $x$, we have $Q\subseteq Y$. Therefore, $\left|Y\right|\geq\left|Q\right|=q$. Therefore, $y$ is a defeat of $x$. $\square$

In particular, we have

$\mathcal C\!\left(P,V,\left|V\right|\right)=\mathcal P(P,V).$

Here is an example. Suppose we have 5 voters, and the set of proposals is $\mathbb R^2$. Each voter has an ideal point and prefers points nearer to the ideal point. The 5 ideal points form a convex pentagon. Then we can find the core easily by the conclusion above.
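To illustrate the theorem computationally, here is a small brute-force sketch. All specifics (the ideal points, the grid of proposals, the quota) are made up for illustration; it computes the core directly from the definition and also as the intersection of the Pareto sets over all $q$-subsets of voters, and confirms they agree:

```python
from itertools import combinations, product
from math import dist

# Hypothetical instance: 5 voters, each with an ideal point in R^2
# (the points form a convex pentagon); proposals restricted to a grid.
ideals = [(0, 2), (2, 1), (1, -2), (-1, -2), (-2, 1)]
P = list(product(range(-2, 3), repeat=2))  # 25 grid proposals
q = 3  # simple majority of 5 voters

def prefers(v, x, y):
    # voter v weakly prefers x to y iff x is no farther from v's ideal point
    return dist(v, x) <= dist(v, y)

def core(P, V, q):
    # x is in the core iff no y != x is a defeat of x
    return {x for x in P
            if not any(y != x and sum(prefers(v, y, x) for v in V) >= q
                       for y in P)}

def pareto(P, Q):
    # x is in the Q-Pareto set iff no y != x is a Q-Pareto improvement of x
    return {x for x in P
            if not any(y != x and all(prefers(v, y, x) for v in Q)
                       for y in P)}

intersection = set(P)
for Q in combinations(ideals, q):
    intersection &= pareto(P, Q)

print(core(P, ideals, q) == intersection)  # True, as the theorem asserts
```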
# Relationship between the Gini coefficient and the variance

*2023-02-06T16:38:25-08:00 · https://ulysseszh.github.io/economics/2023/02/06/gini-variance*

This article is translated from a Chinese article on my Zhihu account. The original article was posted at 2021-04-25 10:06 +0800.

First, define the Lorenz curve: it is the curve that consists of all points $(u,v)$ such that the poorest $u$ portion of the population in the country owns $v$ portion of the total wealth.

The Gini coefficient $G/\mu$ is defined as the area between the Lorenz curve and the line $u=v$ divided by the area enclosed by the three lines $u=v$, $v=0$, and $u=1$.

Now, suppose the wealth distribution in the country is $p(X)$, where $p\!\left(x\right)\mathrm dx$ is the portion of population that has wealth in the range $[x,x+\mathrm dx]$.

Then, the Lorenz curve is the graph of the function $g$ defined as

$g(F(x))=\frac1\mu\int_{-\infty}^xtp\!\left(t\right)\mathrm dt,$

where

$F\!\left(x\right):=\int_{-\infty}^xp\!\left(t\right)\mathrm dt$

is the cumulative distribution function of $p(X)$, and

$\begin{equation} \label{eq: def mu} \mu:=\int_{-\infty}^{+\infty}tp\!\left(t\right)\mathrm dt \end{equation}$

is the average wealth of the population, which is just $\mathrm E\!\left[X\right]$ ($X$ is a random variable such that $X\sim p(X)$).

Then, the Lorenz curve is

$v=g(u):=\frac1\mu\int_{-\infty}^{F^{-1}(u)}tp\!\left(t\right)\mathrm dt.$

According to the definition of the Gini coefficient,

\begin{align*} G&:=2\mu\int_0^1\left(u-g(u)\right)\mathrm du\\ &=\mu-2\mu\int_0^1g\!\left(u\right)\mathrm du\\ &=\mu-2\int_{u=0}^1\int_{t=-\infty}^{F^{-1}(u)}tp\!\left(t\right)\mathrm dt\,\mathrm du. \end{align*}

Interchange the order of integration, and we have

\begin{align*} G&=\mu-2\int_{t=-\infty}^{+\infty}\int_{u=F(t)}^1tp\!\left(t\right)\mathrm du\,\mathrm dt\\ &=\mu-2\int_{-\infty}^{+\infty}\left(1-F(t)\right)tp\!\left(t\right)\mathrm dt. \end{align*}

Substitute Equation \ref{eq: def mu} into the above equation, and we have

\begin{align*} G&=\int_{-\infty}^{+\infty}2tF\!\left(t\right)p\!\left(t\right)\mathrm dt-\mu\\ &=\int_{-\infty}^{+\infty}\left(2F\!\left(t\right)-1\right)tp\!\left(t\right)\mathrm dt\\ &=\int_0^1\left(2u-1\right)F^{-1}\!\left(u\right)\mathrm du. \end{align*}

Now here is the neat part. Separate it into two parts, and write them in double integrals:

\begin{align*} G&=\int_0^1uF^{-1}\!\left(u\right)\mathrm du-\int_0^1\left(1-u\right)F^{-1}\!\left(u\right)\mathrm du\\ &=\int_{u_2=0}^1\int_{u_1=0}^{u_2}F^{-1}\!\left(u_2\right)\mathrm du_1\,\mathrm du_2 -\int_{u_1=0}^1\int_{u_2=u_1}^1F^{-1}\!\left(u_1\right)\mathrm du_2\,\mathrm du_1. \end{align*}

Interchange the order of integration of the second term, and we have

\begin{align*} G&=\int_{u_2=0}^1\int_{u_1=0}^{u_2}\left(F^{-1}\!\left(u_2\right)-F^{-1}\!\left(u_1\right)\right)\mathrm du_1\,\mathrm du_2\\ &=\frac12\int_{u_2=0}^1\int_{u_1=0}^1\left|F^{-1}\!\left(u_2\right)-F^{-1}\!\left(u_1\right)\right|\mathrm du_1\,\mathrm du_2\\ &=\frac12\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\left|x_2-x_1\right|p\!\left(x_1\right)p\!\left(x_2\right)\mathrm dx_1\,\mathrm dx_2\\ &=\frac12\mathrm E\!\left[\left|X_2-X_1\right|\right], \end{align*}

where $X_1$ and $X_2$ are two independent random variables with $p$ being their respective distribution functions: $\left(X_1,X_2\right)\sim p\!\left(X_1\right)p\!\left(X_2\right)$.

By this result, we can easily see how the Gini coefficient represents the statistical dispersion.
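As a quick numerical check (my own example, not from the original article): for $X\sim\mathrm{Uniform}(0,1)$ we have $F^{-1}(u)=u$ and $\mathrm E\!\left[\left|X_2-X_1\right|\right]=1/3$, so both expressions give $G=1/6$, and the Gini coefficient $G/\mu=(1/6)/(1/2)=1/3$, the known value for the uniform distribution:

```python
# Midpoint-rule evaluation of G = ∫₀¹ (2u-1) F⁻¹(u) du for X ~ Uniform(0,1),
# where F⁻¹(u) = u; this should match (1/2) E|X₂-X₁| = (1/2)(1/3) = 1/6.
n = 1_000_000
G = sum((2*u - 1) * u for u in ((k + 0.5) / n for k in range(n))) / n
print(abs(G - 1/6) < 1e-9)  # True
```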

We can apply similar tricks to the variance $\sigma_X^2$.

\begin{align*} \sigma_X^2&=\mathrm E\!\left[X^2\right]-\mathrm E\!\left[X\right]^2\\ &=\int_{-\infty}^{+\infty}t^2p\!\left(t\right)\mathrm dt -\left(\int_{-\infty}^{+\infty}tp\!\left(t\right)\mathrm dt\right)^2\\ &=\int_0^1F^{-1}\!\left(u\right)^2\,\mathrm du -\left(\int_0^1F^{-1}\!\left(u\right)\mathrm du\right)^2. \end{align*}

Separate the first term into two halves, and write the resulting three terms as double integrals:

\begin{align*} \sigma_X^2&=\frac12\int_0^1F^{-1}\!\left(u_2\right)^2\,\mathrm du_2\int_0^1\mathrm du_1\\ &\phantom{=~}{}-\int_0^1F^{-1}\!\left(u_1\right)\mathrm du_1\int_0^1F^{-1}\!\left(u_2\right)\mathrm du_2\\ &\phantom{=~}{}+\frac12\int_0^1F^{-1}\!\left(u_1\right)^2\,\mathrm du_1\int_0^1\mathrm du_2\\ &=\frac12\int_0^1\int_0^1 \left(F^{-1}\!\left(u_2\right)^2-2F^{-1}\!\left(u_1\right)F^{-1}\!\left(u_2\right)+F^{-1}\!\left(u_1\right)^2\right) \mathrm du_1\,\mathrm du_2\\ &=\frac12\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} \left(x_2-x_1\right)^2p\!\left(x_1\right)p\!\left(x_2\right)\mathrm dx_1\,\mathrm dx_2\\ &=\frac12\mathrm E\!\left[\left(X_2-X_1\right)^2\right]. \end{align*}

Then we can derive the relationship between the Gini coefficient and the variance:

$2\sigma_X^2-4G^2=\sigma_{\left|X_2-X_1\right|}^2.$
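This relationship is just the identity $\sigma_D^2=\mathrm E\!\left[D^2\right]-\mathrm E\!\left[D\right]^2$ applied to $D=\left|X_2-X_1\right|$, using $\mathrm E\!\left[D\right]=2G$ and $\mathrm E\!\left[D^2\right]=2\sigma_X^2$. A Monte Carlo sketch (my own example: $X\sim\mathrm{Exponential}(1)$, whose Gini coefficient is known to be $1/2$) checks both results:

```python
import random
import statistics

random.seed(0)
n = 100_000
# Hypothetical example distribution: Exponential(1); its Gini coefficient is 1/2.
x1 = [random.expovariate(1.0) for _ in range(n)]
x2 = [random.expovariate(1.0) for _ in range(n)]
d = [abs(a - b) for a, b in zip(x1, x2)]

mu = statistics.fmean(x1 + x2)
G = 0.5 * statistics.fmean(d)   # G = (1/2) E|X2 - X1|
print(round(G / mu, 2))         # Gini coefficient; close to 0.5

lhs = 2 * statistics.pvariance(x1 + x2) - 4 * G**2
rhs = statistics.pvariance(d)   # variance of |X2 - X1|
print(abs(lhs - rhs) < 0.1)     # the two sides agree up to sampling noise
```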
# Distinguishing all the letters in handwritten math/physics notes

*2023-02-04T09:43:02-08:00 · https://ulysseszh.github.io/misc/2023/02/04/handwritten-font*

This article is adapted from a Chinese article on my Zhihu account. The original article was posted at 2021-02-02 00:41 +0800. There are some minor modifications to the original article as well as some added content.

Personally, I have the demand of handwriting math/physics notes, but an annoying fact about this is that I usually cannot distinguish every letter that may be possibly used well enough.

This article does not involve calligraphy, and I myself have not learnt calligraphy specially ever.

## List of different styles

Here is a full list of different styles except for their bold counterparts:

| Style name | $\LaTeX$ command | Example |
| --- | --- | --- |
| Roman | \mathrm | $\mathrm{ABC}$ |
| Italic | \mathit | $\mathit{ABC}$ |
| Blackboard | \mathbb | $\mathbb{ABC}$ |
| Calligraphic | \mathcal | $\mathcal{ABC}$ |
| Script | \mathscr | $\mathscr{ABC}$ |
| Fraktur | \mathfrak | $\mathfrak{ABC}$ |
| Sans-serif | \mathsf | $\mathsf{ABC}$ |
| Typewriter | \mathtt | $\mathtt{ABC}$ |

We are not going to distinguish all the letters and all the styles.

## Some principles

I will try to find a handwriting style that satisfies the following conditions (in descending order of importance):

1. I am able to write them fast and simply.
2. I am able to recognize each character at a glance.
3. The style is consistent for all letters.
4. The shape is similar to the default mathematical font of $\LaTeX$ (Computer Modern).
5. If the last condition cannot be satisfied, the shape is similar to some style that has existed before.

The 2nd principle ranks below the 1st because note-taking should not become too inefficient, and because letters and styles can often be distinguished from context.

If a style fails to satisfy the 5th or the 4th principle (i.e. this style is invented by me), I will add an exclamation mark (!) to inform you of this.

The following lists all the letters and the styles that I want to distinguish:

• Digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 (they are not letters, but they deserve distinguishing).
• Roman style of uppercase English letters: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z.
• Italic style of uppercase English letters: A, B, C, D, E, F, G, H, I, J, K, L, M, N, P, Q, R, S, T, U, V, W, X, Y, Z (not including O).
• Roman style of lowercase English letters: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z.
• Italic style of lowercase English letters: a, b, c, d, e, f, g, h, i, j, k, l, m, n, p, q, r, s, t, u, v, w, x, y, z (not including o).
• Roman style of uppercase Greek letters: Gamma, Delta, Theta, Lambda, Xi, Pi, Sigma, Upsilon, Phi, Psi, Omega (not including any letters that cannot be distinguished from uppercase English letters).
• Italic style of lowercase Greek letters: alpha, beta, gamma, delta, epsilon, zeta, eta, theta, iota, kappa, lambda, mu, nu, xi, pi, rho, sigma, tau, upsilon, phi, chi, psi, omega (not including omicron).
• Blackboard bold style of uppercase English letters: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z.
• Calligraphic style of uppercase English letters: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z.
• Script style of uppercase English letters: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z.
• Fraktur style of uppercase English letters: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z.
• Fraktur style of lowercase English letters: a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z.

In terms of linguistic terminology, each entry in the above list is a grapheme in my handwritten notes. However, in extreme cases, even if I have actively avoided, it is still possible that two graphemes are indistinguishable. Then, I will design allographs for those graphemes to provide extra distinguishability in extreme cases.

Here are some of the general rules that I set up:

• We do not write any serif unless it is a must for distinguishing letters. (This is also why I did not plan to distinguish sans-serif styles.)
• The roman style of all English letters does not have tails (either ornamental or used for ligatures in connected writing).
• For both roman and italic styles, all uppercase letters (both English and Greek) have the same position of bottom and top.

For other details, look at this image.

## Roman and italic

### A, a, alpha

In italic style, the slanted line in the right side of A is nearly vertical. Actually, in the italic style of uppercase letters, almost all top-left-to-bottom-right slanted lines are nearly vertical.

To write conveniently, use the single-story glyph of a even for its roman style.

The difference of the glyph of alpha and that of a should be noticeable.

### C, c, sigma

C and c are tricky because it is very hard to distinguish roman and italic styles for them, but we have to because they are very commonly used. We need to be careful when writing and recognizing them.

Roman style of C is largely vertically symmetrical, while the italic style of C is not. In the italic style of C, the top endpoint of the stroke is to the right of the bottom endpoint, and the left-most position on the stroke is below the center instead of being at the same level as the center.

The opening direction of the roman style of c is to the right, while that of the italic style is to the top-right.

(I once tried using ornamental tails to distinguish the italic style of c from the roman style, but it would make them look strange and may possibly confuse with other letters.)

At first, I did not want to distinguish the roman and italic styles of c, but I found that it is useful to distinguish them. For example, sometimes we use $a,b,c$ for indices, so the italic style of c may be used as an index; meanwhile, we may use the roman style of c to represent “center” so that we can express the position of the center as $\mathbf r_\mathrm c$. In both cases, the letter c appears in the position of a subscript, but they need to be distinguished from each other.

I want to talk about sigma here because in Greek, its final form $\varsigma$ looks very similar to c. Just do not use that glyph for sigma.

### e

It is important to distinguish the roman and italic styles of e because we may use $\mathrm e$ for the base of natural logarithm and use $e$ for the electric charge of a proton.

At the turning point of the stroke at the center-right of the glyph, the roman style of e is sharp while the italic style is round. This detail is enough to distinguish them.

### f

The roman style of f is not a descender while the italic style is a descender. Also, the italic style of f has a left-tail in the bottom.

### g

To make writing convenient, the roman style of g uses the single-story glyph. That makes it hard to distinguish from the italic style, but we may write the descender of the italic style of g in an exaggerated way to distinguish them.

### 1, I, l

Here we are at the only extreme case where multiple graphemes share the same glyph: 1, roman style of I, and roman style of l. They are all simply a vertical line.

Normally we should be able to distinguish them by their context, but in some cases we need to distinguish them clearly. We may add some small turnings at the top and bottom of I to distinguish it from l. It is like we are trying to write the serifs of I but we write so fast that they are connected and look like small turnings.

A small sharp turning may be added at the top of 1 to distinguish it from l.

### i, iota

The italic style of i has two tails (one left-tail in the middle and one right-tail in the bottom). It looks exactly the same as iota except for the dot at the top.

### K, k, kappa

In both the roman and italic styles of K, the endpoint of the stroke branch of K at the top-right is approximately at the same level as the top endpoint of the vertical line at the left.

The slant of the left vertical line should be enough to distinguish the italic style of K from the roman style, but we may also add a small tail at the bottom-right to distinguish them further. Do not worry about confusing it with kappa because we have other ways to distinguish it.

In the italic style of k, the top-right stroke branch is written as a closed circle. This makes it easier to distinguish from K and kappa.

kappa is shorter than K and k. The bottom-right stroke branch is written in shape of an inclined mirrored S-curve to distinguish from K and k. The endpoint of the stroke branch of kappa at the top-right is approximately at the same level as the top endpoint of the vertical line at the left.

### M, mu

In the italic style of M, the bottom is wider than the top, while in the roman style, the top is as wide as the bottom. Write M in four strokes to distinguish it from mu.

As for mu, note that the bottom-left corner is a descender, while other parts are not.

### 0, O, o

These are the most cursed characters, even more so than 1, I, and l. They are so cursed that I refuse to distinguish the roman styles of O and o from the italic styles, and I would refuse to use the italic styles of O and o in my handwritten notes.

The digit 0 is narrower than O and o.

Just avoid using omicron because it is indistinguishable from o.

### p, rho

Write the italic style of p in two strokes, and it has two left-tails, one at the top-left and one at the bottom-left.

Write rho in one stroke. Starting the stroke from below the baseline (at the bottom of the descender) is recommended.

### Q, q

In the italic style of Q, the last stroke looks like a tilde. It is straight for the roman style.

The italic style of q has a sharp right-tail in the bottom.

### r, u, v, gamma, nu, Upsilon, upsilon

OK, this is important.

Every physicist must have met at least one person who mistakenly recognized nu as v.

The roman style of r does not have tails (the arc at the top-right does not count as a tail). The italic style of r has a left-tail at the top-left and a right-tail at the top-right. The downward part and the upward part of the stroke overlap at the bottom to distinguish it from v.

The italic style of u has a left-tail at the top-left and a right-tail at the bottom-right. The tail at bottom-right distinguishes it from v.

The italic style of v has a left-tail at the top-left and a left-tail at the top-right. The tail at the top-right may be omitted because it is not very noticeable. The bottom of both the roman style and the italic style of v is a sharp turning.

The top-left of gamma is curvy while the top-right is straight. The letter is also a descender, so make its bottom lower than the baseline.

The left of nu is a vertical line. The right of nu is like a broken line (!). The left and right parts are tangent to each other at the bottom but separates quickly (!).

Both the top-left and top-right of Upsilon are curvy. It is thus different from r or gamma.

The letter upsilon is not commonly used. If it is used, its bottom is round instead of being sharp, to distinguish it from the italic style of v.

### S, s

They are cursed, but not as cursed as O and o.

In the italic style of S and s, the bottom-left is to the left of the top-left. In the roman style, the bottom-left and the top-left are aligned instead.

### t, tau

The roman style of t is a straight cross (no curvy strokes) to distinguish it from the italic style.

The horizontal stroke of multiple f’s and t’s may be connected (ligature). Note that they may only be connected if they are intended to form a word. If they are written together just for mathematical multiplication, there should not be a ligature.

The bottom of tau may either turn to the right or stop straightly. I prefer it turning to the right.

### U

To distinguish from cup (the symbol for set union), add a vertical line at the right of the glyph (for both the roman and italic styles), but the italic style of it does not have a tail.

### W, w, omega

Just like how many people mistakenly recognize nu as v, many people also mistakenly recognize omega as w.

The top-left and top-right of w are the same as those of v for both roman and italic styles.

The letter W is not the same as an upside-down M. For both roman and italic styles, the top of W is wider than the bottom.

There is a right-tail at the top-left of omega. The bottom of omega is round instead of being sharp.

### X, x, chi

There are not as many people who mistakenly recognize chi as x as there are for nu and omega, but there are still many.

It is a little hard to distinguish the roman and italic styles of X. First, the top-right-to-bottom-left stroke of X is longer in the italic style to embody the feel of slant. Also, in the italic style, the top-left of X is to the left of the bottom-left. These should be enough to distinguish it from the roman style. Note that the italic style of X is a little different from the italic styles of other letters in that the top-left-to-bottom-right stroke is not nearly vertical (because otherwise it would look strange).

The italic style of x has a left-tail at the top-left and a right-tail at the bottom-right. The bottom-left and the top-right do not have tails (for convenience). Write x as a cross instead of two C-curves tangent to each other (I know some people write it like that).

The top-left of chi has a left-tail, and the bottom-right has a right-tail. The bottom-left of chi has a right-tail (!), which is the main feature to distinguish it from x. Also, note that chi is a descender, and the intersection of the two strokes is at the baseline.

### Y, y

Write Y in three strokes.

Write the roman style of y in two strokes, both of which are straight. The italic style of y is the same as the italic style of u, but the tail at the bottom-right is changed into a descender like that of g.

### 2, Z, z

Some people add a short stroke in the middle of z (I used to do that) or add a descender at the bottom like that of g to distinguish it from 2. I use neither of them because the sharp turning corner at the top-right of z is enough to distinguish it from 2.

The top and bottom of Z are aligned in the roman style, but the top is a little bit offset to the left of the bottom in the italic style.

The bottom of the italic style of z is written like a tilde.

### epsilon

In Greek, there are two glyphs for epsilon, one of which is called the lunate epsilon or the uncial epsilon $\epsilon$, and the other $\varepsilon$ does not have a name but I like to call it varepsilon (because the command for the glyph in $\LaTeX$ is \varepsilon).

Use varepsilon. Never use the lunate epsilon because it confuses with the set membership symbol.

### Theta, theta

Write Theta as wide as O, and do not make the stroke in the middle touch either side. Tilt theta a bit. Because we do not use italic uppercase Greek letters and roman lowercase Greek letters, Theta and theta should be distinguishable enough.

### Lambda, Omega

I have never imagined someone would write Omega that looks very similar to Lambda, but there are people like that. They are very different! OK?

### Phi, phi

In Greek, there are two glyphs for phi, the loopy / open one $\varphi$ or the stroked / closed one $\phi$. Just stick to the loopy one and forget about the stroked one so that we can distinguish it from Phi.

Some sources say that we should use the stroked one for the golden ratio. Just forget about that. I never use the letter to represent the golden ratio.

### Psi, psi

The tops of the two strokes of Psi are at the same level.

The top of the middle stroke of psi is a little bit higher than the top of the other stroke. There is a left-tail at the top-left of psi. There is a left-tail at the bottom (descender) of psi (!).

## Blackboard

We only need to write blackboard style for uppercase English letters. Generally, we just add one or two strokes to the roman style of the letters to make them blackboard style. The general rules are as follows:

• If there are multiple vertical strokes, add a vertical stroke next to each of them, and we are done.
• Otherwise, if there is a non-horizontal stroke that starts from the top-left, add a stroke next to it.
• Otherwise, if the leftmost stroke is a curve that spans from top to bottom, add a vertical stroke on the inside, next to the leftmost part of the curve.
• Otherwise, this is a special letter!

There are some special letters as well as some exceptions to the general rules listed below.

### A

Add a stroke next to the leftmost stroke.

### J

It does not contain a vertical stroke, but we regard the right part of the stroke as one vertical stroke.

### S

Add two short vertical strokes to the inner of the leftmost part curve and the rightmost part curve.

### W

It would be strange if we only added one additional stroke. I want to add two to make it look like a double V (actually, it indeed should be one).

### Y

Add a stroke next to the top-left stroke and a stroke next to the bottom stroke.

### Z

Add a stroke next to the middle part of the stroke.

## Calligraphic and script

Different from roman style, some uppercase letters in calligraphic and script styles are descenders. The descenders are: G, J, Q, Y. Some people possibly write F, H (less likely), P, and Z as descenders as well, but I do not.

As for details, I am tired of explaining for each letter. Just look at the image before.

## Fraktur

This is the trickiest style. You may think it is hard to write in the Fraktur style when you look at how $\LaTeX$’s default typeface renders it. Actually, it indeed is, but that typeface is not intended for handwriting. I recommend writing the letters as shown in the image (ignore the final line because we do not need it). They look very distinguishable.


“The women’s restroom has never radiated the light of wisdom the way it does now.”

“Did you leave your clever little brain in there?”

“Silly, there is a mistake here. I remember I wrote it down for you…”

“But I clearly remember writing it here. Where did it go…” came the sound of Xiaoming flipping through his draft notebook.
