Classical vs. quantum statistical mechanics

Previously, I have written two blog articles (part 1 about thermal ensembles and part 2 about non-thermal ensembles) about a formalism of statistical ensembles. I will be using it as the formalism of classical statistical mechanics in this article.

In that formalism, the space of microstates of a system is a measure space $\mathcal M$, and the physical meaning of the measure is the number of microstates. A macrostate is described by the extensive quantities, which are functions of the microstate, so designating a macrostate restricts the microstates that can realize it to a subset of $\mathcal M$.

A state of the system is a probability density function $p$ on $\mathcal M$, whose physical meaning is an ensemble of microstates. The macroscopically measured extensive quantities of the system are defined to be ensemble averages, i.e., the measured value of $A:\mathcal M\to\mathbb R$ is $\int_{\mathcal M}pA$. Generally, any probability density function is a perfectly valid state, but the most important ones are the thermal equilibrium states, including the microcanonical ensembles, the thermal ensembles, and the non-thermal ensembles. The terms “thermal” and “non-thermal” for ensembles are my own, but for most practical purposes, we only need to focus on thermal ensembles (because both canonical ensembles and grand canonical ensembles are thermal ensembles).

To avoid subtleties about measure theory and topology, in this article, we will only use counting measure and discrete spaces for the space of microstates and the space of extensive quantities.

Possible confusion of macrostate vs. state and an example

In this article, a macrostate is a tuple of extensive quantities (usually the energy, the volume, and the number of particles) that constrain the microstates. In classical statistical mechanics, every microstate has a definite macrostate. Technically, any function on the microstates may be defined as the macrostates of the system (as long as it meets some measure-theoretic requirements).

On the other hand, a state is an ensemble of microstates. In classical statistical mechanics, it is a probability distribution on the microstates. Any probability density function on the microstates is a state of the system.

These two concepts are clearly distinct in the context of this article, but they are often confused in the literature.

For example, consider the system $\mathcal M=\{0,1,2,3\}$, which has three different macrostates $E=\{0,1,2\}$. Then, we can define the set of microstates that realize the macrostate $0$ to be $M_0=\{0\}$, and similarly we can define $M_1=\{1,2\}$ and $M_2=\{3\}$. This completes the definition of the macrostates of the system.

Now, let’s see what states we can define. Although the system has only $4$ different microstates, it has infinitely many states because any probability distribution on the microstates is a state, which may be specified by the probabilities $p_0,p_1,p_2,p_3$ that sum to $1$, each representing the probability of the corresponding microstate. For example, $$p_0=\frac12,\quad p_1=\frac12,\quad p_2=p_3=0$$ is a perfectly valid state of the system. However, to find the thermal equilibrium state for a certain macrostate, we can use the equal a priori probability principle to find the microcanonical ensemble. For example, the microcanonical ensemble for the macrostate $0$ is $$p_0=1,\quad p_1=p_2=p_3=0,$$ and the microcanonical ensemble for the macrostate $1$ is $$p_1=p_2=\frac12,\quad p_0=p_3=0.$$

Now let’s consider $E$ as a subset of $\mathbb R$ so that we can do arithmetic on $E$ (of course, these are called extensive quantities for a reason). We can then define a thermal ensemble given the value of the intensive variable, say, $1$: $$p_0=\frac1Z,\quad p_1=p_2=\frac{\mathrm e^{-1}}Z,\quad p_3=\frac{\mathrm e^{-2}}Z,$$ where $Z=1+2\mathrm e^{-1}+\mathrm e^{-2}$ is the partition function.
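The toy system above is small enough to check by hand, but a short script makes the distinction between macrostates and states concrete. The following is only a sketch in Python; the names `energy`, `microcanonical`, and `thermal` are made up for this illustration and do not come from the earlier articles.

```python
import math

# Toy system from the text: microstates 0..3 and the macrostate (value of E)
# that each microstate realizes.
energy = {0: 0, 1: 1, 2: 1, 3: 2}


def microcanonical(E):
    """Equal a priori probabilities over the microstates realizing macrostate E."""
    members = [m for m, e in energy.items() if e == E]
    return {m: (1 / len(members) if m in members else 0.0) for m in energy}


def thermal(intensive):
    """Thermal ensemble: weights exp(-intensive * E), normalized by Z."""
    weights = {m: math.exp(-intensive * e) for m, e in energy.items()}
    Z = sum(weights.values())
    return {m: w / Z for m, w in weights.items()}, Z


p_micro = microcanonical(1)      # p_1 = p_2 = 1/2, the others vanish
p_thermal, Z = thermal(1.0)      # intensive variable equal to 1
mean_E = sum(p_thermal[m] * energy[m] for m in energy)  # ensemble average of E
print(p_micro, p_thermal, Z, mean_E)
```

Both outputs are states (probability distributions on the four microstates), while the argument of `microcanonical` is a macrostate.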

I would like to give an example of a non-thermal ensemble, but it is only non-trivially defined if the space of extensive quantities is at least two-dimensional (i.e., if $E$ lives in $\mathbb R^2$ instead of $\mathbb R$), so I will omit it here.

However, in quantum mechanics, things are different because of the introduction of superpositions of states. For superpositions to make sense, the space of microstates must be endowed with a vector space structure. By the principles of quantum mechanics, it is the projective space of a separable Hilbert space $\mathcal H$. A state of the system is then a density operator $\rho$ on $\mathcal H$, which can be any positive semi-definite self-adjoint operator with trace $1$. This is quite different from a state in the classical case because we cannot simply interpret a density operator as an ensemble of microstates. Generally, different ensembles can realize the same density operator. All those different ensembles are equally physically valid (without further context) due to the Schrödinger–HJW theorem.
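To see that one density operator can come from different ensembles, here is a minimal sketch for a single qubit. The two ensembles (equal mixtures of the computational basis and of the $\pm$ superpositions) are my choice for the illustration, and the helper `density` only handles real state vectors to keep the code short.

```python
import math


def density(ensemble):
    """Density matrix sum_i p_i |v_i><v_i| for real, normalized vectors v_i."""
    dim = len(ensemble[0][1])
    rho = [[0.0] * dim for _ in range(dim)]
    for prob, vec in ensemble:
        for i in range(dim):
            for j in range(dim):
                rho[i][j] += prob * vec[i] * vec[j]
    return rho


s = 1 / math.sqrt(2)
# Ensemble 1: |0> and |1>, each with probability 1/2.
ensemble_z = [(0.5, [1.0, 0.0]), (0.5, [0.0, 1.0])]
# Ensemble 2: (|0>+|1>)/sqrt(2) and (|0>-|1>)/sqrt(2), each with probability 1/2.
ensemble_x = [(0.5, [s, s]), (0.5, [s, -s])]

rho_z = density(ensemble_z)
rho_x = density(ensemble_x)
same = all(
    math.isclose(rho_z[i][j], rho_x[i][j], abs_tol=1e-12)
    for i in range(2)
    for j in range(2)
)
print(rho_z, rho_x, same)
```

Both ensembles give half the identity matrix, so no measurement of $\operatorname{Tr}\rho A$ can tell them apart.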

Extensive quantities are self-adjoint operators on $\mathcal H$. This leads to a key difference between classical and quantum statistical mechanics: in quantum statistical mechanics, a microstate generally does not have a definite macrostate, except when it is an eigenstate of all the extensive quantities. However, we can still define macroscopically measured extensive quantities for any state of the system, namely $\operatorname{Tr}\rho A$ for any self-adjoint operator $A$.

The fact that only the states in the common eigenspaces of all the extensive quantities have a definite macrostate poses a challenge for defining the microcanonical ensemble (to clarify, I am referring to the density operator, which does not define a particular ensemble, but I am still using “microcanonical ensemble” to refer to that state). It may not be possible to define a microcanonical ensemble for every possible combination of values of the extensive quantities (in their spectra). In practice, one would restrict to mutually commuting operators as the extensive quantities. Then, the microcanonical ensemble density operator is the projection operator onto the common eigenspace (properly normalized to have trace $1$). The statistical mechanics is then actually equivalent to classical statistical mechanics (namely, taking the eigenbasis of the extensive quantities as the classical space of microstates)! Unfortunately, this is not the approach typically used in practice because it is not always practical to find the eigenbasis.

To avoid mathematical subtleties, we will mostly only consider finite-dimensional Hilbert spaces.

Here is a summary table:

| | Classical | Quantum |
|---|---|---|
| Space of microstates | Measure space $\mathcal M$ | Projective space of a separable Hilbert space $\mathcal H$ |
| State | Probability density function $p$ on $\mathcal M$ | Density operator $\rho$ on $\mathcal H$ |
| Extensive quantities | Functions on $\mathcal M$ | Self-adjoint operators on $\mathcal H$ |
| Measured value of $A$ | $\int_{\mathcal M}pA$ | $\operatorname{Tr}\rho A$ |

Many-body systems

We then want to ask: if the space of microstates for one particle is $\mathcal M$ (or $\mathcal H$), what is the space of microstates for many particles? The answer depends on whether the particles are distinguishable particles, indistinguishable fermions, or indistinguishable bosons.

There are two aspects in which fermions and bosons contrast with each other. One is their symmetry properties: fermions are antisymmetric under exchange of particles, and bosons are symmetric. The other is their statistical properties: fermions obey the Pauli exclusion principle, and bosons do not. The second property naturally leads us to work with Fock states, which can be derived from the first property after second quantization. In this article, a third kind of particles, distinguishable particles, will also be considered. They are neither symmetric nor antisymmetric under exchange of particles; instead, exchanging particles gives a new state.

The whole idea of these different kinds of particles is very easy to describe in quantum mechanics. If the microstates of each particle live on $\mathcal H$, then the microstates of many distinguishable particles live on the tensor algebra $T(\mathcal H)$; those of many bosons live on the symmetric algebra $S(\mathcal H)$; and those of many fermions live on the exterior algebra $\bigwedge(\mathcal H)$. Those spaces are called Fock spaces. They are naturally graded, so the particle number operator can be defined by declaring the grade-$N$ subspace of the Fock space to be the eigenspace associated with the eigenvalue $N$.

Taking ideas from the Fock basis in quantum mechanics, we can similarly discuss those different kinds of particles in classical statistical mechanics. If the microstates of each particle are $\mathcal M$, then the microstates of many distinguishable particles are tuples $\bigcup_{N\in\mathbb N}\mathcal M^N$; those of many bosons are finite multisets in the universe $\mathcal M$, i.e., $\left\{m:\mathcal M\to\mathbb N\,\middle|\,\sum m<\infty\right\}$; and those of many fermions are finite subsets of $\mathcal M$, i.e., $\mathcal P_{<\aleph_0}(\mathcal M)$. Those concepts, namely tuple, multiset, and set, are common mathematical constructs used in combinatorics. They all have a natural notion of size, which we define the number of particles to be.

I previously stated that there is an equivalence between quantum and classical statistical mechanics. For the equivalence to hold, the dimension of the $N$-particle subspace in the Fock space (when it is finite) must be the same as the cardinality of the set of $N$-particle microstates in the classical case, and this is indeed the case. Assume that $\dim\mathcal H=\operatorname{card}\mathcal M=M$; then both the dimension of the subspace of $N$ distinguishable particles and the number of classical microstates of $N$ distinguishable particles are $M^N$. This number for bosons is $M^{\overline N}/N!$, and that for fermions is $M^{\underline N}/N!$, where $$M^{\overline N}:=\prod_{M\le k<M+N}k,\quad M^{\underline N}:=\prod_{M-N<k\le M}k$$ are called the rising factorial power and the falling factorial power respectively. These are the numbers of ways to put $N$ balls into $M$ boxes under three different rules.
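The three counts can be checked by brute-force enumeration for small $M$ and $N$. This is a sketch using Python’s standard `itertools`; the function names `rising`, `falling`, and `counts` are made up for the illustration.

```python
from itertools import combinations, combinations_with_replacement, product
from math import factorial, prod


def rising(M, N):
    """Rising factorial power: product of k for M <= k < M + N."""
    return prod(range(M, M + N))


def falling(M, N):
    """Falling factorial power: product of k for M - N < k <= M."""
    return prod(range(M - N + 1, M + 1))


def counts(M, N):
    """Enumerate N-particle microstates over M single-particle microstates."""
    states = range(M)
    tuples = sum(1 for _ in product(states, repeat=N))                    # distinguishable
    multisets = sum(1 for _ in combinations_with_replacement(states, N))  # bosons
    subsets = sum(1 for _ in combinations(states, N))                     # fermions
    return tuples, multisets, subsets


M, N = 4, 3
tuples, multisets, subsets = counts(M, N)
print(tuples, M**N)
print(multisets, rising(M, N) // factorial(N))
print(subsets, falling(M, N) // factorial(N))
```

Each printed pair should agree. Note that `falling(M, N)` vanishes when $N>M$, which is the Pauli exclusion principle in counting form.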

Here is a summary table:

| | Distinguishable | Bosons | Fermions |
|---|---|---|---|
| Quantum | Tensor algebra | Symmetric algebra | Exterior algebra |
| Classical | Tuple | Multiset | Set |
| Number | $M^N$ | $M^{\overline N}/N!$ | $M^{\underline N}/N!$ |

Labels on distinguishable particles

To explain this, I may actually need to explain the mathematical definition of a tuple. My personal favorite definition of a tuple is nested ordered pairs, with Kuratowski’s definition of an ordered pair. However, for the purpose of this illustration, I will use another definition, which defines a tuple as a function from a finite von Neumann ordinal to the set of elements ($\mathcal M$ in this case), and a function is defined using its graph. There is a notational advantage of this definition in that, if we also define natural numbers as von Neumann ordinals (which is a common practice in set theory), it unifies the notation of the Cartesian power and the set of functions (in other words, we can identify $\mathcal M^N$ with $N\to\mathcal M$).

With this definition, we can see that a microstate of $N$ distinguishable particles is a function from their labels to the single-particle microstates, and the labels are always the first $N$ natural numbers. The point is that, if a particle is removed or added, the labels will be rearranged so that the labels are always the first $N$ natural numbers.

This should concern you in that the operation of rearranging labels makes each label no longer unique to each particle. For example, say, initially, the system has two particles with labels $0$ and $1$, and they are in single-particle microstates $m_0$ and $m_1$ respectively. It is then allowed to exchange particles with a bath. If particle $1$ moves from the system to the bath while another particle from the bath moves to the system, then the two particles in the system after the exchange will still have labels $0$ and $1$, but they are not the same particles as before. Namely, particle $1$ is not the same particle $1$ as before. If the two new particles are in single-particle microstates $m_0$ and $m_1$ respectively just as before, then this new state will be regarded as the same state as the initial state, which should not be true because the particles are different from before.

Therefore, to avoid the subtlety of the labels, maybe it is better to consider microstates of many distinguishable particles directly as functions from the set of particles to the single-particle microstates, without attaching labels to the particles. However, this means that as long as the system is allowed to exchange particles with a bath, which, by definition, has a large number of particles compared to the system, the number of microstates in the system will be drastically increased. It would then be impossible to use the grand canonical ensemble to describe the system because you will find that the average number of particles in the system would depend on the number of particles in the bath, which is very absurd.

From this, we can see that the idea that every particle is distinguishable is inherently flawed, i.e., it can only be self-consistent with the unphysical operation of rearranging labels. This hints that, either the a priori probability principle is not applicable in this case, or there are only a few distinguishable types of particles in any practical cases.

Gibbs factor and entropy

Gibbs put the famous factor of $1/N!$ in front of the phase space integral of the ideal gas to make the entropy asymptotically linear in $N$. People often interpret this as accounting for the indistinguishability of particles so that the result of the classical treatment can match that of the quantum treatment.

Actually, the effect of the Gibbs factor may not be as important as you imagine. In the microcanonical and the canonical ensemble, the Gibbs factor is just an overall factor for the partition function. The only effect is that the chemical potential would not be intensive and that the entropy would not be extensive without it, but there is no actual physical consequence of this because we cannot measure the entropy and the chemical potential in experiments anyway. In the grand canonical ensemble, the distribution of the number of particles is expected to be different with or without the Gibbs factor. However, at least for the ideal gas example (or more generally, for models with a quadratic Hamiltonian), the equipartition theorem and the ideal gas law would still hold without the Gibbs factor. Consider the grand canonical partition function of the ideal gas, with or without the Gibbs factor: $$\Xi_1:=\sum_N\frac1{N!}\left(\frac{\mathrm e^{\beta\mu}V}{\lambda^d}\right)^N,\quad \Xi_2:=\sum_N\left(\frac{\mathrm e^{\beta\mu}V}{\lambda^d}\right)^N,$$ where $\lambda:=\sqrt{\beta h^2/2\pi m}$. If you spend the time to actually do the calculation, you get the desired $pV=\langle N\rangle/\beta$ and $\langle E\rangle=d\langle N\rangle/2\beta$, whether you include the Gibbs factor or not. The entropy and the chemical potential would indeed change drastically with the introduction of the Gibbs factor, but they are not actually measurable quantities in experiments.

In case you feel this too magical

Let’s do this calculation. The calculation with $\Xi_1$ is standard in textbooks, so I will skip it. For $\Xi_2$, we have $$\Xi_2=\frac1{1-\mathrm e^{-\alpha}V/\lambda^d},$$ where $\alpha:=-\beta\mu$. Notice that the series converges only if $\mathrm e^{-\alpha}V/\lambda^d<1$, but this does not matter because we only need to consider those values of $\alpha$ that make it converge. Then, $$\langle N\rangle=-\frac{\partial}{\partial\alpha}\ln\Xi_2=\frac{\mathrm e^{-\alpha}V/\lambda^d}{1-\mathrm e^{-\alpha}V/\lambda^d},$$ $$\langle E\rangle=-\frac{\partial}{\partial\beta}\ln\Xi_2=\frac d{2\beta}\frac{\mathrm e^{-\alpha}V/\lambda^d}{1-\mathrm e^{-\alpha}V/\lambda^d}=\frac d{2\beta}\langle N\rangle.$$ Therefore, it works out. You may have noticed that $\langle N\rangle$ and $\langle E\rangle$ do not seem to be proportional to $V$, but this is fine because $\alpha$ is not intensive. Now, for the ideal gas law, we have $$p=\frac1\beta\frac{\partial}{\partial V}\ln\Xi_2=\frac1\beta\frac{\mathrm e^{-\alpha}/\lambda^d}{1-\mathrm e^{-\alpha}V/\lambda^d}=\frac{\langle N\rangle}{\beta V}.$$ Therefore, it works out.
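If you would rather not do the derivatives by hand, the same check can be done numerically with finite differences. This is only a sketch: the units $h=m=1$, the dimension $d=3$, and the sample point $(\beta,\alpha,V)=(1,3,1)$ are arbitrary choices made so that the geometric series converges.

```python
import math

D = 3        # spatial dimension d (assumed for the illustration)
H = 1.0      # Planck constant in arbitrary units (assumed)
MASS = 1.0   # particle mass in arbitrary units (assumed)


def ln_xi2(beta, alpha, V):
    """ln of the grand partition function without the Gibbs factor."""
    lam = math.sqrt(beta * H**2 / (2 * math.pi * MASS))  # thermal wavelength
    x = math.exp(-alpha) * V / lam**D
    if x >= 1:
        raise ValueError("the geometric series does not converge here")
    return -math.log(1 - x)


def derivative(f, x, step=1e-6):
    """Central finite difference approximation of f'(x)."""
    return (f(x + step) - f(x - step)) / (2 * step)


beta, alpha, V = 1.0, 3.0, 1.0
mean_N = -derivative(lambda a: ln_xi2(beta, a, V), alpha)
mean_E = -derivative(lambda b: ln_xi2(b, alpha, V), beta)
pressure = derivative(lambda v: ln_xi2(beta, alpha, v), V) / beta

print(pressure * V, mean_N / beta)        # ideal gas law
print(mean_E, D * mean_N / (2 * beta))    # equipartition
```

Each printed pair should agree up to the finite-difference error.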

In fact, you can multiply the summand by any (sensible) function of $N$ without spoiling these state equations, but this is specific to the ideal gas. The reason behind this is the strict extensivity of $E$ and $N$.

Let’s just consider the general case for now. Assume that the canonical partition function is $f(N)Z(\beta,N,V)$, where $f(N)$ is the Gibbs factor, which can actually be any non-trivial function you like. Then, the average energy in the canonical ensemble is $$\langle E\rangle_Z=-\frac{\partial}{\partial\beta}\ln\left(f(N)Z(\beta,N,V)\right)=-\frac1{Z(\beta,N,V)}\frac{\partial}{\partial\beta}Z(\beta,N,V)=u(\beta,N/V)N,$$ where $u(\beta,N/V)$ cannot depend on any extensive quantities (here, the only things that it can depend on are the inverse temperature $\beta$ and the particle number density $N/V$). The last step holds because both $E$ and $N$ are extensive quantities (so they must be proportional to each other). Notice that this requires the thermodynamic limit unless we are considering the ideal gas, where the extensivity is exact. Therefore, $$\frac{\partial}{\partial\beta}Z(\beta,N,V)=-u(\beta,N/V)NZ(\beta,N,V).\tag{1}$$

Particularly, for ideal gases, $u(\beta,N/V)$ only depends on $\beta$, with no $N/V$ dependence. For more general cases, it is reasonable to assume that $u(\beta,n)$ can be expanded in a power series of $n$: $$u(\beta,n)=\sum_{k=0}^\infty u_k(\beta)n^k.$$

Then, let’s define the grand canonical partition function to be $$\Xi(\beta,\alpha,V):=\sum_Nf(N)Z(\beta,N,V)\mathrm e^{-\alpha N}.$$ Then, the average energy in the grand canonical ensemble is $$\langle E\rangle_\Xi=-\frac{\partial}{\partial\beta}\ln\Xi(\beta,\alpha,V)=-\frac1{\Xi(\beta,\alpha,V)}\sum_Nf(N)\frac{\partial Z(\beta,N,V)}{\partial\beta}\mathrm e^{-\alpha N}.$$ Substitute Equation 1, and then we get $$\langle E\rangle_\Xi=\sum_k\frac{u_k(\beta)}{V^k}\langle N^{k+1}\rangle_\Xi.$$ For the ideal gas, only the $k=0$ term is nonzero, so we recover $$\langle E\rangle_\Xi=u(\beta,\langle N\rangle_\Xi/V)\langle N\rangle_\Xi.\tag{2}$$ In the more general case, for this to be true, we need to require that $$\frac{\langle N^k\rangle_\Xi}{\langle N\rangle_\Xi^k}\to1$$ in the thermodynamic limit. However, this is not true for a general $f(N)$. In fact, it already fails for the $\Xi_2$ example above, which can be easily shown for $k=2$. Notice that $$\frac{\langle N^2\rangle_{\Xi_2}-\langle N\rangle_{\Xi_2}^2}{\langle N\rangle_{\Xi_2}^2}=\frac{\frac{\partial^2}{\partial\alpha^2}\ln\Xi_2(\beta,\alpha,V)}{\left(\frac{\partial}{\partial\alpha}\ln\Xi_2(\beta,\alpha,V)\right)^2}=\frac1{\mathrm e^{-\alpha}V/\lambda^d}=1+\frac1{\langle N\rangle_{\Xi_2}}\to1.$$ Therefore, $$\frac{\langle N^2\rangle_{\Xi_2}}{\langle N\rangle_{\Xi_2}^2}\to2.$$ This makes Equation 2 not true if $u_1(\beta)$ is non-trivial. The deeper reason behind this disagreement is that the extensivity of the characteristic functions (in this case, the Helmholtz energy and the grand potential) is required for the thermodynamic equivalence between different ensembles (in this case, the canonical ensemble and the grand canonical ensemble). I will cover this in more detail later in this article.
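The contrast between the two particle-number distributions can also be seen numerically. In the sketch below, `z` and `x` both stand for the fugacity-like combination $\mathrm e^{-\alpha}V/\lambda^d$; the values and the truncation points of the sums are arbitrary choices made for the illustration.

```python
import math


def moment_ratio(log_weight, n_max):
    """Return (<N>, <N^2>/<N>^2) for unnormalized weights exp(log_weight(N)), N < n_max."""
    logs = [log_weight(n) for n in range(n_max)]
    shift = max(logs)  # subtract the maximum to avoid overflow in exp
    weights = [math.exp(value - shift) for value in logs]
    norm = sum(weights)
    mean = sum(n * w for n, w in enumerate(weights)) / norm
    mean_sq = sum(n * n * w for n, w in enumerate(weights)) / norm
    return mean, mean_sq / mean**2


# With the Gibbs factor: weights z^N / N! (a Poisson distribution).
z = 1000.0
mean_1, ratio_1 = moment_ratio(lambda n: n * math.log(z) - math.lgamma(n + 1), 5000)

# Without the Gibbs factor: weights x^N (a geometric distribution), x close to 1.
x = 0.999
mean_2, ratio_2 = moment_ratio(lambda n: n * math.log(x), 60000)

print(mean_1, ratio_1)  # ratio close to 1 for large <N>
print(mean_2, ratio_2)  # ratio close to 2 for large <N>
```

Both distributions have a mean of about a thousand particles here, yet only the one with the Gibbs factor has relative fluctuations that die out.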

This then raises questions. Does the entropy have to be linear in $N$? In other words, does the entropy need to meet the traditional sense of extensivity? Does physics actually care about our definition of the entropy? The answer to these questions is actually no. The entropy is not something that we can directly measure in experiments, and there is some freedom in the definition of the entropy that does not affect any physical outcomes.

Now, recall that the Gibbs factor accounts for the indistinguishability of particles. This would mean that whether the particles are actually distinguishable or not does not matter to the actual physics. Gas particles in real life may well be distinguishable. For example, chlorine has two stable isotopes that naturally occur with considerable abundance, and that does not make it substantially different from, say, fluorine, which has only one stable isotope. Maybe people will also find observable features in fluorine molecules that would make them distinguishable, who knows? That would not invalidate any of the experimentally tested thermodynamic theories that can be applied to fluorine today.

Therefore, the Gibbs factor should not be introduced for the sole purpose of accounting for the indistinguishability of particles. It is introduced to make the entropy traditionally extensive. However, as I already stated, this is not necessary for the actual physics, so why is it important to make the entropy extensive? The answer is that, otherwise, the free energy (be it the Helmholtz energy or the Gibbs energy) would not be extensive. The free energy measures the work that can be extracted from the system, and by this nature it must be extensive because energy is additive. Therefore, only when we define the entropy in a way such that it is extensive can the derived free energy measure the extractable work.

Having the idea that the free energy measures the amount of work that can be extracted from the system, we would then think we are able to extract some work out of the process of mixing two distinguishable gases. This is because distinguishability gives rise to a mixing entropy, which is the whole reason why it makes the entropy fail to be traditionally extensive. On the other hand, as I stated, whether we regard the two gases as distinguishable or not in theory does not matter to the actual physics. However, the amount of work that can be extracted from the process of mixing two gases is very physical by any means. To resolve this, the take is that, if it is possible to extract work from mixing them in one’s theory, then it should also be possible to distinguish the gases in that theory. On the other hand, if the two gases are indistinguishable in one’s theory, then it is impossible to extract work from mixing them in that theory. Therefore, it actually does not matter whether the gases are “in reality” distinguishable or not; the theory is able to make itself consistent. The texts about the mixing paradox on Wikipedia explain this idea, which is a gist of the paper (which unfortunately did not talk about the grand canonical ensemble in detail).

Another reason why it is important for the entropy to be extensive is that only then can different ensembles be thermodynamically equivalent. Thermodynamic equivalence is the property that the thermodynamic properties determined from the characteristic functions (e.g., entropy, Helmholtz energy, and grand potential) of different statistical ensembles are the same in the thermodynamic limit. Extensivity is not a sufficient condition, though, because we also need to require that the entropy is a concave function of the extensive quantities. There is a good paper that explains the equivalence and nonequivalence of ensembles in detail, assuming the characteristic functions are always extensive. The main idea is that, for any statistical ensemble, the probability measure on the space of macrostates, parametrized by the particle number $N$, satisfies the large deviation principle with the rate function being the characteristic function. With the concavity condition, using a generalization of Laplace’s method, it can then be proven that the characteristic functions of different ensembles are related as being the Legendre transform of each other.

Simplified sketch

I am writing this because before I read the paper, I independently came up with the same idea of using Laplace’s method to prove the equivalence of ensembles. I wrote it on Zhihu, and here is a translation of it.

Assume that the extensive quantity of the system is $E$ and that the corresponding intensive quantity is $I$. Suppose that the partition function of the $E$-ensemble is $\Omega(E)$; then the characteristic function of the $E$-ensemble is $S(E):=\ln\Omega(E)$, and we have $I=S'(E)$ (the prime denotes the derivative) in the thermal equilibrium state with fixed $E$.

On the other hand, the partition function $Z(I)$ of the $I$-ensemble is the Laplace transform of $\Omega(E)$: $$Z(I)=\int\Omega(E)\mathrm e^{-IE}\,\mathrm dE=\int\mathrm e^{S(E)-IE}\,\mathrm dE.$$ We have the characteristic function $F(I):=-\ln Z(I)$ of the $I$-ensemble, and we have $E=F'(I)$ in the thermal equilibrium state with fixed $I$.

The question now is whether $I=S'(E)$ and $E=F'(I)$ are actually the same equation. In other words, are $S'$ and $F'$ inverse functions of each other? If they are, then we get the same results from the $E$-ensemble and the $I$-ensemble. Nevertheless, generally they are not. We just need one counterexample to show that: for a system with a quadratic Hamiltonian, let $E$ be the energy; then its corresponding intensive quantity $I$ is the inverse temperature (in this case, the $E$-ensemble is the microcanonical ensemble, and the $I$-ensemble is the canonical ensemble), and we have $$\Omega(E)\propto E^{n/2},\quad S'(E)=\frac{n/2}E,\quad F'(I)=\frac{1+n/2}I,$$ where $n$ is the number of quadratic terms in the Hamiltonian (e.g., $n=3N$ for the classical monatomic ideal gas).

However, we can see that, in the thermodynamic limit $n\to\infty$, $S'$ and $F'$ indeed become inverse functions of each other. We can then conjecture that, in the thermodynamic limit, different ensembles give the same result. Now, what is the thermodynamic limit? We may think of multiplying the extensive quantities by a zooming factor $\lambda$ and letting $\lambda\to\infty$ as the thermodynamic limit. A good characteristic function should also be extensive in the thermodynamic limit, so $S(\lambda E)\approx\lambda S(E)$. Therefore, we define $$S_\lambda(E):=S(\lambda E),\quad Z_\lambda(I):=\int\mathrm e^{S_\lambda(E)-I\lambda E}\,\mathrm dE\approx\int\mathrm e^{\lambda\left(S(E)-IE\right)}\,\mathrm dE.$$ When $\lambda\to\infty$, use Laplace’s method to get (assuming that $S$ is a concave function) $$Z_\lambda(I)\approx\sqrt{\frac{2\pi}{\lambda\left|S''\!\left(S'^{-1}(I)\right)\right|}}\,\mathrm e^{\lambda\left(S\left(S'^{-1}(I)\right)-IS'^{-1}(I)\right)},$$ and thus (only keeping the highest order term in $\lambda$) $$F'_\lambda(I):=-\frac\partial{\partial I}\ln Z_\lambda(I)\approx\lambda S'^{-1}(I)\approx S'^{-1}_\lambda(I),$$ where $S'_\lambda(E):=\mathrm dS(\lambda E)/\mathrm d(\lambda E)=S'(\lambda E)$ (instead of simply the derivative of $S_\lambda$). This is indeed our expected result: $S'_\lambda$ and $F'_\lambda$ are inverse functions of each other.
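For the quadratic-Hamiltonian counterexample, both claims can be checked numerically: $S'\circ F'$ approaches the identity as $n$ grows, and Laplace’s approximation of $\ln Z(I)$ approaches the exact value. The sketch below assumes the proportionality constant in $\Omega(E)\propto E^{n/2}$ is $1$, so that $Z(I)=\Gamma(n/2+1)/I^{n/2+1}$ exactly; the function names are made up for the illustration.

```python
import math


def s_prime(E, n):
    """S'(E) for Omega(E) = E^(n/2)."""
    return (n / 2) / E


def f_prime(I, n):
    """F'(I) for Z(I) = Gamma(n/2 + 1) / I^(n/2 + 1)."""
    return (1 + n / 2) / I


def log_z_exact(I, n):
    k = n / 2
    return math.lgamma(k + 1) - (k + 1) * math.log(I)


def log_z_laplace(I, n):
    """Laplace's method applied to the integrand exp(S(E) - I E)."""
    k = n / 2
    e_star = k / I                   # stationary point, where S'(E) = I
    s_value = k * math.log(e_star)   # S at the stationary point
    s_second = -k / e_star**2        # S'' at the stationary point
    return 0.5 * math.log(2 * math.pi / abs(s_second)) + s_value - I * e_star


I = 2.0
for n in (2, 20, 200, 2000):
    round_trip = s_prime(f_prime(I, n), n) / I   # equals 1 if S' and F' are inverses
    gap = log_z_exact(I, n) - log_z_laplace(I, n)
    print(n, round_trip, gap)
```

The round-trip ratio equals $(n/2)/(1+n/2)$, and the gap in $\ln Z$ shrinks roughly like $1/(6n)$, which is just the leading correction in Stirling’s formula.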

Gibbs factor and indistinguishability

Why can the introduction of the Gibbs factor account for the indistinguishability?

Define $\Omega^0(M,N):=M^N$ to be the number of microstates of $N$ distinguishable particles with $M$ single-particle microstates. Then, define $\Omega_0(M,N):=\Omega^0(M,N)/N!$ to be the version with the Gibbs factor. Also, define $\Omega_\pm(M,N):=M^{\overline{\underline N}}/N!$ for bosons and fermions, where $M^{\overline{\underline N}}$ is $M^{\overline N}$ or $M^{\underline N}$ corresponding to $+$ or $-$ in the notation “$\pm$” respectively.

If we make the distinguishable particles indistinguishable, we have to characterize them as either bosons or fermions. However, $\Omega_0$ and $\Omega_\pm$ are not exactly the same. This discrepancy can be resolved in the large $M$ limit. We have $$\begin{aligned} \Omega_\pm(M,N)&=\frac1{N!}\prod_{k=0}^{N-1}\left(M\pm k\right) =\frac{M^N}{N!}\prod_{k=0}^{N-1}\left(1\pm\frac kM\right)\\ &=\Omega_0(M,N)\left(1\pm\frac{N\left(N-1\right)}{2M}+O\!\left(M^{-2}\right)\right), \end{aligned}$$ where the big-O notation is understood as $N$ fixed and $M\to\infty$. Therefore, to have $\Omega_0\approx\Omega_\pm$, loosely speaking, we need $M\gg N^2$. In this limit, there is no difference between boson statistics and fermion statistics, and both of them are the same as distinguishable particles with the Gibbs factor.
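The expansion can be checked with exact rational arithmetic. This is a small sketch; the function names and the particular values of $M$ and $N$ are chosen only for the illustration.

```python
from fractions import Fraction
from math import factorial, prod


def omega_0(M, N):
    """Distinguishable particles with the Gibbs factor: M^N / N!."""
    return Fraction(M**N, factorial(N))


def omega_plus(M, N):
    """Bosons: rising factorial power divided by N!."""
    return Fraction(prod(range(M, M + N)), factorial(N))


def omega_minus(M, N):
    """Fermions: falling factorial power divided by N!."""
    return Fraction(prod(range(M - N + 1, M + 1)), factorial(N))


N = 5
for M in (10, 100, 10_000, 1_000_000):
    first_order = Fraction(N * (N - 1), 2 * M)
    ratio_plus = omega_plus(M, N) / omega_0(M, N)
    ratio_minus = omega_minus(M, N) / omega_0(M, N)
    # Compare the exact ratios with the first-order estimates 1 +/- N(N-1)/(2M).
    print(M, float(ratio_plus), float(1 + first_order),
          float(ratio_minus), float(1 - first_order))
```

For $M=10$, which is smaller than $N^2=25$, the three counts differ substantially; for $M=10^6$ they agree to about five significant digits.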

Intuitively, if $M$ is very large, then in most of the microstates, each single-particle microstate is occupied by at most one particle, which renders boson statistics and fermion statistics the same. Particularly, if there are infinitely many single-particle microstates, then $M$ is effectively infinite, so $\Omega_0=\Omega_\pm$ is strictly true for any $N$ in this case. This is why the result for the classical ideal gas is exact: there are so many single-particle microstates that the probability for two particles to occupy the same microstate is exactly zero, i.e., such microstates have zero measure.

Classical Fermi gas

I previously said that, to make things simple, the measure on $\mathcal M$ would be the counting measure. One big reason behind that is the difficulty of a purely classical description of the Fermi gas.

In the classical description of a gas, the microstates of each particle are points in the $2d$-dimensional phase space, which is a region in $\mathbb R^{2d}$, and the measure is just the usual Lebesgue measure (or any other practically equivalent measure, for math nerds). Therefore, naturally, the microstates of many particles with particle number $N$ would be a region in $\mathbb R^{2dN}$, also equipped with the usual Lebesgue measure.

If the gas consists of fermions, then in any microstate, two particles cannot be in the same single-particle microstate. However, the set of all microstates that have two such particles has zero measure in the $2dN$-dimensional phase space. Therefore, it would not matter at all whether the particles are fermions or not in the classical description.

However, we know that is not the actual case. In practice, we divide the single-particle phase space into cells of size $h^d$, where $h$ is the Planck constant, which we put here by hand. No two particles can reside in the same cell. Therefore, any “bulky” region in the single-particle phase space with volume $\Omega$ cannot contain more than $\Omega/h^d$ particles.

This all sounds fine, except that we did not define what a “bulky” region is. Of course, the Fermi sea is a bulky region, but what about a tube that is long enough to connect any specified discrete points in the single-particle phase space but is thin enough to have volume even smaller than $h^d$? In fact, by constructing regions with not-so-exotic shapes, we can arbitrarily make any distribution of particles in the single-particle phase space seem to violate the Pauli exclusion principle or not. As shown in the figure below, particles that are reasonably distributed in different cells may be regarded as being in one cell, while particles that reasonably occupy the same cell may be regarded as being in different cells.

Figure: Regular phase space cells and exotic ones.

There are some possible ways to resolve this issue. One naive way is to stipulate that the cells are arranged in some lattice structure such as the simple cubic lattice. However, this breaks the rotational symmetry in the phase space so that the Fermi sea is not strictly isotropic anymore. Also, the introduction of the lattice structure changes the physics of the system if it is far from the thermodynamic limit. Only in the thermodynamic limit will the particular choice of lattice structure be irrelevant to the physics.

Another way is to consider a phase space density functional theory, where a function $\rho$ is defined on the single-particle phase space, representing the number of particles per unit volume in the phase space. The measure of the number of microstates for the many-body system is then the functional integral of this density function. The Pauli exclusion principle can then be translated into the constraint that the value of $\rho$ must not exceed $1/h^d$ anywhere, which prevents the number of particles in any region of size $h^d$ from exceeding $1$. It can also describe bosons by removing this constraint. I have not explored this approach myself, but I doubt it would be a good idea because it seems like overkill for the problem and will introduce even more mathematical subtleties with the functional integral. Also, more careful analysis must be done to devise the proper measure on the function space to match the usual sense of the number of microstates. Another issue is that it defies the classical notion of particles as clear points but instead treats them as cloudy distributions just like quantum mechanics, and for this very reason it cannot be generalized to describe distinguishable particles.

When $M$ is not very large, using the Gibbs factor is not a correct way to account for indistinguishability. However, it can be corrected, as long as we use $\Omega^\pm(M,N):=M^{\overline{\underline N}}$ instead of $\Omega^0(M,N)$. Then, we would have $\Omega^\pm(M,N)/N!=\Omega_\pm(M,N)$ exactly, corresponding to boson statistics and fermion statistics. There are indeed combinatorics problems of putting distinguishable balls into boxes that result in $\Omega^\pm(M,N)$. Actually, $\Omega^+(M,N)$ is the number of ways to put $N$ distinguishable balls into $M$ boxes with the balls in each box being ordered; $\Omega^-(M,N)$ is the number of ways to put $N$ distinguishable balls into $M$ exclusive boxes (“exclusive” means that each box cannot contain more than one ball).
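These two ball-and-box interpretations can be verified by explicit enumeration for small numbers. The sketch below represents each box as a tuple of ball labels; the function names are made up for the illustration.

```python
from itertools import permutations
from math import prod


def ordered_box_configs(M, N):
    """All ways to put N distinguishable balls into M boxes, balls ordered within each box."""
    configs = {tuple(() for _ in range(M))}  # start with M empty boxes
    for ball in range(N):
        next_configs = set()
        for config in configs:
            for b, box in enumerate(config):
                # Insert the new ball at every possible position inside box b.
                for pos in range(len(box) + 1):
                    new_box = box[:pos] + (ball,) + box[pos:]
                    next_configs.add(config[:b] + (new_box,) + config[b + 1:])
        configs = next_configs
    return configs


def exclusive_box_configs(M, N):
    """All ways to put N distinguishable balls into M boxes, at most one ball per box."""
    # Entry i of each tuple is the box that holds ball i.
    return set(permutations(range(M), N))


M, N = 3, 3
print(len(ordered_box_configs(M, N)), prod(range(M, M + N)))            # rising factorial power
print(len(exclusive_box_configs(M, N)), prod(range(M - N + 1, M + 1)))  # falling factorial power
```

Adding the $k$-th ball (counting from zero) offers $M+k$ insertion positions in the ordered case and $M-k$ free boxes in the exclusive case, which is exactly how the rising and falling factorial powers arise.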

Now, $\Omega^\pm$ represents two more rules under which we put balls into boxes. Together with $\Omega^0$ and $\Omega_\pm$, there are five different rules in total. We can summarize them in a table:

| $N$ balls | $M$ boxes | Number of ways | Particles |
|---|---|---|---|
| Distinguishable | Unordered | $\Omega^0=M^N$ | Distinguishable particles |
| Distinguishable | Ordered | $\Omega^+=M^{\overline N}$ | Bosons (without Gibbs factor) |
| Distinguishable | Exclusive | $\Omega^-=M^{\underline N}$ | Fermions (without Gibbs factor) |
| Indistinguishable | Unordered | $\Omega_+=M^{\overline N}/N!$ | Bosons |
| Indistinguishable | Exclusive | $\Omega_-=M^{\underline N}/N!$ | Fermions |

These are all common enumerative problems of putting balls into boxes in combinatorics. One can extend this table by including more different enumerative problems. There is such a table called the twentyfold way that lists 20 different enumerative problems.