Wavefunctions as gravitational waves

This is the paper I always wanted to write. It is there now, and I think it is good – and that’s an understatement. 🙂 It is probably best to download it as a pdf-file from the viXra.org site because this was a rather fast ‘copy and paste’ job from the Word version of the paper, so there may be issues with boldface notation (vector notation), italics and, most importantly, with formulas – which I, sadly, have to ‘snip’ into this WordPress blog, as they don’t have an easy copy function for mathematical formulas.

It’s great stuff. If you have been following my blog – and many of you have – you will want to digest this. 🙂

Abstract: This paper explores the implications of associating the components of the wavefunction with a physical dimension: force per unit mass – which is, of course, the dimension of acceleration (m/s²) and gravitational fields. The classical electromagnetic field equations for energy densities, the Poynting vector and spin angular momentum are then re-derived by replacing the electromagnetic N/C unit of field strength (force per unit charge) with the new N/kg = m/s² dimension.

The results are elegant and insightful. For example, the energy densities are proportional to the square of the absolute value of the wavefunction and, hence, to the probabilities, which establishes a physical normalization condition. Also, Schrödinger’s wave equation may then, effectively, be interpreted as a diffusion equation for energy, and the wavefunction itself can be interpreted as a propagating gravitational wave. Finally, as an added bonus, concepts such as the Compton scattering radius for a particle, spin angular momentum, and the boson-fermion dichotomy, can also be explained more intuitively.

While the approach offers a physical interpretation of the wavefunction, the author argues that the core of the Copenhagen interpretation revolves around the complementarity principle, which remains unchallenged because the interpretation of amplitude waves as traveling fields does not explain the particle nature of matter.

Introduction

This is not another introduction to quantum mechanics. We assume the reader is already familiar with the key principles and, importantly, with the basic math. We offer an interpretation of wave mechanics. As such, we do not challenge the complementarity principle: the physical interpretation of the wavefunction that is offered here explains the wave nature of matter only. It explains diffraction and interference of amplitudes but it does not explain why a particle will hit the detector not as a wave but as a particle. Hence, the Copenhagen interpretation of the wavefunction remains relevant: we just push its boundaries.

The basic ideas in this paper stem from a simple observation: the quantum-mechanical wavefunction and electromagnetic waves are geometrically remarkably similar. The components of both waves are orthogonal to the direction of propagation and to each other. Only the relative phase differs: the electric and magnetic field vectors (E and B) have the same phase. In contrast, the phases of the real and imaginary parts of the (elementary) wavefunction ($\psi = a e^{-i\theta} = a\cos\theta - i\,a\sin\theta$) differ by 90 degrees (π/2).[1] Pursuing the analogy, we explore the following question: if the oscillating electric and magnetic field vectors of an electromagnetic wave carry the energy that one associates with the wave, can we analyze the real and imaginary part of the wavefunction in a similar way?

We show the answer is positive and remarkably straightforward.  If the physical dimension of the electromagnetic field is expressed in newton per coulomb (force per unit charge), then the physical dimension of the components of the wavefunction may be associated with force per unit mass (newton per kg).[2] Of course, force over some distance is energy. The question then becomes: what is the energy concept here? Kinetic? Potential? Both?

The similarity between the energy of a (one-dimensional) linear oscillator ($E = m a^2 \omega^2/2$) and Einstein’s relativistic energy equation $E = mc^2$ inspires us to interpret the energy as a two-dimensional oscillation of mass. To assist the reader, we construct a two-piston engine metaphor.[3] We then adapt the formula for the electromagnetic energy density to calculate the energy densities for the wavefunction. The results are elegant and intuitive: the energy densities are proportional to the square of the absolute value of the wavefunction and, hence, to the probabilities. Schrödinger’s wave equation may then, effectively, be interpreted as a diffusion equation for energy itself.

As an added bonus, concepts such as the Compton scattering radius for a particle and spin angular momentum, as well as the boson-fermion dichotomy, can be explained in a fully intuitive way.[4]

Of course, such an interpretation is also an interpretation of the wavefunction itself, and the immediate reaction of the reader is predictable: the electric and magnetic field vectors are, somehow, to be looked at as real vectors. In contrast, the real and imaginary components of the wavefunction are not. However, this objection needs to be phrased more carefully. First, it may be noted that, in a classical analysis, the magnetic field is a pseudovector itself.[5] Second, a suitable choice of coordinates may make quantum-mechanical rotation matrices irrelevant.[6]

Therefore, the author is of the opinion that this little paper may provide some fresh perspective on the question, thereby further exploring Einstein’s basic sentiment in regard to quantum mechanics, which may be summarized as follows: there must be some physical explanation for the calculated probabilities.[7]

We will, therefore, start with Einstein’s relativistic energy equation ($E = mc^2$) and wonder what it could possibly tell us.

I. Energy as a two-dimensional oscillation of mass

The structural similarity between the relativistic energy formula, the formula for the total energy of an oscillator, and the kinetic energy of a moving body, is striking:

1. $E = mc^2$
2. $E = m\omega^2/2$
3. $E = mv^2/2$

In these formulas, ω, v and c all describe some velocity.[8] Of course, there is the 1/2 factor in the $E = m\omega^2/2$ formula[9], but that is exactly the point we are going to explore here: can we think of an oscillation in two dimensions, so it stores an amount of energy that is equal to $E = 2\cdot m\omega^2/2 = m\omega^2$?

That is easy enough. Think, for example, of a V-2 engine with the pistons at a 90-degree angle, as illustrated below. The 90° angle makes it possible to perfectly balance the counterweight and the pistons, thereby ensuring smooth travel at all times. With permanently closed valves, the air inside the cylinder compresses and decompresses as the pistons move up and down and provides, therefore, a restoring force. As such, it will store potential energy, just like a spring, and the motion of the pistons will also reflect that of a mass on a spring. Hence, we can describe it by a sinusoidal function, with the zero point at the center of each cylinder. We can, therefore, think of the moving pistons as harmonic oscillators, just like mechanical springs.

Figure 1: Oscillations in two dimensions

If we assume there is no friction, we have a perpetuum mobile here. The compressed air and the rotating counterweight (which, combined with the crankshaft, acts as a flywheel[10]) store the potential energy. The moving masses of the pistons store the kinetic energy of the system.[11]

At this point, it is probably good to quickly review the relevant math. If the magnitude of the oscillation is equal to a, then the motion of the piston (or the mass on a spring) will be described by x = a·cos(ω·t + Δ).[12] Needless to say, Δ is just a phase factor which defines our t = 0 point, and ω is the natural angular frequency of our oscillator. Because of the 90° angle between the two cylinders, Δ would be 0 for one oscillator, and –π/2 for the other. Hence, the motion of one piston is given by x = a·cos(ω·t), while the motion of the other is given by x = a·cos(ω·t–π/2) = a·sin(ω·t).

The kinetic and potential energy of one oscillator (think of one piston or one spring only) can then be calculated as:

1. K.E. = $T = m v^2/2 = (1/2)\, m\,\omega^2 a^2 \sin^2(\omega t + \Delta)$
2. P.E. = $U = k x^2/2 = (1/2)\, k\, a^2 \cos^2(\omega t + \Delta)$

The coefficient k in the potential energy formula characterizes the restoring force: F = −k·x. From the dynamics involved, it is obvious that k must be equal to $m\omega^2$. Hence, the total energy is equal to:

$$E = T + U = \tfrac{1}{2}\, m\,\omega^2 a^2 \left[\sin^2(\omega t + \Delta) + \cos^2(\omega t + \Delta)\right] = \tfrac{1}{2}\, m\, a^2 \omega^2$$
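The constancy of T + U can be checked numerically. The sketch below is only an illustration: the function name `oscillator_energies` and the values chosen for m, a and ω are arbitrary assumptions, not taken from the paper.

```python
import math

def oscillator_energies(m, a, omega, t, delta=0.0):
    """Return (kinetic, potential) energy for x = a*cos(omega*t + delta), with k = m*omega**2."""
    phase = omega * t + delta
    x = a * math.cos(phase)               # displacement
    v = -a * omega * math.sin(phase)      # velocity dx/dt
    k = m * omega ** 2                    # stiffness implied by the natural frequency
    return 0.5 * m * v ** 2, 0.5 * k * x ** 2

# Arbitrary illustrative values (assumptions)
m, a, omega = 2.0, 0.5, 3.0
expected = 0.5 * m * a ** 2 * omega ** 2  # m*a^2*omega^2/2

# The sum T + U should not depend on the time at which we evaluate it
totals = [sum(oscillator_energies(m, a, omega, t)) for t in (0.0, 0.1, 0.7, 1.9)]
print(expected, totals)
```

Whatever time is picked, the kinetic and potential parts trade places but their sum stays at m·a²·ω²/2.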

To facilitate the calculations, we will briefly assume $k = m\omega^2$ and a are equal to 1. The motion of our first oscillator is given by the cos(ω·t) = cosθ function (θ = ω·t), and its kinetic energy will, up to the factor 1/2, be equal to $\sin^2\theta$. Hence, the (instantaneous) change in kinetic energy at any point in time will be equal to:

$$\frac{d(\sin^2\theta)}{d\theta} = 2\sin\theta\,\frac{d(\sin\theta)}{d\theta} = 2\sin\theta\cos\theta$$

Let us look at the second oscillator now. Just think of the second piston going up and down in the V-2 engine. Its motion is given by the sinθ function, which is equal to cos(θ − π/2). Hence, its kinetic energy is equal to $\sin^2(\theta - \pi/2)$, and how it changes – as a function of θ – will be equal to:

$$2\sin(\theta - \pi/2)\cos(\theta - \pi/2) = -2\cos\theta\sin\theta = -2\sin\theta\cos\theta$$

We have our perpetuum mobile! While transferring kinetic energy from one piston to the other, the crankshaft will rotate with a constant angular velocity: linear motion becomes circular motion, and vice versa, and the total energy that is stored in the system is $T + U = m a^2 \omega^2$.
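A quick numerical sketch of the two-piston picture follows. It assumes two identical frictionless oscillators that are 90 degrees out of phase; the helper names `piston_energies` and `kinetic_rates` are made up for this illustration.

```python
import math

def piston_energies(theta, m=1.0, a=1.0, omega=1.0):
    """Kinetic and potential energy of each piston at crank angle theta = omega*t."""
    k = m * omega ** 2
    phases = (theta, theta - math.pi / 2)   # 90-degree offset between the two cylinders
    kinetic = [0.5 * m * (a * omega * math.sin(p)) ** 2 for p in phases]
    potential = [0.5 * k * (a * math.cos(p)) ** 2 for p in phases]
    return kinetic, potential

def kinetic_rates(theta):
    """d(sin^2)/dtheta for each piston, with m = a = omega = 1 and the factor 1/2 dropped."""
    return (2 * math.sin(theta) * math.cos(theta),
            2 * math.sin(theta - math.pi / 2) * math.cos(theta - math.pi / 2))

angles = [0.0, 0.4, 1.1, 2.5]
# Total energy of both pistons together: expected to equal m*a^2*omega^2 at every angle
totals = [sum(kin) + sum(pot) for kin, pot in (piston_energies(th) for th in angles)]
# The two rates of change should cancel: what one piston loses, the other gains
rate_sums = [sum(kinetic_rates(th)) for th in angles]
print(totals, rate_sums)
```

With the default values the total comes out as 1, i.e. m·a²·ω², and the two rates of change of kinetic energy sum to zero at every angle.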

We have a great metaphor here. Somehow, in this beautiful interplay between linear and circular motion, energy is borrowed from one place and then returns to the other, cycle after cycle. We know the wavefunction consists of a sine and a cosine: the cosine is the real component, and the sine is the imaginary component. Could they be equally real? Could each represent half of the total energy of our particle? Should we think of the c in our $E = mc^2$ formula as an angular velocity?

These are sensible questions. Let us explore them.

II. The wavefunction as a two-dimensional oscillation

The elementary wavefunction is written as:

$$\psi = a\,e^{-i[E t - p x]/\hbar} = a\cos(p x/\hbar - E t/\hbar) + i\,a\sin(p x/\hbar - E t/\hbar)$$

When considering a particle at rest (p = 0) this reduces to:

$$\psi = a\,e^{-iEt/\hbar} = a\cos(-E t/\hbar) + i\,a\sin(-E t/\hbar) = a\cos(E t/\hbar) - i\,a\sin(E t/\hbar)$$

Let us remind ourselves of the geometry involved, which is illustrated below. Note that the argument of the wavefunction rotates clockwise with time, while the mathematical convention for measuring the phase angle (ϕ) is counter-clockwise.

Figure 2: Euler’s formula

If we assume the momentum p is all in the x-direction, then the p and x vectors will have the same direction, and the vector dot product p∙x/ħ reduces to the ordinary product p·x/ħ of their magnitudes. Most illustrations – such as the one below – will freeze either x or t. Alternatively, one can find web animations that vary both. The point is: we also have a two-dimensional oscillation here. These two dimensions are perpendicular to the direction of propagation of the wavefunction. For example, if the wavefunction propagates in the x-direction, then the oscillations are along the y- and z-axes, which we may refer to as the real and imaginary axes. Note how the phase difference between the cosine and the sine – the real and imaginary parts of our wavefunction – appears to give some spin to the whole. I will come back to this.

Figure 3: Geometric representation of the wavefunction

Hence, if we say that each of these oscillations carries half of the total energy of the particle, then we may refer to the real and imaginary energy of the particle respectively, and the interplay between the real and the imaginary part of the wavefunction may then describe how energy propagates through space over time.

Let us consider, once again, a particle at rest. Hence, p = 0 and the (elementary) wavefunction reduces to $\psi = a\,e^{-iEt/\hbar}$. Hence, the angular velocity of both oscillations, at some point x, is given by ω = −E/ħ. Now, the energy of our particle includes all of the energy – kinetic, potential and rest energy – and is, therefore, equal to $E = mc^2$.

Can we, somehow, relate this to the $m a^2 \omega^2$ energy formula for our V-2 perpetuum mobile? Our wavefunction has an amplitude too. Now, if the oscillations of the real and imaginary wavefunction store the energy of our particle, then their amplitude will surely matter. In fact, the energy of an oscillation is, in general, proportional to the square of the amplitude: $E \propto a^2$. We may, therefore, think that the $a^2$ factor in the $E = m a^2 \omega^2$ energy will surely be relevant as well.

However, here is a complication: an actual particle is localized in space and can, therefore, not be represented by the elementary wavefunction. We must build a wave packet for that: a sum of wavefunctions, each with their own amplitude $a_i$, and their own $\omega_i = -E_i/\hbar$. Each of these wavefunctions will contribute some energy to the total energy of the wave packet. To calculate the contribution of each wave to the total, both $a_i$ as well as $E_i$ will matter.

What is $E_i$? $E_i$ varies around some average E, which we can associate with some average mass m: $m = E/c^2$. The Uncertainty Principle kicks in here. The analysis becomes more complicated, but a formula such as the one below might make sense:

$$E = \sum_i m_i\, a_i^2\, \omega_i^2 = \sum_i \frac{E_i}{c^2}\, a_i^2\, \frac{E_i^2}{\hbar^2}$$

We can re-write this as:

$$c^2 \hbar^2 E = \sum_i a_i^2\, E_i^3$$

What is the meaning of this equation? We may look at it as some sort of physical normalization condition when building up the Fourier sum. Of course, we should relate this to the mathematical normalization condition for the wavefunction. Our intuition tells us that the probabilities must be related to the energy densities, but how exactly? We will come back to this question in a moment. Let us first think some more about the enigma: what is mass?

Before we do so, let us quickly calculate the value of $c^2\hbar^2$: it is about 1×10⁻⁵¹ N²∙m⁴. Let us also do a dimensional analysis: the physical dimensions of the $E = m a^2 \omega^2$ equation make sense if we express m in kg, a in m, and ω in rad/s. We then get: [E] = kg∙m²/s² = (N∙s²/m)∙m²/s² = N∙m = J. The dimension of the left- and right-hand side of the physical normalization condition is N³∙m⁵.
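The order of magnitude quoted for c²ħ² is easy to verify. The snippet below is a minimal check; the constants are the SI value of c and a rounded CODATA value of ħ, and the variable names are ours.

```python
# Speed of light (exact in SI) and reduced Planck constant (rounded CODATA value)
c = 299_792_458.0          # m/s
hbar = 1.054_571_817e-34   # J*s = N*m*s

# (m/s)^2 * (N*m*s)^2 = N^2 * m^4
c2_hbar2 = (c * hbar) ** 2
print(f"c^2 * hbar^2 = {c2_hbar2:.3e} N^2*m^4")
```

The result is of the order of 10⁻⁵¹ N²∙m⁴, i.e. a very small number, not a very large one.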

III. What is mass?

We came up, playfully, with a meaningful interpretation for energy: it is a two-dimensional oscillation of mass. But what is mass? A new aether theory is, of course, not an option, but then what is it that is oscillating? To understand the physics behind equations, it is always good to do an analysis of the physical dimensions in the equation. Let us start with Einstein’s energy equation once again. If we want to look at mass, we should re-write it as $m = E/c^2$:

[m] = [E/c²] = J/(m/s)² = N·m∙s²/m² = N·s²/m = kg

This is not very helpful. It only reminds us of Newton’s definition of a mass: mass is that which gets accelerated by a force. At this point, we may want to think of the physical significance of the absolute nature of the speed of light. Einstein’s $E = mc^2$ equation implies that the ratio between the energy and the mass of any particle is always the same, so we can write, for example:

$$c^2 = \frac{E_{\text{electron}}}{m_{\text{electron}}} = \frac{E_{\text{proton}}}{m_{\text{proton}}} = \frac{E_{\text{any particle}}}{m_{\text{any particle}}}$$

This reminds us of the $\omega^2 = C^{-1}/L$ or $\omega^2 = k/m$ of harmonic oscillators once again.[13] The key difference is that the $\omega^2 = C^{-1}/L$ and $\omega^2 = k/m$ formulas introduce two or more degrees of freedom.[14] In contrast, $c^2 = E/m$ for any particle, always. However, that is exactly the point: we can modulate the resistance, inductance and capacitance of electric circuits, and the stiffness of springs and the masses we put on them, but we live in one physical space only: our spacetime. Hence, the speed of light c emerges here as the defining property of spacetime – the resonant frequency, so to speak. We have no further degrees of freedom here.

The Planck-Einstein relation (for photons) and the de Broglie equation (for matter-particles) have an interesting feature: both imply that the energy of the oscillation is proportional to the frequency, with Planck’s constant as the constant of proportionality. Now, for one-dimensional oscillations – think of a guitar string, for example – we know the energy will be proportional to the square of the frequency. It is a remarkable observation: the two-dimensional matter-wave, or the electromagnetic wave, gives us two waves for the price of one, so to speak, each carrying half of the total energy of the oscillation but, as a result, we get a proportionality between E and f instead of between E and f².

However, such reflections do not answer the fundamental question we started out with: what is mass? At this point, it is hard to go beyond the circular definition that is implied by Einstein’s formula: energy is a two-dimensional oscillation of mass, and mass packs energy, and c emerges as the property of spacetime that defines how exactly.

When everything is said and done, this does not go beyond stating that mass is some scalar field. Now, a scalar field is, quite simply, some real number that we associate with a position in spacetime. The Higgs field is a scalar field but, of course, the theory behind it goes much beyond stating that we should think of mass as some scalar field. The fundamental question is: why and how does energy, or matter, condense into elementary particles? That is what the Higgs mechanism is about but, as this paper is exploratory only, we cannot even start explaining the basics of it.

What we can do, however, is look at the wave equation again (Schrödinger’s equation), as we can now analyze it as an energy diffusion equation.

IV. Schrödinger’s equation as an energy diffusion equation

The interpretation of Schrödinger’s equation as a diffusion equation is straightforward. Feynman (Lectures, III-16-1) briefly summarizes it as follows:

“We can think of Schrödinger’s equation as describing the diffusion of the probability amplitude from one point to the next. […] But the imaginary coefficient in front of the derivative makes the behavior completely different from the ordinary diffusion such as you would have for a gas spreading out along a thin tube. Ordinary diffusion gives rise to real exponential solutions, whereas the solutions of Schrödinger’s equation are complex waves.”[17]

Let us review the basic math. For a particle moving in free space – with no external force fields acting on it – there is no potential (U = 0) and, therefore, the Uψ term disappears. Therefore, Schrödinger’s equation reduces to:

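$$\frac{\partial \psi(x, t)}{\partial t} = i\,\frac{1}{2}\,\frac{\hbar}{m_{\text{eff}}}\,\nabla^2 \psi(x, t)$$

The ubiquitous diffusion equation in physics is:

$$\frac{\partial \varphi(x, t)}{\partial t} = D\,\nabla^2 \varphi(x, t)$$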

The structural similarity is obvious. The key difference between both equations is that the wave equation gives us two equations for the price of one. Indeed, because ψ is a complex-valued function, with a real and an imaginary part, we get the following equations[18]:

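1. $\mathrm{Re}(\partial\psi/\partial t) = -(1/2)\,(\hbar/m_{\text{eff}})\,\mathrm{Im}(\nabla^2\psi)$
2. $\mathrm{Im}(\partial\psi/\partial t) = (1/2)\,(\hbar/m_{\text{eff}})\,\mathrm{Re}(\nabla^2\psi)$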
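The two coupled equations can be checked on the elementary wavefunction itself. The sketch below works in one spatial dimension, in units where ħ = m_eff = 1 (an assumption made purely for convenience), and uses central finite differences; the helper names `d_dt` and `laplacian` are hypothetical.

```python
import cmath

# Diffusion-like constant (1/2)*(hbar/m_eff), in units where hbar = m_eff = 1 (assumption)
D = 0.5
k = 2.0                 # wavenumber p/hbar (arbitrary)
omega = D * k ** 2      # E/hbar for a free particle: E = p^2 / (2*m_eff)

def psi(x, t):
    """Elementary wavefunction exp(i*(k*x - omega*t)) with unit amplitude."""
    return cmath.exp(1j * (k * x - omega * t))

def d_dt(f, x, t, h=1e-4):
    """Central-difference time derivative."""
    return (f(x, t + h) - f(x, t - h)) / (2 * h)

def laplacian(f, x, t, h=1e-4):
    """Central-difference second derivative in x (one-dimensional Laplacian)."""
    return (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h ** 2

x0, t0 = 0.3, 0.7
lhs = d_dt(psi, x0, t0)
lap = laplacian(psi, x0, t0)

# Residuals of the two real equations; both should vanish up to discretization error
re_residual = lhs.real - (-D * lap.imag)
im_residual = lhs.imag - (D * lap.real)
print(re_residual, im_residual)
```

The time derivative of the real part is fed by the Laplacian of the imaginary part and vice versa, which is the ‘two equations for the price of one’ structure described above.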

These equations make us think of the equations for an electromagnetic wave in free space (no stationary charges or currents):

1. $\partial\mathbf{B}/\partial t = -\nabla\times\mathbf{E}$
2. $\partial\mathbf{E}/\partial t = c^2\,\nabla\times\mathbf{B}$

The above equations effectively describe a propagation mechanism in spacetime, as illustrated below.

Figure 4: Propagation mechanisms

The Laplacian operator (∇²), when operating on a scalar quantity, gives us a flux density, i.e. something expressed per square meter (1/m²). In this case, it is operating on ψ(x, t), so what is the dimension of our wavefunction ψ(x, t)? To answer that question, we should analyze the diffusion constant in Schrödinger’s equation, i.e. the $(1/2)\,(\hbar/m_{\text{eff}})$ factor:

1. As a mathematical constant of proportionality, it will quantify the relationship between both derivatives (i.e. the time derivative and the Laplacian);
2. As a physical constant, it will ensure the physical dimensions on both sides of the equation are compatible.

Now, the $\hbar/m_{\text{eff}}$ factor is expressed in (N·m·s)/(N·s²/m) = m²/s. Hence, it does ensure the dimensions on both sides of the equation are, effectively, the same: ∂ψ/∂t is a time derivative and, therefore, its dimension is s⁻¹ while, as mentioned above, the dimension of ∇²ψ is m⁻². However, this does not solve our basic question: what is the dimension of the real and imaginary part of our wavefunction?

At this point, mainstream physicists will say: it does not have a physical dimension, and there is no geometric interpretation of Schrödinger’s equation. One may argue, effectively, that its argument, (p∙x – E∙t)/ħ, is just a number and, therefore, that the real and imaginary parts of ψ are also just numbers.

To this, we may object that ħ may be looked at as a mathematical scaling constant only. If we do that, then the argument of ψ will, effectively, be expressed in action units, i.e. in N·m·s. It then does make sense to also associate a physical dimension with the real and imaginary part of ψ. What could it be?

We may have a closer look at Maxwell’s equations for inspiration here. The electric field vector is expressed in newton (the unit of force) per unit of charge (coulomb). Now, there is something interesting here. The physical dimension of the magnetic field is N/C divided by m/s.[19] We may write B as the following vector cross-product: $\mathbf{B} = (1/c)\,\mathbf{e}_x \times \mathbf{E}$, with $\mathbf{e}_x$ the unit vector pointing in the x-direction (i.e. the direction of propagation of the wave). Hence, we may associate the $(1/c)\,\mathbf{e}_x \times$ operator, which amounts to a rotation by 90 degrees, with the s/m dimension. Now, multiplication by i also amounts to a rotation by 90 degrees. Hence, we may boldly write: $\mathbf{B} = (1/c)\,\mathbf{e}_x \times \mathbf{E} = (1/c)\,i\,\mathbf{E}$. This allows us to also geometrically interpret Schrödinger’s equation in the way we interpreted it above (see Figure 3).[20]

Still, we have not answered the question as to what the physical dimension of the real and imaginary part of our wavefunction should be. At this point, we may be inspired by the structural similarity between Newton’s and Coulomb’s force laws:

$$F = G\,\frac{m_1 m_2}{r^2} \qquad\text{and}\qquad F = \frac{1}{4\pi\varepsilon_0}\,\frac{q_1 q_2}{r^2}$$

Hence, if the electric field vector E is expressed in force per unit charge (N/C), then we may want to think of associating the real part of our wavefunction with a force per unit mass (N/kg). We can, of course, do a substitution here, because the mass unit (1 kg) is equivalent to 1 N·s²/m. Hence, our N/kg dimension becomes:

N/kg = N/(N·s²/m) = m/s²

What is this: m/s²? Is that the dimension of the a·cosθ term in the $a\,e^{-i\theta} = a\cos\theta - i\,a\sin\theta$ wavefunction?

My answer is: why not? Think of it: m/s² is the physical dimension of acceleration: the increase or decrease in velocity (m/s) per second. It ensures the wavefunction for any particle – matter-particles or particles with zero rest mass (photons) – and the associated wave equation (which has to be the same for all, as the spacetime we live in is one) are mutually consistent.

In this regard, we should think of how we would model a gravitational wave. The physical dimension would surely be the same: force per mass unit. It all makes sense: wavefunctions may, perhaps, be interpreted as traveling distortions of spacetime, i.e. as tiny gravitational waves.

V. Energy densities and flows

Pursuing the geometric equivalence between the equations for an electromagnetic wave and Schrödinger’s equation, we can now, perhaps, see if there is an equivalent for the energy density. For an electromagnetic wave, we know that the energy density is given by the following formula:

$$u = \frac{\varepsilon_0}{2}\,\mathbf{E}\cdot\mathbf{E} + \frac{\varepsilon_0 c^2}{2}\,\mathbf{B}\cdot\mathbf{B}$$

E and B are the electric and magnetic field vectors respectively. The Poynting vector will give us the directional energy flux, i.e. the energy flow per unit area per unit time. We write:

$$-\frac{\partial u}{\partial t} = \nabla\cdot\mathbf{S}$$

Needless to say, the ∇∙ operator is the divergence and, therefore, gives us the magnitude of a (vector) field’s source or sink at a given point. To be precise, the divergence gives us the volume density of the outward flux of a vector field from an infinitesimal volume around a given point. In this case, it gives us the volume density of the flux of S.

We can analyze the dimensions of the equation for the energy density as follows:

1. E is measured in newton per coulomb, so [E∙E] = [E²] = N²/C².
2. B is measured in (N/C)/(m/s), so we get [B∙B] = [B²] = (N²/C²)·(s²/m²). However, the dimension of our c² factor is (m²/s²) and so we’re also left with N²/C².
3. ε0 is the electric constant, also known as the vacuum permittivity. As a physical constant, it should ensure the dimensions on both sides of the equation work out, and they do: [ε0] = C²/(N·m²) and, therefore, if we multiply that with N²/C², we find that the energy density is expressed in N/m² = J/m³.[21]
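The bookkeeping in the three steps above can be mechanized. The sketch below tracks unit exponents in plain dictionaries keyed by N, m, s and C; this representation and the helper names `mul` and `power` are our own illustrative choices, not part of the paper.

```python
def mul(*dims):
    """Multiply quantities by adding their unit exponents; drop units with exponent zero."""
    out = {}
    for d in dims:
        for unit, exp in d.items():
            out[unit] = out.get(unit, 0) + exp
    return {u: e for u, e in out.items() if e != 0}

def power(d, n):
    """Raise a quantity to the n-th power by scaling its unit exponents."""
    return {u: e * n for u, e in d.items()}

E_FIELD = {"N": 1, "C": -1}                   # N/C
B_FIELD = {"N": 1, "C": -1, "m": -1, "s": 1}  # (N/C)/(m/s)
C_LIGHT = {"m": 1, "s": -1}                   # m/s
EPS0 = {"C": 2, "N": -1, "m": -2}             # C^2/(N*m^2)
J_PER_M3 = {"N": 1, "m": -2}                  # J/m^3 = N*m/m^3 = N/m^2
KG = {"N": 1, "s": 2, "m": -1}                # 1 kg = 1 N*s^2/m

electric_term = mul(EPS0, power(E_FIELD, 2))                     # eps0 * E^2
magnetic_term = mul(EPS0, power(C_LIGHT, 2), power(B_FIELD, 2))  # eps0 * c^2 * B^2
newton_per_kg = mul({"N": 1}, power(KG, -1))                     # N/kg

print(electric_term, magnetic_term, newton_per_kg)
```

Both terms of the energy density reduce to N/m² = J/m³, and the N/kg dimension proposed for the wavefunction reduces to m/s².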

Replacing the newton per coulomb unit (N/C) by the newton per kg unit (N/kg) in the formulas above should give us the equivalent of the energy density for the wavefunction. We just need to replace ε0 with an equivalent constant. We may want to give it a try. If the energy densities can be calculated – which are also mass densities, obviously – then the probabilities should be proportional to them.

Let us first see what we get for a photon, assuming the electromagnetic wave represents its wavefunction. Substituting (1/c)∙iE or −(1/c)∙iE for B gives us the following result:

$$u = \frac{\varepsilon_0}{2}\,E^2 + \frac{\varepsilon_0 c^2}{2}\left(\pm\frac{i}{c}\right)^2 E^2 = \frac{\varepsilon_0}{2}\,E^2 - \frac{\varepsilon_0}{2}\,E^2 = 0$$

Zero!? An unexpected result! Or not? We have no stationary charges and no currents: only an electromagnetic wave in free space. Hence, the local energy conservation principle needs to be respected at all points in space and in time. The geometry makes sense of the result: for an electromagnetic wave, the magnitudes of E and B reach their maximum, minimum and zero point simultaneously, as shown below.[22] This is because their phase is the same.

Figure 5: Electromagnetic wave: E and B

Should we expect a similar result for the energy densities that we would associate with the real and imaginary part of the matter-wave? For the matter-wave, we have a phase difference between a·cosθ and a·sinθ, which gives a different picture of the propagation of the wave (see Figure 3).[23] In fact, the geometry suggests some inherent spin, which is interesting. I will come back to this. Let us first guess those densities. Making abstraction of any scaling constants, we may write:

$$u = a^2\cos^2\theta + a^2\sin^2\theta = a^2$$

We get what we hoped to get: the absolute square of our amplitude is, effectively, an energy density!

$$|\psi|^2 = |a\,e^{-iEt/\hbar}|^2 = a^2 = u$$
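The contrast between the two cases can be illustrated numerically. The sketch below leaves out all scaling constants, as the text does; the amplitude, the E/ħ value and the function names are arbitrary assumptions, and the electromagnetic case simply applies the formal substitution c·B = i·E used above.

```python
import cmath

a = 0.8              # amplitude (arbitrary illustrative value)
E_over_hbar = 5.0    # E/hbar in rad/s (arbitrary illustrative value)

def psi(t):
    """Elementary wavefunction of a particle at rest: a*exp(-i*E*t/hbar)."""
    return a * cmath.exp(-1j * E_over_hbar * t)

# Matter-wave: squared real part plus squared imaginary part, scaling constants omitted
times = [0.0, 0.3, 1.7]
matter_density = [psi(t).real ** 2 + psi(t).imag ** 2 for t in times]

def em_density(e_field):
    """E^2 + (c*B)^2 with the formal substitution c*B = i*E, scaling constants omitted."""
    c_times_b = 1j * e_field
    return e_field ** 2 + c_times_b ** 2

em_values = [em_density(e) for e in (0.0, 0.5, -2.0)]
print(matter_density, em_values)
```

The matter-wave density stays at a² for every t, whereas the formal electromagnetic expression cancels to zero for every field value.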

This is very deep. A photon has no rest mass, so it borrows and returns energy from empty space as it travels through it. In contrast, a matter-wave carries energy and, therefore, has some (rest) mass. It is therefore associated with an energy density, and this energy density gives us the probabilities. Of course, we need to fine-tune the analysis to account for the fact that we have a wave packet rather than a single wave, but that should be feasible.

As mentioned, the phase difference between the real and imaginary part of our wavefunction (a cosine and a sine function) appears to give some spin to our particle. We do not have this particularity for a photon. Of course, photons are bosons, i.e. particles with integer spin (spin-1, in the case of the photon), while elementary matter-particles are fermions with spin-1/2. Hence, our geometric interpretation of the wavefunction suggests that, after all, there may be some more intuitive explanation of the fundamental dichotomy between bosons and fermions, which puzzled even Feynman:

“Why is it that particles with half-integral spin are Fermi particles, whereas particles with integral spin are Bose particles? We apologize for the fact that we cannot give you an elementary explanation. An explanation has been worked out by Pauli from complicated arguments of quantum field theory and relativity. He has shown that the two must necessarily go together, but we have not been able to find a way of reproducing his arguments on an elementary level. It appears to be one of the few places in physics where there is a rule which can be stated very simply, but for which no one has found a simple and easy explanation. The explanation is deep down in relativistic quantum mechanics. This probably means that we do not have a complete understanding of the fundamental principle involved.” (Feynman, Lectures, III-4-1)

The physical interpretation of the wavefunction, as presented here, may provide some better understanding of ‘the fundamental principle involved’: the physical dimension of the oscillation is just very different. That is all: it is force per unit charge for photons, and force per unit mass for matter-particles. We will examine the question of spin somewhat more carefully in section VII. Let us first examine the matter-wave some more.

VI. Group and phase velocity of the matter-wave

The geometric representation of the matter-wave (see Figure 3) suggests a traveling wave and, yes, of course: the matter-wave effectively travels through space and time. But what is traveling, exactly? It is the pulse – or the signal – only: the phase velocity of the wave is just a mathematical concept and, even in our physical interpretation of the wavefunction, the same is true for the group velocity of our wave packet. The oscillation is two-dimensional, but perpendicular to the direction of travel of the wave. Hence, nothing actually moves with our particle.

Here, we should also reiterate that we did not answer the question as to what is oscillating up and down and/or sideways: we only associated a physical dimension with the components of the wavefunction – newton per kg (force per unit mass), to be precise. We were inspired to do so because of the physical dimension of the electric and magnetic field vectors (newton per coulomb, i.e. force per unit charge) we associate with electromagnetic waves which, for all practical purposes, we currently treat as the wavefunction for a photon. This made it possible to calculate the associated energy densities and a Poynting vector for energy dissipation. In addition, we showed that Schrödinger’s equation itself then becomes a diffusion equation for energy. However, let us now focus some more on the asymmetry which is introduced by the phase difference between the real and the imaginary part of the wavefunction. Look at the mathematical shape of the elementary wavefunction once again:

$$\psi = a\,e^{-i[E t - p x]/\hbar} = a\cos(p x/\hbar - E t/\hbar) + i\,a\sin(p x/\hbar - E t/\hbar)$$

The minus sign in the argument of our sine and cosine function defines the direction of travel: an F(x−v∙t) wavefunction will always describe some wave that is traveling in the positive x-direction (with the wave velocity v), while an F(x+v∙t) wavefunction will travel in the negative x-direction. For a geometric interpretation of the wavefunction in three dimensions, we need to agree on how to define i or, what amounts to the same, a convention on how to define clockwise and counterclockwise directions: if we look at a clock from the back, then its hands will be moving counterclockwise. So we need to establish the equivalent of the right-hand rule. However, let us not worry about that now. Let us focus on the interpretation. To ease the analysis, we’ll assume we’re looking at a particle at rest. Hence, p = 0, and the wavefunction reduces to:

$$\psi = a\,e^{-iE_0 t/\hbar} = a\cos(-E_0 t/\hbar) + i\,a\sin(-E_0 t/\hbar) = a\cos(E_0 t/\hbar) - i\,a\sin(E_0 t/\hbar)$$

$E_0$ is, of course, the rest energy of our particle and, now that we are here, we should probably wonder whose time we are talking about: is it our time, or is it the proper time of our particle? Well… In this situation, we are both at rest so it does not matter: t is, effectively, the proper time so perhaps we should write it as $t_0$. You can see what we expect to see: $E_0/\hbar$ pops up as the natural frequency of our matter-particle: $(E_0/\hbar)\,t = \omega t$. Remembering the ω = 2π·f = 2π/T and T = 1/f formulas, we can associate a period and a frequency with this wave. Noting that ħ = h/2π, we find the following:

$$T = 2\pi\,(\hbar/E_0) = h/E_0 \iff f = E_0/h = m_0 c^2/h$$
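To get a feel for the numbers, the formula can be evaluated for an electron. The example is ours, not the paper’s; the constants are standard SI/CODATA values (rounded where not exact) and the variable names are arbitrary.

```python
# Standard constants (SI); the electron is chosen only as an example particle
c = 299_792_458.0            # speed of light, m/s (exact)
h = 6.626_070_15e-34         # Planck constant, J*s (exact)
m_electron = 9.109_383_7e-31 # electron rest mass, kg (rounded CODATA value)

E0 = m_electron * c ** 2     # rest energy, J
frequency = E0 / h           # f = E0/h = m0*c^2/h, in Hz
period = h / E0              # T = h/E0, in s

print(f"f = {frequency:.4e} Hz, T = {period:.4e} s")
```

For the electron this gives a frequency of roughly 1.2×10²⁰ Hz, i.e. a ‘natural unit of time’ of the order of 10⁻²⁰ s.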

This is interesting, because we can look at the period as a natural unit of time for our particle. What about the wavelength? That is tricky because we need to distinguish between group and phase velocity here. The group velocity (vg) should be zero here, because we assume our particle does not move. In contrast, the phase velocity is given by vp = λ·= (2π/k)·(ω/2π) = ω/k. In fact, we’ve got something funny here: the wavenumber k = p/ħ is zero, because we assume the particle is at rest, so p = 0. So we have a division by zero here, which is rather strange. What do we get assuming the particle is not at rest? We write:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = E/(m·vg) = (m·c²)/(m·vg) = c²/vg

This is interesting: it establishes a reciprocal relation between the phase and the group velocity, with c² as a simple scaling constant. Indeed, the graph below shows the shape of the function does not change with the value of c, and we may also re-write the relation above as:

vp/c = βp = c/vg = 1/βg = 1/(vg/c)

Figure 6: Reciprocal relation between phase and group velocity
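The reciprocal relation is easy to tabulate. The sketch below is a minimal illustration in Python: the function name phase_velocity and the sample values for βg are assumptions made for the example, and the explicit error for vg = 0 simply mirrors the division by zero we noted for a particle at rest.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def phase_velocity(v_group, c=C):
    """Return vp = c²/vg; undefined for a particle at rest (vg = 0)."""
    if v_group == 0:
        raise ValueError("vg = 0: the phase velocity c²/vg is undefined")
    return c**2 / v_group

# Tabulate the relative velocities βg = vg/c and βp = vp/c
for beta_g in (0.1, 0.25, 0.5, 1.0):
    beta_p = phase_velocity(beta_g * C) / C
    product_is_one = math.isclose(beta_p * beta_g, 1.0)
    print(f"βg = {beta_g:<5} βp = {beta_p:.3f}  βp·βg = 1? {product_is_one}")
```

Working with the relative velocities βg and βp removes c from the picture altogether, which is why the shape of the curve does not depend on its value.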

We can also write the mentioned relationship as vp·vg = c², which reminds us of the relationship between the electric and magnetic constant: (1/ε0)·(1/μ0) = c². This is interesting in light of the fact we can re-write this as (c·ε0)·(c·μ0) = 1, which shows electricity and magnetism are just two sides of the same coin, so to speak.[24]

Interesting, but how do we interpret the math? What about the implications of the zero value for the wavenumber k = p/ħ? We would probably like to think it implies the elementary wavefunction should always be associated with some momentum, because the concept of zero momentum clearly leads to weird math: something times zero cannot be equal to c²! Such an interpretation is also consistent with the Uncertainty Principle: if Δx·Δp ≥ ħ, then neither Δx nor Δp can be zero. In other words, the Uncertainty Principle tells us that the idea of a pointlike particle actually being at some specific point in time and in space does not make sense: it has to move. It tells us that our concepts of dimensionless points in time and space are mathematical notions only. Actual particles – including photons – are always a bit spread out, so to speak, and – importantly – they have to move.

For a photon, this is self-evident. It has no rest mass, no rest energy, and, therefore, it is going to move at the speed of light itself. We write: p = m·c = m·c²/c = E/c. Using the relationship above, we get:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = c ⇒ vg = c²/vp = c²/c = c

This is good: we started out with some reflections on the matter-wave, but here we get an interpretation of the electromagnetic wave as a wavefunction for the photon. But let us get back to our matter-wave. In regard to our interpretation of a particle having to move, we should remind ourselves, once again, of the fact that an actual particle is always localized in space and that it can, therefore, not be represented by the elementary wavefunction ψ = a·e^(−i[E·t − p·x]/ħ) or, for a particle at rest, the ψ = a·e^(−i·E·t/ħ) function. We must build a wave packet for that: a sum of wavefunctions, each with its own amplitude ai and its own ωi = −Ei/ħ. Indeed, in section II, we showed that each of these wavefunctions will contribute some energy to the total energy of the wave packet and that, to calculate the contribution of each wave to the total, both ai and Ei matter. This may or may not resolve the apparent paradox. Let us look at the group velocity.

To calculate a meaningful group velocity, we must assume the derivative vg = ∂ωi/∂ki = ∂(Ei/ħ)/∂(pi/ħ) = ∂(Ei)/∂(pi) exists. So we must have some dispersion relation. How do we calculate it? We need to calculate ωi as a function of ki here, or Ei as a function of pi. How do we do that? Well… There are a few ways to go about it but one interesting way of doing it is to re-write Schrödinger’s equation as we did, i.e. by distinguishing the real and imaginary parts of the ∂ψ/∂t = i·[ħ/(2m)]·∇²ψ wave equation and, hence, re-write it as the following pair of equations:

1. Re(∂ψ/∂t) = −[ħ/(2meff)]·Im(∇²ψ) ⇔ ω·cos(kx − ωt) = k²·[ħ/(2meff)]·cos(kx − ωt)
2. Im(∂ψ/∂t) = [ħ/(2meff)]·Re(∇²ψ) ⇔ ω·sin(kx − ωt) = k²·[ħ/(2meff)]·sin(kx − ωt)

Both equations imply the following dispersion relation:

ω = ħ·k²/(2meff)
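As a quick numerical check, the sketch below differentiates this dispersion relation to get a group velocity. It treats meff as a fixed parameter and sets it to the electron mass, and it picks an arbitrary wavenumber: both choices, as well as the function names, are assumptions made for illustration only.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J·s
M_E = 9.1093837015e-31   # electron mass in kg (assumed stand-in for meff)

def omega(k, m=M_E):
    """Dispersion relation ω = ħ·k²/(2m)."""
    return HBAR * k**2 / (2 * m)

def group_velocity(k, m=M_E, rel_step=1e-6):
    """Central-difference estimate of vg = ∂ω/∂k."""
    dk = k * rel_step
    return (omega(k + dk, m) - omega(k - dk, m)) / (2 * dk)

def phase_velocity(k, m=M_E):
    """Phase velocity vp = ω/k."""
    return omega(k, m) / k

k = 1.0e10  # hypothetical wavenumber in 1/m
print(f"vg = {group_velocity(k):.4e} m/s")
print(f"vp = {phase_velocity(k):.4e} m/s")
print(f"ħ·k/m = {HBAR * k / M_E:.4e} m/s")
```

With a fixed mass, the group velocity comes out as ħ·k/m = p/m, which is twice the phase velocity ω/k. That is why the question of what mass concept to use in this relation matters.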

Of course, we need to think about the subscripts now: we have ωi, ki, but… What about meff or, dropping the subscript, m? Do we write it as mi? If so, what is it? Well… It is the equivalent mass of Ei obviously, and so we get it from the mass-energy equivalence relation: mi = Ei/c2. It is a fine point, but one most people forget about: they usually just write m. However, if there is uncertainty in the energy, then Einstein’s mass-energy relation tells us we must have some uncertainty in the (equivalent) mass too. Here, I should refer back to Section II: Ei varies around some average energy E and, therefore, the Uncertainty Principle kicks in.

VII. Explaining spin

The elementary wavefunction vector – i.e. the vector sum of the real and imaginary component – rotates around the x-axis, which gives us the direction of propagation of the wave (see Figure 3). Its magnitude remains constant. In contrast, the magnitude of the electromagnetic vector – defined as the vector sum of the electric and magnetic field vectors – oscillates between zero and some maximum (see Figure 5).

We already mentioned that the rotation of the wavefunction vector appears to give some spin to the particle. Of course, a circularly polarized wave would also appear to have spin (think of the E and B vectors rotating around the direction of propagation – as opposed to oscillating up and down or sideways only). In fact, circularly polarized light does carry angular momentum, as the equivalent mass of its energy may be thought of as rotating as well. But here we are looking at a matter-wave.

The basic idea is the following: if we look at ψ = a·e^(−i·E·t/ħ) as some real vector – as a two-dimensional oscillation of mass, to be precise – then we may associate its rotation around the direction of propagation with some torque. The illustration below reminds us of the math here.

Figure 7: Torque and angular momentum vectors

A torque on some mass about a fixed axis gives it angular momentum, which we can write as the vector cross-product L = r×p or, perhaps easier for our purposes here, as the product of an angular velocity (ω) and rotational inertia (I), aka the moment of inertia or the angular mass. We write:

L = I·ω

Note we can write L and ω in boldface here because they are (axial) vectors. If we consider their magnitudes only, we write L = I·ω (no boldface). We can now do some calculations. Let us start with the angular velocity. In our previous posts, we showed that the period of the matter-wave is equal to T = 2π·(ħ/E0). Hence, the angular velocity must be equal to:

ω = 2π/[2π·(ħ/E0)] = E0/ħ

We also know the distance r, so that is the magnitude of r in the L = r×p vector cross-product: it is just a, so that is the magnitude of ψ = a·e^(−i·E·t/ħ). Now, the momentum (p) is the product of a linear velocity (v) – in this case, the tangential velocity – and some mass (m): p = m·v. If we switch to scalar instead of vector quantities, then the (tangential) velocity is given by v = r·ω. So now we only need to think about what we should use for m or, if we want to work with the angular velocity (ω), the angular mass (I). Here we need to make some assumption about the mass (or energy) distribution. Now, it may or may not make sense to assume the energy in the oscillation – and, therefore, the mass – is distributed uniformly. In that case, we may use the formula for the angular mass of a solid cylinder: I = m·r²/2. If we keep the analysis non-relativistic, then m = m0. Of course, the energy-mass equivalence tells us that m0 = E0/c². Hence, this is what we get:

L = I·ω = (m0·r²/2)·(E0/ħ) = (1/2)·a²·(E0/c²)·(E0/ħ) = a²·E0²/(2·ħ·c²)

Does it make sense? Maybe. Maybe not. Let us do a dimensional analysis: that won’t check our logic, but it makes sure we made no mistakes when mapping mathematical and physical spaces. We have m²·J² = m²·N²·m² in the numerator and N·m·s·m²/s² in the denominator. Hence, the dimensions work out: we get N·m·s as the dimension for L, which is, effectively, the physical dimension of angular momentum. It is also the action dimension, of course, and that cannot be a coincidence. Also note that the E = mc² equation allows us to re-write it as:

L = a²·m0²·c²/(2·ħ)

Of course, in quantum mechanics, we associate spin with the magnetic moment of a charged particle, not with its mass as such. Is there a way to link the formula above to the one we have for the quantum-mechanical angular momentum, which is also measured in N·m·s units, and which can only take on one of two possible values: J = +ħ/2 and −ħ/2? It looks like a long shot, right? How do we go from (1/2)·a²·m0²·c²/ħ to ± (1/2)∙ħ? Let us do a numerical example. The rest energy of an electron is about 0.511 MeV ≈ 8.1871×10⁻¹⁴ N∙m, and a… What value should we take for a?

We have an obvious trio of candidates here: the Bohr radius, the classical electron radius (aka the Thomson scattering length), and the Compton scattering radius.

Let us start with the Bohr radius, so that is about 0.529×10⁻¹⁰ m. We get L = a²·E0²/(2·ħ·c²) = 9.9×10⁻³¹ N∙m∙s. Now that is about 1.88×10⁴ times ħ/2. That is a huge factor. The Bohr radius cannot be right: we are not looking at an electron in an orbital here. To show it does not make sense, we may want to double-check the analysis by doing the calculation in another way. We said each oscillation will always pack 6.626070040(81)×10⁻³⁴ joule in energy. So our electron should pack about 1.24×10²⁰ oscillations. The angular momentum (L) we get when using the Bohr radius for a and the value of 6.626×10⁻³⁴ joule for E0 is equal to 6.49×10⁻⁷¹ N∙m∙s. So that is the angular momentum per oscillation. When we multiply this with the number of oscillations (1.24×10²⁰), we get about 8.01×10⁻⁵¹ N∙m∙s, so that is a totally different number.

The classical electron radius is about 2.818×10⁻¹⁵ m. We get an L that is equal to about 2.81×10⁻³⁹ N∙m∙s, so now it is a tiny fraction of ħ/2! Hence, this leads us nowhere. Let us go for our last chance to get a meaningful result! Let us use the Compton scattering length, so that is about 2.42631×10⁻¹² m.

This gives us an L of 2.08×10⁻³³ N∙m∙s, which is only 20 times ħ. This is not so bad, but is it good enough? Let us calculate it the other way around: what value should we take for a so as to ensure L = a²·E0²/(2·ħ·c²) = ħ/2? Let us write it out:

a²·E0²/(2·ħ·c²) = ħ/2 ⇔ a² = ħ²·c²/E0² ⇔ a = ħ·c/E0 = ħ/(m0·c)

In fact, this is the formula for the so-called reduced Compton wavelength. This is perfect. We found what we wanted to find. Substituting this value for a (you can calculate it: it is about 3.8616×10⁻¹³ m), we get what we should find:

L = a²·E0²/(2·ħ·c²) = ħ/2 ≈ 5.27×10⁻³⁵ N∙m∙s

This is a rather spectacular result, and one that would – a priori – support the interpretation of the wavefunction that is being suggested in this paper.
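The numerical examples are easy to reproduce. The sketch below evaluates L = a²·E0²/(2·ħ·c²) for the candidate radii and for the reduced Compton wavelength a = ħ·c/E0. The constants are rounded SI values and the function name angular_momentum is my own: take it as a check of the arithmetic, not as anything more.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J·s
C = 299_792_458.0        # speed of light, m/s
E0 = 8.1871e-14          # electron rest energy, J (about 0.511 MeV)

def angular_momentum(a, e0=E0):
    """L = a²·E0²/(2·ħ·c²) for a uniform disk of radius a rotating at ω = E0/ħ."""
    return a**2 * e0**2 / (2 * HBAR * C**2)

candidate_radii = {
    "Bohr radius": 5.29177e-11,
    "classical electron radius": 2.818e-15,
    "Compton wavelength": 2.42631e-12,
    "reduced Compton wavelength": HBAR * C / E0,
}

for name, a in candidate_radii.items():
    L = angular_momentum(a)
    ratio = L / (HBAR / 2)
    print(f"{name:28s} a = {a:.4e} m   L = {L:.3e} N·m·s   L/(ħ/2) = {ratio:.3e}")
```

Only the reduced Compton wavelength gives a ratio of one. That is no surprise, of course: it is the value of a that we get from solving the L = ħ/2 condition.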

VIII. The boson-fermion dichotomy

Let us do some more thinking on the boson-fermion dichotomy. Again, we should remind ourselves that an actual particle is localized in space and that it can, therefore, not be represented by the elementary wavefunction ψ = a·e^(−i[E·t − p·x]/ħ) or, for a particle at rest, the ψ = a·e^(−i·E·t/ħ) function. We must build a wave packet for that: a sum of wavefunctions, each with its own amplitude ai and its own ωi = −Ei/ħ. Each of these wavefunctions will contribute some energy to the total energy of the wave packet. Now, we can have another wild but logical theory about this.

Think of the apparent right-handedness of the elementary wavefunction: surely, Nature can’t be bothered about our convention of measuring phase angles clockwise or counterclockwise. Also, the angular momentum can be positive or negative: J = +ħ/2 or −ħ/2. Hence, we would probably like to think that an actual particle – think of an electron, or whatever other particle you’d think of – may consist of right-handed as well as left-handed elementary waves. To be precise, we may think they either consist of (elementary) right-handed waves or, else, of (elementary) left-handed waves. An elementary right-handed wave would be written as:

ψ(θi) = ai·(cosθi + i·sinθi)

In contrast, an elementary left-handed wave would be written as:

ψ(−θi) = ai·(cosθi − i·sinθi)

How does that work out with the E0·t argument of our wavefunction? Position is position, and direction is direction, but time? Time has only one direction, but Nature surely does not care how we count time: counting like 1, 2, 3, etcetera or like −1, −2, −3, etcetera is just the same. If we count like 1, 2, 3, etcetera, then we write our wavefunction like:

ψ = a·cos(E0∙t/ħ) − i·a·sin(E0∙t/ħ)

If we count time like −1, −2, −3, etcetera then we write it as:

ψ = a·cos(−E0∙t/ħ) − i·a·sin(−E0∙t/ħ) = a·cos(E0∙t/ħ) + i·a·sin(E0∙t/ħ)

Hence, it is just like the left- or right-handed circular polarization of an electromagnetic wave: we can have both for the matter-wave too! This, then, should explain why we can have either positive or negative quantum-mechanical spin (+ħ/2 or −ħ/2). It is the usual thing: we have two mathematical possibilities here, and so we must have two physical situations that correspond to it.
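The two mathematical possibilities are each other's complex conjugate, and reversing the sign of the argument turns one into the other. The sketch below illustrates this with Python's built-in complex numbers. The function names right_handed and left_handed are labels chosen for the example: which of the two we call right-handed is a matter of convention.

```python
import cmath

def right_handed(theta, a=1.0):
    """Elementary wave a·(cosθ + i·sinθ)."""
    return a * cmath.exp(1j * theta)

def left_handed(theta, a=1.0):
    """Elementary wave a·(cosθ − i·sinθ)."""
    return a * cmath.exp(-1j * theta)

theta = 0.7  # some phase angle in radians (arbitrary)

psi_r = right_handed(theta)
psi_l = left_handed(theta)

# The two waves are complex conjugates of each other
print(cmath.isclose(psi_l, psi_r.conjugate()))
# Counting time (or the phase) backwards swaps the handedness
print(cmath.isclose(left_handed(-theta), psi_r))
# The magnitude is the same for both
print(abs(psi_r), abs(psi_l))
```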

It is only natural. If we have left- and right-handed photons – or, generalizing, left- and right-handed bosons – then we should also have left- and right-handed fermions (electrons, protons, etcetera). Back to the dichotomy. The textbook analysis of the dichotomy between bosons and fermions may be epitomized by Richard Feynman’s Lecture on it (Feynman, III-4), which is confusing and – I would dare to say – even inconsistent: how are photons or electrons supposed to know that they need to interfere with a positive or a negative sign? They are not supposed to know anything: knowledge is part of our interpretation of whatever it is that is going on there.

Hence, it is probably best to keep it simple, and think of the dichotomy in terms of the different physical dimensions of the oscillation: newton per kg versus newton per coulomb. And then, of course, we should also note that matter-particles have a rest mass and, therefore, actually carry charge. Photons do not. But both are two-dimensional oscillations, and the point is: the so-called vacuum – and the rest mass of our particle (which is zero for the photon and non-zero for everything else) – give us the natural frequency for both oscillations, which is beautifully summed up in that remarkable equation for the group and phase velocity of the wavefunction, which applies to photons as well as matter-particles:

(vphase/c)·(vgroup/c) = 1 ⇔ vp·vg = c²

The final question then is: why are photons spin-zero particles? Well… We should first remind ourselves of the fact that they do have spin when circularly polarized.[25] Here we may think of the rotation of the equivalent mass of their energy. However, if they are linearly polarized, then there is no spin. Even for circularly polarized waves, the spin angular momentum of photons is a weird concept. If photons have no (rest) mass, then they cannot carry any charge. They should, therefore, not have any magnetic moment. Indeed, what I wrote above shows an explanation of quantum-mechanical spin requires both mass as well as charge.[26]

IX. Concluding remarks

There are, of course, other ways to look at the matter – literally. For example, we can imagine two-dimensional oscillations as circular rather than linear oscillations. Think of a tiny ball, whose center of mass stays where it is, as depicted below. Any rotation – around any axis – will be some combination of a rotation around the two other axes. Hence, we may want to think of a two-dimensional oscillation as an oscillation of a polar and azimuthal angle.

Figure 8: Two-dimensional circular movement

The point of this paper is not to make any definite statements. That would be foolish. Its objective is just to challenge the simplistic mainstream viewpoint on the reality of the wavefunction. Stating that it is a mathematical construct only without physical significance amounts to saying it has no meaning at all. That is, clearly, a non-sustainable proposition.

The interpretation that is offered here looks at amplitude waves as traveling fields. Their physical dimension may be expressed in force per mass unit, as opposed to electromagnetic waves, whose amplitudes are expressed in force per (electric) charge unit. Also, the amplitudes of matter-waves incorporate a phase factor, but this may actually explain the rather enigmatic dichotomy between fermions and bosons and is, therefore, an added bonus.

The interpretation that is offered here has some advantages over other explanations, as it explains the how of diffraction and interference. However, while it offers a great explanation of the wave nature of matter, it does not explain its particle nature: while we think of the energy as being spread out, we will still observe electrons and photons as pointlike particles once they hit the detector. Why is it that a detector can sort of ‘hook’ the whole blob of energy, so to speak?

The interpretation of the wavefunction that is offered here does not explain this. Hence, the complementarity principle of the Copenhagen interpretation of the wavefunction surely remains relevant.

Appendix 1: The de Broglie relations and energy

The 1/2 factor in Schrödinger’s equation is related to the concept of the effective mass (meff). It is easy to make the wrong calculations. For example, when playing with the famous de Broglie relations – aka the matter-wave equations – one may be tempted to derive the following energy concept:

1. E = h·f and p = h/λ. Therefore, f = E/h and λ = h/p.
2. v = f·λ = (E/h)∙(h/p) = E/p
3. p = m·v. Therefore, E = v·p = m·v²

E = m·v²? This resembles the E = mc² equation and, therefore, one may be enthused by the discovery, especially because the m·v² also pops up when working with the Least Action Principle in classical mechanics, which states that the path that is followed by a particle will minimize the following integral:

S = ∫ (KE − PE)·dt

Now, we can choose any reference point for the potential energy but, to reflect the energy conservation law, we can select a reference point that ensures the sum of the kinetic and the potential energy is zero throughout the time interval. If the force field is uniform, then the integrand will, effectively, be equal to KE − PE = m·v².[27]

However, that is classical mechanics and, therefore, not so relevant in the context of the de Broglie equations, and the apparent paradox should be solved by distinguishing between the group and the phase velocity of the matter wave.
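A numerical example may help to see where the naive calculation goes wrong. The sketch below assumes an electron moving at one tenth of the speed of light and uses the relativistic expressions for E and p. These choices, and the function name de_broglie, are assumptions made for illustration. The product f·λ then turns out to be the phase velocity c²/v rather than the velocity v of the particle.

```python
import math

C = 299_792_458.0        # speed of light, m/s
H = 6.62607015e-34       # Planck constant, J·s
M_E = 9.1093837015e-31   # electron rest mass, kg

def de_broglie(v, m0=M_E):
    """Return (f, λ) from E = h·f and p = h/λ, using relativistic E and p."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    energy = gamma * m0 * C**2
    momentum = gamma * m0 * v
    return energy / H, H / momentum

v = 0.1 * C              # assumed particle (group) velocity
f, lam = de_broglie(v)
v_phase = f * lam        # this is E/p

print(f"particle velocity v  = {v:.4e} m/s")
print(f"f·λ = E/p            = {v_phase:.4e} m/s")
print(f"c²/v                 = {C**2 / v:.4e} m/s")
```

So the v in step 2 of the calculation is the phase velocity, while the v in step 3 is the group velocity: equating the two is what produces the E = m·v² formula.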

Appendix 2: The concept of the effective mass

The effective mass – as used in Schrödinger’s equation – is a rather enigmatic concept. To make sure we are making the right analysis here, I should start by noting you will usually see Schrödinger’s equation written as:

i·ħ·∂ψ(x, t)/∂t = −(1/2)·(ħ²/meff)·∇²ψ(x, t) + U·ψ(x, t)

This formulation includes a term with the potential energy (U). In free space (no potential), this term disappears, and the equation can be re-written as:

∂ψ(x, t)/∂t = i·(1/2)·(ħ/meff)·∇²ψ(x, t)

We just moved the i·ħ coefficient to the other side, noting that 1/i = –i. Now, in one-dimensional space, and assuming ψ is just the elementary wavefunction (so we substitute a·e^(−i∙[E·t − p∙x]/ħ) for ψ), this implies the following:

−a·i·(E/ħ)·e^(−i∙[E·t − p∙x]/ħ) = −i·(ħ/(2meff))·a·(p²/ħ²)·e^(−i∙[E·t − p∙x]/ħ)

⇔ E = p²/(2meff) ⇔ meff = m∙(v/c)²/2 = m∙β²/2

It is an ugly formula: it resembles the kinetic energy formula (K.E. = m∙v²/2) but it is, in fact, something completely different. The β²/2 factor ensures the effective mass is always a fraction of the mass itself. To get rid of the ugly 1/2 factor, we may re-define meff as two times the old meff (hence, meffNEW = 2∙meffOLD), as a result of which the formula will look somewhat better:

meff = m∙(v/c)² = m∙β²
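The algebra can be verified numerically. The sketch below solves E = p²/(2·meff) for meff, with E = m·c² and p = m·v, and compares the result with m·β²/2. The mass value and the function name effective_mass are assumptions chosen for the example only.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def effective_mass(m, v):
    """Solve E = p²/(2·meff) for meff, using E = m·c² and p = m·v."""
    energy = m * C**2
    momentum = m * v
    return momentum**2 / (2 * energy)

m = 9.1093837015e-31  # assumed mass in kg (electron mass, for illustration)

for beta in (0.1, 0.5, 0.9):
    m_eff = effective_mass(m, beta * C)
    print(f"β = {beta}: meff/m = {m_eff / m:.4f}   β²/2 = {beta**2 / 2:.4f}")
```

Doubling these values gives the re-defined meff = m·β², which varies between 0 and m as β varies between 0 and 1.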

We know β varies between 0 and 1 and, therefore, meff will vary between 0 and m. Feynman drops the subscript, and just writes meff as m in his textbook (see Feynman, III-19). On the other hand, the electron mass as used is also the electron mass that is used to calculate the size of an atom (see Feynman, III-2-4). As such, the two mass concepts are, effectively, mutually compatible. It is confusing because the same mass is often defined as the mass of a stationary electron (see, for example, the article on it in the online Wikipedia encyclopedia[28]).

In the context of the derivation of the electron orbitals, we do have the potential energy term – which is the equivalent of a source term in a diffusion equation – and that may explain why the above-mentioned meff = m∙(v/c)² = m∙β² formula does not apply.

References

This paper discusses general principles in physics only. Hence, references can be limited to references to physics textbooks only. For ease of reading, any reference to additional material has been limited to a more popular undergrad textbook that can be consulted online: Feynman’s Lectures on Physics (http://www.feynmanlectures.caltech.edu). References are per volume, per chapter and per section. For example, Feynman III-19-3 refers to Volume III, Chapter 19, Section 3.

Notes

[1] Of course, an actual particle is localized in space and can, therefore, not be represented by the elementary wavefunction ψ = a·e^(−i∙θ) = a·e^(−i[E·t − p∙x]/ħ) = a·cosθ − i·a·sinθ. We must build a wave packet for that: a sum of wavefunctions, each with its own amplitude ak and its own argument θk = (Ek∙t – pk∙x)/ħ. This is dealt with in this paper as part of the discussion on the mathematical and physical interpretation of the normalization condition.

[2] The N/kg dimension immediately, and naturally, reduces to the dimension of acceleration (m/s²), thereby facilitating a direct interpretation in terms of Newton’s force law.

[3] In physics, a two-spring metaphor is more common. Hence, the pistons in the author’s perpetuum mobile may be replaced by springs.

[4] The author re-derives the equation for the Compton scattering radius in section VII of the paper.

[5] The magnetic force can be analyzed as a relativistic effect (see Feynman II-13-6). The dichotomy between the electric force as a polar vector and the magnetic force as an axial vector disappears in the relativistic four-vector representation of electromagnetism.

[6] For example, when using Schrödinger’s equation in a central field (think of the electron around a proton), the use of polar coordinates is recommended, as it ensures the symmetry of the Hamiltonian under all rotations (see Feynman III-19-3)

[7] This sentiment is usually summed up in the apocryphal quote: “God does not play dice.” The actual quote comes out of one of Einstein’s private letters to Cornelius Lanczos, another scientist who had also emigrated to the US. The full quote is as follows: “You are the only person I know who has the same attitude towards physics as I have: belief in the comprehension of reality through something basically simple and unified… It seems hard to sneak a look at God’s cards. But that He plays dice and uses ‘telepathic’ methods… is something that I cannot believe for a single moment.” (Helen Dukas and Banesh Hoffmann, Albert Einstein, the Human Side: New Glimpses from His Archives, 1979)

[8] Of course, both are different velocities: ω is an angular velocity, while v is a linear velocity: ω is measured in radians per second, while v is measured in meter per second. However, the definition of a radian implies radians are measured in distance units. Hence, the physical dimensions are, effectively, the same. As for the formula for the total energy of an oscillator, we should actually write: E = m·a²∙ω²/2. The additional factor (a) is the (maximum) amplitude of the oscillator.

[9] We also have a 1/2 factor in the E = mv²/2 formula. Two remarks may be made here. First, it may be noted this is a non-relativistic formula and, more importantly, incorporates kinetic energy only. Using the Lorentz factor (γ), we can write the relativistically correct formula for the kinetic energy as K.E. = E − E0 = mv·c² − m0·c² = m0·γ·c² − m0·c² = m0·c²·(γ − 1). As for the exclusion of the potential energy, we may note that we may choose our reference point for the potential energy such that the kinetic and potential energy mirror each other. The energy concept that then emerges is the one that is used in the context of the Principle of Least Action: it equals E = mv². Appendix 1 provides some notes on that.

[10] Instead of two cylinders with pistons, one may also think of connecting two springs with a crankshaft.

[11] It is interesting to note that we may look at the energy in the rotating flywheel as potential energy because it is energy that is associated with motion, albeit circular motion. In physics, one may associate a rotating object with kinetic energy using the rotational equivalent of mass and linear velocity, i.e. rotational inertia (I) and angular velocity ω. The kinetic energy of a rotating object is then given by K.E. = (1/2)·I·ω².

[12] Because of the sideways motion of the connecting rods, the sinusoidal function will describe the linear motion only approximately, but you can easily imagine the idealized limit situation.

[13] The ω² = 1/LC formula gives us the natural or resonant frequency for an electric circuit consisting of a resistor (R), an inductor (L), and a capacitor (C). Writing the formula as ω² = C⁻¹/L introduces the concept of elastance, which is the equivalent of the mechanical stiffness (k) of a spring.

[14] The resistance in an electric circuit introduces a damping factor. When analyzing a mechanical spring, one may also want to introduce a drag coefficient. Both are usually defined as a fraction of the inertia, which is the mass for a spring and the inductance for an electric circuit. Hence, we would write the drag coefficient for a spring as γm and the resistance for a circuit as R = γL.

[15] Photons are emitted by atomic oscillators: atoms going from one state (energy level) to another. Feynman (Lectures, I-33-3) shows us how to calculate the Q of these atomic oscillators: it is of the order of 10⁸, which means the wave train will last about 10⁻⁸ seconds (to be precise, that is the time it takes for the radiation to die out by a factor 1/e). For example, for sodium light, the radiation will last about 3.2×10⁻⁸ seconds (this is the so-called decay time τ). Now, because the frequency of sodium light is some 500 THz (500×10¹² oscillations per second), this makes for some 16 million oscillations. There is an interesting paradox here: the speed of light tells us that such a wave train will have a length of about 9.6 m! How is that to be reconciled with the pointlike nature of a photon? The paradox can only be explained by relativistic length contraction: in an analysis like this, one needs to distinguish the reference frame of the photon – riding along the wave as it is being emitted, so to speak – and our stationary reference frame, which is that of the emitting atom.

[16] This is a general result and is reflected in the K.E. = T = (1/2)·m·ω²·a²·sin²(ω·t + Δ) and the P.E. = U = k·x²/2 = (1/2)·m·ω²·a²·cos²(ω·t + Δ) formulas for the linear oscillator.

[17] Feynman further formalizes this in his Lecture on Superconductivity (Feynman, III-21-2), in which he refers to Schrödinger’s equation as the “equation for continuity of probabilities”. The analysis is centered on the local conservation of energy, which confirms the interpretation of Schrödinger’s equation as an energy diffusion equation.

[18] The meff is the effective mass of the particle, which depends on the medium. For example, an electron traveling in a solid (a transistor, for example) will have a different effective mass than in an atom. In free space, we can drop the subscript and just write meff = m. Appendix 2 provides some additional notes on the concept. As for the equations, they are easily derived from noting that two complex numbers a + i∙b and c + i∙d are equal if, and only if, their real and imaginary parts are the same. Now, the ∂ψ/∂t = i∙(ħ/meff)∙∇²ψ equation amounts to writing something like this: a + i∙b = i∙(c + i∙d). Now, remembering that i² = −1, you can easily figure out that i∙(c + i∙d) = i∙c + i²∙d = − d + i∙c.

[19] The dimension of B is usually written as N/(m∙A), using the SI unit for current, i.e. the ampere (A). However, 1 C = 1 A∙s and, hence, 1 N/(m∙A) = 1 (N/C)/(m/s).

[20] Of course, multiplication with i amounts to a counterclockwise rotation. Hence, multiplication by –i also amounts to a rotation by 90 degrees, but clockwise. Now, to uniquely identify the clockwise and counterclockwise directions, we need to establish the equivalent of the right-hand rule for a proper geometric interpretation of Schrödinger’s equation in three-dimensional space: if we look at a clock from the back, then its hand will be moving counterclockwise. When writing B = (1/c)∙iE, we assume we are looking in the negative x-direction. If we are looking in the positive x-direction, we should write: B = -(1/c)∙iE. Of course, Nature does not care about our conventions. Hence, both should give the same results in calculations. We will show in a moment they do.

[21] In fact, when multiplying C²/(N·m²) with N²/C², we get N/m², but we can multiply this with 1 = m/m to get the desired result. It is significant that an energy density (joule per unit volume) can also be measured in newton per square meter (force per unit area).

[22] The illustration shows a linearly polarized wave, but the obtained result is general.

[23] The sine and cosine are essentially the same functions, except for the difference in the phase: sinθ = cos(θ−π /2).

[24] I must thank a physics blogger for re-writing the 1/(ε0·μ0) = c2 equation like this. See: http://reciprocal.systems/phpBB3/viewtopic.php?t=236 (retrieved on 29 September 2017).

[25] A circularly polarized electromagnetic wave may be analyzed as consisting of two perpendicular electromagnetic plane waves of equal amplitude and 90° difference in phase.

[26] Of course, the reader will now wonder: what about neutrons? How to explain neutron spin? Neutrons are neutral. That is correct, but neutrons are not elementary: they consist of (charged) quarks. Hence, neutron spin can (or should) be explained by the spin of the underlying quarks.

[27] We detailed the mathematical framework and detailed calculations in the following online article: https://readingfeynman.org/2017/09/15/the-principle-of-least-action-re-visited.

[28] https://en.wikipedia.org/wiki/Electron_rest_mass (retrieved on 29 September 2017).

Quantum Mechanics: The Other Introduction

About three weeks ago, I brought my most substantial posts together in one document: it’s the Deep Blue page of this site. I also published it on Amazon/Kindle. It’s nice. It crowns many years of self-study, and many nights of short and bad sleep – as I was mulling over yet another paradox haunting me in my dreams. It’s been an extraordinary climb but, frankly, the view from the top is magnificent. 🙂

The offer is there: anyone who is willing to go through it and offer constructive and/or substantial comments will be included in the book’s acknowledgements section when I go for a second edition (which it needs, I think). First person to be acknowledged here is my wife though, Maria Elena Barron, as she has given me the spacetime and, more importantly, the freedom to take this bull by its horns.

Below I just copy the foreword, just to give you a taste of it. 🙂

Foreword

Another introduction to quantum mechanics? Yep. I am not hoping to sell many copies, but I do hope my unusual background—I graduated as an economist, not as a physicist—will encourage you to take on the challenge and grind through this.

I’ve always wanted to thoroughly understand, rather than just vaguely know, those quintessential equations: the Lorentz transformations, the wavefunction and, above all, Schrödinger’s wave equation. In my bookcase, I’ve always had what is probably the most famous physics course in the history of physics: Richard Feynman’s Lectures on Physics, which have been used for decades, not only at Caltech but at many of the best universities in the world. Plus a few dozen other books. Popular books—which I now regret I ever read, because they were an utter waste of time: the language of physics is math and, hence, one should read physics in math—not in any other language.

But Feynman’s Lectures on Physics – three volumes of about fifty chapters each – are not easy to read. However, the experimental verification of the existence of the Higgs particle in CERN’s LHC accelerator a couple of years ago, and the award of the Nobel prize to the scientists who had predicted its existence (including Peter Higgs and François Englert), convinced me it was about time I take the bull by its horns. While I consider myself to be of average intelligence only, I do feel there’s value in the ideal of the ‘Renaissance man’ and, hence, I think stuff like this is something we all should try to understand – somehow. So I started to read, and I also started a blog (www.readingfeynman.org) to externalize my frustration as I tried to cope with the difficulties involved. The site attracted hundreds of visitors every week and, hence, it encouraged me to publish this booklet.

So what is it about? What makes it special? In essence, it is a common-sense introduction to the key concepts in quantum physics. However, while common-sense, it does not shy away from the math, which is complicated, but not impossible. So this little book is surely not a Guide to the Universe for Dummies. I do hope it will guide some Not-So-Dummies. It basically recycles what I consider to be my more interesting posts, but combines them in a comprehensive structure.

It is a bit of a philosophical analysis of quantum mechanics as well, as I will – hopefully – do a better job than others in distinguishing the mathematical concepts from what they are supposed to describe, i.e. physical reality.

Last but not least, it does offer some new didactic perspectives. For those who know the subject already, let me briefly point these out:

I. Few, if any, of the popular writers seem to have noted that the argument of the wavefunction (θ = E·t – p·x) – using natural units (hence, the numerical value of ħ and c is one), and for an object moving at constant velocity (hence, x = v·t) – can be written as the product of the proper time of the object and its rest mass:

θ = E·t − p·x = m_v·t − m_v·v·x = m_v·(t − v·x), with m_v = m₀/√(1 − v²) the relativistic mass (so E = m_v and p = m_v·v)

⇔ θ = m₀·(t − v·x)/√(1 − v²) = m₀·t′

Hence, the argument of the wavefunction is just the proper time of the object with the rest mass acting as a scaling factor for the time: the internal clock of the object ticks much faster if it’s heavier. This symmetry between the argument of the wavefunction of the object as measured in its own (inertial) reference frame, and its argument as measured by us, in our own reference frame, is remarkable, and allows us to understand the nature of the wavefunction in a more intuitive way.
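The identity is easy to check numerically. What follows is only a minimal sketch: the function names and the sample values are my own illustration (they are not in the booklet), and natural units (c = ħ = 1) are assumed throughout.

```python
import math

def phase_lab_frame(m0, v, t, x):
    """Argument θ = E·t − p·x as measured in our frame (c = ħ = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    energy = gamma * m0          # E = m_v
    momentum = gamma * m0 * v    # p = m_v·v
    return energy * t - momentum * x

def phase_rest_frame(m0, v, t, x):
    """Argument θ = m0·t′, with t′ the Lorentz-transformed time."""
    t_prime = (t - v * x) / math.sqrt(1.0 - v * v)
    return m0 * t_prime

# Hypothetical sample values: rest mass 2, velocity 0.6, event at t = 3, x = 1.
print(phase_lab_frame(2.0, 0.6, 3.0, 1.0))
print(phase_rest_frame(2.0, 0.6, 3.0, 1.0))
```

Both functions should return the same number for any v below 1, which is all the θ = m₀·t′ formula says.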

While this approach reflects Feynman’s idea of the photon stopwatch, the presentation in this booklet generalizes the concept for all wavefunctions, first and foremost the wavefunction of the matter-particles that we’re used to (e.g. electrons).

II. Few, if any, have thought of looking at Schrödinger’s wave equation as an energy propagation mechanism. In fact, when helping my daughter out as she was trying to understand non-linear regression (logit and Poisson regressions), I suddenly realized we can analyze the wavefunction as a link function that connects two physical spaces: the physical space of our moving object, and a physical energy space.

Re-inserting Planck’s quantum of action in the argument of the wavefunction – so we write θ as θ = (E/ħ)·t – (p/ħ)·x = [E·t – p·x]/ħ – we may assign a physical dimension to it: when interpreting ħ as a scaling factor only (and, hence, when we only consider its numerical value, not its physical dimension), θ becomes a quantity expressed in newton·meter·second, i.e. the (physical) dimension of action. It is only natural, then, that we would associate the real and imaginary part of the wavefunction with some physical dimension too, and a dimensional analysis of Schrödinger’s equation tells us this dimension must be energy.

This perspective allows us to look at the wavefunction as an energy propagation mechanism, with the real and imaginary part of the probability amplitude interacting in very much the same way as the electric and magnetic field vectors E and B. This leads me to the next point, which I make rather emphatically in this booklet:  the propagation mechanism for electromagnetic energy – as described by Maxwell’s equations – is mathematically equivalent to the propagation mechanism that’s implicit in the Schrödinger equation.

I am, therefore, able to present the Schrödinger equation in a much more coherent way, describing not only how this famous equation works for electrons, or matter-particles in general (i.e. fermions or spin-1/2 particles), which is probably the only use of the Schrödinger equation you are familiar with, but also how it works for bosons, including the photon, of course, but also the theoretical zero-spin boson!

In fact, I am personally rather proud of this. Not because I am doing something that hasn’t been done before (I am sure many have come to the same conclusions before me), but because one always has to trust one’s intuition. So let me say something about that third innovation: the photon wavefunction.

III. Let me tell you the little story behind my photon wavefunction. One of my acquaintances is a retired nuclear scientist. While he knew I was delving into it all, I knew he had little time to answer any of my queries. However, when I asked him about the wavefunction for photons, he bluntly told me photons didn’t have a wavefunction. I should just study Maxwell’s equations and that’s it: there’s no wavefunction for photons, just this traveling electric and magnetic field vector. Look at Feynman’s Lectures, or any textbook, he said. None of them talk about photon wavefunctions. That’s true, but I knew he had to be wrong. I mulled over it for several months, and then just sat down and started to fiddle with Maxwell’s equations, assuming the oscillations of the E and B vector could be described by regular sinusoids. And – Lo and behold! – I derived a wavefunction for the photon. It’s fully equivalent to the classical description, but the new expression solves the Schrödinger equation, if we modify it in a rather logical way: we have to double the diffusion constant, which makes sense, because E and B give you two waves for the price of one!

[…]

In any case, I am getting ahead of myself here, and so I should wrap up this rather long introduction. Let me just say that, through my rather long journey in search of understanding – rather than knowledge alone – I have learned there are so many wrong answers out there: wrong answers that hamper rather than promote a better understanding. Moreover, I was most shocked to find out that such wrong answers are not the preserve of amateurs alone! This emboldened me to write what I write here, and to publish it. Quantum mechanics is a logical and coherent framework, and it is not all that difficult to understand. One just needs good pointers, and that’s what I want to provide here.

As of now, it focuses on the mechanics in particular, i.e. the concept of the wavefunction and wave equation (better known as Schrödinger’s equation). The other aspect of quantum mechanics – i.e. the idea of uncertainty as implied by the quantum idea – will receive more attention in a later version of this document. I should also say I will limit myself to quantum electrodynamics (QED) only, so I won’t discuss quarks (i.e. quantum chromodynamics, which is an entirely different realm), nor will I delve into any of the other more recent advances of physics.

In the end, you’ll still be left with lots of unanswered questions. However, that’s quite OK, as Richard Feynman himself was of the opinion that he himself did not understand the topic the way he would like to understand it. But then that’s exactly what draws all of us to quantum physics: a common search for a deep and full understanding of reality, rather than just some superficial description of it, i.e. knowledge alone.

So let’s get on with it. I am not saying this is going to be easy reading. In fact, I blogged about much easier stuff than this in my blog, treating only aspects of the whole theory. This is the whole thing, and it’s not easy to swallow. In fact, it may well be too big to swallow as a whole. But please do give it a try. I wanted this to be an intuitive but formally correct introduction to quantum math. However, when everything is said and done, you are the only one who can judge if I reached that goal.

Of course, I should not forget the acknowledgements but… Well… It was a rather lonely venture, so I am only going to acknowledge my wife here, Maria, who gave me all of the spacetime and all of the freedom I needed, as I would get up early, or work late after coming home from my regular job. I sacrificed weekends, which we could have spent together, and – when mulling over yet another paradox – the nights were often short and bad. Frankly, it’s been an extraordinary climb, but the view from the top is magnificent.

I just need to insert one caution: my site (www.readingfeynman.org) includes animations, which make it much easier to grasp some of the mathematical concepts that I will be explaining. Hence, I warmly recommend you also have a look at that site, and its Deep Blue page in particular – as that page has the same contents, more or less, but the animations make it a much easier read.

Have fun with it!

Jean Louis Van Belle, BA, MA, BPhil, Drs.

All what you ever wanted to know about the photon wavefunction…

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter that is being presented here, which you can find by clicking on the provided link.

Original post:

This post is, essentially, a continuation of my previous post, in which I juxtaposed the following images:

Both are the same, and then they’re not. The illustration on the right-hand side is a regular quantum-mechanical wavefunction, i.e. an amplitude wavefunction. You’ve seen that one before. In this case, the x-axis represents time, so we’re looking at the wavefunction at some particular point in space. [You know we can just switch the dimensions and it would all look the same.] The illustration on the left-hand side looks similar, but it’s not an amplitude wavefunction. The animation shows how the electric field vector (E) of an electromagnetic wave travels through space. Its shape is the same. So it’s the same function. Is it also the same reality?

Yes and no. And I would say: more no than yes—in this case, at least. Note that the animation does not show the accompanying magnetic field vector (B). That vector is equally essential in the electromagnetic propagation mechanism according to Maxwell’s equations, which—let me remind you—are equal to:

1. ∂B/∂t = –∇×E
2. ∂E/∂t = ∇×B

In fact, I should write the second equation as ∂E/∂t = c²·∇×B, but then I assume we measure time and distance in equivalent units, so c = 1.

You know that E and B are two aspects of one and the same thing: if we have one, then we have the other. To be precise, B is always orthogonal to E, in the direction that’s given by the right-hand rule for the following vector cross-product: B = ex×E, with ex the unit vector pointing in the x-direction (i.e. the direction of propagation). The reality behind it is illustrated below for a linearly polarized electromagnetic wave.

The B = ex×E equation is equivalent to writing B = i·E, which is equivalent to:

B = i·E = e^{i(π/2)}·e^{i(kx − ωt)} = cos(kx − ωt + π/2) + i·sin(kx − ωt + π/2)

= −sin(kx − ωt) + i·cos(kx − ωt)

Now, E and B have only two components: Ey and Ez, and By and Bz. That’s only because we’re looking at some ideal or elementary electromagnetic wave here but… Well… Let’s just go along with it. 🙂 It is then easy to prove that the equation above amounts to writing:

1. By = cos(kx − ωt + π/2) = −sin(kx − ωt) = −Ez
2. Bz = sin(kx − ωt + π/2) = cos(kx − ωt) = Ey

We should now think of Ey and Ez as the real and imaginary part of some wavefunction, which we’ll denote as ψE = e^{i(kx − ωt)}. So we write:

E = (Ey, Ez) = Ey + i·Ez = cos(kx − ωt) + i∙sin(kx − ωt) = Re(ψE) + i·Im(ψE) = ψE = e^{i(kx − ωt)}

What about B? We just do the same, so we write:

B = (By, Bz) = By + i·Bz = ψB = i·E = i·ψE = −sin(kx − ωt) + i∙cos(kx − ωt) = −Im(ψE) + i·Re(ψE)
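If you want to see the By = −Ez and Bz = Ey relations come out of the B = i·E formula, here is a minimal sketch using Python’s complex numbers. The function name and the sample values for k, ω, x and t are just my assumptions for the illustration.

```python
import cmath
import math

def field_wavefunctions(k, omega, x, t):
    """Return (psi_E, psi_B) with psi_E = Ey + i·Ez and psi_B = By + i·Bz = i·psi_E."""
    theta = k * x - omega * t
    psi_E = cmath.exp(1j * theta)   # Ey = cos(theta), Ez = sin(theta)
    psi_B = 1j * psi_E              # multiplying by i is a rotation by π/2
    return psi_E, psi_B

# Hypothetical sample point; omega = k because c = 1.
psi_E, psi_B = field_wavefunctions(k=2.0, omega=2.0, x=0.3, t=0.1)
print("By =", psi_B.real, " -Ez =", -psi_E.imag)
print("Bz =", psi_B.imag, "  Ey =", psi_E.real)
```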

Now we need to prove that ψE and ψB are regular wavefunctions, which amounts to proving Schrödinger’s equation, i.e. ∂ψ/∂t = i·(ħ/m)·∇²ψ, for both ψE and ψB. [Note I use Schrödinger’s equation for a zero-mass spin-zero particle here, which uses the ħ/m factor rather than the ħ/(2m) factor.] To prove that ψE and ψB are regular wavefunctions, we should prove that:

1. Re(∂ψE/∂t) = −(ħ/m)·Im(∇²ψE) and Im(∂ψE/∂t) = (ħ/m)·Re(∇²ψE), and
2. Re(∂ψB/∂t) = −(ħ/m)·Im(∇²ψB) and Im(∂ψB/∂t) = (ħ/m)·Re(∇²ψB).

Let’s do the calculations for the second pair of equations. The time derivative on the left-hand side is equal to:

∂ψB/∂t = −i·ω·i·e^{i(kx − ωt)} = ω·[cos(kx − ωt) + i·sin(kx − ωt)] = ω·cos(kx − ωt) + i·ω·sin(kx − ωt)

The second-order derivative on the right-hand side is equal to:

∇²ψB = ∂²ψB/∂x² = i·(i·k)²·e^{i(kx − ωt)} = −i·k²·e^{i(kx − ωt)} = k²·sin(kx − ωt) − i·k²·cos(kx − ωt)

So the two equations for ψB are equivalent to writing:

1. Re(∂ψB/∂t) = −(ħ/m)·Im(∇²ψB) ⇔ ω·cos(kx − ωt) = k²·(ħ/m)·cos(kx − ωt)
2. Im(∂ψB/∂t) = (ħ/m)·Re(∇²ψB) ⇔ ω·sin(kx − ωt) = k²·(ħ/m)·sin(kx − ωt)

So we see that both conditions are fulfilled if, and only if, ω = k²·(ħ/m).
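The same condition can be checked numerically with finite differences instead of doing the derivatives by hand. This is a rough sketch only: the value of ħ/m, the sample point and the step size are arbitrary assumptions of mine, and the function names are made up for the illustration.

```python
import cmath

HBAR_OVER_M = 0.5  # hypothetical value for ħ/m, chosen for illustration only

def psi_B(x, t, k, omega):
    """psi_B = i·e^{i(kx − ωt)}"""
    return 1j * cmath.exp(1j * (k * x - omega * t))

def schrodinger_residual(k, omega, x=0.3, t=0.2, h=1e-4):
    """|∂ψ/∂t − i·(ħ/m)·∂²ψ/∂x²| estimated with central differences."""
    dpsi_dt = (psi_B(x, t + h, k, omega) - psi_B(x, t - h, k, omega)) / (2 * h)
    d2psi_dx2 = (psi_B(x + h, t, k, omega) - 2 * psi_B(x, t, k, omega)
                 + psi_B(x - h, t, k, omega)) / h**2
    return abs(dpsi_dt - 1j * HBAR_OVER_M * d2psi_dx2)

k = 3.0
print(schrodinger_residual(k, omega=k**2 * HBAR_OVER_M))  # close to zero
print(schrodinger_residual(k, omega=k))                   # clearly non-zero
```

The residual only vanishes (up to the finite-difference error) when ω = k²·(ħ/m), which is the condition we just derived.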

Now, we also demonstrated in that post of mine that Maxwell’s equations imply the following:

1. ∂By/∂t = –(∇×E)y = ∂Ez/∂x = ∂[sin(kx − ωt)]/∂x = k·cos(kx − ωt) = k·Ey
2. ∂Bz/∂t = –(∇×E)z = – ∂Ey/∂x = – ∂[cos(kx − ωt)]/∂x = k·sin(kx − ωt) = k·Ez

Hence, using those By = −Ez and Bz = Ey equations above, we can also calculate these derivatives as:

1. ∂By/∂t = −∂Ez/∂t = −∂sin(kx − ωt)/∂t = ω·cos(kx − ωt) = ω·Ey
2. ∂Bz/∂t = ∂Ey/∂t = ∂cos(kx − ωt)/∂t = −ω·[−sin(kx − ωt)] = ω·Ez

In other words, Maxwell’s equations imply that ω = k, which is consistent with us measuring time and distance in equivalent units, so the phase velocity is c = 1 = ω/k.

So far, so good. We basically established that the propagation mechanism for an electromagnetic wave, as described by Maxwell’s equations, is fully coherent with the propagation mechanism—if we can call it like that—as described by Schrödinger’s equation. We also established the following equalities:

1. ω = k
2. ω = k²·(ħ/m)

The second of the two de Broglie equations tells us that k = p/ħ, so we can combine these two equations and re-write these two conditions as:

ω/k = 1 = k·(ħ/m) = (p/ħ)·(ħ/m) = p/m ⇔ p = m

What does this imply? The p here is the momentum: p = m·v, so this condition implies v must be equal to 1 too, so the wave velocity is equal to the speed of light. Makes sense, because we actually are talking light here. 🙂 In addition, because it’s light, we also know E/p = c = 1, so we have – once again – the general E = p = m equation, which we’ll need!

OK. Next. Let’s write the Schrödinger wave equation for both wavefunctions:

1. ∂ψE/∂t = i·(ħ/mE)·∇²ψE, and
2. ∂ψB/∂t = i·(ħ/mB)·∇²ψB.

Huh? What’s mE and mB? We should only associate one mass concept with our electromagnetic wave, shouldn’t we? Perhaps. I just want to be on the safe side now. Of course, if we distinguish mE and mB, we should probably also distinguish pE and pB, and EE and EB as well, right? Well… Yes. If we accept this line of reasoning, then the mass factor in Schrödinger’s equations is pretty much like the 1/c² = μ₀ε₀ factor in Maxwell’s (1/c²)·∂E/∂t = ∇×B equation: the mass factor appears as a property of the medium, i.e. the vacuum here! [Just check my post on physical constants in case you wonder what I am trying to say here, in which I explain why and how c defines the (properties of the) vacuum.]

To be consistent, we should also distinguish pE and pB, and EE and EB, and so we should write ψE and ψB as:

1. ψE = ei(kEx − ωEt), and
2. ψB = ei(kBx − ωBt).

Huh? Yes. I know what you think: we’re talking one photon – or one electromagnetic wave – so there can be only one energy, one momentum and, hence, only one k, and one ω. Well… Yes and no. Of course, the following identities should hold: kE = kB and, likewise, ωE = ωB. So… Yes. They’re the same: one k and one ω. But then… Well… Conceptually, the two k’s and ω’s are different. So we write:

1. pE = EE = mE, and
2. pB = EB = mB.

The obvious question is: can we just add them up to find the total energy and momentum of our photon? The answer is obviously positive: E = EE + EB, p = pE + pB and m = mE + mB.

Let’s check a few things now. How does it work for the phase and group velocity of ψE and ψB? Simple:

1. vg = ∂ωE/∂kE = ∂[EE/ħ]/∂[pE/ħ] = ∂EE/∂pE = ∂pE/∂pE = 1
2. vp = ωE/kE = (EE/ħ)/(pE/ħ) = EE/pE = pE/pE = 1

So we’re fine, and you can check the result for ψB by replacing the subscript E with B. To sum it all up, what we’ve got here is the following:

1. We can think of a photon having some energy that’s equal to E = p = m (assuming c = 1), but that energy would be split up in an electric and a magnetic wavefunction respectively: ψE and ψB.
2. Schrödinger’s equation applies to both wavefunctions, but the E, p and m in those two wavefunctions are the same and not the same: their numerical value is the same (pE = EE = mE = pB = EB = mB), but they’re conceptually different. They must be: if not, we’d get a phase and group velocity for the wave that doesn’t make sense.

Of course, the phase and group velocity for the sum of the ψE and ψB waves must also be equal to c. This is obviously the case, because we’re adding waves with the same phase and group velocity c, so there’s no issue with the dispersion relation.

So let’s insert those pE = EE = mE = pB = EB = mB values in the two wavefunctions. For ψE, we get:

ψE = e^{i(kE·x − ωE·t)} = e^{i[(pE/ħ)·x − (EE/ħ)·t]}

You can do the calculation for ψB yourself. Let’s simplify our life a little bit and assume we’re using Planck units, so ħ = 1, and so the wavefunction simplifies to ψE = e^{i·(pE·x − EE·t)}. We can now add the components of E and B using the summation formulas for sines and cosines:

1. By + Ey = cos(pB·x − EB·t + π/2) + cos(pE·x − EE·t) = 2·cos[(p·x − E·t + π/2)/2]·cos(π/4) = √2·cos(p·x/2 − E·t/2 + π/4)

2. Bz + Ez = sin(pB·x − EB·t + π/2) + sin(pE·x − EE·t) = 2·sin[(p·x − E·t + π/2)/2]·cos(π/4) = √2·sin(p·x/2 − E·t/2 + π/4)

Interesting! We find a composite wavefunction for our photon which we can write as:

E + B = ψE + ψB = E + i·E = √2·e^{i(p·x/2 − E·t/2 + π/4)} = √2·e^{i(π/4)}·e^{i(p·x/2 − E·t/2)} = √2·e^{i(π/4)}·E

What a great result! It’s easy to double-check, because we can see the E + i·E = √2·e^{i(π/4)}·E formula implies that 1 + i should equal √2·e^{i(π/4)}. Now that’s easy to prove, either geometrically (just do a drawing) or formally: √2·e^{i(π/4)} = √2·[cos(π/4) + i·sin(π/4)] = (√2/√2) + i·(√2/√2) = 1 + i. We’re bang on! 🙂
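For those who’d rather let the computer do the drawing, here is a small numerical sketch of the same check. The function name is hypothetical, and θ stands for pE·x − EE·t, which is the same as (p·x − E·t)/2 under the E = EE + EB and p = pE + pB assumption above.

```python
import cmath
import math

def composite_wavefunction(theta):
    """psi_E + psi_B, with psi_E = e^{iθ} and psi_B = i·psi_E."""
    psi_E = cmath.exp(1j * theta)
    psi_B = 1j * psi_E
    return psi_E + psi_B

prefactor = math.sqrt(2) * cmath.exp(1j * math.pi / 4)
print(prefactor)  # should be very close to 1 + 1j

theta = 0.7  # arbitrary sample phase
print(composite_wavefunction(theta))
print(math.sqrt(2) * cmath.exp(1j * (theta + math.pi / 4)))
```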

We can double-check once more, because we should get the same from adding E and B = i·E, right? Let’s try:

E + B = E + i·E = cos(pE·x − EE·t) + i·sin(pE·x − EE·t) + i·cos(pE·x − EE·t) − sin(pE·x − EE·t)

= [cos(pE·x − EE·t) – sin(pE·x − EE·t)] + i·[sin(pE·x − EE·t) + cos(pE·x − EE·t)]

Indeed, we can see we’re going to obtain the same result, because the −sinθ in the real part of our composite wavefunction is equal to cos(θ+π/2), and the cosθ in its imaginary part is equal to sin(θ+π/2). So the sum above is the same sum of cosines and sines that we did already.

So our electromagnetic wavefunction, i.e. the wavefunction for the photon, is equal to:

ψ = ψE + ψB = √2·e^{i(p·x/2 − E·t/2 + π/4)} = √2·e^{i(π/4)}·e^{i(p·x/2 − E·t/2)}

What about the √2 factor in front, and the π/4 term in the argument itself? Not sure. It must have something to do with the way the magnetic force works, which is not like the electric force. Indeed, remember the Lorentz formula: the force on some unit charge (q = 1) will be equal to F = E + v×B. So… Well… We’ve got another cross-product here and so the geometry of the situation is quite complicated: it’s not like adding two forces F1 and F2 to get some combined force F = F1 + F2.

In any case, we need the energy, and we know that it’s proportional to the square of the amplitude, so… Well… We’re spot on: the square of the √2 factor in the √2·cos product and √2·sin product is 2, so that’s twice… Well… What? Hold on a minute! We’re actually taking the absolute square of the E + B = ψE + ψB = E + i·E = √2·e^{i(p·x/2 − E·t/2 + π/4)} wavefunction here. Is that legal? I must assume it is, although… Well… Yes. You’re right. We should do some more explaining here.

We know that we usually measure the energy as some definite integral, from t = 0 to some other point in time, or over the cycle of the oscillation. So what’s the cycle here? Our combined wavefunction can be written as √2·ei(p·x/2 − E·t/2 + π/4) = √2·ei(θ/2 + π/4), so a full cycle would correspond to θ going from 0 to 4π here, rather than from 0 to 2π. So that explains the √2 factor in front of our wave equation.

Bingo! If you were looking for an interpretation of the Planck energy and momentum, here it is. And, while everything that’s written above is not easy to understand, it’s close to the ‘intuitive’ understanding of quantum mechanics that we were looking for, isn’t it? The quantum-mechanical propagation model explains everything now. 🙂 I only need to show one more thing, and that’s the different behavior of bosons and fermions:

1. The amplitudes of identical bosonic particles interfere with a positive sign, so we have Bose-Einstein statistics here. As Feynman writes it: (amplitude direct) + (amplitude exchanged).
2. The amplitudes of identical fermionic particles interfere with a negative sign, so we have Fermi-Dirac statistics here: (amplitude direct) − (amplitude exchanged).

I’ll think about it. I am sure it’s got something to do with that B = i·E formula or, to put it simply, with the fact that, when bosons are involved, we get two wavefunctions (ψE and ψB) for the price of one. The reasoning should be something like this:

I. For a massless particle (i.e. a zero-mass fermion), our wavefunction is just ψ = ei(p·x − E·t). So we have no √2 or √2·ei(π/4) factor in front here. So we can just add any number of them – ψ1 + ψ2 + ψ3 + … – and then take the absolute square of the amplitude to find a probability density, and we’re done.

II. For a photon (i.e. a zero-mass boson), our wavefunction is √2·ei(π/4)·ei(p·x − E·t)/2, which – let’s introduce a new symbol – we’ll denote by φ, so φ = √2·ei(π/4)·ei(p·x − E·t)/2. Now, if we add any number of these, we get a similar sum but with that √2·ei(π/4) factor in front, so we write: φ1 + φ2 + φ3 + … = √2·ei(π/4)·(ψ1 + ψ2 + ψ3 + …). If we take the absolute square now, we’ll see the probability density will be equal to twice the density for the ψ1 + ψ2 + ψ3 + … sum, because

|√2·e^{i(π/4)}·(ψ1 + ψ2 + ψ3 + …)|² = |√2·e^{i(π/4)}|²·|ψ1 + ψ2 + ψ3 + …|² = 2·|ψ1 + ψ2 + ψ3 + …|²
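The doubling is just a property of the absolute square, as the following sketch shows. The three sample amplitudes are arbitrary unit-modulus numbers that I picked for the illustration, and the function name is hypothetical.

```python
import cmath
import math

def probability_density(amplitudes, prefactor=1.0):
    """Absolute square of prefactor times the sum of the amplitudes."""
    return abs(prefactor * sum(amplitudes)) ** 2

# Three arbitrary sample amplitudes (psi1, psi2, psi3).
psis = [cmath.exp(1j * 0.3), cmath.exp(1j * 1.1), cmath.exp(1j * 2.0)]
photon_prefactor = math.sqrt(2) * cmath.exp(1j * math.pi / 4)

print(probability_density(psis))                    # density for the ψ sum
print(probability_density(psis, photon_prefactor))  # twice that value
```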

So… Well… I still need to connect this to Feynman’s (amplitude direct) ± (amplitude exchanged) formula, but I am sure it can be done.

Now, we haven’t tested the complete √2·e^{i(π/4)}·e^{i(p·x − E·t)/2} wavefunction. Does it respect Schrödinger’s ∂ψ/∂t = i·(1/m)·∇²ψ or, including the 1/2 factor, the ∂ψ/∂t = i·[1/(2m)]·∇²ψ equation? [Note we assume, once again, that ħ = 1, so we use Planck units once more.] Let’s see. We can calculate the derivatives as:

• ∂ψ/∂t = −√2·e^{i(π/4)}·e^{i∙[p·x − E·t]/2}·(i·E/2)
• ∇²ψ = ∂²[√2·e^{i(π/4)}·e^{i∙[p·x − E·t]/2}]/∂x² = ∂[√2·e^{i(π/4)}·e^{i∙[p·x − E·t]/2}·(i·p/2)]/∂x = −√2·e^{i(π/4)}·e^{i∙[p·x − E·t]/2}·(p²/4)

So Schrödinger’s equation becomes:

−√2·e^{i(π/4)}·e^{i∙[p·x − E·t]/2}·(i·E/2) = −i·(1/m)·√2·e^{i(π/4)}·e^{i∙[p·x − E·t]/2}·(p²/4) ⇔ 1/2 = 1/4!?

That’s funny! It doesn’t work! The E and m and p² are OK because we’ve got that E = m = p equation, but we’ve got problems with yet another factor 2. It only works when we use the 2/m coefficient in Schrödinger’s equation.

So… Well… There’s no choice. That’s what we’re going to do. The Schrödinger equation for the photon is ∂ψ/∂t = i·(2/m)·∇²ψ!

It’s a very subtle point. This is all great, and very fundamental stuff! Let’s now move on to Schrödinger’s actual equation, i.e. the ∂ψ/∂t = i·(ħ/2m)·∇²ψ equation.
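Before moving on, that factor-2 business can be double-checked with a quick finite-difference calculation. It is a sketch under my own assumptions: Planck units (ħ = c = 1), E = p = m, an arbitrary sample value for m, and made-up function names.

```python
import cmath
import math

def photon_psi(x, t, p, E):
    """√2·e^{iπ/4}·e^{i(p·x − E·t)/2}, in units with ħ = 1."""
    return math.sqrt(2) * cmath.exp(1j * math.pi / 4) * cmath.exp(1j * (p * x - E * t) / 2)

def residual(numerator, m=1.5, x=0.4, t=0.1, h=1e-4):
    """|∂ψ/∂t − i·(numerator/m)·∂²ψ/∂x²| with E = p = m, via central differences."""
    p = E = m
    dpsi_dt = (photon_psi(x, t + h, p, E) - photon_psi(x, t - h, p, E)) / (2 * h)
    d2psi_dx2 = (photon_psi(x + h, t, p, E) - 2 * photon_psi(x, t, p, E)
                 + photon_psi(x - h, t, p, E)) / h**2
    return abs(dpsi_dt - 1j * (numerator / m) * d2psi_dx2)

for numerator in (0.5, 1.0, 2.0):  # the 1/(2m), 1/m and 2/m coefficients
    print(numerator, residual(numerator))
```

Only the 2/m coefficient should give a residual that is zero up to the numerical error.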

Post scriptum on the Planck units:

If we measure time and distance in equivalent units, say seconds, we can re-write the quantum of action as:

1.0545718×10⁻³⁴ N·m·s = (1.21×10⁴⁴ N)·(1.6162×10⁻³⁵ m)·(5.391×10⁻⁴⁴ s)

⇔ (1.0545718×10⁻³⁴/2.998×10⁸) N·s² = (1.21×10⁴⁴ N)·(1.6162×10⁻³⁵/2.998×10⁸ s)·(5.391×10⁻⁴⁴ s)

⇔ (1.21×10⁴⁴ N) = [(1.0545718×10⁻³⁴/2.998×10⁸)]/[(1.6162×10⁻³⁵/2.998×10⁸ s)·(5.391×10⁻⁴⁴ s)] N·s²/s²

You’ll say: what’s this? Well… Look at it. We’ve got a much easier formula for the Planck force, much easier than the standard formulas you’ll find on Wikipedia, for example. If we re-interpret the symbols ħ and c so they denote the numerical value of the quantum of action and the speed of light in standard SI units (i.e. newton, meter and second) – so ħ and c become dimensionless, or mathematical constants only, rather than physical constants – then the formula above can be written as:

FP newton = (ħ/c)/[(lP/c)·tP] newton ⇔ FP = ħ/(lP·tP)

Just double-check it: 1.0545718×10⁻³⁴/(1.6162×10⁻³⁵·5.391×10⁻⁴⁴) = 1.21×10⁴⁴. Bingo!
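The same double-check takes only a few lines of code. This is a minimal sketch using the rounded values quoted above; the constant and function names are my own.

```python
HBAR = 1.0545718e-34        # quantum of action, in N·m·s
PLANCK_LENGTH = 1.6162e-35  # in m
PLANCK_TIME = 5.391e-44     # in s

def planck_force(hbar=HBAR, l_p=PLANCK_LENGTH, t_p=PLANCK_TIME):
    """F_P = ħ/(l_P·t_P), in newton."""
    return hbar / (l_p * t_p)

print(f"{planck_force():.3e} N")
```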

You’ll say: what’s the point? The point is: our model is complete. We don’t need the other physical constants – i.e. the Coulomb, Boltzmann and gravitational constant – to calculate the Planck units we need, i.e. the Planck force, distance and time units. It all comes out of our elementary wavefunction! All we need to explain the Universe – or, let’s be more modest, quantum mechanics – is two numerical constants (c and ħ) and Euler’s formula (which uses π and e, of course). That’s it.

If you don’t think that’s a great result, then… Well… Then you’re not reading this. 🙂

Schrödinger’s equation and the two de Broglie relations

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter that is being presented here, which you can find by clicking on the provided link. In fact, I actually made some (small) mistakes when writing the post below.

Original post:

I’ve re-visited the de Broglie equations a couple of times already. In this post, however, I want to relate them to Schrödinger’s equation. Let’s start with the de Broglie equations first. Equations. Plural. Indeed, most popularizing books on quantum physics will give you only one of the two de Broglie equations—the one that associates a wavelength (λ) with the momentum (p) of a matter-particle:

λ = h/p

In fact, even the Wikipedia article on the ‘matter wave’ starts off like that and is, therefore, very confusing, because, for a good understanding of quantum physics, one needs to realize that the λ = h/p equality is just one of a pair of two ‘matter wave’ equations:

1. λ = h/p
2. f = E/h

These two equations give you the spatial and temporal frequency of the wavefunction respectively. Now, those two frequencies are related – and I’ll show you how in a minute – but they are not the same. It’s like space and time: they are related, but they are definitely not the same. Now, because any wavefunction is periodic, the argument of the wavefunction – which we’ll introduce shortly – will be some angle and, hence, we’ll want to express it in radians (or – if you’re really old-fashioned – degrees). So we’ll want to express the frequency as an angular frequency (i.e. in radians per second, rather than in cycles per second), and the wavelength as a wave number (i.e. in radians per meter). Hence, you’ll usually see the two de Broglie equations written as:

1. k = p/ħ
2. ω = E/ħ

It’s the same: ω = 2π∙f and f = 1/T (T is the period of the oscillation), and k = 2π/λ and then ħ = h/2π, of course! [Just to remove all ambiguities: stop thinking about degrees. They’re a legacy of the Babylonians, who thought the numbers 6, 12, and 60 had particular religious significance. So that’s why we have twelve-hour nights and twelve-hour days, with each hour divided into sixty minutes and each minute divided into sixty seconds, and – particularly relevant in this context – why ‘once around’ is divided into 6×60 = 360 degrees. Radians are the unit in which we should measure angles because… Well… Google it. They measure an angle in distance units. That makes things easier, a lot easier! Indeed, when studying physics, the last thing you want is artificial units, like degrees.]
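To make the equivalence of the two ways of writing the de Broglie relations concrete, here is a small sketch. The function names and the sample momentum and energy are assumptions for the illustration; the value of h is the exact SI value.

```python
import math

H = 6.62607015e-34       # Planck's constant in J·s
HBAR = H / (2 * math.pi)

def wavelength_and_frequency(p, E):
    """λ = h/p and f = E/h."""
    return H / p, E / H

def wavenumber_and_angular_frequency(p, E):
    """k = p/ħ and ω = E/ħ."""
    return p / HBAR, E / HBAR

# Hypothetical sample values for the momentum (kg·m/s) and energy (J).
p, E = 1.0e-24, 5.0e-19
lam, f = wavelength_and_frequency(p, E)
k, omega = wavenumber_and_angular_frequency(p, E)
print(k, 2 * math.pi / lam)    # the same number twice
print(omega, 2 * math.pi * f)  # the same number twice
```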

So… Where were we? Oh… Yes. The de Broglie relation. Popular textbooks usually commit two sins. One is that they forget to say we have two de Broglie relations, and the other one is that the E = h∙f relationship is presented as the twin of the Planck-Einstein relation for photons, which relates the energy (E) of a photon to its frequency (ν): E = h∙ν = ħ∙ω. The former is criminal neglect, I feel. As for the latter… Well… It’s true and not true: it’s incomplete, I’d say, and, therefore, also very confusing.

Why? Because both things lead one to try to relate the two equations, as momentum and energy are obviously related. In fact, I’ve wasted days, if not weeks, on this. How are they related? What formula should we use? To answer that question, we need to answer another one: what energy concept should we use? Potential energy? Kinetic energy? Should we include the equivalent energy of the rest mass?

One quickly gets into trouble here. For example, one can try the kinetic energy, K.E. = m∙v²/2, and use the definition of momentum (p = m∙v), to write E = p²/(2m), and then we could relate the frequency f to the wavelength λ using the general rule that the traveling speed of a wave is equal to the product of its wavelength and its frequency (v = λ∙f). But if E = p²/(2m) and f = v/λ, we get:

p²/(2m) = h∙v/λ ⇔ λ = 2∙h/p

So that is almost right, but not quite: that factor 2 should not be there. In fact, it’s easy to see that we’d get de Broglie’s λ = h/p equation from his E = h∙f equation if we’d use E = m∙v² rather than E = m∙v²/2. In fact, the E = m∙v² relation comes out of them if we just multiply the two and, yes, use that v = f·λ relation once again:

1. f·λ = (E/h)·(h/p) = E/p
2. v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v²

But… Well… E = m∙v²? How could we possibly justify the use of that formula?

The answer is simple: our v = f·λ equation is wrong. It’s just something one shouldn’t apply to the complex-valued wavefunction. The ‘correct’ velocity formula for the complex-valued wavefunction should have that 1/2 factor, so we’d write 2·f·λ = v to make things come out alright. But where would this formula come from?

Well… Now it’s time to introduce the wavefunction.

The wavefunction

You know the elementary wavefunction:

ψ = ψ(x, t) = e^{−i(ωt − kx)} = e^{i(kx − ωt)} = cos(kx − ωt) + i∙sin(kx − ωt)

As for terminology, note that the term ‘wavefunction’ refers to what I write above, while the term ‘wave equation’ usually refers to Schrödinger’s equation, which I’ll introduce in a minute. Also note the use of boldface indicates we’re talking vectors, so we’re multiplying the wavenumber vector k with the position vector x = (x, y, z) here, although we’ll often simplify and assume one-dimensional space. In any case…

So the question is: why can’t we use the v = f·λ formula for this wave? The period of cosθ + isinθ is the same as that of the sine and cosine function considered separately: cos(θ+2π) + isin(θ+2π) = cosθ + isinθ, so T = 2π and f = 1/T = 1/2π do not change. So the f, T and λ should be the same, no?

No. We’ve got two oscillations for the price of one here: one ‘real’ and one ‘imaginary’—but both are equally essential and, hence, equally ‘real’. So we’re actually combining two waves. So it’s just like adding other waves: when adding waves, one gets a composite wave that has (a) a phase velocity and (b) a group velocity.

Huh? Yes. It’s quite interesting. When adding waves, we usually have a different ω and k for each of the component waves, and the phase and group velocity will depend on the relation between those ω’s and k’s. That relation is referred to as the dispersion relation. To be precise, if you’re adding waves, then the phase velocity of the composite wave will be equal to vp = ω/k, and its group velocity will be equal to vg = dω/dk. We’ll usually be interested in the group velocity, and so to calculate that derivative, we need to express ω as a function of k, of course, so we write ω as some function of k, i.e. ω = ω(k). There are a number of possibilities then:

1. ω and k may be directly proportional, so we can write ω as ω = a∙k: in that case, we find that $v_p = v_g = a$.
2. ω and k are not directly proportional but have a linear relationship, so we can write ω as ω = a∙k + b. In that case, we find that $v_g = a$ and… Well… We’ve got a problem calculating $v_p$, because we don’t know what k to use!
3. ω and k may be non-linearly related, in which case… Well… One has to do the calculation and see what comes out. 🙂
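The three cases above can be checked numerically. What follows is just a minimal sketch: the constants a and b, and the quadratic example used for the non-linear case, are arbitrary choices of mine, and the group velocity is estimated with a simple finite difference rather than an exact derivative.

```python
# Phase and group velocity for three illustrative dispersion relations.
# The constants and the quadratic example are assumptions, not taken from the text.

def phase_velocity(omega, k):
    """v_p = omega(k) / k."""
    return omega(k) / k

def group_velocity(omega, k, dk=1e-6):
    """v_g = d(omega)/dk, estimated with a central finite difference."""
    return (omega(k + dk) - omega(k - dk)) / (2 * dk)

a, b = 3.0, 2.0

def proportional(k):   # case 1: omega = a*k
    return a * k

def linear(k):         # case 2: omega = a*k + b
    return a * k + b

def quadratic(k):      # case 3: one possible non-linear relation
    return a * k ** 2

for name, omega in [("proportional", proportional),
                    ("linear", linear),
                    ("quadratic", quadratic)]:
    for k in (1.0, 2.0):
        print(name, k, phase_velocity(omega, k), group_velocity(omega, k))
```

In the linear case the group velocity stays at a, while the phase velocity changes with every k we plug in, which is exactly the problem noted in the second item.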

Let’s now look back at our $e^{i(kx - \omega t)} = \cos(kx - \omega t) + i\sin(kx - \omega t)$ function. You’ll say that we’ve got only one ω and one k here, so we’re not adding waves with different ω’s and k’s. So… Well… What?

That’s where the de Broglie equations come in. Look: k = p/ħ, and ω = E/ħ. If we now use the correct energy formula, i.e. the kinetic energy formula $E = m v^2/2$ (rather than that nonsensical $E = m v^2$ equation) – which we can also write as $E = m\cdot v\cdot v/2 = p\cdot v/2 = p\cdot p/2m = p^2/2m$, with v = p/m the classical velocity of the elementary particle that Louis de Broglie was thinking of – then we can calculate the group velocity of our $e^{i(kx - \omega t)} = \cos(kx - \omega t) + i\sin(kx - \omega t)$ as:

$$v_g = \frac{d\omega}{dk} = \frac{d[E/\hbar]}{d[p/\hbar]} = \frac{dE}{dp} = \frac{d[p^2/2m]}{dp} = \frac{2p}{2m} = \frac{p}{m} = v$$

However, the phase velocity of our $e^{i(kx - \omega t)}$ is:

$$v_p = \frac{\omega}{k} = \frac{E/\hbar}{p/\hbar} = \frac{E}{p} = \frac{p^2/2m}{p} = \frac{p}{2m} = \frac{v}{2}$$

So that factor 1/2 only appears for the phase velocity. Weird, isn’t it? We find that the group velocity ($v_g$) of the $e^{i(kx - \omega t)}$ function is equal to the classical velocity of our particle (i.e. v), but that its phase velocity ($v_p$) is equal to v divided by 2.

Hmm… What to say? Well… Nothing much – except that it makes sense, and very much so, because it’s the group velocity of the wavefunction that’s associated with the classical velocity of a particle, not the phase velocity. In fact, if we include the rest mass in our energy formula, so if we’d use the relativistic $E = \gamma m_0 c^2$ and $p = \gamma m_0 v$ formulas (with γ the Lorentz factor), then we find that $v_p = \omega/k = E/p = (\gamma m_0 c^2)/(\gamma m_0 v) = c^2/v$, and so that’s a superluminal velocity, because v is always smaller than c!
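The two phase velocities can be compared side by side with a few lines of code. This is only a sketch under my own assumptions: an electron moving at 10% of the speed of light, a rounded value for its rest mass, and the standard relativistic energy-momentum relation E² = (pc)² + (m₀c²)², which is equivalent to the E = γm₀c² and p = γm₀v formulas above. The derivative dE/dp is estimated with a finite difference.

```python
import math

C = 299_792_458.0   # speed of light in m/s
M0 = 9.109e-31      # electron rest mass in kg (rounded, illustrative)

def velocities(energy, p, dp):
    """Return (v_phase, v_group) = (E/p, dE/dp) for an energy function E(p)."""
    v_phase = energy(p) / p
    v_group = (energy(p + dp) - energy(p - dp)) / (2 * dp)
    return v_phase, v_group

def kinetic(p):
    """Non-relativistic kinetic energy E = p^2 / 2m."""
    return p ** 2 / (2 * M0)

def total_relativistic(p):
    """Relativistic total energy E = sqrt((pc)^2 + (m0 c^2)^2)."""
    return math.sqrt((p * C) ** 2 + (M0 * C ** 2) ** 2)

v = 0.1 * C                                  # assumed particle velocity
gamma = 1 / math.sqrt(1 - (v / C) ** 2)      # Lorentz factor

p_nr = M0 * v                                # classical momentum
vp_nr, vg_nr = velocities(kinetic, p_nr, dp=p_nr * 1e-6)

p_rel = gamma * M0 * v                       # relativistic momentum
vp_rel, vg_rel = velocities(total_relativistic, p_rel, dp=p_rel * 1e-6)

print("kinetic only:  v_p/v =", vp_nr / v, " v_g/v =", vg_nr / v)
print("with rest energy: v_p*v/c^2 =", vp_rel * v / C ** 2, " v_g/v =", vg_rel / v)
```

Under these assumptions the group velocity comes out as v in both cases, while the phase velocity is v/2 in the first and c²/v in the second.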

What? That’s even weirder! If we take the kinetic energy only, we find a phase velocity equal to v/2, but if we include the rest energy, then we get a superluminal phase velocity. It must be one or the other, no? Yep! You’re right! So that makes us wonder: is $E = m v^2/2$ really the right energy concept to use? The answer is unambiguous: no! It isn’t! And, just for the record, our young nobleman didn’t use the kinetic energy formula when he postulated his equations in his now famous PhD thesis.

So what did he use then? Where did he get his equations?

I am not sure. 🙂 A stroke of genius, it seems. According to Feynman, that’s how Schrödinger got his equation too: intuition, brilliance. In short, a stroke of genius. 🙂 Let’s relate these two gems.

Schrödinger’s equation and the two de Broglie relations

Louis de Broglie and Erwin Schrödinger published their equations in 1924 and 1926 respectively. Can they be related? The answer is: yes, of course! Let’s first look at de Broglie’s energy concept, however. Louis de Broglie was very familiar with Einstein’s work and, hence, he knew that the energy of a particle consisted of three parts:

1. The particle’s rest energy $m_0 c^2$, which de Broglie referred to as internal energy ($E_{int}$): this ‘internal energy’ includes the rest mass of the ‘internal pieces’, as he put it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy);
2. Any potential energy it may have because of some field (so de Broglie was not assuming the particle was traveling in free space), which we’ll denote by V: the field(s) can be anything (gravitational, electromagnetic, you name it): whatever changes the energy because of the position of the particle;
3. The particle’s kinetic energy, which we wrote in terms of its momentum p: $K.E. = m v^2/2 = m^2 v^2/(2m) = (m v)^2/(2m) = p^2/(2m)$.

Indeed, in my previous posts, I would write the wavefunction as de Broglie wrote it, which is as follows:

$$\psi(\theta) = \psi(\mathbf{x}, t) = a\,e^{-i\theta} = a\,e^{-i[(E_{int} + p^2/(2m) + V)\cdot t - \mathbf{p}\cdot\mathbf{x}]/\hbar}$$

In those posts – such as my post on virtual particles – I’d also note how a change in potential energy plays out: a change in potential energy, when moving from one place to another, would change the wavefunction, but through the momentum only, so it would impact the spatial frequency only. So the change in potential would not change the temporal frequency: $\omega = [E_{int} + p_1^2/(2m) + V_1]/\hbar$ and $\omega = [E_{int} + p_2^2/(2m) + V_2]/\hbar$ are one and the same. Why? Or why not, I should say? Because of the energy conservation principle, or its equivalent in quantum mechanics. The temporal frequency f or ω, i.e. the time-rate of change of the phase of the wavefunction, does not change: all of the change in potential, and the corresponding change in kinetic energy, goes into changing the spatial frequency, i.e. the wave number k or the wavelength λ, as potential energy becomes kinetic or vice versa.

So is that consistent with what we wrote above, that $E = m v^2$? Maybe. Let’s think about it. Let’s first look at Schrödinger’s equation in free space (i.e. a space with zero potential) once again:

$$\frac{\partial\psi}{\partial t} = i\,\frac{\hbar}{2m}\,\nabla^2\psi$$

If we insert our $\psi = e^{i(kx - \omega t)}$ formula in Schrödinger’s free-space equation, we get the following nice result. [To keep things simple, we’re just assuming one-dimensional space for the calculations, so $\nabla^2\psi = \partial^2\psi/\partial x^2$. But the result can easily be generalized.] The time derivative on the left-hand side is $\partial\psi/\partial t = -i\omega\, e^{i(kx - \omega t)}$. The second-order derivative on the right-hand side is $\partial^2\psi/\partial x^2 = (ik)\cdot(ik)\cdot e^{i(kx - \omega t)} = -k^2 e^{i(kx - \omega t)}$. The $e^{i(kx - \omega t)}$ factor on both sides cancels out and, hence, equating both sides gives us the following condition:

$$-i\omega = -\frac{i\hbar}{2m}\,k^2 \iff \omega = \frac{\hbar}{2m}\,k^2$$

Substituting ω = E/ħ and k = p/ħ yields:

$$\frac{E}{\hbar} = \frac{\hbar}{2m}\cdot\frac{p^2}{\hbar^2} = \frac{m^2 v^2}{2m\hbar} = \frac{m v^2}{2\hbar} \iff E = \frac{m v^2}{2}$$

Bingo! We get that kinetic energy formula! But now… What if we’d not be considering free space? In other words: what if there is some potential? Well… We’d use the complete Schrödinger equation, which is:

$$i\hbar\,\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\,\nabla^2\psi + V\psi$$

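Huh? Why is there a minus sign now? Look carefully: I moved the iħ factor on the left-hand side to the other side when writing the free space version. If we’d do that for the complete equation, we’d get:

$$\frac{\partial\psi}{\partial t} = i\,\frac{\hbar}{2m}\,\nabla^2\psi - \frac{i}{\hbar}\,V\psi$$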

I like that representation a lot more, if only because it makes it a lot easier to interpret the equation, but, for some reason I don’t quite understand, you won’t find it like that in textbooks. Now how does it work when using the complete equation, so we add the −(i/ħ)·V·ψ term? It’s simple: the $e^{i(kx - \omega t)}$ factor also cancels out, and so we get:

$$-i\omega = -\frac{i\hbar}{2m}\,k^2 - \frac{i}{\hbar}\,V \iff \omega = \frac{\hbar}{2m}\,k^2 + \frac{V}{\hbar}$$

Substituting ω = E/ħ and k = p/ħ once more now yields:

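$$\frac{E}{\hbar} = \frac{\hbar}{2m}\cdot\frac{p^2}{\hbar^2} + \frac{V}{\hbar} = \frac{m^2 v^2}{2m\hbar} + \frac{V}{\hbar} = \frac{m v^2}{2\hbar} + \frac{V}{\hbar} \iff E = \frac{m v^2}{2} + V$$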

Bingo once more!
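If you don’t trust the algebra, you can let the computer check it. The sketch below plugs the plane wave into the rearranged equation, ∂ψ/∂t = i(ħ/2m)·∂²ψ/∂x² − (i/ħ)·V·ψ, using finite differences in one dimension. The units (ħ = m = 1), the constant potential V and the sample point are arbitrary assumptions of mine.

```python
import cmath

HBAR = 1.0   # natural units (assumption)
M = 1.0      # particle mass (assumption)
V = 0.7      # constant potential, arbitrary illustrative value

def psi(x, t, k, omega):
    """Elementary wavefunction exp(i(kx - omega t))."""
    return cmath.exp(1j * (k * x - omega * t))

def residual(k, omega, x=0.3, t=0.2, h=1e-4):
    """Size of dpsi/dt - [i(hbar/2m) d2psi/dx2 - (i/hbar) V psi] at (x, t)."""
    dpsi_dt = (psi(x, t + h, k, omega) - psi(x, t - h, k, omega)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t, k, omega) - 2 * psi(x, t, k, omega)
                 + psi(x - h, t, k, omega)) / h ** 2
    rhs = 1j * (HBAR / (2 * M)) * d2psi_dx2 - (1j / HBAR) * V * psi(x, t, k, omega)
    return abs(dpsi_dt - rhs)

k = 2.0
omega_with_v = HBAR * k ** 2 / (2 * M) + V / HBAR   # omega = (hbar/2m) k^2 + V/hbar
omega_without_v = HBAR * k ** 2 / (2 * M)           # forgets the potential term

print("residual with V/hbar included:", residual(k, omega_with_v))
print("residual with V/hbar left out:", residual(k, omega_without_v))
```

The first residual is tiny (it only reflects the finite-difference error), while the second is of the order of V itself.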

The only thing that’s missing now is the particle’s rest energy $m_0 c^2$, which de Broglie referred to as internal energy ($E_{int}$). That includes everything, i.e. not only the rest mass of the ‘internal pieces’ (as said, now we call those ‘internal pieces’ quarks) but also their binding energy (i.e. the quarks’ interaction energy). So how do we get that energy concept out of Schrödinger’s equation? There’s only one answer to that: that energy is just like V. We can, quite simply, just add it.

That brings us to the last and final question: what about our $v_g = v$ result if we do not use the kinetic energy concept, but the $E = m v^2/2 + V + E_{int}$ concept? The answer is simple: nothing. We still get the same, because we’re taking a derivative and the V and $E_{int}$ just appear as constants, and so their derivative with respect to p is zero. Check it:

$$v_g = \frac{d\omega}{dk} = \frac{d[E/\hbar]}{d[p/\hbar]} = \frac{dE}{dp} = \frac{d[p^2/2m + V + E_{int}]}{dp} = \frac{2p}{2m} = \frac{p}{m} = v$$

It’s now pretty clear how this thing works. To localize our particle, we just superimpose a zillion of these $e^{i(kx - \omega t)}$ functions. The only condition is that we’ve got that fixed $v_g = d\omega/dk = v$ relationship, but so we do have such fixed relationship, as you can see above. In fact, the Wikipedia article on the dispersion relation mentions that the de Broglie equations imply the following relation between ω and k: $\omega = \hbar k^2/2m$. As you can see, that’s not entirely correct: the author conveniently forgets the potential (V) and the rest energy ($E_{int}$) in the energy formula here!

What about the phase velocity? That’s a different story altogether. You can think about that for yourself. 🙂

I should make one final point here. As said, in order to localize a particle (or, to be precise, its wavefunction), we’re going to add a zillion elementary wavefunctions, each of which will make its own contribution to the composite wave. That contribution is captured by some coefficient $a_i$ in front of every $e^{i\theta_i}$ function, so we’ll have a zillion $a_i e^{i\theta_i}$ functions, really. [Yep. Bit confusing: I use i here as subscript, as well as imaginary unit.] In case you wonder how that works out with Schrödinger’s equation, the answer is – once again – very simple: both the time derivative (which is just a first-order derivative) and the Laplacian are linear operators, so Schrödinger’s equation, for a composite wave, can just be re-written as the sum of a zillion ‘elementary’ wave equations.
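To see such a composite wave at work, here is a rough sketch of a wave packet built from about a hundred elementary waves. Everything specific in it is an assumption of mine: the units (ħ = m = 1), the Gaussian choice for the coefficients, the central wavenumber, and the crude grid search for the peak. It uses the free-space dispersion relation only.

```python
import cmath
import math

HBAR = 1.0      # natural units (assumption)
M = 1.0         # particle mass (assumption)
K0 = 5.0        # central wavenumber of the packet (assumption)
SIGMA_K = 0.5   # spread of the wavenumbers around K0 (assumption)

# Component waves: wavenumbers k_i with Gaussian coefficients a_i.
N = 101
KS = [K0 - 4 * SIGMA_K + i * (8 * SIGMA_K / (N - 1)) for i in range(N)]
AS = [math.exp(-((k - K0) ** 2) / (2 * SIGMA_K ** 2)) for k in KS]

def omega(k):
    """Free-particle dispersion relation omega = hbar k^2 / 2m."""
    return HBAR * k ** 2 / (2 * M)

def packet(x, t):
    """Composite wave: the sum of all a_i * exp(i(k_i x - omega_i t))."""
    return sum(a * cmath.exp(1j * (k * x - omega(k) * t)) for a, k in zip(AS, KS))

def peak_position(t):
    """Grid point where |packet|^2 is largest at time t."""
    xs = [i * 0.02 for i in range(-250, 1001)]   # x from -5 to 20
    return max(xs, key=lambda x: abs(packet(x, t)) ** 2)

x_start = peak_position(0.0)
x_end = peak_position(2.0)
packet_speed = (x_end - x_start) / 2.0

print("speed of the peak:", packet_speed)
print("hbar*k0/m        :", HBAR * K0 / M)
```

The peak of the packet travels at (roughly) ħ·k₀/m = p/m, i.e. at the group velocity, not at the phase velocity of any of its components.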

So… Well… We’re all set now to effectively use Schrödinger’s equation to calculate the orbitals for a hydrogen atom, which is what we’ll do in our next post.

In the meanwhile, you can amuse yourself with reading a nice Wikibook article on the Laplacian, which gives you a nice feel for what Schrödinger’s equation actually represents—even if I gave you a good feel for that too on my Essentials page. Whatever. You choose. Just let me know what you liked best. 🙂

Oh… One more point: the $v_g = d\omega/dk = d[p^2/2m]/dp = p/m = v$ calculation obviously assumes we can treat m as a constant. In fact, what we’re actually doing is a rather complicated substitution of variables: you should write it all out, but that’s not the point here. The point is that we’re actually doing a non-relativistic calculation. Now, that does not mean that the wavefunction isn’t consistent with special relativity. It is. In fact, in one of my posts, I show how we can explain relativistic length contraction using the wavefunction. But it does mean that our calculation of the group velocity is not relativistically correct. But that’s a minor point: I’ll leave it for you as an exercise to calculate the relativistically correct formula for the group velocity. Have fun with it! 🙂

Note: Notations are often quite confusing. One should, generally speaking, denote a frequency by ν (nu), rather than by f, so as to not cause confusion with any function f, but then… Well… You create a new problem when you do that, because that Greek letter nu (ν) looks damn similar to the v of velocity, so that’s why I’ll often use f when I should be using nu (ν). As for the units, a frequency is expressed in cycles per second, while the angular frequency ω is expressed in radians per second. One cycle covers 2π radians and, therefore, we can write: ν = ω/2π. Hence, h∙ν = h∙ω/2π = ħ∙ω. Both ν as well as ω measure the time-rate of change of the phase of the wave function, as opposed to k, i.e. the spatial frequency of the wave function, which depends on the speed of the wave. Physicists also often use the symbol v for the speed of a wave, which is also hugely confusing, because it’s also used to denote the classical velocity of the particle. And then there’s two wave velocities, of course: the group versus the phase velocity. In any case… I find the use of that other symbol (c) for the wave velocity even more confusing, because this symbol is also used for the speed of light, and the speed of a wave is not necessarily (read: usually not) equal to the speed of light. In fact, both the group as well as the phase velocity of a particle wave are very different from the speed of light. The speed of a wave and the speed of light only coincide for electromagnetic waves and, even then, it should be noted that photons also have amplitudes to travel faster or slower than the speed of light.

The Uncertainty Principle

In my previous post, I showed how Feynman derives Schrödinger’s equation using a historical and, therefore, quite intuitive approach. The approach was intuitive because the argument used a discrete model, so that’s stuff we are well acquainted with, like a crystal lattice, for example. However, we’re now going to think in terms of continuity from the start. Let’s first see what changes in terms of notation.

New notations

Our $C(x_n, t) = \langle x_n|\psi\rangle$ now becomes $C(x) = \langle x|\psi\rangle$. This notation does not explicitly show the time dependence but then you know amplitudes like this do vary in space as well as in time. Having said that, the analysis below focuses mainly on their behavior in space, so it does make sense to not explicitly mention the time variable. It’s the usual trick: we look at how stuff behaves in space or, alternatively, in time. So we temporarily ‘forget’ about the other variable. That’s just how we work: it’s hard for our mind to think about these wavefunctions in both dimensions simultaneously although, ideally, we should do that.

Now, you also know that quantum physicists prefer to denote the wavefunction C(x) with some Greek letter: ψ (psi) or φ (phi). Feynman thinks it’s somewhat confusing because we use the same symbol to denote a state itself, but I don’t agree. I think it’s pretty straightforward. In any case, we write:

$$\psi(x) = C_\psi(x) = C(x) = \langle x|\psi\rangle$$

The next thing is the associated probabilities. From your high school math course, you’ll surely remember that we have two types of probability distributions: they are either discrete or, else, continuous. If they’re continuous, then our probability distribution becomes a probability density function (PDF) and, strictly speaking, we should no longer say that the probability of finding our particle at any particular point x at some time t is this or that. That probability is, strictly speaking, zero: if our variable is continuous, then our probability is defined for an interval only, and the P[x] value itself is referred to as a probability density. So we’ll look at little intervals Δx, and we can write the associated probability as:

$$\text{prob}(x, \Delta x) = |\langle x|\psi\rangle|^2\,\Delta x = |\psi(x)|^2\,\Delta x$$

The idea is illustrated below. We just re-divide our continuous scale in little intervals and calculate the surface of some tiny elongated rectangle now. 🙂

It is also easy to see that, when moving to an infinite set of states, our 〈φ|ψ〉 = ∑〈φ|x〉〈x|ψ〉 (over all x) formula for calculating the amplitude for a particle to go from state ψ to state φ should now be written as an infinite sum, i.e. as the following integral:

$$\langle\phi|\psi\rangle = \int \langle\phi|x\rangle\langle x|\psi\rangle\,dx$$

Now, we know that 〈φ|x〉 = 〈x|φ〉* and, therefore, this integral can also be written as:

$$\langle\phi|\psi\rangle = \int \phi^*(x)\,\psi(x)\,dx$$

For example, if φ(x) = 〈x|φ〉 is equal to a simple exponential, so we can write $\phi(x) = a\,e^{-i\theta}$ (with a some real number), then $\phi^*(x) = \langle\phi|x\rangle = a\,e^{+i\theta}$.

With that, we’re ready for the plat de résistance, except for one thing, perhaps: we don’t look at spin here. If we’d do that, we’d have to take two sets of base states: one for up and one for down spin. But we don’t worry about this, for the time being, that is. 🙂

The momentum wavefunction

Our wavefunction 〈x|ψ〉 varies in time as well as in space. That’s obvious. How exactly depends on the energy and the momentum: both are related and, hence, if there’s uncertainty in the energy, there will be uncertainty in the momentum, and vice versa. Uncertainty in the momentum changes the behavior of the wavefunction in space – through the p = ħk factor in the argument of the wavefunction (θ = ω·t − k·x) – while uncertainty in the energy changes the behavior of the wavefunction in time – through the E = ħω relation. As mentioned above, we focus on the variation in space here. We’ll do so by defining a new state, which is referred to as a state of definite momentum. We’ll write it as mom p, and so now we can use the Dirac notation to write the amplitude for an electron to have a definite momentum equal to p as:

φ(p) = 〈 mom p | ψ 〉

Now, you may think that the 〈x|ψ〉 and 〈mom p|ψ〉 amplitudes should be the same because, surely, we do associate the state with a definite momentum p, don’t we? Well… No! If we want to localize our wave ‘packet’, i.e. localize our particle, then we’re actually not going to associate it with a definite momentum. See my previous posts: we’re going to introduce some uncertainty so our wavefunction is actually a superposition of more elementary waves with slightly different (spatial) frequencies. So we should just go through the motions here and apply our integral formula to ‘unpack’ this amplitude. That goes as follows:

$$\langle \text{mom } p|\psi\rangle = \int \langle \text{mom } p|x\rangle\langle x|\psi\rangle\,dx$$

So, as usual, when seeing a formula like this, we should remind ourselves of what we need to solve. Here, we assume we somehow know the ψ(x) = 〈x|ψ〉 wavefunction, so the question is: what do we use for 〈 mom p | x 〉? At this point, Feynman wanders off to start a digression on normalization, which really confuses the picture. When everything is said and done, the easiest thing to do is to just jot down the formula for that 〈mom p | x〉 in the integrand and think about it for a while:

$$\langle \text{mom } p|x\rangle = e^{-i(p/\hbar)\cdot x}$$

I mean… What else could it be? This formula is very fundamental, and I am not going to try to explain it. As mentioned above, Feynman tries to ‘explain’ it by some story about probabilities and normalization, but I think his ‘explanation’ just confuses things even more. Really, what else would it be? The formula above really encapsulates what it means if we say that p and x are conjugate variables. [I can already note, of course, that symmetry implies that we can write something similar for energy and time. Indeed, we can define a state of definite energy as 〈E | ψ〉, and then ‘unpack’ it in the same way, and see that one of the two factors in the integrand would be equal to 〈E | t〉 and, of course, we’d associate a similar formula with it:

$$\langle E|t\rangle = e^{i(E/\hbar)\cdot t}$$ ]

But let me get back to the lesson here. We’re analyzing stuff in space now, not in time. Feynman gives a simple example here. He suggests a wavefunction which has the following form:

$$\psi(x) = K\,e^{-x^2/4\sigma^2}$$

The example is somewhat disingenuous because this is not a complex-valued but a real-valued function. In fact, squaring it, and then applying the normalization condition (all probabilities have to add up to one), yields the normal probability distribution:

$$\text{prob}(x, \Delta x) = P(x)\,dx = (2\pi\sigma^2)^{-1/2}\,e^{-x^2/2\sigma^2}\,dx$$

So that’s just the normal distribution for μ = 0, as illustrated below.
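A quick numerical check of that normalization, as a sketch only: I take K = (2πσ²)^(−1/4), pick an arbitrary σ, and add up the |ψ(x)|²·Δx rectangles over a range of sixteen standard deviations.

```python
import math

SIGMA = 1.5   # arbitrary illustrative value for the half-width

def psi(x):
    """Real-valued Gaussian amplitude K * exp(-x^2 / 4 sigma^2)."""
    K = (2 * math.pi * SIGMA ** 2) ** -0.25
    return K * math.exp(-x ** 2 / (4 * SIGMA ** 2))

def normal_pdf(x):
    """Normal probability density with mean 0 and standard deviation SIGMA."""
    return (2 * math.pi * SIGMA ** 2) ** -0.5 * math.exp(-x ** 2 / (2 * SIGMA ** 2))

dx = 0.01
xs = [-12.0 + (j + 0.5) * dx for j in range(2400)]   # midpoints from -12 to 12

# prob(x, dx) = |psi(x)|^2 dx, summed over all the little intervals
total_probability = sum(psi(x) ** 2 * dx for x in xs)

# largest difference between |psi|^2 and the normal density on the grid
max_gap = max(abs(psi(x) ** 2 - normal_pdf(x)) for x in xs)

print(total_probability, max_gap)
```

The probabilities add up to one, and the squared amplitude coincides with the normal density at every grid point.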

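In any case, the integral we have to solve now is:

$$\phi(p) = \langle \text{mom } p|\psi\rangle = \int e^{-i(p/\hbar)\cdot x}\cdot K\,e^{-x^2/4\sigma^2}\,dx$$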

Now, I hate integrals as much as you do (probably more) and so I assume you’re also only interested in the result (if you want the detail: check it in Feynman), which we can write as:

$$\phi(p) = (2\pi\eta^2)^{-1/4}\,e^{-p^2/4\eta^2}, \quad \text{with } \eta = \hbar/2\sigma$$

This formula is totally identical to the $\psi(x) = (2\pi\sigma^2)^{-1/4}\,e^{-x^2/4\sigma^2}$ distribution we started with, except that it’s got another sigma value, which we denoted by η (and that’s not nu but eta), with

η = ħ/2σ

Just for the record, Feynman refers to η and σ as the ‘half-width’ of the respective distributions. Mathematicians would say they’re the standard deviation. The concepts are nearly the same, but not quite. In any case, that’s another thing I’ll let you find out for yourself. 🙂 The point is: η and σ are inversely proportional to each other, and the constant of proportionality is equal to ħ/2.

Now, if we take η and σ as measures of the uncertainty in p and x respectively – which is what they are, obviously! – then we can re-write that η = ħ/2σ as ησ = ħ/2 or, better still, as the Uncertainty Principle itself:

ΔpΔx = ħ/2
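If you hate integrals but like computers, the sketch below does the ‘unpacking’ numerically. It assumes ħ = 1, an arbitrary σ, the e^(−ipx/ħ) form for the 〈mom p|x〉 amplitude, and a plain midpoint rule for all integrals; the result is left unnormalized because only the widths matter here.

```python
import cmath
import math

HBAR = 1.0    # natural units (assumption)
SIGMA = 0.8   # half-width of the position distribution (arbitrary)

def psi(x):
    """Gaussian position amplitude (2 pi sigma^2)^(-1/4) exp(-x^2 / 4 sigma^2)."""
    return (2 * math.pi * SIGMA ** 2) ** -0.25 * math.exp(-x ** 2 / (4 * SIGMA ** 2))

def phi(p, x_max=8.0, n=800):
    """Unnormalized momentum amplitude: integral of exp(-i p x / hbar) psi(x) dx."""
    dx = 2 * x_max / n
    total = 0j
    for j in range(n):
        x = -x_max + (j + 0.5) * dx
        total += cmath.exp(-1j * p * x / HBAR) * psi(x) * dx
    return total

def std_dev(density, lo, hi, n=400):
    """Standard deviation of a (not necessarily normalized) density on [lo, hi]."""
    d = (hi - lo) / n
    points = [lo + (j + 0.5) * d for j in range(n)]
    weights = [density(u) for u in points]
    norm = sum(weights)
    mean = sum(u * w for u, w in zip(points, weights)) / norm
    variance = sum((u - mean) ** 2 * w for u, w in zip(points, weights)) / norm
    return math.sqrt(variance)

sigma_x = std_dev(lambda x: abs(psi(x)) ** 2, -10.0, 10.0)
eta = std_dev(lambda p: abs(phi(p)) ** 2, -6.0, 6.0)

print("sigma:", sigma_x, " eta:", eta, " product:", sigma_x * eta)
```

With these choices the product of the two widths comes out at ħ/2, and η equals ħ/2σ.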

You’ll say: that’s great, but we usually see the Uncertainty Principle written as:

ΔpΔx ≥ ħ/2

So where does that come from? Well… We chose a normal distribution (or the Gaussian distribution, as physicists call it), and so that yields the ΔpΔx = ħ/2 identity. If we’d chosen another one, we’d find a slightly different relation and so… Well… Let me quote Feynman here: “Interestingly enough, it is possible to prove that for any other form of a distribution in x or p, the product ΔpΔx cannot be smaller than the one we have found here, so the Gaussian distribution gives the smallest possible value for the ΔpΔx product.”

This is great. So what about the even more approximate ΔpΔx ≥ ħ formula? Where does that come from? Well… That’s more like a qualitative version of it: it basically says the minimum value of the same product is of the same order as ħ which, as you know, is pretty tiny: it’s about 1.055×10⁻³⁴ J·s (Planck’s constant h itself is about 6.626×10⁻³⁴ J·s). 🙂 The last thing to note is its dimension: momentum is expressed in newton-second and position in meter, obviously. So the uncertainties in them are expressed in the same unit, and so the dimension of the product is N·m·s = J·s. So this dimension combines force, distance and time. That’s quite appropriate, I’d say. The ΔEΔt product obviously does the same. But… Well… That’s it, folks! I enjoyed writing this – and I cannot always say the same of other posts! So I hope you enjoyed reading it. 🙂

Freewheeling once more…

You remember the elementary wavefunction Ψ(x, t) = Ψ(θ), with θ = ω·t−k∙x = (E/ħ)·t − (p/ħ)∙x = (E·t−p∙x)/ħ. Now, we can re-scale θ and define a new argument, which we’ll write as:

φ = θ/ħ = E·t−p∙x

The Ψ(θ) function can now be written as:

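$$\Psi(x, t) = \Psi(\theta) = \left[e^{i\cdot(\theta/\hbar)}\right]^{\hbar} = \Phi(\phi) = \left[e^{i\cdot\phi}\right]^{\hbar} \quad \text{with } \phi = E\cdot t - \mathbf{p}\cdot\mathbf{x}$$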

This doesn’t change the fundamentals: we’re just re-scaling E and p here, by measuring them in units of ħ.

You’ll wonder: can we do that? We’re talking physics here, so our variables represent something real. Not all we can do in math, should be done in physics, right? So what does it mean? We need to look at the dimensions of our variables. Does it affect our time and distance units, i.e. the second and the meter? Well… I’d say it’s OK.

Energy is expressed in joule: 1 J = 1 N·m. [In SI base units, we write: J = N·m = (kg·m/s2)·m = kg·(m/s)2.] So if we divide it by ħ, whose dimension is joule-second (J·s), we get some value expressed per second, i.e. a (temporal) frequency. That’s what we want, as we’re multiplying it with t in the argument of our wavefunction!

Momentum is expressed in newton-second (N·s). Now, 1 J = 1 N·m, so 1 N = 1 J/m. Hence, if we divide the momentum value by ħ, we get some value expressed per meter: N·s/J·s = N/J = N/N·m = 1/m. So we get a spatial frequency here. That’s what we want, as we’re multiplying it with x!

So the answer is yes: we can re-scale energy and momentum and we get a temporal and spatial frequency respectively, which we can multiply with t and x respectively: we do not need to change our time and distance units when re-scaling E and p by dividing by ħ!

The next question is: if we express energy and momentum as temporal and spatial frequencies, do our $E = m c^2$ and $p = m v$ formulas still apply? They should: both c and v are expressed in meter per second (m/s) and, as mentioned above, the re-scaling does not affect our time and distance units. Hence, the energy-mass equivalence relation, and the definition of p (p = m·v), imply that we can re-write the argument (φ) of our ‘new’ wavefunction – i.e. Φ(φ) – as:

$$\phi = E\cdot t - \mathbf{p}\cdot\mathbf{x} = m c^2\cdot t - m v\cdot x = m c^2\left[t - (v/c)\cdot(x/c)\right]$$

In effect, when re-scaling our energy and momentum values, we’ve also re-scaled our unit of inertia, i.e. the unit in which we measure the mass m, which is directly related to both energy as well as momentum. To be precise, from a math point of view, m is nothing but a proportionality constant in both the $E = m c^2$ and $p = m v$ formulas.

The next step is to fiddle with the time and distance units. If we

1. measure x and t in equivalent units (so c = 1);
2. denote v/c by β; and
3. re-use the x symbol to denote x/c (that’s just to simplify by saving symbols);

we get:

φ = m·(t–β∙x)

This argument is the product of two factors: (1) m and (2) t–β∙x.

1. The first factor – i.e. the mass m – is an inherent property of the particle that we’re looking at: it measures its inertia, i.e. the key variable in any dynamical model (i.e. any model – classical or quantum-mechanical – representing the motion of the particle).
2. The second factor – i.e. t–β∙x – reminds one of the argument of the wavefunction that’s used in classical mechanics, i.e. x–vt, with v the velocity of the wave. Of course, we should note two major differences between the t–β∙x and x–vt expressions:
1. β is a relative velocity (i.e. a ratio between 0 and 1), while v is an absolute velocity (i.e. a number between 0 and ≈ 299,792,458 m/s).
2. The t–β∙x expression switches the time and distance variables as compared to the x–vt expression, and vice versa.

Both differences are important, but let’s focus on the second one. From a math point of view, the t–β∙x and x–vt expressions are equivalent. However, time is time, and distance is distance—in physics, that is. So what can we conclude here? To answer that question, let’s re-analyze the x–vt expression. Remember its origin: if we have some wave function F(x–vt), and we add some time Δt to its argument – so we’re looking at F[x−v(t+Δt)] now, instead of F(x−vt) – then we can restore it to its former value by also adding some distance Δx = v∙Δt to the argument: indeed, if we do so, we get F[x+Δx−v(t+Δt)] = F(x+vΔt–vt−vΔt) = F(x–vt). Of course, we can do the same analysis the other way around, so we add some Δx and then… Well… You get the idea.

Can we do that for the F(t–β∙x) expression too? Sure. If we add some Δt to its argument, then we can restore it to its former value by also adding some distance Δx = Δt/β. Just check it: F[(t+Δt)–β(x+Δx)] = F(t+Δt–βx−βΔx) = F(t+Δt–βx−βΔt/β) = F(t–β∙x).
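Both ‘restore the argument’ checks can be run for an arbitrary waveform. The function F below is just something I made up for illustration (any function of a single argument will do), and the numbers are arbitrary too.

```python
import math

def F(arg):
    """Arbitrary waveform (an assumption): any function of one argument works."""
    return math.exp(-arg ** 2) * math.cos(4.0 * arg)

def shifted_classical(x, t, v, dt):
    """F(x - vt) after adding dt to t and dx = v*dt to x."""
    dx = v * dt
    return F((x + dx) - v * (t + dt))

def shifted_relative(x, t, beta, dt):
    """F(t - beta*x) after adding dt to t and dx = dt/beta to x."""
    dx = dt / beta
    return F((t + dt) - beta * (x + dx))

x, t, dt = 0.4, 1.2, 0.25
v, beta = 3.0, 0.6

print(F(x - v * t), shifted_classical(x, t, v, dt))
print(F(t - beta * x), shifted_relative(x, t, beta, dt))
```

Each pair of printed values is the same (up to rounding), so the value of the function is restored in both cases.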

So the mathematical equivalence between the t–β∙x and x–vt expressions is surely meaningful. The F(x–vt) function uniquely determines the waveform and, as part of that determination (or definition, if you want), it also defines its velocity v. Likewise, we can say that the Φ(φ) = Φ[m·(t–β∙x)] function defines the (relative) velocity (β) of the particle that we’re looking at—quantum-mechanically, that is.

You’ll say: we’ve got two variables here: m and β. Well… Yes and no. We can look at m as an independent variable here. In fact, if you want, we could define yet another variable – χ = φ/m = t–β∙x – and, hence, yet another wavefunction here:

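$$\Psi(\theta) = \left[e^{i\cdot(\theta/\hbar)}\right]^{\hbar} = \left[e^{i\cdot\phi}\right]^{\hbar} = \Phi(\phi) = \mathrm{X}(\chi) = \left[e^{i\cdot\phi/m}\right]^{\hbar\cdot m} = \left[e^{i\cdot\chi}\right]^{\hbar\cdot m} = \left[e^{i\cdot\theta/(\hbar\cdot m)}\right]^{\hbar\cdot m}$$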

Does that make sense? Maybe. Think of it: the spatial dimension of the wave pulse F(x–vt) – if you don’t know what I am talking about: just think of its ‘position’ – is defined by its velocity v = x/t, which – from a math point of view – is equivalent to stating: x – v∙t = 0. Likewise, if we look at our wavefunction as some pulse in space, then its spatial dimension would also be defined by its (relative) velocity, which corresponds to the classical (relative) velocity of the particle we’re looking at. So… Well… As I said, I’ll let you think of all this.

Post Scriptum:

1. You may wonder what that ħ·m factor in that $\mathrm{X}(\chi) = \left[e^{i\cdot\chi}\right]^{\hbar\cdot m} = \left[e^{i\cdot(t - \beta\cdot x)/(\hbar\cdot m)}\right]^{\hbar\cdot m}$ function actually stands for. Well… If we measure time and distance in equivalent units (so c = 1 and, therefore, E = m), and if we measure energy in units of ħ, then ħ·m corresponds to our old energy unit, i.e. E measured in joule, rather than in terms of ħ. So… Well… I don’t think we can say much more about it.
2. Another thing you may want to think about is the relativistic transformation of the wavefunction. You know that we should correct Newton’s Law of Motion for velocities approaching c. We do so by integrating the Lorentz factor. In light of the fact that we’re using the relative velocity (β) in our wave function, do you think we still need to apply such corrections for the wavefunction? What’s your guess? 🙂

Working with amplitudes

Don’t worry: I am not going to introduce the Hamiltonian matrix—not yet, that is. But this post is going a step further than my previous ones, in the sense that it will be more abstract. At the same time, I do want to stick to real physical examples so as to illustrate what we’re doing when working with those amplitudes. The example that I am going to use involves spin. So let’s talk about that first.

Spin, angular momentum and the magnetic moment

You know spin: it allows experienced pool players to do the most amazing tricks with billiard balls, making a joke of what a so-called elastic collision is actually supposed to look like. So it should not come as a surprise that spin complicates the analysis in quantum mechanics too. We dedicated several posts to that (see, for example, my post on spin and angular momentum in quantum physics) and I won’t repeat these here. Let me just repeat the basics:

1. Classical and quantum-mechanical spin do share similarities: the basic idea driving the quantum-mechanical spin model is that of an electric charge – positive or negative – spinning about its own axis (this is often referred to as intrinsic spin) as well as having some orbital motion (presumably around some other charge, like an electron in orbit with a nucleus at the center). This intrinsic spin, and the orbital motion, give our charge some angular momentum (J) and, because it’s an electric charge in motion, there is a magnetic moment (μ). To put things simply: the classical and quantum-mechanical view of things converge in their analysis of atoms or elementary particles as tiny little magnets. Hence, when placed in an external magnetic field, there is some interaction – a force – and their potential and/or kinetic energy changes. The whole system, in fact, acquires extra energy when placed in an external magnetic field.

Note: The formula for that magnetic energy is quite straightforward, both in classical as well as in quantum physics, so I’ll quickly jot it down: $U = -\boldsymbol{\mu}\cdot\mathbf{B} = -|\boldsymbol{\mu}|\cdot|\mathbf{B}|\cdot\cos\theta = -\mu\cdot B\cdot\cos\theta$. So it’s just the scalar product of the magnetic moment and the magnetic field vector, with a minus sign in front so as to get the direction right. [θ is the angle between the μ and B vectors and determines whether U as a whole is positive or negative.]
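A tiny classical sketch of that formula, with made-up numbers (a magnetic moment of magnitude 2 along the z-axis and a field of magnitude 3 that we tilt away from it), shows how the sign of U depends on the angle:

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Magnitude of a 3-vector."""
    return math.sqrt(dot(a, a))

def magnetic_energy(mu, B):
    """U = -mu . B for a magnetic moment mu in a field B (classical formula)."""
    return -dot(mu, B)

mu = (0.0, 0.0, 2.0)   # illustrative magnetic moment along z
for angle_deg in (0, 60, 90, 180):
    theta = math.radians(angle_deg)
    B = (3.0 * math.sin(theta), 0.0, 3.0 * math.cos(theta))   # field tilted by theta
    via_dot = magnetic_energy(mu, B)
    via_angle = -norm(mu) * norm(B) * math.cos(theta)
    print(angle_deg, via_dot, via_angle)
```

The energy is lowest when μ and B are aligned, zero when they are perpendicular, and highest when they point in opposite directions.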

2. The classical and quantum-mechanical view also diverge, however. They diverge, first, because of the quantum nature of spin in quantum mechanics. Indeed, while the angular momentum can take on any value in classical mechanics, that’s not the case in quantum mechanics: in whatever direction we measure, we get a discrete set of values only. For example, the angular momentum of a proton or an electron is either −ħ/2 or +ħ/2, in whatever direction we measure it. Therefore, they are referred to as spin-1/2 particles. All elementary fermions, i.e. the particles that constitute matter (as opposed to force-carrying particles, like photons), have spin 1/2.

Note: Spin-1/2 particles include, remarkably enough, neutrons too, which have the same kind of magnetic moment that a rotating negative charge would have. The neutron, in other words, is not exactly ‘neutral’ in the magnetic sense. One can explain this by noting that a neutron is not ‘elementary’, really: it consists of three quarks, just like a proton, and, therefore, it may help you to imagine that the electric charges inside are, somehow, distributed unevenly, although physicists hate such simplifications. I am noting this because the famous Stern-Gerlach experiment, which established the quantum nature of particle spin, used silver atoms, rather than protons or electrons. More in general, we’ll tend to forget about the electric charge of the particles we’re describing, assuming, most of the time, or tacitly, that they’re neutral, which helps us to sort of forget about classical theory when doing quantum-mechanical calculations!

3. The quantum nature of spin is related to another crucial difference between the classical and quantum-mechanical view of the angular momentum and the magnetic moment of a particle. Classically, the angular momentum and the magnetic moment can have any direction.

Note: I should probably briefly remind you that J is a so-called axial vector, i.e. a vector product (as opposed to a scalar product) of the radius vector r and the (linear) momentum vector p = m·v, with v the velocity vector, which points in the direction of motion. So we write: $\mathbf{J} = \mathbf{r}\times\mathbf{p} = \mathbf{r}\times m\mathbf{v} = |\mathbf{r}|\cdot|\mathbf{p}|\cdot\sin\theta\cdot\mathbf{n}$. The n vector is the unit vector perpendicular to the plane containing r and p (and, hence, v, of course) given by the right-hand rule. I am saying this to remind you that the direction of the magnetic moment and the direction of motion are not the same: the simple illustration below may help to see what I am talking about.

Back to quantum mechanics: the image above doesn’t work in quantum mechanics. We do not have an unambiguous direction of the angular momentum and, hence, of the magnetic moment. That’s where all of the weirdness of the quantum-mechanical concept of spin comes out, really. I’ll talk about that when discussing Feynman’s ‘filters’ – which I’ll do in a moment – but here I just want to remind you of the mathematical argument that I presented in the above-mentioned post. Just like in classical mechanics, we’ll have a maximum (and, hence, also a minimum) value for J, like +ħ, 0 and −ħ for a Lithium-6 nucleus. [I am just giving this rather special example of a spin-1 particle so you’re reminded we can have particles with an integer spin number too!] So, when we measure its angular momentum in any direction really, it will take on one of these three values: +ħ, 0 or −ħ. So it’s either/or, with nothing in-between. Now that leads to a funny mathematical situation: one would usually equate the maximum value of a quantity like this to the magnitude of the vector, which is equal to the (positive) square root of $J^2 = \mathbf{J}\cdot\mathbf{J} = J_x^2 + J_y^2 + J_z^2$, with $J_x$, $J_y$ and $J_z$ the components of J in the x-, y- and z-direction respectively. But we don’t have continuity in quantum mechanics, and so the concept of a component of a vector needs to be carefully interpreted. There’s nothing definite there, like in classical mechanics: all we have is amplitudes, and all we can do is calculate probabilities, or expected values based on those amplitudes.

Huh? Yes. In fact, the concept of the magnitude of a vector itself becomes rather fuzzy: all we can do really is calculate its expected value. Think of it: in the classical world, we have a $J^2 = \mathbf{J}\cdot\mathbf{J}$ product that’s independent of the direction of J. For example, if J is all in the x-direction, then $J_y$ and $J_z$ will be zero, and $J^2 = J_x^2$. If it’s all in the y-direction, then $J_x$ and $J_z$ will be zero and all of the magnitude of J will be in the y-direction only, so we write: $J^2 = J_y^2$. Likewise, if J does not have any z-component, then our $\mathbf{J}\cdot\mathbf{J}$ product will only include the x- and y-components: $\mathbf{J}\cdot\mathbf{J} = J_x^2 + J_y^2$. You get the idea: the $J^2 = \mathbf{J}\cdot\mathbf{J}$ product is independent of the direction of J exactly because, in classical mechanics, J actually has a precise and unambiguous magnitude and direction and, therefore, actually has a precise and unambiguous component in each direction. So we’d measure $J_x$, $J_y$ and $J_z$ and, regardless of the actual direction of J, we’d find its magnitude $|\mathbf{J}| = J = +\sqrt{J^2} = +(J_x^2 + J_y^2 + J_z^2)^{1/2}$.

In quantum mechanics, we just don’t have quantities like that. We say that $J_x$, $J_y$ and $J_z$ have an amplitude to take on a value that’s equal to +ħ, 0 or −ħ (or whatever other value is allowed by the spin number of the system). Now that we’re talking spin numbers, please note that this characteristic number is usually denoted by j, which is a bit confusing, but so be it. So j can be 0, 1/2, 1, 3/2, etcetera, and the number of ‘permitted values’ is 2j + 1, with each value being separated by an amount equal to ħ. So we have 1, 2, 3, 4, 5 etcetera possible values for $J_x$, $J_y$ and $J_z$ respectively. But let me get back to the lesson. We just can’t do the same thing in quantum mechanics. For starters, we can’t measure $J_x$, $J_y$ and $J_z$ simultaneously: our Stern-Gerlach apparatus has a certain orientation and, hence, measures one component of J only. So what can we do?

Frankly, we can only do some math here. The wave-mechanical approach does allow us to think of the expected value of $J^2 = \mathbf{J}\cdot\mathbf{J} = J_x^2 + J_y^2 + J_z^2$, so we write:

$$E[J^2] = E[\mathbf{J}\cdot\mathbf{J}] = E[J_x^2 + J_y^2 + J_z^2] = \,?$$

[Feynman’s use of the 〈 and 〉 brackets to denote an expected value is hugely confusing, because these brackets are also used to denote an amplitude. So I’d rather use the more commonly used E[X] notation.] Now, it is a rather remarkable property, but the expected value of the sum of two or more random variables is equal to the sum of the expected values of the variables, even if those variables may not be independent. So we can confidently use the linearity property of the expected value operator and write:

$$E[J_x^2 + J_y^2 + J_z^2] = E[J_x^2] + E[J_y^2] + E[J_z^2]$$

Now we need something else. It’s also just part of the quantum-mechanical approach to things and so you’ll just have to accept it. It sounds rather obvious but it’s actually quite deep: if we measure the x-, y- or z-component of the angular momentum of a random particle, then each of the possible values is equally likely to occur. So that means, in our case, that the +ħ, 0 or −ħ values are equally likely, so their likelihood is one in three, i.e. 1/3. Again, that sounds obvious but it’s not. Indeed, please note, once again, that we can’t measure $J_x$, $J_y$ and $J_z$ simultaneously, so the ‘or’ in x-, y- or z-component is an exclusive ‘or’. Of course, I must add this equipartition of likelihoods is valid only because we do not have a preferred direction for J: the particles in our beam have random ‘orientations’. Let me give you the lingo for this: we’re looking at an unpolarized beam. You’ll say: so what? Well… Again, think about what we’re doing here: we may or may not assume that the $J_x$, $J_y$ and $J_z$ variables are related. In fact, in classical mechanics, they surely are: they’re determined by the magnitude and direction of J. Hence, they are not random at all! But let me continue, so you see what comes out.

Because the +ħ, 0 and −ħ values are equally likely, we can write: $E[J_x^2] = \hbar^2/3 + 0/3 + (-\hbar)^2/3 = [\hbar^2 + 0 + (-\hbar)^2]/3 = 2\hbar^2/3$. In case you wonder, that’s just the definition of the expected value operator: $E[X] = p_1x_1 + p_2x_2 + \ldots = \sum p_ix_i$, with $p_i$ the likelihood of the possible value $x_i$. So we take a weighted average with the respective probabilities as the weights. However, in this case, with an unpolarized beam, the weighted average becomes a simple average.

Now, $E[J_y^2]$ and $E[J_z^2]$ are – rather unsurprisingly – also equal to $2\hbar^2/3$, so we find that $E[J^2] = E[J_x^2] + E[J_y^2] + E[J_z^2] = 3\cdot(2\hbar^2/3) = 2\hbar^2$ and, therefore, we’d say that the magnitude of the angular momentum is equal to $|\mathbf{J}| = J = +\sqrt{2}\cdot\hbar \approx 1.414\cdot\hbar$. Now that value is not equal to the maximum value of our x-, y-, z-component of J, or the component of J in whatever direction we’d want to measure it. That maximum value is ħ, without the √2 factor, so that’s some 30% less than the magnitude we’ve just calculated!
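The averaging argument above is easy to check numerically. The snippet below is a minimal sketch, not part of the original argument: it assumes an unpolarized beam (all permitted values equally likely), works in units of ħ, and uses the 2j + 1 equally spaced values mentioned earlier. The function names are just illustrative.

```python
from fractions import Fraction


def component_values(j):
    """Permitted values of one component of J, in units of hbar: -j, -j+1, ..., +j."""
    j = Fraction(j)
    count = int(2 * j) + 1  # 2j + 1 permitted values
    return [-j + k for k in range(count)]


def expected_j_squared(j):
    """E[J^2] in units of hbar^2, assuming all permitted values are equally likely."""
    values = component_values(j)
    # Simple average of the squared values for one component (unpolarized beam).
    e_component_squared = sum(m * m for m in values) / len(values)
    # Linearity of the expected value: E[Jx^2] + E[Jy^2] + E[Jz^2].
    return 3 * e_component_squared


for j in (Fraction(1, 2), Fraction(1), Fraction(3, 2)):
    print(f"j = {j}: E[J^2] = {expected_j_squared(j)} (in units of hbar^2)")
```

For the spin-1 case this reproduces the 2ħ² found above; running it for other spin numbers shows the same averaging at work.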

Now, you’ve probably fallen asleep by now but, what this actually says, is that the angular momentum, in quantum mechanics, is never completely in any direction. We can state this in another way: it implies that, in quantum mechanics, there’s no such thing really as a ‘definite’ direction of the angular momentum.

[…]

OK. Enough on this. Let’s move on to a more ‘real’ example. Before I continue though, let me generalize the results above:

[I] A particle, or a system, will have a characteristic spin number: j. That number is always an integer or a half-integer, and it determines a discrete set of possible values for the component of the angular momentum J in any direction.

[II] The number of values is equal to 2j + 1, and these values are separated by ħ, which is why they are usually measured in units of ħ, i.e. Planck’s reduced constant: $\hbar \approx 1\times10^{-34}$ J·s, so that’s tiny but real. 🙂 [It’s always good to remind oneself that we’re actually trying to describe reality.] For example, the permitted values for a spin-3/2 particle are +3ħ/2, +ħ/2, −ħ/2 and −3ħ/2 or, measured in units of ħ, +3/2, +1/2, −1/2 and −3/2. When discussing spin-1/2 particles, we’ll often refer to the two possible states as the ‘up’ and the ‘down’ state respectively. For example, we may write the amplitude for an electron or a proton to have an angular momentum in the x-direction equal to +ħ/2 or −ħ/2 as 〈+x〉 and 〈−x〉 respectively. [Don’t worry too much about it right now: you’ll get used to the notation quickly.]

[III] The classical concepts of angular momentum, and the related magnetic moment, have their limits in quantum mechanics. The magnitude of a vector quantity like angular momentum is generally not equal to the maximum value of the component of that quantity in any direction. The general rule is:

$$\mathbf{J}\cdot\mathbf{J} = j\cdot(j+1)\hbar^2 > j^2\cdot\hbar^2$$

So the maximum value of any component of J in whatever direction (i.e. j·ħ) is smaller than the magnitude of J (i.e. √[ j·(j+1)]·ħ). This implies we cannot associate any precise and unambiguous direction with quantities like the angular momentum J or the magnetic moment μ. As Feynman puts it:

“That the energy of an atom [or a particle] in a magnetic field can have only certain discrete energies is really not more surprising than the fact that atoms in general have only certain discrete energy levels—something we mentioned often in Volume I. Why should the same thing not hold for atoms in a magnetic field? It does. But it is the attempt to correlate this with the idea of an oriented magnetic moment that brings out some of the strange implications of quantum mechanics.”

A real example: the disintegration of a muon in a magnetic field

I talked about muon disintegration before, when writing a much more philosophical piece on symmetries in Nature and time reversal in particular. I used the illustration below. We’ve got an incoming muon that’s being brought to rest in a block of material, and then, as muons do, it disintegrates, emitting an electron and two neutrinos. As you can see, the decay direction is (mostly) in the direction of the axial vector that’s associated with the spin direction, i.e. the direction of the grey dashed line. However, there’s some angular distribution of the decay direction, as illustrated by the blue arrows, which are supposed to visualize the decay products, i.e. the electron and the neutrinos.

This disintegration process is very interesting from a more philosophical side. The axial vector isn’t ‘real’: it’s a mathematical concept, a pseudovector. A pseudo- or axial vector is the product of two so-called true vectors, aka polar vectors. Just look back at what I wrote about the angular momentum: the J in the J = r×p = r×m·v formula is such a vector, and its direction depends on the spin direction, which is clockwise or counter-clockwise, depending from what side you’re looking at it. Having said that, who’s to judge if the product of two ‘true’ vectors is any less ‘true’ than the vectors themselves? 🙂

The point is: the disintegration process does not respect what is referred to as P-symmetry. That’s because our mathematical conventions (like all of these right-hand rules that we’ve introduced) are unambiguous, and they tell us that the pseudovector in the mirror image of what’s going on, has the opposite direction. It has to, as per our definition of a vector product. Hence, our fictitious muon in the mirror should send its decay products in the opposite direction too! So… Well… The mirror image of our muon decay process is actually something that’s not going to happen: it’s physically impossible. So we’ve got a process in Nature here that doesn’t respect ‘mirror’ symmetry. Physicists prefer to call it ‘P-symmetry’, for parity symmetry, because it involves a flip of sign of all space coordinates, so there’s a parity inversion indeed. So there’s processes in Nature that don’t respect it but, while that’s all very interesting, it’s not what I want to write about. [Just check that post of mine if you’d want to read more.] Let me, therefore, use another illustration—one that’s more to the point in terms of what we do want to talk about here:

So we’ve got the same muon here – well… A different one, of course! 🙂 – entering that block (A) and coming to a grinding halt somewhere in the middle, and then it disintegrates in a few micro-seconds, which is an eternity at the atomic or sub-atomic scale. It disintegrates into an electron and two neutrinos, as mentioned above, with some spread in the decay direction. [In case you wonder where we can find muons… Well… I’ll let you look it up yourself.] So we have:

Now it turns out that the presence of a magnetic field (represented by the B arrows in the illustration above) can drastically change the angular distribution of decay directions. That shouldn’t surprise us, of course, but how does it work, exactly? Well… To simplify the analysis, we’ve got a polarized beam here: the spin direction of all muons before they enter the block and/or the magnetic field, i.e. at time t = 0, is in the +x-direction. So we filtered them just before they entered the block. [I will come back to this ‘filtering’ process.] Now, if the muon’s spin would stay that way, then the decay products – and the electron in particular – would just go straight, because all of the angular momentum is in that direction. However, we’re in the quantum-mechanical world here, and so things don’t stay the same. In fact, as we explained, there’s no such thing as a definite angular momentum: there’s just an amplitude to be in the +x state, and that amplitude changes in time and in space.

How exactly? Well… We don’t know, but we can apply some clever tricks here. The first thing to note is that our magnetic field will add to the energy of our muon. So, as I explained in my previous post, the magnetic field adds to the E in the exponent of our complex-valued wavefunction $a\cdot e^{-(i/\hbar)(E\cdot t - p\cdot x)}$. In our example, we’ve got a magnetic field in the z-direction only, so that $U = -\boldsymbol{\mu}\cdot\mathbf{B}$ reduces to $U = -\mu_z\cdot B$, and we can re-write our wavefunction as:

$$a\cdot e^{-(i/\hbar)[(E+U)\cdot t - p\cdot x]} = a\cdot e^{-(i/\hbar)(E\cdot t - p\cdot x)}\cdot e^{(i/\hbar)(\mu_z\cdot B\cdot t)}$$

Of course, the magnetic field only acts from t = 0 to when the muon disintegrates, which we’ll denote by the point t = τ. So what we get is that the probability amplitude of a particle that’s been in a uniform magnetic field changes by a factor $e^{(i/\hbar)(\mu_z\cdot B\cdot\tau)}$. Note that it’s a factor indeed: we use it to multiply. You should also note that this is a complex exponential, so it’s a periodic function, with its real and imaginary part oscillating between minus one and plus one. Finally, we know that $\mu_z$ can take on only certain values: for a spin-1/2 particle, they are plus or minus some number, which we’ll simply denote as μ, so that’s without the subscript, so our factor becomes:

$$e^{(i/\hbar)(\pm\mu\cdot B\cdot t)}$$

[The plus or minus sign needs to be explained here, so let’s do that quickly: we have two possible states for a spin-1/2 particle, one ‘up’, and the other ‘down’. But then we also know that the phase of our complex-valued wave function turns clockwise, which is why we have a minus sign in the exponent of our $e^{i\theta}$ expression. In short, for the ‘up’ state, we should take the positive value, i.e. +μ, but the minus sign in the exponent of our $e^{i\theta}$ function makes it negative again, so our factor is $e^{-(i/\hbar)(\mu\cdot B\cdot t)}$ for the ‘up’ state, and $e^{+(i/\hbar)(\mu\cdot B\cdot t)}$ for the ‘down’ state.]

OK. We get that, but that doesn’t get us anywhere—yet. We need another trick first. One of the most fundamental rules in quantum-mechanics is that we can always calculate the amplitude to go from one state, say φ (read: ‘phi’), to another, say χ (read: ‘khi’), if we have a complete set of so-called base states, which we’ll denote by the index i or j (which you shouldn’t confuse with the imaginary unit, of course), using the following formula:

〈 χ | φ 〉 = ∑ 〈 χ | i 〉〈 i | φ 〉

I know this is a lot to swallow, so let me start with the notation. You should read 〈 χ | φ 〉 from right to left: it’s the amplitude to go from state φ to state χ. This notation is referred to as the bra-ket notation, or the Dirac notation. [Dirac notation sounds more scientific, doesn’t it?] The right part, i.e. | φ 〉, is the ket, and the left part, i.e. 〈 χ |, is the bra. In our example, we wonder what the amplitude is for our muon staying in the +x state. Because that amplitude is time-dependent, we can write it as $A_+(\tau)$ = 〈 +x at time t = τ | +x at time t = 0 〉 = 〈 +x at t = τ | +x at t = 0 〉 or, using a very misleading shorthand, 〈 +x | +x 〉. [The shorthand is misleading because the +x in the bra obviously means something else than the +x in the ket.]

But let’s apply the rule. We’ve got two states with respect to each coordinate axis only here. For example, with respect to the z-axis, the spin values are +z and −z respectively. [As mentioned above, we actually mean that the angular momentum in this direction is either +ħ/2 or −ħ/2, aka ‘up’ or ‘down’ respectively, but then quantum theorists seem to like all kinds of symbols better, so we’ll use the +z and −z notations for these two base states here.] So now we can use our rule and write:

A+(t) = 〈 +x | +x 〉 = 〈 +x | +z 〉〈 +z | +x 〉 + 〈 +x | −z 〉〈 −z | +x 〉

You’ll say this doesn’t help us any further, but it does, because there is another set of rules, which are referred to as transformation rules, which gives us those 〈 +z | +x 〉 and 〈 −z | +x 〉 amplitudes. They’re real numbers, and it’s the same number for both amplitudes.

〈 +z | +x 〉 = 〈 −z | +x 〉 = 1/√2

This shouldn’t surprise you too much: the square root disappears when squaring, so we get two equal probabilities – 1/2, to be precise – that add up to one which – you guessed it – they have to add up to because of the normalization rule: the sum of all probabilities has to add up to one, always. [I can feel your impatience, but just hang in here for a while, as I guide you through what is likely to be your very first quantum-mechanical calculation.] Now, the 〈 +z | +x 〉 = 〈 −z | +x 〉 = 1/√2 amplitudes are the amplitudes at time t = 0, so let’s be somewhat less sloppy with our notation and write 〈 +z | +x 〉 as $C_+(0)$ and 〈 −z | +x 〉 as $C_-(0)$, so we write:

$$\langle +z \mid +x \rangle = C_+(0) = 1/\sqrt{2}$$

$$\langle -z \mid +x \rangle = C_-(0) = 1/\sqrt{2}$$

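Now we know what happens with those amplitudes over time: that $e^{(i/\hbar)(\pm\mu\cdot B\cdot t)}$ factor kicks in, and so we have:

$$C_+(t) = C_+(0)\cdot e^{-(i/\hbar)(\mu\cdot B\cdot t)} = e^{-(i/\hbar)(\mu\cdot B\cdot t)}/\sqrt{2}$$

$$C_-(t) = C_-(0)\cdot e^{+(i/\hbar)(\mu\cdot B\cdot t)} = e^{+(i/\hbar)(\mu\cdot B\cdot t)}/\sqrt{2}$$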

As for the plus and minus signs, see my remark on the tricky ± business in regard to μ. To make a long story somewhat shorter :-), our expression for A+(t) = 〈 +x at t | +x 〉 now becomes:

$$A_+(t) = \langle +x \mid +z \rangle\cdot C_+(t) + \langle +x \mid -z \rangle\cdot C_-(t)$$

Now, you wouldn’t be too surprised if I’d just tell you that the 〈 +x | +z 〉 and 〈 +x | −z 〉 amplitudes are also real-valued and equal to 1/√2, but you can actually use yet another rule we’ll generalize shortly: the amplitude to go from state φ to state χ is the complex conjugate of the amplitude to go from state χ to state φ, so we write 〈 χ | φ 〉 = 〈 φ | χ 〉*, and therefore:

〈 +x | +z 〉 = 〈 +z | +x 〉* = (1/√2)* = (1/√2)

〈 +x | −z 〉 = 〈 −z | +x 〉* = (1/√2)* = (1/√2)

So our expression for A+(t) = 〈 +x at t | +x 〉 now becomes:

$$A_+(t) = e^{-(i/\hbar)(\mu\cdot B\cdot t)}/2 + e^{+(i/\hbar)(\mu\cdot B\cdot t)}/2$$

That’s the sum of a complex-valued function and its complex conjugate, and we’ve shown more than once (see my page on the essentials, for example) that such a sum reduces to the sum of the real parts of the complex exponentials. [You should not expect any explanation of Euler’s $e^{i\theta} = \cos\theta + i\cdot\sin\theta$ rule at this level of understanding.] In short, we get the following grand result:

$$A_+(t) = \cos(\mu\cdot B\cdot t/\hbar)$$

The big question, of course: what does this actually mean? 🙂 Well… Just square this thing and you get the probabilities shown below. [Note that the period of a squared cosine function is π, instead of 2π, which you can easily verify using an online graphing tool.]
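If you’d rather see the numbers than take my word for it, here is a minimal sketch of the calculation above. It just multiplies out the two base-state amplitudes, with the signs used in the text for the ‘up’ and ‘down’ states. The values for the magnetic moment and the field strength are assumptions for illustration only: the first is an approximate magnitude for the muon, the second is entirely hypothetical.

```python
import cmath
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s

# Illustrative inputs (assumptions, not taken from the text):
MU = 4.49e-26  # approximate magnitude of the muon magnetic moment, J/T
B = 0.01       # hypothetical field strength, T


def amplitude_plus_x(t, mu=MU, b=B):
    """Amplitude A+(t) to find the muon in the +x state after a time t in the field."""
    c_plus = cmath.exp(-1j * mu * b * t / HBAR) / math.sqrt(2)   # C+(t), 'up' base state
    c_minus = cmath.exp(+1j * mu * b * t / HBAR) / math.sqrt(2)  # C-(t), 'down' base state
    x_given_plus_z = 1 / math.sqrt(2)   # <+x|+z>
    x_given_minus_z = 1 / math.sqrt(2)  # <+x|-z>
    # Sum over the two base states: <+x|+z>C+(t) + <+x|-z>C-(t)
    return x_given_plus_z * c_plus + x_given_minus_z * c_minus


def probability_plus_x(t, mu=MU, b=B):
    """Probability |A+(t)|^2 of finding the muon in the +x state at time t."""
    return abs(amplitude_plus_x(t, mu, b)) ** 2


# The squared cosine repeats when mu*B*t/hbar advances by pi.
period = math.pi * HBAR / (MU * B)
for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    t = fraction * period
    print(f"t = {fraction:.2f} periods: P(+x) = {probability_plus_x(t):.3f}")
```

The probability starts at one, drops to zero halfway through the period, and climbs back to one: that’s the periodic variation in the counting rate that Feynman talks about.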

Because you’re tired of this business, you probably don’t realize what we’ve just done. It’s spectacular and mundane at the same time. Let me quote Feynman to summarize the results:

“We find that the chance of catching the decay electron in the electron counter varies periodically with the length of time the muon has been sitting in the magnetic field. The frequency depends on the magnetic moment μ. The magnetic moment of the muon has, in fact, been measured in just this way.”

As far as I am concerned, the key result is that we’ve learned how to work with those mysterious amplitudes, and the wavefunction, in a practical way, thereby using all of the theoretical rules of the quantum-mechanical approach to real-life physical situations. I think that’s a great leap forward, and we’ll re-visit those rules in a more theoretical and philosophical démarche in the next post. As for the example itself, Feynman takes it much further, but I’ll just copy the Grand Master here:

Huh? Well… I am afraid I have to leave it at this, as I discussed the precession of ‘atomic’ magnets elsewhere (see my post on precession and diamagnetism), which gives you the same formula: $\omega = \mu\cdot B/J$ (just substitute ±ħ/2 for J). However, the derivation above approaches it from an entirely different angle, which is interesting. Of course, all fits. 🙂 However, I’ll let you do your own homework now. I hope to see you tomorrow for the mentioned theoretical discussion. Have a nice evening, or weekend – or whatever! 🙂

The Uncertainty Principle revisited

I’ve written a few posts on the Uncertainty Principle already. See, for example, my post on the energy-time expression for it (ΔE·Δt ≥ h). So why am I coming back to it once more? Not sure. I felt I left some stuff out. So I am writing this post to just complement what I wrote before. I’ll do so by explaining, and commenting on, the ‘semi-formal’ derivation of the so-called Kennard formulation of the Principle in the Wikipedia article on it.

The Kennard inequalities, σxσp ≥ ħ/2 and σEσt ≥ ħ/2, are more accurate than the more general Δx·Δp ≥ h and ΔE·Δt ≥ h expressions one often sees, which are an early formulation of the Principle by Niels Bohr, and which Heisenberg himself used when explaining the Principle in a thought experiment picturing a gamma-ray microscope. I presented Heisenberg’s thought experiment in another post, and so I won’t repeat myself here. I just want to mention that it ‘proves’ the Uncertainty Principle using the Planck-Einstein relations for the energy and momentum of a photon:

E = hf and p = h/λ

Heisenberg’s thought experiment is not a real proof, of course. But then what’s a real proof? The mentioned ‘semi-formal’ derivation looks more impressive, because more mathematical, but it’s not a ‘proof’ either (I hope you’ll understand why I am saying that after reading my post). The main difference between Heisenberg’s thought experiment and the mathematical derivation in the mentioned Wikipedia article is that the ‘mathematical’ approach is based on the de Broglie relation. That de Broglie relation looks the same as the Planck-Einstein relation (p = h/λ) but it’s fundamentally different.

Indeed, the momentum of a photon (i.e. the p we use in the Planck-Einstein relation) is not the momentum one associates with a proper particle, such as an electron or a proton, for example (so that’s the p we use in the de Broglie relation). The momentum of a particle is defined as the product of its mass (m) and velocity (v). Photons don’t have a (rest) mass, and their velocity is absolute (c), so how do we define momentum for a photon? There are a couple of ways to go about it, but the two most obvious ones are probably the following:

1. We can use the classical theory of electromagnetic radiation and show that the momentum of a photon is related to the magnetic field (we usually only analyze the electric field), and the so-called radiation pressure that results from it. It yields the p = E/c formula which we need to go from E = hf to p = h/λ, using the ubiquitous relation between the frequency, the wavelength and the wave velocity (c = λf). (In case you’re interested in the detail, just click on the radiation pressure link.)
2. We can also use the mass-energy equivalence $E = mc^2$. Hence, the equivalent mass of the photon is $E/c^2$, which is relativistic mass only. However, we can multiply that mass with the photon’s velocity, which is c, thereby getting the very same value for its momentum: $p = m\cdot c = (E/c^2)\cdot c = E/c$.
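Both routes lead to the same number, which is easy to verify. The sketch below assumes nothing beyond the relations just quoted (E = hf, c = λf, p = E/c and p = h/λ); the function names and the 500 nm wavelength are only there for illustration.

```python
H = 6.62607015e-34  # Planck constant, J*s
C = 299792458.0     # speed of light, m/s


def photon_momentum_from_energy(wavelength):
    """p = E/c, with E = h*f and f = c/wavelength."""
    frequency = C / wavelength
    energy = H * frequency
    return energy / C


def photon_momentum_from_wavelength(wavelength):
    """p = h/wavelength, the Planck-Einstein relation for momentum."""
    return H / wavelength


wavelength = 500e-9  # hypothetical example: green light, in meters
print(photon_momentum_from_energy(wavelength))
print(photon_momentum_from_wavelength(wavelength))
```

The two printed values agree up to floating-point rounding, as they should.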

So Heisenberg’s ‘proof’ uses the Planck-Einstein relations, as it analyzes the Uncertainty Principle more as an observer effect: probing matter with light, so to say. In contrast, the mentioned derivation takes the de Broglie relation itself as the point of departure. As mentioned, the de Broglie relations look exactly the same as the Planck-Einstein relationship (E = hf and p = h/λ) but the model behind is very different. In fact, that’s what the Uncertainty Principle is all about: it says that the de Broglie frequency and/or wavelength cannot be determined exactly: if we want to localize a particle, somewhat at least, we’ll be dealing with a frequency range Δf. As such, the de Broglie relation is actually somewhat misleading at first. Let’s talk about the model behind.

A particle, like an electron or a proton, traveling through space, is described by a complex-valued wavefunction, usually denoted by the Greek letter psi (Ψ) or phi (Φ). This wavefunction has a phase, usually denoted as θ (theta) which – because we assume the wavefunction is a nice periodic function – varies as a function of time and space. To be precise, we write θ as θ = ωt – kx or, if the wave is traveling in the other direction, as θ = kx – ωt.

I’ve explained this in a couple of posts already, including my previous post, so I won’t repeat myself here. Let me just note that ω is the angular frequency, which we express in radians per second, rather than cycles per second, so ω = 2πf (one cycle covers 2π rad). As for k, that’s the wavenumber, which is often described as the spatial frequency, because it’s expressed in cycles per meter or, more often (and surely in this case), in radians per meter. Hence, if we freeze time, this number is the rate of change of the phase in space. Because one cycle is, again, 2π rad, and one cycle corresponds to the wave traveling one wavelength (i.e. λ meter), it’s easy to see that k = 2π/λ. We can use these definitions to re-write the de Broglie relations E = hf and p = h/λ as:

$$E = \hbar\omega \quad\text{and}\quad p = \hbar k \quad\text{with}\quad \hbar = h/2\pi$$

What about the wave velocity? For a photon, we have c = λf and, hence, c = (2π/k)(ω/2π) = ω/k. For ‘particle waves’ (or matter waves, if you prefer that term), it’s much more complicated, because we need to distinguish between the so-called phase velocity ($v_p$) and the group velocity ($v_g$). The phase velocity is what we’re used to: it’s the product of the frequency (the number of cycles per second) and the wavelength (the distance traveled by the wave over one cycle), or the ratio of the angular frequency and the wavenumber, so we have, once again, $\lambda f = \omega/k = v_p$. However, this phase velocity is not the classical velocity of the particle that we are looking at. That’s the so-called group velocity, which corresponds to the velocity of the wave packet representing the particle (or ‘wavicle’, if you prefer that term), as illustrated below.

The animation below illustrates the difference between the phase and the group velocity even more clearly: the green dot travels with the ‘wavicles’, while the red dot travels with the phase. As mentioned above, the group velocity corresponds to the classical velocity of the particle (v). However, the phase velocity is a mathematical point that actually travels faster than light. It is a mathematical point only, which does not carry a signal (unlike the modulation of the wave itself, i.e. the traveling ‘groups’) and, hence, it does not contradict the fundamental principle of relativity theory: the speed of light is absolute, and nothing travels faster than light (except mathematical points, as you can, hopefully, appreciate now).
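To put some numbers on the two velocities, here is a small sketch. It assumes the standard relativistic expressions for the total energy and the momentum of a particle (E = γmc² and p = γmv), which I haven’t spelled out in this post, and then uses the de Broglie relations E = ħω and p = ħk. The electron mass and the chosen speed are illustrative inputs only.

```python
import math

C = 299792458.0  # speed of light, m/s


def phase_and_group_velocity(v, mass=9.1093837015e-31):
    """Return (v_phase, v_group) for a particle moving at classical velocity v.

    Assumes the relativistic total energy E = gamma*m*c^2 and momentum
    p = gamma*m*v, so that omega = E/hbar and k = p/hbar.
    """
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    energy = gamma * mass * C ** 2
    momentum = gamma * mass * v
    v_phase = energy / momentum            # omega/k = (E/hbar)/(p/hbar) = E/p
    v_group = momentum * C ** 2 / energy   # dE/dp for E^2 = (p*c)^2 + (m*c^2)^2
    return v_phase, v_group


v = 0.5 * C  # hypothetical particle speed
v_phase, v_group = phase_and_group_velocity(v)
print(f"phase velocity / c = {v_phase / C:.3f}")
print(f"group velocity / c = {v_group / C:.3f}")
```

Under these assumptions the group velocity comes out equal to the particle’s velocity v, while the phase velocity equals c²/v, which is larger than c for any v smaller than c.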

The two animations above do not represent the quantum-mechanical wavefunction, because the functions that are shown are real-valued, not complex-valued. To imagine a complex-valued wave, you should think of something like the ‘wavicle’ below or, if you prefer animations, the standing waves underneath (i.e. C to H: A and B just present the mathematical model behind, which is that of a mechanical oscillator, like a mass on a spring indeed). These representations clearly show the real as well as the imaginary part of complex-valued wave-functions.

With this general introduction, we are now ready for the more formal treatment that follows. So our wavefunction Ψ is a complex-valued function in space and time. A very general shape for it is one we used in a couple of posts already:

$$\Psi(x, t) \propto e^{i(kx - \omega t)} = \cos(kx - \omega t) + i\sin(kx - \omega t)$$

If you don’t know anything about complex numbers, I’d suggest you read my short crash course on it in the essentials page of this blog, because I don’t have the space nor the time to repeat all of that. Now, we can use the de Broglie relationship relating the momentum of a particle with a wavenumber (p = ħk) to re-write our psi function as:

$$\Psi(x, t) \propto e^{i(kx - \omega t)} = e^{i(px/\hbar - \omega t)}$$

Note that I am using the ‘proportional to’ symbol (∝) because I don’t worry about normalization right now. Indeed, from all of my other posts on this topic, you know that we have to take the absolute square of all these probability amplitudes to arrive at a probability density function, describing the probability of the particle effectively being at point x in space at point t in time, and that all those probabilities, over the function’s domain, have to add up to 1. So we should insert some normalization factor.

Having said that, the problem with the wavefunction above is not normalization really, but the fact that it yields a uniform probability density function. In other words, the particle position is extremely uncertain in the sense that it could be anywhere. Let’s calculate it using a little trick: the absolute square of a complex number equals the product of itself with its (complex) conjugate. Hence, if $z = re^{i\theta}$, then $|z|^2 = zz^* = re^{i\theta}\cdot re^{-i\theta} = r^2e^{i\theta - i\theta} = r^2e^0 = r^2$. Now, in this case, assuming unique values for k, ω, p, which we’ll note as $k_0$, $\omega_0$, $p_0$ (and, because we’re freezing time, we can also write $t = t_0$), we should write:

$$|\Psi(x)|^2 = |a_0e^{i(p_0x/\hbar - \omega_0t_0)}|^2 = |a_0e^{ip_0x/\hbar}e^{-i\omega_0t_0}|^2 = |a_0e^{ip_0x/\hbar}|^2\,|e^{-i\omega_0t_0}|^2 = a_0^2$$
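A quick numerical check of this result: the sketch below evaluates the absolute square of the wavefunction at a few positions and shows that it never moves away from the square of the constant in front. All the input values (momentum, angular frequency, the frozen time, the positions) are hypothetical and chosen only to get sensible numbers.

```python
import cmath

HBAR = 1.054571817e-34  # reduced Planck constant, J*s


def plane_wave(x, t, p0, omega0, a0=1.0):
    """Psi(x, t) = a0 * exp(i*(p0*x/hbar - omega0*t)) for single values p0 and omega0."""
    return a0 * cmath.exp(1j * (p0 * x / HBAR - omega0 * t))


def density(x, t, p0, omega0, a0=1.0):
    """Absolute square of the wavefunction, i.e. the probability density."""
    return abs(plane_wave(x, t, p0, omega0, a0)) ** 2


# Hypothetical inputs, for illustration only.
P0 = 1.0e-24      # momentum, kg*m/s
OMEGA0 = 1.0e15   # angular frequency, rad/s
T0 = 2.0e-15      # frozen time, s
A0 = 0.5          # constant in front of the exponential

for x in (0.0, 1.0e-10, 3.7e-10, 1.0e-9):
    print(f"x = {x:.1e} m: |Psi|^2 = {density(x, T0, P0, OMEGA0, A0):.6f}")
```

Every line prints the same density, whatever the position: that’s the uniform distribution the formula above describes.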

Note that, this time around, I did insert some normalization constant $a_0$ as well, so that’s OK. But so the problem is that this very general shape of the wavefunction gives us a constant as the probability density for the particle being somewhere between some point a and another point b in space. More formally, we get the area of a rectangle when we calculate the probability P[a ≤ X ≤ b] as we should calculate it, which is as follows:

$$P[a \le X \le b] = \int_a^b |\Psi(x)|^2\,dx$$

More specifically, because we’re talking one-dimensional space here, we get $P[a \le X \le b] = (b - a)\cdot a_0^2$. Now, you may think that such uniform probability makes sense. For example, an electron may be in some orbital around a nucleus, and so you may think that all ‘points’ on the orbital (or within the ‘sphere’, or whatever volume it is) may be equally likely. Or, in another example, we may know an electron is going through some slit and, hence, we may think that all points in that slit should be equally likely positions. However, we know that it is not the case. Measurements show that not all points are equally likely. For an orbital, we get complicated patterns, such as the one shown below, and please note that the different colors represent different complex numbers and, hence, different probabilities.

Also, we know that electrons going through a slit will produce an interference pattern—even if they go through it one by one! Hence, we cannot associate some flat line with them: it has to be a proper wavefunction which implies, once again, that we can’t accept a uniform distribution.

In short, uniform probability density functions are not what we see in Nature. They’re non-uniform, like the (very simple) non-uniform distributions shown below. [The left-hand side shows the wavefunction, while the right-hand side shows the associated probability density function: the first two are static (i.e. they do not vary in time), while the third one shows a probability distribution that does vary with time.]

I should also note that, even if you would dare to think that a uniform distribution might be acceptable in some cases (which, let me emphasize this, it is not), an electron can surely not be ‘anywhere’. Indeed, the normalization condition implies that, if we’d have a uniform distribution and if we’d consider all of space, i.e. if we let a go to –∞ and b to +∞, then $a_0$ would tend to zero, which means we’d have a particle that is, literally, everywhere and nowhere at the same time.

In short, a uniform probability distribution does not make sense: we’ll generally have some idea of where the particle is most likely to be, within some range at least. I hope I made myself clear here.

Now, before I continue, I should make some other point as well. You know that the Planck constant (h or ħ) is unimaginably small: about 1×10⁻³⁴ J·s (joule-second). In fact, I’ve repeatedly made that point in various posts. However, having said that, I should add that, while it’s unimaginably small, the uncertainties involved are quite significant. Let us indeed look at the value of ħ by relating it to that σxσp ≥ ħ/2 relation.

Let’s first look at the units. The uncertainty in the position should obviously be expressed in distance units, while momentum is expressed in kg·m/s units. So that works out, because 1 joule is the energy transferred (or work done) when applying a force of 1 newton (N) over a distance of 1 meter (m). In turn, one newton is the force needed to accelerate a mass of one kg at the rate of 1 meter per second per second (this is not a typing mistake: it’s an acceleration of 1 m/s per second, so the unit is m/s²: meter per second squared). Hence, 1 J·s = 1 N·m·s = 1 kg·m/s²·m·s = 1 kg·m²/s. Now, that’s the same dimension as the ‘dimensional product’ for momentum and distance: m·kg·m/s = kg·m²/s.
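
The unit bookkeeping above can be checked mechanically. What follows is a small sketch, assuming we represent a dimension as a dictionary of exponents for kg, m and s; the helper name multiply is hypothetical and only serves this illustration.

```python
def multiply(*dims):
    """Multiply physical dimensions by adding the exponents of kg, m and s."""
    result = {}
    for d in dims:
        for unit, power in d.items():
            result[unit] = result.get(unit, 0) + power
    # Drop units whose exponents cancelled out.
    return {unit: power for unit, power in result.items() if power != 0}

KG, M, S, PER_S = {"kg": 1}, {"m": 1}, {"s": 1}, {"s": -1}

newton = multiply(KG, M, PER_S, PER_S)   # kg·m/s²
joule = multiply(newton, M)              # N·m
action = multiply(joule, S)              # J·s, the dimension of h and ħ
momentum = multiply(KG, M, PER_S)        # kg·m/s
position_times_momentum = multiply(M, momentum)

print("J·s        ->", action)
print("m · kg·m/s ->", position_times_momentum)
print("same dimension:", action == position_times_momentum)
```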

Now, these units (kg, m and s) are all rather astronomical at the atomic scale and, hence, h and ħ are usually expressed in other units, notably eV·s (electronvolt-second). However, using the standard SI units gives us a better idea of what we’re talking about. If we split the ħ = 1×10⁻³⁴ J·s value (let’s forget about the 1/2 factor for now) ‘evenly’ over σx and σp – whatever that means: all depends on the units, of course! – then both factors will have magnitudes of the order of 1×10⁻¹⁷: 1×10⁻¹⁷ m times 1×10⁻¹⁷ kg·m/s gives us 1×10⁻³⁴ J·s.

You may wonder how this 1×10⁻¹⁷ m compares to, let’s say, the classical electron radius, for example. The classical electron radius is, roughly speaking, the ‘space’ an electron seems to occupy as it scatters incoming light. The idea is illustrated below (credit for the image goes to Wikipedia, as usual). The classical electron radius – or Thomson scattering length – is about 2.818×10⁻¹⁵ m, so that’s almost 300 times our ‘uncertainty’ (1×10⁻¹⁷ m). Not bad: it means that we can effectively relate our ‘uncertainty’ in regard to the position to some actual dimension in space. In this case, we’re talking the femtometer scale (1 fm = 10⁻¹⁵ m), and so you’ve surely heard of this before.

What about the other ‘uncertainty’, the one for the momentum (1×10⁻¹⁷ kg·m/s)? What’s the typical (linear) momentum of an electron? Its mass, expressed in kg, is about 9.1×10⁻³¹ kg. We also know its relative velocity in an atom: it’s that magical number α = v/c, about which I wrote in some other posts already, so v = αc ≈ 0.0073·3×10⁸ m/s ≈ 2.2×10⁶ m/s. Now, 9.1×10⁻³¹ kg times 2.2×10⁶ m/s is about 2×10⁻²⁴ kg·m/s, so our proposed ‘uncertainty’ in regard to the momentum (1×10⁻¹⁷ kg·m/s) is five million times larger than the typical value for it. Now that is, obviously, not so good. [Note that calculations like this are extremely rough. In fact, when one talks electron momentum, it’s usually angular momentum, which is ‘analogous’ to linear momentum, but angular momentum involves very different formulas. If you want to know more about this, check my post on it.]

Of course, now you may feel that we didn’t ‘split’ the uncertainty in a way that makes sense: those –17 exponents don’t work, obviously. So let’s take 1×10⁻²⁴ kg·m/s for σp, which is half of that ‘typical’ value we calculated. Then we’d have 1×10⁻¹⁰ m for σx (1×10⁻¹⁰ m times 1×10⁻²⁴ kg·m/s is, once again, 1×10⁻³⁴ J·s). But then that uncertainty suddenly becomes a huge number: 1×10⁻¹⁰ m is 1 angstrom. That’s roughly the size of an entire atom! So it’s huge as compared to the pico- or femto-meter scale (1 pm = 1×10⁻¹² m, 1 fm = 1×10⁻¹⁵ m) which we’d sort of expect to see when we’re talking electrons.
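
The bookkeeping in this digression is easy to redo for any split. Here is a minimal sketch, assuming the rounded ħ = 1×10⁻³⁴ J·s used above, dropping the 1/2 factor, and reading the relation as an equality. The trial values for σp and the function name position_spread are illustrative only.

```python
import math

HBAR = 1e-34  # J·s, rounded value from the text (assumption: 1/2 factor ignored)

def position_spread(momentum_spread: float, hbar: float = HBAR) -> float:
    """σx (in m) implied by σx·σp = ħ for a chosen σp (in kg·m/s)."""
    return hbar / momentum_spread

# The 'even' split: both factors equal the square root of ħ.
even_split = math.sqrt(HBAR)
print(f"even split: {even_split:.0e} m times {even_split:.0e} kg·m/s")

# A few trial values for σp and the position spread each one forces on us.
for sigma_p in (1e-17, 1e-24, 1e-26):
    print(f"σp = {sigma_p:.0e} kg·m/s  ->  σx = {position_spread(sigma_p):.0e} m")
```

Whatever split one picks, squeezing σp by some factor inflates σx by the same factor, which is why neither number can be made ‘comfortable’ without the other one blowing up.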

OK. Let me get back to the lesson. Why this digression? Not sure. I think I just wanted to show that the Uncertainty Principle involves ‘uncertainties’ that are extremely relevant: despite the unimaginable smallness of the Planck constant, these uncertainties are quite significant at the atomic scale. But back to the ‘proof’ of Kennard’s formulation. Here we need to discuss the ‘model’ we’re using. The rather simple animation below (again, credit for it has to go to Wikipedia) illustrates it wonderfully.

Look at it carefully: we start with a ‘wave packet’ that looks a bit like a normal distribution, but it isn’t, of course. We have negative and positive values, and normal distributions don’t have that. So it’s a wave alright. Of course, you should, once more, remember that we’re only seeing one part of the complex-valued wave here (the real or imaginary part—it could be either). But so then we’re superimposing waves on it. Note the increasing frequency of these waves, and also note how the wave packet becomes increasingly localized with the addition of these waves. In fact, the so-called Fourier analysis, of which you’ve surely heard before, is a mathematical operation that does the reverse: it separates a wave packet into its individual component waves.
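
The localization effect described above can be sketched numerically. The code below is only an illustration under simplifying assumptions of my choosing: equal-weight plane waves e^(ikx), wavenumbers k spaced evenly around a central value, and the width measured at half the peak of the squared magnitude. The animation itself may well use different weights.

```python
import cmath
import math

def packet_intensity(x, wavenumbers):
    """Squared magnitude of an equal-weight superposition of plane waves e^(ikx)."""
    psi = sum(cmath.exp(1j * k * x) for k in wavenumbers)
    return abs(psi) ** 2

def width_at_half_maximum(num_waves, k_center=10.0, k_step=0.1, samples=4001):
    """Width of the central peak when num_waves component waves are added.

    The wavenumbers sit k_step apart around k_center, so the pattern repeats
    every 2π/k_step; only one period around x = 0 is scanned.
    """
    offsets = [n - (num_waves - 1) / 2 for n in range(num_waves)]
    wavenumbers = [k_center + k_step * m for m in offsets]
    half_period = math.pi / k_step
    dx = 2 * half_period / (samples - 1)
    xs = [-half_period + i * dx for i in range(samples)]
    intensity = [packet_intensity(x, wavenumbers) for x in xs]
    peak = max(intensity)
    # Count the grid points at or above half the peak and convert to a length.
    return dx * sum(1 for value in intensity if value >= peak / 2)

for num_waves in (3, 9, 27):
    print(num_waves, "waves -> width", round(width_at_half_maximum(num_waves), 2))
```

Each time more component waves (and hence a wider range of wavenumbers) go into the sum, the central peak gets narrower.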

So now we know the ‘trick’ for reducing the uncertainty in regard to the position: we just add waves with different frequencies. Of course, different frequencies imply different wavenumbers and, through the de Broglie relationship, we’ll also have different values for the ‘momentum’ associated with these component waves. Let’s write these various values as kₙ, ωₙ, and pₙ respectively, with n going from 1 to N. Of course, our point in time remains frozen at t₀. So we get a wavefunction that’s, quite simply, the sum of N component waves and so we write:

Ψ(x) = ∑ aₙ·e^(i(pₙx/ħ – ωₙt₀)) = ∑ aₙ·e^(ipₙx/ħ)·e^(–iωₙt₀) = ∑ Aₙ·e^(ipₙx/ħ)

Note that, because of the e^(–iωₙt₀) factor, we now have complex-valued coefficients Aₙ = aₙ·e^(–iωₙt₀) in front. More formally, we say that Aₙ represents the relative contribution of the mode pₙ to the overall Ψ(x) wave. Hence, we can write these coefficients A as a function of p. Because Greek letters always make more of an impression, we’ll use the Greek letter Φ (phi) for it. 🙂 Now, we can go to the continuum limit and, hence, transform that sum above into an infinite sum, i.e. an integral. So our wave function then becomes an integral over all possible modes, which we write as:

Ψ(x) = (1/√(2πħ))·∫ Φ(p)·e^(ipx/ħ) dp

Don’t worry about that new 1/√(2πħ) factor in front. That’s, once again, something that has to do with normalization and scales. It’s the integral itself you need to understand. We’ve got that Φ(p) function there, which is nothing but our Aₙ coefficient, but for the continuum case. In fact, these relative contributions Φ(p) are now referred to as the amplitude of all modes p, and so Φ(p) is actually another wave function: it’s the wave function in the so-called momentum space.

You’ll probably be very confused now, and wonder where I want to go with an integral like this. The point to note is simple: if we have that Φ(p) function, we can calculate (or derive, if you prefer that word) the Ψ(x) from it using that integral above. Indeed, the integral above is referred to as the Fourier transform, and it’s obviously closely related to that Fourier analysis we introduced above.

Of course, there is also an inverse transform, which looks exactly the same: it just switches the wave functions (Ψ and Φ) and variables (x and p), and then (it’s an important detail!), it has a minus sign in the exponent. Together, the two functions – as defined by each other through these two integrals – form a so-called Fourier integral pair, also known as a Fourier transform pair, and the variables involved are referred to as conjugate variables. So momentum (p) and position (x) are conjugate variables and, likewise, energy and time are also conjugate variables (but I won’t expand on the time-energy relation here: please have a look at one of my other posts on that).

Now, I thought of copying and explaining the proof of Kennard’s inequality from Wikipedia’s article on the Uncertainty Principle (you need to click on the show button in the relevant section to see it), but then that’s pretty boring math, and simply copying stuff is not my objective with this blog. More importantly, the proof has nothing to do with physics. Nothing at all. Indeed, it just proves a general mathematical property of Fourier pairs. More specifically, it proves that, the more concentrated one function is, the more spread out its Fourier transform must be. In other words, it is not possible to arbitrarily concentrate both a function and its Fourier transform.

So, in this case, if we’d ‘squeeze’ Ψ(x), then its Fourier transform Φ(p) will ‘stretch out’, and so that’s what the proof in that Wikipedia article basically shows. In other words, there is some ‘trade-off’ between the ‘compaction’ of Ψ(x), on the one hand, and Φ(p), on the other, and so that is what the Uncertainty Principle is all about. Nothing more, nothing less.
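
That trade-off can be made tangible with a numerical sketch. It is an illustration only, under assumptions I am adding: ħ is set to 1, Ψ(x) is a Gaussian, and Φ(p) = (1/√(2πħ))·∫ Ψ(x)·e^(–ipx/ħ) dx is approximated by a plain Riemann sum on a grid. The helper names grid, spread and uncertainty_product are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # assumption: natural units, chosen only to keep the numbers readable

def grid(limit, points):
    """Evenly spaced values from -limit to +limit, plus the spacing."""
    step = 2 * limit / (points - 1)
    return [-limit + i * step for i in range(points)], step

def spread(values, weights):
    """Standard deviation of values under an unnormalized density."""
    total = sum(weights)
    mean = sum(v * w for v, w in zip(values, weights)) / total
    variance = sum((v - mean) ** 2 * w for v, w in zip(values, weights)) / total
    return math.sqrt(variance)

def uncertainty_product(sigma_x, points=301):
    """Return (σx, σp, σx·σp) for a Gaussian Ψ(x) with position spread sigma_x."""
    xs, dx = grid(8 * sigma_x, points)
    psi = [math.exp(-x * x / (4 * sigma_x ** 2)) for x in xs]
    # Momentum-space wavefunction, approximated by a Riemann sum over x.
    ps, _ = grid(4 * HBAR / sigma_x, points)
    scale = dx / math.sqrt(2 * math.pi * HBAR)
    phi = [scale * sum(a * cmath.exp(-1j * p * x / HBAR) for a, x in zip(psi, xs))
           for p in ps]
    sx = spread(xs, [a * a for a in psi])
    sp = spread(ps, [abs(b) ** 2 for b in phi])
    return sx, sp, sx * sp

results = {s: uncertainty_product(s) for s in (0.5, 1.0, 2.0)}
for sigma_x, (sx, sp, product) in results.items():
    print(f"σx = {sx:.3f}   σp = {sp:.3f}   σx·σp = {product:.3f}")
```

Squeezing Ψ(x) stretches Φ(p) by the same factor, so the product does not budge. For a Gaussian it should come out close to ħ/2, which is the lower bound in Kennard’s inequality.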

But… Yes? What’s all this talk about ‘squeezing’ and ‘compaction’? We can’t change reality, can we? Well… Here we’re entering the philosophical field, of course. How do we interpret the Uncertainty Principle? It surely does look like us trying to measure something has some impact on the wavefunction. In fact, our measurement – of either position or momentum – usually makes the wavefunction collapse: we suddenly know where the particle is and, hence, Ψ(x) seems to collapse into one point. Alternatively, we measure its momentum and, hence, Φ(p) collapses.

That’s intriguing. In fact, even more intriguing is the possibility we may only partially affect those wavefunctions with measurements that are somewhat less ‘drastic’. It seems a lot of research is focused on that (just Google for partial collapse of the wavefunction, and you’ll find tons of references, including presentations like this one).

Hmm… I need to further study the topic. The decomposition of a wave into its component waves is obviously something that works well in physics—and not only in quantum mechanics but also in much more mundane examples. Its most general application is signal processing, in which we decompose a signal (which is a function of time) into the frequencies that make it up. Hence, our wavefunction model makes a lot of sense, as it mirrors the physics involved in oscillators and harmonics obviously.

Still… I feel it doesn’t answer the fundamental question: what is our electron really? What do those wave packets represent? Physicists will say questions like this don’t matter: as long as our mathematical models ‘work’, it’s fine. In fact, if even Feynman said that nobody – including himself – truly understands quantum mechanics, then I should just be happy and move on. However, for some reason, I can’t quite accept that. I should probably focus some more on that de Broglie relationship, p = h/λ, as it’s obviously as fundamental to my understanding of the ‘model’ of reality in physics as that Fourier analysis of the wave packet. So I need to do some more thinking on that.

The de Broglie relationship is not intuitive. In fact, I am not ashamed to admit that it actually took me quite some time to understand why we can’t just re-write the de Broglie relationship (λ = h/p) as an uncertainty relation itself: Δλ = h/Δp. Hence, let me be very clear on this:

Δx = h/Δp (that’s the Uncertainty Principle) but Δλ ≠ h/Δp !

Let me quickly explain why.

If the Δ symbol expresses a standard deviation (or some other measurement of uncertainty), we can write the following:

p = h/λ ⇒ Δp = Δ(h/λ) = hΔ(1/λ) ≠ h/Δλ

So I can take h out of the brackets after the Δ symbol, because that’s one of the things that’s allowed when working with standard deviations. More in particular, one can prove the following:

1. The standard deviation of some constant function is 0: Δ(k) = 0
2. The standard deviation is invariant under changes of location: Δ(x + k) = Δ(x)
3. Finally, the standard deviation scales directly with the scale of the variable: Δ(kx) = |k|·Δ(x).
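
These three rules are easy to check on a handful of numbers, and the same check shows that the spread of a reciprocal is not the reciprocal of the spread. Here is a minimal sketch using Python’s statistics.pstdev as the Δ operator; the sample values and the constant k are arbitrary illustrative choices.

```python
import statistics

samples = [1.0, 2.0, 4.0, 8.0]   # arbitrary illustrative values
k = 3.0                          # arbitrary constant

delta = statistics.pstdev        # population standard deviation plays the role of Δ

base = delta(samples)
constant = delta([k] * len(samples))          # rule 1: Δ(k) = 0
shifted = delta([x + k for x in samples])     # rule 2: Δ(x + k) = Δ(x)
scaled = delta([k * x for x in samples])      # rule 3: Δ(kx) = |k|·Δ(x)
reciprocal = delta([1 / x for x in samples])  # no such rule for Δ(1/x)

print("Δ(x)     =", base)
print("Δ(k)     =", constant)
print("Δ(x + k) =", shifted)
print("Δ(kx)    =", scaled, "vs |k|·Δ(x) =", abs(k) * base)
print("Δ(1/x)   =", reciprocal, "vs 1/Δ(x) =", 1 / base)
```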

However, it is not the case that Δ(1/x) = 1/Δx. Still, let’s not focus on what we cannot do with Δx: let’s see what we can do with it. Δx equals h/Δp according to the Uncertainty Principle, if we take it as an equality rather than as an inequality, that is. And then we have the de Broglie relationship: p = h/λ. Hence, Δx must equal:

Δx = h/Δp = h/[Δ(h/λ)] = h/[hΔ(1/λ)] = 1/Δ(1/λ)

That’s obvious, but so what? As mentioned, we cannot write Δx = Δλ, because there’s no rule that says that Δ(1/λ) = 1/Δλ and, therefore, h/Δp ≠ Δλ. However, what we can do is define Δλ as an interval, or a length, defined by the difference between its lower and upper bound (let’s denote those two values by λa and λb respectively). Hence, we write Δλ = λb – λa. Note that this does not assume we have a continuous range of values for λ: we can have any number of wavelengths λ between λa and λb, but so you see the point: we’ve got a range of values λ, discrete or continuous, defined by some lower and upper bound.

Now, the de Broglie relation associates two values pa and pb with λa and λb respectively: pa = h/λa and pb = h/λb. Hence, we can similarly define the corresponding Δp interval as pa – pb. Note that, because we’re taking the reciprocal, we have to reverse the order of the values here: if λb > λa, then pa = h/λa > pb = h/λb. Hence, writing λ1 and λ2 for λa and λb, we get Δp = Δ(h/λ) = pa – pb = h/λ1 – h/λ2 = h(1/λ1 – 1/λ2) = h(λ2 – λ1)/(λ1λ2). In case you have a bit of difficulty, just draw some reciprocal functions (like the ones below), and have fun connecting intervals on the horizontal axis with intervals on the vertical axis using these functions.

Now, h(λ2 – λ1)/(λ1λ2) is obviously something very different from h/Δλ = h/(λ2 – λ1). So we can surely not equate the two and, hence, we cannot write that Δp = h/Δλ.

Having said that, the Δx = 1/Δ(1/λ) = λ1λ2/(λ2 – λ1) that emerges here is quite interesting. We’ve got a ratio here, λ1λ2/(λ2 – λ1), which shows that Δx depends only on the upper and lower bounds of the Δλ range. It does not depend on whether or not the interval is discrete or continuous.

The second thing that is interesting to note is that Δx depends not only on the difference between those two values (i.e. the length of the interval) but also on their value: if the length of the interval, i.e. the difference between the two wavelengths, is the same, but their values as such are higher, then we get a higher value for Δx, i.e. a greater uncertainty in the position. Again, this shows that the relation between Δλ and Δx is not straightforward. But so we knew that already, and so I’ll end this post right here and right now. 🙂
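
For those who like to play with numbers, the interval relations above fit in a few lines. This is a sketch under assumptions I am adding for readability: h is set to 1, wavelengths are exact fractions in arbitrary units, and the uncertainty relation is read as the equality Δx = h/Δp. The function name interval_widths is hypothetical.

```python
from fractions import Fraction

def interval_widths(lam_lo, lam_hi, h=1):
    """Return (Δλ, Δp, Δx) for wavelengths between lam_lo and lam_hi.

    Δp = h/lam_lo - h/lam_hi (the order flips because of the reciprocal),
    and Δx = h/Δp, which equals lam_lo·lam_hi/(lam_hi - lam_lo).
    """
    lam_lo, lam_hi = Fraction(lam_lo), Fraction(lam_hi)
    delta_lambda = lam_hi - lam_lo
    delta_p = h / lam_lo - h / lam_hi
    delta_x = h / delta_p
    return delta_lambda, delta_p, delta_x

# Same interval length (Δλ = 1), but at increasingly large wavelengths.
for lam_lo, lam_hi in ((1, 2), (10, 11), (100, 101)):
    d_lambda, d_p, d_x = interval_widths(lam_lo, lam_hi)
    print(f"λ in [{lam_lo}, {lam_hi}]: Δλ = {d_lambda}, Δp = {d_p}, "
          f"h/Δλ = {1 / d_lambda}, Δx = {d_x}")
```

With Δλ fixed at 1, h/Δλ stays put while Δp shrinks and Δx grows as the wavelengths themselves get larger, which is exactly why Δp cannot be written as h/Δλ.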