# This year’s Nobel Prize for Physics…

One of my beloved brothers just sent me the news on this year’s Nobel Prize for Physics. Of course, it went to the MIT/Caltech LIGO scientists – who confirmed the reality of gravitational waves. That’s exactly the topic that I am exploring when trying to digest all this quantum math and stuff. Brilliant !

I actually sent the physicists a congratulatory message – and my paper ! I can’t believe I actually did that.

In the best case, I just made a fool of myself. In the worst case… Well… I just made a fool of myself.

# Electron and photon strings

Note: I have published a paper that is very coherent and fully explains what the idea of a photon might be. There is nothing stringy. Check it out: The Meaning of the Fine-Structure Constant. No ambiguity. No hocus-pocus.

Jean Louis Van Belle, 23 December 2018

Original post:

In my previous posts, I’ve been playing with… Well… At the very least, a new didactic approach to understanding the quantum-mechanical wavefunction. I just boldly assumed the matter-wave is a gravitational wave. I did so by associating its components with the dimension of gravitational field strength: newton per kg, which is the dimension of acceleration (N/kg = m/s²). Why? When you remember the physical dimension of the electromagnetic field is N/C (force per unit charge), then that’s kinda logical, right? The math is beautiful. Key consequences include the following:

1. Schrödinger’s equation becomes an energy diffusion equation.
2. Energy densities give us probabilities.
3. The elementary wavefunction for the electron gives us the electron radius.
4. Spin angular momentum can be interpreted as reflecting the right- or left-handedness of the wavefunction.
5. Finally, the mysterious boson-fermion dichotomy is no longer “deep down in relativistic quantum mechanics”, as Feynman famously put it.

It’s all great. Every day brings something new. Today I want to focus on our weird electron model and how we get God’s number (aka the fine-structure constant) out of it. Let’s recall the basics of it. We had the elementary wavefunction:

$$\psi = a\cdot e^{-i[E\cdot t - \mathbf{p}\cdot\mathbf{x}]/\hbar} = a\cdot\cos(\mathbf{p}\cdot\mathbf{x}/\hbar - E\cdot t/\hbar) + i\cdot a\cdot\sin(\mathbf{p}\cdot\mathbf{x}/\hbar - E\cdot t/\hbar)$$

In one-dimensional space (think of a particle traveling along some line), the vectors (p and x) become scalars, and so we simply write:

$$\psi = a\cdot e^{-i[E\cdot t - p\cdot x]/\hbar} = a\cdot\cos(p\cdot x/\hbar - E\cdot t/\hbar) + i\cdot a\cdot\sin(p\cdot x/\hbar - E\cdot t/\hbar)$$

This wavefunction comes with constant probabilities |ψ|² = a², so we need to define a space outside of which ψ = 0. Think of the particle-in-a-box model. This is obvious: oscillations pack energy, and the energy of our particle is finite. Hence, each particle – be it a photon or an electron – will pack a finite number of oscillations. It will, therefore, occupy a finite amount of space. Mathematically, this corresponds to the normalization condition: all probabilities have to add up to one, as illustrated below.

Now, all oscillations of the elementary wavefunction have the same amplitude: a. [Terminology is a bit confusing here because we use the term amplitude to refer to two very different things: we may say a is the amplitude of the (probability) amplitude ψ.] So how many oscillations do we have? What is the size of our box? Let us assume our particle is an electron, and we will reduce its motion to a one-dimensional motion only: we’re thinking of it as traveling along the x-axis. We can then use the y- and z-axes as mathematical axes only: they will show us the magnitude and direction of the real and imaginary components of ψ. The animation below (for which I have to credit Wikipedia) shows what it looks like.

Of course, we can have right- as well as left-handed particle waves because, while time physically goes by in one direction only (we can’t reverse time), we can count it in two directions: 1, 2, 3, etcetera or −1, −2, −3, etcetera. In the latter case, think of time ticking away. Of course, in our physical interpretation of the wavefunction, this should explain the (spin) angular momentum of the electron, which is – for some mysterious reason that we now understand – always equal to J = ± ħ/2.

Now, because a is some constant here, we may think of our box as a cylinder along the x-axis. Now, the rest energy of an electron is about 0.511 MeV, so that’s around 8.19×10⁻¹⁴ N∙m, so it will pack some 1.24×10²⁰ oscillations per second. So how long is our cylinder here? To answer that question, we need to calculate the phase velocity of our wave. We’ll come back to that in a moment. Just note how this compares to a photon: the energy of a photon will typically be a few electronvolt only (1 eV ≈ 1.6×10⁻¹⁹ N·m) and, therefore, it will pack like 10¹⁵ oscillations per second, so that’s a density (in time) that is about 100,000 times less.
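The oscillation counts quoted above follow from f = E/h. Here is a minimal Python sketch of that arithmetic. The 2 eV photon energy is my assumption for a "typical" visible-light photon, and the variable names are just illustrative.

```python
# Oscillations per second from f = E/h, for an electron at rest and a photon.
H = 6.62607015e-34        # Planck constant (J·s)
EV = 1.602176634e-19      # one electronvolt expressed in joule

def frequency_hz(energy_joule):
    """Planck-Einstein / de Broglie relation: f = E/h."""
    return energy_joule / H

electron_energy = 0.511e6 * EV   # electron rest energy in joule
photon_energy = 2.0 * EV         # assumed "typical" visible photon (2 eV)

f_electron = frequency_hz(electron_energy)
f_photon = frequency_hz(photon_energy)
ratio = f_electron / f_photon

print(f"electron: E = {electron_energy:.3e} J, f = {f_electron:.3e} Hz")
print(f"photon:   E = {photon_energy:.3e} J, f = {f_photon:.3e} Hz")
print(f"frequency ratio electron/photon = {ratio:.2e}")
```

The exact ratio depends on which photon energy you pick, but it stays around the 10⁵ order of magnitude mentioned in the text.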

Back to the angular momentum. The classical formula for it is L = I·ω, so that’s angular mass times angular velocity. What’s the angular velocity here? That’s easy: ω = E/ħ. What’s the angular mass? If we think of our particle as a tiny cylinder, we may use the formula for its angular mass: I = m·r²/2. We have m: that’s the electron mass, right? Right? So what is r? That should be the magnitude of the rotating vector, right? So that’s a. Of course, the mass-energy equivalence relation tells us that E = mc², so we can write:

$$L = I\cdot\omega = \frac{m\cdot r^2}{2}\cdot\frac{E}{\hbar} = \frac{1}{2}\cdot a^2\cdot m\cdot\frac{m c^2}{\hbar} = \frac{1}{2}\cdot\frac{a^2\cdot m^2\cdot c^2}{\hbar}$$

Does it make sense? Maybe. Maybe not. You can check the physical dimensions on both sides of the equation, and that works out: we do get something that is expressed in N·m·s, so that’s action or angular momentum units. Now, we know L must be equal to J = ± ħ/2. [As mentioned above, the plus or minus sign depends on the left- or right-handedness of our wavefunction, so don’t worry about that.] How do we know that? Because of the Stern-Gerlach experiment, which has been repeated a zillion times, if not more. Now, if L = J, then we get the following equation for a:

$$a = \frac{\hbar}{m\cdot c}$$

This is the formula for the radius of an electron. To be precise, it is the Compton scattering radius, so that’s the effective radius of an electron as determined by scattering experiments. You can calculate it: it is about 3.8616×10⁻¹³ m, so that’s the picometer scale, as we would expect.
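You can check the radius and the angular momentum numerically. The sketch below only restates the cylinder model used above (I = m·a²/2 and ω = mc²/ħ). The constants are CODATA values and the variable names are mine.

```python
# Radius a = hbar/(m*c) and the resulting angular momentum L = I*omega.
HBAR = 1.054571817e-34    # reduced Planck constant (J·s)
C = 299792458.0           # speed of light (m/s)
M_E = 9.1093837015e-31    # electron mass (kg)

a = HBAR / (M_E * C)                 # radius in metre
omega = M_E * C**2 / HBAR            # angular velocity omega = E/hbar (rad/s)
moment_of_inertia = M_E * a**2 / 2   # angular mass of a solid cylinder
L = moment_of_inertia * omega        # angular momentum (J·s)

print(f"a = {a:.4e} m")
print(f"L / hbar = {L / HBAR:.6f}")
```

The radius should come out near 3.86×10⁻¹³ m, and L/ħ should be one half, as required by L = J.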

This is a rather spectacular result. As far as I am concerned, it is spectacular enough for me to actually believe my interpretation of the wavefunction makes sense.

Let us now try to think about the length of our cylinder once again. The period of our wave is equal to T = 1/f = 1/(ω/2π) = 1/[(E/ħ)/2π] = 1/(E/h) = h/E. Now, the phase velocity (v_p) will be given by:

$$v_p = \lambda\cdot f = \frac{2\pi}{k}\cdot\frac{\omega}{2\pi} = \frac{\omega}{k} = \frac{E/\hbar}{p/\hbar} = \frac{E}{p} = \frac{E}{m\cdot v_g} = \frac{m\cdot c^2}{m\cdot v_g} = \frac{c^2}{v_g}$$

This is very interesting, because it establishes an inverse proportionality between the group and the phase velocity of our wave, with c² as the coefficient of inverse proportionality. In fact, this equation looks better if we write it as v_p·v_g = c². Of course, the group velocity (v_g) is the classical velocity of our electron. This equation shows us the idea of an electron at rest doesn’t make sense: if v_g = 0, then v_p times zero must equal c², which cannot be the case: electrons must move in space. More generally speaking, matter-particles must move in space, with the photon as our limiting case: it moves at the speed of light. Hence, for a photon, we find that v_p = v_g = E/p = c.

How can we calculate the length of a photon or an electron? It is an interesting question. The mentioned orders of magnitude of the frequency (10¹⁵ or 10²⁰) give us the number of oscillations per second. But how many do we have in one photon, or in one electron?

Let’s first think about photons, because we have more clues here. Photons are emitted by atomic oscillators: atoms going from one state (energy level) to another. We know how to calculate the Q of these atomic oscillators (see, for example, Feynman I-32-3): it is of the order of 10⁸, which means the wave train will last about 10⁻⁸ seconds (to be precise, that is the time it takes for the radiation to die out by a factor 1/e). Now, the frequency of sodium light, for example, is 0.5×10¹⁵ oscillations per second, and the decay time is about 3.2×10⁻⁸ seconds, so that makes for (0.5×10¹⁵)·(3.2×10⁻⁸) = 16 million oscillations. Now, the wavelength is 600 nanometer (600×10⁻⁹ m), so that gives us a wavetrain with a length of (600×10⁻⁹)·(16×10⁶) = 9.6 m.

These oscillations may or may not have the same amplitude and, hence, each of these oscillations may pack a different amount of energy. However, if the total energy of our sodium light photon (i.e. about 2 eV ≈ 3.3×10⁻¹⁹ J) is to be packed in those oscillations, then each oscillation would pack about 2×10⁻²⁶ J, on average, that is. We speculated in other posts on how we might imagine the actual wave pulse that atoms emit when going from one energy state to another, so we don’t do that again here. However, the following illustration of how a transient signal dies out may be useful.
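The sodium-light numbers are easy to reproduce. This is just a sketch of the arithmetic in the two paragraphs above, using the rounded input values from the text. The variable names are mine.

```python
# Number of oscillations, wavetrain length and average energy per oscillation
# for the sodium-light photon, using the rounded values quoted in the text.
frequency = 0.5e15        # oscillations per second
decay_time = 3.2e-8       # seconds (time to die out by a factor 1/e)
wavelength = 600e-9       # metre
photon_energy = 3.3e-19   # joule (about 2 eV)

oscillations = frequency * decay_time
wavetrain_length = wavelength * oscillations
energy_per_oscillation = photon_energy / oscillations

print(f"oscillations in the wavetrain: {oscillations:.3e}")
print(f"wavetrain length: {wavetrain_length:.2f} m")
print(f"average energy per oscillation: {energy_per_oscillation:.2e} J")
```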

This calculation is interesting. It also gives us an interesting paradox: if a photon is a pointlike particle, how can we say its length is like 10 meter or more? Relativity theory saves us here. We need to distinguish the reference frame of the photon – riding along the wave as it is being emitted, so to speak – and our stationary reference frame, which is that of the emitting atom. Now, because the photon travels at the speed of light, relativistic length contraction will make it look like a pointlike particle.

What about the electron? Can we use similar assumptions? For the photon, we can use the decay time to calculate the effective number of oscillations. What can we use for an electron? We will need to make some assumption about the phase velocity or, what amounts to the same, the group velocity of the particle. What formulas can we use? The p = m·v formula is the relativistically correct formula for the momentum of an object if m = m_v, so that’s the same m we use in the E = mc² formula. Of course, v here is, obviously, the group velocity (v_g), so that’s the classical velocity of our particle. Hence, we can write:

$$p = m\cdot v_g = \frac{E}{c^2}\cdot v_g \iff v_g = \frac{p}{m} = \frac{p\cdot c^2}{E}$$

This is just another way of writing that v_g = c²/v_p or v_p = c²/v_g so it doesn’t help, does it? Maybe. Maybe not. Let us substitute in our formula for the wavelength:

$$\lambda = \frac{v_p}{f} = v_p\cdot T = v_p\cdot\frac{h}{E} = \frac{c^2}{v_g}\cdot\frac{h}{E} = \frac{h}{m\cdot v_g} = \frac{h}{p}$$

This gives us the other de Broglie relation: λ = h/p. This doesn’t help us much, although it is interesting to think about it. The f = E/h relation is somewhat intuitive: higher energy, higher frequency. In contrast, what the λ = h/p relation tells us is that we get an infinite wavelength if the momentum becomes really small. What does this tell us? I am not sure. Frankly, I’ve looked at the second de Broglie relation like a zillion times now, and I think it’s rubbish. It’s meant to be used for the group velocity, I feel. I am saying that because we get a non-sensical energy formula out of it. Look at this:

1. E = h·f and p = h/λ. Therefore, f = E/h and λ = h/p.
2. v = f·λ = (E/h)∙(h/p) = E/p
3. p = m·v. Therefore, E = v·p = m·v²

E = m·v²? This formula is only correct if v = c, in which case it becomes the E = mc² equation. So it then describes a photon, or a massless matter-particle which… Well… That’s a contradictio in terminis. In all other cases, we get nonsense.

Let’s try something different. If our particle is at rest, then p = 0 and the p·x/ħ term in our wavefunction vanishes, so it’s just:

$$\psi = a\cdot e^{-i\cdot E\cdot t/\hbar} = a\cdot\cos(E\cdot t/\hbar) - i\cdot a\cdot\sin(E\cdot t/\hbar)$$

Hence, our wave doesn’t travel. It has the same amplitude at every point in space at any point in time. Both the phase and group velocity become meaningless concepts. The amplitude varies – because of the sine and cosine – but the probability remains the same: |ψ|² = a². Hmm… So we need to find another way to define the size of our box. One of the formulas I jotted down in my paper in which I analyze the wavefunction as a gravitational wave was this one:

$$E = \sum_i m_i\cdot a_i^2\cdot\omega_i^2 = \sum_i \frac{E_i}{c^2}\cdot a_i^2\cdot\frac{E_i^2}{\hbar^2}$$

It was a physical normalization condition: the energy contributions of the waves that make up a wave packet need to add up to the total energy of our wave. Of course, for our elementary wavefunction here, the subscripts vanish and so the formula reduces to E = (E/c²)·a²·(E²/ħ²), out of which we get our formula for the scattering radius: a = ħ/mc. Now how do we pack that energy in our cylinder? Assuming that energy is distributed uniformly, we’re tempted to write something like E = a²·l or, looking at the geometry of the situation:

$$E = \pi\cdot a^2\cdot l \iff l = \frac{E}{\pi\cdot a^2}$$

It’s just the formula for the volume of a cylinder. Using the value we got for the Compton scattering radius (a = 3.8616×10⁻¹³ m), we find an l that’s equal to (8.19×10⁻¹⁴)/(π·14.9×10⁻²⁶) ≈ 0.175×10¹². Meter? Yes. We get the following formula:

$$l = \frac{E}{\pi\cdot a^2} = \frac{m^3\cdot c^4}{\pi\cdot\hbar^2}$$

0.175×10¹² m is 175 million kilometer. That’s – literally – astronomic. It corresponds to 583 light-seconds, or 9.7 light-minutes. So that’s about 1.17 times the (average) distance between the Sun and the Earth. You can see that we do need to build a wave packet: that space is a bit too large to look for an electron, right?
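For what it’s worth, here is the cylinder arithmetic as a short Python sketch. It takes the numbers at face value, exactly as the text does: E and a are the rounded values quoted above, and the astronomical unit is a standard value I am adding for the comparison.

```python
import math

# Cylinder "length" l = E / (pi * a^2), using the rounded values from the text.
E = 8.19e-14                 # electron rest energy (N·m)
a = 3.8616e-13               # Compton scattering radius (m)
C = 299792458.0              # speed of light (m/s)
AU = 1.495978707e11          # average Sun-Earth distance (m), added for comparison

l = E / (math.pi * a**2)
light_seconds = l / C
sun_earth_ratio = l / AU

print(f"l = {l:.3e}")
print(f"light-seconds: {light_seconds:.0f}, light-minutes: {light_seconds / 60:.1f}")
print(f"ratio to Sun-Earth distance: {sun_earth_ratio:.2f}")
```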

Could we possibly get some less astronomic proportions? What if we impose that l should equal a? We get the following condition:

$$\frac{l}{a} = \frac{m^4\cdot c^5}{\pi\cdot\hbar^3} = 1$$

We find that m would have to be equal to m ≈ 1.11×10⁻³⁶ kg. That’s tiny. In fact, it’s equivalent to an energy of about 0.623 eV (which you’ll see written as 623 milli-eV). This corresponds to light with a wavelength of about 2 micrometer (μm), so that’s in the infrared spectrum. It’s a funny formula: we find, basically, that the l/a ratio is proportional to m⁴. Hmm… What should we think of this? If you have any ideas, let me know!
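The mass that makes l equal to a can be checked in a few lines. The sketch assumes the condition takes the form m⁴·c⁵/(π·ħ³) = 1, which is what you get from combining l = E/(π·a²) with a = ħ/(m·c). The variable names are mine.

```python
import math

# Solve m^4 * c^5 / (pi * hbar^3) = 1 for m, then convert to energy and wavelength.
HBAR = 1.054571817e-34    # reduced Planck constant (J·s)
H = 6.62607015e-34        # Planck constant (J·s)
C = 299792458.0           # speed of light (m/s)
EV = 1.602176634e-19      # one electronvolt in joule

m = (math.pi * HBAR**3 / C**5) ** 0.25   # mass for which l/a = 1 (kg)
energy_ev = m * C**2 / EV                # equivalent energy (eV)
wavelength = H * C / (m * C**2)          # wavelength of a photon with that energy (m)

print(f"m = {m:.3e} kg")
print(f"E = {energy_ev:.3f} eV")
print(f"wavelength = {wavelength * 1e6:.2f} micrometer")
```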

Post scriptum (3 October 2017): The paper is going well. Getting lots of downloads, and the views on my blog are picking up too. But I have been vicious. Substituting B for (1/c)∙i∙E or for −(1/c)∙i∙E implies a very specific choice of reference frame. The imaginary unit is a two-dimensional concept: it only makes sense when giving it a plane view. Literally. Indeed, my formulas assume the i (or −i) plane is perpendicular to the direction of propagation of the elementary quantum-mechanical wavefunction. So… Yes. The need for rotation matrices is obvious. But my physical interpretation of the wavefunction stands.

# Wavefunctions as gravitational waves

This is the paper I always wanted to write. It is there now, and I think it is good – and that’s an understatement. It is probably best to download it as a pdf-file from the viXra.org site because this was a rather fast ‘copy and paste’ job from the Word version of the paper, so there may be issues with boldface notation (vector notation), italics and, most importantly, with formulas – which I, sadly, have to ‘snip’ into this WordPress blog, as they don’t have an easy copy function for mathematical formulas.

It’s great stuff. If you have been following my blog – and many of you have – you will want to digest this.

Abstract: This paper explores the implications of associating the components of the wavefunction with a physical dimension: force per unit mass – which is, of course, the dimension of acceleration (m/s²) and gravitational fields. The classical electromagnetic field equations for energy densities, the Poynting vector and spin angular momentum are then re-derived by substituting the electromagnetic N/C unit of field strength (force per unit charge) by the new N/kg = m/s² dimension.

The results are elegant and insightful. For example, the energy densities are proportional to the square of the absolute value of the wavefunction and, hence, to the probabilities, which establishes a physical normalization condition. Also, Schrödinger’s wave equation may then, effectively, be interpreted as a diffusion equation for energy, and the wavefunction itself can be interpreted as a propagating gravitational wave. Finally, as an added bonus, concepts such as the Compton scattering radius for a particle, spin angular momentum, and the boson-fermion dichotomy, can also be explained more intuitively.

While the approach offers a physical interpretation of the wavefunction, the author argues that the core of the Copenhagen interpretation revolves around the complementarity principle, which remains unchallenged because the interpretation of amplitude waves as traveling fields does not explain the particle nature of matter.

# Introduction

This is not another introduction to quantum mechanics. We assume the reader is already familiar with the key principles and, importantly, with the basic math. We offer an interpretation of wave mechanics. As such, we do not challenge the complementarity principle: the physical interpretation of the wavefunction that is offered here explains the wave nature of matter only. It explains diffraction and interference of amplitudes but it does not explain why a particle will hit the detector not as a wave but as a particle. Hence, the Copenhagen interpretation of the wavefunction remains relevant: we just push its boundaries.

The basic ideas in this paper stem from a simple observation: the geometric similarity between the quantum-mechanical wavefunctions and electromagnetic waves is remarkable. The components of both waves are orthogonal to the direction of propagation and to each other. Only the relative phase differs: the electric and magnetic field vectors (E and B) have the same phase. In contrast, the phases of the real and imaginary part of the (elementary) wavefunction (ψ = a·e^(−i∙θ) = a∙cosθ − i·a∙sinθ) differ by 90 degrees (π/2).[1] Pursuing the analogy, we explore the following question: if the oscillating electric and magnetic field vectors of an electromagnetic wave carry the energy that one associates with the wave, can we analyze the real and imaginary part of the wavefunction in a similar way?

We show the answer is positive and remarkably straightforward.  If the physical dimension of the electromagnetic field is expressed in newton per coulomb (force per unit charge), then the physical dimension of the components of the wavefunction may be associated with force per unit mass (newton per kg).[2] Of course, force over some distance is energy. The question then becomes: what is the energy concept here? Kinetic? Potential? Both?

The similarity between the energy of a (one-dimensional) linear oscillator (E = m·a²·ω²/2) and Einstein’s relativistic energy equation E = m∙c² inspires us to interpret the energy as a two-dimensional oscillation of mass. To assist the reader, we construct a two-piston engine metaphor.[3] We then adapt the formula for the electromagnetic energy density to calculate the energy densities for the wave function. The results are elegant and intuitive: the energy densities are proportional to the square of the absolute value of the wavefunction and, hence, to the probabilities. Schrödinger’s wave equation may then, effectively, be interpreted as a diffusion equation for energy itself.

As an added bonus, concepts such as the Compton scattering radius for a particle and spin angular momentum, as well as the boson-fermion dichotomy, can be explained in a fully intuitive way.[4]

Of course, such interpretation is also an interpretation of the wavefunction itself, and the immediate reaction of the reader is predictable: the electric and magnetic field vectors are, somehow, to be looked at as real vectors. In contrast, the real and imaginary components of the wavefunction are not. However, this objection needs to be phrased more carefully. First, it may be noted that, in a classical analysis, the magnetic force is a pseudovector itself.[5] Second, a suitable choice of coordinates may make quantum-mechanical rotation matrices irrelevant.[6]

Therefore, the author is of the opinion that this little paper may provide some fresh perspective on the question, thereby further exploring Einstein’s basic sentiment in regard to quantum mechanics, which may be summarized as follows: there must be some physical explanation for the calculated probabilities.[7]

We will, therefore, start with Einstein’s relativistic energy equation (E = mc²) and wonder what it could possibly tell us.

# I. Energy as a two-dimensional oscillation of mass

The structural similarity between the relativistic energy formula, the formula for the total energy of an oscillator, and the kinetic energy of a moving body, is striking:

1. E = mc²
2. E = mω²/2
3. E = mv²/2

In these formulas, ω, v and c all describe some velocity.[8] Of course, there is the 1/2 factor in the E = mω²/2 formula[9], but that is exactly the point we are going to explore here: can we think of an oscillation in two dimensions, so it stores an amount of energy that is equal to E = 2·m·ω²/2 = m·ω²?

That is easy enough. Think, for example, of a V-2 engine with the pistons at a 90-degree angle, as illustrated below. The 90° angle makes it possible to perfectly balance the counterweight and the pistons, thereby ensuring smooth travel at all times. With permanently closed valves, the air inside the cylinder compresses and decompresses as the pistons move up and down and provides, therefore, a restoring force. As such, it will store potential energy, just like a spring, and the motion of the pistons will also reflect that of a mass on a spring. Hence, we can describe it by a sinusoidal function, with the zero point at the center of each cylinder. We can, therefore, think of the moving pistons as harmonic oscillators, just like mechanical springs.

Figure 1: Oscillations in two dimensions

If we assume there is no friction, we have a perpetuum mobile here. The compressed air and the rotating counterweight (which, combined with the crankshaft, acts as a flywheel[10]) store the potential energy. The moving masses of the pistons store the kinetic energy of the system.[11]

At this point, it is probably good to quickly review the relevant math. If the magnitude of the oscillation is equal to a, then the motion of the piston (or the mass on a spring) will be described by x = a·cos(ω·t + Δ).[12] Needless to say, Δ is just a phase factor which defines our t = 0 point, and ω is the natural angular frequency of our oscillator. Because of the 90° angle between the two cylinders, Δ would be 0 for one oscillator, and −π/2 for the other. Hence, the motion of one piston is given by x = a·cos(ω·t), while the motion of the other is given by x = a·cos(ω·t − π/2) = a·sin(ω·t).

The kinetic and potential energy of one oscillator (think of one piston or one spring only) can then be calculated as:

1. K.E. = T = m·v²/2 = (1/2)·m·ω²·a²·sin²(ω·t + Δ)
2. P.E. = U = k·x²/2 = (1/2)·k·a²·cos²(ω·t + Δ)

The coefficient k in the potential energy formula characterizes the restoring force: F = −k·x. From the dynamics involved, it is obvious that k must be equal to m·ω². Hence, the total energy is equal to:

$$E = T + U = \frac{1}{2}\cdot m\cdot\omega^2\cdot a^2\cdot[\sin^2(\omega\cdot t + \Delta) + \cos^2(\omega\cdot t + \Delta)] = \frac{m\cdot a^2\cdot\omega^2}{2}$$

To facilitate the calculations, we will briefly assume k = m·ω² and a are equal to 1. The motion of our first oscillator is given by the cos(ω·t) = cosθ function (θ = ω·t), and its kinetic energy will be equal to sin²θ. Hence, the (instantaneous) change in kinetic energy at any point in time will be equal to:

$$\frac{d(\sin^2\theta)}{d\theta} = 2\cdot\sin\theta\cdot\frac{d(\sin\theta)}{d\theta} = 2\cdot\sin\theta\cdot\cos\theta$$

Let us look at the second oscillator now. Just think of the second piston going up and down in the V-2 engine. Its motion is given by the sinθ function, which is equal to cos(θ − π/2). Hence, its kinetic energy is equal to sin²(θ − π/2), and how it changes – as a function of θ – will be equal to:

$$2\cdot\sin(\theta - \pi/2)\cdot\cos(\theta - \pi/2) = -2\cdot\cos\theta\cdot\sin\theta = -2\cdot\sin\theta\cdot\cos\theta$$

We have our perpetuum mobile! While transferring kinetic energy from one piston to the other, the crankshaft will rotate with a constant angular velocity: linear motion becomes circular motion, and vice versa, and the total energy that is stored in the system is T + U = m·a²·ω².
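A quick numerical check of the two-piston picture, as a sketch only: two harmonic oscillators with phase factors 0 and −π/2, each with k = m·ω². The values chosen for m, a and ω are arbitrary illustrations, and the function name is mine.

```python
import math

def energies(m, a, omega, t, delta):
    """Kinetic and potential energy of x = a*cos(omega*t + delta), with k = m*omega**2."""
    x = a * math.cos(omega * t + delta)
    v = -a * omega * math.sin(omega * t + delta)
    kinetic = 0.5 * m * v**2
    potential = 0.5 * (m * omega**2) * x**2
    return kinetic, potential

m, a, omega = 1.0, 1.0, 2.0     # arbitrary illustrative values
kinetic_sums = []
totals = []
for step in range(8):
    t = 0.1 * step
    k1, u1 = energies(m, a, omega, t, 0.0)            # first piston
    k2, u2 = energies(m, a, omega, t, -math.pi / 2)   # second piston, 90 degrees behind
    kinetic_sums.append(k1 + k2)
    totals.append(k1 + u1 + k2 + u2)
    print(f"t = {t:.1f}: K1 = {k1:.4f}, K2 = {k2:.4f}, total = {totals[-1]:.4f}")
```

The kinetic energy sloshes back and forth between the two pistons, but K1 + K2 stays at m·a²·ω²/2 and the total stays at m·a²·ω².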

We have a great metaphor here. Somehow, in this beautiful interplay between linear and circular motion, energy is borrowed from one place and then returns to the other, cycle after cycle. We know the wavefunction consists of a sine and a cosine: the cosine is the real component, and the sine is the imaginary component. Could they be equally real? Could each represent half of the total energy of our particle? Should we think of the c in our E = mc² formula as an angular velocity?

These are sensible questions. Let us explore them.

# II. The wavefunction as a two-dimensional oscillation

The elementary wavefunction is written as:

$$\psi = a\cdot e^{-i[E\cdot t - \mathbf{p}\cdot\mathbf{x}]/\hbar} = a\cdot\cos(\mathbf{p}\cdot\mathbf{x}/\hbar - E\cdot t/\hbar) + i\cdot a\cdot\sin(\mathbf{p}\cdot\mathbf{x}/\hbar - E\cdot t/\hbar)$$

When considering a particle at rest (p = 0) this reduces to:

$$\psi = a\cdot e^{-i\cdot E\cdot t/\hbar} = a\cdot\cos(-E\cdot t/\hbar) + i\cdot a\cdot\sin(-E\cdot t/\hbar) = a\cdot\cos(E\cdot t/\hbar) - i\cdot a\cdot\sin(E\cdot t/\hbar)$$

Let us remind ourselves of the geometry involved, which is illustrated below. Note that the argument of the wavefunction rotates clockwise with time, while the mathematical convention for measuring the phase angle (φ) is counter-clockwise.

Figure 2: Euler’s formula

If we assume the momentum p is all in the x-direction, then the p and x vectors will have the same direction, and the dot product p∙x/ħ reduces to the ordinary product p·x/ħ. Most illustrations – such as the one below – will either freeze x or, else, t. Alternatively, one can google web animations varying both. The point is: we also have a two-dimensional oscillation here. These two dimensions are perpendicular to the direction of propagation of the wavefunction. For example, if the wavefunction propagates in the x-direction, then the oscillations are along the y- and z-axis, which we may refer to as the real and imaginary axis. Note how the phase difference between the cosine and the sine – the real and imaginary part of our wavefunction – appears to give some spin to the whole. I will come back to this.

Figure 3: Geometric representation of the wavefunction

Hence, if we say each of these oscillations carries half of the total energy of the particle, then we may refer to the real and imaginary energy of the particle respectively, and the interplay between the real and the imaginary part of the wavefunction may then describe how energy propagates through space over time.

Let us consider, once again, a particle at rest. Hence, p = 0 and the (elementary) wavefunction reduces to ψ = a·e^(−i∙E·t/ħ). Hence, the angular velocity of both oscillations, at some point x, is given by ω = −E/ħ. Now, the energy of our particle includes all of the energy – kinetic, potential and rest energy – and is, therefore, equal to E = mc².

Can we, somehow, relate this to the m·a²·ω² energy formula for our V-2 perpetuum mobile? Our wavefunction has an amplitude too. Now, if the oscillations of the real and imaginary wavefunction store the energy of our particle, then their amplitude will surely matter. In fact, the energy of an oscillation is, in general, proportional to the square of the amplitude: E ∝ a². We may, therefore, think that the a² factor in the E = m·a²·ω² energy will surely be relevant as well.

However, here is a complication: an actual particle is localized in space and can, therefore, not be represented by the elementary wavefunction. We must build a wave packet for that: a sum of wavefunctions, each with their own amplitude a_i, and their own ω_i = −E_i/ħ. Each of these wavefunctions will contribute some energy to the total energy of the wave packet. To calculate the contribution of each wave to the total, both a_i as well as E_i will matter.

What is E_i? E_i varies around some average E, which we can associate with some average mass m: m = E/c². The Uncertainty Principle kicks in here. The analysis becomes more complicated, but a formula such as the one below might make sense:

$$E = \sum_i m_i\cdot a_i^2\cdot\omega_i^2 = \sum_i \frac{E_i}{c^2}\cdot a_i^2\cdot\frac{E_i^2}{\hbar^2}$$

We can re-write this as:

$$c^2\cdot\hbar^2\cdot E = \sum_i a_i^2\cdot E_i^3$$

What is the meaning of this equation? We may look at it as some sort of physical normalization condition when building up the Fourier sum. Of course, we should relate this to the mathematical normalization condition for the wavefunction. Our intuition tells us that the probabilities must be related to the energy densities, but how exactly? We will come back to this question in a moment. Let us first think some more about the enigma: what is mass?

Before we do so, let us quickly calculate the value of c²ħ²: it is about 1×10⁻⁵¹ N²∙m⁴. Let us also do a dimensional analysis: the physical dimensions of the E = m·a²·ω² equation make sense if we express m in kg, a in m, and ω in rad/s. We then get: [E] = kg∙m²/s² = (N∙s²/m)∙m²/s² = N∙m = J. The dimensions of the left- and right-hand side of the physical normalization condition are N³∙m⁵.
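The value of c²ħ² is a one-liner to verify. A minimal sketch, using CODATA constants, with variable names of my own choosing:

```python
# Numerical value of c^2 * hbar^2, in N^2·m^4.
HBAR = 1.054571817e-34    # reduced Planck constant (N·m·s)
C = 299792458.0           # speed of light (m/s)

c2_hbar2 = C**2 * HBAR**2
print(f"c^2 * hbar^2 = {c2_hbar2:.3e} N^2·m^4")
```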

# III. What is mass?

We came up, playfully, with a meaningful interpretation for energy: it is a two-dimensional oscillation of mass. But what is mass? A new aether theory is, of course, not an option, but then what is it that is oscillating? To understand the physics behind equations, it is always good to do an analysis of the physical dimensions in the equation. Let us start with Einstein’s energy equation once again. If we want to look at mass, we should re-write it as m = E/c²:

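$$[m] = [E/c^2] = \frac{\text{J}}{(\text{m/s})^2} = \frac{\text{N}\cdot\text{m}\cdot\text{s}^2}{\text{m}^2} = \frac{\text{N}\cdot\text{s}^2}{\text{m}} = \text{kg}$$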

This is not very helpful. It only reminds us of Newton’s definition of a mass: mass is that which gets accelerated by a force. At this point, we may want to think of the physical significance of the absolute nature of the speed of light. Einstein’s E = mc² equation implies that the ratio between the energy and the mass of any particle is always the same: E/m = c². This reminds us of the ω² = C⁻¹/L or ω² = k/m of harmonic oscillators once again.[13] The key difference is that the ω² = C⁻¹/L and ω² = k/m formulas introduce two or more degrees of freedom.[14] In contrast, c² = E/m for any particle, always. However, that is exactly the point: we can modulate the resistance, inductance and capacitance of electric circuits, and the stiffness of springs and the masses we put on them, but we live in one physical space only: our spacetime. Hence, the speed of light c emerges here as the defining property of spacetime – the resonant frequency, so to speak. We have no further degrees of freedom here.

The Planck-Einstein relation (for photons) and the de Broglie equation (for matter-particles) have an interesting feature: both imply that the energy of the oscillation is proportional to the frequency, with Planck’s constant as the constant of proportionality. Now, for one-dimensional oscillations – think of a guitar string, for example – we know the energy will be proportional to the square of the frequency. It is a remarkable observation: the two-dimensional matter-wave, or the electromagnetic wave, gives us two waves for the price of one, so to speak, each carrying half of the total energy of the oscillation but, as a result, we get a proportionality between E and f instead of between E and f².

However, such reflections do not answer the fundamental question we started out with: what is mass? At this point, it is hard to go beyond the circular definition that is implied by Einstein’s formula: energy is a two-dimensional oscillation of mass, and mass packs energy, and c emerges as the property of spacetime that defines how exactly.

When everything is said and done, this does not go beyond stating that mass is some scalar field. Now, a scalar field is, quite simply, some real number that we associate with a position in spacetime. The Higgs field is a scalar field but, of course, the theory behind it goes much beyond stating that we should think of mass as some scalar field. The fundamental question is: why and how does energy, or matter, condense into elementary particles? That is what the Higgs mechanism is about but, as this paper is exploratory only, we cannot even start explaining the basics of it.

What we can do, however, is look at the wave equation again (Schrödinger’s equation), as we can now analyze it as an energy diffusion equation.

# IV. Schrödinger’s equation as an energy diffusion equation

The interpretation of Schrödinger’s equation as a diffusion equation is straightforward. Feynman (Lectures, III-16-1) briefly summarizes it as follows:

“We can think of Schrödinger’s equation as describing the diffusion of the probability amplitude from one point to the next. […] But the imaginary coefficient in front of the derivative makes the behavior completely different from the ordinary diffusion such as you would have for a gas spreading out along a thin tube. Ordinary diffusion gives rise to real exponential solutions, whereas the solutions of Schrödinger’s equation are complex waves.”[17]

Let us review the basic math. For a particle moving in free space – with no external force fields acting on it – there is no potential ($U = 0$) and the $U\psi$ term disappears. Schrödinger’s equation therefore reduces to:

$$\frac{\partial\psi(\mathbf{x}, t)}{\partial t} = i\cdot\frac{1}{2}\cdot\frac{\hbar}{m_{\text{eff}}}\cdot\nabla^2\psi(\mathbf{x}, t)$$

The ubiquitous diffusion equation in physics is:

$$\frac{\partial\varphi(\mathbf{x}, t)}{\partial t} = D\cdot\nabla^2\varphi(\mathbf{x}, t)$$

The structural similarity is obvious. The key difference between both equations is that the wave equation gives us two equations for the price of one. Indeed, because $\psi$ is a complex-valued function, with a real and an imaginary part, we get the following equations[18]:

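1. $\text{Re}(\partial\psi/\partial t) = -(1/2)\cdot(\hbar/m_{\text{eff}})\cdot\text{Im}(\nabla^2\psi)$
2. $\text{Im}(\partial\psi/\partial t) = (1/2)\cdot(\hbar/m_{\text{eff}})\cdot\text{Re}(\nabla^2\psi)$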
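The split into a real and an imaginary equation is just complex arithmetic. As a minimal sketch, the snippet below takes an arbitrary complex number `z` as a stand-in for the value of the Laplacian of the wavefunction at one point, and an arbitrary `D` as a stand-in for the (1/2)·(ħ/meff) factor. Both values are illustrative assumptions, not taken from the paper.

```python
# Stand-ins (assumptions, for illustration only):
D = 0.5          # plays the role of (1/2)*(hbar/m_eff)
z = 3.0 - 2.0j   # plays the role of the complex Laplacian of psi at one point

# Right-hand side of the wave equation: i * D * Laplacian(psi)
dpsi_dt = 1j * D * z

re_part = dpsi_dt.real   # compare with -D * Im(z)
im_part = dpsi_dt.imag   # compare with  D * Re(z)

print("Re(dpsi/dt) =", re_part, " versus -D*Im(z) =", -D * z.imag)
print("Im(dpsi/dt) =", im_part, " versus  D*Re(z) =", D * z.real)
```

Multiplication by i swaps the real and imaginary parts and flips one sign, which is why each equation couples the time derivative of one component to the Laplacian of the other.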

These equations make us think of the equations for an electromagnetic wave in free space (no stationary charges or currents):

1. $\partial\mathbf{B}/\partial t = -\nabla\times\mathbf{E}$
2. $\partial\mathbf{E}/\partial t = c^2\,\nabla\times\mathbf{B}$

The above equations effectively describe a propagation mechanism in spacetime, as illustrated below.

Figure 4: Propagation mechanisms

The Laplacian operator ($\nabla^2$), when operating on a scalar quantity, gives us a flux density, i.e. something expressed per square meter (1/m²). In this case, it is operating on $\psi(\mathbf{x}, t)$, so what is the dimension of our wavefunction $\psi(\mathbf{x}, t)$? To answer that question, we should analyze the diffusion constant in Schrödinger’s equation, i.e. the $(1/2)\cdot(\hbar/m_{\text{eff}})$ factor:

1. As a mathematical constant of proportionality, it will quantify the relationship between both derivatives (i.e. the time derivative and the Laplacian);
2. As a physical constant, it will ensure the physical dimensions on both sides of the equation are compatible.

Now, the $\hbar/m_{\text{eff}}$ factor is expressed in (N·m·s)/(N·s²/m) = m²/s. Hence, it does ensure the dimensions on both sides of the equation are, effectively, the same: $\partial\psi/\partial t$ is a time derivative and, therefore, its dimension is s⁻¹ while, as mentioned above, the dimension of $\nabla^2\psi$ is m⁻². However, this does not solve our basic question: what is the dimension of the real and imaginary part of our wavefunction?
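The dimensional bookkeeping can be sketched in a few lines of code. The helper below is a hypothetical illustration (the function names are not from the paper): it represents a physical dimension as a map from base unit to exponent and checks that ħ/m comes out in m²/s, so that both sides of the wave equation carry the same dimension.

```python
# Hypothetical helpers: a dimension is a dict {base unit: exponent}.
def mul(a, b):
    out = dict(a)
    for unit, exp in b.items():
        out[unit] = out.get(unit, 0) + exp
    return {u: e for u, e in out.items() if e != 0}

def inv(a):
    return {u: -e for u, e in a.items()}

def div(a, b):
    return mul(a, inv(b))

KG, M, S = {"kg": 1}, {"m": 1}, {"s": 1}
NEWTON = div(mul(KG, M), mul(S, S))   # kg*m/s^2
ACTION = mul(mul(NEWTON, M), S)       # N*m*s, the dimension of hbar

hbar_over_m = div(ACTION, KG)         # dimension of the diffusion constant

# Left side: [psi]/s. Right side: (hbar/m) * [psi]/m^2.
# The unknown [psi] cancels, so we only compare the remaining factors.
left_side = inv(S)
right_side = div(hbar_over_m, mul(M, M))

print("hbar/m       :", hbar_over_m)
print("left, right  :", left_side, right_side)
```

The check confirms consistency only. It says nothing about the dimension of the wavefunction itself, which is exactly the open question of the text.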

At this point, mainstream physicists will say: it does not have a physical dimension, and there is no geometric interpretation of Schrödinger’s equation. One may argue, effectively, that its argument, $(\mathbf{p}\cdot\mathbf{x} - E\cdot t)/\hbar$, is just a number and, therefore, that the real and imaginary part of $\psi$ are also just some number.

To this, we may object that $\hbar$ may be looked at as a mathematical scaling constant only. If we do that, then the argument of $\psi$ will, effectively, be expressed in action units, i.e. in N·m·s. It then does make sense to also associate a physical dimension with the real and imaginary part of $\psi$. What could it be?

We may have a closer look at Maxwell’s equations for inspiration here. The electric field vector is expressed in newton (the unit of force) per unit of charge (coulomb). Now, there is something interesting here. The physical dimension of the magnetic field is N/C divided by m/s.[19] We may write $\mathbf{B}$ as the following vector cross-product: $\mathbf{B} = (1/c)\cdot\mathbf{e}_x\times\mathbf{E}$, with $\mathbf{e}_x$ the unit vector pointing in the $x$-direction (i.e. the direction of propagation of the wave). Hence, we may associate the $(1/c)\cdot\mathbf{e}_x\times$ operator, which amounts to a rotation by 90 degrees, with the s/m dimension. Now, multiplication by $i$ also amounts to a rotation by 90 degrees. Hence, we may boldly write: $\mathbf{B} = (1/c)\cdot\mathbf{e}_x\times\mathbf{E} = (1/c)\cdot i\cdot\mathbf{E}$. This allows us to also geometrically interpret Schrödinger’s equation in the way we interpreted it above (see Figure 3).[20]

Still, we have not answered the question as to what the physical dimension of the real and imaginary part of our wavefunction should be. At this point, we may be inspired by the structural similarity between Newton’s and Coulomb’s force laws:

$$F = G\cdot\frac{m_1\cdot m_2}{r^2} \quad\text{and}\quad F = \frac{1}{4\pi\epsilon_0}\cdot\frac{q_1\cdot q_2}{r^2}$$

Hence, if the electric field vector $\mathbf{E}$ is expressed in force per unit charge (N/C), then we may want to think of associating the real part of our wavefunction with a force per unit mass (N/kg). We can, of course, do a substitution here, because the mass unit (1 kg) is equivalent to 1 N·s²/m. Hence, our N/kg dimension becomes:

N/kg = N/(N·s²/m) = m/s²

What is this: m/s²? Is that the dimension of the $a\cdot\cos\theta$ term in the $a\cdot e^{-i\theta} = a\cdot\cos\theta - i\cdot a\cdot\sin\theta$ wavefunction?

My answer is: why not? Think of it: m/s² is the physical dimension of acceleration: the increase or decrease in velocity (m/s) per second. It ensures the wavefunction for any particle – matter-particles or particles with zero rest mass (photons) – and the associated wave equation (which has to be the same for all, as the spacetime we live in is one) are mutually consistent.

In this regard, we should think of how we would model a gravitational wave. The physical dimension would surely be the same: force per mass unit. It all makes sense: wavefunctions may, perhaps, be interpreted as traveling distortions of spacetime, i.e. as tiny gravitational waves.

# V. Energy densities and flows

Pursuing the geometric equivalence between the equations for an electromagnetic wave and Schrödinger’s equation, we can now, perhaps, see if there is an equivalent for the energy density. For an electromagnetic wave, we know that the energy density is given by the following formula:

$$u = \frac{\epsilon_0}{2}\left(\mathbf{E}\cdot\mathbf{E} + c^2\,\mathbf{B}\cdot\mathbf{B}\right)$$

$\mathbf{E}$ and $\mathbf{B}$ are the electric and magnetic field vector respectively. The Poynting vector will give us the directional energy flux, i.e. the energy flow per unit area per unit time. We write:

$$-\frac{\partial u}{\partial t} = \nabla\cdot\mathbf{S}$$

Needless to say, the $\nabla\cdot$ operator is the divergence and, therefore, gives us the magnitude of a (vector) field’s source or sink at a given point. To be precise, the divergence gives us the volume density of the outward flux of a vector field from an infinitesimal volume around a given point. In this case, it gives us the volume density of the flux of $\mathbf{S}$.

We can analyze the dimensions of the equation for the energy density as follows:

1. $\mathbf{E}$ is measured in newton per coulomb, so $[\mathbf{E}\cdot\mathbf{E}] = [E^2]$ = N²/C².
2. $\mathbf{B}$ is measured in (N/C)/(m/s), so we get $[\mathbf{B}\cdot\mathbf{B}] = [B^2]$ = (N²/C²)·(s²/m²). However, the dimension of our $c^2$ factor is m²/s² and so we’re also left with N²/C².
3. The $\epsilon_0$ is the electric constant, aka the vacuum permittivity. As a physical constant, it should ensure the dimensions on both sides of the equation work out, and they do: $[\epsilon_0]$ = C²/(N·m²) and, therefore, if we multiply that with N²/C², we find that $u$ is expressed in N/m² = J/m³.[21]
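The three steps above can be checked mechanically. The sketch below uses hypothetical helper functions (not from the paper) that track exponents of the base units kg, m, s and C, and verifies that both the electric and the magnetic term of the energy density come out in J/m³.

```python
# Hypothetical helpers: a dimension is a dict {base unit: exponent}.
def mul(a, b):
    out = dict(a)
    for unit, exp in b.items():
        out[unit] = out.get(unit, 0) + exp
    return {u: e for u, e in out.items() if e != 0}

def power(a, n):
    return {u: e * n for u, e in a.items()}

def div(a, b):
    return mul(a, power(b, -1))

KG, M, S, C = {"kg": 1}, {"m": 1}, {"s": 1}, {"C": 1}
NEWTON = div(mul(KG, M), power(S, 2))   # kg*m/s^2
JOULE = mul(NEWTON, M)                  # N*m

E_FIELD = div(NEWTON, C)                # N/C
B_FIELD = div(E_FIELD, div(M, S))       # (N/C)/(m/s)
EPSILON_0 = div(power(C, 2), mul(NEWTON, power(M, 2)))  # C^2/(N*m^2)
C_SQUARED = power(div(M, S), 2)         # m^2/s^2

electric_term = mul(EPSILON_0, power(E_FIELD, 2))
magnetic_term = mul(EPSILON_0, mul(C_SQUARED, power(B_FIELD, 2)))
energy_density = div(JOULE, power(M, 3))   # J/m^3

print("eps0*E^2     :", electric_term)
print("eps0*c^2*B^2 :", magnetic_term)
print("J/m^3        :", energy_density)
```

Swapping the coulomb for the kilogram in `E_FIELD` would give the force-per-unit-mass analogue that the text goes on to explore. The equivalent of the electric constant would then have to be chosen so that the product is again an energy density.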

Replacing the newton per coulomb unit (N/C) by the newton per kg unit (N/kg) in the formulas above should give us the equivalent of the energy density for the wavefunction. We just need to substitute $\epsilon_0$ for an equivalent constant. We may want to give it a try. If the energy densities can be calculated – which are also mass densities, obviously – then the probabilities should be proportional to them.

Let us first see what we get for a photon, assuming the electromagnetic wave represents its wavefunction. Substituting $\mathbf{B}$ for $(1/c)\cdot i\cdot\mathbf{E}$ or for $-(1/c)\cdot i\cdot\mathbf{E}$ gives us the following result:

$$u = \frac{\epsilon_0}{2}\left(E^2 + c^2\cdot\frac{i^2}{c^2}\cdot E^2\right) = \frac{\epsilon_0}{2}\left(E^2 - E^2\right) = 0$$

Zero!? An unexpected result! Or not? We have no stationary charges and no currents: only an electromagnetic wave in free space. Hence, the local energy conservation principle needs to be respected at all points in space and in time. The geometry makes sense of the result: for an electromagnetic wave, the magnitudes of $\mathbf{E}$ and $\mathbf{B}$ reach their maximum, minimum and zero point simultaneously, as shown below.[22] This is because their phase is the same.

Figure 5: Electromagnetic wave: E and B

Should we expect a similar result for the energy densities that we would associate with the real and imaginary part of the matter-wave? For the matter-wave, we have a phase difference between $a\cdot\cos\theta$ and $a\cdot\sin\theta$, which gives a different picture of the propagation of the wave (see Figure 3).[23] In fact, the geometry of the situation suggests some inherent spin, which is interesting. I will come back to this. Let us first guess those densities. Making abstraction of any scaling constants, we may write:

$$u = a^2\cos^2\theta + a^2\sin^2\theta = a^2\left(\cos^2\theta + \sin^2\theta\right) = a^2$$

We get what we hoped to get: the absolute square of our amplitude is, effectively, an energy density!

$$|\psi|^2 = \left|a\cdot e^{-i\cdot E\cdot t/\hbar}\right|^2 = a^2 = u$$
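A quick numerical sketch illustrates that the absolute square of the elementary wavefunction does not depend on time. The amplitude `a` and the sample times are arbitrary illustrative choices. The energy is the electron rest energy quoted later in the text, and the value of ħ is an assumed standard value.

```python
import cmath

hbar = 1.054571817e-34   # J*s (assumed standard value)
E0 = 8.1871e-14          # J, electron rest energy as quoted later in the text
a = 2.0                  # arbitrary amplitude, arbitrary units (assumption)

def psi(t):
    """Elementary wavefunction of a particle at rest: a*exp(-i*E0*t/hbar)."""
    return a * cmath.exp(-1j * E0 * t / hbar)

sample_times = (0.0, 1.0e-21, 3.7e-21)   # seconds, arbitrary
densities = [abs(psi(t)) ** 2 for t in sample_times]

for t, u in zip(sample_times, densities):
    print(f"t = {t:.1e} s  ->  |psi|^2 = {u:.12f}")
```

The real and imaginary parts each oscillate, but the sum of their squares stays at a², which is the constant magnitude of the rotating wavefunction vector discussed further on.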

This is very deep. A photon has no rest mass, so it borrows and returns energy from empty space as it travels through it. In contrast, a matter-wave carries energy and, therefore, has some (rest) mass. It is therefore associated with an energy density, and this energy density gives us the probabilities. Of course, we need to fine-tune the analysis to account for the fact that we have a wave packet rather than a single wave, but that should be feasible.

As mentioned, the phase difference between the real and imaginary part of our wavefunction (a cosine and a sine function) appears to give some spin to our particle. We do not have this particularity for a photon. Of course, photons are bosons, i.e. particles with integer spin, while elementary matter-particles are fermions with spin-1/2. Hence, our geometric interpretation of the wavefunction suggests that, after all, there may be some more intuitive explanation of the fundamental dichotomy between bosons and fermions, which puzzled even Feynman:

“Why is it that particles with half-integral spin are Fermi particles, whereas particles with integral spin are Bose particles? We apologize for the fact that we cannot give you an elementary explanation. An explanation has been worked out by Pauli from complicated arguments of quantum field theory and relativity. He has shown that the two must necessarily go together, but we have not been able to find a way of reproducing his arguments on an elementary level. It appears to be one of the few places in physics where there is a rule which can be stated very simply, but for which no one has found a simple and easy explanation. The explanation is deep down in relativistic quantum mechanics. This probably means that we do not have a complete understanding of the fundamental principle involved.” (Feynman, Lectures, III-4-1)

The physical interpretation of the wavefunction, as presented here, may provide some better understanding of “the fundamental principle involved”: the physical dimension of the oscillation is just very different. That is all: it is force per unit charge for photons, and force per unit mass for matter-particles. We will examine the question of spin somewhat more carefully in section VII. Let us first examine the matter-wave some more.

# VI. Group and phase velocity of the matter-wave

The geometric representation of the matter-wave (see Figure 3) suggests a traveling wave and, yes, of course: the matter-wave effectively travels through space and time. But what is traveling, exactly? It is the pulse – or the signal – only: the phase velocity of the wave is just a mathematical concept and, even in our physical interpretation of the wavefunction, the same is true for the group velocity of our wave packet. The oscillation is two-dimensional, but perpendicular to the direction of travel of the wave. Hence, nothing actually moves with our particle.

Here, we should also reiterate that we did not answer the question as to what is oscillating up and down and/or sideways: we only associated a physical dimension with the components of the wavefunction – newton per kg (force per unit mass), to be precise. We were inspired to do so because of the physical dimension of the electric and magnetic field vectors (newton per coulomb, i.e. force per unit charge) we associate with electromagnetic waves which, for all practical purposes, we currently treat as the wavefunction for a photon. This made it possible to calculate the associated energy densities and a Poynting vector for energy dissipation. In addition, we showed that Schrödinger’s equation itself then becomes a diffusion equation for energy. However, let us now focus some more on the asymmetry which is introduced by the phase difference between the real and the imaginary part of the wavefunction. Look at the mathematical shape of the elementary wavefunction once again:

$$\psi = a\cdot e^{-i[E\cdot t - \mathbf{p}\cdot\mathbf{x}]/\hbar} = a\cdot\cos(\mathbf{p}\cdot\mathbf{x}/\hbar - E\cdot t/\hbar) + i\cdot a\cdot\sin(\mathbf{p}\cdot\mathbf{x}/\hbar - E\cdot t/\hbar)$$

The minus sign in the argument of our sine and cosine function defines the direction of travel: an $F(x - v\cdot t)$ wavefunction will always describe some wave that is traveling in the positive $x$-direction (with $v$ the wave velocity), while an $F(x + v\cdot t)$ wavefunction will travel in the negative $x$-direction. For a geometric interpretation of the wavefunction in three dimensions, we need to agree on how to define $i$ or, what amounts to the same, a convention on how to define clockwise and counterclockwise directions: if we look at a clock from the back, then its hands will be moving counterclockwise. So we need to establish the equivalent of the right-hand rule. However, let us not worry about that now. Let us focus on the interpretation. To ease the analysis, we’ll assume we’re looking at a particle at rest. Hence, $\mathbf{p} = 0$, and the wavefunction reduces to:

$$\psi = a\cdot e^{-i\cdot E_0\cdot t/\hbar} = a\cdot\cos(-E_0\cdot t/\hbar) + i\cdot a\cdot\sin(-E_0\cdot t/\hbar) = a\cdot\cos(E_0\cdot t/\hbar) - i\cdot a\cdot\sin(E_0\cdot t/\hbar)$$

$E_0$ is, of course, the rest energy of our particle and, now that we are here, we should probably wonder whose time we are talking about: is it our time, or is it the proper time of our particle? Well… In this situation, we are both at rest so it does not matter: $t$ is, effectively, the proper time so perhaps we should write it as $t_0$. It does not matter. You can see what we expect to see: $E_0/\hbar$ pops up as the natural frequency of our matter-particle: $(E_0/\hbar)\cdot t = \omega\cdot t$. Remembering the $\omega = 2\pi\cdot f = 2\pi/T$ and $T = 1/f$ formulas, we can associate a period and a frequency with this wave. Noting that $\hbar = h/2\pi$, we find the following:

$$T = 2\pi\cdot(\hbar/E_0) = h/E_0 \iff f = E_0/h = m_0 c^2/h$$
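To get a feel for the orders of magnitude, the relations above can be evaluated for an electron. This is a minimal sketch: the rest energy is the value quoted further on in the text, and the value of Planck’s constant is an assumed standard value.

```python
import math

h = 6.62607015e-34        # J*s (assumed standard value)
hbar = h / (2 * math.pi)  # reduced Planck constant
E0 = 8.1871e-14           # J, electron rest energy as quoted in the text

omega = E0 / hbar         # angular frequency, rad/s
T = 2 * math.pi / omega   # period, s
f = 1 / T                 # frequency, Hz

print(f"omega = {omega:.4e} rad/s")
print(f"T     = {T:.4e} s")
print(f"f     = {f:.4e} Hz")
```

The period T is the natural unit of time that the text associates with the particle.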

This is interesting, because we can look at the period as a natural unit of time for our particle. What about the wavelength? That is tricky because we need to distinguish between group and phase velocity here. The group velocity ($v_g$) should be zero here, because we assume our particle does not move. In contrast, the phase velocity is given by $v_p = \lambda\cdot f = (2\pi/k)\cdot(\omega/2\pi) = \omega/k$. In fact, we’ve got something funny here: the wavenumber $k = p/\hbar$ is zero, because we assume the particle is at rest, so $p = 0$. So we have a division by zero here, which is rather strange. What do we get assuming the particle is not at rest? We write:

$$v_p = \omega/k = (E/\hbar)/(p/\hbar) = E/p = E/(m\cdot v_g) = (m\cdot c^2)/(m\cdot v_g) = c^2/v_g$$

This is interesting: it establishes a reciprocal relation between the phase and the group velocity, with $c^2$ as a simple scaling constant. Indeed, the graph below shows the shape of the function does not change with the value of $c$, and we may also re-write the relation above as:

$$v_p/c = \beta_p = c/v_g = 1/\beta_g = 1/(v_g/c)$$

Figure 6: Reciprocal relation between phase and group velocity

We can also write the mentioned relationship as $v_p\cdot v_g = c^2$, which reminds us of the relationship between the electric and magnetic constant: $(1/\epsilon_0)\cdot(1/\mu_0) = c^2$. This is interesting in light of the fact we can re-write this as $(c\cdot\epsilon_0)\cdot(c\cdot\mu_0) = 1$, which shows electricity and magnetism are just two sides of the same coin, so to speak.[24]
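The reciprocal relation is easy to tabulate. The sketch below evaluates the phase velocity for a few illustrative group velocities, expressed as fractions of c. The sample fractions are arbitrary choices.

```python
c = 299_792_458.0   # speed of light, m/s

def phase_velocity(v_group):
    """Phase velocity implied by v_p * v_g = c^2 (undefined for v_group = 0)."""
    return c ** 2 / v_group

for beta_g in (0.001, 0.1, 0.5, 0.9, 1.0):   # v_g/c, arbitrary sample values
    v_g = beta_g * c
    v_p = phase_velocity(v_g)
    print(f"beta_g = {beta_g:<5}  beta_p = {v_p / c:10.3f}  beta_p*beta_g = {(v_p / c) * beta_g:.3f}")
```

In units of c, the product of the two relative velocities is always one, so the curve has the same shape whatever value is used for c. The function is undefined at zero group velocity, which is the division by zero the text runs into for a particle at rest.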

Interesting, but how do we interpret the math? What about the implications of the zero value for the wavenumber $k = p/\hbar$? We would probably like to think it implies the elementary wavefunction should always be associated with some momentum, because the concept of zero momentum clearly leads to weird math: something times zero cannot be equal to $c^2$! Such an interpretation is also consistent with the Uncertainty Principle: if $\Delta x\cdot\Delta p \geq \hbar$, then neither $\Delta x$ nor $\Delta p$ can be zero. In other words, the Uncertainty Principle tells us that the idea of a pointlike particle actually being at some specific point in time and in space does not make sense: it has to move. It tells us that our concepts of dimensionless points in time and space are mathematical notions only. Actual particles – including photons – are always a bit spread out, so to speak, and – importantly – they have to move.

For a photon, this is self-evident. It has no rest mass, no rest energy, and, therefore, it is going to move at the speed of light itself. We write: $p = m\cdot c = m\cdot c^2/c = E/c$. Using the relationship above, we get:

$$v_p = \omega/k = (E/\hbar)/(p/\hbar) = E/p = c \implies v_g = c^2/v_p = c^2/c = c$$

This is good: we started out with some reflections on the matter-wave, but here we get an interpretation of the electromagnetic wave as a wavefunction for the photon. But let us get back to our matter-wave. In regard to our interpretation of a particle having to move, we should remind ourselves, once again, of the fact that an actual particle is always localized in space and that it can, therefore, not be represented by the elementary wavefunction $\psi = a\cdot e^{-i[E\cdot t - \mathbf{p}\cdot\mathbf{x}]/\hbar}$ or, for a particle at rest, the $\psi = a\cdot e^{-i\cdot E\cdot t/\hbar}$ function. We must build a wave packet for that: a sum of wavefunctions, each with their own amplitude $a_i$, and their own $\omega_i = -E_i/\hbar$. Indeed, in section II, we showed that each of these wavefunctions will contribute some energy to the total energy of the wave packet and that, to calculate the contribution of each wave to the total, both $a_i$ as well as $E_i$ matter. This may or may not resolve the apparent paradox. Let us look at the group velocity.

To calculate a meaningful group velocity, we must assume the $v_g = \partial\omega_i/\partial k_i = \partial(E_i/\hbar)/\partial(p_i/\hbar) = \partial(E_i)/\partial(p_i)$ derivative exists. So we must have some dispersion relation. How do we calculate it? We need to calculate $\omega_i$ as a function of $k_i$ here, or $E_i$ as a function of $p_i$. How do we do that? Well… There are a few ways to go about it but one interesting way of doing it is to re-write Schrödinger’s equation as we did, i.e. by distinguishing the real and imaginary parts of the $\partial\psi/\partial t = i\cdot[\hbar/(2m)]\cdot\nabla^2\psi$ wave equation and, hence, re-write it as the following pair of two equations:

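1. $\text{Re}(\partial\psi/\partial t) = -[\hbar/(2m_{\text{eff}})]\cdot\text{Im}(\nabla^2\psi) \iff \omega\cdot\sin(kx - \omega t) = k^2\cdot[\hbar/(2m_{\text{eff}})]\cdot\sin(kx - \omega t)$
2. $\text{Im}(\partial\psi/\partial t) = [\hbar/(2m_{\text{eff}})]\cdot\text{Re}(\nabla^2\psi) \iff \omega\cdot\cos(kx - \omega t) = k^2\cdot[\hbar/(2m_{\text{eff}})]\cdot\cos(kx - \omega t)$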

Both equations imply the following dispersion relation:

$$\omega = \hbar\cdot k^2/(2m_{\text{eff}})$$
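Given this dispersion relation, the group velocity ∂ω/∂k can be estimated numerically and compared with the phase velocity ω/k. The sketch below is illustrative only: it uses the electron mass as a stand-in for the effective mass and an arbitrary wavenumber, both of which are assumptions.

```python
hbar = 1.054571817e-34   # J*s (assumed standard value)
m_eff = 9.1093837e-31    # kg, electron mass used as a stand-in (assumption)

def omega(k):
    """Dispersion relation implied by the two equations above."""
    return hbar * k ** 2 / (2 * m_eff)

k = 1.0e10               # 1/m, arbitrary illustrative wavenumber
dk = k * 1.0e-6          # small step for the numerical derivative

v_group = (omega(k + dk) - omega(k - dk)) / (2 * dk)   # central difference
v_phase = omega(k) / k

print(f"v_group = {v_group:.6e} m/s")
print(f"v_phase = {v_phase:.6e} m/s")
print(f"ratio   = {v_group / v_phase:.6f}")
```

For a relation that is quadratic in k, the derivative is ħ·k/m while ω/k is ħ·k/(2m), so the two velocities differ by a factor of two for any choice of k and m. The 1/2 factor that produces this is the one discussed in Appendix 1.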

Of course, we need to think about the subscripts now: we have $\omega_i$, $k_i$, but… What about $m_{\text{eff}}$ or, dropping the subscript, $m$? Do we write it as $m_i$? If so, what is it? Well… It is the equivalent mass of $E_i$ obviously, and so we get it from the mass-energy equivalence relation: $m_i = E_i/c^2$. It is a fine point, but one most people forget about: they usually just write $m$. However, if there is uncertainty in the energy, then Einstein’s mass-energy relation tells us we must have some uncertainty in the (equivalent) mass too. Here, I should refer back to Section II: $E_i$ varies around some average energy $E$ and, therefore, the Uncertainty Principle kicks in.

# VII. Explaining spin

The elementary wavefunction vector – i.e. the vector sum of the real and imaginary component – rotates around the $x$-axis, which gives us the direction of propagation of the wave (see Figure 3). Its magnitude remains constant. In contrast, the magnitude of the electromagnetic vector – defined as the vector sum of the electric and magnetic field vectors – oscillates between zero and some maximum (see Figure 5).

We already mentioned that the rotation of the wavefunction vector appears to give some spin to the particle. Of course, a circularly polarized wave would also appear to have spin (think of the $\mathbf{E}$ and $\mathbf{B}$ vectors rotating around the direction of propagation – as opposed to oscillating up and down or sideways only). In fact, circularly polarized light does carry angular momentum, as the equivalent mass of its energy may be thought of as rotating as well. But here we are looking at a matter-wave.

The basic idea is the following: if we look at $\psi = a\cdot e^{-i\cdot E\cdot t/\hbar}$ as some real vector – as a two-dimensional oscillation of mass, to be precise – then we may associate its rotation around the direction of propagation with some torque. The illustration below reminds us of the math here.

Figure 7: Torque and angular momentum vectors

A torque on some mass about a fixed axis gives it angular momentum, which we can write as the vector cross-product $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ or, perhaps easier for our purposes here, as the product of an angular velocity ($\boldsymbol{\omega}$) and rotational inertia ($I$), aka the moment of inertia or the angular mass. We write:

$$\mathbf{L} = I\cdot\boldsymbol{\omega}$$

Note we can write $\mathbf{L}$ and $\boldsymbol{\omega}$ in boldface here because they are (axial) vectors. If we consider their magnitudes only, we write $L = I\cdot\omega$ (no boldface). We can now do some calculations. Let us start with the angular velocity. In our previous posts, we showed that the period of the matter-wave is equal to $T = 2\pi\cdot(\hbar/E_0)$. Hence, the angular velocity must be equal to:

$$\omega = 2\pi/[2\pi\cdot(\hbar/E_0)] = E_0/\hbar$$

We also know the distance $r$, so that is the magnitude of $\mathbf{r}$ in the $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ vector cross-product: it is just $a$, so that is the magnitude of $\psi = a\cdot e^{-i\cdot E\cdot t/\hbar}$. Now, the momentum ($\mathbf{p}$) is the product of a linear velocity ($\mathbf{v}$) – in this case, the tangential velocity – and some mass ($m$): $\mathbf{p} = m\cdot\mathbf{v}$. If we switch to scalar instead of vector quantities, then the (tangential) velocity is given by $v = r\cdot\omega$. So now we only need to think about what we should use for $m$ or, if we want to work with the angular velocity ($\omega$), the angular mass ($I$). Here we need to make some assumption about the mass (or energy) distribution. Now, it may or may not make sense to assume the energy in the oscillation – and, therefore, the mass – is distributed uniformly. In that case, we may use the formula for the angular mass of a solid cylinder: $I = m\cdot r^2/2$. If we keep the analysis non-relativistic, then $m = m_0$. Of course, the energy-mass equivalence tells us that $m_0 = E_0/c^2$. Hence, this is what we get:

$$L = I\cdot\omega = (m_0\cdot r^2/2)\cdot(E_0/\hbar) = (1/2)\cdot a^2\cdot(E_0/c^2)\cdot(E_0/\hbar) = a^2\cdot E_0^2/(2\cdot\hbar\cdot c^2)$$

Does it make sense? Maybe. Maybe not. Let us do a dimensional analysis: that won’t check our logic, but it makes sure we made no mistakes when mapping mathematical and physical spaces. We have m²·J² = m²·N²·m² in the numerator and N·m·s·m²/s² in the denominator. Hence, the dimensions work out: we get N·m·s as the dimension for $L$, which is, effectively, the physical dimension of angular momentum. It is also the action dimension, of course, and that cannot be a coincidence. Also note that the $E = mc^2$ equation allows us to re-write it as:

$$L = a^2\cdot E_0^2/(2\cdot\hbar\cdot c^2) = a^2\cdot m_0^2\cdot c^2/(2\cdot\hbar)$$

Of course, in quantum mechanics, we associate spin with the magnetic moment of a charged particle, not with its mass as such. Is there a way to link the formula above to the one we have for the quantum-mechanical angular momentum, which is also measured in N·m·s units, and which can only take on one of two possible values: $J = +\hbar/2$ and $-\hbar/2$? It looks like a long shot, right? How do we go from $(1/2)\cdot a^2\cdot m_0^2\cdot c^2/\hbar$ to $\pm(1/2)\cdot\hbar$? Let us do a numerical example. The rest energy of an electron is about 0.511 MeV ≈ 8.1871×10⁻¹⁴ N·m, and $a$… What value should we take for $a$?

We have an obvious trio of candidates here: the Bohr radius, the classical electron radius (aka the Thomson scattering length), and the Compton scattering radius.

Let us start with the Bohr radius, so that is about 0.529×10⁻¹⁰ m. We get $L = a^2\cdot E_0^2/(2\cdot\hbar\cdot c^2)$ = 9.9×10⁻³¹ N·m·s. Now that is about 1.88×10⁴ times $\hbar/2$. That is a huge factor. The Bohr radius cannot be right: we are not looking at an electron in an orbital here. To show it does not make sense, we may want to double-check the analysis by doing the calculation in another way. We said each oscillation will always pack 6.626070040(81)×10⁻³⁴ joule in energy. So our electron should pack about 1.24×10²⁰ oscillations. The angular momentum ($L$) we get when using the Bohr radius for $a$ and the value of 6.626×10⁻³⁴ joule for $E_0$ is equal to 6.49×10⁻⁷¹ N·m·s. So that is the angular momentum per oscillation. When we multiply this with the number of oscillations (1.24×10²⁰), we get about 8.01×10⁻⁵¹ N·m·s, so that is a totally different number.
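The two ways of doing the calculation for the Bohr radius can be reproduced with a short script. This is a sketch under stated assumptions: the values of h, ħ and c are assumed standard values, the Bohr radius is the commonly quoted one, and the function name is hypothetical.

```python
h = 6.62607015e-34       # J*s (assumed standard value)
hbar = 1.054571817e-34   # J*s (assumed standard value)
c = 299_792_458.0        # m/s
E0 = 8.1871e-14          # J, electron rest energy as quoted in the text
a_bohr = 5.29177e-11     # m, Bohr radius (commonly quoted value)

def angular_momentum(a, energy):
    """L = a^2 * E^2 / (2*hbar*c^2), the solid-cylinder formula from the text."""
    return a ** 2 * energy ** 2 / (2 * hbar * c ** 2)

# First route: plug in the full rest energy.
L_direct = angular_momentum(a_bohr, E0)

# Second route: treat h joule as the energy of one oscillation,
# then multiply by the number of oscillations E0/h.
n_oscillations = E0 / h
L_per_oscillation = angular_momentum(a_bohr, h)
L_summed = n_oscillations * L_per_oscillation

print(f"L (direct)          = {L_direct:.3e} N*m*s")
print(f"L / (hbar/2)        = {L_direct / (hbar / 2):.3e}")
print(f"oscillations E0/h   = {n_oscillations:.3e}")
print(f"L per oscillation   = {L_per_oscillation:.3e} N*m*s")
print(f"L (summed)          = {L_summed:.3e} N*m*s")
```

The two routes cannot agree because the formula is quadratic in the energy: summing n contributions of energy E0/n gives a result that is n times smaller than plugging in E0 at once.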

The classical electron radius is about 2.818×10⁻¹⁵ m. We get an $L$ that is equal to about 2.81×10⁻³⁹ N·m·s, so now it is a tiny fraction of $\hbar/2$! Hence, this leads us nowhere. Let us go for our last chance to get a meaningful result! Let us use the Compton scattering length, so that is about 2.42631×10⁻¹² m.

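This gives us an $L$ of 2.08×10⁻³³ N·m·s, which is only about 20 times $\hbar$. This is not so bad, but is it good enough? Let us calculate it the other way around: what value should we take for $a$ so as to ensure $L = a^2\cdot E_0^2/(2\cdot\hbar\cdot c^2) = \hbar/2$? Let us write it out:

$$\frac{a^2\cdot E_0^2}{2\cdot\hbar\cdot c^2} = \frac{\hbar}{2} \iff a^2 = \frac{\hbar^2\cdot c^2}{E_0^2} \iff a = \frac{\hbar\cdot c}{E_0} = \frac{\hbar}{m_0\cdot c}$$

In fact, this is the formula for the so-called reduced Compton wavelength. This is perfect. We found what we wanted to find. Substituting this value for $a$ (you can calculate it: it is about 3.8616×10⁻¹³ m), we get what we should find:

$$L = \frac{a^2\cdot E_0^2}{2\cdot\hbar\cdot c^2} = \frac{\hbar^2\cdot c^2}{E_0^2}\cdot\frac{E_0^2}{2\cdot\hbar\cdot c^2} = \frac{\hbar}{2}$$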

This is a rather spectacular result, and one that would – a priori – support the interpretation of the wavefunction that is being suggested in this paper.
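The numerical example can be re-run for all three candidate radii, together with the inverse calculation of the radius that yields ħ/2. This is a minimal sketch: the constants are assumed standard values, the radii are the ones quoted in the text (with the commonly quoted value for the Bohr radius), and the function name is hypothetical.

```python
hbar = 1.054571817e-34   # J*s (assumed standard value)
c = 299_792_458.0        # m/s
E0 = 8.1871e-14          # J, electron rest energy as quoted in the text

def angular_momentum(a):
    """L = a^2 * E0^2 / (2*hbar*c^2) for a radius a in meters."""
    return a ** 2 * E0 ** 2 / (2 * hbar * c ** 2)

candidate_radii = {
    "Bohr radius": 5.29177e-11,
    "classical electron radius": 2.818e-15,
    "Compton wavelength": 2.42631e-12,
}

for name, a in candidate_radii.items():
    L = angular_momentum(a)
    print(f"{name:26s} a = {a:.5e} m   L = {L:.3e} N*m*s   L/(hbar/2) = {L / (hbar / 2):.3e}")

# Inverse problem: which radius gives exactly hbar/2?
a_required = hbar * c / E0
print(f"required radius a = {a_required:.4e} m")
print(f"check: L/(hbar/2) = {angular_momentum(a_required) / (hbar / 2):.6f}")
```

The required radius differs from the Compton wavelength by a factor of 2π, which is the difference between the ordinary and the reduced Compton wavelength. The result depends on the solid-cylinder assumption for the mass distribution: a different moment of inertia would give a different radius.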

# VIII. The boson-fermion dichotomy

Let us do some more thinking on the boson-fermion dichotomy. Again, we should remind ourselves that an actual particle is localized in space and that it can, therefore, not be represented by the elementary wavefunction $\psi = a\cdot e^{-i[E\cdot t - \mathbf{p}\cdot\mathbf{x}]/\hbar}$ or, for a particle at rest, the $\psi = a\cdot e^{-i\cdot E\cdot t/\hbar}$ function. We must build a wave packet for that: a sum of wavefunctions, each with their own amplitude $a_i$, and their own $\omega_i = -E_i/\hbar$. Each of these wavefunctions will contribute some energy to the total energy of the wave packet. Now, we can have another wild but logical theory about this.

Think of the apparent right-handedness of the elementary wavefunction: surely, Nature can’t be bothered about our convention of measuring phase angles clockwise or counterclockwise. Also, the angular momentum can be positive or negative: $J = +\hbar/2$ or $-\hbar/2$. Hence, we would probably like to think that an actual particle – think of an electron, or whatever other particle you’d think of – may consist of right-handed as well as left-handed elementary waves. To be precise, we may think they either consist of (elementary) right-handed waves or, else, of (elementary) left-handed waves. An elementary right-handed wave would be written as:

$$\psi(\theta_i) = a_i\cdot(\cos\theta_i + i\cdot\sin\theta_i)$$

In contrast, an elementary left-handed wave would be written as:

$$\psi(\theta_i) = a_i\cdot(\cos\theta_i - i\cdot\sin\theta_i)$$

How does that work out with the $E_0\cdot t$ argument of our wavefunction? Position is position, and direction is direction, but time? Time has only one direction, but Nature surely does not care how we count time: counting like 1, 2, 3, etcetera or like −1, −2, −3, etcetera is just the same. If we count like 1, 2, 3, etcetera, then we write our wavefunction like:

$$\psi = a\cdot\cos(E_0\cdot t/\hbar) - i\cdot a\cdot\sin(E_0\cdot t/\hbar)$$

If we count time like −1, −2, −3, etcetera then we write it as:

$$\psi = a\cdot\cos(-E_0\cdot t/\hbar) - i\cdot a\cdot\sin(-E_0\cdot t/\hbar) = a\cdot\cos(E_0\cdot t/\hbar) + i\cdot a\cdot\sin(E_0\cdot t/\hbar)$$
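Counting time backwards is the same as taking the complex conjugate of the wavefunction, which is what flips its handedness. The sketch below checks this numerically in natural units, with an arbitrary amplitude and angular frequency (both illustrative assumptions).

```python
import cmath

def psi(t, a=1.0, omega=1.0):
    """Elementary wavefunction a*exp(-i*omega*t), with omega standing in for E0/hbar."""
    return a * cmath.exp(-1j * omega * t)

sample_times = (0.3, 0.7, 2.5)   # arbitrary values

# Largest deviation between psi(-t) and the conjugate of psi(t).
max_deviation = max(abs(psi(-t) - psi(t).conjugate()) for t in sample_times)

for t in sample_times:
    print(f"t = {t}:  psi(-t) = {psi(-t):.6f}   conj(psi(t)) = {psi(t).conjugate():.6f}")
print("largest deviation:", max_deviation)
```

The real part is unchanged while the imaginary part changes sign, so the wavefunction vector rotates in the opposite direction.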

Hence, it is just like the left- or right-handed circular polarization of an electromagnetic wave: we can have both for the matter-wave too! This, then, should explain why we can have either positive or negative quantum-mechanical spin ($+\hbar/2$ or $-\hbar/2$). It is the usual thing: we have two mathematical possibilities here, and so we must have two physical situations that correspond to it.

It is only natural. If we have left- and right-handed photons – or, generalizing, left- and right-handed bosons – then we should also have left- and right-handed fermions (electrons, protons, etcetera). Back to the dichotomy. The textbook analysis of the dichotomy between bosons and fermions may be epitomized by Richard Feynman’s Lecture on it (Feynman, III-4), which is confusing and – I would dare to say – even inconsistent: how are photons or electrons supposed to know that they need to interfere with a positive or a negative sign? They are not supposed to know anything: knowledge is part of our interpretation of whatever it is that is going on there.

Hence, it is probably best to keep it simple, and think of the dichotomy in terms of the different physical dimensions of the oscillation: newton per kg versus newton per coulomb. And then, of course, we should also note that matter-particles have a rest mass and, therefore, actually carry charge. Photons do not. But both are two-dimensional oscillations, and the point is: the so-called vacuum – and the rest mass of our particle (which is zero for the photon and non-zero for everything else) – give us the natural frequency for both oscillations, which is beautifully summed up in that remarkable equation for the group and phase velocity of the wavefunction, which applies to photons as well as matter-particles:

$$(v_{\text{phase}}/c)\cdot(v_{\text{group}}/c) = 1 \iff v_p\cdot v_g = c^2$$

The final question then is: why are photons spin-zero particles? Well… We should first remind ourselves of the fact that they do have spin when circularly polarized.[25] Here we may think of the rotation of the equivalent mass of their energy. However, if they are linearly polarized, then there is no spin. Even for circularly polarized waves, the spin angular momentum of photons is a weird concept. If photons have no (rest) mass, then they cannot carry any charge. They should, therefore, not have any magnetic moment. Indeed, what I wrote above shows an explanation of quantum-mechanical spin requires both mass as well as charge.[26]

# IX. Concluding remarks

There are, of course, other ways to look at the matter – literally. For example, we can imagine two-dimensional oscillations as circular rather than linear oscillations. Think of a tiny ball, whose center of mass stays where it is, as depicted below. Any rotation – around any axis – will be some combination of a rotation around the two other axes. Hence, we may want to think of a two-dimensional oscillation as an oscillation of a polar and azimuthal angle.

Figure 8: Two-dimensional circular movement

The point of this paper is not to make any definite statements. That would be foolish. Its objective is just to challenge the simplistic mainstream viewpoint on the reality of the wavefunction. Stating that it is a mathematical construct only without physical significance amounts to saying it has no meaning at all. That is, clearly, a non-sustainable proposition.

The interpretation that is offered here looks at amplitude waves as traveling fields. Their physical dimension may be expressed in force per mass unit, as opposed to electromagnetic waves, whose amplitudes are expressed in force per (electric) charge unit. Also, the amplitudes of matter-waves incorporate a phase factor, but this may actually explain the rather enigmatic dichotomy between fermions and bosons and is, therefore, an added bonus.

The interpretation that is offered here has some advantages over other explanations, as it explains the how of diffraction and interference. However, while it offers a great explanation of the wave nature of matter, it does not explain its particle nature: while we think of the energy as being spread out, we will still observe electrons and photons as pointlike particles once they hit the detector. Why is it that a detector can sort of “hook” the whole blob of energy, so to speak?

The interpretation of the wavefunction that is offered here does not explain this. Hence, the complementarity principle of the Copenhagen interpretation of the wavefunction surely remains relevant.

# Appendix 1: The de Broglie relations and energy

The 1/2 factor in Schrödinger’s equation is related to the concept of the effective mass (m_eff). It is easy to make the wrong calculations. For example, when playing with the famous de Broglie relations – also known as the matter-wave equations – one may be tempted to derive the following energy concept:

1. E = h·f and p = h/λ. Therefore, f = E/h and λ = h/p.
2. v = f·λ = (E/h)·(h/p) = E/p
3. p = m·v. Therefore, E = v·p = m·v²

E = m·v²? This resembles the E = mc² equation and, therefore, one may be enthused by the discovery, especially because the m·v² also pops up when working with the Least Action Principle in classical mechanics, which states that the path that is followed by a particle will minimize the following integral:

S = ∫ (KE − PE)·dt, taken over the time interval from t₁ to t₂

Now, we can choose any reference point for the potential energy but, to reflect the energy conservation law, we can select a reference point that ensures the sum of the kinetic and the potential energy is zero throughout the time interval. If the force field is uniform, then the integrand will, effectively, be equal to KE − PE = m·v².[27]

However, that is classical mechanics and, therefore, not so relevant in the context of the de Broglie equations, and the apparent paradox should be solved by distinguishing between the group and the phase velocity of the matter wave.
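The distinction between the two velocities can be checked numerically. The sketch below is an illustration only: it assumes the relativistic relations E = γ·m₀·c² and p = γ·m₀·v, takes the phase velocity as E/p and the group velocity as dE/dp (approximated by a finite difference), and uses an arbitrary electron speed of 0.1·c. The function names are made up for this example.

```python
import math

C = 299_792_458.0        # speed of light (m/s)
M_E = 9.1093837e-31      # electron rest mass (kg)

def energy_momentum(v, m0=M_E):
    """Relativistic energy and momentum for speed v (assumed relations)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return gamma * m0 * C ** 2, gamma * m0 * v

def phase_velocity(v):
    """Phase velocity of the matter wave: E/p."""
    energy, momentum = energy_momentum(v)
    return energy / momentum

def group_velocity(v, dv=1.0e3):
    """Group velocity dE/dp, approximated by a central difference."""
    e_lo, p_lo = energy_momentum(v - dv)
    e_hi, p_hi = energy_momentum(v + dv)
    return (e_hi - e_lo) / (p_hi - p_lo)

v = 0.1 * C
v_p = phase_velocity(v)
v_g = group_velocity(v)
print(f"v_p/c = {v_p / C:.4f}, v_g/c = {v_g / C:.4f}, v_p*v_g/c^2 = {v_p * v_g / C**2:.6f}")
```

Under these assumptions, the phase velocity exceeds c while the group velocity equals the classical velocity v, and their product comes out as c².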

# Appendix 2: The concept of the effective mass

The effective mass – as used in Schrödinger’s equation – is a rather enigmatic concept. To make sure we are making the right analysis here, I should start by noting you will usually see Schrödinger’s equation written as:

i·ħ·∂ψ(x, t)/∂t = −(1/2)·(ħ²/m_eff)·∇²ψ(x, t) + U·ψ(x, t)

This formulation includes a term with the potential energy (U). In free space (no potential), this term disappears, and the equation can be re-written as:

∂ψ(x, t)/∂t = i·(1/2)·(ħ/m_eff)·∇²ψ(x, t)

We just moved the i·ħ coefficient to the other side, noting that 1/i = –i. Now, in one-dimensional space, and assuming ψ is just the elementary wavefunction (so we substitute a·e^(−i·[E·t − p·x]/ħ) for ψ), this implies the following:

−a·i·(E/ħ)·e^(−i·[E·t − p·x]/ħ) = −i·(ħ/2m_eff)·a·(p²/ħ²)·e^(−i·[E·t − p·x]/ħ)

⇔ E = p²/(2m_eff) ⇔ m_eff = m·(v/c)²/2 = m·β²/2

It is an ugly formula: it resembles the kinetic energy formula (K.E. = m·v²/2) but it is, in fact, something completely different. The β²/2 factor ensures the effective mass is always a fraction of the mass itself. To get rid of the ugly 1/2 factor, we may re-define m_eff as two times the old m_eff (hence, m_eff(new) = 2·m_eff(old)), as a result of which the formula will look somewhat better:

m_eff = m·(v/c)² = m·β²

We know β varies between 0 and 1 and, therefore, m_eff will vary between 0 and m. Feynman drops the subscript, and just writes m_eff as m in his textbook (see Feynman, III-19). On the other hand, the electron mass as used is also the electron mass that is used to calculate the size of an atom (see Feynman, III-2-4). As such, the two mass concepts are, effectively, mutually compatible. It is confusing because the same mass is often defined as the mass of a stationary electron (see, for example, the article on it in the online Wikipedia encyclopedia[28]).

In the context of the derivation of the electron orbitals, we do have the potential energy term – which is the equivalent of a source term in a diffusion equation – and that may explain why the above-mentioned m_eff = m·(v/c)² = m·β² formula does not apply.

# References

This paper discusses general principles in physics only. Hence, references can be limited to references to physics textbooks only. For ease of reading, any reference to additional material has been limited to a more popular undergrad textbook that can be consulted online: Feynman’s Lectures on Physics (http://www.feynmanlectures.caltech.edu). References are per volume, per chapter and per section. For example, Feynman III-19-3 refers to Volume III, Chapter 19, Section 3.

# Notes

[1] Of course, an actual particle is localized in space and can, therefore, not be represented by the elementary wavefunction ψ = a·e^(−i·θ) = a·e^(−i·[E·t − p·x]/ħ) = a·cosθ − i·a·sinθ. We must build a wave packet for that: a sum of wavefunctions, each with its own amplitude aₖ and its own argument θₖ = (Eₖ·t – pₖ·x)/ħ. This is dealt with in this paper as part of the discussion on the mathematical and physical interpretation of the normalization condition.

[2] The N/kg dimension immediately, and naturally, reduces to the dimension of acceleration (m/s²), thereby facilitating a direct interpretation in terms of Newton’s force law.

[3] In physics, a two-spring metaphor is more common. Hence, the pistons in the author’s perpetuum mobile may be replaced by springs.

[4] The author re-derives the equation for the Compton scattering radius in section VII of the paper.

[5] The magnetic force can be analyzed as a relativistic effect (see Feynman II-13-6). The dichotomy between the electric force as a polar vector and the magnetic force as an axial vector disappears in the relativistic four-vector representation of electromagnetism.

[6] For example, when using Schrödinger’s equation in a central field (think of the electron around a proton), the use of polar coordinates is recommended, as it ensures the symmetry of the Hamiltonian under all rotations (see Feynman III-19-3).

[7] This sentiment is usually summed up in the apocryphal quote: “God does not play dice.” The actual quote comes out of one of Einstein’s private letters to Cornelius Lanczos, another scientist who had also emigrated to the US. The full quote is as follows: “You are the only person I know who has the same attitude towards physics as I have: belief in the comprehension of reality through something basically simple and unified… It seems hard to sneak a look at God’s cards. But that He plays dice and uses ‘telepathic’ methods… is something that I cannot believe for a single moment.” (Helen Dukas and Banesh Hoffman, Albert Einstein, the Human Side: New Glimpses from His Archives, 1979)

[8] Of course, both are different velocities: ω is an angular velocity, while v is a linear velocity: ω is measured in radians per second, while v is measured in meter per second. However, the definition of a radian implies radians are measured in distance units. Hence, the physical dimensions are, effectively, the same. As for the formula for the total energy of an oscillator, we should actually write: E = m·a²·ω²/2. The additional factor (a) is the (maximum) amplitude of the oscillator.

[9] We also have a 1/2 factor in the E = mv²/2 formula. Two remarks may be made here. First, it may be noted this is a non-relativistic formula and, more importantly, incorporates kinetic energy only. Using the Lorentz factor (γ), we can write the relativistically correct formula for the kinetic energy as K.E. = E − E₀ = m_v·c² − m₀·c² = m₀·γ·c² − m₀·c² = m₀·c²·(γ − 1). As for the exclusion of the potential energy, we may note that we may choose our reference point for the potential energy such that the kinetic and potential energy mirror each other. The energy concept that then emerges is the one that is used in the context of the Principle of Least Action: it equals E = mv². Appendix 1 provides some notes on that.

[10] Instead of two cylinders with pistons, one may also think of connecting two springs with a crankshaft.

[11] It is interesting to note that we may look at the energy in the rotating flywheel as potential energy because it is energy that is associated with motion, albeit circular motion. In physics, one may associate a rotating object with kinetic energy using the rotational equivalent of mass and linear velocity, i.e. rotational inertia (I) and angular velocity ω. The kinetic energy of a rotating object is then given by K.E. = (1/2)·I·ω².

[12] Because of the sideways motion of the connecting rods, the sinusoidal function will describe the linear motion only approximately, but you can easily imagine the idealized limit situation.

[13] The ω² = 1/LC formula gives us the natural or resonant frequency for an electric circuit consisting of a resistor (R), an inductor (L), and a capacitor (C). Writing the formula as ω² = C⁻¹/L introduces the concept of elastance, which is the equivalent of the mechanical stiffness (k) of a spring.

[14] The resistance in an electric circuit introduces a damping factor. When analyzing a mechanical spring, one may also want to introduce a drag coefficient. Both are usually defined as a fraction of the inertia, which is the mass for a spring and the inductance for an electric circuit. Hence, we would write the resistance for a spring as γ·m and as R = γ·L respectively.

[15] Photons are emitted by atomic oscillators: atoms going from one state (energy level) to another. Feynman (Lectures, I-33-3) shows us how to calculate the Q of these atomic oscillators: it is of the order of 10⁸, which means the wave train will last about 10⁻⁸ seconds (to be precise, that is the time it takes for the radiation to die out by a factor 1/e). For example, for sodium light, the radiation will last about 3.2×10⁻⁸ seconds (this is the so-called decay time τ). Now, because the frequency of sodium light is some 500 THz (500×10¹² oscillations per second), this makes for some 16 million oscillations. There is an interesting paradox here: the speed of light tells us that such a wave train will have a length of about 9.6 m! How is that to be reconciled with the pointlike nature of a photon? The paradox can only be explained by relativistic length contraction: in an analysis like this, one needs to distinguish the reference frame of the photon – riding along the wave as it is being emitted, so to speak – and our stationary reference frame, which is that of the emitting atom.

[16] This is a general result and is reflected in the K.E. = T = (1/2)·m·ω²·a²·sin²(ω·t + Δ) and the P.E. = U = k·x²/2 = (1/2)·m·ω²·a²·cos²(ω·t + Δ) formulas for the linear oscillator.

[17] Feynman further formalizes this in his Lecture on Superconductivity (Feynman, III-21-2), in which he refers to Schrödinger’s equation as the “equation for continuity of probabilities”. The analysis is centered on the local conservation of energy, which confirms the interpretation of Schrödinger’s equation as an energy diffusion equation.

[18] The m_eff is the effective mass of the particle, which depends on the medium. For example, an electron traveling in a solid (a transistor, for example) will have a different effective mass than in an atom. In free space, we can drop the subscript and just write m_eff = m. Appendix 2 provides some additional notes on the concept. As for the equations, they are easily derived from noting that two complex numbers a + i·b and c + i·d are equal if, and only if, their real and imaginary parts are the same. Now, the ∂ψ/∂t = i·(ħ/m_eff)·∇²ψ equation amounts to writing something like this: a + i·b = i·(c + i·d). Now, remembering that i² = −1, you can easily figure out that i·(c + i·d) = i·c + i²·d = −d + i·c.

[19] The dimension of B is usually written as N/(m·A), using the SI unit for current, i.e. the ampere (A). However, 1 C = 1 A·s and, hence, 1 N/(m·A) = 1 (N/C)/(m/s).

[20] Of course, multiplication with i amounts to a counterclockwise rotation. Hence, multiplication by –i also amounts to a rotation by 90 degrees, but clockwise. Now, to uniquely identify the clockwise and counterclockwise directions, we need to establish the equivalent of the right-hand rule for a proper geometric interpretation of Schrödinger’s equation in three-dimensional space: if we look at a clock from the back, then its hand will be moving counterclockwise. When writing B = (1/c)·i·E, we assume we are looking in the negative x-direction. If we are looking in the positive x-direction, we should write: B = −(1/c)·i·E. Of course, Nature does not care about our conventions. Hence, both should give the same results in calculations. We will show in a moment they do.

[21] In fact, when multiplying C²/(N·m²) with N²/C², we get N/m², but we can multiply this with 1 = m/m to get the desired result. It is significant that an energy density (joule per unit volume) can also be measured in newton per square meter (force per unit area).

[22] The illustration shows a linearly polarized wave, but the obtained result is general.

[23] The sine and cosine are essentially the same functions, except for the difference in the phase: sinθ = cos(θ − π/2).

[24] I must thank a physics blogger for re-writing the 1/(ε₀·μ₀) = c² equation like this. See: http://reciprocal.systems/phpBB3/viewtopic.php?t=236 (retrieved on 29 September 2017).

[25] A circularly polarized electromagnetic wave may be analyzed as consisting of two perpendicular electromagnetic plane waves of equal amplitude and 90° difference in phase.

[26] Of course, the reader will now wonder: what about neutrons? How to explain neutron spin? Neutrons are neutral. That is correct, but neutrons are not elementary: they consist of (charged) quarks. Hence, neutron spin can (or should) be explained by the spin of the underlying quarks.

[27] We detailed the mathematical framework and detailed calculations in the following online article: https://readingfeynman.org/2017/09/15/the-principle-of-least-action-re-visited.

[28] https://en.wikipedia.org/wiki/Electron_rest_mass (retrieved on 29 September 2017).

# Math, physics and reality

This blog has been nice. It doesn’t get an awful lot of traffic (about a thousand visitors a week) but, from time to time, I do get a response or a question that fires me up, if only because it tells me someone is actually reading what I write.

Looking at the site now, I feel like I need to reorganize it completely. It’s just chaos, right? But then that’s what gets me the positive feedback: my readers are in the same boat. We’re trying to make sense of what physicists tell us is reality. The interference model I presented in my previous post is really nice. It has all the ingredients of quantum mechanics, which I would group under two broad categories: uncertainty and duality. Both are related, obviously. I will not talk about the reality of the wavefunction here, because I am biased: I firmly believe the wavefunction represents something real. Why? Because Einstein’s E = m·c² formula tells us so: energy is a two-dimensional oscillation of mass. Two-dimensional, because it’s got twice the energy of the classroom oscillator (think of a mass on a spring). More importantly, the real and imaginary dimension of the oscillation are both real: they’re perpendicular to the direction of motion of the wave-particle. Photon or electron. It doesn’t matter. Of course, we have all of the transformation formulas, but… Well… These are not real: they are only there to accommodate our perspective: the state of the observer.

The distinction between the group and phase velocity of a wave packet is probably the best example of the failure of ordinary words to describe reality: particles are not waves, and waves are not particles. They are both… Well… Both at the same time. To calculate the action along some path, we assume there is some path, and we assume there is some particle following some path. The path and the particle are just figments of our mind. Useful figments of the mind, but… Well… There is no such thing as an infinitesimally small particle, and the concept of some one-dimensional line in spacetime does not make sense either. Or… Well… They do. Because they help us to make sense of the world. Of what is, whatever it is. 🙂

The mainstream views on the physical significance of the wavefunction are probably best summed up in the Encyclopædia Britannica, which says the wavefunction has no physical significance. Let me quote the relevant extract here:

“The wave function, in quantum mechanics, is a variable quantity that mathematically describes the wave characteristics of a particle. The value of the wave function of a particle at a given point of space and time is related to the likelihood of the particle’s being there at the time. By analogy with waves such as those of sound, a wave function, designated by the Greek letter psi, Ψ, may be thought of as an expression for the amplitude of the particle wave (or de Broglie wave), although for such waves amplitude has no physical significance. The square of the wave function, Ψ², however, does have physical significance: the probability of finding the particle described by a specific wave function Ψ at a given point and time is proportional to the value of Ψ².”

Really? First, this is factually wrong: the probability is given by the square of the absolute value of the wave function. These are two very different things:

1. The square of a complex number is just another complex number: (a + ib)² = a² + (ib)² + 2iab = a² + i²b² + 2iab = a² − b² + 2iab.
2. In contrast, the square of the absolute value always gives us a real number, to which we assign the mentioned physical interpretation: |a + ib|² = [√(a² + b²)]² = a² + b².
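The difference is easy to see with Python’s built-in complex numbers. This is just an illustration with an arbitrarily chosen value (3 + 4i); nothing here comes from the Britannica text itself.

```python
# An arbitrary complex amplitude, chosen so the arithmetic is easy to follow.
z = complex(3, 4)

square = z * z                            # a^2 - b^2 + 2iab = 9 - 16 + 24i
abs_square = abs(z) ** 2                  # a^2 + b^2 = 9 + 16
conj_product = (z * z.conjugate()).real   # same real number, via z times its conjugate

print("z squared:          ", square)      # (-7+24j)
print("|z| squared:        ", abs_square)  # 25.0
print("z times conjugate:  ", conj_product)
```

Only the last two quantities are real and non-negative, which is what a probability requires.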

But it’s not only position: using the right operators, we can also get probabilities on momentum, energy and other physical variables. Hence, the wavefunction is so much more than what the Encyclopædia Britannica suggests.

More fundamentally, what is written there is philosophically inconsistent. Squaring something – the number itself or its norm – is a mathematical operation. How can a mathematical operation suddenly yield something that has physical significance, if none of the elements it operates on has any? One cannot just go from the mathematical to the physical space. The mathematical space describes the physical space. Always. In physics, at least. 🙂

So… Well… There is too much nonsense around. Disgusting. And the Encyclopædia Britannica should not just present the mainstream view. The truth is: the jury is still out, and there are many guys like me. We think the majority view is plain wrong. In this case, at least. 🙂

# Playing with amplitudes

Let’s play a bit with the stuff we found in our previous post. This is going to be unconventional, or experimental, if you want. The idea is to give you… Well… Some ideas. So you can play yourself. 🙂 Let’s go.

Let’s first look at Feynman’s (simplified) formula for the amplitude of a photon to go from point a to point b. If we identify point a by the position vector r₁ and point b by the position vector r₂, and using Dirac’s fancy bra-ket notation, then it’s written as:

⟨r₂|r₁⟩ = e^(i·p·r₁₂/ħ)/r₁₂

So we have a vector dot product here: p·r₁₂ = |p|·|r₁₂|·cosα = p·r₁₂·cosα. The angle here (α) is the angle between the p and r₁₂ vector. All good. Well… No. We’ve got a problem. When it comes to calculating probabilities, the α angle doesn’t matter: |e^(i·θ)/r|² = 1/r². Hence, for the probability, we get: P = |⟨r₂|r₁⟩|² = 1/r₁₂². Always ! Now that’s strange. The θ = p·r₁₂/ħ argument gives us a different phase depending on the angle (α) between p and r₁₂. But… Well… Think of it: cosα goes from 1 to 0 when α goes from 0 to ±90° and, of course, is negative when p and r₁₂ have opposite directions but… Well… According to this formula, the probabilities do not depend on the direction of the momentum. That’s just weird, I think. Did Feynman, in his iconic Lectures, give us a meaningless formula?

Maybe. We may also note this function looks like the elementary wavefunction for any particle, which we wrote as:

ψ(x, t) = a·e^(−i·θ) = a·e^(−i·(E·t − p·x)/ħ) = a·e^(−i·(E·t)/ħ)·e^(i·(p·x)/ħ)

The only difference is that the ⟨r₂|r₁⟩ sort of abstracts away from time, so… Well… Let’s get a feel for the quantities. Let’s think of a photon carrying some typical amount of energy. Hence, let’s talk visible light and, therefore, photons of a few eV only – say 5.625 eV = 5.625×1.6×10⁻¹⁹ J = 9×10⁻¹⁹ J. Hence, their momentum is equal to p = E/c = (9×10⁻¹⁹ N·m)/(3×10⁸ m/s) = 3×10⁻²⁷ N·s. That’s tiny but that’s only because newtons and seconds are enormous units at the (sub-)atomic scale. As for the distance, we may want to use the thickness of a playing card as a starter, as that’s what Young used when establishing the experimental fact of light interfering with itself. Now, playing cards in Young’s time were obviously rougher than those today, but let’s take the smaller distance: modern cards are as thin as 0.3 mm. Still, that distance is associated with a value of θ = p·r/ħ of about 8,500 radians, which is more than a thousand full cycles. Hence, the density of our wavefunction is enormous at this scale, and it’s a bit of a miracle that Young could see any interference at all ! As shown in the table below, we only get meaningful values (remember: θ is a phase angle) when we go down to the nanometer scale (10⁻⁹ m) or, even better, the angstrom scale (10⁻¹⁰ m).
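The orders of magnitude can be checked with a few lines of Python. This is only a sketch: the constants are the standard SI values (with c ≈ 3×10⁸ m/s), the photon momentum is taken as p = E/c, and the function name `photon_phase` is made up for this illustration.

```python
HBAR = 1.054571817e-34   # reduced Planck constant (J·s)
C = 299_792_458.0        # speed of light (m/s)
EV = 1.602176634e-19     # one electronvolt in joule

def photon_phase(energy_ev, distance_m):
    """Phase angle theta = p·r/hbar (in radians) for a photon with p = E/c."""
    momentum = energy_ev * EV / C
    return momentum * distance_m / HBAR

scales = [
    ("playing card (0.3 mm)", 0.3e-3),
    ("micrometer", 1e-6),
    ("nanometer", 1e-9),
    ("angstrom", 1e-10),
]
for label, distance in scales:
    print(f"{label:>22}: theta = {photon_phase(5.625, distance):.4g} rad")
```

For the 0.3 mm card thickness, this gives a phase of several thousand radians, so the phase only becomes a small angle again well below the micrometer scale.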

So… Well… Again: what can we do with Feynman’s formula? Perhaps he didn’t give us a propagator function but something that is more general (read: more meaningful) at our (limited) level of knowledge. As I’ve been reading Feynman for quite a while now – like three or four years 🙂 – I think… Well… Yes. That’s it. Feynman wants us to think about it. 🙂 Are you joking again, Mr. Feynman? 🙂 So let’s assume the reasonable thing: let’s assume it gives us the amplitude to go from point a to point b along some path r. So, then, in line with what we wrote in our previous post, let’s say p·r (momentum over a distance) is the action (S) we’d associate with this particular path (r) and then see where we get. So let’s write the formula like this:

ψ = a·e^(i·θ) = (1/r)·e^(i·S/ħ) = e^(i·p·r/ħ)/r

We’ll use an index to denote the various paths: r₀ is the straight-line path and rᵢ is any (other) path. Now, quantum mechanics tells us we should calculate this amplitude for every possible path. The illustration below shows the straight-line path and two nearby paths. So each of these paths is associated with some amount of action, which we measure in Planck units: θ = S/ħ.

The time interval is given by t = t₀ = r₀/c, for all paths. Why is the time interval the same for all paths? Because we think of a photon going from some specific point in space and in time to some other specific point in space and in time. Indeed, when everything is said and done, we do think of light as traveling from point a to point b at the speed of light (c). In fact, all of the weird stuff here is all about trying to explain how it does that. 🙂

Now, if we would think of the photon actually traveling along this or that path, then this implies its velocity along any of the nonlinear paths will be larger than c, which is OK. That’s just the weirdness of quantum mechanics, and you should actually not think of the photon actually traveling along one of these paths anyway although we’ll often put it that way. Think of something fuzzier, whatever that may be. 🙂

So the action is energy times time, or momentum times distance. Hence, the difference in action between two paths i and j is given by:

δS = p·rⱼ − p·rᵢ = p·(rⱼ − rᵢ) = p·Δr

I’ll explain the δS < 2πħ/3 thing in a moment. Let’s first pause and think about the uncertainty and how we’re modeling it. We can effectively think of the variation in S as some uncertainty in the action: δS = ΔS = p·Δr. However, if S is also equal to energy times time (S = E·t), and we insist t is the same for all paths, then we must have some uncertainty in the energy, right? Hence, we can write δS as ΔS = ΔE·t. But, of course, E = m·c² = p·c, so we will have an uncertainty in the momentum as well. Hence, the variation in S should be written as:

δS = ΔS = Δp·Δr

That’s just logical thinking: if we, somehow, entertain the idea of a photon going from some specific point in spacetime to some other specific point in spacetime along various paths, then the variation, or uncertainty, in the action will effectively combine some uncertainty in the momentum and the distance. We can calculate Δp as ΔE/c, so we get the following:

δS = ΔS = Δp·Δr = ΔE·Δr/c = ΔE·Δt with Δt = Δr/c

So we have the two expressions for the Uncertainty Principle here: ΔS = Δp·Δr = ΔE·Δt. Just be careful with the interpretation of Δt: it’s just the equivalent of Δr. We just express the uncertainty in distance in seconds using the (absolute) speed of light. We are not changing our spacetime interval: we’re still looking at a photon going from a to b in t seconds, exactly.

Let’s now look at the δS < 2πħ/3 thing. If we’re adding two amplitudes (two arrows or vectors, so to speak) and we want the magnitude of the result to be larger than the magnitude of the two contributions, then the angle between them should be smaller than 120 degrees, so that’s 2π/3 rad. The illustration below shows how you can figure that out geometrically.

Hence, if S₀ is the action for r₀, then S₁ = S₀ + ħ and S₂ = S₀ + 2·ħ are still good, but S₃ = S₀ + 3·ħ is not good. Why? Because the difference in the phase angles is Δθ = S₁/ħ − S₀/ħ = (S₀ + ħ)/ħ − S₀/ħ = 1 and Δθ = S₂/ħ − S₀/ħ = (S₀ + 2·ħ)/ħ − S₀/ħ = 2 respectively, so that’s 57.3° and 114.6° respectively and that’s, effectively, less than 120°. In contrast, for the next path, we find that Δθ = S₃/ħ − S₀/ħ = (S₀ + 3·ħ)/ħ − S₀/ħ = 3, so that’s 171.9°. So that amplitude gives us a negative contribution.
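The 120° rule can be verified directly: for two unit arrows separated by a phase difference Δθ, the magnitude of the sum is |1 + e^(i·Δθ)|. The snippet below is a small check of that statement, with a helper name (`sum_magnitude`) that is made up for this illustration.

```python
import cmath
import math

def sum_magnitude(delta_theta):
    """Magnitude of the sum of two unit arrows separated by delta_theta radians."""
    return abs(1 + cmath.exp(1j * delta_theta))

# Phase differences of 1, 2 and 3 radians, i.e. actions S0 + 1, 2 and 3 times hbar.
for n in (1, 2, 3):
    magnitude = sum_magnitude(n)
    verdict = "larger" if magnitude > 1 else "smaller"
    print(f"delta theta = {n} rad ({math.degrees(n):.1f} deg): "
          f"|sum| = {magnitude:.3f}, {verdict} than one arrow")

# At exactly 120 degrees (2*pi/3 rad), the sum is as long as a single arrow.
print(f"at 2*pi/3: |sum| = {sum_magnitude(2 * math.pi / 3):.3f}")
```

The sum is longer than a single arrow for 1 and 2 radians, and shorter for 3 radians, in line with the threshold of 2π/3.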

Let’s do some calculations using a spreadsheet. To simplify things, we will assume we measure everything (time, distance, force, mass, energy, action,…) in Planck units. Hence, we can simply write: Sₙ = S₀ + n. Of course, n = 1, 2,… etcetera, right? Well… Maybe not. We are measuring action in units of ħ, but do we actually think action comes in units of ħ? I am not sure. It would make sense, intuitively, but… Well… There’s uncertainty on the energy (E) and the momentum (p) of our photon, right? And how accurately can we measure the distance? So there’s some randomness everywhere. 🙁 So let’s leave that question open as for now.

We will also assume that the phase angle for S₀ is equal to 0 (or some multiple of 2π, if you want). That’s just a matter of choosing the origin of time. This makes it really easy: ΔSₙ = Sₙ − S₀ = n, and the associated phase angle θₙ = Δθₙ is the same. In short, the amplitude for each path reduces to ψₙ = e^(i·n)/r₀. So we need to add these first and then calculate the magnitude, which we can then square to get a probability. Of course, there is also the issue of normalization (probabilities have to add up to one) but let’s tackle that later. For the calculations, we use Euler’s r·e^(i·θ) = r·(cosθ + i·sinθ) = r·cosθ + i·r·sinθ formula. Needless to say, |r·e^(i·θ)|² = |r|²·|e^(i·θ)|² = |r|²·(cos²θ + sin²θ) = r². Finally, when adding complex numbers, we add the real and imaginary parts respectively, and we’ll denote the ψ₀ + ψ₁ + ψ₂ + … sum as Ψ.
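The same running sum can be sketched in Python instead of a spreadsheet. This is a minimal illustration of the simplified amplitude ψₙ = e^(i·n)/r₀ only: it works in Planck units, leaves the probabilities unnormalized, and uses r₀ = 1 and a function name that are assumptions made for the example.

```python
import cmath

def cumulative_probabilities(n_paths, r0=1.0):
    """Running |Psi|^2 after adding the amplitudes e^(i*n)/r0 for n = 0, 1, 2, ..."""
    total = 0j
    probabilities = []
    for n in range(n_paths):
        total += cmath.exp(1j * n) / r0        # add the arrow for path n
        probabilities.append(abs(total) ** 2)  # unnormalized probability so far
    return probabilities

probs = cumulative_probabilities(30)
for n, value in enumerate(probs[:8]):
    print(f"after path {n}: |Psi|^2 = {value:.3f}")
```

Because every arrow has the same length and the phase advances by one radian per path, the running sum keeps circling rather than settling down, so the unnormalized probability oscillates.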

Now, we also need to see how our ΔS = Δp·Δr works out. We may want to assume that the uncertainty in p and in r will both be proportional to the overall uncertainty in the action. For example, we could try writing the following: ΔSₙ = Δpₙ·Δrₙ = n·Δp₁·Δr₁. It also makes sense that you may want Δpₙ and Δrₙ to be proportional to Δp₁ and Δr₁ respectively. Combining both, the assumption would be this:

Δpₙ = √n·Δp₁ and Δrₙ = √n·Δr₁

So now we just need to decide how we will distribute ΔS₁ = ħ = 1 over Δp₁ and Δr₁ respectively. For example, if we’d assume Δp₁ = 1, then Δr₁ = ħ/Δp₁ = 1/1 = 1. These are the calculations. I will let you analyze them. 🙂 Well… We get a weird result. It reminds me of Feynman’s explanation of the partial reflection of light, shown below, but… Well… That doesn’t make much sense, does it?

Hmm… Maybe it does. 🙂 Look at the graph more carefully. The peaks sort of oscillate out so… Well… That might make sense… 🙂

Does it? Are we doing something wrong here? These amplitudes should reflect the ones that are reflected in those nice animations (like this one, for example, which is part of the Wikipedia article on Feynman’s path integral formulation of quantum mechanics). So what’s wrong, if anything? Well… Our paths differ by some fixed amount of action, which doesn’t quite reflect the geometric approach that’s used in those animations. The graph below shows how the distance r varies as a function of n.

If we’d use a model in which the distance would increase linearly or, preferably, exponentially, then we’d get the result we want to get, right?

Well… Maybe. Let’s try it. Hmm… We need to think about the geometry here. Look at the triangle below. If b is the straight-line path (r₀), then ac could be one of the crooked paths (rₙ). To simplify, we’ll assume isosceles triangles, so a equals c and, hence, rₙ = 2·a = 2·c. We will also assume the successive paths are separated by the same vertical distance (h = h₁) right in the middle, so h_b = hₙ = n·h₁. It is then easy to show the following:

rₙ = 2·√[(r₀/2)² + (n·h₁)²]

This gives the following graph for r₀ = 10 and h₁ = 0.01.
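The geometry can be checked numerically. Under the isosceles-triangle assumption, each leg is the hypotenuse of a right triangle with sides r₀/2 and n·h₁, so Pythagoras gives the path length. The sketch below uses a made-up function name (`path_length`) and takes 10 for the straight-line distance and 0.01 for the step height as illustrative values.

```python
import math

def path_length(n, r0, h1):
    """Length of the n-th crooked path: two equal legs over a base r0,
    with the apex at a height n*h1 above the middle of the base."""
    return 2.0 * math.hypot(r0 / 2.0, n * h1)

r0, h1 = 10.0, 0.01   # illustrative values
for n in (0, 1, 10, 100, 1000):
    r_n = path_length(n, r0, h1)
    print(f"n = {n:>4}: r_n = {r_n:.6f}, extra distance = {r_n - r0:.6f}")
```

The extra distance grows roughly quadratically in n for the paths close to the straight line, and approaches a linear increase once n·h₁ becomes large compared to r₀.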

Is this the right step increase? Not sure. We can vary the values in our spreadsheet. Let’s first build it. The photon will have to travel faster in order to cover the extra distance in the same time, so its momentum will be higher. Let’s think about the velocity. Let’s start with the first path (n = 1). In order to cover the extra distance Δr₁, the velocity c₁ must be equal to (r₀ + Δr₁)/t = r₀/t + Δr₁/t = c + Δr₁/t = c₀ + Δr₁/t. We can write c₁ as c₁ = c₀ + Δc₁, so Δc₁ = Δr₁/t. Now, the ratio of p₁ and p₀ will be equal to the ratio of c₁ and c₀ because p₁/p₀ = (m·c₁)/(m·c₀) = c₁/c₀. Hence, we have the following formula for p₁:

p₁ = p₀·c₁/c₀ = p₀·(c₀ + Δc₁)/c₀ = p₀·[1 + Δr₁/(c₀·t)] = p₀·(1 + Δr₁/r₀)

For pₙ, the logic is the same, so we write:

pₙ = p₀·cₙ/c₀ = p₀·(c₀ + Δcₙ)/c₀ = p₀·[1 + Δrₙ/(c₀·t)] = p₀·(1 + Δrₙ/r₀)

Let’s do the calculations, and let’s use meaningful values, so the nanometer scale and actual values for Planck’s constant and the photon momentum. The results are shown below.
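For readers who prefer code to a spreadsheet, the model can be sketched as follows. Everything numerical here is an assumption made for illustration: a straight-line distance of 100 nm, a step height of r₀/100, a photon momentum of 3×10⁻²⁷ N·s and 200 paths. The function name `path_sum` is hypothetical, and the probabilities are left unnormalized.

```python
import cmath
import math

HBAR = 1.054571817e-34   # reduced Planck constant (J·s)

def path_sum(n_paths, r0, h1, p0):
    """Running |Psi|^2 when adding the amplitudes e^(i*S_n/hbar)/r_n, with
    r_n = 2*sqrt((r0/2)^2 + (n*h1)^2), p_n = p0*(1 + (r_n - r0)/r0), S_n = p_n*r_n."""
    total = 0j
    probabilities = []
    for n in range(n_paths):
        r_n = 2.0 * math.hypot(r0 / 2.0, n * h1)   # length of the n-th path
        p_n = p0 * (1.0 + (r_n - r0) / r0)         # momentum needed to arrive on time
        action = p_n * r_n                         # action along the n-th path
        total += cmath.exp(1j * action / HBAR) / r_n
        probabilities.append(abs(total) ** 2)
    return probabilities

r0 = 100e-9    # assumed straight-line distance (m)
probs = path_sum(200, r0=r0, h1=r0 / 100, p0=3e-27)
for n in (0, 1, 10, 50, 100, 199):
    print(f"after path {n:>3}: |Psi|^2 = {probs[n]:.4e}")
```

Varying `h1` and the number of paths in this sketch is a quick way to explore how the running probability behaves under different step sizes.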

Pretty interesting. In fact, this looksĀ reallyĀ good. TheĀ probabilityĀ first swings around wildly, because of these zones of constructive and destructive interference, but then stabilizes. [Of course, I would need to normalize the probabilities, but you get the idea, right?] So… Well… I think we get a veryĀ meaningful result with this model. Sweet ! š I’m lovin’ it ! š And, here you go, this is (part of) the calculation table, so you can see what I am doing. š

The graphs below look even better: I just changed the h1/r0Ā ratio from 1/100 to 1/10. The probability stabilizes almost immediately. š So… Well… It’s not as fancy as the referenced animation, but I think the educational value of this thing here is at least as good ! š
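For readers who prefer code to spreadsheets, here is a rough Python sketch of the kind of running sum described above. It is not the original spreadsheet: the inputs (a 600 nm wavelength, r₀ = 100 nm, 200 paths) are assumptions of mine, as is the function name `running_probabilities`. Each path n contributes an arrow (1/rₙ)·e^(i·pₙ·rₙ/ħ), and the (non-normalized) probability is the absolute square of the sum so far.

```python
import cmath
import math

H = 6.62607015e-34            # Planck's constant, J·s
H_BAR = H / (2.0 * math.pi)   # reduced Planck constant, J·s


def running_probabilities(r0, h1, wavelength, n_paths):
    """Return |sum of arrows|^2 after adding path 0, 1, ..., n_paths."""
    p0 = H / wavelength                           # momentum on the straight path
    total = 0j
    probabilities = []
    for n in range(n_paths + 1):
        r_n = 2.0 * math.hypot(r0 / 2.0, n * h1)  # crooked path length
        p_n = p0 * r_n / r0                       # p_n = p0·(1 + Δr_n/r0)
        theta = p_n * r_n / H_BAR                 # phase = action / ħ
        total += cmath.exp(1j * theta) / r_n      # add this path's arrow
        probabilities.append(abs(total) ** 2)
    return probabilities


# Assumed inputs: nanometer scale, h1/r0 = 1/100.
r0 = 100e-9
probs = running_probabilities(r0=r0, h1=r0 / 100.0, wavelength=600e-9, n_paths=200)
print(probs[:5])
```

Whether the output swings and then settles the way the graphs do depends on these input values, which the post does not list, so treat it as a starting point for experimenting rather than a reproduction.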

🙂 This is good stuff… 🙂

Post scriptum (19 September 2017): There is an obvious inconsistency in the model above, and in the calculations. We assume there is a path r₁, r₂, etcetera, and then we calculate the action for it, and the amplitude, and then we add the amplitude to the sum. But, surely, we should count these paths twice, in two-dimensional space, that is. Think of the graph: we have positive and negative interference zones that are sort of layered around the straight-line path, as shown below.

In three-dimensional space, these lines become surfaces. Hence, rather than adding one arrow for every δ having one contribution only, we may want to add… Well… In three-dimensional space, the formula for the surface around the straight-line path would probably look like π·hₙ·r₁, right? Hmm… Interesting idea. I changed my spreadsheet to incorporate that idea, and I got the graph below. It’s a nonsensical result, because the probability does swing around, but it gradually spins out of control: it never stabilizes. That’s because we increase the weight of the paths that are further removed from the center. So… Well… We shouldn’t be doing that, I guess. 🙂 I’ll let you look for the right formula, OK? Let me know when you found it. 🙂

# The Principle of Least Action re-visited

As I was posting some remarks on the Exercises that come with Feynman’s Lectures, I was thinking I should do another post on the Principle of Least Action, and how it is used in quantum mechanics. It is an interesting matter, because the Principle of Least Action sort of connects classical and quantum mechanics.

Let us first re-visit the Principle in classical mechanics. The illustrations which Feynman uses in his iconic exposé on it are copied below. You know what they depict: some object that goes up in the air, and then comes back down because of… Well… Gravity. Hence, we have a force field and, therefore, some potential which gives our object some potential energy. The illustration is nice because we can apply it to any (uniform) force field, so let’s analyze it a bit more in depth.

We know the actual trajectory – which Feynman writes as x̲(t), so as to distinguish it from some other nearby path x(t) = x̲(t) + η(t) – will minimize the value of the following integral:

In the mentioned post, I try to explain what the formula actually means by breaking it up in two separate integrals: one with the kinetic energy in the integrand and – you guessed it 🙂 – one with the potential energy. We can choose any reference point for our potential energy, of course, but to better reflect the energy conservation principle, we assume PE = 0 at the highest point. This ensures that the sum of the kinetic and the potential energy is zero. For a mass of 5 kg (think of the ubiquitous cannon ball), and a (maximum) height of 50 m, we got the following graph.

Just to make sure, here is how we calculate KE and PE as a function of time:

We can, of course, also calculate the action as a function of time:

Note the integrand: KE − PE = m·v². Strange, isn’t it? It’s like E = m·c², right? We get a weird cubic function, which I plotted below (blue). I added the function for the height (but in millimeter) because of the different scales.

So what’s going on? The action concept is interesting. As the product of force, distance and time, it makes intuitive sense: it’s force over distance over time. To cover some distance in some force field, energy will be used or spent but, clearly, the time that is needed should matter as well, right? Yes. But the question is: how, exactly? Let’s analyze what happens from t = 0 to t = 3.2 seconds, so that’s the trajectory from h = 0 to the highest point (h = 50 m). The action that is required to bring our 5 kg object there would be equal to F·h·t = m·g·h·t = 5×9.8×50×3.2 = 7828.9 J·s. [I use non-rounded values in my calculations.] However, our action integral tells us it’s only 5219.6 J·s. The difference (2609.3 J·s) is explained by the initial velocity and, hence, the initial kinetic energy, which we got for free, so to speak, and which, over the time interval, is spent as action. So our action integral gives us a net value, so to speak.

To be precise, we can calculate the time rate of change of the kinetic energy as d(KE)/dt = −1533.7 + 480.2·t, so that’s a linear function of time. The graph below shows how it works. The time rate of change is initially negative, as kinetic energy gets spent and increases the potential energy of our object. At the maximum height, the time rate of change is zero. The object then starts falling, and the time rate of change becomes positive, as the velocity of our object goes from zero to… Well… The velocity is a linear function of time as well: v = v₀ − g·t, remember? Hence, at t = v₀/g = 31.3/9.8 = 3.2 s, the velocity becomes negative so our cannon ball is, effectively, falling down. Of course, as it falls down and gains speed, it covers more and more distance per second and, therefore, the associated action also goes up ever faster (as the cube of time, not exponentially). Just re-define our starting point at t = 3.2 s. The m·v₀·t·(v₀ − g·t) term is zero at that point, and so then it’s only the m·g²·t³/3 term that counts.
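The numbers for the way up can be checked with a short Python sketch. The assumptions are mine: standard gravity g = 9.80665 m/s² (the text rounds to 9.8), a simple midpoint rule for the integral, and the hypothetical function name `throw_summary`. The closed form used is the one from the text, S(t) = m·v₀·t·(v₀ − g·t) + m·g²·t³/3.

```python
import math

G = 9.80665  # standard gravity in m/s², an assumed value


def throw_summary(mass=5.0, height=50.0, g=G, steps=100_000):
    """Compare F·h·t with the action integral of m·v² up to the highest point."""
    v0 = math.sqrt(2.0 * g * height)      # launch speed needed to reach `height`
    t_top = v0 / g                        # time at which v = v0 - g·t hits zero
    gross = mass * g * height * t_top     # force × distance × time
    dt = t_top / steps
    # Midpoint rule for the integral of KE - PE = m·v² from 0 to t_top.
    numeric = sum(mass * (v0 - g * (k + 0.5) * dt) ** 2 for k in range(steps)) * dt
    closed = mass * v0 * t_top * (v0 - g * t_top) + mass * g ** 2 * t_top ** 3 / 3.0
    return v0, t_top, gross, numeric, closed


v0, t_top, gross, numeric, closed = throw_summary()
print(f"v0 = {v0:.2f} m/s, t_top = {t_top:.2f} s")
print(f"F·h·t = {gross:.1f} J·s")
print(f"action integral = {numeric:.1f} J·s (closed form {closed:.1f} J·s)")
print(f"ratio = {closed / gross:.4f}")
```

In this idealized version the action integral comes out at exactly two-thirds of F·h·t, so the ‘free’ part is one-third. Small differences with the figures quoted above may come from rounding or from the integration step used in the spreadsheet.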

So… Yes. That’s clear enough. But it still doesn’t answer the fundamental question: how does that minimization of S (or the maximization of −S) work, exactly? Well… It’s not like Nature knows it wants to go from point a to point b, and then sort of works out some least action algorithm. No. The true path is given by the force law which, at every point in spacetime, will accelerate, or decelerate, our object at a rate a that is equal to the ratio of the force and the mass of our object. In this case, we write: a = F/m = m·g/m = g, so that’s the acceleration of gravity. That’s the only real thing: all of the above is just math, some mental construct, so to speak.

Of course, this acceleration, or deceleration, then gives the velocity and the kinetic energy. Hence, once again, it’s not like we’re choosing some average for our kinetic energy: the force (gravity, in this particular case) just gives us that average. Likewise, the potential energy depends on the position of our object, which we get from… Well… Where it starts and where it goes, so it also depends on the velocity and, hence, the acceleration or deceleration from the force field. So there is no optimization. No teleology. Newton’s force law gives us the true path. If we drop something down, it will go down in a straight line, because any deviation from it would add to the distance. A more complicated illustration is Fermat’s Principle of Least Time, which combines distance and time. But we won’t go into any further detail here. Just note that, in classical mechanics, the true path can, effectively, be associated with a minimum value for that action integral: any other path will be associated with a higher S. So we’re done with classical mechanics here. What about the Principle of Least Action in quantum mechanics?

## The Principle of Least Action in quantum mechanics

We have the uncertainty in quantum mechanics: there is no unique path. However, we can, effectively, associate each possible path with a definite amount of action, which we will also write as S. However, instead of talking velocities, we’ll usually want to talk momentum. Photons have no rest mass (m₀ = 0), but they do have momentum because of their energy: for a photon, the E = m·c² equation can be rewritten as E = p·c, and the Einstein-Planck relation for photons tells us the photon energy (E) is related to the frequency (f): E = h·f. Now, for a photon, the frequency and the wavelength are related by f = c/λ. Hence, p = E/c = h·f/c = h/λ = ħ·k.

OK. What’s the action integral? What’s the kinetic and potential energy? Let’s just try the energy: E = m·c². It reflects the KE − PE = m·v² formula we used above. Of course, the energy of a photon does not vary, so the value of our integral is just the energy times the travel time, right? What is the travel time? Let’s do things properly by using vector notations here, so we will have two position vectors r₁ and r₂ for point a and b respectively. We can then define a vector pointing from r₁ to r₂, which we will write as r₁₂. The distance between the two points is then, obviously, equal to the magnitude of that vector: |r₁₂| = √(r₁₂²) = r₁₂. Our photon travels at the speed of light, so the time interval will be equal to t = r₁₂/c. So we get a very simple formula for the action: S = E·t = p·c·t = p·c·r₁₂/c = p·r₁₂. Now, it may or may not make sense to assume that the direction of the momentum of our photon and the direction of r₁₂ are somewhat different, so we’ll want to re-write this as a vector dot product: S = p∙r₁₂. [Of course, you know the p∙r₁₂ dot product equals |p|·|r₁₂|·cosθ = p·r₁₂·cosθ, with θ the angle between p and r₁₂. If the two directions are the same (θ = 0), then cosθ is equal to 1. If the angle is ± π/2, then it’s 0.]

So now we minimize the action so as to determine the actual path? No. We have this weird stopwatch stuff in quantum mechanics. We’ll use this S = p∙r₁₂ value to calculate a probability amplitude. So we’ll associate trajectories with amplitudes, and we just use the action values to do so. This is how it works (don’t ask me why – not now, at least):

1. We measure action in units of ħ, because… Well… Planck’s constant is a pretty fundamental unit of action, right? 🙂 So we write θ = S/ħ = p∙r₁₂/ħ.
2. θ usually denotes an angle, right? Right. θ = p∙r₁₂/ħ is the so-called phase of… Well… A proper wavefunction:

$$\psi(\mathbf{p}, \mathbf{r}_{12}) = a\cdot e^{i\cdot\theta} = \frac{1}{r_{12}}\cdot e^{i\cdot\mathbf{p}\cdot\mathbf{r}_{12}/\hbar}$$

Wow ! I realize you may never have seen this… Well… It’s my derivation of what physicists refer to as the propagator function for a photon. If you google it, you may see it written like this (most probably not, however, as it’s usually couched in more abstract math): ⟨r₂|r₁⟩ = e^(i·p∙r₁₂/ħ)/r₁₂. This formulation looks slightly better because it uses Dirac’s bra-ket notation: the initial state of our photon is written as |r₁⟩ and its final state is, accordingly, ⟨r₂|. But it’s the same: it’s the amplitude for our photon to go from point a to point b. In case you wonder, the 1/r₁₂ coefficient is there to take care of the inverse square law. I’ll let you think about that for yourself. It’s just like any other physical quantity (or intensity, if you want): they get diluted as the distance increases. [Note that we get the inverse square (1/r₁₂²) when calculating a probability, which we do by taking the absolute square of our amplitude: |(1/r₁₂)·e^(i·p∙r₁₂/ħ)|² = |1/r₁₂|²·|e^(i·p∙r₁₂/ħ)|² = 1/r₁₂².]
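To make the formula a bit more tangible, here is a minimal Python sketch of the simplified propagator. The function name `photon_amplitude` and its `cos_theta` parameter are illustrative choices of mine, and the constants are the current SI values.

```python
import cmath
import math

H = 6.62607015e-34            # Planck's constant, J·s
H_BAR = H / (2.0 * math.pi)   # reduced Planck constant, J·s


def photon_amplitude(wavelength, r12, cos_theta=1.0):
    """Simplified amplitude (1/r12)·exp(i·S/ħ), with S = p·r12·cos(theta)."""
    p = H / wavelength                # photon momentum, p = h/λ
    action = p * r12 * cos_theta      # S = p∙r12 (dot product)
    phase = action / H_BAR            # θ = S/ħ, action measured in units of ħ
    return cmath.exp(1j * phase) / r12


amplitude = photon_amplitude(wavelength=600e-9, r12=2.0)
print(amplitude, abs(amplitude) ** 2)   # the absolute square equals 1/r12²
```

The phase spins around once for every wavelength of distance covered, while the absolute square only depends on the distance: that is the inverse square law the 1/r₁₂ factor is supposed to take care of.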

So… Well… Now we are ready to understand Feynman’s own summary of his path integral formulation of quantum mechanics:

“Here is how it works: Suppose that for all paths, S is very large compared to ħ. One path contributes a certain amplitude. For a nearby path, the phase is quite different, because with an enormous S even a small change in S means a completely different phase – because ħ is so tiny. So nearby paths will normally cancel their effects out in taking the sum – except for one region, and that is when a path and a nearby path all give the same phase in the first approximation (more precisely, the same action within ħ). Only those paths will be the important ones.”

You are now, finally, ready to understand that wonderful animation that’s part of the Wikipedia article on Feynman’s path integral formulation of quantum mechanics. Check it out, and let the author (not me, but a guy who identifies himself as Juan David) know I think it’s great ! 🙂

## Explaining diffraction

All of the above is nice, but how does it work? What’s the geometry? Let me be somewhat more adventurous here. So we have our formula for the amplitude of a photon to go from one point to another: ⟨r₂|r₁⟩ = e^(i·p∙r₁₂/ħ)/r₁₂. The formula is far too simple, if only because it assumes photons always travel at the speed of light. As explained in an older post of mine, a photon also has an amplitude to travel slower or faster than c (I know that sounds crazy, but it is what it is) and a more sophisticated propagator function will acknowledge that and, unsurprisingly, ensure the spacetime intervals that are more light-like make greater contributions to the ‘final arrow’, as Feynman (or his student, Ralph Leighton, I should say) put it in his Strange Theory of Light and Matter. However, then we’d need to use four-vector notation and we don’t want to do that here. The simplified formula above serves the purpose. We can re-write it as:

$$\psi(\mathbf{p}, \mathbf{r}_{12}) = a\cdot e^{i\cdot\theta} = \frac{1}{r_{12}}\cdot e^{i\cdot S/\hbar} = \frac{e^{i\cdot\mathbf{p}\cdot\mathbf{r}_{12}/\hbar}}{r_{12}}$$

Again, S = p∙r₁₂ is just the amount of action we calculate for the path. Action is energy over some time (1 N·m·s = 1 J·s), or momentum over some distance (1 kg·(m/s)·m = 1 N·(s²/m)·(m/s)·m = 1 N·m·s). For a photon traveling at the speed of light, we have E = p·c, and t = r₁₂/c, so we get a very simple formula for the action: S = E·t = p·r₁₂. Now, we know that, in quantum mechanics, we have to add the amplitudes for the various paths between r₁ and r₂ so we get a ‘final arrow’ whose absolute square gives us the probability of… Well… Our photon going from r₁ to r₂. You also know that we don’t really know what actually happens in-between: we know amplitudes interfere, but that’s what we’re modeling when adding the arrows. Let me copy one of Feynman’s famous drawings so we’re sure we know what we’re talking about.

Our simplified approach (the assumption of light traveling at the speed of light) reduces our least action principle to a least time principle: the arrows associated with the path of least time and the paths immediately left and right of it are the ones that make the biggest contribution to the final arrow. Why? Think of the stopwatch metaphor: these stopwatches arrive around the same time and, hence, their hands point more or less in the same direction. It doesn’t matter what direction – as long as it’s more or less the same.

Now let me copy the illustrations he uses to explain diffraction. Look at them carefully, and read the explanation below.

When the slit is large, our photon is likely to travel in a straight line. There are many other possible paths – crooked paths – but the amplitudes that are associated with those other paths cancel each other out. In contrast, the straight-line path and, importantly, the nearby paths, are associated with amplitudes that have the same phase, more or less.

However, when the slit is very narrow, there is a problem. As Feynman puts it, “there are not enough arrows to cancel each other out” and, therefore, the crooked paths are also associated with sizable probabilities. Now how does that work, exactly? Not enough arrows? Why? Let’s have a look at it.

The phase (θ) of our amplitudes a·e^(i·θ) = (1/r₁₂)·e^(i·S/ħ) is measured in units of ħ: θ = S/ħ. Hence, we should measure the variation in S in units of ħ. Consider two paths, for example: one for which the action is equal to S, and one for which the action is equal to S + δS = S + π·ħ, so δS = π·ħ. They will cancel each other out:

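$$\begin{aligned}
\frac{e^{i\cdot S/\hbar}}{r_{12}} + \frac{e^{i\cdot(S+\delta S)/\hbar}}{r_{12}} &= \frac{1}{r_{12}}\cdot\left(e^{i\cdot S/\hbar} + e^{i\cdot(S+\pi\cdot\hbar)/\hbar}\right) \\
&= \frac{1}{r_{12}}\cdot\left(e^{i\cdot S/\hbar} + e^{i\cdot S/\hbar}\cdot e^{i\cdot\pi}\right) = \frac{1}{r_{12}}\cdot\left(e^{i\cdot S/\hbar} - e^{i\cdot S/\hbar}\right) = 0
\end{aligned}$$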

Conversely, paths that are close enough to each other will interfere constructively, so to speak, by making the final arrow larger. In order for that to happen, δS should be smaller than 2πħ/3 ≈ 2ħ, as shown below.

Why? That’s just the way the addition of angles works. Look at the illustration below: if the red arrow is the amplitude to which we are adding another, any amplitude whose phase angle differs from it by less than 2π/3 ≈ 2 radians (so that’s a difference in action that is smaller than 2πħ/3 ≈ 2ħ) will add something to its length. That’s what the geometry of the situation tells us. [If you have time, you can perhaps find some algebraic proof: let me know the result!]
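
The algebraic proof is short if we assume, as a simplification, that both arrows have the same length (so we ignore the small difference in the 1/r₁₂ factor between nearby paths). Take a unit arrow and add another unit arrow whose phase differs by φ = δS/ħ:

$$\left|1 + e^{i\cdot\varphi}\right|^2 = (1 + \cos\varphi)^2 + \sin^2\varphi = 2 + 2\cdot\cos\varphi$$

The sum is longer than the original arrow if and only if this is larger than 1:

$$2 + 2\cdot\cos\varphi > 1 \iff \cos\varphi > -\tfrac{1}{2} \iff |\varphi| < \frac{2\pi}{3} \iff |\delta S| < \frac{2\pi\hbar}{3} \approx 2.09\cdot\hbar$$

At φ = π the sum is zero, which is the cancellation we calculated above.
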
We need to note a few things here. First, unlike what you might think, the amplitudes of the higher and lower path in the drawing do not cancel. On the contrary, the action S is the same, so their magnitudes just add up. Second, if this logic is correct, we will have alternating zones with paths that interfere positively and negatively, as shown below.

Interesting geometry. How relevant are these zones as we move out from the center, steadily increasing δS? I am not quite sure. I’d have to get into the math of it all, which I don’t want to do in a blog like this. What I do want to do is re-examine Feynman’s intuitive explanation of diffraction: when the slit is very narrow, “there are not enough arrows to cancel each other out.”

Huh? What’s that? Can’t we add more paths? It’s a tricky question. We are measuring action in units of ħ, but do we actually think action comes in units of ħ? I am not sure. It would make sense, intuitively, but… Well… There’s uncertainty on the energy (E) and the momentum (p) of our photon, right? And how accurately can we measure the distance? So there’s some randomness everywhere. Having said that, the whole argument does require us to assume action effectively comes in units of ħ: ħ is, effectively, the scaling factor here.

So how can we have more paths? More arrows? I don’t think so. We measure S as energy over some time, or as momentum over some distance, and we express all these quantities in old-fashioned SI units: newton for the force, meter for the distance, and second for the time. If we want smaller arrows, we’ll have to use other units, but then the numerical value for ħ will change too! So… Well… No. I don’t think so. And it’s not because of the normalization rule (all probabilities have to add up to one, so we do have some re-scaling for that). That doesn’t matter, really. What matters is the physics behind the formula, and the formula tells us the physical reality is ħ. So the geometry of the situation is what it is.

Hmm… I guess that, at this point, we should wrap up our rather intuitive discussion here, and resort to the mathematical formalism of Feynman’s path integral formulation, but you can find that elsewhere.

Post scriptum: I said I would show how the Principle of Least Action is relevant to both classical as well as quantum mechanics. Well… Let me quote the Master once more:

“So in the limiting case in which Planck’s constant ħ goes to zero, the correct quantum-mechanical laws can be summarized by simply saying: ‘Forget about all these probability amplitudes. The particle does go on a special path, namely, that one for which S does not vary in the first approximation.’”

So that’s how the Principle of Least Action sort of unifies quantum mechanics as well as classical mechanics. 🙂

Post scriptum 2: In my next post, I’ll be doing some calculations. They will answer the question as to how relevant those zones of positive and negative interference further away from the straight-line path are. I’ll give a numerical example which shows the 1/r₁₂ factor does its job. 🙂 Just have a look at it. 🙂

# Some thoughts on the nature of reality

Some other comment on an article on my other blog inspired me to structure some thoughts that are spread over various blog posts. What follows below is probably the first draft of an article or a paper I plan to write. Or, who knows, I might re-write my two introductory books on quantum physics and publish a new edition soon. 🙂

## Physical dimensions and Uncertainty

The physical dimension of the quantum of action (h or ħ = h/2π) is force (expressed in newton) times distance (expressed in meter) times time (expressed in seconds): N·m·s. Now, you may think this N·m·s dimension is kinda hard to imagine. We can imagine its individual components, right? Force, distance and time. We know what they are. But the product of all three? What is it, really?

It shouldn’t be all that hard to imagine what it might be, right? The N·m·s unit is also the unit in which angular momentum is expressed – and you can sort of imagine what that is, right? Think of a spinning top, or a gyroscope. We may also think of the following:

1. [h] = N·m·s = (N·m)·s = [E]·[t]
2. [h] = N·m·s = (N·s)·m = [p]·[x]

Hence, the physical dimension of action is that of energy (E) multiplied by time (t) or, alternatively, that of momentum (p) times distance (x). To be precise, the second dimensional equation should be written in terms of vectors, as [h] = [p]∙[x], because both the momentum and the distance traveled will be associated with some direction. It’s a moot point for the discussion at the moment, though. Let’s think about the first equation first: [h] = [E]·[t]. What does it mean?

Energy… Hmm… In real life, we are usually not interested in the energy of a system as such, but in the energy it can deliver, or absorb, per second. This is referred to as the power of a system, and it’s expressed in J/s, or watt. Power is also defined as the (time) rate at which work is done. Hmm… But so here we’re multiplying energy and time. So what’s that? After Hiroshima and Nagasaki, we can sort of imagine the energy of an atomic bomb. We can also sort of imagine the power that’s being released by the Sun in light and other forms of radiation, which is about 385×10²⁴ joule per second. But energy times time? What’s that?

I am not sure. If we think of the Sun as a huge reservoir of energy, then the physical dimension of action is just like having that reservoir of energy guaranteed for some time, regardless of how fast or how slow we use it. So, in short, it’s just like the Sun – or the Earth, or the Moon, or whatever object – just being there, for some definite amount of time. So, yes: some definite amount of mass or energy (E) for some definite amount of time (t).

Let’s bring the mass-energy equivalence formula in here: E = m·c². Hence, the physical dimension of action can also be written as [h] = [E]·[t] = [m]·[c]²·[t] = (kg·m²/s²)·s = kg·m²/s. What does that say? Not all that much – for the time being, at least. We can get this [h] = kg·m²/s through some other substitution as well. A force of one newton will give a mass of 1 kg an acceleration of 1 m/s per second. Therefore, 1 N = 1 kg·m/s² and, hence, the physical dimension of h, or the unit of angular momentum, may also be written as 1 N·m·s = 1 (kg·m/s²)·m·s = 1 kg·m²/s, i.e. the product of mass, velocity and distance.
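The dimensional bookkeeping above is easy to mechanize. The sketch below is just an illustration: it represents a physical dimension as a tuple of exponents of (kg, m, s), and the helper names `dim` and `mul` are mine.

```python
def dim(kg=0, m=0, s=0):
    """A physical dimension as exponents of the SI base units (kg, m, s)."""
    return (kg, m, s)


def mul(*dims):
    """Multiplying quantities means adding the exponents of their units."""
    return tuple(sum(parts) for parts in zip(*dims))


MASS, LENGTH, TIME = dim(kg=1), dim(m=1), dim(s=1)
VELOCITY = dim(m=1, s=-1)
FORCE = dim(kg=1, m=1, s=-2)              # 1 N = 1 kg·m/s²
ENERGY = mul(FORCE, LENGTH)               # 1 J = 1 N·m
MOMENTUM = mul(MASS, VELOCITY)            # kg·m/s = N·s
ACTION = mul(FORCE, LENGTH, TIME)         # N·m·s

print(ACTION)                             # (1, 2, -1), i.e. kg·m²/s
print(mul(ENERGY, TIME) == ACTION)        # [h] = [E]·[t]
print(mul(MOMENTUM, LENGTH) == ACTION)    # [h] = [p]·[x]
print(mul(MASS, VELOCITY, LENGTH) == ACTION)  # mass × velocity × distance
```

All four routes end up at kg·m²/s, which is the point of the paragraph: energy times time, momentum times distance and angular momentum share one and the same physical dimension.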

Hmm… What can we do with that? Nothing much for the moment: our first reading of it is just that it reminds us of the definition of angular momentum – some mass with some velocity rotating around an axis. What about the distance? Oh… The distance here is just the distance from the axis, right? Right. But… Well… It’s like having some amount of linear momentum available over some distance – or in some space, right? That’s sufficiently significant as an interpretation for the moment, I’d think…

## Fundamental units

This makes one think about what units would be fundamental – and what units we’d consider as being derived. Formally, the newton is a derived unit in the metric system, as opposed to the units of mass, length and time (kg, m, s). Nevertheless, I personally like to think of force as being fundamental: a force is what causes an object to deviate from its straight trajectory in spacetime. Hence, we may want to think of the quantum of action as representing three fundamental physical dimensions: (1) force, (2) time and (3) distance – or space. We may then look at energy and (linear) momentum as physical quantities combining (1) force and distance and (2) force and time respectively.

Let me write this out:

1. Force times length (think of a force that is acting on some object over some distance) is energy: 1 joule (J) = 1 newton·meter (N·m). Hence, we may think of the concept of energy as a projection of action in space only: we make abstraction of time. The physical dimension of the quantum of action should then be written as [h] = [E]·[t]. [Note the square brackets tell us we are looking at a dimensional equation only, so [t] is just the physical dimension of the time variable. It’s a bit confusing because I also use square brackets as parentheses.]
2. Conversely, the magnitude of linear momentum (p = m·v) is expressed in newton·seconds: 1 kg·m/s = 1 (kg·m/s²)·s = 1 N·s. Hence, we may think of (linear) momentum as a projection of action in time only: we make abstraction of its spatial dimension. Think of a force that is acting on some object during some time. The physical dimension of the quantum of action should then be written as [h] = [p]·[x].

Of course, a force that is acting on some object during some time, will usually also act on the same object over some distance but… Well… Just try, for once, to make abstraction of one of the two dimensions here: time or distance.

It is a difficult thing to do because, when everything is said and done, we don’t live in space or in time alone, but in spacetime and, hence, such abstractions are not easy. [Of course, now you’ll say that it’s easy to think of something that moves in time only: an object that is standing still does just that – but then we know movement is relative, so there is no such thing as an object that is standing still in space in an absolute sense. Hence, objects never stand still in spacetime.] In any case, we should try such abstractions, if only because the principle of least action is so essential and deep in physics:

1. In classical physics, the path of some object in a force field will minimize the total action (which is usually written as S) along that path.
2. In quantum mechanics, the same action integral will give us various values S – each corresponding to a particular path – and each path (and, therefore, each value of S, really) will be associated with a probability amplitude that will be proportional to some constant times e^(i·θ) = e^(i·(S/ħ)). Because ħ is so tiny, even a small change in S will give a completely different phase angle θ. Therefore, most amplitudes will cancel each other out as we take the sum of the amplitudes over all possible paths: only the paths that nearly give the same phase matter. In practice, these are the paths that are associated with a variation in S of an order of magnitude that is equal to ħ.

The paragraph above summarizes, in essence, Feynman’s path integral formulation of quantum mechanics. We may, therefore, think of the quantum of action expressing itself (1) in time only, (2) in space only, or – much more likely – (3) expressing itself in both dimensions at the same time. Hence, if the quantum of action gives us the order of magnitude of the uncertainty – think of writing something like S ± ħ – we may re-write our dimensional [ħ] = [E]·[t] and [ħ] = [p]·[x] equations as the uncertainty equations:

• ΔE·Δt = ħ
• Δp·Δx = ħ

You should note here that it is best to think of the uncertainty relations as a pair of equations, if only because you should also think of the concept of energy and momentum as representing different aspects of the same reality, as evidenced by the (relativistic) energy-momentum relation (E² = p²c² + m₀²c⁴). Also, as illustrated below, the actual path – or, to be more precise, what we might associate with the concept of the actual path – is likely to be some mix of Δx and Δt. If Δt is very small, then Δx will be very large. In order to move over such distance, our particle will require a larger energy, so ΔE will be large. Likewise, if Δt is very large, then Δx will be very small and, therefore, ΔE will be very small. You can also reason in terms of Δx, and talk about momentum rather than energy. You will arrive at the same conclusions: the ΔE·Δt = ħ and Δp·Δx = ħ relations represent two aspects of the same reality – or, at the very least, what we might think of as reality.

Also think of the following: if ΔE·Δt = ħ and Δp·Δx = ħ, then ΔE·Δt = Δp·Δx and, therefore, ΔE/Δp must be equal to Δx/Δt. Hence, the ratio of the uncertainty about x (the distance) and the uncertainty about t (the time) equals the ratio of the uncertainty about E (the energy) and the uncertainty about p (the momentum).

Of course, you will note that the actual uncertainty relations have a factor 1/2 in them. This may be explained by thinking of both negative as well as positive variations in space and in time.

We will obviously want to do some more thinking about those physical dimensions. The idea of a force implies the idea of some object – of some mass on which the force is acting. Hence, let’s think about the concept of mass now. But… Well… Mass and energy are supposed to be equivalent, right? So let’s look at the concept of energy too.

## Action, energy and mass

What is energy, really? In real life, we are usually not interested in the energy of a system as such, but in the energy it can deliver, or absorb, per second. This is referred to as the power of a system, and it’s expressed in J/s. However, in physics, we always talk energy – not power – so… Well… What is the energy of a system?

According to de Broglie and Einstein – and so many other eminent physicists, of course – we should not only think of the kinetic energy of its parts, but also of their potential energy, and their rest energy, and – for an atomic system – we may add some internal energy, which may be binding energy, or excitation energy (think of a hydrogen atom in an excited state, for example). A lot of stuff. 🙂 But, obviously, Einstein’s mass-energy equivalence formula comes to mind here, and summarizes it all:

$$E = m\cdot c^2$$

The m in this formula refers to mass – not to meter, obviously. Stupid remark, of course… But… Well… What is energy, really? What is mass, really? What’s that equivalence between mass and energy, really?

I don’t have the definite answer to that question (otherwise I’d be famous), but… Well… I do think physicists and mathematicians should invest more in exploring some basic intuitions here. As I explained in several posts, it is very tempting to think of energy as some kind of two-dimensional oscillation of mass. A force over some distance will cause a mass to accelerate. This is reflected in the dimensional analysis:

$$[E] = [m]\cdot[c^2] = 1\ \text{kg}\cdot\text{m}^2/\text{s}^2 = 1\ \text{kg}\cdot\text{m}/\text{s}^2\cdot\text{m} = 1\ \text{N}\cdot\text{m}$$

The kg and m/s² factors make this abundantly clear: m/s² is the physical dimension of acceleration: (the change in) velocity per time unit.

Other formulas now come to mind, such as the Planck-Einstein relation: E = h·f = ω·ħ. We could also write: E = h/T. Needless to say, T = 1/f is the period of the oscillation. So we could say, for example, that the energy of some particle times the period of the oscillation gives us Planck’s constant again. What does that mean? Perhaps it’s easier to think of it the other way around: E/f = h = 6.626070040(81)×10⁻³⁴ J·s. Now, f is the number of oscillations per second. Let’s write it as f = n/s, so we get:

$$E/f = \frac{E}{n/\text{s}} = \frac{E\cdot\text{s}}{n} = 6.626070040(81)\times 10^{-34}\ \text{J}\cdot\text{s} \iff E/n = 6.626070040(81)\times 10^{-34}\ \text{J}$$

What an amazing result! Our wavicle – be it a photon or a matter-particle – will always pack 6.626070040(81)×10⁻³⁴ joule in one oscillation, so that’s the numerical value of Planck’s constant which, of course, depends on our fundamental units (i.e. kg, meter, second, etcetera in the SI system).

Of course, the obvious question is: what’s one oscillation? If it’s a wave packet, the oscillations may not have the same amplitude, and we may also not be able to define an exact period. In fact, we should expect the amplitude and duration of each oscillation to be slightly different, shouldn’t we? And then…

Well… What’s an oscillation? We’re used to counting them: n oscillations per second, so that’s per time unit. How many do we have in total? We wrote about that in our posts on the shape and size of a photon. We know photons are emitted by atomic oscillators – or, to put it simply, just atoms going from one energy level to another. Feynman calculated the Q of these atomic oscillators: it’s of the order of 10⁸ (see his Lectures, I-33-3: it’s a wonderfully simple exercise, and one that really shows his greatness as a physics teacher), so… Well… This wave train will last about 10⁻⁸ seconds (that’s the time it takes for the radiation to die out by a factor 1/e). To give a somewhat more precise example, for sodium light, which has a frequency of 500 THz (500×10¹² oscillations per second) and a wavelength of 600 nm (600×10⁻⁹ meter), the radiation will last about 3.2×10⁻⁸ seconds. [In fact, that’s the time it takes for the radiation’s energy to die out by a factor 1/e (i.e. the so-called decay time τ), so the wave train will actually last longer, but the amplitude becomes quite small after that time.] So… Well… That’s a very short time but… Still, taking into account the rather spectacular frequency (500 THz) of sodium light, that makes for some 16 million oscillations and, taking into account the rather spectacular speed of light (3×10⁸ m/s), that makes for a wave train with a length of, roughly, 9.6 meter. Huh? 9.6 meter!? But a photon is supposed to be pointlike, isn’t it? It has no length, does it?

That’s where relativity helps us out: as I wrote in one of my posts, relativistic length contraction may explain the apparent paradox. Using the reference frame of the photon – so if we’d be traveling at speed c, ‘riding’ with the photon, so to say, as it’s being emitted – then we’d ‘see’ the electromagnetic transient as it’s being radiated into space.
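The numbers for the sodium-light wave train are easy to check. The snippet below is just a quick sketch of that arithmetic: the variable names are mine, and the inputs are the rounded values quoted above (500 THz, 600 nm, 3.2×10⁻⁸ s, 3×10⁸ m/s).

```python
# Back-of-the-envelope check of the sodium-light wave train numbers.
# All inputs are the rounded values quoted in the text; names are illustrative.
frequency_hz = 500e12       # 500 THz
wavelength_m = 600e-9       # 600 nm
decay_time_s = 3.2e-8       # decay time of the radiation
speed_of_light = 3e8        # m/s, rounded

# Number of oscillations packed into the wave train during the decay time.
oscillations = frequency_hz * decay_time_s

# Length of the wave train if it travels at the speed of light.
train_length_m = speed_of_light * decay_time_s

# Consistency check: wavelength times frequency should give the speed of light.
implied_speed = wavelength_m * frequency_hz

print(f"oscillations in the wave train: {oscillations:.3g}")
print(f"length of the wave train: {train_length_m:.3g} m")
print(f"wavelength x frequency: {implied_speed:.3g} m/s")
```

So the 16 million oscillations and the 9.6 meter both follow from the same decay time: one multiplies it by the frequency, the other by the speed of light.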

However, while we can associate some mass with the energy of the photon, none of what I wrote above explains what the (rest) mass of a matter-particle could possibly be. There is no real answer to that, I guess. You’ll think of the Higgs field now but… Then… Well. The Higgs field is a scalar field. Very simple: some number that’s associated with some position in spacetime. That doesn’t explain very much, does it? 🙁 When everything is said and done, the scientists who, in 2013 only, got the Nobel Prize for their theory on the Higgs mechanism, simply tell us mass is some number. That’s something we knew already, right? 🙂

## The reality of the wavefunction

The wavefunction is, obviously, a mathematical construct: a description of reality using a very specific language. What language? Mathematics, of course! Math may not be universal (aliens might not be able to decipher our mathematical models) but it’s pretty good as a global tool of communication, at least.

The real question is: is the description accurate? Does it match reality and, if it does, how good is the match? For example, the wavefunction for an electron in a hydrogen atom looks as follows:

Ļ(r, t) = eāiĀ·(E/Ä§)Ā·tĀ·f(r)

As I explained in previous posts (see, for example, my recent post on reality and perception), the f(r) function basically provides some envelope for the two-dimensional e^(−i·θ) = e^(−i·(E/ħ)·t) = cosθ − i·sinθ oscillation, with r = (x, y, z), θ = (E/ħ)·t = ω·t and ω = E/ħ. So it presumes the duration of each oscillation is some constant. Why? Well… Look at the formula: this thing has a constant frequency in time. It’s only the amplitude that is varying as a function of the r = (x, y, z) coordinates. 🙂 So… Well… If each oscillation is to always pack 6.626070040(81)×10⁻³⁴ joule, but the amplitude of the oscillation varies from point to point, then… Well… We’ve got a problem. The wavefunction above is likely to be an approximation of reality only. 🙂 The associated energy is the same, but… Well… Reality is probably not the nice geometrical shape we associate with those wavefunctions.

In addition, we should think of the Uncertainty Principle: there must be some uncertainty in the energy of the photons when our hydrogen atom makes a transition from one energy level to another. But then… Well… If our photon packs something like 16 million oscillations, and the order of magnitude of the uncertainty is only of the order of h (or ħ = h/2π) which, as mentioned above, is the (average) energy of one oscillation only, then we don’t have much of a problem here, do we? 🙂

Post scriptum: In previous posts, we offered some analogies – or metaphors – to a two-dimensional oscillation (remember the V-2 engine?). Perhaps it’s all relatively simple. If we have some tiny little ball of mass – and its center of mass has to stay where it is – then any rotation – around any axis – will be some combination of a rotation around our x- and z-axis – as shown below. Two axes only. So we may want to think of a two-dimensional oscillation as an oscillation of the polar and azimuthal angle. 🙂

# Thinking again…

One of the comments on my other blog made me think I should, perhaps, write something on waves again. The animation below shows the elementary wavefunction ψ = a·e^(−i·θ) = a·e^(−i·(ω·t−k·x)) = a·e^(−(i/ħ)·(E·t−p·x)). We know this elementary wavefunction cannot represent a real-life particle. Indeed, the a·e^(−i·θ) function implies the probability of finding the particle – an electron, a photon, or whatever – would be equal to P(x, t) = |ψ(x, t)|² = |a·e^(−(i/ħ)·(E·t−p·x))|² = |a|²·|e^(−(i/ħ)·(E·t−p·x))|² = |a|²·1² = a² everywhere. Hence, the particle would be everywhere – and, therefore, nowhere really. We need to localize the wave – or build a wave packet. We can do so by introducing uncertainty: we then add a potentially infinite number of these elementary wavefunctions with slightly different values for E and p, and various amplitudes a. Each of these amplitudes will then reflect the contribution to the composite wave, which – in three-dimensional space – we can write as:

Ļ(r, t) = eāiĀ·(E/Ä§)Ā·tĀ·f(r)

As I explained in previous posts (see, for example, my recent post on reality and perception), the f(r) function basically provides some envelope for the two-dimensional e^(−i·θ) = e^(−i·(E/ħ)·t) = cosθ − i·sinθ oscillation, with r = (x, y, z), θ = (E/ħ)·t = ω·t and ω = E/ħ.
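The claim that the elementary wavefunction gives the same probability everywhere can be checked numerically. The sketch below is only an illustration: the function name, the choice of ħ = 1 and the sample values for a, E and p are my own assumptions, not something taken from a textbook.

```python
import cmath

HBAR = 1.0  # natural units, chosen purely for illustration


def elementary_wavefunction(a, energy, momentum, x, t):
    """Return a·e^(−(i/ħ)·(E·t − p·x)) for a real amplitude a."""
    phase = (energy * t - momentum * x) / HBAR
    return a * cmath.exp(-1j * phase)


# Arbitrary sample values (assumptions for the sake of the example).
a, energy, momentum = 0.5, 2.0, 1.5

# Evaluate |ψ|² at a handful of positions and times.
probabilities = [
    abs(elementary_wavefunction(a, energy, momentum, x, t)) ** 2
    for x in (-3.0, 0.0, 1.7)
    for t in (0.0, 0.4, 2.5)
]
print(probabilities)
```

Whatever x and t we plug in, the absolute square comes out as a², because the complex exponential always has unit modulus. That is exactly why we need a wave packet to localize anything.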

Note that it looks like the wave propagates from left to right – in the positive direction of an axis which we may refer to as the x-axis. Also note this perception results from the fact that, naturally, we’d associate time with the rotation of that arrow at the center – i.e. with the motion in the illustration, while the spatial dimensions are just what they are: linear spatial dimensions. [This point is, perhaps, somewhat less self-evident than you may think at first.]

Now, the axis which points upwards is usually referred to as the z-axis, and the third and final axis – which points towards us – would then be the y-axis, obviously. Unfortunately, this definition would violate the so-called right-hand rule for defining a proper reference frame: the figures below show the two possibilities – a left-handed and a right-handed reference frame – and it’s the right-handed reference frame (i.e. the illustration on the right) which we have to use in order to correctly define all directions, including the direction of rotation of the argument of the wavefunction. Hence, if we don’t change the direction of the y- and z-axes – so we keep defining the z-axis as the axis pointing upwards, and the y-axis as the axis pointing towards us – then the positive direction of the x-axis would actually be the direction from right to left, and we should say that the elementary wavefunction in the animation above seems to propagate in the negative x-direction. [Note that this left- or right-hand rule is quite astonishing: simply swapping the direction of one axis of a left-handed frame makes it right-handed, and vice versa.]

Note my language when I talk about the direction of propagation of our wave. I wrote: it looks like, or it seems to go in this or that direction. And I mean that: there is no real traveling here. At this point, you may want to review a post I wrote for my son, which explains the basic math behind waves, and in which I also explained the animation below.

Note how the peaks and troughs of this pulse seem to move leftwards, but the wave packet (or the group or the envelope of the wave – whatever you want to call it) moves to the right. The point is: the pulse itself doesn’t travel left or right. Think of the horizontal axis in the illustration above as an oscillating guitar string: each point on the string just moves up and down. Likewise, if our repeated pulse would represent a physical wave in water, for example, then the water just stays where it is: it just moves up and down. Likewise, if we shake up some rope, the rope is not going anywhere: we just started some motion that is traveling down the rope. In other words, the phase velocity is just a mathematical concept. The peaks and troughs that seem to be traveling are just mathematical points that are ‘traveling’ left or right. That’s why there’s no limit on the phase velocity: it can – and, according to quantum mechanics, actually will – exceed the speed of light. In contrast, the group velocity – which is the actual speed of the particle that is being represented by the wavefunction – may approach – or, in the case of a massless photon, will actually equal – the speed of light, but will never exceed it, and its direction will, obviously, have a physical significance as it is, effectively, the direction of travel of our particle – be it an electron, a photon (electromagnetic radiation), or whatever.
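To make the point about phase and group velocity a bit more tangible, here is a small sketch. It assumes the standard relativistic energy-momentum relation E² = (p·c)² + (m·c²)² for a free particle, which I did not write down above, and it works in units where c = 1 and m = 1. The function names are mine.

```python
import math


def energy(p, m=1.0, c=1.0):
    """Relativistic energy of a free particle: E = sqrt((p·c)² + (m·c²)²)."""
    return math.sqrt((p * c) ** 2 + (m * c ** 2) ** 2)


def phase_velocity(p, m=1.0, c=1.0):
    """Phase velocity of the matter-wave: ω/k = E/p."""
    return energy(p, m, c) / p


def group_velocity(p, m=1.0, c=1.0, dp=1e-6):
    """Group velocity dω/dk = dE/dp, estimated with a central difference."""
    return (energy(p + dp, m, c) - energy(p - dp, m, c)) / (2 * dp)


for p in (0.1, 1.0, 10.0):
    vp, vg = phase_velocity(p), group_velocity(p)
    print(f"p = {p}: v_phase = {vp:.4f} c, v_group = {vg:.4f} c, product = {vp * vg:.4f} c²")
```

Under these assumptions the phase velocity is always larger than c, the group velocity is always smaller, and their product stays at c². So it is the group velocity, not the phase velocity, that we can associate with the particle.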

Hence, you should not think the spin of a particle – integer or half-integer – is somehow related to the direction of rotation of the argument of the elementary wavefunction. It isn’t: Nature doesn’t give a damn about our mathematical conventions, and that’s what the direction of rotation of the argument of that wavefunction is: just some mathematical convention. That’s why we write a·e^(−i·(ω·t−k·x)) rather than a·e^(i·(ω·t+k·x)) or a·e^(i·(ω·t−k·x)): it’s just because of the right-hand rule for coordinate frames, and also because Euler defined the counter-clockwise direction as the positive direction of an angle. There’s nothing more to it.

OK. That’s obvious. Let me now return to my interpretation of Einstein’s E = m·c² formula (see my previous posts on this). I noted that, in the reference frame of the particle itself (see my basics page), the elementary wavefunction a·e^(−(i/ħ)·(E·t−p·x)) reduces to a·e^(−(i/ħ)·(E’·t’)): the origin of the reference frame then coincides with (the center of) our particle itself, and the wavefunction only varies with the time in the inertial reference frame (i.e. the proper time t’), with the rest energy of the object (E’) as the time scale factor. How should we interpret this?

Well… Energy is force times distance, and force is defined as that which causes some mass to accelerate. To be precise, the newton – as the unit of force – is defined as the magnitude of a force which would cause a mass of one kg to accelerate with one meter per second per second. Per second per second. This is not a typo: 1 N corresponds to 1 kg times 1 m/s per second, i.e. 1 kg·m/s². So… Because energy is force times distance, the unit of energy may be expressed in units of kg·m/s²·m, or kg·m²/s², i.e. the unit of mass times the unit of velocity squared. To sum it all up:

1 J = 1 N·m = 1 kg·(m/s)²

This reflects the physical dimensions on both sides of the E = m·c² formula again but… Well… How should we interpret this? Look at the animation below once more, and imagine the green dot is some tiny mass moving around the origin, in an equally tiny circle. We’ve got two oscillations here: each packing half of the total energy of… Well… Whatever it is that our elementary wavefunction might represent in reality – which we don’t know, of course.

Now, the blue and the red dot – i.e. the horizontal and vertical projection of the green dot – accelerate up and down. If we look carefully, we see these dots accelerate towards the zero point and, once they’ve crossed it, they decelerate, so as to allow for a reversal of direction: the blue dot goes up, and then down. Likewise, the red dot does the same. The interplay between the two oscillations, because of the 90° phase difference, is interesting: if the blue dot is at maximum speed (near or at the origin), the red dot reverses speed (its speed is, therefore, (almost) nil), and vice versa. The metaphor of our frictionless V-2 engine, our perpetuum mobile, comes to mind once more.
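The interplay between the two projections can be sketched in a few lines. This is only a toy model: I take a dot on a circle of radius a moving at constant angular velocity ω (both set to 1, which is an assumption for illustration), and the function names are mine.

```python
import math


def projections(theta, a=1.0):
    """Horizontal and vertical projections of a dot on a circle of radius a."""
    return a * math.cos(theta), a * math.sin(theta)


def speeds(theta, a=1.0, omega=1.0):
    """Time derivatives of the two projections, for θ = ω·t."""
    return -a * omega * math.sin(theta), a * omega * math.cos(theta)


for theta in (0.0, math.pi / 4, math.pi / 2):
    x, y = projections(theta)
    vx, vy = speeds(theta)
    print(
        f"θ = {theta:.3f}: x = {x:+.3f}, y = {y:+.3f}, "
        f"vx = {vx:+.3f}, vy = {vy:+.3f}, vx² + vy² = {vx ** 2 + vy ** 2:.3f}"
    )
```

When one projection passes through zero, its speed is at its maximum while the other one has momentarily stopped, and the sum of the squared speeds stays constant throughout. That constancy is what makes it tempting to say each oscillation carries half of the total.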

The question is: what’s going on, really?

My answer is: I don’t know. I do think that, somehow, energy should be thought of as some two-dimensional oscillation of something – something which we refer to as mass, but we didn’t define mass very clearly either. It also, somehow, combines linear and rotational motion. Each of the two dimensions packs half of the energy of the particle that is being represented by our wavefunction. It is, therefore, only logical that the physical unit of both is to be expressed as a force over some distance – which is, effectively, the physical dimension of energy – or the rotational equivalent of them: torque over some angle. Indeed, the analogy between linear and angular movement is obvious: the kinetic energy of a rotating object is equal to K.E. = (1/2)·I·ω². In this formula, I is the rotational inertia – i.e. the rotational equivalent of mass – and ω is the angular velocity – i.e. the rotational equivalent of linear velocity. Noting that the (average) kinetic energy in a harmonic oscillator must be equal to the (average) potential energy, we can add both, so we get a formula which is structurally similar to the E = m·c² formula. But is it the same? Is the effective mass of some object the sum of an almost infinite number of quanta that incorporate some kind of rotational motion? And – if we use the right units – is the angular velocity of these infinitesimally small rotations effectively equal to the speed of light?

I am not sure. Not at all, really. But, so far, I can’t think of any explanation of the wavefunction that would make more sense than this one. I just need to keep trying to find better ways to articulate or imagine what might be going on. 🙂 In this regard, I’d like to add a point – which may or may not be relevant. When I talked about that guitar string, or the water wave, and wrote that each point on the string – or each water drop – just moves up and down, we should think of the physicality of the situation: when the string oscillates, its length increases. So it’s only because our string is flexible that it can vibrate between the fixed points at its ends. For a rope that’s not flexible, the end points would need to move in and out with the oscillation. Look at the illustration below, for example: the two kids who are holding the rope must come closer to each other, so as to provide the necessary space inside of the oscillation for the other kid. 🙂 The next illustration – of how water waves actually propagate – is, perhaps, more relevant. Just think of a two-dimensional equivalent – and of the two oscillations as being transverse waves, as opposed to longitudinal. See how string theory starts making sense? 🙂

The most fundamental question remains the same: what is it, exactly, that is oscillating here? What is the field? It’s always some force on some charge – but what charge, exactly? Mass? What is it? Well… I don’t have the answer to that. It’s the same as asking: what is electric charge, really? So the question is: what’s the reality of mass, of electric charge, or whatever other charge that causes a force to act on it?

If you know, please let me know. 🙂

Post scriptum: The fact that we’re talking some two-dimensional oscillation here – think of a surface now – explains the probability formula: we need to square the absolute value of the amplitude to get it. And normalize, of course. Also note that, when normalizing, we’d expect to get some factor involving π somewhere, because we’re talking some circular surface – as opposed to a rectangular one. But I’ll let you figure that out. 🙂

# An introduction to virtual particles (2)

When reading quantum mechanics, it often feels like the more you know, the less you understand. My reading of the Yukawa theory of force, as an exchange of virtual particles (see my previous post), must have left you with many questions. Questions I can’t answer because… Well… I feel as much of a fool as you do when thinking about it all. Yukawa first talks about some potential – which we usually think of as being some scalar function – and then suddenly this potential becomes a wavefunction. Does that make sense? And think of the mass of that ‘virtual’ particle: the rest mass of a neutral pion is about 135 MeV. That’s an awful lot – at the (sub-)atomic scale that is: it’s equivalent to the rest mass of some 265 electrons!

But… Well… Think of it: the use of a static potential when solving Schrödinger’s equation for the electron orbitals around a hydrogen nucleus (a proton, basically) also raises lots of questions: if we think of our electron as a point-like particle being first here and then there, then that’s not very consistent with a static (scalar) potential either!

One of the weirdest aspects of the Yukawa theory is that these emissions and absorptions of virtual particles violate the energy conservation principle. Look at the animation once again (below): it sort of assumes a rather heavy particle – consisting of a d- or u-quark and its antiparticle – is emitted – out of nothing, it seems – to then vanish as the antiparticle is destroyed when absorbed. What about the energy balance here: are we talking six quarks (the proton and the neutron), or six plus two? Now that we’re talking mass, note a neutral pion (π⁰) may either be a uū or a dd̄ combination, and that the mass of a u-quark and a d-quark is only 2.4 and 4.8 MeV – so the binding energy of the constituent parts of this π⁰ particle is enormous: it accounts for most of its mass.

The thing is… While we’ve presented the π⁰ particle as a virtual particle here, you should also note we find π⁰ particles in cosmic rays. Cosmic rays are particle rays, really: beams of highly energetic particles. Quite a bunch of them are just protons that are being ejected by our Sun. [The Sun also ejects electrons – as you might imagine – but let’s think about the protons here first.] When these protons hit an atom or a molecule in our atmosphere, they usually break up in various particles, including our π⁰ particle, as shown below.

So… Well… How can we relate these things? What isĀ going on, really, inside of that nucleus?

Well… I am not sure. Aitchison and Hey do their utmost to try to explain the pion – as a virtual particle, that is – in terms of energy fluctuations that obey the Uncertainty Principle for energy and time: ΔE·Δt ≥ ħ/2. Now, I find such explanations difficult to follow. Such explanations usually assume any measurement instrument – measuring energy, time, momentum or distance – measures those variables on some discrete scale, which implies some uncertainty indeed. But that uncertainty is more like an imprecision, in my view. Not something fundamental. Let me quote Aitchison and Hey:

“Suppose a device is set up capable of checking to see whether energy is, in fact, conserved while the pion crosses over. The crossing time Δt must be at least r/c, where r is the distance apart of the nucleons. Hence, the device must be capable of operating on a time scale smaller than Δt to be able to detect the pion, but it need not be very much less than this. Thus the energy uncertainty in the reading by the device will be of the order ΔE ∼ ħ/Δt = ħ·(c/r).”

As said, I find such explanations really difficult, although I can sort of sense some of the implicit assumptions. As I mentioned a couple of times already, the E = m·c² equation tells us energy is mass in motion, somehow: some weird two-dimensional oscillation in spacetime. So, yes, we can appreciate we need some time unit to count the oscillations – or, equally important, to measure their amplitude.

[…] But… Well… This falls short of a more fundamental explanation of what’s going on. I like to think of Uncertainty in terms of Planck’s constant itself: ħ or h or – as you’ll usually see it – as half of that value: ħ/2. [The Stern-Gerlach experiment implies it’s ħ/2, rather than h/2 or ħ or h itself.] The physical dimension of Planck’s constant is action: newton times distance times time. I also like to think action can express itself in two ways: as (1) some amount of energy (ΔE: some force over some distance) over some time (Δt) or, else, as (2) some momentum (Δp: some force during some time) over some distance (Δs). Now, if we equate ΔE with the energy of the pion (135 MeV), then we may calculate the order of magnitude of Δt from ΔE·Δt ≥ ħ/2 as follows:

Ā Īt = (Ä§/2)/(135 MeV) ā (3.291Ć10ā16Ā eVĀ·s)/(134.977Ć106Ā eV)Ā ā 0.02438Ć10ā22Ā s

Now, that’s an unimaginably small time unit – but much, much larger than the Planck time (the Planck time unit is about 5.39×10⁻⁴⁴ s). The corresponding distance r is equal to r = Δt·c = (0.02438×10⁻²² s)·(2.998×10⁸ m/s) ≈ 0.0731×10⁻¹⁴ m = 0.731 fm. So… Well… Yes. We got the answer we wanted… So… Well… We should be happy about that but…

Well… I am not. I don’t like this indeterminacy. This randomness in the approach. For starters, I am very puzzled by the fact that the lifetime of the actual π⁰ particle we see in the debris of proton collisions with other particles as cosmic rays enter the atmosphere is like 8.4×10⁻¹⁷ seconds, so that’s like 35 million times longer than the Δt = 0.02438×10⁻²² s we calculated above.

Something doesn’t feel right. I just can’t see the logic here. Sorry. I’ll be back.
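For those who want to redo the arithmetic, here is a quick sketch of the calculation above. The constant names are mine; the value of ħ in eV·s is the CODATA value I used, and the speed of light is rounded as in the text.

```python
# Order-of-magnitude estimate of the time and distance scale of a virtual pion.
HBAR_EV_S = 6.582119514e-16      # reduced Planck constant, in eV·s
C_M_PER_S = 2.998e8              # speed of light, rounded as in the text
PION_REST_ENERGY_EV = 134.977e6  # rest energy of the neutral pion
PION_LIFETIME_S = 8.4e-17        # lifetime of the real π⁰, as quoted in the text

# ΔE·Δt ≥ ħ/2 with ΔE set equal to the rest energy of the pion.
delta_t = (HBAR_EV_S / 2) / PION_REST_ENERGY_EV

# Distance covered in that time at the speed of light.
distance_m = delta_t * C_M_PER_S

# How much longer does the real π⁰ live than this Δt?
lifetime_ratio = PION_LIFETIME_S / delta_t

print(f"Δt ≈ {delta_t:.4g} s")
print(f"r ≈ {distance_m * 1e15:.3f} fm")
print(f"lifetime / Δt ≈ {lifetime_ratio:.3g}")
```

The distance comes out just below one femtometer, and the ratio of the two time scales is of the order of tens of millions, which is the puzzle I mentioned.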

# An introduction to virtual particles

We are going to venture beyond quantum mechanics as it is usually understood – covering electromagnetic interactions only. Indeed, all of my posts so far – a bit less than 200, I think 🙂 – were all centered around electromagnetic interactions – with the model of the hydrogen atom as our most precious gem, so to speak.

In this post, we’ll be talking the strong force – perhaps not for the first time but surely for the first time at this level of detail. It’s an entirely different world – as I mentioned in one of my very first posts in this blog. Let me quote what I wrote there:

“The math describing the ‘reality’ of electrons and photons (i.e. quantum mechanics and quantum electrodynamics), as complicated as it is, becomes even more complicated – and, important to note, also much less accurate – when it is used to try to describe the behavior of quarks. Quantum chromodynamics (QCD) is a different world. […] Of course, that should not surprise us, because we’re talking very different order of magnitudes here: femtometers (10⁻¹⁵ m), in the case of electrons, as opposed to attometers (10⁻¹⁸ m) or even zeptometers (10⁻²¹ m) when we’re talking quarks.”

In fact, the femtometer scale is used to measure the radius of both protons as well as electrons and, hence, is much smaller than the atomic scale, which is measured in nanometer (1 nm = 10⁻⁹ m). The so-called Bohr radius for example, which is a measure for the size of an atom, is measured in nanometer indeed, so that’s a scale that is a million times larger than the femtometer scale. This gap in the scale effectively separates entirely different worlds. In fact, the gap is probably as large a gap as the gap between our macroscopic world and the strange reality of quantum mechanics. What happens at the femtometer scale, really?

The honest answer is: we don’t know, but we do have models to describe what happens. Moreover, for want of better models, physicists sort of believe these models are credible. To be precise, we assume there’s a force down there which we refer to as the strong force. In addition, there’s also a weak force. Now, you probably know these forces are modeled as interactions involving an exchange of virtual particles. This may be related to what Aitchison and Hey refer to as the physicist’s “distaste for action-at-a-distance.” To put it simply: if one particle – through some force – influences some other particle, then something must be going on between the two of them.

Of course, now you’ll say that something is effectively going on: there’s the electromagnetic field, right? Yes. But what’s the field? You’ll say: waves. But then you know electromagnetic waves also have a particle aspect. So we’re stuck with this weird theoretical framework: the conceptual distinction between particles and forces, or between particle and field, is not so clear. So that’s what the more advanced theories we’ll be looking at – like quantum field theory – try to bring together.

Note that we’ve been using a lot of confusing and/or ambiguous terms here: according to at least one leading physicist, for example, virtual particles should not be thought of as particles! But we’re putting the cart before the horse here. Let’s go step by step. To better understand the ‘mechanics’ of how the strong and weak interactions are being modeled in physics, most textbooks – including Aitchison and Hey, which we’ll follow here – start by explaining the original ideas as developed by the Japanese physicist Hideki Yukawa, who received a Nobel Prize for his work in 1949.

So what is it all about? As said, the ideas – or the model as such, so to speak – are more important than Yukawa’s original application, which was to model the force between a proton and a neutron. Indeed, we now explain such force as a force between quarks, and the force carrier is the gluon, which carries the so-called color charge. To be precise, the force between protons and neutrons – i.e. the so-called nuclear force – is now considered to be a rather minor residual force: it’s just what’s left of the actual strong force that binds quarks together. The Wikipedia article on this has some good text and a really nice animation. But… Well… Again, note that we are only interested in the model right now. So what does that look like?

First, we’ve got the equivalent of the electric charge: the nucleon is supposed to have some ‘strong’ charge, which we’ll write as gₛ. Now you know the formulas for the potential energy – because of the gravitational force – between two masses, or the potential energy between two charges – because of the electrostatic force. Let me jot them down once again:

1. U(r) = −G·M·m/r
2. U(r) = (1/4πε₀)·q₁·q₂/r

The two formulas have exactly the same structure. They both assume U = 0 for r → ∞. Therefore, U(r) is always negative. [Just think of q₁ and q₂ as opposite charges, so the minus sign is not explicit – but it is also there!] We know that U(r) curve will look like the one below: some work (force times distance) is needed to move the two charges some distance away from each other – from point 1 to point 2, for example. [The distance r is x here – but you got that, right?]

Now, physics textbooks – or other articles you might find, like on Wikipedia – will sometimes mention that the strong force is non-linear, but that’s very confusing because… Well… The electromagnetic force – or the gravitational force – isn’t linear either: their strength is inversely proportional to the square of the distance and – as you can see from the formulas for the potential energy – that 1/r factor isn’t linear either. So that isn’t very helpful. In order to further the discussion, I should now write down Yukawa’s hypothetical formula for the potential energy between a neutron and a proton, which we’ll refer to, logically, as the n-p potential:

U(r) = −(gₛ²/4π)·e^(−r/a)/r

The −gₛ² factor is, obviously, the equivalent of the q₁·q₂ product: think of the proton and the neutron having equal but opposite ‘strong’ charges. The 1/4π factor reminds us of the Coulomb constant: kₑ = 1/4πε₀. Note this constant ensures the physical dimensions of both sides of the equation make sense: the dimension of 1/ε₀ (and, hence, of kₑ) is N·m²/C², so U(r) is – as we’d expect – expressed in newton·meter, or joule. We’ll leave the question of the units for gₛ open – for the time being, that is. [As for the 1/4π factor, I am not sure why Yukawa put it there. My best guess is that he wanted to remind us some constant should be there to ensure the units come out alright.]

So, when everything is said and done, the big new thing is the e^(−r/a)/r factor, which replaces the usual 1/r dependency on distance. Needless to say, e is Euler’s number here – not the electric charge. The two green curves below show what the e^(−r/a) factor does to the classical 1/r function for a = 1 and a = 0.1 respectively: smaller values for a ensure the curve approaches zero more rapidly. In fact, for a = 1, e^(−r/a)/r is equal to 0.368 for r = 1, and is still equal to 0.004579 (more or less, that is) for r = 4, so it remains significant for values of r that are greater than 1 too. In contrast, for a = 0.1, e^(−r/a)/r is already down to about 0.000045 for r = 1 and rapidly goes to zero for all values greater than that.
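It’s easy to get a feel for what the range parameter does by just plugging in some numbers. The function name below is mine, and r and a are dimensionless here, as in the illustration.

```python
import math


def yukawa_factor(r, a):
    """Distance dependence of the n-p potential: e^(−r/a)/r."""
    return math.exp(-r / a) / r


# Tabulate the factor for the two range parameters used in the illustration.
for a in (1.0, 0.1):
    for r in (0.5, 1.0, 2.0, 4.0):
        print(f"a = {a}, r = {r}: e^(-r/a)/r = {yukawa_factor(r, a):.6g}")
```

Running this for both values of a shows how quickly the exponential takes over from the 1/r factor once r is a few times larger than a.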

Aitchison and Hey call a, therefore, a range parameter: it effectively defines the range in which the n-p potential has a significant value: outside of the range, its value is, for all practical purposes, (close to) zero. Experimentally, this range was established as being more or less equal to r ≤ 2 fm. Needless to say, while this range factor may do its job, it’s obvious Yukawa’s formula for the n-p potential comes across as being somewhat random: what’s the theory behind it? There’s none, really. It makes one think of the logistic function: the logistic function fits many statistical patterns, but it is (usually) not obvious why.

Next in Yukawa’s argument is the establishment of an equivalent, for the nuclear force, of the Poisson equation in electrostatics: using the E = −∇Φ formula, we can re-write Maxwell’s ∇•E = ρ/ε₀ equation (aka Gauss’ Law) as ∇•E = −∇•∇Φ = −∇²Φ ⇔ ∇²Φ = −ρ/ε₀ indeed. The divergence operator – the ∇• operator – gives us the volume density of the flux of E out of an infinitesimal volume around a given point. [You may want to check one of my posts on this. The formula becomes somewhat more obvious if we re-write it as ∇•E·dV = (ρ·dV)/ε₀: ∇•E·dV is then, quite simply, the flux of E out of the infinitesimally small volume dV, and the right-hand side of the equation says this is given by the product of the charge inside (ρ·dV) and 1/ε₀, which accounts for the permittivity of the medium (which is the vacuum in this case).] Of course, you will also remember the ∇Φ notation: ∇Φ is just the gradient (or vector derivative) of the (scalar) potential Φ, i.e. the electric (or electrostatic) potential in a space around that infinitesimally small volume with charge density ρ. So… Well… The Poisson equation is probably not so obvious as it seems at first (again, check my post on it for more detail) and, yes, that ∇• operator – the divergence operator – is a pretty impressive mathematical beast. However, I must assume you master this topic and move on. So… Well… I must now give you the equivalent of Poisson’s equation for the nuclear force. It’s written like this:

(∇² − 1/a²)·U(r) = gₛ²·δ(r)

What the heck? Relax. To derive this equation, we’d need to take a pretty complicated detour, which we won’t do. [See Appendix G of Aitchison and Hey if you’d want the details.] Let me just point out the basics:

1. The Laplace operator (∇²) is replaced by one that’s nearly the same: ∇² − 1/a². And it operates on the same concept: a potential, which is a (scalar) function of the position r. Hence, U(r) is just the equivalent of Φ.

2. The right-hand side of the equation involves Dirac’s delta function. Now that’s a weird mathematical beast. Its definition seems to defy what I refer to as the ‘continuum assumption’ in math. I wrote a few things about it in one of my posts on Schrödinger’s equation – and I could give you its formula – but that won’t help you very much. It’s just a weird thing. As Aitchison and Hey write, you should just think of the whole expression as a finite range analogue of Poisson’s equation in electrostatics. So it’s only for extremely small r that the whole equation makes sense. Outside of the range defined by our range parameter a, the whole equation just reduces to 0 = 0 – for all practical purposes, at least.
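For what it’s worth, one can check numerically that a Yukawa-type profile makes the left-hand side of that equation vanish away from the origin. The sketch below is just that: a sketch. It assumes U(r) is proportional to e^(−r/a)/r (the usual form of Yukawa’s potential, with the coupling constant left out), it uses the radial form of the Laplacian for a spherically symmetric function, and the function names are mine.

```python
import math

def yukawa(r, a):
    """Yukawa-type profile exp(-r/a)/r (coupling constant left out: an assumption)."""
    return math.exp(-r / a) / r

def radial_laplacian(f, r, h=1e-4):
    """Laplacian of a spherically symmetric f(r): (1/r) * d^2(r*f)/dr^2,
    approximated with a central finite difference of step h."""
    g = lambda s: s * f(s)
    return (g(r + h) - 2.0 * g(r) + g(r - h)) / (h * h * r)

a = 2.0  # range parameter in fm
residuals = []
for r in (0.5, 1.0, 2.0, 4.0):
    u = yukawa(r, a)
    # (Laplacian - 1/a^2) applied to U should be ~0 for r > 0
    residual = radial_laplacian(lambda s: yukawa(s, a), r) - u / a**2
    residuals.append(residual)
    print(f"r = {r:3.1f} fm   U = {u:.5f}   residual = {residual:.2e}")
```

The residuals are only zero up to finite-difference error, of course, and the check says nothing about what happens at r = 0: that is where the delta function lives.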

Now, of course, you know that the neutron and the proton are not supposed to just sit there. They’re also in this sort of intricate dance which – for the electron case – is described by some wavefunction, which we derive as a solution from Schrödinger’s equation. So U(r) is going to vary not only in space but also in time and we should, therefore, write it as U(r, t). Now, we will, of course, assume it’s going to vary in space and time as some wave and we may, therefore, suggest some wave equation for it. To appreciate this point, you should review some of the posts I did on waves. More in particular, you may want to review the post I did on traveling fields, in which I showed you the following: if we see an equation like:

∂²ψ/∂x² = (1/c²)·∂²ψ/∂t²

then the function ψ(x, t) must have the following general functional form:

ψ(x, t) = f(x − c·t) + g(x + c·t)

Any function ψ like that will work – so it will be a solution to the differential equation – and we’ll refer to it as a wavefunction. Now, the equation (and the function) is for a wave traveling in one dimension only (x) but the same post shows we can easily generalize to waves traveling in three dimensions. In addition, we may generalize the analysis to include complex-valued functions as well. Now, you will still be shocked by Yukawa’s field equation for U(r, t) but, hopefully, somewhat less so after the above reminder on what wave equations generally look like:

(∇² − (1/c²)·∂²/∂t² − 1/a²)·U(r, t) = 0

As said, you can look up the nitty-gritty in Aitchison and Hey (or in its appendices) but, up to this point, you should be able to sort of appreciate what’s going on without getting lost in it all. Yukawa’s next step – and all that follows – is much more baffling. We’d think U, the nuclear potential, is just some scalar-valued wave, right? It varies in space and in time, but… Well… That’s what classical waves, like water or sound waves, for example do too. So far, so good. However, Yukawa’s next step is to associate a de Broglie-type wavefunction with it. Hence, Yukawa imposes solutions of the type:

U(r, t) ∝ e^(i·(p•r − E·t)/ħ)

What? Yes. It’s a big thing to swallow, and it doesn’t help that most physicists refer to U as a force field. A force and the potential that results from it are two different things. To put it simply: the force on an object is not the same as the work you need to move it from here to there. Force and potential are related but different concepts. Having said that, it sort of makes sense now, doesn’t it? If potential is energy, and if it behaves like some wave, then we must be able to associate it with a de Broglie-type particle. This U-quantum, as it is referred to, comes in two varieties, which are associated with the ongoing absorption-emission process that is supposed to take place inside of the nucleus (depicted below):

p + U⁻ → n and n + U⁺ → p

It’s easy to see that the U⁻ and U⁺ particles are just each other’s anti-particle. When thinking about this, I can’t help remembering Feynman, when he enigmatically wrote – somewhere in his Strange Theory of Light and Matter – that an anti-particle might just be the same particle traveling back in time. In fact, the exchange here is supposed to happen within a time window that is so short it allows for the brief violation of the energy conservation principle.

Let’s be more precise and try to find the properties of that mysterious U-quantum. You’ll need to refresh what you know about operators to understand how substituting Yukawa’s de Broglie wavefunction in the complicated-looking differential equation (the wave equation) gives us the following relation between the energy and the momentum of our new particle:

E² = p²·c² + ħ²·c²/a²

Now, it doesn’t take too many gimmicks to compare this against the relativistically correct energy-momentum relation:

E² = p²·c² + m²·c⁴

Combining both gives us the associated (rest) mass of the U-quantum:

m_U = ħ/(a·c)

For a ≈ 2 fm, m_U is about 100 MeV. Of course, it’s always good to check the dimensions and calculate stuff yourself. Note the physical dimension of ħ/(a·c) is N·s²/m = kg (just think of the F = m·a formula). Also note that N·s²/m = kg = (N·m)·s²/m² = J/(m²/s²), so that’s the [E]/[c²] dimension. The calculation – and interpretation – is somewhat tricky though: if you do it, you’ll find that:

ħ/(a·c) ≈ (1.0545718×10⁻³⁴ N·m·s)/[(2×10⁻¹⁵ m)·(2.99792458×10⁸ m/s)] ≈ 0.176×10⁻²⁷ kg

Now, most physics handbooks continue that terrible habit of writing particle masses in eV, rather than using the correct eV/c² unit. So when they write: m_U is about 100 MeV, they actually mean to say that it’s 100 MeV/c². In addition, the eV is not an SI unit. Hence, to get that number, we should first write 0.176×10⁻²⁷ kg as some value expressed in J/c², and then convert the joule (J) into electronvolt (eV). Let’s do that. First, note that c² ≈ 9×10¹⁶ m²/s², so 0.176×10⁻²⁷ kg ≈ 1.584×10⁻¹¹ J/c². Now we do the conversion from joule to electronvolt. We get: (1.584×10⁻¹¹ J/c²)·(6.2415×10¹⁸ eV/J) ≈ 9.9×10⁷ eV/c² = 99 MeV/c². Bingo! So that was Yukawa’s prediction for the nuclear force quantum.
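If you’d rather let the computer do the unit juggling, here is a minimal Python sketch of the same calculation. The constants are the CODATA values as I remember them, and the function name `range_to_mass` is just something I made up for the occasion.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, J·s
C = 299_792_458.0        # speed of light, m/s
EV = 1.602176634e-19     # joule per electronvolt

def range_to_mass(a_m):
    """Mass of the exchange quantum for a range parameter a (in meter).

    Returns (mass in kg, mass in MeV/c^2), using m = hbar/(a*c).
    """
    mass_kg = HBAR / (a_m * C)
    mass_mev = mass_kg * C**2 / EV / 1e6
    return mass_kg, mass_mev

mass_kg, mass_mev = range_to_mass(2e-15)   # a = 2 fm
print(f"{mass_kg:.3e} kg  =  {mass_mev:.1f} MeV/c^2")
```

Using the exact value of c² (rather than 9×10¹⁶) shifts the result a little, but it stays just under 100 MeV/c².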

Of course, Yukawa was wrong but, as mentioned above, his ideas are now generally accepted. First note the mass of the U-quantum is quite considerable: 100 MeV/c² is a bit more than 10% of the individual proton or neutron mass (about 938-939 MeV/c²). While the binding energy causes the mass of an atom to be less than the mass of its constituent parts (protons, neutrons and electrons), it’s quite remarkable that the deuterium atom – a hydrogen atom with an extra neutron – has an excess mass of about 13.1 MeV/c², and a binding energy with an equivalent mass of only 2.2 MeV/c². So… Well… There’s something there.

As said, this post only wanted to introduce some basic ideas. The current model of nuclear physics is represented by the animation below, which I took from the Wikipedia article on it. The U-quantum appears as the pion here – and it does not really turn the proton into a neutron and vice versa. Those particles are assumed to be stable. In contrast, it is the quarks that change color by exchanging gluons between each other. And we now look at the exchange particle – which we refer to as the pion – between the proton and the neutron as consisting of two quarks in its own right: a quark and an anti-quark. So… Yes… All weird. QCD is just a different world. We’ll explore it more in the coming days and/or weeks. 🙂

An alternative – and simpler – way of representing this exchange of a virtual particle (a neutral pion in this case) is obtained by drawing a so-called Feynman diagram:

OK. That’s it for today. More tomorrow. 🙂

# Reality and perception

It’s quite easy to get lost in all of the math when talking quantum mechanics. In this post, I’d like to freewheel a bit. I’ll basically try to relate the wavefunction we’ve derived for the electron orbitals to the more speculative posts I wrote on how to interpret the wavefunction. So… Well… Let’s go. 🙂

If there is one thing you should remember from all of the stuff I wrote in my previous posts, then it’s that the wavefunction for an electron orbital – ψ(x, t), so that’s a complex-valued function in two variables (position and time) – can be written as the product of two functions in one variable:

ψ(x, t) = e^(−i·(E/ħ)·t)·f(x)

In fact, we wrote f(x) as ψ(x), but I told you how confusing that is: the ψ(x) and ψ(x, t) functions are, obviously, very different. To be precise, the f(x) = ψ(x) function basically provides some envelope for the two-dimensional e^(iθ) = e^(−i·(E/ħ)·t) = cosθ + i·sinθ oscillation – as depicted below (θ = −(E/ħ)·t = ω·t with ω = −E/ħ).

When analyzing this animation – look at the movement of the green, red and blue dots respectively – one cannot miss the equivalence between this oscillation and the movement of a mass on a spring – as depicted below.

The e^(−i·(E/ħ)·t) function just gives us two springs for the price of one. 🙂 Now, you may want to imagine some kind of elastic medium – Feynman’s famous drum-head, perhaps 🙂 – and you may also want to think of all of this in terms of superimposed waves but… Well… I’d need to review if that’s really relevant to what we’re discussing here, so I’d rather not make things too complicated and stick to basics.

First note that the amplitude of the two linear oscillations above is normalized: the maximum displacement of the object from equilibrium, in the positive or negative direction, which we may denote by x = ±A, is equal to one. Hence, the energy formula is just the sum of the potential and kinetic energy: T + U = (1/2)·A²·m·ω² = (1/2)·m·ω². But so we have two springs and, therefore, the energy in this two-dimensional oscillation is equal to E = 2·(1/2)·m·ω² = m·ω².
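A quick numerical check of that sum may help. The sketch below is only an illustration under my own assumptions: two ideal springs with stiffness k = m·ω², unit amplitude, one moving as cos(ω·t) and the other as sin(ω·t), and arbitrary values for m and ω. The helper name `oscillator_energy` is hypothetical.

```python
import math

def oscillator_energy(m, omega, t, phase=0.0, amplitude=1.0):
    """Kinetic plus potential energy of one ideal spring with k = m*omega^2."""
    x = amplitude * math.cos(omega * t + phase)
    v = -amplitude * omega * math.sin(omega * t + phase)
    k = m * omega**2
    return 0.5 * m * v**2 + 0.5 * k * x**2

m, omega = 2.0, 3.0   # arbitrary illustrative values
totals = []
for t in (0.0, 0.1, 0.7, 1.9):
    e_cos = oscillator_energy(m, omega, t)                      # cos(omega*t) spring
    e_sin = oscillator_energy(m, omega, t, phase=-math.pi / 2)  # sin(omega*t) spring
    totals.append(e_cos + e_sin)
    print(f"t = {t:3.1f}   E_cos = {e_cos:.3f}   E_sin = {e_sin:.3f}   sum = {totals[-1]:.3f}")

print("m * omega^2 =", m * omega**2)
```

Each spring holds (1/2)·m·ω² at all times, so the sum does not depend on t.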

This formula is structurally similar to Einstein’s E = m·c² formula. Hence, one may want to assume that the energy of some particle (an electron, in our case, because we’re discussing electron orbitals here) is just the two-dimensional motion of its mass. To put it differently, we might also want to think that the oscillating real and imaginary component of our wavefunction each store one half of the total energy of our particle.

However, the interpretation of this rather bold statement is not so straightforward. First, you should note that the ω in the E = m·ω² formula is an angular velocity, as opposed to the c in the E = m·c² formula, which is a linear velocity. Angular velocities are expressed in radians per second, while linear velocities are expressed in meter per second. However, while the radian measures an angle, we know it does so by measuring a length. Hence, if our distance unit is 1 m, an angle of 2π rad will correspond to a length of 2π meter, i.e. the circumference of the unit circle. So… Well… The two velocities may not be so different after all.

There are other questions here. In fact, the other questions are probably more relevant. First, we should note that the ω in the E = m·ω² formula can take on any value. For a mechanical spring, ω will be a function of (1) the stiffness of the spring (which we usually denote by k, and which is typically measured in newton (N) per meter) and (2) the mass (m) on the spring. To be precise, we write: ω² = k/m – or, what amounts to the same, ω = √(k/m). Both k and m are variables and, therefore, ω can really be anything. In contrast, we know that c is a constant: c equals 299,792,458 meter per second, to be precise. So we have this rather remarkable expression: c = √(E/m), and it is valid for any particle – our electron, or the proton at the center, or our hydrogen atom as a whole. It is also valid for more complicated atoms, of course. In fact, it is valid for any system.

Hence, we need to take another look at the energy concept that is used in our ψ(x, t) = e^(−i·(E/ħ)·t)·f(x) wavefunction. You’ll remember (if not, you should) that the E here is equal to E_n = −13.6 eV, −3.4 eV, −1.5 eV and so on, for n = 1, 2, 3, etc. Hence, this energy concept is rather particular. As Feynman puts it: “The energies are negative because we picked our zero point as the energy of an electron located far from the proton. When it is close to the proton, its energy is less, so somewhat below zero. The energy is lowest (most negative) for n = 1, and increases toward zero with increasing n.”

Now, this is the one and only issue I have with the standard physics story. I mentioned it in one of my previous posts and, just for clarity, let me copy what I wrote at the time:

Feynman gives us a rather casual explanation [on choosing a zero point for measuring energy] in one of his very first Lectures on quantum mechanics, where he writes the following: “If we have a ‘condition’ which is a mixture of two different states with different energies, then the amplitude for each of the two states will vary with time according to an equation like a·e^(−iωt), with ħ·ω = E = m·c². Hence, we can write the amplitude for the two states, for example as:

e^(−i·(E₁/ħ)·t) and e^(−i·(E₂/ħ)·t)

And if we have some combination of the two, we will have an interference. But notice that if we added a constant to both energies, it wouldn’t make any difference. If somebody else were to use a different scale of energy in which all the energies were increased (or decreased) by a constant amount – say, by the amount A – then the amplitudes in the two states would, from his point of view, be:

e^(−i·(E₁+A)·t/ħ) and e^(−i·(E₂+A)·t/ħ)

All of his amplitudes would be multiplied by the same factor e^(−i·(A/ħ)·t), and all linear combinations, or interferences, would have the same factor. When we take the absolute squares to find the probabilities, all the answers would be the same. The choice of an origin for our energy scale makes no difference; we can measure energy from any zero we want. For relativistic purposes it is nice to measure the energy so that the rest mass is included, but for many purposes that aren’t relativistic it is often nice to subtract some standard amount from all energies that appear. For instance, in the case of an atom, it is usually convenient to subtract the energy M_s·c², where M_s is the mass of all the separate pieces – the nucleus and the electrons – which is, of course, different from the mass of the atom. For other problems, it may be useful to subtract from all energies the amount M_g·c², where M_g is the mass of the whole atom in the ground state; then the energy that appears is just the excitation energy of the atom. So, sometimes we may shift our zero of energy by some very large constant, but it doesn’t make any difference, provided we shift all the energies in a particular calculation by the same constant.”

It’s a rather long quotation, but it’s important. The key phrase here is, obviously, the following: “For other problems, it may be useful to subtract from all energies the amount M_g·c², where M_g is the mass of the whole atom in the ground state; then the energy that appears is just the excitation energy of the atom.” So that’s what he’s doing when solving Schrödinger’s equation. However, I should make the following point here: if we shift the origin of our energy scale, it does not make any difference in regard to the probabilities we calculate, but it obviously does make a difference in terms of our wavefunction itself. To be precise, its density in time will be very different. Hence, if we’d want to give the wavefunction some physical meaning – which is what I’ve been trying to do all along – it does make a huge difference. When we leave the rest mass of all of the pieces in our system out, we can no longer pretend we capture their energy.
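To make the point tangible, here is a small Python sketch. It is an illustration only: I work in natural units (ħ = 1), I take an equal-weight mix of two states, and the energies, the shift A and the time t are arbitrary numbers I picked. The function names are mine as well.

```python
import cmath

HBAR = 1.0  # natural units, purely for illustration

def amplitude(energy, t):
    """Time factor e^(-i*E*t/hbar) of a stationary state."""
    return cmath.exp(-1j * energy * t / HBAR)

def superposition(e1, e2, t, shift=0.0):
    """Equal-weight mix of two states, with both energies shifted by `shift`."""
    return (amplitude(e1 + shift, t) + amplitude(e2 + shift, t)) / 2 ** 0.5

E1, E2, A, t = -13.6, -3.4, 1000.0, 0.37   # hypothetical values
psi = superposition(E1, E2, t)
psi_shifted = superposition(E1, E2, t, shift=A)

# Absolute squares agree: the common factor e^(-i*A*t/hbar) has modulus one.
print("probabilities:", abs(psi) ** 2, abs(psi_shifted) ** 2)
# The complex values themselves do not agree: the phase runs at a different rate.
print("phases:       ", cmath.phase(psi), cmath.phase(psi_shifted))
```

So the probabilities are indifferent to the shift, while the wavefunction itself is not, which is exactly the distinction made above.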

So… Well… There you go. If we’d want to try to interpret our ψ(x, t) = e^(−i·(E_n/ħ)·t)·f(x) function as a two-dimensional oscillation of the mass of our electron, the energy concept in it – so that’s the E_n in it – should include all pieces. Most notably, it should also include the electron’s rest energy, i.e. its energy when it is not in a bound state. This rest energy is equal to 0.511 MeV. […] Read this again: 0.511 mega-electronvolt (10⁶ eV), so that’s huge as compared to the tiny energy values we mentioned so far (−13.6 eV, −3.4 eV, −1.5 eV,…).

Of course, this gives us a rather phenomenal order of magnitude for the oscillation that we’re looking at. Let’s quickly calculate it. We need to convert to SI units, of course: 0.511 MeV is about 8.2×10⁻¹⁴ joule (J), and so the associated frequency is equal to ν = E/h = (8.2×10⁻¹⁴ J)/(6.626×10⁻³⁴ J·s) ≈ 1.23559×10²⁰ cycles per second. Now, I know such a number doesn’t say all that much: just note it’s the same order of magnitude as the frequency of gamma rays and… Well… No. I won’t say more. You should try to think about this for yourself. [If you do, think – for starters – about the difference between bosons and fermions: matter-particles are fermions, and photons are bosons. Their nature is very different.]

The corresponding angular frequency is just the same number but multiplied by 2π (one cycle corresponds to 2π radians) and, hence, ω = 2π·ν = 7.76344×10²⁰ rad per second. Now, if our green dot would be moving around the origin, along the circumference of our unit circle, then its horizontal and/or vertical velocity would approach the same value. Think of it. We have this e^(iθ) = e^(−i·(E/ħ)·t) = e^(i·ω·t) = cos(ω·t) + i·sin(ω·t) function, with ω = −E/ħ. So the cos(ω·t) captures the motion along the horizontal axis, while the sin(ω·t) function captures the motion along the vertical axis. Now, the velocity along the horizontal axis as a function of time is given by the following formula:

v(t) = d[x(t)]/dt = d[cos(ω·t)]/dt = −ω·sin(ω·t)

Likewise, the velocity along the vertical axis is given by v(t) = d[sin(ω·t)]/dt = ω·cos(ω·t). These are interesting formulas: they show the magnitude of the velocity (v) along either of the two axes never exceeds the angular velocity (ω). To be precise, it reaches ω when ω·t is equal to 0, π/2, π or 3π/2: one of the two components then peaks while the other is zero. So… Well… 7.76344×10²⁰ meter per second!? That’s like 2.6 trillion times the speed of light. So that’s not possible, of course!
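Here is the arithmetic of the last few paragraphs as a short Python sketch, for those who want to check the numbers. The constants are the CODATA values as I remember them, and reading ω as a speed in meter per second assumes the unit circle, i.e. a radius of 1 m, as in the text above.

```python
import math

H = 6.62607015e-34     # Planck constant, J·s
C = 299_792_458.0      # speed of light, m/s
EV = 1.602176634e-19   # joule per electronvolt

rest_energy_j = 0.511e6 * EV    # electron rest energy, about 8.2e-14 J
nu = rest_energy_j / H          # frequency, cycles per second
omega = 2.0 * math.pi * nu      # angular frequency, rad per second
ratio = omega * 1.0 / C         # tangential speed on a circle of radius 1 m, over c

print(f"E     = {rest_energy_j:.4e} J")
print(f"nu    = {nu:.5e} Hz")
print(f"omega = {omega:.5e} rad/s")
print(f"omega * (1 m) / c = {ratio:.3e}")
```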

That’s where the amplitude of our wavefunction comes in – our envelope function f(x): the green dot does not move along the unit circle. The circle is much tinier and, hence, the oscillation should not exceed the speed of light. In fact, I should probably try to prove it oscillates at the speed of light, thereby respecting Einstein’s universal formula:

c = √(E/m)

Written like this – rather than as you know it: E = m·c² – this formula shows the speed of light is just a property of spacetime, just like the ω = √(k/m) formula (or the ω = √(1/LC) formula for a resonant AC circuit) shows that ω, the natural frequency of our oscillator, is a characteristic of the system.

Am I absolutely certain of what I am writing here? No. My level of understanding of physics is still that of an undergrad. But… Well… It all makes a lot of sense, doesn’t it? 🙂

Now, I said there were a few obvious questions, and so far I answered only one. The other obvious question is why energy would appear to us as mass in motion in two dimensions only. Why is it an oscillation in a plane? We might imagine a third spring, so to speak, moving in and out from us, right? Also, energy densities are measured per unit volume, right?

Now that’s a clever question, and I must admit I can’t answer it right now. However, I do suspect it’s got to do with the fact that the wavefunction depends on the orientation of our reference frame. If we rotate it, it changes. So it’s like we’ve lost one degree of freedom already, so only two are left. Or think of the third direction as the direction of propagation of the wave. 🙂 Also, we should re-read what we wrote about the Poynting vector for the matter wave, or what Feynman wrote about probability currents. Let me give you some appetite for that by noting that we can re-write joule per cubic meter (J/m³) as newton per square meter: J/m³ = N·m/m³ = N/m². [Remember: the unit of energy is force times distance. In fact, looking at Einstein’s formula, I’d say it’s kg·m²/s² (mass times a squared velocity), but that simplifies to the same: kg·m²/s² = [N/(m/s²)]·m²/s².]

I should probably also remind you that there is no three-dimensional equivalent of Euler’s formula, and the way the kinetic and potential energy of those two oscillations works together is rather unique. Remember I illustrated it with the image of a V-2 engine in previous posts. There is no such thing as a V-3 engine. [Well… There actually is – but not with the third cylinder being positioned sideways.]

But… Then… Well… Perhaps we should think of some weird combination of two V-2 engines. The illustration below shows the superposition of two one-dimensional waves – I think – one traveling east-west and back, and the other one traveling north-south and back. So, yes, we may want to think of Feynman’s drum-head again – but combining two-dimensional waves – two waves that both have an imaginary as well as a real dimension.

Hmm… Not sure. If we go down this path, we’d need to add a third dimension – so we’d have a super-weird V-6 engine! As mentioned above, the wavefunction does depend on our reference frame: we’re looking at stuff from a certain direction and, therefore, we can only see what goes up and down, and what goes left or right. We can’t see what comes near and what goes away from us. Also think of the particularities involved in measuring angular momentum – or the magnetic moment of some particle. We’re measuring that along one direction only! Hence, it’s probably no use to imagine we’re looking at three waves simultaneously!

In any case… I’ll let you think about all of this. I do feel I am on to something. I am convinced that my interpretation of the wavefunction as an energy propagation mechanism, or as energy itself – as a two-dimensional oscillation of mass – makes sense. 🙂

Of course, I haven’t answered one key question here: what is mass? What is that green dot – in reality, that is? At this point, we can only waffle – probably best to just give its standard definition: mass is a measure of inertia. A resistance to acceleration or deceleration, or to changing direction. But that doesn’t say much. I hate to say that – in many ways – all that I’ve learned so far has deepened the mystery, rather than solved it. The more we understand, the less we understand? But… Well… That’s all for today, folks! Have fun working through it for yourself. 🙂

Post scriptum: I’ve simplified the wavefunction a bit. As I noted in my post on it, the complex exponential is actually equal to e^(−i·[(E/ħ)·t − m·φ]), so we’ve got a phase shift because of m, the quantum number which denotes the z-component of the angular momentum. But that’s a minor detail that shouldn’t trouble or worry you here.

# The periodic table

This post is, in essence, a continuation of my series on electron orbitals. I’ll just further tie up some loose ends and then – hopefully – have some time to show how we get the electron orbitals for other atoms than hydrogen. So we’ll sort of build up the periodic table. Sort of. 🙂

We should first review a bit. The illustration below copies the energy level diagram from Feynman’sĀ Lecture on the hydrogen wave function. Note he uses āE for the energy scale because… Well… I’ve copied the EnĀ values for n = 1, 2, 3,… 7 next to it: the value forĀ E1Ā (-13.6 eV) is four times the value of E2Ā (-3.4 eV).

How do we know those values? We discussed that before – a long time back: we have the so-called gross structure of the hydrogen spectrum here. The table below gives the energy values for the first seven levels, and you can calculate an example for yourself: the difference between E₂ (−3.4 eV) and E₄ (−0.85 eV) is 2.55 eV, so that’s 4.08555×10⁻¹⁹ J, which corresponds to a frequency equal to f = E/h = (4.08555×10⁻¹⁹ J)/(6.626×10⁻³⁴ J·s) ≈ 0.6165872×10¹⁵ Hz. Now that frequency corresponds to a wavelength that’s equal to λ = c/f = (299,792,458 m/s)/(0.6165872×10¹⁵/s) ≈ 486×10⁻⁹ m. So that’s the 486 nano-meter line of the so-called Balmer series, as shown in the illustration next to the table with the energy values.
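The same calculation works for any pair of levels, so here is a minimal Python sketch of it. It assumes the gross-structure formula E_n = −13.6 eV/n² (which is what the values above amount to); the constants are the CODATA values as I remember them, and the function names are mine.

```python
H = 6.62607015e-34     # Planck constant, J·s
C = 299_792_458.0      # speed of light, m/s
EV = 1.602176634e-19   # joule per electronvolt

def energy_level(n):
    """Gross-structure energy of hydrogen level n, in eV (assumes -13.6/n^2)."""
    return -13.6 / n**2

def transition_wavelength(n_high, n_low):
    """Wavelength (in meter) of the photon emitted when going from n_high to n_low."""
    delta_e = (energy_level(n_high) - energy_level(n_low)) * EV  # in joule
    frequency = delta_e / H
    return C / frequency

wavelength = transition_wavelength(4, 2)
print(f"E2 = {energy_level(2):.2f} eV, E4 = {energy_level(4):.2f} eV")
print(f"lambda(4 -> 2) = {wavelength * 1e9:.1f} nm")
```

Trying `transition_wavelength(3, 2)` gives you the red Balmer line in the same way.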

So far, so good. An interesting point to note is that we only have one solution for n = 1. To be precise, we have one spherical solution only: the 1s solution. Now, for n = 2, we have one 2s solution but also three 2p solutions (remember the p stands for principal lines). In the simplified model we’re using (we’re not discussing the fine or hyperfine structure here), these three solutions are referred to as ‘degenerate states’: they are different states with the same energy. Now, we know that any linear combination of the solutions for a differential equation must also be a solution. Therefore, any linear combination of the 2p solutions will also be a stationary state of the same energy. In fact, a superposition of the 2s and one or more of the 2p states should also be a solution. There is an interesting app which visualizes what such superimposed states look like. I copy three illustrations below, but I recommend you google for stuff like this yourself: it’s really fascinating! You should, once again, pay attention to the symmetry planes and/or symmetry axes.

But we’ve written enough about the orbital of one electron now. What if there are two electrons, or three, or more? In other words, how does it work for helium, lithium, and so on? Feynman gives us a bit of an intuitive explanation here – nothing analytical, really. First, he notes Schrödinger’s equation for two electrons would look as follows:

Second, the ψ(x) function in the ψ(x, t) = e^(−i·(E/ħ)·t)·ψ(x) function now becomes a function in six variables, which he – curiously enough – now no longer writes as ψ but as f:

ψ(x₁, x₂, t) = e^(−i·(E/ħ)·t)·f(x₁, x₂)

The rest of the text speaks for itself, although you might be disappointed by what he writes (the bold-face and/or italics are mine):

“The geometrical dependence is contained in f, which is a function of six variables – the simultaneous positions of the two electrons. No one has found an analytic solution, although solutions for the lowest energy states have been obtained by numerical methods. With 3, 4, or 5 electrons it is hopeless to try to obtain exact solutions, and it is going too far to say that quantum mechanics has given a precise understanding of the periodic table. It is possible, however, even with a sloppy approximation – and some fixing – to understand, at least qualitatively, many chemical properties which show up in the periodic table.

The chemical properties of atoms are determined primarily by their lowest energy states. We can use the following approximate theory to find these states and their energies. First, we neglect the electron spin, except that we adopt the exclusion principle and say that any particular electronic state can be occupied by only one electron. This means that any particular orbital configuration can have up to two electrons – one with spin up, the other with spin down.

Next we disregard the details of the interactions between the electrons in our first approximation, and say that each electron moves in a central field which is the combined field of the nucleus and all the other electrons. For neon, which has 10 electrons, we say that one electron sees an average potential due to the nucleus plus the other nine electrons. We imagine then that in the Schrödinger equation for each electron we put a V(r) which is a 1/r field modified by a spherically symmetric charge density coming from the other electrons.

In this model each electron acts like an independent particle. The angular dependence of its wave function will be just the same as the ones we had for the hydrogen atom. There will be s-states, p-states, and so on; and they will have the various possible m-values. Since V(r) no longer goes as 1/r, the radial part of the wave functions will be somewhat different, but it will be qualitatively the same, so we will have the same radial quantum numbers, n. The energies of the states will also be somewhat different.”

So that’s rather disappointing, isn’t it? We can only get some approximate – or qualitative – understanding of the periodic table from quantum mechanics – because the math is too complex: only numerical methods can give us those orbitals! Wow! Let me list some of the salient points in Feynman’s treatment of the matter:

• For helium (He), we have two electrons in the lowest state (i.e. the 1s state): one has its spin ‘up’ and the other is ‘down’. Because the shell is filled, the ionization energy (to remove one electron) has an even larger value than the ionization energy for hydrogen: 24.6 eV! That’s why there is “practically no tendency” for the electron to be attracted by some other atom: helium is chemically inert – which explains it being part of the group of noble or inert gases.
• For lithium (Li), two electrons will occupy the 1s orbital, and the third should go to an n = 2 state. But which one? With l = 0, or l = 1? A 2s state or a 2p state? In hydrogen, these two n = 2 states have the same energy, but in other atoms they don’t. Why not? That’s a complicated story, but the gist of the argument is as follows: a 2s state has some amplitude to be near the nucleus, while the 2p state does not. That means that a 2s electron will feel some of the triple electric charge of the Li nucleus, and this extra attraction lowers the energy of the 2s state relative to the 2p state.

To make a long story short, the energy levels will be roughly as shown in the table below. For example, the energy that’s needed to remove the 2s electron of the lithium – i.e. the ionization energy of lithium – is only 5.4 eV because… Well… As you can see, it has a higher energy (less negative, that is) than the 1s state (−13.6 eV for hydrogen and, as mentioned above, −24.6 eV for helium). So lithium is chemically active – as opposed to helium.

You should compare the table below with the table above. If you do, you’ll understand how electrons ‘fill up’ those electron shells. Note, for example, that the energy of the 4s state is slightly lower than the energy of the 3d state, so it fills up before the 3d shell does. [I know the table is hard to read – just check out the original text if you want to see it better.]

This, then, is what you learnt in high school and, of course, there are 94 naturally occurring elements – and another 24 heavier elements that have been produced in labs, so we’d need to go all the way to no. 118. Now, Feynman doesn’t do that, and so I won’t do that either. 🙂

Well… That’s it, folks. We’re done with Feynman. It’s time to move to a physics grad course now! Talk stuff like quantum field theory, for example. Or string theory. 🙂 Stay tuned!

# Re-visiting electron orbitals (III)

In my previous post, I mentioned that it was not so obvious (both from a physical as well as from a mathematical point of view) to write the wavefunction for electron orbitals – which we denoted as ψ(x, t), i.e. a function of two variables (or four: one time coordinate and three space coordinates) – as the product of two other functions in one variable only.

[…] OK. The above sentence is difficult to read. Let me write it in math. 🙂 It is not so obvious to write ψ(x, t) as:

ψ(x, t) = e^(−i·(E/ħ)·t)·ψ(x)

As I mentioned before, the physicists’ use of the same symbol (ψ, psi) for both the ψ(x, t) and ψ(x) function is quite confusing – because the two functions are very different:

• Ļ(x, t) is a complex-valued function of twoĀ (real)Ā variables: x and t. OrĀ four, I should say, because xĀ = (x, y, z) – but it’s probably easier to think of x as oneĀ vectorĀ variable – aĀ vector-valued argument, so to speak. And then t is, of course, just aĀ scalarĀ variable. So… Well… A function of twoĀ variables: the position in space (x), and time (t).
• In contrast, Ļ(x) is a real-valuedĀ function ofĀ oneĀ (vector) variable only: x, so that’s the position in space only.

Now you should cry foul, of course: $\psi(\mathbf{x})$ is not necessarily real-valued. It may be complex-valued. You’re right. You know the formula:

$$\psi_{n,l,m}(r, \theta, \phi) = Y_{l,m}(\theta, \phi)\cdot F_{n,l}(r)$$

Note the derivation of this formula involved a switch from Cartesian to polar coordinates here, so from $\mathbf{x} = (x, y, z)$ to $\mathbf{r} = (r, \theta, \phi)$, and that the function also depends on quantum numbers now: the orbital angular momentum (l) and its z-component (m) for the angular part, and n and l for the radial part. In my previous post(s), I gave you the formulas for $Y_{l,m}(\theta, \phi)$ and $F_{n,l}(r)$ respectively. $F_{n,l}(r)$ was a real-valued function alright, but the $Y_{l,m}(\theta, \phi)$ had that $e^{i m \phi}$ factor in it. So… Yes. You’re right: the $Y_{l,m}(\theta, \phi)$ function is real-valued if – and only if – m = 0, in which case $e^{i m \phi} = 1$. Let me copy the table from Feynman’s treatment of the topic once again:

The $P_l^m(\cos\theta)$ functions are the so-called (associated) Legendre polynomials, and the formula for these functions is rather horrible:

Don’t worry about it too much: just note the $P_l^m(\cos\theta)$ is a real-valued function. The point is the following: the $\psi(\mathbf{x}, t)$ is a complex-valued function because – and only because – we multiply a real-valued envelope function – which depends on position only – with $e^{-i(E/\hbar)t}\cdot e^{i m \phi} = e^{-i[(E/\hbar)t - m\phi]}$.

[…]

Please read the above once again and – more importantly – think about it for a while. 🙂 You’ll have to agree with the following:

• As mentioned in my previous post, the $e^{i m \phi}$ factor just gives us a phase shift: just a re-set of our zero point for measuring time, so to speak, and the whole $e^{-i[(E/\hbar)t - m\phi]}$ factor just disappears when we’re calculating probabilities.
• The envelope function gives us the basic amplitude – in the classical sense of the word: the maximum displacement from the zero value. And so it’s that $e^{-i[(E/\hbar)t - m\phi]}$ factor that ensures the whole expression somehow captures the energy of the oscillation.
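A quick numerical sanity check of the first point may help. This is only a sketch under my own assumptions: natural units (ħ = 1), an arbitrary made-up envelope value, and a helper I called `psi` for the occasion.

```python
import cmath
import math

HBAR = 1.0  # natural units: an assumption made for this illustration only

def psi(envelope, E, m, t, phi):
    """Real envelope value times the phase factor exp(-i*[(E/hbar)*t - m*phi])."""
    return envelope * cmath.exp(-1j * ((E / HBAR) * t - m * phi))

envelope = 0.37  # hypothetical value of the real envelope at some (r, theta)

# Absolute square at several times t and azimuthal angles phi
probs = [abs(psi(envelope, E=2.0, m=1, t=t, phi=phi)) ** 2
         for t in (0.0, 0.5, 3.0)
         for phi in (0.0, 1.0, math.pi)]
print(probs)
```

Whatever t and φ we pick, the absolute square stays equal to the square of the envelope value (up to rounding), so the phase factor drops out of the probabilities.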

Let’s first look at the envelope function again. Let me copy the illustration for n = 5 and l = 2 from a Wikimedia Commons article. Note the symmetry planes:

• Any plane containing the z-axis is a symmetry plane – like a mirror in which we can reflect one half of the shape to get the other half. [Note that I am talking about the shape only here. Forget about the colors for a while – as these reflect the complex phase of the wavefunction.]
• Likewise, the plane containing both the x- and the y-axis is a symmetry plane as well.

The first symmetry plane – or symmetry line, really (i.e. the z-axis) – should not surprise us, because the azimuthal angle φ is conspicuously absent in the formula for our envelope function if, as we are doing in this article here, we merge the $e^{i m \phi}$ factor with the $e^{-i(E/\hbar)t}$ factor, so it’s just part and parcel of what the author of the illustrations above refers to as the ‘complex phase’ of our wavefunction. OK. Clear enough – I hope. 🙂

But why is the xy-plane a symmetry plane too? We need to look at that monstrous formula for the $P_l^m(\cos\theta)$ function here: just note the cosθ argument in it is being squared before it’s used in all of the other manipulations. Now, we know that cosθ = sin(π/2 − θ). So we can define some new angle – let’s just call it α – which is measured in the way we’re used to measuring angles, which is not from the z-axis but from the xy-plane. So we write: cosθ = sin(π/2 − θ) = sinα. The illustration below may or may not help you to see what we’re doing here.

So… To make a long story short, we can substitute sinα = sin(π/2 − θ) for the cosθ argument in the $P_l^m(\cos\theta)$ function. Now, if the xy-plane is a symmetry plane, then we must find the same value for $P_l^m(\sin\alpha)$ and $P_l^m[\sin(-\alpha)]$. Now, that’s not obvious, because sin(−α) = −sinα ≠ sinα. However, because the argument in that $P_l^m(x)$ function is being squared before any other operation (like subtracting 1 and exponentiating the result), it is OK: $[-\sin\alpha]^2 = [\sin\alpha]^2 = \sin^2\alpha$. […] OK, I am sure the geeks amongst my readers will be able to explain this more rigorously. In fact, I hope they’ll have a look at it, because there’s also that $d^{l+m}/dx^{l+m}$ operator, and so you should check what happens with the minus sign there. 🙂
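For those who want to chase that minus sign, here is a sketch of the argument. I am assuming the standard Rodrigues-type form of the associated Legendre functions, which matches the description above (square the argument, subtract 1, exponentiate, then differentiate l + m times). Feynman’s version may differ by an overall sign convention, which does not affect the conclusion.

```latex
% Assumed standard form (overall sign convention may differ between textbooks):
P_l^m(x) = \frac{(-1)^m}{2^l\, l!}\,(1 - x^2)^{m/2}\,
           \frac{d^{\,l+m}}{dx^{\,l+m}}\,(x^2 - 1)^l

% (x^2 - 1)^l and (1 - x^2)^{m/2} are even in x.
% Each derivative d/dx turns an even function into an odd one and vice versa,
% so l + m derivatives contribute a factor (-1)^{l+m} under x -> -x:
P_l^m(-x) = (-1)^{\,l+m}\, P_l^m(x)

% Hence the amplitude may flip sign, but its square does not:
\left[P_l^m(-\sin\alpha)\right]^2 = \left[P_l^m(\sin\alpha)\right]^2
```

So the value of the function itself is only guaranteed to be the same above and below the xy-plane when l + m is even. When l + m is odd, it changes sign. The squared value, and therefore the shape of the probability density, is symmetric in both cases.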

[…] Well… By now, you’re probably totally lost, but the fact of the matter is that we’ve got a beautiful result here. Let me highlight the most significant results:

• A definite energy state of a hydrogen atom (or of an electron orbiting around some nucleus, I should say) appears to us as some beautifully shaped orbital – an envelope function in three dimensions, really – which has the z-axis – i.e. the vertical axis – as a symmetry line and the xy-plane as a symmetry plane.
• The $e^{-i[(E/\hbar)t - m\phi]}$ factor gives us the oscillation within the envelope function. As such, it’s this factor that, somehow, captures the energy of the oscillation.

It’s worth thinking about this. Look at the geometry of the situation again – as depicted below. We’re looking at the situation along the x-axis, in the direction of the origin, which is the nucleus of our atom.

The $e^{i m \phi}$ factor just gives us a phase shift: just a re-set of our zero point for measuring time, so to speak. Interesting, weird – but probably less relevant than the $e^{-i(E/\hbar)t}$ factor, which gives us the two-dimensional oscillation that captures the energy of the state.

Now, the obvious question is: the oscillation of what, exactly? I am not quite sure but – as I explained in my Deep Blue page – the real and imaginary part of our wavefunction are really like the electric and magnetic field vector of an oscillating electromagnetic field (think of electromagnetic radiation – if that makes it easier). Hence, just like the electric and magnetic field vector represent some rapidly changing force on a unit charge, the real and imaginary part of our wavefunction must also represent some rapidly changing force on… Well… I am not quite sure on what though. The unit charge is usually defined as the charge of a proton – rather than an electron – but then forces act on some mass, right? And the mass of a proton is hugely different from the mass of an electron. The same electric (or magnetic) force will, therefore, give a hugely different acceleration to both.

So… Well… My gut instinct tells me the real and imaginary part of our wavefunction just represent, somehow, a rapidly changing force on some unit of mass, but then I am not sure how to define that unit right now (it’s probably not the kilogram!).

Now, there is another thing we should note here: we’re actually sort of de-constructing a rotation (look at the illustration above once again) into two linearly oscillating vectors – one along the z-axis and the other along the y-axis. Hence, in essence, we’re actually talking about something that’s spinning. In other words, we’re actually talking about some torque around the x-axis. In what direction? I think that shouldn’t matter – that we can write E or −E, in other words, but… Well… I need to explore this further – as should you! 🙂

Let me just add one more note on the $e^{i m \phi}$ factor. It sort of defines the geometry of the complex phase itself. Look at the illustration below. Click on it to enlarge it if necessary – or, better still, visit the magnificent Wikimedia Commons article from which I get these illustrations. These are the orbitals for n = 4 and l = 3. Look at the red hues in particular – or the blue – whatever: focus on one color only, and see how – for m = ±1 – we’ve got one appearance of that color only. For m = ±2, the same color appears at two ends of the ‘tubes’ – or tori (plural of torus), I should say – just to sound more professional. 🙂 For m = ±3, the torus consists of three parts – or, in mathematical terms, we’d say the order of its rotational symmetry is equal to 3. Check that Wikimedia Commons article for higher values of n and l: the shapes become very convoluted, but the observation holds. 🙂
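The counting can be checked with a few lines of code. The sketch below simply counts at how many azimuthal angles, going once around the z-axis, the $e^{i m \phi}$ factor comes back to phase zero. The function name `color_repeats` and the one-degree grid are my own choices for the illustration.

```python
import cmath
import math

def color_repeats(m, samples=360):
    """Count angles phi in [0, 2*pi) on a regular grid where exp(i*m*phi) equals 1."""
    hits = 0
    for k in range(samples):
        phi = 2 * math.pi * k / samples
        if abs(cmath.exp(1j * m * phi) - 1) < 1e-9:
            hits += 1
    return hits

counts = {m: color_repeats(m) for m in (1, 2, 3, -3)}
print(counts)
```

The same phase (the same color, in the illustrations) recurs |m| times around the axis, and the sign of m only changes the direction in which the phase advances.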

Have fun thinking all of this through for yourself – and please do look at those symmetries in particular. 🙂

Post scriptum: You should do some thinking on whether or not these m = ±1, ±2,…, ±l orbitals are really different. As I mentioned above, a phase difference is just what it is: a re-set of the t = 0 point. Nothing more, nothing less. So… Well… As far as I am concerned, that’s not a real difference, is it? 🙂 As with other stuff, I’ll let you think about this for yourself.

# Re-visiting electron orbitals (II)

I’ve talked about electron orbitals in a couple of posts already – including a fairly recent one, which is why I put the (II) after the title. However, I just wanted to tie up some loose ends here – and do some more thinking about the concept of a definite energy state. What is it really? We know the wavefunction for a definite energy state can always be written as:

$$\psi(\mathbf{x}, t) = e^{-i(E/\hbar)t}\cdot\psi(\mathbf{x})$$

Well… In fact, we should probably formally prove that but… Well… Let us just explore this formula in a more intuitive way – for the time being, that is – using those electron orbitals we’ve derived.

First, let me note that $\psi(\mathbf{x}, t)$ and $\psi(\mathbf{x})$ are very different functions and, therefore, the choice of the same symbol for both (the Greek psi) is – in my humble opinion – not very fortunate, but then… Well… It is the choice of physicists – as copied in textbooks all over – and so we’ll just have to live with it. Of course, we can appreciate why they choose to use the same symbol – $\psi(\mathbf{x})$ is like a time-independent wavefunction now, so that’s nice – but… Well… You should note that it is not so obvious to write some function as the product of two other functions. To be complete, I’ll be a bit more explicit here: if some function in two variables – say F(x, y) – can be written as the product of two functions in one variable – say f(x) and g(y), so we can write F as F(x, y) = f(x)·g(y) – then we say F is a separable function. For a full overview of what that means, click on this link. And note mathematicians do choose different symbols for the functions F, f and g. It would probably be interesting to explore what the conditions for separability actually imply in terms of properties of… Well… The wavefunction and its argument, i.e. the space and time variables. But… Well… That’s stuff for another post. 🙂
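To make the idea of separability a bit more tangible, here is a small sketch. If F(x, y) = f(x)·g(y), then F(x1, y1)·F(x2, y2) must equal F(x1, y2)·F(x2, y1) for any pair of points, so we can test that necessary condition on a grid. The helper `looks_separable` and the two sample functions are made up for the illustration. Passing the test on a finite grid is evidence, not a proof.

```python
import math

def looks_separable(F, xs, ys, tol=1e-12):
    """Necessary condition for F(x, y) = f(x)*g(y), checked on a finite grid."""
    for x1 in xs:
        for x2 in xs:
            for y1 in ys:
                for y2 in ys:
                    lhs = F(x1, y1) * F(x2, y2)
                    rhs = F(x1, y2) * F(x2, y1)
                    if abs(lhs - rhs) > tol:
                        return False
    return True

grid = [0.1, 0.7, 1.3, 2.0]

def product_form(x, y):
    return math.sin(x) * math.exp(-y)  # f(x)*g(y): separable

def sum_form(x, y):
    return math.sin(x) + math.exp(-y)  # a sum, not a product

is_product_separable = looks_separable(product_form, grid, grid)
is_sum_separable = looks_separable(sum_form, grid, grid)
print(is_product_separable, is_sum_separable)
```

The product passes and the sum fails, which is the sense in which a definite-energy wavefunction, written as a time factor times a space factor, is a rather special kind of function.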

Secondly, note that the momentum variable (p) – i.e. the p in our elementary wavefunction $a\cdot e^{i(p\cdot x - E\cdot t)/\hbar}$ – has sort of vanished: $\psi(\mathbf{x})$ is a function of the position only. Now, you may think it should be somewhere there – that, perhaps, we can write something like $\psi(\mathbf{x}) = \psi[\mathbf{x}, p(\mathbf{x})]$. But… No. The momentum variable has effectively vanished. Look at Feynman’s solutions for the electron orbitals of a hydrogen atom:

$$\psi_{n,l,m} = Y_{l,m}(\theta, \phi)\cdot F_{n,l}(\rho)$$

The $Y_{l,m}(\theta, \phi)$ and $F_{n,l}(\rho)$ functions here are functions of the (polar) coordinates ρ, θ, φ only. So that’s the position only (these coordinates are polar or spherical coordinates, so ρ is the radial distance, θ is the polar angle, and φ is the azimuthal angle). There’s no idea whatsoever of any momentum in one or the other spatial direction here. I find that rather remarkable. Let’s see how it all works with a simple example.

The functions below are the $Y_{l,m}(\theta, \phi)$ for l = 1. Note the symmetry: if we swap θ and φ for −θ and −φ respectively in the m = −1 function, $2^{-1/2}\sin\theta\, e^{-i\phi}$, we get the other function: $2^{-1/2}\sin(-\theta)\, e^{-i(-\phi)} = -2^{-1/2}\sin\theta\, e^{i\phi}$.

To get the probabilities, we need to take the absolute square of the whole thing, including $e^{-i(E/\hbar)t}$, but we know $|e^{i\delta}|^2 = 1$ for any value of δ. Why? Because the absolute square of any complex number is the product of the number with its complex conjugate, so $|e^{i\delta}|^2 = e^{i\delta}\cdot e^{-i\delta} = e^{i\cdot 0} = 1$. So we only have to look at the absolute square of the $Y_{l,m}(\theta, \phi)$ and $F_{n,l}(\rho)$ functions here. The $F_{n,l}(\rho)$ function is a real-valued function, so its absolute square is just what it is: some real number (I gave you the formula for the $a_k$ coefficients in my post on it, and you shouldn’t worry about them: they’re real too). In contrast, the $Y_{l,m}(\theta, \phi)$ functions are complex-valued – most of them are, at least. Unsurprisingly, we find the probabilities are also symmetric:

$$P = |-2^{-1/2}\sin\theta\, e^{i\phi}|^2 = (-2^{-1/2}\sin\theta\, e^{i\phi})\cdot(-2^{-1/2}\sin\theta\, e^{-i\phi})$$

$$= (2^{-1/2}\sin\theta\, e^{-i\phi})\cdot(2^{-1/2}\sin\theta\, e^{i\phi}) = |2^{-1/2}\sin\theta\, e^{-i\phi}|^2 = (1/2)\cdot\sin^2\theta$$

Of course, for m = 0, the probability is just cos²θ. The graphs below are the polar graphs for the cos²θ and (1/2)·sin²θ functions respectively.
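These little results are easy to verify numerically. The sketch below hard-codes the three l = 1 angular functions in the sign convention used in this post (cosθ for m = 0, and ∓2^(−1/2)·sinθ·e^(±iφ) for m = ±1). The function name `Y1` and the sample angles are arbitrary choices of mine.

```python
import cmath
import math

def Y1(m, theta, phi):
    """The l = 1 angular functions, in the sign convention quoted in the text."""
    if m == 0:
        return complex(math.cos(theta))
    if m == 1:
        return -math.sin(theta) * cmath.exp(1j * phi) / math.sqrt(2)
    if m == -1:
        return math.sin(theta) * cmath.exp(-1j * phi) / math.sqrt(2)
    raise ValueError("m must be -1, 0 or +1 for l = 1")

theta, phi = 0.9, 2.3  # arbitrary sample angles (radians)

p_plus = abs(Y1(+1, theta, phi)) ** 2
p_minus = abs(Y1(-1, theta, phi)) ** 2
p_zero = abs(Y1(0, theta, phi)) ** 2

# Swapping (theta, phi) for (-theta, -phi) in the m = -1 function gives the m = +1 function
swapped = Y1(-1, -theta, -phi)
print(p_plus, p_minus, p_zero, swapped)
```

Both m = ±1 functions give (1/2)·sin²θ, the m = 0 function gives cos²θ, and none of the three probabilities depends on φ.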

These polar graphs are not so easy to interpret, so let me say a few words about them. The points that are plotted combine (a) some radial distance from the center – which I wrote as P because this distance is, effectively, a probability – with (b) the polar angle θ (so that’s one of the three coordinates). To be precise, the plot gives us, for a given ρ, all of the (θ, P) combinations. It works as follows. To calculate the probability for some ρ and θ (note that φ can be any angle), we must take the absolute square of that $\psi_{n,l,m} = Y_{l,m}(\theta, \phi)\cdot F_{n,l}(\rho)$ product. Hence, we must calculate $|Y_{l,m}(\theta, \phi)\cdot F_{n,l}(\rho)|^2 = |F_{n,l}(\rho)|^2\cdot\cos^2\theta$ for m = 0, and $(1/2)\cdot|F_{n,l}(\rho)|^2\cdot\sin^2\theta$ for m = ±1. Hence, the value of ρ determines the value of $F_{n,l}(\rho)$, and that $F_{n,l}(\rho)$ value then determines the scale of the polar graph. The three graphs below – P = cos²θ, P = (1/2)·cos²θ and P = (1/4)·cos²θ – illustrate the idea. Note that we’re measuring θ from the z-axis here, as we should. So that gives us the right orientation of this volume, as opposed to the other polar graphs above, which measured θ from the x-axis. So… Well… We’re getting there, aren’t we? 🙂

Now you’ll have two or three – or even more – obvious questions. The first one is: where is the third lobe? That’s a good question. Most illustrations will represent the p-orbitals as follows:

Three lobes. Well… Frankly, I am not quite sure here, but the equations speak for themselves: the probabilities only depend on ρ and θ. Hence, the azimuthal angle φ can be anything. So you just need to rotate those P = (1/2)·sin²θ and P = cos²θ curves about the z-axis. In case you wonder how to do that, the illustration below may inspire you.

The second obvious question is about the size of those lobes. That 1/2 factor must surely matter, right? Well… We still have that $F_{n,l}(\rho)$ factor, of course, but you’re right: that factor does not depend on the value for m: it’s the same for m = 0 or ±1. So… Well… Those representations above – with the three lobes, all of the same volume – may not be accurate. I found an interesting site – Atom in a Box – with an app that visualizes the atomic orbitals in a fun and exciting way. Unfortunately, it’s for Mac and iPhone only – but this YouTube video shows how it works. I encourage you to explore it. In fact, I need to explore it – but what I’ve seen on that YouTube video (I have neither a Mac nor an iPhone) suggests the three-lobe illustrations may effectively be wrong: there’s some asymmetry here – which we’d expect, because those p-orbitals are actually supposed to be asymmetric! In fact, the most accurate pictures may well be the ones below. I took them from Wikimedia Commons. The author explains the use of the color codes as follows: “The depicted rigid body is where the probability density exceeds a certain value. The color shows the complex phase of the wavefunction, where blue means real positive, red means imaginary positive, yellow means real negative and green means imaginary negative.” I must assume he refers to the sign of a and b when writing a complex number as a + i·b.

The third obvious question is related to the one above: we should get some cloud, right? Not some rigid body or some surface. Well… I think you can answer that question yourself now, based on what the author of the illustration above wrote: if we change the cut-off value for the probability, then we’ll get a different shape. So you can play with that and, yes, it’s some cloud, and that’s what the mentioned app visualizes. 🙂

The fourth question is the most obvious of all. It’s the question I started this post with: what are those definite energy states? We have uncertainty, right? So how does that play out? Now that is a question I’ll try to tackle in my next post. Stay tuned! 🙂

Post scriptum: Let me add a few remarks here so as to – hopefully – contribute to an even better interpretation of what’s going on here. As mentioned, the key to understanding is, obviously, the following basic functional form:

$$\psi(\mathbf{r}, t) = e^{-i(E/\hbar)t}\cdot\psi(\mathbf{r})$$

Wikipedia refers to the $e^{-i(E/\hbar)t}$ factor as a time-dependent phase factor which, as you can see, we can separate out because we are looking at a definite energy state here. Note the minus sign in the exponent – which reminds us of the minus sign in the exponent of the elementary wavefunction, which we wrote as:

$$a\cdot e^{-i\theta} = a\cdot e^{-i[(E/\hbar)t - (\mathbf{p}/\hbar)\cdot\mathbf{x}]} = a\cdot e^{i[(\mathbf{p}/\hbar)\cdot\mathbf{x} - (E/\hbar)t]} = a\cdot e^{-i(E/\hbar)t}\cdot e^{i(\mathbf{p}/\hbar)\cdot\mathbf{x}}$$

We know this elementary wavefunction is problematic in terms of interpretation because its absolute square gives us some constant probability $P(\mathbf{x}, t) = |a\cdot e^{-i[(E/\hbar)t - (\mathbf{p}/\hbar)\cdot\mathbf{x}]}|^2 = a^2$. In other words, at any point in time, our electron is equally likely to be anywhere in space. That is not consistent with the idea of our electron being somewhere at some point in time.
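Here is a one-dimensional sketch of that statement, again in natural units (ħ = 1, my assumption) and with made-up values for a, E and p. It checks both the split into a time factor and a space factor, and the constant probability a².

```python
import cmath

HBAR = 1.0  # natural units: an assumption for this illustration

def elementary_psi(a, E, p, x, t):
    """One-dimensional elementary wavefunction a*exp(-i*[(E/hbar)*t - (p/hbar)*x])."""
    return a * cmath.exp(-1j * ((E / HBAR) * t - (p / HBAR) * x))

a, E, p = 0.5, 3.0, 1.5  # hypothetical amplitude, energy and momentum
points = [(x, t) for x in (-10.0, 0.0, 4.2) for t in (0.0, 1.0, 7.5)]

# The same function written as a time factor times a space factor
factored = [a * cmath.exp(-1j * (E / HBAR) * t) * cmath.exp(1j * (p / HBAR) * x)
            for x, t in points]
direct = [elementary_psi(a, E, p, x, t) for x, t in points]
probabilities = [abs(z) ** 2 for z in direct]
print(probabilities)
```

Every entry of `probabilities` equals a² = 0.25 up to rounding, wherever and whenever we look.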

The other question is: what reference frame do we use to measure E and p? Indeed, the value of E and $\mathbf{p} = (p_x, p_y, p_z)$ depends on our reference frame: from the electron’s own point of view, it has no momentum whatsoever: p = 0. Fortunately, we do have a point of reference here: the nucleus of our hydrogen atom. And our own position, of course, because you should note, indeed, that both the subject and the object of the observation are necessary to define the Cartesian $\mathbf{x} = (x, y, z)$ – or, more relevant in this context – the polar $\mathbf{r} = (\rho, \theta, \phi)$ coordinates.

This, then, defines some finite or infinite box in space in which the (linear) momentum (p) of our electron vanishes, and then we just need to solve Schrödinger’s diffusion equation to find the solutions for $\psi(\mathbf{r})$. These solutions are more conveniently written in terms of the radial distance ρ, the polar angle θ, and the azimuthal angle φ:

$$\psi_{n,l,m} = Y_{l,m}(\theta, \phi)\cdot F_{n,l}(\rho)$$

The functions below are the $Y_{l,m}(\theta, \phi)$ functions for l = 1.

The interesting thing about these $Y_{l,m}(\theta, \phi)$ functions is the $e^{i\phi}$ and/or $e^{-i\phi}$ factor. Indeed, note the following:

1. Because the sinθ and cosθ factors are real-valued, they only define some envelope for the $\psi(\mathbf{r})$ function.
2. In contrast, the $e^{i\phi}$ and/or $e^{-i\phi}$ factor defines some phase shift.

Let’s have a look at the physicality of the situation, which is depicted below.

The nucleus of our hydrogen atom is at the center. The polar angle is measured from the z-axis, and we know we only have an amplitude there for m = 0, so let’s look at what that cosθ factor does. If θ = 0°, the amplitude is just what it is, but when θ > 0°, then |cosθ| < 1 and, therefore, the probability $P = |F_{n,l}(\rho)|^2\cdot\cos^2\theta$ will diminish. Hence, for the same radial distance (ρ), we are less likely to find the electron at some angle θ > 0° than on the z-axis itself. Now that makes sense, obviously. You can work out the argument for m = ±1 yourself, I hope. [The axis of symmetry will be different, obviously!]

In contrast, the $e^{i\phi}$ and/or $e^{-i\phi}$ factors work very differently. These just give us a phase shift, as illustrated below. A re-set of our zero point for measuring time, so to speak, and the $e^{i\phi}$ and/or $e^{-i\phi}$ factor effectively disappears when we’re calculating probabilities, which is consistent with the fact that this angle clearly doesn’t influence the magnitude of the amplitude fluctuations.

So… Well… That’s it, really. I hope you enjoyed this! 🙂

# Some more on symmetries…

In our previous post, we talked a lot about symmetries in space – in a rather playful way. Let’s try to take it further here by doing some more thinking on symmetries in spacetime. This post will pick up some older stuff – from my posts on states and the related quantum math in November 2015, for example – but that shouldn’t trouble you too much. On the contrary, I actually hope to tie up some loose ends here.

Let’s first review some obvious ideas. Think about the direction of time. On a time axis, time goes from left to right. It will usually be measured from some zero point – like when we started our experiment or something 🙂 – to some +t point but we may also think of some point in time before our zero point, so the minus (−t) points – the left side of the axis – make sense as well. So the direction of time is clear and intuitive. Now, what does it mean to reverse the direction of time? We need to distinguish two things here: the convention, and… Well… Reality. If we would suddenly decide to reverse the direction in which we measure time, then that’s just another convention. We don’t change reality: trees and kids would still grow the way they always did. 🙂 We would just have to change the numbers on our clocks or, alternatively, the direction of rotation of the hand(s) of our clock, as shown below. [I only showed the hour hand because… Well… I don’t want to complicate things by introducing two time units. But adding the minute hand doesn’t make any difference.]

Now, imagine you’re the dictator who decided to change our time measuring convention. How would you go about it? Would you change the numbers on the clock or the direction of rotation? Personally, I’d be in favor of changing the direction of rotation. Why? Well… First, we wouldn’t have to change expressions such as: “If you are looking north right now, then west is in the 9 o’clock direction, so go there.” 🙂 More importantly, it would align our clocks with the way we’re measuring angles. On the other hand, it would not align our clocks with the way the argument (θ) of our elementary wavefunction $\psi = a\cdot e^{-i\theta} = a\cdot e^{-i(E\cdot t - p\cdot x)/\hbar}$ is measured, because that’s… Well… Clockwise.

So… What are the implications here? We would need to change t for −t in our wavefunction as well, right? Yep. Good point. So that’s another convention that would change: we should write our elementary wavefunction now as $\psi = a\cdot e^{i(E\cdot t - p\cdot x)/\hbar}$. So we would have to re-define θ as θ = (−E·t + p·x)/ħ = (p·x − E·t)/ħ. So… Well… Done!
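A small check of what this change of convention does, in one dimension and in natural units (ħ = 1). Two things in this sketch are my own reading rather than something spelled out above: the new wavefunction is simply the complex conjugate of the old one, and it is what you get from the old formula when both t and p change sign (a momentum, being a derivative with respect to time, flips along with t).

```python
import cmath

HBAR = 1.0  # natural units: an assumption for this illustration

def psi_old(a, E, p, x, t):
    """Original convention: a*exp(-i*(E*t - p*x)/hbar)."""
    return a * cmath.exp(-1j * (E * t - p * x) / HBAR)

def psi_new(a, E, p, x, t):
    """Reversed-time convention: a*exp(+i*(E*t - p*x)/hbar)."""
    return a * cmath.exp(1j * (E * t - p * x) / HBAR)

a, E, p, x, t = 0.8, 2.0, 0.6, 1.1, 0.4  # hypothetical sample values

old = psi_old(a, E, p, x, t)
new = psi_new(a, E, p, x, t)
flipped = psi_old(a, E, -p, x, -t)  # old formula with t -> -t and p -> -p
print(old, new, flipped)
```

The probabilities, i.e. the absolute squares, are the same in both conventions, which is consistent with the claim that we are only re-labeling things here.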

So… Well… What’s next? Nothing. Note that we’re not changing reality here. We’re just adapting our formulas to a new dictatorial convention according to which we should count time from positive to negative – like 2, 1, 0, −1, −2 etcetera, as shown below. Fortunately, we can fix all of our laws and formulas in physics by swapping t for −t. So that’s great. No sweat.

Is that all? Yes. We don’t need to do anything else. We’ll still measure the argument of our wavefunction as an angle, so that’s… Well… After changing our convention, it’s now clockwise. 🙂 Whatever you want to call it: it’s still the same direction. Our dictator can’t change physical reality. 🙂

Hmm… But so we are obviously interested in changing physical reality. I mean… Anyone can become a dictator, right? In contrast, we – enlightened scientists – want to really change the world, don’t we? 🙂 So what’s a time reversal in reality? Well… I don’t know… You tell me. 🙂 We may imagine some movie being played backwards, or trees and kids shrinking instead of growing, or some bird flying backwards – and I am not talking about the hummingbird here. 🙂

Hey! The latter illustration – that bird flying backwards – is probably the better one: if we reverse the direction of time – in reality, that is – then we should also reverse all directions in space. But… Well… What does that mean, really? We need to think in terms of force fields here. A stone that’d be falling must now go back up. Two opposite charges that were going towards each other should now move away from each other. But… My God! Such a world cannot exist, can it?

No. It cannot. And we don’t need to invoke the second law of thermodynamics for that. 🙂 None of what happens in a movie that’s played backwards makes sense: a heavy stone does not suddenly fly up and decelerate upwards. So it is not like the anti-matter world we described in our previous post. No. We can effectively imagine some world in which all charges have been replaced by their opposite: we’d have positive electrons (positrons) around negatively charged nuclei consisting of antiprotons and antineutrons and, somehow, negative masses. But Coulomb’s law would still tell us two opposite charges – $q_1$ and $-q_2$, for example – don’t repel but attract each other, with a force that’s proportional to the product of their charges, i.e. $q_1\cdot(-q_2) = -q_1\cdot q_2$. Likewise, Newton’s law of gravitation would still tell us that two masses $m_1$ and $m_2$ – negative or positive – will attract each other with a force that’s proportional to the product of their masses, i.e. $m_1\cdot m_2 = (-m_1)\cdot(-m_2)$. If you’d make a movie in the antimatter world, it would look just like any other movie. It would definitely not look like a movie being played backwards.

In fact, the latter formula – $m_1\cdot m_2 = (-m_1)\cdot(-m_2)$ – tells us why: we’re not changing anything by putting a minus sign in front of all of our variables, which are time (t), position (x), mass (m) and charge (q). [Did I forget one? I don’t think so.] Hence, the famous CPT Theorem – which tells us that a world in which (1) time is reversed, (2) all charges have been conjugated (i.e. all particles have been replaced by their antiparticles), and (3) all spatial coordinates now have the opposite sign, is entirely possible (because it would obey the same Laws of Nature that we, in our world, have discovered over the past few hundred years) – is actually nothing but a tautology. Now, I mean that literally: a tautology is a statement that is true by necessity or by virtue of its logical form. Well… That’s the case here: if we flip the signs of all of our variables, we basically just agreed to count or measure everything from positive to negative. That’s it. Full stop. Such an exotic convention is… Well… Exotic, but it cannot change the real world. Full stop.

Of course, this leaves the more intriguing questions entirely open. Partial symmetries. Like time reversal only. 🙂 Or charge conjugation only. 🙂 So let’s think about that.

We know that the world that we see in a mirror must be made of anti-matter but, apart from that particularity, that world makes sense: if we drop a stone in front of the mirror, the stone in the mirror will drop down too. Two like charges will be seen as repelling each other in the mirror too, and concepts such as kinetic or potential energy look just the same. So time just seems to tick away in both worlds – no time reversal here! – and… Well… We’ve got two CP-symmetrical worlds here, don’t we? We only flipped the sign of the coordinate frame and of the charges. Both are possible, right? And what’s possible must exist, right? Well… Maybe. That’s the next step. Let’s first see if both are possible. 🙂

Now, when you’ve read my previous post, you’ll note that I did not flip the z-coordinate when reflecting my world in the mirror. That’s true. But… Well… That’s entirely beside the point. We could flip the z-axis too and so then we’d have a full parity inversion. [Or parity transformation – sounds more serious, doesn’t it? But it’s only a simple inversion, really.] It really doesn’t matter. The point is: axial vectors have the opposite sign in the mirror world, and so it’s not only about whether or not an antimatter world is possible (it should be, right?): it’s about whether or not the sign reversal of all of those axial vectors makes sense in each and every situation. The illustration below, for example, shows how a left-handed neutrino should be a right-handed antineutrino in the mirror world.

I hope you understand the left- versus right-handed thing. Think, for example, of what the left-circularly polarized wavefunction below would look like in the mirror. Just apply the customary right-hand rule to determine the direction of the angular momentum vector. You’ll agree it will be right-circularly polarized in the mirror, right? That’s why we need the charge conjugation: think of the magnetic moment of a circulating charge! So… Well… I can’t dwell on this too much but – if Maxwell’s equations are to hold – then that world in the mirror must be made of antimatter.
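The statement about axial vectors can be checked with a bare-bones example: take a position vector r and a momentum vector p, reflect both in a mirror, and compare the angular momentum L = r × p computed in the mirror world with the naive mirror image of the original L. The choice of the yz-plane as the mirror and the sample numbers are arbitrary choices of mine.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def mirror(v):
    """Reflect an ordinary (polar) vector in the yz-plane: x -> -x."""
    return (-v[0], v[1], v[2])

r = (1.0, 2.0, 3.0)   # hypothetical position
p = (0.5, -1.0, 2.0)  # hypothetical momentum

L = cross(r, p)                          # angular momentum in our world
L_mirror = cross(mirror(r), mirror(p))   # computed from the reflected r and p
L_naive = mirror(L)                      # what we'd get if L reflected like r or p
print(L, L_mirror, L_naive)
```

The angular momentum built from the reflected vectors is exactly the opposite of the naive mirror image of L: that is the sign reversal of axial vectors in the mirror world.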

Now, we know that some processes – in our world – are not entirely CP-symmetrical. I wrote about this at length in previous posts, so I won’t dwell on these experiments here. The point is: these experiments – which are not easy to understand – lead physicists, philosophers, bloggers and what have you to solemnly state that the world in the mirror cannot really exist. And… Well… They’re right. However, I think their observations are beside the point. Literally.

So… Well… I would just like to make a very fundamental philosophical remark about all those discussions. My point is quite simple:

We should realize that the mirror world and our world are effectively separated by the mirror. So we should not be looking at stuff in the mirror from our perspective, because that perspective is well… Outside of the mirror. A different world. 🙂 In my humble opinion, the valid point of reference would be the observer in the mirror, like the photographer in the image below. Now note the following: if the real photographer, on this side of the mirror, would have a left-circularly polarized beam in front of him, then the imaginary photographer, on the other side of the mirror, would see the mirror image of this left-circularly polarized beam as a left-circularly polarized beam too. 🙂 I know that sounds complicated but re-read it a couple of times and – I hope – you’ll see the point. If you don’t… Well… Let me try to rephrase it: the point is that the observer in the mirror would be seeing our world – just the same laws and what have you, all makes sense! – but he would see our world in his world, so he’d see it in the mirror world. 🙂

Capito? If you would actually be living in the mirror world, then all the things you would see in the mirror world would make perfect sense. But you would be living in the mirror world. You would not look at it from outside, i.e. from the other side of the mirror. In short, I actually think the mirror world does exist – but in the mirror only. 🙂 […] I am, obviously, joking here. Let me be explicit: our world is our world, and I think those CP violations in Nature are telling us that it’s the only real world. The other worlds exist in our mind only – or in some mirror. 🙂

Post scriptum: I know theĀ Die HardĀ philosophers among you will now have an immediate rapid-backfire question. [Hey – I just invented a new word, didn’t I? AĀ rapid-backfireĀ question. Neat.] How would the photographerĀ inĀ the mirror look atĀ ourĀ world? The answer to that question is simple: symmetry! He (or she) would think it’s a mirror world only.Ā HisĀ world andĀ ourĀ world would be separated by the same mirror. So… What are the implications here?

Well… That mirror is only a piece of glass with a coating. We made it. Or… Well… Some man-made company made it. :-) So… Well… If you think that observer in the mirror – I am talking about that image of the photographer in that picture above now – would actually exist, then… Well… Then you need to be aware of the consequences: the corollary of his existence is that you do not exist. :-) And… Well… No. I won’t say more. If you’re reading stuff like this, then you’re smart enough to figure it out for yourself. We live in one world. Quantum mechanics tells us the perspective on that world matters very much – amplitudes are different in different reference frames – but… Well… Quantum mechanics – or physics in general – does not give us many degrees of freedom. None, really. It basically tells us the world we live in is the only world that’s possible, really. But… Then… Well… That’s just because physics… Well… When everything is said and done, it’s just mankind’s drive to ensure our perception of the Universe lines up with… Well… What we perceive it to be. :-( or :-) Whatever your appreciation of it. Those Great Minds did an incredible job. :-)

# Symmetries and transformations

In my previous post, I promised to do something on symmetries. Something simple but then… Well… You know how it goes: one question always triggers another one. :-)

Look at the situation in the illustration on the left below. We suppose we have something real going on there: something is moving from left to right (so that’s in the 3 o’clock direction), and then something else is going around clockwise. That’s not the direction in which we measure angles (including the argument θ of our wavefunction), because that’s always counter-clockwise, as I note at the bottom of the illustration. To be precise, we should note that the angular momentum here is all about the y-axis, so the angular momentum vector L points in the (positive) y-direction. We get that direction from the familiar right-hand rule, which is illustrated in the top right corner.

Now, suppose someone else is looking at this from the other side – or just think of yourself going around a full 180° to look at the same thing from the back side. You’ll agree you’ll see the same thing going from right to left (so that’s in the 9 o’clock direction now – or, if our clock is transparent, the 3 o’clock direction of our reversed clock). Likewise, the thing that’s turning around will now go counter-clockwise.

Note that both observers – so that’s me and that other person (or myself after my walk around this whole thing) – use a regular coordinate system, which implies the following:

1. We’ve got regular 90° angles between our coordinate axes.
2. Our x-axis goes from negative to positive from left to right, and our y-axis does the same going away from us.
3. We also both define our z-axis using, once again, the ubiquitous right-hand rule, so our z-axis points upwards.

So we have two observers looking at the same reality – some linear as well as some angular momentum – but from opposite sides. And so we’ve got a reversal of both the linear as well as the angular momentum. Not in reality, of course, because we’re looking at the same thing. But we measure it differently. Indeed, if we use the subscripts 1 and 2 to denote the measurements in the two coordinate systems, we find that $\mathbf{p}_2 = -\mathbf{p}_1$. Likewise, we also find that $\mathbf{L}_2 = -\mathbf{L}_1$.

Now, when you see these two equations, you will probably not worry about that $\mathbf{p}_2 = -\mathbf{p}_1$ equation – although you should, because it’s actually only valid for this rather particular orientation of the linear momentum (I’ll come back to that in a moment). It’s the $\mathbf{L}_2 = -\mathbf{L}_1$ equation which should surprise you most. Why? Because you’ve always been told there is a big difference between (1) real vectors (aka polar vectors), like the momentum p, or the velocity v, or the force F, and (2) pseudo-vectors (aka axial vectors), like the angular momentum L. You may also remember how to distinguish between the two: if you change the direction of the axes of your reference frame, polar vectors will change sign too, as opposed to axial vectors: axial vectors do not swap sign if we swap the coordinate signs.

So… Well… How does that work here? In fact, what we should ask ourselves is: why does that not work here? Well… It’s simple, really. We’re not changing the direction of all of the axes here. Or… Well… Let me be more precise: we’re only swapping the sign of the x- and y-axis. We did not flip the z-axis. So we turned things around, but we didn’t turn them upside down. It makes a huge difference. Note, for example, that if all of the linear momentum would have been in the z-direction only (so our p vector would have been pointing in the z-direction, and in the z-direction only), it would not swap sign. The illustration below shows what really happens with the coordinates of some vector when we’re doing a rotation. It’s, effectively, only the x- and y-coordinates that flip sign.

It’s easy to see that this rotation about the z-axis here preserves our deep sense of ‘up’ versus ‘down’, but that it swaps ‘left’ for ‘right’, and vice versa. Note that this is not a reflection. We are not looking at some mirror world here. The difference between a reflection (a mirror world) and a rotation (the real world seen from another angle) is illustrated below. It’s quite confusing but, unlike what you might think, a reflection does not swap left for right. It does turn things inside out, but that’s what a rotation does as well: near becomes far, and far becomes near.

Before we move on, let me say a few things about the mirror world and, more in particular, about the obvious question: could it possibly exist? Well… What do you think? Your first reaction might well be: “Of course! What a nonsense question! We just walk around whatever it is that we’re seeing – or, what amounts to the same, we just turn it around – and there it is: that’s the mirror world, right? So of course it exists!” Well… No. That’s not the mirror world. That’s just the real world seen from the opposite direction, and that world… Well… That’s just the real world. :-) The mirror world is, literally, the world in the mirror – like the photographer in the illustration below. We don’t swap left for right here: some object going from left to right in the real world is still going from left to right in the mirror world!

Of course, you may now bring in the photographer in the picture above and observe – note that you’re now an observer of the observer of the mirror :-) – that, if he would move his left arm in the real world, the photographer in the mirror world would be moving his right arm. But… Well… No. You’re saying that because you’re now imagining that you’re the photographer in the mirror world yourself, who’s looking at the real world from inside, so to speak. So you’ve rotated the perspective in your mind and you’re saying it’s his right arm because you imagine yourself to be the photographer in the mirror. We usually do that because… Well… Because we look in a mirror every day, right? So we’re used to seeing ourselves that way and we always think it’s us we’re seeing. :-) However, the illustration above is correct: the mirror world only swaps near for far, and far for near, so it only swaps the sign of the y-axis.

So the question is relevant: could the mirror world actually exist? What we’re really asking here is the following: can we swap the sign of one coordinate axis only in all of our physical laws and equations and… Well… Do we then still get the same laws and equations? Do we get the same Universe – because that’s what those laws and equations describe? If so, our mirror world can exist. If not, then not.

Now, I’ve done a post on that, in which I explain that the mirror world can only exist if it would consist of anti-matter. So if our real world and the mirror world would actually meet, they would annihilate each other. :-) But that post is quite technical. Here I want to keep it very simple: I basically only want to show what the rotation operation implies for the wavefunction. There is no doubt whatsoever that the rotated world exists. In fact, the rotated world is just our world. We walk around some object, or we turn it around, but we’re still watching the same object. So we’re not thinking about the mirror world here. We just want to know what things look like when adopting some other perspective.

So, back to the starting point: we just have two observers here, who look at the same thing but from opposite directions. Mathematically, this corresponds to a rotation of our reference frame about the z-axis of 180°. Let me spell out – somewhat more precisely – what happens to the linear and angular momentum here:

1. The linear momentum in the xy-plane swaps direction.
2. The angular momentum about the y-axis, as well as about the x-axis, swaps direction too.

Note that the illustration only shows angular momentum about the y-axis, but you can easily verify the statement about the angular momentum about the x-axis. In fact, the angular momentum about any line in the xy-plane will swap direction.
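If you’d rather check this numerically than by staring at an illustration, here is a minimal sketch in plain Python. The numbers for the position and the momentum are hypothetical: I only chose them so that p points along +x and L = r×p points along +y, as in the situation described above. The helper names (rotate_z, cross, rounded) are mine too.

```python
import math

def rotate_z(v, angle):
    """Rotate the vector v = (x, y, z) about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)

def cross(a, b):
    """Vector cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rounded(v, digits=9):
    """Round away floating-point dust such as 1e-16."""
    return tuple(round(c, digits) + 0.0 for c in v)

# Hypothetical numbers: position one unit up the z-axis, momentum along +x.
r1 = (0.0, 0.0, 1.0)
p1 = (2.0, 0.0, 0.0)
L1 = cross(r1, p1)          # angular momentum, points along +y

# Components as measured by the second observer: 180 degrees about z.
r2 = rotate_z(r1, math.pi)
p2 = rotate_z(p1, math.pi)
L2 = cross(r2, p2)

print("observer 1:", rounded(p1), rounded(L1))
print("observer 2:", rounded(p2), rounded(L2))

# A momentum along z only is left alone by this rotation.
pz = (0.0, 0.0, 2.0)
print("z-only momentum:", rounded(rotate_z(pz, math.pi)))
```

Both the linear momentum in the xy-plane and the angular momentum about the y-axis come out with the opposite sign, while anything along the z-axis is untouched.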

Of course, the x-, y-, z-axes in the other reference frame are different than mine, and so I should give them a subscript, right? Or, at the very least, write something like x’, y’, z’, so we have a primed reference frame here, right? Well… Maybe. Maybe not. Think about it. :-) A coordinate system is just a mathematical thing… Only the momentum is real… Linear or angular… Equally real… And then Nature doesn’t care about our position, does it? So… Well… No subscript needed, right? Or… Well… What do you think? :-)

It’s just funny, isn’t it? It looks like we can’t really separate reality and perception here. Indeed, note how our $\mathbf{p}_2 = -\mathbf{p}_1$ and $\mathbf{L}_2 = -\mathbf{L}_1$ equations already mix reality with how we perceive it. It’s the same thing in reality but the coordinates of $\mathbf{p}_1$ and $\mathbf{L}_1$ are positive, while the coordinates of $\mathbf{p}_2$ and $\mathbf{L}_2$ are negative. To be precise, these coordinates will look like this:

1. $\mathbf{p}_1 = (p, 0, 0)$ and $\mathbf{L}_1 = (0, L, 0)$
2. $\mathbf{p}_2 = (-p, 0, 0)$ and $\mathbf{L}_2 = (0, -L, 0)$

So are they two different things or are they not? :-) Think about it. I’ll move on in the meanwhile. :-)

Now, you probably know a thing or two about parity symmetry, or P-symmetry: if we flip the sign of all coordinates, then we’ll still find the same physical laws, like F = m·a and what have you. [It works for all physical laws, including quantum-mechanical laws – except those involving the weak force (read: radioactive decay processes).] But so here we are talking rotational symmetry. That’s not the same as P-symmetry. If we flip the signs of all coordinates, we’re also swapping ‘up’ for ‘down’, so we’re not only turning around, but we’re also getting upside down. The difference between rotational symmetry and P-symmetry is shown below.
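To see the difference between the two operations in numbers, here is a small sketch in plain Python. It uses the same kind of made-up vectors as before (a hypothetical r along +z and p along +x), builds L = r×p, and then compares a full parity flip with a 180° rotation about the z-axis. The function names are just illustrative.

```python
def cross(a, b):
    """Vector cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def flip_all(v):
    """Parity: flip the sign of all three coordinates."""
    return tuple(-c for c in v)

def rotate_z_180(v):
    """Rotation over 180 degrees about z: only x and y flip sign."""
    x, y, z = v
    return (-x, -y, z)

# Hypothetical polar vectors (integers keep the arithmetic exact).
r = (0, 0, 1)
p = (2, 0, 0)
L = cross(r, p)                                   # axial vector

# Parity: the polar vector p flips, the axial vector L does not.
p_parity = flip_all(p)
L_parity = cross(flip_all(r), flip_all(p))

# Rotation about z: both p and L (which lie in the xy-plane) flip.
p_rot = rotate_z_180(p)
L_rot = cross(rotate_z_180(r), rotate_z_180(p))

print("original:", p, L)
print("parity  :", p_parity, L_parity)
print("rotation:", p_rot, L_rot)
```

So the textbook rule (polar vectors flip, axial vectors don’t) is a statement about flipping all three axes, not about walking around to the back side.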

As mentioned, we’ve talked about P-symmetry at length in other posts, and you can easily google a lot more on that. The question we want to examine here – just as a fun exercise – is the following:

How does that rotational symmetry work for a wavefunction?

The very first illustration in this post gave you the functional form of the elementary wavefunction $e^{i\theta} = e^{i(E\cdot t - p\cdot x)/\hbar}$. We should actually use a bold-type $\mathbf{x} = (x, y, z)$ in this formula but we’ll assume we’re talking something similar to that p vector: something moving in the x-direction only – or in the xy-plane only. The z-component doesn’t change. Now, you know that we can reduce all actual wavefunctions to some linear combination of such elementary wavefunctions by doing a Fourier decomposition, so it’s fine to look at the elementary wavefunction only – so we don’t make it too complicated here. Now think of the following.

The energy E in the $e^{i\theta} = e^{i(E\cdot t - p\cdot x)/\hbar}$ function is a scalar, so it doesn’t have any direction and we’ll measure it the same from both sides – as kinetic or potential energy or, more likely, by adding both. But… Well… Writing $e^{i(E\cdot t - p\cdot x)/\hbar}$ or $e^{i(E\cdot t + p\cdot x)/\hbar}$ is not the same, right? No, it’s not. However, think of it as follows: we won’t be changing the direction of time, right? So it’s OK to not change the sign of E. In fact, we can re-write the two expressions as follows:

1. $e^{i(E\cdot t - p\cdot x)/\hbar} = e^{i(E/\hbar)\cdot t}\cdot e^{-i(p/\hbar)\cdot x}$
2. $e^{i(E\cdot t + p\cdot x)/\hbar} = e^{i(E/\hbar)\cdot t}\cdot e^{i(p/\hbar)\cdot x}$

The first wavefunction describes some particle going in the positive x-direction, while the second wavefunction describes some particle going in the negative x-direction, so… Well… That’s exactly what we see in those two reference frames, so there is no issue whatsoever. :-) It’s just… Well… I just wanted to show the wavefunction does look different too when looking at something from another angle.
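A quick numerical sanity check may help here. The sketch below is plain Python and assumes natural units (ħ = 1) and some arbitrary, made-up values for E, p, x and t. It writes down the two functional forms and checks that the second observer, who also measures the position with the opposite sign, gets the very same complex number for the same point in spacetime.

```python
import cmath

HBAR = 1.0          # natural units: an assumption, for illustration only
E, p = 2.0, 0.5     # hypothetical energy and momentum

def psi_1(x, t):
    """First observer: particle moving in the positive x-direction."""
    return cmath.exp(1j * (E * t - p * x) / HBAR)

def psi_2(x, t):
    """Second observer: same particle, seen moving in the negative x-direction."""
    return cmath.exp(1j * (E * t + p * x) / HBAR)

x, t = 0.7, 1.3     # an arbitrary event, in the first observer's coordinates

# The second observer labels the same point as -x.
same_event = abs(psi_1(x, t) - psi_2(-x, t)) < 1e-12

# At the same *label* x, the two functions differ: their spatial factors
# exp(-i p x / hbar) and exp(+i p x / hbar) are complex conjugates.
spatial_1 = cmath.exp(-1j * p * x / HBAR)
spatial_2 = cmath.exp(+1j * p * x / HBAR)
conjugates = abs(spatial_1.conjugate() - spatial_2) < 1e-12

print(same_event, conjugates)
```

So the functional form changes with the reference frame, but the two forms agree on what happens at any given point.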

So why am I writing about this? Why am I being fussy? Well… It’s just to show you that those transformations are actually quite natural – just as natural as it is to see some particle go in one direction in one reference frame and see it go in the other in the other. :-) It also illustrates another point that I’ve been trying to make: the wavefunction is something real. It’s not just a figment of our imagination. The real and imaginary part of our wavefunction have a precise geometrical meaning – and I explained what that might be in my more speculative posts, which I’ve brought together in the Deep Blue page of this blog. But… Well… I can’t dwell on that here because… Well… You should read that page. :-)

The point to note is the following: we do have different wavefunctions in different reference frames, but these wavefunctions describe the same physical reality, and they also do respect the symmetries we’d expect them to respect, except… Well… The laws describing the weak force don’t, but I wrote about that a very long time ago, and it was not in the context of trying to explain the relatively simple basic laws of quantum mechanics. :-) If you’re interested, you should check out my post(s) on that or, else, just google a bit. It’s really exciting stuff, but not something that will help you much to understand the basics, which is what we’re trying to do here. :-)

The second point to note is that those transformations of the wavefunction – or of quantum-mechanical states – which we go through when rotating our reference frame, for example – are really quite natural. There’s nothing special about them. We had such transformations in classical mechanics too! But… Well… Yes, I admit they do look complicated. But then that’s why you’re so fascinated and why you’re reading this blog, isn’t it? :-)

Post scriptum: It’s probably useful to be somewhat more precise on all of this. You’ll remember we visualized the wavefunction in some of our posts using the animation below. It uses a left-handed coordinate system, which is rather unusual but then it may have been made with software that uses a left-handed coordinate system (like RenderMan, for example). Now the rotating arrow at the center moves with time and gives us the polarization of our wave. Applying our customary right-hand rule, you can see this beam is left-circularly polarized. [I know… It’s quite confusing, but just go through the motions here and be consistent.]

Now, you know that $e^{-i(p/\hbar)\cdot x}$ and $e^{i(p/\hbar)\cdot x}$ are each other’s complex conjugate:

1. $e^{i\cdot k\cdot x} = \cos(k\cdot x) + i\cdot\sin(k\cdot x)$
2. $e^{-i\cdot k\cdot x} = \cos(-k\cdot x) + i\cdot\sin(-k\cdot x) = \cos(k\cdot x) - i\cdot\sin(k\cdot x)$

Their real part – the cosine function – is the same, but the imaginary part – the sine function – has the opposite sign. So, assuming the direction of propagation is, effectively, the x-direction, then what’s the polarization of the mirror image? Well… The wave will now go from right to left, and its polarization… Hmm… Well… What?

Well… If you can’t figure it out, then just forget about those signs and just imagine you’re effectively looking at the same thing from the backside. In fact, if you have a laptop, you can push the screen down and go around your computer. :-) There’s no shame in that. In fact, I did that just to make sure I am not talking nonsense here. :-) If you look at this beam from the backside, you’ll effectively see it go from right to left – instead of from what you see on this side, which is a left-to-right direction. And as for its polarization… Well… The angular momentum vector swaps direction too but the beam is still left-circularly polarized. So… Well… That’s consistent with what we wrote above. :-) The real world is real, and axial vectors are as real as polar vectors. This real beam will only appear to be right-circularly polarized in a mirror. Now, as mentioned above, that mirror world is not our world. If it would exist – in some other Universe – then it would be made up of anti-matter. :-)

So… Well… Might it actually exist? Is there some other world made of anti-matter out there? I don’t know. We need to think about that reversal of ‘near’ and ‘far’ too: as mentioned, a mirror turns things inside out, so to speak. So what’s the implication of that? When we walk around something – or do a rotation – then the reversal between ‘near’ and ‘far’ is something physical: we go near to what was far, and we go away from what was near. But so how would we get into our mirror world, so to speak? We may say that this anti-matter world in the mirror is entirely possible, but then how would we get there? We’d need to turn ourselves, literally, inside out – like sort of shrink to the zero point and then come back out of it to do that parity inversion along our line of sight. So… Well… I don’t see that happen, which is why I am a fan of the One World hypothesis. :-) So I think the mirror world is just what it is: the mirror world. Nothing real. But… Then… Well… What do you think? :-)

# Quantum-mechanical magnitudes

As I was writing about those rotations in my previous post (on electron orbitals), I suddenly felt I should do some more thinking on (1) symmetries and (2) the concept of quantum-mechanical magnitudes of vectors. I’ll write about the first topic (symmetries) in some other post. Let’s first tackle the latter concept. Oh… And for those I frightened with my last post… Well… This should really be an easy read. More of a short philosophical reflection about quantum mechanics. Not a technical thing. Something intuitive. At least I hope it will come out that way. :-)

First, you should note that the fundamental idea that quantities like energy, or momentum, may be quantized is a very natural one. In fact, it’s what the early Greek philosophers thought about Nature. Of course, while the idea of quantization comes naturally to us (I think it’s easier to understand than, say, the idea of infinity), it is, perhaps, not so easy to deal with it mathematically. Indeed, most mathematical ideas – like functions and derivatives – are based on what I’ll loosely refer to as continuum theory. So… Yes, quantization does yield some surprising results, like that formula for the magnitude of some vector J:

- Classical: $J = +\sqrt{\mathbf{J}\cdot\mathbf{J}} = +\sqrt{J_x^2 + J_y^2 + J_z^2}$
- Quantum-mechanical: $J = +\sqrt{j\cdot(j+1)}$

The J·J in the classical formula above is, of course, the equally classical vector dot product, and the formula itself is nothing but Pythagoras’ Theorem in three dimensions. Easy. I just put a + sign in front of the square roots so as to remind you we actually always have two square roots and that we should take the positive one. :-)

I will now show you how we get that quantum-mechanical formula. The logic behind it is fairly straightforward but, at the same time… Well… You’ll see. :-) We know that a quantum-mechanical variable – like the spin of an electron, or the angular momentum of an atom – is not continuous but discrete: it will have some value m = j, j−1, j−2, …, −(j−2), −(j−1), −j. Our j here is the maximum value of the magnitude of the component of our vector (J) in the direction of measurement, which – as you know – is usually written as $J_z$. Why? Because we will usually choose our coordinate system such that our z-axis is aligned accordingly. :-) Those values j, j−1, j−2, …, −(j−2), −(j−1), −j are separated by one unit. That unit would be Planck’s quantum of action ħ ≈ 1.0545718×10⁻³⁴ N·m·s – by the way, isn’t it amazing we can actually measure such tiny stuff in some experiment? :-) – if J would happen to be the angular momentum, but the approach here is more general – action can express itself in various ways :-) – so the unit doesn’t matter: it’s just the unit, so that’s just one. :-) It’s easy to see that this separation implies j must be some integer or half-integer. [Of course, now you might think the values of a series like 2.4, 1.4, 0.4, −0.6, −1.6 are also separated by one unit, but… Well… That would violate the most basic symmetry requirement so… Well… No. Our j has to be an integer or a half-integer. Please also note that the number of possible values for m is equal to 2j+1, as we’ll use that in a moment.]

OK. You’re familiar with this by now and so I should not repeat the obvious. To make things somewhat more real, let’s assume j = 3/2, so m = +3/2, +1/2, −1/2 or −3/2. Now, we don’t know anything about the system and, therefore, these four values are all equally likely. Now, you may not agree with this assumption but… Well… You’ll have to agree that, at this point, you can’t come up with anything else that would make sense, right? It’s just like a classical situation: J might point in any direction, so we have to give all angles an equal probability. [In fact, I’ll show you – in a minute or so – that you actually have a point here: we should think some more about this assumption – but so that’s for later. I am asking you to just go along with this story as for now.]

So the expected value of $J_z$ is equal to $E[J_z] = (1/4)\cdot(3/2) + (1/4)\cdot(1/2) + (1/4)\cdot(-1/2) + (1/4)\cdot(-3/2) = 0$. Nothing new here. We just multiply probabilities with all of the possible values to get an expected value. So we get zero here because our values are distributed symmetrically around the zero point. No surprise. Now, to calculate a magnitude, we don’t need $J_z$ but $J_z^2$. In case you wonder, that’s what this squaring business is all about: we’re abstracting away from the direction and so we’re going to square both positive as well as negative values to then add it all up and take a square root. Now, the expected value of $J_z^2$ is equal to $E[J_z^2] = (1/4)\cdot(3/2)^2 + (1/4)\cdot(1/2)^2 + (1/4)\cdot(-1/2)^2 + (1/4)\cdot(-3/2)^2 = 5/4 = 1.25$. Some positive value.
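For those who like to see the arithmetic spelled out, here is a minimal sketch in Python. It uses exact fractions, assumes (as the text does) that the 2j+1 values of m are equally likely, and works in units where the spacing between the values is one. The helper name m_values is mine.

```python
from fractions import Fraction

def m_values(j):
    """Return the list m = j, j-1, ..., -j for an integer or half-integer j."""
    j = Fraction(j)
    count = int(2 * j) + 1              # there are 2j+1 possible values
    return [j - k for k in range(count)]

j = Fraction(3, 2)
ms = m_values(j)                        # [3/2, 1/2, -1/2, -3/2]
prob = Fraction(1, len(ms))             # equal probabilities: 1/4 each

E_Jz = sum(prob * m for m in ms)        # expected value of Jz
E_Jz2 = sum(prob * m * m for m in ms)   # expected value of Jz squared
E_abs = sum(prob * abs(m) for m in ms)  # average absolute value

print(ms)
print("E[Jz]   =", E_Jz)
print("E[Jz^2] =", E_Jz2)
print("E[|Jz|] =", E_abs)
```

The squared average comes out a bit larger than the square of the average absolute value, because squaring gives more weight to the larger values.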

You may note that it’s a bit larger than the average of the absolute value of our variable, which is equal to (|3/2|+|1/2|+|−1/2|+|−3/2|)/4 = 1, but that’s just because the squaring favors larger values. :-) Also note that, of course, we’d also get some positive value if $J_z$ would be a continuous variable over the [−3/2, +3/2] interval, but I’ll let you think about what positive value we’d get for $J_z^2$ assuming $J_z$ is uniformly distributed over the [−3/2, +3/2] interval, because that calculation is actually not so straightforward as it may seem at first. In any case, these considerations are not very relevant to our story here, so let’s move on.

Of course, our z-direction was random, and so we get the same thing for whatever direction. More in particular, we’ll also get it for the x- and y-directions: $E[J_x^2] = E[J_y^2] = E[J_z^2] = 5/4$. Now, at this point it’s probably good to give you a more generalized formula for these quantities. I think you’ll easily agree to the following one:

$E[J_x^2] = E[J_y^2] = E[J_z^2] = \frac{1}{2j+1}\sum_{m=-j}^{j} m^2$

So now we can apply our classical $\mathbf{J}\cdot\mathbf{J} = J_x^2 + J_y^2 + J_z^2$ formula to these quantities by calculating the expected value of $J^2 = \mathbf{J}\cdot\mathbf{J}$, which is equal to:

$E[\mathbf{J}\cdot\mathbf{J}] = E[J_x^2] + E[J_y^2] + E[J_z^2] = 3\cdot E[J_x^2] = 3\cdot E[J_y^2] = 3\cdot E[J_z^2]$

You should note we’re making use of the E[X + Y] = E[X] + E[Y] property here: the expected value of the sum of two variables is equal to the sum of the expected values of the variables, and you should also note this is true even if the individual variables would happen to be correlated – which might or might not be the case. [What do you think is the case here?]

For j = 3/2, it’s easy to see we get $E[\mathbf{J}\cdot\mathbf{J}] = 3\cdot E[J_x^2] = 3\cdot 5/4 = (3/2)\cdot(3/2+1) = j\cdot(j+1)$. We should now generalize this formula for other values of j, which is not so easy… Hmm… It obviously involves some formula for a series, and I am not good at that… So… Well… I just checked if it was true for j = 1/2 and j = 1 (please check that for yourself too, at least!) and then I just believe the authorities on this for all other values of j. :-)
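If, like me, you are not good at series, a brute-force check is easy enough. The sketch below (Python, exact fractions, and again assuming equally likely m-values in units of one) computes 3·E[Jz²] for j = 1/2, 1, 3/2, …, 5 and compares it with j·(j+1). It is a numerical check for a handful of values, not a proof; the closed form behind it is the standard sum-of-squares formula, which gives Σm² = j(j+1)(2j+1)/3 for m running from −j to +j.

```python
from fractions import Fraction

def expected_Jz_squared(j):
    """Average of m^2 over the 2j+1 equally likely values m = j, j-1, ..., -j."""
    j = Fraction(j)
    count = int(2 * j) + 1
    ms = [j - k for k in range(count)]
    return sum(m * m for m in ms) / count

# j = 1/2, 1, 3/2, ..., 5
js = [Fraction(k, 2) for k in range(1, 11)]

results = {}
for j in js:
    lhs = 3 * expected_Jz_squared(j)    # E[J.J] = 3 * E[Jz^2]
    rhs = j * (j + 1)
    results[j] = (lhs, rhs)
    print(f"j = {str(j):>3}:  3*E[Jz^2] = {str(lhs):>5}   j(j+1) = {str(rhs):>5}")

all_match = all(lhs == rhs for lhs, rhs in results.values())
print("all match:", all_match)
```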

Now, in a classical situation, we know that J·J product will be the same for whatever direction J would happen to have, and so its expected value will be equal to its constant value J·J. So we can write: E[J·J] = J·J. So… Well… That’s why we write what we wrote above:

Makes sense, no? $E[\mathbf{J}\cdot\mathbf{J}] = E[J_x^2 + J_y^2 + J_z^2] = E[J_x^2] + E[J_y^2] + E[J_z^2] = j\cdot(j+1) = \mathbf{J}\cdot\mathbf{J} = J^2$, so $J = +\sqrt{j(j+1)}$, right?

Hold your horses, man! Think! What are we doing here, really? We didn’t calculate all that much above. We only found that $E[J_x^2] + E[J_y^2] + E[J_z^2] = E[J_x^2 + J_y^2 + J_z^2] = j\cdot(j+1)$. So what? Well… That’s not a proof that the J vector actually exists.

Huh?Ā

Yes. That J vector might just be some theoretical concept. When everything is said and done, all we’ve been doing – or at least, we imagined we did – is those repeated measurements of $J_x$, $J_y$ and $J_z$ here – or whatever subscript you’d want to use, like $J_{\theta,\phi}$, for example (the example is not random, of course) – and so, of course, it’s only natural that we assume these things are the magnitude of the component (in the direction of measurement) of some real vector that is out there, but then… Well… Who knows? Think of what we wrote about the angular momentum in our previous post on electron orbitals. We imagine – or do like to think – that there’s some angular momentum vector J out there, which we think of as being “cocked” at some angle, so its projection onto the z-axis gives us those discrete values for m which, for j = 2, for example, are equal to 0, 1 or 2 (and −1 and −2, of course) – like in the illustration below. :-)

But… Well… Note those weird angles: we get something close to 24.1° and then another value close to 54.7°. No symmetry here. :-( The table below gives some more values for larger j. They’re easy to calculate – it’s, once again, just Pythagoras’ Theorem – but… Well… No symmetries here. Just weird values. [I am not saying the formula for these angles is not straightforward. That formula is easy enough: $\theta = \sin^{-1}\left(m/\sqrt{j(j+1)}\right)$. It’s just… Well… No symmetry. You’ll see why that matters in a moment.]

I skipped the half-integer values for j in the table above so you might think they might make it easier to come up with some kind of sensible explanation for the angles. Well… No. They don’t. For example, for j = 1/2 and m = ±1/2, the angles are ±35.2644° – more or less, that is. :-) As you can see, these angles do not nicely cut up our circle in equal pieces, which triggers the obvious question: are these angles really equally likely? Equal angles do not correspond to equal distances on the z-axis (in case you don’t appreciate the point, look at the illustration below).
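The angles quoted above are easy to reproduce. Here is a short Python sketch that just evaluates the θ = sin⁻¹(m/√[j(j+1)]) formula from the text for a given j; the function name cocking_angles is my own, and the angle is measured from the xy-plane (so m = 0 gives 0°).

```python
import math

def cocking_angles(j):
    """Angles (in degrees) between J and the xy-plane for m = j, j-1, ..., -j."""
    magnitude = math.sqrt(j * (j + 1))          # quantum-mechanical magnitude
    ms = [j - k for k in range(int(2 * j) + 1)]
    return {m: math.degrees(math.asin(m / magnitude)) for m in ms}

for j in (0.5, 1, 1.5, 2):
    angles = cocking_angles(j)
    listing = ", ".join(f"m={m:+g}: {a:+.4f}" for m, a in angles.items())
    print(f"j = {j:g} ->", listing)

# Steps between successive angles for j = 2: they are not equal.
a2 = cocking_angles(2)
steps = [a2[2] - a2[1], a2[1] - a2[0]]
print("steps for j = 2:", [round(s, 2) for s in steps])
```

For j = 2 the steps between successive angles differ by several degrees, which is just the ‘no symmetry’ point: equal steps in m are not equal steps in angle.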

So… Well… Let me summarize the issue on hand as follows: the idea of the angle of the J vector being randomly distributed is not compatible with the idea of those $J_z$ values being equally spaced and equally likely. The latter idea – equally spaced and equally likely $J_z$ values – relates to different possible states of the system being equally likely, so… Well… It’s just a different idea. :-(

Now there is another thing which we should mention here. The maximum value of the z-component of our J vector is always smaller than that quantum-mechanical magnitude, and quite significantly so for small j, as shown in the table below. It is only for larger values of j that the ratio of the two starts to converge to 1. For example, for j = 25, it is about 1.02, so that’s only 2% off. That’s why physicists tell us that, in quantum mechanics, the angular momentum is never “completely along the z-direction.” It is obvious that this actually challenges the idea of a very precise direction in quantum mechanics, but then that shouldn’t surprise us, should it? After all, isn’t this what the Uncertainty Principle is all about?
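The ratio mentioned here is just √[j(j+1)]/j = √(1 + 1/j), so it is quick to tabulate. A minimal Python sketch, with a list of j-values that I picked arbitrarily:

```python
import math

def magnitude_ratio(j):
    """Ratio of the magnitude sqrt(j(j+1)) to the maximum z-component j."""
    return math.sqrt(j * (j + 1)) / j

# Arbitrary selection of integer and half-integer values of j.
for j in (0.5, 1, 1.5, 2, 5, 10, 25, 100):
    print(f"j = {j:>5g}   sqrt(j(j+1)) = {math.sqrt(j * (j + 1)):8.4f}   "
          f"ratio = {magnitude_ratio(j):.4f}")
```

The ratio only creeps towards 1 as j gets large: the vector can never quite line up with the z-axis.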

Different states, rather than different directions… And then Uncertainty because… Well… Because of discrete variables that won’t split in the middle. Hmm… :-(

Perhaps. Perhaps I should just accept all of this and go along with it… But… Well… I am really not satisfied here, despite Feynman’s assurance that that’s OK: “Understanding of these matters comes very slowly, if at all. Of course, one does get better able to know what is going to happen in a quantum-mechanical situation – if that is what understanding means – but one never gets a comfortable feeling that these quantum-mechanical rules are ‘natural’.”

I do want to get that comfortable feeling – on some sunny day, at least. :-) And so I’ll keep playing with this, until… Well… Until I give up. :-) In the meanwhile, if you’d feel you’ve got some better or some more intuitive explanation for all of this, please do let me know. I’d be very grateful to you. :-)

Post scriptum: Of course, we would all want to believe that J somehow exists because… Well… We want to explain those states somehow, right? I, for one, am not happy with being told to just accept things and shut up. So let me add some remarks here. First, you may think that the narrative above should distinguish between polar and axial vectors. You’ll remember polar vectors are the real vectors, like a radius vector r, or a force F, or velocity or (linear) momentum. Axial vectors (also known as pseudo-vectors) are vectors like the angular momentum vector: we sort of construct them from… Well… From real vectors. The angular momentum L, for example, is the vector cross product of the radius vector r and the linear momentum vector p: we write L = r×p. In that sense, they’re a figment of our imagination. But then… What’s real and unreal? The magnitude of L, for example, does correspond to something real, doesn’t it? And its direction does give us the direction of circulation, right? You’re right. Hence, I think polar and axial vectors are both real – in whatever sense you’d want to define real. Their reality is just different, and that’s reflected in their mathematical behavior: if you change the direction of the axes of your reference frame, polar vectors will change sign too, as opposed to axial vectors: they don’t swap sign. They do something else, which I’ll explain in my next post, where I’ll be talking symmetries.

But let us, for the sake of argument, assume whatever I wrote about those angles applies to axial vectors only. Let’s be even more specific, and say it applies to the angular momentum vector only. If that’s the case, we may want to think of a classical equivalent for the mentioned lack of a precise direction: free nutation. It’s a complicated thing – even more complicated than the phenomenon of precession, which we should be familiar with by now. Look at the illustration below (which I took from an article of a physics professor from Saint Petersburg), which shows both precession as well as nutation. Think of the movement of a spinning top when you release it: its axis will, at first, nutate around the axis of precession, before it settles in a more steady precession.

The nutation is caused by the gravitational force field, and the nutation movement usually dies out quickly because of damping forces (read: friction). Now, we don’t think of gravitational fields when analyzing angular momentum in quantum mechanics, and we shouldn’t. But there is something else we may want to think of. There is also a phenomenon which is referred to as free nutation, i.e. a nutation that is not caused by an external force field. The Earth, for example, nutates slowly because of a gravitational pull from the Sun and the other planets – so that’s not a free nutation – but, in addition to this, there’s an even smaller wobble – which is an example of free nutation – because the Earth is not exactly spherical. In fact, the Great Mathematician, Leonhard Euler, had already predicted this, back in 1765, but it took another 125 years or so before an astronomer, Seth Chandler, could finally experimentally confirm and measure it. So they named this wobble the Chandler wobble (Euler already has too many things named after him). :-)

Now I don’t have much backup here – none, actually 🙂 – but why wouldn’t we imagine our electron would also sort of nutate freely because of… Well… Some symmetric asymmetry – something like the slightly elliptical shape of our Earth. 🙂 We may then effectively imagine the angular momentum vector as continually changing direction between a minimum and a maximum angle – something like what’s shown below, perhaps, between 0 and 40 degrees. Think of it as a rotation within a rotation, or an oscillation within an oscillation – or a standing wave within a standing wave. 🙂

I am not sure if this approach would solve the problem of our angles and distances – the issue of whether we should think in equally likely angles or equally likely distances along the z-axis, really – but… Well… I’ll let you play with this. Please do send me some feedback if you think you’ve found something. 🙂

Whatever your solution is, it is likely to involve the equipartition theorem and harmonics, right? Perhaps we can, indeed, imagine standing waves within standing waves, and then standing waves within standing waves. How far can we go? 🙂

Post scriptum 2: When re-reading this post, I was thinking I should probably do something with the following idea. If we’ve got a sphere, and we’re thinking of some vector pointing to some point on the surface of that sphere, then we’re doing something which is referred to as point picking on the surface of a sphere, and the probability distributions – as a function of the polar and azimuthal angles θ and φ – are quite particular. See the article on the Wolfram site on this, for example. I am not sure if it’s going to lead to some easy explanation of the ‘angle problem’ we’ve laid out here but… Well… It’s surely an element in the explanation. The key idea here is shown in the illustration below: if the direction of our momentum in three-dimensional space is really random, there may still be more of a chance of an orientation towards the equator, rather than towards the pole. So… Well… We need to study the math of this. 🙂 But that’s for later.
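To make the point-picking idea a bit more tangible, here is a minimal Python sketch. It assumes the standard recipe for uniform directions (pick cos θ uniformly between −1 and +1, and φ uniformly between 0 and 2π); the function names are just mine, for illustration. It then compares two bands of the same angular width (60°): one around the equator and one at the north pole.

```python
import math
import random

def random_polar_angle(rng):
    """Polar angle theta (0 at the north pole) of a direction picked
    uniformly on the unit sphere.

    Picking cos(theta) uniformly in [-1, 1] makes the points uniform
    per unit *area*; picking theta itself uniformly would not.
    The azimuth phi is uniform in [0, 2*pi) and is not needed here.
    """
    return math.acos(rng.uniform(-1.0, 1.0))

def fraction_in_band(n_samples, theta_lo_deg, theta_hi_deg, seed=0):
    """Fraction of random directions whose polar angle falls in the band."""
    rng = random.Random(seed)
    lo = math.radians(theta_lo_deg)
    hi = math.radians(theta_hi_deg)
    hits = sum(1 for _ in range(n_samples) if lo <= random_polar_angle(rng) <= hi)
    return hits / n_samples

# Two bands of equal angular width (60 degrees each).
equator_band = fraction_in_band(20000, 60.0, 120.0)
polar_band = fraction_in_band(20000, 0.0, 60.0)
print(f"around the equator: {equator_band:.3f}")
print(f"around the pole:    {polar_band:.3f}")
```

The exact fractions are (cos θ_lo − cos θ_hi)/2, so 1/2 for the equatorial band and 1/4 for the polar one: a truly random direction is indeed more likely to point towards the equator.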

# Re-visiting electron orbitals

One of the pieces I barely gave a glance when reading Feynman’s Lectures over the past few years, was the derivation of the non-spherical electron orbitals for the hydrogen atom. It just looked like a boring piece of math – and I thought the derivation of the s-orbitals – the spherically symmetrical ones – was interesting enough already. To some extent, it is – but there is so much more to it. When I read it now, the derivation of those p-, d-, f-, etc. orbitals brings all of the weirdness of quantum mechanics together and, while doing so, also provides for a deeper understanding of all of the ideas and concepts we’re trying to get used to. In addition, Feynman’s treatment of the matter is actually much shorter than what you’ll find in other textbooks, because… Well… As he puts it, he takes a shortcut. So let’s try to follow the bright mind of our Master as he walks us through it.

You’ll remember – if not, check it out again – that we found the spherically symmetric solutions for Schrödinger’s equation for our hydrogen atom. Just to make sure, Schrödinger’s equation is a differential equation – a condition we impose on the wavefunction for our electron – and so we need to find the functional form for the wavefunctions that describe the electron orbitals. [Quantum math is so confusing that it’s often good to regularly think of what it is that we’re actually trying to do. :-)] In fact, that functional form gives us a whole bunch of solutions – or wavefunctions – which are defined by three quantum numbers: n, l, and m. The parameter n corresponds to an energy level (E_n), l is the orbital (quantum) number, and m is the z-component of the angular momentum. But that doesn’t say much. Let’s go step by step.

First, we derived those spherically symmetric solutions – which are referred to as s-states – assuming this was a state with zero (orbital) angular momentum, which we write as l = 0. [As you know, Feynman does not incorporate the spin of the electron in his analysis, which is, therefore, approximative only.] Now what exactly is a state with zero angular momentum? When everything is said and done, we are effectively trying to describe some electron orbital here, right? So that’s an amplitude for the electron to be somewhere, but then we also know it always moves. So, when everything is said and done, the electron is some circulating negative charge, right? So there is always some angular momentum and, therefore, some magnetic moment, right?

Well… If you google this question on Physics Stack Exchange, you’ll get a lot of mumbo jumbo telling you that you shouldn’t think of the electron actually orbiting around. But… Then… Well… A lot of that mumbo jumbo is contradictory. For example, one of the academics writing there does note that, while we shouldn’t think of an electron as some particle, the orbital is still a distribution which gives you the probability of actually finding the electron at some point (x, y, z). So… Well… It is some kind of circulating charge – as a point, as a cloud or as whatever. The only reasonable answer – in my humble opinion – is that l = 0 probably means there is no net circulating charge, so the movement in this or that direction must balance the movement in the other. One may note, in this regard, that the phenomenon of electron capture in nuclear reactions suggests electrons do travel through the nucleus for at least part of the time, which is entirely coherent with the wavefunctions for s-states – shown below – which tell us that the most probable (x, y, z) position for the electron is right at the center – so that’s where the nucleus is. For the other orbitals (p, d, etcetera), the amplitude vanishes at the center itself, although there is still a small but non-zero probability of finding the electron within the (finite) volume of the nucleus.

In fact, now that I’ve shown this graph, I should quickly explain it. The three graphs are the spherically symmetric wavefunctions for the first three energy levels. For the first energy level – which is conventionally written as n = 1, not as n = 0 – the amplitude approaches zero rather quickly. For n = 2 and n = 3, there are zero-crossings: the curve passes the r-axis. Feynman calls these zero-crossings radial nodes. To be precise, the number of zero-crossings for these s-states is n − 1, so there’s none for n = 1, one for n = 2, two for n = 3, etcetera.
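The n − 1 rule for the radial nodes is easy to check numerically. The sketch below assumes the standard (unnormalized) textbook forms of the 1s, 2s and 3s radial functions, with r measured in units of the Bohr radius; those explicit formulas are not taken from the graphs above, and the function names are mine.

```python
import math

# Unnormalized s-state radial functions, r in units of the Bohr radius.
# Assumption: standard textbook forms for n = 1, 2, 3.
def psi_1s(r):
    return math.exp(-r)

def psi_2s(r):
    return (2.0 - r) * math.exp(-r / 2.0)

def psi_3s(r):
    return (27.0 - 18.0 * r + 2.0 * r * r) * math.exp(-r / 3.0)

def count_radial_nodes(psi, r_max=40.0, steps=40000):
    """Count sign changes of psi(r) on a fine grid over (0, r_max]."""
    nodes = 0
    previous = psi(r_max / steps)
    for k in range(2, steps + 1):
        current = psi(k * r_max / steps)
        if current == 0.0:
            # The grid point sits exactly on a node: skip it and let the
            # next point reveal the sign change.
            continue
        if previous * current < 0.0:
            nodes += 1
        previous = current
    return nodes

for n, psi in enumerate((psi_1s, psi_2s, psi_3s), start=1):
    print(f"n = {n}: {count_radial_nodes(psi)} radial node(s)")
```

The counts come out as 0, 1 and 2, i.e. n − 1 in each case.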

Now, why is the amplitude – apparently – some real-valued function here? That’s because we’re actually not looking at ψ(r, t) here but at the ψ(r) function which appears in the following break-up of the actual wavefunction ψ(r, t):

ψ(r, t) = e^(−i·(E/ħ)·t)·ψ(r)

So ψ(r) is more of an envelope function for the actual wavefunction, which varies both in space as well as in time. It’s good to remember that: I would have used another symbol, because ψ(r, t) and ψ(r) are two different beasts, really – but then physicists want you to think, right? And Mr. Feynman would surely want you to do that, so why not inject some confusing notation from time to time? 🙂 So for n = 3, for example, ψ(r) goes from positive to negative and then to positive, and these areas are separated by radial nodes. Feynman put it on the blackboard like this:

I am just inserting it to compare this concept of radial nodes with the concept of a nodal plane, which we’ll encounter when discussing p-states in a moment, but I can already tell you what they are now: those p-states are symmetrical in one direction only, as shown below, and so we have a nodal plane instead of a radial node. But so I am getting ahead of myself here… 🙂

Before going back to where I was, I just need to add one more thing. 🙂 Of course, you know that we’ll take the square of the absolute value of our amplitude to calculate a probability (or the absolute square – as we abbreviate it), so you may wonder why the sign is relevant at all. Well… I am not quite sure either but there’s this concept of orbital parity which you may have heard of. The orbital parity tells us what will happen to the sign if we calculate the value for ψ for −r rather than for r. If ψ(−r) = ψ(r), then we have an even function – or even orbital parity. Likewise, if ψ(−r) = −ψ(r), then we’ll call the function odd – and so we’ll have an odd orbital parity. The orbital parity is always equal to (−1)^l = ±1. The exponent l is that angular quantum number, and +1, or + tout court, means even, and −1 or just − means odd. The angular quantum number for those p-states is l = 1, so that works with the illustration of the nodal plane. 🙂 As said, it’s not hugely important but I might as well mention it in passing – especially because we’ll re-visit the topic of symmetries a few posts from now. 🙂

OK. I said I would talk about states with some angular momentum (so l ≠ 0) and so it’s about time I start doing that. As you know, our orbital angular momentum l is measured in units of ħ (just like the total angular momentum J, which we’ve discussed ad nauseam already). We also know that if we’d measure its component along any direction – any direction really, but physicists will usually make sure that the z-axis of their reference frame coincides with it, so we call it the z-axis 🙂 – then we will find that it can only have one of a discrete set of values m·ħ = l·ħ, (l−1)·ħ, …, −(l−1)·ħ, −l·ħ. Hence, l just takes the role of our good old quantum number j here, and m is just J_z. Likewise, I’d like to introduce l (in bold type) as the equivalent of J, so we can easily talk about the angular momentum vector. And now that we’re here, why not write m in bold type too, and say that m is the z-component itself – i.e. the whole vector quantity, so that’s the direction and the magnitude.

Now, we do need to note one crucial difference between j and l, or between J and l: our j could be an integer or a half-integer. In contrast, l must be some integer. Why? Well… If l can be zero, and the values of l must be separated by a full unit, then l must be 0, 1, 2, 3 etcetera. 🙂 If this simple answer doesn’t satisfy you, I’ll refer you to Feynman’s, which is also short but more elegant than mine. 🙂 Now, you may or may not remember that the quantum-mechanical equivalent of the magnitude of a vector quantity such as l is to be calculated as √[l·(l+1)]·ħ, so if l = 1, that magnitude will be √2·ħ ≈ 1.4142·ħ, so that’s – as expected – larger than the maximum value for m, which is +1. As you know, that leads us to think of that z-component m as a projection of l. Paraphrasing Feynman, the limited set of values for m implies that the angular momentum is always “cocked” at some angle. For l = 1 and m = ±1, that angle is either +45° or, else, −45°, as shown below.

What if l = 2? The magnitude of l is then equal to √[2·(2+1)]·ħ = √6·ħ ≈ 2.4495·ħ. How do we relate that to those “cocked” angles? The values of m now range from −2 to +2, with a unit distance in-between. The illustration below shows the angles. [I didn’t mention ħ any more in that illustration because, by now, we should know it’s our unit of measurement – always.]

Note we’ve got a bigger circle here (the radius is about 2.45 here, as opposed to a bit more than 1.4 for l = 1). Also note that it’s not a nice cake with perfectly equal pieces. From the graph, it’s obvious that the formula for the angle is the following:

θ = sin^−1(m/√[l·(l+1)])

It’s simple but intriguing. Needless to say, the sin^−1 function is the inverse sine, also known as the arcsine. I’ve calculated the values for all m for l = 1, 2, 3, 4 and 5 below. The most interesting values are the angles for m = 1 and m = l. As the graphs underneath show, for m = 1, the values start approaching the zero angle for very large l, so there’s not much difference any more between m = ±1 and m = 0 for large values of l. What about the m = l case? Well… Believe it or not, if l becomes really large, then these angles do approach 90°. If you don’t remember how to calculate limits, then just calculate θ for some huge value for l and m. For l = m = 1,000,000, for example, you should find that θ = 89.9427…°. 🙂
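If you want to reproduce those angles yourself, the few lines of Python below will do. It is only a sketch: it assumes the arcsine relation between m and the magnitude √[l·(l+1)] discussed above (with the angle measured from the plane perpendicular to the z-axis, so m = 0 gives a zero angle), and the function name is mine.

```python
import math

def cocked_angle_deg(l, m):
    """Angle (in degrees) at which the angular momentum is "cocked":
    theta = asin(m / sqrt(l*(l+1))), everything in units of h-bar."""
    if l < 1 or abs(m) > l:
        raise ValueError("need l >= 1 and -l <= m <= l")
    return math.degrees(math.asin(m / math.sqrt(l * (l + 1))))

# Table of angles for l = 1, ..., 5 and all allowed values of m.
for l in range(1, 6):
    row = ", ".join(
        f"m={m:+d}: {cocked_angle_deg(l, m):6.2f}" for m in range(-l, l + 1)
    )
    print(f"l={l}: {row}")

# The large-l limit for m = l creeps up towards 90 degrees.
print(cocked_angle_deg(10**6, 10**6))
```

For l = 1 and m = ±1 this gives ±45°, and for l = m = 1,000,000 it gives the 89.9427…° mentioned above.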

Isn’t this fascinating? I’ve actually never seen this in a textbook – so it might be an original contribution. 🙂 OK. I need to get back to the grind: Feynman’s derivation of non-symmetrical electron orbitals. Look carefully at the illustration below. If m is really the projection of some angular momentum that’s “cocked”, either at a zero-degree or, alternatively, at ±45° (for the l = 1 situation we show here) – a projection on the z-axis, that is – then the value of m (+1, 0 or −1) does actually correspond to some idea of the orientation of the space in which our electron is circulating. For m = 0, that space – think of some torus or whatever other space in which our electron might circulate – would have some alignment with the z-axis. For m = ±1, there is no such alignment.

The interpretation is tricky, however, and the illustration on the right-hand side above is surely too much of a simplification: an orbital is definitely not like a planetary orbit. It doesn’t even look like a torus. In fact, the illustration in the bottom right corner, which shows the probability density, i.e. the space in which we are actually likely to find the electron, is a picture that is much more accurate – and it surely does not resemble a planetary orbit or some torus. However, despite that, the idea that, for m = 0, we’d have some alignment of the space in which our electron moves with the z-axis is not wrong. Feynman expresses it as follows:

“Suppose m is zero, then there can be some non-zero amplitude to find the electron on the z-axis at some distance r. We’ll call this amplitude F_l(r).”

You’ll say: so what? And you’ll also say that illustration in the bottom right corner suggests the electron is actually circulating around the z-axis, rather than through it. Well… No. That illustration does not show any circulation. It only shows a probability density. No suggestion of any actual movement or circulation. So the idea is valid: if m = 0, then the implication is that, somehow, the space of circulation of current around the direction of the angular momentum vector (J), as per the well-known right-hand rule, will include the z-axis. So the idea of that electron orbiting through the z-axis for m = 0 is essentially correct, and the corollary is… Well… I’ll talk about that in a moment.

But… Well… So what? What’s so special about that F_l(r) amplitude? What can we do with that? Well… If we would find a way to calculate F_l(r), then we know everything. Huh? Everything? Yes. The reasoning here is quite complicated, so please bear with me as we go through it.

The first thing you need to accept is rather weird. The thing we said about the non-zero amplitudes to find the electron somewhere on the z-axis for the m = 0 state – which, using Dirac’s bra-ket notation, we’ll write as |l, m = 0⟩ – has a very categorical corollary:

The amplitude to find an electron whose state m is not equal to zero on the z-axis (at some non-zero distance r) is zero. We can only find an electron on the z-axis if the z-component of its angular momentum (m) is zero.

Now, I know this is hard to swallow, especially when looking at those 45° angles for J in our illustrations, because these suggest the actual circulation of current may also include at least part of the z-axis. But… Well… No. Why not? Well… I have no good answer here except for the usual one which, I admit, is quite unsatisfactory: it’s quantum mechanics, not classical mechanics. So we have to look at the m and −m vectors, which are pointed along the z-axis itself for m = ±1 and, hence, the circulation we’d associate with those momentum vectors (even if they’re the z-component only) is around the z-axis. Not through or on it. I know it’s a really poor argument, but it’s consistent with our picture of the actual electron orbitals – that picture in terms of probability densities, which I copy below. For m = −1, we have the yz-plane as the nodal plane between the two lobes of our distribution, so no amplitude to find the electron on the z-axis (nor would we find it on the y-axis, as you can see). Likewise, for m = +1, we have the xz-plane as the nodal plane. Both nodal planes include the z-axis and, therefore, there’s zero probability on that axis.

In addition, you may also want to note the 45° angle we associate with m = ±1 does sort of demarcate the lobes of the distribution by defining a three-dimensional cone and… Well… I know these arguments are rather intuitive, and so you may refuse to accept them. In fact, to some extent, I refuse to accept them. 🙂 Indeed, let me say this loud and clear: I really want to understand this in a better way!

But… Then… Well… Such better understanding may never come. Feynman’s warning, just before he starts explaining the Stern-Gerlach experiment and the quantization of angular momentum, rings very true here: “Understanding of these matters comes very slowly, if at all. Of course, one does get better able to know what is going to happen in a quantum-mechanical situation – if that is what understanding means – but one never gets a comfortable feeling that these quantum-mechanical rules are ‘natural.’ Of course they are, but they are not natural to our own experience at an ordinary level.” So… Well… What can I say?

It is now time to pull the rabbit out of the hat. To understand what we’re going to do next, you need to remember that our amplitudes – or wavefunctions – are always expressed with regard to a specific frame of reference, i.e. some specific choice of an x-, y- and z-axis. If we change the reference frame – say, to some new set of x’-, y’- and z’-axes – then we need to re-write our amplitudes (or wavefunctions) in terms of the new reference frame. In order to do so, one should use a set of transformation rules. I’ve written several posts on that – including a very basic one, which you may want to re-read (just click the link here).

Look at the illustration below. We want to calculate the amplitude to find the electron at some point in space. Our reference frame is the x, y, z frame and the polar coordinates (or spherical coordinates, I should say) of our point are the radial distance r, the polar angle θ (theta), and the azimuthal angle φ (phi). [The illustration below – which I copied from Feynman’s exposé – uses a capital letter for phi, but I stick to the more usual or more modern convention here.]

In case you wonder why we’d use polar coordinates rather than Cartesian coordinates… Well… I need to refer you to my other post on the topic of electron orbitals, i.e. the one in which I explain how we get the spherically symmetric solutions: if you have radial (central) fields, then it’s easier to solve stuff using polar coordinates – although you wouldn’t think so if you think of that monster equation that we’re actually trying to solve here:

It’s really Schrödinger’s equation for the situation on hand (i.e. a hydrogen atom, with a radial or central Coulomb field because of its positively charged nucleus), but re-written in terms of polar coordinates. For the detail, see the mentioned post. Here, you should just remember we got the spherically symmetric solutions assuming the derivatives of the wavefunction with respect to θ and φ – so that’s the ∂ψ/∂θ and ∂ψ/∂φ in the equation above – were zero. So now we don’t assume these partial derivatives to be zero: we’re looking for states with an angular dependence, as Feynman puts it somewhat enigmatically. […] Yes. I know. This post is becoming very long, and so you are getting impatient. Look at the illustration with the (r, θ, φ) point, and let me quote Feynman on the line of reasoning now:

“Suppose we have the atom in some |l, m⟩ state, what is the amplitude to find the electron at the angles θ and φ and the distance r from the origin? Put a new z-axis, say z’, at that angle (see the illustration above), and ask: what is the amplitude that the electron will be at the distance r along the new z’-axis? We know that it cannot be found along z’ unless its z’-component of angular momentum, say m’, is zero. When m’ is zero, however, the amplitude to find the electron along z’ is F_l(r). Therefore, the result is the product of two factors. The first is the amplitude that an atom in the state |l, m⟩ along the z-axis will be in the state |l, m’ = 0⟩ with respect to the z’-axis. Multiply that amplitude by F_l(r) and you have the amplitude ψ_l,m(r) to find the electron at (r, θ, φ) with respect to the original axes.”

So what is he telling us here? Well… He’s going a bit fast here. 🙂 Worse, I think he may actually not have chosen the right words here, so let me try to rephrase it. We’ve introduced the F_l(r) function above: it was the amplitude, for m = 0, to find the electron on the z-axis at some distance r. But so here we’re obviously in the x’, y’, z’ frame and so F_l(r) is the amplitude, for m’ = 0, to find the electron at some distance r along the z’-axis. Of course, for this amplitude to be non-zero, we must be in the |l, m’ = 0⟩ state, but are we? Well… |l, m’ = 0⟩ actually gives us the amplitude for that. So we’re going to multiply two amplitudes here:

F_l(r)·|l, m’ = 0⟩

So this amplitude is the product of two amplitudes as measured in the x’, y’, z’ frame. Note it’s symmetric: we may also write it as |l, m’ = 0⟩·F_l(r). We now need to sort of translate that into an amplitude as measured in the x, y, z frame. To go from x, y, z to x’, y’, z’, we first rotated around the z-axis by the angle φ, and then rotated around the new y’-axis by the angle θ. Now, the order of rotation matters: you can easily check that by taking a non-symmetrical object in your hand and doing those rotations in the two different sequences: check what happens to the orientation of your object. Hence, to go back we should first rotate about the y’-axis by the angle −θ, so our z’-axis folds into the old z-axis, and then rotate about the z-axis by the angle −φ.

Now, we will denote the transformation matrices that correspond to these rotations as R_y’(−θ) and R_z(−φ) respectively. These transformation matrices are complicated beasts. They are surely not the easy rotation matrices that you can use for the coordinates themselves. You can click this link to see what they look like for l = 1. For larger l, there are other formulas, which Feynman derives in another chapter of his Lectures on quantum mechanics. But let’s move on. Here’s the grand result:

The amplitude for our wavefunction ψ_l,m(r) – which denotes the amplitude for (1) the atom to be in the state that’s characterized by the quantum numbers l and m and – let’s not forget – (2) find the electron at r – note the bold type: r = (x, y, z) – would be equal to:

ψ_l,m(r) = ⟨l, m|R_z(−φ) R_y’(−θ)|l, m’ = 0⟩·F_l(r)

Well… Hmm… Maybe. […] That’s not how Feynman writes it. He writes it as follows:

ψ_l,m(r) = ⟨l, 0|R_y(θ) R_z(φ)|l, m⟩·F_l(r)

I am not quite sure what I did wrong. Perhaps the two expressions are equivalent. Or perhaps – is it possible at all? – Feynman made a mistake? I’ll find out. [P.S: I re-visited this point in the meanwhile: see the P.S. to this post. :-)] The point to note is that we have some combined rotation matrix R_y(θ) R_z(φ). The elements of this matrix are algebraic functions of θ and φ, which we will write as Y_l,m(θ, φ), so we write:

a·Y_l,m(θ, φ) = ⟨l, 0|R_y(θ) R_z(φ)|l, m⟩

Or a·Y_l,m(θ, φ) = ⟨l, m|R_z(−φ) R_y’(−θ)|l, m’ = 0⟩, if Feynman would have it wrong and my line of reasoning above would be correct – which is obviously not so likely. Hence, the ψ_l,m(r) function is now written as:

ψ_l,m(r) = a·Y_l,m(θ, φ)·F_l(r)

The coefficient a is, as usual, a normalization coefficient so as to make sure the surface under the probability density function is 1. As mentioned above, we get these Y_l,m(θ, φ) functions from combining those rotation matrices. For l = 1, and m = −1, 0, +1, they are:

A more complete table is given below:

So, yes, we’re done. Those equations above give us those wonderful shapes for the electron orbitals, as illustrated below (credit for the illustration goes to an interesting site of the UC Davis school).

But… Hey! Wait a moment! We only have these Y_l,m(θ, φ) functions here. What about F_l(r)?

You’re right. We’re not quite there yet, because we don’t have a functional form for F_l(r). Not yet, that is. Unfortunately, that derivation is another lengthy development – and that derivation actually is just tedious math only. Hence, I will refer you to Feynman for that. 🙂 Let me just insert one more thing before giving you The Grand Equation, and that’s an explanation of how we get those nice graphs. They are so-called polar graphs. There is a nice and easy article on them on the website of the University of Illinois, but I’ll summarize it for you. Polar graphs use a polar coordinate grid, as opposed to the Cartesian (or rectangular) coordinate grid that we’re used to. It’s shown below.

The origin is now referred to as the pole – like in North or South Pole indeed. 🙂 The straight lines from the pole (like the diagonals, for example, or the axes themselves, or any line in-between) measure the distance from the pole which, in this case, goes from 0 to 10, and we can connect the equidistant points by a series of circles – as shown in the illustration also. These lines from the pole are defined by some angle – which we’ll write as θ to make things easy 🙂 – which just goes from 0 to 2π (which brings us back to 0) and then round and round and round again. The rest is simple: you’re just going to graph a function, or an equation – just like you’d graph y = ax + b in the Cartesian plane – but it’s going to be a polar equation. Referring back to our p-orbitals, we’ll want to graph the cos²θ = ρ equation, for example, because that’s going to show us the shape of that probability density function for l = 1 and m = 0. So our graph is going to connect the (θ, ρ) points for which the angle (θ) and the distance from the pole (ρ) satisfy the cos²θ = ρ equation. There is a really nice widget on the WolframAlpha site that produces those graphs for you. I used it to produce the graph below, which shows the 1.1547·cos²θ = ρ graph (the 1.1547 coefficient is the normalization coefficient a).

Now, you’ll wonder why this is a curve, or a curved line. That widget even calculates its length: it’s about 6.374743 units long. So why don’t we have a surface or a volume here? We didn’t specify any value for ρ, did we? No, we didn’t. The widget calculates those values from the equation. So… Yes. It’s a valid question: where’s the distribution? We were talking about some electron cloud or something, right?

Right. To get that cloud – those probability densities really – we need that F_l(r) function. Our cos²θ = ρ is, once again, just some kind of envelope function: it marks a space but doesn’t fill it, so to speak. 🙂 In fact, I should now give you the complete description, which has all of the possible states of the hydrogen atom – everything! No separate pieces anymore. Here it is. It also includes n. It’s The Grand Equation:

The a_k coefficients in the formula for ρF_n,l(ρ) are the solutions to the equation below, which I copied from Feynman’s text on it all. I’ll also refer you to the same text to see how you actually get solutions out of it, and what they then actually represent. 🙂

We’re done. Finally!
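Going back to the polar graph for a second: the length of the 1.1547·cos²θ = ρ curve is something you can check without the widget. The sketch below assumes the usual arc-length formula for a polar curve, i.e. the integral of √[ρ² + (dρ/dθ)²] over θ from 0 to 2π, and approximates it with a plain midpoint rule. The function and variable names are mine.

```python
import math

def polar_arc_length(rho, drho, n=200000):
    """Arc length of the polar curve rho(theta) over one full turn,
    using the midpoint rule on sqrt(rho^2 + (d rho / d theta)^2)."""
    h = 2.0 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += math.hypot(rho(theta), drho(theta))
    return total * h

A = 1.1547  # the normalization coefficient used for the graph

def rho(theta):
    return A * math.cos(theta) ** 2

def drho(theta):
    # Derivative of A*cos^2(theta) with respect to theta.
    return -2.0 * A * math.cos(theta) * math.sin(theta)

print(f"curve length: {polar_arc_length(rho, drho):.6f}")
```

The result agrees with the 6.3747… units quoted above, which is a nice sanity check that the curve really is just a one-dimensional line in the (θ, ρ) plane, not a surface or a volume.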

I hope you enjoyed this. Look at what we’ve achieved. We had this differential equation (a simple diffusion equation, really, albeit in the complex space), and then we have a central Coulomb field and the rather simple concept of quantized (i.e. non-continuous or discrete) angular momentum. Now see what magic comes out of it! We literally constructed the atomic structure out of it, and it’s all wonderfully elegant and beautiful.

Now I think that’s amazing, and if you’re reading this, then I am sure you’ll find it as amazing as I do.

Note: I did a better job in explaining the intricacies of actually representing those orbitals in a later post. I recommend you have a look at it by clicking the link here.

Post scriptum on the transformation matrices:

You must find the explanation for that ⟨l, 0|R_y(θ) R_z(φ)|l, m⟩·F_l(r) product highly unsatisfactory, and it is. 🙂 I just wanted to make you think – rather than just superficially read through it. First note that F_l(r)·|l, m’ = 0⟩ is not a product of two amplitudes: it is the product of an amplitude with a state. A state is a vector in a rather special vector space – a Hilbert space (just a nice word to throw around, isn’t it?). The point is: a state vector is written as some linear combination of base states. Something inside of me tells me we may look at the three p-states as base states, but I need to look into that.

Let’s first calculate the R_y(θ) R_z(φ) matrix to see if we get those formulas for the angular dependence of the amplitudes. It’s the product of the R_y(θ) and R_z(φ) matrices, which I reproduce below.

Note that this product is non-commutative because… Well… Matrix products generally are non-commutative. 🙂 So… Well… There they are: the second row gives us those functions, so I am wrong, obviously, and Dr. Feynman is right. Of course, he is. He is always right – especially because his Lectures have gone through so many revised editions that all errors must be out by now. 🙂
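For those who would rather let the computer do the matrix multiplication, here is a minimal sketch in plain Python. The entries of the two l = 1 (spin-one) rotation matrices are written out as I remember them from Feynman’s table, in the base-state order m = +1, 0, −1, so treat the signs as an assumption to be checked against his text. The helper names (`rot_y`, `rot_z`, `matmul`, `angular_amplitudes`) are mine.

```python
import cmath
import math

def rot_y(theta):
    """Spin-one rotation matrix about the y-axis (rows/columns: m = +1, 0, -1).
    Assumption: sign convention as recalled from Feynman's table."""
    c, s = math.cos(theta), math.sin(theta)
    r2 = math.sqrt(2.0)
    return [
        [(1 + c) / 2, s / r2, (1 - c) / 2],
        [-s / r2, c, s / r2],
        [(1 - c) / 2, -s / r2, (1 + c) / 2],
    ]

def rot_z(phi):
    """Spin-one rotation matrix about the z-axis: a diagonal phase matrix."""
    return [
        [cmath.exp(1j * phi), 0, 0],
        [0, 1, 0],
        [0, 0, cmath.exp(-1j * phi)],
    ]

def matmul(a, b):
    """Plain 3x3 matrix product a*b."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]

def angular_amplitudes(theta, phi):
    """Second row of R_y(theta) R_z(phi), i.e. the three amplitudes
    <l=1, 0| R_y R_z |l=1, m> for m = +1, 0, -1."""
    return matmul(rot_y(theta), rot_z(phi))[1]

theta, phi = 0.7, 1.3  # arbitrary test angles, in radians
for m, amp in zip((+1, 0, -1), angular_amplitudes(theta, phi)):
    print(f"m = {m:+d}: {amp:.4f}")
```

Under those assumptions, the second row comes out as −(sin θ/√2)·e^(iφ), cos θ and +(sin θ/√2)·e^(−iφ), and the three absolute squares add up to one for any θ and φ.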

However, let me – just for fun – also calculate my R_z(−φ) R_y’(−θ) product. I can do so in two steps: first I calculate R_z(φ) R_y’(θ), and then I substitute −φ and −θ for the angles φ and θ, remembering that cos(−α) = cos(α) and sin(−α) = −sin(α). I might have made a mistake, but I got this:

The functions look the same but… Well… No. The e^(iφ) and e^(−iφ) are in the wrong place (it’s just one minus sign – but it’s crucially different). And then these functions should not be in a column. That doesn’t make sense when you write it all out. So Feynman’s expression is, of course, fully correct. But so how do we interpret that ⟨l, 0|R_y(θ) R_z(φ)|l, m⟩ expression then? This amplitude probably answers the following question:

Given that our atom is in the |l, m⟩ state, what is the amplitude for it to be in the ⟨l, 0| state in the x’, y’, z’ frame?

That makes sense – because we did start out with the assumption that our atom was in the |l, m⟩ state, so… Yes. Think about it some more and you’ll see it all makes sense: we can – and should – multiply this amplitude with the F_l(r) amplitude.

OK. Now we’re really done with this. 🙂

Note: As for the ⟨ | and | ⟩ symbols to denote a state, note that there’s not much difference: both are state vectors, but a state vector that’s written as an end state – so that’s like ⟨Φ| – is a 1×3 vector (so that’s a row vector), while a vector written as |Φ⟩ is a 3×1 vector (so that’s a column vector). So that’s why ⟨l, 0|R_y(θ) R_z(φ)|l, m⟩ does give us some number. We’ve got a (1×3)·(3×3)·(3×1) matrix product here – but so it gives us what we want: a 1×1 amplitude. 🙂

# The state(s) of a photon

While hurrying to try to understand the things I wanted to understand most – like Schrödinger’s equation and, equally important, its solutions explaining the weird shapes of electron orbitals – I skipped some interesting bits and pieces. Worse, I skipped two or three of Feynman’s Lectures on quantum mechanics entirely. These include Chapter 17 – on symmetry and conservation laws – and Chapter 18 – on angular momentum. With the benefit of hindsight, that was not the right thing to do. If anything, doing all of the Lectures would, at the very least, ensure I would have more than an ephemeral grasp of it all. So… In this and the next post, I want to tidy up and go over everything I skipped so far. 🙂

We’ve written a lot on how quantum mechanics applies to bosons as well as fermions. For example, we pointed out – in great detail – that the mathematical structure of the electromagnetic wave – light! :-) – is quite similar to that of the ubiquitous wavefunction. Equally fundamental – if not more – is the fact that light also arrives in lumps – little light-particles which we call photons. It’s the photoelectric effect, which Einstein explained in 1905 by… Well… By telling us that light consists of quanta – photons – whose energy must be high enough so as to be able to dislodge an electron. It’s what got him his Nobel Prize. [Einstein never got a Nobel Prize for his relativity theory, which is – arguably – at least as important. There’s a lot of controversy around that but, in any case, that’s history.]

So it shouldn’t surprise you that there’s an equivalent to the spin of an electron. With spin, we refer to the angular momentum of a quantum-mechanical system – an atom, a nucleus, an electron, whatever – which, as you know, can only be one of a set of discrete values when measured along some direction, which we usually refer to as the z-direction. More formally, we write that the z-component of the angular momentum J is equal to

Jz = j·ħ, (j−1)·ħ, (j−2)·ħ, …, −(j−2)·ħ, −(j−1)·ħ, −j·ħ

The j in this expression is the so-called spin of the system. For an electron, j is equal to 1/2, so Jz is equal to ±ħ/2, which we referred to as “up” and “down” states respectively, for obvious reasons: one state points upwards – more or less, that is (we know the angular momentum will actually precess around the direction of the magnetic field) – while the other points downwards.
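The ladder of allowed values is easy to list programmatically. The snippet below is only an illustrative sketch: the function name `jz_values` is made up for this post, and all values are expressed in units of ħ.

```python
from fractions import Fraction

def jz_values(j):
    """Return the allowed z-components of angular momentum, in units of hbar.

    j is the spin of the system: a non-negative integer or half-integer.
    The result runs from +j down to -j in steps of one.
    """
    j = Fraction(j)
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("j must be a non-negative integer or half-integer")
    n_states = int(2 * j) + 1          # a spin-j system has 2j + 1 states
    return [j - k for k in range(n_states)]

# Electron (j = 1/2): two states, "up" and "down".
print([str(v) for v in jz_values(Fraction(1, 2))])   # ['1/2', '-1/2']

# A spin-one system: three states.
print([str(v) for v in jz_values(1)])                # ['1', '0', '-1']
```

Using `Fraction` rather than floats keeps the half-integer values exact.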

We also know that the magnetic energy of an electron in a (weak) magnetic field – which, as you know, we conveniently assume to be pointing in the same z-direction, so Bz = B – will be equal to:

Umag = g·μz·B·j = ±2·μz·B·(1/2) = ±μz·B = ±B·(qe·ħ)/(2m)

In short, the magnetic energy is proportional to the magnetic field, and the constant of proportionality is the so-called Bohr magneton qe·ħ/(2m). So far, so good. What’s the analog for a photon?
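To get a feel for the order of magnitude, we can plug in numbers. This is a rough sketch only: the constants are CODATA values that I am supplying here (they are not quoted anywhere in this post), and the function names are hypothetical.

```python
# CODATA values, supplied as assumptions for this illustration.
Q_E = 1.602176634e-19      # elementary charge, in coulomb
HBAR = 1.054571817e-34     # reduced Planck constant, in joule-second
M_E = 9.1093837015e-31     # electron mass, in kilogram

def bohr_magneton():
    """Return q_e * hbar / (2 m), in joule per tesla."""
    return Q_E * HBAR / (2 * M_E)

def magnetic_energy(b_tesla, up=True):
    """Return +mu*B or -mu*B, in joule.

    Which of the two states gets the plus sign is a convention;
    the 'up' flag here is just a label for the two cases.
    """
    sign = 1 if up else -1
    return sign * bohr_magneton() * b_tesla

print(f"{bohr_magneton():.4e} J/T")            # roughly 9.274e-24 J/T
print(f"{magnetic_energy(1.0):.4e} J at 1 T")
```

So even in a one-tesla field, the two energy levels are only about 2×10⁻²³ joule apart.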

Well… Let’s first discuss the equivalent of a Stern-Gerlach apparatus for photons. That would be a polarizing material, like a piece of calcite, for example. Now, it is, unfortunately, much more difficult to explain how a polarizing material works than to explain how a Stern-Gerlach apparatus works. [If you thought the workings of that (hypothetical) Stern-Gerlach filter were difficult to understand, think again.] We actually have different types of polarizers – some complicated, some easy. We’ll take the easy ones: linear ones. In addition, the phenomenon of polarization itself is a bit intricate. The phenomenon is well described in Chapter 33 of Feynman’s first Volume of Lectures, out of which I copied the two illustrations below the next paragraph.

Of course, to make sure you think about whatever it is that you’re reading, Feynman now chooses the z-direction such that it coincides with the direction of propagation of the electromagnetic radiation. So it’s now the x- and y-direction that we’re looking at. Not the z-direction any more. As usual, we forget about the magnetic field vector B and so we think of the oscillating electric field vector E only. Why can we forget about B? Well… If we have E, we know B. Full stop. As you know, I think B is pretty essential in the analysis too but… Well… You’ll see all textbooks on physics quickly forget about B when describing light. I don’t want to do that, but… Well… I need to move on. [I’ll come back to the matter – sideways – at the end of this post. :-)]

So we know the electric field vector E may oscillate in a plane (so that’s up and down and back again) but – interestingly enough – its direction may also rotate around the z-axis (again, remember the z-axis is the direction of propagation). Why? Well… Because E has an x- and a y-component (no z-component!), and these two components may oscillate in phase or out of phase, and so all of the combinations below are possible.

To make a long story short, light comes in two varieties: linearly polarized and elliptically polarized. Of course, elliptically may be circularly – if you’re lucky! :-)

Now, a (linear) polarizer has an optical axis, and only light whose E vector is oscillating along that axis will go through. […] OK. That’s not true: the component along the optical axis of some E pointing in some other direction will go through too! I’ll show how that works in a moment. But so all the rest is absorbed, and the absorbed energy just heats up the polarizer (which, of course, then radiates heat back out).

In any case, if the optical axis happens to be our x-axis, then we know that the light that comes through will be x-polarized, so that corresponds to the rather peculiar Ex = 1 and Ey = 0 notation. [This notation refers to coefficients we’ll use later to resolve states into base states – but don’t worry about it now.] Needless to say, you shouldn’t confuse the electric field vector E with the energy of our photon, which we denote as E. No bold letter here. No subscript. :-)

Pfff… This introduction is becoming way too long. What about our photon? We want to talk about one photon only and we’ve already written over a page and haven’t started yet. :-)

Well… First, we must note that we’ll assume the light is perfectly monochromatic, so all photons will have an energy that’s equal to E = h·f, so the energy is proportional to the frequency of our light, and the constant of proportionality is Planck’s constant. That’s Einstein’s relation, not a de Broglie relation. Just remember: we’re talking definite energy states here.

Second – and much more importantly – we may define two base states for our photon, |x⟩ and |y⟩ respectively, which correspond to the classical linear x- and y-polarization. So a photon can be in state |x⟩ or |y⟩ but, as usual, it is much more likely to be in some state that is some linear combination of these two base states.

OK. Now we can start playing with these ideas. Imagine a polarizer – or polaroid, as Feynman calls it – whose optical axis is tilted – say, it’s at an angle θ from the x-axis, as shown below. Classically, the light that comes through will be polarized in the x’-direction, which we associate with that angle θ. So we say the photons will be in the |x’⟩ state.

So far, so good. But what happens if we have two polarizers, set up as shown below, with the optical axis of the first one at an angle θ, which is, say, equal to 30°? Will any light get through?

Well? No answer? […] Think about it. What happens classically? […] No answer? Let me tell you. In a classical analysis, we’d say that only the x-component of the light that comes through the first polarizer would get through the second one. Huh? Yes. It is not all or nothing in a classical analysis. This is where the magnitude of E comes in, which we’ll write as E0, so as to not confuse it with the energy E. [I know you’ll confuse it anyway but… Well… I need to move on or I won’t get anywhere with this story.] So if E0 is the (maximum) magnitude (or amplitude – in the classical sense of the word, that is) of E as the light leaves the first polarizer, then its x-component will be equal to E0·cosθ. [I don’t need to make a drawing here, do I?] Of course, you know that the intensity of the light will be proportional to the square of the (maximum) field, which is equal to E0²·cos²θ = 0.75·E0² for θ = 30°.

So our classical theory says that only 3/4 of the energy that we were sending in will get through. The rest (1/4) will be absorbed. So how do we model that quantum-mechanically? It’s amazingly simple. We’ve already associated the |x’⟩ state with the photons coming out of the first polaroid, and so now we’ll just say that this |x’⟩ state is equal to the following linear combination of the |x⟩ and |y⟩ base states:

|x’⟩ = cosθ·|x⟩ + sinθ·|y⟩

Huh? Yes. As Feynman puts it, we should think our |x’⟩ beam of photons can, somehow, be resolved into |x⟩ and |y⟩ beams. Of course, we’re talking amplitudes here, so we’re talking ⟨x|x’⟩ and ⟨y|x’⟩ amplitudes here, and the absolute square of those amplitudes will give us the probability that a photon in the |x’⟩ state gets into the |x⟩ and |y⟩ state respectively. So how do we calculate that? Well… If |x’⟩ = cosθ·|x⟩ + sinθ·|y⟩, then we can obviously write the following:

⟨x|x’⟩ = cosθ·⟨x|x⟩ + sinθ·⟨x|y⟩

Now, we know that ⟨x|y⟩ = 0, because |x⟩ and |y⟩ are base states. For the same reason, ⟨x|x⟩ = 1. That’s just an implication of the definition of base states: ⟨i|j⟩ = δij. So we get:

⟨x|x’⟩ = cosθ

Lo and behold! The absolute square of that is equal to cos²θ, so each of these photons has an (average) probability of 3/4 to get through. So if we were to have like 10 billion photons, then some 7.5 billion of them would get through. As these photons are all associated with a definite energy – and they go through as one whole, of course (no such thing as a 3/4 photon!) – we find that 3/4 of all of the energy goes through. The quantum-mechanical theory gives the same result as the classical theory – as it should, in this case at least!
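The arithmetic above is simple enough to check in a few lines. This is a minimal sketch: the function names are made up for the illustration, and the angle is taken in degrees, measured from the x-axis.

```python
import math

def x_amplitude(theta_deg):
    """Amplitude <x|x'> for a photon polarized at theta degrees from the x-axis."""
    return math.cos(math.radians(theta_deg))

def transmission_probability(theta_deg):
    """Probability that an |x'> photon passes an x-polarizer: |<x|x'>|^2."""
    return abs(x_amplitude(theta_deg)) ** 2

p = transmission_probability(30)
n_photons = 10_000_000_000           # the "10 billion photons" of the text
print(round(p, 6))                   # close to 0.75
print(round(p * n_photons))          # expected number that get through
```

The same number comes out of the classical calculation, where the intensity goes as E0²·cos²θ.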

Now that’s all good for linear polarization. What about elliptical or circular polarization? Hmm… That’s a bit more complicated, but equally feasible. If we denote the state of a photon with a right-hand circular polarization (RHC) as |R⟩ and, likewise, the state of a photon with a left-hand circular polarization (LHC) as |L⟩, then we can write these as the following linear combinations of our base states |x⟩ and |y⟩:

That’s where those coefficients under illustrations (c) and (g) come in, although I think they’ve got the sign of i (the imaginary unit) wrong. :-) So how does it work? Well… That 1/√2 factor is – obviously – just there to make sure everything’s normalized, so all probabilities over all states add up to 1. So that is taken care of and now we just need to explain how and why we’re adding |x⟩ and |y⟩. For |R⟩, the amplitudes must be the same but with a phase difference of 90°. That corresponds to the sine and cosine function, which are the same except for a phase difference of π/2 (90°), indeed: sin(φ + π/2) = cosφ. Now, a phase shift of 90° corresponds to a multiplication with the imaginary unit i. Indeed, i = e^(i·π/2) and, therefore, it is obvious that e^(i·π/2)·e^(i·φ) = e^(i·(φ + π/2)).
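A quick numerical check of the two claims above (the 1/√2 normalization and the 90° phase shift) can be sketched as follows. Note the assumption: I take |R⟩ = (|x⟩ + i·|y⟩)/√2 and |L⟩ = (|x⟩ − i·|y⟩)/√2, which is one common sign convention; swapping the sign of i swaps the two labels but changes nothing else in the check. The helper name `inner` is made up.

```python
import cmath
import math

def inner(a, b):
    """Amplitude <a|b> for states given as (x, y) pairs of complex coefficients."""
    return sum(ai.conjugate() * bi for ai, bi in zip(a, b))

s = 1 / math.sqrt(2)
R = (complex(s), 1j * s)     # assumed convention: (|x> + i|y>)/sqrt(2)
L = (complex(s), -1j * s)    # assumed convention: (|x> - i|y>)/sqrt(2)

print(abs(inner(R, R)), abs(inner(L, L)))   # both close to 1: normalized
print(abs(inner(R, L)))                     # close to 0: orthogonal

# Multiplying by i shifts the phase by pi/2 (90 degrees).
phi = 0.3                                   # arbitrary test angle, in radians
shifted = 1j * cmath.exp(1j * phi)
print(abs(cmath.phase(shifted) - (phi + math.pi / 2)) < 1e-12)

# Adding the two circular states and renormalizing gives back |x>.
x_from_RL = tuple(s * (r + l) for r, l in zip(R, L))
print(x_from_RL)
```

The orthogonality is what allows the circular states to serve as a set of base states in their own right.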

Of course, if we can write RHC and LHC states as a linear combination of the base states |x⟩ and |y⟩, then you’ll believe me if I say that we can write any polarization state – including non-circular elliptical ones – as a linear combination of these base states. Now, there are two or three other things I’d like to point out here:

1. The RHC and LHC states can be used as base states themselves – so they satisfy all of the conditions for a set of base states. Indeed, it’s easy to add and then subtract the two equations above to get the following:

   As an exercise, you should verify the right and left polarization states effectively satisfy the conditions for a set of base states.

2. We can also rotate the xy-plane around the z-axis (as mentioned, that’s the direction of propagation of our beam) and use the resulting |x’⟩ and |y’⟩ states as base states. In short, as Feynman puts it: “You can resolve light into x- and y-polarizations, or into x’- and y’-polarizations, or into right and left polarizations as a basis.” These pairs are always orthogonal and also satisfy the other conditions we’d impose on a set of base states.

3. The last point I want to make here is much more enigmatic but, as far as I am concerned – by far – the most interesting of all of Feynman’s Lecture on this topic. It’s actually just a footnote, but I am very excited about it. So… Well… What is it?

Well… Feynman does the calculations to show what a circularly polarized photon looks like when we rotate the coordinates around the z-axis, and shows the phase of the right and left polarized states effectively keeps track of the x- and y-axes, so all of our “right-hand” rules don’t get lost somehow. He compares this analysis to an analysis he actually did – in a much earlier Lecture (in Chapter 5) – for spin-one particles. But, of course, here we’ve been analyzing the photon as a two-state system, right?

So… Well… Don’t we have a contradiction here? If photons are spin-one particles, then they’re supposed to be analyzed in terms of three base states, right? Well… I guess so… But then Feynman adds a footnote – with a very important remark:

#### “The photon is a spin-one particle which has, however, no ‘zero’-state.”

Why am I noting that? Because it confirms my theory about photons – force-particles – being different from matter-particles not only because of the different rules for adding amplitudes, but also because we get two wavefunctions for the price of one and, therefore, twice the energy for every oscillation! And so we’ll also have a distance of two Planck units between the equivalent of the “up” and “down” states of the photon, rather than one Planck unit, like what we have for the angular momentum for an electron.

I described the gist of my argument in my e-book, which you’ll find under another tab of this blog, and so I’ll refer you there. However, in case you’re interested, the summary of the summary is as follows:

1. We can think of a photon having some energy that’s equal to E = p = m (assuming we choose our time and distance units such that c = 1), but that energy would be split up in an electric and a magnetic wavefunction respectively: ψE and ψB.
2. Now, Schrödinger’s equation would then apply to both wavefunctions, but the E, p and m in those two wavefunctions are the same and not the same: their numerical value is the same (pE = EE = mE = pB = EB = mB), but they’re conceptually different. [They must be: I showed that, if they aren’t, then we get a phase and group velocity for the wave that doesn’t make sense.]

It is then easy to show that – using the B = i·E relation between the magnetic and the electric field vectors – we find a composite wavefunction for our photon which we can write as:

E + B = ψE + ψB = E + i·E = √2·e^(i(p·x/2 − E·t/2 + π/4)) = √2·e^(iπ/4)·e^(i(p·x/2 − E·t/2)) = √2·e^(iπ/4)·E

The whole thing then becomes:

ψ = ψE + ψB = √2·e^(i(p·x/2 − E·t/2 + π/4)) = √2·e^(iπ/4)·e^(i(p·x/2 − E·t/2))

So we’ve got a √2 factor here in front of our combined wavefunction for our photon which, knowing that the energy is proportional to the square of the amplitude, gives us twice the energy we’d associate with a regular amplitude… [With “regular”, I mean the wavefunction for matter-particles – fermions, that is.] So… Well… That little footnote of Feynman seems to confirm I really am on to something. Nice! Very nice, actually! :-)

# Davidson’s function

This post has got nothing to do with quantum mechanics. It’s just… Well… My son – who’s preparing for his entrance examinations for engineering studies – sent me a message yesterday asking me to quickly explain Davidson’s function – as he has to do some presentation on it as part of a class assignment. As I am an economist – and Davidson’s function is used in transport economics – he thought I would be able to help him out quickly, and I was. So I just thought it might be interesting to quickly jot down my explanation as a post in this blog. It won’t help you with quantum mechanics but, if anything, it may help you think about functional forms and some related topics.

In his message, he sent me the function – copied below – and some definitions of the variables which he got from some software package he had seen or used – at least that’s what he told me. :-)

So… This function tells us that the dependent variable is the travel time t, and that it is seen as a function of some independent variable x and some parameters t0, c and ε. My son defined the variable x as the flow (of vehicles) on the road, and c as the capacity of the road. To be precise, he wrote the formula that was to be used for c as follows:

What about a formula for x? Well… He said that was the actual flow of vehicles, but he had no formula for it. As for t0, that was the travel time “at free speed.” Finally, he said ε was a “paramètre de sensibilité de congestion.” Sorry for the French, but that’s the language of his school, which is located in some town in southern Belgium. In English, we might translate it as a congestion sensitivity coefficient. And so that’s what he struggled most with – or so he said.

So that got us started. I immediately told him that, if you write something like c − x, then you’d better make sure c and x have the same physical dimension. The formula above tells us that c is the number of vehicles that you can park on that road. Bumper to bumper. So I told him that’s a rather weird definition of capacity. It’s definitely not the dimension of flow: the flow should be some number per second or – much more likely in transport economics – per minute or per hour. So I told him that he should double-check those definitions of x and c, and that I’d get back to him to explain the formula itself after I had googled and read some articles on it. So I did that, and so here’s the full explanation I gave him.

While there’s some pretty awesome theory behind it (queuing theory and all that), which transportation gurus take very seriously – see, for example, the papers written by Rahmi Akçelik – a quick look at it all reveals that Davidson’s function is, essentially, just a specific functional form that we impose on some real-life problem. So I’d call it an empirical function: there’s some theory behind it, but it’s more based on experience than pure theory. Of course, sound logic is – or should be – applied to both empirical as well as to purely theoretical functions, but… Well… It’s a different approach than, say, modeling the dynamics of quantum-mechanical state changes. :-) Just note, for example, that we might just as well have tried something else – some exponential function. Something like this, for example:

Davidson’s function is, quite simply, just nicer and easier than the one above, because the function above is not linear. It could be quadratic (β = 2), or whatever, but surely not linear. In contrast, Davidson’s function is linear in its parameter ε and, therefore, easy to fit onto actual traffic data using the simplest of simple linear regression models – and, speaking from experience, most engineers and economists in a real-life job can barely handle even that! :-)

So just look at that x/(c−x) factor as measuring the congestion or saturation, somehow. We’ll denote it by s. If you can sort of accept that, then you’ll agree that Davidson’s function tells us that the extra time that’s needed to drive from some place a to some place b along our road will be directly proportional to:

1. That congestion factor x/(c−x), about which I’ll write more in a moment;
2. The free-speed or free-flow travel time t0 – which I’ll call the free-flow travel time from now on, rather than the free-speed travel time, because there’s no such thing as free speed in reality: we have speed limits – or safety limits, or scared moms in the car, whatever – and, a more authoritative argument, the literature on Davidson’s function also talks about free flow rather than free speed;
3. That epsilon factor (ε), which – of all the stuff I presented so far – mystified my son most.

So the formula for the extra travel time that’s needed is, obviously, equal to:

t − t0 = ε·t0·x/(c−x)

So we have a very simple linear functional form for the extra travel time, and we can easily estimate the actual value of our ε parameter using actual traffic data in a simple linear regression. The data analysis toolkit of MS Excel will do stuff like this – if you have the data, of course – so you don’t need a sophisticated statistical software package here.
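If you don’t have Excel at hand, the same regression takes only a few lines of code. The sketch below assumes the function has the form t = t0·[1 + ε·x/(c−x)], which is consistent with the numerical example used later in this post. The function names and the data points are made up for the illustration, and t0 and c are assumed to be known in advance, so ε is the only parameter left to estimate.

```python
def davidson_travel_time(x, t0, c, eps):
    """Travel time for density (or flow) x, free-flow time t0, capacity c."""
    if not 0 <= x < c:
        raise ValueError("need 0 <= x < c")
    return t0 * (1 + eps * x / (c - x))

def estimate_eps(xs, ts, t0, c):
    """Least-squares slope through the origin.

    Regresses the extra travel time (t - t0) on the regressor t0*x/(c - x);
    the slope of that line is the estimate of eps.
    """
    z = [t0 * x / (c - x) for x in xs]
    y = [t - t0 for t in ts]
    return sum(zi * yi for zi, yi in zip(z, y)) / sum(zi * zi for zi in z)

# Made-up, noise-free observations generated with eps = 0.5, t0 = 2, c = 10.
xs = [1, 3, 5, 7, 9]
ts = [davidson_travel_time(x, 2, 10, 0.5) for x in xs]
print(estimate_eps(xs, ts, 2, 10))   # recovers a value very close to 0.5
```

With real traffic data the points will scatter around the line, of course, and the fit gives the ε that minimizes the squared errors.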

So that’s it, really: Davidson’s function is, effectively, just nice and easy to work with. […] Well… […] Of course, we still need to define what x and c actually are. And what’s that so-called free flow (or free speed?) travel time? Well… The free-flow travel time is, obviously, the time you need to go from a to b at the free-flow speed. But what’s the free-flow speed? My friend’s Maserati is faster than my little Santro. :-) And are we allowed to go faster than the maximum authorized speed? Interesting questions.

So that’s where the analysis becomes interesting, and why we need better definitions of x and c. If c is some density – what my son’s rather nonsensical formula seems to imply – we may want to express it per unit distance. Per kilometer, for example. So we should probably re-define c more simply: as the number of lanes divided by the average length of the vehicles that are using it. We get that by dividing the c above by the length of the road – so we divide the length of the road by the length of the road, which gives 1. :-) You may think that’s weird, because we get something like 3/5 = 0.6 (three lanes, and vehicles that are five meters long)… So… What? Well… Yes. 0.6 vehicles per meter, so that’s 600 vehicles per kilometer! Does that sound OK? I think it does. So let’s express that capacity (c) as a maximum density – for the time being, at least.

Now, none of those cars can move, of course: they are all standing still. Bumper to bumper. It’s only when we decrease the density that they’re able to move. In fact, you can – and should – visualize the process: the first car moves and opens a space of, say, one or two meters, and then the second one, and so on and so on – till all cars are moving with a few meters in-between them. So the density will obviously decrease and, as a result, we’re getting some flow of vehicles here. If there are three meters between them, for example, then the density goes down to 3/8 vehicles per meter, so that’s 375 vehicles per kilometer. Still a lot, and you’ll have to agree that – with only 3 meters between them – they’ll probably only move very slowly!

You get the idea. We can now define x as a density too – some density x that is smaller than the maximum density c. Then that x/(c−x) factor – measuring the saturation – obviously makes a lot of sense. The graph below shows what it looks like for c = 5. [The value of 5 is just random, and its order of magnitude doesn’t matter either: we can always re-scale from m to km, or from seconds to minutes and what have you. So don’t worry about it.] Look at this example: when x is small – like 1 or 2 only – then x/(5−x) doesn’t increase all that much. So that means we add little to the travel time. Conversely, when x approaches c = 5 – so that’s the limit (as you can see, the x = 5 line is a (vertical) asymptote of the function) – then the travel time becomes huge and starts approaching infinity. So… Well… Yes. That’s when all cars are standing still – bumper to bumper.

But so what’s the free-flow speed? Is it the maximum speed of my friend’s Maserati – which is like 275 km/h? Well… I don’t think my friend ever drove that fast, so probably not. What else? Think about it. What should we choose here? The obvious choice is the speed limit: 120 km/h, or 90 km/h, or 60 km/h – or whatever. Why? Because you don’t want a ticket, I guess… In any case, let’s analyze that question later. Let’s first look at something else.

Of course, you’ll want to keep some distance between you and the car in front of you when driving at relatively high speeds, and that’s the crux of the analysis really. You may or, more likely, you may not remember that your driving instructor told you to always measure the safety distance between you and the car(s) in front in seconds rather than in meters. In Belgium, we’re told to stay two seconds away from the car in front of us. So when it passes a light pole, we’ll count “twenty-one, twenty-two” and… Well… If we pass that same light pole while we’re still counting those two seconds, then we’d better keep some more distance. It’s got to do with reaction time: when the car in front of you slams the brakes, you need some time to react, and then that car might also have better brakes than yours, so you want to build in some extra safety margin in case you don’t slow down as fast as the car in front of you. So that two-seconds rule is not about the braking distance really – or not about the braking distance alone. No. It’s more about the reaction time. In any case, the point is that you’ll want to measure the safety distance in time rather than in meters. Capito? OK… Onwards…

Now, 120 km/h amounts to 120,000/3,600 = 33.333 meters per second. So the safety distance here is almost 67 meters! If the maximum authorized velocity is only 90 km/h, then the safety distance shrinks to 2 × (90,000/3,600) = 50 meters. For a maximum authorized velocity of 60 km/h, the safety distance would be equal to 33.333 meters. All of these are much larger distances than the average length of the vehicles and, hence, it’s basically the safety distance – not the length of the vehicle – that we need to consider! Let’s quickly calculate the related densities:

• For a three-lane highway, with all vehicles traveling at 120 km/h and keeping their safety distance, the density will be equal to 3·1,000/66.666… = 45 vehicles per kilometer of highway, so that’s 15 vehicles per lane.
• If the travel speed is 90 km/h, then the density will be equal to 60 vehicles per km (20 vehicles per lane).
• Finally, at 60 km/h, the density will be 90 vehicles per km (30 vehicles per lane).

Note that our two-seconds rule implies a linear relation between the safety distance and the maximum authorized speed. You can also see that the relation between the density and the maximum authorized speed is inversely proportional: if we halve the speed, the density doubles.
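The three densities above follow from one and the same little calculation, which can be sketched as follows. The function names are made up; the three lanes and the two-second headway are the assumptions of the text, and the length of the vehicles themselves is neglected, as above.

```python
def safety_distance_m(speed_kmh, headway_s=2.0):
    """Distance, in meters, covered during the headway time at the given speed."""
    return speed_kmh * 1000 / 3600 * headway_s

def density_per_km(speed_kmh, lanes=3, headway_s=2.0):
    """Vehicles per kilometer of highway when everyone keeps the safety distance."""
    return lanes * 1000 / safety_distance_m(speed_kmh, headway_s)

for v in (120, 90, 60):
    print(v, "km/h:",
          round(safety_distance_m(v), 1), "m,",
          round(density_per_km(v)), "vehicles per km")
```

Halving the speed from 120 to 60 km/h halves the safety distance and, hence, doubles the density from 45 to 90 vehicles per km.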

Now, you can easily come up with some more formulas, and play around a bit. For example, if we denote the safety distance by d, and the mentioned two seconds as td – so that’s the time (t) that defines the safety distance d – then d is, logically, equal to: d = td∙vmax. But rather than trying to find more formulas and play with them, let’s think about that concept of flow now. If we would want to define the capacity – or the actual flow – in terms of the number of vehicles that are passing along any point along this highway, how should we calculate that?

Well… The flow is the number of vehicles that will pass us in one hour, right? So if vmax is 120 km/h, then – assuming full capacity – all the vehicles on the next 120 km of highway will all pass us, right? So that makes 45 vehicles per km times 120 km = 5,400 vehicles – per hour, of course. Hence, the flow is just the product of the density times the speed.

Now, look at this: if vmax is equal to 90 km/h, then we’ll have 60 vehicles per km times 90 km = … Well… It’s – interestingly enough – the same number: 5,400 vehicles per hour. Let’s calculate for vmax = 60 km/h… The safety distance is 33.333 meters, so we can have 90 vehicles on each km of highway which means that, over one hour, 90 times 60 = 5,400 vehicles will pass us! It’s, once again, the same number: 5,400! Now that’s a very interesting conclusion. Let me highlight it:

If we assume the vehicles will keep some fixed time distance between them (e.g. two seconds), then the capacity of our highway – expressed as some number of vehicles passing along it per time unit – does not depend on the velocity.

So the capacity – expressed as a flow rather than as a density – is just a fixed number: x vehicles per hour. The density affects only the (average) speed of all those vehicles. Hence, increasing densities are associated with lower speeds, and higher travel times, but they don’t change the capacity.
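The reason is easy to see once you write it out: the density is inversely proportional to the speed, so the speed cancels in the product. Here is a small sketch, with made-up function names and the same assumptions as before (three lanes, a two-second headway, vehicle length neglected).

```python
def density_per_km(speed_kmh, lanes=3, headway_s=2.0):
    """Vehicles per km when every vehicle keeps a fixed time headway."""
    spacing_km = speed_kmh * headway_s / 3600    # distance covered in one headway
    return lanes / spacing_km

def flow_per_hour(speed_kmh, lanes=3, headway_s=2.0):
    """Flow = density times speed, in vehicles per hour."""
    return density_per_km(speed_kmh, lanes, headway_s) * speed_kmh

for v in (60, 90, 120, 180):
    print(v, "km/h:", round(flow_per_hour(v)), "vehicles per hour")

# The speed drops out: each lane lets one vehicle pass per headway interval,
# so the flow is simply lanes * 3600 / headway_s.
print(3 * 3600 / 2.0)
```

Every line gives the same 5,400 vehicles per hour, including the 180 km/h case, which corresponds to the 30 vehicles per km mentioned further down.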

It’s really a rather remarkable conclusion, even if the relation between the density and the flow is easily understood – both mathematically and, more importantly, intuitively. For example, if the density goes down to 60 vehicles per km of highway, then they will only be able to move at a speed of 90 km/h, but we’ll still have that flow of 5,400 vehicles per hour – which we can look at as the capacity but expressed as some flow rather than as a density. Lower densities allow for even higher speeds: we calculated above that a density of 45 vehicles per km would allow them to drive at a maximum speed of 120 km/h, so travel time would be reduced even more, but we’d still have 5,400 vehicles per hour! So… Well… Yes. It all makes sense.

Now what happens if the density is even lower, so we could – theoretically – drive safely, or not so safely, at some speed that’s way above the speed limit? If we have enough cars – say 30 vehicles per km, but all driving more than 120 km/h, while respecting the two-seconds rule – we’d still have the same flow: 5,400 vehicles per hour. And travel time would go down. But so we can think of lower densities and higher speeds but, again, there’s got to be some limit here – a speed limit, safety considerations, a limit to what our engine or our car can do, and, finally, there’s the speed of light too. :-) I am just joking, of course, but I hope you see the point. At some point, it doesn’t matter whether or not the density goes down even further: the travel time should hit some minimum. And it’s that minimum – the lowest possible travel time – that you’d probably like to define as t0.

As mentioned, the minimum travel time is associated with some maximum speed, and – after some consideration of the possible candidates for the maximum speed – you’ll agree the speed limit is a better candidate than the 275 km/h limit of my friend’s Maserati Quattroporte. Likewise, you would probably also like to define x0 as the (maximum) density at the speed limit.

What we’re saying here is that – in theory at least – our t = t(x) function should start with a flat linear section, between x = 0 and x = x0. That section covers the densities 0 < x < x0 which are compatible with us driving at the speed limit – say, 120 km/h – and, hence, with us only needing the time t = t0 to arrive at our destination. Only when x becomes larger than x0 do we have to reduce speed – below the speed limit (say, 120 km/h) – to keep the flow going while keeping an appropriate safety distance. A reduction of speed implies an increase in travel time, of course. So that’s what’s illustrated in the graph below.

To be specific, if the speed limit is 120 km/h, then – assuming you don’t want to be caught speeding – the minimum travel time will always be equal to 30 seconds per km, even if you’re alone on the highway. Now, as long as the density is less than 45 vehicles per km, you can keep that travel time the same, because you can do your 120 km/h while keeping the safety distance. But if the density increases, above 45 vehicles per km, then stuff starts slowing down because everyone is uncomfortable with the shorter distance between them and the car in front. As the density goes up even more – say, to 60 vehicles per km – we can only do 90 km/h, and so the travel time will then be equal to 40 seconds per km. And then it goes to 90 vehicles per km, so speed slows down to 60 km/h, and so that’s a travel time of 60 seconds per km. Of course, you’re smart – very smart – and so you’ll immediately say this implies that the second section of our graph should be linear too, like this:

You’re right. But then… Well… That doesn’t work with our limit for x, which is c. As I pointed out, c is an absolute maximum density: you just can’t park any more cars on that highway – unless you fold them up or so. :-) So what’s the conclusion? Well… We may think of the Davidson function as a primitive combination of both shapes, as shown below.
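For reference, the broken line implied by the two-seconds rule (flat up to x0, then linear) can be tabulated as follows. This is a sketch of that piecewise shape only, not of Davidson’s function; the function name is made up, and the defaults are the values used in the text (a 120 km/h limit, three lanes, a two-second headway).

```python
def travel_time_s_per_km(density, speed_limit_kmh=120, lanes=3, headway_s=2.0):
    """Travel time, in seconds per km, for a given density in vehicles per km."""
    capacity_flow = lanes * 3600 / headway_s     # vehicles per hour (5,400 here)
    x0 = capacity_flow / speed_limit_kmh         # highest density at the limit
    if density <= x0:
        speed = speed_limit_kmh                  # flat section: t = t0
    else:
        speed = capacity_flow / density          # speed drops as density rises
    return 3600 / speed

for x in (10, 45, 60, 90):
    print(x, "vehicles per km:", round(travel_time_s_per_km(x), 1), "s per km")
```

Above x0 = 45 the travel time works out to (2/3)·x seconds per km, so the second section is indeed a straight line, and one that never blows up at x = c.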

I call it a primitive approximation, because that Davidson function (so that’s the green smooth curve above) is not a precise (linear or non-linear) combination of the two functions we presented (I am talking about the blue broken line and the smooth red curve here). It’s just… Well… Some primitive approximation. 🙂 Now you can write some very complicated papers – as other authors do – to sort of try to explain this shape, but you’ll find yourself fiddling with variable time distance rules and other hypotheses that may or may not make sense. In short, you’re likely to introduce other logical inconsistencies when trying to refine the model. So my advice is to just accept Davidson’s function as some easy empirical fit to some real-life situation, and think of what the parameters actually do – mathematically speaking, that is. How do they change the shape of our graph?

So we’re now ready to explain that epsilon factor (ε) by looking at what it does, indeed. Please try an online graphing tool with a slider (e.g. https://www.desmos.com/calculator) – just type something like a + bx in the function box, and you’ll see the sliders appear – so you can see how the function changes for different parameter values. The two graphs below, for example, which I made using that graphing tool, show you the function t = 2 + 2·ε·x/(10 − x) for ε = 0.5 and ε = 10 respectively. As you can see, both functions start at t = 2 and have the same asymptote at x = c = 10. However, you’ll agree that they look very different – and that’s because of the value of the ε parameter. For ε = 0.5, the travel time does not increase all that much – initially at least. Indeed, as you can see, t is equal to 3 if the density is half of the capacity (t = 3 for x = 5 = c/2). In contrast, for ε = 10, we have immediate saturation, so to speak: travel time goes through the roof almost immediately! For example, for x = 3, t ≈ 10.6, so while the density is less than a third of the capacity, the associated travel time is already more than five times the free-flow travel time!
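If you don’t have a graphing tool at hand, a few lines of code do the same job. This is just a sketch of the example function above, with t0 = 2 and c = 10 as defaults; the function name is mine:

```python
def davidson_example(x, eps, t0=2.0, c=10.0):
    """Travel time t = t0 + t0*eps*x/(c - x), defined for 0 <= x < c."""
    if not 0 <= x < c:
        raise ValueError("x must lie between 0 (included) and c (excluded)")
    return t0 + t0 * eps * x / (c - x)


for eps in (0.5, 10):
    for x in (0, 3, 5, 9):
        t = davidson_example(x, eps)
        print(f"eps = {eps:>4}, x = {x}: t = {t:.1f}")
```

For ε = 0.5 and x = 5 this gives t = 3, and for ε = 10 and x = 3 it gives t ≈ 10.6, the two values quoted above.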

Now I have a tricky question for you: does it make sense to allow ε to take on values larger than one? Think about it. 🙂 In any case, now you’ve seen what the ε factor does from a math point of view. So… Well… I’ll conclude here by just noting that it does, indeed, make sense to refer to ε as a “paramètre de sensibilité de congestion”, because that’s what it is: a congestion sensitivity coefficient. Indeed, it’s not the congestion or saturation parameter itself (that’s a term we should reserve for the x/(c − x) factor), but a congestion sensitivity coefficient alright!

Of course, you will still want some theoretical interpretation. Well… To be honest, I can’t give you one. I don’t want to get lost in all of those theoretical excursions on Davidson’s function, because… Well… It’s no use. That ε is just what it is: it’s a proportionality coefficient that we are imposing upon our functional form for our travel-time function. You can sum it up as follows:

If x/(c − x) is the congestion parameter (or variable, I should say), then it goes from 0 to ∞ (infinity) when the traffic flow (x) goes from x = 0 to x = c (full capacity). So, yes, we can call the x/(c − x) factor the congestion or saturation variable and write it as s = x/(c − x). And then we can refer to ε as the “paramètre de sensibilité de congestion”, because it is a measure not of the congestion itself, but of the sensitivity of the travel time to the congestion.

If you’d absolutely want some mathematical formula for it, then you could use this one, which you get from re-writing Δt = t0·ε·s as Δt/t0 = ε·s:

∂(Δt/t0)/∂s = ε

But… Frankly. You can stare at this formula for a long while – it’s a derivative alright, and you know what derivatives stand for – but you’ll probably learn nothing much from it. [Of course, please write me if you don’t agree, Vincent!] I just looked at those two graphs, and noted how their form changes as a function of ε. Perhaps you have some brighter idea about it!
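For what it’s worth, here is a minimal numerical check of that derivative. It is only a sketch: the helper names are mine, and the values t0 = 2, c = 10 and ε = 0.5 are borrowed from the graphing example. It takes two densities, computes the change in relative delay Δt/t0, and divides it by the change in the congestion variable s:

```python
def congestion(x, c):
    """Congestion (saturation) variable s = x/(c - x)."""
    return x / (c - x)


def davidson(x, eps, t0, c):
    """Travel time t = t0*(1 + eps*s), with s = x/(c - x)."""
    return t0 * (1 + eps * congestion(x, c))


def sensitivity(x1, x2, eps, t0, c):
    """Change in relative delay per unit change in s between x1 and x2."""
    d_rel = (davidson(x2, eps, t0, c) - davidson(x1, eps, t0, c)) / t0
    d_s = congestion(x2, c) - congestion(x1, c)
    return d_rel / d_s


print(sensitivity(3.0, 5.0, eps=0.5, t0=2.0, c=10.0))
```

Whatever pair of densities you pick below c, the ratio comes out as ε (up to rounding), which is all the formula says: the relative delay is a straight line in s, and ε is its slope.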

So… Well… I am done. You should now fully understand Davidson’s function. Let me write it down once more:

t = t0·[1 + ε·x/(c − x)] = t0·(1 + ε·s)

Again, as mentioned, its main advantage is its linearity. Because of its linearity, it is easy to actually estimate the parameters: it’s just a simple linear regression – using actual travel times and actual congestion measurements – and so then we can estimate the value of ε and see if it works. Huh? How do we see if it works? Well… I told you already: when everything is said and done, Davidson’s function is just one of the many models for the actual reality, so it tries to explain how travel time increases because of congestion. There are other models, which come with other functions – but they are more complicated, and so are the functions that come with them (check out that paper from Rahmi Akçelik, for example). Only reality can tell us what model is the best fit to whatever it is that we’re trying to model. So that’s why I call Davidson’s function an empirical function, and so you should check it against reality. That’s when a statistical software package might be handy: it allows you to test the fit of various functional – linear and non-linear – forms against a real-life data set.
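To give an idea of what such a regression might look like, here is a minimal sketch. It assumes the free-flow time t0 and the capacity c are already known, so that Δt/t0 = ε·s is a straight line through the origin and the least-squares slope is Σ(s·y)/Σ(s²). The function name is mine and the observations are made up: they are generated from the function itself with ε = 0.5, not measured on any road.

```python
def estimate_epsilon(flows, times, t0, c):
    """Least-squares slope through the origin of (t - t0)/t0 against s.

    Assumes t0 and c are known and every flow is below c.
    """
    s = [x / (c - x) for x in flows]        # congestion variable
    y = [(t - t0) / t0 for t in times]      # relative delay
    return sum(si * yi for si, yi in zip(s, y)) / sum(si * si for si in s)


# Hypothetical, noise-free observations for t0 = 2, c = 10 and eps = 0.5.
t0, c, true_eps = 2.0, 10.0, 0.5
flows = [1.0, 3.0, 5.0, 7.0, 9.0]
times = [t0 * (1 + true_eps * x / (c - x)) for x in flows]

print(estimate_epsilon(flows, times, t0, c))
```

With real data the points will not sit exactly on the line, of course, and that is where you would look at the residuals (or let a statistics package do it) to judge whether the functional form fits at all.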

So that’s it. I tasked my son to go through this post and correct any errors – only typos, I hope! – I may have made. I hope he’ll enjoy this little exercise. 🙂