The wavefunction in a medium: amplitudes as signals

We finally did what we had wanted to do for a while: we produced a paper on the meaning of the wavefunction and wave equations in the context of an atomic lattice (think of a conductor or a semiconductor here). Unsurprisingly, we came to the following conclusions:

1. The concept of the matter-wave traveling through the vacuum, an atomic lattice or any medium can be equated to the concept of an electric or electromagnetic signal traveling through the same medium.

2. There is no need to model the matter-wave as a wave packet: a single wave – with a precise frequency and a precise wavelength – will do.

3. If we do want to model the matter-wave as a wave packet rather than a single wave with a precisely defined frequency and wavelength, then the uncertainty in such wave packet reflects our own limited knowledge about the momentum and/or the velocity of the particle that we think we are representing. The uncertainty is, therefore, not inherent to Nature, but to our limited knowledge about the initial conditions or, what amounts to the same, what happened to the particle(s) in the past.

4. The fact that such wave packets usually dissipate very rapidly, reflects that even our limited knowledge about initial conditions tends to become equally rapidly irrelevant. Indeed, as Feynman puts it, “the tiniest irregularities tend to get magnified very quickly” at the micro-scale.

In short, as Hendrik Antoon Lorentz noted a few months before his demise, there is, effectively, no reason whatsoever “to elevate indeterminism to a philosophical principle.” Quantum mechanics is just what it should be: common-sense physics.

The paper confirms intuitions we had highlighted in previous papers already, but uses the formalism of quantum mechanics itself to demonstrate this.

PS: We put the paper on academia.edu and ResearchGate as well, but Phil Gibbs’ site has easy access (no log-in or membership required). Long live Phil Gibbs!

Rutherford’s idea of an electron

Pre-scriptum (dated 27 June 2020): Two illustrations in this post were deleted by the dark force. We will not substitute them. The reference is given and it will help you to look them up yourself. In fact, we think it will greatly advance your understanding if you do so. Mr. Gottlieb may actually have done us a favor by trying to pester us.

Electrons, atoms, elementary particles and wave equations

The New Zealander Ernest Rutherford came to be known as the father of nuclear physics. He was the first to provide a reliable estimate of the order of magnitude of the size of the nucleus. To be precise, in the 1921 paper which we will discuss here, he came up with an estimate of about 15 fm for massive nuclei, which is the current estimate for the size of a uranium nucleus. His experiments also helped to significantly enhance the Bohr model of an atom, culminating – just before WW I started – in the Bohr-Rutherford model of an atom (E. Rutherford, Phil. Mag. 27, 488).

The Bohr-Rutherford model of an atom explained the (gross structure of the) hydrogen spectrum perfectly well, but it could not explain its finer structure—read: the orbital sub-shells which, as we all know now (but not very well then), result from the different states of angular momentum of an electron and the associated magnetic moment.

The issue is probably best illustrated by the two diagrams below, which I copied from Feynman’s Lectures. As you can see, the idea of subshells is not very relevant when looking at the gross structure of the hydrogen spectrum because the energy levels of all subshells are (very nearly) the same. However, the Bohr model of an atom—which is nothing but an exceedingly simple application of the E = h·f equation (see p. 4-6 of my paper on classical quantum physics)—cannot explain the splitting of lines for a lithium atom, which is shown in the diagram on the right. Nor can it explain the splitting of spectral lines when we apply a stronger or weaker magnetic field while exciting the atoms so as to induce emission of electromagnetic radiation.

Schrödinger’s wave equation solves that problem—which is why Feynman and other modern physicists claim this equation is “the most dramatic success in the history of the quantum mechanics” or, more modestly, a “key result in quantum mechanics” at least!

Such dramatic statements are exaggerated. First, an even finer analysis of the emission spectrum (of hydrogen or whatever other atom) reveals that Schrödinger’s wave equation is also incomplete: the hyperfine splitting, the Zeeman splitting (anomalous or not) or the (in)famous Lamb shift are to be explained not only in terms of the magnetic moment of the electron but also in terms of the magnetic moment of the nucleus and its constituents (protons and neutrons)—or of the coupling between those magnetic moments (we may refer to our paper on the Lamb shift here). This cannot be captured in a wave equation: second-order differential equations are – quite simply – not sophisticated enough to capture the complexity of the atomic system here.

Also, as we pointed out previously, the current convention in regard to the use of the imaginary unit (i) in the wavefunction does not capture the spin direction and, therefore, makes abstraction of the direction of the magnetic moment too! The wavefunction therefore models theoretical spin-zero particles, which do not exist. In short, we cannot hope to represent anything real with wave equations and wavefunctions.

More importantly, I would dare to ask this: what use is an ‘explanation’ in terms of a wave equation if we cannot explain what the wave equation actually represents? As Feynman famously writes: “Where did we get it from? Nowhere. It’s not possible to derive it from anything you know. It came out of the mind of Schrödinger, invented in his struggle to find an understanding of the experimental observations of the real world.” Our best guess is that it, somehow, models (the local diffusion of) energy or mass densities as well as non-spherical orbital geometries. We explored such interpretations in our very first paper(s) on quantum mechanics, but the truth is this: we do not think wave equations are suitable mathematical tools to describe simple or complex systems that have some internal structure—atoms (think of Schrödinger’s wave equation here), electrons (think of Dirac’s wave equation), or protons (which is what some others tried to do, but I will let you do some googling here yourself).

We need to get back to the matter at hand here, which is Rutherford’s idea of an electron back in 1921. What can we say about it?

Rutherford’s contributions to the 1921 Solvay Conference

From what you know, and from what I write above, you will understand that Rutherford’s research focus was not on electrons: his prime interest was in explaining the atomic structure and in solving the mysteries of nuclear radiation – most notably the emission of alpha- and beta-particles as well as highly energetic gamma-rays by unstable or radioactive nuclei. In short, the nature of the electron was not his prime interest. However, this intellectual giant was, of course, very much interested in any experiment or theory that might contribute to his thinking, and that explains why, in his contribution to the 1921 Solvay Conference – which materialized as an update of his seminal 1914 paper on The Structure of the Atom – he devotes considerable attention to Arthur Compton’s work on the scattering of light from electrons which, at the time (1921), had not even been published yet (Compton’s seminal article on (Compton) scattering was published only in 1923).

It is also very interesting that, in the very same 1921 paper – whose 30 pages are several times the length of his 1914 article and later revisions of it (see, for example, the 1920 version of it, which actually has wider circulation on the Internet) – Rutherford also offers some short reflections on the magnetic properties of electrons while referring to Parson’s ring current model which, in French, he refers to as “l’électron annulaire de Parson.” Again, it is very strange that we should translate Rutherford’s 1921 remarks back into English – as we are sure the original paper must have been translated from English to French rather than the other way around.

However, it is what it is, and so here we do what we have to do: we give you a free translation of Rutherford’s remarks during the 1921 Solvay Conference on the state of research regarding the electron at that time. The reader should note these remarks are buried in a larger piece on the emission of β particles by radioactive nuclei which, as it turns out, are nothing but high-energy electrons (or their anti-matter counterpart—positrons). In fact, we should—before we proceed—draw attention to the fact that the physicists at the time had no clear notion of the concepts of protons and neutrons.

This is, indeed, another remarkable historical contribution of the 1921 Solvay Conference because, as far as I know, this is the first time Rutherford talks about the neutron hypothesis. It is quite remarkable he does not advance the neutron hypothesis to explain the atomic mass of atoms combining what we now think of as protons and neutrons (Rutherford regularly talks of a mix of ‘positive and negative electrons’ in the nucleus – neither the term proton nor the term neutron was in use at the time) but as part of a possible explanation of nuclear fusion reactions in stars or stellar nebulae. This is, indeed, his response to a question, raised during the discussions on Rutherford’s paper, on the possibility of nuclear synthesis in stars or nebulae from the French physicist Jean Baptiste Perrin who, independently from the American chemist William Draper Harkins, had proposed the possibility of hydrogen fusion two years before (1919):

“We can, in fact, think of enormous energies being released from hydrogen nuclei merging to form helium—much larger energies than what can come from the Kelvin-Helmholtz mechanism. I have been thinking that the hydrogen in the nebulae might come from particles which we may refer to as ‘neutrons’: these would consist of a positive nucleus with an electron at an exceedingly small distance (“un noyau positif avec un électron à toute petite distance”). These would mediate the assembly of the nuclei of more massive elements. It is, otherwise, difficult to understand how the positively charged particles could come together against the repulsive force that pushes them apart—unless we would envisage they are driven by enormous velocities.”

We may add that, just to make sure he gets this right, Rutherford is immediately requested to elaborate his point by the Danish physicist Martin Knudsen: “What’s the difference between a hydrogen atom and this neutron?” – which Rutherford simply answers as follows: “In a neutron, the electron would be very much closer to the nucleus.” In light of the fact that it was only in 1932 that James Chadwick would experimentally prove the existence of neutrons (and positively charged protons), we are, once again, deeply impressed by the foresight of Rutherford and the other pioneers here: the predictive power of their theories and ideas is, effectively, truly amazing by any standard – including today’s. I should, perhaps, also add that I fully subscribe to Rutherford’s intuition that a neutron should be a composite particle consisting of a proton and an electron – but that’s a different discussion altogether.

We must come back to the topic of this post, which we will do now. Before we proceed, however, we should highlight one other contextual piece of information here: at the time, very little was known about the nature of α and β particles. We now know that beta-particles are electrons, and that alpha-particles combine two protons and two neutrons. That was not known in the 1920s, however: Rutherford and his associates could basically only see positive or negative particles coming out of these radioactive processes. This further underscores how much knowledge they were able to gain from rather limited sets of data.

Rutherford’s idea of an electron in 1921

So here is the translation of some crucial text. Needless to say, the italics, boldface and additions between [brackets] are not Rutherford’s but mine, of course.

“We may think the same laws should apply in regard to the scattering [“diffusion”] of α and β particles. [Note: Rutherford noted, earlier in his paper, that, based on the scattering patterns and other evidence, the force around the nucleus must respect the inverse square law near the nucleus—moreover, it must also do so very near to it.] However, we see marked differences. Anyone who has carefully studied the trajectories [photographs from the Wilson cloud chamber] of beta-particles will note the trajectories show a regular curvature. Such curved trajectories are even more obvious when they are illuminated by X-rays. Indeed, A.H. Compton noted that these trajectories seem to end in a converging helical path turning right or left. To explain this, Compton assumes the electron acts like a magnetic dipole whose axis is more or less fixed, and that the curvature of its path is caused by the magnetic field [from the (paramagnetic) materials that are used].

Further examination would be needed to make sure this curvature is not some coincidence, but the general impression is that the hypothesis may be quite right. We also see similar curvature and helicity with α particles in the last millimeters of their trajectories. [Note: α-particles are, obviously, also charged particles but we think Rutherford’s remark in regard to α particles also following a curved or helical path must be exaggerated: the order of magnitude of the magnetic moment of protons and neutrons is much smaller and, in any case, they tend to cancel each other out. Also, because of the rather enormous mass of α particles (read: helium nuclei) as compared to electrons, the effect would probably not be visible in a Wilson cloud chamber.]

The idea that an electron has magnetic properties is still sketchy and we would need new and more conclusive experiments before accepting it as a scientific fact. However, it would surely be natural to assume its magnetic properties would result from a rotation of the electron. Parson’s ring electron model [“électron annulaire“] was specifically imagined to incorporate such magnetic polarity [“polarité magnétique“].

A very interesting question here would be to wonder whether such rotation would be some intrinsic property of the electron or if it would just result from the rotation of the electron in its atomic orbital around the nucleus. Indeed, James Jeans usefully reminded me any asymmetry in an electron should result in it rotating around its own axis at the same frequency as its orbital rotation. [Note: The reader can easily imagine this: think of an asymmetric object going around in a circle and returning to its original position. In order to return to the same orientation, it must rotate around its own axis one time too!]

We should also wonder if an electron might acquire some rotational motion from being accelerated in an electric field and if such rotation, once acquired, would persist when decelerating in an(other) electric field or when passing through matter. If so, some of the properties of electrons would, to some extent, depend on their past.”

Each and every sentence in these very brief remarks is wonderfully consistent with modern-day modelling of electron behavior. We should add, of course, non-mainstream modeling of electrons but the addition is superfluous because mainstream physicists stubbornly continue to pretend electrons have no internal structure and no physical dimension. In light of the numerous experimental measurements of the effective charge radius as well as of the dimensions of the physical space in which photons effectively interfere with electrons, such mainstream assumptions seem completely ridiculous. However, such is the sad state of physics today.

Thinking backward and forward

We think that it is pretty obvious that Rutherford and others would have been able to adapt their model of an atom to better incorporate the magnetic properties not only of electrons but also of the nucleus and its constituents (protons and neutrons). Unfortunately, scientists at the time seem to have been swept away by the charisma of Bohr, Heisenberg and others, as well as by the mathematical brilliance of the likes of Sommerfeld, Dirac, and Pauli.

The road that was taken then has not led us very far. We concur with Oliver Consa’s scathing but essentially correct appraisal of the current sorry state of physics:

“QED should be the quantized version of Maxwell’s laws, but it is not that at all. QED is a simple addition to quantum mechanics that attempts to justify two experimental discrepancies in the Dirac equation: the Lamb shift and the anomalous magnetic moment of the electron. The reality is that QED is a bunch of fudge factors, numerology, ignored infinities, hocus-pocus, manipulated calculations, illegitimate mathematics, incomprehensible theories, hidden data, biased experiments, miscalculations, suspicious coincidences, lies, arbitrary substitutions of infinite values and budgets of 600 million dollars to continue the game. Maybe it is time to consider alternative proposals. Winter is coming.”

I would suggest we just go back where we went wrong: it may be warmer there, and thinking both backward as well as forward must, in any case, be a much more powerful problem solving technique than relying only on expert guessing on what linear differential equation(s) might give us some S-matrix linking all likely or possible initial and final states of some system or process. 🙂

Post scriptum: The sad state of physics is, of course, not limited to quantum electrodynamics only. We were briefly in touch with the PRad experimenters who put an end to the rather ridiculous ‘proton radius puzzle’ by re-confirming the previously established 0.83-0.84 fm range for the effective charge radius of a proton: we sent them our own classical back-of-the-envelope calculation of the Compton scattering radius of a proton based on the ring current model (see p. 15-16 of our paper on classical physics), which is in agreement with these measurements, and courteously asked what alternative theories they were suggesting. Their spokesman replied equally courteously:

“There is no any theoretical prediction in QCD. Lattice [theorists] are trying to come up [with something] but that will take another decade before any reasonable number [may come] from them.”

This e-mail exchange goes back to early February 2020. There has been no news since. One wonders if there is actually any real interest in solving puzzles. The physicist who wrote the above may have been nominated for a Nobel Prize in Physics – I surely hope so because, in contrast to some others, he and his team surely deserve one – but I think it is rather incongruous to finally firmly establish the size of a proton while, at the same time, admitting that protons should not have any size according to mainstream theory – and we are talking about the respected QCD sector of the equally respected Standard Model here!

We understand, of course! As Freddie Mercury famously sang: The Show Must Go On.

The Mystery Wallahs

I’ve been working across Asia – mainly South Asia – for over 25 years now. You can google the exact meaning, but my definition of a wallah is someone who deals in something: it may be a street vendor, or a handyman, or anyone who brings something new. I remember I was one of the first to bring modern mountain bikes to India, and they called me a gear wallah – because they were absolutely fascinated with the number of gears I had. [Mountain bikes are now back to a 2 by 10 or even a 1 by 11 set-up, but I still like those three plateaux in front on my older bikes – and, yes, my collection is becoming way too large but I just can’t do away with it.]

In any case, let me explain the title of this post. I stumbled on the work of the research group around Herman Batelaan in Nebraska. Absolutely fascinating! Not only did they actually do the electron double-slit experiment, but their ideas on an actual Stern-Gerlach experiment with electrons are quite interesting: https://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1031&context=physicsgay

I also want to look at their calculations on momentum exchange between electrons in a beam: https://iopscience.iop.org/article/10.1088/1742-6596/701/1/012007.

Outright fascinating. Brilliant! […]

It just makes me wonder: why is the outcome of this 100-year old battle between mainstream hocus-pocus and real physics so undecided?

I’ve come to think of mainstream physicists as peddlers in mysteries—whence the title of my post. It’s a tough conclusion. Physics is supposed to be the King of Science, right? Hence, we shouldn’t doubt it. At the same time, it is kinda comforting to know the battle between truth and lies rages everywhere—including inside of the King of Science.

Wavefunctions as gravitational waves

This is the paper I always wanted to write. It is there now, and I think it is good – and that’s an understatement. 🙂 It is probably best to download it as a pdf-file from the viXra.org site because this was a rather fast ‘copy and paste’ job from the Word version of the paper, so there may be issues with boldface notation (vector notation), italics and, most importantly, with formulas – which I, sadly, have to ‘snip’ into this WordPress blog, as they don’t have an easy copy function for mathematical formulas.

It’s great stuff. If you have been following my blog – and many of you have – you will want to digest this. 🙂

Abstract: This paper explores the implications of associating the components of the wavefunction with a physical dimension: force per unit mass – which is, of course, the dimension of acceleration (m/s²) and gravitational fields. The classical electromagnetic field equations for energy densities, the Poynting vector and spin angular momentum are then re-derived by substituting the electromagnetic N/C unit of field strength (force per unit charge) by the new N/kg = m/s² dimension.

The results are elegant and insightful. For example, the energy densities are proportional to the square of the absolute value of the wavefunction and, hence, to the probabilities, which establishes a physical normalization condition. Also, Schrödinger’s wave equation may then, effectively, be interpreted as a diffusion equation for energy, and the wavefunction itself can be interpreted as a propagating gravitational wave. Finally, as an added bonus, concepts such as the Compton scattering radius for a particle, spin angular momentum, and the boson-fermion dichotomy, can also be explained more intuitively.

While the approach offers a physical interpretation of the wavefunction, the author argues that the core of the Copenhagen interpretation revolves around the complementarity principle, which remains unchallenged because the interpretation of amplitude waves as traveling fields does not explain the particle nature of matter.

Introduction

This is not another introduction to quantum mechanics. We assume the reader is already familiar with the key principles and, importantly, with the basic math. We offer an interpretation of wave mechanics. As such, we do not challenge the complementarity principle: the physical interpretation of the wavefunction that is offered here explains the wave nature of matter only. It explains diffraction and interference of amplitudes but it does not explain why a particle will hit the detector not as a wave but as a particle. Hence, the Copenhagen interpretation of the wavefunction remains relevant: we just push its boundaries.

The basic ideas in this paper stem from a simple observation: the geometry of the quantum-mechanical wavefunction and that of electromagnetic waves are remarkably similar. The components of both waves are orthogonal to the direction of propagation and to each other. Only the relative phase differs: the electric and magnetic field vectors (E and B) have the same phase. In contrast, the phases of the real and imaginary parts of the (elementary) wavefunction (ψ = a·e^(−i∙θ) = a∙cosθ – i∙a∙sinθ) differ by 90 degrees (π/2).[1] Pursuing the analogy, we explore the following question: if the oscillating electric and magnetic field vectors of an electromagnetic wave carry the energy that one associates with the wave, can we analyze the real and imaginary part of the wavefunction in a similar way?

We show the answer is positive and remarkably straightforward. If the physical dimension of the electromagnetic field is expressed in newton per coulomb (force per unit charge), then the physical dimension of the components of the wavefunction may be associated with force per unit mass (newton per kg).[2] Of course, force over some distance is energy. The question then becomes: what is the energy concept here? Kinetic? Potential? Both?

The similarity between the energy of a (one-dimensional) linear oscillator (E = m·a²·ω²/2) and Einstein’s relativistic energy equation E = m∙c² inspires us to interpret the energy as a two-dimensional oscillation of mass. To assist the reader, we construct a two-piston engine metaphor.[3] We then adapt the formula for the electromagnetic energy density to calculate the energy densities for the wave function. The results are elegant and intuitive: the energy densities are proportional to the square of the absolute value of the wavefunction and, hence, to the probabilities. Schrödinger’s wave equation may then, effectively, be interpreted as a diffusion equation for energy itself.

As an added bonus, concepts such as the Compton scattering radius for a particle and spin angular momentum, as well as the boson-fermion dichotomy, can be explained in a fully intuitive way.[4]

Of course, such an interpretation is also an interpretation of the wavefunction itself, and the immediate reaction of the reader is predictable: the electric and magnetic field vectors are, somehow, to be looked at as real vectors. In contrast, the real and imaginary components of the wavefunction are not. However, this objection needs to be phrased more carefully. First, it may be noted that, in a classical analysis, the magnetic field is a pseudovector itself.[5] Second, a suitable choice of coordinates may make quantum-mechanical rotation matrices irrelevant.[6]

Therefore, the author is of the opinion that this little paper may provide some fresh perspective on the question, thereby further exploring Einstein’s basic sentiment in regard to quantum mechanics, which may be summarized as follows: there must be some physical explanation for the calculated probabilities.[7]

We will, therefore, start with Einstein’s relativistic energy equation (E = mc²) and wonder what it could possibly tell us.

I. Energy as a two-dimensional oscillation of mass

The structural similarity between the relativistic energy formula, the formula for the total energy of an oscillator, and the kinetic energy of a moving body, is striking:

E = mc²
E = mω²/2
E = mv²/2

In these formulas, ω, v and c all describe some velocity.[8] Of course, there is the 1/2 factor in the E = mω²/2 formula[9], but that is exactly the point we are going to explore here: can we think of an oscillation in two dimensions, so it stores an amount of energy that is equal to E = 2·m·ω²/2 = m·ω²?

That is easy enough. Think, for example, of a V-2 engine with the pistons at a 90-degree angle, as illustrated below. The 90° angle makes it possible to perfectly balance the counterweight and the pistons, thereby ensuring smooth travel at all times. With permanently closed valves, the air inside the cylinder compresses and decompresses as the pistons move up and down and provides, therefore, a restoring force. As such, it will store potential energy, just like a spring, and the motion of the pistons will also reflect that of a mass on a spring. Hence, we can describe it by a sinusoidal function, with the zero point at the center of each cylinder. We can, therefore, think of the moving pistons as harmonic oscillators, just like mechanical springs.

Figure 1: Oscillations in two dimensions

If we assume there is no friction, we have a perpetuum mobile here. The compressed air and the rotating counterweight (which, combined with the crankshaft, acts as a flywheel[10]) store the potential energy. The moving masses of the pistons store the kinetic energy of the system.[11]

At this point, it is probably good to quickly review the relevant math. If the magnitude of the oscillation is equal to a, then the motion of the piston (or the mass on a spring) will be described by x = a·cos(ω·t + Δ).[12] Needless to say, Δ is just a phase factor which defines our t = 0 point, and ω is the natural angular frequency of our oscillator. Because of the 90° angle between the two cylinders, Δ would be 0 for one oscillator, and –π/2 for the other. Hence, the motion of one piston is given by x = a·cos(ω·t), while the motion of the other is given by x = a·cos(ω·t–π/2) = a·sin(ω·t).

The kinetic and potential energy of one oscillator (think of one piston or one spring only) can then be calculated as:

K.E. = T = m·v²/2 = (1/2)·m·ω²·a²·sin²(ω·t + Δ)
P.E. = U = k·x²/2 = (1/2)·k·a²·cos²(ω·t + Δ)

The coefficient k in the potential energy formula characterizes the restoring force: F = −k·x. From the dynamics involved, it is obvious that k must be equal to m·ω². Hence, the total energy is equal to:

E = T + U = (1/2)·m·ω²·a²·[sin²(ω·t + Δ) + cos²(ω·t + Δ)] = m·a²·ω²/2
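As a quick numerical sanity check of the formula above, the following Python sketch evaluates T and U for one oscillator at many points in time and compares their sum with m·a²·ω²/2. The values chosen for m, a, ω and Δ, and the function name, are arbitrary illustrations of ours and are not taken from the paper.

```python
import math

def oscillator_energies(m, a, omega, delta, t):
    """Return (T, U) for x = a*cos(omega*t + delta), with k = m*omega**2."""
    phase = omega * t + delta
    x = a * math.cos(phase)            # displacement
    v = -a * omega * math.sin(phase)   # velocity dx/dt
    k = m * omega**2                   # restoring-force coefficient
    return 0.5 * m * v**2, 0.5 * k * x**2

# Arbitrary illustrative values (assumptions, not from the paper)
m, a, omega, delta = 2.0, 0.5, 3.0, 0.4
expected_total = 0.5 * m * a**2 * omega**2

# Sample the total energy T + U at 1000 points in time
totals = [sum(oscillator_energies(m, a, omega, delta, 0.01 * n)) for n in range(1000)]
max_deviation = max(abs(E - expected_total) for E in totals)
print(max_deviation)
```

The printed deviation should be at the level of floating-point rounding only: the energy sloshes back and forth between T and U while the sum stays fixed.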

To facilitate the calculations, we will briefly assume k = m·ω² and a are equal to 1. The motion of our first oscillator is given by the cos(ω·t) = cosθ function (θ = ω·t), and its kinetic energy will be equal to sin²θ (up to the constant factor 1/2, which we drop here). Hence, the (instantaneous) change in kinetic energy at any point in time will be equal to:

d(sin²θ)/dθ = 2∙sinθ∙d(sinθ)/dθ = 2∙sinθ∙cosθ

Let us look at the second oscillator now. Just think of the second piston going up and down in the V-2 engine. Its motion is given by the sinθ function, which is equal to cos(θ−π/2). Hence, its kinetic energy is equal to sin²(θ−π/2) (up to the same constant factor), and how it changes – as a function of θ – will be equal to:

2∙sin(θ−π/2)∙cos(θ−π/2) = −2∙cosθ∙sinθ = −2∙sinθ∙cosθ

We have our perpetuum mobile! While transferring kinetic energy from one piston to the other, the crankshaft will rotate with a constant angular velocity: linear motion becomes circular motion, and vice versa, and the total energy that is stored in the system is T + U = m·a²·ω².
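The cancellation between the two pistons can also be checked numerically. The Python sketch below assumes m = a = ω = 1 and, unlike the simplified expressions in the text, keeps the factor 1/2 explicit, so the two rates of change come out as ±sinθ∙cosθ rather than ±2∙sinθ∙cosθ. The function names are ours and purely illustrative.

```python
import math

def kinetic_energies(theta):
    """Kinetic energies (T1, T2) of the two pistons for m = a = omega = 1."""
    v1 = -math.sin(theta)   # piston 1 moves as cos(theta)
    v2 = math.cos(theta)    # piston 2 moves as sin(theta) = cos(theta - pi/2)
    return 0.5 * v1**2, 0.5 * v2**2

def potential_energies(theta):
    """Potential energies (U1, U2) of the two pistons with k = m*omega**2 = 1."""
    return 0.5 * math.cos(theta)**2, 0.5 * math.sin(theta)**2

def rate(f, theta, h=1e-6):
    """Central-difference estimate of df/dtheta."""
    return (f(theta + h) - f(theta - h)) / (2 * h)

thetas = [2 * math.pi * n / 360 for n in range(360)]

# Sum of the two rates of change of kinetic energy at each angle
rate_sums = [
    rate(lambda t: kinetic_energies(t)[0], th) + rate(lambda t: kinetic_energies(t)[1], th)
    for th in thetas
]

# Total energy (kinetic plus potential, both pistons) at each angle
total_energies = [sum(kinetic_energies(th)) + sum(potential_energies(th)) for th in thetas]
print(max(abs(r) for r in rate_sums), min(total_energies), max(total_energies))
```

What one piston loses in kinetic energy the other gains, and the total for the two oscillators stays at m·a²·ω², which equals 1 for the unit values assumed here.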

We have a great metaphor here. Somehow, in this beautiful interplay between linear and circular motion, energy is borrowed from one place and then returns to the other, cycle after cycle. We know the wavefunction consists of a sine and a cosine: the cosine is the real component, and the sine is the imaginary component. Could they be equally real? Could each represent half of the total energy of our particle? Should we think of the c in our E = mc² formula as an angular velocity?

These are sensible questions. Let us explore them.

II. The wavefunction as a two-dimensional oscillation

The elementary wavefunction is written as:

ψ = a·e^(−i[E·t − p∙x]/ħ) = a·cos(p∙x/ħ – E∙t/ħ) + i·a·sin(p∙x/ħ – E∙t/ħ)

When considering a particle at rest (p = 0) this reduces to:

ψ = a·e^(−i∙E·t/ħ) = a·cos(–E∙t/ħ) + i·a·sin(–E∙t/ħ) = a·cos(E∙t/ħ) – i·a·sin(E∙t/ħ)

Let us remind ourselves of the geometry involved, which is illustrated below. Note that the argument of the wavefunction rotates clockwise with time, while the mathematical convention for measuring the phase angle (ϕ) is counter-clockwise.

Figure 2: Euler’s formula
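The decomposition of the at-rest wavefunction into a cosine and a sine, and the clockwise sense of rotation, can be verified with a few lines of Python. This is only a sketch: we set ħ = 1 (natural units) and pick arbitrary values for a and E, none of which come from the paper.

```python
import cmath
import math

HBAR = 1.0  # natural units: an assumption made for illustration only

def psi_at_rest(a, energy, t):
    """Elementary wavefunction of a particle at rest: a*exp(-i*E*t/hbar)."""
    return a * cmath.exp(-1j * energy * t / HBAR)

a, energy = 1.5, 2.0      # arbitrary illustrative values
times = [0.1, 0.2, 0.3]
samples = [psi_at_rest(a, energy, t) for t in times]

for t, psi in zip(times, samples):
    # Real part is a*cos(E*t/hbar); imaginary part is -a*sin(E*t/hbar)
    assert math.isclose(psi.real, a * math.cos(energy * t / HBAR))
    assert math.isclose(psi.imag, -a * math.sin(energy * t / HBAR))

# The phase angle decreases as t grows: a clockwise rotation in the complex plane
phases = [cmath.phase(psi) for psi in samples]
print(phases)
```

The phase angles become more negative as t increases, which is the clockwise rotation that follows from the minus sign in the exponent.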

If we assume the momentum p is all in the x-direction, then the p and x vectors will have the same direction, and the dot product p∙x/ħ reduces to the ordinary product p·x/ħ of their magnitudes. Most illustrations – such as the one below – will either freeze x or, else, t. Alternatively, one can google web animations varying both. The point is: we also have a two-dimensional oscillation here. These two dimensions are perpendicular to the direction of propagation of the wavefunction. For example, if the wavefunction propagates in the x-direction, then the oscillations are along the y- and z-axis, which we may refer to as the real and imaginary axis. Note how the phase difference between the cosine and the sine – the real and imaginary part of our wavefunction – appears to give some spin to the whole. I will come back to this.

Figure 3: Geometric representation of the wavefunction

Hence, if we say that each of these oscillations carries half of the total energy of the particle, then we may refer to the real and imaginary energy of the particle respectively, and the interplay between the real and the imaginary part of the wavefunction may then describe how energy propagates through space over time.

Let us consider, once again, a particle at rest. Hence, p = 0 and the (elementary) wavefunction reduces to ψ = a·e^{−i∙E·t/ħ}. Hence, the angular velocity of both oscillations, at some point x, is given by ω = -E/ħ. Now, the energy of our particle includes all of the energy – kinetic, potential and rest energy – and is, therefore, equal to E = mc².

Can we, somehow, relate this to the m·a²·ω² energy formula for our V-2 perpetuum mobile? Our wavefunction has an amplitude too. Now, if the oscillations of the real and imaginary wavefunction store the energy of our particle, then their amplitude will surely matter. In fact, the energy of an oscillation is, in general, proportional to the square of the amplitude: E ∝ a². We may, therefore, think that the a² factor in the E = m·a²·ω² energy formula will surely be relevant as well.

However, here is a complication: an actual particle is localized in space and can, therefore, not be represented by the elementary wavefunction. We must build a wave packet for that: a sum of wavefunctions, each with its own amplitude a_i and its own ω_i = −E_i/ħ. Each of these wavefunctions will contribute some energy to the total energy of the wave packet. To calculate the contribution of each wave to the total, both a_i and E_i will matter.

What is E_i? E_i varies around some average E, which we can associate with some average mass m: m = E/c². The Uncertainty Principle kicks in here. The analysis becomes more complicated, but a formula such as the one below might make sense:

E = Σ m_i·a_i²·ω_i² = Σ (E_i/c²)·a_i²·(E_i²/ħ²)

We can re-write this as:

c²·ħ²·E = Σ a_i²·E_i³

What is the meaning of this equation? We may look at it as some sort of physical normalization condition when building up the Fourier sum. Of course, we should relate this to the mathematical normalization condition for the wavefunction. Our intuition tells us that the probabilities must be related to the energy densities, but how exactly? We will come back to this question in a moment. Let us first think some more about the enigma: what is mass?

Before we do so, let us quickly calculate the value of c²ħ²: it is about 1×10⁻⁵¹ N²∙m⁴. Let us also do a dimensional analysis: the physical dimensions of the E = m·a²·ω² equation make sense if we express m in kg, a in m, and ω in rad/s. We then get: [E] = kg∙m²/s² = (N∙s²/m)∙m²/s² = N∙m = J. The dimension of both the left- and the right-hand side of the physical normalization condition is N³∙m⁵.
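The numerical value of c²ħ² is easy to verify. The snippet below is only a quick check; the constant names and the CODATA values used for c and ħ are our own choices here.

```python
# Speed of light (exact SI definition) and reduced Planck constant (CODATA 2018).
C = 299_792_458.0        # m/s
HBAR = 1.054571817e-34   # J*s = N*m*s

# (m/s)^2 * (N*m*s)^2 = N^2 * m^4
c2_hbar2 = (C * HBAR) ** 2

print(f"c^2 * hbar^2 = {c2_hbar2:.3e} N^2*m^4")
```

The result is just under 1×10⁻⁵¹ N²∙m⁴, in line with the figure quoted above.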

III. What is mass?

We came up, playfully, with a meaningful interpretation for energy: it is a two-dimensional oscillation of mass. But what is mass? A new aether theory is, of course, not an option, but then what is it that is oscillating? To understand the physics behind equations, it is always good to do an analysis of the physical dimensions in the equation. Let us start with Einstein’s energy equation once again. If we want to look at mass, we should re-write it as m = E/c²:

[m] = [E/c²] = J/(m/s)² = N·m∙s²/m² = N·s²/m = kg

This is not very helpful. It only reminds us of Newton’s definition of a mass: mass is that which gets accelerated by a force. At this point, we may want to think of the physical significance of the absolute nature of the speed of light. Einstein’s E = mc² equation implies that the ratio between the energy and the mass of any particle is always the same, so we can write, for any two particles A and B:

E_A/m_A = E_B/m_B = c²

This reminds us of the ω² = C⁻¹/L or ω² = k/m of harmonic oscillators once again.[13] The key difference is that the ω² = C⁻¹/L and ω² = k/m formulas introduce two or more degrees of freedom.[14] In contrast, c² = E/m for any particle, always. However, that is exactly the point: we can modulate the resistance, inductance and capacitance of electric circuits, and the stiffness of springs and the masses we put on them, but we live in one physical space only: our spacetime. Hence, the speed of light c emerges here as the defining property of spacetime – the resonant frequency, so to speak. We have no further degrees of freedom here.

The Planck-Einstein relation (for photons) and the de Broglie equation (for matter-particles) have an interesting feature: both imply that the energy of the oscillation is proportional to the frequency, with Planck’s constant as the constant of proportionality. Now, for one-dimensional oscillations – think of a guitar string, for example – we know the energy will be proportional to the square of the frequency. It is a remarkable observation: the two-dimensional matter-wave, or the electromagnetic wave, gives us two waves for the price of one, so to speak, each carrying half of the total energy of the oscillation but, as a result, we get a proportionality between E and f instead of between E and f².

However, such reflections do not answer the fundamental question we started out with: what is mass? At this point, it is hard to go beyond the circular definition that is implied by Einstein’s formula: energy is a two-dimensional oscillation of mass, and mass packs energy, and c emerges as the property of spacetime that defines how exactly.

When everything is said and done, this does not go beyond stating that mass is some scalar field. Now, a scalar field is, quite simply, some real number that we associate with a position in spacetime. The Higgs field is a scalar field but, of course, the theory behind it goes much beyond stating that we should think of mass as some scalar field. The fundamental question is: why and how does energy, or matter, condense into elementary particles? That is what the Higgs mechanism is about but, as this paper is exploratory only, we cannot even start explaining the basics of it.

What we can do, however, is look at the wave equation again (Schrödinger’s equation), as we can now analyze it as an energy diffusion equation.

IV. Schrödinger’s equation as an energy diffusion equation

The interpretation of Schrödinger’s equation as a diffusion equation is straightforward. Feynman (Lectures, III-16-1) briefly summarizes it as follows:

“We can think of Schrödinger’s equation as describing the diffusion of the probability amplitude from one point to the next. […] But the imaginary coefficient in front of the derivative makes the behavior completely different from the ordinary diffusion such as you would have for a gas spreading out along a thin tube. Ordinary diffusion gives rise to real exponential solutions, whereas the solutions of Schrödinger’s equation are complex waves.”[17]

Let us review the basic math. For a particle moving in free space – with no external force fields acting on it – there is no potential (U = 0) and, therefore, the Uψ term disappears. Therefore, Schrödinger’s equation reduces to:

∂ψ(x, t)/∂t = i·(1/2)·(ħ/m_eff)·∇²ψ(x, t)

The ubiquitous diffusion equation in physics is:

∂φ(x, t)/∂t = D·∇²φ(x, t)

The structural similarity is obvious. The key difference between both equations is that the wave equation gives us two equations for the price of one. Indeed, because ψ is a complex-valued function, with a real and an imaginary part, we get the following equations[18]:

Re(∂ψ/∂t) = −(1/2)·(ħ/m_eff)·Im(∇²ψ)
Im(∂ψ/∂t) = (1/2)·(ħ/m_eff)·Re(∇²ψ)

These equations make us think of the equations for an electromagnetic wave in free space (no stationary charges or currents):

∂B/∂t = –∇×E
∂E/∂t = c²∇×B

The above equations effectively describe a propagation mechanism in spacetime, as illustrated below.

Figure 4: Propagation mechanisms

The Laplacian operator (∇²), when operating on a scalar quantity, gives us a flux density, i.e. something expressed per square meter (1/m²). In this case, it is operating on ψ(x, t), so what is the dimension of our wavefunction ψ(x, t)? To answer that question, we should analyze the diffusion constant in Schrödinger’s equation, i.e. the (1/2)·(ħ/m_eff) factor:

As a mathematical constant of proportionality, it will quantify the relationship between both derivatives (i.e. the time derivative and the Laplacian);
As a physical constant, it will ensure the physical dimensions on both sides of the equation are compatible.

Now, the ħ/m_eff factor is expressed in (N·m·s)/(N·s²/m) = m²/s. Hence, it does ensure the dimensions on both sides of the equation are, effectively, the same: ∂ψ/∂t is a time derivative and, therefore, its dimension is s⁻¹ while, as mentioned above, the dimension of ∇²ψ is m⁻². However, this does not solve our basic question: what is the dimension of the real and imaginary part of our wavefunction?

At this point, mainstream physicists will say: it does not have a physical dimension, and there is no geometric interpretation of Schrödinger’s equation. One may argue, effectively, that its argument, (p∙x – E∙t)/ħ, is just a number and, therefore, that the real and imaginary parts of ψ are also just numbers.

To this, we may object that ħ may be looked at as a mathematical scaling constant only. If we do that, then the argument of ψ will, effectively, be expressed in action units, i.e. in N·m·s. It then does make sense to also associate a physical dimension with the real and imaginary part of ψ. What could it be?

We may have a closer look at Maxwell’s equations for inspiration here. The electric field vector is expressed in newton (the unit of force) per unit of charge (coulomb). Now, there is something interesting here. The physical dimension of the magnetic field is N/C divided by m/s.[19] We may write B as the following vector cross-product: B = (1/c)∙e_x×E, with e_x the unit vector pointing in the x-direction (i.e. the direction of propagation of the wave). Hence, we may associate the (1/c)∙e_x× operator, which amounts to a rotation by 90 degrees, with the s/m dimension. Now, multiplication by i also amounts to a rotation by 90 degrees. Hence, we may boldly write: B = (1/c)∙e_x×E = (1/c)∙i∙E. This allows us to also geometrically interpret Schrödinger’s equation in the way we interpreted it above (see Figure 3).[20]

Still, we have not answered the question as to what the physical dimension of the real and imaginary part of our wavefunction should be. At this point, we may be inspired by the structural similarity between Newton’s and Coulomb’s force laws:

F = G·m₁·m₂/r²
F = (1/4πε₀)·q₁·q₂/r²

Hence, if the electric field vector E is expressed in force per unit charge (N/C), then we may want to think of associating the real part of our wavefunction with a force per unit mass (N/kg). We can, of course, do a substitution here, because the mass unit (1 kg) is equivalent to 1 N·s²/m. Hence, our N/kg dimension becomes:

N/kg = N/(N·s²/m)= m/s²

What is this: m/s²? Is that the dimension of the a·cosθ term in the a·e^{−iθ} = a·cosθ − i·a·sinθ wavefunction?

My answer is: why not? Think of it: m/s² is the physical dimension of acceleration: the increase or decrease in velocity (m/s) per second. It ensures the wavefunction for any particle – matter-particles or particles with zero rest mass (photons) – and the associated wave equation (which has to be the same for all, as the spacetime we live in is one) are mutually consistent.

In this regard, we should think of how we would model a gravitational wave. The physical dimension would surely be the same: force per mass unit. It all makes sense: wavefunctions may, perhaps, be interpreted as traveling distortions of spacetime, i.e. as tiny gravitational waves.

V. Energy densities and flows

Pursuing the geometric equivalence between the equations for an electromagnetic wave and Schrödinger’s equation, we can now, perhaps, see if there is an equivalent for the energy density. For an electromagnetic wave, we know that the energy density is given by the following formula:

u = (ε₀/2)·(E∙E + c²·B∙B)

E and B are the electric and magnetic field vector respectively. The Poynting vector S will give us the directional energy flux, i.e. the energy flow per unit area per unit time. We write:

∂u/∂t = −∇∙S

Needless to say, the ∇∙ operator is the divergence and, therefore, gives us the magnitude of a (vector) field’s source or sink at a given point. To be precise, the divergence gives us the volume density of the outward flux of a vector field from an infinitesimal volume around a given point. In this case, it gives us the volume density of the flux of S.

We can analyze the dimensions of the equation for the energy density as follows:

E is measured in newton per coulomb, so [E∙E] = [E²] = N²/C².
B is measured in (N/C)/(m/s), so we get [B∙B] = [B²] = (N²/C²)·(s²/m²). However, the dimension of our c² factor is (m²/s²) and so we’re also left with N²/C².
The ε₀ factor is the electric constant, also known as the vacuum permittivity. As a physical constant, it should ensure the dimensions on both sides of the equation work out, and they do: [ε₀] = C²/(N·m²) and, therefore, if we multiply that with N²/C², we find that u is expressed in N/m² = J/m³.[21]

Replacing the newton per coulomb unit (N/C) by the newton per kg unit (N/kg) in the formulas above should give us the equivalent of the energy density for the wavefunction. We just need to replace ε₀ with an equivalent constant. We may give it a try. If the energy densities can be calculated – which are also mass densities, obviously – then the probabilities should be proportional to them.

Let us first see what we get for a photon, assuming the electromagnetic wave represents its wavefunction. Substituting B for (1/c)∙i∙E or for −(1/c)∙i∙E gives us the following result:

u = (ε₀/2)·(E² + c²·B²) = (ε₀/2)·(E² + c²·(i²/c²)·E²) = (ε₀/2)·(E² − E²) = 0

Zero!? An unexpected result! Or not? We have no stationary charges and no currents: only an electromagnetic wave in free space. Hence, the local energy conservation principle needs to be respected at all points in space and in time. The geometry makes sense of the result: for an electromagnetic wave, the magnitudes of E and B reach their maximum, minimum and zero point simultaneously, as shown below.[22] This is because their phase is the same.

Figure 5: Electromagnetic wave: E and B

Should we expect a similar result for the energy densities that we would associate with the real and imaginary part of the matter-wave? For the matter-wave, we have a phase difference between a·cosθ and a·sinθ, which gives a different picture of the propagation of the wave (see Figure 3).[23] In fact, the geometry of that picture suggests some inherent spin, which is interesting. I will come back to this. Let us first guess those densities. Making abstraction of any scaling constants, we may write:

u = (a·cosθ)² + (a·sinθ)² = a²·(cos²θ + sin²θ) = a²

We get what we hoped to get: the absolute square of our amplitude is, effectively, an energy density!

|ψ|² = |a·e^{−i∙E·t/ħ}|² = a² = u

This is very deep. A photon has no rest mass, so it borrows and returns energy from empty space as it travels through it. In contrast, a matter-wave carries energy and, therefore, has some (rest) mass. It is therefore associated with an energy density, and this energy density gives us the probabilities. Of course, we need to fine-tune the analysis to account for the fact that we have a wave packet rather than a single wave, but that should be feasible.

As mentioned, the phase difference between the real and imaginary part of our wavefunction (a cosine and a sine function) appears to give some spin to our particle. We do not have this particularity for a photon. Of course, photons are bosons, i.e. integer-spin particles (spin-1, to be precise), while elementary matter-particles are fermions with spin-1/2. Hence, our geometric interpretation of the wavefunction suggests that, after all, there may be some more intuitive explanation of the fundamental dichotomy between bosons and fermions, which puzzled even Feynman:

“Why is it that particles with half-integral spin are Fermi particles, whereas particles with integral spin are Bose particles? We apologize for the fact that we cannot give you an elementary explanation. An explanation has been worked out by Pauli from complicated arguments of quantum field theory and relativity. He has shown that the two must necessarily go together, but we have not been able to find a way of reproducing his arguments on an elementary level. It appears to be one of the few places in physics where there is a rule which can be stated very simply, but for which no one has found a simple and easy explanation. The explanation is deep down in relativistic quantum mechanics. This probably means that we do not have a complete understanding of the fundamental principle involved.” (Feynman, Lectures, III-4-1)

The physical interpretation of the wavefunction, as presented here, may provide some better understanding of ‘the fundamental principle involved’: the physical dimension of the oscillation is just very different. That is all: it is force per unit charge for photons, and force per unit mass for matter-particles. We will examine the question of spin somewhat more carefully in section VII. Let us first examine the matter-wave some more.

VI. Group and phase velocity of the matter-wave

The geometric representation of the matter-wave (see Figure 3) suggests a traveling wave and, yes, of course: the matter-wave effectively travels through space and time. But what is traveling, exactly? It is the pulse – or the signal – only: the phase velocity of the wave is just a mathematical concept and, even in our physical interpretation of the wavefunction, the same is true for the group velocity of our wave packet. The oscillation is two-dimensional, but perpendicular to the direction of travel of the wave. Hence, nothing actually moves with our particle.

Here, we should also reiterate that we did not answer the question as to what is oscillating up and down and/or sideways: we only associated a physical dimension with the components of the wavefunction – newton per kg (force per unit mass), to be precise. We were inspired to do so because of the physical dimension of the electric and magnetic field vectors (newton per coulomb, i.e. force per unit charge) we associate with electromagnetic waves which, for all practical purposes, we currently treat as the wavefunction for a photon. This made it possible to calculate the associated energy densities and a Poynting vector for energy dissipation. In addition, we showed that Schrödinger’s equation itself then becomes a diffusion equation for energy. However, let us now focus some more on the asymmetry which is introduced by the phase difference between the real and the imaginary part of the wavefunction. Look at the mathematical shape of the elementary wavefunction once again:

ψ = a·e^{−i[E·t − p∙x]/ħ} = a·cos(p∙x/ħ − E∙t/ħ) + i·a·sin(p∙x/ħ − E∙t/ħ)

The minus sign in the argument of our sine and cosine function defines the direction of travel: an F(x−v∙t) wavefunction will always describe some wave that is traveling in the positive x-direction (with v the wave velocity), while an F(x+v∙t) wavefunction will travel in the negative x-direction. For a geometric interpretation of the wavefunction in three dimensions, we need to agree on how to define i or, what amounts to the same, a convention on how to define clockwise and counterclockwise directions: if we look at a clock from the back, then its hands will be moving counterclockwise. So we need to establish the equivalent of the right-hand rule. However, let us not worry about that now. Let us focus on the interpretation. To ease the analysis, we’ll assume we’re looking at a particle at rest. Hence, p = 0, and the wavefunction reduces to:

ψ = a·e^{−i∙E₀·t/ħ} = a·cos(−E₀∙t/ħ) + i·a·sin(−E₀∙t/ħ) = a·cos(E₀∙t/ħ) − i·a·sin(E₀∙t/ħ)

E₀ is, of course, the rest energy of our particle and, now that we are here, we should probably wonder whose time t we are talking about: is it our time, or is it the proper time of our particle? Well… In this situation, we are both at rest, so it does not matter: t is, effectively, the proper time, so perhaps we should write it as t₀. You can see what we expect to see: E₀/ħ pops up as the natural frequency of our matter-particle: (E₀/ħ)∙t = ω∙t. Using the ω = 2π·f = 2π/T and T = 1/f formulas, we can associate a period and a frequency with this wave. Noting that ħ = h/2π, we find the following:

T = 2π·(ħ/E₀) = h/E₀ ⇔ f = E₀/h = m₀c²/h
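To get a feel for the orders of magnitude, we can evaluate these formulas for an electron. This is an illustrative calculation only: taking the electron as the example, and the CODATA values below, are our own choices.

```python
# CODATA 2018 values (SI units).
H = 6.62607015e-34       # Planck constant, J*s
C = 299_792_458.0        # speed of light, m/s
M_E = 9.1093837015e-31   # electron rest mass, kg

rest_energy = M_E * C**2          # E0 = m0*c^2, in J
frequency = rest_energy / H       # f = E0/h, in Hz
period = 1.0 / frequency          # T = h/E0, in s

print(f"E0 = {rest_energy:.4e} J")
print(f"f  = {frequency:.4e} Hz")
print(f"T  = {period:.4e} s")
```

The frequency comes out at roughly 1.24×10²⁰ Hz, so the natural unit of time for an electron would be of the order of 10⁻²⁰ seconds.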

This is interesting, because we can look at the period as a natural unit of time for our particle. What about the wavelength? That is tricky because we need to distinguish between group and phase velocity here. The group velocity (v_g) should be zero here, because we assume our particle does not move. In contrast, the phase velocity is given by v_p = λ·f = (2π/k)·(ω/2π) = ω/k. In fact, we’ve got something funny here: the wavenumber k = p/ħ is zero, because we assume the particle is at rest, so p = 0. So we have a division by zero here, which is rather strange. What do we get assuming the particle is not at rest? We write:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = E/(m·v_g) = (m·c²)/(m·v_g) = c²/v_g

This is interesting: it establishes a reciprocal relation between the phase and the group velocity, with c as a simple scaling constant. Indeed, the graph below shows the shape of the function does not change with the value of c, and we may also re-write the relation above as:

v_p/c = β_p = c/v_g = 1/β_g = 1/(v_g/c)

Figure 6: Reciprocal relation between phase and group velocity

We can also write the mentioned relationship as v_p·v_g = c², which reminds us of the relationship between the electric and magnetic constant (1/ε₀)·(1/μ₀) = c². This is interesting in light of the fact we can re-write this as (c·ε₀)·(c·μ₀) = 1, which shows electricity and magnetism are just two sides of the same coin, so to speak.[24]
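The v_p·v_g = c² relation can be checked numerically for a few velocities. The sketch below assumes the relativistic expressions E = γ·m₀·c² and p = γ·m₀·v_g, which is one way of reading the m in the derivation above; the function name and the sample velocities are illustrative.

```python
import math

C = 299_792_458.0        # speed of light, m/s
M0 = 9.1093837015e-31    # rest mass in kg (an electron, as an example)


def phase_velocity(v_g, m0=M0):
    """v_p = E/p, with E = gamma*m0*c^2 and p = gamma*m0*v_g (assumed relativistic forms)."""
    gamma = 1.0 / math.sqrt(1.0 - (v_g / C) ** 2)
    energy = gamma * m0 * C**2
    momentum = gamma * m0 * v_g
    return energy / momentum


betas = (0.001, 0.1, 0.5, 0.9, 0.999)
for beta_g in betas:
    v_g = beta_g * C
    v_p = phase_velocity(v_g)
    # Print beta_g, beta_p and the product v_p*v_g in units of c^2.
    print(f"beta_g = {beta_g:<6} beta_p = {v_p / C:<12.6g} v_p*v_g/c^2 = {v_p * v_g / C**2:.12f}")
```

The product stays at c² whatever the velocity, and β_p is simply the reciprocal of β_g: the slower the particle, the larger the phase velocity.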

Interesting, but how do we interpret the math? What about the implications of the zero value for the wavenumber k = p/ħ? We would probably like to think it implies the elementary wavefunction should always be associated with some momentum, because the concept of zero momentum clearly leads to weird math: something times zero cannot be equal to c²! Such an interpretation is also consistent with the Uncertainty Principle: if Δx·Δp ≥ ħ, then neither Δx nor Δp can be zero. In other words, the Uncertainty Principle tells us that the idea of a pointlike particle actually being at some specific point in time and in space does not make sense: it has to move. It tells us that our concepts of dimensionless points in time and space are mathematical notions only. Actual particles – including photons – are always a bit spread out, so to speak, and – importantly – they have to move.

For a photon, this is self-evident. It has no rest mass, no rest energy, and, therefore, it is going to move at the speed of light itself. We write: p = m·c = m·c²/c = E/c. Using the relationship above, we get:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = c ⇒ v_g = c²/v_p = c²/c = c

This is good: we started out with some reflections on the matter-wave, but here we get an interpretation of the electromagnetic wave as a wavefunction for the photon. But let us get back to our matter-wave. In regard to our interpretation of a particle having to move, we should remind ourselves, once again, of the fact that an actual particle is always localized in space and that it can, therefore, not be represented by the elementary wavefunction ψ = a·e^{−i[E·t − p∙x]/ħ} or, for a particle at rest, the ψ = a·e^{−i∙E·t/ħ} function. We must build a wave packet for that: a sum of wavefunctions, each with their own amplitude a_i, and their own ω_i = −E_i/ħ. Indeed, in section II, we showed that each of these wavefunctions will contribute some energy to the total energy of the wave packet and that, to calculate the contribution of each wave to the total, both a_i as well as E_i matter. This may or may not resolve the apparent paradox. Let us look at the group velocity.

To calculate a meaningful group velocity, we must assume the derivative v_g = ∂ω_i/∂k_i = ∂(E_i/ħ)/∂(p_i/ħ) = ∂(E_i)/∂(p_i) exists. So we must have some dispersion relation. How do we calculate it? We need to calculate ω_i as a function of k_i here, or E_i as a function of p_i. How do we do that? Well… There are a few ways to go about it, but one interesting way of doing it is to re-write Schrödinger’s equation as we did, i.e. by distinguishing the real and imaginary parts of the ∂ψ/∂t = i·[ħ/(2m)]·∇²ψ wave equation and, hence, re-write it as the following pair of equations:

Re(∂ψ/∂t) = −[ħ/(2m_eff)]·Im(∇²ψ) ⇔ ω·sin(kx − ωt) = k²·[ħ/(2m_eff)]·sin(kx − ωt)
Im(∂ψ/∂t) = [ħ/(2m_eff)]·Re(∇²ψ) ⇔ ω·cos(kx − ωt) = k²·[ħ/(2m_eff)]·cos(kx − ωt)

Both equations imply the following dispersion relation:

ω = ħ·k²/(2m_eff)
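We can verify numerically that a plane wave with this dispersion relation solves the free-space wave equation. The sketch below uses central finite differences; the natural units (ħ = m_eff = 1), the step size and the function names are assumptions made for numerical convenience.

```python
import cmath

HBAR = 1.0    # natural units: an assumption, chosen to keep the numbers well scaled
M_EFF = 1.0


def omega(k):
    """Dispersion relation omega = hbar*k^2/(2*m_eff)."""
    return HBAR * k**2 / (2.0 * M_EFF)


def psi(x, t, k, a=1.0):
    """Elementary wavefunction a*exp(i*(k*x - omega*t))."""
    return a * cmath.exp(1j * (k * x - omega(k) * t))


def residual(x, t, k, h=1e-3):
    """d(psi)/dt - i*(hbar/(2*m_eff))*d2(psi)/dx2, estimated with central differences."""
    dpsi_dt = (psi(x, t + h, k) - psi(x, t - h, k)) / (2.0 * h)
    d2psi_dx2 = (psi(x + h, t, k) - 2.0 * psi(x, t, k) + psi(x - h, t, k)) / h**2
    return dpsi_dt - 1j * (HBAR / (2.0 * M_EFF)) * d2psi_dx2


def group_velocity(k, h=1e-3):
    """Numerical d(omega)/dk, to be compared with hbar*k/m_eff."""
    return (omega(k + h) - omega(k - h)) / (2.0 * h)


k0 = 2.0
print(abs(residual(0.3, 0.7, k0)))              # close to zero, up to discretization error
print(group_velocity(k0), HBAR * k0 / M_EFF)    # the two values should agree
```

The residual vanishes up to discretization error, and the derivative of the dispersion relation gives v_g = ħ·k/m_eff = p/m_eff, as one would expect.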

Of course, we need to think about the subscripts now: we have ω_i, k_i, but… What about m_eff or, dropping the subscript, m? Do we write it as m_i? If so, what is it? Well… It is the equivalent mass of E_i obviously, and so we get it from the mass-energy equivalence relation: m_i = E_i/c². It is a fine point, but one most people forget about: they usually just write m. However, if there is uncertainty in the energy, then Einstein’s mass-energy relation tells us we must have some uncertainty in the (equivalent) mass too. Here, I should refer back to Section II: E_i varies around some average energy E and, therefore, the Uncertainty Principle kicks in.

VII. Explaining spin

The elementary wavefunction vector – i.e. the vector sum of the real and imaginary component – rotates around the x-axis, which gives us the direction of propagation of the wave (see Figure 3). Its magnitude remains constant. In contrast, the magnitude of the electromagnetic vector – defined as the vector sum of the electric and magnetic field vectors – oscillates between zero and some maximum (see Figure 5).
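The contrast between the two magnitudes is easy to tabulate. The sketch below is illustrative only: it models the matter-wave as a cosine and a sine component, and it uses two equal in-phase cosine components as a simplified stand-in for E and c·B in a linearly polarized plane wave.

```python
import math


def matter_wave_magnitude(theta, a=1.0):
    """Magnitude of the vector sum of a*cos(theta) and a*sin(theta) on perpendicular axes."""
    return math.hypot(a * math.cos(theta), a * math.sin(theta))


def in_phase_magnitude(theta, a=1.0):
    """Magnitude of two perpendicular components that oscillate in phase (assumed EM stand-in)."""
    return math.hypot(a * math.cos(theta), a * math.cos(theta))


thetas = [i * math.pi / 8 for i in range(17)]   # one full cycle, in steps of pi/8
matter = [matter_wave_magnitude(th) for th in thetas]
em_like = [in_phase_magnitude(th) for th in thetas]

for th, m, e in zip(thetas, matter, em_like):
    print(f"theta = {th:5.3f}  matter-wave: {m:.3f}  in-phase: {e:.3f}")
```

The first column stays at a for every phase, while the second swings between zero and √2·a.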

We already mentioned that the rotation of the wavefunction vector appears to give some spin to the particle. Of course, a circularly polarized wave would also appear to have spin (think of the E and B vectors rotating around the direction of propagation – as opposed to oscillating up and down or sideways only). In fact, circularly polarized light does carry angular momentum, as the equivalent mass of its energy may be thought of as rotating as well. But here we are looking at a matter-wave.

The basic idea is the following: if we look at ψ = a·e^{−i∙E·t/ħ} as some real vector – as a two-dimensional oscillation of mass, to be precise – then we may associate its rotation around the direction of propagation with some torque. The illustration below reminds us of the math here.

Figure 7: Torque and angular momentum vectors

A torque on some mass about a fixed axis gives it angular momentum, which we can write as the vector cross-product L = r×p or, perhaps easier for our purposes here, as the product of an angular velocity (ω) and rotational inertia (I), also known as the moment of inertia or the angular mass. We write:

L = I·ω

Note we can write L and ω in boldface here because they are (axial) vectors. If we consider their magnitudes only, we write L = I·ω (no boldface). We can now do some calculations. Let us start with the angular velocity. In our previous posts, we showed that the period of the matter-wave is equal to T = 2π·(ħ/E₀). Hence, the angular velocity must be equal to:

ω = 2π/[2π·(ħ/E₀)] = E₀/ħ

We also know the distance r, so that is the magnitude of r in the L = r×p vector cross-product: it is just a, so that is the magnitude of ψ = a·e^{−i∙E·t/ħ}. Now, the momentum (p) is the product of a linear velocity (v) – in this case, the tangential velocity – and some mass (m): p = m·v. If we switch to scalar instead of vector quantities, then the (tangential) velocity is given by v = r·ω. So now we only need to think about what we should use for m or, if we want to work with the angular velocity (ω), the angular mass (I). Here we need to make some assumption about the mass (or energy) distribution. Now, it may or may not make sense to assume the energy in the oscillation – and, therefore, the mass – is distributed uniformly. In that case, we may use the formula for the angular mass of a solid cylinder: I = m·r²/2. If we keep the analysis non-relativistic, then m = m₀. Of course, the energy-mass equivalence tells us that m₀ = E₀/c². Hence, this is what we get:

L = I·ω = (m₀·r²/2)·(E₀/ħ) = (1/2)·a²·(E₀/c²)·(E₀/ħ) = a²·E₀²/(2·ħ·c²)

Does it make sense? Maybe. Maybe not. Let us do a dimensional analysis: that won’t check our logic, but it makes sure we made no mistakes when mapping mathematical and physical spaces. We have m²·J² = m²·N²·m² in the numerator and N·m·s·m²/s² in the denominator. Hence, the dimensions work out: we get N·m·s as the dimension for L, which is, effectively, the physical dimension of angular momentum. It is also the action dimension, of course, and that cannot be a coincidence. Also note that the E = mc² equation allows us to re-write it as:

L = a²·m₀²·c²/(2·ħ)
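The dimensional analysis of L = a²·E₀²/(2·ħ·c²) can also be done mechanically, by tracking the exponents of N, m and s. The helper functions below are a minimal bookkeeping sketch written for this purpose; they are not taken from any library.

```python
def mul(*dims):
    """Multiply physical dimensions by adding the exponents of each base unit."""
    out = {}
    for d in dims:
        for unit, exponent in d.items():
            out[unit] = out.get(unit, 0) + exponent
    # Drop units whose exponent cancelled out.
    return {unit: exponent for unit, exponent in out.items() if exponent != 0}


def power(d, n):
    """Raise a physical dimension to the integer power n."""
    return {unit: exponent * n for unit, exponent in d.items()}


LENGTH = {"m": 1}                    # the amplitude a
ENERGY = {"N": 1, "m": 1}            # E0, in J = N*m
ACTION = {"N": 1, "m": 1, "s": 1}    # hbar, in N*m*s
SPEED = {"m": 1, "s": -1}            # c, in m/s

# [L] = [a]^2 * [E0]^2 / ([hbar] * [c]^2)
dim_L = mul(power(LENGTH, 2), power(ENERGY, 2), power(ACTION, -1), power(SPEED, -2))
print(dim_L)
```

The exponents that are left are those of N·m·s, i.e. the dimension of action and of angular momentum.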

Of course, in quantum mechanics, we associate spin with the magnetic moment of a charged particle, not with its mass as such. Is there a way to link the formula above to the one we have for the quantum-mechanical angular momentum, which is also measured in N·m·s units, and which can only take on one of two possible values: J = +ħ/2 and −ħ/2? It looks like a long shot, right? How do we go from (1/2)·a²·m₀²·c²/ħ to ± (1/2)∙ħ? Let us do a numerical example. The rest energy of an electron is about 0.511 MeV ≈ 8.1871×10⁻¹⁴ N∙m, and a… What value should we take for a?

We have an obvious trio of candidates here: the Bohr radius, the classical electron radius (also known as the Thomson scattering length), and the Compton scattering radius.

Let us start with the Bohr radius, so that is about 0.529×10⁻¹⁰ m. We get L = a²·E₀²/(2·ħ·c²) = 9.9×10⁻³¹ N∙m∙s. Now that is about 1.88×10⁴ times ħ/2. That is a huge factor. The Bohr radius cannot be right: we are not looking at an electron in an orbital here. To show it does not make sense, we may want to double-check the analysis by doing the calculation in another way. We said each oscillation will always pack 6.626070040(81)×10⁻³⁴ joule in energy. So our electron should pack about 1.24×10²⁰ oscillations. The angular momentum (L) we get when using the Bohr radius for a and the value of 6.626×10⁻³⁴ joule for E₀ is equal to 6.49×10⁻⁷¹ N∙m∙s. So that is the angular momentum per oscillation. When we multiply this with the number of oscillations (1.24×10²⁰), we get about 8.01×10⁻⁵¹ N∙m∙s, so that is a totally different number.

The classical electron radius is about 2.818×10⁻¹⁵ m. We get an L that is equal to about 2.81×10⁻³⁹ N∙m∙s, so now it is a tiny fraction of ħ/2! Hence, this leads us nowhere. Let us go for our last chance to get a meaningful result! Let us use the Compton scattering length, so that is about 2.42631×10⁻¹² m.

This gives us an L of 2.08×10⁻³³ N∙m∙s, which is only 20 times ħ. This is not so bad, but is it good enough? Let us calculate it the other way around: what value should we take for a so as to ensure L = a²·E₀²/(2·ħ·c²) = ħ/2? Let us write it out:

a²·E₀²/(2·ħ·c²) = ħ/2 ⇔ a² = ħ²·c²/E₀² ⇔ a = ħ·c/E₀ = ħ/(m₀·c)

In fact, this is the formula for the so-called reduced Compton wavelength. This is perfect. We found what we wanted to find. Substituting this value for a (you can calculate it: it is about 3.8616×10⁻¹³ m), we get what we should find:

L = a²·E₀²/(2·ħ·c²) = (ħ²·c²/E₀²)·E₀²/(2·ħ·c²) = ħ/2
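The numbers for the three candidate radii, and for the radius that yields exactly ħ/2, can be reproduced with a few lines of code. This is a sketch of the arithmetic only: the function name is ours, the radii are the rounded values quoted in the text (with the standard value of the Bohr radius), and ħ·c/E₀ is the reduced Compton wavelength.

```python
HBAR = 1.054571817e-34   # reduced Planck constant, N*m*s
C = 299_792_458.0        # speed of light, m/s
E0 = 8.1871e-14          # electron rest energy, J (about 0.511 MeV)


def angular_momentum(a, e0=E0):
    """L = a^2 * E0^2 / (2*hbar*c^2), the formula derived above for a uniform mass distribution."""
    return a**2 * e0**2 / (2.0 * HBAR * C**2)


candidates = {
    "Bohr radius": 5.29177e-11,                # m
    "classical electron radius": 2.818e-15,    # m
    "Compton wavelength": 2.42631e-12,         # m
}

for name, a in candidates.items():
    L = angular_momentum(a)
    print(f"{name:26s} L = {L:.3e} N*m*s   L/(hbar/2) = {L / (HBAR / 2):.3e}")

# Solving a^2 * E0^2 / (2*hbar*c^2) = hbar/2 for a gives a = hbar*c/E0.
a_required = HBAR * C / E0
print(f"required a = {a_required:.4e} m, L/(hbar/2) = {angular_momentum(a_required) / (HBAR / 2):.6f}")
```

Only the last radius gives a ratio of one. The Compton wavelength itself is 2π times larger, which explains the factor of about 20 that we found for it.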

This is a rather spectacular result, and one that would – a priori – support the interpretation of the wavefunction that is being suggested in this paper.

VIII. The boson-fermion dichotomy

Let us do some more thinking on the boson-fermion dichotomy. Again, we should remind ourselves that an actual particle is localized in space and that it can, therefore, not be represented by the elementary wavefunction ψ = a·e^{−i[E·t − p∙x]/ħ} or, for a particle at rest, the ψ = a·e^{−i∙E·t/ħ} function. We must build a wave packet for that: a sum of wavefunctions, each with their own amplitude a_i, and their own ω_i = −E_i/ħ. Each of these wavefunctions will contribute some energy to the total energy of the wave packet. Now, we can have another wild but logical theory about this.

Think of the apparent right-handedness of the elementary wavefunction: surely, Nature can’t be bothered about our convention of measuring phase angles clockwise or counterclockwise. Also, the angular momentum can be positive or negative: J = +ħ/2 or −ħ/2. Hence, we would probably like to think that an actual particle – think of an electron, or whatever other particle you’d think of – may consist of right-handed as well as left-handed elementary waves. To be precise, we may think they either consist of (elementary) right-handed waves or, else, of (elementary) left-handed waves. An elementary right-handed wave would be written as:

ψ(θ_i) = a_i·(cosθ_i + i·sinθ_i)

In contrast, an elementary left-handed wave would be written as:

ψ(θ_i) = a_i·(cosθ_i − i·sinθ_i)

How does that work out with the E₀·t argument of our wavefunction? Position is position, and direction is direction, but time? Time has only one direction, but Nature surely does not care how we count time: counting like 1, 2, 3, etcetera or like −1, −2, −3, etcetera is just the same. If we count like 1, 2, 3, etcetera, then we write our wavefunction like:

ψ = a·cos(E₀∙t/ħ) − i·a·sin(E₀∙t/ħ)

If we count time like −1, −2, −3, etcetera then we write it as:

ψ = a·cos(−E₀∙t/ħ) − i·a·sin(−E₀∙t/ħ)= a·cos(E₀∙t/ħ) + i·a·sin(E₀∙t/ħ)

Hence, it is just like the left- or right-handed circular polarization of an electromagnetic wave: we can have both for the matter-wave too! This, then, should explain why we can have either positive or negative quantum-mechanical spin (+ħ/2 or −ħ/2). It is the usual thing: we have two mathematical possibilities here, and so we must have two physical situations that correspond to it.
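The claim that counting time backwards flips the sense of rotation can be checked numerically. The sketch below uses illustrative units in which E₀/ħ = 1; the function names `psi`, `psi_reversed` and `rotation_sense` are assumptions made for the example.

```python
import cmath


def psi(t, a=1.0, omega=1.0):
    """Elementary wavefunction a*exp(-i*omega*t); omega stands in for E0/hbar."""
    return a * cmath.exp(-1j * omega * t)


def psi_reversed(t, a=1.0, omega=1.0):
    """The same wavefunction with time counted as -1, -2, -3, ..."""
    return psi(-t, a, omega)


def rotation_sense(wave, t, dt=1e-3):
    """Tell in which direction the phasor of `wave` turns around time t."""
    phase_step = cmath.phase(wave(t + dt) / wave(t))
    return "counterclockwise" if phase_step > 0 else "clockwise"


for t in (0.0, 0.5, 1.0, 2.5):
    # Reversing time turns the wavefunction into its complex conjugate.
    assert cmath.isclose(psi_reversed(t), psi(t).conjugate(), abs_tol=1e-12)

print("forward time: ", rotation_sense(psi, 0.4))
print("reversed time:", rotation_sense(psi_reversed, 0.4))
```

The two phasors have the same magnitude at every instant and differ only in the direction in which they turn.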

It is only natural. If we have left- and right-handed photons – or, generalizing, left- and right-handed bosons – then we should also have left- and right-handed fermions (electrons, protons, etcetera). Back to the dichotomy. The textbook analysis of the dichotomy between bosons and fermions may be epitomized by Richard Feynman’s Lecture on it (Feynman, III-4), which is confusing and – I would dare to say – even inconsistent: how are photons or electrons supposed to know that they need to interfere with a positive or a negative sign? They are not supposed to know anything: knowledge is part of our interpretation of whatever it is that is going on there.

Hence, it is probably best to keep it simple, and think of the dichotomy in terms of the different physical dimensions of the oscillation: newton per kg versus newton per coulomb. And then, of course, we should also note that matter-particles have a rest mass and, therefore, actually carry charge. Photons do not. But both are two-dimensional oscillations, and the point is: the so-called vacuum – and the rest mass of our particle (which is zero for the photon and non-zero for everything else) – give us the natural frequency for both oscillations, which is beautifully summed up in that remarkable equation for the group and phase velocity of the wavefunction, which applies to photons as well as matter-particles:

(v_phase/c)·(v_group/c) = 1 ⇔ v_p·v_g = c²
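For a massive particle, the relation between the two velocities can be verified with a short Python sketch. It assumes the usual relativistic expressions E = γ·m₀·c² and p = γ·m₀·v, takes the phase velocity as E/p and the group velocity as dE/dp = p·c²/E. The function name and the choice of the electron mass are assumptions for illustration only.

```python
import math

C = 299792458.0  # speed of light, m/s


def phase_and_group_velocity(v, m0=9.1093837015e-31):
    """Return (v_phase, v_group) for a particle of rest mass m0 moving at speed v < c."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    energy = gamma * m0 * C**2        # total energy E
    momentum = gamma * m0 * v         # momentum p
    v_phase = energy / momentum       # omega/k = E/p
    v_group = momentum * C**2 / energy  # dE/dp for E^2 = (p*c)^2 + (m0*c^2)^2
    return v_phase, v_group


for beta in (0.1, 0.5, 0.9):
    v_p, v_g = phase_and_group_velocity(beta * C)
    print(f"beta = {beta}: v_phase/c = {v_p / C:.4f}, v_group/c = {v_g / C:.4f}, "
          f"product/c^2 = {v_p * v_g / C**2:.6f}")
```

The group velocity comes out equal to the classical velocity v, while the phase velocity is c²/v and therefore always larger than c.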

The final question then is: why are photons spin-zero particles? Well… We should first remind ourselves of the fact that they do have spin when circularly polarized.[25] Here we may think of the rotation of the equivalent mass of their energy. However, if they are linearly polarized, then there is no spin. Even for circularly polarized waves, the spin angular momentum of photons is a weird concept. If photons have no (rest) mass, then they cannot carry any charge. They should, therefore, not have any magnetic moment. Indeed, what I wrote above shows an explanation of quantum-mechanical spin requires both mass as well as charge.[26]

IX. Concluding remarks

There are, of course, other ways to look at the matter – literally. For example, we can imagine two-dimensional oscillations as circular rather than linear oscillations. Think of a tiny ball, whose center of mass stays where it is, as depicted below. Any rotation – around any axis – will be some combination of a rotation around the two other axes. Hence, we may want to think of a two-dimensional oscillation as an oscillation of a polar and azimuthal angle.

Figure 8: Two-dimensional circular movement

The point of this paper is not to make any definite statements. That would be foolish. Its objective is just to challenge the simplistic mainstream viewpoint on the reality of the wavefunction. Stating that it is a mathematical construct only without physical significance amounts to saying it has no meaning at all. That is, clearly, a non-sustainable proposition.

The interpretation that is offered here looks at amplitude waves as traveling fields. Their physical dimension may be expressed in force per mass unit, as opposed to electromagnetic waves, whose amplitudes are expressed in force per (electric) charge unit. Also, the amplitudes of matter-waves incorporate a phase factor, but this may actually explain the rather enigmatic dichotomy between fermions and bosons and is, therefore, an added bonus.

The interpretation that is offered here has some advantages over other explanations, as it explains the how of diffraction and interference. However, while it offers a great explanation of the wave nature of matter, it does not explain its particle nature: while we think of the energy as being spread out, we will still observe electrons and photons as pointlike particles once they hit the detector. Why is it that a detector can sort of ‘hook’ the whole blob of energy, so to speak?

The interpretation of the wavefunction that is offered here does not explain this. Hence, the complementarity principle of the Copenhagen interpretation of the wavefunction surely remains relevant.

Appendix 1: The de Broglie relations and energy

The 1/2 factor in Schrödinger’s equation is related to the concept of the effective mass (m_eff). It is easy to make the wrong calculations. For example, when playing with the famous de Broglie relations – aka the matter-wave equations – one may be tempted to derive the following energy concept:

E = h·f and p = h/λ. Therefore, f = E/h and λ = h/p.
v = f·λ = (E/h)∙(h/p) = E/p
p = m·v. Therefore, E = v·p = m·v²

E = m·v²? This resembles the E = mc² equation and, therefore, one may be enthused by the discovery, especially because the m·v² also pops up when working with the Least Action Principle in classical mechanics, which states that the path that is followed by a particle will minimize the following integral: S = ∫(KE − PE)·dt, with the integral taken over the time interval from t₁ to t₂. Now, we can choose any reference point for the potential energy but, to reflect the energy conservation law, we can select a reference point that ensures the sum of the kinetic and the potential energy is zero throughout the time interval. If the force field is uniform, then the integrand will, effectively, be equal to KE − PE = m·v².[27]

However, that is classical mechanics and, therefore, not so relevant in the context of the de Broglie equations, and the apparent paradox should be solved by distinguishing between the group and the phase velocity of the matter wave.

Appendix 2: The concept of the effective mass

The effective mass – as used in Schrödinger’s equation – is a rather enigmatic concept. To make sure we are making the right analysis here, I should start by noting you will usually see Schrödinger’s equation written as: i·ħ·∂ψ(x, t)/∂t = −(1/2)·(ħ²/m_eff)·∇²ψ(x, t) + U·ψ(x, t). This formulation includes a term with the potential energy (U). In free space (no potential), this term disappears, and the equation can be re-written as:

∂ψ(x, t)/∂t = i·(1/2)·(ħ/m_eff)·∇²ψ(x, t)

We just moved the i·ħ coefficient to the other side, noting that 1/i = –i. Now, in one-dimensional space, and assuming ψ is just the elementary wavefunction (so we substitute a·e^{−i∙[E·t − p∙x]/ħ} for ψ), this implies the following:

−a·i·(E/ħ)·e^{−i∙[E·t − p∙x]/ħ} = −i·(ħ/2m_eff)·a·(p²/ħ²)·e^{−i∙[E·t − p∙x]/ħ}

⇔ E = p²/(2m_eff) ⇔ m_eff = m∙(v/c)²/2 = m∙β²/2

It is an ugly formula: it resembles the kinetic energy formula (K.E. = m∙v²/2) but it is, in fact, something completely different. The β²/2 factor ensures the effective mass is always a fraction of the mass itself. To get rid of the ugly 1/2 factor, we may re-define m_eff as two times the old m_eff (hence, m_eff^NEW = 2∙m_eff^OLD), as a result of which the formula will look somewhat better:

m_eff = m∙(v/c)² = m∙β²

We know β varies between 0 and 1 and, therefore, m_eff will vary between 0 and m. Feynman drops the subscript, and just writes m_eff as m in his textbook (see Feynman, III-19). On the other hand, the electron mass as used is also the electron mass that is used to calculate the size of an atom (see Feynman, III-2-4). As such, the two mass concepts are, effectively, mutually compatible. It is confusing because the same mass is often defined as the mass of a stationary electron (see, for example, the article on it in the online Wikipedia encyclopedia[28]).

In the context of the derivation of the electron orbitals, we do have the potential energy term – which is the equivalent of a source term in a diffusion equation – and that may explain why the above-mentioned m_eff = m∙(v/c)² = m∙β² formula does not apply.
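The substitution of the elementary wavefunction into the free-space equation can be checked numerically. The following is a minimal sketch, assuming natural units (ħ = 1) and arbitrary illustrative values for p and m_eff; the derivatives are approximated with central finite differences, and the names `psi` and `residual` are hypothetical.

```python
import cmath

HBAR = 1.0  # natural units, an assumption made to keep the numbers well scaled


def psi(x, t, p, m_eff, a=1.0):
    """Elementary wavefunction a*exp(-i*(E*t - p*x)/hbar) with E = p^2/(2*m_eff)."""
    energy = p**2 / (2.0 * m_eff)
    return a * cmath.exp(-1j * (energy * t - p * x) / HBAR)


def residual(x, t, p, m_eff, h=1e-4):
    """Size of d(psi)/dt - i*(hbar/(2*m_eff))*d2(psi)/dx2 at the point (x, t)."""
    # Central difference for the first time derivative.
    d_dt = (psi(x, t + h, p, m_eff) - psi(x, t - h, p, m_eff)) / (2.0 * h)
    # Central difference for the second space derivative.
    d2_dx2 = (psi(x + h, t, p, m_eff) - 2.0 * psi(x, t, p, m_eff)
              + psi(x - h, t, p, m_eff)) / h**2
    return abs(d_dt - 1j * (HBAR / (2.0 * m_eff)) * d2_dx2)


print(residual(x=0.3, t=0.8, p=1.3, m_eff=0.7))
```

The residual is zero up to discretization error, which confirms that the plane wave solves the equation only when E = p²/(2·m_eff).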

References

This paper discusses general principles in physics only. Hence, references can be limited to references to physics textbooks only. For ease of reading, any reference to additional material has been limited to a more popular undergrad textbook that can be consulted online: Feynman’s Lectures on Physics (http://www.feynmanlectures.caltech.edu). References are per volume, per chapter and per section. For example, Feynman III-19-3 refers to Volume III, Chapter 19, Section 3.

Notes

[1] Of course, an actual particle is localized in space and can, therefore, not be represented by the elementary wavefunction ψ = a·e^{−i∙θ} = a·e^{−i[E·t − p∙x]/ħ} = a·(cosθ – i·sinθ). We must build a wave packet for that: a sum of wavefunctions, each with its own amplitude a_k and its own argument θ_k = (E_k∙t – p_k∙x)/ħ. This is dealt with in this paper as part of the discussion on the mathematical and physical interpretation of the normalization condition.

[2] The N/kg dimension immediately, and naturally, reduces to the dimension of acceleration (m/s²), thereby facilitating a direct interpretation in terms of Newton’s force law.

[3] In physics, a two-spring metaphor is more common. Hence, the pistons in the author’s perpetuum mobile may be replaced by springs.

[4] The author re-derives the equation for the Compton scattering radius in section VII of the paper.

[5] The magnetic force can be analyzed as a relativistic effect (see Feynman II-13-6). The dichotomy between the electric force as a polar vector and the magnetic force as an axial vector disappears in the relativistic four-vector representation of electromagnetism.

[6] For example, when using Schrödinger’s equation in a central field (think of the electron around a proton), the use of polar coordinates is recommended, as it ensures the symmetry of the Hamiltonian under all rotations (see Feynman III-19-3).

[7] This sentiment is usually summed up in the apocryphal quote: “God does not play dice.” The actual quote comes out of one of Einstein’s private letters to Cornelius Lanczos, another scientist who had also emigrated to the US. The full quote is as follows: “You are the only person I know who has the same attitude towards physics as I have: belief in the comprehension of reality through something basically simple and unified… It seems hard to sneak a look at God’s cards. But that He plays dice and uses ‘telepathic’ methods… is something that I cannot believe for a single moment.” (Helen Dukas and Banesh Hoffman, Albert Einstein, the Human Side: New Glimpses from His Archives, 1979).

[8] Of course, both are different velocities: ω is an angular velocity, while v is a linear velocity: ω is measured in radians per second, while v is measured in meter per second. However, the definition of a radian implies radians are measured in distance units. Hence, the physical dimensions are, effectively, the same. As for the formula for the total energy of an oscillator, we should actually write: E = m·a²∙ω²/2. The additional factor (a) is the (maximum) amplitude of the oscillator.

[9] We also have a 1/2 factor in the E = mv²/2 formula. Two remarks may be made here. First, it may be noted this is a non-relativistic formula and, more importantly, incorporates kinetic energy only. Using the Lorentz factor (γ), we can write the relativistically correct formula for the kinetic energy as K.E. = E − E₀ = m_vc² − m₀c² = m₀γc² − m₀c² = m₀c²(γ − 1). As for the exclusion of the potential energy, we may note that we may choose our reference point for the potential energy such that the kinetic and potential energy mirror each other. The energy concept that then emerges is the one that is used in the context of the Principle of Least Action: it equals E = mv². Appendix 1 provides some notes on that.

[10] Instead of two cylinders with pistons, one may also think of connecting two springs with a crankshaft.

[11] It is interesting to note that we may look at the energy in the rotating flywheel as potential energy because it is energy that is associated with motion, albeit circular motion. In physics, one may associate a rotating object with kinetic energy using the rotational equivalent of mass and linear velocity, i.e. rotational inertia (I) and angular velocity ω. The kinetic energy of a rotating object is then given by K.E. = (1/2)·I·ω².

[12] Because of the sideways motion of the connecting rods, the sinusoidal function will describe the linear motion only approximately, but you can easily imagine the idealized limit situation.

[13] The ω² = 1/LC formula gives us the natural or resonant frequency for an electric circuit consisting of a resistor (R), an inductor (L), and a capacitor (C). Writing the formula as ω² = C⁻¹/L introduces the concept of elastance, which is the equivalent of the mechanical stiffness (k) of a spring.

[14] The resistance in an electric circuit introduces a damping factor. When analyzing a mechanical spring, one may also want to introduce a drag coefficient. Both are usually defined as a fraction of the inertia, which is the mass for a spring and the inductance for an electric circuit. Hence, we would write the resistance for a spring as γm and as R = γL respectively.

[15] Photons are emitted by atomic oscillators: atoms going from one state (energy level) to another. Feynman (Lectures, I-33-3) shows us how to calculate the Q of these atomic oscillators: it is of the order of 10⁸, which means the wave train will last about 10⁻⁸ seconds (to be precise, that is the time it takes for the radiation to die out by a factor 1/e). For example, for sodium light, the radiation will last about 3.2×10⁻⁸ seconds (this is the so-called decay time τ). Now, because the frequency of sodium light is some 500 THz (500×10¹² oscillations per second), this makes for some 16 million oscillations. There is an interesting paradox here: the speed of light tells us that such wave train will have a length of about 9.6 m! How is that to be reconciled with the pointlike nature of a photon? The paradox can only be explained by relativistic length contraction: in an analysis like this, one needs to distinguish the reference frame of the photon – riding along the wave as it is being emitted, so to speak – and our stationary reference frame, which is that of the emitting atom.

[16] This is a general result and is reflected in the K.E. = T = (1/2)·m·ω²·a²·sin²(ω·t + Δ) and the P.E. = U = k·x²/2 = (1/2)· m·ω²·a²·cos²(ω·t + Δ) formulas for the linear oscillator.

[17] Feynman further formalizes this in his Lecture on Superconductivity (Feynman, III-21-2), in which he refers to Schrödinger’s equation as the “equation for continuity of probabilities”. The analysis is centered on the local conservation of energy, which confirms the interpretation of Schrödinger’s equation as an energy diffusion equation.

[18] The m_eff is the effective mass of the particle, which depends on the medium. For example, an electron traveling in a solid (a transistor, for example) will have a different effective mass than in an atom. In free space, we can drop the subscript and just write m_eff = m. Appendix 2 provides some additional notes on the concept. As for the equations, they are easily derived from noting that two complex numbers a + i∙b and c + i∙d are equal if, and only if, their real and imaginary parts are the same. Now, the ∂ψ/∂t = i∙(ħ/m_eff)∙∇²ψ equation amounts to writing something like this: a + i∙b = i∙(c + i∙d). Now, remembering that i² = −1, you can easily figure out that i∙(c + i∙d) = i∙c + i²∙d = − d + i∙c.

[19] The dimension of B is usually written as N/(m∙A), using the SI unit for current, i.e. the ampere (A). However, 1 C = 1 A∙s and, hence, 1 N/(m∙A) = 1 (N/C)/(m/s).

[20] Of course, multiplication with i amounts to a counterclockwise rotation. Hence, multiplication by –i also amounts to a rotation by 90 degrees, but clockwise. Now, to uniquely identify the clockwise and counterclockwise directions, we need to establish the equivalent of the right-hand rule for a proper geometric interpretation of Schrödinger’s equation in three-dimensional space: if we look at a clock from the back, then its hand will be moving counterclockwise. When writing B = (1/c)∙i∙E, we assume we are looking in the negative x-direction. If we are looking in the positive x-direction, we should write: B = -(1/c)∙i∙E. Of course, Nature does not care about our conventions. Hence, both should give the same results in calculations. We will show in a moment they do.

[21] In fact, when multiplying C²/(N·m²) with N²/C², we get N/m², but we can multiply this with 1 = m/m to get the desired result. It is significant that an energy density (joule per unit volume) can also be measured in newton per square meter (force per unit area).

[22] The illustration shows a linearly polarized wave, but the obtained result is general.

[23] The sine and cosine are essentially the same functions, except for the difference in the phase: sinθ = cos(θ − π/2).

[24] I must thank a physics blogger for re-writing the 1/(ε₀·μ₀) = c² equation like this. See: http://reciprocal.systems/phpBB3/viewtopic.php?t=236 (retrieved on 29 September 2017).

[25] A circularly polarized electromagnetic wave may be analyzed as consisting of two perpendicular electromagnetic plane waves of equal amplitude and 90° difference in phase.

[26] Of course, the reader will now wonder: what about neutrons? How to explain neutron spin? Neutrons are neutral. That is correct, but neutrons are not elementary: they consist of (charged) quarks. Hence, neutron spin can (or should) be explained by the spin of the underlying quarks.

[27] We detailed the mathematical framework and detailed calculations in the following online article: https://readingfeynman.org/2017/09/15/the-principle-of-least-action-re-visited.

[28] https://en.wikipedia.org/wiki/Electron_rest_mass (retrieved on 29 September 2017).

The Liénard–Wiechert potentials and the solution for Maxwell’s equations

In my post on gauges and gauge transformations in electromagnetics, I mentioned the full and complete solution for Maxwell’s equations, using the electric and magnetic (vector) potential Φ and A. Feynman frames it nicely, so I should print it and put it on the kitchen door, so I can look at it every day. 🙂
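For reference, the general solution in terms of retarded potentials is usually written as follows. This is a sketch in the notation of Feynman's Lectures (II-21): the labels (1) and (2) for the field point and the source point, r₁₂ for the distance between them, and dV₂ for the volume element at the source are assumptions about the notation, not something stated in this post.

```latex
\Phi(1,t) = \int \frac{\rho(2,\, t - r_{12}/c)}{4\pi\varepsilon_0\, r_{12}}\, dV_2,
\qquad
\mathbf{A}(1,t) = \int \frac{\mathbf{j}(2,\, t - r_{12}/c)}{4\pi\varepsilon_0 c^2\, r_{12}}\, dV_2
```

Both integrals run over all of space, and the charge and current densities are evaluated at the retarded time t − r₁₂/c.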

I should print the wave equation we derived in our previous post too. Hmm… Stupid question, perhaps, but why is there no wave equation above? I mean: in the previous post, we said the wave equation was the solution for Maxwell’s equations, didn’t we? The answer is simple, of course: the wave equation is a solution for waves originating from some source and traveling through free space, so that’s a special case. Here we have everything. Those integrals ‘sweep’ all over space, and so that’s real space, which is full of moving charges and so there’s waves everywhere. So the solution above is far more general and captures it all: it’s the potential at every point in space, and at every point in time, taking into account whatever else is there, moving or not moving. In fact, it is the general solution of Maxwell’s equations.

How do we find it? Well… I could copy Feynman’s 21st Lecture but I won’t do that. The solution is based on the formula for Φ and A for a small blob of charge, and then the formulas above just integrate over all of space. That solution for a small blob of charge, i.e. a point charge really, was first deduced in 1898, by a French engineer: Alfred-Marie Liénard. However, his equations did not get much attention, apparently, because a German physicist, Emil Johann Wiechert, worked on the same thing and found the very same equations just two years later. That’s why they are referred to as the Liénard-Wiechert potentials, so they both get credit for it, even if both of them worked it out independently. These are the equations:
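In their standard textbook form (again assuming the notation of Feynman's Lectures, II-21), the potentials at a field point (1) for a point charge q moving with velocity v read:

```latex
\Phi(1,t) = \frac{q}{4\pi\varepsilon_0 \left[\, r - \mathbf{v}\cdot\mathbf{r}/c \,\right]_{\text{ret}}},
\qquad
\mathbf{A}(1,t) = \frac{q\,\mathbf{v}_{\text{ret}}}{4\pi\varepsilon_0 c^2 \left[\, r - \mathbf{v}\cdot\mathbf{r}/c \,\right]_{\text{ret}}}
```

Here r is the vector from the charge to the field point, and the subscript "ret" indicates that everything inside the brackets is evaluated at the retarded time t − r/c.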

Now, you may wonder why I am mentioning them, and you may also wonder how we get those integrals above, i.e. our general solution for Maxwell’s equations, from them. You can find the answer to your second question in Feynman’s 21st Lecture. 🙂 As for the first question, I mention them because one can derive two other formulas for E and B from them. It’s the formulas that Feynman uses in his first Volume, when studying light:

Now you’ll probably wonder how we can get these two equations from the Liénard-Wiechert potentials. They don’t look very similar, do they? No, they don’t. Frankly, I would like to give you the same answer as above, i.e. check it in Feynman’s 21st Lecture, but the truth is that the derivation is so long and tedious that even Feynman says one needs “a lot of paper and a lot of time” for that. So… Well… I’d suggest we just use all of those formulas and not worry too much about where they come from. If we can agree on that, we’re actually sort of finished with electromagnetism. All the chapters that follow Feynman’s 21st Lecture are applications indeed, so they do not add all that much to the core of the classical theory of electromagnetism.

So why did I write this post? Well… I am not sure. I guess I just wanted to sum things up for myself, so I can print it all out and put it on the kitchen door indeed. 🙂 Oh, and now that I think of it, I should add one more formula, and that’s the formula for spherical waves (as opposed to the plane waves we discussed in my previous post). It’s a very simple formula, and entirely what you’d expect to see:

The S function is the source function, and you can see that the formula is a Coulomb-like potential, but with the retarded argument. You’ll wonder: what is ψ? Is it E or B or what? Well… You can just substitute: ψ can be anything. Indeed, Feynman gives a very general solution for any type of spherical wave here. 🙂

So… That’s it, folks. That’s all there is to it. I hope you enjoyed it. 🙂

Addendum: Feynman’s equation for electromagnetic radiation

I talked about Feynman’s formula for electromagnetic radiation before, but it’s probably good to quickly re-explain it here. Note that it talks about the electric field only, as the magnetic field is so tiny and, in any case, if we have E then we can find B. So the formula is:
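For reference, and assuming the standard form given in Feynman's Lectures (I-28), the formula for the field E at the point P reads:

```latex
\mathbf{E} = -\frac{q}{4\pi\varepsilon_0}
\left[
\frac{\mathbf{e}_{r'}}{r'^2}
+ \frac{r'}{c}\,\frac{d}{dt}\!\left(\frac{\mathbf{e}_{r'}}{r'^2}\right)
+ \frac{1}{c^2}\,\frac{d^2 \mathbf{e}_{r'}}{dt^2}
\right]
```

The primes indicate retarded quantities: r′ is the distance to where the charge was at the time t − r′/c, and e_r′ is the unit vector pointing in that direction.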

The geometry of the situation is depicted below. We have some charge q that, we assume, is moving through space, and so it creates some field E at point P. The e_r‘ vector is the unit vector from P to Q, so it points at the charge. Well… It points to where the charge was a little while ago, i.e. at the time t – r‘/c. Why? Well… Because the field needs some time to travel, we don’t know where q is right now, i.e. at time t. It might be anywhere. Perhaps it followed some weird trajectory during the time r‘/c, like the trajectory below.

So our e_r‘ vector moves as the charge moves, and so it will also have velocity and, likely, some acceleration, but what we measure for its velocity and acceleration, i.e. the d(e_r‘)/dt and d²(e_r‘)/dt² in that Feynman equation, is also the retarded velocity and the retarded acceleration. But look at the terms in the equation. The first two terms have a 1/r‘² in them, so these two effects diminish with the square of the distance. The first term is just Coulomb’s Law (note that the minus sign in front takes care of the fact that like charges repel and so the E vector will point in the other way). Well… It is and it isn’t, because of the retarded time argument, of course. And so we have the second term, which sort of compensates for that. Indeed, the d(e_r‘)/dt is the time rate of change of e_r‘ and, hence, if r‘/c = Δt, then (r‘/c)·d(e_r‘)/dt is a first-order approximation of Δe_r‘.

As Feynman puts it: “The second term is as though nature were trying to allow for the fact that the Coulomb effect is retarded, if we might put it very crudely. It suggests that we should calculate the delayed Coulomb field but add a correction to it, which is its rate of change times the time delay that we use. Nature seems to be attempting to guess what the field at the present time is going to be, by taking the rate of change and multiplying by the time that is delayed.” In short, the first two terms can be written as E = −(q/4πε₀)/r‘²·[e_r‘ + Δe_r‘] and, hence, it’s a sort of modified Coulomb Law that sort of tries to guess what the electrostatic field at P should be based on (a) what it is right now, and (b) how q’s direction and velocity, as measured now, would change it.

Now, the third term has a 1/c² factor in front but, unlike the other two terms, this effect does not fall off with the square of the distance. So the third term, taken on its own, i.e. E = −(q/4πε₀c²)·d²(e_r‘)/dt², fully describes electromagnetic radiation, indeed, because it’s the only important term when we get ‘far enough away’, with ‘far enough’ meaning that the parts that go as the inverse square of the distance have fallen off so much that they’re no longer significant.

Of course, you’re smart, and so you’ll immediately note that, as r increases, that unit vector keeps wiggling but that effect will also diminish. You’re right. It does, but in a fairly complicated way. The acceleration of e_r‘ has two components indeed. One is the transverse or tangential piece, because the end of e_r‘ goes up and down, and the other is a radial piece because it stays on a sphere and so it changes direction. The radial piece is the smallest bit, and actually also varies as the inverse square of r when r is fairly large. The tangential piece, however, varies only inversely as the distance, so as 1/r. So, yes, the wigglings of e_r‘ look smaller and smaller, inversely as the distance, but the tangential piece is and remains significant, because it does not vary as 1/r² but as 1/r only. That’s why you’ll usually see the law of radiation written in an even simpler way:

This law reduces the whole effect to the component of the acceleration that is perpendicular to the line of sight only. It assumes the distance is huge as compared to the distance over which the charge is moving and, therefore, that r‘ and r can be equated for all practical purposes. It also notes that the tangential piece is all that matters, and so it equates d²(e_r‘)/dt² with a_x/r. The whole thing is probably best illustrated as below: we have a generator driving charges up and down in G – so it’s an antenna really – and so we’ll measure a strong signal when putting the radiation detector D in position 1, but we’ll measure nothing in position 3. [The detector is, of course, another antenna, but with an amplifier for the signal.] But so here I am starting to talk about electromagnetic radiation once more, which was not what I wanted to do here, if only because Feynman does a much better job at that than I could ever do. 🙂
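To get a feel for the numbers, here is a minimal Python sketch of that simplified law, assuming its usual far-field form E = −q·a_⊥(t − r/c)/(4πε₀·c²·r), with a_⊥ = a·sinθ the component of the (retarded) acceleration perpendicular to the line of sight. The function name, the angle convention (θ measured from the direction of the acceleration) and the sample values are all assumptions made for illustration.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
C = 299792458.0          # speed of light, m/s
Q_E = 1.602176634e-19    # elementary charge, C


def radiation_field(q, accel, r, theta):
    """Far-field E for a charge q whose retarded acceleration has magnitude accel.

    theta is the angle between the acceleration and the line of sight,
    so only accel*sin(theta) contributes to the radiated field.
    """
    a_perp = accel * math.sin(theta)
    return -q * a_perp / (4.0 * math.pi * EPS0 * C**2 * r)


accel = 1.0e18  # m/s^2, an arbitrary illustrative value
for label, theta in (("broadside (position 1)", math.pi / 2),
                     ("at 45 degrees", math.pi / 4),
                     ("along the axis (position 3)", 0.0)):
    e_near = radiation_field(Q_E, accel, r=1.0, theta=theta)
    e_far = radiation_field(Q_E, accel, r=2.0, theta=theta)
    print(f"{label:28s} E(1 m) = {e_near:.3e} V/m   E(2 m) = {e_far:.3e} V/m")
```

The field vanishes along the direction of the acceleration and halves when the distance doubles, which is the 1/r behaviour discussed above.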

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/
Some content on this page was disabled on June 17, 2020 as a result of a DMCA takedown notice from Michael A. Gottlieb, Rudolf Pfeiffer, and The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/
Some content on this page was disabled on June 20, 2020 as a result of a DMCA takedown notice from Michael A. Gottlieb, Rudolf Pfeiffer, and The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

Traveling fields: the wave equation and its solutions

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. In addition, I note the dark force has amused himself by removing some material. So no use to read this. Read my recent papers instead. 🙂

Original post:

We’ve climbed a big mountain over the past few weeks, post by post, 🙂 slowly gaining height, and carefully checking out the various routes to the top. But we are there now: we finally fully understand how Maxwell’s equations actually work. Let me jot them down once more:

As for how real or unreal the E and B fields are, I gave you Feynman’s answer to it, so… Well… I can’t add to that. I should just note, or remind you, that we have a fully equivalent description of it all in terms of the electric and magnetic (vector) potential Φ and A, and so we can ask the same question about Φ and A. They explain real stuff, so they’re real in that sense. That’s what Feynman’s answer amounts to, and I am happy with it. 🙂

What I want to do here is show how we can get from those equations to some kind of wave equation: an equation that describes how a field actually travels through space. So… Well… Let’s first look at that very particular wave function we used in the previous post to prove that electromagnetic waves propagate with speed c, i.e. the speed of light. The fields were very simple: the electric field had a y-component only, and the magnetic field a z-component only. Their magnitudes, i.e. their magnitudes in the region the fields had already reached as they fill the space traveling outwards, were given in terms of J, i.e. the surface current density going in the positive y-direction, and the geometry of the situation is illustrated below.

The fields were, obviously, zero where the fields had not reached as they were traveling outwards. And, yes, I know that sounds stupid. But… Well… It’s just to make clear what we’re looking at here. 🙂

We also showed what the wave would look like if we would turn off its First Cause after some time T, so if the moving sheet of charge would no longer move after time T. We’d have the following pulse traveling through space, a rectangular shape really:

We can imagine more complicated shapes for the pulse, like the shape shown below. J goes from one unit to two units at time t = t₁ and then to zero at t = t₂. Now, the illustration on the right shows the electric field as a function of x at the time t shown by the arrow. We’ve seen this before when discussing waves: if the speed of travel of the wave is equal to c, then x is equal to x = c·t, and the pattern is as shown below indeed: it mirrors what happened at the source x/c seconds ago. So we write:

This idea of using the retarded time t’ = t − x/c in the argument of a wave function f – or, what amounts to the same, using x − ct – is key to understanding wave functions. I’ve explained this in very simple language in a post for my kids and, if you don’t get this, I recommend you check it out. What we’re doing, basically, is converting something expressed in time units into something expressed in distance units, or vice versa, using the velocity of the wave as the scale factor, so time and distance are both expressed in the same unit, which may be seconds or meters.

To see how it works, suppose we add some time Δt to the argument of our wave function f, so we’re looking at f[x−c(t+Δt)] now, instead of f(x−ct). Now, f[x−c(t+Δt)] = f(x−ct−cΔt), so we’ll get a different value for our function, obviously! But it’s easy to see that we can restore our wave function f to its former value by also adding some distance Δx = cΔt to the argument. Indeed, if we do so, we get f[x+Δx−c(t+Δt)] = f(x+cΔt−ct−cΔt) = f(x−ct). You’ll say: t − x/c is not the same as x−ct. It is and it isn’t: any function of x−ct is also a function of t − x/c, because we can write:

Here, I need to add something about the direction of travel. The pulse above travels in the positive x-direction, so that’s why we have x minus ct in the argument. For a wave traveling in the negative x-direction, we’ll have a wave function y = F(x+ct). In any case, I can’t dwell on this, so let me move on.
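The shift argument above is easy to try out numerically. What follows is just a minimal sketch: the Gaussian pulse shape, the rounded value for c and the sample values for x, t and Δt are all my own illustrative assumptions, not something taken from the previous post.

```python
import math

C = 3.0e8  # wave speed in m/s (rounded speed of light; an illustrative value)


def f(s):
    """Arbitrary pulse shape (a Gaussian bump); purely illustrative."""
    return math.exp(-s * s)


def wave(x, t):
    """The shape f travelling in the positive x-direction at speed C."""
    return f(x - C * t)


def h(t_retarded):
    """The same shape, re-expressed as a function of time: h(t') = f(-C*t')."""
    return f(-C * t_retarded)


def wave_retarded(x, t):
    """The same wave, written in terms of the retarded time t - x/C."""
    return h(t - x / C)


x, t, dt = 2.0, 1.0e-8, 5.0e-9
dx = C * dt  # the distance the shape covers during dt

before = wave(x, t)                    # value at (x, t)
later_same_place = wave(x, t + dt)     # adding time only changes the value
later_shifted = wave(x + dx, t + dt)   # adding dx = C*dt as well restores it

print(before, later_same_place, later_shifted)
print(wave_retarded(x, t))
```

The first and third numbers should agree (up to rounding), and so should the value we get from the retarded-time form: it is one and the same shape, just labeled in distance units or in time units.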

Now, Maxwell’s equations in free or empty space, where there are no charges or currents to interact with, reduce to:

Now, how can we relate this set of complicated equations to a simple wave function? Let’s do the exercise for our simple E_y and B_z wave. Let’s start by writing out the first equation, i.e. ∇·E = 0, so we get:

Now, our wave does not vary in the y and z direction, so none of the components, including E_y and E_z, depend on y or z. It only varies in the x-direction, so ∂E_y/∂y and ∂E_z/∂z are zero. Note that the cross-derivatives ∂E_y/∂z and ∂E_z/∂y are also zero: we’re talking a plane wave here, the field varies only with x. Therefore, because ∇·E = 0, ∂E_x/∂x must be zero and, hence, E_x must be zero.

Huh? What? How is that possible? You just said that our field does vary in the x-direction! And now you’re saying it doesn’t? Read carefully. I know it’s complicated business, but it all makes sense. Look at the function: we’re talking E_y, not E_x. E_y does vary as a function of x, but our field does not have an x-component, so E_x = 0. We have no cross-derivative ∂E_y/∂x in the divergence of E (i.e. in ∇·E = 0).

Huh? What? Let me put it differently. E has three components: E_x, E_y and E_z, and we have three space coordinates: x, y and z, so we have nine derivatives. What I am saying is that all derivatives with respect to y and z are zero. That still leaves us with three derivatives: ∂E_x/∂x, ∂E_y/∂x, and ∂E_z/∂x. So… Because all derivatives with respect to y and z are zero, and because of the ∇·E = 0 equation, we know that ∂E_x/∂x must be zero. So, to make a long story short, I did not say anything about ∂E_y/∂x or ∂E_z/∂x. These may still be whatever they want to be, and they may vary in more or in less complicated ways. I’ll give an example of that in a moment.

Having said that, I do agree that I was a bit quick in writing that, because ∂E_x/∂x = 0, E_x must be zero too. Looking at the math only, E_x is not necessarily zero: it might be some non-zero constant. So… Yes. That’s a mathematical possibility. The static field from some charged condenser plate would be an example of a constant E_x field. However, the point is that we’re not looking at such static fields here: we’re talking dynamics here, and we’re looking at a particular type of wave: we’re talking a so-called plane wave. Now, the wave front of a plane wave is… Well… A plane. 🙂 So E_x is zero indeed. It’s a general result for plane waves: the electric field of a plane wave will always be at right angles to the direction of propagation.

Hmm… I can feel your skepticism here. You’ll say I am arbitrarily restricting the field of analysis… Well… Yes. For the moment. It’s not an unreasonable restriction though. As I mentioned above, the field of a plane wave may still have non-zero components in both the y- and z-directions, as shown in the illustration below (for which the credit goes to Wikipedia), which visualizes the electric field of circularly polarized light. In any case, don’t worry too much about it. Let’s get back to the analysis. Just note we’re talking plane waves here. We’ll talk about non-plane waves i.e. incoherent light waves later. 🙂

So we have plane waves and, therefore, a so-called transverse E field which we can resolve in two components: E_y and E_z. However, we wanted to study a very simple E_y field only. Why? Remember the objective of this lesson: it’s just to show how we go from Maxwell’s equations to the wave function, and so let’s keep the analysis as simple as we can for now: we can make it more general later. In fact, if we do the analysis now for non-zero E_y and zero E_z, we can do a similar analysis for non-zero E_z and zero E_y, and the general solution is going to be some superposition of two such fields, so we’ll have a non-zero E_y and E_z. Capito? 🙂 So let me write out Maxwell’s second equation, and use the results we got above, so I’ll incorporate the zero values for the derivatives with respect to y and z, and also the assumption that E_z is zero. So we get:

[By the way: note that, out of the nine derivatives, the curl involves only the (six) cross-derivatives. That’s linked to the neat separation between the curl and the divergence operator. Math is great! :-)]

Now, because of the flux rule (∇×E = –∂B/∂t), we can (and should) equate the three components of ∇×E above with the three components of –∂B/∂t, so we get:

[In case you wonder what it is that I am trying to do, patience, please! We’ll get where we want to get. Just hang in there and read on.] Now, ∂B_x/∂t = 0 and ∂B_y/∂t = 0 do not necessarily imply that B_x and B_y are zero: there might be some magnets and, hence, we may have some constant static field. However, that’s a matter of choosing a reference point or, more simply, assuming that empty space is effectively empty, and so we don’t have magnets lying around and so we assume that B_x and B_y are effectively zero. [Again, we can always throw more stuff in when our analysis is finished, but let’s keep it simple and stupid right now, especially because the B_x = B_y = 0 assumption is entirely in line with the E_x = E_z = 0 assumption.]

The equations above tell us what we know already: the E and B fields are at right angles to each other. However, note, once again, that this is a more general result for all plane electromagnetic waves, so it’s not only that very special caterpillar or butterfly field that we’re looking at. [If you didn’t read my previous post, you won’t get the pun, but don’t worry about it. You need to understand the equations, not the silly jokes.]

OK. We’re almost there. Now we need Maxwell’s last equation. When we write it out, we get the following monstrous-looking set of equations:

However, because of all of the equations involving zeroes above 🙂 only ∂B_z/∂x is not equal to zero, so the whole set reduces to one simple equation only:

Simplifying assumptions are great, aren’t they? 🙂 Having said that, it’s easy to be confused. You should watch out for the denominators: a ∂x and a ∂t are two very different things. So we have two equations now involving first-order derivatives:

∂B_z/∂t = −∂E_y/∂x
−c²∂B_z/∂x = ∂E_y/∂t

So what? Patience, please! 🙂 Let’s differentiate the first equation with respect to x and the second with respect to t. Why? Because… Well… You’ll see. Don’t complain. It’s simple. Just do it. We get:

∂[∂B_z/∂t]/∂x = −∂²E_y/∂x²
∂[−c²∂B_z/∂x]/∂t = ∂²E_y/∂t²

The order of differentiation does not matter, so the left-hand side of the second equation is just −c² times the left-hand side of the first one. Hence, we can eliminate B_z: −c² times the right-hand side of the first equation must equal the right-hand side of the second, i.e. c²∂²E_y/∂x² = ∂²E_y/∂t². What we get is a differential equation of the second order that we’ve encountered already, when we were studying wave equations. In fact, it is the wave equation for one-dimensional waves:

∂²E_y/∂x² − (1/c²)·∂²E_y/∂t² = 0

In case you want to double-check, I did a few posts on this, but, if you don’t get this, well… I am sorry. You’ll need to do some homework. More in particular, you’ll need to do some homework on differential equations. The equation above is basically some constraint on the functional form of E_y. More in general, if we see an equation like:

∂²ψ/∂x² − (1/c²)·∂²ψ/∂t² = 0

then the function ψ(x, t) must be some function

ψ(x, t) = f(x − ct) + g(x + ct)

So any function ψ like that will work. You can check it out by doing the necessary derivatives and plugging them into the wave equation. [In case you wonder how you should go about this, Feynman actually does it for you in his Lecture on this topic, so you may want to check it there.]
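If you don’t feel like doing the derivatives by hand, you can let a few lines of code do a rough check. The sketch below takes ψ(x, t) = f(x − ct) + g(x + ct) and estimates ∂²ψ/∂x² − (1/c²)·∂²ψ/∂t² with central differences. The two shapes f and g, the value of c, the step size and the sample points are arbitrary choices of mine, so it’s an illustration rather than a proof.

```python
import math

C = 2.0  # wave speed in arbitrary units (illustrative; any positive value works)


def f(s):
    """Shape travelling in the positive x-direction (a Gaussian bump)."""
    return math.exp(-s * s)


def g(s):
    """Shape travelling in the negative x-direction (a sine, half amplitude)."""
    return 0.5 * math.sin(s)


def psi(x, t):
    """Candidate solution: the sum of a right-mover and a left-mover."""
    return f(x - C * t) + g(x + C * t)


def second_derivative(func, u, h=1e-3):
    """Central-difference estimate of the second derivative of func at u."""
    return (func(u + h) - 2.0 * func(u) + func(u - h)) / (h * h)


def wave_equation_residual(x, t):
    """Estimate of d2(psi)/dx2 - (1/C^2) * d2(psi)/dt2 at the point (x, t)."""
    d2_dx2 = second_derivative(lambda xx: psi(xx, t), x)
    d2_dt2 = second_derivative(lambda tt: psi(x, tt), t)
    return d2_dx2 - d2_dt2 / C**2


residuals = [abs(wave_equation_residual(x, t))
             for x in (-1.0, 0.3, 2.0) for t in (0.0, 0.4, 1.1)]
print(max(residuals))
```

The residual is not exactly zero because finite differences are only approximations, but it should be tiny compared to the second derivatives themselves, which are of order one here.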

In fact, the functions f(x − ct) and g(x + ct) themselves will also work as possible solutions. So we can drop one or the other, which amounts to saying that our ‘shape’ has to travel in some direction, rather than in both at the same time. 🙂 Indeed, from all of my explanations above, you know what f(x − ct) represents: it’s a wave that travels in the positive x-direction. Now, it may be periodic, but it doesn’t have to be periodic. The f(x − ct) function could represent any constant ‘shape’ that’s traveling in the positive x-direction at speed c. Likewise, the g(x + ct) function could represent any constant ‘shape’ that’s traveling in the negative x-direction at speed c. As for superimposing both…

Well… I suggest you check that post I wrote for my son, Vincent. It’s on the math of waves, but it doesn’t have derivatives and/or differential equations. It just explains how superposition and all that works. It’s not very abstract, as it revolves around a vibrating guitar string. So, if you have trouble with all of the above, you may want to read that first. 🙂 The bottom line is that we can get any wavefunction we want by superimposing simple sinusoidals that are traveling in one or the other direction, and so that’s what the more general solution really says. Full stop. So that’s what we’re doing really: we add very simple waves to get much more complicated waveforms. 🙂
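The guitar string is, in fact, the simplest example of such superposition: two sinusoids of equal amplitude traveling in opposite directions add up to a standing wave, because sin(kx − ωt) + sin(kx + ωt) = 2·sin(kx)·cos(ωt). Here is a small sketch that checks this on a grid of points. The wave number, the frequency and the sample grid are illustrative values I picked, nothing more.

```python
import math

K = 2.0 * math.pi  # wave number (a wavelength of 1 unit; illustrative)
OMEGA = 3.0        # angular frequency (illustrative)


def right_mover(x, t):
    """Sinusoid travelling in the positive x-direction."""
    return math.sin(K * x - OMEGA * t)


def left_mover(x, t):
    """Sinusoid travelling in the negative x-direction."""
    return math.sin(K * x + OMEGA * t)


def superposed(x, t):
    """The sum of the two travelling waves."""
    return right_mover(x, t) + left_mover(x, t)


def standing(x, t):
    """What the sum should be: a fixed shape in x, oscillating in time."""
    return 2.0 * math.sin(K * x) * math.cos(OMEGA * t)


samples = [(0.1 * i, 0.25 * j) for i in range(11) for j in range(5)]
max_gap = max(abs(superposed(x, t) - standing(x, t)) for x, t in samples)

# x = 0.5 is half a wavelength, so it is a node: the string never moves there.
node_values = [superposed(0.5, 0.25 * j) for j in range(5)]

print(max_gap)
print(node_values)
```

Neither of the two component waves stands still, but their sum has fixed nodes: that’s the kind of thing you get ‘for free’ from adding simple solutions.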

Now, I could leave it at this, but then it’s very easy to just go one step further, and that is to assume that E_z and, therefore, B_y are not zero. It’s just a matter of superimposing solutions. Let me just give you the general solution. Just look at it for a while. If you understood all that I’ve said above, 20 seconds or so should be sufficient to say: “Yes, that makes sense. That’s the solution in two dimensions.” At least, I hope so! 🙂

OK. I should really stop now. But… Well… Now that we’ve got a general solution for all plane waves, why not be even bolder and think about what we could possibly say about three-dimensional waves? So then E_x and, therefore, B_x would not necessarily be zero either. After all, light can behave that way. In fact, light is likely to be non-polarized and, hence, E_x and, therefore, B_x are most probably not equal to zero!

Now, you may think the analysis is going to be terribly complicated. And you’re right. It would be if we’d stick to our analysis in terms of x, y and z coordinates. However, it turns out that the analysis in terms of vector equations is actually quite straightforward. I’ll just copy the Master here, so you can see His Greatness. 🙂

But what solution does an equation like (20.27) have? We can appreciate it’s actually three equations, i.e. one for each component, and so… Well… Hmm… What can we say about that? I’ll quote the Master on this too:

“How shall we find the general wave solution? The answer is that all the solutions of the three-dimensional wave equation can be represented as a superposition of the one-dimensional solutions we have already found. We obtained the equation for waves which move in the x-direction by supposing that the field did not depend on y and z. Obviously, there are other solutions in which the fields do not depend on x and z, representing waves going in the y-direction. Then there are solutions which do not depend on x and y, representing waves travelling in the z-direction. Or in general, since we have written our equations in vector form, the three-dimensional wave equation can have solutions which are plane waves moving in any direction at all. Again, since the equations are linear, we may have simultaneously as many plane waves as we wish, travelling in as many different directions. Thus the most general solution of the three-dimensional wave equation is a superposition of all sorts of plane waves moving in all sorts of directions.”

It’s the same thing once more: we add very simple waves to get much more complicated waveforms. 🙂

You must have fallen asleep by now or, else, be watching something else. Feynman must have felt the same. After explaining all of the nitty-gritty above, Feynman wakes up his students. He does so by appealing to their imagination:

“Try to imagine what the electric and magnetic fields look like at present in the space in this lecture room. First of all, there is a steady magnetic field; it comes from the currents in the interior of the earth—that is, the earth’s steady magnetic field. Then there are some irregular, nearly static electric fields produced perhaps by electric charges generated by friction as various people move about in their chairs and rub their coat sleeves against the chair arms. Then there are other magnetic fields produced by oscillating currents in the electrical wiring—fields which vary at a frequency of 60 cycles per second, in synchronism with the generator at Boulder Dam. But more interesting are the electric and magnetic fields varying at much higher frequencies. For instance, as light travels from window to floor and wall to wall, there are little wiggles of the electric and magnetic fields moving along at 186,000 miles per second. Then there are also infrared waves travelling from the warm foreheads to the cold blackboard. And we have forgotten the ultraviolet light, the x-rays, and the radiowaves travelling through the room.

Flying across the room are electromagnetic waves which carry music of a jazz band. There are waves modulated by a series of impulses representing pictures of events going on in other parts of the world, or of imaginary aspirins dissolving in imaginary stomachs. To demonstrate the reality of these waves it is only necessary to turn on electronic equipment that converts these waves into pictures and sounds.

If we go into further detail to analyze even the smallest wiggles, there are tiny electromagnetic waves that have come into the room from enormous distances. There are now tiny oscillations of the electric field, whose crests are separated by a distance of one foot, that have come from millions of miles away, transmitted to the earth from the Mariner II space craft which has just passed Venus. Its signals carry summaries of information it has picked up about the planets (information obtained from electromagnetic waves that travelled from the planet to the space craft).

There are very tiny wiggles of the electric and magnetic fields that are waves which originated billions of light years away—from galaxies in the remotest corners of the universe. That this is true has been found by “filling the room with wires”—by building antennas as large as this room. Such radiowaves have been detected from places in space beyond the range of the greatest optical telescopes. Even they, the optical telescopes, are simply gatherers of electromagnetic waves. What we call the stars are only inferences, inferences drawn from the only physical reality we have yet gotten from them—from a careful study of the unendingly complex undulations of the electric and magnetic fields reaching us on earth.

There is, of course, more: the fields produced by lightning miles away, the fields of the charged cosmic ray particles as they zip through the room, and more, and more. What a complicated thing is the electric field in the space around you! Yet it always satisfies the three-dimensional wave equation.”

So… Well… That’s it for today, folks. 🙂 We have some more gymnastics to do, still… But we’re really there. Or here, I should say: on top of the peak. What a view we have here! Isn’t it beautiful? It took us quite some effort to get on top of this thing, and we’re still trying to catch our breath as we struggle with what we’ve learned so far, but it’s really worthwhile, isn’t it? 🙂

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/
Some content on this page was disabled on June 20, 2020 as a result of a DMCA takedown notice from Michael A. Gottlieb, Rudolf Pfeiffer, and The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

A not so easy piece: introducing the wave equation (and the Schrödinger equation)

Pre-scriptum (dated 26 June 2020): This post did not suffer from the DMCA take-down of some material. It is, therefore, still quite readable—even if my views on these matters have evolved quite a bit as part of my realist interpretation of QM.

Original post:

The title above refers to a previous post: An Easy Piece: Introducing the wave function.

Indeed, I may have been sloppy here and there – I hope not – and so that’s why it’s probably good to clarify that the wave function (usually represented as Ψ – the psi function) and the wave equation (Schrödinger’s equation, for example – but there are other types of wave equations as well) are two related but different concepts: wave equations are differential equations, and wave functions are their solutions.

Indeed, from a mathematical point of view, a differential equation (such as a wave equation) relates a function (such as a wave function) with its derivatives, and its solution is that function or – more generally – the set (or family) of functions that satisfies this equation.

The function can be real-valued or complex-valued, and it can be a function involving only one variable (such as y = y(x), for example) or more (such as u = u(x, t) for example). In the first case, it’s a so-called ordinary differential equation. In the second case, the equation is referred to as a partial differential equation, even if there’s nothing ‘partial’ about it: it’s as ‘complete’ as an ordinary differential equation (the name just refers to the presence of partial derivatives in the equation). Hence, in an ordinary differential equation, we will have terms involving dy/dx and/or d²y/dx², i.e. the first and second derivative of y respectively (and/or higher-order derivatives, depending on the order of the differential equation), while in partial differential equations, we will see terms involving ∂u/∂t and/or ∂²u/∂x² (and/or higher-order partial derivatives), with ∂ replacing d as a symbol for the derivative.

The independent variables could also be complex-valued but, in physics, they will usually be real variables (or scalars as real numbers are also being referred to – as opposed to vectors, which are nothing but two-, three- or more-dimensional numbers really). In physics, the independent variables will usually be x – or let’s use r = (x, y, z) for a change, i.e. the three-dimensional space vector – and the time variable t. An example is that wave function which we introduced in our ‘easy piece’.

Ψ(r, t) = A·e^{i(p·r – Et)/ħ}

[If you read the Easy Piece, then you might object that this is not quite what I wrote there, and you are right: I wrote Ψ(r, t) = A·e^{i[(p/ħ)·r – ωt]}. However, here I am just introducing the other de Broglie relation (i.e. the one relating energy and frequency): E = hf = ħω and, hence, ω = E/ħ. Just re-arrange a bit and you’ll see it’s the same.]
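If you don’t want to do the re-arranging yourself, here is a minimal numerical sketch of it. It evaluates the wave function once with (p·r – Et)/ħ in the exponent and once with (p/ħ)·r – ωt, using ω = E/ħ. The momentum, position, energy and time values are made-up numbers of roughly atomic scale, chosen only so that the phase is of order one.

```python
import cmath

HBAR = 1.054571817e-34  # reduced Planck constant in J*s


def psi_energy_form(p, r, energy, t, amplitude=1.0):
    """A * exp(i * (p.r - E*t) / hbar)."""
    phase = (sum(pi * ri for pi, ri in zip(p, r)) - energy * t) / HBAR
    return amplitude * cmath.exp(1j * phase)


def psi_frequency_form(p, r, omega, t, amplitude=1.0):
    """A * exp(i * ((p/hbar).r - omega*t))."""
    phase = sum((pi / HBAR) * ri for pi, ri in zip(p, r)) - omega * t
    return amplitude * cmath.exp(1j * phase)


# Illustrative (made-up) values of roughly atomic scale.
p = (1.0e-24, 2.0e-24, -0.5e-24)   # momentum components in kg*m/s
r = (1.0e-10, 2.0e-10, 3.0e-10)    # position components in m
energy = 2.0e-19                   # energy in J
t = 1.0e-15                        # time in s

omega = energy / HBAR              # the de Broglie relation E = hbar * omega

psi_a = psi_energy_form(p, r, energy, t)
psi_b = psi_frequency_form(p, r, omega, t)
print(psi_a, psi_b)
```

The two complex numbers should coincide up to rounding, and both have modulus A = 1: only the phase carries the dependence on r and t.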

From a physics point of view, a differential equation represents a system subject to constraints, such as the energy conservation law (the sum of the potential and kinetic energy remains constant), and Newton’s law of course: F = d(mv)/dt. A differential equation will usually also be given with one or more initial conditions, such as the value of the function at point t = 0, i.e. the initial value of the function. To use Wikipedia’s definition: “Differential equations arise whenever a relation involving some continuously varying quantities (modeled by functions) and their rates of change in space and/or time (expressed as derivatives) is known or postulated.”

That sounds a bit more complicated, perhaps, but it means the same: once you have a good mathematical model of a physical problem, you will often end up with a differential equation representing the system you’re looking at, and then you can do all kinds of things, such as analyzing whether or not the actual system is in an equilibrium and, if not, whether it will tend to equilibrium or, if not, what the equilibrium conditions would be. But here I’ll refer to my previous posts on the topic of differential equations, because I don’t want to get into these details – as I don’t need them here.

The one thing I do need to introduce is an operator referred to as the gradient (it’s also known as the del operator, but I don’t like that word because it does not convey what it is). The gradient – denoted by ∇ – is a shorthand for the partial derivatives of our function u or Ψ with respect to space, so we write:

∇ = (∂/∂x, ∂/∂y, ∂/∂z)

You should note that, in physics, we apply the gradient only to the spatial variables, not to time. For the derivative in regard to time, we just write ∂u/∂t or ∂Ψ/∂t.

Of course, an operator means nothing until you apply it to a (real- or complex-valued) function, such as our u(x, t) or our Ψ(r, t):

∇u = ∂u/∂x and ∇Ψ = (∂Ψ/∂x, ∂Ψ/∂y, ∂Ψ/∂z)

As you can see, the gradient operator returns a vector with three components if we apply it to a real- or complex-valued function of r, and so we can do all kinds of funny things with it combining it with the scalar or vector product, or with both. Here I need to remind you that, in a vector space, we can multiply vectors using either (i) the scalar product, aka the dot product (because of the dot in its notation: a•b) or (ii) the vector product, aka the cross product (yes, because of the cross in its notation: a×b).

So we can define a whole range of new operators using the gradient and these two products, such as the divergence and the curl of a vector field. For example, if E is the electric field vector (I am using an italic bold-type E so you should not confuse E with the energy E, which is a scalar quantity), then div E = ∇•E, and curl E = ∇×E. Taking the divergence of a vector field will yield some number at each point (so that’s a scalar), while taking the curl will yield another vector.

I am mentioning these operators because you will often see them. A famous example is the set of equations known as Maxwell’s equations, which integrate all of the laws of electromagnetism and from which we can derive the electromagnetic wave equation:

(1) ∇•E = ρ/ε₀ (Gauss’ law)

(2) ∇×E = –∂B/∂t (Faraday’s law)

(3) ∇•B = 0

(4) c²∇×B = j/ε₀ + ∂E/∂t

I should not explain these but let me just remind you of the essentials:

The first equation (Gauss’ law) can be derived from the equations for Coulomb’s law and the forces acting upon a charge q in an electromagnetic field: F = q(E + v×B), with B the magnetic field vector (F is also referred to as the Lorentz force: it’s the combined force on a charged particle caused by the electric and magnetic fields); v the velocity of the (moving) charge; ρ the charge density (so charge is thought of as being distributed in space, rather than being packed into points, and that’s OK because our scale is not the quantum-mechanical one here); and, finally, ε₀ the electric constant (some 8.854×10⁻¹² farads per meter).
The second equation (Faraday’s law) gives the electric field associated with a changing magnetic field.
The third equation basically states that there is no such thing as a magnetic charge: there are only electric charges.
Finally, in the last equation, we have a vector j representing the current density: indeed, remember that magnetism only appears when (electric) charges are moving, so if there’s an electric current. As for the equation itself, well… That’s a more complicated story so I will leave that for the post scriptum.

We can do many more things: we can also take the curl of the gradient of some scalar, or the divergence of the curl of some vector (both have the interesting property that they are zero), and there are many more possible combinations – some of them useful, others not so useful. However, this is not the place to introduce differential calculus of vector fields (because that’s what it is).
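Just to make those two ‘zero’ properties a bit more tangible, here is a rough numerical sketch. It builds the gradient, divergence and curl out of central differences and then evaluates the curl of a gradient and the divergence of a curl at one point. The scalar function, the vector field, the step size and the sample point are all arbitrary choices of mine.

```python
import math

H = 1e-3  # finite-difference step (an illustrative choice)


def partial(func, point, axis):
    """Central-difference partial derivative of a scalar function at a point."""
    up, down = list(point), list(point)
    up[axis] += H
    down[axis] -= H
    return (func(tuple(up)) - func(tuple(down))) / (2.0 * H)


def component(field, index):
    """Turn one component of a vector field into a scalar function."""
    return lambda point: field(point)[index]


def gradient(scalar):
    """Vector field approximating the gradient of a scalar function."""
    return lambda point: tuple(partial(scalar, point, a) for a in range(3))


def divergence(field, point):
    """Sum of dF_i/dx_i: a single number at each point."""
    return sum(partial(component(field, i), point, i) for i in range(3))


def curl(field):
    """Vector field approximating the curl of a vector field."""
    def curl_at(point):
        def d(i, j):  # dF_i / dx_j
            return partial(component(field, i), point, j)
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return curl_at


def phi(p):
    """An arbitrary scalar function of (x, y, z)."""
    x, y, z = p
    return x * x * y + math.sin(y * z)


def vector_field(p):
    """An arbitrary vector field with a non-zero curl."""
    x, y, z = p
    return (y * z, x * x * z, math.cos(x) * y)


point = (0.4, -0.7, 1.2)
curl_of_grad = curl(gradient(phi))(point)
div_of_curl = divergence(curl(vector_field), point)
plain_curl = curl(vector_field)(point)

print(curl_of_grad)
print(div_of_curl)
print(plain_curl)
```

The first two results should be zero up to rounding noise, while the plain curl of the field is not zero at all, so the vanishing is a property of the combination of operators, not of the particular functions.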

The only other thing I need to mention here is what happens when we apply this gradient operator twice. Then we have a new operator ∇•∇ = ∇², which is referred to as the Laplacian. In fact, when we say ‘apply ∇ twice’, we are actually doing a dot product. Indeed, ∇ returns a vector, and so we are going to multiply this vector once again with a vector using the dot product rule: a•b = ∑a_i·b_i (so we multiply the individual vector components and then add them). In the case of our functions u and Ψ, we get:

∇•(∇u) = ∇•∇u = (∇•∇)u = ∇²u = ∂²u/∂x²

∇•(∇Ψ) = ∇²Ψ = ∂²Ψ/∂x² + ∂²Ψ/∂y² + ∂²Ψ/∂z²

Now, you may wonder what it means to take the derivative (or partial derivative) of a complex-valued function (which is what we are doing in the case of Ψ) but don’t worry about that: a complex-valued function of one or more real variables, such as our Ψ(x, t), can be decomposed as Ψ(x, t) = Ψ_Re(x, t) + iΨ_Im(x, t), with Ψ_Re and Ψ_Im two real-valued functions representing the real and imaginary part of Ψ(x, t) respectively. In addition, the rules for differentiating complex-valued functions are, to a large extent, the same as for real-valued functions. For example, if z is a complex number, then de^z/dz = e^z and, hence, using this and other very straightforward rules, we can indeed find the partial derivatives of a function such as Ψ(r, t) = A·e^{i(p·r – Et)/ħ} with respect to all the (real-valued) variables in the argument.
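To see that there is nothing mysterious about it, here is a small sketch that applies the Laplacian to the plane wave numerically. Differentiating the exponential twice with respect to x, y and z should just pull out a factor −(p_x² + p_y² + p_z²)/ħ², so we expect ∇²Ψ = −(p²/ħ²)·Ψ. I am using natural units (ħ = 1) and made-up values for p, E, r and t, purely to keep the numbers well-scaled for finite differences.

```python
import cmath

HBAR = 1.0           # natural units: an assumption to keep the numbers well-scaled
P = (1.0, -2.0, 0.5)  # momentum components (illustrative)
E = 3.0              # energy (illustrative)


def psi(r, t, amplitude=1.0):
    """Plane wave A * exp(i * (p.r - E*t) / hbar)."""
    phase = (sum(pi * ri for pi, ri in zip(P, r)) - E * t) / HBAR
    return amplitude * cmath.exp(1j * phase)


def laplacian(func, r, t, h=1e-3):
    """Sum of central-difference second derivatives with respect to x, y, z."""
    total = 0j
    for axis in range(3):
        up, down = list(r), list(r)
        up[axis] += h
        down[axis] -= h
        total += (func(tuple(up), t) - 2.0 * func(r, t)
                  + func(tuple(down), t)) / (h * h)
    return total


r0, t0 = (0.3, 0.1, -0.4), 0.2
numeric = laplacian(psi, r0, t0)
p_squared = sum(c * c for c in P)
expected = -(p_squared / HBAR**2) * psi(r0, t0)

print(numeric)
print(expected)
```

Both the real and the imaginary part match, which is just another way of saying that we can differentiate Ψ_Re and Ψ_Im separately and put the results back together.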

The electromagnetic wave equation

OK. That’s enough math now. We are ready now to look at – and to understand – a real wave equation – I mean one that actually represents something in physics. Let’s take Maxwell’s equations as a start. To make it easy – and also to ensure that you have easy access to the full derivation – we’ll take the so-called Heaviside form of these equations:

This Heaviside form assumes a charge-free vacuum space, so there are no external forces acting upon our electromagnetic wave. There are also no other complications such as electric currents. Also, the c² (i.e. the square of the speed of light) is written here as c² = 1/με, with μ and ε the so-called permeability (μ) and permittivity (ε) respectively (c₀, μ₀ and ε₀ are the values in a vacuum space: indeed, light travels slower elsewhere (e.g. in glass) – if at all).

Now, these four equations can be replaced by just two, and it’s these two equations that are referred to as the electromagnetic wave equation(s):

The derivation is not difficult. In fact, it’s much easier than the derivation for the Schrödinger equation which I will present in a moment. But, even if it is very short, I will just refer to Wikipedia in case you would be interested in the details (see the article on the electromagnetic wave equation). The point here is just to illustrate what is being done with these wave equations and why – not so much how. Indeed, you may wonder what we have gained with this ‘reduction’.

The answer to this very legitimate question is easy: the two equations above are second-order partial differential equations which are relatively easy to solve. In other words, we can find a general solution, i.e. a set or family of functions that satisfy the equation and, hence, can represent the wave itself. Why a set of functions? If it’s a specific wave, then there should only be one wave function, right? Right. But to narrow our general solution down to a specific solution, we will need extra information, which is referred to as initial conditions, boundary conditions or, in general, constraints. [And if these constraints are not sufficiently specific, then we may still end up with a whole bunch of possibilities, even if they narrowed down the choice.]

Let’s give an example by re-writing the above wave equation using our function u(x, t), so, to simplify the analysis, we’re looking at a plane wave traveling in one dimension only:

There are many functional forms for u that satisfy this equation. One of them is the following:

This resembles the one I introduced when presenting the de Broglie equations, except that – this time around – we are talking a real electromagnetic wave, not some probability amplitude. Another difference is that we allow a composite wave with two components: one traveling in the positive x-direction, and one traveling in the negative x-direction. Now, if you read the post in which I introduced the de Broglie wave, you will remember that these A·e^{i(kx–ωt)} or B·e^{–i(kx+ωt)} waves give strange probabilities. However, because we are not looking at some probability amplitude here – so it’s not a de Broglie wave but a real wave (so we use complex number notation only because it’s convenient but, in practice, we’re only considering the real part) – this functional form is quite OK.

That being said, the following functional form, representing a wave packet (aka a wave train), is also a solution (or, better, a set of solutions):

Huh? Well… Yes. If you really can’t follow here, I can only refer you to my post on Fourier analysis and Fourier transforms: I cannot reproduce that one here because that would make this post totally unreadable. We have a wave packet here, and so that’s the sum of an infinite number of component waves that interfere constructively in the region of the envelope (so that’s the location of the packet) and destructively outside. The integral is just the continuum limit of a summation of n such waves. So this integral will yield a function u with x and t as independent variables… If we know A(k) that is. Now that’s the beauty of these Fourier integrals (because that’s what this integral is).
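
To make the ‘sum of n component waves’ idea somewhat more tangible, here is a minimal sketch that just adds up a finite number of plane waves. The Gaussian weights A(k), the central wave number K0 = 5 and the 1/√(2π) normalization are assumptions I picked for illustration, and I take ω = c·k (so no dispersion).

```python
import cmath
import math

K0 = 5.0  # hypothetical central wave number

def amplitude(k):
    """Assumed Gaussian weight for the component with wave number k."""
    return math.exp(-((k - K0) ** 2) / 4) / math.sqrt(2)

def packet(x, t=0.0, c=1.0, n=481, half_width=12.0):
    """Sum n plane waves with wave numbers spread around K0 (omega = c*k)."""
    dk = 2 * half_width / (n - 1)
    total = 0j
    for j in range(n):
        k = K0 - half_width + j * dk
        total += amplitude(k) * cmath.exp(1j * (k * x - c * k * t)) * dk
    return total / math.sqrt(2 * math.pi)

for x in (0.0, 1.0, 3.0):
    print(x, abs(packet(x)))
```

The magnitude is largest at x = 0 and drops off very quickly as we move away from it: the component waves interfere constructively near the centre of the packet and destructively elsewhere. Evaluating packet(x, t) for some t > 0 shows the same shape, shifted by c·t.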

Indeed, in my post on Fourier transforms I also explained how these amplitudes A(k) in the equation above can be expressed as a function of u(x, t) through the inverse Fourier transform. In fact, I actually presented the Fourier transform pair Ψ(x) and Φ(p) in that post, but the logic is the same – except that we’re inserting the time variable t once again (but with its value fixed at t = 0):

$A(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty} u(x, 0)\,e^{-ikx}\,dx$

OK, you’ll say, but where is all of this going? Be patient. We’re almost done. Let’s now introduce a specific initial condition. Let’s assume that we have the following functional form for u at time t = 0:

$u(x, 0) = e^{-x^2 + ik_0x}$

You’ll wonder where this comes from. Well… I don’t know. It’s just an example from Wikipedia. It’s random but it fits the bill: it’s a localized wave (so that’s a wave packet) because of the very particular form of the phase (θ = –x² + ik₀x). The point to note is that we can calculate A(k) when inserting this initial condition in the equation above, and then – finally, you’ll say – we also get a specific solution for our u(x, t) function by inserting the value for A(k) in our general solution. In short, we get:

$A(k) = \frac{1}{\sqrt{2}}\,e^{-\frac{(k-k_0)^2}{4}}$

and

$u(x, t) = e^{-(x-ct)^2 + ik_0(x-ct)} = e^{-(x-ct)^2}\left[\cos\left(2\pi\frac{x-ct}{\lambda}\right) + i\sin\left(2\pi\frac{x-ct}{\lambda}\right)\right]$

As mentioned above, we are actually only interested in the real part of this equation (so that’s the e with the exponent factor (note there is no i in it, so it’s just some real number) multiplied with the cosine term).
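
For those who like to see numbers: the calculation of A(k) from the initial condition can be checked with a crude numerical integration. This is only a sketch. It assumes the initial condition u(x, 0) = exp(–x² + ik₀x), the 1/√(2π) normalization for the Fourier integral, and a made-up value k₀ = 5; the closed form it compares against is the standard Gaussian result for that integral.

```python
import cmath
import math

K0 = 5.0  # hypothetical value for k0

def u0(x):
    """Assumed initial condition u(x, 0) = exp(-x^2 + i*k0*x)."""
    return cmath.exp(-x * x + 1j * K0 * x)

def A_numeric(k, half_width=8.0, n=1601):
    """Riemann-sum estimate of (1/sqrt(2*pi)) * integral of u(x, 0) * exp(-i*k*x) dx."""
    dx = 2 * half_width / (n - 1)
    total = 0j
    for j in range(n):
        x = -half_width + j * dx
        total += u0(x) * cmath.exp(-1j * k * x) * dx
    return total / math.sqrt(2 * math.pi)

def A_closed(k):
    """Closed form of the same integral: a Gaussian centred on k0."""
    return math.exp(-((k - K0) ** 2) / 4) / math.sqrt(2)

def u_real(x, t, c=1.0):
    """Real part of the travelling packet: Gaussian envelope times a cosine."""
    s = x - c * t
    return math.exp(-s * s) * math.cos(K0 * s)

for k in (3.0, 5.0, 6.5):
    print(k, A_numeric(k), A_closed(k))
```

The two columns agree to many decimal places, and u_real shows what we are ‘actually interested in’: an envelope that moves along at speed c without changing shape.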

However, the example above shows how easy it is to extend the analysis to a complex-valued wave function, i.e. a wave function describing a probability amplitude. We will actually do that now for Schrödinger’s equation. [Note that the example comes from Wikipedia’s article on wave packets, and so there is a nice animation which shows how this wave packet (be it the real or imaginary part of it) travels through space. Do watch it!]

Schrödinger’s equation

Let me just write it down:

$i\frac{\partial u}{\partial t} = -\frac{1}{2}\nabla^2 u$

That’s it. This is the Schrödinger equation – in a somewhat simplified form but it’s OK.

[…] You’ll find that equation above either very simple or, else, very difficult depending on whether or not you understood most or nothing at all of what I wrote above it. If you understood something, then it should be fairly simple, because it hardly differs from the other wave equation.

Indeed, we have that imaginary unit (i) in front of the left term, but then you should not panic over that: when everything is said and done, we are working here with the derivative (or partial derivative) of a complex-valued function, and so it should not surprise us that we have an i here and there. It’s nothing special. In fact, we had them in the equation above too, but they just weren’t explicit. The second difference with the electromagnetic wave equation is that we have a first-order derivative with respect to time only (in the electromagnetic wave equation we had ∂²u/∂t², so that’s a second-order derivative). Finally, we have a -1/2 factor in front of the right-hand term, instead of c². OK, so what? It’s a different thing – but that should not surprise us: when everything is said and done, it is a different wave equation because it describes something else (not an electromagnetic wave but a quantum-mechanical system).

To understand why it’s different, I’d need to give you the equivalent of Maxwell’s set of equations for quantum mechanics, and then show how this wave equation is derived from them. I could do that. The derivation is somewhat lengthier than for our electromagnetic wave equation but not all that much. The problem is that it involves some new concepts which we haven’t introduced as yet – mainly some new operators. But then we have introduced a lot of new operators already (such as the gradient and the curl and the divergence) so you might be ready for this. Well… Maybe. The treatment is a bit lengthy, and so I’d rather do it in a separate post. Why? […] OK. Let me say a few things about it then. Here we go:

These new operators involve matrix algebra. Fine, you’ll say. Let’s get on with it. Well… It’s matrix algebra with matrices with complex elements, so if we write an n×m matrix A as A = (a_ij), then the elements a_ij (i = 1, 2,… n and j = 1, 2,… m) will be complex numbers.
That allows us to define Hermitian matrices: a Hermitian matrix is a square matrix A which is the same as the complex conjugate of its transpose.
We can use such matrices as operators indeed: transformations acting on a column vector X to produce another column vector AX.
Now, you’ll remember – from your course on matrix algebra with real (as opposed to complex) matrices, I hope – that we have this very particular matrix equation AX = λX which has non-trivial solutions (i.e. solutions X ≠ 0) if and only if the determinant of A-λI is equal to zero. This condition (det(A-λI) = 0) is referred to as the characteristic equation.
This characteristic equation is a polynomial of degree n in λ and its roots are called eigenvalues or characteristic values of the matrix A. The non-trivial solutions X ≠ 0 corresponding to each eigenvalue are called eigenvectors or characteristic vectors.
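
A small sketch may help here. It takes a 2×2 matrix (the entries are made up for illustration), checks that it is Hermitian, and then solves the characteristic equation det(A – λI) = 0, which for a 2×2 matrix is just the quadratic λ² – (trace)·λ + det = 0.

```python
import cmath

def is_hermitian(m):
    """True if the square matrix m equals the complex conjugate of its transpose."""
    n = len(m)
    return all(m[i][j] == m[j][i].conjugate() for i in range(n) for j in range(n))

def eigenvalues_2x2(m):
    """Roots of the characteristic equation det(A - lambda*I) = 0 for a 2x2 matrix."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

# Hypothetical example: real diagonal, off-diagonal elements are complex conjugates.
A = [[2 + 0j, 1 - 1j],
     [1 + 1j, 3 + 0j]]

print(is_hermitian(A))
print(eigenvalues_2x2(A))
```

For this matrix the trace is 5 and the determinant is 4, so the eigenvalues are 4 and 1. Both are real numbers, even though the matrix has complex elements: that is the property of Hermitian matrices that makes them useful as operators for physical quantities such as energy.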

Now – just in case you’re still with me – it’s quite simple: in quantum mechanics, we have the so-called Hamiltonian operator. The Hamiltonian in classical mechanics represents the total energy of the system: H = T + V (total energy H = kinetic energy T + potential energy V). Here we have got something similar but different. 🙂 The Hamiltonian operator is written as H-hat, i.e. an H with an accent circonflexe (as they say in French). Now, we need to let this Hamiltonian operator act on the wave function Ψ and if the result is proportional to the same wave function Ψ, then Ψ is a so-called stationary state, and the proportionality constant will be equal to the energy E of the state Ψ. These stationary states correspond to standing waves, or ‘orbitals’, such as in atomic orbitals or molecular orbitals. So we have:

$E\Psi=\hat H \Psi$

I am sure you are no longer there but, in fact, that’s it. We’re done with the derivation. The equation above is the so-called time-independent Schrödinger equation. It’s called that not because the wave function is time-independent (for a stationary state only its spatial part is: the full wave function still carries a time-dependent phase factor), but because the Hamiltonian operator is time-independent: that obviously makes sense because stationary states are associated with specific energy levels indeed. However, if we do allow the energy level to vary in time (which we should do – if only because of the uncertainty principle: there is no such thing as a fixed energy level in quantum mechanics), then we cannot use some constant for E, but we need a so-called energy operator. Fortunately, this energy operator has a remarkably simple functional form:

$\hat{E} \Psi = i\hbar\dfrac{\partial}{\partial t}\Psi = E\Psi$

Now if we plug that in the equation above, we get our time-dependent Schrödinger equation:

$i \hbar \frac{\partial}{\partial t}\Psi = \hat H \Psi$

OK. You probably did not understand one iota of this but, even then, you will object that this does not resemble the equation I wrote at the very beginning: i(∂u/∂t) = (-1/2)∇²u.

You’re right, but we only need one more step for that. If we leave out potential energy (so we assume a particle moving in free space), then the Hamiltonian can be written as:

$\hat{H} = -\frac{\hbar^2}{2m}\nabla^2$

You’ll ask me how this is done but I will be short on that: the relationship between energy and momentum is being used here (and so that’s where the 2m factor in the denominator comes from). However, I won’t say more about it because this post would become way too lengthy if I included each and every derivation and, remember, I just want to get to the result because the derivations here are not the point: I want you to understand the functional form of the wave equation only. So, using the above identity and, OK, let’s be somewhat more complete and include potential energy once again, we can write the time-dependent wave equation as:

$i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{r},t) = -\frac{\hbar^2}{2m}\nabla^2\Psi(\mathbf{r},t) + V(\mathbf{r},t)\Psi(\mathbf{r},t)$

Now, how is the equation above related to i(∂u/∂t) = (-1/2)∇²u? It’s a very simplified version of it: potential energy is, once again, assumed to be not relevant (so we’re talking a free particle again, with no external forces acting on it) but the real simplification is that we give m and ħ the value 1, so m = ħ = 1. Why?

Well… My initial idea was to do something similar to what I did above and, hence, actually use a specific example with an actual functional form, just like we did for the real-valued u(x, t) function. However, when I look at how long this post has become already, I realize I should not do that. In fact, I would just copy an example from somewhere else – probably Wikipedia once again, if only because their examples are usually nicely illustrated with graphs (and often animated graphs). So let me just refer you here to the other example given in the Wikipedia article on wave packets: that example uses that simplified i(∂u/∂t) = (-1/2)∇²u equation indeed. It actually uses the same initial condition:

$u(x, 0) = e^{-x^2 + ik_0x}$

However, because the wave equation is different, the wave packet behaves differently. It’s a so-called dispersive wave packet: it delocalizes. Its width increases over time and so, after a while, it just vanishes because it diffuses all over space. So there’s a solution to the wave equation, given this initial condition, but it’s just not stable – as a description of some particle that is (from a mathematical point of view – or even a physical point of view – there is no issue).
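
The spreading can be made visible with a rough numerical sketch. It builds the packet as a sum of plane waves, just like before, but now with ω = k²/2 for each component, which is what the simplified equation (m = ħ = 1) requires. The initial condition exp(–x² + ik₀x), its Gaussian Fourier amplitudes, and the value k₀ = 2 are assumptions for the purpose of illustration. The function then measures the variance of the position distribution |ψ|² at a few points in time.

```python
import cmath
import math

K0 = 2.0  # hypothetical central wave number

def A(k):
    """Fourier amplitudes of the assumed initial condition exp(-x^2 + i*K0*x)."""
    return math.exp(-((k - K0) ** 2) / 4) / math.sqrt(2)

def psi(x, t, half_width=10.0, n=201):
    """Superpose plane waves exp(i(kx - omega*t)) with omega = k^2/2 (m = hbar = 1)."""
    dk = 2 * half_width / (n - 1)
    total = 0j
    for j in range(n):
        k = K0 - half_width + j * dk
        total += A(k) * cmath.exp(1j * (k * x - 0.5 * k * k * t)) * dk
    return total / math.sqrt(2 * math.pi)

def variance_of_density(t, x_min=-20.0, x_max=30.0, n=251):
    """Variance of the position distribution |psi|^2 at time t."""
    dx = (x_max - x_min) / (n - 1)
    xs = [x_min + j * dx for j in range(n)]
    weights = [abs(psi(x, t)) ** 2 for x in xs]
    norm = sum(weights)
    mean = sum(x * w for x, w in zip(xs, weights)) / norm
    return sum((x - mean) ** 2 * w for x, w in zip(xs, weights)) / norm

for t in (0.0, 1.0, 2.0):
    print(t, variance_of_density(t))
```

For this particular Gaussian the variance grows as 1/4 + t², so the packet is already about four times as wide (in terms of standard deviation) at t = 2 as it was at t = 0. Compare that with the electromagnetic example, where the envelope just travels along without changing shape.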

In any case, this probably all sounds like Chinese – or Greek if you understand Chinese :-). I actually haven’t worked with these Hermitian operators yet, and so it’s pretty shaky territory for me myself. However, I felt like I had picked up enough math and physics on this long and winding Road to Reality (I don’t think I am even halfway) to give it a try. I hope I succeeded in passing the message, which I’ll summarize as follows:

Schrödinger’s equation is just like any other differential equation used in physics, in the sense that it represents a system subject to constraints, such as the relationship between energy and momentum.
It has a general solution, i.e. a whole family of functions that satisfy it. In other words, the wave function – which describes a probability amplitude as a function of space and time – can take many forms, and a specific solution will depend on the initial conditions.
The solution(s) can represent stationary states, but not necessarily so: a wave (or a wave packet) can be non-dispersive or dispersive. However, when we plug the wave function into the wave equation, it will satisfy that equation.
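
The first and last point can be illustrated with a quick check on a single plane wave exp(i(kx – ωt)). In the simplified equation i(∂u/∂t) = (-1/2)∇²u (so m = ħ = 1), the plane wave is a solution only if ω = k²/2, which is just the energy-momentum relation E = p²/2m in these units. The sketch below uses finite differences, and the values for k, x and t are arbitrary.

```python
import cmath

def plane_wave(x, t, k, omega):
    """A single plane wave exp(i(kx - omega*t))."""
    return cmath.exp(1j * (k * x - omega * t))

def schrodinger_residual(k, omega, x=0.3, t=0.4, h=1e-3):
    """i*du/dt + (1/2)*d2u/dx2 for a plane wave, via central differences (m = hbar = 1)."""
    du_dt = (plane_wave(x, t + h, k, omega) - plane_wave(x, t - h, k, omega)) / (2 * h)
    d2u_dx2 = (plane_wave(x + h, t, k, omega) - 2 * plane_wave(x, t, k, omega)
               + plane_wave(x - h, t, k, omega)) / h**2
    return 1j * du_dt + 0.5 * d2u_dx2

k = 3.0
print(abs(schrodinger_residual(k, omega=k**2 / 2)))  # close to zero: the constraint is met
print(abs(schrodinger_residual(k, omega=k)))         # clearly non-zero: wrong omega for this k
```

So the wave equation does not tell us which k to take (that depends on the initial conditions), but it does tie the frequency to the wave number.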

That’s neither spectacular nor difficult, is it? But, perhaps, it helps you to ‘understand’ wave equations, including the Schrödinger equation. But what is understanding? Dirac once famously said: “I consider that I understand an equation when I can predict the properties of its solutions, without actually solving it.”

Hmm… I am not quite there yet, but I am sure some more practice with it will help. 🙂

Post scriptum: On Maxwell’s equations

First, we should say something more about these two other operators which I introduced above: the divergence and the curl. First on the divergence.

The divergence of a field vector E (or B) at some point r represents the so-called flux of E, i.e. the ‘flow’ of E per unit volume. So flux and divergence both deal with the ‘flow’ of electric field lines away from (positive) charges. [The ‘away from’ is from positive charges indeed – as per the convention: Maxwell himself used the term ‘convergence’ to describe flow towards negative charges, but so his ‘convention’ did not survive. Too bad, because I think convergence would be much easier to remember.]

So if we write that ∇•E = ρ/ε₀, then it means that we have some constant flux of E because of some (fixed) distribution of charges.

Now, we already mentioned that the ∇•B = 0 equation in Maxwell’s set means that there is no such thing as a ‘magnetic’ charge: indeed, ∇•B = 0 means there is no net magnetic flux out of any volume. But, of course, magnetic fields do exist, don’t they? They do. A current in a wire, for example, i.e. a bunch of steadily moving electric charges, will induce a magnetic field according to Ampère’s law, which is part of equation (4) in Maxwell’s set: c²∇×B = j/ε₀, with j representing the current density and ε₀ the electric constant.

Now, at this point, we have this curl: ∇×B. Just like divergence (or convergence as Maxwell called it – but then with the sign reversed), curl also means something in physics: it’s the amount of ‘rotation’, or ‘circulation’ as Feynman calls it, around some loop.

So, to summarize the above, we have (1) flux (divergence) and (2) circulation (curl) and, of course, the two must be related. And, while we do not have any magnetic charges and, hence, no flux for B, the current in that wire will cause some circulation of B, and so we do have a magnetic field. However, that magnetic field will be static, i.e. it will not change. Hence, the time derivative ∂B/∂t will be zero and, hence, from equation (2) we get that ∇×E = 0, so our electric field will be static too. The time derivative ∂E/∂t which appears in equation (4) also disappears and we just have c²∇×B = j/ε₀. This situation – of a constant magnetic and electric field – is described as electrostatics and magnetostatics respectively. It implies a neat separation of the four equations, and it makes magnetism and electricity appear as distinct phenomena. Indeed, as long as charges and currents are static, we have:

[I] Electrostatics: (1) ∇•E = ρ/ε₀ and (2) ∇×E = 0

[II] Magnetostatics: (3) c²∇×B = j/ε₀ and (4) ∇•B = 0

The first two equations describe a vector field with zero curl and a given divergence (i.e. the electric field) while the third and fourth equations describe a seemingly separate vector field with a given curl but zero divergence. Now, I am not writing this post scriptum to reproduce Feynman’s Lectures on Electromagnetism, and so I won’t say much more about this. I just want to note two points:

1. The first point to note is that factor c² in the c²∇×B = j/ε₀ equation. That’s something which you don’t have in the ∇•E = ρ/ε₀ equation. Of course, you’ll say: So what? Well… It’s weird. And if you bring it to the other side of the equation, it becomes clear that you need an awful lot of current for a tiny little bit of magnetic circulation (because you’re dividing by c², so that’s a factor 9 with 16 zeroes after it (9×10¹⁶) in SI units: an awfully big number in other words). Truth be told, it reveals something very deep. Hmm? Take a wild guess. […] Relativity perhaps? Well… Yes!

It’s obvious that we buried v somewhere in this equation, the velocity of the moving charges. But then it’s part of j of course: the rate at which charge flows through a unit area per second. But – Hey! – velocity as compared to what? What’s the frame of reference? The frame of reference is us obviously or – somewhat less subjective – the stationary charges determining the electric field according to equation (1) in the set above: ∇•E = ρ/ε₀. But so here we can ask the same question: stationary in what reference frame? As compared to the moving charges? Hmm… But so how does it work with relativity? I won’t copy Feynman’s 13th Lecture here, but so, in that lecture, he analyzes what happens to the electric and magnetic force when we look at the scene from another coordinate system – let’s say one that moves parallel to the wire at the same speed as the moving electrons, so – because of our new reference frame – the ‘moving electrons’ now appear to have no speed at all but, of course, our stationary charges will now seem to move.

What Feynman finds – and his calculations are very easy and straightforward – is that, while we will obviously insert different input values into Maxwell’s set of equations and, hence, get different values for the E and B fields, the actual physical effect – i.e. the final Lorentz force on a (charged) particle – will be the same. To be very specific, in a coordinate system at rest with respect to the wire (so we see charges move in the wire), we find a ‘magnetic’ force indeed, but in a coordinate system moving at the same speed of those charges, we will find an ‘electric’ force only. And from yet another reference frame, we will find a mixture of E and B fields. However, the physical result is the same: there is only one combined force in the end – the Lorentz force F = q(E + v×B) – and it’s always the same, regardless of the reference frame (inertial or moving at whatever speed – relativistic (i.e. close to c) or not).

In other words, Maxwell’s description of electromagnetism is invariant or, to say exactly the same in yet other words, electricity and magnetism taken together are consistent with relativity: they are part of one physical phenomenon: the electromagnetic interaction between (charged) particles. So electric and magnetic fields appear in different ‘mixtures’ if we change our frame of reference, and so that’s why magnetism is often described as a ‘relativistic’ effect – although that’s not very accurate. However, it does explain that c² factor in the equation for the curl of B. [How exactly? Well… If you’re smart enough to ask that kind of question, you will be smart enough to find the derivation on the Web. :-)]

Note: Don’t think we’re talking astronomical speeds here when comparing the two reference frames. It would also work for astronomical speeds but, in this case, we are talking about the speed of the electrons moving through a wire. Now, the so-called drift velocity of electrons – which is the one we have to use here – in a copper wire of radius 1 mm carrying a steady current of 3 amps is only of the order of a meter per hour! So the relativistic effect is tiny – but still measurable!
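
If you want to check the order of magnitude yourself, the usual formula for the drift velocity is v = I/(n·e·A), with n the density of free electrons, e the electron charge and A the cross-section of the wire. The sketch below assumes a textbook value of about 8.5×10²⁸ free electrons per cubic meter for copper, which is not a number from this post.

```python
import math

ELECTRON_CHARGE = 1.602e-19        # coulomb
COPPER_ELECTRON_DENSITY = 8.5e28   # free electrons per cubic meter (assumed textbook value)

def drift_velocity(current, radius, density=COPPER_ELECTRON_DENSITY):
    """Drift velocity v = I / (n * e * A) for a wire with a circular cross-section."""
    area = math.pi * radius ** 2
    return current / (density * ELECTRON_CHARGE * area)

v = drift_velocity(current=3.0, radius=1e-3)
print(v, "m/s")
print(v * 3600, "m per hour")
```

With these inputs the result comes out at a fraction of a meter per hour, so the same order of magnitude as the figure quoted above. A thinner wire gives a higher drift velocity, because the same current has to pass through a smaller cross-section.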

2. The second thing I want to note is that Maxwell’s set of equations with non-zero time derivatives for E and B clearly shows that it’s changing electric and magnetic fields that sort of create each other, and it’s this that’s behind electromagnetic waves moving through space without losing energy. They just travel on and on. The math behind this is beautiful (and the animations in the related Wikipedia articles are equally beautiful – and probably easier to understand than the equations), but that’s stuff for another post. As the electric field changes, it induces a magnetic field, which then induces a new electric field, etc., allowing the wave to propagate itself through space. I should also note here that the energy is in the field and so, when electromagnetic waves, such as light or radio waves, travel through space, they carry their energy with them.

Let me be fully complete here, and note that there’s energy in electrostatic fields as well, and the formula for it is remarkably beautiful. The total (electrostatic) energy U in an electrostatic field generated by charges located within some finite distance is equal to:

$U = \frac{1}{2}\int \rho\,\Phi\,dV$

This equation introduces the electrostatic potential. This is a scalar field Φ from which we can derive the electric field vector just by applying the gradient operator. In fact, all curl-free fields (such as the electric field in this case) can be written as the gradient of some scalar field. That’s a universal truth. See how beautiful math is? 🙂
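
Going in the easy direction (a field that is the gradient of a scalar field has zero curl) is something we can check numerically. The sketch below is just an illustration: the potential is that of a point charge at the origin in arbitrary units (an assumption on my part), the field is taken as minus its gradient, and both the gradient and the curl are estimated with central differences.

```python
import math

def phi(p):
    """Hypothetical scalar potential: a point charge at the origin, arbitrary units."""
    x, y, z = p
    return 1.0 / math.sqrt(x * x + y * y + z * z)

def shifted(p, axis, h):
    """Copy of point p moved by h along the given axis (0 = x, 1 = y, 2 = z)."""
    q = list(p)
    q[axis] += h
    return tuple(q)

def gradient(f, p, h=1e-4):
    """Central-difference estimate of the gradient of the scalar function f at p."""
    return tuple((f(shifted(p, a, h)) - f(shifted(p, a, -h))) / (2 * h) for a in range(3))

def electric_field(p):
    """Electric field as minus the gradient of the potential."""
    return tuple(-g for g in gradient(phi, p))

def curl(field, p, h=1e-4):
    """Central-difference estimate of the curl of a vector field at p."""
    def d(component, axis):
        return (field(shifted(p, axis, h))[component]
                - field(shifted(p, axis, -h))[component]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

point = (1.0, 2.0, -0.5)
print(electric_field(point))
print(curl(electric_field, point))
```

The field points away from the origin and falls off as 1/r², while all three components of the curl are zero up to rounding errors.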

An easy piece: introducing quantum mechanics and the wave function

Pre-scriptum (dated 26 June 2020): A quick glance at this piece – so many years after I wrote it – tells me it is basically OK. It is quite obvious, however, that I have come a very long way in terms of interpreting the math. Still, I would recommend you go through the piece so as to get the basic math, and then you may or may not be ready for the full development of my realist or classical interpretation of QM. My manuscript may also be a fun read for you.

Original post:

After all those boring pieces on math, it is about time I got back to physics. Indeed, what’s all that stuff on differential equations and complex numbers good for? This blog was supposed to be a journey into physics, wasn’t it? Yes. But wave functions – functions describing physical waves (in classical mechanics) or probability amplitudes (in quantum mechanics) – are the solution to some differential equation, and they will usually involve complex-number notation. However, I agree we have had enough of that now. Let’s see how it works. By the way, the title of this post – An Easy Piece – is an obvious reference to (some of) Feynman’s 1965 Lectures on Physics, some of which were re-packaged in 1994 (six years after his death that is) in ‘Six Easy Pieces’ indeed – but, IMHO, it makes more sense to read all of them as part of the whole series.

Let’s first look at one of the most used mathematical shapes: the sinusoidal wave. The illustration below shows the basic concepts: we have a wave here – some kind of cyclic thing – with a wavelength λ, an amplitude (or height) of (maximum) A₀, and a so-called phase shift equal to φ. The Wikipedia definition of a wave is the following: “a wave is a disturbance or oscillation that travels through space and matter, accompanied by a transfer of energy.” Indeed, a wave transports energy as it travels (oh – I forgot to mention the speed or velocity of a wave (v) as an important characteristic of a wave), and the energy it carries is directly proportional to the square of the amplitude of the wave: E ∝ A² (this is true not only for waves like water waves, but also for electromagnetic waves, like light).

Let’s now look at how these variables get into the argument – literally: into the argument of the wave function. Let’s start with that phase shift. The phase shift is usually defined referring to some other wave or reference point (in this case the origin of the x and y axis). Indeed, the amplitude – or ‘height’ if you want (think of a water wave, or the strength of the electric field) – of the wave above depends on (1) the time t (not shown above) and (2) the location (x), but so we will need to have this phase shift φ in the argument of the wave function because at x = 0 we do not have a zero height for the wave. So, as we can see, we can shift the x-axis left or right with this φ. OK. That’s simple enough. Let’s look at the other independent variables now: time and position.

The height (or amplitude) of the wave will obviously vary both in time as well as in space. On this graph, we fixed time (t = 0) – and so it does not appear as a variable on the graph – and show how the amplitude y = A varies in space (i.e. along the x-axis). We could also have looked at one location only (x = 0 or x₁ or whatever other location) and shown how the amplitude varies over time at that location only. The graph would be very similar, except that we would have a ‘time distance’ between two crests (or between two troughs or between any other two points separated by a full cycle of the wave) instead of the wavelength λ (i.e. a distance in space). This ‘time distance’ is the time needed to complete one cycle and is referred to as the period of the wave (usually denoted by the symbol T or T₀ – in line with the notation for the maximum amplitude A₀). In other words, we will also see time (t) as well as location (x) in the argument of this cosine or sine wave function. By the way, it is worth noting that it does not matter if we use a sine or cosine function because we can go from one to the other using the basic trigonometric identities cos θ = sin(π/2 – θ) and sin θ = cos(π/2 – θ). So all waves of the shape above are referred to as sinusoidal waves even if, in most cases, the convention is to actually use the cosine function to represent them.

So we will have x, t and φ in the argument of the wave function. Hence, we can write A = A(x, t, φ) = cos(x + t + φ) and there we are, right? Well… No. We’re adding very different units here: time is measured in seconds, distance in meters, and the phase shift is measured in radians (i.e. the unit of choice for angles). So we can’t just add them up. The argument of a trigonometric function (like this cosine function) is an angle and, hence, we need to get everything in radians – because that’s the unit we use to measure angles. So how do we do that? Let’s do it step by step.

First, it is worth noting that waves are usually caused by something. For example, electromagnetic waves are caused by an oscillating point charge somewhere, and radiate out from there. Physical waves – like water waves, or an oscillating string – usually also have some origin. In fact, we can look at a wave as a way of transmitting energy originating elsewhere. In the case at hand here – i.e. the nice regular sinusoidal wave illustrated above – it is obvious that the amplitude at some time t = t₁ at some point x = x₁ will be the same as the amplitude of that wave at point x = 0 some time ago. How much time ago? Well… The time (t₀) that was needed for that wave to travel from point x = 0 to point x = x₁ is easy to calculate: indeed, if the wave originated at t = 0 and x = 0, then x₁ (i.e. the distance traveled by the wave) will be equal to its velocity (v) multiplied by t₁, so we have x₁ = v.t₁ (note that we assume the wave velocity is constant – which is a very reasonable assumption). In other words, inserting x₁ and t₁ in the argument of our cosine function should yield the same value as inserting zero for x and t. Distance and time can be substituted so to say, and that’s why we will have something like x – vt or vt – x in the argument in that cosine function: we measure both time and distance in units of distance so to say. [Note that x – vt and –(x-vt) = vt – x are equivalent because cos θ = cos (-θ)]

Does this sound fishy? It shouldn’t. Think about it. In the (electric) field equation for electromagnetic radiation (that’s one of the examples of a wave which I mentioned above), you’ll find the so-called retarded acceleration a(t – x/c) in the argument: that’s the acceleration (a) of the charge causing the electric field at point x to change not at time t but at time t – x/c. So that’s the retarded acceleration indeed: x/c is the time it took for the wave to travel from its origin (the oscillating point charge) to x and so we subtract that from t. [When talking electromagnetic radiation (e.g. light), the wave velocity v is obviously equal to c, i.e. the speed of light, or of electromagnetic radiation in general.] Of course, you will now object that t – x/c is not the same as vt – x, and you are right: we need time units in the argument of that acceleration function, not distance. We can get to distance units if we would multiply the time with the wave velocity v but that’s complicated business because the velocity of that moving point charge is not a constant.

[…] I am not sure if I made myself clear here. If not, so be it. The thing to remember is that we need an input expressed in radians for our cosine function, not time, nor distance. Indeed, the argument in a sine or cosine function is an angle, not some distance. We will call that angle the phase of the wave, and it is usually denoted by the symbol θ – which we also used above. But so far we have been talking about amplitude as a function of distance, and we expressed time in distance units too – by multiplying it with v. How can we go from some distance to some angle? It is simple: we’ll multiply x – vt with 2π/λ.

Huh? Yes. Think about it. The wavelength will be expressed in units of distance – typically 1 m in SI (the International System of Units) but it could also be angstrom (10^–10 m = 0.1 nm) or nano-meter (10^–9 m = 10 Å). A wavelength of two meters (2 m) means that the wave only completes half a cycle per meter of travel. So we need to translate that into radians, which – once again – is the measure used to… well… measure angles, or the phase of the wave as we call it here. So what’s the ‘unit’ here? Well… Remember that we can add or subtract 2π (and any multiple of 2π, i.e. ± 2nπ with n = ±1, ±2, ±3,…) to the argument of all trigonometric functions and we’ll get the same value as for the original argument. In other words, a cycle characterized by a wavelength λ corresponds to the angle θ going around the origin and describing one full circle, i.e. 2π radians. Hence, it is easy: we can go from distance to radians by multiplying our ‘distance argument’ x – vt with 2π/λ. If you’re not convinced, just work it out for the example I gave: if the wavelength is 2 m, then 2π/λ equals 2π/2 = π. So traveling 6 meters along the wave – i.e. we’re letting x go from 0 to 6 m while fixing our time variable – corresponds to our phase θ going from 0 to 6π: both the ‘distance argument’ as well as the change in phase cover three cycles (three times two meters for the distance, and three times 2π for the change in phase) and so we’re fine. [Another way to think about it is to remember that the circumference of the unit circle is also equal to 2π (2π·r = 2π·1 in this case), so the ratio of 2π to λ measures how many times the circumference contains the wavelength.]
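
The little calculation in the example is easy to reproduce. The function name below is just something I made up for this sketch.

```python
import math

def phase_from_distance(distance, wavelength):
    """Phase (in radians) covered when moving `distance` along the wave at a fixed time."""
    return 2 * math.pi / wavelength * distance

theta = phase_from_distance(distance=6.0, wavelength=2.0)
cycles = theta / (2 * math.pi)
print(theta / math.pi, "times pi radians")
print(cycles, "cycles")
```

So 6 meters along a wave with a wavelength of 2 meters is a phase change of 6π radians, or three full cycles, and the cosine of the phase is back where it started.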

In short, if we put time and distance in the (2π/λ)(x-vt) formula, we’ll get everything in radians and that’s what we need for the argument for our cosine function. So our sinusoidal wave above can be represented by the following cosine function:

A = A(x, t) = A₀cos[(2π/λ)(x-vt)]

We could also write A = A₀cosθ with θ = (2π/λ)(x-vt). […] Both representations look rather ugly, don’t they? They do. And it’s not only ugly: it’s not the standard representation of a sinusoidal wave either. In order to make it look ‘nice’, we have to introduce some more concepts here, notably the angular frequency and the wave number. So let’s do that.

The angular frequency is just like the… well… the frequency you’re used to, i.e. the ‘non-angular’ frequency f, as measured in cycles per second (i.e. in Hertz). However, instead of measuring change in cycles per second, the angular frequency (usually denoted by the symbol ω) will measure the rate of change of the phase with time, so we can write or define ω as ω = ∂θ/∂t. In this case, we can easily see that ω = –2πv/λ. [Note that we’ll take the absolute value of that derivative because we want to work with positive numbers for such properties of functions.] Does that look complicated? In doubt, just remember that ω is measured in radians per second and then you can probably better imagine what it is really. Another way to understand ω somewhat better is to remember that the product of ω and the period T is equal to 2π, so that’s a full cycle. Indeed, the time needed to complete one cycle multiplied with the phase change per second (i.e. per unit time) is equivalent to going round the full circle: 2π = ω.T. Because f = 1/T, we can also relate ω to f and write ω = 2π.f = 2π/T.

Likewise, we can measure the rate of change of the phase with distance, and that gives us the wave number k = ∂θ/∂x, which is like the spatial frequency of the wave. So it plays the same role for distance as ω plays for time: it is measured in radians per unit distance. From the function above, it is easy to see that k = 2π/λ. The interpretation of this equality is similar to the ω.T = 2π equality. Indeed, we have a similar equation for k: 2π = k.λ, so the wavelength (λ) is for k what the period (T) is for ω. If you’re still uncomfortable with it, just play a bit with some numerical examples and you’ll be fine.
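
Here is one such numerical example, as a small sketch. The wavelength (2 m) and the wave velocity (3 m/s) are example values only, and ω is taken as the absolute value 2πv/λ.

```python
import math

def angular_frequency(wavelength, v):
    """omega = 2*pi*v/lambda, in radians per second (absolute value)."""
    return 2 * math.pi * v / wavelength

def wave_number(wavelength):
    """k = 2*pi/lambda, in radians per unit distance."""
    return 2 * math.pi / wavelength

wavelength, v = 2.0, 3.0           # meters, meters per second (example values)
omega = angular_frequency(wavelength, v)
k = wave_number(wavelength)
period = 2 * math.pi / omega       # from omega.T = 2*pi
frequency = 1 / period             # f = 1/T

print("omega =", omega, "rad/s")
print("k =", k, "rad/m")
print("T =", period, "s and f =", frequency, "Hz")
print("k * lambda / 2pi =", k * wavelength / (2 * math.pi))
```

With these values, ω is 3π radians per second, k is π radians per meter, and the frequency is 1.5 cycles per second: the phase goes through one full cycle of 2π per period in time, and one full cycle per wavelength in space.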

To make a long story short, this, then, allows us to re-write the sinusoidal wave equation above in its final form (and let me include the phase shift φ again in order to be as complete as possible at this stage):

A(x, t) = A₀cos(kx – ωt + φ)

You will agree that this looks much ‘nicer’ – and also more in line with what you’ll find in textbooks or on Wikipedia. 🙂 I should note, however, that we’re not adding any new parameters here. The wave number k and the angular frequency ω are not independent: this is still the same wave (A = A₀cos[(2π/λ)(x-vt)]), and so we are not introducing anything more than the frequency and – equally important – the speed with which the wave travels, which is usually referred to as the phase velocity. In fact, it is quite obvious from the ω.T = 2π and the k = 2π/λ identities that kλ = ω.T and, hence, taking into account that λ is obviously equal to λ = v.T (the wavelength is – by definition – the distance traveled by the wave in one period), we find that the phase (or wave) velocity v is equal to the ratio of ω and k, so we have that v = ω/k. So x, t, ω and k could be re-scaled or so but their ratio cannot change: the velocity of the wave is what it is. In short, I am introducing two new concepts and symbols (ω and k) but there are no new degrees of freedom in the system so to speak.
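For readers who like to play with numbers, the relations above can be checked with a few lines of Python. This is only a sketch: the function name `wave_parameters` and the sample values (a wavelength of 1 m and a phase velocity of 3 m/s) are illustrative assumptions, not anything prescribed by the argument.

```python
import math

def wave_parameters(wavelength, speed):
    """Return (f, T, omega, k) for a sinusoidal wave.

    wavelength: wavelength (lambda) in meters
    speed:      phase velocity v in meters per second
    """
    f = speed / wavelength          # f = v / lambda
    T = 1.0 / f                     # period
    omega = 2 * math.pi * f         # angular frequency, radians per second
    k = 2 * math.pi / wavelength    # wave number, radians per meter
    return f, T, omega, k

# Illustrative (assumed) values: lambda = 1 m, v = 3 m/s
wavelength, speed = 1.0, 3.0
f, T, omega, k = wave_parameters(wavelength, speed)

print("f       =", f)
print("omega*T =", omega * T, "(should be close to 2*pi)")
print("k*lambda=", k * wavelength, "(should be close to 2*pi)")
print("omega/k =", omega / k, "(should be close to the phase velocity)")
```

Whatever values you plug in, ω/k should come out equal to the phase velocity, which is the point made above: ω and k are new symbols, not new degrees of freedom.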

[At this point, I should probably say something about the difference between the phase velocity and the so-called group velocity of a wave. Let me do that in as brief a way as I can manage. Most real-life waves travel as a wave packet, aka a wave train. So that’s like a burst, or an “envelope” (I am shamelessly quoting Wikipedia here…), of “localized wave action that travels as a unit.” Such a wave packet has no single wave number or wavelength: it actually consists of a (large) set of waves with phases and amplitudes such that they interfere constructively only over a small region of space, and destructively elsewhere. The famous Fourier analysis (or infamous if you have problems understanding what it is really) decomposes this wave train into simpler pieces. While these ‘simpler’ pieces – which, together, add up to form the wave train – are all ‘nice’ sinusoidal waves (that’s why I call them ‘simple’), the wave packet as such is not. In any case (I can’t be too long on this), the speed with which this wave train itself is traveling through space is referred to as the group velocity. The phase velocity and the group velocity are usually very different: for example, a wave packet may be traveling forward (i.e. its group velocity is positive) but the phase velocity may be negative, i.e. traveling backward. However, I will stop here and refer to the Wikipedia article on group and phase velocity: it has wonderful illustrations which are much and much better than anything I could write here. Just one last point that I’ll use later: regardless of the shape of the wave (sinusoidal, sawtooth or whatever), we have a very obvious relationship relating wavelength and frequency to the (phase) velocity: v = λ.f, or f = v/λ. For example, a wave traveling at 3 meters per second with a wavelength of 1 meter will obviously have a frequency of three cycles per second (i.e. 3 Hz). Let’s go back to the main story line now.]

With the rather lengthy ‘introduction’ to waves above, we are now ready for the thing I really wanted to present here. I will go much faster now that we have covered the basics. Let’s go.

From my previous posts on complex numbers (or from what you know on complex numbers already), you will understand that working with cosine functions is much easier when writing them as the real part of a complex number A₀e^{iθ} = A₀e^{i(kx – ωt + φ)}. Indeed, A₀e^{iθ} = A₀(cosθ + i·sinθ) and so the cosine function above is nothing else but the real part of the complex number A₀e^{iθ}. Working with complex numbers makes adding waves and calculating interference effects and whatever we want to do with these wave functions much easier: we just replace the cosine functions by complex numbers in all of the formulae, solve them (algebra with complex numbers is very straightforward), and then we look at the real part of the solution to see what is happening really. We don’t care about the imaginary part, because that has no relationship to the actual physical quantities – for physical and electromagnetic waves that is, or for any other problem in classical wave mechanics. Done. So, in classical mechanics, the use of complex numbers is just a mathematical tool.
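To see the ‘replace the cosines by complex numbers, then take the real part’ recipe at work, here is a small Python sketch. The helper `wave` and all numerical values (amplitudes, phases, k, ω, x and t) are assumptions picked purely for illustration.

```python
import cmath
import math

def wave(A0, k, omega, phi, x, t):
    """Complex representation A0 * exp(i(kx - omega*t + phi)) of a cosine wave."""
    return A0 * cmath.exp(1j * (k * x - omega * t + phi))

# Assumed illustrative parameters: two waves with the same k and omega
# but different amplitudes and phase shifts.
k, omega = 2 * math.pi, 6 * math.pi
x, t = 0.4, 0.1
A1, phi1 = 1.0, 0.0
A2, phi2 = 0.5, math.pi / 3

# Route 1: add the complex numbers, then take the real part.
sum_via_complex = (wave(A1, k, omega, phi1, x, t) + wave(A2, k, omega, phi2, x, t)).real

# Route 2: add the cosine functions directly.
sum_via_cosines = (A1 * math.cos(k * x - omega * t + phi1)
                   + A2 * math.cos(k * x - omega * t + phi2))

# The amplitude of the combined wave follows from one complex addition.
combined_amplitude = abs(A1 * cmath.exp(1j * phi1) + A2 * cmath.exp(1j * phi2))

print(sum_via_complex, sum_via_cosines)   # the two routes should agree
print(combined_amplitude)
```

The two routes give the same number, but the complex route also hands us the amplitude of the combined wave without any trigonometric identities.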

Now, that is not the case for the wave functions in quantum mechanics: the imaginary part of a wave function – yes, let me write one down here – such as Ψ = Ψ(x, t) = (1/x)e^{i(kx – ωt)} is very much part and parcel of the so-called probability amplitude that describes the state of the system here. In fact, this Ψ function is an example taken from one of Feynman’s first Lectures on Quantum Mechanics (i.e. Volume III of his Lectures) and, in this case, Ψ(x, t) = (1/x)e^{i(kx – ωt)} represents the probability amplitude of a tiny particle (e.g. an electron) moving freely through space – i.e. without any external forces acting upon it – to go from 0 to x and actually be at point x at time t. [Note how it varies inversely with the distance because of the 1/x factor, so that makes sense.] In fact, when I started writing this post, my objective was to present this example – because it illustrates the concept of the wave function in quantum mechanics in a fairly easy and relatively understandable way. So let’s have a go at it.

First, it is necessary to understand the difference between probabilities and probability amplitudes. We all know what a probability is: it is a real number between 0 and 1 expressing the chance of something happening. It is usually denoted by the symbol P. An example is the probability that monochromatic light (i.e. one or more photons with the same frequency) is reflected from a sheet of glass. [To be precise, this probability is anything between 0 and 16% (i.e. P = 0 to 0.16). In fact, this example comes from another fine publication of Richard Feynman – QED (1985) – in which he explains how we can calculate the exact probability, which depends on the thickness of the sheet.]

A probability amplitude is something different. A probability amplitude is a complex number (3 + 2i, or 2.6e^{1.34i}, for example) and – unlike its equivalent in classical mechanics – both the real and imaginary part matter. That being said, probabilities and probability amplitudes are obviously related: to be precise, one calculates the probability of an event actually happening by taking the square of the modulus (or the absolute value) of the probability amplitude associated with that event. Huh? Yes. Just let it sink in. So, if we denote the probability amplitude by Φ, then we have the following relationship:

P =|Φ|²

P = probability

Φ = probability amplitude

In addition, where we would add and multiply probabilities in the classical world (for example, to calculate the probability of an event which can happen in two different ways – alternative 1 and alternative 2 let’s say – we would just add the individual probabilities to arrive at the probability of the event happening in one or the other way, so P = P₁ + P₂), in the quantum-mechanical world we should add and multiply probability amplitudes, and then take the square of the modulus of that combined amplitude to calculate the combined probability. So, formally, the probability of a particle to reach a given state by two possible routes (route 1 or route 2 let’s say) is to be calculated as follows:

Φ = Φ₁+ Φ₂

and P =|Φ|²=|Φ₁+ Φ₂|²

Also, when we have only one route, but that one route consists of two successive stages (for example: to go from A to C, the particle would first have to go from A to B, and then from B to C, with different probabilities of stage AB and stage BC actually happening), we will not multiply the probabilities (as we would do in the classical world) but the probability amplitudes. So we have:

Φ = Φ_ABΦ_BC

and P =|Φ|²=|Φ_ABΦ_BC|²
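The rules for combining amplitudes are easy to try out numerically. The sketch below uses made-up amplitudes (modulus 0.5, with phases chosen to be either equal or opposite); the function names `probability`, `two_routes` and `two_stages` are hypothetical helpers for this illustration only.

```python
import cmath

def probability(amplitude):
    """P = |amplitude|^2."""
    return abs(amplitude) ** 2

def two_routes(phi_1, phi_2):
    """Two alternative routes: add the amplitudes, then square the modulus."""
    return probability(phi_1 + phi_2)

def two_stages(phi_ab, phi_bc):
    """Two successive stages: multiply the amplitudes, then square the modulus."""
    return probability(phi_ab * phi_bc)

# Assumed amplitudes: same modulus (0.5), phases either equal or opposite.
phi_1 = cmath.rect(0.5, 0.0)
phi_2_in_phase = cmath.rect(0.5, 0.0)
phi_2_opposite = cmath.rect(0.5, cmath.pi)

classical_sum = probability(phi_1) + probability(phi_2_in_phase)   # P1 + P2
constructive = two_routes(phi_1, phi_2_in_phase)    # amplitudes reinforce
destructive = two_routes(phi_1, phi_2_opposite)     # amplitudes cancel

print("classical P1 + P2 :", classical_sum)
print("in phase          :", constructive)
print("opposite phase    :", destructive)
print("two stages        :", two_stages(phi_1, phi_2_opposite))
```

The individual probabilities are the same in both cases (0.25 each), yet the combined probability swings between (nearly) zero and one depending on the relative phase. That phase dependence is exactly what adding real-valued probabilities cannot reproduce.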

In short, it’s the probability amplitudes (and, as mentioned, these are complex numbers, not real numbers) that are to be added and multiplied etcetera and, hence, the probability amplitudes act as the equivalent, so to say, in quantum mechanics, of the conventional probabilities in classical mechanics. The difference is not subtle. Not at all. I won’t dwell too much on this. Just re-read any account of the double-slit experiment with electrons which you may have read and you’ll remember how fundamental this is. [By the way, I was surprised to learn that the double-slit experiment with electrons has apparently only been done in 2012 in exactly the way as Feynman described it. So when Feynman described it in his 1965 Lectures, it was still very much a ‘thought experiment’ only – even though a 1961 experiment (not mentioned by Feynman) had clearly established the reality of electron interference.]

OK. Let’s move on. So we have this complex wave function in quantum mechanics and, as Feynman writes, “It is not like a real wave in space; one cannot picture any kind of reality to this wave as one does for a sound wave.” That being said, one can, however, get pretty close to ‘imagining’ what it actually is IMHO. Let’s go by the example which Feynman gives himself – on the very same page where he writes the above actually. The amplitude for a free particle (i.e. with no forces acting on it) with momentum p = mv to go from location r₁ to location r₂ is equal to

Φ₁₂ = (1/r₁₂)e^{i p·r₁₂/ħ} with r₁₂ = r₂ – r₁

I agree this looks somewhat ugly again, but so what does it say? First, be aware of the difference between vectors and their magnitudes: p and v are vectors, so they have a magnitude (which I will write as |p| and |v|, or simply as p and v where no confusion is possible) as well as a direction in space. Likewise, r₁₂ in the exponent is a vector going from r₁ to r₂ (and r₁ and r₂ are obviously space vectors themselves), while the r₁₂ in the 1/r₁₂ factor is the magnitude of that vector. Keeping that in mind, we know that the dot product p·r₁₂ is equal to the product of the magnitudes of those vectors multiplied by cosα, with α the angle between those two vectors. Hence, p·r₁₂ = |p||r₁₂|cosα. Now, if p and r₁₂ have the same direction, the angle α will be zero, so cosα will be equal to one and the dot product reduces to the product of the magnitudes, |p||r₁₂|, or, if we’re considering a particle going from 0 to some position x, simply px.
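If the dot product feels abstract, a tiny Python sketch may help. The vectors below are arbitrary illustrative values (not physical momenta or positions), and `dot` and `magnitude` are hypothetical helper names.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as tuples of components."""
    return sum(ai * bi for ai, bi in zip(a, b))

def magnitude(a):
    """Length (magnitude) of a vector."""
    return math.sqrt(dot(a, a))

# Assumed example vectors, in arbitrary units.
p = (3.0, 4.0, 0.0)
r_parallel = (6.0, 8.0, 0.0)         # same direction as p, so alpha = 0
r_perpendicular = (-4.0, 3.0, 0.0)   # at right angles to p

cos_alpha = dot(p, r_parallel) / (magnitude(p) * magnitude(r_parallel))

print(dot(p, r_parallel), magnitude(p) * magnitude(r_parallel))  # equal when alpha = 0
print(cos_alpha)
print(dot(p, r_perpendicular))                                   # zero when alpha = 90 degrees
```

When the two vectors point the same way, the dot product is just the product of the magnitudes, which is why the exponent collapses to px/ħ in the one-dimensional case.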

Now we also have Planck’s constant there, in its reduced form ħ = h/2π. As you can imagine, this 2π has something to do with the fact that we need radians in the argument. It’s the same as what we did with x in the argument of that cosine function above: if we have to express stuff in radians, then we have to absorb a factor of 2π in that constant. However, here I need to make an additional digression. Planck’s constant is obviously not just any constant: it is the so-called quantum of action. Indeed, it appears in what may well be the most fundamental relations in physics.

The first of these fundamental relations is the so-called Planck relation: E = hf. The Planck relation expresses the wave-particle duality of light (or electromagnetic waves in general): light comes in discrete quanta of energy (photons), and the energy of these ‘wave particles’ is directly proportional to the frequency of the wave, and the factor of proportionality is Planck’s constant.

The second fundamental relation, or relations – in plural – I should say, are the de Broglie relations. Indeed, Louis-Victor-Pierre-Raymond, 7th duc de Broglie, turned the above on its head: if the fundamental nature of light is (also) particle-like, then the fundamental nature of particles must (also) be wave-like. So he boldly associated a frequency f and a wavelength λ with all particles, such as electrons for example – but larger-scale objects, such as billiard balls, or planets, also have a de Broglie wavelength and frequency! The de Broglie relation determining the de Broglie frequency is – quite simply – the re-arranged Planck relation: f = E/h. So this relation relates the de Broglie frequency with energy. However, in the above wave function, we’ve got momentum, not energy. Well… Energy and momentum are obviously related, and so we have a second de Broglie relation relating momentum with wavelength: λ = h/p.

We’re almost there: just hang in there. 🙂 When we presented the sinusoidal wave equation, we introduced the angular frequency (ω) and the wave number (k), instead of working with f and λ. That’s because we want an argument expressed in radians. Here it’s the same. The two de Broglie equations have an equivalent using angular frequency and wave number: ω = E/ħ and k = p/ħ. So we’ll just use the second one (i.e. the relation with the momentum in it) to associate a wave number with the particle (k = p/ħ).

Phew! So, finally, we get that formula which we introduced a while ago already: Ψ(x) = (1/x)e^{ikx}, or, including time as a variable as well (we made abstraction of time so far):

Ψ(x, t) = (1/x)e^{i(kx – ωt)}

The formula above obviously makes sense. For example, the 1/x factor makes the probability amplitude decrease as we get farther away from where the particle started: in fact, this 1/x or 1/r variation is what we see with electromagnetic waves as well: the amplitude of the electric field vector E varies as 1/r and, because we’re talking some real wave here and, hence, its energy is proportional to the square of the field, the energy that the source can deliver varies inversely as the square of the distance. [Another way of saying the same is that the energy we can take out of a wave within a given conical angle is the same, no matter how far away we are: the energy flux is never lost – it just spreads over a greater and greater effective area. But let’s go back to the main story.]

We’ve got the math – I hope. But what does this equation mean really? What’s that de Broglie wavelength or frequency in reality? What wave are we talking about? Well… What’s reality? As mentioned above, the famous de Broglie relations associate a wavelength λ and a frequency f to a particle with momentum p and energy E, but it’s important to mention that the associated de Broglie wave function yields probability amplitudes. So it is, indeed, not a ‘real wave in space’ as Feynman would put it. It is a quantum-mechanical wave function.

Huh? […] It’s obviously about time I add some illustrations here, and so that’s what I’ll do. Look at the two cases below. The case on top is pretty close to the situation I described above: it’s a de Broglie wave – so that’s a complex wave – traveling through space (in one dimension only here). The real part of the complex amplitude is in blue, and the green is the imaginary part. So the probability of finding that particle at some position x is the modulus squared of this complex amplitude. Now, this particular wave function ignores the 1/x variation and, hence, the squared modulus of Ae^{i(kx – ωt)} is equal to a constant. To be precise, it’s equal to A² (check it: the squared modulus of a complex number z equals the product of z and its complex conjugate, and so we get A² as a result indeed). So what does this mean? It means that the probability of finding that particle (an electron, for example) is the same at all points! In other words, we don’t know where it is! In the illustration below (top part), that’s shown as the (yellow) color opacity: the probability is spread out, just like the wave itself, so there is no definite position of the particle indeed.
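The ‘check it’ above takes only a few lines of Python. The values of A, k and ω and the sample positions are assumptions chosen for illustration; `psi` is a hypothetical helper name.

```python
import cmath

def psi(x, t, A=2.0, k=1.5, omega=0.7):
    """Plane wave A * exp(i(kx - omega*t)), without the 1/x factor."""
    return A * cmath.exp(1j * (k * x - omega * t))

# Squared modulus at a handful of arbitrary positions, at one arbitrary time.
positions = (-5.0, 0.0, 1.0, 12.3)
probabilities = [abs(psi(x, 0.3)) ** 2 for x in positions]

# Same thing via z times its complex conjugate.
via_conjugate = [(psi(x, 0.3) * psi(x, 0.3).conjugate()).real for x in positions]

print(probabilities)   # every entry should be (very close to) A**2
print(via_conjugate)
```

The real and imaginary parts oscillate with x, but the squared modulus does not: it equals A² everywhere.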

[Note that the formula in the illustration above (which I took from Wikipedia once again) uses p instead of k as the factor in front of x. While it does not make a big difference from a mathematical point of view (ħ is just a factor of proportionality: k = p/ħ), it does make a big difference from a conceptual point of view and, hence, I am puzzled as to why the author of this article did this. Also, there is some variation in the opacity of the yellow (i.e. the color of our tennis (or ping pong) ball representing our ‘wavicle’) which shouldn’t be there because the probability associated with this particular wave function is a constant indeed: so there is no variation in the probability (when squaring the absolute value of a complex number, the phase factor does not come into play). Also note that, because all probabilities have to add up to 100% (or to 1), a wave function like this is quite problematic. However, don’t worry about it just now: just try to go with the flow.]

By now, I must assume you shook your head in disbelief a couple of times already. Surely, this particle (let’s stick to the example of an electron) must be somewhere, yes? Of course.

The problem is that we gave an exact value to its momentum and its energy and, as a result, through the de Broglie relations, we also associated an exact frequency and wavelength to the de Broglie wave associated with this electron. Hence, Heisenberg’s Uncertainty Principle comes into play: if we have exact knowledge on momentum, then we cannot know anything about its location, and so that’s why we get this wave function covering the whole space, instead of just some region only. Sort of. Here we are, of course, talking about that deep mystery about which I cannot say much – if only because so many eminent physicists have already exhausted the topic. I’ll just quote Feynman once more: “Things on a very small scale behave like nothing that you have any direct experience with. […] It is very difficult to get used to, and it appears peculiar and mysterious to everyone – both to the novice and to the experienced scientist. Even the experts do not understand it the way they would like to, and it is perfectly reasonable that they should not because all of direct, human experience and of human intuition applies to large objects. We know how large objects will act, but things on a small scale just do not act that way. So we have to learn about them in a sort of abstract or imaginative fashion and not by connection with our direct experience.” And, after describing the double-slit experiment, he highlights the key conclusion: “In quantum mechanics, it is impossible to predict exactly what will happen. We can only predict the odds [i.e. probabilities]. Physics has given up on the problem of trying to predict exactly what will happen. Yes! Physics has given up. We do not know how to predict what will happen in a given circumstance. It is impossible: the only thing that can be predicted is the probability of different events. It must be recognized that this is a retrenchment in our ideal of understanding nature. It may be a backward step, but no one has seen a way to avoid it.”

[…] That’s enough on this I guess, but let me – as a way to conclude this little digression – just quickly state the Uncertainty Principle in a more or less accurate version here, rather than all of the ‘descriptions’ which you may have seen of it: the Uncertainty Principle refers to any of a variety of mathematical inequalities asserting a fundamental limit (fundamental means it’s got nothing to do with observer or measurement effects, or with the limitations of our experimental technologies) to the precision with which certain pairs of physical properties of a particle (these pairs are known as complementary variables) such as, for example, position (x) and momentum (p), can be known simultaneously. More in particular, for position and momentum, we have that σ_xσ_p ≥ ħ/2 (and, in this formulation, σ is, obviously the standard symbol for the standard deviation of our point estimate for x and p respectively).

OK. Back to the illustration above. A particle that is to be found in some specific region – rather than just ‘somewhere’ in space – will have a probability amplitude resembling the wave equation in the bottom half: it’s a wave train, or a wave packet, and we can decompose it, using the Fourier analysis, in a number of sinusoidal waves, but so we do not have a unique wavelength for the wave train as a whole, and that means – as per the de Broglie equations – that there’s some uncertainty about its momentum (or its energy).
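To get a feel for how a spread of wavelengths localizes the wave, one can add up a bunch of sinusoidal waves numerically. The sketch below is only an illustration under assumed values: the central wave number `K0`, the spacing `DK`, the Gaussian width `SIGMA_K` and the number of components are all arbitrary choices, and the Gaussian weighting is just one convenient way to build a packet.

```python
import cmath
import math

# Assumed illustrative parameters (arbitrary units).
K0 = 5.0        # central wave number
DK = 0.05       # spacing between neighbouring wave numbers
SIGMA_K = 0.2   # spread of wave numbers around K0
N = 40          # components on each side of K0

def single_wave(x):
    """One wave with a precise wave number: modulus 1 everywhere."""
    return cmath.exp(1j * K0 * x)

def wave_packet(x):
    """Sum of sinusoidal waves with Gaussian weights around K0 (at t = 0)."""
    total = 0j
    for j in range(-N, N + 1):
        k = K0 + j * DK
        weight = math.exp(-((k - K0) ** 2) / (2 * SIGMA_K ** 2))
        total += weight * cmath.exp(1j * k * x)
    return total

for x in (0.0, 5.0, 10.0, 20.0):
    relative = abs(wave_packet(x)) / abs(wave_packet(0.0))
    print(f"x = {x:5.1f}   |single| = {abs(single_wave(x)):.3f}   |packet|/|packet(0)| = {relative:.5f}")
```

The single wave has the same modulus everywhere, while the packet is concentrated around x = 0 and dies out quickly away from it. The price is that the packet no longer has one wave number but a whole range of them, and a larger `SIGMA_K` gives a narrower packet.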

I will let this sink in for now. In my next post, I will write some more about these wave equations. They are usually a solution to some differential equation – and that’s where my next post will connect with my previous ones (on differential equations). Just to say goodbye – as for now that is – I will just copy another beautiful illustration from Wikipedia. See below: it represents the (likely) space in which a single electron on the 5d atomic orbital of a hydrogen atom would be found. The solid body shows the places where the electron’s probability density (so that’s the squared modulus of the probability amplitude) is above a certain value – so it’s basically the area where the likelihood of finding the electron is higher than elsewhere. The hue on the colored surface shows the complex phase of the wave function.

It is a wonderful image, isn’t it? At the very least, it increased my understanding of the mystery surrounding quantum mechanics somewhat. I hope it helps you too. 🙂

Post scriptum 1: On the need to normalize a wave function

In this post, I wrote something about the need for probabilities to add up to 1. In mathematical terms, this condition will resemble something like

∫_{Rⁿ} |ψ₀(x)|² dx = a²

In this integral, we’ve got – once again – the squared modulus of the wave function, and so that’s the probability of finding the particle somewhere. The integral just states that all of the probabilities added all over space (Rⁿ) should add up to some finite number (a²). Hey! But that’s not equal to 1 you’ll say. Well… That’s a minor problem only: we can create a normalized wave function ψ out of ψ₀ by simply dividing ψ₀ by a so we have ψ = ψ₀/a, and then all is ‘normal’ indeed. 🙂
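The normalization step can be tried out numerically in one dimension. Everything specific in this sketch is an assumption for illustration: the un-normalized function `psi0` (a Gaussian envelope times a phase factor), the integration interval and the simple midpoint rule used for the integral.

```python
import cmath
import math

def psi0(x):
    """Assumed un-normalized wave function: Gaussian envelope times a phase factor."""
    return 3.0 * math.exp(-x * x / 2.0) * cmath.exp(1j * 2.0 * x)

def integrate_prob(psi, lo=-10.0, hi=10.0, n=20000):
    """Midpoint-rule estimate of the integral of |psi(x)|^2 over [lo, hi]."""
    dx = (hi - lo) / n
    return sum(abs(psi(lo + (i + 0.5) * dx)) ** 2 for i in range(n)) * dx

a_squared = integrate_prob(psi0)   # the finite number a^2
a = math.sqrt(a_squared)

def psi(x):
    """Normalized wave function: psi = psi0 / a."""
    return psi0(x) / a

print("a^2 before normalizing:", a_squared)
print("integral afterwards   :", integrate_prob(psi))   # should be close to 1
```

For this particular choice the integral can also be done by hand (it equals 9√π), which gives a handy cross-check on the numerical value.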

Post scriptum 2: On using colors to represent complex numbers

When inserting that beautiful 3D graph of that 5d atomic orbital (again acknowledging its source: Wikipedia), I wrote that “the hue on the colored surface shows the complex phase of the wave function.” Because this kind of visual representation of complex numbers will pop up in other posts as well (and you’ve surely encountered it a couple of times already), it’s probably useful to be explicit on what it represents exactly. Well… I’ll just copy the Wikipedia explanation, which is clear enough: “Given a complex number z = re^iθ, the phase (also known as argument) θ can be represented by a hue, and the modulus r =|z| is represented by either intensity or variations in intensity. The arrangement of hues is arbitrary, but often it follows the color wheel. Sometimes the phase is represented by a specific gradient rather than hue.” So here you go…

Post scriptum 3: On the de Broglie relations

The de Broglie relations are a wonderful pair. They’re obviously equivalent: energy and momentum are related, and wavelength and frequency are obviously related too through the general formula relating frequency, wavelength and wave velocity: fλ = v (the product of the frequency and the wavelength must yield the wave velocity indeed). However, when it comes to the relation between energy and momentum, there is a little catch. What kind of energy are we talking about? We were describing a free particle (e.g. an electron) traveling through space, with no (other) charges acting on it (in other words: no potential acting upon it), and so we might be tempted to conclude that we’re talking about the kinetic energy (K.E.) here. So, at relatively low speeds (v), we could be tempted to use the equations p = mv and K.E. = p²/2m = mv²/2 (the one electron in a hydrogen atom travels at less than 1% of the speed of light, and so that’s a non-relativistic speed indeed) and try to go from one equation to the other with these simple formulas. Well… Let’s try it.

f = E/h according to de Broglie and, hence, substituting E with p²/2m and f with v/λ, we get v/λ = m²v²/2mh. Some simplification and re-arrangement should then yield the second de Broglie relation: λ = 2h/mv = 2h/p. So there we are. Well… No. The second de Broglie relation is just λ = h/p: there is no factor 2 in it. So what’s wrong? The problem is the energy equation: de Broglie does not use the K.E. formula. [By the way, you should note that the K.E. = mv²/2 equation is only an approximation for low speeds – low compared to c that is.] He takes Einstein’s famous E = mc² equation (which I am tempted to explain now but I won’t) and just substitutes c, the speed of light, with v, the velocity of the slow-moving particle. This is a very fine but also very deep point which, frankly, I do not yet fully understand. Indeed, Einstein’s E = mc² is obviously something much ‘deeper’ than the formula for kinetic energy. The latter has to do with forces acting on masses and, hence, obeys Newton’s laws – so it’s rather familiar stuff. As for Einstein’s formula, well… That’s a result from relativity theory and, as such, something that is much more difficult to explain. While the difference between the two energy formulas is just a factor of 1/2 (which is usually not a big problem when you’re just fiddling with formulas like this), it makes a big conceptual difference.
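The factor 2 can be reproduced numerically by following the same substitution as in the text (f = E/h, then λ = v/f) with the two candidate energy formulas. The mass and speed below are merely illustrative inputs (roughly those of an electron in a hydrogen atom), and `wavelength_from_energy` is a hypothetical helper name.

```python
H = 6.626e-34   # Planck constant in J·s

def wavelength_from_energy(energy, speed):
    """Follow the substitution used in the text: f = E/h, then lambda = v/f."""
    frequency = energy / H
    return speed / frequency

# Illustrative inputs: approximate electron mass (kg) and speed (m/s).
m, v = 9.1e-31, 2.2e6
p = m * v

lam_kinetic = wavelength_from_energy(p ** 2 / (2 * m), v)   # E = p^2/2m = mv^2/2
lam_mv2 = wavelength_from_energy(m * v ** 2, v)             # E = mv^2

print("with E = mv^2/2:", lam_kinetic, " vs 2h/p =", 2 * H / p)
print("with E = mv^2  :", lam_mv2, " vs  h/p =", H / p)
print("ratio          :", lam_kinetic / lam_mv2)
```

With this substitution, the kinetic-energy formula lands on 2h/p, and only E = mv² reproduces λ = h/p.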

Hmm… Perhaps we should do some examples. So these de Broglie equations associate a wave with frequency f and wavelength λ with particles with energy E, momentum p and mass m traveling through space with velocity v: E = hf and p = h/λ. [And, if we would want to use some sine or cosine function as an example of such wave function – which is likely – then we need an argument expressed in radians rather than in units of time or distance. In other words, we will need to convert frequency and wavelength to angular frequency and wave number respectively by using the 2π = ωT = ω/f and 2π = kλ relations, with the wavelength (λ), the period (T) and the velocity (v) of the wave being related through the simple equations f = 1/T and λ = vT. So then we can write the de Broglie relations as: E = ħω and p = ħk, with ħ = h/2π.]

In these equations, the Planck constant (be it h or ħ) appears as a simple factor of proportionality (we will worry about what h actually is in physics in later posts) – but a very tiny one: approximately 6.626×10^–34 J·s (Joule is the standard SI unit to measure energy, or work: 1 J = 1 kg·m²/s²), or 4.136×10^–15 eV·s when using a more appropriate (i.e. larger) measure of energy for atomic physics: still, 10^–15 is only 0.000 000 000 000 001. So how does it work? First note, once again, that we are supposed to use the equivalent for slow-moving particles of Einstein’s famous E = mc² equation as a measure of the energy of a particle: E = mv². We know velocity adds mass to a particle – with mass being a measure for inertia. In fact, the mass of so-called massless particles, like photons, is nothing but their energy (divided by c²). In other words, they do not have a rest mass, but they do have a relativistic mass m = E/c², with E = hf (and with f the frequency of the light wave here). Particles, such as electrons, or protons, do have a rest mass, but then they don’t travel at the speed of light. So how does that work out in that E = mv² formula which – let me emphasize this point once again – is not the standard formula (for kinetic energy) that we’re used to (i.e. E = mv²/2)? Let’s do the exercise.

For photons, we can re-write E = hf as E = hc/λ. The numerator hc in this expression is 4.136×10^–15 eV·s (i.e. the value of the Planck constant h expressed in eV·s) multiplied with 2.998×10⁸ m/s (i.e. the speed of light c) so that’s (more or less) hc ≈ 1.24×10^–6 eV·m. For visible light, the denominator will range from 0.38 to 0.75 micrometer (1 μm = 10^–6 m), i.e. 380 to 750 nanometer (1 nm = 10^–9 m), and, hence, the energy of the photon will be in the range of 3.263 eV to 1.653 eV. So that’s only a few electronvolt (an electronvolt (eV) is, by definition, the amount of energy gained (or lost) by a single electron as it moves across an electric potential difference of one volt). So that’s 2.6×10^–19 to 5.2×10^–19 Joule (1 eV = 1.6×10^–19 Joule) and, hence, the equivalent relativistic mass of these photons is E/c² or 2.9×10^–36 to 5.8×10^–36 kg. That’s tiny – but not insignificant. Indeed, let’s look at an electron now.

The rest mass of an electron is about 9.1×10⁻³¹ kg (so that’s a few hundred thousand times the values we found for the relativistic mass of photons). Also, in a hydrogen atom, it is expected to speed around the nucleus with a velocity of about 2.2×10⁶ m/s. That’s less than 1% of the speed of light but still quite fast obviously: at this speed (2,200 km per second), it could travel around the earth in less than 20 seconds (a photon does better: it travels not less than 7.5 times around the earth in one second). In any case, the electron’s energy – according to the formula to be used as input for calculating the de Broglie frequency – is 9.1×10⁻³¹ kg multiplied with the square of 2.2×10⁶ m/s, and so that’s about 44×10^–19 Joule or about 27.5 eV (1 eV = 1.6×10^–19 Joule). So that’s – roughly – 8 to 17 times the energy associated with a visible-light photon.
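Since it is easy to slip on powers of ten in this kind of arithmetic, here is a short Python sketch that redoes the photon and electron calculations. The constants are the rounded values quoted in the text; the function names are hypothetical helpers for this illustration.

```python
H_EVS = 4.136e-15    # Planck constant in eV·s (rounded value from the text)
C = 2.998e8          # speed of light in m/s
EV_TO_J = 1.6e-19    # one electronvolt in Joule (rounded)

def photon_energy_ev(wavelength_m):
    """E = hc / lambda, in electronvolt."""
    return H_EVS * C / wavelength_m

def relativistic_mass_kg(energy_ev):
    """m = E / c^2, with the energy first converted to Joule."""
    return energy_ev * EV_TO_J / C ** 2

# Visible light: 380 nm (violet end) and 750 nm (red end).
for wavelength in (380e-9, 750e-9):
    e_ev = photon_energy_ev(wavelength)
    print(f"{wavelength * 1e9:.0f} nm: {e_ev:.3f} eV, "
          f"{e_ev * EV_TO_J:.2e} J, mass {relativistic_mass_kg(e_ev):.2e} kg")

# Electron in a hydrogen atom, using E = m v^2 as in the text.
M_ELECTRON = 9.1e-31   # kg
V_ELECTRON = 2.2e6     # m/s
e_electron_j = M_ELECTRON * V_ELECTRON ** 2
e_electron_ev = e_electron_j / EV_TO_J   # Joule -> eV means dividing by 1.6e-19
print(f"electron: {e_electron_j:.2e} J = {e_electron_ev:.1f} eV")
```

Note the direction of the unit conversions: going from eV to Joule multiplies by 1.6×10^–19, and going from Joule to eV divides by it.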

The wavelength we should associate with this energy can be calculated from E = hv/λ (we should, once again, use v instead of c), but we can also simplify and calculate directly from the mass: λ = hv/E = hv/mv² = h/mv (however, make sure you express h in J·s in this case): we get a value for λ equal to 0.33 nanometer, so that’s more than one thousand times shorter than the above-mentioned wavelengths for visible light. So we have a scale factor of about a thousand here. That’s reasonable, no? [There is a similar scale factor when moving to the next level: the mass of protons and neutrons is about 2000 times the mass of an electron.] Indeed, note that we would get a value of 0.510 MeV if we would apply the E = mc² equation to the above-mentioned (rest) mass of the electron (in kg): MeV stands for mega-electronvolt, so 0.510 MeV is 510,000 eV. So that’s a few hundred thousand times the energy of a photon and, hence, it is obvious that we are not using the energy equivalent of an electron’s rest mass when using de Broglie’s equations. No. It’s just that simple but rather mysterious E = mv² formula. So it’s not mc² nor mv²/2 (kinetic energy). Food for thought, isn’t it? Let’s look at the formulas once again.
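The wavelength calculation is a one-liner, but it is worth spelling out in Python, together with the matching wave number k = p/ħ. The constants are the rounded values used in the text, and `de_broglie_wavelength` is a hypothetical helper name.

```python
import math

H = 6.626e-34             # Planck constant in J·s
HBAR = H / (2 * math.pi)  # reduced Planck constant

def de_broglie_wavelength(mass_kg, speed_ms):
    """lambda = h / (m v)."""
    return H / (mass_kg * speed_ms)

# Electron in a hydrogen atom (rounded values from the text).
m_e, v_e = 9.1e-31, 2.2e6
lam = de_broglie_wavelength(m_e, v_e)
k = m_e * v_e / HBAR      # wave number from k = p / hbar

print(f"lambda = {lam * 1e9:.2f} nm")
print(f"k      = {k:.3e} rad/m, and k*lambda = {k * lam:.4f} (2*pi = {2 * math.pi:.4f})")
print(f"380 nm / lambda = {380e-9 / lam:.0f}")
```

The product kλ comes out as 2π, as it should, and the ratio in the last line is the scale factor of about a thousand relative to visible light.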

They can easily be linked: we can re-write the frequency formula as λ = hv/E = hv/mv² = h/mv and then, using the general definition of momentum (p = mv), we get the second de Broglie equation: p = h/λ. In fact, de Broglie’s rather particular definition of the energy of a particle (E = mv²) makes v a simple factor of proportionality between the energy and the momentum of a particle: v = E/p or E = pv. [We can also get this result in another way: we have h = E/f = pλ and, hence, E/p = fλ = v.]

Again, this is serious food for thought: I have not seen any ‘easy’ explanation of this relation so far. To appreciate its peculiarity, just compare it to the usual relations relating energy and momentum: E = p²/2m or, in its relativistic form, p²c² = E² – m₀²c⁴. So these two equations are both not to be used when going from one de Broglie relation to another. [Of course, it works for massless photons: using the relativistic form, we get p²c² = E² – 0 or E = pc, and the de Broglie relation becomes the Planck relation: E = hf (with f the frequency of the photon, i.e. the light beam it is part of). We also have p = h/λ = hf/c, and, hence, the E/p = c comes naturally. But that’s not the case for (slower-moving) particles with some rest mass: why should we use mv² as an energy measure for them, rather than the kinetic energy formula?]

But let’s just accept this weirdness and move on. After all, perhaps there is some mistake here and so, perhaps, we should just accept that factor 2 and replace λ = h/p by λ = 2h/p. Why not? 🙂 In any case, both the λ = h/mv and λ = 2h/p = 2h/mv expressions give the impression that both the mass of a particle as well as its velocity are on a par so to say when it comes to determining the numerical value of the de Broglie wavelength: if we double the speed, or the mass, the wavelength gets shortened by half. So, one would think that larger masses can only be associated with extremely short de Broglie wavelengths if they move at a fairly considerable speed. But that’s where the extremely small value of h changes the arithmetic we would expect to see. Indeed, things work different at the quantum scale, and it’s the tiny value of h that is at the core of this. Indeed, it’s often referred to as the ‘smallest constant’ in physics, and so here’s the place where we should probably say a bit more about what h really stands for.

Planck’s constant h describes the tiny discrete packets in which Nature packs energy: one cannot find any smaller ‘boxes’. As such, it’s referred to as the ‘quantum of action’. But, surely, you’ll immediately say that its cousin, ħ = h/2π, is actually smaller. Well… Yes. You’re right: ħ = h/2π is smaller. It’s the so-called quantum of angular momentum, also (and probably better) known as spin. Angular momentum is a measure of… Well… Let’s call it the ‘amount of rotation’ an object has, taking into account its mass, shape and speed. Just like p, it’s a vector. To be precise, it’s the product of a body’s so-called rotational inertia (so that’s similar to the mass m in p = mv) and its rotational velocity (so that’s like v, but it’s ‘angular’ velocity), so we can write L = Iω, but we’ll not go into any more detail here. The point to note is that angular momentum, or spin as it’s known in quantum mechanics, also comes in discrete packets, and these packets are multiples of ħ. [OK. I am simplifying here but the idea or principle that I am explaining here is entirely correct.]

But let’s get back to the de Broglie wavelength now. As mentioned above, one would think that larger masses can only be associated with extremely short de Broglie wavelengths if they move at a fairly considerable speed. Well… It turns out that the extremely small value of h upsets our everyday arithmetic. Indeed, because h is so extremely small compared to the objects we are used to (in one grain of salt alone, we will find about 1.2×10¹⁸ atoms – just write a 1 with 18 zeroes behind it and you’ll appreciate this immense number somewhat more), it turns out that speed does not matter all that much – at least not in the range we are used to. For example, the de Broglie wavelength associated with a baseball weighing 145 grams and traveling at 90 mph (i.e. approximately 40 m/s) would be 1.1×10⁻³⁴ m. That’s immeasurably small indeed – literally immeasurably small: not only technically but also theoretically because, at this scale (i.e. the so-called Planck scale), the concepts of size and distance break down as a result of the Uncertainty Principle. But, surely, you’ll think we can improve on this if we just look at a baseball traveling much slower. Well… It does not get much better for a baseball traveling at a snail’s pace – let’s say 1 cm per hour, i.e. about 2.8×10⁻⁶ m/s. Indeed, we get a wavelength of about 1.6×10⁻²⁷ m, which is still nowhere near the nanometer range we found for electrons. Just to give an idea: the resolving power of the best electron microscope is about 50 picometers (1 pm = 10⁻¹² m) and so that’s the size of a small atom (the size of an atom ranges between 30 and 300 pm). In short, for all practical purposes, the de Broglie wavelength of the objects we are used to does not matter – and then I mean it does not matter at all. And so that’s why quantum-mechanical phenomena are only relevant at the atomic scale.
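If you want to redo this arithmetic yourself, the following Python sketch applies λ = h/mv to the two baseball cases. The electron at 10⁶ m/s is an assumed reference case added only for comparison, and the function name is made up for this example. Small differences from the figures quoted above come from rounding the speeds.

```python
H = 6.62607015e-34  # Planck's constant in J·s (exact SI value)

def de_broglie_wavelength(mass_kg, speed_m_s):
    """Return the de Broglie wavelength λ = h/(m·v) in meters."""
    return H / (mass_kg * speed_m_s)

BASEBALL_MASS = 0.145             # kg (145 grams)
FAST_SPEED = 40.0                 # m/s, roughly 90 mph
SNAIL_SPEED = 0.01 / 3600.0       # m/s, i.e. 1 cm per hour
ELECTRON_MASS = 9.1093837015e-31  # kg
ELECTRON_SPEED = 1.0e6            # m/s, an assumed illustrative speed

cases = {
    "baseball at 40 m/s": de_broglie_wavelength(BASEBALL_MASS, FAST_SPEED),
    "baseball at 1 cm/hour": de_broglie_wavelength(BASEBALL_MASS, SNAIL_SPEED),
    "electron at 1e6 m/s": de_broglie_wavelength(ELECTRON_MASS, ELECTRON_SPEED),
}

for label, wavelength in cases.items():
    print(f"{label}: {wavelength:.2e} m")
```

Slowing the baseball down by seven orders of magnitude lengthens its wavelength by the same factor, but that still leaves it some seventeen orders of magnitude short of the electron’s sub-nanometer wavelength.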