The wavefunction as an oscillation of spacetime

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you immediately read the more recent exposé on this matter, which you can find by clicking on the provided link.

Original post:

You probably heard about the experimental confirmation of the existence of gravitational waves by Caltech’s LIGO Lab. Besides further confirming our understanding of the Universe, I also like to think it confirms that the elementary wavefunction represents a propagation mechanism that is common to all forces. However, the fundamental question remains: what is the wavefunction? What are those real and imaginary parts of those complex-valued wavefunctions describing particles and/or photons? [In case you think photons have no wavefunction, see my post on it: it’s fairly straightforward to re-formulate the usual description of an electromagnetic wave (i.e. the description in terms of the electric and magnetic field vectors) in terms of a complex-valued wavefunction. To be precise, in the mentioned post, I showed an electromagnetic wave can be represented as the sum of two wavefunctions whose components reflect each other through a rotation by 90 degrees.]

So what? Well… I’ve started to think that the wavefunction may not only describe some oscillation in spacetime. I’ve started to think the wavefunction—any wavefunction, really (so I am not talking gravitational waves only)—is nothing but an oscillation of spacetime. What makes them different is the geometry of those wavefunctions, and the coefficient(s) representing their amplitude, which must be related to the relative strength of the forces involved—somehow, although I still have to figure out how exactly.

Huh? Yes. Maxwell, after jotting down his equations for the electric and magnetic field vectors, wrote the following back in 1862: “The velocity of transverse undulations in our hypothetical medium, calculated from the electromagnetic experiments of MM. Kohlrausch and Weber, agrees so exactly with the velocity of light calculated from the optical experiments of M. Fizeau, that we can scarcely avoid the conclusion that light consists in the transverse undulations of the same medium which is the cause of electric and magnetic phenomena.”

We now know there is no medium – no aether – but physicists still haven’t answered the most fundamental question: what is it that is oscillating? No one has gone beyond the abstract math. I dare to say now that it must be spacetime itself. In order to prove this, I’ll have to study Einstein’s general theory of relativity. But this post will already cover some basics.

The quantum of action and natural units

We can re-write the quantum of action in natural units, which we’ll call Planck units for the time being. They may or may not be the Planck units you’ve heard about, so just think of them as being fundamental, or natural—for the time being, that is. You’ll wonder: what’s natural? What’s fundamental? Well… That’s the question we’re trying to explore in this post, so… Well… Just be patient… 🙂 We’ll denote those natural units as FP, lP, and tP, i.e. the Planck force, Planck distance and Planck time unit respectively. Hence, we write:

ħ = FPlP∙tP

Note that FP, lP, and tP are expressed in our old-fashioned SI units here, i.e. in newton (N), meter (m) and seconds (s) respectively. So FP, lP, and tP have a numerical value as well as a dimension, just like ħ. They’re not just numbers. If we’d want to be very explicit, we could write: FP = FP [force], or FP = FP N, and you could do the same for lP and tP. However, it’s rather tedious to mention those dimensions all the time, so I’ll just assume you understand the symbols we’re using do not represent some dimensionless number. In fact, that’s what distinguishes physical constants from mathematical constants.

Dimensions are also what distinguishes physics equations from purely mathematical ones: an equation in physics will always relate some physical quantities and, hence, when you’re dealing with physics equations, you always need to wonder about the dimensions. [Note that the term ‘dimension’ has many definitions… But… Well… I suppose you know what I am talking about here, and I need to move on. So let’s do that.] Let’s re-write that ħ = FPlP∙tP formula as follows: ħ/tP = FPlP.

FPlP is, obviously, a force times a distance, so that’s energy. Please do check the dimensions on the left-hand side as well: [ħ/tP] = [ħ]/[tP] = (N·m·s)/s = N·m. In short, we can think of EP = FPlP = ħ/tP as being some ‘natural’ unit as well. But what would it correspond to—physically? What is its meaning? We may be tempted to call it the quantum of energy that’s associated with our quantum of action, but… Well… No. While it’s referred to as the Planck energy, it’s actually a rather large unit, and so… Well… No. We should not think of it as the quantum of energy. We have a quantum of action but no quantum of energy. Sorry. Let’s move on.

In the same vein, we can re-write the ħ = FPlP∙tP as ħ/lP = FP∙tP. Same thing with the dimensions—or ‘same-same but different’, as they say in Asia: [ħ/lP] = [FP∙tP] = (N·m·s)/m = N·s. Force times time is momentum and, hence, we may now be tempted to think of pP = FP∙tP = ħ/lP as the quantum of momentum that’s associated with ħ, but… Well… No. There’s no such thing as a quantum of momentum. Not now in any case. Maybe later. 🙂 But, for now, we only have a quantum of action. So we’ll just call ħ/lP = FP∙tP the Planck momentum for the time being.

So now we have two ways of looking at the dimension of Planck’s constant:

  1. [Planck’s constant] = N∙m∙s = (N∙m)∙s = [energy]∙[time]
  2. [Planck’s constant] = N∙m∙s = (N∙s)∙m = [momentum]∙[distance]

In case you didn’t get this from what I wrote above: the brackets here, i.e. the [ and ] symbols, mean: ‘the dimension of what’s between the brackets’. OK. So far so good. It may all look like kids’ stuff – it actually is kids’ stuff so far – but the idea is quite fundamental: we’re thinking here of some amount of action (h or ħ, to be precise, i.e. the quantum of action) expressing itself in time or, alternatively, expressing itself in space. In the former case, some amount of energy is expended during some time. In the latter case, some momentum is expended over some distance.
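If you want to see that bookkeeping spelled out, here is a minimal sketch in Python. [The exponent-dictionary convention is just something I improvised for this post. Nothing standard.]

```python
# A minimal dimensional-bookkeeping sketch: units as {symbol: exponent} dicts.
def mul(u1, u2):
    """Multiply two units given as {symbol: exponent} dicts."""
    out = dict(u1)
    for sym, exp in u2.items():
        out[sym] = out.get(sym, 0) + exp
    return {sym: exp for sym, exp in out.items() if exp != 0}

N, m, s = {'N': 1}, {'m': 1}, {'s': 1}

energy   = mul(N, m)   # [E] = N·m (force times distance)
momentum = mul(N, s)   # [p] = N·s (force times time)

# Both products yield the dimension of action, N·m·s:
assert mul(energy, s) == mul(momentum, m) == {'N': 1, 'm': 1, 's': 1}
print("[E]·[t] = [p]·[x] = N·m·s")
```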

Of course, ideally, we should try to think of action expressing itself in space and time simultaneously, so we should think of it as expressing itself in spacetime. In fact, that’s what the so-called Principle of Least Action in physics is all about—but I won’t dwell on that here, because… Well… It’s not an easy topic, and the diversion would lead us astray. 🙂 What we will do, however, is apply the idea above to the two de Broglie relations: E = ħω and p = ħk. I assume you know these relations by now. If not, just check one of my many posts on them. Let’s see what we can do with them.

The de Broglie relations

We can re-write the two de Broglie relations as ħ = E/ω and ħ = p/k. We can immediately derive an interesting property here:

ħ/ħ = 1 = (E/ω)/(p/k) ⇔ E/p = ω/k

So the ratio of the energy and the momentum is equal to the wave velocity. What wave velocity? The group or the phase velocity? We’re talking an elementary wave here, so both are the same: we have only one E and p, and, hence, only one ω and k. The E/p = ω/k identity underscores the following point: the de Broglie equations are a pair of equations here, and one of the key things to learn when trying to understand quantum mechanics is to think of them as an inseparable pair—like an inseparable twin really—as the quantum of action incorporates both a spatial as well as a temporal dimension. Just think of what Minkowski wrote back in 1907, shortly after he had re-formulated Einstein’s special relativity theory in terms of four-dimensional spacetime, and just two years before he died—unexpectedly—from an appendicitis: “Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.”

So we should try to think of what that union might represent—and that surely includes looking at the de Broglie equations as a pair of matter-wave equations. Likewise, we should also think of the Uncertainty Principle as a pair of equations: ΔpΔx ≥ ħ/2 and ΔEΔt ≥ ħ/2—but I’ll come back to those later.

The ω in the E = ħω equation and the argument (θ = kx – ωt) of the wavefunction is a frequency in time (or temporal frequency). It’s a frequency expressed in radians per second. You get one radian by dividing one cycle by 2π. In other words, we have 2π radians in one cycle. So ω is related to the frequency you’re used to, i.e. f—the frequency expressed in cycles per second (i.e. hertz): we multiply f by 2π to get ω. So we can write: E = ħω = ħ∙2π∙f = h∙f, with h = ħ∙2π (or ħ = h/2π).

Likewise, the k in the p = ħk equation and the argument (θ = kx – ωt) of the wavefunction is a frequency in space (or spatial frequency). Unsurprisingly, it’s expressed in radians per meter.
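Just to make these conversions concrete, here is a small numerical sketch. [The 500 nm wavelength, i.e. green light, is just an illustrative pick of mine.]

```python
import math

lam = 500e-9               # illustrative wavelength in m (green light)
c   = 2.998e8              # speed of light in m/s

f     = c / lam            # frequency in cycles per second (Hz)
omega = 2 * math.pi * f    # temporal frequency in radians per second
k     = 2 * math.pi / lam  # spatial frequency (wavenumber) in radians per meter

print(f"f = {f:.3e} Hz, omega = {omega:.3e} rad/s, k = {k:.3e} rad/m")
print(f"omega/k = {omega/k:.3e} m/s")  # = c, as it should be
```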

At this point, it’s good to properly define the radian as a unit in quantum mechanics. We often think of a radian as some distance measured along the circumference, because of the way the unit is constructed (see the illustration below), but that’s right and wrong at the same time. In fact, it’s more wrong than right: the radian is an angle that’s defined using the length of the radius of the unit circle but, when everything is said and done, it’s a unit used to measure some angle—not a distance. That should be obvious from the 2π rad = 360 degrees identity. The angle here is the argument of our wavefunction in quantum mechanics, and that argument combines both time (t) as well as distance (x): θ = kx – ωt = k(x – c∙t). So our angle (the argument of the wavefunction) integrates both dimensions: space as well as time. If you’re not convinced, just do the dimensional analysis of the kx – ωt expression: both the kx and ωt products yield a dimensionless number—or… Well… To be precise, I should say: an angle expressed in radians. That angle determines the real and imaginary part of the wavefunction. Hence, it’s a dimensionless number—but that does not imply it is just some meaningless number. It’s not meaningless at all—obviously!

[Illustration: the radian, constructed from the radius of a circle]

Let me try to present what I wrote above in yet another way. The θ = kx – ωt = (p/ħ)·x − (E/ħ)·t equation suggests a fungibility: the wavefunction itself also expresses itself in time and/or in space, so to speak—just like the quantum of action. Let me be precise: the p·x factor in the (p/ħ)·x term represents momentum (whose dimension is N·s) being expended over a distance, while the E·t factor in the (E/ħ)·t term represents energy (expressed in N·m) being expended over some time. [As for the minus sign in front of the (E/ħ)·t term, that’s got to do with the fact that the arrow of time points in one direction only while, in space, we can go in either direction: forward or backwards.] Hence, the expression for the argument tells us that both are essentially fungible—which suggests they’re aspects of one and the same thing. So that’s what Minkowski’s intuition is all about: spacetime is one, and the wavefunction just connects the physical properties of whatever it is that we are observing – an electron, or a photon, or whatever other elementary particle – to it.

Of course, the corollary to thinking of unified spacetime is thinking of the real and imaginary part of the wavefunction as one—which we’re supposed to do as a complex number is… Well… One complex number. But that’s easier said than actually done, of course. One way of thinking about the connection between the two spacetime ‘dimensions’ – i.e. t and x, with x actually incorporating three spatial dimensions in space in its own right (see how confusing the term ‘dimension’ is?) – and the two ‘dimensions’ of a complex number is going from Cartesian to polar coordinates, and vice versa. You now think of Euler’s formula, of course – if not, you should – but let me insert something more interesting here. 🙂 I took it from Wikipedia. It illustrates how a simple sinusoidal function transforms as we go from Cartesian to polar coordinates.

[Illustration: a sinusoidal function plotted in Cartesian and in polar coordinates (from Wikipedia)]

Interesting, isn’t it? Think of the horizontal and vertical axis in the Cartesian space as representing time and… Well… Space indeed. 🙂 The function connects the space and time dimension and might, for example, represent the trajectory of some object in spacetime. Admittedly, it’s a rather weird trajectory, as the object goes back and forth in some box in space, and accelerates and decelerates all of the time, reversing its direction in the process… But… Well… Isn’t that how we think of an electron moving in some orbital? 🙂 With that in mind, look at what the same movement in spacetime looks like in polar coordinates. It’s also some movement in a box—but both the ‘horizontal’ and ‘vertical’ axis (think of these axes as the real and imaginary part of a complex number) are now delineating our box. So, whereas our box is a one-dimensional box in spacetime only (our object is constrained in space, but time keeps ticking), it’s a two-dimensional box in our ‘complex’ space. Isn’t it just nice to think about stuff this way?

As far as I am concerned, it triggers the same profound thoughts as that E/p = ω/k relation. The left-hand side is a ratio between energy and momentum. Now, one way to look at energy is that it’s action per time unit. Likewise, momentum is action per distance unit. Of course, ω is expressed as some quantity (expressed in radians, to be precise) per time unit, and k is some quantity (again, expressed in radians) per distance unit. Because this is a physical equation, the dimension of both sides of the equation has to be the same—and, of course, it is the same: the action dimension in the numerator and denominator of the ratio on the left-hand side of the E/p = ω/k equation cancels out. But… What? Well… Wouldn’t it be nice to think of the dimension of the argument of our wavefunction as being the dimension of action, rather than thinking of it as just some mathematical thing, i.e. an angle? I like to think the argument of our wavefunction is more than just an angle. When everything is said and done, it has to be something physical—if only because the wavefunction describes something physical. But… Well… I need to do some more thinking on this, so I’ll just move on here. Otherwise this post risks becoming a book in its own right. 🙂

Let’s get back to the topic we were discussing here. We were talking about natural units. More in particular, we were wondering: what’s natural? What does it mean?

Back to Planck units

Let’s start with time and distance. We may want to think of lP and tP as the smallest distance and time units possible—so small, in fact, that both distance and time become countable variables at that scale.

Huh? Yes. I am sure you’ve never thought of time and distance as countable variables, but I’ll come back to this rather particular interpretation of the Planck length and time unit later. So don’t worry about it now: just make a mental note of it. The thing is: if tP and lP are the smallest time and distance units possible, then the smallest cycle we can possibly imagine will be associated with those two units: we write: ωP = 1/tP and kP = 1/lP. What’s the ‘smallest possible’ cycle? Well… Not sure. You should not think of some oscillation in spacetime just yet. Just think of a cycle. Whatever cycle. So, as for now, the smallest cycle is just the cycle you’d associate with the smallest time and distance units possible—so we cannot write ωP = 2/tP, for example, because that would imply we can imagine a time unit that’s smaller than tP, as we could then associate two cycles with tP.

OK. Next step. We can now define the Planck energy and the Planck momentum using the de Broglie relations:

EP = ħ∙ωP = ħ/tP  and pP = ħ∙kP = ħ/lP

You’ll say that I am just repeating myself here, as I’ve given you those two equations already. Well… Yes and no. At this point, you should raise the following question: why are we using the angular frequency (ωP = 2π·fP) and the reduced Planck constant (ħ = h/2π), rather than fP or h?

That’s a great question. In fact, it begs the question: what’s the physical constant really? We have two mathematical constants – ħ and h – but they must represent the same physical reality. So is one of the two constants more real than the other? The answer is unambiguously: yes! The Planck energy is defined as EP = ħ/tP = (h/2π)/tP, so we cannot write this as EP = h/tP. The difference is that 1/2π factor, and it’s quite fundamental, as it implies we’re actually not associating a full cycle with tP and lP but a radian of that cycle only.

Huh? Yes. It’s a rather strange observation, and I must admit I haven’t quite sorted out what this actually means. The fundamental idea remains the same, however: we have a quantum of action, ħ (not h!), that can express itself as energy over the smallest time unit possible or, alternatively, that expresses itself as momentum over the smallest distance unit possible. In the former case, we write it as EP = FPlP = ħ/tP. In the latter, we write it as pP = FP∙tP = ħ/lP. Both are aspects of the same reality, though, as our particle moves in space as well as in time, i.e. it moves in spacetime. Hence, one step in space, or in time, corresponds to one radian. Well… Sort of… Not sure how to further explain this. I probably shouldn’t try anyway. 🙂

The more fundamental question is: with what speed is it moving? That question brings us to the next point. The objective is to get some specific value for lP and tP, so how do we do that? How can we determine these two values? Well… That’s another great question. 🙂

The first step is to relate the natural time and distance units to the wave velocity. Now, we do not want to complicate the analysis and so we’re not going to throw in some rest mass or potential energy here. No. We’ll be talking a theoretical zero-mass particle. So we’re not going to consider some electron moving in spacetime, or some other elementary particle. No. We’re going to think about some zero-mass particle here, or a photon. [Note that a photon is not just a zero-mass particle. It’s similar but different: in one of my previous posts, I showed a photon packs more energy, as you get two wavefunctions for the price of one, so to speak. However, don’t worry about the difference here.]

Now, you know that the wave velocity for a zero-mass particle and/or a photon is equal to the speed of light. To be precise, the wave velocity of a photon is the speed of light and, hence, the speed of any zero-mass particle must be the same—as per the definition of mass in Einstein’s special relativity theory. So we write: lP/tP = c ⇔ lP = c∙tP and tP = lP/c. In fact, we also get this from dividing EP by pP, because we know that E/p = c, for any photon (and for any zero-mass particle, really). So we know that EP/pP must also equal c. We can easily double-check that by writing: EP/pP = (ħ/tP)/(ħ/lP) = lP/tP = c. Substitution in ħ = FPlP∙tP yields ħ = c∙FP∙tP2 or, alternatively, ħ = FPlP2/c. So we can now write FP as a function of lP and/or tP:

FP = ħ∙c/lP2 = ħ/(c∙tP2)

We can quickly double-check this by dividing FP = ħ∙c/lP2 by FP = ħ/(c∙tP2). We get: 1 = c2∙tP2/lP2 ⇔ lP2/tP2 = c2 ⇔ lP/tP = c.
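If you prefer a symbolic double-check, here is a little sympy sketch of that system of equations. [sympy is just my tool of choice here; the post doesn’t depend on it.]

```python
import sympy as sp

# From h-bar = F·l·t and l = c·t, express the Planck force F in terms of l or t.
F, l, t, c, hbar = sp.symbols('F l t c hbar', positive=True)

sol = sp.solve([sp.Eq(hbar, F * l * t), sp.Eq(l, c * t)], [F, t], dict=True)[0]
print(sol[F])                 # c*hbar/l**2, i.e. F_P = ħ·c/l_P²
print(sol[F].subs(l, c * t))  # hbar/(c*t**2), i.e. F_P = ħ/(c·t_P²)
```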

Nice. However, this does not uniquely define FP, lP, and tP. The problem is that we’ve got only two equations (ħ = FPlP∙tP and lP/tP = c) for three unknowns (FP, lP, and tP). Can we throw in one or both of the de Broglie equations to get some final answer?

I wish that would help, but it doesn’t—because we get the same ħ = FPlP∙tP equation. Indeed, we’re just re-defining the Planck energy (and the Planck momentum) by that EP = ħ/tP (and pP = ħ/lP) equation here, and so that does not give us a value for EP (and pP). So we’re stuck. We need some other formula so we can calculate the third unknown, which is the Planck force unit (FP). What formula could we possibly choose?

Well… We got a relationship by imposing the condition that lP/tP = c, which implies that if we’d measure the velocity of a photon in Planck time and distance units, we’d find that its velocity is one, so c = 1. Can we think of some similar condition involving ħ? The answer is: we can and we can’t. It’s not so simple. Remember we were thinking of the smallest cycle possible? We said it was small because tP and lP were the smallest units we could imagine. But how do we define that? The idea is as follows: the smallest possible cycle will pack the smallest amount of action, i.e. h (or, expressed per radian rather than per cycle, ħ).

Now, we usually think of packing energy, or momentum, instead of packing action, but that’s because… Well… Because we’re not good at thinking the way Minkowski wanted us to think: we’re not good at thinking of some kind of union of space and time. We tend to think of something moving in space, or, alternatively, of something moving in time—rather than something moving in spacetime. In short, we tend to separate dimensions. So that’s why we’d say the smallest possible cycle would pack an amount of energy that’s equal to EP = ħ∙ωP = ħ/tP, or an amount of momentum that’s equal to pP = ħ∙kP = ħ/lP. But both are part and parcel of the same reality, as evidenced by the E = m∙c2 = m∙c∙c = p∙c equality. [This equation only holds for a zero-mass particle (and a photon), of course. It’s a bit more complicated when we’d throw in some rest mass, but we can do that later. Also note I keep repeating my idea of the smallest cycle, but we’re talking radians of a cycle, really.]

So we have that mass-energy equivalence, which is also a mass-momentum equivalence according to that E = m∙c2 = m∙c∙c = p∙c formula. And so now the gravitational force comes into play: there’s a limit to the amount of energy we can pack into a tiny space. Or… Well… Perhaps there’s no limit—but if we pack an awful lot of energy into a really tiny speck of space, then we get a black hole.

However, we’re getting a bit ahead of ourselves here, so let’s first try something else. Let’s throw in the Uncertainty Principle.

The Uncertainty Principle

As mentioned above, we can think of some amount of action expressing itself over some time or, alternatively, over some distance. In the former case, some amount of energy is expended over some time. In the latter case, some momentum is expended over some distance. That’s why the energy and time variables, and the momentum and distance variables, are referred to as complementary. It’s hard to think of both things happening simultaneously (whatever that means in spacetime), but we should try! Let’s now look at the Uncertainty relations once again (I am writing uncertainty with a capital U out of respect—as it’s very fundamental, indeed!):

ΔpΔx ≥ ħ/2 and ΔEΔt ≥ ħ/2.

Note that the ħ/2 factor on the right-hand side quantifies the uncertainty, while the left-hand sides of the two equations (ΔpΔx and ΔEΔt) are just expressions of that fundamental uncertainty. In other words, we have two equations (a pair), but there’s only one fundamental uncertainty, and it’s an uncertainty about a movement in spacetime. Hence, that uncertainty expresses itself in both time as well as in space.

Note the use of ħ rather than h, and the fact that the 1/2 factor makes it look like we’re splitting ħ over ΔpΔx and ΔEΔt respectively—which is actually a quite sensible explanation of what this pair of equations actually represents. Indeed, we can add both relations to get the following sum:

ΔpΔx + ΔEΔt ≥ ħ/2 + ħ/2 = ħ

Interesting, isn’t it? It explains that 1/2 factor which troubled us when playing with the de Broglie relations.

Let’s now think about our natural units again—about lP, and tP in particular. As mentioned above, we’ll want to think of them as the smallest distance and time units possible: so small, in fact, that both distance and time become countable variables, so we count x and t as 0, 1, 2, 3 etcetera. We may then imagine that the uncertainty in x and t is of the order of one unit only, so we write Δx = lP and Δt = tP. So we can now re-write the uncertainty relations as:

  • Δp·lP = ħ/2
  • ΔE·tP = ħ/2

Hey! Wait a minute! Do we have a solution for the value of lP and tP here? What if we equate the natural energy and momentum units to ΔE and Δp here? Well… Let’s try it. First note that we may think of the uncertainty in t, or in x, as being equal to plus or minus one unit, i.e. ±1. So the uncertainty is two units really. [Frankly, I just want to get rid of that 1/2 factor here.] Hence, we can re-write the ΔpΔx = ΔEΔt = ħ/2 equations as:

  • ΔpΔx = pPlP = FP∙tPlP = ħ
  • ΔEΔt = EP∙tP = FPlP∙tP = ħ

Hmm… What can we do with this? Nothing much, unfortunately. We’ve got the same problem: we need a value for FP (or for pP, or for EP) to get some specific value for lP and tP, so we’re stuck once again. We have three variables and two equations only, so we have no specific value for either of them. 😦

What to do? Well… I will give you the answer now—the answer you’ve been waiting for, really—but not the technicalities of it. There’s a thing called the Schwarzschild radius, aka the gravitational radius. Let’s analyze it.

The Schwarzschild radius and the Planck length

The Schwarzschild radius is just the radius of a black hole. Its formal definition is the following: it is the radius of a sphere such that, if all the mass of an object were to be compressed within that sphere, the escape velocity from the surface of the sphere would equal the speed of light (c). The formula for the Schwarzschild radius is the following:

RS = 2m·G/c2

G is the gravitational constant here: G ≈ 6.674×10−11 N⋅m2/kg2. [Note that Newton’s F = m·a law tells us that 1 kg = 1 N·s2/m, as we’ll need to substitute units later.]

But what is the mass (m) in that RS = 2m·G/c2 equation? Using equivalent time and distance units (so c = 1), we wrote the following for a zero-mass particle and for a photon respectively:

  • E = m = p = ħ/2 (zero-mass particle)
  • E = m = p = ħ (photon)

How can a zero-mass particle, or a photon, have some mass? Well… Because it moves at the speed of light. I’ve talked about that before, so I’ll just refer you to my post on that. Of course, the dimension of the right-hand side of these equations (i.e. ħ/2 or ħ) has to be the same as the dimension on the left-hand side, so the ‘ħ’ in the E = ħ equation (or the E = ħ/2 equation) is a different ‘ħ’ than the one in the p = ħ equation (or the p = ħ/2 equation). So we must be careful here. Let’s write it all out, so as to remind ourselves of the dimensions involved:

  • E [N·m] = ħ [N·m·s/s] = EP = FPlP∙tP/tP
  • p [N·s] = ħ [N·m·s/m] = pP = FPlP∙tP/lP

Now, let’s check this by cheating. I’ll just give you the numerical values—even if we’re not supposed to know them at this point—so you can see I am not writing nonsense here:

  • EP = 1.0545718×10−34 N·m·s/(5.391×10−44 s) = (1.21×1044 N)·(1.6162×10−35 m) = 1.9561×109 N·m
  • pP = 1.0545718×10−34 N·m·s/(1.6162×10−35 m) = (1.21×1044 N)·(5.391×10−44 s) = 6.52485 N·s

You can google the Planck units, and you’ll see I am not taking you for a ride here. 🙂

The associated Planck mass is mP = EP/c2 = 1.9561×109 N·m/(2.998×108 m/s)2 = 2.17651×10−8 N·s2/m = 2.17651×10−8 kg. So let’s plug that value into the RS = 2m·G/c2 equation. We get:

RS = 2m·G/c2 = [(2.17651×10−8 kg)·(6.674×10−11 N⋅m2/kg2)]/(8.988×1016 m2·s−2)

= 1.6162×10−35 kg·N⋅m2·kg−2·m−2·s2 = 1.6162×10−35 N·s2/kg = 1.6162×10−35 m = lP

[Note that I am quietly dropping the factor 2 in the actual calculation here. I’ll come back to that below.]

Bingo! You can look it up: 1.6162×10−35 m is the Planck length indeed, so the Schwarzschild radius that’s associated with the Planck mass is the Planck length. We can now easily calculate the other Planck units:

  • tP = lP/c = 1.6162×10−35 m/(2.998×108 m/s) = 5.391×10−44 s
  • FP = ħ/(lP∙tP) = (1.0545718×10−34 N·m·s)/[(1.6162×10−35 m)·(5.391×10−44 s)] = 1.21×1044 N

Bingo again! 🙂
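If you want to check the whole chain yourself, here is a quick numerical sketch. It just reproduces the arithmetic above, with the factor 2 dropped from the Schwarzschild formula, as noted.

```python
# Numerical double-check of the chain above (SI units throughout).
# Note we use m·G/c², not 2m·G/c², as flagged in the text.
hbar = 1.0545718e-34   # reduced Planck constant (N·m·s)
c    = 2.998e8         # speed of light (m/s)
G    = 6.674e-11       # gravitational constant (N·m²/kg²)

E_P = 1.9561e9         # Planck energy (N·m), as quoted above
m_P = E_P / c**2       # Planck mass (kg)

l_P = m_P * G / c**2   # 'Schwarzschild radius' without the factor 2 (m)
t_P = l_P / c          # Planck time (s)
F_P = hbar / (l_P * t_P)  # Planck force (N)

print(f"m_P = {m_P:.5e} kg")   # ≈ 2.17651e-8 kg
print(f"l_P = {l_P:.4e} m")    # ≈ 1.6162e-35 m
print(f"t_P = {t_P:.4e} s")    # ≈ 5.391e-44 s
print(f"F_P = {F_P:.3e} N")    # ≈ 1.21e44 N
```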

[…] But… Well… Look at this: we’ve been cheating all the way. First, we just gave you that formula for the Schwarzschild radius. It looks like an easy formula but its derivation involves a profound knowledge of general relativity theory. So we’d need to learn about tensors and what have you. The formula is, in effect, a solution to what is known as Einstein’s field equations, and that’s pretty complicated stuff.

However, my crime is much worse than that: I also gave you those numerical values for the Planck energy and momentum, rather than calculating them. I just couldn’t calculate them with the knowledge we have so far. When everything is said and done, we have more than three unknowns. We’ve got five in total, including the Planck charge (qP) and, hence, we need five equations. Again, I’ll just copy them from Wikipedia, because… Well… What we’re discussing here is way beyond the undergraduate physics stuff that we’ve been presenting so far. The equations are the following. Just have a look at them and move on. 🙂

[Table: the five Planck units and their defining equations, copied from Wikipedia]

Finally, I should note one more thing: I did not use 2m but m in Schwarzschild’s formula. Why? Well… I have no good answer to that. I did it to ensure I got the result we wanted to get. It’s that 1/2 factor again. In fact, the E = m = p = ħ/2 equation is the correct one to use, and all would come out alright if we did that and defined the magnitude of the uncertainty as one unit only—but so we used the E = m = p = ħ formula instead, i.e. the equation that’s associated with a photon. You can re-do the calculations as an exercise: you’ll see it comes out alright.

Just to make things somewhat more real, let me note that the Planck energy is very substantial: 1.9561×109 N·m ≈ 2×109 J is equivalent to the energy that you’d get out of burning 60 liters of gasoline—or the mileage you’d get out of 16 gallons of fuel! In short, it’s huge, and so we’re packing that into an unimaginably small space. To understand how that works, you can think of the E = h∙f ⇔ h = E/f relation once more. The h = E/f ratio implies that energy and frequency are directly proportional to each other, with h the coefficient of proportionality. Shortening the wavelength amounts to increasing the frequency and, hence, the energy. So, as you think of our cycle becoming smaller and smaller, until it becomes the smallest cycle possible, you should think of the frequency becoming unimaginably large. Indeed, as I explained in one of my other posts on physical constants, we’re talking the 1043 Hz scale here. However, we can always choose our time unit such that we measure the frequency as one cycle per time unit. Because the energy per cycle remains the same, it means the quantum of action (ħ = FPlP∙tP) expresses itself over extremely short time spans, which means the EP = FPlP product becomes huge, as we’ve shown above. The rest of the story is the same: gravity comes into play, and so our little blob in spacetime becomes a tiny black hole. Again, we should think of both space and time: they are joined in ‘some kind of union’ here, indeed, as they’re intimately connected through the wavefunction, which travels at the speed of light.

The wavefunction as an oscillation in and of spacetime

OK. Now I am going to present the big idea I started with. Let me first ask you a question: when thinking about the Planck-Einstein relation (I am talking about the E = ħ∙ω relation for a photon here, rather than the equivalent de Broglie equation for a matter-particle), aren’t you struck by the fact that the energy of a photon depends on the frequency of the electromagnetic wave only? I mean… It does not depend on its amplitude. The amplitude is mentioned nowhere. The amplitude is fixed, somehow—or considered to be fixed.

Isn’t that strange? I mean… For any other physical wave, the energy would not only depend on the frequency but also on the amplitude of the wave. For a photon, however, it’s just the frequency that counts. Light of the same frequency but higher intensity (read: more energy) is not a photon with higher amplitude, but just more photons. So it’s the photons that add up somehow, and so that explains the amplitude of the electric and magnetic field vectors (i.e. E and B) and, hence, the intensity of the light. However, every photon considered separately has the same amplitude apparently. We can only increase its energy by increasing the frequency. In short, ω is the only variable here.

Let’s look at that angular frequency once more. As you know, it’s expressed in radians per second but, if you divide ω by 2π, you get the frequency you’re probably more acquainted with: f = ω/2π, i.e. the frequency expressed in cycles per second. The Planck-Einstein relation is then written as E = h∙f. That’s easy enough. But what if we’d change the time unit here? For example, what if our time unit becomes the time that’s needed for a photon to travel one meter? Let’s examine it.

Let’s denote that time unit by tm, so we write: 1 tm = (1/c) s ⇔ 1 s = c tm, with c ≈ 3×108. The frequency, as measured using our new time unit, changes, obviously: we have to divide its former value by c now. So, using our little subscript once more, we could write: fm = f/c. [Why? Just add the dimension to make things more explicit: f s−1 = (f/c) tm−1, because 1 s−1 = (1/c) tm−1.] But the energy of the photon should not depend on our time unit, should it?

Don’t worry. It doesn’t: the numerical value of Planck’s constant (h) would also change, as we’d replace the second in its dimension (N∙m∙s) by c times our new time unit tm. However, Planck’s constant remains what it is: some physical constant. It does not depend on our measurement units: we can use the SI units, or the Planck units (FP, lP, and tP), or whatever unit you can think of. It doesn’t matter: h (or ħ = h/2π) is what it is—it’s the quantum of action, and so that’s a physical constant (as opposed to a mathematical constant) that’s associated with one cycle.
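Here is a quick sketch to convince yourself. [The frequency value is an illustrative pick of mine, roughly that of visible light.]

```python
h = 6.62607e-34   # Planck constant in N·m·s
c = 2.998e8       # speed of light in m/s
f = 5.0e14        # illustrative frequency in cycles per second (visible light)

E_si = h * f      # photon energy in N·m, using seconds

# Switch to the time unit t_m = (1/c) s: the frequency divides by c,
# while the numerical value of h (now in N·m·t_m) multiplies by c.
f_m = f / c
h_m = h * c

print(E_si, h_m * f_m)  # the same energy: E does not depend on the time unit
```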

Now, I said we do not associate the wavefunction of a photon with an amplitude, but we do associate it with a wavelength. We do so using the standard formula for the velocity of a wave: c = f∙λ ⇔ λ = c/f. We can also write this using the angular frequency and the wavenumber: c = ω/k, with k = 2π/λ. We can double-check this, because we know that, for a photon, the following relation holds: E/p = c. Hence, using the E = ħ∙ω and p = ħ∙k relations, we get: (ħ∙ω)/(ħ∙k) = ω/k = c. So we have options here: h can express itself over a really long wavelength, or it can do so over an extremely short wavelength. We re-write p = ħ∙k as p = E/c = ħ∙2π/λ = h/λ ⇔ E = h∙c/λ ⇔ h∙c = E∙λ. We know this relationship: the energy and the wavelength of a photon (or an electromagnetic wave) are inversely proportional to each other.
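Numerically, that inverse proportionality looks as follows. [The two wavelengths are illustrative picks.]

```python
h = 6.62607e-34   # Planck constant in N·m·s
c = 2.998e8       # speed of light in m/s

for lam in (500e-9, 1e-12):   # illustrative wavelengths: green light, a gamma ray
    E = h * c / lam           # photon energy in N·m (joule)
    print(f"lambda = {lam:.1e} m -> E = {E:.3e} J, E*lambda = {E * lam:.3e} J·m")
# E·λ comes out as h·c ≈ 1.986e-25 J·m in both cases.
```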

Once again, we may want to think of the shortest wavelength possible. As λ gets a zillion times smaller, E gets a zillion times bigger. Is there a limit? There is. As I mentioned above, the gravitational force comes into play here: there’s a limit to the amount of energy we can pack into a tiny space. If we pack an awful lot of energy into a really tiny speck of space, then we get a black hole. In practical terms, that implies our photon can’t travel, as it can’t escape from the black hole it creates. That’s what that calculation of the Schwarzschild radius was all about.

We can—in fact, we should—now apply the same reasoning to the matter-wave. Instead of a photon, we should try to think of a zero-mass matter-particle. You’ll say: that’s a contradiction. Matter-particles – as opposed to force-carrying particles, like photons (or bosons in general) – must have some rest mass, so they can’t be massless. Well… Yes. You’re right. But we can throw the rest mass in later. I first want to focus on the abstract principles, i.e. the propagation mechanism of the matter-wave.

Using natural units, we know our particle will move in spacetime with velocity Δx/Δt = 1/1 = 1. Of course, it has to have some energy to move, or some momentum. We also showed that, if it’s massless, and the elementary wavefunction is ei[(p/ħ)x − (E/ħ)t], then we know the energy, and the momentum, has to be equal to ħ/2. Where does it get that energy, or momentum? Not sure. I like to think it borrows it from spacetime, as it breaks some potential barrier between those two points, and then it gives it back. Or, if it’s at point x = t = 0, then perhaps it gets it from some other massless particle moving from x = t = −1. In both cases, we’d like to think our particle keeps moving. So if the first description (borrowing) is correct, it needs to keep borrowing and returning energy in some kind of interaction with spacetime itself. If it’s the second description, it’s more like spacetime bumping itself forward.

In both cases, however, we’re actually trying to visualize (or should I say: imagine?) some oscillation of spacetime itself, as opposed to an oscillation in spacetime.

Huh? Yes. The idea is the following here: we like to think of the wavefunction as the dependent variable: both its real as well as its imaginary part are a function of x and t, indeed. But what if we’d think of x and t as dependent variables? In that case, the real and imaginary part of the wavefunction would be the independent variables. It’s just a matter of perspective. We sort of mirror our function: we switch its domain for its range, and its range for its domain, as shown below. It all makes sense, doesn’t it? Space and time appear as separate dimensions to us, but they’re intimately connected through c, ħ and the other fundamental physical constants. Likewise, the real and imaginary part of the wavefunction appear as separate dimensions, but they’re intimately connected through π and Euler’s number, i.e. through mathematical constants. That cannot be a coincidence: the mathematical and physical ‘space’ reflect each other through the wavefunction, just like the domain and range of a function reflect each other through that function. So physics and math must meet in some kind of union—at least in our mind, they do!

[Illustration: mirroring the wavefunction by switching its domain and its range]

So, yes, we can—and probably should—be looking at the wavefunction as an oscillation of spacetime, rather than as an oscillation in spacetime only. As mentioned in my introduction, I’ll need to study general relativity theory—and very much in depth—to convincingly prove that point, but I am sure it can be done.

You’ll probably think I am arrogant when saying that—and I probably am—but then I am very much emboldened by the fact that some nuclear scientist told me a photon doesn’t have any wavefunction: it’s just those E and B vectors, he said—and then I found out he was dead wrong, as I showed in my previous post! So I’d rather think more independently now. I’ll keep you guys posted on progress—but it will probably take a while to figure it all out. In the meanwhile, please do let me know your ideas on this. 🙂

Let me wrap up this little excursion with two small notes:

  • We have this E/c = p relation. The mass-energy equivalence relation implies momentum must also have an equivalent mass. If E = m∙c2, then p = m∙c ⇔ m = p/c. It’s obvious, but I just thought it would be useful to highlight this.
  • When we studied the ammonia molecule as a two-state system, our space was not a continuum: we allowed just two positions—two points in space, which we defined with respect to the system. So x was a discrete variable. We assumed time to be continuous, however, and so we got those nice sinusoids as a solution to our set of Hamiltonian equations. However, if we look at space as being discrete, or countable, we should probably think of time as being countable as well. So we should, perhaps, think of a particle being at point x = t = 0 first, and, then, being at point x = t = 1. Instead of the nice sinusoids, we get some boxcar function, as illustrated below, but probably varying between 0 and 1—or whatever other normalized values. You get the idea, I hope. 🙂

[Illustration: a boxcar function]

Post Scriptum on the Principle of Least Action: As noted above, the Principle of Least Action is not very intuitive, even if Feynman’s exposé of it is not as impenetrable as it may look at first. To put it simply, the Principle of Least Action says that the average kinetic energy less the average potential energy is as little as possible for the path of an object going from one point to another. So we have a path or line integral here. In a gravitational field, that integral is the following:

[Image: the action integral (average kinetic energy less average potential energy, integrated over time) in a gravitational field]

The integral is not all that important. Just note its dimension is the dimension of action indeed, as we multiply energy (the integrand) with time (dt). We can use the Principle of Least Action to re-state Newton’s Law, or whatever other classical law. Among other things, we’ll find that, in the absence of any potential, the trajectory of a particle will just be some straight line.

In quantum mechanics, however, we have uncertainty, as expressed in the ΔpΔx ≥ ħ/2 and ΔEΔt ≥ ħ/2 relations. Now, that uncertainty may express itself in time, or in distance, or in both. That’s where things become tricky. 🙂 I’ve written on this before, but let me copy Feynman himself here, with a more exact explanation of what’s happening (just click on the text to enlarge):

[Image: excerpt from Feynman’s Lectures on the Principle of Least Action]

The ‘student’ he speaks of above, is himself, of course. 🙂

Too complicated? Well… Never mind. I’ll come back to it later. 🙂


All you ever wanted to know about the photon wavefunction…

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you immediately read the more recent exposé on this matter, which you can find by clicking on the provided link.

Original post:

This post is, essentially, a continuation of my previous post, in which I juxtaposed the following images:

[Illustrations: an animation of the electric field vector (E) of an electromagnetic wave travelling through space (left), and a quantum-mechanical amplitude wavefunction (right)]

Both are the same, and then they’re not. The illustration on the right-hand side is a regular quantum-mechanical wavefunction, i.e. an amplitude wavefunction. You’ve seen that one before. In this case, the x-axis represents time, so we’re looking at the wavefunction at some particular point in space. [You know we can just switch the dimensions and it would all look the same.] The illustration on the left-hand side looks similar, but it’s not an amplitude wavefunction. The animation shows how the electric field vector (E) of an electromagnetic wave travels through space. Its shape is the same. So it’s the same function. Is it also the same reality?

Yes and no. And I would say: more no than yes—in this case, at least. Note that the animation does not show the accompanying magnetic field vector (B). That vector is equally essential in the electromagnetic propagation mechanism according to Maxwell’s equations, which—let me remind you—are equal to:

  1. ∂B/∂t = −∇×E
  2. ∂E/∂t = ∇×B

In fact, I should write the second equation as ∂E/∂t = c2∇×B, but then I assume we measure time and distance in equivalent units, so c = 1.

You know that E and B are two aspects of one and the same thing: if we have one, then we have the other. To be precise, B is always orthogonal to E, in the direction that’s given by the right-hand rule for the following vector cross-product: B = ex×E, with ex the unit vector pointing in the x-direction (i.e. the direction of propagation). The reality behind this is illustrated below for a linearly polarized electromagnetic wave.

[Illustration: the E and B vectors of a linearly polarized electromagnetic wave]

The B = ex×E equation is equivalent to writing B = i·E, which is equivalent to:

B = i·E = ei(π/2)·ei(kx − ωt) = cos(kx − ωt + π/2) + i·sin(kx − ωt + π/2)

= −sin(kx − ωt) + i·cos(kx − ωt)

Now, E and B have only two components: Ey and Ez, and By and Bz. That’s only because we’re looking at some ideal or elementary electromagnetic wave here but… Well… Let’s just go along with it. 🙂 It is then easy to prove that the equation above amounts to writing:

  1. By = cos(kx − ωt + π/2) = −sin(kx − ωt) = −Ez
  2. Bz = sin(kx − ωt + π/2) = cos(kx − ωt) = Ey

We should now think of Ey and Ez as the real and imaginary part of some wavefunction, which we’ll denote as ψE = ei(kx − ωt). So we write:

E = (Ey, Ez) = Ey + i·Ez = cos(kx − ωt) + i∙sin(kx − ωt) = Re(ψE) + i·Im(ψE) = ψE = ei(kx − ωt)

What about B? We just do the same, so we write:

B = (By, Bz) = By + i·Bz = ψB = i·E = i·ψE = −sin(kx − ωt) + i∙cos(kx − ωt) = −Im(ψE) + i·Re(ψE)

Now we need to prove that ψE and ψB are regular wavefunctions, which amounts to proving Schrödinger’s equation, i.e. ∂ψ/∂t = i·(ħ/m)·∇2ψ, for both ψE and ψB. [Note I use Schrödinger’s equation for a zero-mass spin-zero particle here, which uses the ħ/m factor rather than the ħ/(2m) factor.] To prove that ψE and ψB are regular wavefunctions, we should prove that:

  1. Re(∂ψE/∂t) =  −(ħ/m)·Im(∇2ψE) and Im(∂ψE/∂t) = (ħ/m)·Re(∇2ψE), and
  2. Re(∂ψB/∂t) =  −(ħ/m)·Im(∇2ψB) and Im(∂ψB/∂t) = (ħ/m)·Re(∇2ψB).

Let’s do the calculations for the second pair of equations. The time derivative on the left-hand side is equal to:

∂ψB/∂t = −iω·i·ei(kx − ωt) = ω·ei(kx − ωt) = ω·cos(kx − ωt) + i·ω·sin(kx − ωt)

The second-order derivative on the right-hand side is equal to:

∇2ψB = ∂2ψB/∂x2 = −i·k2·ei(kx − ωt) = k2·sin(kx − ωt) − i·k2·cos(kx − ωt)

So the two equations for ψB are equivalent to writing:

  1. Re(∂ψB/∂t) = −(ħ/m)·Im(∇2ψB) ⇔ ω·cos(kx − ωt) = k2·(ħ/m)·cos(kx − ωt)
  2. Im(∂ψB/∂t) = (ħ/m)·Re(∇2ψB) ⇔ ω·sin(kx − ωt) = k2·(ħ/m)·sin(kx − ωt)

So we see that both conditions are fulfilled if, and only if, ω = k2·(ħ/m).
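Here is a symbolic double-check of that condition. [A sympy sketch, using the zero-mass variant of Schrödinger’s equation from above.]

```python
import sympy as sp

# Does psi_B = i·e^(i(kx − ωt)) satisfy ∂ψ/∂t = i·(ħ/m)·∇²ψ when ω = k²·(ħ/m)?
x, t, k, w, hbar, m = sp.symbols('x t k omega hbar m', positive=True)

psi_B = sp.I * sp.exp(sp.I * (k * x - w * t))

residual = sp.diff(psi_B, t) - sp.I * (hbar / m) * sp.diff(psi_B, x, 2)
print(sp.simplify(residual.subs(w, hbar * k**2 / m)))  # 0: the condition checks out
```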

Now, we also demonstrated in that post of mine that Maxwell’s equations imply the following:

  1. ∂By/∂t = –(∇×E)y = ∂Ez/∂x = ∂[sin(kx − ωt)]/∂x = k·cos(kx − ωt) = k·Ey
  2. ∂Bz/∂t = –(∇×E)z = – ∂Ey/∂x = – ∂[cos(kx − ωt)]/∂x = k·sin(kx − ωt) = k·Ez

Hence, using those By = −Ez and Bz = Ey equations above, we can also calculate these derivatives as:

  1. ∂By/∂t = −∂Ez/∂t = −∂sin(kx − ωt)/∂t = ω·cos(kx − ωt) = ω·Ey
  2. ∂Bz/∂t = ∂Ey/∂t = ∂cos(kx − ωt)/∂t = −ω·[−sin(kx − ωt)] = ω·Ez

In other words, Maxwell’s equations imply that ω = k, which is consistent with us measuring time and distance in equivalent units: the phase velocity is c = 1 = ω/k.

So far, so good. We basically established that the propagation mechanism for an electromagnetic wave, as described by Maxwell’s equations, is fully coherent with the propagation mechanism—if we can call it like that—as described by Schrödinger’s equation. We also established the following equalities:

  1. ω = k
  2. ω = k2·(ħ/m)

The second of the two de Broglie equations tells us that k = p/ħ, so we can combine these two equations and re-write these two conditions as:

ω/k = 1 = k·(ħ/m) = (p/ħ)·(ħ/m) = p/m ⇔ p = m

What does this imply? The p here is the momentum: p = m·v, so this condition implies v must be equal to 1 too, so the wave velocity is equal to the speed of light. Makes sense, because we actually are talking light here. 🙂 In addition, because it’s light, we also know E/p = c = 1, so we have – once again – the general E = p = m equation, which we’ll need!

OK. Next. Let’s write the Schrödinger wave equation for both wavefunctions:

  1. ∂ψE/∂t = i·(ħ/mE)·∇2ψE, and
  2. ∂ψB/∂t = i·(ħ/mB)·∇2ψB.

Huh? What’s mE and mB? We should only associate one mass concept with our electromagnetic wave, shouldn’t we? Perhaps. I just want to be on the safe side now. Of course, if we distinguish mE and mB, we should probably also distinguish pE and pB, and EE and EB as well, right? Well… Yes. If we accept this line of reasoning, then the mass factor in Schrödinger’s equations is pretty much like the 1/c2 = μ0ε0 factor in Maxwell’s (1/c2)·∂E/∂t = ∇×B equation: the mass factor appears as a property of the medium, i.e. the vacuum here! [Just check my post on physical constants in case you wonder what I am trying to say here, in which I explain why and how c defines the (properties of the) vacuum.]

To be consistent, we should also distinguish pE and pB, and EE and EB, and so we should write ψE and ψB as:

  1. ψE = ei(kEx − ωEt), and
  2. ψB = ei(kBx − ωBt).

Huh? Yes. I know what you think: we’re talking one photon—or one electromagnetic wave—so there can be only one energy, one momentum and, hence, only one k, and one ω. Well… Yes and no. Of course, the following identities should hold: kE = kB and, likewise, ωE = ωB. So… Yes. They’re the same: one k and one ω. But then… Well… Conceptually, the two k’s and ω’s are different. So we write:

  1. pE = EE = mE, and
  2. pB = EB = mB.

The obvious question is: can we just add them up to find the total energy and momentum of our photon? The answer is obviously positive: E = EE + EB, p = pE + pB and m = mE + mB.

Let’s check a few things now. How does it work for the phase and group velocity of ψE and ψB? Simple:

  1. vg = ∂ωE/∂kE = ∂[EE/ħ]/∂[pE/ħ] = ∂EE/∂pE = ∂pE/∂pE = 1
  2. vp = ωE/kE = (EE/ħ)/(pE/ħ) = EE/pE = pE/pE = 1

So we’re fine, and you can check the result for ψB by substituting the subscript E for B. To sum it all up, what we’ve got here is the following:

  1. We can think of a photon having some energy that’s equal to E = p = m (assuming c = 1), but that energy would be split up in an electric and a magnetic wavefunction respectively: ψE and ψB.
  2. Schrödinger’s equation applies to both wavefunctions, but the E, p and m in those two wavefunctions are the same and not the same: their numerical value is the same (pE = EE = mE = pB = EB = mB), but they’re conceptually different. They must be: if not, we’d get a phase and group velocity for the wave that doesn’t make sense.

Of course, the phase and group velocity for the sum of the ψE and ψB waves must also be equal to c. This is obviously the case, because we’re adding waves with the same phase and group velocity c, so there’s no issue with the dispersion relation.

So let’s insert those pE = EE = mE = pB = EB = mB values in the two wavefunctions. For ψE, we get:

ψE = ei(kEx − ωEt) = ei[(pE/ħ)·x − (EE/ħ)·t]

You can do the calculation for ψB yourself. Let’s simplify our life a little bit and assume we’re using Planck units, so ħ = 1, and so the wavefunction simplifies to ψE = ei·(pE·x − EE·t). We can now add the components of E and B using the summation formulas for sines and cosines:

1. By + Ey = cos(pB·x − EB·t + π/2) + cos(pE·x − EE·t) = 2·cos[(p·x − E·t + π/2)/2]·cos(π/4) = √2·cos(p·x/2 − E·t/2 + π/4)

2. Bz + Ez = sin(pB·x − EB·t + π/2) + sin(pE·x − EE·t) = 2·sin[(p·x − E·t + π/2)/2]·cos(π/4) = √2·sin(p·x/2 − E·t/2 + π/4)

Interesting! We find a composite wavefunction for our photon which we can write as:

E + B = ψE + ψB = E + i·E = √2·ei(p·x/2 − E·t/2 + π/4) = √2·ei(π/4)·ei(p·x/2 − E·t/2) = √2·ei(π/4)·E

What a great result! It’s easy to double-check, because we can see the E + i·E = √2·ei(π/4)·E formula implies that 1 + i should equal √2·ei(π/4). Now that’s easy to prove, both geometrically (just do a drawing) or formally: √2·ei(π/4) = √2·cos(π/4) + i·√2·sin(π/4) = √2·(√2/2) + i·√2·(√2/2) = 1 + i. We’re bang on! 🙂
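And here is a two-line numerical sanity check. [numpy is just my tool of choice here.]

```python
import numpy as np

# Check that e^(iθ) + i·e^(iθ) = √2·e^(iπ/4)·e^(iθ) for a few arbitrary angles θ.
theta = np.array([0.0, 0.7, 2.1, 4.5])
lhs = np.exp(1j * theta) + 1j * np.exp(1j * theta)
rhs = np.sqrt(2) * np.exp(1j * np.pi / 4) * np.exp(1j * theta)
print(np.allclose(lhs, rhs))  # True: 1 + i = √2·e^(iπ/4) indeed
```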

We can double-check once more, because we should get the same from adding E and B = i·E, right? Let’s try:

E + B = E + i·E = cos(pE·x − EE·t) + i·sin(pE·x − EE·t) + i·cos(pE·x − EE·t) − sin(pE·x − EE·t)

= [cos(pE·x − EE·t) – sin(pE·x − EE·t)] + i·[sin(pE·x − EE·t) + cos(pE·x − EE·t)]

Indeed, we can see we’re going to obtain the same result, because the −sinθ in the real part of our composite wavefunction is equal to cos(θ+π/2), and the cosθ in its imaginary part is equal to sin(θ+π/2). So the sum above is the same sum of cosines and sines that we did already.

So our electromagnetic wavefunction, i.e. the wavefunction for the photon, is equal to:

ψ = ψE + ψB = √2·ei(p·x/2 − E·t/2 + π/4) = √2·ei(π/4)·ei(p·x/2 − E·t/2)

What about the √2 factor in front, and the π/4 term in the argument itself? Not sure. It must have something to do with the way the magnetic force works, which is not like the electric force. Indeed, remember the Lorentz formula: the force on some unit charge (q = 1) will be equal to F = E + v×B. So… Well… We’ve got another cross-product here and so the geometry of the situation is quite complicated: it’s not like adding two forces F1 and F2 to get some combined force F = F1 + F2.

In any case, we need the energy, and we know that it’s proportional to the square of the amplitude, so… Well… We’re spot on: the square of the √2 factor in the √2·cos product and √2·sin product is 2, so that’s twice… Well… What? Hold on a minute! We’re actually taking the absolute square of the E + B = ψE + ψB = E + i·E = √2·ei(p·x/2 − E·t/2 + π/4) wavefunction here. Is that legal? I must assume it is—although… Well… Yes. You’re right. We should do some more explaining here.

We know that we usually measure the energy as some definite integral, from t = 0 to some other point in time, or over the cycle of the oscillation. So what’s the cycle here? Our combined wavefunction can be written as √2·ei(p·x/2 − E·t/2 + π/4) = √2·ei(θ/2 + π/4), so a full cycle would correspond to θ going from 0 to 4π here, rather than from 0 to 2π. So that explains the √2 factor in front of our wavefunction.

Bingo! If you were looking for an interpretation of the Planck energy and momentum, here it is. 🙂 And, while everything that’s written above is not easy to understand, it’s close to the ‘intuitive’ understanding of quantum mechanics that we were looking for, isn’t it? The quantum-mechanical propagation model explains everything now. 🙂 I only need to show one more thing, and that’s the different behavior of bosons and fermions:

  1. The amplitudes of identical bosonic particles interfere with a positive sign, so we have Bose-Einstein statistics here. As Feynman writes it: (amplitude direct) + (amplitude exchanged).
  2. The amplitudes of identical fermionic particles interfere with a negative sign, so we have Fermi-Dirac statistics here: (amplitude direct) − (amplitude exchanged).

I’ll think about it. I am sure it’s got something to do with that B = i·E formula or, to put it simply, with the fact that, when bosons are involved, we get two wavefunctions (ψE and ψB) for the price of one. The reasoning should be something like this:

I. For a massless particle (i.e. a zero-mass fermion), our wavefunction is just ψ = ei(p·x − E·t). So we have no √2 or √2·ei(π/4) factor in front here. So we can just add any number of them – ψ1 + ψ2 + ψ3 + … – and then take the absolute square of the amplitude to find a probability density, and we’re done.

II. For a photon (i.e. a zero-mass boson), our wavefunction is √2·ei(π/4)·ei(p·x − E·t)/2, which – let’s introduce a new symbol – we’ll denote by φ, so φ = √2·ei(π/4)·ei(p·x − E·t)/2. Now, if we add any number of these, we get a similar sum but with that √2·ei(π/4) factor in front, so we write: φ1 + φ2 + φ3 + … = √2·ei(π/4)·(ψ1 + ψ2 + ψ3 + …). If we take the absolute square now, we’ll see the probability density will be equal to twice the density for the ψ1 + ψ2 + ψ3 + … sum, because

|√2·ei(π/4)·(ψ1 + ψ2 + ψ3 + …)|2 = |√2·ei(π/4)|2·|ψ1 + ψ2 + ψ3 + …|2 = 2·|ψ1 + ψ2 + ψ3 + …|2

So… Well… I still need to connect this to Feynman’s (amplitude direct) ± (amplitude exchanged) formula, but I am sure it can be done.
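Numerically, the doubling is easy to see. [A small numpy sketch with a few arbitrary illustrative amplitudes.]

```python
import numpy as np

# The √2·e^(iπ/4) prefactor doubles the probability density of any sum of amplitudes.
psis = np.array([0.3 + 0.4j, -0.1 + 0.2j, 0.5 - 0.3j])  # arbitrary illustrative amplitudes
s = psis.sum()

prefactor = np.sqrt(2) * np.exp(1j * np.pi / 4)
print(abs(prefactor * s)**2, 2 * abs(s)**2)  # the two numbers coincide
```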

Now, we haven’t tested the complete √2·ei(π/4)·ei(p·x − E·t)/2 wavefunction. Does it respect Schrödinger’s ∂ψ/∂t = i·(1/m)·∇2ψ or, including the 1/2 factor, the ∂ψ/∂t = i·[1/(2m)]·∇2ψ equation? [Note we assume, once again, that ħ = 1, so we use Planck units once more.] Let’s see. We can calculate the derivatives as:

  • ∂ψ/∂t = −√2·ei(π/4)·ei∙[p·x − E·t]/2·(i·E/2)
  • ∇2ψ = ∂2[√2·ei(π/4)·ei∙[p·x − E·t]/2]/∂x2 = ∂[√2·ei(π/4)·ei∙[p·x − E·t]/2·(i·p/2)]/∂x = −√2·ei(π/4)·ei∙[p·x − E·t]/2·(p2/4)

So Schrödinger’s equation becomes:

−√2·ei(π/4)·ei∙[p·x − E·t]/2·(i·E/2) = −i·(1/m)·√2·ei(π/4)·ei∙[p·x − E·t]/2·(p2/4) ⇔ E/2 = p2/(4m) ⇔ 1/2 = 1/4!?

That’s funny! It doesn’t work! The E and m and p2 are OK, because we’ve got that E = m = p equation, but we’ve got problems with yet another factor 2. It only works when we use the 2/m coefficient in Schrödinger’s equation.

So… Well… There’s no choice. That’s what we’re going to do. The Schrödinger equation for the photon is ∂ψ/∂t = i·(2/m)·∇2ψ !
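To make sure we didn’t fumble the arithmetic, here is a symbolic check of that modified equation. [A sympy sketch; the E = p = m substitution is the zero-mass condition we derived above, with ħ = 1.]

```python
import sympy as sp

# Check that ψ = √2·e^(iπ/4)·e^(i(p·x − E·t)/2) satisfies ∂ψ/∂t = i·(2/m)·∇²ψ
# under the zero-mass condition E = p = m (Planck units, ħ = 1).
x, t, E, p, m = sp.symbols('x t E p m', positive=True)

psi = sp.sqrt(2) * sp.exp(sp.I * sp.pi / 4) * sp.exp(sp.I * (p * x - E * t) / 2)

residual = sp.diff(psi, t) - sp.I * (2 / m) * sp.diff(psi, x, 2)
print(sp.simplify(residual.subs({E: m, p: m})))  # 0: the equation holds
```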

It’s a very subtle point. This is all great, and very fundamental stuff! Let’s now move on to Schrödinger’s actual equation, i.e. the ∂ψ/∂t = i·(ħ/2m)·∇2ψ equation.

Post scriptum on the Planck units:

If we measure time and distance in equivalent units, say seconds, we can re-write the quantum of action as:

1.0545718×10−34 N·m·s = (1.21×1044 N)·(1.6162×10−35 m)·(5.391×10−44 s)

⇔ (1.0545718×10−34/2.998×108) N·s2 = (1.21×1044 N)·(1.6162×10−35/2.998×108 s)(5.391×10−44 s)

⇔ (1.21×1044 N) = [(1.0545718×10−34/2.998×108)]/[(1.6162×10−35/2.998×108 s)(5.391×10−44 s)] N·s2/s2

You’ll say: what’s this? Well… Look at it. We’ve got a much easier formula for the Planck force—much easier than the standard formulas you’ll find on Wikipedia, for example. If we re-interpret the symbols ħ and c so they denote the numerical value of the quantum of action and the speed of light in standard SI units (i.e. newton, meter and second)—so ħ and c become dimensionless, or mathematical constants only, rather than physical constants—then the formula above can be written as:

FP newton = (ħ/c)/[(lP/c)·tP] newton ⇔ FP = ħ/(lP·tP)

Just double-check it: 1.0545718×10−34/(1.6162×10−35·5.391×10−44) = 1.21×1044. Bingo!
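
If you want to double-check the double-check, here’s the calculation as a few lines of Python, using the rounded SI values quoted above:

```python
hbar = 1.0545718e-34   # quantum of action (N·m·s)
l_P  = 1.6162e-35      # Planck length (m)
t_P  = 5.391e-44       # Planck time (s)

F_P = hbar / (l_P * t_P)      # the easier formula for the Planck force
print(f"F_P = {F_P:.3g} N")   # F_P = 1.21e+44 N
```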

You’ll say: what’s the point? The point is: our model is complete. We don’t need the other physical constants – i.e. the Coulomb, Boltzmann and gravitational constant – to calculate the Planck units we need, i.e. the Planck force, distance and time units. It all comes out of our elementary wavefunction! All we need to explain the Universe – or, let’s be more modest, quantum mechanics – is two numerical constants (c and ħ) and Euler’s formula (which uses π and e, of course). That’s it.

If you don’t think that’s a great result, then… Well… Then you’re not reading this. 🙂

The photon wavefunction

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you immediately read the more recent exposé on the matter presented here, which you can find by clicking on the provided link.

Original post:

In my previous posts, I juxtaposed the following images:

[Images: (1) the rotating electric field vector (E) of a circularly polarized electromagnetic wave traveling through space; (2) the wavefunction ei(kx − ωt) represented as a circular oscillation of its real and imaginary part]

Both are the same, and then they’re not. The illustration on the left-hand side shows how the electric field vector (E) of an electromagnetic wave travels through space, but it does not show the accompanying magnetic field vector (B), which is as essential in the electromagnetic propagation mechanism according to Maxwell’s equations:

  1. ∂B/∂t = –∇×E
  2. ∂E/∂t = c2∇×B = ∇×B for c = 1

The second illustration shows a wavefunction ei(kx − ωt) = cos(kx − ωt) + i∙sin(kx − ωt). Its propagation mechanism—if we can call it that—is Schrödinger’s equation:

∂ψ/∂t = i·(ħ/2m)·∇2ψ

We already drew attention to the fact that an equation like this models some flow. To be precise, the Laplacian on the right-hand side is the second derivative with respect to x here, and, therefore, expresses a flux density: a flow per unit surface area, i.e. per square meter. To be precise: the Laplacian represents the flux density of the gradient flow of ψ.

On the left-hand side of Schrödinger’s equation, we have a time derivative, so that’s a flow per second. The ħ/2m factor is like a diffusion constant. In fact, strictly speaking, that ħ/2m factor is a diffusion constant, because it does exactly the same thing as the diffusion constant D in the diffusion equation ∂φ/∂t = D·∇2φ, i.e:

  1. As a constant of proportionality, it quantifies the relationship between both derivatives.
  2. As a physical constant, it ensures the dimensions on both sides of the equation are compatible.

So our diffusion constant here is ħ/2m. Because of the Uncertainty Principle, m is always going to be some integer multiple of ħ/2, so ħ/2m = 1, 1/2, 1/3, 1/4 etcetera. In other words, the ħ/2m term is the inverse of the mass measured in units of ħ/2. We get the terms of the harmonic series here. How convenient! 🙂

In our previous posts, we studied the wavefunction for a zero-mass particle. Such a particle has zero rest mass but – because of its movement – does have some energy, and, therefore, some mass and momentum. In fact, measuring time and distance in equivalent units (so c = 1), we found that E = m = p = ħ/2 for the zero-mass particle. It had to be. If not, our equations gave us nonsense. So Schrödinger’s equation was reduced to:

∂ψ/∂t = i·∇2ψ

How elegant! We only need to explain that imaginary unit (i) in the equation. It does a lot of things. First, it gives us two equations for the price of one—thereby providing a propagation mechanism indeed. It’s just like the E and B vectors. Indeed, we can write that ∂ψ/∂t = i·∇2ψ equation as:

  1. Re(∂ψ/∂t) = −Im(∇2ψ)
  2. Im(∂ψ/∂t) = Re(∇2ψ)

[You should be able to show that the two equations above are effectively equivalent to Schrödinger’s equation. If not… Well… Then you should not be reading this stuff.] The two equations above show that the real part of the wavefunction feeds into its imaginary part, and vice versa. Both are as essential. Let me say this one more time: the so-called real and imaginary part of a wavefunction are equally real—or essential, I should say!
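
If you don’t want to do that algebra by hand, here’s a minimal symbolic check using sympy. The symbol D stands in for the ħ/2m coefficient (D = 1 in the case above), and the check confirms that the two real equations hold exactly when the dispersion relation ω = D·k2 is satisfied:

```python
import sympy as sp

x, t, k, w, D = sp.symbols('x t k omega D', real=True, positive=True)
psi = sp.exp(sp.I * (k * x - w * t))      # the elementary wavefunction

dpsi_dt = sp.diff(psi, t)                 # left-hand side of Schrödinger's equation
laplacian = sp.diff(psi, x, 2)            # the (one-dimensional) Laplacian

eq1 = sp.re(dpsi_dt) + D * sp.im(laplacian)   # Re(dpsi/dt) + D·Im(laplacian) should vanish
eq2 = sp.im(dpsi_dt) - D * sp.re(laplacian)   # Im(dpsi/dt) - D·Re(laplacian) should vanish

# Both reduce to zero once we impose the dispersion relation omega = D·k^2
print(sp.simplify(eq1.subs(w, D * k**2)))   # 0
print(sp.simplify(eq2.subs(w, D * k**2)))   # 0
```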

Second, it gives us the circle. Huh? Yes. Writing the wavefunction as ψ = a + i·b is not just like writing a vector in terms of its Cartesian coordinates, even if it looks very much that way. Why not? Well… Never forget: i2 = −1, and so—let me use mathematical lingo here—the introduction of i makes our metric space complete. To put it simply: we can now compute everything. In short, the introduction of the imaginary unit gives us that wonderful mathematical construct, ei(kx − ωt), which allows us to model everything. In case you wonder, I mean: everything! Literally. 🙂

However, we’re not going to impose any pre-conditions here, and so we’re not going to make that E = m = p = ħ/2 assumption now. We’ll just re-write Schrödinger’s equation as we did last time—so we’re going to keep our ‘diffusion constant’ ħ/2m as is for now:

  1. Re(∂ψ/∂t) = −(ħ/2m)·Im(∇2ψ)
  2. Im(∂ψ/∂t) = (ħ/2m)·Re(∇2ψ)

So we have two pairs of equations now. Can they be related? Well… They look the same, so they had better be related! 🙂 Let’s explore it. First note that, if we’d equate the direction of propagation with the x-axis, we can write the E vector as the sum of a y- and a z-component: E = (Ey, Ez). Using complex number notation, we can write E as:

E = (Ey, Ez) = Ey + i·Ez

In case you’d doubt, just think of this simple drawing:

[Image: a complex number represented as a point in the complex plane, with a real and an imaginary axis]

The next step is to imagine—funny word when talking complex numbers—that Ey and Ez are the real and imaginary part of some wavefunction, which we’ll denote as ψE = ei(kx − ωt). So now we can write:

E = (Ey, Ez) = Ey + i·Ez = cos(kx − ωt) + i∙sin(kx − ωt) = Re(ψE) + i·Im(ψE)

What’s k and ω? Don’t worry about it—for the moment, that is. We’ve done nothing special here. In fact, we’re used to representing waves as some sine or cosine function, so that’s what we are doing here. Nothing more. Nothing less. We just need two sinusoids because of the circular polarization of our electromagnetic wave.

What’s next? Well… If ψE is a regular wavefunction, then we should be able to check if it’s a solution to Schrödinger’s equation. So we should be able to write:

  1. Re(∂ψE/∂t) =  −(ħ/2m)·Im(∇2ψE)
  2. Im(∂ψE/∂t) = (ħ/2m)·Re(∇2ψE)

Are we? How does that work? The time derivative on the left-hand side is equal to:

∂ψE/∂t = −iω·ei(kx − ωt) = −iω·[cos(kx − ωt) + i·sin(kx − ωt)] = ω·sin(kx − ωt) − iω·cos(kx − ωt)

The second-order derivative on the right-hand side is equal to:

∇2ψE = ∂2ψE/∂x2 = −k2·ei(kx − ωt) = −k2·cos(kx − ωt) − i·k2·sin(kx − ωt)

So the two equations above are equivalent to writing:

  1. Re(∂ψE/∂t) =   −(ħ/2m)·Im(∇2ψE) ⇔ ω·sin(kx − ωt) = k2·(ħ/2m)·sin(kx − ωt)
  2. Im(∂ψE/∂t) = (ħ/2m)·Re(∇2ψE) ⇔ −ω·cos(kx − ωt) = −k2·(ħ/2m)·cos(kx − ωt)

Both conditions are fulfilled if, and only if, ω = k2·(ħ/2m). Now, assuming we measure time and distance in equivalent units (c = 1), we can calculate the phase velocity of the electromagnetic wave as being equal to vp = ω/k = 1. We also have the de Broglie equation for the matter-wave, even if we’re not quite sure whether or not we should apply that to an electromagnetic wave. In any case, the de Broglie equation tells us that k = p/ħ. So we can re-write this condition as:

ω/k = 1 = k·(ħ/2m) = (p/ħ)·(ħ/2m) = p/2m ⇔ p = 2m ⇔ m = p/2

So that’s different from the E = m = p equality we imposed when discussing the wavefunction of the zero-mass particle: we’ve got that 1/2 factor which bothered us so much once again! And it’s causing us the same trouble: how do we interpret that m = p/2 equation? It leads to nonsense once more! E = m·c2 = m, but E is also supposed to be equal to p·c = p. Here, however, we find that E = p/2! We also get strange results when calculating the group and phase velocity. So… Well… What’s going on here?

I am not quite sure. It’s that damn 1/2 factor. Perhaps it’s got something to do with our definition of mass. The m in the Schrödinger equation was referred to as the effective or reduced mass of the electron wavefunction that it was supposed to model. Now that concept is something funny: it sure allows for some gymnastics, as you’ll see when going through the Wikipedia article on it! I promise I’ll dig into it—but not now and here, as I’ve got no time for that. 😦

However, the good news is that we also get a magnetic field vector with an electromagnetic wave: B. We know B is always orthogonal to E, and in the direction that’s given by the right-hand rule for the vector cross-product. Indeed, we can write B as B = ex×E/c, with ex the unit vector pointing in the x-direction (i.e. the direction of propagation), as shown below.

[Image: the magnetic field vector B = ex×E/c, orthogonal to E, with the direction given by the right-hand rule]

So we can do the same analysis: we just substitute B for E everywhere, and we’ll find the same condition: m = p/2. To distinguish the two wavefunctions, we used the E and B subscripts for our wavefunctions, so we wrote ψE and ψB. We can do the same for that m = p/2 condition:

  1. mE = pE/2
  2. mB = pB/2

Should we just add mE and mB to get a total mass and, hence, a total energy and momentum that’s equal to E = m = p for the whole wave? I believe we should, but I haven’t quite figured out how we should interpret that summation!

So… Well… Sorry to disappoint you. I haven’t got the answer here. But I do believe my instinct tells me the truth: the wavefunction for an electromagnetic wave—so that’s the wavefunction for a photon, basically—is essentially the same as our wavefunction for a zero-mass particle. It’s just that we get two wavefunctions for the price of one. That’s what distinguishes bosons from fermions! And so I need to figure out how they differ exactly! And… Well… Yes. That might take me a while!

In the meanwhile, we should play some more with those E and B vectors, as that’s going to help us to solve the riddle—no doubt!

Fiddling with E and B

The B = ex×E/c equation is equivalent to saying that we’ll get B when rotating E by 90 degrees which, in turn, is equivalent to multiplication by the imaginary unit i. Huh? Yes. Sorry. Just google the meaning of the vector cross product and multiplication by i. So we can write B = i·E, which amounts to writing:

B = i·E = ei(π/2)·ei(kx − ωt) = ei(kx − ωt + π/2) = cos(kx − ωt + π/2) + i·sin(kx − ωt + π/2)

So we can now associate a wavefunction ψB with the magnetic field vector B, which is the same wavefunction as ψE except for a phase shift equal to π/2. You’ll say: so what? Well… Nothing much. I guess this observation just concludes this long digression on the wavefunction of a photon: it’s the same wavefunction as that of a zero-mass particle—except that we get two for the price of one!
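
That multiplication-by-i-equals-a-π/2-phase-shift rule is trivial to check numerically, but it’s a reassuring little exercise:

```python
import numpy as np

theta = np.linspace(0, 2 * np.pi, 100)
psi_E = np.exp(1j * theta)        # the wavefunction we associate with E
psi_B = 1j * psi_E                # multiply by i ...
# ... and compare with an explicit phase shift of pi/2
print(np.allclose(psi_B, np.exp(1j * (theta + np.pi / 2))))   # True
```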

It’s an interesting way of looking at things. Let’s look at the equations we started this post with, i.e. Maxwell’s equations in free space—i.e. no stationary charges, and no currents (i.e. moving charges) either! So we’re talking those ∂B/∂t = –∇×E and ∂E/∂t = ∇×B equations now.

Note that they actually give you four equations, because they’re vector equations:

  1. ∂B/∂t = –∇×E ⇔ ∂By/∂t = –(∇×E)y and ∂Bz/∂t = –(∇×E)z
  2. ∂E/∂t = ∇×B ⇔ ∂Ey/∂t = (∇×B)y and ∂Ez/∂t = (∇×B)z

To figure out what that means, we need to remind ourselves of the definition of the curl operator, i.e. the ∇× operator. For E, the components of ∇×E are the following:

  1. (∇×E)z = ∇xEy – ∇yEx = ∂Ey/∂x – ∂Ex/∂y
  2. (∇×E)x = ∇yEz – ∇zEy = ∂Ez/∂y – ∂Ey/∂z
  3. (∇×E)y = ∇zEx – ∇xEz = ∂Ex/∂z – ∂Ez/∂x

So the four equations above can now be written as:

  1. ∂By/∂t = –(∇×E)y = –∂Ex/∂z + ∂Ez/∂x
  2. ∂Bz/∂t = –(∇×E)z = –∂Ey/∂x + ∂Ex/∂y
  3. ∂Ey/∂t = (∇×B)y = ∂Bx/∂z – ∂Bz/∂x
  4. ∂Ez/∂t = (∇×B)z = ∂By/∂x – ∂Bx/∂y

What can we do with this? Well… The x-component of E and B is zero, so one of the two terms in the equations simply disappears. We get:

  1. ∂By/∂t = –(∇×E)y = ∂Ez/∂x
  2. ∂Bz/∂t = –(∇×E)z = – ∂Ey/∂x
  3. ∂Ey/∂t = (∇×B)y = – ∂Bz/∂x
  4. ∂Ez/∂t = (∇×B)z = ∂By/∂x

Interesting: only the derivatives with respect to x remain! Let’s calculate them:

  1. ∂By/∂t = –(∇×E)y = ∂Ez/∂x = ∂[sin(kx − ωt)]/∂x = k·cos(kx − ωt) = k·Ey
  2. ∂Bz/∂t = –(∇×E)z = – ∂Ey/∂x = – ∂[cos(kx − ωt)]/∂x = k·sin(kx − ωt) = k·Ez
  3. ∂Ey/∂t = (∇×B)y = – ∂Bz/∂x = – ∂[sin(kx − ωt + π/2)]/∂x = – k·cos(kx − ωt + π/2) = – k·By
  4. ∂Ez/∂t = (∇×B)z = ∂By/∂x = ∂[cos(kx − ωt + π/2)]/∂x = − k·sin(kx − ωt + π/2) = – k·Bz

What wonderful results! The time derivatives of the components of B and E are equal to ±k times the components of E and B respectively! So everything is related to everything, indeed! 🙂
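
Those four identities are easy to verify symbolically as well. Here’s a minimal sympy sketch, with ω set equal to k from the start (so the phase velocity is 1):

```python
import sympy as sp

x, t, k = sp.symbols('x t k', real=True)
theta = k * x - k * t                      # omega = k, so c = 1

# Field components of our circularly polarized wave (Ex = Bx = 0)
Ey, Ez = sp.cos(theta), sp.sin(theta)
By, Bz = sp.cos(theta + sp.pi / 2), sp.sin(theta + sp.pi / 2)

# The four reduced Maxwell equations, each written as 'lhs - rhs'
eqs = [sp.diff(By, t) - sp.diff(Ez, x),    # dBy/dt = dEz/dx
       sp.diff(Bz, t) + sp.diff(Ey, x),    # dBz/dt = -dEy/dx
       sp.diff(Ey, t) + sp.diff(Bz, x),    # dEy/dt = -dBz/dx
       sp.diff(Ez, t) - sp.diff(By, x)]    # dEz/dt = dBy/dx
print([sp.simplify(e) for e in eqs])       # [0, 0, 0, 0]
```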

Let’s play some more. Using the cos(θ + π/2) = −sin(θ) and sin(θ + π/2) = cos(θ) identities, we know that By = cos(kx − ωt + π/2) and Bz = sin(kx − ωt + π/2) are equal to:

  1. By = cos(kx − ωt + π/2) = −sin(kx − ωt) = −Ez
  2. Bz = sin(kx − ωt + π/2) = cos(kx − ωt) = Ey

Let’s calculate those derivatives once more now:

  1. ∂By/∂t = −∂Ez/∂t = −∂sin(kx − ωt)/∂t = ω·cos(kx − ωt) = ω·Ey
  2. ∂Bz/∂t = ∂Ey/∂t = ∂cos(kx − ωt)/∂t = ω·sin(kx − ωt) = ω·Ez

This result can, obviously, be true only if ω = k, which we assume to be the case, as we’re measuring time and distance in equivalent units, so the phase velocity is v = ω/k = 1.

Hmm… I am sure it won’t be long before I’ll be able to prove what I want to prove. I just need to figure out the math. It’s pretty obvious now that the wavefunction—any wavefunction, really—models the flow of energy. I just need to show how it works for the zero-mass particle—and then I mean: how it works exactly. We must be able to apply the concept of the Poynting vector to wavefunctions. We must be. I’ll find how. One day. 🙂

As for now, however, I feel we’ve played enough with those wavefunctions now. It’s time to do what we promised to do a long time ago, and that is to use Schrödinger’s equation to calculate electron orbitals—and other stuff, of course! Like… Well… We hardly ever talked about spin, did we? That comes with huge complexities. But we’ll get through it. Trust me. 🙂

The quantum of time and distance

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you immediately read the more recent exposé on the matter presented here, which you can find by clicking on the provided link. In fact, I actually made some (small) mistakes when writing the post below.

Original post:

In my previous post, I introduced the elementary wavefunction of a particle with zero rest mass in free space (i.e. the particle also has zero potential). I wrote that wavefunction as ei(kx − ωt) = ei(x/2 − t/2) = cos[(x−t)/2] + i∙sin[(x−t)/2], and we can represent that function as follows:

[Image: the elementary wavefunction represented as a circular rotation of its real and imaginary part]

If the real and imaginary axis in the image above are the y- and z-axis respectively, then the x-axis here is time, so here we’d be looking at the shape of the wavefunction at some fixed point in space.

Now, we could also look at its shape at some fixed point in time, so the x-axis would then represent the spatial dimension. Better still, we could animate the illustration to incorporate both the temporal as well as the spatial dimension. The following animation does the trick quite well:

[Animation: the rotating electric field vector of a circularly polarized electromagnetic wave traveling through space]

Please do note that space is one-dimensional here: the y- and z-axis represent the real and imaginary part of the wavefunction, not the y- or z-dimension in space.

You’ve seen this animation before, of course: I took it from Wikipedia, and it actually represents the electric field vector (E) for a circularly polarized electromagnetic wave. To get a complete picture of the electromagnetic wave, we should add the magnetic field vector (B), which is not shown here. We’ll come back to that later. Let’s first look at our zero-mass particle denuded of all properties, so it’s not an electromagnetic wave—read: not a photon. No. We don’t want to talk charges here.

OK. So far so good. A zero-mass particle in free space. So we got that ei(x/2 − t/2) = cos[(x−t)/2] + i∙sin[(x−t)/2] wavefunction. We got that function assuming the following:

  1. Time and distance are measured in equivalent units, so c = 1. Hence, the classical velocity (v) of our zero-mass particle is equal to 1, and we also find that the energy (E), mass (m) and momentum (p) of our particle are numerically the same. We wrote: E = m = p, using the p = m·v (for v = c) and the E = m∙c2 formulas.
  2. We also assumed that the quantum of energy (and, hence, the quantum of mass, and the quantum of momentum) was equal to ħ/2, rather than ħ. The de Broglie relations (k = p/ħ and ω = E/ħ) then gave us the rather particular argument of our wavefunction: kx − ωt = x/2 − t/2.

The latter hypothesis (E = m = p = ħ/2) is somewhat strange at first but, as I showed in that post of mine, it avoids an apparent contradiction: if we’d use ħ, then we would find two different values for the phase and group velocity of our wavefunction. To be precise, we’d find v for the group velocity, but v/2 for the phase velocity. Using ħ/2 solves that problem. In addition, using ħ/2 is consistent with the Uncertainty Principle, which tells us that ΔxΔp = ΔEΔt = ħ/2.

OK. Take a deep breath. Here I need to say something about dimensions. If we’re saying that we’re measuring time and distance in equivalent units – say, in meter, or in seconds – then we are not saying that they’re the same. The dimension of time and space is fundamentally different, as evidenced by the fact that, for example, time flows in one direction only, as opposed to x. To be precise, we assumed that x and t become countable variables themselves at some point in time. However, if we’re at t = 0, then we’d count time as t = 1, 2, etcetera only. In contrast, at the point x = 0, we can go to x = +1, +2, etcetera but we may also go to x = −1, −2, etc.

I have to stress this point, because what follows will require some mental flexibility. In fact, we often talk about natural units, such as Planck units, which we get from equating fundamental constants, such as c, or ħ, to 1, but then we often struggle to interpret those units, because we fail to grasp what it means to write c = 1, or ħ = 1. For example, writing c = 1 implies we can measure distance in seconds, or time in meter, but it does not imply that distance becomes time, or vice versa. We still need to keep track of whether or not we’re talking a second in time, or a second in space, i.e. c meter, or, conversely, whether we’re talking a meter in space, or a meter in time, i.e. 1/c seconds. We can make the distinction in various ways. For example, we could mention the dimension of each equation between brackets, so we’d write: t = 1×10−15 s [t] ≈ 299.8×10−9 m [t]. Alternatively, we could put a little subscript (like t, or d), so as to make sure it’s clear our meter is a ‘light-meter’, so we’d write: t = 1×10−15 s ≈ 299.8×10−9 mt. Likewise, we could add a little subscript when measuring distance in light-seconds, so we’d write x = 3×108 m ≈ 1 sd, rather than x = 3×108 m [x] ≈ 1 s [x].

If you wish, we could refer to the ‘light-meter’ as a ‘time-meter’ (or a meter of time), and to the light-second as a ‘distance-second’ (or a second of distance). It doesn’t matter what you call it, or how you denote it. In fact, you will never hear of a meter of time, nor will you ever see those subscripts or brackets. But that’s because physicists always keep track of the dimensions of an equation, and so they know. They know, for example, that the dimension of energy combines the dimensions of both force as well as distance, so we write: [energy] = [force]·[distance]. Read: energy amounts to applying a force over a distance. Likewise, momentum amounts to applying some force over some time, so we write: [momentum] = [force]·[time]. Using the usual symbols for energy, momentum, force, distance and time respectively, we can write this as [E] = [F]·[x] and [p] = [F]·[t]. Using the units you know, i.e. joule, newton, meter and seconds, we can also write this as: 1 J = 1 N·m and, for momentum, 1 N·s.

Hey! Wait a minute! What’s that N·s unit for momentum? Momentum is mass times velocity, isn’t it? It is. But it amounts to the same. Remember that mass is a measure for the inertia of an object, and so mass is measured with reference to some force (F) and some acceleration (a): F = m·a ⇔ m = F/a. Hence, [m] = kg = [F/a] = N/(m/s2) = N·s2/m. [Note that the m in the brackets is the symbol for mass but the other m is a meter!] So the unit of momentum is (N·s2/m)·(m/s) = N·s = newton·second.

Now, the dimension of Planck’s constant is the dimension of action, which combines all dimensions: force, time and distance. We write: ħ ≈ 1.0545718×10−34 N·m·s (newton·meter·second). That’s great, and I’ll show why in a moment. But, at this point, you should just note that when we write that E = m = p = ħ/2, we’re just saying they are numerically the same. The dimensions of E, m and p are not the same. So what we’re really saying is the following:

  1. The quantum of energy is ħ/2 newton·meter ≈ 0.527286×10−34 N·m.
  2. The quantum of momentum is ħ/2 newton·second ≈ 0.527286×10−34 N·s.

What’s the quantum of mass? That’s where the equivalent units come in. We wrote: 1 kg = 1 N·s2/m. So we could substitute the distance unit in this equation (m) by sd/c = sd/(3×108). So we get: 1 kg = 3×108 N·s2/sd. Can we scrap both ‘seconds’ and say that the quantum of mass (ħ/2) is equal to the quantum of momentum? Think about it.

[…]

The answer is… Yes and no—but much more no than yes! The two sides of the equation are only numerically equal, but we’re talking a different dimension here. If we’d write that our quantum of mass, 0.527286×10−34 N·s2/sd, is equal to 0.527286×10−34 N·s, you’d be equating two dimensions that are fundamentally different: space versus time. To reinforce the point, think of it the other way: think of substituting the second (s) for 3×108 mt. Again, you’d make a mistake. You’d have to write 0.527286×10−34 N·(mt)2/m, and you should not assume that a time-meter is equal to a distance-meter. They’re equivalent units, and so you can use them to get some number right, but they’re not equal: what they measure, is fundamentally different. A time-meter measures time, while a distance-meter measures distance. It’s as simple as that. So what is it then? Well… What we can do is remember Einstein’s energy-mass equivalence relation once more: E = m·c2 (and m is the mass here). Just check the dimensions once more: [m]·[c2] = (N·s2/m)·(m2/s2) = N·m. So we should think of the quantum of mass as the quantum of energy, as energy and mass are equivalent, really.

Back to the wavefunction

The beauty of the construct of the wavefunction resides in several mathematical properties of this construct. The first is its argument:

θ = kx − ωt, with k = p/ħ and ω = E/ħ

Its dimension is the dimension of an angle: we express it in radians. What’s a radian? You might think that a radian is a distance unit because… Well… Look at how we measure an angle in radians below:

[Image: the unit circle, showing an angle of one radian subtending an arc of unit length]

But you’re wrong. An angle’s measurement in radians is numerically equal to the length of the corresponding arc of the unit circle but… Well… Numerically only. 🙂 Just do a dimensional analysis of θ = kx − ωt = (p/ħ)·x − (E/ħ)·t. The dimension of p/ħ is (N·s)/(N·m·s) = 1/m = m−1, so we get some quantity expressed per meter, which we then multiply by x, so we get a pure number. No dimension whatsoever! Likewise, the dimension of E/ħ is (N·m)/(N·m·s) = 1/s = s−1, which we then multiply by t, so we get another pure number, which we then add to get our argument θ. Hence, Planck’s quantum of action (ħ) does two things for us (there’s a quick check of this after the list below):

  1. It expresses p and E in units of ħ.
  2. It sorts out the dimensions, ensuring our argument is a dimensionless number indeed.
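
Here’s that dimensional bookkeeping as a tiny check with sympy’s unit symbols (it’s nothing but symbolic cancellation, of course, but it makes the point):

```python
from sympy import simplify
from sympy.physics.units import newton, meter, second

hbar_dim = newton * meter * second   # dimension of the quantum of action
p_dim    = newton * second           # dimension of momentum
E_dim    = newton * meter            # dimension of energy

print(simplify(p_dim / hbar_dim * meter))    # 1: (p/hbar)·x is dimensionless
print(simplify(E_dim / hbar_dim * second))   # 1: (E/hbar)·t is dimensionless
```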

In fact, I’d say the ħ in the (p/ħ)·x term in the argument is a different ħ than the ħ in the (E/ħ)·t term. Huh? What? Yes. Think of the distinction I made between s and sd, or between m and mt. Both were numerically the same: they captured a magnitude, but they measured different things. We’ve got the same thing here:

  1. The meter (m) in ħ ≈ 1.0545718×10−34 N·m·s in (p/ħ)·x is the dimension of x, and so it gets rid of the distance dimension. So the m in ħ ≈ 1.0545718×10−34 N·m·s goes, and what’s left measures p in terms of units equal to 1.0545718×10−34 N·s, so we get a pure number indeed.
  2. Likewise, the second (s) in ħ ≈ 1.0545718×10−34 N·m·s in (E/ħ)·t is the dimension of t, and so it gets rid of the time dimension. So the s in ħ ≈ 1.0545718×10−34 N·m·s goes, and what’s left measures E in terms of units equal to 1.0545718×10−34 N·m, so we get another pure number.
  3. Adding both gives us the argument θ: a pure number that measures some angle.

That’s why you need to watch out when writing θ = (p/ħ)·x − (E/ħ)·t as θ = (p·x − E·t)/ħ or – in the case of our elementary wavefunction for the zero-mass particle – as θ = (x/2 − t/2) = (x − t)/2. You can do it – in fact, you should do so when trying to calculate something – but you need to be aware that you’re making abstraction of the dimensions. That’s quite OK, as you’re just calculating something—but don’t forget the physics behind!

You’ll immediately ask: what’s the physics behind here? Well… I don’t know. Perhaps nobody knows. As Feynman once famously said: “I think I can safely say that nobody understands quantum mechanics.” But then he never wrote that, and I am sure he didn’t really mean that. And then he said that back in 1964, which is 50 years ago now. 🙂 So let’s try to understand it at least. 🙂

Planck’s quantum of action – 1.0545718×10−34 N·m·s – comes to us as a mysterious quantity. A quantity is more than a number. A number is something like π or e, for example. It might be a complex number, like eiθ, but that’s still a number. In contrast, a quantity has some dimension, or some combination of dimensions. A quantity may be a scalar quantity (like distance), or a vector quantity (like a field vector). In this particular case (Planck’s ħ or h), we’ve got a physical constant combining three dimensions: force, time and distance—or space, if you want. It’s a quantum, so it comes as a blob—or a lump, if you prefer that word. However, as I see it, we can sort of project it in space as well as in time. In fact, if this blob is going to move in spacetime, then it will move in space as well as in time: t will go from 0 to 1, and x goes from 0 to ± 1, depending on what direction we’re going. So when I write that E = p = ħ/2—which, let me remind you, are two numerical equations, really—I sort of split Planck’s quantum over E = m and p respectively.

You’ll say: what kind of projection or split is that? When projecting some vector, we’ll usually have some sine and cosine, or a 1/√2 factor—or whatever, but not a clean 1/2 factor. Well… I have no answer to that, except that this split fits our mathematical construct. Or… Well… I should say: my mathematical construct. Because what I want to find is this clean Schrödinger equation:

∂ψ/∂t = i·(ħ/2m)·∇2ψ = i·∇2ψ for m = ħ/2

Now I can only get this equation if (1) E = m = p and (2) if m = ħ/2 (which amounts to writing that E = p = m = ħ/2). There’s also the Uncertainty Principle. If we are going to consider the quantum vacuum, i.e. if we’re going to look at space (or distance) and time as count variables, then Δx and Δt in the ΔxΔp = ΔEΔt = ħ/2 equations are ± 1 and, therefore, Δp and ΔE must be ± ħ/2. In any case, I am not going to try to justify my particular projection here. Let’s see what comes out of it.

The quantum vacuum

Schrödinger’s equation for my zero-mass particle (with energy E = m = p = ħ/2) amounts to writing:

  1. Re(∂ψ/∂t) = −Im(∇2ψ)
  2. Im(∂ψ/∂t) = Re(∇2ψ)

Now that reminds us of the propagation mechanism for the electromagnetic wave, which we wrote as ∂B/∂t = –∇×E and ∂E/∂t = ∇×B, also assuming we measure time and distance in equivalent units. However, we’ll come back to that later. Let’s first study the equation we have, i.e.

ei(kx − ωt) = ei(ħ·x/2 − ħ·t/2)/ħ = ei(x/2 − t/2) = cos[(x−t)/2] + i∙sin[(x−t)/2]

Let’s think some more. What is that ei(x/2 − t/2) function? It’s subject to conceiving time and distance as countable variables, right? I am tempted to say: as discrete variables, but I won’t go that far—not now—because the countability may be related to a particular interpretation of quantum physics. So I need to think about that. In any case… The point is that x can only take on values like 0, 1, 2, etcetera. And the same goes for t. To make things easy, we’ll not consider negative values for x right now (and, obviously, not for t either). But you can easily check it doesn’t make a difference: if you think of the propagation mechanism – which is what we’re trying to model here – then x is always positive, because we’re moving away from some source that caused the wave. In any case, we’ve got an infinite set of points like:

  • ei(0/2 − 0/2) = ei(0) = cos(0) + i∙sin(0)
  • ei(1/2 − 0/2) = ei(1/2) = cos(1/2) + i∙sin(1/2)
  • ei(0/2 − 1/2) = ei(−1/2) = cos(−1/2) + i∙sin(−1/2)
  • ei(1/2 − 1/2) = ei(0) = cos(0) + i∙sin(0)

In my previous post, I calculated the real and imaginary part of this wavefunction for x going from 0 to 14 (as mentioned, in steps of 1) and for t doing the same (also in steps of 1), and what we got looked pretty good:

[Graphs: the real (cosine) and imaginary (sine) part of ei(x/2 − t/2) for x and t going from 0 to 14]

I also said that, if you wonder what the quantum vacuum could possibly look like, you should probably think of these discrete spacetime points, and some complex-valued wave that travels as illustrated above. In case you wonder what’s being illustrated here: the right-hand graph is the cosine value for all possible x = 0, 1, 2,… and t = 0, 1, 2,… combinations, and the left-hand graph depicts the sine values, so that’s the imaginary part of our wavefunction. Taking the absolute square of the wavefunction (i.e. adding the squares of both) gives 1 for all combinations. So it’s obvious we’d need to normalize and, more importantly, we’d have to localize the particle by adding several of these waves with the appropriate contributions. But so that’s not our worry right now. I want to check whether those discrete time and distance units actually make sense. What’s their size? Is it anything like the Planck length (for distance) and/or the Planck time?

Let’s see. What are the implications of our model? The question here is: if ħ/2 is the quantum of energy, and the quantum of momentum, what’s the quantum of force, and the quantum of time and/or distance?

Huh? Yep. We treated distance and time as countable variables above, but now we’d like to express the difference between x = 0 and x = 1 and between t = 0 and t = 1 in the units we know, that is, in meter and in seconds. So how do we go about that? Do we have enough equations here? Not sure. Let’s see…

We obviously need to keep track of the various dimensions here, so let’s refer to that discrete time and distance unit as tP and lP respectively. The subscript (P) refers to Planck, and the l refers to a length, but we’re likely to find something else than Planck units. I just need placeholder symbols here. To be clear: tP and lP are expressed in seconds and meter respectively, just like the actual Planck time and distance, which are equal to 5.391×10−44 s (more or less) and 1.6162×10−35 m (more or less) respectively. As I mentioned above, we get these Planck units by equating fundamental physical constants to 1. Just check it: (1.6162×10−35 m)/(5.391×10−44 s) ≈ 3×108 m/s. So the following relation must be true: lP = c·tP, or lP/tP = c.

Now, as mentioned above, there must be some quantum of force as well, which we’ll write as FP, and which is – obviously – expressed in newton (N). So we have:

  1. E = ħ/2 ⇒ 0.527286×10−34 N·m = FP·lP N·m
  2. p = ħ/2 ⇒ 0.527286×10−34 N·s = FP·tP N·s

Let’s try to divide both formulas: E/p = (FP·lP N·m)/(FP·tP N·s) = lP/tP m/s = c m/s. That’s consistent with the E/p = c equation. Hmm… We found what we knew already. My model is not fully determined, it seems. 😦

What about the following simplistic approach? E is numerically equal to 0.527286×10−34, and its dimension is [E] = [F]·[x], so we write: E = 0.527286×10−34·[E] = 0.527286×10−34·[F]·[x]. Hence, [x] = [E]/[F] = (N·m)/N = m. That just confirms what we already know: the quantum of distance (i.e. our fundamental unit of distance) can be expressed in meter. But our model does not give that fundamental unit. It only gives us its dimension (meter), which is stuff we knew from the start. 😦

Let’s try something else. Let’s just accept the Planck length and time, so we write:

  • lP = 1.6162×10−35 m
  • tP = 5.391×10−44 s

Now, if the quantum of action is equal to ħ N·m·s = FP·lP·tP N·m·s = 1.0545718×10−34 N·m·s, and if the two definitions of lP and tP above hold, then 1.0545718×10−34 N·m·s = (FP N)×(1.6162×10−35 m)×(5.391×10−44 s) ≈ FP·8.713×10−79 N·m·s ⇔ FP ≈ 1.21×1044 N.

Does that make sense? It does according to Wikipedia, but how do we relate this to our E = p = m = ħ/2 equations? Let’s try this:

  1. EP = (1.0545718×10−34 N·m·s)/(5.391×10−44 s) = 1.956×109 J. That corresponds to the regular Planck energy.
  2. pP = (1.0545718×10−34 N·m·s)/(1.6162×10−35 m) = 6.525 N·s. That corresponds to the regular Planck momentum.

Is EP = pP? Let’s substitute: 1.956×109 N·m = 1.956×109 N·(s/c) = (1.956×109/2.998×108) N·s = 6.525 N·s. So, yes, it comes out alright. In fact, I omitted the 1/2 factor in the calculations, but it doesn’t matter: it does come out alright. So I did not prove that the difference between my x = 0 and x = 1 points (or my t = 0 and t = 1 points) is equal to the Planck length (or the Planck time unit), but I did show my theory is, at the very least, compatible with those units. That’s more than enough for now. And I’ll surely come back to it in my next post. 🙂
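
Again, the arithmetic is easy to double-check with a few lines of Python, using the same rounded values:

```python
hbar = 1.0545718e-34              # N·m·s
c    = 2.998e8                    # m/s
l_P, t_P = 1.6162e-35, 5.391e-44  # Planck length (m) and Planck time (s)

E_P = hbar / t_P   # Planck energy: about 1.956e9 N·m (joule)
p_P = hbar / l_P   # Planck momentum: about 6.525 N·s
print(E_P, p_P, E_P / c)   # E_P/c reproduces p_P, as it should in equivalent units
```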

Post Scriptum: One must solve the following equations to get the fundamental Planck units:

[Table: the defining equations for the five fundamental Planck quantities (tP, lP, FP, mP and EP), expressed in terms of ħ, G, kB, ε0 and c]

We have five fundamental equations for five fundamental quantities respectively: tP, lP, FP, mP, and EP respectively, so that’s OK: it’s a fully determined system alright! But where do the expressions with G, kB (the Boltzmann constant) and ε0 come from? What does it mean to equate those constants to 1? Well… I need to think about that, and I’ll get back to you on it. 🙂

The wavefunction of a zero-mass particle

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you immediately read the more recent exposé on the matter presented here, which you can find by clicking on the provided link. In fact, I actually made some (small) mistakes when writing the post below.

Original post:

I hope you find the title intriguing. A zero-mass particle? So I am talking a photon, right? Well… Yes and no. Just read this post and, more importantly, think about this story for yourself. 🙂

One of my acquaintances is a retired nuclear physicist. We mail every now and then—but he has little or no time for my questions: he usually just tells me to keep studying. I once asked him why there is never any mention of the wavefunction of a photon in physics textbooks. He bluntly told me photons don’t have a wavefunction—not in the sense I was talking at least. Photons are associated with a traveling electric and a magnetic field vector. That’s it. Full stop. Photons do not have a ψ or φ function. [I am using ψ and φ to refer to the position and momentum wavefunction respectively. You know both are related: if we have one, we have the other.] But then I never give up, of course. I just can’t let go of the idea of a photon wavefunction. The structural similarity in the propagation mechanism of the electric and magnetic field vectors E and B just looks too much like the quantum-mechanical wavefunction. So I kept trying and, while I don’t think I fully solved the riddle, I feel I understand it much better now. Let me show you the why and how.

I. An electromagnetic wave in free space is fully described by the following two equations:

  1. ∂B/∂t = –∇×E
  2. ∂E/∂t = c2∇×B

We’re making abstraction here of stationary charges, and we also do not consider any currents here, so no moving charges either. So I am omitting the ∇·E = ρ/ε0 equation (i.e. the first of the set of four equations), and I am also omitting the current term in the second equation. So, for all practical purposes (i.e. for the purpose of this discussion), you should think of a space with no charges: ρ = 0 and j = 0. It’s just a traveling electromagnetic wave. To make things even simpler, we’ll assume our time and distance units are chosen such that c = 1, so the equations above reduce to:

  1. ∂B/∂t = –∇×E
  2. ∂E/∂t = ∇×B

Perfectly symmetrical! But note the minus sign in the first equation. As for the interpretation, I should refer you to previous posts but, briefly, the ∇× operator is the curl operator. It’s a vector operator: it describes the (infinitesimal) rotation of a (three-dimensional) vector field. We discussed heat flow a couple of times, or the flow of a moving liquid. So… Well… If the vector field represents the flow velocity of a moving fluid, then the curl is the circulation density of the fluid. The direction of the curl vector is the axis of rotation as determined by the ubiquitous right-hand rule, and its magnitude is the magnitude of rotation. OK. Next step.

II. For the wavefunction, we have Schrödinger’s equation, ∂ψ/∂t = i·(ħ/2m)·∇2ψ, which relates two complex-valued functions (∂ψ/∂t and ∇2ψ). Complex-valued functions consist of a real and an imaginary part, and you should be able to verify this equation is equivalent to the following set of two equations:

  1. Re(∂ψ/∂t) = −(ħ/2m)·Im(∇2ψ)
  2. Im(∂ψ/∂t) = (ħ/2m)·Re(∇2ψ)

[Two complex numbers a + ib and c + id are equal if, and only if, their real and imaginary parts are the same. However, note the i factor on the right-hand side of the equation, so we get: a + ib = i·(c + id) = −d + i·c.] The Schrödinger equation above also assumes free space (i.e. zero potential energy: V = 0) but, in addition – see my previous post – it also assumes a zero rest mass of the elementary particle (E0 = 0). So just assume E0 = V = 0 in de Broglie’s elementary ψ(θ) = ψ(x, t) = eiθ = a·e−i[(E0 + p2/(2m) + V)·t − p∙x]/ħ wavefunction. So, in essence, we’re looking at the wavefunction of a massless particle here. Sounds like nonsense, doesn’t it? But… Well… That should be the wavefunction of a photon in free space then, right? 🙂

Maybe. Maybe not. Let’s go as far as we can.

The energy of a zero-mass particle

What m would we use for a photon? Its rest mass is zero, but it’s got energy and, hence, an equivalent mass. That mass is given by the m = E/c2 mass-energy equivalence. We also know a photon has momentum, and it’s equal to its energy divided by c: p = m·c = E/c. [I know the notation is somewhat confusing: E is, obviously, not the magnitude of E here: it’s energy!] Both yield the same result. We get: m·c = E/c ⇔ m = E/c2 ⇔ E = m·c2.

OK. Next step. Well… I’ve always been intrigued by the fact that the kinetic energy of a photon, using the E = m·v2/2 = m·c2/2 formula (with v = c), is only half of its total energy E = m·c2. Half: 1/2. That 1/2 factor is intriguing. Where’s the rest of the energy? It’s really a contradiction: our photon has no rest mass, and there’s no potential here, but its total energy is still twice its kinetic energy. Quid?

There’s only one conclusion: just because of its sheer existence, it must have some hidden energy, and that hidden energy is also equal to E = m·c2/2, and so the kinetic and hidden energy add up to E = m·c2.

Huh? Hidden energy? I must be joking, right?

Well… No. No joke. I am tempted to call it the imaginary energy, because it’s linked to the imaginary part of the wavefunction—but then it’s everything but imaginary: it’s as real as the imaginary part of the wavefunction. [I know that sounds a bit nonsensical, but… Well… Think about it: it does make sense.]

Back to that factor 1/2. You may or may not remember it popped up when we were calculating the group and the phase velocity of the wavefunction respectively, again assuming zero rest mass, and zero potential. [Note that the rest mass term is mathematically equivalent to the potential term in both the wavefunction as well as in Schrödinger’s equation: E0·t + V·t = (E0 + V)·t, and V·ψ + E0·ψ = (V + E0)·ψ—obviously!]

In fact, let me quickly show you that calculation again: the de Broglie relations tell us that the k and the ω in the ei(kx − ωt) = cos(kx−ωt) + i∙sin(kx−ωt) wavefunction (i.e. the spatial and temporal frequency respectively) are equal to k = p/ħ, and ω = E/ħ. If we would now use the kinetic energy formula E = m·v2/2 – which we can also write as E = m·v·v/2 = p·v/2 = p·p/2m = p2/2m, with v = p/m the classical velocity of the elementary particle that Louis de Broglie was thinking of – then we can calculate the group velocity of our ei(kx − ωt) = cos(kx−ωt) + i∙sin(kx−ωt) as:

vg = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂[p2/2m]/∂p = 2p/2m = p/m = v

[Don’t tell me I can’t treat m as a constant when calculating ∂ω/∂k: I can. Think about it.] Now the phase velocity. The phase velocity of our ei(kx − ωt) is only half of that. Again, we get that 1/2 factor:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = (p2/2m)/p = p/2m = v/2

Strange, isn’t it? Why would we get a different value for the phase velocity here? It’s not like we have two different frequencies here, do we? You may also note that the phase velocity turns out to be smaller than the group velocity, which is quite exceptional as well! So what’s the matter?

Well… The answer is: we do seem to have two frequencies here while, at the same time, it’s just one wave. There is only one k and ω here but, as I mentioned a couple of times already, that ei(kx − ωt) wavefunction seems to give you two functions for the price of one—one real and one imaginary: ei(kx − ωt) = cos(kx−ωt) + i∙sin(kx−ωt). So are we adding waves, or are we not? It’s a deep question. In my previous post, I said we were adding separate waves, but now I am thinking: no. We’re not. That sine and cosine are part of one and the same whole. Indeed, the apparent contradiction (i.e. the different group and phase velocity) gets solved if we’d use the E = m∙v2 formula rather than the kinetic energy E = m∙v2/2. Indeed, assuming that E = m∙v2 formula also applies to our zero-mass particle (I mean zero rest mass, of course), and measuring time and distance in natural units (so c = 1), we have:

E = m∙c2 = m and p = m∙c = m, so we get: E = m = p

Waw! What a weird combination, isn’t it? But… Well… It’s OK. [You tell me why it wouldn’t be OK. It’s true we’re glossing over the dimensions here, but natural units are natural units, and so c = c2 = 1. So… Well… No worries!] The point is: that E = m = p equality yields extremely simple but also very sensible results. For the group velocity of our ei(kx − ωt) wavefunction, we get:

vg = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂p/∂p = 1

So that’s the velocity of our zero-mass particle (c, i.e. the speed of light) expressed in natural units once more—just like what we found before. For the phase velocity, we get:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = p/p = 1

Same result! No factor 1/2 here! Isn’t that great? My ‘hidden energy theory’ makes a lot of sense. 🙂 In fact, I had mentioned a couple of times already that the E = m∙v2 relation comes out of the de Broglie relations if we just multiply the two and use the v = f·λ relation:

  1. f·λ = (E/h)·(h/p) = E/p
  2. v = f·λ ⇒ v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v2

But so I had no good explanation for this. I have one now: the E = m·v2 relation is the correct energy formula for our zero-mass particle. 🙂
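
The contrast between the two energy formulas is easy to see symbolically too. A minimal sketch:

```python
import sympy as sp

p, m = sp.symbols('p m', positive=True)

# Kinetic-energy formula E = p^2/2m: group and phase velocity differ by a factor 2
E_kin = p**2 / (2 * m)
print(sp.diff(E_kin, p), E_kin / p)   # p/m (= v) versus p/(2m) (= v/2)

# E = m = p in natural units, so E(p) = p: both velocities come out as 1 (= c)
E_zero_mass = p
print(sp.diff(E_zero_mass, p), E_zero_mass / p)   # 1 and 1
```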

The quantization of energy and the zero-mass particle

Let’s now think about the quantization of energy. What’s the smallest value for E that we could possibly think of? That’s h, isn’t it? That’s the energy of one cycle of an oscillation according to the Planck-Einstein relation (E = h·f). Well… Perhaps it’s ħ? Because… Well… We saw energy levels were separated by ħ, rather than h, when studying the blackbody radiation problem. So is it ħ = h/2π? Is the natural unit a radian (i.e. a unit distance), rather than a cycle?

Neither is natural, I’d say. We also have the Uncertainty Principle, which suggests the smallest possible energy value is ħ/2, because ΔxΔp = ΔtΔE = ħ/2.

Huh? What’s the logic here?

Well… I am not quite sure but my intuition tells me the quantum of energy must be related to the quantum of time, and the quantum of distance.

Huh? The quantum of time? The quantum of distance? What’s that? The Planck scale?

No. Or… Well… Let me correct that: not necessarily. I am just thinking in terms of logical concepts here. Logically, as we think of the smallest of smallest, then our time and distance variables must become count variables, so they can only take on some integer value n = 0, 1, 2 etcetera. So then we’re literally counting in time and/or distance units. So Δx and Δt are then equal to 1. Hence, Δp and ΔE are then equal to Δp = ΔE = ħ/2. Just think of the radian (i.e. the unit in which we measure θ) as measuring both time as well as distance. Makes sense, no?

No? Well… Sorry. I need to move on. So the smallest possible value for m = E = p would be ħ/2. Let’s substitute that in Schrödinger’s equation, or in that set of equations Re(∂ψ/∂t) = −(ħ/2m)·Im(∇2ψ) and Im(∂ψ/∂t) = (ħ/2m)·Re(∇2ψ). We get:

  1. Re(∂ψ/∂t) = −(ħ/2m)·Im(∇2ψ) = −(2ħ/2ħ)·Im(∇2ψ) = −Im(∇2ψ)
  2. Im(∂ψ/∂t) = (ħ/2m)·Re(∇2ψ) = (2ħ/2ħ)·Re(∇2ψ) = Re(∇2ψ)

Bingo! The Re(∂ψ/∂t) = −Im(∇2ψ) and Im(∂ψ/∂t) = Re(∇2ψ) equations were what I was looking for. Indeed, I wanted to find something that was structurally similar to the ∂B/∂t = –∇×E and ∂E/∂t = ∇×B equations—and something that was exactly similar: no coefficients in front or anything. 🙂

What about our wavefunction? Using the de Broglie relations once more (k = p/ħ, and ω = E/ħ), our ei(kx − ωt) = cos(kx−ωt) + i∙sin(kx−ωt) now becomes:

ei(kx − ωt) = ei(ħ·x/2 − ħ·t/2)/ħ = ei(x/2 − t/2) = cos[(x−t)/2] + i∙sin[(x−t)/2]

Hmm… Interesting! So we’ve got that 1/2 factor now in the argument of our wavefunction! I really feel I am close to squaring the circle here. 🙂 Indeed, it must be possible to relate the ∂B/∂t = –∇×E and ∂E/∂t = c2∇×B equations to the Re(∂ψ/∂t) = −Im(∇2ψ) and Im(∂ψ/∂t) = Re(∇2ψ) equations. I am sure it’s a complicated exercise. It’s likely to involve the formula for the Lorentz force, which says that the force on a unit charge is equal to E+v×B, with v the velocity of the charge. Why? Note the vector cross-product. Also note that ∂B/∂t and ∂E/∂t are vector-valued functions, not scalar-valued functions. Hence, in that sense, ∂B/∂t and ∂E/∂t are not like the Re(∂ψ/∂t) and/or Im(∂ψ/∂t) functions. But… Well… For the rest, think of it: E and B are orthogonal vectors, and that’s how we usually interpret the real and imaginary part of a complex number as well: the real and imaginary axis are orthogonal too!

So I am almost there. Who can help me prove what I want to prove here? The two propagation mechanisms are the “same-same but different”, as they say in Asia. The difference between the two propagation mechanisms must also be related to that fundamental dichotomy in Nature: the distinction between bosons and fermions. Indeed, when combining two directional quantities (i.e. two vectors), we like to think there are four different ways of doing that, as shown below. However, when we’re only interested in the magnitude of the result (and not in its direction), then the first and third result below are really the same, as are the second and fourth combination. Now, we’ve got pretty much the same in quantum math: we can, in theory, combine complex-valued amplitudes in four different ways but, in practice, we only have two (rather than four) types of behavior: bosons versus fermions.

[Image: the four ways of combining two vectors, of which only two differ in magnitude]

Is our zero-mass particle just the electric field vector?

Let’s analyze that ei(x/2 − t/2) = cos[(x−t)/2] + i∙sin[(x−t)/2] wavefunction some more. It’s easy to represent it graphically. The following animation does the trick:

[Animation: the rotating electric field vector of a circularly polarized electromagnetic wave]

I am sure you’ve seen this animation before: it represents a circularly polarized electromagnetic wave… Well… Let me be precise: it presents the electric field vector (E) of such a wave only. The B vector is not shown here, but you know where and what it is: orthogonal to the E vector, as shown below—for a linearly polarized wave.

[Image: a linearly polarized electromagnetic wave, with the E and B vectors orthogonal to each other and to the direction of propagation]

Let’s think some more. What is that ei(x/2 − t/2) function? It’s subject to conceiving time and distance as countable variables, right? I am tempted to say: as discrete variables, but I won’t go that far—not now—because the countability may be related to a particular interpretation of quantum physics. So I need to think about that. In any case… The point is that x can only take on values like 0, 1, 2, etcetera. And the same goes for t. To make things easy, we’ll not consider negative values for x right now (and, obviously, not for t either). So we’ve got an infinite set of points like:

  • ei(0/2 − 0/2) = cos(0) + i∙sin(0)
  • ei(1/2 − 0/2) = cos(1/2) + i∙sin(1/2)
  • ei(0/2 − 1/2) = cos(−1/2) + i∙sin(−1/2)
  • ei(1/2 − 1/2) = cos(0) + i∙sin(0)

Now, I quickly opened Excel and calculated those cosine and sine values for x and t going from 0 to 14 below. It’s really easy. Just five minutes of work. You should do it yourself as an exercise (or have a look at the short sketch after the graphs below). The result is shown below. Both graphs connect 14×14 = 196 data points, but you can see what’s going on: this does, effectively, represent the elementary wavefunction of a particle traveling in spacetime. In fact, you can see its speed is equal to 1, i.e. it effectively travels at the speed of light, as it should: the wave velocity is v = f·λ = (ω/2π)·(2π/k) = ω/k = (1/2)/(1/2) = 1. The amplitude of our wave doesn’t change along the x = t diagonal. As the Last Samurai puts it, just before he moves to the Other World: “Perfect! They are all perfect!” 🙂

[Graphs: the imaginary (sine) and real (cosine) part of ei(x/2 − t/2) over the grid of x, t = 0, 1, …, 14]
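
If you’d rather not open Excel, here’s the same five-minute exercise as a short numpy sketch:

```python
import numpy as np

x = np.arange(15)                # x = 0, 1, ..., 14
t = np.arange(15)                # t = 0, 1, ..., 14
X, T = np.meshgrid(x, t)

real_part = np.cos((X - T) / 2)  # Re(psi): the 'real' graph
imag_part = np.sin((X - T) / 2)  # Im(psi): the 'imaginary' graph

# The absolute square is 1 for all combinations ...
print(np.allclose(real_part**2 + imag_part**2, 1))   # True
# ... and the amplitude doesn't change along the x = t diagonal
print(np.allclose(np.diag(real_part), 1))            # True: cos(0) = 1
```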

In fact, in case you wonder how the quantum vacuum could possibly look like, you should probably think of these discrete spacetime points, and some complex-valued wave that travels as it does in the illustration above.

Of course, that elementary wavefunction above does not localize our particle. For that, we’d have to add a potentially infinite number of such elementary wavefunctions, so we’d write the wavefunction as a sum of ∑ aj·eiθj functions. [I use the j symbol here for the subscript, rather than the more conventional i symbol for a subscript, so as to avoid confusion with the i symbol used for the imaginary unit.] The aj coefficients are the contribution that each of these elementary wavefunctions would make to the composite wave. What could they possibly be? Not sure. Let’s first look at the argument of our elementary component wavefunctions. We’d inject uncertainty in it. So we’d say that m = E = p is equal to

m = E = p = ħ/2 + j·ħ with j = 0, 1, 2,…

That amounts to writing: m = E = p = ħ/2, ħ, 3ħ/2, 2ħ, 5ħ/2, etcetera. Waw! That’s nice, isn’t it? My intuition tells me that our aj coefficients will be smaller for higher j, so the aj(j) function would be some decreasing function. What shape? Not sure. Let’s first sum up our thoughts so far:

  1. The elementary wavefunction of a zero-mass particle (again, I mean zero rest mass) in free space is associated with an energy that’s equal to ħ/2.
  2. The zero-mass particle travels at the speed of light, obviously (because it has zero rest mass), and its kinetic energy is equal to E = m·v2/2 = m·c2/2.
  3. However, its total energy is equal to E = m·v2 = m·c2: it has some hidden energy. Why? Just because it exists.
  4. We may associate its kinetic energy with the real part of its wavefunction, and the hidden energy with its imaginary part. However, you should remember that the imaginary part of the wavefunction is as essential as its real part, so the hidden energy is equally real. 🙂

So… Well… Isn’t this just nice?

I think it is. Another obvious advantage of this way of looking at the elementary wavefunction is that – at first glance at least – it provides an intuitive understanding of why we need to take the (absolute) square of the wavefunction to find the probability of our particle being at some point in space and time. The energy of a wave is proportional to the square of its amplitude. Now, it is reasonable to assume the probability of finding our (point) particle would be proportional to the energy and, hence, to the square of the amplitude of the wavefunction, which is given by those aj(j) coefficients.

Huh?

OK. You’re right. I am a bit too fast here. It’s a bit more complicated than that, of course. The argument of probability being proportional to energy being proportional to the square of the amplitude of the wavefunction only works for a single wave a·eiθ. The argument does not hold water for a sum of functions ∑ ajeiθj. Let’s write it all out. Taking our m = E = p = ħ/2 + j·ħ = ħ/2, ħ, 3ħ/2, 2ħ, 5/2ħ,… formula into account, this sum would look like:

a1ei(x − t)(1/2) + a2ei(x − t)(2/2) + a3ei(x − t)(3/2) + a4ei(x − t)(4/2) + …

But—Hey! We can write this as some power series, can’t we? We just need to add a0ei(x − t)(0/2) = a0, and then… Well… It’s not so easy, actually. Who can help me? I am trying to find something like this:

[Image: a power series formula (the first one, valid for |x| < 1 only)]

Or… Well… Perhaps something like this:

[Image: a second power series formula]

Whatever power series it is, we should be able to relate it to this one—I’d hope:

[Image: a third power series formula]

Hmm… […] It looks like I’ll need to re-visit this, but I am sure it’s going to work out. Unfortunately, I’ve got no more time today, so I’ll let you have some fun now with all of this. 🙂 By the way, note that the result of the first power series is only valid for |x| < 1. 🙂
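
In the meanwhile, here’s a quick numerical illustration of the simplest candidate, i.e. the geometric series (presumably the ‘first’ power series mentioned above, as it’s the one that’s valid for |x| < 1 only), applied to our elementary building block with a made-up common ratio:

```python
import numpy as np

z = np.exp(1j * 0.7)   # e^(i(x-t)/2) for some arbitrary value of (x-t)/2
a = 0.5                # a made-up common ratio with |a| < 1

# Sum a^j·z^j for j = 0, 1, 2, ... and compare with the closed form 1/(1 - a*z)
partial_sum = sum((a * z)**j for j in range(200))
print(np.isclose(partial_sum, 1 / (1 - a * z)))   # True
```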

Note 1: What we should also do now is to re-insert mass in the equations. That should not be too difficult. It’s consistent with classical theory: the total energy of some moving mass is E = m·c2, out of which m·v2/2 is the classical kinetic energy. All the rest – i.e. m·c2 − m·v2/2 – is potential energy, and so that includes the energy that’s ‘hidden’ in the imaginary part of the wavefunction. 🙂

Note 2: I really didn’t pay much attention to dimensions when doing all of these manipulations above but… Well… I don’t think I did anything wrong. Just to give you some more feel for that wavefunction ei(kx − ωt), please do a dimensional analysis of its argument. I mean, k = p/ħ, and ω = E/ħ, so check the dimensions:

  • Momentum is expressed in newton·second, and we divide it by the quantum of action, which is expressed in newton·meter·second. So we get something per meter. But then we multiply it with x, so we get a dimensionless number.
  • The same is true for the ωt term. Energy is expressed in joule, i.e. newton·meter, and so we divide it by ħ once more, so we get something per second. But then we multiply it with t, so… Well… We do get a dimensionless number: a number that’s expressed in radians, to be precise. And so the radian does, indeed, integrate both the time as well as the distance dimension. 🙂

Schrödinger’s equation and the two de Broglie relations

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you immediately read the more recent exposé on the matter presented here, which you can find by clicking on the provided link. In fact, I actually made some (small) mistakes when writing the post below.

Original post:

I’ve re-visited the de Broglie equations a couple of times already. In this post, however, I want to relate them to Schrödinger’s equation. Let’s start with the de Broglie equations first. Equations. Plural. Indeed, most popularizing books on quantum physics will give you only one of the two de Broglie equations—the one that associates a wavelength (λ) with the momentum (p) of a matter-particle:

λ = h/p

In fact, even the Wikipedia article on the ‘matter wave’ starts off like that and is, therefore, very confusing, because, for a good understanding of quantum physics, one needs to realize that the λ = h/p equality is just one of a pair of two ‘matter wave’ equations:

  1. λ = h/p
  2. f = E/h

These two equations give you the spatial and temporal frequency of the wavefunction respectively. Now, those two frequencies are related – and I’ll show you how in a minute – but they are not the same. It’s like space and time: they are related, but they are definitely not the same. Now, because any wavefunction is periodic, the argument of the wavefunction – which we’ll introduce shortly – will be some angle and, hence, we’ll want to express it in radians (or – if you’re really old-fashioned – degrees). So we’ll want to express the frequency as an angular frequency (i.e. in radians per second, rather than in cycles per second), and the wavelength as a wave number (i.e. in radians per meter). Hence, you’ll usually see the two de Broglie equations written as:

  1. k = p/ħ
  2. ω = E/ħ

It’s the same: ω = 2π∙f and f = 1/T (T is the period of the oscillation), and k = 2π/λ, and then ħ = h/2π, of course! [Just to remove all ambiguities: stop thinking about degrees. They’re a legacy of the Babylonians, who thought the numbers 6, 12, and 60 had particular religious significance. So that’s why we have twelve-hour nights and twelve-hour days, with each hour divided into sixty minutes and each minute divided into sixty seconds, and – particularly relevant in this context – why ‘once around’ is divided into 6×60 = 360 degrees. Radians are the unit in which we should measure angles because… Well… Google it. They measure an angle in distance units. That makes things easier—a lot easier! Indeed, when studying physics, the last thing you want is artificial units, like degrees.]

So… Where were we? Oh… Yes. The de Broglie relation. Popular textbooks usually commit two sins. One is that they forget to say we have two de Broglie relations, and the other one is that the E = h∙f relationship is presented as the twin of the Planck-Einstein relation for photons, which relates the energy (E) of a photon to its frequency (ν): E = h∙ν = ħ∙ω. The former is criminal neglect, I feel. As for the latter… Well… It’s true and not true: it’s incomplete, I’d say, and, therefore, also very confusing.

Why? Because both things lead one to try to relate the two equations, as momentum and energy are obviously related. In fact, I’ve wasted days, if not weeks, on this. How are they related? What formula should we use? To answer that question, we need to answer another one: what energy concept should we use? Potential energy? Kinetic energy? Should we include the equivalent energy of the rest mass?

One quickly gets into trouble here. For example, one can try the kinetic energy, K.E. = m∙v²/2, and use the definition of momentum (p = m∙v), to write E = p²/(2m), and then we could relate the frequency f to the wavelength λ using the general rule that the traveling speed of a wave is equal to the product of its wavelength and its frequency (v = λ∙f). But if E = p²/(2m) and f = v/λ, we get:

p²/(2m) = h∙v/λ ⇔ λ = 2∙h/p

So that is almost right, but not quite: that factor 2 should not be there. In fact, it’s easy to see that we’d get de Broglie’s λ = h/p equation from his E = h∙f equation if we’d use E = m∙v² rather than E = m∙v²/2. In fact, the E = m∙v² relation comes out of them if we just multiply the two and, yes, use that v = f·λ relation once again:

  1. f·λ = (E/h)·(h/p) = E/p
  2. v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v²

But… Well… E = m∙v²? How could we possibly justify the use of that formula?

The answer is simple: our v = f·λ equation is wrong. It’s just something one shouldn’t apply to the complex-valued wavefunction. The ‘correct’ velocity formula for the complex-valued wavefunction should have that 1/2 factor, so we’d write 2·f·λ = v to make things come out alright. But where would this formula come from?

Well… Now it’s time to introduce the wavefunction.

The wavefunction

You know the elementary wavefunction:

ψ = ψ(x, t) = e^(−i(ωt − kx)) = e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt)

As for terminology, note that the term ‘wavefunction’ refers to what I write above, while the term ‘wave equation’ usually refers to Schrödinger’s equation, which I’ll introduce in a minute. Also note the use of boldface indicates we’re talking vectors, so we’re multiplying the wavenumber vector k with the position vector x = (x, y, z) here, although we’ll often simplify and assume one-dimensional space. In any case…

So the question is: why can’t we use the v = f·λ formula for this wave? The period of cosθ + isinθ is the same as that of the sine and cosine function considered separately: cos(θ+2π) + isin(θ+2π) = cosθ + isinθ, so T = 2π and f = 1/T = 1/2π do not change. So the f, T and λ should be the same, no?

No. We’ve got two oscillations for the price of one here: one ‘real’ and one ‘imaginary’—but both are equally essential and, hence, equally ‘real’. So we’re actually combining two waves. So it’s just like adding other waves: when adding waves, one gets a composite wave that has (a) a phase velocity and (b) a group velocity.

Huh? Yes. It’s quite interesting. When adding waves, we usually have a different ω and k for each of the component waves, and the phase and group velocity will depend on the relation between those ω’s and k’s. That relation is referred to as the dispersion relation. To be precise, if you’re adding waves, then the phase velocity of the composite wave will be equal to vp = ω/k, and its group velocity will be equal to vg = dω/dk. We’ll usually be interested in the group velocity and, to calculate that derivative, we need to express ω as a function of k, i.e. ω = ω(k). There are a number of possibilities then:

  1. ω and k may be directly proportional, so we can write ω as ω = a∙k: in that case, we find that vp = vg = a.
  2. ω and k are not directly proportional but have a linear relationship, so we can write ω as ω = a∙k + b. In that case, we find that vg = a and… Well… We’ve got a problem calculating vp, because we don’t know what k to use!
  3. ω and k may be non-linearly related, in which case… Well… One does have to do the calculation and see what comes out. 🙂

Let’s now look back at our e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt) function. You’ll say that we’ve got only one ω and one k here, so we’re not adding waves with different ω’s and k’s. So… Well… What?

That’s where the de Broglie equations come in. Look: k = p/ħ, and ω = E/ħ. If we now use the correct energy formula, i.e. the kinetic energy formula E = m·v²/2 (rather than that nonsensical E = m·v² equation) – which we can also write as E = m·v·v/2 = p·v/2 = p·p/2m = p²/2m, with v = p/m the classical velocity of the elementary particle that Louis de Broglie was thinking of – then we can calculate the group velocity of our e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt) as:

vg = dω/dk = d[E/ħ]/d[p/ħ] = dE/dp = d[p²/2m]/dp = 2p/2m = p/m = v

However, the phase velocity of our e^(i(kx − ωt)) is:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = (p²/2m)/p = p/2m = v/2

So that factor 1/2 only appears for the phase velocity. Weird, isn’t it? We find that the group velocity (vg) of the e^(i(kx − ωt)) function is equal to the classical velocity of our particle (i.e. v), but that its phase velocity (vp) is equal to v divided by 2.
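If you want to double-check the little derivation above, here’s a quick sympy sketch. It just re-does the two derivatives symbolically, using the kinetic-energy formula:

```python
import sympy as sp

p, m, hbar = sp.symbols('p m hbar', positive=True)
E = p**2 / (2*m)             # kinetic energy only
omega, k = E/hbar, p/hbar    # the two de Broglie relations: ω = E/ħ, k = p/ħ

v_g = sp.diff(omega, p) / sp.diff(k, p)   # group velocity dω/dk, via dω/dp over dk/dp
v_p = omega / k                           # phase velocity ω/k
print(sp.simplify(v_g), sp.simplify(v_p)) # p/m and p/(2*m), i.e. v and v/2
```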

Hmm… What to say? Well… Nothing much—except that it makes sense, and very much so, because it’s the group velocity of the wavefunction that’s associated with the classical velocity of a particle, not the phase velocity. In fact, if we include the rest mass in our energy formula, so if we’d use the relativistic E = γm0c² and p = γm0v formulas (with γ the Lorentz factor), then we find that vp = ω/k = E/p = (γm0c²)/(γm0v) = c²/v, and so that’s a superluminal velocity, because v is always smaller than c!

What? That’s even weirder! If we take the kinetic energy only, we find a phase velocity equal to v/2, but if we include the rest energy, then we get a superluminal phase velocity. It must be one or the other, no? Yep! You’re right! So that makes us wonder: is E = m·v²/2 really the right energy concept to use? The answer is unambiguous: no! It isn’t! And, just for the record, our young nobleman didn’t use the kinetic energy formula when he postulated his equations in his now famous PhD thesis.

So what did he use then? Where did he get his equations?

I am not sure. 🙂 A stroke of genius, it seems. According to Feynman, that’s how Schrödinger got his equation too: intuition, brilliance. In short, a stroke of genius. 🙂 Let’s relate these two gems.

Schrödinger’s equation and the two de Broglie relations

Louis de Broglie and Erwin Schrödinger published their equations in 1924 and 1926 respectively. Can they be related? The answer is: yes—of course! Let’s first look at de Broglie‘s energy concept, however. Louis de Broglie was very familiar with Einstein’s work and, hence, he knew that the energy of a particle consisted of three parts:

  1. The particle’s rest energy m0c², which de Broglie referred to as internal energy (Eint): this ‘internal energy’ includes the rest mass of the ‘internal pieces’, as he put it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy);
  2. Any potential energy it may have because of some field (so de Broglie was not assuming the particle was traveling in free space), which we’ll denote by V: the field(s) can be anything—gravitational, electromagnetic—you name it: whatever changes the energy because of the position of the particle;
  3. The particle’s kinetic energy, which we wrote in terms of its momentum p: K.E. = m·v²/2 = m²·v²/(2m) = (m·v)²/(2m) = p²/(2m).

Indeed, in my previous posts, I would write the wavefunction as de Broglie wrote it, which is as follows:

ψ(θ) = ψ(x, t) = a·e^(iθ) = a·e^(−i·[(Eint + p²/(2m) + V)·t − p∙x]/ħ)

In those posts – such as my post on virtual particles – I’d also note how a change in potential energy plays out: a change in potential energy, when moving from one place to another, would change the wavefunction, but through the momentum only—so it would impact the spatial frequency only. So the change in potential would not change the temporal frequencies ω1 = [Eint + p1²/(2m) + V1]/ħ and ω2 = [Eint + p2²/(2m) + V2]/ħ. Why? Or why not, I should say? Because of the energy conservation principle—or its equivalent in quantum mechanics. The temporal frequency f or ω, i.e. the time-rate of change of the phase of the wavefunction, does not change: all of the change in potential, and the corresponding change in kinetic energy, goes into changing the spatial frequency, i.e. the wave number k or the wavelength λ, as potential energy becomes kinetic or vice versa.

So is that consistent with what we wrote above, that E = m·v²? Maybe. Let’s think about it. Let’s first look at Schrödinger’s equation in free space (i.e. a space with zero potential) once again:

Schrodinger's equation 2

If we insert our ψ = e^(i(kx − ωt)) formula in Schrödinger’s free-space equation, we get the following nice result. [To keep things simple, we’re just assuming one-dimensional space for the calculations, so ∇²ψ = ∂²ψ/∂x². But the result can easily be generalized.] The time derivative on the left-hand side is ∂ψ/∂t = −iω·e^(i(kx − ωt)). The second-order derivative on the right-hand side is ∂²ψ/∂x² = (ik)·(ik)·e^(i(kx − ωt)) = −k²·e^(i(kx − ωt)). The e^(i(kx − ωt)) factor on both sides cancels out and, hence, equating both sides gives us the following condition:

−iω = −(iħ/2m)·k² ⇔ ω = (ħ/2m)·k²

Substituting ω = E/ħ and k = p/ħ yields:

E/ħ = (ħ/2m)·p²/ħ² = m²·v²/(2m·ħ) = m·v²/(2ħ) ⇔ E = m·v²/2

Bingo! We get that kinetic energy formula! But now… What if we’d not be considering free space? In other words: what if there is some potential? Well… We’d use the complete Schrödinger equation, which is:

schrodinger 5

Huh? Why is there a minus sign now? Look carefully: I moved the iħ factor on the left-hand side to the other side when writing the free-space version. If we’d do that for the complete equation, we’d get:

Schrodinger's equation 3

I like that representation a lot more—if only because it makes it a lot easier to interpret the equation—but, for some reason I don’t quite understand, you won’t find it like that in textbooks. Now how does it work when using the complete equation, so we add the −(i/ħ)·V·ψ term? It’s simple: the e^(i(kx − ωt)) factor also cancels out, and so we get:

−iω = −(iħ/2m)·k² − (i/ħ)·V ⇔ ω = (ħ/2m)·k² + V/ħ

Substituting ω = E/ħ and k = p/ħ once more now yields:

E/ħ = (ħ/2m)·p²/ħ² + V/ħ = m²·v²/(2m·ħ) + V/ħ = m·v²/(2ħ) + V/ħ ⇔ E = m·v²/2 + V

Bingo once more!
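If you don’t trust the cancelling-and-substituting above, here’s a small sympy sketch that re-does it symbolically: it plugs the elementary wavefunction into the complete (one-dimensional) equation and solves for ω:

```python
import sympy as sp

x, t, k, w, hbar, m, V = sp.symbols('x t k omega hbar m V', positive=True)
psi = sp.exp(sp.I * (k*x - w*t))          # the elementary wavefunction

# Complete 1D Schrödinger equation: iħ·∂ψ/∂t = −(ħ²/2m)·∂²ψ/∂x² + V·ψ
lhs = sp.I * hbar * sp.diff(psi, t)
rhs = -hbar**2/(2*m) * sp.diff(psi, x, 2) + V*psi

omega_cond = sp.solve(sp.Eq(lhs, rhs), w)[0]
print(sp.simplify(omega_cond))            # equals ħ·k²/(2m) + V/ħ

# Substituting ω = E/ħ and k = p/ħ recovers E = p²/(2m) + V = m·v²/2 + V
E, p = sp.symbols('E p', positive=True)
print(sp.solve(sp.Eq(E/hbar, omega_cond.subs(k, p/hbar)), E)[0])
```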

The only thing that’s missing now is the particle’s rest energy m0c², which de Broglie referred to as internal energy (Eint). That includes everything, i.e. not only the rest mass of the ‘internal pieces’ (as said, now we call those ‘internal pieces’ quarks) but also their binding energy (i.e. the quarks’ interaction energy). So how do we get that energy concept out of Schrödinger’s equation? There’s only one answer to that: that energy is just like V. We can, quite simply, just add it.

That brings us to the last and final question: what about our vg = v result if we do not use the kinetic energy concept, but the E = m·v²/2 + V + Eint concept? The answer is simple: nothing changes. We still get the same, because we’re taking a derivative and the V and Eint just appear as constants, and so their derivative with respect to p is zero. Check it:

vg = dω/dk = d[E/ħ]/d[p/ħ] = dE/dp = d[p²/2m + V + Eint]/dp = 2p/2m = p/m = v
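The same sympy one-liner confirms that the constants drop out. V and Eint are just symbols here, treated as independent of p:

```python
import sympy as sp

p, m, V, E_int, hbar = sp.symbols('p m V E_int hbar', positive=True)
E = p**2/(2*m) + V + E_int    # total energy; V and E_int do not depend on p
omega, k = E/hbar, p/hbar     # de Broglie relations

v_g = sp.diff(omega, p) / sp.diff(k, p)   # dω/dk, via the chain rule
print(sp.simplify(v_g))                   # p/m: the classical velocity v
```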

It’s now pretty clear how this thing works. To localize our particle, we just superimpose a zillion of these e^(i(ωt − kx)) equations. The only condition is that we’ve got that fixed vg = dω/dk = v relationship, and we do have such a fixed relationship—as you can see above. In fact, the Wikipedia article on the dispersion relation mentions that the de Broglie equations imply the following relation between ω and k: ω = ħk²/2m. As you can see, that’s not entirely correct: the author conveniently forgets the potential (V) and the rest energy (Eint) in the energy formula here!

What about the phase velocity? That’s a different story altogether. You can think about that for yourself. 🙂

I should make one final point here. As said, in order to localize a particle (or, to be precise, its wavefunction), we’re going to add a zillion elementary wavefunctions, each of which will make its own contribution to the composite wave. That contribution is captured by some coefficient ai in front of every e^(iθi) function, so we’ll have a zillion ai·e^(iθi) functions, really. [Yep. Bit confusing: I use i here as a subscript, as well as for the imaginary unit.] In case you wonder how that works out with Schrödinger’s equation, the answer is – once again – very simple: both the time derivative (which is just a first-order derivative) and the Laplacian are linear operators, so Schrödinger’s equation, for a composite wave, can just be re-written as the sum of a zillion ‘elementary’ wave equations.

So… Well… We’re all set now to effectively use Schrödinger’s equation to calculate the orbitals for a hydrogen atom, which is what we’ll do in our next post.

In the meanwhile, you can amuse yourself with reading a nice Wikibook article on the Laplacian, which gives you a nice feel for what Schrödinger’s equation actually represents—even if I gave you a good feel for that too on my Essentials page. Whatever. You choose. Just let me know what you liked best. 🙂

Oh… One more point: the vg = dω/dk = d[p²/2m]/dp = p/m = v calculation obviously assumes we can treat m as a constant. In fact, what we’re actually doing is a rather complicated substitution of variables: you should write it all out—but that’s not the point here. The point is that we’re actually doing a non-relativistic calculation. Now, that does not mean that the wavefunction isn’t consistent with special relativity. It is. In fact, in one of my posts, I show how we can explain relativistic length contraction using the wavefunction. But it does mean that our calculation of the group velocity is not relativistically correct. But that’s a minor point: I’ll leave it for you as an exercise to calculate the relativistically correct formula for the group velocity. Have fun with it! 🙂

Note: Notations are often quite confusing. One should, generally speaking, denote a frequency by ν (nu), rather than by f, so as to not cause confusion with any function f, but then… Well… You create a new problem when you do that, because that Greek letter nu (ν) looks damn similar to the v of velocity, so that’s why I’ll often use f when I should be using nu (ν). As for the units, a frequency is expressed in cycles per second, while the angular frequency ω is expressed in radians per second. One cycle covers 2π radians and, therefore, we can write: ν = ω/2π. Hence, h∙ν = h∙ω/2π = ħ∙ω. Both ν and ω measure the time-rate of change of the phase of the wavefunction, as opposed to k, i.e. the spatial frequency of the wavefunction, which depends on the speed of the wave. Physicists also often use the symbol v for the speed of a wave, which is also hugely confusing, because it’s also used to denote the classical velocity of the particle. And then there are two wave velocities, of course: the group versus the phase velocity. In any case… I find the use of that other symbol (c) for the wave velocity even more confusing, because this symbol is also used for the speed of light, and the speed of a wave is not necessarily (read: usually not) equal to the speed of light. In fact, both the group as well as the phase velocity of a particle wave are very different from the speed of light. The speed of a wave and the speed of light only coincide for electromagnetic waves and, even then, it should be noted that photons also have amplitudes to travel faster or slower than the speed of light.

Schrödinger’s equation as an energy conservation law

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter that is being presented here, which you can find by clicking on the provided link.

Original post:

In the movie about Stephen Hawking’s life, The Theory of Everything, there is talk about a single unifying equation that would explain everything in the universe. I must assume the real Stephen Hawking is familiar with Feynman’s unworldliness equation: U = 0, which – as Feynman convincingly demonstrates – effectively integrates all known equations in physics. It’s one of Feynman’s many jokes, of course, but an exceptionally clever one, as the argument convincingly shows there’s no such thing as one equation that explains all. Or, to be precise, one can, effectively, ‘hide‘ all the equations you want in a single equation, but it’s just a trick. As Feynman puts it: “When you unwrap the whole thing, you get back where you were before.”

Having said that, some equations in physics are obviously more fundamental than others. You can readily think of obvious candidates: Einstein’s mass-energy equivalence (m = E/c²); the wavefunction (ψ = e^(i(ω·t − k·x))) and the two de Broglie relations that come with it (ω = E/ħ and k = p/ħ); and, of course, Schrödinger’s equation, which we wrote as:

Schrodinger's equation

In my post on quantum-mechanical operators, I drew your attention to the fact that this equation is structurally similar to the heat diffusion equation. Indeed, assuming the heat per unit volume (q) is proportional to the temperature (T) – which is the case when expressing T in degrees Kelvin (K), so we can write q as q = k·T  – we can write the heat diffusion equation as:

heat diffusion 2

Moreover, I noted the similarity is not only structural. There is more to it: both equations model energy flows and/or densities. Look at it: the dimension of the left- and right-hand side of Schrödinger’s equation is the energy dimension: both quantities are expressed in joule. [Remember: a time derivative is a quantity expressed per second, and the dimension of Planck’s constant is the joule·second. You can figure out the dimension of the right-hand side yourself.] Now, the time derivative on the left-hand side of the heat diffusion equation is expressed in K/s. The constant in front (k) is just the (volume) heat capacity of the substance, which is expressed in J/(m³·K). So the dimension of k·(∂T/∂t) is J/(m³·s). On the right-hand side we have that Laplacian, whose dimension is K/m², multiplied by the thermal conductivity, whose dimension is W/(m·K) = J/(m·s·K). Hence, the dimension of the product is the same as that of the left-hand side: J/(m³·s).

We can present the thing in various ways: if we bring k to the other side, then we’ve got something expressed per second on the left-hand side, and something expressed per square meter on the right-hand side—but the k/κ factor makes it alright. The point is: both Schrödinger’s equation as well as the diffusion equation are actually an expression of the energy conservation law. They’re both expressions of Gauss’ flux theorem (but in differential form, rather than in integral form) which, as you know, pops up everywhere when talking energy conservation.

Huh? 

Yep. I’ll give another example. Let me first remind you that the k·(∂T/∂t) = ∂q/∂t = κ·∇²T equation can also be written as:

heat diffusion 3

The h in this equation is, obviously, not Planck’s constant, but the heat flow vector, i.e. the heat that flows through a unit area in a unit time, and h is, obviously, equal to h = −κ∇T. And, of course, you should remember your vector calculus here: ∇· is the divergence operator. In fact, we used the equation above, with ∇·h rather than ∇²T, to illustrate the energy conservation principle. Now, you may or may not remember that we gave you a similar equation when talking about the energy of fields and the Poynting vector:

Poynting vector

This immediately triggers the following reflection: if there’s a ‘Poynting vector’ for heat flow (h), and for the energy of fields (S), then there must be some kind of ‘Poynting vector’ for amplitudes too! I don’t know which one, but it must exist! And it’s going to be some complex vector, no doubt! But it should be out there.

It also makes me think of a point I’ve made a couple of times already—about the similarity between the E and B vectors that characterize the traveling electromagnetic field, and the real and imaginary part of the traveling amplitude. Indeed, the similarity between the two illustrations below cannot be a coincidence. In both cases, we’ve got two oscillating magnitudes that are orthogonal to each other, always. As such, they’re not independent: one follows the other, or vice versa.

5d_euler_fFG02_06

The only difference is the phase shift. Euler’s formula incorporates a phase shift—remember: sinθ = cos(θ − π/2)—and so you don’t have that with the E and B vectors. But – Hey! – isn’t that why bosons and fermions are different? 🙂

[…]

This is great fun, and I’ll surely come back to it. As for now, however, I’ll let you ponder the matter for yourself. 🙂

Post scriptum: I am sure that all the questions I raise here will be answered at the Masters’ level, most probably in some course dealing with quantum field theory, of course. 🙂 In any case, I am happy I can already anticipate such questions. 🙂

Oh – and, as for those two illustrations above, the animation below is one that should help you to think things through. It’s the electric field vector of a traveling circularly polarized electromagnetic wave, as opposed to the linearly polarized light that was illustrated above.

Animation

Quantum-mechanical operators

We climbed a mountain—step by step, post by post. 🙂 We have reached the top now, and the view is gorgeous. We understand Schrödinger’s equation, which describes how amplitudes propagate through space-time. It’s the quintessential quantum-mechanical expression. Let’s enjoy now, and deepen our understanding by introducing the concept of (quantum-mechanical) operators.

The operator concept

We’ll introduce the operator concept using Schrödinger’s equation itself and, in the process, deepen our understanding of Schrödinger’s equation a bit. You’ll remember we wrote it as:

schrodinger 5

However, you’ve probably seen it like it’s written on his bust, or on his grave, or wherever, which is as follows:

simple

Grave

It’s the same thing, of course. The ‘over-dot’ is Newton’s notation for the time derivative. In fact, if you click on the picture above (and zoom in a bit), then you’ll see that the craftsman who made the stone grave marker, mistakenly, also carved a dot above the psi (ψ) on the right-hand side of the equation—but then someone pointed out his mistake and so the dot on the right-hand side isn’t painted. 🙂 The thing I want to talk about here, however, is the H in that expression above, which is, obviously, the following operator:

H

That’s a pretty monstrous operator, isn’t it? It is what it is, however: an algebraic operator (it operates on a number—albeit a complex number—unlike a matrix operator, which operates on a vector or another matrix). As you can see, it actually consists of two other (algebraic) operators:

  1. The ∇² operator, which you know: it’s a differential operator. To be specific, it’s the Laplace operator, which is the divergence (∇·) of the gradient (∇) of a function: ∇² = ∇·∇ = (∂/∂x, ∂/∂y, ∂/∂z)·(∂/∂x, ∂/∂y, ∂/∂z) = ∂²/∂x² + ∂²/∂y² + ∂²/∂z². This too operates on our complex-valued wavefunction ψ, and yields some other complex-valued function, which we then multiply by −ħ²/2m to get the first term.
  2. The V(x, y, z) ‘operator’, which—in this particular context—just means: “multiply with V”. Needless to say, V is the potential here, and so it captures the presence of external force fields. Also note that V is a real number, just like −ħ²/2m.

Let me say something about the dimensions here. On the left-hand side of Schrödinger’s equation, we have the product of ħ and a time derivative (i is just the imaginary unit, so that’s just a (complex) number). Hence, the dimension there is [J·s]/[s] (the dimension of a time derivative is something expressed per second). So the dimension of the left-hand side is joule. On the right-hand side, we’ve got two terms. The dimension of that second-order derivative (∇²ψ) is something expressed per square meter, but then we multiply it with −ħ²/2m, whose dimension is [J²·s²]/[J/(m²/s²)]. [Remember: m = E/c².] So that reduces to [J·m²]. Hence, the dimension of (−ħ²/2m)∇²ψ is joule. And the dimension of V is joule too, of course. So it all works out. In fact, now that we’re here, it may or may not be useful to remind you of that heat diffusion equation we discussed when introducing the basic concepts involved in vector analysis:

diffusion equation

That equation illustrated the physical significance of the Laplacian. We were talking about the flow of heat in, say, a block of metal, as illustrated below. The q in the equation above is the heat per unit volume, and the h in the illustration below was the heat flow vector (so it’s got nothing to do with Planck’s constant), which depended on the material, and which we wrote as h = −κ∇T, with T the temperature, and κ (kappa) the thermal conductivity. In any case, the point is the following: the equation below illustrates the physical significance of the Laplacian. We let it operate on the temperature (i.e. a scalar function) and its product with some constant (just think of replacing κ by −ħ²/2m) gives us the time derivative of q, i.e. the heat per unit volume.

heat flow

In fact, we know that q is proportional to T, so if we’d choose an appropriate temperature scale – i.e. choose the zero point such that q = k·T, with k what your physics teacher in high school would refer to as the (volume) specific heat capacity – then we can simply write:

∂T/∂t = (κ/k)∇²T

From a mathematical point of view, that equation is just the same as ∂ψ/∂t = (i·ħ/2m)·∇²ψ, which is Schrödinger’s equation for V = 0. In other words, you can – and actually should – also think of Schrödinger’s equation as describing the flow of… Well… What?

Well… Not sure. I am tempted to think of something like a probability density in space, but ψ represents a (complex-valued) amplitude. Having said that, you get the idea—I hope! 🙂 If not, let me paraphrase Feynman on this:

“We can think of Schrödinger’s equation as describing the diffusion of a probability amplitude from one point to another. In fact, the equation looks something like the diffusion equation we introduced when discussing heat flow, or the spreading of a gas. But there is one main difference: the imaginary coefficient in front of the time derivative makes the behavior completely different from the ordinary diffusion such as you would have for a gas spreading out. Ordinary diffusion gives rise to real exponential solutions, whereas the solutions of Schrödinger’s equation are complex waves.”

That says it all, right? 🙂 In fact, Schrödinger’s equation – as discussed here – was actually derived when describing the motion of an electron along a line of atoms, i.e. for motion in one direction only, but you can visualize what it represents in three-dimensional space. The real exponential functions Feynman refers to are exponential decay functions: as the energy is spread over an ever-increasing volume, the amplitude of the wave becomes smaller and smaller. That may be the case for complex-valued exponentials as well. The key difference between a real- and complex-valued exponential decay function is that a complex exponential is a cyclical function. Now, I quickly googled to see how we could visualize that, and I like the following illustration:

decay
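The difference Feynman points at is easy to see for a single Fourier mode e^(ikx): the diffusion equation damps its amplitude exponentially, while the free-space Schrödinger equation just rotates its phase. A minimal sketch (the numbers are arbitrary, and I am using ∂T/∂t = (κ/k)∇²T and ∂ψ/∂t = (iħ/2m)∇²ψ as above):

```python
import numpy as np

k_mode = 2.0                # spatial frequency of the Fourier mode e^(ikx)
t = np.linspace(0, 5, 6)    # a few sample times (arbitrary units)
D = 0.3                     # diffusivity κ/k in the heat equation
hbar_over_2m = 0.3          # ħ/2m in the free-space Schrödinger equation

T_mode   = np.exp(-D * k_mode**2 * t)                  # real exponential: decays
psi_mode = np.exp(-1j * hbar_over_2m * k_mode**2 * t)  # complex: just rotates

print(T_mode)            # shrinks toward zero
print(np.abs(psi_mode))  # stays equal to 1: the amplitude only changes phase
```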

The dimensional analysis of Schrödinger’s equation is also quite interesting because… Well… Think of it: that heat diffusion equation incorporates the same dimensions: temperature is a measure of the average energy of the molecules. That’s really something to think about. These differential equations are not only structurally similar but, in addition, they all seem to describe some flow of energy. That’s pretty deep stuff: it relates amplitudes to energies, so we should think in terms of Poynting vectors and all that. But… Well… I need to move on, and so I will move on—so you can re-visit this later. 🙂

Now that we’ve introduced the concept of an operator, let me say something about notations, because that’s quite confusing.

Some remarks on notation

Because it’s an operator, we should actually use the hat symbol—in line with what we did when we were discussing matrix operators: we’d distinguish the matrix (e.g. A) from its use as an operator (Â). You may or may not remember we do the same in statistics: the hat symbol is supposed to distinguish the estimator (â) – i.e. some function we use to estimate a parameter (which we usually denoted by some Greek symbol, like α) – from a specific estimate of the parameter, i.e. the value (a) we get when applying â to a specific sample or observation. However, if you remember the difference, you’ll also remember that hat symbol was quickly forgotten, because the context made it clear what was what, and so we’d just write a(x) instead of â(x). So… Well… I’ll be sloppy as well here, if only because the WordPress editor only offers very few symbols with a hat! 🙂

In any case, this discussion on the use (or not) of that hat is irrelevant. In contrast, what is relevant is to realize this algebraic operator H here is very different from that other quantum-mechanical Hamiltonian operator we discussed when dealing with a finite set of base states: that H was the Hamiltonian matrix, but used in an ‘operation’ on some state. So we have the matrix operator H, and the algebraic operator H.

Confusing?

Yes and no. First, we’ve got the context again, and so you always know whether you’re looking at continuous or discrete stuff:

  1. If your ‘space’ is continuous (i.e. if states are to be defined with reference to an infinite set of base states), then it’s the algebraic operator.
  2. If, on the other hand, your states are defined by some finite set of discrete base states, then it’s the Hamiltonian matrix.

There’s another, more fundamental, reason why there should be no confusion. In fact, it’s the reason why physicists use the same symbol H in the first place: despite the fact that they look so different, these two operators (i.e. H the algebraic operator and H the matrix operator) are actually equivalent. Their interpretation is similar, as evidenced from the fact that both are being referred to as the energy operator in quantum physics. The only difference is that one operates on a (state) vector, while the other operates on a continuous function. It’s just the difference between matrix mechanics and wave mechanics, really.

But… Well… I am sure I’ve confused you by now—and probably very much so—and so let’s start from the start. 🙂

Matrix mechanics

Let’s start with the easy thing indeed: matrix mechanics. The matrix-mechanical approach is summarized in that set of Hamiltonian equations which, by now, you know so well:

new

If we have n base states, then we have n equations like this: one for each i = 1, 2,… n. As for the introduction of the Hamiltonian, and the other subscript (j), just think of the description of a state:

essential

So… Well… Because we had used i already, we had to introduce j. 🙂

Let’s think about |ψ〉. It is the state of a system, like the ground state of a hydrogen atom, or one of its many excited states. But… Well… It’s a bit of a weird term, really. It all depends on what you want to measure: when we’re thinking of the ground state, or an excited state, we’re thinking energy. That’s something else than thinking about its position in space, for example. Always remember: a state is defined by a set of base states, and so those base states come with a certain perspective: when talking states, we’re only looking at some aspect of reality, really. Let’s continue with our example of energy states, however.

You know that the lifetime of a system in an excited state is usually short: some spontaneous or induced emission of a quantum of energy (i.e. a photon) will ensure that the system quickly returns to a less excited state, or to the ground state itself. However, you shouldn’t think of that here: we’re looking at stable systems here. To be clear: we’re looking at systems that have some definite energy—or so we think: it’s just because of the quantum-mechanical uncertainty that we’ll always measure some other value. Does that make sense?

If it doesn’t… Well… Stop reading, because it’s only going to get even more confusing. Not my fault, however!

Psi-chology

The ubiquity of that ψ symbol (i.e. the Greek letter psi) is really something psi-chological 🙂 and, hence, very confusing, really. In matrix mechanics, our ψ would just denote a state of a system, like the energy of an electron (or, when there’s only one electron, our hydrogen atom). If it’s an electron, then we’d describe it by its orbital. In this regard, I found the following illustration from Wikipedia particularly helpful: the green orbitals show excitations of copper (Cu) orbitals on a CuO2 plane. [The two big arrows just illustrate the principle of X-ray spectroscopy, so it’s an X-ray probing the structure of the material.]

800px-CuO2-plane_in_high_Tc_superconductor

So… Well… We’d write ψ as |ψ〉 just to remind ourselves we’re talking of some state of the system indeed. However, quantum physicists always want to confuse you, and so they will also use the psi symbol to denote something else: they’ll use it to denote a very particular Ci amplitude (or coefficient) in that |ψ〉 = ∑|i〉Ci formula above. To be specific, they’d replace the base states |i〉 by the continuous position variable x, and they would write the following:

Ci = ψ(i = x) = ψ(x) = Cψ(x) = C(x) = 〈x|ψ〉

In fact, that’s just like writing:

φ(p) = 〈 mom p | ψ 〉 = 〈p|ψ〉 = Cφ(p) = C(p)

What they’re doing here, is (1) reduce the ‘system‘ to a ‘particle‘ once more (which is OK, as long as you know what you’re doing) and (2) they basically state the following:

If a particle is in some state |ψ〉, then we can associate some wavefunction – ψ(x) or φ(p) – with it, and that wavefunction will represent the amplitude for the system (i.e. our particle) to be at x, or to have a momentum that’s equal to p.

So what’s wrong with that? Well… Nothing. It’s just that… Well… Why don’t they use χ(x) instead of ψ(x)? That would avoid a lot of confusion, I feel: one should not use the same symbol (psi) for the |ψ〉 state and the ψ(x) wavefunction.

Huh? Yes. Think about it. The point is: the position or the momentum, or even the energy, are properties of the system, so to speak and, therefore, it’s really confusing to use the same symbol psi (ψ) to describe (1) the state of the system, in general, versus (2) the position wavefunction, which describes… Well… Some very particular aspect (or ‘state’, if you want) of the same system (in this case: its position). There’s no such problem with φ(p), so… Well… Why don’t they use χ(x) instead of ψ(x) indeed? I have only one answer: psi-chology. 🙂

In any case, there’s nothing we can do about it and… Well… In fact, that’s what this post is about: it’s about how to describe certain properties of the system. Of course, we’re talking quantum mechanics here and, hence, uncertainty, and, therefore, we’re going to talk about the average position, energy, momentum, etcetera that’s associated with a particular state of a system, or—as we’ll keep things very simple—the properties of a ‘particle’, really. Think of an electron in some orbital, indeed! 🙂

So let’s now look at that set of Hamiltonian equations once again:

new

Looking at it carefully – so just look at it once again! 🙂 – and thinking about what we did when going from the discrete to the continuous setting, we can now understand we should write the following for the continuous case:

equivalence

Of course, combining Schrödinger’s equation with the expression above implies the following:

equality

Now how can we relate that integral to the expression on the right-hand side? I’ll have to disappoint you here, as it requires a lot of math to transform that integral. It requires writing H(x, x’) in terms of rather complicated functions, including – you guessed it, didn’t you? – Dirac’s delta function. Hence, I assume you’ll believe me if I say that the matrix- and wave-mechanical approaches are actually equivalent. In any case, if you’d want to check it, you can always read Feynman yourself. 🙂

Now, I wrote this post to talk about quantum-mechanical operators, so let me do that now.

Quantum-mechanical operators

You know the concept of an operator. As mentioned above, we should put a little hat (^) on top of our Hamiltonian operator, so as to distinguish it from the matrix itself. However, as mentioned above, the difference is usually quite clear from the context. Our operators were all matrices so far, and we’d write the matrix elements of, say, some operator A, as:

Aij ≡ 〈 i | A | j 〉

The whole matrix itself, however, would usually not act on a base state but… Well… Just on some more general state ψ, to produce some new state φ, and so we’d write:

| φ 〉 = A | ψ 〉

Of course, we’d have to describe | φ 〉 in terms of the (same) set of base states and, therefore, we’d expand this expression into something like this:

operator 2

You get the idea. I should just add one more thing. You know this important property of amplitudes: the 〈 ψ | φ 〉 amplitude is the complex conjugate of the 〈 φ | ψ 〉 amplitude. It’s got to do with time reversibility, because the complex conjugate of e^(iθ) = e^(i(ω·t−k·x)) is equal to e^(−iθ) = e^(−i(ω·t−k·x)), so we’re just reversing the x- and t-direction. We write:

 〈 ψ | φ 〉 = 〈 φ | ψ 〉*

Now what happens if we want to take the complex conjugate when we insert a matrix, so when writing 〈 φ | A | ψ 〉 instead of 〈 φ | ψ 〉? This rule becomes:

〈 φ | A | ψ 〉* = 〈 ψ | A† | φ 〉

The dagger symbol denotes the conjugate transpose, so A† is an operator whose matrix elements are equal to Aij† = Aji*. Now, it may or may not happen that the A† matrix is actually equal to the original A matrix. In that case – and only in that case – we can write:

〈 ψ | A | φ 〉 = 〈 φ | A | ψ 〉*

We then say that A is a ‘self-adjoint’ or ‘Hermitian’ operator. That’s just a definition of a property, which the operator may or may not have—but many quantum-mechanical operators are actually Hermitian. In any case, we’re well armed now to discuss some actual operators, and we’ll start with that energy operator.
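A quick numpy sketch may help to see what that definition amounts to in practice. The matrix and the two states below are just random toy data:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
A = (M + M.conj().T) / 2          # force A to be Hermitian: A† = A

phi = rng.normal(size=3) + 1j * rng.normal(size=3)
psi = rng.normal(size=3) + 1j * rng.normal(size=3)

lhs = np.conj(psi) @ A @ phi              # 〈 ψ | A | φ 〉
rhs = np.conj(np.conj(phi) @ A @ psi)     # 〈 φ | A | ψ 〉*
print(np.isclose(lhs, rhs))               # True, because A is Hermitian
```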

The energy operator (H)

We know the state of a system is described in terms of a set of base states. Now, our analysis of N-state systems showed we can always describe it in terms of a special set of base states, which are referred to as the states of definite energy because… Well… Because they’re associated with some definite energy. In that post, we referred to these energy levels as En (n = I, II,… N). We used boldface for the subscript (so we wrote the index as a boldface n, rather than as a plain n) because of these Roman numerals. With each energy level, we could associate a base state, of definite energy indeed, that we wrote as |n〉. To make a long story short, we summarized our results as follows:

  1. The energies EI, EII,…, En,…, EN are the eigenvalues of the Hamiltonian matrix H.
  2. The state vectors |n〉 that are associated with each energy En, i.e. the set of vectors |n〉, are the corresponding eigenstates.

We’ll be working with some more subscripts in what follows, and these Roman numerals and the boldface notation are somewhat confusing (if only because I don’t want you to think of these subscripts as vectors), so we’ll just denote EI, EII,…, En,…, EN as E1, E2,…, Ei,…, EN, and we’ll number the states of definite energy accordingly, also using some Greek letter so as to clearly distinguish them from all our Latin letter symbols: we’ll write these states as: |η1〉, |η2〉,… |ηN〉. [If I say, ‘we’, I mean Feynman of course. You may wonder why he doesn’t write |Ei〉, or |εi〉. The answer is: writing |Ei〉 would cause confusion, because this state will appear in expressions like: |Ei〉Ei, so that’s the ‘product’ of a state (|Ei〉) and the associated scalar (Ei). Too confusing. As for using η (eta) instead of ε (epsilon) to denote something that’s got to do with energy… Well… I guess he wanted to keep the resemblance with the n, and the Ancient Greeks apparently did use this η letter for a sound like ‘e‘ so… Well… Why not? Let’s get back to the lesson.]

Using these base states of definite energy, we can write the state of the system as:

|ψ〉 = ∑ |ηi〉·Ci = ∑ |ηi〉〈ηi|ψ〉    over all i (i = 1, 2,… , N)

Now, we didn’t talk all that much about what these base states actually mean in terms of measuring something but you’ll believe me if I say that, when measuring the energy of the system, we’ll always measure one or the other E1, E2,…, Ei,…, EN value. We’ll never measure something in-between: it’s either–or. Now, as you know, measuring something in quantum physics is supposed to be destructive but… Well… Let us imagine we could make a thousand measurements to try to determine the average energy of the system. We’d do so by counting the number of times we measure E1 (and of course we’d denote that number as N1), E2, E3, etcetera. You’ll agree that we’d measure the average energy as:

E average

However, measurement is destructive, and we actually know what the expected value of this ‘average’ energy will be, because we know the probabilities of finding the system in a particular base state. That probability is equal to the absolute square of that Ci coefficient above, so we can use the Pi = |Ci|² formula to write:

〈Eav〉 = ∑ Pi·Ei over all i (i = 1, 2,… , N)

Note that this is a rather general formula. It’s got nothing to do with quantum mechanics: if Ai represents the possible values of some quantity A, and Pi is the probability of getting that value, then (the expected value of) the average A will also be equal to 〈Aav〉 = ∑ Pi Ai. No rocket science here! 🙂 But let’s now apply our quantum-mechanical formulas to that 〈Eav〉 = ∑ Pi Ei formula. [Oh—and I apologize for using the same angle brackets 〈 and 〉 to denote an expected value here—sorry for that! But it’s what Feynman does—and other physicists! You see: they don’t really want you to understand stuff, and so they often use very confusing symbols.] Remembering that the absolute square of a complex number equals the product of that number and its complex conjugate, we can re-write the 〈Eav〉 = ∑ Pi Ei formula as:

〈Eav〉 = ∑ Pi·Ei = ∑ |Ci|²·Ei = ∑ Ci*·Ci·Ei = ∑ 〈ψ|ηi〉〈ηi|ψ〉·Ei = ∑ 〈ψ|ηi〉·Ei·〈ηi|ψ〉 over all i

Now, you know that Dirac’s bra-ket notation allows numerous manipulations. For example, what we could do is take out that ‘common factor’ 〈ψ|, and so we may re-write that monster above as:

〈Eav〉 = 〈ψ|·∑ |ηi〉·Ei·〈ηi|ψ〉 = 〈ψ|φ〉, with |φ〉 = ∑ |ηi〉·Ei·〈ηi|ψ〉 over all i

Huh? Yes. Note the difference between |ψ〉 = ∑ |ηi〉·Ci = ∑ |ηi〉〈ηi|ψ〉 and |φ〉 = ∑ |ηi〉·Ei·〈ηi|ψ〉. As Feynman puts it: φ is just some ‘cooked-up‘ state which you get by taking each of the base states |ηi〉 in the amount Ei〈ηi|ψ〉 (as opposed to the 〈ηi|ψ〉 amounts we took for ψ).

I know: you’re getting tired and you wonder why we need all this stuff. Just hang in there. We’re almost done. I just need to do a few more unpleasant things, one of which is to remind you that this business of the energy states being eigenstates (and the energy levels being eigenvalues) of our Hamiltonian matrix (see my post on N-state systems) comes with a number of interesting properties, including this one:

H|ηi〉 = Ei|ηi〉 = |ηi〉·Ei

Just think about what’s written here: on the left-hand side, we’re multiplying a matrix with a (base) state vector, and on the right-hand side we’re multiplying it with a scalar. So our |φ〉 = ∑ |ηi〉·Ei·〈ηi|ψ〉 sum now becomes:

|φ〉 = ∑ H |ηi〉〈ηi|ψ〉 over all (i = 1, 2,… , N)

Now we can manipulate that expression some more so as to get the following:

|φ〉 = H ∑|ηi〉〈ηi|ψ〉 = H|ψ〉

Finally, we can re-combine this now with the 〈Eav〉 = 〈ψ|φ〉 equation above, and so we get the fantastic result we wanted:

〈Eav〉 = 〈 ψ | φ 〉 = 〈 ψ | H ψ 〉

Huh? Yes! To get the average energy, you operate on |ψ〉 with H, and then you multiply the result with 〈ψ|. It’s a beautiful formula. On top of that, the new formula for the average energy is not only pretty but also useful, because now we don’t need to say anything about any particular set of base states. We don’t even have to know all of the possible energy levels. When we have to calculate the average energy of some system, we only need to be able to describe the state of that system in terms of some set of base states, and we also need to know the Hamiltonian matrix for that set, of course. But if we know that, we can calculate its average energy.
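Here’s a small numpy sketch that checks the two formulas against each other. The Hamiltonian is just a random Hermitian matrix, so the numbers mean nothing physically, but 〈ψ|H|ψ〉 and ∑ Pi·Ei do come out the same:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2            # a toy Hermitian Hamiltonian

psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)          # normalize, so the probabilities add up to 1

E, eta = np.linalg.eigh(H)          # eigenvalues E_i and eigenstates |η_i〉 (columns)
P = np.abs(eta.conj().T @ psi)**2   # P_i = |〈η_i|ψ〉|²

print((np.conj(psi) @ H @ psi).real)   # 〈ψ|H|ψ〉
print(np.sum(P * E))                   # ∑ P_i·E_i: the same number
```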

You’ll say that’s not a big deal because… Well… If you know the Hamiltonian, you know everything, so… Well… Yes. You’re right: it’s less of a big deal than it seems. Having said that, the whole development above is very interesting because of something else: we can easily generalize it for other physical measurements. I call it the ‘average value’ operator idea, but you won’t find that term in any textbook. 🙂 Let me explain the idea.

The average value operator (A)

The development above illustrates how we can relate a physical observable, like the (average) energy (E), to a quantum-mechanical operator (H). Now, the development above can easily be generalized to any observable that would be proportional to the energy. It’s perfectly reasonable, for example, to assume the angular momentum – as measured in some direction, of course, which we usually refer to as the z-direction – would be proportional to the energy, and so then it would be easy to define a new operator Lz, which we’d define as the operator of the z-component of the angular momentum L. [I know… That’s a bit of a long name but… Well… You get the idea.] So we can write:

〈Lz〉av = 〈 ψ | Lz ψ 〉

In fact, further generalization yields the following grand result:

If a physical observable A is related to a suitable quantum-mechanical operator Â, then the average value of A for the state | ψ 〉 is given by:

〈A〉av = 〈 ψ | Â ψ 〉 = 〈 ψ | φ 〉, with | φ 〉 = Â | ψ 〉

At this point, you may have second thoughts, and wonder: what state | ψ 〉? The answer is: it doesn’t matter. It can be any state, as long as we’re able to describe it in terms of a chosen set of base states. 🙂

OK. So far, so good. The next step is to look at how this works for the continuity case.

The energy operator for wavefunctions (H)

We can start thinking about the continuous equivalent of the 〈Eav〉 = 〈ψ|H|ψ〉 expression by first expanding it. We write:

e average continuous function

You know the continuous equivalent of a sum like this is an integral, i.e. an infinite sum. Now, because we’ve got two subscripts here (i and j), we get the following double integral:

double integral

Now, I did take my time to walk you through Feynman’s derivation of the energy operator for the discrete case, i.e. the operator when we’re dealing with matrix mechanics, but I think I can simplify my life here by just copying Feynman’s succinct development:

Feynman

Done! Given a wavefunction ψ(x), we get the average energy by doing that integral above. Now, the quantity in the braces of that integral can be written as that operator we introduced when we started this post:

H

So now we can write that integral much more elegantly. It becomes:

Eav = ∫ ψ*(x)·H·ψ(x) dx

You’ll say that doesn’t look like 〈Eav〉 = 〈 ψ | H ψ 〉! It does. Remember that 〈 ψ | = | ψ 〉*. 🙂 Done!

I should add one qualifier though: the formula above assumes our wavefunction has been normalized, so all probabilities add up to one. But that’s a minor thing. The only thing left to do now is to generalize to three dimensions. That’s easy enough. Our expression becomes a volume integral:

Eav = ∫ ψ*(r)·H·ψ(r) dV

Of course, dV stands for dVolume here, not for any potential energy, and, of course, once again we assume all probabilities over the volume add up to 1, so all is normalized. Done! 🙂
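To make this a bit more concrete, here’s a sketch of that integral on a one-dimensional grid, for the ground state of a harmonic oscillator in natural units (ħ = m = ω = 1, so the answer should be 1/2). The derivatives are crude finite differences, so the result is only approximate:

```python
import numpy as np

# Natural units ħ = m = ω = 1; the harmonic-oscillator ground state has E = 1/2
x  = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
psi = np.pi**-0.25 * np.exp(-x**2 / 2)    # normalized ground-state wavefunction

d2psi = np.gradient(np.gradient(psi, dx), dx)   # ∂²ψ/∂x² by finite differences
Hpsi  = -0.5 * d2psi + 0.5 * x**2 * psi         # Hψ = −(ħ²/2m)·∇²ψ + V·ψ

E_av = np.sum(psi * Hpsi) * dx   # ∫ ψ*·H·ψ dx (ψ is real here, so ψ* = ψ)
print(E_av)                      # ≈ 0.5
```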

We’re almost done with this post. What’s left is the position and momentum operator. You may think this is going to another lengthy development but… Well… It turns out the analysis is remarkably simple. Just stay with me a few more minutes and you’ll have earned your degree. 🙂

The position operator (x)

The thing we need to solve here is really easy. Look at the illustration below as representing the probability density of some particle being at x. Think about it: what’s the average position?

average position

Well? What? The (expected value of the) average position is just this simple integral: 〈x〉av = ∫ x·P(x) dx, over the whole range of possible values for x. 🙂 That’s all. Of course, because P(x) = |ψ(x)|² = ψ*(x)·ψ(x), this integral now becomes:

〈x〉av = ∫ ψ*(x)·x·ψ(x) dx

That looks exactly the same as Eav = ∫ ψ*(x)·H·ψ(x) dx, and so we can look at x as an operator too!

Huh? Yes. It’s an extremely simple operator: it just means “multiply by x“. 🙂

I know you’re shaking your head now: is it that easy? It is. Moreover, the ‘matrix-mechanical equivalent’ is equally simple but, as it’s getting late here, I’ll refer you to Feynman for that. 🙂
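On the same grid as above, the position integral is a one-liner. I just shift the (hypothetical) Gaussian packet to x = 1.5 so there is something non-trivial to recover:

```python
import numpy as np

x  = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
x0 = 1.5                                        # center of the packet (a choice)
psi = np.pi**-0.25 * np.exp(-(x - x0)**2 / 2)   # shifted, normalized Gaussian

x_av = np.sum(psi * x * psi) * dx   # ∫ ψ*·x·ψ dx (ψ is real, so ψ* = ψ)
print(x_av)                         # ≈ 1.5
```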

The momentum operator (px)

Now we want to calculate the average momentum of, say, some electron. What integral would you use for that? […] Well… What? […] It’s easy: it’s the same thing as for x. We can just substitute p for x in that 〈x〉av = ∫ x·P(x) dx formula, so we get:

〈p〉av = ∫ p·P(p) dp, over the whole range of possible values for p

Now, you might think the rest is equally simple, and… Well… It actually is simple but there’s one additional thing in regard to the need to normalize stuff here. You’ll remember we defined a momentum wavefunction (see my post on the Uncertainty Principle), which we wrote as:

φ(p) = 〈 mom p | ψ 〉

Now, in the mentioned post, we related this momentum wavefunction to the particle’s ψ(x) = 〈x|ψ〉 wavefunction—which we should actually refer to as the position wavefunction, but everyone just calls it the particle’s wavefunction, which is a bit of a misnomer, as you can see now: a wavefunction describes some property of the system, and so we can associate several wavefunctions with the same system, really! In any case, we noted the following there:

  • The two probability density functions, φ(p) and ψ(x), look pretty much the same, but the half-width (or standard deviation) of one was inversely proportional to the half-width of the other. To be precise, we found that the constant of proportionality was equal to ħ/2, and wrote that relation as follows: σp = (ħ/2)/σx.
  • We also found that, when using a regular normal distribution function for ψ(x), we’d have to normalize the probability density function by inserting a (2πσx²)^(−1/2) factor in front of the exponential.

Now, it’s a bit of a complicated argument, but the upshot is that we cannot just write what we usually write, i.e. Pi = |Ci|² or P(x) = |ψ(x)|². No. We need to put a normalization factor in front, which combines the two factors I mentioned above. To be precise, we have to write:

P(p) = |〈p|ψ〉|²/(2πħ)

So… Well… Our 〈p〉av = ∫ p·P(p) dp integral can now be written as:

〈p〉av = ∫ 〈ψ|p〉·p·〈p|ψ〉 dp/(2πħ)

So that integral is totally like what we found for 〈x〉av and so… We could just leave it at that, and say we’ve solved the problem. In that sense, it is easy. However, having said that, it’s obvious we’d want some solution that’s written in terms of ψ(x), rather than in terms of φ(p), and that requires some more manipulation. I’ll refer you, once more, to Feynman for that, and I’ll just give you the result:

momentum operator

So… Well… It turns out that the momentum operator – which I tentatively denoted as px above – is not as simple as our position operator (x). Still… It’s not hugely complicated either, as we can write it as:

px ≡ (ħ/i)·(∂/∂x)
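On the same grid as before, we can test this against a case where we know the answer: a Gaussian packet modulated by e^(ik0x) should have an average momentum of ħ·k0 (with ħ = 1 and k0 = 2 as arbitrary choices here):

```python
import numpy as np

# ħ = 1; a Gaussian modulated by e^(i·k0·x) should give 〈p〉 = ħ·k0
x  = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
k0 = 2.0
psi = np.pi**-0.25 * np.exp(-x**2 / 2) * np.exp(1j * k0 * x)

dpsi = np.gradient(psi, dx)                             # ∂ψ/∂x (finite differences)
p_av = (np.sum(np.conj(psi) * (-1j) * dpsi) * dx).real  # ∫ ψ*·(ħ/i)·∂ψ/∂x dx
print(p_av)                                             # ≈ 2.0
```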

Of course, the purists amongst you will, once again, say that I should be more careful and put a hat wherever I’d need to put one so… Well… You’re right. I’ll wrap this all up by copying Feynman’s overview of the operators we just explained, and so he does use the fancy symbols. 🙂

overview

Well, folks—that’s it! Off we go! You know all about quantum physics now! We just need to work ourselves through the exercises that come with Feynman’s Lectures, and then you’re ready to go and bag a degree in physics somewhere. So… Yes… That’s what I want to do now, so I’ll be silent for quite a while now. Have fun! 🙂

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

Schrödinger’s equation: the original approach

Of course, your first question when seeing the title of this post is: what’s original, really? Well… The answer is simple: it’s the historical approach, and it’s original because it’s actually quite intuitive. Indeed, Lecture no. 16 in Feynman’s third Volume of Lectures on Physics is like a trip down memory lane as Feynman himself acknowledges, after presenting Schrödinger’s equation using that very rudimentary model we developed in our previous post:

“We do not intend to have you think we have derived the Schrödinger equation but only wish to show you one way of thinking about it. When Schrödinger first wrote it down, he gave a kind of derivation based on some heuristic arguments and some brilliant intuitive guesses. Some of the arguments he used were even false, but that does not matter; the only important thing is that the ultimate equation gives a correct description of nature.”

So… Well… Let’s have a look at it. 🙂 We were looking at some electron we described in terms of its location at one or the other atom in a linear array (think of it as a line). We did so by defining base states |n〉 = |xn〉, noting that the state of the electron at any point in time could then be written as:

|φ〉 = ∑ |xn〉·Cn(t) = ∑ |xn〉〈xn|φ〉 over all n

The Cn(t) = 〈xn|φ〉 coefficient is the amplitude for the electron to be at xn at t. Hence, the Cn(t) amplitudes vary with t as well as with xn. We'll re-write them as Cn(t) = C(xn, t) = C(xn). Note that the latter notation does not explicitly show the time dependence. The Hamiltonian equation we derived in our previous post is now written as:

iħ·(∂C(xn)/∂t) = E0C(xn) − AC(xn+b) − AC(xn−b)

Note that, as part of our move from the Cn(t) to the C(xn) notation, we write the time derivative dCn(t)/dt now as ∂C(xn)/∂t, so we use the partial derivative symbol now (∂). Of course, the other partial derivative will be ∂C(x)/∂x as we move from the count variable xn to the continuous variable x, but let's not get ahead of ourselves here. The solution we found for our C(xn) functions was the following wavefunction:

C(xn) = a·ei·(k∙xn−ω·t) = a·e−i·ω·t·ei·k∙xn = a·e−i·(E/ħ)·t·ei·k∙xn

We also found the following relationship between E and k:

E = E0 − 2A·cos(kb)

Now, even Feynman struggles a bit with the definition of E0 and k here, and their relationship with E, which is graphed below.

energy

Indeed, he first writes, as he starts developing the model, that E0 is, physically, the energy the electron would have if it couldn’t leak away from one of the atoms, but then he also adds: “It represents really nothing but our choice of the zero of energy.”

This is all quite enigmatic because we cannot just do whatever we want when discussing the energy of a particle. As I pointed out in one of my previous posts, when discussing the energy of a particle in the context of the wavefunction, we generally consider it to be the sum of three different energy concepts:

  1. The particle’s rest energy m0c2, which de Broglie referred to as internal energy (Eint), and which includes the rest mass of the ‘internal pieces’, as Feynman puts it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy).
  2. Any potential energy it may have because of some field (i.e. if it is not traveling in free space), which we usually denote by U. This field can be anything—gravitational, electromagnetic: it’s whatever changes the energy of the particle because of its position in space.
  3. The particle’s kinetic energy, which we write in terms of its momentum p: m·v2/2 = m2·v2/(2m) = (m·v)2/(2m) = p2/(2m).

It’s obvious that we cannot just “choose” the zero point here: the particle’s rest energy is its rest energy, and its velocity is its velocity. So it’s not quite clear what the E0 in our model really is. As far as I am concerned, it represents the average energy of the system really, so it’s just like the E0 for our ammonia molecule, or the E0 for whatever two-state system we’ve seen so far. In fact, when Feynman writes that we can “choose our zero of energy so that E0 − 2A = 0″ (so the minimum of that curve above is at the zero of energy), he actually makes some assumption in regard to the relative magnitude of the various amplitudes involved.

We should probably think about it in this way: −(i/ħ)·E0 is the amplitude for the electron to just stay where it is, while i·A/ħ is the amplitude to go somewhere else—and note we’ve got two possibilities here: the electron can go to |xn+1〉,  or, alternatively, it can go to |xn−1〉. Now, amplitudes can be associated with probabilities by taking the absolute square, so I’d re-write the E0 − 2A = 0 assumption as:

E0 = 2A ⇔ |−(i/ħ)·E0|2 = |(i/ħ)·2A|2

Hence, in my humble opinion, Feynman's assumption that E0 − 2A = 0 has nothing to do with 'choosing the zero of energy'. It's more like a symmetry assumption: we're basically saying it's as likely for the electron to stay where it is as it is to move to the next position. It's an idea I need to develop somewhat further, as Feynman seems to just gloss over these little things. For example, I am sure it is not a coincidence that the EI, EII, EIII and EIV energy levels we found when discussing the hyperfine splitting of the hydrogen ground state also add up to 0. In fact, you'll remember we could actually measure those energy levels (EI = EII = EIII = A ≈ 9.23×10−6 eV, and EIV = −3A ≈ −27.7×10−6 eV), so saying that we can "choose" some zero energy point is plain nonsense. The question just doesn't arise. In any case, as I have to continue the development here, I'll leave this point for further analysis in the future. So… Well… Just note this E0 − 2A = 0 assumption, as we'll need it in a moment.

The second assumption we’ll need concerns the variation in k. As you know, we can only get a wave packet if we allow for uncertainty in k which, in turn, translates into uncertainty for E. We write:

ΔE = Δ[E0 − 2A·cos(kb)]

Of course, we’d need to interpret the Δ as a variance (σ2) or a standard deviation (σ) so we can apply the usual rules – i.e. var(a) = 0, var(aX) = a2·var(X), and var(aX ± bY) = a2·var(X) + b2·var(Y) ± 2ab·cov(X, Y) – to be a bit more precise about what we’re writing here, but you get the idea. In fact, let me quickly write it out:

var[E0 − 2A·cos(kb)] = var(E0) + 4A2·var[cos(kb)] = 4A2·var[cos(kb)], because the variance of the constant E0 is zero. Hence: var(E) = 4A2·var[cos(kb)]

Now, you should check the post scriptum to my page on the Essentials to see what the probability density function of the cosine of a randomly distributed variable looks like, and then you should go online to find a formula for its variance, and then you can work it all out yourself, because… Well… I am not going to do it for you. What I want to do here is just show how Feynman gets Schrödinger's equation out of all of these simplifications.

So what’s the second assumption? Well… As the graph shows, our k can take any value between −π/b and +π/b, and therefore, the kb argument in our cosine function can take on any value between −π and +π. In other words, kb could be any angle. However, as Feynman puts it—we’ll be assuming that kb is ‘small enough’, so we can use the small-angle approximations whenever we see the cos(kb) and/or sin(kb) functions. So we write: sin(kb) ≈ kb and cos(kb) ≈ 1 − (kb)2/2 = 1 − k2b2/2. Now, that assumption led to another grand result, which we also derived in our previous post. It had to do with the group velocity of our wave packet, which we calculated as:

v = dω/dk = (2Ab2/ħ)·k

Of course, we should interpret our k here as "the typical k". Huh? Yes… That's how Feynman refers to it, and I have no better term for it. It's some kind of 'average' of the Δk interval, obviously, but… Well… Feynman does not give us any exact definition here. Of course, if you look at the graph once more, you'll say that, if the typical kb has to be "small enough", then its expected value should be zero. Well… Yes and no. If the typical kb is zero, or if k is zero, then v is zero, and then we've got a stationary electron, i.e. an electron with zero momentum. However, because we're doing what we're doing (that is, we're studying "stuff that moves"—as I put it irreverently in a few of my posts, so as to distinguish it from our analyses of "stuff that doesn't move", like our two-state systems, for example), our "typical k" should not be zero here. OK… We can now calculate what's referred to as the effective mass of the electron, i.e. the mass that appears in the classical kinetic energy formula: K.E. = m·v2/2. Now, there are two ways to do that, and both are somewhat tricky in their interpretation:

1. Using both the E0 − 2A = 0 as well as the “small kb” assumption, we find that E = E0 − 2A·(1 − k2b2/2) = A·k2b2. Using that for the K.E. in our formula yields:

meff = 2A·k2b2/v2 = 2A·k2b2/[(2Ab2/ħ)·k]2 = ħ2/(2Ab2)

2. We can use the classical momentum formula (p = m·v), and then the 2nd de Broglie equation, which tells us that each wavenumber (k) is to be associated with a value for the momentum (p) using the p = ħ·k relation (so p is proportional to k, with ħ as the factor of proportionality). So we can now calculate meff as meff = ħ·k/v. Substituting again for what we've found above gives us the same:

meff = ħ·k/v = ħ·k/[(2Ab2/ħ)·k] = ħ2/(2Ab2)
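Both derivations are easy to check numerically. The little sketch below does just that, using made-up values for A and b (one eV and one Ångstrom respectively, just assumptions for illustration) and a 'typical' k such that kb = 0.01, i.e. 'small enough':

```python
import numpy as np

hbar = 1.0545718e-34            # J·s
eV = 1.602176634e-19            # J per eV
A = 1.0 * eV                    # hopping amplitude (assumed value)
b = 1.0e-10                     # lattice spacing (assumed value: 1 Å)
k = 1.0e8                       # 'typical' k, so that kb = 0.01

E = lambda q: 2 * A * (1.0 - np.cos(q * b))   # E0 - 2A·cos(kb), with E0 = 2A

# group velocity v = dω/dk = (1/ħ)·dE/dk, taken numerically
dk = k * 1e-6
v = (E(k + dk) - E(k - dk)) / (2 * dk) / hbar
print(v, (2 * A * b**2 / hbar) * k)           # both come out as (2Ab²/ħ)·k for small kb

# effective mass meff = ħ²/(2Ab²), and the check E ≈ meff·v²/2
meff = hbar**2 / (2 * A * b**2)
print(E(k), 0.5 * meff * v**2)                # agree up to ~(kb)² corrections
```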

Of course, we’re not supposed to know the de Broglie relations at this point in time. 🙂 But, now that you’ve seen them anyway, note how we have two formulas for the momentum:

  • The classical formula (p = m·v) tells us that the momentum is proportional to the classical velocity of our particle, and m is then the factor of proportionality.
  • The quantum-mechanical formula (p = ħk) tells us that the (typical) momentum is proportional to the (typical) wavenumber, with Planck's constant (ħ) as the factor of proportionality. Combining both perspectives of a moving particle – the classical and the quantum-mechanical one – gives us:

m·v = ħ·k

I know… It’s an obvious equation but… Well… Think of it. It’s time to get back to the main story now. Remember we were trying to find Schrödinger’s equation? So let’s get on with it. 🙂

To do so, we need one more assumption. It's the third major simplification and, just like the others, the assumption is obvious at first, but not on second thought. 😦 So… What is it? Well… It's easy to see that, in our meff = ħ2/(2Ab2) formula, all depends on the value of 2Ab2. So, just like we should wonder what happens with that kb factor in the argument of our sine or cosine function if b goes to zero—i.e. if we're letting the lattice spacing go to zero, so we're moving from a discrete to a continuous analysis now—we should also wonder what happens with that 2Ab2 factor! Well… Think about it. Wouldn't it be reasonable to assume that the effective mass of our electron is determined by some property of the material, or the medium (so that's the silicon in our previous post) and, hence, that it's constant really? Think of it: we're not changing the fundamentals really—we just have some electron roaming around in some medium and all that we're doing now is bringing those xn closer together. Much closer. It's only logical, then, that our amplitude to jump from xn±1 to xn would also increase, no? So what we're saying is that 2Ab2 is some constant which we write as ħ2/meff or, what amounts to the same, that A·b2 = ħ2/(2meff).

Of course, you may raise two objections here:

  1. The A·b2 = ħ2/(2meff) assumption establishes a very particular relation between A and b, as we can write A as A = [ħ2/(2meff)]·b−2 now. So we've got a y = 1/x2 relation here. Where the hell does that come from?
  2. We were talking some real stuff here: a crystal lattice with atoms that, in reality, do have some spacing, so that corresponds to some real value for b. So that spacing gives some actual physical significance to those xn values.

Well… What can I say? I think you should re-read that quote of Feynman when I started this post. We’re going to get Schrödinger’s equation – i.e. the ultimate prize for all of the hard work that we’ve been doing so far – but… Yes. It’s really very heuristic, indeed! 🙂 But let’s get on with it now! We can re-write our Hamiltonian equation as:

iħ·(∂C(xn)/∂t) = E0·C(xn) − A·C(xn+b) − A·C(xn−b)

= (E0−2A)·C(xn) + A·[2C(xn) − C(xn+b) − C(xn−b)] = A·[2C(xn) − C(xn+b) − C(xn−b)]

Now, I know your brain is about to melt down but, while fiddling with this equation just like we are doing right now, Schrödinger recognized in it a formula for the second-order derivative of a function. I'll just jot it down, and you can google it so as to double-check where it comes from:

second derivative

Just substitute f(x) for C(xn) in the second part of our equation above, and you’ll see we can effectively write that 2C(xn) − C(xn+b) − C(xn−b) factor as:

formula 1
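That second-derivative formula is easy to check for yourself. A minimal sketch, with an arbitrary test function f(x) = sin(x): for small b, the 2·f(x) − f(x+b) − f(x−b) combination should approach −b2·f''(x).

```python
import numpy as np

f = np.sin
x, b = 0.7, 1e-4
lhs = 2 * f(x) - f(x + b) - f(x - b)
rhs = -b**2 * (-np.sin(x))        # f''(x) = −sin(x) for f(x) = sin(x)
print(lhs, rhs)                   # the two agree, up to terms of order b⁴
```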

We’re done. We just iħ·(∂C(xn)/∂t) on the left-hand side now and multiply the expression above with A, to get what we wanted to get, and that’s – YES! – Schrödinger’s equation:

Schrodinger 2

Whatever your objections to this ‘derivation’, it is the correct equation. For a particle in free space, we just write m instead of meff, but it’s exactly the same. I’ll now give you Feynman’s full quote, which is quite enlightening:

“We do not intend to have you think we have derived the Schrödinger equation but only wish to show you one way of thinking about it. When Schrödinger first wrote it down, he gave a kind of derivation based on some heuristic arguments and some brilliant intuitive guesses. Some of the arguments he used were even false, but that does not matter; the only important thing is that the ultimate equation gives a correct description of nature. The purpose of our discussion is then simply to show you that the correct fundamental quantum mechanical equation [i.e. Schrödinger’s equation] has the same form you get for the limiting case of an electron moving along a line of atoms. We can think of it as describing the diffusion of a probability amplitude from one point to the next along the line. That is, if an electron has a certain amplitude to be at one point, it will, a little time later, have some amplitude to be at neighboring points. In fact, the equation looks something like the diffusion equations which we have used in Volume I. But there is one main difference: the imaginary coefficient in front of the time derivative makes the behavior completely different from the ordinary diffusion such as you would have for a gas spreading out along a thin tube. Ordinary diffusion gives rise to real exponential solutions, whereas the solutions of Schrödinger’s equation are complex waves.”

So… That says it all, I guess. Isn’t it great to be where we are? We’ve really climbed a mountain here. And I think the view is gorgeous. 🙂

Oh—just in case you’d think I did not give you Schrödinger’s equation, let me write it in the form you’ll usually see it:

schrodinger 3

Done! 🙂
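If you want to see that 'diffusion of a probability amplitude' with your own eyes, a small sketch of mine (so not Feynman's, with ħ = 1 and made-up numbers) does the trick: put the electron on one atom of our line of atoms, evolve iħ·dC/dt = H·C with the tight-binding Hamiltonian from above, and watch |C|2 spread out over the neighboring atoms.

```python
import numpy as np
from scipy.linalg import expm

N, A = 101, 1.0                        # number of atoms, hopping amplitude (ħ = 1)
H = np.zeros((N, N))
for n in range(N - 1):
    H[n, n + 1] = H[n + 1, n] = -A     # amplitude −A to hop to a neighboring atom

C0 = np.zeros(N, dtype=complex)
C0[N // 2] = 1.0                       # the electron starts on the middle atom

C = expm(-1j * H * 5.0) @ C0           # evolution operator for t = 5
print(np.sum(np.abs(C)**2))            # still 1: probability is conserved
print(np.abs(C[N // 2])**2)            # < 1: the amplitude has 'diffused' away
```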

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

The de Broglie relations, the wave equation, and relativistic length contraction

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. So no use to read this. Read my recent papers instead. 🙂

Original post:

You know the two de Broglie relations, also known as matter-wave equations:

f = E/h and λ = h/p

You’ll find them in almost any popular account of quantum mechanics, and the writers of those popular books will tell you that is the frequency of the ‘matter-wave’, and λ is its wavelength. In fact, to add some more weight to their narrative, they’ll usually write them in a somewhat more sophisticated form: they’ll write them using ω and k. The omega symbol (using a Greek letter always makes a big impression, doesn’t it?) denotes the angular frequency, while k is the so-called wavenumber.  Now, k = 2π/λ and ω = 2π·f and, therefore, using the definition of the reduced Planck constant, i.e. ħ = h/2π, they’ll write the same relations as:

  1. λ = h/p = 2π/k ⇔ k = 2π·p/h
  2. f = E/h = (ω/2π)

⇒ k = p/ħ and ω = E/ħ

They’re the same thing: it’s just that working with angular frequencies and wavenumbers is more convenient, from a mathematical point of view that is: it’s why we prefer expressing angles in radians rather than in degrees (k is expressed in radians per meter, while ω is expressed in radians per second). In any case, the ‘matter wave’ – even Wikipedia uses that term now – is, of course, the amplitude, i.e. the wave-function ψ(x, t), which has a frequency and a wavelength, indeed. In fact, as I’ll show in a moment, it’s got two frequencies: one temporal, and one spatial. I am modest and, hence, I’ll admit it took me quite a while to fully distinguish the two frequencies, and so that’s why I always had trouble connecting these two ‘matter wave’ equations.

Indeed, if they represent the same thing, they must be related, right? But how exactly? It should be easy enough. The wavelength and the frequency must be related through the wave velocity, so we can write: f·λ = v, with v the velocity of the wave, which must be equal to the classical particle velocity, right? And then momentum and energy are also related. To be precise, we have the relativistic energy-momentum relationship: p·c = mv·v·c = mv·c2·v/c = E·v/c. So it's just a matter of substitution. We should be able to go from one equation to the other, and vice versa. Right?

Well… No. It’s not that simple. We can start with either of the two equations but it doesn’t work. Try it. Whatever substitution you try, there’s no way you can derive one of the two equations above from the other. The fact that it’s impossible is evidenced by what we get when we’d multiply both equations. We get:

  1. f·λ = (E/h)·(h/p) = E/p
  2. v = f·λ  ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v)

⇒ E = m·v2

Huh? What kind of formula is that? E = m·v2? That's a formula you've never ever seen, have you? It reminds you of the kinetic energy formula of course—K.E. = m·v2/2—but… That factor 1/2 should not be there. Let's think about it for a while. First note that this E = m·v2 relation makes perfect sense if v = c. In that case, we get Einstein's mass-energy equivalence (E = m·c2), but that's beside the point here. The point is: if v = c, then our 'particle' is a photon, really, and then the E = h·f is referred to as the Planck-Einstein relation. The wave velocity is then equal to c and, therefore, f·λ = c, and so we can effectively substitute to find what we're looking for:

E/p = (h·f)/(h/λ) = f·λ = c ⇒ E = p·c

So that’s fine: we just showed that the de Broglie relations are correct for photons. [You remember that E = p·c relation, no? If not, check out my post on it.] However, while that’s all nice, it is not what the de Broglie equations are about: we’re talking the matter-wave here, and so we want to do something more than just re-confirm that Planck-Einstein relation, which you can interpret as the limit of the de Broglie relations for v = c. In short, we’re doing something wrong here! Of course, we are. I’ll tell you what exactly in a moment: it’s got to do with the fact we’ve got two frequencies really.

Let’s first try something else. We’ve been using the relativistic E = mv·c2 equation above. Let’s try some other energy concept: let’s substitute the E in the f = E/h by the kinetic energy and then see where we get—if anywhere at all. So we’ll use the Ekinetic = m∙v2/2 equation. We can then use the definition of momentum (p = m∙v) to write E = p2/(2m), and then we can relate the frequency f to the wavelength λ using the v = λ∙f formula once again. That should work, no? Let’s do it. We write:

  1. E = p2/(2m)
  2. E = h∙f = h·v/λ

⇒ λ = h·v/E = h·v/(p2/(2m)) = h·v/[m2·v2/(2m)] = h/[m·v/2] = 2∙h/p

So we find λ = 2∙h/p. That is almost right, but not quite: that factor 2 should not be there. Well… Of course you’re smart enough to see it’s just that factor 1/2 popping up once more—but as a reciprocal, this time around. 🙂 So what’s going on? The honest answer is: you can try anything but it will never work, because the f = E/h and λ = h/p equations cannot be related—or at least not so easily. The substitutions above only work if we use that E = m·v2 energy concept which, you’ll agree, doesn’t make much sense—at first, at least. Again: what’s going on? Well… Same honest answer: the f = E/h and λ = h/p equations cannot be related—or at least not so easily—because the wave equation itself is not so easy.

Let’s review the basics once again.

The wavefunction

The amplitude of a particle is represented by a wavefunction. If we have no information whatsoever on its position, then we usually write that wavefunction as the following complex-valued exponential:

ψ(x, t) = a·e−i·[(E/ħ)·t − (p/ħ)∙x] = a·e−i·(ω·t − k∙x) = a·ei·(k∙x−ω·t) = a·eiθ = a·(cosθ + i·sinθ)

θ is the so-called phase of our wavefunction and, as you can see, it’s the argument of a wavefunction indeed, with temporal frequency ω and spatial frequency k (if we choose our x-axis so its direction is the same as the direction of k, then we can substitute the k and x vectors for the k and x scalars, so that’s what we’re doing here). Now, we know we shouldn’t worry too much about a, because that’s just some normalization constant (remember: all probabilities have to add up to one). However, let’s quickly develop some logic here. Taking the absolute square of this wavefunction gives us the probability of our particle being somewhere in space at some point in time. So we get the probability as a function of x and t. We write:

P(x, t) = |a·e−i·[(E/ħ)·t − (p/ħ)∙x]|2 = a2

As all probabilities have to add up to one, we must assume we're looking at some box in spacetime here. So, if the length of our box is Δx = x2 − x1, then (Δx)·a2 = (x2−x1)·a2 = 1 ⇔ Δx = 1/a2. [We obviously simplify the analysis by assuming a one-dimensional space only here, but the gist of the argument is essentially correct.] So, freezing time (i.e. equating t to some point t = t0), we get the following probability density function:

Capture
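By the way, that Δx = 1/a2 box normalization is trivial to check in a few lines of code. Just a sanity check, with an arbitrary value for a:

```python
import numpy as np

a = 0.5
dx = 1.0 / a**2                        # the box length Δx = 1/a² = 4
x = np.linspace(0.0, dx, 1001)
P = np.full_like(x, a**2)              # P(x) = |ψ|² = a², independent of x and t
print(np.sum(P[:-1]) * (x[1] - x[0]))  # ≈ 1: all probabilities add up to one
```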

That’s simple enough. The point is: the two de Broglie equations f = E/h and λ = h/p give us the temporal and spatial frequencies in that ψ(x, t) = a·ei·[(E/ħ)·t − (p/ħ)∙x] relation. As you can see, that’s an equation that implies a much more complicated relationship between E/ħ = ω and p/ħ = k. Or… Well… Much more complicated than what one would think of at first.

To appreciate what's being represented here, it's good to play a bit. We'll continue with our simple exponential above, which also illustrates how we usually analyze those wavefunctions: we either assume we're looking at the wavefunction in space at some fixed point in time (t = t0) or, else, at how the wavefunction changes in time at some fixed point in space (x = x0). Of course, we know that Einstein told us we shouldn't do that: space and time are related and, hence, we should try to think of spacetime, i.e. some 'kind of union' of space and time—as Minkowski famously put it. However, when everything is said and done, mere mortals like us are not so good at that, and so we're sort of condemned to try to imagine things using the classical cut-up of things. 🙂 So we'll just use an online graphing tool to play with that a·ei(k∙x−ω·t) = a·eiθ = a·(cosθ + i·sinθ) formula.
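If you don't have a graphing tool handy, a few lines of Python do the same job. This is just my own way of playing with the formula, with arbitrary values for k and ω:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.0, 10.0, 1000)
t0 = 0.0                                    # freeze time at t = t0
for k, omega in [(2.0, 1.0), (6.0, 1.0)]:   # increase k = p/ħ at fixed ω
    theta = k * x - omega * t0
    plt.plot(x, np.cos(theta), label=f"Re(ψ), k = {k}")
    plt.plot(x, np.sin(theta), '--', label=f"Im(ψ), k = {k}")
plt.xlabel("x"); plt.legend(); plt.show()
```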

Compare the following two graphs, for example. Just imagine we either look at how the wavefunction behaves at some point in space, with the time fixed at some point t = t0, or, alternatively, that we look at how the wavefunction behaves in time at some point in space x = x0. As you can see, increasing k = p/ħ or increasing ω = E/ħ gives the wavefunction a higher 'density' in space or, alternatively, in time.

density 1

density 2

That makes sense, intuitively. In fact, when thinking about how the energy, or the momentum, affects the shape of the wavefunction, I am reminded of an airplane propeller: as it spins, faster and faster, it gives the propeller some 'density', in space as well as in time, as its blades cover more space in less time. It's an interesting analogy: it helps—me, at least—to think through what that wavefunction might actually represent.

propeller

So as to stimulate your imagination even more, you should also think of representing the real and complex part of that ψ = a·ei(k∙x−ω·t) = a·eiθ = a·(cosθ + i·sinθ) formula in a different way. In the graphs above, we just showed the sine and cosine in the same plane but, as you know, the real and the imaginary axis are orthogonal, so Euler's formula a·eiθ = a·cosθ + i·a·sinθ = Re(ψ) + i·Im(ψ) may also be graphed as follows:

5d_euler_f

The illustration above should make you think of yet another illustration you’ve probably seen like a hundred times before: the electromagnetic wave, propagating through space as the magnetic and electric field induce each other, as illustrated below. However, there’s a big difference: Euler’s formula incorporates a phase shift—remember: sinθ = cos(θ − π/2)—and you don’t have that in the graph below. The difference is much more fundamental, however: it’s really hard to see how one could possibly relate the magnetic and electric field to the real and imaginary part of the wavefunction respectively. Having said that, the mathematical similarity makes one think!

FG02_06

Of course, you should remind yourself of what E and B stand for: they represent the strength of the electric (E) and magnetic (B) field at some point x at some time t. So you shouldn’t think of those wavefunctions above as occupying some three-dimensional space. They don’t. Likewise, our wavefunction ψ(x, t) does not occupy some physical space: it’s some complex number—an amplitude that’s associated with each and every point in spacetime. Nevertheless, as mentioned above, the visuals make one think and, as such, do help us as we try to understand all of this in a more intuitive way.

Let’s now look at that energy-momentum relationship once again, but using the wavefunction, rather than those two de Broglie relations.

Energy and momentum in the wavefunction

I am not going to talk about uncertainty here. You know that Spiel. If there's uncertainty, it's in the energy or the momentum, or in both. The uncertainty determines the size of that 'box' (in spacetime) in which we hope to find our particle, and it's modeled by a splitting of the energy levels. We'll say the energy of the particle may be E0, but it might also be some other value, which we'll write as En = E0 ± n·ħ. The thing to note is that the energy levels will always be separated by some integer multiple of ħ, so ħ is, effectively, the quantum of energy for all practical—and theoretical—purposes. We then super-impose the various wave equations to get a wave function that might—or might not—resemble something like this:

Photon wave

Who knows? 🙂 In any case, that's not what I want to talk about here. Let's repeat the basics once more: if we write our wavefunction a·e−i·[(E/ħ)·t − (p/ħ)∙x] as a·e−i·[ω·t − k∙x], we refer to ω = E/ħ as the temporal frequency, i.e. the frequency of our wavefunction in time (i.e. the frequency it has if we keep the position fixed), and to k = p/ħ as the spatial frequency, i.e. the frequency of our wavefunction in space (so now we stop the clock and just look at the wave in space). Now, let's think about the energy concept first. The energy of a particle is generally thought of as consisting of three parts:

  1. The particle’s rest energy m0c2, which de Broglie referred to as internal energy (Eint): it includes the rest mass of the ‘internal pieces’, as Feynman puts it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy);
  2. Any potential energy it may have because of some field (so de Broglie was not assuming the particle was traveling in free space), which we’ll denote by U, and note that the field can be anything—gravitational, electromagnetic: it’s whatever changes the energy because of the position of the particle;
  3. The particle’s kinetic energy, which we write in terms of its momentum p: m·v2/2 = m2·v2/(2m) = (m·v)2/(2m) = p2/(2m).

So we have one energy concept here (the rest energy) that does not depend on the particle's position in spacetime, and two energy concepts that do depend on position (potential energy) and/or on how that position changes because of its velocity and/or momentum (kinetic energy). The last two bits are related through the energy conservation principle. The total energy is E = mvc2, of course—with the little subscript (v) ensuring the mass incorporates the equivalent mass of the particle's kinetic energy.

So what? Well… In my post on quantum tunneling, I drew attention to the fact that different potentials, so different potential energies (indeed, as our particle travels from one region to another, the field is likely to vary), have no impact on the temporal frequency. Let me re-visit the argument, because it's an important one. Imagine two different regions in space that differ in potential—because the field has a larger or smaller magnitude there, or points in a different direction, or whatever: just different fields, which corresponds to different values for U1 and U2, i.e. the potential in region 1 versus region 2. Now, the different potential will change the momentum: the particle will accelerate or decelerate as it moves from one region to the other, so we also have a different p1 and p2. Having said that, the internal energy doesn't change, so we can write the corresponding amplitudes, or wavefunctions, as:

  1. ψ1(θ1) = Ψ1(x, t) = a·e−i·θ1 = a·e−i·[(Eint + p12/(2m) + U1)·t − p1∙x]/ħ
  2. ψ2(θ2) = Ψ2(x, t) = a·e−i·θ2 = a·e−i·[(Eint + p22/(2m) + U2)·t − p2∙x]/ħ

Now how should we think about these two equations? We are definitely talking different wavefunctions. However, their temporal frequencies ω1 = (Eint + p12/(2m) + U1)/ħ and ω2 = (Eint + p22/(2m) + U2)/ħ must be the same. Why? Because of the energy conservation principle—or its equivalent in quantum mechanics, I should say: the temporal frequency f or ω, i.e. the time-rate of change of the phase of the wavefunction, does not change: all of the change in potential, and the corresponding change in kinetic energy, goes into changing the spatial frequency, i.e. the wave number k or the wavelength λ, as potential energy becomes kinetic or vice versa. The sum of the potential and kinetic energy doesn't change, indeed. So the energy remains the same and, therefore, the temporal frequency does not change. In fact, we need this quantum-mechanical equivalent of the energy conservation principle to calculate how the momentum and, hence, the spatial frequency of our wavefunction, changes. We do so by boldly equating ω1 and ω2, and so we write:

ω1 = ω2 ⇔ Eint + p12/(2m) + U1 = Eint + p22/(2m) + U2

⇔ p12/(2m) − p22/(2m) = U2 − U1 ⇔ p22 = p12 − 2m·(U2 − U1)

⇔ p2 = (p12 − 2m·ΔU)1/2

We played with this in a previous post, assuming that p12 is larger than 2m·ΔU, so as to get a positive number on the right-hand side of the equation for p22, so then we can confidently take the positive square root of that (p12 − 2m·ΔU) expression to calculate p2. For example, when the potential difference ΔU = U2 − U1 was negative, so ΔU < 0, then we're safe and sure to get some real positive value for p2.

Having said that, we also contemplated the possibility that p22 = p12 − 2m·ΔU was negative, in which case p2 has to be some pure imaginary number, which we wrote as p2 = i·p' (so p' (read: p prime) is a real positive number here). We could work with that: it resulted in an exponentially decreasing factor e−p'·x/ħ that ended up 'killing' the wavefunction in space. However, its limited existence still allowed particles to 'tunnel' through potential energy barriers, thereby explaining the quantum-mechanical tunneling phenomenon.
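The switch from a real to an imaginary p2 is easy to see in code. A minimal sketch, with ħ = m = 1 and arbitrary numbers for p1 and ΔU (note that p12/(2m) = 0.5 here, so that's the threshold):

```python
import numpy as np

m, p1 = 1.0, 1.0
for dU in [0.3, 0.7]:                        # ΔU = U2 − U1, below and above p1²/(2m)
    p2 = np.sqrt(complex(p1**2 - 2 * m * dU))
    print(dU, p2)   # real for ΔU < p1²/(2m), purely imaginary beyond that
```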

This is rather weird—at first, at least. Indeed, one would think that, because of the E/ħ = ω equation, any change in energy would lead to some change in ω. But no! The total energy doesn’t change, and the potential and kinetic energy are like communicating vessels: any change in potential energy is associated with a change in p, and vice versa. It’s a really funny thing. It helps to think it’s because the potential depends on position only, and so it should not have an impact on the temporal frequency of our wavefunction. Of course, it’s equally obvious that the story would change drastically if the potential would change with time, but… Well… We’re not looking at that right now. In short, we’re assuming energy is being conserved in our quantum-mechanical system too, and so that implies what’s described above: no change in ω, but we obviously do have changes in p whenever our particle goes from one region in space to another, and the potentials differ. So… Well… Just remember: the energy conservation principle implies that the temporal frequency of our wave function doesn’t change. Any change in potential, as our particle travels from one place to another, plays out through the momentum.

Now that we know that, let’s look at those de Broglie relations once again.

Re-visiting the de Broglie relations

As mentioned above, we usually think in one dimension only: we either freeze time or, else, we freeze space. If we do that, we can derive some funny new relationships. Let’s first simplify the analysis by re-writing the argument of the wavefunction as:

θ = E·t − p·x

Of course, you’ll say: the argument of the wavefunction is not equal to E·t − p·x: it’s (E/ħ)·t − (p/ħ)∙x. Moreover, θ should have a minus sign in front. Well… Yes, you’re right. We should put that 1/ħ factor in front, but we can change units, and so let’s just measure both E as well as p in units of ħ here. We can do that. No worries. And, yes, the minus sign should be there—Nature choose a clockwise direction for θ—but that doesn’t matter for the analysis hereunder.

The E·t − p·x expression reminds one of those invariant quantities in relativity theory. But let's be precise here. We're thinking about those so-called four-vectors here, which we wrote as pμ = (E, px, py, pz) = (E, p) and xμ = (t, x, y, z) = (t, x) respectively. [Well… OK… You're right. We wrote those four-vectors as pμ = (E, px·c, py·c, pz·c) = (E, p·c) and xμ = (c·t, x, y, z) = (t, x). So what we write is true only if we measure time and distance in equivalent units so we have c = 1. So… Well… Let's do that and move on.] In any case, what was invariant was not E·t − p·x·c or c·t − x (that's a nonsensical expression anyway: you cannot subtract a vector from a scalar), but pμ2 = pμpμ = E2 − (p·c)2 = E2 − p2·c2 = E2 − (px2 + py2 + pz2)·c2 and xμ2 = xμxμ = (c·t)2 − x2 = c2·t2 − (x2 + y2 + z2) respectively. [Remember pμpμ and xμxμ are four-vector dot products, so they have that +−−− signature, unlike the p2 and x2 or a·b dot products, which are just a simple sum of the squared components.] So… Well… E·t − p·x is not an invariant quantity. Let's try something else.

Let’s re-simplify by equating ħ as well as c to one again, so we write: ħ = c = 1. [You may wonder if it is possible to ‘normalize’ both physical constants simultaneously, but the answer is yes. The Planck unit system is an example.]  then our relativistic energy-momentum relationship can be re-written as E/p = 1/v. [If c would not be one, we’d write: E·β = p·c, with β = v/c. So we got E/p = c/β. We referred to β as the relative velocity of our particle: it was the velocity, but measured as a ratio of the speed of light. So here it’s the same, except that we use the velocity symbol v now for that ratio.]

Now think of a particle moving in free space, i.e. without any fields acting on it, so we don't have any potential changing the spatial frequency of the wavefunction of our particle, and let's also assume we choose our x-axis such that it's the direction of travel, so the position vector (x) can be replaced by a simple scalar (x). Finally, we will also choose the origin of our x-axis such that x = 0 when t = 0, so we write: x(t = 0) = 0. It's obvious then that, if our particle is traveling in spacetime with some velocity v, then the ratio of its position x and the time t that it's been traveling will always be equal to v = x/t. Hence, for that very special position in spacetime (t, x = v·t) – so we're talking the actual position of the particle in spacetime here – we get: θ = E·t − p·x = E·t − p·v·t = E·t − m·v·v·t = (E − m∙v2)·t. So… Well… There we have the m∙v2 factor.

The question is: what does it mean? How do we interpret this? I am not sure. When I first jotted this thing down, I thought of choosing a different reference potential: some negative value such that it ensures that the sum of kinetic, rest and potential energy is zero, so I could write E = 0 and then the wavefunction would reduce to ψ(t) = ei·m∙v2·t. Feynman refers to that as ‘choosing the zero of our energy scale such that E = 0’, and you’ll find this in many other works too. However, it’s not that simple. Free space is free space: if there’s no change in potential from one region to another, then the concept of some reference point for the potential becomes meaningless. There is only rest energy and kinetic energy, then. The total energy reduces to E = m (because we chose our units such that c = 1 and, therefore, E = mc2 = m·12 = m) and so our wavefunction reduces to:

ψ(t) = a·ei·m·(1 − v2)·t

We can’t reduce this any further. The mass is the mass: it’s a measure for inertia, as measured in our inertial frame of reference. And the velocity is the velocity, of course—also as measured in our frame of reference. We can re-write it, of course, by substituting t for t = x/v, so we get:

ψ(x) = a·ei·m·(1/v − v)·x

For both functions, we get constant probabilities, but a wavefunction that's 'denser' for higher values of m. The (1 − v2) and (1/v − v) factors are different, however: these factors become smaller for higher v, so our wavefunction becomes less dense for higher v. In fact, for v = 1 (so for travel at the speed of light, i.e. for photons), we get that ψ(t) = ψ(x) = a·e0 = a. [You should use the graphing tool once more, and you'll see the imaginary part, i.e. the sine of the a·(cosθ + i·sinθ) expression, just vanishes, as sinθ = 0 for θ = 0.]

graph

The wavefunction and relativistic length contraction

Are exercises like this useful? As mentioned above, these constant probability wavefunctions are a bit nonsensical, so you may wonder why I wrote what I wrote. There may be no real conclusion, indeed: I was just fiddling around a bit, and playing with equations and functions. I feel stuff like this helps me to understand what that wavefunction actually is somewhat better. If anything, it does illustrate that idea of the ‘density’ of a wavefunction, in space or in time. What we’ve been doing by substituting x for x = v·t or t for t = x/v is showing how, when everything is said and done, the mass and the velocity of a particle are the actual variables determining that ‘density’ and, frankly, I really like that ‘airplane propeller’ idea as a pedagogic device. In fact, I feel it may be more than just a pedagogic device, and so I’ll surely re-visit it—once I’ve gone through the rest of Feynman’s Lectures, that is. 🙂

That brings me to what I added in the title of this post: relativistic length contraction. You’ll wonder why I am bringing that into a discussion like this. Well… Just play a bit with those (1 − v2) and (1/vv) factors. As mentioned above, they decrease the density of the wavefunction. In other words, it’s like space is being ‘stretched out’. Also, it can’t be a coincidence we find the same (1 − v2) factor in the relativistic length contraction formula: L = L0·√(1 − v2), in which L0 is the so-called proper length (i.e. the length in the stationary frame of reference) and is the (relative) velocity of the moving frame of reference. Of course, we also find it in the relativistic mass formula: m = mv = m0/√(1−v2). In fact, things become much more obvious when substituting m for m0/√(1−v2) in that ψ(t) = ei·m·(1 − v2)·t function. We get:

ψ(t) = a·ei·m·(1 − v2)·t = a·ei·m0·√(1−v2)·t 

Well… We’re surely getting somewhere here. What if we go back to our original ψ(x, t) = a·ei·[(E/ħ)·t − (p/ħ)∙x] function? Using natural units once again, that’s equivalent to:

ψ(x, t) = a·ei·(m·t − p∙x) = a·ei·[(m0/√(1−v2))·t − (m0·v/√(1−v2))∙x]

= a·ei·[m0/√(1−v2)]·(t − v∙x)

Interesting! We’ve got a wavefunction that’s a function of x and t, but with the rest mass (or rest energy) and velocity as parameters! Now that really starts to make sense. Look at the (blue) graph for that 1/√(1−v2) factor: it goes from one (1) to infinity (∞) as v goes from 0 to 1 (remember we ‘normalized’ v: it’s a ratio between 0 and 1 now). So that’s the factor that comes into play for t. For x, it’s the red graph, which has the same shape but goes from zero (0) to infinity (∞) as v goes from 0 to 1.

graph 2

Now that makes sense: the 'density' of the wavefunction, in time and in space, increases as the velocity v increases. In space, that should correspond to the relativistic length contraction effect: it's like space is contracting, as the velocity increases and, therefore, the length of the object we're watching contracts too. For time, the reasoning is a bit more complicated: it's our time that becomes more dense and, therefore, our clock that seems to tick faster.
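A quick table of those two 'density' factors, i.e. the m0/√(1−v2) factor that multiplies t and the m0·v/√(1−v2) factor that multiplies x, shows how both blow up as v goes to 1 (I just set m0 = 1 here):

```python
import numpy as np

for v in [0.0, 0.5, 0.9, 0.99]:
    gamma = 1.0 / np.sqrt(1.0 - v**2)          # the 1/√(1−v²) factor (m0 = 1)
    print(f"v = {v}: t-factor = {gamma:.3f}, x-factor = {gamma * v:.3f}")
```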

[…]

I know I need to explore this further—if only so as to assure you I have not gone crazy. Unfortunately, I have no time to do that right now. Indeed, from time to time, I need to work on other stuff besides this physics ‘hobby’ of mine. :-/

Post scriptum 1: As for the E = m·v2 formula, I also have a funny feeling that it might be related to the fact that, in quantum mechanics, both the real and imaginary part of the oscillation actually matter. You'll remember that we'd represent any oscillator in physics by a complex exponential, because it eased our calculations. So instead of writing A = A0·cos(ωt + Δ), we'd write: A = A0·ei(ωt + Δ) = A0·cos(ωt + Δ) + i·A0·sin(ωt + Δ). When calculating the energy or intensity of a wave, however, we couldn't just take the square of the complex amplitude of the wave – remembering that E ∼ A2. No! We had to get back to the real part only, i.e. the cosine or the sine only. Now the mean (or average) value of the squared cosine function (or a squared sine function), over one or more cycles, is 1/2, so the mean of A2 is equal to A02/2. I am not sure, and it's probably a long shot, but one must be able to show that, if the imaginary part of the oscillation would actually matter – which is obviously the case for our matter-wave – then 1/2 + 1/2 is obviously equal to 1. I mean: try to think of an image with a mass attached to two springs, rather than one only. Does that make sense? 🙂 […] I know: I am just freewheeling here. 🙂

Post scriptum 2: The other thing that this E = m·v2 equation makes me think of is – curiously enough – an eternally expanding spring. Indeed, the kinetic energy of a mass on a spring and the potential energy that's stored in the spring always add up to some constant, and the average potential and kinetic energy are equal to each other. To be precise: 〈K.E.〉 + 〈P.E.〉 = (1/4)·k·A2 + (1/4)·k·A2 = k·A2/2. It means that, on average, the total energy of the system is twice the average kinetic energy (or potential energy). You'll say: so what? Well… I don't know. Can we think of a spring that expands eternally, with the mass on its end not gaining or losing any speed? In that case, v is constant, and the total energy of the system would, effectively, be equal to Etotal = 2·〈K.E.〉 = 2·(m·v2/2) = m·v2.

Post scriptum 3: That substitution I made above – substituting x for x = v·t – is kinda weird. Indeed, if that E = m∙v2 equation makes any sense, then E − m∙v2 = 0, of course, and, therefore, θ = E·t − p·x = E·t − p·v·t = E·t − m·v·v·t = (E − m∙v2)·t = 0·t = 0. So the argument of our wavefunction is 0 and, therefore, we get a·e0 = a for our wavefunction. It basically means our particle is where it is. 🙂

Post scriptum 4: This post scriptum – no. 4 – was added later—much later. On 29 February 2016, to be precise. The solution to the ‘riddle’ above is actually quite simple. We just need to make a distinction between the group and the phase velocity of our complex-valued wave. The solution came to me when I was writing a little piece on Schrödinger’s equation. I noticed that we do not find that weird E = m∙v2 formula when substituting ψ for ψ = ei(kx − ωt) in Schrödinger’s equation, i.e. in:

Schrodinger's equation 2

Let me quickly go over the logic. To keep things simple, we’ll just assume one-dimensional space, so ∇2ψ = ∂2ψ/∂x2. The time derivative on the left-hand side is ∂ψ/∂t = −iω·ei(kx − ωt). The second-order derivative on the right-hand side is ∂2ψ/∂x2 = (ik)·(ik)·ei(kx − ωt) = −k2·ei(kx − ωt) . The ei(kx − ωt) factor on both sides cancels out and, hence, equating both sides gives us the following condition:

−iω = −(iħ/2m)·k2 ⇔ ω = (ħ/2m)·k2

Substituting ω = E/ħ and k = p/ħ yields:

E/ħ = (ħ/2m)·p2/ħ2 = p2/(2m·ħ) = m2·v2/(2m·ħ) = m·v2/(2ħ) ⇔ E = m·v2/2

In short: the E = m·v2/2 is the correct formula. It must be, because… Well… Because Schrödinger’s equation is a formula we surely shouldn’t doubt, right? So the only logical conclusion is that we must be doing something wrong when multiplying the two de Broglie equations. To be precise: our v = f·λ equation must be wrong. Why? Well… It’s just something one shouldn’t apply to our complex-valued wavefunction. The ‘correct’ velocity formula for the complex-valued wavefunction should have that 1/2 factor, so we’d write 2·f·λ = v to make things come out alright. But where would this formula come from? The period of cosθ + isinθ is the period of the sine and cosine function: cos(θ+2π) + isin(θ+2π) = cosθ + isinθ, so T = 2π and f = 1/T = 1/2π do not change.

But so that’s a mathematical point of view. From a physical point of view, it’s clear we got two oscillations for the price of one: one ‘real’ and one ‘imaginary’—but both are equally essential and, hence, equally ‘real’. So the answer must lie in the distinction between the group and the phase velocity when we’re combining waves. Indeed, the group velocity of a sum of waves is equal to vg = dω/dk. In this case, we have:

vg = d[E/ħ]/d[p/ħ] = dE/dp

We can now use the kinetic energy formula to write E as E = m·v2/2 = p·v/2. Now, v and p are related through m (p = m·v, so v = p/m). So we should write this as E = m·v2/2 = p2/(2m). Substituting E and p = m·v in the equation above then gives us the following:

vg = dω/dk = d[p2/(2m)]/dp = 2p/(2m) = p/m = v

However, for the phase velocity, we can just use the vp = ω/k formula, which gives us that 1/2 factor:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = (m·v2/2)/(m·v) = v/2
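You can verify the riddle's solution numerically in a few lines. A sketch with ħ = m = 1 and an arbitrary k: the group velocity dω/dk comes out as v = ħ·k/m, while ω/k gives half of that.

```python
import numpy as np

hbar, m, k = 1.0, 1.0, 3.0
omega = lambda q: hbar * q**2 / (2 * m)       # the dispersion relation ω = ħk²/(2m)

dk = 1e-6
v_group = (omega(k + dk) - omega(k - dk)) / (2 * dk)   # dω/dk, numerically
v_phase = omega(k) / k                                 # ω/k
print(v_group, v_phase)   # 3.0 and 1.5: the phase velocity is v/2 indeed
```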

Bingo! Riddle solved! 🙂 Isn’t it nice that our formula for the group velocity also applies to our complex-valued wavefunction? I think that’s amazing, really! But I’ll let you think about it. 🙂

An introduction to virtual particles

In one of my posts on the rules of quantum math, I introduced the propagator function, which gives us the amplitude for a particle to go from one place to another. It looks like this:

propagator

The r1 and r2 vectors are, obviously, position vectors describing (1) where the particle is right now, so the initial state is written as |r1〉, and (2) where it might go, so the final state is |r2〉. Now we can combine this with the analysis in my previous post to think about what might happen when an electron sort of 'jumps' from one state to another. It's a rather funny analysis, but it will give you some feel of what these so-called 'virtual' particles might represent.

Let’s first look at the shape of that function. The e(i/ħ)·(pr12function in the numerator is now familiar to you. Note the r12 in the argument, i.e. the vector pointing from r1 to r2. The pr12 dot product equals |p|∙|r12|·cosθ = p∙r12·cosθ, with θ the angle between p and r12. If the angle is the same, then cosθ is equal to 1. If the angle is π/2, then it’s 0, and the function reduces to 1/r12. So the angle θ, through the cosθ factor, sort of scales the spatial frequency. Let me try to give you some idea of how this looks like by assuming the angle between p and r12 is the same, so we’re looking at the space in the direction of the momentum only and |p|∙|r12|·cosθ = p∙r12. Now, we can look at the p/ħ factor as a scaling factor, and measure the distance x in units defined by that scale, so we write: x = p∙r12/ħ. The whole function, including the denominator, then reduces to (ħ/p)·eix/x = (ħ/p)·cos(x)/x + i·(ħ/p)·sin(x)/x, and we just need to square this to get the probability. All of the graphs are drawn hereunder: I’ll let you analyze them. [Note that the graphs do not include the ħ/p factor, which you may look at as yet another scaling factor.] You’ll see – I hope! – that it all makes perfect sense: the probability quickly drops off with distance, both in the positive as well as in the negative x-direction, while going to infinity when very near, i.e. for very small x. [Note that the absolute square, using cos(x)/x and sin(x)/x yields the same graph as squaring 1/x—obviously!]

graph
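In case the graphs above got lost along the way: a few lines of Python reproduce them. This is just the scaled eix/x function and its absolute square:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.05, 10.0, 500)     # x = p·r12/ħ, the scaled distance
amp = np.exp(1j * x) / x             # the propagator, up to the p/ħ factor
plt.plot(x, amp.real, label="cos(x)/x")
plt.plot(x, amp.imag, label="sin(x)/x")
plt.plot(x, np.abs(amp)**2, label="|e^(ix)/x|² = 1/x²")
plt.xlabel("x = p·r12/ħ"); plt.legend(); plt.show()
```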

Now, this propagator function is not dependent on time: it’s only the momentum that enters the argument. Of course, we assume p to be some positive real number. Of course?

This is where Feynman starts an interesting conversation. In the previous post, we studied a model in which we had two protons, and one electron jumping from one to another, as shown below.

hydrogen

This model told us the equilibrium state is a stable ionized hydrogen molecule (so that’s an H2+ molecule), with an interproton distance that’s equal to 1 Ångstrom – so that’s like twice the size of a hydrogen atom (which we simply write as H) – and an energy that’s 2.72 eV less than the energy of a hydrogen atom and a proton (so that’s not an H2+ molecule but a system consisting of a separate hydrogen atom and a proton). The why and how of that equilibrium state is illustrated below. [For more details, see my previous post.]

graph2

Now, the model implies there is a sort of attractive force pulling the two protons together even when the protons are at larger distances than 1 Å. One can see that from the graph indeed. Now, we would not associate any molecular orbital with those distances, as the system is, quite simply, not a molecule but a separate hydrogen atom and a proton. Nevertheless, the amplitude A is non-zero, and so we have an electron jumping back and forth.

We know how that works from our post on tunneling: particles can cross an energy barrier and tunnel through. One of the weird things we had to consider when a particle crosses such a potential barrier is that the momentum factor p in its wavefunction was some pure imaginary number, which we wrote as p = i·p'. We then re-wrote that wavefunction as a·e−iθ = a·e−i·[(E/ħ)∙t − (i·p'/ħ)·x] = a·e−i·(E/ħ)∙t·ei·i·p'·x/ħ = a·e−i·(E/ħ)∙t·e−p'·x/ħ. The e−p'·x/ħ factor in this formula is a real-valued exponential function, that sort of 'kills' our wavefunction as we move across the potential barrier, which is what is illustrated below: if the distance is too large, then the amplitude for tunneling goes to zero.

potential barrier

From a mathematical point of view, the analysis of our electron jumping back and forth is very similar. However, there are differences too. We can't really analyze this in terms of a potential barrier in space. The barrier is the potential energy of the electron itself: it's happy when it's bound, because its energy then contributes to a reduction of the total energy of the hydrogen atomic system that is equal to the ionization energy, or the Rydberg energy as it's called, which is equal to not less than 13.6 eV (which, as mentioned, is pretty big at the atomic level). Well… We can take that propagator function (1/r)·e(i/ħ)·p∙r (note the argument has no minus sign: it can be quite tricky!), and just fill in the value for the momentum of the electron.

Huh? What momentum? It's got no momentum to spare. On the contrary, it wants to stay with the proton, so it has no energy whatsoever to escape. Well… Not in quantum mechanics. In quantum mechanics, it can use all of its potential energy, convert it into kinetic energy, and get away from its proton.

But there is no release of energy! The energy is negative!

Exactly! You’re right. So we boldly write: K.E. = m·v2/2 = p2/(2m) = −13.6 eV, and, because we’re working with complex numbers, we can take a square root of negative number, using the definition of the imaginary unit: i = √(−1), so we get a purely imaginary value for the momentum p, which we write as:

p = ±i·√(2m·EH)

The sign of p is chosen so it makes sense: our electron should go in one direction only. It’s going to be the plus sign. [If you’d take the negative root, you’d get a nonsensical propagator function.] To make a long story short, our propagator function becomes:

(1/r)·e(i/ħ)·i·√(2m·EH)∙r = (1/r)·e(i·i/ħ)·√(2m·EH)∙r = (1/r)·e−√(2m·EH)·r/ħ

Of course, from a mathematical point of view, that’s the same function as e−p’·x/ħ: it’s a real-valued exponential function that quickly dies. But it’s an amplitude alright, and it’s just like an amplitude for tunneling indeed: if the distance is too large, then the amplitude goes to zero. The final cherry on the cake, of course, is to write:

A ∼ (1/r)·e−√(2m·EH)·r/ħ

Well… No. It gets better. This amplitude is an amplitude for an electron bond between the two protons which, as we know, lowers the energy of the system. By how much? Well… By A itself. Now we know that work or energy is an integral or antiderivative of force over distance, so force is the derivative of energy with respect to the distance. So we can just take the derivative of the expression above to get the force. I'll leave that to you as an exercise: don't forget to use the product rule! 🙂
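Just to put a number on that exponential decay: the decay length in A ∼ (1/r)·e−√(2m·EH)·r/ħ is ħ/√(2m·EH). A small sketch, using the electron mass and EH = 13.6 eV, shows it comes out at about half an Ångstrom (the Bohr radius, in fact), which is why the attraction dies off so quickly with distance. The numerical derivative at the end is just one way of doing the force exercise; the energy is only defined up to that ∼ sign, of course.

```python
import numpy as np

hbar = 1.0545718e-34             # J·s
m_e = 9.1093837e-31              # electron mass (kg)
E_H = 13.6 * 1.602176634e-19     # Rydberg energy (J)

L = hbar / np.sqrt(2 * m_e * E_H)         # the decay length ħ/√(2m·E_H)
print(L * 1e10, "Å")                      # ≈ 0.53 Å: the Bohr radius

A_fn = lambda r: (1.0 / r) * np.exp(-r / L)   # the amplitude, up to a constant
E_sys = lambda r: -A_fn(r)                    # the bond lowers the energy by A

r, dr = 2e-10, 1e-14
F = -(E_sys(r + dr) - E_sys(r - dr)) / (2 * dr)
print(F)    # negative: the force points toward smaller r, i.e. attraction
```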

So are we done? No. First, we didn’t talk about virtual particles yet! Let me do that now. However, first note that we should add one more effect in our two-proton-one-electron system: the coulomb field (ε) caused by the bare proton will cause the hydrogen molecule to take on an induced electric dipole moment (μ), so we should integrate that in our energy equation. Feynman shows how, but I won’t bother you with that here. Let’s talk about those virtual particles. What are they?

Well… There’s various definitions, but Feynman’s definition is this one:

“There is an exchange of a virtual electron when–as here–the electron has to jump across a space where it would have a negative energy. More specifically, a ‘virtual exchange’ means that the phenomenon involves a quantum-mechanical interference between an exchanged state and a non-exchanged state.”

You’ll say: what’s virtual about it? The electron does go from one place to another, doesn’t it? Well… Yes and no. We can’t observe it while it’s supposed to be doing that. Our analysis just tells us it seems to be useful to distinguish two different states and analyze all in terms of those differential equations. Who knows what’s really going on? What’s actual and what’s virtual? We just have some ‘model’ here: a model for the interaction between a hydrogen atom and a proton. It explains the attraction between them in terms of a sort of continuous exchange of an electron, but is it real?

The point is: in physics, it's assumed that the coulomb interaction, i.e. all of electrostatics really, comes from the exchange of virtual photons: one electron, or proton, emits a photon, and then another absorbs it in the reverse of the same reaction. Furthermore, it is assumed that the amplitude for doing so is like that formula we found for the amplitude to exchange a virtual electron, except that the rest mass of a photon is zero, and so the formula reduces to 1/r. Such a simple relationship makes sense, of course, because that's how the electrostatic potential varies in space!

That, in essence, is all what there is to the quantum-mechanical theory of electromagnetism, which Feynman refers to as the ‘particle point of view’.

So… Yes. It’s that simple. Yes! For a change! 🙂

Post scriptum: Feynman's Lecture on virtual particles is actually focused on a model for the nuclear forces. Most of it is devoted to a discussion of the virtual 'pion', or π-meson, which was then, when Feynman wrote his Lectures, supposed to mediate the force between two nucleons. However, this theory is clearly outdated: nuclear forces are described by quantum chromodynamics. So I'll just skip the Yukawa theory here. It's actually kinda strange that this theory, which Yukawa proposed in 1935, remained the theory for nuclear forces for such a long time. Hence, it's surely all very interesting from a historical point of view.

The math behind the maser

Pre-script (dated 26 June 2020): I have come to the conclusion one does not need all this hocus-pocus to explain masers or lasers (and two-state systems in general): classical physics will do. So no use to read this. Read my papers instead. 🙂

Original post:

As I skipped the mathematical arguments in my previous post so as to focus on the essential results only, I thought it would be good to complement that post by looking at the math once again, so as to ensure we understand what it is that we’re doing. So let’s do that now. We start with the easy situation: free space.

The two-state system in free space

We started with an ammonia molecule in free space, i.e. we assumed there were no external force fields, like a gravitational or an electromagnetic force field. Hence, the picture was as simple as the one below: the nitrogen atom could be ‘up’ or ‘down’ with regard to its spin around its axis of symmetry.

[Illustration: the ammonia molecule, with the nitrogen atom above the plane of the hydrogens in state 1 and below it in state 2]

It’s important to note that this ‘up’ or ‘down’ direction is defined in regard to the molecule itself, i.e. not in regard to some external reference frame. In other words, the reference frame is that of the molecule itself. For example, if I flip the illustration above – like below – then we’re still talking the same states, i.e. the molecule is still in state 1 in the image on the left-hand side and it’s still in state 2 in the image on the right-hand side. 

[Illustration: the same two states, with the molecule flipped over]

We then modeled the uncertainty about its state by associating two different energy levels with the molecule: E0 + A and E0 − A. The idea is that the nitrogen atom needs to tunnel through a potential barrier to get to the other side of the plane of the hydrogens, and that requires energy. At the same time, we'll show the two energy levels are effectively associated with an 'up' or 'down' direction of the electric dipole moment of the molecule. So that resembles the two spin states of an electron, which we associated with the +ħ/2 and −ħ/2 energies respectively. So if E0 were zero (we can always take another reference point, remember?), then we've got the same thing: two energy levels that are separated by some definite amount: that amount is 2A for the ammonia molecule, and ħ when we're talking quantum-mechanical spin. I should make one last note here, before I move on: these energies only make sense in the presence of some external field, because the + and − signs in the E0 + A and E0 − A and +ħ/2 and −ħ/2 expressions make sense only with regard to some external direction defining what's 'up' and what's 'down' really. But I am getting ahead of myself here. Let's go back to free space: no external fields, so what's 'up' or 'down' is completely random here. 🙂

Now, we also know an energy level can be associated with a complex-valued wavefunction, or an amplitude as we call it. To be precise, we can associate it with the generic a·e−(i/ħ)·(E·t − p·x) expression which you know so well by now. Of course, as the reference frame is that of the molecule itself, its momentum is zero, so the p·x term in the a·e−(i/ħ)·(E·t − p·x) expression vanishes and the wavefunction reduces to a·e−i·ω·t = a·e−(i/ħ)·E·t, with ω = E/ħ. In other words, the energy level determines the temporal frequency, or the temporal variation (as opposed to the spatial frequency or variation), of the amplitude.

We then had to find the amplitudes C1(t) = 〈 1 | ψ 〉 and C2(t) =〈 2 | ψ 〉, so that’s the amplitude to be in state 1 or state 2 respectively. In my post on the Hamiltonian, I explained why the dynamics of a situation like this can be represented by the following set of differential equations:

iħ·(dC1/dt) = H11·C1 + H12·C2
iħ·(dC2/dt) = H21·C1 + H22·C2

As mentioned, the C1 and C2 functions evolve in time, and so we should write them as C1 = C1(t) and C2 = C2(t) respectively. In fact, our Hamiltonian coefficients may also evolve in time, which is why it may be very difficult to solve those differential equations! However, as I'll show below, one usually assumes they are constant, and then one makes informed guesses about them so as to find a solution that makes sense.

Now, I should remind you here of something you surely know: if C1 and C2 are solutions to this set of differential equations, then the superposition principle tells us that any linear combination a·C1 + b·C2 will also be a solution. So we need one or more extra conditions, usually some starting condition, which we can combine with a normalization condition, so we can get some unique solution that makes sense.

The Hij coefficients are referred to as Hamiltonian coefficients and, as shown in the mentioned post, the H11 and H22 coefficients are related to the amplitude of the molecule staying in state 1 and state 2 respectively, while the H12 and H21 coefficients are related to the amplitude of the molecule going from state 1 to state 2 and vice versa. Because of the perfect symmetry of the situation here, it's easy to see that H11 should equal H22, and that H12 and H21 should also be equal to each other. Indeed, Nature doesn't care what we call state 1 or 2 here: as mentioned above, we did not define the 'up' and 'down' direction with respect to some external direction in space, so the molecule can have any orientation and, hence, switching the i and j indices should not make any difference. So that's one clue, at least, that we can use to solve those equations: the perfect symmetry of the situation and, hence, the perfect symmetry of the Hamiltonian coefficients—in this case, at least!

The other clue is to think about the solution if we'd not have two states but one state only. In that case, we'd need to solve iħ·[dC1(t)/dt] = H11·C1(t). That's simple enough, because you'll remember that the exponential function is its own derivative. To be precise, we write: d(a·eiωt)/dt = a·d(eiωt)/dt = a·iω·eiωt, and please note that a can be any complex number: we're not necessarily talking a real number here! In fact, we're likely to talk complex coefficients, and we multiply with some other complex number (iω) anyway here! So if we write iħ·[dC1/dt] = H11·C1 as dC1/dt = −(i/ħ)·H11·C1 (remember: i−1 = 1/i = −i), then it's easy to see that C1 = a·e−(i/ħ)·H11·t is the general solution for this differential equation. Let me write it out for you, just to make sure:

dC1/dt = d[a·e−(i/ħ)·H11·t]/dt = a·d[e−(i/ħ)·H11·t]/dt = −a·(i/ħ)·H11·e−(i/ħ)·H11·t

= −(i/ħ)·H11·a·e−(i/ħ)·H11·t = −(i/ħ)·H11·C1

Of course, that reminds us of our generic a·e−(i/ħ)·E0·t wavefunction: we only need to equate H11 with E0 and we're done! Hence, in a one-state system, the Hamiltonian coefficient is, quite simply, equal to the energy of the system. In fact, that's a result that can be generalized, as we'll see below, and so that's why Feynman says the Hamiltonian ought to be called the energy matrix.
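By the way, you don't have to take my word for the derivation above: here's a two-line check with sympy – my own little verification, with H11 treated as a real constant – confirming the trial function solves the equation:

```python
import sympy as sp

t, hbar, H11, a = sp.symbols('t hbar H11 a', real=True)
C1 = a * sp.exp(-sp.I * H11 * t / hbar)

# check that i*hbar*dC1/dt equals H11*C1
lhs = sp.I * hbar * sp.diff(C1, t)
print(sp.simplify(lhs - H11 * C1))  # prints 0: the equation is satisfied
```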

In fact, we may actually have two states that are entirely uncoupled, i.e. a system in which there is no dependence of C1 on C2 and vice versa. In that case, the two equations reduce to:

iħ·[dC1/dt] = H11·C1 and iħ·[dC2/dt] = H22·C2

These do not form a coupled system and, hence, their solutions are independent:

C1(t) = a·e–(i/ħ)·H11·t and C2(t) = b·e–(i/ħ)·H22·t 

The symmetry of the situation suggests we should equate a and b, and then the normalization condition says that the probabilities have to add up to one, so |C1(t)|² + |C2(t)|² = 1, and so we'll find that a = b = 1/√2.

OK. That’s simple enough, and this story has become quite long, so we should wrap it up. The two ‘clues’ – about symmetry and about the Hamiltonian coefficients being energy levels – lead Feynman to suggest that the Hamiltonian matrix for this particular case should be equal to:

H11 = H22 = E0 and H12 = H21 = −A

Why? Well… It's just one of Feynman's clever guesses, and it yields probability functions that make sense, i.e. they actually describe something real. That's all. 🙂 I am only half-joking, because it's a trial-and-error process indeed and, as I'll explain in a separate section in this post, one needs to be aware of the various approximations involved when doing this stuff. So let's be explicit about the reasoning here:

  1. We know that H11 = H22 = E0 if the two states would be identical. In other words, if we'd have only one state, rather than two – i.e. if H12 and H21 would be zero – then we'd just plug that in. So that's what Feynman does. So that's what we do here too! 🙂
  2. However, H12 and H21 are not zero, of course, so we assume there's some amplitude to go from one position to the other by tunneling through the energy barrier and flipping to the other side. Now, we need to assign some value to that amplitude, and so we'll just assume that the energy that's needed for the nitrogen atom to tunnel through the energy barrier and flip to the other side is equal to A. So we equate H12 and H21 with −A.

Of course, you'll wonder: why minus A? Why wouldn't we try H12 = H21 = A? Well… I could say that a particle usually loses potential energy as it moves from one place to another, but… Well… Think about it. Once it's through, it's through, isn't it? And so then the energy is just E0 again. Indeed, if there's no external field, the + or − sign is quite arbitrary. So what do we choose? The answer is: when considering our molecule in free space, it doesn't matter. Using +A or −A yields the same probabilities. Indeed, let me give you the amplitudes we get for H11 = H22 = E0 and H12 = H21 = −A:

  1. C1(t) = 〈 1 | ψ 〉 = (1/2)·e−(i/ħ)·(E0 − A)·t + (1/2)·e−(i/ħ)·(E0 + A)·t = e−(i/ħ)·E0·t·cos[(A/ħ)·t]
  2. C2(t) = 〈 2 | ψ 〉 = (1/2)·e−(i/ħ)·(E0 − A)·t − (1/2)·e−(i/ħ)·(E0 + A)·t = i·e−(i/ħ)·E0·t·sin[(A/ħ)·t]

[In case you wonder how we go from those exponentials to a simple sine and cosine factor, remember that the sum of a complex exponential and its complex conjugate, i.e. eiθ + e−iθ, reduces to 2·cosθ, while the difference eiθ − e−iθ reduces to 2·i·sinθ.]

Now, it’s easy to see that, if we’d have used +A rather than −A, we would have gotten something very similar:

  • C1(t) = 〈 1 | ψ 〉 = (1/2)·e−(i/ħ)·(E0 + A)·t + (1/2)·e−(i/ħ)·(E0 − A)·t = e−(i/ħ)·E0·t·cos[(A/ħ)·t]
  • C2(t) = 〈 2 | ψ 〉 = (1/2)·e−(i/ħ)·(E0 + A)·t − (1/2)·e−(i/ħ)·(E0 − A)·t = −i·e−(i/ħ)·E0·t·sin[(A/ħ)·t]

So we get a minus sign in front of our C2(t) function, because cos(α) = cos(−α) but sin(−α) = −sin(α). However, the associated probabilities are exactly the same. For both, we get the same P1(t) and P2(t) functions:

  • P1(t) = |C1(t)|² = cos²[(A/ħ)·t]
  • P2(t) = |C2(t)|² = sin²[(A/ħ)·t]

[Remember: the absolute square of i and of −i is |i|² = 1 and |−i|² = 1 respectively, so the i and −i in the two C2(t) formulas disappear.]
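In case you want to see it with your own eyes, here's a quick numerical check of all of the above – a sketch only, with arbitrary values for E0 and A, and ħ set to one:

```python
import numpy as np

hbar, E0, A = 1.0, 1.0, 0.25        # arbitrary test values
t = np.linspace(0, 30, 1000)

# the amplitudes for H11 = H22 = E0 and H12 = H21 = -A
C1 = 0.5*np.exp(-1j*(E0 - A)*t/hbar) + 0.5*np.exp(-1j*(E0 + A)*t/hbar)
C2 = 0.5*np.exp(-1j*(E0 - A)*t/hbar) - 0.5*np.exp(-1j*(E0 + A)*t/hbar)

P1, P2 = np.abs(C1)**2, np.abs(C2)**2
assert np.allclose(P1, np.cos(A*t/hbar)**2)  # P1 = cos^2[(A/hbar)*t]
assert np.allclose(P2, np.sin(A*t/hbar)**2)  # P2 = sin^2[(A/hbar)*t]
assert np.allclose(P1 + P2, 1.0)             # probabilities add up to one
```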

You’ll remember the graph:

[Graph: P1(t) = cos²[(A/ħ)·t] and P2(t) = sin²[(A/ħ)·t], each oscillating between 0 and 1]

Of course, you'll say: that plus or minus sign in front of C2(t) should matter somehow, doesn't it? Well… Think about it. Taking the absolute square of some complex number – or some complex function, in this case! – amounts to multiplying it with its complex conjugate. Because the complex conjugate of a product is the product of the complex conjugates, it's easy to see what happens: the e−(i/ħ)·E0·t factor in C1(t) = e−(i/ħ)·E0·t·cos[(A/ħ)·t] and C2(t) = ±i·e−(i/ħ)·E0·t·sin[(A/ħ)·t] gets multiplied by e+(i/ħ)·E0·t and, hence, doesn't matter: e−(i/ħ)·E0·t·e+(i/ħ)·E0·t = e0 = 1. The cosine factor in C1(t) is real, and so its complex conjugate is the same. Now, the ±i·sin[(A/ħ)·t] factor in C2(t) is a pure imaginary number, and so its complex conjugate is its opposite. For some reason, we'll find similar solutions for all of the situations we'll describe below: the factor determining the probability will either be real or, else, a pure imaginary number. Hence, from a math point of view, it really doesn't matter if we take +A or −A for those H12 and H21 coefficients. We just need to be consistent in our choice, and I must assume that, in order to be consistent, Feynman likes to think of our nitrogen atom borrowing some energy from the system and, hence, temporarily reducing its energy by an amount equal to A. If you have a better interpretation, please do let me know! 🙂

OK. We're done with this section… Except… Well… I have to show you how we got those C1(t) and C2(t) functions, no? Let me copy Feynman here:

[Feynman's solution: adding and subtracting the two differential equations so as to solve for C1(t) and C2(t)]

Note that the 'trick' involving the addition and subtraction of the differential equations is a trick we'll use quite often, so please do have a look at it. As for the value of the a and b coefficients – which, as you can see, we've equated to 1 in our solutions for C1(t) and C2(t) – we get those because of the following starting condition: we assume that, at t = 0, the molecule is in state 1. Hence, we assume C1(0) = 1 and C2(0) = 0. In other words: we assume that we start out on that P1(t) curve in the graph with the probability functions above, so the C1(0) = 1 and C2(0) = 0 starting condition is equivalent to P1(0) = 1 and P2(0) = 0. Plugging that in gives us a/2 + b/2 = 1 and a/2 − b/2 = 0, which is possible only if a = b = 1.

Of course, you'll say: what if we'd choose to start out with state 2, so our starting condition is P1(0) = 0 and P2(0) = 1? Then a = 1 and b = −1, and we get the solution we got when equating H12 and H21 with +A, rather than with −A. So you can think about that symmetry once again: when we're in free space, then it's quite arbitrary what we call 'up' or 'down'.

So… Well… That's all great. I should, perhaps, just add one more note, and that's on that A/ħ value. We calculated it in the previous post, because we wanted to actually calculate the period of those P1(t) and P2(t) functions. Because we're talking the square of a cosine and a sine respectively, the period is equal to π, rather than 2π, so we wrote: (A/ħ)·T = π ⇔ T = π·ħ/A. Now, the separation between the two energy levels E0 + A and E0 − A, so that's 2A, has been measured as being, more or less, equal to 2A ≈ 10⁻⁴ eV.

How does one measure that? As mentioned above, I'll show you, in a moment, that, when applying some external field, the plus and minus sign do matter, and the separation between those two energy levels E0 + A and E0 − A will effectively represent something physical. More in particular, we'll have transitions from one energy level to another, and that corresponds to electromagnetic radiation being emitted or absorbed, and so there's a relation between the energy and the frequency of that radiation. To be precise, we can write 2A = h·f0. The frequency of the radiation that's being absorbed or emitted is 23.79 GHz, which corresponds to microwave radiation with a wavelength of λ = c/f0 = 1.26 cm. Hence, 2·A ≈ 25×10⁹ Hz times 4×10⁻¹⁵ eV·s = 10⁻⁴ eV, indeed, and, therefore, we can write: T = π·ħ/A ≈ 3.14 × 6.6×10⁻¹⁶ eV·s divided by 0.5×10⁻⁴ eV, so that's 40×10⁻¹² seconds = 40 picoseconds. That's 40 trillionths of a second. So that's very short, and surely much shorter than the time that's associated with, say, a freely emitting sodium atom, which is of the order of 3.2×10⁻⁸ seconds. You may think that makes sense, because the photon energy is so much lower: a sodium light photon is associated with an energy equal to E = h·f = 500×10¹² Hz times 4×10⁻¹⁵ eV·s = 2 eV, so that's 20,000 times 10⁻⁴ eV.

There's a funny thing, however. An oscillation with a frequency of 500 tera-hertz that lasts 3.2×10⁻⁸ seconds is equivalent to 500×10¹² Hz times 3.2×10⁻⁸ s ≈ 16 million cycles. However, an oscillation with a frequency of 23.79 giga-hertz that only lasts 40×10⁻¹² seconds is equivalent to 23.79×10⁹ Hz times 40×10⁻¹² s ≈ 1. One cycle only? We're surely not talking resonance here!
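The arithmetic is easy enough to check – here it is in Python, using ħ ≈ 6.582×10⁻¹⁶ eV·s and the numbers quoted above:

```python
import math

hbar_eVs = 6.582e-16            # reduced Planck constant in eV·s
A = 0.5e-4                      # eV, since 2A is about 1e-4 eV
T_flip = math.pi * hbar_eVs / A
print(f"T = pi*hbar/A = {T_flip:.2e} s")          # ~4.1e-11 s, i.e. ~40 ps

f0 = 23.79e9                    # Hz, the ammonia inversion frequency
print(f"cycles per flip: {f0 * T_flip:.2f}")      # ~1 cycle only!

f_Na, tau_Na = 500e12, 3.2e-8   # sodium light frequency and decay time
print(f"cycles per sodium decay: {f_Na * tau_Na:.1e}")   # ~1.6e7 cycles
```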

So… Well… I am just flagging it here. We’ll have to do some more thinking about that later. [I’ve added an addendum that may or may not help us in this regard. :-)]

The two-state system in a field

As mentioned above, when there is no external force field, the 'up' or 'down' direction of the nitrogen atom is defined with regard to its spin around its axis of symmetry, so with regard to the molecule itself. However, when we apply an external electromagnetic field, as shown below, we do have some external reference frame.

Now, the external reference frame – i.e. the physics of the situation, really – may make it more convenient to define the whole system using another set of base states, which we’ll refer to as I and II, rather than 1 and 2. Indeed, you’ve seen the picture below: it shows a state selector, or a filter as we called it. In this case, there’s a filtering according to whether our ammonia molecule is in state I or, alternatively, state II. It’s like a Stern-Gerlach apparatus splitting an electron beam according to the spin state of the electrons, which is ‘up’ or ‘down’ too, but in a totally different way than our ammonia molecule. Indeed, the ‘up’ and ‘down’ spin of an electron has to do with its magnetic moment and its angular momentum. However, there are a lot of similarities here, and so you may want to compare the two situations indeed, i.e. the electron beam in an inhomogeneous magnetic field versus the ammonia beam in an inhomogeneous electric field.

[Illustration: the state selector, splitting the ammonia beam according to state I or state II in an inhomogeneous electric field]

Now, when reading Feynman, as he walks us through the relevant Lecture on all of this, you get the impression that it's the I and II states only that have some kind of physical or geometric interpretation. That's not the case. Of course, the diagram of the state selector above makes it very obvious that these new I and II base states make very much sense in regard to the orientation of the field, i.e. with regard to external space, rather than with respect to the position of our nitrogen atom vis-à-vis the hydrogens. But… Well… Look at the image below: the direction of the field (which we denote by ε because we've been using the E for energy) obviously matters when defining the old 'up' and 'down' states of our nitrogen atom too!

In other words, our previous | 1 〉 and | 2 〉 base states acquire a new meaning too: it obviously matters whether or not the electric dipole moment of the molecule is in the same or, conversely, in the opposite direction of the field. To be precise, the presence of the electromagnetic field suddenly gives the energy levels that we’d associate with these two states a very different physical interpretation.

[Illustration: the ammonia molecule in an electric field ε, with the electric dipole moment μ opposite to the field in state 1 and along the field in state 2]

Indeed, from the illustration above, it's easy to see that the electric dipole moment of this particular molecule in state 1 is in the direction opposite to the field and, therefore, temporarily ignoring the amplitude to flip over (so we do not think of A for just a brief little moment), the energy that we'd associate with state 1 would be equal to E0 + με. Likewise, the energy we'd associate with state 2 is equal to E0 − με. Indeed, you'll remember that the (potential) energy of an electric dipole is equal to the vector dot product of the electric dipole moment μ and the field vector ε, but with a minus sign in front so as to get the sign for the energy right. So the energy is equal to −μ·ε = −|μ|·|ε|·cosθ, with θ the angle between both vectors. Now, the illustration above makes it clear that state 1 and 2 are defined for θ = π and θ = 0 respectively. [And, yes! Please do note that state 1 is the highest energy level, because it's associated with the highest potential energy: the electric dipole moment μ of our ammonia molecule will – obviously! – want to align itself with the electric field ε ! Just think of what it would imply to turn the molecule in the field!]

Therefore, using the same hunches as the ones we used in the free space example, Feynman suggests that, when some external electric field is involved, we should use the following Hamiltonian matrix:

H11 = E0 + με, H22 = E0 − με, and H12 = H21 = −A

So we'll need to solve a similar set of differential equations with this Hamiltonian now. We'll do that later and, as mentioned above, it will be more convenient to switch to another set of base states, or another 'representation' as it's referred to. But… Well… Let's not get too much ahead of ourselves: I'll say something about that before we start solving the thing, but let's first look at that Hamiltonian once more.

When I say that Feynman uses the same clues here, then… Well… That's true and not true. You should note that the diagonal elements in the Hamiltonian above are not the same: E0 + με ≠ E0 − με. So we've lost that symmetry of free space which, from a math point of view, was reflected in those identical H11 = H22 = E0 coefficients.

That should be obvious from what I write above: state 1 and state 2 are no longer those 1 and 2 states we described when looking at the molecule in free space. Indeed, the | 1 〉 and | 2 〉 states are still 'up' or 'down', but the illustration above also makes it clear we're defining state 1 and state 2 not only with respect to the molecule's spin around its own axis of symmetry but also vis-à-vis some direction in space. To be precise, we're defining state 1 and state 2 here with respect to the direction of the electric field ε. Now that makes a really big difference in terms of interpreting what's going on.

In fact, the 'splitting' of the energy levels because of that amplitude A is now something physical too, i.e. something that goes beyond just modeling the uncertainty involved. In fact, we'll find it convenient to distinguish two new energy levels, which we'll write as EI = E0 + A and EII = E0 − A respectively. They are, of course, related to those new base states | I 〉 and | II 〉 that we'll want to use. So the E0 + A and E0 − A energy levels themselves will acquire some physical meaning, and especially the separation between them, i.e. the value of 2A. Indeed, EI = E0 + A and EII = E0 − A will effectively represent an 'upper' and a 'lower' energy level respectively.

But, again, I am getting ahead of myself. Let’s first, as part of working towards a solution for our equations, look at what happens if and when we’d switch to another representation indeed.

Switching to another representation

Let me remind you of what I wrote in my post on quantum math in this regard. The actual state of our ammonia molecule – or any quantum-mechanical system really – is always to be described in terms of a set of base states. For example, if we have two possible base states only, we’ll write:

| φ 〉 = | 1 〉 C1 + | 2 〉 C2

You'll say: why? Our molecule is obviously always in either state 1 or state 2, isn't it? Well… Yes and no. That's the mystery of quantum mechanics: it is and it isn't. As long as we don't measure it, there is an amplitude for it to be in state 1 and an amplitude for it to be in state 2. So we can only make sense of its state by actually calculating 〈 1 | φ 〉 and 〈 2 | φ 〉 which, unsurprisingly, are equal to 〈 1 | φ 〉 = 〈 1 | 1 〉 C1 + 〈 1 | 2 〉 C2 = C1(t) and 〈 2 | φ 〉 = 〈 2 | 1 〉 C1 + 〈 2 | 2 〉 C2 = C2(t) respectively, and so these two functions give us the probabilities P1(t) and P2(t) respectively. So that's Schrödinger's cat really: the cat is dead or alive, but we don't know until we open the box, and we only have a probability function – so we can say that it's probably dead or probably alive, depending on the odds – as long as we do not open the box. It's as simple as that.

Now, the 'dead' and 'alive' conditions are, obviously, the 'base states' in Schrödinger's rather famous example, and we can write them as | DEAD 〉 and | ALIVE 〉. You'd agree it would be difficult to find another representation. For example, it doesn't make much sense to say that we've rotated the two base states over 90 degrees and we now have two new states equal to (1/√2)·| DEAD 〉 – (1/√2)·| ALIVE 〉 and (1/√2)·| DEAD 〉 + (1/√2)·| ALIVE 〉 respectively. There's no direction in space in regard to which we're defining those two base states: dead is dead, and alive is alive.

The situation really resembles our ammonia molecule in free space: there’s no external reference against which to define the base states. However, as soon as some external field is involved, we do have a direction in space and, as mentioned above, our base states are now defined with respect to a particular orientation in space. That implies two things. The first is that we should no longer say that our molecule will always be in either state 1 or state 2. There’s no reason for it to be perfectly aligned with or against the field. Its orientation can be anything really, and so its state is likely to be some combination of those two pure base states | 1 〉 and | 2 〉.

The second thing is that we may choose another set of base states, and specify the very same state in terms of the new base states. So, assuming we choose some other set of base states | I 〉 and | II 〉, we can write the very same state | φ 〉 = | 1 〉 C1 + | 2 〉 C2 as:

| φ 〉 = | I 〉 CI + | II 〉 CII

It’s really like what you learned about vectors in high school: one can go from one set of base vectors to another by a transformation, such as, for example, a rotation, or a translation. It’s just that, just like in high school, we need some direction in regard to which we define our rotation or our translation.

For state vectors, I showed how a rotation of base states worked in one of my posts on two-state systems. To be specific, we had the following relation between the two representations:

CI = (1/√2)·(C1 − C2)
CII = (1/√2)·(C1 + C2)

The (1/√2) factor is there because of the normalization condition, and the two-by-two matrix equals the transformation matrix for a rotation of a state filtering apparatus about the y-axis, over an angle equal to (minus) 90 degrees, which we wrote as:

[Table: the transformation amplitudes for a rotation of a state filtering apparatus about the y-axis over an angle φ]

The y-axis? What y-axis? What state filtering apparatus? Just relax. Think about what you’ve learned already. The orientations are shown below: the S apparatus separates ‘up’ and ‘down’ states along the z-axis, while the T-apparatus does so along an axis that is tilted, about the y-axis, over an angle equal to α, or φ, as it’s written in the table above.

[Illustration: an S apparatus filtering 'up' and 'down' states along the z-axis, and a T apparatus tilted about the y-axis over an angle α]

Of course, we don’t really introduce an apparatus at this or that angle. We just introduced an electromagnetic field, which re-defined our | 1 〉 and | 2 〉 base states and, therefore, through the rotational transformation matrix, also defines our | I 〉 and | II 〉 base states.

[…] You may have lost me by now, and so then you'll want to skip to the next section. That's fine. Just remember that the representations in terms of | I 〉 and | II 〉 base states or in terms of | 1 〉 and | 2 〉 base states are mathematically equivalent. Having said that, if you're reading this post, and you want to understand it, truly (because you want to truly understand quantum mechanics), then you should try to stick with me here. 🙂 Indeed, there's a zillion things you could think about right now, but you should stick to the math for now. Using that transformation matrix, we can relate the CI and CII coefficients in the | φ 〉 = | I 〉 CI + | II 〉 CII expression to the C1 and C2 coefficients in the | φ 〉 = | 1 〉 C1 + | 2 〉 C2 expression. Indeed, we wrote:

  • CI = 〈 I | ψ 〉 = (1/√2)·(C1 − C2)
  • CII = 〈 II | ψ 〉 = (1/√2)·(C1 + C2)

That’s exactly the same as writing:

[The same transformation, written out in matrix form]

OK. […] Waw! You just took a huge leap, because we can now compare the two sets of differential equations:

iħ·(dC1/dt) = E0·C1 − A·C2  ⇔  iħ·(dCI/dt) = (E0 + A)·CI
iħ·(dC2/dt) = −A·C1 + E0·C2  ⇔  iħ·(dCII/dt) = (E0 − A)·CII

They’re mathematically equivalent, but the mathematical behavior of the functions involved is very different. Indeed, unlike the C1(t) and C2(t) amplitudes, we find that the CI(t) and CII(t) amplitudes are stationary, i.e. the associated probabilities – which we find by taking the absolute square of the amplitudes, as usual – do not vary in time. To be precise, if you write it all out and simplify, you’ll find that the CI(t) and CII(t) amplitudes are equal to:

  • CI(t) = 〈 I | ψ 〉 = (1/√2)·(C1 − C2) = (1/√2)·e−(i/ħ)·(E0 + A)·t = (1/√2)·e−(i/ħ)·EI·t
  • CII(t) = 〈 II | ψ 〉 = (1/√2)·(C1 + C2) = (1/√2)·e−(i/ħ)·(E0 − A)·t = (1/√2)·e−(i/ħ)·EII·t
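Again, you can verify this numerically – a small sketch (same arbitrary E0 and A values as before) showing that the I/II probabilities are, indeed, stationary:

```python
import numpy as np

hbar, E0, A = 1.0, 1.0, 0.25   # arbitrary test values
t = np.linspace(0, 30, 1000)

C1 = 0.5*np.exp(-1j*(E0 - A)*t/hbar) + 0.5*np.exp(-1j*(E0 + A)*t/hbar)
C2 = 0.5*np.exp(-1j*(E0 - A)*t/hbar) - 0.5*np.exp(-1j*(E0 + A)*t/hbar)

# switch to the I/II representation
CI  = (C1 - C2) / np.sqrt(2)
CII = (C1 + C2) / np.sqrt(2)

# stationary states: the probabilities are constant (1/2) over time
assert np.allclose(np.abs(CI)**2,  0.5)
assert np.allclose(np.abs(CII)**2, 0.5)
```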

As the absolute square of the exponential is equal to one, the associated probabilities, i.e. |CI(t)|² and |CII(t)|², are, quite simply, equal to |1/√2|² = 1/2. Now, it is very tempting to say that this means that our ammonia molecule has an equal chance to be in state I or state II. In fact, while I may have said something like that in my previous posts, that's not how one should interpret this. The chance of our molecule being exactly in state I or state II, or in state 1 or state 2, is varying with time, with the probability being 'dumped' from one state to the other all of the time.

I mean… The electric dipole moment can point in any direction, really. So saying that our molecule has a 50/50 chance of being in state 1 or state 2 makes no sense. Likewise, saying that our molecule has a 50/50 chance of being in state I or state II makes no sense either. Indeed, the state of our molecule is specified by the | φ 〉 = | I 〉 CI + | II 〉 CII = | 1 〉 C1 + | 2 〉 C2 equations, and neither of these two expressions is a stationary state. They mix two frequencies, because they mix two energy levels.

Having said that, we’re talking quantum mechanics here and, therefore, an external inhomogeneous electric field will effectively split the ammonia molecules according to their state. The situation is really like what a Stern-Gerlach apparatus does to a beam of electrons: it will split the beam according to the electron’s spin, which is either ‘up’ or, else, ‘down’, as shown in the graph below:

[Graph: the energy of an electron in a magnetic field splitting into two levels, according to its 'up' or 'down' spin]

The graph for our ammonia molecule, shown below, is very similar. The vertical axis measures the same: energy. And the horizontal axis measures με, which increases with the strength of the electric field ε. So we see a similar ‘splitting’ of the energy of the molecule in an external electric field.

[Graph: the two energy levels of the ammonia molecule as a function of με, splitting apart as the field strength increases]

How should we explain this? It is very tempting to think that the presence of an external force field causes the electrons, or the ammonia molecules, to 'snap into' one of the two possible states, which are referred to as state I and state II respectively in the illustration of the ammonia state selector below. But… Well… Here we're entering the murky waters of actually interpreting quantum mechanics, for which (a) we have no time, and (b) we are not qualified. So you should just believe, or take for granted, what's being shown here: an inhomogeneous electric field will split our ammonia beam according to the molecules' state, which we define as I and II respectively, and which are associated with the energies E0 + A and E0 − A respectively.

[Illustration: the ammonia state selector once more]

As mentioned above, you should note that these two states are stationary. The Hamiltonian equations which, as they always do, describe the dynamics of this system, imply that the amplitude to go from state I to state II, or vice versa, is zero. To make sure you ‘get’ that, I reproduce the associated Hamiltonian matrix once again:

HI,I = E0 + A, HII,II = E0 − A, and HI,II = HII,I = 0

Of course, that will change when we start our analysis of what’s happening in the maser. Indeed, we will have some non-zero HI,II and HII,I amplitudes in the resonant cavity of our ammonia maser, in which we’ll have an oscillating electric field and, as a result, induced transitions from state I to II and vice versa. However, that’s for later. While I’ll quickly insert the full picture diagram below, you should, for the moment, just think about those two stationary states and those two zeroes. 🙂

[Diagram: the full maser set-up – the state selector followed by the resonant cavity]

Capito? If not… Well… Start reading this post again, I’d say. 🙂

Intermezzo: on approximations

At this point, I need to say a few things about all of the approximations involved, because it can be quite confusing indeed. So let's take a closer look at those energy levels and the related Hamiltonian coefficients. In fact, in his Lectures, Feynman shows us that we can always have a general solution for the Hamiltonian equations describing a two-state system whenever we have constant Hamiltonian coefficients. That general solution – which, mind you, is derived assuming Hamiltonian coefficients that do not depend on time – can always be written in terms of two stationary base states, i.e. states with a definite energy and, hence, a constant probability. The equations, and the two definite energy levels, are:

iħ·(dC1/dt) = H11·C1 + H12·C2
iħ·(dC2/dt) = H21·C1 + H22·C2

[Feynman's general solution for the two-state system with constant Hamiltonian coefficients]

That yields the following values for the energy levels for the stationary states:

EI = (H11 + H22)/2 + √[(H11 − H22)²/4 + H12·H21] = E0 + √(A² + μ²ε²)
EII = (H11 + H22)/2 − √[(H11 − H22)²/4 + H12·H21] = E0 − √(A² + μ²ε²)

Now, that's very different from the EI = E0 + A and EII = E0 − A energy levels for those stationary states we had defined in the previous section: those stationary states had no square root, and no μ²ε², in their energy. In fact, that sort of answers the question: if there's no external field, then that μ²ε² factor is zero, and the square root in the expression becomes ±√(A²) = ±A. So then we're back to our EI = E0 + A and EII = E0 − A formulas. The whole point, however, is that we will actually have an electric field in that cavity. Moreover, it's going to be a field that varies in time, which we'll write as:

ε = 2ε0·cos(ω·t)

Now, part of the confusion in Feynman’s approach is that he constantly switches between representing the system in terms of the I and II base states and the 1 and 2 base states respectively. For a good understanding, we should compare with our original representation of the dynamics in free space, for which the Hamiltonian was the following one:

H11 = H22 = E0 and H12 = H21 = −A

That matrix can easily be related to the new one we’re going to have to solve, which is equal to:

H11 = E0 + με, H22 = E0 − με, and H12 = H21 = −A

The interpretation is easy if we look at that illustration again:

[Illustration: the ammonia molecule in the field once more]

If the direction of the electric dipole moment is opposite to the direction of ε, then the associated energy is equal to −μ·ε = −|μ|·|ε|·cosθ = −μ·ε·cos(π) = +με. Conversely, for state 2, we find −μ·ε·cos(0) = −με for the energy that's associated with the dipole moment. You can and should think about the physics involved here, because they make sense! Thinking of amplitudes, you should note that the +με and −με terms effectively change the H11 and H22 coefficients, so they change the amplitude to stay in state 1 or state 2 respectively. That, of course, will have an impact on the associated probabilities, and so that's why we're talking of induced transitions now.

Having said that, the Hamiltonian matrix above keeps the −A for H12 and H21, so the matrix captures spontaneous transitions too!

Still… You may wonder why Feynman doesn't use those EI and EII formulas with the square root, because that would give us some exact solution, wouldn't it? The answer to that question is: maybe it would, but would you know how to solve those equations? We'll have a varying field, remember? So our Hamiltonian H11 and H22 coefficients will no longer be constant, but time-dependent. As you're going to see, it takes Feynman three pages to solve the whole thing using the +με and −με approximation. So just imagine how complicated it would be using that square root expression! [By the way, do have a look at those asymptotic curves in that illustration showing the splitting of energy levels above, so you see what that approximation looks like.]

So that’s the real answer: we need to simplify somehow, so as to get any solutions at all!

Of course, it's all quite confusing because, after Feynman first notes that, for strong fields, the A² in that square root is small as compared to μ²ε², thereby justifying the use of the simplified EI = E0 + με = H11 and EII = E0 − με = H22 coefficients, he continues and bluntly uses the very same square root expression to explain how that state selector works, saying that the electric field in the state selector will be rather weak and, hence, that με will be much smaller than A, so one can use the following approximation for the square root in the expressions above:

√(A² + μ²ε²) = A·√(1 + μ²ε²/A²) ≈ A·(1 + μ²ε²/2A²) = A + μ²ε²/2A

The energy expressions then reduce to:

EI = E0 + A + μ²ε²/2A and EII = E0 − A − μ²ε²/2A
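You can check how good this approximation is by comparing it with the exact eigenvalues of the Hamiltonian – a sketch with made-up values (E0 = 0, A = μ = 1), using numpy's eigenvalue routine:

```python
import numpy as np

E0, A, mu = 0.0, 1.0, 1.0        # arbitrary units
for eps in (0.05, 0.2, 1.0):     # weak to strong field
    H = np.array([[E0 + mu*eps, -A],
                  [-A,          E0 - mu*eps]])
    exact = np.linalg.eigvalsh(H).max()          # E_I, the upper level
    closed = E0 + np.sqrt(A**2 + (mu*eps)**2)    # E_I from the formula
    approx = E0 + A + (mu*eps)**2/(2*A)          # weak-field expansion
    print(f"mu*eps = {mu*eps:.2f}: exact = {exact:.5f}, "
          f"formula = {closed:.5f}, approx = {approx:.5f}")
```

For με = 0.05, the expansion is spot-on; for με = 1, it visibly breaks down – which is exactly why Feynman switches approximations between the weak-field selector and the strong-field cavity.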

And then we can calculate the force on the molecules as:

F = −∇EI = −(μ²/2A)·∇ε²

so the state I molecules are pushed towards the regions where the field is weaker, while the state II molecules are pulled the other way.

So the electric field in the state selector is weak, but the electric field in the cavity is supposed to be strong, and so… Well… That's it, really. The bottom line is that we've got a beam of ammonia molecules that are all in state I, and it's what happens to that beam in the cavity that is being described by our new set of differential equations:

iħ·(dCI/dt) = (E0 + A)·CI + με·CII
iħ·(dCII/dt) = με·CI + (E0 − A)·CII

Solving the equations

As all molecules in our ammonia beam are described in terms of the | I 〉 and | II 〉 base states – as evidenced by the fact that we say all molecules that enter the cavity are in state I – we need to switch to that representation. We do that by using that transformation above, so we write:

  • CI = 〈 I | ψ 〉 = (1/√2)·(C1 − C2)
  • CII = 〈 II | ψ 〉 = (1/√2)·(C1 + C2)

Keeping these 'definitions' of CI and CII in mind, you should then add the two differential equations, divide the result by the square root of 2, and you should get the following new equation:

iħ·(dCII/dt) = (E0 − A)·CII + με·CI

Please! Do it and verify the result! You want to learn something here, no? 🙂

Likewise, subtracting the two differential equations, we get:

iħ·(dCI/dt) = (E0 + A)·CI + με·CII

We can re-write this as:

iħ·(dCI/dt) = EI·CI + με·CII
iħ·(dCII/dt) = με·CI + EII·CII

with EI = E0 + A and EII = E0 − A.
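If you don't feel like doing the algebra by hand, you can let sympy do it – here's a sketch verifying that subtracting the two original equations (and dividing by √2) does indeed yield the CI equation. The symbol mueps stands for με, treated as a constant for this purely algebraic check:

```python
import sympy as sp

t = sp.symbols('t', real=True)
hbar, E0, A, mueps = sp.symbols('hbar E0 A mueps', positive=True)
C1, C2 = sp.Function('C1')(t), sp.Function('C2')(t)

# the two equations in the 1/2 representation, written as expr = 0
eq1 = sp.I*hbar*C1.diff(t) - ((E0 + mueps)*C1 - A*C2)
eq2 = sp.I*hbar*C2.diff(t) - (-A*C1 + (E0 - mueps)*C2)

CI  = (C1 - C2)/sp.sqrt(2)
CII = (C1 + C2)/sp.sqrt(2)

# (eq1 - eq2)/sqrt(2) should reproduce i*hbar*dCI/dt = (E0+A)*CI + mueps*CII
target = sp.I*hbar*CI.diff(t) - ((E0 + A)*CI + mueps*CII)
print(sp.simplify(sp.expand((eq1 - eq2)/sp.sqrt(2) - target)))  # prints 0
```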

Now, the problem is that the Hamiltonian constants here are not constant. To be precise, the electric field ε varies in time. We wrote:

field

So HI,II  and HII,I, which are equal to με, are not constant: we’ve got Hamiltonian coefficients that are a function of time themselves. […] So… Well… We just need to get on with it and try to finally solve this thing. Let me just copy Feynman as he grinds through this:

[Feynman's first step: trial solutions CI = γI·e−(i/ħ)·EI·t and CII = γII·e−(i/ħ)·EII·t are substituted into the equations]

This is only the first step in the process. Feynman just takes two trial functions, which are really similar to the very general C1 = a·e−(i/ħ)·H11·t function we presented when only one equation was involved, or – if you prefer a set of two equations – those CI(t) = a·e−(i/ħ)·EI·t and CII(t) = b·e−(i/ħ)·EII·t solutions above. The difference is that the coefficients in front, i.e. γI and γII, are not some (complex) constant, but functions of time themselves. The next step in the derivation is as follows:

[The resulting differential equations for γI(t) and γII(t), written in terms of ω and ω0]

One needs to do a bit of gymnastics here as well to follow what's going on, but please do check and you'll see it works. Feynman derives another set of differential equations here, and they specify these γI = γI(t) and γII = γII(t) functions. These equations are written in terms of the frequency of the field, i.e. ω, and the resonant frequency ω0, which we mentioned above when calculating that 23.79 GHz frequency from the 2A = h·f0 equation. So ω0 is the same molecular resonance frequency but expressed as an angular frequency, so ω0 = 2π·f0 = 2A/ħ. He then proceeds to simplify, using assumptions one should check. He then continues:

[Feynman's simplification – dropping the rapidly oscillating terms – and his solution for γI and γII]

That gives us what we presented in the previous post:

[The resulting amplitudes and probabilities]

So… Well… What to say? I explained those probability functions in my previous post, indeed. We’ve got two probabilities here:

  • PI = cos²[(με0/ħ)·t]
  • PII = sin²[(με0/ħ)·t]

So that's just like the P1 = cos²[(A/ħ)·t] and P2 = sin²[(A/ħ)·t] probabilities we found for spontaneous transitions. But here we are talking induced transitions.
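Where do these come from? You can see the whole mechanism at work by integrating the CI/CII equations numerically with the oscillating field switched on – a sketch only, with made-up values (ħ = 1, E0 = 0, A = 1, με0 = 0.05) and the drive exactly at resonance:

```python
import numpy as np

hbar, E0, A = 1.0, 0.0, 1.0     # arbitrary units; E0 = 0 just shifts a phase
mu_eps0 = 0.05                  # mu*eps0: a weak drive compared to A
omega = 2*A/hbar                # drive at the resonant frequency w0 = 2A/hbar

def deriv(t, y):
    CI, CII = y
    mu_eps = 2*mu_eps0*np.cos(omega*t)      # mu*eps(t) = 2*mu*eps0*cos(w*t)
    dCI  = (-1j/hbar)*((E0 + A)*CI + mu_eps*CII)
    dCII = (-1j/hbar)*(mu_eps*CI + (E0 - A)*CII)
    return np.array([dCI, dCII])

# plain fourth-order Runge-Kutta, starting with all molecules in state I
y = np.array([1.0 + 0j, 0.0 + 0j])
dt, t = 0.001, 0.0
T_transfer = np.pi*hbar/(2*mu_eps0)         # time for a full I -> II transfer
while t < T_transfer:
    k1 = deriv(t, y)
    k2 = deriv(t + dt/2, y + dt*k1/2)
    k3 = deriv(t + dt/2, y + dt*k2/2)
    k4 = deriv(t + dt, y + dt*k3)
    y += dt*(k1 + 2*k2 + 2*k3 + k4)/6
    t += dt

print(f"P_II after T = pi*hbar/(2*mu*eps0): {abs(y[1])**2:.3f}")  # close to 1
print(f"sin^2 prediction:                   {np.sin(mu_eps0*t/hbar)**2:.3f}")
```

The small difference between the two numbers is due to the rapidly oscillating terms that Feynman's simplification throws away.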

As you can see, the frequency and, hence, the period, depend on the strength, or magnitude, of the electric field, i.e. the ε0 constant in the ε = 2ε0·cos(ω·t) expression. The natural unit for measuring time would be the period once again, which we can easily calculate as (με0/ħ)·T = π ⇔ T = π·ħ/με0.

Now, we had that T = π·ħ/A expression above, which allowed us to calculate the period of the spontaneous transition, which we found was like 40 picoseconds, i.e. 40×10⁻¹² seconds. The T = π·ħ/με0 expression is very similar: it allows us to calculate the expected, average, or mean time for an induced transition. In fact, if we write Tinduced = π·ħ/με0 and Tspontaneous = π·ħ/A, then we can take the ratio to find:

Tinduced/Tspontaneous = [π·ħ/με0]/[π·ħ/A] = A/με0

This A/με0 ratio is greater than one (as long as με0 is smaller than A), so Tinduced/Tspontaneous is greater than one, which, in turn, means that the presence of our electric field – which, let me remind you, dances to the beat of the resonant frequency – causes a slower transition than we would have had if the oscillating electric field were not present.

But – Hey! – that’s the wrong comparison! Remember all molecules enter in a stationary state, as they’ve been selected so as to ensure they’re in state I. So there is no such thing as a spontaneous transition frequency here! They’re all polarized, so to speak, and they would remain that way if there was no field in the cavity. So if there was no oscillating electric field, they would never transition. Nothing would happen! Well… In terms of our particular set of base states, of course! Why? Well… Look at the Hamiltonian coefficients HI,II = HII,I = με: these coefficients are zero if ε is zero. So… Well… That says it all.

So that's what it's all about: induced emission and, as I explained in my previous post, because all molecules enter in state I, i.e. the upper energy state, literally, they all 'dump' a net amount of energy equal to 2A into the cavity on the occasion of their first transition. The molecules then keep dancing, of course, and so they absorb and emit the same amount as they go through the cavity, but… Well… We've got a net contribution here, which is not only enough to maintain the cavity oscillations, but actually also provides a small excess of power that can be drawn from the cavity as microwave radiation of the same frequency.

As Feynman notes, an exact description of what actually happens requires an understanding of the quantum mechanics of the field in the cavity, i.e. quantum field theory, which I haven’t studied yet. But… Well… That’s for later, I guess. 🙂

Post scriptum: The sheer length of this post shows we’re not doing something that’s easy here. Frankly, I feel the whole analysis is still quite obscure, in the sense that – despite looking at this thing again and again – it’s hard to sort of interpret what’s going on, in a physical sense that is. But perhaps one shouldn’t try that. I’ve quoted Feynman’s view on how easy or how difficult it is to ‘understand’ quantum mechanics a couple of times already, so let me do it once more:

“Because atomic behavior is so unlike ordinary experience, it is very difficult to get used to, and it appears peculiar and mysterious to everyone—both to the novice and to the experienced physicist. Even the experts do not understand it the way they would like to, and it is perfectly reasonable that they should not, because all of direct, human experience and human intuition applies to large objects.”

So… Well… I’ll grind through the remaining Lectures now – I am halfway through Volume III now – and then re-visit all of this. Despite Feynman’s warning, I want to understand it the way I like to, even if I don’t quite know what way that is right now. 🙂

Addendum: As for those cycles and periods, I noted a couple of times already that the Planck-Einstein equation E = h·f can usefully be re-written as E/f = h, as it gives a physical interpretation to the value of the Planck constant. In fact, I said h is the energy that's associated with one cycle, regardless of the frequency of the radiation involved. Indeed, the energy of a photon divided by the number of cycles per second should give us the energy per cycle, no?

Well… Yes and no. Planck's constant h and the frequency are both expressed referencing the time unit. However, if we say that a sodium atom emits one photon only as its electron transitions from a higher energy level to a lower one, and if we say that involves a decay time of the order of 3.2×10⁻⁸ seconds, then what we're saying really is that a sodium light photon will 'pack' like 16 million cycles, which is what we get when we multiply the number of cycles per second (i.e. the mentioned frequency of 500×10¹² Hz) by the decay time (i.e. 3.2×10⁻⁸ seconds): (500×10¹² Hz)·(3.2×10⁻⁸ s) = 16×10⁶ cycles, indeed. So the energy per cycle is 2.068 eV (i.e. the photon energy) divided by 16×10⁶, so that's 0.129×10⁻⁶ eV. Unsurprisingly, that's what we get when we divide h by 3.2×10⁻⁸ s: (4.13567×10⁻¹⁵ eV·s)/(3.2×10⁻⁸ s) ≈ 1.29×10⁻⁷ eV. We're just putting some values into the E/(f·T) = h/T equation here.

The logic for that 2A = h·f0 is the same. The frequency of the radiation that's being absorbed or emitted is 23.79 GHz, so the photon energy is (23.79×10⁹ Hz)·(4.13567×10⁻¹⁵ eV·s) ≈ 1×10⁻⁴ eV. Now, we calculated the transition period T as T = π·ħ/A ≈ (π·6.582×10⁻¹⁶ eV·s)/(0.5×10⁻⁴ eV) ≈ 41.4×10⁻¹² seconds. Now, an oscillation with a frequency of 23.79 giga-hertz that only lasts 41.4×10⁻¹² seconds is an oscillation of one cycle only. The consequence is that, when we continue this style of reasoning, we'd have a photon that packs all of its energy into one cycle!

Let's think about what this implies in terms of the density in space. The wavelength of our microwave radiation is 1.26×10⁻² m, so we've got a 'density' of 1×10⁻⁴ eV/1.26×10⁻² m ≈ 0.8×10⁻² eV/m = 0.008 eV/m. The wavelength of our sodium light is 0.6×10⁻⁶ m, so we get a 'density' of 1.29×10⁻⁷ eV/0.6×10⁻⁶ m ≈ 2.15×10⁻¹ eV/m = 0.215 eV/m. So the energy 'density' of our sodium light is about 27 times that of our microwave radiation. 🙂
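Here's the same arithmetic in a few lines of Python – same numbers as above, so just an illustration:

```python
h = 4.13567e-15                  # Planck constant in eV·s

# sodium light: ~500 THz, decay time ~3.2e-8 s
f_Na, tau_Na = 500e12, 3.2e-8
E_Na = h * f_Na                  # photon energy, ~2.07 eV
cycles_Na = f_Na * tau_Na        # ~1.6e7 cycles
print(f"sodium: {E_Na/cycles_Na:.2e} eV per cycle")      # = h/tau_Na

# ammonia line: 23.79 GHz, transition time ~41.4 ps
f_NH3, T_NH3 = 23.79e9, 41.4e-12
print(f"ammonia: {h*f_NH3:.2e} eV in {f_NH3*T_NH3:.2f} cycle(s)")

# energy per unit wavelength (the loose 'density' used above)
print(f"sodium:  {E_Na/cycles_Na/0.6e-6:.3f} eV/m")      # ~0.215 eV/m
print(f"ammonia: {h*f_NH3/1.26e-2:.4f} eV/m")            # ~0.0078 eV/m
```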

Frankly, I am not quite sure if calculations like this make much sense. In fact, when talking about energy densities, I should review my posts on the Poynting vector. However, they may help you think things through. 🙂


Re-visiting uncertainty…

I re-visited the Uncertainty Principle a couple of times already, but here I really want to get to the bottom of it. What's uncertain? The energy? The time? The wavefunction itself? These questions are not easily answered, and I need to warn you: you won't get too much wiser when you're finished reading this. I just felt like freewheeling a bit. [Note that the first part of this post repeats what you'll find on the Occam page, or my post on Occam's Razor. But those posts do not analyze uncertainty, which is what I will be trying to do here.]

Let's first think about the wavefunction itself. It's tempting to think it actually is the particle, somehow. But it isn't. So what is it then? Well… Nobody knows. In my previous post, I said I like to think it travels with the particle, but that doesn't make much sense either. It's like a fundamental property of the particle. Like the color of an apple. But where is that color? In the apple, in the light it reflects, in the retina of our eye, or is it in our brain? If you know a thing or two about how perception actually works, you'll tend to agree the quality of color is not in the apple. When everything is said and done, the wavefunction is a mental construct: when learning physics, we start to think of a particle as a wavefunction, but they are two separate things: the particle is reality, the wavefunction is imaginary.

But that’s not what I want to talk about here. It’s about that uncertainty. Where is the uncertainty? You’ll say: you just said it was in our brain. No. I didn’t say that. It’s not that simple. Let’s look at the basic assumptions of quantum physics:

  1. Quantum physics assumes there’s always some randomness in Nature and, hence, we can measure probabilities only. We’ve got randomness in classical mechanics too, but this is different. This is an assumption about how Nature works: we don’t really know what’s happening. We don’t know the internal wheels and gears, so to speak, or the ‘hidden variables’, as one interpretation of quantum mechanics would say. In fact, the most commonly accepted interpretation of quantum mechanics says there are no ‘hidden variables’.
  2. However, as Shakespeare has one of his characters say: there is a method in the madness, and the pioneers – I mean Werner Heisenberg, Louis de Broglie, Niels Bohr, Paul Dirac, etcetera – discovered that method: all probabilities can be found by taking the square of the absolute value of a complex-valued wavefunction (often denoted by Ψ), whose argument, or phase (θ), is given by the de Broglie relations ω = E/ħ and k = p/ħ. The generic functional form of that wavefunction is:

Ψ = Ψ(x, t) = a·e−iθ = a·e−i(ω·t − k∙x) = a·e−i·[(E/ħ)·t − (p/ħ)∙x]

That should be obvious by now, as I've written more than a dozen posts on this. 🙂 I still have trouble interpreting this, however—and I am not ashamed, because the Great Ones I just mentioned have trouble with that too. It's not that complex exponential. That e−iφ is a very simple periodic function, consisting of two sine waves rather than just one, as illustrated below. [It's a sine and a cosine, but they're the same function: there's just a phase difference of 90 degrees.]

[Graph: the real (cosine) and imaginary (sine) components of the complex exponential]

No. To understand the wavefunction, we need to understand those de Broglie relations, ω = E/ħ and k = p/ħ, and then, as mentioned, we need to understand the Uncertainty Principle. We need to understand where it comes from. Let’s try to go as far as we can by making a few remarks:

  • Adding or subtracting two terms in math, (E/ħ)·t − (p/ħ)∙x, implies the two terms should have the same dimension: we can only add apples to apples, and oranges to oranges. We shouldn't mix them. Now, the (E/ħ)·t and (p/ħ)·x terms are actually dimensionless: they are pure numbers. So that's even better. Just check it: energy is expressed in newton·meter (energy, or work, is force over distance, remember?) or electronvolts (1 eV = 1.6×10⁻¹⁹ J = 1.6×10⁻¹⁹ N·m); Planck's constant, as the quantum of action, is expressed in J·s or eV·s; and the unit of (linear) momentum is 1 N·s = 1 kg·m/s. E/ħ gives a number expressed per second, and p/ħ a number expressed per meter. Therefore, multiplying E/ħ and p/ħ by t and x respectively gives us a dimensionless number indeed.
  • It’s also an invariant number, which means we’ll always get the same value for it, regardless of our frame of reference. As mentioned above, that’s because the four-vector product pμxμ = E·t − px is invariant: it doesn’t change when analyzing a phenomenon in one reference frame (e.g. our inertial reference frame) or another (i.e. in a moving frame).
  • Now, Planck's quantum of action h, or ħ – h and ħ only differ in the angular unit involved: h is the action per cycle, while ħ is the action per radian; both assume we can at least measure one cycle – is the quantum of energy really. Indeed, if "energy is the currency of the Universe", and it's real and/or virtual photons that are exchanging it, then it's good to know the currency unit is h, i.e. the energy that's associated with one cycle of a photon. [In case you want to see the logic of this, see my post on the physical constants c, h and α.]
  • It's not only time and space that are related, as evidenced by the fact that t − x itself is an invariant four-vector: E and p are related too, of course! They are related through the classical velocity of the particle that we're looking at: E/p = c²/v and, therefore, we can write: E·β = p·c, with β = v/c, i.e. the relative velocity of our particle, as measured as a ratio of the speed of light. Now, I should add that the t − x four-vector is invariant only if we measure time and space in equivalent units. Otherwise, we have to write c·t − x. If we do that – so our unit of distance becomes the distance that light travels in one second, rather than one meter, or our unit of time becomes the time that is needed for light to travel one meter – then c = 1, and the E·β = p·c relation becomes E·β = p, which we also write as β = p/E: the ratio of the energy and the momentum of our particle is its (relative) velocity.

Combining all of the above, we may want to assume that we are measuring energy and momentum in terms of the Planck constant, i.e. the ‘natural’ unit for both. In addition, we may also want to assume that we’re measuring time and distance in equivalent units. Then the equation for the phase of our wavefunctions reduces to:

θ = (ω·t − k ∙x) = E·t − p·x

Now, θ is the argument of a wavefunction, and we can always re-scale such an argument by multiplying or dividing it by some constant. It's just like writing the argument of a wavefunction as v·t − x or (v·t − x)/v = t − x/v, with v the velocity of the waveform that we happen to be looking at. [In case you have trouble following this argument, please check the post I did for my kids on waves and wavefunctions.] Now, the energy conservation principle tells us the energy of a free particle won't change. [Just to remind you, a 'free particle' means it's in a 'field-free' space, so our particle is in a region of uniform potential.] So we can, in this case, treat E as a constant, and divide E·t − p·x by E, so we get a re-scaled phase for our wavefunction, which I'll write as:

φ = (E·t − p·x)/E = t − (p/E)·x = t − β·x

Alternatively, we could also look at p as some constant, as there is no variation in potential energy that will cause a change in momentum, and the related kinetic energy. We'd then divide by p and we'd get (E·t − p·x)/p = (E/p)·t − x = t/β − x, which amounts to the same, as we can always re-scale by multiplying it with β, which would again yield the same t − β·x argument.

The point is, if we measure energy and momentum in terms of the Planck unit (I mean: in terms of the Planck constant, i.e. the quantum of energy), and if we measure time and distance in ‘natural’ units too, i.e. we take the speed of light to be unity, then our Platonic wavefunction becomes as simple as:

Φ(φ) = a·e−iφ = a·e−i(t − β·x)

This is a wonderful formula, but let me first answer your most likely question: why would we use a relative velocity? Well… Just think of it: when everything is said and done, the whole theory of relativity and, hence, the whole of physics, is based on one fundamental and experimentally verified fact: the speed of light is absolute. In whatever reference frame, we will always measure it as 299,792,458 m/s. That's obvious, you'll say, but it's actually the weirdest thing ever if you start thinking about it, and it explains why those Lorentz transformations look so damn complicated. In any case, this fact legitimately establishes c as some kind of absolute measure against which all speeds can be measured. Therefore, it is only natural indeed to express a velocity as some number between 0 and 1. Now that amounts to expressing it as the β = v/c ratio.

Let's now go back to that Φ(φ) = a·e−iφ = a·e−i(t − β·x) wavefunction. Its temporal frequency ω is equal to one, and its spatial frequency k is equal to β = v/c. It couldn't be simpler but, of course, we've got this remarkably simple result because we re-scaled the argument of our wavefunction using the energy and momentum itself as the scale factor. So, yes, we can re-write the wavefunction of our particle in a particularly elegant and simple form using the only information that we have when looking at quantum-mechanical stuff: energy and momentum, because that's what everything reduces to at that level.
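A quick numerical illustration of those two frequencies – a sketch with an arbitrary β = 0.6:

```python
import numpy as np

beta = 0.6                        # arbitrary relative velocity v/c
t = np.linspace(0, 4*np.pi, 2000)

# watch the wave at a fixed point in space (x = 0): phase advances as t
Phi_t = np.exp(-1j*(t - beta*0.0))
phase_t = np.unwrap(np.angle(Phi_t))
omega = -(phase_t[-1] - phase_t[0]) / (t[-1] - t[0])
print(f"temporal frequency omega = {omega:.3f}")   # ~1.000

# freeze time (t = 0) and move through space: phase advances as beta*x
x = np.linspace(0, 4*np.pi, 2000)
Phi_x = np.exp(-1j*(0.0 - beta*x))
phase_x = np.unwrap(np.angle(Phi_x))
k = (phase_x[-1] - phase_x[0]) / (x[-1] - x[0])
print(f"spatial frequency k = {k:.3f}")            # ~0.600 = beta
```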

So… Well… We've pretty much explained what quantum physics is all about here. You just need to get used to that complex exponential: e−iφ = cos(−φ) + i·sin(−φ) = cos(φ) − i·sin(φ). It would have been nice if Nature had given us a simple sine or cosine function. [Remember the sine and cosine function are actually the same, except for a phase difference of 90 degrees: sin(φ) = cos(π/2 − φ) = cos(φ − π/2). So we can always go from one to the other by shifting the origin of our axis.] But… Well… As we've shown so many times already, a real-valued wavefunction doesn't explain the interference we observe, be it interference of electrons or whatever other particles or, for that matter, the interference of electromagnetic waves itself, which, as you know, we also need to look at as a stream of photons, i.e. light quanta, rather than as some kind of infinitely flexible aether that's undulating, like water or air.

However, the analysis above does not include uncertainty. That’s as fundamental to quantum physics as de Broglie‘s equations, so let’s think about that now.

Introducing uncertainty

Our information on the energy and the momentum of our particle will be incomplete: we'll write E = E0 ± σE, and p = p0 ± σp. Huh? No ΔE or Δp? Well… It's the same, really, but I am a bit tired of using the Δ symbol, so I am using the σ symbol here, which denotes a standard deviation of some density function. It underlines the probabilistic, or statistical, nature of our approach.

The simplest model is that of a two-state system, because it involves two energy levels only: E = E0 ± A, with A some constant. Large or small, it doesn’t matter. All is relative anyway. 🙂 We explained the basics of the two-state system using the example of an ammonia molecule, i.e. an NH3 molecule, so it consists of one nitrogen and three hydrogen atoms. We had two base states in this system: ‘up’ or ‘down’, which we denoted as base state | 1 〉 and base state | 2 〉 respectively. This ‘up’ and ‘down’ had nothing to do with the classical or quantum-mechanical notion of spin, which is related to the magnetic moment. No. It’s much simpler than that: the nitrogen atom could be either beneath or, else, above the plane of the hydrogens, as shown below, with ‘beneath’ and ‘above’ being defined with regard to the molecule’s direction of rotation around its axis of symmetry.

[Figure: the NH3 molecule, with the nitrogen atom either above or beneath the plane of the three hydrogen atoms]

In any case, for the details, I’ll refer you to the post(s) on it. Here I just want to mention the result. We wrote the amplitude to find the molecule in either one of these two states as:

  • C1 = 〈 1 | ψ 〉 = (1/2)·e^−(i/ħ)·(E0 − A)·t + (1/2)·e^−(i/ħ)·(E0 + A)·t
  • C2 = 〈 2 | ψ 〉 = (1/2)·e^−(i/ħ)·(E0 − A)·t − (1/2)·e^−(i/ħ)·(E0 + A)·t

That gave us the following probabilities:

[Graph: the probabilities P1 and P2 as a function of time, oscillating between 0 and 1 in opposite phase]

If our molecule can be in two states only, and it starts off in one, then the probability that it will remain in that state will gradually decline, while the probability that it flips into the other state will gradually increase.
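If you want to play with this yourself, the following little sketch reproduces those probabilities numerically (ħ = 1, and the E0 and A values are made up – only their difference matters):

    import numpy as np

    hbar, E0, A = 1.0, 10.0, 0.5
    t = np.linspace(0, 20, 2001)

    C1 = 0.5 * np.exp(-1j / hbar * (E0 - A) * t) + 0.5 * np.exp(-1j / hbar * (E0 + A) * t)
    C2 = 0.5 * np.exp(-1j / hbar * (E0 - A) * t) - 0.5 * np.exp(-1j / hbar * (E0 + A) * t)

    P1, P2 = np.abs(C1)**2, np.abs(C2)**2
    assert np.allclose(P1 + P2, 1.0)   # the probabilities add up to one at all times
    # P1 starts at 1 and first reaches zero when (A/ħ)·t = π/2
    print("P1(0) =", P1[0], "- first 'flip' around t =", np.pi * hbar / (2 * A))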

Now, the point you should note is that we get these time-dependent probabilities only because we’re introducing two different energy levels: E0 + A and E0 − A. [Note they are separated by an amount equal to 2·A, as I’ll use that information later.] If we’d have one energy level only – which amounts to saying that we know it, and that it’s something definite – then we’d just have one wavefunction, which we’d write as:

a·e^−iθ = a·e^−(i/ħ)·(E0·t − p·x) = a·e^−(i/ħ)·E0·t·e^(i/ħ)·p·x

Note that we can always split our wavefunction in a ‘time’ and a ‘space’ part, which is quite convenient. In fact, because our ammonia molecule stays where it is, it has no momentum: p = 0. Therefore, its wavefunction reduces to:

a·e^−iθ = a·e^−(i/ħ)·E0·t

As simple as it can be. 🙂 The point is that a wavefunction like this, i.e. a wavefunction that’s defined by a definite energy, will always yield a constant and equal probability, both in time as well as in space. That’s just the math of it: |a·e^−iθ|² = a². Always! If you want to know why, you should think of Euler’s formula and Pythagoras’ Theorem: cos²θ + sin²θ = 1. Always! 🙂

That constant probability is annoying, because our nitrogen atom never ‘flips’, and we know it actually does, thereby overcoming an energy barrier: it’s a phenomenon that’s referred to as ‘tunneling’, and it’s real! The probabilities in that graph above are real! Also, if our wavefunction would represent some moving particle, it would imply that the probability to find it somewhere in space is the same all over space, which implies our particle is everywhere and nowhere at the same time, really.

So, in quantum physics, this problem is solved by introducing uncertainty. Introducing some uncertainty about the energy, or about the momentum, is mathematically equivalent to saying that we’re actually looking at a composite wave, i.e. the sum of a finite or potentially infinite set of component waves. So we have the same ω = E/ħ and k = p/ħ relations, but we apply them to energy levels, or to some continuous range of energy levels ΔE. It amounts to saying that our wave function doesn’t have a specific frequency: it now has n frequencies, or a range of frequencies Δω = ΔE/ħ. In our two-state system, n = 2, obviously! So we’ve two energy levels only and so our composite wave consists of two component waves only.

We know what that does: it ensures our wavefunction is being ‘contained’ in some ‘envelope’. It becomes a wavetrain, or a kind of beat note, as illustrated below:

[Animation: a wave group, i.e. a wavetrain contained in an envelope]

[The animation comes from Wikipedia, and shows the difference between the group and phase velocity: the green dot shows the group velocity, while the red dot travels at the phase velocity.]
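You can verify the ‘envelope’ algebra numerically too. A minimal sketch (the two frequencies are arbitrary illustrative values, not anything measured): the sum of two component waves is a carrier at the average frequency inside a slow cosine envelope.

    import numpy as np

    w1, w2 = 9.5, 10.5                 # two closely spaced angular frequencies
    t = np.linspace(0, 30, 3001)

    psi = np.exp(-1j * w1 * t) + np.exp(-1j * w2 * t)

    # e^-i·w1·t + e^-i·w2·t = 2·cos(dw·t/2)·e^-i·w_avg·t
    w_avg, dw = (w1 + w2) / 2, w2 - w1
    envelope = 2 * np.cos(dw * t / 2)
    assert np.allclose(psi, envelope * np.exp(-1j * w_avg * t))
    assert np.allclose(np.abs(psi), np.abs(envelope))  # the magnitude follows the envelope
    print("composite wave = carrier at the average frequency x cosine envelope")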

So… OK. That should be clear enough. Let’s now apply these thoughts to our ‘reduced’ wavefunction:

Φ(φ) = a·e^−iφ = a·e^−i·(t − β·x)

Thinking about uncertainty

Frankly, I tried to fool you above. If the functional form of the wavefunction is a·e^−(i/ħ)·(E·t − p·x), then we can measure E and p in whatever unit we want, including h or ħ, but we cannot re-scale the argument of the function, i.e. the phase θ, without changing the functional form itself. I explained that in that post for my kids on wavefunctions, in which I showed that we may represent the same electromagnetic wave by two different functional forms:

 F(ct−x) = G(t−x/c)

So F and G represent the same wave, but they are different wavefunctions. In this regard, you should note that the argument of F is expressed in distance units, as we multiply t by the speed of light (so it’s as if our unit of time now corresponds to 299,792,458 m), while the argument of G is expressed in time units, as we divide x by the distance traveled in one second. But F and G are different functional forms. Just do an example and take a simple sine function: you’ll agree that sin(θ) ≠ sin(θ/c) for all values of θ, except 0. Re-scaling changes the frequency, or the wavelength, and it does so quite drastically in this case. 🙂 Likewise, you can see that a·e^i·(φ/E) = a·[e^iφ]^(1/E), so that’s a very different function. In short, we were a bit too adventurous above. Now, while we can drop the 1/ħ in the a·e^−(i/ħ)·(E·t − p·x) function when measuring energy and momentum in units that are numerically equal to ħ, we’ll just revert to our original wavefunction for the time being, which equals

Ψ(θ) = a·e^−iθ = a·e^−i·[(E/ħ)·t − (p/ħ)·x]

Let’s now introduce uncertainty once again. The simplest situation is that we have two closely spaced energy levels. In theory, the difference between the two can be as small as ħ, so we’d write: E = E0 ± ħ/2. [Remember what I said about the ± A: it means the difference is 2A.] However, we can generalize this and write: E = E0 ± n·ħ/2, with n = 1, 2, 3,… This does not imply any greater uncertainty – we still have two states only – but just a larger difference between the two energy levels.

Let’s also simplify by looking at the ‘time part’ of our equation only, i.e. a·e^−i·(E/ħ)·t. It doesn’t mean we don’t care about the ‘space part’: it just means that we’re only looking at how our function varies in time and so we just ‘fix’ or ‘freeze’ x. Now, the uncertainty is in the energy but, from a mathematical point of view, it becomes an uncertainty in the argument of our wavefunction. That argument is now equal to:

(E/ħ)·t = [(E0 ± n·ħ/2)/ħ]·t = (E0/ħ ± n/2)·t = (E0/ħ)·t ± (n/2)·t

So we can write:

a·e^−i·(E/ħ)·t = a·e^−i·[(E0/ħ)·t ± (n/2)·t] = a·e^−i·(E0/ħ)·t·e^∓i·(n/2)·t

This is valid for any value of t. What the expression says is that, from a mathematical point of view, introducing uncertainty about the energy is equivalent to introducing uncertainty about the wavefunction itself. It may be equal to a·e^−i·(E0/ħ)·t·e^−i·(n/2)·t, but it may also be equal to a·e^−i·(E0/ħ)·t·e^+i·(n/2)·t. The phases of the e^−i·(n/2)·t and e^+i·(n/2)·t factors are separated by an angle equal to n·t.

So… Well…

[…]

Hmm… I am stuck. How is this going to lead me to the ΔE·Δt = ħ/2 principle? To anyone out there: can you help? 🙂

[…]

The thing is: you won’t get the Uncertainty Principle by staring at that formula above. It’s a bit more complicated. The idea is that we have some distribution of the observables, like energy and momentum, and that implies some distribution of the associated frequencies, i.e. ω for E, and k for p. The Wikipedia article on the Uncertainty Principle gives you a formal derivation of the Uncertainty Principle, using the so-called Kennard formulation of it. You can have a look, but it involves a lot of formalism—which is what I wanted to avoid here!
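Having said that, a numerical illustration may help. The little sketch below builds a Gaussian wave packet and computes the Kennard product σx·σp numerically (ħ = 1, and the width σ is an arbitrary choice of mine). It is not a derivation – just numpy confirming that, for a Gaussian, the product comes out at exactly ħ/2:

    import numpy as np

    hbar, sigma = 1.0, 0.7
    x = np.linspace(-40, 40, 2**14)
    dx = x[1] - x[0]

    psi = np.exp(-x**2 / (4 * sigma**2))           # Gaussian amplitude
    psi /= np.sqrt(np.sum(np.abs(psi)**2) * dx)    # normalize so the probabilities integrate to 1

    sigma_x = np.sqrt(np.sum(x**2 * np.abs(psi)**2) * dx)   # <x> = 0 by symmetry

    # momentum space via FFT: p = hbar·k, with k the FFT wave numbers
    k = 2 * np.pi * np.fft.fftfreq(x.size, d=dx)
    prob_k = np.abs(np.fft.fft(psi))**2
    prob_k /= prob_k.sum()                         # normalized momentum-space probabilities
    sigma_p = hbar * np.sqrt(np.sum(k**2 * prob_k))  # <k> = 0 by symmetry

    print(sigma_x * sigma_p)   # ~ 0.5 = hbar/2: a Gaussian saturates the bound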

I hope you get the idea though. It’s like statistics. First, we assume we know the population, and then we describe that population using all kinds of summary statistics. But then we reverse the situation: we don’t know the population but we do have sample information, which we also describe using all kinds of summary statistics. Then, based on what we find for the sample, we calculate the estimated statistics for the population itself, like the mean value and the standard deviation, to name the most important ones. So it’s a bit the same here, except that, in quantum mechanics, there may not be any real value underneath: the mean and the standard deviation represent something fuzzy, rather than something precise.

Hmm… I’ll leave you with these thoughts. We’ll develop them further as we dig into all of this much deeper over the coming weeks. 🙂

Post scriptum: I know you expect something more from me, so… Well… Think about the following. If we have some uncertainty about the energy E, we’ll have some uncertainty about the momentum p according to that β = p/E relation. [By the way, please think about this relationship: it says, all other things being equal (such as the inertia, i.e. the mass, of our particle), that more energy will go into more momentum. More specifically, note that ∂p/∂E = β according to this equation. In fact, if we include the mass of our particle, i.e. its inertia, as potential energy, then we might say that (1−β)·E is the potential energy of our particle, as opposed to its kinetic energy.] So let’s try to think about that.

Let’s denote the uncertainty about the energy as ΔE. As should be obvious from the discussion above, it can be anything: it can mean two separate energy levels E = E0 ± A, or a potentially infinite set of values. However, even if the set is infinite, we know the various energy levels need to be separated by ħ, at least. So if the set is infinite, it’s going to be a countably infinite set, like the set of natural numbers, or the set of integers. But let’s stick to our example of two values E = E0 ± A only, with A = ħ, so E = E0 ± ħ and, therefore, ΔE = ± ħ. That implies Δp = Δ(β·E) = β·ΔE = ± β·ħ.

Hmm… This is a bit fishy, isn’t it? We said we’d measure the momentum in units of ħ, but so here we say the uncertainty in the momentum can actually be a fraction of ħ. […] Well… Yes. Now, the momentum is the product of the mass, as measured by the inertia of our particle to accelerations or decelerations, and its velocity. If we assume the inertia of our particle, or its mass, to be constant – so we say it’s a property of the object that is not subject to uncertainty, which, I admit, is a rather dicey assumption (if all other measurable properties of the particle are subject to uncertainty, then why not its mass?) – then we can also write: Δp = Δ(m·v) = Δ(m·β) = m·Δβ. [Note that we’re not only assuming that the mass is not subject to uncertainty, but also that the velocity is non-relativistic. If not, we couldn’t treat the particle’s mass as a constant.] But let’s be specific here: what we’re saying is that, if ΔE = ± ħ, then Δv = Δβ will be equal to Δβ = Δp/m = ± (β/m)·ħ. The point to note is that we’re no longer sure about the velocity of our particle. Its (relative) velocity is now:

β ± Δβ = β ± (β/m)·ħ

But, because velocity is the ratio of distance over time, this introduces an uncertainty about time and distance. Indeed, if its velocity is β ± (β/m)·ħ, then, over some time T, it will travel some distance X = [β ± (β/m)·ħ]·T. Likewise, if we have some distance X, then our particle will need a time equal to T = X/[β ± (β/m)·ħ].

You’ll wonder what I am trying to say because… Well… If we’d just measure X and T precisely, then all the uncertainty is gone and we know if the energy is E0 + ħ or E0 − ħ. Well… Yes and no. The uncertainty is fundamental – at least that’s what quantum physicists believe – so our uncertainty about the time and the distance we’re measuring is equally fundamental: we can have either of the two values X = [β ± (β/m)·ħ]·T or T = X/[β ± (β/m)·ħ], whenever or wherever we measure. So we have a ΔX and ΔT that are equal to ± [(β/m)·ħ]·T and X/[± (β/m)·ħ] respectively. We can relate this to ΔE and Δp:

  • ΔX = (1/m)·T·Δp
  • ΔT = X/[(β/m)·ΔE]

You’ll grumble: this still doesn’t give us the Uncertainty Principle in its canonical form. Not at all, really. I know… I need to do some more thinking here. But I feel I am getting somewhere. 🙂 Let me know if you see where, and if you think you can get any further. 🙂

The thing is: you’ll have to read a bit more about Fourier transforms and why and how variables like time and energy, or position and momentum, are so-called conjugate variables. As you can see, energy and time, and position and momentum, are obviously linked through the E·t and p·x products in the E0·t − p·x sum. That says a lot, and it helps us to understand, in a more intuitive way, why the ΔE·Δt and Δp·Δx products should obey the relation they are obeying, i.e. the Uncertainty Principle, which we write as ΔE·Δt ≥ ħ/2 and Δp·Δx ≥ ħ/2. But proving it involves more than just staring at that Ψ(θ) = a·e^−iθ = a·e^−i·[(E/ħ)·t − (p/ħ)·x] relation.

Having said that, it helps to think about how that E·t − p·x sum works. For example, think about two particles, a and b, with different velocity and mass, but with the same momentum, so pa = pb ⇔ ma·va = mb·vb ⇔ ma/vb = mb/va. The spatial frequency of the wavefunction would be the same for both but the temporal frequency would be different, because their energy incorporates the rest mass and, hence, because ma ≠ mb, we also know that Ea ≠ Eb. So… It all works out but, yes, I admit it’s all very strange, and it takes a long time and a lot of reflection to advance our understanding.
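Here’s a concrete instance of that remark, as a little numerical aside (ħ = c = 1; the masses and velocities are made-up, with the second mass tuned so the momenta match):

    import numpy as np

    def E_p(m, beta):
        gamma = 1.0 / np.sqrt(1.0 - beta**2)
        return gamma * m, gamma * m * beta   # relativistic energy and momentum

    Ea, pa = E_p(m=1.0, beta=0.8)      # particle a: light but fast
    Eb, pb = E_p(m=2.3094, beta=0.5)   # particle b: heavy but slow (m = 4/sqrt(3))

    assert abs(pa - pb) < 1e-4         # same momentum, so same spatial frequency k = p/hbar
    print("k =", pa)                   # ~ 1.333 for both
    print("wa =", Ea, "wb =", Eb)      # ~ 1.667 versus ~ 2.667: different temporal frequencies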

Occam’s Razor

The analysis of a two-state system (i.e. the rather famous example of an ammonia molecule ‘flipping’ from its ‘up’ to its ‘down’ state, or vice versa) in my previous post is a good opportunity to think about Occam’s Razor once more. What are we doing? What does the math tell us?

In the example we chose, we didn’t need to worry about space. It was all about time: an evolving state over time. We also knew the answers we wanted to get: if there is some probability for the system to ‘flip’ from one state to another, we know it will, at some point in time. We also want probabilities to add up to one, so we knew the graph below had to be the result we would find: if our molecule can be in two states only, and it starts off in one, then the probability that it will remain in that state will gradually decline, while the probability that it flips into the other state will gradually increase, which is what is depicted below.

[Graph: the probabilities P1 and P2 as a function of time, oscillating between 0 and 1 in opposite phase]

However, the graph above is only a Platonic idea: we don’t bother to actually verify what state the molecule is in. If we did, we’d have to ‘re-set’ our t = 0 point, and start all over again. The wavefunction would collapse, as they say, because we’ve made a measurement. However, having said that, yes, in the physicist’s Platonic world of ideas, the probability functions above make perfect sense. They are beautiful. You should note, for example, that P1 (i.e. the probability to be in state 1) and P2 (i.e. the probability to be in state 2) add up to 1 at all times, so we don’t need to integrate over a cycle or anything: it’s all perfect!

These probability functions are based on ideas that are even more Platonic: interfering amplitudes. Let me explain.

Quantum physics is based on the idea that these probabilities are determined by some wavefunction, a complex-valued amplitude that varies in time and space. It’s a two-dimensional thing, and then it’s not. It’s two-dimensional because it combines a sine and cosine, i.e. a real and an imaginary part, but the argument of the sine and the cosine is the same, and the sine and cosine are the same function, except for a phase shift equal to π/2. We write:

a·e^−iθ = a·cos(−θ) + i·a·sin(−θ) = a·cosθ − i·a·sinθ

The minus sign is there because it turns out that Nature measures angles, i.e. our phase, clockwise, rather than counterclockwise, so that’s not as per our mathematical convention. But that’s a minor detail, really. [It should give you some food for thought, though.] For the rest, the related graph is as simple as the formula:

[Graph: the cosine (real part) and sine (imaginary part) of the wavefunction]

Now, the phase of this wavefunction is written as θ = (ω·t − k∙x). Hence, ω determines how this wavefunction varies in time, and the wavevector k tells us how this wave varies in space. The young French prince Louis de Broglie noted the mathematical similarity between the ω·t − k∙x expression and Einstein’s four-vector product pμxμ = E·t − p·x, which remains invariant under a Lorentz transformation. He also understood that the Planck-Einstein relation E = ħ·ω actually defines the energy unit and, therefore, that any frequency, any oscillation really, in space or in time, is to be expressed in terms of ħ.

[To be precise, the fundamental quantum of energy is h = ħ·2π, because that’s the energy of one cycle. To illustrate the point, think of the Planck-Einstein relation. It gives us the energy of a photon with frequency f: Eγ = h·f. If we re-write this equation as Eγ/f = h, and we do a dimensional analysis, we get: h = Eγ/f ⇔ 6.626×10⁻³⁴ joule·second = [x joule]/[x cycles per second] ⇔ h = 6.626×10⁻³⁴ joule per cycle. It’s only because we are expressing ω and k as angular frequencies (i.e. in radians per second or per meter, rather than in cycles per second or per meter) that we have to think of ħ = h/2π rather than h.]

Louis de Broglie connected the dots between some other equations too. He was fully familiar with the equations determining the phase and group velocity of composite waves, or a wavetrain that actually might represent a wavicle traveling through spacetime. In short, he boldly equated ω with ω = E/ħ and k with k = p/ħ, and all came out alright. It made perfect sense!
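Just to make that tangible, here’s a small numerical sketch (ħ = c = 1, with an arbitrary unit mass – my choices, not de Broglie’s) that checks the two velocities coming out of the ω = E/ħ and k = p/ħ substitutions: the group velocity dω/dk equals the classical velocity β = p/E, and the phase velocity ω/k equals 1/β, so their product is c² = 1.

    import numpy as np

    m = 1.0
    k = np.linspace(0.5, 5.0, 1001)   # wave number, i.e. the momentum p in these units
    omega = np.sqrt(k**2 + m**2)      # dispersion relation: w = E = sqrt(p^2 + m^2)

    v_phase = omega / k               # E/p = c^2/v: superluminal, but it carries no signal
    v_group = np.gradient(omega, k)   # dw/dk, computed numerically
    v_classical = k / omega           # beta = p/E, the particle's classical velocity

    inner = slice(1, -1)              # skip the less accurate edge derivatives
    assert np.allclose(v_group[inner], v_classical[inner], atol=1e-4)
    assert np.allclose((v_phase * v_group)[inner], 1.0, atol=1e-4)
    print("group velocity = p/E, phase velocity = E/p: it all comes out alright")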

I’ve written enough about this. What I want to write about here is how this also works for the situation at hand: a simple two-state system that depends on time only. So its phase is θ = ω·t = (E0/ħ)·t. What’s E0? It is the total energy of the system, including the equivalent energy of the particle’s rest mass and any potential energy that may be there because of the presence of one or the other force field. What about kinetic energy? Well… We said it: in this case, there is no translational or linear momentum, so p = 0. So our Platonic wavefunction reduces to:

a·e^−iθ = a·e^−(i/ħ)·E0·t

Great! […] But… Well… No! The problem with this wavefunction is that it yields a constant probability. To be precise, when we take the absolute square of this wavefunction – which is what we do when calculating a probability from a wavefunction − we get P = a², always. The ‘normalization’ condition (so that’s the condition that probabilities have to add up to one) implies that P1 = P2 = a² = 1/2. Makes sense, you’ll say, but the problem is that this doesn’t reflect reality: these probabilities do not evolve over time and, hence, our ammonia molecule never ‘flips’ from its ‘up’ to its ‘down’ state, or vice versa. In short, our wavefunction does not explain reality.

The problem is not unlike the problem we’d had with a similar function relating the momentum and the position of a particle. You’ll remember it: we wrote it as a·e^−iθ = a·e^(i/ħ)·p·x. [Note that we can write a·e^−iθ = a·e^−(i/ħ)·(E0·t − p·x) = a·e^−(i/ħ)·E0·t·e^(i/ħ)·p·x, so we can always split our wavefunction in a ‘time’ and a ‘space’ part.] But then we found that this wavefunction also yielded a constant and equal probability all over space, which implies our particle is everywhere (and, therefore, nowhere, really).

In quantum physics, this problem is solved by introducing uncertainty. Introducing some uncertainty about the energy, or about the momentum, is mathematically equivalent to saying that we’re actually looking at a composite wave, i.e. the sum of a finite or infinite set of component waves. So we have the same ω = E/ħ and k = p/ħ relations, but we apply them to n energy levels, or to some continuous range of energy levels ΔE. It amounts to saying that our wave function doesn’t have a specific frequency: it now has n frequencies, or a range of frequencies Δω = ΔE/ħ.

We know what that does: it ensures our wavefunction is being ‘contained’ in some ‘envelope’. It becomes a wavetrain, or a kind of beat note, as illustrated below:

[Animation: a wave group, i.e. a wavetrain contained in an envelope]

[The animation also shows the difference between the group and phase velocity: the green dot shows the group velocity, while the red dot travels at the phase velocity.]

This begs the following question: what’s the uncertainty really? Is it an uncertainty in the energy, or is it an uncertainty in the wavefunction? I mean: we have a function relating the energy to a frequency. Introducing some uncertainty about the energy is mathematically equivalent to introducing uncertainty about the frequency. Of course, the answer is: the uncertainty is in both, so it’s in the frequency and in the energy and both are related through the wavefunction. So… Well… Yes. In some way, we’re chasing our own tail. 🙂

However, the trick does the job, and perfectly so. Let me summarize what we did in the previous post: we had the ammonia molecule, i.e. an NH3 molecule, with the nitrogen ‘flipping’ across the hydrogens from time to time, as illustrated below:

[Figure: the two states of the NH3 molecule, with the nitrogen above or below the plane of the hydrogens and, hence, opposite electric dipole moments]

This ‘flip’ requires energy, which is why we associate two energy levels with the molecule, rather than just one. We wrote these two energy levels as E0 + A and E0 − A. That assumption solved all of our problems. [Note that we don’t specify what the energy barrier really consists of: moving the center of mass obviously requires some energy, but it is likely that a ‘flip’ also involves overcoming some electrostatic forces, as shown by the reversal of the electric dipole moment in the illustration above.] To be specific, it gave us the following wavefunctions for the amplitude to be in the ‘up’ or ‘1’ state versus the ‘down’ or ‘2’ state respectively:

  • C1 = (1/2)·e^−(i/ħ)·(E0 − A)·t + (1/2)·e^−(i/ħ)·(E0 + A)·t
  • C2 = (1/2)·e^−(i/ħ)·(E0 − A)·t − (1/2)·e^−(i/ħ)·(E0 + A)·t

Both are composite waves. To be precise, they are the sum of two component waves with a temporal frequency equal to ω1 = (E0 − A)/ħ and ω2 = (E0 + A)/ħ respectively. [As for the minus sign in front of the second term in the wave equation for C2, −1 = e^±iπ, so + (1/2)·e^−(i/ħ)·(E0 + A)·t and − (1/2)·e^−(i/ħ)·(E0 + A)·t are the same wavefunction: they only differ because their relative phase is shifted by ±π.] So the so-called base states of the molecule themselves are associated with two different energy levels: it’s not like one state has more energy than the other.

You’ll say: so what?

Well… Nothing. That’s it really. That’s all I wanted to say here. The absolute square of those two wavefunctions gives us those time-dependent probabilities above, i.e. the graph we started this post with. So… Well… Done!

You’ll say: where’s the ‘envelope’? Oh! Yes! Let me tell you. The C1(t) and C2(t) equations can be re-written as:

  • C1 = e^−(i/ħ)·E0·t·(1/2)·[e^(i/ħ)·A·t + e^−(i/ħ)·A·t]
  • C2 = e^−(i/ħ)·E0·t·(1/2)·[e^(i/ħ)·A·t − e^−(i/ħ)·A·t]

Now, remembering our rules for adding and subtracting complex conjugates (e^iθ + e^−iθ = 2·cosθ and e^iθ − e^−iθ = 2i·sinθ), we can re-write this as:

  • C1 = e^−(i/ħ)·E0·t·cos(A·t/ħ)
  • C2 = i·e^−(i/ħ)·E0·t·sin(A·t/ħ)

So there we are! We’ve got wave equations whose temporal variation is basically defined by E0 but, on top of that, we have an envelope here: the cos(A·t/ħ) and sin(A·t/ħ) factor respectively. So their magnitude is no longer time-independent: both the phase as well as the amplitude now vary with time. The associated probabilities are the ones we plotted – and the little numerical check right below the list confirms them:

  • |C1(t)|² = cos²[(A/ħ)·t], and
  • |C2(t)|² = sin²[(A/ħ)·t].
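The check itself is a one-minute job (ħ = 1, and E0 and A are made-up values): the two-term sums really do factor into a carrier times a cosine or sine envelope.

    import numpy as np

    hbar, E0, A = 1.0, 10.0, 0.5
    t = np.linspace(0, 25, 2501)

    C1 = 0.5 * np.exp(-1j * (E0 - A) * t) + 0.5 * np.exp(-1j * (E0 + A) * t)
    C2 = 0.5 * np.exp(-1j * (E0 - A) * t) - 0.5 * np.exp(-1j * (E0 + A) * t)

    carrier = np.exp(-1j * E0 * t)
    assert np.allclose(C1, carrier * np.cos(A * t / hbar))        # C1 = carrier x cos(A·t/hbar)
    assert np.allclose(C2, 1j * carrier * np.sin(A * t / hbar))   # C2 = i x carrier x sin(A·t/hbar)
    assert np.allclose(np.abs(C1)**2 + np.abs(C2)**2, 1.0)        # cos^2 + sin^2 = 1
    print("the envelope factorization checks out")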

So, to summarize it all once more, allowing the nitrogen atom to push its way through the three hydrogens, so as to flip to the other side, thereby overcoming the energy barrier, is equivalent to associating two energy levels to the ammonia molecule as a whole, thereby introducing some uncertainty, or indefiniteness as to its energy, and that, in turn, gives us the amplitudes and probabilities that we’ve just calculated. [And you may want to note here that the probabilities “sloshing back and forth”, or “dumping into each other” – as Feynman puts it – is the result of the varying magnitudes of our amplitudes, so that’s the ‘envelope’ effect. It’s only because the magnitudes vary in time that their absolute square, i.e. the associated probability, varies too.]

So… Well… That’s it. I think this and all of the previous posts served as a nice introduction to quantum physics. More in particular, I hope this post made you appreciate that the mathematical framework is not as horrendous as it often seems to be.

When thinking about it, it’s actually all quite straightforward, and it surely respects Occam’s principle of parsimony in philosophical and scientific thought, also known as Occam’s Razor: “When trying to explain something, it is vain to do with more what can be done with less.” So the math we need is the math we need, really: nothing more, nothing less. As I’ve said a couple of times already, Occam would have loved the math behind QM: the physics calls for the math, and the math becomes the physics.

That’s what makes it beautiful. 🙂

Post scriptum:

One might think that the addition of a term in the argument would, in itself, lead to a beat note and, hence, a varying probability but, no! We may look at e^−(i/ħ)·(E0 + A)·t as a product of two amplitudes:

e^−(i/ħ)·(E0 + A)·t = e^−(i/ħ)·E0·t·e^−(i/ħ)·A·t

But, when writing this all out, one just gets a cos(α·t + β·t) − i·sin(α·t + β·t), whose absolute square |cos(α·t + β·t) − i·sin(α·t + β·t)|² = 1. However, writing e^−(i/ħ)·(E0 + A)·t as a product of two amplitudes is interesting in itself. We multiply amplitudes when an event consists of two sub-events. For example, the amplitude for some particle to go from s to x via some point a is written as:

〈 x | s 〉via a = 〈 x | a 〉〈 a | s 〉

Having said that, the graph of the product is uninteresting: the real and imaginary part of the wavefunction are a simple sine and cosine function, and their absolute square is constant, as shown below.

[Graph: the real and imaginary parts of the product, i.e. a simple cosine and sine, with a constant absolute square]

Adding two waves with very different frequencies – A is a fraction of E0 – gives a much more interesting pattern, like the one below, which shows an e^−iαt + e^−iβt = cos(αt) − i·sin(αt) + cos(βt) − i·sin(βt) = cos(αt) + cos(βt) − i·[sin(αt) + sin(βt)] pattern for α = 1 and β = 0.1.

[Graph: the real and imaginary parts of e^−iαt + e^−iβt for α = 1 and β = 0.1]

That doesn’t look like a beat note, does it? The graphs below, which use 0.5 and 0.01 for β respectively, are not typical beat notes either.

[Graphs: the same patterns for β = 0.5 and β = 0.01 respectively]
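If you want to reproduce these graphs yourself and you have matplotlib installed, a minimal sketch (change β to 0.5 or 0.01 for the other two patterns):

    import numpy as np
    import matplotlib.pyplot as plt

    alpha, beta = 1.0, 0.1
    t = np.linspace(0, 200, 5000)
    psi = np.exp(-1j * alpha * t) + np.exp(-1j * beta * t)

    plt.plot(t, psi.real, label="real part")
    plt.plot(t, psi.imag, label="imaginary part")
    plt.legend()
    plt.show()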

We get our typical ‘beat note’ only when we’re looking at a wave traveling in space, so then we involve the space variable again, and the relations that come with it, i.e. a phase velocity vp = ω/k = (E/ħ)/(p/ħ) = E/p = c²/v (read: all component waves travel at the same speed), and a group velocity vg = dω/dk = v (read: the composite wave or wavetrain travels at the classical speed of our particle, so it travels with the particle, so to speak). That’s what I’ve shown numerous times already, but I’ll insert one more animation here, just to make sure you see what we’re talking about. [Credit for the animation goes to another site, one on acoustics, actually!]

[Animation: a traveling beat note, with the envelope moving at the group velocity]

So what’s left? Nothing much. The only thing you may want to do is to continue thinking about that wavefunction. It’s tempting to think it actually is the particle, somehow. But it isn’t. So what is it then? Well… Nobody knows, really, but I like to think it does travel with the particle. So it’s like a fundamental property of the particle. We need it every time we try to measure something: its position, its momentum, its spin (i.e. angular momentum) or, in the example of our ammonia molecule, its orientation in space. So the funny thing is that, in quantum mechanics,

  1. We can measure probabilities only, so there’s always some randomness. That’s how Nature works: we don’t really know what’s happening. We don’t know the internal wheels and gears, so to speak, or the ‘hidden variables’, as one interpretation of quantum mechanics would say. In fact, the most commonly accepted interpretation of quantum mechanics says there are no ‘hidden variables’.
  2. But then, as Polonius famously put it, there is a method in this madness, and the pioneers – I mean Werner Heisenberg, Louis de Broglie, Niels Bohr, Paul Dirac, etcetera – discovered it. All probabilities can be found by taking the square of the absolute value of a complex-valued wavefunction (often denoted by Ψ), whose argument, or phase (θ), is given by the de Broglie relations ω = E/ħ and k = p/ħ:

θ = (ω·t − k ∙x) = (E/ħ)·t − (p/ħ)·x

That should be obvious by now, as I’ve written dozens of posts on this by now. 🙂 I still have trouble interpreting this, however—and I am not ashamed, because the Great Ones I just mentioned have trouble with that too. But let’s try to go as far as we can by making a few remarks:

  • Adding two terms in math implies the two terms should have the same dimension: we can only add apples to apples, and oranges to oranges. We shouldn’t mix them. Now, the (E/ħ)·t and (p/ħ)·x terms are actually dimensionless: they are pure numbers. So that’s even better. Just check it: energy is expressed in newton·meter (force over distance, remember?) or electronvolts (1 eV = 1.6×10⁻¹⁹ J = 1.6×10⁻¹⁹ N·m); Planck’s constant, as the quantum of action, is expressed in J·s or eV·s; and the unit of (linear) momentum is 1 N·s = 1 kg·m/s. E/ħ gives a number expressed per second, and p/ħ a number expressed per meter. Therefore, multiplying them by t and x respectively gives us a dimensionless number indeed.
  • It’s also an invariant number, which means we’ll always get the same value for it. As mentioned above, that’s because the four-vector product pμxμ = E·t − px is invariant: it doesn’t change when analyzing a phenomenon in one reference frame (e.g. our inertial reference frame) or another (i.e. in a moving frame).
  • Now, Planck’s quantum of action h or ħ (they only differ because h relates to a frequency expressed in cycles per second, while ħ relates to a frequency expressed in radians per second) is the quantum of energy really. Indeed, if “energy is the currency of the Universe”, and it’s real and/or virtual photons who are exchanging it, then it’s good to know the currency unit is h, i.e. the energy that’s associated with one cycle of a photon.
  • It’s not only time and space that are related, as evidenced by the fact that the t − x four-vector product is invariant; E and p are related too, of course! They are related through the classical velocity of the particle that we’re looking at: E/p = c²/v and, therefore, we can write: E·β = p·c, with β = v/c, i.e. the relative velocity of our particle, as measured as a ratio of the speed of light. Now, I should add that the t − x product is invariant only if we measure time and space in equivalent units. Otherwise, we have to write c·t − x. If we do that – so our unit of distance becomes the distance traveled by light in one second or, alternatively, our unit of time becomes the time that is needed for light to travel one meter – then c = 1, and the E·β = p·c relation becomes E·β = p, which we also write as β = p/E: the ratio of the energy and the momentum of our particle is its (relative) velocity. [See the quick numerical check right below this list.]
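A quick numerical sanity check of those claims, in plain old SI units (the velocity, time and distance are made-up values of mine; the constants are rounded CODATA-style numbers):

    c, hbar = 299792458.0, 1.0545718e-34   # m/s and J·s
    m_e = 9.1093837e-31                    # kg, electron rest mass

    beta = 0.6                             # a made-up relative velocity
    gamma = 1.0 / (1.0 - beta**2)**0.5
    E = gamma * m_e * c**2                 # total energy, in joule
    p = gamma * m_e * beta * c             # momentum, in N·s

    t, x = 1e-20, 1e-12                    # a made-up time (s) and distance (m)
    theta = (E / hbar) * t - (p / hbar) * x
    print("theta =", theta)                # a pure, dimensionless number

    print(beta, "=", p * c / E)            # E·beta = p·c, i.e. beta = p·c/E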

Combining all of the above, we may want to assume that we are measuring energy and momentum in terms of the Planck constant, i.e. the ‘natural’ unit for both. In addition, we may also want to assume that we’re measuring time and distance in equivalent units. Then the equation for the phase of our wavefunctions reduces to:

θ = (ω·t − k ∙x) = E·t − p·x

Now, θ is the argument of a wavefunction, and we can always re-scale such argument by multiplying or dividing it by some constant. It’s just like writing the argument of a wavefunction as v·t − x or (v·t − x)/v = t − x/v, with v the velocity of the waveform that we happen to be looking at. [In case you have trouble following this argument, please check the post I did for my kids on waves and wavefunctions.] Now, the energy conservation principle tells us the energy of a free particle won’t change. [Just to remind you, a ‘free particle’ means it is present in a ‘field-free’ space, so our particle is in a region of uniform potential.] You see what I am going to do now: we can, in this case, treat E as a constant, and divide E·t − p·x by E, so we get a re-scaled phase for our wavefunction, which I’ll write as:

φ = (E·t − p·x)/E = t − (p/E)·x = t − β·x

Now that’s the argument of a wavefunction with the argument expressed in distance units. Alternatively, we could also look at p as some constant, as there is no variation in potential energy that will cause a change in momentum, i.e. in kinetic energy. We’d then divide by p and we’d get (E·t − p·x)/p = (E/p)·t − x = t/β − x, which amounts to the same, as we can always re-scale by multiplying it with β, which would then yield the same t − β·x argument.

The point is, if we measure energy and momentum in terms of the Planck unit (I mean: in terms of the Planck constant, i.e. the quantum of energy), and if we measure time and distance in ‘natural’ units too, i.e. we take the speed of light to be unity, then our Platonic wavefunction becomes as simple as:

Φ(φ) = a·e^−iφ = a·e^−i·(t − β·x)

This is a wonderful formula, but let me first answer your most likely question: why would we use a relative velocity? Well… Just think of it: when everything is said and done, the whole theory of relativity and, hence, the whole of physics, is based on one fundamental and experimentally verified fact: the speed of light is absolute. In whatever reference frame, we will always measure it as 299,792,458 m/s. That’s obvious, you’ll say, but it’s actually the weirdest thing ever if you start thinking about it, and it explains why those Lorentz transformations look so damn complicated. In any case, this fact legitimately establishes c as some kind of absolute measure against which all speeds can be measured. Therefore, it is only natural indeed to express a velocity as some number between 0 and 1. Now that amounts to expressing it as the β = v/c ratio.

Let’s now go back to that Φ(φ) = a·e^−iφ = a·e^−i·(t − β·x) wavefunction. Its temporal frequency ω is equal to one, and its spatial frequency k is equal to β = v/c. It couldn’t be simpler but, of course, we’ve got this remarkably simple result because we re-scaled the argument of our wavefunction using the energy and momentum itself as the scale factor. So, yes, we can re-write the wavefunction of our particle in a particularly elegant and simple form using the only information that we have when looking at quantum-mechanical stuff: energy and momentum, because that’s what everything reduces to at that level.

Of course, the analysis above does not include uncertainty. Our information on the energy and the momentum of our particle will be incomplete: we’ll write E = E0 ± σE, and p = p0 ± σp. [I am a bit tired of using the Δ symbol, so I am using the σ symbol here, which denotes a standard deviation of some density function. It underlines the probabilistic, or statistical, nature of our approach.] But, including that, we’ve pretty much explained what quantum physics is about here.

You just need to get used to that complex exponential: e^−iφ = cos(−φ) + i·sin(−φ) = cos(φ) − i·sin(φ). Of course, it would have been nice if Nature would have given us a simple sine or cosine function. [Remember the sine and cosine function are actually the same, except for a phase difference of 90 degrees: sin(φ) = cos(π/2 − φ) = cos(φ − π/2). So we can always go from one to the other by shifting the origin of our axis.] But… Well… As we’ve shown so many times already, a real-valued wavefunction doesn’t explain the interference we observe, be it interference of electrons or whatever other particles or, for that matter, the interference of electromagnetic waves themselves, which, as you know, we also need to look at as a stream of photons, i.e. light quanta, rather than as some kind of infinitely flexible aether that’s undulating, like water or air.

So… Well… Just accept that eiφ is a very simple periodic function, consisting of two sine waves rather than just one, as illustrated below.

[Graph: the two components of e^−iφ, i.e. a cosine and a sine, shifted by 90 degrees]

And then you need to think of stuff like this (the animation is taken from Wikipedia), but then with a projection of the sine of those phasors too. It’s all great fun, so I’ll let you play with it now. 🙂

[Animation: rotating phasors adding up in the complex plane]


Quantum math: states as vectors, and apparatuses as operators

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. In addition, I note the dark force has amused himself by removing some material. So no use to read this. Read my recent papers instead. 🙂

Original post:

I actually wanted to write about the Hamiltonian matrix. However, I realize that, before I can serve the plat de résistance, we need to review or introduce some more concepts and ideas. It all revolves around the same theme: working with states is like working with vectors, but so you need to know how exactly. Let’s go for it. 🙂

In my previous posts, I repeatedly said that a set of base states is like a coordinate system. A coordinate system allows us to describe (i.e. uniquely identify) vectors in an n-dimensional space: we associate a vector with a set of real numbers, like x, y and z, for example. Likewise, we can describe any state in terms of a set of complex numbers – amplitudes, really – once we’ve chosen a set of base states. We referred to this set of base states as a ‘representation’. For example, if our set of base states is +S, 0S and −S, then any state φ can be defined by the amplitudes C+ = 〈 +S | φ 〉, C0 = 〈 0S | φ 〉, and C− = 〈 −S | φ 〉.

We have to choose some representation (but we are free to choose which one) because, as I demonstrated when doing a practical example (see my description of muon decay in my post on how to work with amplitudes), we’ll usually want to calculate something like the amplitude to go from one state to another – which we denoted as 〈 χ | φ 〉 – and we’ll do that by breaking it up. To be precise, we’ll write that amplitude 〈 χ | φ 〉 – i.e. the amplitude to go from state φ to state χ (you have to read this thing from right to left, like Hebrew or Arabic) – as the following sum:

〈 χ | φ 〉 = ∑ 〈 χ | i 〉〈 i | φ 〉 (over all i)

So that’s a sum over a complete set of base states (that’s why I write all i under the summation symbol ∑). We discussed this rule in our presentation of the ‘Laws’ of quantum math.

Now we can play with this. As χ can be defined in terms of the chosen set of base states too, it’s handy to know that 〈 χ | i 〉 and 〈 i | χ 〉 are each other’s complex conjugates – we write this as: 〈 χ | i 〉 = 〈 i | χ 〉* – so if we have one, we have the other (we can also write: 〈 i | χ 〉* = 〈 χ | i 〉). In other words, if we have all Ci = 〈 i | φ 〉 and all Di = 〈 i | χ 〉, i.e. the ‘components’ of both states in terms of our base states, then we can calculate 〈 χ | φ 〉 as:

〈 χ | φ 〉 = ∑ Di*Ci = ∑〈 χ | i 〉〈 i | φ 〉,

provided we make sure we do the summation over a complete set of base states. For example, if we’re looking at the angular momentum of a spin-1/2 particle, like an electron or a proton, then we’ll have two base states, +ħ/2 and −ħ/2, so then we’ll have only two terms in our sum, but the spin number (j) of a cobalt nucleus is 7/2, so if we’d be looking at the angular momentum of a cobalt nucleus, we’ll have eight (2·j + 1) base states and, hence, eight terms when doing the sum. So it’s very much like working with vectors, indeed, and that’s why states are often referred to as state vectors. So now you know that term too. 🙂
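Here’s what that looks like in practice – a minimal numpy sketch with two base states and made-up (normalized) amplitudes:

    import numpy as np

    C = np.array([0.6, 0.8j])                 # Ci = < i | phi > (made up, |C1|^2 + |C2|^2 = 1)
    D = np.array([1.0, 1.0]) / np.sqrt(2)     # Di = < i | chi >

    amp = np.sum(np.conj(D) * C)              # < chi | phi > = sum of Di*·Ci
    amp_rev = np.sum(np.conj(C) * D)          # < phi | chi > = sum of Ci*·Di

    assert np.isclose(amp, np.conj(amp_rev))  # < chi | phi > = < phi | chi >*
    print(amp, abs(amp)**2)                   # the amplitude and the associated probability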

However, the similarities run even deeper, and we’ll explore all of them in this post. You may or may not remember that your math teacher actually also defined ordinary vectors in three-dimensional space in terms of base vectors ei, defined as: e1 = [1, 0, 0], e2 = [0, 1, 0] and e3 = [0, 0, 1]. You may also remember that the units along the x, y and z-axis didn’t have to be the same – we could, for example, measure in cm along the x-axis, but in inches along the z-axis, even if that’s not very convenient to calculate stuff – but that it was very important to ensure that the base vectors were a set of orthogonal vectors. In any case, we’d chose our set of orthogonal base vectors and write all of our vectors as:

A = Ax·e1 + Ay·e2 + Az·e3

That’s simple enough. In fact, one might say that the equation above actually defines coordinates. However, there’s another way of defining them. We can write Ax, Ay, and Az as vector dot products, aka scalar vector products (as opposed to cross products, or vector products tout court). Check it:

Ax = A·e1, Ay = A·e2, and Az = A·e3.

This actually allows us to re-write the vector dot product A·B in a way you probably haven’t seen before. Indeed, you’d usually calculate A·B as |A|∙|B|·cosθ = A∙B·cosθ (A and B are the magnitudes of the vectors A and B respectively) or, quite simply, as AxBx + AyBy + AzBz. However, using the dot products above, we can now also write it as:

B·A = ∑ (B·ei)(ei·A) (over all i)

We deliberately wrote B·A instead of A·B because, while the mathematical similarity with the

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉

equation is obvious, B·A = A·B but 〈 χ | φ 〉 ≠ 〈 φ | χ 〉. Indeed, 〈 χ | φ 〉 and 〈 φ | χ 〉 are complex conjugates – so 〈 χ | φ 〉 = 〈 φ | χ 〉* – but they’re not equal. So we’ll have to watch the order when working with those amplitudes. That’s because we’re working with complex numbers instead of real numbers. Indeed, it’s only because the A·B dot product involves real numbers, whose complex conjugate is the same, that we have that commutativity in the real vector space. Apart from that – so apart from having to carefully check the order of our products – the correspondence is complete.

Let me mention another similarity here. As mentioned above, our base vectors ei had to be orthogonal. We can write this condition as:

ei·ej = δij, with δij = 0 if i ≠ j, and 1 if i = j.

Now, our first quantum-mechanical rule says the same:

〈 i | j 〉 = δij, with δij = 0 if i ≠ j, and 1 if i = j.

So our set of base states also has to be ‘orthogonal’, which is the term you’ll find in physics textbooks, although – as evidenced from our discussion on the base states for measuring angular momentum – one should not try to give any geometrical interpretation here: +ħ/2 and −ħ/2 (so that’s spin ‘up’ and ‘down’ respectively) are not ‘orthogonal’ in any geometric sense, indeed. It’s just that pure states, i.e. base states, are separate, which we write as: 〈 ‘up’ | ‘down’ 〉 = 〈 ‘down’ | ‘up’ 〉 = 0 and 〈 ‘up’ | ‘up’ 〉 = 〈 ‘down’ | ‘down’ 〉 = 1. It just means they are just different base states, and so it’s one or the other. For our +S, 0S and −S example, we’d have nine such amplitudes, and we can organize them in a little matrix:

[Matrix: the 3×3 table of the 〈 i | j 〉 amplitudes for the +S, 0S and −S base states, i.e. the identity matrix]

In fact, just like we defined the base vectors ei as e1 = [1, 0, 0], e2 = [0, 1, 0] and e3 = [0, 0, 1] respectively, we may say that the matrix above, which states exactly the same as the 〈 i | j 〉 = δij rule, can serve as a definition of what base states actually are. [Having said that, it’s obvious we like to believe that base states are more than just mathematical constructs: we’re talking reality here. The angular momentum as measured in the x-, y- or z-direction, or in whatever direction, is more than just a number.]

OK. You get this. In fact, you’re probably getting impatient because this is too simple for you. So let’s take another step. We showed that the 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 and B·A = ∑(B·ei)(ei·A) expressions are structurally equivalent – from a mathematical point of view, that is – but B and A are separate vectors, while 〈 χ | φ 〉 is just a complex number. Right?

Well… No. We can actually analyze the bra and the ket in the 〈 χ | φ 〉 bra-ket as separate pieces too. Moreover, we’ll show they are actually state vectors too, even if the bra, i.e. 〈 χ |, and the ket, i.e. | φ 〉, are ‘unfinished pieces’, so to speak. Let’s be bold. Let’s just cut the 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 expression in two by writing:

〈 χ | = ∑ 〈 χ | i 〉〈 i | and | φ 〉 = ∑ | i 〉〈 i | φ 〉

Huh? 

Yes. That’s the power of Dirac’s bra-ket notation: we can just drop symbols left or right. It’s quite incredible. But, of course, the question is: so what does this actually mean? Well… Don’t rack your brain. I’ll tell you. We define | φ 〉 as a state vector because we define | i 〉 as a (base) state vector. Look at it this way: we wrote the 〈 +S | φ 〉, 〈 0S | φ 〉 and 〈 −S | φ 〉 amplitudes as C+, C0 and C− respectively, so we can write the equation above as:

| φ 〉 = ∑ | i 〉 Ci

So we’ve got a sum of products here, and it’s just like A = Ax·e1 + Ay·e2 + Az·e3. Just substitute the Ci coefficients for the Ai coefficients, and the | i 〉 base states for the ei base vectors. We get:

| φ 〉 = |+S〉 C+ + |0S〉 C0 + |−S〉 C−

Of course, you’ll wonder what those terms mean: what does it mean to ‘multiply’ C+ (remember: C+  is some complex number) by |+S〉? Be patient. Just wait. You’ll understand when we do some examples, so when you start working with this stuff. You’ll see it all makes sense—later. 🙂

Of course, we’ll have a similar equation for | χ 〉, and so if we write 〈 i | χ 〉 as Di, then we can write | χ 〉 = ∑ | i 〉〈 i | χ 〉 as | χ 〉 = ∑ | i 〉 Di.

So what? Again: be patient. We know that 〈 χ | i 〉 = 〈 i | χ 〉*, so our second equation above becomes:

〈 χ | = ∑ Dj*·〈 j | (over all j)

You’ll have two questions now. The first is the same as the one above: what does it mean to ‘multiply’, let’s say, D0* (i.e. the complex conjugate of D0, so if D0 = a + ib, then D0* = a − ib) with 〈0S|? The answer is the same: be patient. 🙂 Your second question is: why do I use another symbol for the index here? Why j instead of i? Well… We’ll have to re-combine stuff, so it’s better to keep things separate by using another symbol for the same index. 🙂

In fact, let’s re-combine stuff right now, in exactly the same way as we took it apart: we just write the two things right next to each other. We get the following:

〈 χ | φ 〉 = ∑j ∑i Dj*·〈 j | i 〉·Ci = ∑i Di*·Ci

What? Is that it? So we went through all of this hocus-pocus just to find the same equation as we started out with?

Yes. I had to take you through this so you get used to juggling all those symbols, because that’s what we’ll do in the next post. Just think about it and give yourself some time. I know you’ve probably never ever handled such exercise in symbols before – I haven’t, for sure! – but it all makes sense: we cut and paste. It’s all great! 🙂 [Oh… In case you wonder about the transition from the sum involving i and j to the sum involving i only, think about the Kronecker expression: 〈 j | i 〉 = δij, with δij = 0 if i ≠ j, and 1 if i = j, so most of the terms are zero.]

To summarize the whole discussion, note that the expression above is completely analogous with the B·A = BxAx + ByAy + BzAz formula. The only difference is that we’re talking complex numbers here, so we need to watch out. We have to watch the order of stuff, and we can’t use the Di numbers themselves: we have to use their complex conjugates Di*. But, for the rest, we’re all set! 🙂 If we’ve got a set of base states, then we can define any state in terms of a set of ‘coordinates’ or ‘coefficients’ – i.e. the Ci or Di numbers for the φ or χ example above – and we can then calculate the amplitude to go from one state to another as:

〈 χ | φ 〉 = ∑ Di*·Ci

In case you’d get confused, just take the original equation:

〈 χ | φ 〉 = ∑ 〈 χ | i 〉〈 i | φ 〉 (over all i)

The two equations are fully equivalent.

[…]

So we just went through all of the shit above so as to show that structural similarity with vector spaces?

Yes. It’s important. You just need to remember that we may have two, three, four, five,… or even an infinite number of base states depending on the situation we’re looking at, and what we’re trying to measure. I am sorry I had to take you through all of this. However, there’s more to come, and so you need this baggage. We’ll take the next step now, and that is to introduce the concept of an operator.

Look at the middle term in that expression above—let me copy it:

〈 χ | φ 〉 = ∑j ∑i Dj*·〈 j | i 〉·Ci

We’ve got three factors in each term of that double sum (a double sum is a sum involving two indices, which is what we have here: i and j). When we have two indices like that, one thinks of matrices. That’s easy to do here, because we represented that 〈 i | j 〉 = δij equation as a matrix too! To be precise, we presented it as the identity matrix, and a simple substitution allows us to re-write our equation above as:

〈 χ | φ 〉 = [ D+* D0* D−* ] · [ 1 0 0 ; 0 1 0 ; 0 0 1 ] · [ C+ ; C0 ; C− ]

I must assume you’re shaking your head in disbelief now: we’ve expanded a simple amplitude into a product of three matrices now. Couldn’t we just stick to that sum, i.e. that vector dot product ∑ Di*Ci? What’s next? Well… I am afraid there’s a lot more to come. :-/ For starters, we’ll take that idea of ‘putting something in the middle’ to the next level by going back to our Stern-Gerlach filters and whatever other apparatus we can think of. Let’s assume that, instead of some filter S or T, we’ve got something more complex now, which we’ll denote by A. [Don’t confuse it with our vectors: we’re talking an apparatus now, so you should imagine some beam of particles, polarized or not, entering it, going through, and coming out.]

We’ll stick to the symbols we used already, and so we’ll just assume a particle enters into the apparatus in some state φ, and that it comes out in some state χ. Continuing the example of spin-one particles, and assuming our beam has not been filtered – so, using lingo, we’d say it’s unpolarized – we’d say there’s a probability of 1/3 for being either in the ‘plus’, ‘zero’, or ‘minus’ state with respect to whatever representation we’d happen to be working with, and the related amplitudes would be 1/√3. In other words, we’d say that φ is defined by C+ = 〈 +S | φ 〉, C0 = 〈 0S | φ 〉, and C− = 〈 −S | φ 〉, with C+ = C0 = C− = 1/√3. In fact, using that | φ 〉 = |+S〉 C+ + |0S〉 C0 + |−S〉 C− expression we invented above, we’d write: | φ 〉 = (1/√3)·|+S〉 + (1/√3)·|0S〉 + (1/√3)·|−S〉 or, using ‘matrices’—just a row and a column, really:

[Matrix: the state φ written as the row of base states |+S〉, |0S〉 and |−S〉 times the column of its three amplitudes, each equal to 1/√3]

However, you don’t need to worry about that now. The new big thing is the following expression:

〈 χ | A | φ〉

It looks simple enough: φ to A to χ. Right? Well… Yes and no. The question is: what do you do with this? How would we take its complex conjugate, for example? And if we know how to do that, would it be equal to 〈 φ | A | χ〉?

You guessed it: we’ll have to take it apart, but how? We’ll do this using another fantastic abstraction. Remember how we took Dirac’s 〈 χ | φ 〉 bra-ket apart by writing | φ 〉 = ∑ | i 〉〈 i | φ 〉? We just dropped the 〈 χ | left and right in our 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 expression. We can go one step further now, and drop the | φ 〉 left and right in our | φ 〉 = ∑ | i 〉〈 i | φ 〉 expression. We get the following wonderful thing:

| = ∑ | i 〉〈 i | over all base states i

With characteristic humor, Feynman calls this ‘The Great Law of Quantum Mechanics’ and, frankly, there’s actually more than one grain of truth in this. 🙂

Now, if we apply this ‘Great Law’ to our 〈 χ | A | φ〉 expression – we should apply it twice, actually – we get:

〈 χ | A | φ 〉 = ∑i ∑j 〈 χ | i 〉〈 i | A | j 〉〈 j | φ 〉

As Feynman points out, it’s easy to add another apparatus in series. We just write:

〈 χ | B·A | φ 〉 = ∑i ∑j ∑k 〈 χ | i 〉〈 i | B | j 〉〈 j | A | k 〉〈 k | φ 〉

Just put a | bar between B and A and apply the same trick. The | bar is really like a factor 1 in multiplication. However, that’s all great fun but it doesn’t solve our problem. Our ‘Great Law’ allows us to sort of ‘resolve’ our apparatus A in terms of base states, as we now have 〈 i | A | j 〉 in the middle, rather than 〈 χ | A | φ〉 but, again, how do we work with that?

Well… The answer will surprise you. Rather than trying to break this thing up, we’ll say that the apparatus A is actually being described, or defined, by the nine 〈 i | A | j 〉 amplitudes. [There are nine for this example, but four only for the example involving spin-1/2 particles, of course.] We’ll call those amplitudes, quite simply, the matrix of amplitudes, and we’ll often denote it by Aij.
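In matrix language, then, an apparatus is just a complex n×n matrix of amplitudes, sandwiched between a row and a column. A minimal numpy sketch (the matrices here are random made-up numbers, not any real Stern-Gerlach filter):

    import numpy as np

    rng = np.random.default_rng(7)
    A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))  # the < i | A | j > amplitudes
    B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))  # a second apparatus

    C = np.full(3, 1 / np.sqrt(3), dtype=complex)  # unpolarized beam: C+ = C0 = C- = 1/sqrt(3)
    D = np.array([1, 0, 0], dtype=complex)         # chi = the 'plus' base state, say

    amp_A = np.conj(D) @ A @ C       # < chi | A | phi > = sum over i, j of Di*·Aij·Cj
    amp_BA = np.conj(D) @ B @ A @ C  # two apparatuses in series: < chi | B·A | phi >
    print(amp_A, amp_BA)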

Now, I wanted to talk about operators here. The idea of an operator comes up when we’re creative again, and when we drop the 〈 χ | state from the 〈 χ | A | φ〉 expression. We write:

| ψ 〉 = A | φ 〉

So now we think of the particle entering the ‘apparatus’ A in the state φ and coming out of A in some state ψ (‘psi’). We can generalize this and think of A as an ‘operator’, which Feynman intuitively defines as follows:

“The symbol A is neither an amplitude, nor a vector; it is a new kind of thing called an operator. It is something which ‘operates on’ a state to produce a new state.”

But… Wait a minute! | ψ 〉 is not the same as 〈 χ |. Why can we do that substitution? We can only do it because any state ψ and χ are related through that other ‘Law’ of quantum math:

〈 χ | ψ 〉 = ∑ 〈 χ | i 〉〈 i | ψ 〉

Combining the two shows our ‘definition’ of an operator is OK. We should just note that it’s an ‘open’ equation until it is completed with a ‘bra’, i.e. a state like 〈 χ |, so as to give the 〈 χ | ψ〉 = 〈 χ | A | φ〉 type of amplitude that actually means something. In practical terms, that means our operator or our apparatus doesn’t mean much as long as we don’t measure what comes out, so then we choose some set of base states, i.e. a representation, which allows us to describe the final state, i.e. 〈 χ |.

[…]

Well… Folks, that’s it. I know this was mighty abstract, but the next posts should bring things back to earth again. I realize it’s only by working examples and doing exercises that one can get some kind of ‘feel’ for this kind of stuff, so that’s what we’ll have to go through now. 🙂


Taking the magic out of God’s number: some additional reflections

Note: I have published a paper that is very coherent and fully explains this so-called God-given number. There is nothing magical about it. It is just a scaling constant. Check it out: The Meaning of the Fine-Structure Constant. No ambiguity. No hocus-pocus.

Jean Louis Van Belle, 23 December 2018

Original post:

In my previous post, I explained why the fine-structure constant α is not a ‘magical’ number, even if it relates all fundamental properties of the electron: its mass, its energy, its charge, its radius, its photon scattering cross-section (i.e. the Bohr radius, or the size of the atom really) and, finally, the coupling constant for photon-electron interactions. The key to such understanding of α was the model of an electron as a tiny ball of charge. As such, we have two energy formulas for it. One is the energy that’s needed to assemble the charge from infinitely dispersed infinitesimal charges, which we denoted as Uelec. The other formula is the energy of the field of the tiny ball of charge, which we denoted as Eelec.

The formula for Eelec is calculated using the formula for the field momentum of a moving charge and, using the m = E/c² mass-energy equivalence relationship, is equivalent to the electromagnetic mass. We went through the derivation in our previous post, so let me just jot down the result:

Eelec = (2/3)·(e²/a)

The second formula depends on what ball of charge we’re thinking of, because the formulas for a charged sphere and a spherical shell of charge are different: both have the same structure as the relationship above (so the energy is also proportional to the square of the electron charge and inversely proportional to the radius a), but the constant of proportionality is different. For a sphere of charge, we write:

Uelec = (3/5)·(e²/a)

For a spherical shell of charge we write:

Uelec = (1/2)·(e²/a)

To compare the formulas, you need to note that the square of the electron charge in the formula for the field energy is equal to e² = qe²/4πε0 = ke·qe². So we multiply the square of the actual electron charge by the Coulomb constant ke = 1/4πε0. As you can see, the three formulas have exactly the same form then. It’s just the proportionality constant that’s different: it’s 2/3, 3/5 and 1/2 respectively. It’s interesting to quickly reflect on the dimensions here: ke ≈ 9×10⁹ N·m²/C², so e² is expressed in N·m². That makes the units come out alright, as we divide by a (so that’s in meter) and so we get the energy in joule (which is newton·meter). In fact, now that we’re here, let’s quickly calculate the value of e²: it’s that ke·qe² product, so it’s equal to 2.3×10⁻²⁸ N·m². We can quickly check this value because we know that the classical electron radius is equal to:

r0 = e2/(me·c2)

So we divide 2.3×10−28 N·m2 by me·c2 ≈ 8.2×10−14 J, and we get r0 ≈ 2.82×10−15 m. So we’re spot on! Why did I do this check? Not really to check what I wrote. It’s more to show what’s going on. We’ve got yet another formula relating the energy and the radius of an electron here, so now we have three. In fact, we have more, because the formula for Uelec depends on the finer details of our model for the electron (sphere versus shell, uniform versus non-uniform distribution); I’ll add a quick numerical check right after the list:

  1. Eelec = (2/3)·(e2/a): This is the formula for the energy of the field, so we may call it the electron’s external energy.
  2. Uelec = (3/5)·(e2/a), or Uelec = (1/2)·(e2/a): This is the energy needed to assemble our electron, so we might, perhaps, call it its internal energy. The first formula assumes our electron is a uniformly charged sphere. The second assumes all charges sit on the surface of the sphere. If we drop the assumption of the charge having to be uniformly distributed, we’ll find yet another formula.
  3. me·c2 = e2/r0: This is the energy associated with the so-called classical electron radius (r0) and the electron’s rest mass (me).
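That third relation is easy to verify numerically. Here’s a minimal sketch in Python – just the arithmetic we did above, with rounded SI constants:

```python
ke = 8.9875e9       # Coulomb constant ke = 1/4πε0 (N·m2/C2)
qe = 1.60218e-19    # elementary charge (C)
me = 9.109e-31      # electron rest mass (kg)
c  = 2.998e8        # speed of light (m/s)

e2 = ke * qe**2     # the 'e squared' of the text, in N·m2
print(e2)           # ≈ 2.3×10−28 N·m2

r0 = e2 / (me * c**2)   # classical electron radius r0 = e2/(me·c2)
print(r0)               # ≈ 2.82×10−15 m, i.e. spot on indeed
```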

In our previous posts, we assumed the last equation was the right one. Why? Because it’s the one that’s been verified experimentally. The discrepancies between the various proportionality coefficients – i.e. the difference between 2/3 and 1, basically – are to be explained because of the binding forces within the electron, without which the electron would just ‘explode’, as the French physicist and polymath Henri Poincaré famously put it. Indeed, if the electron is a little ball of negative charge, the repulsive forces between its parts should rip it apart. So we will not say anything more about this. You can have fun yourself by googling all the various theories that try to model these binding forces. [I may do the same some day, but now I’ve got other priorities: I want to move to Feynman’s third volume of Lectures, which is devoted to quantum physics only, so I look very much forward to that.]

In this post, I just wanted to reflect once more on which constants are really fundamental and which constants are somewhat less fundamental. From all that I wrote in my previous post, I concluded there were three:

  1. The fine-structure constant α, which is a dimensionless number.
  2. Planck’s constant h, whose dimension is joule·second, so that’s the dimension of action.
  3. The speed of light c, whose dimension is that of a velocity.

The three are related through the following expression:

α = e2/(ħ·c)

This is an interesting expression. Let’s first check its dimension. We already explained that e2 is expressed in N·m2. That’s rather strange, because it means the dimension of e itself is N1/2·m: what’s the square root of a force of one newton? In fact, to interpret the formula above, it’s probably better to re-write e2 as e2 = qe2/4πε0 = ke·qe2. That shows you how the electron charge and Coulomb’s constant are related. Of course, they are part and parcel of one and the same force law: Coulomb’s law. We don’t need anything else, except for relativity theory, because we need to explain the magnetic force as well—and that we can do because magnetism is just a relativistic effect. Think of the field momentum indeed: the magnetic field comes into play only when we start to move our electron. The relativity effect is captured by c in that formula for α above. As for ħ, ħ = h/2π comes with the E = h·f equation, which links us to the electron’s Compton wavelength λ through the de Broglie relation λ = h/p.

The point is: we should probably not look at α as a ‘fundamental physical constant’. It’s e2 that’s the third fundamental constant, besides h and c. Indeed, it’s from e2 that all the rest follows: the electron’s internal energy, its external energy, and its radius, and then all the rest by combining stuff with other stuff.

Now, we took the magic out of α by doing what we did in the previous posts, and that’s to combine stuff with other stuff, and so now you may think I am putting the magic back in with that formula for α, which seems to define α in terms of the three mentioned ‘fundamental’ constants. That’s not the case: this relation comes out of all of the other relationships we found, and so it’s nothing new really. It’s actually not a definition of α: it just does what it does, and that’s to relate α to the ‘fundamental’ physical constants behind it.

So… No new magic. In fact, I want to close this post by taking away even more of the magic. If you read my previous post, I said that α was ‘God’s cut-off factor’ 🙂 ensuring our energy functions do not blow up, but I also said it was impossible to say why he chose 0.00729735256 as the cut-off factor. The question is actually easily answered by thinking about those two formulas we had for the internal and external energy respectively. Let’s re-write them in natural units and, temporarily, use two different subscripts for α, so we write:

  1. Eelec = αe/r0: This is the formula for the energy of the field.
  2. Uelec = αu/r0: This is the energy needed to assemble our electron.

Both energies are determined by the above-mentioned laws, i.e. Coulomb’s Law and the theory of relativity, so α has got nothing to do with that. However, both energies have to be the same, and so αe has to be equal to αu. In that sense, α is, quite simply, a proportionality constant that achieves that equality. Now that explains why we can derive α from the three other constants which, as mentioned above, are probably more fundamental. In fact, we’ve got only three degrees of freedom here, so if we choose c, h and e2 as ‘fundamental’, then α isn’t any more.

The underlying deep question behind it all is why those two energies should be equal. Why would our electron have some internal energy if it’s elementary? The answer to that question is: because it has some non-zero radius, and it has some non-zero radius because we don’t want our formula for the field energy (or the field momentum) to blow up. Now, if it has some radius, then it has to have some internal energy.

You’ll say: that makes sense, but it doesn’t answer the question. Why would it have internal energy, with or without a zero radius? If an electron is an elementary particle, then it’s really elementary, isn’t it? And so then we shouldn’t try to ‘assemble’ it from an infinite number of infinitesimally small charges. You’re right, and here we can also note that the fact that the electron doesn’t blow up is firm evidence it’s very elementary indeed.

I should also note that Feynman actually doesn’t talk about the energy that’s needed to assemble a charge: he gets his Uelec = (1/2)·(e2/a) by calculating the external field energy for a spherical shell of charge, and he sticks to it—presumably because it’s the same field for a uniform or non-uniform sphere of charge. He only notes there has to be some radius because, if not, the formula he uses blows up, indeed. So – who knows? – perhaps he doesn’t quite believe that formula for the internal energy is relevant either.

So perhaps there is no internal energy indeed. Perhaps there’s just the energy of the field. So… Well… I can’t say much about this… Except… Well… Perhaps just one more thing. Let me note something that, I hope, you noticed as well: the ke·qe2 product is the numerator in Coulomb’s Law itself. You also know that energy equals force times distance. So if we divide both sides by r0, we get Coulomb’s Law itself: Felec = ke·qe2/r02. The only thing is: what’s the distance? It’s one charge only, and there is no distance between one charge, is there? Well… Yes and no. I have been thinking that the requirement of the internal and external energies being equal resembles the statement that the forces between two charges are equal and opposite. That ties in with the idea of the internal energy itself: remember we were basically talking forces between infinitesimally small elements of charge within the electron itself? So r0 is, perhaps, some average distance or so. There must be some way of thinking of it like that. But… Well… Which one exactly?

This kind of reflection may not make sense. Who knows? I obviously need to think all of this through and so this post is, indeed, just a bunch of reflections for which I will have more time later—hopefully. 🙂 Perhaps we’re all just pushing the matter too far. Perhaps we should just accept that the external energy has that 2/3 factor but that the actual energy of the electron should also include the equivalent energy of some binding force that holds the electron together. Well… In any case. That’s all I am going to do on this extremely complicated matter. It’s time to move indeed! So the point to take home here is probably just this:

  1. When calculating the radius of an electron using classical theory, we get in trouble: not only do we find different radii, but the radii that we find do not respect the E = me·c2 law. It’s only the me·c2 = e2/r0 relation that’s relativistically correct.
  2. That suggests the electron also has some non-electromagnetic mass or energy, usually attributed to ‘binding forces’ or ‘Poincaré stresses’, which remain to be explained convincingly.
  3. All of this shouldn’t surprise us: for all we know, the electron is something fuzzy. 🙂

So my next posts will focus on the ‘essentials’ preparing for Feynman’s Volume on quantum mechanics. Those ‘essentials’ will still involve some classical stuff but, as you will see, even more contradictions, that – hopefully! – will then be solved in the quantum-mechanical picture of it all. 🙂


Taking the magic out of God’s number

Note: I have published a paper that is very coherent and fully explains this so-called God-given number. There is nothing magical about it. It is just a scaling constant. Check it out: The Meaning of the Fine-Structure Constant. No ambiguity. No hocus-pocus.

Jean Louis Van Belle, 23 December 2018

Original post:

I think the post scriptum to my previous post is interesting enough to separate it out as a piece of its own, so let me do that here. You’ll remember that we were trying to find some kind of a model for the electron, picturing it like a tiny little ball of charge, and then we just applied the classical energy formulas to it to see what comes out of it. The key formula is the integral that gives us the energy that goes into assembling a charge. It was the following thing:

U = (1/2)∫∫ ρ(1)·ρ(2)/(4πε0·r12)·dV1·dV2

This is a double integral which we simplified in two stages, so we’re looking at an integral within an integral really, but we can substitute the integral over the ρ(2)·dV2 product by the formula we got for the potential, so we write that as Φ(1), and so the integral above becomes:

U = (1/2)∫ ρ(1)·Φ(1)·dV1

Now, this integral integrates the ρ(1)·Φ(1)·dV1 product over all of space, so that’s over all points in space, and so we just dropped the index and wrote the whole thing as the integral of ρ·Φ·dV over all of space:

U = (1/2)∫ ρ·Φ·dV

We then established that this integral was mathematically equivalent to the following equation:

U = (ε0/2)∫ E·E·dV

So this integral is actually quite simple: it just integrates E·E = E2 over all of space. The illustration below shows E as a function of the distance for a sphere of radius R filled uniformly with charge.

[Illustration: the field E as a function of the distance r for a sphere of radius R filled uniformly with charge: E rises linearly up to r = R and then falls off as 1/r2.]

So the field (E) goes as r for r ≤ R and as 1/r2 for r ≥ R. So, for r ≥ R, the integral will have (1/r2)2 = 1/r4 in it. Now, you know that the integral of some function is the surface under the graph of that function. Look at the 1/r4 function below: it blows up as r goes from 1 to 0. That’s where the problem is: there needs to be some kind of cut-off, because that integral will effectively blow up when the radius of our little sphere of charge gets ‘too small’. So that makes it clear why it doesn’t make sense to use this formula to try to calculate the energy of a point charge. It just doesn’t make sense to do that.

[Graph: the 1/r4 function, which blows up as r goes to zero.]

In fact, the need for a ‘cut-off factor’ so as to ensure our energy function doesn’t ‘blow up’ is not because of the exponent in the 1/r4 expression: the need is also there for any 1/rn relation, as illustrated below. All 1/rn functions have the same pivot point, as you can see from the simple illustration below. So, yes, we cannot go all the way to zero from there when integrating: we have to stop somewhere.

[Graph: a family of 1/rn functions, all passing through the same (1, 1) pivot point.]

So what’s the ‘cut-off point’? What’s ‘too small’ a radius? Let’s look at the formula we got for our electron as a shell of charge (so the assumption here is that the charge is uniformly distributed on the surface of a sphere with radius a):

Uelec = (1/2)·(e2/a)

So we’ve got an even simpler formula here: it’s just a 1/r relation (with a playing the role of r in this formula), not 1/r4. Why is that? Well… It’s just the way the math turns out: we’re integrating over volumes, and so that involves an r3 factor, and so it all simplifies to 1/r, and so that gives us this simple inversely proportional relationship between U and r, i.e. a, in this case. 🙂 I copied the detail of Feynman’s calculation in my previous post, so you can double-check it. It’s quite wonderful, really. Look at it again: we have a very simple inversely proportional relationship between the radius of our electron and its energy as a sphere of charge. We could write it as:

Uelec = α/a, with α = e2/2

Still… We need the ‘cut-off point’. Also note that, as I pointed out, we don’t necessarily need to assume that the charge in our little ball of charge (i.e. our electron) sits on the surface only: if we’d assume it’s a uniformly charged sphere of charge, we’d just get another constant of proportionality: our 1/2 factor would become a 3/5 factor, so we’d write: Uelec = (3/5)·e2/a. But we’re not interested in finding the right model here. We know the Uelec = (3/5)·e2/a formula gives us a value for a that differs from the classical electron radius by a 2/5 factor. That’s not so bad and so let’s go along with it. 🙂
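To get a feel for those proportionality constants, here’s a minimal sketch that plugs the classical electron radius into the shell and sphere formulas. It’s an illustration of the 1/2 and 3/5 factors only, not a claim about what the electron really is:

```python
e2    = 2.307e-28   # ke·qe2 (N·m2), as calculated before
a     = 2.818e-15   # classical electron radius (m)
me_c2 = 8.187e-14   # electron rest energy me·c2 (J)

U_shell  = (1/2) * e2 / a   # all charge sitting on the surface
U_sphere = (3/5) * e2 / a   # uniformly charged sphere

print(U_shell / me_c2)      # ≈ 0.5
print(U_sphere / me_c2)     # ≈ 0.6, i.e. the 2/5 difference mentioned above
```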

We’re going to look at the simple structure of this relation, and all of its implications. The simple equation above says that the energy of our electron is (a) proportional to the square of its charge and (b) inversely proportional to its radius. Now, that is a very remarkable result. In fact, we’ve seen something like this before, and we were astonished. We saw it when we were discussing the wonderful properties of that magical number, the fine-structure constant, which we also denoted by α. However, because we used α already, I’ll denote the fine-structure constant as αe here, so you don’t get confused. You’ll remember that the fine-structure constant is a God-like number indeed: it links all of the fundamental properties of the electron, i.e. its charge, its radius, its distance to the nucleus (i.e. the Bohr radius), its velocity, its mass (and, hence, its energy), its de Broglie wavelength. Whatever: all these physical constants are related through the fine-structure constant. 

In my various posts on this topic, I’ve repeatedly said that, but I never showed why it’s true, and so it was a very magical number indeed. I am going to take some of the magic out now. Not too much but… Well… You can judge for yourself how much of the magic remains after I am done here. 🙂

So, at this stage of the argument, α can be anything, and αe cannot, of course. It’s just that magical number out there, which relates everything to everything: it’s the God-given number we don’t understand, or didn’t understand, I should say. Past tense. Indeed, we’re going to get some understanding here because we know that one of the many expressions involving αe was the following one:

me = αe/re

This says that the mass of the electron is equal to the ratio of the fine-structure constant and the electron radius. [Note that we express everything in natural units here, so that’s Planck units. For the detail of the conversion, please see the relevant section in one of my posts on this and other stuff.] In fact, the U = (3/5)·e2/a and me = αe/re relations look exactly the same, because one of the other equations involving the fine-structure constant was: αe = eP2. So we’ve got the square of the charge here as well! Indeed, as I’ll explain in a moment, the difference between the two formulas is just a matter of units.

Now, mass is equivalent to energy, of course: it’s just a matter of units, so we can equate me with Ee (this amounts to expressing the energy of the electron in a kg unit—bit weird, but OK) and so we get:

Ee = αe/re

So there we have it: the fine-structure constant αe is Nature’s ‘cut-off’ factor, so to speak. Why? Only God knows. 🙂 But it’s now (fairly) easy to see why all the relations involving αe are what they are. As I mentioned already, we also know that αe is the square of the electron charge expressed in Planck units, so we have:

 αe = eP2 and, therefore, Ee = eP2/re

Now, you can check for yourself: it’s just a matter of re-expressing everything in standard SI units, and relating eP2 to e2, and it should all work: you should get the Eelec = (2/3)·e2/a expression. So… Well… At least this takes some of the magic out of the fine-structure constant. It’s still a wonderful thing, but so you see that the fundamental relationship between (a) the energy (and, hence, the mass), (b) the radius and (c) the charge of an electron is not something God-given. What’s God-given are Maxwell’s equations, and so the Ee = αe/re = eP2/re is just one of the many wonderful things that you can get out of them.

So we found God’s ‘cut-off factor’ 🙂 It’s equal to αe ≈ 0.0073 = 7.3×10−3. So 7.3 thousandths of… What? Well… Nothing. It’s just a pure ratio between the energy and the radius of an electron (if both are expressed in Planck units, of course). And so it determines the electron charge (again, expressed in Planck units). Indeed, we write:

eP = √αe

Really? Yes. Just work out the numbers:

qe = eP·qP = √αe·qP ≈ √0.0073 × (1.9×10−18 C) ≈ 1.6×10−19 C, with qP ≈ 1.9×10−18 coulomb the Planck charge

Just re-check it with all the known decimals: you’ll see it’s bang on. Let’s look at the Ee = me = αe/re ratio once again. What’s the meaning of it? Let’s first calculate the value of re and me, i.e. the electron radius and electron mass expressed in Planck units. It’s equal to the classical electron radius divided by the Planck length, and then the same for the mass, so we get the following thing:

re ≈ (2.81794×10−15 m)/(1.6162×10−35 m) = 1.7435×1020 

me ≈ (9.1×10−31 kg)/(2.17651×10−8 kg) = 4.18×10−23

αe = (4.18×10−23)·(1.7435×1020) ≈ 0.0073
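You can redo that little exercise in a few lines of Python. The value for the Planck charge qP is not given in the text above, so take it as an assumed (standard) value:

```python
import math

lP = 1.6162e-35     # Planck length (m)
mP = 2.17651e-8     # Planck mass (kg)
qP = 1.8755e-18     # Planck charge (C) - assumed standard value

re = 2.81794e-15 / lP   # classical electron radius in Planck units
me = 9.1e-31 / mP       # electron mass in Planck units
print(re * me)          # ≈ 0.0073 = αe, like a charm indeed

print(math.sqrt(re * me) * qP)  # ≈ 1.6×10−19 C: the electron charge
```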

It works like a charm, but what does it mean? Well… It’s just a ratio between two physical quantities, and the scale you use to measure those quantities matters very much. We’ve explained that the Planck mass is a rather large unit at the atomic scale and, therefore, it’s perhaps not quite appropriate to use it here. In fact, out of the many interesting expressions for αe, I should highlight the following one:

αe = e2/(ħ·c) ≈ (1.60217662×10−19 C)2/(4πε0·[(1.054572×10−34 N·m·s)·(2.998×108 m/s)]) ≈ 0.0073 once more 🙂

Note that e2 is actually equal to qe2/4πε0, which is what I am using in the formula. I know that’s confusing, but it is what it is. As for the units, it’s a bit tedious to write it all out, but you’ll get there. Note that ε0 ≈ 8.8542×10−12 C2/(N·m2) so… Well… All the units do cancel out, and we get a dimensionless number indeed, which is what αe is.
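If you don’t feel like pushing the units through by hand, this sketch does the number-crunching:

```python
import math

qe   = 1.60217662e-19   # elementary charge (C)
eps0 = 8.8542e-12       # electric constant (C2/(N·m2))
hbar = 1.054572e-34     # reduced Planck constant (J·s)
c    = 2.998e8          # speed of light (m/s)

alpha = qe**2 / (4 * math.pi * eps0 * hbar * c)
print(alpha)            # ≈ 0.0073, and dimensionless indeed
```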

The point is: this expression links αe to the de Broglie relation (p = h/λ), with λ the wavelength that’s associated with the electron. Of course, because of the Uncertainty Principle, we know we’re talking some wavelength range really, so we should write the de Broglie relation as Δp = h·Δ(1/λ). Now, that, in turn, allows us to try to work out the Bohr radius, which is the other ‘dimension’ we associate with an electron. Of course, now you’ll say: why would you do that? Why would you bring in the de Broglie relation here?

Well… We’re talking energy, and so we have the Planck-Einstein relation first: the energy of some particle can always be written as the product of h and some frequency f: E = h·f. The only thing that de Broglie relation adds is the Uncertainty Principle indeed: the frequency will be some frequency range, associated with some momentum range, and so that’s what the Uncertainty Principle really says. I can’t dwell too much on that here, because otherwise this post would become a book. 🙂 For more detail, you can check out one of my many posts on the Uncertainty Principle. In fact, the one I am referring to here has Feynman’s calculation of the Bohr radius, so I warmly recommend you check it out. The thrust of the argument is as follows:

  1. If we assume that (a) an electron takes some space – which I’ll denote by r 🙂 – and (b) that it has some momentum p because of its mass m and its velocity v, then the ΔxΔp = ħ relation (i.e. the Uncertainty Principle in its roughest form) suggests that the order of magnitude of r and p should be related in the very same way. Hence, let’s just boldly write r ≈ ħ/p and see what we can do with that.
  2. We know that the kinetic energy of our electron equals mv2/2, which we can write as p2/2m so we get rid of the velocity factor. Well… Substituting our p ≈ ħ/r conjecture, we get K.E. = ħ2/2mr2. So that’s a formula for the kinetic energy. Next is potential.
  3. The formula for the potential energy is U = q1q2/4πε0r12. Now, we’re actually talking about the size of an atom here, so one charge is the proton (+e) and the other is the electron (–e), so the potential energy is U = P.E. = –e2/4πε0r, with r the ‘distance’ between the proton and the electron—so that’s the Bohr radius we’re looking for!
  4. We can now write the total energy (which I’ll denote by E, but don’t confuse it with the electric field vector!) as E = K.E. + P.E. = ħ2/2mr2 – e2/4πε0r. Now, the electron (whatever it is) is, obviously, in some kind of equilibrium state. Why is that obvious? Well… Otherwise our hydrogen atom wouldn’t or couldn’t exist. 🙂 Hence, it’s in some kind of energy ‘well’ indeed, at the bottom. Such an equilibrium point ‘at the bottom’ is characterized by its derivative (in respect to whatever variable) being equal to zero. Now, the only ‘variable’ here is r (all the other symbols are physical constants), so we have to solve for dE/dr = 0. Writing it all out yields: dE/dr = –ħ2/mr3 + e2/4πε0r2 = 0 ⇔ r = 4πε0ħ2/me2
  5. We can now put the values in: r = 4πε0ħ2/me2 = [(1/(9×109) C2/N·m2)·(1.055×10–34 J·s)2]/[(9.1×10–31 kg)·(1.6×10–19 C)2] = 53×10–12 m = 53 pico-meter (pm)

Done. We’re right on the spot. The Bohr radius is, effectively, about 53 trillionths of a meter indeed!
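The five steps above compress into one line of arithmetic, which this sketch reproduces:

```python
import math

eps0 = 8.854e-12    # electric constant (C2/(N·m2))
hbar = 1.055e-34    # reduced Planck constant (J·s)
me   = 9.1e-31      # electron mass (kg)
qe   = 1.6e-19      # elementary charge (C)

r = 4 * math.pi * eps0 * hbar**2 / (me * qe**2)   # r = 4πε0·ħ2/(me·qe2)
print(r)            # ≈ 5.3×10−11 m, i.e. the 53 pm Bohr radius
```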

Phew!

Yes… I know… Relax. We’re almost done. You should now be able to figure out why the classical electron radius and the Bohr radius can also be related to each other through the fine-structure constant. We write:

me = α/re = α/(α2·r) = 1/(α·r)

So we get that α/re = 1/(α·r) and, therefore, we get re/r = α2, which explains why α is also equal to the so-called junction number, or the coupling constant, for an electron-photon coupling (see my post on the quantum-mechanical aspects of the photon-electron interaction). It gives a physical meaning to the probability (which, as you know, is the absolute square of the probability amplitude) in terms of the chance of a photon actually ‘hitting’ the electron as it goes through the atom. Indeed, the ratio of the Thomson scattering cross-section and the Bohr size of the atom should be of the same order as re/r, and so that’s α2.
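The orders of magnitude are, once again, easy to check:

```python
re    = 2.818e-15   # classical electron radius (m)
r     = 5.29e-11    # Bohr radius (m)
alpha = 0.0072974   # fine-structure constant

print(re / r)       # ≈ 5.3×10−5
print(alpha**2)     # ≈ 5.3×10−5, so re/r = α2 indeed
```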

[Note: To be fully correct and complete, I should add that the coupling constant itself is not α2 but √α = eP. Why do we have this square root? You’re right: the fact that the probability is the absolute square of the amplitude explains one square root (√α2 = α), but not two. The thing is: the photon-electron interaction consists of two things. First, the electron sort of ‘absorbs’ the photon, and then it emits another one, that has the same or a different frequency depending on whether the ‘collision’ was elastic or not. So if we denote the coupling constant as j, then the whole interaction will have a probability amplitude equal to j2. In fact, the value which Feynman uses in his wonderful popular presentation of quantum mechanics (The Strange Theory of Light and Matter) is j ≈ −√α ≈ −0.085. I am not quite sure why the minus sign is there. It must be something with the angles involved (the emitted photon will not be following the trajectory of the incoming photon) or, else, with the special arithmetic involved in boson-fermion interactions (we add amplitudes when bosons are involved, but subtract amplitudes when it’s fermions interacting). I’ll probably find out once I am through Feynman’s third volume of Lectures, which focuses on quantum mechanics only.]

Finally, the last bit of unexplained ‘magic’ in the fine-structure constant is that the fine-structure constant (which I’ve started to write as α again, instead of αe) also gives us the (classical) relative speed of an electron, so that’s its speed as it orbits around the nucleus (according to the classical theory, that is), so we write

α = v/c = β

I should go through the motions here – I’ll probably do so in the coming days – but you can see we must be able to get it out somehow from all that we wrote above. See how powerful our Uelec ∼ e2/a relation really is? It links the electron’s charge, its radius and its energy, and it’s all we need to get all the rest out of it: its mass, its momentum, its speed and – through the Uncertainty Principle – the Bohr radius, which is the size of the atom.

We’ve come a long way. This is truly a milestone. We’ve taken the magic out of God’s number—to some extent at least. 🙂

You’ll have one last question, of course: if proportionality constants are all about the scale in which we measure the physical quantities on either side of an equation, is there some way the fine-structure constant would come out differently? That’s the same as asking: what if we’d measure energy in units that are equivalent to the energy of an electron, and the radius of our electron just as… Well… What if we’d equate our unit of distance with the radius of the electron, so we’d write re = 1? What would happen to α? Well… I’ll let you figure that one out yourself. I am tired and so I should go to bed now. 🙂

[…] OK. OK. Let me tell you. It’s not that simple here. All those relationships involving α, in one form or the other, are very deep. They relate a lot of stuff to a lot of stuff, and we can appreciate that only when doing a dimensional analysis. A dimensional analysis of the Ee = αe/re = eP2/re relation yields [eP2/re] = C2/m on the right-hand side and [Ee] = J = N·m on the left-hand side. How can we reconcile both? The coulomb is an SI base unit, so we can’t ‘translate’ it into something with N and m. [To be fully correct, for some reason, the ampère (i.e. coulomb per second) was chosen as an SI base unit, but they’re interchangeable in regard to their place in the international system of units: they can’t be reduced.] So we’ve got a problem. Yes. That’s where we sort of ‘smuggled’ the 4πε0 factor in when doing our calculations above. That ε0 constant is, obviously, not ‘as fundamental’ as c or α (just think of the c−2 = ε0μ0 relationship to understand what I mean here) but, still, it was necessary to make the dimensions come out alright: we need the reciprocal dimension of ε0, i.e. (N·m2)/C2, to make the dimensional analysis work. We get: (C2/m)·(N·m2)/C2 = N·m = J, i.e. joule, so that’s the unit in which we measure energy or – using the E = mc2 equivalence – mass, which is the aspect of energy emphasizing its inertia.

So the answer is: no. Changing units won’t change alpha. So all that’s left is to play with it now. Let’s try to do that. Let me first plot that Ee = me = αe/re = 0.00729735256/re:

[Graph: Ee = αe/re plotted against re, together with the diagonal.]

Unsurprisingly, we find the pivot point of this curve is at the intersection of the diagonal and the curve itself, so that’s at the (0.00729735256, 0.00729735256) point, where slopes are ± 1, i.e. plus or minus unity. What does this show? Nothing much. What? I can hear you: I should be excited because… Well… Yes! Think of it. If you would have to choose a cut-off point, you’d choose this one, wouldn’t you? 🙂 Sure, you’re right. How exciting! Let me show you. Look at it! It proves that God thinks in terms of logarithms. He has chosen α such that ln(E) = ln(α/r) = lnα – lnr = 0, so lnα = lnr and, therefore, α = r. 🙂

Huh? Excuse me?

I am sorry. […] Well… I am not, of course… 🙂 I just wanted to illustrate the kind of exercise some people are tempted to do. It’s no use. The fine-structure constant is what it is: it sort of summarizes an awful lot of formulas. It basically shows what Maxwell’s equations imply in terms of the structure of an atom defined as a negative charge orbiting around some positive charge. It shows we can calculate everything as a function of something else, and that’s what the fine-structure constant tells us: it relates everything to everything. However, when everything is said and done, the fine-structure constant shows us two things:

  1. Maxwell’s equations are complete: we can construct a complete model of the electron and the atom, which includes: the electron’s energy and mass, its velocity, its own radius, and the radius of the atom. [I might have forgotten one of the dimensions here, but you’ll add it. :-)]
  2. God doesn’t want our equations to blow up. Our equations are all correct but, in reality, there’s a cut-off factor that ensures we don’t go to the limit with them.

So the fine-structure constant anchors our world, so to speak. In other words: of all the worlds that are possible, we live in this one.

[…] It’s pretty good as far as I am concerned. Isn’t it amazing that our mind is able to just grasp things like that? I know my approach here is pretty intuitive, and with ‘intuitive’, I mean ‘not scientific’ here. 🙂 Frankly, I don’t like the talk about physicists “looking into God’s mind.” I don’t think that’s what they’re trying to do. I think they’re just trying to understand the fundamental unity behind it all. And that’s religion enough for me. 🙂

So… What’s the conclusion? Nothing much. We’ve sort of concluded our description of the classical world… Well… Of its ‘electromagnetic sector’ at least. 🙂 That sector can be summarized in Maxwell’s equations, which describe an infinite world of possible worlds. However, God fixed three constants: c, h and α. So we live in a world that’s defined by this Trinity of fundamental physical constants. Why is it not two, or four?

My gut instinct tells me it’s because we live in three dimensions, and so there are three degrees of freedom really. But what about time? Time is the fourth dimension, isn’t it? Yes. But time is symmetric in the ‘electromagnetic’ sector: we can reverse the arrow of time in our equations and everything still works. The arrow of time involves other theories: statistics (physicists refer to it as ‘statistical mechanics‘) and the ‘weak force’ sector, which I discussed when talking about symmetries in physics. So… Well… We’re not done. God gave us plenty of other stuff to try to understand. 🙂

The classical explanation for the electron’s mass and radius

Feynman’s 28th Lecture in his series on electromagnetism is one of the more interesting ones but, at the same time, it’s one of the few Lectures that is clearly (out)dated. In essence, it talks about the difficulties involved in applying Maxwell’s equations to the elementary charges themselves, i.e. the electron and the proton. We already signaled some of these problems in previous posts. For example, in our post on the energy in electrostatic fields, we showed how our formulas for the field energy and/or the potential of a charge blow up when we use them to calculate the energy we’d need to assemble a point charge. What comes out is infinity: ∞. So our formulas tell us we’d need an infinite amount of energy to assemble a point charge.

Well… That’s no surprise, is it? The idea itself is impossible: how can one have a finite amount of charge in something that’s infinitely small? Something that has no size whatsoever? It’s pretty obvious we get some division by zero there. 🙂 The mathematical approach is often inconsistent. Indeed, a lot of blah-blah in physics is obviously just about applying formulas to situations that are clearly not within the relevant area of application of the formula. So that’s why I went through the trouble (in my previous post, that is) of explaining to you how we get these energy and potential formulas, and that’s by bringing charges (note the plural) together. Now, we may assume these charges are point charges, but that assumption is not so essential. What I tried to say when being so explicit was the following: yes, a charge causes a field, but the idea of a potential makes sense only when we’re thinking of placing some other charge in that field. So point charges with ‘infinite energy’ should not be a problem. Feynman admits as much when he writes:

“If the energy can’t get out, but must stay there forever, is there any real difficulty with an infinite energy? Of course, a quantity that comes out infinite may be annoying, but what really matters is only whether there are any observable physical effects.”

So… Well… Let’s see. There’s another, more interesting, way to look at an electron: let’s have a look at the field it creates. An electron – stationary or moving – will create a field in Maxwell’s world, which we know inside out now. So let’s just calculate it. In fact, Feynman calculates it for the unit charge (+1), so that’s a positron. It eases the analysis because we don’t have to drag any minus sign along. So how does it work? Well…

We’ll have an energy flux density vector – i.e. the Poynting vector S – as well as a momentum density vector g all over space. Both are related through the g = S/c2 equation which, as I explained in my previous post, is probably best written as cg = S/c, because we’ve got units then, on both sides, that we can readily understand, like N/m2 (so that’s force per unit area) or J/m3 (so that’s energy per unit volume). On the other hand, we’ll need something that’s written as a function of the velocity of our positron, so that’s v, and so it’s probably best to just calculate g, the momentum, which is measured in N·s or kg·(m/s2)·s (both are equivalent units for the momentum p = mv, indeed) per unit volume (so we need to add a 1/m3 to the unit). So we’ll have some integral all over space, but I won’t bother you with it. Why not? Well… Feynman uses a rather particular volume element to solve the integral, and so I want you to focus on the solution. The geometry of the situation, and the solution for g, i.e. the momentum of the field per unit volume, is what matters here.

So let’s look at that geometry. It’s depicted below. We’ve got a radial electric field—a Coulomb field really, because our charge is moving at a non-relativistic speed, so v << c and we can approximate with a Coulomb field indeed. Maxwell’s equations imply that B = v×E/c2, so g = ε0E×B is what it is in the illustration below. Note that we’d have to reverse the direction of both E and B for an electron (because it’s negative), but g would be the same. It is directed obliquely toward the line of motion and its magnitude is g = (ε0v/c2)·E2·sinθ. Don’t worry about it: Feynman integrates this thing for you. 🙂 It’s not that difficult, but still… To solve it, he uses the fact that the fields are symmetric about the line of motion, which is indicated by the little arrow around the v-axis, with the Φ symbol next to it (it denotes the azimuthal angle around that axis). [The ‘rather particular volume element’ is a ring around the v-axis, and it’s because of this symmetry that Feynman picks the ring. Feynman’s Lectures are not only great to learn physics: they’re a treasure trove of mathematical tricks too. :-)]

[Illustration: the geometry of the field of a moving charge: the radial field E, the magnetic field B circling the line of motion, and the momentum density g = ε0E×B directed obliquely toward the line of motion.]

As said, I don’t want to bother you with the technicalities of the integral here. This is the result:

p = [2e2/(3ac2)]·v

What does this say? It says that the momentum of the field – i.e. the electromagnetic momentum, integrated over all of space – is proportional to the velocity v of our charge. That makes sense: when v = 0, we’ll have an electrostatic field all over space and, hence, some inertia, but it’s only when we try to move our charge that Newton’s Law comes into play: then we’ll need some force to overcome that inertia. It all works through the Poynting formula: S = ε0c2·E×B. If nothing’s moving, then B = 0, and so we’ll have some E and, therefore, we’ll have field energy alright, but the energy flow will be zero. But when we move the charge, we’re moving the field, and so then B ≠ 0 and so it’s through B that the E in our S equation starts kicking in. Does that make sense? Think about it: it’s good to try to visualize things in your mind. 🙂

The constants in the proportionality constant (2e2)/(3ac2) of our formula for p above are:

  • e2 = qe2/(4πε0), with qe the electron charge (without the minus sign) and ε0 our ubiquitous electric constant. [Note that, unlike Feynman, I prefer to not write e in italics, so as to not confuse it with Euler’s number e ≈ 2.71828 etc. However, I know I am not always consistent in my notation. :-/ We don’t need Euler’s number in this post, so e or e2 is always an expression for the electron charge, not Euler’s number. Stupid remark, perhaps, but I don’t want you to be confused.]
  • a is the radius of our charge—see we got away from the idea of a point charge? 🙂
  • c2 is just c2, i.e. our weird constant (the square of the speed of light) which seems to connect everything to everything. Indeed, think about stuff like this: S/g = c2 = 1/(ε0μ0).

Now, p = mv, so that formula for p basically says that our elementary charge (as mentioned, g is the same for a positron or an electron: E and B will be reversed, but g is not) has an electromagnetic mass melec equal to:

melec = (2/3)·e2/(a·c2)

That’s an amazing result. We don’t need to give our electron any rest mass: just its charge and its movement will do! Super! So we don’t need any Higgs fields here! 🙂 The electromagnetic field will do!

Well… Maybe. Let’s explore what we’ve got here.

First, let’s compare that radius a in our formula to what’s found in experiments. Huh? Did someone ever try to measure the electron radius? Of course. There are all these scattering experiments in which electrons get fired at atoms. They can fly through or, else, hit something. Therefore, one can do some statistical analysis and determine what is referred to as a cross-section. A cross-section is denoted by the same symbol as the standard deviation: σ (sigma). In any case… So there’s something that’s referred to as the classical electron radius, and it’s equal to the so-called Thomson scattering length. Thomson scattering, as opposed to Compton scattering, is elastic scattering, so it preserves kinetic energy (unlike Compton scattering, where energy gets absorbed and changes frequencies). So… Well… I won’t go into too much detail but, yes, this is the electron radius we need. [I am saying this rather explicitly because there are two other numbers around: the so-called Bohr radius and, as you might imagine, the Compton scattering cross-section.]

The Thomson scattering length is 2.82 femtometer (so that’s 2.82×10−15 m), more or less that is :-), and it’s usually related to the observed electron mass me through the fine-structure constant α. In fact, using Planck units, we can write: re·me = α, which is an amazing formula but, unfortunately, I can’t dwell on it here. Using ordinary m, s, C and what have you units, we can write re as:

re = e2/(me·c2)

That’s good, because if we equate me and melec, and solve our formula for melec for the radius a, we get:

a = (2/3)·re

So, frankly, we’re spot on! Well… Almost. The two numbers differ by 1/3. But who cares about a 1/3 factor indeed? We’re talking rather fuzzy stuff here – scattering cross-sections and standard deviations and all that – so… Yes. Well done! Our theory works!
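Here’s that comparison as a quick sketch, just to show where the 2/3 comes out of the numbers:

```python
e2 = 2.307e-28      # ke·qe2 (N·m2)
me = 9.109e-31      # observed electron mass (kg)
c  = 2.998e8        # speed of light (m/s)
re = 2.818e-15      # classical electron radius (m)

# solve melec = (2/3)·e2/(a·c2) for a, equating melec with the observed mass
a = (2/3) * e2 / (me * c**2)
print(a)            # ≈ 1.88×10−15 m
print(a / re)       # ≈ 0.667, i.e. the 2/3 factor
```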

Well… Maybe. Physicists don’t think so. They think the 1/3 factor is an issue. It’s sad because it really makes a lot of sense. In fact, the Dutch physicist Hendrik Lorentz – whom we know so well by now 🙂 – had also worked out that, because of the length contraction effect, our spherical charge would contract into an ellipsoid and… Well… He worked it all out, and it was not a problem: he found that the momentum was altered by the factor (1−v2/c2)−1/2, so that’s the ubiquitous Lorentz factor γ! He got this formula in the 1890s already, so that’s long before the theory of relativity had been developed. So, many years before Planck and Einstein would come up with their stuff, Hendrik Antoon Lorentz had the correct formulas already: the mass, or everything really, all should vary with that γ-factor. 🙂

Why bother about the 1/3 factor? [I should note it’s actually referred to as the 4/3 problem in physics.] Well… The critics do have a point: if we assume that (a) an electron is not a point charge – so if we allow it to have some radius a – and (b) that Maxwell’s Laws apply, then we should go all the way. The energy that’s needed to assemble an electron should then, effectively, be the same as the value we’d get out of those field energy formulas. So what do we get when we apply those formulas? Well… Let me quickly copy Feynman as he does the calculation for an electron, not looking at it as a point particle, but as a tiny shell of charge, i.e. a sphere with all charge sitting on the surface:

[Image: Feynman’s calculation of the field energy for a spherical shell of charge, copied from the Lectures.]

 Let me enlarge the formula:

Uelec = (1/2)·(e2/a)

Now, if we combine that with our formula for melec above, then we get:

Uelec = (3/4)·melec·c2

So that formula does not respect Einstein’s universal mass-energy equivalence formula E = mc2. Now, you will agree that we really want Einstein’s mass-energy equivalence relation to be respected by all, so our electron should respect it too. 🙂 So, yes, we’ve got a problem here, and it’s referred to as the 4/3 problem (yes, the ratio got turned around).

Now, you may think it got solved in the meanwhile. Well… No. It’s still a bit of a puzzle today, and the current-day explanation is not really different from what the French scientist Henri Poincaré proposed as a ‘solution’ to the problem back in the 1890s. He basically told Lorentz the following: “If the electron is some little ball of charge, then it should explode because of the repulsive forces inside. So there should be some binding forces there, and so that energy explains the ‘missing mass’ of the electron.” So these forces are effectively being referred to as Poincaré stresses, and the non-electromagnetic energy that’s associated with them – which, of course, has to be equal to 1/3 of the electromagnetic energy (I am sure you see why) 🙂 – adds to the total energy and all is alright now. We get:

U = mc2 = (melec + mPoincaré)c2

So… Yes… Pretty ad hoc. Worse, according to the Wikipedia article on electromagnetic mass, that’s still where we are. And, no, don’t read Feynman’s overview of all of the theories that were around then (so that’s in the 1960s, or earlier). As I said, it’s the one Lecture you don’t want to waste time on. So I won’t do that either.

In fact, let me try to do something else here, and that’s to de-construct the whole argument really. 🙂 Before I do so, let me highlight the essence of what was written above. It’s quite amazing really. Think of it: we say that the mass of an electron – i.e. its inertia, or the proportionality factor in Newton’s F = m·a law of motion – is the energy in the electric and magnetic field it causes. So the electron itself is just a hook for the force law, so to say. There’s nothing there, except for the charge causing the field. But so its mass is everywhere and, hence, nowhere really. Well… I should correct that: the field strength falls off as 1/r2 and, hence, the energy flow and momentum density that’s associated with it falls off as 1/r4, so it falls off very rapidly and so the bulk of the energy is pretty near the charge. 🙂

[Note: You’ll remember that the field that’s associated with electromagnetic radiation falls off as 1/r, not as 1/r2, which is why there is an energy flux there which is never lost, which can travel independently through space. It’s not the same here, so don’t get confused.]

So that’s something to note: the melec = (2c−2/3)·(e2/a) has the radius in it, but that radius is only the hook, so to say. That’s fine, because it is not inconsistent with the idea of the Thomson scattering cross-section, which is the area that one can hit. Now, you’ll wonder how one can hit an electron: you can readily imagine an electron beam aimed at nuclei, but how would one hit electrons? Well… You can shoot photons at them, and see if they bounce back elastically or non-elastically. The cross-section area that bounces them off elastically must be pretty ‘hard’, and the cross-section that deflects them non-elastically somewhat less so. 🙂

OK… But… Yes? Hey! How did we get that electron radius in that formula? 

Good question! Brilliant, in fact! You’re right: it’s here that the whole argument falls apart really. We did a substitution. That radius a is the radius of a spherical shell of charge with an energy that’s equal to Uelec = (1/2)·(e2/a), so there’s another way of stating the inconsistency: the equivalent energy of melec = (2c−2/3)·(e2/a) is equal to E = melec·c2 = (2/3)·(e2/a) and that’s not the same as Uelec = (1/2)·(e2/a). If we take the ratio of Uelec and melec·c2, we get the same factor: (1/2)/(2/3) = 3/4. But… Your question is superb! Look at it: putting it the way we put it reveals the inconsistency in the whole argument. We’re mixing two things here:

  1. We first calculate the momentum density, and the momentum, that’s caused by the unit charge, so we get some energy which I’ll denote as Eelec = melec·c2
  2. Now, we then assume this energy must be equal to the energy that’s needed to assemble the unit charge from an infinite number of infinitesimally small charges, thereby also assuming the unit charge is a uniformly charged sphere of charge with radius a.
  3. We then use this radius a to simplify our formula for Eelec = melec·c2

Now that is not kosher, really! First, it’s (a) a lot of assumptions, both implicit as well as explicit, and then (b) it’s, quite simply, not a legit mathematical procedure: calculating the energy in the field, or calculating the energy we need to assemble a uniformly charged sphere of radius a are two very different things.

Well… Let me put it differently. We’re using the same laws – it’s all Maxwell’s equations, really – but we should be clear about what we’re doing with them, and those two things are very different. The legitimate conclusion must be that our a is wrong. In other words, we should not assume that our electron is spherical shell of charge. So then what? Well… We could easily imagine something else, like a uniform or even a non-uniformly charged sphere. Indeed, if we’re just filling empty space with infinitesimally small charge ‘elements’, then we may want to think the density at the ‘center’ will be much higher, like what’s going on when planets form: the density of the inner core of our own planet Earth is more than four times the density of its surface material. [OK. Perhaps not very relevant here, but you get the idea.] Or, conversely, taking into account Poincaré’s objection, we may want to think all of the charge will be on the surface, just like on a perfect conductor, where all charge is surface charge!

Note that the field outside of a uniformly charged sphere and the field of a spherical shell of charge is exactly the same, so we would not find a different number for Eelec = melec·c2, but we surely would find a different number for Uelec. You may want to look up some formulas here: you’ll find that the energy of a uniformly distributed sphere of charge (so we do not assume that all of the charge sits on the surface here) is equal to (3/5)·(e2/a). So we’d already have much less of a problem, because the 3/4 factor in the Uelec = (3/4)·melec·c2 relation becomes a (3/5)/(2/3) = 9/10 factor. So now we have a discrepancy of some 10% only. 🙂

You’ll say: 10% is 10%. It’s huge in physics, as it’s supposed to be an exact science. Well… It is and it isn’t. Do you realize we haven’t even started to talk about stuff like spin? Indeed, in modern physics, we think of electrons as something that also spins around one or the other axis, so there’s energy there too, and we didn’t include that in our analysis.

In short, Feynman’s approach here is disappointing. Naive even, but then… Well… Who knows? Perhaps he didn’t do this Lecture himself. Perhaps it’s just an assistant or so. In fact, I should wonder why there are still physicists wasting time on this! I should also note that naively comparing that a radius with the classical electron radius also makes little or no sense. Unlike what you’d expect, the classical electron radius re and the Thomson scattering cross-section σ are not related like you might think they are, i.e. like σ = π·re2, or σ = π·(re/2)2, or σ = re2, or σ = π·(2·re)2, or whatever circular surface calculation rule that might make sense here. No. The Thomson scattering cross-section is equal to:

σ = (8π/3)·re2 = (2/3)·π·(2·re)2 ≈ 66.5×10−30 m2 = 66.5 (fm)2

Why? I am not sure. I must assume it’s got to do with the standard deviation and all that. The point is, we’ve got a 2/3 factor here too, so do we have a problem really? I mean… The a we got was equal to a = (2/3)·re, wasn’t it? It was. But, unfortunately, it doesn’t mean anything. It’s just a coincidence. In fact, looking at the Thomson scattering cross-section, instead of the Thomson scattering radius, makes the ‘problem’ a little bit worse. Indeed, applying the π·r2 rule for a circular surface, we get that the radius would be equal to (8/3)1/2·re ≈ 1.633·re, so we get something that’s much larger rather than something that’s smaller here.
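The cross-section arithmetic itself is a one-minute check:

```python
import math

re = 2.818e-15                      # classical electron radius (m)
sigma = (8 * math.pi / 3) * re**2   # Thomson scattering cross-section
print(sigma)                        # ≈ 66.5×10−30 m2 = 66.5 (fm)2

r_eq = math.sqrt(sigma / math.pi)   # radius of a circle with the same area
print(r_eq / re)                    # ≈ 1.633 = (8/3)^(1/2)
```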

In any case, it doesn’t matter. The point is: this kind of comparisons should not be taken too seriously. Indeed, when everything is said and done, we’re comparing three very different things here:

  1. The radius that’s associated with the energy that’s needed to assemble our electron from infinitesimally small charges, and so that’s based on Coulomb’s law and the model we use for our electron: is it a shell or a sphere of charge? If it’s a sphere, do we want to think of it as something of uniform or non-uniform density?
  2. The second radius is associated with the field of an electron, which we calculate using Poynting’s formula for the energy flow and/or the momentum density. So that’s not about the internal structure of the electron but, of course, it would be nice if we could find some model of an electron that matches this radius.
  3. Finally, there’s the radius that’s associated with elastic scattering, which is also referred to as hard scattering because it’s like the collision of two hard spheres indeed. But so that’s some value that has to be established experimentally and so it involves judicious choices because there’s probabilities and standard deviations involved.

So should we worry about the gaps between these three different concepts? In my humble opinion: no. Why? Because they’re all damn close and so we’re actually talking about the same thing. I mean: isn’t it terrific that we’ve got a model that brings the first and the second radius together with a difference of 10% only? As far as I am concerned, that shows the theory works. So what Feynman’s doing in that (in)famous chapter is some kind of ‘dimensional analysis’ which confirms rather than invalidates classical electromagnetic theory. So it shows classical theory’s strength, rather than its weakness. It actually shows our formulas do work where we wouldn’t expect them to work. 🙂

The thing is: when looking at the behavior of electrons themselves, we’ll need a different conceptual framework altogether. I am talking quantum mechanics here. Indeed, we’ll encounter other anomalies than the ones we presented above. There’s the issue of the anomalous magnetic moment of electrons, for example. Indeed, as I mentioned above, we’ll also want to think of electrons as something that spins around one or the other axis, so there’s energy there too, and that implies some circulation of charge that will generate a permanent magnetic dipole moment… […] OK, just think of some magnetic field if you don’t have a clue what I am saying here (but then you should check out my post on it). […] The point is: here too, the so-called ‘classical result’, so that’s its theoretical value, will differ from the experimentally measured value. Now, the difference here will be 0.0011614, so that’s about 0.1%, i.e. 100 times smaller than my 10%. 🙂
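As an aside – this is a standard result from quantum electrodynamics, not something we derived here – that 0.0011614 number is, to first order, just α/2π, as a one-line check shows:

```python
import math

alpha = 0.0072974   # fine-structure constant
print(alpha / (2 * math.pi))   # ≈ 0.0011614: the leading correction to the magnetic moment
```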

Personally, I think that’s not so bad. 🙂 But then physicists need to stay in business, of course. So, yes, it is a problem. 🙂

Post scriptum on the math versus the physics

The key to the calculation of the energy that goes into assembling a charge was the following integral:

U = (1/2)∫∫ ρ(1)·ρ(2)/(4πε0·r12)·dV1·dV2

This is a double integral which we simplified in two stages, so we’re looking at an integral within an integral really, but we can substitute the integral over the ρ(2)·dV2 product by the formula we got for the potential, so we write that as Φ(1), and so the integral above becomes:

U = (1/2)∫ ρ(1)·Φ(1)·dV1

Now, this integral integrates the ρ(1)·Φ(1)·dV1 product over all of space, so that’s over all points in space, and so we just dropped the index and wrote the whole thing as the integral of ρ·Φ·dV over all of space:

U = (1/2)∫ ρ·Φ·dV

We then established that this integral was mathematically equivalent to the following equation:

U = (ε0/2)∫ E·E·dV

So this integral is actually quite simple: it just integrates E·E = E2 over all of space. The illustration below shows E as a function of the distance for a sphere of radius R filled uniformly with charge.

[Illustration: the field E as a function of the distance r for a sphere of radius R filled uniformly with charge: E rises linearly up to r = R and then falls off as 1/r2.]

So the field (E) goes as r for r ≤ R and as 1/r2 for r ≥ R. So, for r ≥ R, the integral will have (1/r2)2 = 1/r4 in it. Now, you know that the integral of some function is the surface under the graph of that function. Look at the 1/r4 function below: it blows up as r goes from 1 to 0. That’s where the problem is: there needs to be some kind of cut-off, because that integral will effectively blow up when the radius of our little sphere of charge gets ‘too small’. So that makes it clear why it doesn’t make sense to use this formula to try to calculate the energy of a point charge. It just doesn’t make sense to do that.

[Graph: the 1/r4 function, which blows up as r goes to zero.]

What’s ‘too small’? Let’s look at the formula we got for our electron as a spherical shell of charge:

Uelec = (1/2)·(e2/a)

So we’ve got an even simpler formula here: it’s just a 1/r relation. Why is that? Well… It’s just the way the math turns out. I copied the detail of Feynman’s calculation above, so you can double-check it. It’s quite wonderful, really. We have a very simple inversely proportional relationship between the radius of our electron and its energy as a sphere of charge. We could write it as:

Uelec = α/a, with α = e2/2

But – Hey! Wait a minute! We’ve seen something like this before, haven’t we? We did. We did when we were discussing the wonderful properties of that magical number, the fine-structure constant, which we also denoted by α. 🙂 However, because we used α already, I’ll denote the fine-structure constant as αe here, so you don’t get confused. As you can see, the fine-structure constant links all of the fundamental properties of the electron: its charge, its radius, its distance to the nucleus (i.e. the Bohr radius), its velocity, and its mass (and, hence, its energy). So, at this stage of the argument, α can be anything, and αe cannot, of course. It’s just that magical number out there, which relates everything to everything: it’s the God-given number we don’t understand. 🙂 Having said that, it seems like we’re going to get some understanding here because we know that one of the many expressions involving αe was the following one:

me = αe/re

This says that the mass of the electron is equal to the ratio of the fine-structure constant and the electron radius. [Note that we express everything in natural units here, so that’s Planck units. For the detail of the conversion, please see the relevant section in one of my posts on this and other stuff.] Now, mass is equivalent to energy, of course: it’s just a matter of units, so we can equate me with Ee (this amounts to expressing the energy of the electron in a kg unit—bit weird, but OK) and so we get:

Ee = αe/re

So there we have: the fine-structure constant αe is Nature’s ‘cut-off’ factor, so to speak. Why? Only God knows. 🙂 But it’s now (fairly) easy to see why all the relations involving αe are what they are. For example, we also know that αe is the square of the electron charge expressed in Planck units, so we have:

αe = eP2 and, therefore, Ee = eP2/re

Now, you can check for yourself: it's just a matter of re-expressing everything in standard SI units, and relating eP2 to e2, and it should all work: you should get the Uelect = (1/2)·e2/r expression. So… Well… At least this takes some of the magic out of the fine-structure constant. It's still a wonderful thing, but so you see that the fundamental relationship between (a) the energy (and, hence, the mass), (b) the radius and (c) the charge of an electron is not something God-given. What's God-given are Maxwell's equations, and so the Ee = αe/re = eP2/re relation is just one of the many wonderful things that you can get out of them. 🙂
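
In fact, we can quickly check the numbers. The little Python script below – assuming you have scipy installed, for the physical constants – evaluates Uelect = (1/2)·e2/r at the classical electron radius, and finds it's exactly half the electron's rest energy, just as the formula says it should be:

import scipy.constants as sc

# e2 is shorthand for qe^2/(4*pi*eps0), as in the text
e2 = sc.e**2 / (4 * sc.pi * sc.epsilon_0)      # in J*m
r_classical = e2 / (sc.m_e * sc.c**2)          # classical electron radius, about 2.82e-15 m

U = 0.5 * e2 / r_classical                     # U_elect = (1/2)*e2/r
print(U / (sc.m_e * sc.c**2))                  # prints 0.5: half the rest energy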


Field energy and field momentum

This post goes to the heart of the E = mc2 equation. It's kinda funny, because Feynman just compresses all of it into a sub-section of his Lectures. However, as far as I am concerned, I feel it's a very crucial section. Pivotal, I'd say, which fits with its place in the 115 Lectures that make up the three volumes: roughly mid-way, which is where we are here. So let's go for it. 🙂

Let’s first recall what we wrote about the Poynting vector S, which we calculate from the magnetic and electric field vectors E and B by taking their cross-product:

S = ε0c2·E×B

This vector represents the energy flow, per unit area and per unit time, in electrodynamical situations. If E and/or B are zero (which is the case in electrostatics, for example, because we don't have magnetic fields in electrostatics), then S is zero too, so there is no energy flow then. That makes sense, because we have no moving charges, so where would the energy go to?

I also made it clear we should think of S as something physical, by comparing it to the heat flow vector h, which we presented when discussing vector analysis and vector operators. The heat flow out of a surface element da is the area times the component of h perpendicular to da, so that's (h•n)·da = hn·da. Likewise, we can write (S•n)·da = Sn·da. The units of S and h are also the same: joule per second and per square meter or, using the definition of the watt (1 W = 1 J/s), watt per square meter. In fact, if you google a bit, you'll find that both h and S are referred to as a flux density:

  1. The heat flow vector h is the heat flux density vector, from which we get the heat flux through an area through the (h•n)·da = hn·da product.
  2. The energy flow vector S is the energy flux density vector, from which we get the energy flux through the (S•n)·da = Sn·da product.

So that should be enough as an introduction to what I want to talk about here. Let’s first look at the energy conservation principle once again.

Local energy conservation

In a way, you can look at my previous post as being all about the equation below, which we referred to as the ‘local’ energy conservation law:

∂u/∂t = −∇•S

Of course, it is not the complete energy conservation law. The local energy is not only in the field. We've got matter as well, and so that's what I want to discuss here: we want to look at the energy in the field as well as the energy that's in the matter. Indeed, field energy is conserved only as long as the field and matter leave each other alone: if the field is doing work on matter, or matter is doing work on the field, then energy goes from one to the other, i.e. from the field to the matter or from the matter to the field. So we need to include matter in our analysis, which we didn't do in our last post. Feynman gives the following simple example: we're in a dark room, and suddenly someone turns on the light switch. So now the room is full of field energy—and, yes, I just mean it's not dark anymore. 🙂 So that means some matter out there must have radiated its energy out and, in the process, it must have lost the equivalent mass of that energy. So, yes, we had matter losing energy and, hence, losing mass.

Now, we know that energy and momentum are related. Incorporating relativity theory, we've got two formulas for them:

  1. E2 − p2c2 = m02c4
  2. p·c = E·(v/c) ⇔ p = v·E/c2 = m·v

The E = mc2 and m = m0·(1−v2/c2)−1/2 formulas connect both expressions. So we can look at it in either of two ways. We could use the energy conservation law, but Feynman prefers the conservation of momentum approach, so let's see where he takes us. If the field has some energy (and, hence, some equivalent mass) per unit volume, and if there's some flow, so if there's some velocity (which there is: that's what our previous post was all about), then it will have a certain momentum per unit volume. [Remember: momentum is mass times velocity.] That momentum will have a direction, so it's a vector, just like p = mv. We'll write it as g, so we define g as:

g is the momentum of the field per unit volume.

What units would we express it in? We've got a bit of choice here. For example, because we're relating everything to energy here, we may want to convert our kilogram into eV/c2 or J/c2 units, using the mass-energy equivalence relation E = mc2. Hmm… Let's first keep the kg as a measure of inertia though. So we write: [g] = [m]·[v]/m3 = (kg·m/s)/m3. Hmm… That doesn't show it's energy, so let's replace the kg with a unit that's got newton and meter in it, cf. the F = m·a law. Because 1 kg = 1 N·s2/m, we can write: [g] = (kg·m/s)/m3 = [(N·s2/m)·(m/s)]/m3 = N·s/m3. Well… OK. The newton·second is the unit of momentum indeed, and we can re-write it including the joule (1 J = 1 N·m), so then we get [g] = N·m·s/m4 = J·s/m4, so what's that? Well… Nothing much. However, I do note it happens to be the dimension of S/c2: [S/c2] = [J/(s·m2)]·(s2/m2) = J·s/m4. 🙂 Let's continue the discussion.
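
In case you don't trust the juggling above, we can let a few lines of Python do the bookkeeping. The sketch below tracks dimensions as (kg, m, s) exponent triplets – a toy representation I made up for the occasion, not some standard library – and confirms that [g] = [S/c2]:

# Track SI dimensions as (kg, m, s) exponents and check [g] = [S/c^2]
def mul(a, b): return tuple(x + y for x, y in zip(a, b))
def div(a, b): return tuple(x - y for x, y in zip(a, b))

KG, M, S = (1, 0, 0), (0, 1, 0), (0, 0, 1)
VELOCITY = div(M, S)                          # m/s
MOMENTUM = mul(KG, VELOCITY)                  # kg*m/s
VOLUME = (0, 3, 0)                            # m^3
JOULE = mul(KG, (0, 2, -2))                   # kg*m^2/s^2

g_dim = div(MOMENTUM, VOLUME)                 # momentum per unit volume
S_dim = div(div(JOULE, (0, 2, 0)), S)         # J/(m^2*s)
S_c2 = div(S_dim, mul(VELOCITY, VELOCITY))    # S divided by c^2

print(g_dim, S_c2, g_dim == S_c2)             # both (1, -2, -1): [g] = [S/c^2] indeed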

Now, momentum is conserved, and each component of it is conserved. So let’s look at the x-direction. We should have something like:

∂(pmatter)x/∂t = −∂gx/∂t − (momentum outflow)x

If you look at this carefully, you'll probably say: “OK. I understood the thing with the dark room and light switch. Mass got converted into field energy, but what's that second term on the right?”

Good. Smart. Right remark. Perfect. […] Let me try to answer the question. While all of the quantities above are expressed per unit volume, we're actually looking at the same infinitesimal volume element here, so the example of the light switch is actually an example of a ‘momentum outflow’, so it's actually an example of that second term on the right-hand side of the equation above kicking in! 🙂

Indeed, the first term just sort of reiterates the mass-energy equivalence: the energy that’s in the matter can become field energy, so to speak, in our infinitesimal volume element itself, and vice versa. But if it doesn’t, then it should get out and, hence, become ‘momentum outflow’. Does that make sense? No?

Hmm… What to say? You’ll need to look at that equation a couple of times more, I guess. :-/ But I need to move on, unfortunately. [Don’t get put off when I say things like this: I am basically talking to myself, so it means I’ll need to re-visit this myself. :-/]

Let’s look at all of the three terms:

  1. The left-hand side (i.e. the time rate-of-change of the momentum of matter) is easy. It's just the force on it, which we know is equal to F = q(E + v×B). Do we know that? OK… I'll admit it. Sometimes it's easy to forget where we are in an analysis like this, but so we're looking at the electromagnetic force here. 🙂 As we're talking infinitesimals here and, therefore, charge density rather than discrete charges, we should re-write this as the force per unit volume, which is ρE + j×B. [This is an interesting formula which I didn't use before, so you should double-check it. :-)]
  2. The first term on the right-hand side should be equally obvious, or… Well… Perhaps somewhat less so. But with all my rambling on the Uncertainty Principle and/or the wave-particle duality, it should make sense. If we scrap the second term on the right-hand side, we basically have an equation that is equivalent to the E = mc2 equation. No? Sorry. Just look at it, again and again. You’ll end up understanding it. 🙂
  3. So it’s that second term on the right-hand side. What the hell does that say? Well… I could say: it’s the local energy or momentum conservation law. If the energy or momentum doesn’t stay in, it has to go out. 🙂 But that’s not very satisfactory as an answer, of course. However, please just go along with this ‘temporary’ answer for a while.

So what is that second term on the right-hand side? As we wrote it, it's an x-component – or, let's put it differently, it is or was part of the x-component of the momentum density – but, frankly, we should probably allow it to go out in any direction really, as the only constraint on the left-hand side is a per second rate of change of something. Hence, Feynman suggests to equate it to something like this:

−(∂a/∂x + ∂b/∂y + ∂c/∂z)

What are a, b and c? The components of some vector? Not sure. We're stuck. This piece really requires very advanced math. In fact, as far as I know, this is the only time where Feynman says: “Sorry. This is too advanced. I'll just give you the equation. Sorry.” So that's what he does. He explains the philosophy of the argument, which is the following:

  1. On the left-hand side, we've got the time rate-of-change of momentum, so that obeys the F = dp/dt = d(mv)/dt law, with the force F per unit volume being equal to F(unit volume) = ρE + j×B.
  2. On the right-hand side, we’ve got something that can be written as:

−(∂a/∂x + ∂b/∂y + ∂c/∂z)

So we'd need to find a way to express ρE + j×B in terms of E and B only – eliminating ρ and j by using Maxwell's equations or whatever other trick – and then juggle terms and make substitutions to get it into a form that looks like the formula above, i.e. the right-hand side of that equation. But so Feynman doesn't show us how it's being done. He just mentions some theorem in physics, which says that the energy that's flowing through a unit area per unit time, divided by c2 – so that's E/c2 per unit area and per unit time – must be equal to the momentum per unit volume in the space, so we write:

g = S/c2

He illustrates the general theorem that’s used to get the equation above by giving two examples:

[Figure: two examples illustrating the general theorem relating energy flow and momentum density]

OK. Two good examples. However, it's still frustrating to not see how we get the g = S/c2 formula in the specific context of the electromagnetic force, so let's do a dimensional analysis at least. In my previous post, I showed that the dimension of S must be J/(m2·s), so [S/c2] = [J/(m2·s)]·(s2/m2) = [N·m/(m2·s)]·(s2/m2) = N·s/m3. Now, we know that the unit of mass is 1 kg = 1 N/(m/s2). That's just the force law: a force of 1 newton will give a mass of 1 kg an acceleration of 1 m/s per second, so 1 N = 1 kg·(m/s2). So the N·s/m3 dimension is equal to [kg·(m/s2)·s]/m3 = [kg·(m/s)]/m3, which is the dimension of momentum (p = mv) per unit volume, indeed. So, yes, the dimensional analysis works out, and it's also in line with the p = v·E/c2 = m·v equation. And it squares with the dimensional analysis we did before, where we showed that [g] = [S/c2] = J·s/m4: that's the same dimension, just written out differently. Well… In any case… It's a bit frustrating to not see the detail here, but let us note the Grand Result once again:

The Poynting vector S gives us the energy flow as well as the momentum density g = S/c2.
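
To get a feel for the magnitudes, here's a tiny Python calculation for sunlight – taking the solar flux near Earth to be the usual round number of about 1.36 kW/m2, just for illustration:

c = 299_792_458.0      # m/s
S_sun = 1.36e3         # W/m^2: solar flux near Earth (a round number, for illustration)

g = S_sun / c**2       # momentum density of sunlight, in N*s/m^3
p_rate = S_sun / c     # momentum delivered per m^2 per second on full absorption, N/m^2

print(f"g = {g:.2e} N*s/m^3, radiation pressure = {p_rate:.2e} Pa")
# about 1.5e-14 N*s/m^3 and 4.5e-6 Pa: tiny, but very real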

But what does it all mean, really? Let's go through Einstein's illustration of the principle. That will help us a lot. Before we do, however, I'd like to note something. I've always wondered a bit about that dichotomy between energy and momentum. Energy is force times distance: 1 joule is 1 newton × 1 meter indeed (1 J = 1 N·m). Momentum is force times time, as we can express it in N·s. Planck's constant combines all three in the dimension of action, which is force times distance times time: h ≈ 6.626×10−34 N·m·s, indeed. I like that unity. In this regard, you should, perhaps, quickly review that post in which I explain that h is the energy per cycle, i.e. per wavelength or per period, of a photon, regardless of its wavelength. So it's really something very fundamental.

We've got something similar here: energy and momentum coming together, and being shown as one aspect of the same thing: some oscillation. Indeed, just see what happens with the dimensions when we ‘distribute’ the 1/c2 factor on the right-hand side over the two sides, so we write: c·g = S/c and work out the dimensions:

  1. [c·g] = (m/s)·(N·s/m3) = N/m2 = J/m3.
  2. [S/c] = (s/m)·(N·m)/(s·m2) = N/m2 = J/m3.

Isn't that nice? Both sides of the equation now have a dimension like ‘force per unit area’, or ‘energy per unit volume’. To get that, we just re-scaled g and S, by c and 1/c respectively. As far as I am concerned, this shows an underlying unity we probably tend to mask with our ‘related but different’ energy and momentum concepts. It's like E and B: I just love it that we can write them together in our Poynting formula S = ε0c2·E×B. In fact, let me show something else here, which you should think about. You know that c2 = 1/(ε0μ0), so we can also write S as S = E×B/μ0. That's nice, but what's nice too is the following:

  1. S/c = c·g = ε0c·E×B = E×B/(μ0c)
  2. S/g = c2 = 1/(ε0μ0)

So, once again, Feynman may feel the Poynting vector is sort of counter-intuitive when analyzing specific situations but, as far as I am concerned, I feel the Poynting vector makes things actually easier to understand. Instead of two E and B vectors, and two concepts to deal with ‘energy’ (i.e. energy and momentum), we're sort of unifying things here. In that regard – i.e. in regard of feeling we're talking about the same thing really – I'd really highlight the S/g = c2 = 1/(ε0μ0) equation. Indeed, the universal constant c acts just like the fine-structure constant here: it links everything to everything. 🙂
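
You can verify that last equation numerically in two lines of Python (again assuming scipy for the constants):

import scipy.constants as sc

print(sc.c**2)                        # about 8.988e16 m^2/s^2
print(1 / (sc.epsilon_0 * sc.mu_0))   # the same number: S/g = c^2 = 1/(eps0*mu0)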

And, yes, it's also about time we introduce the so-called principle of least action to explain things, because action, as a concept, combines force, distance and time indeed, so it's a bit more promising than just energy, or just momentum. Having said that, you'll see in the next section that it's sometimes quite useful to have the choice between one formula or the other. But… Well… Enough talk. Let's look at Einstein's car.

Einstein’s car

Einstein’s car is a wonderful device: it rolls without any friction and it moves with a little flashlight. That’s all it needs. It’s pictured below. 🙂 So the situation is the following: the flashlight shoots some light out from one side, which is then stopped at the opposite end of the car. When the light is emitted, there must be some recoil. In fact, we know it’s going to be equal to 1/c times the energy because all we need to do is apply the pc = E·(v/c) formula for v = c, so we know that p = E/c. Of course, this momentum now needs to move Einstein’s car. It’s frictionless, so it should work, but still… The car has some mass M, and so that will determine its recoil velocity: v = p/M. We just apply the general p = mv formula here, and v is not equal to c here, of course! Of course, then the light hits the opposite end of the car and delivers the same momentum, so that stops the car again. However, it did move over some distance x = vt. So we could flash our light again and get to wherever we want to get. [Never mind the infinite accelerations involved!] So… Well… Great! Yes, but Einstein didn’t like this car when he first saw it. In fact, he still doesn’t like it, because he knows it won’t take you very far. 🙂

[Illustration: Einstein's car, with a flashlight shooting a blob of light from one end to the other]

The problem is that we seem to be moving the center of gravity of this car by fooling around on the inside only. Einstein doesn’t like that. He thinks it’s impossible. And he’s right of course. The thing is: the center of gravity did not change. What happened here is that we’ve got some blob of energy, and so that blob has some equivalent mass (which we’ll denote by U/c2), and so that equivalent mass moved all the way from one side to the other, i.e. over the length of the car, which we denote by L. In fact, it’s stuff like this that inspired the whole theory of the field energy and field momentum, and how it interacts with matter.

What happens here is like switching the light on in the dark room: we've got matter doing work on the field, and so matter loses mass, and the field gains it, through its momentum and/or energy. To calculate how much, we could integrate S/c or c·g over the volume of our blob, and we'd get something in joule indeed, but there's a simpler way here. The momentum conservation law says that the momentum of our car and the momentum of our blob must be equal, so if T is the time that was needed for our blob to go to the other side – and so that's, of course, also the time during which our car was rolling – then M·v = M·x/T must be equal to (U/c2)·(L/T). The 1/T factor on both sides cancels, so we write: M·x = (U/c2)·L. Now, what is x? Yes. In case you were wondering, that's what we're looking for here. 🙂 Here it is:

x = v·T = v·(L/c) = (p/M)·(L/c) = [(U/c)/M]·(L/c) = (U/c2)·(L/M)
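
Just to show how small x is in practice, here's a quick Python calculation with made-up numbers – a 1,000 kg car, 3 m long, flashing a full joule of light:

c = 299_792_458.0            # m/s

M, L, U = 1_000.0, 3.0, 1.0  # hypothetical mass (kg), length (m) and blob energy (J)

x = (U / c**2) * (L / M)     # x = (U/c^2)*(L/M), as derived above
print(f"x = {x:.2e} m")      # about 3.3e-20 m: the car barely moves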

So what’s next? Well… Now we need to show that the center-of-mass actually did not move with this ‘transfer’ of the blob. I’ll leave the math to you here: it should all work out. And you can also think through the obvious questions:

  1. Where is the energy and, hence, the mass of our blob after it stops the car? Hint: think about excited atoms and imagine they might radiate some light back. 🙂
  2. As the car did move a little bit, we should be able to move it further and further away from its center of gravity, until the center of gravity is no longer in the car. Hint: think about batteries and energy levels going down while shooting light out. It just won’t happen. 🙂

Now, what about a blob of light going from the top to the bottom of the car? Well… That involves the conservation of angular momentum: we’ll have more mass on the bottom, but on a shorter lever-arm, so angular momentum is being conserved. It’s a very good question though, and it led Einstein to combine the center-of-gravity theorem with the angular momentum conservation theorem to explain stuff like this.

It's all fascinating, and one can think of a great many paradoxes that, at first, seem to contradict the Grand Principles we used here, which means that they would contradict all that we have learned so far. However, a careful analysis of those paradoxes reveals that they are paradoxes indeed: propositions which sound true but are, in the end, self-contradictory. In fact, when explaining electromagnetism over his various Lectures, Feynman tasks his readers with a rather formidable paradox when discussing the laws of induction, and he solves it only ten chapters later, after describing what we described above. You can busy yourself with it but… Well… I guess you've got something better to do. If so, just take away the key lesson: there's momentum in the field, and it's also possible to build up angular momentum in a magnetic field. If you then switch the field off, the angular momentum will be given back, somehow, because it was stored energy.

That’s also why the seemingly irrelevant circulation of S we discussed in my previous post, where we had a charge next to an ordinary magnet, and where we found that there was energy circulating around, is not so queer. The energy is there, in the circulating field, and it’s real. As real as can be. 🙂

[Illustration: a charge next to an ordinary magnet, with the field energy circulating around]


The energy of fields and the Poynting vector

For some reason, I always thought that Poynting was a Russian physicist, like Minkowski. He wasn't. I just looked it up. Poynting was an Englishman, born near Manchester, and he taught in Birmingham. I should have known. Poynting is a very English name, isn't it? My confusion probably stems from the fact that it was some Russian physicist, Nikolay Umov, who first proposed the basic concepts we are going to discuss here, i.e. the speed and direction of energy itself, or its movement. And as I am double-checking, I just learned that Hermann Minkowski is generally considered to be German-Jewish, not Russian. Makes sense. With Einstein and all that. His personal life story is actually quite interesting. You should check it out. 🙂

Let's go for it. We've done a few posts on the energy in the fields already, but all in the context of electrostatics. Let me first walk you through the ideas we presented there.

The basic concepts: force, work, energy and potential

1. A charge q causes an electric field E, and E‘s magnitude E is a simple function of the charge (q) and its distance (r) from the point that we’re looking at, which we usually write as P = (x, y, z). Of course, the origin of our reference frame here is q. The formula is the simple inverse-square law that you (should) know: E ∼ q/r2, and the proportionality constant is just Coulomb’s constant, which I think you wrote as ke in your high-school days and which, as you know, is there so as to make sure the units come out alright. So we could just write E = ke·q/r2. However, just to make sure it does not look like a piece of cake 🙂 physicists write the proportionality constant as 1/4πε0, so we get:

E = (1/4πε0)·(q/r2)

Now, the field is the force on any unit charge (+1) we’d bring to P. This led us to think of energy, potential energy, because… Well… You know: energy is measured by work, so that’s some force acting over some distance. The potential energy of a charge increases if we move it against the field, so we wrote:

W(a to b) = −∫ab F•ds

Well… We actually gave the formula below in that post, so that’s the work done per unit charge. To interpret it, you just need to remember that F = qE, which is equivalent to saying that E is the force per unit charge.

W(unit) = −∫ab E•ds

As for the F•ds or E•ds product in the integrals, that’s a vector dot product, which we need because it’s only the tangential component of the force that’s doing work, as evidenced by the formula F•ds = |F|·|ds|·cosθ = Ft·ds, and as depicted below.

[Illustration: only the tangential component Ft of the force does work along the path]

Now, this allowed us to describe the field in terms of the (electric) potential Φ and the potential differences between two points, like the points a and b in the integral above. We have to choose some reference point, of course, some P0 defining zero potential, which is usually infinitely far away. So we wrote our formula for the work that's being done on a unit charge, i.e. W(unit), as:

Φ(P) = −∫ E•ds (with the integral taken from the reference point P0 to P)

2. The world is full of charges, of course, and so we need to add all of their fields. But so now you need a bit of imagination. Let’s reconstruct the world by moving all charges out, and then we bring them back one by one. So we take q1 now, and we bring it back into the now-empty world. Now that does not require any energy, because there’s no field to start with. However, when we take our second charge q2, we will be doing work as we move it against the field or, if it’s an opposite charge, we’ll be taking energy out of the field. Huh? Yes. Think about it. All is symmetric. Just to make sure you’re comfortable with every step we take, let me jot down the formula for the force that’s involved. It’s just the Coulomb force of course:

F1 = −F2 = (1/4πε0)·(q1q2/r122)·e12

F1 is the force on charge q1, and F2 is the force on charge q2. Now, q1 and q2 may attract or repel each other, but the forces will always be equal and opposite. The e12 vector makes sure the directions and signs come out alright, as it's the unit vector from q2 to q1 (not from q1 to q2, as you might expect when looking at the order of the indices). So we would need to integrate this for r going from infinity to… Well… The distance between q1 and q2 – wherever they end up as we put them back into the world – so that's what's denoted by r12. Now I hate integrals too, but this is an easy one. Just note that ∫r−2dr = −1/r (plus a constant) and you'll be able to figure out that what I'll write now makes sense (if not, I'll do a similar integral in a moment): the work done in bringing two charges together from a large distance (infinity) is equal to:

U = q1q2/(4πε0r12)

So now we should bring in q3 and then q4, of course. That's easy enough. Bringing the first two charges into that world we had emptied took a lot of time, but now we can automate processes. Trust me: we'll be done in no time. 🙂 We just need to sum over all of the pairs of charges qi and qj. So we write the total electrostatic energy U as the sum of the energies of all possible pairs of charges:

U = Σ(all pairs) qiqj/(4πε0rij)

Huh? Can we do that? I mean… Every new charge that we're bringing in here changes the field, doesn't it? It does. But it's the magic of the superposition principle at work here. Our third charge q3 is associated with two pairs in this formula. Think of it: we've got the q1q3 and the q2q3 combination, indeed. Likewise, our fourth charge q4 is to be paired up with three charges now: q1, q2 and q3. This formula takes care of it, and the ‘all pairs’ mention under the summation sign (Σ) reminds us we should watch we don't double-count pairs: the q1q3 and q3q1 combination, for example, counts for one pair only, obviously. So, yes, we write ‘all pairs’ instead of the usual i, j subscripts. But then, yes, this formula takes care of it. We're done!

Well… Not really, of course. We’ve still got some way to go before I can introduce the Poynting vector. 🙂 However, to make sure you ‘get’ the energy formula above, let me insert an extremely simple diagram so you’ve got a bit of a visual of what we’re talking about.

[Diagram: a simple system of point charges and the distances between them]
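
Before we move on, here's a small numerical sketch of that ‘all pairs’ sum in Python. Note how itertools.combinations counts each pair exactly once, so there's no double-counting. The two unit charges at the end are, obviously, just an illustrative example:

from itertools import combinations
import math

k = 1 / (4 * math.pi * 8.854187817e-12)   # Coulomb's constant, about 8.99e9 N*m^2/C^2

def electrostatic_energy(charges):
    # charges: list of (q, (x, y, z)) tuples; U = sum over all pairs of qi*qj/(4*pi*eps0*rij)
    U = 0.0
    for (q1, p1), (q2, p2) in combinations(charges, 2):  # each pair counted once
        U += k * q1 * q2 / math.dist(p1, p2)
    return U

# e.g. two unit charges, 1 m apart: U = k, i.e. about 8.99e9 J
print(electrostatic_energy([(1.0, (0, 0, 0)), (1.0, (1, 0, 0))]))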

3. Now, let's take a step back. We just calculated the (potential) energy of the world (U), which is great. But perhaps we should also be interested in the world's potential Φ, rather than its potential energy U. Why? Well, we'll want to know what happens when we bring yet another charge in—from outer space or so. 🙂 And so then it's easier to know the world's potential, rather than its energy, because we can calculate the field from it using the E = −∇Φ formula. So let's de- and re-construct the world once again 🙂 but now we'll look at what happens with the field and the potential.

We know our first charge created a field with a field strength we calculated as:

E = (1/4πε0)·(q/r2)

So, when bringing in our second charge, we can use our Φ(P) integral to calculate the potential:

Φ(P) = −∫ E•ds (again, from P0 to P)

[Let me make a note here, just for the record. You probably think I am being pretty childish when talking about my re-construction of the world in terms of bringing all charges out and then back in again but, believe me, there will be a lot of confusion when we’ll start talking about the energy of one charge, and that confusion can be avoided, to a large extent, when you realize that the idea (I mean the concept itself, really—not its formula) of a potential involves two charges really. Just remember: it’s the first charge that causes the field (and, of course, any charge causes a field), but calculating a potential only makes sense when we’re talking some other charge. Just make a mental note of it. You’ll be grateful to me later.]

Let's now combine the integral and the formula for E above. Because you hate integrals as much as I do, I'll spell it out: the integrand of the Φ(P) integral is q/(4πε0r2)·dr. Now, let's bring q/4πε0 out for a while so we can focus on solving ∫(1/r2)dr. Now, ∫(1/r2)dr is equal to –1/r + k, and so the whole antiderivative is –q/(4πε0r) + k. Now, we integrate from r = ∞ to r, and so the definite integral is [–q/(4πε0)]·[1/∞ − 1/r] = [–q/(4πε0)]·[0 − 1/r] = q/(4πε0r). Let me present this somewhat nicer:

Φ(r) = q/(4πε0r)
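
If you don't trust the antiderivative, you can let Python do the integral numerically. The sketch below integrates the field of a hypothetical 1 nC charge from r = 0.5 m out to a ‘practically infinite’ distance, and compares with q/(4πε0r):

import numpy as np

eps0 = 8.854187817e-12
q, r = 1e-9, 0.5   # a 1 nC charge, potential evaluated at 0.5 m (illustrative numbers)

rp = np.linspace(r, 1e4, 2_000_000)        # from r out to 10 km, our stand-in for infinity
f = q / (4 * np.pi * eps0 * rp**2)         # the integrand: E(r') = q/(4*pi*eps0*r'^2)
phi_numeric = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(rp))   # trapezoidal rule
phi_closed = q / (4 * np.pi * eps0 * r)

print(phi_numeric, phi_closed)             # both are about 17.98 V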

You’ll say: so what? Well… We’re done! The only thing we need to do now is add up the potentials of all of the charges in the world. So the formula for the potential Φ at a point which we’ll simply refer to as point 1, is:

Φ(1) = Σj qj/(4πε0r1j), with j = 2, 3, etc.

Note that our index j starts at 2, otherwise it doesn’t make sense: we’d have a division by zero for the q1/r11 term. Again, it’s an obvious remark, but not thinking about it can cause a lot of confusion down the line.

4. Now, I am very sorry but I have to inform you that we'll be talking charge densities and all that shortly, rather than discrete charges, so I have to give you the continuum version of this formula, i.e. the formula we'll use when we've got charge densities rather than individual charges. That sum above then becomes an infinite sum (i.e. an integral), and qj becomes a variable which we write as ρ(2). [That's totally in line with our index j starting at 2, rather than 1.] We get:

Φ(1) = ∫ [ρ(2)/(4πε0r12)]·dV2

Just look at this integral, and try to understand it: we're integrating over all of space – so we're integrating the whole world, really 🙂 – and the ρ(2)·dV2 product in the integral is just the charge of an infinitesimally small volume of our world. So the whole integral is just the (infinite) sum of the contributions to the potential (at point 1) of all (infinitesimally small) charges that are around indeed. Now, there's something funny here. It's just a mathematical thing: we don't need to worry about double-counting here. Why? We're not taking products of volume elements here. Just make a mental note of it because it will be different in a moment.

Now we're going to look at the continuum version of our energy formula indeed. Which energy formula? That electrostatic energy formula, which gave us the total electrostatic energy U as the sum of the energies of all possible pairs of charges:

U = Σ(all pairs) qiqj/(4πε0rij)

Its continuum version is the following monster:

U = (1/2)·∫∫ [ρ(1)·ρ(2)/(4πε0r12)]·dV1·dV2

Hmm… What kind of integral is that? We've got two variables here: dV2 and dV1. Yes. And we've also got a 1/2 factor now, because we do not want to double-count and, unfortunately, there is no convenient way of writing an integral like this that keeps track of the pairs. It's a so-called double integral, but I'll let you look up the math yourself. In any case, we can simplify this integral so you don't need to worry about it too much. How do we simplify it? Well… Just look at that integral we got for Φ(1): we calculated the potential at point 1 by integrating the ρ(2)·dV2 product over all of space, so the integral above can be written as:

U = (1/2)·∫ρ(1)·Φ(1)·dV1

But so this integral integrates the ρ(1)·Φ(1)·dV1 product over all of space, so that's over all points in space. So we can just drop the index and write the whole thing as the integral of ρ·Φ·dV over all of space:

U = (1/2)·∫ρ·Φ·dV

5. It’s time for the hat-trick now. The equation above is mathematically equivalent to the following equation:

U = (ε0/2)·∫E•E·dV

Huh? Yes. Let me make two remarks here. First on the math: the E = −∇Φ formula allows you to write the integrand of the integral above as E•E = (−∇Φ)•(−∇Φ) = (∇Φ)•(∇Φ). And then you may or may not remember that, when substituting E = −∇Φ in Maxwell's first equation (∇•E = ρ/ε0), we got the following equality: ρ = −ε0·∇•(∇Φ) = −ε0·∇2Φ, so we can write ρ·Φ as −ε0·Φ·∇2Φ. However, that still doesn't show the two integrals are the same thing. The proof is actually rather involved, and so I'll refer to that post I referred to, so you can check the proof there.

The second remark is much more fundamental. The two integrals are mathematically equivalent, but are they also physically? What do I mean with that? Well… Look at it. The second integral implies that we can look at (ε0/2)·EE = ε0E2/2 as an energy density, which we’ll denote by u, so we write:

u = (ε0/2)·E•E = ε0E2/2

Just to make sure you ‘get’ what we’re talking about here: u is the energy density in the little cube dV in the rather simplistic (and, therefore, extremely useful) illustration below (which, just like most of what I write above, I got from Feynman).

[Illustration: the energy density u in a little volume element dV]

Now the question: what is the reality of that formula? Indeed, what we did when calculating U amounted to characterizing the whole Universe with some number U – and that's kinda nice, of course! – but then what? Is u = ε0E2/2 anything real? Well… That's what this post is about. So we're finished with the introduction now. 🙂
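
As a sanity check on that energy density, here's a Python sketch that integrates u = ε0E2/2 over all of space for a uniformly charged sphere, and compares the result with the textbook closed-form answer U = (3/5)·q2/(4πε0R). The charge and radius are arbitrary illustrative values:

import numpy as np

eps0 = 8.854187817e-12
q, R = 1e-9, 0.1   # 1 nC spread uniformly through a sphere of 0.1 m radius (made-up numbers)

def E(r):   # E goes as r inside the sphere and as 1/r^2 outside
    return np.where(r <= R,
                    q * r / (4 * np.pi * eps0 * R**3),
                    q / (4 * np.pi * eps0 * r**2))

r = np.linspace(1e-6, 1e3, 2_000_000)
u = 0.5 * eps0 * E(r)**2                   # the energy density u = eps0*E^2/2
integrand = u * 4 * np.pi * r**2           # dV = 4*pi*r^2*dr
U_numeric = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))

U_closed = (3 / 5) * q**2 / (4 * np.pi * eps0 * R)
print(U_numeric, U_closed)                 # both are about 5.4e-8 J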

Energy density and energy flow in electrodynamics

Before giving you any more formulas, let me answer the question: there is no doubt, in the classical theory of electromagnetism at least, that the energy density u is something very real. It has to be, because we impose a local conservation law on energy, just like we do on charge. Charges cannot just disappear in space, to then re-appear somewhere else. The charge conservation law is written as ∇•j = −∂ρ/∂t, and that makes it clear it's a local conservation law. Therefore, charges can only disappear and re-appear through some current. We write dQ1/dt = ∫(j•n)·da = −dQ2/dt, and here's the simple illustration that comes with it:

[Illustration: charge flowing from one region to another through a current across the boundary]

So we do not allow for any ‘non-local’ interactions here! Therefore, we say that, if energy goes away from a region, it’s because it flows away through the boundaries of that region. So that’s what the Poynting formulas are all about, and so I want to be clear on that from the outset.

Now, to get going with the discussion, I need to give you the formula for the energy density in electrodynamics. Its shape won’t surprise you:

u = ε0E2/2 + ε0c2B2/2

However, it’s just like the electrostatic formula: it takes quite a bit of juggling to get this from our electrodynamic equations, so, if you want to see how it’s done, I’ll refer you to Feynman. Indeed, I feel the derivation doesn’t matter all that much, because the formula itself is very intuitive: it’s really the thing everyone knows about a wave, electromagnetic or not: the energy in it is proportional to the square of its amplitude, and so that’s E•E = E2 and B•B = B2. Now, you also know that the magnitude of B is 1/c of that of E, so cB = E, and so that explains the extra c2 factor in the second term.

The second formula is also very intuitive. Let me write it down:

∂u/∂t = −∇•S

Just look at it: u is the energy density, so that's the amount of energy per unit volume at a given point, and so whatever flows out of that point must represent its time rate of change. As for the −∇•S expression… Well… Sorry, I can't keep re-explaining things: the ∇• operator is the divergence, and so it gives us the magnitude of a (vector) field's source or sink at a given point. ∇•S is a scalar, and if it's positive in a region, then that region is a source. Conversely, if it's negative, then it's a sink. To be precise, the divergence represents the volume density of the outward flux of a vector field from an infinitesimal volume around a given point. So, in this case, it gives us the volume density of the flux of S. As you can see, the formula has exactly the same shape as ∇•j = −∂ρ/∂t.

So what is S? Well… Think about the more general formula for the flux out of some closed surface, which we get from integrating over the volume enclosed. It’s just Gauss’ Theorem:

∮ C•n·da = ∫ (∇•C)·dV (integrating over a closed surface and the volume it encloses, respectively)

Just replace C by E, and think about what it meant: the flux of E was the field strength multiplied by the surface area, so it was the total flow of E. Likewise, S represents the flow of (field) energy. Let me repeat this, because it’s an important result:

S represents the flow of field energy.

Huh? What flow? Per unit area? Per second? How do you define such ‘flow’? Good question. Let’s do a dimensional analysis:

  1. E is measured in newton per coulomb, so [E•E] = [E2] = N2/C2.
  2. B is measured in (N/C)/(m/s). [Huh? Well… Yes. I explained that a couple of times already. Just check it in my introduction to electric circuits.] So we get [B•B] = [B2] = (N2/C2)·(s2/m2), but the dimension of our c2 factor is m2/s2, so we're left with N2/C2. That's nice, because we need to add the two terms in the same units.
  3. Now we need to look at ε0. That constant usually ‘fixes’ our units, but can we trust it to do the same now? Let's see… One of the many ways in which we can express its dimension is [ε0] = C2/(N·m2), so if we multiply that with N2/C2, we find that u is expressed in N/m2. Wow! That's kinda neat. Why? Well… Just multiply with m/m and its dimension becomes N·m/m3 = J/m3, so that's joule per cubic meter, so… Yes: u has got the right unit for something that's supposed to measure energy density!
  4. OK. Now, we take the time rate of change of u, and so both the right- and left-hand side of our ∂u/∂t = −∇•S formula are expressed in (J/m3)/s, which means that the dimension of S itself must be J/(m2·s). Just check it by writing it all out: ∇•S = ∂Sx/∂x + ∂Sy/∂y + ∂Sz/∂z, and so that's something per meter so, to get the dimension of S itself, we need to go from cubic meter to square meter. Done! Let me highlight the grand result:

S is the energy flow per unit area and per second.

Now we’ve got its magnitude and its dimension, but what is its direction? Indeed, we’ve been writing S as a vector, but… Well… What’s its direction indeed?

Well… Hmm… I referred you to Feynman for the derivation of that u = ε0E2/2 + ε0c2B2/2 formula for u, and so the direction of S – I should actually say, its complete definition – comes out of that derivation as well. So… Well… I think you should just believe what I'll be writing here for S:

S = ε0c2·E×B

So it's the vector cross product of E and B, with ε0c2 thrown in. It's a simple formula really and, because I didn't drag you through the whole argument, you should just quickly do a dimensional analysis again—just to make sure I am not talking too much nonsense. 🙂 So what's the direction? Well… You just need to apply the usual right-hand rule:

[Illustration: the right-hand rule giving the direction of E×B]
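
Here's what that cross product looks like numerically for a light wave, in a few lines of Python – with E along x, B along y (with magnitude E/c), and an arbitrary field strength of 100 V/m:

import numpy as np

eps0, c = 8.854187817e-12, 299_792_458.0
E0 = 100.0                        # V/m: an arbitrary field strength

E = np.array([E0, 0.0, 0.0])      # E along x
B = np.array([0.0, E0 / c, 0.0])  # B along y, with magnitude E0/c, as for a light wave

S = eps0 * c**2 * np.cross(E, B)  # the Poynting vector
print(S)                          # points along z: the direction of propagation
print(eps0 * c * E0**2)           # its magnitude: eps0*c*E0^2, about 26.5 W/m^2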

OK. We’re done! This S vector, which – let me repeat it – represents the energy flow per unit area and per second, is what is referred to as Poynting’s vector, and it’s a most remarkable thing, as I’ll show now. Let’s think about the implications of this thing.

Poynting’s vector in electrodynamics

The S vector is actually quite similar to the heat flow vector h, which we presented when discussing vector analysis and vector operators. The heat flow out of a surface element da is the area times the component of h perpendicular to da, so that's (h•n)·da = hn·da. Likewise, we can write (S•n)·da = Sn·da. The units of S and h are also the same: joule per second and per square meter or, using the definition of the watt (1 W = 1 J/s), watt per square meter. In fact, if you google a bit, you'll find that both h and S are referred to as a flux density:

  1. The heat flow vector h is the heat flux density vector, from which we get the heat flux through an area through the (h•n)·da = hn·da product.
  2. The energy flow vector S is the energy flux density vector, from which we get the energy flux through the (S•n)·da = Sn·da product.

The big difference, of course, is that we get h from a simpler vector equation:

h = −κ∇T ⇔ (hx, hy, hz) = −κ·(∂T/∂x, ∂T/∂y, ∂T/∂z)

The vector equation for S is more complicated:

S = ε0c2·E×B

So it's a vector cross product. Note that S will be zero if E = 0 and/or if B = 0. So S = 0 in electrostatics: with static charges only, we have no currents and, hence, no magnetic field. Let's examine Feynman's examples.

The illustration below shows the geometry of the E, B and S vectors for a light wave. It’s neat, and totally in line with what we wrote on the radiation pressure, or the momentum of light. So I’ll refer you to that post for an explanation, and to Feynman himself, of course.

[Illustration: the geometry of the E, B and S vectors for a light wave]

OK. The situation here is rather simple. Feynman gives a few others examples that are not so simple, like that of a charging capacitor, which is depicted below.

[Illustration: a charging capacitor, with the Poynting vector S pointing inwards, toward the axis]

The Poynting vector points inwards here, toward the axis. What does it mean? It means the energy isn’t actually coming down the wires, but from the space surrounding the capacitor. 

What? I know. It’s completely counter-intuitive, at first that is. You’d think it’s the charges. But it actually makes sense. The illustration below shows how we should think of it. The charges outside of the capacitor are associated with a weak, enormously spread-out field that surrounds the capacitor. So if we bring them to the capacitor, that field gets weaker, and the field between the plates gets stronger. So the field energy which is way out moves into the space between the capacitor plates indeed, and so that’s what Poynting’s vector tells us here.

[Illustration: the field energy moving in from the space surrounding the capacitor as it charges up]

Hmm… Yes. You can be skeptical. You should be. But that's how it works. The next illustration looks at a current-carrying wire itself. Let's first look at the B and E vectors. You're familiar with the magnetic field around a wire, so the B vector makes sense, but what about the electric field? Aren't wires supposed to be electrically neutral? It's a tricky question, and we handled it in our post on the relativity of fields. The positive and negative charges in a wire should cancel out, indeed, but then it's the negative charges that move and, because of their movement, we have the relativistic effect of length contraction, so the volumes are different, and the positive and negative charge densities do not cancel out: the wire appears to be charged, so we do have a mix of E and B! Let me quickly give you the formula: E = (1/2πε0)·(λ/r), with λ the (apparent) charge per unit length, so it's the same formula as for a long line of charge, or for a long uniformly charged cylinder.

So we have a non-zero E and B and, hence, a non-zero Poynting vector S, whose direction is radially inward, so there is a flow of energy into the wire, all around. What the hell? Where does it go? Well… There are a few possibilities here: the charges need kinetic energy to move, or they increase their potential energy when moving towards the terminals of our capacitor to increase the charge on the plates or, much more mundane, the energy may be radiated out again in the form of heat. It looks crazy, but that's how it really is. In fact, the more you think about it, the more logical it all starts to sound. Energy must be conserved locally, and so it's just field energy going in and re-appearing in some other form. So it does make sense. But, yes, it's weird, because no one bothered to teach us this in school. 🙂

[Illustration: a current-carrying wire, with the E and B vectors and the radially inward Poynting vector S]
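
You can actually check, with a few lines of Python, that this inward flow delivers exactly the dissipated power V·I for a simple resistive wire. The wire's dimensions, current and voltage below are made-up numbers, of course, and I use S = E×B/μ0, which is the same thing as ε0c2·E×B:

import numpy as np

mu0 = 4e-7 * np.pi

L, a, I, V = 1.0, 1e-3, 2.0, 0.5   # hypothetical wire: 1 m long, 1 mm radius, 2 A, 0.5 V

E_t = V / L                        # tangential E at the surface, along the wire
B_s = mu0 * I / (2 * np.pi * a)    # B at the surface, circling the wire

S_in = E_t * B_s / mu0             # |S| = |E x B|/mu0, pointing radially inward
P_in = S_in * 2 * np.pi * a * L    # total flux through the wire's surface

print(P_in, V * I)                 # both are 1.0 W: the field delivers exactly V*I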

The ‘craziest’ example is the one below: we’ve got a charge and a magnet here. All is at rest. Nothing is moving… Well… I’ll correct that in a moment. 🙂 The charge (q) causes a (static) Coulomb field, while our magnet produces the usual magnetic field, whose shape we (should) recognize: it’s the usual dipole field. So E and B are not changing. But so when we calculate our Poynting vector, we see there is a circulation of S. The E×B product is not zero. So what’s going on here?

[Illustration: a charge next to a permanent magnet, with the energy flow S circulating around]

Well… There is no net change in energy with time: the energy just circulates around and around. Everything which flows into one volume flows out again. As Feynman puts it: “It is like incompressible water flowing around.” What’s the explanation? Well… Let me copy Feynman’s explanation of this ‘craziness’:

“Perhaps it isn’t so terribly puzzling, though, when you remember that what we called a “static” magnet is really a circulating permanent current. In a permanent magnet the electrons are spinning permanently inside. So maybe a circulation of the energy outside isn’t so queer after all.”

So… Well… It looks like we do need to revise some of our ‘intuitions’ here. I’ll conclude this post by quoting Feynman on it once more:

“You no doubt get the impression that the Poynting theory at least partially violates your intuition as to where energy is located in an electromagnetic field. You might believe that you must revamp all your intuitions, and, therefore have a lot of things to study here. But it seems really not necessary. You don’t need to feel that you will be in great trouble if you forget once in a while that the energy in a wire is flowing into the wire from the outside, rather than along the wire. It seems to be only rarely of value, when using the idea of energy conservation, to notice in detail what path the energy is taking. The circulation of energy around a magnet and a charge seems, in most circumstances, to be quite unimportant. It is not a vital detail, but it is clear that our ordinary intuitions are quite wrong.”

Well… That says it all, I guess. As far as I am concerned, I feel the Poynting vector makes things actually easier to understand. Indeed, the E and B vectors were quite confusing, because we had two of them, and the magnetic field is, frankly, a weird thing. Just think about the units in which we're measuring B: (N/C)/(m/s). I can't imagine what a unit like that could possibly represent, so I must assume you can't either. But so now we've got this Poynting vector that combines both E and B, and which represents the flow of the field energy. Frankly, I think that makes a lot of sense, and it's surely much easier to visualize than E and/or B. [Having said that, of course, you should note that E and B do have their value, obviously, if only because they represent the lines of force, and so that's something very physical too, of course. I guess it's a matter of taste, to some extent, but so I'd tend to soften Feynman's comments on the supposed ‘craziness’ of S.]

In any case… The next thing I should discuss is field momentum. Indeed, if we've got flow, we've got momentum. But I'll leave that for my next post. This topic can't be exhausted in one post only, indeed. 🙂 So let me conclude this post. I'll do so with a very nice illustration I got from the Wikipedia article on the Poynting vector. It shows the Poynting vector around a voltage source and a resistor, as well as what's going on in-between. [Note that the magnetic field is given by the field vector H, which is related to B as follows: B = μ0(H + M), with M the magnetization of the medium. B and H are obviously just proportional in empty space, with μ0 as the proportionality constant.]

[Illustration from Wikipedia: the Poynting vector field around a DC circuit with a voltage source and a resistor]
