The de Broglie relations, the wave equation, and relativistic length contraction

You know the two de Broglie relations, also known as matter-wave equations:

f = E/h and λ = h/p

You’ll find them in almost any popular account of quantum mechanics, and the writers of those popular books will tell you that f is the frequency of the ‘matter-wave’, and λ is its wavelength. In fact, to add some more weight to their narrative, they’ll usually write them in a somewhat more sophisticated form: they’ll write them using ω and k. The omega symbol (using a Greek letter always makes a big impression, doesn’t it?) denotes the angular frequency, while k is the so-called wavenumber. Now, k = 2π/λ and ω = 2π·f and, therefore, using the definition of the reduced Planck constant, i.e. ħ = h/2π, they’ll write the same relations as:

1. λ = h/p = 2π/k ⇔ k = 2π·p/h
2. f = E/h = ω/2π ⇔ ω = 2π·E/h

⇒ k = p/ħ and ω = E/ħ
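A quick numerical check of the conversions above, as a minimal sketch. The function name `de_broglie` and the input values (an electron at 1% of the speed of light, with the kinetic energy used as E) are my own assumptions for the sake of the example, not something these relations prescribe.

```python
import math

H = 6.62607015e-34          # Planck constant (J·s)
HBAR = H / (2 * math.pi)    # reduced Planck constant

def de_broglie(E, p):
    """Return (f, lam, omega, k) for an energy E (J) and a momentum p (kg·m/s)."""
    f = E / H          # f = E/h
    lam = H / p        # λ = h/p
    omega = E / HBAR   # ω = E/ħ
    k = p / HBAR       # k = p/ħ
    return f, lam, omega, k

# Assumed, illustrative inputs: an electron moving at 1% of c
m_e = 9.1093837015e-31
v = 0.01 * 299792458.0
p = m_e * v
E = 0.5 * m_e * v**2   # kinetic energy only: an assumption made for this example

f, lam, omega, k = de_broglie(E, p)
print(math.isclose(omega, 2 * math.pi * f))   # ω = 2π·f
print(math.isclose(k, 2 * math.pi / lam))     # k = 2π/λ
```

Both checks only confirm that the (f, λ) and (ω, k) forms carry the same information; they say nothing yet about how E and p relate.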

They’re the same thing: it’s just that working with angular frequencies and wavenumbers is more convenient, from a mathematical point of view that is: it’s why we prefer expressing angles in radians rather than in degrees (k is expressed in radians per meter, while ω is expressed in radians per second). In any case, the ‘matter wave’ – even Wikipedia uses that term now – is, of course, the amplitude, i.e. the wave-function ψ(x, t), which has a frequency and a wavelength, indeed. In fact, as I’ll show in a moment, it’s got two frequencies: one temporal, and one spatial. I am modest and, hence, I’ll admit it took me quite a while to fully distinguish the two frequencies, and so that’s why I always had trouble connecting these two ‘matter wave’ equations.

Indeed, if they represent the same thing, they must be related, right? But how exactly? It should be easy enough. The wavelength and the frequency must be related through the wave velocity, so we can write: f·λ = v, with v the velocity of the wave, which must be equal to the classical particle velocity, right? And then momentum and energy are also related. To be precise, we have the relativistic energy-momentum relationship: p·c = m_v·v·c = m_v·c²·v/c = E·v/c. So it’s just a matter of substitution. We should be able to go from one equation to the other, and vice versa. Right?

Well… No. It’s not that simple. We can start with either of the two equations but it doesn’t work. Try it. Whatever substitution you try, there’s no way you can derive one of the two equations above from the other. The fact that it’s impossible is evidenced by what we get when we’d multiply both equations. We get:

1. f·λ = (E/h)·(h/p) = E/p
2. v = f·λ  ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v)

⇒ E = m·v²

Huh? What kind of formula is that? E = m·v²? That’s a formula you’ve never ever seen, have you? It reminds you of the kinetic energy formula of course (K.E. = m·v²/2) but… That factor 1/2 should not be there. Let’s think about it for a while. First note that this E = m·v² relation makes perfect sense if v = c. In that case, we get Einstein’s mass-energy equivalence (E = m·c²), but that’s beside the point here. The point is: if v = c, then our ‘particle’ is a photon, really, and then the E = h·f is referred to as the Planck-Einstein relation. The wave velocity is then equal to c and, therefore, f·λ = c, and so we can effectively substitute to find what we’re looking for:

E/p = (h·f)/(h/λ) = f·λ = c ⇒ E = p·c

So that’s fine: we just showed that the de Broglie relations are correct for photons. [You remember that E = p·c relation, no? If not, check out my post on it.] However, while that’s all nice, it is not what the de Broglie equations are about: we’re talking the matter-wave here, and so we want to do something more than just re-confirm that Planck-Einstein relation, which you can interpret as the limit of the de Broglie relations for v = c. In short, we’re doing something wrong here! Of course, we are. I’ll tell you what exactly in a moment: it’s got to do with the fact we’ve got two frequencies really.

Let’s first try something else. We’ve been using the relativistic E = m_v·c² equation above. Let’s try some other energy concept: let’s substitute the E in the f = E/h by the kinetic energy and then see where we get, if anywhere at all. So we’ll use the E_kinetic = m∙v²/2 equation. We can then use the definition of momentum (p = m∙v) to write E = p²/(2m), and then we can relate the frequency f to the wavelength λ using the v = λ∙f formula once again. That should work, no? Let’s do it. We write:

1. E = p²/(2m)
2. E = h∙f = h·v/λ

⇒ λ = h·v/E = h·v/(p²/(2m)) = h·v/[m²·v²/(2m)] = h/[m·v/2] = 2∙h/p

So we find λ = 2∙h/p. That is almost right, but not quite: that factor 2 should not be there. Well… Of course you’re smart enough to see it’s just that factor 1/2 popping up once more, but as a reciprocal this time around. 🙂 So what’s going on? The honest answer is: you can try anything but it will never work, because the f = E/h and λ = h/p equations cannot be related, or at least not so easily. The substitutions above only work if we use that E = m·v² energy concept which, you’ll agree, doesn’t make much sense, at first at least. Again: what’s going on? Well… Same honest answer: the f = E/h and λ = h/p equations cannot be related, or at least not so easily, because the wave equation itself is not so easy.
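To see the two failed substitutions in numbers, here is a small sketch. It computes f·λ = E/p for the two energy concepts used above and compares the result with the particle velocity v. The function names and the sample velocity (10% of c) are assumptions I made up for the illustration.

```python
import math

C = 299792458.0  # speed of light (m/s)

def f_lambda_relativistic(m0, v):
    """f·λ = E/p with E = m_v·c² and p = m_v·v (m_v is the relativistic mass)."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    E = gamma * m0 * C**2
    p = gamma * m0 * v
    return E / p

def f_lambda_kinetic(m0, v):
    """f·λ = E/p with E = m·v²/2 and p = m·v (non-relativistic)."""
    return (0.5 * m0 * v**2) / (m0 * v)

m0 = 9.1093837015e-31   # electron rest mass (kg), an illustrative choice
v = 0.1 * C             # assumed particle velocity

# Ratio of f·λ to the particle velocity for each energy concept
print(f_lambda_relativistic(m0, v) / v)
print(f_lambda_kinetic(m0, v) / v)
```

Neither ratio comes out as one: the relativistic energy gives c²/v² (which is one only for v = c), and the kinetic energy gives that factor 1/2 again.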

Let’s review the basics once again.

The wavefunction

The amplitude of a particle is represented by a wavefunction. If we have no information whatsoever on its position, then we usually write that wavefunction as the following complex-valued exponential:

ψ(x, t) = a·e^(−i·[(E/ħ)·t − (p/ħ)∙x]) = a·e^(−i·(ω·t − k∙x)) = a·e^(i(k∙x−ω·t)) = a·e^(iθ) = a·(cosθ + i·sinθ)

θ is the so-called phase of our wavefunction and, as you can see, it’s the argument of a wavefunction indeed, with temporal frequency ω and spatial frequency k (if we choose our x-axis so its direction is the same as the direction of k, then we can substitute the k and x vectors for the k and x scalars, so that’s what we’re doing here). Now, we know we shouldn’t worry too much about a, because that’s just some normalization constant (remember: all probabilities have to add up to one). However, let’s quickly develop some logic here. Taking the absolute square of this wavefunction gives us the probability of our particle being somewhere in space at some point in time. So we get the probability as a function of x and t. We write:

P(x, t) = |a·e^(−i·[(E/ħ)·t − (p/ħ)∙x])|² = a²

As all probabilities have to add up to one, we must assume we’re looking at some box in spacetime here. So, if the length of our box is Δx = x₂ − x₁, then (Δx)·a² = (x₂ − x₁)·a² = 1 ⇔ Δx = 1/a². [We obviously simplify the analysis by assuming a one-dimensional space only here, but the gist of the argument is essentially correct.] So, freezing time (i.e. equating t to some point t = t₀), we get a uniform probability density function: P(x) = a² for x between x₁ and x₂, and zero elsewhere. That’s simple enough.

The point is: the two de Broglie equations f = E/h and λ = h/p give us the temporal and spatial frequencies in that ψ(x, t) = a·e^(−i·[(E/ħ)·t − (p/ħ)∙x]) relation. As you can see, that’s an equation that implies a much more complicated relationship between E/ħ = ω and p/ħ = k. Or… Well… Much more complicated than what one would think of at first.
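The box normalization is easy to verify numerically. The sketch below is only an illustration: it uses natural units (ħ = 1), and the amplitude, energy and momentum values are arbitrary assumptions. It checks that |ψ|² equals a² wherever we look, and that the total probability over a box of length Δx = 1/a² is one.

```python
import cmath

def psi(x, t, a, E, p, hbar=1.0):
    """Plane-wave amplitude a·exp(−i·[(E/ħ)·t − (p/ħ)·x])."""
    return a * cmath.exp(-1j * ((E / hbar) * t - (p / hbar) * x))

def total_probability(a, x1, n=1000, t0=0.0, E=2.0, p=3.0):
    """Midpoint Riemann sum of |ψ|² over a box of length Δx = 1/a² starting at x1."""
    box_length = 1.0 / a**2
    step = box_length / n
    return sum(abs(psi(x1 + (i + 0.5) * step, t0, a, E, p)) ** 2 * step
               for i in range(n))

a = 0.5   # assumed amplitude, so the box length is 1/a² = 4

# The density is the same constant at any point and any time
print(abs(psi(1.7, 0.0, a, 2.0, 3.0)) ** 2)
print(abs(psi(-0.4, 5.0, a, 2.0, 3.0)) ** 2)

# The probabilities over the box add up to one
print(total_probability(a, x1=0.0))
```

Note that E and p drop out of the probability entirely: they only set the temporal and spatial frequency of the phase.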

To appreciate what’s being represented here, it’s good to play a bit. We’ll continue with our simple exponential above, which also illustrates how we usually analyze those wavefunctions: we either assume we’re looking at the wavefunction in space at some fixed point in time (t = t₀) or, else, at how the wavefunction changes in time at some fixed point in space (x = x₀). Of course, we know that Einstein told us we shouldn’t do that: space and time are related and, hence, we should try to think of spacetime, i.e. some ‘kind of union’ of space and time, as Minkowski famously put it. However, when everything is said and done, mere mortals like us are not so good at that, and so we’re sort of condemned to try to imagine things using the classical cut-up of things. 🙂 So we’ll just use an online graphing tool to play with that a·e^(i(k∙x−ω·t)) = a·e^(iθ) = a·(cosθ + i·sinθ) formula.

Compare the following two graphs, for example. Just imagine we either look at how the wavefunction behaves in space, with the time fixed at some point t = t₀, or, alternatively, that we look at how the wavefunction behaves in time at some point in space x = x₀. As you can see, increasing k = p/ħ or increasing ω = E/ħ gives the wavefunction a higher ‘density’ in space or, alternatively, in time.

[Two graphs: the cosine and sine components of the wavefunction, for a lower and a higher value of k or ω]

That makes sense, intuitively. In fact, when thinking about how the energy, or the momentum, affects the shape of the wavefunction, I am reminded of an airplane propeller: as it spins, faster and faster, it gives the propeller some ‘density’, in space as well as in time, as its blades cover more space in less time. It’s an interesting analogy: it helps (me, at least) to think through what that wavefunction might actually represent.

So as to stimulate your imagination even more, you should also think of representing the real and imaginary part of that ψ = a·e^(i(k∙x−ω·t)) = a·e^(iθ) = a·(cosθ + i·sinθ) formula in a different way. In the graphs above, we just showed the sine and cosine in the same plane but, as you know, the real and the imaginary axis are orthogonal, so Euler’s formula a·e^(iθ) = a·(cosθ + i·sinθ) = Re(ψ) + i·Im(ψ) may also be graphed as follows:

[Graph: the real and imaginary parts of the wavefunction drawn along orthogonal axes]

The illustration above should make you think of yet another illustration you’ve probably seen like a hundred times before: the electromagnetic wave, propagating through space as the magnetic and electric field induce each other, as illustrated below.

[Illustration: an electromagnetic wave with its electric (E) and magnetic (B) field components]

However, there’s a big difference: Euler’s formula incorporates a phase shift (remember: sinθ = cos(θ − π/2)) and you don’t have that in the graph below. The difference is much more fundamental, however: it’s really hard to see how one could possibly relate the magnetic and electric field to the real and imaginary part of the wavefunction respectively. Having said that, the mathematical similarity makes one think!

Of course, you should remind yourself of what E and B stand for: they represent the strength of the electric (E) and magnetic (B) field at some point x at some time t. So you shouldn’t think of those wavefunctions above as occupying some three-dimensional space. They don’t. Likewise, our wavefunction ψ(x, t) does not occupy some physical space: it’s some complex number, an amplitude that’s associated with each and every point in spacetime. Nevertheless, as mentioned above, the visuals make one think and, as such, do help us as we try to understand all of this in a more intuitive way.

Let’s now look at that energy-momentum relationship once again, but using the wavefunction, rather than those two de Broglie relations.

Energy and momentum in the wavefunction

I am not going to talk about uncertainty here. You know that Spiel. If there’s uncertainty, it’s in the energy or the momentum, or in both. The uncertainty determines the size of that ‘box’ (in spacetime) in which we hope to find our particle, and it’s modeled by a splitting of the energy levels. We’ll say the energy of the particle may be E₀, but it might also be some other value, which we’ll write as E_n = E₀ ± n·ħ. The thing to note is that energy levels will always be separated by some integer multiple of ħ, so ħ is, effectively, the quantum of energy for all practical (and theoretical) purposes. We then super-impose the various wave equations to get a wave function that might, or might not, resemble something like this:

[Illustration of the resulting wavefunction]

Who knows? 🙂 In any case, that’s not what I want to talk about here. Let’s repeat the basics once more: if we write our wavefunction a·e^(−i·[(E/ħ)·t − (p/ħ)∙x]) as a·e^(−i·[ω·t − k∙x]), we refer to ω = E/ħ as the temporal frequency, i.e. the frequency of our wavefunction in time (i.e. the frequency it has if we keep the position fixed), and to k = p/ħ as the spatial frequency, i.e. the frequency of our wavefunction in space (so now we stop the clock and just look at the wave in space).

Now, let’s think about the energy concept first. The energy of a particle is generally thought of as consisting of three parts:

1. The particle’s rest energy m₀c², which de Broglie referred to as internal energy (E_int): it includes the rest mass of the ‘internal pieces’, as Feynman puts it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy);
2. Any potential energy it may have because of some field (so de Broglie was not assuming the particle was traveling in free space), which we’ll denote by U, and note that the field can be anything (gravitational, electromagnetic): it’s whatever changes the energy because of the position of the particle;
3. The particle’s kinetic energy, which we write in terms of its momentum p: m·v²/2 = m²·v²/(2m) = (m·v)²/(2m) = p²/(2m).

So we have one energy concept here (the rest energy) that does not depend on the particle’s position in spacetime, and two energy concepts that do depend on position (potential energy) and/or how that position changes because of its velocity and/or momentum (kinetic energy). The two last bits are related through the energy conservation principle. The total energy is E = m_v·c², of course, with the little subscript (v) ensuring the mass incorporates the equivalent mass of the particle’s kinetic energy.

So what? Well… In my post on quantum tunneling, I drew attention to the fact that different potentials, so different potential energies (indeed, as our particle travels from one region to another, the field is likely to vary), have no impact on the temporal frequency. Let me re-visit the argument, because it’s an important one. Imagine two different regions in space that differ in potential, because the field has a larger or smaller magnitude there, or points in a different direction, or whatever: just different fields, which correspond to different values for U₁ and U₂, i.e. the potential in region 1 versus region 2. Now, the different potential will change the momentum: the particle will accelerate or decelerate as it moves from one region to the other, so we also have a different p₁ and p₂. Having said that, the internal energy doesn’t change, so we can write the corresponding amplitudes, or wavefunctions, as:

1. ψ₁(θ₁) = Ψ₁(x, t) = a·e^(−iθ₁) = a·e^(−i[(E_int + p₁²/(2m) + U₁)·t − p₁∙x]/ħ)
2. ψ₂(θ₂) = Ψ₂(x, t) = a·e^(−iθ₂) = a·e^(−i[(E_int + p₂²/(2m) + U₂)·t − p₂∙x]/ħ)

Now how should we think about these two equations? We are definitely talking different wavefunctions. However, their temporal frequencies ω₁ = [E_int + p₁²/(2m) + U₁]/ħ and ω₂ = [E_int + p₂²/(2m) + U₂]/ħ must be the same. Why? Because of the energy conservation principle, or its equivalent in quantum mechanics, I should say: the temporal frequency f or ω, i.e. the time-rate of change of the phase of the wavefunction, does not change: all of the change in potential, and the corresponding change in kinetic energy, goes into changing the spatial frequency, i.e. the wave number k or the wavelength λ, as potential energy becomes kinetic or vice versa. The sum of the potential and kinetic energy doesn’t change, indeed. So the energy remains the same and, therefore, the temporal frequency does not change. In fact, we need this quantum-mechanical equivalent of the energy conservation principle to calculate how the momentum and, hence, the spatial frequency of our wavefunction, changes. We do so by boldly equating ω₁ = [E_int + p₁²/(2m) + U₁]/ħ and ω₂ = [E_int + p₂²/(2m) + U₂]/ħ, and so we write:

ω₁ = ω₂ ⇔ E_int + p₁²/(2m) + U₁ = E_int + p₂²/(2m) + U₂

⇔ p₁²/(2m) − p₂²/(2m) = U₂ − U₁ ⇔ p₂² = (2m)·[p₁²/(2m) − (U₂ − U₁)]

⇔ p₂ = (p₁² − 2m·ΔU)^(1/2)

We played with this in a previous post, assuming that p₁² is larger than 2m·ΔU, so as to get a positive number on the right-hand side of the equation for p₂², so then we can confidently take the positive square root of that (p₁² − 2m·ΔU) expression to calculate p₂. For example, when the potential difference ΔU = U₂ − U₁ was negative, so ΔU < 0, then we’re safe and sure to get some real positive value for p₂.

Having said that, we also contemplated the possibility that p₂² = p₁² − 2m·ΔU was negative, in which case p₂ has to be some pure imaginary number, which we wrote as p₂ = i·p’ (so p’ (read: p prime) is a real positive number here). We could work with that: it resulted in an exponentially decreasing factor e^(−p’·x/ħ) that ended up ‘killing’ the wavefunction in space. However, its limited existence still allowed particles to ‘tunnel’ through potential energy barriers, thereby explaining the quantum-mechanical tunneling phenomenon.
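The two cases above (a real p₂ and a pure imaginary p₂) can be sketched in a few lines. This is just an illustration under assumptions of my own: natural units (ħ = 1), made-up values for p₁, m and ΔU, and hypothetical function names. The complex square root takes care of both cases at once.

```python
import cmath
import math

def momentum_in_region_2(p1, m, delta_U):
    """p₂ = (p₁² − 2m·ΔU)^(1/2); pure imaginary when p₁² < 2m·ΔU."""
    return cmath.sqrt(p1**2 - 2.0 * m * delta_U)

def damping_factor(p2, x, hbar=1.0):
    """exp(−p'·x/ħ) for an imaginary momentum p₂ = i·p'."""
    p_prime = p2.imag
    return math.exp(-p_prime * x / hbar)

# Assumed, illustrative numbers in natural units
p2_allowed = momentum_in_region_2(p1=3.0, m=1.0, delta_U=-8.0)   # ΔU < 0: real p₂
p2_barrier = momentum_in_region_2(p1=1.0, m=1.0, delta_U=1.0)    # p₁² < 2m·ΔU: imaginary p₂

print(p2_allowed, p2_barrier)
print(damping_factor(p2_barrier, x=2.0))   # how much of the amplitude survives at x = 2
```

In the second case the temporal frequency is still the same as in region 1: only the spatial part of the wavefunction changes, from an oscillation into a decay.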

This is rather weird—at first, at least. Indeed, one would think that, because of the E/ħ = ω equation, any change in energy would lead to some change in ω. But no! The total energy doesn’t change, and the potential and kinetic energy are like communicating vessels: any change in potential energy is associated with a change in p, and vice versa. It’s a really funny thing. It helps to think it’s because the potential depends on position only, and so it should not have an impact on the temporal frequency of our wavefunction. Of course, it’s equally obvious that the story would change drastically if the potential would change with time, but… Well… We’re not looking at that right now. In short, we’re assuming energy is being conserved in our quantum-mechanical system too, and so that implies what’s described above: no change in ω, but we obviously do have changes in p whenever our particle goes from one region in space to another, and the potentials differ. So… Well… Just remember: the energy conservation principle implies that the temporal frequency of our wave function doesn’t change. Any change in potential, as our particle travels from one place to another, plays out through the momentum.

Now that we know that, let’s look at those de Broglie relations once again.

Re-visiting the de Broglie relations

As mentioned above, we usually think in one dimension only: we either freeze time or, else, we freeze space. If we do that, we can derive some funny new relationships. Let’s first simplify the analysis by re-writing the argument of the wavefunction as:

θ = E·t − p·x

Of course, you’ll say: the argument of the wavefunction is not equal to E·t − p·x: it’s (E/ħ)·t − (p/ħ)∙x. Moreover, θ should have a minus sign in front. Well… Yes, you’re right. We should put that 1/ħ factor in front, but we can change units, and so let’s just measure both E as well as p in units of ħ here. We can do that. No worries. And, yes, the minus sign should be there (Nature chose a clockwise direction for θ) but that doesn’t matter for the analysis hereunder.

The E·t − p·x expression reminds one of those invariant quantities in relativity theory. But let’s be precise here. We’re thinking about those so-called four-vectors here, which we wrote as p_μ = (E, p_x, p_y, p_z) = (E, p) and x_μ = (t, x, y, z) = (t, x) respectively. [Well… OK… You’re right. We wrote those four-vectors as p_μ = (E, p_x·c, p_y·c, p_z·c) = (E, p·c) and x_μ = (c·t, x, y, z) = (c·t, x). So what we write is true only if we measure time and distance in equivalent units so we have c = 1. So… Well… Let’s do that and move on.] In any case, what was invariant was not E·t − p·x·c or c·t − x (that’s a nonsensical expression anyway: you cannot subtract a vector from a scalar), but p_μ² = p_μp_μ = E² − (p·c)² = E² − p²·c² = E² − (p_x² + p_y² + p_z²)·c² and x_μ² = x_μx_μ = (c·t)² − x² = c²·t² − (x² + y² + z²) respectively. [Remember p_μp_μ and x_μx_μ are four-vector dot products, so they have that +−−− signature, unlike the p² and x² or a·b dot products, which are just a simple sum of the squared components.] So… Well… E·t − p·x is neither of those two quantities (it is the mixed product p_μx_μ, which is invariant too, but that does not tell us how E and p relate). Let’s try something else.

Let’s re-simplify by equating ħ as well as c to one again, so we write: ħ = c = 1. [You may wonder if it is possible to ‘normalize’ both physical constants simultaneously, and the answer is yes. The Planck unit system is an example.] Then our relativistic energy-momentum relationship can be re-written as E/p = 1/v. [If c would not be one, we’d write: E·β = p·c, with β = v/c. So we get E/p = c/β. We referred to β as the relative velocity of our particle: it was the velocity, but measured as a ratio of the speed of light. So here it’s the same, except that we use the velocity symbol v now for that ratio.]

Now think of a particle moving in free space, i.e. without any fields acting on it, so we don’t have any potential changing the spatial frequency of the wavefunction of our particle, and let’s also assume we choose our x-axis such that it’s the direction of travel, so the position vector (x) can be replaced by a simple scalar (x). Finally, we will also choose the origin of our x-axis such that x = 0 when t = 0, so we write: x(t = 0) = 0. It’s obvious then that, if our particle is traveling in spacetime with some velocity v, then the ratio of its position x and the time t that it’s been traveling will always be equal to v = x/t. Hence, for that very special position in spacetime (t, x = v·t), so we’re talking the actual position of the particle in spacetime here, we get: θ = E·t − p·x = E·t − p·v·t = E·t − m·v·v·t = (E − m∙v²)·t. So… Well… There we have the m∙v² factor.

The question is: what does it mean? How do we interpret this? I am not sure. When I first jotted this thing down, I thought of choosing a different reference potential: some negative value such that it ensures that the sum of kinetic, rest and potential energy is zero, so I could write E = 0 and then the wavefunction would reduce to ψ(t) = e^(−i·m∙v²·t). Feynman refers to that as ‘choosing the zero of our energy scale such that E = 0’, and you’ll find this in many other works too. However, it’s not that simple. Free space is free space: if there’s no change in potential from one region to another, then the concept of some reference point for the potential becomes meaningless. There is only rest energy and kinetic energy, then. The total energy reduces to E = m (because we chose our units such that c = 1 and, therefore, E = m·c² = m·1² = m) and so our wavefunction reduces to:

ψ(t) = a·e^(i·m·(1 − v²)·t)

We can’t reduce this any further. The mass is the mass: it’s a measure for inertia, as measured in our inertial frame of reference. And the velocity is the velocity, of course—also as measured in our frame of reference. We can re-write it, of course, by substituting t for t = x/v, so we get:

ψ(x) = a·e^(i·m·(1/v − v)·x)

For both functions, we get constant probabilities, but a wavefunction that’s ‘denser’ for higher values of m. The (1 − v²) and (1/v − v) factors are different, however: these factors become smaller for higher v, so our wavefunction becomes less dense for higher v. In fact, for v = 1 (so for travel at the speed of light, i.e. for photons), we get that ψ(t) = ψ(x) = e⁰ = 1. [You should use the graphing tool once more, and you’ll see the imaginary part, i.e. the sine of the (cosθ + i·sinθ) expression, just vanishes, as sinθ = 0 for θ = 0.]

The wavefunction and relativistic length contraction

Are exercises like this useful? As mentioned above, these constant probability wavefunctions are a bit nonsensical, so you may wonder why I wrote what I wrote. There may be no real conclusion, indeed: I was just fiddling around a bit, and playing with equations and functions. I feel stuff like this helps me to understand what that wavefunction actually is somewhat better. If anything, it does illustrate that idea of the ‘density’ of a wavefunction, in space or in time. What we’ve been doing by substituting x for x = v·t or t for t = x/v is showing how, when everything is said and done, the mass and the velocity of a particle are the actual variables determining that ‘density’ and, frankly, I really like that ‘airplane propeller’ idea as a pedagogic device. In fact, I feel it may be more than just a pedagogic device, and so I’ll surely re-visit it—once I’ve gone through the rest of Feynman’s Lectures, that is. 🙂

That brings me to what I added in the title of this post: relativistic length contraction. You’ll wonder why I am bringing that into a discussion like this. Well… Just play a bit with those (1 − v²) and (1/v − v) factors. As mentioned above, they decrease the density of the wavefunction. In other words, it’s like space is being ‘stretched out’. Also, it can’t be a coincidence we find the same (1 − v²) factor in the relativistic length contraction formula: L = L₀·√(1 − v²), in which L₀ is the so-called proper length (i.e. the length in the stationary frame of reference) and v is the (relative) velocity of the moving frame of reference. Of course, we also find it in the relativistic mass formula: m = m_v = m₀/√(1−v²). In fact, things become much more obvious when substituting m for m₀/√(1−v²) in that ψ(t) = a·e^(i·m·(1 − v²)·t) function. We get:

ψ(t) = a·e^(i·m·(1 − v²)·t) = a·e^(i·m₀·√(1−v²)·t)

Well… We’re surely getting somewhere here. What if we go back to our original ψ(x, t) = a·e^(i·[(E/ħ)·t − (p/ħ)∙x]) function? Using natural units once again, that’s equivalent to:

ψ(x, t) = a·e^(i·(m·t − p∙x)) = a·e^(i·[(m₀/√(1−v²))·t − (m₀·v/√(1−v²))∙x])

= a·e^(i·[m₀/√(1−v²)]·(t − v∙x))
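If you don’t trust the algebra, the factoring above is easy to check numerically. The sketch below is a rough check only, in natural units (ħ = c = 1); the function names and the sample values for m₀, v, t and x are assumptions of mine. It compares the phase m·t − p·x with its factored form, and with the m₀·√(1−v²)·t form we get at the particle’s own position x = v·t.

```python
import math

def gamma(v):
    """The 1/√(1−v²) factor, with v measured as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v**2)

def phase_direct(m0, v, t, x):
    """m·t − p·x with m = m₀/√(1−v²) and p = m·v."""
    m = gamma(v) * m0
    p = m * v
    return m * t - p * x

def phase_factored(m0, v, t, x):
    """[m₀/√(1−v²)]·(t − v·x)."""
    return m0 * gamma(v) * (t - v * x)

def phase_on_worldline(m0, v, t):
    """m₀·√(1−v²)·t, i.e. the phase at the particle's own position x = v·t."""
    return m0 * math.sqrt(1.0 - v**2) * t

m0, v, t, x = 1.0, 0.6, 2.0, 0.5   # assumed sample values
print(phase_direct(m0, v, t, x), phase_factored(m0, v, t, x))
print(phase_direct(m0, v, t, v * t), phase_on_worldline(m0, v, t))
```

Each line should print the same number twice, up to rounding.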

Interesting! We’ve got a wavefunction that’s a function of x and t, but with the rest mass (or rest energy) and velocity as parameters! Now that really starts to make sense. Think of the graph of that 1/√(1−v²) factor: it goes from one (1) to infinity (∞) as v goes from 0 to 1 (remember we ‘normalized’ v: it’s a ratio between 0 and 1 now). So that’s the factor that comes into play for t. For x, the factor is v/√(1−v²), whose graph has a similar shape but goes from zero (0) to infinity (∞) as v goes from 0 to 1. Now that makes sense: the ‘density’ of the wavefunction, in time and in space, increases as the velocity v increases. In space, that should correspond to the relativistic length contraction effect: it’s like space is contracting, as the velocity increases and, therefore, the length of the object we’re watching contracts too. For time, the reasoning is a bit more complicated: it’s our time that becomes more dense and, therefore, our clock that seems to tick faster.

[…]

I know I need to explore this further, if only so as to assure you I have not gone crazy. Unfortunately, I have no time to do that right now. Indeed, from time to time, I need to work on other stuff besides this physics ‘hobby’ of mine.

Post scriptum 1: As for the E = m·v² formula, I also have a funny feeling that it might be related to the fact that, in quantum mechanics, both the real and imaginary part of the oscillation actually matter. You’ll remember that we’d represent any oscillator in physics by a complex exponential, because it eased our calculations. So instead of writing A = A₀·cos(ωt + Δ), we’d write: A = A₀·e^(i(ωt + Δ)) = A₀·cos(ωt + Δ) + i·A₀·sin(ωt + Δ). When calculating the energy or intensity of a wave, however, we couldn’t just take the square of the complex amplitude of the wave (remembering that E ∼ A²). No! We had to get back to the real part only, i.e. the cosine or the sine only. Now the mean (or average) value of the squared cosine function (or a squared sine function), over one or more cycles, is 1/2, so the mean of A² = A₀²·cos²(ωt + Δ) is equal to A₀²/2. I am not sure, and it’s probably a long shot, but one must be able to show that, if the imaginary part of the oscillation would actually matter (which is obviously the case for our matter-wave), then 1/2 + 1/2 is obviously equal to 1. I mean: try to think of an image with a mass attached to two springs, rather than one only. Does that make sense? 🙂 […] I know: I am just freewheeling here. 🙂

Post scriptum 2: The other thing that this E = m·v² equation makes me think of is, curiously enough, an eternally expanding spring. Indeed, the kinetic energy of a mass on a spring and the potential energy that’s stored in the spring always add up to some constant, and the average potential and kinetic energy are equal to each other. To be precise: 〈K.E.〉 + 〈P.E.〉 = (1/4)·k·A² + (1/4)·k·A² = k·A²/2. It means that, on average, the total energy of the system is twice the average kinetic energy (or potential energy). You’ll say: so what? Well… I don’t know. Can we think of a spring that expands eternally, with the mass on its end not gaining or losing any speed? In that case, v is constant, and the total energy of the system would, effectively, be equal to E_total = 2·〈K.E.〉 = 2·m·v²/2 = m·v².
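The averages quoted for the ordinary (non-expanding) spring can be checked by brute force. The sketch below samples one full cycle of x(t) = A·cos(ωt) and averages the kinetic and potential energy; the function name and the values for the mass, the spring constant and the amplitude are made up for the illustration.

```python
import math

def cycle_averages(m, k, A, n=10000):
    """Average kinetic and potential energy of a mass on a spring over one period."""
    omega = math.sqrt(k / m)            # angular frequency of the oscillator
    period = 2 * math.pi / omega
    ke_sum = 0.0
    pe_sum = 0.0
    for i in range(n):
        t = (i + 0.5) * period / n      # midpoints of n equal time slices
        x = A * math.cos(omega * t)
        v = -A * omega * math.sin(omega * t)
        ke_sum += 0.5 * m * v**2
        pe_sum += 0.5 * k * x**2
    return ke_sum / n, pe_sum / n

m, k, A = 1.0, 4.0, 0.5                 # assumed mass, spring constant and amplitude
ke_avg, pe_avg = cycle_averages(m, k, A)
print(ke_avg, pe_avg, k * A**2 / 4)     # both averages against k·A²/4
print(ke_avg + pe_avg, k * A**2 / 2)    # their sum against the total energy k·A²/2
```

Here k is the spring constant, not the wavenumber. The check covers only the ordinary oscillator: whether an ‘eternally expanding’ spring makes any sense is a different question altogether.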

Post scriptum 3: That substitution I made above (substituting x for x = v·t) is kinda weird. Indeed, if that E = m∙v² equation makes any sense, then E − m∙v² = 0, of course, and, therefore, θ = E·t − p·x = E·t − p·v·t = E·t − m·v·v·t = (E − m∙v²)·t = 0·t = 0. So the argument of our wavefunction is 0 and, therefore, we get a·e⁰ = a for our wavefunction. It basically means our particle is where it is. 🙂

Post scriptum 4: This post scriptum (no. 4) was added later, much later: on 29 February 2016, to be precise. The solution to the ‘riddle’ above is actually quite simple. We just need to make a distinction between the group and the phase velocity of our complex-valued wave. The solution came to me when I was writing a little piece on Schrödinger’s equation. I noticed that we do not find that weird E = m∙v² formula when substituting ψ for ψ = e^(i(kx − ωt)) in Schrödinger’s equation for a free particle, i.e. in:

∂ψ/∂t = i·(ħ/2m)·∇²ψ

Let me quickly go over the logic. To keep things simple, we’ll just assume one-dimensional space, so ∇²ψ = ∂²ψ/∂x². The time derivative on the left-hand side is ∂ψ/∂t = −iω·e^(i(kx − ωt)). The second-order derivative on the right-hand side is ∂²ψ/∂x² = (ik)·(ik)·e^(i(kx − ωt)) = −k²·e^(i(kx − ωt)). The e^(i(kx − ωt)) factor on both sides cancels out and, hence, equating both sides gives us the following condition:

−iω = −(iħ/2m)·k² ⇔ ω = (ħ/2m)·k²

Substituting ω = E/ħ and k = p/ħ yields:

E/ħ = (ħ/2m)·p²/ħ² = m²·v²/(2m·ħ) = m·v²/(2ħ) ⇔ E = m·v²/2
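The same condition can be checked without any calculus on paper. The sketch below plugs the plane wave e^(i(kx − ωt)) into the free-particle equation ∂ψ/∂t = i·(ħ/2m)·∂²ψ/∂x² using central finite differences, once with ω = (ħ/2m)·k² and once with ω = (ħ/m)·k² (which is what E = m·v² would amount to). The units (ħ = m = 1), the sample point, the step size and the function names are all assumptions made for the illustration.

```python
import cmath

HBAR = 1.0   # natural units, assumed
M = 1.0      # particle mass, assumed

def psi(x, t, k, omega):
    """Plane wave exp(i·(k·x − ω·t))."""
    return cmath.exp(1j * (k * x - omega * t))

def residual(k, omega, x=0.3, t=0.7, h=1e-3):
    """|∂ψ/∂t − i·(ħ/2m)·∂²ψ/∂x²| at (x, t), using central finite differences."""
    dpsi_dt = (psi(x, t + h, k, omega) - psi(x, t - h, k, omega)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t, k, omega) - 2 * psi(x, t, k, omega)
                 + psi(x - h, t, k, omega)) / h**2
    return abs(dpsi_dt - 1j * (HBAR / (2 * M)) * d2psi_dx2)

k = 2.0
omega_half = HBAR * k**2 / (2 * M)   # ω = (ħ/2m)·k², i.e. E = m·v²/2
omega_full = HBAR * k**2 / M         # ω = (ħ/m)·k², i.e. E = m·v²

print(residual(k, omega_half))   # close to zero: the equation is satisfied
print(residual(k, omega_full))   # clearly not zero: the equation is violated
```

The first residual is not exactly zero only because finite differences are approximations; shrinking the step h (within reason) makes it smaller still.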

In short: the E = m·v²/2 is the correct formula. It must be, because… Well… Because Schrödinger’s equation is a formula we surely shouldn’t doubt, right? So the only logical conclusion is that we must be doing something wrong when multiplying the two de Broglie equations. To be precise: our v = f·λ equation must be wrong. Why? Well… It’s just something one shouldn’t apply to our complex-valued wavefunction. The ‘correct’ velocity formula for the complex-valued wavefunction should have that 1/2 factor, so we’d write 2·f·λ = v to make things come out alright. But where would this formula come from? The period of cosθ + i·sinθ is the period of the sine and cosine function: cos(θ+2π) + i·sin(θ+2π) = cosθ + i·sinθ, so T = 2π and f = 1/T = 1/2π do not change.

But so that’s a mathematical point of view. From a physical point of view, it’s clear we got two oscillations for the price of one: one ‘real’ and one ‘imaginary’, but both are equally essential and, hence, equally ‘real’. So the answer must lie in the distinction between the group and the phase velocity when we’re combining waves. Indeed, the group velocity of a sum of waves is equal to v_g = dω/dk. In this case, we have:

v_g = d[E/ħ]/d[p/ħ] = dE/dp

We can now use the kinetic energy formula to write E as E = m·v²/2 = p·v/2. Now, v and p are related through m (p = m·v, so v = p/m). So we should write this as E = m·v²/2 = p²/(2m). Substituting E and p = m·v in the equation above then gives us the following:

dω/dk = d[p²/(2m)]/dp = 2p/(2m) = v_g = v

However, for the phase velocity, we can just use the v_p = ω/k formula, which gives us that 1/2 factor:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = (m·v²/2)/(m·v) = v/2

Bingo! Riddle solved! 🙂 Isn’t it nice that our formula for the group velocity also applies to our complex-valued wavefunction? I think that’s amazing, really! But I’ll let you think about it. 🙂
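For those who like to see numbers: the sketch below takes the ω = (ħ/2m)·k² relation, differentiates it numerically to get the group velocity, and divides ω by k to get the phase velocity. Units (ħ = 1), the mass, the velocity and the function names are assumptions chosen for the illustration.

```python
HBAR = 1.0   # natural units, assumed

def omega(k, m):
    """Dispersion relation ω = (ħ/2m)·k² of the free-particle wavefunction."""
    return HBAR * k**2 / (2.0 * m)

def group_velocity(k, m, dk=1e-6):
    """v_g = dω/dk, estimated with a central finite difference."""
    return (omega(k + dk, m) - omega(k - dk, m)) / (2.0 * dk)

def phase_velocity(k, m):
    """v_p = ω/k."""
    return omega(k, m) / k

m, v = 2.0, 0.3          # assumed mass and classical velocity
k = m * v / HBAR         # k = p/ħ with p = m·v

print(group_velocity(k, m))   # should come out as v
print(phase_velocity(k, m))   # should come out as v/2
```

So it is the group velocity that matches the classical velocity of the particle, while the f·λ product we started with is the phase velocity.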

The Uncertainty Principle revisited

I’ve written a few posts on the Uncertainty Principle already. See, for example, my post on the energy-time expression for it (ΔE·Δt ≥ h). So why am I coming back to it once more? Not sure. I felt I left some stuff out. So I am writing this post to just complement what I wrote before. I’ll do so by explaining, and commenting on, the ‘semi-formal’ derivation of the so-called Kennard formulation of the Principle in the Wikipedia article on it.

The Kennard inequalities, σ_x·σ_p ≥ ħ/2 and σ_E·σ_t ≥ ħ/2, are more accurate than the more general Δx·Δp ≥ h and ΔE·Δt ≥ h expressions one often sees, which are an early formulation of the Principle by Niels Bohr, and which Heisenberg himself used when explaining the Principle in a thought experiment picturing a gamma-ray microscope. I presented Heisenberg’s thought experiment in another post, and so I won’t repeat myself here. I just want to mention that it ‘proves’ the Uncertainty Principle using the Planck-Einstein relations for the energy and momentum of a photon:

E = hf and p = h/λ

Heisenberg’s thought experiment is not a real proof, of course. But then what’s a real proof? The mentioned ‘semi-formal’ derivation looks more impressive, because more mathematical, but it’s not a ‘proof’ either (I hope you’ll understand why I am saying that after reading my post). The main difference between Heisenberg’s thought experiment and the mathematical derivation in the mentioned Wikipedia article is that the ‘mathematical’ approach is based on the de Broglie relation. That de Broglie relation looks the same as the Planck-Einstein relation (p = h/λ) but it’s fundamentally different.

Indeed, the momentum of a photon (i.e. the p we use in the Planck-Einstein relation) is not the momentum one associates with a proper particle, such as an electron or a proton, for example (so that’s the p we use in the de Broglie relation). The momentum of a particle is defined as the product of its mass (m) and velocity (v). Photons don’t have a (rest) mass, and their velocity is absolute (c), so how do we define momentum for a photon? There are a couple of ways to go about it, but the two most obvious ones are probably the following:

1. We can use the classical theory of electromagnetic radiation and show that the momentum of a photon is related to the magnetic field (we usually only analyze the electric field), and the so-called radiation pressure that results from it. It yields the p = E/c formula which we need to go from E = hf to p = h/λ, using the ubiquitous relation between the frequency, the wavelength and the wave velocity (c = λf). (In case you’re interested in the detail, just click on the radiation pressure link.)
2. We can also use the mass-energy equivalence E = mc². Hence, the equivalent mass of the photon is E/c², which is relativistic mass only. However, we can multiply that mass with the photon’s velocity, which is c, thereby getting the very same value for its momentum: p = (E/c²)·c = E/c.

So Heisenberg’s ‘proof’ uses the Planck-Einstein relations, as it analyzes the Uncertainty Principle more as an observer effect: probing matter with light, so to say. In contrast, the mentioned derivation takes the de Broglie relation itself as the point of departure. As mentioned, the de Broglie relations look exactly the same as the Planck-Einstein relationship (E = hf and p = h/λ) but the model behind is very different. In fact, that’s what the Uncertainty Principle is all about: it says that the de Broglie frequency and/or wavelength cannot be determined exactly: if we want to localize a particle, somewhat at least, we’ll be dealing with a frequency range Δf. As such, the de Broglie relation is actually somewhat misleading at first. Let’s talk about the model behind.

A particle, like an electron or a proton, traveling through space, is described by a complex-valued wavefunction, usually denoted by the Greek letter psi (Ψ) or phi (Φ). This wavefunction has a phase, usually denoted as θ (theta) which – because we assume the wavefunction is a nice periodic function – varies as a function of time and space. To be precise, we write θ as θ = ωt – kx or, if the wave is traveling in the other direction, as θ = kx – ωt.

I’ve explained this in a couple of posts already, including my previous post, so I won’t repeat myself here. Let me just note that ω is the angular frequency, which we express in radians per second, rather than cycles per second, so ω = 2π·f (one cycle covers 2π rad). As for k, that’s the wavenumber, which is often described as the spatial frequency, because it’s expressed in cycles per meter or, more often (and surely in this case), in radians per meter. Hence, if we freeze time, this number is the rate of change of the phase in space. Because one cycle is, again, 2π rad, and one cycle corresponds to the wave traveling one wavelength (i.e. λ meter), it’s easy to see that k = 2π/λ. We can use these definitions to re-write the de Broglie relations E = hf and p = h/λ as:

E = ħω and p = ħk with ħ = h/2π
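The two ways of writing these relations are easy to check against each other. The little Python sketch below is only an illustration: the function name and the sample energy and momentum are hypothetical (I picked values with E = p·c, so they describe a photon).

```python
import math

H = 6.62607015e-34            # Planck constant in J*s
HBAR = H / (2.0 * math.pi)    # reduced Planck constant


def wave_quantities(energy, momentum):
    """Return (f, lam, omega, k) for an energy in J and a momentum in kg*m/s."""
    f = energy / H                 # E = h*f
    lam = H / momentum             # p = h/lambda
    omega = 2.0 * math.pi * f      # radians per second
    k = 2.0 * math.pi / lam        # radians per meter
    return f, lam, omega, k


E_sample, p_sample = 3.0e-19, 1.0e-27   # hypothetical values with E = p*c
f, lam, omega, k = wave_quantities(E_sample, p_sample)

print("hbar*omega =", HBAR * omega, "J")        # same as E_sample, up to rounding
print("hbar*k     =", HBAR * k, "kg*m/s")       # same as p_sample, up to rounding
print("omega/k    =", omega / k, "m/s")         # equals lam*f
```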

What about the wave velocity? For a photon, we have c = λf and, hence, c = (2π/k)(ω/2π) = ω/k. For ‘particle waves’ (or matter waves, if you prefer that term), it’s much more complicated, because we need to distinguish between the so-called phase velocity (v_p) and the group velocity (v_g). The phase velocity is what we’re used to: it’s the product of the frequency (the number of cycles per second) and the wavelength (the distance traveled by the wave over one cycle), or the ratio of the angular frequency and the wavenumber, so we have, once again, λf = ω/k = v_p. However, this phase velocity is not the classical velocity of the particle that we are looking at. That’s the so-called group velocity, which corresponds to the velocity of the wave packet representing the particle (or ‘wavicle’, if you prefer that term), as illustrated below.

The animation below illustrates the difference between the phase and the group velocity even more clearly: the green dot travels with the ‘wavicles’, while the red dot travels with the phase. As mentioned above, the group velocity corresponds to the classical velocity of the particle (v). However, the phase velocity is the velocity of a mathematical point that actually travels faster than light. It is a mathematical point only, which does not carry a signal (unlike the modulation of the wave itself, i.e. the traveling ‘groups’) and, hence, it does not contradict the fundamental principle of relativity theory: the speed of light is absolute, and nothing travels faster than light (except mathematical points, as you can, hopefully, appreciate now).

The two animations above do not represent the quantum-mechanical wavefunction, because the functions that are shown are real-valued, not complex-valued. To imagine a complex-valued wave, you should think of something like the ‘wavicle’ below or, if you prefer animations, the standing waves underneath (i.e. C to H: A and B just present the mathematical model behind, which is that of a mechanical oscillator, like a mass on a spring indeed). These representations clearly show the real as well as the imaginary part of complex-valued wave-functions.

With this general introduction, we are now ready for the more formal treatment that follows. So our wavefunction Ψ is a complex-valued function in space and time. A very general shape for it is one we used in a couple of posts already:

Ψ(x, t) ∝ e^(i(kx – ωt)) = cos(kx – ωt) + i·sin(kx – ωt)

If you don’t know anything about complex numbers, I’d suggest you read my short crash course on it in the essentials page of this blog, because I don’t have the space nor the time to repeat all of that. Now, we can use the de Broglie relationship relating the momentum of a particle with a wavenumber (p = ħk) to re-write our psi function as:

Ψ(x, t) ∝ e^(i(kx – ωt)) = e^(i(px/ħ – ωt))

Note that I am using the ‘proportional to’ symbol (∝) because I don’t worry about normalization right now. Indeed, from all of my other posts on this topic, you know that we have to take the absolute square of all these probability amplitudes to arrive at a probability density function, describing the probability of the particle effectively being at point x in space at point t in time, and that all those probabilities, over the function’s domain, have to add up to 1. So we should insert some normalization factor.

Having said that, the problem with the wavefunction above is not normalization really, but the fact that it yields a uniform probability density function. In other words, the particle position is extremely uncertain in the sense that it could be anywhere. Let’s calculate it using a little trick: the absolute square of a complex number equals the product of itself with its (complex) conjugate. Hence, if z = r·e^(iθ), then |z|² = z·z* = r·e^(iθ)·r·e^(–iθ) = r²·e^(iθ – iθ) = r²·e^0 = r². Now, in this case, assuming unique values for k, ω, p, which we’ll note as k₀, ω₀, p₀ (and, because we’re freezing time, we can also write t = t₀), we should write:

|Ψ(x)|² = |a₀·e^(i(p₀x/ħ – ω₀t₀))|² = |a₀·e^(ip₀x/ħ)·e^(–iω₀t₀)|² = |a₀·e^(ip₀x/ħ)|²·|e^(–iω₀t₀)|² = a₀²

Note that, this time around, I did insert some normalization constant a₀ as well, so that’s OK. But so the problem is that this very general shape of the wavefunction gives us a constant probability density, so the probability for the particle being somewhere between some point a and another point b in space depends only on the distance between the two. More formally, we get the surface for a rectangle when we calculate the probability P[a ≤ X ≤ b] as we should calculate it, which is as follows:

P[a ≤ X ≤ b] = ∫ |Ψ(x)|² dx (with the integral going from x = a to x = b)

More specifically, because we’re talking one-dimensional space here, we get P[a ≤ X ≤ b] = (b – a)·a₀².

Now, you may think that such uniform probability makes sense. For example, an electron may be in some orbital around a nucleus, and so you may think that all ‘points’ on the orbital (or within the ‘sphere’, or whatever volume it is) may be equally likely. Or, in another example, we may know an electron is going through some slit and, hence, we may think that all points in that slit should be equally likely positions. However, we know that it is not the case. Measurements show that not all points are equally likely. For an orbital, we get complicated patterns, such as the one shown below, and please note that the different colors represent different complex numbers and, hence, different probabilities. Also, we know that electrons going through a slit will produce an interference pattern – even if they go through it one by one! Hence, we cannot associate some flat line with them: it has to be a proper wavefunction which implies, once again, that we can’t accept a uniform distribution.

In short, uniform probability density functions are not what we see in Nature. They’re non-uniform, like the (very simple) non-uniform distributions shown below. [The left-hand side shows the wavefunction, while the right-hand side shows the associated probability density function: the first two are static (i.e. they do not vary in time), while the third one shows a probability distribution that does vary with time.] I should also note that, even if you would dare to think that a uniform distribution might be acceptable in some cases (which, let me emphasize this, it is not), an electron can surely not be ‘anywhere’. Indeed, the normalization condition implies that, if we’d have a uniform distribution and if we’d consider all of space, i.e. if we let a go to –∞ and b to +∞, then a₀ would tend to zero, which means we’d have a particle that is, literally, everywhere and nowhere at the same time.

In short, a uniform probability distribution does not make sense: we’ll generally have some idea of where the particle is most likely to be, within some range at least. I hope I made myself clear here.
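The flat density is easy to see numerically. The sketch below is just an illustration in natural units (ħ = 1, which is my assumption, as are the sample values for a₀, p₀ and ω₀): it evaluates the plane wave at a few arbitrary points and then shows what the normalization condition does to a₀² as the interval [a, b] grows.

```python
import cmath

HBAR = 1.0   # natural units, an assumption made for this illustration


def plane_wave(x, t, a0=1.0, p0=2.0, omega0=3.0):
    """Psi(x, t) = a0 * exp(i*(p0*x/hbar - omega0*t)) for one single mode."""
    return a0 * cmath.exp(1j * (p0 * x / HBAR - omega0 * t))


def density(x, t, **params):
    """Probability density |Psi(x, t)|^2."""
    return abs(plane_wave(x, t, **params)) ** 2


sample_points = [-5.0, -1.3, 0.0, 0.7, 4.2]
densities = [density(x, t=0.4, a0=0.5) for x in sample_points]
print(densities)   # the same value a0^2 everywhere, up to rounding


def normalized_a0_squared(a, b):
    """The a0^2 that makes (b - a) * a0^2 equal to 1."""
    return 1.0 / (b - a)


for half_width in (1.0, 1e3, 1e6):
    print(half_width, normalized_a0_squared(-half_width, half_width))
```

The density does not depend on x at all, and the normalized a₀² shrinks towards zero as the interval widens.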

Now, before I continue, I should make some other point as well. You know that the Planck constant (h or ħ) is unimaginably small: about 1×10⁻³⁴ J·s (joule-second). In fact, I’ve repeatedly made that point in various posts. However, having said that, I should add that, while it’s unimaginably small, the uncertainties involved are quite significant. Let us indeed look at the value of ħ by relating it to that σ_x·σ_p ≥ ħ/2 relation.

Let’s first look at the units. The uncertainty in the position should obviously be expressed in distance units, while momentum is expressed in kg·m/s units. So that works out, because 1 joule is the energy transferred (or work done) when applying a force of 1 newton (N) over a distance of 1 meter (m). In turn, one newton is the force needed to accelerate a mass of one kg at the rate of 1 meter per second per second (this is not a typing mistake: it’s an acceleration of 1 m/s per second, so the unit is m/s²: meter per second squared). Hence, 1 J·s = 1 N·m·s = 1 (kg·m/s²)·m·s = 1 kg·m²/s. Now, that’s the same dimension as the ‘dimensional product’ for momentum and distance: m·kg·m/s = kg·m²/s.

Now, these units (kg, m and s) are all rather astronomical at the atomic scale and, hence, h and ħ are usually expressed in other dimensions, notably eV·s (electronvolt-second). However, using the standard SI units gives us a better idea of what we’re talking about. If we split the ħ = 1×10⁻³⁴ J·s value (let’s forget about the 1/2 factor for now) ‘evenly’ over σ_x and σ_p – whatever that means: all depends on the units, of course! – then both factors will have magnitudes of the order of 1×10⁻¹⁷: 1×10⁻¹⁷ m times 1×10⁻¹⁷ kg·m/s gives us 1×10⁻³⁴ J·s.

You may wonder how this 1×10⁻¹⁷ m compares to, let’s say, the classical electron radius, for example. The classical electron radius is, roughly speaking, the ‘space’ an electron seems to occupy as it scatters incoming light. The idea is illustrated below (credit for the image goes to Wikipedia, as usual). The classical electron radius – or Thomson scattering length – is about 2.818×10⁻¹⁵ m, so that’s almost 300 times our ‘uncertainty’ (1×10⁻¹⁷ m). Not bad: it means that we can effectively relate our ‘uncertainty’ in regard to the position to some actual dimension in space. In this case, we’re talking the femtometer scale (1 fm = 10⁻¹⁵ m), and so you’ve surely heard of this before.

What about the other ‘uncertainty’, the one for the momentum (1×10⁻¹⁷ kg·m/s)? What’s the typical (linear) momentum of an electron? Its mass, expressed in kg, is about 9.1×10⁻³¹ kg. We also know its relative velocity in an atom: it’s that magical number α = v/c, about which I wrote in some other posts already, so v = αc ≈ 0.0073·3×10⁸ m/s ≈ 2.2×10⁶ m/s. Now, 9.1×10⁻³¹ kg times 2.2×10⁶ m/s is about 2×10⁻²⁴ kg·m/s, so our proposed ‘uncertainty’ in regard to the momentum (1×10⁻¹⁷ kg·m/s) is five million times larger than the typical value for it. Now that is, obviously, not so good. [Note that calculations like this are extremely rough. In fact, when one talks electron momentum, it’s usually angular momentum, which is ‘analogous’ to linear momentum, but angular momentum involves very different formulas. If you want to know more about this, check my post on it.]

Of course, now you may feel that we didn’t ‘split’ the uncertainty in a way that makes sense: those –17 exponents don’t work, obviously. So let’s take 1×10⁻²⁴ kg·m/s for σ_p, which is half of that ‘typical’ value we calculated. Then we’d have 1×10⁻¹⁰ m for σ_x (1×10⁻¹⁰ m times 1×10⁻²⁴ kg·m/s is, once again, 1×10⁻³⁴ J·s). But then that uncertainty suddenly becomes a huge number: 1×10⁻¹⁰ m is 1 angstrom. That’s the size of the atom itself (the Bohr radius is about half an angstrom)! So it’s huge as compared to the pico- or femtometer scale (1 pm = 1×10⁻¹² m, 1 fm = 1×10⁻¹⁵ m) which we’d sort of expect to see when we’re talking electrons.
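If you want to play with other ways of ‘splitting’ the uncertainty, the arithmetic is a one-liner. The sketch below uses the same rough ħ ≈ 1×10⁻³⁴ J·s and ignores the 1/2 factor, just like the text. The function name and the list of sample σ_p values are my own choice.

```python
HBAR_ROUGH = 1e-34        # order-of-magnitude value of hbar in J*s, as used in the text
R_CLASSICAL = 2.818e-15   # classical electron radius in m


def sigma_x_for(sigma_p, hbar=HBAR_ROUGH):
    """Position spread that pairs with sigma_p when sigma_x * sigma_p = hbar."""
    return hbar / sigma_p


for sigma_p in (1e-17, 1e-20, 1e-24, 1e-26):
    sigma_x = sigma_x_for(sigma_p)
    print(f"sigma_p = {sigma_p:.0e} kg*m/s  ->  sigma_x = {sigma_x:.0e} m")

# How the 'even split' compares to the classical electron radius.
print("classical radius / 1e-17 m =", R_CLASSICAL / 1e-17)
```

Every factor of ten we take off σ_p comes back as a factor of ten on σ_x, which is all the trade-off says at this level.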

OK. Let me get back to the lesson. Why this digression? Not sure. I think I just wanted to show that the Uncertainty Principle involves ‘uncertainties’ that are extremely relevant: despite the unimaginable smallness of the Planck constant, these uncertainties are quite significant at the atomic scale. But back to the ‘proof’ of Kennard’s formulation. Here we need to discuss the ‘model’ we’re using. The rather simple animation below (again, credit for it has to go to Wikipedia) illustrates it wonderfully. Look at it carefully: we start with a ‘wave packet’ that looks a bit like a normal distribution, but it isn’t, of course. We have negative and positive values, and normal distributions don’t have that. So it’s a wave alright. Of course, you should, once more, remember that we’re only seeing one part of the complex-valued wave here (the real or imaginary part—it could be either). But so then we’re superimposing waves on it. Note the increasing frequency of these waves, and also note how the wave packet becomes increasingly localized with the addition of these waves. In fact, the so-called Fourier analysis, of which you’ve surely heard before, is a mathematical operation that does the reverse: it separates a wave packet into its individual component waves.

So now we know the ‘trick’ for reducing the uncertainty in regard to the position: we just add waves with different frequencies. Of course, different frequencies imply different wavenumbers and, through the de Broglie relationship, we’ll also have different values for the ‘momentum’ associated with these component waves. Let’s write these various values as k_n, ω_n, and p_n respectively, with n going from 0 to N. Of course, our point in time remains frozen at t₀. So we get a wavefunction that’s, quite simply, the sum of these component waves and so we write:

Ψ(x) = ∑ a_n·e^(i(p_n·x/ħ – ω_n·t₀)) = ∑ a_n·e^(ip_n·x/ħ)·e^(–iω_n·t₀) = ∑ A_n·e^(ip_n·x/ħ)

Note that, because of the e^(–iω_n·t₀) factor, we now have complex-valued coefficients A_n = a_n·e^(–iω_n·t₀) in front. More formally, we say that A_n represents the relative contribution of the mode p_n to the overall Ψ(x) wave. Hence, we can write these coefficients A as a function of p. Because Greek letters always make more of an impression, we’ll use the Greek letter Φ (phi) for it. 🙂 Now, we can go to the continuum limit and, hence, transform that sum above into an infinite sum, i.e. an integral. So our wave function then becomes an integral over all possible modes, which we write as:

Ψ(x) = (1/√(2πħ))·∫ Φ(p)·e^(ipx/ħ) dp

Don’t worry about that new 1/√(2πħ) factor in front. That’s, once again, something that has to do with normalization and scales. It’s the integral itself you need to understand. We’ve got that Φ(p) function there, which is nothing but our A_n coefficient, but for the continuum case. In fact, these relative contributions Φ(p) are now referred to as the amplitude of all modes p, and so Φ(p) is actually another wave function: it’s the wave function in the so-called momentum space.

You’ll probably be very confused now, and wonder where I want to go with an integral like this. The point to note is simple: if we have that Φ(p) function, we can calculate (or derive, if you prefer that word) the Ψ(x) from it using that integral above. Indeed, the integral above is referred to as the Fourier transform, and it’s obviously closely related to that Fourier analysis we introduced above.
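To see the ‘adding waves’ trick at work, here is a small Python sketch that builds Ψ(x) as a finite sum of component waves and measures how spread out the resulting |Ψ(x)|² is. Everything specific in it is an assumption of mine: natural units (ħ = 1), Gaussian weights A_n centered on some p₀, and the grids for x and p. It only illustrates the idea and is not the derivation in the Wikipedia article.

```python
import cmath
import math

HBAR = 1.0   # natural units, assumed for this illustration


def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi (inclusive)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def packet(x, momenta, weights):
    """Psi(x) = sum over n of A_n * exp(i * p_n * x / hbar), at frozen time."""
    return sum(A * cmath.exp(1j * p * x / HBAR) for p, A in zip(momenta, weights))


def gaussian_weights(momenta, p0, width):
    """Real coefficients A_n that favour modes close to p0."""
    return [math.exp(-((p - p0) ** 2) / (2.0 * width ** 2)) for p in momenta]


def position_spread(momenta, weights, xs):
    """Standard deviation of x under the density |Psi(x)|^2 sampled on xs."""
    dens = [abs(packet(x, momenta, weights)) ** 2 for x in xs]
    total = sum(dens)
    mean = sum(x * d for x, d in zip(xs, dens)) / total
    return math.sqrt(sum((x - mean) ** 2 * d for x, d in zip(xs, dens)) / total)


P0 = 5.0
MOMENTA = linspace(0.0, 10.0, 201)
XS = linspace(-20.0, 20.0, 401)

x_spread_narrow_p = position_spread(MOMENTA, gaussian_weights(MOMENTA, P0, 0.2), XS)
x_spread_broad_p = position_spread(MOMENTA, gaussian_weights(MOMENTA, P0, 1.0), XS)

print("narrow range of momenta -> spread in x:", x_spread_narrow_p)
print("broad range of momenta  -> spread in x:", x_spread_broad_p)
```

Giving weight to a broader range of momenta yields the smaller spread in position, which is the trade-off the rest of this post is about.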

Of course, there is also an inverse transform, which looks exactly the same: it just switches the wave functions (Ψ and Φ) and variables (x and p), and then (it’s an important detail!), it has a minus sign in the exponent. Together, the two functions – as defined by each other through these two integrals – form a so-called Fourier integral pair, also known as a Fourier transform pair, and the variables involved are referred to as conjugate variables. So momentum (p) and position (x) are conjugate variables and, likewise, energy and time are also conjugate variables (but so I won’t expand on the time-energy relation here: please have a look at one of my other posts on that).

Now, I thought of copying and explaining the proof of Kennard’s inequality from Wikipedia’s article on the Uncertainty Principle (you need to click on the show button in the relevant section to see it), but then that’s pretty boring math, and simply copying stuff is not my objective with this blog. More importantly, the proof has nothing to do with physics. Nothing at all. Indeed, it just proves a general mathematical property of Fourier pairs. More specifically, it proves that, the more concentrated one function is, the more spread out its Fourier transform must be. In other words, it is not possible to arbitrarily concentrate both a function and its Fourier transform.

So, in this case, if we’d ‘squeeze’ Ψ(x), then its Fourier transform Φ(p) will ‘stretch out’, and so that’s what the proof in that Wikipedia article basically shows. In other words, there is some ‘trade-off’ between the ‘compaction’ of Ψ(x), on the one hand, and Φ(p), on the other, and so that is what the Uncertainty Principle is all about. Nothing more, nothing less.
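The trade-off can be checked numerically for the one case where Kennard’s inequality becomes an equality: a Gaussian wave packet. The sketch below is mine, not the Wikipedia proof. It assumes ħ = 1, approximates the transform with a plain Riemann sum on grids I picked by hand, and leaves out the 1/√(2πħ) factor because the spreads do not depend on it.

```python
import cmath
import math

HBAR = 1.0   # natural units, assumed for this illustration


def grid(lo, hi, n):
    """n evenly spaced values from lo to hi (inclusive)."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def spread(points, amplitudes):
    """Standard deviation of the points under the density |amplitude|^2."""
    dens = [abs(a) ** 2 for a in amplitudes]
    total = sum(dens)
    mean = sum(q * d for q, d in zip(points, dens)) / total
    return math.sqrt(sum((q - mean) ** 2 * d for q, d in zip(points, dens)) / total)


def momentum_amplitudes(xs, psi, ps):
    """Riemann-sum approximation of the integral of Psi(x)*exp(-i*p*x/hbar) over x."""
    dx = xs[1] - xs[0]
    return [sum(a * cmath.exp(-1j * p * x / HBAR) for x, a in zip(xs, psi)) * dx
            for p in ps]


def spreads_for_gaussian(sigma):
    """Return (sigma_x, sigma_p) for Psi(x) proportional to exp(-x^2/(4*sigma^2))."""
    xs = grid(-10.0 * sigma, 10.0 * sigma, 201)
    psi = [math.exp(-x * x / (4.0 * sigma ** 2)) for x in xs]
    ps = grid(-5.0 * HBAR / sigma, 5.0 * HBAR / sigma, 201)
    phi = momentum_amplitudes(xs, psi, ps)
    return spread(xs, psi), spread(ps, phi)


for sigma in (0.5, 2.0):
    sx, sp = spreads_for_gaussian(sigma)
    print(f"sigma_x = {sx:.4f}, sigma_p = {sp:.4f}, product = {sx * sp:.4f}")
```

Squeezing Ψ(x) by a factor of four stretches Φ(p) by the same factor, while the product of the two spreads stays at about ħ/2.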

But… Yes? What’s all this talk about ‘squeezing’ and ‘compaction’? We can’t change reality, can we? Well… Here we’re entering the philosophical field, of course. How do we interpret the Uncertainty Principle? It surely does look like us trying to measure something has some impact on the wavefunction. In fact, our measurement – of either position or momentum – usually makes the wavefunctions collapse: we suddenly know where the particle is and, hence, Ψ(x) seems to collapse into one point. Alternatively, we measure its momentum and, hence, Φ(p) collapses.

That’s intriguing. In fact, even more intriguing is the possibility we may only partially affect those wavefunctions with measurements that are somewhat less ‘drastic’. It seems a lot of research is focused on that (just Google for partial collapse of the wavefunction, and you’ll find tons of references, including presentations like this one).

Hmm… I need to further study the topic. The decomposition of a wave into its component waves is obviously something that works well in physics—and not only in quantum mechanics but also in much more mundane examples. Its most general application is signal processing, in which we decompose a signal (which is a function of time) into the frequencies that make it up. Hence, our wavefunction model makes a lot of sense, as it mirrors the physics involved in oscillators and harmonics obviously.

Still… I feel it doesn’t answer the fundamental question: what is our electron really? What do those wave packets represent? Physicists will say questions like this don’t matter: as long as our mathematical models ‘work’, it’s fine. In fact, if even Feynman said that nobody – including himself – truly understands quantum mechanics, then I should just be happy and move on. However, for some reason, I can’t quite accept that. I should probably focus some more on that de Broglie relationship, p = h/λ, as it’s obviously as fundamental to my understanding of the ‘model’ of reality in physics as that Fourier analysis of the wave packet. So I need to do some more thinking on that.

The de Broglie relationship is not intuitive. In fact, I am not ashamed to admit that it actually took me quite some time to understand why we can’t just re-write the de Broglie relationship (λ = h/p) as an uncertainty relation itself: Δλ = h/Δp. Hence, let me be very clear on this:

Δx = h/Δp (that’s the Uncertainty Principle) but Δλ ≠ h/Δp !

Let me quickly explain why.

If the Δ symbol expresses a standard deviation (or some other measurement of uncertainty), we can write the following:

p = h/λ ⇒ Δp = Δ(h/λ) = hΔ(1/λ) ≠ h/Δλ

So I can take h out of the brackets after the Δ symbol, because that’s one of the things that’s allowed when working with standard deviations. More in particular, one can prove the following:

1. The standard deviation of some constant function is 0: Δ(k) = 0
2. The standard deviation is invariant under changes of location: Δ(x + k) = Δ(x)
3. Finally, the standard deviation scales directly with the scale of the variable: Δ(kx) = |k|·Δ(x).
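These three rules, and the fact that the reciprocal does not obey any such simple rule, can be checked on a toy data set. The sketch below uses the population standard deviation from Python’s statistics module. The sample values and the constant k are arbitrary choices of mine.

```python
import statistics

data = [1.0, 2.0, 4.0, 8.0]   # arbitrary sample values
k = 3.0                       # arbitrary constant
sd = statistics.pstdev        # population standard deviation

sd_constant = sd([k] * len(data))          # rule 1: a constant has zero spread
sd_shifted = sd([x + k for x in data])     # rule 2: shifting changes nothing
sd_scaled = sd([-k * x for x in data])     # rule 3: scaling by -k multiplies by |k|
sd_reciprocal = sd([1.0 / x for x in data])

print("sd(data)         =", sd(data))
print("sd(constant)     =", sd_constant)
print("sd(data + k)     =", sd_shifted)
print("sd(-k * data)    =", sd_scaled, " vs |k| * sd(data) =", abs(k) * sd(data))
print("sd(1 / data)     =", sd_reciprocal, " vs 1 / sd(data) =", 1.0 / sd(data))
```

The last line is the one that matters for what follows: the spread of the reciprocals is not the reciprocal of the spread.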

However, it is not the case that Δ(1/x) = 1/Δx. But let’s not focus on what we cannot do with Δx: let’s see what we can do with it. Δx equals h/Δp according to the Uncertainty Principle, if we take it as an equality, rather than as an inequality, that is. And then we have the de Broglie relationship: p = h/λ. Hence, Δx must equal:

Δx = h/Δp = h/[Δ(h/λ)] = h/[hΔ(1/λ)] = 1/Δ(1/λ)

That’s obvious, but so what? As mentioned, we cannot write Δx = Δλ, because there’s no rule that says that Δ(1/λ) = 1/Δλ and, therefore, h/Δp ≠ Δλ. However, what we can do is define Δλ as an interval, or a length, defined by the difference between its lower and upper bound (let’s denote those two values by λ_a and λ_b respectively). Hence, we write Δλ = λ_b – λ_a. Note that this does not assume we have a continuous range of values for λ: we can have any number of wavelengths λ between λ_a and λ_b, but so you see the point: we’ve got a range of values λ, discrete or continuous, defined by some lower and upper bound.

Now, the de Broglie relation associates two values p_a and p_b with λ_a and λ_b respectively: p_a = h/λ_a and p_b = h/λ_b. Hence, we can similarly define the corresponding Δp interval as p_a – p_b. Note that, because we’re taking the reciprocal, we have to reverse the order of the values here: if λ_b > λ_a, then p_a = h/λ_a > p_b = h/λ_b. Hence, we can write Δp = Δ(h/λ) = p_a – p_b = h/λ_a – h/λ_b = h(1/λ_a – 1/λ_b) = h(λ_b – λ_a)/(λ_a·λ_b). In case you have a bit of difficulty, just draw some reciprocal functions (like the ones below), and have fun connecting intervals on the horizontal axis with intervals on the vertical axis using these functions. Now, h(λ_b – λ_a)/(λ_a·λ_b) is obviously something very different than h/Δλ = h/(λ_b – λ_a). So we can surely not equate the two and, hence, we cannot write that Δp = h/Δλ.

Having said that, the Δx = 1/Δ(1/λ) = λ_a·λ_b/(λ_b – λ_a) that emerges here is quite interesting. We’ve got a ratio here, λ_a·λ_b/(λ_b – λ_a), which shows that Δx depends only on the upper and lower bounds of the Δλ range. It does not depend on whether or not the interval is discrete or continuous.

The second thing that is interesting to note is that Δx depends not only on the difference between those two values (i.e. the length of the interval) but also on their value: if the length of the interval, i.e. the difference between the two wavelengths, is the same, but their values as such are higher, then we get a higher value for Δx, i.e. a greater uncertainty in the position. Again, this shows that the relation between Δλ and Δx is not straightforward. But so we knew that already, and so I’ll end this post right here and right now. 🙂
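A quick numerical example makes both points concrete. The sketch below takes the Uncertainty Principle as the equality Δx = h/Δp, as above. The wavelength bounds are made-up sample values and the function names are mine.

```python
H = 6.62607015e-34   # Planck constant in J*s


def delta_p(lam_low, lam_high):
    """Momentum interval h/lam_low - h/lam_high for wavelength bounds lam_low < lam_high."""
    return H / lam_low - H / lam_high


def delta_x(lam_low, lam_high):
    """Delta x = h/Delta p, which works out to lam_low*lam_high/(lam_high - lam_low)."""
    return H / delta_p(lam_low, lam_high)


# Two wavelength intervals with the same length (1e-10 m) but different values.
short_waves = (1e-10, 2e-10)
long_waves = (5e-10, 6e-10)

for lam_low, lam_high in (short_waves, long_waves):
    print(f"bounds {lam_low:.0e} m and {lam_high:.0e} m:",
          f"delta_lambda = {lam_high - lam_low:.1e} m,",
          f"delta_x = {delta_x(lam_low, lam_high):.1e} m")
```

Same Δλ, very different Δx: about 2×10⁻¹⁰ m for the first interval and about 3×10⁻⁹ m for the second.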

The shape and size of a photon

Important post script (PS) – dated 22 December 2018: Dear readers of this post, this is one of the more popular posts of my blog but – in the meanwhile – I did move on a bit, and many of the questions I get touch on the same topic: the analysis below is not entirely consistent. These questions are fully justified and I have been thinking differently as a result. The Q&A below sums up everything. At the same time, if you are interested in this question, you should probably also read the original post.

Hi Brian – see section III of this paper: http://vixra.org/pdf/1812.0273v2.pdf

Feynman’s classical idea of an atomic oscillator is fine in the context of the blackbody radiation problem, but his description of the photon as a long wavetrain does not make any sense. A photon has to pack two things: (1) the energy difference between the Bohr orbitals and (2) Planck’s constant h, which is the (physical) action associated with one cycle of an oscillation (so it’s a force over a distance (the loop or the radius – depending on the force you’re looking at) over a cycle time). See section V of the paper for how the fine-structure constant pops up here – it’s, as usual, a sort of scaling constant, but this time it scales a force. In any case, the idea is that we should think of a photon as one cycle – rather than a long wavetrain. The one cycle makes sense: when you calculate field strength and force you get quite moderate values (not the kind of black-hole energy concentrations some people suggest). It also makes sense from a logical point of view: the wavelength is something real, and so we should think of the photon amplitude (the electric field strength) as being real as well – especially when you think of how that photon is going to interact or be absorbed into another atom.

Sorry for my late reply. It’s been a while since I checked the comments. Please let me know if this makes sense. I’ll have a look at your blog in the coming days. I am working on a new paper on the anomalous magnetic moment – which is not anomalous at all if you start to think about how things might be working in reality. After many years of study, I’ve come to the conclusion that quantum mechanics is a nice way of describing things, but it doesn’t help us in terms of understanding anything. When we want to understand something, we need to push the classical framework a lot further than we currently do. In any case, that’s another discussion. OK. Now you can move on to the post itself. 🙂 Sorry if this is confusing the reader, but it is a necessary thing, I feel.

Original post:

Photons are weird. All elementary particles are weird. As Feynman puts it, in the very first paragraph of his Lectures on Quantum Mechanics: “Historically, the electron, for example, was thought to behave like a particle, and then it was found that in many respects it behaved like a wave. So it really behaves like neither. Now we have given up. We say: “It is like neither.” There is one lucky break, however – electrons behave just like light. The quantum behavior of atomic objects (electrons, protons, neutrons, photons, and so on) is the same for all, they are all “particle waves,” or whatever you want to call them. So what we learn about the properties of electrons will apply also to all “particles,” including photons of light.” (Feynman’s Lectures, Vol. III, Chapter 1, Section 1)

I wouldn’t dare to argue with Feynman, of course, but… What? Well… Photons are like electrons, and then they are not. Obviously not, I’d say. For starters, photons do not have mass or charge, and they are also bosons, i.e. ‘force-carriers’ (as opposed to matter-particles), and so they obey very different quantum-mechanical rules, which are referred to as Bose-Einstein statistics. I’ve written about that in other posts (see, for example, my post on Bose-Einstein and Fermi-Dirac statistics), so I won’t do that again here. It’s probably sufficient to remind the reader that these rules imply that the so-called Pauli exclusion principle does not apply to them: bosons like to crowd together, thereby occupying the same quantum state – unlike their counterparts, the so-called fermions or matter-particles: quarks (which make up protons and neutrons) and leptons (including electrons and neutrinos), which can’t do that: two electrons can only sit on top of each other if their spins are opposite (so that makes their quantum state different), and there’s no place whatsoever to add a third one – because there are only two possible ‘directions’ for the spin: up or down.

From all that I’ve been writing so far, I am sure you have some kind of picture of matter-particles now, and notably of the electron: it’s not really point-like, because it has a so-called scattering cross-section (I’ll say more about this later), and we can find it somewhere taking into account the Uncertainty Principle, with the probability of finding it at point x at time t given by the absolute square of a so-called ‘wave function’ Ψ(x, t).

But what about the photon? Unlike quarks or electrons, they are really point-like, aren’t they? And can we associate them with a psi function too? I mean, they have a wavelength, obviously, which is given by the Planck-Einstein energy-frequency relation: E = hν, with h the Planck constant and ν the frequency of the associated ‘light’. But an electromagnetic wave is not like a ‘probability wave’. So… Do they have a de Broglie wavelength as well?

Before answering that question, let me present that ‘picture’ of the electron once again.

The wave function for electrons

The electron ‘picture’ can be represented in a number of ways but one of the more scientifically correct ones – whatever that means – is that of a spatially confined wave function representing a complex quantity referred to as the probability amplitude. The animation below (which I took from Wikipedia) visualizes such wave functions. As mentioned above, the wave function is usually represented by the Greek letter psi (Ψ), and it is often referred to as a ‘probability wave’ – by bloggers like me, that is 🙂 – but that term is quite misleading. Why? You surely know that by now: the wave function represents a probability amplitude, not a probability. [So, to be correct, we should say a ‘probability amplitude wave’, or an ‘amplitude wave’, but so these terms are obviously too long and so they’ve been dropped and everybody talks about ‘the’ wave function now, although that’s confusing too, because an electromagnetic wave is a ‘wave function’ too, but describing ‘real’ amplitudes, not some weird complex numbers referred to as ‘probability amplitudes’.]

Having said what I’ve said above, probability amplitude and probability are obviously related: if we take the (absolute) square of the psi function – i.e. if we take the (absolute) square of all these amplitudes Ψ(x, t) – then we get the actual probability of finding that electron at point x at time t. So then we get the so-called probability density functions, which are shown on the right-hand side of the illustration above. [As for the term ‘absolute’ square, the absolute square is the squared norm of the associated ‘vector’. Indeed, you should note that the square of a complex number can be negative as evidenced, for example, by the definition of i: i² = –1. In fact, if there’s only an imaginary part, then its square is always negative. Probabilities are real numbers between 0 and 1, and so they can’t be negative, and so that’s why we always talk about the absolute square, rather than the square as such.]
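The difference between the square and the absolute square takes only a few lines to show with Python’s built-in complex numbers. The two sample amplitudes below are arbitrary values I picked for the illustration.

```python
z = 3j        # a purely imaginary sample amplitude
w = 1 + 2j    # a sample amplitude with a real and an imaginary part

square_z = z * z                     # the plain square: a negative real number here
abs_square_z = abs(z) ** 2           # the absolute square: never negative

square_w = w * w                     # the plain square is complex in general
abs_square_w = (w * w.conjugate()).real   # z times its conjugate is a real number

print(square_z, abs_square_z)
print(square_w, abs_square_w)
```

Only the absolute square can serve as a probability (density): it is real and non-negative for any amplitude.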

Below, I’ve inserted another image, which gives a static picture (i.e. one that is not varying in time) of the wave function of a real-life electron. To be precise: it’s the wave function for an electron in a 5d orbital of a hydrogen atom. You can see it’s much more complicated than those easy things above. However, the idea behind it is the same. We have a complex-valued function varying in space and in time. I took it from Wikipedia and so I’ll just copy the explanation here: “The solid body shows the places where the electron’s probability density is above a certain value (0.02), as calculated from the probability amplitude.” What about these colors? Well… The image uses the so-called HSL color system to represent complex numbers: each complex number is represented by a unique color, with a different hue (H), saturation (S) and lightness (L). Just google it if you want to know how that works exactly. OK. That should be clear enough. I wanted to talk about photons here. So let’s go for it. Well… Hmm… I realize I need to talk about some more ‘basics’ first. Sorry for that.

The Uncertainty Principle revisited (1)

The wave function is usually given as a function in space and time: Ψ = Ψ(x, t). However, I should also remind you that we have a similar function in the ‘momentum space’: if ψ is a psi function, then the function in the momentum space is a phi function, and we’ll write it as Φ = Φ(p, t). [As for the notation, x and p are written with capital letters and, hence, represent (three-dimensional) vectors. Likewise, we use a capital letter for psi and phi so we don’t confuse it with, for example, the lower-case φ (phi) representing the phase of a wave function.]

The position-space and momentum-space wave functions Ψ and Φ are related through the Uncertainty Principle. To be precise: they are Fourier transforms of each other. Huh? Don’t be put off by that statement. In fact, I shouldn’t have mentioned it, but then it’s how one can actually prove or derive the Uncertainty Principle from… Well… From ‘first principles’, let’s say, instead of just jotting it down as some God-given rule. Indeed, as Feynman puts: “The Uncertainty Principle should be seen in its historical context. If you get rid of all of the old-fashioned ideas and instead use the ideas that I’m explaining in these lectures—adding arrows for all the ways an event can happen—there is no need for an uncertainty principle!” However, I must assume you’re, just like me, not quite used to the new ideas as yet, and so let me just jot down the Uncertainty Principle once again, as some God-given rule indeed :-):

σx·σp ≥ ħ/2

This is the so-called Kennard formulation of the Principle: it measures the uncertainty about the exact position (x) as well as the momentum (p), in terms of the standard deviation (so that’s the σ (sigma) symbol) around the mean. To be precise, the assumption is that we cannot know the real x and p: we can only find some probability distribution for x and p, which is usually some nice “bell curve” in the textbooks. While the Kennard formulation is the most precise (and exact) formulation of the Uncertainty Principle (or uncertainty relation, I should say), you’ll often find ‘other’ formulations. These ‘other’ formulations usually write Δx and Δp instead of σx and σp, with the Δ symbol indicating some ‘spread’ or a similar concept: surely do not think of Δ as a differential or so! [Sorry for assuming you don’t know this (I know you do!) but I just want to make sure here!] Also, these ‘other’ formulations will usually (a) not mention the 1/2 factor, (b) replace ħ by h (ħ = h/2π, as you know, so ħ is preferred when we’re talking things like angular frequency or other stuff involving the unit circle), or (c) put an equality (=) sign in, instead of an inequality sign (≥). Niels Bohr’s early formulation of the Uncertainty Principle actually does all of that:

ΔxΔp = h

So… Well… That’s a bit sloppy, isn’t it? Maybe. In Feynman’s Lectures, you’ll find an oft-quoted ‘application’ of the Uncertainty Principle leading to a pretty accurate calculation of the typical size of an atom (the so-called Bohr radius), which Feynman starts with an equally sloppy statement of the Uncertainty Principle, so he notes: “We needn’t trust our answer to within factors like 2, π etcetera.” Frankly, I used to think that’s ugly and, hence, doubt the ‘seriousness’ of such kind of calculations. Now I know it doesn’t really matter indeed, as the essence of the relationship is clearly not a 2, π or 2π factor. The essence is the uncertainty itself: it’s very tiny (and multiplying it with 2, π or 2π doesn’t make it much bigger) but so it’s there.

In this regard, I need to remind you of how tiny that physical constant ħ actually is: about 6.58×10⁻¹⁶ eV·s. So that’s a zero followed by a decimal point and fifteen zeroes: only then we get the first significant digits (65821…). And if 10⁻¹⁶ doesn’t look tiny enough for you, then just think about how tiny the electronvolt unit is: it’s the amount of (potential) energy gained (or lost) by an electron as it moves across a potential difference of one volt (which, believe me, is nothing much really): if we’d express ħ in joule-seconds, then we’d have to add another eighteen zeroes (ħ ≈ 1.05×10⁻³⁴ J·s), because 1 eV = 1.6×10⁻¹⁹ J. As for such phenomenally small numbers, I’ll just repeat what I’ve said many times before: we just cannot imagine such small numbers. Indeed, our mind can sort of intuitively deal with addition (and, hence, subtraction), and with multiplication and division (but to some extent only), but our mind is not made to understand non-linear stuff, such as exponentials indeed. If you don’t believe me, think of the Richter scale: can you explain the difference between a 4.0 and a 5.0 earthquake? […] If the answer to that question took you more than a second… Well… I am right. 🙂 [The Richter scale is based on the base-10 exponential function: a 5.0 earthquake has a shaking amplitude that is 10 times that of an earthquake that registered 4.0, and because the energy released scales with the 3/2 power of the amplitude, that corresponds to an energy release that is about 31.6 times that of the lesser earthquake.]

A digression on units

Having said what I said above, I am well aware of the fact that saying that we cannot imagine this or that is what most people say. I am also aware of the fact that they usually say that to avoid having to explain something. So let me try to do something more worthwhile here.

1. First, I should note that ħ is so small because the second, as a unit of time, is so incredibly large. All is relative, of course. 🙂 For sure, we should express time in a more natural unit at the atomic or sub-atomic scale, like the time that’s needed for light to travel one meter. Let’s do it. Let’s express time in a unit that I shall call a ‘meter’. Of course, it’s not an actual meter (because it doesn’t measure any distance), but I don’t want to invent a new word and surely not any new symbol here. Hence, I’ll just put apostrophes before and after: so I’ll write ‘meter’ or ‘m’. One second then corresponds to 3×10⁸ ‘meter’, so when adopting the ‘meter’ as a unit of time, we get a value for ‘ħ’ that is equal to (6.6×10⁻¹⁶ eV·s)(3×10⁸ ‘meter’/second) ≈ 2×10⁻⁷ eV·‘m’. Now, 2×10⁻⁷ is a number that is still too tiny to imagine. But then our ‘meter’ is still a rather huge unit at the atomic scale: we should take the ‘millimicron’, aka the ‘nanometer’ (1 nm = 1×10⁻⁹ m), or – even better because more appropriate – the ‘angstrom’: 1 Å = 0.1 nm = 1×10⁻¹⁰ m. Indeed, the smallest atom (hydrogen) has a radius of 0.25 Å, while larger atoms will have a radius of about 1 or more Å. Now that should work, shouldn’t it? You’re right, we get a value for ‘ħ’ equal to (6.6×10⁻¹⁶ eV·s)(3×10⁸ ‘m’/s)(1×10¹⁰ ‘Å’/m) ≈ 2,000 eV·‘Å’, or about 200 eV·‘nm’. So… What? Well… If anything, it shows ħ is not a small number at the atomic or sub-atomic level! Hence, we actually can start imagining how things work at the atomic level when using more adequate units.
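The unit conversion is easy to get wrong by a factor of c, so here is a small Python sketch of it. The rounded constants are standard values; the variable names are mine. The key step is that one second corresponds to about 3×10⁸ light-‘meters’, so we multiply ħ by c when we switch the time unit.

```python
# Reduced Planck constant in different units (rounded standard values).
HBAR_EV_S = 6.582e-16       # eV·s
C_M_PER_S = 2.998e8         # m/s: one second equals this many light-'meters'
EV_IN_JOULE = 1.602e-19     # J per eV

hbar_joule_s = HBAR_EV_S * EV_IN_JOULE          # back to SI: J·s
hbar_ev_lightmeter = HBAR_EV_S * C_M_PER_S      # time measured in 'meters'
hbar_ev_angstrom = hbar_ev_lightmeter * 1e10    # 1 m = 1e10 angstrom
hbar_ev_nanometer = hbar_ev_lightmeter * 1e9    # 1 m = 1e9 nanometer

print(f"hbar = {hbar_joule_s:.3e} J·s")
print(f"hbar = {hbar_ev_lightmeter:.3e} eV·'m'")
print(f"hbar = {hbar_ev_angstrom:.0f} eV·'Å'")
print(f"hbar = {hbar_ev_nanometer:.0f} eV·'nm'")
```

In these units ħ is a number of the order of a thousand (in eV·‘Å’), not something with thirty-odd zeroes after the decimal point.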

[Now, just to test your knowledge, let me ask you: what’s the wavelength of visible light in angstrom? […] Well? […] Let me tell you: 400 to 700 nm is 4000 to 7000 Å. In other words, the wavelength of visible light is quite sizable as compared to the size of atoms or electron orbits!]

2. Secondly, let’s do a quick dimension analysis of that ΔxΔp = h relation and/or its more accurate expression σx·σp ≥ ħ/2.

A position (and its uncertainty or standard deviation) is expressed in distance units, while momentum… Euh… Well… What? […] Momentum is mass times velocity, so it’s kg·m/s. Hence, the dimension of the product on the left-hand side of the inequality is m·kg·m/s = kg·m²/s. So what about this eV·s dimension on the right-hand side? Well… The electronvolt is a unit of energy, and so we can convert it to joules. Now, a joule is a newton-meter (N·m), which is the unit for both energy and work: it’s the work done when applying a force of one newton over a distance of one meter. So we now have N·m·s for ħ, which is nice, because Planck’s constant (h or ħ, whatever: the choice for one of the two depends on the variables we’re looking at) is the quantum for action indeed. It’s a Wirkung as they say in German, so its dimension combines both energy as well as time.

To put it simply, it’s a bit like power, which is what we men are interested in when looking at a car or motorbike engine. 🙂 Power is the energy spent or delivered per second, so its dimension is J/s, not J·s. However, your mind can see the similarity in thinking here. Energy is a nice concept, be it potential (think of a water bucket above your head) or kinetic (think of a punch in a bar fight), but it makes more  sense to us when adding the dimension of time (emptying a bucket of water over your head is different than walking in the rain, and the impact of a punch depends on the power with which it is being delivered). In fact, the best way to understand the dimension of Planck’s constant is probably to also write the joule in ‘base units’. Again, one joule is the amount of energy we need to move an object over a distance of one meter against a force of one newton. So one J·s is one N·m·s is (1) a force of one newton acting over a distance of (2) one meter over a time period equal to (3) one second.

I hope that gives you a better idea of what ‘action’ really is in physics. […] In any case, we haven’t answered the question. How do we relate the two sides? Simple: a newton is an oft-used SI unit, but it’s not an SI base unit, and so we should deconstruct it even more (i.e. write it in SI base units). If we do that, we get 1 N = 1 kg·m/s²: one newton is the force needed to give a mass of 1 kg an acceleration of 1 m/s per second. So just substitute and you’ll see the dimension on the right-hand side is kg·(m/s²)·m·s = kg·m²/s, so it comes out alright.
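The bookkeeping above can be done mechanically. Here is a minimal Python sketch that represents a dimension as a tuple of exponents of the SI base units (kg, m, s). The helper name `multiply` is something I made up for this illustration.

```python
# A dimension is a tuple of exponents for the SI base units (kg, m, s).
def multiply(*dims):
    """Multiplying quantities means adding their base-unit exponents."""
    return tuple(sum(exponents) for exponents in zip(*dims))

KILOGRAM = (1, 0, 0)
METER = (0, 1, 0)
SECOND = (0, 0, 1)
PER_SECOND = (0, 0, -1)

velocity = multiply(METER, PER_SECOND)          # m/s
momentum = multiply(KILOGRAM, velocity)         # kg·m/s
acceleration = multiply(velocity, PER_SECOND)   # m/s²
newton = multiply(KILOGRAM, acceleration)       # kg·m/s²
joule = multiply(newton, METER)                 # N·m

left_side = multiply(METER, momentum)    # dimension of Δx·Δp
right_side = multiply(joule, SECOND)     # dimension of ħ, i.e. J·s

print("Δx·Δp:", left_side)
print("J·s:  ", right_side)
```

Both sides reduce to the same exponents, which is just another way of writing kg·m²/s.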

Why this digression on units? Not sure. Perhaps just to remind you also that the Uncertainty Principle can also be expressed in terms of energy and time:

ΔE·Δt = h

Here there’s no confusion in regard to the units on both sides: we don’t need to convert to SI base units to see that they’re the same: [ΔE][Δt] = J·s.

The Uncertainty Principle revisited (2)

The ΔE·Δt = h expression is not so often used as an expression of the Uncertainty Principle. I am not sure why, and I don’t think it’s a good thing. Energy and time are also complementary variables in quantum mechanics, so it’s just like position and momentum indeed. In fact, I like the energy-time expression somewhat more than the position-momentum expression because it does not create any confusion in regard to the units on both sides: it’s just joules (or electronvolts) and seconds on both sides of the equation. So what?

Frankly, I don’t want to digress too much here (this post is going to become awfully long) but, personally, I found it hard, for quite a while, to relate the two expressions of the very same uncertainty ‘principle’ and, hence, let me show you how the two express the same thing really, especially because you may or may not know that there are even more pairs of complementary variables in quantum mechanics. So, I don’t know if the following will help you a lot, but it helped me to note that:

1. The energy and momentum of a particle are intimately related through the (relativistic) energy-momentum relationship. Now, that formula, E² = p²c² + m0²c⁴, which links energy, momentum and intrinsic mass (aka rest mass), looks quite monstrous at first. Hence, you may prefer a simpler form: pc = Ev/c. It’s the same really as both are based on the relativistic mass-energy equivalence: E = mc² or, the way I prefer to write it: m = E/c². [Both expressions are the same, obviously, but we can ‘read’ them differently: m = E/c² expresses the idea that energy has an equivalent mass, defined as inertia, and so it makes energy the primordial concept, rather than mass.] Of course, you should note that m is the total mass of the object here, including both (a) its rest mass as well as (b) the equivalent mass it gets from moving at the speed v. So m, not m0, is the concept of mass used to define p, and note how easy it is to demonstrate the equivalence of both formulas: pc = Ev/c ⇔ mvc = Ev/c ⇔ E = mc². In any case, the bottom line is: don’t think of the energy and momentum of a particle as two separate things; they are two aspects of the same ‘reality’, involving mass (a measure of inertia, as you know) and velocity (as measured in a particular (so-called inertial) reference frame).
2. Time and space are intimately related through the universal constant c, i.e. the speed of light, as evidenced by the fact that we will often want to express distance not in meter but in light-seconds (i.e. the distance that light travels (in a vacuum) in one second) or, vice versa, express time in meter (i.e. the time that light needs to travel a distance of one meter).
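A quick numerical check may help to convince you that the ‘monstrous’ formula and the simpler pc = Ev/c form say the same thing. The Python sketch below assumes an electron moving at 0.6c: that speed is an arbitrary choice of mine, and the function name is made up for the occasion.

```python
import math

C = 299_792_458.0                # speed of light in m/s
ELECTRON_REST_MASS = 9.109e-31   # m0 in kg

def energy_and_momentum(rest_mass, v):
    """Return (E, p) using the total (relativistic) mass m, not m0."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    total_mass = gamma * rest_mass
    return total_mass * C ** 2, total_mass * v   # E = m·c², p = m·v

v = 0.6 * C                      # illustrative speed
E, p = energy_and_momentum(ELECTRON_REST_MASS, v)

# Form 1: E² = p²c² + m0²c⁴
lhs = E ** 2
rhs = (p * C) ** 2 + (ELECTRON_REST_MASS * C ** 2) ** 2

# Form 2: pc = E·v/c
pc = p * C
ev_over_c = E * v / C

print("form 1 relative mismatch:", abs(lhs - rhs) / lhs)
print("form 2 relative mismatch:", abs(pc - ev_over_c) / pc)
```

Both mismatches should be at the level of floating-point rounding only. Note the plus sign in front of the rest-mass term: with a minus sign, form 1 would not check out.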

These relationships are interconnected, and the following diagram shows how. The easiest way to remember it all is to apply the Uncertainty Principle, in both its ΔE·Δt = h as well as its Δp·Δx = h expressions, to a photon. A photon has no rest mass and its velocity v is, obviously, c. So the energy-momentum relationship is a very simple one: p = E/c. We then get both expressions of the Uncertainty Principle by simply substituting E for p, or vice versa, and remember that time and position (or distance) are related in exactly the same way: the constant of proportionality is the very same. It’s c. So we can write: Δx = Δt·c and Δt = Δx/c. If you’re confused, think about it in very practical terms: because the speed of light is what it is, an uncertainty of a second in time amounts, roughly, to an uncertainty in position of some 300,000 km (c = 3×10⁸ m/s). Conversely, an uncertainty of some 300,000 km in the position amounts to an uncertainty in time of one second. That’s what the 1-2-3 in the diagram above is all about: please check if you ‘get’ it, because that’s ‘essential’ indeed.

Back to ‘probability waves’

Matter-particles are not the same, but we do have the same relations, including that ‘energy-momentum duality’. The formulas are just somewhat more complicated because they involve mass and velocity (i.e. a velocity less than that of light). For matter-particles, we can see that energy-momentum duality not only in the relationships expressed above (notably the relativistic energy-momentum relation), but also in the (in)famous de Broglie relation, which associates some ‘frequency’ (f) to the energy (E) of a particle or, what amounts to the same, some ‘wavelength’ (λ) to its momentum (p):

λ = h/p and f = E/h

These two complementary equations give a ‘wavelength’ (λ) and/or a ‘frequency’ (f) of a de Broglie wave, or a ‘matter wave’ as it’s sometimes referred to. I am using, once again, apostrophes because the de Broglie wavelength and frequency are a different concept—different than the wavelength or frequency of light, or of any other ‘real’ wave (like water or sound waves, for example). To illustrate the differences, let’s start with a very simple question: what’s the velocity of a de Broglie wave? Well… […] So? You thought you knew, didn’t you?

1. The mathematically (and physically) correct answer involves distinguishing the group and phase velocity of a wave.
2. The ‘easy’ answer is: the de Broglie wave of a particle moves with the particle and, hence, its velocity is, obviously, the speed of the particle which, for electrons, is usually non-relativistic (i.e. rather slow as compared to the speed of light).
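To make the difference between the two answers concrete, here is a Python sketch for an electron moving at 10⁶ m/s (an illustrative speed I picked). It assumes that the E in f = E/h is the total relativistic energy, rest energy included: under that assumption the product f·λ is the phase velocity, while the group velocity is the derivative dE/dp, which I estimate numerically.

```python
import math

H = 6.626e-34            # Planck constant, J·s
C = 299_792_458.0        # speed of light, m/s
M0 = 9.109e-31           # electron rest mass, kg

def energy(p):
    """Total relativistic energy for a given momentum p."""
    return math.sqrt((p * C) ** 2 + (M0 * C ** 2) ** 2)

v = 1.0e6                                    # particle speed, m/s (illustrative)
gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
p = gamma * M0 * v                           # relativistic momentum

wavelength = H / p                           # λ = h/p
frequency = energy(p) / H                    # f = E/h
phase_velocity = frequency * wavelength      # f·λ = E/p

dp = p * 1e-3                                # small step for the derivative
group_velocity = (energy(p + dp) - energy(p - dp)) / (2 * dp)   # dE/dp

print(f"wavelength:     {wavelength:.3e} m")
print(f"phase velocity: {phase_velocity:.3e} m/s")
print(f"group velocity: {group_velocity:.3e} m/s")
```

Under these assumptions it is the group velocity that matches the particle speed, while f·λ comes out as c²/v, which is larger than c. That is why simply writing f·λ = v with v the classical velocity does not work.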

To be clear on this, the velocity of a de Broglie wave is not the speed of light. So a de Broglie wave is not like an electromagnetic wave at all. They have nothing in common really, except for the fact that we refer to both of them as ‘waves’. 🙂

The second thing to note is that, when we’re talking about the ‘frequency’ or ‘wavelength’ of ‘matter waves’ (i.e. de Broglie waves), we’re talking the frequency and wavelength of a wave with two components: it’s a complex-valued wave function, indeed, and so we get a real and imaginary part when we’re ‘feeding’ the function with some values for x and t.

Thirdly and, perhaps, most importantly, we should always remember the Uncertainty Principle when looking at the de Broglie relation. The Uncertainty Principle implies that we can actually not assign any precise wavelength (or, what amounts to the same, a precise frequency) to a de Broglie wave: if there is a spread in p (and, hence, in E), then there will be a spread in λ (and in f). In fact, I tend to think that it would be better to write the de Broglie relation as an ‘uncertainty relation’ in its own right:

Δλ = Δ(h/p) = h/Δp and Δf = Δ(E/h) = ΔE/h

Besides underscoring the fact that we have other ‘pairs’ of complementary variables, this ‘version’ of the de Broglie equation would also remind us continually of the fact that a ‘regular’ wave with an exact wavelength (so a Δλ equal to zero) would not give us any information about the momentum. Indeed, as Δλ goes to zero (Δλ → 0), Δp must go to infinity (Δp → ∞). [Note that the frequency relation works the other way: because Δf = ΔE/h, a sharper frequency goes with a sharper energy.] That’s just the math involved in such expressions. 🙂

Jokes aside, I’ll admit I used to have a lot of trouble understanding this, so I’ll just quote the expert teacher (Feynman) on this to make sure you don’t get me wrong here:

“The amplitude to find a particle at a place can, in some circumstances, vary in space and time, let us say in one dimension, in this manner: Ψ = A·e^(i(ωt – kx)), where ω is the frequency, which is related to the classical idea of the energy through E = ħω, and k is the wave number, which is related to the momentum through p = ħk. [These are equivalent formulations of the de Broglie relations using the angular frequency and the wave number instead of wavelength and frequency.] We would say the particle had a definite momentum p if the wave number were exactly k, that is, a perfect wave which goes on with the same amplitude everywhere. The Ψ = A·e^(i(ωt – kx)) equation [then] gives the [complex-valued probability] amplitude, and if we take the absolute square, we get the relative probability for finding the particle as a function of position and time. This is a constant, which means that the probability to find a [this] particle is the same anywhere.” (Feynman’s Lectures, I-48-5)

You may say or think: What’s the problem here really? Well… If the probability to find a particle is the same anywhere, then the particle can be anywhere and, for all practical purposes, that amounts to saying it’s nowhere really. Hence, that wave function doesn’t serve the purpose. In short, that nice Ψ = A·e^(i(ωt – kx)) function is completely useless in terms of representing an electron, or any other actual particle moving through space. So what to do?
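You can verify Feynman’s point in a few lines of Python. The values of A, ω and k below are arbitrary numbers in arbitrary units, chosen by me just to have something to feed into the function.

```python
import cmath

A = 0.5        # amplitude (arbitrary)
OMEGA = 3.0    # angular frequency (arbitrary units)
K = 2.0        # wave number (arbitrary units)

def psi(x, t):
    """Plane wave A·exp(i(ωt − kx))."""
    return A * cmath.exp(1j * (OMEGA * t - K * x))

# Absolute square at a few widely separated positions, at one fixed time.
positions = (-10.0, -1.0, 0.0, 2.5, 40.0)
probabilities = [abs(psi(x, 0.7)) ** 2 for x in positions]

for x, prob in zip(positions, probabilities):
    print(f"x = {x:6.1f}   |psi|^2 = {prob:.6f}")
```

The real and imaginary parts of psi oscillate with x, but the absolute square is A² at every position, so this wave tells us nothing about where the particle is.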

The Wikipedia article on the Uncertainty Principle has this wonderful animation that shows how we can superimpose several waves, one on top of the other, to form a wave packet. Let me copy it below. So that’s the wave we want indeed: a wave packet that travels through space but which is, at the same time, limited in space. Of course, you should note, once again, that it shows only one part of the complex-valued probability amplitude: just visualize the other part (imaginary if the wave above would happen to represent the real part, and vice versa if the wave would happen to represent the imaginary part of the probability amplitude). The animation basically illustrates a mathematical operation. To be precise, it involves a Fourier analysis or decomposition: it separates a wave packet into a finite or (potentially) infinite number of component waves. Indeed, note how, in the illustration above, the frequency of the component waves gradually increases (or, what amounts to the same, how the wavelength gets smaller and smaller) and how, with every wave we ‘add’ to the packet, it becomes increasingly localized. Now, you can easily see that the ‘uncertainty’ or ‘spread’ in the wavelength here (which we’ll denote by Δλ) is, quite simply, the difference between the wavelength of the ‘one-cycle wave’, which is equal to the space the whole wave packet occupies (which we’ll denote by Δx), and the wavelength of the ‘highest-frequency wave’. For all practical purposes, they are about the same, so we can write: Δx ≈ Δλ. Using Bohr’s formulation of the Uncertainty Principle, we can see the expression I used above (Δλ = h/Δp) makes sense: Δx = Δλ = h/Δp, so ΔλΔp = h.
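Here is a rough Python sketch of the same idea: add up plane waves and watch the packet localize. It is not a reconstruction of the Wikipedia animation. I am assuming Gaussian weights centred on a wave number k0, cut off at four standard deviations, and the function names are made up. The point is only that a larger spread in k gives a narrower packet in x.

```python
import cmath
import math

def packet(x, k0, sigma_k, n_waves=201):
    """Sum n_waves plane waves exp(ikx) with Gaussian weights centred on k0."""
    total = 0j
    for j in range(n_waves):
        # Wave numbers evenly spaced within 4·sigma_k of k0 (arbitrary cut-off).
        k = k0 + sigma_k * (-4.0 + 8.0 * j / (n_waves - 1))
        weight = math.exp(-((k - k0) ** 2) / (2.0 * sigma_k ** 2))
        total += weight * cmath.exp(1j * k * x)
    return total

def half_width(k0, sigma_k, step=0.01):
    """Distance from the centre where |psi|^2 first drops to half its peak."""
    peak = abs(packet(0.0, k0, sigma_k)) ** 2
    x = 0.0
    while abs(packet(x, k0, sigma_k)) ** 2 > 0.5 * peak:
        x += step
    return x

wide = half_width(k0=10.0, sigma_k=1.0)     # small spread in k
narrow = half_width(k0=10.0, sigma_k=2.0)   # twice the spread in k

print("half-width with sigma_k = 1:", round(wide, 2))
print("half-width with sigma_k = 2:", round(narrow, 2))
```

Doubling the spread in wave number roughly halves the width of the packet, so the product of the two spreads stays about the same. That is the trade-off the Uncertainty Principle expresses.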

[Just to be 100% clear on terminology: a Fourier decomposition is not the same as that Fourier transform I mentioned when talking about the relation between position and momentum in the Kennard formulation of the Uncertainty Principle, although these two mathematical concepts obviously have a few things in common.]

The wave train revisited

All that I’ve said above is the ‘correct’ interpretation of the Uncertainty Principle and the de Broglie equation. To be frank, it took me quite a while to ‘get’ that and, as you can see, it also took me quite a while to get ‘here’, of course. 🙂

In fact, I was confused, for quite a few years actually, because I never quite understood why there had to be a spread in the wavelength of a wave train. Indeed, we can all easily imagine a localized wave train with a fixed frequency and a fixed wavelength, like the one below, which I’ll re-use later. I’ve made this wave train myself: it’s a standard sine and cosine function multiplied with an ‘envelope’ function generating the envelope. As you can see, it’s a complex-valued thing indeed: the blue curve is the real part, and the imaginary part is the red curve. You can easily make a graph like this yourself. [Just use one of those online graph tools.] This thing is localized in space and, as mentioned above, it has a fixed frequency and wavelength. So all those enigmatic statements you’ll find in serious or less serious books (i.e. textbooks or popular accounts) on quantum mechanics saying that “we cannot define a unique wavelength for a short wave train” and/or saying that “there is an indefiniteness in the wave number that is related to the finite length of the train, and thus there is an indefiniteness in the momentum” (I am quoting Feynman here, so not one of the lesser gods) are – with all due respect for these authors, especially Feynman – just wrong. I’ve made another ‘short wave train’ below, but this time it depicts the real part of a (possible) wave function only. Hmm… Now that one has a weird shape, you’ll say. It doesn’t look like a ‘matter wave’! Well… You’re right. Perhaps. [I’ll challenge you in a moment.] The shape of the function above is consistent, though, with the view of a photon as a transient electromagnetic oscillation. Let me come straight to the point by stating the basics: the view of a photon in physics is that photons are emitted by atomic oscillators.
As an electron jumps from one energy level to the other, it seems to oscillate back and forth until it’s in equilibrium again, thereby emitting an electromagnetic wave train that looks like a transient.

Huh? What’s a transient? It’s an oscillation like the one above: its amplitude and, hence, its energy, gets smaller and smaller as time goes by. To be precise, its energy level has the same shape as the envelope curve below: E = E0·e^(–t/τ). In this expression, we have τ as the so-called decay time, and one can show it’s the inverse of the so-called decay rate: τ = 1/γ with γE = –dE/dt. In case you wonder, check it out on Wikipedia: it’s one of the many applications of the natural exponential function: we’re talking about a so-called exponential decay here indeed, which involves a quantity (in this case, the amplitude and/or the energy) decreasing at a rate that is proportional to its current value, with the coefficient of proportionality being γ. So we write that as γE = –dE/dt in mathematical notation. 🙂 I need to move on. All of what I wrote above was ‘plain physics’, but what I really want to explore in this post is a crazy hypothesis. Could these wave trains above – I mean the wave trains with the fixed frequency and wavelength – possibly represent a de Broglie wave for a photon?
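The relation between the envelope and the decay rate is easy to check numerically. In the Python sketch below, E0 and τ are arbitrary values in arbitrary units, and the derivative is estimated with a simple central difference.

```python
import math

E0 = 1.0             # initial energy (arbitrary units)
TAU = 2.0            # decay time (arbitrary units)
GAMMA = 1.0 / TAU    # decay rate, γ = 1/τ

def energy(t):
    """Exponential decay: E = E0·exp(−t/τ)."""
    return E0 * math.exp(-t / TAU)

t = 1.5
dt = 1e-6
dE_dt = (energy(t + dt) - energy(t - dt)) / (2 * dt)   # numerical derivative

print("gamma·E:", GAMMA * energy(t))
print("-dE/dt: ", -dE_dt)
print("E(tau)/E0:", energy(TAU) / E0)   # the energy has dropped to 1/e after one decay time
```

The first two numbers agree to within the accuracy of the numerical derivative, which is what γE = –dE/dt says.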

You’ll say: of course not! But, let’s be honest, you’d have some trouble explaining why. The best answer you could probably come up with is: because no physics textbook says something like that. You’re right. It’s a crazy hypothesis because, when you ask a physicist (believe it or not, but I actually went through the trouble of asking two nuclear scientists), they’ll tell you that photons are not to be associated with de Broglie waves. [You’ll say: why didn’t you try looking for an answer on the Internet? I actually did but – unlike what I am used to – I got very confusing answers on this one, so I gave up trying to find some definite answer on this question on the Internet.]

However, these negative answers don’t discourage me from trying to do some more freewheeling. Before discussing whether or not the idea of a de Broglie wave for a photon makes sense, let’s think about mathematical constraints. I googled a bit but I only see one actually: the amplitudes of a de Broglie wave are subject to a normalization condition. Indeed, when everything is said and done, all probabilities must take a value between 0 and 1, and they must also all add up to exactly 1. So that’s a so-called normalization condition that obviously imposes some constraints on the (complex-valued) probability amplitudes of our wave function.
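For a wave function sampled at just a few points, the normalization condition looks as follows in Python. The three raw amplitudes are made-up numbers: the only point is the rescaling step.

```python
import math

# Made-up complex amplitudes at three positions (illustration only).
raw_amplitudes = [complex(1, 1), complex(0, 2), complex(-1, 0)]

# Rescale so that the absolute squares add up to exactly one.
norm = math.sqrt(sum(abs(a) ** 2 for a in raw_amplitudes))
amplitudes = [a / norm for a in raw_amplitudes]

probabilities = [abs(a) ** 2 for a in amplitudes]

print("probabilities:", probabilities)
print("sum:", sum(probabilities))
```

Each probability ends up between 0 and 1, and the relative phases of the amplitudes are untouched by the rescaling.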

But let’s get back to the photon. Let me remind you of what happens when a photon is being emitted by inserting the two diagrams below, which give the energy levels of the atomic orbitals of electrons. So an electron absorbs or emits a photon when it goes from one energy level to the other, so it absorbs or emits radiation. And, of course, you will also remember that the frequency of the absorbed or emitted light is related to those energy levels. More specifically, the frequency of the light emitted in a transition from, let’s say, energy level E3 to E1 will be written as ν31 = (E3 – E1)/h. This frequency will be one of the so-called characteristic frequencies of the atom and will define a specific so-called spectral emission line.
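As a worked example of the ν31 formula, here is a Python sketch for hydrogen. Note the assumption: I am using the Bohr-model energy levels, En = –13.6 eV/n², which the text above does not spell out, so take the numbers as an illustration rather than as a precise spectroscopic result.

```python
H_EV_S = 4.136e-15        # Planck constant in eV·s
C = 2.998e8               # speed of light in m/s
RYDBERG_EV = 13.6         # hydrogen ground-state binding energy (Bohr model)

def level(n):
    """Bohr-model energy of level n for hydrogen, in eV (an assumption here)."""
    return -RYDBERG_EV / n ** 2

nu_31 = (level(3) - level(1)) / H_EV_S    # ν31 = (E3 − E1)/h
wavelength_31 = C / nu_31                 # wavelength of the emitted light

print(f"E3 - E1    = {level(3) - level(1):.3f} eV")
print(f"nu_31      = {nu_31:.3e} Hz")
print(f"wavelength = {wavelength_31 * 1e9:.1f} nm")
```

The resulting wavelength lies in the ultraviolet, so this particular line is not one you can see with the naked eye.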

Now, from a mathematical point of view, there’s no difference between that ν31 = (E3 – E1)/h equation and the de Broglie equation, f = E/h, which assigns a de Broglie wave to a particle. But, of course, from all that I wrote above, it’s obvious that, while these two formulas are the same from a math point of view, they represent very different things. Again, let me repeat what I said above: a de Broglie wave is a matter-wave and, as such, it has nothing to do with an electromagnetic wave.

Let me be even more explicit. A de Broglie wave is not a ‘real’ wave, in a sense (but, of course, that’s a very unscientific statement to make); it’s a psi function, so it represents these weird mathematical quantities–complex probability amplitudes–which allow us to calculate the probability of finding the particle at position x or, if it’s a wave function for the momentum-space, to find a value p for its momentum. In contrast, a photon that’s emitted or absorbed represents a ‘real’ disturbance of the electromagnetic field propagating through space. Hence, that frequency ν is something very different than f, which is why we use another symbol for it (ν is the Greek letter nu, not to be confused with the v symbol we use for velocity). [Of course, you may wonder how ‘real’ or ‘unreal’ an electromagnetic field is but, in the context of this discussion, let me assure you we should look at it as something that’s very real.]

That being said, we also know light is emitted in discrete energy packets: in fact, that’s how photons were defined originally, first by Planck and then by Einstein. Now, when an electron falls from one energy level in an atom to another (lower) energy level, it emits one – and only one – photon with that particular wavelength and energy. The question then is: how should we picture that photon? Does it also have some more or less defined position in space, and some momentum? The answer is definitely yes, on both accounts:

1. Subject to the constraints of the Uncertainty Principle, we know, more or less indeed, when a photon leaves a source and when it hits some detector. [And, yes, due to the ‘Uncertainty Principle’ or, as Feynman puts it, the rules for adding arrows, it may not travel in a straight line and/or at the speed of light—but that’s a discussion that, believe it or not, is not directly relevant here. If you want to know more about it, check one or more of my posts on it.]
2. We also know light has a very definite momentum, which I’ve calculated elsewhere and so I’ll just note the result: p = E/c. It’s a ‘pushing momentum’ referred to as radiation pressure, and it’s in the direction of travel indeed.

In short, it does make sense, in my humble opinion that is, to associate some wave function with the photon, and then I mean a de Broglie wave. Just think about it yourself. You’re right to say that a de Broglie wave is a ‘matter wave’, and photons aren’t matter but, having said that, photons do behave like electrons, don’t they? There’s diffraction (when you send a photon through one slit) and interference (when photons go through two slits, all together or – amazingly – one by one), so it’s the same weirdness as electrons indeed, and so why wouldn’t we associate some kind of wave function with them?

You can react in one of three ways here. The first reaction is: “Well… I don’t know. You tell me.” Well… That’s what I am trying to do here. 🙂

The second reaction may be somewhat more to the point. For example, those who’ve read Feynman’s Strange Theory of Light and Matter, could say: “Of course, why not? That’s what we do when we associate a photon going from point A to B with an amplitude P(A to B), isn’t it?”

Well… No. I am talking about something else here. Not some amplitude associated with a path in spacetime, but a wave function giving an approximate position of the photon.

The third reaction may be the same as the reaction of those two nuclear scientists I asked: “No. It doesn’t make sense. We do not associate photons with a de Broglie wave.” But so they didn’t tell me why because… Well… They didn’t have the time to entertain a guy like me and so I didn’t dare to push the question and continued to explore it more in detail myself.

So I’ve done that, and I thought of one reason why the question, perhaps, may not make all that much sense: a photon travels at the speed of light; therefore, it has no length. Hence, doing what I am doing below, and that’s to associate the electromagnetic transient with a de Broglie wave might not make sense.

Maybe. I’ll let you judge. Before developing the point, I’ll raise two objections to the ‘objection’ raised above (i.e. the statement that a photon has no length). First, if we’re looking at the photon as some particle, it will obviously have no length. However, an electromagnetic transient is just what it is: an electromagnetic transient. I’ve seen nothing that makes me think its length should be zero. In fact, if that would be the case, the concept of an electromagnetic wave itself would not make sense, as its ‘length’ would always be zero. Second, even if – somehow – the length of the electromagnetic transient would be reduced to zero because of its speed, we can still imagine that we’re looking at the emission of an electromagnetic pulse (i.e. a photon) using the reference frame of the photon, so that we’re traveling at speed c, ‘riding’ with the photon, so to say, as it’s being emitted. Then we would ‘see’ the electromagnetic transient as it’s being radiated into space, wouldn’t we?

Perhaps. I actually don’t know. That’s why I wrote this post and hope someone will react to it. I really don’t know, so I thought it would be nice to just freewheel a bit on this question. So be warned: nothing of what I write below has been researched really, so critical comments and corrections from actual specialists are more than welcome.

The shape of a photon wave

As mentioned above, the answer in regard to the definition of a photon’s position and momentum is, obviously, unambiguous. Perhaps we have to stretch whatever we understand of Einstein’s (special) relativity theory, but we should be able to draw some conclusions, I feel.

Let me say one thing more about the momentum here. As said, I’ll refer you to one of my posts for the detail but all you should know here is that the momentum of light is related to the magnetic field vector, which we usually never mention when discussing light because it’s so tiny as compared to the electric field vector in our inertial frame of reference. Indeed, the magnitude of the magnetic field vector is equal to the magnitude of the electric field vector divided by c = 3×10⁸ m/s, so we write B = E/c. Now, the E here stands for the electric field, so let me use W to refer to the energy instead of E. Using the B = E/c equation and a fairly straightforward calculation of the work that can be done by the associated force on a charge that’s being put into this field, we get that famous equation which we mentioned above already: the momentum of a photon is its total energy divided by c, so we write p = W/c. You’ll say: so what? Well… Nothing. I just wanted to note we get the same p = W/c equation indeed, but from a very different angle of analysis here. We didn’t use the energy-momentum relation here at all! In any case, the point to note is that the momentum of a photon is only a tiny fraction of its energy (p = W/c), and that the associated magnetic field vector is also just a tiny fraction of the electric field vector (B = E/c).

But so it’s there and, in fact, when adopting a moving reference frame, the mix of E and B (i.e. the electric and magnetic field) becomes an entirely different one. One of the ‘gems’ in Feynman’s Lectures is the exposé on the relativity of electric and magnetic fields indeed, in which he analyzes the electric and magnetic field caused by a current, and in which he shows that, if we switch our inertial reference frame for that of the moving electrons in the wire, the ‘magnetic’ field disappears, and the whole electromagnetic effect becomes ‘electric’ indeed.

I am just noting this because I know I should do a similar analysis for the E and B ‘mixture’ involved in the electromagnetic transient that’s being emitted by our atomic oscillator. However, I’ll admit I am not quite comfortable enough with the physics nor the math involved to do that, so… Well… Please do bear this in mind as I will be jotting down some quite speculative thoughts in what follows.

So… A photon is, in essence, an electromagnetic disturbance and so, when trying to picture a photon, we can think of some oscillating electric field vector traveling through–and also limited in–space. [Note that I am leaving the magnetic field vector out of the analysis from the start, which is not ‘nice’ but, in light of that B = E/c relationship, I’ll assume it’s acceptable.] In short, in the classical world – and in the classical world only of course – a photon must be some electromagnetic wave train, like the one below–perhaps.

But why would it have that shape? I only suggested it because it has the same shape as Feynman’s representation of a particle (see below) as a ‘probability wave’ traveling through–and limited in–space.

So, what about it? Let me first remind you once again (I just can’t stress this point enough it seems) that Feynman’s representation – and most are based on his, it seems – is misleading because it suggests that ψ(x) is some real number. It’s not. In the image above, the vertical axis should not represent some real number (and it surely should not represent a probability, i.e. some real positive number between 0 and 1) but a probability amplitude, i.e. a complex number in which both the real and imaginary part are important. Just to be fully complete (in case you forgot), such a complex-valued wave function ψ(x) will give you all the probabilities you need when you take its (absolute) square, but so… Well… We’re really talking a different animal here, and the image above gives you only one part of the complex-valued wave function (either the real or the imaginary part), while it should give you both.

That’s why I find my graph below much better. 🙂 It’s the same really, but it shows both the real as well as the imaginary part of a wave function.

But let me go back to the first illustration: the vertical axis of the first illustration is not ψ but E – the electric field vector. So there’s no imaginary part here: just a real number, representing the strength–or magnitude I should say–of the electric field E as a function of the space coordinate x. [Can magnitudes be negative? The honest answer is: no, they can’t. But just think of it as representing the field vector pointing the other way.]

Regardless of the shortcomings of this graph, including the fact we only have some real-valued oscillation here, would it work as a ‘suggestion’ of what a real-life photon could look like?

Of course, you could try to not answer that question by mumbling something like: “Well… It surely doesn’t represent anything coming near to a photon in quantum mechanics.” But… Well… That’s not my question here: I am asking you to be creative and ‘think outside of the box’, so to say. 🙂

So you should say ‘No!’ because of some other reason. What reason? Well… If a photon is an electromagnetic transient – in other words, if we adopt a purely classical point of view – it’s going to be a transient wave indeed, and so then it should walk, talk and even look like a transient. 🙂 Let me quickly jot down the formula for the (vertical) component of E as a function of the acceleration of some charge q: E(t) = –q·a(t – r/c)/(4πε₀c²r).

The charge q (i.e. the source of the radiation) is, of course, our electron that’s emitting the photon as it jumps from a higher to a lower energy level (or, vice versa, absorbing it). This formula basically states that the magnitude of the electric field (E) is proportional to the acceleration (a) of the charge (with t – r/c the retarded argument). Hence, the suggested shape of E as a function of x as shown above would imply that the acceleration of the electron is (a) initially quite small, (b) then becomes larger and larger to reach some maximum, and then (c) becomes smaller and smaller again to then die down completely. In short, it does match the definition of a transient wave sensu stricto (Wikipedia defines a transient as “a short-lived burst of energy in a system caused by a sudden change of state”) but it’s not likely to represent any real transient. So, we can’t exclude it, but a real transient is much more likely to look like what’s depicted below: no gradual increase in amplitude but big swings initially which then dampen to zero.

In other words, if our photon is a transient electromagnetic disturbance caused by a ‘sudden burst of energy’ (which is what that electron jump is, I would think), then its representation will, much more likely, resemble a damped wave, like the one below, rather than Feynman’s picture of a moving matter-particle. In fact, we’d have to flip the image, both vertically and horizontally, because the acceleration of the source and the field are related as shown below.

The vertical flip is because of the minus sign in the formula for E(t). The horizontal flip is because of the minus sign in the (t – r/c) term, the retarded argument: if we add a little time (Δt), we get the same value for a(t – r/c) as we would have if we had subtracted a little distance: Δr = cΔt. So that’s why E as a function of r (or of x), i.e. as a function in space, is a ‘reversed’ plot of the acceleration as a function of time. So we’d have something like below.

What does this resemble? It’s not a vibrating string (although I do start to understand the attractiveness of string theory now: vibrating strings are great as energy storage systems, so the idea of a photon being some kind of vibrating string sounds great, doesn’t it?). It’s not resembling a bullwhip effect either, because the oscillation of a whip is confined by a different envelope (see below). And, no, it’s also definitely not a trumpet. 🙂 It’s just what it is: an electromagnetic transient traveling through space.

Would this be realistic as a ‘picture’ of a photon? Frankly, I don’t know. I’ve looked at a lot of stuff but didn’t find anything on this really. The easy answer, of course, is quite straightforward: we’re not interested in the shape of a photon because we know it is not an electromagnetic wave. It’s a ‘wavicle’, just like an electron.

[…] Sure. I know that too. Feynman told me. 🙂 But then why wouldn’t we associate some wave function with it? Please tell me, because I really can’t find much of an answer to that question in the literature, and so that’s why I am freewheeling here. So just go along with me for a while, and come up with another suggestion. As I said above, your bet is as good as mine. All that I know is that there’s one thing we need to explain when considering the various possibilities: a photon has a very well-defined frequency (which defines its color in the visible light spectrum) and so our wave train should – in my humble opinion – also have that frequency. At least for ‘quite a while’, and then I mean ‘most of the time’, or ‘on average’ at least. Otherwise the concept of a frequency – or a wavelength – wouldn’t make much sense. Indeed, if the photon has no defined wavelength or frequency, then we could not perceive it as some color (as you may or may not know, the sense of ‘color’ is produced by our eye and brain, but it’s definitely associated with the frequency of the light). A photon should have a color (in physics, that means a frequency) because, when everything is said and done, that’s what the Planck relation is all about.

What would be your alternative? I mean… Doesn’t it make sense to think that, when jumping from one energy level to the other, the electron would initially sort of overshoot its new equilibrium position, to then overshoot it again on the other side, and so on and so on, but with an amplitude that becomes smaller and smaller as the oscillation dies out? In short, if we look at radiation as being caused by atomic oscillators, why would we not go all the way and think of them as oscillators subject to some damping force? Just think about it. 🙂

The size of a photon wave

Let’s forget about the shape for a while and think about size. We’ve got an electromagnetic wave train here. So how long would it be? Well… Feynman calculated the Q of these atomic oscillators: it’s of the order of 10⁸ (see his Lectures, I-33-3: it’s a wonderfully simple exercise, and one that really shows his greatness as a physics teacher) and, hence, this wave train will last about 10⁻⁸ seconds (that’s the time it takes for the radiation to die out by a factor 1/e). To give a somewhat more precise example, for sodium light, which has a frequency of 500 THz (500×10¹² oscillations per second) and a wavelength of 600 nm (600×10⁻⁹ meter), the radiation will last about 3.2×10⁻⁸ seconds. [In fact, that’s the time it takes for the radiation’s energy to die out by a factor 1/e (i.e. the so-called decay time τ), so the wave train will actually last longer, but the amplitude becomes quite small after that time.]

So that’s a very short time, but still, taking into account the rather spectacular frequency (500 THz) of sodium light, that makes for some 16 million oscillations and, taking into account the rather spectacular speed of light (3×10⁸ m/s), that makes for a wave train with a length of, roughly, 9.6 meters. Huh? 9.6 meters!?

You’re right. That’s an incredible distance: it’s like infinity on an atomic scale!
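The arithmetic behind these numbers is easy to check. What follows is only a minimal sketch, assuming the classical radiation-damping result Q = 3λ/(4π·r₀), with r₀ the classical electron radius (about 2.82×10⁻¹⁵ m). The formula, the rounded value of c and all variable names are assumptions made for this illustration, not something taken verbatim from the Lectures.

```python
import math

# Assumed inputs: the sodium-light values quoted in the text, plus the
# classical electron radius (an assumption needed for the Q formula).
WAVELENGTH = 600e-9          # m
FREQUENCY = 500e12           # Hz
LIGHT_SPEED = 3e8            # m/s, rounded as in the text
ELECTRON_RADIUS = 2.82e-15   # m

# Assumed classical radiation-damping result: Q = 3*lambda / (4*pi*r0).
q_factor = 3 * WAVELENGTH / (4 * math.pi * ELECTRON_RADIUS)

omega = 2 * math.pi * FREQUENCY        # angular frequency in rad/s
tau_energy = q_factor / omega          # time for the energy to drop by 1/e
tau_amplitude = 2 * q_factor / omega   # time for the amplitude to drop by 1/e

oscillations = FREQUENCY * tau_amplitude     # cycles within tau_amplitude
train_length = LIGHT_SPEED * tau_amplitude   # metres covered in tau_amplitude

print(f"Q              = {q_factor:.2e}")
print(f"tau (energy)   = {tau_energy:.2e} s")
print(f"tau (amplitude)= {tau_amplitude:.2e} s")
print(f"oscillations   = {oscillations:.2e}")
print(f"train length   = {train_length:.1f} m")
```

With these inputs Q comes out at roughly 5×10⁷, and the 3.2×10⁻⁸ s figure corresponds to 2Q/ω. In this simple damped-oscillator model that is the 1/e time of the amplitude; the energy, which goes as the square of the amplitude, would fall to 1/e in half that time.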

So… Well… What to say? Such length surely cannot match the picture of a photon as a fundamental particle which cannot be broken up, can it? So it surely cannot be right because, if this would be the case, then there surely must be some way to break this thing up and, hence, it cannot be ‘elementary’, can it?

Well… Maybe. But think it through. First note that we will not see the photon as a 10-meter long string because it travels at the speed of light indeed and so the length contraction effect ensures that its length, as measured in our reference frame (and from whatever ‘real-life’ reference frame actually, because the speed of light will always be c, regardless of the speeds we mortals could ever reach, including speeds close to c), is zero.
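To see how that contraction behaves as the speed approaches c, here is a small sketch using the standard formula L = L₀·√(1 – v²/c²). The function name and the 10-meter proper length are just illustrative assumptions, and plugging in v = c is really only a limit: strictly speaking, special relativity does not give light a rest frame in which to measure L₀.

```python
import math

def contracted_length(proper_length, beta):
    """Length seen from a frame in which the object moves at speed beta * c."""
    return proper_length * math.sqrt(1.0 - beta * beta)

# Hypothetical 10 m wave train, observed at increasing fractions of c.
for beta in (0.0, 0.6, 0.9, 0.99, 0.9999, 1.0):
    print(f"beta = {beta:<6} -> length = {contracted_length(10.0, beta):.4f} m")
```

The measured length shrinks smoothly and only reaches zero in the limiting case beta = 1.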

So, yes, I surely must be joking here but, as far as jokes go, I can’t help thinking this one is fairly robust from a scientific point of view. Again, please do double-check and correct me, but all that I’ve written so far is not all that speculative. It corresponds to all that I’ve read about it: only one photon is produced per electron in any de-excitation, and its energy is determined by the number of energy levels it drops, as illustrated (for a simple hydrogen atom) below. For those who continue to be skeptical about my sanity here, I’ll quote Feynman once again:

“What happens in a light source is that first one atom radiates, then another atom radiates, and so forth, and we have just seen that atoms radiate a train of waves only for about 10⁻⁸ sec; after 10⁻⁸ sec, some atom has probably taken over, then another atom takes over, and so on. So the phases can really only stay the same for about 10⁻⁸ sec. Therefore, if we average for very much more than 10⁻⁸ sec, we do not see an interference from two different sources, because they cannot hold their phases steady for longer than 10⁻⁸ sec. With photocells, very high-speed detection is possible, and one can show that there is an interference which varies with time, up and down, in about 10⁻⁸ sec.” (Feynman’s Lectures, I-34-4)

So… Well… Now it’s up to you. I am going along here with the assumption that a photon in the visible light spectrum, from a classical world perspective, should indeed be something that’s several meters long and packs a few million oscillations. So, while we usually measure stuff in seconds, or hours, or years, and, hence, while we would think that 10⁻⁸ seconds is short, a photon would actually be a very stretched-out transient that occupies quite a lot of space. I should also add that, in light of that number of ten meters, the dampening seems to happen rather slowly!

[…]

First because this type of analysis is not appropriate. […] You think so? Well… I don’t know. Perhaps you’re right. Perhaps we shouldn’t try to think of a photon as being something different than a discrete packet of energy. But then we also know it is an electromagnetic wave. So why wouldn’t we go all the way?

Second, I guess you may find the math involved in this post not to your liking, even if it’s quite simple and I am not doing anything spectacular here. […] Well… Frankly, I don’t care. Let me bulldozer on. 🙂

What about the ‘vertical’ dimension, the y and the z coordinates in space? We’ve got this long snaky thing: how thick-bodied is it?

Here, we need to watch our language. While it’s fairly obvious to associate a wave with a cross-section that’s normal to its direction of propagation, it is not obvious to associate a photon with the same thing. Not at all actually: as that electric field vector E oscillates up and down (or goes round and round, as shown in the illustration below, which is an image of a circularly polarized wave), it does not actually take any space. Indeed, the electric and magnetic field vectors E and B have a direction and a magnitude in space but they’re not representing something that is actually taking up some small or larger core in space. Hence, the vertical axis of that graph showing the wave train does not indicate some spatial position: it’s not a y-coordinate but the magnitude of an electric field vector. [Just to underline the fact that the magnitude E has nothing to do with spatial coordinates: note that its value depends on the unit we use to measure field strength (so that’s newton/coulomb, if you want to know), so it’s really got nothing to do with an actual position in space-time.]

So, what can we say about it? Nothing much, perhaps. But let me try.

Cross-sections in nuclear physics

In nuclear physics, the term ‘cross-section’ would usually refer to the so-called Thomson scattering cross-section of an electron (or any charged particle really), which can be defined rather loosely as the target area for the incident wave (i.e. the photons): it is, in fact, a surface which can be calculated from what is referred to as the classical electron radius, which is about 2.82×10⁻¹⁵ m. Just to compare: you may or may not remember the so-called Bohr radius of an atom, which is about 5.29×10⁻¹¹ m, so that’s a length that’s about 20,000 times longer. To be fully complete, let me give you the value for the Thomson scattering cross-section of an electron: 6.65×10⁻²⁹ m² (note that this is a surface indeed, so we have m squared as a unit, not m).
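The link between the radius and the cross-section is a one-liner. The sketch below assumes the standard relation σ = (8π/3)·r₀² between the Thomson cross-section and the classical electron radius, and uses the rounded values quoted above; the variable names are mine.

```python
import math

ELECTRON_RADIUS = 2.82e-15   # m, classical electron radius (rounded)
BOHR_RADIUS = 5.29e-11       # m (rounded)

# Assumed standard relation: sigma = (8*pi/3) * r0**2.
thomson_cross_section = (8 * math.pi / 3) * ELECTRON_RADIUS ** 2

# How many classical electron radii fit in one Bohr radius.
radius_ratio = BOHR_RADIUS / ELECTRON_RADIUS

print(f"Thomson cross-section = {thomson_cross_section:.3e} m^2")
print(f"Bohr radius / classical electron radius = {radius_ratio:.0f}")
```

With the rounded radius the cross-section comes out slightly above the more precise value, and the ratio of the two radii is a bit under 19,000.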

Now, let me remind you – once again – that we should not associate the oscillation of the electric field vector with something actually happening in space: an electromagnetic field does not move in a medium and, hence, it’s not like a water or sound wave, which makes molecules go up and down as it propagates through its medium. To put it simply: there’s nothing that’s wriggling in space as that photon is flashing through space. However, when it does hit an electron, that electron will effectively ‘move’ (or vibrate or wriggle or whatever you can imagine) as a result of the incident electromagnetic field.

That’s what’s depicted and labeled below: there is a so-called ‘radial component’ of the electric field, and I would say: that’s our photon! [What else would it be?] The illustration below shows that this ‘radial’ component is just E for the incident beam and that, for the scattered beam, it is, in fact, determined by the electron motion caused by the incident beam through that relation described above, in which a is the normal component (i.e. normal to the direction of propagation of the outgoing beam) of the electron’s acceleration.

Now, before I proceed, let me remind you once again that the above illustration is one of those illustrations that only wants to convey an idea, and so we should not attach too much importance to it: the world at the smallest scale is best not represented by a billiard ball model. In addition, I should also note that the illustration above was taken from the Wikipedia article on elastic scattering (i.e. Thomson scattering), which is only a special case of the more general Compton scattering that actually takes place. It is, in fact, the low-energy limit. Photons with higher energy will usually be absorbed, and then there will be a re-emission, but, in the process, there will be a loss of energy in this ‘collision’ and, hence, the scattered light will have lower energy (and, hence, lower frequency and longer wavelength). But – Hey! – now that I think of it: that’s quite compatible with my idea of damping, isn’t it? 🙂 [If you think I’ve gone crazy, I am really joking here: when it’s Compton scattering, there’s no ‘lost’ energy: the electron will recoil and, hence, its momentum will increase. That’s what’s shown below (credit goes to the HyperPhysics site).]

So… Well… Perhaps we should just assume that a photon is a long wave train indeed (as mentioned above, ten meters is very long indeed: not an atomic scale at all!) but that its effective ‘radius’ should be of the same order as the classical electron radius. So what’s that order? If it’s more or less the same radius, then it would be in the order of femtometers (1 fm = 1 fermi = 1×10⁻¹⁵ m). That’s good because that’s a typical length-scale in nuclear physics. For example, it would be comparable with the radius of a proton. So we look at a photon here as something very different – because it’s so incredibly long (at least as measured from its own reference frame) – but as something which does have some kind of ‘radius’ that is normal to its direction of propagation and equal to or smaller than the classical electron radius. [Now that I think of it, we should probably think of it as being substantially smaller. Why? Well… An electron is obviously fairly massive as compared to a photon (if only because an electron has a rest mass and a photon hasn’t) and so… Well… When everything is said and done, it’s the electron that absorbs a photon–not the other way around!]

Now, that radius determines the area in which it may produce some effect, like hitting an electron, for example, or like being detected in a photon detector, which is just what this so-called radius of an atom or an electron is all about: the area which is susceptible of being hit by some particle (including a photon), or which is likely to emit some particle (including a photon). What it is exactly, we don’t know: it’s still as spooky as an electron and, therefore, it also does not make all that much sense to talk about its exact position in space. However, if we’d talk about its position, then we should obviously also invoke the Uncertainty Principle, which will give us some upper and lower bounds for its actual position, just like it does for any other particle: the uncertainty about its position will be related to the uncertainty about its momentum, and more knowledge about the former implies less knowledge about the latter, and vice versa. Therefore, we can also associate some complex wave function with this photon which is – for all practical purposes – a de Broglie wave. Now how should we visualize that wave?

Well… I don’t know. I am actually not going to offer anything specific here. First, it’s all speculation. Second, I think I’ve written too much rubbish already. However, if you’re still reading, and you like this kind of unorthodox application of electromagnetics, then the following remarks may stimulate your imagination.

The first thing to note is that we should not end up with a wave function that, when squared, gives us a constant probability for each and every point in space. No. The wave function needs to be confined in space and, hence, we’re also talking a wave train here, and a very short one in this case. So… Well… What about linking its amplitude to the amplitude of the field for the photon? In other words, the probability amplitude could, perhaps, be proportional to the amplitude of E, with the proportionality factor being determined by (a) the unit in which we measure E (i.e. newton/coulomb) and (b) the normalization condition.

OK. I hear you say it now: “Ha-ha! Got you! Now you’re really talking nonsense! How can a complex number (the probability amplitude) be proportional to some real number (the field strength)?”

Well… Be creative. It’s not that difficult to imagine some linkages. First, the electric field vector has both a magnitude and a direction. Hence, there’s more to E than just its magnitude. Second, you should note that the real and imaginary part of a complex-valued wave function is a simple sine and cosine function, and so these two functions are the same really, except for a phase difference of π/2. In other words, if we have a formula for the real part of a wave function, we have a formula for its imaginary part as well. So… Your remark is to the point and then it isn’t.
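The point about the sine and the cosine can be checked numerically. This is just a sketch for the elementary wave e^(iθ): it verifies that the imaginary part at phase θ equals the real part a quarter cycle (π/2) earlier. The function name and the sample phases are arbitrary choices for illustration.

```python
import cmath
import math

def elementary_wave(theta):
    """Complex exponential e^(i*theta) = cos(theta) + i*sin(theta)."""
    return cmath.exp(1j * theta)

QUARTER_CYCLE = math.pi / 2
sample_phases = (0.0, 0.3, 1.0, 2.5, 4.0)

# Largest mismatch between Im[psi(theta)] and Re[psi(theta - pi/2)].
max_gap = max(
    abs(elementary_wave(theta).imag
        - elementary_wave(theta - QUARTER_CYCLE).real)
    for theta in sample_phases
)
print(f"largest mismatch over the samples: {max_gap:.1e}")
```

So knowing the real part of such a wave does fix its imaginary part, at least for this simple sinusoidal case; whether that carries over to a damped transient is a separate question.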

OK, you’ll say, but then so how exactly would you link the E vector with the ψ(x, t) function for a photon. Well… Frankly, I am a bit exhausted now and so I’ll leave any further speculation to you. The whole idea of a de Broglie wave of a photon, with the (complex-valued) amplitude having some kind of ‘proportional’ relationship to the (magnitude of) the electric field vector makes sense to me, although we’d have to be innovative about what that ‘proportionality’ exactly is.

Let me conclude this speculative business by noting a few more things about our ‘transient’ electromagnetic wave:

1. First, it’s obvious that the usual relations between (a) energy (W), (b) frequency (f) and (c) amplitude (A) hold. If we increase the frequency of a wave, we’ll have a proportional increase in energy (twice the frequency is twice the energy), with the factor of proportionality being given by the Planck-Einstein relation: W = hf. But if we’re talking amplitudes (for which we do not have a formula, which is why we’re engaging in those assumptions on the shape of the transient wave), we should not forget that the energy of a wave is proportional to the square of its amplitude: W ∼ A². Hence, a linear increase of the amplitudes results in a quadratic increase in energy (e.g. if you double all amplitudes, you’ll pack four times more energy in that wave).

2. Both factors come into play when an electron emits a photon. Indeed, if the difference between the two energy levels is larger, then the photon will not only have a higher frequency (i.e. we’re talking light (or electromagnetic radiation) in the upper ranges of the spectrum then) but one should also expect that the initial overshooting – and, hence, the initial oscillation – will also be larger. In short, we’ll have larger amplitudes. Hence, higher-energy photons will pack even more energy upfront. They will also have higher frequency, because of the Planck relation. So, yes, both factors would come into play.

What about the length of these wave trains? Would it make them shorter? Yes. I’ll refer you to Feynman’s Lectures to verify that the wavelength appears in the numerator of the formula for Q. Hence, higher frequency means shorter wavelength and, hence, lower Q. Now, I am not quite sure (I am not sure about anything I am writing here it seems) but this may or may not be the reason for yet another statement I never quite understood: photons with higher and higher energy are said to become smaller and smaller, and when they reach the Planck scale, they are said to become black holes.

Hmm… I should check on that. 🙂
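As for the question of whether higher-frequency wave trains come out shorter, here is a quick sketch. It assumes the classical radiation-damping result Q = 3λ/(4π·r₀) and takes the 1/e time of the amplitude, 2Q/ω, as a rough measure of how long the train lasts; both are assumptions of this illustration, as are the function names.

```python
import math

ELECTRON_RADIUS = 2.82e-15   # m, classical electron radius (rounded)
LIGHT_SPEED = 3e8            # m/s (rounded)

def q_factor(wavelength):
    """Assumed classical result: Q = 3*lambda / (4*pi*r0)."""
    return 3 * wavelength / (4 * math.pi * ELECTRON_RADIUS)

def train_length(wavelength):
    """Distance light covers during the amplitude's 1/e time, 2Q/omega."""
    omega = 2 * math.pi * LIGHT_SPEED / wavelength
    return LIGHT_SPEED * 2 * q_factor(wavelength) / omega

for wavelength in (600e-9, 300e-9):
    print(f"lambda = {wavelength:.0e} m: Q = {q_factor(wavelength):.2e}, "
          f"length = {train_length(wavelength):.2f} m")
```

Under these assumptions, halving the wavelength halves Q and cuts the length of the wave train by a factor of four, because the length goes as the square of the wavelength.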

Conclusion

So what’s the conclusion? Well… I’ll leave it to you to think about this. As said, I am a bit tired now and so I’ll just wrap this up, as this post has become way too long anyway. Let me, before parting, offer the following bold suggestion in terms of finding a de Broglie wave for our photon: perhaps that transient above actually is the wave function.

You’ll say: What !? What about normalization? All probabilities have to add up to one and, surely, those magnitudes of the electric field vector wouldn’t add up to one, would they?

My answer to that is simple: that’s just a question of units, i.e. of normalization indeed. So just measure the field strength in some other unit and it will come all right.
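To illustrate what such a rescaling amounts to, here is a toy sketch. The damped cosine below is a made-up stand-in for the transient, in arbitrary units and with arbitrary decay length and wavelength; none of it comes from an actual calculation of the field. It only shows that a single scale factor is enough to make the squared values integrate to one.

```python
import math

def toy_transient(x, decay_length=3.0, wavelength=1.0):
    """Hypothetical damped wave train: zero before x = 0, decaying after."""
    if x < 0:
        return 0.0
    return math.exp(-x / decay_length) * math.cos(2 * math.pi * x / wavelength)

STEP = 0.001
positions = [i * STEP for i in range(40000)]   # samples from x = 0 to x = 40

# Riemann-sum approximation of the integral of the squared field.
raw_integral = sum(toy_transient(x) ** 2 for x in positions) * STEP
scale = 1.0 / math.sqrt(raw_integral)

def psi(x):
    """Rescaled transient, so that psi**2 integrates to (about) one."""
    return scale * toy_transient(x)

total_probability = sum(psi(x) ** 2 for x in positions) * STEP
print(f"scale factor      = {scale:.4f}")
print(f"total probability = {total_probability:.6f}")
```

The rescaling takes care of the normalization only; it says nothing yet about the real-versus-complex objection.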

[…] But… Yes? What? Well… Those magnitudes are real numbers, not complex numbers.

I am not sure how to answer that one but there’s two things I could say:

1. Real numbers are complex numbers too: it’s just that their imaginary part is zero.
2. When working with waves, and especially with transients, we’ve always represented them using the complex exponential function. For example, we would write a wave function whose amplitude varies sinusoidally in space and time as A·e^(i(ωt – kr)), with ω the (angular) frequency and k the wave number (so that’s the spatial frequency, expressed in radians per unit distance).

So, frankly, think about it: where is the photon? It’s that ten-meter long transient, isn’t it? And the probability to find it somewhere is the (absolute) square of some complex number, right? And then we have a wave function already, representing an electromagnetic wave, for which we know that the energy which it packs is proportional to the square of its amplitude, as well as being proportional to its frequency. We also know we’re more likely to detect something with high energy than something with low energy, don’t we? So… Tell me why the transient itself would not make for a good psi function?

But then what about these probability amplitudes being a function of the y and z coordinates?

Well… Frankly, I’ve started to wonder if a photon actually has a radius. If it doesn’t have a mass, it’s probably the only real point-like particle (i.e. a particle not occupying any space) – as opposed to all other matter-particles, which do have mass.

Why?

I don’t know. Your guess is as good as mine. Maybe our concepts of amplitude and frequency of a photon are not very relevant. Perhaps it’s only energy that counts. We know that a photon has a more or less well-defined energy level (within the limits of the Uncertainty Principle) and, hence, our ideas about how that energy actually gets distributed over the frequency, the amplitude and the length of that ‘transient’ have no relation with reality. Perhaps we like to think of a photon as a transient electromagnetic wave, because we’re used to thinking in terms of waves and fields, but perhaps a photon is just a point-like thing indeed, with a wave function that’s got the same shape as that transient. 🙂

Post scriptum: Perhaps I should apologize to you, my dear reader. It’s obvious that, in quantum mechanics, we don’t think of a photon as having some frequency and some wavelength and some dimension in space: it’s just an elementary particle with energy interacting with other elementary particles with energy, and we use these coupling constants and what have you to work with them. So we don’t usually think of photons as ten-meter long transients moving through space. So, when I write that “our concepts of amplitude and frequency of a photon are maybe not very relevant” when trying to picture a photon, and that “perhaps, it’s only energy that counts”, I actually don’t mean “maybe” or “perhaps”. I mean: Of course! […] In the quantum-mechanical world view, that is.

So I apologize for, perhaps, posting what may or may not amount to plain nonsense. However, as all of this nonsense helps me to make sense of these things myself, I’ll just continue. 🙂 I seem to move very slowly on this Road to Reality, but the good thing about moving slowly, is that it will – hopefully – give me the kind of ‘deeper’ understanding I want, i.e. an understanding beyond the formulas and mathematical and physical models. In the end, that’s all that I am striving for when pursuing this ‘hobby’ of mine. Nothing more, nothing less. 🙂 Onwards!

Bad thinking: photons versus the matter wave

In my previous post, I wrote that I was puzzled by that relation between the energy and the size of a particle: higher-energy photons are supposed to be smaller and, pushing that logic to the limit, we get photons becoming black holes at the Planck scale. Now, understanding what the Planck scale is all about, is important to understand why we’d need a GUT, and so I do want to explore that relation between size and energy somewhat further.

I found the answer by a coincidence. We’ll call it serendipity. 🙂 Indeed, an acquaintance of mine who is very well versed in physics pointed out a terrible mistake in (some of) my reasoning in the previous posts: photons do not have a de Broglie wavelength. They just have a wavelength. Full stop. It immediately reduced my bemusement about that energy-size relation and, in the end, eliminated it completely. So let’s analyze that mistake – which seems to be a fairly common freshman mistake judging from what’s being written about it in some of the online discussions on physics.

If photons are not to be associated with a de Broglie wave, it basically means that the Planck relation has nothing to do with the de Broglie relation, even if these two relations are identical from a pure mathematical point of view:

1. The Planck relation E = hν states that electromagnetic waves with frequency ν are a bunch of discrete packets of energy referred to as photons, and that the energy of these photons is proportional to the frequency of the electromagnetic wave, with the Planck constant h as the factor of proportionality. In other words, the natural unit to measure their energy is h, which is why h is referred to as the quantum of action.
2. The de Broglie relation E = hf assigns a de Broglie wave with frequency f to a matter particle with energy E = mc² = γm₀c². [The factor γ in this formula is the Lorentz factor: γ = (1 – v²/c²)^(–1/2). It just corrects for the relativistic effect on mass as the velocity of the particle (v) gets closer to the speed of light (c).]
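To get a feel for the numbers in that second relation, here is a minimal Python sketch. The choice of an electron moving at 0.6c is just an assumption of mine for illustration, and the function names are made up for this example; the constants are the standard SI values.

```python
import math

H = 6.62607015e-34              # Planck constant (J·s)
C = 299_792_458.0               # speed of light (m/s)
M_ELECTRON = 9.1093837015e-31   # electron rest mass (kg), illustrative choice

def lorentz_factor(v):
    """Lorentz factor: gamma = (1 - v^2/c^2)^(-1/2)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def de_broglie_frequency(m0, v):
    """de Broglie frequency f = E/h, with E = gamma * m0 * c^2."""
    energy = lorentz_factor(v) * m0 * C ** 2
    return energy / H

v = 0.6 * C  # assumed particle speed
gamma = lorentz_factor(v)
f = de_broglie_frequency(M_ELECTRON, v)
print(f"gamma = {gamma:.4f}")
print(f"f = {f:.3e} Hz")
```

Note that the frequency comes out of the total energy (rest energy included), which is why it is so large even for a slow particle.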

These are two very different things: photons do not have rest mass (which is why they can travel at light speed) and, hence, they are not to be considered as matter particles. Therefore, one should not assign a de Broglie wave to them. So what are they then? A photon is a wave packet but it’s an electromagnetic wave packet. Hence, its wave function is not some complex-valued psi function Ψ(x, t). What is oscillating in the illustration below (let’s say this is a procession of photons) is the electric field vector E. [To get the full picture of the electromagnetic wave, you should also imagine a (tiny) magnetic field vector B, which oscillates perpendicular to E, but that does not make much of a difference. Finally, in case you wonder about these dots: the red and green dot just make it clear that the phase and group velocity of the wave are the same: vg = vp = v = c.]

The point to note is that we have a real wave here: it is not a de Broglie wave. A de Broglie wave is a complex-valued function Ψ(x, t) with two oscillating parts: (i) the so-called real part of the complex value Ψ, and (ii) the so-called imaginary part (and, despite its name, that counts as much as the real part when working with Ψ!). That’s what’s shown in the examples of complex (standing) waves below: the blue part is one part (let’s say the real part), and then the salmon color is the other part. We need to square the modulus of that complex value to find the probability P of detecting that particle in space at point x at time t: P(x, t) = |Ψ(x, t)|². Now, if we would write Ψ(x, t) as Ψ = u(x, t) + iv(x, t), then u(x, t) is the real part, and v(x, t) is the imaginary part. |Ψ(x, t)|² is then equal to u² + v², so that shows that both the blue as well as the salmon amplitude matter when doing the math.

So, while I may have given the impression that the Planck relation was like a limit of the de Broglie relation for particles with zero rest mass traveling at speed c, that’s just plain wrong! The description of a particle with zero rest mass fits a photon but the Planck relation is not the limit of the de Broglie relation: photons are photons, and electrons are electrons, and an electron wave has nothing to do with a photon. Electrons are matter particles (fermions as physicists would say), and photons are bosons, i.e. force carriers.
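The point that both parts of Ψ count equally when squaring the modulus can be checked with a few lines of Python. This is only a sketch: the amplitude 0.6 and the phases are arbitrary values I picked, and probability_density is a name I made up for the illustration.

```python
import cmath

def probability_density(psi):
    """|psi|^2 = u^2 + v^2, with u the real part and v the imaginary part."""
    u, v = psi.real, psi.imag
    return u * u + v * v

# An arbitrary complex amplitude: modulus 0.6, phase 0.75 rad (assumed values).
psi = 0.6 * cmath.exp(1j * 0.75)
p = probability_density(psi)
print(p, abs(psi) ** 2)

# Rotating the phase shifts weight between the real and imaginary parts,
# but the squared modulus stays the same.
densities = [probability_density(0.6 * cmath.exp(1j * phase))
             for phase in (0.0, 1.0, 2.0, 3.0)]
print(densities)
```

Dropping either u or v from the sum would make the result depend on the phase, which is exactly why neither part can be ignored.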

Let’s now re-examine the relationship between the size and the energy of a photon. If the wave packet below would represent an (ideal) photon, what is its energy E as a function of the electric and magnetic field vectors E and B? [Note that the (non-boldface) E stands for energy (i.e. a scalar quantity, so it’s just a number), while the (italic and bold) E stands for the (electric) field vector, so that’s something with a direction and a magnitude (E, with the symbol in italics once again to distinguish it from the energy E).] Indeed, if a photon is nothing but a disturbance of the electromagnetic field, then the energy E of this disturbance – which obviously depends on E and B – must also be equal to E = hν according to the Planck relation. Can we show that?

Well… Let’s take a snapshot of a plane-wave photon, i.e. a photon oscillating in a two-dimensional plane only. That plane is perpendicular to our line of sight here. Because it’s a snapshot (time is not a variable), we may look at this as an electrostatic field: all points in the interval Δx are associated with some magnitude (i.e. the magnitude of our electric field E), and points outside of that interval have zero amplitude. It can then be shown (just browse through any course on electromagnetism) that the energy density (i.e. the energy per unit volume) is equal to (1/2)ε₀E² (ε₀ is the electric constant, which we encountered in previous posts already). To calculate the total energy of this photon, we should integrate over the whole distance Δx, from left to right. However, rather than bothering you with integrals, I think that (i) the ε₀E²/2 formula and (ii) the illustration above should be sufficient to convince you that:

1. The energy of a photon is proportional to the square of the amplitude of the electric field. Such an E ∝ A² relation is typical of any real wave, be it a water wave or an electromagnetic wave. So if we would double, triple, or quadruple its amplitude (i.e. the magnitude E of the electric field E), then the energy of this photon will be multiplied by four, nine and sixteen respectively.
2. If we would not change the amplitude of the wave above but double, triple or quadruple its frequency, then we would only double, triple or quadruple its energy: there’s no quadratic relation here. In other words, the Planck relation E = hν makes perfect sense, because it reflects that simple proportionality: there is nothing to be squared.
3. If we double the frequency but leave the amplitude unchanged, then we can imagine a photon with the same energy occupying only half of the Δx space. In fact, because we also have that universal relationship between frequency and wavelength (the propagation speed of a wave equals the product of its wavelength and its frequency: v = λf), we would have to halve the wavelength (and, hence, that would amount to dividing the Δx by two) to make sure our photon is still traveling at the speed of light.
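For those who do want to see the integral, here is a rough numerical sketch in Python of the first point only (energy going with the square of the amplitude). It integrates the energy density ε₀E²/2 along x for a sine-shaped snapshot, so the result is an energy per unit of cross-sectional area. The 500 nm wavelength, the ten cycles and the 1 V/m amplitude are arbitrary assumptions of mine, as is the function name.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric constant (F/m)

def field_energy_per_area(amplitude, wavelength, n_cycles=10, samples=20_000):
    """Integrate eps0 * E^2 / 2 over a snapshot n_cycles long (midpoint rule).

    Returns energy per unit cross-sectional area (J/m^2).
    """
    length = n_cycles * wavelength   # this plays the role of the interval Δx
    dx = length / samples
    total = 0.0
    for i in range(samples):
        x = (i + 0.5) * dx
        e_field = amplitude * math.sin(2 * math.pi * x / wavelength)
        total += 0.5 * EPSILON_0 * e_field ** 2 * dx
    return total

base = field_energy_per_area(1.0, 500e-9)  # assumed 1 V/m amplitude, 500 nm
ratios = {}
for factor in (2, 3, 4):
    ratios[factor] = field_energy_per_area(factor * 1.0, 500e-9) / base
    print(factor, round(ratios[factor], 6))
```

The ratios come out as the squares of the amplitude factors. The sketch says nothing about the frequency dependence in the second and third point: that part rests on the Planck relation, not on this classical integral.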

Now, the Planck relation only says that higher energy is associated with higher frequencies: it does not say anything about amplitudes. As mentioned above, if we leave amplitudes unchanged, then the same Δx space will accommodate a photon with twice the frequency and twice the energy. However, if we would double both frequency and amplitude, then the photon would occupy only half of the Δx space, and still have twice as much energy. So the only thing I now need to prove is that higher-frequency electromagnetic waves are associated with larger-amplitude E’s. Now, that is something that we get straight out of the laws of electromagnetic radiation: electromagnetic radiation is caused by oscillating electric charges, and it’s the magnitude of the acceleration (written as a in the formula below) of the oscillating charge that determines the amplitude. For a full write-up of these ‘laws’, I’ll refer to a textbook (or just download Feynman’s 28th Lecture on Physics), but let me just give the formula for the (vertical) component of E:

E(t) = –q·a(t – r/c)/(4πε₀c²r)

You will recognize all of the variables and constants in this one: the electric constant ε₀, the distance r, the speed of light (and our wave) c, etcetera. The ‘a’ is the acceleration: note that it’s a function not of t but of (t – r/c), and so we’re talking about the so-called retarded acceleration here, but don’t worry about that.

Now, higher frequencies effectively imply a higher magnitude of the acceleration vector, and that’s what I had to prove, so we’re done: higher-energy photons not only have higher frequency but also larger amplitude, and so they take less space.
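The step from ‘higher frequency’ to ‘higher acceleration’ can be made explicit with a small sketch. It assumes (my assumption, not something stated above) that the charge oscillates as x(t) = x₀·cos(ωt) with a fixed displacement amplitude x₀, so that its peak acceleration is ω²·x₀. Plugging that into the far-field formula E = –q·a(t – r/c)/(4πε₀c²r) from Feynman’s lecture gives the peak field. The numbers (an electron charge, x₀ = 0.1 nm, r = 1 m) and the function names are only illustrative.

```python
import math

EPSILON_0 = 8.8541878128e-12  # electric constant (F/m)
C = 299_792_458.0             # speed of light (m/s)
Q_E = 1.602176634e-19         # elementary charge (C)

def peak_acceleration(x0, freq):
    """Peak acceleration of x(t) = x0*cos(omega*t): omega^2 * x0."""
    omega = 2 * math.pi * freq
    return omega ** 2 * x0

def peak_field(x0, freq, r, q=Q_E):
    """Peak far-field amplitude |E| = q*a_max / (4*pi*eps0*c^2*r)."""
    return q * peak_acceleration(x0, freq) / (4 * math.pi * EPSILON_0 * C ** 2 * r)

e_low = peak_field(1e-10, 5e14, 1.0)    # assumed: x0 = 0.1 nm, f = 5e14 Hz, r = 1 m
e_high = peak_field(1e-10, 1e15, 1.0)   # same charge and x0, twice the frequency
print(e_high / e_low)
```

Under that fixed-x₀ assumption, doubling the frequency quadruples the field amplitude. Whether a real emitter keeps x₀ fixed is a separate question that this sketch does not address.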

It would be nice if I could derive some kind of equation to specify the relation between energy and size, but I am not that advanced in math (yet). 🙂 I am sure it will come.

Post scriptum 1: The ‘mistake’ I made obviously fully explains why Feynman is only interested in the amplitude of a photon to go from point A to B, and not in the amplitude of a photon to be at point x at time t. The question of the ‘size of the arrows’ then becomes a question related to the so-called propagator function, which gives the probability amplitude for a particle (a photon in this case) to travel from one place to another in a given time. The answer seems to involve another important buzzword when studying quantum mechanics: the gauge parameter. However, that’s also advanced math which I don’t master (as yet). I’ll come back to it… Hopefully… 🙂

Post scriptum 2: As I am re-reading some of my posts now (i.e. on 12 January 2015), I note how immature this post is. I wanted to delete it, but in the end I didn’t, as it does illustrate my (limited) progress. I am still struggling with the question of a de Broglie wave for a photon, but I dare to think that my analysis of the question is at least a bit more mature now: please see one of my other posts on it.