# A post for Vincent: on the math of waves

Pre-scriptum (dated 26 June 2020): These posts on elementary math and physics for my kids (they are 21 and 23 now and no longer need such explanations) have not suffered much from the attack by the dark force—which is good because I still like them. While my views on the true nature of light, matter and the force or forces that act on them have evolved significantly as part of my explorations of a more realist (classical) explanation of quantum mechanics, I think most (if not all) of the analysis in this post remains valid and fun to read. In fact, I find the simplest stuff is often the best. 🙂

Original post:

I wrote this post to just briefly entertain myself and my teenage kids. To be precise, I am writing this for Vincent, as he started to study more math this year (eight hours a week!), and as he also thinks he might go for engineering studies two years from now. So let’s see if he gets this and − much more importantly − if he likes the topic. If not… Well… Then he should get even better at golf than he already is, so he can make a living out of it. 🙂

To be sure, nothing I write below requires an understanding of stuff you haven’t seen yet, like integrals or complex numbers. There are no derivatives, exponentials or logarithms either: you just need to know what a sine or a cosine is, and then it’s just a bit of addition and multiplication. So it’s just… Well… Geometry and waves as I would teach it to an interested teenager. So let’s go for it. And, yes, I am talking to you now, Vincent! 🙂

The animation below shows a repeating pulse. It is a periodic function: a traveling wave. It obviously travels in the positive x-direction, i.e. from left to right as per our convention. As you can see, the amplitude of our little wave varies as a function of time (t) and space (x), so it’s a function in two variables, like y = F(u, v). You know what that is, and you also know we’d refer to y as the dependent variable and to u and v as the independent variables.

Now, because it’s a wave, and because it travels in the positive x-direction, the argument of the wave function F will be x−ct, so we write:

y = F(x−ct)

Just to make sure: c is the speed of travel of this particular wave, so don’t think it’s the speed of light. This wave can be any wave: a water wave, a sound wave,… Whatever. Our dependent variable y is the amplitude of our wave, so it’s the vertical displacement − up or down − of whatever we’re looking at. As it’s a repeating pulse, y is zero most of the time, except when that pulse is pulsing. 🙂

So what’s the wavelength of this thing?

[…] Come on, Vincent. Think! Don’t just look at this!

[…] I got it, daddy! It’s the distance between two peaks, or between the center of two successive pulses— obviously! 🙂

[…] Good! 🙂 OK. That was easy enough. Now look at the argument of this function once again:

F = F(x−ct)

We are not merely acknowledging here that F is some function of x and t, i.e. some function varying in space and time. Of course, F is that too, so we can write: y = F = F(x, t) = F(x−ct), but it’s more than just some function: we’ve got a very special argument here, x−ct, and so let’s start our little lesson by explaining it.

The x−ct argument is there because we’re talking waves, so that is something moving through space and time indeed. Now, what are we actually doing when we write x−ct? Believe it or not, we’re basically converting something expressed in time units into something expressed in distance units. So we’re converting time into distance, so to speak. To see how this works, suppose we add some time Δt to the argument of our function y = F, so we’re looking at F[x−c(t+Δt)] now, instead of F(x−ct). Now, F[x−c(t+Δt)] = F(x−ct−cΔt), so we’ll get a different value for our function—obviously! But it’s easy to see that we can restore our wave function F to its former value by also adding some distance Δx = cΔt to the argument. Indeed, if we do so, we get F[x+Δx−c(t+Δt)] = F(x+cΔt–ct−cΔt) = F(x–ct). For example, if c = 3 m/s, then 2 seconds of time correspond to (2 s)×(3 m/s) = 6 meters of distance.
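If you’d like to see this with actual numbers, Vincent, here’s a quick check in Python. The Gaussian pulse shape is just something I made up to stand in for F; any function would do:

```python
import math

def F(u):
    """Some waveform; a narrow Gaussian pulse as a stand-in shape."""
    return math.exp(-u**2)

c = 3.0          # wave speed in m/s (the 3 m/s from the example)
x, t = 5.0, 2.0  # some point in space and time
dt = 2.0         # add 2 seconds of time...
dx = c * dt      # ...and the corresponding 6 meters of distance

# Shifting time alone changes the value; shifting both restores it.
original = F(x - c * t)
time_shifted = F(x - c * (t + dt))
both_shifted = F((x + dx) - c * (t + dt))
print(original, time_shifted, both_shifted)
```

The first and third numbers printed are identical: adding Δx = cΔt undoes adding Δt, which is exactly the point of the x−ct argument.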

The idea behind adding both some time Δt as well as some distance Δx is that you’re traveling with the waveform itself, or with its phase as they say. So it’s like you’re riding on its crest or in its trough, or somewhere hanging on to it, so to speak. Hence, the speed of a wave is also referred to as its phase velocity, which we denote by vp = c. Now, let me make some remarks here.

First, there is the direction of travel. The pulses above travel in the positive x-direction, so that’s why we have x minus ct in the argument. For a wave traveling in the negative x-direction, we’ll have a wave function y = F(x+ct). [And, yes, don’t be lazy, Vincent: please go through the Δx = cΔt math once again to double-check that.]

The second thing you should note is that the speed of a regular periodic wave is equal to the product of its wavelength and its frequency, so we write: vp = c = λ·f, which we can also write as λ = c/f or f = c/λ. Now, you know we express the frequency in oscillations or cycles per second, i.e. in hertz: one hertz is, quite simply, 1 s−1, so the unit of frequency is the reciprocal of the second. So the m/s and the Hz units in the fraction below give us a wavelength λ equal to λ = (20 m/s)/(5/s) = 4 m. You’ll say that’s too simple but I just want to make sure you’ve got the basics right here.
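The λ = c/f arithmetic, in one line of Python, with the units spelled out in comments:

```python
c = 20.0           # phase velocity in m/s
f = 5.0            # frequency in Hz, i.e. cycles per second (1/s)
wavelength = c / f # (m/s)/(1/s) = m: the seconds cancel out
print(wavelength)  # 4.0 meters, as in the example
```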

The third thing is that, in physics, and in math, we’ll usually work with nice sinusoidal functions, i.e. sine or cosine functions. A sine and a cosine function are the same function but with a phase difference of 90 degrees, so that’s π/2 radians. That’s illustrated below: cosθ = sin(θ+π/2).

Now, when we converted time to distance by multiplying it with c, what we actually did was to ensure that the argument of our wavefunction F was expressed in one unit only: the meter, so that’s the distance unit in the international SI system of units. So that’s why we had to convert time to distance, so to speak.

The other option is to express everything in seconds, so that’s in time units. So then we should measure distance in seconds, rather than meters, so to speak, and the corresponding argument is t–x/c, and our wave function would be written as y = G(t–x/c). Just go through the same Δx = cΔt math once more: G[t+Δt–(x+Δx)/c] = G(t+Δt–x/c–cΔt/c) = G(t+Δt–x/c–Δt) = G(t–x/c).

In short, we’re talking the same wave function here, so F(x−ct) = G(t−x/c), but the argument of F is expressed in distance units, while the argument of G is expressed in time units. If you’d want to double-check what I am saying here, you can use the same 20 m/s wave example again: suppose the distance traveled is 100 m, so x = 100 m and x/c = (100 m)/(20 m/s) = 5 seconds. It’s always important to check the units, and you can see they come out alright in both cases! 🙂
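To double-check the F(x−ct) = G(t−x/c) equivalence numerically: since x−ct = −c·(t−x/c), we can always build G from F by writing G(v) = F(−c·v). A quick check, using a sine as a stand-in waveform and the 20 m/s wave from the example:

```python
import math

c = 20.0  # the 20 m/s wave from the example

def F(u):
    """Wave function with its argument in meters."""
    return math.sin(u)

def G(v):
    """Same wave, with its argument in seconds: G(v) = F(-c*v)."""
    return F(-c * v)

x, t = 100.0, 3.0  # x/c = (100 m)/(20 m/s) = 5 seconds
print(F(x - c * t), G(t - x / c))  # the same value, computed both ways
```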

Now, to go from F or G to our sine or cosine function, we need to do yet another conversion of units, as the argument of a sinusoidal function is some angle θ, not meters or seconds. In physics, we refer to θ as the phase of the wave function. So we need degrees or, more common now, radians, which I’ll explain in a moment. Let me first jot it down:

y = sin(2π(x–ct)/λ)

So what are we doing here? What’s going on? Well… First, we divide x–ct by the wavelength λ, so that’s the (x–ct)/λ in the argument of our sine function. So our ‘distance unit’ is no longer the meter but the wavelength of our wave, so we no longer measure in meter but in wavelengths. For example, if our argument x–ct was 20 m, and the wavelength of our wave is 4 m, we get (x–ct)/λ = 5 between the brackets. It’s just like comparing our length: ten years ago you were about half my size. Now you’re the same: one unit. 🙂 When we’re saying that, we’re using my length as the unit – and so that’s also your length unit now 🙂 – rather than meters or centimeters.

Now I need to explain the 2π factor, which is only slightly more difficult. Think about it: one wavelength corresponds to one full cycle, so that’s the full 360° of the circle below. In fact, we’ll express angles in radians, and the two animations below illustrate what a radian really is: an angle of 1 rad defines an arc whose length, as measured on the circle, is equal to the radius of that circle. […] Oh! Please look at the animations as two separate things: they illustrate the same idea, but they’re not synchronized, unfortunately! 🙂

So… I hope it all makes sense now: if we add one wavelength to the argument of our wave function, we should get the same value, and so it’s equivalent to adding 2π to the argument of our sine function. Adding half a wavelength, or 35% of it, or a quarter, or two wavelengths, or e wavelengths, etc is equivalent to adding π, or 35%·2π ≈ 2.2, or 2π/4 = π/2, or 2·2π = 4π, or e·2π, etc to it. So… Well… Think about it: to go from the argument of our wavefunction expressed as a number of wavelengths − so that’s (x–ct)/λ – to the argument of our sine function, which is expressed in radians, we need to multiply by 2π.
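Here is that ‘add one wavelength, get the same value’ idea checked numerically, again with the λ = 4 m, c = 20 m/s wave (the point x = 1.3, t = 0.7 is arbitrary):

```python
import math

lam, c = 4.0, 20.0  # wavelength and speed from the earlier example

def y(x, t):
    return math.sin(2 * math.pi * (x - c * t) / lam)

x, t = 1.3, 0.7
# Adding one full wavelength to x adds exactly 2π to the sine's
# argument, so the wave value is unchanged:
print(y(x, t), y(x + lam, t))
```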

[…] OK, Vincent. If it’s easier for you, you may want to think of the 1/λ and 2π factors in the argument of the sin(2π(x–ct)/λ) function as scaling factors: you’d use a scaling factor when you go from one measurement scale to another indeed. It’s like using vincents rather than meter. If one vincent corresponds to 1.8 m, then we need to re-scale all lengths by dividing them by 1.8 so as to express them in vincents. Vincent ten year ago was 0.9 m, so that’s half a vincent: 0.9/1.8 = 0.5. 🙂

[…] OK. […] Yes, you’re right: that’s rather stupid and makes nobody smile. Fine. You’re right: it’s time to move on to more complicated stuff. Now, read the following a couple of times. It’s my one and only message to you:

If there’s anything at all that you should remember from all of the nonsense I am writing about in this physics blog, it’s that any periodic phenomenon, any motion really, can be analyzed by assuming that it is the sum of the motions of all the different modes of what we’re looking at, combined with appropriate amplitudes and phases.

It really is a most amazing thing—it’s something very deep and very beautiful connecting all of physics with math.

We often refer to these modes as harmonics and, in one of my posts on the topic, I explained how the wavelengths of the harmonics of a classical guitar string – it’s just an example – depended on the length of the string only. Indeed, if we denote the various harmonics by their harmonic number n = 1, 2, 3,… n,… and the length of the string by L, we have λ1 = 2L = (1/1)·2L, λ2 = L = (1/2)·2L, λ3 = (1/3)·2L,… λn = (1/n)·2L. So they look like this:

etcetera (1/8, 1/9,…,1/n,… 1/∞)
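The λn = (1/n)·2L rule is a one-liner. I’m assuming a string length of 0.65 m here, which is a typical guitar scale length, just to have a concrete number:

```python
L = 0.65  # string length in meters (typical guitar scale length, an assumed value)

# λn = (1/n)·2L for the first five harmonics:
wavelengths = [2 * L / n for n in range(1, 6)]
print(wavelengths)  # λ1 = 2L, λ2 = L, λ3 = 2L/3, ...
```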

The diagram makes it look like it’s very obvious, but it’s an amazing fact: the material of the string, or its tension, doesn’t matter. It’s just the length: simple geometry is all that matters! As I mentioned in my post on music and physics, this realization led to a somewhat misplaced fascination with harmonic ratios, which the Greeks thought could explain everything. For example, the Pythagorean model of the orbits of the planets would also refer to these harmonic ratios, and it took intellectual giants like Galileo and Copernicus to finally convince the Pope that harmonic ratios are great, but that they cannot explain everything. 🙂 [Note: When I say that the material of the string, or its tension, doesn’t matter, I should correct myself: they do come into play when time becomes the variable. Also note that guitar strings are not the same length when strung on a guitar: the so-called bridge saddle is not at an exact right angle to the strings: this is a link to some close-up pictures of a bridge saddle on a guitar, just in case you don’t have a guitar at home to check.]

Now, I already explained the need to express the argument of a wave function in radians – because we’re talking periodic functions and so we want to use sinusoidals − and how it’s just a matter of units really, and so how we can go from meter to wavelengths to radians. I also explained how we could do the same for seconds, i.e. for time. The key to converting distance units to time units, and vice versa, is the speed of the wave, or the phase velocity, which relates wavelength and frequency: c = λ·f. Now, as we have to express everything in radians anyway, we’ll usually substitute the wavelength and frequency by the wavenumber and the angular frequency so as to convert these quantities too to something expressed in radians. Let me quickly explain how it works:

1. The wavenumber k is equal to k = 2π/λ, so it’s some number expressed in radians per unit distance, i.e. radians per meter. In the example above, where λ was 4 m, we have k = 2π/(4 m) = π/2 radians per meter. To put it differently, if our wave travels one meter, its phase θ will change by π/2.
2. Likewise, the angular frequency is ω = 2π·f = 2π/T. Using the same example once more, so assuming a frequency of 5 Hz, i.e. a period of one fifth of a second, we have ω = 2π/[(1/5)·s] = 10π per second. So the phase of our wave will change by 10π in one second. Now that makes sense because, in one second, we have five cycles, and so that corresponds to 5 times 2π.

Note that our definition implies that λ = 2π/k, and that it’s also easy to figure out that our definition of ω, combined with the f = c/λ relation, implies that ω = 2π·c/λ and, hence, that c = ω·λ/(2π) = (ω·2π/k)/(2π) = ω/k. OK. Let’s move on.
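The two definitions, and the c = ω/k check, with the same numbers as before:

```python
import math

c, lam = 20.0, 4.0         # phase velocity and wavelength from the example
f = c / lam                # 5 Hz

k = 2 * math.pi / lam      # wavenumber: π/2 radians per meter
omega = 2 * math.pi * f    # angular frequency: 10π radians per second

print(k, omega, omega / k) # ω/k gives back the phase velocity c
```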

Using the definitions and explanations above, it’s now easy to see that we can re-write our y = sin(2π(x–ct)/λ) as:

y = sin(2π(x–ct)/λ) = sin[2π(x–(ω/k)t)/(2π/k)] = sin[(x–(ω/k)t)·k] = sin(kx–ωt)
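A quick sanity check that the two forms of the argument really are the same function (the sample points are arbitrary):

```python
import math

c, lam = 20.0, 4.0
k = 2 * math.pi / lam
omega = c * k

# Both ways of writing the argument give the same wave value everywhere:
for x, t in [(0.0, 0.0), (1.7, 0.3), (-2.5, 1.1)]:
    a = math.sin(2 * math.pi * (x - c * t) / lam)
    b = math.sin(k * x - omega * t)
    print(a, b)
```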

Remember, however, that we were talking some wave that was traveling in the positive x-direction. For the negative x-direction, the equation becomes:

y = sin(2π(x+ct)/λ) = sin(kx+ωt)

OK. That should be clear enough. Let’s go back to our guitar string. We can go from λ to k by noting that λ = 2L and, hence, we get the following for all of the various modes:

k = k1 = 2π·1/(2L) = π/L, k2 = 2π·2/(2L) = 2k, k3 = 2π·3/(2L) = 3k,… kn = 2π·n/(2L) = nk,…

That gives us our grand result, and that’s that we can write some very complicated waveform Ψ(x) as the sum of an infinite number of simple sinusoids, so we have:

Ψ(x) = a1sin(kx) + a2sin(2kx) + a3sin(3kx) + … + ansin(nkx) + … = ∑ ansin(nkx)

The equation above assumes we’re looking at the oscillation at some fixed point in time. If we’d be looking at the oscillation at some fixed point in space, we’d write:

Φ(t) = a1sin(ωt) + a2sin(2ωt) + a3sin(3ωt) + … + ansin(nωt) + … = ∑ ansin(nωt)
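To make the Ψ(x) sum concrete: here it is for the first three modes, with made-up amplitudes. Whatever amplitudes you pick, every term sin(n·k·x) with k = π/L vanishes at x = 0 and x = L, so the sum respects the fixed ends automatically:

```python
import math

L = 1.0                  # string length, arbitrary unit
k = math.pi / L          # wavenumber of the first mode
a = [1.0, 0.5, 0.25]     # made-up amplitudes for the first three modes

def psi(x):
    """Ψ(x) = Σ a_n·sin(n·k·x), summed over the first few modes."""
    return sum(a[n - 1] * math.sin(n * k * x) for n in range(1, len(a) + 1))

# Whatever the amplitudes, the sum vanishes at both fixed ends:
print(psi(0.0), psi(L))
```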

Of course, to represent some very complicated oscillation on our guitar string, we can and should combine some Ψ(x) as well as some Φ(t) function, but how do we do that, exactly? Well… We’ll obviously need both the sin(kx–ωt) as well as those sin(kx+ωt) functions, as I’ll explain in a moment. However, let me first make another small digression, so as to complete your knowledge of wave mechanics. 🙂

We look at a wave as something that’s traveling through space and time at the same time. In that regard, I told you that the speed of the wave is its so-called phase velocity, which we denoted as vp = c and which, as I explained above, is equal to vp = c = λ·f = (2π/k)·(ω/2π) = ω/k. The animation below (credit for it must go to Wikipedia—and sorry I forgot to acknowledge the same source for the illustrations above) illustrates the principle: the speed of travel of the red dot is the phase velocity. But you can see that what’s going on here is somewhat more complicated: we have a series of wave packets traveling through space and time here, and so that’s where the concept of the so-called group velocity comes in: it’s the speed of travel of the green dot.

Now, look at the animation below. What’s going on here? The wave packet (or the group or the envelope of the wave—whatever you want to call it) moves to the right, but the phase goes to the left, as the peaks and troughs move leftward indeed. Huh? How is that possible? And where is this wave going? Left or right? Can we still associate some direction with the wave here? It looks like it’s traveling in both directions at the same time!

The wave actually does travel in both directions at the same time. Well… Sort of. The point is actually quite subtle. When I started this post by writing that the pulses were ‘obviously’ traveling in the positive x-direction… Well… That’s actually not so obvious. What is it that is traveling really? Think about an oscillating guitar string: nothing travels left or right really. Each point on the string just moves up and down. Likewise, if our repeated pulse is some water wave, then the water just stays where it is: it just moves up and down. Likewise, if we shake up some rope, the rope is not going anywhere: we just started some motion that is traveling down the rope. In other words, the phase velocity is just a mathematical concept. The peaks and troughs that seem to be traveling are just mathematical points that are ‘traveling’ left or right.

What about the group velocity? Is that a mathematical notion too? It is. The wave packet is often referred to as the envelope of the wave curves, for obvious reasons: the curves are enveloped indeed. Well… Sort of. 🙂 However, while both the phase and group velocity are velocities of mathematical constructs, it’s obvious that, if we’re looking at wave packets, the group velocity would be of more interest to us than the phase velocity. Think of those repeated pulses as real water waves, for example: while the water stays where it is (as mentioned, the water molecules just go up and down—more or less, at least), we’d surely be interested to know how fast these waves are ‘moving’, and that’s given by the group velocity, not the phase velocity. Still, having said that, the group velocity is as ‘unreal’ as the phase velocity: both are mathematical concepts. The only thing that’s ‘real’ is the up and down movement. Nothing travels in reality. Now, I shouldn’t digress too much here, but that’s why there’s no limit on the phase velocity: it can exceed the speed of light. In fact, in quantum mechanics, some real-life particle − like an electron, for instance – will be represented by a complex-valued wave function, and there’s no reason to put some limit on the phase velocity. In contrast, the group velocity will actually be the speed of the electron itself, and that speed can, obviously, approach the speed of light – in particle accelerators, for example – but it can never exceed it. [If you’re smart, and you are, you’ll wonder: what about photons? Well… The classical and quantum-mechanical view of an electromagnetic wave are surely not the same, but they do have a lot in common: both photons and electromagnetic radiation travel at the speed c. Photons can do so because their rest mass is zero. But I can’t go into any more detail here, otherwise this thing will become way too long.]
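For the simplest wave packet of all – just two superposed sines with slightly different k and ω – the carrier moves at the average ω over the average k, while the envelope moves at Δω/Δk. A toy calculation (the numbers are made up, just to show the two speeds need not be equal):

```python
# Two superposed waves with slightly different wavenumbers and
# frequencies (made-up numbers, i.e. an assumed dispersion):
k1, k2 = 10.0, 10.5
w1, w2 = 20.0, 21.5

v_phase = (w1 + w2) / (k1 + k2)  # speed of the carrier, i.e. the 'red dot'
v_group = (w2 - w1) / (k2 - k1)  # speed of the envelope, i.e. the 'green dot'
print(v_phase, v_group)          # the two are different here
```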

OK. Let me get back to the issue at hand. So I’ll now revert to the simpler situation we’re looking at here, and so that’s these harmonic waves, whose form is a simple sinusoidal indeed. The animation below (and, yes, it’s also from Wikipedia) is the one that’s relevant for this situation. You need to study it for a while to understand what’s going on. As you can see, the green wave travels to the right, the blue one travels to the left, and the red wave function is the sum of both.

Of course, after all that I wrote above, I should use quotation marks and write ‘travel’ instead of travel, so as to indicate there’s nothing traveling really, except for those mathematical points, but then no one does that, and so I won’t do it either. Just make sure you always think twice when reading stuff like this! Back to the lesson: what’s going on here?

As I explained, the argument of a wave traveling towards the negative x-direction will be x+ct. Conversely, the argument of a wave traveling in the positive x-direction will be x–ct. Now, our guitar string is going nowhere, obviously: it’s like the red wave function above. It’s a so-called standing wave. The red wave function has nodes, i.e. points where there is no motion—no displacement at all! Between the nodes, every point moves up and down sinusoidally, but the pattern of motion stays fixed in space. So that’s the kind of wave function we want, and the animation shows us how we can get it.

Indeed, there’s a funny thing with fixed strings: when a wave reaches the clamped end of a string, it will be reflected with a change in sign, as illustrated below: we’ve got that F(x+ct) wave coming in, and then it goes back indeed, but with the sign reversed.

The illustration above speaks for itself but, of course, once again I need to warn you about the use of sentences like ‘the wave reaches the end of the string’ and/or ‘the wave gets reflected back’. You know what it really means now: it’s some movement that travels through space. […] In any case, let’s get back to the lesson once more: how do we analyze that?

Easy: the red wave function is the sum of two waves: one traveling to the right, and one traveling to the left. We’ll call these component waves F and G respectively, so we have y = F(x, t) + G(x, t). Let’s go for it.

Let’s first assume the string is not held anywhere, so that we have an infinite string along which waves can travel in either direction. In fact, the most general functional form to capture the fact that a waveform can travel in any direction is to write the displacement y as the sum of two functions: one wave traveling one way (which we’ll denote by F, indeed), and the other wave (which, yes, we’ll denote by G) traveling the other way. From the illustration above, it’s obvious that the F wave is traveling towards the negative x-direction and, hence, its argument will be x+ct. Conversely, the G wave travels in the positive x-direction, so its argument is x–ct. So we write:

y = F(x, t) + G(x, t) = F(x+ct) + G(x–ct)

So… Well… We know that the string is actually not infinite, but that it’s fixed to two points. Hence, y is equal to zero there: y = 0. Now let’s choose the origin of our x-axis at the fixed end so as to simplify the analysis. Hence, where y is zero, x is also zero. Now, at x = 0, our general solution above for the infinite string becomes y = F(ct) + G(−ct) = 0, for all values of t. Of course, that means G(−ct) must be equal to –F(ct). Now, that equality is there for all values of t. So it’s there for all values of ct and −ct. In short, that equality is valid for whatever value of the argument of G and –F. As Feynman puts it: “G of anything must be –F of minus that same thing.” Now, the ‘anything’ in G is its argument: x – ct, so ‘minus that same thing’ is –(x–ct) = −x+ct. Therefore, our equation becomes:

y = F(x+ct) − F(−x+ct)
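You can verify that this combination really does respect the clamped end. Using a sine as a stand-in for F (any shape works), y = F(x+ct) − F(−x+ct) is identically zero at x = 0:

```python
import math

c = 1.0  # any wave speed will do here

def F(u):
    return math.sin(u)  # a stand-in for the incoming periodic waveform

def y(x, t):
    """The incoming wave plus its sign-reversed reflection."""
    return F(x + c * t) - F(-x + c * t)

# At the clamped end x = 0 the two component waves cancel at all times:
for t in [0.0, 0.3, 1.7, 4.2]:
    print(y(0.0, t))
```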

So that’s what’s depicted in the diagram above: the F(x+ct) wave ‘vanishes’ behind the wall as the −F(−x+ct) wave comes out of it. Now, of course, so as to make sure our guitar string doesn’t stop its vibration after being plucked, we need to ensure F is a periodic function, like a sin(kx+ωt) function. 🙂 Why? Well… If these F and G functions were to simply disappear after ‘serving’ only once, so to speak, then we’d only have one oscillation and that would be it! So the waves need to continue, and that’s why F needs to be periodic.

OK. Can we just take sin(kx+ωt) and −sin(−kx+ωt) and add both? It makes sense, doesn’t it? Indeed, −sinα = sin(−α) and, therefore, −sin(−kx+ωt) = sin(kx−ωt). Hence, y = F(x+ct) − F(−x+ct) would be equal to:

y = sin(kx+ωt) + sin(kx–ωt) = sin(2π(x+ct)/λ) + sin(2π(x−ct)/λ)

Done! Let’s use specific values for k and ω now. For the first harmonic, we know that k = 2π/(2L) = π/L. What about ω? Hmm… That depends on the wave velocity and, therefore, that actually does depend on the material and/or the tension of the string! The only thing we can say is that ω = c·k, so ω = c·2π/λ = c·π/L. So we get:

sin(kx+ωt) = sin(π·x/L + π·c·t/L) = sin[(π/L)·(x+ct)]

But this is our F function only. The whole oscillation is y = F(x+ct) − F(−x+ct), and − F(−x+ct) is equal to:

–sin[(π/L)·(−x+ct)] = –sin(−π·x/L+π·c·t/L) = −sin(−kx+ωt) = sin(kx–ωt) = sin[(π/L)·(x–ct)]

So, yes, we should add both functions to get:

y = sin[π(x+ct)/L] + sin[π(x−ct)/L]

Now, we can, of course, apply our trigonometric formulas for the addition of angles, which say that sin(α+β) = sinαcosβ + sinβcosα and sin(α–β) = sinαcosβ – sinβcosα. Hence, y = sin(kx+ωt) + sin(kx–ωt) is equal to sin(kx)cos(ωt) + sin(ωt)cos(kx) + sin(kx)cos(ωt) – sin(ωt)cos(kx) = 2sin(kx)cos(ωt). Now, that’s a very interesting result, so let’s give it some more prominence by writing it in boldface:

y = sin(kx+ωt) + sin(kx–ωt) = 2sin(kx)cos(ωt) = 2sin(π·x/L)cos(π·c·t/L)
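If you don’t trust the trigonometry, Vincent, let the computer check the identity at some arbitrary point and time (L and c are arbitrary values here):

```python
import math

L, c = 1.0, 2.0        # string length and wave speed (arbitrary values)
k = math.pi / L        # first harmonic
omega = c * k

x, t = 0.37, 1.23      # any point, any time
lhs = math.sin(k * x + omega * t) + math.sin(k * x - omega * t)
rhs = 2 * math.sin(k * x) * math.cos(omega * t)
print(lhs, rhs)        # the sum of the two traveling waves equals the standing wave
```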

The sin(π·x/L) factor gives us the nodes in space. Indeed, sin(π·x/L) = 0 if x is equal to 0 or L (values of x outside of the [0, L] interval are obviously not relevant here). Now, the other factor cos(π·c·t/L) can be re-written as cos(2π·c·t/λ) = cos(2π·f·t) = cos(2π·t/T), with T the period T = 1/f = λ/c, so the amplitude reaches a maximum (+1 or −1 or, including the factor 2, +2 or −2) if 2π·t/T is equal to a multiple of π, so that’s if t = n·T/2 with n = 0, 1, 2, etc. In our example above, for f = 5 Hz, that means the amplitude reaches a maximum (+2 or −2) every tenth of a second.

The analysis for the other modes is as easy, and I’ll leave it to you, Vincent, as an exercise, to work it all out and send me the y = 2·sin[something]·cos[something else] formula (with the ‘something’ and ‘something else’ written in terms of L and c, of course) for the higher harmonics. 🙂

[…] You’ll say: what’s the point, daddy? Well… Look at that animation again: isn’t it great we can analyze any standing wave, or any harmonic indeed, as the sum of two component waves with the same wavelength and frequency but ‘traveling’ in opposite directions?

Yes, Vincent. I can hear you sigh: “Daddy, I really do not see why I should be interested in this.”

Well… Your call… What can I say? Maybe one day you will. In fact, if you’re going to go for engineering studies, you’ll have to. 🙂

To conclude this post, I’ll insert one more illustration. Now that you know what modes are, you can start thinking about those more complicated Ψ and Φ functions. The illustration below shows how the first and second mode of our guitar string combine to give us some composite wave traveling up and down the very same string.

Think about it. We have one physical phenomenon here: at every point in time, the string is somewhere, but where exactly, depends on the mathematical shape of its components. If this doesn’t illustrate the beauty of Nature, the fact that, behind every simple physical phenomenon − most of which are some sort of oscillation indeed − we have some marvelous mathematical structure, then… Well… Then I don’t know how to explain why I am absolutely fascinated by this stuff.

My examples of waves above were all examples of so-called transverse waves, i.e. oscillations at a right angle to the direction of travel of the wave. The other type of wave is longitudinal. The sound waves I mentioned above are, essentially, longitudinal: there, the displacement of the medium is in the same direction as the wave travels, as illustrated below.

Real-life waves, like water waves, may be neither of the two. The illustration below shows how water molecules actually move as a wave passes. They move in little circles, with a systematic phase shift from circle to circle.

Why is this so? I’ll let Feynman answer, as he also provided the illustration above:

“Although the water at a given place is alternately trough or hill, it cannot simply be moving up and down, by the conservation of water. That is, if it goes down, where is the water going to go? The water is essentially incompressible. The speed of compression of waves—that is, sound in the water—is much, much higher, and we are not considering that now. Since water is incompressible on this scale, as a hill comes down the water must move away from the region. What actually happens is that particles of water near the surface move approximately in circles. When smooth swells are coming, a person floating in a tire can look at a nearby object and see it going in a circle. So it is a mixture of longitudinal and transverse, to add to the confusion. At greater depths in the water the motions are smaller circles until, reasonably far down, there is nothing left of the motion.”

So… There you go… 🙂

Addendum 2: On non-periodic waves, i.e. pulses

A waveform is not necessarily periodic. The pulse we looked at might, for example, not repeat itself. In that case, it is not possible to define its wavelength. However, it’s still a wave and, hence, its functional form would still be some y = F(x−ct) or y = F(x+ct) form, depending on its direction of travel.

The example below also comes out of Feynman’s Lectures: electromagnetic radiation is caused by some accelerating electric charge – an electron, usually, because its mass is small and, hence, it’s much easier to move than a proton 🙂 – and then the electric field travels out in space. So the two diagrams below show (i) the acceleration (a) as a function of time (t) and (ii) the electric field strength (E) as a function of the distance (r). [To be fully precise, I should add that he ignores the 1/r variation, but that’s a fine point which doesn’t matter much here.]

He basically uses this illustration to explain why we can use a y = G(t–x/c) functional form to describe a wave. The point is: he actually talks about one pulse only here. So the F(x±ct) or G(t±x/c) or sin(kx±ωt) form has nothing to do with whether or not we’re looking at a periodic or non-periodic waveform. The gist of the matter is that we’ve got something moving through space, and it doesn’t matter whether it’s periodic or not: the periodicity or non-periodicity of a wave has nothing to do with the x±ct, t±x/c or kx±ωt shape of the argument of our wave function. The functional form of our argument is just the result of what I said about traveling along with our wave.

So what is it about periodicity then? Well… If periodicity kicks in, you’ll be talking sinusoidal functions, and so the circle will be needed once more. 🙂

Now, I mentioned we cannot associate any particular wavelength with such a non-periodic wave. Having said that, it’s still possible to analyze this pulse as a sum of sinusoids through a mathematical procedure which is referred to as the Fourier transform. If you’re going for engineering, you’ll need to learn how to master this technique. As for now, however, you can just have a look at the Wikipedia article on it. 🙂
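Just to give you a taste of what the Fourier transform does, here is a hand-rolled discrete version applied to a single Gaussian pulse (the pulse shape and the number of samples are arbitrary choices of mine, and real code would use an FFT library instead of this slow sum):

```python
import math

# Sample a single, non-repeating pulse (a Gaussian) at N points:
N = 64
pulse = [math.exp(-((i - N / 2) / 4.0) ** 2) for i in range(N)]

def dft_magnitude(signal, m):
    """Magnitude of the m-th discrete Fourier coefficient, summed by hand."""
    re = sum(s * math.cos(2 * math.pi * m * n / N) for n, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * m * n / N) for n, s in enumerate(signal))
    return math.hypot(re, im)

# The pulse decomposes into many sinusoids: several coefficients are
# sizeable, and for a smooth pulse they fall off with frequency.
magnitudes = [dft_magnitude(pulse, m) for m in range(8)]
print(magnitudes)
```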

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:


# The Uncertainty Principle revisited

Pre-script (dated 26 June 2020): This post has become less relevant (even irrelevant, perhaps) because my views on all things quantum-mechanical have evolved significantly as a result of my progression towards a more complete realist (classical) interpretation of quantum physics. I keep blog posts like these mainly because I want to keep track of where I came from. I might review them one day, but I currently don’t have the time or energy for it. 🙂

Original post:

I’ve written a few posts on the Uncertainty Principle already. See, for example, my post on the energy-time expression for it (ΔE·Δt ≥ h). So why am I coming back to it once more? Not sure. I felt I left some stuff out. So I am writing this post to just complement what I wrote before. I’ll do so by explaining, and commenting on, the ‘semi-formal’ derivation of the so-called Kennard formulation of the Principle in the Wikipedia article on it.

The Kennard inequalities, σxσp ≥ ħ/2 and σEσt ≥ ħ/2, are more accurate than the more general Δx·Δp ≥ h and ΔE·Δt ≥ h expressions one often sees, which are an early formulation of the Principle by Niels Bohr, and which Heisenberg himself used when explaining the Principle in a thought experiment picturing a gamma-ray microscope. I presented Heisenberg’s thought experiment in another post, and so I won’t repeat myself here. I just want to mention that it ‘proves’ the Uncertainty Principle using the Planck-Einstein relations for the energy and momentum of a photon:

E = hf and p = h/λ

Heisenberg’s thought experiment is not a real proof, of course. But then what’s a real proof? The mentioned ‘semi-formal’ derivation looks more impressive, because more mathematical, but it’s not a ‘proof’ either (I hope you’ll understand why I am saying that after reading my post). The main difference between Heisenberg’s thought experiment and the mathematical derivation in the mentioned Wikipedia article is that the ‘mathematical’ approach is based on the de Broglie relation. That de Broglie relation looks the same as the Planck-Einstein relation (p = h/λ) but it’s fundamentally different.

Indeed, the momentum of a photon (i.e. the p we use in the Planck-Einstein relation) is not the momentum one associates with a proper particle, such as an electron or a proton, for example (so that’s the p we use in the de Broglie relation). The momentum of a particle is defined as the product of its mass (m) and velocity (v). Photons don’t have a (rest) mass, and their velocity is absolute (c), so how do we define momentum for a photon? There are a couple of ways to go about it, but the two most obvious ones are probably the following:

1. We can use the classical theory of electromagnetic radiation and show that the momentum of a photon is related to the magnetic field (we usually only analyze the electric field), and the so-called radiation pressure that results from it. It yields the p = E/c formula which we need to go from E = hf to p = h/λ, using the ubiquitous relation between the frequency, the wavelength and the wave velocity (c = λf). (In case you’re interested in the detail, just click on the radiation pressure link.)
2. We can also use the mass-energy equivalence E = mc². Hence, the equivalent mass of the photon is E/c², which is relativistic mass only. However, we can multiply that mass with the photon’s velocity, which is c, thereby getting the very same value for its momentum: p = (E/c²)·c = E/c.

So Heisenberg’s ‘proof’ uses the Planck-Einstein relations, as it analyzes the Uncertainty Principle more as an observer effect: probing matter with light, so to say. In contrast, the mentioned derivation takes the de Broglie relation itself as the point of departure. As mentioned, the de Broglie relations look exactly the same as the Planck-Einstein relations (E = hf and p = h/λ) but the model behind them is very different. In fact, that’s what the Uncertainty Principle is all about: it says that the de Broglie frequency and/or wavelength cannot be determined exactly: if we want to localize a particle, somewhat at least, we’ll be dealing with a frequency range Δf. As such, the de Broglie relation is actually somewhat misleading at first. Let’s talk about the model behind it.

A particle, like an electron or a proton, traveling through space, is described by a complex-valued wavefunction, usually denoted by the Greek letter psi (Ψ) or phi (Φ). This wavefunction has a phase, usually denoted as θ (theta) which – because we assume the wavefunction is a nice periodic function – varies as a function of time and space. To be precise, we write θ as θ = ωt – kx or, if the wave is traveling in the other direction, as θ = kx – ωt.

I’ve explained this in a couple of posts already, including my previous post, so I won’t repeat myself here. Let me just note that ω is the angular frequency, which we express in radians per second, rather than cycles per second, so ω = 2πf (one cycle covers 2π rad). As for k, that’s the wavenumber, which is often described as the spatial frequency, because it’s expressed in cycles per meter or, more often (and surely in this case), in radians per meter. Hence, if we freeze time, this number is the rate of change of the phase in space. Because one cycle is, again, 2π rad, and one cycle corresponds to the wave traveling one wavelength (i.e. λ meter), it’s easy to see that k = 2π/λ. We can use these definitions to re-write the de Broglie relations E = hf and p = h/λ as:

E = ħω and p = ħk, with ħ = h/2π
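A quick numerical sanity check on these definitions never hurts. The wavelength below is just an illustrative pick:

```python
import math

h = 6.626e-34             # Planck's constant, J·s
hbar = h / (2 * math.pi)  # ħ = h/2π

lam = 1.0e-10             # an illustrative de Broglie wavelength (1 Å)
k = 2 * math.pi / lam     # k = 2π/λ, in rad/m

p_from_lambda = h / lam   # p = h/λ
p_from_k = hbar * k       # p = ħk — the same number, by construction
print(p_from_lambda, p_from_k)
```

Indeed, ħk = (h/2π)·(2π/λ) = h/λ: the two forms of the de Broglie relation are one and the same thing.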

What about the wave velocity? For a photon, we have c = λf and, hence, c = (2π/k)(ω/2π) = ω/k. For ‘particle waves’ (or matter waves, if you prefer that term), it’s much more complicated, because we need to distinguish between the so-called phase velocity (vp) and the group velocity (vg). The phase velocity is what we’re used to: it’s the product of the frequency (the number of cycles per second) and the wavelength (the distance traveled by the wave over one cycle), or the ratio of the angular frequency and the wavenumber, so we have, once again, λf = ω/k = vp. However, this phase velocity is not the classical velocity of the particle that we are looking at. That’s the so-called group velocity, which corresponds to the velocity of the wave packet representing the particle (or ‘wavicle’, if you prefer that term), as illustrated below.
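You can see the difference between the two velocities with nothing but arithmetic. Superpose two cosines with nearby wavenumbers and frequencies: the ripples inside the resulting packet move at the phase velocity ω/k, while the envelope (the ‘group’) moves at Δω/Δk. The (k, ω) values below are made up purely for illustration:

```python
# Two superposed cosines with nearby (k, ω) pairs; values are made up.
k1, w1 = 10.0, 100.0
k2, w2 = 11.0, 105.0

v_phase = ((w1 + w2) / 2) / ((k1 + k2) / 2)  # ω/k for the average wave
v_group = (w2 - w1) / (k2 - k1)              # Δω/Δk for the envelope

print(v_phase, v_group)  # ≈ 9.76 versus 5.0: the phase outruns the group
```

With these numbers the phase moves almost twice as fast as the group, which is exactly what the red and green dots in the animation below are showing.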

The animation below illustrates the difference between the phase and the group velocity even more clearly: the green dot travels with the ‘wavicles’, while the red dot travels with the phase. As mentioned above, the group velocity corresponds to the classical velocity of the particle (v). The phase, however, actually travels faster than light here. It is a mathematical point only, which does not carry a signal (unlike the modulation of the wave itself, i.e. the traveling ‘groups’) and, hence, it does not contradict the fundamental principle of relativity theory: the speed of light is absolute, and nothing travels faster than light (except mathematical points, as you can, hopefully, appreciate now).

The two animations above do not represent the quantum-mechanical wavefunction, because the functions that are shown are real-valued, not complex-valued. To imagine a complex-valued wave, you should think of something like the ‘wavicle’ below or, if you prefer animations, the standing waves underneath (i.e. C to H: A and B just present the mathematical model behind, which is that of a mechanical oscillator, like a mass on a spring indeed). These representations clearly show the real as well as the imaginary part of complex-valued wave-functions.

With this general introduction, we are now ready for the more formal treatment that follows. So our wavefunction Ψ is a complex-valued function in space and time. A very general shape for it is one we used in a couple of posts already:

Ψ(x, t) ∝ e^i(kx – ωt) = cos(kx – ωt) + i·sin(kx – ωt)

If you don’t know anything about complex numbers, I’d suggest you read my short crash course on it in the essentials page of this blog, because I don’t have the space nor the time to repeat all of that. Now, we can use the de Broglie relationship relating the momentum of a particle with a wavenumber (p = ħk) to re-write our psi function as:

Ψ(x, t) ∝ e^i(kx – ωt) = e^i(px/ħ – ωt)

Note that I am using the ‘proportional to’ symbol (∝) because I don’t worry about normalization right now. Indeed, from all of my other posts on this topic, you know that we have to take the absolute square of all these probability amplitudes to arrive at a probability density function, describing the probability of the particle effectively being at point x in space at point t in time, and that all those probabilities, over the function’s domain, have to add up to 1. So we should insert some normalization factor.

Having said that, the problem with the wavefunction above is not normalization really, but the fact that it yields a uniform probability density function. In other words, the particle position is extremely uncertain in the sense that it could be anywhere. Let’s calculate it using a little trick: the absolute square of a complex number equals the product of itself with its (complex) conjugate. Hence, if z = r·e^(iθ), then │z│² = z·z* = r·e^(iθ)·r·e^(–iθ) = r²·e^(iθ–iθ) = r²·e^0 = r². Now, in this case, assuming unique values for k, ω, p, which we’ll note as k0, ω0, p0 (and, because we’re freezing time, we can also write t = t0), we should write:
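You can verify the little trick numerically; r and θ below are arbitrary:

```python
import cmath

# z = r·e^(iθ): the absolute square z·z* equals r², whatever θ is.
r, theta = 2.5, 0.8       # arbitrary values
z = r * cmath.exp(1j * theta)

abs_square = (z * z.conjugate()).real   # z·z* is real (imaginary part ~ 0)
print(abs_square)  # ≈ 6.25 = r²
```

Note that θ has dropped out completely: that is precisely why a single-mode wavefunction gives a flat probability density, as we are about to see.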

│Ψ(x)│² = │a0·e^i(p0x/ħ – ω0t0)│² = │a0·e^(ip0x/ħ)│²·│e^(–iω0t0)│² = a0²

Note that, this time around, I did insert some normalization constant a0 as well, so that’s OK. But so the problem is that this very general shape of the wavefunction gives us a constant as the probability for the particle being somewhere between some point a and another point b in space. More formally, we get the area of a rectangle when we calculate the probability P[a ≤ X ≤ b] as we should calculate it, which is as follows: P[a ≤ X ≤ b] = ∫ │Ψ(x)│² dx, with the integral taken from a to b.

More specifically, because we’re talking one-dimensional space here, we get P[a ≤ X ≤ b] = (b–a)·a0². Now, you may think that such a uniform probability makes sense. For example, an electron may be in some orbital around a nucleus, and so you may think that all ‘points’ on the orbital (or within the ‘sphere’, or whatever volume it is) may be equally likely. Or, in another example, we may know an electron is going through some slit and, hence, we may think that all points in that slit should be equally likely positions. However, we know that it is not the case. Measurements show that not all points are equally likely. For an orbital, we get complicated patterns, such as the one shown below, and please note that the different colors represent different complex numbers and, hence, different probabilities.
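Here is a small numerical check of that uniformity; the values of a0 and p0 are arbitrary:

```python
import cmath

# Ψ(x) = a0·e^(i·p0·x/ħ): the density |Ψ(x)|² is the same constant a0²
# at every x. The values of a0 and p0 are arbitrary.
hbar = 1.0545718e-34
a0, p0 = 0.3, 2.0e-24

densities = [abs(a0 * cmath.exp(1j * p0 * x / hbar)) ** 2
             for x in (0.0, 1e-10, 5e-10, 3e-9)]
print(densities)  # all equal to a0² = 0.09

# So P[a <= X <= b] is just the area of a rectangle, (b - a)·a0²:
a, b = 1e-10, 4e-10
prob = (b - a) * a0 ** 2
```

The phase keeps turning as x changes, but its modulus never does: that is the flat line we cannot accept as a description of a real particle.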

Also, we know that electrons going through a slit will produce an interference pattern—even if they go through it one by one! Hence, we cannot associate some flat line with them: it has to be a proper wavefunction which implies, once again, that we can’t accept a uniform distribution.

In short, uniform probability density functions are not what we see in Nature. They’re non-uniform, like the (very simple) non-uniform distributions shown below. [The left-hand side shows the wavefunction, while the right-hand side shows the associated probability density function: the first two are static (i.e. they do not vary in time), while the third one shows a probability distribution that does vary with time.]

I should also note that, even if you would dare to think that a uniform distribution might be acceptable in some cases (which, let me emphasize this, it is not), an electron can surely not be ‘anywhere’. Indeed, the normalization condition implies that, if we’d have a uniform distribution and if we’d consider all of space, i.e. if we let a go to –∞ and b to +∞, then a0 would tend to zero, which means we’d have a particle that is, literally, everywhere and nowhere at the same time.

In short, a uniform probability distribution does not make sense: we’ll generally have some idea of where the particle is most likely to be, within some range at least. I hope I made myself clear here.

Now, before I continue, I should make some other point as well. You know that the Planck constant (h or ħ) is unimaginably small: about 1×10−34 J·s (joule-second). In fact, I’ve repeatedly made that point in various posts. However, having said that, I should add that, while it’s unimaginably small, the uncertainties involved are quite significant. Let us indeed look at the value of ħ by relating it to that σxσp ≥ ħ/2 relation.

Let’s first look at the units. The uncertainty in the position should obviously be expressed in distance units, while momentum is expressed in kg·m/s units. So that works out, because 1 joule is the energy transferred (or work done) when applying a force of 1 newton (N) over a distance of 1 meter (m). In turn, one newton is the force needed to accelerate a mass of one kg at the rate of 1 meter per second per second (this is not a typing mistake: it’s an acceleration of 1 m/s per second, so the unit is m/s2: meter per second squared). Hence, 1 J·s = 1 N·m·s = 1 kg·m/s2·m·s = kg·m2/s. Now, that’s the same dimension as the ‘dimensional product’ for momentum and distance: m·kg·m/s = kg·m2/s.

Now, these units (kg, m and s) are all rather astronomical at the atomic scale and, hence, h and ħ are usually expressed in other dimensions, notably eV·s (electronvolt-second). However, using the standard SI units gives us a better idea of what we’re talking about. If we split the ħ = 1×10−34 J·s value (let’s forget about the 1/2 factor for now) ‘evenly’ over σx and σp – whatever that means: it all depends on the units, of course! – then both factors will have magnitudes of the order of 1×10−17: 1×10−17 m times 1×10−17 kg·m/s gives us 1×10−34 J·s.

You may wonder how this 1×10−17 m compares to, let’s say, the classical electron radius, for example. The classical electron radius is, roughly speaking, the ‘space’ an electron seems to occupy as it scatters incoming light. The idea is illustrated below (credit for the image goes to Wikipedia, as usual). The classical electron radius – or Thomson scattering length – is about 2.818×10−15 m, so that’s almost 300 times our ‘uncertainty’ (1×10−17 m). Not bad: it means that we can effectively relate our ‘uncertainty’ in regard to the position to some actual dimension in space. In this case, we’re talking the femtometer scale (1 fm = 10−15 m), and so you’ve surely heard of this before.

What about the other ‘uncertainty’, the one for the momentum (1×10−17 kg·m/s)? What’s the typical (linear) momentum of an electron? Its mass, expressed in kg, is about 9.1×10−31 kg. We also know its typical velocity relative to the speed of light: it’s that magical number α = v/c, about which I wrote in some other posts already, so v = αc ≈ 0.0073·3×10⁸ m/s ≈ 2.2×10⁶ m/s. Now, 9.1×10−31 kg times 2.2×10⁶ m/s is about 2×10−24 kg·m/s, so our proposed ‘uncertainty’ in regard to the momentum (1×10−17 kg·m/s) is some five million times larger than the typical value for it. Now that is, obviously, not so good. [Note that calculations like this are extremely rough. In fact, when one talks electron momentum, it’s usually angular momentum, which is ‘analogous’ to linear momentum, but angular momentum involves very different formulas. If you want to know more about this, check my post on it.]

Of course, now you may feel that we didn’t ‘split’ the uncertainty in a way that makes sense: those −17 exponents don’t work, obviously. So let’s take σp = 1×10−24 kg·m/s, i.e. about half of the typical electron momentum we just calculated (p = mαc ≈ 2×10−24 kg·m/s). Then we’d have 1×10−10 m for σx (1×10−10 m times 1×10−24 kg·m/s is, once again, 1×10−34 J·s). But then that uncertainty is a sizable number too: 1×10−10 m is 1 angstrom, i.e. the atomic scale itself! So it’s still huge as compared to the pico- or femtometer scale (1 pm = 1×10−12 m, 1 fm = 1×10−15 m) which we’d sort of expect to see when we’re talking electrons.
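For what it’s worth, here is that back-of-the-envelope arithmetic as a few lines of code, using rounded values for the constants:

```python
# Back-of-the-envelope check of the orders of magnitude, with rounded
# values for the constants involved.
m_e = 9.1e-31     # electron mass, kg
alpha = 0.0073    # fine-structure constant, ≈ v/c for an electron in an atom
c = 3.0e8         # speed of light, m/s
hbar = 1.0e-34    # reduced Planck constant, rounded, J·s

v = alpha * c           # ≈ 2.2e6 m/s
p_typical = m_e * v     # typical electron momentum, ≈ 2e-24 kg·m/s

# Fix σp at (roughly) half that typical momentum; σx·σp ≈ ħ then pins σx:
sigma_p = 1.0e-24
sigma_x = hbar / sigma_p
print(p_typical, sigma_x)  # ≈ 2e-24 kg·m/s and 1e-10 m (1 angstrom)
```

So, with a momentum uncertainty of the same order as the momentum itself, the position uncertainty lands right on the scale of an atom, which is, of course, no coincidence.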

OK. Let me get back to the lesson. Why this digression? Not sure. I think I just wanted to show that the Uncertainty Principle involves ‘uncertainties’ that are extremely relevant: despite the unimaginable smallness of the Planck constant, these uncertainties are quite significant at the atomic scale. But back to the ‘proof’ of Kennard’s formulation. Here we need to discuss the ‘model’ we’re using. The rather simple animation below (again, credit for it has to go to Wikipedia) illustrates it wonderfully.

Look at it carefully: we start with a ‘wave packet’ that looks a bit like a normal distribution, but it isn’t, of course. We have negative and positive values, and normal distributions don’t have that. So it’s a wave alright. Of course, you should, once more, remember that we’re only seeing one part of the complex-valued wave here (the real or imaginary part—it could be either). But so then we’re superimposing waves on it. Note the increasing frequency of these waves, and also note how the wave packet becomes increasingly localized with the addition of these waves. In fact, the so-called Fourier analysis, of which you’ve surely heard before, is a mathematical operation that does the reverse: it separates a wave packet into its individual component waves.

So now we know the ‘trick’ for reducing the uncertainty in regard to the position: we just add waves with different frequencies. Of course, different frequencies imply different wavenumbers and, through the de Broglie relationship, we’ll also have different values for the ‘momentum’ associated with these component waves. Let’s write these various values as kn, ωn, and pn respectively, with n going from 0 to N. Of course, our point in time remains frozen at t0. So we get a wavefunction that’s, quite simply, the sum of N component waves and so we write:

Ψ(x) = ∑ an·e^i(pnx/ħ – ωnt0) = ∑ an·e^(ipnx/ħ)·e^(–iωnt0) = ∑ An·e^(ipnx/ħ)

Note that, because of the e^(–iωnt0) factor, we now have complex-valued coefficients An = an·e^(–iωnt0) in front. More formally, we say that An represents the relative contribution of the mode pn to the overall Ψ(x) wave. Hence, we can write these coefficients A as a function of p. Because Greek letters always make more of an impression, we’ll use the Greek letter Φ (phi) for it. 🙂 Now, we can go to the continuum limit and, hence, transform that sum above into an infinite sum, i.e. an integral. So our wave function then becomes an integral over all possible modes, which we write as:

Ψ(x) = [1/√(2πħ)]·∫ Φ(p)·e^(ipx/ħ) dp, with the integral taken over all p, i.e. from –∞ to +∞

Don’t worry about that new 1/√2πħ factor in front. That’s, once again, something that has to do with normalization and scales. It’s the integral itself you need to understand. We’ve got that Φ(p) function there, which is nothing but our An coefficient, but for the continuum case. In fact, these relative contributions Φ(p) are now referred to as the amplitude of all modes p, and so Φ(p) is actually another wave function: it’s the wave function in the so-called momentum space.

You’ll probably be very confused now, and wonder where I want to go with an integral like this. The point to note is simple: if we have that Φ(p) function, we can calculate (or derive, if you prefer that word) the Ψ(x) from it using that integral above. Indeed, the integral above is referred to as the Fourier transform, and it’s obviously closely related to that Fourier analysis we introduced above.

Of course, there is also an inverse transform, which looks exactly the same: it just switches the wave functions (Ψ and Φ) and variables (x and p), and then (it’s an important detail!), it has a minus sign in the exponent. Together, the two functions – as defined by each other through these two integrals – form a so-called Fourier integral pair, also known as a Fourier transform pair, and the variables involved are referred to as conjugate variables. So momentum (p) and position (x) are conjugate variables and, likewise, energy and time are also conjugate variables (but so I won’t expand on the time-energy relation here: please have a look at one of my other posts on that).

Now, I thought of copying and explaining the proof of Kennard’s inequality from Wikipedia’s article on the Uncertainty Principle (you need to click on the show button in the relevant section to see it), but then that’s pretty boring math, and simply copying stuff is not my objective with this blog. More importantly, the proof has nothing to do with physics. Nothing at all. Indeed, it just proves a general mathematical property of Fourier pairs. More specifically, it proves that, the more concentrated one function is, the more spread out its Fourier transform must be. In other words, it is not possible to arbitrarily concentrate both a function and its Fourier transform.

So, in this case, if we’d ‘squeeze’ Ψ(x), then its Fourier transform Φ(p) will ‘stretch out’, and so that’s what the proof in that Wikipedia article basically shows. In other words, there is some ‘trade-off’ between the ‘compaction’ of Ψ(x), on the one hand, and Φ(p), on the other, and so that is what the Uncertainty Principle is all about. Nothing more, nothing less.
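That trade-off is easy to demonstrate numerically. The sketch below (plain Python, with the discrete Fourier transform written out explicitly) measures the spread of the spectrum of a Gaussian pulse for two different pulse widths:

```python
import cmath, math

def spectrum_width(sigma_t, N=256):
    """Spread (std. dev.) of the DFT magnitude of a Gaussian pulse of
    time-domain width sigma_t (in samples)."""
    signal = [math.exp(-0.5 * ((n - N / 2) / sigma_t) ** 2) for n in range(N)]
    ks = list(range(-N // 2, N // 2))
    mags = [abs(sum(signal[n] * cmath.exp(-2j * math.pi * k * n / N)
                    for n in range(N)))
            for k in ks]
    total = sum(mags)
    mean = sum(k * m for k, m in zip(ks, mags)) / total
    return math.sqrt(sum((k - mean) ** 2 * m for k, m in zip(ks, mags)) / total)

wide_pulse = spectrum_width(20.0)   # broad in time -> narrow spectrum
narrow_pulse = spectrum_width(5.0)  # squeezed in time -> stretched spectrum
print(wide_pulse, narrow_pulse)
```

Squeezing the pulse by a factor of four stretches its spectrum by (roughly) the same factor of four: the product of the two widths stays (roughly) constant, which is the Kennard inequality in numerical disguise.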

But… Yes? What’s all this talk about ‘squeezing’ and ‘compaction’? We can’t change reality, can we? Well… Here we’re entering the philosophical field, of course. How do we interpret the Uncertainty Principle? It surely does look like the act of measuring something has some impact on the wavefunction. In fact, our measurement – of either position or momentum – usually makes the wavefunction collapse: we suddenly know where the particle is and, hence, Ψ(x) seems to collapse into one point. Alternatively, we measure its momentum and, hence, Φ(p) collapses.

That’s intriguing. In fact, even more intriguing is the possibility we may only partially affect those wavefunctions with measurements that are somewhat less ‘drastic’. It seems a lot of research is focused on that (just Google for partial collapse of the wavefunction, and you’ll find tons of references, including presentations like this one).

Hmm… I need to further study the topic. The decomposition of a wave into its component waves is obviously something that works well in physics—and not only in quantum mechanics but also in much more mundane examples. Its most general application is signal processing, in which we decompose a signal (which is a function of time) into the frequencies that make it up. Hence, our wavefunction model makes a lot of sense, as it mirrors the physics involved in oscillators and harmonics obviously.

Still… I feel it doesn’t answer the fundamental question: what is our electron really? What do those wave packets represent? Physicists will say questions like this don’t matter: as long as our mathematical models ‘work’, it’s fine. In fact, if even Feynman said that nobody – including himself – truly understands quantum mechanics, then I should just be happy and move on. However, for some reason, I can’t quite accept that. I should probably focus some more on that de Broglie relationship, p = h/λ, as it’s obviously as fundamental to my understanding of the ‘model’ of reality in physics as that Fourier analysis of the wave packet. So I need to do some more thinking on that.

The de Broglie relationship is not intuitive. In fact, I am not ashamed to admit that it actually took me quite some time to understand why we can’t just re-write the de Broglie relationship (λ = h/p) as an uncertainty relation itself: Δλ = h/Δp. Hence, let me be very clear on this:

Δx = h/Δp (that’s the Uncertainty Principle) but Δλ ≠ h/Δp!

Let me quickly explain why.

If the Δ symbol expresses a standard deviation (or some other measurement of uncertainty), we can write the following:

p = h/λ ⇒ Δp = Δ(h/λ) = h·Δ(1/λ) ≠ h/Δλ

So I can take h out of the brackets after the Δ symbol, because that’s one of the things that’s allowed when working with standard deviations. In particular, one can prove the following:

1. The standard deviation of a constant is 0: Δ(k) = 0.
2. The standard deviation is invariant under changes of location: Δ(x + k) = Δ(x).
3. Finally, the standard deviation scales directly with the scale of the variable: Δ(kx) = |k|·Δ(x).

However, it is not the case that Δ(1/x) = 1/Δx. But let’s not focus on what we cannot do with Δx: let’s see what we can do with it. Δx equals h/Δp according to the Uncertainty Principle—if we take it as an equality, rather than as an inequality, that is. And then we have the de Broglie relationship: p = h/λ. Hence, Δx must equal:
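The three rules—and the non-rule—are easy to check with Python’s statistics module; the data points are arbitrary:

```python
import statistics

sd = statistics.pstdev        # population standard deviation
x = [1.0, 2.0, 4.0, 8.0]      # arbitrary data points

rule1 = sd([5.0] * 4)                            # a constant: spread is 0
rule2 = sd([v + 3.0 for v in x]) - sd(x)         # shifting: spread unchanged
rule3 = sd([-2.0 * v for v in x]) - 2.0 * sd(x)  # scaling by k: spread × |k|

# But the spread of 1/x is NOT 1/(spread of x):
lhs = sd([1 / v for v in x])
rhs = 1 / sd(x)
print(rule1, rule2, rule3)  # all (close to) zero
print(lhs, rhs)             # two clearly different numbers
```

The last two printed numbers differ, and that is the whole point: Δ is not a function you can simply push through a reciprocal.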

Δx = h/Δp = h/[Δ(h/λ)] = h/[h·Δ(1/λ)] = 1/Δ(1/λ)

That’s obvious, but so what? As mentioned, we cannot write Δx = Δλ, because there’s no rule that says that Δ(1/λ) = 1/Δλ and, therefore, h/Δp ≠ Δλ. However, what we can do is define Δλ as an interval, or a length, defined by the difference between its lower and upper bound (let’s denote those two values by λa and λb respectively). Hence, we write Δλ = λb – λa. Note that this does not assume we have a continuous range of values for λ: we can have any number of wavelengths λ between λa and λb, but so you see the point: we’ve got a range of values λ, discrete or continuous, defined by some lower and upper bound.

Now, the de Broglie relation associates two values pa and pb with λa and λb respectively: pa = h/λa and pb = h/λb. Hence, we can similarly define the corresponding Δp interval as pa – pb. Note that, because we’re taking the reciprocal, we have to reverse the order of the values here: if λb > λa, then pa = h/λa > pb = h/λb. Hence, we can write Δp = Δ(h/λ) = pa – pb = h/λa – h/λb = h(1/λa – 1/λb) = h(λb – λa)/λaλb. In case you have a bit of difficulty, just draw some reciprocal functions (like the ones below), and have fun connecting intervals on the horizontal axis with intervals on the vertical axis using these functions.

Now, h(λb – λa)/λaλb is obviously something very different than h/Δλ = h/(λb – λa). So we can surely not equate the two and, hence, we cannot write that Δp = h/Δλ.
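A quick numerical illustration, with arbitrary wavelength bounds:

```python
h = 6.626e-34   # Planck's constant, J·s

# Pick arbitrary wavelength bounds and map them through p = h/λ:
lam_a, lam_b = 1.0e-10, 3.0e-10    # illustrative values only, in m

p_a, p_b = h / lam_a, h / lam_b    # note the order flip: λb > λa gives pa > pb
delta_p = p_a - p_b                # = h·(λb − λa)/(λa·λb)
wrong_guess = h / (lam_b - lam_a)  # = h/Δλ — NOT the same thing

print(delta_p, wrong_guess)  # two different numbers
```

With these particular bounds the two expressions differ by some thirty percent; with other bounds the gap can be made as large as you like.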

Having said that, the Δx = 1/Δ(1/λ) = λaλb/(λb – λa) that emerges here is quite interesting. We’ve got a ratio here, λaλb/(λb – λa), which shows that Δx depends only on the upper and lower bounds of the Δλ range. It does not depend on whether the interval is discrete or continuous.

The second thing that is interesting to note is that Δx depends not only on the difference between those two values (i.e. the length of the interval) but also on the values themselves: if the length of the interval, i.e. the difference between the two wavelengths, is the same, but the wavelengths as such are higher, then we get a higher value for Δx, i.e. a greater uncertainty in the position. Again, this shows that the relation between Δλ and Δx is not straightforward. But so we knew that already, and so I’ll end this post right here and right now. 🙂

Some content on this page was disabled on June 17, 2020 as a result of a DMCA takedown notice from Michael A. Gottlieb, Rudolf Pfeiffer, and The California Institute of Technology. You can learn more about the DMCA here:

# Complex Fourier analysis: an introduction

Pre-script (dated 26 June 2020): This post has become less relevant (almost irrelevant, I would say) because my views on the nature of the concept of uncertainty in the context of quantum mechanics have evolved significantly as a result of my progression towards a more complete realist (classical) interpretation of quantum physics. Hence, I recommend you read my recent papers. I keep blog posts like these to see where I came from. I might review them one day, but I currently don’t have the time or energy for it. It is still interesting, though—in particular because I start by pointing out yet another error or myth in quantum mechanics that gets repeated all too often.

Original post:

One of the most confusing sentences you’ll read in an introduction to quantum mechanics – not only in those simple (math-free) popular books but also in Feynman’s Lecture introducing the topic – is that we cannot define a unique wavelength for a short wave train. In Feynman’s words: “Such a wave train does not have a definite wavelength; there is an indefiniteness in the wave number that is related to the finite length of the train, and thus there is an indefiniteness in the momentum.” (Feynman’s Lectures, Vol. I, Ch. 38, section 1).

That is not only confusing but, in some way, actually wrong. In fact, this is an oft-occurring statement which has effectively hampered my own understanding of quantum mechanics for a long time, and it was only when I had a closer look at what a Fourier analysis really is that I understood what Feynman, and others, wanted to say. In short, it’s a classic example of where a ‘simple’ account of things can lead you astray.

Indeed, we can all imagine a short wave train with a very definite frequency. Just take any sinusoidal function and multiply it with a so-called envelope function in order to shape it into a short pulse. Transients have that shape, and I gave an example in previous posts. Another example is given below. I copied it from the Wikipedia article on Fourier analysis: f(t) is a product of two factors:

1. The first factor in the product is a cosine function: cos[2π(3t)] to be precise.
2. The second factor is an exponential function: exp(–πt2).

The frequency of this ‘product function’ is quite precise: cos[2π(3t)] = cos[6πt] = cos[6π(t + 1/3)] for all values t, and so its period is equal to 1/3. [If f(x) is a function with period P, then f(ax+b), where a is a positive constant, is periodic with period P/a.] The only thing that the second factor, i.e. exp(–πt2), does is to shape this cosine function into a nice wave train, as it quickly tends to zero on both sides of the t = 0 point. So that second function is a nice simple bell curve (just plot the graph with a graph plotter) and it doesn’t change the period (or frequency) of the product. In short, the oscillation below–which we should imagine as the representation of ‘something’ traveling through space–has a very definite frequency. So what’s Feynman saying above? There’s no Δf or Δλ here, is there?
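You can check the two claims in that paragraph with a few lines of plain Python:

```python
import math

def f(t):
    return math.cos(2 * math.pi * 3 * t) * math.exp(-math.pi * t ** 2)

# The carrier cos(2π·3t) repeats every 1/3 of a time unit:
carrier_period_ok = all(
    abs(math.cos(6 * math.pi * (t + 1 / 3)) - math.cos(6 * math.pi * t)) < 1e-9
    for t in (0.0, 0.1, 0.27, -0.4))
print(carrier_period_ok)  # True: the period is 1/3, i.e. the frequency is 3

# The envelope exp(−πt²) is not periodic, though: one carrier period
# later, the phase is back where it was, but the amplitude has decayed.
ratio = f(1 / 3) / f(0.0)
print(ratio)  # exp(−π/9) ≈ 0.705, not 1
```

So the carrier oscillates at exactly 3 cycles per time unit, while the envelope quietly kills the wave off on both sides, which is the apparent paradox Feynman’s remark is about.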

The point to note is that these Δ concepts – Δf, Δλ, and so on – actually have very precise mathematical definitions, as one would expect in physics: they usually refer to the standard deviation of the distribution of a variable around the mean.

[…] OK, you’ll say. So what?

Well… That f(t) function above can – and, more importantly, should – be written as the sum of a potentially infinite number of waves in order to make sense of the Δf and Δλ factors in those uncertainty relations. Each of these component waves has a very specific frequency indeed, and each one of them makes its own contribution to the resultant wave. Hence, there is a distribution function for these frequencies, and so that is what Δf refers to. In other words, unlike what you’d think when taking a quick look at that graph above, Δf is not zero. So what is it then?
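We can actually compute that distribution. The sketch below evaluates the magnitude of the Fourier transform of our f(t) at a few frequencies by brute-force numerical integration (a simple Riemann sum): the spectrum peaks at ν = 3, but neighboring frequencies contribute too, so Δf is indeed not zero.

```python
import math

def f(t):
    return math.cos(2 * math.pi * 3 * t) * math.exp(-math.pi * t ** 2)

def F_mag(nu, dt=0.001):
    """|∫ f(t)·e^(−2πiνt) dt|, approximated by a Riemann sum.
    The pulse is negligible outside |t| < 5, so we integrate there."""
    re = im = 0.0
    for n in range(-5000, 5000):
        t = n * dt
        re += f(t) * math.cos(2 * math.pi * nu * t) * dt
        im -= f(t) * math.sin(2 * math.pi * nu * t) * dt
    return math.hypot(re, im)

mags = {nu: F_mag(nu) for nu in (2.0, 2.5, 3.0, 3.5, 4.0)}
print(mags)  # peaks at ν = 3, but the neighbors contribute too: Δf > 0
```

In fact, for this particular pulse the spectrum is itself a Gaussian bump centered on ν = 3, and the standard deviation of that bump is precisely the Δf we were looking for.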

Well… It’s tempting to get lost in the math of it all now but I don’t want this blog to be technical. The basic ideas, however, are the following. We have a real-valued function here, f(t), which is defined from –∞ to +∞, i.e. over its so-called time domain. Hence, t ranges from –∞ to +∞ (the definition of the zero point is a matter of convention only, and we can easily change the origin by adding or subtracting some constant). [Of course, we could – and, in fact, we should – also define it over a spatial domain, but we’ll keep the analysis simple by leaving out the spatial variable (x).]

Now, the so-called Fourier transform of this function will map it to its so-called frequency domain. The animation below (for which the credit must, once again, go to Wikipedia, from which I borrow most of the material here) clearly illustrates the idea. I’ll just copy the description from the same article: “In the first frames of the animation, a function f is resolved into Fourier series: a linear combination of sines and cosines (in blue). The component frequencies of these sines and cosines, spread across the frequency spectrum, are represented as peaks in the frequency domain, as shown in the last frames of the animation. The frequency domain representation of the function, f̂, is the collection of these peaks at the frequencies that appear in this resolution of the function.”

[…] OK. You sort of get this (I hope). Now we should go a couple of steps further. In quantum mechanics, we’re talking not real-valued waves but complex-valued waves adding up to give us the resultant wave. Also, unlike what’s shown above, we’ll have a continuous distribution of frequencies. Hence, we’ll not have just six discrete values for the frequencies (and, hence, just six component waves), but an infinite number of them. So how does that work? Well… To do the Fourier analysis, we need to calculate the value of the following integral for each possible frequency, which I’ll denote with the Greek letter nu (ν), as we’ve used the f symbol already–not for the frequency but to denote the function itself! Let me just jot down that integral:

f̂(ν) = ∫ f(t)·e^(–2πiνt) dt, with the integral taken over the whole time domain, i.e. from –∞ to +∞

Huh? Don’t be scared now. Just try to understand what it actually represents. So just relax and take a long hard look at it. Note, first, that the integrand (i.e. the function that is to be integrated, between the integral sign and the dt, so that’s f(t)·e^(−2πiνt)) is a complex-valued function (that should be very obvious from the i in the exponent of e). Secondly, note that we need to do such an integral for each value of ν. So, for each possible value of ν, we have t ranging from –∞ to +∞ in that integral. Hmm… OK. So… How does that work? Well… The illustration below shows the real and imaginary part respectively of the integrand for ν = 3. [Just in case you still don’t get it: we fix ν here (ν = 3), and calculate the value of the real and imaginary part of the integrand for each possible value of t, so t ranges from –∞ to +∞ indeed.]

So what do we see here? The first thing you should note is that the value of both the real and imaginary part of the integrand quickly tends to zero on both sides of the t = 0 point. That’s because of the shape of f(t), which does exactly the same. However, in-between those ‘zero or close-to-zero values’, the integrand does take on very specific non-zero values. As for the real part of the integrand, which is denoted by Re[e^(−2πi(3t))·f(t)], we see it’s always positive, with a peak value equal to one at t = 0. Indeed, the real part of the integrand is always positive because f(t) and the real part of e^(−2πi(3t)) oscillate at the same rate. Hence, when f(t) is positive, so is the real part of e^(−2πi(3t)), and when f(t) is negative, so is the real part of e^(−2πi(3t)). However, the story is obviously different for the imaginary part of the integrand, denoted by Im[e^(−2πi(3t))·f(t)]. That’s because, in general, e^(iθ) = cos θ + i·sin θ, and the sine and cosine function are essentially the same functions except for a phase difference of π/2 (remember: sin(θ + π/2) = cos θ).

Capito? No? Hmm… Well… Try to read what I am writing above once again. Else, just give up. 🙂

I know this is getting complicated but let me try to summarize what’s going on here. The bottom line is that the integral above will yield a positive real number, 0.5 to be precise (as noted in the margin of the illustration), for the real part of the integrand, but it will give you a zero value for its imaginary part (also as noted in the margin of the illustration). [As for the math involved in calculating an integral of a complex-valued function (with a real-valued argument), just note that we should indeed just separate the real and imaginary parts and integrate separately. However, I don’t want you to get lost in the math so don’t worry about it too much. Just try to stick to the main story line here.]

In short, what we have here is a very significant contribution (the associated density is 0.5) of the frequency ν = 3.

Indeed, let’s compare it to the contribution of the wave with frequency ν = 5. For ν = 5, we get, once again, a value of zero when integrating the imaginary part of the integrand above, because the positive and negative values cancel out. As for the real part, we’d think the positive and negative values would cancel out here too if we look at the graph below, but they don’t: the integral does yield, in fact, a very tiny positive value: 1.7×10^−6 (so we’re talking 1.7 millionths here). That means that the contribution of the component wave with frequency ν = 5 is close to nil but… Well… It’s not nil: we have some contribution here (i.e. some density, in other words).
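By the way, you don’t have to take these two numbers on faith: a few lines of Python reproduce them by brute-force numerical integration. I am assuming here that the wave train in the illustration is f(t) = cos(6πt)·e^(−πt²) – a 3 Hz oscillation under a Gaussian envelope – because that is the function which reproduces the quoted values:

```python
import numpy as np

# Sample wave train: a 3 Hz cosine under a Gaussian envelope.
# (Assumption: this is the f(t) behind the quoted 0.5 and 1.7e-6 values.)
t = np.linspace(-10, 10, 200_001)
dt = t[1] - t[0]
f = np.cos(6 * np.pi * t) * np.exp(-np.pi * t**2)

def fourier_at(nu):
    """Approximate the integral of f(t)*exp(-2*pi*i*nu*t) dt by a Riemann sum."""
    return np.sum(f * np.exp(-2j * np.pi * nu * t)) * dt

for nu in (3, 5):
    val = fourier_at(nu)
    print(f"nu = {nu}: real part = {val.real:.3g}, imaginary part = {val.imag:.1g}")
```

The real parts come out at about 0.5 and 1.7×10^−6 respectively, and the imaginary parts vanish – just as described above.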

You get the idea (I hope). We can, and actually should, calculate the value of that integral for each possible value of ν. In other words, we should evaluate it across the entire frequency domain, so that’s for ν ranging from –∞ to +∞. However, I won’t do that. 🙂 What I will do is just show you the grand general result (below), with the particular results (i.e. the values of 0.5 and 1.7×10^−6 for ν = 3 and ν = 5) as a green and red dot respectively. [Note that the graph below uses the ξ symbol instead of ν: I used ν because that’s a more familiar symbol, but so it doesn’t change the analysis.]

Now, if you’re still with me – probably not 🙂 – you’ll immediately wonder why there are two big bumps instead of just one, i.e. two peaks in the density function instead of just one. [You’re used to these Gauss curves, aren’t you?] And you’ll also wonder what negative frequencies actually are: the first bump is a density function for negative frequencies indeed, and… Well… Now that you think of it: why the hell would we do such an integral for negative values of ν? I won’t say too much about that: it’s a particularity which results from the fact that e^(2πiθ) and e^(−2πiθ) both complete a cycle per second (if θ is measured in seconds, that is) so… Well… Hmm… […] Yes. The fact of the matter is that we do have a mathematical equivalent of the bump for positive frequencies on the negative side of the frequency domain, so… Well… […] Don’t worry about it, I’d say. As mentioned above, we shouldn’t get lost in the math here. For our purpose here, which is just to illustrate what a complex Fourier transform actually is (rather than present all of the mathematical intricacies of it), we should just focus on the second bump of that density function, i.e. the density function for positive frequencies only. 🙂

So what? You’re probably tired by now, and wondering what I want to get at. Well… Nothing much. I’ve done what I wanted to do. I started with a real-valued wave train (think of a transient electric field working its way through space, for example), and I then showed how such a wave train can (and should) be analyzed as consisting of an infinite number of complex-valued component waves, which each make their own contribution to the combined wave (which consists of the sum of all component waves) and, hence, can be represented by a graph like the one above, i.e. a real-valued density function around some mean, usually denoted by μ, and with some standard deviation, usually denoted by σ. So now I hope that, when you think of Δf or Δλ in the context of a so-called ‘probability wave’ (i.e. a de Broglie wave), you’ll think of all this machinery behind it.

In other words, it is not just a matter of drawing a simple figure like the one below and saying: “You see: those oscillations represent three photons being emitted one after the other by an atomic oscillator. You can see that’s quite obvious, can’t you?”

No. It is not obvious. Why not? Because anyone who’s somewhat critical will immediately say: “But how does it work really? Those wave trains seem to have a pretty definite frequency (or wavelength), even if their amplitude dies out, and, hence, the Δf factor (or Δλ factor) in that uncertainty relation must be close to or, more probably, equal to zero. So that means we cannot say these particles are actually somewhere, because Δx must be close to or equal to infinity.”

Now you know that’s a very valid remark. Because now you understand that one actually has to go through the tedious exercise of doing that Fourier transform, and so now you understand what those Δ symbols actually represent. I hope you do because of this post, and despite the fact my approach has been very superficial and intuitive. In other words, I didn’t say what physicists would probably say, and that is: “Take a good math course before you study physics!” 🙂

# The Uncertainty Principle re-visited: Fourier transforms and conjugate variables

Pre-scriptum (dated 26 June 2020): This post did not suffer from the DMCA take-down of some material. It is, therefore, still quite readable—even if my views on the nature of the Uncertainty Principle have evolved quite a bit as part of my realist interpretation of QM.

Original post:

In previous posts, I presented a time-independent wave function for a particle (or wavicle as we should call it – but so that’s not the convention in physics) – let’s say an electron – traveling through space without any external forces (or force fields) acting upon it. So it’s just going in some random direction with some random velocity v and, hence, its momentum is p = mv. Let me be specific – so I’ll work with some numbers here – because I want to introduce some issues related to units for measurement.

So the momentum of this electron is the product of its mass m (about 9.1×10^−28 grams) with its velocity v (typically something in the range around 2,200 km/s, which is fast but not even close to the speed of light – and, hence, we don’t need to worry about relativistic effects on its mass here). Hence, the momentum p of this electron would be some 20×10^−25 kg·m/s. Huh? kg·m/s? Well… Yes, kg·m/s or N·s are the usual measures of momentum in classical mechanics: its dimension is [mass][length]/[time] indeed. However, you know that, in atomic physics, we don’t want to work with these enormous units (because we then always have to add these ×10^−28 and ×10^−25 factors and so that’s a bit of a nuisance indeed). So the momentum p will usually be measured in eV/c, with c representing what it usually represents, i.e. the speed of light. Huh? What’s this strange unit? Electronvolts divided by c? Well… We know that eV is an appropriate unit for measuring energy in atomic physics: we can express eV in Joule and vice versa: 1 eV = 1.6×10^−19 Joule, so that’s OK – except for the fact that this Joule is a monstrously large unit at the atomic scale indeed, and so that’s why we prefer the electronvolt. But the Joule is a shorthand unit for kg·m²/s², which is the measure for energy expressed in SI units, so there we are: while the SI dimension for energy is actually [mass][length]²/[time]², using electronvolts (eV) is fine. Now, just divide the SI dimension for energy, i.e. [mass][length]²/[time]², by the SI dimension for velocity, i.e. [length]/[time]: we get something expressed in [mass][length]/[time]. So that’s the SI dimension for momentum indeed! In other words, dividing some quantity expressed in some measure for energy (be it Joules or electronvolts or erg or calories or coulomb-volts or BTUs or whatever – there’s quite a lot of ways to measure energy indeed!) by the speed of light (c) will result in some quantity with the right dimensions indeed. So don’t worry about it.
Now, 1 eV/c is equivalent to 5.344×10^−28 kg·m/s, so the momentum of this electron will be about 3,750 eV/c, i.e. some 3.75 keV/c.
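You can let Python do the unit gymnastics for you. A quick sketch with the (rounded) numbers used above:

```python
m_e = 9.109e-31    # electron mass in kg
v = 2.2e6          # velocity in m/s (2,200 km/s)
c = 2.998e8        # speed of light in m/s
eV = 1.602e-19     # 1 eV expressed in Joule

p_si = m_e * v                 # momentum in kg*m/s: ~2.0e-24
eV_per_c = eV / c              # 1 eV/c expressed in kg*m/s: ~5.344e-28
p_in_eV_c = p_si / eV_per_c    # the same momentum expressed in eV/c
print(p_si, eV_per_c, p_in_eV_c)
```

This puts the momentum at roughly 3.75×10³ eV/c, i.e. 3.75 keV/c.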

Let’s go back to the main story now. Just note that the momentum of this electron that we are looking at is a very tiny amount – as we would expect of course.

Time-independent means that we keep the time variable (t) in the wave function Ψ(x, t) fixed and so we only look at how Ψ(x, t) varies in space, with x as the (real) space variable representing position. So we have a simplified wave function Ψ(x) here: we can always put the time variable back in when we’re finished with the analysis. By now, it should also be clear that we should distinguish between real-valued wave functions and complex-valued wave functions. Real-valued wave functions represent what Feynman calls “real waves”, like a sound wave, or an oscillating electromagnetic field. Complex-valued wave functions describe probability amplitudes. They are… Well… Feynman actually stops short of saying that they are not real. So what are they?

They are, first and foremost, complex numbers, so they have a real and a so-called imaginary part (z = a + ib or, if we use polar coordinates, z = r·e^(iθ) = r(cos θ + i·sin θ)). Now, you may think – and you’re probably right to some extent – that the distinction between ‘real’ waves and ‘complex’ waves is, perhaps, less of a dichotomy than popular writers – like me 🙂 – suggest. When describing electromagnetic waves, for example, we need to keep track of both the electric field vector E as well as the magnetic field vector B (both are obviously related through Maxwell’s equations). So we have two components as well, so to say, and each of these components has three dimensions in space, and we’ll use the same mathematical tools to describe them (so we will also represent them using complex numbers). That being said, these probability amplitudes, usually denoted by Ψ(x), describe something very different. What exactly? Well… By now, it should be clear that that is actually hard to explain: the best thing we can do is to work with them, so they start feeling familiar. The main thing to remember is that we need to square their modulus (or magnitude or absolute value if you find these terms more comprehensible) to get a probability (P). For example, the expression below gives the probability of finding a particle – our electron, for example – in the (space) interval [a, b]:

$P[a \leq x \leq b] = \int_a^b |\Psi(x)|^2\,dx$

Of course, we should not be talking intervals but three-dimensional regions in space. However, we’ll keep it simple: just remember that the analysis should be extended to three (space) dimensions (and, of course, include the time dimension as well) when we’re finished (to do that, we’d use so-called four-vectors – another wonderful mathematical invention).

Now, we also used a simple functional form for this wave function, as an example: Ψ(x) could be proportional, we said, to some idealized function e^(ikx). So we can write: Ψ(x) ∝ e^(ikx) (∝ is the standard symbol expressing proportionality). In this function, we have a wave number k, which is like the frequency in space of the wave (measured in radians per unit distance, because the phase of the wave function has to be expressed in radians). In fact, we actually wrote Ψ(x, t) = (1/x)·e^(i(kx − ωt)) (so the magnitude of this amplitude decreases with distance) but, again, let’s keep it simple for the moment: even with this very simple function e^(ikx), things will become complex enough.

We also introduced the de Broglie relation, which gives this wave number k as a function of the momentum p of the particle: k = p/ħ, with ħ the (reduced) Planck constant, i.e. a very tiny number in the neighborhood of 6.582×10^−16 eV·s. So, using the numbers above, we’d have a value for k equal to 3.75 keV/c divided by 6.582×10^−16 eV·s. So that’s 0.57×10^19 (radians) per… Hey, how do we do it with the units here? We get an incredibly huge number here (57 with 17 zeroes after it) per second? We should get some number per meter because k is expressed in radians per unit distance, right? Right. We forgot c. We are actually measuring distance here, but in light-seconds instead of meters: k is 0.57×10^19 rad per light-second. Indeed, a light-second is the distance traveled by light in one second, so if we want k expressed in radians per meter, we need to divide this huge number 0.57×10^19 (in rad per light-second) by 2.998×10^8 (the number of meters in a light-second), and then we get a much more reasonable value for k, with the right dimension too: to be precise, k is about 1.9×10^10 rad/m in this case. That’s still huge: it corresponds with a wavelength of 0.33 nanometer (1 nm = 10^−9 m), but that’s the correct order of magnitude indeed.

[In case you wonder what formula I am using to calculate the wavelength: it’s λ = 2π/k. Note that our electron’s wavelength is more than a thousand times shorter than the wavelength of (visible) light (we humans can see light with wavelengths ranging from 380 to 750 nm) but so that’s what gives the electron its particle-like character! If we would increase their velocity (e.g. by accelerating them in an accelerator, using electromagnetic fields to propel them to speeds closer to the speed of light – and also to contain them in a beam), then we get hard beta rays. Hard beta rays are surely not as harmful as high-energy electromagnetic rays. X-rays and gamma rays consist of photons with wavelengths ranging from 1 to 100 picometer (1 pm = 10^−12 m) – so that’s another factor of a thousand down – and thick lead shields are needed to stop them: they are the cause of cancer (Marie Curie’s cause of death), and the hard radiation of a nuclear blast will always end up killing more people than the immediate blast effect. In contrast, hard beta rays will cause skin damage (radiation burns) but they won’t go deeper than that.]
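By the way, you can redo that whole calculation directly in SI units, which avoids the light-second detour altogether. A quick sketch (constants rounded):

```python
import math

m_e = 9.109e-31     # electron mass in kg
v = 2.2e6           # velocity in m/s (2,200 km/s)
hbar = 1.055e-34    # reduced Planck constant in J*s

k = m_e * v / hbar          # de Broglie relation: k = p/hbar, in rad/m
lam = 2 * math.pi / k       # wavelength: lambda = 2*pi/k, in m
print(k, lam)
```

This yields k ≈ 1.9×10^10 rad/m and λ ≈ 3.3×10^−10 m, i.e. 0.33 nm.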

Let’s get back to our wave function Ψ(x) ∝ e^(ikx). When we introduced it in our previous posts, we said it could not accurately describe a particle because this wave function (Ψ(x) = A·e^(ikx)) is associated with probabilities |Ψ(x)|² that are the same everywhere. Indeed, |Ψ(x)|² = |A·e^(ikx)|² = A². Apart from the fact that these probabilities would add up to infinity (so this mathematical shape is unacceptable anyway), it also implies that we cannot locate our electron somewhere in space. It’s everywhere and that’s the same as saying it’s actually nowhere. So, while we can use this wave function to explain and illustrate a lot of stuff (first and foremost the de Broglie relations), we actually need something different if we would want to describe anything real (which, in the end, is what physicists want to do, right?). We already said in our previous posts: real particles will actually be represented by a wave packet, or a wave train. A wave train can be analyzed as a composite wave consisting of a (potentially infinite) number of component waves. So we write:

$\Psi(x) = \sum_{n} A_n e^{i k_n x}$

Note that we do not have one unique wave number k or – what amounts to saying the same – one unique value p for the momentum: we have n values. So we’re introducing a spread in the wavelength here, as illustrated below:
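This is easy to play with numerically: superpose a few hundred component waves e^(i·kₙ·x) whose amplitudes Aₙ cluster around some central wave number (all the numbers below are made up, just for illustration), and a localized packet appears:

```python
import numpy as np

x = np.linspace(-60, 60, 2401)
k0, sigma_k = 2.0, 0.2                        # central wave number and spread (arbitrary)
ks = np.linspace(k0 - 1.0, k0 + 1.0, 201)     # the n component wave numbers k_n
amps = np.exp(-(ks - k0)**2 / (2 * sigma_k**2))   # Gaussian weights A_n

# Psi(x) = sum over n of A_n * exp(i * k_n * x): the composite wave
psi = (amps[:, None] * np.exp(1j * np.outer(ks, x))).sum(axis=0)
prob = np.abs(psi)**2

print(x[np.argmax(prob)])         # the packet peaks at x = 0 (all components in phase there)...
print(prob[0] / prob.max())       # ...and is negligible far away from the peak
```

At x = 0 all the component waves are in phase and add up; away from it they increasingly cancel each other out – that cancellation is what localizes the particle.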

In fact, the illustration above talks of a continuous distribution of wavelengths and so let’s take the continuum limit of the function above indeed and write what we should be writing:

$\Psi(x) = \int_{-\infty}^{+\infty} \Phi(p)\, e^{i p x/\hbar}\, dp$

Now that is an interesting formula. [Note that I didn’t care about normalization issues here, so it’s not quite what you’d see in a more rigorous treatment of the matter. I’ll correct that in the Post Scriptum.] Indeed, it shows how we can get the wave function Ψ(x) from some other function Φ(p). We actually encountered that function already, and we referred to it as the wave function in the momentum space. Indeed, Nature does not care much what we measure: whether it’s position (x) or momentum (p), Nature will not share her secrets with us and, hence, the best we can do – according to quantum mechanics – is to find some wave function associating some (complex) probability amplitude with each and every possible (real) value of x or p. What the equation above shows, then, is that these wave functions come as a pair: if we have Φ(p), then we can calculate Ψ(x) – and vice versa. Indeed, the particular relation between Ψ(x) and Φ(p) as established above makes Ψ(x) and Φ(p) a so-called Fourier transform pair, as we can transform Φ(p) into Ψ(x) using the above Fourier transform (that’s how that integral is called), and vice versa. More in general, a Fourier transform pair can be written as:

$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} g(y)\, e^{i x y}\, dy \quad \text{and} \quad g(y) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{+\infty} f(x)\, e^{-i x y}\, dx$

Instead of x and p, and Ψ(x) and Φ(p), we have x and y, and f(x) and g(y), in the formulas above, but so that does not make much of a difference when it comes to the interpretation: x and p (or x and y in the formulas above) are said to be conjugate variables. What it means, really, is that they are not independent. There are quite a few such conjugate variables in quantum mechanics such as, for example: (1) time and energy (and time and frequency, of course, in light of the de Broglie relation between both), and (2) angular momentum and angular position (or orientation). There are other pairs too but these involve quantum-mechanical variables which I do not understand as yet and, hence, I won’t mention them here. [To be complete, I should also say something about that 1/2π factor, but so that’s just something that pops up when deriving the Fourier transform from the (discrete) Fourier series on which it is based. We can put it in front of either integral, or split that factor across both. Also note the minus sign in the exponent of the inverse transform.]
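If you want to see a Fourier transform pair in action, here is a numerical round trip, using the symmetric convention (the 1/2π factor split as 1/√(2π) across both integrals, with the minus sign in the exponent of the inverse transform): transforming a test function forward and then back recovers it.

```python
import numpy as np

x = np.linspace(-10, 10, 1501)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2)           # a test function (a simple Gaussian)

y = x.copy()                    # same grid for the conjugate variable
dy = dx

# Forward: g(y) = 1/sqrt(2*pi) * integral of f(x) exp(+i*x*y) dx
g = np.exp(1j * np.outer(y, x)) @ f * dx / np.sqrt(2 * np.pi)
# Inverse: f(x) = 1/sqrt(2*pi) * integral of g(y) exp(-i*x*y) dy
f_back = np.exp(-1j * np.outer(x, y)) @ g * dy / np.sqrt(2 * np.pi)

print(np.max(np.abs(f_back - f)))   # close to 0: the round trip recovers f
```

[Which of the two exponents carries the minus sign is a matter of convention; the round trip works either way, as long as the two signs differ.]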

When you look at the equations above, you may think that f(x) and g(y) must be real-valued functions. Well… No. The Fourier transform can be used for both real-valued as well as complex-valued functions. However, at this point I’ll have to refer those who want to know each and every detail about these Fourier transforms to a course in complex analysis (such as Brown and Churchill’s Complex Variables and Applications (2004) for instance) or, else, to a proper course on real and complex Fourier transforms (they are used in signal processing – a very popular topic in engineering – and so there’s quite a few of those courses around).

The point to note in this post is that we can derive the Uncertainty Principle from the equations above. Indeed, the (complex-valued) functions Ψ(x) and Φ(p) describe (probability) amplitudes, but the (real-valued) functions |Ψ(x)|² and |Φ(p)|² describe probabilities or – to be fully correct – they are probability (density) functions. So it is pretty obvious that, if the functions Ψ(x) and Φ(p) are a Fourier transform pair, then |Ψ(x)|² and |Φ(p)|² must be related too. They are. The derivation is a bit lengthy (and, hence, I will not copy it from the Wikipedia article on the Uncertainty Principle) but one can indeed derive the so-called Kennard formulation of the Uncertainty Principle from the above Fourier transforms. This Kennard formulation does not use these rather vague Δx and Δp symbols but clearly states that the product of the standard deviations of these two probability density functions can never be smaller than ħ/2:

σ_x σ_p ≥ ħ/2

To be sure: ħ/2 is a rather tiny value, as you should know by now, 🙂 but, so, well… There it is.

As said, it’s a bit lengthy but not that difficult to do that derivation. However, just for once, I think I should try to keep my post somewhat shorter than usual so, to conclude, I’ll just insert one more illustration here (yes, you’ve seen that one before), which should now be very easy to understand: if the wave function Ψ(x) is such that there’s relatively little uncertainty about the position x of our electron, then the uncertainty about its momentum will be huge (see the top graphs). Vice versa (see the bottom graphs), precise information (or a narrow range) on its momentum, implies that its position cannot be known.
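You can also verify the Kennard limit numerically for the one case where the product σ_x σ_p actually attains the ħ/2 bound: a Gaussian wave function. A sketch in natural units (ħ set to 1 just to keep the numbers simple – an assumption, not SI):

```python
import numpy as np

hbar = 1.0
a = 1.0                                       # width parameter of the Gaussian
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
psi = (2 * np.pi * a**2) ** -0.25 * np.exp(-x**2 / (4 * a**2))

# Momentum-space wave function:
# Phi(p) = 1/sqrt(2*pi*hbar) * integral of Psi(x) exp(-i*p*x/hbar) dx
p = np.linspace(-6, 6, 1201)
phi = np.exp(-1j * np.outer(p, x) / hbar) @ psi * dx / np.sqrt(2 * np.pi * hbar)

def std_dev(grid, density):
    """Standard deviation of a probability density sampled on a uniform grid."""
    d = grid[1] - grid[0]
    density = density / (density.sum() * d)        # normalize to total probability 1
    mean = (grid * density).sum() * d
    return np.sqrt(((grid - mean)**2 * density).sum() * d)

sigma_x = std_dev(x, np.abs(psi)**2)
sigma_p = std_dev(p, np.abs(phi)**2)
print(sigma_x, sigma_p, sigma_x * sigma_p)    # ~1.0, ~0.5, ~0.5 = hbar/2
```

A narrower Gaussian in x (smaller a) would give a larger σ_p and vice versa, but the product stays pinned at ħ/2 – any non-Gaussian shape pushes it above that bound.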

Does all this math make it any easier to understand what’s going on? Well… Yes and no, I guess. But then, if even Feynman admits that he himself “does not understand it the way he would like to” (Feynman Lectures, Vol. III, 1-1), who am I? In fact, I should probably not even try to explain it, should I? 🙂

So the best we can do is try to familiarize ourselves with the language used, and so that’s math for all practical purposes. And, then, when everything is said and done, we should probably just contemplate Mario Livio’s question: Is God a mathematician? 🙂

Post scriptum:

I obviously cut corners above, and so you may wonder how that ħ factor can be related to σ_x and σ_p if it doesn’t appear in the wave functions. Truth be told, it does. Because of (i) the presence of ħ in the exponent in our e^(i(p/ħ)x) function, (ii) normalization issues (remember that probabilities (i.e. |Ψ(x)|² and |Φ(p)|²) have to add up to 1) and, last but not least, (iii) the 1/2π factor involved in Fourier transforms, Ψ(x) and Φ(p) have to be written as follows:

$\Psi(x, t) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} \Phi(p)\, e^{i(px - Et)/\hbar}\, dp \quad \text{and} \quad \Phi(p) = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{+\infty} \Psi(x, t)\, e^{-i(px - Et)/\hbar}\, dx$

Note that we’ve also re-inserted the time variable here, so it’s pretty complete now. One more thing we could do is to replace x with a proper three-dimensional space vector or, better still, introduce four-vectors, which would allow us to also integrate relativistic effects (most notably the slowing of time with motion – as observed from the stationary reference frame) – which become important when, for instance, we’re looking at electrons being accelerated, which is the rule, rather than the exception, in experiments.

Remember (from a previous post) that we calculated that an electron traveling at its usual speed in orbit (2200 km/s, i.e. less than 1% of the speed of light) had an energy of about 70 eV? Well, the Large Electron-Positron Collider (LEP) did accelerate them to speeds close to light, thereby giving them energy levels topping 104.5 billion eV (or 104.5 GeV as it’s written) so they could hit each other with collision energies topping 209 GeV (they come from opposite directions so it’s two times 104.5 GeV). Now, 209 GeV is tiny when converted to everyday energy units: 209 GeV is 33×10^−9 Joule only indeed – and so note the minus sign in the exponent here: we’re talking billionths of a Joule here. Just to put things into perspective: 1 Watt is the energy consumption of an LED (and 1 Watt is 1 Joule per second), so you’d need to combine the energy of billions of these fast-traveling electrons to power just one little LED lamp. But, of course, that’s not the right comparison: 104.5 GeV is more than 200,000 times the electron’s rest mass (0.511 MeV), so that means that – in practical terms – their mass (remember that mass is a measure for inertia) increased by the same factor (204,500 times to be precise). Just to give an idea of the effort that was needed to do this: CERN’s LEP collider was housed in a tunnel with a circumference of 27 km. Was? Yes. The tunnel is still there but it now houses the Large Hadron Collider (LHC) which, as you surely know, is the world’s largest and most powerful particle accelerator: its experiments confirmed the existence of the Higgs particle in 2013, thereby confirming the so-called Standard Model of particle physics. [But I’ll say a few things about that in my next post.]
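If you want to double-check the arithmetic in that paragraph, here it is in a few lines of Python:

```python
eV = 1.602e-19          # 1 eV expressed in Joule
E_beam = 104.5e9        # LEP beam energy: 104.5 GeV, in eV

collision_J = 2 * E_beam * eV      # two beams head-on: the collision energy in Joule
mass_factor = E_beam / 0.511e6     # ratio to the electron's rest energy of 0.511 MeV
print(collision_J, mass_factor)    # ~3.35e-08 J and ~204,500 respectively
```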

Oh… And, finally, in case you’d wonder where we get the inequality sign in σ_x σ_p ≥ ħ/2, that’s because – at some point in the derivation – one has to use the Cauchy-Schwarz inequality, which is closely related to the triangle inequality: |z₁ + z₂| ≤ |z₁| + |z₂|. In fact, to be fully complete, the derivation uses the more general formulation of the Cauchy-Schwarz inequality, which also applies to functions as we interpret them as vectors in a function space. But I would end up copying the whole derivation here if I add any more to this – and I said I wouldn’t do that. 🙂 […]