Wavefunctions and the twin paradox

My previous post was awfully long, so I must assume many of my readers may have started to read it, but… Well… Gave up halfway or even sooner. 🙂 I added a footnote, though, which is interesting to reflect upon. Also, I know many of my readers aren’t interested in the math—even if they understand one cannot really appreciate quantum theory without the math. But… Yes. I may have left some readers behind. Let me, therefore, pick up the most interesting bit of all of the stories in my last posts in as easy a language as I can find.

We have that weird 360/720° symmetry in quantum physics or—to be precise—we have it for elementary matter-particles (think of electrons, for example). In order to, hopefully, help you understand what it’s all about, I had to explain the often-confused but substantially different concepts of a reference frame and a representational base (or representation tout court). I won’t repeat that explanation, but think of the following.

If we just rotate the reference frame over 360°, we’re just using the same reference frame and so we see the same thing: some object which we, vaguely, describe by some ei·θ function. Think of some spinning object. In its own reference frame, it will just spin around some center or, in ours, it will spin while moving along some axis in its own reference frame or, seen from ours, as moving in some direction while it’s spinning—as illustrated below.

To be precise, I should say that we describe it by some Fourier sum of such functions. Now, if its spin direction is… Well… In the other direction, then we’ll describe it by by some ei·θ function (again, you should read: a Fourier sum of such functions). Now, the weird thing is is the following: if we rotate the object itself, over the same 360°, we get a different object: our ei·θ and ei·θ function (again: think of a Fourier sum, so that’s a wave packet, really) becomes a −e±i·θ thing. We get a minus sign in front of it. So what happened here? What’s the difference, really?

Well… I don’t know. It’s very deep. Think of you and me as two electrons who are watching each other. If I do nothing, and you keep watching me while turning around me, for a full 360° (so that’s a rotation of your reference frame over 360°), then you’ll end up where you were when you started and, importantly, you’ll see the same thing: me. 🙂 I mean… You’ll see exactly the same thing: if I was an e+i·θ wave packet, I am still an an e+i·θ wave packet now. Or if I was an ei·θ wave packet, then I am still an an ei·θ wave packet now. Easy. Logical. Obvious, right?

But so now we try something different: turn around, over a full 360° turn, and you stay where you are and watch me while I am turning around. What happens? Classically, nothing should happen but… Well… This is the weird world of quantum mechanics: when I am back where I was—looking at you again, so to speak—then… Well… I am not quite the same any more. Or… Well… Perhaps I am but you see me differently. If I was e+i·θ wave packet, then I’ve become a −e+i·θ wave packet now.

Not hugely different but… Well… That minus sign matters, right? Or If I was wave packet built up from elementary a·ei·θ waves, then I’ve become a −ei·θ wave packet now. What happened?

It makes me think of the twin paradox in special relativity. We know it’s a paradox—so that’s an apparent contradiction only: we know which twin stayed on Earth and which one traveled because of the gravitational forces on the traveling twin. The one who stays on Earth does not experience any acceleration or deceleration. Is it the same here? I mean… The one who’s turning around must experience some force.

Can we relate this to the twin paradox? Maybe. Note that a minus sign in front of the e−±i·θ functions amounts a minus sign in front of both the sine and cosine components. So… Well… The negative of a sine and cosine is the sine and cosine but with a phase shift of 180°: −cosθ = cos(θ ± π) and −sinθ = sin(θ ± π). Now, adding or subtracting a common phase factor to/from the argument of the wavefunction amounts to changing the origin of time. So… Well… I do think the twin paradox and this rather weird business of 360° and 720° symmetries are, effectively, related. 🙂

Post scriptumGoogle honors Max Born’s 135th birthday today. 🙂 I think that’s a great coincidence in light of the stuff I’ve been writing about lately (possible interpretations of the wavefunction). 🙂

Quantum Mechanics: The Other Introduction

About three weeks ago, I brought my most substantial posts together in one document: it’s the Deep Blue page of this site. I also published it on Amazon/Kindle. It’s nice. It crowns many years of self-study, and many nights of short and bad sleep – as I was mulling over yet another paradox haunting me in my dreams. It’s been an extraordinary climb but, frankly, the view from the top is magnificent. 🙂 

The offer is there: anyone who is willing to go through it and offer constructive and/or substantial comments will be included in the book’s acknowledgements section when I go for a second edition (which it needs, I think). First person to be acknowledged here is my wife though, Maria Elena Barron, as she has given me the spacetime:-) and, more importantly, the freedom to take this bull by its horns.

Below I just copy the foreword, just to give you a taste of it. 🙂


Another introduction to quantum mechanics? Yep. I am not hoping to sell many copies, but I do hope my unusual background—I graduated as an economist, not as a physicist—will encourage you to take on the challenge and grind through this.

I’ve always wanted to thoroughly understand, rather than just vaguely know, those quintessential equations: the Lorentz transformations, the wavefunction and, above all, Schrödinger’s wave equation. In my bookcase, I’ve always had what is probably the most famous physics course in the history of physics: Richard Feynman’s Lectures on Physics, which have been used for decades, not only at Caltech but at many of the best universities in the world. Plus a few dozen other books. Popular books—which I now regret I ever read, because they were an utter waste of time: the language of physics is math and, hence, one should read physics in math—not in any other language.

But Feynman’s Lectures on Physics—three volumes of about fifty chapters each—are not easy to read. However, the experimental verification of the existence of the Higgs particle in CERN’s LHC accelerator a couple of years ago, and the award of the Nobel prize to the scientists who had predicted its existence (including Peter Higgs and François Englert), convinced me it was about time I take the bull by its horns. While, I consider myself to be of average intelligence only, I do feel there’s value in the ideal of the ‘Renaissance man’ and, hence, I think stuff like this is something we all should try to understand—somehow. So I started to read, and I also started a blog (www.readingfeynman.org) to externalize my frustration as I tried to cope with the difficulties involved. The site attracted hundreds of visitors every week and, hence, it encouraged me to publish this booklet.

So what is it about? What makes it special? In essence, it is a common-sense introduction to the key concepts in quantum physics. However, while common-sense, it does not shy away from the math, which is complicated, but not impossible. So this little book is surely not a Guide to the Universe for Dummies. I do hope it will guide some Not-So-Dummies. It basically recycles what I consider to be my more interesting posts, but combines them in a comprehensive structure.

It is a bit of a philosophical analysis of quantum mechanics as well, as I will – hopefully – do a better job than others in distinguishing the mathematical concepts from what they are supposed to describe, i.e. physical reality.

Last but not least, it does offer some new didactic perspectives. For those who know the subject already, let me briefly point these out:

I. Few, if any, of the popular writers seems to have noted that the argument of the wavefunction (θ = E·t – p·t) – using natural units (hence, the numerical value of ħ and c is one), and for an object moving at constant velocity (hence, x = v·t) – can be written as the product of the proper time of the object and its rest mass:

θ = E·t – p·x = E·t − p·x = mv·t − mv·v·x = mv·(t − v·x)

⇔ θ = m0·(t − v·x)/√(1 – v2) = m0·t’

Hence, the argument of the wavefunction is just the proper time of the object with the rest mass acting as a scaling factor for the time: the internal clock of the object ticks much faster if it’s heavier. This symmetry between the argument of the wavefunction of the object as measured in its own (inertial) reference frame, and its argument as measured by us, in our own reference frame, is remarkable, and allows to understand the nature of the wavefunction in a more intuitive way.

While this approach reflects Feynman’s idea of the photon stopwatch, the presentation in this booklet generalizes the concept for all wavefunctions, first and foremost the wavefunction of the matter-particles that we’re used to (e.g. electrons).

II. Few, if any, have thought of looking at Schrödinger’s wave equation as an energy propagation mechanism. In fact, when helping my daughter out as she was trying to understand non-linear regression (logit and Poisson regressions), it suddenly realized we can analyze the wavefunction as a link function that connects two physical spaces: the physical space of our moving object, and a physical energy space.

Re-inserting Planck’s quantum of action in the argument of the wavefunction – so we write θ as θ = (E/ħ)·t – (p/ħ)·x = [E·t – p·x]/ħ – we may assign a physical dimension to it: when interpreting ħ as a scaling factor only (and, hence, when we only consider its numerical value, not its physical dimension), θ becomes a quantity expressed in newton·meter·second, i.e. the (physical) dimension of action. It is only natural, then, that we would associate the real and imaginary part of the wavefunction with some physical dimension too, and a dimensional analysis of Schrödinger’s equation tells us this dimension must be energy.

This perspective allows us to look at the wavefunction as an energy propagation mechanism, with the real and imaginary part of the probability amplitude interacting in very much the same way as the electric and magnetic field vectors E and B. This leads me to the next point, which I make rather emphatically in this booklet:  the propagation mechanism for electromagnetic energy – as described by Maxwell’s equations – is mathematically equivalent to the propagation mechanism that’s implicit in the Schrödinger equation.

I am, therefore, able to present the Schrödinger equation in a much more coherent way, describing not only how this famous equation works for electrons, or matter-particles in general (i.e. fermions or spin-1/2 particles), which is probably the only use of the Schrödinger equation you are familiar with, but also how it works for bosons, including the photon, of course, but also the theoretical zero-spin boson!

In fact, I am personally rather proud of this. Not because I am doing something that hasn’t been done before (I am sure many have come to the same conclusions before me), but because one always has to trust one’s intuition. So let me say something about that third innovation: the photon wavefunction.

III. Let me tell you the little story behind my photon wavefunction. One of my acquaintances is a retired nuclear scientist. While he knew I was delving into it all, I knew he had little time to answer any of my queries. However, when I asked him about the wavefunction for photons, he bluntly told me photons didn’t have a wavefunction. I should just study Maxwell’s equations and that’s it: there’s no wavefunction for photons: just this traveling electric and a magnetic field vector. Look at Feynman’s Lectures, or any textbook, he said. None of them talk about photon wavefunctions. That’s true, but I knew he had to be wrong. I mulled over it for several months, and then just sat down and started doing to fiddle with Maxwell’s equations, assuming the oscillations of the E and B vector could be described by regular sinusoids. And – Lo and behold! – I derived a wavefunction for the photon. It’s fully equivalent to the classical description, but the new expression solves the Schrödinger equation, if we modify it in a rather logical way: we have to double the diffusion constant, which makes sense, because E and B give you two waves for the price of one!


In any case, I am getting ahead of myself here, and so I should wrap up this rather long introduction. Let me just say that, through my rather long journey in search of understanding – rather than knowledge alone – I have learned there are so many wrong answers out there: wrong answers that hamper rather than promote a better understanding. Moreover, I was most shocked to find out that such wrong answers are not the preserve of amateurs alone! This emboldened me to write what I write here, and to publish it. Quantum mechanics is a logical and coherent framework, and it is not all that difficult to understand. One just needs good pointers, and that’s what I want to provide here.

As of now, it focuses on the mechanics in particular, i.e. the concept of the wavefunction and wave equation (better known as Schrödinger’s equation). The other aspect of quantum mechanics – i.e. the idea of uncertainty as implied by the quantum idea – will receive more attention in a later version of this document. I should also say I will limit myself to quantum electrodynamics (QED) only, so I won’t discuss quarks (i.e. quantum chromodynamics, which is an entirely different realm), nor will I delve into any of the other more recent advances of physics.

In the end, you’ll still be left with lots of unanswered questions. However, that’s quite OK, as Richard Feynman himself was of the opinion that he himself did not understand the topic the way he would like to understand it. But then that’s exactly what draws all of us to quantum physics: a common search for a deep and full understanding of reality, rather than just some superficial description of it, i.e. knowledge alone.

So let’s get on with it. I am not saying this is going to be easy reading. In fact, I blogged about much easier stuff than this in my blog—treating only aspects of the whole theory. This is the whole thing, and it’s not easy to swallow. In fact, it may well too big to swallow as a whole. But please do give it a try. I wanted this to be an intuitive but formally correct introduction to quantum math. However, when everything is said and done, you are the only who can judge if I reached that goal.

Of course, I should not forget the acknowledgements but… Well… It was a rather lonely venture, so I am only going to acknowledge my wife here, Maria, who gave me all of the spacetime and all of the freedom I needed, as I would get up early, or work late after coming home from my regular job. I sacrificed weekends, which we could have spent together, and – when mulling over yet another paradox – the nights were often short and bad. Frankly, it’s been an extraordinary climb, but the view from the top is magnificent.

I just need to insert one caution, my site (www.readingfeynman.org) includes animations, which make it much easier to grasp some of the mathematical concepts that I will be explaining. Hence, I warmly recommend you also have a look at that site, and its Deep Blue page in particular – as that page has the same contents, more or less, but the animations make it a much easier read.

Have fun with it!

Jean Louis Van Belle, BA, MA, BPhil, Drs.

Re-visiting relativity and four-vectors: the proper time, the tensor and the four-force

My previous post explained how four-vectors transform from one reference frame to the other. Indeed, a four-vector is not just some one-dimensional array of four numbers: it represent something—a physical vector that… Well… Transforms like a vector. 🙂 So what vectors are we talking about? Let’s see what we have:

  1. We knew the position four-vector already, which we’ll write as xμ = (ct, x, y, z) = (ct, x).
  2. We also proved that Aμ = (Φ, Ax, Ay, Az) = (Φ, A) is a four-vector: it’s referred to as the four-potential.
  3. We also know the momentum four-vector from the Lectures on special relativity. We write it as pμ = (E, px, py, pz) = (E, p), with E = γm0, p = γm0v, and γ = (1−v2/c2)−1/2 or, for = 1, γ = (1−v2)−1/2

To show that it’s not just a matter of adding some fourth t-component to a three-vector, Feynman gives the example of the four-velocity vector. We have v= dx/dt, v= dy/dt and v= dz/dt, but a vμ = (d(ct)/dt, dx/dt, dy/dt, dz/dt) = (c, dx/dt, dy/dt, dz/dt) ‘vector’ is, obviously, not a four-vector. [Why obviously? The inner product vμvμ  is not invariant.] In fact, Feynman ‘fixes’ the problem by noting that ct, x, y and z have the ‘right behavior’, but the d/dt operator doesn’t. The d/dt operator is not an invariant operator. So how does he fix it then? He tries the (1−v2/c2)−1/2·d/dt operator and, yes, it turns out we do get a four-vector then. In fact, we get that four-velocity vector μμ that we were looking for:


Now how do we know this is four-vector? How can we prove this one? It’s simple. We can get it from our pμ = (E, p) by dividing it by m0 = E/c2, which is an invariant scalar in four dimensions too. Now, it is easy to see that a division by an invariant scalar does not change the transformation properties. So just write it all out, and you’ll see that pμ/m0 = μμ and, hence, that μμ is a four-vector too. 🙂

We’ve got an interesting thing here actually: division by an invariant scalar, or applying that (1−v2/c2)−1/2·d/dt operator, which is referred to as an invariant operator, on a four-vector will give us another four-vector. Why is that? Let’s switch to compatible time and distance units so c = 1 so to simplify the analysis that follows.

The invariant (1−v2)−1/2·d/dt operator and the proper time s

Why is the (1−v2)−1/2·d/dt operator invariant? Why does it ‘fix’ things? Well… Think about the invariant spacetime interval (Δs)= Δt− Δx− Δy− Δz2 going to the limit (ds)= dt− dx− dy− dz2 . Of course, we can and should relate this to an invariant quantity s = ∫ ds. Just like Δs, this quantity also ‘mixes’ time and distance. Now, we could try to associate some derivative d/ds with it because, as Feynman puts it, “it should be a nice four-dimensional operation because it is invariant with respect to a Lorentz transformation.” Yes. It should be. So let’s relate ds to dt and see what we get. That’s easy enough: dx = vx·dt, dy = vy·dt, dz = vz·dt, so we write:

(ds)= dt− vx2·dt− vy2·dt− vz2·dt⇔ (ds)= dt2·(1 − vx− vy− vz2) = dt2·(1 − v2)

and, therefore, ds = dt·(1−v2)1/2. So our operator d/ds is equal to (1−v2)−1/2·d/dt, and we can apply it to any four-vector, as we are sure that, as an invariant operator, it’s going to give us another four-vector. I’ll highlight the result, because it’s important:

The d/ds = (1−v2)−1/2·d/dt operator is an invariant operator for four-vectors.

For example, if we apply it to xμ = (t, x, y, z), we get the very same four-velocity vector μμ:

dxμ/ds = μμ = pμ/m0

Now, if you’re somewhat awake, you should ask yourself: what is this s, really, and what is this operator all about? Our new function s = ∫ ds is not the distance function, as it’s got both time and distance in it. Likewise, the invariant operator d/ds = (1−v2)−1/2·d/dt has both time and distance in it (the distance is implicit in the v2 factor). Still, it is referred to as the proper time along the path of a particle. Now why is that? If it’s got distance and time in it, why don’t we call it the ‘proper distance-time’ or something?

Well… The invariant quantity s actually is the time that would be measured by a clock that’s moving along, in spacetime, with the particle. Just think of it: in the reference frame of the moving particle itself, Δx, Δy and Δz must be zero, because it’s not moving in its own reference frame. So the (Δs)= Δt− Δx− Δy− Δz2 reduces to (Δs)= Δt2, and so we’re only adding time to s. Of course, this view of things implies that the proper time itself is fixed only up to some arbitrary additive constant, namely the setting of the clock at some event along the ‘world line’ of our particle, which is its path in four-dimensional spacetime. But… Well… In a way, s is the ‘genuine’ or ‘proper’ time coming with the particle’s reference frame, and so that’s why Einstein called it like that. You’ll see (later) that it plays a very important role in general relativity theory (which is a topic we haven’t discussed yet: we’ve only touched special relativity, so no gravity effects).

OK. I know this is simple and complicated at the same time: the math is (fairly) easy but, yes, it may be difficult to ‘understand’ this in some kind of intuitive way. But let’s move on.

The four-force vector fμ

We know the relativistically correct equation for the motion of some charge q. It’s just Newton’s Law F = dp/dt = d(mv)/dt but using m = p = γm0v, so we write:


How can we get a four-vector for the force? It turns out that we get it when applying our new invariant operator to the momentum four-vector pμ = (E, p), so we write: fμ = dpμ/ds. But pμ = m0μμ = m0dxμ/ds, so we can re-write this as fμ = d(m0·dxμ/ds)/ds, which gives us a formula which is reminiscent of the Newtonian F = ma equation:

force formula

What is this thing? Well… It’s not so difficult to verify that the x, y and z-components are just our old-fashioned Fx, Fy and Fz, so these are the components of F. The t-component is (1−v2)−1/2·dE/dt. Now, dE/dt is the time rate of change of energy and, hence, it’s equal to the rate of doing work on our charge, which is equal to Fv. So we can write fμ as:


The force and the tensor

We will now derive that formula which we ended the previous post with. We start with calculating the spacelike components of fμ from the Lorentz formula F = q(E + v×B). [The terminology is nice, isn’t it? The spacelike components of the four-force vector! Now that sounds impressive, doesn’t it? But so… Well… It’s really just the old stuff we know already.] So we start with fx = Fx, and write it all out:


What a monster! But, hey! We can ‘simplify’ this by substituting stuff by (1) the t-, x-, y- and z-components of the four-velocity vector μμ and (2) the components of our tensor Fμν = [Fij] = [∇iAj − ∇jAi] with i, j = t, x, y, z. We’ll also pop in the diagonal Fxx = 0 element, just to make sure it’s all there. We get:

fx 2

Looks better, doesn’t it? 🙂 Of course, it’s just the same, really. This is just an exercise in symbolism. Let me insert the electromagnetic tensor we defined in our previous post, just as a reminder of what that Fμν matrix actually is:

electromagnetic tensor final

If you read my previous post, this matrix – or the concept of a tensor – has no secrets for you. Let me briefly summarize it, because it’s an important result as well. The tensor is (a generalization of) the cross-product in four-dimensional space. We take two vectors: aμ = (at, ax, ay, az) and bμ = (bt, bx, by, bz) and then we take cross-products of their components just like we did in three-dimensional space, so we write Tij = aibj − ajbi. Now, it’s easy to see that this combination implies that Tij = − Tji and that Tii = 0, which is why we only have six independent numbers out of the 16 possible combinations, and which is why we’ll get a so-called anti-symmetric matrix when we organize them in a matrix. In three dimensions, the very same definition of the cross-product Tij gives us 9 combinations, and only 3 independent numbers, which is why we represented our ‘tensor’ as a vector too! In four-dimensional space we can’t do that: six things cannot be represented by a four-vector, so we need to use this matrix, which is referred to as a tensor of the second rank in four dimensions. [When you start using words like that, you’ve come a long way, really. :-)]

[…] OK. Back to our four-force. It’s easy to get a similar one-liner for fy and fz too, of course, as well as for ft. But… Yes, ft… Is it the same thing really? Let me quickly copy Feynman’s calculation for ft:


It does: remember that v×B and v are orthogonal, and so their dot product is zero indeed. So, to make a long story short, the four equations – one for each component of the four-force vector fμ – can be summarized in the following elegant equation:

motion equation

Writing this all requires a few conventions, however. For example, Fμν is a 4×4 matrix and so μν has to be written as a 1×4 vector. And the formula for the fx and ft component also make it clear that we also want to use the +−−− signature here, so the convention for the signs in the μνFμν product is the same as that for the scalar product aμbμ. So, in short, you really need to interpret what’s being written here.

A more important question, perhaps, is: what can we do with it? Well… Feynman’s evaluation of the usefulness of this formula is rather succinct: “Although it is nice to see that the equations can be written that way, this form is not particularly useful. It’s usually more convenient to solve for particle motions by using the F = q(E + v×B) = (1−v2)−1/2·d(m0v)/dt equations, and that’s what we will usually do.”

Having said that, this formula really makes good on the promise I started my previous post with: we wanted a formula, some mathematical construct, that effectively presents the electromagnetic force as one force, as one physical reality. So… Well… Here it is! 🙂

Well… That’s it for today. Tomorrow we’ll talk about energy and about a very mysterious concept—the electromagnetic mass. That should be fun! So I’ll c u tomorrow! 🙂

On (special) relativity: what’s relative?

This is my third and final post about special relativity. In the previous posts, I introduced the general idea and the Lorentz transformations. I present these Lorentz transformations once again below, next to their Galilean counterparts. [Note that I continue to assume, for simplicity, that the two reference frames move with respect to each other along the x- axis only, so the y- and z-component of u is zero. It is not all that difficult to generalize to three dimensions (especially not when using vectors) but it makes an intuitive understanding of what’s relativity all about more difficult.]

CaptureAs you can see, under a Lorentz transformation, the new ‘primed’ space and time coordinates are a mixture of the ‘unprimed’ ones. Indeed, the new x’ is a mixture of x and t, and the new t’ is a mixture as well. You don’t have that under a Galilean transformation: in the Newtonian world, space and time are neatly separated, and time is absolute, i.e. it is the same regardless of the reference frame. In Einstein’s world – our world – that’s not the case: time is relative, or local as Hendrik Lorentz termed it, and so it’s space-time – i.e. ‘some kind of union of space and time’ as Minkowski termed it  that transforms. In practice, physicists will use so-called four-vectors, i.e. vectors with four coordinates, to keep track of things. These four-vectors incorporate both the three-dimensional space vector as well as the time dimension. However, we won’t go into the mathematical details of that here.

What else is relative? Everything, except the speed of light. Of course, velocity is relative, just like in the Newtonian world, but the equation to go from a velocity as measured in one reference frame to a velocity as measured in the other, is different: it’s not a matter of just adding or subtracting speeds. In addition, besides time, mass becomes a relative concept as well in Einstein’s world, and that was definitely not the case in the Newtonian world.

What about energy? Well… We mentioned that velocities are relative in the Newtonian world as well, so momentum and kinetic energy were relative in that world as well: what you would measure for those two quantities would depend on your reference frame as well. However, here also, we get a different formula now. In addition, we have this weird equivalence between mass and energy in Einstein’s world, about which I should also say something more.

But let’s tackle these topics one by one. We’ll start with velocities.

Relativistic velocity

In the Newtonian world, it was easy. From the Galilean transformation equations above, it’s easy to see that

v’ = dx’/dt’ = d(x – ut)/dt = dx/dt – d(ut)/dt = v – u

So, in the Newtonian world, it’s just a matter of adding/subtracting speeds indeed: if my car goes 100 km/h (v), and yours goes 120 km/h, then you will see my car falling behind at a speed of (minus) 20 km/h. That’s it. In Einstein’s world, it is not so simply. Let’s take the spaceship example once again. So we have a man on the ground (the inertial or ‘unprimed’ reference frame) and a man in the spaceship (the primed reference frame), which is moving away from us with velocity u.

Now, suppose an object is moving inside the spaceship (along the x-axis as well) with a (uniform) velocity vx’, as measured from the point of view of the man inside the spaceship. Then the displacement x’ will be equal to x’vx’ t’. To know how that looks from the man on the ground, we just need to use the opposite Lorentz transformations: just replace u by –u everywhere (to the man in the spaceship, it’s like the man on the ground moves away with velocity –u), and note that the Lorentz factor does not change because we’re squaring and (–u)2 u2. So we get:


Hence, x’ = vx’ t’ can be written as x = γ(vx’ t’ + ut’). Now we should also substitute t’, because we want to measure everything from the point of view of the man on the ground. Now, t = γ(t’ + uvx’ t’/c2). Because we’re talking uniform velocities, v(i.e. the velocity of the object as measured by the man on the ground) will be equal to x divided by t (so we don’t need to take the time derivative of x), and then, after some simplifying and re-arranging (note, for instance, how the t’ factor miraculously disappears), we get:


What does this rather complicated formula say? Just put in some numbers:

  • Suppose the object is moving at half the speed of light, so 0.5c, and that the spaceship is moving itself also at 0.5c, then we get the rather remarkable result that, from the point of view of the observer on the ground, that object is not going as fast as light, but only at vx = (0.5c + 0.5c)/(1 + 0.5·0.5) = 0.8c.
  • Or suppose we’re looking at a light beam inside the spaceship, so something that’s traveling at speed c itself in the spaceship. How does that look to the man on the ground? Just put in the numbers: vx = (0.5c + c)/(1 + 0.5·1) = ! So the speed of light is not dependent on the reference frame: it looks the same – both to the man in the ship as well as to the man on the ground. As Feynman puts it: “This is good, for it is, in fact, what the Einstein theory of relativity was designed to do in the first place–so it had better work!”

It’s interesting to note that, even if u has no y– or z-component, velocity in the direction will be affected too. Indeed, if an object is moving upward in the spaceship, then the distance of travel of that object to the man on the ground will appear to be larger. See the triangle below: if that object travels a distance Δs’ = Δy’ = Δy = v’Δt’ with respect to the man in the spaceship, then it will have traveled a distance Δs = vΔt to the man on the ground, and that distance is longer.

CaptureI won’t go through the process of substituting and combining the Lorentz equations (you can do that yourself) but the grand result is the following:

vy = (1/γ)vy’ 

1/γ is the reciprocal of the Lorentz factor, and I’ll leave it to you to work out a few numeric examples. When you do that, you’ll find the rather remarkable result that vy is actually less than vy’. For example, for u = 0.6c, 1/γ will be equal to 0.8, so vy will be 20% less than vy’. How is that possible? The vertical distance is what it is (Δy’ = Δy), and that distance is not affected by the ‘length contraction’ effect (y’ = y). So how can the vertical velocity be smaller?  The answer is easy to state, but not so easy to understand: it’s the time dilation effect: time in the spaceship goes slower. Hence, the object will cover the same vertical distance indeed – for both observers – but, from the point of view of the observer on the ground, the object will apparently need more time to cover that distance than the time measured by the man in the spaceship: Δt > Δt’. Hence, the logical conclusion is that the vertical velocity of that object will appear to be less to the observer on the ground.

How much less? The time dilation factor is the Lorentz factor. Hence, Δt = γΔt’. Now, if u = 0.6c, then γ will be equal to 1.25 and Δt = 1.25Δt’. Hence, if that object would need, say, one second to cover that vertical distance, then, from the point of view of the observer on the ground, it would need 1.25 seconds to cover the same distance. Hence, its speed as observed from the ground is indeed only 1/(5/4) = 4/5 = 0.8 of its speed as observed by the man in the spaceship.

Is that hard to understand? Maybe. You have to think through it. One common mistake is that people think that length contraction and/or time dilation are, somehow, related to the fact that we are looking at things from a distance and that light needs time to reach us. Indeed, on the Web, you can find complicated calculations using the angle of view and/or the line of sight (and tons of trigonometric formulas) as, for example, shown in the drawing below. These have nothing to do with relativity theory and you’ll never get the Lorentz transformation out of them. They are plain nonsense: they are rooted in an inability of these youthful authors to go beyond Galilean relativity. Length contraction and/or time dilation are not some kind of visual trick or illusion. If you want to see how one can derive the Lorentz factor geometrically, you should look for a good description of the Michelson-Morley experiment in a good physics handbook such as, yes :-), Feynman’s Lectures.

visual effect 2

So, I repeat: illustrations that try to explain length contraction and time dilation in terms of line of sight and/or angle of view are useless and will not help you to understand relativity. On the contrary, they will only confuse you. I will let you think through this and move on to the next topic.

Relativistic mass and relativistic momentum

Einstein actually stated two principles in his (special) relativity theory:

  1. The first is the Principle of Relativity itself, which is basically just the same as Newton’s principle of relativity. So that was nothing new actually: “If a system of coordinates K is chosen such that, in relation to it, physical laws hold good in their simplest form, then the same laws must hold good in relation to any other system of coordinates K’ moving in uniform translation relatively to K.” Hence, Einstein did not change the principle of relativity – quite on the contrary: he re-confirmed it – but he did change Newton’s Laws, as well as the Galilean transformation equations that came with them. He also introduced a new ‘law’, which is stated in the second ‘principle’, and that the more revolutionary one really:
  2. The Principle of Invariant Light Speed: “Light is always propagated in empty space with a definite velocity [speed] c which is independent of the state of motion of the emitting body.”

As mentioned above, the most notable change in Newton’s Laws – the only change, in fact – is Einstein’s relativistic formula for mass:

mv = γm0

This formula implies that the inertia of an object, i.e. its mass, also depends on the reference frame of the observer. If the object moves (but velocity is relative as we know: an object will not be moving if we move with it), then its mass increases. This affects its momentum. As you may or may not remember, the momentum of an object is the product of its mass and its velocity. It’s a vector quantity and, hence, momentum has not only a magnitude but also a direction:

 pv = mvv = γm0v 

As evidenced from the formula above, the momentum formula is a relativistic formula as well, as it’s dependent on the Lorentz factor too. So where do I want to go from here? Well… In this section (relativistic mass and momentum), I just want to show that Einstein’s mass formula is not some separate law or postulate: it just comes with the Lorentz transformation equations (and the above-mentioned consequences in terms of measuring horizontal and vertical velocities).

Indeed, Einstein’s relativistic mass formula can be derived from the momentum conservation principle, which is one of the ‘physical laws’ that Einstein refers to. Look at the elastic collision between two billiard balls below. These balls are equal – same mass and same speed from the point of view of an inertial observer – but not identical: one is red and one is blue. The two diagrams show the collision from two different points of view: left, we have the inertial reference frame, and, right, we have a reference frame that is moving with a velocity equal to the horizontal component of the velocity of the blue ball.


The points to note are the following:

  1. The total momentum of such elastic collision before and after the collision must be the same.
  2. Because the two balls have equal mass (in the inertial reference frame at least), the collision will be perfectly symmetrical. Indeed, we may just turn the diagram ‘upside down’ and change the colors of the balls, as we do below, and the values w, u and v (as well as the angle α) are the same.

Elastic collision

As mentioned above, the velocity of the blue and red ball and, hence, their momentum, will depend on the frame of reference. In the diagram on the left, we’re moving with a velocity equal to the horizontal component of the velocity of the blue ball and, therefore, in this particular frame of reference, the velocity (and the momentum) of the blue ball consists of a vertical component only, which we refer to as w.

From this point of view (i.e. the reference frame moving with, the velocity (and, hence, the momentum) of the red ball will have both a horizontal as well as a vertical component. If we denote the horizontal component by u, then it’s easy to show that the vertical velocity of the red ball must be equal to sin(α)v. Now, because u = cos(α)v, this vertical component will be equal to tan(α)u. But so what is tan(α)u? Now, you’ll say, that is quite evident: tan(α)u must be equal to w, right?

No. That’s Newtonian physics. The red ball is moving horizontally with speed u with respect to the blue ball and, hence, its vertical velocity will not be quite equal to w. Its vertical velocity will be given by the formula which we derived above: vy = (1/γ)vy’, so it will be a little bit slower than the w we see in the diagram on the right which is, of course, the same w as in the diagram on the left. [If you look carefully at my drawing above, then you’ll notice that the w vector is a bit longer indeed.]

Huh? Yes. Just think about it: tan(α)= (1/γ)w. But then… How can momentum be conserved if these speeds are not the same? Isn’t the momentum conservation principle supposed to conserve both horizontal as well as vertical momentum? It is, and momentum is being conserved. Why? Because of the relativistic mass factor.

Indeed, the change in vertical momentum (Δp) of the blue ball in the diagram on the left or – which amounts to the same – the red ball in the diagram on the right (i.e. the vertically moving ball) is equal to Δpblue = 2mww. [The factor 2 is there because the ball goes down and then up (or vice versa) and, hence, the total change in momentum must be twice the mwamount.] Now, that amount must be equal to Δpred, which is equal to Δpblue = 2mv(1/γ)w. Equating both yields the following grand result:

mv/m= γ ⇔ mv = γmw

What does this mean? It means that mass of the red ball in the diagram on the left is larger than the mass of the blue ball. So here we have actually derived Einstein’s relativistic mass formula from the momentum conservation principle !

Of course you’ll say: not quite. This formula is not the mu = γmformula that we’re used to ! Indeed, it’s not. The blue ball has some velocity w itself, and so the formula links two velocities v and w. However, we can derive  mv = γmformula as a limit of mv = γmw for w going to zer0. How can w become infinitesimally small? If the angle α becomes infinitesimally small. It’s obvious, then, that v and u will be practically equal. In fact, if w goes to zero, then mw will be equal to m0 in the limiting case, and mv will be equal to mu. So, then, indeed, we get the familiar formula as a limiting case:

mu = γm

Hmm… You’ll probably find all of this quite fishy. I’d suggest you just think about it. What I presented above, is actually Feynman’s presentation of the subject, but with a bit more verbosity. Let’s move on to the final.

Relativistic energy

From what I wrote above (and from what I wrote in my two previous posts on this topic), it should be obvious, by now, that energy also depends on the reference frame. Indeed, mass and velocity depend on the reference frame (moving or not), and both appear in the formula for kinetic energy which, as you’ll remember, is

K.E. = mc– m0c= (m – m0)c= γm0c– m0c= m0c2(γ – 1).

Now, if you go back to the post where I presented that formula, you’ll see that we’re actually talking the change in kinetic energy here: if the mass is at rest, it’s kinetic energy is zero (because m = m0), and it’s only when the mass is moving, that we can observe the increase in mass. [If you wonder how, think about the example of the fast-moving electrons in an electron beam: we see it as an increase in the inertia: applying the same force does no longer yield the same acceleration.]

Now, in that same post, I also noted that Einstein added an equivalent rest mass energy (E= m0c2) to the kinetic energy above, to arrive at the total energy of an object:

E = E+ K.E. = mc

Now, what does this equivalence actually mean? Is mass energy? Can we equate them really? The short answer to that is: yes.

Indeed, in one of my older posts (Loose Ends), I explained that protons and neutrons are made of quarks and, hence, that quarks are the actual matter particles, not protons and neutrons. However, the mass of a proton – which consists of two up quarks and one down quark – is 938 MeV/c(don’t worry about the units I am using here: because protons are so tiny, we don’t measure their mass in grams), but the mass figure you get when you add the rest mass of two u‘s and one d, is 9.6 MeV/conly: about one percent of 938 ! So where’s the difference?

The difference is the equivalent mass (or inertia) of the binding energy between the quarks. Indeed, the so-called ‘mass’ that gets converted into energy when a nuclear bomb explodes is not the mass of quarks. Quarks survive: nuclear power is binding energy between quarks that gets converted into heat and radiation and kinetic energy and whatever else a nuclear explosion unleashes.

In short, 99% of the ‘mass’ of a proton or an electron is due to the strong force. So that’s ‘potential’ energy that gets unleashed in a nuclear chain reaction. In other words, the rest mass of the proton is actually the inertia of the system of moving quarks and gluons that make up the particle. In such atomic system, even the energy of massless particles (e.g. the virtual photons that are being exchanged between the nucleus and its electron shells) is measured as part of the rest mass of the system. So, yes, mass is energy. As Feynman put it, long before the quark model was confirmed and generally accepted:

“We do not have to know what things are made of inside; we cannot and need not justify, inside a particle, which of the energy is rest energy of the parts into which it is going to disintegrate. It is not convenient and often not possible to separate the total mc2 energy of an object into (1) rest energy of the inside pieces, (2) kinetic energy of the pieces, and (3) potential energy of the pieces; instead we simply speak of the total energy of the particle. We ‘shift the origin’ of energy by adding a constant m0c2 to everything, and say that the total energy of a particle is the mass in motion times c2, and when the object is standing still, the energy is the mass at rest times c2.” (Richard Feynman’s Lectures on Physics, Vol. I, p. 16-9)

 So that says it all, I guess, and, hence, that concludes my little ‘series’ on (special) relativity. I hope you enjoyed it.

Post scriptum:

Feynman describes the concept of space-time with a nice analogy: “When we move to a new position, our brain immediately recalculates the true width and depth of an object from the ‘apparent’ width and depth. But our brain does not immediately recalculate coordinates and time when we move at high speed, because we have had no effective experience of going nearly as fast as light to appreciate the fact that time and space are also of the same nature. It is as though we were always stuck in the position of having to look at just the width of something, not being able to move our heads appreciably one way or the other; if we could, we understand now, we would see some of the other man’s time—we would see “behind”, so to speak, a little bit. Thus, we shall try to think of objects in a new kind of world, of space and time mixed together, in the same sense that the objects in our ordinary space-world are real, and can be looked at from different directions. We shall then consider that objects occupying space and lasting for a certain length of time occupy a kind of a “blob” in a new kind of world, and that when we look at this “blob” from different points of view when we are moving at different velocities. This new world, this geometrical entity in which the “blobs” exist by occupying position and taking up a certain amount of time, is called space-time.”

If none of what I wrote could convey the general idea, then I hope the above quote will. 🙂 Apart from that, I should also note that physicists will prefer to re-write the Lorentz transformation equations by measuring time and distance in so-called equivalent units: velocities will be expressed not in km/h but as a ratio of c and, hence, = 1 (a pure number) and so u will also be a pure number between 0 and 1. That can be done by expressing distance in light-seconds ( a light-second is the distance traveled by light in one second or, alternatively, by expressing time in ‘meter’. Both are equivalent but, in most textbooks, it will be time that will be measured in the ‘new’ units. So how do we express time in meter?

It’s quite simple: we multiply the old seconds with c and then we get: timeexpressed in meters = timeexpressed in seconds multiplied by 3×10meters per second. Hence, as the ‘second’ the first factor and the ‘per second’ in the second factor cancel out, the dimension of the new time unit will effectively be the meter. Now, if both time and distance are expressed in meter, then velocity becomes a pure number without any dimension, because we are dividing distance expressed in meter by time expressed in meter, and it should be noted that it will be a pure number between 0 and 1 (0 ≤ u ≤ 1), because 1 ‘time second’ = 1/(3×108) ‘time meters’. Also, c itself becomes the pure number 1. The Lorentz transformation equations then become:


They are easy to remember in this form (cf. the symmetry between x ut and t  ux) and, if needed, we can always convert back to the old units to recover the original formulas.

I personally think there is no better way to illustrate how space and time are ‘mere shadows’ of the same thing indeed: if we express both time and space in the same dimension (meter), we can see how, as result of that, velocity becomes a dimensionless number between zero and one and, more importantly, how the equations for x’ and t’ then mirror each other nicely. I am not sure what ‘kind of union’ between space and time Minkowski had in mind, but this must come pretty close, no?

Final note: I noted the equivalence of mass and energy above. In fact, mass and energy can also be expressed in the same units, and we actually do that above already. If we say that an electron has a rest mass of 0.511 MeV/c(a bit less than a quarter of the mass of the u quark), then we express the mass in terms of energy. Indeed, the eV is an energy unit and so we’re actually using the m = E/c2 formula when we express mass in such units. Expressing mass and energy in equivalent units allows us to derive similar ‘Lorentz transformation equations’ for the energy and the momentum of an object as measured under an inertial versus a moving reference frame. Hence, energy and momentum also transform like our space-time four-vectors and – likewise – the energy and the momentum itself, i.e. the components of the (four-)vector, are less ‘real’ than the vector itself. However, I think this post has become way too long and, hence, I’ll just jot these four equations down – please note, once again, the nice symmetry between (1) and (2) – but then leave it at that and finish this post. 🙂


The Uncertainty Principle re-visited: Fourier transforms and conjugate variables

In previous posts, I presented a time-independent wave function for a particle (or wavicle as we should call it – but so that’s not the convention in physics) – let’s say an electron – traveling through space without any external forces (or force fields) acting upon it. So it’s just going in some random direction with some random velocity v and, hence, its momentum is p = mv. Let me be specific – so I’ll work with some numbers here – because I want to introduce some issues related to units for measurement.

So the momentum of this electron is the product of its mass m (about 9.1×10−28 grams) with its velocity v (typically something in the range around 2,200 km/s, which is fast but not even close to the speed of light – and, hence, we don’t need to worry about relativistic effects on its mass here). Hence, the momentum p of this electron would be some 20×10−25 kg·m/s. Huh? Kg·m/s?Well… Yes, kg·m/s or N·s are the usual measures of momentum in classical mechanics: its dimension is [mass][length]/[time] indeed. However, you know that, in atomic physics, we don’t want to work with these enormous units (because we then always have to add these ×10−28 and ×10−25 factors and so that’s a bit of a nuisance indeed). So the momentum p will usually be measured in eV/c, with c representing what it usually represents, i.e. the speed of light. Huh? What’s this strange unit? Electronvolts divided by c? Well… We know that eV is an appropriate unit for measuring energy in atomic physics: we can express eV in Joule and vice versa: 1 eV = 1.6×10−19 Joule, so that’s OK – except for the fact that this Joule is a monstrously large unit at the atomic scale indeed, and so that’s why we prefer electronvolt. But the Joule is a shorthand unit for kg·m2/s2, which is the measure for energy expressed in SI units, so there we are: while the SI dimension for energy is actually [mass][length]2/[time]2, using electronvolts (eV) is fine. Now, just divide the SI dimension for energy, i.e. [mass][length]2/[time]2, by the SI dimension for velocity, i.e. [length]/[time]: we get something expressed in [mass][length]/[time]. So that’s the SI dimension for momentum indeed! In other words, dividing some quantity expressed in some measure for energy (be it Joules or electronvolts or erg or calories or coulomb-volts or BTUs or whatever – there’s quite a lot of ways to measure energy indeed!) by the speed of light (c) will result in some quantity with the right dimensions indeed. So don’t worry about it. Now, 1 eV/c is equivalent to 5.344×10−28 kg·m/s, so the momentum of this electron will be 3.75 eV/c.

Let’s go back to the main story now. Just note that the momentum of this electron that we are looking at is a very tiny amount – as we would expect of course.

Time-independent means that we keep the time variable (t) in the wave function Ψ(x, t) fixed and so we only look at how Ψ(x, t) varies in space, with x as the (real) space variable representing position. So we have a simplified wave function Ψ(x) here: we can always put the time variable back in when we’re finished with the analysis. By now, it should also be clear that we should distinguish between real-valued wave functions and complex-valued wave functions. Real-valued wave functions represent what Feynman calls “real waves”, like a sound wave, or an oscillating electromagnetic field. Complex-valued wave functions describe probability amplitudes. They are… Well… Feynman actually stops short of saying that they are not real. So what are they?

They are, first and foremost complex numbers, so they have a real and a so-called imaginary part (z = a + ib or, if we use polar coordinates, reθ = cosθ + isinθ). Now, you may think – and you’re probably right to some extent – that the distinction between ‘real’ waves and ‘complex’ waves is, perhaps, less of a dichotomy than popular writers – like me 🙂 – suggest. When describing electromagnetic waves, for example, we need to keep track of both the electric field vector E as well as the magnetic field vector B (both are obviously related through Maxwell’s equations). So we have two components as well, so to say, and each of these components has three dimensions in space, and we’ll use the same mathematical tools to describe them (so we will also represent them using complex numbers). That being said, these probability amplitudes usually denoted by Ψ(x), describe something very different. What exactly? Well… By now, it should be clear that that is actually hard to explain: the best thing we can do is to work with them, so they start feeling familiar. The main thing to remember is that we need to square their modulus (or magnitude or absolute value if you find these terms more comprehensible) to get a probability (P). For example, the expression below gives the probability of finding a particle – our electron for example – in in the (space) interval [a, b]:

probability versus amplitude

Of course, we should not be talking intervals but three-dimensional regions in space. However, we’ll keep it simple: just remember that the analysis should be extended to three (space) dimensions (and, of course, include the time dimension as well) when we’re finished (to do that, we’d use so-called four-vectors – another wonderful mathematical invention).

Now, we also used a simple functional form for this wave function, as an example: Ψ(x) could be proportional, we said, to some idealized function eikx. So we can write: Ψ(x) ∝ eikx (∝ is the standard symbol expressing proportionality). In this function, we have a wave number k, which is like the frequency in space of the wave (but then measured in radians because the phase of the wave function has to be expressed in radians). In fact, we actually wrote Ψ(x, t) = (1/x)ei(kx – ωt) (so the magnitude of this amplitude decreases with distance) but, again, let’s keep it simple for the moment: even with this very simple function eikx , things will become complex enough.

We also introduced the de Broglie relation, which gives this wave number k as a function of the momentum p of the particle: k = p/ħ, with ħ the (reduced) Planck constant, i.e. a very tiny number in the neighborhood of 6.582 ×10−16 eV·s. So, using the numbers above, we’d have a value for k equal to 3.75 eV/c divided by 6.582 ×10−16 eV·s. So that’s 0.57×1016 (radians) per… Hey, how do we do it with the units here? We get an incredibly huge number here (57 with 14 zeroes after it) per second? We should get some number per meter because k is expressed in radians per unit distance, right? Right. We forgot c. We are actually measuring distance here, but in light-seconds instead of meter: k is 0.57×1016/s. Indeed, a light-second is the distance traveled by light in one second, so that’s s, and if we want k expressed in radians per meter, then we need to divide this huge number 0.57×1016 (in rad) by 2.998×108 ( in (m/s)·s) and so then we get a much more reasonable value for k, and with the right dimension too: to be precise, k is about 19×106 rad/m in this case. That’s still huge: it corresponds with a wavelength of 0.33 nanometer (1 nm = 10-6 m) but that’s the correct order of magnitude indeed.

[In case you wonder what formula I am using to calculate the wavelength: it’s λ = 2π/k. Note that our electron’s wavelength is more than a thousand times shorter than the wavelength of (visible) light (we humans can see light with wavelengths ranging from 380 to 750 nm) but so that’s what gives the electron its particle-like character! If we would increase their velocity (e.g. by accelerating them in an accelerator, using electromagnetic fields to propel them to speeds closer to and also to contain them in a beam), then we get hard beta rays. Hard beta rays are surely not as harmful as high-energy electromagnetic rays. X-rays and gamma rays consist of photons with wavelengths ranging from 1 to 100 picometer (1 pm = 10–12 m) – so that’s another factor of a thousand down – and thick lead shields are needed to stop them: they are the cause of cancer (Marie Curie’s cause of death), and the hard radiation of a nuclear blast will always end up killing more people than the immediate blast effect. In contrast, hard beta rays will cause skin damage (radiation burns) but they won’t go deeper than that.]

Let’s get back to our wave function Ψ(x) ∝ eikx. When we introduced it in our previous posts, we said it could not accurately describe a particle because this wave function (Ψ(x) = Aeikx) is associated with probabilities |Ψ(x)|2 that are the same everywhere. Indeed,  |Ψ(x)|2 = |Aeikx|2 = A2. Apart from the fact that these probabilities would add up to infinity (so this mathematical shape is unacceptable anyway), it also implies that we cannot locate our electron somewhere in space. It’s everywhere and that’s the same as saying it’s actually nowhere. So, while we can use this wave function to explain and illustrate a lot of stuff (first and foremost the de Broglie relations), we actually need something different if we would want to describe anything real (which, in the end, is what physicists want to do, right?). We already said in our previous posts: real particles will actually be represented by a wave packet, or a wave train. A wave train can be analyzed as a composite wave consisting of a (potentially infinite) number of component waves. So we write:

Composite wave

Note that we do not have one unique wave number k or – what amounts to saying the same – one unique value p for the momentum: we have n values. So we’re introducing a spread in the wavelength here, as illustrated below:

Explanation of uncertainty principle

In fact, the illustration above talks of a continuous distribution of wavelengths and so let’s take the continuum limit of the function above indeed and write what we should be writing:

Composite wave - integral

Now that is an interesting formula. [Note that I didn’t care about normalization issues here, so it’s not quite what you’d see in a more rigorous treatment of the matter. I’ll correct that in the Post Scriptum.] Indeed, it shows how we can get the wave function Ψ(x) from some other function Φ(p). We actually encountered that function already, and we referred to it as the wave function in the momentum space. Indeed, Nature does not care much what we measure: whether it’s position (x) or momentum (p), Nature will not share her secrets with us and, hence, the best we can do – according to quantum mechanics – is to find some wave function associating some (complex) probability amplitude with each and every possible (real) value of x or p. What the equation above shows, then, is these wave functions come as a pair: if we have Φ(p), then we can calculate Ψ(x) – and vice versa. Indeed, the particular relation between Ψ(x) and Φ(p) as established above, makes Ψ(x) and Φ(p) a so-called Fourier transform pair, as we can transform Φ(p) into Ψ(x) using the above Fourier transform (that’s how that  integral is called), and vice versa. More in general, a Fourier transform pair can be written as:

Fourier transform pair

Instead of x and p, and Ψ(x) and Φ(p), we have x and y, and f(x) and g(y), in the formulas above, but so that does not make much of a difference when it comes to the interpretation: x and p (or x and y in the formulas above) are said to be conjugate variables. What it means really is that they are not independent. There are quite a few of such conjugate variables in quantum mechanics such as, for example: (1) time and energy (and time and frequency, of course, in light of the de Broglie relation between both), and (2) angular momentum and angular position (or orientation). There are other pairs too but these involve quantum-mechanical variables which I do not understand as yet and, hence, I won’t mention them here. [To be complete, I should also say something about that 1/2π factor, but so that’s just something that pops up when deriving the Fourier transform from the (discrete) Fourier series on which it is based. We can put it in front of either integral, or split that factor across both. Also note the minus sign in the exponent of the inverse transform.]

When you look at the equations above, you may think that f(x) and g(y) must be real-valued functions. Well… No. The Fourier transform can be used for both real-valued as well as complex-valued functions. However, at this point I’ll have to refer those who want to know each and every detail about these Fourier transforms to a course in complex analysis (such as Brown and Churchill’s Complex Variables and Applications (2004) for instance) or, else, to a proper course on real and complex Fourier transforms (they are used in signal processing – a very popular topic in engineering – and so there’s quite a few of those courses around).

The point to note in this post is that we can derive the Uncertainty Principle from the equations above. Indeed, the (complex-valued) functions Ψ(x) and Φ(p) describe (probability) amplitudes, but the (real-valued) functions |Ψ(x)|2 and |Φ(p)|2 describe probabilities or – to be fully correct – they are probability (density) functions. So it is pretty obvious that, if the functions Ψ(x) and Φ(p) are a Fourier transform pair, then |Ψ(x)|2 and |Φ(p)|2 must be related to. They are. The derivation is a bit lengthy (and, hence, I will not copy it from the Wikipedia article on the Uncertainty Principle) but one can indeed derive the so-called Kennard formulation of the Uncertainty Principle from the above Fourier transforms. This Kennard formulation does not use this rather vague Δx and Δp symbols but clearly states that the product of the standard deviation from the mean of these two probability density functions can never be smaller than ħ/2:

σxσ≥ ħ/2

To be sure: ħ/2 is a rather tiny value, as you should know by now, 🙂 but, so, well… There it is.

As said, it’s a bit lengthy but not that difficult to do that derivation. However, just for once, I think I should try to keep my post somewhat shorter than usual so, to conclude, I’ll just insert one more illustration here (yes, you’ve seen that one before), which should now be very easy to understand: if the wave function Ψ(x) is such that there’s relatively little uncertainty about the position x of our electron, then the uncertainty about its momentum will be huge (see the top graphs). Vice versa (see the bottom graphs), precise information (or a narrow range) on its momentum, implies that its position cannot be known.


Does all this math make it any easier to understand what’s going on? Well… Yes and no, I guess. But then, if even Feynman admits that he himself “does not understand it the way he would like to” (Feynman Lectures, Vol. III, 1-1), who am I? In fact, I should probably not even try to explain it, should I? 🙂

So the best we can do is try to familiarize ourselves with the language used, and so that’s math for all practical purposes. And, then, when everything is said and done, we should probably just contemplate Mario Livio’s question: Is God a mathematician? 🙂

Post scriptum:

I obviously cut corners above, and so you may wonder how that ħ factor can be related to σand σ if it doesn’t appear in the wave functions. Truth be told, it does. Because of (i) the presence of ħ in the exponent in our ei(p/ħ)x function, (ii) normalization issues (remember that probabilities (i.e. Ψ|(x)|2 and |Φ(p)|2) have to add up to 1) and, last but not least, (iii) the 1/2π factor involved in Fourier transforms , Ψ(x) and Φ(p) have to be written as follows:

Position and momentum wave functionNote that we’ve also re-inserted the time variable here, so it’s pretty complete now. One more thing we could do is to substitute x for a proper three-dimensional space vector or, better still, introduce four-vectors, which would allow us to also integrate relativistic effects (most notably the slowing of time with motion – as observed from the stationary reference frame) – which become important when, for instance, we’re looking at electrons being accelerated, which is the rule, rather than the exception, in experiments.

Remember (from a previous post) that we calculated that an electron traveling at its usual speed in orbit (2200 km/s, i.e. less than 1% of the speed of light) had an energy of about 70 eV? Well, the Large Electron-Positron Collider (LEP) did accelerate them to speeds close to light, thereby giving them energy levels topping 104.5 billion eV (or 104.5 GeV as it’s written) so they could hit each other with collision energies topping 209 GeV (they come from opposite directions so it’s two times 104.5 GeV). Now, 209 GeV is tiny when converted to everyday energy units: 209 GeV is 33×10–9 Joule only indeed – and so note the minus sign in the exponent here: we’re talking billionths of a Joule here. Just to put things into perspective: 1 Watt is the energy consumption of an LED (and 1 Watt is 1 Joule per second), so you’d need to combine the energy of billions of these fast-traveling electrons to power just one little LED lamp. But, of course, that’s not the right comparison: 104.5 GeV is more than 200,000 times the electron’s rest mass (0.511 MeV), so that means that – in practical terms – their mass (remember that mass is a measure for inertia) increased by the same factor (204,500 times to be precise). Just to give an idea of the effort that was needed to do this: CERN’s LEP collider was housed in a tunnel with a circumference of 27 km. Was? Yes. The tunnel is still there but it now houses the Large Hadron Collider (LHC) which, as you surely know, is the world’s largest and most powerful particle accelerator: its experiments confirmed the existence of the Higgs particle in 2013, thereby confirming the so-called Standard Model of particle physics. [But I’ll see a few things about that in my next post.]

Oh… And, finally, in case you’d wonder where we get the inequality sign in σxσ≥ ħ/2, that’s because – at some point in the derivation – one has to use the Cauchy-Schwarz inequality (aka as the triangle inequality): |z1+ z1| ≤ |z1|+| z1|. In fact, to be fully complete, the derivation uses the more general formulation of the Cauchy-Schwarz inequality, which also applies to functions as we interpret them as vectors in a function space. But I would end up copying the whole derivation here if I add any more to this – and I said I wouldn’t do that. 🙂 […]