This text is a common-sense introduction to the key concepts in quantum physics. It recycles what I consider to be my more interesting posts, but combines them in a comprehensive structure. For those who’d like to read it in an e-book format, I also published it on Amazon/Kindle, and summarized it online on another site. In fact, I recommend reading online, because the e-books do not have the animations: click the link for the shorter version, or continue here. [Note that the shorter version is more recent and has an added chapter on the physical dimensions of the real and imaginary component of the wavefunction, which I think is quite innovative – but I will let you judge that.]
What I write – here and on the other site – is a bit of a philosophical analysis of quantum mechanics as well, as I will – hopefully – do a better job than others in distinguishing the mathematical concepts from what they are supposed to describe, i.e. physical reality.
As of now, it focuses on the mechanics in particular, i.e. the concept of the wavefunction and wave equation (better known as Schrödinger’s equation). The other aspect of quantum mechanics – i.e. the idea of uncertainty as implied by the quantum idea – will receive more attention in a later version of this document. I should also say I will limit myself to quantum electrodynamics (QED) only, so I won’t discuss quarks (i.e. quantum chromodynamics, which is an entirely different realm), nor will I delve into any of the other more recent advances of physics.
In the end, you’ll still be left with lots of unanswered questions. However, that’s quite OK, as the late Richard Feynman—who surely knew quantum physics as few others did—was of the opinion that he himself did not understand the topic the way he would like to understand it. That’s what draws all of us to quantum physics: a common search for understanding, rather than knowledge alone.
So let’s now get on with it. Please note that, while everything I write is common sense, I am not saying this is going to be easy reading. I’ve written much easier posts than this—treating only aspects of the whole theory. But this is the whole thing, and it’s not easy to swallow. In fact, it may well be too big to swallow as a whole. 🙂 But please do give it a try. I wanted this to be an intuitive but formally correct introduction to quantum math. However, when everything is said and done, you are the only one who can judge if I reached that goal.
I. The scene: spacetime
Any discussion of physics – including quantum physics – should start with a discussion of our concepts of space and time, I think. So that’s what I’ll talk about first.
Space and time versus spacetime
Because of Einstein, we now know that our time and distance measurements are relative. Think of time dilation and relativistic length contraction here. Minkowski’s famous introduction to his equally famous 1908 lecture – in which he explained his ideas on four-dimensional spacetime, which enshrined Einstein’s special relativity theory in a coherent mathematical framework – sums it all up:
“The views of space and time which I wish to lay before you have sprung from the soil of experimental physics, and therein lies their strength. They are radical. Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.”
Minkowski was not stating the obvious when he said this back in 1908: in his lecture, he talked about four-vectors and Lorentzian inner products. To be precise, he talked about his spacetime concept, which is… Well… About preserving the invariance of four-vectors and other rather non-intuitive stuff. If you want a summary: Minkowski’s spacetime framework is a mathematical structure which incorporates relativistic time dilation, length contraction and mass increase.
Phew! That’s quite a mouthful, isn’t it? But we’ll need that non-intuitive stuff. Minkowski did not merely say that space and time are related. He went way beyond. Because it’s obvious that space and time are related, somehow. That has always been obvious. The thinking of Einstein and Minkowski and their likes was radical because they told us we should think very differently about how space and time are related: they’re related in a non-intuitive way!
However, to set the scene, let’s first talk about the easy relations, i.e. the Galilean or Newtonian concepts of time and space—which actually go much further back in time than Galileo or Newton, as evidenced by the earliest philosophical definitions we have of space and time. Think of Plato and Aristotle, for example. Plato set the stage, indeed, by associating both concepts with motion: we can only measure distance, or some time interval, with a reference to some velocity, and the concept of velocity combines both the time as well as the space dimension: v = Δx/Δt.
Plato knew that ratio. He knew a lot about math, and he knew all about Zeno’s paradoxes which, from a mathematical point of view, can only really be refuted by introducing modern calculus, which includes the concepts of continuity, infinite series, limits and derivatives. In the limit, so for the time interval going to 0 (Δt → 0), the Δx/Δt ratio becomes a derivative, indeed, and the velocity becomes an instantaneous velocity:
v = dx/dt
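To see that limit at work numerically, here is a small Python sketch. The trajectory x(t) = 5·t² is a purely illustrative choice (uniform acceleration), not anything from the text above:

```python
# Numerical illustration of v = dx/dt as the limit of Δx/Δt.
# The trajectory x(t) = 5·t² is an illustrative choice (uniform acceleration).
def x(t):
    return 5.0 * t**2

def average_velocity(t, dt):
    """Δx/Δt over the interval [t, t + dt]."""
    return (x(t + dt) - x(t)) / dt

# The exact derivative at t = 2 is dx/dt = 10·t = 20.
for dt in (1.0, 0.1, 0.001, 0.00001):
    print(dt, average_velocity(2.0, dt))
```

As Δt shrinks, the average velocity Δx/Δt converges to the instantaneous velocity dx/dt—which is, in a nutshell, how calculus tames Zeno’s paradoxes.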
Sorry for introducing math here but math is the language in which physics is expressed, so we can’t do without it. We will also need Pythagoras’ formula, which is shown below in a way which is somewhat less obvious than usual. [Note the length of the longest side of the upper triangle is one in this diagram.]
So the Greek philosophers already knew that time and distance were only ‘mere shadows’ of something more fundamental—and they knew what: motion. All is motion. Force is motion. Energy is motion. Momentum is motion. Action is motion. To be precise: force is measured as a change in motion, and all of the other concepts I just mentioned – i.e. energy, momentum and action – just combine force with time, distance, or both. So they’re all about motion too! I’ll come back to that in the next section. Let’s first further explore the classical ideas.
To help you – or, let me re-phrase that, to help you help yourself 🙂 – you should try to think of defining time in any other way—I mean in another way than referring to motion. You may want to start with the formal definition of time of the International Bureau of Weights and Measures here, which states that one second is “the duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium-133 atom at rest at a temperature of 0 K.” Where’s the reference to motion? Well… Radiation is the more scientific word for light, i.e. a wave with a propagation speed that’s equal to… Well… The speed of light. So that’s 299,792,458 meter per second, precisely. So… Yes. Time is defined by motion.
Let’s pause here. 299,792,458 meter per second precisely? How do we know that? The answer is quite straightforward: because the meter, as the distance unit in the International System of Units (SI), is defined as the distance traveled by light in the vacuum in 1/299,792,458 of a second. So there you go: both our time as well as our distance unit are defined by referring to each other—with the concept of the velocity of some wave as an intermediary.
Let me be precise here: the definitions of time and distance reflect each other, so to speak, as they both depend on the invariance of the speed of light, as postulated by Einstein: space and time may be relative, but the speed of light isn’t. We’ll always measure it as c = 299,792,458 m/s, and so that’s why it defines both our time as well as our distance unit. Let me insert that great animation here, that shows the relativistic transformation of spacetime. I’ll come back to it later, but note how the central diagonals – which reflect the constant speed of light – are immovable: c is absolute. The speed of light is the same in whatever inertial reference frame, whether that frame is moving relative to us or not.
Hence, the point is: what matters is how an object (or a wave) moves (or propagates) in spacetime. Space and time on their own are just mathematical notions—constructions of the mind. The physics are in the movement, or in the action—as I will show. Immanuel Kant had already concluded that back in 1770, when he wrote the following:
“Space is not something objective and real, nor a substance, nor an accident, nor a relation; instead, it is subjective and ideal, and originates from the mind’s nature in accord with a stable law as a scheme, as it were, for coordinating everything sensed externally.”
He could have written the same in regard to time. [I know Kant actually did think of time in very much the same way, but I didn’t bother to look up the quote for time. You can google it yourself.]
God, Newton, Lorentz and the absolute speed of light
Let me, after all of the philosophy above, tell you a light-hearted – and hopefully not too sacrilegious – story about the absolute speed of light which, as you know, inspired Einstein and Minkowski to come up with their new theory—which, since Einstein published it more than a hundred years ago, has been validated in every possible way. It goes like this. Suppose you were God, and you had to regulate the Universe by putting a cap on speed. How would you do that?
First, you would probably want to benchmark speed against the fastest thing in the Universe, which is the photon. So you’d want to define speed as some ratio, v/c. And so that would be some ratio between 0 and 1. So that’s our definition of v now: it’s that ratio. And then you’d want to put a speed limiter on everything else, so you’d burden them with an intricate friction device, so as to make sure the friction goes up progressively as speed increases. So you would not want something linear. No. You want the friction to become infinite as v goes to 1 (i.e. c). So that’s one thing. You’d also want a device that can cope with everything: tiny protons, cars, spaceships, solar systems. Whatever. The speed limit applies to all. But you don’t need much force to accelerate a proton as compared to, say, a Mercedes. So now you go around and talk to your engineers. One of them, Newton, will tell you that, when you apply a force to an object, the force you need for a given acceleration is proportional to its mass. He writes it down like this: F = m·a, and tells you to go to Lorentz. Lorentz listens to it and then shows you one of these online graphing tools and puts some formulas in. The graphs look like this.
This is an easy formula that does the trick, Lorentz says. The red one is for m = 1/2, the blue one for m = 1, and the green one for m = 3. In the beginning, nothing much happens: you pick up speed but your mass doesn’t change. But then the friction kicks in, and very progressively as the speed gets closer to the speed of light.
Now, you look at this, and tell Lorentz you don’t want to discriminate, because it looks like you’re punishing the green thing more than the blue or the red thing. But Lorentz says that isn’t the case. His factor is the same per unit mass. So those graphs are the product of mass and his Lorentz factor, which is represented by the blue line—as that’s the one for m = 1.
Now, you think that’s looking good, but then you hesitate. You’re having second thoughts, and you tell Lorentz you don’t want to change the Laws of the Universe, as that would be messy. More importantly, it would upset Newton, who’s pretty fussy about God tampering with stuff. But Lorentz tells you it’s not a problem: Newton’s F = m·a will still work, he says. We just need to distinguish two mass concepts: the mass at rest, and the mass at velocity v. Just put a subscript: mv, and then you use that in Newton’s formula, so we write: F = mv·a.
But so you don’t want to do things in a rush – fixing the Universe is not an easy job 🙂 – and so you go back to Newton and show him that graph and the F = mv·a. You expect him to shout at you, but… No. Something weird happens: Newton agrees! In principle, at least. He just wants to see a formula for mv. It’s funny, because he’s known for being skeptical and blocking new ideas, because he thinks the Universe is good as it is. And so now you’re really worried and ask Newton why he’d agree.
Now Newton looks at you and starts a complicated story about relativity. He says he’s been watching some guys on Earth – Michelson and Morley – who’ve recently proved, experimentally, that his relativity theory – i.e. Newtonian relativity (OK—it’s actually a theory which Galileo had already formalized) – is wrong. Dead wrong. Newton looks really upset about it, so you try to console him. You say: how can it be wrong, if everything is working just fine? But then Newton starts rambling about limits and black bodies and other stuff you’ve never heard about. To make a long story short, you take Newton to Lorentz, and Lorentz shows his formula to Newton:
mv = m0/√(1 − v²)
Newton looks at it intensely. You can almost hear him thinking. And then his face lights up. That’s it, he says. This fixes it. Go ahead. And so… Well… That’s it. That’s the story. It explains why the whole Universe, everything that’s got mass, has a Lorentz device now. 🙂
[…] OK. Let’s get serious again. The point is: Einstein’s relativity theory is based on the experimentally established fact that the speed of light is absolute. The experiment is the 1887 Michelson-Morley experiment and, while it’s Einstein who explained the phenomenon in a more comprehensive theory (the special relativity theory) 18 years later (in 1905, to be precise), Hendrik Antoon Lorentz, a Dutch physicist, had already shown, in 1892, that all of the weird consequences of the Michelson-Morley experiment – i.e. relativistic time dilation, length contraction and mass increase – could be explained by that simple 1/√(1 − v²) factor, which is why it’s referred to as the Lorentz factor, rather than the Einstein factor. 🙂
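If you want to reproduce those graphs of Lorentz yourself, here is a small Python sketch. It computes mv = m0·γ, with v measured as a fraction of c, for the three masses we used above (m = 1/2, 1 and 3):

```python
import math

def lorentz_factor(v):
    """The Lorentz factor: 1/√(1 − v²), with v measured as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - v**2)

def relativistic_mass(m0, v):
    """mv = m0·γ: the m = 1/2, 1 and 3 curves are just scaled copies of γ."""
    return m0 * lorentz_factor(v)

# Nothing much happens at low speed; the 'friction' kicks in near v = 1.
for v in (0.0, 0.5, 0.9, 0.99):
    print(v, [round(relativistic_mass(m0, v), 3) for m0 in (0.5, 1.0, 3.0)])
```

You can see the point of the story in the numbers: at v = 0.5 the mass has barely increased, but as v approaches 1 it shoots up without bound.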
Mathematical note: The 1/√(1 − v²) factor will – or should – remind you of the equation for a circle: y = √(1 − x²), so you may think we could use that formula for an alternative Lorentz factor, which is shown below, first separately and then next to the actual Lorentz factor. The second graph shows the obvious differences between the two, and why the actual Lorentz factor works.
Indeed, the red graph, i.e. the 1 − √(1 − v²) formula (which is what it is because the center of the circle that’s being described by this formula is the (0, 1) point, rather than the origin of our x and y axes), does not do the trick, because its value for v = 1 is not infinity (∞) but 1.
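You can verify the difference numerically. A quick Python sketch comparing the two curves near v = 1:

```python
import math

def lorentz(v):
    """The actual Lorentz factor: 1/√(1 − v²)."""
    return 1.0 / math.sqrt(1.0 - v**2)

def circle(v):
    """The 1 − √(1 − v²) curve: a circle of radius 1 centered at (0, 1)."""
    return 1.0 - math.sqrt(1.0 - v**2)

# The circle curve saturates at 1; the Lorentz factor blows up as v → 1.
for v in (0.5, 0.9, 0.999, 1.0):
    print(v, circle(v), lorentz(v) if v < 1.0 else "infinite")
```

The circle formula tops out at exactly 1 for v = 1, while the Lorentz factor grows without bound—which is precisely why the latter works as a speed limiter and the former doesn’t.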
And now that we’re talking math, I want you to think about something else: while we measure the speed of light as c = 299,792,458 m/s – always and everywhere – we just may want to think of that speed as being unimaginably big and, hence, we may want to equate it with infinity. So then we’d have a very different scale for our velocity. To be precise, we’d have a non-linear scale, as equating c with ∞ would amount to stretching the [0, 1] or [0, c] segment of our horizontal axis (i.e. our velocity scale here) infinitely, so the interval now morphs into [0, ∞]. [I know the mathematicians will cry wolf here, and they should—because there’s a real wolf here.] So we would also have to stretch our curve—which can be done, obviously. There is no problem here, is there? It’s just an alternative scale, right?
Well… Think about it. I’ll let you find the wolf yourself. 🙂 Hint: think about the [0, ∞] notation and Zeno’s paradoxes. 🙂
[…] You probably skipped that little exercise above, didn’t you? You shouldn’t. Because there’s something very deep about the finite speed of light. It’s just like Nature challenges our mathematical concept of infinity. Likewise, the quantum of action challenges the mathematical notion of the infinitesimally small, i.e. the very notion of a differential. At the same time, we know that, without those notions, math becomes meaningless. Think about this. And don’t skip exercises. 🙂
Let me now talk about what I wanted to talk about here: dimensions. This term may well be one of the most ambiguous in all of the jargon. It’s a term like space: you should always ask what space? There are so many definitions of a space—both physical as well as mathematical. So let me be clear: I am talking physical dimensions here, so that’s SI units like second, or joule, or kilogram. So I am not talking the x, y and z dimensions of Cartesian coordinate space here: those are mathematical dimensions.
I must assume you are familiar with physical dimensions. I must assume, for example, that you know that energy is force over some distance, so the joule (i.e. the energy unit) is the newton·meter (so we write: 1 J = 1 N·m). Likewise, I must assume you know that (linear) momentum is force over some time: 1 N·s = 1 kg·m/s. [I know you’re used to thinking of momentum as mass times velocity, but just play a bit with Newton’s F = m·a (force = mass times acceleration) and you’ll see it all works out: 1 N = 1 kg·m/s², so 1 N·s = 1 kg·m/s.] Energy and momentum are concepts that get most of the attention because they are used in conservation laws, which we can use to analyze and solve actual physical problems.
The less well known concept of action is actually more intuitive but it is associated with a physical principle that your high-school teacher did not teach you: the principle of least action. I’ll talk about action a lot, and you know that Planck’s constant is the quantum of action. Not the quantum of energy, or of momentum. The dimension of action is newton·meter·second (N·m·s).
I’ll also talk about angular momentum and the associated concept of spin—later—but you should note its dimension is newton·meter·second (N·m·s), so that’s the same dimension as action. It’s somewhat strange no one thought of associating the name of a scientist with the unit of momentum (linear or angular), or with the unit of action. I mean: we have joules, watts, coulombs, pascals, volts, lamberts, einsteins (check it on the Web) and many more, but we do not have a shorthand for the N·s or N·m·s unit. Strange but… Well… Writing the dimensions out in full actually makes it easier to understand things, as you immediately see what’s what.
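For those who like to check dimensions mechanically, here is a toy Python sketch—my own little bookkeeping device, not any standard library—that tracks the exponents of the base units (kg, m, s) and confirms the dimensions of energy, momentum and action we just listed:

```python
# Toy dimensional bookkeeping: a dimension is a tuple of exponents
# over the base units (kg, m, s).
def mul(a, b):
    """Multiply two quantities: add their exponents."""
    return tuple(x + y for x, y in zip(a, b))

M, S = (0, 1, 0), (0, 0, 1)          # meter and second
NEWTON = (1, 1, -2)                  # F = m·a → kg·m/s²

JOULE = mul(NEWTON, M)               # energy = force × distance → kg·m²/s²
MOMENTUM = mul(NEWTON, S)            # momentum = force × time → kg·m/s
ACTION = mul(mul(NEWTON, M), S)      # action = force × distance × time → kg·m²/s

print(JOULE, MOMENTUM, ACTION)
```

Note how action and angular momentum would come out with the same exponent tuple (1, 2, −1), i.e. N·m·s—which is exactly the point made above.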
It is these physical dimensions (as opposed to mathematical dimensions) that make physical equations very different from mathematical equations, even if they look the same. Think about Einstein’s E = m·c² equation, for example. If, in math, we write b = m·n, then we mean b is equal to m·n, because we’re just talking numbers here. And when two numbers are equal, then they are really equal. Why? Because their ‘numericality’ is the only quality they have. That’s not the case for physical equations. For example, Einstein’s mass-energy equivalence relationship does not mean that mass is equal to energy. It implies energy has an equivalent mass which, in turn, means that some of the mass of a non-elementary particle (like an atom or – since we know protons consist of quarks – a proton) is explained by the moving bits inside.
The mass-energy equivalence relationship implies a lot of things. For example, it implies we can measure mass in J·s²/m² units rather than in kg, which is nice. However, it does not imply that, ontologically, mass is energy. In fact, look at the dimensions here: we have joule on one side, and J·s²/m² on the other. So E = m·c² is definitely not like writing b = m·n. If mass were equal to energy, we’d just have a scaling factor between the two, i.e. a relation like E = m·c or something. We would not have a squared factor in it: we’d have a simple proportionality relation, and so we do not have that, and that’s why we say it’s an equivalence relation, not an identity.
Let me make another quick note here. It’s a misconception to think that, when an atomic bomb explodes, it somehow converts elementary particles into energy, because it doesn’t: the energy that’s released is binding energy – the energy that keeps the protons and neutrons together in the nucleus – but the nucleons, and the quarks inside them, survive the blast. I know you’ll doubt that statement, but it’s true: even the concept of pair-wise annihilation of matter and anti-matter doesn’t apply to quarks—or not in the way we can observe electron-positron annihilation. But then I said I wouldn’t talk about quarks, and so I won’t.
The point is: the (physical) dimensions give physical equations some meaning. Think of re-writing the E = m·c² equation as E/c = m·c, for example. If you know anything about physics, this will immediately remind you that this equation represents the momentum of a photon: E/c = p = m·c, from which we get the equivalent mass of a photon: m = p/c = E/c². So the energy concept in that E = m·c² equation is quite specific because we can now relate it to the mass of a photon even if a photon is a zero-mass particle. Of course, that means it has zero rest mass (or zero rest energy), and so it’s only movement, and that movement represents energy which, in turn, gives it an equivalent mass, which is why a large star (our Sun, for example) bends light as it passes.
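To make this concrete, here is a quick Python calculation of the energy, momentum and equivalent mass of a photon. The wavelength (green light, λ = 500 nm) is just an illustrative value of my choosing:

```python
# Equivalent mass of a photon: m = p/c = E/c².
h = 6.62607015e-34       # Planck's constant, J·s
c = 299792458.0          # speed of light, m/s

wavelength = 500e-9      # green light, m (illustrative value)
E = h * c / wavelength   # photon energy: E = h·f = h·c/λ, in J
p = E / c                # photon momentum: p = E/c, in kg·m/s
m = E / c**2             # equivalent mass: m = E/c², in kg

print(E, p, m)
```

The equivalent mass comes out at a few times 10⁻³⁶ kg—tiny, but not zero, which is why gravity can bend the photon’s path.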
But, again, I need to move on. Just think when you see a physical equation, OK? Treat it with some more respect than a mathematical equation. The math will help you to see things, but don’t confuse the physics with the math.
Let me inject something else here. You may or may not know that the physical dimension of action and angular momentum is the same: it’s newton·meter·second. However, it’s obvious that the two concepts are quite different, as action may or may not be linear, while angular momentum is definitely not a linear concept. OK. That’s enough. I don’t want this to become a book. 🙂 Onwards!
So I made it clear you should know the basic concepts – like energy and momentum, and the related conservation laws in physics – before you continue reading. Having said that, you may not be as familiar with that concept I mentioned above: the concept of (physical) action. Of course, you’ll say that you know what action is, because you do a lot of exercise. 🙂 But… Well… No. I am talking about something else here. Or… Well… Maybe not. You can actually relate it to what you’re doing in the gym: pushing weights along a certain distance during a certain time. And, no, it’s not your wattage: watt is energy per second, so that’s N·m/s, not N·m·s.
I was actually very happy to hear about the concept when I first stumbled on it, because I always felt energy, and momentum, were both lacking some aspect of reality. Energy is force times distance, and momentum is force times time—so it’s only logical to combine all three: force, distance, and time. So that’s what the concept of action is all about: action = force × distance × time. It’s weird no one seems to have thought of a shorthand for this unit, which, as mentioned above, is expressed in N·m·s. I’d call it the Planck because it’s the dimension of Planck’s constant, i.e. the quantum of action—even if that may lead to confusion because you’ve surely heard about Planck units, which are something else. Well… Maybe not. Let me write it out. Planck’s constant is the product of the Planck force, the Planck distance (or Planck length), and the Planck time:
ħ = FP∙lP∙tP ≈ (1.21×10⁴⁴ N)·(1.62×10⁻³⁵ m)·(5.39×10⁻⁴⁴ s) ≈ 1.0545718×10⁻³⁴ N·m·s
[By the way, note how huge the Planck force is—in contrast to the unimaginably small size of the Planck distance and time units. We’ll talk about these units later.] The N·m·s unit is particularly useful when thinking about complicated trajectories in spacetime, like the one below. Just imagine the forces on it as this object decelerates, changes direction, accelerates again, etcetera. Imagine it’s a spaceship. To trace its journey in spacetime, you would keep track of the distance it travels along this weird trajectory, and the time, as it moves back and forth between here and there. That would actually enable you to calculate the forces on it, that make it move as it does. So, yes, force, distance and time. 🙂
In mathematical terms, this implies you’d be calculating the value of a line integral, i.e. an infinite sum over differentials along the trajectory. I won’t say too much about it, as you should already have some kind of feel for the basics here.
Let me, to conclude this section, note that the ħ = FP∙lP∙tP equation implies two other product relations:
- ħ = EP·tP, i.e. the product of the Planck energy (EP = FP·lP) and the Planck time;
- ħ = pP·lP, i.e. the product of the Planck momentum (pP = FP·tP) and the Planck length.
In fact, we might be tempted to define the Planck units like this, but then we get the Planck constant from experiment, and you should never ever forget that—just like you should never ever forget that the invariable (or absolute) speed of light is an experimental fact as well. In fact, it took a whole generation of physicists to accept what came out of these two experiments, and so don’t think they wanted it to be this way. In fact, Planck hated his own equation initially: he just knew it had to be true and, hence, he ended up accepting it. 🙂
Note how it all makes sense. We can now, of course, take the ratio of the two equations and we get:
ħ/ħ = 1 = (EP·tP)/(pP·lP) ⇔ EP/pP = lP/tP = vP = c
So here we define the Planck velocity as EP/pP = lP/tP = vP and, of course, it’s just the speed of light. 🙂
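You can check all of this numerically with published (rounded) values for the Planck units—expect small rounding differences, of course:

```python
# Check ħ = EP·tP = pP·lP and EP/pP = lP/tP = c, using rounded published values.
c  = 299792458.0         # speed of light, m/s
lP = 1.616255e-35        # Planck length, m
tP = 5.391247e-44        # Planck time, s
EP = 1.956e9             # Planck energy, J
pP = EP / c              # Planck momentum, kg·m/s

print(EP * tP)           # ≈ 1.05×10⁻³⁴ N·m·s, i.e. ħ
print(pP * lP)           # ≈ 1.05×10⁻³⁴ N·m·s as well
print(lP / tP)           # ≈ 299,792,458 m/s, i.e. c
```

Both products land on ħ ≈ 1.0546×10⁻³⁴ N·m·s, and the lP/tP ratio lands on c, just as the equations above say they should.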
Now, as we’re talking units here, I should make a small digression on these so-called natural units we’ll use so often.
You probably already know what natural units are—or, at least, you may think you know. There are various sets of natural units, but the best known – and most widely used – is the set of Planck units, which we introduced above, and it’s probably those you’ve heard about already. We’re going to use them a lot in this document. In fact, I’ll write things like: “We’re using natural units here and so c = ħ = 1.” Or something like: “We’ll measure mass (or energy, or momentum) in units of ħ.”
Now that is highly confusing. How can a velocity – which is expressed in m/s – be equal to some amount of action – which is expressed in N·m·s? As for the second statement, does that mean we’re thinking of mass (or energy, or momentum) as some countable variable, like m = 1, 2, 3, etcetera?
Let me take the first question first. The answer to that one is the one I gave above: they are not equal. Relations like this reflect some kind of equivalence—not some equality. I once wrote the following:
Space and time appear as separate dimensions to us, but they’re intimately connected through c, ħ and the other fundamental physical constants. Likewise, the real and imaginary part of the wavefunction appear as separate dimensions too, but they’re also intimately connected through π and Euler’s number, i.e. through mathematical constants.
That statement says it all but it is not very precise. Expressing our physical laws and principles using variables measured in natural units make us see things we wouldn’t see otherwise, so they help us in getting some better understanding. They show us proportionality relations and equivalence relations which are difficult to see otherwise. However, we have to use them carefully, and we must always remember that Nature doesn’t care about our units, so whatever units we use for our expressions, they describe the same physics. Let’s give an example. When writing that c = 1, we can also write the following:
E = m·c² = m·c = m = p for v = c = 1
In fact, that’s an expression I’ll use a lot. However, this expression is nonsensical if you don’t think about the dimensions: m·c² is expressed in kg·(m/s)², m·c in kg·m/s, and m in kg only. Hence, the relation above tells us the values of our E and m variables are numerically equal, but it does not tell us that energy, mass and momentum are the same, because they obviously aren’t. They’re different physical concepts.
So what does it mean when we say we measure our variables in units of ħ? To explain that, I must explain how we get those natural units. Think about it for yourself: how would you go about it?
The first thing you might say is that the absolute speed of light implies some kind of metaphysical proportionality relation between time and distance, and so we would want to measure time and distance in so-called equivalent units, ensuring the velocity of any object is always measured as a v/c ratio. Equating c with 1 will then ensure this ratio is always measured as some number between 0 and 1. That makes sense but the problem is we can do this in many ways. For example, we could measure distance in light-seconds, i.e. the distance traveled by light in a second, i.e. 299,792,458 meter, exactly. Let’s denote that unit by lc. Now we keep the second as our time unit but we’ll just denote it as tc so as to signal we’ve switched from SI units to… Well… Light units. 🙂
It’s easy to check it works: c = (299,792,458 meter)/(1 second) = (1 lc)/(1 tc) = 1 vc.
Huh? One vc? Yep. I could have put 1, but I just wanted to remind you the physical dimension of our physical equations is always there, regardless of the mathematical manipulations we let loose on them.
OK. So far so good. The problem is we can define an infinite number of sets of light units. The Planck units we mentioned are just one of them. They make that c = 1 equation work too:
c = (1 lP)/(1 tP) = 1 vP = (1.62×10⁻³⁵ m)/(5.39×10⁻⁴⁴ s) ≈ 299,792,458 m/s
[If you do the calculation using the numbers above, you’ll get a slightly different number but that’s because these numbers are not exact: I rounded the values for the Planck time and distance. You can google the exact values and you’ll see it all works out.]
So we need to add some more constraints on the system to get a unique set of units. How many constraints do we need? Now that is a complicated story, which I won’t tell you here: it boils down to asking what the ‘most fundamental constants in Nature’ are. To calculate the Planck units, we use five constraints, each of the form ‘constant = 1’. To be precise, we get the Planck units by equating the following fundamental constants in Nature with 1, and then we just solve that set of equations to get our Planck units in terms of our old and trusted SI units:
- c: the speed of light (299,792,458 m/s);
- ħ: the reduced Planck constant, which we use when we switch from hertz (the number of cycles per second) as a measure of frequency (like in E = h·f) to so-called angular frequencies (like in E = ħ·ω), which are much more convenient to work with from a math point of view: ħ = h/2π;
- G: the universal gravitational constant (6.67384×10⁻¹¹ N·(m/kg)²);
- ke: Coulomb’s constant (ke = 1/4πε0); and, finally,
- kB: the Boltzmann constant, which you may not have heard of, but it’s as fundamental a constant as all the others.
[When seeing Boltzmann’s name, I always think about his suicide. I can’t help thinking he would not have done that if he had known that Planck would include his constant as part of this select Club of Five. He was, without any doubt, much ahead of his time but, unfortunately, few recognized that. His tombstone bears the inscription of the entropy formula: S = kB·log W. It’s one of these magnificent formulas—as crucial as Einstein’s E = m·c² formula. But… Well… I can’t dwell on it here, as I need to move on.]
Note that, when we equate these five constants with 1, we’re re-scaling both unimaginably large numbers (like the speed of light) as well as incredibly small numbers (like h, or G and kB). But then what’s large and small? That’s relative, because large and small are defined here using our SI units, some of which we may judge to be large or small as well, depending on our perspective. In any case, the point is: after solving that set of five equations, we get the so-called ‘natural units’: the Planck length, the Planck time, the Planck energy (and mass), the Planck charge, and the Planck unit of temperature:
- 1 Planck time unit (tP) ≈ 5.4×10−44 s
- 1 Planck length unit (lP) ≈ 1.6×10−35 m
- 1 Planck energy unit (EP) ≈ 1.22×1028 eV = 1.22×1019 GeV (giga-electronvolt) ≈ 2×109 J
- 1 Planck unit of electric charge (qP) ≈ 1.87555×10–18 C (Coulomb)
- 1 Planck unit of temperature (TP) ≈ 1.416834×1032 K (Kelvin)
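As an aside, you can re-derive these values yourself. The sketch below uses current CODATA values for the five constants (which differ slightly from the older numbers quoted above) and the standard closed-form expressions you get when solving the five equations:

```python
import math

# CODATA-era values for the five constants (illustrative inputs)
c = 299792458.0          # speed of light, m/s (exact)
hbar = 1.054571817e-34   # reduced Planck constant, J·s
G = 6.67430e-11          # gravitational constant, N·m²/kg²
eps0 = 8.8541878128e-12  # vacuum permittivity (gives ke = 1/4πε0)
kB = 1.380649e-23        # Boltzmann constant, J/K

# Solving the five c = ħ = G = ke = kB = 1 equations for the SI
# values yields the familiar closed-form expressions:
l_P = math.sqrt(hbar * G / c**3)                # Planck length, m
t_P = l_P / c                                   # Planck time, s
E_P = math.sqrt(hbar * c**5 / G)                # Planck energy, J
q_P = math.sqrt(4 * math.pi * eps0 * hbar * c)  # Planck charge, C
T_P = E_P / kB                                  # Planck temperature, K

print(f"l_P ≈ {l_P:.3e} m, t_P ≈ {t_P:.3e} s")
print(f"E_P ≈ {E_P:.3e} J, q_P ≈ {q_P:.4e} C, T_P ≈ {T_P:.3e} K")
```

Running this reproduces the values in the list above to three or four significant digits.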
Have a look at the values. The Planck time and length units are really unimaginably small—literally! For example, the wavelength of visible light ranges from 380 to 750 nanometer: a nanometer is a billionth of a meter, so that’s 10−9 m. Also, hard gamma rays have wavelengths measured in picometer, so that’s 10−12 m. Again, don’t even pretend you can imagine how small 10−35 m is, because you can’t: 10−12 and 10−35 differ by a factor of 1023. That’s something we cannot imagine. We just can’t. The same reasoning is valid for the Planck time unit (5.4×10−44 s), which has a (negative) exponent that’s even larger.
In contrast, we’ve got Planck energy and temperature units that are enormous—especially the temperature unit! Just compare: the temperature of the core of our Sun is only 15 to 16 million degrees Kelvin, so that’s about 1.5×107 K: that’s 10,000,000,000,000,000,000,000,000 times smaller than the Planck unit of temperature. Strange, isn’t it?
Planck’s energy unit is somewhat more comprehensible because, while it’s huge at the atomic or sub-atomic scale, we can actually relate it to our daily life by doing yet another conversion: 2×109 J (i.e. 2 giga-joule) corresponds to 0.5433 MWh (megawatt-hour), i.e. 543 kilowatt-hours! I could give you a lot of examples of how much energy that is but one illustration I particularly like is that 0.5 MWh is equivalent to the electricity consumption of a typical American home over a month or so. So, yes, that’s huge…
What about the Planck unit for electric charge? Well… The charge of an electron expressed in Coulomb is about −1.6×10−19 C, so that’s pretty close to 1.87555×10–18 C, isn’t it? To be precise, the Planck charge is approximately 11.7 times the electron charge. So… Well… The Planck charge seems to be something we can imagine at least.
What about the Planck mass? Well… Energy and mass are related through the mass-energy equivalence relationship (E = m·c2) and, when you take care of the units, you should find that 2 giga-joule (i.e. the Planck energy unit) corresponds to a Planck mass unit (mP) equal to 2.1765×10−8 kg. Again, that’s huge (at the atomic scale, at least): it’s like the mass of an eyebrow hair, or a flea egg. But so it’s something we can imagine at least. Let’s quickly do the calculations for the energy and mass of an electron, just to see what we get:
- Measured in our old-fashioned super-sized SI kilogram unit, the electron mass is me = 9.1×10–31 kg.
- The Planck mass is mP = 2.1765×10−8 kg.
- Hence, the electron mass expressed in Planck units is meP = me/mP = (9.1×10–31 kg)/(2.1765×10−8 kg) = 4.181×10−23, which is a very tiny fraction as you can see (just write it out: it’s something with 22 zeroes after the decimal point).
Now, when we calculate the (equivalent) energy of an electron, we get the same number. Indeed, from the E = m·c2 relation, we know the mass of an electron can also be written as 0.511 MeV/c2. Hence, the equivalent energy is 0.511 MeV (in case you wonder, that’s just the same number but without the 1/c2 factor). Now, the Planck energy EP (in eV) is 1.22×1028 eV, so we get EeP = Ee/EP = (0.511×106 eV)/(1.22×1028 eV) = 4.181×10−23. So it’s exactly the same number as the electron mass expressed in Planck units. That’s nice, but not all that spectacular either because, when we equate c with 1, then E = m·c2 simplifies to E = m, so we don’t need Planck units for that equality.
So that’s the real meaning of “measuring energy (or mass) in units of ħ.” What we’re saying is that we’re using a new gauge: Planck units. It ensures that, when we measure the energy and the mass of some object, we get the same numerical value, but their dimensions are still very different, as evidenced by the numbers we get when we write it all out:
- meP = me/mP = (9.1×10–31 kg)/(2.1765×10−8 kg) = 4.181×10−23
- EeP = Ee/EP = (0.511×106 eV)/(1.22×1028 eV) = 4.181×10−23
You can check it for some other object, like a proton, for example:
- mpP = mp/mP = (1.672622×10−27 kg)/(2.1765×10−8 kg) = 0.7685×10−19 = 7,685×10−23
- EpP = Ep/EP = (938.272×106 eV)/(1.22×1028 eV) = 768.5×10−22 = 7,685×10−23
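A quick sketch—using the rounded values from the text—confirms that the mass and energy ratios come out (nearly) identical for both particles:

```python
# Check that mass and energy, both expressed in Planck units, give
# (nearly) the same numerical value for the electron and the proton.
# The inputs are the rounded values quoted in the text.
m_P_kg = 2.1765e-8   # Planck mass, kg
E_P_eV = 1.22e28     # Planck energy, eV

particles = {
    "electron": (9.1e-31, 0.511e6),          # (mass in kg, energy in eV)
    "proton":   (1.672622e-27, 938.272e6),
}

for name, (m, E) in particles.items():
    m_ratio = m / m_P_kg   # mass in Planck units
    E_ratio = E / E_P_eV   # energy in Planck units
    print(f"{name}: m = {m_ratio:.4e}, E = {E_ratio:.4e}")
```

Because the inputs are rounded, the two numbers agree to about three significant digits only; with full-precision constants the agreement would be exact, as E = m·c2 guarantees.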
Interesting, isn’t it? If we measure stuff in Planck units, then we know that whenever we measure the energy and/or the mass of whatever object, we’ll always get the same numerical value. Hence, yes, we do confidently write that E = m. But so don’t forget this does not mean we’re talking about the same thing: the dimensions are what they are. Measuring stuff in natural units is kinda special, but it’s still physics. Energy is energy. Action is action. Mass is mass. Time is time. Etcetera. 🙂
To conclude this section, I’ll also quickly remind you of what I derived above:
ħ = EP·tP and ħ = pP·lP ⇔ EP/pP = lP/tP = vP = c = 1
Of course, this reminds us of the famous E/p = c equation for photons. This, and what I wrote above, may lead you to confidently state that – when using natural units – we should always find the following:
E = p
However, that E = p equation is not always true. We only have that E = p relation for particles with zero rest mass. In that case, all of their energy will be kinetic (as they have no rest energy, there is no potential energy in their rest mass), and their velocity will be equal to c = 1 because… Well… The slightest force accelerates them infinitely. So, yes, the E = p relation makes sense for so-called zero-mass particles—by which we mean: zero rest mass, as their momentum translates into some equivalent energy and, hence, some equivalent mass. In case of doubt, just go back to the old-fashioned formulas in SI units. Then that E = p relation becomes mv·c2 = mv·v, and so now you’re reminded of the key assumption − besides the use of natural units − when you see that E = p thing, which is v = c. To sum it all up, we write:
E = p ⇔ v = c = 1
What about the E = m relation? That relation is valid whenever we use natural units, isn’t it? Well… Yes and no. More yes than no, of course—because no condition on v pops up when reverting to SI units and writing it all out: E = m ⇔ mv·c2 = mv ⇔ c2 = 1, and the latter condition holds whenever we use natural units, because then c = 1 and, hence, c2 will also be equal to one—numerically, at least!
At this point, I should just give you the relativistically correct equation for relating mass, energy, velocity and momentum. It is the following one:
p = (E/c2)·v
Using natural units (so c = 1 and v becomes a relative velocity now, i.e. v/c), this simplifies to:
p = v·E
This implies what we wrote above: m·v = v·E ⇔ E = m and E = p if and only if v = c = 1. So just memorize that relativistic formula (using natural units, it’s just p = v·E, so it’s easy) and the two consequences:
E = m (always) ⇐ p = v·E ⇒ E = p if and only if v = c = 1
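If you want to convince yourself that p = (E/c2)·v is an identity for a relativistic particle, here is a minimal numerical check (the electron mass and the chosen speed are just illustrative values):

```python
import math

c = 299792458.0   # speed of light, m/s
m0 = 9.109e-31    # electron rest mass, kg (illustrative)
v = 0.6 * c       # some sub-luminal speed (illustrative)

gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
E = gamma * m0 * c**2   # total (relativistic) energy
p = gamma * m0 * v      # relativistic momentum

# p = (E/c²)·v holds identically, whatever the speed
assert abs(p - (E / c**2) * v) < 1e-30
print(f"p = {p:.4e} kg·m/s, (E/c²)·v = {(E / c**2) * v:.4e} kg·m/s")
```

Note the identity holds at any v; the special cases E = p and v = c only enter once we switch to natural units.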
However, as I will show in a later section (to be precise, as I will show when discussing the wavefunction of an actual electron), we must watch out with our mass concepts—and, consequently, with the energy concepts we use. As I’ve said a couple of times already, mass captures something else than energy, even if we tend to forget that when using that E = m equation too much. Energy is energy, and mass… Well… Mass is a measure of inertia, and things can become quite complicated here.
Let me give an example involving the spin factor here. The mv·v equation captures linear momentum, but we may imagine some particle – at rest or not – which also has angular momentum. Think of it as spinning around its own axis at some incredible velocity: its mass will effectively increase, because the energy in its angular momentum will have some equivalent mass. I know what you’ll say: that shouldn’t affect the m·v equation, as our mass factor will incorporate the energy that’s related to the angular momentum. Well… Yes. You’re right, but so that’s why you’ll sometimes see funny stuff, like E = 2m, for example. 🙂 If you see stuff like that, don’t panic: just think! Always ask yourself: whose velocity? What mass? Whose energy? Remember: all is relative, except the speed of light!
Another example of how tricky things can be is the following. In the context of Schrödinger’s equation for electrons, I’ll introduce the concept of the effective mass, which, using natural units once more (so v is the relative velocity, as measured against the (absolute) speed of light), I’ll write as meff = m·v2, so the effective mass is some fraction (between 0 and 1) of the usual mass factor here. Huh? Yes. Again, you should always think twice when seeing a variable or some equation. In this case, the question is: what mass are we talking about? [I know this is a very nasty example, as the concept of the effective mass pops up only when delving really deep into quantum math, but… Well… I told you I’d give you a formally correct account of it.]
Let me give one more example. A fine paradox, really. When playing with the famous de Broglie relations – a.k.a. the matter-wave equations – you may be tempted to derive the following energy concept:
- f·λ = (E/h)·(h/p) = E/p
- v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v2
If you want, you can use the ω = E/ħ and k = p/ħ equations. You’ll find the same nonsensical energy formula. Nonsensical? Yes. Think of it. The energy concept in the ω = E/ħ relation is the total energy, so that’s E = m∙c2, and m∙c2 is equal to m·v2 if, and only if, v = c, which is usually not the case because the wavefunction is supposed to describe real-life particles that are not traveling at the speed of light (although we actually will talk first about theoretical zero-mass particles when introducing the topic).
So how do we solve this paradox? It’s simple. We’re confusing two different velocity concepts here: the phase velocity of our wavefunction, versus the classical velocity of our particle, which is actually equal to the group velocity of the wavefunction. I know you may not be familiar with these concepts, so just look at the animation below (credit for which must go to Wikipedia): the green dot moves (rather slowly) with the group, so that’s the group velocity. The phase velocity is the velocity of the wave itself, so that’s the red dot.
So the equation we get out of the two de Broglie equations is E/p = vp, with vp the phase velocity. So the energy concept here is E = vp∙p = vp∙(m·v) = m·vp∙v. [Note we’re consistently using natural units, so all velocities are relative velocities measured against c.] Now, when presenting the Schrödinger equation, we’ll show that vp is equal to the reciprocal of v, so vp = 1/v and the energy formula makes total sense: E = m·vp∙v = m·(1/v)·v = m.
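You can check the vp = 1/v relation numerically. Using natural units (c = 1) and the relativistic energy-momentum relation E = √(p2 + m2) for a massive particle, the group velocity is dE/dp = p/E (which is the classical velocity v), while the phase velocity is E/p. The values of m and p below are arbitrary, for illustration only:

```python
import math

m = 1.0   # rest mass, natural units (illustrative)
p = 0.75  # momentum, natural units (illustrative)
E = math.sqrt(p**2 + m**2)

v_group = p / E   # dE/dp = p/E: the classical velocity v
v_phase = E / p   # E/p, from the de Broglie relations

# Numerical derivative of E(p) as a cross-check on dE/dp = p/E
h = 1e-6
dE_dp = (math.sqrt((p + h)**2 + m**2)
         - math.sqrt((p - h)**2 + m**2)) / (2 * h)
assert abs(dE_dp - v_group) < 1e-9

# The phase velocity is the reciprocal of the group velocity...
assert abs(v_phase * v_group - 1.0) < 1e-12
# ...so the group velocity stays below c while the phase velocity exceeds it
assert v_group < 1 < v_phase
```

The last assertion also makes the resolution of the paradox visible: the phase velocity is superluminal, but it carries no energy or signal, so no physical law is violated.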
In any case, don’t worry about it now, as we’ll tackle it later. Just make a mental note: the use of natural units is usually enlightening, but it does lead to complications from time to time—as we tend to forget what’s behind those simple E = p or E = m relations. In case of trouble or doubt, just revert to SI units and do a dimensional analysis of your equation, or think about that relativistic relation between energy and momentum, and re-derive the consequences.
OK. I should really stop here and start the next piece but, as an exercise, think once more about what those Planck units really do. Think about the proton-electron example, for which we found that mass and energy – as measured in Planck units – come out at 4.181×10−23 units for the electron and 7,685×10−23 units for the proton. That just establishes a proportionality relation between the energy and the mass of whatever objects we’d be looking at. Indeed, it implies the following: Ee/me = Ep/mp = c2.
This proportionality relation is a very deep fact, and it has led many respected physicists to say that mass actually is energy—that they are fundamentally, ontologically, or philosophically the same. Don’t buy that: energy is energy, and mass is mass. Never forget the c2 factor, as its physical dimension is still there when using natural units: the c = 1 identity does not make it disappear!
Now, to conclude, I should answer the question I started out with. What do we mean when we say something like: “We’ll measure mass (or energy, or momentum) in units of ħ”? Unlike what you might think, it does not mean we’re thinking of mass (or energy, or momentum) as some countable variable. No. That’s not what we mean. I am underlining this because I know some of my blog posts seem to suggest that.
It’s a complicated matter, because I do like to think that, at the Planck scale, time and distance actually do become discrete (i.e. countable variables), so I do believe that the Planck time and distance units (i.e. tP ≈ 5.4×10−44 s and lP ≈ 1.6×10−35 m) are the smallest time and distance units that make sense. The argument is rather complicated, but it’s related to the existence of the quantum of action itself. If ħ is what it is – i.e. some amount of energy that’s being delivered over some time, or some momentum over some distance – then there’s a limit to how small that time and/or distance can be. You can’t increase the energy (and/or momentum) and, simultaneously, decrease the space in which you’re packing all that energy (and/or momentum) indefinitely: the equivalent mass density will turn your unimaginably tiny little space into a black hole, out of which that energy can’t escape anymore. Hence, it’s no longer consistent with your idea of some energy or some mass moving through spacetime because… Well… It can’t move anymore. It’s captured in the black hole. 🙂
In any case, that’s not what I want to talk about here. The point is: it’s not because we’d treat time and distance as countable variables, that energy and momentum and mass should be treated the same.
Let me give an example to show it’s not so simple. We know – from the black-body radiation problem – that the energy of the photons will always be an integer multiple of ħ·ω – so we’ll have E1 = ħ·ω, E2 = 2·ħ·ω,… En = n·ħ·ω = n·h·f. Now, you may think that I’ve just given you an argument as to why energy should be a countable variable as well, but… No. The frequency of the light (f = ω/2π) can take on any value, so ħ·ω = h·f is not something like 1, 2, 3, etcetera.
If you want a simpler example, just think of the value we found for the electron mass and/or its energy. Expressed in Planck units, it’s equal to something like 0.0000000000000000000000418531… and I am not sure what follows. That doesn’t look like something that’s countable, does it? 🙂
However, I am not excluding that, at some very basic level, energy (and momentum) and, hence, mass, might be countable. Why? I am not sure but, in physics, we have this magical number—referred to as alpha (α), but better known as the fine-structure constant for reasons I explain in my posts on it (in which I actually take some of the ‘magic’ out!). It always pops up when discussing physical scales so… Well… Let’s quickly introduce it, although we’re not really going to use it much later.
The Magical Number
Let me first give its value and definition—or definitions (plural), I should say. The most common one is: α = e2/(4π·ε0·ħ·c).
For the definition of all those physical constants, you can check the Wikipedia article on it, and then you’ll see that these definitions are essentially all the same. 🙂 It’s easy to see that, using natural units so ε0 = ħ = c = 1, we can write the first equation just as: α = e2/4π.
Don’t break your head over it. In fact, in a first reading, you may just want to skip this section and move on to the next. However, let’s go through the motions here and so let me tell you whatever can be told about it. It’s a dimensionless number whose value is equal to something like 0.007297352566… ≈ 1/137.0359991… In fact, you’ve probably seen the 1/137 number somewhere, but that’s just an approximation. In any case, it turns out that this fine-structure constant relates all of the fundamental properties of the electron, thereby revealing a unity that, admittedly, we struggle to understand. In that post of mine, I prove the following relationships:
(1) α is the square of the electron charge expressed in Planck units: α = eP2. Now, this is essentially the same formula as the one above, as the electron charge expressed in Planck units is just e/qP, with e the elementary charge and qP the Planck charge. You can double-check this yourself, noting that the Planck charge is approximately 11.7 times the electron charge. To be precise, from this equation, it’s easy to see that the factor is 1/√α ≈ 11.706… You can then quickly check the relationship: qP = e/√α. In fact, as you play with these numbers, you’ll quickly notice most of the wonderful relations are just tautologies. [Having said that, when everything is said and done, α is and remains a very remarkable number.]
(2) α is also the square root of the ratio of (a) the classical electron radius and (b) the Bohr radius: α = √(re /r). Note that this equation does not depend on the units, in contrast to equation 1 (above), and 4 and 5 (below), which require you to switch to Planck units. It involves only a ratio of two lengths and, hence, the units don’t matter. They fall away.
(3) Thirdly, α is the (relative) speed of an electron in the first Bohr orbit of a hydrogen atom: α = v/c. The relative speed is, as usual, the speed as measured against the speed of light. Note that this is also an equation that does not depend on the units, just like (2): we can express v and c in whatever unit we want, as long as we’re consistent and express both in the same units.
(4) α is also equal to the product of (a) the electron mass and (b) the classical electron radius re (if both are expressed in Planck units, that is): α = me·re. And, of course, because the electron mass and energy are the same when measured in Planck units, we can also write: α = EeP·re.
Now, from (2) and (4), we also get:
(5) The electron mass (in Planck units) is equal to me = α/re = α/α2r = 1/αr. So that gives us an expression, using α once again, for the electron mass as a function of the Bohr radius r (expressed in Planck units).
Finally, we can substitute (1) in (5) to get:
(6) The electron mass (in Planck units) is equal to me = α/re = eP2/re. Using the Bohr radius, we get me = 1/αr = 1/eP2r.
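A few of these relationships can be cross-checked numerically. The sketch below uses approximate SI values for the constants (illustrative, not high-precision) and verifies relations (1) and (2), plus the 1/√α ≈ 11.706 factor between the two charges:

```python
import math

alpha = 7.2973525693e-3   # fine-structure constant
e = 1.602176634e-19       # elementary charge, C
qP = 1.875546e-18         # Planck charge, C
re = 2.8179403e-15        # classical electron radius, m
a0 = 5.29177211e-11       # Bohr radius, m

# (1) α = (e/qP)²: α is the squared electron charge in Planck units
assert abs((e / qP)**2 - alpha) < 1e-6

# (2) α = √(re/a0), since re = α²·a0
assert abs(math.sqrt(re / a0) - alpha) < 1e-6

# ...and the 1/√α ≈ 11.706 factor between the two charges
assert abs(qP / e - 1 / math.sqrt(alpha)) < 1e-3
```

As the text notes, once you see that these are all expressions of the same underlying quantity, the checks are close to tautological—but they are a good way to build confidence in the unit conversions.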
These relationships are truly amazing and, as mentioned, reveal an underlying unity at the atomic/sub-atomic level that we’re struggling to understand. However, at this point, I need to get back to the lesson. Indeed, I just wanted to jot down some ‘essentials’ here. Sorry I got distracted. As mentioned above, while the number does incorporate a lot of what’s so fascinating about physics, the number is somewhat less magical than most popular writers would want you to believe. As said, you can check out my posts on it for further clues. Because… Well… I really need to move on—otherwise we’ll never get to the truly exciting bits—and to fully solve the riddle of God’s number, you’ll need to understand those more exciting bits, like how we get electron orbitals out of Schrödinger’s equation. 🙂 So… Well… We had better move on so we can get there. 🙂
The Principle of Least Action
The concept of action is related to what might well be the most interesting law in physics—even if it’s possible you’ve never heard about it before: the Principle of Least Action. Let me copy the illustrations from the Master’s introduction to the topic:
What’s illustrated above is that the actual motion of some object in a force field (a gravitational field, for example) will follow a path for which the action is least. So that’s the graph on the right-hand side. In contrast, if we add up all the action along the curve on the left-hand side (we do that with a line integral), we’ll get a much higher figure. From what we see, it’s obvious Nature tries to minimize the figures. 🙂
The math behind it is not so easy, but one can show that the line integral I talked about is the following: S = ∫(T − V)·dt, with T the kinetic energy and V the potential energy along the path.
You can check its dimension: the integrand is expressed in energy units, so that’s N·m, while the dt differential is expressed in seconds, so we do get some amount expressed in N·m·s. If you’re interested, I highly recommend reading Feynman’s chapter on it, although it’s not an easy one, as it involves a branch of mathematics you may not be familiar with: the calculus of variations.
In any case, the point to note is the following: if space and time are only ‘shadows’ of something more fundamental – ‘some kind of union of the two’ – then energy and momentum are surely the same: they are just shadows of the more fundamental concept of action. Indeed, we can look at the dimension of Planck’s constant, or at the concept of action in general, in two different ways:
- [Planck’s constant] = [action] = N∙m∙s = (N∙m)∙s = [energy]∙[time]
- [Planck’s constant] = [action] = N∙m∙s = (N∙s)∙m = [momentum]∙[distance]
The bracket symbols [ and ] mean: ‘the dimension of what’s between the brackets’. Now, this may look like kids stuff, but the idea is quite fundamental: we’re thinking here of some amount of action expressing itself in time or, alternatively, expressing itself in space. In the former case, some amount of energy (E) is expended during some time. In the latter case, some momentum (p) is expended over some distance. So we can now think of action in three different ways:
- Action is force times distance times time;
- Action is energy times time;
- Action is momentum times distance.
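The two dimensional decompositions of ħ can also be checked numerically: the product of the Planck energy and the Planck time, and the product of the Planck momentum and the Planck length, both reproduce ħ. The closed-form Planck-unit expressions below are standard; the numerical inputs are CODATA-era approximations:

```python
import math

hbar = 1.054571817e-34   # J·s = N·m·s
c = 299792458.0          # m/s
G = 6.67430e-11          # N·m²/kg²

E_P = math.sqrt(hbar * c**5 / G)   # Planck energy, J
t_P = math.sqrt(hbar * G / c**5)   # Planck time, s
p_P = math.sqrt(hbar * c**3 / G)   # Planck momentum, kg·m/s
l_P = math.sqrt(hbar * G / c**3)   # Planck length, m

# ħ expressed as [energy]·[time] and as [momentum]·[distance]
assert abs(E_P * t_P - hbar) / hbar < 1e-12
assert abs(p_P * l_P - hbar) / hbar < 1e-12
```

So the quantum of action really does express itself either as some energy over some time, or as some momentum over some distance.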
Now, you’ve surely seen the argument of the quantum-mechanical wavefunction:
θ = (E/ħ)·t – (p/ħ)·x = (E·t)/ħ – (p·x)/ħ
The E·t factor is energy times time, while the p·x factor is momentum times distance. Hence, the dimension of both is the same: it’s the dimension of physical action (N∙m∙s), but then we divide it by ħ, so we express both factors using a natural unit: Planck’s quantum of action. So θ becomes some dimensionless number. To be precise, it becomes a number that expresses an angle, i.e. a radian. However, remember how we got it, and also note that, if we’d look at ħ as just some numerical value – a scaling factor with no physical dimension – then θ would keep its dimension: it would be some number expressed in N·m·s. It’s an important note in light of what follows: you’ll see θ will come to represent the proper time of the object that’s being described by the wavefunction and… Well… Just read on. Sorry for jumping the gun all of the time. 🙂
What about the minus sign in-between the (E/ħ)·t and (p/ħ)·x? It reminds us of many things. One is the general shape of the argument of any wavefunction, which we can always write as k·x–ω·t (for a wave traveling in the positive x-direction, that is). In case you don’t know where this comes from, check my post on the math of waves. Also, if you’re worried about the fact that it’s k·x–ω·t rather than ω·t–k·x, remember we have a minus sign in front of θ in the wavefunction itself, which we’ll write as a·e−iθ in a moment. Don’t worry about it now: just note we’ve got a proper wavefunction with this minus sign. An argument like this usually represents something periodic, i.e. a proper wave indeed.
However, it reminds me also of other things. More difficult things. The minus sign in-between E·t and p·x also reminds me of the so-called variational principle in calculus, which is used to find a function (or a path) that maximizes or, in this case, minimizes the value of some quantity that depends on the function or the path. Think of the Lagrangian: L = T − V. [T represents the kinetic energy of the system, while V is its potential energy. Deep concepts. I’ll come back to them.] But let’s not digress too much. At this point, you should just note there is a so-called path integral formulation of quantum mechanics, which is based on the least action principle. It’s not an easy thing to understand intuitively because it is not based on the classical notion of a single, unique trajectory for some object (or some system). Instead, it allows for an infinity of possible trajectories. However, the principle behind it is as intuitive as the minimum energy principle in classical mechanics.
I am not going to say much about uncertainty at this point, but it’s a good place to talk about some basics here. Note that the definition of the second – as the duration of 9,192,631,770 periods of the light that’s emitted by a caesium-133 atom going from one energy state to another (because that’s what transitioning means) – assumes we can effectively measure one period of that radiation. As such, it should make us think: shouldn’t we think of time as a countable variable, rather than as a continuous variable?
Huh? Yes. I mentioned the question before, when I discussed Planck units. But so here I am hinting at some other reason why we might want to think of time and distance as countable variables. Think of it: perhaps we should just think of time as we think of any other scale: if our scale only allows measurements in centimeter (cm), we’ll say that this person is, say, 178 cm ± 1 cm tall, or – assuming we can confidently determine the measurement is closer to 178 cm than to 179 or 177 cm – that his or her height is 178 cm ± 0.5 cm. Anyone who’s done surveys or studied a bit of statistics knows it’s a complicated matter: we’re talking confidence intervals, cut-off values, and so much other stuff. 🙂
But… Yes. I know what you’ll say now: this is not the fundamental Uncertainty – with capital U – that comes with Planck’s quantum of action, which we introduced already above. This is something that has to do with our limited powers of observation and all that.
Well… Yes. You’re right. I was talking inaccuracy above—not uncertainty. But… Well… I have to warn you: at some fundamental level, the inaccuracy in our measurements can actually be related to the Uncertainty Principle, as I explain in my post on diffraction and the Uncertainty Principle. But… Well… Let’s first think a bit about inaccuracy then. You’ll agree that the definition of the second of the International Bureau of Weights and Measures assumes we can – theoretically, at least – measure whether or not some event lasted as long as 1/9,192,631,770 seconds or, in contrast, if its duration was only 1/9,192,631,769 seconds, or 1/9,192,631,771 seconds. So the inaccuracy that’s implied is of the order of 1/9,192,631,770 s ≈ 1/(9.2×109) s ≈ 0.1×10−9 s, so that’s a tenth of a nano-second. That’s small but – believe it or not – we can accurately measure even smaller time units, as evidenced by the fact that, for example, scientists are able to confidently state that the mean lifetime of a neutral pion (π0) is (8.2±0.24)×10−17 seconds. So the precision here is like an atto-second (10−18 s).
To put things into perspective: that’s the time it takes for light to travel the length of two hydrogen atoms. Obvious question: can we actually do such measurements? The answer is: yes, of course! Otherwise we wouldn’t have measurements like the one above, would we? 🙂 But so how do we measure stuff like that? Simple: we measure it by analyzing the distance over which these pions disintegrate after appearing in some collision, and – importantly – because we know, approximately, their velocity, which we know because we know their mass and their momentum, and because we know momentum is conserved. So… Well… There you go! 🙂
In any case, an atto-second (1×10−18 s) – so that’s a billionth of a billionth of a second – is still huge as compared with the Planck time unit, which is equal to tP ≈ 5.4×10−44 seconds. It’s so small that we don’t even have a name for such small numbers. Now, I’ll talk about the physical significance of the Planck time and distance later, as it’s not an easy topic. It’s associated with equally difficult concepts, such as the concept of the quantum vacuum, which most writers throw around easily, but which few, if any, bother to accurately define. However, we can already say something about it. The key point to note is that all of our complicated reflections lead us to think that the Planck time unit may be a theoretical limit to how small a time unit can be.
You’ll find many opinions on this topic, but I do find it very sensible to sort of accept that both time and distance become countable variables at the Planck scale. In other words, spacetime itself becomes discrete at that scale: all points in time, and all points in space, are separated by the Planck time and length respectively, i.e. tP ≈ 5.4×10−44 s and lP ≈ 1.6×10−35 m respectively.
Separated? Separated by what? That’s a good question! It shows that our concepts of continuous and discrete spacetime – or the concept of the vacuum versus the quantum vacuum – are not very obvious: if there is such a thing as a fundamental unit of time and distance – which we think is the case – then our mathematical concepts are just what they are: mathematical concepts. So the answer to your question is: if spacetime is discrete, then these discrete points would still be separated by mathematical space. In other words, they would be separated by nothing. 🙂
The next step is to assume that the quantum of action may express itself in time only or, alternatively, in space only. To be clear, what I am saying here is that, at the Planck scale, we may think of a pointlike particle moving in space only, or in time only.
I know what you’ll say: that’s not possible. Everything needs some time to move from here to there. Well… […] Maybe. But maybe not. When you’re reading this, you’ve probably done some reading on quantum-mechanical systems already—like the textbook example of the ammonia two-state system. Think of that model: the nitrogen atom is once here, then there, and then it goes back again. With no travel time. So… Well… If you accepted those models, you’ve accepted what I wrote above. 🙂
I think it is rather obvious that mathematical space is continuous, because it’s an idealization – a creation of our mind – of physical space, which looks like it’s fine-grained, i.e. discrete, but at the smallest scale only. In fact, quantum physics tells us that physical space must be fine-grained. Hence, in my view, there’s no contradiction here as long as we are clear that the language we use to describe reality (i.e. math) is different from reality itself. 🙂
But, if time and distance become discrete or countable variables, and Planck’s quantum of action is what it is – a quantum, so action comes in integer multiples of it only – then energy and momentum must come in discrete units as well, right? Maybe. It’s not so simple. As I mentioned above, we know, for example, from the black-body radiation problem, that the energy of the photons will always be an integer multiple of ħ·ω – so we’ll have E1 = ħ·ω, E2 = 2·ħ·ω,… En = n·ħ·ω = n·h·f. Now, you may think that I’ve just given you an argument as to why energy should be a countable variable as well, but… No. The frequency of the light (f = ω/2π) can take on any value, so ħ·ω = h·f is not something like 1, 2, 3, etcetera.
At this point, we may want to look at the Uncertainty Principle for guidance. What happens if the E and p in our θ = (E/ħ)·t – (p/ħ)·x argument take on very small values, and x and t are measured in Planck units? Does our θ = (E/ħ)·t – (p/ħ)·x argument become a discrete variable? Some countable variable, like 1, 2, 3, etcetera? Let’s think about it. We know we should think of the Uncertainty Relations as a pair:
Δx·Δp ≥ ħ/2 and ΔE·Δt ≥ ħ/2
For example, if x and t are measured in Planck units, then we could imagine that both Δt and Δx will be positive integers, so they can only take on values like 1, 2, 3, etcetera. Let us then, just for argument’s sake, equate Δx and Δt with one. Now, let’s also assume we measure everything in natural units, so we measure E and p – and, therefore, Δp and ΔE as well – in Planck units, so our Δx·Δp ≥ ħ/2 and ΔE·Δt ≥ ħ/2 relations now become fascinatingly simple:
Δp ≥ 1/2 and ΔE ≥ 1/2
What does this imply? Does it imply that E and p themselves are discrete variables, to be measured in units of ħ/2 or… Well… 1/2? The answer is simple: no, it doesn’t. The Δp ≥ 1/2 equation just implies that the uncertainty about the momentum will be larger than the Planck momentum divided by two. Now, the Planck unit for the momentum is just as phenomenal – from an atomic or sub-atomic perspective, that is – as the Planck energy and mass. To be precise, pP ≈ 6.52485 kg·m/s. [You can calculate that yourself using the values for the Planck force and Planck time unit above, and then you should just convert the N·s dimension to kg·m/s using Newton’s Law. Just do it: this is a useful exercise which will give you some better feel for these quantities.] Now, you can google some examples of what this pP ≈ 6.52485 kg·m/s value corresponds to, but Wikipedia gives a nice example: it corresponds, among other things, to the momentum of a baseball with a mass m of about 145 grams travelling at 45 m/s, or 160 km/h. Now, that’s quite something, wouldn’t you agree?
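The baseball comparison is easy to verify. The sketch below computes the Planck momentum from ħ, c and G (the √(ħ·c3/G) expression is the standard closed form) and compares it with the momentum of a 145-gram ball at 45 m/s:

```python
import math

hbar = 1.054571817e-34   # reduced Planck constant, J·s
c = 299792458.0          # speed of light, m/s
G = 6.67430e-11          # gravitational constant, N·m²/kg²

p_P = math.sqrt(hbar * c**3 / G)   # Planck momentum, kg·m/s
baseball = 0.145 * 45.0            # 145 g travelling at 45 m/s

print(f"p_P ≈ {p_P:.4f} kg·m/s, baseball ≈ {baseball:.3f} kg·m/s")
```

Both come out at about 6.52 kg·m/s, which is the point of Wikipedia’s example.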
Let me quickly make a methodological note here: I’ll often write things like ΔE ≥ ħ/2 and/or Δp ≥ ħ/2, but you should note that what we mean by this is the following:
- ΔE ≥ (ħ/tP)/2 = EP/2
- Δp ≥ (ħ/lP)/2 = pP/2
Now, hang in there and think about the following. If we would not have any uncertainty, what would our wavefunction look like in our discrete spacetime? That’s actually quite simple. We just need to make an assumption in regard to E and p. Let’s assume E = p = ħ. Again, that means E = EP and p = pP. And please note that’s a huge value, as the energy of an electron is only 0.0000000000000000000000418531… times that value! In any case, that assumption implies the argument of our wavefunction can be written as follows:
θ = (E/ħ)·t – (p/ħ)·x = [EP/(ħ)]·t – [pP/(ħ)]·x
Now we know that we’re going to measure t and x as multiples of tP and lP. Hence, t and x become something like t = n·tP and x = m·lP. In these two formulas, n and m are integers, and they’re independent of each other. We can now further simplify the argument of our wavefunction:
θ = [EP/(ħ)]·t – [pP/(ħ)]·x = [EP/(ħ)]·n·tP – [pP/(ħ)]·m·lP
= [EP·tP/(ħ)]·n – [pP·lP/(ħ)]·m = [ħ/(ħ)]·n – [ħ/(ħ)]·m = (n – m)
Hence, our elementary wavefunction is now equal to e^(i·(n − m)). Now, that implies you may want to think of the wavefunction—and, yes, I know that I am getting way ahead of myself here because I still need to tell you what the wavefunction actually is—as some infinite set of points like:
- e^(i·(0 − 0)) = e^(i·0) = cos(0) + i∙sin(0)
- e^(i·(1 − 0)) = e^(i·1) = cos(1) + i∙sin(1)
- e^(i·(0 − 1)) = e^(−i·1) = cos(−1) + i∙sin(−1)
- e^(i·(1 − 1)) = e^(i·0) = cos(0) + i∙sin(0)
The graphs hereunder show the results when we calculate the real and imaginary part of this wavefunction for n and m going from 0 to 14 (in steps of 1, of course). The graph on the right-hand side is the cosine value for all possible n = 0, 1, 2,… and m = 0, 1, 2,… combinations, and the left-hand graph depicts the sine values, so that’s the imaginary part of our wavefunction.
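For those who want to reproduce those graphs, here is a small sketch in Python (no plotting, just the numbers; the 15×15 grid matches the n, m = 0, 1,…, 14 range used above):

```python
# The real part (cosine) and imaginary part (sine) of e^(i*(n - m))
# for all combinations n, m = 0, 1, ..., 14.
import cmath
import math

N = 15
grid = [[cmath.exp(1j * (n - m)) for m in range(N)] for n in range(N)]

re = [[z.real for z in row] for row in grid]   # the cos(n - m) values
im = [[z.imag for z in row] for row in grid]   # the sin(n - m) values

# Spot checks against the points listed above:
assert grid[0][0] == 1                                             # e^(i*0)
assert abs(grid[1][0] - complex(math.cos(1), math.sin(1))) < 1e-12
assert abs(grid[0][1] - complex(math.cos(-1), math.sin(-1))) < 1e-12
assert grid[1][1] == grid[0][0]                                    # n - m = 0
```

Feed the `re` and `im` grids to any surface-plotting tool and you should recover the two graphs.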
You may still wonder what it represents really. Well… If you wonder what the quantum vacuum might look like, you should probably think of something like the images above. 🙂 Sorry for not being more explicit but I want you to think about these things yourself. 🙂
In any case, I am jumping the gun here, as I’ll be introducing the wavefunction only later. Much later. So don’t worry if you didn’t understand anything here. I just wanted to flag some stuff. You should also note that what separates us from the Planck scale is a very Great Desert, and so I fully agree with the advice of a friend of mine: let’s first focus on the things we know, rather than the things we don’t know. 🙂
Before we move on, however, I need to note something else. The E and p in the argument of the wavefunction θ = (E/ħ)·t – (p/ħ)·x = (E·t)/ħ – (p·x)/ħ may – in fact, probably will – vary in space and time as well. I am saying that here, because I want to warn you: some of what follows – in fact, a lot of what follows – assumes that E and p do not vary in space and time. So we often assume that we have no gravitational, electromagnetic or whatever other force fields causing accelerations or decelerations or changes in direction.
In fact, I am going to simplify even more: I’ll often assume we’re looking at some hypothetical particle that has zero rest mass! Now that’s a huge simplification—but a useful one.
Let’s go back to basics and just look at some object traveling in spacetime. As we noted above, time and space are related, but they are still different concepts. So let’s be precise and attempt to describe how they differ, and let’s try to be exact. As mathematical concepts, we can represent them as coordinate axes, as shown below. Think of the spatial dimension (x) as combining all of classical Cartesian three-dimensional space or, if that’s easier, just think of one-dimensional space, like an object moving along a line. I’ve also inserted a graph, a function, which relates x and t. We can think of it as representing some object moving in spacetime, indeed. Note that the graph doesn’t tell us where the object is right now. It only shows us where it was. In fact, we might try to predict where it’s going, but you’ll agree that’s rather difficult with this one. 🙂
Of course, you’ll immediately say the trajectory above is not kosher, as our object is traveling back in time in no less than three sections of the graph. [Check this. Do you see where?] You’re right. We should not allow that to happen. Not now, at least. 🙂 It’s easy to see how we should correct it. We just need to ensure our graph is a well-defined function: for every value of t, we should have one, and only one, value of x. It’s easy to see that our concept of time going in one direction, and in one direction only, implies that we should only allow well-behaved functions. [This is rather nice because we’ve got an explanation for the arrow of time here without us having to invoke concepts of entropy or other references to some physical reality. In case you’d be interested in the topic, please check one of my posts on time reversal and symmetries.] So let’s replace our graph by something more kosher traveling in spacetime. Let’s take that thing I showed you already:
You’ll say: “It’s still a silly function. This thing accelerates, decelerates, and reverses direction all of the time. What is this thing?”
I am not sure. It’s true it’s a weird thing: it only occupies a narrow band of space. But… Well… Why not? It’s only as weird as the concept of an electron orbital. 🙂 Indeed, all of the wonderful quantum-mechanical formulas I’ll give you can’t hide the fact we’re thinking of an electron in orbit in pretty much the same way as the object that’s depicted in the graph above: as a point-like object that’s zooming around and, hence, it’s somewhere at some point in time, but we just don’t know where it is exactly. 🙂 Of course, I agree we’d probably associate the path of our electron with something more regular, like a sine or cosine function.
Here I need to note two things. First, the sine and cosine are essentially the same function: they only differ because of a phase shift: cosφ = sin(φ + π/2). Second, I need to show you Euler’s formula.
Feynman calls it the ‘most remarkable formula in mathematics’, and refers to it as ‘our jewel’. It’s true, although we can take some magic out of it by constructing it algebraically—but I won’t bother you with that here. Let me just show you the formula: e^(iφ) = cosφ + i·sinφ.
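Since we will lean on this formula all the time, a quick numerical sanity check may be reassuring. This is just an illustration of the identity, using Python’s standard cmath module:

```python
# Euler's formula: e^(i*phi) = cos(phi) + i*sin(phi), checked numerically
# for a handful of angles (a sanity check, of course -- not a proof).
import cmath
import math

for phi in [0.0, 1.0, math.pi / 2, math.pi, 2.5]:
    lhs = cmath.exp(1j * phi)
    rhs = complex(math.cos(phi), math.sin(phi))
    assert abs(lhs - rhs) < 1e-12

# The special case phi = pi gives the famous identity e^(i*pi) + 1 = 0:
print(abs(cmath.exp(1j * math.pi) + 1))   # a tiny number: zero up to rounding
```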
Now let me show you a nice animation that illustrates a fundamental idea that we’ll exploit in a moment. Think of φ (i.e. the argument of our sine and cosine) as time, and think of the whole thing as a clock, like the rather fancy clock below. [Yes. I know it’s turning counter-clockwise at the moment, but that’s because of mathematical convention. Just put a minus sign in front of φ if you’d want to fix that.] Watch the red, blue and purple dots on the horizontal axis going back and forth: they oscillate between −1 and +1 (you can always re-scale). Watch their velocity: they reach maximum velocity at the zero point (i.e. the center), and then decelerate to reverse direction, after which they accelerate in the other direction for another cycle.
The points on the horizontal axis really behave like a mass on a spring. Note, for example, that the frequencies of the three waves are all the same: it’s just the amplitude that’s different (and then there’s also a fixed phase difference, as the blue wave is leading the others).
You probably saw the formula x = a·cos(ω·t + Δ) before, but a and Δ just incorporate the starting conditions (i.e. the initial stretch and the position at t = 0), so let’s simplify and just write:
x = cosφ = cos(ω·t).
So, yes, φ is time, and ω is just a scaling factor. What scaling factor? Well… I’ll come back to that. For the moment, just note that, for a mass on a spring, it’s the so-called natural frequency of the system, and it’s equal to ω = ω0 = √(k/m). In this equation, k is the force constant of the spring – i.e. its stiffness, which determines the restoring force it exerts – and m is the mass of the object that’s attached to it.
You may also know our simple cosine function solves a differential equation, i.e. an equation involving derivatives. To be precise, the a·cos(ω·t + Δ) solution solves the m·d²x/dt² = −k·x equation.
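You can convince yourself of that with a quick numerical check: approximate the second derivative with a central finite difference and verify the equation holds along the curve. The values of m, k, a and Δ below are arbitrary choices, just for the check:

```python
# Check that x(t) = a*cos(w*t + D) solves m*x'' = -k*x, with w = sqrt(k/m).
import math

m, k = 2.0, 8.0                # arbitrary mass (kg) and spring constant (N/m)
w = math.sqrt(k / m)           # natural frequency: w0 = sqrt(k/m) = 2.0 here
a, D = 1.5, 0.3                # amplitude and phase (the initial conditions)

def x(t):
    return a * math.cos(w * t + D)

h = 1e-5                       # step for the finite-difference derivative
for t in [0.0, 0.7, 1.9]:
    x_dd = (x(t + h) - 2 * x(t) + x(t - h)) / h**2   # approximates d2x/dt2
    assert abs(m * x_dd + k * x(t)) < 1e-4           # so m*x'' = -k*x holds
print("a*cos(w*t + D) solves m*x'' = -k*x")
```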
Don’t worry too much about it now. However, I do need to note something else we’ll also want to think about later. The energy formula for a mass on a spring tells us that the total energy—kinetic (i.e. the energy related to the momentum of that mass, which we’ll denote by T) plus potential (i.e. the energy stored in the spring, which we’ll denote by U)—is equal to T + U = m·ω0²/2 (for an oscillation with unit amplitude). Just look at this and note that it looks exactly the same as another energy formula you’ll probably remember: E = m·v²/2, which describes the kinetic energy of some object in linear motion.
Now, the last thing I want to show you here, is that Euler’s formula gives us one clock, but two springs, so to speak—as shown below. Wouldn’t you agree that a system like this would permanently store an amount of energy that’s equal to two times the above-mentioned amount, i.e. 2·m·ω0²/2 = m·ω0²? Now that is a very interesting idea! 🙂
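We can quickly verify that energy accounting with a few lines of Python. The sketch below assumes unit amplitude (the radius of Euler’s circle is one) and uses k = m·ω0², so each of the two oscillators stores m·ω0²/2 at every point in time:

```python
# Energy check for the 'two springs' picture: a cosine oscillator plus a
# sine oscillator, each with unit amplitude, and k = m*w0**2 by definition.
import math

m, w0 = 1.0, 2.0               # arbitrary values, just for the check
k = m * w0**2

for t in [0.0, 0.4, 1.1]:
    # 'real' spring: x = cos(w0*t), so v = -w0*sin(w0*t)
    x1, v1 = math.cos(w0 * t), -w0 * math.sin(w0 * t)
    # 'imaginary' spring: x = sin(w0*t), so v = w0*cos(w0*t)
    x2, v2 = math.sin(w0 * t), w0 * math.cos(w0 * t)
    E1 = 0.5 * m * v1**2 + 0.5 * k * x1**2   # T + U = m*w0**2/2, at all times
    E2 = 0.5 * m * v2**2 + 0.5 * k * x2**2   # T + U = m*w0**2/2, at all times
    assert abs(E1 - 0.5 * m * w0**2) < 1e-12
    assert abs(E1 + E2 - m * w0**2) < 1e-12  # the two together: m*w0**2
```

The trick, of course, is that sin² + cos² = 1: as one spring loses energy, the other gains exactly as much.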
Why? Think about the following. Remember the argument of the wavefunction:
θ = ω·t – k·x = (E/ħ)·t – (p/ħ)·x
Now, we know that the phase velocity of any wave is equal to vp = ω/k = (E/ħ)/(p/ħ) = E/p, so we find that the phase velocity of the amplitude wave (vp) is equal to the E/p ratio. Now, for particles with zero rest mass (like photons, or the theoretical zero-mass spin-0 and spin-1/2 particles I’ll introduce shortly), we know that vp = v = c. Hence, for zero-mass particles we find that the classical velocity of the particle is equal to the speed of light, and that’s also the phase velocity of the amplitude wave. [As for the concept of group velocity, it just doesn’t apply here.] Hence, we can write:
vp = E/p = (m·c²)/(m·c) = c
We just get a tautology. However, when discussing non-zero mass fermions (i.e. actual particles), I’ll show that the phase velocity of the wavefunction is equal to c²/v, which simplifies to 1/β with β = v/c when using natural units (so c = 1), but we don’t want to use natural units right now:
vp = E/p ⇔ E = vp·p = (c²/v)·(mv·v) = mv·c²
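To see where this comes from, you can plug in the standard relativistic formulas E = γ·m0·c² and p = γ·m0·v (with γ the Lorentz factor, so mv = γ·m0) and check that E/p = c²/v indeed. A quick numerical sketch, with the electron rest mass as an example value:

```python
# Check that E/p = c**2/v for a massive particle, using E = gamma*m0*c**2
# and p = gamma*m0*v. Note the phase velocity always comes out above c.
import math

c = 299792458.0        # speed of light, m/s
m0 = 9.109e-31         # electron rest mass, kg (just an example value)

for v in [0.1 * c, 0.5 * c, 0.9 * c]:
    gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
    E = gamma * m0 * c**2          # total energy: mv*c**2 with mv = gamma*m0
    p = gamma * m0 * v             # momentum
    vp = E / p                     # phase velocity of the wavefunction
    assert abs(vp - c**2 / v) / (c**2 / v) < 1e-12
    assert vp > c                  # no contradiction: it carries no signal
```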
You’ll say: so what? E = mv·c²? We know that already, so I am just proving the obvious here, am I not? Well… Yes and no. The mv·c² formula looks just like m·ω0². So we can – and probably should – think of the real and imaginary part of our wavefunction as energy stores: both store half of the total energy of our particle. Isn’t that interesting?
Of course, the smarter ones amongst you will immediately say the formula doesn’t make much sense, because mv·c² = m·ω0² implies that ω0 = c and, hence, we’ve got a constant angular velocity here, which is not what we should have. Hmm… What can I say? Many things. First, it’s true that mv·c² and m·ω0² look similar but, when everything is said and done, the m in m·ω0² does represent an actual mass on an actual spring, doesn’t it? And the k in the ω0 = √(k/m) formula is a very different k than the k in the θ = ω·t − k∙x argument of our wavefunction. Hence, it’s true that writing mv·c² = m·ω0² makes somewhat less sense than one would think at first. Secondly, you should also note that the m in m·ω0² is a non-relativistic mass concept, so m would be equal to m0, not mv.
Let me first tackle the last remark, which is easy, because it’s really not to the point: for non-relativistic speeds, we’d have m0 ≈ mv, so they would not differ very much and, therefore, we should really think of the similarity—or, let’s be bolder, the equivalence—between those mv·c² and m·ω0² equations. This brings me to the first remark.
The smarter guys amongst you should be much bolder. In the next sections, I will show that we can re-write the argument of the wavefunction as θ = m0·t’. The mass factor is, of course, the rest mass, i.e. the mass of the object as measured in its own (inertial) reference frame, so it’s not the mass factor as we see it (that’s mv). Likewise, the time variable, which I denote with a prime here (so I write t’ rather than t), is the proper time of the object we’re looking at. So… Well… The conclusion is, effectively, that the m·c² and m·ω0² formulas are fully equivalent.
Indeed, as I will show in a moment, we can look at the wavefunction as a link function, which sort of projects what’s going on in our spacetime onto what’s going on in the reference frame of the object that we’re looking at. We can then think of what’s going on as some oscillation in two separate but similar energy spaces, represented by those two oscillations.
The question now becomes a very different one: if ω0 = c, then what does the ω0 = √(k/m) equation correspond to? If we’d really be talking some mass on a spring, then we know that the period and frequency of the oscillation are determined by the size of the mass (m) on that spring and the force constant k, which captures the strength of the restoring force—which is assumed to be proportional to the extension (or compression) of the spring, so we write: F = −k·x, with x the distance from the zero point. However, here we should remind ourselves that we should not take the metaphor too far. We should not really think we’ve got some spring in a real space and then a duplicate spring in an imaginary space, with our object not only traveling along some trajectory in our spacetime but – on top of that – also going up and down in that real and imaginary energy space. No.
We may also use another metaphor: an electric circuit, for example, may also act as a harmonic oscillator and, in the process, store energy. In that case, the resonant frequency would be given by the ω0 = 1/√(L·C) formula, with L the inductance and C the capacitance of the circuit. In short, we should just think of the resonant frequency as some property of the system we’re looking at. In this case, we just find that ω0 = c, which is great, because…
Well… When everything is said and done, we can actually look at the constant c as just being some property of spacetime. I’ve done a few posts on that, notably one commenting on a rather poorly written article by a retired physics professor and so… Well… I won’t dwell on it here. OK. Onwards!
[Oh… Before I continue, let me give credit to Wikipedia for the animations above. They’re simple but great—I think.]
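[And one more aside, for the numerically inclined: the two oscillator metaphors above – the mass on a spring and the LC circuit – share the same formula shape, which a few lines of Python make obvious. The parameter values below are arbitrary illustrations.]

```python
# The same harmonic-oscillator math in two guises: a mass on a spring
# (w0 = sqrt(k/m)) and an LC circuit (w0 = 1/sqrt(L*C)).
import math

m, k = 0.5, 200.0                 # kg and N/m (arbitrary values)
w0_spring = math.sqrt(k / m)      # 20 rad/s

L, C = 1e-3, 1e-6                 # henry and farad (arbitrary values)
w0_lc = 1.0 / math.sqrt(L * C)    # about 31623 rad/s

print(w0_spring, w0_lc)           # two resonant frequencies, one formula shape
```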
Cartesian versus polar coordinates
The concept of direction is associated – in our mind, at least – with the idea of linearity. We associate the momentum of a particle, for example, with a linear trajectory in spacetime. But then Einstein told us spacetime is curved, and so what’s ‘linear’ in curved spacetime? You’ll agree we always struggle to represent – or imagine, if you prefer that term – curved spacetime, as evidenced by the fact that most illustrations of curved spacetime (like the one below, for example) represent a two-dimensional space in three-dimensional Cartesian space. I find such illustrations puzzling because they sort of mix the concept of physical space with that of mathematical space.
Having said that, there’s no alternative, obviously: we do need the idea of a mathematical space to represent the physical space. So what’s mathematical space? Mathematical spaces can be defined in many ways: as mentioned above, the term has at least as many definitions as the term ‘dimension’. However, the most obvious mathematical space – and the one we’re usually referring to – is a coordinate space. Here I should note something about simple Galilean or Newtonian relativity – so that’s pre-Einstein thinking: when we’re talking mathematical space, we should always wonder whose space it is. So the concept of the observer and the inertial frame of reference creeps in. Note that, in general, we’ll want to look at things from our point of view. However, in what follows, I’ll introduce the notion of proper space and proper time, which is the space and time of the object that we’re looking at. Both were easy concepts before Einstein radically re-defined them. Before Einstein, the proper space was just the x = 0 space, and the proper time… Well… The proper time was just time: some universal clock that was the same for everyone and everything, moving or not. So relativity changed all of that, and we’ll come back to it.
To conclude this introduction to the more serious stuff, let me define the concept of a ‘straight’ line in curved spacetime for you: if no force, or force field, is acting on some object, then it will just move in a straight line. The corollary of this, of course, is that it is not going to move in a straight line when some force is acting on it. The thing that you should note here is that, if you’d be the object, you’d feel the force – and the accelerations, decelerations or – quite simply – the changes in direction it causes. Hence, you would know that you’re moving away from your previous x = 0 point. Hence, you’d be picking up speed in your reference frame as well, and so you’d be able to calculate your acceleration and, hence, the force that’s acting on you, using Newton’s famous Law: F = m·a. So the straight line in your own space, i.e. your proper space, is the one for which x = 0, and t =… Well… Just time: a clock. It’s an important point to note in light of what will follow.
But we need to move on. We’ll do some simple exercises with our mathematical space, i.e. our coordinate space. One such exercise is the transformation of a Cartesian coordinate space into a polar coordinate space, which is illustrated by the animation below.
It’s neat but weird. Just look at it a couple of times so as to understand what’s going on. It looks weird because we’re dealing with a non-linear transformation of space here – so it is not a simple rotation or reflection (even if the animation starts with one) – and, therefore, we’re not familiar with it. I described how it works in detail in one of my blog posts, so I won’t repeat myself here. Just note the results: the r = sin(6θ) + 2 function in the final graph (i.e. the curve that looks like a petaled flower) is the same as the y = sin(6x) + 2 curve we started out with, so y = r and x = θ. So it’s the same function. It’s just… Well… Two different spaces: one transforms into the other and we can, of course, reverse the operation.
The transformation involves a reflection about the diagonal. In fact, this reflection can also be looked at as a rotation of all space, including the graph and the axes – by 180 degrees. The axis of rotation is, obviously, the same diagonal. [I like how the animation (for which the credit must go to one of the more creative contributors to Wikipedia) visualizes this.] Note how the axes get swapped, which includes a swap of the domain and the range of the function: the independent variable (x = θ) goes from −π to +π here, so that’s one cycle (we could also let it range from 0 to 2π), and, hence, the dependent variable (y = r) ranges between 1 and 3. [Whatever its argument, the sine function always yields a value between −1 and +1, but we add 2 to every value it takes, so we get the [1, 3] interval now.] Of course, the term ‘(in)dependent’ in ‘(in)dependent variable’ has no real meaning, as all is related to all in physics, so the concept of causality is just another concept that exists only in our mind, and that we impose on reality. At least that’s what philosophers like David Hume and Immanuel Kant were thinking—and modern physics does not disagree with it. 🙂
OK. That’s clear enough. Let’s move on. The operation that follows, after the reflection or rotation, is a much more complicated transformation of space and, therefore, much more interesting. Look what it does: it bends the graph around the origin so its head and tail meet. Note how this transformation wraps all of the vertical lines around a circle, and how the radius of those circles depends on the distance of those lines from the origin (as measured along the horizontal axis).
What about the vertical axis itself? The animation is somewhat misleading here, as it gives the impression we’re first making another circle out of it, which we then sort of shrink—all the way down to a circle with zero radius! However, there’s no shrinking really: we just wrap the vertical axis, too, around a circle – one with zero radius. So the vertical axis does become the origin of our new space, but not because of some shrinking operation: we only wrap stuff here. 🙂
Let’s now think of wrapping our own crazy spacetime graph around some circle. We’d get something like below. [Don’t worry about the precise shape of the graph in the polar coordinate space, as I made up a new one. I made the two graphs with PowerPoint, and that doesn’t allow for bending graphs around a circle.] Note that the remark on the need for a well-behaved function – so time goes in one direction only – applies to our polar coordinate space too! Can you see how?
We know that x and t were the space and time dimension respectively, so what’s r and θ here? […] Hey! Are you still there? Try to find the answer yourself! 🙂 It’s easy to see that the distance out (r) corresponds to x, but what about θ? The angle still measures time, right? Correct.
We’ve got a weird thing here, though: our object just shakes around in some narrow band of space, but we made our polar graph start and stop at θ = 0 and θ = 2π respectively. This amounts to saying our graph covers one cycle. You’ll agree that’s kinda random. So we should do this wrapping exercise only when we’re thinking of our function as a periodic function. Fair enough. You’ll remember the relevant formulas here: if the period of our function is T, then its frequency is equal to f = 1/T. The so-called angular frequency will be equal to ω = ∂θ/∂t = 2π·f = 2π/T.
[Usually, you’ll just see something simple like θ = ω·t, so then it’s obvious that ω = ∂θ/∂t. Also note that I write ω = ∂θ/∂t, rather than ω = dθ/dt, so I am taking a partial derivative. Why? Because we’ll soon encounter another number, the wave number k = ∂θ/∂x, which we should think of as a frequency in space, rather than a frequency in time. Also note the 2π factor when switching from Cartesian to polar coordinates: one cycle corresponds to 2π radians. Please check out my post on the math of waves if you’re not familiar with basic concepts like this.]
So we need something more regular. So let’s re-set the discussion by representing something very regular now: an object that just moves away from the zero point at some constant speed (see the spacetime graph on the right-hand side). Such a trajectory becomes a spiral in the polar coordinate system (left-hand side). To be precise, we have a so-called Archimedean or arithmetic spiral here—as opposed to, let’s say, a logarithmic spiral. [There are many other types of spirals, but I’ll let you google that yourself.]
The arithmetic spiral is described by the following general equation: r = a + b·θ. Changing a turns the spiral, while b controls the distance between successive turnings. We just choose a to be zero here – because we want r to be zero for θ = 0 – but what about b? One possibility is shown below. We just equate b to 1/(2π) here, so the distance out (r) is just the angle (θ) divided by 2π. Huh? How do we know that? Relax. Let’s calculate it.
The choice of the formula above assumes that one cycle (i.e. θ = 2π) corresponds to one distance unit, i.e. one meter in the SI system of units. So that’s why we write what we write: r = 1 = b·θ = b·2π ⇔ b = 1/2π. What formula do we get for θ? That’s easy to calculate. After one cycle, x = r = 1, but x = v·t, and so the time that corresponds to point x = 1 is equal to t = x/v = 1/v. Now, it’s easy to see that θ is proportional to t, so we write θ = ω·t, knowing that θ = 2π at t = 1/v. Indeed, the angle still measures time, but we’re looking for a scaling factor here. Hence, 2π = ω/v ⇔ ω = 2π·v. To sum it up, we get:
θ = 2π·v·t and r = θ/2π = v·t
Piece of cake! However, while logical, our choice (i.e. us equating one cycle with one meter) is and remains quite arbitrary. We could also say that one cycle should correspond to 1 second, or 2π seconds, rather than 1 meter. So then we’d take the time unit as our reference, rather than the distance unit. That’s equally logical – if not more – because one cycle – in the polar representation – corresponds to 2π radians, so why wouldn’t we define θ as θ = 2π·t, as it’s the vertical axis – not the horizontal axis – that we are rolling up here? What do we get for r in that case? That’s equally easy to calculate: r = b·θ = 2π·b·t, but r is also equal to r = x = v·t. Hence, 2π·b·t = v·t and, therefore, b must be equal to v/2π. We get:
θ = 2π·t and r = v·t = θ/2π
Huh? We get the same thing! No, we don’t. θ = 2π·v·t ≠ 2π·t. This is kinda deep. What’s going on here? Think about it: when we are making those choices above, we are basically choosing our time unit only, even when we thought we were picking the distance unit as our reference. Think about the dimensions of the θ = 2π·v·t formula but, more importantly, also think of its form: it’s still the same fundamental θ = ω·t formula. We just re-scale our time unit here, by multiplying it with the velocity of our object.
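A few lines of Python make the difference between the two choices explicit. Both choices describe an Archimedean spiral with r = v·t, but the angle – our clock – ticks v times faster in the first case:

```python
# Comparing the two bookkeeping choices. v and t are arbitrary test values.
import math

v, t = 3.0, 0.8

# Choice 1: one cycle = one distance unit, so theta = 2*pi*v*t, b = 1/(2*pi)
theta1 = 2 * math.pi * v * t
r1 = theta1 / (2 * math.pi)

# Choice 2: one cycle = one time unit, so theta = 2*pi*t, b = v/(2*pi)
theta2 = 2 * math.pi * t
r2 = (v / (2 * math.pi)) * theta2

assert abs(r1 - v * t) < 1e-12 and abs(r2 - v * t) < 1e-12  # same r = v*t
assert abs(theta1 - v * theta2) < 1e-12   # but the angles differ by a factor v
```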
Of course, the obvious question is: what’s the natural time unit here? The second, a jiffy, the Planck time, a galactic year, or what? Hard to say. However, one obvious way to try to get somewhere would be to say that we should choose our time and distance unit simultaneously, and in such a way so as to ensure c = 1.
Huh? Yes. Think about natural units: if we choose the second as our time unit, then our distance unit should be the distance that light travels in one second, i.e. 299,792,458 meter. The velocity of our object then becomes a relative velocity: a ratio between 0 and 1. This also brings in additional constraints on our graph in spacetime: the diagonal separates possible and impossible trajectories, as illustrated below. In jargon, we say that our spacetime intervals need to be time-like. You can look it up: two events that are separated by a time-like spacetime interval may be said to occur in each other’s future or past.
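In code, the time-like condition is just a one-liner. Here is a hypothetical helper, measuring time and distance in the same units so that c = 1 by default:

```python
# Is the spacetime interval (dt, dx) time-like, i.e. a possible trajectory?
def is_time_like(dt, dx, c=1.0):
    """True if (c*dt)**2 - dx**2 > 0, i.e. the implied speed stays below c."""
    return (c * dt)**2 - dx**2 > 0

print(is_time_like(1.0, 0.5))   # True: slower than light
print(is_time_like(1.0, 1.5))   # False: this would require v > c
```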
OK. Fine. However, inserting the c = 1 constraint doesn’t solve our scaling problem. We need something else for that—and I’ll tell you what in a moment. However, to understand what’s going to follow, you should think about the following fundamental ideas:
1. If we’d refer to the horizontal and vertical axis in our circle as a so-called real (Re) and imaginary (Im) axis respectively, then each point on our spiral above becomes a so-called complex number, which we write as ψ = a + b·i = r·e^(iθ) = r·(cosθ + i·sinθ). The i is the imaginary unit – and it has all kinds of wonderful properties, which you may or may not remember from your high school math course. For example, i² = −1. Likewise, r·e^(iθ) = r·(cosθ + i·sinθ) is an equally wonderful formula, which I explained previously, so I am sorry I can’t dwell on it here.
2. Hence, we can now associate some complex-valued function ψ = r·e^(iθ) = r·(cosθ + i·sinθ) with some pointlike object traveling in spacetime. [If you don’t like the idea of pointlike objects – which, frankly speaking, I would understand, because I don’t like it either – then think of the point as the center of mass—for the moment, at least.]
3. The argument of our complex-valued function, i.e. θ, would be some linear function of time, but we’re struggling to find the right scaling factor here. Hence, for the moment, we’ll just write θ as θ = ω·t.
4. It’s really annoying that the r in our ψ = r·e^(iθ) function just gets larger and larger. What we probably would want to wrap around our circle would be a rotated graph, as shown below.
So we’d want to rotate our coordinate axes (i.e. the t- and x-axis) before we wrap our graph representing our moving object around our circle—or, to put it differently, before we represent our graph in complex space. Also note we probably don’t want to wrap it around a circle of zero radius – so, in addition to the rotation, we’d need a shift along our new axis as well. Hmm… That’s getting complicated. How do we do that?
I need to insert some more math here. If we’d not be talking t and x, but the ordinary Cartesian (x, y, z) space, and we’d do a rotation in the xy-plane over an angle that’s equal to α, our coordinates would transform as follows:
t’ = t·cosα − x·sinα and x’ = x·cosα + t·sinα
These formulas are non-relativistic, though – so they are marked as ‘not correct’ in the first of the two illustrations below, which I took from Feynman. Look at those illustrations now. Think of what is shown here as a particle traveling in spacetime, and suddenly disintegrating at a certain spacetime point into two new ones which follow some new tracks, so we have an event in spacetime here. So now we want to switch to another coordinate space. One that’s rotated, somehow.
In the Newtonian world, we just turn the axes, indeed, so we get a new pair, with the ‘primed’ coordinates (t’ and x’) being some mixture of the old coordinates (t and x). [The c in c·t and c·t’ just tells us we’re measuring time and distance in equivalent units. In this case, we do so by measuring time in meter. Just write it out: c times one second is equal to (299,792,458 m/s)·(1 s) = 299,792,458 meter, so our old time unit (one second) now corresponds to 299,792,458 ‘time-meter‘. Note that, while we’re measuring time in meter now, the new unit should not make you think that time and distance are the same. They are not: we just measure them in equivalent units so c = 1. That’s all.]
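The key property of that ordinary (Euclidean) rotation – and the thing the Lorentz transformation will change – is the quantity it preserves: a rotation keeps t² + x² invariant, while the Lorentz transformation will keep t² − x² invariant (with c = 1). A quick check of the Euclidean case:

```python
# An ordinary rotation of the (t, x) axes over an angle alpha preserves
# the Euclidean 'length' t**2 + x**2.
import math

def rotate(t, x, alpha):
    tp = t * math.cos(alpha) - x * math.sin(alpha)
    xp = x * math.cos(alpha) + t * math.sin(alpha)
    return tp, xp

t, x = 2.0, 1.0
for alpha in [0.3, 0.7, 1.5]:
    tp, xp = rotate(t, x, alpha)
    assert abs((tp**2 + xp**2) - (t**2 + x**2)) < 1e-12  # invariant under rotation
```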
But, since Einstein, we know we do not live in the Newtonian world, so we need to apply a relativistically correct transformation. That transformation looks very different. It’s illustrated in the second of the two graphs above, and I’ll remind you of the formulas, i.e. the Lorentz transformation rules, in the next section. You’ll see it solves our time scaling problem. 🙂
But let’s not get ahead of ourselves here. Let’s first do some more thinking here. From what preceded this discussion on polar and Cartesian coordinates, it’s pretty obvious we want to associate some clock with our traveling object. So we want the φ or θ to represent some proper time. Now, the concept of ‘proper time’ is actually a relativistic one – which I’ll talk about in a moment – but here we just want to do non-relativistic stuff. So what’s the proper time, classically? Well… We need to do that rotation and then shift the origin, right?
Well… No. I led you astray—but for a good reason: I wanted you to think about what we’re doing here. The transformation we need is simple and complicated at the same time. We want our clock to just tick along the straight blue line, so we can nicely wrap it around Euler’s circle. So we do not want those weird graphs. We want to iron out all of the wrinkles, so to speak. So how do we do that? As I mentioned above, we’d feel it if we’re shaking along some shaky path. So we know what our proper space is: it’s the x = 0 space. So we know when we’re traveling along a straight line – even in curved space 🙂 – and when we’re not. Now let’s look at one of those wrinkles, like that green curve below that is breaking away from the proper path.
So we’d feel the force and, as mentioned above, we’d feel we’re picking up speed. In other words: we’d be deviating from our straight line in space, and we could calculate everything. We’d find that we’re moving away from the straight path with a velocity v = Δx/Δt = tanα, i.e. the slope (m = tanα) of the hypotenuse of the dotted triangle. Of course, you have to think differentials and derivatives here, so you should think of dx and dt, and the instantaneous velocity v = dx/dt. But you see what I mean—I hope! The point is: once the force did what it had to do, we’re back on a straight line, but moving in some other direction, or moving at a higher velocity as observed from our own inertial frame of reference.
So… Well… It seems like we’re making things way too complicated here. It’s actually very easy to iron out all of the wrinkles in the Galilean or Newtonian world: time is time, and so the proper time is just t’ = t: it’s the same in any frame of reference. So it doesn’t matter whether we’re moving in a straight line or along some complicated path: the proper time is just time, and we can just equate that θ in our sine or cosine function (i.e. the argument of our wavefunction) with t, so we write θ = t = t’.
So it’s simple: we literally stretch those weird graphs (or iron them out, if you prefer that term) and then we just measure time along them. So we just iron the graph and then wrap it around the unit circle. That’s all. 🙂 Well… Sort of. 🙂 The thing is: if it’s a complicated trajectory (i.e. anything that is not a straight line), the angular velocity will not be some constant: it will vary—and how exactly is captured by that action concept, as measured along the line. But I’ll come back to that.
What about the proper space of our object? That’s easy: to ensure x’ = 0, we permanently shift the origin, so the new coordinate is x’ = x − v·t. So, yes, very easy. We’re just doing simple Galilean (or Newtonian) transformations here: we’re looking at some object that’s traveling in spacetime, and we keep track of what space and time look like in its own reference frame, i.e. in its proper space and time. So it’s classical relativity, which is usually represented by the following set of equations.
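To make this very concrete, here is a minimal numerical sketch of that classical transformation (the function name and the numbers are mine, just for illustration):

```python
def galilean(x, t, v):
    """Galilean transformation: absolute time, origin shifted by v*t."""
    return x - v * t, t   # x' = x - v*t, t' = t

# An object traveling at v = 0.5 along x = v*t sits at x' = 0 in its own frame:
x_prime, t_prime = galilean(2.0, 4.0, 0.5)
print(x_prime, t_prime)   # 0.0 4.0
```

So the proper time is untouched, and only the spatial origin gets corrected, which is the whole point of the paragraph above.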
So… This all really looks like much ado about nothing, doesn’t it? Well… Yes. I made things very complicated above and, yes, you’re right: you don’t need all these complicated graphs to just explain the concept of a clock that’s traveling with some object, i.e. the concept of proper time. The concept of proper time, in the classical world, is just time: absolute time.
The thing is: since Einstein, we know the classical world is not the real world. 🙂 Now, quantum theory – i.e. the kind of wave mechanics that we will present below – was born 20 years after Einstein’s publication of his (special) relativity theory (we only need the special relativity theory to understand what the wavefunction is all about). That’s a long time, or a short time, depending on your perspective – another thing that’s relative 🙂 – but so here it is: the concept of proper time in the quantum-mechanical wavefunction is not the classical concept: it’s the relativistic one.
Let me show you that now.
II. The wavefunction
You know the elementary wavefunction (if not, you’ll need to go through the essentials page(s) of this blog):
ψ(x, t) = a·e−i·[(E/ħ)·t − (p/ħ)∙x] = a·e−i·(ω·t − k∙x) = a·e−iθ = a·eiφ = a·(cosφ + i·sinφ) with φ = −θ
The latter part of the formula above is just Euler’s formula. Note that the argument of the wavefunction rotates clockwise with time, while the mathematical convention for the φ angle demands we measure that angle counter-clockwise. It’s a minor detail but important to note.
The argument of the wavefunction
Let’s have a closer look at the mathematical structure of the argument of the quantum-mechanical wavefunction, i.e. θ = ωt – kx = (E/ħ)·t – (p/ħ)·x. [We’ve simplified once again by assuming one-dimensional space only, so the bold-face x and p (i.e. the x and p vectors) are replaced by the scalars x and p. Likewise, the momentum vector p = m·v becomes just p = m·v.]
The ω in the θ = ωt – kx argument is usually referred to as the frequency in time (i.e. the temporal frequency) of our wavefunction, while k is the so-called wave number, i.e. the frequency in space (or spatial frequency) of the wavefunction. So ω is expressed in radians per second, while k is expressed in radians per meter. However, as we’ll see in a moment, the θ = ωt – kx expression is actually not very different from our previous θ = ω·t expression: the –kx term is like a relativistic correction because, in relativistic spacetime, you always have to wonder: whose time?
However, before we get into that, let’s first play a bit with one of these online graphing tools to see what that a·ei(k∙x−ω·t) = a·eiθ = a·(cosθ + i·sinθ) formula actually represents. Compare the following two graphs, for example. Just imagine we either look at how the wavefunction behaves at some point in space, with the time fixed at some point t = t0, or, alternatively, that we look at how the wavefunction behaves in time at some point in space x = x0. As you can see, increasing k = p/ħ or increasing ω = E/ħ gives the wavefunction a higher ‘density’ in space or, alternatively, in time.
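If you want to play along without a graphing tool, the little sketch below (my own toy example, not anything from the graphs above) evaluates the elementary wavefunction and confirms that the spatial pattern repeats every 2π/k meters, so a higher k does, indeed, pack more oscillations into the same stretch of space:

```python
import cmath
import math

def psi(x, t, a=1.0, k=1.0, omega=1.0):
    """Elementary wavefunction a*exp(i*(k*x - omega*t))."""
    return a * cmath.exp(1j * (k * x - omega * t))

# At fixed t, the pattern repeats after one wavelength 2*pi/k,
# so doubling k halves the distance between repeats:
k = 2.0
wavelength = 2 * math.pi / k
print(abs(psi(0.0, 0.0, k=k) - psi(wavelength, 0.0, k=k)) < 1e-12)  # True
```

The same check works in time with the period 2π/ω, which is the ‘density in time’ mentioned above.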
Relativistic spacetime transformations
Let’s now look at the whole thing once more. At first, it looks like that argument θ = ωt – kx = (E/ħ)·t – (p/ħ)·x is a lot more complicated than the θ = ω·t argument we introduced when talking about polar coordinates. However, it actually is not very different, as I’ll show below. Let’s see what we can do with this thing when assuming there are no force fields. In other words, we’re once again in the simplest of cases: we’re looking at some object moving in a straight line at constant velocity.
In that case, it has no potential energy, except for the equivalent energy that’s associated with its rest mass m0. Of course, it also has kinetic energy, because of its velocity. Now, if mv is the total mass of our object, including the equivalent mass of the particle’s kinetic energy, then its equivalent energy (potential and kinetic—all included!) is E = mvc2.
Let’s further simplify by assuming we measure everything in natural units, so c = 1. However, we’ll go one step further. We’ll also assume we measure stuff in such units so ħ is also equal to unity. [In case you wonder what units that could be, think of the Planck units above. You can quickly check that the speed of light comes out alright: (1.62×10−35 m)/(5.39×10−44 s) = 3×108 m/s, so if 1.62×10−35 m and 5.39×10−44 s are the new units – let’s denote them by 1 lP and 1 tP respectively – then c‘s value, as measured in the new units, will be one, indeed.] The velocity of any object, v, will now be measured as some fraction of c, i.e. a relative velocity. [You know that’s just the logical consequence of relativistic mass increase, which is real! It requires tremendous energy to accelerate elementary particles beyond a certain point, because they become so heavy!]
To make a long story (somewhat) shorter, our energy formula E = mvc2 reduces to E = mv. Finally, just like in our example with the Archimedean spiral, we will also choose the origin of our axes such that x = 0 when t = 0, so we write: x(t = 0) = x(0) = 0. That ensures x = v·t for every point on the trajectory of our object. Hence, taking into account that the numerical value of ħ is also equal to 1 – and substituting mv·v for p – we can re-write θ = (E/ħ)·t – (p/ħ)·x as:
θ = E·t – p·x = E·t − p·v·t = mv·t − (mv·v)·(v·t) = mv·(1 − v2)·t
So our ψ(x, t) function becomes ψ = a·e−i·mv·(1 − v2)·t. This is really exciting! Our formula for θ now has the same functional form as our θ = 2π·ν·t above: we just have an mv·(1 − v2) factor here, times t, rather than a 2π·ν factor times t. Note that we don’t have the 2π factor in our θ = mv·(1 − v2)·t formula because we chose our units such that ħ = 1, so we equate the so-called reduced Planck constant with unity, rather than h = 2π·ħ. [In case you doubt: ħ is the real thing, as evidenced from the fact that we write the Uncertainty Principle as Δx·Δp ≥ ħ/2, not as Δx·Δp ≥ h/2.]
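A quick numerical check of that little derivation (natural units, with the values picked arbitrarily):

```python
def theta(m_v, v, t):
    """theta = E*t - p*x, with E = m_v, p = m_v*v and x = v*t (c = hbar = 1)."""
    E, p, x = m_v, m_v * v, v * t
    return E * t - p * x

# The E*t - p*x form and the m_v*(1 - v^2)*t form agree:
m_v, v, t = 3.0, 0.6, 2.0
print(theta(m_v, v, t), m_v * (1 - v**2) * t)  # both 3.84
```

So the –p·x term really does act as a correction on the E·t term, which is the point of what follows.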
But… Well… That mv·(1 − v2) factor is a very different thing, isn’t it? Not at all like v, really. So what’s going on here? What can we say about this? Before investigating this, let me first look at something else—and, no, I am not trying to change the subject. I’ll answer the question above in a minute. However, let’s first do some more thinking about the coefficient in front of our wavefunction.
Remember that the formula for the distance out (r) in that r·eiφ formula for that spiral was rather annoying: we didn’t want it to depend on φ. Fortunately, the a in our ψ = a·e−iθ will, effectively, not depend on θ. But so what is it then? To explain what it is, I must assume that you already know a thing or two about quantum mechanics. [If you don’t, don’t worry. I’ll come back to everything I write here. Just go with the flow right now.] One of the things you probably know, is that we should take the absolute square of this wavefunction to get the probability of our particle being somewhere in space at some point in time. So we get the probability as a function of x and t. We write:
P(x, t) = |a·e−i·[(E/ħ)·t − (p/ħ)∙x]|2 = a2
The result above makes use of the fact that |ei·φ|2 = 1, always. That’s mysterious, but it’s actually just the old cos2φ + sin2φ = 1 rule you know from high school. In fact, the absolute square takes the time dependency out of the probability, so we can just write: P(x, t) = P(x), so the probability depends on x only. Interesting! But… Well… That’s actually what we’re trying to show here, so I still have to show that the a factor is not time-dependent. So let me show that to you, in an intuitive way. You know that all probabilities have to add up to one. Now, let’s assume, once again, we’re looking at some narrow band in space. To be specific, let’s assume our band is defined by Δx = x2 − x1. Also, as we have no information about the probability density function, we’ll just assume it’s a uniform distribution, as illustrated below. In that case, and because all probabilities have to add up to one, the following logic should hold:
(Δx)·a2 = (x2−x1)·a2 = 1 ⇔ Δx = 1/a2
In short, the a coefficient in the ψ = a·e−i·mv·(1 − v2)·t is related to the normalization condition: all probabilities have to add up to one. Hence, the coefficient of the elementary wavefunction does not depend on time: it only depends on the size of our box in space, so to speak. 🙂
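The normalization logic above is easy to verify numerically (the box width below is a made-up value, just for illustration):

```python
import math

def amplitude(dx):
    """Coefficient a for a uniform distribution over a box of width dx:
    (dx) * a**2 = 1, so a = 1/sqrt(dx)."""
    return 1.0 / math.sqrt(dx)

dx = 4.0
a = amplitude(dx)
print(dx * a**2)  # 1.0: all probabilities add up to one
```

Note that t appears nowhere in this little calculation, which is just another way of saying the coefficient does not depend on time.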
OK. Done! Let’s now go back to our θ = mv·(1 − v2)·t formula. 🙂 Both mv and v vary, so that’s a bit annoying. Let us, therefore, use the relativistically correct formula for mv: mv = m0/√(1−v2). So now we only have one variable: v, or parameter, I should say, because we assume v is some constant velocity here. [Sorry for simplifying: we’ll make things more complicated again later.] Let’s also go back to our original ψ(x, t) = a·e−i·[(E/ħ)·t − (p/ħ)∙x] function, so as to include both the space as well as the time coordinates as the independent variables in our wavefunction. Using natural units once again, that’s equivalent to:
ψ(x, t) = a·e−i·(mv·t − p∙x) = a·e−i·[(m0/√(1−v2))·t − (m0·v/√(1−v2))∙x] = a·e−i·[m0/√(1−v2)]·(t − v∙x)
Interesting! We’ve got a wavefunction that’s a function of x and t, but with the rest mass (or rest energy) and the velocity of what we’re looking at as parameters! But… Hey! Wait a minute! You know that formula, don’t you?
The (t − v∙x)/√(1−v2) factor in the argument should make you think of a very famous formula—one that I am sure you must have seen a dozen times already! It’s one of the formulas for the Lorentz transformation of spacetime. Let me quickly give you the formulas:
Let me now remind you of what they mean. [I am sure you know it already, but… Well… Just in case.] The (x, y, z, t) coordinates are the position and time of an object as measured by the observer who’s ‘standing still’, while the (x′,y′,z′,t′) is the position and time of an object as measured by the observer that’s ‘moving’. In most of the examples, that’s the guy in the spaceship, who’s often referred to as Moe. 🙂 The illustration below shows how it works: Joe is standing still, and Moe is moving.
The theory of relativity, as expressed in those transformation formulas above, shows us that the relationship between the position and time as measured in one coordinate system and another is not what we would have expected on the basis of our intuitive ideas. Indeed, the Galilean – or Newtonian – transformation of the coordinates as observed in Joe’s and Moe’s coordinate space would be given by the much simpler set of equations I already noted in the previous section, i.e.:
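Here is a sketch of both transformations side by side (natural units, so c = 1, and the function names are mine), showing that the Lorentz formulas reduce to the Galilean ones when v is small:

```python
import math

def lorentz(x, t, v):
    """Lorentz transformation (c = 1): mixes x and t with a 1/sqrt(1-v^2) factor."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    return gamma * (x - v * t), gamma * (t - v * x)

def galilean(x, t, v):
    """Galilean transformation: same shift of x, but absolute time."""
    return x - v * t, t

# For v much smaller than 1, the two transformations nearly coincide:
x, t, v = 1.0, 2.0, 1e-6
print(lorentz(x, t, v))
print(galilean(x, t, v))
```

At everyday velocities the difference is far below anything we could measure, which is why the Newtonian formulas served us so well for so long.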
We also gave you the (Newtonian) formulas for a rotation in the previous section. For a rotation (in the Newtonian world), we also got ‘primed’ coordinates that were a mixture of the old coordinates. So we’ve got another mixture here. It’s fundamentally different, however: the Lorentz transformation also mixes the old coordinates to get the new ones, but with an entirely different formula. As you can see, it’s an algebraic thing: no sines and cosines.
The argument of the wavefunction as the proper time
The primed time coordinate (t’), i.e. time as measured in Moe’s reference frame, is referred to as the proper time. Let me be somewhat more precise here, and just give you the more formal definition: the proper time is the time as measured by a clock along the path in four-dimensional spacetime.
So what do we have here? A great discovery, really: we now know what time to use in our θ = ω·t formula. We need to use the proper time, so that’s t’ rather than t! Bingo! Let’s get the champagne out!
Not yet. We shouldn’t forget the second factor: we also have m0 in our m0·(t − v∙x)/√(1−v2) argument. But… Well… That’s even better! So we also get the scaling factor here! The natural unit in which we should measure the proper time is given by the rest mass of the object that we’re looking at. To sum it all up, the argument of our wavefunction reduces to:
θ = m0·t’ = m0·(t − v∙x)/√(1−v2)
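You can check numerically that, along the trajectory x = v·t, this t’ is just the invariant spacetime interval √(t2 − x2) (a little check of my own, in natural units):

```python
import math

def proper_time(t, x, v):
    """t' = (t - v*x)/sqrt(1 - v^2), in natural units (c = 1)."""
    return (t - v * x) / math.sqrt(1 - v**2)

m0, v, t = 2.0, 0.8, 5.0
x = v * t                     # along the trajectory x = v*t
t_prime = proper_time(t, x, v)
print(t_prime, math.sqrt(t**2 - x**2))  # both ≈ 3.0
print(m0 * t_prime)                     # theta = m0*t' ≈ 6.0
```

So the argument of the wavefunction is, indeed, just the rest mass times the clock reading of the object itself.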
In fact, when thinking about how the rest mass – through the energy and momentum factors in the argument of the wavefunction – affects its density, both in time as well as in space, I often think of an airplane propeller: as it spins, faster and faster (as shown below), it gives the propeller some ‘density’, in space as well as in time, as its blades cover more space in less time.
It’s an interesting analogy, and it helps—me, at least—to try to imagine what that wavefunction might actually represent. In fact, every time I think about it, I find it provides me with yet another viewpoint or nuance. 🙂 The basics of the analogy are clear enough: our pointlike object (you may want to think of our electron in some orbital once again) is whizzing around, in a very limited box in space, and so it is everywhere and nowhere at the same time. At the same time, we may – perhaps – catch it at some point—in space, and in time—and it’s the density of its wavefunction that determines what the odds are for that. It may be useful for you to think of the following two numbers, so as to make this discussion somewhat more real:
- The so-called Bohr radius is, roughly speaking, the size of our electron orbital—for a one-proton/one-electron hydrogen atom, at least. It’s about 5.3×10−11 m.
- However, there is also a thing known as the classical electron radius (aka the Lorentz radius, or Thomson scattering length). It’s a complicated concept, but it does give us an indication of the actual size of an electron, as measured by the probability of ‘hitting’ it with some ray (usually a hard X-ray). That radius is about 2.8×10−15 m. [The associated scattering cross-section is typically denoted by σ and measured in units of area.]
Hence, the radius of our box is about 20,000 times larger than the radius of our electron. I know what you’ll say: that’s not a lot. But that’s just because, by now, you’re used to all those supersonic numbers. 🙂 Think of what it represents in terms of volume: the cube of r in the volume formula – V = (4/3)·π·r3 – ensures the magnitude of the volume ratio is of the tera-order. To be precise: the volume of Bohr’s box is about 6,622,200,000,000 times larger than the Thomson box, plus or minus a few billion. Is that number supersonic enough? 🙂
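You can check the arithmetic yourself (using the more precise values for both radii):

```python
r_bohr = 5.29177e-11   # Bohr radius (m)
r_e    = 2.81794e-15   # classical electron (Thomson) radius (m)

ratio_r = r_bohr / r_e        # ratio of the radii: roughly 19,000
ratio_V = ratio_r**3          # the (4/3)*pi factor cancels in the ratio
print(f"{ratio_r:.0f}")       # ≈ 18,780
print(f"{ratio_V:.4e}")       # ≈ 6.6e12, i.e. tera-order indeed
```

So the cube really does turn an unremarkable four-digit ratio into a thirteen-digit one.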
So… Well… Yes. While, in quantum-mechanics, we should think of an electron as not having any actual size, it does help our understanding to think of it as having some unimaginably small, but actual, size, and as something that’s just whizzing around at some incredibly high speed. In short, think of the propeller picture. 🙂
To conclude this section, I’ll quickly insert a graph from Wikipedia illustrating the concept of proper time. Unlike what you might think, E1 and E2 are just events in spacetime, i.e. points in spacetime that we can identify because something happens there, like someone arriving or leaving. 🙂 The graph below would actually illustrate the twin paradox if the distance coordinate of E2 were the same as the distance coordinate of E1, and the t and τ are actually time intervals, which we’d usually denote by Δt and Δt’. In any case, you get the idea—I hope! 🙂
What does it all mean?
Let’s calculate some stuff to see what it all means. We’ve actually done the calculations already. Look at those cosine and sine functions above: a higher mass will give the wavefunction a higher density, both in space as well as in time, as the mass factor multiplies both t as well as x. Of course, that’s obvious just from looking at θ = m0·t’. However, it’s interesting to stick to our x and t coordinates (rather than the proper time) and see what happens. Let’s make abstraction from the m0 factor for a moment because, as mentioned above, that’s basically just a scaling factor for the proper time.
So let’s just look at the 1/√(1−v2) factor in front of t, and the v/√(1−v2) factor in front of x in our θ = m0·(t − v∙x)/√(1−v2) = m0·[t/√(1−v2) − v∙x/√(1−v2)]. I’ve plotted them below.
First look at the blue graph for that 1/√(1−v2) factor in front of t: it goes from one (1) to infinity (∞) as v goes from 0 to 1 (remember we ‘normalized’ v: it’s a ratio between 0 and 1 now). So that’s the factor that comes into play for time. For x, it’s the red graph, which has the same shape but goes from zero (0) to infinity (∞) as v goes from 0 to 1.
Now that makes sense. Our time won’t differ from the proper time of our object if it’s standing still, and the v in the v∙x/√(1−v2) term ensures it disappears when v = 0. Just write it all out:
θ = m0·[t/√(1−v2) − v∙x/√(1−v2)] = m0·[t/√(1−02) − 0∙x/√(1−02)] = m0·t
However, as the velocity goes up, the clock of our object – as we see it – will seem to be going slower. That’s the relativistic time dilation effect, with which most of us are more familiar than with the relativistic mass increase or length contraction effect. However, they’re all part of the same thing. You may wonder: how does it work exactly? Well… Let’s choose the origin of our axes such that x = 0 when t = 0, so we write: x(t = 0) = x(0) = 0. That ensures x = v·t for every point on the trajectory of our object. In fact, we’ve done this before. The argument of our wavefunction just reduces to:
θ = m0·[t/√(1−v2) − v∙v·t/√(1−v2)] = m0·(1 − v2)/√(1−v2)·t = m0·√(1 − v2)·t
I’ll let you draw the graph yourself: the √(1 − v2) factor goes from 1 to 0 as v goes from 0 to 1. OK. That’s obvious. But what happens with space? Here, the analysis becomes really interesting. The density of our wavefunction, as seen from our coordinate frame, also becomes larger in space, for any value of t or t’. However, note that all of our weird or regular graphs above assumed some fixed domain for our function and, hence, the number of oscillations is some fixed number. But if their density increases, that means we must pack them in a smaller space. In short, the increasing density of our wavefunction – as velocities increase – corresponds to the relativistic length contraction effect: it’s like space is contracting as the velocity increases.
OK. All of the above was rather imprecise—an introduction only, meant to provide some more intuitive approach to the subject-matter. However, now it’s time for the real thing. 🙂 Unfortunately, that will involve a lot more math. And I mean: a lot more! 😦
However, before I move on, let me first answer a question you may have: is it important to include relativistic effects? The answer is: it depends on the (relative) velocity of what we’re looking at. For example, you may or may not know we do have some kind of classical idea of the velocity of an electron in orbit. It’s actually one of the many interpretations of what some physicists refer to as ‘God’s number’, i.e. the fine-structure constant (α). Indeed, among other things, this number may be interpreted as the ratio of the velocity of the electron in the first circular orbit of the Bohr model of an atom and the speed of light, so it’s our v. In fact, I mentioned it in that digression on α: one of the ways we can write it is α = v/c, indeed. Now, the numerical value of α is about 7.3×10−3 (for historical reasons, you’ll usually see it written as 1/α ≈ 137), so its (classical) velocity is just a mere 2,187 km per second. At that velocity, the 1/√(1−v2) factor in front of t is very near to 1 (the first non-zero digit behind the decimal point appears after four zeros only), while the v/√(1−v2) factor in front of x is (almost) equal to α ≈ 0.007297… [In fact, at first, I thought they were equal, but α/√(1−α2) is, of course, not exactly equal to α.] Hence, when calculating electron orbitals (like I did in one of my posts on Schrödinger’s equation), one might just as well not bother about the relativistic correction and just equate the proper time (t’) with the time of the observer (t).
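Here are the actual numbers (with the usual values for α and c), confirming just how negligible the relativistic correction is at this velocity:

```python
import math

alpha  = 7.2973525693e-3   # fine-structure constant
c_km_s = 299792.458        # speed of light in km/s

v = alpha                          # electron velocity as a fraction of c
print(v * c_km_s)                  # the ~2,187 km/s mentioned above

gamma = 1 / math.sqrt(1 - v**2)    # factor in front of t
print(gamma)                       # 1.0000266..., so t' ≈ t indeed
print(v / math.sqrt(1 - v**2))     # factor in front of x: barely above alpha
```

Four zeros behind the decimal point before anything happens: that is why the non-relativistic treatment of electron orbitals works so well.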
In fact, that’s what the Master (whose Chapter on electron orbitals I summarized in that post of mine) does routinely, as most of the time he’s talking about rather heavy objects, like electrons, or nitrogen atoms. 🙂 To be specific, the solutions for Schrödinger’s equation for the electron in a hydrogen atom all share the following functional form:
ψ(x, t) = ψ(x)·e−i·(E/ħ)·t
Hence, the position vector does not appear in the argument of the complex exponential: we only get the first term of the full (E/ħ)·t − (p/ħ)∙x argument here. The position vector does appear in the coefficient in front of our exponential, however—which is why we get all these wonderful shapes, as illustrated below (credit for the illustration goes to Wikipedia). 🙂
Well… No. I should be respectful of the Master. Feynman does not write the wavefunction the way he writes it – i.e. as ψ(x, t) = ψ(x)·e−i·(E/ħ)·t – to get rid of the relativistic correction in the argument of the wavefunction. Think of it: the ψ(r, t) = e−(i/ħ)·E·t·ψ(r) expression is not necessarily non-relativistic, because we can re-write the elementary a·e−i·[(E/ħ)·t – (p/ħ)·x] function as e−(i/ħ)·E·t·a·ei·(p/ħ)·x. Feynman just writes what he writes to ease the search for functional forms that satisfy Schrödinger’s equation. That’s all. [By the way, note that the coefficient in front of the complex exponential, i.e. ψ(r) = a·ei·(p/ħ)·x, still does the trick we want it to do: we do not want that coefficient to depend on time: it should only depend on the size of our ‘box’ in space.]
So what’s next? Well… The inevitable, I am afraid. After introducing the wavefunction, one has to introduce… Yep. The Other Big Thing. 🙂
III. Schrödinger’s equation
Schrödinger’s equation as a diffusion equation
You’ve probably seen Schrödinger’s equation a hundred times, trying to understand what it means. Perhaps you were successful. Perhaps you were not. Its derivation is not very straightforward, and so I won’t give you that here. [If you want, you can check my post on it.] Let me first jot it down once more. In its simplest form – i.e. not including any potential, so then it’s an equation that’s valid for free space only—no force fields!—it reduces to:
In my post on quantum-mechanical operators, I drew your attention to the fact that this equation is structurally similar to the heat diffusion equation. Indeed, assuming the heat per unit volume (q) is proportional to the temperature (T) – which is the case when expressing T in degrees Kelvin (K), so we can write q as q = k·T – we can write the heat diffusion equation as:
Moreover, I noted the similarity is not only structural. There is more to it: both equations model some flow in space and in time. Let me make the point once more by first explaining it for the heat diffusion equation. The time derivative on the left-hand side (∂T/∂t) is expressed in K/s (Kelvin per second). Weird, isn’t it? What’s a Kelvin per second? In fact, a Kelvin per second is actually a quite sensible and straightforward quantity, as I’ll explain in a minute. But I can understand you can’t make much sense of it now. So, fortunately, the constant in front (k) makes sense of it. That coefficient (k) is the (volume) heat capacity of the substance, which is expressed in J/(m3·K). So the dimension of the whole thing on the left-hand side (k·∂T/∂t) is J/(m3·s), so that’s energy (J) per cubic meter (m3) and per second (s). That sounds more or less OK, doesn’t it? 🙂
So what about the right-hand side? On the right-hand side we have the Laplacian operator – i.e. ∇2 = ∇·∇, with ∇ = (∂/∂x, ∂/∂y, ∂/∂z) – operating on T. The Laplacian operator, when operating on a scalar quantity, gives us a flux density, i.e. something expressed per square meter (1/m2). In this case, it’s operating on T, so the dimension of ∇2T is K/m2. Again, that doesn’t tell us very much: what’s the meaning of a Kelvin per square meter? However, we multiply it by the thermal conductivity, whose dimension is W/(m·K) = J/(m·s·K). Hence, the dimension of the product is the same as the left-hand side: J/(m3·s). So that’s OK again, as energy (J) per cubic meter (m3) and per second (s) is definitely something we can associate with a flow. Hence, the diffusion constant does what it’s supposed to do:
- As a constant of proportionality, it quantifies the relationship between both derivatives (i.e. the time derivative and the Laplacian).
- As a physical constant, it ensures the dimensions on both sides of the equation are compatible.
In fact, we can now scrap one m on each side 🙂 so the dimension of both sides then becomes joule per second and per square meter, which makes a lot of sense too—as flows through two-dimensional surfaces can easily be related to flows in three-dimensional volumes. [The math buffs amongst you (unfortunately, I am not part of your crowd) can work that out.] In any case, it’s clear that the heat diffusion equation does, indeed, represent the energy conservation law!
What about Schrödinger’s equation? Well… We can – and should – think of Schrödinger’s equation as a diffusion equation as well, but then one describing the diffusion of a probability amplitude.
Huh? Yes. Let me show you how it works. The key difference is the imaginary unit (i) in front, and the fact that the wavefunction itself is complex-valued. That makes it clear that we get two diffusion equations for the price of one, as our wavefunction consists of a real part (the term without the imaginary unit, i.e. the cosine part) and an imaginary part (i.e. the term with the imaginary unit, i.e. the sine part). Just think of Euler’s formula once more. To put it differently, Schrödinger’s equation packs two equations for the price of one: one in the real space and one in the imaginary space, so to speak—although that’s a rather ambiguous and, therefore, a rather confusing statement. But… Well… In any case… We wrote what we wrote.
What about the dimensions? Let’s jot down the complete equation so to make sure we’re not doing anything stupid here by looking at one aspect of the equation only. The complete equation is:
Let me first remind you that ψ is a function of position in space and time, so we write: ψ = ψ(x, y, z, t) = ψ(r, t), with (x, y, z) = r. Let’s now look at the coefficients, and at that −ħ2/2m coefficient in particular. First its dimension. The ħ2 factor is expressed in J2·s2. Now that doesn’t make much sense, but then that mass factor in the denominator makes everything come out alright. Indeed, we can use the mass-equivalence relation to express m in J/(m/s)2 units. Indeed, let me remind you here that the mass of an electron, for example, is usually expressed as being equal to 0.5109989461(31) MeV/c2, so that unit uses the E = m·c2 mass-equivalence formula. As for the eV, you know we can convert that into joule, which is a rather large unit at the atomic scale—which is why we use the electronvolt as a measure of energy. In any case, to make a long story short, we’re OK: (J2·s2)·[(m/s)2/J] = J·m2. But so we multiply that with some quantity (the Laplacian) that’s expressed per m2. So −(ħ2/2m)·∇2ψ is something expressed in joule, so it’s some amount of energy!
Interesting, isn’t it? Especially because it works out just fine with the additional Vψ term, which is also expressed in joule.
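The bookkeeping above can also be done mechanically. The sketch below is my own toy dimensional calculus—it just tracks (kg, m, s) exponents—and it confirms that −(ħ2/2m)·∇2ψ comes out in joule:

```python
# Dimensions as (kg, m, s) exponents: multiplying quantities adds exponents,
# dividing subtracts them. Pure numbers like the 1/2 carry no dimension.
def mul(a, b): return tuple(p + q for p, q in zip(a, b))
def div(a, b): return tuple(p - q for p, q in zip(a, b))

JOULE     = (1, 2, -2)   # kg*m^2/s^2
HBAR      = (1, 2, -1)   # J*s
MASS      = (1, 0, 0)    # kg
LAPLACIAN = (0, -2, 0)   # the Laplacian acts 'per square meter'

coeff = div(mul(HBAR, HBAR), MASS)      # hbar^2/(2m): the 1/2 is dimensionless
print(mul(coeff, LAPLACIAN) == JOULE)   # True
```

The same trick verifies the Vψ term trivially, since V is an energy to begin with.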
But why the 1/2 factor? Well… That’s a bit hard to explain, and I’ll come back to it. That 1/2 factor also pops up in the Uncertainty Relations: Δx·Δp ≥ ħ/2 and ΔE·Δt ≥ ħ/2. So we have ħ/2 here as well, not ħ. Why do we need to divide the quantum of action by 2 here? It’s a very deep thing. I’ll show why we need that 1/2 factor in the next sections, in which I’ll also calculate the phase and group velocities of the elementary wavefunction for spin-0, spin-1/2 and spin-1 particles. So… Well… Be patient, please! 🙂
Now, we didn’t say all that much about V, but then that’s easy enough. V is the potential energy of… Well… Just do an example. Think of the electron here: its potential energy depends on the distance (r) from the proton. We write: V = −e2/│r│ = −e2/r. Why the minus sign? Because we say the potential energy is zero at large distances (see my post on potential energy). So we’ve got another minus sign here, although you couldn’t see it in the equation itself. In any case, the whole Vψ term is, obviously, expressed in joule too. So, to make a long story short, the right-hand side of Schrödinger’s equation is expressed in energy units.
On the left-hand side, we have ħ, and its dimension is the action dimension: J·s, i.e. force times distance times time (N·m·s). So we multiply that with a time derivative and we get J once again, the unit of energy. So it works out: we have joule units both left and right. But what does it mean?
The Laplacian on the right-hand side should work just the same as the Laplacian in our heat diffusion equation: it should give us a flux density, i.e. something expressed per square meter (1/m2). But so what is it that is flowing here? Well… Hard to say. In fact, the Vψ term spoils our flow interpretation, because that term does not have the 1/s or 1/m2 dimension.
Well… It does and it doesn’t. Let me do something bold here. Let me re-write Schrödinger’s equation as:
∂ψ/∂t + i·(V/ħ)·ψ = i·(ħ/2m)·∇2ψ
Huh? Yes. All I did, was to move the i·ħ factor to the other side here (remember that 1/i is just −i), so our Vψ term becomes −i·(V/ħ)·ψ, and then I move it to the left-hand side. What do we get now when doing a dimensional analysis?
- The ∂ψ/∂t term still gives us a flow expressed per second.
- The dimension of V/ħ is (N·m)/(N·m·s) = 1/s, so that’s nice, as it’s the same dimension as ∂ψ/∂t. So on the left-hand side, we have something per second.
- The ħ/2m factor gives us (N·m·s)/(N·s2/m) = m2/s. That’s fantastic as that’s what we’d expect from a diffusion constant: it fixes the dimensions on both sides, because that Laplacian gives us some quantity per m2.
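As a final sanity check, we can put an actual number on that diffusion constant for an electron (SI values; calling it a ‘diffusion constant’ is, of course, my interpretation, not standard textbook terminology):

```python
hbar = 1.054571817e-34    # reduced Planck constant (J*s)
m_e  = 9.1093837015e-31   # electron rest mass (kg)

D = hbar / (2 * m_e)      # the 'diffusion constant' in the re-written equation
print(D)                  # ≈ 5.79e-5 m^2/s
```

So the quantity does, at least, have the right m2/s dimension and a perfectly finite magnitude.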
In short, the way I re-write Schrödinger’s equation gives a lot more meaning to it! 🙂 Frankly, I can’t believe no one else seems to have thought of this simple stuff.
But we’re still left with the question: what’s flowing here? Feynman’s answer is simple: the probability amplitude, which – as he repeats several times – is a dimensionless number—a scalar, albeit a complex-valued scalar.
Frankly, I love the Master, but I find that interpretation highly unsatisfactory. My answer is much bolder: it’s energy. The probability amplitude is like temperature: temperature is a measure of – and is, in fact, directly proportional to – the mean molecular kinetic energy. So it’s a measure of the (average) energy in an unimaginably small box in space. How small? As small as we can make it, taking into account that the notion of average must still make sense, of course! Now, you can easily sense we’ve got a statistical issue here that resembles the Uncertainty Principle: the standard error of the mean (average) energy will increase as the size of our box decreases. Interesting!
Of course, you’ll think this is crazy. No one interprets probability amplitudes like this. This doesn’t make sense, does it? Well… I think it does, and I’ll give you some reasons why in a moment. 🙂
However, let me first wrap up this section by talking about the ħ/(2m) coefficient. That coefficient is the diffusion constant in Schrödinger’s equation, so it should do the two quintessential jobs: (1) fix dimensions, and (2) give us a sensible proportionality relation. So… Well… Let’s look at the dimensions first. We’ve talked about the dimension of the mass factor m, but so what is the m in the equation? It’s referred to as the effective mass of the elementary particle that we’re looking at—which, in most practical cases, is the electron (see our introduction to electron orbitals above, for example). I’ve talked about the subtleties of the concept of the effective mass of the electron in my post on the Schrödinger equation, so let’s not bother too much about its exact definition right here—just like you shouldn’t bother – for the moment, that is – about that 1/2 factor. 🙂 Just note that ħ/(2m) is the reciprocal of 2m/ħ. We should think of the 2m factor as an energy concept (I will later argue that we’ve got that factor 2 because the energy includes spin energy), and that’s why the 2m/ħ factor also makes sense as a proportionality factor.
Before we go to the next section, I want you to consider something else. Think of the dimension of θ. We said it was the proper time of the thing we’re looking at, multiplied by the rest mass. At the same time, we said it was something that’s dimensionless. Some kind of pure number accompanying our object. To be precise: it became an angle, expressed in radians. But… Well… I want you to re-consider that.
What? Yes. Look at it: the E·t and p·x factors in θ = (E/ħ)·t – (p/ħ)·x both have the same dimension as Planck’s constant, i.e. the dimension of action (force times distance times time). [The first term (E·t divided by ħ) is energy times time, while the second (p·x divided by ħ) is momentum times distance. Both can be re-written as force times distance times time.] So θ becomes dimensionless just because we include ħ’s dimension when dividing E and p by it. But what if we’d say that ħ, in this particular case, is just a scaling factor, i.e. some numerical value without a dimension attached to it? In that case, θ would no longer be dimensionless: it would, effectively, have the action dimension: N·m·s. However, in that case, wouldn’t it be awkward to have a function relating some amount of action to something that’s dimensionless? I mean… Shouldn’t we then say that our wavefunction sort of projects something real – in this case, some amount of action – into some other real space? [In case you wonder: when I say a real space, I mean a physical space—i.e. a space which has physical dimensions (like time or distance, or energy, or action—or whatever physical dimensions) rather than just mathematical dimensions.]
So let’s explore that idea now.
The wavefunction as a link function
The wavefunction acts as a link function between two spaces. If you’re not familiar with the concept of link functions, don’t worry. But it’s quite interesting. I stumbled upon the concept when co-studying non-linear regression with my daughter, as she was preparing for her first-year MD examinations. 🙂 Link functions link mathematical spaces. However, here I am thinking of linking physical spaces.
The mechanism is something like this. Our physical space is an action space: some force moves something in spacetime. All the information is captured in the notion of action, i.e. force times distance times time. Now, the action is the proper time, and it’s the argument of the wavefunction, which acts as a link function between the action space and what I call the energy space, which is not our physical space—but it’s another physical space: it’s got physical dimensions.
You’ll say: what physical dimensions? What makes it different from our physical space?
Great question! 🙂 Not easy to answer. The philosophical problem here is that we should only have one physical space, right? Well… Maybe. I am thinking of any space whose dimensions are physical. So the dimensions we have here are time and energy. We don’t have x, though. So the spatial dimension got absorbed. But that’s it. And so, yes, our new energy space is a physical space. It just doesn’t have any spatial dimension: it just mirrors the energy in the system at any point in time, as measured by the proper time of the system itself. Does that make sense?
Note, once again, the phase shift between the sine and the cosine: if one reaches the +1 or −1 value, then the other function reaches the zero point—and vice versa. It’s a beautiful structure. Of course, the million-dollar question is: is it a physical structure, or a mathematical structure? Does that energy space really have an energy dimension? In other words: is it an actual energy space? Is it real?
I know what you think—because that’s what I thought too, initially. It’s just a figment of our imagination, isn’t it? It’s just some mathematical space, no? Nothing real: just a shadow from what’s going in real spacetime, isn’t it?
I thought about this for a long time, and my answer is: it’s real! It’s not a shadow. That sine and cosine space is a very real space. It associates every point in spacetime – through the wavefunction, which acts as a link function here – with some real as well as some imaginary energy—and the imaginary energy is as real as the real energy. 🙂 It’s that energy that explains why amplitudes interfere—which, as you know, is what they do. So these amplitudes are something real, and as the dimensional analysis of Schrödinger’s equation reveals their dimension is expressed in joule, then… Well… Then these physical equations say what they say, don’t they? And what they say, is something like the diagram below. 🙂
Unfortunately, it doesn’t show the phase difference between the two springs though (I should do an animation here), which… Well… That needs further analysis, especially in regard to that least action principle I mentioned: our particle – or whatever it is – will want to minimize the difference between kinetic and potential energy. 🙂 Contemplate that animation once again:
And think of the energy formula for a harmonic oscillator, which tells us that the total energy – kinetic (i.e. the energy related to its momentum) plus potential (i.e. the energy stored in the spring) – is equal to T + U = m·a2·ω02/2, with a the amplitude of the oscillation. The ω0 here is the angular velocity. Now, the de Broglie relations tell us that the phase velocity of the wavefunction is equal to the vp factor in the E = m·vp2 equation. Look at it: not m·vp2/2. No 1/2 factor. All makes sense, because we’ve got two springs, ensuring the difference between the kinetic energy (KE) and potential energy (PE) in the integrand of the action integral
is not only minimized (in line with the least action principle) but is actually equal to zero! But then we haven’t introduced uncertainty here: we’re assuming some definite energy level. But I need to move on. We’ll talk about all of this later anyway. 🙂
Another reason why I think this energy space is not a figment of our mind, is the fact that we need to take the absolute square of the wavefunction to get the probability that our elementary particle is actually right there! Now that’s something real! Hence, let me say a few more things about that. The absolute square gets rid of the time factor. Just write it out to see what happens:
|reiθ|2 = |r|2|eiθ|2 = r2[√(cos2θ + sin2θ)]2 = r2(√1)2 = r2
Now, the r gives us the maximum amplitude (sorry for the mix of terminology here: I am just talking about the wave amplitude here – i.e. the classical concept of an amplitude – not the quantum-mechanical concept of a probability amplitude). Now, we know that the energy of a wave – any wave, really – is proportional to the square of its amplitude. It would also be logical to expect that the probability of finding our particle at some point x is proportional to the energy density there, isn’t it? [I know what you’ll say now: you’re squaring the amplitude, so if the dimension of its square is energy, then its own dimension must be the square root, right? No. Wrong. That’s why this confusion between amplitude and probability amplitude is so bad. Look at the formula: we’re squaring the sine and cosine, to then take the square root again, so the dimension doesn’t change: it’s √J2 = J.]
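For the doubters, the |r·eiθ|2 = r2 identity is easy to verify numerically. A quick sketch (the sampled values of r and θ are arbitrary):

```python
import cmath
import math
import random

# Check that the absolute square of r·exp(iθ) is r² for arbitrary r and θ:
# the time-dependent phase drops out completely.
random.seed(1)
for _ in range(5):
    r = random.uniform(0.1, 3.0)
    theta = random.uniform(0.0, 2.0 * math.pi)
    psi = r * cmath.exp(1j * theta)
    assert math.isclose(abs(psi) ** 2, r ** 2, rel_tol=1e-12)
print("|r·exp(iθ)|² = r² for all sampled r and θ")
```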
The third reason why I think the probability amplitude represents some energy is that its real and imaginary part also interfere with each other, as is evident when you take the ordinary square (i.e. not the absolute square). Then the i2 = –1 rule comes into play and, therefore, the square of the imaginary part starts messing with the square of the real part. Just write it out:
(r·eiθ)2 = r2·(cosθ + i·sinθ)2 = r2·(cos2θ – sin2θ + 2i·cosθ·sinθ) = r2·(1 – 2sin2θ + 2i·cosθ·sinθ)
As mentioned above, if there’s interference, then something is happening, and so then we’re talking something real. Hence, the real and imaginary part of the wavefunction must have some dimension, and not just any dimension: it must be energy, as that’s the currency of the Universe, so to speak.
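You can verify that interference numerically as well. The sketch below checks that (cosθ + i·sinθ)2 equals (cos2θ − sin2θ) + i·(2·cosθ·sinθ), i.e. cos2θ + i·sin2θ (the value of θ is an arbitrary choice):

```python
import cmath
import math

# The ordinary (non-absolute) square: the i² = −1 rule makes the squared
# imaginary part interfere with the squared real part.
theta = 0.7
z = complex(math.cos(theta), math.sin(theta))
sq = z * z
assert math.isclose(sq.real, math.cos(theta) ** 2 - math.sin(theta) ** 2)
assert math.isclose(sq.imag, 2.0 * math.cos(theta) * math.sin(theta))
# Equivalently, squaring exp(iθ) just doubles the angle:
assert abs(sq - cmath.exp(2j * theta)) < 1e-12
print("interference terms check out")
```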
Now, I should add a philosophical note here—or an ontological note, I should say. When you think we should only have one physical space, you’re right. This new physical space, in which we relate energy to the proper time of an object, is not our physical space. It’s not reality—not as we know it, not as we experience it. So, in that sense, you’re right. It’s not physical space. But then… Well… It’s a definitional matter. Any space whose dimensions are physical—and, importantly, in which things happen (which is surely the case here!)—is a physical space for me. But then I should probably be more careful. What we have here is some kind of projection of our physical space to a space that lacks… Well… It lacks the spatial dimension. 🙂 It’s just time – but a special kind of time: relativistic proper time – and energy—albeit energy in two dimensions, so to speak. So… What can I say? It’s some kind of mixture between a physical and a mathematical space. But then… Well… Our own physical space – including the spatial dimension – is something like a mixture as well, isn’t it? 🙂 We can try to disentangle them – which is what I am trying to do here – but then we’ll probably never fully succeed.
When everything is said and done, our description of the world (for which our language is math) and the world itself (which we refer to as the physical space), are part of one and the same reality.
Energy propagation mechanisms
One of my acquaintances is a retired nuclear physicist. A few years ago, when I was struggling a lot more with this stuff than I am now (although it never gets easy: it’s still tough!) – trying to find some kind of a wavefunction for photons – he bluntly told me photons don’t have a wavefunction—not in the sense I was talking about, at least. Photons are associated with a traveling electric and a magnetic field vector. That’s it. Full stop. Photons do not have a ψ or φ function. [I am using ψ and φ to refer to the position and momentum wavefunction respectively. Both are related: if we have one, we have the other.] So I could have given up – but then I just couldn’t let go of the idea of a photon wavefunction. The structural similarity in the propagation mechanism of the electric and magnetic field vectors E and B just looks too much like the quantum-mechanical wavefunction. So I kept trying and, while I don’t think I fully solved the riddle, I feel I understand it much better now. Let me show you.
I. An electromagnetic wave in free space is fully described by the following two equations:
- ∂B/∂t = –∇×E
- ∂E/∂t = c2∇×B
We’re abstracting away from stationary charges here, and we also do not consider any currents, so no moving charges either. So I am omitting the ∇·E = ρ/ε0 equation (i.e. the first of Maxwell’s four equations), and I am also omitting the j/ε0 term in the second equation. So, for all practical purposes (i.e. for the purpose of this discussion), you should think of a space with no charges: ρ = 0 and j = 0. It’s just a traveling electromagnetic wave. To make things even simpler, we’ll assume our time and distance units are chosen such that c = 1, so the equations above reduce to:
- ∂B/∂t = –∇×E
- ∂E/∂t = ∇×B
Perfectly symmetrical, except for the minus sign in the first equation. As for the interpretation, I should refer you to one of my many posts but, briefly, the ∇× operator is the curl operator. It’s a vector operator: it describes the (infinitesimal) rotation of a (three-dimensional) vector field. We discussed heat flow a couple of times, or the flow of a moving liquid. So… Well… If the vector field represents the flow velocity of a moving fluid, then the curl is the circulation density of the fluid. The direction of the curl vector is the axis of rotation as determined by the ubiquitous right-hand rule, and its magnitude is the magnitude of the rotation. OK. Next.
II. For the wavefunction, we have Schrödinger’s equation, ∂ψ/∂t = i·(ħ/2m)·∇2ψ, which relates two complex-valued functions (∂ψ/∂t and ∇2ψ). [Note I am assuming we have no force fields (so no V), and also note I brought the i·ħ to the other side: −(ħ2/2m)/(i·ħ) = −(ħ/2m)/i = +i·(ħ/2m).] Now, complex-valued functions consist of a real and an imaginary part, and you should be able to verify the ∂ψ/∂t = i·(ħ/2m)·∇2ψ equation is equivalent to the following set of two equations:
- Re(∂ψ/∂t) = −(ħ/2m)·Im(∇2ψ)
- Im(∂ψ/∂t) = (ħ/2m)·Re(∇2ψ)
Perfectly symmetrical as well, except for the minus sign in the first equation. 🙂 [In case you don’t immediately see what I am doing here, note that two complex numbers a + i·b and c + i·d are equal if, and only if, their real and imaginary parts are the same. However, here we have something like this: a + i·b = i·(c + i·d) = i·c + i2·d = − d + i·c (remember i2 = −1).] Now, the energy E in the wave equation – i.e. the E in ψ(θ) = ψ(x, t) = a·e−iθ = a·e−i(E·t − p∙x)/ħ wavefunction – consists of:
- The rest energy E0 = m0·c2;
- The kinetic energy mv·v2/2 = (mv·v)·(mv·v)/(2mv) = p2/(2m);
- The potential energy V.
[Note we’re using a non-relativistic formula for the kinetic energy here, but it doesn’t matter. It’s just to explain the various components of the total energy of the particle.]
Now let’s assume our particle has zero rest mass, so E0 = 0. By the way, note that the rest mass term is mathematically equivalent to the potential term, both in the wavefunction as well as in Schrödinger’s equation: E0·t + V·t = (E0 + V)·t, and V·ψ + E0·ψ = (V + E0)·ψ. So… Yes. We can look at the rest mass as some kind of potential energy or – alternatively – add the equivalent mass of the potential energy to the rest mass term.
Note that I am not saying it is a photon. I am just hypothesizing there is such a thing as a zero-mass particle, without any other qualifications or properties. In fact, it’s not a photon, as I’ll prove later. A photon packs more energy. 🙂 All it’s got in common with a photon is that all of its energy is kinetic, as both E0 and V are zero. So our elementary wavefunction ψ(θ) = ψ(x, t) = e−iθ = e−i[(E0 + p2/(2m) + V)·t − p∙x]/ħ reduces to e−i(p2/(2m)·t − p∙x)/ħ. [Note I don’t include any coefficient (a) in front, as that’s just a matter of normalization.] So we’re looking at the wavefunction of a massless particle here. While I mentioned it’s not a photon – or, to be precise, it’s not necessarily a photon – it has energy and momentum, and hence some equivalent mass, just like a photon. [In case you forgot, the energy and momentum of a photon are related by E/p = c.]
Now, it’s only natural to assume our zero-mass particle will be traveling at the speed of light, because the slightest force will give it an infinite acceleration. Hence, its velocity v = c is also equal to 1. Therefore, we can write its momentum as p = m∙c = m, so we get:
E = m = p
Wow! What a weird combination, isn’t it? It is… But… Well… It’s OK. [You tell me why it wouldn’t be OK. It’s true we’re glossing over the dimensions here, but natural units are natural units, and so c = c2 = 1. So… Well… No worries!] The point to note is that the E = m = p equality yields extremely simple but also very sensible results. For the group velocity of our ei(kx − ωt) wavefunction, we get:
vg = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂p/∂p = 1
So that’s the velocity of our zero-mass particle (c, i.e. the speed of light) expressed in natural units once more—just like what we found before. For the phase velocity, we get:
vp = ω/k = (E/ħ)/(p/ħ) = E/p = p/p = 1
What’s the corresponding wavefunction? It’s a·e−i·[E·t − p∙x]/ħ, of course. However, because of that E = m = p relation (and because we use Planck units), we can write it as a·e−i·(m·t − m∙x) = a·e−i·m·(t − x) . Let’s now calculate the time derivative and the Laplacian to see if it solves the Schrödinger equation, i.e. ∂ψ/∂t = i·(ħ/2m)·∇2ψ:
- ∂ψ/∂t = −i·a·m·e−i∙m·(t − x)
- ∇2ψ = ∂2[a·e−i∙m·(t − x)]/∂x2 = a·∂[e−i∙m·(t − x)·(i·m)]/∂x = −a·m2·e−i∙m·(t − x)
So the ∂ψ/∂t = i·(1/2m)·∇2ψ equation becomes:
−i·a·m·e−i∙m·(t − x) = −i·a·(1/2m)·m2·e−i∙m·(t − x) ⇔ 1 = 1/2 !?
The damn 1/2 factor. Schrödinger wants it in his wave equation, but it does not work here. We’re in trouble! So… Well… What’s the conclusion? Did Schrödinger get that 1/2 factor wrong?
Yes. And no. His wave equation is the wave equation for electrons, or for spin-1/2 particles with a non-zero rest mass in general. So the wave equation for these zero-mass particles should not have that 1/2 factor in the diffusion constant.
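If you don’t want to take the derivation on faith, you can check it numerically with finite differences. The sketch below (natural units, ħ = 1; the values of m, x, t and the step h are arbitrary choices of mine) confirms that ψ = e−i·m·(t − x) satisfies the equation without the 1/2 factor, but not the one with it:

```python
import cmath

# Finite-difference check of ∂ψ/∂t = i·(1/m)·∂²ψ/∂x² for ψ = exp(−i·m·(t − x)).
m, x, t, h = 2.0, 0.3, 0.5, 1e-4

def psi(x, t):
    return cmath.exp(-1j * m * (t - x))

d_t = (psi(x, t + h) - psi(x, t - h)) / (2.0 * h)                # ∂ψ/∂t
d_xx = (psi(x + h, t) - 2.0 * psi(x, t) + psi(x - h, t)) / h**2  # ∂²ψ/∂x²

print(abs(d_t - 1j / m * d_xx))          # ≈ 0: the equation without the 1/2 holds
print(abs(d_t - 1j / (2.0 * m) * d_xx))  # ≈ m/2: the version with the 1/2 fails
```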
Of course, you may think those zero-mass particle wavefunctions make no sense because… Well… Their argument is zero, right? Think of it. When we – as an outside observer – look at the clock of an object traveling at the speed of light, its clock looks like it’s standing still. So if we assume t = 0 when x = 0, t will still be zero after it has traveled two or three light-seconds, or light-years! So its t – as we observe it from our inertial framework – is equal to zero forever! So both t and x are zero—forever! Well… Maybe. Maybe not. It’s clear we have some difficulty here, as evidenced also by the fact that, if m0 = 0, then θ = m0·(t − v∙x)/√(1 − v2) should be zero, right?
Well… No. Look at it: we both multiply and divide by zero here: m0 is zero, but √(1−v2) is zero too! So we can’t define the θ argument, and so we also can’t really define x and t here, it seems. The conclusion is simple: our zero-mass particle is nowhere and everywhere at the same time: it really just models the flow of energy in space!
Let’s quickly do the derivations for E = m = p without specifying any specific value. However, we will assume all is measured in natural units, so ħ = 1. So the wavefunction is just a·e−i·[E·t − p∙x]. The derivatives now become:
- ∂ψ/∂t = −a·i·E·e−i∙[E·t − p∙x]
- ∇2ψ = ∂2[a·e−i∙[E·t − p∙x]]/∂x2 = ∂[a·i·p·e−i∙[E·t − p∙x]]/∂x = −a·p2·e−i∙[E·t − p∙x]
So the ∂ψ/∂t = i·(ħ/m)·∇2ψ = i·(1/m)·∇2ψ equation now becomes:
−a·i·E·e−i∙[E·t − p∙x] = −i·(1/m)·a·p2·e−i∙[E·t − p∙x] ⇔ E = p2/m
It all works like a charm, as we assumed E = m = p. Note that the E = p2/m formula closely resembles the kinetic energy formula one often sees: K.E. = m·v2/2 = m·m·v2/(2m) = p2/(2m). We just don’t have the 1/2 factor in our E = p2/m formula, which is great—because we don’t want it! 🙂 Just to make sure: let me add that, when we write that E = m = p, we mean their numerical values are the same. Their dimensions remain what they are, of course. Just to make sure you get this rather subtle point, we’ll do a quick dimensional analysis of that E = p2/m formula:
[E] = [p2/m] ⇔ N·m = (N·s)2/kg = N2·s2/(N·s2/m) = N·m = joule (J)
To conclude this section, let’s now just calculate the derivatives in the ∂ψ/∂t = i·(ħ/m)·∇2ψ equation (i.e. the equation without the 1/2 factor) without any special assumptions at all. So no E = m = p stuff, and we also will not assume we’re measuring stuff in natural units, so our elementary wavefunction is just what it is: a·e−i·[E·t − p∙x]/ħ. The derivatives now become:
- ∂ψ/∂t = −a·i·(E/ħ)·e−i∙[E·t − p∙x]/ħ
- ∇2ψ = ∂2[a·e−i∙[E·t − p∙x]/ħ]/∂x2 = a·∂[i·(p/ħ)·e−i∙[E·t − p∙x]/ħ]/∂x = −a·(p2/ħ2)·e−i∙[E·t − p∙x]/ħ
So the ∂ψ/∂t = i·(ħ/m)·∇2ψ equation now becomes:
−a·i·(E/ħ)·e−i∙[E·t − p∙x]/ħ = −i·(ħ/m)·a·(p2/ħ2)·e−i∙[E·t − p∙x]/ħ ⇔ E = p2/m
We get that E = p2/m formula again, so that’s twice the kinetic energy. Note that we do not assume stuff like E = m = p here. It’s all quite general. So… Well… It’s all perfect. 🙂 Well… No. We can write that E = p2/m as E = m·v2, and that condition looks like nonsense, because we know that E = m·c2: the two are only compatible if m·c2 = m·v2, i.e. if v = c. So, again, we see this rather particular Schrödinger equation works only for zero-mass particles. In fact, what it describes is just a general propagation mechanism for energy.
Fine. On to the next: the photon wavefunction. Indeed, the photon does have a wavefunction, and it’s different from the wavefunction of my hypothetical zero-mass particle. Let me show how it’s different. However, before we do, let me say something about the superposition principle—which I always think of as the ‘additionality’ principle, because that’s what we’re doing: we’re just adding waves.
The superposition principle
The superposition principle tells us that any linear combination of solutions to a (homogeneous linear) differential equation will also be a solution. You know that’s how we can localize our wavefunction: we just add a lot of them and get some bump. It also works the other way around: we can analyze any regular wave as a sum of elementary waves. You’ve heard of this: it’s referred to as a Fourier analysis, and you can find more detail on that in my posts on that topic. My favorite illustration is still the one illustrating the Fourier transform on Wikipedia:
We can really get whatever weird shape we want. There is one catch, however: we need to combine waves with different frequencies, which… Well… How do we do that? For that, we need to introduce uncertainty, so we do not have one single definite value for E = p = m.
This shows, once again, that we’re just analyzing energy here—not some real-life elementary particle. So… Well… We’ll come back to that.
Now, before we look at the wavefunction for the photon, let me quickly add something on the energy concepts we are using here.
Relativistic and non-relativistic kinetic energy
You may have read that the Schrödinger equation is non-relativistic. That is correct, and not correct at the same time. The equation on his grave (below) is much more general, and encompasses both the relativistic as well as the non-relativistic case, depending on what you use for the operator (H) on the right-hand side of the equation:
The ‘over-dot’ is Newton’s notation for the time derivative. In fact, if you click on the picture above (and zoom in a bit), then you’ll see that the craftsman who made the stone grave marker, mistakenly, also carved a dot above the psi (ψ) on the right-hand side of the equation—but then someone pointed out his mistake and so the dot on the right-hand side isn’t painted. The thing I want to talk about here, however, is the H in that expression above. For the non-relativistic case, that operator is equal to:
So that gives us the Schrödinger equation we started off with. It’s referred to as a non-relativistic equation because the mass concept is the m that appears in the classical kinetic energy formula: K.E. = m·v2/2. Now that’s a non-relativistic approximation. In relativity theory, the kinetic energy of an object is calculated as the difference between (1) the total energy, which is given by Einstein’s mass-energy equivalence relation: E = m·c2 = mv·c2, and (2) the rest mass energy, which – as mentioned above – is like potential energy, and which is given by E0= m0·c2. So the relativistically correct formula for the kinetic energy is the following:
K.E. = E − E0 = mv·c2 − m0·c2 = m0·γ·c2 − m0·c2 = m0·c2·(γ − 1)
Now that looks very different, doesn’t it? Let’s compare the relativistic and non-relativistic formula by plotting them using equivalent time and distance units (so c = 1), and for a mass that we’ll also equate to one. As you can see from the graph below, the two concepts do not differ much for non-relativistic velocities, but the gap between them becomes huge as v approaches c. [Beware of the optical illusion: it looks like the two curves approach each other again after separating, but that’s not the case! Remember to measure the distance between them along the y-axis!]
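To get a feel for the numbers behind that graph, here’s a quick sketch comparing the two formulas (c = 1 and m0 = 1, for a few sample velocities of my choosing):

```python
import math

# Classical kinetic energy: v²/2.  Relativistic: γ − 1, with γ = 1/√(1 − v²).
# The relativistic value is always the larger of the two, and the gap
# explodes as v approaches 1 (i.e. as v approaches c).
for v in (0.1, 0.5, 0.9, 0.99):
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    print(f"v = {v}: classical = {v**2 / 2:.4f}, relativistic = {gamma - 1:.4f}")
```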
Hence, for zero-mass particles we should use the relativistic kinetic energy formula, which, for m0 = 0 and v = c, becomes:
K.E. = mv·c2 − m0·c2 = mv·c2 = mc·c2 = m·c2 = E
So that’s all of the energy. No missing energy: in the absence of force fields, zero-mass particles have no potential energy: all of their energy is kinetic.
What about our E = p2/m formula? Is it relativistic or non-relativistic? Well… We derived it assuming v = c or, in equivalent units: v = 1. Now, both the p = mv·v = m·v momentum formula and the E = m·c2 equation are relativistically correct. Hence, for v = c, we can write: E = (m2/m)·c2 = (m2·c2)/m = p2/m. Bingo! No problem whatsoever.
In contrast, the E = p2/(2m) = m·v2/2 formula is just the classical kinetic energy formula. Or… Well… Is it? The classical formula is m0·v2/2: it uses the rest mass, not the relativistic mass. So… Well… It’s really quite particular! 🙂 But don’t worry about it. You’ll soon understand what it stands for. 🙂
IV. The wavefunction of a photon
Look at the following images:
Both are the same, and then they’re not. The illustration on the right-hand side is a regular quantum-mechanical wavefunction, i.e. an amplitude wavefunction: the x-axis represents time, so we’re looking at the wavefunction at some particular point in space. [Of course, we could just switch the dimensions and it would all look the same.]
The illustration on the left-hand side looks similar, but it’s not an amplitude wavefunction. The animation shows how the electric field vector (E) of an electromagnetic wave travels through space. Its shape is the same. So it’s the same function. Is it also the same reality?
Yes and no. And I would say: more no than yes—in this case, at least. Note that the animation does not show the accompanying magnetic field vector (B). That vector is equally essential in the electromagnetic propagation mechanism according to Maxwell’s equations, which—let me remind you—are equal to:
- ∂B/∂t = –∇×E
- ∂E/∂t = ∇×B
In fact, I should write the second equation as ∂E/∂t = c2∇×B, but then I assume we measure time and distance in equivalent units, so c = 1.
You know that E and B are two aspects of one and the same thing: if we have one, then we have the other. To be precise, B is always orthogonal to E, in the direction that’s given by the right-hand rule for the following vector cross-product: B = ex×E, with ex the unit vector pointing in the x-direction (i.e. the direction of propagation). The reality behind this is illustrated below for a linearly polarized electromagnetic wave.
The B = ex×E equation is equivalent to writing B = i·E, which is equivalent to:
B = i·E = ei(π/2)·ei(kx − ωt) = cos(kx − ωt + π/2) + i·sin(kx − ωt + π/2)
= −sin(kx − ωt) + i·cos(kx − ωt)
Now, E and B have only two components: Ey and Ez, and By and Bz. That’s only because we’re looking at some ideal or elementary electromagnetic wave here but… Well… Let’s just go along with it. It is then easy to prove that the equation above amounts to writing:
- By = cos(kx − ωt + π/2) = −sin(kx − ωt) = −Ez
- Bz = sin(kx − ωt + π/2) = cos(kx − ωt) = Ey
We should now think of Ey and Ez as the real and imaginary part of some wavefunction, which we’ll denote as ψE = ei(kx − ωt). So we write:
E = (Ey, Ez) = Ey + i·Ez = cos(kx − ωt) + i∙sin(kx − ωt) = Re(ψE) + i·Im(ψE) = ψE = ei(kx − ωt)
What about B? We just do the same, so we write:
B = (By, Bz) = By + i·Bz = ψB = i·E = i·ψE = −sin(kx − ωt) + i∙cos(kx − ωt) = −Im(ψE) + i·Re(ψE)
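A quick numerical spot-check of these component relations (k = ω = 1 in natural units; the sample points are arbitrary choices of mine):

```python
import math

# With E = cos(kx − ωt) + i·sin(kx − ωt) and B = i·E, we should find
# By = −Ez and Bz = Ey at every point in spacetime.
k = w = 1.0
for x, t in [(0.0, 0.0), (0.4, 0.1), (1.3, 2.7)]:
    u = k * x - w * t
    E = complex(math.cos(u), math.sin(u))
    B = 1j * E
    assert math.isclose(B.real, -E.imag)  # By = −Ez
    assert math.isclose(B.imag, E.real)   # Bz = Ey
print("By = −Ez and Bz = Ey at all sampled points")
```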
Now we need to prove that ψE and ψB are regular wavefunctions, which amounts to proving they satisfy Schrödinger’s equation, i.e. ∂ψ/∂t = i·(ħ/m)·∇2ψ, for both ψE and ψB. [Note I use that revised Schrödinger equation, which uses the E = m·v2 energy concept, i.e. twice the kinetic energy.] To prove that ψE and ψB are regular wavefunctions, we should prove that:
- Re(∂ψE/∂t) = −(ħ/m)·Im(∇2ψE) and Im(∂ψE/∂t) = (ħ/m)·Re(∇2ψE), and
- Re(∂ψB/∂t) = −(ħ/m)·Im(∇2ψB) and Im(∂ψB/∂t) = (ħ/m)·Re(∇2ψB).
Let’s do the calculations for the second pair of equations. The time derivative on the left-hand side is equal to:
∂ψB/∂t = −iω·i·ei(kx − ωt) = ω·ei(kx − ωt) = ω·cos(kx − ωt) + i·ω·sin(kx − ωt)
The second-order derivative on the right-hand side is equal to:
∇2ψB = ∂2ψB/∂x2 = −i·k2·ei(kx − ωt) = k2·sin(kx − ωt) − i·k2·cos(kx − ωt)
So the two equations for ψB are equivalent to writing:
- Re(∂ψB/∂t) = −(ħ/m)·Im(∇2ψB) ⇔ ω·cos(kx − ωt) = k2·(ħ/m)·cos(kx − ωt)
- Im(∂ψB/∂t) = (ħ/m)·Re(∇2ψB) ⇔ ω·sin(kx − ωt) = k2·(ħ/m)·sin(kx − ωt)
So we see that both conditions are fulfilled if, and only if, ω = k2·(ħ/m).
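Again, a finite-difference check may be reassuring. The sketch below verifies that ψB = i·ei(kx − ωt) satisfies both split equations precisely when ħ/m = ω/k2 (the values of k, ω, x, t and the step h are arbitrary choices of mine):

```python
import cmath
import math

# Check Re(∂ψ/∂t) = −(ħ/m)·Im(∇²ψ) and Im(∂ψ/∂t) = (ħ/m)·Re(∇²ψ)
# for ψ_B = i·exp(i(kx − ωt)), setting ħ/m = ω/k².
k, w, h = 2.0, 3.0, 1e-4
hbar_over_m = w / k**2

def psi(x, t):
    return 1j * cmath.exp(1j * (k * x - w * t))

x, t = 0.7, 0.2
d_t = (psi(x, t + h) - psi(x, t - h)) / (2.0 * h)                # ∂ψ/∂t
d_xx = (psi(x + h, t) - 2.0 * psi(x, t) + psi(x - h, t)) / h**2  # ∇²ψ (1D)

assert math.isclose(d_t.real, -hbar_over_m * d_xx.imag, abs_tol=1e-5)
assert math.isclose(d_t.imag, hbar_over_m * d_xx.real, abs_tol=1e-5)
print("both conditions hold for ħ/m = ω/k²")
```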
Now, we also demonstrated in that post of mine that Maxwell’s equations imply the following:
- ∂By/∂t = –(∇×E)y = ∂Ez/∂x = ∂[sin(kx − ωt)]/∂x = k·cos(kx − ωt) = k·Ey
- ∂Bz/∂t = –(∇×E)z = – ∂Ey/∂x = – ∂[cos(kx − ωt)]/∂x = k·sin(kx − ωt) = k·Ez
Hence, using those By = −Ez and Bz = Ey equations above, we can also calculate these derivatives as:
- ∂By/∂t = −∂Ez/∂t = −∂sin(kx − ωt)/∂t = ω·cos(kx − ωt) = ω·Ey
- ∂Bz/∂t = ∂Ey/∂t = ∂cos(kx − ωt)/∂t = −ω·[−sin(kx − ωt)] = ω·Ez
In other words, Maxwell’s equations imply that ω = k, which is consistent with us measuring time and distance in equivalent units, so the phase velocity is c = 1 = ω/k.
So far, so good. We basically established that the propagation mechanism for an electromagnetic wave, as described by Maxwell’s equations, is fully coherent with the propagation mechanism—if we can call it like that—as described by Schrödinger’s equation. We also established the following equalities:
- ω = k
- ω = k2·(ħ/m)
The second of the two de Broglie equations tells us that k = p/ħ, so we can combine these two equations and re-write these two conditions as:
ω/k = 1 = k·(ħ/m) = (p/ħ)·(ħ/m) = p/m ⇔ p = m
What does this imply? The p here is the momentum: p = m·v, so this condition implies v must be equal to 1 too, so the wave velocity is equal to the speed of light. Makes sense, because we actually are talking light here. 🙂 In addition, because it’s light, we also know E/p = c = 1, so we have – once again – the general E = p = m equation, which we’ll need!
OK. Next. Let’s write the Schrödinger wave equation for both wavefunctions:
- ∂ψE/∂t = i·(ħ/mE)·∇2ψE, and
- ∂ψB/∂t = i·(ħ/mB)·∇2ψB.
Huh? What are mE and mB? We should only associate one mass concept with our electromagnetic wave, shouldn’t we? Perhaps. I just want to be on the safe side now. Of course, if we distinguish mE and mB, we should probably also distinguish pE and pB, and EE and EB as well, right? Well… Yes. If we accept this line of reasoning, then the mass factor in Schrödinger’s equation is pretty much like the 1/c2 = μ0ε0 factor in Maxwell’s (1/c2)·∂E/∂t = ∇×B equation: the mass factor appears as a property of the medium, i.e. the vacuum here! [Just check my post on physical constants in case you wonder what I am trying to say here, in which I explain why and how c defines the (properties of the) vacuum.]
To be consistent, we should also distinguish pE and pB, and EE and EB, and so we should write ψE and ψB as:
- ψE = ei(kEx − ωEt), and
- ψB = ei(kBx − ωBt).
Huh? Yes. I know what you think: we’re talking one photon—or one electromagnetic wave—so there can be only one energy, one momentum and, hence, only one k, and one ω. Well… Yes and no. Of course, the following identities should hold: kE = kB and, likewise, ωE = ωB. So… Yes. They’re the same: one k and one ω. But then… Well… Conceptually, the two k’s and ω’s are different. So we write:
- pE = EE = mE, and
- pB = EB = mB.
The obvious question is: can we just add them up to find the total energy and momentum of our photon? The answer is obviously positive: E = EE + EB, p = pE + pB and m = mE + mB.
Let’s check a few things now. How does it work for the phase and group velocity of ψE and ψB? Simple:
- vg = ∂ωE/∂kE = ∂[EE/ħ]/∂[pE/ħ] = ∂EE/∂pE = ∂pE/∂pE = 1
- vp = ωE/kE = (EE/ħ)/(pE/ħ) = EE/pE = pE/pE = 1
So we’re fine, and you can check the result for ψB by substituting B for the E subscripts. To sum it all up, what we’ve got here is the following:
- We can think of a photon having some energy that’s equal to E = p = m (assuming c = 1), but that energy would be split up in an electric and a magnetic wavefunction respectively: ψE and ψB.
- Schrödinger’s equation applies to both wavefunctions, but the E, p and m in those two wavefunctions are the same and not the same: their numerical value is the same (pE =EE = mE = pB =EB = mB), but they’re conceptually different. They must be: if not, we’d get a phase and group velocity for the wave that doesn’t make sense.
Of course, the phase and group velocity for the sum of the ψE and ψB waves must also be equal to c. This is obviously the case, because we’re adding waves with the same phase and group velocity c, so there’s no issue with the dispersion relation.
So let’s insert those pE = EE = mE = pB = EB = mB values in the two wavefunctions. For ψE, we get:
ψE = ei(kEx − ωEt) = ei[(pE/ħ)·x − (EE/ħ)·t]
You can do the calculation for ψB yourself. Let’s simplify our life a little bit and assume we’re using Planck units, so ħ = 1, and so the wavefunction simplifies to ψE = ei·(pE·x − EE·t). We can now add the components of E and B using the summation formulas for sines and cosines:
1. By + Ey = cos(pB·x − EB·t + π/2) + cos(pE·x − EE·t) = 2·cos[(p·x − E·t + π/2)/2]·cos(π/4) = √2·cos(p·x/2 − E·t/2 + π/4)
2. Bz + Ez = sin(pB·x − EB·t+π/2) + sin(pE·x − EE·t) = 2·sin[(p·x − E·t + π/2)/2]·cos(π/4) = √2·sin(p·x/2 − E·t/2 + π/4)
Interesting! We find a composite wavefunction for our photon which we can write as:
E + B = ψE + ψB = E + i·E = √2·ei(p·x/2 − E·t/2 + π/4) = √2·ei(π/4)·ei(p·x/2 − E·t/2) = √2·ei(π/4)·E
What a great result! It’s easy to double-check, because we can see the E + i·E = √2·ei(π/4)·E formula implies that 1 + i should equal √2·ei(π/4). Now that’s easy to prove, either geometrically (just do a drawing) or formally: √2·ei(π/4) = √2·cos(π/4) + i·√2·sin(π/4) = √2·(√2/2) + i·√2·(√2/2) = 1 + i. We’re bang on! 🙂
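If you want to double-check this numerically, you can do so in two lines of Python (my own sketch here – not part of the argument itself – using the standard cmath module):

```python
import cmath

lhs = 1 + 1j                                        # 1 + i
rhs = cmath.sqrt(2) * cmath.exp(1j * cmath.pi / 4)  # √2·e^(i·π/4)
assert abs(lhs - rhs) < 1e-12                       # same point in the complex plane
```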
We can double-check once more, because we should get the same from adding E and B = i·E, right? Let’s try:
E + B = E + i·E = cos(pE·x − EE·t) + i·sin(pE·x − EE·t) + i·cos(pE·x − EE·t) − sin(pE·x − EE·t)
= [cos(pE·x − EE·t) − sin(pE·x − EE·t)] + i·[sin(pE·x − EE·t) + cos(pE·x − EE·t)]
Indeed, we can see we’re going to obtain the same result, because the −sinθ in the real part of our composite wavefunction is equal to cos(θ + π/2), and the cosθ in its imaginary part is equal to sin(θ + π/2). So the sum above is the same sum of cosines and sines that we did already.
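Again, you can verify this numerically if you want. The little Python sketch below (my own illustration, not part of the argument) samples the E + i·E = √2·ei(π/4)·E identity at a few arbitrary phases:

```python
import cmath

# Check psi_E + i·psi_E = √2·e^(i·π/4)·psi_E at a few arbitrary phases θ = p·x − E·t
for theta in (0.0, 0.7, 2.0, -3.1):
    psi_E = cmath.exp(1j * theta)   # electric wavefunction
    psi_B = 1j * psi_E              # magnetic wavefunction, B = i·E
    combined = cmath.sqrt(2) * cmath.exp(1j * cmath.pi / 4) * psi_E
    assert abs((psi_E + psi_B) - combined) < 1e-12
```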
So our electromagnetic wavefunction, i.e. the wavefunction for the photon, is equal to:
ψ = ψE + ψB = √2·ei(p·x/2 − E·t/2 + π/4) = √2·ei(π/4)·ei(p·x/2 − E·t/2)
What about the √2 factor in front, and the π/4 term in the argument itself? Not sure. It must have something to do with the way the magnetic force works, which is not like the electric force. Indeed, remember the Lorentz formula: the force on some unit charge (q = 1) will be equal to F = E + v×B. So… Well… We’ve got another cross-product here and so the geometry of the situation is quite complicated: it’s not like adding two forces F1 and F2 to get some combined force F = F1 + F2.
In any case, we need the energy, and we know that it’s proportional to the square of the amplitude, so… Well… We’re spot on: the square of the √2 factor in the √2·cos product and √2·sin product is 2, so that’s twice… Well… What? Hold on a minute! We’re actually taking the absolute square of the E + B = ψE + ψB = E + i·E = √2·ei(p·x/2 − E·t/2 + π/4) wavefunction here. Is that legal? I must assume it is—although… Well… Yes. You’re right. We should do some more explaining here.
We know that we usually measure the energy as some definite integral, from t = 0 to some other point in time, or over the cycle of the oscillation. So what’s the cycle here? Our combined wavefunction can be written as √2·ei(p·x/2 − E·t/2 + π/4) = √2·ei(θ/2 + π/4), so a full cycle would correspond to θ going from 0 to 4π here, rather than from 0 to 2π. So that explains the √2 factor in front of our wavefunction.
It’s quite fascinating, isn’t it? A natural question that pops up, of course, is whether or not it can explain the different behavior of bosons and fermions. Indeed, we know that:
- The amplitudes of identical bosonic particles interfere with a positive sign, so we have Bose-Einstein statistics here. As Feynman writes it: (amplitude direct) + (amplitude exchanged).
- The amplitudes of identical fermionic particles interfere with a negative sign, so we have Fermi-Dirac statistics here: (amplitude direct) − (amplitude exchanged).
I’ll think about it. I am sure it’s got something to do with that B= i·E formula or, to put it simply, with the fact that, when bosons are involved, we get two wavefunctions (ψE and ψB) for the price of one. The reasoning should be something like this:
I. For a massless particle (i.e. a zero-mass fermion), our wavefunction is just ψ = ei(p·x − E·t). So we have no √2 or √2·ei(π/4) factor in front here. So we can just add any number of them – ψ1 + ψ2 + ψ3 + … – and then take the absolute square of the amplitude to find a probability density, and we’re done.
II. For a photon (i.e. a zero-mass boson), our wavefunction is √2·ei(π/4)·ei(p·x − E·t)/2, which – let’s introduce a new symbol – we’ll denote by φ, so φ = √2·ei(π/4)·ei(p·x − E·t)/2. Now, if we add any number of these, we get a similar sum but with that √2·ei(π/4) factor in front, so we write: φ1 + φ2 + φ3 + … = √2·ei(π/4)·(ψ1 + ψ2 + ψ3 + …). If we take the absolute square now, we’ll see the probability density will be equal to twice the density for the ψ1 + ψ2 + ψ3 + … sum, because
|√2·ei(π/4)·(ψ1 + ψ2 + ψ3 + …)|2 = |√2·ei(π/4)|2·|ψ1 + ψ2 + ψ3 + …|2 = 2·|ψ1 + ψ2 + ψ3 + …|2
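A quick numerical sanity check of that factor of two (Python, my own sketch – the component waves are arbitrary):

```python
import cmath

# The √2·e^(i·π/4) factor doubles the probability density
psis = [cmath.exp(1j * t) for t in (0.1, 0.5, 1.3)]   # arbitrary component waves
s = sum(psis)
prefactor = cmath.sqrt(2) * cmath.exp(1j * cmath.pi / 4)
assert abs(abs(prefactor * s)**2 - 2 * abs(s)**2) < 1e-12
```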
So… Well… I still need to connect this to Feynman’s (amplitude direct) ± (amplitude exchanged) formula, but I am sure it can be done.
Now, we haven’t tested the complete √2·ei(π/4)·ei(p·x − E·t)/2 wavefunction. Does it respect Schrödinger’s ∂ψ/∂t = i·(1/m)·∇2ψ or, including the 1/2 factor, the ∂ψ/∂t = i·[1/(2m)]·∇2ψ equation? [Note we assume, once again, that ħ = 1, so we use Planck units once more.] Let’s see. We can calculate the derivatives as:
- ∂ψ/∂t = −√2·ei(π/4)·ei∙(p·x − E·t)/2·(i·E/2)
- ∇2ψ = ∂2[√2·ei(π/4)·ei∙(p·x − E·t)/2]/∂x2 = ∂[√2·ei(π/4)·ei∙(p·x − E·t)/2·(i·p/2)]/∂x = −√2·ei(π/4)·ei∙(p·x − E·t)/2·(p2/4)
So Schrödinger’s equation becomes:
−√2·ei(π/4)·ei∙(p·x − E·t)/2·(i·E/2) = −i·(1/m)·√2·ei(π/4)·ei∙(p·x − E·t)/2·(p2/4) ⇔ E/2 = p2/(4m) ⇔ (with E = p = m) 1/2 = 1/4!?
That’s funny! It doesn’t work! The E and m and p2 are OK because we’ve got that E = m = p equation, but we’ve got problems with yet another factor 2. It only works when we use the 2/m coefficient in Schrödinger’s equation.
So… Well… There’s no choice. That’s what we’re going to do. The Schrödinger equation for the photon is ∂ψ/∂t = i·(2/m)·∇2ψ!
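You can verify the whole derivation symbolically. The sketch below (Python with sympy – my own check, not part of the original argument) plugs the photon wavefunction into the ∂ψ/∂t = i·(2/m)·∇2ψ equation, using natural units and E = p = m:

```python
import sympy as sp

# Natural units: hbar = c = 1, and E = p = m for the photon
x, t, m = sp.symbols('x t m', positive=True)
E = p = m

# The composite photon wavefunction we found above
psi = sp.sqrt(2) * sp.exp(sp.I * sp.pi / 4) * sp.exp(sp.I * (p * x - E * t) / 2)

lhs = sp.diff(psi, t)                       # time derivative
rhs = sp.I * (2 / m) * sp.diff(psi, x, 2)   # i·(2/m)·∇²ψ (one dimension)
assert sp.simplify(lhs - rhs) == 0          # the equation checks out
```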
This is all great, and very fundamental stuff! Let’s now move on to Schrödinger’s actual equation, i.e. the ∂ψ/∂t = i·(ħ/2m)·∇2ψ equation.
V. The wavefunction for spin-1/2 particles
Schrödinger’s original equation – with the 1/2 factor – is not wrong, of course! It can’t be wrong, as it correctly explains the precise shape of electron orbitals! So let’s think about the wavefunction that makes Schrödinger’s original equation work. Leaving the Vψ term out, that equation is:
∂ψ/∂t = i·(ħ/2m)·∇2ψ
Hence, if our elementary wavefunction is a·e−i·[E·t − p∙x]/ħ, then the derivatives become:
- ∂ψ/∂t = −a·i·(E/ħ)·e−i∙[E·t − p∙x]/ħ
- ∇2ψ = ∂2[a·e−i∙[E·t − p∙x]/ħ]/∂x2 = ∂[a·i·(p/ħ)·e−i∙[E·t − p∙x]/ħ]/∂x = −a·(p2/ħ2)·e−i∙[E·t − p∙x]/ħ
So the ∂ψ/∂t = i·(ħ/2m)·∇2ψ equation now becomes:
−a·i·(E/ħ)·e−i∙[E·t − p∙x]/ħ = −i·(ħ/2m)·a·(p2/ħ2)·e−i∙[E·t − p∙x]/ħ ⇔ E = p2/2m
That’s a very weird condition, because we can re-write p2/2m as m·v2/2, and so we find that our wavefunction is a solution for Schrödinger’s equation if, and only if, E = m·v2/2. But that’s a weird formula, as it captures the kinetic energy only—and, even then, it should be written as m0·v2/2. But we know E = m·c2. So what’s going on here? We must be doing something wrong, but what?
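Before we go hunting for the error, we can at least verify the algebra itself. The sketch below (Python with sympy, my own check) plugs the elementary wavefunction into Schrödinger’s equation and recovers the E = p2/2m condition:

```python
import sympy as sp

x, t = sp.symbols('x t', real=True)
a, E, p, m, hbar = sp.symbols('a E p m hbar', positive=True)

# The elementary wavefunction a·e^(−i·(E·t − p·x)/ħ)
psi = a * sp.exp(-sp.I * (E * t - p * x) / hbar)

# Plug it into ∂ψ/∂t = i·(ħ/2m)·∇²ψ and divide both sides by psi
lhs = sp.simplify(sp.diff(psi, t) / psi)
rhs = sp.simplify(sp.I * (hbar / (2 * m)) * sp.diff(psi, x, 2) / psi)

# Solving the resulting scalar condition for E gives E = p²/(2m)
E_condition = sp.solve(sp.Eq(lhs, rhs), E)[0]
assert sp.simplify(E_condition - p**2 / (2 * m)) == 0
```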
Let’s start with the basics by simplifying the situation first: we’ll, once again, assume a fermion with zero rest mass. I know you think we will never come back to the non-zero rest-mass particle so as to answer the deep question here, but I promise you we will. Don’t worry.
The zero-mass fermion
If we do not want to doubt the formula for the elementary wavefunction – noting that the E = p2/2m comes out of it when combining it with Schrödinger’s equation, which we do not want to doubt at this point either – then… Well… The only thing we did was substitute m·v for p. So that must be wrong. Perhaps we’re using the wrong momentum formula. [Of course, we didn’t: we used the wrong mass concept, as I’ll explain in a moment. But just go along with the logic here for now.]
The wrong momentum formula?
Well… Yes. After all, we’ve used the formula for linear momentum, and we know an electron (or any spin-1/2 particle) has angular momentum as well. Let’s try the following: if we allow for two independent directions of motion (i.e. two degrees of freedom), then the equipartition theorem tells us that energy should be equally divided over two. Assuming the smallest possible value for the mass (m) in the linear momentum formula (p = m·v) and for the moment of inertia (I) in the angular momentum formula (L = I·ω) is equal to ħ/2, and also assuming that v = c = ω = 1 (so we are using equivalent time and distance units), we could envisage that the total momentum could be m·v + I·ω = ħ/2 + ħ/2 = ħ. Let’s denote that by a new p, which is the sum of the old linear momentum and the angular momentum. Hence, using natural units, we get:
p = 2m
It’s a weird formula, so let’s try to find it in some other way. The Schrödinger equation is equivalent to writing:
- Re(∂ψ/∂t) = −(ħ/2m)·Im(∇2ψ) ⇔ ω·sin(kx − ωt) = k2·(ħ/2m)·sin(kx − ωt)
- Im(∂ψ/∂t) = (ħ/2m)·Re(∇2ψ) ⇔ ω·cos(kx − ωt) = k2·(ħ/2m)·cos(kx − ωt)
So ω = k2·(ħ/2m). At the same time, ω/k = vp, i.e. the phase velocity of the wave. Hence, we find that:
vp = ω/k = k2·(ħ/2m)/k = k·ħ/(2m) = (p/ħ)·(ħ/2m) = p/(2m), and vp = c = 1 for our zero-mass particle, so p = 2m
That’s sweet. Now we can use the E = p2/2m that we got when combining our elementary wavefunction with Schrödinger’s equation to get the following:
E = p2/2m and p = 2m ⇔ E = (2m)2/2m = 2m
E = p = 2m
This looks weird but comprehensible. Note that the phase velocity of our wave is equal to c = 1 as vp = p/2m = 2m/2m = 1. What about the group velocity, i.e. vg = ∂ω/∂k? Let’s calculate it:
vg = ∂ω/∂k = ∂[k2·(ħ/2m)]/∂k = 2k·(ħ/2m) = 2·(p/ħ)·(ħ/2m) = p/m = m·v/m = v = c = 1
That’s nice, because it’s what we wanted to find. If the group velocity would not equal the classical velocity of our particle, then our model would not make sense.
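The phase-versus-group velocity arithmetic is easy to verify symbolically (Python with sympy – my own sketch, not part of the argument):

```python
import sympy as sp

k, m, hbar = sp.symbols('k m hbar', positive=True)
omega = hbar * k**2 / (2 * m)   # the dispersion relation we just derived

v_phase = sp.simplify(omega / k)           # = ħ·k/(2m) = p/(2m)
v_group = sp.simplify(sp.diff(omega, k))   # = ħ·k/m = p/m

assert sp.simplify(v_group - 2 * v_phase) == 0  # group velocity = 2 × phase velocity
```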
Now, it’s nice we get that p = 2m equation when calculating the phase velocity, but… Well… Think about it: there’s something wrong here: if vp = p/(2m), and p = m·v, then this formula cannot be correct for fermions that actually do have some rest mass, because it implies vp = m·v/(2m) = v/2. That doesn’t make sense, does it? Why not?
Well… I’ll answer that question in the next section. We’re actually mixing stuff here that we shouldn’t be mixing. 🙂
The actual fermion (non-zero mass)
In what I wrote above, I showed that Schrödinger’s wave equations for spin-zero, spin-1/2, and spin-one particles in free space differ from each other by a factor of two:
- For particles with zero spin, we write: ∂ψ/∂t = i·(ħ/m)·∇2ψ. We get this by multiplying the ħ/(2m) factor in Schrödinger’s original wave equation – which applies to spin-1/2 particles (e.g. electrons) only – by two. Hence, the correction that needs to be made is very straightforward.
- For fermions (spin-1/2 particles), Schrödinger’s equation is what it is: ∂ψ/∂t = i·[ħ/(2m)]·∇2ψ.
- For spin-1 particles (photons), we have ∂ψ/∂t = i·(2ħ/m)·∇2ψ, so here we multiply the ħ/m factor in Schrödinger’s wave equation for spin-zero particles by two, which amounts to multiplying Schrödinger’s original coefficient by four.
We simplified the analysis by assuming our particles had zero rest mass, and we found that we were basically modeling an energy flow when developing the model for the spin-zero particle—because spin-zero particles with zero rest mass don’t exist.
In contrast, the model for the spin-one particle is a model that works for the photon—an actual bosonic particle. To be precise, we derived the photon wavefunction from Maxwell’s equations in free space, and then found the wavefunction is a solution for the ∂ψ/∂t = i·(2ħ/m)·∇2ψ equation only.
For a real-life electron, we had a problem. If our elementary wavefunction is a·e−i·[E·t − p∙x]/ħ, then the derivatives in Schrödinger’s wave equation become:
- ∂ψ/∂t = −a·i·(E/ħ)·e−i∙[E·t − p∙x]/ħ
- ∇2ψ = ∂2[a·e−i∙[E·t − p∙x]/ħ]/∂x2 = ∂[a·i·(p/ħ)·e−i∙[E·t − p∙x]/ħ]/∂x = −a·(p2/ħ2)·e−i∙[E·t − p∙x]/ħ
So the ∂ψ/∂t = i·(ħ/2m)·∇2ψ equation now becomes:
−a·i·(E/ħ)·e−i∙[E·t − p∙x]/ħ = −i·(ħ/2m)·a·(p2/ħ2)·e−i∙[E·t − p∙x]/ħ ⇔ E = p2/2m
That E = p2/2m is a very weird condition, because we can re-write p2/2m as m·v2/2, and so we find that our wavefunction is a solution for Schrödinger’s equation if, and only if, E = m·v2/2. But that’s a weird formula, as it captures the kinetic energy only—and, even then, it should be written as m0·v2/2. But we know E = m·c2. So what’s going on here? We must be doing something wrong, but what?
The true answer is: Schrödinger did use a different energy concept when developing his equation. He used the following formula:
E = meff·v2/2
What’s meff? It’s referred to as the effective mass, and it has nothing to do with the real mass, or the true mass. In fact, the effective mass, in units of the true mass, can be anything between zero and infinity. So that resolves the paradox. I know you won’t be happy with the answer, but that’s what it is. 😦 I’ll come back to the question, however.
Let’s do something more interesting now. Let’s calculate the phase velocity. It’s easy to see the phase velocity will be equal to:
vp = ω/k = (E/ħ)/(p/ħ) = E/p
Using natural units, that becomes:
vp = E/p = mv/(mv·v) = 1/v
Interesting! The phase velocity is the reciprocal of the classical velocity! This implies it is always superluminal, ranging from vp = ∞ to vp = c = 1 for v going from 0 to 1 = c, as illustrated in the simple graph below.
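A quick tabulation of vp = 1/v (Python, my own sketch, standing in for the graph) makes the point:

```python
# Phase velocity v_p = 1/v in natural units (c = 1): superluminal for all v < 1
phase_velocity = {v: 1 / v for v in [0.1, 0.25, 0.5, 0.9, 1.0]}
for v, v_p in phase_velocity.items():
    print(f"v = {v:4.2f}  ->  v_p = {v_p:5.2f}")
```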
However, we are, of course, interested in the group velocity, as the group velocity should correspond to the classical velocity of the particle. The group velocity of a composite wave is given by the vg = ∂ω/∂k formula. Of course, that formula assumes an unambiguous relation between the temporal and spatial frequency of the component waves, which we may want to denote as ωn and kn, with n = 1, 2, 3,… However, we will not use the index as the context makes it quite clear what we are talking about.
The relation between ωn and kn is known as the dispersion relation, and one particularly nice way to calculate ω as a function of k is to distinguish the real and imaginary parts of the ∂ψ/∂t = i·[ħ/(2m)]·∇2ψ wave equation and, hence, re-write it as a pair of two equations:
- Re(∂ψ/∂t) = −[ħ/(2m)]·Im(∇2ψ) ⇔ ω·sin(kx − ωt) = k2·[ħ/(2m)]·sin(kx − ωt)
- Im(∂ψ/∂t) = [ħ/(2m)]·Re(∇2ψ) ⇔ ω·cos(kx − ωt) = k2·[ħ/(2m)]·cos(kx − ωt)
Both equations imply the following dispersion relation:
ω = ħ·k2/(2m)
We can now calculate vg = ∂ω/∂k as:
vg = ∂ω/∂k = ∂[ħ·k2/(2m)]/∂k = 2ħk/(2m) = ħ·(p/ħ)/m = p/m = m·v/m = v
That’s nice, because it’s what we wanted to find. If the group velocity would not equal the classical velocity of our particle, then our model would not make sense.
Now, let’s have another look at the energy concept that’s implicit in Schrödinger’s equation. We said he used the E = meff·v2/2 formula, but let’s look at the energy concept once more. We said the phase velocity of our wavefunction was equal to vp = E/p. Now p = m·v, but it’s only when we’re modeling zero-mass particles that v = vp. So, for non-zero rest-mass particles, the energy concept that’s implicit in the de Broglie relations and the wavefunction is equal to:
E = vp·p = mv·vp·v ≠ m·v2 for v ≠ vp
Now, we just calculated that vp = 1/v, so we can write E = vp·p = mv·vp·v as E = m. So… Well… That’s consistent at least! However, that leads one to conclude that the 1/2 factor in Schrödinger’s equation should not be there. Indeed, if we’d drop it, and we’d calculate those derivatives once more, and substitute them, we’d get a condition that makes somewhat more sense:
−a·i·(E/ħ)·e−i∙[E·t − p∙x]/ħ = −i·(ħ/m)·a·(p2/ħ2)·e−i∙[E·t − p∙x]/ħ ⇔ E = p2/m = m·v2
This comes naturally out of the p = mv·v = m·v and E = m·c2 equations, which are both relativistically correct, and, for v = c, this gives us the E = m·c2 equation. It’s still a weird equation though, as we do not get the E = m·c2 equation when v ≠ c. But then… Well… So be it. What we get is an energy formula that says the total energy is twice the kinetic energy. Or… Well… Not quite, because the classical kinetic energy formula is m0·v2/2, not mv·v2/2. Now, you’ll have to admit that fits much better with our θ = m0·t’ and m0·c2 energy formula for those two springs, doesn’t it?
The whole discussion makes me think of an inconsistency in that relativistic definition of the kinetic energy. We said that, for particles with zero rest mass, all of the energy was kinetic, and we wrote it as:
K.E. = mv·c2 − m0·c2 = mv·c2 = mc·c2 = m·c2 = E
Because we know that the energy of a photon is finite (like 2 or 3 eV for visible light, or like 100,000 eV for gamma-rays), we know mc must have a finite value too, but how can some mass moving at the speed of light be finite? It’s one of those paradoxes in relativity theory. The answer, of course, is that we only see some mass moving (a photon) in our reference frame: the photon in its own space is just a wave, and it’s frozen—so to speak—in its own (inertial) frame of reference. However, while that’s (probably) the best answer we can give at this point, it’s not very satisfactory. At this point, I am thinking of a quote that I like a lot. It’s from an entirely different realm of experience – much less exact than math or physics:
“We are in the words, and at the same time, apart from them. The words spin out, spin us out, over a void. There, somewhere between us, some words form some answer for some time, allowing us to live more fully in the forgetting face of nonexistence, in the dissolving away of each other.” (Robert Langan, in Jeremy Safran (2003), Psychoanalysis and Buddhism: an Unfolding Dialogue)
As all of physics is expressed in the language of math, we should substitute “the math” for the “words” in that quote above: it’s the math that spins us out now – not the words – over some void. And the math does give us some answer, for some time at least. 🙂 But… Then… Well… The math also gives us new questions to solve, it seems. 🙂
I must assume the relativistic version of Schrödinger’s equation, i.e. the Klein-Gordon equation, does away with these inconsistencies. But that’s even more advanced stuff than what we’ve been dealing with so far. And then it’s an equation which does not correctly give us the electron orbitals, so we don’t know what it describes—exactly.
Let me remind you, at this point, of the relativistically correct relation between E, m, p and v, which is the following one:
p = (E/c2)·v or, using natural units, p = v·E
Now, if the m (or 2meff) factor in the ħ/m (or ħ/2meff) diffusion constant is to be interpreted as the (equivalent) mass of the total energy, so E = m (expressed in natural units), then the condition that comes out of the Schrödinger equation is:
E2 = p2 ⇔ p = E
So we’ve got that old E = p = m equation again. And how can we reconcile it with the relativistically correct p = v·E? Well… All is relative, so whose v are we talking about here? Well… I am sorry, but the answer to that is rather unambiguous: our v, as we measure it. Then the question becomes: what v? Phase velocity? Group velocity? Again, the answer is unambiguous: we’re talking the group velocity, which corresponds to the classical velocity of our particle.
So… When everything is said and done, there is only one conclusion: the use of that meff factor seems to ensure we’ve got the right formula, and we know that – for our wavefunction to be a solution to the Schrödinger equation – the following condition must be satisfied:
E·(2meff) = p2 = m2·v2
Now, noting that the E = m relation holds (all is in natural units once more), that condition only works if:
m·(2meff) = m2·v2 ⇔ meff = m·v2/2
As mentioned above, I’d rather drop that 1/2 factor and, hence, re-define meff as two times the old meff, in which case we get:
meff = Eeff = m·v2 = E·v2
It’s an ugly solution, at first sight, but it makes it all come out alright. And, in fact, it’s not all that ugly: the effective mass only differs from the classical mass because of a correction factor, which is equal to v2, so that’s the square of some value between 0 and 1 (as v is a relative velocity here), so it’s some factor between 0 and 1 itself. Let me quickly insert the graph:
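Or, rather than a graph, here’s a quick tabulation of that v2 correction factor (Python, my own sketch):

```python
# The m_eff/m = v² correction factor, for relative velocities between 0 and 1
correction = {v: v**2 for v in [0.0, 0.2, 0.5, 0.8, 1.0]}
for v, f in correction.items():
    print(f"v = {v:3.1f}  ->  m_eff/m = {f:4.2f}")
```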
Interesting! Let’s double-check it by substituting those derivatives we calculated into Schrödinger’s equation. It now gives us the following condition:
−a·i·(E/ħ)·e−i∙[E·t − p∙x]/ħ = −i·(ħ/meff)·a·(p2/ħ2)·e−i∙[E·t − p∙x]/ħ
⇔ E·meff = p2 ⇔ m·m·v2 = m2·v2 = p2
It works. You’ll still wonder, of course: what is that effective mass? Well… As it appears in the diffusion constant, we can look at it as some property of the medium. In free space, it just becomes m, as v becomes c. However, in a crystal lattice (which is the context for which Schrödinger developed his equation), we get something entirely different. What makes a crystal lattice different? The presence of other particles and/or charges which, in turn, gives us force fields and potential barriers we need to break through. So… Well… It’s all very strange but, when everything is said and done, the whole story makes sense. 🙂 For some time at least. 🙂
OK. We’re done. Let me just add something on the superposition principle here.
The superposition principle re-visited
As you can see from our calculations for the group velocity of our wave for spin-1/2 particles, the 1/2 factor in the Schrödinger equation ensures we’ve got a ‘nice’ dispersion relation, as it also includes the 1/2 factor – ω = k2/(2m) – and that factor, in turn, cancels the 2 that comes down when doing that vg = ∂ω/∂k = ∂[k2/(2m)]/∂k = 2k/(2m) = k/m derivation. And then we do the p = m·v substitution and all is wonderful: we find that the group velocity corresponds to the classical particle velocity:
vg = k/m = p/m = m·v/m = v
But we’re still talking some wave here whose amplitude is the same all over spacetime, right? So how can we localize it?
Think about it: for our zero-mass particles, there was no real need to resort to this funny business of adding waves to localize it, because we did not need to localize it. Why not? Well… It’s here that quantum mechanics and relativity theory come together in what might well be the most logical and absurd conclusion ever:
As an outside observer, we’re going to see all those zero-mass particles as point objects whizzing by because of the relativistic length contraction. So their wavefunction is only all over spacetime in their proper space and time, not in ours!
I know it will take you some time to think about this, and you may actually refuse to believe this, but… Then… Well… I’ve been thinking about this for years, and I’ve come to the conclusion it’s the only way out: it must be true.
So we don’t need to add waves for those zero-mass particles. In other words, we can have definite values for E and so there’s no need for an uncertainty principle here. Furthermore, if we have definite values for E, we’ll also have definite values for p and m and… Well… Just note it: we only need one wave for our theoretical spin-zero particle and our photon. No superposition. No Fourier analysis! 🙂
So let’s now get back to our spin-1/2 particles, and spin-1/2 particles with actual mass, like electrons. We can get a localized wave in two ways:
I. We can introduce the Uncertainty Principle, so we allow some uncertainty for both E and p. This uncertainty is fundamental, because it’s directly linked to the agreed-upon hypothesis that, in physics, we have a quantum of action:
ħ = 1.0545718×10−34 N·m·s
So E and p can vary, and the order of magnitude of that variation is given by the Uncertainty Relations:
Δx·Δp ≥ ħ/2 and ΔE·Δt ≥ ħ/2
That’s a very tiny magnitude, but then E and p are measured in terms of ħ, so it’s actually very substantial! [One needs to go through an actual exercise to appreciate this, like the calculation of electron orbitals for the hydrogen atom, which we’ll discuss in the next section.]
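Here’s one such quick exercise (Python – the 1% figure and the velocity are my own illustrative assumptions, not from the text): suppose we know an electron’s momentum to within 1% of the momentum it has at the 2,187 km/s orbital velocity we’ll encounter below. The minimum position uncertainty then comes out at a few nanometer:

```python
# My own illustrative numbers: an electron whose momentum is known
# to within 1% of p = m·v at v = 2,187 km/s
hbar = 1.0545718e-34        # quantum of action, N·m·s
m_e = 9.10938e-31           # electron rest mass, kg
v = 2.187e6                 # velocity, m/s
delta_p = 0.01 * m_e * v    # momentum uncertainty, kg·m/s

delta_x = hbar / (2 * delta_p)   # minimum position uncertainty, from Δx·Δp ≥ ħ/2
print(f"delta_x >= {delta_x:.2e} m")   # on the order of a few nanometer
```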
II. We can insert potential energy. Remember that the E in the ψ(θ) = ψ(x, t) = a·e−iθ = a·e−i(E·t − p∙x)/ħ wavefunction consists of:
- The rest mass E0;
- The kinetic energy mv·v2/2 = (mv·v)·(mv·v)/(2mv) = p2/(2m);
- The potential energy V.
We only discussed particles in free space so far, so no force fields. No potential. The whole analysis changes if our particles are actually traveling in a force field, as an electron does when it’s in some orbital around the nucleus. But here it’s better to work through examples, and so that’s what we’ll start doing now.
[Note we’re using a non-relativistic formula for the kinetic energy here but, as mentioned above, the velocity of an electron in an orbital is not relativistic. Indeed, you’ll remember that, when we were writing about natural units and the fine-structure constant, we calculated its (classical) velocity as a mere 2,187 km per second.]
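That 2,187 km/s figure is just the fine-structure constant α times the speed of light, which is easy to check (Python, my own one-liner):

```python
alpha = 1 / 137.036     # fine-structure constant (approximate value)
c = 299792.458          # speed of light, km/s
v = alpha * c           # the electron's classical orbital velocity
print(f"v = {v:.1f} km/s")   # ≈ 2,187.7 km/s
```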
The solution for actual electron orbitals
As mentioned above, our easy E = p = 2m identity does not show the complexities of the real world. In fact, the derivation of Schrödinger’s equation is not very rigorous. Richard Feynman – who knows the historical argument much better than I do – actually says that some of the arguments that Schrödinger used were false. He also says that does not matter, stating that “the only important thing is that the ultimate equation gives a correct description of nature.” Indeed, if you click on the link to my post on that argument, you’ll see I also doubt if the energy concepts that are used in that argument are the right ones. In addition, some of the simplifications are as horrendous as the ones I used above. 🙂
So… Well… It’s actually quite surprising that Schrödinger’s derivation, “based on some heuristic arguments and some brilliant intuitive guesses”, actually does give a formula we can use to calculate electron orbitals. So how does that work?
Well… The wave equation above described electrons in free space. Actual electrons are always in some orbit—in a force field around some nucleus. So we have to add the Vψ term and solve the new equation? But… Well… I am not going to give you those calculations here because you can find them elsewhere: check my post on it or, better still, read the original. 🙂