The energy and 1/2 factor in Schrödinger’s equation

Schrödinger’s equation, for a particle moving in free space (so we have no external force fields acting on it, so V = 0 and, therefore, the Vψ term disappears) is written as:

∂ψ(x, t)/∂t = i·(1/2)·(ħ/meff)·∇²ψ(x, t)

We already noted and explained the structural similarity with the ubiquitous diffusion equation in physics:

∂φ(x, t)/∂t = D·∇²φ(x, t) with x = (x, y, z)

The big difference between the wave equation and an ordinary diffusion equation is that the wave equation gives us two equations for the price of one: ψ is a complex-valued function, with a real and an imaginary part which, despite their name, are both equally fundamental, or essential. Whatever word you prefer. 🙂 That’s also what the presence of the imaginary unit (i) in the equation tells us. But for the rest it’s the same: the diffusion constant (D) in Schrödinger’s equation is equal to (1/2)·(ħ/meff).
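To get a feel for the size of that diffusion constant, here is a minimal sketch that evaluates (1/2)·(ħ/meff) for a free electron. The constant values are the CODATA 2018 figures, and the function name is my own choice for illustration, not something from the derivation itself.

```python
# Physical constants (CODATA 2018 values, SI units).
HBAR = 1.054571817e-34         # reduced Planck constant, J·s
M_ELECTRON = 9.1093837015e-31  # free-electron mass, kg


def diffusion_constant(m_eff: float, with_half: bool = True) -> float:
    """Magnitude of the coefficient in front of the Laplacian, in m²/s.

    with_half=True  gives (1/2)·(ħ/m_eff), the conventional form.
    with_half=False gives ħ/m_eff, the form without the 1/2 factor.
    """
    factor = 0.5 if with_half else 1.0
    return factor * HBAR / m_eff


d_half = diffusion_constant(M_ELECTRON)
d_full = diffusion_constant(M_ELECTRON, with_half=False)
print(f"(1/2)·ħ/m = {d_half:.3e} m²/s, ħ/m = {d_full:.3e} m²/s")
```

The two conventions differ by exactly a factor of two, which is all the rest of this post is about.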

Why the 1/2 factor? It’s ugly. Think of the following: If we bring the (1/2)·(ħ/meff) to the other side, we can write it as meff/(ħ/2). The ħ/2 now appears as a scaling factor in the diffusion constant, just like ħ does in the de Broglie equations: ω = E/ħ and k = p/ħ, or in the argument of the wavefunction: θ = (E·t − p∙x)/ħ. Planck’s constant is, effectively, a physical scaling factor. As a physical scaling constant, it usually does two things:

  1. It fixes the numbers (so that’s its function as a mathematical constant).
  2. As a physical constant, it also fixes the physical dimensions. Note, for example, how the 1/ħ factor in ω = E/ħ and k = p/ħ ensures that the ω·t = (E/ħ)·t and k·x = (p/ħ)·x terms in the argument of the wavefunction are both expressed as some dimensionless number, so they can effectively be added together. Physicists don’t like adding apples and oranges.

The question is: why did Schrödinger use ħ/2, rather than ħ, as a scaling factor? Let’s explore the question.

The 1/2 factor

We may want to think that 1/2 factor just echoes the 1/2 factor in the Uncertainty Principle, which we should think of as a pair of relations: σx·σp ≥ ħ/2 and σE·σt ≥ ħ/2. However, the 1/2 factor in those relations only makes sense because we chose to equate the fundamental uncertainty (Δ) in x, p, E and t with the mathematical concept of the standard deviation (σ), or the half-width, as Feynman calls it in his wonderfully clear exposé on it in one of his Lectures on quantum mechanics (for a summary with some comments, see my blog post on it). We may just as well choose to equate Δ with the full-width of those probability distributions we get for x and p, or for E and t. If we do that, we get σx·σp ≥ ħ and σE·σt ≥ ħ.
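The standard-deviation convention can be checked numerically. The sketch below assumes a Gaussian wavefunction ψ(x) ∝ exp(−x²/(4·s²)) and natural units (ħ = 1); both are my assumptions for illustration. It only shows that, with σ as the measure of spread, the product σx·σp comes out at ħ/2 for a Gaussian; it does not settle which width is the "right" one.

```python
import math

HBAR = 1.0  # natural units: an assumption made for this sketch


def gaussian_uncertainty_product(s: float, n: int = 4001, span: float = 10.0) -> float:
    """Numerically estimate σx·σp for ψ(x) ∝ exp(−x²/(4·s²)) on a grid."""
    dx = 2 * span * s / (n - 1)
    xs = [-span * s + i * dx for i in range(n)]
    psi = [math.exp(-x * x / (4 * s * s)) for x in xs]

    norm = sum(p * p for p in psi) * dx
    mean_x2 = sum(x * x * p * p for x, p in zip(xs, psi)) * dx / norm

    # ψ is real and even, so <x> = 0 and <p> = 0.
    # <p²> = ħ²·∫|dψ/dx|² dx / norm, with a central-difference derivative.
    dpsi = [(psi[i + 1] - psi[i - 1]) / (2 * dx) for i in range(1, n - 1)]
    mean_p2 = HBAR ** 2 * sum(d * d for d in dpsi) * dx / norm

    return math.sqrt(mean_x2) * math.sqrt(mean_p2)


print(gaussian_uncertainty_product(1.0))  # close to ħ/2 in these units
```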

It’s a bit like measuring the weight of a person on an old-fashioned (non-digital) bathroom scale with 1 kg marks only: do we say this person is x kg ± 1 kg, or x kg ± 500 g? Do we take the half-width or the full-width as the margin of error? In short, it’s a matter of appreciation, and the 1/2 factor in our pair of uncertainty relations is not there because we’ve got two relations. Likewise, it’s not because I mentioned we can think of Schrödinger’s equation as a pair of relations that, taken together, represent an energy propagation mechanism that’s quite similar in its structure to Maxwell’s equations for an electromagnetic wave (as shown below), that we’d insert (or not) that 1/2 factor: either of the two representations below works. It just depends on our definition of the concept of the effective mass.

The 1/2 factor is really a matter of choice, because the rather peculiar – and flexible – concept of the effective mass takes care of it. However, we could define some new effective mass concept, by writing: meffNEW = 2∙meffOLD, and then Schrödinger’s equation would look more elegant:

∂ψ/∂t = i·(ħ/meffNEW)·∇²ψ

Now you’ll want the definition, of course! What is that effective mass concept? Feynman talks at length about it, but his exposé is embedded in a much longer and more general argument on the propagation of electrons in a crystal lattice, which you may not necessarily want to go through right now. So let’s try to answer that question by doing something stupid: let’s substitute ψ in the equation for ψ = a·e^(−i·[E·t − p∙x]/ħ) (which is an elementary wavefunction), calculate the time derivative and the Laplacian, and see what we get. The time derivative brings down a −i·(E/ħ) factor, and the Laplacian brings down a (i·p/ħ)² = −p²/ħ² factor. Hence, the ∂ψ/∂t = i·(1/2)·(ħ/meff)·∇²ψ equation becomes:

−i·a·(E/ħ)·e^(−i·[E·t − p∙x]/ħ) = −i·a·(1/2)·(ħ/meff)·(p²/ħ²)·e^(−i·[E·t − p∙x]/ħ)

⇔ E = (1/2)·p²/meff = (1/2)·(m·v)²/meff ⇔ meff = (1/2)·(m/E)·m·v²

⇔ meff = (1/c²)·(m·v²/2) = m·β²/2
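If you don’t trust the algebra, a finite-difference check is easy enough. The sketch below assumes one spatial dimension, natural units (ħ = 1), the sign convention ψ = a·e^(−i·(E·t − p·x)/ħ), and arbitrary illustrative values for p and meff; none of these numbers come from the text. It confirms that the elementary wavefunction solves the equation when E = p²/(2·meff).

```python
import cmath

HBAR = 1.0   # natural units (assumption for this sketch)
M_EFF = 2.0  # arbitrary illustrative effective mass
P = 3.0      # arbitrary illustrative momentum
E = P ** 2 / (2 * M_EFF)  # the relation the substitution should yield


def psi(x: float, t: float, a: float = 1.0) -> complex:
    """Elementary wavefunction a·exp(−i·(E·t − p·x)/ħ) in one dimension."""
    return a * cmath.exp(-1j * (E * t - P * x) / HBAR)


def residual(x: float, t: float, h: float = 1e-4) -> float:
    """|∂ψ/∂t − i·(1/2)·(ħ/m_eff)·∂²ψ/∂x²| using central differences."""
    dpsi_dt = (psi(x, t + h) - psi(x, t - h)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t) - 2 * psi(x, t) + psi(x - h, t)) / h ** 2
    return abs(dpsi_dt - 1j * 0.5 * (HBAR / M_EFF) * d2psi_dx2)


print(residual(0.3, 0.7))  # small: only discretisation and rounding error remain
```

Changing E to anything other than p²/(2·meff) makes the residual jump to order one, which is a quick way to see that the equation really does pin down the energy-momentum relation.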

Hence, the effective mass appears in this equation as the equivalent mass of the kinetic energy (K.E.) of the elementary particle that’s being represented by the wavefunction. Now, you may think that sounds good – and it does – but you should note the following:

1. The K.E. = m·v²/2 formula is only correct for non-relativistic speeds. In fact, it’s the kinetic energy formula if, and only if, m ≈ m0. The relativistically correct formula for the kinetic energy calculates it as the difference between (1) the total energy (which is given by the E = m·c² formula, always) and (2) its rest energy, so we write:

K.E. = E − E0 = mv·c² − m0·c² = m0·γ·c² − m0·c² = m0·c²·(γ − 1)

2. The energy concept in the wavefunction ψ = a·e^(−i·[E·t − p∙x]/ħ) is, obviously, the total energy of the particle. For non-relativistic speeds, the kinetic energy is only a very small fraction of the total energy. In fact, using the formula above, you can calculate the ratio between the kinetic and the total energy: you’ll find it’s equal to 1 − 1/γ = 1 − √(1 − v²/c²), and its graph goes from 0 to 1.
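The ratio is easy to tabulate. Here is a small sketch (the function names are mine) that evaluates K.E./E = 1 − 1/γ for a few values of β = v/c, so you can see how small the kinetic fraction is at low speeds.

```python
import math


def lorentz_gamma(beta: float) -> float:
    """Lorentz factor γ = 1/√(1 − β²) for 0 ≤ β < 1."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)


def kinetic_fraction(beta: float) -> float:
    """Ratio of kinetic to total energy: K.E./E = 1 − 1/γ."""
    return 1.0 - 1.0 / lorentz_gamma(beta)


for beta in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(f"β = {beta:<5} K.E./E = {kinetic_fraction(beta):.6f}")
```

For small β the fraction behaves like β²/2, which is just the classical m·v²/2 divided by m·c².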


Now, if we discard the 1/2 factor, the calculations above yield the following:

−i·a·(E/ħ)·e^(−i·[E·t − p∙x]/ħ) = −i·a·(ħ/meff)·(p²/ħ²)·e^(−i·[E·t − p∙x]/ħ)

⇔ E = p²/meff = (m·v)²/meff ⇔ meff = (m/E)·m·v²

⇔ meff = m·v²/c² = m·β²

In fact, it is fair to say that both definitions are equally weird, even if the dimensions come out alright: the effective mass is measured in old-fashioned mass units, and the β² or β²/2 factor appears as a sort of correction factor, varying between 0 and 1 (for β²) or between 0 and 1/2 (for β²/2). I prefer the new definition, as it ensures that meff becomes equal to m in the limit for the velocity going to c. In addition, if we bring the ħ/meff or (1/2)∙ħ/meff factor to the other side of the equation, the choice becomes one between a meffNEW/ħ or a 2∙meffOLD/ħ coefficient.
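To compare the two definitions side by side, here is a tiny sketch. The function names are my own labels for the two results derived above (m·β²/2 with the 1/2 factor, m·β² without it).

```python
def m_eff_old(m: float, beta: float) -> float:
    """Effective mass from the equation with the 1/2 factor: m·β²/2."""
    return 0.5 * m * beta ** 2


def m_eff_new(m: float, beta: float) -> float:
    """Effective mass from the equation without the 1/2 factor: m·β²."""
    return m * beta ** 2


m = 1.0  # mass in arbitrary units
for beta in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"β = {beta:<4} old = {m_eff_old(m, beta):.4f}  new = {m_eff_new(m, beta):.4f}")
```

Both go to zero for β going to 0, and that is exactly the limit that gets us into trouble further down.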

It’s a choice, really. Personally, I think the equation without the 1/2 factor – and, hence, the use of ħ rather than ħ/2 as the scaling factor – looks better, but then you may argue that – if half of the energy of our particle is in the oscillating real part of the wavefunction, and the other is in the imaginary part – then the 1/2 factor should stay, because it ensures that meff becomes equal to m/2 as v goes to c (or, what amounts to the same, β goes to 1). But then that’s the argument about whether or not we should have a 1/2 factor because we get two equations for the price of one, like we did for the Uncertainty Principle.

So… What to do? Let’s first ask ourselves whether that derivation of the effective mass actually makes sense. Let’s therefore look at both limit situations.

1. For v going to c (or β = v/c going to 1), we do not have much of a problem: meff just becomes the total mass of the particle that we’re looking at, and Schrödinger’s equation can easily be interpreted as an energy propagation mechanism. Our particle has zero rest mass in that case (we may also say that the concept of a rest mass is meaningless in this situation) and all of the energy – and, therefore, all of the equivalent mass – is kinetic: m = E/c² and the effective mass is just the mass: meff = m·c²/c² = m. Hence, our particle is everywhere and nowhere. In fact, you should note that the concept of velocity itself doesn’t make sense in this rather particular case. It’s like a photon (but note it’s not a photon: we’re talking some theoretical particle here with zero spin and zero rest mass): it’s a wave in its own frame of reference, but as it zips by at the speed of light, we think of it as a particle.

2. Let’s look at the other limit situation. For v going to 0 (or β = v/c going to 0), Schrödinger’s equation no longer makes sense, because the effective mass goes to zero and the diffusion constant, which is proportional to ħ/meff, blows up, so we get a nonsensical equation. Huh? What’s wrong with our analysis?

Well… I must be honest. We started off on the wrong foot. You should note that it’s hard – in fact, plain impossible – to reconcile our simple a·e^(−i·[E·t − p∙x]/ħ) function with the idea of the classical velocity of our particle. Indeed, the classical velocity corresponds to a group velocity, or the velocity of a wave packet, and so we just have one wave here: no group. So we get nonsense. You can see the same when equating p to zero in the wave equation: we get another nonsensical equation, because the Laplacian is zero! Check it. If our elementary wavefunction is equal to ψ = a·e^(−i·(E/ħ)·t), then that Laplacian is zero.

Hence, our calculation of the effective mass is not very sensical. Why? Because the elementary wavefunction is a theoretical concept only: it may represent some box in space, that is uniformly filled with energy, but it cannot represent any actual particle. Actual particles are always some superposition of two or more elementary waves, so then we’ve got a wave packet (as illustrated below) that we can actually associate with some real-life particle moving in space, like an electron in some orbital indeed. 🙂


I must credit Oregon State University for the animation above. It’s quite nice: a simple particle in a box model without potential. As I showed on my other page (explaining various models), we must add at least two waves – traveling in opposite directions – to model a particle in a box. Why? Because we represent it by a standing wave, and a standing wave is the sum of two waves traveling in opposite directions.
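The standing-wave statement is just the trigonometric identity sin(k·x − ω·t) + sin(k·x + ω·t) = 2·sin(k·x)·cos(ω·t). A minimal sketch, with k and ω set to illustrative values of my own choosing:

```python
import math

K = 2.0      # wave number (illustrative value)
OMEGA = 3.0  # angular frequency (illustrative value)


def right_moving(x: float, t: float) -> float:
    return math.sin(K * x - OMEGA * t)


def left_moving(x: float, t: float) -> float:
    return math.sin(K * x + OMEGA * t)


def standing(x: float, t: float) -> float:
    """Product form: fixed spatial profile times an oscillating amplitude."""
    return 2.0 * math.sin(K * x) * math.cos(OMEGA * t)


x, t = 0.4, 1.1
print(right_moving(x, t) + left_moving(x, t), standing(x, t))  # the two agree
```

The nodes sit where sin(k·x) = 0, whatever the time, which is what lets us fit the wave between the walls of a box.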

So, if our derivation above was not very meaningful, then what is the actual concept of the effective mass?

The concept of the effective mass

I am afraid that, at this point, I do have to direct you back to the Grand Master himself for the detail. Let me just try to sum it up very succinctly. If we have a wave packet, there is – obviously – some energy in it, and it’s energy we may associate with the classical concept of the velocity of our particle – because it’s the group velocity of our wave packet. Hence, we have a new energy concept here – and the equivalent mass, of course. Now, Feynman’s analysis – which is Schrödinger’s analysis, really – shows we can write that energy as:

E = meff·v²/2

So… Well… That’s the classical kinetic energy formula. And it’s the very classical one, because it’s not relativistic. 😦 But that’s OK for relatively slow-moving electrons! [Remember the typical (relative) velocity is given by the fine-structure constant: α = β = v/c. So that’s impressive (about 2,188 km per second), but it’s only a tiny fraction of the speed of light, so non-relativistic formulas should work.]
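How good is the non-relativistic formula at that speed? The sketch below takes β = α, using CODATA values for α and c, and compares the classical kinetic energy with the relativistic one, both expressed in units of m0·c². The variable names are mine.

```python
import math

C = 299_792_458.0        # speed of light, m/s (exact by definition)
ALPHA = 7.2973525693e-3  # fine-structure constant (CODATA 2018)

v = ALPHA * C  # typical electron velocity in the hydrogen ground state

classical_ke = 0.5 * ALPHA ** 2                              # (1/2)·β², in units of m0·c²
relativistic_ke = 1.0 / math.sqrt(1.0 - ALPHA ** 2) - 1.0   # γ − 1, in units of m0·c²
relative_error = (relativistic_ke - classical_ke) / relativistic_ke

print(f"v = {v / 1000:.1f} km/s")
print(f"relative error of the classical formula: {relative_error:.2e}")
```

The relative error is of the order of β², so a few parts in a hundred thousand: small enough to ignore here.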

Now, the meff factor in this equation is a function of the various parameters of the model he uses. To be precise, we get the following formula out of his model (which, as mentioned above, is a model of electrons propagating in a crystal lattice):

meff = ħ²/(2·A·b²)

Now, the b in this formula is the spacing between the atoms in the lattice. The A basically represents an energy barrier: to move from one atom to another, the electron needs to get across it. I talked about this in my post on it, and so I won’t explain the graph below, because I did that in that post. Just note that we don’t need that factor 2: there is no reason whatsoever to write E0 + 2·A and E0 − 2·A. We could just re-define a new A: (1/2)·ANEW = AOLD. The formula for meff then simplifies to ħ²/(2·AOLD·b²) = ħ²/(ANEW·b²). We then get an Eeff = meff·v² formula for the extra energy.
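To see what kind of numbers the meff = ħ²/(2·A·b²) formula produces, here is a sketch with made-up inputs: A = 1 eV and b = 2 Å are hypothetical values I picked for illustration only, not values from Feynman’s model or from any particular material.

```python
HBAR = 1.054571817e-34         # reduced Planck constant, J·s
M_ELECTRON = 9.1093837015e-31  # free-electron mass, kg
EV = 1.602176634e-19           # joules per electronvolt
ANGSTROM = 1e-10               # metres per ångström


def effective_mass(a_joule: float, b_metre: float) -> float:
    """m_eff = ħ²/(2·A·b²), in kg."""
    return HBAR ** 2 / (2.0 * a_joule * b_metre ** 2)


# Hypothetical illustrative inputs (assumptions, not data from the text).
A_EXAMPLE = 1.0 * EV
B_EXAMPLE = 2.0 * ANGSTROM

ratio = effective_mass(A_EXAMPLE, B_EXAMPLE) / M_ELECTRON
print(f"m_eff / m_electron = {ratio:.3f}")
```

Note how a larger A (or a larger spacing b) makes the electron look lighter: the effective mass is a property of the lattice model, not of the electron.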


Eeff = meff·v²?!? What energy formula is that? Schrödinger must have thought the same thing, and so that’s why we have that ugly 1/2 factor in his equation. However, think about it. Our analysis shows that it is quite straightforward to model energy as a two-dimensional oscillation of mass. In this analysis, both the real and the imaginary component of the wavefunction each store half of the total energy of the object, which is equal to E = m·c². Remember, indeed, that we compared it to the energy in an oscillator, which is equal to the sum of kinetic and potential energy, and for which we have the T + U = m·ω0²/2 formula. But so we have two oscillators here and, hence, twice the energy. Hence, the E = m·c² corresponds to m·ω0² and, hence, we may think of c as the natural frequency of the vacuum.

Therefore, the Eeff = meff·v² formula makes much more sense. It nicely mirrors Einstein’s E = m·c² formula and, in fact, naturally merges into E = m·c² for v approaching c. But, I admit, it is not so easy to interpret. It’s much easier to just say that the effective mass is the mass of our electron as it appears in the kinetic energy formula, or – alternatively – in the momentum formula. Indeed, Feynman also writes the following formula:

meff·v = p = ħ·k

Now, that is something we easily recognize! 🙂
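Combining E = meff·v²/2 with meff·v = ħ·k gives the dispersion relation E = ħ²·k²/(2·meff), and the classical velocity should then be the group velocity (1/ħ)·dE/dk. The sketch below checks that numerically in natural units (ħ = 1); the values of k and meff are arbitrary choices of mine.

```python
HBAR = 1.0  # natural units (assumption for this sketch)


def energy(k: float, m_eff: float) -> float:
    """Dispersion relation E(k) = ħ²·k²/(2·m_eff)."""
    return HBAR ** 2 * k ** 2 / (2.0 * m_eff)


def group_velocity(k: float, m_eff: float, dk: float = 1e-6) -> float:
    """v_g = (1/ħ)·dE/dk, estimated with a central difference."""
    return (energy(k + dk, m_eff) - energy(k - dk, m_eff)) / (2.0 * dk * HBAR)


k, m_eff = 1.7, 0.8  # arbitrary illustrative values
print(m_eff * group_velocity(k, m_eff), HBAR * k)  # m_eff·v_g versus ħ·k
```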

So… Well… What do we do now? Do we use the 1/2 factor or not?

It would be very convenient, of course, to just stick with tradition and use meff as everyone else uses it: it is just the mass as it appears in whatever medium we happen to look at, which may be a crystal lattice (or a semi-conductor), or just free space. In short, it’s the mass of the electron as it appears to us, i.e. as it appears in the (non-relativistic) kinetic energy formula (K.E. = meff·v²/2), the formula for the momentum of an electron (p = meff·v), or in the wavefunction itself (k = p/ħ = (meff·v)/ħ). In fact, in his analysis of the electron orbitals, Feynman (who just follows Schrödinger here) drops the eff subscript altogether, and so the effective mass is just the mass: meff = m. Hence, the apparent mass of the electron in the hydrogen atom serves as a reference point, and the effective mass in a different medium (such as a crystal lattice, rather than free space or, I should say, a hydrogen atom in free space) will also be different.

The thing is: we get the right results out of Schrödinger’s equation, with the 1/2 factor in it. Hence, Schrödinger’s equation works: we get the actual electron orbitals out of it. Hence, Schrödinger’s equation is true – without any doubt. Hence, if we take that 1/2 factor out, then we do need to use the other effective mass concept. We can do that. Think about the actual relation between the effective mass and the real mass of the electron, about which Feynman writes the following: “The effective mass has nothing to do with the real mass of an electron. It may be quite different—although in commonly used metals and semiconductors it often happens to turn out to be the same general order of magnitude: about 0.1 to 30 times the free-space mass of the electron.” Hence, if we write the relation between meff and m as meff = g(m), then the same relation for our meffNEW = 2∙meffOLD becomes meffNEW = 2·g(m), and the “about 0.1 to 30 times” becomes “about 0.2 to 60 times.”

In fact, in the original 1963 edition, Feynman writes that the effective mass is “about 2 to 20 times” the free-space mass of the electron. Isn’t that interesting? I mean… Note that factor 2! If we’d write meff = 2·m, then we’re fine. We can then write Schrödinger’s equation in the following two equivalent ways:

  1. (meff/ħ)·∂ψ/∂t = i·∇²ψ
  2. (2m/ħ)·∂ψ/∂t = i·∇²ψ

Both would be correct, and it explains why Schrödinger’s equation works. So let’s go for that compromise and write Schrödinger’s equation in either of the two equivalent ways. 🙂 The question then becomes: how to interpret that factor 2? The answer to that question is, effectively, related to the fact that we get two waves for the price of one here. So we have two oscillators, so to speak. Now that’s quite deep, and I will explore that in one of my next posts.

Let me now address the second weird thing in Schrödinger’s equation: the energy factor. I should be more precise: the weirdness arises when solving Schrödinger’s equation. Indeed, in the texts I’ve read, there is this constant switching back and forth between interpreting E as the energy of the atom, versus the energy of the electron. Now, both concepts are obviously quite different, so which one is it really?

The energy factor E

It’s a confusing point, for me at least and, hence, I must assume for students as well. Let me indicate, by way of example, how the confusion arises in Feynman’s exposé on the solutions to the Schrödinger equation. Initially, the development is quite straightforward. Replacing V by −e²/r, Schrödinger’s equation becomes:

i·ħ·∂ψ/∂t = −(ħ²/2m)·∇²ψ − (e²/r)·ψ

As usual, it is then assumed that a solution of the form ψ(r, t) = e^(−(i/ħ)·E·t)·ψ(r) will work. Apart from the confusion that arises because we use the same symbol, ψ, for two different functions (you will agree that ψ(r, t), a function in two variables, is obviously not the same as ψ(r), a function in one variable only), this assumption is quite straightforward and allows us to re-write the differential equation above as:

−(ħ²/2m)·∇²ψ(r) = (E + e²/r)·ψ(r)

To get this, you just need to actually do that time derivative, noting that the ψ in our equation is now ψ(r), not ψ(r, t). Feynman duly notes this as he writes: “The function ψ(r) must solve this equation, where E is some constant – the energy of the atom.” So far, so good. In one of the (many) next steps, we re-write E as E = ER·ε, with ER = m·e⁴/2ħ². So we just use the Rydberg energy (ER ≈ 13.6 eV) here as a ‘natural’ atomic energy unit. That’s all. No harm in that.

Then all kinds of complicated but legitimate mathematical manipulations follow, in an attempt to solve this differential equation, an attempt that is successful, of course! However, after all these manipulations, one ends up with the grand simple solution for the s-states of the atom (i.e. the spherically symmetric solutions):

En = −ER/n² with 1/n² = 1, 1/4, 1/9, 1/16, …

So we get: En = −13.6 eV, −3.4 eV, −1.5 eV, etcetera. Now how is that possible? How can the energy of the atom suddenly be negative? More importantly, why is it so tiny in comparison with the rest energy of the proton (which is about 938 mega-electronvolt), or the electron (0.511 MeV)? The energy levels above are a few eV only, not a few million electronvolt. Feynman answers this question rather vaguely when he states the following:
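You can reproduce those numbers from the constants. One caveat: the ER = m·e⁴/2ħ² expression is written in Gaussian units, so the sketch below uses what I take to be the SI equivalent, ER = m·e⁴/(2·(4π·ε0)²·ħ²), with CODATA 2018 values. The function names are mine.

```python
import math

HBAR = 1.054571817e-34         # reduced Planck constant, J·s
M_ELECTRON = 9.1093837015e-31  # electron mass, kg
E_CHARGE = 1.602176634e-19     # elementary charge, C (also joules per eV)
EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m


def rydberg_energy_ev() -> float:
    """E_R = m·e⁴/(2·(4π·ε0)²·ħ²), converted from joules to electronvolt."""
    joules = M_ELECTRON * E_CHARGE ** 4 / (2.0 * (4.0 * math.pi * EPSILON_0) ** 2 * HBAR ** 2)
    return joules / E_CHARGE


def s_state_energy_ev(n: int) -> float:
    """E_n = −E_R/n² for the spherically symmetric states."""
    return -rydberg_energy_ev() / n ** 2


for n in (1, 2, 3, 4):
    print(f"n = {n}: E_n = {s_state_energy_ev(n):.2f} eV")
```

Compare these few electronvolt with the 0.511 MeV rest energy of the electron, and you see why the choice of the zero point for the energy scale matters so much for the discussion that follows.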

“There is, incidentally, nothing mysterious about negative numbers for the energy. The energies are negative because when we chose to write V = −e²/r, we picked our zero point as the energy of an electron located far from the proton. When it is close to the proton, its energy is less, so somewhat below zero. The energy is lowest (most negative) for n = 1, and increases toward zero with increasing n.”

We picked our zero point as the energy of an electron located far away from the proton? But we were talking the energy of the atom all along, right? You’re right. Feynman doesn’t answer the question. The solution is OK – well, sort of, at least – but, in one of those mathematical complications, there is a ‘normalization’ – a choice of some constant that pops up when combining and substituting stuff – that is not so innocent. To be precise, at some point, Feynman substitutes the ε variable for the square of another variable – to be even more precise, he writes: ε = −α². He then performs some more hat tricks – all legitimate, no doubt – and finds that the only sensible solutions to the differential equation require α to be equal to 1/n, which immediately leads to the above-mentioned solution for our s-states.

The real answer to the question is given somewhere else. In fact, Feynman casually gives us an explanation in one of his very first Lectures on quantum mechanics, where he writes the following:

“If we have a “condition” which is a mixture of two different states with different energies, then the amplitude for each of the two states will vary with time according to an equation like a·e^(−i·ω·t), with ħ·ω = E0 = m·c². Hence, we can write the amplitude for the two states, for example as:

e^(−i·(E1/ħ)·t) and e^(−i·(E2/ħ)·t)

And if we have some combination of the two, we will have an interference. But notice that if we added a constant to both energies, it wouldn’t make any difference. If somebody else were to use a different scale of energy in which all the energies were increased (or decreased) by a constant amount – say, by the amount A – then the amplitudes in the two states would, from his point of view, be

e^(−i·(E1 + A)·t/ħ) and e^(−i·(E2 + A)·t/ħ)

All of his amplitudes would be multiplied by the same factor e^(−i·(A/ħ)·t), and all linear combinations, or interferences, would have the same factor. When we take the absolute squares to find the probabilities, all the answers would be the same. The choice of an origin for our energy scale makes no difference; we can measure energy from any zero we want. For relativistic purposes it is nice to measure the energy so that the rest mass is included, but for many purposes that aren’t relativistic it is often nice to subtract some standard amount from all energies that appear. For instance, in the case of an atom, it is usually convenient to subtract the energy Ms·c², where Ms is the mass of all the separate pieces – the nucleus and the electrons – which is, of course, different from the mass of the atom. For other problems, it may be useful to subtract from all energies the amount Mg·c², where Mg is the mass of the whole atom in the ground state; then the energy that appears is just the excitation energy of the atom. So, sometimes we may shift our zero of energy by some very large constant, but it doesn’t make any difference, provided we shift all the energies in a particular calculation by the same constant.”

It’s a rather long quotation, but it’s important. The key phrase here is, obviously, the following: “For other problems, it may be useful to subtract from all energies the amount Mg·c², where Mg is the mass of the whole atom in the ground state; then the energy that appears is just the excitation energy of the atom.” So that’s what he’s doing when solving Schrödinger’s equation. However, I should make the following point here: if we shift the origin of our energy scale, it does not make any difference in regard to the probabilities we calculate, but it obviously does make a difference in terms of our wavefunction itself. To be precise, its density in time will be very different. Hence, if we’d want to give the wavefunction some physical meaning – which is what I’ve been trying to do all along – it does make a huge difference. When we leave the rest mass of all of the pieces in our system out, we can no longer pretend we capture their energy.
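Both halves of that point can be illustrated in a few lines. The sketch below assumes natural units (ħ = 1), the e^(−i·E·t/ħ) time dependence, and arbitrary coefficients and energies of my own choosing. It adds a constant to both energies and compares the absolute square (unchanged) with the amplitude itself (changed).

```python
import cmath

HBAR = 1.0  # natural units (assumption for this sketch)


def amplitude(t: float, shift: float = 0.0,
              c1: complex = 0.5, c2: complex = 0.5,
              e1: float = 1.0, e2: float = 2.5) -> complex:
    """Superposition of two states, with both energies shifted by `shift`."""
    return (c1 * cmath.exp(-1j * (e1 + shift) * t / HBAR)
            + c2 * cmath.exp(-1j * (e2 + shift) * t / HBAR))


t = 0.37
plain = amplitude(t)
shifted = amplitude(t, shift=1000.0)

print(abs(plain) ** 2, abs(shifted) ** 2)  # same absolute square
print(plain, shifted)                      # different complex amplitudes
```

The shifted amplitude carries an extra e^(−i·(A/ħ)·t) factor, so it oscillates much faster in time, which is the change in density in time that I am talking about.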

This is a rather simple observation, but one that has profound implications in terms of our interpretation of the wavefunction. Personally, I admire the Great Teacher’s Lectures, but I am really disappointed that he doesn’t pay more attention to this. 😦

Quantum Mechanics: The Other Introduction

About three weeks ago, I brought my most substantial posts together in one document: it’s the Deep Blue page of this site. I also published it on Amazon/Kindle. It’s nice. It crowns many years of self-study, and many nights of short and bad sleep – as I was mulling over yet another paradox haunting me in my dreams. It’s been an extraordinary climb but, frankly, the view from the top is magnificent. 🙂 

The offer is there: anyone who is willing to go through it and offer constructive and/or substantial comments will be included in the book’s acknowledgements section when I go for a second edition (which it needs, I think). First person to be acknowledged here is my wife though, Maria Elena Barron, as she has given me the spacetime:-) and, more importantly, the freedom to take this bull by its horns.

Below I just copy the foreword, just to give you a taste of it. 🙂


Another introduction to quantum mechanics? Yep. I am not hoping to sell many copies, but I do hope my unusual background—I graduated as an economist, not as a physicist—will encourage you to take on the challenge and grind through this.

I’ve always wanted to thoroughly understand, rather than just vaguely know, those quintessential equations: the Lorentz transformations, the wavefunction and, above all, Schrödinger’s wave equation. In my bookcase, I’ve always had what is probably the most famous physics course in the history of physics: Richard Feynman’s Lectures on Physics, which have been used for decades, not only at Caltech but at many of the best universities in the world. Plus a few dozen other books. Popular books—which I now regret I ever read, because they were an utter waste of time: the language of physics is math and, hence, one should read physics in math—not in any other language.

But Feynman’s Lectures on Physics, three volumes of about fifty chapters each, are not easy to read. However, the experimental verification of the existence of the Higgs particle in CERN’s LHC accelerator a couple of years ago, and the award of the Nobel prize to the scientists who had predicted its existence (including Peter Higgs and François Englert), convinced me it was about time I take the bull by its horns. While I consider myself to be of average intelligence only, I do feel there’s value in the ideal of the ‘Renaissance man’ and, hence, I think stuff like this is something we all should try to understand, somehow. So I started to read, and I also started a blog to externalize my frustration as I tried to cope with the difficulties involved. The site attracted hundreds of visitors every week and, hence, it encouraged me to publish this booklet.

So what is it about? What makes it special? In essence, it is a common-sense introduction to the key concepts in quantum physics. However, while common-sense, it does not shy away from the math, which is complicated, but not impossible. So this little book is surely not a Guide to the Universe for Dummies. I do hope it will guide some Not-So-Dummies. It basically recycles what I consider to be my more interesting posts, but combines them in a comprehensive structure.

It is a bit of a philosophical analysis of quantum mechanics as well, as I will – hopefully – do a better job than others in distinguishing the mathematical concepts from what they are supposed to describe, i.e. physical reality.

Last but not least, it does offer some new didactic perspectives. For those who know the subject already, let me briefly point these out:

I. Few, if any, of the popular writers seem to have noted that the argument of the wavefunction (θ = E·t – p·x) – using natural units (hence, the numerical value of ħ and c is one), and for an object moving at constant velocity (hence, x = v·t) – can be written as the product of the proper time of the object and its rest mass:

θ = E·t − p·x = mv·t − mv·v·x = mv·(t − v·x)

⇔ θ = m0·(t − v·x)/√(1 − v²) = m0·t’

Hence, the argument of the wavefunction is just the proper time of the object with the rest mass acting as a scaling factor for the time: the internal clock of the object ticks much faster if it’s heavier. This symmetry between the argument of the wavefunction of the object as measured in its own (inertial) reference frame, and its argument as measured by us, in our own reference frame, is remarkable, and allows us to understand the nature of the wavefunction in a more intuitive way.
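The identity is easy to verify numerically. The sketch below works in natural units (c = ħ = 1), with E = γ·m0 and p = γ·m0·v, and compares E·t − p·x with m0·t’, where t’ = (t − v·x)/√(1 − v²) is the Lorentz-transformed time. The function names and sample values are mine.

```python
import math


def phase_in_our_frame(m0: float, v: float, t: float, x: float) -> float:
    """θ = E·t − p·x with E = γ·m0 and p = γ·m0·v (natural units, c = ħ = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v ** 2)
    return gamma * m0 * t - gamma * m0 * v * x


def phase_from_proper_time(m0: float, v: float, t: float, x: float) -> float:
    """θ = m0·t', with t' = (t − v·x)/√(1 − v²) the time in the object's frame."""
    t_prime = (t - v * x) / math.sqrt(1.0 - v ** 2)
    return m0 * t_prime


print(phase_in_our_frame(2.0, 0.6, 5.0, 3.0), phase_from_proper_time(2.0, 0.6, 5.0, 3.0))
```

The two numbers agree for any event (t, x), not only for points on the trajectory x = v·t.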

While this approach reflects Feynman’s idea of the photon stopwatch, the presentation in this booklet generalizes the concept for all wavefunctions, first and foremost the wavefunction of the matter-particles that we’re used to (e.g. electrons).

II. Few, if any, have thought of looking at Schrödinger’s wave equation as an energy propagation mechanism. In fact, when helping my daughter out as she was trying to understand non-linear regression (logit and Poisson regressions), I suddenly realized we can analyze the wavefunction as a link function that connects two physical spaces: the physical space of our moving object, and a physical energy space.

Re-inserting Planck’s quantum of action in the argument of the wavefunction – so we write θ as θ = (E/ħ)·t – (p/ħ)·x = [E·t – p·x]/ħ – we may assign a physical dimension to it: when interpreting ħ as a scaling factor only (and, hence, when we only consider its numerical value, not its physical dimension), θ becomes a quantity expressed in newton·meter·second, i.e. the (physical) dimension of action. It is only natural, then, that we would associate the real and imaginary part of the wavefunction with some physical dimension too, and a dimensional analysis of Schrödinger’s equation tells us this dimension must be energy.

This perspective allows us to look at the wavefunction as an energy propagation mechanism, with the real and imaginary part of the probability amplitude interacting in very much the same way as the electric and magnetic field vectors E and B. This leads me to the next point, which I make rather emphatically in this booklet:  the propagation mechanism for electromagnetic energy – as described by Maxwell’s equations – is mathematically equivalent to the propagation mechanism that’s implicit in the Schrödinger equation.

I am, therefore, able to present the Schrödinger equation in a much more coherent way, describing not only how this famous equation works for electrons, or matter-particles in general (i.e. fermions or spin-1/2 particles), which is probably the only use of the Schrödinger equation you are familiar with, but also how it works for bosons, including the photon, of course, but also the theoretical zero-spin boson!

In fact, I am personally rather proud of this. Not because I am doing something that hasn’t been done before (I am sure many have come to the same conclusions before me), but because one always has to trust one’s intuition. So let me say something about that third innovation: the photon wavefunction.

III. Let me tell you the little story behind my photon wavefunction. One of my acquaintances is a retired nuclear scientist. While he knew I was delving into it all, I knew he had little time to answer any of my queries. However, when I asked him about the wavefunction for photons, he bluntly told me photons didn’t have a wavefunction. I should just study Maxwell’s equations and that’s it: there’s no wavefunction for photons, just this traveling electric and magnetic field vector. Look at Feynman’s Lectures, or any textbook, he said. None of them talk about photon wavefunctions. That’s true, but I knew he had to be wrong. I mulled over it for several months, and then just sat down and started to fiddle with Maxwell’s equations, assuming the oscillations of the E and B vector could be described by regular sinusoids. And – Lo and behold! – I derived a wavefunction for the photon. It’s fully equivalent to the classical description, but the new expression solves the Schrödinger equation, if we modify it in a rather logical way: we have to double the diffusion constant, which makes sense, because E and B give you two waves for the price of one!


In any case, I am getting ahead of myself here, and so I should wrap up this rather long introduction. Let me just say that, through my rather long journey in search of understanding – rather than knowledge alone – I have learned there are so many wrong answers out there: wrong answers that hamper rather than promote a better understanding. Moreover, I was most shocked to find out that such wrong answers are not the preserve of amateurs alone! This emboldened me to write what I write here, and to publish it. Quantum mechanics is a logical and coherent framework, and it is not all that difficult to understand. One just needs good pointers, and that’s what I want to provide here.

As of now, it focuses on the mechanics in particular, i.e. the concept of the wavefunction and wave equation (better known as Schrödinger’s equation). The other aspect of quantum mechanics – i.e. the idea of uncertainty as implied by the quantum idea – will receive more attention in a later version of this document. I should also say I will limit myself to quantum electrodynamics (QED) only, so I won’t discuss quarks (i.e. quantum chromodynamics, which is an entirely different realm), nor will I delve into any of the other more recent advances of physics.

In the end, you’ll still be left with lots of unanswered questions. However, that’s quite OK, as Richard Feynman himself was of the opinion that he himself did not understand the topic the way he would like to understand it. But then that’s exactly what draws all of us to quantum physics: a common search for a deep and full understanding of reality, rather than just some superficial description of it, i.e. knowledge alone.

So let’s get on with it. I am not saying this is going to be easy reading. In fact, I blogged about much easier stuff than this, treating only aspects of the whole theory. This is the whole thing, and it’s not easy to swallow. In fact, it may well be too big to swallow as a whole. But please do give it a try. I wanted this to be an intuitive but formally correct introduction to quantum math. However, when everything is said and done, you are the only one who can judge if I reached that goal.

Of course, I should not forget the acknowledgements but… Well… It was a rather lonely venture, so I am only going to acknowledge my wife here, Maria, who gave me all of the spacetime and all of the freedom I needed, as I would get up early, or work late after coming home from my regular job. I sacrificed weekends, which we could have spent together, and – when mulling over yet another paradox – the nights were often short and bad. Frankly, it’s been an extraordinary climb, but the view from the top is magnificent.

I just need to insert one caution: my site includes animations, which make it much easier to grasp some of the mathematical concepts that I will be explaining. Hence, I warmly recommend you also have a look at that site, and its Deep Blue page in particular, as that page has the same contents, more or less, but the animations make it a much easier read.

Have fun with it!

Jean Louis Van Belle, BA, MA, BPhil, Drs.

The Imaginary Energy Space

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend that you immediately read the more recent exposé on the matter that is being presented here, which you can find by clicking on the provided link.

Original post:

Intriguing title, isn’t it? You’ll think this is going to be highly speculative and you’re right. In fact, I could also have written: the imaginary action space, or the imaginary momentum space. Whatever. It all works! It’s an imaginary space – but a very real one, because it holds energy, or momentum, or a combination of both, i.e. action. 🙂

So the title is either going to deter you or, else, encourage you to read on. I hope it’s the latter. 🙂

In my post on Richard Feynman’s exposé on how Schrödinger got his famous wave equation, I noted an ambiguity in how he deals with the energy concept. I wrote that piece in February, and it is now May. In-between, I looked at Schrödinger’s equation from various perspectives, as evidenced by the many posts that followed that February post, which I summarized on my Deep Blue page, where I note the following:

  1. The argument of the wavefunction (i.e. θ = ωt – kx = [E·t – p·x]/ħ) is just the proper time of the object that’s being represented by the wavefunction (which, in most cases, is an elementary particle—an electron, for example).
  2. The 1/2 factor in Schrödinger’s equation (∂ψ/∂t = i·(ħ/2m)·∇²ψ) doesn’t make all that much sense, so we should just drop it. Writing ∂ψ/∂t = i·(ħ/m)·∇²ψ (i.e. Schrödinger’s equation without the 1/2 factor) does away with the mentioned ambiguities and, more importantly, avoids obvious contradictions.

Both remarks are rather unusual—especially the second one. In fact, if you’re not shocked by what I wrote above (Schrödinger got something wrong!), then stop reading—because then you’re likely not to understand a thing of what follows. 🙂 In any case, I thought it would be good to follow up by devoting a separate post to this matter.

The argument of the wavefunction as the proper time

Frankly, it took me quite a while to see that the argument of the wavefunction is nothing but the t’ = (t − v∙x)/√(1−v²) formula that we know from the Lorentz transformation of spacetime. Let me quickly give you the formulas (just substitute β for v):
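For reference, the standard Lorentz transformation for a frame moving along the x-axis can be written as follows. This is a sketch in textbook notation, with β = v/c as the relative velocity (so β = v in natural units where c = 1); the symbols x′, y′, z′ and t′ for the coordinates in the moving frame are my choice here.

```latex
\begin{aligned}
x' &= \frac{x - \beta\, t}{\sqrt{1-\beta^{2}}}, &
y' &= y, \\
z' &= z, &
t' &= \frac{t - \beta\, x}{\sqrt{1-\beta^{2}}}.
\end{aligned}
```

The last line is the one that matters for what follows: it is the t’ formula quoted above.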


In fact, let me be precise: the argument of the wavefunction also has the particle’s rest mass m_0 in it. That mass factor (m_0) appears in it as a general scaling factor, so it determines the density of the wavefunction both in time as well as in space. Let me jot it down:

ψ(x, t) = a·e^(i·(m_v·t − p∙x)) = a·e^(i·[(m_0/√(1−v²))·t − (m_0·v/√(1−v²))∙x]) = a·e^(i·m_0·(t − v∙x)/√(1−v²))

Huh? Yes. Let me show you how we get from θ = ωt – kx = [E·t – p·x]/ħ to θ = m_v·t − p∙x. It’s really easy. We first need to choose our units such that the speed of light and Planck’s constant are numerically equal to one, so we write: c = 1 and ħ = 1. So now the 1/ħ factor no longer appears.

[Let me note something here: using natural units does not do away with the dimensions: the dimensions of whatever is there remain what they are. For example, energy remains what it is, and so that’s force times distance: 1 joule = 1 newton·meter (1 J = 1 N·m). Likewise, momentum remains what it is: force times time (or mass times velocity). Finally, the dimension of the quantum of action doesn’t disappear either: it remains the product of force, distance and time (N·m·s). So you should distinguish between the numerical value of our variables and their dimension. Always! That’s where physics is different from algebra: the equations actually mean something!]

Now, because we’re working in natural units, the numerical value of both c and c² will be equal to 1. It’s obvious, then, that Einstein’s mass-energy equivalence relation reduces from E = m_v·c² to E = m_v. You can work out the rest yourself – noting that p = m_v·v and m_v = m_0/√(1−v²). Done! For a more intuitive explanation, I refer you to the above-mentioned page.
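If you’d rather check the algebra numerically than work it out by hand, here is a minimal Python sketch. It assumes natural units (c = ħ = 1) and one spatial dimension; the function names and the sample values are purely illustrative and not part of the original argument.

```python
import math

def phase(m0, v, t, x):
    """Argument of the wavefunction, E*t - p*x, in natural units (c = hbar = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    energy = gamma * m0          # E = m_v = m_0 / sqrt(1 - v^2)
    momentum = gamma * m0 * v    # p = m_v * v
    return energy * t - momentum * x

def primed_time(v, t, x):
    """Lorentz-transformed time t' = (t - v*x) / sqrt(1 - v^2)."""
    return (t - v * x) / math.sqrt(1.0 - v * v)

# Arbitrary illustrative values (v is a fraction of the speed of light).
m0, v, t, x = 2.0, 0.6, 5.0, 1.5
theta = phase(m0, v, t, x)
scaled = m0 * primed_time(v, t, x)
print(theta, scaled)  # the two numbers should agree up to rounding
```

The two printed values coincide, which is just the statement that θ = m_0·(t − v∙x)/√(1−v²).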

So that’s for the wavefunction. Let’s now look at Schrödinger’s wave equation, i.e. that differential equation of which our wavefunction is a solution. In my introduction, I bluntly said there was something wrong with it: that 1/2 factor shouldn’t be there. Why not?

What’s wrong with Schrödinger’s equation?

When deriving his famous equation, Schrödinger uses the mass concept as it appears in the classical kinetic energy formula: K.E. = m·v²/2, and that’s why – after all the complicated turns – that 1/2 factor is there. There are many reasons why that factor doesn’t make sense. Let me sum up a few.

[I] The most important reason is that de Broglie made it quite clear that the energy concept in his equations for the temporal and spatial frequency of the wavefunction – i.e. the ω = E/ħ and k = p/ħ relations – is the total energy, including rest energy (m_0), kinetic energy (m·v²/2) and any potential energy (V). In fact, if we just multiply the two de Broglie equations (aka the matter-wave equations) and use the old-fashioned v = f·λ relation (so we write E as E = ω·ħ = (2π·f)·(h/2π) = f·h, and p as p = k·ħ = (2π/λ)·(h/2π) = h/λ and, therefore, we have f = E/h and λ = h/p), we find that the energy concept that’s implicit in the two matter-wave equations is equal to E = m∙v², as shown below:

  1. f·λ = (E/h)·(h/p) = E/p
  2. v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v²

Huh? E = m∙v²? Yes. Not E = m∙c² or m·v²/2 or whatever else you might be thinking of. In fact, this E = m∙v² formula makes a lot of sense in light of the two following points.

Skeptical note: You may – and actually should – wonder whether we can use that v = f·λ relation for a wave like this, i.e. a wave with both a real (cos(−θ)) as well as an imaginary component (i·sin(−θ)). It’s a deep question, and I’ll come back to it later. But… Yes. It’s the right question to ask. 😦

[II] Newton told us that force is mass times acceleration. Newton’s law is still valid in Einstein’s world. The only difference between Newton’s and Einstein’s world is that, since Einstein, we should treat the mass factor as a variable as well. We write: F = m_v·a = [m_0/√(1−v²)]·a. This formula gives us the definition of the newton as a force unit: 1 N = 1 kg·(m/s)/s = 1 kg·m/s². [Note that the 1/√(1−v²) factor – i.e. the Lorentz factor (γ) – has no dimension, because v is measured as a relative velocity here, i.e. as a fraction between 0 and 1.]

Now, you’ll agree the definition of energy as a force over some distance is valid in Einstein’s world as well. Hence, if 1 joule is 1 N·m, then 1 J is also equal to 1 (kg·m/s²)·m = 1 kg·(m²/s²), so this also reflects the E = m∙v² concept. [I can hear you mutter: that kg factor refers to the rest mass, no? No. It doesn’t. The kg is just a measure of inertia: as a unit, it applies to both m_0 as well as m_v. Full stop.]

Very skeptical note: You will say this doesn’t prove anything – because this argument just shows the dimensional analysis for both equations (i.e. E = m∙v² and E = m∙c²) is OK. Hmm… Yes. You’re right. 🙂 But the next point will surely convince you! 🙂

[III] The third argument is the most intricate and the most beautiful at the same time, not because it’s simple (like the arguments above) but because it gives us an interpretation of what’s going on here. It’s fairly easy to verify that Schrödinger’s equation, ∂ψ/∂t = i·(ħ/2m)·∇²ψ (including the 1/2 factor to which I object), is equivalent to the following set of two equations:

  1. Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ)
  2. Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ)

[In case you don’t see it immediately, note that two complex numbers a + i·b and c + i·d are equal if, and only if, their real and imaginary parts are the same. However, here we have something like this: a + i·b = i·(c + i·d) = i·c + i²·d = − d + i·c (remember i² = −1).]

Now, before we proceed (i.e. before I show you what’s wrong here with that 1/2 factor), let us look at the dimensions first. For that, we’d better analyze the complete Schrödinger equation so as to make sure we’re not doing anything stupid here by looking at one aspect of the equation only. The complete equation, in its original form, is:

i·ħ·∂ψ/∂t = −(ħ²/2m)·∇²ψ + V·ψ

Notice that, to simplify the analysis above, I had moved the i and the ħ on the left-hand side to the right-hand side (note that 1/i = −i, so −(ħ²/2m)/(i·ħ) = i·ħ/2m). Now, the ħ² factor on the right-hand side is expressed in J²·s². Now that doesn’t make much sense, but then that mass factor in the denominator makes everything come out alright. Indeed, we can use the mass-energy equivalence relation to express m in J/(m/s)² units. So our ħ²/2m coefficient is expressed in (J²·s²)/[J/(m/s)²] = J·m². Now we multiply that by that Laplacian operating on some scalar, which yields some quantity per square meter. So the whole right-hand side becomes some amount expressed in joule, i.e. the unit of energy! Interesting, isn’t it?

On the left-hand side, we have i and ħ. We shouldn’t worry about the imaginary unit because we can treat that as just another number, albeit a very special number (because its square is minus 1). However, in this equation, it’s like a mathematical constant and you can think of it as something like π or e. [Think of the magical formula: e^(iπ) = i² = −1.] In contrast, ħ is a physical constant, and so that constant comes with some dimension and, therefore, we cannot just do what we want. [I’ll show, later, that even moving it to the other side of the equation comes with interpretation problems, so be careful with physical constants, as they really mean something!] In this case, its dimension is the action dimension: J·s = N·m·s, so that’s force times distance times time. So we multiply that with a time derivative and we get joule once again (N·m·s/s = N·m = J), so that’s the unit of energy. So it works out: we have joule units both left and right in Schrödinger’s equation. Nice! Yes. But what does it mean? 🙂

Well… You know that we can – and should – think of Schrödinger’s equation as a diffusion equation – just like a heat diffusion equation, for example – but then one describing the diffusion of a probability amplitude. [In case you are not familiar with this interpretation, please do check my post on it, or my Deep Blue page.] But then we didn’t describe the mechanism in very much detail, so let me try to do that now and, in the process, finally explain the problem with the 1/2 factor.

The missing energy

There are various ways to explain the problem. One of them involves calculating group and phase velocities of the elementary wavefunction satisfying Schrödinger’s equation but that’s a more complicated approach and I’ve done that elsewhere, so just click the reference if you prefer the more complicated stuff. I find it easier to just use those two equations above:

  1. Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ)
  2. Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ)

The argument is the following: if our elementary wavefunction is equal to e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt), then it’s easy to prove that this pair of conditions is fulfilled if, and only if, ω = k²·(ħ/2m). [Note that I am omitting the normalization coefficient in front of the wavefunction: you can put it back in if you want. The argument here is valid, with or without normalization coefficients.] Easy? Yes. Check it out. The time derivative on the left-hand side is equal to:

∂ψ/∂t = −i·ω·e^(i(kx − ωt)) = −i·ω·[cos(kx − ωt) + i·sin(kx − ωt)] = ω·sin(kx − ωt) − i·ω·cos(kx − ωt)

And the second-order derivative on the right-hand side is equal to:

∇²ψ = ∂²ψ/∂x² = i²·k²·e^(i(kx − ωt)) = −k²·cos(kx − ωt) − i·k²·sin(kx − ωt)

So the two equations above are equivalent to writing:

  1. Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ) ⇔ ω·sin(kx − ωt) = k²·(ħ/2m)·sin(kx − ωt)
  2. Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ) ⇔ −ω·cos(kx − ωt) = −k²·(ħ/2m)·cos(kx − ωt)
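The pair of conditions above can also be checked numerically. What follows is a minimal sketch, not a proof: it assumes natural units (ħ = 1), an arbitrary unit mass, and simple central finite differences for the derivatives. The names HBAR, MASS and schrodinger_residual are illustrative only.

```python
import cmath

HBAR = 1.0   # natural units, assumed for illustration
MASS = 1.0   # arbitrary illustrative mass

def psi(x, t, k, omega):
    """Elementary wavefunction exp(i(kx - omega*t)), no normalization."""
    return cmath.exp(1j * (k * x - omega * t))

def schrodinger_residual(k, omega, x=0.3, t=0.7, h=1e-4):
    """|dpsi/dt - i*(hbar/2m)*d2psi/dx2|, estimated with central differences."""
    dpsi_dt = (psi(x, t + h, k, omega) - psi(x, t - h, k, omega)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t, k, omega) - 2 * psi(x, t, k, omega)
                 + psi(x - h, t, k, omega)) / (h * h)
    return abs(dpsi_dt - 1j * (HBAR / (2 * MASS)) * d2psi_dx2)

k = 2.0
omega_ok = HBAR * k**2 / (2 * MASS)                   # omega = k^2 * (hbar/2m)
residual_ok = schrodinger_residual(k, omega_ok)       # close to zero
residual_bad = schrodinger_residual(k, 2 * omega_ok)  # clearly non-zero
print(residual_ok, residual_bad)
```

With ω = k²·(ħ/2m) the residual vanishes up to discretization error; with any other ω (here twice that value) it does not.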

So both conditions are fulfilled if, and only if, ω = k²·(ħ/2m). You’ll say: so what? Well… We have a contradiction here – something that doesn’t make sense. Indeed, the second of the two de Broglie equations (always look at them as a pair) tells us that k = p/ħ, so we can re-write the ω = k²·(ħ/2m) condition as:

ω/k = v_p = k²·(ħ/2m)/k = k·ħ/(2m) = (p/ħ)·(ħ/2m) = p/2m ⇔ p = 2m

You’ll say: so what? Well… Stop reading, I’d say. That p = 2m doesn’t make sense, at all! Nope! In fact, if you thought that the E = m·v² is weird (which, I hope, is no longer the case by now) then… Well… This p = 2m equation is much weirder. In fact, it’s plain nonsense: this condition makes no sense whatsoever. The only way out is to remove the 1/2 factor, and to re-write the Schrödinger equation as I wrote it, i.e. with an ħ/m coefficient only, rather than an (1/2)·(ħ/m) coefficient.

Huh? Yes.

As mentioned above, I could do those group and phase velocity calculations to show you what rubbish that 1/2 factor leads to – and I’ll do that eventually – but let me first find yet another way to present the same paradox. Let’s simplify our life by choosing our units such that c = ħ = 1, so we’re using so-called natural units rather than our SI units. Our mass-energy equivalence then becomes: E = m·c² = m·1² = m. [Again, note that switching to natural units doesn’t do anything to the physical dimensions: a force remains a force, a distance remains a distance, and so on. So we’d still measure energy and mass in different but equivalent units. Hence, the equality sign should not make you think mass and energy are actually the same: energy is energy (i.e. force times distance), while mass is mass (i.e. a measure of inertia). I am saying this because it’s important, and because it took me a while to make these rather subtle distinctions.]

Let’s now go one step further and imagine a hypothetical particle with zero rest mass, so m_0 = 0. Hence, all its energy is kinetic and so we write: K.E. = m_v·v²/2. Now, because this particle has zero rest mass, the slightest acceleration will make it travel at the speed of light. In fact, we would expect it to travel at the speed of light, so m_v = m_c and, according to the mass-energy equivalence relation, its total energy is, effectively, E = m_v = m_c. However, we just said its total energy is kinetic energy only. Hence, its total energy must be equal to E = K.E. = m_c·c²/2 = m_c/2. So we’ve got only half the energy we need. Where’s the other half? Where’s the missing energy? Quid est veritas? Is its energy E = m_c or E = m_c/2?

It’s just a paradox, of course, but one we have to solve. Of course, we may just say we trust Einstein’s E = m·c² formula more than the kinetic energy formula, but that answer is not very scientific. 🙂 We’ve got a problem here and, in order to solve it, I’ve come to the following conclusion: just because of its sheer existence, our zero-mass particle must have some hidden energy, and that hidden energy is also equal to E = m·c²/2. Hence, the kinetic and the hidden energy add up to E = m·c² and all is alright.

Huh? Hidden energy? I must be joking, right?

Well… No. Let me explain. Oh. And just in case you wonder why I bother to try to imagine zero-mass particles. Let me tell you: it’s the first step towards finding a wavefunction for a photon and, secondly, you’ll see it just amounts to modeling the propagation mechanism of energy itself. 🙂

The hidden energy as imaginary energy

I am tempted to refer to the missing energy as imaginary energy, because it’s linked to the imaginary part of the wavefunction. However, it’s anything but imaginary: it’s as real as the imaginary part of the wavefunction. [I know that sounds a bit nonsensical, but… Well… Think about it. And read on!]

Back to that factor 1/2. As mentioned above, it also pops up when calculating the group and the phase velocity of the wavefunction. In fact, let me show you that calculation now. [Sorry. Just hang in there.] It goes like this.

The de Broglie relations tell us that the k and the ω in the e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt) wavefunction (i.e. the spatial and temporal frequency respectively) are equal to k = p/ħ, and ω = E/ħ. Let’s now think of that zero-mass particle once more, so we assume all of its energy is kinetic: no rest energy, no potential! So… If we now use the kinetic energy formula E = m·v²/2 – which we can also write as E = m·v·v/2 = p·v/2 = p·p/2m = p²/2m, with v = p/m the classical velocity of the elementary particle that Louis de Broglie was thinking of – then we can calculate the group velocity of our e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt) wavefunction as:

v_g = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂[p²/2m]/∂p = 2p/2m = p/m = v

[Don’t tell me I can’t treat m as a constant when calculating ∂ω/∂k: I can. Think about it.]

Fine. Now the phase velocity. For the phase velocity of our e^(i(kx − ωt)) wavefunction, we find:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = (p²/2m)/p = p/2m = v/2

So that’s only half of v: it’s the 1/2 factor once more! Strange, isn’t it? Why would we get a different value for the phase velocity here? It’s not like we have two different frequencies here, do we? Well… No. You may also note that the phase velocity turns out to be smaller than the group velocity (as mentioned, it’s only half of the group velocity), which is quite exceptional as well! So… Well… What’s the matter here? We’ve got a problem!
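The two velocities are easy to reproduce numerically. Here is a minimal sketch, assuming natural units (ħ = 1), an arbitrary unit mass, and the ω = ħ·k²/2m relation that follows from E = p²/2m. The group velocity is estimated with a central difference; all names are illustrative.

```python
HBAR = 1.0   # natural units, assumed for illustration
MASS = 1.0   # arbitrary illustrative mass

def omega(k):
    """Dispersion relation omega = hbar*k^2/(2m), i.e. E = p^2/2m."""
    return HBAR * k**2 / (2 * MASS)

def phase_velocity(k):
    """v_p = omega / k."""
    return omega(k) / k

def group_velocity(k, dk=1e-6):
    """v_g = d(omega)/dk, estimated with a central difference."""
    return (omega(k + dk) - omega(k - dk)) / (2 * dk)

k = 3.0
v_classical = HBAR * k / MASS   # v = p/m
v_p = phase_velocity(k)
v_g = group_velocity(k)
print(v_classical, v_g, v_p)
```

For this dispersion relation the sketch returns v_g equal to v and v_p equal to v/2, which is exactly the factor discussed above.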

What’s going on here? We have only one wave here: one frequency and, hence, only one k and ω. However, on the other hand, it’s also true that the e^(i(kx − ωt)) wavefunction gives us two functions for the price of one, one real and one imaginary: e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt). So the question here is: are we adding waves, or are we not? It’s a deep question. If we’re adding waves, we may get different group and phase velocities, but if we’re not, then… Well… Then the group and phase velocity of our wave should be the same, right? The answer is: we are and we aren’t. It all depends on what you mean by ‘adding’ waves. I know you don’t like that answer, but that’s the way it is, really. 🙂

Let me make a small digression here that will make you feel even more confused. You know – or you should know – that the sine and the cosine function are the same except for a phase difference of 90 degrees: sinθ = cos(θ − π/2). Now, at the same time, multiplying something with i amounts to a rotation by 90 degrees, as shown below.

Hence, in order to sort of visualize what our e^(i(kx − ωt)) function really looks like, we may want to super-impose the two graphs and think of something like this:


You’ll have to admit that, when you see this, our formulas for the group or phase velocity, or our v = f·λ relation, no longer make much sense, do they? 🙂

Having said that, that 1/2 factor is and remains puzzling, and there must be some logical reason for it. For example, it also pops up in the Uncertainty Relations:

Δx·Δp ≥ ħ/2 and ΔE·Δt ≥ ħ/2

So we have ħ/2 in both, not ħ. Why do we need to divide the quantum of action here? How do we solve all these paradoxes? It’s easy to see how: the apparent contradiction (i.e. the different group and phase velocity) gets solved if we’d use the E = m∙v² formula rather than the kinetic energy E = m∙v²/2. But then… What energy formula is the correct one: E = m∙v² or m∙c²? Einstein’s formula is always right, isn’t it? It must be, so let me postpone the discussion a bit by looking at a limit situation. If v = c, then we don’t need to make a choice, obviously. 🙂 So let’s look at that limit situation first. So we’re discussing our zero-mass particle once again, assuming it travels at the speed of light. What do we get?

Well… Measuring time and distance in natural units, so c = 1, we have:

E = m∙c² = m and p = m∙c = m, so we get: E = m = p

Wow! E = m = p! What a weird combination, isn’t it? Well… Yes. But it’s fully OK. [You tell me why it wouldn’t be OK. It’s true we’re glossing over the dimensions here, but natural units are natural units and, hence, the numerical value of c and c² is 1. Just figure it out for yourself.] The point to note is that the E = m = p equality yields extremely simple but also very sensible results. For the group velocity of our e^(i(kx − ωt)) wavefunction, we get:

v_g = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂p/∂p = 1

So that’s the velocity of our zero-mass particle (remember: the 1 stands for c here, i.e. the speed of light) expressed in natural units once more, just like what we found before. For the phase velocity, we get:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = p/p = 1

Same result! No factor 1/2 here! Isn’t that great? My ‘hidden energy theory’ makes a lot of sense. 🙂
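For comparison, here is a small numerical sketch of the same two velocities using the standard relativistic energy-momentum relation E² = p² + m_0² (natural units, c = 1). Note that this relation is not spelled out in the text above, so treat it as an assumption of the sketch; the function names are illustrative.

```python
import math

def energy(p, m0):
    """Relativistic energy E = sqrt(p^2 + m0^2) in natural units (c = 1)."""
    return math.sqrt(p * p + m0 * m0)

def phase_velocity(p, m0):
    """v_p = E/p (hbar cancels out)."""
    return energy(p, m0) / p

def group_velocity(p, m0, dp=1e-6):
    """v_g = dE/dp, estimated with a central difference."""
    return (energy(p + dp, m0) - energy(p - dp, m0)) / (2 * dp)

# Zero rest mass: E = p, so both velocities equal 1 (the speed of light).
massless = (phase_velocity(2.0, 0.0), group_velocity(2.0, 0.0))
# Non-zero rest mass: the two velocities differ, but their product is 1.
massive = (phase_velocity(2.0, 1.5), group_velocity(2.0, 1.5))
print(massless, massive)
```

For m_0 = 0 this reproduces the v_g = v_p = 1 result above. For a non-zero rest mass, the same relation gives v_g = p/E and v_p = E/p.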

However, if there’s hidden energy, we still need to show where it’s hidden. 🙂 Now that question is linked to the propagation mechanism that’s described by those two equations, which, leaving the 1/2 factor out, simplify to:

  1. Re(∂ψ/∂t) = −(ħ/m)·Im(∇²ψ)
  2. Im(∂ψ/∂t) = (ħ/m)·Re(∇²ψ)

Propagation mechanism? Yes. That’s what we’re talking about here: the propagation mechanism of energy. Huh? Yes. Let me explain in another separate section, so as to improve readability. Before I do, however, let me add another note—for the skeptics among you. 🙂

Indeed, the skeptics among you may wonder whether our zero-mass particle wavefunction makes any sense at all, and they should do so for the following reason: if x = 0 at t = 0, and it’s traveling at the speed of light, then x(t) = t. Always. So if E = m = p, the argument of our wavefunction becomes E·t – p·x = E·t – E·t = 0! So what’s that? The proper time of our zero-mass particle is zero—always and everywhere!?

Well… Yes. That’s why our zero-mass particle – as a point-like object – does not really exist. What we’re talking about is energy itself, and its propagation mechanism. 🙂

While I am sure that, by now, you’re very tired of my rambling, I beg you to read on. Frankly, if you got as far as you have, then you should really be able to work yourself through the rest of this post. 🙂 And I am sure that – if anything – you’ll find it stimulating! 🙂

The imaginary energy space

Look at the propagation mechanism for the electromagnetic wave in free space, which (for c = 1) is represented by the following two equations:

  1. ∂B/∂t = –∇×E
  2. ∂E/∂t = ∇×B

[In case you wonder, these are Maxwell’s equations for free space, so we have no stationary nor moving charges around.] See how similar this is to the two equations above? In fact, in my Deep Blue page, I use these two equations to derive the quantum-mechanical wavefunction for the photon (which is not the same as that hypothetical zero-mass particle I introduced above), but I won’t bother you with that here. Just note the so-called curl operator in the two equations above (∇×) can be related to the Laplacian we’ve used so far (∇²). It’s not the same thing, though: for starters, the curl operator operates on a vector quantity, while the Laplacian operates on a scalar (including complex scalars). But don’t get distracted now. Let’s look at the revised Schrödinger’s equation, i.e. the one without the 1/2 factor:

∂ψ/∂t = i·(ħ/m)·∇²ψ

On the left-hand side, we have a time derivative, so that’s a flow per second. On the right-hand side we have the Laplacian and the i·ħ/m factor. Now, written like this, Schrödinger’s equation really looks exactly the same as the general diffusion equation, which is written as: ∂φ/∂t = D·∇²φ, except for the imaginary unit, which makes it clear we’re getting two equations for the price of one here, rather than one only! 🙂 The point is: we may now look at that ħ/m factor as a diffusion constant, because it does exactly the same thing as the diffusion constant D in the diffusion equation ∂φ/∂t = D·∇²φ, i.e.:

  1. As a constant of proportionality, it quantifies the relationship between both derivatives.
  2. As a physical constant, it ensures the dimensions on both sides of the equation are compatible.

So the diffusion constant for Schrödinger’s equation is ħ/m. What is its dimension? That’s easy: (N·m·s)/(N·s²/m) = m²/s. [Remember: 1 N = 1 kg·m/s².] But then we multiply it with the Laplacian, so that’s something expressed per square meter, so we get something per second on both sides.

Of course, you wonder: what per second? Not sure. That’s hard to say. Let’s continue with our analogy with the heat diffusion equation so as to try to get a better understanding of what’s being written here. Let me give you that heat diffusion equation here. Assuming the heat per unit volume (q) is proportional to the temperature (T) – which is the case when expressing T in degrees Kelvin (K), so we can write q as q = k·T  – we can write it as:

k·∂T/∂t = κ·∇²T

So that’s structurally similar to Schrödinger’s equation, and to the two equivalent equations we jotted down above. So we’ve got T (temperature) in the role of ψ here—or, to be precise, in the role of ψ ‘s real and imaginary part respectively. So what’s temperature? From the kinetic theory of gases, we know that temperature is not just a scalar: temperature measures the mean (kinetic) energy of the molecules in the gas. That’s why we can confidently state that the heat diffusion equation models an energy flow, both in space as well as in time.

Let me make the point by doing the dimensional analysis for that heat diffusion equation. The time derivative on the left-hand side (∂T/∂t) is expressed in K/s (Kelvin per second). Weird, isn’t it? What’s a Kelvin per second? Well… Think of a Kelvin as some very small amount of energy in some equally small amount of space—think of the space that one molecule needs, and its (mean) energy—and then it all makes sense, doesn’t it?

However, in case you find that a bit difficult, just work out the dimensions of all the other constants and variables. The constant in front (k) makes sense of it. That coefficient (k) is the (volume) heat capacity of the substance, which is expressed in J/(m³·K). So the dimension of the whole thing on the left-hand side (k·∂T/∂t) is J/(m³·s), so that’s energy (J) per cubic meter (m³) and per second (s). Nice, isn’t it? What about the right-hand side? On the right-hand side we have the Laplacian operator – i.e. ∇² = ∇·∇, with ∇ = (∂/∂x, ∂/∂y, ∂/∂z) – operating on T. The Laplacian operator, when operating on a scalar quantity, gives us a flux density, i.e. something expressed per square meter (1/m²). In this case, it’s operating on T, so the dimension of ∇²T is K/m². Again, that doesn’t tell us very much (what’s the meaning of a Kelvin per square meter?) but we multiply it by the thermal conductivity (κ), whose dimension is W/(m·K) = J/(m·s·K). Hence, the dimension of the product is the same as the left-hand side: J/(m³·s). So that’s OK again, as energy (J) per cubic meter (m³) and per second (s) is definitely something we can associate with an energy flow.

In fact, we can play with this. We can bring k from the left- to the right-hand side of the equation, for example. The dimension of κ/k is m²/s (check it!), and multiplying that by K/m² (i.e. the dimension of ∇²T) gives us some quantity expressed in Kelvin per second, and so that’s the same dimension as that of ∂T/∂t. Done!
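If you want to ‘check it’ without pencil and paper, here is a minimal Python sketch of the bookkeeping. It represents each quantity as a dictionary of SI base-unit exponents; the helper names (mul, inv) and the variable names are purely illustrative.

```python
def mul(a, b):
    """Multiply two quantities by adding their unit exponents."""
    out = dict(a)
    for unit, power in b.items():
        out[unit] = out.get(unit, 0) + power
    return {u: p for u, p in out.items() if p != 0}

def inv(a):
    """Invert a quantity by negating its unit exponents."""
    return {u: -p for u, p in a.items()}

# SI base-unit exponents (1 J = 1 kg·m²/s²).
JOULE = {"kg": 1, "m": 2, "s": -2}
KELVIN = {"K": 1}
METER = {"m": 1}
SECOND = {"s": 1}
KILOGRAM = {"kg": 1}

# Thermal conductivity kappa: J/(m·s·K).
kappa = mul(JOULE, inv(mul(mul(METER, SECOND), KELVIN)))
# Volume heat capacity k: J/(m³·K).
heat_capacity = mul(JOULE, inv(mul({"m": 3}, KELVIN)))
thermal_diffusivity = mul(kappa, inv(heat_capacity))

# Planck's constant (J·s) divided by a mass (kg).
hbar = mul(JOULE, SECOND)
hbar_over_m = mul(hbar, inv(KILOGRAM))

print(thermal_diffusivity, hbar_over_m)
```

Both κ/k and ħ/m come out as m²/s, which is why ħ/m can play the role of a diffusion constant in this comparison.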

In fact, we’ve got two different ways of writing Schrödinger’s diffusion equation. We can write it as ∂ψ/∂t = i·(ħ/m)·∇²ψ or, else, we can write it as ħ·∂ψ/∂t = i·(ħ²/m)·∇²ψ. Does it matter? I don’t think it does. The dimensions come out OK in both cases. However, interestingly, if we do a dimensional analysis of the ħ·∂ψ/∂t = i·(ħ²/m)·∇²ψ equation, we get joule on both sides. Interesting, isn’t it? The key question, of course, is: what is it that is flowing here?

I don’t have a very convincing answer to that, but the answer I have is interesting, I think. 🙂 Think of the following: we can multiply Schrödinger’s equation with whatever we want, and then we get all kinds of flows. For example, if we multiply both sides with 1/(m²·s) or 1/(m³·s), we get an equation expressing the energy conservation law, indeed! [And you may want to think about the minus sign of the right-hand side of Schrödinger’s equation now, because it makes much more sense now!]

We could also multiply both sides with s, so then we get J·s on both sides, i.e. the dimension of physical action (J·s = N·m·s). So then the equation expresses the conservation of action. Huh? Yes. Let me re-phrase that: then it expresses the conservation of angular momentum, as you’ll surely remember that the dimension of action and angular momentum are the same. 🙂

And then we can divide both sides by m, so then we get N·s on both sides, so that’s momentum. So then Schrödinger’s equation embodies the momentum conservation law.

Isn’t it just wonderful? Schrödinger’s equation packs all of the conservation laws! 🙂 The only catch is that it flows back and forth from the real to the imaginary space, using that propagation mechanism as described in those two equations.

Now that is really interesting, because it does provide an explanation – as fuzzy as it may seem – for all those weird concepts one encounters when studying physics, such as the tunneling effect, which amounts to energy flowing from the imaginary space to the real space and, then, inevitably, flowing back. It also allows for borrowing time from the imaginary space. Hmm… Interesting! [I know I still need to make these points much more formally, but… Well… You kinda get what I mean, don’t you?]

To conclude, let me re-baptize my real and imaginary ‘space’ by calling them what they really are: a real and imaginary energy space respectively. Although… Now that I think of it: it could also be real and imaginary momentum space, or a real and imaginary action space. Hmm… The latter term may be the best. 🙂

Isn’t this all great? I mean… I could go on and on, but I’ll stop here, so you can freewheel around yourself. For example, you may wonder how similar that energy propagation mechanism actually is as compared to the propagation mechanism of the electromagnetic wave? The answer is: very similar. You can check how similar in one of my posts on the photon wavefunction or, if you’d want a more general argument, check my Deep Blue page. Have fun exploring! 🙂

So… Well… That’s it, folks. I hope you enjoyed this post—if only because I really enjoyed writing it. 🙂


OK. You’re right. I still haven’t answered the fundamental question.

So what about the 1/2 factor?

What about that 1/2 factor? Did Schrödinger miss it? Well… Think about it for yourself. First, I’d encourage you to further explore that weird graph with the real and imaginary part of the wavefunction. I copied it below, but with an added 45º line—yes, the green diagonal. To make it somewhat more real, imagine you’re the zero-mass point-like particle moving along that line, and we observe you from our inertial frame of reference, using equivalent time and distance units.

spacetime travel

So we’ve got that cosine (cosθ) varying as you travel, and we’ve also got the i·sinθ part of the wavefunction going while you’re zipping through spacetime. Now, THINK of it: the phase velocity of the cosine bit (i.e. the red graph) contributes as much to your lightning speed as the i·sinθ bit, doesn’t it? Should we apply Pythagoras’ basic r² = x² + y² Theorem here? Yes: the velocity vector along the green diagonal is going to be the sum of the velocity vectors along the horizontal and vertical axes. So… That’s great.

Yes. It is. However, we still have a problem here: it’s the velocity vectors that add up—not their magnitudes. Indeed, if we denote the velocity vector along the green diagonal as u, then we can calculate its magnitude as:

u = √u² = √[(v/2)² + (v/2)²] = √[2·(v²/4)] = √[v²/2] = v/√2 ≈ 0.7·v

So, as mentioned, we’re adding the vectors, but not their magnitudes. We’re somewhat better off than we were in terms of showing that the phase velocity of those sine and cosine velocities add up—somehow, that is—but… Well… We’re not quite there.
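To make the arithmetic concrete, here is a minimal Python sketch. The helper name `magnitude` and the value v = 1 are illustrative assumptions of mine, nothing more. It adds two perpendicular velocity components of size v/2 and, for contrast, two components of the same size that point in the same direction:

```python
import math

def magnitude(vx: float, vy: float) -> float:
    """Length of the two-dimensional vector (vx, vy)."""
    return math.hypot(vx, vy)

v = 1.0  # illustrative value; any positive v gives the same ratios

# Components along two orthogonal axes: the vectors add, the magnitudes do not.
u_perpendicular = magnitude(v / 2, v / 2)

# Components along one and the same direction: the magnitudes simply add.
u_collinear = v / 2 + v / 2

print(u_perpendicular)  # v/sqrt(2), roughly 0.707*v
print(u_collinear)      # v
```

The first number is the v/√2 we just calculated. The second one is what we would like to get.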

Fortunately, Einstein saves us once again. Remember we’re actually transforming our reference frame when working with the wavefunction? Well… Look at the diagram below (for which I thank the author).

special relativity

In fact, let me insert an animated illustration, which shows what happens when the velocity goes up and down from (close to) −c to +c and back again.  It’s beautiful, and I must credit the author here too. It sort of speaks for itself, but please do click the link as the accompanying text is quite illuminating. 🙂


The point is: for our zero-mass particle, the x’ and t’ axes will rotate into the diagonal itself which, as I mentioned a couple of times already, represents the speed of light and, therefore, our zero-mass particle traveling at c. It’s obvious that we’re now adding two vectors that point in the same direction and, hence, their magnitudes just add without any square root factor. So, instead of u = √[(v/2)² + (v/2)²], we just have v/2 + v/2 = v! Done! We solved the phase velocity paradox! 🙂

So… I still haven’t answered that question. Should that 1/2 factor in Schrödinger’s equation be there or not? The answer is, obviously: yes. It should be there. And as for Schrödinger using the mass concept as it appears in the classical kinetic energy formula: K.E. = m·v²/2… Well… What other mass concept would he use? I probably got a bit confused with Feynman’s exposé – especially this notion of ‘choosing the zero point for the energy’ – but then I should probably just re-visit the thing and adjust the language here and there. But the formula is correct.

Thinking it all through, the ħ/2m constant in Schrödinger’s equation should be thought of as the reciprocal of m/(ħ/2). So what we’re doing basically is measuring the mass of our object in units of ħ/2, rather than units of ħ. That makes perfect sense, if only because it’s ħ/2, rather than ħ, that appears as the factor in the Uncertainty Relations Δx·Δp ≥ ħ/2 and ΔE·Δt ≥ ħ/2. In fact, in my post on the wavefunction of the zero-mass particle, I noted its elementary wavefunction should use the m = E = p = ħ/2 values, so it becomes ψ(x, t) = a·e^{−i∙[(ħ/2)∙t − (ħ/2)∙x]/ħ} = a·e^{−i∙[t − x]/2}.

Isn’t that just nice? 🙂 I need to stop here, however, because it looks like this post is becoming a book. Oh—and note that nothing what I wrote above discredits my ‘hidden energy’ theory. On the contrary, it confirms it. In fact, the nice thing about those illustrations above is that it associates the imaginary component of our wavefunction with travel in time, while the real component is associated with travel in space. That makes our theory quite complete: the ‘hidden’ energy is the energy that moves time forward. The only thing I need to do is to connect it to that idea of action expressing itself in time or in space, cf. what I wrote on my Deep Blue page: we can look at the dimension of Planck’s constant, or at the concept of action in general, in two very different ways—from two different perspectives, so to speak:

  1. [Planck’s constant] = [action] = N∙m∙s = (N∙m)∙s = [energy]∙[time]
  2. [Planck’s constant] = [action] = N∙m∙s = (N∙s)∙m = [momentum]∙[distance]

Hmm… I need to combine that with the idea of the quantum vacuum, i.e. the mathematical space that’s associated with time and distance becoming countable variables…. In any case. Next time. 🙂

Before I sign off, however, let’s quickly check if our a·e^{−i∙[t − x]/2} wavefunction solves the Schrödinger equation:

  • ∂ψ/∂t = −a·e^{−i∙[t − x]/2}·(i/2)
  • ∇²ψ = ∂²[a·e^{−i∙[t − x]/2}]/∂x² = ∂[a·e^{−i∙[t − x]/2}·(i/2)]/∂x = −a·e^{−i∙[t − x]/2}·(1/4)

So the ∂ψ/∂t = i·(ħ/2m)·∇²ψ equation becomes:

−a·e^{−i∙[t − x]/2}·(i/2) = −i·(ħ/[2·(ħ/2)])·a·e^{−i∙[t − x]/2}·(1/4)

⇔ 1/2 = 1/4 !?

The damn 1/2 factor. Schrödinger wants it in his wave equation, but not in the wavefunction—apparently! So what if we take the m = E = p = ħ solution? We get:

  • ∂ψ/∂t = −a·i·e^{−i∙[t − x]}
  • ∇²ψ = ∂²[a·e^{−i∙[t − x]}]/∂x² = ∂[a·i·e^{−i∙[t − x]}]/∂x = −a·e^{−i∙[t − x]}

So the ∂ψ/∂t = i·(ħ/2m)·∇²ψ equation now becomes:

−a·i·e^{−i∙[t − x]} = −i·(ħ/[2·ħ])·a·e^{−i∙[t − x]}

⇔ 1 = 1/2 !?

We’re still in trouble! So… Was Schrödinger wrong after all? There’s no difficulty whatsoever with the ∂ψ/∂t = i·(ħ/m)·∇²ψ equation:

  • −a·e^{−i∙[t − x]/2}·(i/2) = −i·[ħ/(ħ/2)]·a·e^{−i∙[t − x]/2}·(1/4) ⇔ 1 = 1
  • −a·i·e^{−i∙[t − x]} = −i·(ħ/ħ)·a·e^{−i∙[t − x]} ⇔ 1 = 1
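If you’d rather not trust my algebra, the four checks can be repeated numerically. The sketch below is only that, a sketch: it assumes natural units (ħ = 1), a = 1, a minus sign in the exponent of the wavefunction, and central finite differences for the derivatives. The names `psi` and `residual` are mine.

```python
import cmath

def psi(x: float, t: float, w: float) -> complex:
    """Elementary wavefunction exp(-i*w*(t - x)), with a = 1 and hbar = 1."""
    return cmath.exp(-1j * w * (t - x))

def residual(w: float, D: float, x: float = 0.3, t: float = 0.7, h: float = 1e-4) -> float:
    """|dpsi/dt - i*D*d2psi/dx2|, estimated with central finite differences."""
    dpsi_dt = (psi(x, t + h, w) - psi(x, t - h, w)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t, w) - 2 * psi(x, t, w) + psi(x - h, t, w)) / h**2
    return abs(dpsi_dt - 1j * D * d2psi_dx2)

# (label, w = E = p = m in units of hbar, diffusion constant D)
cases = [
    ("m = E = p = 1/2, D = hbar/(2m) = 1", 0.5, 1.0),
    ("m = E = p = 1/2, D = hbar/m = 2", 0.5, 2.0),
    ("m = E = p = 1, D = hbar/(2m) = 1/2", 1.0, 0.5),
    ("m = E = p = 1, D = hbar/m = 1", 1.0, 1.0),
]

for label, w, D in cases:
    print(f"{label}: residual = {residual(w, D):.6f}")
```

A residual of (nearly) zero means the wavefunction solves the equation. Under these assumptions that only happens for the two cases without the 1/2 factor.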

What these equations might tell us is that we should measure mass, energy and momentum in terms of ħ (and not in terms of ħ/2) but that the fundamental uncertainty is ± ħ/2. That solves it all. So the magnitude of the uncertainty is ħ but it separates not 0 and ± 1, but −ħ/2 and +ħ/2. Or, more generally, the following series:

…, −7ħ/2, −5ħ/2, −3ħ/2, −ħ/2, +ħ/2, +3ħ/2,+5ħ/2, +7ħ/2,…

Why are we not surprised? The series represent the energy values that a spin one-half particle can possibly have, and ordinary matter – i.e. all fermions – is composed of spin one-half particles.

To conclude this post, let’s see if we can get any indication on the energy concepts that Schrödinger’s revised wave equation implies. We’ll do so by just calculating the derivatives in the ∂ψ/∂t = i·(ħ/m)·∇²ψ equation (i.e. the equation without the 1/2 factor). Let’s also not assume we’re measuring stuff in natural units, so our wavefunction is just what it is: a·e^{−i·[E·t − p∙x]/ħ}. The derivatives now become:

  • ∂ψ/∂t = −a·i·(E/ħ)·e^{−i∙[E·t − p∙x]/ħ}
  • ∇²ψ = ∂²[a·e^{−i∙[E·t − p∙x]/ħ}]/∂x² = ∂[a·i·(p/ħ)·e^{−i∙[E·t − p∙x]/ħ}]/∂x = −a·(p²/ħ²)·e^{−i∙[E·t − p∙x]/ħ}

So the ∂ψ/∂t = i·(ħ/m)·∇²ψ equation now becomes:

−a·i·(E/ħ)·e^{−i∙[E·t − p∙x]/ħ} = −i·(ħ/m)·a·(p²/ħ²)·e^{−i∙[E·t − p∙x]/ħ} ⇔ E = p²/m = m·v²

It all works like a charm. Note that we do not assume stuff like E = m = p here. It’s all quite general. Also note that the E = p²/m closely resembles the kinetic energy formula one often sees: K.E. = m·v²/2 = m·m·v²/(2m) = p²/(2m). We just don’t have the 1/2 factor in our E = p²/m formula, which is great, because we don’t want it! :-) Of course, if you’d add the 1/2 factor in Schrödinger’s equation again, you’d get it back in your energy formula, which would just be that old kinetic energy formula which gave us all these contradictions and ambiguities. 😦
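To see where the 1/2 factor ends up, it helps to redo the substitution with a bookkeeping coefficient κ in front of ħ/m. The κ is notation I introduce here for illustration only: κ = 1 is the equation without the 1/2 factor, κ = 1/2 is the textbook form. A sketch of the derivation, in one dimension and assuming the minus sign in the exponent of the wavefunction:

```latex
\begin{align*}
\psi &= a\,e^{-i(Et - px)/\hbar}, \qquad
\frac{\partial\psi}{\partial t} = -\frac{iE}{\hbar}\,\psi, \qquad
\frac{\partial^2\psi}{\partial x^2} = -\frac{p^2}{\hbar^2}\,\psi \\
\frac{\partial\psi}{\partial t} &= i\,\kappa\,\frac{\hbar}{m}\,\frac{\partial^2\psi}{\partial x^2}
\;\Longrightarrow\;
-\frac{iE}{\hbar}\,\psi = -\,i\,\kappa\,\frac{\hbar}{m}\,\frac{p^2}{\hbar^2}\,\psi
\;\Longrightarrow\;
E = \kappa\,\frac{p^2}{m} \\
\kappa = 1 &:\; E = \frac{p^2}{m} = m v^2, \qquad
\kappa = \tfrac{1}{2} :\; E = \frac{p^2}{2m} = \tfrac{1}{2}\,m v^2
\end{align*}
```

So the coefficient in the wave equation and the coefficient in the energy formula are one and the same thing.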

Finally, and just to make sure: let me add that, when we wrote that E = m = p – like we did above – we mean their numerical values are the same. Their dimensions remain what they are, of course. Just to make sure you get that subtle point, we’ll do a quick dimensional analysis of that E = p2/m formula:

[E] = [p²/m] ⇔ N·m = N²·s²/kg = N²·s²/[N·s²/m] = N·m = joule (J)
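The same bookkeeping can be done mechanically. Below is a minimal Python sketch that represents a physical dimension as a triple of exponents over the SI base units (kg, m, s). The helper names (`mul`, `div`, `power`) are hypothetical, made up for this illustration:

```python
# A dimension is a triple of exponents over the SI base units (kg, m, s).
KG = (1, 0, 0)
M = (0, 1, 0)
S = (0, 0, 1)

def mul(a, b):
    """Dimension of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    """Dimension of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))

def power(a, n):
    """Dimension of a quantity raised to the integer power n."""
    return tuple(x * n for x in a)

NEWTON = div(mul(KG, M), power(S, 2))   # N = kg*m/s^2
JOULE = mul(NEWTON, M)                  # J = N*m
MOMENTUM = mul(NEWTON, S)               # [p] = N*s

energy_from_p2_over_m = div(power(MOMENTUM, 2), KG)   # [p^2/m]
kg_in_newton_units = div(mul(NEWTON, power(S, 2)), M) # N*s^2/m

print(energy_from_p2_over_m == JOULE)   # does [p^2/m] equal the joule?
print(kg_in_newton_units == KG)         # is kg the same as N*s^2/m?
```

Both comparisons come out as True, which is the dimensional analysis above in a few lines of code.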

So… Well… It’s all perfect. 🙂

Post scriptum: I revised my Deep Blue page after writing this post, and I think that a number of the ideas that I express above are presented more consistently and coherently there. In any case, the missing energy theory makes sense. Think of it: any oscillator involves both kinetic as well as potential energy, and they both add up to twice the average kinetic (or potential) energy. So why not here? When everything is said and done, our elementary wavefunction does describe an oscillator. 🙂

Schrödinger’s equation in action

This post is about something I promised to write about aeons ago: how do we get those electron orbitals out of Schrödinger’s equation? So let me write it now – for the simplest of atoms: hydrogen. I’ll largely follow Richard Feynman’s exposé on it: this text just intends to walk you through it and provide some comments here and there.

Let me first remind you of what that famous Schrödinger’s equation actually represents. In its simplest form – i.e. not including any potential, so then it’s an equation that’s valid for free space only—no force fields!—it reduces to:

i·ħ∙∂ψ/∂t = −(1/2)∙(ħ²/m_eff)∙∇²ψ

Note the enigmatic concept of the effective mass in it (m_eff), as well as the rather awkward 1/2 factor, which we may get rid of by re-defining it. We then write: m_eff(NEW) = 2∙m_eff(OLD), and Schrödinger’s equation then simplifies to:

  • ∂ψ/∂t + i∙(V/ħ)·ψ = i∙(ħ/m_eff)·∇²ψ
  • In free space (no potential): ∂ψ/∂t = i∙(ħ/m_eff)·∇²ψ

In case you wonder where the minus sign went, I just brought the imaginary unit to the other side. Remember 1/i = −i. 🙂

Now, in my post on quantum-mechanical operators, I drew your attention to the fact that this equation is structurally similar to the heat diffusion equation – or to any diffusion equation, really. Indeed, assuming the heat per unit volume (q) is proportional to the temperature (T) – which is the case when expressing T in kelvin (K), so we can write q as q = k·T – we can write the heat diffusion equation as:

heat diffusion 2

Moreover, I noted the similarity is not only structural. There is more to it: both equations model energy flows. How exactly is something I wrote about in my e-publication on this, so let me refer you to that. Let’s jot down the complete equation once more:

∂ψ/∂t + i∙(V/ħ)·ψ = i∙(ħ/m_eff)·∇²ψ

In fact, it is rather surprising that Feynman drops the eff subscript almost immediately, so he just writes: i·ħ∙∂ψ/∂t = −(ħ²/2m)∙∇²ψ + V∙ψ

Let me first remind you that ψ is a function of position in space and time, so we write: ψ = ψ(x, y, z, t) = ψ(r, t), with (x, y, z) = r. And m, on the other side of the equation, is what it always was: the effective electron mass. Now, we talked about the subtleties involved before, so let’s not bother about the definition of the effective electron mass, or wonder where that factor 1/2 comes from here.

What about V? V is the potential energy of the electron: it depends on the distance (r) from the proton. We write: V = −e²/│r│ = −e²/r. Why the minus sign? Because we say the potential energy is zero at large distances (see my post on potential energy). Back to Schrödinger’s equation.

On the left-hand side, we have ħ, and its dimension is J·s (or N·m·s, if you want). So we multiply that with a time derivative and we get J, the unit of energy. On the right-hand side, we have Planck’s constant squared, the mass factor in the denominator, and the Laplacian operator – i.e. ∇² = ∇·∇, with ∇ = (∂/∂x, ∂/∂y, ∂/∂z) – operating on the wavefunction.

Let’s start with the latter. The Laplacian works just the same as for our heat diffusion equation: it gives us a flux density, i.e. something expressed per square meter (1/m²). The ħ² factor gives us J²·s². The mass factor makes everything come out alright, if we use the mass-equivalence relation, which says it’s OK to express the mass in J/(m/s)². [The mass of an electron is usually expressed as being equal to 0.5109989461(31) MeV/c². That unit uses the E = m·c² mass-equivalence formula. As for the eV, you know we can convert that into joule, which is a rather large unit, which is why we use the electronvolt as a measure of energy.] To make a long story short, we’re OK: (J²·s²)·[(m/s)²/J]·(1/m²) = J! Perfect. [As for the Vψ term, that’s obviously expressed in joule too.]
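As a quick arithmetic check of that MeV/c² figure, here is a small Python sketch that converts it to kilogram using m = E/c². The constant names are mine. The numerical values are the exact SI definitions of the electronvolt and the speed of light:

```python
MEV_IN_JOULE = 1.602176634e-13   # 1 MeV expressed in joule (exact SI value)
C = 299_792_458.0                # speed of light in m/s (exact SI value)

electron_mass_mev = 0.5109989461  # value quoted in the text, in MeV/c^2

# m = E/c^2: energy in joule divided by (m/s)^2 gives kilogram.
electron_mass_kg = electron_mass_mev * MEV_IN_JOULE / C**2

print(electron_mass_kg)  # close to 9.109e-31 kg
```

So expressing the mass in J/(m/s)² is not a trick: it’s the kilogram in disguise.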

In short, Schrödinger’s equation expresses the energy conservation law too, and we may express it per square meter or per second or per cubic meter as well, if we’d wish: we can just multiply both sides by 1/m² or 1/s or 1/m³ or by whatever dimension you want. Again, if you want more detail on the Schrödinger equation as an energy propagation mechanism, read the mentioned e-publication. So let’s get back to our equation, which, taking into account our formula for V, now looks like this:
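As a sketch, and assuming Feynman’s form of the equation (with the 1/2 factor and the plain m) and the V = −e²/r potential energy formula, the equation can be written out as:

```latex
i\hbar\,\frac{\partial\psi}{\partial t}
  = -\frac{\hbar^2}{2m}\,\nabla^2\psi - \frac{e^2}{r}\,\psi
```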


Feynman then injects one of these enigmatic phrases—enigmatic for novices like us, at least!

“We want to look for definite energy states, so we try to find solutions which have the form: ψ(r, t) = e^{−(i/ħ)·E·t}·ψ(r).”

At first, you may think he’s just trying to get rid of the relativistic correction in the argument of the wavefunction. Indeed, as I explain in that little booklet of mine, the −(p/ħ)·x term in the argument of the elementary wavefunction e^{−i·θ} = e^{−i·[(E/ħ)·t − (p/ħ)·x]} is there because the young Comte Louis de Broglie, back in 1924, when he wrote his groundbreaking PhD thesis, suggested the θ = ω∙t − k∙x = (E∙t − p∙x)/ħ formula for the argument of the wavefunction, as he knew that relativity theory had already established the invariance of the four-vector (dot) product p_μ·x^μ = E∙t − p∙x = p′_μ·x′^μ = E′∙t′ − p′∙x′. [Note that Planck’s constant, as a physical constant, should obviously not depend on the reference frame either. Hence, if the E∙t − p∙x product is invariant, so is (E∙t − p∙x)/ħ.] So the θ = E∙t − p∙x and the θ = E_0∙t′ = E′·t′ are fully equivalent. Using lingo, we can say that the argument of the wavefunction is a Lorentz scalar and, therefore, invariant under a Lorentz boost. Sounds much better, doesn’t it? 🙂

But… Well. That’s not why Feynman says what he says. He just makes abstraction of uncertainty here, as he looks for states with a definite energy, indeed. Nothing more, nothing less. Indeed, you should just note that we can re-write the elementary a·e^{−i·[(E/ħ)·t − (p/ħ)·x]} function as a·e^{−(i/ħ)·E·t}·e^{i·(p/ħ)·x}. So that’s what Feynman does here: he just eases the search for functional forms that satisfy Schrödinger’s equation. You should note the following:

  1. Writing the coefficient in front of the complex exponential as ψ(r) = a·e^{i·(p/ħ)·x} does the trick we want it to do: we do not want that coefficient to depend on time: it should only depend on the size of our ‘box’ in space, as I explained in one of my posts.
  2. Having said that, you should also note that the ψ in the ψ(r, t) function and the ψ in the ψ(r) denote two different beasts: one is a function of two variables (r and t), while the other makes abstraction of the time factor and, hence, becomes a function of one variable only (r). I would have used another symbol for the ψ(r) function, but then the Master probably just wants to test your understanding. 🙂
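The factorization is easy to check numerically. The sketch below assumes natural units (ħ = 1), a minus sign in the exponent of the elementary wavefunction, and arbitrary illustrative values for E, p, x and t. The function names are mine:

```python
import cmath

HBAR = 1.0  # natural units, assumed only to keep the numbers simple

def elementary(E: float, p: float, x: float, t: float, a: float = 1.0) -> complex:
    """Elementary wavefunction a*exp(-i*(E*t - p*x)/hbar)."""
    return a * cmath.exp(-1j * (E * t - p * x) / HBAR)

def time_factor(E: float, t: float) -> complex:
    """The exp(-(i/hbar)*E*t) factor that carries all of the time dependence."""
    return cmath.exp(-1j * E * t / HBAR)

def space_factor(p: float, x: float, a: float = 1.0) -> complex:
    """The time-independent coefficient a*exp(i*(p/hbar)*x)."""
    return a * cmath.exp(1j * p * x / HBAR)

E, p, x, t = 1.3, 0.8, 2.0, 0.5   # arbitrary illustrative values
lhs = elementary(E, p, x, t)
rhs = time_factor(E, t) * space_factor(p, x)

print(abs(lhs - rhs))  # zero, up to floating-point rounding
```

Note that `space_factor` takes no time argument at all, which is the whole point of the exercise.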

In any case, the differential equation we need to solve now becomes:


Huh? How does that work? Well… Just take the time derivative of e^{−(i/ħ)·E·t}·ψ(r), multiply with the i·ħ in front of that term in Schrödinger’s original equation and re-arrange the terms. [Just do it: ∂[e^{−(i/ħ)·E·t}·ψ(r)]/∂t = −(i/ħ)·E·e^{−(i/ħ)·E·t}·ψ(r). Now multiply that with i·ħ: the ħ factor cancels and the minus disappears because i² = −1.]
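Spelled out step by step, and again assuming Feynman’s form of the equation with V = −e²/r, the substitution would go as follows (a sketch, not a quote from the Lectures):

```latex
\begin{align*}
\psi(\mathbf{r}, t) &= e^{-(i/\hbar)Et}\,\psi(\mathbf{r}) \\
i\hbar\,\frac{\partial\psi(\mathbf{r}, t)}{\partial t}
  &= i\hbar\left(-\frac{i}{\hbar}E\right)e^{-(i/\hbar)Et}\,\psi(\mathbf{r})
   = E\,e^{-(i/\hbar)Et}\,\psi(\mathbf{r}) \\
E\,e^{-(i/\hbar)Et}\,\psi(\mathbf{r})
  &= e^{-(i/\hbar)Et}\left[-\frac{\hbar^2}{2m}\,\nabla^2\psi(\mathbf{r})
     - \frac{e^2}{r}\,\psi(\mathbf{r})\right] \\
\Longrightarrow\quad
-\frac{\hbar^2}{2m}\,\nabla^2\psi(\mathbf{r})
  &= \left(E + \frac{e^2}{r}\right)\psi(\mathbf{r})
\end{align*}
```

The time factor appears on both sides and, hence, drops out. What is left is an equation in ψ(r) only.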

So now we need to solve that differential equation, i.e. we need to find functional forms for ψ – and please do note we’re talking ψ(r) here – not ψ(r, t)! – that satisfy the above equation. Interesting question: is our equation still Schrödinger’s equation? Well… It is and it isn’t. Any linear combination of the definite energy solutions we find will also solve Schrödinger’s equation, but we limited the solution set here to those definite energy solutions only. Hence, it’s not quite the same equation. We removed the time dependency here – and in a rather interesting way, I’d say.

The next thing to do is to switch from Cartesian to polar coordinates. Why? Well… When you have a central-force problem – like this one (because of the potential) – it’s easier to solve them using polar coordinates. In fact, because we’ve got three dimensions here, we’re actually talking a spherical coordinate system. The illustration and formulas below show how spherical and Cartesian coordinates are related:

 x = r·sinθ·cosφ; y = r·sinθ·sinφ; z = r·cosθ
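In code, the transformation and its inverse can be sketched as follows. The function names are mine, and the inverse uses the usual conventions (θ between 0 and π, φ between −π and π), which is an assumption on my part:

```python
import math

def spherical_to_cartesian(r: float, theta: float, phi: float):
    """(r, polar angle theta, azimuthal angle phi) -> (x, y, z)."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z

def cartesian_to_spherical(x: float, y: float, z: float):
    """(x, y, z) -> (r, theta, phi), with theta in [0, pi] and phi in (-pi, pi]."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r) if r > 0 else 0.0
    phi = math.atan2(y, x)
    return r, theta, phi

# Round trip for an arbitrary illustrative point.
point = (2.0, 0.7, 1.2)
back = cartesian_to_spherical(*spherical_to_cartesian(*point))
print(back)  # the original (r, theta, phi), up to floating-point rounding
```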


As you know, θ (theta) is referred to as the polar angle, while φ (phi) is the azimuthal angle, and the coordinate transformation formulas can be easily derived. The rather simple differential equation above now becomes the following monster:

(1/r)·∂²(r·ψ)/∂r² + (1/r²)·{(1/sinθ)·∂[sinθ·∂ψ/∂θ]/∂θ + (1/sin²θ)·∂²ψ/∂φ²} = −(2m/ħ²)·(E + e²/r)·ψ

Huh? Yes, I am very sorry. That’s how it is. Feynman does this to help us. If you think you can get to the solutions by directly solving the equation in Cartesian coordinates, please do let me know. 🙂 To tame the beast, we might imagine to first look for solutions that are spherically symmetric, i.e. solutions that do not depend on θ and φ. That means we could rotate the reference frame and none of the amplitudes would change. That means the ∂ψ/∂θ and ∂ψ/∂φ (partial) derivatives in our formula are equal to zero. These spherically symmetric states, or s-states as they are referred to, are states with zero (orbital) angular momentum, but you may want to think about that statement before accepting it. 🙂 [It’s not that there’s no angular momentum (on the contrary: there’s lots of it), but the total angular momentum should obviously be zero, and so that’s what is meant when these states are denoted as l = 0 states.] So now we have to solve:

(1/r)·d²(r·ψ)/dr² = −(2m/ħ²)·(E + e²/r)·ψ

Now that looks somewhat less monstrous, but Feynman still fills two rather dense pages to show how this differential equation can be solved. It’s not only tedious but also complicated, so please check it yourself by clicking on the link. One of the steps is a switch in variables, or a re-scaling, I should say. Both E and r are now measured as follows:



The complicated-looking factors are just the Bohr radius (r_B = ħ²/(m·e²) ≈ 0.529 Å) and the Rydberg energy (E_R = m·e⁴/(2·ħ²) ≈ 13.6 eV). We calculated those a long time ago using a rather heuristic model to describe an atom. In case you’d want to check the dimensions, note e² is a rather special animal. It’s got nothing to do with Euler’s number. Instead, e² is equal to k_e·q_e², and the k_e here is Coulomb’s constant: k_e = 1/(4πε0). This allows us to re-write the force between two electrons as a function of the distance: F = e²/r². This, in turn, explains the rather weird dimension of e²: [e²] = N·m² = J·m. But I am digressing too much. The bottom line is: the various energy levels that fit the equation, i.e. the allowable energies, are fractions of the Rydberg energy, i.e. E_R = m·e⁴/(2·ħ²). To be precise, the formula for the nth energy level is:

E_n = −E_R/n².
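If you want to see the numbers, the following Python sketch calculates the Bohr radius, the Rydberg energy and the first few energy levels from the formulas above. The constants are hard-coded CODATA 2018 values, and the variable names are mine, so take it as an illustration rather than a reference implementation:

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J*s
M_E = 9.1093837015e-31    # electron mass, kg
Q_E = 1.602176634e-19     # elementary charge, C
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m

# The e^2 of the text: k_e*q_e^2, with k_e = 1/(4*pi*eps0). Its unit is J*m.
E2 = Q_E**2 / (4 * math.pi * EPS0)

bohr_radius = HBAR**2 / (M_E * E2)             # r_B, in meter
rydberg_energy = M_E * E2**2 / (2 * HBAR**2)   # E_R, in joule

def energy_level_ev(n: int) -> float:
    """E_n = -E_R/n^2, converted from joule to electronvolt."""
    return -rydberg_energy / Q_E / n**2

print(bohr_radius * 1e10)        # in angstrom, about 0.529
for n in (1, 2, 3):
    print(n, energy_level_ev(n)) # about -13.6, -3.4 and -1.5 eV
```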

The interesting thing is that the spherically symmetric solutions yield real-valued ψ(r) functions. The solutions for n = 1, 2, and 3, and their graphs, are given below.




graph

As Feynman writes, all of the wave functions approach zero rapidly for large r (also, confusingly, denoted as ρ) after oscillating a few times, with the number of ‘bumps’ equal to n. Of course, you should note that you should put the time factor back in, in order to correctly interpret these functions. Indeed, remember how we separated them when we wrote:

ψ(r, t) = e^{−i·(E/ħ)·t}·ψ(r)

We might say the ψ(r) function is sort of an envelope function for the whole wavefunction, but it’s not quite as straightforward as that. :-/ However, I am sure you’ll figure it out.
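For reference, the standard (unnormalized) s-state functions for n = 1, 2 and 3 are e^{−ρ}, (1 − ρ/2)·e^{−ρ/2} and (1 − 2ρ/3 + 2ρ²/27)·e^{−ρ/3}. I am assuming here that the rescaling is ρ = r/r_B and ε = E/E_R, in which case the radial equation becomes d²(ρ·ψ)/dρ² = −(ε + 2/ρ)·(ρ·ψ), with ε = −1/n². The Python sketch below checks numerically, using finite differences, that these three functions satisfy that equation. The function names are mine:

```python
import math

def psi_s(n: int, rho: float) -> float:
    """Unnormalized s-state function for n = 1, 2 or 3 (rho = r / Bohr radius)."""
    if n == 1:
        return math.exp(-rho)
    if n == 2:
        return (1 - rho / 2) * math.exp(-rho / 2)
    if n == 3:
        return (1 - 2 * rho / 3 + 2 * rho**2 / 27) * math.exp(-rho / 3)
    raise ValueError("only n = 1, 2 and 3 are included in this sketch")

def radial_residual(n: int, rho: float, h: float = 1e-4) -> float:
    """|f'' + (eps + 2/rho)*f| with f = rho*psi and eps = -1/n^2."""
    def f(s: float) -> float:
        return s * psi_s(n, s)
    second_derivative = (f(rho + h) - 2 * f(rho) + f(rho - h)) / h**2
    eps = -1.0 / n**2
    return abs(second_derivative + (eps + 2.0 / rho) * f(rho))

for n in (1, 2, 3):
    print(n, [radial_residual(n, rho) for rho in (0.5, 2.0, 5.0)])
```

All residuals are zero up to the error of the finite-difference approximation, so the three functions do what they are supposed to do.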

States with an angular dependence

So far, so good. But what if those partial derivatives are not zero? Now the calculations become really complicated. Among other things, we need these transformation matrices for rotations, which we introduced a very long time ago. As mentioned above, I don’t have the intention to copy Feynman here, who needs another two or three dense pages to work out the logic. Let me just state the grand result:

  • We’ve got a whole range of definite energy states, which correspond to orbitals that form an orthonormal basis for the actual wavefunction of the electron.
  • The orbitals are characterized by three quantum numbers, denoted as l, n and m respectively:
    • The l is the quantum number of (total) angular momentum, and it’s equal to 0, 1, 2, 3, etcetera. [Of course, as usual, we’re measuring in units of ħ.] The l = 0 states are referred to as s-states, the l = 1 states are referred to as p-states, and the l = 2 states are d-states. They are followed by f, g, h, etcetera, for no particular good reason. [As Feynman notes: “The letters don’t mean anything now. They did once – they meant “sharp” lines, “principal” lines, “diffuse” lines and “fundamental” lines of the optical spectra of atoms. But those were in the days when people did not know where the lines came from. After f there were no special names, so we now just continue with g, h, and so on.”]
    • The m is referred to as the ‘magnetic’ quantum number, and it ranges from −l to +l.
    • The n is the ‘principal’ quantum number, and it goes from l + 1 to infinity (∞).
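The rules above are easy to turn into a little enumeration. The sketch below lists all (n, l, m) combinations for a given n. Note that saying n goes from l + 1 upwards is the same as saying that l goes from 0 to n − 1. The function name and the `LETTERS` string are mine, and spin is not included:

```python
def orbitals(n: int):
    """All (n, l, m) triples for principal quantum number n (spin not included)."""
    return [(n, l, m) for l in range(n) for m in range(-l, l + 1)]

LETTERS = "spdfgh"  # spectroscopic letters for l = 0, 1, 2, 3, 4, 5

for n in (1, 2, 3):
    states = orbitals(n)
    labels = sorted({f"{n}{LETTERS[l]}" for _, l, _ in states})
    print(n, len(states), labels)
```

For each n, you get n² orbitals: one for n = 1, four for n = 2, and nine for n = 3.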

What do these things actually look like? Let me insert two illustrations here: one from Feynman, and the other from Wikipedia.


The number in front just tracks the number of s-, p-, d-, etc. orbital. The shaded region shows where the amplitudes are large, and the plus and minus signs show the relative sign of the amplitude. [See my remark above on the fact that the ψ factor is real-valued, even if the wavefunction as a whole is complex-valued.] The Wikipedia image shows the same density plots but, as it was made some 50 years later, with some more color. 🙂


This is it, guys. Feynman takes it further by also developing the electron configurations for the next 35 elements in the periodic table but… Well… I am sure you’ll want to read the original here, rather than my summaries. 🙂

Congrats! We now know all that we need to know. All that remains is lots of practical exercises, so you can be sure you master the material for your exam. 🙂

Schrödinger’s equation and the two de Broglie relations

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter that is being presented here, which you can find by clicking on the provided link. In fact, I actually made some (small) mistakes when writing the post below.

Original post:

I’ve re-visited the de Broglie equations a couple of times already. In this post, however, I want to relate them to Schrödinger’s equation. Let’s start with the de Broglie equations first. Equations. Plural. Indeed, most popularizing books on quantum physics will give you only one of the two de Broglie equations—the one that associates a wavelength (λ) with the momentum (p) of a matter-particle:

λ = h/p

In fact, even the Wikipedia article on the ‘matter wave’ starts off like that and is, therefore, very confusing, because, for a good understanding of quantum physics, one needs to realize that the λ = h/p equality is just one of a pair of two ‘matter wave’ equations:

  1. λ = h/p
  2. f = E/h

These two equations give you the spatial and temporal frequency of the wavefunction respectively. Now, those two frequencies are related – and I’ll show you how in a minute – but they are not the same. It’s like space and time: they are related, but they are definitely not the same. Now, because any wavefunction is periodic, the argument of the wavefunction – which we’ll introduce shortly – will be some angle and, hence, we’ll want to express it in radians (or – if you’re really old-fashioned – degrees). So we’ll want to express the frequency as an angular frequency (i.e. in radians per second, rather than in cycles per second), and the wavelength as a wave number (i.e. in radians per meter). Hence, you’ll usually see the two de Broglie equations written as:

  1. k = p/ħ
  2. ω = E/ħ

It’s the same: ω = 2π∙f and f = 1/T (T is the period of the oscillation), and k = 2π/λ and then ħ = h/2π, of course! [Just to remove all ambiguities: stop thinking about degrees. They’re a legacy of the Babylonians, who thought the numbers 6, 12, and 60 had particular religious significance. So that’s why we have twelve-hour nights and twelve-hour days, with each hour divided into sixty minutes and each minute divided into sixty seconds, and – particularly relevant in this context – why ‘once around’ is divided into 6×60 = 360 degrees. Radians are the unit in which we should measure angles because… Well… Google it. They measure an angle in distance units. That makes things easier, a lot easier! Indeed, when studying physics, the last thing you want is artificial units, like degrees.]

So… Where were we? Oh… Yes. The de Broglie relation. Popular textbooks usually commit two sins. One is that they forget to say we have two de Broglie relations, and the other one is that the E = h∙f relationship is presented as the twin of the Planck-Einstein relation for photons, which relates the energy (E) of a photon to its frequency (ν): E = h∙ν = ħ∙ω. The former is criminal neglect, I feel. As for the latter… Well… It’s true and not true: it’s incomplete, I’d say, and, therefore, also very confusing.

Why? Because both things lead one to try to relate the two equations, as momentum and energy are obviously related. In fact, I’ve wasted days, if not weeks, on this. How are they related? What formula should we use? To answer that question, we need to answer another one: what energy concept should we use? Potential energy? Kinetic energy? Should we include the equivalent energy of the rest mass?

One quickly gets into trouble here. For example, one can try the kinetic energy, K.E. = m∙v²/2, and use the definition of momentum (p = m∙v), to write E = p²/(2m), and then we could relate the frequency f to the wavelength λ using the general rule that the traveling speed of a wave is equal to the product of its wavelength and its frequency (v = λ∙f). But if E = p²/(2m) and f = v/λ, we get:

p²/(2m) = h∙v/λ ⇔ λ = 2∙h/p

So that is almost right, but not quite: that factor 2 should not be there. In fact, it’s easy to see that we’d get de Broglie’s λ = h/p equation from his E = h∙f equation if we’d use E = m∙v² rather than E = m∙v²/2. In fact, the E = m∙v² relation comes out of them if we just multiply the two and, yes, use that v = f·λ relation once again:

  1. f·λ = (E/h)·(h/p) = E/p
  2. v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v²
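The factor 2 is easy to reproduce with actual numbers. The sketch below takes an electron moving at one million meter per second (an arbitrary, non-relativistic illustrative value) and calculates the wavelength from f = E/h and λ = v/f for both energy concepts. The function name is mine:

```python
H = 6.62607015e-34        # Planck constant in J*s (exact SI value)
M_E = 9.1093837015e-31    # electron mass in kg (CODATA 2018)

def wavelength_from_energy(E: float, v: float) -> float:
    """lambda = v/f, with the frequency f = E/h."""
    return v * H / E

v = 1.0e6                 # illustrative velocity in m/s
p = M_E * v               # classical momentum
de_broglie = H / p        # the wavelength we want: lambda = h/p

ratio_kinetic = wavelength_from_energy(0.5 * M_E * v**2, v) / de_broglie
ratio_mv2 = wavelength_from_energy(M_E * v**2, v) / de_broglie

print(ratio_kinetic)  # about 2: the unwanted factor
print(ratio_mv2)      # about 1
```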

But… Well… E = m∙v²? How could we possibly justify the use of that formula?

The answer is simple: our v = f·λ equation is wrong. It’s just something one shouldn’t apply to the complex-valued wavefunction. The ‘correct’ velocity formula for the complex-valued wavefunction should have that 1/2 factor, so we’d write 2·f·λ = v to make things come out alright. But where would this formula come from?

Well… Now it’s time to introduce the wavefunction.

The wavefunction

You know the elementary wavefunction:

ψ = ψ(x, t) = e^{−i(ωt − k∙x)} = e^{i(k∙x − ωt)} = cos(k∙x − ωt) + i∙sin(k∙x − ωt)

As for terminology, note that the term ‘wavefunction’ refers to what I write above, while the term ‘wave equation’ usually refers to Schrödinger’s equation, which I’ll introduce in a minute. Also note the use of boldface indicates we’re talking vectors, so we’re multiplying the wavenumber vector k with the position vector x = (x, y, z) here, although we’ll often simplify and assume one-dimensional space. In any case…

So the question is: why can’t we use the v = f·λ formula for this wave? The period of cosθ + i·sinθ is the same as that of the sine and cosine function considered separately: cos(θ+2π) + i·sin(θ+2π) = cosθ + i·sinθ, so T = 2π and f = 1/T = 1/2π do not change. So the f, T and λ should be the same, no?

No. We’ve got two oscillations for the price of one here: one ‘real’ and one ‘imaginary’—but both are equally essential and, hence, equally ‘real’. So we’re actually combining two waves. So it’s just like adding other waves: when adding waves, one gets a composite wave that has (a) a phase velocity and (b) a group velocity.

Huh? Yes. It’s quite interesting. When adding waves, we usually have a different ω and k for each of the component waves, and the phase and group velocity will depend on the relation between those ω’s and k’s. That relation is referred to as the dispersion relation. To be precise, if you’re adding waves, then the phase velocity of the composite wave will be equal to v_p = ω/k, and its group velocity will be equal to v_g = dω/dk. We’ll usually be interested in the group velocity, and so to calculate that derivative, we need to express ω as a function of k, of course, so we write ω as some function of k, i.e. ω = ω(k). There are a number of possibilities then:

  1. ω and k may be directly proportional, so we can write ω as ω = a∙k: in that case, we find that v_p = v_g = a.
  2. ω and k are not directly proportional but have a linear relationship, so we can write ω as ω = a∙k + b. In that case, we find that v_g = a and… Well… We’ve got a problem calculating v_p, because we don’t know what k to use!
  3. ω and k may be non-linearly related, in which case… Well… One has to do the calculation and see what comes out. 🙂

Let’s now look back at our e^{i(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt) function. You’ll say that we’ve got only one ω and one k here, so we’re not adding waves with different ω’s and k’s. So… Well… What?

That’s where the de Broglie equations come in. Look: k = p/ħ, and ω = E/ħ. If we now use the correct energy formula, i.e. the kinetic energy formula E = m·v²/2 (rather than that nonsensical E = m·v² equation) – which we can also write as E = m·v·v/2 = p·v/2 = p·p/2m = p²/2m, with v = p/m the classical velocity of the elementary particle that Louis de Broglie was thinking of – then we can calculate the group velocity of our e^{i(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt) as:

v_g = dω/dk = d[E/ħ]/d[p/ħ] = dE/dp = d[p²/2m]/dp = 2p/2m = p/m = v

However, the phase velocity of our e^{i(kx − ωt)} is:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = (p²/2m)/p = p/2m = v/2

So that factor 1/2 only appears for the phase velocity. Weird, isn’t it? We find that the group velocity (v_g) of the e^{i(kx − ωt)} function is equal to the classical velocity of our particle (i.e. v), but that its phase velocity (v_p) is equal to v divided by 2.
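You can check both velocities numerically from the dispersion relation ω(k) = ħ·k²/(2m), which is just E = p²/2m combined with the two de Broglie relations. The sketch below uses an electron and an illustrative velocity of one million meter per second. The function names are mine, and the derivative is approximated by a central finite difference:

```python
HBAR = 1.054571817e-34    # reduced Planck constant, J*s
M_E = 9.1093837015e-31    # electron mass, kg

def omega(k: float) -> float:
    """Dispersion relation omega(k) = hbar*k^2/(2m), i.e. E = p^2/(2m)."""
    return HBAR * k**2 / (2 * M_E)

def phase_velocity(k: float) -> float:
    """v_p = omega/k."""
    return omega(k) / k

def group_velocity(k: float, rel_step: float = 1e-6) -> float:
    """v_g = d(omega)/dk, estimated with a central finite difference."""
    dk = k * rel_step
    return (omega(k + dk) - omega(k - dk)) / (2 * dk)

v = 1.0e6                 # illustrative classical velocity in m/s
k = M_E * v / HBAR        # wave number from k = p/hbar

print(group_velocity(k) / v)   # about 1
print(phase_velocity(k) / v)   # about 0.5
```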

Hmm… What to say? Well… Nothing much, except that it makes sense, and very much so, because it’s the group velocity of the wavefunction that’s associated with the classical velocity of a particle, not the phase velocity. In fact, if we include the rest mass in our energy formula, so if we’d use the relativistic E = γ·m_0·c² and p = γ·m_0·v formulas (with γ the Lorentz factor), then we find that v_p = ω/k = E/p = (γ·m_0·c²)/(γ·m_0·v) = c²/v, and so that’s a superluminal velocity, because v is always smaller than c!
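The relativistic case can be checked in the same way. The sketch below assumes the standard relativistic energy-momentum relation E² = (p·c)² + (m_0·c²)², an electron, and an illustrative velocity of 0.6·c. The function name is mine, and the group velocity dE/dp is again approximated by a finite difference:

```python
import math

C = 299_792_458.0         # speed of light in m/s (exact SI value)
M0 = 9.1093837015e-31     # electron rest mass in kg (CODATA 2018)

def energy(p: float) -> float:
    """Total relativistic energy E = sqrt((p*c)^2 + (m0*c^2)^2)."""
    return math.sqrt((p * C) ** 2 + (M0 * C**2) ** 2)

v = 0.6 * C                                # illustrative velocity
gamma = 1.0 / math.sqrt(1 - (v / C) ** 2)  # Lorentz factor
p = gamma * M0 * v                         # relativistic momentum

v_phase = energy(p) / p                    # v_p = E/p
dp = p * 1e-6
v_group = (energy(p + dp) - energy(p - dp)) / (2 * dp)   # v_g = dE/dp

print(v_phase / C)             # c/v, larger than 1
print(v_group / v)             # about 1
print(v_phase * v_group / C**2)  # about 1: v_p * v_g = c^2
```

So, with the rest energy included, the group velocity is still the classical velocity, and it’s only the phase velocity that goes superluminal.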

What? That’s even weirder! If we take the kinetic energy only, we find a phase velocity equal to v/2, but if we include the rest energy, then we get a superluminal phase velocity. It must be one or the other, no? Yep! You’re right! So that makes us wonder: is E = m·v2/2 really the right energy concept to use? The answer is unambiguous: no! It isn’t! And, just for the record, our young nobleman didn’t use the kinetic energy formula when he postulated his equations in his now famous PhD thesis.

So what did he use then? Where did he get his equations?

I am not sure. 🙂 A stroke of genius, it seems. According to Feynman, that’s how Schrödinger got his equation too: intuition, brilliance. In short, a stroke of genius. 🙂 Let’s relate these two gems.

Schrödinger’s equation and the two de Broglie relations

Louis de Broglie and Erwin Schrödinger published their equations in 1924 and 1926 respectively. Can they be related? The answer is: yes, of course! Let’s first look at de Broglie‘s energy concept, however. Louis de Broglie was very familiar with Einstein’s work and, hence, he knew that the energy of a particle consisted of three parts:

  1. The particle’s rest energy m₀c², which de Broglie referred to as internal energy (Eint): this ‘internal energy’ includes the rest mass of the ‘internal pieces’, as he put it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy);
  2. Any potential energy it may have because of some field (so de Broglie was not assuming the particle was traveling in free space), which we’ll denote by V: the field(s) can be anything (gravitational, electromagnetic, you name it): whatever changes the energy because of the position of the particle;
  3. The particle’s kinetic energy, which we wrote in terms of its momentum p: K.E. = m·v²/2 = m²·v²/(2m) = (m·v)²/(2m) = p²/(2m).

Indeed, in my previous posts, I would write the wavefunction as de Broglie wrote it, which is as follows:

ψ(θ) = ψ(x, t) = a·e^(−iθ) = a·e^(−i[(Eint + p²/(2m) + V)·t − p∙x]/ħ)

In those posts – such as my post on virtual particles – I’d also note how a change in potential energy plays out: a change in potential energy, when moving from one place to another, would change the wavefunction, but through the momentum only, so it would impact the spatial frequency only. So the change in potential would not change the temporal frequencies ω1 = (Eint + p1²/(2m) + V1)/ħ and ω2 = (Eint + p2²/(2m) + V2)/ħ: they are one and the same. Why? Or why not, I should say? Because of the energy conservation principle, or its equivalent in quantum mechanics. The temporal frequency f or ω, i.e. the time-rate of change of the phase of the wavefunction, does not change: all of the change in potential, and the corresponding change in kinetic energy, goes into changing the spatial frequency, i.e. the wave number k or the wavelength λ, as potential energy becomes kinetic or vice versa.

So is that consistent with what we wrote above, that E = m·v²? Maybe. Let’s think about it. Let’s first look at Schrödinger’s equation in free space (i.e. a space with zero potential) once again:

∂ψ/∂t = i·(ħ/2m)·∇²ψ

If we insert our ψ = e^(i(kx − ωt)) formula in Schrödinger’s free-space equation, we get the following nice result. [To keep things simple, we’re just assuming one-dimensional space for the calculations, so ∇²ψ = ∂²ψ/∂x². But the result can easily be generalized.] The time derivative on the left-hand side is ∂ψ/∂t = −iω·e^(i(kx − ωt)). The second-order derivative on the right-hand side is ∂²ψ/∂x² = (ik)·(ik)·e^(i(kx − ωt)) = −k²·e^(i(kx − ωt)). The e^(i(kx − ωt)) factor on both sides cancels out and, hence, equating both sides gives us the following condition:

−iω = −(iħ/2m)·k² ⇔ ω = (ħ/2m)·k²

Substituting ω = E/ħ and k = p/ħ yields:

E/ħ = (ħ/2m)·p²/ħ² = m²·v²/(2m·ħ) = m·v²/(2ħ) ⇔ E = m·v²/2
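If you’d rather see that condition at work than take the algebra on faith, the following is a small numerical sketch. It plugs the plane wave into the free-space equation, using finite differences for the derivatives. The natural units (ħ = m = 1), the wave number and the step size are assumptions chosen for convenience, and the function names are mine.

```python
import cmath

hbar, m = 1.0, 1.0   # natural units, assumed for this sketch
k = 1.7              # arbitrary wave number

def psi(x, t, omega):
    """Plane wave e^(i(kx - omega*t))."""
    return cmath.exp(1j * (k * x - omega * t))

def residual(omega, x=0.3, t=0.8, h=1e-4):
    """|d(psi)/dt - i*(hbar/2m)*d2(psi)/dx2|, using central differences."""
    dpsi_dt = (psi(x, t + h, omega) - psi(x, t - h, omega)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t, omega) - 2 * psi(x, t, omega)
                 + psi(x - h, t, omega)) / h**2
    return abs(dpsi_dt - 1j * (hbar / (2 * m)) * d2psi_dx2)

omega_ok = hbar * k**2 / (2 * m)   # the condition derived above
omega_bad = 2 * omega_ok           # any other frequency fails

print("residual with omega = hbar*k^2/2m:", residual(omega_ok))
print("residual with twice that omega:   ", residual(omega_bad))
```

The first residual should be tiny (only discretization and rounding error remain), while the second one is of the order of ω itself.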

Bingo! We get that kinetic energy formula! But now… What if we’d not be considering free space? In other words: what if there is some potential? Well… We’d use the complete Schrödinger equation, which is:

i·ħ·∂ψ/∂t = −(ħ²/2m)·∇²ψ + V·ψ

Huh? Why is there a minus sign now? Look carefully: I moved the iħ factor on the left-hand side to the other side when writing the free-space version. If we’d do that for the complete equation, we’d get:

∂ψ/∂t = i·(ħ/2m)·∇²ψ − (i/ħ)·V·ψ

I like that representation a lot more – if only because it makes it a lot easier to interpret the equation – but, for some reason I don’t quite understand, you won’t find it like that in textbooks. Now how does it work when using the complete equation, so when we add the −(i/ħ)·V·ψ term? It’s simple: the e^(i(kx − ωt)) factor also cancels out, and so we get:

−iω = −(iħ/2m)·k² − (i/ħ)·V ⇔ ω = (ħ/2m)·k² + V/ħ

Substituting ω = E/ħ and k = p/ħ once more now yields:

E/ħ = (ħ/2m)·p²/ħ² + V/ħ = m²·v²/(2m·ħ) + V/ħ = m·v²/(2ħ) + V/ħ ⇔ E = m·v²/2 + V

Bingo once more!

The only thing that’s missing now is the particle’s rest energy m₀c², which de Broglie referred to as internal energy (Eint). That includes everything, i.e. not only the rest mass of the ‘internal pieces’ (as said, now we call those ‘internal pieces’ quarks) but also their binding energy (i.e. the quarks’ interaction energy). So how do we get that energy concept out of Schrödinger’s equation? There’s only one answer to that: that energy is just like V. We can, quite simply, just add it.

That brings us to the last and final question: what about our vg = v result if we do not use the kinetic energy concept, but the E = m·v²/2 + V + Eint concept? The answer is simple: nothing. We still get the same, because we’re taking a derivative and the V and Eint just appear as constants, and so their derivative with respect to p is zero. Check it:

vg = dω/dk = d[E/ħ]/d[p/ħ] = dE/dp = d[p²/2m + V + Eint]/dp = 2p/2m = p/m = v
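Here is a quick numerical sketch of the same point. The constant E0 below stands in for V + Eint, and the natural units (ħ = m = 1) and sample values are assumptions made just for this example.

```python
# Group and phase velocity when a constant energy offset E0 (standing in for
# V + Eint) is added to the kinetic energy. Natural units (hbar = m = 1) and
# the sample values below are assumptions made for this sketch.
hbar = 1.0
m = 1.0

def omega(k, E0=0.0):
    """omega = E/hbar with E = p^2/2m + E0 and p = hbar*k."""
    p = hbar * k
    return (p**2 / (2 * m) + E0) / hbar

def group_velocity(k, E0=0.0, dk=1e-6):
    """d(omega)/dk by a central finite difference."""
    return (omega(k + dk, E0) - omega(k - dk, E0)) / (2 * dk)

def phase_velocity(k, E0=0.0):
    return omega(k, E0) / k

v = 2.0              # classical velocity
k = m * v / hbar     # corresponding wave number

for E0 in (0.0, 5.0, 50.0):
    print(f"E0={E0:5.1f}  v_g={group_velocity(k, E0):.4f}  "
          f"v_p={phase_velocity(k, E0):.4f}")
```

The group velocity stays at v whatever the value of E0, while the phase velocity moves around with it.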

It’s now pretty clear how this thing works. To localize our particle, we just superimpose a zillion of these e^(i(kx − ωt)) functions. The only condition is that we’ve got that fixed vg = dω/dk = v relationship, but so we do have such a fixed relationship, as you can see above. In fact, the Wikipedia article on the dispersion relation mentions that the de Broglie equations imply the following relation between ω and k: ω = ħk²/2m. As you can see, that’s not entirely correct: the author conveniently forgets the potential (V) and the rest energy (Eint) in the energy formula here!

What about the phase velocity? That’s a different story altogether. You can think about that for yourself. 🙂

I should make one final point here. As said, in order to localize a particle (or, to be precise, its wavefunction), we’re going to add a zillion elementary wavefunctions, each of which will make its own contribution to the composite wave. That contribution is captured by some coefficient ai in front of every e^(iθi) function, so we’ll have a zillion ai·e^(iθi) functions, really. [Yep. Bit confusing: I use i here as subscript, as well as imaginary unit.] In case you wonder how that works out with Schrödinger’s equation, the answer is – once again – very simple: both the time derivative (which is just a first-order derivative) and the Laplacian are linear operators, so Schrödinger’s equation, for a composite wave, can just be re-written as the sum of a zillion ‘elementary’ wave equations.

So… Well… We’re all set now to effectively use Schrödinger’s equation to calculate the orbitals for a hydrogen atom, which is what we’ll do in our next post.

In the meanwhile, you can amuse yourself with reading a nice Wikibook article on the Laplacian, which gives you a nice feel for what Schrödinger’s equation actually represents—even if I gave you a good feel for that too on my Essentials page. Whatever. You choose. Just let me know what you liked best. 🙂

Oh… One more point: the vg = dω/dk = d[p²/2m]/dp = p/m = v calculation obviously assumes we can treat m as a constant. In fact, what we’re actually doing is a rather complicated substitution of variables: you should write it all out, but that’s not the point here. The point is that we’re actually doing a non-relativistic calculation. Now, that does not mean that the wavefunction isn’t consistent with special relativity. It is. In fact, in one of my posts, I show how we can explain relativistic length contraction using the wavefunction. But it does mean that our calculation of the group velocity is not relativistically correct. But that’s a minor point: I’ll leave it for you as an exercise to calculate the relativistically correct formula for the group velocity. Have fun with it! 🙂
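If you want something to check your answer against without me giving away the derivation, here is a small numerical sketch. It assumes the standard relativistic energy-momentum relation E² = (p·c)² + (m₀·c²)², natural units (c = m₀ = 1), and function names of my own choosing.

```python
import math

c = 1.0    # speed of light in natural units (assumption)
m0 = 1.0   # rest mass, arbitrary value

def momentum(v):
    """p = gamma*m0*v, with gamma the Lorentz factor."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c)**2)
    return gamma * m0 * v

def energy(p):
    """Relativistic energy-momentum relation E = sqrt((pc)^2 + (m0 c^2)^2)."""
    return math.sqrt((p * c)**2 + (m0 * c**2)**2)

def group_velocity(v, dp=1e-6):
    """dE/dp (= d(omega)/dk) by a central finite difference."""
    p = momentum(v)
    return (energy(p + dp) - energy(p - dp)) / (2 * dp)

def phase_velocity(v):
    """E/p (= omega/k)."""
    p = momentum(v)
    return energy(p) / p

for v in (0.1, 0.6, 0.9):
    print(f"v={v:.1f}  v_g={group_velocity(v):.6f}  v_p={phase_velocity(v):.6f}")
```

The numbers suggest that the group velocity still comes out as v, while the phase velocity is the c²/v we found above.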

Note: Notations are often quite confusing. One should, generally speaking, denote a frequency by ν (nu), rather than by f, so as to not cause confusion with any function f, but then… Well… You create a new problem when you do that, because that Greek letter nu (ν) looks damn similar to the v of velocity, so that’s why I’ll often use f when I should be using nu (ν). As for the units, a frequency is expressed in cycles per second, while the angular frequency ω is expressed in radians per second. One cycle covers 2π radians and, therefore, we can write: ν = ω/2π. Hence, h∙ν = h∙ω/2π = ħ∙ω. Both ν as well as ω measure the time-rate of change of the phase of the wave function, as opposed to k, i.e. the spatial frequency of the wave function, which depends on the speed of the wave. Physicists also often use the symbol v for the speed of a wave, which is also hugely confusing, because it’s also used to denote the classical velocity of the particle. And then there’s two wave velocities, of course: the group versus the phase velocity. In any case… I find the use of that other symbol (c) for the wave velocity even more confusing, because this symbol is also used for the speed of light, and the speed of a wave is not necessarily (read: usually not) equal to the speed of light. In fact, both the group as well as the phase velocity of a particle wave are very different from the speed of light. The speed of a wave and the speed of light only coincide for electromagnetic waves and, even then, it should be noted that photons also have amplitudes to travel faster or slower than the speed of light.

Quantum-mechanical operators

We climbed a mountain—step by step, post by post. 🙂 We have reached the top now, and the view is gorgeous. We understand Schrödinger’s equation, which describes how amplitudes propagate through space-time. It’s the quintessential quantum-mechanical expression. Let’s enjoy now, and deepen our understanding by introducing the concept of (quantum-mechanical) operators.

The operator concept

We’ll introduce the operator concept using Schrödinger’s equation itself and, in the process, deepen our understanding of Schrödinger’s equation a bit. You’ll remember we wrote it as:

i·ħ·∂ψ/∂t = −(ħ²/2m)·∇²ψ + V·ψ

However, you’ve probably seen it like it’s written on his bust, or on his grave, or wherever, which is as follows:

i·ħ·ψ̇ = H·ψ

It’s the same thing, of course. The ‘over-dot’ is Newton’s notation for the time derivative. In fact, if you zoom in on a picture of the stone grave marker, then you’ll see that the craftsman who made it, mistakenly, also carved a dot above the psi (ψ) on the right-hand side of the equation – but then someone pointed out his mistake and so the dot on the right-hand side isn’t painted. 🙂 The thing I want to talk about here, however, is the H in that expression above, which is, obviously, the following operator:

H = −(ħ²/2m)·∇² + V(x, y, z)
That’s a pretty monstrous operator, isn’t it? It is what it is, however: an algebraic operator (it operates on a number—albeit a complex number—unlike a matrix operator, which operates on a vector or another matrix). As you can see, it actually consists of two other (algebraic) operators:

  1. The ∇² operator, which you know: it’s a differential operator. To be specific, it’s the Laplace operator, which is the divergence (∇·) of the gradient (∇) of a function: ∇² = ∇·∇ = (∂/∂x, ∂/∂y, ∂/∂z)·(∂/∂x, ∂/∂y, ∂/∂z) = ∂²/∂x² + ∂²/∂y² + ∂²/∂z². This too operates on our complex-valued wavefunction ψ, and yields some other complex-valued function, which we then multiply by −ħ²/2m to get the first term.
  2. The V(x, y, z) ‘operator’, which – in this particular context – just means: “multiply with V”. Needless to say, V is the potential here, and so it captures the presence of external force fields. Also note that V is a real number, just like −ħ²/2m.

Let me say something about the dimensions here. On the left-hand side of Schrödinger’s equation, we have the product of ħ and a time derivative (i is just the imaginary unit, so that’s just a (complex) number). Hence, the dimension there is [J·s]/[s] (the dimension of a time derivative is something expressed per second). So the dimension of the left-hand side is joule. On the right-hand side, we’ve got two terms. The dimension of that second-order derivative (∇²ψ) is something expressed per square meter, but then we multiply it with −ħ²/2m, whose dimension is [J²·s²]/[J/(m²/s²)]. [Remember: m = E/c².] So that reduces to [J·m²]. Hence, the dimension of (−ħ²/2m)∇²ψ is joule. And the dimension of V is joule too, of course. So it all works out. In fact, now that we’re here, it may or may not be useful to remind you of that heat diffusion equation we discussed when introducing the basic concepts involved in vector analysis:
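The bookkeeping in that dimensional analysis can be sketched in a few lines of code. The representation below (a dimension as a tuple of exponents of kg, m and s) and all of the names are just assumptions for the example. Like the text, it leaves the dimension of ψ itself out, because ψ appears in every term.

```python
# Dimensions as exponents of the base units (kg, m, s).
def mul(a, b):
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    return tuple(x - y for x, y in zip(a, b))

KG = (1, 0, 0)
JOULE = (1, 2, -2)              # J = kg·m²/s²
SECOND = (0, 0, 1)
PER_SECOND = (0, 0, -1)         # a time derivative: "per second"
PER_SQUARE_METER = (0, -2, 0)   # the Laplacian: "per square meter"

HBAR = mul(JOULE, SECOND)       # J·s

lhs = mul(HBAR, PER_SECOND)                 # ħ · ∂/∂t
hbar2_over_m = div(mul(HBAR, HBAR), KG)     # ħ²/m, should be J·m²
kinetic = mul(hbar2_over_m, PER_SQUARE_METER)
potential = JOULE                           # V is an energy

print("left-hand side:", lhs)
print("kinetic term:  ", kinetic)
print("potential term:", potential)
```

All three come out as (1, 2, −2), i.e. joule.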

∂q/∂t = κ·∇²T

That equation illustrated the physical significance of the Laplacian. We were talking about the flow of heat in, say, a block of metal, as illustrated below. The q in the equation above is the heat per unit volume, and the h in the illustration below was the heat flow vector (so it’s got nothing to do with Planck’s constant), which depended on the material, and which we wrote as h = –κ∇T, with T the temperature, and κ (kappa) the thermal conductivity. In any case, the point is the following: the equation above illustrates the physical significance of the Laplacian. We let it operate on the temperature (i.e. a scalar function) and its product with some constant (just think of replacing κ by −ħ²/2m) gives us the time derivative of q, i.e. the heat per unit volume.

[Illustration: heat flow]

In fact, we know that q is proportional to T, so if we’d choose an appropriate temperature scale – i.e. choose the zero point such that q = k·T (your physics teacher in high school would refer to k as the (volume) specific heat capacity) – then we could simply write:

∂T/∂t = (κ/k)∇²T

From a mathematical point of view, that equation is just the same as ∂ψ/∂t = (i·ħ/2m)·∇²ψ, which is Schrödinger’s equation for V = 0. In other words, you can – and actually should – also think of Schrödinger’s equation as describing the flow of… Well… What?

Well… Not sure. I am tempted to think of something like a probability density in space, but ψ represents a (complex-valued) amplitude. Having said that, you get the idea—I hope! 🙂 If not, let me paraphrase Feynman on this:

“We can think of Schrödinger’s equation as describing the diffusion of a probability amplitude from one point to another. In fact, the equation looks something like the diffusion equation we introduced when discussing heat flow, or the spreading of a gas. But there is one main difference: the imaginary coefficient in front of the time derivative makes the behavior completely different from the ordinary diffusion such as you would have for a gas spreading out. Ordinary diffusion gives rise to real exponential solutions, whereas the solutions of Schrödinger’s equation are complex waves.”

That says it all, right? 🙂 In fact, Schrödinger’s equation – as discussed here – was actually being derived when describing the motion of an electron along a line of atoms, i.e. for motion in one direction only, but you can visualize what it represents in three-dimensional space. The real exponential functions Feynman refers to are exponential decay functions: as the energy is spread over an ever-increasing volume, the amplitude of the wave becomes smaller and smaller. That may be the case for complex-valued exponentials as well. The key difference between a real- and complex-valued exponential decay function is that a complex exponential is a cyclical function. Now, I quickly googled to see how we could visualize that, and I like the following illustration:


The dimensional analysis of Schrödinger’s equation is also quite interesting because… Well… Think of it: that heat diffusion equation incorporates the same dimensions: temperature is a measure of the average energy of the molecules. That’s really something to think about. These differential equations are not only structurally similar but, in addition, they all seem to describe some flow of energy. That’s pretty deep stuff: it relates amplitudes to energies, so we should think in terms of Poynting vectors and all that. But… Well… I need to move on, and so I will move on—so you can re-visit this later. 🙂
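Going back to Feynman’s remark on real versus complex exponentials for a second: a small sketch may help. Take one spatial mode e^(ikx) and ask what each equation does to its time-dependent factor. The values of k and D, and the natural units (ħ = m = 1), are arbitrary assumptions for the example.

```python
import cmath
import math

k = 2.0              # wave number of the spatial mode e^(ikx), arbitrary
D = 0.5              # real diffusion constant, arbitrary
hbar, m = 1.0, 1.0   # natural units (assumption)

def diffusion_factor(t):
    """d(phi)/dt = D*laplacian(phi): the mode decays as exp(-D*k^2*t)."""
    return math.exp(-D * k**2 * t)

def schrodinger_factor(t):
    """d(psi)/dt = i*(hbar/2m)*laplacian(psi): exp(-i*(hbar*k^2/2m)*t)."""
    return cmath.exp(-1j * (hbar * k**2 / (2 * m)) * t)

for t in (0.0, 0.5, 1.0, 2.0):
    d = diffusion_factor(t)
    s = schrodinger_factor(t)
    print(f"t={t:.1f}  diffusion: {d:.4f}   "
          f"Schrödinger: {s.real:+.4f}{s.imag:+.4f}i  (modulus {abs(s):.4f})")
```

The real factor just shrinks, while the complex one keeps its modulus and only rotates. That is the difference between ordinary diffusion and a wave.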

Now that we’ve introduced the concept of an operator, let me say something about notations, because that’s quite confusing.

Some remarks on notation

Because it’s an operator, we should actually use the hat symbol—in line with what we did when we were discussing matrix operators: we’d distinguish the matrix (e.g. A) from its use as an operator (Â). You may or may not remember we do the same in statistics: the hat symbol is supposed to distinguish the estimator (â) – i.e. some function we use to estimate a parameter (which we usually denoted by some Greek symbol, like α) – from a specific estimate of the parameter, i.e. the value (a) we get when applying â to a specific sample or observation. However, if you remember the difference, you’ll also remember that hat symbol was quickly forgotten, because the context made it clear what was what, and so we’d just write a(x) instead of â(x). So… Well… I’ll be sloppy as well here, if only because the WordPress editor only offers very few symbols with a hat! 🙂

In any case, this discussion on the use (or not) of that hat is irrelevant. In contrast, what is relevant is to realize this algebraic operator H here is very different from that other quantum-mechanical Hamiltonian operator we discussed when dealing with a finite set of base states: that H was the Hamiltonian matrix, but used in an ‘operation’ on some state. So we have the matrix operator H, and the algebraic operator H. Confusing?


Yes and no. First, we’ve got the context again, and so you always know whether you’re looking at continuous or discrete stuff:

  1. If your ‘space’ is continuous (i.e. if states are to be defined with reference to an infinite set of base states), then it’s the algebraic operator.
  2. If, on the other hand, your states are defined by some finite set of discrete base states, then it’s the Hamiltonian matrix.

There’s another, more fundamental, reason why there should be no confusion. In fact, it’s the reason why physicists use the same symbol H in the first place: despite the fact that they look so different, these two operators (i.e. H the algebraic operator and H the matrix operator) are actually equivalent. Their interpretation is similar, as evidenced from the fact that both are being referred to as the energy operator in quantum physics. The only difference is that one operates on a (state) vector, while the other operates on a continuous function. It’s just the difference between matrix mechanics as opposed to wave mechanics really.

But… Well… I am sure I’ve confused you by now—and probably very much so—and so let’s start from the start. 🙂

Matrix mechanics

Let’s start with the easy thing indeed: matrix mechanics. The matrix-mechanical approach is summarized in that set of Hamiltonian equations which, by now, you know so well:

i·ħ·dCi/dt = ∑ Hij·Cj    over all j

If we have n base states, then we have n equations like this: one for each i = 1, 2,… n. As for the introduction of the Hamiltonian, and the other subscript (j), just think of the description of a state:

|ψ〉 = ∑ |i〉Ci = ∑ |i〉〈i|ψ〉    over all i

So… Well… Because we had used i already, we had to introduce j. 🙂

Let’s think about |ψ〉. It is the state of a system, like the ground state of a hydrogen atom, or one of its many excited states. But… Well… It’s a bit of a weird term, really. It all depends on what you want to measure: when we’re thinking of the ground state, or an excited state, we’re thinking energy. That’s something else than thinking its position in space, for example. Always remember: a state is defined by a set of base states, and so those base states come with a certain perspective: when talking states, we’re only looking at some aspect of reality, really. Let’s continue with our example of energy states, however.

You know that the lifetime of a system in an excited state is usually short: some spontaneous or induced emission of a quantum of energy (i.e. a photon) will ensure that the system quickly returns to a less excited state, or to the ground state itself. However, you shouldn’t think of that here: we’re looking at stable systems here. To be clear: we’re looking at systems that have some definite energy—or so we think: it’s just because of the quantum-mechanical uncertainty that we’ll always measure some other different value. Does that make sense?

If it doesn’t… Well… Stop reading, because it’s only going to get even more confusing. Not my fault, however!


The ubiquity of that ψ symbol (i.e. the Greek letter psi) is really something psi-chological 🙂 and, hence, very confusing, really. In matrix mechanics, our ψ would just denote a state of a system, like the energy of an electron (or, when there’s only one electron, our hydrogen atom). If it’s an electron, then we’d describe it by its orbital. In this regard, I found the following illustration from Wikipedia particularly helpful: the green orbitals show excitations of copper (Cu) orbitals on a CuO2 plane. [The two big arrows just illustrate the principle of X-ray spectroscopy, so it’s an X-ray probing the structure of the material.]


So… Well… We’d write ψ as |ψ〉 just to remind ourselves we’re talking of some state of the system indeed. However, quantum physicists always want to confuse you, and so they will also use the psi symbol to denote something else: they’ll use it to denote a very particular Ci amplitude (or coefficient) in that |ψ〉 = ∑ |i〉Ci formula above. To be specific, they’d replace the base states |i〉 by the continuous position variable x, and they would write the following:

Ci = ψ(i = x) = ψ(x) = Cψ(x) = C(x) = 〈x|ψ〉

In fact, that’s just like writing:

φ(p) = 〈 mom p | ψ 〉 = 〈p|ψ〉 = Cφ(p) = C(p)

What they’re doing here, is (1) reduce the ‘system‘ to a ‘particle‘ once more (which is OK, as long as you know what you’re doing) and (2) they basically state the following:

If a particle is in some state |ψ〉, then we can associate some wavefunction ψ(x) or φ(p) with it, and that wavefunction will represent the amplitude for the system (i.e. our particle) to be at x, or to have a momentum that’s equal to p.

So what’s wrong with that? Well… Nothing. It’s just that… Well… Why don’t they use χ(x) instead of ψ(x)? That would avoid a lot of confusion, I feel: one should not use the same symbol (psi) for the |ψ〉 state and the ψ(x) wavefunction.

Huh? Yes. Think about it. The point is: the position or the momentum, or even the energy, are properties of the system, so to speak and, therefore, it’s really confusing to use the same symbol psi (ψ) to describe (1) the state of the system, in general, versus (2) the position wavefunction, which describes… Well… Some very particular aspect (or ‘state’, if you want) of the same system (in this case: its position). There’s no such problem with φ(p), so… Well… Why don’t they use χ(x) instead of ψ(x) indeed? I have only one answer: psi-chology. 🙂

In any case, there’s nothing we can do about it and… Well… In fact, that’s what this post is about: it’s about how to describe certain properties of the system. Of course, we’re talking quantum mechanics here and, hence, uncertainty, and, therefore, we’re going to talk about the average position, energy, momentum, etcetera that’s associated with a particular state of a system, or—as we’ll keep things very simple—the properties of a ‘particle’, really. Think of an electron in some orbital, indeed! 🙂

So let’s now look at that set of Hamiltonian equations once again:

i·ħ·dCi/dt = ∑ Hij·Cj    over all j

Looking at it carefully – so just look at it once again! 🙂 – and thinking about what we did when going from the discrete to the continuous setting, we can now understand we should write the following for the continuous case:

i·ħ·∂ψ(x)/∂t = ∫ H(x, x’)·ψ(x’)·dx’

Of course, combining Schrödinger’s equation with the expression above implies the following:

∫ H(x, x’)·ψ(x’)·dx’ = −(ħ²/2m)·∇²ψ(x) + V(x)·ψ(x)
Now how can we relate that integral to the expression on the right-hand side? I’ll have to disappoint you here, as it requires a lot of math to transform that integral. It requires writing H(x, x’) in terms of rather complicated functions, including – you guessed it, didn’t you? – Dirac’s delta function. Hence, I assume you’ll believe me if I say that the matrix- and wave-mechanical approaches are actually equivalent. In any case, if you’d want to check it, you can always read Feynman yourself. 🙂
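That’s no substitute for the math, of course, but a crude numerical sketch can make the equivalence plausible. If we chop space into small steps, the algebraic operator turns into an ordinary (tridiagonal) matrix, and the integral over x’ turns into a sum over a discrete index. Everything specific below is an assumption for the sake of the example: the natural units (ħ = m = 1), the grid, the harmonic potential V(x) = x²/2, and the Gaussian test function, which happens to be the ground state for that potential, with energy 1/2.

```python
import math

hbar, m = 1.0, 1.0                          # natural units (assumption)
dx = 0.05                                   # grid spacing
xs = [-5.0 + j * dx for j in range(201)]    # grid from -5 to +5
n = len(xs)

def V(x):
    return 0.5 * x**2                       # example potential (assumption)

# H[j][k] plays the role of H(x, x') on the grid: the second derivative
# becomes (psi[j+1] - 2*psi[j] + psi[j-1]) / dx^2.
c = hbar**2 / (2 * m * dx**2)
H = [[0.0] * n for _ in range(n)]
for j in range(n):
    H[j][j] = 2 * c + V(xs[j])
    if j > 0:
        H[j][j - 1] = -c
    if j < n - 1:
        H[j][j + 1] = -c

psi = [math.exp(-x**2 / 2) for x in xs]     # sampled test wavefunction

# Matrix times vector: the discrete version of the integral over x'.
H_psi = [sum(H[j][k] * psi[k] for k in range(n)) for j in range(n)]

center = n // 2                             # the grid point at x = 0
print("(H psi)/psi at x = 0:", H_psi[center] / psi[center])
print("(H psi)/psi at x = 1:", H_psi[center + 20] / psi[center + 20])
```

Both ratios should come out close to 0.5: the matrix, acting on the sampled function, does what the differential operator does to the continuous one.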

Now, I wrote this post to talk about quantum-mechanical operators, so let me do that now.

Quantum-mechanical operators

You know the concept of an operator. As mentioned above, we should put a little hat (^) on top of our Hamiltonian operator, so as to distinguish it from the matrix itself. However, as mentioned above, the difference is usually quite clear from the context. Our operators were all matrices so far, and we’d write the matrix elements of, say, some operator A, as:

Aij ≡ 〈 i | A | j 〉

The whole matrix itself, however, would usually not act on a base state but… Well… Just on some more general state ψ, to produce some new state φ, and so we’d write:

| φ 〉 = A | ψ 〉

Of course, we’d have to describe | φ 〉 in terms of the (same) set of base states and, therefore, we’d expand this expression into something like this:

〈 i | φ 〉 = ∑ 〈 i | A | j 〉〈 j | ψ 〉    over all j

You get the idea. I should just add one more thing. You know this important property of amplitudes: the 〈 ψ | φ 〉 amplitude is the complex conjugate of the 〈 φ | ψ 〉 amplitude. It’s got to do with time reversibility, because the complex conjugate of e^(iθ) = e^(i(ω·t−k·x)) is equal to e^(−iθ) = e^(−i(ω·t−k·x)), so we’re just reversing the x- and t-direction. We write:

 〈 ψ | φ 〉 = 〈 φ | ψ 〉*

Now what happens if we want to take the complex conjugate when we insert a matrix, so when writing 〈 φ | A | ψ 〉 instead of 〈 φ | ψ 〉? In that case, the rule becomes:

〈 φ | A | ψ 〉* = 〈 ψ | A† | φ 〉

The dagger symbol denotes the conjugate transpose, so A† is an operator whose matrix elements are equal to Aij† = Aji*. Now, it may or may not happen that the A† matrix is actually equal to the original A matrix. In that case – and only in that case – we can write:

〈 ψ | A | φ 〉 = 〈 φ | A | ψ 〉*

We then say that A is a ‘self-adjoint’ or ‘Hermitian’ operator. That’s just a definition of a property, which the operator may or may not have—but many quantum-mechanical operators are actually Hermitian. In any case, we’re well armed now to discuss some actual operators, and we’ll start with that energy operator.
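Those two rules are easy to check numerically. The sketch below uses plain lists of complex numbers, with two made-up 2×2 matrices (one Hermitian, one not) and two made-up state vectors. All of the values and helper names are assumptions for the example.

```python
def dagger(A):
    """Conjugate transpose: (A†)ij = (Aji)*."""
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def apply(A, v):
    """Matrix times vector."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def bracket(phi, A, psi):
    """<phi|A|psi> = sum over i of (phi_i)* times (A psi)_i."""
    return sum(p.conjugate() * a for p, a in zip(phi, apply(A, psi)))

A = [[1 + 0j, 2 - 1j], [0 + 3j, 4 + 0j]]   # not Hermitian
H = [[1 + 0j, 2 - 1j], [2 + 1j, 4 + 0j]]   # Hermitian: equal to its own dagger
phi = [1 + 1j, 0.5 - 2j]
psi = [2 - 1j, 0 + 1j]

# General rule: <phi|A|psi>* = <psi|A†|phi>
print(bracket(phi, A, psi).conjugate(), bracket(psi, dagger(A), phi))

# Hermitian case: <psi|H|phi> = <phi|H|psi>*
print(bracket(psi, H, phi), bracket(phi, H, psi).conjugate())

# For the non-Hermitian A, the second rule fails:
print(bracket(psi, A, phi), bracket(phi, A, psi).conjugate())
```

The first two lines each print a matching pair; the third one doesn’t, because A is not equal to its conjugate transpose.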

The energy operator (H)

We know the state of a system is described in terms of a set of base states. Now, our analysis of N-state systems showed we can always describe it in terms of a special set of base states, which are referred to as the states of definite energy because… Well… Because they’re associated with some definite energy. In that post, we referred to these energy levels as En (n = I, II,… N). We used boldface for the subscript n (so we wrote n instead of n) because of these Roman numerals. With each energy level, we could associate a base state, of definite energy indeed, that we wrote as |n〉. To make a long story short, we summarized our results as follows:

  1. The energies EI, EII,…, En,…, EN are the eigenvalues of the Hamiltonian matrix H.
  2. The state vectors |n〉 that are associated with each energy En, i.e. the set of vectors |n〉, are the corresponding eigenstates.

Because we’ll be working with some more subscripts in what follows, and because these Roman numerals and the boldface notation are somewhat confusing (if only because I don’t want you to think of these subscripts as vectors), we’ll just denote EI, EII,…, En,…, EN as E1, E2,…, Ei,…, EN, and we’ll number the states of definite energy accordingly, also using some Greek letter so as to clearly distinguish them from all our Latin letter symbols: we’ll write these states as: |η1〉, |η2〉,… |ηN〉. [If I say, ‘we’, I mean Feynman of course. You may wonder why he doesn’t write |Ei〉, or |εi〉. The answer is: writing |Ei〉 would cause confusion, because this state will appear in expressions like: |Ei〉Ei, so that’s the ‘product’ of a state (|Ei〉) and the associated scalar (Ei). Too confusing. As for using η (eta) instead of ε (epsilon) to denote something that’s got to do with energy… Well… I guess he wanted to keep the resemblance with the n, and then the Ancient Greeks apparently did use this η letter for a sound like ‘e‘ so… Well… Why not? Let’s get back to the lesson.]

Using these base states of definite energy, we can write the state of the system as:

|ψ〉 = ∑ |ηi〉 Ci = ∑ |ηi〉〈ηi|ψ〉    over all i (i = 1, 2,… , N)

Now, we didn’t talk all that much about what these base states actually mean in terms of measuring something but you’ll believe me if I say that, when measuring the energy of the system, we’ll always measure one or the other E1, E2,…, Ei,…, EN value. We’ll never measure something in-between: it’s either-or. Now, as you know, measuring something in quantum physics is supposed to be destructive but… Well… Let us imagine we could make a thousand measurements to try to determine the average energy of the system. We’d do so by counting the number of times we measure E1 (and of course we’d denote that number as N1), E2, E3, etcetera. You’ll agree that we’d measure the average energy as:

Eav = (N1·E1 + N2·E2 + N3·E3 + …)/(N1 + N2 + N3 + …)

However, measurement is destructive, and we actually know what the expected value of this ‘average’ energy will be, because we know the probabilities of finding the system in a particular base state. That probability is equal to the absolute square of that Ci coefficient above, so we can use the Pi = |Ci|² formula to write:

〈Eav〉 = ∑ Pi Ei over all i (i = 1, 2,… , N)

Note that this is a rather general formula. It’s got nothing to do with quantum mechanics: if Ai represents the possible values of some quantity A, and Pi is the probability of getting that value, then (the expected value of) the average A will also be equal to 〈Aav〉 = ∑ Pi Ai. No rocket science here! 🙂 But let’s now apply our quantum-mechanical formulas to that 〈Eav〉 = ∑ Pi Ei formula. [Oh—and I apologize for using the same angle brackets 〈 and 〉 to denote an expected value here—sorry for that! But it’s what Feynman does—and other physicists! You see: they don’t really want you to understand stuff, and so they often use very confusing symbols.] Remembering that the absolute square of a complex number equals the product of that number and its complex conjugate, we can re-write the 〈Eav〉 = ∑ Pi Ei formula as:

〈Eav〉 = ∑ Pi Ei = ∑ |Ci|² Ei = ∑ Ci*·Ci·Ei = ∑ 〈ψ|ηi〉〈ηi|ψ〉Ei = ∑ 〈ψ|ηi〉Ei〈ηi|ψ〉 over all i

Now, you know that Dirac’s bra-ket notation allows numerous manipulations. For example, what we could do is take out that ‘common factor’ 〈ψ|, and so we may re-write that monster above as:

〈Eav〉 = 〈ψ| ∑ |ηi〉Ei〈ηi|ψ〉 = 〈ψ|φ〉, with |φ〉 = ∑ |ηi〉Ei〈ηi|ψ〉 over all i

Huh? Yes. Note the difference between |ψ〉 = ∑ |ηi〉 Ci = ∑ |ηi〉〈ηi|ψ〉 and |φ〉 = ∑ |ηi〉Ei〈ηi|ψ〉. As Feynman puts it: φ is just some ‘cooked-up‘ state which you get by taking each of the base states |ηi〉 in the amount Ei〈ηi|ψ〉 (as opposed to the 〈ηi|ψ〉 amounts we took for ψ).

I know: you’re getting tired and you wonder why we need all this stuff. Just hang in there. We’re almost done. I just need to do a few more unpleasant things, one of which is to remind you that this business of the energy states being eigenstates (and the energy levels being eigenvalues) of our Hamiltonian matrix (see my post on N-state systems) comes with a number of interesting properties, including this one:

H |ηi〉 = Ei |ηi〉 = |ηi〉 Ei

Just think about what’s written here: on the left-hand side, we’re multiplying a (base) state vector with a matrix, and on the right-hand side we’re multiplying it with a scalar. So our |φ〉 = ∑ |ηi〉 Ei 〈ηi|ψ〉 sum now becomes:

|φ〉 = ∑ H |ηi〉〈ηi|ψ〉 over all (i = 1, 2,… , N)

Now we can manipulate that expression some more so as to get the following:

|φ〉 = H ∑|ηi〉〈ηi|ψ〉 = H|ψ〉

Finally, we can re-combine this now with the 〈Eav〉 = 〈ψ|φ〉 equation above, and so we get the fantastic result we wanted:

〈Eav〉 = 〈 ψ | φ 〉 = 〈 ψ | H | ψ 〉

Huh? Yes! To get the average energy, you operate on |ψ〉 with H, and then you multiply the result with 〈ψ|. It’s a beautiful formula. On top of that, the new formula for the average energy is not only pretty but also useful, because now we don’t need to say anything about any particular set of base states. We don’t even have to know all of the possible energy levels. When we have to calculate the average energy of some system, we only need to be able to describe the state of that system in terms of some set of base states, and we also need to know the Hamiltonian matrix for that set, of course. But if we know that, we can calculate its average energy.
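
To make the equivalence of the two routes concrete, here is a minimal numerical sketch in plain Python. The two-state Hamiltonian, the values of E0 and A, and the state ψ are all made up for illustration, and the helper names `braket` and `apply` are mine, not Feynman's. It computes the average energy once as ∑ Pi Ei (which needs the energy eigenstates) and once as 〈ψ|H|ψ〉 (which does not).

```python
import cmath
import math

# Hypothetical two-state system; E0 and A are illustrative values only.
E0, A = 1.0, 0.25
H = [[E0, -A],
     [-A, E0]]

# Energy eigenstates |eta_1>, |eta_2> of this particular H, with eigenvalues E.
s = 1 / math.sqrt(2)
eta = [[s, s], [s, -s]]
E = [E0 - A, E0 + A]

# Some normalized state |psi>, written in the original base (arbitrary choice).
psi = [0.6, 0.8 * cmath.exp(0.3j)]

def braket(a, b):
    """<a|b>: sum over components of conj(a_i) * b_i."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def apply(M, v):
    """Matrix M acting on column vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

# Route 1: sum of P_i * E_i, with P_i = |<eta_i|psi>|^2.
avg_from_probs = sum(abs(braket(eta[i], psi)) ** 2 * E[i] for i in range(2))

# Route 2: <psi|H|psi>, using only the Hamiltonian matrix and the state.
avg_from_operator = braket(psi, apply(H, psi)).real

print(avg_from_probs, avg_from_operator)
```

Both numbers should agree up to rounding, whatever normalized ψ you plug in.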

You’ll say that’s not a big deal because… Well… If you know the Hamiltonian, you know everything, so… Well… Yes. You’re right: it’s less of a big deal than it seems. Having said that, the whole development above is very interesting because of something else: we can easily generalize it for other physical measurements. I call it the ‘average value’ operator idea, but you won’t find that term in any textbook. 🙂 Let me explain the idea.

The average value operator (A)

The development above illustrates how we can relate a physical observable, like the (average) energy (E), to a quantum-mechanical operator (H). Now, the development above can easily be generalized to any observable that would be proportional to the energy. It’s perfectly reasonable, for example, to assume the angular momentum – as measured in some direction, of course, which we usually refer to as the z-direction – would be proportional to the energy, and so then it would be easy to define a new operator Lz, which we’d define as the operator of the z-component of the angular momentum L. [I know… That’s a bit of a long name but… Well… You get the idea.] So we can write:

〈Lz〉av = 〈 ψ | Lz | ψ 〉

In fact, further generalization yields the following grand result:

If a physical observable A is related to a suitable quantum-mechanical operator Â, then the average value of A for the state | ψ 〉 is given by:

〈Aav〉 = 〈 ψ | Â | ψ 〉 = 〈 ψ | φ 〉 with | φ 〉 = Â | ψ 〉

At this point, you may have second thoughts, and wonder: what state | ψ 〉? The answer is: it doesn’t matter. It can be any state, as long as we’re able to describe it in terms of a chosen set of base states. 🙂

OK. So far, so good. The next step is to look at how this works for the continuous case.

The energy operator for wavefunctions (H)

We can start thinking about the continuous equivalent of the 〈Eav〉 = 〈ψ|H|ψ〉 expression by first expanding it. We write:

〈Eav〉 = ∑ 〈ψ|i〉〈i|H|j〉〈j|ψ〉 over all i and j

You know the continuous equivalent of a sum like this is an integral, i.e. an infinite sum. Now, because we’ve got two subscripts here (i and j), we get the following double integral:

〈Eav〉 = ∫∫ 〈ψ|x〉〈x|H|x’〉〈x’|ψ〉 dx dx’

Now, I did take my time to walk you through Feynman’s derivation of the energy operator for the discrete case, i.e. the operator when we’re dealing with matrix mechanics, but I think I can simplify my life here by just copying Feynman’s succinct development:


Done! Given a wavefunction ψ(x), we get the average energy by doing that integral above. Now, the quantity in the braces of that integral can be written as that operator we introduced when we started this post:


So now we can write that integral much more elegantly. It becomes:

〈Eav〉 = ∫ ψ*(x) H ψ(x) dx

You’ll say that doesn’t look like 〈Eav〉 = 〈 ψ | H | ψ 〉! It does. Remember that 〈 ψ | = | ψ 〉*. 🙂 Done!

I should add one qualifier though: the formula above assumes our wavefunction has been normalized, so all probabilities add up to one. But that’s a minor thing. The only thing left to do now is to generalize to three dimensions. That’s easy enough. Our expression becomes a volume integral:

〈Eav〉 = ∫ ψ*(r) H ψ(r) dV

Of course, dV stands for dVolume here, not for any potential energy, and, of course, once again we assume all probabilities over the volume add up to 1, so all is normalized. Done! 🙂

We’re almost done with this post. What’s left is the position and momentum operator. You may think this is going to be another lengthy development but… Well… It turns out the analysis is remarkably simple. Just stay with me a few more minutes and you’ll have earned your degree. 🙂

The position operator (x)

The thing we need to solve here is really easy. Look at the illustration below as representing the probability density of some particle being at x. Think about it: what’s the average position?

[Illustration: probability density P(x) of finding the particle at x]

Well? What? The (expected value of the) average position is just this simple integral: 〈xav〉 = ∫ x·P(x) dx, over the whole range of possible values for x. 🙂 That’s all. Of course, because P(x) = |ψ(x)|2 = ψ*(x)·ψ(x), this integral now becomes:

〈xav〉 = ∫ ψ*(x) x ψ(x) dx

That looks exactly the same as 〈Eav〉 = ∫ ψ*(x) H ψ(x) dx, and so we can look at x as an operator too!

Huh? Yes. It’s an extremely simple operator: it just means “multiply by x“. 🙂
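
A quick numerical sketch of the “multiply by x” operator, in plain Python. The Gaussian wavefunction, its center x0 = 1.5 and its width σ = 0.7 are arbitrary choices of mine; the integral is approximated by a simple Riemann sum on a fine grid.

```python
import math

def psi(x, x0=1.5, sigma=0.7):
    """Normalized Gaussian wavefunction centered at x0 (illustrative choice).
    Its absolute square is a normal density with standard deviation sigma."""
    norm = (2 * math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-(x - x0) ** 2 / (4 * sigma ** 2))

# Grid from -10 to +10, wide enough to capture the whole packet.
N = 20000
dx = 20.0 / N
xs = [-10 + i * dx for i in range(N + 1)]

# Normalization check: the probabilities should add up to one.
total_prob = sum(abs(psi(x)) ** 2 for x in xs) * dx

# Average position: integral of psi*(x) . x . psi(x) dx.
x_av = sum(psi(x).conjugate() * x * psi(x) for x in xs) * dx

print(total_prob, x_av)
```

The first number should come out very close to 1 and the second very close to the chosen center x0, which is what you would expect for a symmetric probability density.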

I know you’re shaking your head now: is it that easy? It is. Moreover, the ‘matrix-mechanical equivalent’ is equally simple but, as it’s getting late here, I’ll refer you to Feynman for that. 🙂

The momentum operator (px)

Now we want to calculate the average momentum of, say, some electron. What integral would you use for that? […] Well… What? […] It’s easy: it’s the same thing as for x. We can just substitute p for x in that 〈xav〉 = ∫ x·P(x) dx formula, so we get:

〈pav〉 = ∫ p·P(p) dp, over the whole range of possible values for p

Now, you might think the rest is equally simple, and… Well… It actually is simple but there’s one additional thing in regard to the need to normalize stuff here. You’ll remember we defined a momentum wavefunction (see my post on the Uncertainty Principle), which we wrote as:

φ(p) = 〈 mom p | ψ 〉

Now, in the mentioned post, we related this momentum wavefunction to the particle’s ψ(x) = 〈x|ψ〉 wavefunction—which we should actually refer to as the position wavefunction, but everyone just calls it the particle’s wavefunction, which is a bit of a misnomer, as you can see now: a wavefunction describes some property of the system, and so we can associate several wavefunctions with the same system, really! In any case, we noted the following there:

  • The two probability density functions, φ(p) and ψ(x), look pretty much the same, but the half-width (or standard deviation) of one was inversely proportional to the half-width of the other. To be precise, we found that the constant of proportionality was equal to ħ/2, and wrote that relation as follows: σp = (ħ/2)/σx.
  • We also found that, when using a regular normal distribution function for ψ(x), we’d have to normalize the probability density function by inserting a (2πσx2)−1/2 in front of the exponential.

Now, it’s a bit of a complicated argument, but the upshot is that we cannot just write what we usually write, i.e. Pi = |Ci|2 or P(x) = |ψ(x)|2. No. We need to put a normalization factor in front, which combines the two factors I mentioned above. To be precise, we have to write:

P(p) = |〈p|ψ〉|2/(2πħ)

So… Well… Our 〈pav〉 = ∫ p·P(p) dp integral can now be written as:

〈pav〉 = ∫ 〈ψ|p〉 p 〈p|ψ〉 dp/(2πħ)

So that integral is totally like what we found for 〈xav〉 and so… We could just leave it at that, and say we’ve solved the problem. In that sense, it is easy. However, having said that, it’s obvious we’d want some solution that’s written in terms of ψ(x), rather than in terms of φ(p), and that requires some more manipulation. I’ll refer you, once more, to Feynman for that, and I’ll just give you the result:

〈pav〉 = ∫ ψ*(x) (ħ/i)·(∂/∂x) ψ(x) dx

So… Well… It turns out that the momentum operator – which I tentatively denoted as px above – is not so simple as our position operator (x). Still… It’s not hugely complicated either, as we can write it as:

px ≡ (ħ/i)·(∂/∂x)
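
Here is a small sketch that applies the (ħ/i)·(∂/∂x) operator numerically, in plain Python. Everything specific in it is an assumption for illustration: natural units with ħ = 1, a Gaussian envelope of width 1 and a carrier wavenumber k0 = 2. The derivative is approximated with a central difference, so the result is only accurate up to a small discretization error.

```python
import cmath
import math

HBAR = 1.0    # natural units (assumption for this sketch)
K0 = 2.0      # carrier wavenumber of the packet (arbitrary)
SIGMA = 1.0   # width of the Gaussian envelope (arbitrary)

def psi(x):
    """Normalized Gaussian envelope times the plane wave e^(i*k0*x)."""
    norm = (2 * math.pi * SIGMA ** 2) ** -0.25
    return norm * math.exp(-x ** 2 / (4 * SIGMA ** 2)) * cmath.exp(1j * K0 * x)

N = 20000
dx = 20.0 / N
xs = [-10 + i * dx for i in range(N + 1)]

def dpsi_dx(x):
    """Central-difference approximation of the derivative of psi at x."""
    return (psi(x + dx) - psi(x - dx)) / (2 * dx)

# Average momentum: integral of psi*(x) . (hbar/i) . dpsi/dx dx.
p_av = sum(psi(x).conjugate() * (HBAR / 1j) * dpsi_dx(x) for x in xs) * dx

print(p_av)   # real part close to HBAR * K0, imaginary part close to zero
```

The real part lands on ħ·k0, in line with the de Broglie relation p = ħk, and the imaginary part vanishes because the envelope is symmetric.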

Of course, the purists amongst you will, once again, say that I should be more careful and put a hat wherever I’d need to put one so… Well… You’re right. I’ll wrap this all up by copying Feynman’s overview of the operators we just explained, and so he does use the fancy symbols. 🙂


Well, folks—that’s it! Off we go! You know all about quantum physics now! We just need to work ourselves through the exercises that come with Feynman’s Lectures, and then you’re ready to go and bag a degree in physics somewhere. So… Yes… That’s what I want to do now, so I’ll be silent for quite a while now. Have fun! 🙂

Dirac’s delta function and Schrödinger’s equation in three dimensions

Feynman’s rather informal derivation of Schrödinger’s equation – following Schrödinger’s own logic when he published his famous paper on it back in 1926 – is wonderfully simple but, as I mentioned in my post on it, does lack some mathematical rigor here and there. Hence, Feynman hastens to dot all of the i‘s and cross all of the t‘s in the subsequent Lectures. We’ll look at two things here:

  1. Dirac’s delta function, which ensures proper ‘normalization’. In fact, as you’ll see in a moment, it’s more about ‘orthogonalization’ than normalization. 🙂
  2. The generalization of Schrödinger’s equation to three dimensions (in space) and also including the presence of external force fields (as opposed to the usual ‘free space’ assumption).

The second topic is the most interesting, of course, and also the easiest, really. However, let’s first use our energy to grind through the first topic. 🙂

Dirac’s delta function

When working with a finite set of discrete states, a fundamental condition is that the base states be ‘orthogonal’, i.e. they must satisfy the following equation:

〈 i | j 〉 = δij, with δij = 1 if i = j and δij = 0 if i ≠ j

Needless to say, the base states i and j are rather special vectors in a rather special mathematical space (a so-called Hilbert space) and so it’s rather tricky to interpret their ‘orthogonality’ in any geometric way, although such geometric interpretation is often actually possible in simple quantum-mechanical systems: you’ll just notice a ‘right’ angle may actually be a 45° or a 180° angle, or whatever. 🙂 In any case, that’s not the point here. The question is: if we move to an infinite number of base states – like we did when we introduced the ψ(x) and φ(p) wavefunctions – what happens to that condition?

Your first reaction is going to be: nothing. Because… Well… Remember that, for a two-state system, in which we have two base states only, we’d fully describe some state | φ 〉 as a linear combination of the base states, so we’d write:

| φ 〉 =| I 〉 CI + | II 〉 CII 

Now, while saying we were talking a Hilbert space here, I did add we could use the same expression to define the base states themselves, so I wrote the following triviality:

| I 〉 = | I 〉·1 + | II 〉·0 and | II 〉 = | I 〉·0 + | II 〉·1

Trivial but sensible. So we’d associate the base state | I 〉 with the base vector (1, 0) and, likewise, base state | II 〉 with the base vector (0, 1). When explaining this, I added that we could easily extend to an N-state system and so there’s a perfect analogy between the 〈 i | j 〉 bra-ket expression in quantum math and the ei·ej product in the run-of-the-mill coordinate spaces that you’re used to. So why can’t we just extend the concept to an infinite-state system and move to base vectors with an infinite number of elements, which we could write as ei = (…, 0, ei = 1, 0, 0,…) and ej = (…, 0, 0, ej = 1, 0,…), thereby ensuring 〈 i | j 〉 = ei·ej = δij always? The ‘orthogonality’ condition looks simple enough indeed, and so we could re-write it as:

〈 x | x’ 〉 = δxx’, with δxx’ = 1 if x = x’ and δxx’ = 0 if x ≠ x’

However, when moving from a space with a finite number of dimensions to a space with an infinite number of dimensions, there are some issues. They pop up, for example, when we insert that 〈 x | x’ 〉 = δxx’ function (note that we’re talking some function here of x and x’, indeed, so we’ll write it as f(x, x’) in the next step) in that 〈φ|ψ〉 = ∫〈φ|x〉〈x|ψ〉dx integral.

Huh? What integral? Relax: that 〈φ|ψ〉 = ∫〈φ|x〉〈x|ψ〉dx integral just generalizes our 〈φ|ψ〉 = ∑〈φ|x〉〈x|ψ〉 expression for discrete settings for the continuous case. Just look at it. When substituting x’ for φ, we get:

〈x’|ψ〉 = ψ(x’) = ∫ 〈x’|x〉 〈x|ψ〉 dx ⇔ ψ(x’) = ∫ 〈x’|x〉 ψ(x) dx

You’ll say: what’s the problem? Well… From a mathematical point of view, it’s a bit difficult to find a function 〈x’|x〉 = f(x, x’) which, when multiplied with a wavefunction ψ(x), and integrated over all x, will just give us ψ(x’). A bit difficult? Well… It’s worse than that: it’s actually impossible!

Huh? Yes. Feynman illustrates the difficulty for x’ = 0, but he could have picked whatever value, really. In any case, if x’ = 0, we can write f(x, 0) = f(x), and our integral now reduces to:

ψ(0) = ∫ f(x) ψ(x) dx

This is a weird expression: the value of the integral (i.e. the right-hand side of the expression) does not depend on x: it is just some non-zero value ψ(0). However, we know that the f(x) in the integrand is zero for all x ≠ 0. Hence, this integral will be zero. So we have an impossible situation: we wish a function to be zero everywhere but for one point, and, at the same time, we also want it to give us a finite integral when using it in that integral above.

You’re likely to shake your head now and say: what the hell? Does it matter? It does: it is an actual problem in quantum math. Well… I should say: it was an actual problem in quantum math. Dirac solved it. He invented a new function which looks a bit less simple than our suggested generalization of Kronecker’s delta for the continuous case (i.e. that 〈 x | x’ 〉 = δxx’ conjecture above). Dirac’s function is – quite logically – referred to as the Dirac delta function, and it’s actually defined by that integral above, in the sense that we impose the following two conditions on it:

  • δ(x − x’) = 0 if x ≠ x’ (so that’s just like the first of our two conditions for that 〈 x | x’ 〉 = δxx’ function)
  • ∫ δ(x − x’) ψ(x) dx = ψ(x’) (so that’s not like the second of our two conditions for that 〈 x | x’ 〉 = δxx’ function)

Indeed, that second condition is much more sophisticated than our 〈 x | x’ 〉 = 1 if x = x’ condition. In fact, one can show that the second condition amounts to finding some function satisfying this condition:

∫ δ(x) dx = 1

We get this by equating x’ to zero once more and, additionally, by equating ψ(x) to 1. [Please do double-check yourself.] Of course, this ‘normalization’ (or ‘orthogonalization’) problem all sounds like a lot of hocus-pocus and, in many ways, it is. In fact, we’re actually talking a mathematical problem here which had been lying around for centuries (for a brief overview, see the Wikipedia article on it). So… Well… Without further ado, I’ll just give you the mathematical expression now—and please don’t stop reading now, as I’ll explain it in a moment:


I will also credit Wikipedia with the following animation, which shows that the expression above is just the normal distribution function, and which shows what happens when that a, i.e. its standard deviation, goes to zero: Dirac’s delta function is just the limit of a sequence of (zero-centered) normal distributions. That’s all. Nothing more, nothing less.
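
The limiting behaviour is easy to check numerically. The sketch below, in plain Python, uses the textbook normal density with standard deviation a as the approximating function (an assumption on my part: the expression in the post may parametrize the width slightly differently), and an arbitrary smooth test function ψ with ψ(0) = 1. The helper names `delta_a` and `smeared` are hypothetical.

```python
import math

def delta_a(x, a):
    """Zero-centered normal density with standard deviation a."""
    return math.exp(-x ** 2 / (2 * a ** 2)) / (a * math.sqrt(2 * math.pi))

def psi(x):
    """Arbitrary smooth test function, chosen so that psi(0) = 1."""
    return math.cos(x) + 0.5 * x

def smeared(f, a, n=40001, half_width=10.0):
    """Riemann-sum estimate of the integral of delta_a(x) * f(x) dx,
    on a grid spanning +/- half_width standard deviations around zero."""
    L = half_width * a
    dx = 2 * L / (n - 1)
    xs = (-L + i * dx for i in range(n))
    return sum(delta_a(x, a) * f(x) for x in xs) * dx

for a in (1.0, 0.1, 0.01):
    area = smeared(lambda x: 1.0, a)   # stays at 1 for every a
    value = smeared(psi, a)            # tends to psi(0) as a shrinks
    print(a, area, value)
```

The area under the curve stays equal to one for every a, while the integral against ψ creeps toward ψ(0) as a gets smaller: those are exactly the two properties we want from δ(x).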


But how do we interpret it? Well… I can’t do better than Feynman as he describes what’s going on really:

“Dirac’s δ(x) function has the property that it is zero everywhere except at x = 0 but, at the same time, it has a finite integral equal to unity. [See the ∫ δ(x) dx = 1 equation.] One should imagine that the δ(x) function has such fantastic infinity at one point that the total area comes out equal to one.”

Well… That says it all, I guess. 🙂 Don’t you love the way he puts it? It’s not an ‘ordinary’ infinity. No. It’s fantastic. Frankly, I think these guys were all fantastic. 🙂 The point is: that special function, Dirac’s delta function, solves our problem. The equivalent expression for the 〈 i | j 〉 = δij condition for a finite and discrete set of base states is the following one for the continuous case:

〈 x | x’ 〉 = δ(x − x’)

The only thing left now is to generalize this result to three dimensions. Now that’s fairly straightforward. The ‘normalization’ condition above is all that’s needed in terms of modifying the equations for dealing with the continuum of base states corresponding to the points along a line. Extending the analysis to three dimensions goes as follows:

  • First, we replace the x coordinate by the vector r = (x, y, z)
  • As a result, integrals over x become integrals over x, y and z. In other words, they become volume integrals.
  • Finally, the one-dimensional δ-function must be replaced by the product of three δ-functions: one in x, one in y and one in z. We write:

〈 r | r’ 〉 = δ(x − x’) δ(y − y’) δ(z − z’)

Feynman summarizes it all together as follows:


What if we have two particles, or more? Well… Once again, I won’t bother to try to re-phrase the Grand Master as he explains it. I’ll just italicize or boldface the key points:

Suppose there are two particles, which we can call particle 1 and particle 2. What shall we use for the base states? One perfectly good set can be described by saying that particle 1 is at x1 and particle 2 is at x2, which we can write as | x1 x2 〉. Notice that describing the position of only one particle does not define a base state. Each base state must define the condition of the entire system, so you must not think that each particle moves independently as a wave in three dimensions. Any physical state | ψ 〉 can be defined by giving all of the amplitudes 〈 x1 x2 | ψ 〉 to find the two particles at x1 and x2. This generalized amplitude is therefore a function of the two sets of coordinates x1 and x2. You see that such a function is not a wave in the sense of an oscillation that moves along in three dimensions. Neither is it generally simply a product of two individual waves, one for each particle. It is, in general, some kind of a wave in the six dimensions defined by x1 and x2. Hence, if there are two particles in Nature which are interacting, there is no way of describing what happens to one of the particles by trying to write down a wave function for it alone. The famous paradoxes that we considered in earlier chapters – where the measurements made on one particle were claimed to be able to tell what was going to happen to another particle, or were able to destroy an interference – have caused people all sorts of trouble because they have tried to think of the wave function of one particle alone, rather than the correct wave function in the coordinates of both particles. The complete description can be given correctly only in terms of functions of the coordinates of both particles.

Now we really know it all, don’t we? 🙂

Well… Almost. I promised to tackle another topic as well. So here it is:

Schrödinger’s equation in three dimensions

Let me start by jotting down what we had found already, i.e. Schrödinger’s equation when only one coordinate in space is involved. It’s written as:

iħ·∂ψ/∂t = −(ħ2/2m)·∂2ψ/∂x2

Now, the extension to three dimensions is remarkably simple: we just substitute the ∂2/∂x2 operator by the ∇2 operator, i.e. ∇2 = ∂2/∂x2 + ∂2/∂y2 + ∂2/∂z2. We get:

iħ·∂ψ/∂t = −(ħ2/2m)·∇2ψ

Finally, we can also put forces on the particle, so now we are not looking at a particle moving in free space: we’ve got some force field working on it. It turns out the required modification is equally simple. The grand result is Schrödinger’s original equation in three dimensions:

iħ·∂ψ/∂t = −(ħ2/2m)·∇2ψ + V·ψ

V = V(x, y, z) is, of course, just the potential here. Remarkably simple equations but… How do we get these? Well… Sorry. The math is not too difficult, but you’re well equipped now to look at Feynman’s Lecture on it yourself now. You really are. Trust me. I really dealt with all of the ‘serious’ stuff you need to understand how he’s going about it in my previous posts so, yes, now I’ll just sit back and relax. Or go biking. Or whatever. 🙂

Schrödinger’s equation: the original approach

Of course, your first question when seeing the title of this post is: what’s original, really? Well… The answer is simple: it’s the historical approach, and it’s original because it’s actually quite intuitive. Indeed, Lecture no. 16 in Feynman’s third Volume of Lectures on Physics is like a trip down memory lane as Feynman himself acknowledges, after presenting Schrödinger’s equation using that very rudimentary model we developed in our previous post:

“We do not intend to have you think we have derived the Schrödinger equation but only wish to show you one way of thinking about it. When Schrödinger first wrote it down, he gave a kind of derivation based on some heuristic arguments and some brilliant intuitive guesses. Some of the arguments he used were even false, but that does not matter; the only important thing is that the ultimate equation gives a correct description of nature.”

So… Well… Let’s have a look at it. 🙂 We were looking at some electron we described in terms of its location at one or the other atom in a linear array (think of it as a line). We did so by defining base states |n〉 = |xn〉, noting that the state of the electron at any point in time could then be written as:

|φ〉 = ∑ |xn〉 Cn(t) = ∑ |xn〉〈xn|φ〉 over all n

The Cn(t) = 〈xn|φ〉 coefficient is the amplitude for the electron to be at xn at t. Hence, the Cn(t) amplitudes vary with t as well as with x. We’ll re-write them as Cn(t) = C(xn, t) = C(xn). Note that the latter notation does not explicitly show the time dependence. The Hamiltonian equation we derived in our previous post is now written as:

iħ·(∂C(xn)/∂t) = E0C(xn) − AC(xn+b) − AC(xn−b)

Note that, as part of our move from the Cn(t) to the C(xn) notation, we write the time derivative dCn(t)/dt now as ∂C(xn)/∂t, so we use the partial derivative symbol now (∂). Of course, the other partial derivative will be ∂C(x)/∂x as we move from the count variable xn to the continuous variable x, but let’s not get ahead of ourselves here. The solution we found for our C(xn) functions was the following wavefunction:

C(xn) = a·e^(i(k∙xn−ω·t)) = a·e^(−i∙ω·t)·e^(i∙k∙xn) = a·e^(−i·(E/ħ)·t)·e^(i·k∙xn)

We also found the following relationship between E and k:

E = E0 − 2A·cos(kb)

Now, even Feynman struggles a bit with the definition of E0 and k here, and their relationship with E, which is graphed below.


Indeed, he first writes, as he starts developing the model, that E0 is, physically, the energy the electron would have if it couldn’t leak away from one of the atoms, but then he also adds: “It represents really nothing but our choice of the zero of energy.”

This is all quite enigmatic because we cannot just do whatever we want when discussing the energy of a particle. As I pointed out in one of my previous posts, when discussing the energy of a particle in the context of the wavefunction, we generally consider it to be the sum of three different energy concepts:

  1. The particle’s rest energy m0c2, which de Broglie referred to as internal energy (Eint), and which includes the rest mass of the ‘internal pieces’, as Feynman puts it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy).
  2. Any potential energy it may have because of some field (i.e. if it is not traveling in free space), which we usually denote by U. This field can be anything—gravitational, electromagnetic: it’s whatever changes the energy of the particle because of its position in space.
  3. The particle’s kinetic energy, which we write in terms of its momentum p: m·v2/2 = m2·v2/(2m) = (m·v)2/(2m) = p2/(2m).

It’s obvious that we cannot just “choose” the zero point here: the particle’s rest energy is its rest energy, and its velocity is its velocity. So it’s not quite clear what the E0 in our model really is. As far as I am concerned, it represents the average energy of the system really, so it’s just like the E0 for our ammonia molecule, or the E0 for whatever two-state system we’ve seen so far. In fact, when Feynman writes that we can “choose our zero of energy so that E0 − 2A = 0” (so the minimum of that curve above is at the zero of energy), he actually makes some assumption in regard to the relative magnitude of the various amplitudes involved.

We should probably think about it in this way: −(i/ħ)·E0 is the amplitude for the electron to just stay where it is, while i·A/ħ is the amplitude to go somewhere else – and note we’ve got two possibilities here: the electron can go to |xn+1〉, or, alternatively, it can go to |xn−1〉. Now, amplitudes can be associated with probabilities by taking the absolute square, so I’d re-write the E0 − 2A = 0 assumption as:

E0 = 2A ⇔ |−(i/ħ)·E0|2 = |(i/ħ)·2A|2

Hence, in my humble opinion, Feynman’s assumption that E0 − 2A = 0 has nothing to do with ‘choosing the zero of energy’. It’s more like a symmetry assumption: we’re basically saying it’s as likely for the electron to stay where it is as it is to move to the next position. It’s an idea I need to develop somewhat further, as Feynman seems to just gloss over these little things. For example, I am sure it is not a coincidence that the EI, EII, EIII and EIV energy levels we found when discussing the hyperfine splitting of the hydrogen ground state also add up to 0. In fact, you’ll remember we could actually measure those energy levels (EI = EII = EIII = A ≈ 9.23×10−6 eV, and EIV = −3A ≈ −27.7×10−6 eV), so saying that we can “choose” some zero energy point is plain nonsense. The question just doesn’t arise. In any case, as I have to continue the development here, I’ll leave this point for further analysis in the future. So… Well… Just note this E0 − 2A = 0 assumption, as we’ll need it in a moment.

The second assumption we’ll need concerns the variation in k. As you know, we can only get a wave packet if we allow for uncertainty in k which, in turn, translates into uncertainty for E. We write:

ΔE = Δ[E0 − 2A·cos(kb)]

Of course, we’d need to interpret the Δ as a variance (σ2) or a standard deviation (σ) so we can apply the usual rules – i.e. var(a) = 0, var(aX) = a2·var(X), and var(aX ± bY) = a2·var(X) + b2·var(Y) ± 2ab·cov(X, Y) – to be a bit more precise about what we’re writing here, but you get the idea. In fact, let me quickly write it out:

var[E0 − 2A·cos(kb)] = var(E0) + 4A2·var[cos(kb)] ⇔ var(E) = 4A2·var[cos(kb)]
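
The variance rule used above is easy to sanity-check with a few lines of plain Python. The sketch only verifies the scaling var(E) = 4A2·var[cos(kb)]; it does not derive the variance of the cosine itself. The values of E0, A and b, and the assumption that k is uniformly spread over a narrow band, are made up for illustration.

```python
import math
import random
import statistics

random.seed(0)   # fixed seed so the run is reproducible

# Hypothetical parameters, for illustration only.
E0, A, b = 1.0, 0.25, 1.0

# Assume the wavenumber k is spread uniformly over a narrow band.
ks = [random.uniform(0.1, 0.3) for _ in range(10000)]

cos_kb = [math.cos(k * b) for k in ks]
energies = [E0 - 2 * A * c for c in cos_kb]

var_E = statistics.pvariance(energies)
var_cos = statistics.pvariance(cos_kb)

# The constant E0 drops out and the factor -2A comes out squared.
print(var_E, 4 * A ** 2 * var_cos)
```

The two printed numbers coincide up to rounding, whatever distribution you pick for k.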

Now, you should check my post scriptum to my page on the Essentials, to see what the probability density function of the cosine of a randomly distributed variable looks like, and then you should go online to find a formula for its variance, and then you can work it all out yourself, because… Well… I am not going to do it for you. What I want to do here is just show how Feynman gets Schrödinger’s equation out of all of these simplifications.

So what’s the second assumption? Well… As the graph shows, our k can take any value between −π/b and +π/b, and therefore, the kb argument in our cosine function can take on any value between −π and +π. In other words, kb could be any angle. However, as Feynman puts it—we’ll be assuming that kb is ‘small enough’, so we can use the small-angle approximations whenever we see the cos(kb) and/or sin(kb) functions. So we write: sin(kb) ≈ kb and cos(kb) ≈ 1 − (kb)2/2 = 1 − k2b2/2. Now, that assumption led to another grand result, which we also derived in our previous post. It had to do with the group velocity of our wave packet, which we calculated as:

v = dω/dk = (2Ab2/ħ)·k

Of course, we should interpret our k here as “the typical k“. Huh? Yes… That’s how Feynman refers to it, and I have no better term for it. It’s some kind of ‘average’ of the Δk interval, obviously, but… Well… Feynman does not give us any exact definition here. Of course, if you look at the graph once more, you’ll say that, if the typical kb has to be “small enough”, then its expected value should be zero. Well… Yes and no. If the typical kb is zero, or if k is zero, then v is zero, and then we’ve got a stationary electron, i.e. an electron with zero momentum. However, because we’re doing what we’re doing (that is, we’re studying “stuff that moves” – as I put it disrespectfully in a few of my posts, so as to distinguish from our analyses of “stuff that doesn’t move”, like our two-state systems, for example), our “typical k” should not be zero here. OK… We can now calculate what’s referred to as the effective mass of the electron, i.e. the mass that appears in the classical kinetic energy formula: K.E. = m·v2/2. Now, there are two ways to do that, and both are somewhat tricky in their interpretation:

1. Using both the E0 − 2A = 0 as well as the “small kb” assumption, we find that E = E0 − 2A·(1 − k2b2/2) = A·k2b2. Using that for the K.E. in our formula yields:

meff = 2A·k2b2/v2 = 2A·k2b2/[(2Ab2/ħ)·k]2 = ħ2/(2Ab2)

2. We can use the classical momentum formula (p = m·v), and then the 2nd de Broglie equation, which tells us that each wavenumber (k) is to be associated with a value for the momentum (p) using the p = ħk (so p is proportional to k, with ħ as the factor of proportionality). So we can now calculate meff as meff = ħk/v. Substituting again for what we’ve found above, gives us the same:

meff = ħ·k/v = ħ·k/[(2Ab2/ħ)·k] = ħ2/(2Ab2)
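
If you don’t trust the algebra, a few lines of plain Python confirm that both routes land on the same ħ2/(2Ab2). The numbers for A, b and k below are hypothetical (a hopping amplitude of 1 eV, a lattice spacing of 2 Å, and a typical k for which kb = 0.02): they only serve to make the check concrete.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant in J*s
EV = 1.602176634e-19      # joule per electronvolt

# Hypothetical lattice parameters, for illustration only.
A = 1.0 * EV              # hopping amplitude, expressed as an energy
b = 2.0e-10               # lattice spacing in metres
k = 1.0e8                 # "typical" wavenumber in 1/m, so k*b = 0.02

v = (2 * A * b ** 2 / HBAR) * k        # group velocity for small kb
kinetic = A * k ** 2 * b ** 2          # E for E0 - 2A = 0 and small kb

m_from_energy = 2 * kinetic / v ** 2   # route 1: K.E. = m*v^2/2
m_from_momentum = HBAR * k / v         # route 2: p = m*v = hbar*k
m_closed_form = HBAR ** 2 / (2 * A * b ** 2)

# How good is the small-angle approximation for this k*b?
exact_energy = 2 * A * (1 - math.cos(k * b))
relative_error = abs(kinetic - exact_energy) / exact_energy

print(m_from_energy, m_from_momentum, m_closed_form, relative_error)
```

The three mass values agree, and the relative error of the small-angle approximation is of the order of (kb)2/12, which is tiny for kb = 0.02.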

Of course, we’re not supposed to know the de Broglie relations at this point in time. 🙂 But, now that you’ve seen them anyway, note how we have two formulas for the momentum:

  • The classical formula (p = m·v) tells us that the momentum is proportional to the classical velocity of our particle, and m is then the factor of proportionality.
  • The quantum-mechanical formula (p = ħk) tells us that the (typical) momentum is proportional to the (typical) wavenumber, with Planck’s constant (ħ) as the factor of proportionality.

Combining both combines the classical and quantum-mechanical perspective of a moving particle:

m·v = ħk

I know… It’s an obvious equation but… Well… Think of it. It’s time to get back to the main story now. Remember we were trying to find Schrödinger’s equation? So let’s get on with it. 🙂

To do so, we need one more assumption. It’s the third major simplification and, just like the others, the assumption is obvious on first, but not on second thought. 😦 So… What is it? Well… It’s easy to see that, in our meff = ħ2/(2Ab2) formula, all depends on the value of 2Ab2. So, just like we should wonder what happens with that kb factor in the argument of our sine or cosine function if b goes to zero (i.e. if we’re letting the lattice spacing go to zero, so we’re moving from a discrete to a continuous analysis now), we should also wonder what happens with that 2Ab2 factor! Well… Think about it. Wouldn’t it be reasonable to assume that the effective mass of our electron is determined by some property of the material, or the medium (so that’s the silicon in our previous post) and, hence, that it’s constant really? Think of it: we’re not changing the fundamentals really – we just have some electron roaming around in some medium and all that we’re doing now is bringing those xn closer together. Much closer. It’s only logical, then, that our amplitude to jump from xn±1 to xn would also increase, no? So what we’re saying is that 2Ab2 is some constant which we write as ħ2/meff or, what amounts to the same, that Ab2 = ħ2/(2·meff).

Of course, you may raise two objections here:

  1. The Ab2 = ħ2/(2·meff) assumption establishes a very particular relation between A and b, as we can write A as A = [ħ2/(2meff)]·b−2 now. So we’ve got like a y = 1/x2 relation here. Where the hell does that come from?
  2. We were talking some real stuff here: a crystal lattice with atoms that, in reality, do have some spacing, so that corresponds to some real value for b. So that spacing gives some actual physical significance to those xn values.

Well… What can I say? I think you should re-read that quote of Feynman when I started this post. We’re going to get Schrödinger’s equation – i.e. the ultimate prize for all of the hard work that we’ve been doing so far – but… Yes. It’s really very heuristic, indeed! 🙂 But let’s get on with it now! We can re-write our Hamiltonian equation as:

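iħ·(∂C(xn)/∂t) = E0·C(xn) − A·C(xn + b) − A·C(xn − b)

= (E0 − 2A)·C(xn) + A·[2C(xn) − C(xn + b) − C(xn − b)] = A·[2C(xn) − C(xn + b) − C(xn − b)]

(The last step takes E0 = 2A: we fix the zero of our energy scale so that the first term drops out.)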

Now, I know your brain is about to melt down but, fiddling with this equation as we’re doing right now, Schrödinger recognized a formula for the second-order derivative of a function. I’ll just jot it down, and you can google it so as to double-check where it comes from:

∂²f(x)/∂x² = lim(b→0) [f(x + b) − 2f(x) + f(x − b)]/b²

Just substitute C(xn) for f(x) and compare with the second part of our equation above: you’ll see that, for very small b, we can effectively write that 2C(xn) − C(xn + b) − C(xn − b) factor as:

2C(xn) − C(xn + b) − C(xn − b) = −b²·∂²C(xn)/∂x² (for b → 0)

We’re done. We just keep iħ·(∂C(xn)/∂t) on the left-hand side now and multiply the expression above by A (remember: Ab² = ħ²/(2·meff)), to get what we wanted to get, and that’s – YES! – Schrödinger’s equation:

iħ·∂C(x)/∂t = −[ħ²/(2·meff)]·∂²C(x)/∂x²
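If you don’t want to take the limit on faith, you can check it numerically. The sketch below is only an illustration under a few assumptions of mine: natural units (ħ = meff = 1), a plane wave C(x) = exp(i·k·x) as a test function, and arbitrary values for k and x. It compares the lattice expression A·[2C(x) − C(x + b) − C(x − b)], with A = ħ²/(2·meff·b²), to the continuum term −[ħ²/(2·meff)]·∂²C/∂x² for ever smaller b.

```python
import cmath

HBAR = 1.0    # natural units, an assumption made for this sketch
M_EFF = 1.0
K = 2.0       # wavenumber of the test plane wave (arbitrary)

def C(x):
    """Test amplitude: a plane wave exp(i·K·x)."""
    return cmath.exp(1j * K * x)

def lattice_rhs(x, b):
    """A·[2C(x) − C(x + b) − C(x − b)] with A = ħ²/(2·meff·b²)."""
    A = HBAR**2 / (2.0 * M_EFF * b**2)
    return A * (2.0 * C(x) - C(x + b) - C(x - b))

def continuum_rhs(x):
    """−(ħ²/(2·meff))·C''(x); for the plane wave, C''(x) = −K²·C(x)."""
    return (HBAR**2 / (2.0 * M_EFF)) * K**2 * C(x)

x0 = 0.3
spacings = (0.1, 0.05, 0.025)
errors = [abs(lattice_rhs(x0, b) - continuum_rhs(x0)) for b in spacings]

for b, err in zip(spacings, errors):
    print(f"b = {b}: |lattice - continuum| = {err:.6f}")
```

The difference shrinks by a factor of about four each time b is halved, which is what you’d expect from a second-order finite-difference formula.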

Whatever your objections to this ‘derivation’, it is the correct equation. For a particle in free space, we just write m instead of meff, but it’s exactly the same. I’ll now give you Feynman’s full quote, which is quite enlightening:

“We do not intend to have you think we have derived the Schrödinger equation but only wish to show you one way of thinking about it. When Schrödinger first wrote it down, he gave a kind of derivation based on some heuristic arguments and some brilliant intuitive guesses. Some of the arguments he used were even false, but that does not matter; the only important thing is that the ultimate equation gives a correct description of nature. The purpose of our discussion is then simply to show you that the correct fundamental quantum mechanical equation [i.e. Schrödinger’s equation] has the same form you get for the limiting case of an electron moving along a line of atoms. We can think of it as describing the diffusion of a probability amplitude from one point to the next along the line. That is, if an electron has a certain amplitude to be at one point, it will, a little time later, have some amplitude to be at neighboring points. In fact, the equation looks something like the diffusion equations which we have used in Volume I. But there is one main difference: the imaginary coefficient in front of the time derivative makes the behavior completely different from the ordinary diffusion such as you would have for a gas spreading out along a thin tube. Ordinary diffusion gives rise to real exponential solutions, whereas the solutions of Schrödinger’s equation are complex waves.”

So… That says it all, I guess. Isn’t it great to be where we are? We’ve really climbed a mountain here. And I think the view is gorgeous. 🙂
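Feynman’s closing remark, real exponentials versus complex waves, is easy to see for a single Fourier mode exp(i·k·x). The sketch below is mine, not Feynman’s: it just plugs that mode into the ordinary diffusion equation ∂φ/∂t = D·∂²φ/∂x² and into ∂ψ/∂t = i·D·∂²ψ/∂x², and looks at the factor by which the mode is multiplied after a time t. The numbers (D = 0.5, i.e. ħ/(2·meff) with ħ = meff = 1, and k = 2) are arbitrary.

```python
import cmath
import math

def diffusion_factor(D, k, t):
    """Factor multiplying the mode exp(i·k·x) under ∂φ/∂t = D·∂²φ/∂x²."""
    return math.exp(-D * k**2 * t)

def schrodinger_factor(D, k, t):
    """Factor multiplying the same mode under ∂ψ/∂t = i·D·∂²ψ/∂x²."""
    return cmath.exp(-1j * D * k**2 * t)

D, k = 0.5, 2.0   # illustrative values only
for t in (0.0, 0.5, 1.0):
    d = diffusion_factor(D, k, t)
    s = schrodinger_factor(D, k, t)
    print(f"t = {t}: diffusion {d:.4f}, Schrödinger {s:.4f}, modulus {abs(s):.4f}")
```

The diffusion factor just decays, while the Schrödinger factor keeps its modulus at 1 and only rotates in the complex plane: the mode doesn’t die out, it oscillates.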

Oh—just in case you’d think I did not give you Schrödinger’s equation, let me write it in the form you’ll usually see it:

iħ·∂ψ(x, t)/∂t = −[ħ²/(2m)]·∇²ψ(x, t) + V·ψ(x, t)

Done! 🙂