The de Broglie relations, the wave equation, and relativistic length contraction

You know the two de Broglie relations, also known as matter-wave equations:

f = E/h and λ = h/p

You’ll find them in almost any popular account of quantum mechanics, and the writers of those popular books will tell you that f is the frequency of the ‘matter-wave’, and λ is its wavelength. In fact, to add some more weight to their narrative, they’ll usually write them in a somewhat more sophisticated form: they’ll write them using ω and k. The omega symbol (using a Greek letter always makes a big impression, doesn’t it?) denotes the angular frequency, while k is the so-called wavenumber. Now, k = 2π/λ and ω = 2π·f and, therefore, using the definition of the reduced Planck constant, i.e. ħ = h/2π, they’ll write the same relations as:

  1. λ = h/p = 2π/k ⇔ k = 2π·p/h
  2. f = E/h = (ω/2π)

⇒ k = p/ħ and ω = E/ħ

They’re the same thing: it’s just that working with angular frequencies and wavenumbers is more convenient, from a mathematical point of view that is: it’s why we prefer expressing angles in radians rather than in degrees (k is expressed in radians per meter, while ω is expressed in radians per second). In any case, the ‘matter wave’ – even Wikipedia uses that term now – is, of course, the amplitude, i.e. the wave-function ψ(x, t), which has a frequency and a wavelength, indeed. In fact, as I’ll show in a moment, it’s got two frequencies: one temporal, and one spatial. I am modest and, hence, I’ll admit it took me quite a while to fully distinguish the two frequencies, and so that’s why I always had trouble connecting these two ‘matter wave’ equations.
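To put some numbers on these relations, here is a minimal Python sketch. The electron and its speed of 10⁶ m/s are illustrative assumptions (they are not taken from the text above), and the function names are made up for this example. It computes λ = h/p and lets you check that k = p/ħ is indeed the same thing as 2π/λ.

```python
import math

# SI values of the constants; the electron and its speed are illustrative choices.
H = 6.62607015e-34          # Planck constant, J·s
HBAR = H / (2 * math.pi)    # reduced Planck constant, J·s
M_E = 9.1093837015e-31      # electron rest mass, kg


def wavenumber(p):
    """k = p/ħ, in radians per meter."""
    return p / HBAR


def angular_frequency(energy):
    """ω = E/ħ, in radians per second."""
    return energy / HBAR


v = 1.0e6                   # m/s, slow enough to ignore relativistic corrections
p = M_E * v                 # momentum p = m·v
wavelength = H / p          # λ = h/p
k = wavenumber(p)

print(f"lambda = {wavelength:.3e} m")
print(f"k      = {k:.3e} rad/m")
print(f"2*pi/lambda = {2 * math.pi / wavelength:.3e} rad/m")
```

The last two printed values coincide, which is all the k = 2π/λ rewriting amounts to.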

Indeed, if they represent the same thing, they must be related, right? But how exactly? It should be easy enough. The wavelength and the frequency must be related through the wave velocity, so we can write: f·λ = v, with v the velocity of the wave, which must be equal to the classical particle velocity, right? And then momentum and energy are also related. To be precise, we have the relativistic energy-momentum relationship: p·c = m_v·v·c = m_v·c²·v/c = E·v/c. So it’s just a matter of substitution. We should be able to go from one equation to the other, and vice versa. Right?

Well… No. It’s not that simple. We can start with either of the two equations but it doesn’t work. Try it. Whatever substitution you try, there’s no way you can derive one of the two equations above from the other. The fact that it’s impossible is evidenced by what we get when we’d multiply both equations. We get:

  1. f·λ = (E/h)·(h/p) = E/p
  2. v = f·λ  ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v)

⇒ E = m·v²

Huh? What kind of formula is that? E = m·v²? That’s a formula you’ve never ever seen, have you? It reminds you of the kinetic energy formula of course (K.E. = m·v²/2) but… That factor 1/2 should not be there. Let’s think about it for a while. First note that this E = m·v² relation makes perfect sense if v = c. In that case, we get Einstein’s mass-energy equivalence (E = m·c²), but that’s beside the point here. The point is: if v = c, then our ‘particle’ is a photon, really, and then the E = h·f is referred to as the Planck-Einstein relation. The wave velocity is then equal to c and, therefore, f·λ = c, and so we can effectively substitute to find what we’re looking for:

E/p = (h·f)/(h/λ) = f·λ = c ⇒ E = p·c

So that’s fine: we just showed that the de Broglie relations are correct for photons. [You remember that E = p·c relation, no? If not, check out my post on it.] However, while that’s all nice, it is not what the de Broglie equations are about: we’re talking the matter-wave here, and so we want to do something more than just re-confirm that Planck-Einstein relation, which you can interpret as the limit of the de Broglie relations for v = c. In short, we’re doing something wrong here! Of course, we are. I’ll tell you what exactly in a moment: it’s got to do with the fact we’ve got two frequencies really.

Let’s first try something else. We’ve been using the relativistic E = m_v·c² equation above. Let’s try some other energy concept: let’s substitute the E in the f = E/h by the kinetic energy and then see where we get, if anywhere at all. So we’ll use the E_kinetic = m∙v²/2 equation. We can then use the definition of momentum (p = m∙v) to write E = p²/(2m), and then we can relate the frequency f to the wavelength λ using the v = λ∙f formula once again. That should work, no? Let’s do it. We write:

  1. E = p²/(2m)
  2. E = h∙f = h·v/λ

⇒ λ = h·v/E = h·v/(p²/(2m)) = h·v/[m²·v²/(2m)] = h/[m·v/2] = 2∙h/p

So we find λ = 2∙h/p. That is almost right, but not quite: that factor 2 should not be there. Well… Of course you’re smart enough to see it’s just that factor 1/2 popping up once more, but as a reciprocal this time around. 🙂 So what’s going on? The honest answer is: you can try anything but it will never work, because the f = E/h and λ = h/p equations cannot be related, or at least not so easily. The substitutions above only work if we use that E = m·v² energy concept which, you’ll agree, doesn’t make much sense, at first at least. Again: what’s going on? Well… Same honest answer: the f = E/h and λ = h/p equations cannot be related (or at least not so easily) because the wave equation itself is not so easy.
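The stray factor 2 is easy to reproduce numerically. The sketch below is only an illustration: the function names and the values m = 2, v = 0.3 and h = 1 are arbitrary assumptions. It forces f·λ = v with the kinetic energy plugged into f = E/h, and compares the resulting wavelength with λ = h/p.

```python
def naive_wavelength(m, v, h=1.0):
    """Wavelength obtained by forcing f·λ = v with f = E/h and E = m·v²/2."""
    energy = 0.5 * m * v**2
    f = energy / h
    return v / f


def de_broglie_wavelength(m, v, h=1.0):
    """The actual de Broglie relation: λ = h/p with p = m·v."""
    return h / (m * v)


m, v = 2.0, 0.3  # arbitrary illustrative values
ratio = naive_wavelength(m, v) / de_broglie_wavelength(m, v)
print(f"naive / de Broglie = {ratio:.6f}")
```

Whatever values you pick for m and v, the ratio comes out as 2, which is the reciprocal of the 1/2 in the kinetic energy formula.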

Let’s review the basics once again.

The wavefunction

The amplitude of a particle is represented by a wavefunction. If we have no information whatsoever on its position, then we usually write that wavefunction as the following complex-valued exponential:

ψ(x, t) = a·e^(−i·[(E/ħ)·t − (p/ħ)∙x]) = a·e^(−i·(ω·t − k∙x)) = a·e^(i·(k∙x − ω·t)) = a·e^(iθ) = a·(cosθ + i·sinθ)

θ is the so-called phase of our wavefunction and, as you can see, it’s the argument of a wavefunction indeed, with temporal frequency ω and spatial frequency k (if we choose our x-axis so its direction is the same as the direction of k, then we can substitute the k and x vectors for the k and x scalars, so that’s what we’re doing here). Now, we know we shouldn’t worry too much about a, because that’s just some normalization constant (remember: all probabilities have to add up to one). However, let’s quickly develop some logic here. Taking the absolute square of this wavefunction gives us the probability of our particle being somewhere in space at some point in time. So we get the probability as a function of x and t. We write:

P(x, t) = |a·e^(−i·[(E/ħ)·t − (p/ħ)∙x])|² = a²

As all probabilities have to add up to one, we must assume we’re looking at some box in spacetime here. So, if the length of our box is Δx = x₂ − x₁, then (Δx)·a² = (x₂ − x₁)·a² = 1 ⇔ Δx = 1/a². [We obviously simplify the analysis by assuming a one-dimensional space only here, but the gist of the argument is essentially correct.] So, freezing time (i.e. equating t to some point t = t₀), we get a uniform probability density function: P(x) = a² = 1/Δx for x between x₁ and x₂.


That’s simple enough. The point is: the two de Broglie equations f = E/h and λ = h/p give us the temporal and spatial frequencies in that ψ(x, t) = a·e^(−i·[(E/ħ)·t − (p/ħ)∙x]) relation. As you can see, that’s an equation that implies a much more complicated relationship between E/ħ = ω and p/ħ = k. Or… Well… Much more complicated than what one would think of at first.
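The constant probability is easy to verify numerically. The following Python sketch uses arbitrary illustrative values (a = 0.5, E = 3, p = 1.2, ħ = 1), none of which come from the text; it evaluates the absolute square of the wavefunction at a few points in space and time and computes the box length Δx = 1/a².

```python
import cmath


def psi(x, t, a, energy, p, hbar=1.0):
    """Elementary wavefunction a·exp(−i·[(E/ħ)·t − (p/ħ)·x])."""
    return a * cmath.exp(-1j * ((energy / hbar) * t - (p / hbar) * x))


a = 0.5                      # normalization constant (illustrative)
box_length = 1 / a**2        # Δx = 1/a², so that Δx·a² = 1

# Absolute square at a handful of arbitrary points in space and time.
samples = [abs(psi(x, t, a, energy=3.0, p=1.2)) ** 2
           for x in (0.0, 0.7, 3.9)
           for t in (0.0, 2.5)]

print("probabilities:", [round(s, 12) for s in samples])
print("box length:", box_length)
```

Every sample equals a², wherever and whenever we look: the phase drops out of the absolute square entirely.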

To appreciate what’s being represented here, it’s good to play a bit. We’ll continue with our simple exponential above, which also illustrates how we usually analyze those wavefunctions: we either assume we’re looking at the wavefunction in space at some fixed point in time (t = t₀) or, else, at how the wavefunction changes in time at some fixed point in space (x = x₀). Of course, we know that Einstein told us we shouldn’t do that: space and time are related and, hence, we should try to think of spacetime, i.e. some ‘kind of union’ of space and time, as Minkowski famously put it. However, when everything is said and done, mere mortals like us are not so good at that, and so we’re sort of condemned to try to imagine things using the classical cut-up of things. 🙂 So we’ll just use an online graphing tool to play with that a·e^(i·(k∙x − ω·t)) = a·e^(iθ) = a·(cosθ + i·sinθ) formula.

Compare the following two graphs, for example. Just imagine we either look at how the wavefunction behaves in space, with the time fixed at some point t = t₀, or, alternatively, that we look at how the wavefunction behaves in time at some point in space x = x₀. As you can see, increasing k = p/ħ or increasing ω = E/ħ gives the wavefunction a higher ‘density’ in space or, alternatively, in time.

[Graph: density 1]

[Graph: density 2]

That makes sense, intuitively. In fact, when thinking about how the energy, or the momentum, affects the shape of the wavefunction, I am reminded of an airplane propeller: as it spins, faster and faster, it gives the propeller some ‘density’, in space as well as in time, as its blades cover more space in less time. It’s an interesting analogy: it helps me, at least, to think through what that wavefunction might actually represent.
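If you don’t have a graphing tool at hand, one crude way to put a number on that ‘density’ is to count how often the real part of the wavefunction changes sign over a fixed stretch of space. The Python sketch below does that with time frozen at t = 0; the function name, the interval length of 10 and the sampling resolution are all assumptions made for the illustration.

```python
import math


def count_zero_crossings(k, length=10.0, n=10_000):
    """Count sign changes of cos(k·x), the real part of e^(i·k·x), on [0, length]."""
    crossings = 0
    previous = math.cos(0.0)
    for i in range(1, n + 1):
        current = math.cos(k * length * i / n)
        if previous * current < 0:
            crossings += 1
        previous = current
    return crossings


for k in (1.0, 2.0, 4.0):
    print(f"k = {k}: {count_zero_crossings(k)} zero crossings")
```

Doubling k (i.e. doubling the momentum) roughly doubles the number of crossings on the same interval. The same count applies to the time axis if you replace k·x by ω·t.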


So as to stimulate your imagination even more, you should also think of representing the real and imaginary part of that ψ = a·e^(i·(k∙x − ω·t)) = a·e^(iθ) = a·(cosθ + i·sinθ) formula in a different way. In the graphs above, we just showed the sine and cosine in the same plane but, as you know, the real and the imaginary axis are orthogonal, so Euler’s formula a·e^(iθ) = a·(cosθ + i·sinθ) = Re(ψ) + i·Im(ψ) may also be graphed as follows:


The illustration above should make you think of yet another illustration you’ve probably seen like a hundred times before: the electromagnetic wave, propagating through space as the magnetic and electric field induce each other, as illustrated below. However, there’s a big difference: Euler’s formula incorporates a phase shift—remember: sinθ = cos(θ − π/2)—and you don’t have that in the graph below. The difference is much more fundamental, however: it’s really hard to see how one could possibly relate the magnetic and electric field to the real and imaginary part of the wavefunction respectively. Having said that, the mathematical similarity makes one think!


Of course, you should remind yourself of what E and B stand for: they represent the strength of the electric (E) and magnetic (B) field at some point x at some time t. So you shouldn’t think of those wavefunctions above as occupying some three-dimensional space. They don’t. Likewise, our wavefunction ψ(x, t) does not occupy some physical space: it’s some complex number—an amplitude that’s associated with each and every point in spacetime. Nevertheless, as mentioned above, the visuals make one think and, as such, do help us as we try to understand all of this in a more intuitive way.

Let’s now look at that energy-momentum relationship once again, but using the wavefunction, rather than those two de Broglie relations.

Energy and momentum in the wavefunction

I am not going to talk about uncertainty here. You know that Spiel. If there’s uncertainty, it’s in the energy or the momentum, or in both. The uncertainty determines the size of that ‘box’ (in spacetime) in which we hope to find our particle, and it’s modeled by a splitting of the energy levels. We’ll say the energy of the particle may be E₀, but it might also be some other value, which we’ll write as E_n = E₀ ± n·ħ. The thing to note is that energy levels will always be separated by some integer multiple of ħ, so ħ is, effectively, the quantum of energy for all practical (and theoretical) purposes. We then super-impose the various waves to get a wavefunction that might, or might not, resemble something like this:

[Image: photon wave]

Who knows? 🙂 In any case, that’s not what I want to talk about here. Let’s repeat the basics once more: if we write our wavefunction a·e^(−i·[(E/ħ)·t − (p/ħ)∙x]) as a·e^(−i·[ω·t − k∙x]), we refer to ω = E/ħ as the temporal frequency, i.e. the frequency of our wavefunction in time (i.e. the frequency it has if we keep the position fixed), and to k = p/ħ as the spatial frequency, i.e. the frequency of our wavefunction in space (so now we stop the clock and just look at the wave in space). Now, let’s think about the energy concept first. The energy of a particle is generally thought of as consisting of three parts:

  1. The particle’s rest energy m₀c², which de Broglie referred to as internal energy (E_int): it includes the rest mass of the ‘internal pieces’, as Feynman puts it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy);
  2. Any potential energy it may have because of some field (so de Broglie was not assuming the particle was traveling in free space), which we’ll denote by U, and note that the field can be anything—gravitational, electromagnetic: it’s whatever changes the energy because of the position of the particle;
  3. The particle’s kinetic energy, which we write in terms of its momentum p: m·v²/2 = m²·v²/(2m) = (m·v)²/(2m) = p²/(2m).

So we have one energy concept here (the rest energy) that does not depend on the particle’s position in spacetime, and two energy concepts that do depend on position (potential energy) and/or how that position changes because of its velocity and/or momentum (kinetic energy). The two last bits are related through the energy conservation principle. The total energy is E = m_v·c², of course, with the little subscript (v) ensuring the mass incorporates the equivalent mass of the particle’s kinetic energy.

So what? Well… In my post on quantum tunneling, I drew attention to the fact that different potentials, so different potential energies (indeed, as our particle travels from one region to another, the field is likely to vary), have no impact on the temporal frequency. Let me re-visit the argument, because it’s an important one. Imagine two different regions in space that differ in potential, because the field has a larger or smaller magnitude there, or points in a different direction, or whatever: just different fields, which correspond to different values for U₁ and U₂, i.e. the potential in region 1 versus region 2. Now, the different potential will change the momentum: the particle will accelerate or decelerate as it moves from one region to the other, so we also have a different p₁ and p₂. Having said that, the internal energy doesn’t change, so we can write the corresponding amplitudes, or wavefunctions, as:

  1. ψ₁(θ₁) = Ψ₁(x, t) = a·e^(−iθ₁) = a·e^(−i[(E_int + p₁²/(2m) + U₁)·t − p₁∙x]/ħ)
  2. ψ₂(θ₂) = Ψ₂(x, t) = a·e^(−iθ₂) = a·e^(−i[(E_int + p₂²/(2m) + U₂)·t − p₂∙x]/ħ)

Now how should we think about these two equations? We are definitely talking different wavefunctions. However, their temporal frequencies ω₁ = E_int + p₁²/(2m) + U₁ and ω₂ = E_int + p₂²/(2m) + U₂ must be the same. Why? Because of the energy conservation principle (or its equivalent in quantum mechanics, I should say): the temporal frequency f or ω, i.e. the time-rate of change of the phase of the wavefunction, does not change: all of the change in potential, and the corresponding change in kinetic energy, goes into changing the spatial frequency, i.e. the wave number k or the wavelength λ, as potential energy becomes kinetic or vice versa. The sum of the potential and kinetic energy doesn’t change, indeed. So the energy remains the same and, therefore, the temporal frequency does not change. In fact, we need this quantum-mechanical equivalent of the energy conservation principle to calculate how the momentum and, hence, the spatial frequency of our wavefunction, changes. We do so by boldly equating ω₁ = E_int + p₁²/(2m) + U₁ and ω₂ = E_int + p₂²/(2m) + U₂, and so we write:

ω₁ = ω₂ ⇔ E_int + p₁²/(2m) + U₁ = E_int + p₂²/(2m) + U₂

⇔ p₁²/(2m) − p₂²/(2m) = U₂ − U₁ ⇔ p₂² = (2m)·[p₁²/(2m) − (U₂ − U₁)]

⇔ p₂ = (p₁² − 2m·ΔU)^(1/2)

We played with this in a previous post, assuming that p₁² is larger than 2m·ΔU, so as to get a positive number on the right-hand side of the equation for p₂², so then we can confidently take the positive square root of that (p₁² − 2m·ΔU) expression to calculate p₂. For example, when the potential difference ΔU = U₂ − U₁ was negative, so ΔU < 0, then we’re safe and sure to get some real positive value for p₂.

Having said that, we also contemplated the possibility that p₂² = p₁² − 2m·ΔU was negative, in which case p₂ has to be some pure imaginary number, which we wrote as p₂ = i·p’ (so p’ (read: p prime) is a real positive number here). We could work with that: it resulted in an exponentially decreasing factor e^(−p’·x/ħ) that ended up ‘killing’ the wavefunction in space. However, its limited existence still allowed particles to ‘tunnel’ through potential energy barriers, thereby explaining the quantum-mechanical tunneling phenomenon.
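Both cases can be handled by one and the same formula if we let the square root return a complex number. The Python sketch below is a minimal illustration under assumed values (m = 1, p₁ = 2, ħ = 1, and two made-up potential differences); the function names are hypothetical, too.

```python
import cmath
import math


def momentum_in_region_2(p1, m, delta_u):
    """p2 = (p1² − 2m·ΔU)^(1/2); the complex root gives i·p' when the argument is negative."""
    return cmath.sqrt(p1**2 - 2 * m * delta_u)


def spatial_factor(p2, x, hbar=1.0):
    """The x-dependent part of the wavefunction, e^(i·p2·x/ħ)."""
    return cmath.exp(1j * p2 * x / hbar)


# Case 1: the potential drops (ΔU < 0), so the particle speeds up and p2 is real.
p2_real = momentum_in_region_2(p1=2.0, m=1.0, delta_u=-2.5)

# Case 2: the barrier is too high (2m·ΔU > p1²), so p2 = i·p' is pure imaginary.
p2_imag = momentum_in_region_2(p1=2.0, m=1.0, delta_u=4.0)

print("p2 (real case):     ", p2_real)
print("p2 (imaginary case):", p2_imag)
print("|spatial factor| at x = 1:", abs(spatial_factor(p2_imag, 1.0)))
print("e^(-p'*x) at x = 1:       ", math.exp(-p2_imag.imag * 1.0))
```

In the second case the oscillating factor e^(i·p₂·x/ħ) turns into the decaying factor e^(−p’·x/ħ), which is the ‘killing’ of the wavefunction described above.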

This is rather weird—at first, at least. Indeed, one would think that, because of the E/ħ = ω equation, any change in energy would lead to some change in ω. But no! The total energy doesn’t change, and the potential and kinetic energy are like communicating vessels: any change in potential energy is associated with a change in p, and vice versa. It’s a really funny thing. It helps to think it’s because the potential depends on position only, and so it should not have an impact on the temporal frequency of our wavefunction. Of course, it’s equally obvious that the story would change drastically if the potential would change with time, but… Well… We’re not looking at that right now. In short, we’re assuming energy is being conserved in our quantum-mechanical system too, and so that implies what’s described above: no change in ω, but we obviously do have changes in p whenever our particle goes from one region in space to another, and the potentials differ. So… Well… Just remember: the energy conservation principle implies that the temporal frequency of our wave function doesn’t change. Any change in potential, as our particle travels from one place to another, plays out through the momentum.

Now that we know that, let’s look at those de Broglie relations once again.

Re-visiting the de Broglie relations

As mentioned above, we usually think in one dimension only: we either freeze time or, else, we freeze space. If we do that, we can derive some funny new relationships. Let’s first simplify the analysis by re-writing the argument of the wavefunction as:

θ = E·t − p·x

Of course, you’ll say: the argument of the wavefunction is not equal to E·t − p·x: it’s (E/ħ)·t − (p/ħ)∙x. Moreover, θ should have a minus sign in front. Well… Yes, you’re right. We should put that 1/ħ factor in front, but we can change units, and so let’s just measure both E as well as p in units of ħ here. We can do that. No worries. And, yes, the minus sign should be there (Nature chose a clockwise direction for θ) but that doesn’t matter for the analysis hereunder.

The E·t − p·x expression reminds one of those invariant quantities in relativity theory. But let’s be precise here. We’re thinking about those so-called four-vectors here, which we wrote as p_μ = (E, p_x, p_y, p_z) = (E, p) and x_μ = (t, x, y, z) = (t, x) respectively. [Well… OK… You’re right. We wrote those four-vectors as p_μ = (E, p_x·c, p_y·c, p_z·c) = (E, p·c) and x_μ = (c·t, x, y, z) = (c·t, x). So what we write is true only if we measure time and distance in equivalent units so we have c = 1. So… Well… Let’s do that and move on.] In any case, what was invariant was not E·t − p·x·c or c·t − x (that’s a nonsensical expression anyway: you cannot subtract a vector from a scalar), but p_μ² = p_μp^μ = E² − (p·c)² = E² − p²·c² = E² − (p_x² + p_y² + p_z²)·c² and x_μ² = x_μx^μ = (c·t)² − x² = c²·t² − (x² + y² + z²) respectively. [Remember p_μp^μ and x_μx^μ are four-vector dot products, so they have that +−−− signature, unlike the p² and x² or a·b dot products, which are just a simple sum of the squared components.] So… Well… E·t − p·x is not an invariant quantity. Let’s try something else.

Let’s re-simplify by equating ħ as well as c to one again, so we write: ħ = c = 1. [You may wonder if it is possible to ‘normalize’ both physical constants simultaneously, but the answer is yes. The Planck unit system is an example.] Then our relativistic energy-momentum relationship can be re-written as E/p = 1/v. [If c would not be one, we’d write: E·β = p·c, with β = v/c. So we got E/p = c/β. We referred to β as the relative velocity of our particle: it was the velocity, but measured as a ratio of the speed of light. So here it’s the same, except that we use the velocity symbol v now for that ratio.]

Now think of a particle moving in free space, i.e. without any fields acting on it, so we don’t have any potential changing the spatial frequency of the wavefunction of our particle, and let’s also assume we choose our x-axis such that it’s the direction of travel, so the position vector (x) can be replaced by a simple scalar (x). Finally, we will also choose the origin of our x-axis such that x = 0 when t = 0, so we write: x(t = 0) = 0. It’s obvious then that, if our particle is traveling in spacetime with some velocity v, then the ratio of its position x and the time t that it’s been traveling will always be equal to v = x/t. Hence, for that very special position in spacetime (t, x = v·t) – so we’re talking the actual position of the particle in spacetime here – we get: θ = E·t − p·x = E·t − p·v·t = E·t − m·v·v·t = (E − m∙v²)·t. So… Well… There we have the m∙v² factor.

The question is: what does it mean? How do we interpret this? I am not sure. When I first jotted this thing down, I thought of choosing a different reference potential: some negative value such that it ensures that the sum of kinetic, rest and potential energy is zero, so I could write E = 0 and then the wavefunction would reduce to ψ(t) = e^(i·m∙v²·t). Feynman refers to that as ‘choosing the zero of our energy scale such that E = 0’, and you’ll find this in many other works too. However, it’s not that simple. Free space is free space: if there’s no change in potential from one region to another, then the concept of some reference point for the potential becomes meaningless. There is only rest energy and kinetic energy, then. The total energy reduces to E = m (because we chose our units such that c = 1 and, therefore, E = m·c² = m·1² = m) and so our wavefunction reduces to:

ψ(t) = a·e^(i·m·(1 − v²)·t)

We can’t reduce this any further. The mass is the mass: it’s a measure for inertia, as measured in our inertial frame of reference. And the velocity is the velocity, of course—also as measured in our frame of reference. We can re-write it, of course, by substituting t for t = x/v, so we get:

ψ(x) = a·e^(i·m·(1/v − v)·x)

For both functions, we get constant probabilities, but a wavefunction that’s ‘denser’ for higher values of m. The (1 − v²) and (1/v − v) factors are different, however: these factors become smaller for higher v, so our wavefunction becomes less dense for higher v. In fact, for v = 1 (so for travel at the speed of light, i.e. for photons), we get that ψ(t) = ψ(x) = e⁰ = 1. [You should use the graphing tool once more, and you’ll see the imaginary part, i.e. the sine of the (cosθ + i·sinθ) expression, just vanishes, as sinθ = 0 for θ = 0.]
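A quick numerical check of these two functions may help. The Python sketch below works in natural units (ħ = c = 1, so v is a ratio between 0 and 1); the function names and the values m = 1.5, v = 0.6 and t = 2 are arbitrary assumptions. It confirms that ψ(t) and ψ(x) agree on the particle’s path x = v·t, and that both collapse to a constant for v = 1.

```python
import cmath


def psi_t(t, m, v, a=1.0):
    """ψ(t) = a·e^(i·m·(1 − v²)·t), the wavefunction along the path x = v·t."""
    return a * cmath.exp(1j * m * (1 - v**2) * t)


def psi_x(x, m, v, a=1.0):
    """ψ(x) = a·e^(i·m·(1/v − v)·x), the same function after substituting t = x/v."""
    return a * cmath.exp(1j * m * (1 / v - v) * x)


m, v, t = 1.5, 0.6, 2.0      # arbitrary illustrative values
print("psi_t(t)     =", psi_t(t, m, v))
print("psi_x(v*t)   =", psi_x(v * t, m, v))
print("psi_t, v = 1 =", psi_t(5.0, m, 1.0))

# The phase advances more slowly (the wave is 'less dense') as v goes up.
for speed in (0.1, 0.5, 0.9):
    print(f"v = {speed}: temporal factor m*(1 - v^2) = {m * (1 - speed**2):.3f}")
```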


The wavefunction and relativistic length contraction

Are exercises like this useful? As mentioned above, these constant probability wavefunctions are a bit nonsensical, so you may wonder why I wrote what I wrote. There may be no real conclusion, indeed: I was just fiddling around a bit, and playing with equations and functions. I feel stuff like this helps me to understand what that wavefunction actually is somewhat better. If anything, it does illustrate that idea of the ‘density’ of a wavefunction, in space or in time. What we’ve been doing by substituting x for x = v·t or t for t = x/v is showing how, when everything is said and done, the mass and the velocity of a particle are the actual variables determining that ‘density’ and, frankly, I really like that ‘airplane propeller’ idea as a pedagogic device. In fact, I feel it may be more than just a pedagogic device, and so I’ll surely re-visit it—once I’ve gone through the rest of Feynman’s Lectures, that is. 🙂

That brings me to what I added in the title of this post: relativistic length contraction. You’ll wonder why I am bringing that into a discussion like this. Well… Just play a bit with those (1 − v²) and (1/v − v) factors. As mentioned above, they decrease the density of the wavefunction. In other words, it’s like space is being ‘stretched out’. Also, it can’t be a coincidence we find the same (1 − v²) factor in the relativistic length contraction formula: L = L₀·√(1 − v²), in which L₀ is the so-called proper length (i.e. the length in the stationary frame of reference) and v is the (relative) velocity of the moving frame of reference. Of course, we also find it in the relativistic mass formula: m = m_v = m₀/√(1 − v²). In fact, things become much more obvious when substituting m for m₀/√(1 − v²) in that ψ(t) = e^(i·m·(1 − v²)·t) function. We get:

ψ(t) = a·e^(i·m·(1 − v²)·t) = a·e^(i·m₀·√(1 − v²)·t)

Well… We’re surely getting somewhere here. What if we go back to our original ψ(x, t) = a·e^(i·[(E/ħ)·t − (p/ħ)∙x]) function? Using natural units once again, that’s equivalent to:

ψ(x, t) = a·e^(i·(m·t − p∙x)) = a·e^(i·[(m₀/√(1 − v²))·t − (m₀·v/√(1 − v²))∙x])

= a·e^(i·[m₀/√(1 − v²)]·(t − v∙x))

Interesting! We’ve got a wavefunction that’s a function of x and t, but with the rest mass (or rest energy) and velocity as parameters! Now that really starts to make sense. Look at the (blue) graph for that 1/√(1 − v²) factor: it goes from one (1) to infinity (∞) as v goes from 0 to 1 (remember we ‘normalized’ v: it’s a ratio between 0 and 1 now). So that’s the factor that comes into play for t. For x, it’s the red graph, which has the same shape but goes from zero (0) to infinity (∞) as v goes from 0 to 1.

[Graph: graph 2]

Now that makes sense: the ‘density’ of the wavefunction, in time and in space, increases as the velocity v increases. In space, that should correspond to the relativistic length contraction effect: it’s like space is contracting, as the velocity increases and, therefore, the length of the object we’re watching contracts too. For time, the reasoning is a bit more complicated: it’s our time that becomes more dense and, therefore, our clock that seems to tick faster.
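The two factors are easy to tabulate. The Python sketch below uses natural units (c = 1) and an assumed rest mass m₀ = 1; the function names are made up for the illustration. It prints the coefficient of t and the coefficient of x in the exponent for a few velocities, and also checks the m·(1 − v²) = m₀·√(1 − v²) substitution used above.

```python
import math


def lorentz_factor(v):
    """1/√(1 − v²), with v expressed as a fraction of the speed of light."""
    return 1 / math.sqrt(1 - v**2)


def time_factor(m0, v):
    """Coefficient of t in the exponent: m0/√(1 − v²)."""
    return m0 * lorentz_factor(v)


def space_factor(m0, v):
    """Coefficient of x in the exponent: m0·v/√(1 − v²)."""
    return m0 * v * lorentz_factor(v)


m0 = 1.0  # rest mass in natural units (illustrative)
for v in (0.0, 0.5, 0.9, 0.99):
    print(f"v = {v:4}: time factor = {time_factor(m0, v):7.3f}, "
          f"space factor = {space_factor(m0, v):7.3f}")

# Consistency check of the substitution m = m0/√(1 − v²):
v = 0.6
m = m0 * lorentz_factor(v)
print(m * (1 - v**2), m0 * math.sqrt(1 - v**2))
```

The time factor starts at 1 and the space factor starts at 0, and both grow without bound as v approaches 1, which matches the description of the two curves.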


I know I need to explore this further—if only so as to assure you I have not gone crazy. Unfortunately, I have no time to do that right now. Indeed, from time to time, I need to work on other stuff besides this physics ‘hobby’ of mine. :-/

Post scriptum 1: As for the E = m·v² formula, I also have a funny feeling that it might be related to the fact that, in quantum mechanics, both the real and imaginary part of the oscillation actually matter. You’ll remember that we’d represent any oscillator in physics by a complex exponential, because it eased our calculations. So instead of writing A = A₀·cos(ωt + Δ), we’d write: A = A₀·e^(i(ωt + Δ)) = A₀·cos(ωt + Δ) + i·A₀·sin(ωt + Δ). When calculating the energy or intensity of a wave, however, we couldn’t just take the square of the complex amplitude of the wave – remembering that E ∼ A². No! We had to get back to the real part only, i.e. the cosine or the sine only. Now the mean (or average) value of the squared cosine function (or a squared sine function), over one or more cycles, is 1/2, so the mean of A² = A₀²·cos²(ωt + Δ) is equal to A₀²/2. I am not sure, and it’s probably a long shot, but one must be able to show that, if the imaginary part of the oscillation would actually matter – which is obviously the case for our matter-wave – then 1/2 + 1/2 is obviously equal to 1. I mean: try to think of an image with a mass attached to two springs, rather than one only. Does that make sense? 🙂 […] I know: I am just freewheeling here. 🙂

Post scriptum 2: The other thing that this E = m·v² equation makes me think of is – curiously enough – an eternally expanding spring. Indeed, the kinetic energy of a mass on a spring and the potential energy that’s stored in the spring always add up to some constant, and the average potential and kinetic energy are equal to each other. To be precise: 〈K.E.〉 + 〈P.E.〉 = (1/4)·k·A² + (1/4)·k·A² = k·A²/2. It means that, on average, the total energy of the system is twice the average kinetic energy (or potential energy). You’ll say: so what? Well… I don’t know. Can we think of a spring that expands eternally, with the mass on its end not gaining or losing any speed? In that case, v is constant, and the total energy of the system would, effectively, be equal to E_total = 2·〈K.E.〉 = 2·m·v²/2 = m·v².
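The averages for an ordinary (non-expanding) spring can be checked by brute force. The Python sketch below samples one full period of x(t) = A·cos(ωt) for an ideal mass on a spring; the mass, the spring constant k (the spring stiffness here, not a wavenumber) and the amplitude are arbitrary assumptions, as is the function name.

```python
import math


def average_energies(m, k, amplitude, n=10_000):
    """Average kinetic and potential energy over one period of x(t) = A·cos(ω·t)."""
    omega = math.sqrt(k / m)
    period = 2 * math.pi / omega
    kinetic = potential = 0.0
    for i in range(n):
        t = period * i / n
        x = amplitude * math.cos(omega * t)
        v = -amplitude * omega * math.sin(omega * t)
        kinetic += 0.5 * m * v**2
        potential += 0.5 * k * x**2
    return kinetic / n, potential / n


m, k, amplitude = 2.0, 8.0, 0.5   # arbitrary illustrative values
avg_ke, avg_pe = average_energies(m, k, amplitude)
print("average K.E.:", avg_ke)
print("average P.E.:", avg_pe)
print("k*A^2/4:     ", k * amplitude**2 / 4)
print("total:       ", avg_ke + avg_pe, "vs k*A^2/2 =", k * amplitude**2 / 2)
```

Both averages come out as k·A²/4, so the total is indeed twice the average kinetic energy. Whether that carries over to the ‘eternally expanding spring’ is, of course, a separate question that this check does not address.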

Post scriptum 3: That substitution I made above – substituting x for x = v·t – is kinda weird. Indeed, if that E = m∙v² equation makes any sense, then E − m∙v² = 0, of course, and, therefore, θ = E·t − p·x = E·t − p·v·t = E·t − m·v·v·t = (E − m∙v²)·t = 0·t = 0. So the argument of our wavefunction is 0 and, therefore, we get a·e⁰ = a for our wavefunction. It basically means our particle is where it is. 🙂

Post scriptum 4: This post scriptum – no. 4 – was added later, much later. On 29 February 2016, to be precise. The solution to the ‘riddle’ above is actually quite simple. We just need to make a distinction between the group and the phase velocity of our complex-valued wave. The solution came to me when I was writing a little piece on Schrödinger’s equation. I noticed that we do not find that weird E = m∙v² formula when substituting ψ for ψ = e^(i(kx − ωt)) in Schrödinger’s equation, i.e. in:

∂ψ/∂t = i·(ħ/2m)·∇²ψ

Let me quickly go over the logic. To keep things simple, we’ll just assume one-dimensional space, so ∇²ψ = ∂²ψ/∂x². The time derivative on the left-hand side is ∂ψ/∂t = −iω·e^(i(kx − ωt)). The second-order derivative on the right-hand side is ∂²ψ/∂x² = (ik)·(ik)·e^(i(kx − ωt)) = −k²·e^(i(kx − ωt)). The e^(i(kx − ωt)) factor on both sides cancels out and, hence, equating both sides gives us the following condition:

−iω = −(iħ/2m)·k² ⇔ ω = (ħ/2m)·k²

Substituting ω = E/ħ and k = p/ħ yields:

E/ħ = (ħ/2m)·p²/ħ² = m²·v²/(2m·ħ) = m·v²/(2ħ) ⇔ E = m·v²/2
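The condition ω = (ħ/2m)·k² can also be checked numerically, without doing the derivatives by hand. The Python sketch below is a rough finite-difference check under assumed units ħ = m = 1; the function names, the step size and the sample point are arbitrary choices. It plugs ψ = e^(i(kx − ωt)) into the one-dimensional free-particle equation and looks at what is left over.

```python
import cmath

HBAR = 1.0   # natural units (assumption for this sketch)
MASS = 1.0


def psi(x, t, k, omega):
    """Plane wave e^(i·(k·x − ω·t))."""
    return cmath.exp(1j * (k * x - omega * t))


def schrodinger_residual(x, t, k, omega, step=1e-4):
    """∂ψ/∂t − i·(ħ/2m)·∂²ψ/∂x², estimated with central finite differences."""
    dpsi_dt = (psi(x, t + step, k, omega) - psi(x, t - step, k, omega)) / (2 * step)
    d2psi_dx2 = (psi(x + step, t, k, omega) - 2 * psi(x, t, k, omega)
                 + psi(x - step, t, k, omega)) / step**2
    return dpsi_dt - 1j * (HBAR / (2 * MASS)) * d2psi_dx2


k = 1.5
good_omega = HBAR * k**2 / (2 * MASS)   # ω = (ħ/2m)·k², i.e. E = m·v²/2
bad_omega = HBAR * k**2 / MASS          # what E = m·v² would require

print("residual with E = m*v^2/2:", abs(schrodinger_residual(0.3, 0.7, k, good_omega)))
print("residual with E = m*v^2:  ", abs(schrodinger_residual(0.3, 0.7, k, bad_omega)))
```

The first residual is zero up to numerical error, while the second one is of order one: the plane wave only solves the equation with the factor 1/2 in place.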

In short: the E = m·v²/2 is the correct formula. It must be, because… Well… Because Schrödinger’s equation is a formula we surely shouldn’t doubt, right? So the only logical conclusion is that we must be doing something wrong when multiplying the two de Broglie equations. To be precise: our v = f·λ equation must be wrong. Why? Well… It’s just something one shouldn’t apply to our complex-valued wavefunction. The ‘correct’ velocity formula for the complex-valued wavefunction should have that 1/2 factor, so we’d write 2·f·λ = v to make things come out alright. But where would this formula come from? The period of cosθ + i·sinθ is the period of the sine and cosine function: cos(θ+2π) + i·sin(θ+2π) = cosθ + i·sinθ, so T = 2π and f = 1/T = 1/2π do not change.

But so that’s a mathematical point of view. From a physical point of view, it’s clear we got two oscillations for the price of one: one ‘real’ and one ‘imaginary’, but both are equally essential and, hence, equally ‘real’. So the answer must lie in the distinction between the group and the phase velocity when we’re combining waves. Indeed, the group velocity of a sum of waves is equal to v_g = dω/dk. In this case, we have:

v_g = d[E/ħ]/d[p/ħ] = dE/dp

We can now use the kinetic energy formula to write E as E = m·v²/2 = p·v/2. Now, v and p are related through m (p = m·v, so v = p/m). So we should write this as E = m·v²/2 = p²/(2m). Substituting E and p = m·v in the equation above then gives us the following:

dω/dk = d[p²/(2m)]/dp = 2p/(2m) = p/m = v_g = v

However, for the phase velocity, we can just use the v_p = ω/k formula, which gives us that 1/2 factor:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = (m·v²/2)/(m·v) = v/2
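The two velocities can be compared numerically as well. The Python sketch below assumes the non-relativistic dispersion relation ω = ħ·k²/(2m) found above, with ħ = 1 and arbitrary values m = 2 and v = 0.3; the function names are made up for the example. The group velocity is estimated with a simple central difference.

```python
HBAR = 1.0  # natural units (assumption for this sketch)


def omega(k, m):
    """Dispersion relation ω(k) = ħ·k²/(2m), i.e. E = p²/(2m)."""
    return HBAR * k**2 / (2 * m)


def group_velocity(k, m, dk=1e-6):
    """v_g = dω/dk, estimated with a central difference."""
    return (omega(k + dk, m) - omega(k - dk, m)) / (2 * dk)


def phase_velocity(k, m):
    """v_p = ω/k."""
    return omega(k, m) / k


m, v = 2.0, 0.3          # arbitrary illustrative values
k = m * v / HBAR         # k = p/ħ with p = m·v

print("classical velocity:", v)
print("group velocity:    ", group_velocity(k, m))
print("phase velocity:    ", phase_velocity(k, m))
```

The group velocity reproduces the classical velocity v, while the phase velocity is v/2, exactly as in the two formulas above.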

Bingo! Riddle solved! 🙂 Isn’t it nice that our formula for the group velocity also applies to our complex-valued wavefunction? I think that’s amazing, really! But I’ll let you think about it. 🙂

The Pauli spin matrices as operators

You must be despairing by now. More theory? Haven’t we had enough? Relax. We’re almost there. The next post is going to generalize our results for n-state systems. However, before we do that, we need one more building block, and that’s this one. So… Well… Let’s go for it. It’s a bit long but, hopefully, interesting enough—so you don’t fall asleep before the end. 🙂 Let’s first review the concept of an operator itself.

The concept of an operator

You’ll remember Feynman‘s ‘Great Law of Quantum Mechanics’:

| = ∑ | i 〉〈 i | over all base states i.

We also talked of all kinds of apparatuses: a Stern-Gerlach spin filter, a state selector for a maser, a resonant cavity or – quite simply – just time passing by. From a quantum-mechanical point of view, we think of this as particles going into the apparatus in some state φ, and coming out of it in some other state χ. We wrote the amplitude for that as 〈 χ | A | φ 〉. [Remember the right-to-left reading, like Arabic or Hebrew script.] Then we applied our ‘Great Law’ to that 〈 χ | A | φ 〉 expression – twice, actually – to get the following expression:

〈 χ | A | φ 〉 = ∑ij 〈 χ | i 〉〈 i | A | j 〉〈 j | φ 〉

We’re just ‘unpacking’ the φ and χ states here, as we can only describe those states in terms of base states, which we denote as i and j here. That’s all. If we’d add another apparatus in series, we’d get:

〈 χ | BA | φ 〉 = ∑ijk 〈 χ | i 〉〈 i | B | j 〉〈 j | A | k 〉〈 k | φ 〉

We just put the | bar between B and A and apply the same trick. The | bar is really like a factor 1 in multiplication, in the sense that we can insert it anywhere: a×b = a×1×b = 1×a×b = a×b×1 = 1×a×1×b×1 = 1×a×b×1 etc. Anywhere? Hmm… It’s not quite the same, but I’ll let you check out the differences. 🙂 The point is that, from a mathematical point of view, we can fully describe the apparatus A, or the combined apparatus BA, in terms of those 〈 i | A | j 〉 or 〈 i | BA | j 〉 amplitudes. Depending on the number of base states, we’d have a three-by-three, or a two-by-two, or, more generally, an n-by-n matrix, i.e. a square matrix of order n. For example, there are 3×3 = 9 amplitudes if we have three possible states and, equally obviously, 2×2 = 4 amplitudes for the example involving spin-1/2 particles. [If you think things are way too complicated,… Well… At least we’ve got square matrices here, not n-by-m matrices.] We simply called such a matrix the matrix of amplitudes, and we usually denoted it by A. However, sometimes we’d also denote it by Aij, or by [Aij], depending on our mood. 🙂 The preferred notation was A, however, so as to avoid confusion with the matrix elements, which we’d write as Aij.
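In matrix terms, the | bar is just the identity matrix, and putting two apparatuses in series is just matrix multiplication. Here is a minimal sketch of that, assuming three base states represented by the standard unit vectors and two made-up (random) matrices of amplitudes A and B; none of these numbers come from a real apparatus.

```python
import numpy as np

n = 3
base = np.eye(n, dtype=complex)   # base[i] holds the coordinates of | i >

# The 'Great Law': | = sum over i of | i >< i |, which is the identity matrix.
closure = sum(np.outer(base[i], base[i].conj()) for i in range(n))

# Two hypothetical matrices of amplitudes (random complex numbers).
rng = np.random.default_rng(0)
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# Insert the bar between B and A: < i | BA | j > = sum_k < i | B | k >< k | A | j >.
BA_unpacked = np.array([[sum(B[i, k] * A[k, j] for k in range(n))
                         for j in range(n)]
                        for i in range(n)])

print(np.allclose(closure, np.eye(n)))
print(np.allclose(BA_unpacked, B @ A))
```

Note the order: B @ A means A acts first, which matches the right-to-left reading of the bra-ket notation.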

The Hamiltonian matrix – which, very roughly speaking, is like the quantum-mechanical equivalent of the dp/dt term of Newton’s Law of Motion: F = dp/dt = m·dv/dt = m·a – is a matrix of amplitudes as well, and we’ll come back to it in a minute. Let’s first continue our story on operators here. The idea of an operator comes up when we’re creative again, and when we drop the 〈 χ | state from the 〈 χ | A | φ 〉 expression, so we write:

| ψ 〉 = A | φ 〉

So now we think of the particle entering the ‘apparatus’ A in the state φ and coming out of A in some state ψ (‘psi’). But our psi is a ket, i.e. some initial state. That’s why we write it as | ψ 〉. It doesn’t mean anything until we combine it with some bra, like a base state 〈 i |, or with a final state, which we’d denote by 〈 χ | or some other Greek letter between a 〈 and a | symbol. So then we get 〈 χ | ψ 〉 = 〈 χ | A | φ 〉 or 〈 i | ψ 〉 = 〈 i | A | φ 〉. So then we’re ‘unpacking’ our bar once more. Let me be explicit here: it’s kinda weird, but if you’re going to study quantum math, you’ll need to accept that, when discussing the state of a system or a particle, like ψ or φ, it does make a difference if they’re initial or final states. To be precise, the final 〈 χ | or 〈 φ | states are equal to the conjugate transpose of the initial | χ 〉 or | φ 〉 states, so we write: 〈 χ | = | χ 〉* or 〈 φ | = | φ 〉*. I’ll come back to that, because it’s kind of counter-intuitive: a state should be a state, no? Well… No. Not from a quantum-math point of view at least. 😦 But back to our operator. Feynman defines an operator in the following rather intuitive way:

“The symbol A is neither an amplitude, nor a vector; it is a new kind of thing called an operator. It is something which “operates on” some state | φ 〉 to produce some new state | ψ 〉.”

But… Well… Be careful! What’s a state? As I mentioned, | ψ 〉 is not the same as 〈 ψ |. We’re talking an initial state | ψ 〉 here, not 〈 ψ |. That’s why we need to ‘unpack’ the operator to see what it does: we have to combine it with some final state that we’re interested in, or a base state. Then—and only then—we get a proper amplitude, i.e. some complex number – or some complex function – that we can work with. To be precise, we then get the amplitude to be in that final state, or in that base state. In practical terms, that means our operator, or our apparatus, doesn’t mean very much as long as we don’t measure what comes out—and measuring something implies we have to choose some set of base states, i.e. a representation, which allows us to describe the final state, which we denoted as 〈 χ | above.

Let’s wrap this up by being clear on the notation once again. We’ll write: Aij = 〈 i | A | j 〉, or Uij = 〈 i | U | j 〉, or Hij = 〈 i | H | j 〉. In other words, we’ll really be consistent now with those subscripts: if they are there, we’re talking a coefficient, or a matrix element. If they’re not there, we’re talking the matrix itself, i.e. A, U or H. Now, to give you a sort of feeling for how that works in terms of the matrix equations that we’ll inevitably have to deal with, let me just jot one of them down here:


The Di* numbers are the ‘coordinates’ of the (final) 〈 χ | state in terms of the base states, which we denote as i = +, 0 or − here. So we have three states here. [That’s just to remind you that the two-state systems we’ve seen so far are pretty easy. We’ll soon be working with four-state systems, and then the sky is the limit. :-)] In fact, you’ll remember that those coordinates were the complex conjugate of the ‘coordinates’ of the initial | χ 〉 state, i.e. D+, D0, D−, so that 1-by-3 matrix above, i.e. the row vector 〈 χ | = [D+*  D0*  D−*], is the so-called conjugate transpose of the column vector | χ 〉 = [D+  D0  D−]T. [I can’t do columns with this WordPress editor, so I am just putting the T for transpose so as to make sure you understand | χ 〉 is a column vector.]

Now, you’ll wonder – if you don’t, you should 🙂 – how that Aij = 〈 i | A | j 〉, Uij = 〈 i | U | j 〉, or Hij = 〈 i | H | j 〉 notation works out in terms of matrices. It’s extremely simple really. If we have only two states (yes, back to simplicity), which we’ll also write as + and − (forget about the 0 state), then we can write Aij = 〈 i | A | j 〉 in matrix notation as:


Huh? Is it that simple? Yes. We can make things more complicated by involving a transformation matrix so we can write our base states in terms of another, different, set of base states but, in essence, this is what we are talking about here. Of course, you should absolutely not try to give a geometric interpretation to our [1 0] or [0 1] ‘coordinates’. If you do that, you get in trouble, because then you want to give the transformed base states the same geometric interpretation and… Well… It just doesn’t make sense. I gave an example of that in my post on the hydrogen molecule as a two-state system. Symmetries in quantum physics are not geometric… Well… Not in a physical sense, that is. As I explained in my previous post, describing spin-1/2 particles involves stuff like 720 degree symmetries and all that. So… Well… Just don’t! 🙂
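If you want to see the Aij = 〈 i | A | j 〉 notation at work, here is a small sketch for two base states with the [1 0] and [0 1] ‘coordinates’. The matrix A and the states φ and χ are arbitrary numbers I made up for the illustration, and the `bracket` helper is mine.

```python
import numpy as np

plus = np.array([1, 0], dtype=complex)    # coordinates of base state +
minus = np.array([0, 1], dtype=complex)   # coordinates of base state -
base = [plus, minus]

A = np.array([[1 + 2j, 3], [4j, 5]])      # hypothetical matrix of amplitudes

def bracket(bra, op, ket):
    # < bra | op | ket >; np.vdot takes the complex conjugate of its first argument.
    return np.vdot(bra, op @ ket)

# Sandwiching A between base states just picks out the matrix elements.
elements = np.array([[bracket(i, A, j) for j in base] for i in base])
print(np.allclose(elements, A))

# 'Unpacking' < chi | A | phi > as a double sum over base states.
phi = np.array([0.6, 0.8j])
chi = np.array([1, 1j]) / np.sqrt(2)
direct = bracket(chi, A, phi)
unpacked = sum(np.vdot(chi, i) * bracket(i, A, j) * np.vdot(j, phi)
               for i in base for j in base)
print(np.isclose(direct, unpacked))
```

The conjugation inside `np.vdot` is what turns the ‘coordinates’ of an initial state into those of a final state.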


The Hamiltonian as a matrix and as an operator

As mentioned above, our Hamiltonian is a matrix of amplitudes as well, and we can also write it as H, Hij, or [Hij] respectively, depending on our mood. 🙂 For some reason, Feynman often writes it as Hij, instead of H, which creates a lot of confusion because, in most contexts, Hij refers to the matrix elements, rather than the matrix itself. I guess Feynman likes to keep the subscripts, i.e ij or I,II, as they refer to the representation that was chosen. However, Hij should really refer to the matrix element, and then we can use H for the matrix itself. So let’s be consistent. As I’ve shown above, the Hij notation – and so I am talking the Hamiltonian coefficients here – is actually a shorthand for writing:

Hij = 〈 i | H | j 〉

So the Hamiltonian coefficient (Hij) connects two base states (i and j) through the Hamiltonian matrix (H). Connect? How? Our language in the previous posts, and some of Feynman’s language, may have suggested the Hamiltonian coefficients are amplitudes to go from state j to state i. However, that’s not the case. Or… Well… We need to qualify that statement. What does it mean? The i and j states are base states and, hence, 〈 i | j 〉 = δij, with δij = 1 if i = j and δij = 0 if i ≠ j. Hence, stating that the Hamiltonian coefficients are the amplitudes to go from one state to another is… Well… Let’s say that language is rather inaccurate. We need to include the element of time, so we need to think in terms of those amplitudes C1 and C2, or CI and CII, which are functions in time: Ci = Ci(t). Now, the Hamiltonian coefficients are obviously related to those amplitudes. Sure! That’s quite obvious from the fact they appear in those differential equations for C1 and C2, or CI and CII, i.e. the amplitude to be in state 1 or state 2, or state I or state II, respectively. But they’re not the same.

Let’s go back to the basics here. When we derived the Hamiltonian matrix as we presented Feynman’s brilliant differential analysis of it, we wrote the amplitude to go from one base state to another, as a function in time (or a function of time, I should say), as:

Uij = Uij(t + Δt, t) = 〈 i | U | j 〉 = 〈 i | U(t + Δt, t) | j 〉

Our ‘unpacking’ rules then allowed us to write something like this for t = t1 and t + Δt = t2 or – let me quickly circle back to that monster matrix notation above – for Δt = t2 − t1:


The key – as presented by Feynman – to go from those Uij amplitudes to the Hij amplitudes is to consider the following: if Δt goes to zero, nothing happens, so we wrote: Uij = 〈 i | U | j 〉 → 〈 i | j 〉 = δij for Δt → 0. We also assumed that, for small Δt, those Uij amplitudes should differ from δij (i.e. from 1 or 0) by amounts that are proportional to Δt. So we wrote:

Uij(t + Δt, t) = δij + ΔUij(t + Δt, t) = δij + Kij(t)·Δt ⇔ Uij(t + Δt, t) = δij − (i/ħ)·Hij(t)·Δt

There are several things here. First, note the first-order linear approximation: it’s just like the general y(t + Δt) = y(t) + Δy = y(t) + (dy/dt)·Δt formula. So can we look at our Kij(t) function as being the time derivative of the Uij(t + Δt, t) function? The answer is, unambiguously, yes. Hence, −(i/ħ)·Hij(t) is the same time derivative. [Why? Because Kij(t) = −(i/ħ)·Hij(t).] Now, the time derivative of a function, i.e. dy/dt, is equal to Δy/Δt for Δt → 0 and, of course, we know that Δy → 0 for Δt → 0. We are now in a position to understand Feynman’s interpretation of the Hamiltonian coefficients:

The −(i/ħ)·Hij(t) = −(i/ħ)·〈 i | H | j 〉 factor is the amplitude that—under the physical conditions described by H—a state j will, during the time dt, “generate” the state i.
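To get a feel for the Uij = δij − (i/ħ)·Hij·Δt approximation, we can compare it with an ‘exact’ U for a small example. The sketch below makes assumptions that are not in the text above: the Hamiltonian is constant in time, so that U(Δt) = exp(−i·H·Δt/ħ), we work in units in which ħ = 1, and the numbers in H are made up.

```python
import numpy as np

hbar = 1.0                                   # natural units (an assumption)
H = np.array([[1.0, -0.5], [-0.5, 1.0]])     # hypothetical constant Hamiltonian

def U_exact(dt):
    # exp(-i*H*dt/hbar), built from the eigen-decomposition of the Hermitian H.
    E, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * E * dt / hbar)) @ V.conj().T

def U_linear(dt):
    # First-order approximation: delta_ij - (i/hbar) * H_ij * dt.
    return np.eye(2) - (1j / hbar) * H * dt

# Largest deviation between the two, for ever smaller time steps.
errors = {dt: np.abs(U_exact(dt) - U_linear(dt)).max()
          for dt in (1e-1, 1e-2, 1e-3)}
for dt, err in errors.items():
    print(dt, err)
```

The deviation shrinks roughly like Δt², which is what one expects from a first-order approximation.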

I know I shouldn’t make this post too long (I promised to write about the Pauli spin matrices, and I am not even halfway there) but I should note a funny thing there: in that Uij(t + Δt, t) = δij + ΔUij(t + Δt, t) = δij + Kij(t)·Δt = δij − (i/ħ)·Hij(t)·Δt formula, for Δt → 0, we go from real to complex numbers. I shouldn’t anticipate anything but… Well… We know that the Hij coefficients will (usually) represent some energy level, so they are real numbers. Therefore, −(i/ħ)·Hij(t) = Kij(t) is complex-valued, as we’d expect, because Uij(t + Δt, t) is, in general, complex-valued, and δij is just 0 or 1. I don’t have too much time to linger on this, but it should remind you of how one may mathematically ‘construct’ the complex exponential e^(i·s) by using the linear approximation e^(i·ε) ≈ 1 + i·ε near s = 0 or, what amounts to the same, for small ε. My post on this shows how Feynman takes the magic out of Euler’s formula doing that – and I should re-visit it, because I feel the formula above, and that linear approximation formula for a complex exponential, go to the heart of the ‘mystery’, really. But… Well… No time. I have to move on.
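That ‘construction’ of the complex exponential is easy to try out. The sketch below is my own illustration, not Feynman’s procedure: it simply chops s into n small steps ε = s/n and multiplies the linear factor (1 + i·ε) by itself n times.

```python
import cmath

def exp_i_from_small_steps(s, n):
    # Apply the linear approximation 1 + i*eps a total of n times, with eps = s/n.
    eps = s / n
    return (1 + 1j * eps) ** n

s = 1.0
for n in (10, 1000, 100000):
    approx = exp_i_from_small_steps(s, n)
    # Distance to the 'true' value cos(s) + i*sin(s).
    print(n, abs(approx - cmath.exp(1j * s)))
```

The distance to cos(s) + i·sin(s) goes down as n goes up, so the repeated linear steps do converge to Euler’s formula.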

Let me quickly make another small technical remark here. When Feynman talks about base states, he always writes them as a bra or a ket, just like any other state. So he talks about “base state | i 〉”, or “base state 〈 i |”. If you look it up, you’ll see he does the same in that quote: he writes | j 〉 and | i 〉, rather than j and i. In fact, strictly speaking, he should write 〈 i | instead of | i 〉. Frankly, I really prefer to just write “base state i”, or “base state j”, without specifying if it’s a bra or a ket. A base state is a base state: 〈 i | and | i 〉 represent the same. Of course, it’s rather obvious that 〈 χ | and | χ 〉 are not the same. In fact, as I showed above, they’re each other’s complex conjugate, so 〈 χ |* = | χ 〉. To be precise, I should say: they’re each other’s conjugate transpose, because we’re talking row and column vectors respectively. Likewise, we can write: 〈 χ | φ 〉* = 〈 φ | χ 〉. For base states, this becomes 〈 i | j 〉* = 〈 j | i 〉. Now, 〈 i | and | j 〉 were matrices, really – row and column vectors, to be precise – so we can apply the following rule: the conjugate transpose of the product of two matrices is the product of the conjugate transpose of the same matrices, but with the order of the matrices reversed. So we have: (AB)* = B*A*. In this case: 〈 i | j 〉* = | j 〉*〈 i |*. Huh? Yes. Think about it. I should probably use the dagger notation for the conjugate transpose, rather than the simple * notation, but… Well… It works. The bottom line is: 〈 i | j 〉* = 〈 j | i 〉 = | j 〉*〈 i |* and, therefore, 〈 j | = | j 〉* and | i 〉 = 〈 i |*. Conversely, 〈 j | i 〉* = 〈 i | j 〉 = | i 〉*〈 j |* and, therefore, we also have 〈 j |* = | j 〉 and | i 〉* = 〈 i |. Now, we know the coefficients of these row and column vectors are either one or zero. In short, 〈 i | and | i 〉, or 〈 j | and | j 〉 are really one and the same ‘object’. The only reason why we would use the bra-ket notation is to indicate whether we’re using them in an initial condition, or in a final state. In the specific case that we’re dealing with here, it’s obvious that j is used in an initial condition, and i is a final condition.
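These conjugate-transpose rules are easy to verify with a few made-up vectors and matrices. In the sketch below, the `dagger` helper and all the numbers are mine, and kets are represented as column vectors.

```python
import numpy as np

def dagger(m):
    # Conjugate transpose: conjugate every entry, then swap rows and columns.
    return m.conj().T

chi = np.array([[0.6], [0.8j]])              # | chi > as a column vector
phi = np.array([[1j], [1.0]]) / np.sqrt(2)   # | phi > as a column vector

amp_chi_phi = (dagger(chi) @ phi).item()     # < chi | phi >
amp_phi_chi = (dagger(phi) @ chi).item()     # < phi | chi >
print(np.isclose(np.conj(amp_chi_phi), amp_phi_chi))

# (AB)* = B*A* for two arbitrary complex matrices.
rng = np.random.default_rng(1)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
B = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
print(np.allclose(dagger(A @ B), dagger(B) @ dagger(A)))

# For a base state the coefficients are real (0 or 1), so taking the
# conjugate transpose only turns the column into a row.
ket_i = np.array([[1.0], [0.0]])
bra_i = dagger(ket_i)
print(np.array_equal(bra_i, ket_i.T))
```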

We’re now ready to look at these differential equations once more, and try to truly understand them:

iħ·dCi(t)/dt = ∑j Hij(t)·Cj(t)

The summation over all base states j amounts to adding the contribution, so to speak, of all those base states j, during the infinitesimally small time interval dt, to the change in the amplitude (during the same infinitesimal time interval, of course) to be in state i. Does that make sense?

You’ll say: yes. Or maybe. Or maybe not. 🙂 And I know you’re impatient. We were supposed to talk about the Hamiltonian operator here. So what about that? Why this long story on the Hamiltonian coefficients? Well… Let’s take the next step. An operator is all about ‘abstracting away’, or ‘dropping terms’, as Feynman calls it, which is more down to earth. 🙂 So let’s do that in two successive rounds, as shown below. First we drop the 〈 i |, because the equation holds for any i. Then we apply the grand | = ∑ | i 〉〈 i | rule, which is somewhat tricky, as it also gets rid of the summation. We then define the Hamiltonian operator as H, but we just put a little hat on top of it. That’s all.


As this is all rather confusing, let me show what it means in terms of matrix algebra:


So… Frankly, it’s not all that difficult. It’s basically introducing a summary notation, which is what operators usually do. Note that the H = iħ·d/dt operator (sorry if I am not always putting the hat) is not just the d/dt with an extra multiplication by ħ and by the imaginary unit i. From a mathematical point of view, of course, that’s what it seems to be, and actually is. From a mathematical point of view, it’s just an n-by-n matrix, and so we can effectively apply it to some n-by-1 column vector to get another n-by-1 column vector.

But its meaning is much deeper: as Feynman puts it, the equation(s) above are the dynamical law of Nature, i.e. the law of motion for a quantum system. In a way, it’s like that invariant (1−v²)^(−1/2)·d/dt operator that we introduced when discussing relativity, and things like the proper time and invariance under Lorentz transformation. That operator really did something. It ‘fixed’ things as we applied it to the four-vectors in relativistic spacetime. So… Well… Think about it.

Before I move on – because, when everything is said and done, I promised to use the Pauli matrices as operators – I’ll just copy Feynman as he approaches the equations from another angle:


Of course, that’s the equation we started out with, before we started ‘abstracting away’:


So… Well… You can go through the motions once more. Onward!

The Pauli spin matrices as operators

If the Hamiltonian matrix can be used as an operator, then we can use the Pauli spin matrices as little operators too! Indeed, from my previous post, you’ll remember we can write the Hamiltonian in terms of the Pauli spin matrices:


Now, if we think of the Hamiltonian matrix as an operator, we can put a little hat everywhere, so we get:


It’s really as simple as that. Now, we get a little bit in trouble with the x, y and z subscripts as we’re going to want to write the matrix elements as σij, so we’ll just move them and write them as superscripts, so our matrix elements will be written as σxij = 〈 i | σx | j 〉, σyij = 〈 i | σy | j 〉 and σzij = 〈 i | σz | j 〉 respectively. Now, we introduced all kinds of properties of the Pauli matrices themselves, but let’s now look at the properties of these matrices as operators. To do that, we’ll let them loose on the base states. We get the following:


[You can check this in Feynman, but it’s really very straightforward, so you should try to get this result yourself.] The next thing is to create even more operators by multiplying the operators two by two. We get stuff like:

σxσy|+〉 = σx(σy|+〉) = σx(i|−〉) = i·(σx|−〉) = i·|+〉

The thing to note here is that it’s business as usual: we can move factors like i out of the operators, as the operators work on the state vectors only. Oh… And sorry I am not putting the hat again. It’s the limitations of the WordPress editor here (I always need to ‘import’ my formulas from Word or some other editor, so I can’t put them in the text itself). On the other hand, Feynman himself seems to doubt the use of the hat symbol, as he writes: “It is best, when working with these things, not to keep track of whether a quantity like σ or H is an operator or a matrix. All the (matrix) equations are the same anyway.”
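Since the matrix equations are the same anyway, we can let the sigma operators loose on the base states with a few lines of code. The sketch below uses the standard Pauli matrices and the [1 0] and [0 1] ‘coordinates’ for the + and − base states; the variable names are mine.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

plus = np.array([1, 0], dtype=complex)    # | + >
minus = np.array([0, 1], dtype=complex)   # | - >

# The sigma operators acting on the base states.
print(np.allclose(sz @ plus, plus), np.allclose(sz @ minus, -minus))
print(np.allclose(sx @ plus, minus), np.allclose(sx @ minus, plus))
print(np.allclose(sy @ plus, 1j * minus), np.allclose(sy @ minus, -1j * plus))

# The product worked out above: sigma_x sigma_y | + > = i | + >.
product_on_plus = sx @ (sy @ plus)
print(np.allclose(product_on_plus, 1j * plus))
```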

That makes it all rather tedious or, in fact, no! That makes it all quite easy, because our table with the properties of the sigma matrices is also valid for the sigma operators, so let’s just copy it, and then we’re done, so we can wrap up and do something else. 🙂


To conclude, let me answer your most pressing question at this very moment: what’s the use of this? Well… To a large extent, it’s a nice way of writing things. For example, let’s look at our equations for the ammonia molecule once more. But… Well… No. I’ll refer you to Feynman here, as he re-visits all the systems we’ve studied before, but now approaches them with our new operators and notations. Have fun with it! 🙂

Pauli’s spin matrices

Wolfgang Pauli’s life is as wonderful as his scientific legacy—but we’ll just talk about one of his many contributions to quantum mechanics here in this post—not about his life.

This post should be fairly straightforward. We just want to review some of the math. Indeed, we got the ‘Grand Result’ already in our previous post, as we found the Hamiltonian coefficients for a spin one-half particle (read: all matter-particles, practically speaking) in a magnetic field. But then we can just replace the magnetic dipole moment by an electric dipole moment, if needed, and we’ll find the same formulas, so we’ve basically covered everything you can possibly think of.

[…] Well… Sort of… 🙂

OK. Jokes aside, we have a magnetic field B, which we describe in terms of its components: B = (Bx, By, Bz), and we’ve defined two mutually exclusive states – call them ‘up’ or ‘down’, or 1 or 2, or + or −, whatever − along some direction, which we call the z-direction. Why? Convention. Historical accident. The z-direction is the direction in regard to which we measure stuff. What stuff? Well… Stuff like the spin of an electron: quantum-mechanical stuff. 🙂 In any case, the Hamiltonian that comes with this system is:


Now, because this matrix doesn’t look impressive enough, we’re going to re-write it as:


Huh? Yes. It looks good, doesn’t it? And the σx, σy and σz matrices are given below, so you can check it’s actually true. […] I mean: you can check that the two notations are equivalent, from a math point of view, that is. 🙂


As Feynman puts it: “This is what the professionals use all of the time.” So… Well… Yes. We had better learn them by heart. 🙂
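Checking that the two notations are equivalent takes only a few lines. Note the assumption in this sketch: the explicit matrix is the one I recall from Feynman’s Lectures for a spin one-half particle in a magnetic field, with H11 = −μ·Bz, H12 = −μ·(Bx − i·By), H21 = −μ·(Bx + i·By) and H22 = +μ·Bz. The function names and the field values are made up.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def hamiltonian_explicit(mu, Bx, By, Bz):
    # The coefficients written out one by one (assumed form, see above).
    return -mu * np.array([[Bz, Bx - 1j * By],
                           [Bx + 1j * By, -Bz]])

def hamiltonian_pauli(mu, Bx, By, Bz):
    # The same matrix written with the Pauli spin matrices.
    return -mu * (sx * Bx + sy * By + sz * Bz)

mu, Bx, By, Bz = 2.0, 0.3, -0.4, 1.2      # arbitrary illustrative values
print(np.allclose(hamiltonian_explicit(mu, Bx, By, Bz),
                  hamiltonian_pauli(mu, Bx, By, Bz)))
```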

The identity matrix is actually not one of the so-called Pauli spin matrices, but we need it when we’d decide to not equate the average energy of our system to zero, i.e. when we’d decide to shift the zero point of our energy scale so as to include the equivalent energy of the rest mass. In that case, we re-write the Hamiltonian as:


In fact, as most academics want to hide their knowledge from us by confusing us deliberately, they’ll often omit the Kronecker delta, and simply write:


It’s OK, as long as you know what it is that you’re trying to do. 🙂 The point is, we’ve got four ‘elementary’ matrices now which allow us to write any matrix – literally, any matrix – as a linear combination of them. In Feynman‘s words:


Now, the Pauli matrices have lots of interesting properties. Their products, for example, taken two at a time, are rather special:
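Both claims, that any two-by-two matrix is a linear combination of the four ‘elementary’ matrices and that the products of the sigma matrices are special, can be checked numerically. In this sketch, the trace formula for the coefficients is a standard linear-algebra trick rather than something taken from Feynman, and the test matrix M is made up.

```python
import numpy as np

one = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
basis = (one, sx, sy, sz)

def pauli_coefficients(M):
    # Coefficient of each basis matrix s in M, using trace(s @ M) / 2.
    return [np.trace(s @ M) / 2 for s in basis]

M = np.array([[1 + 2j, 3], [4j, 5]])      # an arbitrary two-by-two matrix
coeffs = pauli_coefficients(M)
rebuilt = sum(c * s for c, s in zip(coeffs, basis))
print(np.allclose(rebuilt, M))

# Products taken two at a time.
print(all(np.allclose(s @ s, one) for s in (sx, sy, sz)))
print(np.allclose(sx @ sy, 1j * sz), np.allclose(sy @ sx, -1j * sz))
print(np.allclose(sy @ sz, 1j * sx), np.allclose(sz @ sx, 1j * sy))
```

So each sigma matrix squares to the unit matrix, and the product of two different ones is ±i times the third.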


The most interesting property, however, is that, when choosing some other representation, i.e. when changing to another coordinate system, the three Pauli matrices behave like the components of a vector. That vector is written as σ, and so it’s a matrix you can use in different coordinate systems, as though it’s a vector. It allows us to re-write the Hamiltonian we started out with in a particularly nice way:

H = −μσ·B

You should compare this to the classical formula for the energy of a little magnet with the magnetic moment μ in the same magnetic field:


There are several differences, of course. First, note that the quantum-mechanical magnetic moment is like the quantum-mechanical angular momentum: there’s only a limited set of discrete values, given by the following relation:


That’s why we write it as a scalar in the quantum-mechanical equation, and as a vector, i.e. in boldface (μ), in the second equation. The two equations differ more fundamentally, however: the first one is a matrix equation, while the second one is… Well… Just a simple vector dot product.
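One way to see the difference between the matrix equation and the dot product is to compare the energies they allow. The sketch below assumes H = −μ·(σx·Bx + σy·By + σz·Bz) for the quantum-mechanical case and uses made-up values for μ and B: whatever the direction of B, the matrix has only two energy levels, −μ·|B| and +μ·|B|, while the classical −μ·B can take any value in between.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

mu = 2.0                                   # illustrative magnetic moment
B = np.array([0.3, -0.4, 1.2])             # illustrative field, |B| = 1.3

H = -mu * (sx * B[0] + sy * B[1] + sz * B[2])
levels = np.linalg.eigvalsh(H)             # eigenvalues in ascending order
print(levels)

# Classical energy of a little magnet at angle theta to the field:
# -mu*|B|*cos(theta), which sweeps continuously between the two levels.
thetas = np.linspace(0.0, np.pi, 7)
classical = -mu * np.linalg.norm(B) * np.cos(thetas)
print(classical)
```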

The point is: the classical energy becomes the Hamiltonian matrix, and the classical μ vector becomes the μσ matrix. As Feynman puts it: “It is sometimes said that to each quantity in classical physics there corresponds a matrix in quantum mechanics, but it is really more correct to say that the Hamiltonian matrix corresponds to the energy, and any quantity that can be defined via energy has a corresponding matrix.”


What does he mean by a quantity that can be defined via energy? It’s simple: the magnetic moment, for example, can be defined via energy by saying that the energy, in an external field B, is equal to −μ·B.

Huh? Wasn’t it the other way around? Didn’t we define the energy by saying it’s equal to −μ·B?

We did. In our posts on electromagnetism. That was classical theory. However, in quantum mechanics, it’s the energy that’s the ‘currency’ we need to be dealing in. So it makes sense to look at things the other way around: we’ll first think about the energy, and then we try to find a matrix that corresponds to it.

So… Yes. Many classical quantities have their quantum-mechanical counterparts, and those quantum-mechanical counterparts are often some matrices. But not all of them. Sometimes there’s just no comparison, because the two worlds are actually different. Let me quote Feynman on what he thinks of how these two worlds relate, as he wraps up his discussion of the two equations above:

Well… That says it all, doesn’t it? 🙂 We’ll talk more tomorrow. 🙂

The Hamiltonian of matter in a field

In this and the next post, I want to present some essential discussions in Feynman’s 10th, 11th and 12th Lectures on Quantum Mechanics. This post in particular will actually present the Hamiltonian for the spin state of an electron, but the discussion is much more general than that: it’s a model for any spin-1/2 particle, i.e. for the ‘matter-particles’ which you know: electrons, protons and neutrons. Or, taking into account that protons and neutrons consist of quarks, we should say electrons and quarks, which also have spin 1/2. So let’s go for it. Let me first, by way of introduction, remind you of a few things.

What is it that we are trying to do?

That’s always a good question to start with. 🙂 Just for fun, and as we’ll be talking a lot about symmetries and directions in space, I’ve inserted an animation below of a four-dimensional object, as its author calls it. This ‘object’ returns to its original configuration after a rotation of 720 degrees only (after 360 degrees, the spiral flips between clockwise and counterclockwise orientations, so it’s not the same). For some rather obscure reason 🙂 he refers to it as a spin-1/2 particle, or a spinor.


Are spin one-half particles, like an electron or a proton, really four-dimensional? Well… I guess so. All depends, of course, on your definition or concept of a dimension. 🙂 Indeed, the term is as well – I should say, as badly, really – defined as the ubiquitous term ‘vector’ and so… Well… Let me say that spinors are usually defined in four-dimensional vector spaces, indeed. […] So is this what it’s all about, and should we talk about spinors?

Not really. Feynman doesn’t push the math that far, so I won’t do that either. 🙂 In fact, I am not sure why he’s holding back here: spinors are just mathematical objects, like vectors or tensors, which we introduced in one of our posts on electromagnetism, so why not have a go at it? You’ll remember that our electromagnetic tensor was like a special vector cross-product which, using the four-potential vector Aμ and the ∇μ = (∂/∂t, −∂/∂x, −∂/∂y, −∂/∂z) operator, we could write as (∇μAν) − (∇μAν)T.

Huh? Hey! Relax! It’s a matrix equation. It looks like this:


In fact, I left c out above, and so we should plug it in, remembering that B’s magnitude is 1/c times E’s magnitude. So the electromagnetic tensor – in one of its many forms at least – is the following matrix:

[Image: the electromagnetic tensor in matrix form]

Why do we need a beast like this? Well… Have a look at the mentioned post or, better, one of the subsequent posts: we used it in very powerful equations (read: very concise equations, because that’s what mathematicians, and physicists, like) describing the dynamics of a system. So we have something similar here: we’re trying to describe the dynamics of a quantum-mechanical system in terms of the evolution of its state, which we express as a linear combination of ‘pure’ base states, which we wrote as:

|ψ〉 = |1〉C1 + |2〉C2 = |1〉〈1|ψ〉 + |2〉〈2|ψ〉

C1 and C2 are complex-valued wavefunctions, or amplitudes as we call them, and the dynamics of the system are captured in a set of differential equations, which we wrote as:


The trick was to know or guess our Hamiltonian, i.e. we had to know or, more likely, guess those Hij coefficients (and then find experiments to confirm our guesses). Once we got those, it was a piece of cake. We’d solve for C1 and C2, and then take their absolute square so as to get probability functions, like the ones we found for our ammonia (NH3) molecule: P1(t) = |C1(t)|² = cos²[(A/ħ)·t] and P2(t) = |C2(t)|² = sin²[(A/ħ)·t]. They say that, if we would take a measurement, then the probability of finding the molecule in the ‘up’ or ‘down’ state (i.e. state 1 versus state 2) varies as shown:
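Those probability functions can be reproduced with a short calculation. The sketch below assumes the ammonia Hamiltonian we used before, with H11 = H22 = E0 and H12 = H21 = −A, assumes the molecule starts out in state 1, and uses made-up values for E0 and A in units in which ħ = 1.

```python
import numpy as np

hbar = 1.0                  # natural units (an assumption)
E0, A = 1.0, 0.5            # illustrative energies
H = np.array([[E0, -A], [-A, E0]])

def amplitudes(t):
    # Solve i*hbar*dC/dt = H C for a constant H, starting from C1 = 1, C2 = 0.
    E, V = np.linalg.eigh(H)
    c0 = np.array([1.0, 0.0])
    return V @ (np.exp(-1j * E * t / hbar) * (V.conj().T @ c0))

for t in np.linspace(0.0, np.pi, 5):
    C1, C2 = amplitudes(t)
    P1, P2 = abs(C1) ** 2, abs(C2) ** 2
    # Compare with cos^2[(A/hbar)*t] and sin^2[(A/hbar)*t].
    print(round(t, 3), P1, np.cos(A * t / hbar) ** 2, P2, np.sin(A * t / hbar) ** 2)
```

The two probabilities always add up to one, and the molecule sloshes back and forth between the two states at a rate set by A.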


So here we are going to generalize the analysis: rather than guessing, or assuming we know them (from experiment, for example, or because someone else told us so), we’re going to calculate what those Hamiltonian coefficients are in general.

Now, returning to those spinors, it’s rather daunting to think that such a simple thing as being in the ‘up’ or ‘down’ condition has to be represented by some mathematical object that’s at least as complicated as these tensors. But… Well… I am afraid that’s the way it is. Having said that, Feynman himself seems to consider that’s math for graduate students in physics, rather than the undergraduate public for which he wrote the course. Hence, while he presented all of the math in the Lecture Volume on electromagnetism, he keeps things as simple as possible in the Volume on quantum mechanics. So… No. We will not be talking about spinors here.

The only reason why I started out with that wonderful animation is to remind you of the weirdness of quantum mechanics as evidenced by, for example, the fact I almost immediately got into trouble when trying to associate base states with two-dimensional geometric vectors when writing my post on the hydrogen molecule, or when thinking about the magnitude of the quantum-mechanical equivalent of the angular momentum of a particle (see my post on spin and angular momentum).

Thinking of that, it’s probably good to remind ourselves of the latter discussion. If we denote the angular momentum as J, then we know that, in classical mechanics, any of J‘s components Jx, Jy or Jz, could take on any value from +J to −J and, therefore, the maximum value of any component of J – say Jz – would be equal to J. To be precise, J would be the value of the component of J in the direction of J itself. So, in classical mechanics, we’d write: |J| = +√(J·J) = +√J², and it would be the maximum value of any component of J.

However, in quantum mechanics, that’s not the case. If the spin number of J is j, then the maximum value of any component of J is equal to j·ħ. In this case, the spin number is 1/2, and the component along any axis is +1/2 or −1/2 in units of ħ. So, naturally, one would think that J, i.e. the magnitude of J, would be equal to J = |J| = +√(J·J) = +√J² = j·ħ = ħ/2. But that’s not the case: J = |J| ≠ j·ħ = ħ/2. To calculate the magnitude, we need to calculate J² = Jx² + Jy² + Jz². So the idea is to measure these repeatedly and use the expected value for Jx², Jy² and Jz² in the formula. Now, that’s pretty simple: we know that Jx, Jy or Jz are equal to either +ħ/2 or −ħ/2, and, in the absence of a field (i.e. in free space), there’s no preference, so both values are equally likely. To make a long story short, the expected value of Jx², Jy² and Jz² is equal to (1/2)·(ħ/2)² + (1/2)·(−ħ/2)² = ħ²/4, and J² = 3·ħ²/4 = j(j+1)ħ², with j = 1/2. So J = |J| = +√J² = √(3·ħ²/4) = √3·(ħ/2) ≈ 0.866·ħ. Now that’s a huge difference as compared to j·ħ = ħ/2.
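The same number comes out if we do the calculation with matrices. The sketch below assumes the usual representation of the spin-1/2 angular momentum components as ħ/2 times the Pauli matrices, which is a step beyond the averaging argument above, and works in units in which ħ = 1.

```python
import numpy as np

hbar = 1.0                                 # natural units (an assumption)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

Jx, Jy, Jz = (hbar / 2) * sx, (hbar / 2) * sy, (hbar / 2) * sz

# Each component can only be measured as +hbar/2 or -hbar/2.
print(np.linalg.eigvalsh(Jz))

# J^2 = Jx^2 + Jy^2 + Jz^2 is a multiple of the unit matrix.
J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz
j = 0.5
print(np.allclose(J2, j * (j + 1) * hbar**2 * np.eye(2)))

magnitude = np.sqrt(J2[0, 0].real)         # sqrt(3)/2 in units of hbar
print(magnitude, magnitude / (hbar / 2))   # the ratio is sqrt(3)
```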

What we’re saying here is that the magnitude of the angular momentum is √3 ≈ 1.7 times the maximum value of the angular momentum in any direction. How is that possible? Thinking classically, this is nonsensical. However, we need to stop thinking classically here: it means that, when we’re looking at atomic or sub-atomic particles, their angular momentum is never completely in one direction. This implies we need to revise our classical idea of an oriented (electric or magnetic) moment: to put it simply, we find it’s never in one direction only! Alternatively, we might want to re-visit our concept of direction itself, but then we do not want to go there: we continue to say we’re measuring this or that quantity in this or that direction. Of course we do! What’s the alternative? There’s none. You may think we didn’t use the proper definition of the magnitude of a quantity when calculating J as √3·(ħ/2), but… Well… You’ll find yourself alone with that opinion. 🙂

This weird thing really comes with the experimental fact that, if you measure the angular momentum, along any axis, you’ll find it is always an integer or half-integer times ħ. Always! So it comes with the experimental fact that energy levels are discrete: they’re separated by the quantum of energy, which is ħ, and which explains why we have the 1/ħ factor in all coefficients in the coefficient matrix for our set of differential equations. The Hamiltonian coefficients represent energies indeed, and so we’ll want to measure them in units of ħ.

Of course, now you’ll wonder: why the −i? I wish I could give you a simple answer here, like: “The −i factor corresponds to a rotation by −π/2, and that’s the angle we use to go from our ‘up’ and ‘down’ base states to the ‘Uno’ and ‘Duo’ (I and II) base states.” 🙂 Unfortunately, this easy answer isn’t the answer. :-/ I need to refer you to my post on the Hamiltonian: the true answer is that it’s got to do with the i in the e^(−(i/ħ)·(E·t − p∙x)) function: the E, i.e. the energy, is real – most of the time, at least 🙂 – but the wavefunction is what it is: a complex exponential. So… Well…

Frankly, that’s more than enough as an introduction. You may want to think about the imaginary momentum of virtual particles here – i.e. ‘particles’ that are being exchanged as part of a ‘state switch’ –  but then we’d be babbling for hours! So let’s just do what we wanted to do here, and that is to find the Hamiltonian for a spin one-half particle in general, so that’s usually in some field, rather than in free space. 🙂

So here we go. Finally! 🙂

The Hamiltonian of a spin one-half particle in a magnetic field

We’ve actually done some really advanced stuff already. For example, when discussing the ammonia maser, we agreed on the following Hamiltonian in order to make sense of what happens inside of the maser’s resonant cavity:

H11 = E0 + με, H12 = −A, H21 = −A, H22 = E0 − με


State 1 was the state with the ‘upper’ energy E0 + με, as the energy that’s associated with the electric dipole moment of the ammonia molecule was added to the (average) energy of the system (i.e. E0). State 2 was the state with the ‘lower’ energy level E0 − με, implying the electric dipole moment is opposite to that of state 1. The field could be dynamic or static, i.e. varying in time, or not, but it was the same Hamiltonian. Of course, solving the differential equations with non-constant Hamiltonian coefficients was much more difficult, but we did it.

We also have a “flip-flop amplitude” – I am using Feynman’s term for it 🙂 – in that Hamiltonian above. So that’s an amplitude for the system to go from one state to another in the absence of an electric field. For our ammonia molecule, and our hydrogen molecule too, it was associated with the energy that’s needed to tunnel through a potential barrier and, as we explained in our post on virtual particles, that’s usually associated with a negative value for the energy or, what amounts to the same, with a purely imaginary momentum, so that’s why we write minus A in the matrix. However, don’t rack your brain over this as it is a bit of convention, really: putting +A would just result in a phase difference for the amplitudes, but it would give us the same probabilities. If it helps you, you may also like to think of our nitrogen atom (or our electron when we were talking about the hydrogen system) as borrowing some energy from the system so as to be able to tunnel through and, hence, temporarily reducing the energy of the system by an amount that’s equal to A. In any case… We need to move on.

As for these probabilities, we could see – after solving the whole thing, of course (and that was very complicated, indeed) – that they’re going up and down just like in that graph above. The only difference was that we were talking induced transitions here, and so the frequency of the transitions depended on με0, i.e. on the strength of the field, and the magnitude of the dipole moment itself of course, rather than on A. In fact, to be precise, we found that the ratio between the average periods was equal to:

Tinduced/Tspontaneous = [(π·ħ)/(2με0)]/[(π·ħ)/(2A)] = A/με0

But… Well… I need to move on. I just wanted to present the general philosophy behind these things. For a simple electron which, as you know, is either in an ‘up’ or a ‘down’ state – vis-à-vis a certain direction, of course – the Hamiltonian will be very simple. As usual, we’ll assume the direction is the z-direction. Of course, this ‘z-direction’ is just a short-hand for our reference frame: we decide to measure something in this or that direction, and we call that direction the z-direction.

Fine. Next. As our z-direction is currently our reference direction, we assume it’s the direction of some magnetic field, which we’ll write as B. So the components of B in the x- and y-direction are zero: all of the field is in the z-direction, so B = Bz. [Note that the magnetic field is not some quantum-mechanical quantity, and so we can have all of the magnitude in one direction. It’s just a classical thing.]

Fine. Next. The spin or the angular momentum of our electron is, of course, associated with some magnetic dipole moment, which we’ll write as μ. [And, yes, sometimes we use this symbol for an electric dipole moment and, at other times, for a magnetic dipole moment, like here. I can’t help that. You don’t want a zillion different symbols anyway.] Hence, just like we had two energy levels E0 ± με, we’ll now have two energy levels E0 ± μBz. We’ll just shift the energy scale so E0 = 0, so that’s as per our convention. [Feynman glosses over it, but this is a bit of a tricky point, really. Usually, one includes the rest mass, or rest energy, in the E in the argument of the wavefunction, but so here we’re equating m0 c2 with zero. Tough! However, you can think of this re-definition of the zero energy points as a phase shift in all wavefunctions, so it shouldn’t matter when taking the absolute square or looking at interference. Still… Think about it.]

Fine. Next. Well… We’ve got two energy levels, −μBz and +μBz, but no A to put in our Hamiltonian, so the following Hamiltonian may or may not make sense:

H11 = −μBz, H12 = 0, H21 = 0, H22 = +μBz


Hmm… Why is there no flip-flop amplitude? Well… You tell me. Why would we have one? It’s not like the ammonia or hydrogen molecule here, so… Well… Where’s the potential barrier? Of course, you’ll now say that we can imagine it takes some energy to change the spin of an electron, like we were doing with those induced transitions. But… Yes and no. We’ve been selecting particles using our Stern-Gerlach apparatus, or that state selector for our maser, but were we actually flip-flopping things? The changing electric field in our resonant cavity changes the transition frequency but, when everything is said and done, the transition itself has to do with that A. You’ll object again: a pure stationary state? So the electron is either ‘up’ or ‘down’, and it stays like that forever? Really?

Well… I am afraid I have to cut you off, because otherwise we’ll never get to the end. Stop being so critical. 🙂 Well… No. You should be critical. However, you’re right in saying that, when everything is said and done, these are all hypotheses that may or may not make sense. However, Feynman is also right when he says that, ultimately, the proof of the pudding is in the eating: at the end of this long, winding story, we’ll get some solutions that can be tested in experiment: they should give predictions, or probabilities rather, that agree with experiment. As Feynman writes: “[The objective is to find] “equations of motion for the spin states” of an electron in a magnetic field. We guess at them by making some physical argument, but the real test of any Hamiltonian is that it should give predictions in agreement with experiment. According to any tests that have been made, these equations are right. In fact, although we made our arguments only for constant fields, the Hamiltonian we have written is also right for magnetic fields which vary with time.”

So let’s get on with it: let’s assume the Hamiltonian above is the one we should use for a magnetic field in the z-direction, and that we have those pure stationary states with the energies they have, i.e. −μBz and +μBz. One minor technical point, perhaps: you may wonder why we write what we write and do not switch −μBz and +μBz in the Hamiltonian—so as to reflect these ‘upper’ and ‘lower’ energies in those other Hamiltonians. The answer is: it’s just convention. We choose state 1 to be the ‘up’ state, so its spin is ‘up’, but the magnetic moment is opposite to the spin, so the ‘up’ state has the minus sign. Full stop. Onwards!

We’re now going to assume our B field is not in the z-direction. Hence, its Bx and By components are not zero. What we want to see now is what the Hamiltonian looks like. [Yes. Sorry for regularly reminding you of what it is that we are trying to do.] Here you need to be creative. Whatever the direction of the field, we need to be consistent. If that Hamiltonian makes sense, i.e. if we have two pure stationary states with the energies they have when the field is in the z-direction, then it’s rather obvious that, if the field is in some other direction, we should still be able to find two stationary states with exactly the same energy levels. As Feynman puts it: “We could have chosen our z-axis in its direction, and we would have found two stationary states with the energies ±μB. Just choosing our axes in a different direction doesn’t change the physics. Our description of the stationary states will be different, but their energies will still be ±μB.” Right. And because the magnetic field is a classical quantity, the relevant magnitude B is just the square root of the sum of the squares of its components, so we write:

EI = +μ·√(Bx² + By² + Bz²) and EII = −μ·√(Bx² + By² + Bz²)

So we have the energies now, but we want the Hamiltonian coefficients. Here we need to work backwards. The general solution for any system with constant Hamiltonian coefficients always involves two stationary states with energy levels which we denoted as EI and EII, indeed. Let me remind you of the formula for them:

EI = (H11 + H22)/2 + √[(H11 − H22)²/4 + H12·H21]
EII = (H11 + H22)/2 − √[(H11 − H22)²/4 + H12·H21]


[If you want to double-check and see how we get those, it’s probably best to check it in the original text, i.e. Feynman’s Lecture on the Ammonia Maser, Section 2.]

So how do we connect the two sets of equations? How do we get the Hij coefficients out of these square roots and all of that? [Again. I am just reminding you of what it is that we are trying to do.] We’ve got two equations and four coefficients, so… Well… There are some rules we can apply. For example, we know that any Hij coefficient must equal Hji*, i.e. the complex conjugate of Hji. [For i = j, that just tells us that the diagonal coefficients are real.] But… Hey! We can already see that H11 must be equal to minus H22. Just compare the two sets. That comes out as a condition, clearly. Now that simplifies our square roots above significantly. Also noting that the absolute square of a complex number is equal to the product of the number with its complex conjugate, the two equations above imply the following:

H11² + |H12|² = μ²·(Bx² + By² + Bz²)

Let’s see what this means if we apply this to our ‘special’ direction once more, so let’s assume the field is in the z-direction once again. Perhaps we can get some more ‘conditions’ out of that. If the field is in the z-direction itself, we have H11 = −μBz, and the equation above reduces to:

μ²·Bz² + |H12|² = μ²·Bz²

That makes it rather obvious that, in this special case, at least, |H12|² = 0. You’ll say: that’s nothing new, because we had those zeroes in that Hamiltonian already. Well… Yes and no! Here we need to introduce another constraint. I’ll let Feynman explain it: “We are going to make an assumption that there is a kind of superposition principle for the terms of the Hamiltonian. More specifically, we want to assume that if two magnetic fields are superposed, the terms in the Hamiltonian simply add – if we know the Hij for a pure Bz and we know the Hij for a pure Bx, then the Hij for both Bz and Bx together is simply the sum. This is certainly true if we consider only fields in the z-direction – if we double Bz, then all the Hij are doubled. So let’s assume that H is linear in the field B.”

Now, the assumption that H12 must be some linear combination of Bx, By and Bz, combined with the |H12|² = 0 condition when all of the magnitude of the field is in the z-direction, tells us that H12 has no term in Bz. It may have – in fact, it probably should have – terms in Bx and By, but not in Bz. That does take us a step further.

Next assumption. The next assumption is that, regardless of the direction of the field, H11 and H22 don’t change: they remain what they are, so we write: H11 = −μBz and H22 = +μBz. Now, you may think that’s no big deal, because we defined the 1 and 2 states in terms of our z-direction, but… Well… We did so assuming all of the magnitude was in the z-direction.

You’ll say: so what? Now we’ve got some field in the x- and y-directions, so that shouldn’t impact the amplitude to be in a state that’s associated with the z-direction. Well… I should say two things here. First, we’re not talking about the amplitude to be in state 1 or state 2. These amplitudes are those C1 and C2 functions that we can find once we’ve got those Hamiltonian coefficients. Second, you’d surely expect that some field in the x- and y-directions should have some impact on those C1 and C2 functions. Of course!

In any case, I’ll let you do some more thinking about this assumption. Again, we need to move on, so let’s just go along with it. At this point, Feynman‘s had enough of the assumptions, and so he boldly proposes a solution, which incorporates that the H11 = −μBz and H22 = +μBz assumption. Let me quote him:

H11 = −μBz, H22 = +μBz and |H12|² = μ²·(Bx² + By²)

Of course, this leaves us gasping for breath. A simple guess? One can plug it in, of course, and see it makes sense – rather quickly, really. But… Nothing linear is going to come out of that expression for |H12|², right? We’ll have to take a square root to find that H12 = ±μ·(Bx² + By²)^(1/2). Well… No. We’re working in the complex space here, remember? So we can use complex solutions. Feynman notes the same and immediately proposes the right solution:

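H12 = −μ·(Bx − i·By) and H21 = −μ·(Bx + i·By)

To make a long story short, we get what we wanted, i.e. those “equations of motion for the spin states” of an electron in a magnetic field. I’ll let Feynman summarize the results:

iħ·(dC1/dt) = −μ·[Bz·C1 + (Bx − i·By)·C2]
iħ·(dC2/dt) = −μ·[(Bx + i·By)·C1 − Bz·C2]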
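
You don’t have to take the result on faith: a quick numerical check is easy enough. The sketch below assumes the coefficients H11 = −μBz, H22 = +μBz, H12 = −μ·(Bx − i·By) and H21 = −μ·(Bx + i·By), plugs them into the general formula for the energies of a two-state system, and compares the outcome with ±μ·√(Bx² + By² + Bz²). The function names and the field values are hypothetical: I just picked them for illustration.

```python
import cmath
import math

def spin_half_hamiltonian(mu, bx, by, bz):
    """Return (H11, H12, H21, H22) for a field (bx, by, bz); assumed form."""
    h11 = -mu * bz
    h12 = -mu * complex(bx, -by)  # −μ·(Bx − i·By)
    h21 = -mu * complex(bx, by)   # −μ·(Bx + i·By), the complex conjugate of H12
    h22 = +mu * bz
    return h11, h12, h21, h22

def stationary_energies(h11, h12, h21, h22):
    """Energies E_I and E_II of a general two-state system."""
    mean = (h11 + h22) / 2
    root = cmath.sqrt((h11 - h22) ** 2 / 4 + h12 * h21)
    return (mean + root).real, (mean - root).real

# Hypothetical values, in arbitrary units.
mu, bx, by, bz = 1.0, 0.3, -0.4, 1.2
e_one, e_two = stationary_energies(*spin_half_hamiltonian(mu, bx, by, bz))
b_magnitude = math.sqrt(bx ** 2 + by ** 2 + bz ** 2)

print(e_one, e_two, mu * b_magnitude)
```

The two energies should come out as plus and minus μ times the magnitude of the field, whatever direction you choose for it.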

It’s truly a Great Result, especially because, as Feynman notes, (almost) any problem about two-state systems can be solved by making a mathematical analog to the system of the spinning electron. We’ll illustrate that as we move ahead. For now, however, I think we’ve had enough, haven’t we? 🙂

We’ve made a big leap here, and perhaps we should re-visit some of the assumptions and conventions—later, that is. As for now, let’s try to work with it. As mentioned above, Feynman shied away from the grand mathematical approach to it. Indeed, the whole argument might have been somewhat fuzzy, but at least we got a good feel for the solution. In my next post, I’ll abstract away from it, as Feynman does in his next Lecture, where he introduces the so-called Pauli spin matrices, which are like Lego building blocks for all of the matrix algebra which – I must assume you sort of sense that’s coming, no? 🙂 – we’ll need to master so as to understand what’s going on.

So… That’s it for today. I hope you understood “what it is that we’re trying to do”, and that you’ll have some fun working on it on your own now. 🙂

The quantum-mechanical view of chemical binding

In my post on the hydrogen atom, I explained its stability using the following graph out of Feynman’s Lectures. It shows an equilibrium state for the H2 molecule with an energy level that’s about 5 eV (ΔE/EH ≈ −0.375 ⇔ ΔE ≈ −0.375×13.6 eV = −5.1 eV) lower than the energy of two separate hydrogen atoms (2H).

The lower energy level is denoted by EII and refers to a state, which we also denoted as state II, that’s associated with some kind of molecular orbital for both electrons, resulting in more (shared) space where the two electrons can have a low potential energy, as Feynman puts it, so “the electron can spread out – lowering its kinetic energy – without increasing its potential energy.” The electrons have opposite spin. They have to have opposite spin because our formula for state II would violate the Pauli exclusion principle if they did not have opposite spin. Indeed, if the two electrons did not have opposite spin, the formula for our CII amplitude would be violating the rule that, when identical fermions are involved, and we’re adding amplitudes, then we should do so with a negative sign for the exchanged case. So our CII = 〈II|ψ〉 = (1/√2)[〈1|ψ〉 + 〈2|ψ〉] = (1/√2)[〈2|ψ〉 + 〈1|ψ〉] would be problematic: when we switch the electrons, we should get a minus sign.

We do get that minus sign for state I:

〈I|ψ〉 = (1/√2)[〈1|ψ〉 − 〈2|ψ〉] = −(1/√2)[〈2|ψ〉 − 〈1|ψ〉]

To make a long story short, state II is the equilibrium state, and so that’s an H2 molecule with two electrons with opposite spins that share a molecular orbital, rather than moving around in some atomic orbital.

The question is: can we generalize this analysis? I mean… We’ve spent a lot of time so as to make sure we understand this one particular case. What’s the use of such analysis if we can’t generalize? We shouldn’t be doing nitty-gritty all of the time, should we?

You’re right. The thing is: we can easily generalize. We’ve learned to play with those Hamiltonian matrices now, and so let’s do the ‘same-same but different’ with other systems. Let’s replace one of the two protons in the two-protons-one-electron model by a much heavier ion—say, lithium. [The example is not random, of course: lithium is very easily ionized, which is why it’s used in batteries.]

We need to think of the Hamiltonian again, right? We’re now in a situation in which the Hamiltonian coefficients H11 and H22 are likely to be different. We’ve lost the symmetry: if the electron is near the lithium ion, then we can’t assume the system has the same energy as when it’s near the hydrogen nucleus (in case you forgot: that’s what the proton is, really). Because we’ve lost the symmetry, we no longer have these ‘easy’ Hamiltonians:


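We need to look at the original formulas for EI and EII once again. Let me write them down:

EI = (H11 + H22)/2 + √[(H11 − H22)²/4 + H12·H21]
EII = (H11 + H22)/2 − √[(H11 − H22)²/4 + H12·H21]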


Of course, H12 and H21 will still be equal to −A, and so… Well… Let me simplify my life and copy Feynman:


There’s several things here. First, note that approximation to the square root:

√(x² + y²) ≈ x + y²/(2x)

We’re only allowed to do that if y is much smaller than x, with x = 1 and y = 2A/(H11 − H22). In fact, the condition is usually written as 0 ≤ y/x ≤ 1/2, so we take the A/(H11 − H22) ratio as (much) less than one, indeed. So the second term in the energy difference EI − EII = (H11 − H22) + 2A·A/(H11 − H22) is surely smaller than 2A. But there’s the first term, of course: H11 − H22. However, that’s there anyway, and so we should actually be looking at the additional separation, so that’s where the A comes in, and so that’s the second term: 2A·A/(H11 − H22) which, as mentioned, is smaller than 2A by the factor A/(H11 − H22), which is less than one. So Feynman’s conclusion is correct: “The binding of unsymmetric diatomic molecules is generally very weak.”
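
To get a feel for how good that approximation is, you can compare it with the exact square-root formula. The sketch below does that for H12·H21 = A², with some hypothetical numbers for H11, H22 and A (arbitrary energy units, H11 > H22). The function names are mine.

```python
import math

def exact_energies(h11, h22, a):
    """E_I and E_II from the square-root formula, with H12·H21 = A²."""
    mean = (h11 + h22) / 2
    root = math.sqrt((h11 - h22) ** 2 / 4 + a ** 2)
    return mean + root, mean - root

def approximate_energies(h11, h22, a):
    """First-order expansion, valid when A is much smaller than H11 − H22."""
    shift = a ** 2 / (h11 - h22)
    return h11 + shift, h22 - shift

# Hypothetical values: A is ten times smaller than H11 − H22.
h11, h22, a = 5.0, 1.0, 0.4
exact = exact_energies(h11, h22, a)
approx = approximate_energies(h11, h22, a)

# The additional separation caused by A, compared with the symmetric case (2A).
extra_separation = (approx[0] - approx[1]) - (h11 - h22)

print(exact, approx)
print(extra_separation, 2 * a)
```

With these numbers the additional separation is ten times smaller than 2A, which is just the A/(H11 − H22) factor again.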

However, that’s not the case when binding two ions by two electrons, which is referred to as a two-electron binding, which is the most common valence bond. Let me simplify my life once more and quote once again:

Feynman 2

What he’s saying is that H11 and H22 are one and the same once again, and equal to E0, because both ions can take one electron, so there’s no difference between state 1 and state 2 in that regard. So the energy difference is 2A once more and we’ve got good covalent binding. [Note that the term ‘covalent’ just refers to sharing electrons, so their value is shared, so to speak.]

Now, this result is, of course, subject to the hypothesis that the electron is more or less equally attracted to both ions, which may or may not be the case. If it’s not the case, we’ll have what’s referred to as ‘ionic’ binding. Again, I’ll let Feynman explain it, as it’s pretty straightforward and so it’s no use to try to write another summary of this:


So… That’s it, really. As Feynman puts it, by way of conclusion: “You can now begin to see how it is that many of the facts of chemistry can be most clearly understood in terms of a quantum mechanical description.”

Most clearly? Well… I guess that, at the very least, we’re “beginning to see” something here, aren’t we? 🙂

An introduction to virtual particles

In one of my posts on the rules of quantum math, I introduced the propagator function, which gives us the amplitude for a particle to go from one place to another. It looks like this:


The r1 and r2 vectors are, obviously, position vectors describing (1) where the particle is right now, so the initial state is written as |r1〉, and (2) where it might go, so the final state is |r2〉. Now we can combine this with the analysis in my previous post to think about what might happen when an electron sort of ‘jumps’ from one state to another. It’s a rather funny analysis, but it will give you some feel of what these so-called ‘virtual’ particles might represent.

Let’s first look at the shape of that function. The e^((i/ħ)·p∙r12) function in the numerator is now familiar to you. Note the r12 in the argument, i.e. the vector pointing from r1 to r2. The p∙r12 dot product equals |p|∙|r12|·cosθ = p∙r12·cosθ, with θ the angle between p and r12. If the two vectors point in the same direction (θ = 0), then cosθ is equal to 1. If the angle is π/2, then it’s 0, and the function reduces to 1/r12. So the angle θ, through the cosθ factor, sort of scales the spatial frequency. Let me try to give you some idea of what this looks like by assuming p and r12 point in the same direction, so we’re looking at the space in the direction of the momentum only and |p|∙|r12|·cosθ = p∙r12. Now, we can look at the p/ħ factor as a scaling factor, and measure the distance x in units defined by that scale, so we write: x = p∙r12/ħ. The whole function, including the denominator, then reduces to (ħ/p)·e^(ix)/x = (ħ/p)·cos(x)/x + i·(ħ/p)·sin(x)/x, and we just need to take the absolute square of this to get the probability. All of the graphs are drawn hereunder: I’ll let you analyze them. [Note that the graphs do not include the ħ/p factor, which you may look at as yet another scaling factor.] You’ll see – I hope! – that it all makes perfect sense: the probability quickly drops off with distance, both in the positive as well as in the negative x-direction, while going to infinity when very near, i.e. for very small x. [Note that the absolute square, using cos(x)/x and sin(x)/x, yields the same graph as squaring 1/x, obviously!]
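
If you’d rather see numbers than graphs, here is a small sketch of the scaled function e^(ix)/x, so without the ħ/p factor and with x = p∙r12/ħ as defined above. The function names and the sample points are my own choice.

```python
import cmath

def scaled_propagator(x):
    """e^(ix)/x: the propagator without the ħ/p factor, with x = p·r12/ħ."""
    return cmath.exp(1j * x) / x

def probability_density(x):
    """Absolute square of the scaled amplitude."""
    return abs(scaled_propagator(x)) ** 2

for x in (0.5, 1.0, 2.0, 4.0):
    amplitude = scaled_propagator(x)
    # Real part is cos(x)/x, imaginary part is sin(x)/x.
    print(x, amplitude.real, amplitude.imag, probability_density(x), 1 / x ** 2)
```

The last two columns should be the same: the cosine and sine wiggles disappear when taking the absolute square, and all that’s left is the 1/x² drop-off.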


Now, this propagator function is not dependent on time: it’s only the momentum that enters the argument. Of course, we assume p to be some positive real number. Of course?

This is where Feynman starts an interesting conversation. In the previous post, we studied a model in which we had two protons, and one electron jumping from one to another, as shown below.


This model told us the equilibrium state is a stable ionized hydrogen molecule (so that’s an H2+ molecule), with an interproton distance that’s equal to 1 Ångstrom – so that’s like twice the size of a hydrogen atom (which we simply write as H) – and an energy that’s 2.72 eV less than the energy of a hydrogen atom and a proton (so that’s not an H2+ molecule but a system consisting of a separate hydrogen atom and a proton). The why and how of that equilibrium state is illustrated below. [For more details, see my previous post.]


Now, the model implies there is a sort of attractive force pulling the two protons together even when the protons are at larger distances than 1 Å. One can see that from the graph indeed. Now, we would not associate any molecular orbital with those distances, as the system is, quite simply, not a molecule but a separate hydrogen atom and a proton. Nevertheless, the amplitude A is non-zero, and so we have an electron jumping back and forth.

We know how that works from our post on tunneling: particles can cross an energy barrier and tunnel through. One of the weird things we had to consider when a particle crosses such a potential barrier, is that the momentum factor p in its wavefunction was some pure imaginary number, which we wrote as p = i·p’. We then re-wrote that wavefunction as a·e^(−iθ) = a·e^(−i[(E/ħ)∙t − (i·p’/ħ)·x]) = a·e^(−i(E/ħ)∙t)·e^(i²·p’·x/ħ) = a·e^(−i(E/ħ)∙t)·e^(−p’·x/ħ). The e^(−p’·x/ħ) factor in this formula is a real-valued exponential function, that sort of ‘kills’ our wavefunction as we move across the potential barrier, which is what is illustrated below: if the distance is too large, then the amplitude for tunneling goes to zero.

potential barrier

From a mathematical point of view, the analysis of our electron jumping back and forth is very similar. However, there are differences too. We can’t really analyze this in terms of a potential barrier in space. The barrier is the potential energy of the electron itself: it’s happy when it’s bound, because its energy then contributes to a reduction of the total energy of the hydrogen atomic system that is equal to the ionization energy, or the Rydberg energy as it’s called, which is equal to no less than 13.6 eV (which, as mentioned, is pretty big at the atomic level). Well… We can take that propagator function (1/r)·e^((i/ħ)·p∙r) (note the argument has no minus sign: it can be quite tricky!), and just fill in the value for the momentum of the electron.

Huh? What momentum? It’s got no momentum to spare. On the contrary, it wants to stay with the proton, so it has no energy whatsoever to escape. Well… Not in quantum mechanics. In quantum mechanics it can use all its potential energy and convert it into kinetic energy, so it can get away from its proton and convert the energy that’s being released into kinetic energy.

But there is no release of energy! The energy is negative!

Exactly! You’re right. So we boldly write: K.E. = m·v²/2 = p²/(2m) = −13.6 eV, and, because we’re working with complex numbers, we can take a square root of a negative number, using the definition of the imaginary unit: i = √(−1), so we get a purely imaginary value for the momentum p, which we write as:

p = ±i·√(2m·EH)

The sign of p is chosen so it makes sense: our electron should go in one direction only. It’s going to be the plus sign. [If you’d take the negative root, you’d get a nonsensical propagator function.] To make a long story short, our propagator function becomes:

(1/r)·e^((i/ħ)·i·√(2m·EH)∙r) = (1/r)·e^((i²/ħ)·√(2m·EH)∙r) = (1/r)·e^(−(√(2m·EH)/ħ)∙r)

Of course, from a mathematical point of view, that’s the same function as e^(−p’·x/ħ): it’s a real-valued exponential function that quickly dies. But it’s an amplitude alright, and it’s just like an amplitude for tunneling indeed: if the distance is too large, then the amplitude goes to zero. The final cherry on the cake, of course, is to write:

A ∼ (1/r)·e^(−(√(2m·EH)/ħ)∙r)

Well… No. It gets better. This amplitude is an amplitude for an electron bond between the two protons which, as we know, lowers the energy of the system. By how much? Well… By A itself. Now we know that work or energy is an integral or antiderivative of force over distance, so force is the derivative of energy with respect to the distance. So we can just take the derivative of the expression above to get the force. I’ll leave that to you as an exercise: don’t forget to use the product rule! 🙂
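
If you want to check your answer to that exercise, here is a minimal sketch. It takes A(r) = (1/r)·e^(−κr) with κ = √(2m·EH)/ħ, leaves out the proportionality constant (so it only gives the shape of the curve, not actual energies), and compares the product-rule derivative with a numerical one. The constants are the usual values for the electron mass, ħ and the electronvolt; the function names and the step size are my own assumptions.

```python
import math

HBAR = 1.054571817e-34         # reduced Planck constant, in J·s
ELECTRON_MASS = 9.1093837e-31  # in kg
E_H = 13.6 * 1.602176634e-19   # Rydberg energy, converted from eV to joule

# Decay constant κ = √(2m·EH)/ħ, in 1/m.
KAPPA = math.sqrt(2 * ELECTRON_MASS * E_H) / HBAR

def amplitude(r):
    """A(r) up to a constant factor: (1/r)·e^(−κr), with r in meter."""
    return math.exp(-KAPPA * r) / r

def amplitude_derivative(r):
    """dA/dr from the product rule: −e^(−κr)·(κr + 1)/r²."""
    return -math.exp(-KAPPA * r) * (KAPPA * r + 1) / r ** 2

def numerical_derivative(f, r, h=1e-15):
    """Central difference, as an independent check on the product rule."""
    return (f(r + h) - f(r - h)) / (2 * h)

r = 1.0e-10  # 1 Å, the interproton distance at equilibrium
print(1 / KAPPA)  # length scale of the exponential, in meter
print(amplitude_derivative(r), numerical_derivative(amplitude, r))
```

Note that the derivative is negative everywhere: the amplitude only gets smaller as the protons move apart. Also note the length scale 1/κ: it should come out at about half an Ångstrom, so that’s the size of the hydrogen atom.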

So are we done? No. First, we didn’t talk about virtual particles yet! Let me do that now. However, first note that we should add one more effect in our two-proton-one-electron system: the coulomb field (ε) caused by the bare proton will cause the hydrogen molecule to take on an induced electric dipole moment (μ), so we should integrate that in our energy equation. Feynman shows how, but I won’t bother you with that here. Let’s talk about those virtual particles. What are they?

Well… There’s various definitions, but Feynman’s definition is this one:

“There is an exchange of a virtual electron when–as here–the electron has to jump across a space where it would have a negative energy. More specifically, a ‘virtual exchange’ means that the phenomenon involves a quantum-mechanical interference between an exchanged state and a non-exchanged state.”

You’ll say: what’s virtual about it? The electron does go from one place to another, doesn’t it? Well… Yes and no. We can’t observe it while it’s supposed to be doing that. Our analysis just tells us it seems to be useful to distinguish two different states and analyze all in terms of those differential equations. Who knows what’s really going on? What’s actual and what’s virtual? We just have some ‘model’ here: a model for the interaction between a hydrogen atom and a proton. It explains the attraction between them in terms of a sort of continuous exchange of an electron, but is it real?

The point is: in physics, it’s assumed that the Coulomb interaction, i.e. all of electrostatics really, comes from the exchange of virtual photons: one electron, or proton, emits a photon, and then another absorbs it in the reverse of the same reaction. Furthermore, it is assumed that the amplitude for doing so is like that formula we found for the amplitude to exchange a virtual electron, except that the rest mass of a photon is zero, and so the formula reduces to 1/r. Such a simple relationship makes sense, of course, because that’s how the electrostatic potential varies in space!

That, in essence, is all there is to the quantum-mechanical theory of electromagnetism, which Feynman refers to as the ‘particle point of view’.

So… Yes. It’s that simple. Yes! For a change! 🙂

Post scriptum: Feynman’s Lecture on virtual particles is actually focused on a model for the nuclear forces. Most of it is devoted to a discussion of the virtual ‘pion’, or π-meson, which was then, when Feynman wrote his Lectures, supposed to mediate the force between two nucleons. However, this theory is clearly outdated: nuclear forces are described by quantum chromodynamics. So I’ll just skip the Yukawa theory here. It’s actually kinda strange his theory, which he proposed in 1935, was the theory for nuclear forces for such a long time. Hence, it’s surely all very interesting from a historical point of view.

The hydrogen molecule as a two-state system

My posts on the state transitions of an ammonia molecule weren’t easy, were they? So let’s try another two-state system. The illustration below shows an ionized hydrogen molecule in two possible states which, as usual, we’ll denote as |1〉 and |2〉. An ionized hydrogen molecule is an H2 molecule which lost an electron, so it’s two protons with one electron only, so we denote it as H2+. The difference between the two states is obvious: the electron is either with the first proton or with the second.


It’s an example taken from Feynman’s Lecture on two-state systems. The illustration itself raises a lot of questions, of course. The most obvious question is: how do we know which proton is which? We’re talking identical particles, right? Right. We should think of the proton spins! However, protons are fermions and, hence, they can’t be in the same state, so they must have opposite spins. Of course, now you’ll say: they’re not in the same state because they’re at different locations. Well… Now you’ve answered your own question. 🙂 However you want to look at this, the point is: we can distinguish both protons. Having said that, the reflections above raise other questions: what reference frame are we using? The answer is: it’s the reference frame of the system. We can mirror or rotate this image however we want – as I am doing below – but state |1〉 is state |1〉, and state |2〉 is state |2〉.


The other obvious question is more difficult. If you’ve read anything at all about quantum mechanics, you’ll ask: what about the in-between states? The electron is actually being shared by the two protons, isn’t it? That’s what chemical bonds are all about, no? Molecular orbitals rather than atomic orbitals, right? Right. That’s actually what this post is all about. We know that, in quantum mechanics, the actual state – or what we think is the actual state – is always expressed as some linear combination of so-called base states. We wrote:

|ψ〉 = |1〉C1 + |2〉C2 = |1〉〈1|ψ〉 + |2〉〈2|ψ〉

In terms of representing what’s actually going on, we only have these probability functions: they say that, if we would take a measurement, the probability of finding the electron near the first or the second proton varies as shown below:


If the |1〉 and |2〉 states were actually representing two dual physical realities, the actual state of our H2+ molecule would be represented by some square or some pulse wave, as illustrated below. [We should be calling it a square function but that term has been reserved for a function like y = x².]


Of course, the symmetry of the situation implies that the average pulse duration τ would be one-half of the (average) period T, so we’d be talking a square wavefunction indeed. The two wavefunctions both qualify as probability density functions: the system is always in one state or the other, and the probabilities add up to one. But you’ll agree we prefer the smooth squared sine and cosine functions. To be precise, these smooth functions are:

  • P1(t) = |C1(t)|² = cos²[(A/ħ)·t]
  • P2(t) = |C2(t)|² = sin²[(A/ħ)·t]
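These two probability functions are easy to tabulate. The snippet below is a minimal sketch only: the value of A is a made-up placeholder (we haven’t explained A yet), and the function name `probabilities` is mine, not Feynman’s. It just shows that the two probabilities always add up to one, and that the electron is ‘surely’ near one proton only at isolated instants.

```python
import math

HBAR_EV_S = 6.582119569e-16  # reduced Planck constant in eV·s


def probabilities(t, a_ev):
    """Return (P1, P2) at time t (in seconds) for a coupling energy A (in eV)."""
    phase = (a_ev / HBAR_EV_S) * t
    return math.cos(phase) ** 2, math.sin(phase) ** 2


A_EV = 1.0  # hypothetical value for A, for illustration only
PERIOD = math.pi * HBAR_EV_S / A_EV  # P1 is back at 1 when (A/ħ)·t = π

for fraction in (0.0, 0.25, 0.5, 0.75, 1.0):
    p1, p2 = probabilities(fraction * PERIOD, A_EV)
    print(f"t = {fraction:4.2f}·T   P1 = {p1:.3f}   P2 = {p2:.3f}   sum = {p1 + p2:.3f}")
```

Half-way through the period, P2 reaches one and P1 drops to zero; everywhere else the electron is ‘shared’.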

So now we only need to explain A here (you know ħ already). But… Well… Why would we actually prefer those smooth functions? An irregular pulse function would seem to be doing a better job when it comes to modeling reality, doesn’t it? The electron should be either here, or there. Isn’t it?

Well… No. At least that’s what I am slowly starting to understand. These pure base states |1〉 and |2〉 are real and not real at the same time. They’re real, because it’s what we’ll get when we verify, or measure, the state, so our measurement will tell us that it’s here or there. There’s no in-between. [I still need to study weak measurement theory.] But then they are not real, because our molecule will never ever be in those two states, except for those ephemeral moments when (A/ħ)·t = n·π/2 (n = 0, 1, 2,…): state |1〉 for even n, and state |2〉 for odd n. So we’re really modeling uncertainty here and, while I am still exploring what that actually means, you should think of the electron as being everywhere really, but with an unequal density in space, sort of. 🙂

Now, we’ve learned we can describe the state of a system in terms of an alternative set of base states. We wrote: |ψ〉 = |I〉CI + |II〉CII = |I〉〈I|ψ〉 + |II〉〈II|ψ〉, with the CI, II and C1, 2 coefficients being related to each other in exactly the same way as the associated base states, i.e. through a transformation matrix, which we summarized as:


To be specific, the two sets of base states we’ve been working with so far were related as follows:


So we’d write: |ψ〉 = |I〉CI + |II〉CII = |I〉〈I|ψ〉 + |II〉〈II|ψ〉 = |1〉C1 + |2〉C2 = |1〉〈1|ψ〉 + |2〉〈2|ψ〉, and the CI, II and C1, 2 coefficients would be related in exactly the same way as the base states:

  • CI = (1/√2)·(C1 − C2)
  • CII = (1/√2)·(C1 + C2)

[In case you’d want to review how that works, see my post on the Hamiltonian and base states.] Now, we cautioned that it’s difficult to try to interpret such base transformations – often referred to as a change in the representation or a different projection – geometrically. Indeed, we acknowledged that (base) states were very much like (base) vectors – from a mathematical point of view, that is – but, at the same time, we said that they were ‘objects’, really: elements in some Hilbert space, which means you can do the operations we’re doing here, i.e. adding and multiplying. Something like |I〉CI doesn’t mean all that much: CI is a complex number – and so we can work with numbers, of course, because we can visualize them – but |I〉 is a ‘base state’, and so what’s the meaning of that, and what’s the meaning of the |I〉CI or CI|I〉 product? I could babble about that, but it’s no use: a base state is a base state. It’s some state of the system that makes sense to us. In fact, it may be some state that does not make sense to us – in terms of the physics of the situation, that is – but then there will always be some mathematical sense to it because of that transformation matrix, which establishes a one-to-one relationship between all sets of base states.

You’ll say: why don’t you try to give it some kind of geometrical or whatever meaning? OK. Let’s try. State |1〉 is obviously like minus state |2〉 in space, so let’s see what happens when we equate |1〉 to 1 on the real axis, and |2〉 to −1. Geometrically, that corresponds to the (1, 0) and (−1, 0) points on the unit circle. So let’s multiply those points with (1/√2, −1/√2) and (1/√2, 1/√2) respectively. What do we get? Well… What product should we take? The dot product, the cross product, or the ordinary complex-number product? The dot product gives us a number, so we don’t want that. [If we’re going to represent base states by vectors, we want all states to be vectors.] A cross product will give us a vector that’s orthogonal to both vectors, so it’s a vector in ‘outer space’, so to say. We don’t want that, I must assume, and so we’re left with the complex-number product, which projects our (1, 0) and (−1, 0) vectors into (1/√2, −1/√2)·(1, 0) = (1/√2 − i/√2)·(1 + 0·i) = 1/√2 − i/√2 = (1/√2, −1/√2) and (1/√2, 1/√2)·(−1, 0) = (1/√2 + i/√2)·(−1 + 0·i) = −1/√2 − i/√2 = (−1/√2, −1/√2) respectively.

[Illustration: transformation of the two base vectors]

What does this say? Nothing. Stuff like this only causes confusion. We had two base states that were ‘180 degrees’ apart, and now our new base states are only ‘90 degrees’ apart. If we’d ‘transform’ the two new base states once more, they collapse into each other: (1/√2, −1/√2)·(1/√2, −1/√2) = (1/√2 − i/√2)² = −i = (0, −1), and (1/√2, 1/√2)·(−1/√2, −1/√2) = −i as well. This is nonsense, of course. It’s got nothing to do with the angle we picked for our original set of base states: we could have separated our original set of base states by 90 degrees, or 45 degrees. It doesn’t matter. It’s the transformation itself: multiplying by (+1/√2, −1/√2) amounts to a clockwise rotation by 45 degrees, while multiplying by (+1/√2, +1/√2) amounts to the same, but counter-clockwise. So… Well… We should not try to think of our base vectors in any geometric way, because it just doesn’t make any sense. So let’s not waste time on this: the ‘base states’ are a bit of a mystery, in the sense that they just are what they are: we can’t ‘reduce’ them any further, and trying to interpret them geometrically leads to contradictions, as evidenced by what I tried to do above. Base states are ‘vectors’ in a so-called Hilbert space, and… Well… That’s not your standard vector space. [If you think you can make more sense of it, please do let me know!]
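If you want to see the ‘collapse’ for yourself, the complex-number products above can be checked in a few lines. This is just a sketch of the (admittedly pointless) geometric experiment: the variable names are mine, and representing |1〉 and |2〉 by the numbers +1 and −1 is the assumption we made above, not something the physics gives us.

```python
import cmath
import math

S = 1 / math.sqrt(2)
ROT_CW = complex(S, -S)   # (1/√2, −1/√2): rotates clockwise by 45 degrees
ROT_CCW = complex(S, S)   # (1/√2, +1/√2): rotates counter-clockwise by 45 degrees


def angle_deg(z):
    """Return the argument of a complex number in degrees."""
    return math.degrees(cmath.phase(z))


state_1 = complex(1, 0)    # |1> placed at +1 on the real axis (an assumption)
state_2 = complex(-1, 0)   # |2> placed at −1 on the real axis (an assumption)

new_1 = ROT_CW * state_1
new_2 = ROT_CCW * state_2
print("angle between new vectors:", abs(angle_deg(new_1) - angle_deg(new_2)))

# Applying the same 'transformation' once more makes both land on −i.
twice_1 = ROT_CW * new_1
twice_2 = ROT_CCW * new_2
print(twice_1, twice_2)
```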


Let’s take our transformation again:

  • |I〉 = (1/√2)|1〉 − (1/√2)|2〉 = (1/√2)[|1〉 − |2〉]
  • |II〉 = (1/√2)|1〉 + (1/√2)|2〉 = (1/√2)[|1〉 + |2〉]

Again, trying to geometrically interpret what it means to add or subtract two base states is not what you should be trying to do. In a way, the two expressions above only make sense when combining them with a final state, so when writing:

  • 〈ψ|I〉 = (1/√2)〈ψ|1〉 − (1/√2)〈ψ|2〉 = (1/√2)[〈ψ|1〉 − 〈ψ|2〉]
  • 〈ψ|II〉 = (1/√2)〈ψ|1〉 + (1/√2)〈ψ|2〉 = (1/√2)[〈ψ|1〉 + 〈ψ|2〉]

Taking the complex conjugate of this gives us the amplitudes of the system to be in state I or state II:

  • 〈I|ψ〉 = 〈ψ|I〉* = (1/√2)[〈ψ|1〉* − 〈ψ|2〉*] = (1/√2)[〈1|ψ〉 − 〈2|ψ〉]
  • 〈II|ψ〉 = 〈ψ|II〉* = (1/√2)[〈ψ|1〉* + 〈ψ|2〉*] = (1/√2)[〈1|ψ〉 + 〈2|ψ〉]
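In practice, these two formulas are just a recipe for turning a pair of amplitudes (C1, C2) into the pair (CI, CII). Here is a minimal sketch of that recipe; the function name and the sample amplitudes are made up for illustration.

```python
import math


def to_stationary_basis(c1, c2):
    """Map the amplitudes (<1|psi>, <2|psi>) to (<I|psi>, <II|psi>)."""
    s = 1 / math.sqrt(2)
    return s * (c1 - c2), s * (c1 + c2)


# Electron surely near the first proton: equal chances for states I and II.
c_i, c_ii = to_stationary_basis(1.0, 0.0)
print(abs(c_i) ** 2, abs(c_ii) ** 2)

# Any normalized pair stays normalized, e.g. this arbitrary complex example.
c_i, c_ii = to_stationary_basis(0.6, 0.8j)
print(abs(c_i) ** 2 + abs(c_ii) ** 2)
```

The transformation preserves the total probability, which is what we’d expect from a mere change of base states.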

That still doesn’t tell us much, because we’d need to know the 〈1|ψ〉 and 〈2|ψ〉 functions, i.e. the amplitudes of the system to be in state 1 and state 2 respectively. What we do know, however, is that the 〈I|ψ〉 and 〈II|ψ〉 functions are rather special amplitudes. We wrote:

  • CI = 〈I|ψ〉 = e^(−(i/ħ)·EI·t)
  • CII = 〈II|ψ〉 = e^(−(i/ħ)·EII·t)

These are amplitudes of so-called stationary states: the associated probabilities – i.e. the absolute square of these functions – do not vary in time: |e^(−(i/ħ)·EI·t)|² = |e^(−(i/ħ)·EII·t)|² = 1. For our ionized hydrogen molecule, it means that, if it would happen to be in state I, it will stay in state I, and the same goes for state II. We write:

〈 I | I 〉 = 〈 II | II 〉 = 1 and 〈 I | II 〉 = 〈 II | I 〉 = 0

That’s actually just the so-called ‘orthogonality’ condition for base states, which we wrote as 〈i|j〉 = 〈j|i〉 = δij, but, in light of the fact that we can’t interpret them geometrically, we shouldn’t be calling it like that. The point is: we had those differential equations describing a system like this. If the amplitude to go from state 1 to state 2 was equal to some real- or complex-valued constant A, then we could write those equations either in terms of C1 and C2, or in terms of CI and CII:

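  • iħ·(dC1/dt) = E0·C1 − A·C2 and iħ·(dC2/dt) = −A·C1 + E0·C2
  • iħ·(dCI/dt) = (E0 + A)·CI = EI·CI and iħ·(dCII/dt) = (E0 − A)·CII = EII·CII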

So the two sets of equations are equivalent. However, what we want to do here is look at it in terms of CI and CII. Let’s first analyze those two energy levels EI = E0 + A and EII = E0 − A. Feynman graphs them as follows:
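A quick numerical check makes the equivalence less abstract. The sketch below assumes the coupled equations take the form iħ·dC1/dt = E0·C1 − A·C2 and iħ·dC2/dt = E0·C2 − A·C1 (so the off-diagonal Hamiltonian coefficient is −A), works in units where ħ = 1, and uses made-up values for E0 and A. It verifies that the cos/sin solution satisfies those equations, and that the combination (C1 − C2)/√2 is a pure e^(−(i/ħ)·(E0 + A)·t) function, up to the 1/√2 factor.

```python
import cmath
import math

HBAR = 1.0         # natural units (assumption)
E0, A = 2.0, 0.5   # hypothetical energies, for illustration only


def c1(t):
    return cmath.exp(-1j * E0 * t / HBAR) * math.cos(A * t / HBAR)


def c2(t):
    return 1j * cmath.exp(-1j * E0 * t / HBAR) * math.sin(A * t / HBAR)


def derivative(f, t, h=1e-6):
    """Central-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2 * h)


def residuals(t):
    """How badly the coupled equations are violated at time t (should be ~0)."""
    r1 = 1j * HBAR * derivative(c1, t) - (E0 * c1(t) - A * c2(t))
    r2 = 1j * HBAR * derivative(c2, t) - (E0 * c2(t) - A * c1(t))
    return abs(r1), abs(r2)


def c_upper(t):
    """C_I = (C1 − C2)/√2, which should oscillate with energy E0 + A."""
    return (c1(t) - c2(t)) / math.sqrt(2)


for t in (0.0, 0.7, 3.1):
    expected = cmath.exp(-1j * (E0 + A) * t / HBAR) / math.sqrt(2)
    print(t, residuals(t), abs(c_upper(t) - expected))
```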


Let me explain. In the first graph, we have EI = E0 + A and EII = E0 − A, and they are depicted as being symmetric, with A depending on the distance between the two protons. As for E0, that’s the energy of a hydrogen atom, i.e. a proton with a bound electron, and a separate proton. So it’s the energy of a system consisting of a hydrogen atom and a proton, which is obviously not the same as that of an ionized hydrogen molecule. The concept of a molecule assumes the protons are closely together. We assume E0 = 0 if the interproton distance is relatively large but, of course, as the protons come closer, we shouldn’t forget the repulsive electrostatic force between the two protons, which is represented by the dashed line in the first graph. Indeed, unlike the electron and the proton, the two protons will want to push apart, rather than pull together, so the potential energy of the system increases as the interproton distance decreases. So E0 is not constant either: it also depends on the interproton distance. But let’s forget about E0 for a while. Let’s look at the two curves for A now.

A is not varying in time, but its value does depend on the distance between the two protons. We’ll use this in a moment to get the approximate size of the hydrogen molecule, i.e. the equilibrium interproton distance, following a line of reasoning that closely resembles Feynman’s calculation of the size of a hydrogen atom. That A should be some function of the interproton distance makes sense: the transition probability, and therefore A, will exponentially decrease with distance. There are a few things to reflect on here:

1. In the mentioned calculation of the size of a hydrogen atom, which is based on the Uncertainty Principle, Feynman shows that the energy of the system decreases when an electron is bound to the proton. The reasoning is that, if the potential energy of the electron is zero when it is not bound, then its potential energy will be negative when bound. Think of it: the electron and the proton attract each other, so it requires force to separate them, and force over a distance is energy. From our course in electromagnetics, we know that the potential energy, when bound, should be equal to −e²/a0, with e² the squared charge of the electron divided by 4πε0, and a0 the so-called Bohr radius of the atom. Of course, the electron also has kinetic energy. It can’t just sit on top of the proton because that would violate the Uncertainty Principle: we’d know where it was. Combining the two, Feynman calculates both a0 as well as the so-called Rydberg energy, i.e. the total energy of the bound electron, which is equal to −13.6 eV. So, yes, the bound state has less energy, so the electron will want to be bound, i.e. it will want to be close to one of the two protons.

2. Now, while that’s not what’s depicted above, it’s clear the magnitude of A will be related to that Rydberg energy which – please note – is quite high. Just compare it with the A for the ammonia molecule, which we calculated in our post on the maser: we found an A of about 0.5×10⁻⁴ eV there, so that’s like 270,000 times less! Nevertheless, the possibility is there, and what happens when the electron flips over amounts to tunneling: it penetrates and crosses a potential barrier. We did a post on that, and so you may want to look at how that works. One of the weird things we had to consider when a particle crosses such a potential barrier is that the momentum factor p in its wavefunction was some pure imaginary number, which we wrote as p = i·p′. We then re-wrote that wavefunction as a·e^(−iθ) = a·e^(−i[(E/ħ)·t − (i·p′/ħ)·x]) = a·e^(−i(E/ħ)·t)·e^(i²·p′·x/ħ) = a·e^(−i(E/ħ)·t)·e^(−p′·x/ħ). Now, it’s easy to see that the e^(−p′·x/ħ) factor in this formula is a real-valued exponential function, with the same shape as the general e^(−x) function, which I depict below.


This e^(−p′·x/ħ) factor basically ‘kills’ our wavefunction as we move in the positive x-direction, across the potential barrier, which is what is illustrated below: if the distance is too large, then the amplitude for tunneling goes to zero.

[Illustration: potential barrier]

So that’s what is depicted in those graphs of EI = E0 + A and EII = E0 − A: A goes to zero when the interproton distance becomes too large. We also recognize the exponential shape for A in those graphs, which can also be derived from the same tunneling story.
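To get a feel for how quickly that real-valued exponential kills the amplitude, here is a rough sketch. Be careful with it: the value I plug in for p′ is an assumption (I simply take p′ = ħ/a0, i.e. the momentum scale of the hydrogen-atom calculation), so the numbers are order-of-magnitude illustrations, not a calculation of A. The snippet also re-does the comparison of the Rydberg energy with the A of the ammonia molecule.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant in J·s
A0 = 0.528e-10           # Bohr radius in m, as quoted in the text
P_PRIME = HBAR / A0      # assumed magnitude of the imaginary momentum (illustration only)

RYDBERG_EV = 13.6
A_AMMONIA_EV = 0.5e-4


def damping_factor(p_prime, x):
    """Real-valued factor e^(−p'·x/ħ) multiplying the wavefunction inside the barrier."""
    return math.exp(-p_prime * x / HBAR)


for x_angstrom in (0.0, 0.5, 1.0, 2.0, 5.0):
    x = x_angstrom * 1e-10
    print(f"x = {x_angstrom:3.1f} Å   damping = {damping_factor(P_PRIME, x):.4f}")

print("Rydberg energy / A(ammonia) =", RYDBERG_EV / A_AMMONIA_EV)
```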

Now we can calculate E0 + A and E0 − A taking into account that both terms vary with the interproton distance as explained, and so that gives us the final curves on the right-hand side, which tell us that the equilibrium configuration of the ionized hydrogen molecule is state II, i.e. the lowest energy state, and the interproton distance there is approximately one Ångstrom, i.e. 1×10⁻¹⁰ m. [You can compare this with the Bohr radius, which we calculated as a0 = 0.528×10⁻¹⁰ m, so that all makes sense.] Also note the energy scale: ΔE is the excess energy over a proton plus a hydrogen atom, so that’s the energy when the two protons are far apart. Because it’s the excess energy, we have a zero point. That zero point is, obviously, the energy of a hydrogen atom and a proton. [Read this carefully, and please refer back to what I wrote above. The energy of a system consisting of a hydrogen atom and a proton is not the same as that of an ionized hydrogen molecule: the concept of a molecule assumes the protons are closely together.] We then re-scale by dividing by the Rydberg energy ER = 13.6 eV. So ΔE/ER ≈ −0.2 ⇔ ΔE ≈ −0.2×13.6 = −2.72 eV. That basically says that the energy of our ionized hydrogen molecule is 2.72 eV lower than the energy of a hydrogen atom and a proton.
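The re-scaling is simple arithmetic, but it’s worth writing down once, because the same conversion applies to any ratio you read off such a graph. The function name below is mine, and the −0.2 is just the approximate value read from the curve.

```python
RYDBERG_EV = 13.6  # Rydberg energy in eV


def excess_energy_ev(ratio):
    """Convert a ΔE/ER ratio read off the graph into an energy in eV."""
    return ratio * RYDBERG_EV


delta_e = excess_energy_ev(-0.2)
print(f"ΔE ≈ {delta_e:.2f} eV")
```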

Why is it lower? We need to think about our model of the hydrogen atom once more: the energy of the electron was minimized by striking a balance between (1) being close to the proton and, therefore, having a low potential energy (or a low coulomb energy, as Feynman calls it) and (2) being further away from the proton and, therefore, lowering its kinetic energy according to the Uncertainty Principle ΔxΔp ≥ ħ/2, which Feynman boldly re-wrote as p = ħ/a0. Now, a molecular orbital, i.e. the electron being around two protons, results in “more space where the electron can have a low potential energy”, as Feynman puts it, so “the electron can spread out—lowering its kinetic energy—without increasing its potential energy.”

The whole discussion here actually amounts to an explanation for the mechanism by which an electron shared by two protons provides, in effect, an attractive force between the two protons. So we’ve got a single electron actually holding two protons together, which chemists refer to as a “one-electron bond.”

So… Well… That explains why the energy EII = E0 − A is what it is, so that’s smaller than E0 indeed, with the difference equal to the value A for an interproton distance of 1 Å. But how should we interpret EI = E0 + A? What is that higher energy level? What does it mean?

That’s a rather tricky question. There’s no easy interpretation here, like we had for our ammonia molecule: the higher energy level had an obvious physical meaning in an electromagnetic field, as it was related to the electric dipole moment of the molecule. That’s not the case here: we have no magnetic or electric dipole moment here. So, once again, what’s the physical meaning of EI = E0 + A? Let me quote Feynman’s enigmatic answer here:

“Notice that this state is the difference of the states |1⟩ and |2⟩. Because of the symmetry of |1⟩ and |2⟩, the difference must have zero amplitude to find the electron half-way between the two protons. This means that the electron is somewhat more confined, which leads to a larger energy.”

What does he mean with that? It seems he’s actually trying to do what I said we shouldn’t try to do, and that is to interpret what adding versus subtracting states actually means. But let’s give it a fair look. We said that the |I〉 = (1/√2)[|1〉 − |2〉] expression didn’t mean much: we should add a final state and write: 〈ψ|I〉 = (1/√2)[〈ψ|1〉 − 〈ψ|2〉], which is equivalent to 〈I|ψ〉 = (1/√2)[〈1|ψ〉 − 〈2|ψ〉]. That still doesn’t tell us anything: we’re still adding amplitudes, and so we should allow for interference, and saying that |1⟩ and |2⟩ are symmetric simply means that 〈1|ψ〉 − 〈2|ψ〉 = 〈2|ψ〉 − 〈1|ψ〉 ⇔ 2·〈1|ψ〉 = 2·〈2|ψ〉 ⇔ 〈1|ψ〉 = 〈2|ψ〉. Wait a moment! That’s an interesting reflection. Following the same reasoning for |II〉 = (1/√2)[|1〉 + |2〉], we get 〈1|ψ〉 + 〈2|ψ〉 = 〈2|ψ〉 + 〈1|ψ〉 ⇔ … Huh? No, that’s trivial: 0 = 0.

Hmm… What to say? I must admit I don’t quite ‘get’ Feynman here: state I, with energy EI = E0 + A, seems to be both meaningless as well as impossible. The only energy levels that would seem to make sense here are the energy of a hydrogen atom and a proton and the (lower) energy of an ionized hydrogen molecule, which you get when you bring a hydrogen atom and a proton together. 🙂

But let’s move to the next thing: we’ve added only one electron to the two protons, and that was it, and so we had an ionized hydrogen molecule, i.e. an H2+ molecule. Why don’t we do a full-blown H2 molecule now? Two protons. Two electrons. It’s easy to do. The set of base states is quite predictable, and illustrated below: electron a can be near either one of the two protons, and the same goes for electron b.


We can then go through the same as for the ion: the molecule’s stability is shown in the graph below, which is very similar to the graph of the energy levels of the ionized hydrogen molecule, i.e. the H2+ molecule. The shape is the same, but the values are different: the equilibrium state is at an interproton distance of 0.74 Å, and the energy of the equilibrium state is like 5 eV (ΔE/ER ≈ −0.375) lower than the energy of two separate hydrogen atoms.

The explanation for the lower energy is the same: state II is associated with some kind of molecular orbital for both electrons, resulting in “more space where the electron can have a low potential energy”, as Feynman puts it, so “the electron can spread out – lowering its kinetic energy – without increasing its potential energy.”

However, there’s one extra thing here: the two electrons must have opposite spins. That’s the only way to actually distinguish the two electrons. But there is more to it: if the two electrons would not have opposite spin, we’d violate Fermi’s rule: when identical fermions are involved, and we’re adding amplitudes, then we should do so with a negative sign for the exchanged case. So our transformation would be problematic:

〈II|ψ〉 = (1/√2)[〈1|ψ〉 + 〈2|ψ〉] = (1/√2)[〈2|ψ〉 + 〈1|ψ〉]

When we switch the electrons, we should get a minus sign. The weird thing is: we do get that minus sign for state I:

〈I|ψ〉 = (1/√2)[〈1|ψ〉 − 〈2|ψ〉] = −(1/√2)[〈2|ψ〉 − 〈1|ψ〉]
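The sign behaviour under exchange is easy to verify mechanically. In the sketch below, ‘switching the electrons’ is modeled (an assumption of this illustration) as swapping the two amplitudes that go into the combination; the function names and sample amplitudes are mine.

```python
import math


def amplitude_state_i(amp_1, amp_2):
    """Antisymmetric combination (1/√2)·(<1|psi> − <2|psi>)."""
    return (amp_1 - amp_2) / math.sqrt(2)


def amplitude_state_ii(amp_1, amp_2):
    """Symmetric combination (1/√2)·(<1|psi> + <2|psi>)."""
    return (amp_1 + amp_2) / math.sqrt(2)


amp_1, amp_2 = 0.6 + 0.1j, 0.3 - 0.7j  # arbitrary sample amplitudes

# Swapping the two amplitudes flips the sign of state I but not of state II.
print(amplitude_state_i(amp_2, amp_1), -amplitude_state_i(amp_1, amp_2))
print(amplitude_state_ii(amp_2, amp_1), amplitude_state_ii(amp_1, amp_2))
```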

So… Well… We’ve got a bit of an answer there as to what the ‘other’ (upper) energy level EI = E0 + A actually means, in physical terms, that is. It models two hydrogens coming together with parallel electron spins. Applying Fermi’s rules – i.e. the exclusion principle, basically – we find that state II is, quite simply, not allowed for parallel electron spins: state I is, and it’s the only one. There’s something deep here, so let me quote the Master himself on it:

“We find that the lowest energy state – the only bound state – of the H2 molecule has the two electrons with spins opposite. The total spin angular momentum of the electrons is zero. On the other hand, two nearby hydrogen atoms with spins parallel – and so with a total angular momentum ħ – must be in a higher (unbound) energy state; the atoms repel each other. There is an interesting correlation between the spins and the energies. It gives another illustration of something we mentioned before, which is that there appears to be an “interaction” energy between two spins because the case of parallel spins has a higher energy than the opposite case. In a certain sense you could say that the spins try to reach an antiparallel condition and, in doing so, have the potential to liberate energy – not because there is a large magnetic force, but because of the exclusion principle.”

You should read this a couple of times. It’s an important principle. We’ll discuss it again in the next posts, when we’ll be talking spin in much more detail once again. 🙂 The bottom line is: if the electrons are parallel, then they won’t ‘share’ any space at all and, hence, they are really much more confined in space, and the associated energy level is, therefore, much higher.

Post scriptum: I said we’d ‘calculate’ the equilibrium interproton distance. We didn’t do that. We just read the values off the graphs, which are based on the results of a ‘detailed quantum-mechanical calculation’, or that’s what Feynman claims, at least. I am not sure if they correspond to experimentally determined values, or what calculations are behind them, exactly. Feynman notes that “this approximate treatment of the H2 molecule as a two-state system breaks down pretty badly once the protons get as close together as they are at the minimum in the curve and, therefore, it will not give a good value for the actual binding energy. For small separations, the energies of the two “states” we imagined are not really equal to E0, and a more refined quantum mechanical treatment is needed.”

So… Well… That says it all, I guess.

Two-state systems: the math versus the physics, and vice versa.

I think my previous post, on the math behind the maser, was a bit of a brain racker. However, the results were important and, hence, it is useful to generalize them so we can apply them to other two-state systems. 🙂 Indeed, we’ll use the very same two-state framework to analyze things like the stability of neutral and ionized hydrogen molecules and the binding of diatomic molecules in general – and lots of other stuff that can be analyzed as a two-state system. However, let’s first have a look at the math once more. More importantly, let’s analyze the physics behind it.

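At the center of our little Universe here 🙂 is the fact that the dynamics of a two-state system are described by a set of two differential equations, which we wrote as:

  • iħ·(dC1/dt) = H11·C1 + H12·C2
  • iħ·(dC2/dt) = H21·C1 + H22·C2

In matrix form, that’s d/dt [C1, C2] = −(i/ħ)·[[H11, H12], [H21, H22]]·[C1, C2], with [C1, C2] a two-by-one column matrix.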

It’s obvious these two equations are usually not easy to solve: the C1 and C2 functions are complex-valued amplitudes which vary not only in time but also in space, obviously, but, in fact, that’s not the problem. The issue is that the Hamiltonian coefficients Hij may also vary in space and in time, and so that’s what makes things quite nightmarish to solve. [Note that, while H11 and H22 represent some energy level and, hence, are usually real numbers, H12 and H21 may be complex-valued. However, in the cases we’ll be analyzing, they will be real numbers too, as they will usually also represent some energy. Having noted that, being real- or complex-valued is not the problem: we can work with complex numbers and, as you can see from the matrix equation above, the i/ħ factor in front of our differential equations results in a complex-valued coefficient matrix anyway.]

So… Yes. It’s those non-constant Hamiltonian coefficients that caused us so much trouble when trying to analyze how a maser works or, more generally, how induced transitions work. [The same equations apply to blackbody radiation indeed, or other phenomena involving induced transitions.] In any case, we won’t do that again – not now, at least – and so we’ll just go back to analyzing ‘simple’ two-state systems, i.e. systems with constant Hamiltonian coefficients.

Now, even for such simple systems, Feynman made life super-easy for us – too easy, I think – because he didn’t use the general mathematical approach to solve the issue on hand. That more general approach would be based on a technique you may or may not remember from your high school or university days: it’s based on finding the so-called eigenvalues and eigenvectors of the coefficient matrix. I won’t say too much about that, as there’s excellent online coverage of that, but… Well… We do need to relate the two approaches, and so that’s where math and physics meet. So let’s have a look at it all.

If we would write the first-order time derivative of those C1 and C2 functions as C1′ and C2′ respectively (so we just put a prime instead of writing dC1/dt and dC2/dt), and we put them in a two-by-one column matrix, which I’ll write as C′, and then, likewise, we also put the functions themselves, i.e. C1 and C2, in a column matrix, which I’ll write as C, then the system of equations can be written as the following simple expression:

C′ = AC

One can then show that the general solution will be equal to:

C = a1·e^(λI·t)·vI + a2·e^(λII·t)·vII

The λI and λII in the exponential functions are the eigenvalues of A, so that’s that two-by-two matrix in the equation, i.e. the coefficient matrix with the −(i/ħ)Hij elements. The vI and vII column matrices in the solution are the associated eigenvectors. As for a1 and a2, these are coefficients that depend on the initial conditions of the system as well as, in our case at least, the normalization condition: the probabilities we’ll calculate have to add up to one. So… Well… It all comes with the system, as we’ll see in a moment.

Let’s first look at those eigenvalues. We get them by calculating the determinant of the A−λI matrix, and equating it to zero, so we write det(A−λI) = 0. If A is a two-by-two matrix (which it is for the two-state systems that we are looking at), then we get a quadratic equation, and its two solutions will be those λI and λII values. The two eigenvalues of our system above can be written as:

λI = −(i/ħ)·EI and λII = −(i/ħ)·EII.
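For a two-by-two matrix, the quadratic equation can be solved in closed form. The sketch below assumes the standard result E = (H11 + H22)/2 ± √[((H11 − H22)/2)² + H12·H21] for the two energies, uses units where ħ = 1, and plugs in made-up Hamiltonian coefficients. It then checks that λ = −(i/ħ)·E does indeed make det(A − λI) vanish.

```python
import cmath

HBAR = 1.0  # natural units (assumption)


def energy_levels(h11, h12, h21, h22):
    """Return the (upper, lower) energies of a two-by-two Hamiltonian."""
    mean = (h11 + h22) / 2
    root = cmath.sqrt(((h11 - h22) / 2) ** 2 + h12 * h21)
    return mean + root, mean - root


def char_poly(lam, h11, h12, h21, h22):
    """Evaluate det(A − λ·I) for the coefficient matrix A = −(i/ħ)·H."""
    a11, a12 = -1j * h11 / HBAR, -1j * h12 / HBAR
    a21, a22 = -1j * h21 / HBAR, -1j * h22 / HBAR
    return (a11 - lam) * (a22 - lam) - a12 * a21


# Hypothetical coefficients: H11 = H22 = E0 = 2.0 and H12 = H21 = −A = −0.5.
coefficients = (2.0, -0.5, -0.5, 2.0)
e_upper, e_lower = energy_levels(*coefficients)
for energy in (e_upper, e_lower):
    lam = -1j * energy / HBAR
    print(energy, abs(char_poly(lam, *coefficients)))
```

With these symmetric coefficients, the two energies come out as E0 + A and E0 − A.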

EI and EII are two possible values for the energy of our system, which are referred to as the upper and the lower energy level respectively. We can calculate them as:


Note that we use the Roman numerals I and II for these two energy levels, rather than the usual Arabic numbers 1 and 2. That’s in line with Feynman’s notation: it relates to a special set of base states that we will introduce shortly. Indeed, plugging them into the a1·e^(λI·t) and a2·e^(λII·t) expressions gives us a1·e^(−(i/ħ)·EI·t) and a2·e^(−(i/ħ)·EII·t) and…

Well… It’s time to go back to the physics class now. What are we writing here, really? These two functions are amplitudes for so-called stationary states, i.e. states that are associated with probabilities that do not change in time. Indeed, it’s easy to see that their absolute square is equal to:

  • PI = |a1·e^(−(i/ħ)·EI·t)|² = |a1|²·|e^(−(i/ħ)·EI·t)|² = |a1|²
  • PII = |a2·e^(−(i/ħ)·EII·t)|² = |a2|²·|e^(−(i/ħ)·EII·t)|² = |a2|²
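The point that these probabilities don’t change in time is easily checked. A minimal sketch, with ħ = 1 and made-up values for the coefficient and the energy:

```python
import cmath


def stationary_amplitude(a, energy, t, hbar=1.0):
    """Return a·e^(−(i/ħ)·E·t) for a coefficient a and an energy E."""
    return a * cmath.exp(-1j * energy * t / hbar)


a1 = 0.6 + 0.3j   # hypothetical coefficient
energy_i = 2.5    # hypothetical energy level

for t in (0.0, 1.0, 10.0, 123.4):
    amplitude = stationary_amplitude(a1, energy_i, t)
    print(t, abs(amplitude) ** 2, abs(a1) ** 2)
```

The amplitude itself keeps rotating in the complex plane, but its absolute square stays put at |a1|².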

Now, the a1 and a2 coefficients depend on the initial and/or normalization conditions of the system, so let’s leave those out for the moment and write the rather special amplitudes e^(−(i/ħ)·EI·t) and e^(−(i/ħ)·EII·t) as:

  • CI = 〈 I | ψ 〉 = e^(−(i/ħ)·EI·t)
  • CII = 〈 II | ψ 〉 = e^(−(i/ħ)·EII·t)

As you can see, there’s two base states that go with these amplitudes, which we denote as state | I 〉 and | II 〉 respectively, so we can write the state vector of our two-state system – like our ammonia molecule, or whatever – as:

| ψ 〉 = | I 〉 CI + | II 〉 CII = | I 〉〈 I | ψ 〉 + | II 〉〈 II | ψ 〉

In case you forgot, you can apply the magical | = ∑ | i 〉 〈 i | formula to see this makes sense: | ψ 〉 = ∑ | i 〉 〈 i | ψ 〉 = | I 〉 〈 I | ψ 〉 + | II 〉 〈 II | ψ 〉 = | I 〉 CI + | II 〉 CII.

Of course, we should also be able to revert back to the base states we started out with so, once we’ve calculated C1 and C2, we can also write the state of our system in terms of state | 1 〉 and | 2 〉, which are the states as we defined them when we first looked at the problem. 🙂 In short, once we’ve got C1 and C2, we can also write:

| ψ 〉 = | 1 〉 C1 + | 2 〉 C2 = | 1 〉〈 1 | ψ 〉 + | 2 〉〈 2 | ψ 〉

So… Well… I guess you can sort of see how this is coming together. If we substitute what we’ve got so far, we get:

C = a1·CI·vI + a2·CII·vII

Hmm… So what’s that? We’ve seen something like C = a1·CI + a2·CII, as we wrote something like C1 = (a/2)·CI + (b/2)·CII in our previous posts, for example, but what are those eigenvectors vI and vII? Why do we need them?

Well… They just pop up because we’re solving the system as mathematicians would do it, i.e. not as Feynman-the-Great-Physicist-and-Teacher-cum-Simplifier does it. 🙂 From a mathematical point of view, they’re the vectors that solve the (A − λI·I)vI = 0 and (A − λII·I)vII = 0 equations, so they come with the eigenvalues, and their components will depend on the eigenvalues λI and λII as well as the Hamiltonian coefficients. [I is the identity matrix in these matrix equations.] In fact, because the eigenvalues are written in terms of the Hamiltonian coefficients, they depend on the Hamiltonian coefficients only, but then it will be convenient to use the EI and EII values as a shorthand.
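To make those eigenvector equations a bit more tangible, here is a small check for the symmetric case. Everything specific in it is an assumption for illustration: I take H11 = H22 = E0 and H12 = H21 = −A with made-up numbers, ħ = 1, and I guess the eigenvectors to be (1/√2)·(1, −1) and (1/√2)·(1, 1). The code only verifies that these guesses satisfy (A − λ·I)v = 0.

```python
import math

HBAR = 1.0         # natural units (assumption)
E0, A = 2.0, 0.5   # hypothetical energies

H = [[E0, -A], [-A, E0]]
COEFF = [[-1j * h / HBAR for h in row] for row in H]  # the matrix A = −(i/ħ)·H


def residual(lam, v):
    """Return the two components of (A − λ·I)·v, which vanish for an eigenvector."""
    return [
        (COEFF[0][0] - lam) * v[0] + COEFF[0][1] * v[1],
        COEFF[1][0] * v[0] + (COEFF[1][1] - lam) * v[1],
    ]


S = 1 / math.sqrt(2)
lam_i, v_i = -1j * (E0 + A) / HBAR, (S, -S)    # upper level, 'difference' vector
lam_ii, v_ii = -1j * (E0 - A) / HBAR, (S, S)   # lower level, 'sum' vector

print(residual(lam_i, v_i))
print(residual(lam_ii, v_ii))
```

Note how the eigenvector components mirror the coefficients in the |I〉 and |II〉 base states.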

Of course, one can also look at them as base vectors that uniquely specify the solution C as a linear combination of vI and vII. Indeed, just ask your math teacher, or google, and you’ll find that eigenvectors can serve as a set of base vectors themselves. In fact, the transformations you need to do to relate them to the so-called natural basis are the ones you’d do when diagonalizing the coefficient matrix A, which you did when solving systems of equations back in high school or whatever you were doing at university. But then you probably forgot, right? 🙂 Well… It’s all rather advanced mathematical stuff, and so let’s cut some corners here. 🙂

We know, from the physics of the situations, that the C1 and C2 functions and the CI and CII functions are related in the same way as the associated base states. To be precise, we wrote:

eq 1

This two-by-two matrix here is the transformation matrix for a rotation of state filtering apparatus about the y-axis, over an angle equal to α, when only two states are involved. You’ve seen it before, but we wrote it differently:


In fact, we can be more precise: the angle that we chose was equal to minus 90 degrees. Indeed, we wrote our transformation as:

  • CI = (1/√2)·(C1 − C2)
  • CII = (1/√2)·(C1 + C2)

[Check the values against α = −π/2.] However, let’s keep our analysis somewhat more general for the moment, so as to see if we really need to specify that angle. After all, we’re looking for a general solution here, so… Well… Remembering the definition of the inverse of a matrix (and the fact that cos²α + sin²α = 1), we can write:

Eq 3

Now, if we write the components of vI and vII as vI1 and vI2, and vII1 and vII2 respectively, then the C = a1·CI·vI + a2·CII·vII expression is equivalent to:

  • C1 = a1·vI1·CI + a2·vII1·CII
  • C2 = a1·vI2·CI + a2·vII2·CII

Hence, a1·vI1 = a2·vII2 = cos(α/2) and a2·vII1 = −a1·vI2 = sin(α/2). What can we do with this? Can we solve this? Not really: we’ve got two equations and four variables. So we need to look at the normalization and starting conditions now. For example, we can choose our t = 0 point such that our two-state system is in state 1, or in state I. And then we know it will not be in state 2, or state II. In short, we can impose conditions like:

|C1(0)|² = 1 = |a1·vI1·CI(0) + a2·vII1·CII(0)|² and |C2(0)|² = 0 = |a1·vI2·CI(0) + a2·vII2·CII(0)|²

However, as Feynman puts it: “These conditions do not uniquely specify the coefficients. They are still undetermined by an arbitrary phase.”

Hmm… He means the α, of course. So… What to do? Well… It’s simple. What he’s saying here is that we do need to specify that transformation angle. Just look at it: the a1·vI1 = a2·vII2 = cos(α/2) and a2·vII1 = −a1·vI2 = sin(α/2) conditions only make sense when we equate α with −π/2, so we can write:

  • a1·vI1 = a2·vII2 = cos(−π/4) = 1/√2
  • a2·vII1 = −a1·vI2 = sin(−π/4) = –1/√2

It’s only then that we get a unique ratio for a1/a2 = vII2/vI1 = −vII1/vI2. [In case you think there are two angles in the circle for which the cosine equals minus the sine – or, what amounts to the same, for which the sine equals minus the cosine – then… Well… You’re right, but we’ve got α divided by two in the argument. So if α/2 is equal to the ‘other’ angle, i.e. 3π/4, then α itself will be equal to 6π/4 = 3π/2. And so that’s the same −π/2 angle as above: 3π/2 − 2π = −π/2, indeed. So… Yes. It all makes sense.]

What are we doing here? Well… We’re sort of imposing a ‘common-sense’ condition here. Think of it: if the vII2/vI1 and −vII1/vI2 ratios were different, we’d have a huge problem, because we’d have two different values for the a1/a2 ratio! And… Well… That just doesn’t make sense. The system must come with some specific value for a1 and a2. We can’t just invent two ‘new’ ones!
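If you want to see the numbers behind that bracketed remark, here is a minimal Python sketch. It only evaluates cos(α/2) and sin(α/2) for the two candidate angles; the function name is mine, not Feynman’s. Note that the individual entries flip sign when α goes from −π/2 to 3π/2, while their ratio, which is what matters for a1/a2, stays the same.

```python
import math

def rotation_entries(alpha):
    """Return (cos(alpha/2), sin(alpha/2)), the two entries used in the text."""
    return math.cos(alpha / 2), math.sin(alpha / 2)

# The angle chosen in the text, and the 'other' angle a full turn further on.
cos_a, sin_a = rotation_entries(-math.pi / 2)
cos_b, sin_b = rotation_entries(3 * math.pi / 2)

print("alpha = -pi/2 :", cos_a, sin_a)
print("alpha = 3pi/2 :", cos_b, sin_b)
print("sin/cos ratios:", sin_a / cos_a, sin_b / cos_b)
```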

So… Well… We are alright now, and we can analyze whatever two-state system we want now. One example was our ammonia molecule in an electric field, for which we found that the following systems of equations were fully equivalent:


So, the upshot is that you should always remember that everything we’re doing is subject to the condition that the ‘1’ and ‘2’ base states and the ‘I’ and ‘II’ base states (Feynman suggests to read I and II as ‘Eins’ and ‘Zwei’ – or try ‘Uno‘ and ‘Duo‘ instead 🙂 – so as to make a difference with ‘one’ and ‘two’) are ‘separated’ by an angle of (minus) 90 degrees. [Of course, I am not using the ‘right’ language here, obviously. I should say ‘projected’, or ‘orthogonal’, perhaps, but then that’s hard to say for base states: the [1/√2, 1/√2] and [1/√2, −1/√2] vectors are obviously orthogonal, because their dot product is zero, but, as you know, the base states themselves do not have such geometrical interpretation: they’re just ‘objects’ in what’s referred to as a Hilbert space. But… Well… I shouldn’t dwell on that here.]

So… There we are. We’re all set. Good to go! Please note that, in the absence of an electric field, the two Hamiltonians are even simpler:


In fact, they’ll usually do the trick in what we’re going to deal with now.

[…] So… Well… That’s it, really! 🙂 We’re now going to apply all this in the next posts, so as to analyze things like the stability of neutral and ionized hydrogen molecules and the binding of diatomic molecules. More interestingly, we’re going to talk about virtual particles. 🙂

Addendum: I started writing this post because Feynman actually does give the impression there’s some kind of ‘doublet’ of a1 and a2 coefficients as he starts his chapter on ‘other two-state systems’. It’s the symbols he’s using: ‘his’ a1 and a2, and the other doublet with the primes, i.e. a1′ and a2′, are the transformation amplitudes, not the coefficients that I am calculating above, and that he was calculating (in the previous chapter) too. So… Well… Again, the only thing you should remember from this post is that 90 degree angle as a sort of physical ‘common sense condition’ on the system.

Having criticized the Great Teacher for not being consistent in his use of symbols, I should add that the interesting thing is that, while confusing, his summary in that chapter does give us precise formulas for those transformation amplitudes, which he didn’t do before. Indeed, if we write them as a, b, c and d respectively (so as to avoid that confusing a1 and a2, and then a1′ and a2′ notation), so if we have:


then one can show that:


That’s, of course, fully consistent with the ratios we introduced above, as well as with the orthogonality condition that comes with those eigenvectors. Indeed, if a/b = −1 and c/d = +1, then a/b = −c/d and, therefore, a·d + b·c = 0. [I’ll leave it to you to compare the coefficients so as to check that’s the orthogonality condition indeed.]

In short, it all shows everything does come out of the system in a mathematical way too, so the math does match the physics once again—as it should, of course! 🙂

The math behind the maser

As I skipped the mathematical arguments in my previous post so as to focus on the essential results only, I thought it would be good to complement that post by looking at the math once again, so as to ensure we understand what it is that we’re doing. So let’s do that now. We start with the easy situation: free space.

The two-state system in free space

We started with an ammonia molecule in free space, i.e. we assumed there were no external force fields, like a gravitational or an electromagnetic force field. Hence, the picture was as simple as the one below: the nitrogen atom could be ‘up’ or ‘down’ with regard to its spin around its axis of symmetry.


It’s important to note that this ‘up’ or ‘down’ direction is defined in regard to the molecule itself, i.e. not in regard to some external reference frame. In other words, the reference frame is that of the molecule itself. For example, if I flip the illustration above – like below – then we’re still talking the same states, i.e. the molecule is still in state 1 in the image on the left-hand side and it’s still in state 2 in the image on the right-hand side. 


We then modeled the uncertainty about its state by associating two different energy levels with the molecule: E0 + A and E0 − A. The idea is that the nitrogen atom needs to tunnel through a potential barrier to get to the other side of the plane of the hydrogens, and that requires energy. At the same time, we’ll show the two energy levels are effectively associated with an ‘up’ or ‘down’ direction of the electric dipole moment of the molecule. So that resembles the two spin states of an electron, which we associated with the +ħ/2 and −ħ/2 energies respectively. So if E0 would be zero (we can always take another reference point, remember?), then we’ve got the same thing: two energy levels that are separated by some definite amount: that amount is 2A for the ammonia molecule, and ħ when we’re talking quantum-mechanical spin. I should make a last note here, before I move on: note that these energies only make sense in the presence of some external field, because the + and − signs in the E0 + A and E0 − A and +ħ/2 and −ħ/2 expressions make sense only with regard to some external direction defining what’s ‘up’ and what’s ‘down’ really. But I am getting ahead of myself here. Let’s go back to free space: no external fields, so what’s ‘up’ or ‘down’ is completely random here. 🙂

Now, we also know an energy level can be associated with a complex-valued wavefunction, or an amplitude as we call it. To be precise, we can associate it with the generic a·e^(−(i/ħ)·(E·t − p·x)) expression which you know so well by now. Of course, as the reference frame is that of the molecule itself, its momentum is zero, so the p·x term in the a·e^(−(i/ħ)·(E·t − p·x)) expression vanishes and the wavefunction reduces to a·e^(−i·ω·t) = a·e^(−(i/ħ)·E·t), with ω = E/ħ. In other words, the energy level determines the temporal frequency, or the temporal variation (as opposed to the spatial frequency or variation), of the amplitude.

We then had to find the amplitudes C1(t) = 〈 1 | ψ 〉 and C2(t) =〈 2 | ψ 〉, so that’s the amplitude to be in state 1 or state 2 respectively. In my post on the Hamiltonian, I explained why the dynamics of a situation like this can be represented by the following set of differential equations:


As mentioned, the C1 and C2 functions evolve in time, and so we should write them as C1 = C1(t) and C2 = C2(t) respectively. In fact, our Hamiltonian coefficients may also evolve in time, which is why it may be very difficult to solve those differential equations! However, as I’ll show below, one usually assumes they are constant, and then one makes informed guesses about them so as to find a solution that makes sense.

Now, I should remind you here of something you surely know: if C1 and C2 are solutions to this set of differential equations, then the superposition principle tells us that any linear combination a·C1 + b·C2 will also be a solution. So we need one or more extra conditions, usually some starting condition, which we can combine with a normalization condition, so we can get some unique solution that makes sense.

The Hij coefficients are referred to as Hamiltonian coefficients and, as shown in the mentioned post, the H11 and H22 coefficients are related to the amplitude of the molecule staying in state 1 and state 2 respectively, while the H12 and H21 coefficients are related to the amplitude of the molecule going from state 1 to state 2 and vice versa. Because of the perfect symmetry of the situation here, it’s easy to see that H11 should equal H22, and that H12 and H21 should also be equal to each other. Indeed, Nature doesn’t care what we call state 1 or 2 here: as mentioned above, we did not define the ‘up’ and ‘down’ direction with respect to some external direction in space, so the molecule can have any orientation and, hence, switching the i and j indices should not make any difference. So that’s one clue, at least, that we can use to solve those equations: the perfect symmetry of the situation and, hence, the perfect symmetry of the Hamiltonian coefficients – in this case, at least!

The other clue is to think about the solution if we’d not have two states but one state only. In that case, we’d need to solve iħ·[dC1(t)/dt] = H11·C1(t). That’s simple enough, because you’ll remember that the exponential function is its own derivative. To be precise, we write: d[a·e^(iωt)]/dt = a·d[e^(iωt)]/dt = a·iω·e^(iωt), and please note that a can be any complex number: we’re not necessarily talking a real number here! In fact, we’re likely to talk complex coefficients, and we multiply with some other complex number (iω) anyway here! So if we write iħ·[dC1/dt] = H11·C1 as dC1/dt = −(i/ħ)·H11·C1 (remember: i⁻¹ = 1/i = −i), then it’s easy to see that the C1 = a·e^(−(i/ħ)·H11·t) function is the general solution for this differential equation. Let me write it out for you, just to make sure:

dC1/dt = d[a·e^(−(i/ħ)·H11·t)]/dt = a·d[e^(−(i/ħ)·H11·t)]/dt = −a·(i/ħ)·H11·e^(−(i/ħ)·H11·t)

= −(i/ħ)·H11·a·e^(−(i/ħ)·H11·t) = −(i/ħ)·H11·C1

Of course, that reminds us of our generic a·e^(−(i/ħ)·E0·t) wavefunction: we only need to equate H11 with E0 and we’re done! Hence, in a one-state system, the Hamiltonian coefficient is, quite simply, equal to the energy of the system. In fact, that’s a result that can be generalized, as we’ll see below, and so that’s why Feynman says the Hamiltonian ought to be called the energy matrix.
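For those who prefer a numerical sanity check over symbol-pushing, here is a small Python sketch of the one-state case. It plugs C1(t) = a·e^(−(i/ħ)·H11·t) into iħ·dC1/dt = H11·C1, using a finite-difference derivative. The values of ħ, H11 and a are arbitrary assumptions (natural units), chosen only for illustration.

```python
import cmath

HBAR = 1.0        # assumption: natural units, hbar = 1
H11 = 2.0         # hypothetical energy, arbitrary units
a = 0.6 + 0.8j    # any complex coefficient will do

def C1(t):
    """Proposed one-state solution a*exp(-(i/hbar)*H11*t)."""
    return a * cmath.exp(-1j * H11 * t / HBAR)

def lhs(t, dt=1e-6):
    """i*hbar*dC1/dt, with the derivative taken as a central difference."""
    return 1j * HBAR * (C1(t + dt) - C1(t - dt)) / (2 * dt)

t = 0.7
residual = abs(lhs(t) - H11 * C1(t))
print("residual of i*hbar*dC1/dt - H11*C1:", residual)
```

The residual should be tiny (limited only by the finite-difference step), whatever complex value you pick for a.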

In fact, we actually may have two states that are entirely uncoupled, i.e. a system in which there is no dependence of C1 on C2 and vice versa. In that case, the two equations reduce to:

iħ·[dC1/dt] = H11·C1 and iħ·[dC2/dt] = H22·C2

These do not form a coupled system and, hence, their solutions are independent:

C1(t) = a·e^(−(i/ħ)·H11·t) and C2(t) = b·e^(−(i/ħ)·H22·t)

The symmetry of the situation suggests we should equate a and b, and then the normalization condition says that the probabilities have to add up to one, so |C1(t)|² + |C2(t)|² = 1, so we’ll find that a = b = 1/√2.

OK. That’s simple enough, and this story has become quite long, so we should wrap it up. The two ‘clues’ – about symmetry and about the Hamiltonian coefficients being energy levels – lead Feynman to suggest that the Hamiltonian matrix for this particular case should be equal to:


Why? Well… It’s just one of Feynman’s clever guesses, and it yields probability functions that make sense, i.e. they actually describe something real. That’s all. 🙂 I am only half-joking, because it’s a trial-and-error process indeed and, as I’ll explain in a separate section in this post, one needs to be aware of the various approximations involved when doing this stuff. So let’s be explicit about the reasoning here:

  1. We know that H11 = H22 = E0 if the two states would be identical. In other words, if we’d have only one state, rather than two – i.e. if H12 and H21 would be zero – then we’d just plug that in. So that’s what Feynman does. So that’s what we do here too! 🙂
  2. However, H12 and H21 are not zero, of course, and so we assume there’s some amplitude to go from one position to the other by tunneling through the energy barrier and flipping to the other side. Now, we need to assign some value to that amplitude and so we’ll just assume that the energy that’s needed for the nitrogen atom to tunnel through the energy barrier and flip to the other side is equal to A. So we equate H12 and H21 with −A.

Of course, you’ll wonder: why minus A? Why wouldn’t we try H12 = H21 = A? Well… I could say that a particle usually loses potential energy as it moves from one place to another, but… Well… Think about it. Once it’s through, it’s through, isn’t it? And so then the energy is just E0 again. Indeed, if there’s no external field, the + or − sign is quite arbitrary. So what do we choose? The answer is: when considering our molecule in free space, it doesn’t matter. Using +A or −A yields the same probabilities. Indeed, let me give you the amplitudes we get for H11 = H22 = E0 and H12 = H21 = −A:

  1. C1(t) = 〈 1 | ψ 〉 = (1/2)·e^(−(i/ħ)·(E0 − A)·t) + (1/2)·e^(−(i/ħ)·(E0 + A)·t) = e^(−(i/ħ)·E0·t)·cos[(A/ħ)·t]
  2. C2(t) = 〈 2 | ψ 〉 = (1/2)·e^(−(i/ħ)·(E0 − A)·t) – (1/2)·e^(−(i/ħ)·(E0 + A)·t) = i·e^(−(i/ħ)·E0·t)·sin[(A/ħ)·t]

[In case you wonder how we go from those exponentials to a simple sine and cosine factor, remember that the sum of complex conjugates, i.e. e^(iθ) + e^(−iθ), reduces to 2·cosθ, while e^(iθ) − e^(−iθ) reduces to 2·i·sinθ.]

Now, it’s easy to see that, if we’d have used +A rather than −A, we would have gotten something very similar:

  • C1(t) = 〈 1 | ψ 〉 = (1/2)·e^(−(i/ħ)·(E0 + A)·t) + (1/2)·e^(−(i/ħ)·(E0 − A)·t) = e^(−(i/ħ)·E0·t)·cos[(A/ħ)·t]
  • C2(t) = 〈 2 | ψ 〉 = (1/2)·e^(−(i/ħ)·(E0 + A)·t) – (1/2)·e^(−(i/ħ)·(E0 − A)·t) = −i·e^(−(i/ħ)·E0·t)·sin[(A/ħ)·t]

So we get a minus sign in front of our C2(t) function, because cos(−α) = cos(α) but sin(−α) = −sin(α). However, the associated probabilities are exactly the same. For both, we get the same P1(t) and P2(t) functions:

  • P1(t) = |C1(t)|² = cos²[(A/ħ)·t]
  • P2(t) = |C2(t)|² = sin²[(A/ħ)·t]

[Remember: the absolute square of i and −i is |i|² = (√1)² = +1 and |−i|² = (−1)²·|i|² = +1 respectively, so the i and −i in the two C2(t) formulas disappear.]
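If you’d rather let a computer confirm that the sign of A doesn’t change the probabilities, here is a short Python sketch. It evaluates the two pairs of amplitudes given above (one pair for −A, one for +A) and compares the probabilities. The values of ħ, E0 and A are hypothetical (natural units); only the formulas come from the text.

```python
import cmath
import math

HBAR, E0, A = 1.0, 1.0, 0.3   # hypothetical values, natural units

def amplitudes(t, sign):
    """C1(t), C2(t) for H12 = H21 = sign*A (sign = -1 or +1)."""
    first = cmath.exp(-1j * (E0 + sign * A) * t / HBAR)
    second = cmath.exp(-1j * (E0 - sign * A) * t / HBAR)
    return 0.5 * (first + second), 0.5 * (first - second)

max_diff = 0.0
for n in range(50):
    t = 0.1 * n
    c1_minus, c2_minus = amplitudes(t, -1)
    c1_plus, c2_plus = amplitudes(t, +1)
    max_diff = max(
        max_diff,
        abs(abs(c1_minus) ** 2 - abs(c1_plus) ** 2),              # same P1
        abs(abs(c2_minus) ** 2 - abs(c2_plus) ** 2),              # same P2
        abs(abs(c1_minus) ** 2 - math.cos(A * t / HBAR) ** 2),    # P1 = cos^2
        abs(c2_minus + c2_plus),                                  # C2 flips sign
    )

print("largest deviation found:", max_diff)
```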

You’ll remember the graph:


Of course, you’ll say: that plus or minus sign in front of C2(t) should matter somehow, shouldn’t it? Well… Think about it. Taking the absolute square of some complex number – or some complex function, in this case! – amounts to multiplying it with its complex conjugate. Because the complex conjugate of a product is the product of the complex conjugates, it’s easy to see what happens: the e^(−(i/ħ)·E0·t) factor in C1(t) = e^(−(i/ħ)·E0·t)·cos[(A/ħ)·t] and C2(t) = ±i·e^(−(i/ħ)·E0·t)·sin[(A/ħ)·t] gets multiplied by e^(+(i/ħ)·E0·t) and, hence, doesn’t matter: e^(−(i/ħ)·E0·t)·e^(+(i/ħ)·E0·t) = e^0 = 1. The cosine factor in C1(t) = e^(−(i/ħ)·E0·t)·cos[(A/ħ)·t] is real, and so its complex conjugate is the same. Now, the ±i·sin[(A/ħ)·t] factor in C2(t) = ±i·e^(−(i/ħ)·E0·t)·sin[(A/ħ)·t] is a pure imaginary number, and so its complex conjugate is its opposite. For some reason, we’ll find similar solutions for all of the situations we’ll describe below: the factor determining the probability will either be real or, else, a pure imaginary number. Hence, from a math point of view, it really doesn’t matter if we take +A or −A for those H12 and H21 coefficients. We just need to be consistent in our choice, and I must assume that, in order to be consistent, Feynman likes to think of our nitrogen atom borrowing some energy from the system and, hence, temporarily reducing its energy by an amount that’s equal to A. If you have a better interpretation, please do let me know! 🙂

OK. We’re done with this section… Except… Well… I have to show you how we got those C1(t) and C2(t) functions, no? Let me copy Feynman here:

solution

Note that the ‘trick’ involving the addition and subtraction of the differential equations is a trick we’ll use quite often, so please do have a look at it. As for the value of the a and b coefficients – which, as you can see, we’ve equated to 1 in our solutions for C1(t) and C2(t) – we get those because of the following starting condition: we assume that at t = 0, the molecule will be in state 1. Hence, we assume C1(0) = 1 and C2(0) = 0. In other words: we assume that we start out on that P1(t) curve in that graph with the probability functions above, so the C1(0) = 1 and C2(0) = 0 starting condition is equivalent to P1(0) = 1 and P2(0) = 0. Plugging that in gives us a/2 + b/2 = 1 and a/2 − b/2 = 0, which is possible only if a = b = 1.
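As an independent cross-check of that solution, one can also just integrate the two coupled equations numerically. The sketch below is not Feynman’s add-and-subtract trick: it uses a plain fourth-order Runge-Kutta step on iħ·dC1/dt = E0·C1 − A·C2 and iħ·dC2/dt = −A·C1 + E0·C2, starting from C1(0) = 1 and C2(0) = 0, and compares the result with the closed-form amplitudes. The values of ħ, E0 and A, the step size and the helper names are all my own assumptions.

```python
import cmath
import math

HBAR, E0, A = 1.0, 1.0, 0.3   # hypothetical values, natural units

def deriv(c1, c2):
    """Right-hand sides dC1/dt and dC2/dt for H11 = H22 = E0, H12 = H21 = -A."""
    d1 = (-1j / HBAR) * (E0 * c1 - A * c2)
    d2 = (-1j / HBAR) * (-A * c1 + E0 * c2)
    return d1, d2

def rk4_step(c1, c2, dt):
    """One classical fourth-order Runge-Kutta step."""
    k1 = deriv(c1, c2)
    k2 = deriv(c1 + 0.5 * dt * k1[0], c2 + 0.5 * dt * k1[1])
    k3 = deriv(c1 + 0.5 * dt * k2[0], c2 + 0.5 * dt * k2[1])
    k4 = deriv(c1 + dt * k3[0], c2 + dt * k3[1])
    c1 += dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    c2 += dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return c1, c2

# Starting condition from the text: the molecule is in state 1 at t = 0.
c1, c2 = 1.0 + 0j, 0.0 + 0j
dt, steps = 0.001, 2000
for _ in range(steps):
    c1, c2 = rk4_step(c1, c2, dt)

t = dt * steps
phase = cmath.exp(-1j * E0 * t / HBAR)
exact_c1 = phase * math.cos(A * t / HBAR)
exact_c2 = 1j * phase * math.sin(A * t / HBAR)
error = max(abs(c1 - exact_c1), abs(c2 - exact_c2))
print("numerical vs closed form, largest difference:", error)
```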

Of course, you’ll say: what if we’d choose to start out with state 2, so our starting condition is P1(0) = 0 and P2(0) = 1? Then a = 1 and b = −1, and we get the solution we got when equating H12 and H21 with +A, rather than with −A. So you can think about that symmetry once again: when we’re in free space, then it’s quite arbitrary what we call ‘up’ or ‘down’.

So… Well… That’s all great. I should, perhaps, just add one more note, and that’s on that A/ħ value. We calculated it in the previous post, because we wanted to actually calculate the period of those P1(t) and P2(t) functions. Because we’re talking the square of a cosine and a sine respectively, the period is equal to π, rather than 2π, so we wrote: (A/ħ)·T = π ⇔ T = π·ħ/A. Now, the separation between the two energy levels E0 + A and E0 − A, so that’s 2A, has been measured as being equal, more or less, to 2A ≈ 10⁻⁴ eV.

How does one measure that? As mentioned above, I’ll show you, in a moment, that, when applying some external field, the plus and minus sign do matter, and the separation between those two energy levels E0 + A and E0 − A will effectively represent something physical. More in particular, we’ll have transitions from one energy level to another and that corresponds to electromagnetic radiation being emitted or absorbed, and so there’s a relation between the energy and the frequency of that radiation. To be precise, we can write 2A = h·f0. The frequency of the radiation that’s being absorbed or emitted is 23.79 GHz, which corresponds to microwave radiation with a wavelength of λ = c/f0 = 1.26 cm. Hence, 2·A ≈ 25×10⁹ Hz times 4×10⁻¹⁵ eV·s = 10⁻⁴ eV, indeed, and, therefore, we can write: T = π·ħ/A ≈ 3.14 × 6.6×10⁻¹⁶ eV·s divided by 0.5×10⁻⁴ eV, so that’s 40×10⁻¹² seconds = 40 picoseconds. That’s 40 trillionths of a second. So that’s very short, and surely much shorter than the time that’s associated with, say, a freely emitting sodium atom, which is of the order of 3.2×10⁻⁸ seconds. You may think that makes sense, because the photon energy is so much lower: a sodium light photon is associated with an energy equal to E = h·f = 500×10¹² Hz times 4×10⁻¹⁵ eV·s = 2 eV, so that’s 20,000 times 10⁻⁴ eV.
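The back-of-the-envelope numbers above are easy to re-run with less rounding. The sketch below is just that: a re-calculation, assuming the standard values for Planck’s constant (in eV·s) and the speed of light, and taking the 23.79 GHz frequency from the text. The variable names are mine.

```python
import math

H_EV_S = 4.135667696e-15          # Planck constant in eV*s (assumed standard value)
HBAR_EV_S = H_EV_S / (2 * math.pi)
C_M_S = 299_792_458.0             # speed of light in m/s

f0 = 23.79e9                      # Hz, the frequency quoted in the text
two_A = H_EV_S * f0               # energy separation 2A = h*f0, in eV
A = two_A / 2
T = math.pi * HBAR_EV_S / A       # period of the probability functions, in s
wavelength_cm = 100 * C_M_S / f0  # lambda = c/f0, in cm

print("2A in eV         :", two_A)
print("T in picoseconds :", T * 1e12)
print("wavelength in cm :", wavelength_cm)
```

With these inputs, 2A comes out just under 10⁻⁴ eV, T at roughly 42 picoseconds and the wavelength at about 1.26 cm, so the rounded figures in the text hold up.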

There’s a funny thing, however. An oscillation of a frequency of 500 tera-hertz that lasts 3.2×10⁻⁸ seconds is equivalent to 500×10¹² Hz times 3.2×10⁻⁸ s ≈ 16 million cycles. However, an oscillation of a frequency of 23.79 giga-hertz that only lasts 40×10⁻¹² seconds is equivalent to 23.79×10⁹ Hz times 40×10⁻¹² s ≈ 1! One cycle only? We’re surely not talking resonance here!
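For what it’s worth, that ‘one cycle’ is not a numerical coincidence: it follows directly from the two formulas we used. The short derivation below only combines T = π·ħ/A with 2A = h·f0; what it means physically is a separate question, which I am not answering here.

```latex
\begin{align*}
2A &= h f_0 = 2\pi\hbar f_0 \;\Rightarrow\; A = \pi\hbar f_0 \\
T &= \frac{\pi\hbar}{A} = \frac{\pi\hbar}{\pi\hbar f_0} = \frac{1}{f_0}
\;\Rightarrow\; f_0 \cdot T = 1
\end{align*}
```

So the period of the P1(t) and P2(t) functions is, by construction, exactly one period of the 23.79 GHz radiation.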

So… Well… I am just flagging it here. We’ll have to do some more thinking about that later. [I’ve added an addendum that may or may not help us in this regard. :-)]

The two-state system in a field

As mentioned above, when there is no external force field, the ‘up’ or ‘down’ direction of the nitrogen atom is defined with regard to its spin around its axis of symmetry, so with regard to the molecule itself. However, when we apply an external electromagnetic field, as shown below, we do have some external reference frame.

Now, the external reference frame – i.e. the physics of the situation, really – may make it more convenient to define the whole system using another set of base states, which we’ll refer to as I and II, rather than 1 and 2. Indeed, you’ve seen the picture below: it shows a state selector, or a filter as we called it. In this case, there’s a filtering according to whether our ammonia molecule is in state I or, alternatively, state II. It’s like a Stern-Gerlach apparatus splitting an electron beam according to the spin state of the electrons, which is ‘up’ or ‘down’ too, but in a totally different way than our ammonia molecule. Indeed, the ‘up’ and ‘down’ spin of an electron has to do with its magnetic moment and its angular momentum. However, there are a lot of similarities here, and so you may want to compare the two situations indeed, i.e. the electron beam in an inhomogeneous magnetic field versus the ammonia beam in an inhomogeneous electric field.

electric field

Now, when reading Feynman, as he walks us through the relevant Lecture on all of this, you get the impression that it’s the I and II states only that have some kind of physical or geometric interpretation. That’s not the case. Of course, the diagram of the state selector above makes it very obvious that these new I and II base states make very much sense in regard to the orientation of the field, i.e. with regard to external space, rather than with respect to the position of our nitrogen atom vis-á-vis the hydrogens. But… Well… Look at the image below: the direction of the field (which we denote by ε because we’ve been using the E for energy) obviously matters when defining the old ‘up’ and ‘down’ states of our nitrogen atom too!

In other words, our previous | 1 〉 and | 2 〉 base states acquire a new meaning too: it obviously matters whether or not the electric dipole moment of the molecule is in the same or, conversely, in the opposite direction of the field. To be precise, the presence of the electromagnetic field suddenly gives the energy levels that we’d associate with these two states a very different physical interpretation.


Indeed, from the illustration above, it’s easy to see that the electric dipole moment of this particular molecule in state 1 is in the opposite direction and, therefore, temporarily ignoring the amplitude to flip over (so we do not think of A for just a brief little moment), the energy that we’d associate with state 1 would be equal to E0 + με. Likewise, the energy we’d associate with state 2 is equal to E0 − με. Indeed, you’ll remember that the (potential) energy of an electric dipole is equal to the vector dot product of the electric dipole moment μ and the field vector ε, but with a minus sign in front so as to get the sign for the energy right. So the energy is equal to −μ·ε = −|μ|·|ε|·cosθ, with θ the angle between both vectors. Now, the illustration above makes it clear that state 1 and 2 are defined for θ = π and θ = 0 respectively. [And, yes! Please do note that state 1 is the highest energy level, because it’s associated with the highest potential energy: the electric dipole moment μ of our ammonia molecule will – obviously! – want to align itself with the electric field ε! Just think of what it would imply to turn the molecule in the field!]

Therefore, using the same hunches as the ones we used in the free space example, Feynman suggests that, when some external electric field is involved, we should use the following Hamiltonian matrix:

H-matrix 2
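The image with the matrix did not survive here, so let me spell out what it presumably showed. Going by the surrounding text (diagonal elements E0 + με and E0 − με, and the same −A ‘flip’ amplitude as in free space), the Hamiltonian would read as follows. Take it as a reconstruction from the text, not as a copy of the original illustration.

```latex
\[
H =
\begin{pmatrix}
H_{11} & H_{12} \\
H_{21} & H_{22}
\end{pmatrix}
=
\begin{pmatrix}
E_0 + \mu\varepsilon & -A \\
-A & E_0 - \mu\varepsilon
\end{pmatrix}
\]
```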

So we’ll need to solve a similar set of differential equations with this Hamiltonian now. We’ll do that later and, as mentioned above, it will be more convenient to switch to another set of base states, or another ‘representation’ as it’s referred to. But… Well… Let’s not get too much ahead of ourselves: I’ll say something about that before we’ll start solving the thing, but let’s first look at that Hamiltonian once more.

When I say that Feynman uses the same clues here, then… Well… That’s true and not true. You should note that the diagonal elements in the Hamiltonian above are not the same: E0 + με ≠ E0 − με. So we’ve lost that symmetry of free space which, from a math point of view, was reflected in those identical H11 = H22 = E0 coefficients.

That should be obvious from what I write above: state 1 and state 2 are no longer those 1 and 2 states we described when looking at the molecule in free space. Indeed, the | 1 〉 and | 2 〉 states are still ‘up’ or ‘down’, but the illustration above also makes it clear we’re defining state 1 and state 2 not only with respect to the molecule’s spin around its own axis of symmetry but also vis-á-vis some direction in space. To be precise, we’re defining state 1 and state 2 here with respect to the direction of the electric field ε. Now that makes a really big difference in terms of interpreting what’s going on.

In fact, the ‘splitting’ of the energy levels because of that amplitude A is now something physical too, i.e. something that goes beyond just modeling the uncertainty involved. In fact, we’ll find it convenient to distinguish two new energy levels, which we’ll write as EI = E0 + A and EII = E0 − A respectively. They are, of course, related to those new base states | I 〉 and | II 〉 that we’ll want to use. So the E0 + A and E0 − A energy levels themselves will acquire some physical meaning, and especially the separation between them, i.e. the value of 2A. Indeed, EI = E0 + A and EII = E0 − A will effectively represent an ‘upper’ and a ‘lower’ energy level respectively.

But, again, I am getting ahead of myself. Let’s first, as part of working towards a solution for our equations, look at what happens if and when we’d switch to another representation indeed.

Switching to another representation

Let me remind you of what I wrote in my post on quantum math in this regard. The actual state of our ammonia molecule – or any quantum-mechanical system really – is always to be described in terms of a set of base states. For example, if we have two possible base states only, we’ll write:

| φ 〉 = | 1 〉 C1 + | 2 〉 C2

You’ll say: why? Our molecule is obviously always in either state 1 or state 2, isn’t it? Well… Yes and no. That’s the mystery of quantum mechanics: it is and it isn’t. As long as we don’t measure it, there is an amplitude for it to be in state 1 and an amplitude for it to be in state 2. So we can only make sense of its state by actually calculating 〈 1 | φ 〉 and 〈 2 | φ 〉 which, unsurprisingly are equal to 〈 1 | φ 〉 = 〈 1 | 1 〉 C1 + 〈 1 | 2 〉 C2  = C1(t) and 〈 2 | φ 〉 = 〈 2 | 1 〉 C1 + 〈 2 | 2 〉 C2  = C2(t) respectively, and so these two functions give us the probabilities P1(t) and  P2(t) respectively. So that’s Schrödinger’s cat really: the cat is dead or alive, but we don’t know until we open the box, and we only have a probability function – so we can say that it’s probably dead or probably alive, depending on the odds – as long as we do not open the box. It’s as simple as that.

Now, the ‘dead’ and ‘alive’ conditions are, obviously, the ‘base states’ in Schrödinger’s rather famous example, and we can write them as | DEAD 〉 and | ALIVE 〉. You’d agree it would be difficult to find another representation. For example, it doesn’t make much sense to say that we’ve rotated the two base states over 90 degrees and we now have two new states equal to (1/√2)·| DEAD 〉 – (1/√2)·| ALIVE 〉 and (1/√2)·| DEAD 〉 + (1/√2)·| ALIVE 〉 respectively. There’s no direction in space in regard to which we’re defining those two base states: dead is dead, and alive is alive.

The situation really resembles our ammonia molecule in free space: there’s no external reference against which to define the base states. However, as soon as some external field is involved, we do have a direction in space and, as mentioned above, our base states are now defined with respect to a particular orientation in space. That implies two things. The first is that we should no longer say that our molecule will always be in either state 1 or state 2. There’s no reason for it to be perfectly aligned with or against the field. Its orientation can be anything really, and so its state is likely to be some combination of those two pure base states | 1 〉 and | 2 〉.

The second thing is that we may choose another set of base states, and specify the very same state in terms of the new base states. So, assuming we choose some other set of base states | I 〉 and | II 〉, we can write the very same state | φ 〉 = | 1 〉 C1 + | 2 〉 C2 as:

| φ 〉 = | I 〉 CI + | II 〉 CII

It’s really like what you learned about vectors in high school: one can go from one set of base vectors to another by a transformation, such as, for example, a rotation, or a translation. It’s just that, just like in high school, we need some direction in regard to which we define our rotation or our translation.

For state vectors, I showed how a rotation of base states worked in one of my posts on two-state systems. To be specific, we had the following relation between the two representations:


The (1/√2) factor is there because of the normalization condition, and the two-by-two matrix equals the transformation matrix for a rotation of a state filtering apparatus about the y-axis, over an angle equal to (minus) 90 degrees, which we wrote as:


The y-axis? What y-axis? What state filtering apparatus? Just relax. Think about what you’ve learned already. The orientations are shown below: the S apparatus separates ‘up’ and ‘down’ states along the z-axis, while the T-apparatus does so along an axis that is tilted, about the y-axis, over an angle equal to α, or φ, as it’s written in the table above.


Of course, we don’t really introduce an apparatus at this or that angle. We just introduced an electromagnetic field, which re-defined our | 1 〉 and | 2 〉 base states and, therefore, through the rotational transformation matrix, also defines our | I 〉 and | II 〉 base states.

[…] You may have lost me by now, and so then you’ll want to skip to the next section. That’s fine. Just remember that the representations in terms of | I 〉 and | II 〉 base states or in terms of | 1 〉 and | 2 〉 base states are mathematically equivalent. Having said that, if you’re reading this post, and you want to understand it, truly (because you want to truly understand quantum mechanics), then you should try to stick with me here. 🙂 Indeed, there’s a zillion things you could think about right now, but you should stick to the math now. Using that transformation matrix, we can relate the CI and CII coefficients in the | φ 〉 = | I 〉 CI + | II 〉 CII expression to the C1 and C2 coefficients in the | φ 〉 = | 1 〉 C1 + | 2 〉 C2 expression. Indeed, we wrote:

  • CI = 〈 I | ψ 〉 = (1/√2)·(C1 − C2)
  • CII = 〈 II | ψ 〉 = (1/√2)·(C1 + C2)

That’s exactly the same as writing:


OK. […] Waw! You just took a huge leap, because we can now compare the two sets of differential equations:

set of equations

They’re mathematically equivalent, but the mathematical behavior of the functions involved is very different. Indeed, unlike the C1(t) and C2(t) amplitudes, we find that the CI(t) and CII(t) amplitudes are stationary, i.e. the associated probabilities – which we find by taking the absolute square of the amplitudes, as usual – do not vary in time. To be precise, if you write it all out and simplify, you’ll find that the CI(t) and CII(t) amplitudes are equal to:

  • CI(t) = 〈 I | ψ 〉 = (1/√2)·(C1 − C2) = (1/√2)·e^(−(i/ħ)·(E0 + A)·t) = (1/√2)·e^(−(i/ħ)·EI·t)
  • CII(t) = 〈 II | ψ 〉 = (1/√2)·(C1 + C2) = (1/√2)·e^(−(i/ħ)·(E0 − A)·t) = (1/√2)·e^(−(i/ħ)·EII·t)

As the absolute square of the exponential is equal to one, the associated probabilities, i.e. |CI(t)|² and |CII(t)|², are, quite simply, equal to |1/√2|² = 1/2. Now, it is very tempting to say that this means that our ammonia molecule has an equal chance to be in state I or state II. In fact, while I may have said something like that in my previous posts, that’s not how one should interpret this. The chance of our molecule being exactly in state I or state II, or in state 1 or state 2 is varying with time, with the probability being ‘dumped’ from one state to the other all of the time.
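The ‘write it all out and simplify’ step can also be checked numerically. The Python sketch below builds CI and CII from the free-space C1(t) and C2(t) amplitudes, using the (1/√2)·(C1 ∓ C2) combinations above, and verifies that their absolute squares stay at 1/2 while their phases turn at the E0 + A and E0 − A frequencies. The values of ħ, E0 and A are hypothetical (natural units), and the function names are mine.

```python
import cmath
import math

HBAR, E0, A = 1.0, 1.0, 0.3   # hypothetical values, natural units

def c1_c2(t):
    """Free-space amplitudes for the C1(0) = 1, C2(0) = 0 starting condition."""
    phase = cmath.exp(-1j * E0 * t / HBAR)
    return phase * math.cos(A * t / HBAR), 1j * phase * math.sin(A * t / HBAR)

def cI_cII(t):
    """Amplitudes in the I/II representation: (C1 - C2)/sqrt(2), (C1 + C2)/sqrt(2)."""
    c1, c2 = c1_c2(t)
    return (c1 - c2) / math.sqrt(2), (c1 + c2) / math.sqrt(2)

times = [0.0, 0.5, 1.0, 2.5, 7.0]
worst = 0.0
for t in times:
    cI, cII = cI_cII(t)
    expected_I = cmath.exp(-1j * (E0 + A) * t / HBAR) / math.sqrt(2)
    expected_II = cmath.exp(-1j * (E0 - A) * t / HBAR) / math.sqrt(2)
    worst = max(worst,
                abs(cI - expected_I), abs(cII - expected_II),
                abs(abs(cI) ** 2 - 0.5), abs(abs(cII) ** 2 - 0.5))
    print(t, abs(cI) ** 2, abs(cII) ** 2)

print("largest deviation from the stationary-state formulas:", worst)
```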

I mean… The electric dipole moment can point in any direction, really. So saying that our molecule has a 50/50 chance of being in state 1 or state 2 makes no sense. Likewise, saying that our molecule has a 50/50 chance of being in state I or state II makes no sense either. Indeed, the state of our molecule is specified by the | φ 〉 = | I 〉 CI + | II 〉 CII = | 1 〉 C1 + | 2 〉 C2 equations, and neither of these two expressions is a stationary state. They mix two frequencies, because they mix two energy levels.

Having said that, we’re talking quantum mechanics here and, therefore, an external inhomogeneous electric field will effectively split the ammonia molecules according to their state. The situation is really like what a Stern-Gerlach apparatus does to a beam of electrons: it will split the beam according to the electron’s spin, which is either ‘up’ or, else, ‘down’, as shown in the graph below:

[Graph: splitting of the electron’s energy levels, with energy on the vertical axis]

The graph for our ammonia molecule, shown below, is very similar. The vertical axis measures the same: energy. And the horizontal axis measures με, which increases with the strength of the electric field ε. So we see a similar ‘splitting’ of the energy of the molecule in an external electric field.

[Graph: energy levels of the ammonia molecule as a function of με]

How should we explain this? It is very tempting to think that the presence of an external force field causes the electrons, or the ammonia molecule, to ‘snap into’ one of the two possible states, which are referred to as state I and state II respectively in the illustration of the ammonia state selector below. But… Well… Here we’re entering the murky waters of actually interpreting quantum mechanics, for which (a) we have no time, and (b) we are not qualified. So you should just believe, or take for granted, what’s being shown here: an inhomogeneous electric field will split our ammonia beam according to the state of the molecules, which we define as I and II respectively, and which are associated with the energies E0 + A and E0 − A respectively.

[Illustration: the ammonia state selector, with its inhomogeneous electric field]

As mentioned above, you should note that these two states are stationary. The Hamiltonian equations which, as they always do, describe the dynamics of this system, imply that the amplitude to go from state I to state II, or vice versa, is zero. To make sure you ‘get’ that, I reproduce the associated Hamiltonian matrix once again:

[Hamiltonian matrix in terms of the I and II base states: HI,I = E0 + A, HI,II = 0, HII,I = 0, HII,II = E0 − A]

Of course, that will change when we start our analysis of what’s happening in the maser. Indeed, we will have some non-zero HI,II and HII,I amplitudes in the resonant cavity of our ammonia maser, in which we’ll have an oscillating electric field and, as a result, induced transitions from state I to II and vice versa. However, that’s for later. While I’ll quickly insert the full picture diagram below, you should, for the moment, just think about those two stationary states and those two zeroes. 🙂

[Diagram: the ammonia maser]

Capito? If not… Well… Start reading this post again, I’d say. 🙂

Intermezzo: on approximations

At this point, I need to say a few things about all of the approximations involved, because it can be quite confusing indeed. So let’s take a closer look at those energy levels and the related Hamiltonian coefficients. In fact, in his Lectures, Feynman shows us that we can always have a general solution for the Hamiltonian equations describing a two-state system whenever we have constant Hamiltonian coefficients. That general solution – which, mind you, is derived assuming Hamiltonian coefficients that do not depend on time – can always be written in terms of two stationary base states, i.e. states with a definite energy and, hence, a constant probability. The equations, and the two definite energy levels are:



That yields the following values for the energy levels for the stationary states:

EI = E0 + √(A² + μ²ε²) and EII = E0 − √(A² + μ²ε²)

Now, that’s very different from the EI = E0 + A and EII = E0 − A energy levels for those stationary states we had defined in the previous section: those stationary states had no square root, and no μ²ε², in their energy. In fact, that sort of answers the question: if there’s no external field, then that μ²ε² term is zero, and the square root in the expression becomes ±√A² = ±A. So then we’re back to our EI = E0 + A and EII = E0 − A formulas. The whole point, however, is that we will actually have an electric field in that cavity. Moreover, it’s going to be a field that varies in time, which we’ll write as ε = 2ε0·cos(ω·t).


Now, part of the confusion in Feynman’s approach is that he constantly switches between representing the system in terms of the I and II base states and the 1 and 2 base states respectively. For a good understanding, we should compare with our original representation of the dynamics in free space, for which the Hamiltonian was the following one:


That matrix can easily be related to the new one we’re going to have to solve, which is equal to:

[Hamiltonian matrix in terms of the 1 and 2 base states: H11 = E0 + με, H12 = −A, H21 = −A, H22 = E0 − με]

The interpretation is easy if we look at that illustration again:


If the direction of the electric dipole moment is opposite to the direction of the field ε, as it is for state 1, then the associated energy is equal to −μ·ε = −|μ|·|ε|·cosθ = −μ·ε·cos(π) = +με. Conversely, for state 2, we find −μ·ε·cos(0) = −με for the energy that’s associated with the dipole moment. You can and should think about the physics involved here, because it makes sense! Thinking of amplitudes, you should note that the +με and −με terms effectively change the H11 and H22 coefficients, so they change the amplitude to stay in state 1 or state 2 respectively. That, of course, will have an impact on the associated probabilities, and so that’s why we’re talking of induced transitions now.

Having said that, the Hamiltonian matrix above keeps the −A for H12 and H21, so the matrix captures spontaneous transitions too!
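To see how that matrix connects to the energy levels with the square root, here is a minimal sketch. It assumes the coefficients described above (H11 = E0 + με, H22 = E0 − με, H12 = H21 = −A) and uses the general two-state formula for constant coefficients, E = (H11 + H22)/2 ± √[(H11 − H22)²/4 + H12·H21]. The function name and the numbers are purely illustrative.

```python
import math


def energy_levels(h11, h12, h21, h22):
    """Definite energies (E_I, E_II) of a two-state system with constant,
    real Hamiltonian coefficients."""
    mean = (h11 + h22) / 2
    root = math.sqrt(((h11 - h22) / 2) ** 2 + h12 * h21)
    return mean + root, mean - root


# Hypothetical values in arbitrary energy units.
E0, A = 0.0, 1.0
for mu_eps in (0.0, 0.5, 2.0):
    e_I, e_II = energy_levels(E0 + mu_eps, -A, -A, E0 - mu_eps)
    print(f"mu*eps = {mu_eps:.1f}:  E_I = {e_I:+.4f}  E_II = {e_II:+.4f}  "
          f"sqrt(A^2 + mu^2 eps^2) = {math.sqrt(A**2 + mu_eps**2):.4f}")
```

With με = 0 the levels are just E0 ± A, and they move apart as the field grows, which is the ‘splitting’ the graph shows.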

Still… You may wonder why Feynman doesn’t use those EI and EII formulas with the square root because that would give us some exact solution, wouldn’t it? The answer to that question is: maybe it would, but would you know how to solve those equations? We’ll have a varying field, remember? So our Hamiltonian H11 and H22 coefficients will no longer be constant, but time-dependent. As you’re going to see, it takes Feynman three pages to solve the whole thing using the +με and −με approximation. So just imagine how complicated it would be using that square root expression! [By the way, do have a look at those asymptotic curves in that illustration showing the splitting of energy levels above, so you see what that approximation looks like.]

So that’s the real answer: we need to simplify somehow, so as to get any solutions at all!

Of course, it’s all quite confusing. Feynman first notes that, for strong fields, the A² in that square root is small as compared to μ²ε², thereby justifying the use of the simplified EI = E0 + με = H11 and EII = E0 − με = H22 coefficients. He then continues and bluntly uses the very same square root expression to explain how that state selector works, saying that the electric field in the state selector will be rather weak and, hence, that με will be much smaller than A, so one can use the following approximation for the square root in the expressions above:

√(A² + μ²ε²) ≈ A + μ²ε²/(2A)

The energy expressions then reduce to:

EI = E0 + A + μ²ε²/(2A) and EII = E0 − A − μ²ε²/(2A)
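If you want to see how good the two approximations are, the sketch below compares the exact square root √(A² + μ²ε²) with the weak-field form A + μ²ε²/(2A) and with the strong-field form με. The function names and the sample values are illustrative assumptions, in arbitrary energy units.

```python
import math


def exact_root(A, mu_eps):
    """The square root that appears in the exact energy levels."""
    return math.sqrt(A ** 2 + mu_eps ** 2)


def weak_field_root(A, mu_eps):
    """First-order expansion, reasonable when mu*eps is much smaller than A."""
    return A + mu_eps ** 2 / (2 * A)


def strong_field_root(mu_eps):
    """Leading term when mu*eps is much larger than A."""
    return mu_eps


A = 1.0
for mu_eps in (0.05, 0.1, 0.5, 5.0, 20.0):
    print(f"mu*eps/A = {mu_eps:5.2f}  exact = {exact_root(A, mu_eps):8.4f}  "
          f"weak = {weak_field_root(A, mu_eps):8.4f}  "
          f"strong = {strong_field_root(mu_eps):8.4f}")
```

The weak-field column tracks the exact one closely for small με/A and drifts off for large values, while the strong-field column does the opposite.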

And then we can calculate the force on the molecules as:


So the electric field in the state selector is weak, but the electric field in the cavity is supposed to be strong, and so… Well… That’s it, really. The bottom line is that we’ve a beam of ammonia molecules that are all in state I, and it’s what happens with that beam then, that is being described by our new set of differential equations:


Solving the equations

As all molecules in our ammonia beam are described in terms of the | I 〉 and | II 〉 base states – as evidenced by the fact that we say all molecules that enter the cavity are in state I – we need to switch to that representation. We do that by using that transformation above, so we write:

  • CI = 〈 I | ψ 〉 = (1/√2)·(C1 − C2)
  • CII = 〈 II | ψ 〉 = (1/√2)·(C1 + C2)

Keeping these ‘definitions’ of CI and CII in mind, you should then add the two differential equations, divide the result by the square root of 2, and you should get the following new equation:


Please! Do it and verify the result! You want to learn something here, no? 🙂

Likewise, subtracting the two differential equations, we get:


We can re-write this as:

[Image: the new set of differential equations]
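For those who want to check the algebra, the sum and the difference can be written out as follows. This is a sketch that assumes the Hamiltonian coefficients discussed above, i.e. H11 = E0 + με, H22 = E0 − με and H12 = H21 = −A, together with the definitions CI = (C1 − C2)/√2 and CII = (C1 + C2)/√2.

```latex
\begin{aligned}
i\hbar\,\frac{dC_1}{dt} &= (E_0 + \mu\varepsilon)\,C_1 - A\,C_2 \\
i\hbar\,\frac{dC_2}{dt} &= -A\,C_1 + (E_0 - \mu\varepsilon)\,C_2 \\[6pt]
\text{sum}/\sqrt{2}:\quad
i\hbar\,\frac{dC_{II}}{dt} &= (E_0 - A)\,C_{II} + \mu\varepsilon\,C_{I} \\
\text{difference}/\sqrt{2}:\quad
i\hbar\,\frac{dC_{I}}{dt} &= (E_0 + A)\,C_{I} + \mu\varepsilon\,C_{II}
\end{aligned}
```

Note how the −A terms end up in the diagonal coefficients, while με is now the only thing that couples CI and CII.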

Now, the problem is that the Hamiltonian coefficients here are not constant. To be precise, the electric field ε varies in time. We wrote ε = 2ε0·cos(ω·t).

So HI,II and HII,I, which are equal to με, are not constant: we’ve got Hamiltonian coefficients that are a function of time themselves. […] So… Well… We just need to get on with it and try to finally solve this thing. Let me just copy Feynman as he grinds through this:


This is only the first step in the process. Feynman just takes two trial functions, which are really similar to the very general C = a·e^[−(i/ħ)·H11·t] function we presented when only one equation was involved, or – if you prefer a set of two equations – those CI(t) = a·e^[−(i/ħ)·EI·t] and CII(t) = b·e^[−(i/ħ)·EII·t] equations above. The difference is that the coefficients in front, i.e. γI and γII, are not some (complex) constant, but functions of time themselves. The next step in the derivation is as follows:


One needs to do a bit of gymnastics here as well to follow what’s going on, but please do check and you’ll see it works. Feynman derives another set of differential equations here, and they specify these γI = γI(t) and γII = γII(t) functions. These equations are written in terms of the frequency of the field, i.e. ω, and the resonant frequency ω0, which we mentioned above when calculating that 23.79 GHz frequency from the 2A = h·f0 equation. So ω0 is the same molecular resonance frequency but expressed as an angular frequency, so ω0 = 2π·f0 = 2A/ħ. He then proceeds to simplify, using assumptions one should check. He then continues:


That gives us what we presented in the previous post:


So… Well… What to say? I explained those probability functions in my previous post, indeed. We’ve got two probabilities here:

  • PI = cos²[(με0/ħ)·t]
  • PII = sin²[(με0/ħ)·t]

So that’s just like the P1 = cos²[(A/ħ)·t] and P2 = sin²[(A/ħ)·t] probabilities we found for spontaneous transitions. But here we are talking about induced transitions.
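You don’t have to take the approximations on faith: the equations can also be integrated numerically. The sketch below is my own check, not Feynman’s method. It assumes the coupled equations iħ·dCI/dt = (E0 + A)·CI + με·CII and iħ·dCII/dt = (E0 − A)·CII + με·CI, a field ε = 2ε0·cos(ω·t) at resonance (ω = ω0 = 2A/ħ), and a molecule entering in state I. It uses a plain fourth-order Runge-Kutta step, with ħ = 1 and made-up values for A and με0 (με0 much smaller than A).

```python
import math


def simulate_induced_transition(mu_eps0=0.02, A=1.0, E0=0.0, hbar=1.0, dt=0.005):
    """Integrate the I/II equations up to t_end = pi*hbar/(2*mu_eps0).

    Returns (t_end, P_I, P_II). The approximate solution predicts
    P_I = cos^2(mu_eps0*t/hbar) = 0 and P_II = 1 at t_end.
    """
    omega0 = 2.0 * A / hbar                 # field tuned to resonance
    t_end = math.pi * hbar / (2.0 * mu_eps0)
    n_steps = int(round(t_end / dt))
    dt = t_end / n_steps                    # make the steps fit exactly

    def rhs(t, c_I, c_II):
        mu_eps = 2.0 * mu_eps0 * math.cos(omega0 * t)
        d_I = (-1j / hbar) * ((E0 + A) * c_I + mu_eps * c_II)
        d_II = (-1j / hbar) * ((E0 - A) * c_II + mu_eps * c_I)
        return d_I, d_II

    t, c_I, c_II = 0.0, 1.0 + 0.0j, 0.0 + 0.0j   # all molecules enter in state I
    for _ in range(n_steps):
        k1 = rhs(t, c_I, c_II)
        k2 = rhs(t + dt / 2, c_I + dt / 2 * k1[0], c_II + dt / 2 * k1[1])
        k3 = rhs(t + dt / 2, c_I + dt / 2 * k2[0], c_II + dt / 2 * k2[1])
        k4 = rhs(t + dt, c_I + dt * k3[0], c_II + dt * k3[1])
        c_I += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c_II += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += dt
    return t_end, abs(c_I) ** 2, abs(c_II) ** 2


t_end, p_I, p_II = simulate_induced_transition()
print(f"t = {t_end:.2f}:  P_I = {p_I:.4f}  P_II = {p_II:.4f}")
```

For a weak field like this one, the numerical PII at t = π·ħ/(2με0) should come out very close to one, with small wiggles that come from the rapidly oscillating terms the approximation throws away.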

As you can see, the frequency and, hence, the period, depend on the strength, or magnitude, of the electric field, i.e. the ε0 constant in the ε = 2ε0·cos(ω·t) expression. The natural unit for measuring time would be the period once again, which we can easily calculate as (με0/ħ)·T = π ⇔ T = π·ħ/με0.

Now, we had that T = (π·ħ)/A expression above, which allowed us to calculate the period of the spontaneous transitions, which we found was like 40 picoseconds, i.e. 40×10⁻¹² seconds. The T = (π·ħ)/(με0) expression is very similar: it gives us the period of the induced transitions. In fact, if we write Tinduced = (π·ħ)/(με0) and Tspontaneous = (π·ħ)/A, then we can take the ratio to find:

Tinduced/Tspontaneous = [(π·ħ)/(με0)]/[(π·ħ)/A] = A/με0

This A/με0 ratio is greater than one, so Tinduced/Tspontaneous is greater than one, which, in turn, means that the presence of our electric field – which, let me remind you, dances to the beat of the resonant frequency – causes a slower transition than we would have had if the oscillating electric field were not present.

But – Hey! – that’s the wrong comparison! Remember all molecules enter in a stationary state, as they’ve been selected so as to ensure they’re in state I. So there is no such thing as a spontaneous transition frequency here! They’re all polarized, so to speak, and they would remain that way if there was no field in the cavity. So if there was no oscillating electric field, they would never transition. Nothing would happen! Well… In terms of our particular set of base states, of course! Why? Well… Look at the Hamiltonian coefficients HI,II = HII,I = με: these coefficients are zero if ε is zero. So… Well… That says it all.

So that’s what it’s all about: induced emission and, as I explained in my previous post, because all molecules enter in state I, i.e. the upper energy state, literally, they all ‘dump’ a net amount of energy equal to 2A into the cavity on the occasion of their first transition. The molecules then keep dancing, of course, and so they absorb and emit the same amount as they go through the cavity, but… Well… We’ve got a net contribution here, which is not only enough to maintain the cavity oscillations, but actually also provides a small excess of power that can be drawn from the cavity as microwave radiation of the same frequency.

As Feynman notes, an exact description of what actually happens requires an understanding of the quantum mechanics of the field in the cavity, i.e. quantum field theory, which I haven’t studied yet. But… Well… That’s for later, I guess. 🙂

Post scriptum: The sheer length of this post shows we’re not doing something that’s easy here. Frankly, I feel the whole analysis is still quite obscure, in the sense that – despite looking at this thing again and again – it’s hard to sort of interpret what’s going on, in a physical sense that is. But perhaps one shouldn’t try that. I’ve quoted Feynman’s view on how easy or how difficult it is to ‘understand’ quantum mechanics a couple of times already, so let me do it once more:

“Because atomic behavior is so unlike ordinary experience, it is very difficult to get used to, and it appears peculiar and mysterious to everyone—both to the novice and to the experienced physicist. Even the experts do not understand it the way they would like to, and it is perfectly reasonable that they should not, because all of direct, human experience and human intuition applies to large objects.”

So… Well… I’ll grind through the remaining Lectures now – I am halfway through Volume III now – and then re-visit all of this. Despite Feynman’s warning, I want to understand it the way I like to, even if I don’t quite know what way that is right now. 🙂

Addendum: As for those cycles and periods, I noted a couple of times already that the Planck-Einstein equation E = h·f can usefully be re-written as E/f = h, as it gives a physical interpretation to the value of the Planck constant. In fact, I said h is the energy that’s associated with one cycle, regardless of the frequency of the radiation involved. Indeed, the energy of a photon divided by the number of cycles per second, should give us the energy per cycle, no?

Well… Yes and no. Planck’s constant h and the frequency f are both expressed referencing the time unit. However, if we say that a sodium atom emits one photon only as its electron transitions from a higher energy level to a lower one, and if we say that involves a decay time of the order of 3.2×10⁻⁸ seconds, then what we’re saying really is that a sodium light photon will ‘pack’ like 16 million cycles, which is what we get when we multiply the number of cycles per second (i.e. the mentioned frequency of 500×10¹² Hz) by the decay time (i.e. 3.2×10⁻⁸ seconds): (500×10¹² Hz)·(3.2×10⁻⁸ s) = 16×10⁶ cycles, indeed. So the energy per cycle is 2.068 eV (i.e. the photon energy) divided by 16×10⁶, so that’s 0.129×10⁻⁶ eV. Unsurprisingly, that’s what we get when we divide h by 3.2×10⁻⁸ s: (4.13567×10⁻¹⁵ eV·s)/(3.2×10⁻⁸ s) = 1.29×10⁻⁷ eV. We’re just putting some values into the E/(f·T) = h/T equation here.
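The arithmetic for the sodium photon is easy to reproduce. The sketch below just plugs in the values quoted above (h = 4.13567×10⁻¹⁵ eV·s, a frequency of 500×10¹² Hz and a decay time of 3.2×10⁻⁸ s); the variable names are mine.

```python
H_EV_S = 4.13567e-15      # Planck's constant in eV·s
f_sodium = 500e12         # frequency of sodium light in Hz (value from the text)
decay_time = 3.2e-8       # decay time in seconds (value from the text)

photon_energy = H_EV_S * f_sodium           # E = h·f, in eV
cycles = f_sodium * decay_time              # number of cycles 'packed' in the photon
energy_per_cycle = photon_energy / cycles   # E/(f·T), in eV per cycle

print(f"photon energy    = {photon_energy:.3f} eV")
print(f"number of cycles = {cycles:.3e}")
print(f"energy per cycle = {energy_per_cycle:.3e} eV")
print(f"h / decay time   = {H_EV_S / decay_time:.3e} eV")
```

The last two lines print the same number, which is just the E/(f·T) = h/T identity at work.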

The logic for that 2A = h·f0 is the same. The frequency of the radiation that’s being absorbed or emitted is 23.79 GHz, so the photon energy is (23.79×10⁹ Hz)·(4.13567×10⁻¹⁵ eV·s) ≈ 1×10⁻⁴ eV. Now, we calculated the transition period T as T = π·ħ/A ≈ (π·6.582×10⁻¹⁶ eV·s)/(0.5×10⁻⁴ eV) ≈ 41.4×10⁻¹² seconds. Now, an oscillation of a frequency of 23.79 giga-hertz that only lasts 41.4×10⁻¹² seconds is an oscillation of one cycle only. The consequence is that, when we continue this style of reasoning, we’d have a photon that packs all of its energy into one cycle!

Let’s think about what this implies in terms of the density in space. The wavelength of our microwave radiation is 1.25×10⁻² m, so we’ve got a ‘density’ of 1×10⁻⁴ eV/1.25×10⁻² m = 0.8×10⁻² eV/m = 0.008 eV/m. The wavelength of our sodium light is 0.6×10⁻⁶ m, so we get a ‘density’ of 1.29×10⁻⁷ eV/0.6×10⁻⁶ m = 2.15×10⁻¹ eV/m = 0.215 eV/m. So the energy ‘density’ of our sodium light is 26.875 times that of our microwave radiation. 🙂
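Here is the same back-of-the-envelope calculation for the ammonia transition, as a sketch. It uses unrounded inputs (f0 = 23.79 GHz, c ≈ 2.998×10⁸ m/s, which is my assumption for the speed of light), so the results differ slightly from the rounded figures above. It also shows why we get exactly one cycle: π·ħ/A is the same thing as h/(2A), i.e. 1/f0.

```python
import math

H_EV_S = 4.13567e-15                 # Planck's constant in eV·s
HBAR_EV_S = H_EV_S / (2 * math.pi)   # reduced Planck constant in eV·s
C_M_PER_S = 2.998e8                  # speed of light in m/s (assumed value)

# Ammonia transition: 2A = h·f0
f0 = 23.79e9                         # Hz (value from the text)
two_A = H_EV_S * f0                  # photon energy in eV
A = two_A / 2
period = math.pi * HBAR_EV_S / A     # T = pi·hbar/A, which equals 1/f0
cycles = f0 * period                 # cycles in one transition period

# Energy per cycle divided by wavelength, the 'density' used in the text
microwave_wavelength = C_M_PER_S / f0
microwave_density = two_A / microwave_wavelength

sodium_energy_per_cycle = H_EV_S / 3.2e-8   # h / decay time, in eV
sodium_wavelength = 0.6e-6                  # m (value from the text)
sodium_density = sodium_energy_per_cycle / sodium_wavelength

ratio = sodium_density / microwave_density

print(f"2A = {two_A:.3e} eV, T = {period:.3e} s, cycles = {cycles:.3f}")
print(f"microwave: {microwave_density:.4f} eV/m, sodium: {sodium_density:.4f} eV/m")
print(f"ratio = {ratio:.1f}")
```

The ratio comes out in the high twenties, in the same ballpark as the figure obtained with the rounded values.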

Frankly, I am not quite sure if calculations like this make much sense. In fact, when talking about energy densities, I should review my posts on the Poynting vector. However, they may help you think things through. 🙂