The photon wavefunction

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter presented here, which you can find by clicking on the provided link.

Original post:

In my previous posts, I juxtaposed the following images:

Animation 5d_euler_f

Both are the same, and then they’re not. The illustration on the left-hand side shows how the electric field vector (E) of an electromagnetic wave travels through space, but it does not show the accompanying magnetic field vector (B), which is as essential in the electromagnetic propagation mechanism according to Maxwell’s equations:

  1. ∂B/∂t = –∇×E
  2. ∂E/∂t = c²∇×B = ∇×B for c = 1

The second illustration shows a wavefunction e^(i(kx − ωt)) = cos(kx − ωt) + i∙sin(kx − ωt). Its propagation mechanism—if we can call it that—is Schrödinger’s equation:

∂ψ/∂t = i·(ħ/2m)·∇²ψ

We already drew attention to the fact that an equation like this models some flow. The Laplacian on the right-hand side is just the second derivative with respect to x here and, therefore, expresses a flux density: a flow per unit surface area, i.e. per square meter. To be precise, the Laplacian represents the flux density of the gradient flow of ψ.

On the left-hand side of Schrödinger’s equation, we have a time derivative, so that’s a flow per second. The ħ/2m factor is like a diffusion constant. In fact, strictly speaking, that ħ/2m factor is a diffusion constant, because it does exactly the same thing as the diffusion constant D in the diffusion equation ∂φ/∂t = D·∇²φ, i.e.:

  1. As a constant of proportionality, it quantifies the relationship between both derivatives.
  2. As a physical constant, it ensures the dimensions on both sides of the equation are compatible.

So our diffusion constant here is ħ/2m. Because of the Uncertainty Principle, m is always going to be some integer multiple of ħ/2, so ħ/2m = 1, 1/2, 1/3, 1/4 etcetera. In other words, the ħ/2m term is the inverse of the mass measured in units of ħ/2. We get the terms of the harmonic series here. How convenient! 🙂

In our previous posts, we studied the wavefunction for a zero-mass particle. Such a particle has zero rest mass but – because of its movement – does have some energy, and, therefore, some mass and momentum. In fact, measuring time and distance in equivalent units (so c = 1), we found that E = m = p = ħ/2 for the zero-mass particle. It had to be. If not, our equations gave us nonsense. So Schrödinger’s equation was reduced to:

∂ψ/∂t = i·∇²ψ

How elegant! We only need to explain that imaginary unit (i) in the equation. It does a lot of things. First, it gives us two equations for the price of one—thereby providing a propagation mechanism indeed. It’s just like the E and B vectors. Indeed, we can write that ∂ψ/∂t = i·∇²ψ equation as:

  1. Re(∂ψ/∂t) = −Im(∇²ψ)
  2. Im(∂ψ/∂t) = Re(∇²ψ)

You should be able to show that the two equations above are effectively equivalent to Schrödinger’s equation. If not… Well… Then you should not be reading this stuff. The two equations above show that the real part of the wavefunction feeds into its imaginary part, and vice versa. Both are equally essential. Let me say this one more time: the so-called real and imaginary part of a wavefunction are equally real—or essential, I should say!
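In case you want to see the split at work, here’s a minimal sympy check—with c and d standing in for the real and imaginary parts of ∇²ψ—showing that multiplying a complex number by i swaps its real and imaginary parts (with one sign flip), which is all the two equations above say:

```python
import sympy as sp

c, d = sp.symbols('c d', real=True)   # stand-ins for Re(∇²ψ) and Im(∇²ψ)
laplacian = c + sp.I * d              # ∇²ψ
lhs = sp.I * laplacian                # the right-hand side i·∇²ψ, i.e. ∂ψ/∂t (with ħ/2m = 1)

print(sp.re(lhs))   # -> -d, i.e. Re(∂ψ/∂t) = −Im(∇²ψ)
print(sp.im(lhs))   # ->  c, i.e. Im(∂ψ/∂t) =  Re(∇²ψ)
```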

Second, it gives us the circle. Huh? Yes. Writing the wavefunction as ψ = a + i·b is not just like writing a vector in terms of its Cartesian coordinates, even if it looks very much that way. Why not? Well… Never forget: i² = −1, and so—let me use mathematical lingo here—the introduction of i makes our metric space complete. To put it simply: we can now compute everything. In short, the introduction of the imaginary unit gives us that wonderful mathematical construct, e^(i(kx − ωt)), which allows us to model everything. In case you wonder, I mean: everything! Literally. 🙂

However, we’re not going to impose any pre-conditions here, and so we’re not going to make that E = m = p = ħ/2 assumption now. We’ll just re-write Schrödinger’s equation as we did last time—so we’re going to keep our ‘diffusion constant’ ħ/2m as it is for now:

  1. Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ)
  2. Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ)

So we have two pairs of equations now. Can they be related? Well… They look the same, so they had better be related! 🙂 Let’s explore it. First note that, if we’d equate the direction of propagation with the x-axis, we can write the E vector as the sum of two y- and z-components: E = (Ey, Ez). Using complex number notation, we can write E as:

E = (Ey, Ez) = Ey + i·Ez

In case you’d doubt, just think of this simple drawing:

2000px-Complex_number_illustration

The next step is to imagine—funny word when talking complex numbers—that Ey and Ez are the real and imaginary part of some wavefunction, which we’ll denote as ψE = e^(i(kx − ωt)). So now we can write:

E = (Ey, Ez) = Ey + i·Ez = cos(kx − ωt) + i∙sin(kx − ωt) = Re(ψE) + i·Im(ψE)

What’s k and ω? Don’t worry about it—for the moment, that is. We’ve done nothing special here. In fact, we’re used to representing waves as some sine or cosine function, so that’s what we are doing here. Nothing more. Nothing less. We just need two sinusoids because of the circular polarization of our electromagnetic wave.

What’s next? Well… If ψE is a regular wavefunction, then we should be able to check if it’s a solution to Schrödinger’s equation. So we should be able to write:

  1. Re(∂ψE/∂t) = −(ħ/2m)·Im(∇²ψE)
  2. Im(∂ψE/∂t) = (ħ/2m)·Re(∇²ψE)

Are we? How does that work? The time derivative on the left-hand side is equal to:

∂ψE/∂t = −iω·e^(i(kx − ωt)) = −iω·[cos(kx − ωt) + i·sin(kx − ωt)] = ω·sin(kx − ωt) − iω·cos(kx − ωt)

The second-order derivative on the right-hand side is equal to:

∇²ψE = ∂²ψE/∂x² = −k²·e^(i(kx − ωt)) = −k²·cos(kx − ωt) − i·k²·sin(kx − ωt)

So the two equations above are equivalent to writing:

  1. Re(∂ψE/∂t) = −(ħ/2m)·Im(∇²ψE) ⇔ ω·sin(kx − ωt) = k²·(ħ/2m)·sin(kx − ωt)
  2. Im(∂ψE/∂t) = (ħ/2m)·Re(∇²ψE) ⇔ −ω·cos(kx − ωt) = −k²·(ħ/2m)·cos(kx − ωt)

Both conditions are fulfilled if, and only if, ω = k²·(ħ/2m). Now, assuming we measure time and distance in equivalent units (c = 1), we can calculate the phase velocity of the electromagnetic wave as being equal to c = ω/k = 1. We also have the de Broglie equation for the matter-wave, even if we’re not quite sure whether or not we should apply that to an electromagnetic wave. In any case, the de Broglie equation tells us that k = p/ħ. So we can re-write this condition as:

ω/k = 1 = k·(ħ/2m) = (p/ħ)·(ħ/2m) = p/2m ⇔ p = 2m ⇔ m = p/2
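If you want to double-check the dispersion condition and the m = p/2 result, here’s a quick symbolic sketch—a sanity check only, using sympy:

```python
import sympy as sp

x, t, k, w, hbar, m, p = sp.symbols('x t k omega hbar m p', positive=True)
psi = sp.exp(sp.I * (k * x - w * t))   # the candidate wavefunction ψE

# Plug ψE into Schrödinger's equation ∂ψ/∂t = i·(ħ/2m)·∇²ψ and see what ω must be
residue = sp.diff(psi, t) - sp.I * (hbar / (2 * m)) * sp.diff(psi, x, 2)
print(sp.solve(sp.simplify(residue / psi), w))   # -> [hbar*k**2/(2*m)], i.e. ω = k²·(ħ/2m)

# Impose de Broglie (k = p/ħ) and a phase velocity ω/k = 1, and solve for m
omega = hbar * (p / hbar)**2 / (2 * m)
print(sp.solve(sp.Eq(omega / (p / hbar), 1), m))  # -> [p/2], i.e. m = p/2
```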

So that’s different from the E = m = p equality we imposed when discussing the wavefunction of the zero-mass particle: we’ve got that 1/2 factor which bothered us so much once again! And it’s causing us the same trouble: how do we interpret that m = p/2 equation? It leads to nonsense once more! E = m·c² = m, but E is also supposed to be equal to p·c = p. Here, however, we find that E = p/2! We also get strange results when calculating the group and phase velocity. So… Well… What’s going on here?

I am not quite sure. It’s that damn 1/2 factor. Perhaps it’s got something to do with our definition of mass. The m in the Schrödinger equation was referred to as the effective or reduced mass of the electron wavefunction that it was supposed to model. Now that concept is something funny: it sure allows for some gymnastics, as you’ll see when going through the Wikipedia article on it! I promise I’ll dig into it—but not now and here, as I’ve got no time for that. 😦

However, the good news is that we also get a magnetic field vector with an electromagnetic wave: B. We know B is always orthogonal to E, and in the direction that’s given by the right-hand rule for the vector cross-product. Indeed, we can write B as B = ex×E/c, with ex the unit vector pointing in the x-direction (i.e. the direction of propagation), as shown below.

E and b

So we can do the same analysis: we just substitute B for E everywhere, and we’ll find the same condition: m = p/2. To distinguish the two wavefunctions, we used the E and B subscripts for our wavefunctions, so we wrote ψE and ψB. We can do the same for that m = p/2 condition:

  1. mE = pE/2
  2. mB = pB/2

Should we just add mE and mB to get a total momentum and, hence, a total energy, that’s equal to E = m = p for the whole wave? I believe we should, but I haven’t quite figured out how we should interpret that summation!

So… Well… Sorry to disappoint you. I haven’t got the answer here. But I do believe my instinct tells me the truth: the wavefunction for an electromagnetic wave—so that’s the wavefunction for a photon, basically—is essentially the same as our wavefunction for a zero-mass particle. It’s just that we get two wavefunctions for the price of one. That’s what distinguishes bosons from fermions! And so I need to figure out how they differ exactly! And… Well… Yes. That might take me a while!

In the meanwhile, we should play some more with those E and B vectors, as that’s going to help us to solve the riddle—no doubt!

Fiddling with E and B

The B = ex×E/c equation is equivalent to saying that we’ll get B when rotating E by 90 degrees which, in turn, is equivalent to multiplication by the imaginary unit i. Huh? Yes. Sorry. Just google the meaning of the vector cross product and multiplication by i. So we can write B = i·E, which amounts to writing:

B = i·E = e^(iπ/2)·e^(i(kx − ωt)) = e^(i(kx − ωt + π/2)) = cos(kx − ωt + π/2) + i·sin(kx − ωt + π/2)

So we can now associate a wavefunction ψB with the magnetic field vector B, which is the same wavefunction as ψE except for a phase shift equal to π/2. You’ll say: so what? Well… Nothing much. I guess this observation just concludes this long digression on the wavefunction of a photon: it’s the same wavefunction as that of a zero-mass particle—except that we get two for the price of one!
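By the way, if the claim that multiplication by i amounts to a rotation by 90 degrees sounds too quick, here’s a two-line numerical check (just a sketch with numpy):

```python
import numpy as np

theta = np.linspace(0, 2 * np.pi, 9)   # some sample values of (kx − ωt)
psi_E = np.exp(1j * theta)             # ψE = cos + i·sin
psi_B = 1j * psi_E                     # B = i·E: rotate every value by 90°

# multiplying by i is the same as adding π/2 to the phase
print(np.allclose(psi_B, np.exp(1j * (theta + np.pi / 2))))   # -> True
```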

It’s an interesting way of looking at things. Let’s look at the equations we started this post with, i.e. Maxwell’s equations in free space—i.e. no stationary charges, and no currents (i.e. moving charges) either! So we’re talking those ∂B/∂t = –∇×E and ∂E/∂t = ∇×B equations now.

Note that they actually give you four equations, because they’re vector equations:

  1. ∂B/∂t = –∇×E ⇔ ∂By/∂t = –(∇×E)y and ∂Bz/∂t = –(∇×E)z
  2. ∂E/∂t = ∇×B ⇔ ∂Ey/∂t = (∇×B)y and ∂Ez/∂t = (∇×B)z

To figure out what that means, we need to remind ourselves of the definition of the curl operator, i.e. the ∇× operator. For E, the components of ∇×E are the following:

  1. (∇×E)z = ∇xEy – ∇yEx = ∂Ey/∂x – ∂Ex/∂y
  2. (∇×E)x = ∇yEz – ∇zEy = ∂Ez/∂y – ∂Ey/∂z
  3. (∇×E)y = ∇zEx – ∇xEz = ∂Ex/∂z – ∂Ez/∂x

So the four equations above can now be written as:

  1. ∂By/∂t = –(∇×E)y = –∂Ex/∂z + ∂Ez/∂x
  2. ∂Bz/∂t = –(∇×E)z = –∂Ey/∂x + ∂Ex/∂y
  3. ∂Ey/∂t = (∇×B)y = ∂Bx/∂z – ∂Bz/∂x
  4. ∂Ez/∂t = (∇×B)z = ∂By/∂x – ∂Bx/∂y

What can we do with this? Well… The x-component of E and B is zero, so one of the two terms in the equations simply disappears. We get:

  1. ∂By/∂t = –(∇×E)y = ∂Ez/∂x
  2. ∂Bz/∂t = –(∇×E)z = – ∂Ey/∂x
  3. ∂Ey/∂t = (∇×B)y = – ∂Bz/∂x
  4. ∂Ez/∂t = (∇×B)z = ∂By/∂x

Interesting: only the derivatives with respect to x remain! Let’s calculate them:

  1. ∂By/∂t = –(∇×E)y = ∂Ez/∂x = ∂[sin(kx − ωt)]/∂x = k·cos(kx − ωt) = k·Ey
  2. ∂Bz/∂t = –(∇×E)z = – ∂Ey/∂x = – ∂[cos(kx − ωt)]/∂x = k·sin(kx − ωt) = k·Ez
  3. ∂Ey/∂t = (∇×B)y = – ∂Bz/∂x = – ∂[sin(kx − ωt + π/2)]/∂x = – k·cos(kx − ωt + π/2) = – k·By
  4. ∂Ez/∂t = (∇×B)z = ∂By/∂x = ∂[cos(kx − ωt + π/2)]/∂x = −k·sin(kx − ωt + π/2) = –k·Bz

What wonderful results! The time derivatives of the components of B and E are equal to ±k times the components of E and B respectively! So everything is related to everything, indeed! 🙂

Let’s play some more. Using the cos(θ + π/2) = −sin(θ) and sin(θ + π/2) = cos(θ) identities, we know that By = cos(kx − ωt + π/2) and Bz = sin(kx − ωt + π/2) are equal to:

  1. By = cos(kx − ωt + π/2) = −sin(kx − ωt) = −Ez
  2. Bz = sin(kx − ωt + π/2) = cos(kx − ωt) = Ey

Let’s calculate those derivatives once more now:

  1. ∂By/∂t = −∂Ez/∂t = −∂sin(kx − ωt)/∂t = ω·cos(kx − ωt) = ω·Ey
  2. ∂Bz/∂t = ∂Ey/∂t = ∂cos(kx − ωt)/∂t = ω·sin(kx − ωt) = ω·Ez

This result can, obviously, be true only if ω = k, which we assume to be the case, as we’re measuring time and distance in equivalent units, so the phase velocity is c = ω/k = 1.
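For what it’s worth, here’s a small symbolic check (a sketch using sympy) that the ψE and ψB components above do satisfy the four reduced Maxwell equations exactly when ω = k:

```python
import sympy as sp

x, t, k, w = sp.symbols('x t k omega', positive=True)
Ey, Ez = sp.cos(k*x - w*t), sp.sin(k*x - w*t)                      # components of ψE
By, Bz = sp.cos(k*x - w*t + sp.pi/2), sp.sin(k*x - w*t + sp.pi/2)  # components of ψB = i·ψE

# the four reduced equations (free space, Ex = Bx = 0, c = 1), written as residues
residues = [sp.diff(By, t) - sp.diff(Ez, x),
            sp.diff(Bz, t) + sp.diff(Ey, x),
            sp.diff(Ey, t) + sp.diff(Bz, x),
            sp.diff(Ez, t) - sp.diff(By, x)]

# all four residues vanish once we impose ω = k
print([sp.simplify(r.subs(w, k)) for r in residues])   # -> [0, 0, 0, 0]
```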

Hmm… I am sure it won’t be long before I’ll be able to prove what I want to prove. I just need to figure out the math. It’s pretty obvious now that the wavefunction—any wavefunction, really—models the flow of energy. I just need to show how it works for the zero-mass particle—and then I mean: how it works exactly. We must be able to apply the concept of the Poynting vector to wavefunctions. We must be. I’ll find how. One day. 🙂

As for now, however, I feel we’ve played enough with those wavefunctions. It’s time to do what we promised to do a long time ago, and that is to use Schrödinger’s equation to calculate electron orbitals—and other stuff, of course! Like… Well… We hardly ever talked about spin, did we? That comes with huge complexities. But we’ll get through it. Trust me. 🙂

The quantum of time and distance

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter presented here, which you can find by clicking on the provided link. In fact, I actually made some (small) mistakes when writing the post below.

Original post:

In my previous post, I introduced the elementary wavefunction of a particle with zero rest mass in free space (i.e. the particle also has zero potential). I wrote that wavefunction as e^(i(kx − ωt)) = e^(i(x/2 − t/2)) = cos[(x−t)/2] + i∙sin[(x−t)/2], and we can represent that function as follows:

5d_euler_f

If the real and imaginary axis in the image above are the y- and z-axis respectively, then the x-axis here is time, so here we’d be looking at the shape of the wavefunction at some fixed point in space.

Now, we could also look at its shape at some fixed point in time, so the x-axis would then represent the spatial dimension. Better still, we could animate the illustration to incorporate both the temporal as well as the spatial dimension. The following animation does the trick quite well:

Animation

Please do note that space is one-dimensional here: the y- and z-axis represent the real and imaginary part of the wavefunction, not the y- or z-dimension in space.

You’ve seen this animation before, of course: I took it from Wikipedia, and it actually represents the electric field vector (E) for a circularly polarized electromagnetic wave. To get a complete picture of the electromagnetic wave, we should add the magnetic field vector (B), which is not shown here. We’ll come back to that later. Let’s first look at our zero-mass particle denuded of all properties, so that’s not an electromagnetic wave—read: a photon. No. We don’t want to talk charges here.

OK. So far so good. A zero-mass particle in free space. So we got that e^(i(x/2 − t/2)) = cos[(x−t)/2] + i∙sin[(x−t)/2] wavefunction. We got that function assuming the following:

  1. Time and distance are measured in equivalent units, so c = 1. Hence, the classical velocity (v) of our zero-mass particle is equal to 1, and we also find that the energy (E), mass (m) and momentum (p) of our particle are numerically the same. We wrote: E = m = p, using the p = m·v (for v = c) and the E = m∙c² formulas.
  2. We also assumed that the quantum of energy (and, hence, the quantum of mass, and the quantum of momentum) was equal to ħ/2, rather than ħ. The de Broglie relations (k = p/ħ and ω = E/ħ) then gave us the rather particular argument of our wavefunction: kx − ωt = x/2 − t/2.

The latter hypothesis (E = m = p = ħ/2) is somewhat strange at first but, as I showed in that post of mine, it avoids an apparent contradiction: if we’d use ħ, then we would find v for the group velocity, but v/2 for the phase velocity. Using ħ/2 solves that problem. In addition, using ħ/2 is consistent with the Uncertainty Principle, which tells us that ΔxΔp = ΔEΔt = ħ/2.

OK. Take a deep breath. Here I need to say something about dimensions. If we’re saying that we’re measuring time and distance in equivalent units – say, in meter, or in seconds – then we are not saying that they’re the same. The dimension of time and space is fundamentally different, as evidenced by the fact that, for example, time flows in one direction only, as opposed to x. To be precise, we assumed that x and t become countable variables themselves at some point in time. However, if we’re at t = 0, then we’d count time as t = 1, 2, etcetera only. In contrast, at the point x = 0, we can go to x = +1, +2, etcetera but we may also go to x = −1, −2, etc.

I have to stress this point, because what follows will require some mental flexibility. In fact, we often talk about natural units, such as Planck units, which we get from equating fundamental constants, such as c, or ħ, to 1, but then we often struggle to interpret those units, because we fail to grasp what it means to write c = 1, or ħ = 1. For example, writing c = 1 implies we can measure distance in seconds, or time in meter, but it does not imply that distance becomes time, or vice versa. We still need to keep track of whether or not we’re talking a second in time, or a second in space, i.e. c meter, or, conversely, whether we’re talking a meter in space, or a meter in time, i.e. 1/c seconds. We can make the distinction in various ways. For example, we could mention the dimension of each equation between brackets, so we’d write: t = 1×10⁻¹⁵ s [t] ≈ 299.8×10⁻⁹ m [t]. Alternatively, we could put a little subscript (like t, or d), so as to make sure it’s clear our meter is a ‘light-meter’, so we’d write: t = 1×10⁻¹⁵ s ≈ 299.8×10⁻⁹ mt. Likewise, we could add a little subscript when measuring distance in light-seconds, so we’d write x = 3×10⁸ m ≈ 1 sd, rather than x = 3×10⁸ m [x] ≈ 1 s [x].

If you wish, we could refer to the ‘light-meter’ as a ‘time-meter’ (or a meter of time), and to the light-second as a ‘distance-second’ (or a second of distance). It doesn’t matter what you call it, or how you denote it. In fact, you will never hear of a meter of time, nor will you ever see those subscripts or brackets. But that’s because physicists always keep track of the dimensions of an equation, and so they know. They know, for example, that the dimension of energy combines the dimensions of both force as well as distance, so we write: [energy] = [force]·[distance]. Read: energy amounts to applying a force over a distance. Likewise, momentum amounts to applying some force over some time, so we write: [momentum] = [force]·[time]. Using the usual symbols for energy, momentum, force, distance and time respectively, we can write this as [E] = [F]·[x] and [p] = [F]·[t]. Using the units you know, i.e. joule, newton, meter and seconds, we can also write this as: 1 J = 1 N·m and 1…

Hey! Wait a minute! What’s that N·s unit for momentum? Momentum is mass times velocity, isn’t it? It is. But it amounts to the same. Remember that mass is a measure for the inertia of an object, and so mass is measured with reference to some force (F) and some acceleration (a): F = m·a ⇔ m = F/a. Hence, [m] = kg = [F/a] = N/(m/s²) = N·s²/m. [Note that the m in the brackets is the symbol for mass but the other m is a meter!] So the unit of momentum is (N·s²/m)·(m/s) = N·s = newton·second.

Now, the dimension of Planck’s constant is the dimension of action, which combines all dimensions: force, time and distance. We write: ħ ≈ 1.0545718×10⁻³⁴ N·m·s (newton·meter·second). That’s great, and I’ll show why in a moment. But, at this point, you should just note that when we write that E = m = p = ħ/2, we’re just saying they are numerically the same. The dimensions of E, m and p are not the same. So what we’re really saying is the following:

  1. The quantum of energy is ħ/2 newton·meter ≈ 0.527286×10⁻³⁴ N·m.
  2. The quantum of momentum is ħ/2 newton·second ≈ 0.527286×10⁻³⁴ N·s.

What’s the quantum of mass? That’s where the equivalent units come in. We wrote: 1 kg = 1 N·s²/m. So we could substitute the distance unit in this equation (m) by sd/c = sd/(3×10⁸). So we get: 1 kg = 3×10⁸ N·s²/sd. Can we scrap both ‘seconds’ and say that the quantum of mass (ħ/2) is equal to the quantum of momentum? Think about it.

[…]

The answer is… Yes and no—but much more no than yes! The two sides of the equation are only numerically equal, but we’re talking a different dimension here. If we’d write that 1 kg = 0.527286×10⁻³⁴ N·s²/sd = 0.527286×10⁻³⁴ N·s, you’d be equating two dimensions that are fundamentally different: space versus time. To reinforce the point, think of it the other way: think of substituting the second (s) for 3×10⁸ m. Again, you’d make a mistake. You’d have to write 0.527286×10⁻³⁴ N·(mt)²/m, and you should not assume that a time-meter is equal to a distance-meter. They’re equivalent units, and so you can use them to get some number right, but they’re not equal: what they measure, is fundamentally different. A time-meter measures time, while a distance-meter measures distance. It’s as simple as that. So what is it then? Well… What we can do is remember Einstein’s energy-mass equivalence relation once more: E = m·c² (and m is the mass here). Just check the dimensions once more: [m]·[c²] = (N·s²/m)·(m²/s²) = N·m. So we should think of the quantum of mass as the quantum of energy, as energy and mass are equivalent, really.

Back to the wavefunction

The beauty of the construct of the wavefunction resides in several mathematical properties of this construct. The first is its argument:

θ = kx − ωt, with k = p/ħ and ω = E/ħ

Its dimension is the dimension of an angle: we express it in radians. What’s a radian? You might think that a radian is a distance unit because… Well… Look at how we measure an angle in radians below:

Circle_radians

But you’re wrong. An angle’s measurement in radians is numerically equal to the length of the corresponding arc of the unit circle but… Well… Numerically only. 🙂 Just do a dimensional analysis of θ = kx − ωt = (p/ħ)·x − (E/ħ)·t. The dimension of p/ħ is (N·s)/(N·m·s) = 1/m = m⁻¹, so we get some quantity expressed per meter, which we then multiply by x, so we get a pure number. No dimension whatsoever! Likewise, the dimension of E/ħ is (N·m)/(N·m·s) = 1/s = s⁻¹, which we then multiply by t, so we get another pure number, which we then add to get our argument θ. Hence, Planck’s quantum of action (ħ) does two things for us:

  1. It expresses p and E in units of ħ.
  2. It sorts out the dimensions, ensuring our argument is a dimensionless number indeed.

In fact, I’d say the ħ in the (p/ħ)·x term in the argument is a different ħ than the ħ in the (E/ħ)·t term. Huh? What? Yes. Think of the distinction I made between s and sd, or between m and mt. Both were numerically the same: they captured a magnitude, but they measured different things. We’ve got the same thing here:

  1. The meter (m) in ħ ≈ 1.0545718×10⁻³⁴ N·m·s in (p/ħ)·x is the dimension of x, and so it gets rid of the distance dimension. So the m in ħ ≈ 1.0545718×10⁻³⁴ N·m·s goes, and what’s left measures p in terms of units equal to 1.0545718×10⁻³⁴ N·s, so we get a pure number indeed.
  2. Likewise, the second (s) in ħ ≈ 1.0545718×10⁻³⁴ N·m·s in (E/ħ)·t is the dimension of t, and so it gets rid of the time dimension. So the s in ħ ≈ 1.0545718×10⁻³⁴ N·m·s goes, and what’s left measures E in terms of units equal to 1.0545718×10⁻³⁴ N·m, so we get another pure number.
  3. Adding both gives us the argument θ: a pure number that measures some angle—as the little check right below confirms.
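Here’s that little check: a minimal sketch of the dimensional analysis above, assuming you have the pint units package installed (the 0.527286×10⁻³⁴ values are just ħ/2, expressed as a momentum and as an energy respectively):

```python
import pint

u = pint.UnitRegistry()
hbar = 1.0545718e-34 * u.newton * u.meter * u.second   # quantum of action
p = 0.527286e-34 * u.newton * u.second                 # quantum of momentum (ħ/2)
E = 0.527286e-34 * u.newton * u.meter                  # quantum of energy (ħ/2)
x = 1.0 * u.meter
t = 1.0 * u.second

theta = (p / hbar) * x - (E / hbar) * t
print(theta.dimensionless)   # -> True: the argument is a pure number (an angle, in radians)
```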

That’s why you need to watch out when writing θ = (p/ħ)·x − (E/ħ)·t as θ = (p·x − E·t)/ħ or – in the case of our elementary wavefunction for the zero-mass particle – as θ = (x/2 − t/2) = (x − t)/2. You can do it – in fact, you should do it when trying to calculate something – but you need to be aware that you’re making abstraction of the dimensions. That’s quite OK, as you’re just calculating something—but don’t forget the physics behind it!

You’ll immediately ask: what’s the physics behind it here? Well… I don’t know. Perhaps nobody knows. As Feynman once famously said: “I think I can safely say that nobody understands quantum mechanics.” But then he never wrote that, and I am sure he didn’t really mean that. And then he said that back in 1964, which is 50 years ago now. 🙂 So let’s try to understand it at least. 🙂

Planck’s quantum of action – 1.0545718×10⁻³⁴ N·m·s – comes to us as a mysterious quantity. A quantity is more than a number. A number is something like π or e, for example. It might be a complex number, like e^(iθ), but that’s still a number. In contrast, a quantity has some dimension, or some combination of dimensions. A quantity may be a scalar quantity (like distance), or a vector quantity (like a field vector). In this particular case (Planck’s ħ or h), we’ve got a physical constant combining three dimensions: force, time and distance—or space, if you want. It’s a quantum, so it comes as a blob—or a lump, if you prefer that word. However, as I see it, we can sort of project it in space as well as in time. In fact, if this blob is going to move in spacetime, then it will move in space as well as in time: t will go from 0 to 1, and x goes from 0 to ±1, depending on what direction we’re going. So when I write that E = p = ħ/2—which, let me remind you, are two numerical equations, really—I sort of split Planck’s quantum over E = m and p respectively.

You’ll say: what kind of projection or split is that? When projecting some vector, we’ll usually have some sine and cosine, or a 1/√2 factor—or whatever, but not a clean 1/2 factor. Well… I have no answer to that, except that this split fits our mathematical construct. Or… Well… I should say: my mathematical construct. Because what I want to find is this clean Schrödinger equation:

∂ψ/∂t = i·(ħ/2m)·∇²ψ = i·∇²ψ for m = ħ/2

Now I can only get this equation if (1) E = m = p and (2) if m = ħ/2 (which amounts to writing that E = p = m = ħ/2). There’s also the Uncertainty Principle. If we are going to consider the quantum vacuum, i.e. if we’re going to look at space (or distance) and time as count variables, then Δx and Δt in the ΔxΔp = ΔEΔt = ħ/2 equations are ± 1 and, therefore, Δp and ΔE must be ± ħ/2. In any case, I am not going to try to justify my particular projection here. Let’s see what comes out of it.

The quantum vacuum

Schrödinger’s equation for my zero-mass particle (with energy E = m = p = ħ/2) amounts to writing:

  1. Re(∂ψ/∂t) = −Im(∇²ψ)
  2. Im(∂ψ/∂t) = Re(∇²ψ)

Now that reminds us of the propagation mechanism for the electromagnetic wave, which we wrote as ∂B/∂t = –∇×E and ∂E/∂t = ∇×B, also assuming we measure time and distance in equivalent units. However, we’ll come back to that later. Let’s first study the equation we have, i.e.

e^(i(kx − ωt)) = e^(i(ħ·x/2 − ħ·t/2)/ħ) = e^(i(x/2 − t/2)) = cos[(x−t)/2] + i∙sin[(x−t)/2]

Let’s think some more. What is that e^(i(x/2 − t/2)) function? It’s subject to conceiving time and distance as countable variables, right? I am tempted to say: as discrete variables, but I won’t go that far—not now—because the countability may be related to a particular interpretation of quantum physics. So I need to think about that. In any case… The point is that x can only take on values like 0, 1, 2, etcetera. And the same goes for t. To make things easy, we’ll not consider negative values for x right now (and, obviously, not for t either). But you can easily check it doesn’t make a difference: if you think of the propagation mechanism – which is what we’re trying to model here – then x is always positive, because we’re moving away from some source that caused the wave. In any case, we’ve got an infinite set of points like:

  • e^(i(0/2 − 0/2)) = e^(i·0) = cos(0) + i∙sin(0)
  • e^(i(1/2 − 0/2)) = e^(i(1/2)) = cos(1/2) + i∙sin(1/2)
  • e^(i(0/2 − 1/2)) = e^(i(−1/2)) = cos(−1/2) + i∙sin(−1/2)
  • e^(i(1/2 − 1/2)) = e^(i·0) = cos(0) + i∙sin(0)

In my previous post, I calculated the real and imaginary part of this wavefunction for x going from 0 to 14 (as mentioned, in steps of 1) and for t doing the same (also in steps of 1), and what we got looked pretty good:

graph real graph imaginary

I also said that, if you wonder what the quantum vacuum could possibly look like, you should probably think of these discrete spacetime points, and some complex-valued wave that travels as illustrated above. In case you wonder what’s being illustrated here: the right-hand graph is the cosine value for all possible x = 0, 1, 2,… and t = 0, 1, 2,… combinations, and the left-hand graph depicts the sine values, so that’s the imaginary part of our wavefunction. Taking the absolute square of both gives 1 for all combinations. So it’s obvious we’d need to normalize and, more importantly, we’d have to localize the particle by adding several of these waves with the appropriate contributions. But so that’s not our worry right now. I want to check whether those discrete time and distance units actually make sense. What’s their size? Is it anything like the Planck length (for distance) and/or the Planck time?

Let’s see. What are the implications of our model? The question here is: if ħ/2 is the quantum of energy, and the quantum of momentum, what’s the quantum of force, and the quantum of time and/or distance?

Huh? Yep. We treated distance and time as countable variables above, but now we’d like to express the difference between x = 0 and x = 1 and between t = 0 and t = 1 in the units we know, that is, in meter and in seconds. So how do we go about that? Do we have enough equations here? Not sure. Let’s see…

We obviously need to keep track of the various dimensions here, so let’s refer to that discrete time and distance unit as tP and lP respectively. The subscript (P) refers to Planck, and the l refers to a length, but we’re likely to find something else than Planck units. I just need placeholder symbols here. To be clear: tP and lP are expressed in seconds and meter respectively, just like the actual Planck time and distance, which are equal to 5.391×10⁻⁴⁴ s (more or less) and 1.6162×10⁻³⁵ m (more or less) respectively. As I mentioned above, we get these Planck units by equating fundamental physical constants to 1. Just check it: (1.6162×10⁻³⁵ m)/(5.391×10⁻⁴⁴ s) = c ≈ 3×10⁸ m/s. So the following relation must be true: lP = c·tP, or lP/tP = c.

Now, as mentioned above, there must be some quantum of force as well, which we’ll write as FP, and which is – obviously – expressed in newton (N). So we have:

  1. E = ħ/2 ⇒ 0.527286×10⁻³⁴ N·m = FP·lP N·m
  2. p = ħ/2 ⇒ 0.527286×10⁻³⁴ N·s = FP·tP N·s

Let’s try to divide both formulas: E/p = (FP·lP N·m)/(FP·tP N·s) = (lP/tP) m/s = c m/s. That’s consistent with the E/p = c equation. Hmm… We found what we knew already. My model is not fully determined, it seems. 😦

What about the following simplistic approach? E is numerically equal to 0.527286×10⁻³⁴, and its dimension is [E] = [F]·[x], so we write: E = 0.527286×10⁻³⁴·[E] = 0.527286×10⁻³⁴·[F]·[x]. Hence, [x] = [E]/[F] = (N·m)/N = m. That just confirms what we already know: the quantum of distance (i.e. our fundamental unit of distance) can be expressed in meter. But our model does not give that fundamental unit. It only gives us its dimension (meter), which is stuff we knew from the start. 😦

Let’s try something else. Let’s just accept the Planck length and time, so we write:

  • lP = 1.6162×10⁻³⁵ m
  • tP = 5.391×10⁻⁴⁴ s

Now, if the quantum of action is equal to ħ N·m·s = FP·lP·tP N·m·s = 1.0545718×10⁻³⁴ N·m·s, and if the two definitions of lP and tP above hold, then 1.0545718×10⁻³⁴ N·m·s = (FP N)×(1.6162×10⁻³⁵ m)×(5.391×10⁻⁴⁴ s) ≈ FP·8.713×10⁻⁷⁹ N·m·s ⇔ FP ≈ 1.21×10⁴⁴ N.

Does that make sense? It does according to Wikipedia, but how do we relate this to our E = p = m = ħ/2 equations? Let’s try this:

  1. EP = (1.0545718×10⁻³⁴ N·m·s)/(5.391×10⁻⁴⁴ s) = 1.956×10⁹ J. That corresponds to the regular Planck energy.
  2. pP = (1.0545718×10⁻³⁴ N·m·s)/(1.6162×10⁻³⁵ m) = 6.525 N·s. That corresponds to the regular Planck momentum.

Is EP = pP? Let’s substitute: 1.956×10⁹ N·m = 1.956×10⁹ N·(s/c) = 1.956×10⁹/(2.998×10⁸) N·s = 6.525 N·s. So, yes, it comes out alright. In fact, I omitted the 1/2 factor in the calculations, but it doesn’t matter: it does come out alright. So I did not prove that the difference between my x = 0 and x = 1 points (or my t = 0 and t = 1 points) is equal to the Planck length (or the Planck time unit), but I did show my theory is, at the very least, compatible with those units. That’s more than enough for now. And I’ll surely come back to it in my next post. 🙂
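Here’s the back-of-the-envelope arithmetic in a few lines of Python, just so you can check the numbers (including the Planck force we got above) for yourself:

```python
hbar = 1.0545718e-34       # quantum of action, in N·m·s
l_P = 1.6162e-35           # Planck length, in m
t_P = 5.391e-44            # Planck time, in s
c = l_P / t_P              # ≈ 2.998×10⁸ m/s

F_P = hbar / (l_P * t_P)   # quantum of force ≈ 1.21×10⁴⁴ N
E_P = hbar / t_P           # Planck energy    ≈ 1.956×10⁹ J (i.e. N·m)
p_P = hbar / l_P           # Planck momentum  ≈ 6.525 N·s

print(F_P, E_P, p_P)
print(E_P / c)             # ≈ 6.525 N·s: E_P and p_P are indeed the 'same' in equivalent units
```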

Post Scriptum: One must solve the following equations to get the fundamental Planck units:

Planck units

We have five fundamental equations for five fundamental quantities: tP, lP, FP, mP, and EP respectively, so that’s OK: it’s a fully determined system alright! But where do the expressions with G, kB (the Boltzmann constant) and ε0 come from? What does it mean to equate those constants to 1? Well… I need to think about that, and I’ll get back to you on it. 🙂
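In the meanwhile, here’s a little sketch that computes the usual Planck units from the fundamental constants—the standard textbook formulas, so nothing new, just a convenient way to play with the numbers:

```python
import math

c    = 2.99792458e8      # speed of light, m/s
G    = 6.674e-11         # gravitational constant, N·m²/kg²
hbar = 1.0545718e-34     # reduced Planck constant, N·m·s
k_B  = 1.380649e-23      # Boltzmann constant, J/K
eps0 = 8.8541878128e-12  # vacuum permittivity, F/m

t_P = math.sqrt(hbar * G / c**5)             # Planck time        ≈ 5.39×10⁻⁴⁴ s
l_P = math.sqrt(hbar * G / c**3)             # Planck length      ≈ 1.62×10⁻³⁵ m
m_P = math.sqrt(hbar * c / G)                # Planck mass        ≈ 2.18×10⁻⁸ kg
E_P = m_P * c**2                             # Planck energy      ≈ 1.96×10⁹ J
F_P = c**4 / G                               # Planck force       ≈ 1.21×10⁴⁴ N
T_P = math.sqrt(hbar * c**5 / (G * k_B**2))  # Planck temperature ≈ 1.42×10³² K
q_P = math.sqrt(4 * math.pi * eps0 * hbar * c)  # Planck charge   ≈ 1.88×10⁻¹⁸ C

print(t_P, l_P, m_P, E_P, F_P, T_P, q_P)
```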

The wavefunction of a zero-mass particle

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter presented here, which you can find by clicking on the provided link. In fact, I actually made some (small) mistakes when writing the post below.

Original post:

I hope you find the title intriguing. A zero-mass particle? So I am talking a photon, right? Well… Yes and no. Just read this post and, more importantly, think about this story for yourself. 🙂

One of my acquaintances is a retired nuclear physicist. We mail every now and then—but he has little or no time for my questions: he usually just tells me to keep studying. I once asked him why there is never any mention of the wavefunction of a photon in physics textbooks. He bluntly told me photons don’t have a wavefunction—not in the sense I was talking about at least. Photons are associated with a traveling electric and a magnetic field vector. That’s it. Full stop. Photons do not have a ψ or φ function. [I am using ψ and φ to refer to the position and momentum wavefunction respectively. You know both are related: if we have one, we have the other.] But then I never give up, of course. I just can’t let go of the idea of a photon wavefunction. The structural similarity in the propagation mechanism of the electric and magnetic field vectors E and B just looks too much like the quantum-mechanical wavefunction. So I kept trying and, while I don’t think I fully solved the riddle, I feel I understand it much better now. Let me show you the why and how.

I. An electromagnetic wave in free space is fully described by the following two equations:

  1. ∂B/∂t = –∇×E
  2. ∂E/∂t = c²∇×B

We’re making abstraction here of stationary charges, and we also do not consider any currents here, so no moving charges either. So I am omitting the ∇·E = ρ/ε0 equation (i.e. the first of the set of four equations), and I am also omitting the j/ε0 term in the second equation. So, for all practical purposes (i.e. for the purpose of this discussion), you should think of a space with no charges: ρ = 0 and j = 0. It’s just a traveling electromagnetic wave. To make things even simpler, we’ll assume our time and distance units are chosen such that c = 1, so the equations above reduce to:

  1. ∂B/∂t = –∇×E
  2. ∂E/∂t = ∇×B

Perfectly symmetrical! But note the minus sign in the first equation. As for the interpretation, I should refer you to previous posts but, briefly, the ∇× operator is the curl operator. It’s a vector operator: it describes the (infinitesimal) rotation of a (three-dimensional) vector field. We discussed heat flow a couple of times, or the flow of a moving liquid. So… Well… If the vector field represents the flow velocity of a moving fluid, then the curl is the circulation density of the fluid. The direction of the curl vector is the axis of rotation as determined by the ubiquitous right-hand rule, and its magnitude is the magnitude of the rotation. OK. Next step.

II. For the wavefunction, we have Schrödinger’s equation, ∂ψ/∂t = i·(ħ/2m)·∇²ψ, which relates two complex-valued functions (∂ψ/∂t and ∇²ψ). Complex-valued functions consist of a real and an imaginary part, and you should be able to verify this equation is equivalent to the following set of two equations:

  1. Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ)
  2. Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ)

[Two complex numbers a + ib and c + id are equal if, and only if, their real and imaginary parts are the same. However, note the i factor in the right-hand side of the equation, so we get: a + ib = i·(c + id) = −d + i·c.] The Schrödinger equation above also assumes free space (i.e. zero potential energy: V = 0) but, in addition – see my previous post – they also assume a zero rest mass of the elementary particle (E0 = 0). So just assume E0 = V = 0 in de Broglie’s elementary ψ(θ) = ψ(x, t) = e^(iθ) = a·e^(−i[(E0 + p²/(2m) + V)·t − p∙x]/ħ) wavefunction. So, in essence, we’re looking at the wavefunction of a massless particle here. Sounds like nonsense, doesn’t it? But… Well… That should be the wavefunction of a photon in free space then, right? 🙂

Maybe. Maybe not. Let’s go as far as we can.

The energy of a zero-mass particle

What m would we use for a photon? Its rest mass is zero, but it’s got energy and, hence, an equivalent mass. That mass is given by the m = E/c² mass-energy equivalence. We also know a photon has momentum, and it’s equal to its energy divided by c: p = m·c = E/c. [I know the notation is somewhat confusing: E is, obviously, not the magnitude of E here: it’s energy!] Both yield the same result. We get: m·c = E/c ⇔ m = E/c² ⇔ E = m·c².

OK. Next step. Well… I’ve always been intrigued by the fact that the kinetic energy of a photon, using the E = m·v²/2 = m·c²/2 formula, is only half of its total energy E = m·c². Half: 1/2. That 1/2 factor is intriguing. Where’s the rest of the energy? It’s really a contradiction: our photon has no rest mass, and there’s no potential here, but its total energy is still twice its kinetic energy. Quid?

There’s only one conclusion: just because of its sheer existence, it must have some hidden energy, and that hidden energy is also equal to E = m·c²/2, and so the kinetic and hidden energy add up to E = m·c².

Huh? Hidden energy? I must be joking, right?

Well… No. No joke. I am tempted to call it the imaginary energy, because it’s linked to the imaginary part of the wavefunction—but then it’s everything but imaginary: it’s as real as the imaginary part of the wavefunction. [I know that sounds a bit nonsensical, but… Well… Think about it: it does make sense.]

Back to that factor 1/2. You may or may not remember it popped up when we were calculating the group and the phase velocity of the wavefunction respectively, again assuming zero rest mass, and zero potential. [Note that the rest mass term is mathematically equivalent to the potential term in both the wavefunction as well as in Schrödinger’s equation: E0·t + V·t = (E0 + V)·t, and V·ψ + E0·ψ = (V + E0)·ψ—obviously!]

In fact, let me quickly show you that calculation again: the de Broglie relations tell us that the k and the ω in the e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt) wavefunction (i.e. the spatial and temporal frequency respectively) are equal to k = p/ħ, and ω = E/ħ. If we would now use the kinetic energy formula E = m·v²/2 – which we can also write as E = m·v·v/2 = p·v/2 = p·p/2m = p²/2m, with v = p/m the classical velocity of the elementary particle that Louis de Broglie was thinking of – then we can calculate the group velocity of our e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt) as:

vg = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂[p²/2m]/∂p = 2p/2m = p/m = v

[Don’t tell me I can’t treat m as a constant when calculating ∂ω/∂k: I can. Think about it.] Now the phase velocity. The phase velocity of our e^(i(kx − ωt)) is only half of that. Again, we get that 1/2 factor:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = (p²/2m)/p = p/2m = v/2

Strange, isn’t it? Why would we get a different value for the phase velocity here? It’s not like we have two different frequencies here, do we? You may also note that the phase velocity turns out to be smaller than the group velocity, which is quite exceptional as well! So what’s the matter?

Well… The answer is: we do seem to have two frequencies here while, at the same time, it’s just one wave. There is only one k and ω here but, as I mentioned a couple of times already, that e^(i(kx − ωt)) wavefunction seems to give you two functions for the price of one—one real and one imaginary: e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt). So are we adding waves, or are we not? It’s a deep question. In my previous post, I said we were adding separate waves, but now I am thinking: no. We’re not. That sine and cosine are part of one and the same whole. Indeed, the apparent contradiction (i.e. the different group and phase velocity) gets solved if we’d use the E = m∙v² formula rather than the kinetic energy E = m∙v²/2. Indeed, assuming that E = m∙v² formula also applies to our zero-mass particle (I mean zero rest mass, of course), and measuring time and distance in natural units (so c = 1), we have:

E = m∙c² = m and p = m∙c = m, so we get: E = m = p

Waw! What a weird combination, isn’t it? But… Well… It’s OK. [You tell me why it wouldn’t be OK. It’s true we’re glossing over the dimensions here, but natural units are natural units, and so c = c² = 1. So… Well… No worries!] The point is: that E = m = p equality yields extremely simple but also very sensible results. For the group velocity of our e^(i(kx − ωt)) wavefunction, we get:

vg = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂p/∂p = 1

So that’s the velocity of our zero-mass particle (c, i.e. the speed of light) expressed in natural units once more—just like what we found before. For the phase velocity, we get:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = p/p = 1

Same result! No factor 1/2 here! Isn’t that great? My ‘hidden energy theory’ makes a lot of sense. 🙂 In fact, I had mentioned a couple of times already that the E = m∙v² relation comes out of the de Broglie relations if we just multiply the two and use the v = f·λ relation:

  1. f·λ = (E/h)·(h/p) = E/p
  2. v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v²

But so I had no good explanation for this. I have one now: E = m·v² is the correct energy formula for our zero-mass particle. 🙂
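If you want to see the two cases side by side, here’s a small symbolic sketch (sympy again), comparing the kinetic-energy formula E = p²/(2m) with E = p·c, which is what E = m·v² amounts to for v = c:

```python
import sympy as sp

p, m, c = sp.symbols('p m c', positive=True)

# Case 1: E = p²/(2m). Since ω = E/ħ and k = p/ħ, vg = ∂E/∂p and vp = E/p.
E1 = p**2 / (2 * m)
print(sp.diff(E1, p), E1 / p)   # -> p/m (= v) and p/(2*m) (= v/2): the troublesome 1/2 factor

# Case 2: E = p·c, i.e. E = m·v² for v = c: group and phase velocity now coincide.
E2 = p * c
print(sp.diff(E2, p), E2 / p)   # -> c and c
```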

The quantization of energy and the zero-mass particle

Let’s now think about the quantization of energy. What’s the smallest value for E that we could possible think of? That’s h, isn’t it? That’s the energy of one cycle of an oscillation according to the Planck-Einstein relation (E = h·f). Well… Perhaps it’s ħ? Because… Well… We saw energy levels were separated by ħ, rather than h, when studying the blackbody radiation problem. So is it ħ = h/2π? Is the natural unit a radian (i.e. a unit distance), rather than a cycle?

Neither is natural, I’d say. We also have the Uncertainty Principle, which suggests the smallest possible energy value is ħ/2, because ΔxΔp = ΔtΔE = ħ/2.

Huh? What’s the logic here?

Well… I am not quite sure but my intuition tells me the quantum of energy must be related to the quantum of time, and the quantum of distance.

Huh? The quantum of time? The quantum of distance? What’s that? The Planck scale?

No. Or… Well… Let me correct that: not necessarily. I am just thinking in terms of logical concepts here. Logically, as we think of the smallest of the smallest, then our time and distance variables must become count variables, so they can only take on some integer value n = 0, 1, 2 etcetera. So then we’re literally counting in time and/or distance units. So Δx and Δt are then equal to 1. Hence, Δp and ΔE are then equal to Δp = ΔE = ħ/2. Just think of the radian (i.e. the unit in which we measure θ) as measuring both time as well as distance. Makes sense, no?

No? Well… Sorry. I need to move on. So the smallest possible value for m = E = p would be ħ/2. Let’s substitute that in Schrödinger’s equation, or in that set of equations Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ) and Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ). We get:

  1. Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ) = −(ħ/ħ)·Im(∇²ψ) = −Im(∇²ψ)
  2. Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ) = (ħ/ħ)·Re(∇²ψ) = Re(∇²ψ)

Bingo! The Re(∂ψ/∂t) = −Im(∇²ψ) and Im(∂ψ/∂t) = Re(∇²ψ) equations were what I was looking for. Indeed, I wanted to find something that was structurally similar to the ∂B/∂t = –∇×E and ∂E/∂t = ∇×B equations—and something that was exactly similar: no coefficients in front or anything. 🙂

What about our wavefunction? Using the de Broglie relations once more (k = p/ħ, and ω = E/ħ), our e^(i(kx − ωt)) = cos(kx−ωt) + i∙sin(kx−ωt) now becomes:

e^(i(kx − ωt)) = e^(i(ħ·x/2 − ħ·t/2)/ħ) = e^(i(x/2 − t/2)) = cos[(x−t)/2] + i∙sin[(x−t)/2]

Hmm… Interesting! So we’ve got that 1/2 factor now in the argument of our wavefunction! I really feel I am close to squaring the circle here. 🙂 Indeed, it must be possible to relate the ∂B/∂t = –∇×E and ∂E/∂t = c²∇×B to the Re(∂ψ/∂t) = −Im(∇²ψ) and Im(∂ψ/∂t) = Re(∇²ψ) equations. I am sure it’s a complicated exercise. It’s likely to involve the formula for the Lorentz force, which says that the force on a unit charge is equal to E + v×B, with v the velocity of the charge. Why? Note the vector cross-product. Also note that ∂B/∂t and ∂E/∂t are vector-valued functions, not scalar-valued functions. Hence, in that sense, ∂B/∂t and ∂E/∂t are not like the Re(∂ψ/∂t) and/or Im(∂ψ/∂t) functions. But… Well… For the rest, think of it: E and B are orthogonal vectors, and that’s how we usually interpret the real and imaginary part of a complex number as well: the real and imaginary axis are orthogonal too!

So I am almost there. Who can help me prove what I want to prove here? The two propagation mechanisms are the “same-same but different”, as they say in Asia. The difference between the two propagation mechanisms must also be related to that fundamental dichotomy in Nature: the distinction between bosons and fermions. Indeed, when combining two directional quantities (i.e. two vectors), we like to think there are four different ways of doing that, as shown below. However, when we’re only interested in the magnitude of the result (and not in its direction), then the first and third result below are really the same, as are the second and fourth combination. Now, we’ve got pretty much the same in quantum math: we can, in theory, combine complex-valued amplitudes in four different ways but, in practice, we only have two (rather than four) types of behavior: bosons versus fermions.

vector addition

Is our zero-mass particle just the electric field vector?

Let’s analyze that e^(i(x/2 − t/2)) = cos[(x−t)/2] + i∙sin[(x−t)/2] wavefunction some more. It’s easy to represent it graphically. The following animation does the trick:

Animation

I am sure you’ve seen this animation before: it represents a circularly polarized electromagnetic wave… Well… Let me be precise: it represents the electric field vector (E) of such a wave only. The B vector is not shown here, but you know where and what it is: orthogonal to the E vector, as shown below—for a linearly polarized wave.

emwave2

Let’s think some more. What is that e^(i(x/2 − t/2)) function? It’s subject to conceiving time and distance as countable variables, right? I am tempted to say: as discrete variables, but I won’t go that far—not now—because the countability may be related to a particular interpretation of quantum physics. So I need to think about that. In any case… The point is that x can only take on values like 0, 1, 2, etcetera. And the same goes for t. To make things easy, we’ll not consider negative values for x right now (and, obviously, not for t either). So we’ve got an infinite set of points like:

  • e^(i(0/2 − 0/2)) = cos(0) + i∙sin(0)
  • e^(i(1/2 − 0/2)) = cos(1/2) + i∙sin(1/2)
  • e^(i(0/2 − 1/2)) = cos(−1/2) + i∙sin(−1/2)
  • e^(i(1/2 − 1/2)) = cos(0) + i∙sin(0)

Now, I quickly opened Excel and calculated those cosine and sine values for x and t going from 0 to 14 below. It’s really easy. Just five minutes of work. You should do it yourself as an exercise. The result is shown below. Both graphs connect 14×14 = 196 data points, but you can see what’s going on: this does, effectively, represent the elementary wavefunction of a particle traveling in spacetime. In fact, you can see its speed is equal to 1, i.e. it effectively travels at the speed of light, as it should: the wave velocity is v = f·λ = (ω/2π)·(2π/k) = ω/k = (1/2)/(1/2) = 1. The amplitude of our wave doesn’t change along the x = t diagonal. As the Last Samurai puts it, just before he moves to the Other World: “Perfect! They are all perfect!” 🙂

graph imaginarygraph real
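If you’d rather not fire up Excel, here’s a minimal numpy version of the same exercise—just a sketch; plot the two arrays with your favorite tool to reproduce the graphs:

```python
import numpy as np

x = np.arange(0, 15)                  # x = 0, 1, 2, …, 14
t = np.arange(0, 15)                  # t = 0, 1, 2, …, 14
X, T = np.meshgrid(x, t)

theta = (X - T) / 2                   # the argument (x − t)/2
re = np.cos(theta)                    # real part of e^(i(x−t)/2)
im = np.sin(theta)                    # imaginary part of e^(i(x−t)/2)

print(np.allclose(re**2 + im**2, 1))  # True: the absolute square is 1 everywhere
print(np.allclose(np.diag(re), 1))    # True: the amplitude doesn't change along the x = t diagonal
```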

In fact, in case you wonder what the quantum vacuum could possibly look like, you should probably think of these discrete spacetime points, and some complex-valued wave that travels as it does in the illustration above.

Of course, that elementary wavefunction above does not localize our particle. For that, we’d have to add a potentially infinite number of such elementary wavefunctions, so we’d write the wavefunction as a sum of ∑ aj·e^(iθj) functions. [I use the symbol j here for the subscript, rather than the more conventional i symbol for a subscript, so as to avoid confusion with the symbol used for the imaginary unit.] The aj coefficients are the contribution that each of these elementary wavefunctions would make to the composite wave. What could they possibly be? Not sure. Let’s first look at the argument of our elementary component wavefunctions. We’d inject uncertainty in it. So we’d say that m = E = p is equal to

m = E = p = ħ/2 + j·ħ with j = 0, 1, 2,…

That amounts to writing: m = E = p = ħ/2, ħ, 3ħ/2, 2ħ, 5ħ/2, etcetera. Waw! That’s nice, isn’t it? My intuition tells me that our aj coefficients will be smaller for higher j, so the aj(j) function would be some decreasing function. What shape? Not sure. Let’s first sum up our thoughts so far:

  1. The elementary wavefunction of a zero-mass particle (again, I mean zero rest mass) in free space is associated with an energy that’s equal to ħ/2.
  2. The zero-mass particle travels at the speed of light, obviously (because it has zero rest mass), and its kinetic energy is equal to E = m·v²/2 = m·c²/2.
  3. However, its total energy is equal to E = m·v² = m·c²: it has some hidden energy. Why? Just because it exists.
  4. We may associate its kinetic energy with the real part of its wavefunction, and the hidden energy with its imaginary part. However, you should remember that the imaginary part of the wavefunction is as essential as its real part, so the hidden energy is equally real. 🙂

So… Well… Isn’t this just nice?

I think it is. Another obvious advantage of this way of looking at the elementary wavefunction is that – at first glance at least – it provides an intuitive understanding of why we need to take the (absolute) square of the wavefunction to find the probability of our particle being at some point in space and time. The energy of a wave is proportional to the square of its amplitude. Now, it is reasonable to assume the probability of finding our (point) particle would be proportional to the energy and, hence, to the square of the amplitude of the wavefunction, which is given by those aj(j) coefficients.

Huh?

OK. You’re right. I am a bit too fast here. It’s a bit more complicated than that, of course. The argument of probability being proportional to energy being proportional to the square of the amplitude of the wavefunction only works for a single wave a·e^(iθ). The argument does not hold water for a sum of functions ∑ aj·e^(iθj). Let’s write it all out. Taking our m = E = p = ħ/2 + j·ħ = ħ/2, ħ, 3ħ/2, 2ħ, 5ħ/2,… formula into account, this sum would look like:

a1·e^(i(x − t)(1/2)) + a2·e^(i(x − t)(2/2)) + a3·e^(i(x − t)(3/2)) + a4·e^(i(x − t)(4/2)) + …

But—Hey! We can write this as some power series, can’t we? We just need to add a0·e^(i(x − t)(0/2)) = a0, and then… Well… It’s not so easy, actually. Who can help me? I am trying to find something like this:

power series

Or… Well… Perhaps something like this:

power series 2

Whatever power series it is, we should be able to relate it to this one—I’d hope:

power series 3

Hmm… […] It looks like I’ll need to re-visit this, but I am sure it’s going to work out. Unfortunately, I’ve got no more time today, so I’ll let you have some fun now with all of this. 🙂 By the way, note that the result of the first power series is only valid for |x| < 1. 🙂
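One simple way to get a closed form—purely as a sketch, and assuming the coefficients fall off geometrically, aj = a^j with |a| < 1 (my assumption here, not an established result)—is to note that the sum then becomes an ordinary geometric series in the complex variable a·e^(i(x−t)/2):

```python
import numpy as np

a = 0.5                          # assumed geometric decay of the coefficients, |a| < 1
theta = 0.7                      # some value of (x − t)/2
j = np.arange(0, 200)            # enough terms for the series to converge

partial_sum = np.sum(a**j * np.exp(1j * j * theta))
closed_form = 1.0 / (1.0 - a * np.exp(1j * theta))
print(np.isclose(partial_sum, closed_form))   # True: ∑ aʲ·e^(ijθ) = 1/(1 − a·e^(iθ))
```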

Note 1: What we should also do now is to re-insert mass in the equations. That should not be too difficult. It’s consistent with classical theory: the total energy of some moving mass is E = m·c², out of which m·v²/2 is the classical kinetic energy. All the rest – i.e. m·c² − m·v²/2 – is potential energy, and so that includes the energy that’s ‘hidden’ in the imaginary part of the wavefunction. 🙂

Note 2: I really didn’t pay much attention to dimensions when doing all of these manipulations above but… Well… I don’t think I did anything wrong. Just to give you some more feel for that wavefunction e^(i(kx − ωt)), please do a dimensional analysis of its argument. I mean, k = p/ħ, and ω = E/ħ, so check the dimensions:

  • Momentum is expressed in newton·second, and we divide it by the quantum of action, which is expressed in newton·meter·second. So we get something per meter. But then we multiply it with x, so we get a dimensionless number.
  • The same is true for the ωt term. Energy is expressed in joule, i.e. newton·meter, and so we divide it by ħ once more, so we get something per second. But then we multiply it with t, so… Well… We do get a dimensionless number: a number that’s expressed in radians, to be precise. And so the radian does, indeed, integrate both the time as well as the distance dimension. 🙂

Schrödinger’s equation and the two de Broglie relations

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter presented here, which you can find by clicking on the provided link. In fact, I actually made some (small) mistakes when writing the post below.

Original post:

I’ve re-visited the de Broglie equations a couple of times already. In this post, however, I want to relate them to Schrödinger’s equation. Let’s start with the de Broglie equations first. Equations. Plural. Indeed, most popularizing books on quantum physics will give you only one of the two de Broglie equations—the one that associates a wavelength (λ) with the momentum (p) of a matter-particle:

λ = h/p

In fact, even the Wikipedia article on the ‘matter wave’ starts off like that and is, therefore, very confusing, because, for a good understanding of quantum physics, one needs to realize that the λ = h/p equality is just one of a pair of two ‘matter wave’ equations:

  1. λ = h/p
  2. f = E/h

These two equations give you the spatial and temporal frequency of the wavefunction respectively. Now, those two frequencies are related – and I’ll show you how in a minute – but they are not the same. It’s like space and time: they are related, but they are definitely not the same. Now, because any wavefunction is periodic, the argument of the wavefunction – which we’ll introduce shortly – will be some angle and, hence, we’ll want to express it in radians (or – if you’re really old-fashioned – degrees). So we’ll want to express the frequency as an angular frequency (i.e. in radians per second, rather than in cycles per second), and the wavelength as a wave number (i.e. in radians per meter). Hence, you’ll usually see the two de Broglie equations written as:

  1. k = p/ħ
  2. ω = E/ħ

It’s the same: ω = 2π∙f and f = 1/T (T is the period of the oscillation), and k = 2π/λ, and then ħ = h/2π, of course! [Just to remove all ambiguities: stop thinking about degrees. They’re a legacy of the Babylonians, who thought the numbers 6, 12, and 60 had particular religious significance. So that’s why we have twelve-hour nights and twelve-hour days, with each hour divided into sixty minutes and each minute divided into sixty seconds, and – particularly relevant in this context – why ‘once around’ is divided into 6×60 = 360 degrees. Radians are the unit in which we should measure angles because… Well… Google it. They measure an angle in distance units. That makes things easier—a lot easier! Indeed, when studying physics, the last thing you want is artificial units, like degrees.]

So… Where were we? Oh… Yes. The de Broglie relation. Popular textbooks usually commit two sins. One is that they forget to say we have two de Broglie relations, and the other one is that the E = h∙f relationship is presented as the twin of the Planck-Einstein relation for photons, which relates the energy (E) of a photon to its frequency (ν): E = h∙ν = ħ∙ω. The former is criminal neglect, I feel. As for the latter… Well… It’s true and not true: it’s incomplete, I’d say, and, therefore, also very confusing.

Why? Because both things lead one to try to relate the two equations, as momentum and energy are obviously related. In fact, I’ve wasted days, if not weeks, on this. How are they related? What formula should we use? To answer that question, we need to answer another one: what energy concept should we use? Potential energy? Kinetic energy? Should we include the equivalent energy of the rest mass?

One quickly gets into trouble here. For example, one can try the kinetic energy, K.E. = m∙v2/2, and use the definition of momentum (p = m∙v), to write E = p2/(2m), and then we could relate the frequency f to the wavelength λ using the general rule that the traveling speed of a wave is equal to the product of its wavelength and its frequency (v = λ∙f). But if E = p2/(2m) and f = v/λ, we get:

p2/(2m) = h∙v/λ ⇔  λ = 2∙h/p

So that is almost right, but not quite: that factor 2 should not be there. In fact, it’s easy to see that we’d get de Broglie’s λ = h/p equation from his E = h∙f equation if we’d use E = m∙v2 rather than E = m∙v2/2. In fact, the E = m∙v2 relation comes out of them if we just multiply the two and, yes, use that v = f·λ relation once again:

  1. f·λ = (E/h)·(h/p) = E/p
  2. v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v2

But… Well… E = m∙v2? How could we possibly justify the use of that formula?

The answer is simple: our v = f·λ equation is wrong. It’s just something one shouldn’t apply to the complex-valued wavefunction. The ‘correct’ velocity formula for the complex-valued wavefunction should have that 1/2 factor, so we’d write 2·f·λ = v to make things come out alright. But where would this formula come from?

Well… Now it’s time to introduce the wavefunction.

The wavefunction

You know the elementary wavefunction:

ψ = ψ(x, t) = e−i(ωt − kx) = ei(kx − ωt) = cos(kx−ωt) + i∙sin(kx−ωt)

As for terminology, note that the term ‘wavefunction’ refers to what I write above, while the term ‘wave equation’ usually refers to Schrödinger’s equation, which I’ll introduce in a minute. Also note the use of boldface indicates we’re talking vectors, so we’re multiplying the wavenumber vector k with the position vector x = (x, y, z) here, although we’ll often simplify and assume one-dimensional space. In any case…

So the question is: why can’t we use the v = f·λ formula for this wave? The period of cosθ + isinθ is the same as that of the sine and cosine function considered separately: cos(θ+2π) + isin(θ+2π) = cosθ + isinθ, so T = 2π and f = 1/T = 1/2π do not change. So the f, T and λ should be the same, no?

No. We’ve got two oscillations for the price of one here: one ‘real’ and one ‘imaginary’—but both are equally essential and, hence, equally ‘real’. So we’re actually combining two waves. So it’s just like adding other waves: when adding waves, one gets a composite wave that has (a) a phase velocity and (b) a group velocity.

Huh? Yes. It’s quite interesting. When adding waves, we usually have a different ω and k for each of the component waves, and the phase and group velocity will depend on the relation between those ω’s and k’s. That relation is referred to as the dispersion relation. To be precise, if you’re adding waves, then the phase velocity of the composite wave will be equal to vp = ω/k, and its group velocity will be equal to vg = dω/dk. We’ll usually be interested in the group velocity, and so to calculate that derivative, we need to express ω as a function of k, of course, so we write ω as some function of k, i.e. ω = ω(k). There are a number of possibilities then:

  1. ω and k may be directly proportional, so we can write ω as ω = a∙k: in that case, we find that vp = vg = a.
  2. ω and k are not directly proportional but have a linear relationship, so we can write ω as ω = a∙k + b. In that case, we find that vg = a and… Well… We’ve got a problem calculating vp, because we don’t know what k to use!
  3. ω and k may be non-linearly related, in which case… Well… One does have to do the calculation and see what comes out. 🙂

Let’s now look back at our ei(kx − ωt) = cos(kx−ωt) + i∙sin(kx−ωt) function. You’ll say that we’ve got only one ω and one k here, so we’re not adding waves with different ω’s and k’s. So… Well… What?

That’s where the de Broglie equations come in. Look: k = p/ħ, and ω = E/ħ. If we now use the correct energy formula, i.e. the kinetic energy formula E = m·v2/2 (rather than that nonsensical E = m·v2 equation) – which we can also write as E = m·v·v/2 = p·v/2 = p·p/2m = p2/2m, with v = p/m the classical velocity of the elementary particle that Louis de Broglie was thinking of – then we can calculate the group velocity of our ei(kx − ωt) = cos(kx−ωt) + i∙sin(kx−ωt) as:

vg = dω/dk = d[E/ħ]/d[p/ħ] = dE/dp = d[p2/2m]/dp = 2p/2m = p/m = v

However, the phase velocity of our ei(kx − ωt) is:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = (p2/2m)/p = p/2m = v/2

So that factor 1/2 only appears for the phase velocity. Weird, isn’t it? We find that the group velocity (vg) of the ei(kx − ωt) function is equal to the classical velocity of our particle (i.e. v), but that its phase velocity (vp) is equal to v divided by 2.
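
If you want to double-check those two results without doing the derivatives by hand, here is a minimal symbolic sketch. I am using Python’s sympy here purely as an illustration (any computer algebra system will do), together with the non-relativistic dispersion relation ω = ħ·k2/(2m) that follows from the two de Broglie relations and E = p2/(2m):

```python
import sympy as sp

# Non-relativistic dispersion relation omega(k) = hbar*k**2/(2*m), with k = p/hbar.
hbar, m, k = sp.symbols('hbar m k', positive=True)
omega = hbar * k**2 / (2 * m)

v_group = sp.diff(omega, k)   # d(omega)/dk
v_phase = omega / k           # omega/k

print(sp.simplify(v_group))   # hbar*k/m     = p/m    = v   (the classical velocity)
print(sp.simplify(v_phase))   # hbar*k/(2*m) = p/(2m) = v/2 (half of it)
```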

Hmm… What to say? Well… Nothing much—except that it makes sense, and very much so, because it’s the group velocity of the wavefunction that’s associated with the classical velocity of a particle, not the phase velocity. In fact, if we include the rest mass in our energy formula, so if we’d use the relativistic E = γm0c2 and p = γm0v formulas (with γ the Lorentz factor), then we find that vp = ω/k = E/p = (γm0c2)/(γm0v) = c2/v, and so that’s a superluminal velocity, because v is always smaller than c!

What? That’s even weirder! If we take the kinetic energy only, we find a phase velocity equal to v/2, but if we include the rest energy, then we get a superluminal phase velocity. It must be one or the other, no? Yep! You’re right! So that makes us wonder: is E = m·v2/2 really the right energy concept to use? The answer is unambiguous: no! It isn’t! And, just for the record, our young nobleman didn’t use the kinetic energy formula when he postulated his equations in his now famous PhD thesis.

So what did he use then? Where did he get his equations?

I am not sure. 🙂 A stroke of genius, it seems. According to Feynman, that’s how Schrödinger got his equation too: intuition, brilliance. In short, a stroke of genius. 🙂 Let’s relate these two gems.

Schrödinger’s equation and the two de Broglie relations

Louis de Broglie and Erwin Schrödinger published their equations in 1924 and 1926 respectively. Can they be related? The answer is: yes—of course! Let’s first look at de Broglie’s energy concept, however. Louis de Broglie was very familiar with Einstein’s work and, hence, he knew that the energy of a particle consisted of three parts:

  1. The particle’s rest energy m0c2, which de Broglie referred to as internal energy (Eint): this ‘internal energy’ includes the rest mass of the ‘internal pieces’, as he put it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy);
  2. Any potential energy it may have because of some field (so de Broglie was not assuming the particle was traveling in free space), which we’ll denote by V: the field(s) can be anything—gravitational, electromagnetic—you name it: whatever changes the energy because of the position of the particle;
  3. The particle’s kinetic energy, which we wrote in terms of its momentum p: K.E. = m·v2/2 = m2·v2/(2m) = (m·v)2/(2m) = p2/(2m).

Indeed, in my previous posts, I would write the wavefunction as de Broglie wrote it, which is as follows:

ψ(θ) = ψ(x, t) = a·eiθ = a·e−i[(Eint + p2/(2m) + V)·t − p∙x]/ħ 

In those posts – such as my post on virtual particles – I’d also note how a change in potential energy plays out: a change in potential energy, when moving from one place to another, would change the wavefunction, but through the momentum only—so it would impact the spatial frequency only. So the change in potential would not change the temporal frequencies ω1 = [Eint + p12/(2m) + V1]/ħ and ω2 = [Eint + p22/(2m) + V2]/ħ. Why? Or why not, I should say? Because of the energy conservation principle—or its equivalent in quantum mechanics. The temporal frequency f or ω, i.e. the time-rate of change of the phase of the wavefunction, does not change: all of the change in potential, and the corresponding change in kinetic energy, goes into changing the spatial frequency, i.e. the wave number k or the wavelength λ, as potential energy becomes kinetic or vice versa.

So is that consistent with what we wrote above, that E = m·v2? Maybe. Let’s think about it. Let’s first look at Schrödinger’s equation in free space (i.e. a space with zero potential) once again:

Schrodinger's equation 2

If we insert our ψ = ei(kx − ωt) formula in Schrödinger’s free-space equation, we get the following nice result. [To keep things simple, we’re just assuming one-dimensional space for the calculations, so ∇2ψ = ∂2ψ/∂x2. But the result can easily be generalized.] The time derivative on the left-hand side is ∂ψ/∂t = −iω·ei(kx − ωt). The second-order derivative on the right-hand side is ∂2ψ/∂x2 = (ik)·(ik)·ei(kx − ωt) = −k2·ei(kx − ωt) . The ei(kx − ωt) factor on both sides cancels out and, hence, equating both sides gives us the following condition:

−iω = −(iħ/2m)·k2 ⇔ ω = (ħ/2m)·k2

Substituting ω = E/ħ and k = p/ħ yields:

E/ħ = (ħ/2m)·p2/ħ2 = m2·v2/(2m·ħ) = m·v2/(2ħ) ⇔ E = m·v2/2

Bingo! We get that kinetic energy formula! But now… What if we’d not be considering free space? In other words: what if there is some potential? Well… We’d use the complete Schrödinger equation, which is:

schrodinger 5

Huh? Why is there a minus sign now? Look carefully: I moved the iħ factor on the left-hand side to the other side when writing the free space version. If we’d do that for the complete equation, we’d get:

Schrodinger's equation 3

I like that representation a lot more—if only because it makes it a lot easier to interpret the equation—but, for some reason I don’t quite understand, you won’t find it like that in textbooks. Now how does it work when using the complete equation, so we add the −(i/ħ)·V·ψ term? It’s simple: the ei(kx − ωt) factor also cancels out, and so we get:

−iω = −(iħ/2m)·k2 − (i/ħ)·V ⇔ ω = (ħ/2m)·k2 + V/ħ

Substituting ω = E/ħ and k = p/ħ once more now yields:

E/ħ = (ħ/2m)·p2/ħ2 + V/ħ = m2·v2/(2m·ħ) + V/ħ = m·v2/(2ħ) + V/ħ ⇔ E = m·v2/2 + V

Bingo once more!
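
You can let a computer do the substitution too. The sketch below (Python with sympy, my own choice, and assuming the standard textbook form i·ħ·∂ψ/∂t = −(ħ2/2m)·∇2ψ + V·ψ of the complete equation) plugs ψ = ei(kx − ωt) in and confirms the ħ·ω = ħ2·k2/(2m) + V condition, i.e. E = m·v2/2 + V:

```python
import sympy as sp

# Substituting psi = exp(i*(k*x - w*t)) into i*hbar*dpsi/dt = -(hbar**2/2m)*d2psi/dx2 + V*psi
# should give the condition hbar*w = hbar**2*k**2/(2m) + V, i.e. E = m*v**2/2 + V.
x, t = sp.symbols('x t', real=True)
hbar, m, k, w, V = sp.symbols('hbar m k omega V', positive=True)

psi = sp.exp(sp.I * (k * x - w * t))
lhs = sp.I * hbar * sp.diff(psi, t)
rhs = -(hbar**2 / (2 * m)) * sp.diff(psi, x, 2) + V * psi

condition = sp.simplify((lhs - rhs) / psi)   # the common exp factor cancels out
print(sp.Eq(condition, 0))                   # hbar*omega - hbar**2*k**2/(2*m) - V = 0
```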

The only thing that’s missing now is the particle’s rest energy m0c2, which de Broglie referred to as internal energy (Eint). That includes everything, i.e. not only the rest mass of the ‘internal pieces’ (as said, now we call those ‘internal pieces’ quarks) but also their binding energy (i.e. the quarks’ interaction energy). So how do we get that energy concept out of Schrödinger’s equation? There’s only one answer to that: that energy is just like V. We can, quite simply, just add it.

That brings us to the last and final question: what about our vg = v result if we do not use the kinetic energy concept, but the E = m·v2/2 + V + Eint concept? The answer is simple: nothing. We still get the same result, because we’re taking a derivative and the V and Eint just appear as constants, and so their derivative with respect to p is zero. Check it:

vg = dω/dk = d[E/ħ]/d[p/ħ] = dE/dp = d[p2/2m + V + Eint ]/dp = 2p/2m = p/m = v

It’s now pretty clear how this thing works. To localize our particle, we just superimpose a zillion of these ei(kx − ωt) functions. The only condition is that we’ve got that fixed vg = dω/dk = v relationship, and we do have such a fixed relationship—as you can see above. In fact, the Wikipedia article on the dispersion relation mentions that the de Broglie equations imply the following relation between ω and k: ω = ħk2/2m. As you can see, that’s not entirely correct: the author conveniently forgets the potential (V) and the rest energy (Eint) in the energy formula here!

What about the phase velocity? That’s a different story altogether. You can think about that for yourself. 🙂

I should make one final point here. As said, in order to localize a particle (or, to be precise, its wavefunction), we’re going to add a zillion elementary wavefunctions, each of which will make its own contribution to the composite wave. That contribution is captured by some coefficient ai in front of every eiθi function, so we’ll have a zillion aieiθi functions, really. [Yep. Bit confusing: I use i here as a subscript, as well as for the imaginary unit.] In case you wonder how that works out with Schrödinger’s equation, the answer is – once again – very simple: both the time derivative (which is just a first-order derivative) and the Laplacian are linear operators, so Schrödinger’s equation, for a composite wave, can just be re-written as the sum of a zillion ‘elementary’ wave equations.
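
For those who want to see that localization-by-superposition at work, here is a minimal numerical sketch (Python with numpy, in units where ħ = m = 1, and with a Gaussian choice for the ai coefficients; all of this is my own illustration, not anything from the original text): the peak of the packet travels at the group velocity dω/dk = k0/m, not at the phase velocity.

```python
import numpy as np

# Build a wave packet as a sum of elementary wavefunctions a_i*exp(i*(k_i*x - w_i*t)),
# with the free-particle dispersion w = k**2/2 (units with hbar = m = 1).
hbar = m = 1.0
k0, sigma_k = 5.0, 0.5                        # central wave number and spread
ks = np.linspace(k0 - 4*sigma_k, k0 + 4*sigma_k, 400)
a  = np.exp(-(ks - k0)**2 / (2 * sigma_k**2)) # Gaussian coefficients a_i
ws = hbar * ks**2 / (2 * m)                   # dispersion relation w(k)

x = np.linspace(-10, 60, 2000)

def packet(t):
    # superposition: sum over i of a_i * exp(i*(k_i*x - w_i*t))
    return (a[:, None] * np.exp(1j * (ks[:, None] * x - ws[:, None] * t))).sum(axis=0)

for t in (0.0, 5.0, 10.0):
    peak = x[np.abs(packet(t)).argmax()]
    print(t, peak, peak / t if t else 0.0)    # the peak moves at ~k0/m = 5, the group velocity
```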

So… Well… We’re all set now to effectively use Schrödinger’s equation to calculate the orbitals for a hydrogen atom, which is what we’ll do in our next post.

In the meanwhile, you can amuse yourself with reading a nice Wikibook article on the Laplacian, which gives you a nice feel for what Schrödinger’s equation actually represents—even if I gave you a good feel for that too on my Essentials page. Whatever. You choose. Just let me know what you liked best. 🙂

Oh… One more point: the vg = dω/dk = d[p2/2m]/dp = p/m = v calculation obviously assumes we can treat m as a constant. In fact, what we’re actually doing is a rather complicated substitution of variables: you should write it all out—but that’s not the point here. The point is that we’re actually doing a non-relativistic calculation. Now, that does not mean that the wavefunction isn’t consistent with special relativity. It is. In fact, in one of my posts, I show how we can explain relativistic length contraction using the wavefunction. But it does mean that our calculation of the group velocity is not relativistically correct. But that’s a minor point: I’ll leave it for you as an exercise to calculate the relativistically correct formula for the group velocity. Have fun with it! 🙂

Note: Notations are often quite confusing. One should, generally speaking, denote a frequency by ν (nu), rather than by f, so as to not cause confusion with any function f, but then… Well… You create a new problem when you do that, because that Greek letter nu (ν) looks damn similar to the v of velocity, so that’s why I’ll often use f when I should be using nu (ν). As for the units, a frequency is expressed in cycles per second, while the angular frequency ω is expressed in radians per second. One cycle covers 2π radians and, therefore, we can write: ν = ω/2π. Hence, h∙ν = h∙ω/2π = ħ∙ω. Both ν as well as ω measure the time-rate of change of the phase of the wave function, as opposed to k, i.e. the spatial frequency of the wave function, which depends on the speed of the wave. Physicists also often use the symbol v for the speed of a wave, which is also hugely confusing, because it’s also used to denote the classical velocity of the particle. And then there’s two wave velocities, of course: the group versus the phase velocity. In any case… I find the use of that other symbol (c) for the wave velocity even more confusing, because this symbol is also used for the speed of light, and the speed of a wave is not necessarily (read: usually not) equal to the speed of light. In fact, both the group as well as the phase velocity of a particle wave are very different from the speed of light. The speed of a wave and the speed of light only coincide for electromagnetic waves and, even then, it should be noted that photons also have amplitudes to travel faster or slower than the speed of light.

Schrödinger’s equation as an energy conservation law

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter presented here, which you can find by clicking on the provided link.

Original post:

In the movie about Stephen Hawking’s life, The Theory of Everything, there is talk about a single unifying equation that would explain everything in the universe. I must assume the real Stephen Hawking is familiar with Feynman’s unworldliness equation: U = 0, which – as Feynman convincingly demonstrates – effectively integrates all known equations in physics. It’s one of Feynman’s many jokes, of course, but an exceptionally clever one, as the argument convincingly shows there’s no such thing as one equation that explains all. Or, to be precise, one can, effectively, ‘hide‘ all the equations you want in a single equation, but it’s just a trick. As Feynman puts it: “When you unwrap the whole thing, you get back where you were before.”

Having said that, some equations in physics are obviously more fundamental than others. You can readily think of obvious candidates: Einstein’s mass-energy equivalence (m = E/c2); the wavefunction (ψ = ei(ω·t − k·x)) and the two de Broglie relations that come with it (ω = E/ħ and k = p/ħ); and, of course, Schrödinger’s equation, which we wrote as:

Schrodinger's equation

In my post on quantum-mechanical operators, I drew your attention to the fact that this equation is structurally similar to the heat diffusion equation. Indeed, assuming the heat per unit volume (q) is proportional to the temperature (T) – which is the case when expressing T in degrees Kelvin (K), so we can write q as q = k·T  – we can write the heat diffusion equation as:

heat diffusion 2

Moreover, I noted the similarity is not only structural. There is more to it: both equations model energy flows and/or densities. Look at it: the dimension of the left- and right-hand side of Schrödinger’s equation is the energy dimension: both quantities are expressed in joule. [Remember: a time derivative is a quantity expressed per second, and the dimension of Planck’s constant is the joule·second. You can figure out the dimension of the right-hand side yourself.] As for the heat diffusion equation: the time derivative on the left-hand side is expressed in K/s. The constant in front (k) is just the (volume) heat capacity of the substance, which is expressed in J/(m3·K). So the dimension of k·(∂T/∂t) is J/(m3·s). On the right-hand side we have that Laplacian, whose dimension is K/m2, multiplied by the thermal conductivity, whose dimension is W/(m·K) = J/(m·s·K). Hence, the dimension of the product is the same as the left-hand side: J/(m3·s).

We can present the thing in various ways: if we bring k to the other side, then we’ve got something expressed per second on the left-hand side, and something expressed per square meter on the right-hand side—but the k/κ factor makes it alright. The point is: both Schrödinger’s equation and the diffusion equation are actually expressions of the energy conservation law. They’re both expressions of Gauss’ flux theorem (but in differential form, rather than in integral form) which, as you know, pops up everywhere when talking energy conservation.

Huh? 

Yep. I’ll give another example. Let me first remind you that the k·(∂T/∂t) = ∂q/∂t = κ·∇2T equation can also be written as:

heat diffusion 3

The h in this equation is, obviously, not Planck’s constant, but the heat flow vector, i.e. the heat that flows through a unit area in a unit time, and h is, obviously, equal to h = −κ∇T. And, of course, you should remember your vector calculus here: ∇· is the divergence operator. In fact, we used the equation above, with ∇·h rather than ∇2T to illustrate the energy conservation principle. Now, you may or may not remember that we gave you a similar equation when talking about the energy of fields and the Poynting vector:

Poynting vector

This immediately triggers the following reflection: if there’s a ‘Poynting vector’ for heat flow (h), and for the energy of fields (S), then there must be some kind of ‘Poynting vector’ for amplitudes too! I don’t know which one, but it must exist! And it’s going to be some complex vector, no doubt! But it should be out there.

It also makes me think of a point I’ve made a couple of times already—about the similarity between the E and B vectors that characterize the traveling electromagnetic field, and the real and imaginary part of the traveling amplitude. Indeed, the similarity between the two illustrations below cannot be a coincidence. In both cases, we’ve got two oscillating magnitudes that are orthogonal to each other, always. As such, they’re not independent: one follows the other, or vice versa.

5d_euler_f

FG02_06

The only difference is the phase shift. Euler’s formula incorporates a phase shift—remember: sinθ = cos(θ − π/2)—and so you don’t have that with the E and B vectors. But – Hey! – isn’t that why bosons and fermions are different? 🙂

[…]

This is great fun, and I’ll surely come back to it. As for now, however, I’ll let you ponder the matter for yourself. 🙂

Post scriptum: I am sure that all the questions I raise here will be answered at the Masters’ level, most probably in some course dealing with quantum field theory, of course. 🙂 In any case, I am happy I can already anticipate such questions. 🙂

Oh – and, as for those two illustrations above, the animation below is one that should help you to think things through. It’s the electric field vector of a traveling circularly polarized electromagnetic wave, as opposed to the linearly polarized light that was illustrated above.

Animation

Quantum-mechanical operators

We climbed a mountain—step by step, post by post. 🙂 We have reached the top now, and the view is gorgeous. We understand Schrödinger’s equation, which describes how amplitudes propagate through space-time. It’s the quintessential quantum-mechanical expression. Let’s enjoy now, and deepen our understanding by introducing the concept of (quantum-mechanical) operators.

The operator concept

We’ll introduce the operator concept using Schrödinger’s equation itself and, in the process, deepen our understanding of Schrödinger’s equation a bit. You’ll remember we wrote it as:

schrodinger 5

However, you’ve probably seen it like it’s written on his bust, or on his grave, or wherever, which is as follows:

simple

Grave

It’s the same thing, of course. The ‘over-dot’ is Newton’s notation for the time derivative. In fact, if you click on the picture above (and zoom in a bit), then you’ll see that the craftsman who made the stone grave marker, mistakenly, also carved a dot above the psi (ψ) on the right-hand side of the equation—but then someone pointed out his mistake and so the dot on the right-hand side isn’t painted. 🙂 The thing I want to talk about here, however, is the H in that expression above, which is, obviously, the following operator:

H

That’s a pretty monstrous operator, isn’t it? It is what it is, however: an algebraic operator (it operates on a number—albeit a complex number—unlike a matrix operator, which operates on a vector or another matrix). As you can see, it actually consists of two other (algebraic) operators:

  1. The ∇2 operator, which you know: it’s a differential operator. To be specific, it’s the Laplace operator, which is the divergence (∇·) of the gradient (∇) of a function: ∇2 = ∇·∇ = (∂/∂x, ∂/∂y, ∂/∂z)·(∂/∂x, ∂/∂y, ∂/∂z) = ∂2/∂x2 + ∂2/∂y2 + ∂2/∂z2. This too operates on our complex-valued wavefunction ψ, and yields some other complex-valued function, which we then multiply by −ħ2/2m to get the first term.
  2. The V(x, y, z) ‘operator’, which—in this particular context—just means: “multiply with V”. Needless to say, V is the potential here, and so it captures the presence of external force fields. Also note that V is a real number, just like −ħ2/2m.

Let me say something about the dimensions here. On the left-hand side of Schrödinger’s equation, we have the product of i·ħ and a time derivative (i is just the imaginary unit, so that’s just a (complex) number). Hence, the dimension there is [J·s]/[s] (the dimension of a time derivative is something expressed per second). So the dimension of the left-hand side is joule. On the right-hand side, we’ve got two terms. The dimension of that second-order derivative (∇2ψ) is something expressed per square meter, but then we multiply it with −ħ2/2m, whose dimension is [J2·s2]/[J/(m2/s2)]. [Remember: m = E/c2.] So that reduces to [J·m2]. Hence, the dimension of (−ħ2/2m)∇2ψ is joule. And the dimension of V is joule too, of course. So it all works out. In fact, now that we’re here, it may or may not be useful to remind you of that heat diffusion equation we discussed when introducing the basic concepts involved in vector analysis:

diffusion equation

That equation illustrated the physical significance of the Laplacian. We were talking about the flow of heat in, say, a block of metal, as illustrated below. The q in the equation above is the heat per unit volume, and the h in the illustration below was the heat flow vector (so it’s got nothing to do with Planck’s constant), which depended on the material, and which we wrote as h = –κ∇T, with T the temperature, and κ (kappa) the thermal conductivity. In any case, the point is the following: the equation below illustrates the physical significance of the Laplacian. We let it operate on the temperature (i.e. a scalar function) and its product with some constant (just think of replacing κ by −ħ2/2m) gives us the time derivative of q, i.e. the heat per unit volume.

heat flow

In fact, we know that q is proportional to T, so if we’d choose an appropriate temperature scale – i.e. choose the zero point such that q = k·T (your physics teacher in high school would refer to k as the (volume) specific heat capacity) – then we could simply write:

∂T/∂t = (κ/k)∇2T

From a mathematical point of view, that equation is just the same as ∂ψ/∂t = (i·ħ/2m)·∇2ψ, which is Schrödinger’s equation for V = 0. In other words, you can – and actually should – also think of Schrödinger’s equation as describing the flow of… Well… What?

Well… Not sure. I am tempted to think of something like a probability density in space, but ψ represents a (complex-valued) amplitude. Having said that, you get the idea—I hope! 🙂 If not, let me paraphrase Feynman on this:

“We can think of Schrödinger’s equation as describing the diffusion of a probability amplitude from one point to another. In fact, the equation looks something like the diffusion equation we introduced when discussing heat flow, or the spreading of a gas. But there is one main difference: the imaginary coefficient in front of the time derivative makes the behavior completely different from the ordinary diffusion such as you would have for a gas spreading out. Ordinary diffusion gives rise to real exponential solutions, whereas the solutions of Schrödinger’s equation are complex waves.”

That says it all, right? 🙂 In fact, Schrödinger’s equation – as discussed here – was actually being derived when describing the motion of an electron along a line of atoms, i.e. for motion in one direction only, but you can visualize what it represents in three-dimensional space. The real exponential functions Feynman refers to are exponential decay functions: as the energy is spread over an ever-increasing volume, the amplitude of the wave becomes smaller and smaller. That may be the case for complex-valued exponentials as well. The key difference between a real- and complex-valued exponential decay function is that a complex exponential is a cyclical function. Now, I quickly googled to see how we could visualize that, and I like the following illustration:

decay
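
If you want to see Feynman’s point for yourself, you don’t need much: just track a single Fourier mode ei·k·x under both equations. Under ordinary diffusion its amplitude decays as e−D·k2·t, while under Schrödinger’s free-space equation it just rotates in the complex plane, keeping its modulus equal to one. A minimal sketch (Python, with arbitrary unit values that I picked purely for illustration):

```python
import numpy as np

# One Fourier mode exp(i*k*x): compare its time factor under the two equations.
# Heat/diffusion: dpsi/dt = D*d2psi/dx2           -> factor exp(-D*k**2*t)            (real decay)
# Schroedinger:   dpsi/dt = i*(hbar/2m)*d2psi/dx2 -> factor exp(-i*(hbar*k**2/2m)*t)  (rotation)
D, hbar, m, k = 1.0, 1.0, 1.0, 2.0

for t in (0.0, 0.5, 1.0, 2.0):
    heat = np.exp(-D * k**2 * t)
    schr = np.exp(-1j * (hbar * k**2 / (2 * m)) * t)
    print(f"t={t}: heat mode = {heat:.3f}, "
          f"Schroedinger mode = {schr.real:.3f}{schr.imag:+.3f}j (modulus = {abs(schr):.3f})")
```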

The dimensional analysis of Schrödinger’s equation is also quite interesting because… Well… Think of it: that heat diffusion equation incorporates the same dimensions: temperature is a measure of the average energy of the molecules. That’s really something to think about. These differential equations are not only structurally similar but, in addition, they all seem to describe some flow of energy. That’s pretty deep stuff: it relates amplitudes to energies, so we should think in terms of Poynting vectors and all that. But… Well… I need to move on, and so I will move on—so you can re-visit this later. 🙂

Now that we’ve introduced the concept of an operator, let me say something about notations, because that’s quite confusing.

Some remarks on notation

Because it’s an operator, we should actually use the hat symbol—in line with what we did when we were discussing matrix operators: we’d distinguish the matrix (e.g. A) from its use as an operator (Â). You may or may not remember we do the same in statistics: the hat symbol is supposed to distinguish the estimator (â) – i.e. some function we use to estimate a parameter (which we usually denoted by some Greek symbol, like α) – from a specific estimate of the parameter, i.e. the value (a) we get when applying â to a specific sample or observation. However, if you remember the difference, you’ll also remember that hat symbol was quickly forgotten, because the context made it clear what was what, and so we’d just write a(x) instead of â(x). So… Well… I’ll be sloppy as well here, if only because the WordPress editor only offers very few symbols with a hat! 🙂

In any case, this discussion on the use (or not) of that hat is irrelevant. In contrast, what is relevant is to realize this algebraic operator H here is very different from that other quantum-mechanical Hamiltonian operator we discussed when dealing with a finite set of base states: that H was the Hamiltonian matrix, but used in an ‘operation’ on some state. So we have the matrix operator H, and the algebraic operator H.

Confusing?

Yes and no. First, we’ve got the context again, and so you always know whether you’re looking at continuous or discrete stuff:

  1. If your ‘space’ is continuous (i.e. if states are to be defined with reference to an infinite set of base states), then it’s the algebraic operator.
  2. If, on the other hand, your states are defined by some finite set of discrete base states, then it’s the Hamiltonian matrix.

There’s another, more fundamental, reason why there should be no confusion. In fact, it’s the reason why physicists use the same symbol H in the first place: despite the fact that they look so different, these two operators (i.e. H the algebraic operator and H the matrix operator) are actually equivalent. Their interpretation is similar, as evidenced from the fact that both are being referred to as the energy operator in quantum physics. The only difference is that one operates on a (state) vector, while the other operates on a continuous function. It’s just the difference between matrix mechanics as opposed to wave mechanics really.

But… Well… I am sure I’ve confused you by now—and probably very much so—and so let’s start from the start. 🙂

Matrix mechanics

Let’s start with the easy thing indeed: matrix mechanics. The matrix-mechanical approach is summarized in that set of Hamiltonian equations which, by now, you know so well:

new

If we have n base states, then we have n equations like this: one for each i = 1, 2,… n. As for the introduction of the Hamiltonian, and the other subscript (j), just think of the description of a state:

essential

So… Well… Because we had used i already, we had to introduce j. 🙂

Let’s think about |ψ〉. It is the state of a system, like the ground state of a hydrogen atom, or one of its many excited states. But… Well… It’s a bit of a weird term, really. It all depends on what you want to measure: when we’re thinking of the ground state, or an excited state, we’re thinking energy. That’s something else than thinking its position in space, for example. Always remember: a state is defined by a set of base states, and so those base states come with a certain perspective: when talking states, we’re only looking at some aspect of reality, really. Let’s continue with our example of energy states, however.

You know that the lifetime of a system in an excited state is usually short: some spontaneous or induced emission of a quantum of energy (i.e. a photon) will ensure that the system quickly returns to a less excited state, or to the ground state itself. However, you shouldn’t think of that here: we’re looking at stable systems here. To be clear: we’re looking at systems that have some definite energy—or so we think: it’s just because of the quantum-mechanical uncertainty that we’ll always measure some other different value. Does that make sense?

If it doesn’t… Well… Stop reading, because it’s only going to get even more confusing. Not my fault, however!

Psi-chology

The ubiquity of that ψ symbol (i.e. the Greek letter psi) is really something psi-chological 🙂 and, hence, very confusing, really. In matrix mechanics, our ψ would just denote a state of a system, like the energy of an electron (or, when there’s only one electron, our hydrogen atom). If it’s an electron, then we’d describe it by its orbital. In this regard, I found the following illustration from Wikipedia particularly helpful: the green orbitals show excitations of copper (Cu) orbitals on a CuO2 plane. [The two big arrows just illustrate the principle of X-ray spectroscopy, so it’s an X-ray probing the structure of the material.]

800px-CuO2-plane_in_high_Tc_superconductor

So… Well… We’d write ψ as |ψ〉 just to remind ourselves we’re talking of some state of the system indeed. However, quantum physicists always want to confuse you, and so they will also use the psi symbol to denote something else: they’ll use it to denote a very particular Ci amplitude (or coefficient) in that |ψ〉 = ∑|iCi formula above. To be specific, they’d replace the base states |i〉 by the continuous position variable x, and they would write the following:

Ci = ψ(i = x) = ψ(x) = Cψ(x) = C(x) = 〈x|ψ〉

In fact, that’s just like writing:

φ(p) = 〈 mom p | ψ 〉 = 〈p|ψ〉 = Cφ(p) = C(p)

What they’re doing here, is (1) reduce the ‘system‘ to a ‘particle‘ once more (which is OK, as long as you know what you’re doing) and (2) they basically state the following:

If a particle is in some state |ψ〉, then we can associate some wavefunction with it—ψ(x) or φ(p)—and that wavefunction will represent the amplitude for the system (i.e. our particle) to be at x, or to have a momentum that’s equal to p.

So what’s wrong with that? Well… Nothing. It’s just that… Well… Why don’t they use χ(x) instead of ψ(x)? That would avoid a lot of confusion, I feel: one should not use the same symbol (psi) for the |ψ〉 state and the ψ(x) wavefunction.

Huh? Yes. Think about it. The point is: the position or the momentum, or even the energy, are properties of the system, so to speak and, therefore, it’s really confusing to use the same symbol psi (ψ) to describe (1) the state of the system, in general, versus (2) the position wavefunction, which describes… Well… Some very particular aspect (or ‘state’, if you want) of the same system (in this case: its position). There’s no such problem with φ(p), so… Well… Why don’t they use χ(x) instead of ψ(x) indeed? I have only one answer: psi-chology. 🙂

In any case, there’s nothing we can do about it and… Well… In fact, that’s what this post is about: it’s about how to describe certain properties of the system. Of course, we’re talking quantum mechanics here and, hence, uncertainty, and, therefore, we’re going to talk about the average position, energy, momentum, etcetera that’s associated with a particular state of a system, or—as we’ll keep things very simple—the properties of a ‘particle’, really. Think of an electron in some orbital, indeed! 🙂

So let’s now look at that set of Hamiltonian equations once again:

new

Looking at it carefully – so just look at it once again! 🙂 – and thinking about what we did when going from the discrete to the continuous setting, we can now understand we should write the following for the continuous case:

equivalence

Of course, combining Schrödinger’s equation with the expression above implies the following:

equality

Now how can we relate that integral to the expression on the right-hand side? I’ll have to disappoint you here, as it requires a lot of math to transform that integral. It requires writing H(x, x’) in terms of rather complicated functions, including – you guessed it, didn’t you? – Dirac’s delta function. Hence, I assume you’ll believe me if I say that the matrix- and wave-mechanical approaches are actually equivalent. In any case, if you’d want to check it, you can always read Feynman yourself. 🙂

Now, I wrote this post to talk about quantum-mechanical operators, so let me do that now.

Quantum-mechanical operators

You know the concept of an operator. As mentioned above, we should put a little hat (^) on top of our Hamiltonian operator, so as to distinguish it from the matrix itself. However, as mentioned above, the difference is usually quite clear from the context. Our operators were all matrices so far, and we’d write the matrix elements of, say, some operator A, as:

Aij ≡ 〈 i | A | j 〉

The whole matrix itself, however, would usually not act on a base state but… Well… Just on some more general state ψ, to produce some new state φ, and so we’d write:

| φ 〉 = A | ψ 〉

Of course, we’d have to describe | φ 〉 in terms of the (same) set of base states and, therefore, we’d expand this expression into something like this:

operator 2

You get the idea. I should just add one more thing. You know this important property of amplitudes: the 〈 ψ | φ 〉 amplitude is the complex conjugate of the 〈 φ | ψ 〉 amplitude. It’s got to do with time reversibility, because the complex conjugate of eiθ = ei(ω·t−k·x) is e−iθ = e−i(ω·t−k·x), so we’re just reversing the x- and t-direction. We write:

 〈 ψ | φ 〉 = 〈 φ | ψ 〉*

Now what happens if we want to take the complex conjugate when we insert a matrix, so when writing 〈 φ | A | ψ 〉 instead of 〈 φ | ψ 〉? The rule then becomes:

〈 φ | A | ψ 〉* = 〈 ψ | A† | φ 〉

The dagger symbol denotes the conjugate transpose, so A† is an operator whose matrix elements are equal to Aij† = Aji*. Now, it may or may not happen that the A† matrix is actually equal to the original A matrix. In that case – and only in that case – we can write:

〈 ψ | A | φ 〉 = 〈 φ | A | ψ 〉*

We then say that A is a ‘self-adjoint’ or ‘Hermitian’ operator. That’s just a definition of a property, which the operator may or may not have—but many quantum-mechanical operators are actually Hermitian. In any case, we’re well armed now to discuss some actual operators, and we’ll start with that energy operator.
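
A quick numerical illustration may help here. The sketch below (Python with numpy; the 3×3 operator and the states are randomly generated, so this is just my own illustration, not anything from the text) checks the general 〈 φ | A | ψ 〉* = 〈 ψ | A† | φ 〉 rule, and then shows what the Hermitian case adds:

```python
import numpy as np

rng = np.random.default_rng(0)

# Some arbitrary complex states and an arbitrary (non-Hermitian) operator A.
psi = rng.normal(size=3) + 1j * rng.normal(size=3)
phi = rng.normal(size=3) + 1j * rng.normal(size=3)
A   = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

bra = lambda v: v.conj()                # <v| is the complex conjugate of |v>
amp = lambda f, M, s: bra(f) @ M @ s    # the amplitude <f|M|s>

# General rule: <phi|A|psi>* = <psi|A_dagger|phi>
print(np.isclose(amp(phi, A, psi).conj(), amp(psi, A.conj().T, phi)))   # True

# For a Hermitian operator (A_dagger = A) we get <psi|A|phi> = <phi|A|psi>*
H = A + A.conj().T                      # an easy way to cook up a Hermitian matrix
print(np.isclose(amp(psi, H, phi), amp(phi, H, psi).conj()))            # True
```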

The energy operator (H)

We know the state of a system is described in terms of a set of base states. Now, our analysis of N-state systems showed we can always describe it in terms of a special set of base states, which are referred to as the states of definite energy because… Well… Because they’re associated with some definite energy. In that post, we referred to these energy levels as En (n = I, II,… N). We used boldface for the subscript n (so we wrote a boldface n instead of a regular n) because of these Roman numerals. With each energy level, we could associate a base state, of definite energy indeed, that we wrote as |n〉. To make a long story short, we summarized our results as follows:

  1. The energies EI, EII,…, En,…, EN are the eigenvalues of the Hamiltonian matrix H.
  2. The state vectors |n〉 that are associated with each energy En, i.e. the set of vectors |n〉, are the corresponding eigenstates.

We’ll be working with some more subscripts in what follows, and these Roman numerals and the boldface notation are somewhat confusing (if only because I don’t want you to think of these subscripts as vectors), so we’ll just denote EI, EII,…, En,…, EN as E1, E2,…, Ei,…, EN, and we’ll number the states of definite energy accordingly, also using some Greek letter so as to clearly distinguish them from all our Latin letter symbols: we’ll write these states as: |η1〉, |η2〉,… |ηN〉. [If I say, ‘we’, I mean Feynman of course. You may wonder why he doesn’t write |Ei〉, or |εi〉. The answer is: writing |En〉 would cause confusion, because this state will appear in expressions like: |Ei〉Ei, so that’s the ‘product’ of a state (|Ei〉) and the associated scalar (Ei). Too confusing. As for using η (eta) instead of ε (epsilon) to denote something that’s got to do with energy… Well… I guess he wanted to keep the resemblance with the n, and then the Ancient Greeks apparently did use this η letter for a sound like ‘e‘ so… Well… Why not? Let’s get back to the lesson.]

Using these base states of definite energy, we can write the state of the system as:

|ψ〉 = ∑ |ηi〉 Ci = ∑ |ηi〉〈ηi|ψ〉 over all (i = 1, 2,… , N)

Now, we didn’t talk all that much about what these base states actually mean in terms of measuring something but you’ll believe me if I say that, when measuring the energy of the system, we’ll always measure one or the other E1, E2,…, Ei,…, EN value. We’ll never measure something in-between: it’s either/or. Now, as you know, measuring something in quantum physics is supposed to be destructive but… Well… Let us imagine we could make a thousand measurements to try to determine the average energy of the system. We’d do so by counting the number of times we measure E1 (and of course we’d denote that number as N1), E2, E3, etcetera. You’ll agree that we’d measure the average energy as:

E average

However, measurement is destructive, and we actually know what the expected value of this ‘average’ energy will be, because we know the probabilities of finding the system in a particular base state. That probability is equal to the absolute square of that Ci coefficient above, so we can use the Pi = |Ci|2 formula to write:

〈Eav〉 = ∑ Pi Ei over all (i = 1, 2,… , N)

Note that this is a rather general formula. It’s got nothing to do with quantum mechanics: if Ai represents the possible values of some quantity A, and Pi is the probability of getting that value, then (the expected value of) the average A will also be equal to 〈Aav〉 = ∑ Pi Ai. No rocket science here! 🙂 But let’s now apply our quantum-mechanical formulas to that 〈Eav〉 = ∑ Pi Ei formula. [Oh—and I apologize for using the same angle brackets 〈 and 〉 to denote an expected value here—sorry for that! But it’s what Feynman does—and other physicists! You see: they don’t really want you to understand stuff, and so they often use very confusing symbols.] Remembering that the absolute square of a complex number equals the product of that number and its complex conjugate, we can re-write the 〈Eav〉 = ∑ Pi Ei formula as:

〈Eav〉 = ∑ Pi Ei = ∑ |Ci|2 Ei = ∑ Ci*·Ci·Ei = ∑ 〈ψ|ηi〉〈ηi|ψ〉Ei = ∑ 〈ψ|ηi〉Ei〈ηi|ψ〉 over all i

Now, you know that Dirac’s bra-ket notation allows numerous manipulations. For example, what we could do is take out that ‘common factor’ 〈ψ|, and so we may re-write that monster above as:

〈Eav〉 = 〈ψ| ∑ |ηi〉Ei〈ηi|ψ〉 = 〈ψ|φ〉, with |φ〉 = ∑ |ηi〉Ei〈ηi|ψ〉 over all i

Huh? Yes. Note the difference between |ψ〉 = ∑ |ηi〉 C = ∑ |ηi〉〈ηi|ψ〉 and |φ〉 = ∑ |ηiEi〈ηi|ψ〉. As Feynman puts it: φ is just some ‘cooked-up‘ state which you get by taking each of the base states |ηi〉 in the amount Ei〈ηi|ψ〉 (as opposed to the 〈ηi|ψ〉 amounts we took for ψ).

I know: you’re getting tired and you wonder why we need all this stuff. Just hang in there. We’re almost done. I just need to do a few more unpleasant things, one of which is to remind you that this business of the energy states being eigenstates (and the energy levels being eigenvalues) of our Hamiltonian matrix (see my post on N-state systems) comes with a number of interesting properties, including this one:

H |ηi〉 = Ei|ηi〉 = |ηi〉Ei

Just think about what’s written here: on the left-hand side, we’re multiplying a matrix with a (base) state vector, and on the right-hand side we’re multiplying it with a scalar. So our |φ〉 = ∑ |ηi〉Ei〈ηi|ψ〉 sum now becomes:

|φ〉 = ∑ H |ηi〉〈ηi|ψ〉 over all (i = 1, 2,… , N)

Now we can manipulate that expression some more so as to get the following:

|φ〉 = H ∑|ηi〉〈ηi|ψ〉 = H|ψ〉

Finally, we can re-combine this now with the 〈Eav〉 = 〈ψ|φ〉 equation above, and so we get the fantastic result we wanted:

〈Eav〉 = 〈 ψ | φ 〉 = 〈 ψ | H ψ 〉

Huh? Yes! To get the average energy, you operate on |ψ〉 with H, and then you multiply the result with 〈ψ|. It’s a beautiful formula. On top of that, the new formula for the average energy is not only pretty but also useful, because now we don’t need to say anything about any particular set of base states. We don’t even have to know all of the possible energy levels. When we have to calculate the average energy of some system, we only need to be able to describe the state of that system in terms of some set of base states, and we also need to know the Hamiltonian matrix for that set, of course. But if we know that, we can calculate its average energy.

You’ll say that’s not a big deal because… Well… If you know the Hamiltonian, you know everything, so… Well… Yes. You’re right: it’s less of a big deal than it seems. Having said that, the whole development above is very interesting because of something else: we can easily generalize it for other physical measurements. I call it the ‘average value’ operator idea, but you won’t find that term in any textbook. 🙂 Let me explain the idea.
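
Here is a small numerical sketch of that result (Python with numpy; the 4×4 ‘Hamiltonian’ and the state are made up for the purpose of the illustration): it computes 〈Eav〉 once as 〈ψ|H|ψ〉 and once as ∑ Pi·Ei over the energy eigenstates, and the two numbers coincide.

```python
import numpy as np

rng = np.random.default_rng(1)

# Cook up a Hermitian 'Hamiltonian' and a normalized state |psi> (both arbitrary).
M = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (M + M.conj().T) / 2
psi = rng.normal(size=4) + 1j * rng.normal(size=4)
psi /= np.linalg.norm(psi)

# Route 1: the 'average value operator' recipe  <E> = <psi|H|psi>
E_avg_operator = (psi.conj() @ H @ psi).real

# Route 2: expand |psi> in the energy eigenstates |eta_i> and compute sum_i P_i*E_i
E_i, eta = np.linalg.eigh(H)          # eigenvalues E_i, eigenvectors as columns of eta
C_i = eta.conj().T @ psi              # C_i = <eta_i|psi>
P_i = np.abs(C_i)**2                  # probabilities, which add up to 1
E_avg_probability = np.sum(P_i * E_i)

print(E_avg_operator, E_avg_probability)   # the two numbers coincide
```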

The average value operator (A)

The development above illustrates how we can relate a physical observable, like the (average) energy (E), to a quantum-mechanical operator (H). Now, the development above can easily be generalized to any observable that would be proportional to the energy. It’s perfectly reasonable, for example, to assume the angular momentum – as measured in some direction, of course, which we usually refer to as the z-direction – would be proportional to the energy, and so then it would be easy to define a new operator Lz, which we’d define as the operator of the z-component of the angular momentum L. [I know… That’s a bit of a long name but… Well… You get the idea.] So we can write:

〈Lz〉av = 〈 ψ | Lz ψ 〉

In fact, further generalization yields the following grand result:

If a physical observable A is related to a suitable quantum-mechanical operator Â, then the average value of A for the state | ψ 〉 is given by:

〈A〉av = 〈 ψ | Â ψ 〉 = 〈 ψ | φ 〉, with | φ 〉 = Â | ψ 〉

At this point, you may have second thoughts, and wonder: what state | ψ 〉? The answer is: it doesn’t matter. It can be any state, as long as we’re able to describe it in terms of a chosen set of base states. 🙂

OK. So far, so good. The next step is to look at how this works for the continuity case.

The energy operator for wavefunctions (H)

We can start thinking about the continuous equivalent of the 〈Eav〉 = 〈ψ|H|ψ〉 expression by first expanding it. We write:

e average continuous function

You know the continuous equivalent of a sum like this is an integral, i.e. an infinite sum. Now, because we’ve got two subscripts here (i and j), we get the following double integral:

double integral

Now, I did take my time to walk you through Feynman’s derivation of the energy operator for the discrete case, i.e. the operator when we’re dealing with matrix mechanics, but I think I can simplify my life here by just copying Feynman’s succinct development:

Feynman

Done! Given a wavefunction ψ(x), we get the average energy by doing that integral above. Now, the quantity in the braces of that integral can be written as that operator we introduced when we started this post:

H

So now we can write that integral much more elegantly. It becomes:

〈Eav〉 = ∫ ψ*(x)·H·ψ(x)·dx

You’ll say that doesn’t look like 〈Eav〉 = 〈 ψ | H ψ 〉! It does. Remember that 〈 ψ | = | ψ 〉*. 🙂 Done!

I should add one qualifier though: the formula above assumes our wavefunction has been normalized, so all probabilities add up to one. But that’s a minor thing. The only thing left to do now is to generalize to three dimensions. That’s easy enough. Our expression becomes a volume integral:

〈Eav〉 = ∫ ψ*(r)·H·ψ(r)·dV

Of course, dV stands for dVolume here, not for any potential energy, and, of course, once again we assume all probabilities over the volume add up to 1, so all is normalized. Done! 🙂
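
To convince yourself the integral recipe actually works, here is a minimal numerical sketch (Python with numpy, in units where ħ = m = ω = 1; the example, a harmonic oscillator ground state, is my own choice): the ∫ ψ*(x)·H·ψ(x)·dx integral should return the well-known ground-state energy E = 1/2.

```python
import numpy as np

# <E> = integral of psi*(x) H psi(x) dx, with H = -(hbar**2/2m) d2/dx2 + V(x).
# Test case (hbar = m = omega = 1): harmonic oscillator V = x**2/2, ground state
# psi(x) = pi**(-1/4)*exp(-x**2/2), for which the exact answer is E = 1/2.
x  = np.linspace(-10, 10, 4001)
dx = x[1] - x[0]
psi = np.pi**(-0.25) * np.exp(-x**2 / 2)          # already normalized
V   = x**2 / 2

d2psi = np.gradient(np.gradient(psi, dx), dx)     # second derivative (finite differences)
H_psi = -0.5 * d2psi + V * psi

E_avg = np.sum(psi.conj() * H_psi) * dx
print(E_avg)    # ~0.5, i.e. the ground-state energy
```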

We’re almost done with this post. What’s left is the position and momentum operator. You may think this is going to another lengthy development but… Well… It turns out the analysis is remarkably simple. Just stay with me a few more minutes and you’ll have earned your degree. 🙂

The position operator (x)

The thing we need to solve here is really easy. Look at the illustration below as representing the probability density of some particle being at x. Think about it: what’s the average position?

average position

Well? What? The (expected value of the) average position is just this simple integral: 〈x〉av = ∫ x·P(x) dx, over the whole range of possible values for x. 🙂 That’s all. Of course, because P(x) = |ψ(x)|2 = ψ*(x)·ψ(x), this integral now becomes:

〈x〉av = ∫ ψ*(x)·x·ψ(x) dx

That looks exactly the same as 〈Eav〉 = ∫ ψ*(x)·H·ψ(x) dx, and so we can look at x as an operator too!

Huh? Yes. It’s an extremely simple operator: it just means “multiply by x“. 🙂

I know you’re shaking your head now: is it that easy? It is. Moreover, the ‘matrix-mechanical equivalent’ is equally simple but, as it’s getting late here, I’ll refer you to Feynman for that. 🙂

The momentum operator (px)

Now we want to calculate the average momentum of, say, some electron. What integral would you use for that? […] Well… What? […] It’s easy: it’s the same thing as for x. We can just substitute p for x in that 〈x〉av = ∫ x·P(x) dx formula, so we get:

〈p〉av = ∫ p·P(p) dp, over the whole range of possible values for p

Now, you might think the rest is equally simple, and… Well… It actually is simple but there’s one additional thing in regard to the need to normalize stuff here. You’ll remember we defined a momentum wavefunction (see my post on the Uncertainty Principle), which we wrote as:

φ(p) = 〈 mom p | ψ 〉

Now, in the mentioned post, we related this momentum wavefunction to the particle’s ψ(x) = 〈x|ψ〉 wavefunction—which we should actually refer to as the position wavefunction, but everyone just calls it the particle’s wavefunction, which is a bit of a misnomer, as you can see now: a wavefunction describes some property of the system, and so we can associate several wavefunctions with the same system, really! In any case, we noted the following there:

  • The two probability density functions, φ(p) and ψ(x), look pretty much the same, but the half-width (or standard deviation) of one was inversely proportional to the half-width of the other. To be precise, we found that the constant of proportionality was equal to ħ/2, and wrote that relation as follows: σp = (ħ/2)/σx.
  • We also found that, when using a regular normal distribution function for ψ(x), we’d have to normalize the probability density function by inserting a (2πσx2)−1/2 in front of the exponential.

Now, it’s a bit of a complicated argument, but the upshot is that we cannot just write what we usually write, i.e. Pi = |Ci|2 or P(x) = |ψ(x)|2. No. We need to put a normalization factor in front, which combines the two factors I mentioned above. To be precise, we have to write:

P(p) = |〈p|ψ〉|2/(2πħ)

So… Well… Our 〈p〉av = ∫ p·P(p) dp integral can now be written as:

〈p〉av = ∫ 〈ψ|p〉·p·〈p|ψ〉 dp/(2πħ)

So that integral is totally like what we found for 〈x〉av and so… We could just leave it at that, and say we’ve solved the problem. In that sense, it is easy. However, having said that, it’s obvious we’d want some solution that’s written in terms of ψ(x), rather than in terms of φ(p), and that requires some more manipulation. I’ll refer you, once more, to Feynman for that, and I’ll just give you the result:

momentum operator

So… Well… It turns out that the momentum operator – which I tentatively denoted as px above – is not so simple as our position operator (x). Still… It’s not hugely complicated either, as we can write it as:

px ≡ (ħ/i)·(∂/∂x)
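
Here is a small numerical check of both the position and the momentum recipe (Python with numpy, ħ = 1, and a made-up test wavefunction: a normalized Gaussian centered at x0, multiplied by a plane-wave factor ei·k0·x). We should find 〈x〉 ≈ x0 and 〈p〉 ≈ ħ·k0:

```python
import numpy as np

# Test wavefunction: Gaussian envelope centred at x0, carrier wave exp(i*k0*x).
# Expected: <x> = x0 and <p> = hbar*k0 (here hbar = 1).
hbar, x0, k0, sigma = 1.0, 2.0, 3.0, 1.0
x  = np.linspace(-15, 20, 7001)
dx = x[1] - x[0]

psi = ((2 * np.pi * sigma**2)**(-0.25)
       * np.exp(-(x - x0)**2 / (4 * sigma**2))
       * np.exp(1j * k0 * x))

# Position operator: just 'multiply by x'.
x_avg = np.sum(psi.conj() * x * psi).real * dx

# Momentum operator: (hbar/i) * d/dx, with the derivative taken numerically.
dpsi_dx = np.gradient(psi, dx)
p_avg = np.sum(psi.conj() * (hbar / 1j) * dpsi_dx).real * dx

print(x_avg, p_avg)   # ~2.0 and ~3.0
```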

Of course, the purists amongst you will, once again, say that I should be more careful and put a hat wherever I’d need to put one so… Well… You’re right. I’ll wrap this all up by copying Feynman’s overview of the operators we just explained, and so he does use the fancy symbols. 🙂

overview

Well, folks—that’s it! Off we go! You know all about quantum physics now! We just need to work ourselves through the exercises that come with Feynman’s Lectures, and then you’re ready to go and bag a degree in physics somewhere. So… Yes… That’s what I want to do now, so I’ll be silent for quite a while now. Have fun! 🙂

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

Dirac’s delta function and Schrödinger’s equation in three dimensions

Feynman’s rather informal derivation of Schrödinger’s equation – following Schrödinger’s own logic when he published his famous paper on it back in 1926 – is wonderfully simple but, as I mentioned in my post on it, does lack some mathematical rigor here and there. Hence, Feynman hastens to dot all of the i‘s and cross all of the t‘s in the subsequent Lectures. We’ll look at two things here:

  1. Dirac’s delta function, which ensures proper ‘normalization’. In fact, as you’ll see in a moment, it’s more about ‘orthogonalization’ than normalization. 🙂
  2. The generalization of Schrödinger’s equation to three dimensions (in space) and also including the presence of external force fields (as opposed to the usual ‘free space’ assumption).

The second topic is the most interesting, of course, and also the easiest, really. However, let’s first use our energy to grind through the first topic. 🙂

Dirac’s delta function

When working with a finite set of discrete states, a fundamental condition is that the base states be ‘orthogonal’, i.e. they must satisfy the following equation:

〈 i | j 〉 = δij, with δij = 1 if i = j and δij = 0 if i ≠ j

Needless to say, the base states i and j are rather special vectors in a rather special mathematical space (a so-called Hilbert space) and so it’s rather tricky to interpret their ‘orthogonality’ in any geometric way, although such geometric interpretation is often actually possible in simple quantum-mechanical systems: you’ll just notice a ‘right’ angle may actually be a 45° or 180° angle, or whatever. 🙂 In any case, that’s not the point here. The question is: if we move to an infinite number of base states – like we did when we introduced the ψ(x) and φ(p) wavefunctions – what happens to that condition?

Your first reaction is going to be: nothing. Because… Well… Remember that, for a two-state system, in which we have two base states only, we’d fully describe some state | φ 〉 as a linear combination of the base states, so we’d write:

| φ 〉 =| I 〉 CI + | II 〉 CII 

Now, while saying we were talking a Hilbert space here, I did add we could use the same expression to define the base states themselves, so I wrote the following triviality:

M1

Trivial but sensible. So we’d associate the base state | I 〉 with the base vector (1, 0) and, likewise, base state | II 〉 with the base vector (0, 1). When explaining this, I added that we could easily extend to an N-state system and so there’s a perfect analogy between the 〈 i | j 〉 bra-ket expression in quantum math and the ei·ej product in the run-of-the-mill coordinate spaces that you’re used to. So why can’t we just extend the concept to an infinite-state system and move to base vectors with an infinite number of elements, which we could write as ei = (…, 0, ei = 1, 0, 0,…) and ej = (…, 0, 0, ej = 1, 0,…), thereby ensuring 〈 i | j 〉 = ei·ej = δij, always! The ‘orthogonality’ condition looks simple enough indeed, and so we could re-write it as:

〈 x | x' 〉 = δxx', with δxx' = 1 if x = x' and δxx' = 0 if x ≠ x'
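
Before looking at what goes wrong in the continuous case, it may help to see the finite-dimensional analogy in plain numbers. The little Python sketch below (with made-up amplitudes, of course) just restates that 〈 i | j 〉 = ei·ej = δij idea and the expansion of a state in terms of base states for a two-state system.

```python
import numpy as np

# Two base states of a two-state system, written as ordinary base vectors.
base_I = np.array([1.0, 0.0])   # | I >
base_II = np.array([0.0, 1.0])  # | II >

# The 'orthogonality' condition: <i|j> = ei.ej = delta_ij.
print(np.dot(base_I, base_I), np.dot(base_I, base_II))   # 1.0 0.0

# Some state | phi > = | I > C_I + | II > C_II, with made-up amplitudes.
C_I, C_II = 0.6 + 0.0j, 0.8j
phi = C_I * base_I + C_II * base_II

# We recover the amplitudes by projecting on the base states: C_i = <i|phi>.
print(np.vdot(base_I, phi), np.vdot(base_II, phi))        # (0.6+0j) and 0.8j

# And the probabilities add up to one: |C_I|^2 + |C_II|^2 = 1.
print(abs(C_I)**2 + abs(C_II)**2)
```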

However, when moving from a space with a finite number of dimensions to a space with an infinite number of dimensions, there are some issues. They pop up, for example, when we insert that 〈 x | x' 〉 = δxx' function (note that we're talking some function here of x and x', indeed, so we'll write it as f(x, x') in the next step) in that 〈φ|ψ〉 = ∫〈φ|x〉〈x|ψ〉dx integral.

Huh? What integral? Relax: that 〈φ|ψ〉 = ∫〈φ|x〉〈x|ψ〉dx integral just generalizes our 〈φ|ψ〉 = ∑〈φ|x〉〈x|ψ〉 expression for discrete settings to the continuous case. Just look at it. When substituting the base state x' for φ, we get:

〈x'|ψ〉 = ψ(x') = ∫ 〈x'|x〉 〈x|ψ〉 dx ⇔ ψ(x') = ∫ 〈x'|x〉 ψ(x) dx

You’ll say: what’s the problem? Well… From a mathematical point of view, it’s a bit difficult to find a function 〈x’|x〉 = f(x, x’) which, when multiplied with a wavefunction ψ(x), and integrated over all x, will just give us ψ(x’). A bit difficult? Well… It’s worse than that: it’s actually impossible!

Huh? Yes. Feynman illustrates the difficulty for x’ = 0, but he could have picked whatever value, really. In any case, if x’ = 0, we can write f(x, 0) = f(x), and our integral now reduces to:

ψ(0) = ∫ f(x) ψ(x) dx

This is a weird requirement: the left-hand side is just some non-zero value ψ(0), but we want the f(x) in the integrand to be zero for all x ≠ 0 and, for any ordinary function that is zero everywhere except at a single point, the integral on the right-hand side will be zero. So we have an impossible situation: we want a function that is zero everywhere but for one point and that, at the same time, gives us a finite integral when using it in that integral above.

You're likely to shake your head now and say: what the hell? Does it matter? It does: it is an actual problem in quantum math. Well… I should say: it was an actual problem in quantum math. Dirac solved it. He invented a new function which looks a bit less simple than our suggested generalization of Kronecker's delta for the continuous case (i.e. that 〈 x | x' 〉 = δxx' conjecture above). Dirac's function is – quite logically – referred to as the Dirac delta function, and it's actually defined by that integral above, in the sense that we impose the following two conditions on it:

  • δ(x − x') = 0 if x ≠ x' (so that's just like the first of our two conditions for that 〈 x | x' 〉 = δxx' function)
  • ∫δ(x − x')ψ(x) dx = ψ(x') (so that's not like the second of our two conditions for that 〈 x | x' 〉 = δxx' function)

Indeed, that second condition is much more sophisticated than our 〈 x | x' 〉 = 1 if x = x' condition. In fact, one can show that the second condition amounts to finding some function satisfying this condition:

∫δ(x)dx = 1

We get this by equating x’ to zero once more and, additionally, by equating ψ(x) to 1. [Please do double-check yourself.] Of course, this ‘normalization’ (or ‘orthogonalization’) problem all sounds like a lot of hocus-pocus and, in many ways, it is. In fact, we’re actually talking a mathematical problem here which had been lying around for centuries (for a brief overview, see the Wikipedia article on it). So… Well… Without further ado, I’ll just give you the mathematical expression now—and please don’t stop reading now, as I’ll explain it in a moment:

δ(x) = lim(a→0) (2πa²)−1/2·e−x²/2a²

I will also credit Wikipedia with the following animation, which shows that the expression above is just the normal distribution function, and which shows what happens when that a, i.e. its standard deviation, goes to zero: Dirac’s delta function is just the limit of a sequence of (zero-centered) normal distributions. That’s all. Nothing more, nothing less.

Dirac_function_approximation
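
You can check that limit idea numerically. The sketch below (my own little illustration, so the test function and the grid are just made up) takes narrower and narrower zero-centered normal distributions δa(x) and verifies that ∫δa(x)·ψ(x)·dx approaches ψ(0) as a goes to zero.

```python
import numpy as np

def delta_a(x, a):
    """Zero-centered normal distribution with standard deviation a."""
    return np.exp(-x**2 / (2 * a**2)) / (a * np.sqrt(2 * np.pi))

def psi(x):
    """Some arbitrary smooth test function; psi(0) = 1 here."""
    return np.cos(x) * np.exp(-x**2 / 10)

x = np.linspace(-10, 10, 200001)
dx = x[1] - x[0]

for a in [1.0, 0.1, 0.01]:
    integral = np.sum(delta_a(x, a) * psi(x)) * dx   # crude Riemann sum
    print(a, integral)   # tends to psi(0) = 1 as a -> 0
```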

But how do we interpret it? Well… I can’t do better than Feynman as he describes what’s going on really:

“Dirac's δ(x) function has the property that it is zero everywhere except at x = 0 but, at the same time, it has a finite integral equal to unity. [See the ∫δ(x)dx = 1 equation.] One should imagine that the δ(x) function has such fantastic infinity at one point that the total area comes out equal to one.”

Well… That says it all, I guess. 🙂 Don't you love the way he puts it? It's not an 'ordinary' infinity. No. It's fantastic. Frankly, I think these guys were all fantastic. 🙂 The point is: that special function, Dirac's delta function, solves our problem. The equivalent expression for the 〈 i | j 〉 = δij condition for a finite and discrete set of base states is the following one for the continuous case:

〈 x | x' 〉 = δ(x − x')

The only thing left now is to generalize this result to three dimensions. Now that’s fairly straightforward. The ‘normalization’ condition above is all that’s needed in terms of modifying the equations for dealing with the continuum of base states corresponding to the points along a line. Extending the analysis to three dimensions goes as follows:

  • First, we replace the x coordinate by the vector r = (x, y, z)
  • As a result, integrals over x become integrals over x, y and z. In other words, they become volume integrals.
  • Finally, the one-dimensional δ-function must be replaced by the product of three δ-functions: one in x, one in y and one in z. We write:

〈 r | r' 〉 = δ(x − x')·δ(y − y')·δ(z − z')

Feynman summarizes it all together as follows:

summary

What if we have two particles, or more? Well… Once again, I won’t bother to try to re-phrase the Grand Master as he explains it. I’ll just italicize or boldface the key points:

Suppose there are two particles, which we can call particle 1 and particle 2. What shall we use for the base states? One perfectly good set can be described by saying that particle 1 is at x1 and particle 2 is at x2, which we can write as | x1, x2 〉. Notice that describing the position of only one particle does not define a base state. Each base state must define the condition of the entire system, so you must not think that each particle moves independently as a wave in three dimensions. Any physical state | ψ 〉 can be defined by giving all of the amplitudes 〈 x1, x2 | ψ 〉 to find the two particles at x1 and x2. This generalized amplitude is therefore a function of the two sets of coordinates x1 and x2. You see that such a function is not a wave in the sense of an oscillation that moves along in three dimensions. Neither is it generally simply a product of two individual waves, one for each particle. It is, in general, some kind of a wave in the six dimensions defined by x1 and x2. Hence, if there are two particles in Nature which are interacting, there is no way of describing what happens to one of the particles by trying to write down a wave function for it alone. The famous paradoxes that we considered in earlier chapters—where the measurements made on one particle were claimed to be able to tell what was going to happen to another particle, or were able to destroy an interference—have caused people all sorts of trouble because they have tried to think of the wave function of one particle alone, rather than the correct wave function in the coordinates of both particles. The complete description can be given correctly only in terms of functions of the coordinates of both particles.

Now we really know it all, don’t we? 🙂

Well… Almost. I promised to tackle another topic as well. So here it is:

Schrödinger’s equation in three dimensions

Let me start by jotting down what we had found already, i.e. Schrödinger’s equation when only one coordinate in space is involved. It’s written as:

iħ·(∂ψ/∂t) = −(ħ²/2m)·(∂²ψ/∂x²)

Now, the extension to three dimensions is remarkably simple: we just substitute the ∂²/∂x² operator by the ∇² operator, i.e. ∇² = ∂²/∂x² + ∂²/∂y² + ∂²/∂z². We get:

iħ·(∂ψ/∂t) = −(ħ²/2m)·∇²ψ

Finally, we can also put forces on the particle, so now we are not looking at a particle moving in free space: we’ve got some force field working on it. It turns out the required modification is equally simple. The grand result is Schrödinger’s original equation in three dimensions:

iħ·(∂ψ/∂t) = −(ħ²/2m)·∇²ψ + V·ψ

V = V(x, y, z) is, of course, just the potential here. Remarkably simple equations but… How do we get these? Well… Sorry. The math is not too difficult, but you're well equipped now to look at Feynman's Lecture on it yourself. You really are. Trust me. I really dealt with all of the 'serious' stuff you need to understand how he's going about it in my previous posts so, yes, now I'll just sit back and relax. Or go biking. Or whatever. 🙂
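
Still, if you'd like to see the equation 'do something' before you go to Feynman, here is a minimal numerical sketch (one dimension only, ħ = m = 1, and a made-up harmonic potential) that just steps the one-dimensional version of iħ·∂ψ/∂t = −(ħ²/2m)·∇²ψ + V·ψ forward in time with the standard split-step Fourier trick. It's an illustration, not a derivation, and certainly not Feynman's.

```python
import numpy as np

hbar, m = 1.0, 1.0
N, L = 1024, 40.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L/N)          # wave numbers that go with the FFT grid

V = 0.5 * x**2                                    # made-up harmonic potential V(x)
psi = (2*np.pi)**-0.25 * np.exp(-(x - 3.0)**2/4)  # Gaussian packet, displaced from the minimum

dt, steps = 0.01, 500
half_V = np.exp(-1j * V * dt / (2*hbar))          # half a step with the potential term
full_T = np.exp(-1j * hbar * k**2 * dt / (2*m))   # a full step with the kinetic term

for _ in range(steps):
    psi = half_V * psi
    psi = np.fft.ifft(full_T * np.fft.fft(psi))
    psi = half_V * psi

prob = np.abs(psi)**2
print("total probability:", np.sum(prob) * (L/N))  # stays ~1: the evolution is unitary
print("<x> after t = 5  :", np.sum(x * prob) * (L/N))  # the packet sloshes back and forth
```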


The Uncertainty Principle

In my previous post, I showed how Feynman derives Schrödinger's equation using a historical and, therefore, quite intuitive approach. The approach was intuitive because the argument used a discrete model, so that's stuff we are well acquainted with—like a crystal lattice, for example. However, now we're going to think continuity from the start. Let's first see what changes in terms of notation.

New notations

Our C(xn, t) = 〈xn|ψ〉 now becomes C(x) = 〈x|ψ〉. This notation does not explicitly show the time dependence but then you know amplitudes like this do vary in space as well as in time. Having said that, the analysis below focuses mainly on their behavior in space, so it does make sense to not explicitly mention the time variable. It’s the usual trick: we look at how stuff behaves in space or, alternatively, in time. So we temporarily ‘forget’ about the other variable. That’s just how we work: it’s hard for our mind to think about these wavefunctions in both dimensions simultaneously although, ideally, we should do that.

Now, you also know that quantum physicists prefer to denote the wavefunction C(x) with some Greek letter: ψ (psi) or φ (phi). Feynman thinks that's somewhat confusing because we use the same letters to denote a state itself, but I don't agree. I think it's pretty straightforward. In any case, we write:

ψ(x) = Cψ(x) = C(x) = 〈x|ψ〉

The next thing is the associated probabilities. From your high school math course, you’ll surely remember that we have two types of probability distributions: they are either discrete or, else, continuous. If they’re continuous, then our probability distribution becomes a probability density function (PDF) and, strictly speaking, we should no longer say that the probability of finding our particle at any particular point x at some time t is this or that. That probability is, strictly speaking, zero: if our variable is continuous, then our probability is defined for an interval only, and the P[x] value itself is referred to as a probability density. So we’ll look at little intervals Δx, and we can write the associated probability as:

prob (x, Δx) = |〈x|ψ〉|2Δx = |ψ(x)|2Δx

The idea is illustrated below. We just re-divide our continuous scale in little intervals and calculate the surface of some tiny elongated rectangle now. 🙂

image024
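
In plain numbers, and assuming some made-up Gaussian wavefunction just for the sake of the example, the 'tiny elongated rectangles' idea looks like this:

```python
import numpy as np

sigma = 1.0
psi = lambda x: (2*np.pi*sigma**2)**-0.25 * np.exp(-x**2/(4*sigma**2))  # made-up wavefunction

dx = 0.01                          # the width of our little intervals Delta-x
x = np.arange(-8, 8, dx)

prob = np.abs(psi(x))**2 * dx      # prob(x, Dx) = |psi(x)|^2 * Dx for each little interval
print(prob.sum())                  # all the little rectangles add up to ~1

# The probability of finding the particle in the interval around x = 0:
print(np.abs(psi(0.0))**2 * dx)    # a probability *density* times Dx, so a small number
```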

It is also easy to see that, when moving to an infinite set of states, our 〈φ|ψ〉 = ∑〈φ|x〉〈x|ψ〉 (over all x) formula for calculating the amplitude for a particle to go from state ψ to state φ should now be written as an infinite sum, i.e. as the following integral:

〈φ|ψ〉 = ∫〈φ|x〉〈x|ψ〉 dx

Now, we know that 〈φ|x〉 = 〈x|φ〉* and, therefore, this integral can also be written as:

〈φ|ψ〉 = ∫〈x|φ〉*〈x|ψ〉 dx = ∫φ*(x)·ψ(x) dx

For example, if φ(x) = 〈x|φ〉 is equal to a simple exponential, so we can write φ(x) = a·eiθ, then φ*(x) = 〈φ|x〉 = a·e−iθ.

With that, we're ready for the plat de résistance, except for one thing, perhaps: we don't look at spin here. If we'd do that, we'd have to take two sets of base states: one for up and one for down spin—but we don't worry about this, for the time being, that is. 🙂

The momentum wavefunction

Our wavefunction 〈x|ψ〉 varies in time as well as in space. That's obvious. How exactly depends on the energy and the momentum: both are related and, hence, if there's uncertainty in the momentum, there will be uncertainty in the energy, and vice versa. Uncertainty in the momentum changes the behavior of the wavefunction in space—through the p = ħk factor in the argument of the wavefunction (θ = ω·t − k·x)—while uncertainty in the energy changes the behavior of the wavefunction in time—through the E = ħω relation. As mentioned above, we focus on the variation in space here. We'll do so by defining a new state, which is referred to as a state of definite momentum. We'll write it as mom p, and so now we can use the Dirac notation to write the amplitude for an electron to have a definite momentum equal to p as:

φ(p) = 〈 mom p | ψ 〉

Now, you may think that the 〈x|ψ〉 and 〈mom p|ψ〉 amplitudes should be the same because, surely, we do associate the state with a definite momentum p, don’t we? Well… No! If we want to localize our wave ‘packet’, i.e. localize our particle, then we’re actually not going to associate it with a definite momentum. See my previous posts: we’re going to introduce some uncertainty so our wavefunction is actually a superposition of more elementary waves with slightly different (spatial) frequencies. So we should just go through the motions here and apply our integral formula to ‘unpack’ this amplitude. That goes as follows:

φ(p) = 〈 mom p | ψ 〉 = ∫ 〈 mom p | x 〉 〈 x | ψ 〉 dx

So, as usual, when seeing a formula like this, we should remind ourselves of what we need to solve. Here, we assume we somehow know the ψ(x) = 〈x|ψ〉 wavefunction, so the question is: what do we use for 〈 mom p | x 〉? At this point, Feynman wanders off to start a digression on normalization, which really confuses the picture. When everything is said and done, the easiest thing to do is to just jot down the formula for that 〈mom p | x〉 in the integrand and think about it for a while:

〈mom p | x〉 = ei(p/ħ)∙x

I mean… What else could it be? This formula is very fundamental, and I am not going to try to explain it. As mentioned above, Feynman tries to ‘explain’ it by some story about probabilities and normalization, but I think his ‘explanation’ just confuses things even more. Really, what else would it be? The formula above really encapsulates what it means if we say that p and x are conjugate variables. [I can already note, of course, that symmetry implies that we can write something similar for energy and time. Indeed, we can define a state of definite energy as 〈E | ψ〉, and then ‘unpack’ it in the same way, and see that one of the two factors in the integrand would be equal to 〈E | t〉 and, of course, we’d associate a similar formula with it:

〈E | t〉 = ei(E/ħ)∙t]

But let me get back to the lesson here. We’re analyzing stuff in space now, not in time. Feynman gives a simple example here. He suggests a wavefunction which has the following form:

ψ(x) = K·e−x²/4σ²

The example is somewhat disingenuous because this is not a complex- but a real-valued function. In fact, squaring it, and then applying the normalization condition (all probabilities have to add up to one), yields the normal probability distribution:

prob (x, Δx) = P(x)·dx = (2πσ²)−1/2·e−x²/2σ²·dx

So that’s just the normal distribution for μ = 0, as illustrated below.

720px-Normal_Distribution_PDF

In any case, the integral we have to solve now is:

φ(p) = ∫ 〈 mom p | x 〉·ψ(x)·dx = K·∫ ei(p/ħ)∙x·e−x²/4σ²·dx

Now, I hate integrals as much as you do (probably more) and so I assume you’re also only interested in the result (if you want the detail: check it in Feynman), which we can write as:

φ(p) = (2πη²)−1/4·e−p²/4η², with η = ħ/2σ

This formula is totally identical to the ψ(x) = (2πσ²)−1/4·e−x²/4σ² distribution we started with, except that it's got another sigma value, which we denoted by η (and that's not nu but eta), with 

η = ħ/2σ

Just for the record, Feynman refers to η and σ as the 'half-width' of the respective distributions. Mathematicians would say they're the standard deviation. The concepts are nearly the same, but not quite. In any case, that's another thing I'll let you find out for yourself. 🙂 The point is: η and σ are inversely proportional to each other, and the constant of proportionality is equal to ħ/2.

Now, if we take η and σ as measures of the uncertainty in p and x respectively – which is what they are, obviously! – then we can re-write that η = ħ/2σ as η·σ = ħ/2 or, better still, as the Uncertainty Principle itself:

ΔpΔx = ħ/2

You’ll say: that’s great, but we usually see the Uncertainty Principle written as:

ΔpΔx ≥ ħ/2

So where does that come from? Well… We choose a normal distribution (or the Gaussian distribution, as physicists call it), and so that yields the ΔpΔx = ħ/2 identity. If we'd chosen another one, we'd find a slightly different relation and so… Well… Let me quote Feynman here: “Interestingly enough, it is possible to prove that for any other form of a distribution in x or p, the product ΔpΔx cannot be smaller than the one we have found here, so the Gaussian distribution gives the smallest possible value for the ΔpΔx product.”
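
We can put a number on Feynman's claim. The sketch below (ħ = 1, brute-force numerical integration, so just an illustration) computes Δx and Δp for the Gaussian wavefunction and for a non-Gaussian one (a double-sided exponential, chosen arbitrarily), using φ(p) ∝ ∫e−i(p/ħ)∙x·ψ(x)·dx. The Gaussian gives ΔpΔx = ħ/2, the other one gives something larger. [The sign of the exponent in the transform doesn't matter here, by the way, because our ψ(x) is real and even.]

```python
import numpy as np

hbar = 1.0
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
p = np.linspace(-10, 10, 2001)
dp = p[1] - p[0]

def spreads(psi_x):
    """Return (Delta-x, Delta-p) for a (not necessarily normalized) wavefunction psi(x)."""
    rho_x = np.abs(psi_x)**2
    rho_x /= rho_x.sum() * dx
    dx_spread = np.sqrt(np.sum(x**2 * rho_x) * dx)          # <x> = 0 for our symmetric examples
    # Brute-force transform to momentum space: phi(p) ~ integral of exp(-ipx/hbar) psi(x) dx
    phi_p = np.array([np.sum(np.exp(-1j * pp * x / hbar) * psi_x) * dx for pp in p])
    rho_p = np.abs(phi_p)**2
    rho_p /= rho_p.sum() * dp
    dp_spread = np.sqrt(np.sum(p**2 * rho_p) * dp)
    return dx_spread, dp_spread

sigma = 1.0
gauss = np.exp(-x**2 / (4 * sigma**2))          # the K factor drops out in the ratios anyway
expon = np.exp(-np.abs(x))                      # a non-Gaussian shape, just for comparison

for name, psi_x in [("Gaussian   ", gauss), ("exponential", expon)]:
    sx, sp = spreads(psi_x)
    print(name, sx * sp)    # ~0.5*hbar for the Gaussian, larger for anything else
```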

This is great. So what about the even more approximate ΔpΔx ≥ ħ formula? Where does that come from? Well… That's more like a qualitative version of it: it basically says the minimum value of the same product is of the same order as ħ which, as you know, is pretty tiny: it's about 1.05×10−34 J·s. 🙂 The last thing to note is its dimension: momentum is expressed in newton-seconds and position in meters, obviously. So the uncertainties in them are expressed in the same units, and so the dimension of the product is N·m·s = J·s. So this dimension combines force, distance and time. That's quite appropriate, I'd say. The ΔEΔt product obviously does the same. But… Well… That's it, folks! I enjoyed writing this – and I cannot always say the same of other posts! So I hope you enjoyed reading it. 🙂


Schrödinger’s equation: the original approach

Of course, your first question when seeing the title of this post is: what’s original, really? Well… The answer is simple: it’s the historical approach, and it’s original because it’s actually quite intuitive. Indeed, Lecture no. 16 in Feynman’s third Volume of Lectures on Physics is like a trip down memory lane as Feynman himself acknowledges, after presenting Schrödinger’s equation using that very rudimentary model we developed in our previous post:

“We do not intend to have you think we have derived the Schrödinger equation but only wish to show you one way of thinking about it. When Schrödinger first wrote it down, he gave a kind of derivation based on some heuristic arguments and some brilliant intuitive guesses. Some of the arguments he used were even false, but that does not matter; the only important thing is that the ultimate equation gives a correct description of nature.”

So… Well… Let’s have a look at it. 🙂 We were looking at some electron we described in terms of its location at one or the other atom in a linear array (think of it as a line). We did so by defining base states |n〉 = |xn〉, noting that the state of the electron at any point in time could then be written as:

|φ〉 = ∑ |xn〉·Cn(t) = ∑ |xn〉〈xn|φ〉 (over all n)

The Cn(t) = 〈xn|φ〉 coefficient is the amplitude for the electron to be at xn at t. Hence, the Cn(t) amplitudes vary with t as well as with xn. We'll re-write them as Cn(t) = C(xn, t) = C(xn). Note that the latter notation does not explicitly show the time dependence.

iħ·(∂C(xn)/∂t) = E0C(xn) − AC(xn+b) − AC(xn−b)

Note that, as part of our move from the Cn(t) to the C(xn) notation, we write the time derivative dCn(t)/dt now as ∂C(xn)/∂t, so we use the partial derivative symbol now (∂). Of course, the other partial derivative will be ∂C(x)/∂x as we move from the count variable xn to the continuous variable x, but let's not get ahead of ourselves here. The solution we found for our C(xn) functions was the following wavefunction:

C(xn) = a·ei(k∙xn−ω·t) = a·e−i∙ω·t·ei∙k∙xn = a·e−i·(E/ħ)·t·ei∙k∙xn

We also found the following relationship between E and k:

E = E0 − 2A·cos(kb)

Now, even Feynman struggles a bit with the definition of E0 and k here, and their relationship with E, which is graphed below.

energy

Indeed, he first writes, as he starts developing the model, that E0 is, physically, the energy the electron would have if it couldn’t leak away from one of the atoms, but then he also adds: “It represents really nothing but our choice of the zero of energy.”

This is all quite enigmatic because we cannot just do whatever we want when discussing the energy of a particle. As I pointed out in one of my previous posts, when discussing the energy of a particle in the context of the wavefunction, we generally consider it to be the sum of three different energy concepts:

  1. The particle’s rest energy m0c2, which de Broglie referred to as internal energy (Eint), and which includes the rest mass of the ‘internal pieces’, as Feynman puts it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy).
  2. Any potential energy it may have because of some field (i.e. if it is not traveling in free space), which we usually denote by U. This field can be anything—gravitational, electromagnetic: it’s whatever changes the energy of the particle because of its position in space.
  3. The particle’s kinetic energy, which we write in terms of its momentum p: m·v2/2 = m2·v2/(2m) = (m·v)2/(2m) = p2/(2m).

It’s obvious that we cannot just “choose” the zero point here: the particle’s rest energy is its rest energy, and its velocity is its velocity. So it’s not quite clear what the E0 in our model really is. As far as I am concerned, it represents the average energy of the system really, so it’s just like the E0 for our ammonia molecule, or the E0 for whatever two-state system we’ve seen so far. In fact, when Feynman writes that we can “choose our zero of energy so that E0 − 2A = 0″ (so the minimum of that curve above is at the zero of energy), he actually makes some assumption in regard to the relative magnitude of the various amplitudes involved.

We should probably think about it in this way: −(i/ħ)·E0 is the amplitude for the electron to just stay where it is, while i·A/ħ is the amplitude to go somewhere else—and note we’ve got two possibilities here: the electron can go to |xn+1〉,  or, alternatively, it can go to |xn−1〉. Now, amplitudes can be associated with probabilities by taking the absolute square, so I’d re-write the E0 − 2A = 0 assumption as:

E0 = 2A ⇔ |−(i/ħ)·E0|² = |(i/ħ)·2A|²

Hence, in my humble opinion, Feynman's assumption that E0 − 2A = 0 has nothing to do with 'choosing the zero of energy'. It's more like a symmetry assumption: we're basically saying it's as likely for the electron to stay where it is as it is to move to the next position. It's an idea I need to develop somewhat further, as Feynman seems to just gloss over these little things. For example, I am sure it is not a coincidence that the EI, EII, EIII and EIV energy levels we found when discussing the hyperfine splitting of the hydrogen ground state also add up to 0. In fact, you'll remember we could actually measure those energy levels (EI = EII = EIII = A ≈ 9.23×10−6 eV, and EIV = −3A ≈ −27.7×10−6 eV), so saying that we can “choose” some zero energy point is plain nonsense. The question just doesn't arise. In any case, as I have to continue the development here, I'll leave this point for further analysis in the future. So… Well… Just note this E0 − 2A = 0 assumption, as we'll need it in a moment.

The second assumption we’ll need concerns the variation in k. As you know, we can only get a wave packet if we allow for uncertainty in k which, in turn, translates into uncertainty for E. We write:

ΔE = Δ[E0 − 2A·cos(kb)]

Of course, we’d need to interpret the Δ as a variance (σ2) or a standard deviation (σ) so we can apply the usual rules – i.e. var(a) = 0, var(aX) = a2·var(X), and var(aX ± bY) = a2·var(X) + b2·var(Y) ± 2ab·cov(X, Y) – to be a bit more precise about what we’re writing here, but you get the idea. In fact, let me quickly write it out:

var[E0 − 2A·cos(kb)] = var(E0) + 4A2·var[cos(kb)] ⇔ var(E) = 4A2·var[cos(kb)]

Now, you should check my post scriptum to my page on the Essentials, to see what the probability density function of the cosine of a randomly distributed variable looks like, and then you should go online to find a formula for its variance, and then you can work it all out yourself, because… Well… I am not going to do it for you. What I want to do here is just show how Feynman gets Schrödinger's equation out of all of these simplifications.

So what’s the second assumption? Well… As the graph shows, our k can take any value between −π/b and +π/b, and therefore, the kb argument in our cosine function can take on any value between −π and +π. In other words, kb could be any angle. However, as Feynman puts it—we’ll be assuming that kb is ‘small enough’, so we can use the small-angle approximations whenever we see the cos(kb) and/or sin(kb) functions. So we write: sin(kb) ≈ kb and cos(kb) ≈ 1 − (kb)2/2 = 1 − k2b2/2. Now, that assumption led to another grand result, which we also derived in our previous post. It had to do with the group velocity of our wave packet, which we calculated as:

v = dω/dk = (2Ab²/ħ)·k

Of course, we should interpret our k here as “the typical k“. Huh? Yes… That's how Feynman refers to it, and I have no better term for it. It's some kind of 'average' of the Δk interval, obviously, but… Well… Feynman does not give us any exact definition here. Of course, if you look at the graph once more, you'll say that, if the typical kb has to be “small enough”, then its expected value should be zero. Well… Yes and no. If the typical kb is zero, or if k is zero, then v is zero, and then we've got a stationary electron, i.e. an electron with zero momentum. However, because we're doing what we're doing (that is, we're studying “stuff that moves”—as I put it disrespectfully in a few of my posts, so as to distinguish from our analyses of “stuff that doesn't move”, like our two-state systems, for example), our “typical k” should not be zero here. OK… We can now calculate what's referred to as the effective mass of the electron, i.e. the mass that appears in the classical kinetic energy formula: K.E. = m·v²/2. Now, there are two ways to do that, and both are somewhat tricky in their interpretation:

1. Using both the E0 − 2A = 0 as well as the “small kb” assumption, we find that E = E0 − 2A·(1 − k2b2/2) = A·k2b2. Using that for the K.E. in our formula yields:

meff = 2A·k²b²/v² = 2A·k²b²/[(2Ab²/ħ)·k]² = ħ²/(2Ab²)

2. We can use the classical momentum formula (p = m·v), and then the 2nd de Broglie equation, which tells us that each wavenumber (k) is to be associated with a value for the momentum (p) using the p = ħk (so p is proportional to k, with ħ as the factor of proportionality). So we can now calculate meff as meff = ħk/v. Substituting again for what we’ve found above, gives us the same:

meff = ħ·k/v = ħ·k/[(2Ab²/ħ)·k] = ħ²/(2Ab²)
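
Both routes can be checked with a few lines of arithmetic. The sketch below just plugs in arbitrary made-up numbers for A, b and k (with ħ = 1) and confirms that the two expressions collapse to the same ħ²/(2Ab²):

```python
hbar = 1.0
A, b, k = 0.7, 0.3, 0.05        # made-up values; k*b is 'small enough'

v = (2 * A * b**2 / hbar) * k   # the group velocity from the dispersion relation
KE = A * k**2 * b**2            # E = E0 - 2A(1 - k^2 b^2 / 2) with E0 = 2A

m_route_1 = 2 * KE / v**2       # from K.E. = m v^2 / 2
m_route_2 = hbar * k / v        # from p = m v and p = hbar k
print(m_route_1, m_route_2, hbar**2 / (2 * A * b**2))   # all three come out equal
```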

Of course, we’re not supposed to know the de Broglie relations at this point in time. 🙂 But, now that you’ve seen them anyway, note how we have two formulas for the momentum:

  • The classical formula (p = m·v) tells us that the momentum is proportional to the classical velocity of our particle, and m is then the factor of proportionality.
  • The quantum-mechanical formula (p = ħk) tells us that the (typical) momentum is proportional to the (typical) wavenumber, with Planck’s constant (ħ) as the factor of proportionality. Combining both combines the classical and quantum-mechanical perspective of a moving particle:

m·v = ħk

I know… It’s an obvious equation but… Well… Think of it. It’s time to get back to the main story now. Remember we were trying to find Schrödinger’s equation? So let’s get on with it. 🙂

To do so, we need one more assumption. It's the third major simplification and, just like the others, the assumption is obvious at first, but not on second thought. 😦 So… What is it? Well… It's easy to see that, in our meff = ħ²/(2Ab²) formula, all depends on the value of 2Ab². So, just like we should wonder what happens with that kb factor in the argument of our sine or cosine function if b goes to zero—i.e. if we're letting the lattice spacing go to zero, so we're moving from a discrete to a continuous analysis now—we should also wonder what happens with that 2Ab² factor! Well… Think about it. Wouldn't it be reasonable to assume that the effective mass of our electron is determined by some property of the material, or the medium (so that's the silicon in our previous post) and, hence, that it's constant really? Think of it: we're not changing the fundamentals really—we just have some electron roaming around in some medium and all that we're doing now is bringing those xn closer together. Much closer. It's only logical, then, that our amplitude to jump from xn±1 to xn would also increase, no? So what we're saying is that 2Ab² is some constant which we write as ħ²/meff or, what amounts to the same, that A·b² = ħ²/(2·meff).

Of course, you may raise two objections here:

  1. The A·b² = ħ²/(2·meff) assumption establishes a very particular relation between A and b, as we can write A as A = [ħ²/(2meff)]·b−2 now. So we've got like a y = 1/x² relation here. Where the hell does that come from?
  2. We were talking some real stuff here: a crystal lattice with atoms that, in reality, do have some spacing, so that corresponds to some real value for b. So that spacing gives some actual physical significance to those xn values.

Well… What can I say? I think you should re-read that quote of Feynman when I started this post. We’re going to get Schrödinger’s equation – i.e. the ultimate prize for all of the hard work that we’ve been doing so far – but… Yes. It’s really very heuristic, indeed! 🙂 But let’s get on with it now! We can re-write our Hamiltonian equation as:

iħ·(∂C(xn)/∂t) = E0C(xn) − AC(xn+b) − AC(xn−b)

= (E0 − 2A)·C(xn) + A·[2C(xn) − C(xn+b) − C(xn−b)] = A·[2C(xn) − C(xn+b) − C(xn−b)]

Now, I know your brain is about to melt down but, fiddling with this equation as we’re doing right now, Schrödinger recognized a formula for the second-order derivative of a function. I’ll just jot it down, and you can google it so as to double-check where it comes from:

∂²f(x)/∂x² = lim(Δ→0) [f(x + Δ) + f(x − Δ) − 2·f(x)]/Δ²
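
If you don't feel like googling it, you can also just check it numerically: the little sketch below compares [f(x + b) + f(x − b) − 2·f(x)]/b² with the exact second derivative of some arbitrary test function, and you can see the approximation getting better as b gets smaller.

```python
import numpy as np

f = np.sin                      # some test function; its second derivative is -sin(x)
x0 = 0.7

for b in [0.5, 0.1, 0.01]:
    approx = (f(x0 + b) + f(x0 - b) - 2 * f(x0)) / b**2
    print(b, approx, -np.sin(x0))   # the approximation tends to f''(x0) = -sin(0.7)
```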

Just substitute f(x) for C(xn) in the second part of our equation above, and you’ll see we can effectively write that 2C(xn) − C(xn+b) − C(xn−b) factor as:

2C(xn) − C(xn+b) − C(xn−b) ≈ −b²·[∂²C(xn)/∂x²]

We're done. We just keep the iħ·(∂C(xn)/∂t) on the left-hand side now and multiply the expression above with A, to get what we wanted to get, and that's – YES! – Schrödinger's equation:

iħ·(∂C(x)/∂t) = −(Ab²)·(∂²C(x)/∂x²) = −(ħ²/2meff)·(∂²C(x)/∂x²)

Whatever your objections to this ‘derivation’, it is the correct equation. For a particle in free space, we just write m instead of meff, but it’s exactly the same. I’ll now give you Feynman’s full quote, which is quite enlightening:

“We do not intend to have you think we have derived the Schrödinger equation but only wish to show you one way of thinking about it. When Schrödinger first wrote it down, he gave a kind of derivation based on some heuristic arguments and some brilliant intuitive guesses. Some of the arguments he used were even false, but that does not matter; the only important thing is that the ultimate equation gives a correct description of nature. The purpose of our discussion is then simply to show you that the correct fundamental quantum mechanical equation [i.e. Schrödinger’s equation] has the same form you get for the limiting case of an electron moving along a line of atoms. We can think of it as describing the diffusion of a probability amplitude from one point to the next along the line. That is, if an electron has a certain amplitude to be at one point, it will, a little time later, have some amplitude to be at neighboring points. In fact, the equation looks something like the diffusion equations which we have used in Volume I. But there is one main difference: the imaginary coefficient in front of the time derivative makes the behavior completely different from the ordinary diffusion such as you would have for a gas spreading out along a thin tube. Ordinary diffusion gives rise to real exponential solutions, whereas the solutions of Schrödinger’s equation are complex waves.”

So… That says it all, I guess. Isn’t it great to be where we are? We’ve really climbed a mountain here. And I think the view is gorgeous. 🙂

Oh—just in case you’d think I did not give you Schrödinger’s equation, let me write it in the form you’ll usually see it:

iħ·(∂ψ/∂t) = −(ħ²/2m)·(∂²ψ/∂x²)

Done! 🙂


Quantum math in solid-state physics

Pre-script (dated 26 June 2020): This post got mutilated by the removal of some material by the dark force. You should be able to follow the main story line, however. If anything, the lack of illustrations might actually help you to think things through for yourself. In any case, we now have different views on these concepts as part of our realist interpretation of quantum mechanics, so we recommend you read our recent papers instead of these old blog posts.

Original post:

I’ve said it a couple of times already: so far, we’ve only studied stuff that does not move in space. Hence, till now, time was the only variable in our wavefunctions. So now it’s time to… Well… Study stuff that does move in space. 🙂

Is that compatible with the title of this post? Solid-state physics? Solid-state stuff doesn’t move, does it? Well… No. But what we’re going to look at is how an electron travels through a solid crystal or, more generally, how an atomic excitation can travel through. In fact, what we’re really going to look at is how the wavefunction itself travels through space. However, that’s a rather bold statement, and so you should just read this post and judge for yourself. To be specific, we’re going to look at what happens in semiconductor material, like the silicon that’s used in microelectronic components like transistors and integrated circuits (ICs). You surely know the classical idea of that, which involves imagining an electron can be situated in a kind of ‘pit’ at one particular atom (or an electron hole, as it’s usually referred to), and it just moves from pit to pit. The Wikipedia article on it defines an electron hole as follows: an electron hole is the absence of an electron from a full valence band: the concept is used to conceptualize the interactions of the electrons within a nearly full system, i.e. a system which is missing just a few electrons. But here we’re going to forget about the classical picture. We’ll try to model it using the wavefunction concept. So how does that work? Feynman approaches it as follows.

If we look at a (one-dimensional) line of atoms – we can extend to a two- and three-dimensional analysis later – we may define an infinite number of base states for the extra electron that we think of as moving through the crystal. If the electron is with the n-th atom, then we'll say it's in a base state which we shall write as |n〉. Likewise, if it's at atom n+1 or n−1, then we'll associate that with base state |n+1〉 and |n−1〉 respectively. That's what's visualized below, and you should just go along with the story here: don't think classically, i.e. in terms of the electron being either here or, else, somewhere else. No. It's got an amplitude to be anywhere. If you can't take that… Well… I am sorry but that's what QM is all about!

electron moving

As usual, we write the amplitude for the electron to be in one of those states |n〉 as Cn(t) = 〈n|φ〉, and so we can then write the electron's state at any point in time t by superposing all base states, so that's the weighted sum of all base states, with the weights being equal to the associated amplitudes. So we write:

|φ〉 = ∑ |n〉·Cn(t) = ∑ |n〉〈n|φ〉 (over all n)

Now we add some assumptions. One assumption is that an electron cannot directly jump to its next nearest neighbor: if it wants to go to the next nearest one, it will first have to go to the nearest one. So two steps are needed to go from state |n−1〉 to state |n+1〉. This assumption simplifies the analysis: we can discuss more general cases later. To be specific, we'll assume the amplitude to go from one base state to another, e.g. from |n〉 to |n+1〉, or from |n−1〉 to |n〉, is equal to i·A/ħ. You may wonder where this comes from, but it's totally in line with equating our Hamiltonian non-diagonal elements to –A. Let me quickly insert a small digression here—for those who do really wonder where this comes from. 🙂

START OF DIGRESSION

Just check out those two-state systems we described, or that post of mine in which I explained why the following formulas are actually quite intuitive and easy to understand:

  • U12(t + Δt, t) = − (i/ħ)·H12(t)·Δt = (i/ħ)·A·Δt and
  • U21(t + Δt, t) = − (i/ħ)·H21(t)·Δt = (i/ħ)·A·Δt

More generally, you’ll remember that we wrote Uij(t + Δt, t) as:

Uij(t + Δt, t) = Uij(t, t) + Kij·Δt = δij(t, t) + Kij·Δt = δij(t, t) − (i/ħ)·Hij(t)·Δt

That looks monstrous but, frankly, what we have here is just a formula like this:

 f(x+Δx) = f(x) + [df(x)/dx]·Δx

In case you didn't notice, the formula is just the definition of the derivative if we write it as Δy/Δx = df(x)/dx for Δx → 0. Hence, the Kij coefficient in this formula is to be interpreted as a time derivative (the variable in the Uij formula is t, not x). Now, we re-wrote that Kij coefficient as the amplitude −(i/ħ)·Hij(t) and, therefore, that amplitude – i.e. the i·A/ħ factor (for i ≠ j) I introduced above – is to be interpreted as a time derivative. [Now that we're here, let me quickly add that a time derivative gives the time rate of change of some quantity per unit time. So that i·A/ħ factor is also expressed per unit time.] We'd then just move the − (i/ħ) factor in that −(i/ħ)·Hij(t) coefficient to the other side to get the grand result we got for two-state systems, i.e. the Hamiltonian equations, which we could write in a number of ways, as shown below:

hamiltonian equations

So… Well… That's all there is to it, basically. Quantum math is not easy but, if anything, it is logical. You just have to get used to that imaginary unit (i) in front of stuff. That makes it always look very mysterious. 🙂 However, it should never scare you. You can just move it in or out of the differential operator, for example: i·df(x)/dt = d[i·f(x)]/dt. [Of course, i·f(x) ≠ f(i·x)!] So just think of i as a reminder that the number that follows it points in a different direction. To be precise: its angle with the other number is 90°. It doesn't matter what we call those two numbers. The convention is to say that one is the real part of the wavefunction, while the other is the imaginary part but, frankly, in quantum math, both numbers are just as real. 🙂
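
To make that 'time derivative' interpretation of the −(i/ħ)·Hij coefficients a bit more tangible, here's a small sketch that just iterates Ci(t + Δt) = Ci(t) − (i/ħ)·∑j Hij·Cj(t)·Δt for a two-state system with H11 = H22 = E0 and H12 = H21 = −A. The values are made up, ħ = 1, and E0 is set to zero because it only adds an overall phase. The probability to still be in state 1 should come out as cos²(A·t/ħ).

```python
import numpy as np

hbar, E0, A = 1.0, 0.0, 1.0
H = np.array([[E0, -A],
              [-A, E0]], dtype=complex)

C = np.array([1.0, 0.0], dtype=complex)   # start in base state |1>
dt, t = 1e-4, 0.0

for _ in range(int(5.0 / dt)):            # step U(t+dt, t) ~ 1 - (i/hbar) H dt, up to t = 5
    C = C - (1j / hbar) * (H @ C) * dt
    t += dt

print("P1 from stepping :", abs(C[0])**2)
print("P1 = cos^2(A t)  :", np.cos(A * t / hbar)**2)   # first-order stepping, so only approximately equal
```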

END OF DIGRESSION

Yes. Let me get back to the lesson here. The assumption is that the Hamiltonian equations for our system here, i.e. the electron traveling from hole to hole, look like the following equation:

iħ·(dCn/dt) = E0·Cn − A·Cn+1 − A·Cn−1

It’s really like those iħ·(dC1/dt) = E0C1 − AC2 and iħ·(dC2/dt) = − AC1 + E0C2 equations above, except that we’ve got three terms here:

  1. −(i/ħ)·E0 is the amplitude for the electron to just stay where it is, so we multiply that with the amplitude of the electron to be there at that time, i.e. the amplitude Cn(t), and bingo! That's the first contribution to the time rate of change of the Cn amplitude (i.e. dCn/dt). [Note that all I did was bring that iħ factor in front to the other side: 1/(iħ) = −(i/ħ).] Of course, you also need to know what E0 is now: that's just the (average) energy of our electron. So it's really like the E0 of our ammonia molecule—or the average energy of any two-state system, really.
  2. −(i/ħ)·(−A) = i·A/ħ is the amplitude to go from one base state to another, i.e. from |n+1〉 to |n〉, for example. In fact, the second term models exactly that: i·A/ħ times the amplitude to be in state |n+1〉 is the second contribution to the time rate of change of the Cn amplitude.
  3. Finally, the electron may also be in state |n−1〉 and go to |n〉 from there, so i·A/ħ times the amplitude to be in state |n−1〉 is yet another contribution to the time rate of change of the Cn amplitude.

Now, we don’t want to think about what happens at the start and the end of our line of atoms, so we’ll just assume we’ve got an infinite number of them. As a result, we get an infinite number of equations, which Feynman summarizes as:

hamiltonian equations - 2

Holy cow! How do we solve that? We know that the general solution for those Cn amplitudes is likely to be some function like this:

Cn(t) = an·e−(i/ħ)·E·t

In case you wonder where this comes from, check my post on the general solution for N-state systems. If we substitute that trial solution in that iħ·(dCn/dt) = E0Cn − ACn+1 − ACn−1, we get:

E·an = E0·an − A·an+1 − A·an−1

[Just do that derivative, and you'll see the iħ can be scrapped. Also, the exponentials on both sides of the equation cancel each other out.] Now, that doesn't look too bad, and we can also write it as (E − E0)·an = − A·(an+1 + an−1), but… Well… What's the next step? We've got an infinite number of coefficients an here, so we can't use the usual methods to solve this set of equations. Feynman tries something completely different here. It looks weird but… Well… He gets a sensible result, so… Well… Let's go for it.

He first writes these coefficients an as a function of a distance, which he defines as xn = xn−1 + b, with b the atomic spacing, i.e. the distance between two atoms (see the illustration). So now we write an = f(xn) = a(xn). Note that we don't write an = fn(xn) = an(xn). No. It's just one function f = a, not an infinite number of functions fn = an. Of course, once you see what comes of it, you'll say: sure! The (complex) an coefficient in that function is the non-time-varying part of our function, and it's about time we insert some part that's varying in space and so… Well… Yes, of course! Our an coefficients don't vary in time, so they must vary in space. Well… Yes. I guess so. 🙂 Our E·an = E0·an − A·an+1 − A·an−1 equation becomes:

E·a(xn) = E0·a(xn) − A·a(xn + b) − A·a(xn − b)

We can write this, once again, as (E − E0)·a(xn) = − A·[a(xn + b) + a(xn − b)]. Feynman notes this equation is like a differential equation, in the sense that it relates the value of some function (i.e. our a function, of course) at some point x to the values of the same function at nearby points, i.e. at x ± b here. Frankly, I struggle a bit to see how it works exactly but Feynman now offers the following trial solution:

a(xn) = eikxn

Huh? Why? And what's k? Be patient. Just go along with this for a while. Let's first do a graph. Think of xn as a nearly continuous variable representing position in space. We then know that this parameter k is equal to the spatial frequency of our wavefunction: larger values for k give the wavefunction a higher density in space, as shown below.

graph 

In fact, I shouldn’t confuse you here, but you’ll surely think of the wavefunction you saw so many times already:

ψ(x, t) = a·e−i·[(E/ħ)·t − (p/ħ)∙x] = a·e−i·(ω·t − k∙x) = a·ei(k∙x−ω·t) = a·ei∙k∙x·e−i∙ω·t

This was the elementary wavefunction we'd associate with any particle, and so k would be equal to p/ħ, which is just the second of the two de Broglie relations: E = ħω and p = ħk (or, what amounts to the same: E = hf and λ = h/p). But you shouldn't get confused. Not at this point. Or… Well… Not yet. 🙂

Let’s just take this proposed solution and plug it in. We get:

E·eikxn = E0·eikxn − A·eik(xn+b) − A·eik(xn−b) ⇔ E = E0 − A·eikb − A·e−ikb ⇔ E = E0 − 2A·cos(kb)

[In case you wonder what happens here: we just divide both sides by the common factor eikxn and then use the eiθ + e−iθ = 2·cosθ identity.] So each k is associated with some energy E. In fact, to be precise, that E = E0 − 2A·cos(kb) function is a periodic function: it's depicted below, and it reaches a maximum at k = ± π/b. [It's easy to see why: E0 − 2A·cos(kb) reaches a maximum if cos(kb) = −1, i.e. if kb = ± π.]

energy
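
If the trial solution looks like it came out of nowhere, you can at least verify that it works. The sketch below (ħ = 1 and made-up values for E0, A, b and k) plugs a(xn) = eikxn into E·a(xn) = E0·a(xn) − A·a(xn + b) − A·a(xn − b) and checks that both sides match when E = E0 − 2A·cos(kb).

```python
import numpy as np

E0, A, b = 1.0, 0.4, 0.5                 # made-up energies and lattice spacing
k = 1.7                                  # some arbitrary wave number between -pi/b and +pi/b

a = lambda x: np.exp(1j * k * x)         # the trial solution a(x_n) = e^{i k x_n}
E = E0 - 2 * A * np.cos(k * b)           # the energy that goes with this k

x_n = 3 * b                              # pick any lattice point; the check is the same everywhere
lhs = E * a(x_n)
rhs = E0 * a(x_n) - A * a(x_n + b) - A * a(x_n - b)
print(np.allclose(lhs, rhs))             # True: the trial solution does solve the equation
print(E0 - 2*A, "<= E =", E, "<=", E0 + 2*A)   # and E sits inside the energy band
```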

Of course, we still don't really know what k or E are really supposed to represent, but think of it: it's obvious that E can never be larger than E0 + 2A or smaller than E0 − 2A, whatever the value of k. Note that, once again, it doesn't matter if we used +A or −A in our equations: the energy band remains the same. And… Well… We've dropped the term now: the energy band of a semiconductor. That's what it's all about. What we're saying here is that our electron, as it moves about, can have no other energies than the values in this band. Having said that, this still doesn't determine its energy: any energy level within that energy band is possible. So what does that mean? Hmm… Let's take a break and not bother too much about k for the moment. Let's look at our Cn(t) equations once more. We can now write them as:

Cn(t) =  eikxn·e−(i/ħ)·E·t = eikxn·e−(i/ħ)·[E0 − 2A·cos(kb)]·t

You have enough experience now to sort of visualize what happens here. We can look at a certain xn value – read: a certain position in the lattice – and watch, as time goes by, how the real and imaginary part of our little Cn wavefunction varies sinusoidally. We can also do it the other way around, and take a snapshot of the lattice at a certain point in time, and then we see how the amplitudes vary from point to point. That's easy enough.

The thing is: we're interested in probabilities in the end, and our wavefunction does not satisfy us in that regard: if we take the absolute square, its phase vanishes, and so we get the same probability everywhere! [Note that we didn't normalize our wavefunctions here. It doesn't matter. We can always do that later.] Now that's not great. So what can we do about that? Now that's where that k comes back in the game. Let's have a look.

The effective mass of an electron

We'd like to find a solution which sort of 'localizes' our electron in space. Now, we know that we can do that, in general, by superposing wavefunctions having different frequencies. There are a number of ways to go about it, but the general idea is illustrated below.

Fourier_series_and_transform beats

The first animation (for which credit must go to Wikipedia once more) is, obviously, the most sophisticated one. It shows how a new function – in red, and denoted by s6(x) – is constructed by summing six sine functions of different amplitudes and with harmonically related frequencies. This particular sum is referred to as a Fourier series, and the so-called Fourier transform, i.e. the S(f) function (in blue), depicts the six frequencies and their amplitudes.

We’re more interested in the second animation here (for which credit goes to another nice site), which shows how a pattern of beats is created by just mixing two slightly different cosine waves. We want to do something similar here: we want to get a ‘wave packet‘ like the one below, which shows the real part only—but you can imagine the imaginary part 🙂 of course. [That’s exactly the same but with a phase shift, cf. the sine and cosine bit in Euler’s formula: eiθ = cosθ + i·sinθ.]

image
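
Here's the same idea with our eikxn waves: the sketch below superposes a bunch of them, with Gaussian weights spread around some 'typical' k, and the |C(x)|² that comes out is no longer flat but localized: a wave packet. All numbers are made up, of course, and it's just a snapshot at t = 0.

```python
import numpy as np

x = np.linspace(-50, 50, 2001)
k_typ, dk = 2.0, 0.2                        # a 'typical' k and some spread around it
ks = np.linspace(k_typ - 3*dk, k_typ + 3*dk, 61)
weights = np.exp(-(ks - k_typ)**2 / (2 * dk**2))

# Superpose e^{ikx} waves with Gaussian weights around k_typ.
C = sum(w * np.exp(1j * k * x) for w, k in zip(weights, ks))

prob = np.abs(C)**2
print("the peak of |C|^2 sits at x =", x[np.argmax(prob)])           # centered near x = 0
print("|C|^2 at x = 30, relative to the peak:", prob[np.abs(x - 30).argmin()] / prob.max())
```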

As you know, we must now make a distinction between the group velocity of the wave, and its phase velocity. That's got to do with the dispersion relation, but we're not going to get into the nitty-gritty here. Just remember that the group velocity corresponds to the classical velocity of our particle – so that must be the classical velocity of our electron here – and, equally important, also remember the following formula for that group velocity:

vgroup = dω/dk

Let’s see how that plays out. The ω in this equation is equal to E/ħ = [E0 − 2A·cos(kb)]/ħ, so dω/dk = d[− (2A/ħ)·cos(kb)]/dk = (2Ab/ħ)·sin(kb). However, we’ll usually assume k is fairly small, so the variation of the amplitude from one xn to the other is fairly small. In that case, kb will be fairly small, and then we can use the so-called small angle approximation formula sin(ε) ≈ ε. [Note the reasoning here is a bit tricky, though, because – theoretically – k may vary between −π/b and +π/b and, hence, kb can take any value between −π and +π.] Using the small angle approximation, we get:

v = vgroup = dω/dk ≈ (2Ab²/ħ)·k

So we've got a quantum-mechanical calculation here that yields a classical velocity. Now, we can do something interesting: we can calculate what is known as the effective mass of the electron, i.e. the mass that appears in the classical kinetic energy formula: K.E. = m·v²/2. Or in the classical momentum formula: p = m·v. So we can now write: K.E. = meff·v²/2 and p = meff·v. But… Well… The second de Broglie equation tells us that p = ħk, so we find that meff = ħk/v. Substituting for what we've found above, gives us:

meff = ħ·k/v = ħ·k/[(2Ab²/ħ)·k] = ħ²/(2Ab²)

Unsurprisingly, we find that the value of meff is inversely proportional to A. It's usually stated in units of the true mass of the electron, i.e. its mass in free space (me ≈ 9.11×10−31 kg) and, in these units, it's usually in the range of 0.01 to 10. You'll say: 0.01, i.e. one percent of its actual mass? Yes. An electron may travel more freely in matter than it does in free space. 🙂 That's weird but… Well… Quantum mechanics is weird.
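
To get a feel for the orders of magnitude, here's a quick back-of-the-envelope calculation of meff = ħ²/(2Ab²). The numbers for A and b are made up for the sake of the illustration (an eV-scale hopping amplitude and an Ångström-scale lattice spacing), so don't read anything into the precise result, except that it does land in that 0.01 to 10 range.

```python
hbar = 1.0545718e-34      # J·s
m_e  = 9.109e-31          # kg, the electron's mass in free space
eV   = 1.602e-19          # J

A = 1.0 * eV              # made-up hopping amplitude
b = 5.4e-10               # made-up lattice spacing (a few Angstrom), in meters

m_eff = hbar**2 / (2 * A * b**2)
print(m_eff, "kg =", m_eff / m_e, "times the free-space electron mass")
```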

In any case, I'll wrap this post up now. You've got a nice model here. As Feynman puts it:

“We have now explained a remarkable mystery—how an electron in a crystal (like an extra electron put into germanium) can ride right through the crystal and flow perfectly freely even though it has to hit all the atoms. It does so by having its amplitudes going pip-pip-pip from one atom to the next, working its way through the crystal. That is how a solid can conduct electricity.”

Well… There you go. 🙂


Systems with 2 spin-1/2 particles (II)

Pre-script (dated 26 June 2020): This post got mutilated by the removal of some material by the dark force. You should be able to follow the main story line, however. If anything, the lack of illustrations might actually help you to think things through for yourself. In any case, we now have different views on these concepts as part of our realist interpretation of quantum mechanics, so we recommend you read our recent papers instead of these old blog posts.

Original post:

In our previous post, we noted the Hamiltonian for a simple system of two spin-1/2 particles—a proton and an electron (i.e. a hydrogen atom, in other words):

H = E0 + A·σe·σp

After noting that this Hamiltonian is “the only thing that it can be, by the symmetry of space, i.e. so long as there is no external field,” Feynman also notes the constant term (E0) depends on the level we choose to measure energies from, so one might just as well take E0 = 0, in which case the formula reduces to H = Aσe·σp. Feynman analyzes this term as follows:

If there are two magnets near each other with magnetic moments μe and μp, the mutual energy will depend on μe·μp = |μe||μp|cosα = μeμpcosα — among other things. Now, the classical thing that we call μe or μp appears in quantum mechanics as μeσe and μpσp respectively (where μp is the magnetic moment of the proton, which is about 1000 times smaller than μe, and has the opposite sign). So the H = Aσe·σp equation says that the interaction energy is like the interaction between two magnets—only not quite, because the interaction of the two magnets depends on the radial distance between them. But the equation could be—and, in fact, is—some kind of an average interaction. The electron is moving all around inside the atom, and our Hamiltonian gives only the average interaction energy. All it says is that for a prescribed arrangement in space for the electron and proton there is an energy proportional to the cosine of the angle between the two magnetic moments, speaking classically. Such a classical qualitative picture may help you to understand where the H = Aσe·σp equation comes from.

That’s loud and clear, I guess. The next step is to introduce an external field. The formula for the Hamiltonian (we don’t distinguish between the matrix and the operator here) then becomes:

H = Aσe·σp − μeσe·B − μpσp·B

The first term is the term we already had. The second term is the energy the electron would have in the magnetic field if it were there alone. Likewise, the third term is the energy the proton would have in the magnetic field if it were there alone. When reading this, you should remember the following convention: classically, we write the energy U as U = −μ·B, because the energy is lowest when the moment is along the field. Hence, for positive particles, the magnetic moment is parallel to the spin, while for negative particles it’s opposite. In other words, μp is a positive number, while μe is negative. Feynman sums it all up as follows:

Classically, the energy of the electron and the proton together, would be the sum of the two, and that works also quantum mechanically. In a magnetic field, the energy of interaction due to the magnetic field is just the sum of the energy of interaction of the electron with the external field, and of the proton with the field—both expressed in terms of the sigma operators. In quantum mechanics these terms are not really the energies, but thinking of the classical formulas for the energy is a way of remembering the rules for writing down the Hamiltonian.

That’s also loud and clear. So now we need to solve those Hamiltonian equations once again. Feynman does so first assuming B is constant and in the z-direction. I’ll refer you to him for the nitty-gritty. The important thing is the results here:

EI = A + μB, EII = A − μB, EIII = −A + 2A·√(1 + μ'²B²/4A²), EIV = −A − 2A·√(1 + μ'²B²/4A²)

He visualizes these – as a function of μB/A – as follows:

fig1Fig2

The illustration shows how the four energy levels have a different B-dependence:

  • EI, EII, EIII start at (0, 1) but EI increases linearly with B—with slope μ, to be precise (cf. the EI = A + μB expression);
  • In contrast, EII decreases linearly with B—again, with slope μ (cf. the EII = A − μB expression);
  • We then have the EIII and EIV curves, which start out horizontally, to then curve and approach straight lines for large B, with slopes equal to μ’.

Oh—I realize I forgot to define μ and μ'. Let me do that now: μ = −(μe + μp) and μ' = −(μe − μp). And remember what we said above: μp is about 1000 times smaller than μe, and has the opposite sign. OK. The point is: the magnetic field shifts the energy levels of our hydrogen atom. This is referred to as the Zeeman effect. Feynman describes it as follows:

The curves show the Zeeman splitting of the ground state of hydrogen. When there is no magnetic field, we get just one spectral line from the hyperfine structure of hydrogen. The transitions between state IV and any one of the others occurs with the absorption or emission of a photon whose (angular) frequency is 1/ħ times the energy difference 4A. [See my previous post for the calculation.] However, when the atom is in a magnetic field B, there are many more lines, and there can be transitions between any two of the four states. So if we have atoms in all four states, energy can be absorbed—or emitted—in any one of the six transitions shown by the vertical arrows in the illustration above.
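
You can actually reproduce that level diagram yourself by diagonalizing the H = Aσe·σp − μeσe·B − μpσp·B matrix numerically. The sketch below uses Pauli matrices and Kronecker products, with made-up values for A, μe, μp and B (so the numbers themselves mean nothing), and checks the eigenvalues against the EI = A + μB, EII = A − μB and EIII, IV = −A ± √(4A² + μ'²B²) formulas, with μ = −(μe + μp) and μ' = −(μe − μp).

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

A, mu_e, mu_p, B = 1.0, -1.0, 0.001, 0.8    # made-up values (mu_e negative, mu_p small and positive)

# sigma_e acts on the first (electron) factor, sigma_p on the second (proton) factor.
sigma_dot = sum(np.kron(s, s) for s in (sx, sy, sz))
H = A * sigma_dot - mu_e * B * np.kron(sz, I2) - mu_p * B * np.kron(I2, sz)

levels = np.sort(np.linalg.eigvalsh(H))

mu, mu_prime = -(mu_e + mu_p), -(mu_e - mu_p)
expected = np.sort([A + mu*B, A - mu*B,
                    -A + np.sqrt(4*A**2 + (mu_prime*B)**2),
                    -A - np.sqrt(4*A**2 + (mu_prime*B)**2)])
print(np.allclose(levels, expected))   # True: the four Zeeman-shifted energy levels
print(levels)
```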

The last question is: what makes the transitions go? Let me also quote Feynman’s answer to that:

The transitions will occur if you apply a small disturbing magnetic field that varies with time (in addition to the steady strong field B). It’s just as we saw for a varying electric field on the ammonia molecule. Only here, it is the magnetic field which couples with the magnetic moments and does the trick. But the theory follows through in the same way that we worked it out for the ammonia. The theory is the simplest if you take a perturbing magnetic field that rotates in the xy-plane—although any horizontal oscillating field will do. When you put in this perturbing field as an additional term in the Hamiltonian, you get solutions in which the amplitudes vary with time—as we found for the ammonia molecule. So you can calculate easily and accurately the probability of a transition from one state to another. And you find that it all agrees with experiment.

Alright! All loud and clear. 🙂

The magnetic quantum number

At very low magnetic fields, we still have the Zeeman splitting, but we can now approximate it as follows:

[Formulas: the low-field approximation of the four energy levels.]

This simplified representation of things explains an older concept you may still see mentioned: the magnetic quantum number, which is usually denoted by m. Feynman’s explanation of it is quite straightforward, and so I’ll refer you to him for it.


As he notes: the concept of the magnetic quantum number has nothing to do with new physics. It’s all just a matter of notation. 🙂

Well… This concludes our short study of four-state systems. On to the next! 🙂


Systems with 2 spin-1/2 particles (I)

Pre-script (dated 26 June 2020): This post got mutilated by the removal of some material by the dark force. You should be able to follow the main story line, however. If anything, the lack of illustrations might actually help you to think things through for yourself. In any case, we now have different views on these concepts as part of our realist interpretation of quantum mechanics, so we recommend you read our recent papers instead of these old blog posts.

Original post:

I agree: this is probably the most boring title of a post ever. However, it should be interesting, as we’re going to apply what we’ve learned so far – i.e. the quantum-mechanical model of two-state systems – to a much more complicated problem—the solution of which can then be generalized to describe even more complicated situations.

Two spin-1/2 particles? Let’s recall the most obvious example. In the ground state of a hydrogen atom (H), we have one electron that’s bound to one proton. In that ground state, the electron occupies the lowest energy level, which – as Feynman shows in one of his first quantum-mechanical calculations – is equal to −13.6 eV. More or less, that is. 🙂  You’ll remember the reason for the minus sign: the electron has more energy when it’s unbound, which it releases as radiation when it joins an ionized hydrogen atom or, to put it simply, when a proton and an electron come together. In-between being bound and unbound, there are other discrete energy states – illustrated below – and we’ll learn how to describe the patterns of motion of the electron in each of those states soon enough.

[Figure: the discrete energy levels of the hydrogen atom and the transitions between them.]

Not in this post, however. 😦 In this post, we want to focus on the ground state only. Why? Just because. That’s today’s topic. 🙂 The proton and the electron can each be in either of two spin states. As a result, the so-called ground state is not really a single definite-energy state. The spin states cause the so-called hyperfine structure in the energy levels: they split the ground state into several nearly equal energy levels, which is what is referred to as hyperfine splitting.

[…] OK. Let’s go for it. As Feynman points out, the whole model is reduced to a set of four base states:

  1. State 1: |++〉 = |1〉 (the electron and proton are both ‘up’)
  2. State 2: |+−〉 = |2〉  (the electron is ‘up’ and the proton is ‘down’)
  3. State 3: |−+〉 = |3〉  (the electron is ‘down’ and the proton is ‘up’)
  4. State 4: |−−〉 = |4〉  (the electron and proton are both ‘down’)

The simplification is huge. As you know, the spin of electrically charged elementary particles is related to their motion in space, but we don’t care about the exact spatial relationships here: the spins can point in any direction, but all that matters is the relative orientation of the electron spin and the proton spin. Full stop.

You know that the whole problem is to find the Hamiltonian coefficients, i.e. the energy matrix. Let me give them to you straight away. The energy levels involved are the following:

  • EI = EII = EIII = A ≈ 1.47×10−6 eV
  • EIV = −3A ≈ −4.4×10−6 eV

So the differences in energy levels are measured in millionths of an electron-volt and, hence, the hyperfine splitting is really hyper-fine. The question is: how do we get these values? So that is what this post is about. Let’s start by reminding ourselves of what we learned so far.

The Hamiltonian operator

We know that, in quantum mechanics, we describe any state in terms of the base states. In this particular case, we’d do so as follows:

|ψ〉 = |1〉C1 + |2〉C2 + |3〉C3 +|4〉C4 with Ci = 〈i|ψ〉

We refer to |ψ〉 as the spin state of the system, and so it’s determined by those four Ci amplitudes. Now, we know that those Ci amplitudes are functions of time, and they are, in turn, determined by the Hamiltonian matrix. To be precise, we find them by solving a set of linear differential equations that we referred to as the Hamiltonian equations. In other words, we’d describe the behavior of |ψ〉 in time by the following equation:

iħ·(d|ψ〉/dt) = Ĥ|ψ〉

In case you forgot, the expression above is a short-hand for the following expression:

iħ·(dCi/dt) = ΣjHij·Cj

The index would range over all base states and, therefore, this expression gives us everything we want: it really does describe the behavior, in time, of an N-state system. You’ll also remember that, when we’d use the Hamiltonian matrix in the way it’s used above (i.e. as an operator on a state), we’d put a little hat over it, so we defined the Hamiltonian operator as:

〈i|Ĥ|ψ〉 = ΣjHij·〈j|ψ〉

So far, so good—but this does not solve our problem: how do we find the Hamiltonian for this four-state system? What is it?

Well… There’s no one-size-fits-all answer to that: the analysis of two different two-state systems, like an ammonia molecule, or one spin-1/2 particle in a magnetic field, was different. Having said that, we did find we could generalize some of the solutions we’d find. For example, we’d write the Hamiltonian for a spin-1/2 particle, with a magnetic moment that’s assumed to be equal to μ, in a magnetic field B = (Bx, By, Bz) as:

H = −μ·(σx·Bx + σy·By + σz·Bz)

In this equation, we’ve got a set of 4 two-by-two matrices (three so-called sigma matrices (σx, σy, σz), and then the unit matrix δij = 1) which we referred to as the Pauli spin matrices, and which we wrote as:

σx = [0, 1; 1, 0], σy = [0, −i; i, 0], σz = [1, 0; 0, −1], and the 2×2 identity matrix [1, 0; 0, 1] (the semicolons separate the rows of each matrix)

You’ll remember that expression – which we further abbreviated, even more elegantly, to H = −μσ·B – covered all two-state systems involving a magnetic moment in a magnetic field. In fact, you’ll remember we could actually easily adapt the model to cover two-state systems in electric fields as well.

In short, these sigma matrices made our life very easy—as they covered a whole range of two-state models. So… Well… To make a long story short, what we want to do here is find some similar sigma matrices for four-state problems. So… Well… Let’s do that.

First, you should remind yourself of the fact that we could also use these sigma matrices as little operators themselves. To be specific, we’d let them ‘operate’ on the base states, and we’d find they’d do the following:

σz|up〉 = |up〉 and σz|down〉 = −|down〉
σx|up〉 = |down〉 and σx|down〉 = |up〉
σy|up〉 = i·|down〉 and σy|down〉 = −i·|up〉

You need to read this carefully. What it says is that the σz matrix, as an operator, acting on the ‘up’ base state, yields the same base state (i.e. ‘up’), and that the same operator, acting on the ‘down’ state, gives us the same state but with a minus sign in front. Likewise, the σy matrix, operating on the ‘up’ and ‘down’ states, will give us i·|down〉 and −i·|up〉 respectively.

The trick to solve our problem here (i.e. our four-state system) is to apply those sigma matrices to the electron and the proton separately. Feynman introduces a new notation here by distinguishing the electron and proton sigma operators: the electron sigma operators (σxe, σye, and σze) operate on the electron spin only, while – you guessed it – the proton sigma operators (σxp, σyp, and σzp) act on the proton spin only. Applying them to the four states we’re looking at (i.e. |++〉, |+−〉, |−+〉 and |−−〉), we get the following bifurcation for our σx operator:

  1. σxe|++〉 = |−+〉
  2. σxe|+−〉 = |−−〉
  3. σxe|−+〉 = |++〉
  4. σxe|−−〉 = |+−〉
  5. σxp|++〉 = |+−〉
  6. σxp|+−〉 = |++〉
  7. σxp|−+〉 = |−−〉
  8. σxp|−−〉 = |−+〉

You get the idea. We had three operators acting on two states, i.e. 6 possibilities. Now we combine these three operators with two different particles, so we have six operators now, and we let them act on four possible system states, so we have 24 possibilities now. Now, we can, of course, let these operators act one after another. Check the following for example:

σxeσzp|+−〉 = σxe[σzp|+−〉] = −σxe|+−〉 = −|−−〉

[I now realize that I should have used the ↑ and ↓ symbols for the ‘up’ and ‘down’ states, as the minus sign is used to denote two very different things here, but… Well… So be it.]
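If you want to check these operator rules rather than just take them on faith, a few lines of Python will do it. This is just my own way of making things concrete (it’s not something Feynman does): ‘up’ and ‘down’ are represented by the usual two-component column vectors, and the electron and proton operators are built as Kronecker products, with the electron factor first.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

up, down = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)

def state(e, p):                    # |e p>, e.g. state(up, down) is |+->
    return np.kron(e, p)

sx_e = np.kron(sx, I2)              # sigma_x acting on the electron only
sz_p = np.kron(I2, sz)              # sigma_z acting on the proton only

# sigma_x(e)|+-> should be |-->
print(np.allclose(sx_e @ state(up, down), state(down, down)))            # True

# sigma_x(e) sigma_z(p)|+-> should be -|-->
print(np.allclose(sx_e @ (sz_p @ state(up, down)), -state(down, down)))  # True

# the electron and proton operators commute
print(np.allclose(sx_e @ sz_p, sz_p @ sx_e))                             # True
```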

Note that we only have nine possible σxeσzp-like combinations, because σxeσzp = σzpσxe, and then we have the 2×3 = six σe and σp operators themselves, so that makes for 15 new operators. [Note that the commutativity of these operators (σxeσzp = σzpσxe) is not some general property of quantum-mechanical operators.] If we include the unit operator (δij = 1) – i.e. an operator that leaves all unchanged – we’ve got 16 in total. Now, we mentioned that we could write the Hamiltonian for a two-state system – i.e. a two-by-two matrix – as a linear combination of the four Pauli spin matrices. Likewise, one can demonstrate that the Hamiltonian for a four-state system can always be written as some linear combination of those sixteen ‘double-spin’ matrices. To be specific, we can write it as:

H = E0 + A·σe·σp

We should note a few things here. First, the E0 constant is, of course, to be multiplied by the unit matrix, so we should actually write E0δij instead of E0, but… Well… Quantum physicists always want to confuse you. 🙂 Second, the σe·σp is like the σ·B notation: we can look at the σxe, σye, σze and σxp, σyp, σzp matrices as being the three components of two new (matrix) vectors, which we write as σe and σp respectively. Thirdly, and most importantly, you’ll want proof of that equation above. Well… I am sorry but I am going to refer you to Feynman here: he shows that the expression above “is the only thing that the Hamiltonian can be.” The proof is based on the fundamental symmetry of space. He also adds that space is symmetrical only so long as there is no external field. 🙂

Final question: what’s A? Well… Feynman is quite honest here as he says the following: “A can be calculated accurately once you understand the complete quantum theory of the hydrogen atom—which we so far do not. It has, in fact, been calculated to an accuracy of about 30 parts in one million. So, unlike the flip-flop constant A of the ammonia molecule, which couldn’t be calculated at all well by a theory, our constant A for the hydrogen can be calculated from a more detailed theory. But never mind, we will for our present purposes think of the A as a number which could be determined by experiment, and analyze the physics of the situation.”

So… Well… So far so good. We’ve got the Hamiltonian. That’s all we wanted, actually. But, now that we have come so far, let’s write it all out now.

Solving the equations

If that expression above is the Hamiltonian – and we assume it is, of course! – then our system of Hamiltonian equations can be written as:

iħ·Ċi = ΣjHij·Cj (for i = 1, 2, 3, 4)

[Note that we’ve switched to Newton’s ‘over-dot’ notation to denote time derivatives here.] Now, I could walk you through Feynman’s exposé but I guess you’ll trust the result. The equation above is equivalent to the following set of four equations:

iħ·Ċ1 = A·C1
iħ·Ċ2 = −A·C2 + 2A·C3
iħ·Ċ3 = 2A·C2 − A·C3
iħ·Ċ4 = A·C4

We know that, because the Hamiltonian looks like this:

H11 = H44 = A, H22 = H33 = −A, H23 = H32 = 2A, and all other Hamiltonian coefficients are zero. In matrix form: H = A·[1, 0, 0, 0; 0, −1, 2, 0; 0, 2, −1, 0; 0, 0, 0, 1] (the semicolons separate the rows).

How do we know that? Well… Sorry: just check Feynman. 🙂 He just writes it all out. Now, we want to find those Ci functions. [When studying physics, the most important thing is to remember what it is that you’re trying to do. 🙂 ] Now, from my previous post (i.e. my post on the general solution for N-state systems), you’ll remember that those Ci functions should have the following functional form:

Ci(t) = ai·e−i·(E/ħ)·t

If we substitute that functional form for Ci(t) in our set of Hamiltonian equations, we can cancel the exponentials, so we get the following delightfully simple set of new equations:

E·a1 = A·a1
E·a2 = −A·a2 + 2A·a3
E·a3 = 2A·a2 − A·a3
E·a4 = A·a4

The trivial solution, of course, is that all of the ai coefficients are zero, but – as mentioned in my previous post – we’re looking for non-trivial solutions here. Well… From what you see above, it’s easy to appreciate that one non-trivial but simple solution is:

a1 = 1 and a2 = a3 = a4 = 0

So we’ve got one set of ai coefficients here, and we’ll associate it with the first eigenvalue, or energy level, really—which we’ll denote as EI. [I am just being consistent here with what I wrote in my previous post, which explained how general solutions to N-state systems look like.] So we find the following:

E= A

[Another thing you learn when studying physics is that the most amazing things are often summarized in super-terse equations, like this one here. 🙂 ]

But – Hey! Look at the symmetry between the first and last equation! 

We immediately get another simple – but non-trivial! – solution:

a4 = 1 and a1 = a2 = a3 = 0

We’ll associate the second energy level with that, so we write:

EII = A

We’ve got two left. I’ll leave that to Feynman to solve:

He finds EIII = A and EIV = −3A. Done! Four energy levels En (n = I, II, III, IV), and four associated energy state vectors – |n〉 – that describe their configuration (and which, as Feynman puts it, have the time dependence “factored out”). Perfect!
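If you don’t feel like doing the algebra for states III and IV, you can also just let a computer diagonalize the energy matrix we wrote down above. A quick check of my own, with A set to 1:

```python
import numpy as np

A = 1.0   # energies in units of A
H = A * np.array([[1,  0,  0, 0],
                  [0, -1,  2, 0],
                  [0,  2, -1, 0],
                  [0,  0,  0, 1]], dtype=float)

energies, states = np.linalg.eigh(H)
print(np.round(energies, 6))      # [-3.  1.  1.  1.]  ->  E_IV = -3A, E_I = E_II = E_III = A
print(np.round(states[:, 0], 3))  # the E = -3A state: (|2> - |3>)/sqrt(2), up to an overall sign
```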

Now, we mentioned the experimental values:

  • EI = EII = EIII = A ≈ 1.47×10−6 eV
  • EIV = −3A ≈ −4.4×10−6 eV

How can scientists measure these values? The theoretical analysis gives us the A and −3A values, but what about the empirical measurements? Well… We should find those values because hydrogen atoms in state I, II or III will get rid of the excess energy by emitting some radiation, and the frequency of that radiation gives us the information we need, as illustrated below. The difference between EI = EII = EIII = A and EIV = −3A (i.e. 4A) should correspond to the frequency of the radiation that’s being emitted or absorbed as atoms go from one energy state to the other (via 4A = h·f). Now, hydrogen atoms do absorb and emit microwave radiation with a frequency that’s equal to 1,420,405,751.8 Hz. More or less, that is. 🙂 [The standard error in the measurement is about two parts in 100 billion—and I am quoting some measurement done in the early 1960s here!]

[Diagram: the four energy levels, with the transition between the upper three (E = A) and the lowest one (E = −3A) corresponding to the energy difference 4A.]

Bingo! If f = ω/2π = (4A/ħ)/2π = 1,420,405,751.8 Hz, then A = f·2π·ħ/4 = h·f/4 ≈ 1.47×10−6 eV.
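As a quick sanity check on that number, here’s the same arithmetic in a few lines of Python, using the value of Planck’s constant in eV·s:

```python
h = 4.135667696e-15        # Planck's constant in eV*s
f = 1_420_405_751.8        # measured hyperfine transition frequency in Hz

four_A = h * f             # energy of the emitted/absorbed photon, i.e. 4A
A = four_A / 4
print(f"4A = {four_A:.3e} eV")   # ~5.87e-06 eV
print(f" A = {A:.3e} eV")        # ~1.47e-06 eV
```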

So… Well… We’re done! I’ll see you tomorrow. 🙂 Tomorrow, we’re going to look at what happens when space is not symmetric, i.e. when we would have some external field! C u ! Cheers !


N-state systems

Pre-script (dated 26 June 2020): This post got mutilated by the removal of some material by the dark force. You should be able to follow the main story line, however. If anything, the lack of illustrations might actually help you to think things through for yourself. In any case, we now have different views on these concepts as part of our realist interpretation of quantum mechanics, so we recommend you read our recent papers instead of these old blog posts.

Original post:

On the 10th of December, last year, I wrote that my next post would generalize the results we got for two-state systems. That didn’t happen: I didn’t write the ‘next post’—not till now, that is. No. Instead, I started digging—as you can see from all the posts in-between this one and the 10 December piece. And you may also want to take a look at my new Essentials page. 🙂 In any case, it is now time to get back to Feynman’s Lectures on quantum mechanics. Remember where we are: halfway, really. The first half was all about stuff that doesn’t move in space. The second half, i.e. all that we’re going to study now, is about… Well… You guessed it. 🙂 That’s going to be about stuff that does move in space. To see how that works, we first need to generalize the two-state model to an N-state model. Let’s do it.

You’ll remember that, in quantum mechanics, we describe stuff by saying it’s in some state which, as long as we don’t measure in what state exactly, is written as some linear combination of a set of base states. [And please do think about what I highlight here: some state, measure, exactly. It all matters. Think about it!] The coefficients in that linear combination are complex-valued functions, which we referred to as wavefunctions, or (probability) amplitudes. To make a long story short, we wrote:

|ψ〉 = Σi|i〉·Ci = Σi|i〉〈i|ψ(t)〉

These Ci coefficients are a shorthand for 〈 i | ψ(t) 〉 amplitudes. As such, they give us the amplitude of the system to be in state i as a function of time. Their dynamics (i.e. the way they evolve in time) are governed by the Hamiltonian equations, i.e.:

iħ·(dCi/dt) = ΣjHij·Cj

The Hij coefficients in this set of equations are organized in the Hamiltonian matrix, which Feynman refers to as the energy matrix, because these coefficients do represent energies indeed. So we applied all of this to two-state systems and, hence, things should not be too hard now, because it’s all the same, except that we have N base states now, instead of just two.

So we have an N×N matrix whose diagonal elements Hii are real numbers. The non-diagonal elements may be complex numbers but, if they are, the following rule applies: Hij* = Hji. [In case you wonder: that’s got to do with the fact that we can write any final 〈χ| or 〈φ| state as the conjugate transpose of the initial |χ〉 or |φ〉 state, so we can write: 〈χ| = |χ〉†, or 〈φ| = |φ〉†.]

As usual, the trick is to find those N Ci(t) functions: we do so by solving that set of N equations, assuming we know those Hamiltonian coefficients. [As you may suspect, the real challenge is to determine the Hamiltonian, which we assume to be given here. But… Well… You first need to learn how to model stuff. Once you get your degree, you’ll be paid to actually solve problems using those models. 🙂 ] We know the complex exponential is a functional form that usually does that trick. Hence, generalizing the results from our analysis of two-state systems once more, the following general solution is suggested:

Ci(t) = ai·e−i·(E/ħ)·t

Note that we introduce only one E variable here, but N ai coefficients, which may be real- or complex-valued. Indeed, my examples – see my previous posts – often involved real coefficients, but that’s not necessarily the case. Think of the C2(t) = i·e−(i/ħ)·E0·t·sin[(A/ħ)·t] function describing one of the two base state amplitudes for the ammonia molecule—for example. 🙂

Now, that proposed general solution allows us to calculate the derivatives in our Hamiltonian equations (i.e. the d[Ci(t)]/dt functions) as follows:

d[Ci(t)]/dt = −i·(E/ħ)·ai·e−i·(E/ħ)·t

You can now double-check that the set of equations reduces to the following:

E·ai = ΣjHij·aj

Please do write it out: because we have one E only, the e−i·(E/ħ)·t factor is common to all terms, and so we can cancel it. The other stuff is plain arithmetic: i·(−i) = −i2 = 1, and the ħ constants cancel out too. So there we are: we’ve got a very simple set of N equations here, with N unknowns (i.e. these a1, a2,…, aN coefficients, to be specific). We can re-write this system as:

Σj[Hij − δij·E]·aj = 0

The δij here is the Kronecker delta, of course (it’s one for i = j and zero for i ≠ j), and we are now looking at a homogeneous system of equations here, i.e. a set of linear equations in which all the constant terms are zero. You should remember it from your high school math course. To be specific, you’d write it as Ax = 0, with A the coefficient matrix. The trivial solution is the zero solution, of course: all a1, a2,…, aN coefficients are zero. But we don’t want the trivial solution. Now, as Feynman points out – tongue-in-cheek, really – we actually have to be lucky to have a non-trivial solution. Indeed, you may or may not remember that the zero solution was actually the only solution if the determinant of the coefficient matrix was not equal to zero. So we only had a non-trivial solution if the determinant of A was equal to zero, i.e. if Det[A] = 0. So A has to be some so-called singular matrix. You’ll also remember that, in that case, we got an infinite number of solutions, to which we could apply the so-called superposition principle: if x and y are two solutions to the homogeneous set of equations Ax = 0, then any linear combination of x and y is also a solution. I wrote an addendum to this post (just scroll down and you’ll find it), which explains what systems of linear equations are all about, so I’ll refer you to that in case you’d need more detail here. I need to continue our story here. The bottom line is: the [Hij–δijE] matrix needs to be singular for the system to have meaningful solutions, so we will only have a non-trivial solution for those values of E for which

Det[Hij–δijE] = 0

Let’s spell it out. The condition above is the same as writing:

| H11 − E      H12       …       H1N     |
| H21        H22 − E     …       H2N     |
| …             …        …        …      |
| HN1          HN2       …     HNN − E   |  = 0

So far, so good. What’s next? Well… The formula for the determinant is the following:

Det[A] = Σσ sgn(σ)·a1,σ(1)·a2,σ(2)·…·aN,σ(N), where the sum runs over all N! permutations σ of (1, 2, …, N), and sgn(σ) is +1 or −1 depending on whether the permutation is even or odd

That looks like a monster, and it is, but, in essence, what we’ve got here is an expression for the determinant in terms of the permutations of the matrix elements. This is not a math course so I’ll just refer you to Wikipedia for a detailed explanation of this formula for the determinant. The bottom line is: if we write it all out, then Det[Hij–δijE] is just an Nth order polynomial in E. In other words: it’s just a sum of products with powers of E up to EN, and so our Det[Hij–δijE] = 0 condition amounts to equating that polynomial with zero.

In general, we’ll have N roots, but – sorry you need to remember so much from your high school math classes here – some of them may be multiple roots (i.e. two or more roots may be equal). We’ll call those roots—you guessed it:

EI, EII,…, En,…, EN

Note I am following Feynman’s exposé, and so he uses n, rather than k, to denote the nth Roman numeral (as opposed to Latin numerals). Now, I know your brain is near the melting point… But… Well… We’re not done yet. Just hang on. For each of these values E = EI, EII,…, En,…, EN, we have an associated set of solutions ai. As Feynman puts it: you get a set which belongs to En. In order to not forget that, for each En, we’re talking a set of N coefficients ai (i = 1, 2,…, N), we denote that set not by ai(n) but by ai(n). So that’s why we use boldface for our index n: it’s special—and not only because it denotes a Roman numeral! It’s just one of Feynman’s many meaningful conventions.

Now remember that Ci(t) = ai·ei·(E/ħ)·t formula. For each set of ai(n) coefficients, we’ll have a set of Ci(n) functions which, naturally, we can write as:

Ci(n) = ai(n)·e−i·(En/ħ)·t

So far, so good. We have N ai(n) coefficients and N Ci(n) functions. That’s easy enough to understand. Now we’ll also define a set of N new vectors, which we’ll write as |n〉, and which we’ll refer to as the state vectors that describe the configuration of the definite energy states En (n = I, II,… N). [Just breathe right now: I’ll (try to) explain this in a moment.] Moreover, we’ll write our set of coefficients ai(n) as 〈i|n〉. Again, the boldface n reminds us we’re talking a set of N complex numbers here. So we re-write that set of N Ci(n) functions as follows:

Ci(n) = 〈i|n〉·e−i·(En/ħ)·t

We can expand this as follows:

Ci(n) = 〈 i | ψn(t) 〉 = 〈 i | n 〉·e−i·(En/ħ)·t

which, of course, implies that:

| ψn(t) 〉 = |n〉·e−i·(En/ħ)·t

So now you may understand Feynman’s description of those |n〉 vectors somewhat better. As he puts it:

“The |n〉 vectors – of which there are N – are the state vectors that describe the configuration of the definite energy states En (n = I, II,… N), but have the time dependence factored out.”

Hmm… I know. This stuff is hard to swallow, but we’re not done yet: if your brain hasn’t melted yet, it may do so now. You’ll remember we talked about eigenvalues and eigenvectors in our post on the math behind the quantum-mechanical model of our ammonia molecule. Well… We can generalize the results we got there:

  1. The energies EI, EII,…, En,…, EN are the eigenvalues of the Hamiltonian matrix H.
  2. The state vectors |n〉 that are associated with each energy En, i.e. the set of vectors |n〉, are the corresponding eigenstates.

So… Well… That’s it! We’re done! This is all there is to it. I know it’s a lot but… Well… We’ve got a general description of N-state systems here, and so that’s great!

Let me make some concluding remarks though.

First, note the following property: if we let the Hamiltonian matrix act on one of those state vectors |n〉, the result is just En times the same state. We write:

Ĥ|n〉 = En·|n〉

We’re writing nothing new here really: it’s just a consequence of the definition of eigenstates and eigenvalues. The more interesting thing is the following. When describing our two-state systems, we saw we could use the states that we associated with the EI and EII energy levels as a new base set. The same is true for N-state systems: the state vectors |n〉 can also be used as a base set. Of course, for that to be the case, all of the states must be orthogonal, meaning that for any two of them, say |n〉 and |m〉, the following equation must hold:

〈n|m〉 = 0

Feynman shows this will be true automatically if all the energies are different. If they’re not – i.e. if our polynomial in E would accidentally have two (or more) roots with the same energy – then things are more complicated. However, as Feynman points out, this problem can be solved by ‘cooking up’ two new states that do have the same energy but are also orthogonal. I’ll refer you to him for the detail, as well as for the proof of that 〈n|m〉 = 0 equation.

Finally, you should also note that – because of the homogeneity principle – it’s possible to multiply the N ai(n) coefficients by a suitable factor so that all the states are normalized, by which we mean:

〈n|n〉 = 1
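All of these properties are easy to verify numerically for some arbitrary example. The sketch below (mine, not Feynman’s) generates a random Hermitian ‘energy matrix’, finds its eigenvalues and eigenstates, and then checks that H|n〉 = En|n〉, that the states are orthonormal, and that the Ci(n) = 〈i|n〉·e−i·(En/ħ)·t functions do solve the Hamiltonian equations:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5

# a random Hermitian 'energy matrix': H_ij* = H_ji, real numbers on the diagonal
M = rng.normal(size=(N, N)) + 1j * rng.normal(size=(N, N))
H = (M + M.conj().T) / 2

E, n = np.linalg.eigh(H)           # eigenvalues E_n and eigenstates |n> (as the columns of n)

print(np.allclose(H @ n, n * E))                # H|n> = E_n |n> for every n
print(np.allclose(n.conj().T @ n, np.eye(N)))   # <n|m> = 0 for n != m, and <n|n> = 1

# the C_i(n)(t) = <i|n> exp(-i E_n t / hbar) functions solve i*hbar*dC/dt = sum_j H_ij C_j
hbar, t = 1.0, 0.37
C = n * np.exp(-1j * E * t / hbar)                       # column n holds the C_i(n)(t)
dC_dt = n * (-1j * E / hbar) * np.exp(-1j * E * t / hbar)
print(np.allclose(1j * hbar * dC_dt, H @ C))             # True
```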

Well… We’re done! For today, at least! 🙂

Addendum on Systems of Linear Equations

It’s probably good to briefly remind you of your high school math class on systems of linear equations. First note the difference between homogeneous and non-homogeneous equations. Non-homogeneous equations have a non-zero constant term. The following three equations are an example of a non-homogeneous set of equations:

  • 3x + 2y − z = 1
  • 2x − 2y + 4z = −2
  • −x + y/2 − z = 0

We have a point solution here: (x, y, z) = (1, −2, −2). The geometry of the situation is something like this:

[Figure: three planes intersecting in a single point.]
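Just to make the refresher concrete, this is what solving that little system looks like numerically (a quick sketch; the determinant check confirms the solution is unique):

```python
import numpy as np

A = np.array([[ 3.0,  2.0, -1.0],
              [ 2.0, -2.0,  4.0],
              [-1.0,  0.5, -1.0]])
b = np.array([1.0, -2.0, 0.0])

print(np.linalg.solve(A, b))      # [ 1. -2. -2.]
print(np.linalg.det(A) != 0)      # True: the coefficient matrix is non-singular
```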

One of the equations may be a linear combination of the two others. In that case, that equation can be removed without affecting the solution set. For the three-dimensional case, we get a line solution, as illustrated below.

[Figure: two planes intersecting in a line.]

Homogeneous and non-homogeneous sets of linear equations are closely related. If we write a homogeneous set as Ax = 0, then a non-homogeneous set of equations can be written as Ax = b. They are related. More in particular, the solution set for Ax = b is going to be a translation of the solution set for Ax = 0. We can write that more formally as follows:

If p is any specific solution to the linear system Ax = b, then the entire solution set can be described as {p + v|v is any solution to Ax = 0}

The solution set for a homogeneous system is a linear subspace. In the example above, which had three variables and, hence, for which the vector space was three-dimensional, there were three possibilities: a point, line or plane solution. All are (linear) subspaces—although you’d want to drop the term ‘linear’ for the point solution, of course. 🙂 Formally, a subspace is defined as follows: if V is a vector space, then W is a subspace if and only if:

  1. The zero vector (i.e. 0) is in W.
  2. If x is an element of W, then any scalar multiple ax will be an element of W too (this is often referred to as the property of homogeneity).
  3. If x and y are elements of W, then the sum of x and y (i.e. x + y) will be an element of W too (this is referred to as the property of additivity).

As you can see, the superposition principle actually combines the properties of homogeneity and additivity: if x and y are solutions, then any linear combination of them will be a solution too.

The solution set for a non-homogeneous system of equations is referred to as a flat. It’s a subset too, so it’s like a subspace, except that it need not pass through the origin. Again, the flats in two-dimensional space are points and lines, while in three-dimensional space we have points, lines and planes. In general, we’ll have flats, and subspaces, of every dimension from 0 to n−1 in n-dimensional space.

OK. That’s clear enough, but what is all that talk about eigenstates and eigenvalues about? Mathematically, we define eigenvectors, aka characteristic vectors, as follows:

  • The non-zero vector v is an eigenvector of a square matrix A if Av is a scalar multiple of v, i.e. Av = λv.
  • The associated scalar λ is known as the eigenvalue (or characteristic value) associated with the eigenvector v.

Now, in physics, we talk states, rather than vectors—although our states are vectors, of course. So we’ll call them eigenstates, rather than eigenvectors. But the principle is the same, really. Now, I won’t copy what you can find elsewhere—especially not in an addendum to a post, like this one. So let me just refer you elsewhere. Paul’s Online Math Notes, for example, are quite good on this—especially in the context of solving a set of differential equations, which is what we are doing here. And you can also find a more general treatment in the Wikipedia article on eigenvalues and eigenstates which, while being general, highlights their particular use in quantum math.


Freewheeling once more…

You remember the elementary wavefunction Ψ(x, t) = Ψ(θ), with θ = ω·t−k∙x = (E/ħ)·t − (p/ħ)∙x = (E·t−p∙x)/ħ. Now, we can re-scale θ and define a new argument, which we’ll write as:

φ = θ/ħ = E·t−p∙x

The Ψ(θ) function can now be written as:

Ψ(x, t) = Ψ(θ) = [ei·(θ/ħ)]ħ = Φ(φ) = [ei·φ]ħ with φ = E·t−p∙x

This doesn’t change the fundamentals: we’re just re-scaling E and p here, by measuring them in units of ħ. 

You’ll wonder: can we do that? We’re talking physics here, so our variables represent something real. Not all we can do in math, should be done in physics, right? So what does it mean? We need to look at the dimensions of our variables. Does it affect our time and distance units, i.e. the second and the meter? Well… I’d say it’s OK.

Energy is expressed in joule: 1 J = 1 N·m. [In SI base units, we write: J = N·m = (kg·m/s2)·m = kg·(m/s)2.] So if we divide it by ħ, whose dimension is joule-second (J·s), we get some value expressed per second, i.e. a (temporal) frequency. That’s what we want, as we’re multiplying it with t in the argument of our wavefunction!

Momentum is expressed in newton-second (N·s). Now, 1 J = 1 N·m, so 1 N = 1 J/m. Hence, if we divide the momentum value by ħ, we get some value expressed per meter: N·s/J·s = N/J = N/N·m = 1/m. So we get a spatial frequency here. That’s what we want, as we’re multiplying it with x!

So the answer is yes: we can re-scale energy and momentum and we get a temporal and spatial frequency respectively, which we can multiply with t and x respectively: we do not need to change our time and distance units when re-scaling E and p by dividing by ħ!

The next question is: if we express energy and momentum as temporal and spatial frequencies, do our E = m·c2 and p = m·v formulas still apply? They should: both c and v are expressed in meter per second (m/s) and, as mentioned above, the re-scaling does not affect our time and distance units. Hence, the energy-mass equivalence relation, and the definition of p (p = m·v), imply that we can re-write the argument (φ) of our ‘new’ wavefunction – i.e. Φ(φ) – as:

φ = E·t−p∙x = m·c2∙t − m∙v·x = m·c2·[t – (v/c)∙(x/c)]

In effect, when re-scaling our energy and momentum values, we’ve also re-scaled our unit of inertia, i.e. the unit in which we measure the mass m, which is directly related to both energy as well as momentum. To be precise, from a math point of view, m is nothing but a proportionality constant in both the E = m·c2 and the p = m·v formula.

The next step is to fiddle with the time and distance units. If we

  1. measure x and t in equivalent units (so c = 1);
  2. denote v/c by β; and
  3. re-use the x symbol to denote x/c (that’s just to simplify by saving symbols);

we get:

φ = m·(t–β∙x)

This argument is the product of two factors: (1) m and (2) t–β∙x.

  1. The first factor – i.e. the mass m – is an inherent property of the particle that we’re looking at: it measures its inertia, i.e. the key variable in any dynamical model (i.e. any model – classical or quantum-mechanical – representing the motion of the particle).
  2. The second factor – i.e. t–β∙x – reminds one of the argument of the wavefunction that’s used in classical mechanics, i.e. x–vt, with v the velocity of the wave. Of course, we should note two major differences between the t–β∙x and x–vt expressions:
  1. β is a relative velocity (i.e. a ratio between 0 and 1), while v is an absolute velocity (i.e. a number between 0 and ≈ 299,792,458 m/s).
  2. The t–β∙x expression switches the time and distance variables as compared to the x–vt expression, and vice versa.

Both differences are important, but let’s focus on the second one. From a math point of view, the t–β∙x and x–vt expressions are equivalent. However, time is time, and distance is distance—in physics, that is. So what can we conclude here? To answer that question, let’s re-analyze the x–vt expression. Remember its origin: if we have some wave function F(x–vt), and we add some time Δt to its argument – so we’re looking at F[x−v(t+Δt)] now, instead of F(x−vt) – then we can restore it to its former value by also adding some distance Δx = v∙Δt to the argument: indeed, if we do so, we get F[x+Δx−v(t+Δt)] = F(x+vΔt–vt−vΔt) = F(x–vt). Of course, we can do the same analysis the other way around, so we add some Δx and then… Well… You get the idea.

Can we do that for the F(t–β∙x) expression too? Sure. If we add some Δt to its argument, then we can restore it to its former value by also adding some distance Δx = Δt/β. Just check it: F[(t+Δt)–β(x+Δx)] = F(t+Δt–βx−βΔx) = F(t+Δt–βx−βΔt/β) = F(t–β∙x).
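In case you prefer numbers over algebra, the same check takes a few lines (the waveform F and the values of β, t and x are entirely arbitrary):

```python
import numpy as np

def F(arg):                      # any waveform will do for this check
    return np.cos(arg) + 0.5 * np.sin(3 * arg)

beta, t, x = 0.6, 1.2, 2.5
dt = 0.01
dx = dt / beta                   # the shift that restores the argument t - beta*x

print(np.isclose(F((t + dt) - beta * (x + dx)), F(t - beta * x)))   # True
```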

So the mathematical equivalence between the t–β∙x and x–vt expressions is surely meaningful. The F(x–vt) function uniquely determines the waveform and, as part of that determination (or definition, if you want), it also defines its velocity v. Likewise, we can say that the Φ(φ) = Φ[m·(t–β∙x)] function defines the (relative) velocity (β) of the particle that we’re looking at—quantum-mechanically, that is.

You’ll say: we’ve got two variables here: m and β. Well… Yes and no. We can look at m as an independent variable here. In fact, if you want, we could define yet another variable –χ = φ/m = t–β∙x – and, hence, yet another wavefunction here:

Ψ(θ) = [ei·(θ/ħ)]ħ = [ei·φ]ħ = Φ(φ) = Χ(χ) = [ei·φ/m]ħ·m = [ei·χ]ħ·m = [ei·θ/(ħ·m)]ħ·m

Does that make sense? Maybe. Think of it: the spatial dimension of the wave pulse F(x–vt) – if you don’t know what I am talking about: just think of its ‘position’ – is defined by its velocity v = x/t, which – from a math point of view – is equivalent to stating: x – v∙t = 0. Likewise, if we look at our wavefunction as some pulse in space, then its spatial dimension would also be defined by its (relative) velocity, which corresponds to the classical (relative) velocity of the particle we’re looking at. So… Well… As I said, I’ll let you think of all this.

Post Scriptum:

  1. You may wonder what that ħ·m factor in that Χ(χ) = [ei·χ]ħ·m = [ei·(t–β∙x)/(ħ·m)]ħ·m function actually stands for. Well… If we measure time and distance in equivalent units (so c = 1 and, therefore, E = m), and if we measure energy in units of ħ, then ħ·m corresponds to our old energy unit, i.e. E measured in joule, rather than in terms of ħ. So… Well… I don’t think we can say much more about it.
  2. Another thing you may want to think about is the relativistic transformation of the wavefunction. You know that we should correct Newton’s Law of Motion for velocities approaching c. We do so by integrating the Lorentz factor. In light of the fact that we’re using the relative velocity (β) in our wave function, do you think we still need to apply such corrections for the wavefunction? What’s your guess? 🙂

The Hamiltonian revisited

I want to come back to something I mentioned in a previous post: when looking at that formula for those Uij amplitudes—which I’ll jot down once more:

Uij(t + Δt, t) = δij + ΔUij(t + Δt, t) = δij + Kij(t)·Δt ⇔ Uij(t + Δt, t) = δij − (i/ħ)·Hij(t)·Δt

—I noted that it resembles the general y(t + Δt) = y(t) + Δy = y(t) + (dy/dt)·Δt formula. So we can look at our Kij(t) function as being equal to the time derivative of the Uij(t + Δt, t) function. I want to re-visit that here, as it triggers a whole range of questions, which may or may not help to understand quantum math somewhat more intuitively.  Let’s quickly sum up what we’ve learned so far: it’s basically all about quantum-mechanical stuff that does not move in space. Hence, the x in our wavefunction ψ(x, t) is some fixed point in space and, therefore, our elementary wavefunction—which we wrote as:

ψ(x, t) = a·e−i·θ = a·e−i·(ω·t − k∙x) = a·e−i·[(E/ħ)·t − (p/ħ)∙x]

—reduces to ψ(t) = a·e−i·ω·t = a·e−i·(E/ħ)·t.

Unlike what you might think, we’re not equating x with zero here. No. It’s the p = m·v factor that becomes zero, because our reference frame is that of the system that we’re looking at, so its velocity is zero: it doesn’t move in our reference frame. That immediately answers an obvious question: does our wavefunction look any different when choosing another reference frame? The answer is obviously: yes! It surely matters if the system moves or not, and it also matters how fast it moves, because it changes the energy and momentum values from E and p to some E’ and p’. However, we’ll not consider such complications here: that’s the realm of relativistic quantum mechanics. Let’s start with the simplest of situations.

A simple two-state system

One of the simplest examples of a quantum-mechanical system that does not move in space, is the textbook example of the ammonia molecule. The picture was as simple as the one below: an ammonia molecule consists of one nitrogen atom and three hydrogen atoms, and the nitrogen atom could be ‘up’ or ‘down’ with regard to the motion of the NH3 molecule around its axis of symmetry, as shown below.

[Figure: the ammonia molecule (NH3), with the nitrogen atom either above or below the plane of the three hydrogen atoms.]

It’s important to note that this ‘up’ or ‘down’ direction is, once again, defined with respect to the reference frame of the system itself. The motion of the molecule around its axis of symmetry is referred to as its spin—a term that’s used in a variety of contexts and, therefore, is annoyingly ambiguous. When we use the term ‘spin’ (up or down) to describe an electron state, for example, we’d associate it with the direction of its magnetic moment. Such magnetic moment arises from the fact that, for all practical purposes, we can think of an electron as a spinning electric charge. Now, while our ammonia molecule is electrically neutral, as a whole, the two states are actually associated with opposite electric dipole moments, as illustrated below. Hence, when we’d apply an electric field (denoted as ε) below, the two states are effectively associated with different energy levels, which we wrote as E0 ± εμ.

[Figure: the two states of the ammonia molecule in an electric field ε, with their opposite electric dipole moments and energy levels E0 + εμ and E0 − εμ.]

But we’re getting ahead of ourselves here. Let’s revert to the system in free space, i.e. without an electromagnetic force field—or, what amounts to saying the same, without potential. Now, the ammonia molecule is a quantum-mechanical system, and so there is some amplitude for the nitrogen atom to tunnel through the plane of hydrogens. I told you before that this is the key to understanding quantum mechanics really: there is an energy barrier there and, classically, the nitrogen atom should not sneak across. But it does. It’s like it can borrow some energy – which we denote by A – to penetrate the energy barrier.

In quantum mechanics, the dynamics of this system are modeled using a set of two differential equations. These differential equations are really the equivalent of Newton’s classical Law of Motion (I am referring to the F = m·(dv/dt) = m·a equation here) in quantum mechanics, so I’ll have to explain them—which is not so easy as explaining Newton’s Law, because we’re talking complex-valued functions, but… Well… Let me first insert the solution of that set of differential equations:

[Graph: the probabilities P1(t) and P2(t) of finding the molecule in state 1 or state 2, oscillating between 0 and 1 as a function of time, measured in units of ħ/A.]

This graph shows how the probability of the nitrogen atom (or the ammonia molecule itself) being in state 1 (i.e. ‘up’) or, else, in state 2 (i.e. ‘down’), varies sinusoidally in time. Let me also give you the equations for the amplitudes to be in state 1 or 2 respectively:

  1. C1(t) = 〈 1 | ψ 〉 = (1/2)·e−(i/ħ)·(E0 − A)·t + (1/2)·e−(i/ħ)·(E0 + A)·t = e−(i/ħ)·E0·t·cos[(A/ħ)·t]
  2. C2(t) = 〈 2 | ψ 〉 = (1/2)·e−(i/ħ)·(E0 − A)·t – (1/2)·e−(i/ħ)·(E0 + A)·t = i·e−(i/ħ)·E0·t·sin[(A/ħ)·t]

So the P1(t) and P2(t) probabilities above are just the absolute square of these C1(t) and C2(t) functions. So as to help you understand what’s going on here, let me quickly insert the following technical remarks:

  • In case you wonder how we go from those exponentials to a simple sine and cosine factor, remember that the sum of complex conjugates, i.e. eiθ + e−iθ, reduces to 2·cosθ, while eiθ − e−iθ reduces to 2·i·sinθ.
  • As for how to take the absolute square… Well… I shouldn’t be explaining that here, but you should be able to work that out remembering that (i) |a·b·c|2 = |a|2·|b|2·|c|2; (ii) |eiθ|2 = |e−iθ|2 = 12 = 1 (for any value of θ); and (iii) |i|2 = 1.
  • As for the periodicity of both probability functions, note that the period of the squared sine and cosine functions is equal to π. Hence, the argument of our sine and cosine function will be equal to 0, π, 2π, 3π etcetera if (A/ħ)·t = 0, π, 2π, 3π etcetera, i.e. if t = 0·ħ/A, π·ħ/A, 2π·ħ/A, 3π·ħ/A etcetera. So that’s why we measure time in units of ħ/A above.
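If you want to check that the C1(t) and C2(t) formulas above really do give P1(t) = cos2[(A/ħ)·t] and P2(t) = sin2[(A/ħ)·t], here’s a quick numerical verification (the values of E0 and A are arbitrary):

```python
import numpy as np

hbar = 1.0
E0, A = 2.0, 1.0                              # illustrative values, arbitrary units
t = np.linspace(0, 2 * np.pi * hbar / A, 7)

C1 = 0.5 * np.exp(-1j * (E0 - A) * t / hbar) + 0.5 * np.exp(-1j * (E0 + A) * t / hbar)
C2 = 0.5 * np.exp(-1j * (E0 - A) * t / hbar) - 0.5 * np.exp(-1j * (E0 + A) * t / hbar)

P1, P2 = np.abs(C1)**2, np.abs(C2)**2
print(np.allclose(P1, np.cos(A * t / hbar)**2))   # True: P1 = cos^2(At/hbar)
print(np.allclose(P2, np.sin(A * t / hbar)**2))   # True: P2 = sin^2(At/hbar)
print(np.allclose(P1 + P2, 1.0))                  # True: the probabilities add up to one
```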

The graph above is actually tricky to interpret, as it assumes that we know in what state the molecule starts out with at t = 0. This assumption is tricky because we usually do not know that: we have to make some observation which, curiously enough, will always yield one of the two states—nothing in-between. Or, else, we can use a state selector—an inhomogeneous electric field which will separate the ammonia molecules according to their state. It’s a weird thing really, and it summarizes all of the ‘craziness’ of quantum-mechanics: as long as we don’t measure anything – by applying that force field – our molecule is in some kind of abstract state, which mixes the two base states. But when we do make the measurement, always along some specific direction (which we usually take to be the z-direction in our reference frame), we’ll always find the molecule is either ‘up’ or, else, ‘down’. We never measure it as something in-between. Personally, I like to think the measurement apparatus – I am talking the electric field here – causes the nitrogen atom to sort of ‘snap into place’. However, physicists use more precise language here: they would say that the electric field does result in the two positions having very different energy levels (E0 + εμ and E0 – εμ, to be precise) and that, as a result, the amplitude for the nitrogen atom to flip back and forth has little effect. Now how do we model that?

The Hamiltonian equations

I shouldn’t be using the term above, as it usually refers to a set of differential equations describing classical systems. However, I’ll also use it for the quantum-mechanical analog, which amounts to the following for our simple two-state example above:

iħ·(dC1/dt) = H11·C1 + H12·C2 = E0·C1 − A·C2
iħ·(dC2/dt) = H21·C1 + H22·C2 = −A·C1 + E0·C2

or, in matrix notation: iħ·(d/dt)[C1; C2] = [E0, −A; −A, E0]·[C1; C2]

Don’t panic. We’ll explain. The equations above are all the same but use different formats: the first block writes them as a set of equations, while the second uses the matrix notation, which involves the use of that rather infamous Hamiltonian matrix, which we denote by H = [Hij]. Now, we’ve postponed a lot of technical stuff, so… Well… We can’t avoid it any longer. Let’s look at those Hamiltonian coefficients Hij first. Where do they come from?

You’ll remember we thought of time as some kind of apparatus, with particles entering in some initial state φ and coming out in some final state χ. Both are to be described in terms of our base states. To be precise, we associated the (complex) coefficients C1 and C2 with |φ〉 and D1 and D2 with |χ〉. However, the χ state is a final state, so we have to write it as 〈χ| = |χ〉† (read: chi dagger). The dagger symbol tells us we need to take the conjugate transpose of |χ〉, so the column vector becomes a row vector, and its coefficients are the complex conjugate of D1 and D2, which we denote as D1* and D2*. We combined this with Dirac’s bra-ket notation for the amplitude to go from one base state to another, as a function in time (or a function of time, I should say):

Uij(t + Δt, t) = 〈i|U(t + Δt, t)|j〉

This allowed us to write the following matrix equation:

〈χ|U(t + Δt, t)|φ〉 = [D1*, D2*]·[U11, U12; U21, U22]·[C1; C2]

To see what it means, you should write it all out:

〈χ|U(t + Δt, t)|φ〉 = D1*·(U11(t + Δt, t)·C1 + U12(t + Δt, t)·C2) + D2*·(U21(t + Δt, t)·C1 + U22(t + Δt, t)·C2)

= D1*·U11(t + Δt, t)·C+ D1*·U12(t + Δt, t)·C+ D2*·U21(t + Δt, t)·C+ D2*·U22(t + Δt, t)·C2

It’s a horrendous expression, but it’s a complex-valued amplitude or, quite simply, a complex number. So this is not nonsensical. We can now take the next step, and that’s to go from those Uij amplitudes to the Hij amplitudes of the Hamiltonian matrix. The key is to consider the following: if Δt goes to zero, nothing happens, so we write: Uij = 〈i|U|j〉 → 〈i|j〉 = δij for Δt → 0, with δij = 1 if i = j, and δij = 0 if i ≠ j. We then assume that, for small t, those Uij amplitudes should differ from δij (i.e. from 1 or 0) by amounts that are proportional to Δt. So we write:

Uij(t + Δt, t) = δij + ΔUij(t + Δt, t) = δij + Kij(t)·Δt

We then equated those Kij(t) factors with − (i/ħ)·Hij(t), and we were done: Uij(t + Δt, t) = δij − (i/ħ)·Hij(t)·Δt. […] Well… I show you how we get those differential equations in a moment. Let’s pause here for a while to see what’s going on really. You’ll probably remember how one can mathematically ‘construct’ the complex exponential eiθ by using the linear approximation eiε = 1 + iε near θ = 0 and for infinitesimally small values of ε. In case you forgot, we basically used the definition of the derivative of the real exponential eε for ε going to zero:

eε ≈ 1 + ε for infinitesimally small ε

So we’ve got something similar here for U11(t + Δt, t) = 1 − i·[H11(t)/ħ]·Δt and U22(t + Δt, t) = 1 − i·[H22(t)/ħ]·Δt. Just replace the ε in eiε = 1 + iε by ε = − (E0/ħ)·Δt. Indeed, we know that H11 = H22 = E0, and E0/ħ is, of course, just the energy measured in (reduced) Planck units, i.e. in its natural unit. Hence, if our ammonia molecule is in one of the two base states, we start at θ = 0 and then we just start moving on the unit circle, clockwise, because of the minus sign in e−iθ. Let’s write it out:

U11(t + Δt, t) = 1 − i·[H11(t)/ħ]·Δt = 1 − i·[E0/ħ]·Δt and

U22(t + Δt, t) = 1 − i·[H22(t)/ħ]·Δt = 1 − i·[E0/ħ]·Δt

But what about U12 and U21? Is there a similar interpretation? Let’s write those equations down and think about them:

U12(t + Δt, t) = 0 − i·[H12(t)/ħ]·Δt = 0 + i·[A/ħ]·Δt and

U21(t + Δt, t) = 0 − i·[H21(t)/ħ]·Δt = 0 + i·[A/ħ]·Δt

We can visualize this as follows:

[Illustration: the Uij(t + Δt, t) amplitudes in the complex plane, for small Δt.]

Let’s remind ourselves of the definition of the derivative of a function by looking at the illustration below:

[Figure: the derivative of a function f(x) at x0 as the slope Δf/Δx.]

The f(x0) value in this illustration corresponds to the Uij(t, t), obviously. So now things make somewhat more sense: U11(t, t) = U22(t, t) = 1, obviously, and U12(t, t) = U21(t, t) = 0. We then add the ΔUij(t + Δt, t) to Uij(t, t). Hence, we can, and probably should, think of those Kij(t) coefficients as the derivative of the Uij(t, t) functions with respect to time. So we can write something like this:

dUij/dt = Kij(t) = −(i/ħ)·Hij(t)

These derivatives are pure imaginary numbers. That does not mean that the Uij(t + Δt, t) functions are purely imaginary: U11(t + Δt, t) and U22(t + Δt, t) can be approximated by 1 − i·[E0/ħ]·Δt for small Δt, so they do have a real part. In contrast, U12(t + Δt, t) and U21(t + Δt, t) are, effectively, purely imaginary (for small Δt, that is).
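For what it’s worth, you can also check how good that first-order formula is by comparing it with the exact propagator e−i·(H/ħ)·Δt for the ammonia Hamiltonian (H11 = H22 = E0, H12 = H21 = −A). A small sketch of my own, with arbitrary values for E0 and A:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
E0, A = 2.0, 1.0                               # illustrative values
H = np.array([[E0, -A], [-A, E0]], dtype=complex)

for dt in (0.1, 0.01, 0.001):
    U_exact = expm(-1j * H * dt / hbar)        # the full propagator over a small interval
    U_first = np.eye(2) - 1j * H * dt / hbar   # delta_ij - (i/hbar) H_ij dt
    err = np.max(np.abs(U_exact - U_first))
    print(f"dt = {dt:6.3f}:  max |U_exact - U_first| = {err:.2e}")
# the error shrinks like dt^2, as it should for a first-order approximation
```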

I can’t help thinking these formulas reflect a deep and beautiful geometry, but its meaning escapes me so far. 😦 When everything is said and done, none of the reflections above makes things somewhat more intuitive: these wavefunctions remain as mysterious as ever.

I keep staring at those P1(t) and P2(t) functions, and the C1(t) and C2(t) functions that ‘generate’ them, so to speak. They’re not independent, obviously. In fact, they’re exactly the same, except for a phase difference, which corresponds to the phase difference between the sine and cosine. So it’s all one reality, really: all can be described in one single functional form, so to speak. I hope things become more obvious as I move forward. :-/

Post scriptum: I promised I’d show you how to get those differential equations but… Well… I’ve done that in other posts, so I’ll refer you to one of those. Sorry for not repeating myself. 🙂

The de Broglie relations, the wave equation, and relativistic length contraction

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. So no use to read this. Read my recent papers instead. 🙂

Original post:

You know the two de Broglie relations, also known as matter-wave equations:

f = E/h and λ = h/p

You’ll find them in almost any popular account of quantum mechanics, and the writers of those popular books will tell you that f is the frequency of the ‘matter-wave’, and λ is its wavelength. In fact, to add some more weight to their narrative, they’ll usually write them in a somewhat more sophisticated form: they’ll write them using ω and k. The omega symbol (using a Greek letter always makes a big impression, doesn’t it?) denotes the angular frequency, while k is the so-called wavenumber. Now, k = 2π/λ and ω = 2π·f and, therefore, using the definition of the reduced Planck constant, i.e. ħ = h/2π, they’ll write the same relations as:

  1. λ = h/p = 2π/k ⇔ k = 2π·p/h
  2. f = E/h = (ω/2π)

⇒ k = p/ħ and ω = E/ħ

They’re the same thing: it’s just that working with angular frequencies and wavenumbers is more convenient, from a mathematical point of view that is: it’s why we prefer expressing angles in radians rather than in degrees (k is expressed in radians per meter, while ω is expressed in radians per second). In any case, the ‘matter wave’ – even Wikipedia uses that term now – is, of course, the amplitude, i.e. the wave-function ψ(x, t), which has a frequency and a wavelength, indeed. In fact, as I’ll show in a moment, it’s got two frequencies: one temporal, and one spatial. I am modest and, hence, I’ll admit it took me quite a while to fully distinguish the two frequencies, and so that’s why I always had trouble connecting these two ‘matter wave’ equations.

Indeed, if they represent the same thing, they must be related, right? But how exactly? It should be easy enough. The wavelength and the frequency must be related through the wave velocity, so we can write: f·λ = v, with v the velocity of the wave, which must be equal to the classical particle velocity, right? And then momentum and energy are also related. To be precise, we have the relativistic energy-momentum relationship: p·c = mv·v·c = mv·c2·v/c = E·v/c. So it’s just a matter of substitution. We should be able to go from one equation to the other, and vice versa. Right?

Well… No. It’s not that simple. We can start with either of the two equations but it doesn’t work. Try it. Whatever substitution you try, there’s no way you can derive one of the two equations above from the other. The fact that it’s impossible is evidenced by what we get when we’d multiply both equations. We get:

  1. f·λ = (E/h)·(h/p) = E/p
  2. v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v)

⇒ E = m·v²

Huh? What kind of formula is that? E = m·v²? That's a formula you've never ever seen, have you? It reminds you of the kinetic energy formula of course—K.E. = m·v²/2—but… That factor 1/2 should not be there. Let's think about it for a while. First note that this E = m·v² relation makes perfectly sense if v = c. In that case, we get Einstein's mass-energy equivalence (E = m·c²), but that's beside the point here. The point is: if v = c, then our 'particle' is a photon, really, and then the E = h·f relation is referred to as the Planck-Einstein relation. The wave velocity is then equal to c and, therefore, f·λ = c, and so we can effectively substitute to find what we're looking for:

E/p = (h·f)/(h/λ) = f·λ = c ⇒ E = p·c

So that’s fine: we just showed that the de Broglie relations are correct for photons. [You remember that E = p·c relation, no? If not, check out my post on it.] However, while that’s all nice, it is not what the de Broglie equations are about: we’re talking the matter-wave here, and so we want to do something more than just re-confirm that Planck-Einstein relation, which you can interpret as the limit of the de Broglie relations for v = c. In short, we’re doing something wrong here! Of course, we are. I’ll tell you what exactly in a moment: it’s got to do with the fact we’ve got two frequencies really.

Let's first try something else. We've been using the relativistic E = mᵥ·c² equation above. Let's try some other energy concept: let's substitute the E in the f = E/h relation by the kinetic energy and then see where we get—if anywhere at all. So we'll use the E_kinetic = m·v²/2 equation. We can then use the definition of momentum (p = m·v) to write E = p²/(2m), and then we can relate the frequency f to the wavelength λ using the v = λ·f formula once again. That should work, no? Let's do it. We write:

  1. E = p²/(2m)
  2. E = h·f = h·v/λ

⇒ λ = h·v/E = h·v/(p²/(2m)) = h·v/[m²·v²/(2m)] = h/[m·v/2] = 2·h/p

So we find λ = 2∙h/p. That is almost right, but not quite: that factor 2 should not be there. Well… Of course you’re smart enough to see it’s just that factor 1/2 popping up once more—but as a reciprocal, this time around. 🙂 So what’s going on? The honest answer is: you can try anything but it will never work, because the f = E/h and λ = h/p equations cannot be related—or at least not so easily. The substitutions above only work if we use that E = m·v2 energy concept which, you’ll agree, doesn’t make much sense—at first, at least. Again: what’s going on? Well… Same honest answer: the f = E/h and λ = h/p equations cannot be related—or at least not so easily—because the wave equation itself is not so easy.
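If you want to see the mismatch for yourself, a quick symbolic check does the trick. This is just a sketch using sympy—the variable names are mine—reproducing the two substitution routes above: the first one spits out the weird E = m·v² result, the second one the λ = 2·h/p result, factor 2 included.

```python
import sympy as sp

E, h, p, m, v = sp.symbols('E h p m v', positive=True)

# Route 1: multiply f = E/h and lam = h/p, then impose f*lam = v and p = m*v
f_expr, lam_expr = E / h, h / p
E_route1 = sp.solve(sp.Eq(f_expr * lam_expr, v), E)[0].subs(p, m * v)
print(E_route1)        # m*v**2 -> the 'weird' E = m·v² result

# Route 2: use the kinetic energy E = p²/(2m) together with lam = h*v/E
lam_route2 = sp.simplify((h * v / (p**2 / (2 * m))).subs(p, m * v))
print(lam_route2)      # 2*h/(m*v) = 2·h/p -> the factor 2 pops up again
```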

Let’s review the basics once again.

The wavefunction

The amplitude of a particle is represented by a wavefunction. If we have no information whatsoever on its position, then we usually write that wavefunction as the following complex-valued exponential:

ψ(x, t) = a·e^(−i·[(E/ħ)·t − (p/ħ)·x]) = a·e^(−i·(ω·t − k·x)) = a·e^(i·(k·x − ω·t)) = a·e^(iθ) = a·(cosθ + i·sinθ)

θ is the so-called phase of our wavefunction and, as you can see, it's the argument of the wavefunction indeed, with temporal frequency ω and spatial frequency k (if we choose our x-axis so its direction is the same as the direction of k, then we can replace the k and x vectors by the k and x scalars, so that's what we're doing here). Now, we know we shouldn't worry too much about a, because that's just some normalization constant (remember: all probabilities have to add up to one). However, let's quickly develop some logic here. Taking the absolute square of this wavefunction gives us the probability of our particle being somewhere in space at some point in time. So we get the probability as a function of x and t. We write:

P(x, t) = |a·e^(−i·[(E/ħ)·t − (p/ħ)·x])|² = a²

As all probabilities have to add up to one, we must assume we're looking at some box in spacetime here. So, if the length of our box is Δx = x₂ − x₁, then (Δx)·a² = (x₂ − x₁)·a² = 1 ⇔ Δx = 1/a². [We obviously simplify the analysis by assuming a one-dimensional space only here, but the gist of the argument is essentially correct.] So, freezing time (i.e. equating t to some point t = t₀), we get the following probability density function:

[Image: the (uniform) probability density P(x, t₀) = a² over the box Δx = 1/a²]

That’s simple enough. The point is: the two de Broglie equations f = E/h and λ = h/p give us the temporal and spatial frequencies in that ψ(x, t) = a·ei·[(E/ħ)·t − (p/ħ)∙x] relation. As you can see, that’s an equation that implies a much more complicated relationship between E/ħ = ω and p/ħ = k. Or… Well… Much more complicated than what one would think of at first.

To appreciate what's being represented here, it's good to play a bit. We'll continue with our simple exponential above, which also illustrates how we usually analyze those wavefunctions: we either assume we're looking at the wavefunction in space at some fixed point in time (t = t₀) or, else, at how the wavefunction changes in time at some fixed point in space (x = x₀). Of course, we know that Einstein told us we shouldn't do that: space and time are related and, hence, we should try to think of spacetime, i.e. some 'kind of union' of space and time—as Minkowski famously put it. However, when everything is said and done, mere mortals like us are not so good at that, and so we're sort of condemned to try to imagine things using the classical cut-up of things. 🙂 So we'll just use an online graphing tool to play with that a·e^(i·(k·x − ω·t)) = a·e^(iθ) = a·(cosθ + i·sinθ) formula.

Compare the following two graphs, for example. Just imagine we either look at how the wavefunction behaves in space, with the time fixed at some point t = t₀, or, alternatively, that we look at how the wavefunction behaves in time at some fixed point in space x = x₀. As you can see, increasing k = p/ħ or increasing ω = E/ħ gives the wavefunction a higher 'density' in space or, alternatively, in time.

[Image: graph of the wavefunction for a first value of k = p/ħ (or ω = E/ħ)]

[Image: graph of the wavefunction for a larger value of k = p/ħ (or ω = E/ħ): the wave is 'denser']

That makes sense, intuitively. In fact, when thinking about how the energy, or the momentum, affects the shape of the wavefunction, I am reminded of an airplane propeller: as it spins, faster and faster, it gives the propeller some 'density', in space as well as in time, as its blades cover more space in less time. It's an interesting analogy: it helps—me, at least—to think through what that wavefunction might actually represent.

[Image: an airplane propeller]
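If you want to reproduce graphs like the ones above yourself, the snippet below is a minimal matplotlib sketch—the values for a, k and ω are arbitrary assumptions, picked for the picture only—showing the real and imaginary part of a·e^(i·(k·x − ω·t)) at a frozen time t = 0 for two different values of k:

```python
import numpy as np
import matplotlib.pyplot as plt

a, t0 = 1.0, 0.0                      # arbitrary amplitude and frozen time
x = np.linspace(0, 10, 500)

for k in (1.0, 3.0):                  # two assumed spatial frequencies (k = p/ħ)
    omega = 1.0                       # assumed temporal frequency (ω = E/ħ)
    psi = a * np.exp(1j * (k * x - omega * t0))
    plt.plot(x, psi.real, label=f'Re ψ, k={k}')
    plt.plot(x, psi.imag, '--', label=f'Im ψ, k={k}')

plt.xlabel('x'); plt.legend()
plt.title("Higher k gives the wavefunction a higher 'density' in space")
plt.show()
```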

So as to stimulate your imagination even more, you should also think of representing the real and imaginary part of that ψ = a·e^(i·(k·x − ω·t)) = a·e^(iθ) = a·(cosθ + i·sinθ) formula in a different way. In the graphs above, we just showed the sine and cosine in the same plane but, as you know, the real and the imaginary axis are orthogonal, so Euler's formula a·e^(iθ) = a·(cosθ + i·sinθ) = Re(ψ) + i·Im(ψ) may also be graphed as follows:

[Animation: the real and imaginary part of the wavefunction plotted along two orthogonal axes, spiraling through space]

The illustration above should make you think of yet another illustration you’ve probably seen like a hundred times before: the electromagnetic wave, propagating through space as the magnetic and electric field induce each other, as illustrated below. However, there’s a big difference: Euler’s formula incorporates a phase shift—remember: sinθ = cos(θ − π/2)—and you don’t have that in the graph below. The difference is much more fundamental, however: it’s really hard to see how one could possibly relate the magnetic and electric field to the real and imaginary part of the wavefunction respectively. Having said that, the mathematical similarity makes one think!

[Image: an electromagnetic wave: the E and B field vectors propagating through space]

Of course, you should remind yourself of what E and B stand for: they represent the strength of the electric (E) and magnetic (B) field at some point x at some time t. So you shouldn’t think of those wavefunctions above as occupying some three-dimensional space. They don’t. Likewise, our wavefunction ψ(x, t) does not occupy some physical space: it’s some complex number—an amplitude that’s associated with each and every point in spacetime. Nevertheless, as mentioned above, the visuals make one think and, as such, do help us as we try to understand all of this in a more intuitive way.

Let’s now look at that energy-momentum relationship once again, but using the wavefunction, rather than those two de Broglie relations.

Energy and momentum in the wavefunction

I am not going to talk about uncertainty here. You know that Spiel. If there's uncertainty, it's in the energy or the momentum, or in both. The uncertainty determines the size of that 'box' (in spacetime) in which we hope to find our particle, and it's modeled by a splitting of the energy levels. We'll say the energy of the particle may be E0, but it might also be some other value, which we'll write as En = E0 ± n·ħ. The thing to note is that energy levels will always be separated by some integer multiple of ħ, so ħ is, effectively, the quantum of energy for all practical—and theoretical—purposes. We then superimpose the various wavefunctions to get a wave function that might—or might not—resemble something like this:

[Image: a wave packet built from such a superposition]

Who knows? 🙂 In any case, that's not what I want to talk about here. Let's repeat the basics once more: if we write our wavefunction a·e^(−i·[(E/ħ)·t − (p/ħ)·x]) as a·e^(−i·[ω·t − k·x]), we refer to ω = E/ħ as the temporal frequency, i.e. the frequency of our wavefunction in time (i.e. the frequency it has if we keep the position fixed), and to k = p/ħ as the spatial frequency (i.e. the frequency of our wavefunction in space: so now we stop the clock and just look at the wave in space). Now, let's think about the energy concept first. The energy of a particle is generally thought of to consist of three parts:

  1. The particle's rest energy m₀c², which de Broglie referred to as internal energy (Eint): it includes the rest mass of the 'internal pieces', as Feynman puts it (now we call those 'internal pieces' quarks), as well as their binding energy (i.e. the quarks' interaction energy);
  2. Any potential energy it may have because of some field (so de Broglie was not assuming the particle was traveling in free space), which we'll denote by U, and note that the field can be anything—gravitational, electromagnetic: it's whatever changes the energy because of the position of the particle;
  3. The particle's kinetic energy, which we write in terms of its momentum p: m·v²/2 = m²·v²/(2m) = (m·v)²/(2m) = p²/(2m).

So we have one energy concept here (the rest energy) that does not depend on the particle's position in spacetime, and two energy concepts that do depend on position (potential energy) and/or how that position changes because of its velocity and/or momentum (kinetic energy). The two last bits are related through the energy conservation principle. The total energy is E = mᵥc², of course—with the little subscript (v) ensuring the mass incorporates the equivalent mass of the particle's kinetic energy.

So what? Well… In my post on quantum tunneling, I drew attention to the fact that different potentials, i.e. different potential energies (indeed, as our particle travels from one region to another, the field is likely to vary), have no impact on the temporal frequency. Let me re-visit the argument, because it's an important one. Imagine two different regions in space that differ in potential—because the field has a larger or smaller magnitude there, or points in a different direction, or whatever: just different fields, which corresponds to different values for U₁ and U₂, i.e. the potential in region 1 versus region 2. Now, the different potential will change the momentum: the particle will accelerate or decelerate as it moves from one region to the other, so we also have a different p₁ and p₂. Having said that, the internal energy doesn't change, so we can write the corresponding amplitudes, or wavefunctions, as:

  1. ψ₁(θ₁) = Ψ₁(x, t) = a·e^(−iθ₁) = a·e^(−i·[(Eint + p₁²/(2m) + U₁)·t − p₁·x]/ħ)
  2. ψ₂(θ₂) = Ψ₂(x, t) = a·e^(−iθ₂) = a·e^(−i·[(Eint + p₂²/(2m) + U₂)·t − p₂·x]/ħ)

Now how should we think about these two equations? We are definitely talking different wavefunctions. However, their temporal frequencies ω₁ = [Eint + p₁²/(2m) + U₁]/ħ and ω₂ = [Eint + p₂²/(2m) + U₂]/ħ must be the same. Why? Because of the energy conservation principle—or its equivalent in quantum mechanics, I should say: the temporal frequency f or ω, i.e. the time-rate of change of the phase of the wavefunction, does not change: all of the change in potential, and the corresponding change in kinetic energy, goes into changing the spatial frequency, i.e. the wave number k or the wavelength λ, as potential energy becomes kinetic or vice versa. The sum of the potential and kinetic energy doesn't change, indeed. So the energy remains the same and, therefore, the temporal frequency does not change. In fact, we need this quantum-mechanical equivalent of the energy conservation principle to calculate how the momentum and, hence, the spatial frequency of our wavefunction, changes. We do so by boldly equating ω₁ and ω₂, and so we write:

ω₁ = ω₂ ⇔ Eint + p₁²/(2m) + U₁ = Eint + p₂²/(2m) + U₂

⇔ p₁²/(2m) − p₂²/(2m) = U₂ − U₁ ⇔ p₂² = (2m)·[p₁²/(2m) − (U₂ − U₁)]

⇔ p₂ = (p₁² − 2m·ΔU)^(1/2)

We played with this in a previous post, assuming that p₁² is larger than 2m·ΔU, so as to get a positive number on the right-hand side of the equation for p₂², so then we can confidently take the positive square root of that (p₁² − 2m·ΔU) expression to calculate p₂. For example, when the potential difference ΔU = U₂ − U₁ was negative, so ΔU < 0, then we're safe and sure to get some real positive value for p₂.

Having said that, we also contemplated the possibility that p₂² = p₁² − 2m·ΔU was negative, in which case p₂ has to be some pure imaginary number, which we wrote as p₂ = i·p' (so p' (read: p prime) is a real positive number here). We could work with that: it resulted in an exponentially decreasing factor e^(−p'·x/ħ) that ended up 'killing' the wavefunction in space. However, its limited existence still allowed particles to 'tunnel' through potential energy barriers, thereby explaining the quantum-mechanical tunneling phenomenon.
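A quick numerical sketch shows both regimes at once. The numbers for p₁, m and ΔU below are made up, and natural units (ħ = 1) are assumed: you get a real p₂ when p₁² > 2m·ΔU, and a purely imaginary p₂—i.e. the exponentially decaying, 'tunneling' amplitude—when you don't.

```python
import numpy as np

def p2_from_p1(p1, m, dU):
    """Momentum in region 2 from p2² = p1² − 2m·ΔU (natural units assumed)."""
    p2_squared = p1**2 - 2 * m * dU
    # complex square root: real p2 if p2² > 0, purely imaginary p2 (tunneling) otherwise
    return np.sqrt(complex(p2_squared))

print(p2_from_p1(p1=2.0, m=1.0, dU=1.0))   # p2² = 2  -> real momentum
print(p2_from_p1(p1=1.0, m=1.0, dU=1.0))   # p2² = -1 -> p2 = i·p': decaying wavefunction
```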

This is rather weird—at first, at least. Indeed, one would think that, because of the E/ħ = ω equation, any change in energy would lead to some change in ω. But no! The total energy doesn’t change, and the potential and kinetic energy are like communicating vessels: any change in potential energy is associated with a change in p, and vice versa. It’s a really funny thing. It helps to think it’s because the potential depends on position only, and so it should not have an impact on the temporal frequency of our wavefunction. Of course, it’s equally obvious that the story would change drastically if the potential would change with time, but… Well… We’re not looking at that right now. In short, we’re assuming energy is being conserved in our quantum-mechanical system too, and so that implies what’s described above: no change in ω, but we obviously do have changes in p whenever our particle goes from one region in space to another, and the potentials differ. So… Well… Just remember: the energy conservation principle implies that the temporal frequency of our wave function doesn’t change. Any change in potential, as our particle travels from one place to another, plays out through the momentum.

Now that we know that, let’s look at those de Broglie relations once again.

Re-visiting the de Broglie relations

As mentioned above, we usually think in one dimension only: we either freeze time or, else, we freeze space. If we do that, we can derive some funny new relationships. Let’s first simplify the analysis by re-writing the argument of the wavefunction as:

θ = E·t − p·x

Of course, you'll say: the argument of the wavefunction is not equal to E·t − p·x: it's (E/ħ)·t − (p/ħ)·x. Moreover, θ should have a minus sign in front. Well… Yes, you're right. We should put that 1/ħ factor in front, but we can change units, and so let's just measure both E as well as p in units of ħ here. We can do that. No worries. And, yes, the minus sign should be there—Nature chose a clockwise direction for θ—but that doesn't matter for the analysis hereunder.

The E·t − p·x expression reminds one of those invariant quantities in relativity theory. But let's be precise here. We're thinking about those so-called four-vectors here, which we wrote as pμ = (E, px, py, pz) = (E, p) and xμ = (t, x, y, z) = (t, x) respectively. [Well… OK… You're right. We wrote those four-vectors as pμ = (E, px·c, py·c, pz·c) = (E, p·c) and xμ = (c·t, x, y, z) = (c·t, x). So what we write is true only if we measure time and distance in equivalent units so we have c = 1. So… Well… Let's do that and move on.] In any case, what was invariant was not E·t − p·x·c or c·t − x (that's a nonsensical expression anyway: you cannot subtract a vector from a scalar), but pμ² = pμpμ = E² − (p·c)² = E² − p²·c² = E² − (px² + py² + pz²)·c² and xμ² = xμxμ = (c·t)² − x² = c²·t² − (x² + y² + z²) respectively. [Remember pμpμ and xμxμ are four-vector dot products, so they have that + − − − signature, unlike the p² and x² or a·b dot products, which are just a simple sum of the squared components.] So… Well… E·t − p·x is not an invariant quantity. Let's try something else.

Let's re-simplify by equating ħ as well as c to one again, so we write: ħ = c = 1. [You may wonder if it is possible to 'normalize' both physical constants simultaneously, but the answer is yes. The Planck unit system is an example.] Then our relativistic energy-momentum relationship can be re-written as E/p = 1/v. [If c would not be one, we'd write: E·β = p·c, with β = v/c. So we got E/p = c/β. We referred to β as the relative velocity of our particle: it was the velocity, but measured as a ratio of the speed of light. So here it's the same, except that we use the velocity symbol v now for that ratio.]

Now think of a particle moving in free space, i.e. without any fields acting on it, so we don't have any potential changing the spatial frequency of the wavefunction of our particle, and let's also assume we choose our x-axis such that it's the direction of travel, so the position vector (x) can be replaced by a simple scalar (x). Finally, we will also choose the origin of our x-axis such that x = 0 when t = 0, so we write: x(t = 0) = 0. It's obvious then that, if our particle is traveling in spacetime with some velocity v, then the ratio of its position x and the time t that it's been traveling will always be equal to v = x/t. Hence, for that very special position in spacetime (t, x = v·t) – so we're talking the actual position of the particle in spacetime here – we get: θ = E·t − p·x = E·t − p·v·t = E·t − m·v·v·t = (E − m·v²)·t. So… Well… There we have the m·v² factor.

The question is: what does it mean? How do we interpret this? I am not sure. When I first jotted this thing down, I thought of choosing a different reference potential: some negative value such that it ensures that the sum of kinetic, rest and potential energy is zero, so I could write E = 0 and then the wavefunction would reduce to ψ(t) = e^(i·m·v²·t). Feynman refers to that as 'choosing the zero of our energy scale such that E = 0', and you'll find this in many other works too. However, it's not that simple. Free space is free space: if there's no change in potential from one region to another, then the concept of some reference point for the potential becomes meaningless. There is only rest energy and kinetic energy, then. The total energy reduces to E = m (because we chose our units such that c = 1 and, therefore, E = m·c² = m·1² = m) and so our wavefunction reduces to:

ψ(t) = a·e^(i·m·(1 − v²)·t)

We can’t reduce this any further. The mass is the mass: it’s a measure for inertia, as measured in our inertial frame of reference. And the velocity is the velocity, of course—also as measured in our frame of reference. We can re-write it, of course, by substituting t for t = x/v, so we get:

ψ(x) = a·e^(i·m·(1/v − v)·x)

For both functions, we get constant probabilities, but a wavefunction that's 'denser' for higher values of m. The (1 − v²) and (1/v − v) factors are different, however: these factors become smaller for higher v, so our wavefunction becomes less dense for higher v. In fact, for v = 1 (so for travel at the speed of light, i.e. for photons), we get that ψ(t) = ψ(x) = e⁰ = 1. [You should use the graphing tool once more, and you'll see the imaginary part, i.e. the sine of the (cosθ + i·sinθ) expression, just vanishes, as sinθ = 0 for θ = 0.]

[Image: graph from the online graphing tool]

The wavefunction and relativistic length contraction

Are exercises like this useful? As mentioned above, these constant probability wavefunctions are a bit nonsensical, so you may wonder why I wrote what I wrote. There may be no real conclusion, indeed: I was just fiddling around a bit, and playing with equations and functions. I feel stuff like this helps me to understand what that wavefunction actually is somewhat better. If anything, it does illustrate that idea of the ‘density’ of a wavefunction, in space or in time. What we’ve been doing by substituting x for x = v·t or t for t = x/v is showing how, when everything is said and done, the mass and the velocity of a particle are the actual variables determining that ‘density’ and, frankly, I really like that ‘airplane propeller’ idea as a pedagogic device. In fact, I feel it may be more than just a pedagogic device, and so I’ll surely re-visit it—once I’ve gone through the rest of Feynman’s Lectures, that is. 🙂

That brings me to what I added in the title of this post: relativistic length contraction. You'll wonder why I am bringing that into a discussion like this. Well… Just play a bit with those (1 − v²) and (1/v − v) factors. As mentioned above, they decrease the density of the wavefunction. In other words, it's like space is being 'stretched out'. Also, it can't be a coincidence we find the same (1 − v²) factor in the relativistic length contraction formula: L = L₀·√(1 − v²), in which L₀ is the so-called proper length (i.e. the length in the stationary frame of reference) and v is the (relative) velocity of the moving frame of reference. Of course, we also find it in the relativistic mass formula: m = mᵥ = m₀/√(1 − v²). In fact, things become much more obvious when substituting m for m₀/√(1 − v²) in that ψ(t) = a·e^(i·m·(1 − v²)·t) function. We get:

ψ(t) = a·e^(i·m·(1 − v²)·t) = a·e^(i·m₀·√(1 − v²)·t)

Well… We’re surely getting somewhere here. What if we go back to our original ψ(x, t) = a·ei·[(E/ħ)·t − (p/ħ)∙x] function? Using natural units once again, that’s equivalent to:

ψ(x, t) = a·e^(i·(m·t − p·x)) = a·e^(i·[(m₀/√(1 − v²))·t − (m₀·v/√(1 − v²))·x])

= a·e^(i·[m₀/√(1 − v²)]·(t − v·x))

Interesting! We've got a wavefunction that's a function of x and t, but with the rest mass (or rest energy) and velocity as parameters! Now that really starts to make sense. Look at the (blue) graph for that 1/√(1 − v²) factor: it goes from one (1) to infinity (∞) as v goes from 0 to 1 (remember we 'normalized' v: it's a ratio between 0 and 1 now). So that's the factor that comes into play for t. For x, it's the red graph, which has the same shape but goes from zero (0) to infinity (∞) as v goes from 0 to 1.

[Image: the 1/√(1 − v²) (blue) and v/√(1 − v²) (red) factors as a function of v]

Now that makes sense: the 'density' of the wavefunction, in time and in space, increases as the velocity v increases. In space, that should correspond to the relativistic length contraction effect: it's like space is contracting, as the velocity increases and, therefore, the length of the object we're watching contracts too. For time, the reasoning is a bit more complicated: it's our time that becomes more dense and, therefore, our clock that seems to tick faster.
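By the way, the substitution we made a few paragraphs back—replacing m by m₀/√(1 − v²) in the m·(1 − v²) phase factor—is easy to verify symbolically. This is just a sketch with sympy, in natural units (c = 1):

```python
import sympy as sp

m0, v = sp.symbols('m0 v', positive=True)
m = m0 / sp.sqrt(1 - v**2)          # relativistic mass, natural units (c = 1)

phase_factor = sp.simplify(m * (1 - v**2))
print(phase_factor)                                        # m0*sqrt(1 - v**2)
print(sp.simplify(phase_factor - m0 * sp.sqrt(1 - v**2)))  # 0: both expressions are identical
```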

[…]

I know I need to explore this further—if only so as to assure you I have not gone crazy. Unfortunately, I have no time to do that right now. Indeed, from time to time, I need to work on other stuff besides this physics ‘hobby’ of mine. :-/

Post scriptum 1: As for the E = m·v² formula, I also have a funny feeling that it might be related to the fact that, in quantum mechanics, both the real and imaginary part of the oscillation actually matter. You'll remember that we'd represent any oscillator in physics by a complex exponential, because it eased our calculations. So instead of writing A = A₀·cos(ωt + Δ), we'd write: A = A₀·e^(i(ωt + Δ)) = A₀·cos(ωt + Δ) + i·A₀·sin(ωt + Δ). When calculating the energy or intensity of a wave, however, we couldn't just take the square of the complex amplitude of the wave – remembering that E ∼ A². No! We had to get back to the real part only, i.e. the cosine or the sine only. Now the mean (or average) value of the squared cosine function (or a squared sine function), over one or more cycles, is 1/2, so the mean of A² is equal to A₀²/2. I am not sure, and it's probably a long shot, but one must be able to show that, if the imaginary part of the oscillation would actually matter – which is obviously the case for our matter-wave – then 1/2 + 1/2 is obviously equal to 1. I mean: try to think of an image with a mass attached to two springs, rather than one only. Does that make sense? 🙂 […] I know: I am just freewheeling here. 🙂
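Those 1/2 factors are easy to check numerically. A tiny sketch: averaging cos² over a full cycle gives 1/2, and adding the 'imaginary half', i.e. sin², brings the total back up to 1.

```python
import numpy as np

theta = np.linspace(0, 2 * np.pi, 100000, endpoint=False)
print(np.mean(np.cos(theta)**2))                      # ≈ 0.5: the mean of the squared cosine
print(np.mean(np.cos(theta)**2 + np.sin(theta)**2))   # = 1.0: both 'halves' together
```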

Post scriptum 2: The other thing that this E = m·v² equation makes me think of is – curiously enough – an eternally expanding spring. Indeed, the kinetic energy of a mass on a spring and the potential energy that's stored in the spring always add up to some constant, and the average potential and kinetic energy are equal to each other. To be precise: 〈K.E.〉 + 〈P.E.〉 = (1/4)·k·A² + (1/4)·k·A² = k·A²/2. It means that, on average, the total energy of the system is twice the average kinetic energy (or potential energy). You'll say: so what? Well… I don't know. Can we think of a spring that expands eternally, with the mass on its end not gaining or losing any speed? In that case, v is constant, and the total energy of the system would, effectively, be equal to Etotal = 2·〈K.E.〉 = 2·(m·v²/2) = m·v².

Post scriptum 3: That substitution I made above – substituting x for x = v·t – is kinda weird. Indeed, if that E = m·v² equation makes any sense, then E − m·v² = 0, of course, and, therefore, θ = E·t − p·x = E·t − p·v·t = E·t − m·v·v·t = (E − m·v²)·t = 0·t = 0. So the argument of our wavefunction is 0 and, therefore, we get a·e⁰ = a for our wavefunction. It basically means our particle is where it is. 🙂

Post scriptum 4: This post scriptum – no. 4 – was added later—much later. On 29 February 2016, to be precise. The solution to the 'riddle' above is actually quite simple. We just need to make a distinction between the group and the phase velocity of our complex-valued wave. The solution came to me when I was writing a little piece on Schrödinger's equation. I noticed that we do not find that weird E = m·v² formula when substituting ψ for ψ = e^(i(kx − ωt)) in Schrödinger's equation, i.e. in:

[Image: Schrödinger's equation for a free particle: ∂ψ/∂t = (iħ/2m)·∇²ψ]

Let me quickly go over the logic. To keep things simple, we'll just assume one-dimensional space, so ∇²ψ = ∂²ψ/∂x². The time derivative on the left-hand side is ∂ψ/∂t = −iω·e^(i(kx − ωt)). The second-order derivative on the right-hand side is ∂²ψ/∂x² = (ik)·(ik)·e^(i(kx − ωt)) = −k²·e^(i(kx − ωt)). The e^(i(kx − ωt)) factor on both sides cancels out and, hence, equating both sides gives us the following condition:

−iω = −(iħ/2m)·k² ⇔ ω = (ħ/2m)·k²

Substituting ω = E/ħ and k = p/ħ yields:

E/ħ = (ħ/2m)·p²/ħ² = p²/(2m·ħ) = m²·v²/(2m·ħ) = m·v²/(2ħ) ⇔ E = m·v²/2

In short: the E = m·v²/2 formula is the correct one. It must be, because… Well… Because Schrödinger's equation is a formula we surely shouldn't doubt, right? So the only logical conclusion is that we must be doing something wrong when multiplying the two de Broglie equations. To be precise: our v = f·λ equation must be wrong. Why? Well… It's just something one shouldn't apply to our complex-valued wavefunction. The 'correct' velocity formula for the complex-valued wavefunction should have that 1/2 factor, so we'd write 2·f·λ = v to make things come out alright. But where would this formula come from? The period of cosθ + i·sinθ is the period of the sine and cosine function: cos(θ + 2π) + i·sin(θ + 2π) = cosθ + i·sinθ, so T = 2π and f = 1/T = 1/2π do not change.

But so that’s a mathematical point of view. From a physical point of view, it’s clear we got two oscillations for the price of one: one ‘real’ and one ‘imaginary’—but both are equally essential and, hence, equally ‘real’. So the answer must lie in the distinction between the group and the phase velocity when we’re combining waves. Indeed, the group velocity of a sum of waves is equal to vg = dω/dk. In this case, we have:

vg = d[E/ħ]/d[p/ħ] = dE/dp

We can now use the kinetic energy formula to write E as E = m·v²/2 = p·v/2. Now, v and p are related through m (p = m·v, so v = p/m). So we should write this as E = m·v²/2 = p²/(2m). Substituting E and p = m·v in the equation above then gives us the following:

vg = dω/dk = d[p²/(2m)]/dp = 2p/(2m) = p/m = v

However, for the phase velocity, we can just use the vp = ω/k formula, which gives us that 1/2 factor:

vp = ω/k = (E/ħ)/(p/ħ) = E/p = (m·v²/2)/(m·v) = v/2

Bingo! Riddle solved! 🙂 Isn’t it nice that our formula for the group velocity also applies to our complex-valued wavefunction? I think that’s amazing, really! But I’ll let you think about it. 🙂
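If you want to re-run the riddle symbolically, here's a quick sketch with sympy: plugging the ω = (ħ/2m)·k² dispersion relation into vg = dω/dk and vp = ω/k shows the group velocity is the classical velocity p/m, while the phase velocity is half of it.

```python
import sympy as sp

hbar, m, k = sp.symbols('hbar m k', positive=True)
omega = hbar * k**2 / (2 * m)          # dispersion relation from Schrödinger's equation

v_group = sp.diff(omega, k)            # dω/dk
v_phase = omega / k                    # ω/k

print(v_group)                         # hbar*k/m = p/m = v (the classical velocity)
print(sp.simplify(v_group / v_phase))  # 2: the phase velocity is half the group velocity
```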

The Pauli spin matrices as operators

Pre-script (dated 26 June 2020): This post got mutilated by the removal of some material by the dark force. You should be able to follow the main story line, however. If anything, the lack of illustrations might actually help you to think things through for yourself. In any case, we now have different views on these concepts as part of our realist interpretation of quantum mechanics, so we recommend you read our recent papers instead of these old blog posts.

Original post:

You must be despairing by now. More theory? Haven’t we had enough? Relax. We’re almost there. The next post is going to generalize our results for n-state systems. However, before we do that, we need one more building block, and that’s this one. So… Well… Let’s go for it. It’s a bit long but, hopefully, interesting enough—so you don’t fall asleep before the end. 🙂 Let’s first review the concept of an operator itself.

The concept of an operator

You’ll remember Feynman‘s ‘Great Law of Quantum Mechanics’:

| = ∑ | i 〉〈 i | over all base states i.

We also talked of all kinds of apparatuses: a Stern-Gerlach spin filter, a state selector for a maser, a resonant cavity or—quite simply—just time passing by. From a quantum-mechanical point of view, we think of this as particles going into the apparatus in some state φ, and coming out of it in some other state χ. We wrote the amplitude for that as 〈 χ | A | φ 〉. [Remember the right-to-left reading, like Arab or Hebrew script.] Then we applied our ‘Great Law’ to that 〈 χ | A | φ 〉 expression – twice, actually – to get the following expression:

[Image: 〈 χ | A | φ 〉 = ∑ij 〈 χ | i 〉〈 i | A | j 〉〈 j | φ 〉]

We’re just ‘unpacking’ the φ and χ states here, as we can only describe those states in terms of base states, which we denote as and j here. That’s all. If we’d add another apparatus in series, we’d get:

[Image: 〈 χ | BA | φ 〉 unpacked over base states: 〈 χ | BA | φ 〉 = ∑ 〈 χ | i 〉〈 i | B | j 〉〈 j | A | k 〉〈 k | φ 〉]

We just put the | bar between B and A and apply the same trick. The | bar is really like a factor 1 in multiplication—in the sense that we can insert it anywhere: a×b = a×1×b = 1×a×b = a×b×1 = 1×a×1×b×1 = 1×a×b×1 etc. Anywhere? Hmm… It's not quite the same, but I'll let you check out the differences. 🙂 The point is that, from a mathematical point of view, we can fully describe the apparatus A, or the combined apparatus BA, in terms of those 〈 i | A | j 〉 or 〈 i | BA | j 〉 amplitudes. Depending on the number of base states, we'd have a three-by-three, or a two-by-two, or, more generally, an n-by-n matrix, i.e. a square matrix of order n. For example, there are 3×3 = 9 amplitudes if we have three possible states—and, equally obviously, 2×2 = 4 amplitudes for the example involving spin-1/2 particles. [If you think things are way too complicated,… Well… At least we've got square matrices here—not n-by-m matrices.] We simply called such a matrix the matrix of amplitudes, and we usually denoted it by A. However, sometimes we'd also denote it by Aij, or by [Aij], depending on our mood. 🙂 The preferred notation was A, however, so as to avoid confusion with the matrix elements, which we'd write as Aij.
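In plain linear algebra terms, 'putting two apparatuses in series' is nothing but a matrix product. Here's a toy numpy sketch—the amplitudes in A and B are made-up numbers, not any particular physical filter—just to show the bookkeeping:

```python
import numpy as np

# Made-up 2x2 matrices of amplitudes <i|A|j> and <i|B|j> for a two-state system
A = np.array([[0.8, 0.2j], [0.2j, 0.8]])
B = np.array([[1.0, 0.0], [0.0, 1j]])

BA = B @ A                     # the combined apparatus: <i|BA|j> = sum over k of <i|B|k><k|A|j>
phi = np.array([1.0, 0.0])     # initial state |φ>, written in the chosen base states
chi = np.array([0.0, 1.0])     # final state <χ| (here just a base state)

amplitude = chi.conj() @ BA @ phi   # <χ|BA|φ>
print(BA)
print(amplitude)
```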

The Hamiltonian matrix – which, very roughly speaking, is like the quantum-mechanical equivalent of the  dp/dt term of Newton’s Law of Motion: F = dp/dt = m·dv/dt = m·a – is a matrix of amplitudes as well, and we’ll come back to it in a minute. Let’s first continue our story on operators here. The idea of an operator comes up when we’re creative again, and when we drop the 〈 χ | state from the 〈 χ | A | φ〉 expression, so we write:

[Image: | ψ 〉 = A | φ 〉]

So now we think of the particle entering the 'apparatus' A in the state φ and coming out of A in some state ψ ('psi'). But our psi is a ket, i.e. some initial state. That's why we write it as | ψ 〉. It doesn't mean anything until we combine it with some bra, like a base state 〈 i |, or with a final state, which we'd denote by 〈 χ | or some other Greek letter between a 〈 and a | symbol. So then we get 〈 χ | ψ 〉 = 〈 χ | A | φ 〉 or 〈 i | ψ 〉 = 〈 i | A | φ 〉. So then we're 'unpacking' our bar once more. Let me be explicit here: it's kinda weird, but if you're going to study quantum math, you'll need to accept that, when discussing the state of a system or a particle, like ψ or φ, it does make a difference if they're initial or final states. To be precise, the final 〈 χ | or 〈 φ | states are equal to the conjugate transpose of the initial | χ 〉 or | φ 〉 states, so we write: 〈 χ | = | χ 〉* or 〈 φ | = | φ 〉*. I'll come back to that, because it's kind of counter-intuitive: a state should be a state, no? Well… No. Not from a quantum-math point of view at least. 😦 But back to our operator. Feynman defines an operator in the following rather intuitive way:

The symbol A is neither an amplitude, nor a vector; it is a new kind of thing called an operator. It is something which “operates on” some state | φ 〉 to produce some new state | ψ 〉.”

But… Well… Be careful! What’s a state? As I mentioned, | ψ 〉 is not the same as 〈 ψ |. We’re talking an initial state | ψ 〉 here, not 〈 ψ |. That’s why we need to ‘unpack’ the operator to see what it does: we have to combine it with some final state that we’re interested in, or a base state. Then—and only then—we get a proper amplitude, i.e. some complex number – or some complex function – that we can work with. To be precise, we then get the amplitude to be in that final state, or in that base state. In practical terms, that means our operator, or our apparatus, doesn’t mean very much as long as we don’t measure what comes out—and measuring something implies we have to choose some set of base states, i.e. a representation, which allows us to describe the final state, which we denoted as 〈 χ | above.

Let’s wrap this up by being clear on the notation once again. We’ll write: Aij = 〈 i | A | j 〉, or Uij = 〈 i | U | j 〉, or Hij = 〈 i | H | j 〉. In other words, we’ll really be consistent now with those subscripts: if they are there, we’re talking a coefficient, or a matrix element. If they’re not there, we’re talking the matrix itself, i.e. A, U or H. Now, to give you a sort of feeling for how that works in terms of the matrix equations that we’ll inevitably have to deal with, let me just jot one of them down here:

[Image: 〈 χ | A | φ 〉 written out as a matrix product: the row vector [D+*  D0*  D−*] of the 〈 χ | coordinates, times the matrix of 〈 i | A | j 〉 amplitudes, times the column vector of the | φ 〉 coordinates]

The Di* numbers are the 'coordinates' of the (final) 〈 χ | state in terms of the base states, which we denote as i = +, 0 or − here. So we have three states here. [That's just to remind you that the two-state systems we've seen so far are pretty easy. We'll soon be working with four-state systems—and then the sky is the limit. :-)] In fact, you'll remember that those coordinates were the complex conjugate of the 'coordinates' of the initial | χ 〉 state, i.e. D+, D0, D−, so that 1-by-3 matrix above, i.e. the row vector 〈 χ | = [D+*  D0*  D−*], is the so-called conjugate transpose of the column vector | χ 〉 = [D+  D0  D−]T. [I can't do columns with this WordPress editor, so I am just putting the T for transpose so as to make sure you understand | χ 〉 is a column vector.]
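The conjugate-transpose business is easier to see with actual numbers. Below is a small numpy sketch with made-up 'coordinates' for | χ 〉 and | φ 〉—nothing physical, just illustrating that the bra is the conjugate transpose of the ket and that 〈 χ | φ 〉* = 〈 φ | χ 〉:

```python
import numpy as np

# Made-up 'coordinates' of the ket |χ> in a three-state representation (+, 0, −)
ket_chi = np.array([[0.5 + 0.5j], [0.0], [1j]])      # column vector |χ>
bra_chi = ket_chi.conj().T                           # row vector <χ|: the conjugate transpose

ket_phi = np.array([[1.0], [0.0], [0.0]])            # some other state |φ>

print(bra_chi @ ket_phi)                             # the amplitude <χ|φ>
print((bra_chi @ ket_phi).conj() == ket_phi.conj().T @ ket_chi)  # True: <χ|φ>* = <φ|χ>
```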

Now, you’ll wonder – if you don’t, you should 🙂 – how that Aij = 〈 i | A | j 〉, Uij = 〈 i | U | j 〉, or Hij = 〈 i | H | j 〉 notation works out in terms of matrices. It’s extremely simple really. If we have only two states (yes, back to simplicity), which we’ll also write as + and − (forget about the 0 state), then we can write Aij = 〈 i | A | j 〉 in matrix notation as:

[Image: the Aij = 〈 i | A | j 〉 amplitudes written out as matrix products of the base-state row and column vectors [1 0] and [0 1] with the matrix A]

Huh? Is it that simple? Yes. We can make things more complicated by involving a transformation matrix so we can write our base states in terms of another, different, set of base states but, in essence, this is what we are talking about here. Of course, you should absolutely not try to give a geometric interpretation to our [1 0] or [0 1] 'coordinates'. If you do that, you get in trouble, because then you want to give the transformed base states the same geometric interpretation and… Well… It just doesn't make sense. I gave an example of that in my post on the hydrogen molecule as a two-state system. Symmetries in quantum physics are not geometric… Well… Not in a physical sense, that is. As I explained in my previous post, describing spin-1/2 particles involves stuff like 720 degree symmetries and all that. So… Well… Just don't! 🙂

Onwards!

The Hamiltonian as a matrix and as an operator

As mentioned above, our Hamiltonian is a matrix of amplitudes as well, and we can also write it as H, Hij, or [Hij] respectively, depending on our mood. 🙂 For some reason, Feynman often writes it as Hij, instead of H, which creates a lot of confusion because, in most contexts, Hij refers to the matrix elements, rather than the matrix itself. I guess Feynman likes to keep the subscripts, i.e ij or I,II, as they refer to the representation that was chosen. However, Hij should really refer to the matrix element, and then we can use H for the matrix itself. So let’s be consistent. As I’ve shown above, the Hij notation – and so I am talking the Hamiltonian coefficients here – is actually a shorthand for writing:

Hij = 〈 i | H | j 〉

So the Hamiltonian coefficient (Hij) connects two base states (i and j) through the Hamiltonian matrix (H). Connect? How? Our language in the previous posts, and some of Feynman's language, may have suggested the Hamiltonian coefficients are amplitudes to go from state j to state i. However, that's not the case. Or… Well… We need to qualify that statement. What does it mean? The i and j states are base states and, hence, 〈 i | j 〉 = δij, with δij = 1 if i = j and δij = 0 if i ≠ j. Hence, stating that the Hamiltonian coefficients are the amplitudes to go from one state to another is… Well… Let's say that language is rather inaccurate. We need to include the element of time, so we need to think in terms of those amplitudes C1 and C2, or CI and CII, which are functions in time: Ci = Ci(t). Now, the Hamiltonian coefficients are obviously related to those amplitudes. Sure! That's quite obvious from the fact they appear in those differential equations for C1 and C2, or CI and CII, i.e. the amplitude to be in state 1 or state 2, or state I or state II, respectively. But they're not the same.

Let’s go back to the basics here. When we derived the Hamiltonian matrix as we presented Feynman’s brilliant differential analysis of it, we wrote the amplitude to go from one base state to another, as a function in time (or a function of time, I should say), as:

Uij = Uij(t + Δt, t) = 〈 i | U | j 〉 = 〈 i | U(t + Δt, t) | j 〉

Our 'unpacking' rules then allowed us to write something like this for t = t1 and t + Δt = t2 or – let me quickly circle back to that monster matrix notation above – for Δt = t2 − t1:

[Image: the same matrix product as above, now with the U(t + Δt, t) amplitudes]

The key – as presented by Feynman – to go from those Uij amplitudes to the Hij amplitudes is to consider the following: if Δt goes to zero, nothing happens, so we wrote: Uij = 〈 i | U | j 〉 → 〈 i | j 〉 = δij for Δt → 0. We also assumed that, for small Δt, those Uij amplitudes should differ from δij (i.e. from 1 or 0) by amounts that are proportional to Δt. So we wrote:

Uij(t + Δt, t) = δij + ΔUij(t + Δt, t) = δij + Kij(t)·Δt ⇔ Uij(t + Δt, t) = δij − (i/ħ)·Hij(t)·Δt

There’s several things here. First, note the first-order linear approximation: it’s just like the general y(t + Δt) = y(t) + Δy = y(t) + (dy/dt)·Δt formula. So can we look at our Kij(t) function as being the time derivative of the Uij(t + Δt, t) function? The answer is, unambiguously, yes. Hence, −(i/ħ)·Hij(t) is the same time derivative. [Why? Because Kij(t) = −(i/ħ)·Hij(t).] Now, the time derivative of a function, i.e. dy/dt, is equal to Δy/Δt for Δt → 0 and, of course, we know that Δy = 0 for Δt → 0. We are now in a position to understand Feynman’s interpretation of the Hamiltonian coefficients:

The −(i/ħ)·Hij(t) = −(i/ħ)·〈 i | H | j 〉 factor is the amplitude that—under the physical conditions described by H—a state j will, during the time dt, “generate” the state i.

I know I shouldn't make this post too long (I promised to write about the Pauli spin matrices, and I am not even halfway there) but I should note a funny thing there: in that Uij(t + Δt, t) = δij + ΔUij(t + Δt, t) = δij + Kij(t)·Δt = δij − (i/ħ)·Hij(t)·Δt formula, for Δt → 0, we go from real to complex numbers. I shouldn't anticipate anything but… Well… We know that the Hij coefficients will (usually) represent some energy level, so they are real numbers. Therefore, −(i/ħ)·Hij(t) = Kij(t) is complex-valued, as we'd expect, because Uij(t + Δt, t) is, in general, complex-valued, and δij is just 0 or 1. I don't have too much time to linger on this, but it should remind you of how one may mathematically 'construct' the complex exponential e^(iθ) by using the linear approximation e^(iε) = 1 + i·ε near ε = 0 or, what amounts to the same, for small ε. My post on this shows how Feynman takes the magic out of Euler's formula doing that – and I should re-visit it, because I feel the formula above, and that linear approximation formula for a complex exponential, go to the heart of the 'mystery', really. But… Well… No time. I have to move on.

Let me quickly make another small technical remark here. When Feynman talks about base states, he always writes them as a bra or a ket, just like any other state. So he talks about “base state | i 〉”, or “base state 〈 i |”. If you look it up, you’ll see he does the same in that quote: he writes | j 〉 and | i 〉, rather than j and i. In fact, strictly speaking, he should write 〈 i | instead of | i 〉. Frankly, I really prefer to just write “base state i”, or base state j”, without specifying if it’s a bra or a ket. A base state is a base state: 〈 i | and | i 〉 represent the same. Of course, it’s rather obvious that 〈 χ | and | χ 〉 are not the same. In fact, as I showed above, they’re each other’s complex conjugate, so 〈 χ |* = | χ 〉. To be precise, I should say: they’re each other’s conjugate transpose, because we’re talking row and column vectors respectively. Likewise, we can write: 〈 χ | φ 〉* = 〈 φ | χ 〉. For base states, this becomes 〈 i | j 〉* = 〈 j | i 〉. Now, 〈 i | and | j 〉 were matrices, really – row and column vectors, to be precise – so we can apply the following rule: the conjugate transpose of the product of two matrices is the product of the conjugate transpose of the same matrices, but with the order of the matrices reversed. So we have: (AB)* = B*A*. In this case: 〈 i | j 〉* = | j 〉*〈 i |*. Huh? Yes. Think about it. I should probably use the dagger notation for the conjugate transpose, rather than the simple * notation, but… Well… It works. The bottom line is: 〈 i | j 〉* = 〈 j | i 〉 = | j 〉*〈 i |* and, therefore, 〈 j | = | j 〉* and | i 〉 = 〈 i |*. Conversely, 〈 j | i 〉* = 〈 i | j 〉 = | i 〉*〈 j |* and, therefore, we also have 〈 j |* = | j 〉 and | i 〉* = 〈 i |. Now, we know the coefficients of these row and column vectors are either one or zero. In short, 〈 i | and | i 〉, or 〈 j | and | j 〉 are really one and the same ‘object’. The only reason why we would use the bra-ket notation is to indicate whether we’re using them in an initial condition, or in a final state. In the specific case that we’re dealing with here, it’s obvious that j is used in an initial condition, and i is a final condition.

We’re now ready to look at these differential equations once more, and try to truly understand them:

[Image: the differential equations iħ·(dCi/dt) = ∑j Hij(t)·Cj(t)]

The summation over all base states j amounts to adding the contribution, so to speak, of all those base states j, during the infinitesimally small time interval dt, to the change in the amplitude (during the same infinitesimal time interval, of course) to be in state i. Does that make sense?

You’ll say: yes. Or maybe. Or maybe not. 🙂 And I know you’re impatient. We were supposed to talk about the Hamiltonian operator here. So what about that? Why this long story on the Hamiltonian coefficients? Well… Let’s take the next step. An operator is all about ‘abstracting away’, or ‘dropping terms’, as Feynman calls it—more down to the ground. 🙂 So let’s do that in two successive rounds, as shown below. First we drop the 〈 i |, because the equation holds for any i. Then we apply the grand | = ∑ | i 〉〈 i | rule—which is somewhat tricky, as it also gets rid of the summation. We then define the Hamiltonian operator as H, but we just put a little hat on top of it. That’s all.

[Image: dropping the 〈 i | and applying the | = ∑ | i 〉〈 i | rule turns iħ·(dCi/dt) = ∑j Hij·Cj into the operator equation iħ·(d| ψ 〉/dt) = Ĥ| ψ 〉]

As this is all rather confusing, let me show what it means in terms of matrix algebra:

[Image: the same operator equation written out in matrix algebra: iħ times the column vector of the dCi/dt derivatives equals the Hamiltonian matrix times the column vector of the Ci amplitudes]

So… Frankly, it's not all that difficult. It's basically introducing a summary notation, which is what operators usually do. Note that the H = iħ·d/dt operator (sorry if I am not always putting the hat) is not just the d/dt with an extra multiplication by ħ and by the imaginary unit i. From a mathematical point of view, of course, that's what it seems to be, and actually is. From a mathematical point of view, it's just an n-by-n matrix, and so we can effectively apply it to some n-by-1 column vector to get another n-by-1 column vector.
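To see that matrix equation actually do something, here's a minimal numerical sketch—the two-by-two Hamiltonian and its energy numbers are made up, and ħ is set to 1—that integrates iħ·dC/dt = H·C with a crude Euler step and prints the probabilities sloshing back and forth between the two base states:

```python
import numpy as np

hbar = 1.0
H = np.array([[1.0, -0.5],            # made-up Hamiltonian matrix (Hermitian) in some base
              [-0.5, 1.0]])
C = np.array([1.0 + 0j, 0.0 + 0j])    # start in base state 1: C1 = 1, C2 = 0

dt, steps = 0.01, 500
for n in range(steps):
    dC_dt = (-1j / hbar) * (H @ C)    # iħ·dC/dt = H·C  =>  dC/dt = −(i/ħ)·H·C
    C = C + dC_dt * dt                # crude Euler step (good enough for a sketch)
    if n % 100 == 0:
        print(n * dt, np.abs(C)**2)   # the probabilities P1(t) and P2(t)
```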

But its meaning is much deeper: as Feynman puts it: the equation(s) above are the dynamical law of Nature—the law of motion for a quantum system. In a way, it's like that invariant (1 − v²)^(−1/2)·d/dt operator that we introduced when discussing relativity, and things like the proper time and invariance under Lorentz transformation. That operator really did something. It 'fixed' things as we applied it to the four-vectors in relativistic spacetime. So… Well… Think about it.

Before I move on – because, when everything is said and done, I promised to use the Pauli matrices as operators – I’ll just copy Feynman as he approaches the equations from another angle:

[Image: Feynman's alternative way of arriving at the same equation]

Of course, that’s the equation we started out with, before we started ‘abstracting away’:

[Image: the differential equations iħ·(dCi/dt) = ∑j Hij(t)·Cj(t)]

So… Well… You can go through the motions once more. Onward!

The Pauli spin matrices as operators

If the Hamiltonian matrix can be used as an operator, then we can use the Pauli spin matrices as little operators too! Indeed, from my previous post, you’ll remember we can write the Hamiltonian in terms of the Pauli spin matrices:

[Image: the Hamiltonian written in terms of the Pauli spin matrices: H = −μ·(σx·Bx + σy·By + σz·Bz)]

Now, if we think of the Hamiltonian matrix as an operator, we can put a little hat everywhere, so we get:

[Image: the same expression, with hats on the Hamiltonian and sigma operators]

It's really as simple as that. Now, we get a little bit in trouble with the x, y and z subscripts as we're going to want to write the matrix elements as σij, so we'll just move them and write them as superscripts, so our matrix elements will be written as σxij = 〈 i | σx | j 〉, σyij = 〈 i | σy | j 〉 and σzij = 〈 i | σz | j 〉 respectively. Now, we introduced all kinds of properties of the Pauli matrices themselves, but let's now look at the properties of these matrices as an operator. To do that, we'll let them loose on the base states. We get the following:

[Image: the sigma operators acting on the base states: σz|+〉 = |+〉, σz|−〉 = −|−〉, σx|+〉 = |−〉, σx|−〉 = |+〉, σy|+〉 = i|−〉, σy|−〉 = −i|+〉]

[You can check this in Feynman, but it’s really very straightforward, so you should try to get this result yourself.] The next thing is to create even more operators by multiplying the operators two by two. We get stuff like:

σxσy|+〉 = σx(σy|+〉) = σx(i|−〉) = i·(σx|−〉) = i·|+〉

The thing to note here is that it's business as usual: we can move factors like i out of the operators, as the operators work on the state vectors only. Oh… And sorry I am not putting the hat again. It's the limitations of the WordPress editor here (I always need to 'import' my formulas from Word or some other editor, so I can't put them in the text itself). On the other hand, Feynman himself seems to doubt the use of the hat symbol, as he writes: "It is best, when working with these things, not to keep track of whether a quantity like σ or H is an operator or a matrix. All the (matrix) equations are the same anyway."
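You can let numpy do this bookkeeping for you. A short sketch—just the standard Pauli matrices and the two base states written as column vectors—that reproduces the σy|+〉 = i|−〉, σx|−〉 = |+〉 and σxσy|+〉 = i·|+〉 results:

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

plus = np.array([1, 0], dtype=complex)    # base state |+>
minus = np.array([0, 1], dtype=complex)   # base state |−>

print(sigma_y @ plus)                     # [0, 1j]  = i·|−>
print(sigma_x @ minus)                    # [1, 0]   = |+>
print(sigma_x @ (sigma_y @ plus))         # [1j, 0]  = i·|+>, i.e. σxσy|+> = i·|+>
```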

That makes it all rather tedious or, in fact, no! That makes it all quite easy, because our table with the properties of the sigma matrices is also valid for the sigma operators, so let’s just copy it, and then we’re done, so we can wrap up and do something else. 🙂

[Image: the table of products of the sigma operators, taken two at a time: σx² = σy² = σz² = 1, σx·σy = i·σz, σy·σz = i·σx, σz·σx = i·σy, with the reversed products picking up a minus sign (e.g. σy·σx = −i·σz)]

To conclude, let me answer your most pressing question at this very moment: what's the use of this? Well… To a large extent, it's just a nice way of writing things. For example, let's look at our equations for the ammonia molecule once more. But… Well… No. I'll refer you to Feynman here, as he re-visits all the systems we've studied before, but now approaches them with our new operators and notations. Have fun with it! 🙂


Pauli’s spin matrices

[Preliminary note (added on 4 April 2020): When re-reading what I wrote below, I realize I would fundamentally re-write certain sections. I think I have found a comprehensive realist interpretation of quantum mechanics and, hence, I'd recommend you read my paper on the fine and hyperfine structure of the hydrogen atom, which is centered around a classical explanation of the Lamb shift. The writings below are probably just good to illustrate how I got there. Happy reading!]

[Preliminary note 2 (added on 26 June 2020): I also note the dark force has had fun here too, as a result of which the post has been mutilated substantially. So… Well… Not-so-happy reading then, I guess.]

Wolfgang Pauli’s life is as wonderful as his scientific legacy—but we’ll just talk about one of his many contributions to quantum mechanics here in this post—not about his life.

This post should be fairly straightforward. We just want to review some of the math. Indeed, we got the 'Grand Result' already in our previous post, as we found the Hamiltonian coefficients for a spin one-half particle—read: all matter-particles, practically speaking—in a magnetic field—but then we can just replace the magnetic dipole moment by an electric dipole moment, if needed, and we'll find the same formulas, so we've basically covered everything you can possibly think of.

[…] Well… Sort of… 🙂

OK. Jokes aside, we have a magnetic field B, which we describe in terms of its components: B = (Bx, By, Bz), and we’ve defined two mutually exclusive states – call them ‘up’ or ‘down’, or 1 or 2, or + or −, whatever − along some direction, which we call the z-direction. Why? Convention. Historical accident. The z-direction is the direction in regard to which we measure stuff. What stuff? Well… Stuff like the spin of an electron: quantum-mechanical stuff. 🙂 In any case, the Hamiltonian that comes with this system is:

[Image: the Hamiltonian matrix for a spin-1/2 particle in a magnetic field: H11 = −μBz, H12 = −μ(Bx − i·By), H21 = −μ(Bx + i·By), H22 = +μBz]

Now, because this matrix doesn’t look impressive enough, we’re going to re-write it as:

[Image: the same Hamiltonian written in terms of the Pauli spin matrices: H = −μ·(σx·Bx + σy·By + σz·Bz)]

Huh? Yes. It looks good, doesn’t it? And the σx, σy and σz matrices are given below, so you can check it’s actually true. […] I mean: you can check that the two notations are equivalent, from a math point of view, that is. 🙂

[Image: the Pauli spin matrices and the identity matrix: σz = [[1, 0], [0, −1]], σx = [[0, 1], [1, 0]], σy = [[0, −i], [i, 0]], 1 = [[1, 0], [0, 1]]]

As Feynman puts it: “This is what the professionals use all of the time.” So… Well… Yes. We had better learn them by heart. 🙂

The identity matrix is actually not one of the so-called Pauli spin matrices, but we need it when we’d decide to not equate the average energy of our system to zero, i.e. when we’d decide to shift the zero point of our energy scale so as to include the equivalent energy of the rest mass. In that case, we re-write the Hamiltonian as:

[Image: the Hamiltonian re-written with the rest energy E0 added via the identity matrix (the Kronecker delta term)]

In fact, as most academics want to hide their knowledge from us by confusing us deliberately, they’ll often omit the Kronecker delta, and simply write:

[Image: the same Hamiltonian with the Kronecker delta omitted: the E0 term is simply added to the −μ·σ·B expression]

It's OK, as long as you know what it is that you're trying to do. 🙂 The point is, we've got four 'elementary' matrices now which allow us to write any matrix – literally, any two-by-two matrix – as a linear combination of them. In Feynman's words:

[Image: Feynman quote on writing any two-by-two matrix as a linear combination of the identity and the Pauli matrices]

Now, the Pauli matrices have lots of interesting properties. Their products, for example, taken two at a time, are rather special:

[Image: the table of products of the Pauli matrices, taken two at a time: σx² = σy² = σz² = 1, σx·σy = i·σz, σy·σz = i·σx, σz·σx = i·σy, with the reversed products picking up a minus sign]

The most interesting property, however, is that, when choosing some other representation, i.e. when changing to another coordinate system, the three Pauli matrices behave like the components of a vector. That vector is written as σ, and so it's a matrix you can use in different coordinate systems, as though it's a vector. It allows us to re-write the Hamiltonian we started out with in a particularly nice way:

[Image: the Hamiltonian re-written using the σ 'vector': H = −μ·σ·B]

You should compare this to the classical formula for the energy of a little magnet with the magnetic moment μ in the same magnetic field:

[Image: the classical formula for the energy of a magnetic moment in a magnetic field: U = −μ·B]

There are several differences, of course. First, note that the quantum-mechanical magnetic moment is like the quantum-mechanical angular momentum: there’s only a limited set of discrete values, given by the following relation:

[Image: the relation giving the discrete values of the quantum-mechanical magnetic moment]

That’s why we write it as a scalar in the quantum-mechanical equation, and as a vector, i.e. in boldface (μ), in the second equation. The two equations differ more fundamentally, however: the first one is a matrix equation, while the second one is… Well… Just a simple vector dot product.

The point is: the classical energy becomes the Hamiltonian matrix, and the classical μ vector becomes the μσ matrix. As Feynman puts it: “It is sometimes said that to each quantity in classical physics there corresponds a matrix in quantum mechanics, but it is really more correct to say that the Hamiltonian matrix corresponds to the energy, and any quantity that can be defined via energy has a corresponding matrix.”
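As a sanity check on that correspondence, here's a small numpy sketch—μ and the field components are made-up numbers—that builds the H = −μ·σ·B matrix and confirms its two eigenvalues are ∓μ·|B|, i.e. exactly the classical −μ·B energies for the moment aligned or anti-aligned with the field:

```python
import numpy as np

sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

mu = 2.0                         # made-up magnetic moment
B = np.array([0.3, 0.4, 1.2])    # made-up field components (Bx, By, Bz), |B| = 1.3

H = -mu * (sigma_x * B[0] + sigma_y * B[1] + sigma_z * B[2])
print(np.linalg.eigvalsh(H))     # [-2.6, 2.6]: the two energy levels ∓μ·|B|
print(mu * np.linalg.norm(B))    # 2.6
```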

[…]

What does he mean by a quantity that can be defined via energy? It’s simple: the magnetic moment, for example, can be defined via energy by saying that the energy, in an external field B, is equal to −μ·B.

Huh? Wasn’t it the other way around? Didn’t we define the energy by saying it’s equal to −μ·B?

We did. In our posts on electromagnetism. That was classical theory. However, in quantum mechanics, it’s the energy that’s the ‘currency’ we need to be dealing in. So it makes sense to look at things the other way around: we’ll first think about the energy, and then we try to find a matrix that corresponds to it.

So… Yes. Many classical quantities have their quantum-mechanical counterparts, and those quantum-mechanical counterparts are often some matrices. But not all of them. Sometimes there’s just no comparison, because the two worlds are actually different. Let me quote Feynman on what he thinks of how these two worlds relate, as he wraps up his discussion of the two equations above:

[Image: Feynman's philosophical remarks on how the classical and the quantum-mechanical world relate]

Well… That says it all, doesn't it? 🙂 We'll talk more tomorrow. 🙂


The Hamiltonian of matter in a field

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. So no use to read this. Read my recent papers instead. 🙂

Original post:

In this and the next post, I want to present some essential discussions in Feynman’s 10th, 11th and 12th Lectures on Quantum Mechanics. This post in particular will actually present the Hamiltonian for the spin state of an electron, but the discussion is much more general than that: it’s a model for any spin-1/2 particle, i.e. for all elementary fermions—so that’s the ‘matter-particles’ you know: electrons, protons and neutrons. Or, taking into account that protons and neutrons consist of quarks, we should say quarks, which also have spin 1/2. So let’s go for it. Let me first, by way of introduction, remind you of a few things.

What is it that we are trying to do?

That’s always a good question to start with. 🙂 Just for fun, and as we’ll be talking a lot about symmetries and directions in space, I’ve inserted an animation below of a four-dimensional object, as its author calls it. This ‘object’ returns to its original configuration after a rotation of 720 degrees only (after 360 degrees, the spiral flips between clockwise and counterclockwise orientations, so it’s not the same). For some rather obscure reason 🙂 he refers to it as a spin-1/2 particle, or a spinor.

Spin_One-Half_(Slow)

Are spin one-half particles, like an electron or a proton, really four-dimensional? Well… I guess so. All depends, of course, on your definition or concept of a dimension. 🙂 Indeed, the term is as well – I should say, as badly, really – defined as the ubiquitous term ‘vector’ and so… Well… Let me say that spinors are usually defined in four-dimensional vector spaces, indeed. […] So is this what it’s all about, and should we talk about spinors?

Not really. Feynman doesn’t push the math that far, so I won’t do that either. 🙂 In fact, I am not sure why he’s holding back here: spinors are just mathematical objects, like vectors or tensors, which we introduced in one of our posts on electromagnetism, so why not have a go at it? You’ll remember that our electromagnetic tensor was like a special vector cross-product which, using the four-potential vector Aμ and the ∇μ = (∂/∂t, −∂/∂x, −∂/∂y, −∂/∂z) operator, we could write as (∇μAν) − (∇μAν)T.

Huh? Hey! Relax! It’s a matrix equation. It looks like this:

matrix

In fact, I left the factor c out above, and so we should plug it in, remembering that B’s magnitude is 1/c times E’s magnitude. So the electromagnetic tensor – in one of its many forms at least – is the following matrix:

electromagnetic tensor final

Why do we need a beast like this? Well… Have a look at the mentioned post or, better, one of the subsequent posts: we used it in very powerful equations (read: very concise equations, because that’s what mathematicians, and physicists, like) describing the dynamics of a system. So we have something similar here: what we’re trying to describe is the dynamics of a quantum-mechanical system in terms of the evolution of its state, which we express as a linear combination of ‘pure’ base states, which we wrote as:

|ψ〉 = |1〉C1 + |2〉C2 = |1〉〈1|ψ〉 + |2〉〈2|ψ〉

C1 and C2 are complex-valued wavefunctions, or amplitudes as we call them, and the dynamics of the system are captured in a set of differential equations, which we wrote as:

System

The trick was to know or guess our Hamiltonian, i.e. we had to know or, more likely, guess those Hij coefficients (and then find experiments to confirm our guesses). Once we got those, it was a piece of cake. We’d solve for C1 and C2, and then take their absolute squares so as to get probability functions, like the ones we found for our ammonia (NH3) molecule: P1(t) = |C1(t)|² = cos²[(A/ħ)·t] and P2(t) = |C2(t)|² = sin²[(A/ħ)·t]. These say that, if we were to take a measurement, the probability of finding the molecule in the ‘up’ or ‘down’ state (i.e. state 1 versus state 2) varies as shown:

graph
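
If you want to play with those probability functions yourself, here’s a minimal numerical sketch (the values of A and ħ are arbitrary, just for illustration): it evaluates P1(t) = cos²[(A/ħ)·t] and P2(t) = sin²[(A/ħ)·t] and checks that the two probabilities always add up to one:

```python
import numpy as np

hbar = 1.0          # work in arbitrary units: hbar = 1
A = 0.5             # the 'flip-flop' energy A, in the same arbitrary units
t = np.linspace(0, 4 * np.pi, 200)

P1 = np.cos(A * t / hbar) ** 2   # probability of finding the molecule in state 1
P2 = np.sin(A * t / hbar) ** 2   # probability of finding the molecule in state 2

# The two probabilities oscillate out of phase and always add up to one
assert np.allclose(P1 + P2, 1.0)
print(P1[:3], P2[:3])
```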

So here we are going to generalize the analysis: rather than guessing, or assuming we know them (from experiment, for example, or because someone else told us so), we’re going to calculate what those Hamiltonian coefficients are in general.

Now, returning to those spinors, it’s rather daunting to think that such a simple thing as being in the ‘up’ or ‘down’ condition has to be represented by some mathematical object that’s at least as complicated as these tensors. But… Well… I am afraid that’s the way it is. Having said that, Feynman himself seems to consider that math for graduate students in physics, rather than for the undergraduate public for which he wrote the course. Hence, while he presented all of the math in the Lecture Volume on electromagnetism, he keeps things as simple as possible in the Volume on quantum mechanics. So… No. We will not be talking about spinors here.

The only reason why I started out with that wonderful animation is to remind you of the weirdness of quantum mechanics as evidenced by, for example, the fact I almost immediately got into trouble when trying to associate base states with two-dimensional geometric vectors when writing my post on the hydrogen molecule, or when thinking about the magnitude of the quantum-mechanical equivalent of the angular momentum of a particle (see my post on spin and angular momentum).

Thinking of that, it’s probably good to remind ourselves of the latter discussion. If we denote the angular momentum as J, then we know that, in classical mechanics, any of J‘s components Jx, Jy or Jz could take on any value from +J to −J and, therefore, the maximum value of any component of J – say Jz – would be equal to J. To be precise, J would be the value of the component of J in the direction of J itself. So, in classical mechanics, we’d write: |J| = +√(J·J) = +√J², and it would be the maximum value of any component of J.

However, in quantum mechanics, that’s not the case. If the spin number of J is j, then the maximum value of any component of J is equal to j·ħ. In this case, j = 1/2, so the measured component will be either +ħ/2 or −ħ/2. So, naturally, one would think that J, i.e. the magnitude of J, would be equal to J = |J| = +√(J·J) = +√J² = j·ħ = ħ/2. But that’s not the case: J = |J| ≠ j·ħ = ħ/2. To calculate the magnitude, we need to calculate J² = Jx² + Jy² + Jz². So the idea is to measure these repeatedly and use the expected values for Jx², Jy² and Jz² in the formula. Now, that’s pretty simple: we know that Jx, Jy or Jz are equal to either +ħ/2 or −ħ/2, and, in the absence of a field (i.e. in free space), there’s no preference, so both values are equally likely. To make a long story short, the expected value of Jx², Jy² and Jz² is equal to (1/2)·(ħ/2)² + (1/2)·(−ħ/2)² = ħ²/4, and J² = 3·ħ²/4 = j(j+1)·ħ², with j = 1/2. So J = |J| = +√J² = √(3·ħ²/4) = √3·(ħ/2) ≈ 0.866·ħ. Now that’s a huge difference as compared to j·ħ = ħ/2 = 0.5·ħ.
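
Here’s the same little calculation in code, in case that helps (a sketch only, working in units where ħ = 1):

```python
import numpy as np

hbar = 1.0
# Each component Jx, Jy, Jz is measured as +hbar/2 or -hbar/2, with equal probability in free space
values = np.array([+hbar / 2, -hbar / 2])
probabilities = np.array([0.5, 0.5])

# Expected value of the *square* of one component: (1/2)(hbar/2)^2 + (1/2)(-hbar/2)^2 = hbar^2/4
expected_component_squared = np.sum(probabilities * values ** 2)

# J^2 = <Jx^2> + <Jy^2> + <Jz^2> = 3*hbar^2/4 = j(j+1)*hbar^2 with j = 1/2
J_squared = 3 * expected_component_squared
J_magnitude = np.sqrt(J_squared)

print(expected_component_squared)   # 0.25, i.e. hbar^2/4
print(J_magnitude)                  # 0.866..., i.e. sqrt(3)*hbar/2, versus the max component hbar/2 = 0.5
```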

What we’re saying here is that the magnitude of the angular momentum is √3 ≈ 1.7 times the maximum value of the angular momentum in any direction. How is that possible? Thinking classically, this is nonsensical. However, we need to stop thinking classically here: it means that, when we’re looking at atomic or sub-atomic particles, their angular momentum is never completely in one direction. This implies we need to revise our classical idea of an oriented (electric or magnetic) moment: to put it simply, we find it’s never in one direction only! Alternatively, we might want to re-visit our concept of direction itself, but we don’t want to go there: we continue to say we’re measuring this or that quantity in this or that direction. Of course we do! What’s the alternative? There’s none. You may think we didn’t use the proper definition of the magnitude of a quantity when calculating J as √3·(ħ/2), but… Well… You’ll find yourself alone with that opinion. 🙂

This weird thing really comes with the experimental fact that, if you measure the angular momentum along any axis, you’ll find it is always an integer or half-integer times ħ. Always! So it comes with the experimental fact that energy levels are discrete: they come in discrete steps, with ħ as the natural unit, which explains why we have the 1/ħ factor in all coefficients in the coefficient matrix for our set of differential equations. The Hamiltonian coefficients represent energies indeed, and so we’ll want to measure them in units of ħ.

Of course, now you’ll wonder: why the −i? I wish I could give you a simple answer here, like: “The −i factor corresponds to a rotation by −π/2, and that’s the angle we use to go from our ‘up’ and ‘down’ base states to the ‘Uno‘ and ‘Duo‘ (I and II) base states.” 🙂 Unfortunately, this easy answer isn’t the answer. :-/ I need to refer you to my post on the Hamiltonian: the true answer is that it’s got to do with the i in the e^((i/ħ)·(E·t − p·x)) function: the E, i.e. the energy, is real – most of the time, at least 🙂 – but the wavefunction is what it is: a complex exponential. So… Well…

Frankly, that’s more than enough as an introduction. You may want to think about the imaginary momentum of virtual particles here – i.e. ‘particles’ that are being exchanged as part of a ‘state switch’ –  but then we’d be babbling for hours! So let’s just do what we wanted to do here, and that is to find the Hamiltonian for a spin one-half particle in general, so that’s usually in some field, rather than in free space. 🙂

So here we go. Finally! 🙂

The Hamiltonian of a spin one-half particle in a magnetic field

We’ve actually done some really advanced stuff already. For example, when discussing the ammonia maser, we agreed on the following Hamiltonian in order to make sense of what happens inside of the maser’s resonant cavity:

states

State 1 was the state with the ‘upper’ energy E0 + με, as the energy that’s associated with the electric dipole moment of the ammonia molecule was added to the (average) energy of the system (i.e. E0). State 2 was the state with the ‘lower’ energy level E0 − με, implying the electric dipole moment is opposite to that of state 1. The field could be dynamic or static, i.e. varying in time, or not, but it was the same Hamiltonian. Of course, solving the differential equations with non-constant Hamiltonian coefficients was much more difficult, but we did it.

We also have a “flip-flop amplitude” – I am using Feynman’s term for it 🙂 – in that Hamiltonian above. So that’s an amplitude for the system to go from one state to another in the absence of an electric field. For our ammonia molecule, and our hydrogen molecule too, it was associated with the energy that’s needed to tunnel through a potential barrier and, as we explained in our post on virtual particles, that’s usually associated with a negative value for the energy or, what amounts to the same, with a purely imaginary momentum, so that’s why we write minus A in the matrix. However, don’t rack your brain over this as it is a bit of a convention, really: putting +A would just result in a phase difference for the amplitudes, but it would give us the same probabilities. If it helps you, you may also like to think of our nitrogen atom (or our electron when we were talking about the hydrogen system) as borrowing some energy from the system so as to be able to tunnel through and, hence, temporarily reducing the energy of the system by an amount that’s equal to A. In any case… We need to move on.

As for these probabilities, we could see – after solving the whole thing, of course (and that was very complicated, indeed) – that they’re going up and down just like in that graph above. The only difference was that we were talking about induced transitions here, and so the frequency of the transitions depended on μ·ε0, i.e. on the strength of the field and on the magnitude of the dipole moment itself, rather than on A. In fact, to be precise, we found that the ratio between the average periods was equal to:

Tinduced/Tspontaneous = [(π·ħ)/(2μ·ε0)]/[(π·ħ)/(2A)] = A/(μ·ε0)

But… Well… I need to move on. I just wanted to present the general philosophy behind these things. For a simple electron which, as you know, is either in an ‘up’ or a ‘down’ state – vis-à-vis a certain direction, of course – the Hamiltonian will be very simple. As usual, we’ll assume that direction is the z-direction. Of course, this ‘z-direction’ is just a short-hand for our reference frame: we decide to measure something in this or that direction, and we call that direction the z-direction.

Fine. Next. As our z-direction is currently our reference direction, we assume it’s the direction of some magnetic field, which we’ll write as B. So the components of B in the x– and y-direction are zero: all of the field is in the z-direction, so B = (0, 0, Bz). [Note that the magnetic field is not some quantum-mechanical quantity, and so we can have all of its magnitude in one direction. It’s just a classical thing.]

Fine. Next. The spin or the angular momentum of our electron is, of course, associated with some magnetic dipole moment, which we’ll write as μ. [And, yes, sometimes we use this symbol for an electric dipole moment and, at other times, for a magnetic dipole moment, like here. I can’t help that. You don’t want a zillion different symbols anyway.] Hence, just like we had two energy levels E0 ± με, we’ll now have two energy levels E0 ± μBz. We’ll just shift the energy scale so E0 = 0, as per our convention. [Feynman glosses over it, but this is a bit of a tricky point, really. Usually, one includes the rest mass, or rest energy, in the E in the argument of the wavefunction, but here we’re equating m0·c² with zero. Tough! However, you can think of this re-definition of the zero energy point as a phase shift in all wavefunctions, so it shouldn’t matter when taking the absolute square or looking at interference. Still… Think about it.]
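
Indeed, here’s the one-line argument: shifting all energies by some constant E0 just multiplies every amplitude by the same factor, since a·e^((i/ħ)·(E + E0)·t) = e^((i/ħ)·E0·t)·a·e^((i/ħ)·E·t). That common factor e^((i/ħ)·E0·t) has modulus one, so it drops out when we take absolute squares, and it cancels in any interference term between amplitudes that all carry the same factor.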

Fine. Next. Well… We’ve got two energy levels, −μBz and +μBz, but no A to put in our Hamiltonian, so the following Hamiltonian may or may not make sense:

electron

Hmm… Why is there no flip-flop amplitude? Well… You tell me. Why would we have one? It’s not like the ammonia or hydrogen molecule here, so… Well… Where’s the potential barrier? Of course, you’ll now say that we can imagine it takes some energy to change the spin of an electron, like we were doing with those induced transitions. But… Yes and no. We’ve been selecting particles using our Stern-Gerlach apparatus, or that state selector for our maser, but were we actually flip-flopping things? The changing electric field in our resonant cavity changes the transition frequency but, when everything is said and done, the transition itself has to do with that A. You’ll object again: a pure stationary state? So the electron is either ‘up’ or ‘down’, and it stays like that forever? Really?

Well… I am afraid I have to cut you off, because otherwise we’ll never get to the end. Stop being so critical. 🙂 Well… No. You should be critical. And you’re right in saying that, when everything is said and done, these are all hypotheses that may or may not make sense. However, Feynman is also right when he says that, ultimately, the proof of the pudding is in the eating: at the end of this long, winding story, we’ll get some solutions that can be tested in experiment: they should give predictions, or probabilities rather, that agree with experiment. As Feynman writes: “[The objective is to find the] equations of motion for the spin states of an electron in a magnetic field. We guess at them by making some physical argument, but the real test of any Hamiltonian is that it should give predictions in agreement with experiment. According to any tests that have been made, these equations are right. In fact, although we made our arguments only for constant fields, the Hamiltonian we have written is also right for magnetic fields which vary with time.”

So let’s get on with it: let’s assume the Hamiltonian above is the one we should use for a magnetic field in the z-direction, and that we have those pure stationary states with the energies they have, i.e. −μBz and +μBz. One minor technical point, perhaps: you may wonder why we write what we write and do not switch −μBz and +μBz in the Hamiltonian—so as to reflect these ‘upper’ and ‘lower’ energies in those other Hamiltonians. The answer is: it’s just convention. We choose state 1 to be the ‘up’ state, so its spin is ‘up’, but the magnetic moment is opposite to the spin, so the ‘up’ state has the minus sign. Full stop. Onwards!

We’re now going to assume our B field is not in the z-direction. Hence, its Bx and By components are not zero. What we want to see now is what the Hamiltonian looks like. [Yes. Sorry for regularly reminding you of what it is that we are trying to do.] Here you need to be creative. Whatever the direction of the field, we need to be consistent: if that Hamiltonian makes sense – i.e. if we get two pure stationary states with those energies when the field is in the z-direction – then it’s rather obvious that, if the field is in some other direction, we should still be able to find two stationary states with exactly the same energy levels. As Feynman puts it: “We could have chosen our z-axis in its direction, and we would have found two stationary states with the energies ±μB. Just choosing our axes in a different direction doesn’t change the physics. Our description of the stationary states will be different, but their energies will still be ±μB.” Right. And because the magnetic field is a classical quantity, the relevant magnitude is just the square root of the sum of the squares of its components, so we write:

formula 1

So we have the energies now, but we want the Hamiltonian coefficients. Here we need to work backwards. The general solution for any system with constant Hamiltonian coefficients always involves two stationary states with energy levels which we denoted as EI and EII, indeed. Let me remind you of the formula for them:

energies

[If you want to double-check and see how we get those, it’s probably best to check it in the original text, i.e. Feynman’s Lecture on the Ammonia Maser, Section 2.]

So how do we connect the two sets of equations? How do we get the Hij coefficients out of these square roots and all of that? [Again. I am just reminding you of what it is that we are trying to do.] We’ve got two equations and four coefficients, so… Well… There are some rules we can apply. For example, we know that any Hij coefficient must equal Hji*, i.e. the complex conjugate of Hji. [For i = j, that just means the diagonal coefficients must be real.] But… Hey! We can already see that H11 must be equal to minus H22. Just compare the two sets. That comes out as a condition, clearly. Now that simplifies our square roots above significantly. Also noting that the absolute square of a complex number is equal to the product of the number with its complex conjugate, the two equations above imply the following:

formula 2

Let’s see what this means if we apply this to our ‘special’ direction once more, so let’s assume the field is in the z-direction once again. Perhaps we can get some more ‘conditions’ out of that. If the field is in the z-direction itself, the equation above reduces to:

formula 3

That makes it rather obvious that, in this special case, at least, |H12|² = 0. You’ll say: that’s nothing new, because we had those zeroes in that Hamiltonian already. Well… Yes and no! Here we need to introduce another constraint. I’ll let Feynman explain it: “We are going to make an assumption that there is a kind of superposition principle for the terms of the Hamiltonian. More specifically, we want to assume that if two magnetic fields are superposed, the terms in the Hamiltonian simply add—if we know the Hij for a pure Bz and we know the Hij for a pure Bx, then the Hij for both Bz and Bx together is simply the sum. This is certainly true if we consider only fields in the z-direction—if we double Bz, then all the Hij are doubled. So let’s assume that H is linear in the field B.”

Now, the assumption that H12 must be some linear combination of Bx, By and Bz, combined with the |H12|² = 0 condition when all of the magnitude of the field is in the z-direction, tells us that H12 has no term in Bz. It may have – in fact, it probably should have – terms in Bx and By, but not in Bz. That does take us a step further.

Next assumption. The next assumption is that, regardless of the direction of the field, H11 and H22 don’t change: they remain what they are, so we write: H11 = −μBz and H22 = +μBz. Now, you may think that’s no big deal, because we defined the 1 and 2 states in terms of our z-direction, but… Well… We did so assuming all of the magnitude was in the z-direction.

You’ll say: so what? Now we’ve got some field in the x– and y-directions, so that shouldn’t impact the amplitude to be in a state that’s associated with the z-direction. Well… I should say two things here. First, we’re not talking about the amplitude to be in state 1 or state 2. These amplitudes are those C1 and C2 functions that we can find once we’ve got those Hamiltonian coefficients. Second, you’d surely expect that some field in the x– and y-directions should have some impact on those C1 and C2 functions. Of course!

In any case, I’ll let you do some more thinking about this assumption. Again, we need to move on, so let’s just go along with it. At this point, Feynman‘s had enough of the assumptions, and so he boldly proposes a solution, which incorporates the H11 = −μBz and H22 = +μBz assumption. Let me quote him:

Formula 4

Of course, this leaves us gasping for breath. A simple guess? One can plug it in, of course, and see it makes sense—rather quickly, really. But… Nothing linear is going to come out of that expression for |H12|², right? We’d have to take a square root, and find that H12 = ±μ·(Bx² + By²)^(1/2). Well… No. We’re working in the complex space here, remember? So we can use complex solutions. Feynman notes the same and immediately proposes the right solution:

final 1
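
Just to check that the magnitude works out (assuming the proposed solution has the form H12 = −μ·(Bx − i·By), with H21 = −μ·(Bx + i·By) as its complex conjugate): |H12|² = μ²·(Bx − i·By)·(Bx + i·By) = μ²·(Bx² + By²). So the coefficient itself is linear in Bx and By, as required, while its absolute square gives us what we need, and it vanishes when the field is purely along the z-direction.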

To make a long story short, we get what we wanted, i.e. those “equations of motion for the spin states” of an electron in a magnetic field. I’ll let Feynman summarize the results:

Final 3

It’s truly a Great Result, especially because, as Feynman notes, (almost) any problem about two-state systems can be solved by making a mathematical analog to the system of the spinning electron. We’ll illustrate that as we move ahead. For now, however, I think we’ve had enough, haven’t we? 🙂
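
If you want a quick sanity check on this result, the snippet below builds the Hamiltonian in the form H = −μ·(Bx·σx + By·σy + Bz·σz) – i.e. assuming H11 = −μBz, H22 = +μBz, H12 = −μ(Bx − iBy) and H21 = −μ(Bx + iBy) – for some arbitrary field direction, and verifies numerically that its two energy levels are indeed ±μ·|B|. The specific numbers are just for illustration, of course:

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

mu = 0.7                         # magnetic moment (arbitrary units)
B = np.array([0.3, -1.2, 0.8])   # some arbitrary field direction

# The guessed Hamiltonian: H = -mu * (Bx*sigma_x + By*sigma_y + Bz*sigma_z)
H = -mu * (B[0] * sx + B[1] * sy + B[2] * sz)

# Its two energy levels should be +/- mu*|B|, whatever the direction of B
energies = np.linalg.eigvalsh(H)
B_magnitude = np.linalg.norm(B)
print(energies)               # [-mu*|B|, +mu*|B|]
print(mu * B_magnitude)       # compare: mu*|B|
assert np.allclose(sorted(energies), [-mu * B_magnitude, +mu * B_magnitude])
```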

We’ve made a big leap here, and perhaps we should re-visit some of the assumptions and conventions—later, that is. As for now, let’s try to work with it. As mentioned above, Feynman shied away from the grand mathematical approach to it. Indeed, the whole argument might have been somewhat fuzzy, but at least we got a good feel for the solution. In my next post, I’ll abstract away from it, as Feynman does in his next Lecture, where he introduces the so-called Pauli spin matrices, which are like Lego building blocks for all of the matrix algebra which – I must assume you sort of sense that’s coming, no? 🙂 – we’ll need to master so as to understand what’s going on.

So… That’s it for today. I hope you understood “what it is that we’re trying to do”, and that you’ll have some fun working on it on your own now. 🙂

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

The quantum-mechanical view of chemical binding

In my post on the hydrogen atom, I explained its stability using the following graph out of Feynman’s Lectures. It shows an equilibrium state for the H2 molecule with an energy level that’s about 5 eV (ΔE/EH ≈ −0.375 ⇔ ΔE ≈ −0.375×13.6 eV ≈ −5.1 eV) lower than the energy of two separate hydrogen atoms (2H).

graph3

The lower energy level is denoted by EII and refers to a state, which we also denoted as state II, that’s associated with some kind of molecular orbital for both electrons, resulting in more (shared) space where the two electrons can have a low potential energy so that, as Feynman puts it, “the electron can spread out—lowering its kinetic energy—without increasing its potential energy.” The electrons have opposite spin. They have to have opposite spin, because our formula for state II would violate the Fermi exclusion principle if they did not. Indeed, if the two electrons did not have opposite spin, the formula for our CII amplitude would violate the rule that, when identical fermions are involved and we’re adding amplitudes, we should do so with a negative sign for the exchanged case. So our CII = 〈II|ψ〉 = (1/√2)[〈1|ψ〉 + 〈2|ψ〉] = (1/√2)[〈2|ψ〉 + 〈1|ψ〉] would be problematic: when we switch the electrons, we should get a minus sign.

We do get that minus sign for state I:

〈I|ψ〉 = (1/√2)[〈1|ψ〉 − 〈2|ψ〉] = −(1/√2)[〈2|ψ〉 − 〈1|ψ〉]

To make a long story short, state II is the equilibrium state, and so that’s an H2 molecule with two electrons with opposite spins that share a molecular orbital, rather than moving around in some atomic orbital.

The question is: can we generalize this analysis? I mean… We’ve spent a lot of time making sure we understand this one particular case. What’s the use of such an analysis if we can’t generalize? We shouldn’t be doing nitty-gritty all of the time, should we?

You’re right. The thing is: we can easily generalize. We’ve learned to play with those Hamiltonian matrices now, and so let’s do the ‘same-same but different’ with other systems. Let’s replace one of the two protons in the two-protons-one-electron model by a much heavier ion—say, lithium. [The example is not random, of course: lithium is very easily ionized, which is why it’s used in batteries.]

We need to think of the Hamiltonian again, right? We’re now in a situation in which the Hamiltonian coefficients H11 and H22 are likely to be different. We’ve lost the symmetry: if the electron is near the lithium ion, then we can’t assume the system has the same energy as when it’s near the hydrogen nucleus (in case you forgot: that’s what the proton is, really). Because we’ve lost the symmetry, we no longer have these ‘easy’ Hamiltonians:

equi

We need to look at the original formulas for EI and EII once again. Let me write them down:

energies

Of course, H12 and H21 will still be equal to A, and so… Well… Let me simplify my life and copy Feynman:

Feynman

There are several things here. First, note the approximation to the square root:

square root sum of squares

We’re only allowed to do that if y is much smaller than x, with x = 1 and y = 2A/(H11 − H22) here. In fact, the condition is usually written as 0 ≤ y/x ≤ 1/2, so we take the A/(H11 − H22) ratio as (much) less than one, indeed. So the second term in the energy difference EI − EII = (H11 − H22) + 2A·A/(H11 − H22) is surely smaller than 2A. But there’s the first term, of course: H11 − H22. However, that’s there anyway, and so we should actually be looking at the additional separation, so that’s where the A comes in, and so that’s the second term: 2A·A/(H11 − H22) which, as mentioned, is smaller than 2A by the factor A/(H11 − H22), which is less than one. So Feynman’s conclusion is correct: “The binding of unsymmetric diatomic molecules is generally very weak.”
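
To get a feel for the numbers, here’s a small numerical sketch (arbitrary energy units, values chosen for illustration only). It compares the exact splitting EI − EII = √[(H11 − H22)² + 4A²], which follows from the general two-state formula with H12 = H21 = −A, with the approximation above, and with the 2A splitting we’d get in the symmetric case:

```python
import numpy as np

A = 0.1                  # flip-flop energy (arbitrary units)
H11, H22 = 1.0, 0.2      # unequal diagonal energies: the unsymmetric case

# Exact splitting between the two stationary states (general two-state formula, H12*H21 = A^2)
exact = np.sqrt((H11 - H22) ** 2 + 4 * A ** 2)

# Approximation for A << (H11 - H22): the extra separation due to A is only 2A^2/(H11 - H22)
approx = (H11 - H22) + 2 * A ** 2 / (H11 - H22)

print(exact, approx)   # ~0.8246 vs ~0.825: the extra separation due to A is only ~0.025
print(2 * A)           # 0.2: the full 2A splitting we'd get in the symmetric (H11 = H22) case
```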

However, that’s not the case when binding two ions by two electrons, which is referred to as a two-electron bond; this is the most common valence bond. Let me simplify my life once more and quote once again:

Feynman 2

What he’s saying is that H11 and H22 are one and the same once again, and equal to E0, because both ions can take one electron, so there’s no difference between state 1 and state 2 in that regard. So the energy difference is 2A once more and we’ve got good covalent binding. [Note that the term ‘covalent’ just refers to sharing electrons: their valence electrons are shared, so to speak.]

Now, this result is, of course, subject to the hypothesis that the electron is more or less equally attracted to both ions, which may or may not be the case. If it’s not the case, we’ll have what’s referred to as ‘ionic’ binding. Again, I’ll let Feynman explain it, as it’s pretty straightforward and so it’s no use to try to write another summary of this:

Feynman3

So… That’s it, really. As Feynman puts it, by way of conclusion: “You can now begin to see how it is that many of the facts of chemistry can be most clearly understood in terms of a quantum mechanical description.”

Most clearly? Well… I guess that, at the very least, we’re “beginning to see” something here, aren’t we? 🙂

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/