The geometry of the wavefunction, electron spin and the form factor

Our previous posts showed how a simple geometric interpretation of the elementary wavefunction yielded the (Compton scattering) radius of an elementary particle—for an electron, at least: for the proton, we only got the order of magnitude right—but then a proton is not an elementary particle. We got lots of other interesting equations as well… But… Well… When everything is said and done, it’s that equivalence between the E = m·a²·ω² and E = m·c² relations that we… Well… We need to be more specific about it.

Indeed, I’ve been ambiguous here and there—oscillating between various interpretations, so to speak. 🙂 In my own mind, I refer to my unanswered questions, or my ambiguous answers to them, as the form factor problem. So… Well… That explains the title of my post. But so… Well… I do want to be somewhat more conclusive in this post. So let’s go and see where we end up. 🙂

To help focus our mind, let us recall the metaphor of the V-2 perpetuum mobile, as illustrated below. With permanently closed valves, the air inside the cylinder compresses and decompresses as the pistons move up and down. It provides, therefore, a restoring force. As such, it will store potential energy, just like a spring, and the motion of the pistons will also reflect that of a mass on a spring: it is described by a sinusoidal function, with the zero point at the center of each cylinder. We can, therefore, think of the moving pistons as harmonic oscillators, just like mechanical springs. Of course, instead of two cylinders with pistons, one may also think of connecting two springs with a crankshaft, but then that’s not fancy enough for me. 🙂

V-2 engine

At first sight, the analogy between our flywheel model of an electron and the V-twin engine seems to be complete: the 90 degree angle of our V-2 engine makes it possible to perfectly balance the pistons and we may, therefore, think of the flywheel as a (symmetric) rotating mass, whose angular momentum is given by the product of the angular frequency and the moment of inertia: L = ω·I. Of course, the moment of inertia (aka the angular mass) will depend on the form (or shape) of our flywheel:

  1. I = m·a² for a rotating point mass m or, what amounts to the same, for a circular hoop of mass m and radius a.
  2. For a rotating (uniformly solid) disk, we must add a 1/2 factor: I = m·a²/2.

How can we relate those formulas to the E = m·a²·ω² formula? The kinetic energy that is being stored in a flywheel is equal to Ekinetic = I·ω²/2, so that is only half of the E = m·a²·ω² product if we substitute I = m·a². [For a disk, we get a factor 1/4, so that’s even worse!] However, our flywheel model of an electron incorporates potential energy too. In fact, the E = m·a²·ω² formula just adds the (kinetic and potential) energy of two oscillators: we do not really consider the energy in the flywheel itself because… Well… The essence of our flywheel model of an electron is not the flywheel: the flywheel just transfers energy from one oscillator to the other, but so… Well… We don’t include it in our energy calculations. The essence of our model is that two-dimensional oscillation which drives the electron, and which is reflected in Einstein’s E = m·c² formula. That two-dimensional oscillation—the a²·ω² = c² equation, really—tells us that the resonant (or natural) frequency of the fabric of spacetime is given by the speed of light—but measured in units of a. [If you don’t quite get this, re-write the a²·ω² = c² equation as ω = c/a: the radius of our electron appears as a natural distance unit here.]

Now, we were extremely happy with this interpretation not only because of the key results mentioned above, but also because it has lots of other nice consequences. Think of our probabilities as being proportional to energy densities, for example—and all of the other stuff I describe in my published paper on this. But there is even more on the horizon: a follower of this blog (a reader with an actual PhD in physics, for a change) sent me an article analyzing elementary particles as tiny black holes because… Well… If our electron is effectively spinning around, then its tangential velocity is equal to v = a·ω = c. Now, recent research suggests black holes are also spinning at (nearly) the speed of light. Interesting, right? However, in order to understand what she’s trying to tell me, I’ll first need to get a better grasp of general relativity, so I can relate what I’ve been writing here and in previous posts to the Schwarzschild radius and other stuff.

Let me get back to the lesson here. In the reference frame of our particle, the wavefunction really looks like the animation below: it has two components, and the amplitude of the two-dimensional oscillation is equal to a, which we calculated as a = ħ/(m·c) = 3.8616×10⁻¹³ m, so that’s the (reduced) Compton scattering radius of an electron.

Circle_cos_sin

In my original article on this, I used a more complicated argument involving the angular momentum formula, but I now prefer a more straightforward calculation:

c = a·ω = a·E/ħ = a·m·c²/ħ ⇔ a = ħ/(m·c)
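If you want to check the number, a few lines of Python will do (the constants below are the usual CODATA values, rounded):

```python
# Reduced Compton radius a = ħ/(m·c) for an electron
hbar = 1.054571817e-34   # J·s (reduced Planck constant)
m_e  = 9.1093837015e-31  # kg (electron rest mass)
c    = 2.99792458e8      # m/s (speed of light)

a = hbar / (m_e * c)
print(a)  # ≈ 3.8616e-13 m

# The tangential velocity a·ω then comes out at c:
omega = m_e * c**2 / hbar   # ω = E/ħ
print(a * omega / c)        # ≈ 1
```

The second print just confirms what the equation above says: a·ω = c by construction.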

The question is: what is that rotating arrow? I’ve been vague and not so vague on this. The thing is: I can’t prove anything in this regard. But my hypothesis is that it is, in effect, a rotating field vector, so it’s just like the electric field vector of a (circularly polarized) electromagnetic wave (illustrated below).

There are a number of crucial differences though:

  1. The (physical) dimension of the field vector of the matter-wave is different: I associate the real and imaginary component of the wavefunction with a force per unit mass (as opposed to the force per unit charge dimension of the electric field vector). Of course, the newton/kg dimension reduces to the dimension of acceleration (m/s²), so that’s the dimension of a gravitational field.
  2. I do believe this gravitational disturbance, so to speak, does cause an electron to move about some center, and I believe it does so at the speed of light. In contrast, electromagnetic waves do not involve any mass: they’re just an oscillating field. Nothing more. Nothing less. The charge of our electron, on the other hand, is fully localized: as Feynman puts it, “When you do find the electron some place, the entire charge is there.” (Feynman’s Lectures, III-21-4)
  3. The third difference is one that I thought of only recently: the plane of the oscillation cannot be perpendicular to the direction of motion of our electron, because then we can’t explain the direction of its magnetic moment, which is either up or down when traveling through a Stern-Gerlach apparatus.

I mentioned that in my previous post but, for your convenience, I’ll repeat what I wrote there. The basic idea here is illustrated below (credit for this illustration goes to another blogger on physics). As for the Stern-Gerlach experiment itself, let me refer you to a YouTube video from the Quantum Made Simple site.

Figure 1 Bohr

The point is: the direction of the angular momentum (and the magnetic moment) of an electron—or, to be precise, its component as measured in the direction of the (inhomogeneous) magnetic field through which our electron is traveling—cannot be parallel to the direction of motion. On the contrary, it is perpendicular to the direction of motion. In other words, if we imagine our electron as spinning around some center, then the disk it circumscribes will comprise the direction of motion.

However, we need to add an interesting detail here. As you know, we don’t really have a precise direction of angular momentum in quantum physics. [If you don’t know this… Well… Just look at one of my many posts on spin and angular momentum in quantum physics.] Now, we’ve explored a number of hypotheses but, when everything is said and done, a rather classical explanation turns out to be the best: an object with an angular momentum J and a magnetic moment μ (I used bold-face because these are vector quantities) that is parallel to some magnetic field B, will not line up, as you’d expect a tiny magnet to do in a magnetic field—or not completely, at least: it will precess. I explained that in another post on quantum-mechanical spin, which I advise you to re-read if you want to appreciate the point that I am trying to make here. That post integrates some interesting formulas, and so one of the things on my ‘to do’ list is to prove that these formulas are, effectively, compatible with the electron model we’ve presented in this and previous posts.

Indeed, when one advances a hypothesis like this, it’s not enough to just sort of show that the general geometry of the situation makes sense: we also need to show the numbers come out alright. So… Well… Whatever we think our electron—or its wavefunction—might be, it needs to be compatible with stuff like the observed precession frequency of an electron in a magnetic field.

Our model also needs to be compatible with the transformation formulas for amplitudes. I’ve been talking about this for quite a while now, and so it’s about time I get going on that.

Last but not least, those articles that relate matter-particles to (quantum) gravity—such as the one I mentioned above—are intriguing too and, hence, whatever hypotheses I advance here, I’d better check them against those more advanced theories too, right? 🙂 Unfortunately, that’s going to take me a few more years of studying… But… Well… I still have many years ahead—I hope. 🙂

Post scriptum: It’s funny how one’s brain keeps working when sleeping. When I woke up this morning, I thought: “But it is that flywheel that matters, right? That’s the energy storage mechanism and also explains how photons possibly interact with electrons. The oscillators drive the flywheel but, without the flywheel, nothing is happening. It is really the transfer of energy—through the flywheel—which explains why our flywheel goes round and round.”

It may or may not be useful to remind ourselves of the math in this regard. The motion of our first oscillator is given by the cos(ω·t) = cosθ function (θ = ω·t), and its kinetic energy will be equal to sin²θ. Hence, the (instantaneous) change in kinetic energy at any point in time (as a function of the angle θ) is equal to: d(sin²θ)/dθ = 2∙sinθ∙d(sinθ)/dθ = 2∙sinθ∙cosθ. Now, the motion of the second oscillator (just look at that second piston going up and down in the V-2 engine) is given by the sinθ function, which is equal to cos(θ − π/2). Hence, its kinetic energy is equal to sin²(θ − π/2), and how it changes (as a function of θ again) is equal to 2∙sin(θ − π/2)∙cos(θ − π/2) = −2∙cosθ∙sinθ = −2∙sinθ∙cosθ. So here we have our energy transfer: the flywheel organizes the borrowing and returning of energy, so to speak. That’s the crux of the matter.
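The borrowing-and-returning can be verified numerically. The snippet below (just a sanity check) confirms that the two kinetic energies sum to a constant and that their instantaneous changes cancel at every angle:

```python
import math

# KE of oscillator 1 ∝ sin²θ; KE of oscillator 2 ∝ sin²(θ − π/2) = cos²θ
for theta in (0.0, 0.3, 1.0, 2.5):
    ke1 = math.sin(theta)**2
    ke2 = math.sin(theta - math.pi/2)**2
    d_ke1 = 2*math.sin(theta)*math.cos(theta)                          # d(sin²θ)/dθ
    d_ke2 = 2*math.sin(theta - math.pi/2)*math.cos(theta - math.pi/2)  # d(sin²(θ−π/2))/dθ
    assert abs(ke1 + ke2 - 1.0) < 1e-12   # total energy is constant
    assert abs(d_ke1 + d_ke2) < 1e-12     # what one borrows, the other returns
```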

So… Well… What if the relevant energy formula is E = m·a²·ω²/2 instead of E = m·a²·ω²? What are the implications? Well… We get a √2 factor in our formula for the radius a, as shown below.

a = √2·ħ/(m·c)

Now that is not so nice. For the tangential velocity, we get a·ω = √2·c. This is also not so nice. How can we save our model? I am not sure, but here I am thinking of the mentioned precession—the wobbling of our flywheel in a magnetic field. Remember we may think of Jz—the angular momentum or, to be precise, its component in the z-direction (the direction in which we measure it)—as the projection of the real angular momentum J. Let me insert Feynman’s illustration here again (Feynman’s Lectures, II-34-3), so you get what I am talking about.

precession

Now, all depends on the angle (θ) between Jz and J, of course. We did a rather obscure post on these angles, but the formulas there come in handy now. Just click the link and review it if and when you’d want to understand the following formula for the magnitude of the presumed actual momentum: J = √(j·(j+1))·ħ. In this particular case (spin-1/2 particles), j is equal to 1/2, so Jz = ħ/2. Hence, J is equal to √(0.75)·ħ ≈ 0.866·ħ. Elementary geometry then tells us cos(θ) = (1/2)/√(3/4) = 1/√3. Hence, θ ≈ 54.73561°. That’s a big angle—larger than the 45° angle we had secretly expected because… Well… The 45° angle has that √2 factor in it: cos(45°) = sin(45°) = 1/√2.
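For the record, the arithmetic is easily verified (j = 1/2 is the spin quantum number of the electron):

```python
import math

j  = 0.5                          # spin-1/2
J  = math.sqrt(j * (j + 1))       # magnitude of J, in units of ħ
Jz = j                            # measured component, in units of ħ
theta = math.degrees(math.acos(Jz / J))
print(J)      # ≈ 0.866
print(theta)  # ≈ 54.7356 degrees
```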

Hmm… As you can see, there is no easy fix here. Those damn 1/2 factors! They pop up everywhere, don’t they? 🙂 We’ll solve the puzzle. One day… But not today, I am afraid. I’ll call it the form factor problem… Because… Well… It sounds better than the 1/2 or √2 problem, right? 🙂

Note: If you’re into quantum math, you’ll note ħ/(m·c) is the reduced Compton scattering radius. The standard Compton scattering radius is equal to (2π·ħ)/(m·c) = h/(m·c). It doesn’t solve the √2 problem. Sorry. The form factor problem. 🙂

To be honest, I finished my published paper on all of this with a suggestion that, perhaps, we should think of two circular oscillations, as opposed to linear ones. Think of a tiny ball, whose center of mass stays where it is, as depicted below. Any rotation – around any axis – will be some combination of a rotation around the two other axes. Hence, we may want to think of our two-dimensional oscillation as an oscillation of a polar and azimuthal angle. It’s just a thought but… Well… I am sure it’s going to keep me busy for a while. 🙂

polar_coords

They are oscillations, still, so I am not thinking of two flywheels that keep going around in the same direction. No. More like a wobbling object on a spring. Something like the movement of a bobblehead on a spring perhaps. 🙂

bobblehead


The speed of light as an angular velocity (2)

My previous post on the speed of light as an angular velocity was rather cryptic. This post will be a bit more elaborate. Not all that much, however: this stuff is and remains quite dense, unfortunately. 😦 But I’ll do my best to try to explain what I am thinking of. Remember the formula (or definition) of the elementary wavefunction:

ψ = a·e−i[E·t − p∙x]/ħ = a·cos(p∙x/ħ − E∙t/ħ) + i·a·sin(p∙x/ħ − E∙t/ħ)

How should we interpret this? We know an actual particle will be represented by a wave packet: a sum of wavefunctions, each with its own amplitude ak and its own argument θk = (Ek∙t − pk∙x)/ħ. But… Well… Let’s see how far we get when analyzing the elementary wavefunction itself only.

According to mathematical convention, the imaginary unit (i) is a 90° angle in the counterclockwise direction. However, Nature surely cannot be bothered about our convention of measuring phase angles – or time itself – clockwise or counterclockwise. Therefore, both right- as well as left-handed polarization may be possible, as illustrated below.

The left-handed elementary wavefunction would be written as:

ψ = a·e+i[E·t − p∙x]/ħ = a·cos(p∙x/ħ − E∙t/ħ) − i·a·sin(p∙x/ħ − E∙t/ħ)

In my previous posts, I hypothesized that the two physical possibilities correspond to the angular momentum of our particle – say, an electron – being either positive or negative: J = +ħ/2 or, else, J = −ħ/2. I will come back to this in a moment. Let us first further examine the functional form of the wavefunction.
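A quick numerical check of the two conventions may be useful here. Assuming, as above, that the right-handed wavefunction carries a minus sign in the exponent (e−iθ, with θ = (E·t − p·x)/ħ) and the left-handed one a plus sign, Euler's formula gives the cosine and sine components directly:

```python
import cmath, math

hbar = 1.0
E, p, t, x = 2.0, 0.7, 0.3, 1.1      # arbitrary values, natural units
theta = (E*t - p*x) / hbar
arg = p*x/hbar - E*t/hbar            # = −θ, the argument used in the text

right = cmath.exp(-1j*theta)         # right-handed (counterclockwise)
left  = cmath.exp(+1j*theta)         # left-handed (clockwise)

assert abs(right - (math.cos(arg) + 1j*math.sin(arg))) < 1e-12
assert abs(left  - (math.cos(arg) - 1j*math.sin(arg))) < 1e-12
assert abs(right*left - 1) < 1e-12   # the two are complex conjugates
```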

We should note that both the direction as well as the magnitude of the (linear) momentum (p) are relative: they depend on the orientation and relative velocity of our reference frame – which are, in effect, relative to the reference frame of our object. As such, the wavefunction itself is relative: another observer will obtain a different value for both the momentum (p) as well as for the energy (E). Of course, this makes us think of the relativity of the electric and magnetic field vectors (E and B) but… Well… It’s not quite the same because – as I will explain in a moment – the argument of the wavefunction, considered as a whole, is actually invariant under a Lorentz transformation.

Let me elaborate this point. If we consider the reference frame of the particle itself, then the idea of direction and momentum sort of vanishes, as the momentum vector shrinks to the origin itself: p = 0. Let us now look at how the argument of the wavefunction transforms. The E and p in the argument of the wavefunction (θ = ω∙t – k∙x = (E/ħ)∙t – (p/ħ)∙x = (E∙t – p∙x)/ħ) are, of course, the energy and momentum as measured in our frame of reference. Hence, we will want to write these quantities as E = Ev and p = pv = mv·v. If we then use natural time and distance units (hence, the numerical value of c is equal to 1 and, hence, the (relative) velocity is then measured as a fraction of c, with a value between 0 and 1), we can relate the energy and momentum of a moving object to its energy and momentum when at rest using the following relativistic formulas:

Ev = γ·E0 and pv = γ·m0·v = γ·E0·v/c²

The argument of the wavefunction can then be re-written as:

θ = [γ·E0/ħ]∙t – [(γ·E0·v/c²)/ħ]∙x = (E0/ħ)·(t − v∙x/c²)·γ = (E0/ħ)∙t’

The γ in these formulas is, of course, the Lorentz factor, and t’ is the proper time: t’ = (t − v∙x/c²)/√(1−v²/c²). Two essential points should be noted here:

1. The argument of the wavefunction is invariant. There is a primed time (t’) but there is no primed θ (θ’): θ = (Ev/ħ)·t – (pv/ħ)·x = (E0/ħ)∙t’.

2. The E0/ħ coefficient pops up as an angular frequency: E0/ħ = ω0. We may refer to it as the frequency of the elementary wavefunction.
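The invariance is easy to verify numerically. In the sketch below (natural units, so c = ħ = 1), we compute the phase once in the lab frame and once via the proper time, and check that the two coincide:

```python
import math

c = 1.0; hbar = 1.0
E0 = 1.0          # rest energy (natural units)
v  = 0.6          # relative velocity, as a fraction of c

gamma = 1/math.sqrt(1 - v**2/c**2)
E, p = gamma*E0, gamma*E0*v/c**2   # energy and momentum in the lab frame

t, x = 2.0, 1.5
theta_lab  = (E*t - p*x)/hbar
t_prime    = gamma*(t - v*x/c**2)  # proper time
theta_rest = (E0/hbar)*t_prime

assert abs(theta_lab - theta_rest) < 1e-12   # the argument is invariant
```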

Now, if you don’t like the concept of angular frequency, we can also write: f0 = ω0/2π = (E0/ħ)/2π = E0/h. Alternatively, and perhaps more elucidating, we get the following formula for the period of the oscillation:

T0 = 1/f0 = h/E0

This is interesting, because we can look at the period as a natural unit of time for our particle. This period is inversely proportional to the (rest) energy of the particle, and the constant of proportionality is h. Substituting E0 for m0·c², we may also say it’s inversely proportional to the (rest) mass of the particle, with the constant of proportionality equal to h/c². The period of an electron, for example, would be equal to about 8×10⁻²¹ s. That’s very small, and it only gets smaller for larger objects! But what does all of this really tell us? What does it actually mean?
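The numerical value quoted above follows directly from T0 = h/E0:

```python
h  = 6.62607015e-34   # J·s (Planck constant)
E0 = 8.187e-14        # J (electron rest energy, ≈ 0.511 MeV)

T0 = h / E0           # natural unit of time for the electron
print(T0)             # ≈ 8.09e-21 s
```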

We can look at the sine and cosine components of the wavefunction as an oscillation in two dimensions, as illustrated below.

Circle_cos_sin

Look at the little green dot going around. Imagine it is some mass going around and around. Its circular motion is equivalent to the two-dimensional oscillation. Indeed, instead of saying it moves along a circle, we may also say it moves simultaneously (1) left and right and back again (the cosine) while also moving (2) up and down and back again (the sine).

Now, a mass that rotates about a fixed axis has angular momentum, which we can write as the vector cross-product L = r×p or, alternatively, as the product of an angular velocity (ω) and rotational inertia (I), aka the moment of inertia or the angular mass: L = I·ω. [Note we write L and ω in boldface here because they are (axial) vectors. If we consider their magnitudes only, we write L = I·ω (no boldface).]

We can now do some calculations. We already know the angular velocity (ω) is equal to E0/ħ. Now, the magnitude of r in the L = r×p vector cross-product should equal the magnitude of ψ = a·e−i∙E·t/ħ, so we write: r = a. What’s next? Well… The momentum (p) is the product of a linear velocity (v) – in this case, the tangential velocity – and some mass (m): p = m·v. If we switch to scalar instead of vector quantities, then the (tangential) velocity is given by v = r·ω.

So now we only need to think about what formula we should use for the angular mass. If we’re thinking, as we are doing here, of some point mass going around some center, then the formula to use is I = m·r². However, we may also want to think that the two-dimensional oscillation of our point mass actually describes the surface of a disk, in which case the formula for I becomes I = m·r²/2. Of course, the addition of this 1/2 factor may seem arbitrary but, as you will see, it will give us a more intuitive result. This is what we get:

L = I·ω = (m·r²/2)·(E/ħ) = (1/2)·a²·(E/c²)·(E/ħ) = a²·E²/(2·ħ·c²)

Note that our frame of reference is that of the particle itself, so we should actually write ω0, m0 and E0 instead of ω, m and E. The value of the rest energy of an electron is about 0.511 MeV, or 8.1871×10⁻¹⁴ N∙m. Now, this angular momentum should equal J = ±ħ/2. We can, therefore, derive the (Compton scattering) radius of an electron:

a²·E²/(2·ħ·c²) = ħ/2 ⇔ a = ħ·c/E = ħ/(m·c)

Substituting the various constants with their numerical values, we find that a is equal to 3.8616×10⁻¹³ m, which is the (reduced) Compton scattering radius of an electron. The (tangential) velocity (v) can now be calculated as being equal to v = r·ω = a·ω = [ħ/(m·c)]·(E/ħ) = E/(m·c) = c. This is an amazing result. Let us think about it.
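The whole chain of the calculation (the disk formula, L = ħ/2, and the tangential velocity) can be checked in a few lines:

```python
import math

hbar = 1.054571817e-34   # J·s
c    = 2.99792458e8      # m/s
E    = 8.187e-14         # J (electron rest energy)
m    = E / c**2          # equivalent mass

# From a²·E²/(2·ħ·c²) = ħ/2, we get a = ħ·c/E = ħ/(m·c):
a = hbar * c / E
omega = E / hbar

assert abs(a - hbar/(m*c)) < 1e-25
print(a)               # ≈ 3.86e-13 m
print(a * omega / c)   # ≈ 1: the tangential velocity is c
```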

In our previous posts, we introduced the metaphor of two springs or oscillators, whose energy was equal to E = m·a²·ω². Is this compatible with Einstein’s E = m·c² mass-energy equivalence relation? It is. The E = m·c² formula implies E/m = c². We, therefore, can write the following:

E = m·a²·ω² = m·c² ⇔ a²·ω² = c² ⇔ a·ω = c

Hence, we should actually have titled this and the previous post somewhat differently: the speed of light appears as a tangential velocity. Think of the following: the ratio of c and ω is equal to c/ω = a·ω/ω = a. Hence, the tangential and angular velocity would be the same if we’d measure distance in units of a. In other words, the radius of an electron appears as a natural distance unit here: if we’d measure ω in units of a per second, rather than in radians (which are expressed in the SI unit of distance, i.e. the meter) per second, the two concepts would coincide.

More fundamentally, we may want to look at the radius of an electron as a natural unit of velocity. Huh? Yes. Just re-write the c/ω = a equation as ω = c/a. What does it say? Exactly what I said, right? As such, the radius of an electron is not only a norm for measuring distance but also for time. 🙂

If you don’t quite get this, think of the following. For an electron, we get an angular frequency that is equal to ω = E/ħ = (8.19×10⁻¹⁴ N·m)/(1.05×10⁻³⁴ N·m·s) ≈ 7.76×10²⁰ radians per second. That’s an incredible velocity, because radians are expressed in distance units—so that’s in meter. However, our mass is not moving along the unit circle, but along a much tinier orbit. The ratio of the radius of the unit circle and a is equal to 1/a ≈ (1 m)/(3.86×10⁻¹³ m) ≈ 2.59×10¹². Now, if we divide the above-mentioned velocity of 7.76×10²⁰ radians per second by this factor, we get… Right! The speed of light: 2.998×10⁸ m/s. 🙂

Post scriptum: I have no clear answer to the question as to why we should use the I = m·r2/2 formula, as opposed to the I = m·r2 formula. It ensures we get the result we want, but this 1/2 factor is actually rather enigmatic. It makes me think of the 1/2 factor in Schrödinger’s equation, which is also quite enigmatic. In my view, the 1/2 factor should not be there in Schrödinger’s equation. Electron orbitals tend to be occupied by two electrons with opposite spin. That’s why their energy levels should be twice as much. And so I’d get rid of the 1/2 factor, solve for the energy levels, and then divide them by two again. Or something like that. 🙂 But then that’s just my personal opinion or… Well… I’ve always been intrigued by the difference between the original printed edition of the Feynman Lectures and the online version, which has been edited on this point. My printed edition is the third printing, which is dated July 1966, and – on this point – it says the following:

“Don’t forget that meff has nothing to do with the real mass of an electron. It may be quite different—although in commonly used metals and semiconductors it often happens to turn out to be the same general order of magnitude, about 2 to 20 times the free-space mass of the electron.”

Two to twenty times. Not 1 or 0.5 to 20 times. No. Two times. As I’ve explained a couple of times, if we’d define a new effective mass which would be twice the old concept – so meffNEW = 2∙meffOLD – then such re-definition would not only solve a number of paradoxes and inconsistencies, but would also justify my interpretation of energy as a two-dimensional oscillation of mass.

However, the online edition has been edited here to reflect the current knowledge about the behavior of an electron in a medium. Hence, if you click on the link above, you will read that the effective mass can be “about 0.1 to 30 times” the free-space mass of the electron. Well… This is another topic altogether, and so I’ll sign off here and let you think about it all. 🙂

The energy and 1/2 factor in Schrödinger’s equation

Schrödinger’s equation, for a particle moving in free space (so we have no external force fields acting on it, so V = 0 and, therefore, the Vψ term disappears) is written as:

∂ψ(x, t)/∂t = i·(1/2)·(ħ/meff)·∇²ψ(x, t)

We already noted and explained the structural similarity with the ubiquitous diffusion equation in physics:

∂φ(x, t)/∂t = D·∇²φ(x, t) with x = (x, y, z)

The big difference between the wave equation and an ordinary diffusion equation is that the wave equation gives us two equations for the price of one: ψ is a complex-valued function, with a real and an imaginary part which, despite their name, are both equally fundamental, or essential. Whatever word you prefer. 🙂 That’s also what the presence of the imaginary unit (i) in the equation tells us. But for the rest it’s the same: the diffusion constant (D) in Schrödinger’s equation is equal to (1/2)·(ħ/meff).

Why the 1/2 factor? It’s ugly. Think of the following: If we bring the (1/2)·(ħ/meff) to the other side, we can write it as meff/(ħ/2). The ħ/2 now appears as a scaling factor in the diffusion constant, just like ħ does in the de Broglie equations: ω = E/ħ and k = p/ħ, or in the argument of the wavefunction: θ = (E·t − p∙x)/ħ. Planck’s constant is, effectively, a physical scaling factor. As a physical scaling constant, it usually does two things:

  1. It fixes the numbers (so that’s its function as a mathematical constant).
  2. As a physical constant, it also fixes the physical dimensions. Note, for example, how the 1/ħ factor in ω = E/ħ and k = p/ħ ensures that the ω·t = (E/ħ)·t and k·x = (p/ħ)·x terms in the argument of the wavefunction are both expressed as some dimensionless number, so they can effectively be added together. Physicists don’t like adding apples and oranges.

The question is: why did Schrödinger use ħ/2, rather than ħ, as a scaling factor? Let’s explore the question.

The 1/2 factor

We may want to think that 1/2 factor just echoes the 1/2 factor in the Uncertainty Principle, which we should think of as a pair of relations: σx·σp ≥ ħ/2 and σE·σt ≥ ħ/2. However, the 1/2 factor in those relations only makes sense because we chose to equate the fundamental uncertainty (Δ) in x, p, E and t with the mathematical concept of the standard deviation (σ), or the half-width, as Feynman calls it in his wonderfully clear exposé on it in one of his Lectures on quantum mechanics (for a summary with some comments, see my blog post on it). We may just as well choose to equate Δ with the full-width of those probability distributions we get for x and p, or for E and t. If we do that, we get σx·σp ≥ ħ and σE·σt ≥ ħ.

It’s a bit like measuring the weight of a person on an old-fashioned (non-digital) bathroom scale with 1 kg marks only: do we say this person is x kg ± 1 kg, or x kg ± 500 g? Do we take the half-width or the full-width as the margin of error? In short, it’s a matter of appreciation, and the 1/2 factor in our pair of uncertainty relations is not there because we’ve got two relations. Likewise, it’s not because I mentioned we can think of Schrödinger’s equation as a pair of relations that, taken together, represent an energy propagation mechanism that’s quite similar in its structure to Maxwell’s equations for an electromagnetic wave (as shown below), that we’d insert (or not) that 1/2 factor: either of the two representations below works. It just depends on our definition of the concept of the effective mass.

The 1/2 factor is really a matter of choice, because the rather peculiar – and flexible – concept of the effective mass takes care of it. However, we could define some new effective mass concept, by writing: meffNEW = 2∙meffOLD, and then Schrödinger’s equation would look more elegant:

∂ψ/∂t = i·(ħ/meffNEW)·∇²ψ

Now you’ll want the definition, of course! What is that effective mass concept? Feynman talks at length about it, but his exposé is embedded in a much longer and more general argument on the propagation of electrons in a crystal lattice, which you may not necessarily want to go through right now. So let’s try to answer that question by doing something stupid: let’s substitute ψ in the equation for ψ = a·e−i·[E·t − p∙x]/ħ (which is an elementary wavefunction), calculate the time derivative and the Laplacian, and see what we get. If we do that, the ∂ψ/∂t = i·(1/2)·(ħ/meff)·∇²ψ equation becomes:

−i·a·(E/ħ)·e−i∙(E·t − p∙x)/ħ = −i·a·(1/2)·(ħ/meff)·(p²/ħ²)·e−i∙(E·t − p∙x)/ħ

⇔ E = (1/2)·p²/meff = (1/2)·(m·v)²/meff ⇔ meff = (1/2)·(m/E)·m·v²

⇔ meff = (1/c²)·(m·v²/2) = m·β²/2
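We can verify the substitution numerically, without any algebra, by differentiating the elementary wavefunction with finite differences (taking ħ = meff = 1 and the e−iθ sign convention):

```python
import cmath

hbar, m_eff = 1.0, 1.0
p = 1.3
E = p**2 / (2*m_eff)          # the energy the substitution predicts

def psi(t, x):
    # elementary wavefunction exp(−i·(E·t − p·x)/ħ)
    return cmath.exp(-1j*(E*t - p*x)/hbar)

t0, x0, h = 0.4, 0.9, 1e-4
dpsi_dt   = (psi(t0+h, x0) - psi(t0-h, x0)) / (2*h)
d2psi_dx2 = (psi(t0, x0+h) - 2*psi(t0, x0) + psi(t0, x0-h)) / h**2

rhs = 1j * 0.5 * (hbar/m_eff) * d2psi_dx2
assert abs(dpsi_dt - rhs) < 1e-5   # ∂ψ/∂t = i·(1/2)·(ħ/m_eff)·∇²ψ holds
```

With E = p²/(2·meff), the left- and right-hand sides agree; any other choice of E breaks the assertion.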

Hence, the effective mass appears in this equation as the equivalent mass of the kinetic energy (K.E.) of the elementary particle that’s being represented by the wavefunction. Now, you may think that sounds good – and it does – but you should note the following:

1. The K.E. = m·v²/2 formula is only correct for non-relativistic speeds. In fact, it’s the kinetic energy formula if, and only if, m ≈ m0. The relativistically correct formula for the kinetic energy calculates it as the difference between (1) the total energy (which is given by the E = m·c² formula, always) and (2) its rest energy, so we write:

K.E. = E − E0 = mv·c² − m0·c² = γ·m0·c² − m0·c² = m0·c²·(γ − 1)

2. The energy concept in the wavefunction ψ = a·e−i·[E·t − p∙x]/ħ is, obviously, the total energy of the particle. For non-relativistic speeds, the kinetic energy is only a very small fraction of the total energy. In fact, using the formula above, you can calculate the ratio between the kinetic and the total energy: you’ll find it’s equal to 1 − 1/γ = 1 − √(1−v²/c²), and its graph goes from 0 to 1.

graph
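To see some actual numbers for that ratio:

```python
import math

for beta in (0.1, 0.5, 0.9, 0.99):       # β = v/c
    gamma = 1/math.sqrt(1 - beta**2)
    ratio = 1 - 1/gamma                  # K.E./E = 1 − √(1 − β²)
    print(beta, round(ratio, 4))
# the ratio starts off near β²/2 and approaches 1 as β goes to 1
```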

Now, if we discard the 1/2 factor, the calculations above yield the following:

−i·a·(E/ħ)·e−i∙(E·t − p∙x)/ħ = −i·a·(ħ/meff)·(p²/ħ²)·e−i∙(E·t − p∙x)/ħ

⇔ E = p²/meff = (m·v)²/meff ⇔ meff = (m/E)·m·v²

⇔ meff = m·v²/c² = m·β²

In fact, it is fair to say that both definitions are equally weird, even if the dimensions come out alright: the effective mass is measured in old-fashioned mass units, and the β² or β²/2 factor appears as a sort of correction factor, varying between 0 and 1 (for β²) or between 0 and 1/2 (for β²/2). I prefer the new definition, as it ensures that meff becomes equal to m in the limit for the velocity going to c. In addition, if we bring the ħ/meff or (1/2)∙ħ/meff factor to the other side of the equation, the choice becomes one between a meffNEW/ħ or a 2∙meffOLD/ħ coefficient.

It’s a choice, really. Personally, I think the equation without the 1/2 factor – and, hence, the use of ħ rather than ħ/2 as the scaling factor – looks better, but then you may argue that – if half of the energy of our particle is in the oscillating real part of the wavefunction, and the other is in the imaginary part – then the 1/2 factor should stay, because it ensures that meff becomes equal to m/2 as v goes to c (or, what amounts to the same, β goes to 1). But then that’s the argument about whether or not we should have a 1/2 factor because we get two equations for the price of one, like we did for the Uncertainty Principle.

So… What to do? Let’s first ask ourselves whether that derivation of the effective mass actually makes sense. Let’s therefore look at both limit situations.

1. For v going to c (or β = v/c going to 1), we do not have much of a problem: meff just becomes the total mass of the particle that we’re looking at, and Schrödinger’s equation can easily be interpreted as an energy propagation mechanism. Our particle has zero rest mass in that case (we may also say that the concept of a rest mass is meaningless in this situation) and all of the energy – and, therefore, all of the equivalent mass – is kinetic: m = E/c² and the effective mass is just the mass: meff = m·c²/c² = m. Hence, our particle is everywhere and nowhere. In fact, you should note that the concept of velocity itself doesn’t make sense in this rather particular case. It’s like a photon (but note it’s not a photon: we’re talking some theoretical particle here with zero spin and zero rest mass): it’s a wave in its own frame of reference, but as it zips by at the speed of light, we think of it as a particle.

2. Let’s look at the other limit situation. For v going to 0 (or β = v/c going to 0), Schrödinger’s equation no longer makes sense, because the diffusion constant goes to zero, so we get a nonsensical equation. Huh? What’s wrong with our analysis?

Well… I must be honest. We started off on the wrong foot. You should note that it’s hard – in fact, plain impossible – to reconcile our simple a·e^(i·[E·t − p∙x]/ħ) function with the idea of the classical velocity of our particle. Indeed, the classical velocity corresponds to a group velocity, or the velocity of a wave packet, and we just have one wave here: no group. So we get nonsense. You can see the same when equating p to zero in the wave equation: we get another nonsensical equation, because the Laplacian is zero. Check it: if our elementary wavefunction is equal to ψ = a·e^(i·(E/ħ)·t), then that Laplacian is indeed zero.
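For what it’s worth, that claim is easy to check numerically – a minimal sketch, with arbitrary values for a, E and t:

```python
# Quick numerical check: with p = 0, the elementary wavefunction
# ψ = a·e^(i·(E/ħ)·t) has no spatial dependence, so its Laplacian
# vanishes and the wave equation degenerates.
import cmath

a, E, hbar, t = 1.5, 2.0, 1.0, 0.37   # arbitrary illustrative values

def psi(x, y, z):
    # Same value everywhere in space: only the time phase remains.
    return a * cmath.exp(1j * (E / hbar) * t)

def lap(f, x, y, z, h=1e-3):
    # Central-difference approximation of ∂²f/∂x² + ∂²f/∂y² + ∂²f/∂z²
    return ((f(x+h, y, z) - 2*f(x, y, z) + f(x-h, y, z))
            + (f(x, y+h, z) - 2*f(x, y, z) + f(x, y-h, z))
            + (f(x, y, z+h) - 2*f(x, y, z) + f(x, y, z-h))) / h**2

result = abs(lap(psi, 0.2, -0.4, 1.1))
print(result)   # 0.0
```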

Hence, our calculation of the effective mass is not very sensical. Why? Because the elementary wavefunction is a theoretical concept only: it may represent some box in space, that is uniformly filled with energy, but it cannot represent any actual particle. Actual particles are always some superposition of two or more elementary waves, so then we’ve got a wave packet (as illustrated below) that we can actually associate with some real-life particle moving in space, like an electron in some orbital indeed. 🙂

wave-packet

I must credit Oregon State University for the animation above. It’s quite nice: a simple particle in a box model without potential. As I showed on my other page (explaining various models), we must add at least two waves – traveling in opposite directions – to model a particle in a box. Why? Because we represent it by a standing wave, and a standing wave is the sum of two waves traveling in opposite directions.
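That last statement is just the trigonometric identity sin(kx − ωt) + sin(kx + ωt) = 2·sin(kx)·cos(ωt), which a minimal numerical sketch (with arbitrary values for k and ω) confirms:

```python
# Sketch: a standing wave as the sum of two waves traveling in opposite
# directions, checked at a few arbitrary (x, t) points.
import math

k, w = 2.0, 3.0   # arbitrary wavenumber and angular frequency
for x, t in [(0.3, 0.1), (1.2, 0.8), (2.5, 1.7)]:
    traveling_sum = math.sin(k*x - w*t) + math.sin(k*x + w*t)
    standing = 2 * math.sin(k*x) * math.cos(w*t)
    print(abs(traveling_sum - standing) < 1e-12)   # True for every point
```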

So, if our derivation above was not very meaningful, then what is the actual concept of the effective mass?

The concept of the effective mass

I am afraid that, at this point, I do have to direct you back to the Grand Master himself for the detail. Let me just try to sum it up very succinctly. If we have a wave packet, there is – obviously – some energy in it, and it is energy that we may associate with the classical concept of the velocity of our particle, because the classical velocity corresponds to the group velocity of our wave packet. Hence, we have a new energy concept here – and the equivalent mass, of course. Now, Feynman’s analysis – which is Schrödinger’s analysis, really – shows we can write that energy as:

E = meff·v²/2

So… Well… That’s the classical kinetic energy formula. And it’s the very classical one, because it’s not relativistic. 😦 But that’s OK for slow-moving electrons! [Remember: the typical (relative) velocity is given by the fine-structure constant: α = β = v/c. So that velocity is impressive (about 2,188 km per second), but it’s only a tiny fraction of the speed of light, so non-relativistic formulas should work.]
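As an aside, that 2,188 km/s number is easy to verify – a quick sketch using the CODATA value for α:

```python
# Sketch: the electron's typical velocity in the hydrogen atom, as a
# fraction of the speed of light, from the fine-structure constant α = v/c.
alpha = 0.0072973525693   # fine-structure constant (CODATA)
c = 299792458.0           # speed of light, m/s

v = alpha * c
print(v / 1000)           # ≈ 2187.7 km/s – the ~2,188 km/s mentioned above
print(v / c)              # ≈ 0.0073 – tiny, so non-relativistic formulas work
```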

Now, the meff factor in this equation is a function of the various parameters of the model he uses. To be precise, we get the following formula out of his model (which, as mentioned above, is a model of electrons propagating in a crystal lattice):

meff = ħ²/(2·A·b²)

Now, the b in this formula is the spacing between the atoms in the lattice. The A basically represents an energy barrier: to move from one atom to another, the electron needs to get across it. I talked about this in my post on it, and so I won’t explain the graph below – because I did that in that post. Just note that we don’t need that factor 2: there is no reason whatsoever to write E0 + 2·A and E0 − 2·A. We could just re-define a new A: (1/2)·ANEW = AOLD. The formula for meff then simplifies to ħ²/(2·AOLD·b²) = ħ²/(ANEW·b²). We then get an Eeff = meff·v² formula for the extra energy.
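To get a feel for the orders of magnitude involved, we can plug some made-up but plausible lattice numbers into the meff = ħ²/(2·A·b²) formula – the values for A and b below are illustrative assumptions only, not Feynman’s:

```python
# Sketch: Feynman's effective-mass formula meff = ħ²/(2·A·b²), evaluated
# with assumed (hypothetical) values for the barrier energy A and spacing b.
hbar = 1.054571817e-34    # reduced Planck constant, J·s
eV = 1.602176634e-19      # joule per electronvolt
m_e = 9.1093837015e-31    # free-space electron mass, kg

A = 1.0 * eV              # assumed energy barrier of ~1 eV
b = 3.0e-10               # assumed lattice spacing of ~3 Å

m_eff = hbar**2 / (2 * A * b**2)
ratio = m_eff / m_e
print(ratio)              # ≈ 0.42: within Feynman's '0.1 to 30 times' range
```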

energy

Eeff = meff·v²?!? What energy formula is that? Schrödinger must have thought the same thing, and so that’s why we have that ugly 1/2 factor in his equation. However, think about it. Our analysis shows that it is quite straightforward to model energy as a two-dimensional oscillation of mass. In this analysis, the real and the imaginary component of the wavefunction each store half of the total energy of the object, which is equal to E = m·c². Remember, indeed, that we compared it to the energy in an oscillator, which is equal to the sum of kinetic and potential energy, and for which we have the T + U = m·a²·ω0²/2 formula. But so we have two oscillators here and, hence, twice the energy. Hence, the E = m·c² expression corresponds to E = m·a²·ω0² and, hence, we may think of ω0 = c/a as the natural frequency of the vacuum.
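The energy bookkeeping in the paragraph above can be summed up in one line – a sketch, assuming (as in the previous posts on the geometric interpretation) that the amplitude a and the frequency ω0 of the two oscillations are related by c = a·ω0:

```latex
E \;=\; \underbrace{\tfrac{1}{2}\, m\, a^2\, \omega_0^2}_{\text{real component}}
\;+\; \underbrace{\tfrac{1}{2}\, m\, a^2\, \omega_0^2}_{\text{imaginary component}}
\;=\; m\, a^2\, \omega_0^2 \;=\; m\,(a\,\omega_0)^2 \;=\; m\,c^2
```

Each of the two oscillators contributes the T + U sum of a one-dimensional oscillator, which is why the 1/2 factor drops out of the total.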

Therefore, the Eeff = meff·v² formula makes much more sense. It nicely mirrors Einstein’s E = m·c² formula and, in fact, naturally merges into E = m·c² for v approaching c. But, I admit, it is not so easy to interpret. It’s much easier to just say that the effective mass is the mass of our electron as it appears in the kinetic energy formula, or – alternatively – in the momentum formula. Indeed, Feynman also writes the following formula:

meff·v = p = ħ·k

Now, that is something we easily recognize! 🙂

So… Well… What do we do now? Do we use the 1/2 factor or not?

It would be very convenient, of course, to just stick with tradition and use meff as everyone else uses it: it is just the mass as it appears in whatever medium we happen to look at, which may be a crystal lattice (or a semiconductor), or just free space. In short, it’s the mass of the electron as it appears to us, i.e. as it appears in the (non-relativistic) kinetic energy formula (K.E. = meff·v²/2), the formula for the momentum of an electron (p = meff·v), or in the wavefunction itself (k = p/ħ = (meff·v)/ħ). In fact, in his analysis of the electron orbitals, Feynman (who just follows Schrödinger here) drops the eff subscript altogether, and so the effective mass is just the mass: meff = m. Hence, the apparent mass of the electron in the hydrogen atom serves as a reference point, and the effective mass in a different medium (such as a crystal lattice, rather than free space or, I should say, a hydrogen atom in free space) will also be different.

The thing is: we get the right results out of Schrödinger’s equation, with the 1/2 factor in it. Hence, Schrödinger’s equation works: we get the actual electron orbitals out of it. Hence, Schrödinger’s equation is true – without any doubt. Hence, if we take that 1/2 factor out, then we do need to use the other effective mass concept. We can do that. Think about the actual relation between the effective mass and the real mass of the electron, about which Feynman writes the following: “The effective mass has nothing to do with the real mass of an electron. It may be quite different—although in commonly used metals and semiconductors it often happens to turn out to be the same general order of magnitude: about 0.1 to 30 times the free-space mass of the electron.” Hence, if we write the relation between meff and m as meff = g(m), then the same relation for our meffNEW = 2∙meffOLD becomes meffNEW = 2·g(m), and the “about 0.1 to 30 times” becomes “about 0.2 to 60 times.”

In fact, in the original 1963 edition, Feynman writes that the effective mass is “about 2 to 20 times” the free-space mass of the electron. Isn’t that interesting? I mean… Note that factor 2! If we’d write meff = 2·m, then we’re fine. We can then write Schrödinger’s equation in the following two equivalent ways:

  1. (meff/ħ)·∂ψ/∂t = i·∇²ψ
  2. (2m/ħ)·∂ψ/∂t = i·∇²ψ

Both would be correct, and it explains why Schrödinger’s equation works. So let’s go for that compromise and write Schrödinger’s equation in either of the two equivalent ways. 🙂 The question then becomes: how to interpret that factor 2? The answer to that question is, effectively, related to the fact that we get two waves for the price of one here. So we have two oscillators, so to speak. Now that‘s quite deep, and I will explore that in one of my next posts.

Let me now address the second weird thing in Schrödinger’s equation: the energy factor. I should be more precise: the weirdness arises when solving Schrödinger’s equation. Indeed, in the texts I’ve read, there is this constant switching back and forth between interpreting E as the energy of the atom, versus the energy of the electron. Now, both concepts are obviously quite different, so which one is it really?

The energy factor E

It’s a confusing point—for me, at least and, hence, I must assume for students as well. Let me indicate, by way of example, how the confusion arises in Feynman’s exposé on the solutions to the Schrödinger equation. Initially, the development is quite straightforward. Replacing V by −e²/r, Schrödinger’s equation becomes:

iħ·(∂ψ/∂t) = −(ħ²/2m)·∇²ψ − (e²/r)·ψ

As usual, it is then assumed that a solution of the form ψ(r, t) = e^(−(i/ħ)·E·t)·ψ(r) will work. Apart from the confusion that arises because we use the same symbol, ψ, for two different functions (you will agree that ψ(r, t), a function in two variables, is obviously not the same as ψ(r), a function in one variable only), this assumption is quite straightforward and allows us to re-write the differential equation above as:

−(ħ²/2m)·∇²ψ(r) − (e²/r)·ψ(r) = E·ψ(r)

To get this, you just need to actually do that time derivative, noting that the ψ in our equation is now ψ(r), not ψ(r, t). Feynman duly notes this as he writes: “The function ψ(r) must solve this equation, where E is some constant—the energy of the atom.” So far, so good. In one of the (many) next steps, we re-write E as E = ER·ε, with ER = m·e⁴/(2ħ²). So we just use the Rydberg energy (ER ≈ 13.6 eV) here as a ‘natural’ atomic energy unit. That’s all. No harm in that.
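As a quick sanity check on that ER = m·e⁴/(2ħ²) formula – which is written in Gaussian units; the SI version divides by an extra (4πε0)² factor – we can compute it from the fundamental constants:

```python
# Sketch: the Rydberg energy from fundamental constants. The formula
# ER = m·e⁴/(2ħ²) is in Gaussian units; in SI units it reads
# ER = m·e⁴/(2·(4πε0)²·ħ²).
import math

m = 9.1093837015e-31      # electron mass, kg
e = 1.602176634e-19       # elementary charge, C
hbar = 1.054571817e-34    # reduced Planck constant, J·s
eps0 = 8.8541878128e-12   # vacuum permittivity, F/m

ER = m * e**4 / (2 * (4 * math.pi * eps0)**2 * hbar**2)
print(ER / e)             # ≈ 13.6 eV: the Rydberg energy
```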

Then all kinds of complicated but legitimate mathematical manipulations follow, in an attempt to solve this differential equation—an attempt that is successful, of course! However, after all these manipulations, one ends up with the grand simple solution for the s-states of the atom (i.e. the spherically symmetric solutions):

En = −ER/n², with 1/n² = 1, 1/4, 1/9, 1/16,…

So we get: En = −13.6 eV, −3.4 eV, −1.5 eV, etcetera. Now how is that possible? How can the energy of the atom suddenly be negative? More importantly, why is it so tiny in comparison with the rest energy of the proton (which is about 938 mega-electronvolt), or of the electron (0.511 MeV)? The energy levels above are a few eV only, not a few million electronvolt. Feynman answers this question rather vaguely when he states the following:

“There is, incidentally, nothing mysterious about negative numbers for the energy. The energies are negative because when we chose to write V = −e2/r, we picked our zero point as the energy of an electron located far from the proton. When it is close to the proton, its energy is less, so somewhat below zero. The energy is lowest (most negative) for n = 1, and increases toward zero with increasing n.”

We picked our zero point as the energy of an electron located far away from the proton? But we were talking about the energy of the atom all along, right? You’re right. Feynman doesn’t answer the question. The solution is OK – well, sort of, at least – but, in one of those mathematical complications, there is a ‘normalization’ – a choice of some constant that pops up when combining and substituting stuff – that is not so innocent. To be precise, at some point, Feynman substitutes the ε variable for the square of another variable – to be even more precise, he writes: ε = −α². He then performs some more hat tricks – all legitimate, no doubt – and finds that the only sensible solutions to the differential equation require α to be equal to 1/n, which immediately leads to the above-mentioned solution for our s-states.

The real answer to the question is given somewhere else. In fact, Feynman casually gives us an explanation in one of his very first Lectures on quantum mechanics, where he writes the following:

“If we have a “condition” which is a mixture of two different states with different energies, then the amplitude for each of the two states will vary with time according to an equation like a·e^(iωt), with ħ·ω = E0 = m·c². Hence, we can write the amplitude for the two states, for example as:

e^(i(E1/ħ)·t) and e^(i(E2/ħ)·t)

And if we have some combination of the two, we will have an interference. But notice that if we added a constant to both energies, it wouldn’t make any difference. If somebody else were to use a different scale of energy in which all the energies were increased (or decreased) by a constant amount—say, by the amount A—then the amplitudes in the two states would, from his point of view, be

e^(i(E1+A)·t/ħ) and e^(i(E2+A)·t/ħ)

All of his amplitudes would be multiplied by the same factor e^(i(A/ħ)·t), and all linear combinations, or interferences, would have the same factor. When we take the absolute squares to find the probabilities, all the answers would be the same. The choice of an origin for our energy scale makes no difference; we can measure energy from any zero we want. For relativistic purposes it is nice to measure the energy so that the rest mass is included, but for many purposes that aren’t relativistic it is often nice to subtract some standard amount from all energies that appear. For instance, in the case of an atom, it is usually convenient to subtract the energy Ms·c², where Ms is the mass of all the separate pieces—the nucleus and the electrons—which is, of course, different from the mass of the atom. For other problems, it may be useful to subtract from all energies the amount Mg·c², where Mg is the mass of the whole atom in the ground state; then the energy that appears is just the excitation energy of the atom. So, sometimes we may shift our zero of energy by some very large constant, but it doesn’t make any difference, provided we shift all the energies in a particular calculation by the same constant.”

It’s a rather long quotation, but it’s important. The key phrase here is, obviously, the following: “For other problems, it may be useful to subtract from all energies the amount Mg·c², where Mg is the mass of the whole atom in the ground state; then the energy that appears is just the excitation energy of the atom.” So that’s what he’s doing when solving Schrödinger’s equation. However, I should make the following point here: if we shift the origin of our energy scale, it does not make any difference in regard to the probabilities we calculate, but it obviously does make a difference in terms of our wavefunction itself. To be precise, its density in time – i.e. the frequency at which it oscillates – will be very different. Hence, if we’d want to give the wavefunction some physical meaning – which is what I’ve been trying to do all along – it does make a huge difference. When we leave the rest mass of all of the pieces in our system out, we can no longer pretend we capture their energy.
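Both halves of that observation – same probabilities, different wavefunction – can be seen in a minimal numerical sketch (all values below are arbitrary, in natural units):

```python
# Sketch: shifting all energies by a constant A multiplies every amplitude by
# the same overall phase factor e^(i·A·t/ħ). Probabilities are unchanged, but
# the wavefunction itself is not.
import cmath

hbar = 1.0
E1, E2, A, t = 2.0, 5.0, 1000.0, 0.73   # arbitrary illustrative values

psi = cmath.exp(1j * E1 * t / hbar) + cmath.exp(1j * E2 * t / hbar)
psi_shifted = cmath.exp(1j * (E1 + A) * t / hbar) + cmath.exp(1j * (E2 + A) * t / hbar)

same_probability = abs(abs(psi)**2 - abs(psi_shifted)**2) < 1e-9
common_phase = abs(psi_shifted - psi * cmath.exp(1j * A * t / hbar)) < 1e-9
print(same_probability)   # True: the interference depends only on E2 − E1
print(common_phase)       # True: the shift is just an overall phase factor
```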

This is a rather simple observation, but one that has profound implications in terms of our interpretation of the wavefunction. Personally, I admire the Great Teacher’s Lectures, but I am really disappointed that he doesn’t pay more attention to this. 😦