# The Uncertainty Principle and the stability of atoms

Pre-script (dated 26 June 2020): This post did not suffer too much from the attack on this blog by the dark force. It remains relevant. 🙂

Original post:

The Model of the Atom

In one of my posts, I explained the quantum-mechanical model of an atom. Feynman sums it up as follows:

“The electrostatic forces pull the electron as close to the nucleus as possible, but the electron is compelled to stay spread out in space over a distance given by the Uncertainty Principle. If it were confined in too small a space, it would have a great uncertainty in momentum. But that means it would have a high expected energy – which it would use to escape from the electrical attraction. The net result is an electrical equilibrium not too different from the idea of Thompson – only it is the negative charge that is spread out, because the mass of the electron is so much smaller than the mass of the proton.”

This explanation is a bit sloppy, so we should add the following clarification: “The wave function Ψ(r) for an electron in an atom does not describe a smeared-out electron with a smooth charge density. The electron is either here, or there, or somewhere else, but wherever it is, it is a point charge.” (Feynman’s Lectures, Vol. III, p. 21-6)

The two quotes are not incompatible: it is just a matter of defining what we really mean by ‘spread out’. Feynman’s calculation of the Bohr radius of an atom in his introduction to quantum mechanics clears all confusion in this regard:

It is a nice argument. One may object that he gets the right thing out because he puts the right things in – such as the values of e and m, for example 🙂 – but it’s nice nevertheless!

Mass as a Scale Factor for Uncertainty

Having complimented Feynman, I must add that the calculation above does raise an obvious question: why is it that we cannot confine the electron in “too small a space” but that we can do so for the nucleus (which is just one proton in the example of the hydrogen atom here)? Feynman gives the answer above: because the mass of the electron is so much smaller than the mass of the proton.

Huh? What’s the mass got to do with it? The uncertainty is the same for protons and electrons, isn’t it?

Well… It is, and it isn’t. 🙂 The Uncertainty Principle – usually written in its more accurate σ_x·σ_p ≥ ħ/2 expression – applies to both the electron and the proton – of course! – but the momentum p is the product of mass and velocity (p = m·v), and so it’s the proton’s mass that makes the difference here. To be specific, the mass of a proton is about 1836 times that of an electron. Now, as long as the velocities involved are non-relativistic – and they are non-relativistic in this case: the (relative) speed of electrons in atoms is given by the fine-structure constant α = v/c ≈ 0.0073, so the Lorentz factor is very close to 1 – we can treat the m in the p = m·v identity as a constant and, hence, we can also write: Δp = Δ(m·v) = m·Δv. So all of the uncertainty of the momentum goes into the uncertainty of the velocity. Hence, the mass acts like a reverse scale factor for the uncertainty. To appreciate what that means, let me write Δx·Δp = ħ as:

Δx·Δv = ħ/m

It is an interesting point, so let me expand the argument somewhat. We actually use a more general mathematical property of the standard deviation here: the standard deviation of a variable scales directly with the scale of the variable. Hence, we can write: σ(k·x) = k·σ(x), with k > 0. So the uncertainty is, indeed, smaller for larger masses. Larger masses are associated with smaller uncertainties in their position x. To be precise, for a given Δv, the uncertainty is inversely proportional to the mass and, hence, the mass effectively acts like a reverse scale factor for the uncertainty.
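To get a feel for the numbers, here is a minimal sketch that evaluates the right-hand side of Δx·Δv = ħ/m for an electron and a proton. The constants are standard CODATA values that I am adding for illustration (they are not part of the argument above), and the function name is just a label I made up.

```python
# Standard CODATA values in SI units (added for illustration)
HBAR = 1.054571817e-34          # reduced Planck constant, J*s
M_ELECTRON = 9.1093837015e-31   # electron mass, kg
M_PROTON = 1.67262192369e-27    # proton mass, kg

def position_velocity_bound(mass_kg):
    """Return hbar/m, the order of magnitude of the product dx*dv."""
    return HBAR / mass_kg

bound_electron = position_velocity_bound(M_ELECTRON)
bound_proton = position_velocity_bound(M_PROTON)
mass_ratio = M_PROTON / M_ELECTRON

print(f"hbar/m for the electron: {bound_electron:.3e} m^2/s")
print(f"hbar/m for the proton:   {bound_proton:.3e} m^2/s")
print(f"mass ratio proton/electron: {mass_ratio:.1f}")
```

The ratio of the two bounds is just the mass ratio, which is the ‘reverse scale factor’ idea in numbers.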

Of course, you’ll say that the uncertainty still applies to both factors on the left-hand side of the equation, and so you’ll wonder: why can’t we keep Δx the same and multiply Δv with m, so its product yields ħ again? In other words, why can’t we have an uncertainty in velocity for the proton that is 1836 times larger than the uncertainty in velocity for the electron? The answer to that question should be obvious: the uncertainty should not be greater than the expected value. When everything is said and done, we’re talking a distribution of some variable here (the velocity variable, to be precise) and, hence, that distribution is likely to be the Maxwell-Boltzmann distribution we introduced in previous posts. Its formula and graph are given below:

In statistics (and in probability theory), they call this a chi distribution with three degrees of freedom and a scale parameter which is equal to a = (kT/m)^(1/2). The formula for the scale parameter shows how the mass of a particle indeed acts as a reverse scale parameter. The graph above shows three graphs for a = 1, 2 and 5 respectively. Note the square root though: quadrupling the mass (keeping kT the same) amounts to going from a = 2 to a = 1, so that’s halving a. Indeed, [kT/(4m)]^(1/2) = (1/2)·(kT/m)^(1/2). So we can’t just do what we want with Δv (like multiplying it with 1836, as suggested). In fact, the graph and the formulas show that Feynman’s assumption that we can equate p with Δp (i.e. his assumption that “the momenta must be of the order p = ħ/Δx, with Δx the spread in position”), more or less at least, is quite reasonable.
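The square-root dependence is easy to check numerically. The sketch below assumes an arbitrary temperature and an arbitrary particle mass (both are illustrative values of mine, not values from the discussion) and only shows that quadrupling the mass halves the scale parameter.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K

def scale_parameter(temperature_k, mass_kg):
    """Scale parameter a = sqrt(kT/m) of the Maxwell-Boltzmann distribution."""
    return math.sqrt(K_B * temperature_k / mass_kg)

# Illustrative values only: 300 K and a made-up mass of 1e-26 kg
a_single = scale_parameter(300.0, 1.0e-26)
a_quadrupled = scale_parameter(300.0, 4.0e-26)

print(f"a for mass m:  {a_single:.2f} m/s")
print(f"a for mass 4m: {a_quadrupled:.2f} m/s")
print(f"ratio: {a_quadrupled / a_single:.3f}")
```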

Of course, you are very smart and so you’ll have yet another objection: why can’t we associate a much higher momentum with the proton, as that would allow us to associate higher velocities with the proton? Good question. My answer to that is the following (and it might be original, as I didn’t find this anywhere else). When everything is said and done, we’re talking two particles in some box here: an electron and a proton. Hence, we should assume that the average kinetic energy of our electron and our proton is the same (if not, they would be exchanging kinetic energy until it’s more or less equal), so we write ⟨m_e·v_e²/2⟩ = ⟨m_p·v_p²/2⟩. We can re-write this as m_p/m_e = 1836 = ⟨v_e²⟩/⟨v_p²⟩ and, therefore, ⟨v_e²⟩ = 1836·⟨v_p²⟩. Now, ⟨v²⟩ ≠ ⟨v⟩² and, hence, ⟨v⟩ ≠ √⟨v²⟩. So the equality does not imply that the expected velocity of the electron is √1836 ≈ 43 times the expected velocity of the proton. Indeed, because of the particularities of the distribution, there is a difference between (a) the most probable speed, which is equal to √2·a ≈ 1.414·a, (b) the root mean square speed, which is equal to √⟨v²⟩ = √3·a ≈ 1.732·a, and, finally, (c) the mean or expected speed, which is equal to ⟨v⟩ = 2·(2/π)^(1/2)·a ≈ 1.596·a.
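If you want to convince yourself of those three numbers, the following sketch integrates the Maxwell-Boltzmann speed density numerically with a plain midpoint rule. The grid size and the cut-off at 12·a are arbitrary choices of mine; nothing here depends on a particular library.

```python
import math

def maxwell_boltzmann_pdf(v, a=1.0):
    """Maxwell-Boltzmann speed density with scale parameter a."""
    return math.sqrt(2.0 / math.pi) * v**2 * math.exp(-v**2 / (2.0 * a**2)) / a**3

def characteristic_speeds(a=1.0, steps=20000, v_max_in_a=12.0):
    """Midpoint-rule estimates of the most probable, mean and rms speed."""
    dv = v_max_in_a * a / steps
    mode, peak, mean, mean_sq = 0.0, 0.0, 0.0, 0.0
    for i in range(steps):
        v = (i + 0.5) * dv
        f = maxwell_boltzmann_pdf(v, a)
        if f > peak:          # track the location of the maximum of the density
            mode, peak = v, f
        mean += v * f * dv
        mean_sq += v**2 * f * dv
    return mode, mean, math.sqrt(mean_sq)

mode, mean, rms = characteristic_speeds()
print(f"most probable speed: {mode:.3f} a")
print(f"mean speed:          {mean:.3f} a")
print(f"rms speed:           {rms:.3f} a")
```

The three estimates should land close to √2, 2·√(2/π) and √3 respectively, all of the same order as a itself.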

However, we are not far off. We could use any of these three values to roughly approximate Δv, as well as the scale parameter a itself: our answers would all be of the same order. However, to keep the calculations simple, let’s use the most probable speed. Let’s equate our electron mass with unity, so the mass of our proton is 1836. Now, such mass implies a scale factor (i.e. a) that’s √1836 ≈ 43 times smaller. So the most probable speed of the proton and, therefore, its spread, would be about √2/√1836 = √(2/1836) ≈ 0.033 that of the electron, so we write: Δv_p ≈ 0.033·Δv_e. Now we can insert this in our Δx·Δv = ħ/m = ħ/1836 identity. We get: Δx_p·Δv_p = Δx_p·√(2/1836)·Δv_e = ħ/1836. That, in turn, implies that √(2·1836)·Δx_p = ħ/Δv_e, which we can re-write as: Δx_p = Δx_e/√(2·1836) ≈ Δx_e/60. In other words, the expected spread in the position of the proton is about 60 times smaller than the expected spread of the electron. More in general, we can say that the spread in position of a particle, keeping all else equal, is inversely proportional to (2m)^(1/2). Indeed, in this case, we multiplied the mass with about 1800, and we found that the uncertainty in position went down with a factor 1/60 = 1/√3600. Not bad as a result! Is it precise? Well… It could be like √3·√m or 2·(2/π)^(1/2)·√m depending on our definition of ‘uncertainty’, but it’s all of the same order. So… Yes. Not bad at all… 🙂
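The arithmetic of that rough estimate can be retraced in a few lines. This sketch simply follows the steps as written above, with the electron mass, ħ and the electron’s Δx and Δv all set to 1; those unit choices are my assumption for convenience.

```python
import math

MASS_RATIO = 1836  # proton mass in units of the electron mass

# Scale parameter a shrinks with the square root of the mass
scale_ratio = math.sqrt(MASS_RATIO)

# Velocity spread of the proton relative to the electron, as estimated in the text
dv_ratio = math.sqrt(2 / MASS_RATIO)

# With hbar = 1, m_e = 1 and dx_e * dv_e = 1 (so dx_e = dv_e = 1 here):
# dx_p * dv_p = 1/1836  =>  dx_p = 1 / (1836 * dv_ratio) = 1 / sqrt(2 * 1836)
dx_p = 1.0 / (MASS_RATIO * dv_ratio)
dx_shrink_factor = 1.0 / dx_p

print(f"sqrt(1836)      = {scale_ratio:.2f}")
print(f"sqrt(2/1836)    = {dv_ratio:.4f}")
print(f"dx_e / dx_p     = {dx_shrink_factor:.1f}")
```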

You’ll raise a third objection now: the radius of a proton is measured using the femtometer scale, so that’s expressed in 10⁻¹⁵ m, which is not 60 but a million times smaller than the nanometer (i.e. 10⁻⁹ m) scale used to express the Bohr radius as calculated by Feynman above. You’re right, but the 10⁻¹⁵ m number is the charge radius, not the uncertainty in position. Indeed, the so-called classical electron radius is also measured in femtometer and, hence, the Bohr radius is also like a million times that number. OK. That should settle the matter. I need to move on.

Before I do move on, let me relate the observation (i.e. the fact that the uncertainty in regard to position decreases as the mass of a particle increases) to another phenomenon. As you know, the interference of light beams is easy to observe. Hence, the interference of photons is easy to observe: Young’s experiment involved a slit of 0.85 mm (so almost 1 mm) only. In contrast, the 2012 double-slit experiment with electrons involved slits that were 62 nanometer wide, i.e. 62 billionths of a meter! That’s because the associated frequencies are so much higher and, hence, the wave zone is much smaller. So much, in fact, that Feynman could not imagine technology would ever be sufficiently advanced so as to actually carry out the double slit experiment with electrons. It’s an aspect of the same: the uncertainty in position is much smaller for electrons than it is for photons. Who knows: perhaps one day, we’ll be able to do the experiment with protons. 🙂 For further detail, I’ll refer you to one of my posts on this.

What’s Explained, and What’s Left Unexplained?

There is another obvious question: if the electron is still some point charge, and going around as it does, why doesn’t it radiate energy? Indeed, the Rutherford-Bohr model had to be discarded because this ‘planetary’ model involved circular (or elliptical) motion and, therefore, some acceleration. According to classical theory, the electron should thus emit electromagnetic radiation, as a result of which it would radiate its kinetic energy away and, therefore, spiral in toward the nucleus. The quantum-mechanical model doesn’t explain this either, does it?

I can’t answer this question as yet, as I still need to go through all Feynman’s Lectures on quantum mechanics. You’re right. There’s something odd about the quantum-mechanical idea: it still involves an electron moving in some kind of orbital – although I hasten to add that the wavefunction is a complex-valued function, not some real function – but it does not involve any loss of kinetic energy due to circular motion apparently!

There are other unexplained questions as well. For example, the idea of an electrical point charge still needs to be reconciled with the mathematical inconsistencies it implies, as Feynman points out himself in yet another of his Lectures.

Finally, you’ll wonder as to the difference between a proton and a positron: if a positron and an electron annihilate each other in a flash, why do we have a hydrogen atom at all? Well… The proton is not the electron’s anti-particle. For starters, it’s made of quarks, while the positron is made of… Well… A positron is a positron: it’s elementary. But, yes, interesting question, and the ‘mechanics’ behind the mutual destruction are quite interesting and, hence, surely worth looking into – but not here. 🙂

Having mentioned a few things that remain unexplained, I should add that the model does have the advantage of answering plenty of other questions. It explains, for example, why the electron and the proton are actually right on top of each other, as they should be according to classical electrostatic theory, and why they are not at the same time: the electron is still a sort of ‘cloud’ indeed, with the proton at its center.

The quantum-mechanical ‘cloud’ model of the electron also explains why “the terrific electrical forces balance themselves out, almost perfectly, by forming tight, fine mixtures of the positive and the negative, so there is almost no attraction or repulsion at all between two separate bunches of such mixtures” (Richard Feynman, Introduction to Electromagnetism, p. 1-1) or, to quote from one of his other writings, why we do not fall through the floor as we walk:

“As we walk, our shoes with their masses of atoms push against the floor with its mass of atoms. In order to squash the atoms closer together, the electrons would be confined to a smaller space and, by the uncertainty principle, their momenta would have to be higher on the average, and that means high energy; the resistance to atomic compression is a quantum-mechanical effect and not a classical effect. Classically, we would expect that if we were to draw all the electrons and protons closer together, the energy would be reduced still further, and the best arrangement of positive and negative charges in classical physics is all on top of each other. This was well known in classical physics and was a puzzle because of the existence of the atom. Of course, the early scientists invented some ways out of the trouble – but never mind, we have the right way out, now!”

So that’s it, then. Except… Well…

The Fine-Structure Constant

When talking about the stability of atoms, one cannot escape a short discussion of the so-called fine-structure constant, denoted by α (alpha). I discussed it in another post of mine, so I’ll refer you there for a more comprehensive overview. I’ll just remind you of the basics:

(1) α is the square of the electron charge expressed in Planck units: α = e_P².

(2) α is the square root of the ratio of (a) the classical electron radius and (b) the Bohr radius: α = √(r_e/r). You’ll see this more often written as r_e = α²·r. Also note that this is an equation that does not depend on the units, in contrast to equation 1 (above), and 4 and 5 (below), which require you to switch to Planck units. It’s the square root of a ratio of two lengths and, hence, the units don’t matter. They fall away.

(3) α is the (relative) speed of an electron: α = v/c. [The relative speed is the speed as measured against the speed of light. Note that the ‘natural’ unit of speed in the Planck system of units is equal to c. Indeed, if you divide one Planck length by one Planck time unit, you get (1.616×10⁻³⁵ m)/(5.391×10⁻⁴⁴ s) ≈ 2.998×10⁸ m/s, i.e. c. However, this is another equation, just like (2), that does not depend on the units: we can express v and c in whatever unit we want, as long as we’re consistent and express both in the same units.]

(4) Finally, α is also equal to the product of (a) the electron mass (which I’ll simply write as m_e here) and (b) the classical electron radius r_e (if both are expressed in Planck units): α = m_e·r_e. [I think that’s, perhaps, the most amazing of all of the expressions for α. If you don’t think that’s amazing, I’d really suggest you stop trying to study physics.]

Note that, from (2) and (4), we also find that:

(5) The electron mass (in Planck units) is equal to m_e = α/r_e = α/(α²·r) = 1/(α·r). So that gives us an expression, using α once again, for the electron mass as a function of the Bohr radius r expressed in Planck units.

Finally, we can also substitute (1) in (5) to get:

(6) The electron mass (in Planck units) is equal to m_e = α/r_e = e_P²/r_e. Using the Bohr radius, we get m_e = 1/(α·r) = 1/(e_P²·r).
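Relations (2) and (3) do not depend on the units, so they are easy to check in SI. The sketch below uses standard CODATA values for α, the Bohr radius and the classical electron radius; those numbers are my addition and are only there to illustrate the relations, not to derive them.

```python
import math

# Standard CODATA values (added for illustration)
ALPHA = 7.2973525693e-3              # fine-structure constant, dimensionless
BOHR_RADIUS = 5.29177210903e-11      # m
CLASSICAL_ELECTRON_RADIUS = 2.8179403262e-15  # m

# Relation (2): r_e = alpha^2 * r
r_e_from_alpha = ALPHA**2 * BOHR_RADIUS
relative_error = abs(r_e_from_alpha - CLASSICAL_ELECTRON_RADIUS) / CLASSICAL_ELECTRON_RADIUS

# Relation (3): v/c = alpha, so the Lorentz factor is barely above 1
lorentz_factor = 1.0 / math.sqrt(1.0 - ALPHA**2)

print(f"alpha^2 * r    = {r_e_from_alpha:.6e} m")
print(f"r_e (tabulated) = {CLASSICAL_ELECTRON_RADIUS:.6e} m")
print(f"Lorentz factor for v = alpha*c: {lorentz_factor:.7f}")
```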

In addition, in the mentioned post, I also related α to the so-called coupling constant determining the strength of the interaction between electrons and photons. So… What a magical number indeed! It suggests some unity that our little model of the atom above doesn’t quite capture. As far as I am concerned, it’s one of the many other ‘unexplained questions’, and one of my key objectives, as I struggle through Feynman’s Lectures, is to understand it all. 🙂 One of the issues is, of course, how to relate this coupling constant to the concept of a gauge, which I briefly discussed in my previous post. In short, I’ve still got a long way to go… 😦

Post Scriptum: The de Broglie relations and the Uncertainty Principle

My little exposé on mass being nothing but a scale factor in the Uncertainty Principle is a good occasion to reflect on the Uncertainty Principle once more. Indeed, what’s the uncertainty about, if it’s not about the mass? It’s about the position in space and velocity, i.e. it’s movement and time. Velocity or speed (i.e. the magnitude of the velocity vector) is, in turn, defined as the distance traveled divided by the time of travel, so the uncertainty is about time as well, as evidenced from the ΔE·Δt = h expression of the Uncertainty Principle. But how does it work exactly?

Hmm… Not sure. Let me try to remember the context. We know that the de Broglie relation, λ = h/p, which associates a wavelength (λ) with the momentum (p) of a particle, is somewhat misleading, because we’re actually associating a (possibly infinite) bunch of component waves with a particle. So we’re talking some range of wavelengths (Δλ) and, hence, assuming all these component waves travel at the same speed, we’re also talking a frequency range (Δf). The bottom line is that we’ve got a wave packet and we need to distinguish the velocity of its phase (v_p) versus the group velocity (v_g), which corresponds to the classical velocity of our particle.

I think I explained that pretty well in one of my previous posts on the Uncertainty Principle, so I’d suggest you have a look there. The mentioned post explains how the Uncertainty Principle relates position (x) and momentum (p) as a Fourier pair, and it also explains that general mathematical property of Fourier pairs: the more ‘concentrated’ one distribution is, the more ‘spread out’ its Fourier transform will be. In other words, it is not possible to arbitrarily ‘concentrate’ both distributions, i.e. both the distribution of x (which I denoted as Ψ(x)) as well as its Fourier transform, i.e. the distribution of p (which I denoted by Φ(p)). So, if we ‘squeeze’ Ψ(x), then its Fourier transform Φ(p) will ‘stretch out’.

That was clear enough – I hope! But how do we go from Δx·Δp = h to ΔE·Δt = h? Why are energy and time another Fourier pair? To answer that question, we need to clearly define what energy and what time we are talking about. The argument revolves around the second de Broglie relation: E = h·f. How do we go from the momentum p to the energy E? And how do we go from the wavelength λ to the frequency f?

The answer to the first question is the energy-mass equivalence: E = mc², always. This formula is relativistic, as m is the relativistic mass, so it includes the rest mass m₀ as well as the equivalent mass of its kinetic energy m₀v²/2 + … [Note, indeed, that the kinetic energy – defined as the excess energy over its rest energy – is a rapidly converging series of terms, so only the m₀v²/2 term is mentioned.] Likewise, momentum is defined as p = mv, always, with m the relativistic mass, i.e. m = (1 − v²/c²)^(−1/2)·m₀ = γ·m₀, with γ the Lorentz factor. The E = mc² and p = mv relations combined give us the E/c = m·c = p·c/v or E·v/c = p·c relationship, which we can also write as E/p = c²/v. However, we’ll need to write E as a function of p for the purpose of a derivation. You can verify that E² − p²c² = m₀²c⁴ and, hence, that E = (p²c² + m₀²c⁴)^(1/2).

Now, to go from a wavelength to a frequency, we need the wave velocity, and we’re obviously talking the phase velocity here, so we write: v_p = λ·f. That’s where the de Broglie hypothesis comes in: de Broglie just assumed the Planck-Einstein relation E = h·ν, in which ν is the frequency of a massless photon, would also be valid for massive particles, so he wrote: E = h·f. It’s just a hypothesis, of course, but it makes everything come out alright. More in particular, the phase velocity v_p = λ·f can now be re-written, using both de Broglie relations (i.e. h/p = λ and E/h = f) as v_p = (h/p)·(E/h) = E/p = c²/v. Now, because v is always smaller than c for massive particles (and usually very much smaller), we’re talking a superluminal phase velocity here! However, because it doesn’t carry any signal, it’s not inconsistent with relativity theory.

Now what about the group velocity? To calculate the group velocity, we need the frequencies and wavelengths of the component waves. The dispersion relation assumes the frequency of each component wave can be expressed as a function of its wavelength, so f = f(λ). Now, it takes a bit of wave mechanics (which I won’t elaborate on here) to show that the group velocity is the derivative of f with respect to the inverse wavelength 1/λ, so we write v_g = ∂f/∂(1/λ). Using the two de Broglie relations (1/λ = p/h and f = E/h), we get: v_g = ∂f/∂(1/λ) = ∂(E/h)/∂(p/h) = ∂E/∂p = ∂[(p²c² + m₀²c⁴)^(1/2)]/∂p. Now, when you write it all out, you should find that v_g = pc²/E = c²/v_p = v, so that’s the classical velocity of our particle once again.
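Both velocities can be checked numerically from E = (p²c² + m₀²c⁴)^(1/2). The sketch below works in units where c = 1 and m₀ = 1 (my choice, purely for readability), picks v = 0.6·c as an arbitrary example, and estimates ∂E/∂p with a central finite difference rather than the analytic derivative.

```python
import math

C = 1.0    # speed of light in natural units (assumption for readability)
M0 = 1.0   # rest mass in the same units

def energy(p):
    """Relativistic energy as a function of momentum."""
    return math.sqrt(p**2 * C**2 + M0**2 * C**4)

def momentum(v):
    """Relativistic momentum p = gamma * m0 * v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C)**2)
    return gamma * M0 * v

def phase_velocity(p):
    """v_p = E/p, from the two de Broglie relations."""
    return energy(p) / p

def group_velocity(p, dp=1e-6):
    """v_g = dE/dp, estimated with a central finite difference."""
    return (energy(p + dp) - energy(p - dp)) / (2.0 * dp)

v = 0.6 * C            # arbitrary example speed
p = momentum(v)
v_phase = phase_velocity(p)
v_group = group_velocity(p)

print(f"classical velocity v = {v:.4f}")
print(f"phase velocity       = {v_phase:.4f}  (c^2/v = {C**2 / v:.4f})")
print(f"group velocity       = {v_group:.4f}")
```

The group velocity comes back as the classical velocity, while the phase velocity exceeds c, and their product equals c².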

Phew! Complicated! Yes. But so we still don’t have our ΔE·Δt = h expression! All of the above tells us how we can associate a range of momenta (Δp) with a range of wavelengths (Δλ) and, in turn, with a frequency range (Δf) which then gives us some energy range (ΔE), so the logic is like:

Δp ⇒ Δλ ⇒ Δf ⇒ ΔE

Somehow, the same sequence must also ‘transform’ our Δx into Δt. I googled a bit, but I couldn’t find any clear explanation. Feynman doesn’t seem to have one in his Lectures either so, frankly, I gave up. What I did do in one of my previous posts, is to give some interpretation. However, I am not quite sure if it’s really the interpretation: there are probably several. It must have something to do with the period of a wave, but I’ll let you break your head over it. 🙂 As far as I am concerned, it’s just one of the other unexplained questions I have as I sort of close my study of ‘classical’ physics. So I’ll just make a mental note of it. [Of course, please don’t hesitate to send me your answer, if you have one!] Now it’s time to really dig into quantum mechanics, so I should really stay silent for quite a while now! 🙂

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology.

# A post for my kids: on energy and potential

Pre-scriptum (dated 26 June 2020): These posts on elementary math and physics for my kids (they are 21 and 23 now and no longer need such explanations) have not suffered much from the attack by the dark force – which is good because I still like them. While my views on the true nature of light, matter and the force or forces that act on them have evolved significantly as part of my explorations of a more realist (classical) explanation of quantum mechanics, I think most (if not all) of the analysis in this post remains valid and fun to read. In fact, I find the simplest stuff is often the best. 🙂

Original post:

We’ve been juggling with a lot of advanced concepts in the previous post. Perhaps it’s time I write something that my kids can understand too. One of the things I struggled with when re-learning elementary physics is the concept of energy. What is energy really? I always felt my high school teachers did a poor job in trying to explain it. So let me try to do a better job here.

A high-school level course usually introduces the topic using the gravitational force, i.e. Newton’s law of universal gravitation: F = GmM/r². This law states that the force of attraction is proportional to the product of the masses m and M, and inversely proportional to the square of the distance r between those two masses. The factor of proportionality is equal to G, i.e. the so-called universal gravitational constant, aka the ‘big G’ (G ≈ 6.674×10⁻¹¹ N·(m/kg)²), as opposed to the ‘little g’, which is the gravity of Earth (g ≈ 9.80665 m/s²). As far as I am concerned, it is at this point where my high-school teacher failed.

Indeed, he would just go on and simplify Newton’s law of gravitation by writing F = mg, noting that g = GM/r² and that, for all practical purposes, this g factor is constant, because we are talking small distances as compared to the radius of the Earth. Hence, we should just remember that the gravitational force is proportional to the mass only, and that one kilogram amounts to a weight of about 10 newton (9.80665 kg·m/s² (N) to be precise). That simplification would then be followed by another simplification: if we are lifting an object with mass m, we are doing work against the gravitational force. How much work? Well, he’d say, work is – quite simply – the force times the distance in physics, and the work done against the force is the potential energy (usually denoted by U) of that object. So he would write U = Fh = mgh, with h the height of the object (as measured from the surface of the Earth), and he would draw a nice linear graph like the one below (I set m to 10 kg here, and h ranges from 0 to 100 m).

Note that the slope of this line is slightly less than 45 degrees (and also note, of course, that it’s only approximately 45 degrees because of our choice of scale: dU/dh is equal to 98.0665, so if the x and y axes would have the same scale, we’d have a line that’s almost vertical).
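For what it’s worth, the numbers behind that linear graph are easy to reproduce. The sketch below uses the same m = 10 kg and the standard value of g quoted above; the function name and the sample heights are just my choices for illustration.

```python
G_EARTH = 9.80665  # standard gravity, m/s^2

def potential_energy_linear(mass_kg, height_m, g=G_EARTH):
    """High-school approximation U = m*g*h, valid near the Earth's surface."""
    return mass_kg * g * height_m

MASS = 10.0  # kg, as in the graph
slope = potential_energy_linear(MASS, 1.0)       # dU/dh in J/m
u_at_100m = potential_energy_linear(MASS, 100.0)

for h in (0.0, 25.0, 50.0, 100.0):
    print(f"h = {h:6.1f} m  ->  U = {potential_energy_linear(MASS, h):8.2f} J")
```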

So what’s wrong with this graph? Nothing. It’s just that this graph sort of got stuck in my head, and it complicated a more accurate understanding of energy. Indeed, with examples like the one above, one tends to forget that:

1. Such linear graphs are an approximation only. In reality, the gravitational field, and force fields in general, are not uniform and, hence, g is not a constant: the graph below shows how g varies with the height (but the height is expressed in kilometer this time, not in meter).
2. Not only is potential energy usually not a linear function but – equally important – it is usually not a positive real number either. In fact, in physics, U will usually take on a negative value. Why? Because we’re indeed measuring and defining it by the work done against the force.
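To see how quickly the ‘constant g’ approximation breaks down, here is a small sketch of g = GM/r² as a function of height. The Earth’s mass and mean radius are textbook values that I am adding as assumptions (they are not given above), and the Earth is treated as a perfect sphere.

```python
BIG_G = 6.674e-11        # universal gravitational constant, N*(m/kg)^2
EARTH_MASS = 5.972e24    # kg (assumed textbook value)
EARTH_RADIUS = 6.371e6   # m, mean radius (assumed textbook value)

def gravity_at_height(height_m):
    """g = G*M / r^2 with r measured from the Earth's center."""
    r = EARTH_RADIUS + height_m
    return BIG_G * EARTH_MASS / r**2

g_surface = gravity_at_height(0.0)
g_1000km = gravity_at_height(1.0e6)

for h_km in (0, 10, 100, 1000, 10000):
    print(f"h = {h_km:6d} km  ->  g = {gravity_at_height(h_km * 1000.0):.3f} m/s^2")
```

At 100 m the change is negligible, which is why the linear mgh graph works so well for everyday heights, but at 1000 km g has already dropped by roughly a quarter.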

So what’s the more accurate view of things? Well… Let’s start by noting that potential energy is defined in relation to some reference point and, taking a more universal point of view, that reference point will usually be infinity when discussing the gravitational (or electromagnetic) force of attraction. Now, the potential energy of the point(s) at infinity – i.e. the reference point – will, usually, be equated with zero. Hence, the potential energy curve will then take the shape of the graph below (y = −1/x), so U will vary from zero (0) to minus infinity (−∞), as we bring the two masses closer together. You can readily see that the graph below makes sense: its slope is positive and, hence, as such it does capture the same idea as that linear mgh graph above: moving a mass from point 1 to point 2 requires work and, hence, the potential energy at point 2 is higher than at point 1, even if both values U(2) and U(1) are negative numbers, unlike the values of that linear mgh curve.
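A tiny sketch of that y = −1/x shape may make the sign business more concrete. It uses arbitrary units with the strength of the interaction set to 1 (my simplification), so only the signs and the ordering of the values matter here.

```python
def potential(r, strength=1.0):
    """Potential energy of an attractive inverse-square force, zero at infinity."""
    return -strength / r

u_point_1 = potential(1.0)   # closer in
u_point_2 = potential(2.0)   # farther out
work_1_to_2 = u_point_2 - u_point_1  # work done against the force

print(f"U(1) = {u_point_1}, U(2) = {u_point_2}")
print(f"work needed to move from point 1 to point 2: {work_1_to_2}")
print(f"U far away: {potential(1.0e9)}")
```

Both values are negative, yet U(2) is higher than U(1), and the difference is the (positive) work done against the force.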

How do you get a curve like that? Well… I should first note another convention which is essential for making the sign come out alright: if the force is gravity, then we should write **F** = −GmM**r**/r³. So we have a minus sign here. And please do note the boldface type: **F** and **r** are vectors, and vectors have both a direction and magnitude – and so that’s why they are denoted by a bold letter (**r**), as opposed to the scalar quantities G, m, M or r.

Back to the minus sign. Why do we have that here? Well… It has to do with the direction of the force, which, in case of attraction, will be opposite to the so-called radius vector **r**. Just look at the illustration below, which shows, first, the direction of the force between two opposite electric charges (top) and then (bottom), the force between two masses, let’s say the Earth and the Moon.

So it’s a matter of convention really.

Now, when we’re talking the electromagnetic force, you know that likes repel and opposites attract, so two charges with the same sign will repel each other, and two charges with opposite sign will attract each other. So **F**_12, i.e. the force on q_2 because of the presence of q_1, will be equal to **F**_12 = q_1·q_2·**r**/r³. Therefore, no minus sign is needed here because q_1 and q_2 are opposite and, hence, the sign of this product will be negative. Therefore, we know that the direction of **F** comes out alright: it’s opposite to the direction of the radius vector **r**. So the force on a charge q_2 which is placed in an electric field produced by a charge q_1 is equal to **F**_12 = q_1·q_2·**r**/r³. In short, no minus sign needed here because we already have one. Of course, the original charge q_1 will be subject to the very same force and so we should write **F**_21 = −q_1·q_2·**r**/r³. So we’ve got that minus sign again now. In general, however, we’ll write **F**_ij = q_i·q_j·**r**/r³ when dealing with the electromagnetic force, so that’s without a minus sign, because the convention is to draw the radius vector from charge i to charge j and, hence, the radius vector **r** in the formula **F**_21 would point in the other direction and, hence, the minus sign is not needed.

In short, because of the way that the electromagnetic force works, the sign always comes out right: there is no need for a minus sign in front. However, for gravity, there are no opposite charges: masses are always alike, and so likes actually attract when we’re talking gravity, and so that’s why we need the minus sign when dealing with the gravitational force: the force between a mass i and another mass j will always be written as **F**_ij = −m_i·m_j·**r**/r³, so here we do have to put the minus sign, because the direction of the force needs to be opposite to the direction of the radius vector and so the sign of the ‘charges’ (i.e. the masses in this case), in the case of gravity, does not take care of that.

One last remark here may be useful: always watch out to not double-count forces when considering a system with many charges or many masses: both charges (or masses) feel the same force, but with opposite direction. OK. Let’s move on. If you are confused, don’t worry. Just remember that (1) it’s very important to be consistent when drawing that radius vector (it goes from the charge (or mass) causing the force field to the other charge (or mass) that is being brought in), and (2) that the gravitational and electromagnetic forces have a lot in common in terms of ‘geometry’ – notably that inverse proportionality relation with the square of the distance between the two charges or masses – but that we need to put a minus sign when we’re dealing with the gravitational force because, with gravitation, likes do not repel but attract each other, as opposed to electric charges.

Now, let’s move on indeed and get back to our discussion of potential energy. Let me copy that potential energy curve again and let’s assume we’re talking electromagnetics here, and that we have two opposite charges, so the force is one of attraction.

Hence, if we move one charge away from the other, we are doing work against the force. Conversely, if we bring them closer to each other, we’re working with the force and, hence, its potential energy will go down – from zero (i.e. the reference point) to… Well… Some negative value. How much work is being done? Well… The force changes all the time, so it’s not constant and so we cannot just calculate the force times the distance (F·s). We need to do one of those infinite sums, i.e. an integral, and so, for point 1 in the graph above, we can write:

Why the minus sign? Well… As said, we’re not increasing potential energy: we’re decreasing it, from zero to some negative value. If we’d move the charge from point 1 to the reference point (infinity), then we’d be doing work against the force and we’d be increasing potential energy. So then we’d have a positive value. If this is difficult, just think it through for a while and you’ll get there.

Now, this integral is somewhat special because F and s are vectors, and the F·ds product above is a so-called dot product between two vectors. The integral itself is a so-called path integral, and so you may not have learned how to solve this one. But let me explain the dot product at least: the dot product of two vectors is the product of the magnitudes of those two vectors (i.e. their lengths) times the cosine of the angle between the two vectors:

F·ds = |F||ds|cosθ

Why that cosine? Well… To go from one point to another (from point 0 to point 1, for example), we can take any path really. [In fact, it is not so obvious that all paths will yield the same value for the potential energy: that is the case for so-called conservative forces only. But gravity and the electromagnetic force are conservative forces and so, yes, we can take any path and we will find the same value.] Now, if the direction of the force and the direction of the displacement are the same, then that angle θ will be equal to zero and, hence, the dot product is just the product of the magnitudes (cos(0) = 1). However, if the direction of the force and the direction of the displacement are not the same, then it’s only the component of the force in the direction of the displacement that’s doing work, and the magnitude of that component is Fcosθ. So there you are: that explains why we need that cosine function.
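A small numerical illustration may help here. The vectors below are arbitrary values chosen for the example, not anything from the graphs in this post; the sketch just checks that the component-wise dot product and the |F||ds|cosθ form give the same work.

```python
import math


def dot(a, b):
    """Component-wise dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Magnitude (length) of a vector."""
    return math.sqrt(dot(a, a))


F = (3.0, 4.0, 0.0)    # force, magnitude 5 (arbitrary example values)
ds = (2.0, 0.0, 0.0)   # displacement along the x-axis, magnitude 2

work = dot(F, ds)                          # F·ds
cos_theta = work / (norm(F) * norm(ds))    # cosine of the angle between them
component = norm(F) * cos_theta            # part of F along the displacement

print(work, cos_theta, component)
```

Only the x-component of the force (3 out of a magnitude of 5) does work along this displacement, which is exactly the Fcosθ factor.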

Now, solving that ‘special’ integral is not so easy because the distance between the two charges at point 0 is zero and, hence, when we try to solve the integral by putting in the formula for F and finding the primitive and all that, you’ll find there’s a division by zero involved. Of course, there’s a way to solve the integral, but I won’t do it here. Just accept the general result here for U(r):

U(r) = q₁q₂/4πε₀r

You can immediately see that, because we’re dealing with opposite charges, U(r) will always be negative, while the limit of this function for r going to infinity is equal to zero indeed. Conversely, its limit equals −∞ for r going to zero. As for the 4πε₀ factor in this formula, that factor plays the same role as the G-factor for gravity. Indeed, ε₀ is a ubiquitous electric constant: ε₀ ≈ 8.854×10⁻¹² F/m, but it can be absorbed in the value of the charges by choosing another unit and, hence, it’s often omitted, and that’s what I’ll also do here. Now, the same formula obviously applies to point 2 in the graph as well, and so now we can calculate the difference in potential energy between point 1 and point 2:

Does that make sense? Yes. We’re, once again, doing work against the force when moving the charge from point 1 to point 2. So that’s why we have a minus sign in front. As for the signs of q₁ and q₂, remember these are opposite. As for the value of the (r₂ − r₁) factor, that’s obviously positive because r₂ > r₁. Hence, ΔU = U(1) − U(2) is negative. How do we interpret that? U(1) and U(2) are both negative values, and the difference U(1) − U(2) is negative as well? Well… Just remember that ΔU, as defined here, is minus the work done to move the charge from point 1 to point 2. Hence, ΔU is some negative value because the amount of work that needs to be done to move the charge from point 1 to point 2 is decidedly positive. So, yes, the charge has a higher energy level at point 2 than at point 1 (albeit a negative one, but that’s just because of our convention, which equates potential energy at infinity with zero).
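To make the signs concrete, here is a rough numerical check. The two distances (0.5 and 1 angstrom) are assumptions picked for the illustration, the helper names are invented, and the 4πε₀ factor is put back in so that the values come out in joules. The sketch integrates the work done against the force from point 1 to point 2 and compares it with the difference of the two U(r) values.

```python
import math

EPS0 = 8.854e-12        # electric constant, F/m
E_CHARGE = 1.602e-19    # elementary charge, C


def potential_energy(q1, q2, r):
    """U(r) = q1*q2 / (4*pi*eps0*r), zero at infinity."""
    return q1 * q2 / (4 * math.pi * EPS0 * r)


def force_radial(q1, q2, r):
    """Radial component of the Coulomb force (negative means attraction)."""
    return q1 * q2 / (4 * math.pi * EPS0 * r**2)


def work_against_force(q1, q2, r_start, r_end, steps=100_000):
    """Approximate the integral of -F dr from r_start to r_end (midpoint rule)."""
    dr = (r_end - r_start) / steps
    total = 0.0
    for i in range(steps):
        r_mid = r_start + (i + 0.5) * dr
        total += -force_radial(q1, q2, r_mid) * dr
    return total


r1, r2 = 0.5e-10, 1.0e-10     # assumed positions of point 1 and point 2, in m
u1 = potential_energy(E_CHARGE, -E_CHARGE, r1)
u2 = potential_energy(E_CHARGE, -E_CHARGE, r2)
w12 = work_against_force(E_CHARGE, -E_CHARGE, r1, r2)

print(u1 / E_CHARGE, u2 / E_CHARGE, w12 / E_CHARGE)   # values in eV
```

Both energies are negative, U(2) is the higher of the two, and the work done against the force equals U(2) − U(1). With the definition used in the text, ΔU = U(1) − U(2), that is minus the work.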

What about gravity? Well… That linear graph above is an approximation, we said, and it also takes r = h = 0 as the reference point, but it assigns a value of zero to the potential energy there (as opposed to the −∞ value for the electromagnetic force above). So that graph is actually a linearization of a graph resembling the one below: we only start counting when we are on the Earth’s surface, so to say.

However, in a more advanced physics course, you will probably see the following potential energy function for gravity: U(r) = −GMm/r, and the graph of this function looks exactly the same as that graph we found for the potential energy between two opposite charges: the curve starts at point (0, −∞) and ends at point (∞, 0).
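How good is the linear approximation? A quick sketch, assuming rounded textbook values for the mass and radius of the Earth (they are not given in this post), compares the difference in U(r) = −GMm/r between the surface and a height h with the familiar mgh.

```python
G = 6.674e-11        # gravitational constant, N·m²/kg²
M_EARTH = 5.972e24   # mass of the Earth, kg (rounded)
R_EARTH = 6.371e6    # mean radius of the Earth, m (rounded)


def u_gravity(m, r):
    """Potential energy of a mass m at distance r from the Earth's centre."""
    return -G * M_EARTH * m / r


g = G * M_EARTH / R_EARTH**2    # surface acceleration implied by the values above
m = 1.0                         # test mass, kg

rel_errors = []
for h in (10.0, 1_000.0, 100_000.0):
    exact = u_gravity(m, R_EARTH + h) - u_gravity(m, R_EARTH)
    linear = m * g * h
    rel_errors.append((linear - exact) / exact)
    print(h, exact, linear, rel_errors[-1])
```

The relative error of mgh works out to h/R, so the straight line is an excellent approximation as long as h is small compared to the radius of the Earth.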

OK. Time to move on to another illustration or application: the covalent bond between two hydrogen atoms.

Application: the covalent bond between two hydrogen atoms

The graph below shows the potential energy as a function of the distance between two hydrogen atoms. Don’t worry about its exact mathematical shape: just try to understand it.

Natural hydrogen comes in H₂ molecules, so there is a bond between two hydrogen atoms as a result of mutual attraction. The attraction involved is a chemical bond: the two hydrogen atoms share their so-called valence electron, thereby forming a so-called covalent bond (which is a form of chemical bond indeed, as you should remember from your high-school courses). However, one cannot push two hydrogen atoms too close together, because then the positively charged nuclei will start repelling each other, and that’s what is depicted above: the potential energy goes up very rapidly because the two atoms will repel each other very strongly.

The right half of the graph shows how the force of attraction vanishes as the two atoms are separated. Beyond a certain distance, the potential energy no longer increases, and the two atoms are free.

Again, the reference point does not matter very much: in the graph above, the potential energy is assumed to be zero at infinity (i.e. the ‘free’ state) but we could have chosen another reference point: it would only shift the graph up or down.

This brings us to another point: the law of energy conservation. For that, we need to introduce the concept of kinetic energy once again.

The formula for kinetic energy

In one of my previous posts, I defined the kinetic energy of an object as the excess energy over its rest energy:

K.E. = T = mc² − m₀c² = γm₀c² − m₀c² = (γ−1)m₀c²

γ is the Lorentz factor in this formula (γ = 1/√(1 − v²/c²)), and I derived the T = mv²/2 formula for the kinetic energy from a Taylor expansion of the formula above, noting that K.E. = mv²/2 is actually an approximation for non-relativistic speeds only, i.e. speeds that are much less than c and, hence, have no impact on the mass of the object. So non-relativistic means that, for all practical purposes, m = m₀. Now, if m = m₀, then mc² − m₀c² is equal to zero! So how do we derive the kinetic energy formula for non-relativistic speeds then? Well… We must apply another method, using Newton’s Law: the force equals the time rate of change of the momentum of an object. The momentum of an object is denoted by p (it’s a vector quantity) and is the product of its mass and its velocity (p = mv), so we can write

F = d(mv)/dt (again, all bold letters denote vectors).

When the speed is low (i.e. non-relativistic), we can just treat m as a constant and so we can write F = mdv/dt = ma (the mass times the acceleration). If m were not constant, then we would have to apply the product rule: d(mv)/dt = (dm/dt)v + m(dv/dt), and so then we would have two terms instead of one. Treating m as a constant also allows us to derive the classical (Newtonian) formula for kinetic energy:

So if we assume that the velocity of the object at point O is equal to zero (so vₒ = 0), then ΔT will be equal to T and we get what we were looking for: the kinetic energy at point P will be equal to T = mv²/2.
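The formula for that derivation is missing from this page. What follows is a standard reconstruction, not necessarily the expression originally shown: it assumes constant m, uses ds = v dt, and writes v′ for the integration variable (a symbol introduced here only for the sketch).

```latex
\begin{aligned}
\Delta T &= \int_O^P \mathbf{F}\cdot d\mathbf{s}
          = \int_O^P m\,\frac{d\mathbf{v}}{dt}\cdot\mathbf{v}\,dt \\
         &= m\int_{v_O}^{v} v'\,dv'
          = \tfrac{1}{2}mv^2 - \tfrac{1}{2}mv_O^2
\end{aligned}
```

The second line follows because v·dv = d(v·v)/2, so only the change in the magnitude of the velocity matters.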

Energy conservation

Now, the total energy (potential and kinetic) of an object (or a system) has to remain constant, so we have E = T + U = constant. As a consequence, the time derivative of the total energy must equal zero. So we have:

E = T + U = constant, and dE/dt = 0

Can we prove that with the formulas T = mv²/2 and U = q₁q₂/4πε₀r? Yes, but the proof is a bit lengthy and so I won’t prove it here. [We need to take the time derivatives dT/dt and dU/dt and show that these derivatives are equal except for the sign, which is opposite, so that the sum of those two derivatives equals zero. Note that dT/dt = (dT/dv)(dv/dt) and that dU/dt = (dU/dr)(dr/dt), so you have to use the chain rule for derivatives here.] So just take a mental note of that and accept the result:

(1) mv²/2 + q₁q₂/4πε₀r = constant when the electromagnetic force is involved (no minus sign, because the sign of the charges makes things come out alright), and
(2) mv²/2 − GMm/r = constant when the gravitational force is involved (note the minus sign, for the reason mentioned above: when the gravitational force is involved, we need to reverse the sign).
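For readers who do want to see the argument, here is one compact way to write it. It is a sketch under two assumptions: the mass is constant, and the force is related to the potential energy by F = −∇U, a relation this post only introduces further down.

```latex
\begin{aligned}
\frac{dT}{dt} &= \frac{d}{dt}\left(\tfrac{1}{2}m\,\mathbf{v}\cdot\mathbf{v}\right)
               = m\,\mathbf{v}\cdot\frac{d\mathbf{v}}{dt}
               = \mathbf{F}\cdot\mathbf{v} \\
\frac{dU}{dt} &= \nabla U\cdot\frac{d\mathbf{r}}{dt}
               = -\mathbf{F}\cdot\mathbf{v} \\
\frac{dE}{dt} &= \frac{dT}{dt} + \frac{dU}{dt} = 0
\end{aligned}
```

The two derivatives are indeed equal except for the sign, whatever the specific form of U(r), as long as U depends on position only.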

We can also take another example: an oscillating spring. When you try to compress a (linear) spring, the spring will push back with a force of magnitude kx (as a restoring force, F = −kx). Hence, the energy needed to compress a (linear) spring a distance x from its equilibrium position can be calculated from the same integral/infinite sum formula: you will get U = kx²/2 as a result. Indeed, this is an easy integral (not a path integral), and so let me quickly solve it:
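The worked integral is missing from this page. A sketch of what it amounts to, assuming the spring starts at its equilibrium position x = 0 and writing x′ for the integration variable (a symbol introduced here only for the illustration):

```latex
U(x) = \int_0^x k\,x'\,dx' = \left[\tfrac{1}{2}k\,x'^2\right]_0^x = \tfrac{1}{2}kx^2
```

The integrand is the force kx′ we have to apply against the spring at each intermediate position.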

While that U = kx²/2 formula looks similar to the kinetic energy formula, you should note that it’s a function of the position, not of the velocity, and that the formula does not involve the mass of the object we’re attaching to the spring. So it’s a different animal altogether. However, because of the energy conservation law, the graphs of the potential and kinetic energy will obviously mirror each other, just like the energy graphs of a swinging pendulum, as shown below. We have:

T + U = mv²/2 + kx²/2 = C

Note: The graph above mentions an ‘ideal’ pendulum because, in reality, there will be an energy loss due to friction and, hence, the pendulum will slowly stop, as shown below. Hence, in reality, energy is conserved, but it leaks out of the system we are observing here: it gets lost as heat, which is another form of kinetic energy actually.
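The way kinetic and potential energy trade places can also be tabulated. The sketch below is for the ideal (frictionless) spring, assuming the standard solution x(t) = A·cos(ωt) with ω = √(k/m); the mass, spring constant and amplitude are arbitrary example values.

```python
import math

m = 0.5            # mass, kg (example value)
k = 8.0            # spring constant, N/m (example value)
amplitude = 0.1    # maximum displacement, m (example value)
omega = math.sqrt(k / m)    # angular frequency of the oscillation


def energies(t):
    """Return (kinetic, potential) energy at time t for x = A*cos(omega*t)."""
    x = amplitude * math.cos(omega * t)
    v = -amplitude * omega * math.sin(omega * t)
    kinetic = 0.5 * m * v**2
    potential = 0.5 * k * x**2
    return kinetic, potential


total_expected = 0.5 * k * amplitude**2      # all energy is potential at t = 0
samples = [energies(i * 0.05) for i in range(40)]
totals = [kin + pot for kin, pot in samples]

for kin, pot in samples[:5]:
    print(round(kin, 5), round(pot, 5), round(kin + pot, 5))
```

Each row sums to the same constant C = kA²/2: whatever the potential energy loses, the kinetic energy gains, and vice versa.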

Another application: estimating the radius of an atom

A very nice application of the energy concepts introduced above is the so-called Bohr model of a hydrogen atom. Feynman introduces that model as an estimate of the size (or radius) of an atom (see Feynman’s Lectures, Vol. III, p. 2-6). The argument is the following.

The radius of an atom is more or less the spread (usually denoted by Δ or σ) in the position of the electron, so we can write that Δx = a. In words, the uncertainty about the position is the radius a. Now, we know that the uncertainty about the position (x) also determines the uncertainty about the momentum (p = mv) of the electron because of the Uncertainty Principle ΔxΔp ≥ ħ/2 (ħ ≈ 6.6×10⁻¹⁶ eV·s). The principle is illustrated below, and in a previous post I proved the relationship. [Note that k in the left graph actually represents the wave number of the de Broglie wave, but wave number and momentum are related through the de Broglie relation p = ħk.]

Hence, the order of magnitude of the momentum of the electron will, very roughly, be p ≈ ħ/a. [Note that Feynman doesn’t care about factors 2 or π or even 2π (h = 2πħ): the idea is just to get the order of magnitude (Feynman calls it a ‘dimensional analysis’), and that he actually equates p with p = h/a, so he doesn’t use the reduced Planck constant (ħ).]

Now, the electron’s potential energy will be given by that U(r) = q₁q₂/4πε₀r formula above, with q₁ = e (the charge of the proton) and q₂ = −e (i.e. the charge of the electron). With r = a and the 4πε₀ factor omitted, this simplifies to −e²/a.

The kinetic energy of the electron is given by the usual formula: T = mv²/2. This can be written as T = mv²/2 = m²v²/2m = p²/2m = h²/2ma². Hence, the total energy of the electron is given by

E = T + U = h²/2ma² − e²/a

What does this say? It says that the potential energy becomes smaller as a gets smaller (that’s because of the minus sign: when we say ‘smaller’, we actually mean a larger negative value). However, as the electron gets closer to the nucleus, its kinetic energy increases. In fact, the shape of this function is similar to that graph depicting the potential energy of a covalent bond as a function of the distance, but you should note that the blue graph below is the total energy (so it’s not only potential energy but kinetic energy as well).

I guess you can now anticipate the rest of the story. The electron will be there where its total energy is minimized. Why? Well… We could call it the minimum energy principle, but that’s usually used in another context (thermodynamics). Let me just quote Feynman here, because I don’t have a better explanation: “We do not know what a is, but we know that the atom is going to arrange itself to make some kind of compromise so that the energy is as little as possible.”

He then calculates, as expected, the derivative dE/da, which equals dE/da = −h²/ma³ + e²/a². Setting dE/da equal to zero, we get the ‘optimal’ value for a:

a₀ = h²/me² = 0.528×10⁻¹⁰ m = 0.528 Å (angstrom)

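Note that this calculation depends on the value one uses for e: to be correct, we need to put the 4πε₀ factor back in. The number also only comes out if the Planck constant in the formula is taken to be the reduced constant ħ = h/2π: with the unreduced h, the result would be larger by a factor (2π)² ≈ 39.5. You also need to ensure you use proper and compatible units for all factors. Just try a couple of times and you should find that 0.528 value.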
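For those who would rather let a computer try, here is a minimal sketch in SI units. It rests on two assumptions that go beyond the formulas as written above: e² is replaced by e²/4πε₀, and the Planck constant is taken to be the reduced one, ħ. The constants are rounded CODATA values, and the variable names are made up for the example.

```python
import math

HBAR = 1.054571817e-34       # reduced Planck constant, J·s
M_E = 9.1093837e-31          # electron mass, kg
E_CHARGE = 1.602176634e-19   # elementary charge, C
EPS0 = 8.8541878128e-12      # electric constant, F/m

# 'e squared' with the 4*pi*eps0 factor put back in, in J·m
COULOMB = E_CHARGE**2 / (4 * math.pi * EPS0)


def total_energy(a):
    """E(a) = hbar^2 / (2 m a^2) - e^2 / a, in joules."""
    return HBAR**2 / (2 * M_E * a**2) - COULOMB / a


# Closed-form minimum from dE/da = 0
a0 = HBAR**2 / (M_E * COULOMB)
e0_ev = total_energy(a0) / E_CHARGE

# Cross-check: scan radii from 0.5*a0 to 1.5*a0 and pick the lowest energy
grid = [a0 * (0.5 + 0.001 * i) for i in range(1001)]
a_min = min(grid, key=total_energy)

print(a0 * 1e10, "angstrom")
print(e0_ev, "eV")
print(a_min / a0)
```

The closed form gives roughly 0.529 angstrom, the energy at that radius is roughly −13.6 eV, and the grid scan confirms that the energy is indeed lowest there.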

Of course, the question is whether or not this back-of-the-envelope calculation resembles anything real. It does: this number is very close to the so-called Bohr radius, which is the most probable distance between the proton and the electron in a hydrogen atom (in its ground state) indeed. The Bohr radius is an actual physical constant and has been measured to be about 0.529 angstrom. Hence, for all practical purposes, the above calculation corresponds with reality. [Of course, while Feynman started with writing that we shouldn’t trust our answer within factors like 2, π, etcetera, he concludes his calculation by noting that he used all constants in such a way that it happens to come out the right number. :-)]

The corresponding energy for this value for a can be found by putting the value a₀ back into the total energy equation, and then we find:

E₀ = −me⁴/2h² = −13.6 eV

Again, this corresponds to reality, because this is the energy that is needed to kick an electron out of its orbit or, to use proper language, this is the energy that is needed to ionize a hydrogen atom (it’s referred to as a Rydberg of energy). By way of conclusion, let me quote Feynman on what this negative energy actually means: “[Negative energy] means that the electron has less energy when it is in the atom than when it is free. It means it is bound. It means it takes energy to kick the electron out.”

That being said, as we pointed out above, it is all a matter of choosing our reference point: we can add any constant C to (or subtract it from) the energy equation, and E + C = T + U + C will still be constant and, hence, respect the energy conservation law. But I’ll conclude here and, of course, check if my kids understand any of this.

Oh – yes. I forgot. The title of this post suggests that I would also write something on what is referred to as ‘potential’, and it’s not the same as potential energy. So let me quickly do that.

By now, you are surely familiar with the idea of a force field. If we put a charge or a mass somewhere, then it will create a condition such that another charge or mass will feel a force. That ‘condition’ is referred to as the field, and one represents a field by field vectors. For a gravitational field, we can write:

F = mC

C is the field vector, and F is the force on the mass that we would ‘supply’ to the field for it to act on. Now, we can obviously re-write that integral for the potential energy as

U = −∫F·ds = −m∫C·ds = mΨ with Ψ (read: psi) = −∫C·ds = the potential

So we can say that the potential Ψ is the potential energy of a unit charge or a unit mass that would be placed in the field. Both C (a vector) and Ψ (a scalar quantity, i.e. a real number) obviously vary in space and in time and, hence, are functions of the space coordinates x, y and z as well as the time coordinate t. However, let’s leave time out for the moment, in order not to make things too complex. [And, of course, I should note that this psi has nothing to do with the probability wave function we introduced in previous posts. Nothing at all. It just happens to be the same symbol.]

Now, U is an integral, and so it can be shown that, if we know the potential energy, we also know the force. Indeed, the x-, y- and z-components of the force are equal to:

Fx = −∂U/∂x, Fy = −∂U/∂y, Fz = −∂U/∂z or, using the grad (gradient) operator: F = −∇U

Likewise, we can recover the field vectors C from the potential function Ψ:

Cx = −∂Ψ/∂x, Cy = −∂Ψ/∂y, Cz = −∂Ψ/∂z, or C = −∇Ψ

That grad operator is nice: it makes a vector function out of a scalar function.
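A numerical sketch shows the grad operator at work. It assumes the point-mass potential Ψ = −GM/r (which follows from U = −GMm/r and U = mΨ), rounded values for the Earth, and a simple central-difference approximation of the partial derivatives; the function names are invented for the example.

```python
G = 6.674e-11        # gravitational constant, N·m²/kg²
M_EARTH = 5.972e24   # mass of the Earth, kg (rounded)
R_EARTH = 6.371e6    # mean radius of the Earth, m (rounded)


def potential(x, y, z):
    """Gravitational potential of a point mass at the origin: -G*M/r."""
    r = (x * x + y * y + z * z) ** 0.5
    return -G * M_EARTH / r


def field_from_potential(x, y, z, h=1.0):
    """Approximate C = -grad(potential) with central differences of step h."""
    cx = -(potential(x + h, y, z) - potential(x - h, y, z)) / (2 * h)
    cy = -(potential(x, y + h, z) - potential(x, y - h, z)) / (2 * h)
    cz = -(potential(x, y, z + h) - potential(x, y, z - h)) / (2 * h)
    return cx, cy, cz


# A point on the surface, on the x-axis
C = field_from_potential(R_EARTH, 0.0, 0.0)
expected_cx = -G * M_EARTH / R_EARTH**2   # inverse-square law, pointing inward

print(C, expected_cx)
```

The scalar potential goes in and a vector comes out: about 9.8 N/kg, pointing back towards the origin, with no sideways components.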

In the ‘electrical case’, we will write:

F = qE

And, likewise,

U = −∫F·ds = −q∫E·ds = qΦ with Φ (read: phi) = −∫E·ds = the electrical potential.

Unlike the ‘psi’ potential, the ‘phi’ potential is well known to us, if only because it’s expressed in volts. In fact, when we say that a battery or a capacitor is charged to a certain voltage, we actually mean the voltage difference between the parallel plates of which the capacitor or battery consists, so we are actually talking about the difference in electrical potential ΔΦ = Φ₁ − Φ₂, which we also express in volts, just like the electrical potential itself.
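To get a feel for the numbers, here is a sketch that evaluates the potential of a single proton at the Bohr radius. It assumes the point-charge form Φ(r) = q/4πε₀r, which follows from U = qΦ and the U(r) formula used earlier; the helper name and the choice of distances are made up for the illustration.

```python
import math

EPS0 = 8.8541878128e-12      # electric constant, F/m
E_CHARGE = 1.602176634e-19   # elementary charge, C


def electric_potential(q_source, r):
    """Potential of a point charge q_source at distance r, in volts."""
    return q_source / (4 * math.pi * EPS0 * r)


a0 = 0.529e-10                                   # Bohr radius, m
phi = electric_potential(E_CHARGE, a0)           # potential at the Bohr radius
phi_far = electric_potential(E_CHARGE, 2 * a0)   # potential twice as far out

u_ev = (-E_CHARGE * phi) / E_CHARGE   # U = q*phi for the electron, in eV
delta_phi = phi - phi_far             # potential difference, in volts

print(phi, u_ev, delta_phi)
```

The potential at the Bohr radius is about 27.2 volts, so the electron’s potential energy there is about −27.2 eV, and the potential difference between a₀ and 2a₀ is about 13.6 volts.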

Post scriptum:

The model of the atom that is implied in the above derivation is referred to as the so-called Bohr model. It is a rather primitive model (Wikipedia calls it a ‘first-order approximation’) but, despite its limitations, it’s a proper quantum-mechanical view of the hydrogen atom and, hence, Wikipedia notes that “it is still commonly taught to introduce students to quantum mechanics.” Indeed, that’s why Feynman also uses it in one of his first Lectures on Quantum Mechanics (Vol. III, Chapter 2), before he moves on to more complex things.

Some content on this page was disabled on June 20, 2020 as a result of a DMCA takedown notice from Michael A. Gottlieb, Rudolf Pfeiffer, and The California Institute of Technology. You can learn more about the DMCA here: