The Liénard–Wiechert potentials and the solution for Maxwell’s equations

In my post on gauges and gauge transformations in electromagnetics, I mentioned the full and complete solution for Maxwell’s equations, using the electric and magnetic (vector) potential Φ and A. Feynman frames it nicely, so I should print it and put it on the kitchen door, so I can look at it every day. 🙂

I should print the wave equation we derived in our previous post too. Hmm… Stupid question, perhaps, but why is there no wave equation above? I mean: in the previous post, we said the wave equation was the solution for Maxwell’s equation, didn’t we? The answer is simple, of course: the wave equation is a solution for waves originating from some source and traveling through free space, so that’s a special case. Here we have everything. Those integrals ‘sweep’ all over space, and so that’s real space, which is full of moving charges and so there’s waves everywhere. So the solution above is far more general and captures it all: it’s the potential at every point in space, and at every point in time, taking into account whatever else is there, moving or not moving. In fact, it is the general solution of Maxwell’s equations.

How do we find it? Well… I could copy Feynman’s 21st Lecture but I won’t do that. The solution is based on the formula for Φ and A for a small blob of charge, and then the formulas above just integrate over all of space. That solution for a small blob of charge, i.e. a point charge really, was first deduced in 1898, by a French engineer: Alfred-Marie Liénard. However, his equations did not get much attention, apparently, because a German physicist, Emil Johann Wiechert, worked on the same thing and found the very same equations just two years later. That’s why they are referred to as the Liénard-Wiechert potentials, so they both get credit for it, even if both of them worked it out independently. These are the equations:

Now, you may wonder why I am mentioning them, and you may also wonder how we get those integrals above, i.e. our general solution for Maxwell’s equations, from them. You can find the answer to your second question in Feynman’s 21st Lecture. 🙂 As for the first question, I mention them because one can derive two other formulas for E and B from them. These are the formulas that Feynman uses in his first Volume, when studying light:

Now you’ll probably wonder how we can get these two equations from the Liénard-Wiechert potentials. They don’t look very similar, do they? No, they don’t. Frankly, I would like to give you the same answer as above, i.e. check it in Feynman’s 21st Lecture, but the truth is that the derivation is so long and tedious that even Feynman says one needs “a lot of paper and a lot of time” for that. So… Well… I’d suggest we just use all of those formulas and not worry too much about where they come from. If we can agree on that, we’re actually sort of finished with electromagnetism. All the chapters that follow Feynman’s 21st Lecture are applications indeed, so they do not add all that much to the core of the classical theory of electromagnetism.

So why did I write this post? Well… I am not sure. I guess I just wanted to sum things up for myself, so I can print it all out and put it on the kitchen door indeed. 🙂 Oh, and now that I think of it, I should add one more formula, and that’s the formula for spherical waves (as opposed to the plane waves we discussed in my previous post). It’s a very simple formula, and entirely what you’d expect to see:

The S function is the source function, and you can see that the formula is a Coulomb-like potential, but with the retarded argument. You’ll wonder: what is ψ? Is it E or B or what? Well… You can just substitute: ψ can be anything. Indeed, Feynman gives a very general solution for any type of spherical wave here. 🙂
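A minimal numerical sketch of that spherical wave, assuming the form ψ = S(t − r/c)/(4πr). The Gaussian `source` function, the value of `C` and the helper names are all illustrative assumptions, not anything from Feynman. For a spherically symmetric wave, r·ψ should obey the one-dimensional wave equation in r and t, and that is what `wave_residual` checks with finite differences.

```python
import math

C = 2.0  # wave speed in arbitrary units (assumed value for the sketch)

def source(t):
    """Hypothetical smooth source function S(t): a Gaussian pulse."""
    return math.exp(-t * t)

def psi(r, t):
    """Spherical wave: Coulomb-like 1/r factor with the retarded argument t - r/C."""
    return source(t - r / C) / (4.0 * math.pi * r)

def second_derivative(func, x, h=1e-3):
    """Central finite-difference estimate of the second derivative of func at x."""
    return (func(x + h) - 2.0 * func(x) + func(x - h)) / h**2

def wave_residual(r, t):
    """d²(r·psi)/dr² - (1/C²)·d²(r·psi)/dt², which should vanish for r > 0."""
    u_rr = second_derivative(lambda rr: rr * psi(rr, t), r)
    u_tt = second_derivative(lambda tt: r * psi(r, tt), t)
    return u_rr - u_tt / C**2

if __name__ == "__main__":
    print("residual at r = 3, t = 1.7:", wave_residual(3.0, 1.7))
    # The pulse peak arrives at distance r at time r/C, with amplitude 1/(4·pi·r).
    print("peak at r = 2:", psi(2.0, 2.0 / C), " peak at r = 4:", psi(4.0, 4.0 / C))
```

The residual is not exactly zero because of the finite-difference step, but it should be tiny compared to the individual second derivatives.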

So… That’s it, folks. That’s all there is to it. I hope you enjoyed it. 🙂

Addendum: Feynman’s equation for electromagnetic radiation

I talked about Feynman’s formula for electromagnetic radiation before, but it’s probably good to quickly re-explain it here. Note that it talks about the electric field only, as the magnetic field is so tiny and, in any case, if we have E then we can find B. So the formula is:

The geometry of the situation is depicted below. We have some charge q that, we assume, is moving through space, and so it creates some field E at point P. The e_r′ vector is the unit vector from P to the charge q, so it points at the charge. Well… It points to where the charge was a little while ago, i.e. at the time t – r′/c. Why? Well… Because the field needs some time to travel, we don’t know where q is right now, i.e. at time t. It might be anywhere. Perhaps it followed some weird trajectory during the time r′/c, like the trajectory below.

So our e_r′ vector moves as the charge moves, and so it will also have velocity and, likely, some acceleration, but what we measure for its velocity and acceleration, i.e. the d(e_r′)/dt and d²(e_r′)/dt² in that Feynman equation, is also the retarded velocity and the retarded acceleration. But look at the terms in the equation. The first two terms have a 1/r′² in them, so these two effects diminish with the square of the distance. The first term is just Coulomb’s Law (note that the minus sign in front takes care of the fact that like charges repel and so the E vector will point the other way). Well… It is and it isn’t, because of the retarded time argument, of course. And so we have the second term, which sort of compensates for that. Indeed, the d(e_r′)/dt is the time rate of change of e_r′ and, hence, if r′/c = Δt, then (r′/c)·d(e_r′)/dt is a first-order approximation of Δe_r′.

As Feynman puts it: “The second term is as though nature were trying to allow for the fact that the Coulomb effect is retarded, if we might put it very crudely. It suggests that we should calculate the delayed Coulomb field but add a correction to it, which is its rate of change times the time delay that we use. Nature seems to be attempting to guess what the field at the present time is going to be, by taking the rate of change and multiplying by the time that is delayed.” In short, the first two terms can be written as E = −(q/4πε₀)/r′²·[e_r′ + Δe_r′] and, hence, it’s a sort of modified Coulomb Law that sort of tries to guess what the electrostatic field at P should be based on (a) what it is right now, and (b) how q’s direction and velocity, as measured now, would change it.
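To get a feel for that ‘guess’, here is a small numerical sketch. It assumes a slow charge in uniform motion along the y-axis, an observer at P = (1, 0) and units where c = 1. The trajectory, the names and the numbers are all made up for illustration. It compares the retarded direction e_r′, and the corrected direction e_r′ + (r′/c)·d(e_r′)/dt, with the direction to where the charge actually is at the present time.

```python
import math

C = 1.0          # speed of light in units where c = 1 (assumption of the sketch)
V = 0.01         # hypothetical charge speed, much smaller than C
P = (1.0, 0.0)   # hypothetical observation point

def charge_position(t):
    """Hypothetical trajectory: uniform motion along the y-axis."""
    return (0.0, V * t)

def retarded_time(t, iterations=50):
    """Solve t' = t - r'(t')/C by fixed-point iteration."""
    t_ret = t
    for _ in range(iterations):
        qx, qy = charge_position(t_ret)
        t_ret = t - math.hypot(qx - P[0], qy - P[1]) / C
    return t_ret

def unit_vector_to(q):
    """Unit vector from P towards the point q, and the distance from P to q."""
    dx, dy = q[0] - P[0], q[1] - P[1]
    r = math.hypot(dx, dy)
    return (dx / r, dy / r), r

def retarded_unit_vector(t):
    """e_r' as seen at P at time t, together with the retarded distance r'."""
    return unit_vector_to(charge_position(retarded_time(t)))

def corrected_unit_vector(t, h=1e-4):
    """e_r' + (r'/C)·d(e_r')/dt, with the time derivative by central difference."""
    (ex, ey), r = retarded_unit_vector(t)
    (ex_p, ey_p), _ = retarded_unit_vector(t + h)
    (ex_m, ey_m), _ = retarded_unit_vector(t - h)
    delay = r / C
    return (ex + delay * (ex_p - ex_m) / (2.0 * h),
            ey + delay * (ey_p - ey_m) / (2.0 * h))

def distance(a, b):
    """Euclidean distance between two 2D vectors."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

e_now, _ = unit_vector_to(charge_position(0.0))   # direction to the present position
e_ret, _ = retarded_unit_vector(0.0)              # direction to the retarded position
e_corr = corrected_unit_vector(0.0)               # retarded direction plus correction

if __name__ == "__main__":
    print("error of retarded direction: ", distance(e_ret, e_now))
    print("error of corrected direction:", distance(e_corr, e_now))
```

For a slow charge, the corrected direction should land much closer to the present-time direction than the purely retarded one. That is the sense in which the second term ‘compensates’ for the delay.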

Now, the third term has a 1/c² factor in front but, unlike the other two terms, it has no 1/r′² factor, so this effect does not fall off with the square of the distance. So the formula below fully describes electromagnetic radiation, indeed, because it’s the only important term when we get ‘far enough away’, with ‘far enough’ meaning that the parts that go as the inverse square of the distance have fallen off so much that they’re no longer significant.

Of course, you’re smart, and so you’ll immediately note that, as r increases, that unit vector keeps wiggling but that effect will also diminish. You’re right. It does, but in a fairly complicated way. The acceleration of e_r′ has two components indeed. One is the transverse or tangential piece, because the end of e_r′ goes up and down, and the other is a radial piece because it stays on a sphere and so it changes direction. The radial piece is the smallest bit, and actually also varies as the inverse square of r when r is fairly large. The tangential piece, however, varies only inversely as the distance, so as 1/r. So, yes, the wigglings of e_r′ look smaller and smaller, inversely as the distance, but the tangential piece is and remains significant, because it does not vary as 1/r² but as 1/r only. That’s why you’ll usually see the law of radiation written in an even simpler way:

This law reduces the whole effect to the component of the acceleration that is perpendicular to the line of sight only. It assumes the distance is huge as compared to the distance over which the charge is moving and, therefore, that r′ and r can be equated for all practical purposes. It also notes that the tangential piece is all that matters, and so it equates d²(e_r′)/dt² with a_x/r. The whole thing is probably best illustrated as below: we have a generator driving charges up and down in G – so it’s an antenna really – and so we’ll measure a strong signal when putting the radiation detector D in position 1, but we’ll measure nothing in position 3. [The detector is, of course, another antenna, but with an amplifier for the signal.] But so here I am starting to talk about electromagnetic radiation once more, which was not what I wanted to do here, if only because Feynman does a much better job at that than I could ever do. 🙂
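A rough sketch of that simplified law, assuming the usual far-field form E(t) = −q·a_⊥(t − r/c)/(4πε₀c²r), with a_⊥ the component of the acceleration perpendicular to the line of sight. The drive frequency, the oscillation amplitude and the function names are illustrative assumptions only. The `theta` parameter is the angle between the line of sight and the direction in which the charge is driven, so a detector broadside to the antenna has theta = π/2 and a detector on the antenna’s axis has theta = 0.

```python
import math

EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
C = 299_792_458.0              # speed of light, m/s
Q = 1.602176634e-19            # charge, C (one elementary charge, for illustration)

OMEGA = 2.0 * math.pi * 1.0e8  # hypothetical drive frequency: 100 MHz
AMPLITUDE = 1.0e-3             # hypothetical oscillation amplitude, m

def acceleration(t):
    """Acceleration of a charge driven as x(t) = AMPLITUDE·cos(OMEGA·t)."""
    return -AMPLITUDE * OMEGA**2 * math.cos(OMEGA * t)

def radiated_field(r, t, theta=math.pi / 2):
    """Far-field E at distance r and time t: -q·a_perp(t - r/c)/(4·pi·eps0·c²·r).

    Only the part of the acceleration perpendicular to the line of sight,
    a·sin(theta), contributes to the radiation field.
    """
    a_perp = acceleration(t - r / C) * math.sin(theta)
    return -Q * a_perp / (4.0 * math.pi * EPSILON_0 * C**2 * r)

if __name__ == "__main__":
    for r in (100.0, 200.0, 400.0):
        # Evaluate at t = r/c so every distance sees the same retarded moment.
        print(f"r = {r:6.1f} m  E = {radiated_field(r, r / C):.3e} V/m")
    print("on the axis of motion:", radiated_field(100.0, 100.0 / C, theta=0.0))
```

Doubling the distance halves the field, so it falls off as 1/r rather than 1/r², and the detector on the axis of the motion picks up nothing.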

Traveling fields: the wave equation and its solutions

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. In addition, I note the dark force has amused himself by removing some material. So no use to read this. Read my recent papers instead. 🙂

Original post:

We’ve climbed a big mountain over the past few weeks, post by post, 🙂 slowly gaining height, and carefully checking out the various routes to the top. But we are there now: we finally fully understand how Maxwell’s equations actually work. Let me jot them down once more:

As for how real or unreal the E and B fields are, I gave you Feynman’s answer to it, so… Well… I can’t add to that. I should just note, or remind you, that we have a fully equivalent description of it all in terms of the electric and magnetic (vector) potential Φ and A, and so we can ask the same question about Φ and A. They explain real stuff, so they’re real in that sense. That’s what Feynman’s answer amounts to, and I am happy with it. 🙂

What I want to do here is show how we can get from those equations to some kind of wave equation: an equation that describes how a field actually travels through space. So… Well… Let’s first look at that very particular wave function we used in the previous post to prove that electromagnetic waves propagate with speed c, i.e. the speed of light. The fields were very simple: the electric field had a y-component only, and the magnetic field a z-component only. Their magnitudes, i.e. their magnitude where the field had reached, as it fills the space traveling outwards, were given in terms of J, i.e. the surface current density going in the positive y-direction, and the geometry of the situation is illustrated below.

The fields were, obviously, zero where the fields had not reached as they were traveling outwards. And, yes, I know that sounds stupid. But… Well… It’s just to make clear what we’re looking at here. 🙂

We also showed what the wave would look like if we would turn off its First Cause after some time T, so if the moving sheet of charge would no longer move after time T. We’d have the following pulse traveling through space, a rectangular shape really:

We can imagine more complicated shapes for the pulse, like the shape shown below. J goes from one unit to two units at time t = t₁ and then to zero at t = t₂. Now, the illustration on the right shows the electric field as a function of x at the time t shown by the arrow. We’ve seen this before when discussing waves: if the speed of travel of the wave is equal to c, then x is equal to x = c·t, and the pattern is as shown below indeed: it mirrors what happened at the source x/c seconds ago. So we write:

This idea of using the retarded time t′ = t − x/c in the argument of a wave function f – or, what amounts to the same, using x − ct – is key to understanding wave functions. I’ve explained this in very simple language in a post for my kids and, if you don’t get this, I recommend you check it out. What we’re doing, basically, is converting something expressed in time units into something expressed in distance units, or vice versa, using the velocity of the wave as the scale factor, so time and distance are both expressed in the same unit, which may be seconds, or meters.

To see how it works, suppose we add some time Δt to the argument of our wave function f, so we’re looking at f[x−c(t+Δt)] now, instead of f(x−ct). Now, f[x−c(t+Δt)] = f(x−ct−cΔt), so we’ll get a different value for our function, obviously! But it’s easy to see that we can restore our wave function f to its former value by also adding some distance Δx = cΔt to the argument. Indeed, if we do so, we get f[x+Δx−c(t+Δt)] = f(x+cΔt–ct−cΔt) = f(x–ct). You’ll say: t − x/c is not the same as x–ct. It is and it isn’t: any function of x–ct is also a function of t − x/c, because we can write f(x–ct) = f[−c·(t − x/c)] = F(t − x/c).

Here, I need to add something about the direction of travel. The pulse above travels in the positive x-direction, so that’s why we have x minus ct in the argument. For a wave traveling in the negative x-direction, we’ll have a wave function y = F(x+ct). In any case, I can’t dwell on this, so let me move on.
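The shift argument is easy to try out numerically. In the sketch below, the Gaussian shape `f`, the wave speed `C` and the sample point are arbitrary choices made for illustration: moving forward in time by Δt and forward in space by cΔt leaves a right-moving wave unchanged, a left-moving wave needs a shift of −cΔt instead, and writing the argument as t − x/c gives the very same numbers.

```python
import math

C = 3.0  # wave speed in arbitrary units (hypothetical value)

def f(u):
    """Hypothetical pulse shape; any function of one variable would do."""
    return math.exp(-u * u)

def right_moving(x, t):
    """A shape traveling in the positive x-direction: f(x - ct)."""
    return f(x - C * t)

def left_moving(x, t):
    """A shape traveling in the negative x-direction: f(x + ct)."""
    return f(x + C * t)

def retarded_form(x, t):
    """The same right-moving wave written as F(t - x/c), with F(s) = f(-c·s)."""
    return f(-C * (t - x / C))

if __name__ == "__main__":
    x, t, dt = 0.5, 0.1, 0.25
    print(right_moving(x, t), right_moving(x + C * dt, t + dt))  # same value
    print(left_moving(x, t), left_moving(x - C * dt, t + dt))    # same value
    print(right_moving(x, t), retarded_form(x, t))               # same value
```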

Now, Maxwell’s equations in free or empty space, where there are no charges or currents to interact with, reduce to:

Now, how can we relate this set of complicated equations to a simple wave function? Let’s do the exercise for our simple E_y and B_z wave. Let’s start by writing out the first equation, i.e. ∇·E = 0, so we get:

Now, our wave does not vary in the y and z direction, so none of the components, including E_y and E_z, depend on y or z. It only varies in the x-direction, so ∂E_y/∂y and ∂E_z/∂z are zero. Note that the cross-derivatives ∂E_y/∂z and ∂E_z/∂y are also zero: we’re talking a plane wave here, the field varies only with x. However, because ∇·E = 0, ∂E_x/∂x must be zero and, hence, E_x must be zero.

Huh? What? How is that possible? You just said that our field does vary in the x-direction! And now you’re saying it doesn’t? Read carefully. I know it’s complicated business, but it all makes sense. Look at the function: we’re talking E_y, not E_x. E_y does vary as a function of x, but our field does not have an x-component, so E_x = 0. We have no cross-derivative ∂E_y/∂x in the divergence of E (i.e. in ∇·E = 0).

Huh? What? Let me put it differently. E has three components: E_x, E_y and E_z, and we have three space coordinates: x, y and z, so we have nine derivatives. What I am saying is that all derivatives with respect to y and z are zero. That still leaves us with three derivatives: ∂E_x/∂x, ∂E_y/∂x, and ∂E_z/∂x. So… Because all derivatives with respect to y and z are zero, and because of the ∇·E = 0 equation, we know that ∂E_x/∂x must be zero. So, to make a long story short, I did not say anything about ∂E_y/∂x or ∂E_z/∂x. These may still be whatever they want to be, and they may vary in more or in less complicated ways. I’ll give an example of that in a moment.

Having said that, I do agree that I was a bit quick in writing that, because ∂E_x/∂x = 0, E_x must be zero too. Looking at the math only, E_x is not necessarily zero: it might be some non-zero constant. So… Yes. That’s a mathematical possibility. The static field from some charged condenser plate would be an example of a constant E_x field. However, the point is that we’re not looking at such static fields here: we’re talking dynamics here, and we’re looking at a particular type of wave: we’re talking a so-called plane wave. Now, the wave front of a plane wave is… Well… A plane. 🙂 So E_x is zero indeed. It’s a general result for plane waves: the electric field of a plane wave will always be at right angles to the direction of propagation.

Hmm… I can feel your skepticism here. You’ll say I am arbitrarily restricting the field of analysis… Well… Yes. For the moment. It’s not an unreasonable restriction though. As I mentioned above, the field of a plane wave may still vary in both the y- and z-directions, as shown in the illustration below (for which the credit goes to Wikipedia), which visualizes the electric field of circularly polarized light. In any case, don’t worry too much about it. Let’s get back to the analysis. Just note we’re talking plane waves here. We’ll talk about non-plane waves i.e. incoherent light waves later. 🙂

So we have plane waves and, therefore, a so-called transverse E field which we can resolve in two components: E_y and E_z. However, we wanted to study a very simple E_y field only. Why? Remember the objective of this lesson: it’s just to show how we go from Maxwell’s equations to the wave function, and so let’s keep the analysis as simple as we can for now: we can make it more general later. In fact, if we do the analysis now for non-zero E_y and zero E_z, we can do a similar analysis for non-zero E_z and zero E_y, and the general solution is going to be some superposition of two such fields, so we’ll have a non-zero E_y and E_z. Capito? 🙂 So let me write out Maxwell’s second equation, and use the results we got above, so I’ll incorporate the zero values for the derivatives with respect to y and z, and also the assumption that E_z is zero. So we get:

[By the way: note that, out of the nine derivatives, the curl involves only the (six) cross-derivatives. That’s linked to the neat separation between the curl and the divergence operator. Math is great! :-)]

Now, because of the flux rule (∇×E = –∂B/∂t), we can (and should) equate the three components of ∇×E above with the three components of –∂B/∂t, so we get:

[In case you wonder what it is that I am trying to do, patience, please! We’ll get where we want to get. Just hang in there and read on.] Now, ∂B_x/∂t = 0 and ∂B_y/∂t = 0 do not necessarily imply that B_x and B_y are zero: there might be some magnets and, hence, we may have some constant static field. However, that’s a matter of choosing a reference point or, more simply, assuming that empty space is effectively empty, and so we don’t have magnets lying around and so we assume that B_x and B_y are effectively zero. [Again, we can always throw more stuff in when our analysis is finished, but let’s keep it simple and stupid right now, especially because the B_x = B_y = 0 is entirely in line with the E_x = E_z = 0 assumption.]

The equations above tell us what we know already: the E and B fields are at right angles to each other. However, note, once again, that this is a more general result for all plane electromagnetic waves, so it’s not only that very special caterpillar or butterfly field that we’re looking at. [If you didn’t read my previous post, you won’t get the pun, but don’t worry about it. You need to understand the equations, not the silly jokes.]

OK. We’re almost there. Now we need Maxwell’s last equation. When we write it out, we get the following monstrous-looking set of equations:

However, because of all of the equations involving zeroes above 🙂 only ∂B_z/∂x is not equal to zero, so the whole set reduces to one simple equation only:

Simplifying assumptions are great, aren’t they? 🙂 Having said that, it’s easy to be confused. You should watch out for the denominators: a ∂x and a ∂t are two very different things. So we have two equations now involving first-order derivatives:

∂B_z/∂t = −∂E_y/∂x
−c²∂B_z/∂x = ∂E_y/∂t

So what? Patience, please! 🙂 Let’s differentiate the first equation with respect to x and the second with respect to t. Why? Because… Well… You’ll see. Don’t complain. It’s simple. Just do it. We get:

∂[∂B_z/∂t]/∂x = −∂²E_y/∂x²
∂[−c²∂B_z/∂x]/∂t = ∂²E_y/∂t²

The mixed derivative ∂²B_z/∂x∂t appears on the left-hand side of both equations, so we can eliminate it: the left-hand side of the second equation is just −c² times the left-hand side of the first. What we get is a differential equation of the second order that we’ve encountered already, when we were studying wave equations. In fact, it is the wave equation for one-dimensional waves:

∂²E_y/∂x² − (1/c²)·∂²E_y/∂t² = 0
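A quick numerical sanity check of the two first-order equations, ∂B_z/∂t = −∂E_y/∂x and −c²∂B_z/∂x = ∂E_y/∂t. The sketch assumes a right-moving pulse E_y = f(x − ct) with B_z = E_y/c. The Gaussian shape, the value of `C` and the helper names are illustrative choices, and the derivatives are plain central differences.

```python
import math

C = 2.0  # wave speed in arbitrary units (hypothetical value)

def f(u):
    """Hypothetical pulse shape."""
    return math.exp(-u * u)

def E_y(x, t):
    """Electric field of a right-moving plane wave."""
    return f(x - C * t)

def B_z(x, t):
    """Magnetic field, assumed equal to E_y / c for the right-moving wave."""
    return f(x - C * t) / C

def d_dx(field, x, t, h=1e-5):
    """Central-difference partial derivative with respect to x."""
    return (field(x + h, t) - field(x - h, t)) / (2.0 * h)

def d_dt(field, x, t, h=1e-5):
    """Central-difference partial derivative with respect to t."""
    return (field(x, t + h) - field(x, t - h)) / (2.0 * h)

def faraday_residual(x, t):
    """dB_z/dt + dE_y/dx, which should vanish."""
    return d_dt(B_z, x, t) + d_dx(E_y, x, t)

def ampere_residual(x, t):
    """-c²·dB_z/dx - dE_y/dt, which should vanish as well."""
    return -C**2 * d_dx(B_z, x, t) - d_dt(E_y, x, t)

if __name__ == "__main__":
    x, t = 0.7, 0.1
    print("dE_y/dx          :", d_dx(E_y, x, t))
    print("Faraday residual :", faraday_residual(x, t))
    print("Ampere residual  :", ampere_residual(x, t))
```

Both residuals come out at the level of the finite-difference error, while the individual derivatives are of order one.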

In case you want to double-check, I did a few posts on this, but, if you don’t get this, well… I am sorry. You’ll need to do some homework. More specifically, you’ll need to do some homework on differential equations. The equation above is basically some constraint on the functional form of E_y. More generally, if we see an equation like:

∂²ψ/∂x² − (1/c²)·∂²ψ/∂t² = 0

then the function ψ(x, t) must be some function of the form:

ψ(x, t) = f(x − ct) + g(x + ct)

So any function ψ like that will work. You can check it out by taking the necessary derivatives and plugging them into the wave equation. [In case you wonder how you should go about this, Feynman actually does it for you in his Lecture on this topic, so you may want to check it there.]
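If you’d rather let a computer take the derivatives, here is a minimal sketch. It assumes a solution of the form ψ(x, t) = f(x − ct) + g(x + ct), with a Gaussian for f and a sine for g (both arbitrary picks, as are the value of `C` and the helper names), and it estimates ∂²ψ/∂x² − (1/c²)·∂²ψ/∂t² with finite differences.

```python
import math

C = 1.5  # wave speed in arbitrary units (hypothetical value)

def f(u):
    """Hypothetical right-moving shape."""
    return math.exp(-u * u)

def g(u):
    """Hypothetical left-moving shape."""
    return math.sin(u)

def psi(x, t):
    """Candidate solution: one shape moving right plus one moving left."""
    return f(x - C * t) + g(x + C * t)

def second_difference(func, s, h=1e-3):
    """Central finite-difference estimate of the second derivative of func at s."""
    return (func(s + h) - 2.0 * func(s) + func(s - h)) / h**2

def wave_equation_residual(x, t):
    """d²psi/dx² - (1/c²)·d²psi/dt², which should be (numerically) zero."""
    psi_xx = second_difference(lambda s: psi(s, t), x)
    psi_tt = second_difference(lambda s: psi(x, s), t)
    return psi_xx - psi_tt / C**2

if __name__ == "__main__":
    x, t = 0.4, 0.3
    print("d²psi/dx²:", second_difference(lambda s: psi(s, t), x))
    print("residual :", wave_equation_residual(x, t))
```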

In fact, the functions f(x − ct) and g(x + ct) themselves will also work as possible solutions. So we can drop one or the other, which amounts to saying that our ‘shape’ has to travel in some direction, rather than in both at the same time. 🙂 Indeed, from all of my explanations above, you know what f(x − ct) represents: it’s a wave that travels in the positive x-direction. Now, it may be periodic, but it doesn’t have to be periodic. The f(x − ct) function could represent any constant ‘shape’ that’s traveling in the positive x-direction at speed c. Likewise, the g(x + ct) function could represent any constant ‘shape’ that’s traveling in the negative x-direction at speed c. As for super-imposing both…

Well… I suggest you check that post I wrote for my son, Vincent. It’s on the math of waves, but it doesn’t have derivatives and/or differential equations. It just explains how superimposition and all that works. It’s not very abstract, as it revolves around a vibrating guitar string. So, if you have trouble with all of the above, you may want to read that first. 🙂 The bottom line is that we can get any wavefunction we want by superimposing simple sinusoidals that are traveling in one or the other direction, and so that’s what the more general solution really says. Full stop. So that’s what we’re doing really: we add very simple waves to get much more complicated waveforms. 🙂

Now, I could leave it at this, but then it’s very easy to just go one step further, and that is to assume that E_z and, therefore, B_y are not zero. It’s just a matter of super-imposing solutions. Let me just give you the general solution. Just look at it for a while. If you understood all that I’ve said above, 20 seconds or so should be sufficient to say: “Yes, that makes sense. That’s the solution in two dimensions.” At least, I hope so! 🙂

OK. I should really stop now. But… Well… Now that we’ve got a general solution for all plane waves, why not be even bolder and think about what we could possibly say about three-dimensional waves? So then E_x and, therefore, B_x would not necessarily be zero either. After all, light can behave that way. In fact, light is likely to be non-polarized and, hence, E_x and, therefore, B_x are most probably not equal to zero!

Now, you may think the analysis is going to be terribly complicated. And you’re right. It would be if we’d stick to our analysis in terms of x, y and z coordinates. However, it turns out that the analysis in terms of vector equations is actually quite straightforward. I’ll just copy the Master here, so you can see His Greatness. 🙂

But what solution does an equation like (20.27) have? We can appreciate it’s actually three equations, i.e. one for each component, and so… Well… Hmm… What can we say about that? I’ll quote the Master on this too:

“How shall we find the general wave solution? The answer is that all the solutions of the three-dimensional wave equation can be represented as a superposition of the one-dimensional solutions we have already found. We obtained the equation for waves which move in the x-direction by supposing that the field did not depend on y and z. Obviously, there are other solutions in which the fields do not depend on x and z, representing waves going in the y-direction. Then there are solutions which do not depend on x and y, representing waves travelling in the z-direction. Or in general, since we have written our equations in vector form, the three-dimensional wave equation can have solutions which are plane waves moving in any direction at all. Again, since the equations are linear, we may have simultaneously as many plane waves as we wish, travelling in as many different directions. Thus the most general solution of the three-dimensional wave equation is a superposition of all sorts of plane waves moving in all sorts of directions.”

It’s the same thing once more: we add very simple waves to get much more complicated waveforms. 🙂
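Here is a small sketch of that statement, under made-up assumptions: a Gaussian profile `f`, a wave speed `C`, and two arbitrarily chosen propagation directions. A plane wave f(n·r − ct) traveling along any unit vector n should satisfy ∇²ψ − (1/c²)·∂²ψ/∂t² = 0, and so should a sum of such waves, because the equation is linear. The derivatives are again simple finite differences.

```python
import math

C = 2.0  # wave speed in arbitrary units (hypothetical value)

def f(u):
    """Hypothetical wave profile."""
    return math.exp(-u * u)

def unit(v):
    """Normalize a 3D vector to unit length."""
    norm = math.sqrt(sum(component * component for component in v))
    return tuple(component / norm for component in v)

DIRECTION = unit((1.0, 2.0, -2.0))  # hypothetical propagation direction

def plane_wave(x, y, z, t, n=DIRECTION):
    """Plane wave f(n·r - ct) traveling along the unit vector n."""
    return f(n[0] * x + n[1] * y + n[2] * z - C * t)

def superposition(x, y, z, t):
    """Two plane waves traveling in different directions, simply added."""
    return plane_wave(x, y, z, t) + 0.5 * plane_wave(x, y, z, t, n=(0.0, 1.0, 0.0))

def second_difference(func, args, index, h=1e-3):
    """Second partial derivative of func with respect to args[index]."""
    up, down = list(args), list(args)
    up[index] += h
    down[index] -= h
    return (func(*up) - 2.0 * func(*args) + func(*down)) / h**2

def wave_residual(func, x, y, z, t):
    """Laplacian of func minus (1/c²) times its second time derivative."""
    args = (x, y, z, t)
    laplacian = sum(second_difference(func, args, i) for i in range(3))
    return laplacian - second_difference(func, args, 3) / C**2

if __name__ == "__main__":
    point = (0.3, -0.2, 0.5, 0.1)
    print("single plane wave:", wave_residual(plane_wave, *point))
    print("superposition    :", wave_residual(superposition, *point))
```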

You must have fallen asleep by now or, else, be watching something else. Feynman must have felt the same. After explaining all of the nitty-gritty above, Feynman wakes up his students. He does so by appealing to their imagination:

“Try to imagine what the electric and magnetic fields look like at present in the space in this lecture room. First of all, there is a steady magnetic field; it comes from the currents in the interior of the earth – that is, the earth’s steady magnetic field. Then there are some irregular, nearly static electric fields produced perhaps by electric charges generated by friction as various people move about in their chairs and rub their coat sleeves against the chair arms. Then there are other magnetic fields produced by oscillating currents in the electrical wiring – fields which vary at a frequency of 60 cycles per second, in synchronism with the generator at Boulder Dam. But more interesting are the electric and magnetic fields varying at much higher frequencies. For instance, as light travels from window to floor and wall to wall, there are little wiggles of the electric and magnetic fields moving along at 186,000 miles per second. Then there are also infrared waves travelling from the warm foreheads to the cold blackboard. And we have forgotten the ultraviolet light, the x-rays, and the radiowaves travelling through the room.

Flying across the room are electromagnetic waves which carry music of a jazz band. There are waves modulated by a series of impulses representing pictures of events going on in other parts of the world, or of imaginary aspirins dissolving in imaginary stomachs. To demonstrate the reality of these waves it is only necessary to turn on electronic equipment that converts these waves into pictures and sounds.

If we go into further detail to analyze even the smallest wiggles, there are tiny electromagnetic waves that have come into the room from enormous distances. There are now tiny oscillations of the electric field, whose crests are separated by a distance of one foot, that have come from millions of miles away, transmitted to the earth from the Mariner II space craft which has just passed Venus. Its signals carry summaries of information it has picked up about the planets (information obtained from electromagnetic waves that travelled from the planet to the space craft).

There are very tiny wiggles of the electric and magnetic fields that are waves which originated billions of light years away—from galaxies in the remotest corners of the universe. That this is true has been found by “filling the room with wires”—by building antennas as large as this room. Such radiowaves have been detected from places in space beyond the range of the greatest optical telescopes. Even they, the optical telescopes, are simply gatherers of electromagnetic waves. What we call the stars are only inferences, inferences drawn from the only physical reality we have yet gotten from them—from a careful study of the unendingly complex undulations of the electric and magnetic fields reaching us on earth.

There is, of course, more: the fields produced by lightning miles away, the fields of the charged cosmic ray particles as they zip through the room, and more, and more. What a complicated thing is the electric field in the space around you! Yet it always satisfies the three-dimensional wave equation.”

So… Well… That’s it for today, folks. 🙂 We have some more gymnastics to do, still… But we’re really there. Or here, I should say: on top of the peak. What a view we have here! Isn’t it beautiful? It took us quite some effort to get on top of this thing, and we’re still trying to catch our breath as we struggle with what we’ve learned so far, but it’s really worthwhile, isn’t it? 🙂

Maxwell’s equations and the speed of light

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. In addition, I note the dark force has amused himself by removing some material, which messed up the lay-out of this post as well. So no use to read this. Read my recent papers instead. 🙂

Original post:

We know how electromagnetic waves travel through space: they do so because of the mechanism described in Maxwell’s equations: a changing electric field causes a changing magnetic field, and a changing magnetic field causes a (changing) electric field, as illustrated below.

So we need some First Cause to get it all started 🙂 i.e. some current, i.e. some moving charge, but then the electromagnetic wave travels, all by itself, through empty space, completely detached from the cause. You know that by now – indeed, you’ve heard this a thousand times before – but, if you’re reading this, you want to know how it works exactly. 🙂

In my post on the Lorentz gauge, I included a few links to Feynman’s Lectures that explain the nitty-gritty of this mechanism from various angles. However, they’re pretty horrendous to read, and so I just want to summarize them a bit—if only for myself, so as to remind myself what’s important and not. In this post, I’ll focus on the speed of light: why do electromagnetic waves – light – travel at the speed of light?

You’ll immediately say: that’s a nonsensical question. It’s light, so it travels at the speed of light. Sure, smart-arse! Let me be more precise: how can we relate the speed of light to Maxwell’s equations? That is the question here. Let’s go for it.

Feynman deals with the matter of the speed of an electromagnetic wave, and the speed of light, in a rather complicated exposé on the fields from some infinite sheet of charge that is suddenly set into motion, parallel to itself, as shown below. The situation looks – and actually is – very simple, but the math is rather messy because of the rather exotic assumptions: infinite sheets and infinite acceleration are not easy to deal with. 🙂 But so the whole point of the exposé is just to prove that the speed of propagation (v) of the electric and magnetic fields is equal to the speed of light (c), and it does a marvelous job at that. So let’s focus on that here only. So what I am saying is that I am going to leave out most of the nitty-gritty and just try to get to that v = c result as fast as I possibly can. So, fasten your seat belt, please.

Most of the nitty-gritty in Feynman’s exposé is about how to determine the direction and magnitude of the electric and magnetic fields, i.e. E and B. Now, when the nitty-gritty business is finished, the grand conclusion is that both E and B travel out in both the positive as well as the negative x-direction at some speed v and sort of ‘fill’ the entire space as they do. Now, the region they are filling extends infinitely far in both the y- and z-direction but, because they travel along the x-axis, there are no fields (yet) in the region beyond x = ± v·t (t = 0 is the moment when the sheet started moving, and it moves in the positive y-direction). As you can see, the sheet of charge fills the yz-plane, and the assumption is that its speed goes from zero to u instantaneously, or very very quickly at least. So the E and B fields move out like a tidal wave, as illustrated below, and thereby ‘fill’ the space indeed, as they move out.

The magnitude of E and B is constant, but it’s not the same constant, and part of the exercise here is to determine the relationship between the two constants. As for their direction, you can see it in the first illustration: B points in the negative z-direction for x > 0 and in the positive z-direction for x < 0, while E’s direction is opposite to u’s direction everywhere, so E points in the negative y-direction. As said, you should just take my word for it, because the nitty-gritty on this – which we do not want to deal with here – is all in Feynman and so I don’t want to copy that.

The crux of the argument revolves around what happens at the wavefront itself, as it travels out. Feynman relates flux and circulation there. It’s the typical thing to do: it’s at the wavefront itself that the fields change: before they were zero, and now they are equal to that constant. The fields do not change anywhere else, so there’s no changing flux or circulation business to be analyzed anywhere else. So we define two loops at the wavefront itself: Γ₁ and Γ₂. They are normal to each other (cf. the top and side view of the situation below), because the E and B fields are normal to each other. And so then we use Maxwell’s equations to check out what happens with the flux and circulation there and conclude what needs to be concluded. 🙂

We start with rectangle Γ₂. One side is in the region where there are fields, and one side is in the region the fields haven’t reached yet. There is some magnetic flux through this loop, and it is changing, so there is an emf around it, i.e. some circulation of E. The flux changes because the area in which B exists increases at speed v. Now, the time rate of change of the flux is, obviously, the field strength B times the rate of change of the area and, with L the width of the rectangle, that’s (B·L·v·Δt)/Δt = B·L·v, with Δt some small time interval during which the wavefront advances over a distance v·Δt. Now, according to Faraday’s Law (see my previous post), this will be equal to minus the line integral of E around Γ₂, which is E·L. So E·L = B·L·v and, hence, we find: E = v·B.

Interesting! To satisfy Faraday’s equation (which is just one of Maxwell’s equations in integral rather than in differential form), E must equal B times v, with v the speed of propagation of our ‘tidal’ wave. Now let’s look at Γ₁. There we should apply:

Now the line integral is just B·L, and the right-hand side is E·L·v, so, not forgetting that c² in front (i.e. the square of the speed of light, as you know!), we get: c²·B = E·v, or E = (c²/v)·B.

Now, the E = v·B and E = (c²/v)·B equations must both apply (we’re talking one wave and one and the same phenomenon) and, obviously, that’s only possible if v = c²/v, i.e. if v = c. So the wavefront must travel at the speed of light! Waw! That’s fast. 🙂 Yes. […] Jokes aside, that’s the result we wanted here: we just proved that the speed of travel of an electromagnetic wave must be equal to the speed of light.
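As a quick numeric sanity check of that last step, here is a minimal Python sketch. It only assumes the two wavefront conditions E = v·B and E = (c²/v)·B from the argument above; the function name and the trial speeds are my own, purely for illustration.

```python
C = 299_792_458.0  # speed of light in m/s (exact SI value)

def field_ratio_mismatch(v, c=C):
    """Difference between the E/B ratio required by Faraday's law (v)
    and the E/B ratio required by the c² circulation law (c²/v)."""
    return v - c**2 / v

# Try a wavefront slower than c, one at c, and one faster than c.
# Only v = c makes the two requirements agree.
for v in (0.5 * C, C, 2.0 * C):
    print(f"v = {v:.3e} m/s -> mismatch = {field_ratio_mismatch(v):+.3e}")
```

The mismatch is negative below c, positive above it, and vanishes only at v = c, which is just the v = c²/v condition in numbers.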

As an added bonus, we also showed the mechanism of travel. It’s obvious from the equations we used to prove the result: it works through the derivatives of the fields with respect to time, i.e. ∂E/∂t and ∂B/∂t.

Done! Great! Enjoy the view!

Well… Yes and no. If you’re smart, you’ll say: we got this result because of the c² factor in that equation, so Maxwell had already put it in, so to speak. Waw! You really are a smart-arse, aren’t you? 🙂

The thing is… Well… The answer is: no. Maxwell did not put it in. Well… Yes and no. Let me explain. Maxwell’s first equation was the electric flux law ∇·E = ρ/ε₀, with ρ the charge density: the flux of E through a closed surface is proportional to the charge inside. So that’s basically another way of writing Coulomb’s Law, and ε₀ was just some constant in it, the electric constant. So it’s a constant of proportionality that depends on the unit in which we measure electric charge. The only reason that it’s there is to make the units come out alright, so if we’d measure charge not in coulomb (C) but in a unit equal to 1 C/ε₀, it would disappear. If we’d do that, our new unit would be equivalent to the charge of some 700,000 protons. You can figure that magical number yourself by checking the values of the proton charge and ε₀. 🙂

OK. And then Ampère came up with the exact laws for magnetism, and they involved current and some other constant of proportionality, and Maxwell formalized that by writing ∇×B = μ₀j, with μ₀ the magnetic constant. It’s not a flux law but a circulation law: currents cause circulation of B. We get the flux rule from it by integrating it. But currents are moving charges, and so Maxwell knew magnetism was related to the same thing: electric charge. So Maxwell knew the two constants had to be related. In fact, when putting the full set of equations together – there are four, as you know – Maxwell figured out that μ₀ times ε₀ would have to be equal to the reciprocal of c², with c the speed of propagation of the wave. So Maxwell knew that, whatever the unit of charge, we’d get two constants of proportionality, an electric and a magnetic constant, and that μ₀·ε₀ would be equal to 1/c². However, while he knew that, at the time, light and electromagnetism were considered to be separate phenomena, and so Maxwell did not say that c was the speed of light: the only thing his equations told him was that c is the speed of propagation of that ‘electromagnetic’ wave that came out of his equations.

The rest is history. In 1856, the great Wilhelm Eduard Weber – you’ve seen his name before, haven’t you? – did a whole bunch of experiments which measured the electric constant rather precisely, and Maxwell jumped on it and calculated all the rest, i.e. μ₀, and so then he took the reciprocal of the square root of μ₀·ε₀ and – Bang! – he had c, the speed of propagation of the electromagnetic wave he was thinking of. Now, c was some value of the order of 3×10⁸ m/s, and so that happened to be the same as the speed of light, which suggested that Maxwell’s c and the speed of light were actually one and the same thing!
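You can redo that calculation in a few lines. The sketch below is a minimal Python version of it; the two constants are hard-coded CODATA 2018 values, which is my assumption (Weber and Maxwell obviously worked with far less precise numbers), and the function name is just illustrative.

```python
import math

# CODATA 2018 values, hard-coded here as assumptions for this sketch
EPSILON_0 = 8.8541878128e-12   # electric constant, in C²/(N·m²)
MU_0 = 1.25663706212e-6        # magnetic constant, in N/A²

def propagation_speed(mu_0=MU_0, epsilon_0=EPSILON_0):
    """Speed of the electromagnetic wave, from mu_0 * epsilon_0 = 1/c²."""
    return 1.0 / math.sqrt(mu_0 * epsilon_0)

c = propagation_speed()
print(f"c = {c:.4e} m/s")
```

The result is of the order of 3×10⁸ m/s, as in the story above.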

Now, I am a smart-arse too 🙂 and, hence, when I first heard this story, I actually wondered how Maxwell could possibly know the speed of light at the time: Maxwell died many years before the Michelson-Morley experiment unequivocally established the value of the speed of light. [In case you wonder: the Michelson-Morley experiment was done in 1887.] So I checked it. The fact is that the Michelson-Morley experiment concluded that the speed of light was an absolute value and that, in the process of doing so, they got a rather precise value for it, but the value of c itself had already been established, more or less, that is, by a Danish astronomer, Ole Römer, in 1676! He did so by carefully observing the timing of the repeating eclipses of Io, one of Jupiter’s moons. Newton mentioned his results in his Principia, which he wrote in 1687, duly noting that it takes about seven to eight minutes for light to travel from the Sun to the Earth. Done! The whole story is fascinating, really, so you should check it out yourself. 🙂
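To see how close Newton’s ‘seven to eight minutes’ is, here is a small Python sketch using modern values. The astronomical unit and the speed of light are hard-coded assumptions of mine (they are the current defined values, not numbers from the 17th century), and the function name is illustrative only.

```python
C = 299_792_458.0      # speed of light, m/s (exact SI value)
AU = 1.495978707e11    # astronomical unit (mean Sun-Earth distance), m

def light_travel_minutes(distance_m, c=C):
    """Time light needs to cover distance_m, expressed in minutes."""
    return distance_m / c / 60.0

print(f"Sun to Earth: {light_travel_minutes(AU):.1f} minutes")
```

With these values the travel time comes out at a bit more than eight minutes, so the old estimate was not far off at all.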

In any case, to make a long story short, Maxwell was puzzled by this mysterious coincidence, but he was bold enough to immediately point to the right conclusion, tentatively at least, and so he told the Cambridge Philosophical Society, in the very same year, i.e. 1856, that “we can scarcely avoid the inference that light consists in the transverse undulations of the same medium which is the cause of electric and magnetic phenomena.”

So… Well… Maxwell still suggests light needs some medium here, so the ‘medium’ is a reference to the infamous aether theory, but that’s not the point: what he says here is what we all take for granted now: light is an electromagnetic wave. So now we know there’s absolutely no reason whatsoever to avoid the ‘inference’, but… Well… 160 years ago, it was quite a big deal to suggest something like that. 🙂

So that’s the full story. I hope you liked it. Don’t underestimate what you just did: understanding an argument like this is like “climbing a great peak”, as Feynman puts it. So it is “a great moment” indeed. 🙂 The only thing left is, perhaps, to explain the ‘other’ flux rules I used above. Indeed, you know Faraday’s Law:

But that other one? Well… As I explained in my previous post, Faraday’s Law is the integral form of Maxwell’s second equation: −∂B/∂t = ∇×E. The ‘other’ flux rule above – so that’s the one with the c² in front and without a minus sign – is the integral form of Maxwell’s fourth equation: c²∇×B = j/ε₀ + ∂E/∂t, taking into account that we’re talking a wave traveling in free space, so there are no charges and currents (it’s just a wave in empty space, whatever that means) and, hence, the Maxwell equation reduces to c²∇×B = ∂E/∂t. Now, I could take you through the same gymnastics as I did in my previous post but, if I were you, I’d just apply the general principle that “the same equations must yield the same solutions” and so I’d just switch E for B and vice versa in Faraday’s equation. 🙂
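For reference, here is how the two integral rules look when written out in LaTeX. This is a sketch in the usual notation, which I am assuming here: Γ is the loop, S a surface bounded by it, n the unit normal to S, and the second rule is the free-space version (no currents), as discussed above.

```latex
\begin{align}
  \oint_{\Gamma} \mathbf{E}\cdot d\mathbf{s}
    &= -\frac{d}{dt}\int_{S} \mathbf{B}\cdot\mathbf{n}\,da
    && \text{(Faraday's Law)} \\
  c^{2}\oint_{\Gamma} \mathbf{B}\cdot d\mathbf{s}
    &= \frac{d}{dt}\int_{S} \mathbf{E}\cdot\mathbf{n}\,da
    && \text{(free space, } \mathbf{j} = 0\text{)}
\end{align}
```

Applied to the loops Γ₂ and Γ₁ at the wavefront, and looking at magnitudes only, these reduce to E·L = B·L·v and c²·B·L = E·L·v respectively, which are the two relations we combined to get v = c.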

So we’re done… Well… Perhaps one more thing. We’ve got these flux rules above telling us that the electromagnetic wave will travel all by itself, through empty space, completely detached from its First Cause. But… […] Well… Again you may think there’s some trick here. In other words, you may think the wavefront has to remain connected to the First Cause somehow, just like the whip below is connected to some person whipping it. 🙂

There’s no such connection. The whip is not needed. 🙂 If we’d switch off the First Cause after some time T, so our moving sheet stops moving, then we’d have the pulse below traveling through empty space. As Feynman puts it: “The fields have taken off: they are freely propagating through space, no longer connected in any way with the source. The caterpillar has turned into a butterfly!”

Now, the last question is always the same: what are those fields? What’s their reality? Here, I should refer you to one of the most delightful sections in Feynman’s Lectures. It’s on the scientific imagination. I’ll just quote the introduction to it, but I warmly recommend you go and check it out for yourself: it has no formulas whatsoever, and so you should understand all of it without any problem at all. 🙂

“I have asked you to imagine these electric and magnetic fields. What do you do? Do you know how? How do I imagine the electric and magnetic field? What do I actually see? What are the demands of scientific imagination? Is it any different from trying to imagine that the room is full of invisible angels? No, it is not like imagining invisible angels. It requires a much higher degree of imagination to understand the electromagnetic field than to understand invisible angels. Why? Because to make invisible angels understandable, all I have to do is to alter their properties a little bit—I make them slightly visible, and then I can see the shapes of their wings, and bodies, and halos. Once I succeed in imagining a visible angel, the abstraction required—which is to take almost invisible angels and imagine them completely invisible—is relatively easy. So you say, “Professor, please give me an approximate description of the electromagnetic waves, even though it may be slightly inaccurate, so that I too can see them as well as I can see almost invisible angels. Then I will modify the picture to the necessary abstraction.”

I’m sorry I can’t do that for you. I don’t know how. I have no picture of this electromagnetic field that is in any sense accurate. I have known about the electromagnetic field a long time – I was in the same position 25 years ago that you are now, and I have had 25 years more of experience thinking about these wiggling waves. When I start describing the magnetic field moving through space, I speak of the E and B fields and wave my arms and you may imagine that I can see them. I’ll tell you what I see. I see some kind of vague shadowy, wiggling lines – here and there is an E and a B written on them somehow, and perhaps some of the lines have arrows on them – an arrow here or there which disappears when I look too closely at it. When I talk about the fields swishing through space, I have a terrible confusion between the symbols I use to describe the objects and the objects themselves. I cannot really make a picture that is even nearly like the true waves. So if you have some difficulty in making such a picture, you should not be worried that your difficulty is unusual.

Our science makes terrific demands on the imagination. The degree of imagination that is required is much more extreme than that required for some of the ancient ideas. The modern ideas are much harder to imagine. We use a lot of tools, though. We use mathematical equations and rules, and make a lot of pictures. What I realize now is that when I talk about the electromagnetic field in space, I see some kind of a superposition of all of the diagrams which I’ve ever seen drawn about them. I don’t see little bundles of field lines running about because it worries me that if I ran at a different speed the bundles would disappear, I don’t even always see the electric and magnetic fields because sometimes I think I should have made a picture with the vector potential and the scalar potential, for those were perhaps the more physically significant things that were wiggling.

Perhaps the only hope, you say, is to take a mathematical view. Now what is a mathematical view? From a mathematical view, there is an electric field vector and a magnetic field vector at every point in space; that is, there are six numbers associated with every point. Can you imagine six numbers associated with each point in space? That’s too hard. Can you imagine even one number associated with every point? I cannot! I can imagine such a thing as the temperature at every point in space. That seems to be understandable. There is a hotness and coldness that varies from place to place. But I honestly do not understand the idea of a number at every point.

So perhaps we should put the question: Can we represent the electric field by something more like a temperature, say like the displacement of a piece of jello? Suppose that we were to begin by imagining that the world was filled with thin jello and that the fields represented some distortion—say a stretching or twisting—of the jello. Then we could visualize the field. After we “see” what it is like we could abstract the jello away. For many years that’s what people tried to do. Maxwell, Ampère, Faraday, and others tried to understand electromagnetism this way. (Sometimes they called the abstract jello “ether.”) But it turned out that the attempt to imagine the electromagnetic field in that way was really standing in the way of progress. We are unfortunately limited to abstractions, to using instruments to detect the field, to using mathematical symbols to describe the field, etc. But nevertheless, in some sense the fields are real, because after we are all finished fiddling around with mathematical equations—with or without making pictures and drawings or trying to visualize the thing—we can still make the instruments detect the signals from Mariner II and find out about galaxies a billion miles away, and so on.

The whole question of imagination in science is often misunderstood by people in other disciplines. They try to test our imagination in the following way. They say, “Here is a picture of some people in a situation. What do you imagine will happen next?” When we say, “I can’t imagine,” they may think we have a weak imagination. They overlook the fact that whatever we are allowed to imagine in science must be consistent with everything else we know: that the electric fields and the waves we talk about are not just some happy thoughts which we are free to make as we wish, but ideas which must be consistent with all the laws of physics we know. We can’t allow ourselves to seriously imagine things which are obviously in contradiction to the known laws of nature. And so our kind of imagination is quite a difficult game. One has to have the imagination to think of something that has never been seen before, never been heard of before. At the same time the thoughts are restricted in a strait jacket, so to speak, limited by the conditions that come from our knowledge of the way nature really is. The problem of creating something which is new, but which is consistent with everything which has been seen before, is one of extreme difficulty.”

Isn’t that great? I mean: Feynman, one of the greatest physicists of all time, didn’t write what he wrote above when he was an undergrad student or so. No. He did so in 1964, when he was 45 years old, at the height of his scientific career! And it gets better, because Feynman then starts talking about beauty. What is beauty in science? Well… Just click and check what Feynman thinks about it. 🙂

Oh… Last thing. So what is the magnitude of the E and B field? Well… You can work it out yourself, but I’ll give you the answer. The geometry of the situation makes it clear that the electric field has a y-component only, and the magnetic field a z-component only. Their magnitudes are given in terms of J, i.e. the surface current density going in the positive y-direction:

An introduction to electric circuits

In my previous post, I introduced electric motors, generators and transformers. They all work because of Faraday’s flux rule: a changing magnetic flux will produce some circulation of the electric field. The formula for the flux rule is given below:

It is a wonderful thing, really, but not easy to grasp intuitively. It’s one of these equations where I should quote Feynman’s introduction to electromagnetism: “The laws of Newton were very simple to write down, but they had a lot of complicated consequences and it took us a long time to learn about them all. The laws of electromagnetism are not nearly as simple to write down, which means that the consequences are going to be more elaborate and it will take us quite a lot of time to figure them all out.”

Now, among Maxwell’s Laws, this is surely the most complicated one! However, that shouldn’t deter us. 🙂 Recalling Stokes’ Theorem helps to appreciate what the integral on the left-hand side represents:

We’ve got a line integral around some closed loop Γ on the left and, on the right, we’ve got a surface integral over some surface S whose boundary is Γ. The illustration below depicts the geometry of the situation. You know what it all means. If not, I am afraid I have to send you back to square one, i.e. my posts on vector analysis. Yep. Sorry. Can’t keep copying stuff and make my posts longer and longer. 🙂

To understand the flux rule, you should imagine that the loop Γ is some loop of electric wire, and then you just replace C by E, the electric field vector. The circulation of E, which is caused by the change in magnetic flux, is referred to as the electromotive force (emf), and it’s the tangential force (E·ds) per unit charge in the wire integrated over its entire length around the loop, which is denoted by Γ here, and which encloses a surface S.

Now, you can go from the line integral to the surface integral by noting Maxwell’s Law: −∂B/∂t = ∇×E. In fact, it’s the same flux rule really, but in differential form. As for (∇×E)_n, i.e. the component of ∇×E that is normal to the surface, you know that any vector multiplied with the normal unit vector will yield its normal component. In any case, if you’re reading this, you should already be acquainted with all of this. Let’s explore the concept of the electromotive force, and then apply it to our first electric circuit. 🙂

Indeed, it’s now time for a small series on circuits, and so we’ll start right here and right now, but… Well… First things first. 🙂

The electromotive force: concept and units

The term ‘force’ in ‘electromotive force’ is actually somewhat misleading. There is a force involved, of course, but the emf is not a force. The emf is expressed in volts. That’s consistent with its definition as the circulation of E: a force times a distance amounts to work, or energy (one joule is one newton·meter), and because E is the force on a unit charge, the circulation of E is expressed in joule per coulomb, so that’s a voltage: 1 volt = 1 joule/coulomb. Hence, on the left-hand side of Faraday’s equation, we don’t have any dimension of time: it’s energy per unit charge, so it’s x joule per coulomb. Full stop.

On the right-hand side, however, we have the time rate of change of the magnetic flux through the surface S. The magnetic flux is a surface integral, and so it’s a quantity expressed in [B]·m², with [B] the measurement unit for the magnetic field strength. The time rate of change of the flux is then, of course, expressed in [B]·m² per second, i.e. [B]·m²/s. Now what is the unit for the magnetic field strength B, which we denoted by [B]?

Well… [B] is a bit of a special unit: it is not measured as some force per unit charge, i.e. in newton per coulomb, like the electric field strength E. No. [B] is measured in (N/C)/(m/s). Why? Because the magnetic force is not F = qE but F = qv×B. Hence, so as to make the units come out alright, we need to express B in (N·s)/(C·m), which is a unit known as the tesla (1 T = 1 N·s/(C·m)), so as to honor the Serbian-American genius Nikola Tesla. [I know it’s a bit of a short and dumb answer, but the complete answer is quite complicated: it’s got to do with the relativity of the magnetic force, which I explained in another post: both the v in the F = qv×B equation as well as the m/s unit in [B] should make you think: whose velocity? In which reference frame? But that’s something I can’t summarize in two lines, so just click the link if you want to know more. I need to get back to the lesson.]

Now that we’re talking units, I should note that the unit of flux also got a special name, the weber, so as to honor one of Germany’s most famous physicists, Wilhelm Eduard Weber: as you might expect, 1 Wb = 1 T·m². But don’t worry about these strange names. Besides the units you know, like the joule and the newton, I’ll only use the volt, which got its name to honor some other physicist, Alessandro Volta, the inventor of the electrical battery. Or… Well… I might mention the watt as well at some point… 🙂

So how does it work? On one side, we have something expressed per second – so that’s per unit time – and on the other we have something that’s expressed per coulomb – so that’s per unit charge. The link between the two is the power, so that’s the time rate of doing work. It’s expressed in joule per second. So… Well… Yes. Here we go: in honor of yet another genius, James Watt, the unit of power got its own special name too: the watt. 🙂 In the argument below, I’ll show that the power that is being generated by a generator, and that is being consumed in the circuit (through resistive heating, for example, or whatever else taking energy out of the circuit) is equal to the emf times the current. For the moment, however, I’ll just assume you believe me. 🙂

We need to look at the whole circuit now, indeed, in which our little generator (i.e. our loop or coil of wire) is just one of the circuit elements. The units come out alright: the power = emf·current product is expressed in volt·coulomb/second = (joule/coulomb)·(coulomb/second) = joule/second. So, yes, it looks OK. But what’s going on really? How does it work, literally?
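If you want to check that unit bookkeeping mechanically, the Python sketch below does it. It represents each unit as a dictionary of exponents of the base units kg, m, s and C; that representation, the `combine` helper and all the names are my own illustrative choices, not some standard library.

```python
def combine(*factors):
    """Multiply units given as (unit, power) pairs; drop zero exponents."""
    total = {}
    for unit, power in factors:
        for base, exponent in unit.items():
            total[base] = total.get(base, 0) + exponent * power
    return {base: e for base, e in total.items() if e != 0}

# Base units: kilogram, meter, second, coulomb
KG, M, S, C = {"kg": 1}, {"m": 1}, {"s": 1}, {"C": 1}

NEWTON = combine((KG, 1), (M, 1), (S, -2))               # N = kg·m/s²
JOULE = combine((NEWTON, 1), (M, 1))                     # J = N·m
VOLT = combine((JOULE, 1), (C, -1))                      # V = J/C
TESLA = combine((NEWTON, 1), (S, 1), (C, -1), (M, -1))   # T = N·s/(C·m)
WEBER = combine((TESLA, 1), (M, 2))                      # Wb = T·m²
AMPERE = combine((C, 1), (S, -1))                        # A = C/s
WATT = combine((JOULE, 1), (S, -1))                      # W = J/s

# Flux per second is a voltage, and emf times current is a power.
print(combine((WEBER, 1), (S, -1)) == VOLT)
print(combine((VOLT, 1), (AMPERE, 1)) == WATT)
```

Both comparisons print `True`: a weber per second is a volt, so the two sides of the flux rule match, and a volt times an ampère is a watt.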

A short digression: on Ohm’s Law and electric power

Well… Let me first recall the basic concepts involved which, believe it or not, are probably easiest to explain by briefly recalling Ohm’s Law, which you’ll surely remember from your high-school physics classes. It’s quite simple really: we have some resistance in a little circuit, so that’s something that resists the passage of electric current, and then we also have a voltage source. Now, Ohm’s Law tells us that the ratio of (i) the voltage V across the resistance (so that’s between the two points marked as + and −) and (ii) the current I will be some constant. It’s the same as saying that V and I are directly proportional to each other. The constant of proportionality is referred to as the resistance itself and, while it’s often looked at as a property of the circuit itself, we may embody it in a circuit element itself: a resistor, as shown below.

So we write R = V/I, and the brief presentation above should remind you of the capacity of a capacitor, which was just another constant of proportionality. Indeed, instead of feeding a resistor (so all energy gets dissipated away), we could use the voltage source to charge a capacitor, so that’s an energy storage device, and then we find that the ratio between (i) the charge on the capacitor and (ii) the voltage across the capacitor is a constant too, which we defined as the capacity of the capacitor, and so we wrote C = Q/V. So, yes, another constant of proportionality (there are many in electricity!).

In any case, the point is: to increase the current in the circuit above, you need to increase the voltage, but increasing both amounts to increasing the power that’s being consumed in the circuit, because the power is voltage times current indeed, so P = V·I (or v·i, if I use the small letters that are used in the two animations below). For example, if we’d want to double the current, we’d need to double the voltage, and so we’re quadrupling the power: (2·V)·(2·I) = 2²·V·I. So we have a square law for the power, which we get by substituting R·I for V or by substituting V/R for I, so we can write the power P as P = V²/R = I²·R. This square law says exactly the same: if you want to double the voltage or the current, you’ll actually have to double both and, hence, you’ll quadruple the power. Now let’s look at the animations below (for which credit must go to Wikipedia).
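A few lines of Python make the point concrete. This is a minimal sketch assuming a purely resistive load that obeys Ohm’s Law; the 10 Ω and 5 V values are arbitrary numbers I picked, and the function names are illustrative.

```python
def current(voltage, resistance):
    """Ohm's Law: I = V/R."""
    return voltage / resistance

def power(voltage, resistance):
    """P = V·I, with I taken from Ohm's Law, so P = V²/R."""
    return voltage * current(voltage, resistance)

R = 10.0   # resistance in ohm (arbitrary example value)
V = 5.0    # voltage in volt (arbitrary example value)

current_ratio = current(2 * V, R) / current(V, R)
power_ratio = power(2 * V, R) / power(V, R)
print(f"current x{current_ratio:.0f}, power x{power_ratio:.0f}")
```

Doubling the voltage across the same resistance doubles the current and, hence, quadruples the power.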

They show how energy is being used in an electric circuit in terms of power. [Note that the little moving pluses are in line with the convention that a current is defined as the movement of positive charges, so we write I = dQ/dt instead of I = −dQ/dt. That also explains the direction of the field line E, which has been added to show that the power source effectively moves charges against the field and, hence, against the electric force.] What we have here is that, on one side of the circuit, some generator or voltage source will create an emf pushing the charges, and then some load will consume their energy, so they lose their push. So power, i.e. energy per unit time, is supplied, and is then consumed.

Back to the emf…

Now, I mentioned that the emf is a ratio of two terms: the numerator is expressed in joule, and the denominator is expressed in coulomb. So you might think we’ve got some trade-off here, something like: if we double the energy of half of the individual charges, then we still get the same emf. Or vice versa: we could, perhaps, double the number of charges and load them with only half the energy. One thing is for sure: we can’t do both.

Hmm… Well… Let’s have a look at this line of reasoning by writing it down more formally.

The time rate of change of the magnetic flux generates some emf, which we can and should think of as a property of the loop or the coil of wire in which it is being generated. Indeed, the magnetic flux through it depends on its orientation, its size, and its shape. So it’s really very much like the capacity of a capacitor or the resistance of a conductor. So we write: emf = Δ(flux)/Δt. [In fact, the induced emf tries to oppose the change in flux, so I should add the minus sign, but you get the idea.]
For a uniform magnetic field, the flux is equal to the field strength B times the surface area S. [To be precise, we need to take the normal component of B, so the flux is B·S = B·S·cosθ.] So the flux can change because of a change in B or because of a change in S, or because of both.
The emf = Δ(flux)/Δt formula makes it clear that a very slow change in flux (i.e. the same Δ(flux) over a much larger Δt) will generate little emf. In contrast, a very fast change (i.e. the same Δ(flux) over a much smaller Δt) will produce a lot of emf. So, in that sense, emf is not like the capacity or resistance, because it’s variable: it depends on Δ(flux), as well as on Δt. However, you should still think of it as a property of the loop or the ‘generator’ we’re talking about here.
Now, the power that is being produced or consumed in the circuit in which our ‘generator’ is just one of the elements, is equal to the emf times the current. The power is the time rate of change of the energy, and the energy is the work that’s being done in the circuit (which I’ll denote by ΔU), so we write: emf·current = ΔU/Δt.
Now, the current is equal to the time rate of change of the charge, so I = ΔQ/Δt. Hence, the emf is equal to emf = (ΔU/Δt)/I = (ΔU/Δt)/(ΔQ/Δt) = ΔU/ΔQ. From this, it follows that: emf = Δ(flux)/Δt = ΔU/ΔQ, which we can re-write as:

Δ(flux) = ΔU·Δt/ΔQ

What this says is the following. For a given amount of change in the magnetic flux (so we treat Δ(flux) as constant in the equation above), we could do more work on the same charge (ΔQ) – we could double ΔU by moving the same charge over a potential difference that’s twice as large, for example – but then Δt must be cut in half. So the same change in magnetic flux can do twice as much work if the change happens in half of the time.
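The reasoning above can be sketched numerically as follows. The Python below assumes a uniform field through a flat loop, drops the minus sign of the flux rule (so it works with magnitudes only), and uses made-up example values for the field, the area, the charge and the time intervals; all function names are mine.

```python
import math

def flux(b_field, area, theta=0.0):
    """Flux of a uniform field B through a flat surface of area S, with
    theta the angle between B and the surface normal: B·S·cos(theta)."""
    return b_field * area * math.cos(theta)

def emf_magnitude(delta_flux, delta_t):
    """Magnitude of the induced emf, |Δ(flux)|/Δt (sign dropped)."""
    return abs(delta_flux) / delta_t

def work_done(emf, delta_q):
    """ΔU = emf·ΔQ, which is just emf = ΔU/ΔQ rearranged."""
    return emf * delta_q

# Example: the field ramps up from 0 to 0.5 T through a loop of 0.02 m².
d_flux = flux(0.5, 0.02) - flux(0.0, 0.02)
dq = 1.0e-3                              # the same charge ΔQ in both cases

slow = emf_magnitude(d_flux, 0.10)       # flux change takes 0.10 s
fast = emf_magnitude(d_flux, 0.05)       # same change in half the time
print(work_done(fast, dq) / work_done(slow, dq))
```

The same change in flux, happening in half the time, does twice the work on the same charge, which is what the Δ(flux) = ΔU·Δt/ΔQ equation says.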

Now, does that mean the current is being doubled? We’re talking the same ΔQ and half the Δt, so… Well? No. The Δt here measures the time of the flux change, so it’s not the dt in I = dQ/dt. For the current to change, we’d need to move the same charge faster, i.e. over a larger distance over the same time. We didn’t say we’d do that above: we only said we’d move the charge across a larger potential difference: we didn’t say we’d change the distance over which they are moved.

OK. That makes sense. But we’re not quite finished. Let’s first try something else, to then come back to where we are right now via some other way. 🙂 Can we change ΔQ? Here we need to look at the physics behind. What’s happening really is that the change in magnetic flux causes an induced current which consists of the free electrons in the Γ loop. So we have electrons moving in and out of our loop, and through the whole circuit really, but so there’s only so many free electrons per unit length in the wire. However, if we would effectively double the voltage, then their speed will effectively increase proportionally, so we’ll have more of them passing through per second. Now that effect surely impacts the current. It’s what we wrote above: all other things being the same, including the resistance, then we’ll also double the current as we double the voltage.

So where is that effect in the flux rule? The answer is: it isn’t there. The circulation of E around the loop is what it is: it’s some energy per unit charge. Not per unit time. So our flux rule gives us a voltage, which tells us that we’re going to have some push on the charges in the wire, but it doesn’t tell us anything about the current. To know the current, we must know the velocity of the moving charges, which we can calculate from the push if we also get some other information (such as the resistance involved, for instance), but so it’s not there in the formula of the flux rule. You’ll protest: there is a Δt on the right-hand side! Yes, that’s true. But it’s not the Δt in the v = Δs/Δt equation for our charges. Full stop.

Hmm… I may have lost you by now. If not, please continue reading. Let me drive the point home by asking another question. Think about the following: we can re-write that Δ(flux) = ΔU·Δt/ΔQ equation above as Δ(flux) = (ΔU/ΔQ)·Δt equation. Now, does that imply that, with the same change in flux, i.e. the same Δ(flux), and, importantly, for the same Δt, we could double both ΔU as well as ΔQ? I mean: (2·ΔU)/(2·ΔQ) = ΔU/ΔQ and so the equation holds, mathematically that is. […] Think about it.

You should shake your head now, and rightly so, because, while the Δ(flux) = (ΔU/ΔQ)·Δt equation suggests that would be possible, it’s totally counter-intuitive. We’re changing nothing in the real world (what happens there is the same change of flux in the same amount of time), but so we’d get twice the energy and twice the charge?! Of course, we could also put a 3 there, or 20,000, or minus a million. So who decides on what we get? You get the point: it is, indeed, not possible. Again, what we can change is the speed of the free electrons, but not their number, and to change their speed, you’ll need to do more work, and so the reality is that we’re always looking at the same ΔQ, so if we want a larger ΔU, then we’ll need a larger change in flux, or a shorter Δt during which that change in flux is happening.

So what can we do? We can change the physics of the situation. We can do so in many ways, like we could change the length of the loop, or its shape. One particularly interesting thing to do would be to increase the number of loops, so instead of one loop, we could have some coil with, say, N turns, so that’s N of these Γ loops. So what happens then? In fact, contrary to what you might expect, the ΔQ still doesn’t change as it moves into the coil and then from loop to loop to get out and then through the circuit: it’s still the same ΔQ. But the work that can be done by this current becomes much larger. In fact, two loops give us twice the emf of one loop, and N loops give us N times the emf of one loop. So then we can make the free electrons move faster, so they cover more distance in the same time (and you know work is force times distance), or we can move them across a larger potential difference over the same distance (and so then we move them against a larger force, so it also implies we’re doing more work). The first case is a larger current, while the second is a larger voltage. So what is it going to be?

Think about the physics of the situation once more: to make the charges move faster, you’ll need a larger force, so you’ll have a larger potential difference, i.e. a larger voltage. As for what happens to the current, I’ll explain that below. Before I do, let me talk some more basics.

In the exposé below, we’ll talk about power again, and also about load. What is load? Think about what it is in real life: when buying a battery for a big car, we’ll want a big battery, so we don’t look at the voltage only (they’re all 12-volt anyway). We’ll look at how many ampères it can deliver, and for how long. The starter motor in the car, for example, can suck up like 200 A, but for a very short time only, of course, as the car engine itself should kick in. So that’s why the capacity of batteries is expressed in ampère-hours.

Now, how do we get such large currents, such large loads? Well… Use Ohm’s Law: to get 200 A at 12 V, the resistance of the starter motor will have to be as low as 0.06 ohm. So large currents are associated with very low resistance. Think practical: a 240-volt, 60-watt light bulb will draw 0.25 A and, hence, its internal resistance is about 960 Ω. Also think of what goes on in your house: we’ve got a lot of resistors in parallel consuming power there. The formula for the total resistance is 1/R_total = 1/R₁ + 1/R₂ + 1/R₃ + … So more appliances means less resistance, and that’s what draws in the larger current.
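The numbers above are easy to check. Here is a minimal sketch in Python; the function names are my own, chosen just for this illustration, and the two-bulb example at the end is an assumed scenario, not something from the text.

```python
def current(voltage, resistance):
    """Ohm's Law: I = V / R."""
    return voltage / resistance


def resistance_from_power(voltage, power):
    """Resistance of a device rated at `power` watts at `voltage` volts (P = V**2 / R)."""
    return voltage ** 2 / power


def parallel_resistance(resistances):
    """Total resistance of resistors in parallel: 1/R_total = 1/R1 + 1/R2 + ..."""
    return 1.0 / sum(1.0 / r for r in resistances)


# Starter motor: 200 A at 12 V.
starter_resistance = 12.0 / 200.0

# Light bulb: 60 W at 240 V.
bulb_resistance = resistance_from_power(240.0, 60.0)
bulb_current = current(240.0, bulb_resistance)

# Assumed example: two identical bulbs in parallel halve the total resistance.
two_bulbs = parallel_resistance([bulb_resistance, bulb_resistance])

print(starter_resistance, bulb_resistance, bulb_current, two_bulbs)
```

Adding more resistances to the list passed to `parallel_resistance` always lowers the total, which is the 'more appliances, more current' point in numbers.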

The point is: when looking at circuits, emf is one thing, but energy and power, i.e. the work done per second, are all that matters really. And so then we’re talking currents, but our flux rule does not say how much current our generator will produce: that depends on the load. OK. We really need to get back to the lesson now.

A circuit with an AC generator

The situation is depicted below. We’ve got a coil of wire with, let’s say, N turns, and we’ll use it to generate an alternating current (AC) in a circuit.

The coil is really like the loop of wire in that primitive electric motor I introduced in my previous post, but so now we use the motor as a generator. To simplify the analysis, we assume we’ll rotate our coil of wire in a uniform magnetic field, as shown by the field lines B.

Now, our coil is not a loop, of course: the two ends of the coil are brought to external connections through some kind of sliding contacts, but that doesn’t change the flux rule: a changing magnetic flux will produce some emf and, therefore, some current in the coil.

OK. That’s clear enough. Let’s see what’s happening really. When we rotate our coil of wire, we change the magnetic flux through it. If S is the area of the coil, and θ is the angle between the magnetic field and the normal to the plane of the coil, then the flux through the coil will be equal to B·S·cosθ. Now, if we rotate the coil at a uniform angular velocity ω, then θ varies with time as θ = ω·t. Now, each turn of the coil will have an emf equal to the rate of change of the flux, i.e. d(B·S·cosθ)/dt (with a minus sign, if we keep track of the direction). We’ve got N turns of wire, and so the total emf, which we’ll denote by Ɛ (yep, a new symbol), will be equal to:

Ɛ = −N·d(B·S·cos(ω·t))/dt = N·B·S·ω·sin(ω·t)

Now, that’s just a nice sinusoidal function indeed, which will look like the graph below.
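To see that the sine function really is the time derivative of the flux, here is a small numerical check. It is only a sketch: the values for N, B, S and the 50 Hz rotation are made up for the illustration, and the helper names are mine.

```python
import math


def flux(t, B, S, omega):
    """Flux through one turn: B*S*cos(theta), with theta = omega*t."""
    return B * S * math.cos(omega * t)


def emf(t, N, B, S, omega):
    """Closed form for the total emf of N turns: N*B*S*omega*sin(omega*t)."""
    return N * B * S * omega * math.sin(omega * t)


def emf_numeric(t, N, B, S, omega, dt=1e-7):
    """Same emf from a central finite difference: -N * d(flux)/dt."""
    d_flux = flux(t + dt, B, S, omega) - flux(t - dt, B, S, omega)
    return -N * d_flux / (2 * dt)


# Assumed example values: 100 turns, 0.5 T, 0.01 m^2, rotating at 50 Hz.
N, B, S = 100, 0.5, 0.01
omega = 2 * math.pi * 50
t = 0.003

print(emf(t, N, B, S, omega), emf_numeric(t, N, B, S, omega))
```

The two printed values should agree to many decimal places. Doubling N doubles both, which is the 'N loops give us N times the emf' point made earlier.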

When no current is being drawn from the wire, this Ɛ will effectively be the potential difference between the two wires. What happens really is that the emf produces a current in the coil which pushes some charges out to the wire, and so then they’re stuck there for a while, and so there’s a potential difference between them, which we’ll denote by V, and that potential difference will be equal to Ɛ. It has to be equal to Ɛ because, if it were any different, we’d have an equalizing counter-current, of course. [It’s a fine point, so you should think about it.] So we can write:

V = Ɛ = N·B·S·ω·sin(ω·t)

So what happens when we do connect the wires to the circuit, so we’ve got that closed circuit depicted above (and below)?

Then we’ll have a current I going through the circuit, and Ohm’s Law then tells us that the ratio between (i) the voltage across the resistance in this circuit (we assume the connections between the generator and the resistor itself are perfect conductors) and (ii) the current will be some constant, so we have R = V/I and, therefore:

I = V/R = Ɛ/R

[To be fully complete, I should note that, when other circuit elements than resistors are involved, like capacitors and inductors, we’ll have a phase difference between the voltage and current functions, and so we should look at the impedance of the circuit, rather than its resistance. For more detail, see the addendum below this post.]

OK. Let’s now look at the power and energy involved.

Energy and power in the AC circuit

You’ll probably have many questions about the analysis above. You should. I do. The most remarkable thing, perhaps, is that this analysis suggests that the voltage doesn’t drop as we connect the generator to the circuit. It should. Why not? Why don’t the charges at both ends of the wire simply discharge through the circuit? In real life, there surely is such a tendency: sudden large changes in loading will effectively produce temporary changes in the voltage. But then it’s like Feynman writes: “The emf will continue to provide charge to the wires as current is drawn from them, attempting to keep the wires always at the same potential difference.”

So how much current is drawn from them? As I explained above, that depends not on the generator but on the circuit, and more in particular on the load, so that’s the resistor in this case. Again, the resistance is the (constant) ratio of the voltage and the current: R = V/I. So think about increasing or decreasing the resistance. If the voltage remains the same, it implies the current must decrease or increase accordingly, because R = V/I implies that I = V/R. So the current is inversely proportional to R, as I explained above when discussing car batteries and lamps and loads. 🙂

Now, I still have to prove that the power provided by our generator is effectively equal to P = Ɛ·I but, if it is, it implies the power that’s being delivered will be inversely proportional to R. Indeed, when Ɛ and/or V remain what they are as we insert a larger resistance in the circuit, then P = Ɛ·I = Ɛ²/R, and so the power that’s being delivered would be inversely proportional to R. To be clear, we’d have a relation between P and R like the one below.

This is somewhat weird. Why? Well… I also have to show you that the power that goes into moving our coil in the magnetic field, i.e. the rate of mechanical work required to rotate the coil against the magnetic forces, is equal to the electric power Ɛ·I, i.e. the rate at which electrical energy is being delivered by the emf of the generator. However, I’ll postpone that for a while and, hence, I’ll just ask you, once again, to take my word for it. 🙂 Now, if that’s true, so if the mechanical power equals the electric power, then that implies that a larger resistance will reduce the mechanical power we need to maintain the angular velocity ω. Think of a practical example: if we’d double the resistance (i.e. we halve the load), and if the voltage stays the same, then the current would be halved, and the power would also be halved. And let’s think about the limit situations: as the resistance goes to infinity, the power that’s being delivered goes to zero, as the current goes to zero, while if the resistance goes to zero, both the current as well as the power would go to infinity!
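The inverse relation between power and resistance is easy to tabulate. This is a sketch only: it assumes the emf is held at a fixed 12 V (an assumed value, think of it as the instantaneous emf) while we swap resistors, and `delivered_power` is a name made up for the illustration.

```python
def delivered_power(emf_volts, resistance):
    """P = emf * I with I = emf / R, so P = emf**2 / R."""
    return emf_volts ** 2 / resistance


emf_volts = 12.0  # assumed fixed emf

# Doubling the resistance halves the power; a huge resistance draws almost none.
for resistance in (1.0, 2.0, 4.0, 1000.0):
    print(resistance, delivered_power(emf_volts, resistance))
```

Going the other way, letting the resistance shrink towards zero makes the result blow up, which is the 'short circuit' limit mentioned above.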

Well… We actually know that’s also true in real life: actual generators consume more fuel when the load increases, so when they deliver more power, and much less fuel, so less power, when there’s no load at all. You’ll know that, at least when you’re living in a developing country with a lot of load shedding! 🙂 And the difference is huge: a generator with no load, or just a little load, will only consume 10% of what it needs when fully loaded. It’s totally in line with what I wrote on the relationship between the resistance and the current that it draws in. So, yes, it does make sense:

An emf does produce more current if the resistance in the circuit is low (i.e. when the load is high), and the stronger currents do represent greater mechanical forces.

That’s a very remarkable thing. It means that, if we’d put a larger load on our little AC generator, it should require more mechanical work to keep the coil rotating at the same angular velocity ω. But… What changes? The change in flux is the same, the Δt is the same, and so what changes really? What changes is the current going through the coil, and it’s not a change in that ΔQ factor above, but a change in its velocity v.

Hmm… That all looks quite complicated, doesn’t it? It does, so let’s get back to the analysis of what we have here, so we’ll simply assume that we have some dynamic equilibrium obeying that formula above, and so I and R are what they are, and we relate them to Ɛ according to that equation above, i.e.:

I = Ɛ/R

Now let me prove those formulas on the power of our generator and in the circuit. We have all these charges in our coil that are receiving some energy. Now, the rate at which they receive energy is F·v.

Huh? Yes. Let me explain: the work that’s being done on a charge along some path is the line integral ∫ F·ds along this path. But the infinitesimal distance ds is equal to v·dt, as ds/dt = v (note that we write s and v as vectors, so the dot product with F gives us the component of F that is tangential to the path). So ∫ F·ds = ∫ (F·v)dt. So the time rate of change of the energy, which is the power, is F·v. Just take the time derivative of the integral. 🙂

Now let’s assume we have n moving charges per unit length of our coil (so that’s in line with what I wrote about ΔQ above), then the power being delivered to any element ds of the coil is (F·v)·n·ds, which can be written as: (F·ds)·n·v. [Why? Because v and ds have the same direction: the direction of both vectors is tangential to the wire, always.] Now all we need to do to find out how much power is being delivered to the circuit by our AC generator is integrate this expression over the coil, so we need to find:

Power = ∮ (F·ds)·n·v

However, the emf (Ɛ) is defined as the line integral ∫ E·ds, taken around the entire coil, and E = F/q, and the current I is equal to I = q·n·v. So the power from our little AC generator is indeed equal to:

Power = Ɛ·I

So that’s done. Now I need to make good on my other promise, and that is to show that the Ɛ·I product is equal to the mechanical power that’s required to rotate the coil in the magnetic field. So how do we do that?

We know there’s going to be some torque because of the current in the coil. Its formula is given by τ = μ×B. What magnetic field? Well… Let me refer you to my post on the magnetic dipole and its torque: it’s not the magnetic field caused by the current, but the external magnetic field, so that’s the B we’ve been talking about here all along. So… Well… I am not trying to fool you here. 🙂 However, the magnetic moment μ was not defined by that external field, but by the current in the coil and its area. Indeed, μ’s magnitude was the current times the area, so that’s N·I·S in this case. Of course, we need to watch out because μ is a vector itself and so we need the angle between μ and B to calculate that vector cross product τ = μ×B. However, if you check how we defined the direction of μ, you’ll see it’s normal to the plane of the coil and, hence, the angle between μ and B is the very same θ = ω·t that we started our analysis with. So, to make a long story short, the magnitude of the torque τ is equal to:

τ = (N·I·S)·B·sinθ

Now, we know the torque is also equal to the work done per unit of distance traveled (around the axis of rotation, that is), so τ = dW/dθ. Now dθ = d(ω·t) = ω·dt. So we can now find the work done per unit of time, so that’s the power once more:

dW/dt = ω·τ = ω·(N·I·S)·B·sinθ

But so we found that Ɛ = N·S·B·ω·sinθ, so… Well… We find that:

dW/dt = Ɛ·I
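If you don’t trust the algebra, you can plug in numbers. The sketch below uses made-up values for the coil, the field and the load resistance (all assumptions of mine), computes the current from I = Ɛ/R, and then compares ω·τ with Ɛ·I at some arbitrary angle θ.

```python
import math


def emf(theta, N, B, S, omega):
    """Generator emf: N*S*B*omega*sin(theta)."""
    return N * S * B * omega * math.sin(theta)


def torque(theta, N, I, S, B):
    """Magnitude of the torque on the coil: (N*I*S)*B*sin(theta)."""
    return (N * I * S) * B * math.sin(theta)


# Assumed example values: 100 turns, 0.5 T, 0.01 m^2, 50 Hz, 10 ohm load.
N, B, S = 100, 0.5, 0.01
omega = 2 * math.pi * 50
R = 10.0
theta = 0.7  # any angle will do

I = emf(theta, N, B, S, omega) / R               # current drawn by the load
mechanical_power = omega * torque(theta, N, I, S, B)
electrical_power = emf(theta, N, B, S, omega) * I

print(mechanical_power, electrical_power)
```

Both numbers come out the same, whatever R you pick: a lower resistance means more current, more torque to overcome, and more electric power, all in step.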

Now, this equation doesn’t sort out our question as to how much power actually goes in and out of the circuit as we put some load on it, but it is what we promised to do: I showed that the mechanical work we’re doing on the coil is equal to the electric energy that’s being delivered to the circuit. 🙂

It’s all quite mysterious, isn’t it? It is. And we didn’t include other stuff that’s relevant here, such as the phenomenon of self-inductance: the varying current in the coil will actually produce its own magnetic field and, hence, in practice, we’d get some “back emf” in the circuit. This “back emf” is opposite to the current when it is increasing, and it is in the direction of the current when it is decreasing. In short, the self-inductance effect causes a current to have ‘inertia’: the inductive effects try to keep the flow constant, just as mechanical inertia tries to keep the velocity of an object constant. But… Well… I left that out. I’ll talk about that next time because…

[…] Well… It’s getting late in the day, and so I must assume this is sort of ‘OK enough’ as an introduction to what we’ll be busying ourselves with over the coming week. You take care, and I’ll talk to you again some day soon. 🙂

Perhaps one little note, on a question that might have popped up when you were reading all of the above: so how do actual generators keep the voltage up? Well… Most AC generators are, indeed, so-called constant speed devices. You can download some manuals from the Web, and you’ll find things like this: don’t operate at speeds more than 4% above the rated speed, or more than 1% below the rated speed. Fortunately, the so-called engine governor will take care of that. 🙂

Addendum: The concept of impedance

In one of my posts on oscillators, I explain the concept of impedance, which is the equivalent of resistance, but for AC circuits. Just like resistance, impedance also sort of measures the ‘opposition’ that a circuit presents to a current when a voltage is applied, but it’s a complex ratio, as opposed to R = V/I. It’s literally a complex ratio because the impedance has a magnitude and a direction, or a phase as it’s usually referred to. Hence, one will often write the impedance (denoted by Z) using Euler’s formula:

Z = |Z|·e^{iθ}

The illustration below (credit goes to Wikipedia, once again) explains what’s going on. It’s a pretty generic view of the same AC circuit. The truth is: if we apply an alternating current, then the current and the voltage will both go up and down, but the current signal will usually lag the voltage signal, and the phase factor θ tells us by how much. Hence, using complex-number notation, we write:

V = I∗Z = I∗|Z|·e^{iθ}

Now, while that resembles the V = R·I formula, you should note the bold-face type for V and I, and the ∗ symbol I am using here for multiplication. First the ∗ symbol: that’s to make it clear we’re not talking a vector cross product A×B here, but a product of two complex numbers. The bold-face for V and I implies they’re like vectors, or like complex numbers: so they have a phase too and, hence, we can write them as:

V = |V|·e^{i(ωt + θ_V)}
I = |I|·e^{i(ωt + θ_I)}

To be fully complete – you may skip all of this if you want, but it’s not that difficult, nor very long – it all works out as follows. We write:

V = I∗Z = |I|·e^{i(ωt + θ_I)}∗|Z|·e^{iθ} = |I||Z|·e^{i(ωt + θ_I + θ)} = |V|·e^{i(ωt + θ_V)}

Now, this equation must hold for all t, so we can equate the magnitudes and phases and, hence, we get: |V| = |I||Z| and so we get the formula we need, i.e. the phase difference between our function for the voltage and our function for the current.

θ_V = θ_I + θ

Of course, you’ll say: voltage and current are something real, aren’t they? So what’s this about complex numbers? You’re right. I’ve used the complex notation only to simplify the calculus, so it’s only the real part of those complex-valued functions that counts.
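Python has complex numbers built in, so the V = I∗Z relation can be tried out directly. The sketch below is just an illustration: the impedance (50 Ω with a 30° phase), the current amplitude and the 50 Hz frequency are all assumed values, and `phasor` is a helper name of my own.

```python
import cmath
import math


def phasor(magnitude, phase, omega, t):
    """Complex signal |A|*exp(i*(omega*t + phase)); only its real part is physical."""
    return magnitude * cmath.exp(1j * (omega * t + phase))


# Assumed example: |Z| = 50 ohm, theta = 30 degrees, 2 A current at 50 Hz.
Z = cmath.rect(50.0, math.pi / 6)
omega = 2 * math.pi * 50
theta_I = 0.2
t = 0.004

I = phasor(2.0, theta_I, omega, t)
V = I * Z  # ordinary complex multiplication

print(abs(V))               # magnitudes multiply: |V| = |I|*|Z|
print(cmath.phase(V / I))   # phases add: theta_V - theta_I = theta
print(V.real, I.real)       # the measurable voltage and current at time t
```

Setting the phase of Z to zero gives a plain resistor: the phase difference vanishes and V = I∗Z reduces to the familiar V = R·I.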

Oh… And also note that, as mentioned above, we do not have such lag or phase difference when only resistors are involved. So we don’t need the concept of impedance in the analysis above. With this addendum, I just wanted to be as complete as I can be. 🙂

Induced currents

In my two previous posts, I presented all of the ingredients of the meal we’re going to cook now, most notably:

The formula for the torque on a loop of current in a magnetic field, and its energy: (i) τ = μ×B, and (ii) U_mech = −μ·B.
The Biot-Savart Law, which gives you the magnetic field that’s produced by wires carrying currents:

Both ingredients are, obviously, relevant to the design of an electromagnetic motor, i.e. an ‘engine that can do some work’, as Feynman calls it. 🙂 Its principle is illustrated below.

The two formulas above explain how and why the coil goes around, and the coil can be made to keep going by arranging that the connections to the coil are reversed each half-turn by contacts mounted on the shaft. Then the torque is always in the same direction. That’s how a small direct current (DC) motor is made. My father made me make a couple of these thirty years ago, with a magnet, a big nail and some copper coil. I used sliding contacts, and they were the most difficult thing in the whole design. But now I found a very nice demo on YouTube of a guy whose system to ‘reverse’ the connections is wonderfully simple: he doesn’t use any sliding contacts. He just removes half of the insulation on the wire of the coil on one side. It works like a charm, but I think it’s not so sustainable, as it spins so fast that the insulation on the other side will probably come off after a while! 🙂

Now, to make this motor run, you need current and, hence, 19th century physicists and mechanical engineers also wondered how one could produce currents by changing the magnetic field. Indeed, they could use Alessandro Volta’s ‘voltaic pile‘ to produce currents but it was not very handy: it consisted of alternating zinc and copper discs, with pieces of cloth soaked in salt water in-between!

Now, while the Biot-Savart Law goes back to 1820, it took another decade to find out how that could be done. Initially, people thought magnetic fields should just cause some current, but that didn’t work. Finally, Faraday unequivocally established the fundamental principle that electric effects are only there when something is changing. So you’ll get a current in a wire by moving it in a magnetic field, or by moving the magnet or, if the magnetic field is caused by some other current, by changing the current in that wire. It’s referred to as the ‘flux rule’, or Faraday’s Law. Remember: we’ve seen Gauss’ Law, then Ampère’s Law, and then that Biot-Savart Law, and so now it’s time for Faraday’s Law. 🙂 Faraday’s Law is Maxwell’s third equation really, aka the Maxwell-Faraday Law of Induction:

∇×E = −∂B/∂t

Now you’ll wonder: what’s flux got to do with this formula? ∇×E is about circulation, not about flux! Well… Let me copy Feynman’s answer:

So… There you go. And, yes, you’re right, instead of writing Faraday’s Law as ∇×E = −∂B/∂t, we should write it as:

That’s easier to understand, and it’s also easier to work with, as we’ll see in a moment. So the point is: whenever the magnetic flux changes, there’s a push on the electrons in the wire. That push is referred to as the electromotive force, abbreviated as emf or EMF, and so it’s that line and/or surface integral above indeed. Let me paraphrase Feynman so you fully understand what we’re talking about here:
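For reference, this is how the flux rule is usually written in integral form. The notation is the standard textbook one, and I am assuming it here rather than quoting it: Γ is the closed loop, S is any surface bounded by that loop, n is the unit normal to that surface, and da is an element of area. It follows from ∇×E = −∂B/∂t by applying Stokes’ theorem to a loop that doesn’t move.

```latex
\mathcal{E}
  \;=\; \oint_{\Gamma} \mathbf{E}\cdot d\mathbf{s}
  \;=\; \int_{S} (\nabla\times\mathbf{E})\cdot\mathbf{n}\,da
  \;=\; -\frac{d}{dt}\int_{S} \mathbf{B}\cdot\mathbf{n}\,da
```

The left-hand side is the circulation of E (the emf), and the right-hand side is minus the time rate of change of the magnetic flux through the loop.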

When we move our wire in a magnetic field, or when we move a magnet near the wire, or when we change the current in a nearby wire, there will be some net push on the electrons in the wire in one direction along the wire. There may be pushes in different directions at different places, but there will be more push in one direction than another. What counts is the push integrated around the complete circuit. We call this net integrated push the electromotive force (abbreviated emf) in the circuit. More precisely, the emf is defined as the tangential force per unit charge in the wire integrated over length, once around the complete circuit.

So that’s the integral. 🙂 And that’s how we can turn that motor above into a generator: instead of putting a current through the wire to make it turn, we can turn the loop, by hand or by a waterwheel or by whatever. Now, when the coil rotates, its wires will be moving in the magnetic field and so we will find an emf in the circuit of the coil, and so that’s how the motor becomes a generator.

Now, let me quickly interject something here: when I say ‘a push on the electrons in the wire’, what electrons are we talking about? How many? Well… I’ll answer that question in very much detail in a moment but, as for now, just note that the emf is some quantity expressed per coulomb or, as Feynman puts it above, per unit charge. So we’ll need to multiply it with the current in the circuit to get the power of our little generator.

OK. Let’s move on. Indeed, all I can do here is mention just a few basics, so we can move on to the next thing. If you really want to know all of the nitty-gritty, then you should just read Feynman’s Lecture on induced currents. That’s got everything. And, no, don’t worry: contrary to what you might expect, my ‘basics’ do not amount to a terrible pile of formulas. In fact, it’s all easy and quite amusing stuff, and I should probably include a lot more. But then… Well… I always need to move on… If not, I’ll never get to the stuff that I really want to understand. 😦

The electromotive force

We defined the electromotive force above, including its formula:

What are the units? Let’s see… We know B was measured not in newton per coulomb, like the electric field E, but in N·s/(C·m), because we had to multiply the magnetic field strength with the velocity of the charge to find the force per unit charge, cf. the F/q = v×B equation. Now what’s the unit in which we’d express that surface integral? We must multiply with m², so we get N·m·s/C. Now let’s simplify that by noting that one volt is equal to 1 N·m/C. [The volt has a number of definitions, but the one that applies here is that it’s the potential difference between two points that will impart one joule (i.e. 1 N·m) of energy to a unit of charge (i.e. 1 C) that passes between them.] So we can measure the magnetic flux in volt-seconds, i.e. V·s. And then we take the derivative with regard to time, so we divide by s, and so we get… Volt! The emf is measured in volt!
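Written out as one chain, the unit bookkeeping above looks as follows. The symbol Φ_B for the magnetic flux is my own shorthand here, and the square brackets just mean ‘the unit of’.

```latex
[\Phi_B] \;=\; [B]\cdot\mathrm{m^2}
         \;=\; \frac{\mathrm{N\,s}}{\mathrm{C\,m}}\cdot\mathrm{m^2}
         \;=\; \frac{\mathrm{N\,m}}{\mathrm{C}}\cdot\mathrm{s}
         \;=\; \mathrm{V\,s},
\qquad
[\mathcal{E}] \;=\; \frac{[\Phi_B]}{\mathrm{s}} \;=\; \mathrm{V}
```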

Does that make sense? I guess so: the emf causes a current, just like a potential difference, i.e. a voltage, and, therefore, we can and should look at the emf as a voltage too!

But let’s think about it some more, though. In differential form, Faraday’s Law is just that ∇×E = −∂B/∂t equation, so that’s just one of Maxwell’s four equations, but we prefer to write it as the “flux rule”. Now, the “flux rule” says that the electromotive force (abbreviated as emf or EMF) on the electrons in a closed circuit is equal to the time rate of change of the magnetic flux it encloses. As mentioned above, we measure magnetic flux in volt-seconds (i.e. V·s), so its time rate of change is measured in volt (because the time rate of change is a quantity expressed per second), and so the emf is measured in volt, i.e. joule per coulomb, as 1 V = 1 N·m/C = 1 J/C. What does it mean?

The magnetic flux can change because the surface covered by our loop changes, or because the field itself changes, or both. Whatever the cause, it will change the emf, or the voltage, and so it will make the electrons move. So let’s suppose we have some generator generating some emf. The emf can be used to do some work. We can charge a capacitor, for example. So how would that work?

More charge on the capacitor will increase the voltage V of the capacitor, i.e. the potential difference V = Φ₁ − Φ₂ between the two plates. Now, we know that the increase of the voltage V will be proportional to the increase of the charge Q, and that the constant of proportionality is defined by the capacity C of the capacitor: C = Q/V. [How do we know that? Well… Have a look at my post on capacitors.] Now, if our capacitor has an enormous capacity, then its voltage won’t increase very rapidly. However, it’s clear that, no matter how large the capacity, its voltage will increase. It’s just a matter of time. Now, its voltage cannot be higher than the emf provided by our ‘generator’, because it will then want to discharge through the same circuit!

So we’re talking power and energy here, and so we need to put some load on our generator. Power is the rate of doing work, so it’s the time rate of change of energy, and it’s expressed in joule per second. The energy of our capacitor is U = (1/2)·Q²/C = (1/2)·C·V². [How do we know that? Well… Have a look at my post on capacitors once again. :-)] So let’s take the time derivative of U, writing V = Q/C for the voltage at that instant. We get: dU/dt = d[(1/2)·Q²/C]/dt = (Q/C)·dQ/dt = V·dQ/dt. So that’s the power that the generator would need to supply to charge the capacitor. As I’ll show in a moment, the power supplied by a generator is, indeed, equal to the emf times the current, and the current is the time rate of change of the charge, so I = dQ/dt.
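Here is a quick numerical check of that dU/dt = V·dQ/dt formula. It is a sketch with assumed values: a 1 mF capacitor holding 5 mC, being charged by a steady 20 mA current. The names are mine.

```python
def capacitor_energy(Q, C):
    """Energy stored in a capacitor: U = (1/2) * Q**2 / C."""
    return 0.5 * Q ** 2 / C


# Assumed example values.
C = 1e-3    # capacity in farad
Q = 0.005   # charge on the plates right now, in coulomb
I = 0.02    # charging current dQ/dt, in ampere
dt = 1e-6   # small time step for the finite difference

V = Q / C                # voltage at this instant
power_formula = V * I    # dU/dt = V * dQ/dt

# Same quantity from the energy itself, a moment before and a moment after.
energy_after = capacitor_energy(Q + I * dt, C)
energy_before = capacitor_energy(Q - I * dt, C)
power_numeric = (energy_after - energy_before) / (2 * dt)

print(power_formula, power_numeric)
```

Both come out at about 0.1 W. Note that V itself creeps up as the charge builds, so the power needed for the same current grows with time.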

So, yes, it all works out: the power that’s being supplied by our generator will be used to charge our capacitor. Now, you may wonder: what about the current? Where is the current in Faraday’s Law? The answer is: Faraday’s Law doesn’t have the current. It’s just not there. The emf is expressed in volt, and so that’s energy per coulomb, so it’s per unit charge. How much power a generator can and will deliver depends on its design, and the circuit and load that we will be putting on it. So we can’t say how many coulomb we will have. It all depends. But you can imagine that, if the loop would be bigger, or if we’d have a coil with many loops, then our generator would be able to produce more power, i.e. it would be able to move more electrons, so the mentioned power = (emf)×(current) product would be larger. 🙂

Finally, to conclude, note Feynman’s definition of the emf: the tangential force per unit charge in the wire integrated over length around the complete circuit. So we’ve got force times distance here, but per unit charge. Now, force times distance is work, or energy, and so… Yes, emf is joule per coulomb, definitely! 🙂

[…] Don’t worry too much if you don’t quite ‘get’ this. I’ll come back to it when discussing electric circuits, which I’ll do in my next posts.

Self-inductance and Lenz’s rule

We talked about motors and generators above. We also have transformers, like the one below. What’s going on here is that an alternating current (AC) produces a continuously varying magnetic field, which generates an alternating emf in the second coil, which produces enough power to light an electric bulb.

Now, the total emf in coil (b) is the sum of the emfs of the separate turns of the coil, so if we wind (b) with many turns, we’ll get a larger emf, so we can ‘transform’ the voltage to some other voltage. From your high-school classes, you should know how that works.

The thing I want to talk about here is something else, though. There is an induction effect in coil (a) itself. Indeed, the varying current in coil (a) produces a varying magnetic field inside itself, and the flux of this field is continually changing, so there is a self-induced emf in coil (a). The effect is called self-inductance, and so it’s the emf acting on a current itself when it is building up a magnetic field or, in general, when its field is changing in any way. It’s a most remarkable phenomenon, and so let me paraphrase Feynman as he describes it:

“When we gave “the flux rule” that the emf is equal to the rate of change of the flux linkage, we didn’t specify the direction of the emf. There is a simple rule, called Lenz’s rule, for figuring out which way the emf goes: the emf tries to oppose any flux change. That is, the direction of an induced emf is always such that if a current were to flow in the direction of the emf, it would produce a flux of B that opposes the change in B that produces the emf. In particular, if there is a changing current in a single coil (or in any wire), there is a “back” emf in the circuit. This emf acts on the charges flowing in the coil to oppose the change in magnetic field, and so in the direction to oppose the change in current. It tries to keep the current constant; it is opposite to the current when the current is increasing, and it is in the direction of the current when it is decreasing. A current in a self-inductance has “inertia,” because the inductive effects try to keep the flow constant, just as mechanical inertia tries to keep the velocity of an object constant.”

Hmm… That’s something you need to read a couple of times to fully digest it. There’s a nice demo on YouTube, showing an MIT physics video demonstrating this effect with a metal ring placed on the end of an electromagnet. You’ve probably seen it before: the electromagnet is connected to a current, and the ring flies into the air. The explanation is that the induced currents in the ring create a magnetic field opposing the change of field through it. So the ring and the coil repel just like two magnets with like poles facing each other. The effect is no longer there when a thin radial cut is made in the ring, because then there can be no current. The nice thing about the video is that it shows how the effect gets much more dramatic when an alternating current is applied, rather than a DC current. And it also shows what happens when you first cool the ring in liquid nitrogen. 🙂

You may also notice the sparks when the electromagnet is being turned on. Believe it or not, that’s also related to a “back emf”. Indeed, when we disconnect a large electromagnet by opening a switch, the current is supposed to immediately go to zero but, in trying to do so, it generates a large “back emf”: large enough to develop an arc across the opening contacts of the switch. The high voltage is also not good for the insulation of the coil, as it might damage it. So that’s why large electromagnets usually include some extra circuit, which allows the “back current” to discharge less dramatically. But I’ll refer you to Feynman for more details, as any illustration here would clutter the exposé.

Eddy currents

I like educational videos, and so I should give you a few references here, but there are so many of them that I’ll let you google a few yourself. The most spectacular eddy currents are those that appear in a superconductor: even back in the 1960s, when Feynman wrote his Lectures, the effect of magnetic levitation was well known. Feynman illustrates the effect with the simple diagram below: when bringing a magnet near to a perfect conductor, such as tin below 3.8 K, eddy currents will create opposing fields, so that no magnetic flux enters the superconducting material. The effect is also referred to as the Meissner effect, after the German physicist Walther Meissner, who discovered it in 1933. Superconductivity itself was discovered much earlier (in 1911) by a Dutch physicist in Leiden, Heike Kamerlingh Onnes, who got a Nobel Prize for his low-temperature work.

Of course, we have eddy currents in less dramatic situations as well. The phenomenon of eddy currents is usually demonstrated by the braking of a sheet of metal as it swings back and forth between the poles of an electromagnet, as illustrated below (left). The illustration on the right shows how the eddy-current effect can be drastically reduced by cutting slots in the plate, so that’s like making a radial cut in our jumping ring. 🙂

The Faraday disc

The Faraday disc is interesting, not only from a historical point of view – the illustration below is a 19th century model, so Michael Faraday may have used it himself – but also because it seems to contradict the “flux rule”: as the disc rotates through a steady magnetic field, it will produce some emf, even though there’s no change in the flux. How is that possible?

The answer, of course, is that we are ‘cheating’ here: the material is moving, so we’re actually moving the ‘wire’, or the circuit if you want, so here we need to combine two equations:

If we do that, you’ll see it all makes sense. 🙂 Oh… That Faraday disc is referred to as a homopolar generator, and it’s quite interesting. You should check out what happened to the concept in the Wikipedia article on it. The Faraday disc was apparently used as a source for power pulses in the 1950s. The thing below could store 500 mega-joules and deliver currents up to 2 mega-ampère, i.e. 2 million amps! Fascinating, isn’t it? 🙂
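To put a number on the emf of such a disc, here is a minimal sketch. It assumes a uniform field B parallel to the axis and a disc of radius R spinning at angular velocity ω, so that a charge at radius r moves at speed ω·r and feels a v×B force per unit charge equal to ω·r·B along the radius. Integrating from the axis to the rim gives ½·B·ω·R². The function names and the numerical values are illustrative assumptions, not data from the text.

```python
import math

def disc_emf_numeric(B, omega, R, steps=1000):
    """Sum the motional emf (omega*r)*B*dr along one radius (midpoint rule)."""
    dr = R / steps
    return sum(omega * (i + 0.5) * dr * B * dr for i in range(steps))

def disc_emf_closed_form(B, omega, R):
    """Closed-form result of the same integral: (1/2)*B*omega*R^2."""
    return 0.5 * B * omega * R ** 2

# Illustrative values only: 1 tesla, 50 revolutions per second, 10 cm radius.
B, omega, R = 1.0, 2 * math.pi * 50, 0.1
print(disc_emf_numeric(B, omega, R), disc_emf_closed_form(B, omega, R))
```

With these (assumed) values the emf is only about 1.57 volt, which is why such a generator is a low-voltage, high-current device.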

Magnetic dipoles and their torque and energy

We studied the magnetic dipole in very much detail in one of my previous posts but, while we talked about an awful lot of stuff there, we actually managed to not talk about the torque on it, when it’s placed in the magnetic field of other currents, or some other magnetic field tout court. Now, that’s what drives electric motors and generators, of course, and so we should talk about it, which is what I’ll do here. Let me first remind you of the concept of torque, and then we’ll apply it to a loop of current. 🙂

The concept of torque

The concept of torque is easy to grasp intuitively, but the math involved is not so easy. Let me sum up the basics (for the detail, I’ll refer you to my posts on spin and angular momentum). In essence, for rotations in space (i.e. rotational motion), the torque is what the force is for linear motion:

It’s the torque (τ) that makes an object spin faster or slower around some axis, just like the force would accelerate or decelerate that very same object when it would be moving along some curve.
There’s also a similar ‘law of Newton’ for torque: you’ll remember that the force equals the time rate of change of a vector quantity referred to as (linear) momentum: F = dp/dt = d(mv)/dt = ma (the mass times the acceleration). Likewise, we have a vector quantity that is referred to as angular momentum (L), and we can write: τ (i.e. the Greek tau) = dL/dt.
Finally, instead of linear velocity, we’ll have an angular velocity ω (omega), which is the time rate of change of the angle θ that defines how far the object has gone around (as opposed to the distance in linear dynamics, describing how far the object has gone along). So we have ω = dθ/dt. This is actually easy to visualize because we know that θ, expressed in radians, is actually the length of the corresponding arc on the unit circle. Hence, the equivalence with the linear distance traveled is easily ascertained.

There are many more similarities, like an angular acceleration: α = dω/dt = d²θ/dt², and we should also note that, just like the force, the torque is doing work – in its conventional definition as used in physics – as it turns an object instead of just moving it, so we can write:

ΔW = τ·Δθ

So it’s all the same-same but different once more 🙂 and so now we also need to point out some differences. The animation below does that very well, as it relates the ‘new’ concepts – i.e. torque and angular momentum – to the ‘old’ concepts – i.e. force and linear momentum. It does so using the vector cross product, which is really all you need to understand the math involved. Just look carefully at all of the vectors involved, which you can identify by their colors, i.e. red-brown (r), light-blue (τ), dark-blue (F), light-green (L), and dark-green (p).

So what do we have here? We have vector quantities once again, denoted by symbols in bold-face. Having said that, I should note that τ, L and ω are ‘special’ vectors: they are referred to as axial vectors, as opposed to the polar vectors F, p and v. To put it simply: polar vectors represent something physical, and axial vectors are more like mathematical vectors, but that’s a very imprecise and, hence, essentially incorrect definition. 🙂 Axial vectors are directed along the axis of spin – so that is, strangely enough, at right angles to the direction of spin, or perpendicular to the ‘plane of the twist’ as Feynman calls it – and the direction of the axial vector is determined by a convention which is referred to as the ‘right-hand screw rule’. 🙂

Now, I know it’s not so easy to visualize vector cross products, so it may help to first think of torque (also known, for some obscure reason, as the moment of the force) as a twist on an object or a plane. Indeed, the torque’s magnitude can be defined in another way: it’s equal to the tangential component of the force, i.e. F·sinφ (with φ the angle between r and F), times the distance between the object and the axis of rotation (we’ll denote this distance by r). This quantity is also equal to the product of the magnitude of the force itself and the length of the so-called lever arm, i.e. the perpendicular distance from the axis to the line of action of the force (this lever arm length is denoted by r₀). So, we can define τ without the use of the vector cross-product, and in no fewer than three different ways actually. Indeed, the torque is equal to:

The product of the tangential component of the force times the distance r: τ = r·F_t = r·F·sinφ;
The product of the length of the lever arm times the force: τ = r₀·F;
The work done per unit of angle turned: τ = ΔW/Δθ or τ = dW/dθ in the limit.

Phew! Yeah. I know. It’s not so easy… However, I regret to have to inform you that you’ll need to go even further in your understanding of torque. More specifically, you really need to understand why and how we define the torque as a vector cross product, and so please do check out that post of mine on the fundamentals of ‘torque math’. If you don’t want to do that, then just try to remember the definition of torque as an axial vector, which is:

τ = (τ_yz, τ_zx, τ_xy) = (τ_x, τ_y, τ_z) with

τ_x = τ_yz = yF_z – zF_y (i.e. the torque about the x-axis, i.e. in the yz-plane),

τ_y = τ_zx = zF_x – xF_z (i.e. the torque about the y-axis, i.e. in the zx-plane), and

τ_z = τ_xy = xF_y – yF_x (i.e. the torque about the z-axis, i.e. in the xy-plane).

The angular momentum L is defined in the same way:

L = (L_yz, L_zx, L_xy) = (L_x, L_y, L_z) with

L_x = L_yz = yp_z – zp_y (i.e. the angular momentum about the x-axis),

L_y = L_zx = zp_x – xp_z (i.e. the angular momentum about the y-axis), and

L_z = L_xy = xp_y – yp_x (i.e. the angular momentum about the z-axis).
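The component formulas above are easy to check numerically. The sketch below writes out the cross product exactly as in those formulas and compares the result with the ‘tangential force times distance’ and ‘lever arm times force’ definitions of the torque. The vectors r and F are arbitrary illustrative values in the xy-plane, and φ is simply my label for the angle from r to F.

```python
import math

def cross(a, b):
    """Cross product a x b, written out with the component formulas from the text."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - az * by,   # x-component: y*F_z - z*F_y
            az * bx - ax * bz,   # y-component: z*F_x - x*F_z
            ax * by - ay * bx)   # z-component: x*F_y - y*F_x

# Illustrative position and force, both in the xy-plane.
r = (3.0, 4.0, 0.0)
F = (-2.0, 1.0, 0.0)

tau = cross(r, F)                # torque vector, along the z-axis here

r_len = math.hypot(r[0], r[1])
F_len = math.hypot(F[0], F[1])
# Angle phi from r to F, measured counter-clockwise in the xy-plane.
phi = math.atan2(F[1], F[0]) - math.atan2(r[1], r[0])

tau_tangential = r_len * (F_len * math.sin(phi))   # r times tangential force
tau_lever_arm = (r_len * math.sin(phi)) * F_len    # lever arm r0 times force

print(tau, tau_tangential, tau_lever_arm)
```

All three routes give the same number for the z-component (11 in these units), which is the point of the three equivalent definitions.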

Let’s now apply the concepts to a loop of current.

The forces on a current loop

The geometry of the situation is depicted below. I know it looks messy but let me help you identify the moving parts, so to speak. 🙂 We’ve got a loop with current and so we’ve got a magnetic dipole with some moment μ. From my post on the magnetic dipole, you know that μ’s magnitude is equal to |μ| = μ = (current)·(area of the loop) = I·a·b.

Now look at the B vectors, i.e. the magnetic field. Please note that these vectors represent some external magnetic field! So it’s not like what we did in our post on the dipole: we’re not looking at the magnetic field caused by our loop, but at how it behaves in some external magnetic field. Now, because it’s kinda convenient to analyze, we assume that the direction of our external field B is the direction of the z-axis, so that’s what you see in this illustration: the B vectors all point north. Now look at the force vectors, remembering that the magnetic force is equal to:

F_magnetic = qv×B

So that gives the F₁, F₂, F₃, and F₄ vectors (so that’s the force on the first, second, third and fourth leg of the loop respectively) the magnitude and direction they’re having. Now, it’s easy to see that the opposite forces, i.e. the F₁–F₂ and F₃–F₄ pairs respectively, create a torque. The torque because of F₁ and F₂ is a torque which will tend to rotate the loop about the y-axis, so that’s a torque in the xz-plane, while the torque because of F₃ and F₄ will be some torque about the x-axis and/or the z-axis. As you can see, the torque is such that it will try to line up the moment vector μ with the magnetic field B. In fact, the geometry of the situation above is such that F₃ and F₄ have already done their job, so to speak: the moment vector μ is already lined up with the xz-plane, so there’s no net torque in that plane. However, that’s just because of the specifics of the situation here: the more general situation is that we’d have some torque about all three axes, and so we need to find that vector τ.

If we were talking about an electric dipole, the analysis would be very straightforward, because the electric force is just F_electric = qE, which we can also write as E = F_electric/q, so the field is just the force on one unit of electric charge, and so it’s (relatively) easy to see that we’d get the following formula for the torque vector:

τ = p×E

Of course, the p is the electric dipole moment here, not some linear momentum. [And, yes, please do try to check this formula. Sorry I can’t elaborate on it, but the objective of this blog is not to substitute for a textbook!]

Now, all of the analogies between the electric and magnetic dipole field, which we explored in the above-mentioned post of mine, would tend to make us think that we can write τ here as:

τ = μ×B

Well… Yes. It works. Now you may want to know why it works 🙂 and so let me give you the following hint. Each charge in a wire feels that F_magnetic = qv×B force, so the total magnetic force on some volume ΔV, which I’ll denote by ΔF for a while, is the sum of the forces on all of the individual charges. So let’s assume we’ve got N charges per unit volume, then we’ve got N·ΔV charges in our little volume ΔV, so we write: ΔF = N·ΔV·q·v×B. You’re probably confused now: what’s the v here? It’s the (drift) velocity of the (free) electrons that make up our current I. Indeed, the protons don’t move. 🙂 So N·q·v is just the current density j, so we get: ΔF = j×BΔV, which implies that the force per unit volume is equal to j×B. But we need to relate it to the current in our wire, not the current density. Relax. We’re almost there. The ΔV in a wire is just its cross-sectional area A times some length, which I’ll denote by ΔL, so ΔF = j×BΔV becomes ΔF = j×BAΔL. Now, jA is the vector current I, so we get the simple result we need here: ΔF = I×BΔL, i.e. the magnetic force per unit length on a wire is equal to ΔF/ΔL = I×B.
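The chain of substitutions in that hint can be written out in one go. The block below only restates the steps given in the paragraph above, with N the number of charges per unit volume, A the cross-sectional area of the wire and ΔL its length:

```latex
\begin{aligned}
\Delta\mathbf{F} &= N\,\Delta V\,q\,\mathbf{v}\times\mathbf{B} \\
 &= \mathbf{j}\times\mathbf{B}\,\Delta V, \qquad \mathbf{j} = N q\,\mathbf{v} \\
 &= \mathbf{j}\times\mathbf{B}\,A\,\Delta L, \qquad \Delta V = A\,\Delta L \\
 &= \mathbf{I}\times\mathbf{B}\,\Delta L, \qquad \mathbf{I} = \mathbf{j}A \\
\Rightarrow\quad \frac{\Delta\mathbf{F}}{\Delta L} &= \mathbf{I}\times\mathbf{B}
\end{aligned}
```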

Let’s now get back to our magnetic dipole and calculate F₁ and F₂. The length of ‘wire’ is the length of the leg of the loop, i.e. b, so we can write:

F₁ = −F₂ = b·I×B

So the magnitude of these forces is equal to F₁ = F₂ = I·B·b. Now, the length of the moment or lever arm is, obviously, equal to a·sinθ, so the magnitude of the torque is equal to the force times the lever arm (cf. the τ = r₀·F formula above) and so we can write:

τ = I·B·b·a·sinθ

But I·a·b is the magnitude of the magnetic moment μ, so we get:

τ = μ·B·sinθ

Now that’s consistent with the definition of the vector cross product:

τ = μ×B = |μ|·|B|·sinθ·n = μ·B·sinθ·n
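If you want to see the τ = μ×B result come out of the forces on the individual legs, the sketch below does the bookkeeping. It is a minimal illustration under my own assumptions: a rectangular loop with sides a and b centred on the origin, its normal tilted by θ from the z-axis in the xz-plane, a uniform field B along z, and a force I·(leg × B) on each straight leg acting at its midpoint. The function name loop_torque and the numbers are made up for the example.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(s, a):
    return tuple(s * x for x in a)

def loop_torque(I, a, b, B, theta):
    """Sum r x F over the four legs of a rectangular loop (sides a and b).

    Returns (torque summed over the legs, mu x B) so the two can be compared.
    """
    n = (math.sin(theta), 0.0, math.cos(theta))     # unit normal of the loop
    e1 = (math.cos(theta), 0.0, -math.sin(theta))   # in-plane direction of side a
    e2 = (0.0, 1.0, 0.0)                            # in-plane direction of side b
    Bvec = (0.0, 0.0, B)

    # Corners, ordered so the current circulates right-handedly around n.
    corners = [add(scale(sa * a / 2, e1), scale(sb * b / 2, e2))
               for sa, sb in ((-1, -1), (1, -1), (1, 1), (-1, 1))]

    torque = (0.0, 0.0, 0.0)
    for k in range(4):
        start, end = corners[k], corners[(k + 1) % 4]
        leg = add(end, scale(-1.0, start))          # vector along the leg
        force = scale(I, cross(leg, Bvec))          # F = I * (leg x B)
        midpoint = scale(0.5, add(start, end))      # uniform force acts at midpoint
        torque = add(torque, cross(midpoint, force))

    mu = scale(I * a * b, n)                        # magnetic moment vector
    return torque, cross(mu, Bvec)

torque, mu_cross_B = loop_torque(I=2.0, a=0.3, b=0.2, B=0.5, theta=math.radians(40))
print(torque)
print(mu_cross_B)
```

The two printed vectors agree, and only the two legs of length b contribute: the forces on the other two legs pull the loop apart but have no lever arm.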

Done! Now, electric motors and generators are all about work and, therefore, we also need to briefly talk about energy here.

The energy of a magnetic dipole

Let me remind you that we could also write the torque as the work done per unit of angle turned, i.e. as τ = ΔW/Δθ or τ = dW/dθ in the limit. Now, the torque tries to line up the moment with the field, i.e. it acts to reduce θ, and so the energy will be lowest when μ and B are parallel. Turning the loop against that torque costs work, so, counting the torque as negative when it acts to reduce θ, we write:

τ = −dU/dθ ⇔ dU = −τ·dθ = μ·B·sinθ·dθ

We should now integrate over the [0, θ] interval to find U, using the magnitude μ·B·sinθ we found above. That’s easy, because we know that d(cosθ)/dθ = −sinθ, so that integral yields:

U = μ·B·(1 − cosθ) + a constant

If we choose the constant to be zero, and if we equate μ·B with 1, we get the blue graph below:

The μ·B in the U = μ·B·(1 − cosθ) formula is just a scaling factor, obviously, so it determines the minimum and maximum energy. Now, you may want to limit the relevant range of θ to [0, π], but that’s not necessary: the energy of our loop of current does go up and down as shown in the graph. Just think about it: it all makes perfect sense!

Now, there is, of course, more energy in the loop than this U energy because energy is needed to maintain the current in the loop, and so we didn’t talk about that here. Therefore, we’ll qualify this ‘energy’ and call it the mechanical energy, which we’ll abbreviate by U_mech. In addition, we could, and will, choose some other constant of integration, so that amounts to choosing some other reference point for the lowest energy level. Why? Because it then allows us to write U_mech as a vector dot product, so we get:

U_mech = −μ·B·cosθ = −μ·B (with μ and B taken as vectors in the last expression)

The graph is pretty much the same, but it now goes from −μ·B to +μ·B, as shown by the red graph in the illustration above.
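The relation between the torque and the two energy curves can be checked with a few lines of code. This is only a sketch under stated assumptions: the torque is counted as positive when it increases θ (so it equals −μ·B·sinθ), the work is summed with a simple midpoint rule, and the function names are mine.

```python
import math

def torque_signed(mu, B, theta):
    """Torque counted positive when it increases theta; it pulls theta back to 0."""
    return -mu * B * math.sin(theta)

def energy_by_integration(mu, B, theta, steps=20000):
    """Work done against the torque when turning the loop from 0 to theta."""
    dtheta = theta / steps
    return sum(-torque_signed(mu, B, (i + 0.5) * dtheta) * dtheta
               for i in range(steps))

def u_mech(mu, B, theta):
    """U_mech = -mu*B*cos(theta), i.e. minus the dot product of mu and B."""
    return -mu * B * math.cos(theta)

mu, B = 1.0, 1.0   # mu*B set to 1, as for the graphs described in the text
for deg in (0, 60, 90, 180):
    th = math.radians(deg)
    # The two columns differ by the constant mu*B: only the reference level moves.
    print(deg, round(energy_by_integration(mu, B, th), 6), round(u_mech(mu, B, th), 6))
```

The first column runs from 0 to 2·μ·B and the second from −μ·B to +μ·B, so the two curves have the same shape and differ only by the choice of integration constant.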

Finally, you should note that the U_mech = −μ·B formula is similar to what you’ll usually see written for the energy of an electric dipole: U = −p·E. So that’s all nice and good! However, you should remember that the electrostatic energy of an electric dipole (i.e. two opposite charges separated by some distance d) is all of the energy, as we don’t need to maintain some current to create the dipole moment!

Now, Feynman does all kinds of things with these formulas in his Lectures on electromagnetism but I really think this is all you need to know about it—for the moment, at least. 🙂

The magnetic field of circuits: the Law of Biot and Savart

Pre-script (dated 26 June 2020): This post got mutilated by the removal of some material by the dark force. You should be able to follow the main story line, however. If anything, the lack of illustrations might actually help you to think things through for yourself. In any case, we now have different views on these concepts as part of our realist interpretation of quantum mechanics, so we recommend you read our recent papers instead of these old blog posts.

Original post:

We studied the magnetic dipole in very much detail in one of my previous posts. While we talked about an awful lot of stuff there, we actually managed to not talk about the torque on it, when it’s placed in the magnetic field of other currents. Now, that’s what drives electric motors and generators, of course, and so we should talk about it, which is what I’ll do in my next post. Before doing so, however, I need to give you one or two extra formulas generalizing some of the results we obtained in our previous posts on magnetostatics. So that’s what I do under this heading: the magnetic field of circuits. The idea is simple: loops of current are not always nice squares or circles. Their shape might be quite irregular, indeed, like the loop below.

Of course, the same general formula should apply. So we can find the magnetic vector potential with the following integral:

Just to make sure, let me re-insert its equivalent for electrostatics, so you can see they’re (almost) the same:

But we’re talking a wire here, so how can we relate the current density j and the volume element dV to that? It’s easy: the illustration below shows that we can simply write:

j·dV = j·S·ds = I·ds

Therefore, we can write our integral for the vector potential as:

Of course, you should note the subtle change from a volume integral to a line integral, so it’s not all that straightforward, but we’re good to go. Now, in electrostatics, we actually had a fairly simple integral for the electric field itself:
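The integrals referred to in this passage did not survive in this version of the post. In the notation used here (point 1 is where we want the potential, point 2 runs over the sources, and r₁₂ is the distance between them), they presumably read as follows. Take this as a reconstruction from the standard formulas of magnetostatics rather than as a quotation:

```latex
\begin{aligned}
\mathbf{A}(1) &= \frac{1}{4\pi\epsilon_0 c^2}\int \frac{\mathbf{j}(2)\,dV_2}{r_{12}}
  &&\text{(vector potential of a current distribution)} \\
\Phi(1) &= \frac{1}{4\pi\epsilon_0}\int \frac{\rho(2)\,dV_2}{r_{12}}
  &&\text{(electrostatic equivalent)} \\
\mathbf{A}(1) &= \frac{1}{4\pi\epsilon_0 c^2}\oint \frac{I\,d\mathbf{s}_2}{r_{12}}
  &&\text{(thin wire, using } \mathbf{j}\,dV = I\,d\mathbf{s}\text{)}
\end{aligned}
```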

To be clear, E(1) is the field of a known charge distribution, which is represented by ρ(2), at point (1). The integral is almost the same as the one for Φ, but we’re talking vectors here (E and e₁₂) rather than scalars (ρ and Φ), and you should also note the square in the denominator of the integral. 🙂

As you might expect, there is a similar integral for B, which we find by… Well… We just need to calculate B, so that’s the curl of A:

How do we do that? It’s not so easy, so let me just copy the master himself:

So this integral gives B directly in terms of the known currents. The geometry involved is easy but, just in case, Feynman illustrates it, quite simply, as follows:

Now, there’s one more step to take, and then we’re done. If we’re talking a circuit of small wire, then we can replace j·dV by I·ds once more, and, hence, we get the Biot-Savart Law in its final form:

Note the minus sign: it appears because we reversed the order of the vector cross product, and also note we actually have three integrals here, one for each component of B, so that’s just like that integral for A.
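To see the Biot-Savart law at work, here is a small numerical sketch. It assumes a circular loop of radius R in the xy-plane carrying a current I, chops the loop into short elements ds, and adds up the contributions I·ds×r₁₂/r₁₂³ with the 1/4πε₀c² prefactor (r₁₂ points from the current element to the field point, so this is the same thing as the −e₁₂×ds form with the minus sign). On the axis the result should match the well-known closed-form expression I·R²/(2ε₀c²·(R² + z²)^(3/2)). The function names and sample values are illustrative assumptions.

```python
import math

EPS0 = 8.8541878128e-12      # vacuum permittivity in F/m
C = 299_792_458.0            # speed of light in m/s
K = 1.0 / (4.0 * math.pi * EPS0 * C ** 2)   # the 1/(4*pi*eps0*c^2) prefactor

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def field_of_circular_loop(I, R, point, steps=2000):
    """Biot-Savart sum for a circular loop of radius R in the xy-plane.

    Each element contributes K * I * (ds x r12) / |r12|^3, where r12 points
    from the current element to the field point.
    """
    bx = by = bz = 0.0
    dphi = 2.0 * math.pi / steps
    for k in range(steps):
        phi = (k + 0.5) * dphi
        src = (R * math.cos(phi), R * math.sin(phi), 0.0)
        ds = (-R * math.sin(phi) * dphi, R * math.cos(phi) * dphi, 0.0)
        r12 = tuple(p - s for p, s in zip(point, src))
        dist = math.sqrt(sum(x * x for x in r12))
        cx, cy, cz = cross(ds, r12)
        bx += K * I * cx / dist ** 3
        by += K * I * cy / dist ** 3
        bz += K * I * cz / dist ** 3
    return (bx, by, bz)

def on_axis_exact(I, R, z):
    """Closed-form on-axis field: I*R^2 / (2*eps0*c^2*(R^2 + z^2)^(3/2))."""
    return I * R ** 2 / (2.0 * EPS0 * C ** 2 * (R ** 2 + z ** 2) ** 1.5)

# Illustrative values: 1 A in a loop of 10 cm radius, field 5 cm up the axis.
print(field_of_circular_loop(1.0, 0.1, (0.0, 0.0, 0.05)))
print(on_axis_exact(1.0, 0.1, 0.05))
```

The same function works for points off the axis, where no simple closed form is available; that is where a direct integral for B earns its keep.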

So… That’s it. 🙂 I’ll conclude by two small remarks:

The law is named after Jean-Baptiste Biot and Félix Savart, two incredible Frenchmen (it’s really worthwhile checking their biographies on Wikipedia), who jotted it down in 1820, so that’s almost 200 years ago. Isn’t that amazing?
You see we sort of got rid of the vector potential with this formula. So the question is: “What is the advantage of the vector potential if we can find B directly with a vector integral? After all, A also involves three integrals!” I’ll let Feynman reply to that question:

Because of the cross product, the integrals for B are usually more complicated. Also, since the integrals for A are like those of electrostatics, we may already know them. Finally, we will see that in more advanced theoretical matters (in relativity, in advanced formulations of the laws of mechanics, like the principle of least action to be discussed later, and in quantum mechanics), the vector potential plays an important role.

In fact, Feynman makes the point on the vector potential being relevant very explicit by just boldly stating two laws in quantum mechanics in which the magnetic and electric potential are used, not the magnetic or electric field. Indeed, it seems an external magnetic or electric field changes probability amplitudes. I’ll just jot down the two laws below, but leave it to you to decide whether or not you want to read the whole argument.

The key point that Feynman is making is that Φ and A are equally ‘real’ or ‘unreal’ as E and B in terms of explaining physical realities. I get the point, but I don’t find it necessary to copy the whole argument here. Perhaps it’s sufficient to just quote Feynman’s introduction to it, which says it all, in my humble opinion, that is:

“There are many changes in what concepts are important when we go from classical to quantum mechanics. We have already discussed some of them in Volume I. In particular, the force concept gradually fades away, while the concepts of energy and momentum become of paramount importance. You remember that instead of particle motions, one deals with probability amplitudes which vary in space and time. In these amplitudes there are wavelengths related to momenta, and frequencies related to energies. The momenta and energies, which determine the phases of wave functions, are therefore the important quantities in quantum mechanics. Instead of forces, we deal with the way interactions change the wavelength of the waves. The idea of a force becomes quite secondary—if it is there at all. When people talk about nuclear forces, for example, what they usually analyze and work with are the energies of interaction of two nucleons, and not the force between them. Nobody ever differentiates the energy to find out what the force looks like. In this section we want to describe how the vector and scalar potentials enter into quantum mechanics. It is, in fact, just because momentum and energy play a central role in quantum mechanics that A and Φ provide the most direct way of introducing electromagnetic effects into quantum descriptions.”

OK. That’s sufficient really. Onwards!

Gravitational waves: how should we imagine them?

This post is not a post. It’s just a reminder for myself to look into gravitational waves at some point in time. We know how electromagnetic waves travel through space: they do so because of the mechanism described in Maxwell’s equations: a changing magnetic field causes a changing electric field, and a changing electric field causes a (changing) magnetic field, as illustrated below.

So… Electromagnetism is one phenomenon only, but we do analyze the E and B fields as separate things, as the equations below illustrate:

B is co-defined with E, and then all of the dynamics work themselves out through the ∂E/∂t and ∂B/∂t functions. Now, when talking gravity, we only have positive ‘charges’, referred to as masses, always attracting each other, but that doesn’t matter all that much: Coulomb’s and Newton’s Law have the same mathematical structure (as shown below), except for the minus sign, and so there must be some equivalent to the electromagnetic wave, explaining gravitational waves as a propagation of a force on a ‘unit charge’ (so that’s a unit mass in this case) using the very same mathematical concepts, i.e. the concepts of a field – two separate fields, I should say, just like E and B – and of its flux and circulation.

So we’d have an E_G and a B_G field, so to speak, or a G_E and a G_B field, and formulas for the flux and circulation of both, resembling Maxwell’s equations. In fact, they’d be the same, more or less. It’s a powerful idea, and I am sure the idea has been fully developed elsewhere. In fact, I quickly googled a few things, but it seems that the whole area is a pretty new area of research, both theoretically as well as experimentally—to my surprise!

Hmm… Interesting idea, but I’ll need to do a lot more analysis to be able to grind through this… One day, I will… The first thing I need to do, obviously, is to thoroughly study how these equations for E and B above can be derived from Maxwell’s equations. I’ll need some time for that, and then I can see if it’s consistent with Einstein’s special and general theories of relativity.

I’ll update you on progress. 🙂

The field of electric and magnetic dipoles

You’ll surely remember the electric field of an electric dipole, as depicted below:

We distinguish two components:

A z-component along the axis of the dipole itself, and
A so-called transverse component, i.e. the component that is perpendicular to the axis of the dipole, or the z-axis.

Pythagoras’ rule then gives us the total field, which is equal to:

I’ll give you the formulas for both components in a moment, but let’s first introduce the concept of a magnetic dipole. Look at the magnetic field of a solenoid below, and imagine we reduce the solenoid to one loop of current only. What would we get?

We get a magnetic field that resembles the electric field from an electric dipole. Of course, it’s a magnetic field, and it’s not the field of an electric dipole but of a magnetic dipole which, in this case, is the field of a small loop of current. Feynman depicts the geometry of the situation, and the relevant variables, as follows:

Now, in my previous post, in which I presented the magnetic vector potential, I pointed out that the equations for the x-, y- and z-component of the vector potential A are Poisson equations and, therefore, they are mathematically identical to the Poisson equation for electrostatics. Now that’s why the x-, y- and z-component of the vector potential arising from a current density j is exactly the same as the electric potential Φ that would be produced by a charge density ρ equal to j_x, j_y and j_z respectively, divided by c², and we use that fact to calculate them. Huh? Yes. Let me just jot down the equations once more, so you can appreciate what I am saying here:

∇²Φ = –ρ/ε₀ (Poisson equation for electrostatics)

∇²A = –j/ε₀c² (Poisson equation for magnetostatics)

⇔ ∇²A_x = –j_x/ε₀c², ∇²A_y = –j_y/ε₀c² and ∇²A_z = –j_z/ε₀c²

We didn’t go through all of the nitty-gritty of the exercise for a straight wire and a solenoid in our previous post, but we will do so now for our single loop of current. It’s kinda tedious stuff, so just hang in there and try to get through it. 🙂

We’ll start with A_x. To calculate A_x, we need j_x, so that’s the component of the current density that’s shown below. [As you can see, j_z is zero, and we’ll talk about j_y in a minute.]

What are those + and − signs next to those two loop legs? Well… Think about what we wrote above: we need to think of j_x divided by c² as some charge density, and the associated Φ will then equal A_x. It’s an easy principle: the same equations have the same solutions. We’re only switching symbols. So look at the geometry of the situation: we have a uniform current and, hence, a uniform current density, in each leg of the loop, so j_x is some constant over the two legs shown here. Therefore, the electrostatic equivalent is that of two charged rods with some charge density λ.

Now, it’s easy to see that λ = I/c² (i.e. the equivalent charge density j_x/c² times the cross-section of the wire) and… So… Well… Hmm… […] So how do we calculate the electrostatic potential Φ from two charged rods? Well… We haven’t calculated that in any of our posts, but if R is ‘large enough’ (and it should be, because we’re interested in the fields only at distances that are large compared to the size of the loop), then Φ will be the potential of a dipole with (opposite) charges that are equal to the (opposite) charges of the rods. Now, λ is a charge density, so we need to multiply it with the length of the rods a to get the charge. And then we need to multiply the charge with the distance b between the two charges to get the dipole moment, so we write:

p = λab = Iab/c²

OK. That’s a nice result because we can now use that to calculate the potential. Indeed, you may or may not remember the formula for the potential of an electric dipole, but we wrote it as follows in one of my posts:

Φ = −p·∇φ₀ = −(1/4πε₀)p·∇(1/R) = (1/4πε₀)(p·e_r)/R²

∇φ₀ is the gradient of φ₀, and φ₀ is the potential of a unit point charge: φ₀ = 1/4πε₀r. [Note that ∇(1/R) = −e_r/R², which is why the minus sign drops out of the last expression.] In any case, don’t worry too much about this formula right now, because I’ll give you some easier version of it later on. So let’s just move on. To calculate that dot product p·e_r = |p|·|e_r|·cosθ = p·cosθ, we need to know the angle θ between them, and the direction of the dipole moment. That’s simple enough: the vector p points in the negative y-direction, so cosθ = –y/R, where y is a coordinate of P. So we have:

Now we do the grand substitution, the hat-trick really: Φ becomes A_x, and λ becomes I/c², and so we get:

Huh? Yes. It’s a great trick really. Brilliant! 🙂 You’ll get it. Just think about what happened for a while.

Next thing is j_y. So what about it? Well… As you may have suspected, it’s a perfectly symmetrical situation, so we can just repeat the same reasoning but swap x for y and y for x. We get:

Note the minus sign has disappeared, but you can explain that yourself. And, of course, we only have the vector potential here, so we still need to calculate B from it, using the B = ∇×A equation, which is three equations involving all possible cross-derivatives really:

Phew! It looks quite horrible indeed! Feynman says: “It’s a little more complicated than electrostatics, but the same idea.” But… Well… I guess it’s a bit hard to not take this statement with a pinch of salt. For starters, there is no intuitive physical interpretation of the magnetic vector potential A, as opposed to the electric potential Φ. More importantly, calculating E = −∇Φ involves only three derivatives, and that’s a helluva lot easier than calculating no fewer than six cross-derivatives and keeping track of their order and all of that. In any case, there’s nothing much we can do about it, so let’s grind through it. And to be fully honest: A_z is zero, so the ∂A_z/∂y and ∂A_z/∂x derivatives above are also zero, so that leaves only four cross-derivatives to be calculated.
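Since the formulas themselves are missing in this version of the post, here is a reconstruction that follows from the steps in the text: the dipole potential with p = λab and cosθ = −y/R, the substitution of A_x for Φ and I/c² for λ, and the swap of x and y for the other pair of legs. Treat it as a sketch to be checked against Feynman’s Lecture, not as a quotation:

```latex
\begin{aligned}
\Phi &= -\frac{1}{4\pi\epsilon_0}\,\frac{\lambda a b\,y}{R^3}
 \quad\longrightarrow\quad
A_x = -\frac{1}{4\pi\epsilon_0 c^2}\,\frac{I a b\,y}{R^3},
\qquad
A_y = +\frac{1}{4\pi\epsilon_0 c^2}\,\frac{I a b\,x}{R^3},
\qquad A_z = 0 \\[4pt]
B_x &= \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z},\qquad
B_y = \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x},\qquad
B_z = \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}
\end{aligned}
```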

Let’s first define the magnetic dipole moment, however. We talked a lot about it, but what is it? We said it’s similar to the electric dipole moment p = q·d, whose magnitude was λab = Iab/c² in this example. So what have we got here? No charge, but a current, so that’s charge per second. No distance either, but some area a·b. And then the 1/c² factor that, somehow, always slips in because we’ve got it in Maxwell’s equations too. So, we said the fields look the same, and just like Φ was proportional to p, we now see that A is proportional to I·a·b, so it’s quite similar really and so we’ll just define the magnetic dipole moment as:

μ = (current)·(area of the loop) = I·a·b

Now, the area of the loop doesn’t have to be some square: it could also be some circle or some triangle or whatever. You should just change the formula to calculate the area accordingly. You’ll say: where’s the 1/c² factor? It’s true: we didn’t put it in. Why? I could say: that’s the convention, but that’s a bit too easy as an answer, I guess. I am not quite sure but the best answer I have is that the 1/c² factor has nothing to do with the physicality of the situation: it’s part of Maxwell’s equations, or the electromagnetic force equations, if you want. So it’s typical of the magnetic force but it’s got nothing to do with our little loop of current, so to speak. So that’s why we just leave it out in our definition of μ.

OK. That should be it. However, I should also note that we can (and, hence, we will) make a vector quantity out of μ, so then we write it in boldface: μ. The symmetry of the situation implies that μ should be some vector that’s normal to the plane of the loop because, if not, all directions would be equally likely candidates. But up or down? As you may expect, we’ll use the right-hand rule again, as illustrated below: if your fingers point in the direction of current, your thumb will point in μ’s direction. The illustration below also gives the direction of A, because we can now combine the definition of μ and our formulas for A_x, A_y and A_z and write:

Note that the formula for A above is quite similar to the formula for Φ below, but not quite the same: in the formula for A we’ve got a vector cross product – so that’s very different from the vector dot product below – and then we’ve also got that 1/c² factor. So watch out: it’s the “same-same but different”, as they say in Asia. 🙂 […] OK. Sooooo much talk, and still I did not calculate B to effectively show that the magnetic field of a magnetic dipole looks like the electric field of an electric dipole indeed. So let me no longer postpone those tedious calculations and effectively do that B = ∇×A curl calculation using our new formula for A. Well… Calculating B_x and B_y is relatively simple because A’s z-component is zero, so ∂A_z/∂x = ∂A_z/∂y = 0. So for B_x, we get:

For B_y and B_z, we get:

The … in the formulas above obviously stands for μ/4πε₀c². Now, let’s compare this with the components of the electric field (E) of an electric dipole. Let’s re-write the formula for the electric potential above as:

Note that this formula assumes the z-axis points in the same direction as the dipole, and that the dipole is centered at the origin of the coordinate system, so it’s the same coordinate system as the one for the magnetic dipole above. Therefore, we have that cosθ = z/r, and so the formula above is equivalent to:

Now, let’s find the electric field by using the E = (E_x, E_y, E_z) = −∇Φ = (−∂Φ/∂x, −∂Φ/∂y, −∂Φ/∂z) equation. For E_z, we get:

which we can re-write as:

Huh? Yes. Note that z also appears in r, because r = (x²+y²+z²)^1/2, and so we need to apply the product (or quotient) rule for derivatives: (u·v)’ = u’·v + u·v’. Note that this simplifies to E = (1/4πε₀)·(2p/r³) for the field at distance r from the dipole in the direction along its axis.

OK, so we’re done with E_z. For E_x and E_y, we get:

So… Well… Yes! We’ve checked it: the formulas for E_x, E_y, and E_z have exactly the same shape as those for B_x, B_y, and B_z, except for that 1/c² factor and, of course, the fact that we switched p for μ, so we may have some other number there too. [Oh – before I forget – I promised to give you the formula for the transverse component. The transverse component is, obviously, equal to (E_x² + E_y²)^1/2. So it’s just Pythagoras’ rule once more. Let me refer you to Feynman for the explicit formula, i.e. the formula in terms of r and θ, or in terms of x and y, as I don’t need it here and there’s too much clutter already in this post.]
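The 'same shape' claim can also be checked numerically. The sketch below is a minimal illustration, not a derivation: it assumes the standard dipole forms Φ = (1/4πε₀)·p·z/r³ and A = (1/4πε₀c²)·(μ×r)/r³, both with the dipole along the z-axis at the origin, and it takes the derivatives by finite differences. The helper names (`phi`, `vec_pot`, `partial`, `e_field`, `b_field`) and the sample point are my own choices.

```python
import math

EPS0 = 8.8541878128e-12              # vacuum permittivity, C^2/(N*m^2)
C = 299_792_458.0                    # speed of light, m/s
K_E = 1.0 / (4.0 * math.pi * EPS0)   # 1/4πε₀
K_M = K_E / C**2                     # 1/4πε₀c²


def phi(pos, p):
    """Electric dipole potential, dipole p along z at the origin: K_E*p*z/r^3."""
    x, y, z = pos
    r = math.sqrt(x * x + y * y + z * z)
    return K_E * p * z / r**3


def vec_pot(pos, mu):
    """Magnetic dipole vector potential, mu along z: K_M*(mu x r)/r^3."""
    x, y, z = pos
    r = math.sqrt(x * x + y * y + z * z)
    return (-K_M * mu * y / r**3, K_M * mu * x / r**3, 0.0)


def partial(f, pos, axis, h=1e-5):
    """Central-difference estimate of the partial derivative of f along one axis."""
    up, down = list(pos), list(pos)
    up[axis] += h
    down[axis] -= h
    return (f(up) - f(down)) / (2.0 * h)


def e_field(pos, p):
    """E = -grad(phi), evaluated numerically."""
    return tuple(-partial(lambda q: phi(q, p), pos, i) for i in range(3))


def b_field(pos, mu):
    """B = curl(A), evaluated numerically."""
    def comp(i):
        return lambda q: vec_pot(q, mu)[i]
    return (
        partial(comp(2), pos, 1) - partial(comp(1), pos, 2),
        partial(comp(0), pos, 2) - partial(comp(2), pos, 0),
        partial(comp(1), pos, 0) - partial(comp(0), pos, 1),
    )


if __name__ == "__main__":
    point = (0.3, -0.4, 1.2)          # arbitrary off-axis point, in metres
    E = e_field(point, p=1.0)         # p = 1 C·m
    B = b_field(point, mu=1.0)        # mu = 1 A·m^2
    for name, e, b in zip("xyz", E, B):
        # The last column should be close to 1 if E and B share the same shape.
        print(name, e, b, e / (b * C**2))
```

With p and μ both set to one, each component of E divided by c² times the matching component of B should come out very close to one, which is just the 1/c² factor mentioned above.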

So… Well… Yes, that’s all very interesting! Indeed, as Feynman notes, it is quite remarkable that, starting with completely different laws, ∇•E = ρ/ε₀ and ∇×B = j/ε₀c², we end up with the same kind of field.

Is it? Well… Yes and no. You should, of course, note that the sources whose configuration we are summarizing here by the dipole moments are physically quite different: in one case, it’s a circulating current; in the other, a pair of charges, one above and one below the plane of the loop for the corresponding field. So… Well… It’s the same-same but different indeed! 🙂

Now, to conclude this post, we should, perhaps, have a look at the units and order of magnitude of what we’re talking about here. The unit of charge and the unit of current are related: 1 ampere is a current of 1 coulomb per second. The coulomb is a rather large unit of charge, as it’s equivalent to the charge of some 6.241×10¹⁸ protons. Now, that’s less than a mole, which is about 6×10²³, but it’s still a quite respectable number. 🙂 Having said that, a current of one ampere is not so exceptional in everyday life, and so a magnetic dipole moment of 1 A·m² (i.e. the equivalent of an electric dipole moment of 1 C·m) would not be exceptional either.

The really interesting thing about the magnetic force is that 1/c² factor, which causes its magnitude to be so much smaller than that of the electric force. Indeed, the electric field at distance r from the dipole in the direction along its axis is given by E = (1/4πε₀)·(2p/r³). Assuming p = 1 C·m, and noting that 1/4πε₀ ≈ 9×10⁹ N·m²/C², we get E ≈ (18×10⁹)·r⁻³ N/C. So we’ve got an incredibly large (18×10⁹) factor in front here! [Note how the units come out alright: the electric field is effectively measured in newton per coulomb, indeed.]

What do we get for the magnetic field (along the z-axis once more) from a magnetic dipole with moment μ = 1 A·m²? The formula is now B = (1/4πε₀c²)·(2μ/r³). Now, 1/4πε₀c² ≈ (1×10⁻⁷) N·s²/C², so B ≈ (2×10⁻⁷)·r⁻³ N·s/C·m. So we’ve got an incredibly small (2×10⁻⁷) factor in front here! [As for the unit, 1 N·s/C·m is the unit of the magnetic field indeed. It’s referred to as the tesla (1 T = 1 N·s/C·m), and it makes the force equation come out alright: the magnetic force on some charge q is, indeed, F_magnetic = qv×B, with v expressed in m/s, and so that’s why we need to multiply the N/C factor with an s/m factor in the unit for B.]
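The two prefactors are easy to reproduce. Here is a small sketch that simply evaluates the two on-axis formulas quoted above for p = 1 C·m and μ = 1 A·m²; the function names and the choice r = 1 m are mine, for illustration only.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, C^2/(N*m^2)
C = 299_792_458.0         # speed of light, m/s


def on_axis_e(p, r):
    """On-axis electric dipole field, E = (1/4πε₀)·(2p/r³), in N/C."""
    return 2.0 * p / (4.0 * math.pi * EPS0 * r**3)


def on_axis_b(mu, r):
    """On-axis magnetic dipole field, B = (1/4πε₀c²)·(2μ/r³), in tesla."""
    return 2.0 * mu / (4.0 * math.pi * EPS0 * C**2 * r**3)


if __name__ == "__main__":
    r = 1.0  # metres; illustrative distance
    e = on_axis_e(1.0, r)
    b = on_axis_b(1.0, r)
    print(f"E = {e:.3e} N/C")
    print(f"B = {b:.3e} T")
    print(f"E/B = {e / b:.3e}  (c² = {C**2:.3e})")
```

For equal numerical values of p and μ, the ratio E/B is c² at any distance, because the r⁻³ dependence cancels.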

So look at this! The point to note is the relative strength – I should say weakness – of the magnetic force as compared to the electric force: they differ by a factor equal to c² ≈ 9×10¹⁶. That’s quite incredible, especially because our electric motors and generators work through the magnetic force, as I’ll show in my next posts. In fact, looking at the F_magnetic = qv×B, and comparing it to the electric force (F_electric = q·E), we may say that the magnitude of the magnetic force, as compared to the magnitude of the electric force, is weaker because of two factors:

First, there is the relative velocity of the charge that we are looking at, which is equal to β = v/c. The electrostatic Coulomb force has no such factor.
Second, there is the 1/c factor, which we also encountered when discussing radiation pressure, or the momentum of light, which is equal to p = E/c, with the symbol p denoting momentum here – not some dipole moment – and with E denoting the energy of the light photons – not the electric field involved!

Both combine to give us the actual relative strength of the magnetic versus the electric force, so we can write that relative strength as:

|F_magnetic|/|F_electric| = F_B/F_E = (v/c)·(1/c) = v/c²

Now, I am tempted to write a lot about that 1/c factor, but I can’t do that here, as this post has become way too long already. Let me just add a small addendum here. It’s, perhaps, a not-so-easy piece 🙂 but I warmly recommend trying to work through it.

Addendum: On the momentum of light and radiation pressure

Look at the illustration below: it represents a beam of light, i.e. electromagnetic radiation, originating at some source S and hitting some charge q. What charge? Whatever charge the electromagnetic radiation is affecting, so you may want to think of an electron or a proton in a mirror or a piece of glass, or whatever other surface that is absorbing a photon fully or partially—so that’s any charge that is absorbing the energy of the light.

I’ve written a post on this before. It concluded a series of posts on electromagnetic radiation by noting that we usually look at the electric field only when discussing radiation. The electric field is given by the following formula for E:

I know. It looks like a monster, but the first term is just the Coulomb effect. For the explanation of the second and third term, I need to refer you to the mentioned section in Feynman’s Lecture on it. [Sorry for that, but I am still struggling somewhat with this equation myself. If and when I fully ‘get it’ myself, I’ll jot down a summary of my understanding of it. Not now, however.]

The point is: when discussing light, like interference and what have you, we usually forget about the B vector, for which the formula is the following:

So we define B by referring to E. Of course, this again shows the electric and magnetic force are one and the same phenomenon, really. The cross-product gives you both (i) the magnitude of B, which is equal to B = E/c (the magnitude of the unit vectors e_r’ and n in the vector cross-product are obviously equal to one, and sinθ is equal to one too), and (ii) its direction: it’s perpendicular to both E as well as to e_r’, which is the unit vector from the point P where E is measured to the charge q that is producing the field.

The B = E/c equation tells us why the magnetic field is never looked at when discussing electromagnetic radiation: because it’s so tiny. It’s so tiny because of that 1/c factor, and so the B vector in the illustration above is surely not to scale: if all was drawn to scale, you just wouldn’t see it, because its magnitude is only 1/c times that of E. However, it’s there, and the illustration shows what the magnetic force resulting from it looks like, if we forget about scale, that is. 🙂

The magnetic force is, obviously, F = qv×B, and you need to apply the usual right-hand screw rule to find the direction of the force. [Please note that the order of magnitude of the force is the same as that of B so… Well… Again I need to warn you the illustration is not to scale and, hence, somewhat misleading. But I can’t think of an alternative to it.] As you can see, the magnetic force – as tiny as it is – is oriented in the direction of propagation, and it is what is responsible for the so-called radiation pressure.

Indeed, there is a ‘pushing momentum’ here, and we can calculate how strong it is. In fact, we should check how strong it is in order to see if it could, potentially, make a space ship like the one below. 🙂

Sorry. That was just to add a lighter note. 🙂 I know posts like these are rather tedious, so I just wanted to lighten it all up. 🙂 So… Well… Back to our F = qv×B equation. Because of that B = –e_r’×E/c formula, we can substitute B for E/c, and so we get the following formula for the magnitude of the magnetic force:

F = q·v·E/c

Now, the charge q times the electric field is the electric force on the charge, and the force on the charge times the velocity is equal to dW/dt, so that’s the time rate of change of the work that’s being done on the charge.

Huh? I know: not easy to see what’s being done here. Think of it like this: work equals force times distance, so W = qE·s in this case. Hence, dW/dt = d(qE·s)/dt = qE·ds/dt = qE·v. [Note that, if you’d check Feynman’s analysis, you’ll find that it’s based on average values here for some reason, so he writes 〈F〉 and 〈E〉, but I am not quite sure why, so I simplified here. I’ll try to figure the why of these averages later, and let you know.]

Again, think physical, rather than mathematical: what’s the charge here? It’s whatever charge the electromagnetic radiation is hitting, so you should think of an electron or a proton in a mirror or a piece of glass or whatever other surface. Now, because of the energy conservation principle, dW/dt must also be equal to the energy that is being absorbed from the light per second. So F = (dW/dt)/c.

Now, Newton’s Law (he had many, of course, so let me be precise: I am talking Newton’s Second Law of Motion) tells us that the force is the rate of change of the momentum: F = m·a = m·dv/dt = d(m·v)/dt. Hence, integrating both sides of the F = (dW/dt)/c equation gives us the associated momentum of the light: p = W/c.
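To get a feel for the numbers, here is a tiny sketch of the F = (dW/dt)/c and p = W/c relations for fully absorbed light. The function names and the 1 kW example are illustrative assumptions on my part, not figures from the text.

```python
C = 299_792_458.0  # speed of light, m/s


def radiation_force(absorbed_power):
    """Force (N) on an absorber taking in `absorbed_power` watts: F = (dW/dt)/c."""
    return absorbed_power / C


def absorbed_momentum(absorbed_energy):
    """Momentum (kg·m/s) delivered by `absorbed_energy` joules of light: p = W/c."""
    return absorbed_energy / C


if __name__ == "__main__":
    power = 1000.0  # watts; hypothetical beam that is fully absorbed
    print(f"force    = {radiation_force(power):.3e} N")
    print(f"momentum = {absorbed_momentum(power * 60.0):.3e} kg·m/s after one minute")
```

A full kilowatt of absorbed light pushes with only a few micronewtons, which is why you need a very large sail to get anywhere with radiation pressure.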

In short, we know that light carries energy, but we now also know that light carries momentum, that this momentum is due to the magnetic force, and that it’s equal to 1/c times the energy. To be clear: it’s real pressure, so when light is emitted from a source there is a recoil effect, and when light hits some charge, it’s the same thing: the momentum of the light is being conserved as the charge that’s absorbing the energy picks it up. To be fully complete: we also have Newton’s Third Law of Motion coming into play: for every action, there is an equal and opposite reaction.

So… Well… That’s about it, I think. Just note that you’ll usually see this momentum written as p = E/c, with E denoting the energy of the radiation, not the electric field (sorry for the switch in symbols). In this equation, E is given by the Planck-Einstein relation E = h·f, with h Planck’s constant and f the frequency of the light.
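As a quick illustration of p = E/c combined with E = h·f, the sketch below computes the energy and momentum of a single photon. The frequency of 5×10¹⁴ Hz (visible light) is just an example value I picked, and the function names are my own.

```python
H = 6.62607015e-34   # Planck's constant, J·s
C = 299_792_458.0    # speed of light, m/s


def photon_energy(frequency):
    """Planck-Einstein relation: E = h·f, in joules."""
    return H * frequency


def photon_momentum(frequency):
    """Momentum of one photon: p = E/c = h·f/c, in kg·m/s."""
    return photon_energy(frequency) / C


if __name__ == "__main__":
    f = 5.0e14  # Hz; illustrative frequency in the visible range
    print(f"E = {photon_energy(f):.4e} J")
    print(f"p = {photon_momentum(f):.4e} kg·m/s")
```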

Interesting! So it all makes sense! Isn’t it wonderful how all these equations come together in the end? 🙂 However, I shouldn’t digress even more, and so I’ll leave it to you to further reflect on this. 🙂

Magnetostatics: the vector potential

This and the next posts are supposed to wrap up a few loose ends on magnetism. One of these loose ends is the (magnetic) vector potential, which we introduced in our post on gauge transformations, but then we didn’t do much with it. Another topic I neglected so far is that of the magnetic dipole moment (as opposed to the electric dipole moment), which is an extremely important concept both in classical as well as in quantum mechanics. So let’s do the vector potential here, and the magnetic dipole moment in the next. 🙂

Let’s go for it. Let me recall the basics which, as usual, are just Maxwell’s equations. You’ll remember that the electrostatic field was curl-free: ∇×E = 0, everywhere. Therefore, we can apply the following mathematical theorem: if the curl of a vector field is zero (everywhere), then the vector field can be represented as the gradient of some scalar function:

if ∇×C = 0, then there is some Ψ for which C = ∇Ψ

Substituting C for E, and taking into account our conventions on charge and the direction of flow, we wrote:

E = –∇Φ

Φ (phi) is referred to as the electric potential. Combining E = –∇Φ with Gauss’ Law – ∇•E = ρ/ε₀ − we got Poisson’s equation:

∇²Φ = −ρ/ε₀

So that equation sums up all of electrostatics. Really: that’s it! 🙂

Now, the two equations for magnetostatics are: ∇•B = 0 and c²∇×B = j/ε₀. Let me say something more about them:

The ∇•B = 0 equation is true, always, unlike the ∇×E = 0 expression, which is true for electrostatics only (no moving charges).
The ∇•B = 0 equation says the divergence of B is zero, always. Now, you can verify for yourself that the divergence of the curl of a vector field is always zero, so div (curl A) = ∇•(∇×A) = 0, always. Therefore, there’s another theorem that we can apply. It says the following: if the divergence of a vector field, say D, is zero – so if ∇•D = 0 – then D will be the curl of some other vector field C, so we can write: D = ∇×C. Applying this to ∇•B = 0, we can write:

If ∇•B = 0, then there is an A such that B = ∇×A

We can also write this as follows: ∇·B = ∇·(∇×A) = 0 and, hence, B = ∇×A. Now, it’s this vector field A that is referred to as the (magnetic) vector potential, and so that’s what we want to talk about here. As a start, it may be good to write all of the components of our B = ∇×A vector:

Note that we have no ‘time component’ because we assume the fields are static, so they do not change with time. Now, because that’s a relatively simple situation, you may wonder whether we really simplified anything with this vector potential. B is a vector with three components, and so is A. The answer to that question is somewhat subtle, and similar to what we did for electrostatics: it’s mathematically convenient to use A, and then calculate the derivatives above to find B. So the number of components doesn’t matter really: it’s just more convenient to first get A using our data on the currents j, and then we get B from A.

That’s it really. Let me show you how it works. The whole argument is somewhat lengthy, but it’s not difficult, and once it’s done, it’s done. So just carry on and please bear with me 🙂

First, we need to put some constraints on A, because the B = ∇×A equation does not fully define A. It’s like the scalar potential Φ: any Φ’ = Φ + C was as good a choice as Φ (with C any constant), so we needed a reference point Φ = 0, which we usually took at infinity. With the vector potential A, we have even more latitude: we can not only add a constant but any field which is the gradient of some scalar field, so any A’ = A + ∇Ψ will do. Why? Just write it all out: ∇×(A + ∇Ψ) = ∇×A + ∇×(∇Ψ). But the curl of the gradient of a scalar field (or a scalar function) is always zero (you can check my post on vector calculus on this), so ∇×(∇Ψ) = 0 and so ∇×(A + ∇Ψ) = ∇×A + ∇×(∇Ψ) = ∇×A + 0 = ∇×A = B.
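If you’d rather see that than take it on faith, here is a minimal numerical sketch. It takes the A of a uniform field along z, adds the gradient of an arbitrary scalar function, and checks that the curl does not change. The field strength `B0`, the scalar Ψ = x²·y + x·sin(z) and the sample point are all hypothetical choices of mine.

```python
import math

B0 = 2.0  # tesla; illustrative uniform field along the z-axis


def a_field(pos):
    """A = (1/2)·B0×r for a uniform field B0 along z."""
    x, y, _ = pos
    return (-0.5 * B0 * y, 0.5 * B0 * x, 0.0)


def grad_psi(pos):
    """Gradient of the hypothetical scalar function Ψ = x²·y + x·sin(z)."""
    x, y, z = pos
    return (2.0 * x * y + math.sin(z), x * x, x * math.cos(z))


def a_prime(pos):
    """Gauge-transformed potential A' = A + ∇Ψ."""
    return tuple(a + g for a, g in zip(a_field(pos), grad_psi(pos)))


def curl(field, pos, h=1e-5):
    """Numerical curl of a vector field at pos, using central differences."""
    def d(i, axis):
        up, down = list(pos), list(pos)
        up[axis] += h
        down[axis] -= h
        return (field(up)[i] - field(down)[i]) / (2.0 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))


if __name__ == "__main__":
    point = (0.7, -1.3, 0.4)
    print("curl A  =", curl(a_field, point))
    print("curl A' =", curl(a_prime, point))
```

Both lines should print (approximately) the same vector, (0, 0, B0), even though A and A’ themselves are very different fields.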

So what constraints should we put on our choice of A? The choice is, once again, based on mathematical convenience: in magnetostatics, we’ll choose A such that ∇•A = 0. Can we do that? Yes. The A’ = A + ∇Ψ flexibility allows us to make ∇•A’ anything we wish, and so A and A’ will have the same curl, but they don’t need to have the same divergence. So we can choose an A’ so ∇•A’ = 0, and then we denote A’ by A. 🙂 So our ‘definition’ of the vector potential A is now:

B = ∇×A and ∇•A = 0

I have to make two points here:

First, you should note that, in my post on gauges, I mentioned that the choice is different when the time derivatives of E and B are not equal to zero, so when we’re talking changing currents and charge distributions, so that’s dynamics. However, that’s not a concern here.
To be fully complete, I should note that the ‘definition’ above does still not uniquely determine A. For a unique specification, we also need some reference point, or say how the field behaves on some boundary, or at large distances. It is usually convenient to choose a field which goes to zero at large distances, just like our electric potential.

Phew! We’ve said so many things about A now, but nothing that has any relevance to how we’d calculate A. 😦 So where are we heading here?

Fortunately, we can go a bit faster now. The c²∇×B = j/ε₀ equation and our B = ∇×A give us:

c²∇×(∇×A) = j/ε₀

Now, there’s this other vector identity, which you surely won’t remember either—but trust me: I am not lying: ∇×(∇×A) = ∇(∇•A) − ∇²A. So, now you see why we choose A such that ∇•A = 0 ! It allows us to write:

c²∇×(∇×A) = − c²∇²A = j/ε₀ ⇔ ∇²A = –j/ε₀c²

Now, the three components of ∇²A = –j/ε₀c² are, of course:
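Written out per component (this is just the vector equation above taken one component at a time, with the same symbols), we have:

```latex
\nabla^2 A_x = -\frac{j_x}{\epsilon_0 c^2}, \qquad
\nabla^2 A_y = -\frac{j_y}{\epsilon_0 c^2}, \qquad
\nabla^2 A_z = -\frac{j_z}{\epsilon_0 c^2}
```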

As you can see, each of these three equations is mathematically identical to that Poisson equation: ∇²Φ = − ρ/ε₀. So all that we learned about solving for potentials when ρ is known can now be used to solve for each component of A when j is known. Now, to calculate Φ, we used the following integral:

Simply substituting symbols then gives us the solution for A_x:

We have a similar integral for A_y and A_z, of course, and we can combine the three equations in vector form:

Finally, and just in case you wonder what is what, there’s the illustration below (taken from Feynman’s Lecture on this topic here) that, hopefully, will help you to make sense of it all.

At this point, you’re probably tired of these formulas (or asleep) or (if you’re not asleep) wondering what they mean really, so let’s do two examples. Of course, you won’t be surprised that we’ll be talking a straight wire and a solenoid respectively once again. 🙂

The magnetic field of a straight wire

We already calculated the magnetic field of a straight wire, using Ampère’s Law and the symmetry of the situation, in our previous post on magnetostatics. We got the following formula:

Do we get the same using those formulas for A and then doing our derivations to get B? We should, and we do, but I’ll be lazy here and just refer you to the relevant section in Feynman’s Lecture on it, because the solenoid stuff is much more interesting. 🙂
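For those who prefer a numerical check over the algebra, here is a rough sketch. It assumes the standard result B = (1/4πε₀c²)·(2I/r’) for a long straight wire, integrates the A_z component for a long but finite wire with a simple midpoint rule, and then takes B = −∂A_z/∂r’ by finite differences. The wire length, current, distance and step counts are arbitrary choices of mine.

```python
import math

EPS0 = 8.8541878128e-12                     # vacuum permittivity, C^2/(N*m^2)
C = 299_792_458.0                           # speed of light, m/s
K_M = 1.0 / (4.0 * math.pi * EPS0 * C**2)   # 1/4πε₀c²


def a_z(r, current, half_length=100.0, steps=200_000):
    """A_z at distance r from a wire along z running from -half_length to +half_length.

    Midpoint-rule version of A_z = K_M · ∫ I·dz / sqrt(r² + z²).
    """
    dz = 2.0 * half_length / steps
    return K_M * current * math.fsum(
        dz / math.sqrt(r * r + (-half_length + (i + 0.5) * dz) ** 2)
        for i in range(steps)
    )


def b_from_a(r, current, dr=1e-4):
    """Magnitude of B from the curl of A: here simply -dA_z/dr."""
    return -(a_z(r + dr, current) - a_z(r - dr, current)) / (2.0 * dr)


def b_ampere(r, current):
    """Long-wire result from Ampère's Law: B = K_M · 2I/r."""
    return K_M * 2.0 * current / r


if __name__ == "__main__":
    r, current = 0.1, 1.0   # metres and amperes; illustrative values
    print(f"B from A     : {b_from_a(r, current):.6e} T")
    print(f"B from Ampère: {b_ampere(r, current):.6e} T")
```

The two numbers agree closely as long as the wire is much longer than the distance r’ at which we look at the field.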

The magnetic field of a solenoid

In the mentioned post on magnetostatics, we also derived a formula for the magnetic field inside a solenoid. We got:

B₀ = n·I/ε₀c², with n the number of turns per unit length of the solenoid, and I the current going through it. However, in the mentioned post, we assumed that the magnetic field outside of the solenoid was zero, for all practical purposes, but it is not. It is very weak but not zero, as shown below. In fact, it’s fairly strong at very short distances from the solenoid! Calculating the vector potential allows us to calculate its exact value, everywhere. So let’s go for it.

The relevant quantities are shown in the illustration below. So we’ve got a very long solenoid here once again, with n turns of wire per unit length and, therefore, a circumferential current on the surface of n·I per unit length (the slight pitch of the winding is being neglected).

Now, just like that surface charge density ρ in electrostatics, we have a ‘surface current density’ J here, which we define as J = n·I. So we’re going from a scalar to a vector quantity, and the components of J are:

J_x = –J·sinϕ, J_y = J·cosϕ, J_z = 0

So how do we do this? As should be clear from the whole development above, the principle is that the x-component of the vector potential arising from a current density j is the same as the electric potential Φ that would be produced by a charge density ρ equal to j_x divided by c², and similarly for the y- and z-components. Huh? Yes. Just read it a couple of times and think about it: we should imagine some cylinder with a surface charge ρ = −(J/c²)·sinϕ to calculate A_x. And then we equate ρ with (J/c²)·cosϕ and zero respectively to find A_y and A_z.

Now, that sounds pretty easy but Feynman’s argument is quite convoluted here, so I’ll just skip it (click the link here if you’d want to see it) and give you the final result, i.e. the magnitude of A:

Of course, you need to interpret the result above with the illustration, which shows that A is always perpendicular to r’. [In case you wonder why we write r’ (so r with a prime) and not r, that’s to make clear we’re talking the distance from the z-axis, so it’s not the distance from the origin.]

Now, you may think that c² in the denominator explains the very weak field, but it doesn’t: it’s the inverse proportionality to r’ that makes the difference! Indeed, you should compare the formula above with the result we get for the vector potential inside of the solenoid, which is equal to:

The illustration below shows the quantities involved. Note that we’re talking a uniform magnetic field here, along the z-axis, which has the same direction as B₀ and, hence, is pointing towards you as you look at the illustration, which is why you don’t see the B₀ field lines and/or the z-axis: they’re perpendicular to your computer screen, so to speak.

As for the direction of A, it’s shown on the illustration, of course, but let me remind you of the right-hand rule for the vector cross product a×b once again, so you can make sense of the direction of A = (1/2)B₀×r’ indeed:

Also note the magnitude this formula implies: a×b = |a|·|b|·sinθ·n, with θ the angle between a and b, and n the normal unit vector in the direction given by that right-hand rule above. Now, unlike a vector dot product, the magnitude of the vector cross product is not zero for perpendicular vectors. In fact, when θ = π/2, which is the case for B₀ and r’, then sinθ = 1, and, hence, we can write:

|A| = A = (1/2)|B₀||r’| = (1/2)·B₀·r’

Now, just substitute B₀ = n·I/ε₀c², which is the field inside the solenoid, for B₀, and you get:

A = (1/2)·n·I·r’/ε₀c²

You should compare this formula with the formula for A outside the solenoid, so you can draw the right conclusions. Note that both formulas incorporate the same (1/2)·n·I/ε₀c² factor. The difference, really, is that inside the solenoid, A is proportional to r’ (as shown in the illustration: if r’ doubles, triples etcetera, then A will double, triple etcetera too) while, outside of the solenoid, A is inversely proportional to r’. In addition, outside the solenoid, we have the a² factor, which doesn’t matter inside. Indeed, the radius of the solenoid (i.e. a) changes the flux, which is the product of B and the cross-section area π·a², but not B itself.

Let’s do a quick check to see if the formula makes sense. We do not want A to be larger outside of the solenoid than inside, obviously, so the a²/r’ factor should be smaller than r’ for r’ > a. Now, a²/r’ < r’ if a² < r’², and because a and r’ are both positive real numbers, that’s the case if r’ > a indeed. So we’ve got something that resembles the electric field inside and outside of a uniformly charged sphere, except that A decreases as 1/r’ rather than as 1/r’², as shown below.
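The same quick check can be done with numbers. The sketch below just codes the two magnitude formulas, A = (1/2)·n·I·r’/ε₀c² inside and A = (1/2)·n·I·a²/(ε₀c²·r’) outside, and prints the profile. The values for n, I and a are hypothetical, picked only for illustration.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity, C^2/(N*m^2)
C = 299_792_458.0         # speed of light, m/s


def a_inside(r, n, current):
    """Magnitude of A inside the solenoid: (1/2)·n·I·r'/(ε₀c²)."""
    return 0.5 * n * current * r / (EPS0 * C**2)


def a_outside(r, n, current, a):
    """Magnitude of A outside the solenoid: (1/2)·n·I·a²/(ε₀c²·r')."""
    return 0.5 * n * current * a * a / (EPS0 * C**2 * r)


def a_magnitude(r, n, current, a):
    """Piecewise magnitude of A at distance r from the axis (solenoid radius a)."""
    return a_inside(r, n, current) if r <= a else a_outside(r, n, current, a)


if __name__ == "__main__":
    n, current, a = 1000.0, 1.0, 0.05   # turns/m, amperes, metres (illustrative)
    for r in (0.01, 0.025, 0.05, 0.1, 0.5):
        print(f"r' = {r:5.3f} m  ->  A = {a_magnitude(r, n, current, a):.3e}")
```

A grows linearly up to r’ = a, where the two formulas give the same value, and falls off as 1/r’ from there on.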

Hmm… That’s all stuff to think about… The thing you should take home from all of this is the following:

A (uniform) magnetic field B in the z-direction corresponds to a vector potential A that rotates about the z-axis with magnitude A = B₀·r’/2 (with r’ the displacement from the z-axis, not from the origin, obviously!). So that gives you the A inside of a solenoid. The magnitude is A = (1/2)·n·I·r’/ε₀c², so A is proportional to r’.
Outside of the solenoid, A’s magnitude (i.e. A) is inversely proportional to the distance r’, and it’s given by the formula: A = (1/2)·n·I·a²/(ε₀c²·r’). That’s, of course, consistent with the magnetic field diminishing with distance there. But remember: contrary to what you’ve been taught or what you often read, it is not zero. It’s only near zero if r’ >> a.

Alright. Done. Next post. So that’s on the magnetic dipole moment 🙂

Ferroelectrics and ferromagnetics

Ferroelectricity and ferromagnetism are two different things, but they are analogous. Materials are ferroelectric if they have a spontaneous electric polarization that can be changed or reversed by the application of an external electric field. Ferromagnetism, in contrast, refers to materials which exhibit a permanent magnetic moment.

The materials are very different. In fact, most ferroelectric materials do not contain any iron and, hence, the ferro in the term is somewhat misleading. Ferroelectric materials are a special class of crystals, like barium or lead titanate (BaTiO₃ or PbTiO₃). Lead zirconate titanate (PZT) is another example. These materials are also piezoelectric: when applying some mechanical stress, they will generate some voltage. In fact, the process goes both ways: when applying some voltage to them, it will also create mechanical deformation, as illustrated below (credit for this illustration goes to Wikipedia).

Ferroelectricity has to do with electric dipoles, while ferromagnetism has to do with magnetic dipoles. We’ve only discussed electric dipoles so far (see the section on dielectrics in my post on capacitors) and so we’re only in a position to discuss ferroelectricity right now, which is what I’ll do here. However, before doing so, let me briefly quote from the Wikipedia article on ferromagnetism, because that’s really concise and to the point on this:

“One of the fundamental properties of an electron (besides that it carries charge) is that it has a magnetic dipole moment, i.e. it behaves like a tiny magnet. This dipole moment comes from the more fundamental property of the electron that it has quantum mechanical spin. Due to its quantum nature, the spin of the electron can be in one of only two states; with the magnetic field either pointing “up” or “down” (for any choice of up and down). The spin of the electrons in atoms is the main source of ferromagnetism, although there is also a contribution from the orbital angular momentum of the electron about the nucleus.”

In short, ferromagnetism was discovered and known much before ferroelectricity was discovered and studied, but it’s actually more complicated, because it’s a quantum-mechanical thing really, unlike ferroelectricity, which we’ll discuss now. Before we start, let me note that, in many ways, this post is a continuation of the presentation on dielectrics, which I referred to above already, so you may want to check that discussion in that post I referred to if you have trouble following the arguments below.

Molecular dipoles

Let me first remind you of the basics. The (electric) dipole moment is the product of the charge q and the distance d between two equal but opposite charges q₊ and q₋. Usually, it’s written as a vector so as to also keep track of its direction and use it in vector equations, so we write p = qd, with d the vector going from the negative to the positive charge, as shown below.

Now, molecules like water molecules have a permanent dipole moment, as illustrated below. It’s because the center of ‘gravity’ of the positive and negative charges do not coincide, so that’s what makes the H₂O molecule polar, as opposed to the O₂ molecule, which is non-polar.

Now, if we place polar molecules in some electric field, we’d expect them to line up, to some extent at least, as shown below (the second illustration has more dipoles pointing vaguely north).

However, at ordinary temperatures and electric fields, the collisions of the molecules in their thermal motion keep them from lining up too much. In fact, we can apply the principles of statistical mechanics to calculate how much exactly. You can check out the details in Feynman’s Lecture on it, but the result is that the net dipole moment per unit volume (so that’s the polarization) is equal to:

So the polarization is proportional to the number of molecules per unit volume (N), the square of their dipole moment (p₀) and, as we might expect, the electric field E, and inversely proportional to the temperature (T). In fact, the formula above is a sort of first-order approximation, in line with what we wrote on the electric susceptibility χ (chi) in our post on capacitors, where we also assumed the relation between P and E was linear, so we wrote: P = ε₀·χ·E. Now, engineers and physicists often use different symbols and definitions and so you may or may not have heard about another concept saying essentially the same thing: the dielectric constant, which is denoted by κ (kappa) and is, quite simply, equal to κ = 1 + χ. Combining the expression for P above, and the P = ε₀·χ·E = ε₀·(κ−1)·E expression, we get:

This doesn’t say anything new: it just states the dependence of χ on the temperature. Now, you can imagine this linear relationship has been verified experimentally. As it turns out, it’s sort of valid, but it is not as straightforward as you might imagine. There’s a nice post on this on the University of Cambridge’s Materials Science site. But this blog is about physics, not about materials science, so let’s move on. The only thing I should add to this section is a remark on the energy of dipoles.
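To put a rough number on it, the sketch below assumes the standard first-order form of the result, P = N·p₀²·E/3kT, so that χ = κ − 1 = N·p₀²/(3ε₀kT). As an illustration it uses water vapor treated as an ideal gas, with the commonly quoted dipole moment of about 1.85 debye for the water molecule. The temperature and pressure are arbitrary example values of mine, and the function names are my own.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity, C^2/(N*m^2)
K_B = 1.380649e-23        # Boltzmann constant, J/K
DEBYE = 3.33564e-30       # one debye in C·m


def number_density(pressure, temperature):
    """Molecules per m³ for an ideal gas: N = P/(kT)."""
    return pressure / (K_B * temperature)


def susceptibility(n_density, p0, temperature):
    """First-order orientational susceptibility: χ = N·p0²/(3·ε₀·k·T)."""
    return n_density * p0**2 / (3.0 * EPS0 * K_B * temperature)


if __name__ == "__main__":
    temperature = 393.0    # kelvin; illustrative
    pressure = 101_325.0   # pascal (1 atm); illustrative
    p0 = 1.85 * DEBYE      # dipole moment of a water molecule, approximate
    chi = susceptibility(number_density(pressure, temperature), p0, temperature)
    print(f"chi = {chi:.4e}, kappa = {1.0 + chi:.5f}")
```

Note that, at fixed pressure, N itself goes as 1/T for a gas, so the susceptibility then drops as 1/T² rather than 1/T.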

You know charges in a field have energy, potential energy. You can look up the detail behind the formulas in one of my other posts on electromagnetism, so I’ll just remind you of them: the energy of a charge is, quite simply, the product of the charge (q) and the electric potential (Φ) at the location of the charge. Why? Well… The potential is the amount of work we’d do when bringing the unit charge there from some other (reference) point where Φ = 0. In short, the energy of the positive charge is q·Φ(1) and the energy of the negative charge is −q·Φ(2), with 1 and 2 denoting their respective location, as illustrated below.

So we have U = q·Φ(1) − q·Φ(2) = q·[Φ(1)−Φ(2)]. Now, we’re talking tiny little dipoles here, so we can approximate ΔΦ = Φ(1)−Φ(2) by ΔΦ = d•∇Φ = Δx·(∂Φ/∂x) + Δy·(∂Φ/∂y). Hence, also noting that E = −∇Φ and qd = p₀, we get:

U = q·Φ(1) − q·Φ(2) = qd•∇Φ = −p₀•E = −p₀·E·cosθ, with θ the angle between p₀ and E

So the energy is lower when the dipoles are lined up with the field, which is what we would expect, of course. However, it’s an interesting thing so I just wanted to show you that. 🙂
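Here is a small sketch of that U = −p₀·E·cosθ formula, which also compares the energy scale p₀·E with the thermal energy kT. The dipole moment (roughly that of a water molecule), the field strength and the temperature are illustrative values that I picked.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def dipole_energy(p0, e_field, theta):
    """Energy (J) of a dipole p0 in a field E at angle theta: U = -p0·E·cos(theta)."""
    return -p0 * e_field * math.cos(theta)


if __name__ == "__main__":
    p0 = 6.2e-30      # C·m; roughly the dipole moment of a water molecule
    e_field = 1.0e5   # V/m; illustrative field strength
    for degrees in (0, 45, 90, 135, 180):
        u = dipole_energy(p0, e_field, math.radians(degrees))
        print(f"theta = {degrees:3d}°  U = {u:+.3e} J")
    print(f"p0·E / kT at 300 K = {p0 * e_field / (K_B * 300.0):.2e}")
```

The energy is lowest for θ = 0 and highest for θ = 180°, but the difference is tiny compared to kT, which is why thermal motion keeps the dipoles from lining up too much.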

Electrets, piezoelectricity and ferroelectricity

The analysis above was very general, so we actually haven’t started our discussion on ferroelectricity yet! All of the above is just a necessary introduction to the topic. So let’s move on. Ferroelectrics are solids, so let’s look at solids. Let me just copy Feynman’s introduction here, as it’s perfectly phrased:

“The first interesting fact about solids is that there can be a permanent polarization built in—which exists even without applying an electric field. An example occurs with a material like wax, which contains long molecules having a permanent dipole moment. If you melt some wax and put a strong electric field on it when it is a liquid, so that the dipole moments get partly lined up, they will stay that way when the liquid freezes. The solid material will have a permanent polarization which remains when the field is removed. Such a solid is called an electret. An electret has permanent polarization charges on its surface. It is the electrical analog of a magnet. It is not as useful, though, because free charges from the air are attracted to its surfaces, eventually cancelling the polarization charges. The electret is “discharged” and there are no visible external fields.”

Another example (i.e. other than wax) of an electret is the crystal lattice below. As you can see, all the dipoles are pointing in the same direction even with no applied electric field. Many crystals have such polarization but, again, we do not normally notice it because the external fields are discharged, just as for the electrets.

Now, this gives rise to the phenomena of pyroelectricity and piezoelectricity. Indeed, as Feynman explains: “If these internal dipole moments of a crystal are changed, external fields appear because there is not time for stray charges to gather and cancel the polarization charges. If the dielectric is in a condenser, free charges will be induced on the electrodes. The moments can also change when a dielectric is heated, because of thermal expansion. The effect is called pyroelectricity. Similarly, if we change the stresses in a crystal—for instance, if we bend it—again the moment may change a little bit, and a small electrical effect, called piezoelectricity, can be detected.”

But, still, piezoelectricity is not the same as ferroelectricity. In fact, there’s a hierarchy here:

Out of all crystals, some will be piezoelectric.
Among all piezoelectric crystals, some will also be pyroelectric.
Among the pyroelectric crystals, we can find some ferroelectric crystals.

The defining characteristic of ferroelectricity is that the built-in permanent moment can be reversed by the application of an external electric field. Feynman defines them as “nearly cubic crystals, whose moments can be turned in different directions, so we can detect a large change in the moment when an applied electric field is changed: all the moments flip over and we get a large effect.”

Because this is a blog, not a physics handbook, I’ll refer you to Feynman and/or the Wikipedia article on ferroelectricity for an explanation of the mechanism. Indeed, the objective of this post is to explain what it is, and so I don’t want to go off into the weeds. The two diagrams below, which I took from the mentioned Wikipedia article, illustrate the difference between your average dielectric material and a ferroelectric material. The first diagram shows you the linear relationship between P and E we discussed above: if we reverse the field, so E becomes negative, then the polarization will be reversed as well, but gradually, as shown below.

In contrast, the illustration below shows a hysteresis effect, which can be used as a memory function, and ferroelectric materials are indeed used for ferroelectric RAM (FeRAM) memory chips for computers! I’ll let you google that for yourself − it’s fun: just have a look at the following link, for example − because it’s about time I start wrapping up this post. 🙂

OK. That’s it for today. More tomorrow. 🙂

Magnetostatics

Original post:

Not all is exciting when studying physics. In fact, electromagnetism is, most of the time, an extremely boring subject-matter. But so we need to get through it, because we need the math and the formulas. So… Here we go…

When going from electrostatics to electrodynamics, one first needs to have a look at magnetostatics, to get familiar with (steady) electric currents. So let’s have a look at what they are. Of course, you may already know what steady currents are. In that case, you could, perhaps, stop reading. But I’d recommend you go through it anyway. It’s always good to be explicit, so let’s be explicit.

Let me first make a very pedantic note. There are a couple of sections in Feynman’s Lectures in which he assumes that a steady current in a wire is uniformly distributed throughout the cross-section of the current-carrying wire: that assumption amounts to saying that the current density j is uniform. He uses that assumption, for example, when calculating the force per unit length of a current-carrying wire in a magnetic field (see Vol. II, section 13-3). He also uses it when calculating the magnetic field it creates itself (see Vol. II, section 14-3). This raises two questions:

Is the assumption true?
Does it matter?

My impression is that it’s a simplification that doesn’t matter. So the answer to both questions would be negative. But let’s examine them. First note that, in previous posts, we repeatedly said that, if we place a charge Q on any conductor, all charges will spread out in some way on the surface, so we have an equipotential on the surface and no electric field inside of the conductor. The physics behind it is easy to understand: if there were an electric field inside of the conductor, and the surface were not an equipotential, the charges would keep moving until that field became zero.

Does it matter? Maybe. Maybe not. I discussed the electric field from a conductor in a previous post, so let me just recall some formulas here, first and foremost Gauss’ Law, which says that the electric flux through any closed surface S is equal to Q_inside/ε₀. Now, Q_inside is, obviously, the sum of the charges inside the volume enclosed by the surface, and the most remarkable thing about Gauss’ Law is that the charge distribution inside of the volume doesn’t matter for the total flux. So if we’re talking a uniformly charged sphere or a thin spherical shell of charge, the field outside is the same. The illustration below shows the field for a uniformly charged sphere: E is proportional to r (to be precise: E = (ρ·r)/(3ε₀) for r ≤ R) inside the sphere, and outside E is proportional to 1/r² (to be precise: E = Q_inside/(4πε₀r²) for r ≥ R).
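If you want to play with these two formulas, here is a minimal Python sketch. The function name sphere_field and the sample values for R and Q are my own assumptions, chosen for illustration only. The one thing it shows is that the inside and the outside formula give the same value at r = R.

```python
import math

EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def sphere_field(r, R, Q):
    """Magnitude of E at distance r from the centre of a uniformly charged sphere.

    R is the radius of the sphere (m) and Q its total charge (C).
    Both are illustrative inputs, not values taken from the text.
    """
    if r <= R:
        rho = Q / (4.0 / 3.0 * math.pi * R**3)  # uniform charge density
        return rho * r / (3.0 * EPS0)            # E grows linearly with r
    return Q / (4.0 * math.pi * EPS0 * r**2)     # E falls off as 1/r^2


R, Q = 0.1, 1e-9  # hypothetical sphere: 10 cm radius, 1 nC of charge
at_surface = sphere_field(R, R, Q)
halfway_inside = sphere_field(R / 2, R, Q)   # half the surface value
twice_as_far = sphere_field(2 * R, R, Q)     # a quarter of the surface value
```

The two branches meet at the surface because ρ·R/(3ε₀) is just Q/(4πε₀R²) written differently.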

However, Gauss’ Law is a law that gives us the electric flux only, so we’re talking E only. We also have the magnetic field, i.e. the field vector B. So what’s the equivalent of Gauss’ Law for B? That’s Ampère’s Law, obviously, so let’s have a look at how Feynman derives that law.

Ampère’s Law

Feynman starts by defining the current through some surface S as the following integral:

I = ∫ j·n dS (taken over the surface S)

The illustration below explains the logic behind it. The vector j is like the heat flow vector h which we used when explaining the basics of vector calculus: it is some amount passing per unit time and per unit area. As for the use of n, that’s the same normal unit vector we used for h as well: we then wrote that h·n = |h|·|n|·cosθ = h·cosθ was the component of the heat flow that’s perpendicular or normal (as mathematicians prefer to say) to the surface. So here we’ve got the same: j·n·dS is the amount of charge flowing across an infinitesimally small area dS in a unit time. So to get the electric current I, which is the total charge passing per unit time through a surface S, we need to integrate the normal component of the flow through all the surface elements, which is what the integral above is doing.

Note that I is not a vector but a scalar. We could, however, include the idea of the direction of flow by making I a vector, so then we write it in boldface: I. It is measured in coulomb per second, aka the ampere: 1 A = 1 C/s. Also note we don’t have any wires here: just surfaces and volumes. 🙂 Onwards!

The equations of magnetostatics are Maxwell’s third and fourth equations and, as we used Maxwell’s first and second equations to derive Gauss’ Law, we’ll use these two to derive Ampère’s Law: (1) ∇•B = 0 and (2) c²∇×B = j/ε₀.

You know these equations: the first one basically says there’s no flux of B: there’s no such thing as magnetic charges, in other words. The second one says that a current produces some circulation of B. You also know these equations are valid only for static fields: all electric charge densities are constant, and all currents are steady, so the electric and magnetic fields are not changing with time: ∂E/∂t = 0 = ∂B/∂t. Forget about c² for a moment (it’s just a constant) and note that ∇×B is referred to as the curl of B.

Now, as I pointed out in one of my posts on vector analysis, the divergence of the curl of a vector is always equal to zero, so ∇•(∇×B) = 0. However, because ∇×B = j/ε₀c², that means ∇•(j/ε₀c²) must also be equal to zero (we’re just taking the divergence of both sides of the equation here), and so we find that ∇•j must be equal to zero. What does that mean? Well… From the same post, you may or may not remember that the divergence of some vector field C (so that’s ∇•C) is the (net) flux out of an (infinitesimal) volume around the point we’re considering, so ∇•j = 0 implies that as much charge must be coming in as is going out, always and everywhere. So that means that, because of the charge conservation law (no charges are created or lost), we can only look at charges flowing in paths that close back on themselves, so we can only consider closed circuits. It’s a minor point – so don’t worry too much about it – but it does imply that we’re not looking at condensers, for example. Just remember: magnetostatics is about circulation, we have no flux, not of B, and not of j: our field, and our charges, circulate. 🙂

OK. Let’s get back to the lesson. We need to find Ampère’s Law, so we’d better get on with it. 🙂 To find Gauss’ Law, we used Gauss’ Theorem. To find Ampère’s Law, we’ll use… Stokes’ Theorem. [Sorry!] I need to refer you, once again, to that post on vector analysis for it. Here I can only remind you of the Theorem itself. It says that the line integral of the tangential component of a vector (field) around a closed loop is equal to the surface integral of the normal component of the curl of that vector over any surface which is bounded by the loop. […] I know that’s quite a mouthful, so let me jot down the equation, writing C for the vector field, Γ for the loop and S for the surface:

∮ C·ds (around Γ) = ∫ (∇×C)·n dS (over S)

Applying it to the magnetic field vector B, we get:

∮ B·ds (around the loop) = ∫ (∇×B)·n dS (over the surface bounded by the loop)

This is the illustration which goes with it.

Now, using our ∇×B = j/ε₀c² equation, we get:

∮ B·ds = (1/ε₀c²)·∫ j·n dS

Finally, we just plug in our I = ∫ j·n dS integral and we’re done. This is Ampère’s Law:

∮ B·ds = I/ε₀c²

It basically says that the circulation of B around any closed curve is equal to the current I through the loop, divided by ε₀c². So what can we do with it? Well… We used Gauss’ Law to find the electric field in various circumstances, so let’s now use Ampère’s Law to find the magnetic field in various circumstances. 🙂

Before doing so, however, let me note that Ampère’s Law does not depend on any particular assumption in regard to the distribution of the current density j. So, frankly speaking, don’t worry too much about that assumption about a uniformly distributed current in a wire: a current is a current in Ampère’s Law. 🙂

Wires

You know the magnetic field around a wire, as you’ll surely remember that right-hand rule for it from your high-school physics classes. Note, however, that it assumes you apply the usual convention: charge flows from positive to negative, because our unit of electric charge is obviously +1, not –1. So the electron flow actually goes the other way. 🙂

But so we’re past our high school days and we need to apply Ampère’s Law. The symmetry of the situation implies that that line integral of B·ds, taken along some closed circle around the wire, is, quite simply, the magnitude of B times the circumference 2πr of our circle. Indeed, the symmetry of the situation implies that B at some distance r should be of the same magnitude everywhere, so we have:

∮ B·ds = B·2πr

But from Ampère’s Law we know that integral is equal to I/ε₀c² and, therefore, B·2π·r must equal I/ε₀c², and so we get the grand result we were looking for. The magnetic field outside of a (long) wire carrying the current I is:

B = I/(2πε₀c²r) = (1/4πε₀c²)·(2I/r)

As Feynman notes, we can write this in vector form to include the directions, remembering that B is at right angles both to I as well as to r, and remembering that the order matters, of course, because of the right-hand rule for a vector cross product. 🙂
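Just to get a feel for the numbers, here is a small Python sketch of that result. The function names are mine, and the vector version assumes the form B = (1/4πε₀c²)·2(I×e_r)/r, with e_r the unit vector pointing from the wire to the point where we want the field. That is how I read the vector form, so take it as an assumption rather than as gospel.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
C = 299792458.0           # speed of light, m/s


def wire_field(current, r):
    """Magnitude of B (tesla) at distance r (m) from a long straight wire."""
    return current / (2.0 * math.pi * EPS0 * C**2 * r)


def wire_field_vector(current_vec, r_vec):
    """Vector form, assuming r_vec is perpendicular to the wire.

    current_vec: current as a vector (A), pointing along the wire.
    r_vec: vector from the wire to the field point (m).
    """
    r = math.sqrt(sum(comp * comp for comp in r_vec))
    ex, ey, ez = (comp / r for comp in r_vec)
    ix, iy, iz = current_vec
    # Cross product I x e_r: the order matters because of the right-hand rule.
    cross = (iy * ez - iz * ey, iz * ex - ix * ez, ix * ey - iy * ex)
    factor = 2.0 / (4.0 * math.pi * EPS0 * C**2 * r)
    return tuple(factor * comp for comp in cross)


b = wire_field(1.0, 1.0)  # 1 A at 1 m: about 2e-7 tesla
b_vec = wire_field_vector((0.0, 0.0, 1.0), (1.0, 0.0, 0.0))  # points along +y
```

So a current along the z-axis gives a field along the y-axis for a point on the x-axis, which is what the right-hand rule tells you as well.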

Solenoids

Coils of wire, and solenoids, pop up almost everywhere when studying electromagnetism. Indeed, transformers, inductances, electrical motors: it’s all coils. So, yes, we can’t escape them. 😦 So let’s get on with it. As you know, a solenoid is a long coil of wire wound in a tight spiral. The illustrations below show a cross-section and its magnetic field.

Now, this is probably one of Feynman’s most intuitive arguments. Read: he’s cutting an awful lot of corners here. 🙂 I’ll just copy him:

We observe experimentally that when a solenoid is very long compared with its diameter, the field outside is very small compared with the field inside. Using just that fact, together with Ampère’s law, we can find the size of the field inside. Since the field stays inside (and has zero divergence), its lines must go along parallel to the axis, as shown above. That being the case, we can use Ampère’s law with the rectangular ‘curve’ Γ shown in the figure. This loop goes the distance L inside the solenoid, where the field is, say, B₀, then goes at right angles to the field, and returns along the outside, where the field is negligible. The line integral of B for this curve is just B₀·L, and it must be 1/ε₀c² times the total current through Γ, which is N·I if there are N turns of the solenoid in the length L. We have:

B₀·L = N·I/ε₀c²

Or, letting n be the number of turns per unit length of the solenoid (that is, n = N/L), we get:

B₀ = n·I/ε₀c²

Oh… What happens to the lines of B when they get to the end of the solenoid? Well… They just spread out in some way and return to enter the solenoid at the other end. Hmm… He’s really cutting corners here, isn’t he? But the formula is right, and I’d rather keep it short—just like he seems to want to do here. 🙂 I’ll just insert an illustration showing another right-hand rule—the right-hand rule for solenoids: if the direction of the fingers of your right hand is the direction of current, then your thumb gives the direction of the magnetic field inside.
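Again, just for the numbers: a minimal Python sketch of the solenoid formula B₀ = n·I/ε₀c². The function name and the sample coil (500 turns over half a meter, carrying 1 A) are assumptions of mine, and the formula itself only holds for the idealized long solenoid of the argument above.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
C = 299792458.0           # speed of light, m/s


def solenoid_field(turns, length, current):
    """Field inside a long solenoid (tesla), from B0*L = N*I/(eps0*c^2)."""
    n = turns / length                # turns per unit length
    return n * current / (EPS0 * C**2)


# Hypothetical coil: 500 turns over 0.5 m, so n = 1000 turns per meter.
b0 = solenoid_field(500, 0.5, 1.0)    # about 1.26e-3 tesla
```

Note that only the turns per unit length matter: doubling both the number of turns and the length gives the same field.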

You may wonder: does it matter where the + and − ends of the coil are? Good question because, in practice, we’ll have something that’s very tightly wound, like the coil below, so when making an actual coil (click on this link for a nice video), we’ll have several rows and so we wind from right to left and then back from left to right and so on and so on. So if we’d have two rows of wire, the two ends of the wire would come out on the same side, and that’s OK.

Of course, the wire needs to be insulated. What you see on the picture (and in the video) is the use of so-called magnet wire, which has a polymer film insulation. So when making the electrical connections at both ends, after winding the coil, you need to get rid of the insulation, but then it often melts just by the heat of soldering. And now that we’re talking practical stuff, let me say something about the magnetic core you see in the illustration above.

A magnetic core is a material with high magnetic permeability as compared to the surrounding air, and this high permeability will cause the magnetic field to be concentrated in the core material. Now, there’s a phenomenon that’s called hysteresis, which means that the core material will tend to retain its magnetization when the applied field is removed. This is not very desirable in many applications, such as transformers or electric motors. That’s why so-called ‘soft’ magnetic materials with low hysteresis are often preferred. The so-called soft iron is such a material: it’s literally softer because of a heat treatment increasing its ductility and reducing its hardness. Of course, for permanent magnets, a so-called ‘hard’ magnetic material will be used. But here we’re getting into engineering and that’s not what I want to write about in this blog.

I’ll just end by noting that a magnetic field has a so-called north (N) and south (S) pole. That convention refers to the Earth’s north and south pole, of course. However, since opposite poles (north and south) attract, the North Magnetic Pole is actually the south pole of the Earth’s magnetic field, and the South is the north. 🙂 So it’s better not to think too much of the Earth’s poles when discussing the poles of a magnet. By convention, a magnet’s north pole is where the field lines of a magnet emerge, and the south pole is where they enter, as shown below.

In any case… Folks: that’s it for today. I’ll continue tomorrow. 🙂

A post for Vincent: on the math of waves

Pre-scriptum (dated 26 June 2020): These posts on elementary math and physics for my kids (they are 21 and 23 now and no longer need such explanations) have not suffered much the attack by the dark force—which is good because I still like them. While my views on the true nature of light, matter and the force or forces that act on them have evolved significantly as part of my explorations of a more realist (classical) explanation of quantum mechanics, I think most (if not all) of the analysis in this post remains valid and fun to read. In fact, I find the simplest stuff is often the best. 🙂

Original post:

I wrote this post to just briefly entertain myself and my teenage kids. To be precise, I am writing this for Vincent, as he started to study more math this year (eight hours a week!), and as he also thinks he might go for engineering studies two years from now. So let’s see if he gets this and − much more importantly − if he likes the topic. If not… Well… Then he should get even better at golf than he already is, so he can make a living out of it. 🙂

To be sure, nothing of what I write below requires an understanding of stuff you haven’t seen yet, like integrals, or complex numbers. There are no derivatives, exponentials or logarithms either: you just need to know what a sine or a cosine is, and then it’s just a bit of addition and multiplication. So it’s just… Well… Geometry and waves as I would teach it to an interested teenager. So let’s go for it. And, yes, I am talking to you now, Vincent! 🙂

The animation below shows a repeating pulse. It is a periodic function: a traveling wave. It obviously travels in the positive x-direction, i.e. from left to right as per our convention. As you can see, the amplitude of our little wave varies as a function of time (t) and space (x), so it’s a function in two variables, like y = F(u, v). You know what that is, and you also know we’d refer to y as the dependent variable and to u and v as the independent variables.

Now, because it’s a wave, and because it travels in the positive x-direction, the argument of the wave function F will be x−ct, so we write:

y = F(x−ct)

Just to make sure: c is the speed of travel of this particular wave, so don’t think it’s the speed of light. This wave can be any wave: a water wave, a sound wave,… Whatever. Our dependent variable y is the amplitude of our wave, so it’s the vertical displacement − up or down − of whatever we’re looking at. As it’s a repeating pulse, y is zero most of the time, except when that pulse is pulsing. 🙂

So what’s the wavelength of this thing?

[…] Come on, Vincent. Think! Don’t just look at this!

[…] I got it, daddy! It’s the distance between two peaks, or between the center of two successive pulses— obviously! 🙂

[…] Good! 🙂 OK. That was easy enough. Now look at the argument of this function once again:

F = F(x−ct)

We are not merely acknowledging here that F is some function of x and t, i.e. some function varying in space and time. Of course, F is that too, so we can write: y = F = F(x, t) = F(x−ct), but it’s more than just some function: we’ve got a very special argument here, x−ct, and so let’s start our little lesson by explaining it.

The x−ct argument is there because we’re talking waves, so that is something moving through space and time indeed. Now, what are we actually doing when we write x−ct? Believe it or not, we’re basically converting something expressed in time units into something expressed in distance units. So we’re converting time into distance, so to speak. To see how this works, suppose we add some time Δt to the argument of our function y = F, so we’re looking at F[x−c(t+Δt)] now, instead of F(x−ct). Now, F[x−c(t+Δt)] = F(x−ct−cΔt), so we’ll get a different value for our function—obviously! But it’s easy to see that we can restore our wave function F to its former value by also adding some distance Δx = cΔt to the argument. Indeed, if we do so, we get F[x+Δx−c(t+Δt)] = F(x+cΔt–ct−cΔt) = F(x–ct). For example, if c = 3 m/s, then 2 seconds of time correspond to (2 s)×(3 m/s) = 6 meters of distance.
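If you prefer to check this on a computer rather than on paper, here is a minimal Python sketch. The shape of the bump F and the values for x and t are arbitrary choices of mine; only c = 3 m/s and Δt = 2 s come from the example above.

```python
def F(u):
    """A single smooth bump centred on u = 0 (an arbitrary illustrative shape)."""
    return 1.0 / (1.0 + u * u)


c = 3.0            # speed of the wave in m/s, as in the example above
x, t = 1.5, 0.4    # some arbitrary point and time
dt = 2.0           # we let two seconds pass...
dx = c * dt        # ...which corresponds to 6 meters of distance

original = F(x - c * t)
time_shift_only = F(x - c * (t + dt))          # a different value
both_shifted = F((x + dx) - c * (t + dt))      # the original value again

# For a wave F(x + ct) traveling in the negative x-direction,
# we have to move the other way: subtract dx instead of adding it.
leftward_original = F(x + c * t)
leftward_shifted = F((x - dx) + c * (t + dt))
```

Adding time only changes the value of the function, but adding Δx = cΔt as well brings it back, which is just another way of saying that we’re traveling along with the wave.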

The idea behind adding both some time Δt as well as some distance Δx is that you’re traveling with the waveform itself, or with its phase as they say. So it’s like you’re riding on its crest or in its trough, or somewhere hanging on to it, so to speak. Hence, the speed of a wave is also referred to as its phase velocity, which we denote by v_p = c. Now, let me make some remarks here.

First, there is the direction of travel. The pulses above travel in the positive x-direction, so that’s why we have x minus ct in the argument. For a wave traveling in the negative x-direction, we’ll have a wave function y = F(x+ct). [And, yes, don’t be lazy, Vincent: please go through the Δx = cΔt math once again to double-check that.]

The second thing you should note is that the speed of a regular periodic wave is equal to the product of its wavelength and its frequency, so we write: v_p = c = λ·f, which we can also write as λ = c/f or f = c/λ. Now, you know we express the frequency in oscillations or cycles per second, i.e. in hertz: one hertz is, quite simply, 1 s⁻¹, so the unit of frequency is the reciprocal of the second. So the m/s and the Hz units in the fraction below give us a wavelength λ equal to λ = (20 m/s)/(5/s) = 4 m. You’ll say that’s too simple but I just want to make sure you’ve got the basics right here.

The third thing is that, in physics, and in math, we’ll usually work with nice sinusoidal functions, i.e. sine or cosine functions. A sine and a cosine function are the same function but with a phase difference of 90 degrees, so that’s π/2 radians. That’s illustrated below: cosθ = sin(θ+π/2).

Now, when we converted time to distance by multiplying it with c, what we actually did was to ensure that the argument of our wavefunction F was expressed in one unit only: the meter, so that’s the distance unit in the international SI system of units. So that’s why we had to convert time to distance, so to speak.

The other option is to express all in seconds, so that’s in time units. So then we should measure distance in seconds, rather than meters, so to speak, and the corresponding argument is t–x/c, and our wave function would be written as y = G(t–x/c). Just go through the same Δx = cΔt math once more: G[t+Δt–(x+Δx)/c] = G(t+Δt–x/c−cΔt/c) = G(t–x/c).

In short, we’re talking the same wave function here, so F(x−ct) = G(t−x/c), but the argument of F is expressed in distance units, while the argument of G is expressed in time units. If you’d want to double-check what I am saying here, you can use the same 20 m/s wave example again: suppose the distance traveled is 100 m, so x = 100 m and x/c = (100 m)/(20 m/s) = 5 seconds. It’s always important to check the units, and you can see they come out alright in both cases! 🙂

Now, to go from F or G to our sine or cosine function, we need to do yet another conversion of units, as the argument of a sinusoidal function is some angle θ, not meters or seconds. In physics, we refer to θ as the phase of the wave function. So we need degrees or, more common now, radians, which I’ll explain in a moment. Let me first jot it down:

y = sin(2π(x–ct)/λ)

So what are we doing here? What’s going on? Well… First, we divide x–ct by the wavelength λ, so that’s the (x–ct)/λ in the argument of our sine function. So our ‘distance unit’ is no longer the meter but the wavelength of our wave, so we no longer measure in meters but in wavelengths. For example, if our argument x–ct was 20 m, and the wavelength of our wave is 4 m, we get (x–ct)/λ = 5 between the brackets. It’s just like comparing our length: ten years ago you were about half my size. Now you’re the same: one unit. 🙂 When we’re saying that, we’re using my length as the unit – and so that’s also your length unit now 🙂 – rather than meters or centimeters.

Now I need to explain the 2π factor, which is only slightly more difficult. Think about it: one wavelength corresponds to one full cycle, so that’s the full 360° of the circle below. In fact, we’ll express angles in radians, and the two animations below illustrate what a radian really is: an angle of 1 rad defines an arc whose length, as measured on the circle, is equal to the radius of that circle. […] Oh! Please look at the animations as two separate things: they illustrate the same idea, but they’re not synchronized, unfortunately! 🙂

So… I hope it all makes sense now: if we add one wavelength to the argument of our wave function, we should get the same value, and so it’s equivalent to adding 2π to the argument of our sine function. Adding half a wavelength, or 35% of it, or a quarter, or two wavelengths, or e wavelengths, etc is equivalent to adding π, or 35%·2π ≈ 2.2, or 2π/4 = π/2, or 2·2π = 4π, or e·2π, etc to it. So… Well… Think about it: to go from the argument of our wavefunction expressed as a number of wavelengths − so that’s (x–ct)/λ – to the argument of our sine function, which is expressed in radians, we need to multiply by 2π.
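Here is the same arithmetic as a tiny Python sketch, in case you want to try other fractions. The function name is just something I made up for the occasion.

```python
import math


def wavelengths_to_radians(n_wavelengths):
    """Convert a number of wavelengths into a phase angle in radians."""
    return 2 * math.pi * n_wavelengths


half = wavelengths_to_radians(0.5)        # pi
thirty_five = wavelengths_to_radians(0.35)  # roughly 2.2
quarter = wavelengths_to_radians(0.25)    # pi/2
two = wavelengths_to_radians(2)           # 4*pi

# The example above: x - ct = 20 m and a wavelength of 4 m give 5 wavelengths.
phase = wavelengths_to_radians(20 / 4)    # 10*pi

# Adding one full wavelength does not change the value of the sine.
theta = 1.234
same_value = math.sin(theta + wavelengths_to_radians(1))
```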

[…] OK, Vincent. If it’s easier for you, you may want to think of the 1/λ and 2π factors in the argument of the sin(2π(x–ct)/λ) function as scaling factors: you’d use a scaling factor when you go from one measurement scale to another indeed. It’s like using vincents rather than meters. If one vincent corresponds to 1.8 m, then we need to re-scale all lengths by dividing them by 1.8 so as to express them in vincents. Vincent ten years ago was 0.9 m, so that’s half a vincent: 0.9/1.8 = 0.5. 🙂

[…] OK. […] Yes, you’re right: that’s rather stupid and makes nobody smile. Fine. You’re right: it’s time to move on to more complicated stuff. Now, read the following a couple of times. It’s my one and only message to you:

If there’s anything at all that you should remember from all of the nonsense I am writing about in this physics blog, it’s that any periodic phenomenon, any motion really, can be analyzed by assuming that it is the sum of the motions of all the different modes of what we’re looking at, combined with appropriate amplitudes and phases.

It really is a most amazing thing—it’s something very deep and very beautiful connecting all of physics with math.

We often refer to these modes as harmonics and, in one of my posts on the topic, I explained how the wavelengths of the harmonics of a classical guitar string – it’s just an example – depended on the length of the string only. Indeed, if we denote the various harmonics by their harmonic number n = 1, 2, 3,… n,… and the length of the string by L, we have λ₁ = 2L = (1/1)·2L, λ₂ = L = (1/2)·2L, λ₃ = (1/3)·2L,… λ_n = (1/n)·2L. So they look like this:

etcetera (1/8, 1/9,…,1/n,… 1/∞)

The diagram makes it look like it’s very obvious, but it’s an amazing fact: the material of the string, or its tension, doesn’t matter. It’s just the length: simple geometry is all that matters! As I mentioned in my post on music and physics, this realization led to a somewhat misplaced fascination with harmonic ratios, which the Greeks thought could explain everything. For example, the Pythagorean model of the orbits of the planets would also refer to these harmonic ratios, and it took intellectual giants like Galileo and Copernicus to finally convince the Pope that harmonic ratios are great, but that they cannot explain everything. 🙂 [Note: When I say that the material of the string, or its tension, doesn’t matter, I should correct myself: they do come into play when time becomes the variable. Also note that guitar strings are not the same length when strung on a guitar: the so-called bridge saddle is not in an exact right angle to the strings: this is a link to some close-up pictures of a bridge saddle on a guitar, just in case you don’t have a guitar at home to check.]

Now, I already explained the need to express the argument of a wave function in radians – because we’re talking periodic functions and so we want to use sinusoidals − and how it’s just a matter of units really, and so how we can go from meter to wavelengths to radians. I also explained how we could do the same for seconds, i.e. for time. The key to converting distance units to time units, and vice versa, is the speed of the wave, or the phase velocity, which relates wavelength and frequency: c = λ·f. Now, as we have to express everything in radians anyway, we’ll usually substitute the wavelength and frequency by the wavenumber and the angular frequency so as to convert these quantities too to something expressed in radians. Let me quickly explain how it works:

The wavenumber k is equal to k = 2π/λ, so it’s some number expressed in radians per unit distance, i.e. radians per meter. In the example above, where λ was 4 m, we have k = 2π/(4 m) = π/2 radians per meter. To put it differently, if our wave travels one meter, its phase θ will change by π/2.
Likewise, the angular frequency is ω = 2π·f = 2π/T. Using the same example once more, so assuming a frequency of 5 Hz, i.e. a period of one fifth of a second, we have ω = 2π/[(1/5)·s] = 10π per second. So the phase of our wave will change with 10 times π in one second. Now that makes sense because, in one second, we have five cycles, and so that corresponds to 5 times 2π.

Note that our definition implies that λ = 2π/k, and that it’s also easy to figure out that our definition of ω, combined with the f = c/λ relation, implies that ω = 2π·c/λ and, hence, that c = ω·λ/(2π) = (ω·2π/k)/(2π) = ω/k. OK. Let’s move on.

Using the definitions and explanations above, it’s now easy to see that we can re-write our y = sin(2π(x–ct)/λ) as:

y = sin(2π(x–ct)/λ) = sin[2π(x–(ω/k)t)/(2π/k)] = sin[(x–(ω/k)t)·k] = sin(kx–ωt)

Remember, however, that we were talking some wave that was traveling in the positive x-direction. For the negative x-direction, the equation becomes:

y = sin(2π(x+ct)/λ) = sin(kx+ωt)
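Don’t take my word for it: the little Python sketch below checks numerically that both ways of writing the wave give the same number, using our λ = 4 m and f = 5 Hz example. The function names and the sample points are mine, and they’re arbitrary.

```python
import math

wavelength = 4.0                      # m
frequency = 5.0                       # Hz
c = wavelength * frequency            # phase velocity: 20 m/s
k = 2 * math.pi / wavelength          # wavenumber: pi/2 radians per meter
omega = 2 * math.pi * frequency       # angular frequency: 10*pi radians per second


def y_wavelength_form(x, t, direction=+1):
    """sin(2*pi*(x - ct)/lambda) for direction=+1, sin(2*pi*(x + ct)/lambda) for -1."""
    return math.sin(2 * math.pi * (x - direction * c * t) / wavelength)


def y_k_omega_form(x, t, direction=+1):
    """sin(kx - omega*t) for direction=+1, sin(kx + omega*t) for -1."""
    return math.sin(k * x - direction * omega * t)


# Compare the two forms at a few arbitrary points in space and time.
samples = [(0.0, 0.0), (1.3, 0.27), (-2.5, 1.0), (7.0, 0.033)]
largest_difference = max(
    abs(y_wavelength_form(x, t, d) - y_k_omega_form(x, t, d))
    for x, t in samples
    for d in (+1, -1)
)
```

The largest difference is zero, up to rounding errors, and ω/k gives us back the 20 m/s we started from.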

OK. That should be clear enough. Let’s go back to our guitar string. We can go from λ to k by noting that λ₁ = 2L and, more generally, λ_n = 2L/n, so k_n = 2π/λ_n. Hence, we get the following for all of the various modes:

k = k₁ = 2π·1/(2L) = π/L, k₂ = 2π·2/(2L) = 2k, k₃ = 2π·3/(2L) = 3k,… k_n = 2π·n/(2L) = nk,…

That gives us our grand result, and that’s that we can write some very complicated waveform Ψ(x) as the sum of an infinite number of simple sinusoids, so we have:

Ψ(x) = a₁sin(kx) + a₂sin(2kx) + a₃sin(3kx) + … + a_nsin(nkx) + … = ∑ a_nsin(nkx)

The equation above assumes we’re looking at the oscillation at some fixed point in time. If we’d be looking at the oscillation at some fixed point in space, we’d write:

Φ(t) = a₁sin(ωt) + a₂sin(2ωt) + a₃sin(3ωt) + … + a_nsin(nωt) + … = ∑ a_nsin(nωt)
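To make this a bit more tangible, here is a minimal Python sketch of the Ψ(x) sum for a string of length L. It obviously cannot add an infinite number of terms, so it just adds as many as you give it amplitudes for. The function name, the string length and the amplitudes are made up for illustration.

```python
import math


def psi(x, amplitudes, L):
    """Sum of a_n*sin(n*k*x) for n = 1, 2, 3,... with k = pi/L.

    amplitudes: [a1, a2, a3, ...], a finite list standing in for the infinite sum.
    L: length of the string, in the same unit as x.
    """
    k = math.pi / L
    return sum(a * math.sin(n * k * x) for n, a in enumerate(amplitudes, start=1))


L = 0.65                       # hypothetical string length in meters
amplitudes = [1.0, 0.5, 0.25]  # hypothetical amplitudes of the first three modes

at_left_end = psi(0.0, amplitudes, L)    # zero: the string is fixed there
at_right_end = psi(L, amplitudes, L)     # zero again, up to rounding
fundamental_midpoint = psi(L / 2, [1.0], L)  # the first mode peaks in the middle
```

Whatever amplitudes you choose, the sum is zero at both ends of the string, because each of the modes is. The Φ(t) sum works in exactly the same way, with ω and t taking the place of k and x.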

Of course, to represent some very complicated oscillation on our guitar string, we can and should combine some Ψ(x) as well as some Φ(t) function, but how do we do that, exactly? Well… We’ll obviously need both the sin(kx–ωt) as well as those sin(kx+ωt) functions, as I’ll explain in a moment. However, let me first make another small digression, so as to complete your knowledge of wave mechanics. 🙂

We look at a wave as something that’s traveling through space and time at the same time. In that regard, I told you that the speed of the wave is its so-called phase velocity, which we denoted as v_p = c and which, as I explained above, is equal to v_p = c = λ·f = (2π/k)·(ω/2π) = ω/k. The animation below (credit for it must go to Wikipedia – and sorry I forgot to acknowledge the same source for the illustrations above) illustrates the principle: the speed of travel of the red dot is the phase velocity. But you can see that what’s going on here is somewhat more complicated: we have a series of wave packets traveling through space and time here, and so that’s where the concept of the so-called group velocity comes in: it’s the speed of travel of the green dot.

Now, look at the animation below. What’s going on here? The wave packet (or the group or the envelope of the wave—whatever you want to call it) moves to the right, but the phase goes to the left, as the peaks and troughs move leftward indeed. Huh? How is that possible? And where is this wave going? Left or right? Can we still associate some direction with the wave here? It looks like it’s traveling in both directions at the same time!

The wave actually does travel in both directions at the same time. Well… Sort of. The point is actually quite subtle. When I started this post by writing that the pulses were ‘obviously’ traveling in the positive x-direction… Well… That’s actually not so obvious. What is it that is traveling really? Think about an oscillating guitar string: nothing travels left or right really. Each point on the string just moves up and down. Likewise, if our repeated pulse is some water wave, then the water just stays where it is: it just moves up and down. Likewise, if we shake up some rope, the rope is not going anywhere: we just started some motion that is traveling down the rope. In other words, the phase velocity is just a mathematical concept. The peaks and troughs that seem to be traveling are just mathematical points that are ‘traveling’ left or right.

What about the group velocity? Is that a mathematical notion too? It is. The wave packet is often referred to as the envelope of the wave curves, for obvious reasons: they’re enveloped indeed. Well… Sort of. 🙂 However, while both the phase and group velocity are velocities of mathematical constructs, it’s obvious that, if we’re looking at wave packets, the group velocity would be of more interest to us than the phase velocity. Think of those repeated pulses as real water waves, for example: while the water stays where it is (as mentioned, the water molecules just go up and down, more or less at least), we’d surely be interested to know how fast these waves are ‘moving’, and that’s given by the group velocity, not the phase velocity. Still, having said that, the group velocity is as ‘unreal’ as the phase velocity: both are mathematical concepts. The only thing that’s ‘real’ is the up and down movement. Nothing travels in reality. Now, I shouldn’t digress too much here, but that’s why there’s no limit on the phase velocity: it can exceed the speed of light. In fact, in quantum mechanics, some real-life particle – like an electron, for instance – will be represented by a complex-valued wave function, and there’s no reason to put some limit on the phase velocity. In contrast, the group velocity will actually be the speed of the electron itself, and that speed can, obviously, approach the speed of light – in particle accelerators, for example – but it can never exceed it. [If you’re smart, and you are, you’ll wonder: what about photons? Well… The classical and quantum-mechanical views of an electromagnetic wave are surely not the same, but they do have a lot in common: both photons and electromagnetic radiation travel at the speed c. Photons can do so because their rest mass is zero. But I can’t go into any more detail here, otherwise this thing will become way too long.]

OK. Let me get back to the issue at hand. So I’ll now revert to the simpler situation we’re looking at here, and so that’s these harmonic waves, whose form is a simple sinusoidal indeed. The animation below (and, yes, it’s also from Wikipedia) is the one that’s relevant for this situation. You need to study it for a while to understand what’s going on. As you can see, the green wave travels to the right, the blue one travels to the left, and the red wave function is the sum of both.

Of course, after all that I wrote above, I should use quotation marks and write ‘travel’ instead of travel, so as to indicate there’s nothing traveling really, except for those mathematical points, but then no one does that, and so I won’t do it either. Just make sure you always think twice when reading stuff like this! Back to the lesson: what’s going on here?

As I explained, the argument of a wave traveling towards the negative x-direction will be x+ct. Conversely, the argument of a wave traveling in the positive x-direction will be x–ct. Now, our guitar string is going nowhere, obviously: it’s like the red wave function above. It’s a so-called standing wave. The red wave function has nodes, i.e. points where there is no motion—no displacement at all! Between the nodes, every point moves up and down sinusoidally, but the pattern of motion stays fixed in space. So that’s the kind of wave function we want, and the animation shows us how we can get it.

Indeed, there’s a funny thing with fixed strings: when a wave reaches the clamped end of a string, it will be reflected with a change in sign, as illustrated below: we’ve got that F(x+ct) wave coming in, and then it goes back indeed, but with the sign reversed.

The illustration above speaks for itself but, of course, once again I need to warn you about the use of sentences like ‘the wave reaches the end of the string’ and/or ‘the wave gets reflected back’. You know what it really means now: it’s some movement that travels through space. […] In any case, let’s get back to the lesson once more: how do we analyze that?

Easy: the red wave function is the sum of two waves: one traveling to the left, and one traveling to the right. We’ll call these component waves F and G respectively, so we have y = F(x, t) + G(x, t). Let’s go for it.

Let’s first assume the string is not held anywhere, so that we have an infinite string along which waves can travel in either direction. In fact, the most general functional form to capture the fact that a waveform can travel in any direction is to write the displacement y as the sum of two functions: one wave traveling one way (which we’ll denote by F, indeed), and the other wave (which, yes, we’ll denote by G) traveling the other way. From the illustration above, it’s obvious that the F wave is traveling towards the negative x-direction and, hence, its argument will be x+ct. Conversely, the G wave travels in the positive x-direction, so its argument is x–ct. So we write:

y = F(x, t) + G(x, t) = F(x+ct) + G(x–ct)

So… Well… We know that the string is actually not infinite, but that it’s fixed to two points. Hence, y is equal to zero there: y = 0. Now let’s choose the origin of our x-axis at the fixed end so as to simplify the analysis. Hence, where y is zero, x is also zero. Now, at x = 0, our general solution above for the infinite string becomes y = F(ct) + G(−ct) = 0, for all values of t. Of course, that means G(−ct) must be equal to –F(ct). Now, that equality is there for all values of t. So it’s there for all values of ct and −ct. In short, that equality is valid for whatever value of the argument of G and –F. As Feynman puts it: “G of anything must be –F of minus that same thing.” Now, the ‘anything’ in G is its argument: x – ct, so ‘minus that same thing’ is –(x–ct) = −x+ct. Therefore, our equation becomes:

y = F(x+ct) − F(−x+ct)

So that’s what’s depicted in the diagram above: the F(x+ct) wave ‘vanishes’ behind the wall as the −F(−x+ct) wave comes out of it. Now, of course, so as to make sure our guitar string doesn’t stop its vibration after being plucked, we need to ensure F is a periodic function, like a sin(kx+ωt) function. 🙂 Why? Well… If the F and G functions simply disappeared and ‘served’ only once, so to speak, then we would have only one oscillation and that’s it! So the waves need to continue, and that’s why F needs to be periodic.
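The y = F(x+ct) − F(−x+ct) form can be checked numerically for any shape of F, periodic or not. The sketch below is only an illustration: it uses a Gaussian bump as a hypothetical F, takes c = 1 in arbitrary units, and puts the string on the positive side of the clamped end at x = 0.

```python
import math

c = 1.0  # wave speed in arbitrary units (assumed)

def F(u):
    """Hypothetical pulse shape: a Gaussian bump centred on u = 5."""
    return math.exp(-(u - 5.0) ** 2)

def y(x, t):
    """Displacement of a string clamped at x = 0."""
    return F(x + c * t) - F(-x + c * t)

# The clamped end never moves, whatever the time.
end_displacements = [y(0.0, t) for t in (0.0, 2.5, 5.0, 7.5)]

# At t = 0 the incoming pulse sits at x = 5 with its sign up;
# at t = 10 the reflected pulse is back at x = 5 with its sign reversed.
incoming = y(5.0, 0.0)
reflected = y(5.0, 10.0)
print(end_displacements, incoming, reflected)
```

The sign flip on reflection comes purely from the boundary condition, which is why the sketch does not need F to be a sinusoid.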

OK. Can we just take sin(kx+ωt) and −sin(−kx+ωt) and add both? It makes sense, doesn’t it? Indeed, −sinα = sin(−α) and, therefore, −sin(−kx+ωt) = sin(kx−ωt). Hence, y = F(x+ct) − F(−x+ct) would be equal to:

y = sin(kx+ωt) + sin(kx–ωt) = sin(2π(x+ct)/λ) + sin(2π(x−ct)/λ)

Done! Let’s use specific values for k and ω now. For the first harmonic, we know that k = 2π/2L = π/L. What about ω? Hmm… That depends on the wave velocity and, therefore, that actually does depend on the material and/or the tension of the string! The only thing we can say is that ω = c·k, so ω = c·2π/λ = c·π/L. So we get:

sin(kx+ωt) = sin(π·x/L + π·c·t/L) = sin[(π/L)·(x+ct)]

But this is our F function only. The whole oscillation is y = F(x+ct) − F(−x+ct), and − F(−x+ct) is equal to:

–sin[(π/L)·(−x+ct)] = –sin(−π·x/L+π·c·t/L) = −sin(−kx+ωt) = sin(kx–ωt) = sin[(π/L)·(x–ct)]

So, yes, we should add both functions to get:

y = sin[π(x+ct)/L] + sin[π(x−ct)/L]

Now, we can, of course, apply our trigonometric formulas for the addition of angles, which say that sin(α+β) = sinαcosβ + sinβcosα and sin(α–β) = sinαcosβ – sinβcosα. Hence, y = sin(kx+ωt) + sin(kx–ωt) is equal to sin(kx)cos(ωt) + sin(ωt)cos(kx) + sin(kx)cos(ωt) – sin(ωt)cos(kx) = 2sin(kx)cos(ωt). Now, that’s a very interesting result, so let’s give it some more prominence by writing it in boldface:

y = sin(kx+ωt) + sin(kx–ωt) = 2sin(kx)cos(ωt) = 2sin(π·x/L)cos(π·c·t/L)

The sin(π·x/L) factor gives us the nodes in space. Indeed, sin(π·x/L) = 0 if x is equal to 0 or L (values of x outside of the [0, L] interval are obviously not relevant here). Now, the other factor cos(π·c·t/L) can be re-written cos(2π·c·t/λ) = cos(2π·f·t) = cos(2π·t/T), with T the period T = 1/f = λ/c, so the amplitude reaches a maximum (+1 or −1 or, including the factor 2, +2 or −2) if 2π·t/T is equal to a multiple of π, so that’s if t = n·T/2 with n = 0, 1, 2, etc. In our example above, for f = 5 Hz, that means the amplitude reaches a maximum (+2 or −2) every tenth of a second.
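If you do not trust the trigonometry, a few lines of Python will check it for you. The sketch below assumes a string length L = 0.65 m and a wave speed c = 6.5 m/s. Those two values are my own choice, picked only because they give f = c/2L = 5 Hz, the frequency of the example.

```python
import math

L = 0.65         # string length in metres (assumed value)
c = 6.5          # wave speed in m/s (assumed; gives f = c / (2 * L) = 5 Hz)
k = math.pi / L  # wavenumber of the first harmonic
w = c * k        # angular frequency, from w = c * k

def travelling_sum(x, t):
    """Sum of the left-going and right-going component waves."""
    return math.sin(k * x + w * t) + math.sin(k * x - w * t)

def standing(x, t):
    """The same motion written as a standing wave."""
    return 2 * math.sin(k * x) * math.cos(w * t)

# Compare the two forms on a small grid of positions and times.
points = [(i * L / 10, j * 0.013) for i in range(11) for j in range(20)]
max_difference = max(abs(travelling_sum(x, t) - standing(x, t)) for x, t in points)

# Nodes at both ends, and an extreme value at mid-string after half a period (0.1 s).
print(max_difference, standing(0.0, 0.03), standing(L, 0.03), standing(L / 2, 0.1))
```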

The analysis for the other modes is as easy, and I’ll leave it to you, Vincent, as an exercise, to work it all out and send me the y = 2·sin[something]·cos[something else] formula (with the ‘something’ and ‘something else’ written in terms of L and c, of course) for the higher harmonics. 🙂

[…] You’ll say: what’s the point, daddy? Well… Look at that animation again: isn’t it great we can analyze any standing wave, or any harmonic indeed, as the sum of two component waves with the same wavelength and frequency but ‘traveling’ in opposite directions?

Yes, Vincent. I can hear you sigh: “Daddy, I really do not see why I should be interested in this.”

Well… Your call… What can I say? Maybe one day you will. In fact, if you’re going to go for engineering studies, you’ll have to. 🙂

To conclude this post, I’ll insert one more illustration. Now that you know what modes are, you can start thinking about those more complicated Ψ and Φ functions. The illustration below shows how the first and second mode of our guitar string combine to give us some composite wave traveling up and down the very same string.
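For those who like to play with numbers, the combination of two modes can be sketched in Python as follows. The amplitudes (1 and 0.5) and the values L = c = 1 are arbitrary assumptions of mine, and I take the n-th mode to be sin(nπx/L)·cos(nπct/L), which is the usual standing-wave form for a string fixed at both ends.

```python
import math

L, c = 1.0, 1.0  # string length and wave speed in arbitrary units (assumed)

def mode(n, x, t):
    """n-th standing-wave mode of a string fixed at x = 0 and x = L."""
    k = n * math.pi / L
    return math.sin(k * x) * math.cos(c * k * t)

def composite(x, t, a1=1.0, a2=0.5):
    """First and second mode added together, with illustrative amplitudes."""
    return a1 * mode(1, x, t) + a2 * mode(2, x, t)

fundamental_period = 2 * L / c
print(composite(0.3, 0.4), composite(0.3, 0.4 + fundamental_period))
```

The composite shape changes all the time, but it repeats itself after one period of the first mode, and both ends stay put.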

Think about it. We have one physical phenomenon here: at every point in time, the string is somewhere, but where exactly, depends on the mathematical shape of its components. If this doesn’t illustrate the beauty of Nature, the fact that, behind every simple physical phenomenon − most of which are some sort of oscillation indeed − we have some marvelous mathematical structure, then… Well… Then I don’t know how to explain why I am absolutely fascinated by this stuff.

Addendum 1: On actual waves

My examples of waves above were all examples of so-called transverse waves, i.e. oscillations at a right angle to the direction of the wave. The other type of wave is longitudinal. I mentioned sound waves above, and those are essentially longitudinal: the displacement of the medium is in the same direction as the wave, as illustrated below.

Real-life waves, like water waves, may be neither of the two. The illustration below shows how water molecules actually move as a wave passes. They move in little circles, with a systematic phase shift from circle to circle.

Why is this so? I’ll let Feynman answer, as he also provided the illustration above:

“Although the water at a given place is alternately trough or hill, it cannot simply be moving up and down, by the conservation of water. That is, if it goes down, where is the water going to go? The water is essentially incompressible. The speed of compression of waves—that is, sound in the water—is much, much higher, and we are not considering that now. Since water is incompressible on this scale, as a hill comes down the water must move away from the region. What actually happens is that particles of water near the surface move approximately in circles. When smooth swells are coming, a person floating in a tire can look at a nearby object and see it going in a circle. So it is a mixture of longitudinal and transverse, to add to the confusion. At greater depths in the water the motions are smaller circles until, reasonably far down, there is nothing left of the motion.”

So… There you go… 🙂

Addendum 2: On non-periodic waves, i.e. pulses

A waveform is not necessarily periodic. The pulse we looked at could, perhaps, not repeat itself. It is not possible, then, to describe its wavelength. However, it’s still a wave and, hence, its functional form would still be some y = F(x−ct) or y = F(x+ct) form, depending on its direction of travel.

The example below also comes out of Feynman’s Lectures: electromagnetic radiation is caused by some accelerating electric charge – an electron, usually, because its mass is small and, hence, it’s much easier to move than a proton 🙂 – and then the electric field travels out in space. So the two diagrams below show (i) the acceleration (a) as a function of time (t) and (ii) the electric field strength (E) as a function of the distance (r). [To be fully precise, I should add he ignores the 1/r variation, but that’s a fine point which doesn’t matter much here.]

He basically uses this illustration to explain why we can use a y = G(t–x/c) functional form to describe a wave. The point is: he actually talks about one pulse only here. So the F(x±ct) or G(t±x/c) or sin(kx±ωt) form has nothing to do with whether or not we’re looking at a periodic or non-periodic waveform. The gist of the matter is that we’ve got something moving through space, and it doesn’t matter whether it’s periodic or not: the periodicity, or non-periodicity, of a wave has nothing to do with the x±ct, t±x/c or kx±ωt shape of the argument of our wave function. The functional form of our argument is just the result of what I said about traveling along with our wave.

So what is it about periodicity then? Well… If periodicity kicks in, you’ll talk sinusoidal functions, and so the circle will be needed once more. 🙂

Now, I mentioned we cannot associate any particular wavelength with such a non-periodic wave. Having said that, it’s still possible to analyze this pulse as a sum of sinusoids through a mathematical procedure which is referred to as the Fourier transform. If you’re going for engineering, you’ll need to learn how to master this technique. As for now, however, you can just have a look at the Wikipedia article on it. 🙂
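To give you a first taste, here is a small sketch using the discrete cousin of the Fourier transform. It samples a single Gaussian pulse (a hypothetical pulse shape, not Feynman's), breaks it down into sinusoids, and then rebuilds it. The naive O(N²) sums are fine for 64 samples. Real code would use an FFT library.

```python
import cmath
import math

N = 64
# Hypothetical single pulse: a Gaussian bump sampled at N points.
pulse = [math.exp(-((n - N / 2) / 4.0) ** 2) for n in range(N)]

def dft(samples):
    """Naive discrete Fourier transform: one complex amplitude per sinusoid."""
    size = len(samples)
    return [
        sum(s * cmath.exp(-2j * math.pi * m * n / size) for n, s in enumerate(samples))
        for m in range(size)
    ]

def inverse_dft(coeffs):
    """Rebuild the samples by adding the sinusoids back together."""
    size = len(coeffs)
    return [
        sum(cf * cmath.exp(2j * math.pi * m * n / size) for m, cf in enumerate(coeffs)).real / size
        for n in range(size)
    ]

spectrum = dft(pulse)
rebuilt = inverse_dft(spectrum)
max_error = max(abs(a - b) for a, b in zip(pulse, rebuilt))

# Count the sinusoids whose amplitude is at least 1% of the largest one.
significant = sum(1 for cf in spectrum if abs(cf) > 0.01 * abs(spectrum[0]))
print(max_error, significant)
```

A pulse needs a whole range of sinusoids rather than one dominant one, which is another way of saying it has no single wavelength.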

Magnetism and relativity

Original post:

The magnetic force is a strange animal. The F = q(E+v×B) = qE+qv×B formula implies that both its direction and its magnitude depend on the direction and the magnitude of the motion of the charge. The magnetic force is, just like the electric force, still proportional to the amount of charge (q), but then we have not one but two vectors co-determining its direction and magnitude, as expressed by the vector product v×B, whose magnitude is |v×B| = |v|·|B|·sinθ = v·B·sinθ.
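A small Python sketch may help to see the vector product at work. All the numbers below (the field strength, the speed, the 30-degree angle) are made up for illustration, and the helper functions are my own. Only the F = q(E+v×B) formula itself comes from the text.

```python
import math

def cross(a, b):
    """Vector (cross) product of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Length of a 3-component vector."""
    return math.sqrt(sum(component ** 2 for component in a))

def lorentz_force(q, E, v, B):
    """F = q(E + v x B), component by component."""
    v_cross_b = cross(v, B)
    return tuple(q * (e + m) for e, m in zip(E, v_cross_b))

q = -1.602e-19                      # charge of an electron, in coulomb
theta = math.radians(30.0)          # angle between v and B (illustrative)
v = (1.0e5, 0.0, 0.0)               # velocity in m/s, along the x-axis
B = (0.2 * math.cos(theta), 0.2 * math.sin(theta), 0.0)  # field in tesla
E = (0.0, 0.0, 0.0)                 # no electric field in this example

force = lorentz_force(q, E, v, B)
print(norm(cross(v, B)), norm(v) * norm(B) * math.sin(theta), force)
```

The force comes out perpendicular to both v and B, and it points the 'wrong' way compared to v×B because the charge is negative.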

The presence of the velocity vector in the F = q(E+v×B) formula implies that both the magnetic and the electric field are relative, as we wonder: “What velocity? With respect to which reference frame?” Diagrams (a) and (b) below illustrate the same interaction between some current-carrying wire and some negative charge q from two perspectives:

Diagram (a) below represents frame S, in which the wire is at rest, and the charge moves along the wire with velocity v₀, while

Diagram (b) below represents frame S’, which coincides with the reference frame of the charge, so now it’s the wire that’s moving past the particle, instead of the other way around.

Because of relativity, all of our variables transform: we have time dilation, length contraction, and relativistic mass, as I explained in my posts on special relativity. So we cannot take any of the variables for granted and so we prime all of them: in S’, we have I’, v’, etcetera, and so we need to calculate their values using the Lorentz transformation rules.

Now, we know that the absolute speed of light connects both pictures, but that’s not enough to explain what’s going on. We need some other anchoring principle as well. We have such an anchor: charges are always the same, moving or not. They are indestructible. They are never lost or created: they move from place to place but never appear from nowhere. In short, charge is conserved. So we also need to look at charge densities and see what happens to them.

The illustration above shows the current I going in the conventional direction, so that’s opposite to the actual direction of travel of the free drifting electrons. It’s a convention that makes sense because of all our other conventions, such as the right-hand rule for our vector cross-product v×B above, so we won’t touch it. Having said that, the illustration shows what’s going on in S: the positive charges in the wire don’t move, so we have some charge density ρ₊ and a velocity v₊ = 0. The electrons, on the other hand, do move, and so we have some charge density ρ₋ and a velocity v₋ = v. Now, we’re looking at an uncharged wire, so ρ₊ must be equal to −ρ₋. So the situation is rather simple: we have a current causing a magnetic field, and the force on our moving charge q(−) is F = q·v₀×B.

However, the same situation looks very different from the S’ perspective: our q(−) charge is not moving and, therefore, there can be no magnetic force. Hence, if there’s any force on the particle, it must come from an electric field. But what electric field? If the wire is neutral, there can be no electric flux from it.

You’ll say: why should there be a force on it? Forces also look different in different reference frames, don’t they? They do: they’re subject to the same Lorentz transformation rules: F’ = γF with γ = (1−v²/c²)^−1/2. So, yes, the forces look different, but they surely do not disappear! Especially not because the typical drift velocity of electrons in a conductor is exceedingly slow. In fact, it’s usually measured in centimeters per hour and, hence, the Lorentz factor γ is extremely close to 1. 🙂 So the forces in the two reference frames should be nearly identical. Hence, the conclusion must be that the electromagnetic force in the S’ reference frame appears as some electric force, which implies that… Well… The bold conclusion is that our wire must be charged in S’ and, therefore, causes an electric field, rather than a magnetic field!

Huh? How is that possible?
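Before turning to that, a quick aside on how close to 1 the Lorentz factor really is for a drifting electron. The sketch below takes one centimeter per hour as an illustrative drift velocity (an assumption, just to fix the order of magnitude). The difference with 1 is so small that ordinary double-precision arithmetic cannot even see it, so the sketch also uses the small-velocity approximation γ − 1 ≈ v²/2c².

```python
import math

c = 299_792_458.0   # speed of light in m/s
v = 0.01 / 3600.0   # one centimeter per hour, in m/s (illustrative value)

beta_squared = (v / c) ** 2

# Direct formula: the subtraction 1 - beta^2 rounds to exactly 1.0 in double precision.
gamma_direct = 1.0 / math.sqrt(1.0 - beta_squared)

# Leading term of gamma - 1 for small velocities: v^2 / (2 c^2).
gamma_excess = 0.5 * beta_squared

print(gamma_direct, gamma_excess)
```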

To simplify the calculations involved, Feynman analyzes a special case: he equates v with v₀. So that gives us the variables in diagram (b) above: in reference frame S’, we have some charge density ρ’₊ and a velocity v’₊ = −v₀ = −v, while the electrons don’t seem to move: we have some charge density ρ’₋ but the velocity v’₋ = 0. As mentioned above, we cannot assume that ρ’₊ = ρ₊ or that ρ’₋ = ρ₋ and, therefore, we cannot assume that I = I’.

[…] OK. Now that we’ve explained all the variables involved, we’re ready to actually do the calculation. The crux of the matter is that a charge density is some number expressed per unit volume, and that the volume changes because of the relativistic contraction of distances. That’s what’s shown below.

As I mentioned in my posts on relativity, of all of the effects of relativity, length contraction is probably the most difficult to grasp. How come the same amount of charge is suddenly spread over a smaller volume? Well… It is what it is, and I cannot say more about it than what I already said in the mentioned posts, so let’s get on with it. The (a) and (b) situations above describe the same piece of wire: its length and area, as measured in the stationary reference frame S, are L₀ and A₀ respectively, so its volume is L₀·A₀. If we denote the total charge in this volume as Q, then the charge density ρ₀ will be measured as ρ₀ = Q/(L₀·A₀).

Now what changes if we change the reference frame, so we look at this piece of wire moving past at velocity v? The dimensions that are transverse to the direction of motion don’t change, so the area A₀ remains what it is. What about Q? Well… Q doesn’t change either. As mentioned above, there’s no such thing as relativistic charge, so there’s no equivalent for the m_v = γm₀ (or, multiplied with c², E_v = γE₀) formula when charges are involved. How do we know that? Feynman answers that question appealing to common sense:

“Suppose that we take a block of material, say a conductor, which is initially uncharged. Now we heat it up. Because the electrons have a different mass than the protons, the velocities of the electrons and of the protons will change by different amounts. If the charge of a particle depended on the speed of the particle carrying it, in the heated block the charge of the electrons and protons would no longer balance. A block would become charged when heated. As we have seen earlier, a very small fractional change in the charge of all the electrons in a block would give rise to enormous electric fields. No such effect has ever been observed. Also, we can point out that the mean speed of the electrons in matter depends on its chemical composition. If the charge on an electron changed with speed, the net charge in a piece of material would be changed in a chemical reaction. Again, a straightforward calculation shows that even a very small dependence of charge on speed would give enormous fields from the simplest chemical reactions. No such effect is observed, and we conclude that the electric charge of a single particle is independent of its state of motion. So the charge q on a particle is an invariant scalar quantity, independent of the frame of reference. That means that in any frame the charge density of a distribution of electrons is just proportional to the number of electrons per unit volume. We need only worry about the fact that the volume can change because of the relativistic contraction of distances.”

OK. That’s clear enough. Let’s get back to the lesson. The upshot here is that we don’t need to worry about the charge but about the charge density. To be specific, the charge density, as measured in the reference frame S’, will be equal to:

ρ = ρ₀·(1−v²/c²)^−1/2

Why? If the total charge Q is the same in both S and S’, then Q = ρ₀·L₀·A₀ must be equal to ρ·L·A₀, with L the measured length in the S’ reference frame. Now, because of the relativistic length contraction effect, we know that L = L₀·(1−v²/c²)^1/2 and, therefore, ρ must be equal to ρ = ρ₀·(1−v²/c²)^−1/2. Capito?
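In numbers, and only as a sketch: the values for ρ₀, L₀ and A₀ below are arbitrary, and β = v/c = 0.6 is chosen because it gives a round Lorentz factor of 1.25. The point is just that the density goes up by exactly the factor that the length goes down, so the total charge stays the same.

```python
import math

def lorentz_factor(beta):
    """gamma = (1 - beta^2)^(-1/2), with beta = v / c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

rho_0, L_0, A_0 = 2.0, 3.0, 0.5   # rest-frame density, length and area (arbitrary units)
beta = 0.6                        # illustrative speed, as a fraction of c

gamma = lorentz_factor(beta)
L_moving = L_0 / gamma            # contracted length
rho_moving = rho_0 * gamma        # density measured in the frame where the wire moves

Q_rest = rho_0 * L_0 * A_0
Q_moving = rho_moving * L_moving * A_0
print(gamma, rho_moving, Q_rest, Q_moving)
```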

We’re almost there. Now we need to apply this more general result to the ρ’₋/ρ₋ and ρ₊/ρ’₊ density ‘pairs’ that we mentioned at the start. Let me copy the illustration once again so you can see what we are talking about:

The analysis is straightforward but a bit tricky. For the positive charges, you should note that they are at rest in (a), so that’s in reference frame S and, therefore, we can just write:

ρ’₊ = ρ₊·(1−v²/c²)^−1/2

However, for the negative charges, we see they’re at rest in (b), and so that’s in reference frame S’, so the ρ₀ in our general formula is not ρ₋ but ρ’₋! So you should be careful when applying the same formula. However, if you are careful, you’ll agree we can write:

ρ₋ = ρ’₋·(1−v²/c²)^−1/2 or, equivalently, ρ’₋ = ρ₋·(1−v²/c²)^1/2

Now, the total charge density ρ’ in reference frame S’ is, of course, the sum of ρ’₋ and ρ’₊. Also noting that we were looking at an uncharged wire in reference frame S, so ρ₊ = −ρ₋, we get the following grand result:

ρ’ = ρ’₊ + ρ’₋ = ρ₊·(v²/c²)·(1−v²/c²)^−1/2

So our wire appears to be positively charged in the S’ frame, with a charge density that’s equal to the product of the positive charge density and a β²/(1−β²)^1/2 factor. So that’s our Lorentz factor γ multiplied by β² = (v/c)². The graph below shows how that factor increases as β = v/c goes from 0 to 1. We’ve also inserted the graph of the Lorentz factor itself, so you can compare both. Interesting, isn’t it? 🙂
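If you want to see the numbers behind such a graph, the sketch below tabulates both factors. It assumes the special case of the text (v = v₀) and an uncharged wire in S, so the positive density picks up a factor γ in S’ while the electron density loses one. The function name and the sample values of β are mine.

```python
import math

def net_density_in_charge_frame(rho_plus, beta):
    """Net charge density in S', for a wire that is neutral in S (special case v = v0)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    rho_minus = -rho_plus                 # neutral wire in S
    rho_plus_prime = rho_plus * gamma     # positive charges: at rest in S, moving in S'
    rho_minus_prime = rho_minus / gamma   # electrons: at rest in S', moving in S
    return rho_plus_prime + rho_minus_prime

for beta in (0.1, 0.3, 0.6, 0.9, 0.99):
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    print(beta, round(gamma, 4), round(beta ** 2 * gamma, 4),
          round(net_density_in_charge_frame(1.0, beta), 4))
```

The last two columns agree, which is the γ − 1/γ = β²γ identity in disguise.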

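Now, because the wire is electrically charged in reference frame S’, we have an electric field E’ which, using the formula for the field of a uniformly charged cylinder (with A the cross-sectional area of the wire and r the distance from its axis), can be calculated as:

E’ = ρ’·A/(2πε₀·r) = ρ₊·A·(v²/c²)/[2πε₀·r·(1−v²/c²)^1/2]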

Now, as far as I am concerned, that’s it. But… Well… Of course, we should generalize the analysis for v ≠ v₀. However, I’ll refer you to Feynman for that. He also takes care of the remainder of the calculations you’d probably want to see, like a formula which shows that the force on the charge in S’ is indeed what we would expect it to be. Feynman also shows that all other variables we can possibly calculate in the S’ reference frame, such as the momentum of the charged particle after the force has acted on it for some time, all turn out to be what we’d expect them to be according to special relativity.
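For the special case v = v₀, the comparison of the two forces is easy enough to sketch in Python. In S, I take the magnetic field of a long straight wire, B = I/(2πε₀c²r), with I = ρ₊·A·v in magnitude. In S’, I take the field of a uniformly charged cylinder with net density ρ₊·β²·γ. The charge density, wire area and distance below are made-up values, and the function names are mine.

```python
import math

EPS0 = 8.8541878128e-12   # vacuum permittivity in F/m
C = 299_792_458.0         # speed of light in m/s

def force_in_wire_frame(q, rho_plus, area, v, r):
    """Magnitude of the magnetic force in S (charge moving at v along the wire)."""
    current = rho_plus * area * v                        # magnitude of the current
    b_field = current / (2 * math.pi * EPS0 * C ** 2 * r)
    return abs(q) * v * b_field

def force_in_charge_frame(q, rho_plus, area, v, r):
    """Magnitude of the electric force in S' (charge at rest, wire charged)."""
    beta = v / C
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    rho_net = rho_plus * beta ** 2 * gamma               # net density seen in S'
    e_field = rho_net * area / (2 * math.pi * EPS0 * r)
    return abs(q) * e_field

q, rho_plus, area, r = -1.602e-19, 1.0e10, 1.0e-6, 0.01  # illustrative values
v = 0.6 * C                                              # exaggerated speed, to make gamma visible

ratio = force_in_charge_frame(q, rho_plus, area, v, r) / force_in_wire_frame(q, rho_plus, area, v, r)
print(ratio)
```

The ratio is the Lorentz factor, in line with the F’ = γF rule. For a realistic drift velocity it would be indistinguishable from 1.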

However, I have to limit this post and, hence, I’ll just copy Feynman’s grand conclusion:

“We have found that we get the same physical result whether we analyze the motion of a particle moving along a wire in a coordinate system at rest with respect to the wire, or in a system at rest with respect to the particle. In the first instance, the force was purely “magnetic,” in the second, it was purely “electric.” If we had chosen still another coordinate system, we would have found a different mixture of E and B fields. Electric and magnetic forces are part of one physical phenomenon—the electromagnetic interactions of particles. The separation of this interaction into electric and magnetic parts depends very much on the reference frame chosen for the description. But a complete electromagnetic description is invariant; electricity and magnetism taken together are consistent with Einstein’s relativity.”

So… That’s basically it for today’s lesson. 🙂 I should just add one more thing so as to be as complete as I should be in regard to the issue at hand here. You know the Lorentz transformation rules for the space and time coordinates, and you may or may not remember we had similar relativistic four-vectors for energy and momentum. Now, it turns out that we also have similar equations to relate charges and currents in one reference frame to those in another. More in particular, to transform ρ and j to a coordinate system moving with velocity u in the x-direction, you should use the following rules:

ρ’ = (ρ − u·j_x/c²)/(1−u²/c²)^1/2, j’_x = (j_x − u·ρ)/(1−u²/c²)^1/2, j’_y = j_y, j’_z = j_z

But that’s really it for today. Have fun reflecting upon it all! 🙂

The field from a grid

Pre-script (dated 26 June 2020): This post got mutilated by the removal of some material by the dark force. You should be able to follow the main story-line, however. If anything, the lack of illustrations might actually help you to think things through for yourself.

Original post:

As part of his presentation of indirect methods for finding the field, Feynman presents an interesting argument on the electrostatic field of a grid. It’s just another indirect method to arrive at meaningful conclusions on what a field is supposed to look like, but it’s quite remarkable, and that’s why I am expanding it here. Feynman’s presentation is extremely succinct indeed and, hence, I hope the elaboration below will help you to understand it somewhat quicker than I did. 🙂

The grid is shown below: it’s just a uniformly spaced array of parallel wires in a plane. We are looking at the field above the plane of wires here, and the dotted lines represent equipotential surfaces above the grid.

As you can see, for larger distances above the plane, we see a constant electric field, just as though the charge were uniformly spread over a sheet of charge, rather than over a grid. However, as we approach the grid, the field begins to deviate from the uniform field.

Let’s analyze it by assuming the wires lie in the xy-plane, running parallel to the y-axis. The distance between the wires is measured along the x-axis, and the distance to the grid is measured along the z-axis, as shown in the illustration above. We assume the wires are infinitely long and, hence, the electric field does not depend on y. So the component of E in the y-direction is 0, so E_y = –∂Φ/∂y = 0. Therefore, ∂²Φ/∂y² = 0 and our Poisson equation above the wires (where there are no charges) reduces to the two-dimensional Laplace equation ∂²Φ/∂x² + ∂²Φ/∂z² = 0. What’s next?

Let’s look at the field of two positive wires first. The plot below comes from the Wolfram Demonstrations Project. I recommend you click the link and play with it: you can vary the charges and the distance, and the tool will redraw the equipotentials and the field lines accordingly. It will give you a better feel for the (a)symmetries involved. The equipotential lines are the gray contours: they are cross-sections of equipotential surfaces. The red curves are the field lines, which are always orthogonal to the equipotentials.

The point at the center is really interesting: the straight horizontal and vertical red lines through it are limits really. Feynman’s illustration below shows the point represents an unstable equilibrium: the hollow tube prevents the charge from going sideways. So if it weren’t there, the charge would go sideways, of course! So it’s some kind of saddle point. Onward!

Look at the illustration below and try to imagine what the field looks like by thinking about the value of the potential as you move along one of the two blue lines below: the potential goes down as we move to the right, reaches a minimum in the middle, and then goes up again. Also think about the difference between the lighter and darker blue line: going along the light-blue line, we start at a lower potential, and its minimum will also be lower than that of the dark-blue line.

So you can start drawing curves. However, I have to warn you: the graphs are not so simple. Look at the detail below. The potential along the blue line goes slightly up before it decreases, so the graph of the potential may resemble the green curve on the right of the image. I did an actual calculation here. 🙂 If there are only two charges, the formula for the potential is quite simple: Φ = (1/4πε₀)·(q₁/r₁) + (1/4πε₀)·(q₂/r₂). Briefly forgetting about the (1/4πε₀) and equating q₁ and q₂ to +1, we get Φ = 1/r₁ + 1/r₂ = (r₁ + r₂)/(r₁·r₂). That looks like an easy function, and it is. You should think of it as the equivalent of the 1/r formula, but written as 1/r = r/r², and with a factor 2 in front because we have two charges. 🙂

However, we need to express it as a function of x, keeping z (i.e. the ‘vertical’ coordinate) constant. That’s what I did to get the graphs below. It’s easy to see that 1/r₁ = (x² + z²)^−1/2, while 1/r₂ = [(a−x)² + z²]^−1/2. Assuming a = 2 and z = 0.8, the contribution from the first charge is given by the blue curve, the contribution of the second charge is represented by the red curve, and the green curve adds both and, hence, represents the potential generated by both charges, i.e. q₁ at x = 0 and q₂ at x = a. OK… Onward!

The point to note is that we have an extremely simple situation here – two charges only, or two wires, I should say – but a potential function that is surely not some simple sinusoidal function. To drive the point home, I plotted a few more curves below, keeping a at a = 2, but equating z with 0.4, 0.7 and 1.7 respectively. The z = 1.7 curve shows that, at larger distances, the potential actually increases slightly as we move from left to right along the z = 1.7 line. Note the remarkable symmetry of the curves and the equipotential lines: there should be some obvious mathematical explanation for that but, unfortunately, not obvious enough for me to find it, so please let me know if you see it! 🙂
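You can redo the calculation yourself with the sketch below. It uses the same simplified formula as above (two unit charges at x = 0 and x = a, with the 1/4πε₀ factor dropped) and the same values a = 2, z = 0.8 and z = 1.7. The function names are mine.

```python
import math

def potential(x, z, a=2.0):
    """Phi = 1/r1 + 1/r2 for unit charges at x = 0 and x = a, at height z."""
    r1 = math.hypot(x, z)
    r2 = math.hypot(a - x, z)
    return 1.0 / r1 + 1.0 / r2

def midpoint_is_minimum(z, a=2.0, step=0.1):
    """True if the potential at x = a/2 is lower than just next to it."""
    return potential(a / 2, z, a) < potential(a / 2 - step, z, a)

for z in (0.4, 0.7, 0.8, 1.7):
    print(z, round(potential(0.0, z), 4), round(potential(1.0, z), 4), midpoint_is_minimum(z))
```

Close to the charges, the midpoint is a minimum of the curve. At z = 1.7 it has turned into a maximum. One symmetry is easy to see in the formula: swapping x for a − x just swaps r₁ and r₂, so every curve is mirrored around x = a/2.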

OK. Let’s get back to our grid. For your convenience, I copied it once more below.

Feynman’s approach to calculating the variations is quite original. He also duly notes that the potential function is surely not some simple sinusoidal function. However, he also notes that, when everything is said and done, it is some periodic quantity, in one way or another, and, therefore, we should be able to do a Fourier analysis and express it as a sum of sinusoidal waves. To be precise, we should be able to write Φ(x, z) as a sum of harmonics.

[…] I know. […] Now you say: Oh sh**! And you’ll just turn off. That’s OK, but why don’t you give it a try? I promise to be lengthy. 🙂

Before we get too much into the weeds, let’s briefly recall how it works for our classical guitar string. That post explained how the wavelengths of the harmonics of a string depended on its length. If we denote the various harmonics by their harmonic number n = 1, 2, 3 etcetera, and the length of the string by L, we have λ₁ = 2L = (1/1)·2L, λ₂ = L = (1/2)·2L, λ₃ = (1/3)·2L,… λ_n = (1/n)·2L. In short, the harmonics – i.e. the components of our waveform – look like this:

etcetera (1/8, 1/9,…,1/n,… 1/∞)

Beautiful, isn’t it? As I explained in that post, it’s so beautiful it triggered a misplaced fascination with harmonic ratios. It was misplaced because the Pythagorean theory was a bit too simple to be true. However, their intuition was right, and they set the stage for guys like Copernicus, Fourier and Feynman, so that was good! 🙂

Now, as you know, we’ll usually substitute wavelength and frequency by wavenumber and angular frequency so as to convert all to something expressed in radians, which we can then use as the argument in the sine and/or cosine component waves. [Yes, the Pythagoreans once again! :-)] The wavenumber k is equal to k = 2π/λ, and the angular frequency is ω = 2π·f = 2π/T (in case you doubt, you can quickly check that the speed of a wave c is equal to the product of the wavelength and its frequency by substituting: c = λ·f = (2π/k)·(ω/2π) = ω/k, which gives you the phase velocity v_p = c). To make a long story short, we wrote k = k₁ = 2π·1/(2L), k₂ = 2π·2/(2L) = 2k, k₃ = 2π·3/(2L) = 3k,… k_n = 2π·n/(2L) = nk,… to arrive at the grand result, and that’s our wave F(x) expressed as the sum of an infinite number of simple sinusoids:

F(x) = a₁cos(kx) + a₂cos(2kx) + a₃cos(3kx) + … + a_ncos(nkx) + … = ∑ a_ncos(nkx)
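To make the bookkeeping concrete, here is a minimal Python sketch of the wavelengths and wavenumbers of the first few harmonics, plus a truncated version of the sum. The string length and the amplitudes are made-up values for illustration only; nothing here computes the real Fourier coefficients.

```python
import math

def harmonic_wavenumbers(L, n_max):
    """Return [(n, wavelength, wavenumber)] for a string of length L fixed at both ends."""
    rows = []
    for n in range(1, n_max + 1):
        wavelength = 2 * L / n          # λ_n = (1/n)·2L
        k_n = 2 * math.pi / wavelength  # k_n = 2π/λ_n = n·k₁
        rows.append((n, wavelength, k_n))
    return rows

def partial_sum(x, amplitudes, k):
    """F(x) ≈ Σ a_n·cos(n·k·x), truncated to the amplitudes that were passed in."""
    return sum(a_n * math.cos(n * k * x) for n, a_n in enumerate(amplitudes, start=1))

L = 0.65  # hypothetical string length in metres
rows = harmonic_wavenumbers(L, 4)
k1 = rows[0][2]
for n, lam, k in rows:
    print(f"n={n}: wavelength={lam:.4f} m, k={k:.4f} rad/m, k/k1={k / k1:.1f}")

# Made-up amplitudes; at x = 0 every cosine equals 1, so the sum is just their total.
print(partial_sum(0.0, [1.0, 0.5, 0.25], k1))
```

The k/k1 column simply confirms that the wavenumbers are integer multiples of the first one.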

That’s easy enough. The problem is to find those amplitudes a₁, a₂, a₃,… of course, but the great French mathematician who gave us the Fourier series also gave us the formulas for that, so we should be fine! Can we use them here? Should we use them here? Let’s see…

The a in the analysis, i.e. the spacing of the wires, is the physical quantity that corresponds to the length of our guitar string in our musical sound problem. In fact, a corresponds to 2L, because guitar strings are fixed at two ends and, hence, the two ends have to be nodes and, therefore, the wavelength of our first harmonic is twice the length of the string. Huh? Well… Something like that. As you can see from the illustration of the grid, a, in contrast to L, does correspond to one full wavelength of our periodic function. So we write:

Φ(x) = ∑ a_ncos(n·k·x) = ∑ a_ncos(2π·n·x/a) (n = 1, 2, 3,…)

Now, that’s the formula for Φ(x) assuming we’re fixing z, so it’s Φ(x) at some fixed distance from the grid. Let’s think about those amplitudes a_n now. They should not depend on x, because the harmonics themselves (i.e. the cos(2π·n·x/a) components) are all that varies with x. So they have to be some function of n and – most importantly – some function of z also. So we denote them by F_n(z) and re-write the equation above as:

Φ(x, z) = ∑ F_n(z)·cos(2π·n·x/a) (n = 1, 2, 3,…)

Now, the rest of Feynman’s analysis speaks for itself, so I’ll just shamelessly copy it:

What did he find here? What is he saying, really? 🙂 First note that the derivation above has been done for one term in the Fourier sum only, so we’re talking a specific harmonic n here. That harmonic n is a function of z which – let me remind you – is the distance from the grid. To be precise, the function is F_n(z) = A_n·e^(−z/z₀). [In case you wonder how Feynman goes from equation (7.43) to (7.44), he’s just solving a second-order linear differential equation here. :-)]

Now, you’ve seen the graph of that function a zillion times before: it starts at A_n for z = 0 and goes to zero as z goes to infinity, as shown below. 🙂

Now, that’s the case for all F_n(z) coefficients of course. As Feynman writes:

“We have found that if there is a Fourier component of the field of harmonic n, that component will decrease exponentially with a characteristic distance z₀ = a/2πn. For the first harmonic (n = 1), the amplitude falls by the factor e^(−2π) (i.e. a large decrease) each time we increase z by one grid spacing a. The other harmonics fall off even more rapidly as we move away from the grid. We see that if we are only a few times the distance a away from the grid, the field is very nearly uniform, i.e., the oscillating terms are small. There would, of course, always remain the “zero harmonic” field, i.e. Φ₀ = −E₀·z, to give the uniform field at large z. Of course, for the complete solution, the sum needs to be made, and the coefficients A_n would need to be adjusted so that the total sum, when differentiated, gives an electric field that would fit the charge density of the grid wires.”

Phew! Quite something, isn’t it? But that’s it really, and it’s actually simpler than the ‘direct’ calculations of the field that I googled. Those calculations involve complicated series and logs and what have you, to arrive at the same result: the field away from a grid of charged wires is very nearly uniform.
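If you want to see how fast those harmonics die out, the following Python sketch evaluates F_n(z) = A_n·e^(−z/z₀) with z₀ = a/2πn after one grid spacing, and checks numerically that a single term of the sum satisfies Laplace’s equation away from the wires. The grid spacing a = 1 and the amplitude A_n = 1 are arbitrary assumptions; the real coefficients A_n are not computed here.

```python
import math

def decay_length(a, n):
    """Characteristic distance z0 = a/(2πn) of harmonic n for grid spacing a."""
    return a / (2 * math.pi * n)

def harmonic_term(A_n, a, n, x, z):
    """One term of the sum: A_n·exp(−z/z0)·cos(2π·n·x/a)."""
    return A_n * math.exp(-z / decay_length(a, n)) * math.cos(2 * math.pi * n * x / a)

def laplacian_2d(f, x, z, h=1e-3):
    """Central-difference estimate of ∂²f/∂x² + ∂²f/∂z²."""
    return (f(x + h, z) + f(x - h, z) + f(x, z + h) + f(x, z - h) - 4 * f(x, z)) / h**2

a = 1.0  # grid spacing in arbitrary units (assumed)
for n in (1, 2, 3):
    left_after_one_spacing = harmonic_term(1.0, a, n, 0.0, a)  # x = 0, so the cosine is 1
    print(f"n={n}: z0={decay_length(a, n):.4f}, F_n(a)/A_n={left_after_one_spacing:.3e}")

# A single term should satisfy Laplace's equation in the charge-free region z > 0,
# so the residual below should be tiny compared with the size of the term itself.
residual = laplacian_2d(lambda x, z: harmonic_term(1.0, a, 1, x, z), 0.1, 0.2)
print(f"Laplacian residual at (0.1, 0.2): {residual:.2e}")
```

The residual is not exactly zero because of the finite step h, but it should be orders of magnitude smaller than the individual second derivatives.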

Let me conclude this post by noting Feynman’s explanation of shielding by a screen. It’s quite terse:

“The method we have just developed can be used to explain why electrostatic shielding by means of a screen is often just as good as with a solid metal sheet. Except within a distance from the screen a few times the spacing of the screen wires, the fields inside a closed screen are zero. We see why copper screen—lighter and cheaper than copper sheet—is often used to shield sensitive electrical equipment from external disturbing fields.”

Hmm… So how does that work? The logic should be similar to the logic I explained when discussing shielding in one of my previous posts. Have a look—if only because it’s a lot easier to understand than the rather convoluted business I presented above. 🙂 But then I guess it’s all par for the course, isn’t it? 🙂

Capacitors

Original post:

This post briefly explores the properties of capacitors. Why? Well… Just because they’re an element in electric circuits, and so we should try to fully understand how they function so we can understand how electric circuits work. Indeed, we’ll look at some interesting DC and AC circuits in the very near future. 🙂

Feynman introduces condensers − now referred to as capacitors – right from the start, as he explains Maxwell’s fourth equation, which is written as c²∇×B = ∂E/∂t + j/ε₀ in differential form, but easier to read when integrating over a surface S bounded by a curve C:

c²·∮_C B·ds = (d/dt)·∫_S E·n da + (1/ε₀)·∫_S j·n da

The ∂E/∂t term implies that changing electric fields produce magnetic effects (i.e. some circulation of B, i.e. the c²∇×B on the left-hand side). We need this term because, without it, there could be no currents in circuits that are not complete loops, like the circuit below, which is just a circuit with a capacitor made of two flat plates. The capacitor is charged by a current that flows toward one plate and away from the other. It looks messy because of the complicated drawing: we have a curve C around one of the wires defining two surfaces: S₁ is a surface that just fills the loop and, hence, crosses the wire, while S₂ is a bowl-shaped surface which passes between the plates of the capacitor (so it does not cross the wire).

If we look at C and S₁ only, then the circulation of B around C is explained by the current through the wire, so that’s the j/ε₀ term in Maxwell’s equation, which is probably how you understood magnetism during your high-school time. However, no current goes through the S₂ surface, so if we look at C and S₂ only, we need the ∂E/∂t to explain the magnetic field. Indeed, as Feynman points out, changing the location of an imaginary surface should not change a real magnetic field! 🙂

Let’s look at those charged sheets. For a single sheet of charge, we found two opposite fields of magnitude E = (1/2)·σ/ε₀. Now, it is easy to see that we can superimpose the solutions for two parallel sheets with equal and opposite charge densities +σ and −σ, so we get:

E_between = σ/ε₀ (between the sheets) and E_outside = 0 (outside)

Now, actual capacitors are not made of some infinitely thin sheet of charge: they are made of some conductor and, hence, we get that shielding effect and we’re talking surface charge densities +σ and −σ, so the actual picture is more like the one below. Having said that, the formula above is still correct: E is σ/ε₀ between the plates, and zero everywhere else (except at the edge, but I’ll talk about that later).

We’re now ready to tackle the first property of a capacitor, and that is its capacity. In fact, the correct term is capacitance, but that sounds rather strange, doesn’t it?

The capacity of a capacitor

We know the two plates are both equipotentials but with different potential, obviously! If we denote these two potentials as Φ₁ and Φ₂ respectively, we can define their difference Φ₁ − Φ₂ as the voltage between the two plates. Its unit is the same as the unit for potential which, as you may or may not remember, is potential energy per unit charge, so that’s newton·meter/coulomb. [In honor of the guy who invented the first battery, 1 N·m/C is usually referred to as one volt, which – quite annoyingly – is also abbreviated as V, even if the voltage and the volt are two very different things: the volt is the unit of voltage.]

Now, it’s easy to see that the voltage, or potential difference, is the amount of work that’s required to carry one unit charge from one plate to the other. To be precise, because the coulomb is a huge unit − it’s equivalent to the combined charge of some 6.241×10¹⁸ protons − we should say that the voltage is the work per unit charge required to carry a small charge from one plate to the other. Hence, if d is the distance between the two plates (as shown in the illustration above), we can write:

V = Φ₁ − Φ₂ = E·d = (σ/ε₀)·d = Q·d/(ε₀·A)

Q is the total charge on each plate (so it’s positive on one, and negative on the other), A is the area of each plate, and d is the separation between the two plates. What the equation says is that the voltage is proportional to the charge, and the constant of proportionality is d over ε₀A. Now, the proportionality between V and Q is there for any two conductors in space (provided we have a plus charge on one, and a minus charge on the other, and so we assume there are no other charges around). Why? It’s just the logic of the superposition of fields: we double the charges, so we double the fields, and so the work done in carrying a unit charge from one point to the other is also doubled! So that’s why the potential difference between any two points is proportional to the charges.

Now, if we turn the proportionality around and write Q = C·V, the constant of proportionality C is called the capacity or capacitance of the system. In fact, it’s defined as C = Q/V. [Again, it’s a bit of a nuisance the symbol (C) is the same as the symbol that is used for the unit of charge, but don’t worry about it.] To put it simply, the capacitance is the ability of a body to store electric charge. For our parallel-plate condenser, it is equal to C = ε₀A/d. Its unit is coulomb/volt, obviously, but – again in honor of some other guy – it’s referred to as the farad: 1 F = 1 C/V.
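To get a feel for the numbers, here is a small Python sketch of the parallel-plate formulas. The plate area, the separation and the charges are assumed values chosen for illustration, and edge effects are ignored, just like in the text.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def plate_voltage(Q, A, d):
    """V = E·d = (σ/ε₀)·d with σ = Q/A, for an ideal parallel-plate capacitor."""
    sigma = Q / A
    return sigma * d / EPS0

def parallel_plate_capacitance(A, d):
    """C = ε₀·A/d in farad."""
    return EPS0 * A / d

A = 0.01   # plate area in m² (assumed: 10 cm × 10 cm)
d = 1e-3   # plate separation in m (assumed)
C = parallel_plate_capacitance(A, d)
for Q in (1e-9, 2e-9):  # doubling the charge should double the voltage
    V = plate_voltage(Q, A, d)
    print(f"Q={Q:.1e} C -> V={V:.3f} V, Q/V={Q / V:.3e} F")
print(f"C = {C:.3e} F")
```

The ratio Q/V comes out the same for both charges, which is just the proportionality argument in numbers. It also shows why the farad is such a large unit: plates of this size give you only some tens of picofarads.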

To build a fairly high-capacity condenser, one could put waxed paper between sheets of aluminium and roll it up. Sealed in plastic, that made a typical radio-type condenser. The principle used today is still the same. In order to reduce the risk of breakdown (which occurs when the field strength becomes so large that it pulls electrons from the dielectric between the plates, thus causing conduction), higher capacity is generally better, so the voltage developed across the condenser will be smaller. Condensers used to be fairly big, but modern capacitors are actually as small as other computer card components. It’s all interesting stuff, but I won’t elaborate on it here, because I’d rather focus on the physics and the math behind the engineering in this blog. 🙂

Onward! Let’s move to the next thing. Before we do so, however, let me quickly give you the formula for the capacity of a charged sphere (for a parallel-plate capacitor, it’s C = ε₀A/d, as noted above): C = 4πε₀a. You’ll wonder: where’s the ‘other’ conductor here? Well… When this formula is used, it assumes some imaginary sphere of infinite radius with opposite charge −Q.

The energy of a capacitor

I talked about the energy of fields in various places, most notably my posts on fields and charges. The idea behind is quite simple: if there’s some distribution of charges in space, then we always have some energy in the system, because a certain amount of work was required to bring the charges together. [For the concept of energy itself, please see my post on energy and potential.] Remember that simple formula, and the equally simple illustration:

Also remember what we wrote above: the voltage is the work per unit charge required to carry a small charge from one plate to the other. Now, when charging a capacitor, what’s happening is that charge gets transferred from one plate to another indeed, and the work required to transfer a small charge dQ is, obviously, equal to V·dQ. Hence, the change in energy is dU = V·dQ. Now, because V = Q/C, we get dU = (Q/C)·dQ, and integrating this from zero charge to some final charge Q, we get:

U = (1/2)·Q²/C = (1/2)·C·V²

Note how the capacity C, or its inverse 1/C, appears as a constant of proportionality in both equations. It’s the charge, or the voltage, that’s the variable really, and the formulas say the energy is proportional to the square of the charge, or the voltage. Finally, also note that we immediately get the energy of a charged sphere by substituting 4πε₀a for C (see the capacity formula in the previous section):

U = (1/2)·Q²/(4πε₀a) = Q²/(8πε₀a)

Now, Feynman applies this energy formula to an interesting range of practical problems, but I’ll refer you to him for that: just click on the link and check it out. 🙂
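Before moving on, the integration behind U = (1/2)·Q²/C can be checked numerically. The Python sketch below adds up the little bits of work dU = (Q/C)·dQ while charging; the capacitance and the final charge are assumed values, and the step count is arbitrary.

```python
def charging_energy(C, Q_final, steps=1000):
    """Sum dU = (Q/C)·dQ from zero charge up to Q_final (midpoint rule)."""
    dQ = Q_final / steps
    U = 0.0
    for i in range(steps):
        Q_mid = (i + 0.5) * dQ   # charge already on the plates during this step
        U += (Q_mid / C) * dQ    # work V·dQ needed to move the next bit of charge
    return U

C = 1e-6    # capacitance in farad (assumed value)
Q = 1e-5    # final charge in coulomb (assumed value)
V = Q / C   # final voltage
U_numeric = charging_energy(C, Q)
print(U_numeric, 0.5 * Q**2 / C, 0.5 * C * V**2)
```

Because the integrand is linear in Q, the midpoint sum agrees with both closed-form expressions up to rounding.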

OK… Next thing. The next thing is to look at the dielectric material inside capacitors.

Dielectrics

You know the dielectric inside a capacitor increases its capacity. In case you wonder what I am talking about: the dielectric is the waxed paper inside of that old-fashioned radio-type condenser, or the oxide layer on the metal foil used in more recent designs. However, before analyzing dielectrics, let’s first look at what happens when putting another conductor in-between the plates of our parallel-plate condenser, as shown below.

As a matter of fact, the neutral conductor will also increase the capacitance of our condenser. Now how does that work? It’s because of the induced charges. As I explained in my post on how shielding works, the induced charges reduce the field inside of the conductor to zero. So there is no field inside the (neutral) conductor. The field in the rest of the space is still what it was: σ/ε₀, so that’s the surface density of charge (σ) divided by ε₀. However, the distance over which we have to integrate to get the potential difference (i.e. the voltage V) is reduced: it’s no longer d but d minus b, with b the thickness of the conductor, as there’s no work involved in moving a charge across a zero field. Hence, instead of writing V = E·d = σ·d/ε₀, we now write V = σ·(d−b)/ε₀. Hence, the capacity C = Q/V = ε₀A/d is now equal to C = Q/V = ε₀A/(d−b), which we prefer to write as:

C = (ε₀A/d)·(1 − b/d)⁻¹

Now, because 0 < 1 − b/d < 1, we have a factor (1 − b/d)⁻¹ that is greater than 1. So our capacitor will have greater capacity which, remembering our C = Q/V and U = (1/2)·C·V² formulas, implies (a) that it will store more charge at the same potential difference (i.e. voltage) and, hence, (b) that it will also store more energy at the same voltage.
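The effect of the (1 − b/d)⁻¹ factor is easy to tabulate. The Python sketch below uses assumed plate dimensions and a few assumed slab thicknesses b; it treats the capacitor as ideal, without edge effects.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def capacitance_with_slab(A, d, b):
    """C = ε₀·A/(d − b) for a neutral conducting slab of thickness b between the plates."""
    if not 0 <= b < d:
        raise ValueError("slab thickness must satisfy 0 <= b < d")
    return EPS0 * A / (d - b)

A, d = 0.01, 1e-3  # assumed plate area (m²) and separation (m)
C0 = capacitance_with_slab(A, d, 0.0)
for b in (0.0, 0.25e-3, 0.5e-3, 0.9e-3):
    C = capacitance_with_slab(A, d, b)
    print(f"b/d={b / d:.2f}: C/C0={C / C0:.2f}, (1 - b/d)^-1={1 / (1 - b / d):.2f}")
```

The two columns agree, and the capacitance blows up as the slab fills almost all of the gap, which is exactly where the breakdown risk comes in.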

Having said that, it’s easy to see that, if there’s air in-between, the risk of the capacitor breaking down will be much more significant. Hence, the use of conducting material to increase the capacitance of a capacitor is not recommended. [The question of how a breakdown actually occurs in a vacuum is an interesting one: the vacuum is expected to undergo electrical breakdown at or near the so-called Schwinger limit. If you want to know more about it, you can read the Wikipedia article on this.]

So what happens when we put a dielectric in-between? It’s illustrated below. The field is reduced but it is not zero, so the positive charge on the surface of the dielectric (look at the gaussian surface S shown by the broken lines) is less than the negative charge on the conductor: in the illustration below, it’s a 1 to 2 ratio.

But what’s happening really? What’s the reality behind? Good question. The illustration above is just a mathematical explanation. It doesn’t tell us anything − nothing at all, really − on the physics of the situation. As Feynman writes:

“The experimental fact is that if we put a piece of insulating material like lucite or glass between the plates, we find that the capacitance is larger. That means, of course, that the voltage is lower for the same charge. But the voltage difference is the integral of the electric field across the capacitor; so we must conclude that inside the capacitor, the electric field is reduced even though the charges on the plates remain unchanged. Now how can that be? Gauss’ Law tells us that the flux of the electric field is directly related to the enclosed charge. Consider the gaussian surface S shown by broken lines. Since the electric field is reduced with the dielectric present, we conclude that the net charge inside the surface must be lower than it would be without the material. There is only one possible conclusion, and that is that there must be positive charges on the surface of the dielectric. Since the field is reduced but is not zero, we would expect this positive charge to be smaller than the negative charge on the conductor. So the phenomena can be explained if we could understand in some way that when a dielectric material is placed in an electric field there is positive charge induced on one surface and negative charge induced on the other.”

Now that’s a mathematical model indeed, based on the formula for the work involved in transferring charge from one plate to the other:

W = ∫ F·ds = ∫qE·ds = q·∫E·ds = qV

If your physics classes in high school were any good, you’ve probably seen the illustration above. Having said that, the physical model behind is more complicated, and so let’s have a look at that now.

The key to the whole analysis is the assumption that, inside a dielectric, we have lots of little atomic or molecular dipoles. Feynman presents an atomic model (shown below) but we could also think of highly polar molecules, like water, for instance. [Note, however, that, with water, we’d have a high risk of electrical breakdown once again.]

The micro-model doesn’t matter very much. The whole analysis hinges on the concept of a dipole moment per unit volume. We’ve introduced the concept of the dipole moment tout court in a previous post, but let me remind you: the dipole moment is the product of the magnitude of two equal but opposite charges q₊ and q₋ and the distance between them.

Now, because we’re using the d symbol for the distance between our plates, we’ll use δ for the distance between the two charges. Also note that we usually write the dipole moment as a vector so we keep track of its direction and we can use it in vector equations. To make a long story short: p = qδ for the magnitudes and, reading p and δ as vectors, p = qδ once more. [Please do note that δ is a vector going from the negative to the positive charge, otherwise you won’t understand a thing of what follows.]

As mentioned above, we can have atomic or molecular or whatever other type of dipoles, but what we’re interested in is the dipole moment per unit volume, which we write as:

P = Nqδ, with N the number of dipoles per unit volume.

For rather obvious reasons, P is also often referred to as the polarization vector. […] OK. We’re all set now. We should distinguish two possibilities:

P is uniform, i.e. constant, across our sheet of material.
P is not uniform, i.e. P varies across the dielectric.

So let’s do the first case first.

1. Uniform P

This assumption gives us the mathematical model of the dielectric almost immediately. Indeed, when everything is said and done, what’s going on here is that the positive/negative charges inside the dielectric have just moved in/out over that distance δ, so at the surface, they have also moved in/out over the very same distance. So the image is effectively the image below, which is equivalent to that mathematical model of a dielectric we presented above.

Of course, no analysis is complete without formulas, so let’s see what we need and what we get.

The first thing we need is the surface density of the polarization charge induced on the surface, which was denoted by σ_pol, as opposed to σ_free, which is the surface density on the plates of our capacitor (the subscript ‘free’ refers to the fact that the electrons are supposed to be able to move freely, which is not the case in our dielectric). Now, if A is the area of our surface slabs, and if, for each of the dipoles, we have that q₋ charge, then the illustration above tells us that the total charge in the tiny negative surface slab will be equal to Q = A·δ·q₋·N. Hence, the surface charge density σ_pol = Q/A = A·δ·q₋·N/A = N·δ·q₋. But N·δ·q is also the definition of P! Hence, σ_pol = P. [Note that σ_pol is positive on one side, and negative on the other, of course!]

Now that we have σ_pol, we can use our E = σ/ε₀ formula and add the fields from the dielectric and the capacitor plates respectively. Just think about that gaussian surface S, for example. The field there, taking into account that σ_pol and σ_free have opposite signs, is equal to:

E = (σ_free − σ_pol)/ε₀

Using our σ_pol = P identity, we can also write this as E = (σ_free−P)/ε₀. But what’s P? Well… It’s a property of the material obviously, but then it’s also related to the electric field, of course! For larger E, we can reasonably assume that δ will be larger too (assuming some grid of atoms or molecules, we should obviously not assume a change in N or q₋) and, hence, dP/dE is supposed to be positive. In fact, it turns out that the relation between E and P is pretty linear, and so we can define some constant of proportionality and write P ≈ kE for the magnitudes. Moreover, because the E and P vectors have the same direction, we can actually write P ≈ kE as a vector equation too. Now, for historic reasons, we’ll write our k as k = ε₀·χ, so we’re singling out our ε₀ constant once more and – as usual – we add some gravitas to the analysis by using one of those Greek letters (χ is chi). So we have P = ε₀·χ·E, and our equation above becomes:

E = (σ_free − ε₀·χ·E)/ε₀ ⇔ E = (σ_free/ε₀)·1/(1 + χ)

Now, remembering that V = E·d and that the total charge on our capacitor is equal to Q = σ_free·A, we get the formula which you may or may not know from your high school physics classes:

C = Q/V = σ_free·A/(E·d) = ε₀·A·(1 + χ)/d

So… As Feynman puts it: “We have explained the observed facts. When a parallel-plate capacitor is filled with a dielectric, the capacitance is increased by the factor 1+χ.” The table below gives the values for various materials. As you can see, water’d be a great dielectric… if it wouldn’t be so conductive. 🙂
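The chain of formulas for the uniform case can be put together in a few lines of Python. This is only a sketch: the free charge density, the plate dimensions and χ = 1 are assumed values, with χ = 1 picked because it reproduces the 1 to 2 ratio between the polarization charge and the free charge mentioned earlier.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m

def dielectric_fields(sigma_free, chi):
    """Return (E, P) inside a linear dielectric, P = ε₀·χ·E, filling a parallel-plate capacitor."""
    E = sigma_free / (EPS0 * (1 + chi))
    P = EPS0 * chi * E  # equals σ_pol when the polarization is uniform
    return E, P

def dielectric_capacitance(A, d, chi):
    """C = ε₀·A·(1 + χ)/d."""
    return EPS0 * A * (1 + chi) / d

sigma_free = 1e-6  # free surface charge density in C/m² (assumed)
chi = 1.0          # hypothetical susceptibility
E, P = dielectric_fields(sigma_free, chi)
print(f"E = {E:.4e} V/m, sigma_pol/sigma_free = {P / sigma_free:.2f}")
print(f"C/C_vacuum = {dielectric_capacitance(0.01, 1e-3, chi) / dielectric_capacitance(0.01, 1e-3, 0.0):.2f}")
```

You can verify that E also equals (σ_free − P)/ε₀, so the two ways of writing the field are consistent.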

As for the assumption of linearity between E and P, there’s stuff on the Web on non-linear relationships too, but you can google that yourself. 🙂 Let’s now analyze the second case.

2. Non-uniform P

The analysis for non-uniform polarization is more general, and includes uniform polarization as a special case. To get going with it, Feynman uses an illustration (reproduced below) which is not so evident to interpret. Take your time to study it. The d connects, once again, two equal but opposite charges. The P vector points in the same direction as the d vector, obviously, but has a different magnitude, because P is equal to P = Nqd. We also have the normal unit vector n here and an angle θ between the normal and P. Finally, the broken lines represent a tiny imaginary surface. To be precise, it represents, once again, an infinitesimal surface, or a surface element, as Feynman terms it.

Just take your time and think about it. If there’s no field across, then θ = π/2 and our surface disappears. If n and P point in the same direction, then θ = 0 and our surface becomes a tiny rectangle of height d. Feynman uses the illustration above to point out that the charge moved across any surface element is proportional to the component of P that is perpendicular to the surface. Hence, remembering what the vector dot product stands for, and remembering that both σ_pol as well as P are expressed per unit area, we can write:

σ_pol = P·n = |P|·|n|·cosθ = P·cosθ

So P·n is the normal component of P, i.e. the component of P that’s perpendicular to our infinitesimal surface, and this component gives us the charge that moves across a surface element. [I know… The analysis is everything but easy here… But just hang in and try to get through it.]
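A quick numerical illustration of σ_pol = P·n may help here. In the Python sketch below, the polarization vector is a made-up value along the z-axis, and the surface normal is tilted away from it by an angle θ.

```python
import math

def sigma_pol(P, n):
    """Surface polarization charge density σ_pol = P·n, with n normalised to unit length."""
    norm = math.sqrt(sum(c * c for c in n))
    return sum(p * c / norm for p, c in zip(P, n))

P = (0.0, 0.0, 2.0e-6)  # hypothetical polarization in C/m², along z
for theta_deg in (0, 60, 90):
    theta = math.radians(theta_deg)
    n = (math.sin(theta), 0.0, math.cos(theta))  # normal tilted by θ away from P
    print(f"theta={theta_deg} deg: sigma_pol={sigma_pol(P, n):.3e} C/m²")
```

At θ = 0 you get the full magnitude of P, at θ = π/2 nothing crosses the surface element, and in between it is just P·cosθ.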

Now, while the illustration above, and the formula, show us how some charge moves across the infinitesimal surface to create some surface polarization, it is obvious that it should not result in a net surface charge, because there are equal and opposite contributions from the dielectric on the two sides of the surface. However, having said that, the displacements of the charges do result in some tiny volume charge density, as illustrated below.

Now, I must admit Feynman does not make it easy to intuitively understand what’s going on because the various P vectors are chosen rather randomly, but you should be able to get the idea. P is not uniform indeed. Therefore, the electric field across our dielectric causes the P vectors to have different magnitudes and/or directions. Now, as mentioned above, to get the total charge that is being displaced out of any volume bounded by some surface S, we should look at the normal component of P over the surface S. To be precise, to get the total charge that is being displaced out of the volume V, we should integrate the outward normal component of P over the surface S. Of course, an equal excess charge of the opposite sign will be left behind. So, denoting the net charge inside V by ΔQ_pol, we write:

ΔQ_pol = −∫_S P·n da

Now, you may or may not remember Gauss’ Theorem, which is related but not to be confused with Gauss’ Law (for more details, check one of my previous posts on vector analysis), according to which we can write:

∫_S P·n da = ∫_V ∇·P dV

[I know… You’re getting tired, but we’re almost there.] We can look at the net charge ΔQ_pol inside V as an infinite sum of the charge densities ρ_pol (a charge per unit volume now, not per unit area) added over the volume V. So we write:

ΔQ_pol = ∫_V ρ_pol dV

Again, the integral above may not appear to be very intuitive, but it actually is: the σ_pol = P·n formula gave us the charge moved across a surface element (so that’s something two-dimensional) and now we ask how much charge is left behind per unit volume, so the third spatial dimension comes in. Again, just let it sink in for a while, and you’ll see it all makes sense. In any case, the equalities above imply that:

∫_V ρ_pol dV = −∫_V ∇·P dV

and, therefore, that

ρ_pol = −∇·P

You’ll say: so what? Well… It’s a nice result, really. Feynman summarizes it as follows:

“If there is a nonuniform polarization, its divergence gives the net density of charge appearing in the material. We emphasize that this is a perfectly real charge density; we call it “polarization charge” only to remind ourselves how it got there.”

Well… That says it all, I guess. To make sure you understand what’s written here: please note, once again, that the net charge over the whole of the dielectric is and remains zero, obviously!
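If you’d like to see the surface integral and the divergence agree, here is a Python sketch for a made-up polarization field on a unit cube. The field P = (3x², 0, 0) is purely hypothetical, and the integrals are done with simple midpoint sums rather than anything sophisticated.

```python
def P_field(x, y, z):
    """Hypothetical non-uniform polarization, pointing along x and growing as x²."""
    return (3.0 * x * x, 0.0, 0.0)

def divergence(F, x, y, z, h=1e-5):
    """Central-difference estimate of ∇·F at the point (x, y, z)."""
    dFx = (F(x + h, y, z)[0] - F(x - h, y, z)[0]) / (2 * h)
    dFy = (F(x, y + h, z)[1] - F(x, y - h, z)[1]) / (2 * h)
    dFz = (F(x, y, z + h)[2] - F(x, y, z - h)[2]) / (2 * h)
    return dFx + dFy + dFz

def outward_flux_unit_cube(F, m=20):
    """Midpoint-rule integral of F·n over the six faces of the cube [0,1]³."""
    step = 1.0 / m
    mids = [(i + 0.5) * step for i in range(m)]
    flux = 0.0
    for u in mids:
        for v in mids:
            flux += (F(1.0, u, v)[0] - F(0.0, u, v)[0]) * step * step  # faces x = 1 and x = 0
            flux += (F(u, 1.0, v)[1] - F(u, 0.0, v)[1]) * step * step  # faces y = 1 and y = 0
            flux += (F(u, v, 1.0)[2] - F(u, v, 0.0)[2]) * step * step  # faces z = 1 and z = 0
    return flux

def volume_integral_unit_cube(f, m=20):
    """Midpoint-rule integral of a scalar function f over the cube [0,1]³."""
    step = 1.0 / m
    mids = [(i + 0.5) * step for i in range(m)]
    return sum(f(x, y, z) for x in mids for y in mids for z in mids) * step ** 3

# Charge left behind in the cube, computed in two ways.
dQ_surface = -outward_flux_unit_cube(P_field)
dQ_volume = volume_integral_unit_cube(lambda x, y, z: -divergence(P_field, x, y, z))
print(dQ_surface, dQ_volume)
```

Both numbers are the polarization charge inside the cube: once from the charge that was pushed out through the faces, once from adding up −∇·P over the volume.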

The only question you may have is if non-uniform polarization is actually relevant. It is. You can google and you’re likely to get a lot of sites relating to multi-layered transducers and piezoelectric materials. 🙂 But, you’re right, that’s perhaps too advanced to talk about here.

Having said that, what I write above may look like too much nitty-gritty, but it isn’t: the formulas are pretty basic, and you need them if you want to advance in physics. In fact, Feynman uses these simple formulas in two more Lectures (Chapter 10 and 11 in Volume II, to be precise) to do some more analyses of real physics. However, as this blog is not meant to be a substitute for his Lectures, I’ll refer to him for further reading. At the very least, you have the basics here, and I hope it was interesting enough to induce you to look at the mentioned Lectures yourself. 🙂

The method of images

Pre-script (dated 26 June 2020): This post got mutilated by the removal of some illustrations by the dark force. You should be able to follow the main story-line, however. If anything, the lack of illustrations might actually help you to think things through for yourself.

Original post:

In my previous post, I mentioned the so-called method of images, but didn’t elaborate much. Let’s recall the problem. As you know, the whole subject of electrostatics is governed by one equation: the so-called Poisson equation:

∇²Φ = ∂²Φ/∂x² + ∂²Φ/∂y² + ∂²Φ/∂z² = −ρ/ε₀

We get this equation by combining Maxwell’s first law (∇·E = ρ/ε₀) and the E = −∇Φ formula. Now, if we know the distribution of charges, then we don’t need that Poisson equation: we can calculate the potential at every point – denoted by (1) below – using the following formulas:

And if we have Φ, we have E, because E = –∇Φ. But, in most actual situations, we don’t know the charge distribution, and then we need to work with that Poisson equation. Of course, you’ll say: if you don’t know the charge distribution, then you don’t know the ρ in the equation, and so what use is it really?

The answer is: most problems will involve conductors, and we do know that their surface is an equipotential surface. We also know that the electric field just outside the surface must be normal to the surface. Let’s take the example of the grounded conducting sheet once again, as depicted below. We know the image charge and the field lines on the left-hand side are not there. In fact, because the sheet is grounded, it is held at zero potential, and the conductor acts as a shield.

We do have a real field on the right-hand side though, and it’s exactly the same as that of a dipole: we only need to cross out the left-hand half of the picture. What charges are responsible for it? It surely cannot be the +q charge alone, and it isn’t: we also have induced local charges on the sheet. Indeed, the positive charge will attract negative charges to the surface and, hence, the surface charge density is not zero. We can calculate it. How? It’s quite complicated, but let’s give it a try.

Look at the detail below. Let’s forget about the induced charges for a while, and analyze the field produced by the positive charge in the absence of induced charges, so that’s the E field at point P. The magnitude of its normal component is E_n+ = E·cosθ, with θ the angle between the two vectors.

θ is an angle of a right-angled triangle, and it’s easy to see that cosθ is equal to a/(a² + ρ²)^(1/2). Now, Coulomb’s Law tells us that E = (1/4πε₀)·q/[(a² + ρ²)^(1/2)]² = (1/4πε₀)·q/(a² + ρ²). Hence, we can write:

E_n+ = (1/4πε₀)·a·q/(a² + ρ²)^(3/2)

[A quick note on the symbols used here: we use ρ (rho) to denote a distance here. That’s somewhat confusing because it usually denotes a volume density. However, we’re interested in a surface density here, for which the σ (sigma) symbol is used. So don’t worry about it. Just note that ρ is some distance here, instead of a charge density.]

Now we know that the induced charges will arrange themselves in such way that the addition of their field makes the field at P look like there was a negative charge of the same magnitude as q at the other side of the sheet. If there was such charge −q, then we could do the same analysis, as shown below. It’s easy to see that the component of the imaginary field along the sheet (i.e. the component that’s perpendicular to the normal) cancels the actual component along the sheet of the field created by +q, while its normal component adds to the normal component of the +q field. To make a long story short, the actual field at P is equal to E(ρ) = (1/4πε₀)·2a·q/(a² + ρ²)^(3/2), and it is the sum of two normal components of strength (1/4πε₀)·a·q/(a² + ρ²)^(3/2) each.
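The cancellation along the sheet and the doubling of the normal component can be checked by brute force. The Python sketch below puts the sheet in the plane x = 0, the charge +q at x = a and the image charge −q at x = −a, and adds the two Coulomb fields at a point on the sheet. The values of q, a and ρ are assumptions, and the problem is reduced to a two-dimensional slice through the charge.

```python
import math

K = 1 / (4 * math.pi * 8.8541878128e-12)  # Coulomb constant 1/(4πε₀)

def point_charge_field(q, source, point):
    """Coulomb field (Ex, Ey) at `point` from a charge q located at `source` (2-D slice)."""
    dx, dy = point[0] - source[0], point[1] - source[1]
    r3 = (dx * dx + dy * dy) ** 1.5
    return (K * q * dx / r3, K * q * dy / r3)

def field_on_sheet(q, a, rho):
    """Total field at distance rho along the sheet (plane x = 0) from +q at x = a and −q at x = −a."""
    P = (0.0, rho)
    Ex1, Ey1 = point_charge_field(q, (a, 0.0), P)    # real charge
    Ex2, Ey2 = point_charge_field(-q, (-a, 0.0), P)  # image charge
    return (Ex1 + Ex2, Ey1 + Ey2)

q, a, rho = 1e-9, 0.05, 0.03  # assumed charge (C) and distances (m)
Ex, Ey = field_on_sheet(q, a, rho)
expected = K * 2 * a * q / (a * a + rho * rho) ** 1.5
print(f"normal component: {Ex:.4e} V/m (formula magnitude: {expected:.4e} V/m)")
print(f"component along the sheet: {Ey:.4e} V/m")
```

The normal component comes out negative here simply because, with this choice of axes, the field on the sheet points away from +q and into the sheet.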

To put it differently, the actual field can be thought of as two parts: (1) the (normal) component of the field caused by +q, and (2) the field caused by the surface charge density σ at P, which we denote as σ(ρ). Let’s see what we can do with this.

The analysis of the field of a sheet of charge on a conductor is quite complicated, and not quite like the analysis of just a sheet of charge. The analysis for just a sheet of charge was based on the theoretical situation depicted below. We imagined some box with two Gaussian surfaces of area A, and we then used Gauss’ Law to deduce that, if σ was the charge per unit area (i.e. the surface density), the total flux out of the box should be equal to EA + EA = σA/ε₀ and, hence, E = (1/2)·σ/ε₀. The illustration below shows we should think of two fields with opposite direction, and with a magnitude of (1/2)·σ/ε₀ each.

That’s simple enough. However, a sheet of charge on a conductor produces a different field, as shown below. Because of the shielding effect, we have flux on one side of the box only, and the field strength of this flux is σ/ε₀, so that’s two times the (1/2)·σ/ε₀ magnitude described above. However, as mentioned, it’s zero on the other side, i.e. the inside of the conductor shown below.

So what happens here? The charges in the neighborhood of a point P on the surface actually do produce a local field (E_local), both inside and outside of the surface, which respects the E_local = (1/2)·σ/ε₀ equality, but all the rest of the charges on the conductor “conspire” to produce an additional field at the point P, which has the same magnitude of (1/2)·σ/ε₀ but points in the same direction on both sides of the surface. So the net result is that the total field inside goes to zero, and the field outside is equal to E = σ/ε₀, so E = 2·E_local. Note that the example above assumes a positively charged conductor: if the charge on the conductor were negative, the direction of the field would be inwards, but we’d still have a field on and outside of the surface only.
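The bookkeeping for the two situations can be sketched in a few lines of Python. This is an illustration only: the value of σ is made up, the function names are mine, and the split of the conductor field into a local part and a part from the rest of the charges follows the argument above.

```python
# Field of a bare sheet of charge versus the field at a conductor's surface.
EPS0 = 8.8541878128e-12  # vacuum permittivity, in F/m


def sheet_field(sigma):
    """Field strength on either side of an isolated sheet of charge."""
    return sigma / (2 * EPS0)


def conductor_fields(sigma):
    """Return (field inside, field outside) at the surface of a conductor.

    Taking the outward normal as positive, the local charges give
    +E_local outside and -E_local inside, while the rest of the charges
    add +E_local on both sides.
    """
    e_local = sheet_field(sigma)
    e_rest = e_local
    inside = -e_local + e_rest
    outside = e_local + e_rest
    return inside, outside


sigma = 1.0e-6  # hypothetical surface charge density, in C/m^2
inside, outside = conductor_fields(sigma)
print(f"sheet: {sheet_field(sigma):.3e} N/C on each side")
print(f"conductor: {inside:.3e} N/C inside, {outside:.3e} N/C outside")
```

The outside value is twice the sheet value, and the inside value is zero, which is all the argument claims.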

I know you’ve switched off already but − just in case you didn’t − what equality should we use to find σ in this case, i.e. the grounded sheet, which we did not charge ourselves but which carries some (negative) induced surface charge density? Well… We’re talking a surface density, and a conductor, and, therefore, I would think it’s the E = σ/ε₀, i.e. the formula for a charged sheet on a conductor. So we write:

E = σ(ρ)/ε₀ ⇔ σ(ρ) = ε₀E

But what E do we take to continue our calculation? The whole field or (1/4πε₀)·a·q/(a² + ρ²)^3/2 only? The analysis above may make you think that we should take (1/4πε₀)·a·q/(a² + ρ²)^3/2 only, so that’s the component that’s related to the imaginary charge only, but… No! We’re talking one actual field here, which is produced by the positive charge as well as by the induced charges. So we should not cut it for the purpose of calculating σ(ρ)! So the grand result is:

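σ(ρ) = −ε₀E = −(1/4π)·2a·q/(a² + ρ²)^3/2

[Note the minus sign: E stands for the magnitude of the field here, and the field at the surface points towards the sheet, so the induced charge density is negative.]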

The shape of this function should not surprise us: it’s shown below for some different values of q (1 and 2 respectively) and a (1, 2 and 3 respectively).

How do we know our solution is correct? We can check it: if we integrate σ over the whole surface, we should find that the total induced charge is equal to −q. So… Well… I’ll let you do that. Feynman also notes the induced charges should exert a force on our point charge, which we can calculate by integrating the force between the surface charges and the charge. It’s again an integral, and it should be equal to:

F = (1/4πε₀)·q²/(2a)²

Lo and behold! The force acting on the positive charge is exactly the same as it would be with the negative image charge instead of the plate. Why? Well… Because the fields are the same!
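If you don’t feel like doing those two integrals by hand, here is a minimal numerical sketch of them. The values of q and a are made up, the function names are mine, and I take σ(ρ) with a minus sign, as the induced charge should be negative for a positive q. The sheet is cut into rings of radius ρ, and the substitution ρ = a·tan(t) is just a convenient way (my choice, not Feynman’s) to bring the infinite range back to a finite one.

```python
import math

EPS0 = 8.8541878128e-12           # vacuum permittivity, in F/m
K = 1.0 / (4.0 * math.pi * EPS0)  # Coulomb's constant 1/(4*pi*eps0)


def sigma(rho, q, a):
    """Induced surface charge density at distance rho from the point on the
    sheet right below the charge (negative for positive q)."""
    return -(1.0 / (4.0 * math.pi)) * 2.0 * a * q / (a * a + rho * rho) ** 1.5


def integrate_over_sheet(f, a, n=20000):
    """Integrate f(rho) * 2*pi*rho d(rho) from 0 to infinity.

    Uses the substitution rho = a*tan(t), with t from 0 to pi/2, and the
    midpoint rule, so the infinite range becomes a finite one.
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        rho = a * math.tan(t)
        drho_dt = a / math.cos(t) ** 2
        total += f(rho) * 2.0 * math.pi * rho * drho_dt * h
    return total


q, a = 1.0e-9, 0.05  # hypothetical values: 1 nC at 5 cm from the sheet

total_charge = integrate_over_sheet(lambda rho: sigma(rho, q, a), a)

# Normal component of the Coulomb force of a ring of induced charge on q:
# K*q*dQ/(a^2 + rho^2), times cos(theta) = a/sqrt(a^2 + rho^2).
force = integrate_over_sheet(
    lambda rho: K * q * sigma(rho, q, a) * a / (a * a + rho * rho) ** 1.5, a)

image_force = -K * q * q / (2.0 * a) ** 2  # attraction towards the sheet

print(total_charge / q)     # close to -1
print(force / image_force)  # close to 1
```

So the induced charge adds up to −q, and the force is the one we would get from the image charge at a distance 2a.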

The results we obtained are quite wonderful! Indeed, we said we did not know the charge distribution, and so we used a very different method to find the field: the method of images, which consists of computing the field due to q and some imaginary point charge –q somewhere else. Feynman summarizes the method of images as follows:

“The point charge we “imagine” existing behind the conducting surface is called an image charge. In books you can find long lists of solutions for hyperbolic-shaped conductors and other complicated looking things, and you wonder how anyone ever solved these terrible shapes. They were solved backwards! Someone solved a simple problem with given charges. He then saw that some equipotential surface showed up in a new shape, and he wrote a paper in which he pointed out that the field outside that particular shape can be described in a certain way.”

However, as you can see, the method is actually quite powerful, because we got a substantial bonus here: we calculated the field indeed, but then we could also calculate the charge distribution afterwards, so we got it all! Let’s see if we master the topic by looking at some other applications of the method of images.

Point charges near conducting spheres

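For a grounded conducting sphere of radius a, we get the result shown below: a point charge q at a distance b from the center of the sphere will induce charges on it whose fields are those of an image charge q’ = −aq/b placed at the point shown below, i.e. at a distance a²/b from the center.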

You can check the details in Feynman’s Lecture on it, in which you will also find a more general formula for spheres that are not at zero potential. The more general formula involves a third charge q” at the center of the sphere. For a sphere that carries no net charge, it is equal to q” = −q’ = aq/b.
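Here is a quick numerical check of the image construction for the sphere. It is a sketch under my own assumptions: a is the radius of the sphere, b the distance of the charge q from its center, the image sits at a distance a²/b from the center (that is what Feynman’s Lecture gives), the numbers are made up, and the 1/4πε₀ factor is set to one.

```python
import math


def potential(point, charges):
    """Potential at `point` from point charges, with 1/(4*pi*eps0) set to 1."""
    return sum(q / math.dist(point, pos) for q, pos in charges)


a, b, q = 1.0, 2.5, 1.0  # hypothetical sphere radius, distance and charge
q_image = -a * q / b     # the image charge q'
charges = [(q, (b, 0.0, 0.0)),                # the real charge
           (q_image, (a * a / b, 0.0, 0.0))]  # its image, inside the sphere

# Points on the sphere: a circle through the axis is enough, because of the
# rotational symmetry around that axis.
circle = [(a * math.cos(math.radians(i)), a * math.sin(math.radians(i)), 0.0)
          for i in range(361)]

worst = max(abs(potential(p, charges)) for p in circle)
print(worst)  # zero up to rounding error: the sphere is at zero potential

# For an insulated sphere with no net charge, add q'' = -q' at the center:
# the surface is still an equipotential, but no longer at zero.
neutral = charges + [(-q_image, (0.0, 0.0, 0.0))]
values = [potential(p, neutral) for p in circle]
print(min(values), max(values))  # both close to -q_image / a
```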

Again, we’ll have a force of attraction between the sphere and the point charge, even if the sphere is insulated and its net charge is zero. Indeed, the positive charge q attracts negative charges to the side closer to itself and, hence, leaves positive charges on the surface of the far side. As the attraction by the negative charges exceeds the repulsion from the positive charges, we end up with some net attraction. Feynman leaves us with an interesting challenge here:

“Those who were entertained in childhood by the baking powder box which has on its label a picture of a baking powder box which has on its label a picture of a baking powder box which has … may be interested in the following problem. Two equal spheres, one with a total charge of +Q and the other with a total charge of −Q, are placed at some distance from each other. What is the force between them? The problem can be solved with an infinite number of images. One first approximates each sphere by a charge at its center. These charges will have image charges in the other sphere. The image charges will have images, etc., etc., etc. The solution is like the picture on the box of baking powder—and it converges pretty fast.”

Well… I’ll leave it to you to take up that challenge. 🙂

Direct and indirect methods

Let me end this post by noting that I started out with that Poisson equation, but that I actually didn’t use it. Having said that, this method of images did result in some solutions for it. It is what Feynman calls an indirect method of solving some problems, and he writes the following on it:

“If the problem to be solved does not belong to the class of problems for which we can construct solutions by the indirect method, we are forced to solve the problem by a more direct method. The mathematical problem of the direct method is the solution of Laplace’s equation ∇²Φ = 0 subject to the condition that Φ is a suitable constant on certain boundaries—the surfaces of the conductors. [Note that Laplace’s equation is Poisson’s equation with a zero on the right-hand side.] Problems which involve the solution of a differential field equation subject to certain boundary conditions are called boundary-value problems. They have been the object of considerable mathematical study. In the case of conductors having complicated shapes, there are no general analytical methods. Even such a simple problem as that of a charged cylindrical metal can closed at both ends—a beer can—presents formidable mathematical difficulties. It can be solved only approximately, using numerical methods. The only general methods of solution are numerical.”

Well… That says it all, I guess. There are other indirect methods, i.e. other than the method of images, but I won’t present these here. I may write something about it in some other post, perhaps. 🙂
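Just to give an idea of what such a numerical method looks like, here is a minimal sketch of a relaxation method for Laplace’s equation in two dimensions. The technique (Gauss-Seidel relaxation on a square grid) and the boundary values are my own choice for the purpose of illustration; they are not taken from Feynman’s Lecture.

```python
def relax_laplace(n=21, tol=1e-10, max_sweeps=20000):
    """Solve Laplace's equation on an n-by-n grid by Gauss-Seidel relaxation.

    Boundary condition (an arbitrary choice for this sketch): the top edge
    is held at potential 1, the three other edges at potential 0.
    """
    phi = [[0.0] * n for _ in range(n)]
    for j in range(n):
        phi[0][j] = 1.0  # top edge
    for _ in range(max_sweeps):
        biggest_change = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Discrete Laplace equation: each interior value is the
                # average of its four neighbours.
                new = 0.25 * (phi[i - 1][j] + phi[i + 1][j]
                              + phi[i][j - 1] + phi[i][j + 1])
                biggest_change = max(biggest_change, abs(new - phi[i][j]))
                phi[i][j] = new
        if biggest_change < tol:
            break
    return phi


phi = relax_laplace()
center = phi[10][10]
print(center)  # close to 0.25
```

The value at the center is a nice sanity check: if we add up the four versions of this problem (one for each edge held at potential 1), we get a box whose whole boundary is at potential 1, so the potential is 1 everywhere inside, and by symmetry each version contributes one quarter at the center.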

The electric field in various circumstances

Original post:

This post summarizes two of what may well be Feynman’s most tedious Lectures. Their title is the same: the electric field “in various circumstances.” At first, I wanted to skip them, but then I found some unifying principle: the fields involved are all quite simple. In fact, except in chapter seven, it’s only about (a) the field of a single charge and (b) the field of a so-called dipole, i.e. the field of two opposite charges next to each other. Both are depicted below, and the dipole field can actually be derived by adding the fields of the two single charges.

So… In a way, these two Lectures are just a bunch of formulas repeating the same thing over and over again. The thing to remember is that a complicated but neutral mess of charges will also create a dipole field and, if that mess would not be neutral as a whole, then the field of our lump of charge will look like that of a point charge, provided we look at it from a large enough distance (i.e. a distance that is large relative to the separation of the elementary charges involved). So the situation we’re looking at, is the one depicted below, which is really quite general.

Before going into the nitty-gritty, it is probably good to review one of the points I made in my previous post: the field inside of a spherical shell of charge (like the one below) is zero everywhere, i.e. for any point P inside the shell.

This has nothing to do with the phenomenon of shielding, which is a consequence of free electrons re-arranging themselves so as to cancel the field inside. If we’d be able to build the cage below from protons only, so we’d have a fixed distribution of charges, the inside would not be shielded from the external electrical field. [Credit for the animation must go to Wikipedia.]

Because of the symmetry of the situation, however, the field at the center of a rectangular, fixed and uniform distribution of charges would also be zero, although not at the other points inside: the exact cancellation everywhere inside is special to the sphere. Let me quickly go over the math for the example of the spherical shell. The randomly chosen point P defines small cones extending to the surface of the sphere, with their apex at P and cutting out some surface area Δa. In the illustration above, we have two symmetrical cones defining two surfaces Δa₁ and Δa₂ respectively. It is easy to see that:

Δa₂/Δa₁ = r₂²/r₁²

Note that r₂²/r₁² is equal to (r₂/r₁)² but that (r₂/r₁)² is not equal to r₂/r₁. The square matters, and the square of a ratio is different from the ratio itself! In fact, it’s because of the inverse square law that the fields cancel exactly. Indeed, if the surface of the sphere is uniformly charged (which is the key assumption here), then the charge Δq on each of the area elements will be proportional to the area, so Δq₂/Δq₁ = Δa₂/Δa₁. Now, Coulomb’s Law also says that the magnitudes of the fields produced at P by these two surface elements are in the ratio of:

Huh? Yes. E₂/E₁ = (Δa₂/Δa₁)·(r₁²/r₂²) = (Δa₂/Δa₁)·(Δa₁/Δa₂) = 1, according to the above. So… Yes, the fields cancel exactly, and because all parts of the surface can be paired off in the same way, the total field at P is zero, indeed! But what if we’d put a charge with equal sign at the center? The shell exerts no force on it at all, at the center or anywhere else inside, so the charge is in neutral equilibrium only: nothing pushes it back when it drifts away from the center. Hence, this does not contradict Feynman’s statement that a charge in an electrostatic field in free space can only be in stable equilibrium if there are mechanical constraints, as illustrated below. I should add that the whole argument that follows has no relevance whatsoever for the quantum-mechanical model of an atom. But that’s a somewhat separate story which I’ll touch upon at the end of this post. Let me get back to the dipole problem.
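The pairing-off argument can also be checked by brute force. The sketch below cuts a uniformly charged shell into rings and adds up their fields at a point P at a distance z0 from the center. The ring decomposition, the units (1/4πε₀ set to one) and the numbers are my own assumptions for the purpose of illustration.

```python
import math


def shell_field_on_axis(z0, radius=1.0, charge=1.0, n=20000):
    """Field at distance z0 from the center of a uniformly charged shell.

    Units with 1/(4*pi*eps0) = 1. The shell is cut into rings at polar angle
    theta; by symmetry only the component along the line from the center to
    P survives, so we sum that component with the midpoint rule.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h
        # Charge on the ring between theta and theta + d(theta).
        dq = 0.5 * charge * math.sin(theta) * h
        # Axial separation and distance between the ring and the point P.
        dz = z0 - radius * math.cos(theta)
        dist = math.sqrt(radius ** 2 + z0 ** 2
                         - 2.0 * radius * z0 * math.cos(theta))
        total += dq * dz / dist ** 3
    return total


print(shell_field_on_axis(0.5))  # inside: close to 0
print(shell_field_on_axis(2.0))  # outside: close to 1/2.0**2 = 0.25
```

Inside, the contributions cancel. Outside, the shell acts like a point charge at its center.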

Dipole fields

The model of a dipole is illustrated below. We have two opposite charges separated by a distance d. The so-called dipole moment is defined as p = q·d, and we also have an associated vector p, whose magnitude is p (so that’s the product of q and d) and whose direction is that of the dipole axis from −q to +q. We could also define a vector d and write p as p = q·d. Just think about it. I am sure you’ll figure it out. 🙂

Now, Feynman derives the formula for the dipole potential in various ways—first in an easy way, and then in a not-so-easy way. 🙂 The not-so-easy way is the most interesting—in this case, that is! He first notes the general formula for the potential of some point charge q at the origin at some point P = (x, y, z). You’ve seen that before: it’s Φ₀= q/r. [Forget about the constant of proportionality (I mean that 1/4πε₀ factor in Coulomb’s Law) for a while. We can stick it back in at the end of the argument.] What it says, is that, while the field follows an inverse square law, the potential has a 1/r dependence only (so when you double the distance, you halve the potential). Now, if we’d move the charge q along the z-axis, up a distance Δz, then the potential at P will change a little, by, say ΔΦ₊. How much exactly? Well, Feynman notes that “it is just the amount that the potential would change if we were to leave the charge at the origin and move P downward by the same distance Δz.” His illustration below, and the associated formula below, speak for themselves:

Now I’ll refer you to Feynman himself for the detail of the whole argument. The bottom line is that he gets the following formula for the dipole potential:

Φ = −p·∇φ₀

We have a vector dot product here of that dipole vector we defined above (p) and the gradient of φ₀, which is the potential of a unit point charge: φ₀ = 1/4πε₀r. So what? Well… We can re-write this as:

Φ = −(1/4πε₀)p·∇(1/r)

Isn’t that great? For point charges, we have a field that’s the gradient of a potential that has a 1/r dependence, but so… Well… Here we have the potential of a dipole that’s the gradient of… Well… Just a number that has a 1/r dependence. 🙂

It explains why the dipole field E = −∇Φ varies inversely not as the square but as the cube of the distance from a dipole. I could give you the formula for E but, again, I don’t want to copy all of Feynman here and so I’ll just assume you believe me. Let me just wrap up in this section with the graph of the electric field, and note how the field vector E can be analyzed as the sum of a transverse component (i.e. the component in the x-y plane) and its component along the dipole axis (i.e. the component along the z-axis).
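If you don’t want to just believe me, here is a small numerical sketch. It compares the exact potential of two opposite charges with the dipole formula, using the fact that −p·∇(1/r) works out as p·r/r³, and it checks the inverse cube law on the dipole axis. The charge, the separation and the sample points are made up, and the 1/4πε₀ factor is set to one.

```python
import math


def exact_potential(x, y, z, q=1.0, d=1.0):
    """Potential of +q at z = +d/2 and -q at z = -d/2 (1/(4*pi*eps0) = 1)."""
    r_plus = math.sqrt(x * x + y * y + (z - d / 2) ** 2)
    r_minus = math.sqrt(x * x + y * y + (z + d / 2) ** 2)
    return q / r_plus - q / r_minus


def dipole_potential(x, y, z, p=1.0):
    """-p . grad(1/r) for a dipole p along the z-axis, which is p*z/r^3."""
    r = math.sqrt(x * x + y * y + z * z)
    return p * z / r ** 3


def field_on_axis(z, q=1.0, d=1.0):
    """Exact field strength on the dipole axis, from the two Coulomb fields."""
    return q / (z - d / 2) ** 2 - q / (z + d / 2) ** 2


point = (30.0, 40.0, 50.0)  # far away compared with d = 1
print(exact_potential(*point), dipole_potential(*point))
print(field_on_axis(100.0) / field_on_axis(200.0))  # close to 2**3 = 8
```

Doubling the distance divides the field by (almost exactly) eight, not four.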

The dipole field of a lump of charges

The only thing that’s left is to define the p vector for a lump (or a mess as Feynman calls it) of charges. Note that the lump should be neutral: if it isn’t, then it will look like a point charge from a distance. But if it’s neutral, then its field will be a dipole field. So the same formula applies but p is defined as p = ∑q_i·d_i. I copy the illustration above below so you can see what is what. 🙂
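As an illustration, the sketch below takes a small lump of three charges that add up to zero, computes its p vector as the sum of the q_i·d_i products, and compares the exact potential at a far-away point with the dipole formula p·r/r³. The charges, their positions and the far-away point are all made up, and the 1/4πε₀ factor is set to one.

```python
import math

# A hypothetical neutral lump: (charge, position) pairs, charges sum to zero.
lump = [(+1.0, (0.1, 0.0, 0.2)),
        (-2.0, (0.0, 0.1, -0.1)),
        (+1.0, (-0.2, 0.1, 0.0))]


def dipole_moment(charges):
    """p = sum of q_i * d_i, component by component."""
    return tuple(sum(q * pos[k] for q, pos in charges) for k in range(3))


def exact_potential(point, charges):
    """Sum of the Coulomb potentials q_i / r_i, with 1/(4*pi*eps0) = 1."""
    return sum(q / math.dist(point, pos) for q, pos in charges)


def dipole_potential(point, p):
    """Dipole potential p . r / r^3, with 1/(4*pi*eps0) = 1."""
    r = math.sqrt(sum(c * c for c in point))
    return sum(pk * ck for pk, ck in zip(p, point)) / r ** 3


p = dipole_moment(lump)
far_point = (30.0, -20.0, 40.0)
print(p)
print(exact_potential(far_point, lump), dipole_potential(far_point, p))
```

The two potentials agree to within a few percent at this distance, and the agreement gets better as we move further away.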

So… Is that it? Well… Yes. And… Well… No. All of the above assumes we know the charge distribution from the start. If we do, then my little summary above pretty much covers the whole subject. 🙂 However, we’ll often be talking some conductor with some total charge Q, without being able to say where the charges are, exactly. All that we know is that they will be spread out on the surface in some way.

Now… Well… That’s not quite exact. We also know they will distribute themselves so that the potential of the surface is constant, and that helps us solve some practical problems at least. What problems? Well… The problem of finding the field of charged conductors, which is the second topic that Feynman deals with in his two Lectures on the field “in various circumstances.”

However, that story risks becoming as tedious as Feynman’s Lectures on it, and so I’d rather not copy him here. Just look at the following illustrations. The first one gives the field lines and equipotentials for two point charges once again. It highlights two equipotentials in particular: A and B. Now look at the second illustration: we have a curved conductor with a given potential near a point charge and – lo and behold! – the field looks the same: we replace A by the surface of our conductor and all the rest vanishes. In fact, the illustration shows we could just put an imaginary point charge q at a suitable point and get the same field.

Now that’s what’s referred to as the method of images, and it’s illustrated in the third graph, where we have an “image charge” indeed. We see the equipotential halfway between the two charges which, in this case, is a grounded conducting sheet. Why grounded? Because the plane had zero potential in our dipole field, as it was halfway between the two charges indeed.
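A few lines of Python make the point about the plane halfway between the two charges. It is a sketch with made-up numbers, and with the 1/4πε₀ factor set to one: the real charge sits at a distance a above the plane z = 0, and the image charge at the same distance below it.

```python
import math


def potential(point, charges):
    """Sum of q/r over point charges, with 1/(4*pi*eps0) set to 1."""
    return sum(q / math.dist(point, pos) for q, pos in charges)


a, q = 1.0, 1.0  # hypothetical distance to the sheet and charge
pair = [(+q, (0.0, 0.0, +a)),  # the real charge
        (-q, (0.0, 0.0, -a))]  # the image charge behind the sheet

# The plane z = 0 is halfway between the two charges: check a few points.
samples = [(0.0, 0.0), (0.5, -2.0), (3.0, 4.0), (-10.0, 7.5)]
on_plane = [potential((x, y, 0.0), pair) for x, y in samples]
print(on_plane)

# Away from the plane, on the side of the real charge, it is not zero.
print(potential((0.5, -2.0, 0.3), pair))
```

Every point of the plane is at the same distance from the two charges, so the two contributions cancel there, and that is why a grounded sheet can take the place of the image charge.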

Capito? No?

Well… It doesn’t matter all that much. This is, indeed, the really boring stuff one just has to grind through in order to understand the next thing, which is hopefully somewhat more exciting.

Quadrupole fields

Because you’re interested in physics, you probably know a thing or two about those quadrupole magnets used to focus particle beams in accelerators. They’re also referred to as lenses. The illustration below shows a quadrupole electric field, but a quadrupole magnetic field looks the same.

The point is: these lenses focus in one direction only and, hence, in an actual accelerator or cyclotron, the Q-magnets will be arranged so as to alternately focus horizontally and vertically. Why can’t we build magnets so as to focus charged particles simultaneously in two directions?

Well… It would require a tube built of protons, or electrons, in a stable configuration. We can’t do that. Technology just isn’t ready for it: we’re not able to build stable tubes of protons, or of electrons. 🙂 So the so-called Theorem of Earnshaw is still valid. Earnshaw’s Theorem says just that: focusing in two directions at once is impossible. It applies to classical inverse-square law forces, such as the electric and gravitational force, but also the magnetic forces created by permanent magnets.

However, the theorem is subject to constraints, and these constraints can be exploited to create very interesting exceptions, like magnetic levitation. I warmly recommend the link. 🙂

The electric field in (and from) a conductor

Original post:

This is just a quick post to answer a question of my 16-year-old son, Vincent: why are we safe in a car when lightning strikes? What’s the Faraday cage effect really?

He wants to become an engineer, and so I told him what I knew: the electric charges reside at the surface of a conductor and, therefore, a fully-enclosed, all-metallic vehicle is safe. One should just not touch the interior metallic areas, surely not during the strike, but also not after the strike. Why? Because there may still be some residual charge left on the vehicle, even if the metal frame should direct all lightning currents to the ground.

Through the rubber of the tyres? Yes. In fact, it’s the rubber and other insulators that explain why some residual charge might be left. Indeed, the common assumption that, somehow, it’s the rubber that protects the occupants of a car (or that, somehow, rubber soles would insulate us in an electric storm and, hence, make us less likely to get hit) is ridiculous: it’s completely false, really! The following quote from the US National Weather Service is clear enough on that:

“While rubber is an electric insulator, it’s only effective to a certain point. The average lightning bolt carries about 30,000 amps of charge, has 100 million volts of electric potential, and is about 50,000°F. These amounts are several orders of magnitude higher than what humans use on a daily basis and can burn through any insulator—even the ceramic insulators on power lines! Besides, the lightning bolt may just have traveled many miles through the atmosphere, which is a good insulator. Half an inch (or less) of rubber will make no difference.”

So that’s what I told him—sort of. However, I felt my answer (which I tried to get across as I was driving the car, in fact) was superficial and incomplete. So…

Vincent, here’s the full answer! I promise, no integrals or complex numbers. At the same time, it will be not so easy as the physics you learned in school, because I want to teach you something new. 🙂 Just try it. What I want to explain to you is Gauss’ Law. If you manage to go through it, you’ll know all you need to know about electrostatics, and it will make your first undergrad year a lot easier. [Especially that vector equation, as I always felt my math teacher never told me what a vector really was: it’s something physical. :-)]

Forces and fields

You’ve surely seen Coulomb’s Law:

F = k_e·(q₁q₂)·(1/r²₁₂)

The k_e factor is Coulomb’s constant: it is just a constant of proportionality, so it’s there to make the units come out alright. Indeed, Coulomb’s formula is simple enough: it says that the force is directly proportional to the amount of charge and inversely proportional to the square of the distance. That’s all. However, the units in which we measure stuff are not necessarily compatible: we measure distance in meters, electric charge in coulomb, and force in newton. So, if we’d define the newton as the force between two charges of one coulomb separated by a distance of one meter, then we wouldn’t need to put that k_e factor there. But the newton has another definition: one newton is the force needed to accelerate 1 kg at a rate of 1 m/s per second.

Coulomb’s constant is usually written as k_e = 1/4πε₀ in more serious textbooks. Why? Well… You can read my note at the end of this post, but it doesn’t matter right now. It’s much more important to try to understand the vector form of Coulomb’s Law, which is written as:

F₁ = (1/4πε₀)·(q₁q₂/r²₁₂)·e₁₂ = −F₂

I used boldface to denote F₁ and F₂ because they are force vectors. Vectors are physical ‘quantities’ with a magnitude (denoted by F₁ and F₂, so no boldface here) and a direction. That direction is given by the unit vector e₁₂ in the equation: it’s a unit vector (so its length is one) from q₂ to q₁. Read again: from q₂ to q₁, not from q₁ to q₂. It’s important to get this one thing right, otherwise you’ll make a mess of the signs. Indeed, in the example below, q₁ and q₂ have the same sign (+) but their sign may differ (so we have a plus and a minus), and the formula above should still work. Check it yourself by doing the drawing for opposite charges.

In fact, my drawing above has a small mistake: F₂ is the same as F₁ but I forgot to put the minus sign: the force on q₂ is F₂ = –F₁. It’s the action = reaction principle, really.
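Vincent, if you prefer code to drawings, here is a minimal sketch of the vector form of Coulomb’s Law. The charges and positions are made up, and the function name is mine. The unit vector e₁₂ points from q₂ to q₁, so the sign of the product q₁q₂ takes care of the direction.

```python
import math

K_E = 8.9875517923e9  # Coulomb's constant 1/(4*pi*eps0), in N*m^2/C^2


def coulomb_force_on_1(q1, pos1, q2, pos2):
    """Force vector on q1 due to q2: k_e*q1*q2/r^2 along the unit vector
    e12, which points from q2 to q1."""
    r = math.dist(pos1, pos2)
    e12 = tuple((c1 - c2) / r for c1, c2 in zip(pos1, pos2))
    strength = K_E * q1 * q2 / r ** 2
    return tuple(strength * c for c in e12)


pos1, pos2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)  # hypothetical positions, in m

# Same sign: the force on q1 points away from q2 (repulsion).
f_same = coulomb_force_on_1(1e-6, pos1, 2e-6, pos2)
# Opposite signs: the force on q1 points towards q2 (attraction).
f_opposite = coulomb_force_on_1(1e-6, pos1, -2e-6, pos2)
# Action = reaction: the force on q2 is the opposite of the force on q1.
f_on_2 = coulomb_force_on_1(2e-6, pos2, 1e-6, pos1)

print(f_same, f_opposite, f_on_2)
```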

OK. That’s clear. Now you need to learn about the concept of a field: the field is the force per unit charge. So the field at q₁, or the field at point (1), is the force on q₁ divided by q₁. For example, if q₁ is three coulomb, we divide by three. More generally, we write:

E(1) = F₁/q₁ = (1/4πε₀)·(q₂/r²₁₂)·e₁₂

So now you know what the field vector E stands for: it is the force on a unit charge we would place in the field. To be clear, a unit charge is +1 unit. We can measure it in coulomb, or the proton charge, or the charge of a quark, or in whatever unit we want, but we’ve been using coulomb so far so let’s stick to that. Just in case you wonder: one coulomb is the charge of approximately 6.241×10¹⁸ protons, so… Yes. That’s quite a lot. 🙂
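A small sketch may help here. It shows that the force per unit charge does not depend on the test charge we use, and it also checks that number of protons in one coulomb. The source charge, the distance and the test charges are made up.

```python
E_CHARGE = 1.602176634e-19  # elementary (proton) charge, in coulomb
K_E = 8.9875517923e9        # Coulomb's constant, in N*m^2/C^2


def field_strength(q_source, r):
    """Field strength (in N/C) at distance r from a point charge q_source."""
    return K_E * q_source / r ** 2


q_source, r = 2e-6, 0.5  # hypothetical source charge (C) and distance (m)
E = field_strength(q_source, r)

# Whatever test charge we use, the force divided by the charge is the same.
ratios = []
for q_test in (1.0, 3.0, 1e-9):
    force = K_E * q_test * q_source / r ** 2
    ratios.append(force / q_test)
print(E, ratios)

print(1.0 / E_CHARGE)  # number of proton charges in one coulomb
```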

OK. Next thing.

Gauss’ Law

The field is real. We don’t have to put any charge there. The field is there, and it has energy. [There’s a formula for the energy, but I won’t bother you with that here, because we don’t need it.] The magnitude of the electric field, i.e. the field strength E = |E|, is measured in newton (N) per coulomb (C), so in N/C. In physics, we’ll multiply the field strength with a surface area so we get the so-called flux of the field, which is measured in (N/C)·m². The illustration below (which I took from Feynman’s Lectures) is just as good as any. In fact, we have several surfaces here: we have a closed surface S with several faces, including surface a and b, which are spherical surfaces. The other surfaces of this box are so-called radial faces. The E field coming out of the charge is like a flow, and so the flow going through face a is the same as the flow going through face b: the b face is larger, but the field strength is less.

It is easy to show that the net flux is zero: Coulomb’s Law tells us that the magnitude of E decreases as 1/r² while, from our geometry classes, we know that the surface area increases as r², so their product is the same. So, if the surface area of a is Δa, and the surface area of b is Δb, then E_a·Δa = E_b·Δb and so the net flux through the box is equal to E_b·Δb − E_a·Δa = 0. So the flux of E into face a is just cancelled by the flux out of face b. Needless to say, there is no flux through the radial surfaces. Why? Because the electric force is a radial force.

OK. Let’s look at a more complicated situation:

When calculating the flux through a surface, we need to take the component of E that is normal to the surface, so that’s E_n = E·n = |E|·|n|·cosθ = |E|·cosθ. I am sure you’ve seen that much in your math classes: n is the so-called normal vector, so its length is one and it’s perpendicular to the surface. In any case, the point is: the net flux through this closed surface will still be zero.

Now it’s time for the Big Move. Look at the volume enclosed by the surface S below: we can think of it as completely made up of infinitesimal truncated cones and, for each of these cones, the flux of E from one end of each conical segment will be equal and opposite to the flux from the other end. So the total net flux from the surface S is still zero!

So we have a very general result here:

The (net) flux out of a volume that has no charge(s) in it is zero, always!

You’ll say: so what? Well… It’s a most remarkable result, really. First, it’s not what you’d expect intuitively, and, second, we can now use a clever trick to calculate the flux out of a volume that has some charge(s) in it. Let’s be clever about it. Look at the surface S below: it’s got a point charge q in it. Now we imagine another surface S’ around it: we imagine a little sphere centered on the charge.

From Coulomb’s Law, we know that, if the radius of our little sphere is equal to r, then the field strength E, everywhere on its surface, is equal to:

E = (1/4πε₀)·q/r²

From your geometry class, you also know that the surface of a sphere is equal to 4πr², so the flux from the surface of our little sphere is just the product of the field and the surface, so we write:

Flux from the surface S’ = E·4πr² = (1/4πε₀)·(q/r²)·4πr² = q/ε₀

Now, the nice thing is that we can generalize this result for many charges, or for charge distributions, because we can simply add the fields for each of them: E = E₁ + E₂ + E₃ + … That gives us Gauss’ Law:

The flux from any closed surface S = Q_inside/ε₀

Q_inside is, obviously, the sum of the charges inside the volume enclosed by the surface.
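You can also check Gauss’ Law by brute force. The sketch below adds up the normal component of the field of a point charge over the six faces of a cube. The cube, the position of the charge and the grid are my own choices for the purpose of illustration, and I use units in which ε₀ is equal to one, so that the law predicts a flux of q if the charge is inside the cube, and zero if it is outside.

```python
import math


def flux_through_cube(charge_pos, q=1.0, half=1.0, n=120):
    """Flux of E through the cube [-half, half]^3, for a point charge q at
    charge_pos. Units with eps0 = 1, so that E = q*r_vec/(4*pi*r^3)."""
    h = 2.0 * half / n
    total = 0.0
    for axis in range(3):         # the faces are normal to x, y and z
        for sign in (-1.0, 1.0):  # two opposite faces per axis
            for i in range(n):
                for j in range(n):
                    # Midpoint of one small square on this face.
                    u = -half + (i + 0.5) * h
                    v = -half + (j + 0.5) * h
                    point = [u, v]
                    point.insert(axis, sign * half)
                    r_vec = [p - c for p, c in zip(point, charge_pos)]
                    r = math.sqrt(sum(c * c for c in r_vec))
                    # Normal component: the outward normal is +/- the
                    # unit vector along this axis.
                    e_normal = q * sign * r_vec[axis] / (4.0 * math.pi * r ** 3)
                    total += e_normal * h * h
    return total


flux_inside = flux_through_cube((0.2, 0.1, -0.3))
flux_outside = flux_through_cube((3.0, 0.5, 0.0))
print(flux_inside)   # charge inside the cube: close to 1
print(flux_outside)  # charge outside the cube: close to 0
```

Note that the charge inside is not at the center of the cube, so the field is very different on the six faces, but the total flux still comes out as q/ε₀.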

OK. That’s Gauss’ Law. Let’s go back to our car. 🙂

The field in (and from) a conductor

An electrical conductor is a solid that contains many free electrons. Free electrons can move freely around, but cannot leave the surface. When we charge a conductor, the electrons will move around until they have arranged themselves to produce a zero electric field everywhere inside the conductor. It’s the corollary of Gauss’ Law: the (net) flux out of a volume that has no charge(s) in it is zero, always! And so the electrons will arrange themselves in order to make sure that happens.

Think about the dynamics of the situation: as long as there’s some field inside, the charges will keep moving. Fortunately (especially if you’re in a car or a plane hit by lightning!), the re-arrangement happens in a fraction of a second. Hence, if we have some kind of shell, then the field everywhere inside of the shell will be zero, always. In addition, when we charge a conductor, the electrons will push each other away and try to spread as much as possible, so they will reside at the surface of the conductor. In fact, the excess charge of any conductor is, on the average, within one or two atomic layers of the surface only. The situation is illustrated below:

Let me sum up the main conclusions:

The electric field inside the conductor (E₁) is zero. In other words, if a cavity is completely enclosed by a conductor, no distribution of charges outside can ever produce any field inside. But no field is no force, so that’s how the shielding really works!
The electric field just outside the surface of a conductor (E₂) is normal to the surface. There can be no tangential component. If there were a tangential component, the electrons would move along the surface until it was gone.

To be fully complete, the formula for the field just outside the surface of the conductor is E = σ/ε₀, where σ is the local surface charge density. That local surface charge density can be quite high, of course, especially when lightning is involved—but it works! You’re safe in a car!

There’s one more point. You may think that you’ve seen a formula like E = σ/ε₀ before: it looks like the formula for the field from a charged sheet, which is easy to calculate from Gauss’ Law. Indeed, if we look at some imaginary rectangular box that cuts through the sheet, as shown below (it’s referred to as a Gaussian surface), then the flux through each of the two faces parallel to the sheet is, once again, the field E times the area A. Now, if the charge density (so the charge per unit area) is σ, then the total charge enclosed in the box is σA. So the total flux, adding both sides of the sheet, must be equal to E·A + E·A = σA/ε₀, from which we get: E = σ/2ε₀. So we have a field left and right, of half the strength only. For our conductor, we have the full E = σ/ε₀ field outside, and nothing inside. So how does it work really?

We only have a field outside the conductor – and, hence, no field inside – because all the other charges on the conductor will arrange themselves in such a way so as to produce, at a point P on the surface, a field that neutralizes the σ/2ε₀ field we’d expect on the inside, and doubles it on the outside. So we have ‘other charges’ here that come into play. The mechanics behind are similar to the mechanics behind the polarization phenomenon. If we have a negative charge density on the surface, we’ll have a positive charge density in the layer below. However, it’s quite complicated and, to analyze it properly, we’d need to analyze the electric properties of matter in more detail, which we won’t do here.

So… When everything is said and done, the phenomenon of ‘shielding’ is extremely complex indeed: it’s all about charges arranging themselves in patterns, and the result is truly remarkable: the fields on the two sides of a closed conducting shell are completely independent, i.e. zero on the inside, and E = σ/ε₀ on the outside, with σ the local surface charge density. And it also works the other way around: if we’d have some distribution of charges inside of a closed and grounded conductor, those charges would not produce any field outside. So shielding works both ways!

Some closing remarks

A car is not a sphere. Some surfaces may have points or sharp ends, like the object sketched below. Again, the charges will try to spread out as much as possible on the surface, and the tip of a sharp point is as far away as it is possible from most of the surface. Therefore, we should expect the surface density to be very high there. Now, a high charge density means a high field just outside. In fact, if the electric field is too great, air will break down, so we get a discharge. As Feynman explains it:

“Air will break down if the electric field is too great. What happens is that a loose charge (electron, or ion) somewhere in the air is accelerated by the field, and if the field is very great, the charge can pick up enough speed before it hits another atom to be able to knock an electron off that atom. As a result, more and more ions are produced. Their motion constitutes a discharge, or spark. If you want to charge an object to a high potential and not have it discharge itself by sparks in the air, you must be sure that the surface is smooth, so that there is no place where the field is abnormally large.”

It explains why lightning is attracted to pointy objects, so you should stay away from them.

What about planes and lightning? Well… There’s a nice article on that on the Scientific American website. Let me quote a paragraph that sort of sums up what actually happens:

“Although passengers and crew may see a flash and hear a loud noise if lightning strikes their plane, nothing serious should happen because of the careful lightning protection engineered into the aircraft and its sensitive components. Initially, the lightning will attach to an extremity such as the nose or wing tip. The airplane then flies through the lightning flash, which reattaches itself to the fuselage at other locations while the airplane is in the electric “circuit” between the cloud regions of opposite polarity. The current will travel through the conductive exterior skin and structures of the aircraft and exit off some other extremity, such as the tail. Pilots occasionally report temporary flickering of lights or short-lived interference with instruments.”

One more thing perhaps: isn’t it incredible that, even when lightning goes through a car or a plane, it’s only the surface that’s being affected? I mean… It’s fairly easy to see the equilibrium situation, which has the charges on the surface only. But what about the dynamics indeed? 30,000 amps, 100 million volts, and 25,000 to 30,000 degrees Celsius… As lightning strikes, that must go everywhere, no? Well… Yes and no. If there are pointy objects, lightning will effectively burn through them. For an example of the damage of lightning on the nose of an airplane, click this link. 🙂 But then… Well… Let me copy Feynman as he introduces the electric force:

“Consider a force like gravitation which varies predominantly inversely as the square of the distance, but which is about a billion-billion-billion-billion times stronger. And with another difference. There are two kinds of “matter,” which we can call positive and negative. Like kinds repel and unlike kinds attract—unlike gravity where there is only attraction. What would happen? A bunch of positives would repel with an enormous force and spread out in all directions. A bunch of negatives would do the same.”

So that’s what happens. The charges spread out, in a fraction of a second, all away from each other, and so they stay on the surface only, because that’s as far away as they can get from each other. As mentioned above, we’re talking atomic or molecular layers really, so they don’t penetrate, despite the incredible charges and voltages involved. Let me continue the quote—just to illustrate the strength of the forces involved:

“But an evenly mixed bunch of positives and negatives would do something completely different. The opposite pieces would be pulled together by the enormous attractions. The net result would be that the terrific forces would balance themselves out almost perfectly, by forming tight, fine mixtures of the positive and the negative, and between two separate bunches of such mixtures there would be practically no attraction or repulsion at all. […] There is such a force: the electrical force. And all matter is a mixture of positive protons and negative electrons which are attracting and repelling with this great force. So perfect is the balance, however, that when you stand near someone else you don’t feel any force at all. If there were even a little bit of unbalance you would know it. If you were standing at arm’s length from someone and each of you had one percent more electrons than protons, the repelling force would be incredible. How great? Enough to lift the Empire State Building? No! To lift Mount Everest? No! The repulsion would be enough to lift a “weight” equal to that of the entire earth!”
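Feynman's claim can be checked to within an order of magnitude. The sketch below is only a rough estimate under assumptions that are not in the quote: a body mass of 70 kg, about one proton (and one electron) for every two nucleons, an arm's length of 0.7 m, and the 'weight' of the earth taken as its mass times the ordinary g of 9.81 m/s².

```python
# Rough order-of-magnitude check of Feynman's "one percent more electrons" claim.
# All inputs below (70 kg, 0.7 m, one proton per two nucleons) are assumptions.

K_E = 8.9875517873681764e9   # Coulomb constant, N·m²/C²
E_CHARGE = 1.602176634e-19   # elementary charge, C
NUCLEON_MASS = 1.67e-27      # rough proton/neutron mass, kg


def excess_charge(body_mass_kg, excess_fraction):
    """Net charge (in C) of a body with `excess_fraction` more electrons than protons."""
    protons = 0.5 * body_mass_kg / NUCLEON_MASS  # assumed: half the nucleons are protons
    return excess_fraction * protons * E_CHARGE


def coulomb_force(q1, q2, distance_m):
    """Magnitude of the Coulomb force between two point charges, in newton."""
    return K_E * q1 * q2 / distance_m**2


q = excess_charge(70.0, 0.01)          # charge on each person
force = coulomb_force(q, q, 0.7)       # repulsion at arm's length
earth_weight = 5.97e24 * 9.81          # 'weight' of the earth under ordinary gravity

print(f"charge per person: {q:.2e} C")
print(f"repulsive force:   {force:.2e} N")
print(f"earth 'weight':    {earth_weight:.2e} N")
```

With these inputs the force and the earth's 'weight' come out within one order of magnitude of each other, which is all an estimate like this can hope to show.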

So… Well… That’s it. I’ll close this post with the promised note on Coulomb’s constant and the electric constant, but it’s just an addendum, so you don’t have to read it if you don’t feel like it, Vincent. 🙂

Addendum: Coulomb’s constant and the electric constant

The k_e = 1/4πε₀ factor in Coulomb’s Law is just a constant of proportionality. Coulomb’s formula is simple enough – it says that the force is directly proportional to the amount of charge and inversely proportional to the square of the distance – but it would be a miracle if the units came out alright, wouldn’t it? Indeed, we measure distance in meters, charge in coulombs, and force in newtons. Now, we could re-define one of those units so as to get rid of the 1/4πε₀ factor, but that’s not what we’re going to do. Why not? First, the constant of proportionality depends on the medium. Indeed, ε₀ is the so-called permittivity of the vacuum, so that’s in empty space. The constant of proportionality will be different in a gas, and it will be different for different gases and different temperatures and at different pressure. You can check it online if you want – just click the link here for some examples – but I guess you’ll believe me. So, if we write 1/4πε instead of k_e then we can put in a different ε for each medium and our formula is still OK.

Now, because you’re a smart kid, you’ll say that doesn’t quite answer the question: why do we write it as 1/4πε? Why don’t we simply write μ instead of 1/4πε, or just k or a or something? Well… There is an answer to that, but it’s complicated. First, the μ and μ₀ symbols are already used for something else: it’s something similar to ε and ε₀ but then for magnetic fields. To be precise, μ₀ is referred to as the permeability of the vacuum (and μ is just the permeability of some non-vacuum medium, of course). Now, because electricity and magnetism are part of one and the same phenomenon in Nature (when you study engineering, you’ll get one course on electromagnetism, not two separate ones), ε₀ and μ₀ are related. In fact, they’re related through a marvelous formula: a formula like E = mc² in physics or, in math, e^(iπ) + 1 = 0. Don’t try to understand it. Just look at it:

c²ε₀μ₀ = (cε₀)(cμ₀) = 1

Amazing, isn’t it? The c here is the speed of light in a vacuum, obviously. So it’s a physical constant. In other words, unlike ε₀ or μ₀, it’s got nothing to do with proportionality or units: the speed of light is the speed of light no matter what units we use—meters or light-seconds or whatever. OK. Just swallow this and don’t pay too much attention. It’s just a digression, but let me finish it.

The equivalent of Coulomb’s Law in magnetism is Ampère’s Law, and it involves the circulation of a field, as illustrated below. So that’s why Ampère’s Law involves a 2π factor.

In fact, because we’re talking two wires (or two conductors) with currents going through them (I₁ and I₂ respectively), the proportionality constant in Ampère’s Law is written as 2k_A.

Now, I won’t go too much into the detail but the thing about the circulation and that factor 2 in Ampère’s Law result in μ₀ being written as μ₀ = 4π×10⁻⁷ N/A². As for the units: N is newton and A is ampere obviously. And so that’s why we have the 4π in the proportionality constant for Coulomb’s Law as well. And, of course, the (cε₀)(cμ₀) = 1 equation makes it obvious that cε₀ and cμ₀ are reciprocal numbers, so that’s why we write 1/4πε₀ for the proportionality constant in Coulomb’s Law, rather than k_e or a or whatever other simple thing. […] Well… Sort of. In any case, nothing to worry about. 🙂
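For those who like to see the numbers: the short sketch below takes the μ₀ = 4π×10⁻⁷ N/A² value quoted above and the defined value of c, and derives ε₀ and Coulomb's constant from c²ε₀μ₀ = 1. The variable names are mine. Note that, since the 2019 redefinition of the SI units, μ₀ is a measured quantity that only agrees with 4π×10⁻⁷ to about ten digits, so take this as an illustration rather than a reference value.

```python
import math

C = 299_792_458.0            # speed of light in m/s (exact by definition)
MU_0 = 4 * math.pi * 1e-7    # permeability of the vacuum in N/A², value quoted in the text

# c²·ε₀·μ₀ = 1, so ε₀ follows from μ₀ and c.
EPS_0 = 1 / (MU_0 * C**2)

# Coulomb's constant is then 1/(4πε₀), which simplifies to 1e-7·c².
K_E = 1 / (4 * math.pi * EPS_0)

print(f"eps_0 = {EPS_0:.6e} F/m")
print(f"k_e   = {K_E:.6e} N·m²/C²")
print(f"(c·eps_0)·(c·mu_0) = {(C * EPS_0) * (C * MU_0):.12f}")
```

The 4π in μ₀ cancels against the 4π in 1/4πε₀, which is why k_e comes out as just 10⁻⁷·c² in these units.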

The Uncertainty Principle and the stability of atoms

Pre-script (dated 26 June 2020): This post did not suffer too much from the attack on this blog by the dark force. It remains relevant. 🙂

Original post:

The Model of the Atom

In one of my posts, I explained the quantum-mechanical model of an atom. Feynman sums it up as follows:

“The electrostatic forces pull the electron as close to the nucleus as possible, but the electron is compelled to stay spread out in space over a distance given by the Uncertainty Principle. If it were confined in too small a space, it would have a great uncertainty in momentum. But that means it would have a high expected energy—which it would use to escape from the electrical attraction. The net result is an electrical equilibrium not too different from the idea of Thompson—only is it the negative charge that is spread out, because the mass of the electron is so much smaller than the mass of the proton.”

This explanation is a bit sloppy, so we should add the following clarification: “The wave function Ψ(r) for an electron in an atom does not describe a smeared-out electron with a smooth charge density. The electron is either here, or there, or somewhere else, but wherever it is, it is a point charge.” (Feynman’s Lectures, Vol. III, p. 21-6)

The two quotes are not incompatible: it is just a matter of defining what we really mean by ‘spread out’. Feynman’s calculation of the Bohr radius of an atom in his introduction to quantum mechanics clears all confusion in this regard:

It is a nice argument. One may object that he gets the right thing out because he puts the right things in – such as the values of e and m, for example 🙂 − but it’s nice nevertheless!
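As I recall it from Feynman's introduction to quantum mechanics, the argument goes like this: an electron confined to a region of size a has a momentum of the order ħ/a, so its energy is roughly E(a) = ħ²/(2ma²) − e²/a, with e² shorthand for q²/4πε₀. Minimizing E gives a₀ = ħ²/(me²). The sketch below just evaluates that, assuming this is indeed the calculation referred to above. The constants are CODATA values and the function name is mine.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J·s
M_E = 9.1093837015e-31    # electron mass, kg
Q_E = 1.602176634e-19     # elementary charge, C
EPS_0 = 8.8541878128e-12  # permittivity of the vacuum, F/m

E2 = Q_E**2 / (4 * math.pi * EPS_0)  # Feynman's e², in J·m


def energy(a):
    """Kinetic (uncertainty) energy plus potential energy for a spread a, in joule."""
    return HBAR**2 / (2 * M_E * a**2) - E2 / a


# dE/da = 0 gives a0 = ħ²/(m·e²)
a0 = HBAR**2 / (M_E * E2)
e0_ev = energy(a0) / Q_E  # convert joule to electronvolt

print(f"a0 = {a0:.4e} m")
print(f"E0 = {e0_ev:.2f} eV")
```

The radius comes out at about 0.53 ångström and the energy at about −13.6 eV, the familiar ground-state values, which is exactly the 'right things in, right things out' point.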

Mass as a Scale Factor for Uncertainty

Having complimented Feynman, the calculation above does raise an obvious question: why is it that we cannot confine the electron in “too small a space” but that we can do so for the nucleus (which is just one proton in the example of the hydrogen atom here)? Feynman gives the answer above: because the mass of the electron is so much smaller than the mass of the proton.

Huh? What’s the mass got to do with it? The uncertainty is the same for protons and electrons, isn’t it?

Well… It is, and it isn’t. 🙂 The Uncertainty Principle – usually written in its more accurate σ_xσ_p ≥ ħ/2 expression – applies to both the electron and the proton – of course! – but the momentum p is the product of mass and velocity (p = m·v), and so it’s the proton’s mass that makes the difference here. To be specific, the mass of a proton is about 1836 times that of an electron. Now, as long as the velocities involved are non-relativistic, we can treat the m in the p = m·v identity as a constant and, hence, we can also write: Δp = Δ(m·v) = m·Δv. They are non-relativistic in this case: the (relative) speed of electrons in atoms is given by the fine-structure constant α = v/c ≈ 0.0073, so the Lorentz factor is very close to 1. So all of the uncertainty of the momentum goes into the uncertainty of the velocity. Hence, the mass acts like a reverse scale factor for the uncertainty. To appreciate what that means, let me write ΔxΔp = ħ as:

ΔxΔv = ħ/m

It is an interesting point, so let me expand the argument somewhat. We actually use a more general mathematical property of the standard deviation here: the standard deviation of a variable scales directly with the scale of the variable. Hence, we can write: σ(k·x) = k·σ(x), with k > 0. So the uncertainty is, indeed, smaller for larger masses. Larger masses are associated with smaller uncertainties in their position x. To be precise, the product ΔxΔv is inversely proportional to the mass and, hence, the mass effectively acts like a reverse scale factor for the uncertainty.
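To put numbers on the ħ/m scale factor, here is a minimal sketch that evaluates it for the electron and the proton, and checks the σ(k·x) = k·σ(x) property on an arbitrary sample. The sample values and the factor k are made up for illustration only.

```python
import math
import statistics

HBAR = 1.054571817e-34   # reduced Planck constant, J·s
M_E = 9.1093837015e-31   # electron mass, kg
M_P = 1.67262192369e-27  # proton mass, kg

# Right-hand side of Δx·Δv = ħ/m for both particles (units: m²/s)
bound_electron = HBAR / M_E
bound_proton = HBAR / M_P
ratio = bound_electron / bound_proton  # equals m_p/m_e

print(f"hbar/m_e = {bound_electron:.3e} m²/s")
print(f"hbar/m_p = {bound_proton:.3e} m²/s")
print(f"ratio    = {ratio:.2f}")

# Scaling property of the standard deviation, on a made-up sample
sample = [1.0, 2.0, 4.0, 7.0]
k = 3.0
scaled_std = statistics.pstdev([k * x for x in sample])
print(math.isclose(scaled_std, k * statistics.pstdev(sample)))
```

The ratio is just the proton-to-electron mass ratio, about 1836, so the bound on ΔxΔv is that much tighter for the proton.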

Of course, you’ll say that the uncertainty still applies to both factors on the left-hand side of the equation, and so you’ll wonder: why can’t we keep Δx the same and multiply Δv with m, so its product yields ħ again? In other words, why can’t we have an uncertainty in velocity for the proton that is 1836 times larger than the uncertainty in velocity for the electron? The answer to that question should be obvious: the uncertainty should not be greater than the expected value. When everything is said and done, we’re talking a distribution of some variable here (the velocity variable, to be precise) and, hence, that distribution is likely to be the Maxwell-Boltzmann distribution we introduced in previous posts. Its formula and graph are given below:

In statistics (and in probability theory), they call this a chi distribution with three degrees of freedom and a scale parameter which is equal to a = (kT/m)^(1/2). The formula for the scale parameter shows how the mass of a particle indeed acts as a reverse scale parameter. The graph above shows three curves for a = 1, 2 and 5 respectively. Note the square root though: quadrupling the mass (keeping kT the same) amounts to going from a = 2 to a = 1, so that’s halving a. Indeed, [kT/(4m)]^(1/2) = (1/2)·(kT/m)^(1/2). So we can’t just do what we want with Δv (like multiplying it with 1836, as suggested). In fact, the graph and the formulas show that Feynman’s assumption that we can equate p with Δp (i.e. his assumption that “the momenta must be of the order p = ħ/Δx, with Δx the spread in position”), more or less at least, is quite reasonable.

Of course, you are very smart and so you’ll have yet another objection: why can’t we associate a much higher momentum with the proton, as that would allow us to associate higher velocities with the proton? Good question. My answer to that is the following (and it might be original, as I didn’t find this anywhere else). When everything is said and done, we’re talking two particles in some box here: an electron and a proton. Hence, we should assume that the average kinetic energy of our electron and our proton is the same (if not, they would be exchanging kinetic energy until it’s more or less equal), so we write <m_electron·v²_electron/2> = <m_proton·v²_proton/2>. We can re-write this as m_p/m_e = 1836 = <v²_e>/<v²_p> and, therefore, <v²_e> = 1836·<v²_p>. Now, <v²> ≠ <v>² and, hence, <v> ≠ √<v²>. So the equality does not, by itself, imply that the expected velocity of the electron is √1836 ≈ 43 times the expected velocity of the proton. Indeed, because of the particularities of the distribution, there is a difference between (a) the most probable speed, which is equal to √2·a ≈ 1.414·a, (b) the root mean square speed, which is equal to √<v²> = √3·a ≈ 1.732·a, and, finally, (c) the mean or expected speed, which is equal to <v> = 2·(2/π)^(1/2)·a ≈ 1.596·a.
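The three characteristic speeds quoted above can be checked numerically. The sketch below assumes the standard Maxwell-Boltzmann speed density f(v) = √(2/π)·v²·exp(−v²/2a²)/a³ and integrates it with a simple midpoint rule. The function names, the grid sizes and the choice a = 1 are mine.

```python
import math


def mb_pdf(v, a):
    """Maxwell-Boltzmann speed density with scale parameter a."""
    return math.sqrt(2 / math.pi) * v**2 * math.exp(-v**2 / (2 * a**2)) / a**3


def moment(power, a, n=20000, v_max_factor=12.0):
    """Midpoint-rule estimate of <v**power> over [0, v_max_factor * a]."""
    dv = v_max_factor * a / n
    return sum(((i + 0.5) * dv) ** power * mb_pdf((i + 0.5) * dv, a) for i in range(n)) * dv


a = 1.0
norm = moment(0, a)             # should be 1: the density is normalized
mean = moment(1, a)             # expected speed <v>
rms = math.sqrt(moment(2, a))   # root mean square speed
# most probable speed: grid search for the maximum of the density
mode = max((i * 0.001 for i in range(1, 5000)), key=lambda v: mb_pdf(v, a))

print(f"mode = {mode:.3f}, mean = {mean:.3f}, rms = {rms:.3f}")
```

All three values are fixed multiples of a. So, under this assumed distribution and with a = (kT/m)^(1/2), each of them is larger for the electron than for the proton by the same factor √1836 ≈ 43.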

However, we are not far off. We could use any of these three values to roughly approximate Δv, as well as the scale parameter a itself: our answers would all be of the same order. However, to keep the calculations simple, let’s use the most probable speed. Let’s equate our electron mass with unity, so the mass of our proton is 1836. Now, such mass implies a scale factor (i.e. a) that’s √1836 ≈ 43 times smaller. So the most probable speed of the proton and, therefore, its spread, would be about √2/√1836 = √(2/1836) ≈ 0.033 that of the electron, so we write: Δv_p ≈ 0.033·Δv_e. Now we can insert this in our ΔxΔv = ħ/m = ħ/1836 identity. We get: Δx_p·Δv_p = Δx_p·√(2/1836)·Δv_e = ħ/1836. That, in turn, implies that √(2·1836)·Δx_p = ħ/Δv_e, which we can re-write as: Δx_p = Δx_e/√(2·1836) ≈ Δx_e/60. In other words, the expected spread in the position of the proton is about 60 times smaller than the expected spread of the electron. More in general, we can say that the spread in position of a particle, keeping all else equal, is inversely proportional to (2m)^(1/2). Indeed, in this case, we multiplied the mass with about 1800, and we found that the uncertainty in position went down with a factor 1/60 = 1/√3600. Not bad as a result! Is it precise? Well… It could be like √3·√m or 2·(2/π)^(1/2)·√m depending on our definition of ‘uncertainty’, but it’s all of the same order. So… Yes. Not bad at all… 🙂
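The arithmetic of that last step is easy to get lost in, so here is a small sketch that just redoes it with the electron mass set to 1 and ħ divided out. It only reproduces the numbers of the argument above and adds nothing to it. The variable names are mine.

```python
import math

mass_ratio = 1836.0  # proton mass in units of the electron mass

# Δv_p / Δv_e as used in the text: sqrt(2)/sqrt(1836)
speed_ratio = math.sqrt(2 / mass_ratio)

# From Δx_p·Δv_p = ħ/1836 and Δx_e·Δv_e = ħ:
#   Δx_p / Δx_e = 1 / (1836 · Δv_p/Δv_e)
position_ratio = 1 / (mass_ratio * speed_ratio)

print(f"dv_p/dv_e = {speed_ratio:.4f}")
print(f"dx_e/dx_p = {1 / position_ratio:.1f}")
print(f"sqrt(2*1836) = {math.sqrt(2 * mass_ratio):.1f}")
```

The last two printed numbers coincide, which is the Δx_p = Δx_e/√(2·1836) result.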

You’ll raise a third objection now: the radius of a proton is measured using the femtometer scale, so that’s expressed in 10⁻¹⁵ m, which is not 60 but a million times smaller than the nanometer (i.e. 10⁻⁹ m) scale used to express the Bohr radius as calculated by Feynman above. You’re right, but the 10⁻¹⁵ m number is the charge radius, not the uncertainty in position. Indeed, the so-called classical electron radius is also measured in femtometer (it is about 2.8 fm) and, hence, the Bohr radius (about 0.053 nm) is also many thousands of times that number: the ratio is 1/α², so that’s about 19,000. OK. That should settle the matter. I need to move on.

Before I do move on, let me relate the observation (i.e. the fact that the uncertainty in regard to position decreases as the mass of a particle increases) to another phenomenon. As you know, the interference of light beams is easy to observe. Hence, the interference of photons is easy to observe: Young’s experiment involved a slit of 0.85 mm (so almost 1 mm) only. In contrast, the 2012 double-slit experiment with electrons involved slits that were 62 nanometer wide, i.e. 62 billionths of a meter! That’s because the associated frequencies are so much higher and, hence, the wave zone is much smaller. So much, in fact, that Feynman could not imagine technology would ever be sufficiently advanced so as to actually carry out the double slit experiment with electrons. It’s an aspect of the same: the uncertainty in position is much smaller for electrons than it is for photons. Who knows: perhaps one day, we’ll be able to do the experiment with protons. 🙂 For further detail, I’ll refer you to one of my posts on this.

What’s Explained, and What’s Left Unexplained?

There is another obvious question: if the electron is still some point charge, and going around as it does, why doesn’t it radiate energy? Indeed, the Rutherford-Bohr model had to be discarded because this ‘planetary’ model involved circular (or elliptical) motion and, therefore, some acceleration. According to classical theory, the electron should thus emit electromagnetic radiation, as a result of which it would radiate its kinetic energy away and, therefore, spiral in toward the nucleus. The quantum-mechanical model doesn’t explain this either, does it?

I can’t answer this question as yet, as I still need to go through all of Feynman’s Lectures on quantum mechanics. You’re right. There’s something odd about the quantum-mechanical idea: it still involves an electron moving in some kind of orbital − although I hasten to add that the wavefunction is a complex-valued function, not some real function − but it does not involve any loss of kinetic energy due to circular motion apparently!

There are other unexplained questions as well. For example, the idea of an electrical point charge still needs to be reconciled with the mathematical inconsistencies it implies, as Feynman points out himself in yet another of his Lectures.

Finally, you’ll wonder as to the difference between a proton and a positron: if a positron and an electron annihilate each other in a flash, why do we have a hydrogen atom at all? Well… The proton is not the electron’s anti-particle. For starters, it’s made of quarks, while the positron is made of… Well… A positron is a positron: it’s elementary. But, yes, interesting question, and the ‘mechanics’ behind the mutual destruction are quite interesting and, hence, surely worth looking into—but not here. 🙂

Having mentioned a few things that remain unexplained, the model does have the advantage of solving plenty of other questions. It explains, for example, why the electron and the proton are actually right on top of each other, as they should be according to classical electrostatic theory, and why they are not at the same time: the electron is still a sort of ‘cloud’ indeed, with the proton at its center.

The quantum-mechanical ‘cloud’ model of the electron also explains why “the terrific electrical forces balance themselves out, almost perfectly, by forming tight, fine mixtures of the positive and the negative, so there is almost no attraction or repulsion at all between two separate bunches of such mixtures” (Richard Feynman, Introduction to Electromagnetism, p. 1-1) or, to quote from one of his other writings, why we do not fall through the floor as we walk:

“As we walk, our shoes with their masses of atoms push against the floor with its mass of atoms. In order to squash the atoms closer together, the electrons would be confined to a smaller space and, by the uncertainty principle, their momenta would have to be higher on the average, and that means high energy; the resistance to atomic compression is a quantum-mechanical effect and not a classical effect. Classically, we would expect that if we were to draw all the electrons and protons closer together, the energy would be reduced still further, and the best arrangement of positive and negative charges in classical physics is all on top of each other. This was well known in classical physics and was a puzzle because of the existence of the atom. Of course, the early scientists invented some ways out of the trouble—but never mind, we have the right way out, now!”

So that’s it, then. Except… Well…

The Fine-Structure Constant

When talking about the stability of atoms, one cannot escape a short discussion of the so-called fine-structure constant, denoted by α (alpha). I discussed it in another post of mine, so I’ll refer you there for a more comprehensive overview. I’ll just remind you of the basics:

(1) α is the square of the electron charge expressed in Planck units: α = e_P².

(2) α is the square root of the ratio of (a) the classical electron radius and (b) the Bohr radius: α = √(r_e/r). You’ll see this more often written as r_e = α²r. Also note that this is an equation that does not depend on the units, in contrast to equation 1 (above), and 4 and 5 (below), which require you to switch to Planck units. It’s the square root of a ratio and, hence, the units don’t matter. They fall away.

(3) α is the (relative) speed of an electron: α = v/c. [The relative speed is the speed as measured against the speed of light. Note that the ‘natural’ unit of speed in the Planck system of units is equal to c. Indeed, if you divide one Planck length by one Planck time unit, you get (1.616×10⁻³⁵ m)/(5.391×10⁻⁴⁴ s) ≈ 2.998×10⁸ m/s, i.e. c. However, this is another equation, just like (2), that does not depend on the units: we can express v and c in whatever unit we want, as long as we’re consistent and express both in the same units.]

(4) Finally, α is also equal to the product of (a) the electron mass (which I’ll simply write as m_e here) and (b) the classical electron radius r_e (if both are expressed in Planck units): α = m_e·r_e. [I think that’s, perhaps, the most amazing of all of the expressions for α. If you don’t think that’s amazing, I’d really suggest you stop trying to study physics.]

Note that, from (2) and (4), we also find that:

(5) The electron mass (in Planck units) is equal to m_e = α/r_e = α/(α²r) = 1/(αr). So that gives us an expression, using α once again, for the electron mass as a function of the Bohr radius r expressed in Planck units.

Finally, we can also substitute (1) in (5) to get:

(6) The electron mass (in Planck units) is equal to m_e = α/r_e = e_P²/r_e. Using the Bohr radius, we get m_e = 1/(αr) = 1/(e_P²r).
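The relations (1) to (5) can be verified numerically. The sketch below is only a check under assumptions: it uses the usual SI definitions α = q²/(4πε₀ħc), r_e = q²/(4πε₀mc²) and r = 4πε₀ħ²/(mq²), and the standard definitions of the Planck charge, mass and length. None of these formulas are spelled out in this post, and the variable names are mine.

```python
import math

C = 299_792_458.0         # speed of light, m/s
HBAR = 1.054571817e-34    # reduced Planck constant, J·s
G = 6.67430e-11           # gravitational constant, m³/(kg·s²)
EPS_0 = 8.8541878128e-12  # permittivity of the vacuum, F/m
Q_E = 1.602176634e-19     # elementary charge, C
M_E = 9.1093837015e-31    # electron mass, kg

k = 4 * math.pi * EPS_0
alpha = Q_E**2 / (k * HBAR * C)    # fine-structure constant
r_e = Q_E**2 / (k * M_E * C**2)    # classical electron radius, m
r_bohr = k * HBAR**2 / (M_E * Q_E**2)  # Bohr radius, m

planck_charge = math.sqrt(k * HBAR * C)
planck_mass = math.sqrt(HBAR * C / G)
planck_length = math.sqrt(HBAR * G / C**3)

rel_1 = (Q_E / planck_charge) ** 2                     # (1) α = e_P²
rel_2 = math.sqrt(r_e / r_bohr)                        # (2) α = √(r_e/r)
rel_4 = (M_E / planck_mass) * (r_e / planck_length)    # (4) α = m_e·r_e
rel_5 = 1 / (alpha * r_bohr / planck_length)           # (5) m_e = 1/(αr), Planck units

print(f"1/alpha = {1 / alpha:.3f}")
print(rel_1, rel_2, rel_4)
print(rel_5, M_E / planck_mass)
```

Relation (3) is the statement that the electron speed in the Bohr ground state is v = α·c, about 2.19×10⁶ m/s. Relation (6) is just (1) substituted in (5).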

In addition, in the mentioned post, I also related α to the so-called coupling constant determining the strength of the interaction between electrons and photons. So… What a magical number indeed! It suggests some unity that our little model of the atom above doesn’t quite capture. As far as I am concerned, it’s one of the many other ‘unexplained questions’, and one of my key objectives, as I struggle through Feynman’s Lectures, is to understand it all. 🙂 One of the issues is, of course, how to relate this coupling constant to the concept of a gauge, which I briefly discussed in my previous post. In short, I’ve still got a long way to go… 😦

Post Scriptum: The de Broglie relations and the Uncertainty Principle

My little exposé on mass being nothing but a scale factor in the Uncertainty Principle is a good occasion to reflect on the Uncertainty Principle once more. Indeed, what’s the uncertainty about, if it’s not about the mass? It’s about the position in space and velocity, i.e. it’s movement and time. Velocity or speed (i.e. the magnitude of the velocity vector) is, in turn, defined as the distance traveled divided by the time of travel, so the uncertainty is about time as well, as evidenced from the ΔEΔt = h expression of the Uncertainty Principle. But how does it work exactly?

Hmm… Not sure. Let me try to remember the context. We know that the de Broglie relation, λ = h/p, which associates a wavelength (λ) with the momentum (p) of a particle, is somewhat misleading, because we’re actually associating a (possibly infinite) bunch of component waves with a particle. So we’re talking some range of wavelengths (Δλ) and, hence, assuming all these component waves travel at the same speed, we’re also talking a frequency range (Δf). The bottom line is that we’ve got a wave packet and we need to distinguish the velocity of its phase (v_p) versus the group velocity (v_g), which corresponds to the classical velocity of our particle.

I think I explained that pretty well in one of my previous posts on the Uncertainty Principle, so I’d suggest you have a look there. The mentioned post explains how the Uncertainty Principle relates position (x) and momentum (p) as a Fourier pair, and it also explains that general mathematical property of Fourier pairs: the more ‘concentrated’ one distribution is, the more ‘spread out’ its Fourier transform will be. In other words, it is not possible to arbitrarily ‘concentrate’ both distributions, i.e. both the distribution of x (which I denoted as Ψ(x)) as well as its Fourier transform, i.e. the distribution of p (which I denoted by Φ(p)). So, if we’d ‘squeeze’ Ψ(x), then its Fourier transform Φ(p) will ‘stretch out’.
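The ‘squeeze one, stretch the other’ property is easy to see numerically. The sketch below is a toy illustration, not the derivation from the post referred to: it takes a Gaussian Ψ(x) = exp(−x²/4σ²), computes its Fourier transform Φ(k) by brute-force summation, and measures the spread of |Ψ|² and |Φ|². The wave number k stands in for momentum (p = ħk), and all names and grid sizes are mine.

```python
import cmath
import math


def weighted_std(points, weights):
    """Standard deviation of `points` under (unnormalized) `weights`."""
    total = sum(weights)
    mean = sum(p * w for p, w in zip(points, weights)) / total
    var = sum((p - mean) ** 2 * w for p, w in zip(points, weights)) / total
    return math.sqrt(var)


def grid(half_width, n):
    """n evenly spaced points on [-half_width, +half_width]."""
    step = 2 * half_width / (n - 1)
    return [-half_width + i * step for i in range(n)]


def spreads(sigma, n=201):
    """Return (spread in x, spread in k) for a Gaussian wave function of width sigma."""
    xs = grid(8 * sigma, n)
    psi = [math.exp(-x * x / (4 * sigma * sigma)) for x in xs]
    dx = xs[1] - xs[0]
    ks = grid(4 / sigma, n)
    # brute-force Fourier integral: Phi(k) = sum of Psi(x)·exp(-ikx)·dx
    phi = [sum(a * cmath.exp(-1j * k * x) for x, a in zip(xs, psi)) * dx for k in ks]
    sx = weighted_std(xs, [a * a for a in psi])
    sk = weighted_std(ks, [abs(f) ** 2 for f in phi])
    return sx, sk


for sigma in (0.5, 1.0, 2.0):
    sx, sk = spreads(sigma)
    print(f"sigma_x = {sx:.3f}, sigma_k = {sk:.3f}, product = {sx * sk:.3f}")
```

Whatever width is chosen, the product of the two spreads stays at 1/2, which for p = ħk is the σ_xσ_p = ħ/2 limit that a Gaussian attains.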

That was clear enough—I hope! But how do we go from ΔxΔp = h to ΔEΔt = h? Why are energy and time another Fourier pair? To answer that question, we need to clearly define what energy and what time we are talking about. The argument revolves around the second de Broglie relation: E = h·f. How do we go from the momentum p to the energy E? And how do we go from the wavelength λ to the frequency f?

The answer to the first question is the energy-mass equivalence: E = mc², always. This formula is relativistic, as m is the relativistic mass, so it includes the rest mass m₀ as well as the equivalent mass of its kinetic energy m₀v²/2 + … [Note, indeed, that the kinetic energy – defined as the excess energy over its rest energy – is a rapidly converging series of terms, so only the m₀v²/2 term is mentioned.] Likewise, momentum is defined as p = mv, always, with m the relativistic mass, i.e. m = (1 − v²/c²)^(−1/2)·m₀ = γ·m₀, with γ the Lorentz factor. The E = mc² and p = mv relations combined give us the E/c = m·c = p·c/v or E·v/c = p·c relationship, which we can also write as E/p = c²/v. However, we’ll need to write E as a function of p for the purpose of a derivation. You can verify that E² − p²c² = m₀²c⁴ and, hence, that E = (p²c² + m₀²c⁴)^(1/2).
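For those who want to do the verification, it only takes the definitions E = mc², p = mv and m = γ·m₀ given above. The second line spells out the series for the kinetic energy that is referred to in the bracketed note; the 3/8 coefficient comes from the standard expansion of γ in powers of v²/c².

```latex
\begin{aligned}
E^2 - p^2c^2 &= m^2c^4 - m^2v^2c^2 = m^2c^4\left(1 - \frac{v^2}{c^2}\right)
              = \gamma^2 m_0^2 c^4 \cdot \frac{1}{\gamma^2} = m_0^2c^4 \\
E - m_0c^2 &= (\gamma - 1)\,m_0c^2 = \tfrac{1}{2}\,m_0v^2 + \tfrac{3}{8}\,m_0\frac{v^4}{c^2} + \dots
\end{aligned}
```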

Now, to go from a wavelength to a frequency, we need the wave velocity, and we’re obviously talking the phase velocity here, so we write: v_p = λ·f. That’s where the de Broglie hypothesis comes in: de Broglie just assumed the Planck-Einstein relation E = h·ν, in which ν is the frequency of a massless photon, would also be valid for massive particles, so he wrote: E = h·f. It’s just a hypothesis, of course, but it makes everything come out alright. More in particular, the phase velocity v_p = λ·f can now be re-written, using both de Broglie relations (i.e. h/p = λ and E/h = f) as v_p = (h/p)·(E/h) = E/p = c²/v. Now, because v is always smaller than c for massive particles (and usually very much smaller), we’re talking a superluminal phase velocity here! However, because it doesn’t carry any signal, it’s not inconsistent with relativity theory.

Now what about the group velocity? To calculate the group velocity, we need the frequencies and wavelengths of the component waves. The dispersion relation assumes the frequency of each component wave can be expressed as a function of its wavelength, so f = f(λ). Now, it takes a bit of wave mechanics (which I won’t elaborate on here) to show that the group velocity is the derivative of f with respect to the inverse wavelength 1/λ (that’s the same as the derivative of the angular frequency ω = 2πf with respect to the wave number k = 2π/λ), so we write v_g = ∂f/∂(1/λ). Using the two de Broglie relations, we get: v_g = ∂f/∂(1/λ) = ∂(E/h)/∂(p/h) = ∂E/∂p = ∂[(p²c² + m₀²c⁴)^(1/2)]/∂p. Now, when you write it all out, you should find that v_g = ∂E/∂p = pc²/E = c²/v_p = v, so that’s the classical velocity of our particle once again.
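If you don't feel like writing it all out, a numerical check will do. The sketch below works in units in which c = 1 and m₀ = 1 (my choice, to keep the floating-point numbers well-behaved), picks an arbitrary v = 0.6c, and approximates ∂E/∂p with a central difference. The function names are mine.

```python
import math


def energy(p, m0=1.0, c=1.0):
    """Relativistic energy E = sqrt(p²c² + m0²c⁴)."""
    return math.sqrt(p * p * c * c + m0 * m0 * c**4)


def phase_velocity(p, m0=1.0, c=1.0):
    """v_p = λ·f = (h/p)·(E/h) = E/p."""
    return energy(p, m0, c) / p


def group_velocity(p, m0=1.0, c=1.0, dp=1e-6):
    """v_g = dE/dp, approximated by a central finite difference."""
    return (energy(p + dp, m0, c) - energy(p - dp, m0, c)) / (2 * dp)


v = 0.6                            # particle speed as a fraction of c (arbitrary)
gamma = 1 / math.sqrt(1 - v * v)   # Lorentz factor
p = gamma * v                      # momentum p = γ·m0·v with m0 = 1

v_p = phase_velocity(p)
v_g = group_velocity(p)

print(f"phase velocity = {v_p:.6f} c")
print(f"group velocity = {v_g:.6f} c")
print(f"product        = {v_p * v_g:.6f} c²")
```

The group velocity comes back as the 0.6c we started from, the phase velocity is larger than c, and their product is c².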

Phew! Complicated! Yes. But so we still don’t have our ΔEΔt = h expression! All of the above tells us how we can associate a range of momenta (Δp) with a range of wavelengths (Δλ) and, in turn, with a frequency range (Δf) which then gives us some energy range (ΔE), so the logic is like:

Δp ⇒ Δλ ⇒ Δf ⇒ ΔE

Somehow, the same sequence must also ‘transform’ our Δx into Δt. I googled a bit, but I couldn’t find any clear explanation. Feynman doesn’t seem to have one in his Lectures either so, frankly, I gave up. What I did do in one of my previous posts, is to give some interpretation. However, I am not quite sure if it’s really the interpretation: there are probably several ones. It must have something to do with the period of a wave, but I’ll let you break your head over it. 🙂 As far as I am concerned, it’s just one of the other unexplained questions I have as I sort of close my study of ‘classical’ physics. So I’ll just make a mental note of it. [Of course, please don’t hesitate to send me your answer, if you’d have one!] Now it’s time to really dig into quantum mechanics, so I should really stay silent for quite a while now! 🙂