# Traveling fields: the wave equation and its solutions

We’ve climbed a big mountain over the past few weeks, post by post, 🙂 slowly gaining height, and carefully checking out the various routes to the top. But we are there now: we finally fully understand how Maxwell’s equations actually work. Let me jot them down once more:
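For reference, here is a sketch of the four equations in the form Feynman uses (SI units, with charge density ρ, current density j and c² = 1/ε0μ0). The ordering is my assumption, so don't read too much into which one is called the 'first' or the 'last':

```latex
\begin{aligned}
\nabla\cdot\mathbf{E} &= \frac{\rho}{\varepsilon_0} \\
\nabla\times\mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\
\nabla\cdot\mathbf{B} &= 0 \\
c^2\,\nabla\times\mathbf{B} &= \frac{\mathbf{j}}{\varepsilon_0} + \frac{\partial \mathbf{E}}{\partial t}
\end{aligned}
```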

As for how real or unreal the E and B fields are, I gave you Feynman’s answer to it, so… Well… I can’t add to that. I should just note, or remind you, that we have a fully equivalent description of it all in terms of the electric and magnetic (vector) potential Φ and A, and so we can ask the same question about Φ and A. They explain real stuff, so they’re real in that sense. That’s what Feynman’s answer amounts to, and I am happy with it. 🙂

What I want to do here is show how we can get from those equations to some kind of wave equation: an equation that describes how a field actually travels through space. So… Well… Let’s first look at that very particular wave function we used in the previous post to prove that electromagnetic waves propagate with speed c, i.e. the speed of light. The fields were very simple: the electric field had a y-component only, and the magnetic field a z-component only. Their magnitudes, in the region the field had already reached as it filled the space traveling outwards, were given in terms of J, i.e. the surface current density going in the positive y-direction. The geometry of the situation is illustrated below.

The fields were, obviously, zero where the fields had not reached as they were traveling outwards. And, yes, I know that sounds stupid. But… Well… It’s just to make clear what we’re looking at here. 🙂

We also showed what the wave would look like if we turned off its First Cause after some time T, i.e. if the moving sheet of charge stopped moving after time T. We’d have the following pulse traveling through space, a rectangular shape really:

We can imagine more complicated shapes for the pulse, like the shape shown below. J goes from one unit to two units at time t = t1 and then to zero at t = t2. Now, the illustration on the right shows the electric field as a function of x at the time t shown by the arrow. We’ve seen this before when discussing waves: if the speed of travel of the wave is equal to c, then x is equal to x = c·t, and the pattern is as shown below indeed: it mirrors what happened at the source x/c seconds ago. So we write:

This idea of using the retarded time t’ = t − x/c in the argument of a wave function f – or, what amounts to the same, using x − ct – is key to understanding wave functions. I’ve explained this in very simple language in a post for my kids and, if you don’t get this, I recommend you check it out. What we’re doing, basically, is converting something expressed in time units into something expressed in distance units, or vice versa, using the velocity of the wave as the scale factor, so time and distance are both expressed in the same unit, which may be seconds or meters.

To see how it works, suppose we add some time Δt to the argument of our wave function f, so we’re looking at f[x−c(t+Δt)] now, instead of f(x−ct). Now, f[x−c(t+Δt)] = f(x−ct−cΔt), so we’ll get a different value for our function, obviously! But it’s easy to see that we can restore our wave function f to its former value by also adding some distance Δx = cΔt to the argument. Indeed, if we do so, we get f[x+Δx−c(t+Δt)] = f(x+cΔt−ct−cΔt) = f(x−ct). You’ll say: t − x/c is not the same as x−ct. It is and it isn’t: any function of x−ct is also a function of t − x/c, because we can write: F(x−ct) = F[−c(t − x/c)] = f(t − x/c).
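If you'd rather see numbers than symbols, here is a minimal sketch of that shift argument. The Gaussian `pulse` shape and the sample values of x, t and Δt are my own assumptions, chosen for illustration only; any shape would do.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def pulse(u):
    """Hypothetical pulse shape f(u): a Gaussian bump (illustration only)."""
    return math.exp(-u * u)


def wave(x, t):
    """A shape traveling in the positive x-direction: f(x - c*t)."""
    return pulse(x - C * t)


x, t = 2.0, 1.0e-9   # some point (m) and some time (s)
dt = 3.0e-9          # we wait a little longer...
dx = C * dt          # ...and move along with the wave

before = wave(x, t)
later_same_place = wave(x, t + dt)        # value has changed
later_shifted = wave(x + dx, t + dt)      # value is restored
```

So waiting Δt changes the value at a fixed point, but stepping Δx = cΔt along with the wave brings back the original value, up to rounding.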

Here, I need to add something about the direction of travel. The pulse above travels in the positive x-direction, so that’s why we have x minus ct in the argument. For a wave traveling in the negative x-direction, we’ll have a wave function y = F(x+ct). In any case, I can’t dwell on this, so let me move on.

Now, Maxwell’s equations in free or empty space, where there are no charges or currents to interact with, reduce to:
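Written out, and assuming the usual SI form with ρ = 0 and j = 0, the free-space set should look like this (a sketch, not a copy of the original figure):

```latex
\begin{aligned}
\nabla\cdot\mathbf{E} &= 0, & \nabla\times\mathbf{E} &= -\frac{\partial\mathbf{B}}{\partial t},\\
\nabla\cdot\mathbf{B} &= 0, & c^2\,\nabla\times\mathbf{B} &= \frac{\partial\mathbf{E}}{\partial t}.
\end{aligned}
```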

Now, how can we relate this set of complicated equations to a simple wave function? Let’s do the exercise for our simple Ey and Bz wave. Let’s start by writing out the first equation, i.e. ∇·E = 0, so we get: ∂Ex/∂x + ∂Ey/∂y + ∂Ez/∂z = 0.

Now, our wave does not vary in the y and z direction, so none of the components, including Ey and Ez, depend on y or z. It only varies in the x-direction, so ∂Ey/∂y and ∂Ez/∂z are zero. Note that the cross-derivatives ∂Ey/∂z and ∂Ez/∂y are also zero: we’re talking a plane wave here, the field varies only with x. However, because ∇·E = 0, ∂Ex/∂x must be zero and, hence, Ex must be zero.

Huh? What? How is that possible? You just said that our field does vary in the x-direction! And now you’re saying it doesn’t? Read carefully. I know it’s complicated business, but it all makes sense. Look at the function: we’re talking Ey, not Ex. Ey does vary as a function of x, but our field does not have an x-component, so Ex = 0. We have no cross-derivative ∂Ey/∂x in the divergence of E (i.e. in ∇·E = 0).

Huh? What? Let me put it differently. E has three components: Ex, Ey and Ez, and we have three space coordinates: x, y and z, so we have nine derivatives in all. What I am saying is that all derivatives with respect to y and z are zero. That still leaves us with three derivatives: ∂Ex/∂x, ∂Ey/∂x, and ∂Ez/∂x. So… Because all derivatives with respect to y and z are zero, and because of the ∇·E = 0 equation, we know that ∂Ex/∂x must be zero. So, to make a long story short, I did not say anything about ∂Ey/∂x or ∂Ez/∂x. These may still be whatever they want to be, and they may vary in more or in less complicated ways. I’ll give an example of that in a moment.

Having said that, I do agree that I was a bit quick in writing that, because ∂Ex/∂x = 0, Ex must be zero too. Looking at the math only, Ex is not necessarily zero: it might be some non-zero constant. So… Yes. That’s a mathematical possibility. The static field from some charged condenser plate would be an example of a constant Ex field. However, the point is that we’re not looking at such static fields here: we’re talking dynamics here, and we’re looking at a particular type of wave: we’re talking a so-called plane wave. Now, the wave front of a plane wave is… Well… A plane. 🙂 So Ex is zero indeed. It’s a general result for plane waves: the electric field of a plane wave will always be at right angles to the direction of propagation.

Hmm… I can feel your skepticism here. You’ll say I am arbitrarily restricting the field of analysis… Well… Yes. For the moment. It’s not an unreasonable restriction though. As I mentioned above, the field of a plane wave may still have both a y- and a z-component, each varying with x, as shown in the illustration below (for which the credit goes to Wikipedia), which visualizes the electric field of circularly polarized light. In any case, don’t worry too much about it. Let’s get back to the analysis. Just note we’re talking plane waves here. We’ll talk about non-plane waves i.e. incoherent light waves later. 🙂

So we have plane waves and, therefore, a so-called transverse E field which we can resolve in two components: Ey and Ez. However, we wanted to study a very simple Ey field only. Why? Remember the objective of this lesson: it’s just to show how we go from Maxwell’s equations to the wave function, and so let’s keep the analysis as simple as we can for now: we can make it more general later. In fact, if we do the analysis now for non-zero Ey and zero Ez, we can do a similar analysis for non-zero Ez and zero Ey, and the general solution is going to be some superposition of two such fields, so we’ll have a non-zero Ey and Ez. Capito? 🙂 So let me write out Maxwell’s second equation, and use the results we got above, so I’ll incorporate the zero values for the derivatives with respect to y and z, and also the assumption that Ez is zero. So we get:
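Here is my reconstruction of the three components of the curl, assuming only that Ex = Ez = 0 and that nothing depends on y or z. Each component is written out in full first, and then reduced:

```latex
\begin{aligned}
(\nabla\times\mathbf{E})_x &= \frac{\partial E_z}{\partial y} - \frac{\partial E_y}{\partial z} = 0,\\
(\nabla\times\mathbf{E})_y &= \frac{\partial E_x}{\partial z} - \frac{\partial E_z}{\partial x} = 0,\\
(\nabla\times\mathbf{E})_z &= \frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y} = \frac{\partial E_y}{\partial x}.
\end{aligned}
```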

[By the way: note that, out of the nine derivatives, the curl involves only the (six) cross-derivatives. That’s linked to the neat separation between the curl and the divergence operator. Math is great! :-)]

Now, because of the flux rule (∇×E = –∂B/∂t), we can (and should) equate the three components of ∇×E above with the three components of –∂B/∂t, so we get: ∂Bx/∂t = 0, ∂By/∂t = 0 and ∂Bz/∂t = −∂Ey/∂x.

[In case you wonder what it is that I am trying to do, patience, please! We’ll get where we want to get. Just hang in there and read on.] Now, ∂Bx/∂t = 0 and ∂By/∂t = 0 do not necessarily imply that Bx and By are zero: there might be some magnets and, hence, we may have some constant static field. However, that’s a matter of choosing a reference point or, more simply, assuming that empty space is effectively empty, and so we don’t have magnets lying around and so we assume that Bx and By are effectively zero. [Again, we can always throw more stuff in when our analysis is finished, but let’s keep it simple and stupid right now, especially because the Bx = By = 0 is entirely in line with the Ex = Ez = 0 assumption.]

The equations above tell us what we know already: the E and B fields are at right angles to each other. However, note, once again, that this is a more general result for all plane electromagnetic waves, so it’s not only that very special caterpillar or butterfly field that we’re looking at. [If you didn’t read my previous post, you won’t get the pun, but don’t worry about it. You need to understand the equations, not the silly jokes.]

OK. We’re almost there. Now we need Maxwell’s last equation. When we write it out, we get the following monstrous-looking set of equations:
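In case the picture doesn't show, here is how I'd write out the components of c²∇×B = ∂E/∂t in free space. It's a sketch using the standard component form of the curl, nothing more:

```latex
\begin{aligned}
c^2\left(\frac{\partial B_z}{\partial y} - \frac{\partial B_y}{\partial z}\right) &= \frac{\partial E_x}{\partial t},\\
c^2\left(\frac{\partial B_x}{\partial z} - \frac{\partial B_z}{\partial x}\right) &= \frac{\partial E_y}{\partial t},\\
c^2\left(\frac{\partial B_y}{\partial x} - \frac{\partial B_x}{\partial y}\right) &= \frac{\partial E_z}{\partial t}.
\end{aligned}
```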

However, because of all of the equations involving zeroes above 🙂 only ∂Bz/∂x is not equal to zero, so the whole set reduces to one simple equation only: −c²∂Bz/∂x = ∂Ey/∂t.

Simplifying assumptions are great, aren’t they? 🙂 Having said that, it’s easy to be confused. You should watch out for the denominators: a ∂x and a ∂t are two very different things. So we have two equations now involving first-order derivatives:

1. ∂Bz/∂t = −∂Ey/∂x
2. c²∂Bz/∂x = −∂Ey/∂t

So what? Patience, please! 🙂 Let’s differentiate the first equation with respect to x and the second with respect to t. Why? Because… Well… You’ll see. Don’t complain. It’s simple. Just do it. We get:

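1. ∂[∂Bz/∂t]/∂x = −∂²Ey/∂x²
2. ∂[c²∂Bz/∂x]/∂t = −∂²Ey/∂t²

The left-hand sides now both involve the same mixed derivative ∂²Bz/∂x∂t (the order of differentiation doesn’t matter), so we can eliminate it: multiplying the first equation by c² and equating gives c²∂²Ey/∂x² = ∂²Ey/∂t². What we get is a differential equation of the second order that we’ve encountered already, when we were studying wave equations. In fact, it is the wave equation for one-dimensional waves: ∂²Ey/∂x² − (1/c²)·∂²Ey/∂t² = 0.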
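For those who like to see every step, here is the elimination of Bz spelled out. I'm assuming only the two first-order equations listed above and that the mixed partial derivatives commute:

```latex
\begin{aligned}
\frac{\partial}{\partial x}\!\left(\frac{\partial B_z}{\partial t}\right) &= -\frac{\partial^2 E_y}{\partial x^2}
&&\text{(first equation, differentiated with respect to } x\text{)}\\
\frac{\partial}{\partial t}\!\left(c^2\frac{\partial B_z}{\partial x}\right) &= -\frac{\partial^2 E_y}{\partial t^2}
&&\text{(second equation, differentiated with respect to } t\text{)}\\
\Rightarrow\quad c^2\,\frac{\partial^2 E_y}{\partial x^2} &= \frac{\partial^2 E_y}{\partial t^2}
\quad\Longleftrightarrow\quad
\frac{\partial^2 E_y}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 E_y}{\partial t^2} = 0.
\end{aligned}
```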

In case you want to double-check, I did a few posts on this, but, if you don’t get this, well… I am sorry. You’ll need to do some homework. More in particular, you’ll need to do some homework on differential equations. The equation above is basically some constraint on the functional form of Ey. More in general, if we see an equation like:

∂²ψ/∂x² − (1/c²)·∂²ψ/∂t² = 0

then the function ψ(x, t) must be some function of the form:

ψ(x, t) = f(x − ct) + g(x + ct)

So any function ψ like that will work. You can check it out by working out the necessary derivatives and plugging them into the wave equation. [In case you wonder how you should go about this, Feynman actually does it for you in his Lecture on this topic, so you may want to check it there.]
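If you don't feel like doing the derivatives by hand, a rough numerical check works too. The sketch below is my own, not Feynman's: the shapes `f` and `g`, the wave speed `C = 3.0` (arbitrary units) and the finite-difference step are all assumptions picked for illustration.

```python
import math

C = 3.0  # wave speed in arbitrary units (assumption: any positive value works)


def f(u):
    """Hypothetical shape moving in the positive x-direction."""
    return math.exp(-u * u)


def g(u):
    """Hypothetical shape moving in the negative x-direction."""
    return math.sin(u)


def psi(x, t):
    """Candidate solution psi(x, t) = f(x - ct) + g(x + ct)."""
    return f(x - C * t) + g(x + C * t)


def second_derivative(func, s, h=1e-3):
    """Central finite-difference estimate of the second derivative at s."""
    return (func(s + h) - 2.0 * func(s) + func(s - h)) / (h * h)


def wave_residual(x, t):
    """Left-hand side of the 1D wave equation: psi_xx - psi_tt / c^2."""
    d2x = second_derivative(lambda s: psi(s, t), x)
    d2t = second_derivative(lambda s: psi(x, s), t)
    return d2x - d2t / (C * C)


residual = wave_residual(0.4, 0.2)
```

The residual is not exactly zero because finite differences are only approximations, but it should be tiny compared to the size of the individual second derivatives.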

In fact, the functions f(x − ct) and g(x + ct) themselves will also work as possible solutions. So we can drop one or the other, which amounts to saying that our ‘shape’ has to travel in some direction, rather than in both at the same time. 🙂 Indeed, from all of my explanations above, you know what f(x − ct) represents: it’s a wave that travels in the positive x-direction. Now, it may be periodic, but it doesn’t have to be periodic. The f(x − ct) function could represent any constant ‘shape’ that’s traveling in the positive x-direction at speed c. Likewise, the g(x + ct) function could represent any constant ‘shape’ that’s traveling in the negative x-direction at speed c. As for super-imposing both…

Well… I suggest you check that post I wrote for my son, Vincent. It’s on the math of waves, but it doesn’t have derivatives and/or differential equations. It just explains how superimposition and all that works. It’s not very abstract, as it revolves around a vibrating guitar string. So, if you have trouble with all of the above, you may want to read that first. 🙂 The bottom line is that we can get any wavefunction we want by superimposing simple sinusoidals that are traveling in one or the other direction, and so that’s what the more general solution really says. Full stop. So that’s what we’re doing really: we add very simple waves to get much more complicated waveforms. 🙂
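The guitar string is a nice case to try out. As a minimal sketch (the wave number `k` and angular frequency `omega` below are arbitrary values of my choosing), adding a right-moving and a left-moving sinusoid of the same amplitude gives a standing wave, 2·sin(kx)·cos(ωt), which is just the trigonometric sum formula at work:

```python
import math


def right_mover(x, t, k=2.0, omega=3.0):
    """Sinusoid traveling in the positive x-direction."""
    return math.sin(k * x - omega * t)


def left_mover(x, t, k=2.0, omega=3.0):
    """Sinusoid traveling in the negative x-direction."""
    return math.sin(k * x + omega * t)


def superposition(x, t):
    """Sum of the two traveling waves."""
    return right_mover(x, t) + left_mover(x, t)


def standing_wave(x, t, k=2.0, omega=3.0):
    """Closed form of the sum: 2 sin(kx) cos(omega t)."""
    return 2.0 * math.sin(k * x) * math.cos(omega * t)


sample_points = [(0.0, 0.0), (0.3, 1.1), (1.7, 0.4), (-2.2, 5.0)]
differences = [abs(superposition(x, t) - standing_wave(x, t))
               for x, t in sample_points]
```

The nodes of the standing wave, where sin(kx) = 0, stay put at all times, which is what you'd expect from a string that is clamped at both ends.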

Now, I could leave it at this, but then it’s very easy to just go one step further, and that is to assume that Ez and, therefore, By are not zero. It’s just a matter of super-imposing solutions. Let me just give you the general solution. Just look at it for a while. If you understood all that I’ve said above, 20 seconds or so should be sufficient to say: “Yes, that makes sense. That’s the solution in two dimensions.” At least, I hope so! 🙂
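Here is my reconstruction of that general plane-wave solution for propagation along x. The signs follow from the two first-order equations (∂Bz/∂t = −∂Ey/∂x and its counterpart for Ez and By); the names F and G for the second pair of arbitrary functions are just labels I'm assuming:

```latex
\begin{aligned}
E_y &= f(x - ct) + g(x + ct), & cB_z &= f(x - ct) - g(x + ct),\\
E_z &= F(x - ct) + G(x + ct), & cB_y &= -F(x - ct) + G(x + ct),\\
E_x &= 0, & B_x &= 0.
\end{aligned}
```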

OK. I should really stop now. But… Well… Now that we’ve got a general solution for all plane waves, why not be even bolder and think about what we could possibly say about three-dimensional waves? So then Ex and, therefore, Bx would not necessarily be zero either. After all, light can behave that way. In fact, light is likely to be non-polarized and, hence, Ex and, therefore, Bx are most probably not equal to zero!

Now, you may think the analysis is going to be terribly complicated. And you’re right. It would be if we’d stick to our analysis in terms of x, y and z coordinates. However, it turns out that the analysis in terms of vector equations is actually quite straightforward. I’ll just copy the Master here, so you can see His Greatness. 🙂
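A sketch of how the vector route goes, in my own words and assuming free space (no charges, no currents) and the standard identity for the curl of a curl. Take the curl of the flux rule and substitute Maxwell's last equation:

```latex
\begin{aligned}
\nabla\times(\nabla\times\mathbf{E}) &= -\frac{\partial}{\partial t}\left(\nabla\times\mathbf{B}\right)
 = -\frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2},\\
\nabla\times(\nabla\times\mathbf{E}) &= \nabla(\nabla\cdot\mathbf{E}) - \nabla^2\mathbf{E}
 = -\nabla^2\mathbf{E} \quad (\text{because } \nabla\cdot\mathbf{E} = 0),\\
\Rightarrow\quad \nabla^2\mathbf{E} - \frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} &= 0.
\end{aligned}
```

The same steps, starting from the curl of the other curl equation, give the same wave equation for B.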

But what solution does an equation like (20.27) have? We can appreciate it’s actually three equations, i.e. one for each component, and so… Well… Hmm… What can we say about that? I’ll quote the Master on this too:

“How shall we find the general wave solution? The answer is that all the solutions of the three-dimensional wave equation can be represented as a superposition of the one-dimensional solutions we have already found. We obtained the equation for waves which move in the x-direction by supposing that the field did not depend on y and z. Obviously, there are other solutions in which the fields do not depend on x and z, representing waves going in the y-direction. Then there are solutions which do not depend on x and y, representing waves travelling in the z-direction. Or in general, since we have written our equations in vector form, the three-dimensional wave equation can have solutions which are plane waves moving in any direction at all. Again, since the equations are linear, we may have simultaneously as many plane waves as we wish, travelling in as many different directions. Thus the most general solution of the three-dimensional wave equation is a superposition of all sorts of plane waves moving in all sorts of directions.”

It’s the same thing once more: we add very simple waves to get much more complicated waveforms. 🙂

You must have fallen asleep by now or, else, be watching something else. Feynman must have felt the same. After explaining all of the nitty-gritty above, Feynman wakes up his students. He does so by appealing to their imagination:

“Try to imagine what the electric and magnetic fields look like at present in the space in this lecture room. First of all, there is a steady magnetic field; it comes from the currents in the interior of the earth—that is, the earth’s steady magnetic field. Then there are some irregular, nearly static electric fields produced perhaps by electric charges generated by friction as various people move about in their chairs and rub their coat sleeves against the chair arms. Then there are other magnetic fields produced by oscillating currents in the electrical wiring—fields which vary at a frequency of 60 cycles per second, in synchronism with the generator at Boulder Dam. But more interesting are the electric and magnetic fields varying at much higher frequencies. For instance, as light travels from window to floor and wall to wall, there are little wiggles of the electric and magnetic fields moving along at 186,000 miles per second. Then there are also infrared waves travelling from the warm foreheads to the cold blackboard. And we have forgotten the ultraviolet light, the x-rays, and the radiowaves travelling through the room.

Flying across the room are electromagnetic waves which carry music of a jazz band. There are waves modulated by a series of impulses representing pictures of events going on in other parts of the world, or of imaginary aspirins dissolving in imaginary stomachs. To demonstrate the reality of these waves it is only necessary to turn on electronic equipment that converts these waves into pictures and sounds.

If we go into further detail to analyze even the smallest wiggles, there are tiny electromagnetic waves that have come into the room from enormous distances. There are now tiny oscillations of the electric field, whose crests are separated by a distance of one foot, that have come from millions of miles away, transmitted to the earth from the Mariner II space craft which has just passed Venus. Its signals carry summaries of information it has picked up about the planets (information obtained from electromagnetic waves that travelled from the planet to the space craft).

There are very tiny wiggles of the electric and magnetic fields that are waves which originated billions of light years away—from galaxies in the remotest corners of the universe. That this is true has been found by “filling the room with wires”—by building antennas as large as this room. Such radiowaves have been detected from places in space beyond the range of the greatest optical telescopes. Even they, the optical telescopes, are simply gatherers of electromagnetic waves. What we call the stars are only inferences, inferences drawn from the only physical reality we have yet gotten from them—from a careful study of the unendingly complex undulations of the electric and magnetic fields reaching us on earth.

There is, of course, more: the fields produced by lightning miles away, the fields of the charged cosmic ray particles as they zip through the room, and more, and more. What a complicated thing is the electric field in the space around you! Yet it always satisfies the three-dimensional wave equation.”

So… Well… That’s it for today, folks. 🙂 We have some more gymnastics to do, still… But we’re really there. Or here, I should say: on top of the peak. What a view we have here! Isn’t it beautiful? It took us quite some effort to get on top of this thing, and we’re still trying to catch our breath as we struggle with what we’ve learned so far, but it’s really worthwhile, isn’t it? 🙂

# Maxwell, Lorentz, gauges and gauge transformations

I’ve done quite a few posts already on electromagnetism. They were all focused on the math one needs to understand Maxwell’s equations. Maxwell’s equations are a set of (four) differential equations, so they relate some function with its derivatives. To be specific, they relate E and B, i.e. the electric and magnetic field vector respectively, with their derivatives in space and in time. [Let me be explicit here: E and B have three components, but depend on both space as well as time, so we have three dependent and four independent variables for each function: E = (Ex, Ey, Ez) = E(x, y, z, t) and B = (Bx, By, Bz) = B(x, y, z, t).] That’s simple enough to understand, but the dynamics involved are quite complicated, as illustrated below.

I now want to do a series on the more interesting stuff, including an exploration of the concept of gauge in field theory, and I also want to show how one can derive the wave equation for electromagnetic radiation from Maxwell’s equations. Before I start, let’s recall the basic concept of a field.

## The reality of fields

I said a couple of times already that (electromagnetic) fields are real. They’re more than just a mathematical structure. Let me show you why. Remember the formula for the electrostatic potential caused by some charge q at the origin: Φ(r) = q/(4πε0·r), with r the distance from the origin.

We know that the (negative) gradient of this function, at any point in space, gives us the electric field vector at that point: E = –∇Φ. [The minus sign is there because of convention: we take the reference point Φ = 0 at infinity.] Now, the electric field vector gives us the force on a unit charge (i.e. a charge of one coulomb in SI units) at that point. If q is some positive charge, the force will be repulsive, and the unit charge will accelerate away from our q charge at the origin. Hence, energy will be expended, as force over distance implies work is being done: as the charges separate, potential energy is converted into kinetic energy. Where does the energy come from? The energy conservation law tells us that it must come from somewhere.
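To make the 'negative gradient' idea concrete, here is a small numerical sketch. It assumes the standard point-charge potential q/(4πε0r); the charge of 1 nC and the sample point (1, 2, 2) are arbitrary choices of mine, and the gradient is only estimated with finite differences.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in C^2/(N*m^2)


def potential(q, x, y, z):
    """Electrostatic potential of a point charge q at the origin (volts)."""
    r = math.sqrt(x * x + y * y + z * z)
    return q / (4.0 * math.pi * EPSILON_0 * r)


def field_from_potential(q, x, y, z, h=1e-6):
    """E = -grad(Phi), estimated with central finite differences (N/C)."""
    ex = -(potential(q, x + h, y, z) - potential(q, x - h, y, z)) / (2.0 * h)
    ey = -(potential(q, x, y + h, z) - potential(q, x, y - h, z)) / (2.0 * h)
    ez = -(potential(q, x, y, z + h) - potential(q, x, y, z - h)) / (2.0 * h)
    return (ex, ey, ez)


def coulomb_field_magnitude(q, r):
    """Magnitude of the field of a point charge at distance r (N/C)."""
    return q / (4.0 * math.pi * EPSILON_0 * r * r)


q = 1.0e-9                                   # hypothetical charge: 1 nC
e_vec = field_from_potential(q, 1.0, 2.0, 2.0)   # this point is 3 m from the origin
e_mag = math.sqrt(sum(c * c for c in e_vec))
```

For a positive q, all three components come out positive at this point: the field points away from the charge, as it should.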

It does: the energy comes from the field itself. Bringing in more or bigger charges (from infinity, or just from further away) requires more energy. So the new charges change the field and, therefore, its energy. How exactly? That’s given by Gauss’ Law: the total flux out of a closed surface is equal to: ∫ En·da = (sum of the charges inside)/ε0.

You’ll say: flux and energy are two different things. Well… Yes and no. The energy in the field depends on E. Indeed, the formula for the energy density in space (i.e. the energy per unit volume) is u = (ε0/2)·E·E = ε0E²/2.

Getting the energy over a larger space is just another integral, with the energy density as the integral kernel: U = (ε0/2)·∫ E·E dV.

Feynman’s illustration below is not very sophisticated but, as usual, enlightening. 🙂

Gauss’ Theorem connects both the math as well as the physics of the situation and, as such, underscores the reality of fields: the energy is not in the electric charges. The energy is in the fields they produce. Everything else is just the principle of superposition of fields – i.e. E = E1 + E2 – coming into play. I’ll explain Gauss’ Theorem in a moment. Let me first make some additional remarks.

First, the formulas are valid for electrostatics only (so E and B only vary in space, not in time), so they’re just a piece of the larger puzzle. 🙂 As for now, however, note that, if a field is real (or, to be precise, if its energy is real), then the flux is equally real.

Second, let me say something about the units. Field strength (E or, in this case, its normal component En = E·n) is measured in newton (N) per coulomb (C), so in N/C. The integral above implies that flux is measured in (N/C)·m². It’s a weird unit because one associates flux with flow and, therefore, one would expect flux to be some quantity per unit time and per unit area, so we’d have the m² unit (and the second) in the denominator, not in the numerator. That’s true for heat transfer, for mass transfer, for fluid dynamics (e.g. the amount of water flowing through some cross-section) and many other physical phenomena. But for electric flux, it’s different. You can do a dimensional analysis of the expression above: the sum of the charges is expressed in coulomb (C), and the electric constant (i.e. the vacuum permittivity) is expressed in C²/(N·m²), so, yes, it works: C/[C²/(N·m²)] = (N/C)·m². To make sense of the units, you should think of the flux as the total flow, and of the field strength as a surface density, so that’s the flux divided by the total area, so (field strength) = (flux)/(area). Conversely, (flux) = (field strength)×(area). Hence, the unit of flux is [flux] = [field strength]×[area] = (N/C)·m².

OK. Now we’re ready for Gauss’ Theorem. 🙂 I’ll also say something about its corollary, Stokes’ Theorem. It’s a bit of a mathematical digression but necessary, I think, for a better understanding of all those operators we’re going to use.

## Gauss’ Theorem

The concept of flux is related to the divergence of a vector field through Gauss’ Theorem. Gauss’ Theorem has nothing to do with Gauss’ Law, except that both are associated with the same genius. Gauss’ Theorem is: ∫S C·n da = ∫V ∇·C dV.

The ∇·C in the integral on the right-hand side is the divergence of a vector field. It’s the volume density of the outward flux of a vector field from an infinitesimal volume around a given point.

Huh? What’s a volume density? Good question. Just substitute C for E in the surface and volume integral above (the integral on the left is a surface integral, and the one on the right is a volume integral), and think about the meaning of what’s written. To help you, let me also include the concept of linear density, so we have (1) linear, (2) surface and (3) volume density. Look at that representation of a vector field once again: we said the density of lines represented the magnitude of E. But what density? The representation hereunder is flat, so we can think of a linear density indeed, measured along the blue line: so the flux would be six (that’s the number of lines), and the linear density (i.e. the field strength) is six divided by the length of the blue line.

However, we defined field strength as a surface density above, so that’s the flux (i.e. the number of field lines) divided by the surface area (i.e. the area of a cross-section): think of the square of the blue line, and field lines going through that square. That’s simple enough. But what’s volume density? How do we count the number of lines inside of a box? The answer is: mathematicians actually define it for an infinitesimally small cube by adding the fluxes out of the six individual faces of that cube: (flux out of the cube) = (∂Cx/∂x + ∂Cy/∂y + ∂Cz/∂z)·ΔxΔyΔz.

So, the truth is: volume density is actually defined as a surface density, but for an infinitesimally small volume element. That, in turn, gives us the meaning of the divergence of a vector field. Indeed, the sum of the derivatives above is just ∇·C (i.e. the divergence of C), and ΔxΔyΔz is the volume of our infinitesimal cube, so the divergence of some field vector C at some point P is the flux – i.e. the outgoing ‘flow’ of C – per unit volume, in the neighborhood of P, as evidenced by writing: (flux out of the cube) = (∇·C)·ΔV.

Indeed, just bring ΔV to the other side of the equation to check the ‘per unit volume’ aspect of what I wrote above. The whole idea is to determine whether the small volume is like a sink or like a source, and to what extent. Think of the field near a point charge, as illustrated below. Look at the black lines: they are the field lines (the dashed lines are equipotential lines) and note how the positive charge is a source of flux, obviously, while the negative charge is a sink.
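Here is a little sketch of the 'flux per unit volume' idea. The vector field C = (x², y, 3z) is a made-up example of mine (its divergence is 2x + 4), and each face of the cube is approximated by the field value at its centre:

```python
def field(x, y, z):
    """Hypothetical vector field C = (x^2, y, 3z); div C = 2x + 1 + 3."""
    return (x * x, y, 3.0 * z)


def flux_out_of_cube(cx, cy, cz, d):
    """Net outward flux through a cube of side d centred at (cx, cy, cz).

    Only the normal component counts on each face, and each face is
    approximated by the field value at its centre.
    """
    h = d / 2.0
    area = d * d
    flux = 0.0
    flux += (field(cx + h, cy, cz)[0] - field(cx - h, cy, cz)[0]) * area  # x faces
    flux += (field(cx, cy + h, cz)[1] - field(cx, cy - h, cz)[1]) * area  # y faces
    flux += (field(cx, cy, cz + h)[2] - field(cx, cy, cz - h)[2]) * area  # z faces
    return flux


def divergence_estimate(cx, cy, cz, d=1e-3):
    """Flux out of a small cube divided by its volume."""
    return flux_out_of_cube(cx, cy, cz, d) / d ** 3


estimate = divergence_estimate(1.5, 0.0, 0.0)   # exact divergence here is 7
```

A positive result means the little cube acts like a source, a negative one like a sink, and zero means whatever flows in flows out again.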

Now, the next step is to acknowledge that the total flux from a volume is the sum of the fluxes out of each part. Indeed, the fluxes through the surface that two adjacent parts have in common cancel each other out: what flows out of one part flows into the other. Feynman illustrates that with a rough drawing (below) and I’ll refer you to his Lecture on it for more detail.

So… Combining all of the gymnastics above – and integrating the divergence over an entire volume, indeed – we get Gauss’ Theorem: ∫S C·n da = ∫V ∇·C dV.

## Stokes’ Theorem

There is a similar theorem involving the circulation of a vector, rather than its flux. It’s referred to as Stokes’ Theorem. Let me jot it down: ∮Γ C·ds = ∫S (∇×C)·n da.

We have a contour integral here (left) and a surface integral (right). The reasoning behind it is quite similar: a surface bounded by some loop Γ is divided into infinitesimally small squares, and the circulation around Γ is the sum of the circulations around the little loops. We should take care though: the surface integral takes the normal component of ∇×C, so that’s (∇×C)n = (∇×C)·n. The illustrations below should help you to understand what’s going on.
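The same kind of sketch works for the circulation around one of those little squares. The planar field C = (−y, x) below is a made-up example of mine, for which the z-component of the curl is 2 everywhere, so the circulation around a small square should be 2 times its area:

```python
def field(x, y):
    """Hypothetical planar field C = (-y, x); (curl C)_z = 2 everywhere."""
    return (-y, x)


def circulation_around_square(cx, cy, side):
    """Counter-clockwise line integral of C around a small square.

    Each edge is approximated by the tangential field component
    at the midpoint of that edge, times the length of the edge.
    """
    h = side / 2.0
    bottom = field(cx, cy - h)[0] * side    # moving in the +x direction
    right = field(cx + h, cy)[1] * side     # moving in the +y direction
    top = -field(cx, cy + h)[0] * side      # moving in the -x direction
    left = -field(cx - h, cy)[1] * side     # moving in the -y direction
    return bottom + right + top + left


side = 0.01
circulation = circulation_around_square(0.3, -0.7, side)
curl_times_area = 2.0 * side * side
```

Adding up such little loops over a whole surface, with the contributions along shared edges cancelling, is the idea behind the theorem.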

## The electric versus the magnetic force

There’s more than just the electric force: we also have the magnetic force. The so-called Lorentz force is the combination of both. The formula for the force on some charge q moving with velocity v in an electromagnetic field is: F = q(E + v×B).

Hence, if the velocity vector v is not equal to zero, we need to look at the magnetic field vector B too! The simplest situation is magnetostatics, so let’s first have a look at that.
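A quick sketch of the Lorentz force, F = q(E + v×B), with plain tuples as vectors. The numbers in the example (a 2 C charge, and so on) are arbitrary values I picked so the arithmetic is easy to follow:

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lorentz_force(q, e_field, velocity, b_field):
    """Force on a charge q: F = q (E + v x B), all in SI units."""
    v_cross_b = cross(velocity, b_field)
    return tuple(q * (e + vb) for e, vb in zip(e_field, v_cross_b))


q = 2.0                       # coulomb (hypothetical, for easy arithmetic)
e_field = (0.0, 3.0, 0.0)     # N/C, along y
velocity = (4.0, 0.0, 0.0)    # m/s, along x
b_field = (0.0, 0.0, 0.5)     # tesla, along z

force = lorentz_force(q, e_field, velocity, b_field)
force_at_rest = lorentz_force(q, e_field, (0.0, 0.0, 0.0), b_field)
```

Here v×B points in the negative y-direction, so the magnetic force partly cancels the electric one. With v = 0, only the qE term survives.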

Magnetostatics implies that the flux of E doesn’t change, so Maxwell’s third equation reduces to c²∇×B = j/ε0. So we just have a steady electric current (j): no accelerating charges. Maxwell’s fourth equation, ∇·B = 0, remains what it was: there’s no such thing as a magnetic charge. The Lorentz force also remains what it is, of course: F = q(E+v×B) = qE +qv×B. Also note that the v, j and the lack of a magnetic charge all point to the same: magnetism is just a relativistic effect of electricity.

What about units? Well… While the unit of E, i.e. the electric field strength, is pretty obvious from the F = qE term – hence, E = F/q, and so the unit of E must be [force]/[charge] = N/C – the unit of the magnetic field strength is more complicated. Indeed, the F = qv×B identity tells us it must be (N·s)/(m·C), because 1 N = 1 C·(m/s)·(N·s)/(m·C). Phew! That’s as horrendous as it looks, and that’s why it’s usually expressed using its shorthand, i.e. the tesla: 1 T = 1 (N·s)/(m·C). Magnetic flux is the same concept as electric flux, so it’s (field strength)×(area). However, now we’re talking magnetic field strength, so its unit is T·m² = (N·s·m²)/(m·C) = (N·s·m)/C, which is referred to as the weber (Wb). Remembering that 1 volt = 1 N·m/C, it’s easy to see that a weber is also equal to 1 Wb = 1 V·s. In any case, it’s a unit that is not so easy to interpret.

Magnetostatics is a bit of a weird situation. It assumes steady fields, so the ∂E/∂t and ∂B/∂t terms in Maxwell’s equations can be dropped. In fact, c²∇×B = j/ε0 implies that ∇·(c²∇×B) = ∇·(j/ε0) and, therefore, that ∇·j = 0, because the divergence of a curl is always zero. Now, ∇·j = –∂ρ/∂t and, therefore, magnetostatics is a situation which assumes ∂ρ/∂t = 0. So we have electric currents but no change in charge densities. To put it simply, we’re not looking at a condenser that is charging or discharging, although that condenser may act like the battery or generator that keeps the charges flowing! But let’s go along with the magnetostatics assumption. What can we say about it? Well… First, we have the equivalent of Gauss’ Law, i.e. Ampère’s Law: ∮Γ B·ds = (current through the loop)/(ε0c²).

We have a line integral here around a closed curve, instead of a surface integral over a closed surface (Gauss’ Law), but it’s pretty similar: instead of the sum of the charges inside the volume, we have the current through the loop, and then an extra c² factor in the denominator, of course. Combined with the ∇·B = 0 equation, this equation allows us to solve practical problems. But I am not interested in practical problems. What’s the theory behind it?

## The magnetic vector potential

The ∇·B = 0 equation is true, always, unlike the ∇×E = 0 expression, which is true for electrostatics only (no moving charges). It says the divergence of B is zero, always, and, hence, it means we can represent B as the curl of another vector field, always. That vector field is referred to as the magnetic vector potential, and we write:

∇·B = ∇·(∇×A) = 0 and, hence, B = ∇×A

In electrostatics, we had the other theorem: if the curl of a vector field is zero (everywhere), then the vector field can be represented as the gradient of some scalar function, so if ∇×C = 0, then there is some Ψ for which C = ∇Ψ. Substituting C for E, and taking into account our conventions on charge and the direction of flow, we get E = –∇Φ. Substituting E in Maxwell’s first equation (∇·E = ρ/ε0) then gave us the so-called Poisson equation: ∇²Φ = –ρ/ε0, which sums up the whole subject of electrostatics really! It’s all in there!

Except magnetostatics, of course. Using the (magnetic) vector potential A, all of magnetostatics is reduced to another expression:

∇²A = −j/(ε0c²), with ∇·A = 0

Note the qualifier: ∇·A = 0. Why should the divergence of A be equal to zero? You’re right. It doesn’t have to be that way. We know that ∇·(∇×C) = 0, for any vector field C, and always (it’s a mathematical identity, in fact, so it’s got nothing to do with physics), but choosing A such that ∇·A = 0 is just a choice. In fact, as I’ll explain in a moment, it’s referred to as choosing a gauge. The ∇·A = 0 choice is a very convenient choice, however, as it simplifies our equations. Indeed, c²∇×B = j/ε0 = c²∇×(∇×A), and – from our vector calculus classes – we know that ∇×(∇×C) = ∇(∇·C) – ∇²C. Combining that with our choice of A (which is such that ∇·A = 0, indeed), we get the ∇²A = −j/(ε0c²) expression indeed, which sums up the whole subject of magnetostatics!
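Step by step, and assuming nothing beyond the magnetostatic curl equation, B = ∇×A and the curl-of-a-curl identity, the chain looks like this:

```latex
\begin{aligned}
c^2\,\nabla\times\mathbf{B} &= \frac{\mathbf{j}}{\varepsilon_0},
\qquad \mathbf{B} = \nabla\times\mathbf{A}\\
\Rightarrow\quad c^2\,\nabla\times(\nabla\times\mathbf{A})
 &= c^2\left[\nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}\right] = \frac{\mathbf{j}}{\varepsilon_0}\\
\Rightarrow\quad \nabla^2\mathbf{A} &= -\frac{\mathbf{j}}{\varepsilon_0 c^2}
\qquad (\text{using the choice } \nabla\cdot\mathbf{A} = 0).
\end{aligned}
```

Note that this is really three Poisson-type equations, one for each component of A, which is why the analogy with electrostatics is so close.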

The point is: if the time derivatives in Maxwell’s equations, i.e. ∂E/∂t and ∂B/∂t, are zero, then Maxwell’s four equations can be nicely separated into two pairs: the electric and magnetic field are not interconnected. Hence, as long as charges and currents are static, electricity and magnetism appear as distinct phenomena, and the interdependence of E and B does not appear. So we re-write Maxwell’s set of four equations as:

1. Electrostatics: ∇·E = ρ/ε0 and ∇×E = 0
2. Magnetostatics: ∇×B = j/c²ε0 and ∇·B = 0

Note that electrostatics is a neat example of a vector field with zero curl and a given divergence (ρ/ε0), while magnetostatics is a neat example of a vector field with zero divergence and a given curl (j/c²ε0).

Electrodynamics

But reality is usually not so simple. With time-varying fields, Maxwell’s equations are what they are, and so there is interdependence, as illustrated in the introduction of this post. Note, however, that the magnetic field remains divergence-free in dynamics too! That’s because there is no such thing as a magnetic charge: we only have electric charges. So ∇·B = 0 and we can define a magnetic vector potential A and re-write B as B = ∇×A, indeed.

I am writing a vector potential field because, as I mentioned a couple of times already, we can choose A. Indeed, as long as ∇×A = B, it’s fine, so we can add curl-free components to the magnetic potential: it won’t make a difference. This freedom is referred to as gauge invariance. I’ll come back to that, and also show why this is what it is.

While we can easily get B from A because B = ∇×A, getting E from some potential is a different matter altogether. It turns out we can get E using the following expression, which involves both Φ (i.e. the electric or electrostatic potential) as well as A (i.e. the magnetic vector potential):

E = –∇Φ – ∂A/∂t

Likewise, one can show that Maxwell’s equations can be re-written in terms of Φ and A, rather than in terms of E and B. The expression looks rather formidable, but don’t panic:

Just look at it. We have two ‘variables’ here (Φ and A) and two equations, so the system is fully defined. [Of course, the second equation is three equations really: one for each component x, y and z.] What’s the point? Why would we want to re-write Maxwell’s equations? The first equation makes it clear that the scalar potential (i.e. the electric potential) is a time-varying quantity, so things are not, somehow, simpler. The answer is twofold. First, re-writing Maxwell’s equations in terms of the scalar and vector potential makes sense because we have (fairly) easy expressions for their value in time and in space as a function of the charges and currents. For statics, these expressions are:

So it is, effectively, easier to first calculate the scalar and vector potential, and then get E and B from them. For dynamics, the expressions are similar:

Indeed, they are like the integrals for statics, but with “a small and physically appealing modification”, as Feynman notes: when doing the integrals, we must use the so-called retarded time t′ = t − r12/c. The illustration below shows how it works: the influences propagate from point (2) to point (1) at the speed c, so we must use the values of ρ and j at the time t′ = t − r12/c indeed!
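To make the retarded-time rule a bit more tangible, here is a minimal Python sketch. The function name `retarded_time` and the sample points are my own illustration (they are not Feynman’s): it just evaluates t′ = t − r12/c for two points in space.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def retarded_time(t, point_1, point_2, c=C):
    """Return t' = t - r12/c: the moment at which the source at point_2
    must be evaluated so that its influence arrives at point_1 at time t."""
    r12 = math.dist(point_1, point_2)  # distance between the two points
    return t - r12 / c

# Hypothetical example: the source sits one light-second away from the observer.
observer = (0.0, 0.0, 0.0)
source = (C, 0.0, 0.0)
print(retarded_time(10.0, observer, source))  # the source is 'seen' as it was one second earlier
```

So, at time t, point (1) feels what ρ and j were doing at point (2) a little earlier, and the delay is just the travel time r12/c.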

The second aspect of the answer to the question of why we’d be interested in Φ and A has to do with the topic I wanted to write about here: the concept of a gauge and a gauge transformation.

Gauges and gauge transformations in electromagnetics

Let’s see what we’re doing really. We calculate some A and then solve for B by writing: B = ∇×A. Now, I say some A because any A′ = A + ∇Ψ, with Ψ any scalar field really, will give us the same B. Why? Because the curl of the gradient of Ψ – i.e. curl(grad Ψ) = ∇×(∇Ψ) – is equal to 0. Hence, ∇×(A + ∇Ψ) = ∇×A + ∇×(∇Ψ) = ∇×A.

So we have B, and now we need E. So the next step is to take Faraday’s Law, which is Maxwell’s second equation: ∇×E = –∂B/∂t. Why this one? It’s a simple one, as it does not involve currents or charges. So we combine this equation and our B = ∇×A expression and write:

∇×E = –∂(∇×A)/∂t

Now, these operators are tricky but you can verify this can be re-written as:

∇×(E + ∂A/∂t) = 0

Looking carefully, we see this expression says that E + ∂A/∂t is some vector whose curl is equal to zero. Hence, this vector must be the gradient of something. When we worked on electrostatics, we only had E, not the ∂A/∂t bit, and we said that E tout court was the gradient of something, so we wrote E = −∇Φ. We now do the same thing for E + ∂A/∂t, so we write:

E + ∂A/∂t = −∇Φ

So we use the same symbol Φ but it’s a bit of a different animal, obviously. However, it’s easy to see that, if the ∂A/∂t would disappear (as it does in electrostatics, where nothing changes with time), we’d get our ‘old’ −∇Φ. Now, E + ∂A/∂t = −∇Φ can be written as:

E = −∇Φ – ∂A/∂t

So, what’s the big deal? We wrote B and E as a function of Φ and A. Well, we said we could replace A by any A′ = A + ∇Ψ but, obviously, such substitution would not yield the same E. To get the same E, we need some substitution rule for Φ as well. Now, you can verify we will get the same E if we replace Φ by Φ′ = Φ – ∂Ψ/∂t. You should check it by writing it all out:

E = −∇Φ′ – ∂A′/∂t = −∇(Φ – ∂Ψ/∂t) – ∂(A + ∇Ψ)/∂t

= −∇Φ + ∇(∂Ψ/∂t) – ∂A/∂t – ∂(∇Ψ)/∂t = −∇Φ – ∂A/∂t = E
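If you’d rather let a computer do the checking, here is a rough numerical sketch in plain Python. Everything in it is an assumption made for the sake of the test: the potentials `phi` and `A`, the gauge function `psi` and the sample point are arbitrary choices of mine, and the derivatives are simple central differences, so the result is only as good as that approximation. It computes E = −∇Φ – ∂A/∂t and B = ∇×A before and after the gauge transformation and compares them.

```python
import math

H = 1e-4  # step size for the central differences

def partial(f, p, i):
    """Central-difference derivative of f(x, y, z, t) with respect to coordinate i."""
    up, down = list(p), list(p)
    up[i] += H
    down[i] -= H
    return (f(*up) - f(*down)) / (2 * H)

def grad(f, p):
    """Spatial gradient (x, y, z components) of a scalar function at p."""
    return [partial(f, p, i) for i in range(3)]

def component(F, k):
    """Scalar function giving the k-th component of a vector function F."""
    return lambda *p: F(*p)[k]

def fields(phi, A, p):
    """E = -grad(phi) - dA/dt and B = curl(A), both evaluated numerically at p."""
    dA = lambda k, i: partial(component(A, k), p, i)
    E = [-g - dA(k, 3) for k, g in enumerate(grad(phi, p))]
    B = [dA(2, 1) - dA(1, 2), dA(0, 2) - dA(2, 0), dA(1, 0) - dA(0, 1)]
    return E, B

# Arbitrary (hypothetical) potentials and gauge function, chosen only for the test.
phi = lambda x, y, z, t: x * y * math.sin(t) + z ** 2
A = lambda x, y, z, t: [y * z * t, t * math.cos(x), x * y + t ** 2]
psi = lambda x, y, z, t: x * y * z * t + math.sin(x + t)

# Gauge-transformed potentials: A' = A + grad(psi), phi' = phi - d(psi)/dt.
A_new = lambda *p: [a + g for a, g in zip(A(*p), grad(psi, p))]
phi_new = lambda *p: phi(*p) - partial(psi, p, 3)

point = (0.3, -0.7, 1.1, 0.5)  # (x, y, z, t)
E_old, B_old = fields(phi, A, point)
E_new, B_new = fields(phi_new, A_new, point)
max_diff = max(abs(a - b) for a, b in zip(E_old + B_old, E_new + B_new))
print(max_diff)  # small: only finite-difference error remains
```

The potentials themselves change quite a bit under the transformation, but the fields you compute from them do not, which is the whole point.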

Again, the operators are a bit tricky, but the +∇(∂Ψ/∂t) and –∂(∇Ψ)/∂t terms do cancel out. Where are we heading to? When everything is said and done, we do need to relate it all to the currents and the charges, because that’s the real stuff out there. So let’s take Maxwell’s ∇·E = ρ/ε0 equation, which has the charges in it, and let’s replace E by −∇Φ – ∂A/∂t. We get:

∇·(−∇Φ – ∂A/∂t) = ρ/ε0

That equation can be re-written as:

−∇²Φ – ∂(∇·A)/∂t = ρ/ε0

So we have one equation here relating Φ and A to the sources. We need another one, and we also need to separate Φ and A somehow. How do we do that?

Maxwell’s fourth equation, i.e. c²∇×B = j/ε0 + ∂E/∂t can, obviously, be written as c²∇×B − ∂E/∂t = j/ε0. Substituting both E and B yields the following monstrosity:

c²∇×(∇×A) − ∂(−∇Φ – ∂A/∂t)/∂t = j/ε0

We can now apply the general ∇×(∇×C) = ∇(∇·C) – ∇²C identity to the first term to get:

−c²∇²A + c²∇(∇·A) + ∂(∇Φ)/∂t + ∂²A/∂t² = j/ε0

It’s equally monstrous, obviously, but we can simplify the whole thing by choosing Φ and A in a clever way. For the magnetostatic case, we chose A such that ∇·A = 0. We could have chosen something else. Indeed, it’s not because B is divergence-free, that A has to be divergence-free too! For example, I’ll leave it to you to show that choosing ∇·A such that

∇·A = −c⁻²·∂Φ/∂t

also respects the general condition that any A and Φ we choose must respect the A′ = A + ∇Ψ and Φ′ = Φ – ∂Ψ/∂t equalities. Now, if we choose ∇·A such that ∇·A = −c⁻²·∂Φ/∂t indeed, then the two middle terms in our monstrosity cancel out, and we’re left with a much simpler equation for A:

∇²A – c⁻²·∂²A/∂t² = −j/ε0c²

In addition, doing the substitution in our other equation relating Φ and A to the sources yields an equation for Φ that has the same form:

∇²Φ – c⁻²·∂²Φ/∂t² = −ρ/ε0

What’s the big deal here? Well… Let’s write it all out. The equation above becomes:

∂²Φ/∂x² + ∂²Φ/∂y² + ∂²Φ/∂z² – c⁻²·∂²Φ/∂t² = −ρ/ε0

That’s a wave equation in three dimensions. In case you wonder, just check one of my posts on wave equations. The one-dimensional equivalent for a wave propagating in the x direction at speed c (like a sound wave, for example) is ∂²Φ/∂x² = c⁻²·∂²Φ/∂t², indeed. The equation for A above yields similar wave equations for A’s components Ax, Ay, and Az.
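A quick numerical sanity check of the one-dimensional case may help here. The sketch below is my own illustration, with made-up numbers: it takes a Gaussian pulse traveling to the right, Φ(x, t) = f(x − c·t), with a speed `c` of 2 in arbitrary units, and checks with finite differences that ∂²Φ/∂x² − c⁻²·∂²Φ/∂t² is (nearly) zero away from any sources.

```python
import math

c = 2.0  # wave speed in arbitrary units (an assumption for the demo)

def pulse(u):
    """Shape of the traveling pulse: a Gaussian bump."""
    return math.exp(-u * u)

def phi(x, t):
    """A pulse moving in the positive x-direction at speed c."""
    return pulse(x - c * t)

def second_derivative(f, u, h=1e-3):
    """Standard three-point estimate of the second derivative of f at u."""
    return (f(u + h) - 2 * f(u) + f(u - h)) / (h * h)

x0, t0 = 0.7, 0.2  # arbitrary point in space and time
d2_dx2 = second_derivative(lambda x: phi(x, t0), x0)
d2_dt2 = second_derivative(lambda t: phi(x0, t), t0)
residual = d2_dx2 - d2_dt2 / c ** 2
print(d2_dx2, residual)  # the residual is tiny compared to the derivative itself
```

Any other shape for `pulse` would do just as well, as long as it only depends on x − c·t (or on x + c·t, for a wave going the other way).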

So, yes, it is a big deal. We’ve written Maxwell’s equations in terms of the scalar (Φ) and vector (A) potential and in a form that makes immediately apparent that we’re talking electromagnetic waves moving out at the speed c. Let me copy them again:

∇²Φ – c⁻²·∂²Φ/∂t² = −ρ/ε0

∇²A – c⁻²·∂²A/∂t² = −j/ε0c²

You may, of course, say that you’d rather have a wave equation for E and B, rather than for A and Φ. Well… That can be done. Feynman gives us two derivations that do so. The first derivation is relatively simple and assumes that our electromagnetic wave moves in one direction only. The second derivation is much more complicated and gives an equation for E that, if you’ve read the first volume of Feynman’s Lectures, you’ll surely remember:

The links are there, and so I’ll let you have fun with those Lectures yourself. I am finished here, indeed, in terms of what I wanted to do in this post, and that is to say a few words about gauges in field theory. It’s nothing much, really, and so we’ll surely have to discuss the topic again, but at least you now know what a gauge actually is in classical electromagnetic theory. Let’s quickly go over the concepts:

1. Choosing ∇·A is choosing a gauge, or a gauge potential (because we’re talking scalar and vector potential here). The particular choice is also referred to as gauge fixing.
2. Changing A by adding ∇Ψ is called a gauge transformation, and the scalar function Ψ is referred to as a gauge function. The fact that we can add curl-free components to the magnetic potential without them making any difference is referred to as gauge invariance.
3. Finally, the ∇·A = −c⁻²·∂Φ/∂t gauge is referred to as a Lorentz gauge.

Just to make sure you understand: why is that Lorentz gauge so special? Well… Look at the whole argument once more: isn’t it amazing we get such beautiful (wave) equations if we stick it in? Also look at the functional shape of the gauge itself: it looks like a wave equation itself! […] Well… No… It doesn’t. I am a bit too enthusiastic here. We do have the same 1/c² and a time derivative, but it’s not a wave equation. 🙂 In any case, it all confirms, once again, that physics is all about beautiful mathematical structures. But, again, it’s not math only. There’s something real out there. In this case, that ‘something’ is a traveling electromagnetic field. 🙂

But why do we call it a gauge? That should be equally obvious. It’s really like choosing a gauge in another context, such as measuring the pressure of a tyre, as shown below. 🙂

Gauges and group theory

You’ll usually see gauges mentioned with some reference to group theory. For example, you will see or hear phrases like: “The existence of arbitrary numbers of gauge functions ψ(r, t) corresponds to the U(1) gauge freedom of the electromagnetic theory.” The U(1) notation stands for a unitary group of degree n = 1. It is also known as the circle group. Let me copy the introduction to the unitary group from the Wikipedia article on it:

In mathematics, the unitary group of degree n, denoted U(n), is the group of n × n unitary matrices, with the group operation that of matrix multiplication. The unitary group is a subgroup of the general linear group GL(n, C). In the simple case n = 1, the group U(1) corresponds to the circle group, consisting of all complex numbers with absolute value 1 under multiplication. All the unitary groups contain copies of this group.

The unitary group U(n) is a real Lie group of dimension n². The Lie algebra of U(n) consists of n × n skew-Hermitian matrices, with the Lie bracket given by the commutator. The general unitary group (also called the group of unitary similitudes) consists of all matrices A such that A*A is a nonzero multiple of the identity matrix, and is just the product of the unitary group with the group of all positive multiples of the identity matrix.

Phew! Does this make you any wiser? If anything, it makes me realize I’ve still got a long way to go. 🙂 The Wikipedia article on gauge fixing notes something that’s more interesting (if only because I more or less understand what it says):

Although classical electromagnetism is now often spoken of as a gauge theory, it was not originally conceived in these terms. The motion of a classical point charge is affected only by the electric and magnetic field strengths at that point, and the potentials can be treated as a mere mathematical device for simplifying some proofs and calculations. Not until the advent of quantum field theory could it be said that the potentials themselves are part of the physical configuration of a system. The earliest consequence to be accurately predicted and experimentally verified was the Aharonov–Bohm effect, which has no classical counterpart.

This confirms, once again, that the fields are real. In fact, what this says is that the potentials are real: they have a meaningful physical interpretation. I’ll leave it to you to explore that Aharonov–Bohm effect. In the meanwhile, I’ll study what Feynman writes on potentials and all that as used in quantum physics. It will probably take a while before I’ll get into group theory though.

Indeed, it’s probably best to study physics at a somewhat less abstract level first, before getting into the more sophisticated stuff.

# Back to tedious stuff: an introduction to electromagnetism

It seems I skipped too many chapters in Feynman’s second volume of Lectures (on electromagnetism) and so I have to return to that before getting back to quantum physics. So let me just do that in the next couple of posts. I’ll have to start with the basics: Maxwell’s equations.

Indeed, electromagnetic phenomena are described by a set of four equations known as Maxwell’s equations. They relate two fields: the electric field (E) and the magnetic field (B). The electric field appears when we have electric charges: positive (e.g. protons or positively charged ions) or negative (e.g. electrons or negatively charged ions). That’s obvious.

In contrast, there is no such thing as ‘magnetic charges’. The magnetic field appears only when the electric field changes, or when charges move. In turn, the change in the magnetic field causes an electric field, and that’s how electromagnetic radiation basically works: a changing electric field causes a magnetic field, and the build-up of that magnetic field (so that’s a changing magnetic field) causes a build-up of an electric field, and so on and so on.

OK. That’s obvious too. But how does it work exactly? Before explaining this, I need to point out some more ‘obvious’ things:

1. From Maxwell’s equations, we can calculate the magnitude of E and B. Indeed, a specific functional form for E and B is what we get when we solve Maxwell’s set of equations, and we’ll jot down that solution in a moment–even if I am afraid you will shake your head when you see it. The point to note is that what we get as a solution for E and B is a solution in a particular frame of reference only: if we switch to another reference frame, E and B will look different.

Huh? Yes. According to the principle of relativity, we cannot say which charges are ‘stationary’ and which charges are ‘moving’ in any absolute sense: it all depends on our frame of reference.

But… Yes? Then if we put an electric charge in these fields, the force on it will also be different?

Yes. Forces also look different when moving from one reference frame to another.

But… Yes? The physical effect surely has to be the same, regardless of the reference frame?

Yes. The point is that, if we look at an electric charge q moving along a current-carrying wire in a coordinate system at rest with respect to the wire, with the same velocity (v0) as the conduction electrons (v), then the whole force on the electric charge will be ‘magnetic’: F = qv0×B and E = 0. Now, if we’re looking at the same situation from a frame of reference that is moving with q, then our charge is at rest, and so there can be no magnetic force on it. Hence, the force on it must come from an electric field! But what produces the electric field? Our current-carrying wire is supposed to be neutral!

Well… It turns out that our ‘neutral’ wire appears to be charged when moving. We’ll explain – in very much detail – why this is so later. Now, you should just note that “we should not attach too much reality to E and B, because they appear in different ‘mixtures’ in different coordinate systems”, as Feynman puts it. In fact, you may or may not have heard that magnetism is actually nothing but a “relativistic effect” of electricity. Well… That’s true, but we’ll also explain how that works later only. Let’s not jump the gun.

2. The remark above is related to the other ‘obvious’ thing I wanted to say before presenting Maxwell’s equations: fields are very useful to describe what’s going on but, when everything is said and done, what we really want to know is what force will be acting on a charge, because that’s what’s going to tell us how that charge is going to move. In other words, we want to find the equations of motion, and the force determines how the charge’s momentum will change: F = dp/dt = d(mv)/dt (i.e. Newton’s equation of motion).

So how does that work? We’ve given the formula before:

F = q(E + v×B) = qE + q(v×B)

This is a sum of two vectors:

1. qE is the ‘electric’ force: that force is in the same direction as the electric field, but with a magnitude equal to q times E. [Note I use a bold letter (E) for a vector (which we may define as some quantity with a direction) and a non-bold letter (E) for its magnitude.]
2. q(v×B) is the ‘magnetic’ force: that force depends on both v as well as on B. Its direction is given by the so-called right-hand rule for a vector cross-product (as opposed to a dot product, which is denoted by a dot (·) and which yields a scalar instead of a new vector).

That right-hand rule is illustrated below. Note that, if we switch a and b, the b×a vector will point downwards. The magnitude of v×B is given by |v×B| = |v||B|sinθ (with θ the angle between v and B), so the magnitude of the magnetic force is |q| times that.
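Here is a small Python sketch of the force law, just to see the cross product at work. The numbers are hypothetical (a proton-like charge moving along x through a magnetic field along z, with no electric field), and the helper names `cross` and `lorentz_force` are mine.

```python
import math

def cross(a, b):
    """Cross product a × b of two 3-component vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    """Magnitude of a vector."""
    return math.sqrt(sum(c * c for c in a))

def lorentz_force(q, E, v, B):
    """F = q(E + v × B), component by component."""
    v_cross_B = cross(v, B)
    return tuple(q * (E[i] + v_cross_B[i]) for i in range(3))

q = 1.602e-19          # charge in coulomb (roughly that of a proton)
E = (0.0, 0.0, 0.0)    # no electric field in this example
v = (1.0e5, 0.0, 0.0)  # velocity along x, in m/s
B = (0.0, 0.0, 0.5)    # magnetic field along z, in tesla

F = lorentz_force(q, E, v, B)
print(F)            # the force points in the negative y-direction
print(cross(B, v))  # swapping the two factors flips the direction
```

Because v and B are at right angles here, sinθ = 1 and the magnitude of v×B is just |v||B|.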

We know the direction of v (because we’re talking about some charge that is moving here) but what direction is B? It’s time to be a bit more systematic now.

Flux and circulation

In order to understand Maxwell’s equations, one needs to understand two concepts related to a vector field: flux and circulation. The two concepts are best illustrated referring to a vector field describing the flow of a liquid:

1. If we have a surface, the flux will give us the net amount of fluid going out through the surface per unit time. The illustration below (which I took from Feynman’s Lectures) gives us not only the general idea but a formal definition as well:

2. The concept of circulation is linked to the idea of some net rotational motion around some loop. In fact, that’s exactly what it describes. I’ll again use Feynman’s illustration (and description) because I couldn’t find anything better.

Diagram (a) gives us the velocity field in the liquid. Now, imagine a tube (of uniform cross section) that follows some arbitrary closed curve, like in (b), and then imagine we’d suddenly freeze the liquid everywhere except inside the tube: the liquid in the tube would circulate as shown in (c). Formally, the circulation is defined as:

circulation = (the average tangential component)·(the distance around)
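That definition translates almost literally into a few lines of Python. The sketch below is my own toy example, not Feynman’s: it takes a circular loop, averages the tangential component of a two-dimensional velocity field over many points on the loop, and multiplies by the distance around. The two sample fields (`rotating` and `uniform`) are assumptions chosen to show one case with circulation and one without.

```python
import math

def circulation(field, radius, n=10_000):
    """(average tangential component) x (distance around) for a circle around the origin."""
    total_tangential = 0.0
    for k in range(n):
        angle = 2 * math.pi * (k + 0.5) / n
        x, y = radius * math.cos(angle), radius * math.sin(angle)
        tx, ty = -math.sin(angle), math.cos(angle)  # unit tangent, counter-clockwise
        vx, vy = field(x, y)
        total_tangential += vx * tx + vy * ty
    average_tangential = total_tangential / n
    return average_tangential * 2 * math.pi * radius

def rotating(x, y):
    """Liquid rotating rigidly around the origin (angular velocity 1)."""
    return (-y, x)

def uniform(x, y):
    """Liquid flowing uniformly in the x-direction."""
    return (1.0, 0.0)

print(circulation(rotating, 2.0))  # non-zero: 2π times the radius squared
print(circulation(uniform, 2.0))   # essentially zero: no net rotation
```

The uniform flow has plenty of motion but no net rotation around the loop, so its circulation vanishes.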

OK. So far, so good. Back to electromagnetism.

E and B

We’re familiar with the electric field E from our high school physics course. Indeed, you’ll probably recognize the two examples below: (a) a (positive) charge near a (neutral) conducting sheet, and (b) two opposite charges next to each other. Note the convention: the field lines emanate from the positive charge. Does that mean that the force is in that direction too? Yes. But remember: if a particle is attracted to another, the latter particle is attracted to the former too! So there’s a force in both directions!

What more can we say about this? Well… It is clear that the field E is directed radially. In terms of our flux and circulation concepts, we say that there’s an outgoing flux from the (positive) point charge. Furthermore, it would seem to be pretty obvious (we’d need to show why, but we won’t do that here: just look at Coulomb’s Law once again) that the flux should be proportional to the charge, and it is: if we double the charge, the flux doubles too. That gives us Maxwell’s first equation:

flux of E through a closed surface = (the net charge inside)/ε0

Note we’re talking a closed surface here, like a sphere for example–but it does not have to be a nice symmetric shape: Maxwell’s first equation is valid for any closed surface. The expression above is Coulomb’s Law, which you’ll also surely remember from your high school physics course: while it looks very different, it’s the same. It’s just because we’re using that flux concept here that we seem to be getting an entirely different expression. But so we’re not: it’s the same as Coulomb’s Law.

As for the ε0 factor, that’s just a constant that depends on the units we’re using to measure what we write above, so don’t worry about it. [I am noting it here because you’ll see it pop up later too.]
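If you want to see the ‘any closed surface’ claim at work, here is a rough numerical sketch. It is an illustration under my own assumptions: a one-nanocoulomb point charge, a unit sphere as the closed surface, and a simple midpoint rule for the surface integral. The charge is deliberately placed off-centre, so the field on the surface is not uniform at all, and then moved outside the sphere altogether.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant in F/m

def coulomb_field(q, source, point):
    """Electric field of a point charge q located at source, evaluated at point."""
    d = [p - s for p, s in zip(point, source)]
    r = math.sqrt(sum(c * c for c in d))
    k = q / (4 * math.pi * EPS0 * r ** 3)
    return [k * c for c in d]

def flux_through_sphere(q, source, radius=1.0, n=200):
    """Midpoint-rule estimate of the flux of E through a sphere centred on the origin."""
    d_theta = math.pi / n
    d_phi = math.pi / n  # 2n steps around the full circle
    flux = 0.0
    for i in range(n):
        theta = (i + 0.5) * d_theta
        for j in range(2 * n):
            phi = (j + 0.5) * d_phi
            normal = (math.sin(theta) * math.cos(phi),
                      math.sin(theta) * math.sin(phi),
                      math.cos(theta))
            point = [radius * c for c in normal]
            E = coulomb_field(q, source, point)
            area = radius ** 2 * math.sin(theta) * d_theta * d_phi
            flux += sum(e * m for e, m in zip(E, normal)) * area
    return flux

q = 1e-9  # one nanocoulomb (hypothetical value)
inside = flux_through_sphere(q, (0.3, 0.0, 0.0))   # charge inside, off-centre
outside = flux_through_sphere(q, (2.0, 0.0, 0.0))  # charge outside the sphere
print(inside, q / EPS0)  # the two numbers should nearly agree
print(outside)           # should be close to zero by comparison
```

With the charge inside, the flux comes out as q/ε0 wherever we put it; with the charge outside, what goes in on one side comes out on the other, and the net flux is zero.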

For B, we’ve got a similar-looking law:

flux of B through a closed surface = 0 (= zero = nil)

That’s not the same, you’ll say. Well… Yes and no. It’s the same really, but the zero on the right-hand side of the expression above says there’s no such thing as a ‘magnetic’ charge.

Hmm… But… If we can’t create any flux of B, because ‘magnetic charges’ don’t exist, how do we get magnetic fields then?

Well… We wrote that above already, and you should remember it from your high school physics course as well: a magnetic field is created by (1) a moving charge (i.e. a flow or flux of electric current) or (2) a changing electric field.

Situation (1) is illustrated below: the current in the wire creates some circulation of B around the wire. How much? Not much: the magnetic effect is very small as compared to the electric effect (that has to do with magnetism being a relativistic effect of electricity but, as mentioned above, I’ll explain that later only). To be precise, the equation is the following:

c²(circulation of B) = (flux of electric current)/ε0

That c² factor on the left-hand side becomes 1/c² if we move it to the other side and, yes, c is the speed of light here – so you can see we’re talking a very small amount of circulation only indeed! [As for the ε0 factor, that’s just the same constant: it’s got to do with the units we’re using to measure stuff.]
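To get a feel for how small ‘very small’ is, we can put in some numbers. The sketch below is my own back-of-the-envelope example: it assumes a long straight wire and a circular loop of radius r around it, so that the circulation is just B·2πr, and the current (1 ampere) and distance (1 cm) are hypothetical values.

```python
import math

EPS0 = 8.8541878128e-12  # electric constant in F/m
C = 299_792_458.0        # speed of light in m/s

def field_around_wire(current, distance):
    """Magnitude of B at a given distance from a long straight wire.

    From c²·(circulation of B) = current/ε0, with the circulation taken
    around a circle of radius `distance`: B·2π·distance = current/(ε0·c²).
    """
    circulation = current / (EPS0 * C ** 2)
    return circulation / (2 * math.pi * distance)

print(field_around_wire(1.0, 0.01))  # about 2e-5 tesla for 1 A at 1 cm
```

The 1/c² does the damage here: 1/(ε0·c²) is only about 1.26 × 10⁻⁶ in SI units.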

One last point perhaps: what’s the direction of the circulation? Well… There’s a so-called right-hand grip rule for that, which is illustrated below.

OK. Enough about this. Let’s go to situation (2): a changing electric field. That effect is usually illustrated with Faraday’s original 1831 experiment, which is shown below with a more modern voltmeter 🙂 : when the wire on one side of the iron ring is connected to the battery, we’ll see a transient current on the other side. It’s transient only, so the current quickly disappears. That’s why transformers don’t work with DC. In fact, it is said that Faraday was quite disappointed to see that the current didn’t last! Likewise, when the wire is disconnected, we’ll briefly see another transient current.

So this effect is due to the changing electric field, which causes a changing magnetic field. But so where is that magnetic field? We’re talking currents here, aren’t we? Yes, you’re right. To understand why we have a transient current in the voltmeter, you need to understand yet another effect: a changing magnetic field causes an electric field, and so that’s what actually generates the transient current. However, what’s going on in the iron ring is the magnetic effect, and so that’s caused by the changing electric field as we connect/disconnect the battery to the wire. Capito?

I guess so… So what’s the equation that captures this situation, i.e. situation (2)? That equation involves both flux and circulation, so we’ll have a surface (S) as well as a curve (C). The equation is the following one: for any surface S (not closed this time because, if the surface was closed, it wouldn’t have an edge!), we have:

c²(circulation of B around C) = d(flux of E through S)/dt

I mentioned above that the reverse is also true. A changing magnetic field causes an electric field, and the equation for that looks very similar, except that we don’t have the c² factor (and that there is a minus sign):

circulation of E around C = −d(flux of B through S)/dt

Let me quickly mention the presence or absence of that c² or 1/c² factor in the previous equations once again. It is interesting. It’s got nothing to do with the units. It’s really a proportionality factor: any change in E will only cause a little change in B (because of the 1/c² factor in the first equation), but the reverse is not true: there’s no c² in the second equation. Again, it’s got to do with magnetism being a relativistic effect of electricity, so the magnetic effect is, in most cases, tiny as compared to the electric effect, except when we’re talking charges that are moving at relativistic speeds (i.e. speeds close to c). As said, we’ll come back to that–later, much later. Let’s get back to Maxwell’s equations first.

Maxwell’s equations

We can now combine all of the equations above in one set, and so these are Maxwell’s four famous equations:

1. The flux of E through a closed surface = (the net charge inside)/ε0
2. The circulation of E around C = −d(flux of B through S)/dt (with C the curve or edge around S)
3. The flux of B through a closed surface = 0
4. c²(circulation of B around C) = d(flux of E through S)/dt + (flux of electric current)/ε0

From a mathematical point of view, this is a set of differential equations, and they are not easy to grasp intuitively. As Feynman puts it: “The laws of Newton were very simple to write down, but they had a lot of complicated consequences and it took us a long time to learn about them all. These laws are not nearly as simple to write down, which means that the consequences are going to be more elaborate and it will take us quite a lot of time to figure them all out.”

Indeed, Feynman needs about twenty (!) Lectures in that second Volume to show what it all implies, as he walks us through electrostatics, magnetostatics and various other ‘special’ cases before giving us the ‘complete’ or ‘general’ solution to the equations. This ‘general’ solution, in mathematical notation, is the following:

Huh? What’s that? Well… The four equations are the equations we explained already, but this time in mathematical notation: flux and circulation can be expressed much more elegantly using the differential operator ∇ indeed. As for the solutions to Maxwell’s set of equations, you can see they are expressed using two other concepts: the scalar potential Φ and the vector potential A.

Now, it is not my intention to summarize two dozen of Feynman’s Lectures in just a few lines, so I’ll have to leave you here for the moment.

[…]

Huh? What? What about my promise to show that magnetism is a relativistic effect of electricity indeed?

Well… I wanted to do that just now, but when I look at it, I realize that I’d end up copying most of Feynman’s little exposé on it and, hence, I’ll just refer you to that particular section. It’s really quite exciting but – as you might expect – it does take a bit of time to wrestle through it.

That being said, it really does give you a kind of an Aha-Erlebnis and, therefore, I really warmly recommend it! Just click on the link! 🙂