# Magnetostatics: the vector potential

This and the next posts are supposed to wrap up a few loose ends on magnetism. One of these loose ends is the (magnetic) vector potential, which we introduced in our post on gauge transformations, but then we didn’t do much with it. Another topic I neglected so far is that of the magnetic dipole moment (as opposed to the electric dipole moment), which is an extremely important concept both in classical as well as in quantum mechanics. So let’s do the vector potential here, and the magnetic dipole moment in the next. 🙂

Let’s go for it. Let me recall the basics which, as usual, are just Maxwell’s equations. You’ll remember that the electrostatic field was curl-free: ×= 0, everywhere. Therefore, we can apply the following mathematical theorem: if the curl of a vector field is zero (everywhere), then the vector field can be represented as the gradient of some scalar function:

if ×= 0, then there is some Ψ for which CΨ

Substituting C for E, and taking into account our conventions on charge and the direction of flow, we wrote:

E = –Φ

Φ (phi) is referred to as the electric potential. Combining E = –Φ with Gauss’ Law – E = ρ/ε0 − we got Poisson’s equation:

2Φ = −ρ/ε0

So that equation sums up all of electrostatics. Really: that’s it! 🙂

Now, the two equations for magnetostatics are: B = 0 and c2×B = j0. Let me say something more about them:

1. TheB = 0 equation is true, always, unlike the ×E = 0 expression, which is true for electrostatics only (no moving charges).
2. TheB = 0 equation says the divergence of B is zero, always. Now, you can verify for yourself that the divergence of the curl of a vector field is always zero, so div (curl A) = •(×A) = 0, always. Therefore, there’s another theorem that we can apply. It says the following: if the divergence of a vector field, say D, is zero – so if D = 0, then D will be the the curl of some other vector field C, so we can write: D×C.  Applying this to B = 0, we can write:

If B = 0, then there is an A such that B×A

We can also write this as follows:·B = ·(×A) = 0 and, hence, B×A. Now, it’s this vector field A that is referred to as the (magnetic) vector potential, and so that’s what we want to talk about here. As a start, it may be good to write all of the components of our B×A vector:

Note that we have no ‘time component’ because we assume the fields are static, so they do not change with time. Now, because that’s a relatively simple situation, you may wonder whether we really simplified anything with this vector potential. B is a vector with three components, and so is A. The answer to that question is somewhat subtle, and similar to what we did for electrostatics: it’s mathematically convenient to use A, and then calculate the derivatives above to find B. So the number of components doesn’t matter really: it’s just more convenient to first get A using our data on the currents j, and then we get B from A.

That’s it really. Let me show you how it works. The whole argument is somewhat lengthy, but it’s not difficult, and once it’s done, it’s done. So just carry on and please bear with me 🙂

First, we need to put some constraints on A, because the B×A equation does not fully define A. It’s like the scalar potential Φ: any Φ’ = Φ + C was as good a choice as Φ (with C any constant), so we needed a reference point Φ = 0, which we usually took at infinity. With the vector potential A, we have even more latitude: we can not only add a constant but any field which is the gradient of some scalar field, so any A’ = A + Ψ will do. Why? Just write it all out: ×(A + Ψ) = ×A + ×(Ψ). But the curl of the gradient of a scalar field (or a scalar function) is always zero (you can check my post on vector calculus on this), so ×(Ψ) = 0 and so ×(A + Ψ) = ×A + ×(Ψ) = ×A + 0 = ×A = B.

So what constraints should we put on our choice of A? The choice is, once again, based on mathematical convenience: in magnetostatics, we’ll choose A such that A = 0. Can we do that? Yes. The A’ = A + Ψ flexibility allows us to make A’ anything we wish, and so A and A’ will have the same curl, but they don’t need to have the same divergence. So we can choose an A’ so A’ = 0, and then we denote A’ by A. 🙂 So our ‘definition’ of the vector potential A is now:

B×A and A = 0

I have to make two points here:

1. First, you should note that, in my post on gauges, I mentioned that the choice is different when the time derivatives of E and B are not equal to zero, so when we’re talking changing currents and charge distributions, so that’s dynamics. However, that’s not a concern here.
2. To be fully complete, I should note that the ‘definition’ above does still not uniquely determine A. For a unique specification, we also need some reference point, or say how the field behaves on some boundary, or at large distances. It is usually convenient to choose a field which goes to zero at large distances, just like our electric potential.

Phew! We’ve said so many things about A now, but nothing that has any relevance to how we’d calculate A. 😦 So we are we heading here?

Fortunately, we can go a bit faster now. The c2×B = j0 equation and our B×A give us:

c2×(×A) = j0

Now, there’s this other vector identity, which you surely won’t remember either—but trust me: I am not lying: ×(×A) = (∇•A) − ∇2A. So, now you see why we choose A such that ∇•A = 0 ! It allows us to write:

c2×(×A) = − c22Aj⇔ ∇2A = –j0c2

Now, the three components of ∇2A = –j0care, of course:

As you can see, each of these three equations is mathematically identical to that Poisson equation: ∇2Φ = − ρ/ε0. So all that we learned about solving for potentials when ρ is known can now be used to solve for each component of A when j is known. Now, to calculate Φ, we used the following integral:

Simply substituting symbols then gives us the solution for Ax:

We have a similar integral for Ay and Az, of course, and we can combine the three equations in vector form:

Finally, and just in case you wonder what is what, there’s the illustration below (taken from Feynman’s Lecture on this topic here) that, hopefully, will help you to make sense of it all.

At this point, you’re probably tired of these formulas (or asleep) or (if you’re not asleep) wondering what they mean really, so let’s do two examples. Of course, you won’t be surprised that we’ll be talking a straight wire and a solenoid respectively once again. 🙂

The magnetic field of a straight wire

We already calculated the magnetic field of a straight wire, using Ampère’s Law and the symmetry of the situation, in our previous post on magnetostatics. We got the following formula:

Do we get the same using those formulas for A and then doing our derivations to get B? We should, and we do, but I’ll be lazy here and just refer you to the relevant section in Feynman’s Lecture on it, because the solenoid stuff is much more interesting. 🙂

The magnetic field of a solenoid

In the mentioned post on magnetostatics, we also derived a formula for the magnetic field inside a solenoid. We got:

with n the number of turns per unit length of the solenoid, and I the current going through it. However, in the mentioned post, we assumed that the magnetic field outside of the solenoid was zero, for all practical purposes, but it is not. It is very weak but not zero, as shown below. In fact, it’s fairly strong at very short distances from the solenoid! Calculating the vector potential allows us to calculate its exact value, everywhere. So let’s go for it.

The relevant quantities are shown in the illustration below. So we’ve got a very long solenoid here once again, with n turns of wire per unit length and, therefore, a circumferential current on the surface of n·I per unit length (the slight pitch of the winding is being neglected).

Now, just like that surface charge density ρ in electrostatics, we have a ‘surface current density’ J here, which we define as J = n·I. So we’re going from a scalar to a vector quantity, and the components of J are:

Jx = –J·sinϕ, Jy = –J·cosϕ, Jz = 0

So how do we do this? As should be clear from the whole development above, the principle is that the x-component of the vector potential arising from a current density j is the same as the electric potential Φ that would be produced by a charge density ρ equal to jx divided by c2, and similarly for the y- and z-components. Huh? Yes. Just read it a couple of times and think about it: we should imagine some cylinder with a surface charge ρ = –(J/c2)·sinϕ to calculate Ax. And then we equate ρ with –(J/c2)·cosϕ and zero respectively to find Ay and Az.

Now, that sounds pretty easy but Feynman’s argument is quite convoluted here, so I’ll just skip it (click the link here if you’d want to see it) and give you the final result, i.e. the magnitude of A:

Of course, you need to interpret the result above with the illustration, which shows that A is always perpendicular to r’. [In case you wonder why we write r’ (so r with a prime) and not r, that’s to make clear we’re talking the distance from the z-axis, so it’s not the distance from the origin.]

Now, you may think that c2 in the denominator explains the very weak field, but it doesn’t: it’s the inverse proportionality to r’ that makes the difference! Indeed, you should compare the formula above with the result we get for the vector potential inside of the solenoid, which is equal to:

The illustration below shows the quantities involved. Note that we’re talking a uniform magnetic field here, along the z-axis, which has the same direction as Band, hence, is pointing towards you as you look at the illustration, which is why you don’t see the B0 field lines and/or the z-axis: they’re perpendicular to your computer screen, so to speak.

As for the direction of A, it’s shown on the illustration, of course, but let me remind you of the right-hand rule for the vector cross product a×b once again, so you can make sense of the direction of A = (1/2)B0×r’ indeed:

Also note the magnitude this formula implies: a×b = |a|·|b|·sinθ·n, with θ the angle between a and b, and the normal unit vector in the direction given by that right-hand rule above. Now, unlike a vector dot product, the magnitude of the vector cross product is not zero for perpendicular vectors. In fact, when θ = π/2, which is the case for Band r’, then sinθ = 1, and, hence, we can write:

|A| = A = (1/2)|B0||r’| = (1/2)·B0·r’

Now, just substitute Bfor B= n·I/ε0c2, which is the field inside the solenoid, then you get:

A = (1/2)·n·I·r’/ε0c2

You should compare this formula with the formula for A outside the solenoid, so you can draw the right conclusions. Note that both formulas incorporate the same (1/2)·n·I/ε0cfactor. The difference, really, is that inside the solenoid, A is proportional to r’ (as shown in the illustration: if r’ doubles, triples etcetera, then A will double, triple etcetera too) while, outside of the solenoid, A is inversely proportional to r’. In addition, outside the solenoid, we have the afactor, which doesn’t matter inside. Indeed, the radius of the solenoid (i.e. a) changes the flux, which is the product of B and the cross-section area π·a2, but not B itself.

Let’s do a quick check to see if the formula makes sense. We do not want A to be larger outside of the solenoid than inside, obviously, so the a2/r’ factor should be smaller than r’ for r’ > a. Now, a2/r’ < r’ if a2 < r’2, and because a an r’ are both positive real numbers, that’s the case if r’ > a indeed. So we’ve got something that resembles the electric field inside and outside of a uniformly charged sphere, except that A decreases as 1/r’ rather than as 1/r’2, as shown below.

Hmm… That’s all stuff to think about… The thing you should take home from all of this is the following:

1. A (uniform) magnetic field B in the z-direction corresponds to a vector potential A that rotates about the z-axis with magnitude A = B0·r’/2 (with r’ the displacement from the z-axis, not from the origin—obviously!). So that gives you the A inside of a solenoid. The magnitude is A = (1/2)·n·I·r’/ε0c2, so A is proportional with r’.
2. Outside of the solenoid, A‘s magnitude (i.e. A) is inversely proportional to the distance r’, and it’s given by the formula: A = (1/2)·n·I·a20c2·r’. That’s, of course, consistent with the magnetic field diminishing with distance there. But remember: contrary to what you’ve been taught or what you often read, it is not zero. It’s only near zero if r’ >> a.

Alright. Done. Next post. So that’s on the magnetic dipole moment 🙂

# Maxwell, Lorentz, gauges and gauge transformations

I’ve done quite a few posts already on electromagnetism. They were all focused on the math one needs to understand Maxwell’s equations. Maxwell’s equations are a set of (four) differential equations, so they relate some function with its derivatives. To be specific, they relate E and B, i.e. the electric and magnetic field vector respectively, with their derivatives in space and in time. [Let me be explicit here: E and B have three components, but depend on both space as well as time, so we have three dependent and four independent variables for each function: E = (Ex, Ey, Ez) = E(x, y, z, t) and B = (Bx, By, Bz) = B(x, y, z, t).] That’s simple enough to understand, but the dynamics involved are quite complicated, as illustrated below.

I now want to do a series on the more interesting stuff, including an exploration of the concept of gauge in field theory, and I also want to show how one can derive the wave equation for electromagnetic radiation from Maxwell’s equations. Before I start, let’s recall the basic concept of a field.

The reality of fields

I said a couple of time already that (electromagnetic) fields are real. They’re more than just a mathematical structure. Let me show you why. Remember the formula for the electrostatic potential caused by some charge q at the origin:

We know that the (negative) gradient of this function, at any point in space, gives us the electric field vector at that point: E = –Φ. [The minus sign is there because of convention: we take the reference point Φ = 0 at infinity.] Now, the electric field vector gives us the force on a unit charge (i.e. the charge of a proton) at that point. If q is some positive charge, the force will be repulsive, and the unit charge will accelerate away from our q charge at the origin. Hence, energy will be expended, as force over distance implies work is being done: as the charges separate, potential energy is converted into kinetic energy. Where does the energy come from? The energy conservation law tells us that it must come from somewhere.

It does: the energy comes from the field itself. Bringing in more or bigger charges (from infinity, or just from further away) requires more energy. So the new charges change the field and, therefore, its energy. How exactly? That’s given by Gauss’ Law: the total flux out of a closed surface is equal to:

You’ll say: flux and energy are two different things. Well… Yes and no. The energy in the field depends on E. Indeed, the formula for the energy density in space (i.e. the energy per unit volume) is

Getting the energy over a larger space is just another integral, with the energy density as the integral kernel:

Feynman’s illustration below is not very sophisticated but, as usual, enlightening. 🙂

Gauss’ Theorem connects both the math as well as the physics of the situation and, as such, underscores the reality of fields: the energy is not in the electric charges. The energy is in the fields they produce. Everything else is just the principle of superposition of fields –  i.e. E = E+ E– coming into play. I’ll explain Gauss’ Theorem in a moment. Let me first make some additional remarks.

First, the formulas are valid for electrostatics only (so E and B only vary in space, not in time), so they’re just a piece of the larger puzzle. 🙂 As for now, however, note that, if a field is real (or, to be precise, if its energy is real), then the flux is equally real.

Second, let me say something about the units. Field strength (E or, in this case, its normal component En = E·n) is measured in newton (N) per coulomb (C), so in N/C. The integral above implies that flux is measured in (N/C)·m2. It’s a weird unit because one associates flux with flow and, therefore, one would expect flux is some quantity per unit time and per unit area, so we’d have the m2 unit (and the second) in the denominator, not in the numerator. But so that’s true for heat transfer, for mass transfer, for fluid dynamics (e.g. the amount of water flowing through some cross-section) and many other physical phenomena. But for electric flux, it’s different. You can do a dimensional analysis of the expression above: the sum of the charges is expressed in coulomb (C), and the electric constant (i.e. the vacuum permittivity) is expressed in C2/(N·m2), so, yes, it works: C/[C2/(N·m2)] = (N/C)·m2. To make sense of the units, you should think of the flux as the total flow, and of the field strength as a surface density, so that’s the flux divided by the total area, so (field strength) = (flux)/(area). Conversely, (flux) = (field strength)×(area). Hence, the unit of flux is [flux] = [field strength]×[area] = (N/C)·m2.

OK. Now we’re ready for Gauss’ Theorem. 🙂 I’ll also say something about its corollary, Stokes’ Theorem. It’s a bit of a mathematical digression but necessary, I think, for a better understanding of all those operators we’re going to use.

Gauss’ Theorem

The concept of flux is related to the divergence of a vector field through Gauss’ Theorem. Gauss’s Theorem has nothing to do with Gauss’ Law, except that both are associated with the same genius. Gauss’ Theorem is:

The ·C in the integral on the right-hand side is the divergence of a vector field. It’s the volume density of the outward flux of a vector field from an infinitesimal volume around a given point.

Huh? What’s a volume density? Good question. Just substitute C for E in the surface and volume integral above (the integral on the left is a surface integral, and the one on the right is a volume integral), and think about the meaning of what’s written. To help you, let me also include the concept of linear density, so we have (1) linear, (2) surface and (3) volume density. Look at that representation of a vector field once again: we said the density of lines represented the magnitude of E. But what density? The representation hereunder is flat, so we can think of a linear density indeed, measured along the blue line: so the flux would be six (that’s the number of lines), and the linear density (i.e. the field strength) is six divided by the length of the blue line.

However, we defined field strength as a surface density above, so that’s the flux (i.e. the number of field lines) divided by the surface area (i.e. the area of a cross-section): think of the square of the blue line, and field lines going through that square. That’s simple enough. But what’s volume density? How do we count the number of lines inside of a box? The answer is: mathematicians actually define it for an infinitesimally small cube by adding the fluxes out of the six individual faces of an infinitesimally small cube:

So, the truth is: volume density is actually defined as a surface density, but for an infinitesimally small volume element. That, in turn, gives us the meaning of the divergence of a vector field. Indeed, the sum of the derivatives above is just ·C (i.e. the divergence of C), and ΔxΔyΔz is the volume of our infinitesimal cube, so the divergence of some field vector C at some point P is the flux – i.e. the outgoing ‘flow’ of Cper unit volume, in the neighborhood of P, as evidenced by writing

Indeed, just bring ΔV to the other side of the equation to check the ‘per unit volume’ aspect of what I wrote above. The whole idea is to determine whether the small volume is like a sink or like a source, and to what extent. Think of the field near a point charge, as illustrated below. Look at the black lines: they are the field lines (the dashed lines are equipotential lines) and note how the positive charge is a source of flux, obviously, while the negative charge is a sink.

Now, the next step is to acknowledge that the total flux from a volume is the sum of the fluxes out of each part. Indeed, the flux through the part of the surfaces common to two parts will cancel each other out. Feynman illustrates that with a rough drawing (below) and I’ll refer you to his Lecture on it for more detail.

So… Combining all of the gymnastics above – and integrating the divergence over an entire volume, indeed –  we get Gauss’ Theorem:

Stokes’ Theorem

There is a similar theorem involving the circulation of a vector, rather than its flux. It’s referred to as Stokes’ Theorem. Let me jot it down:

We have a contour integral here (left) and a surface integral (right). The reasoning behind is quite similar: a surface bounded by some loop Γ is divided into infinitesimally small squares, and the circulation around Γ is the sum of the circulations around the little loops. We should take care though: the surface integral takes the normal component of ×C, so that’s (×C)n = (×Cn. The illustrations below should help you to understand what’s going on.

The electric versus the magnetic force

There’s more than just the electric force: we also have the magnetic force. The so-called Lorentz force is the combination of both. The formula, for some charge q in an electromagnetic field, is equal to:

Hence, if the velocity vector v is not equal to zero, we need to look at the magnetic field vector B too! The simplest situation is magnetostatics, so let’s first have a look at that.

Magnetostatics imply that that the flux of E doesn’t change, so Maxwell’s third equation reduces to c2×B = j0. So we just have a steady electric current (j): no accelerating charges. Maxwell’s fourth equation, B = 0, remains what is was: there’s no such thing as a magnetic charge. The Lorentz force also remains what it is, of course: F = q(E+v×B) = qE +qv×B. Also note that the v, j and the lack of a magnetic charge all point to the same: magnetism is just a relativistic effect of electricity.

What about units? Well… While the unit of E, i.e. the electric field strength, is pretty obvious from the F = qE term  – hence, E = F/q, and so the unit of E must be [force]/[charge] = N/C – the unit of the magnetic field strength is more complicated. Indeed, the F = qv×B identity tells us it must be (N·s)/(m·C), because 1 N = 1C·(m/s)·(N·s)/(m·C). Phew! That’s as horrendous as it looks, and that’s why it’s usually expressed using its shorthand, i.e. the tesla: 1 T = 1 (N·s)/(m·C). Magnetic flux is the same concept as electric flux, so it’s (field strength)×(area). However, now we’re talking magnetic field strength, so its unit is T·m= (N·s·m)/(m·C) = (N·s·m)/C, which is referred to as the weber (Wb). Remembering that 1 volt = 1 N·m/C, it’s easy to see that a weber is also equal to 1 Wb = 1 V·s. In any case, it’s a unit that is not so easy to interpret.

Magnetostatics is a bit of a weird situation. It assumes steady fields, so the ∂E/∂t and ∂B/∂t terms in Maxwell’s equations can be dropped. In fact, c2×B = j0 implies that ·(c2×B ·(j0) and, therefore, that ·= 0. Now, ·= –∂ρ/∂t and, therefore, magnetostatics is a situation which assumes ∂ρ/∂t = 0. So we have electric currents but no change in charge densities. To put it simply, we’re not looking at a condenser that is charging or discharging, although that condenser may act like the battery or generator that keeps the charges flowing! But let’s go along with the magnetostatics assumption. What can we say about it? Well… First, we have the equivalent of Gauss’ Law, i.e. Ampère’s Law:

We have a line integral here around a closed curve, instead of a surface integral over a closed surface (Gauss’ Law), but it’s pretty similar: instead of the sum of the charges inside the volume, we have the current through the loop, and then an extra c2 factor in the denominator, of course. Combined with the B = 0 equation, this equation allows us to solve practical problems. But I am not interested in practical problems. What’s the theory behind?

The magnetic vector potential

TheB = 0 equation is true, always, unlike the ×E = 0 expression, which is true for electrostatics only (no moving charges). It says the divergence of B is zero, always, and, hence, it means we can represent B as the curl of another vector field, always. That vector field is referred to as the magnetic vector potential, and we write:

·B = ·(×A) = 0 and, hence, B×A

In electrostatics, we had the other theorem: if the curl of a vector field is zero (everywhere), then the vector field can be represented as the gradient of some scalar function, so if ×= 0, then there is some Ψ for which CΨ. Substituting C for E, and taking into account our conventions on charge and the direction of flow, we get E = –Φ. Substituting E in Maxwell’s first equation (E = ρ/ε0) then gave us the so-called Poisson equation: ∇2Φ = ρ/ε0, which sums up the whole subject of electrostatics really! It’s all in there!

Except magnetostatics, of course. Using the (magnetic) vector potential A, all of magnetostatics is reduced to another expression:

2A= −j0, with ·A = 0

Note the qualifier: ·A = 0. Why should the divergence of A be equal to zero? You’re right. It doesn’t have to be that way. We know that ·(×C) = 0, for any vector field C, and always (it’s a mathematical identity, in fact, so it’s got nothing to do with physics), but choosing A such that ·A = 0 is just a choice. In fact, as I’ll explain in a moment, it’s referred to as choosing a gauge. The·A = 0 choice is a very convenient choice, however, as it simplifies our equations. Indeed, c2×B = j0 = c2×(×A), and – from our vector calculus classes – we know that ×(×C) = (·C) – ∇2C. Combining that with our choice of A (which is such that ·A = 0, indeed), we get the ∇2A= −j0 expression indeed, which sums up the whole subject of magnetostatics!

The point is: if the time derivatives in Maxwell’s equations, i.e. ∂E/∂t and ∂B/∂t, are zero, then Maxwell’s four equations can be nicely separated into two pairs: the electric and magnetic field are not interconnected. Hence, as long as charges and currents are static, electricity and magnetism appear as distinct phenomena, and the interdependence of E and B does not appear. So we re-write Maxwell’s set of four equations as:

1. ElectrostaticsE = ρ/ε0 and ×E = 0
2. Magnetostatics: ×B = j/c2ε0 and B = 0

Note that electrostatics is a neat example of a vector field with zero curl and a given divergence (ρ/ε0), while magnetostatics is a neat example of a vector field with zero divergence and a given curl (j/c2ε0).

Electrodynamics

But reality is usually not so simple. With time-varying fields, Maxwell’s equations are what they are, and so there is interdependence, as illustrated in the introduction of this post. Note, however, that the magnetic field remains divergence-free in dynamics too! That’s because there is no such thing as a magnetic charge: we only have electric charges. So ·B = 0 and we can define a magnetic vector potential A and re-write B as B×A, indeed.

I am writing a vector potential field because, as I mentioned a couple of times already, we can choose A. Indeed, as long as ·A = 0, it’s fine, so we can add curl-free components to the magnetic potential: it won’t make a difference. This condition is referred to as gauge invariance. I’ll come back to that, and also show why this is what it is.

While we can easily get B from A because of the B×A, getting E from some potential is a different matter altogether. It turns out we can get E using the following expression, which involves both Φ (i.e. the electric or electrostatic potential) as well as A (i.e. the magnetic vector potential):

E = –Φ – ∂A/∂t

Likewise, one can show that Maxwell’s equations can be re-written in terms of Φ and A, rather than in terms of E and B. The expression looks rather formidable, but don’t panic:

Just look at it. We have two ‘variables’ here (Φ and A) and two equations, so the system is fully defined. [Of course, the second equation is three equations really: one for each component x, y and z.] What’s the point? Why would we want to re-write Maxwell’s equations? The first equation makes it clear that the scalar potential (i.e. the electric potential) is a time-varying quantity, so things are not, somehow, simpler. The answer is twofold. First, re-writing Maxwell’s equations in terms of the scalar and vector potential makes sense because we have (fairly) easy expressions for their value in time and in space as a function of the charges and currents. For statics, these expressions are:

So it is, effectively, easier to first calculate the scalar and vector potential, and then get E and B from them. For dynamics, the expressions are similar:

Indeed, they are like the integrals for statics, but with “a small and physically appealing modification”, as Feynman notes: when doing the integrals, we must use the so-called retarded time t′ = t − r12/ct’. The illustration below shows how it works: the influences propagate from point (2) to point (1) at the speed c, so we must use the values of ρ and j at the time t′ = t − r12/ct’ indeed!

The second aspect of the answer to the question of why we’d be interested in Φ and A has to do with the topic I wanted to write about here: the concept of a gauge and a gauge transformation.

Gauges and gauge transformations in electromagnetics

Let’s see what we’re doing really. We calculate some A and then solve for B by writing: B = ×A. Now, I say some A because any A‘ = AΨ, with Ψ any scalar field really. Why? Because the curl of the gradient of Ψ – i.e. curl(gradΨ) = ×(Ψ) – is equal to 0. Hence, ×(AΨ) = ×A×Ψ = ×A.

So we have B, and now we need E. So the next step is to take Faraday’s Law, which is Maxwell’s second equation: ×E = –∂B/∂t. Why this one? It’s a simple one, as it does not involve currents or charges. So we combine this equation and our B = ×A expression and write:

×E = –∂(∇×A)/∂t

Now, these operators are tricky but you can verify this can be re-written as:

×(E + ∂A/∂t) = 0

Looking carefully, we see this expression says that E + ∂A/∂t is some vector whose curl is equal to zero. Hence, this vector must be the gradient of something. When doing electrostatics, When we worked on electrostatics, we only had E, not the ∂A/∂t bit, and we said that E tout court was the gradient of something, so we wrote E = −Φ. We now do the same thing for E + ∂A/∂t, so we write:

E + ∂A/∂t = −Φ

So we use the same symbol Φ but it’s a bit of a different animal, obviously. However, it’s easy to see that, if the ∂A/∂t would disappear (as it does in electrostatics, where nothing changes with time), we’d get our ‘old’ −Φ. Now, E + ∂A/∂t = −Φ can be written as:

E = −Φ – ∂A/∂t

So, what’s the big deal? We wrote B and E as a function of Φ and A. Well, we said we could replace A by any A‘ = AΨ but, obviously, such substitution would not yield the same E. To get the same E, we need some substitution rule for Φ as well. Now, you can verify we will get the same E if we’d substitute Φ for Φ’ = Φ – ∂Ψ/∂t. You should check it by writing it all out:

E = −Φ’–∂A’/∂t = −(Φ–∂Ψ/∂t)–∂(A+Ψ)/∂t

= −Φ+(∂Ψ/∂t)–∂A/∂t–∂(Ψ)/∂t = −Φ – ∂A/∂t = E

Again, the operators are a bit tricky, but the +(∂Ψ/∂t) and –∂(Ψ)/∂t terms do cancel out. Where are we heading to? When everything is said and done, we do need to relate it all to the currents and the charges, because that’s the real stuff out there. So let’s take Maxwell’s E = ρ/ε0 equation, which has the charges in it, and let’s substitute E for E = −Φ – ∂A/∂t. We get:

That equation can be re-written as:

So we have one equation here relating Φ and A to the sources. We need another one, and we also need to separate Φ and A somehow. How do we do that?

Maxwell’s fourth equation, i.e. c2×B = j+ ∂E/∂t can, obviously, be written as c2×− E/∂t = j0. Substituting both E and B yields the following monstrosity:

We can now apply the general ∇×(×C) = (·C) – ∇2C identity to the first term to get:

It’s equally monstrous, obviously, but we can simplify the whole thing by choosing Φ and A in a clever way. For the magnetostatic case, we chose A such that ·A = 0. We could have chosen something else. Indeed, it’s not because B is divergence-free, that A has to be divergence-free too! For example, I’ll leave it to you to show that choosing ·A such that

also respects the general condition that any A and Φ we choose must respect the A‘ = AΨ and Φ’ = Φ – ∂Ψ/∂t equalities. Now, if we choose ·A such that ·A = −c–2·∂Φ/∂t indeed, then the two middle terms in our monstrosity cancel out, and we’re left with a much simpler equation for A:

In addition, doing the substitution in our other equation relating Φ and A to the sources yields an equation for Φ that has the same form:

What’s the big deal here? Well… Let’s write it all out. The equation above becomes:

That’s a wave equation in three dimensions. In case you wonder, just check one of my posts on wave equations. The one-dimensional equivalent for a wave propagating in the x direction at speed c (like a sound wave, for example) is ∂2Φ/∂xc–2·∂2Φ/∂t2, indeed. The equation for A yields above yields similar wave functions for A‘s components Ax, Ay, and Az.

So, yes, it is a big deal. We’ve written Maxwell’s equations in terms of the scalar (Φ) and vector (A) potential and in a form that makes immediately apparent that we’re talking electromagnetic waves moving out at the speed c. Let me copy them again:

You may, of course, say that you’d rather have a wave equation for E and B, rather than for A and Φ. Well… That can be done. Feynman gives us two derivations that do so. The first derivation is relatively simple and assumes the source our electromagnetic wave moves in one direction only. The second derivation is much more complicated and gives an equation for E that, if you’ve read the first volume of Feynman’s Lectures, you’ll surely remember:

The links are there, and so I’ll let you have fun with those Lectures yourself. I am finished here, indeed, in terms of what I wanted to do in this post, and that is to say a few words about gauges in field theory. It’s nothing much, really, and so we’ll surely have to discuss the topic again, but at least you now know what a gauge actually is in classical electromagnetic theory. Let’s quickly go over the concepts:

1. Choosing the ·A is choosing a gauge, or a gauge potential (because we’re talking scalar and vector potential here). The particular choice is also referred to as gauge fixing.
2. Changing A by adding ψ is called a gauge transformation, and the scalar function Ψ is referred to as a gauge function. The fact that we can add curl-free components to the magnetic potential without them making any difference is referred to as gauge invariance.
3. Finally, the ·A = −c–2·∂Φ/∂t gauge is referred to as a Lorentz gauge.

Just to make sure you understand: why is that Lorentz gauge so special? Well… Look at the whole argument once more: isn’t it amazing we get such beautiful (wave) equations if we stick it in? Also look at the functional shape of the gauge itself: it looks like a wave equation itself! […] Well… No… It doesn’t. I am a bit too enthusiastic here. We do have the same 1/c2 and a time derivative, but it’s not a wave equation. 🙂 In any case, it all confirms, once again, that physics is all about beautiful mathematical structures. But, again, it’s not math only. There’s something real out there. In this case, that ‘something’ is a traveling electromagnetic field. 🙂

But why do we call it a gauge? That should be equally obvious. It’s really like choosing a gauge in another context, such as measuring the pressure of a tyre, as shown below. 🙂

Gauges and group theory

You’ll usually see gauges mentioned with some reference to group theory. For example, you will see or hear phrases like: “The existence of arbitrary numbers of gauge functions ψ(r, t) corresponds to the U(1) gauge freedom of the electromagnetic theory.” The U(1) notation stands for a unitary group of degree n = 1. It is also known as the circle group. Let me copy the introduction to the unitary group from the Wikipedia article on it:

In mathematics, the unitary group of degree n, denoted U(n), is the group of n × n unitary matrices, with the group operation that of matrix multiplication. The unitary group is a subgroup of the general linear group GL(n, C). In the simple case n = 1, the group U(1) corresponds to the circle group, consisting of all complex numbers with absolute value 1 under multiplication. All the unitary groups contain copies of this group.

The unitary group U(n) is a real Lie group of of dimension n2. The Lie algebra of U(n) consists of n × n skew-Hermitian matrices, with the Lie bracket given by the commutator. The general unitary group (also called the group of unitary similitudes) consists of all matrices A such that A*A is a nonzero multiple of the identity matrix, and is just the product of the unitary group with the group of all positive multiples of the identity matrix.

Phew! Does this make you any wiser? If anything, it makes me realize I’ve still got a long way to go. 🙂 The Wikipedia article on gauge fixing notes something that’s more interesting (if only because I more or less understand what it says):

Although classical electromagnetism is now often spoken of as a gauge theory, it was not originally conceived in these terms. The motion of a classical point charge is affected only by the electric and magnetic field strengths at that point, and the potentials can be treated as a mere mathematical device for simplifying some proofs and calculations. Not until the advent of quantum field theory could it be said that the potentials themselves are part of the physical configuration of a system. The earliest consequence to be accurately predicted and experimentally verified was the Aharonov–Bohm effect, which has no classical counterpart.

This confirms, once again, that the fields are real. In fact, what this says is that the potentials are real: they have a meaningful physical interpretation. I’ll leave it to you to expore that Aharanov-Bohm effect. In the meanwhile, I’ll study what Feynman writes on potentials and all that as used in quantum physics. It will probably take a while before I’ll get into group theory though.

Indeed, it’s probably best to study physics at a somewhat less abstract level first, before getting into the more sophisticated stuff.

# Applied vector analysis (II)

We’ve covered a lot of ground in the previous post, but we’re not quite there yet. We need to look at a few more things in order to gain some kind of ‘physical’ understanding’ of Maxwell’s equations, as opposed to a merely ‘mathematical’ understanding only. That will probably disappoint you. In fact, you probably wonder why one needs to know about Gauss’ and Stokes’ Theorems if the only objective is to ‘understand’ Maxwell’s equations.

To some extent, your skepticism is justified. It’s already quite something to get some feel for those two new operators we’ve introduced in the previous post, i.e. the divergence (div) and curl operators, denoted by ∇• and × respectively. By now, you understand that these two operators act on a vector field, such as the electric field vector E, or the magnetic field vector B, or, in the example we used, the heat flow h, so we should write •(a vector) and ×(a vector. And, as for that del operator – i.e.  without the dot (•) or the cross (×) – if there’s one diagram you should be able to draw off the top of your head, it’s the one below, which shows:

1. The heat flow vector h, whose magnitude is the thermal energy that passes, per unit time and per unit area, through an infinitesimally small isothermal surface, so we write: h = |h| = ΔJ/ΔA.
2. The gradient vector T, whose direction is opposite to that of h, and whose magnitude is proportional to h, so we can write the so-called differential equation of heat flow: h = –κT.
3. The components of the vector dot product ΔT = T•ΔR = |T|·ΔR·cosθ.

You should also remember that we can re-write that ΔT = T•ΔR = |T|·ΔR·cosθ equation – which we can also write as ΔT/ΔR = |T|·cosθ – in a more general form:

Δψ/ΔR = |ψ|·cosθ

That equation says that the component of the gradient vector ψ along a small displacement ΔR is equal to the rate of change of ψ in the direction of ΔRAnd then we had three important theorems, but I can imagine you don’t want to hear about them anymore. So what can we do without them? Let’s have a look at Maxwell’s equations again and explore some linkages.

Curl-free and divergence-free fields

From what I wrote in my previous post, you should remember that:

1. The curl of a vector field (i.e. ×C) represents its circulation, i.e. its (infinitesimal) rotation.
2. Its divergence (i.e. ∇•C) represents the outward flux out of an (infinitesimal) volume around the point we’re considering.

Back to Maxwell’s equations:

Let’s start at the bottom, i.e. with equation (4). It says that a changing electric field (i.e. ∂E/∂t ≠ 0) and/or a (steady) electric current (j0) will cause some circulation of B, i.e. the magnetic field. It’s important to note that (a) the electric field has to change and/or (b) that electric charges (positive or negative) have to move  in order to cause some circulation of B: a steady electric field will not result in any magnetic effects.

This brings us to the first and easiest of all the circumstances we can analyze: the static case. In that case, the time derivatives ∂E/∂t and ∂B/∂t are zero, and Maxwell’s equations reduce to:

1. ∇•E = ρ/ε0. In this equation, we have ρ, which represents the so-called charge density, which describes the distribution of electric charges in space: ρ = ρ(x, y, z). To put it simply: ρ is the ‘amount of charge’ (which we’ll denote by Δq) per unit volume at a given point. Hence, if we  consider a small volume (ΔV) located at point (x, y, z) in space – an infinitesimally small volume, in fact (as usual) –then we can write: Δq =  ρ(x, y, z)ΔV. [As for ε0, you already know this is a constant which ensures all units are ‘compatible’.] This equation basically says we have some flux of E, the exact amount of which is determined by the charge density ρ or, more in general, by the charge distribution in space.
2. ×E = 0. That means that the curl of E is zero: everywhere, and always. So there’s no circulation of E. We call this a curl-free field.
3. B = 0. That means that the divergence of B is zero: everywhere, and always. So there’s no flux of B. None. We call this a divergence-free field.
4. c2∇×B = j0. So here we have steady current(s) causing some circulation of B, the exact amount of which is determined by the (total) current j. [What about that cfactor? Well… We talked about that before: magnetism is, basically, a relativistic effect, and so that’s where that factor comes from. I’ll just refer you to what Feynman writes about this in his Lectures, and warmly recommend to read it, because it’s really quite interesting: it gave me at least a much deeper understanding of what it’s all about, and so I hope it will help you as much.]

Now you’ll say: why bother with all these difficult mathematical constructs if we’re going to consider curl-free and divergence-free fields only. Well… B is not curl-free, and E is not divergence-free. To be precise:

1. E is a field with zero curl and a given divergence, and
2. B is a field with zero divergence and a given curl.

Yeah, but why can’t we analyze fields that have both curl and divergence? The answer is: we can, and we will, but we have to start somewhere, and so we start with an easier analysis first.

Electrostatics and magnetostatics

The first thing you should note is that, in the static case (i.e. when charges and currents are static), there is no interdependence between E and B. The two fields are not interconnected, so to say. Therefore, we can neatly separate them into two pairs:

1. Electrostatics: (1) ∇•E = ρ/ε0 and (2) ×E = 0.
2. Magnetostatics: (1) ∇×B = j/c2ε0 and (2) B = 0.

Now, I won’t go through all of the particularities involved. In fact, I’ll refer you to a real physics textbook on that (like Feynman’s Lectures indeed). My aim here is to use these equations to introduce some more math and to gain a better understanding of vector calculus – an understanding that goes, in fact, beyond the math (i.e. a ‘physical’ understanding, as Feynman terms it).

At this point, I have to introduce two additional theorems. They are nice and easy to understand (although not so easy to prove, and so I won’t):

Theorem 1: If we have a vector field – let’s denote it by C – and we find that its curl is zero everywhere, then C must be the gradient of something. In other words, there must be some scalar field ψ (psi) such that C is equal to the gradient of ψ. It’s easier to write this down as follows:

If ×= 0, there is a ψ such that C = ψ.

Theorem 2: If we have a vector field – let’s denote it by D, just to introduce yet another letter – and we find that its divergence is zero everywhere, then D must be the curl of some vector field A. So we can write:

If D = 0, there is an A such that D = ×A.

We can apply this to the situation at hand:

1. For E, there is some scalar potential Φ such that E = –Φ. [Note that we could integrate the minus sign in Φ, but we leave it there as a reminder that the situation is similar to that of heat flow. It’s a matter of convention really: E ‘flows’ from higher to lower potential.]
2. For B, there is a so-called vector potential A such that B = ×A.

The whole game is then to compute Φ and A everywhere. We can then take the gradient of Φ, and the curl of A, to find the electric and magnetic field respectively, at every single point in space. In fact, most of Feynman’s second Volume of his Lectures is devoted to that, so I’ll refer you that if you’d be interested. As said, my goal here is just to introduce the basics of vector calculus, so you gain a better understanding of physics, i.e. an understanding which goes beyond the math.

Electrodynamics

We’re almost done. Electrodynamics is, of course, much more complicated than the static case, but I don’t have the intention to go too much in detail here. The important thing is to see the linkages in Maxwell’s equations. I’ve highlighted them below:

I know this looks messy, but it’s actually not so complicated. The interactions between the electric and magnetic field are governed by equation (2) and (4), so equation (1) and (3) is just ‘statics’. Something needs to trigger it all, of course. I assume it’s an electric current (that’s the arrow marked by [0]).

Indeed, equation (4), i.e. c2∇×B = ∂E/∂t + j0, implies that a changing electric current – an accelerating electric charge, for instance – will cause the circulation of B to change. More specifically, we can write: ∂[c2∇×B]/∂t = ∂[j0]∂t. However, as the circulation of B changes, the magnetic field B itself must be changing. Hence, we have a non-zero time derivative of B (∂B/∂t ≠ 0). But, then, according to equation (2), i.e. ∇×E = –∂B/∂t, we’ll have some circulation of E. That’s the dynamics marked by the red arrows [1].

Now, assuming that ∂B/∂t is not constant (because that electric charge accelerates and decelerates, for example), the time derivative ∂E/∂t will be non-zero too (∂E/∂t ≠ 0). But so that feeds back into equation (4), according to which a changing electric field will cause the circulation of B to change. That’s the dynamics marked by the yellow arrows [2].

The ‘feedback loop’ is closed now: I’ve just explained how an electromagnetic field (or radiation) actually propagates through space. Below you can see one of the fancier animations you can find on the Web. The blue oscillation is supposed to represent the oscillating magnetic vector, while the red oscillation is supposed to represent the electric field vector. Note how the effect travels through space.

This is, of course, an extremely simplified view. To be precise, it assumes that the light wave (that’s what an electromagnetic wave actually is) is linearly (aka as plane) polarized, as the electric (and magnetic field) oscillate on a straight line. If we choose the direction of propagation as the z-axis of our reference frame, the electric field vector will oscillate in the xy-plane. In other words, the electric field will have an x- and a y-component, which we’ll denote as Ex and Erespectively, as shown in the diagrams below, which give various examples of linear polarization.

Light is, of course, not necessarily plane-polarized. The animation below shows circular polarization, which is a special case of the more general elliptical polarization condition.

The relativity of magnetic and electric fields

Allow me to make a small digression here, which has more to do with physics than with vector analysis. You’ll have noticed that we didn’t talk about the magnetic field vector anymore when discussing the polarization of light. Indeed, when discussing electromagnetic radiation, most – if not all – textbooks start by noting we have E and B vectors, but then proceed to discuss the E vector only. Where’s the magnetic field? We need to note two things here.

1. First, I need to remind you of the force on any electrically charged particle (and note we only have electric charge: there’s no such thing as a magnetic charge according to Maxwell’s third equation) consists of two components. Indeed, the total electromagnetic force (aka Lorentz force) on a charge q is:

F = q(E + v×B) = qE + q(v×B) = FE + FM

The velocity vector v is the velocity of the charge: if the charge is not moving, then there’s no magnetic force. The illustration below shows you the components of the vector cross product that, by now, you’re fully familiar with. Indeed, in my previous post, I gave you the expressions for the x, y and z coordinate of a cross product, but there’s a geometrical definition as well:

v×B = |v||B|sin(θ)n

The magnetic force FM is q(v×B) = qv×B q|v||B|sin(θ)n. The unit vector n determines the direction of the force, which is determined by that right-hand rule that, by now, you also are fully familiar with: it’s perpendicular to both v and B (cf. the two 90° angles in the illustration). Just to make sure, I’ve also added the right-hand rule illustration above: check it out, as it does involve a bit of arm-twisting in this case. 🙂

In any case, the point to note here is that there’s only one electromagnetic force on the particle. While we distinguish between an E and a B vector, the E and B vector depend on our reference frame. Huh? Yes. The velocity v is relative: we specify the magnetic field in a so-called inertial frame of reference here. If we’d be moving with the charge, the magnetic force would, quite simply, disappear, because we’d have a v equal to zero, so we’d have v×B = 0×B= 0. Of course, all other charges (i.e. all ‘stationary’ and ‘moving’ charges that were causing the field in the first place) would have different velocities as well and, hence, our E and B vector would look very different too: they would come in a ‘different mixture’, as Feynman puts it. [If you’d want to know in what mixture exactly, I’ll refer you Feynman: it’s a rather lengthy analysis (five rather dense pages, in fact), but I can warmly recommend it: in fact, you should go through it if only to test your knowledge at this point, I think.]

You’ll say: So what? That doesn’t answer the question above. Why do physicists leave out the magnetic field vector in all those illustrations?

You’re right. I haven’t answered the question. This first remark is more like a warning. Let me quote Feynman on it:

“Since electric and magnetic fields appear in different mixtures if we change our frame of reference, we must be careful about how we look at the fields E and B. […] The fields are our way of describing what goes on at a point in space. In particular, E and B tell us about the forces that will act on a moving particle. The question “What is the force on a charge from a moving magnetic field?” doesn’t mean anything precise. The force is given by the values of E and B at the charge, and the F = q(E + v×B) formula is not to be altered if the source of E or B is moving: it is the values of E and B that will be altered by the motion. Our mathematical description deals only with the fields as a function of xy, z, and t with respect to some inertial frame.”

If you allow me, I’ll take this opportunity to insert another warning, one that’s quite specific to how we should interpret this concept of an electromagnetic wave. When we say that an electromagnetic wave ‘travels’ through space, we often tend to think of a wave traveling on a string: we’re smart enough to understand that what is traveling is not the string itself (or some part of the string) but the amplitude of the oscillation: it’s the vertical displacement (i.e. the movement that’s perpendicular to the direction of ‘travel’) that appears first at one place and then at the next and so on and so on. It’s in that sense, and in that sense only, that the wave ‘travels’. However, the problem with this comparison to a wave traveling on a string is that we tend to think that an electromagnetic wave also occupies some space in the directions that are perpendicular to the direction of travel (i.e. the x and y directions in those illustrations on polarization). Now that’s a huge misconception! The electromagnetic field is something physical, for sure, but the E and B vectors do not occupy any physical space in the x and y direction as they ‘travel’ along the z direction!

Let me conclude this digression with Feynman’s conclusion on all of this:

“If we choose another coordinate system, we find another mixture of E and B fields. However, electric and magnetic forces are part of one physical phenomenon—the electromagnetic interactions of particles. While the separation of this interaction into electric and magnetic parts depends very much on the reference frame chosen for the description, the complete electromagnetic description is invariant: electricity and magnetism taken together are consistent with Einstein’s relativity.”

2. You’ll say: I don’t give a damn about other reference frames. Answer the question. Why are magnetic fields left out of the analysis when discussing electromagnetic radiation?

The answer to that question is very mundane. When we know E (in one or the other reference frame), we also know B, and, while B is as ‘essential’ as E when analyzing how an electromagnetic wave propagates through space, the truth is that the magnitude of B is only a very tiny fraction of that of E.

Huh? Yes. That animation with these oscillating blue and red vectors is very misleading in this regard. Let me be precise here and give you the formulas:

I’ve analyzed these formulas in one of my other posts (see, for example, my first post on light and radiation), and so I won’t repeat myself too much here. However, let me recall the basics of it all. The eR′ vector is a unit vector pointing in the apparent direction of the charge. When I say ‘apparent’, I mean that this unit vector is not pointing towards the present position of the charge, but at where is was a little while ago, because this ‘signal’ can only travel from the charge to where we are now at the same speed of the wave, i.e. at the speed of light c. That’s why we prime the (radial) vector R also (so we write R′ instead of R). So that unit vector wiggles up and down and, as the formula makes clear, it’s the second-order derivative of that movement which determines the electric field. That second-order derivative is the acceleration vector, and it can be substituted for the vertical component of the acceleration of the charge that caused the radiation in the first place but, again, I’ll refer you my post on that, as it’s not the topic we want to cover here.

What we do want to look at here, is that formula for B: it’s the cross product of that eR′ vector (the minus sign just reverses the direction of the whole thing) and E divided by c. We also know that the E and eR′ vectors are at right angles to each, so the sine factor (sinθ) is 1 (or –1) too. In other words, the magnitude of B is |E|/c =  E/c, which is a very tiny fraction of E indeed (remember: c ≈ 3×108).

So… Yes, for all practical purposes, B doesn’t matter all that much when analyzing electromagnetic radiation, and so that’s why physicists will note it but then proceed and look at E only when discussing radiation. Poor BThat being said, the magnetic force may be tiny, but it’s quite interesting. Just look at its direction! Huh? Why? What’s so interesting about it?  I am not talking the direction of B here: I am talking the direction of the force. Oh… OK… Hmm… Well…

Let me spell it out. Take the force formula: F = q(E + v×B) = qE + q(v×B). When our electromagnetic wave hits something real (I mean anything real, like a wall, or some molecule of gas), it is likely to hit some electron, i.e. an actual electric charge. Hence, the electric and magnetic field should have some impact on it. Now, as we pointed here, the magnitude of the electric force will be the most important one – by far – and, hence, it’s the electric field that will ‘drive’ that charge and, in the process, give it some velocity v, as shown below. In what direction? Don’t ask stupid questions: look at the equation. FE = qE, so the electric force will have the same direction as E.

But we’ve got a moving charge now and, therefore, the magnetic force comes into play as well! That force is FM  = q(v×B) and its direction is given by the right-hand rule: it’s the F above in the direction of the light beam itself. Admittedly, it’s a tiny force, as its magnitude is F = qvE/c only, but it’s there, and it’s what causes the so-called radiation pressure (or light pressure tout court). So, yes, you can start dreaming of fancy solar sailing ships (the illustration below shows one out of of Star Trek) but… Well… Good luck with it! The force is very tiny indeed and, of course, don’t forget there’s light coming from all directions in space!

Jokes aside, it’s a real and interesting effect indeed, but I won’t say much more about it. Just note that we are really talking the momentum of light here, and it’s a ‘real’ as any momentum. In an interesting analysis, Feynman calculates this momentum and, rather unsurprisingly (but please do check out how he calculates these things, as it’s quite interesting), the same 1/c factor comes into play once: the momentum (p) that’s being delivered when light hits something real is equal to 1/c of the energy that’s being absorbed. So, if we denote the energy by W (in order to not create confusion with the E symbol we’ve used already), we can write: p = W/c.

Now I can’t resist one more digression. We’re, obviously, fully in classical physics here and, hence, we shouldn’t mention anything quantum-mechanical here. That being said, you already know that, in quantum physics, we’ll look at light as a stream of photons, i.e. ‘light particles’ that also have energy and momentum. The formula for the energy of a photon is given by the Planck relation: E = hf. The h factor is Planck’s constant here – also quite tiny, as you know – and f is the light frequency of course. Oh – and I am switching back to the symbol E to denote energy, as it’s clear from the context I am no longer talking about the electric field here.

Now, you may or may not remember that relativity theory yields the following relations between momentum and energy:

E2 – p2c2 = m0cand/or pc = Ev/c

In this equations, mstands, obviously, for the rest mass of the particle, i.e. its mass at v = 0. Now, photons have zero rest mass, but their speed is c. Hence, both equations reduce to p = E/c, so that’s the same as what Feynman found out above: p = W/c.

Of course, you’ll say: that’s obvious. Well… No, it’s not obvious at all. We do find the same formula for the momentum of light (p) – which is great, of course –  but so we find the same thing coming from very different necks parts of the woods. The formula for the (relativistic) momentum and energy of particles comes from a very classical analysis of particles – ‘real-life’ objects with mass, a very definite position in space and whatever other properties you’d associate with billiard balls – while that other p = W/c formula comes out of a very long and tedious analysis of light as an electromagnetic wave. The two analytical frameworks couldn’t differ much more, could they? Yet, we come to the same conclusion indeed.

Physics is wonderful. 🙂

So what’s left?

Lots, of course! For starters, it would be nice to show how these formulas for E and B with eR′ in them can be derived from Maxwell’s equations. There’s no obvious relation, is there? You’re right. Yet, they do come out of the very same equations. However, for the details, I have to refer you to Feynman’s Lectures once again – to the second Volume to be precise. Indeed, besides calculating scalar and vector potentials in various situations, a lot of what he writes there is about how to calculate these wave equations from Maxwell’s equations. But so that’s not the topic of this post really. It’s, quite simply, impossible to ‘summarize’ all those arguments and derivations in a single post. The objective here was to give you some idea of what vector analysis really is in physics, and I hope you got the gist of it, because that’s what needed to proceed. 🙂

The other thing I left out is much more relevant to vector calculus. It’s about that del operator () again: you should note that it can be used in many more combinations. More in particular, it can be used in combinations involving second-order derivatives. Indeed, till now, we’ve limited ourselves to first-order derivatives only. I’ll spare you the details and just copy a table with some key results:

1. •(T) = div(grad T) = T = ()T = ∇2T = ∂2T/∂x+ ∂2T/∂y+ ∂2T/∂z= a scalar field
2. ()h = ∇2= a vector field
3. (h) = grad(div h) = a vector field
4. ×(×h) = curl(curl h) =(h) – ∇2h
5. ∇•(×h) = div(curl h) = 0 (always)
6. ×(T) = curl(grad T) = 0 (always)

So we have yet another set of operators here: not less than six, to be precise. You may think that we can have some more, like (×), for example. But… No. A (×) operator doesn’t make sense. Just write it out and think about it. Perhaps you’ll see why. You can try to invent some more but, if you manage, you’ll see they won’t make sense either. The combinations that do make sense are listed above, all of them.

Now, while of these combinations make (some) sense, it’s obvious that some of these combinations are more useful than others. More in particular, the first operator, ∇2, appears very often in physics and, hence, has a special name: it’s the Laplacian. As you can see, it’s the divergence of the gradient of a function.

Note that the Laplace operator (∇2) can be applied to both scalar as well as vector functions. If we operate with it on a vector, we’ll apply it to each component of the vector function. The Wikipedia article on the Laplace operator shows how and where it’s used in physics, and so I’ll refer to that if you’d want to know more. Below, I’ll just write out the operator itself, as well as how we apply it to a vector:

So that covers (1) and (2) above. What about the other ‘operators’?

Let me start at the bottom. Equations (5) and (6) are just what they are: two results that you can use in some mathematical argument or derivation. Equation (4) is… Well… Similar: it’s an identity that may or may not help one when doing some derivation.

What about (3), i.e. the gradient of the divergence of some vector function? Nothing special. As Feynman puts it: “It is a possible vector field, but there is nothing special to say about it. It’s just some vector field which may occasionally come up.”

So… That should conclude my little introduction to vector analysis, and so I’ll call it a day now. 🙂 I hope you enjoyed it.