Relativistic transformations of fields and the electromagnetic tensor

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. In addition, I note the dark force has amused himself by removing some material. So no use to read this. Read my recent papers instead. 🙂

Original post:

We’re going to do a very interesting piece of math here. It’s going to bring a lot of things together. The key idea is to present a mathematical construct that effectively presents the electromagnetic force as one force, as one physical reality. Indeed, we’ve been saying repeatedly that electromagnetism is one phenomenon only but we’ve always been writing it as something involving two vectors: the electric field vector E and the magnetic field vector B. Of course, Lorentz’ force law F = q(E + v×B) makes it clear we’re talking one force only but… Well… There is a way of writing it all up that is much more elegant.

I have to warn you though: this post doesn’t add anything to the physics we’ve seen so far: it’s all math, really and, to a large extent, math only. So if you read this blog because you’re interested in the physics only, then you may just as well skip this post. Having said that, the mathematical concept we’re going to present is that of the tensor and… Well… You’ll have to get to know that animal sooner or later anyway, so you may just as well give it a try right now, and see whatever you can get out of this post.

The concept of a tensor further builds on the concept of the vector, which we liked so much because it allows us to write the laws of physics as vector equations, which do not change when going from one reference frame to another. In fact, we’ll see that a tensor can be described as a ‘special’ vector cross product (to be precise, we’ll show that a tensor is a ‘more general’ cross product, really). So the tensor and vector concepts are very closely related, but then… Well… If you think about it, the concept of a vector and the concept of a scalar are closely related, too! So we’re just moving up the value chain, so to speak: from scalar fields to vector fields to… Well… Tensor fields! And in quantum mechanics, we’ll introduce spinors, and so we also have spinor fields! Having said that, don’t worry about tensor fields. Let’s first try to understand tensors tout court. 🙂

So… Well… Here we go. Let me start with it all by reminding you of the concept of a vector, and why we like to use vectors and vector equations.

The invariance of physics and the use of vector equations

What’s a vector? You may think, naively, that any one-dimensional array of numbers is a vector. But… Well… No! In math, we may, effectively, refer to any one-dimensional array of numbers as a ‘vector’, perhaps, but in physics, a vector does represent something real, something physical, and so a vector is only a vector if it transforms like a vector under the transformation rules that apply when going from one frame of reference, i.e. one coordinate system, to another. Examples of vectors in three dimensions are: the velocity vector v, or the momentum vector p = m·v, or the position vector r.

Needless to say, the same can be said of scalars: mathematicians may define a scalar as just any real number, but that’s not how it works in physics. A scalar in physics refers to something real, i.e. a scalar field, like the temperature (T) inside of a block of material. In fact, think about your first vector equation: it may have been the one determining the heat flow (h), i.e. h = −κ·∇T = (−κ·∂T/∂x, −κ·∂T/∂y, −κ·∂T/∂z). It immediately shows how scalar and vector fields are intimately related.

Now, when discussing the relativistic framework of physics, we introduced vectors in four dimensions, i.e. four-vectors. The most basic four-vector is the spacetime four-vector R = (ct, x, y, z), which is often referred to as an event, but it’s just a point in spacetime, really. So it’s a ‘point’ with a time as well as a spatial dimension, so it also has t in it, besides x, y and z. It is also known as the position four-vector but, again, you should think of a ‘position’ that includes time! Of course, we can re-write R as R = (ct, r), with r = (x, y, z), so here we sort of ‘break up’ the four-vector in a scalar and a three-dimensional vector, which is something we’ll do from time to time, indeed. 🙂

We also have a displacement four-vector, which we can write as ΔR = (c·Δt, Δr). There are other four-vectors as well, including the four-velocity, the four-momentum and the four-force four-vectors, which we’ll discuss later (in the last section of this post).

So it’s just like using three-dimensional vectors in three-dimensional physics, or ‘Newtonian’ physics, I should say: the use of four-vectors is going to allow us to write the laws of physics using vector equations, but in four dimensions, rather than three, so we get the ‘Einsteinian’ physics, the real physics, so to speak (or the relativistically correct physics, I should say). And so these four-dimensional vector equations will also not change when going from one reference frame to another, and so our four-vectors will be vectors indeed, i.e. they will transform like a vector under the transformation rules that apply when going from one frame of reference, i.e. one coordinate system, to another.

What transformation? Well… In Newtonian or Galilean physics, we had translations and rotations and what have you, but what we are interested in right now are ‘Einsteinian’ transformations of coordinate systems, so these have to ensure that all of the laws of physics that we know of, including the principle of relativity, still look the same. You’ve seen these transformation rules. We don’t call them the ‘Einsteinian’ transformation rules, but the Lorentz transformation rules, because it was a Dutch physicist (Hendrik Lorentz) who first wrote them down. So these rules are very different from the Newtonian or Galilean transformation rules which everyone assumed to be valid until the Michelson-Morley experiment unequivocally established that the speed of light did not respect the Galilean transformation rules. Very different? Well… Yes. In their mathematical structure, that is. Of course, when velocities are low, i.e. non-relativistic, then they yield the same result, approximately, that is. However, I explained that in my post on special relativity, and so I won’t dwell on that here.

Let me just jot down both sets of rules, assuming that the two reference frames move with respect to each other along the x-axis only, so the y- and z-components of the relative velocity u are zero.
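
The picture with the two sets of rules is not reproduced here. As a sketch, assuming the standard textbook form for a frame S’ that moves with velocity u along the x-axis of S, the Lorentz rules come first and the Galilean rules second:

```latex
% Lorentz transformation (relative velocity u along the x-axis)
\begin{aligned}
x' &= \frac{x - ut}{\sqrt{1 - u^2/c^2}}, &\quad y' &= y, &\quad z' &= z, &\quad
t' &= \frac{t - ux/c^2}{\sqrt{1 - u^2/c^2}}
\end{aligned}

% Galilean transformation (same setup)
\begin{aligned}
x' &= x - ut, &\quad y' &= y, &\quad z' &= z, &\quad t' &= t
\end{aligned}
```

For u much smaller than c, the square root goes to one and the ux/c² term vanishes, so the first set reduces to the second.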

The Galilean or Newtonian rules are the simple rules on the right. Going from one reference frame to another (let’s call them S and S’ respectively) is just a matter of adding or subtracting speeds: if my car goes 100 km/h, and yours goes 120 km/h, then you will see my car falling behind at a speed of (minus) 20 km/h. That’s it. We could also rotate our reference frame, and our Newtonian vector equations would still look the same. As Feynman notes, smilingly, it’s what a lot of armchair philosophers think relativity theory is all about, but so it’s got nothing to do with it. It’s plain wrong!

In any case, back to vectors and transformations. The key to the so-called invariance of the laws of physics is the use of vectors and vector operators that transform like vectors. For example, if we defined A and B as (A_x, A_y, A_z) and (B_x, B_y, B_z), then we knew that the so-called inner product A•B would look the same in all rotated coordinate systems, so we can write: A•B = A’•B’. So we know that if we have a product like that on both sides of an equation, we’re fine: the equation will have the same form in all rotated coordinate systems. Also, the gradient, i.e. our vector operator ∇ = (∂/∂x, ∂/∂y, ∂/∂z), when applied to a scalar function, gave three quantities that also transform like a vector under rotation. We also defined a vector cross product, which yielded a vector (as opposed to the inner product, i.e. the vector dot product, which yields a scalar):
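
The formula that followed here is missing. Assuming the usual component definition, the cross product of A and B can be written as:

```latex
\mathbf{A} \times \mathbf{B} =
\left( A_y B_z - A_z B_y,\; A_z B_x - A_x B_z,\; A_x B_y - A_y B_x \right)
```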

So how does this thing behave under a Galilean transformation? Well… You may or may not remember that we used this cross-product to define the angular momentum L, which was a cross product of the radius vector r and the momentum vector p = mv, as illustrated below. The animation also gives the torque τ, which is, loosely speaking, a measure of the turning force: it’s the cross product of r and F, i.e. the force on the lever-arm.

The components of L are:
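
The original formula is not reproduced here. With r = (x, y, z) and p = (p_x, p_y, p_z), and assuming the standard definition L = r×p, the components would read:

```latex
\begin{aligned}
L_x &= y\,p_z - z\,p_y \\
L_y &= z\,p_x - x\,p_z \\
L_z &= x\,p_y - y\,p_x
\end{aligned}
```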

Now, we find that these three numbers, or objects if you want, transform in exactly the same way as the components of a vector. However, as Feynman points out, that’s a matter of ‘luck’ really. It’s something ‘special’. Indeed, you may or may not remember that we distinguished axial vectors from polar vectors. L is an axial vector, while r and p are polar vectors, and so we find that, in three dimensions, the cross product of two polar vectors will always yield an axial vector. Axial vectors are sometimes referred to as pseudovectors, which suggests that they are ‘not so real’ as… Well… Polar vectors, which are sometimes referred to as ‘true’ vectors. However, it doesn’t matter when doing these Newtonian or Galilean transformations: pseudo or true, both vectors transform like vectors. 🙂

But so… Well… We’re actually getting a bit of a heads-up here: if we’d be mixing (or ‘crossing’) polar and axial vectors, or mixing axial vectors only, so if we’d define something involving L and p (rather than r and p), or something involving L and τ, then we may not be so lucky, and then we’d have to carefully examine our cross-product, or whatever other product we’d want to define, because its components may not behave like a vector.

Huh? Whatever other product we’d want to define? Why are you saying that? Well… We actually can think of other products. For example, if we have two vectors a = (a_x, a_y, a_z) and b = (b_x, b_y, b_z), then we’ll have nine possible combinations of their components, which we can write as T_ij = a_i·b_j. So that’s like L_xy, L_yz and L_zx really. Now, you’ll say: “No. It isn’t. We don’t have nine combinations here. Just three numbers.” Well… Think about it: we actually do have nine L_ij combinations here too, as we can write: L_ij = r_i·p_j – r_j·p_i. It just happens that, with this definition, only three of these combinations L_ij are independent. That’s because the other six numbers are either zero or the opposite of one of those three. Indeed, it’s easy to verify that L_ij = –L_ji, and L_ii = 0. So… Well… It turns out that the three components of our L = r×p ‘vector’ are actually a subset of a set of nine L_ij numbers. So… Well… Think about it. We cannot just do whatever we want with our ‘vectors’. We need to watch out.

In fact, I do not want to get too much ahead of myself, but I can already tell you that the matrix with these nine T_ij = a_i·b_j combinations is what is referred to as a tensor. To be precise, it’s referred to as a tensor of the second rank in three dimensions. The ‘second rank’, aka ‘degree’ or ‘order’, refers to the fact that we’ve got two indices, and the ‘three dimensions’ is because we’re using three-dimensional vectors. We’ll soon see that the electromagnetic tensor is also of the second rank, but it’s a tensor in four dimensions. In any case, I should not get ahead of myself. Just note what I am saying here: the tensor is like a ‘new’ product of two vectors, a new type of ‘cross’ product really (because we’re mixing the components, so to say), but it doesn’t yield a vector: it yields a matrix. For three-dimensional vectors, we get a 3×3 matrix. For four-vectors, we’ll get a 4×4 matrix. And so the full truth about our angular momentum vector L is the following:

There is a thing which we call the angular momentum tensor. It’s a 3×3 matrix, so it has nine elements which are defined as: L_ij = r_i·p_j – r_j·p_i. Because of this definition, it’s an antisymmetric tensor of the second order in three dimensions, so it’s got only three independent components.
The three independent elements are the components of our ‘vector’ L, and picking them out and calling these three components a ‘vector’ is actually a ‘trick’ that only works in three dimensions. They really just happen to transform like a vector under rotation or under whatever Galilean transformation! [By the way, do you now understand why I was saying that we can look at a tensor as a ‘more general’ cross product?]
In fact, in four dimensions, we’ll use a similar definition and define 16 elements F_ij as F_ij = ∇_iA_j − ∇_jA_i, using the two four-vectors ∇_μ and A_μ (so we have 4×4 = 16 combinations indeed), out of which only six will be independent for the very same reason: we have an antisymmetric combination here, so F_ij = −F_ji and F_ii = 0. 🙂 However, because we cannot represent six independent things by the four components of a four-vector, we do not get some other four-vector, and so that’s why we cannot apply the same ‘trick’ in four dimensions.
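
The three-dimensional part of this is easy to check numerically. The sketch below is just an illustration: the names r, p, angular_momentum_tensor and cross are made up for this example, and the numbers are arbitrary. It builds the nine L_ij = r_i·p_j – r_j·p_i numbers, checks the antisymmetry, and picks out the three independent ones.

```python
# Hypothetical illustration: names and numbers are ours, not taken from the post.
r = (1.0, 2.0, 3.0)   # position components (x, y, z), arbitrary values
p = (4.0, 5.0, 6.0)   # momentum components (x, y, z), arbitrary values

def angular_momentum_tensor(r, p):
    """Return the 3x3 array L[i][j] = r_i*p_j - r_j*p_i."""
    return [[r[i] * p[j] - r[j] * p[i] for j in range(3)] for i in range(3)]

def cross(a, b):
    """Ordinary three-dimensional cross product a x b."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

L_tensor = angular_momentum_tensor(r, p)
L_vector = cross(r, p)

# Antisymmetry: L_ij = -L_ji, which also forces a zero diagonal.
antisymmetric = all(L_tensor[i][j] == -L_tensor[j][i]
                    for i in range(3) for j in range(3))

# The three independent entries reproduce the cross product:
# L_x = L_yz, L_y = L_zx, L_z = L_xy.
picked = (L_tensor[1][2], L_tensor[2][0], L_tensor[0][1])

print(antisymmetric, picked, L_vector)
```

So the ‘vector’ L is literally three entries read off from a 3×3 antisymmetric array, which is the point made above.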

However, here I am getting way ahead of myself and so… Well… Yes. Back to the main story line. 🙂 So let’s try to move to the next level of understanding, which is… Well…

Because of guys like Maxwell and Einstein, we now know that rotations are part of the Newtonian world, in which time and space are neatly separated, and that things are not so simple in Einstein’s world, which is the real world, as far as we know, at least! Under a Lorentz transformation, the new ‘primed’ space and time coordinates are a mixture of the ‘unprimed’ ones. Indeed, the new x’ is a mixture of x and t, and the new t’ is a mixture of x and t as well. [Yes, please scroll all the way up and have a look at the transformation on the left-hand side!]

So you don’t have that under a Galilean transformation: in the Newtonian world, space and time are neatly separated, and time is absolute, i.e. it is the same regardless of the reference frame. In Einstein’s world – our world – that’s not the case: time is relative, or local as Hendrik Lorentz termed it quite appropriately, and so it’s space-time – i.e. ‘some kind of union of space and time’ as Minkowski termed it – that transforms.

So that’s why physicists use four-vectors to keep track of things. These four-vectors always have three space-like components, but they also include one so-called time-like component. It’s the only way to ensure that the laws of physics are unchanged when moving with uniform velocity. Indeed, any true law of physics we write down must be arranged so that the invariance of physics (as a “fact of Nature”, as Feynman puts it) is built in, and so that’s why we use Lorentz transformations and four-vectors.

In the mentioned post, I gave a few examples illustrating how the Lorentz rules work. Suppose we’re looking at some spaceship that is moving at half the speed of light (i.e. 0.5c) and that, inside the spaceship, some object is also moving at half the speed of light, as measured in the reference frame of the spaceship, then we get the rather remarkable result that, from our point of view (i.e. our reference frame as observer on the ground), that object is not going as fast as light, as Newton or Galileo – and most present-day armchair philosophers 🙂 – would predict (0.5c + 0.5c = c). We’d see it move at a speed equal to v = 0.8c. Huh? How do we know that? Well… We can derive a velocity formula from the Lorentz rules:

So now you can just put in the numbers: v_x = (0.5c + 0.5c)/(1 + 0.5·0.5) = 0.8c. See?

Let’s do another example. Suppose we’re looking at a light beam inside the spaceship, so something that’s traveling at speed c itself in the spaceship. How does that look to us? The Galilean transformation rules say its speed should be 1.5c, but that can’t be true of course, and the Lorentz rules save us once more: v_x = (0.5c + c)/(1 + 0.5·1) = c, so it turns out that the speed of light does not depend on the reference frame: it looks the same – both to the man in the ship as well as to the man on the ground. As Feynman puts it: “This is good, for it is, in fact, what the Einstein theory of relativity was designed to do in the first place—so it had better work!” 🙂
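
Both examples can be replayed in a few lines. This is a minimal sketch, assuming the usual addition formula for velocities along the same axis, v = (u + v’)/(1 + u·v’/c²), with speeds expressed as fractions of c. The function name add_velocities is made up for this illustration.

```python
def add_velocities(v_inside, u_ship, c=1.0):
    """Relativistic addition of two velocities along the same axis.

    v_inside: speed of the object as measured inside the ship
    u_ship:   speed of the ship as measured from the ground
    """
    return (u_ship + v_inside) / (1.0 + u_ship * v_inside / c**2)

# Object moving at 0.5c inside a ship that moves at 0.5c.
object_speed = add_velocities(0.5, 0.5)

# Light beam (speed c) inside the same ship.
light_speed = add_velocities(1.0, 0.5)

print(object_speed, light_speed)
```

The Galilean answer would simply be u_ship + v_inside, i.e. 1.0 and 1.5 for these two cases.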

So let’s now apply relativity to electromagnetism. Indeed, that’s what this post is all about! However, before I do so, let me re-write the Lorentz transformation rules for c = 1. We can equate the speed of light to one, indeed, when we measure time and distance in equivalent units. It’s just a matter of ditching our seconds for meters (so our time unit becomes the time that light needs to travel a distance of one meter), or ditching our meters for seconds (so our distance unit becomes the distance that light travels in one second). You should be familiar with this procedure. If not, well… Check out my posts on relativity. So here’s the same set of rules for c = 1:
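
The picture is missing here as well. Assuming, as before, a relative velocity u along the x-axis, setting c = 1 in the standard Lorentz rules gives:

```latex
\begin{aligned}
x' &= \frac{x - ut}{\sqrt{1 - u^2}}, &\quad y' &= y, &\quad z' &= z, &\quad
t' &= \frac{t - ux}{\sqrt{1 - u^2}}
\end{aligned}
```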

They’re much easier to remember and work with, and so that’s good, because now we need to look at how these rules work with four-vectors and the various operations and operators we’ll be defining on them. Let’s look at that step by step.

Electrodynamics in relativistic notation

Let me copy the Universal Set of Equations and Their Solution once more:

The solution for Maxwell’s equations is given in terms of the (electric) potential Φ and the (magnetic) vector potential A. I explained that in my post on this, so I won’t repeat myself too much here either. The only point you should note is that this solution is the result of a special choice of Φ and A, which we referred to as the Lorentz gauge. We’ll touch upon this condition once more, so just make a mental note of it.

Now, E and B do not correspond to four-vectors: they depend on x, y, z and t, but they have three components only: E_x, E_y, E_z, and B_x, B_y, and B_z respectively. So we have six independent terms here, rather than four things that, somehow, we could combine into some four-vector. [Does this ring a bell? It should. :-)] Having said that, it turns out that we can combine Φ and A into a four-vector, which we’ll refer to as the four-potential and which we’ll write as:

A_μ = (Φ, A) = (Φ, A_x, A_y, A_z) = (A_t, A_x, A_y, A_z) with A_t = Φ.

So that’s a four-vector just like R = (ct, x, y, z).

How do we know that A_μ is a four-vector? Well… Here I need to say a few things about those Lorentz transformation rules and, more importantly, about the required condition of invariance under a Lorentz transformation. So, yes, here we need to dive into the math.

Four-vectors and invariance under Lorentz transformations

When you were in high-school, you learned how to rotate your coordinate frame. You also learned that the distance of a point from the origin does not change under a rotation, so you’d write r’² = x’² + y’² + z’² = r² = x² + y² + z², and you’d say that r² is an invariant quantity under a rotation. Indeed, transformations leave certain things unchanged. From the Lorentz transformation rules themselves, it is easy to see that

c²·t’² − x’² − y’² − z’² = c²·t² − x² − y² − z², or,

if c = 1, that t’² − x’² − y’² − z’² = t² − x² − y² − z²,

is an invariant under a Lorentz transformation. We found the same for the so-called spacetime interval Δs² = Δr² − c²Δt², which we write as Δs² = Δr² − Δt² as we chose our time or distance units such that c = 1. [Note that, from now on, we’ll assume that’s the case, so c = 1 everywhere. We can always change back to our old units when we’re done with the analysis.] Indeed, such invariance allowed us to define spacelike, timelike and lightlike intervals using the so-called light cone emanating from a single event and traveling in all directions.
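
If you don’t want to do the algebra, you can let a few lines of code do it for one example. The sketch below assumes a boost along the x-axis in units where c = 1. The function names boost_x and interval, the event and the velocity 0.6 are all arbitrary choices made for this illustration.

```python
import math

def boost_x(t, x, y, z, u):
    """Lorentz transformation along x with relative velocity u, in units where c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - u * u)
    return (gamma * (t - u * x), gamma * (x - u * t), y, z)

def interval(t, x, y, z):
    """The +--- combination t^2 - x^2 - y^2 - z^2."""
    return t * t - x * x - y * y - z * z

event = (5.0, 1.0, 2.0, 3.0)        # (t, x, y, z) in frame S
primed = boost_x(*event, u=0.6)     # the same event seen from frame S'

print(event, interval(*event))
print(primed, interval(*primed))
```

The coordinates t and x change, but the combination t² − x² − y² − z² comes out the same in both frames.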

You should note that, for four-vectors, we do not have a simple sum of three terms. Indeed, we don’t write x² + y² + z² but t² − x² − y² − z². So we’ve got a +−−− thing here or, as it’s just a convention, we could also work with a −+++ sum of terms. The convention is referred to as the metric signature, and we will use the +−−− signature here. Let’s continue the story. Now, all four-vectors a_μ = (a_t, a_x, a_y, a_z) have this property that:

a_t’² − a_x’² − a_y’² − a_z’² = a_t² − a_x² − a_y² − a_z².

[The primed quantities are, obviously, the quantities as measured in the other reference frame.] So. Well… Yes. 🙂 But… Well… Hmm… We can say that our four-potential vector is a four-vector, but so we still have to prove that. So we need to prove that Φ’² − A_x’² − A_y’² − A_z’² = Φ² − A_x² − A_y² − A_z² for our four-potential vector A_μ = (Φ, A). So… Yes… How can we do that? The proof is not so easy, but you need to go through it as it will introduce some more concepts and ideas you need to understand.

In my post on the Lorentz gauge, I mentioned that Maxwell’s equations can be re-written in terms of Φ and A, rather than in terms of E and B. The equations are:
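
The equations themselves are missing here. As a reconstruction, assuming SI units and the form Feynman uses for the potentials in the Lorentz gauge, they would read:

```latex
\begin{aligned}
\nabla^2 \Phi - \frac{1}{c^2}\frac{\partial^2 \Phi}{\partial t^2} &= -\frac{\rho}{\epsilon_0} \\
\nabla^2 \mathbf{A} - \frac{1}{c^2}\frac{\partial^2 \mathbf{A}}{\partial t^2} &= -\frac{\mathbf{j}}{\epsilon_0 c^2} \\
\nabla \cdot \mathbf{A} &= -\frac{1}{c^2}\frac{\partial \Phi}{\partial t}
\end{aligned}
```

With c = 1, the right-hand sides of the first two equations are just −ρ/ε₀ and −j/ε₀.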

The expressions look rather formidable, but don’t panic: just look at them. Of course, you need to be familiar with the operators that are being used here, so that’s the Laplacian ∇², which is applied to both the scalar Φ and the vector A, and the divergence operator ∇•, which is applied to the vector A. I can’t re-explain this. I am sorry. Just check my posts on vector analysis. You should also look at the third equation: that’s just the Lorentz gauge condition, which we introduced when deriving these equations from Maxwell’s equations. Having said that, it’s the first and second equation which describe Φ and A as a function of the charges and currents in space, and so that’s what matters here. So let’s unfold the first equation. It says the following:
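
The unfolded equation is not reproduced here. Assuming the first equation is ∇²Φ − (1/c²)·∂²Φ/∂t² = −ρ/ε₀, writing out the Laplacian gives:

```latex
\frac{\partial^2 \Phi}{\partial x^2} + \frac{\partial^2 \Phi}{\partial y^2} + \frac{\partial^2 \Phi}{\partial z^2}
- \frac{1}{c^2}\frac{\partial^2 \Phi}{\partial t^2} = -\frac{\rho}{\epsilon_0}
```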

In fact, if we’d be talking free or empty space, i.e. regions where there are no charges and currents, then the right-hand side would be zero and this equation would then represent a wave equation, so some potential Φ that is changing in time and moving out at the speed c. Here again, I am sorry I can’t write about this here: you’ll need to check one of my posts on wave equations. If you don’t want to do that, you should believe me when I say that, if you see an equation like this:
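
The equation meant here is missing. Assuming it is the one-dimensional wave equation in its usual form, it would be:

```latex
\frac{\partial^2 \Psi}{\partial x^2} - \frac{1}{c^2}\frac{\partial^2 \Psi}{\partial t^2} = 0
```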

then the function Ψ(x, t) must be some function
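
The formula is missing here. The general solution of the one-dimensional wave equation, with f and g standing for arbitrary (twice differentiable) functions, is:

```latex
\Psi(x, t) = f(x - ct) + g(x + ct)
```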

Now, that’s a function representing a wave traveling at speed c, i.e. the phase velocity. Always? Yes. Always! It’s got to do with the x − ct and/or x + ct argument in the function. But, sorry, I need to move on here.

The unfolding of the equation with Φ makes it clear that we have four equations really. Indeed, the second equation is three equations: one for A_x, one for A_y, and one for A_z respectively. The four quantities on the right-hand side of these equations are ρ, j_x, j_y and j_z respectively, divided by ε₀, which is a universal constant which does not change when going from one coordinate system to another. Now, the quantities ρ, j_x, j_y and j_z transform like a four-vector. How do we know that? It’s just the charge conservation law. We used it when solving the problem of the fields around a moving wire, when we demonstrated the relativity of the electric and magnetic field. Indeed, the relevant equations were:

You can check that against the Lorentz transformation rules for c = 1. They’re exactly the same, but so we chose t = 0, so the rules are even simpler. Hence, the (ρ, j_x, j_y, j_z) vector is, effectively, a four-vector, and we’ll denote it by j_μ = (ρ, j). I now need to explain something else. [And, yes, I know this is becoming a very long story but… Well… That’s how it is.]

It’s about our operators ∇, ∇•, ∇× and ∇², so that’s the gradient, the divergence, curl and Laplacian operator respectively: they all have a four-dimensional equivalent. Of course, that won’t surprise you. 😦 Let me just jot all of them down, so we’re done with that, and then I’ll focus on the four-dimensional equivalent of the Laplacian ∇•∇ = ∇², which is referred to as the D’Alembertian, and which is denoted by □², because that’s the one we need to prove that our four-potential vector is a real four-vector. [I know: □² is a tiny symbol for a pretty monstrous thing, but I can’t help it: my editor tool is pretty limited.]
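
The overview is not reproduced here. As a sketch for c = 1, assuming Feynman’s conventions (the +−−− signs sit in the operator ∇_μ itself, and repeated indices are combined with the +−−− rule), the gradient, divergence and D’Alembertian would be as follows. The symbols φ and a_μ are generic placeholders for a scalar and a four-vector, and the four-dimensional analogue of the curl is left out:

```latex
\begin{aligned}
\text{four-gradient:} \quad & \nabla_\mu \varphi = \left( \frac{\partial \varphi}{\partial t},\; -\nabla \varphi \right) \\
\text{four-divergence:} \quad & \nabla_\mu a_\mu = \frac{\partial a_t}{\partial t} + \nabla \cdot \mathbf{a} \\
\text{D'Alembertian:} \quad & \Box^2 = \nabla_\mu \nabla_\mu = \frac{\partial^2}{\partial t^2} - \nabla^2
\end{aligned}
```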

Now, we’re almost there. Just hang in for a little longer. It should be obvious that we can re-write those two equations with Φ, A, ρ and j, as:
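
The formula is missing here. Assuming c = 1 and □² = ∂²/∂t² − ∇², the two equations for Φ and A combine into:

```latex
\Box^2 A_\mu = \frac{j_\mu}{\epsilon_0}
```

The t-component gives ∂²Φ/∂t² − ∇²Φ = ρ/ε₀, which is just the first equation with both sides multiplied by −1, and the x-, y- and z-components do the same for A.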

Just to make sure, let me remind you that A_μ = (Φ, A) and that j_μ = (ρ, j). Now, our new D’Alembertian operator is just an operator, a pretty formidable operator but, still, it’s an operator, and so it doesn’t change when the coordinate system changes, so the conclusion is that, IF j_μ = (ρ, j) is a four-vector – which it is – and, therefore, transforms like a four-vector, THEN the quantities Φ, A_x, A_y, and A_z must also transform like a four-vector, which means they are (the components of) a four-vector.

So… Well… Think about it, but not too long, because it’s just an intermediate result we had to prove. So that’s done. But we’re not done here. It’s just the beginning, actually. Let me repeat our intermediate result:

A_μ = (Φ, A) is a four-vector. We call it the four-potential vector.

OK. Let’s continue. Let me first draw your attention to that expression with the D’Alembertian above. Which expression? This one:

What about it? Well… You should note that the physics of that equation is just the same as Maxwell’s equations. So it’s one equation only, but it’s got it all.

It’s quite a pleasure to re-write it in such elegant form. Why? Think about it: it’s a four-vector equation: we’ve got a four-vector on the left-hand side, and a four-vector on the right-hand side. Therefore, this equation keeps its form under a Lorentz transformation, and so it directly shows the invariance of electrodynamics under the Lorentz transformation.

Huh? Yes. You may think about this a little longer. 🙂

To wrap this up, I should also note that we can also express the gauge condition using our new four-vector notation. Indeed, we can write it as:
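
The formula is missing here. Assuming c = 1 and the four-divergence ∇_μA_μ = ∂A_t/∂t + ∇•A, the gauge condition ∇•A = −∂Φ/∂t becomes:

```latex
\nabla_\mu A_\mu = \frac{\partial \Phi}{\partial t} + \nabla \cdot \mathbf{A} = 0
```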

It’s referred to as the Lorentz condition and it is, effectively, a condition for invariance, i.e. it ensures that the four-vector equation above does stay in the form it is in for all reference frames. Note that we’re re-writing it using the four-dimensional equivalent of the divergence operator ∇•, but so we don’t have a dot between ∇_μ and A_μ. In fact, the notation is pretty confusing, and it’s easy to think we’re talking some gradient, rather than the divergence. So let me therefore highlight the meaning of both once again. It looks the same, but it’s two very different things: the gradient operates on a scalar, while the divergence operates on a (four-)vector. Also note the +−−− signature is only there for the gradient, not for the divergence!

You’ll wonder why they didn’t use some • or ∗ symbol, and the answer: I don’t know. I know it’s hard to keep inventing symbols for all these different ‘products’ – the ⊗ symbol, for example, is reserved for tensor products, which we won’t get into – but… Well… I think they could have done something here. 😦

In any case… Let’s move on. Before we do, please note that we can also re-write our conservation law for electric charge using our new four-vector notation. Indeed, you’ll remember that we wrote that conservation law as:
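
The formula is missing here. In its usual differential form, the conservation law for electric charge is:

```latex
\nabla \cdot \mathbf{j} = -\frac{\partial \rho}{\partial t}
```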

Using our new four-vector operator ∇_μ, we can re-write that as ∇_μ j_μ = 0. So all of electrodynamics can be summarized in two equations only, i.e. Maxwell’s law and the charge conservation law:

OK. We’re now ready to discuss the electromagnetic tensor. [I know… This is becoming an incredibly long and incredibly complicated piece but, if you get through it, you’ll admit it’s really worth it.]

The electromagnetic tensor

The whole analysis above was done in terms of the Φ and A potentials. It’s time to get back to our field vectors E and B. We know we can easily get them from Φ and A, using the rules we mentioned as solutions:
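
The two rules are not reproduced here. In their standard form, which also matches the expression for E_x used further down, they are:

```latex
\begin{aligned}
\mathbf{E} &= -\nabla \Phi - \frac{\partial \mathbf{A}}{\partial t} \\
\mathbf{B} &= \nabla \times \mathbf{A}
\end{aligned}
```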

These two equations should not be seen as just another formula. They are essential, and you should be able to jot them down anytime anywhere. They should be on your kitchen door, in your toilet and above your bed. 🙂 For example, the second equation gives us the components of the magnetic field vector B:
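
The component formulas are missing here. Writing out B = ∇×A with the usual definition of the curl gives:

```latex
\begin{aligned}
B_x &= \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z} \\
B_y &= \frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x} \\
B_z &= \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}
\end{aligned}
```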

Now, look at these equations. The x-component is equal to a couple of terms that involve only y- and z-components. The y-component is equal to something involving only x and z. Finally, the z-component only involves x and y. Interesting. Let’s define a ‘thing’ we’ll denote by F_zy and define as:
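
The definition is missing here. Assuming it simply copies the x-component of B = ∇×A, it would be:

```latex
F_{zy} = \frac{\partial A_z}{\partial y} - \frac{\partial A_y}{\partial z}
```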

So now we can write: B_x = F_zy, B_y = F_xz, and B_z = F_yx. Now look at our equation for E. It turns out the components of E are equal to things like F_xt, F_yt and F_zt! Indeed, F_xt = ∂A_x/∂t − ∂A_t/∂x = E_x!

But… Well… No. 😦 The sign is wrong! E_x = −∂A_x/∂t−∂A_t/∂x, so we need to modify our definition of F_xt. When the t-component is involved, we’ll define our ‘F-things’ as:

So we’ve got a plus instead of a minus. It looks quite arbitrary but, frankly, you’ll have to admit it’s sort of consistent with our +−−− signature for our four-vectors and, in just a minute, you’ll see it’s fully consistent with our definition of the four-dimensional vector operator ∇_μ= (∂/∂t, −∂/∂x, −∂/∂y, −∂/∂z). So… Well… Let’s go along with it.

What about the F_xx, F_yy, F_zz and F_tt terms? Well… F_xx = ∂A_x/∂x − ∂A_x/∂x = 0, and it’s easy to see that F_yy and F_zz are zero too. But F_tt? Well… It’s a bit tricky but, applying our definitions carefully, we see that F_tt must be zero too. In any case, the F_tt = 0 will become obvious as we will be arranging these ‘F-things’ in a matrix, which is what we’ll do now. [Again: does this ring a bell? If not, it should. :-)]

Indeed, we’ve got sixteen possible combinations here, which Feynman denotes as F_μν, which is somewhat confusing, because F_μν usually denotes the 4×4 matrix representing all of these combinations. So let me use the subscripts i and j instead, and define F_ij as:

F_ij = ∇_iA_j − ∇_jA_i

with ∇_i being the t-, x-, y- or z-component of ∇_μ = (∂/∂t, −∂/∂x, −∂/∂y, −∂/∂z) and, likewise, A_i being the t-, x-, y- or z-component of A_μ = (Φ, A_x, A_y, A_z). Just check it: F_zy = −∂A_y/∂z + ∂A_z/∂y = ∂A_z/∂y − ∂A_y/∂z = B_x, for example, and F_xt = −∂Φ/∂x − ∂A_x/∂t = E_x. So the +−−− convention works. [Also note that it’s easier now to see that F_tt = ∂Φ/∂t − ∂Φ/∂t = 0.]

We can now arrange the F_ij in a matrix. This matrix is antisymmetric, because F_ij = – F_ji, and its diagonal elements are zero. [For those of you who love math: note that the diagonal elements of an antisymmetric matrix are always zero because of the F_ij = – F_ji constraint: just use k = i = j in the constraint.]

Now that matrix is referred to as the electromagnetic tensor and it’s depicted below (we plugged c back in, remember that B’s magnitude is 1/c times E’s magnitude).
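
The picture of the matrix is not reproduced here. As a reconstruction, assuming rows and columns ordered t, x, y, z, with i as the row and j as the column, and the definition F_ij = ∇_iA_j − ∇_jA_i with the +−−− signs in ∇_μ, the matrix with c plugged back in would be:

```latex
F =
\begin{pmatrix}
0      & -E_x/c & -E_y/c & -E_z/c \\
E_x/c  & 0      & -B_z   & B_y    \\
E_y/c  & B_z    & 0      & -B_x   \\
E_z/c  & -B_y   & B_x    & 0
\end{pmatrix}
```

For c = 1, the entries reproduce the results above: F_xt = E_x, F_zy = B_x and F_xz = B_y.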

So… Well… Great ! We’re done! Well… Not quite. 🙂

We can get this matrix in a number of ways. The least complicated way is, of course, just to calculate all F_ij components and then put them in a [F_ij] matrix using the i as the row number and the j as the column number. You need to watch out with the conventions though, and so i and j start on t and end on z. 🙂

The other way to do it is to write the ∇_μ = (∂/∂t, −∂/∂x, −∂/∂y, −∂/∂z) operator as a 4×1 column vector, which you then multiply with the four-vector A_μ written as a 1×4 row vector. So ∇_μA_μ is then a 4×4 matrix, which we combine with its transpose, i.e. (∇_μA_μ)^T, as shown below. So what’s written below is (∇_μA_μ) − (∇_μA_μ)^T.
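
You can go through the motions numerically too. The sketch below is an illustration only: the four-potential is completely made up (simple polynomials, so the derivatives are easy to check by hand), the derivatives are approximated by central differences, and all function names are hypothetical. It builds the 4×4 matrix M with M_ij = ∇_iA_j, subtracts its transpose, and reads E and B off the result for c = 1.

```python
def four_potential(t, x, y, z):
    """A made-up four-potential (Phi, A_x, A_y, A_z), chosen only for illustration."""
    return (2*t*x + y*y, t*y, x*z + t*t, 3*x*y - t*z)

def partial(i, point, h=1e-3):
    """Central-difference derivative of all four components along coordinate i."""
    up = list(point)
    up[i] += h
    down = list(point)
    down[i] -= h
    a_up, a_down = four_potential(*up), four_potential(*down)
    return [(a_up[j] - a_down[j]) / (2*h) for j in range(4)]

SIGNS = (1.0, -1.0, -1.0, -1.0)   # nabla_mu = (d/dt, -d/dx, -d/dy, -d/dz)

def field_tensor(point):
    """F[i][j] = nabla_i A_j - nabla_j A_i, indices ordered t, x, y, z."""
    M = [[SIGNS[i] * d for d in partial(i, point)] for i in range(4)]
    return [[M[i][j] - M[j][i] for j in range(4)] for i in range(4)]

T, X, Y, Z = 0, 1, 2, 3
point = (0.5, 1.0, 2.0, 3.0)        # arbitrary (t, x, y, z)
F = field_tensor(point)

E = (F[X][T], F[Y][T], F[Z][T])     # E_x = F_xt, E_y = F_yt, E_z = F_zt
B = (F[Z][Y], F[X][Z], F[Y][X])     # B_x = F_zy, B_y = F_xz, B_z = F_yx

print(E, B)
```

For this particular potential, E = −∇Φ − ∂A/∂t = (−2t − y, −2y − 2t, z) and B = ∇×A = (2x, −3y, z − t), so you can compare the printed numbers with a calculation by hand.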

If you google, you’ll see there’s more than one way to go about it, so I’d recommend you just go through the motions and double-check the whole thing yourself, and please do let me know if you find any mistake! In fact, the Wikipedia article on the electromagnetic tensor denotes the matrix above as F^μν, rather than as F_μν, which is the same tensor but in its so-called covariant form, but so I’ll refer you to that article as I don’t want to make things even more complicated here! As said, there are different conventions around here, and so you need to double-check what is what really. 🙂

Where are we heading with all of this? The next thing is to look at the Lorentz transformation of these F_ij = ∇_iA_j − ∇_jA_i components, because then we know how our E and B fields transform. Before we do so, however, we should note the more general results and definitions which we obtained here:

1. The F_μν matrix (a matrix is just a multi-dimensional array, of course) is a so-called tensor. It’s a tensor of the second rank, because it has two indices in it. We think of it as a very special ‘product’ of two vectors, not unlike the vector cross product a × b, whose components were also defined by a similar combination of the components of a and b. Indeed, we wrote:

So one should think of a tensor as “another kind of cross product” or, preferably, and as Feynman puts it, as a “generalization of the cross product”.

2. In this case, the four-vectors are ∇_μ = (∂/∂t, −∂/∂x, −∂/∂y, −∂/∂z) and A_μ = (Φ, A_x, A_y, A_z). Now, you will probably say that ∇_μ is an operator, not a vector, and you are right. However, we know that ∇_μ behaves like a vector, and so this is just a special case. The point is: because the tensor is based on four-vectors, the F_μν tensor is referred to as a tensor of the second rank in four dimensions. In addition, because of the F_ij = – F_ji result, F_μν is an antisymmetric tensor of the second rank in four dimensions.

3. Now, the whole point is to examine how tensors transform. We know that the vector dot product, aka the inner product, remains invariant under a Lorentz transformation, both in three as well as in four dimensions, but what about the vector cross product, and what about the tensor? That’s what we’ll be looking at now.

The Lorentz transformation of the electric and magnetic fields

Cross products are complicated, and tensors will be complicated too. Let’s recall our example in three dimensions, i.e. the angular momentum vector L, which was a cross product of the radius vector r and the momentum vector p = mv, as illustrated below (the animation also gives the torque τ, which is, loosely speaking, a measure of the turning force).

The components of L are:

Now, this particular definition ensures that L_ij turns out to be an antisymmetric object:

So it’s a similar situation here. We have nine possible combinations, but only three independent numbers. So it’s a bit like our tensor in four dimensions: 16 combinations, but only 6 independent numbers.
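A quick numerical sketch may help here. It assumes the usual definition L_ij = r_i·p_j − r_j·p_i and made-up values for r and p (both are my choices for the illustration). It builds all nine combinations and checks that the three independent ones are just the components of r × p.

```python
def cross(r, p):
    """Ordinary three-dimensional cross product r x p."""
    return [r[1] * p[2] - r[2] * p[1],
            r[2] * p[0] - r[0] * p[2],
            r[0] * p[1] - r[1] * p[0]]

def antisymmetric(r, p):
    """All nine combinations L_ij = r_i*p_j - r_j*p_i."""
    return [[r[i] * p[j] - r[j] * p[i] for j in range(3)] for i in range(3)]

X, Y, Z = 0, 1, 2
r = [1.0, 2.0, 3.0]   # hypothetical position
p = [4.0, 5.0, 6.0]   # hypothetical momentum

L_ij = antisymmetric(r, p)
L_vec = cross(r, p)

# The three independent numbers, matched to the vector components:
# L_x = L_yz, L_y = L_zx, L_z = L_xy.
print(L_ij[Y][Z], L_ij[Z][X], L_ij[X][Y])
print(L_vec)
```

The other six entries are either zero (the diagonal) or the same three numbers with a minus sign.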

Now, it so happens that these three numbers, or objects if you want, transform in exactly the same way as the components of a vector. However, as Feynman points out, that’s a matter of ‘luck’ really. In fact, Feynman points out that, when we have two vectors a = (a_x, a_y, a_z) and b = (b_x, b_y, b_z), we’ll have nine products T_ij = a_ib_j which will also form a tensor of the second rank (cf. the two indices) but which, in general, will not obey the transformation rules we got for the angular momentum tensor, which happened to be an antisymmetric tensor of the second rank in three dimensions.

To make a long story short, it’s not simple in general, and surely not here: with E and B, we’ve got six independent terms, and so we cannot represent six things by four things, so the transformation rules for E and B will differ from those for a four-vector. So what are they then?

Well… Feynman first works out the rules for the general antisymmetric vector combination G_ij = a_ib_j − a_jb_i, with a_i and b_j the t-, x-, y- or z-component of the four-vectors a_μ = (a_t, a_x, a_y, a_z) and b_μ = (b_t, b_x, b_y, b_z) respectively. The idea is to first get some general rules, and then replace G_ij = a_ib_j − a_jb_i by F_ij = ∇_iA_j − ∇_jA_i, of course! So let’s apply the Lorentz rules, which – let me remind you – are the following ones:

So we get:

The rest is all very tedious: you just need to plug these things into the various G_ij = a_ib_j − a_jb_i formulas. For example, for G’_tx, we get:

Hey! That’s just G_tx, so we find that G’_tx = G_tx! What about the rest? Well… That yields something different. Let me shorten the story by simply copying Feynman here:
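If you don't feel like doing the algebra, you can let a few lines of Python do the plugging-in. This is a sketch under stated assumptions: units with c = 1, a boost along x with the rules a'_t = γ(a_t − v·a_x) and a'_x = γ(a_x − v·a_t), and arbitrary sample numbers for the two four-vectors. The function names are mine.

```python
import math

T, X, Y, Z = 0, 1, 2, 3

def boost(a, v):
    """Lorentz-boost a four-vector (a_t, a_x, a_y, a_z) along x, with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    a_t, a_x, a_y, a_z = a
    return (gamma * (a_t - v * a_x), gamma * (a_x - v * a_t), a_y, a_z)

def G(a, b, i, j):
    """Antisymmetric combination G_ij = a_i*b_j - a_j*b_i."""
    return a[i] * b[j] - a[j] * b[i]

# Hypothetical sample four-vectors and relative velocity.
a = (1.0, 2.0, 3.0, 4.0)
b = (5.0, 6.0, 7.0, 9.0)
v = 0.6
gamma = 1.0 / math.sqrt(1.0 - v * v)

a_p, b_p = boost(a, v), boost(b, v)

# G_tx and G_yz come out unchanged; G_ty picks up a piece of G_xy.
print(G(a_p, b_p, T, X), G(a, b, T, X))
print(G(a_p, b_p, Y, Z), G(a, b, Y, Z))
print(G(a_p, b_p, T, Y), gamma * (G(a, b, T, Y) - v * G(a, b, X, Y)))
```

Each printed pair should agree up to rounding, which is the pattern the tedious algebra gives: the tx- and yz-components stay put, and the others mix.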

So… Done!

So what?

Well… Now we just substitute. In fact, there are two alternative formulations of the Lorentz transformations of E and B. They are given below (note the units are such that c = 1):

In addition, there is a third equivalent formulation which is more practical, and also simpler, even if it puts the c‘s back in. It re-defines the field components, distinguishing only two:

The ‘parallel’ components E_|| and B_|| along the x-direction (because they are parallel to the relative velocity of the S and S’ reference frames), and
The ‘perpendicular’ or ‘total transverse’ components E_⊥ and B_⊥, which are the vector sums of the y- and z-components.

So that gives us four equations only:

And, yes, we are done now. This is the Lorentz transformation of the fields. I am sure it has left you totally exhausted. Well… If not… […] It sure left me totally exhausted. 🙂
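For those who prefer code to formulas, here is a sketch of the parallel/perpendicular formulation in its standard textbook form: E'_|| = E_||, B'_|| = B_||, E'_⊥ = γ(E + v×B)_⊥ and B'_⊥ = γ(B − v×E/c²)_⊥, with the relative velocity v along x. I am assuming SI units and a made-up sample field; the function names are mine. As a sanity check, it also computes the combination E² − c²B², which should be the same in both frames.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def transform_fields(E, B, v):
    """E and B as seen from a frame moving with speed v along +x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    vel = (v, 0.0, 0.0)
    vxB = cross(vel, B)
    vxE = cross(vel, E)
    # The x-components (parallel to v) are unchanged;
    # the y- and z-components (perpendicular to v) mix E and B.
    E_p = (E[0], gamma * (E[1] + vxB[1]), gamma * (E[2] + vxB[2]))
    B_p = (B[0], gamma * (B[1] - vxE[1] / C**2), gamma * (B[2] - vxE[2] / C**2))
    return E_p, B_p

def invariant(E, B):
    """E^2 - c^2 B^2, which does not depend on the reference frame."""
    return sum(e * e for e in E) - C**2 * sum(b * b for b in B)

# Hypothetical example: a purely electric field, seen from a frame moving at 0.6c.
E = (0.0, 1.0, 0.0)   # V/m
B = (0.0, 0.0, 0.0)   # T
E_p, B_p = transform_fields(E, B, 0.6 * C)

print(E_p, B_p)
print(invariant(E, B), invariant(E_p, B_p))
```

Note how a field that is purely electric in one frame acquires a magnetic component in the other: that is the 'one phenomenon' idea in its most practical form.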

To lighten things up, let me insert an image of what the transformed field E actually looks like. The first image is the reference frame of a charge itself: we have a simple Coulomb field. The second image shows the charge flying by. Its electric field is ‘squashed up’. To be precise, it’s just like the scale of x is squashed up by a factor (1−v²/c²)^1/2. Let me refer you to Feynman for the detail of the calculations here.

OK. So that’s it. You may wonder: what about that promise I made? Indeed, when I started this post, I said I’d present a mathematical construct that presents the electromagnetic force as one force only, as one physical reality, but so we’re back writing all of it in terms of two vectors—the electric field vector E and the magnetic field vector B. Well… What can I say? I did present the mathematical construct: it’s the electromagnetic tensor. So it’s that antisymmetric matrix really, which one can combine with a transformation matrix embodying the Lorentz transformation rules. So, I did what I promised to do. But you’re right: I am re-presenting stuff in the old style once again.

The second objection that you may have – in fact, that you should have – is that all of this has been rather tedious. And you’re right. The whole thing just re-emphasizes the value of using the four-potential vector. It’s obviously much easier to take that vector from one reference frame to another – so we just apply the Lorentz transformation rules to A_μ = (Φ, A) and get A_μ’ = (Φ’, A’) from it – and then calculate E’ and B’ from it, rather than trying to remember those equations above. However, that’s not the point, or…

Well… It is and it isn’t. We wanted to get away from those two vectors E and B, and show that electromagnetism is really one phenomenon only, and so that’s where the concept of the electromagnetic tensor came in. There were two objectives here: the first objective was to introduce you to the concept of tensors, which we’ll need in the future. The second objective was to show you that, while Lorentz’ force law – F = q(E + v×B) – makes it clear we’re talking one force only, there is a way of writing it all up that is much more elegant.

I’ve introduced the concept of tensors here, so the first objective should have been achieved. As for the second objective, I’ll discuss that in my next post, in which I’ll introduce the four-velocity vector u_μ as well as the four-force vector f_μ. It will explain the following beautiful equation of motion:

Now that looks very elegant and unified, doesn’t it? 🙂

[…] Hmm… No reaction. I know… You’re tired now, and you’re thinking: yet another way of representing the same thing? Well… Yes! So…

OK… Enough for today. Let’s follow up tomorrow.

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/
Some content on this page was disabled on June 20, 2020 as a result of a DMCA takedown notice from Michael A. Gottlieb, Rudolf Pfeiffer, and The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

On (special) relativity: the Lorentz transformations

Pre-scriptum (dated 26 June 2020): These posts on elementary math and physics have not suffered much from the attack by the dark force – which is good because I still like them. While my views on the true nature of light, matter and the force or forces that act on them have evolved significantly as part of my explorations of a more realist (classical) explanation of quantum mechanics, I think most (if not all) of the analysis in this post remains valid and fun to read. In fact, I find the simplest stuff is often the best. 🙂

Original post:

I just skyped to my kids (unfortunately, we’re separated by circumstances) and they did not quite get the two previous posts (on energy and (special) relativity). The main obstacle is that they don’t know much – nothing at all actually – about integrals. So I should avoid integrals. That’s hard but I’ll try to do so in this post, in which I want to introduce special relativity as it’s usually done, and so that’s not by talking about Einstein’s mass-energy equivalence relation first.

Galilean/Newtonian relativity

A lot of people think they understand relativity theory but they often confuse it with Galilean (aka Newtonian) relativity and, hence, they actually do not understand it at all. Indeed, Galilean or Newtonian relativity is as old as Galileo and Newton (so that’s like 400 years old), who stated the principle of relativity as a corollary to the laws of motion: “The motions of bodies included in a given space are the same amongst themselves, whether that space is at rest or moves uniformly forward in a straight line.”

The Galilean or Newtonian principle of relativity is about adding and subtracting speeds: if I am driving at 120 km/h on some highway, but you overtake me at 140 km/h, then I will see you go past me at the rather modest speed of 20 km/h. That’s all there is to it.

Now, that’s not what Einstein‘s relativity theory is about. Indeed, the relationship between your and my reference frame (yours is moving with respect to mine, and mine is moving with respect to yours but with opposite velocity) is very simple in this example. It involves a so-called Galilean transformation only: if my coordinate system is (x, y, z, t), and yours is (x‘, y‘, z‘, t‘), then we can write:

(1) x’ = x – ut (or x = x’ + ut), (2) y’ = y, (3) z’ = z and (4) t’ = t

To continue the example above: if we start counting at t = t’ = 0 when you are overtaking me, and if we both consider ourselves to be at the center of our reference frame (i.e. x = 0 where I am and x’ = 0 where you are), then you will be at x = 10 km after 30 minutes from my point of view, and I will be at x’ = –10 km (so that’s 10 km behind) from your point of view. So x’ = x – ut indeed, with u = 20 km/h.
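For my kids, who like to see things run: the same example as a small Python sketch. The function names are just my own labels for equations (1) and (4) and their inverse; the numbers are the ones from the highway example.

```python
def galilean(x, t, u):
    """My coordinates (x, t) -> your coordinates (x', t')."""
    return x - u * t, t

def galilean_inverse(x_prime, t_prime, u):
    """Your coordinates (x', t') -> my coordinates (x, t)."""
    return x_prime + u * t_prime, t_prime

u = 20.0   # relative speed in km/h (140 km/h minus 120 km/h)
t = 0.5    # half an hour after you overtook me

# Where you are, seen from your own frame: at the origin.
your_x_prime, _ = galilean(10.0, t, u)
# Where I am, seen from your frame: 10 km behind you.
my_x_prime, _ = galilean(0.0, t, u)

print(your_x_prime)  # -> 0.0
print(my_x_prime)    # -> -10.0
print(galilean_inverse(my_x_prime, t, u))  # back to (0.0, 0.5) in my frame
```

Note that time is simply passed through unchanged: that is exactly the assumption Einstein's relativity will do away with.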

Again, that’s not what Einstein’s principle of relativity is about. They knew that very well in the 17th century already. In fact, they actually knew that much earlier but Descartes formalized his Cartesian coordinate system only in the first half of the 17th century and, hence, it’s only from that time onwards that scientists such as Newton and Huygens started using it to transform the laws of physics from one frame of reference to another. What they found is that those laws remained invariant.

For example, the conservation law for momentum remains valid even if, as illustrated below, an inertial observer will see an elastic collision, such as the one illustrated, differently than an observer who’s moving along: for the observer who’s moving along, the (horizontal) speed of the blue ball will be zero, and the (horizontal) speed of the red ball will be twice the speed as observed by the inertial observer. That being said, both observers will find that momentum (i.e. the product of mass and velocity: p = mv) is being conserved in such collisions.

But, again, that’s Galilean relativity only: the laws of Newton are of the same form in a moving system as in a stationary system and, therefore, it is impossible to tell, by making experiments, whether our system is moving or not. In other words: there is no such thing as ‘absolute speed’. But, so – let me repeat it again – that is not what Einstein’s relativity theory is about.

Let me give a more interesting example of Galilean relativity, and then we can see what’s wrong with it. The speed of a sound wave is not dependent on the motion of the source: the sound of a siren of an ambulance or a noisy car engine will always travel at a speed of 343 meters per second, regardless of the motion of the ambulance. So, while we’ll experience a so-called Doppler effect when the ambulance is moving – i.e. a higher pitch when it’s approaching than when it’s receding – this Doppler effect does not have any impact on the speed of the sound wave. It only affects the frequency as we hear it. The speed of the wave depends on the medium only, i.e. air in this case.

Indeed, the speed of sound will be different in another gas, or in a fluid, or in a solid, and there’s a surprisingly simple function for that – the so-called Newton-Laplace equation: v_sound = (k/ρ)^1/2. In this equation, k is a coefficient of ‘stiffness’ of the medium (even if ‘stiffness’ sounds somewhat strange as a concept to apply to gases), and ρ is the density of the medium (so, for the same stiffness, lower or higher density will increase or decrease the speed of sound respectively).
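Here is a small sketch of the Newton-Laplace equation in its square-root form, v = √(K/ρ). The numbers are my own rough assumptions, not measurements from this post: for air at about 20 °C I take the stiffness K as 1.4 times atmospheric pressure (the adiabatic bulk modulus of an ideal diatomic gas) and a density of about 1.204 kg/m³; for water I take K ≈ 2.2 GPa and ρ ≈ 1000 kg/m³.

```python
import math

def speed_of_sound(stiffness, density):
    """Newton-Laplace: v = sqrt(K / rho), in m/s for K in Pa and rho in kg/m^3."""
    return math.sqrt(stiffness / density)

# Assumed, approximate values (see the text above).
K_AIR = 1.4 * 101_325.0   # Pa
RHO_AIR = 1.204           # kg/m^3
K_WATER = 2.2e9           # Pa
RHO_WATER = 1000.0        # kg/m^3

v_air = speed_of_sound(K_AIR, RHO_AIR)
v_water = speed_of_sound(K_WATER, RHO_WATER)

print(round(v_air), "m/s in air")
print(round(v_water), "m/s in water")
```

The air value comes out close to the 343 m/s quoted above. Water is much denser than air but also vastly stiffer, and so sound travels faster in it.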

This has nothing to do with speed being absolute. No. The Galilean relativity principle does come into play, as one would expect: it is actually possible to catch up with a sound wave (or with any wave traveling through some medium). In fact, that’s what supersonic planes do: they catch up with their own sound waves. However, in essence, planes are not any different from cars in terms of their relationship with the sound that they produce. It’s just that they are faster: the sound wave they produce also travels at a speed of 1,235 km/h, and so cars can’t match that, but supersonic planes can!

[As for the shock wave that is being produced as these planes accelerate and actually ‘break’ the ‘sound barrier’, that has to do with the pressure waves the plane creates in front of itself (just like any traveling object compresses the air in front of it). These pressure waves also travel at the speed of sound. Now, as the speed of the object increases, the waves are forced together, or compressed, because they cannot get out of the way of each other. Eventually they merge into one single shock wave, and so that’s what happens and creates the ‘sonic boom’, which also travels at the speed of sound. However, that should not concern us here. For more information on this, I’d refer to Wikipedia, as I got these illustrations from that source, and I quite like the way they present the topic.]

The Doppler effect looks somewhat different (it’s illustrated above) but so, once again, this phenomenon has nothing to do with Einstein’s relativity theory. Why not? Because we are still talking Galilean relativity here. Indeed, let’s suppose our plane travels at twice the speed of sound (i.e. Mach 2 or almost 2,500 km/h). For us, as inertial observers, the speed of the sound wave originating at point 0 in the illustration above (i.e. the reference frame of the inertial observer) will be equal to dx/dt = 1235 km/h. However, for the pilot, the speed of that wave will be equal to

dx’/dt = d(x – ut)/dt = dx/dt – d(ut)/dt = 1235 km/h – u

= 1235 km/h – 2470 km/h = – 1235 km/h

In short, from the point of view of the pilot, he sees the wave front of the wave created at point 0 traveling away from him (cf. the negative value) at 1235 km/h, i.e. the speed of sound. That makes sense obviously, because he travels twice as fast. However – I cannot repeat it enough – this phenomenon has nothing to do with Einstein’s theory of relativity: if they could have imagined supersonic travel, Galileo, Newton and Huygens would have predicted that too.

So what’s Einstein’s theory of (special) relativity about?

Einstein’s principle of relativity

In 1865, the Scottish mathematical physicist James Clerk Maxwell – I guess it’s important to note he’s Scottish with that referendum coming 🙂 – finally discovered that light was nothing but electromagnetic radiation – so radio waves, (visible) light, X-rays, gamma rays,… It’s all the same: electromagnetic radiation, also known as light tout court.

Now, the equations that describe how electromagnetic radiation (i.e. light) travels through space are beautiful but involve operators which you may not recognize and, hence, I will not write them down. The point to note is that Maxwell’s equations were very elegant but… There were two major difficulties with them:

They did not respect Galilean relativity: if we transform them using the above-mentioned Galilean transformation (x’ = x – ut, y’ = y, z’ = z and t’ = t) then we do not get some relative speed of light. On the contrary, according to Maxwell’s equations, from whatever reference frame you look at light, it should always travel at the same (absolute) speed of light c = 299,792 km/s. So c is a constant, and the same constant, ALWAYS.
Scientists did not have any clue about the medium in which light was supposed to travel. The second half of the 19th century saw lots of experiments trying to discover evidence of a hypothetical ‘luminiferous aether’ in which light was supposed to travel, and which should also have some ‘stiffness’ and ‘density’, but so they could not find any trace of it. No one ever did, and so now we’ve finally accepted that light can actually travel in a vacuum, i.e. in plain nothing.

So what? Well… Let’s first look at the first point. Just as with a sound wave, the motion of the source does not have any impact on the speed of light: it goes out in all directions at the same speed c, whether it is emitted from a fast-moving car or from some beacon near the sea. However, unlike sound waves, Maxwell’s equations imply that we cannot catch up with light waves. That’s troublesome, very troublesome, because, according to the above-mentioned Galilean transformation rules,

i.e. v’ = dx’/dt = dx/dt – u = v – u,

some light beam that is traveling at speed v = c past a spaceship that itself is traveling at speed u – let’s say u = 0.2c for example – should have a speed of c’ = c – 0.2c = 0.8c = 239,834 km/s only with respect to the spaceship. However, that’s not what Maxwell’s equations say when you substitute x, y, z and t for x’, y’, z’ and t’ using those four simple equations x’ = x – ut, y’ = y, z’ = z and t’ = t. After you do the substitution, the transformed Maxwell equations will once again yield that c’ = c = 299,792 km/s, and not c’ = 0.8×299,792 km/s = 239,834 km/s.

That’s weird ! Why? Well… If you don’t think that this is weird, then you’re actually not thinking at all ! Just compare it with the example of our sound wave. There is just no logic to it !

The discovery startled all scientists because there could only be two possible solutions to the paradox:

Either Maxwell’s equations were wrong (because they did not observe the principle of Galilean relativity) or, else,
Newton’s equations (and the Galilean transformation rules – i.e. the Galilean relativity principle) were wrong.

Obviously, scientists and experimenters first tried to prove that Maxwell had it all wrong – if only because no experiment had ever shown Newton’s Laws to be wrong, and so it was probably hard – if not impossible – to try to come up with one that would ! So, instead, experimenters invented all kinds of wonderful apparatuses trying to show that the speed of the light was actually not absolute.

Basically, these experiments assumed that the speed of the Earth, as it revolves around the Sun at a speed of 108,000 km per hour, would result in measurable differences of c that would depend on the direction of the apparatus. More specifically, the speed of the light beam, as measured, would be different if the light beam would be traveling parallel to the motion of the Earth, as opposed to the light beam traveling at right angle to the motion of the Earth. Why? Well… It’s the same idea as the car chasing its own light beams, but I’ll refer you to other descriptions of the experiment, because explaining these set-ups would take too much time and space. 🙂 I’ll just say that, because 108,000 km/h (on average) is only about 30 km per second (i.e. 0.0001 times c), these experiments relied on (expected) interference effects. The technical aspect of these experiments is really quite interesting. However, as mentioned above, I’ll refer you to Wikipedia or other sources if you’d want more detail.

Just note the most famous of those experiments: the 1887 Michelson-Morley experiment, also known as ‘the most famous failed experiment in history’ because, indeed, it failed to find any interference effects: the speed of light always was the speed of light, regardless of the direction of the beam with respect to the direction of motion of the Earth.

The Lorentz transformations

Once the scientists had recovered from this startling news (Michelson himself suffered from a nervous breakdown for a while, because he really wanted to find that interference effect in order to disprove Maxwell’s Laws), they suggested solutions.

The math was solved first. Indeed, just before the turn of the century, the Dutch physicist Hendrik Antoon Lorentz suggested that, if material bodies would contract in the direction of their motion with a factor (1 – u²/c²)^1/2 and, in addition, if time would also be dilated with a factor (1 – u²/c²)^–1/2, then the Michelson-Morley results could be explained. Of course, scientists objected to this ‘explanation’ as being very much ‘ad hoc’.

So then came Einstein. He just took the math for granted, so Einstein basically accepted the so-called Lorentz transformations that resulted from it, and corrected Newton’s Law in order to set physics right again.

And so that was it. As it turned out, all that was needed in fact, was to do away with the assumption that the inertia (or mass) of an object is a constant and, hence, that it does not vary with its velocity. For us, today, it seems obvious: mass also varies, and the factor involved is the very same Lorentz factor that we mentioned above: γ = (1 – u²/c²)^–1/2. Hence, the m in Newton’s Second Law (F = d(mv)/dt) is not a constant but equal to m = γm₀. For all speeds that we, human beings, can imagine (including the astronomical speed of the Earth in orbit around the Sun), the ‘correction’ is too small to be noticeable, or negligible, but so it’s there, as evidenced by the Michelson-Morley experiment, and, some hundred years later, we can actually verify it in particle accelerators.
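To get a feel for how small the ‘correction’ is, here is a quick sketch that evaluates the Lorentz factor γ = (1 – u²/c²)^–1/2 for the orbital speed of the Earth (about 30 km/s, as mentioned above) and, for comparison, for a speed of 0.6c. The function name and the choice of 0.6c are mine.

```python
import math

C_KM_S = 299_792.458  # speed of light in km/s

def lorentz_factor(u_km_s):
    """gamma = 1 / sqrt(1 - u^2/c^2), for a speed u given in km/s."""
    return 1.0 / math.sqrt(1.0 - (u_km_s / C_KM_S) ** 2)

gamma_earth = lorentz_factor(30.0)          # Earth in orbit around the Sun
gamma_fast = lorentz_factor(0.6 * C_KM_S)   # a particle at 60% of c

# The correction for the Earth is a few parts in a billion.
print(gamma_earth - 1.0)
print(gamma_fast)
```

So the relativistic mass m = γm₀ of the Earth differs from its rest mass by a few parts per billion only, while at 0.6c the factor is already 1.25.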

As said, for us, today, it’s obvious (in my previous post, I mention a few examples: I explain how the mass of electrons in an electron beam is impacted by their speed, and how the lifetime of muons increases because of their speed) but one hundred years ago, it was not. Not at all – and so that’s why Einstein was a genius: he dared to explore and accept the non-obvious.

Now, what then are the correct transformations from one reference frame to another? They are referred to as the Lorentz transformations, and they can be written down (in a simplified form, assuming relative motion in the x direction only) as follows:

(1) x’ = γ(x – ut), (2) y’ = y, (3) z’ = z and (4) t’ = γ(t – ux/c²), with γ = (1 – u²/c²)^–1/2
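A short sketch shows what these transformations do to the spaceship example. I assume the standard form x’ = γ(x – ut) and t’ = γ(t – ux/c²), and I take a light pulse that sits at x = ct in my frame. The function name is mine; u = 0.2c is the spaceship speed used earlier.

```python
import math

C = 299_792.458  # speed of light in km/s

def lorentz(x, t, u):
    """Transform an event (x, t) to a frame moving at speed u along x."""
    gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    return gamma * (x - u * t), gamma * (t - u * x / C**2)

u = 0.2 * C   # speed of the spaceship
t = 1.0       # one second after the light pulse left the origin
x = C * t     # position of the light pulse in my frame

x_prime, t_prime = lorentz(x, t, u)

speed_in_ship_frame = x_prime / t_prime   # what the Lorentz rules give
galilean_guess = C - u                    # what v' = v - u would give

print(speed_in_ship_frame)
print(galilean_guess)
```

The first number comes out as c again (up to rounding), not as 0.8c: both x’ and t’ shrink by the same factor, so their ratio does not change.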

Now, I could point out many interesting implications, or come up with examples, but I will resist the temptation. I will only note two things about them:

1. These Lorentz transformations actually re-establish the principle of relativity: the Laws of Nature – including the Laws of Newton as corrected by Einstein’s relativistic mass formula – are of the same form in a moving system as in a stationary system, and therefore it is impossible to tell, by making experiments, whether the system is moving or not.

2. The second thing I should note is that the equations above imply that the idea of absolute time is no longer valid: there is no such thing as ‘absolute’ or ‘universal’ time. Indeed, Lorentz’ concept of ‘local time’ is a most profound departure from Newtonian mechanics that is implicit in these equations.

Indeed, space and time are entangled in these equations as you can see from the –ut and –ux/c² terms in the equation for x’ and t’ respectively and, hence, the idea of simultaneity has to be abandoned: what happens simultaneously in two separated places according to one observer, does not happen at the same time as viewed by an observer moving with respect to the first. Let me quickly show how.

Suppose that in my world I see two events happening at the same time t₀ but so they happen at two different places x₁ and x₂. Now, if you are moving away from me at a (uniform) speed u, then equation (4) tells us that you will see these two events happen at two different times t₁’ and t₂’, with the time difference t₂’ – t₁’ equal to t₂’ – t₁’ = γ[u(x₁ – x₂)/c²], with γ the above-mentioned Lorentz factor. [Just do the calculation for yourself using equation 4.]
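If you'd rather have the computer do that calculation, here is a sketch. It assumes the time equation in the form t’ = γ(t – ux/c²) and uses made-up numbers: two events that are simultaneous for me (t₀ = 0) but 300,000 km apart, and an observer moving at half the speed of light.

```python
import math

C = 299_792.458  # speed of light in km/s

def t_prime(x, t, u):
    """Time of the event (x, t) for an observer moving at speed u along x."""
    gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
    return gamma * (t - u * x / C**2)

# Hypothetical numbers for the illustration.
t0 = 0.0                    # both events happen at t0 in my frame
x1, x2 = 0.0, 300_000.0     # positions in km
u = 0.5 * C                 # speed of the moving observer

gamma = 1.0 / math.sqrt(1.0 - (u / C) ** 2)
difference = t_prime(x2, t0, u) - t_prime(x1, t0, u)
formula = gamma * u * (x1 - x2) / C**2

print(difference)  # in seconds; not zero, so not simultaneous for you
print(formula)     # the closed-form expression gives the same number
```

With these numbers the difference is more than half a second, and it is negative: for the moving observer, the event at x₂ happens first.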

Of course, the effect is negligible for most speeds that we, as human beings, can imagine, but it’s there. So we do not have three separate space coordinates and one time coordinate, but four space-time coordinates that transform together, fully entangled, when applying those four equations above.

That observation led the German mathematician Hermann Minkowski, who helped Einstein to develop his theory of four-dimensional space-time, to famously state that “Space of itself, and time of itself, will sink into mere shadows, and only a kind of union between them shall survive.”

Post scriptum: I did not elaborate on the second difficulty when I mentioned Maxwell’s equations: the lack of a need for a medium for light to travel through. I will let that rest for the moment (or, else, you can just Google some stuff on it). Just note that (1) it is kinda convenient that electromagnetic radiation does not need any medium (I can’t see how one would incorporate that in relativity theory) and (2) that light does seem to slow down in a medium. However, the explanation for that (i.e. for light to have an apparently lower speed in a medium) is to be found in quantum mechanics and so we won’t touch upon that complex matter here (for now that is). The point to note is that this slowing down is caused by light interacting with the matter it encounters as it travels through the medium. It does not actually go slower. However, I need to stop here as this is, yet again, a post which has become way too long. On the other hand, I am hopeful my kids will actually understand this one, because it does not involve integrals. 🙂