The nature of time: an easy explanation of relativity

My manuscript offers a somewhat sacrilegious but intuitive explanation of (special) relativity theory (The Emperor Has No Clothes: the force law and relativity, p. 24-27). It is one of my lighter and more easily accessible pieces of writing. The argument is based on the idea that we may define infinity or infinite velocities as some kind of limit (or some kind of limiting idea), but that we cannot really imagine it: it leads to all kinds of logical inconsistencies.

Let me give you a very simple example here to illustrate these inconsistencies: if something is traveling at an infinite velocity, then it is everywhere and nowhere at the same time, and no theory of physics can deal with that.

Now, if I had to rewrite that brief introduction to relativity theory, I would probably add another logical argument: one that is based on our definition or notion of time itself. What is the definition of time, indeed? When you think long and hard about this, you will have to agree we can only measure time with reference to some fundamental cycle in Nature, right? It used to be the seasons, or the days or nights. Later, we subdivided a day into hours, and now we have atomic clocks. Whatever you can count and meaningfully communicate to some other intelligent being who happens to observe the same cyclical phenomenon works just fine, right?

Hence, if we were able to communicate with some other intelligent being in outer space, whose position we may or may not know, and both he/she/it (let us think of a male Martian for ease of reference) and we are broadcasting our frequency- or amplitude-modulated signals widely enough to ensure ongoing communication, then we would probably be able to converge on a definition of time in terms of the fundamental frequency of an elementary particle – let us say an electron to keep things simple. We could, therefore, agree on an experiment where he – after receiving a pre-agreed start signal from us – would start counting and send us a stop signal back after, say, three billion electron cycles (not approximately, of course, but three billion exactly). In the meantime, we would be able, of course, to verify that, in between sending the start signal and receiving the stop signal (and taking into account the time that the start and stop signals need to travel between him and us), his clock seems to run somewhat differently than ours.

So that is the amazing thing, really. Our Martian uses the same electron clock, but our/his motion relative to his/ours leads us to the conclusion his clock works somewhat differently, and Einstein’s (special) relativity theory tells us how, exactly: time dilation, as given by the Lorentz factor.
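To make the bookkeeping of this thought experiment a bit more tangible, here is a minimal sketch in Python. All of the numbers are hypothetical: I assume a made-up clock frequency of 1 GHz (not the actual frequency of an electron), distances expressed in light-seconds, and a Martian whose distance we know both when the start signal reaches him and when he sends the stop signal. The function names are mine, just for illustration.

```python
import math

def inferred_gamma(t_send, t_receive, d_start, d_stop, cycles, clock_hz):
    """Ratio of our elapsed time to the time the Martian counted himself.

    Times are in seconds on our clock. Distances are in light-seconds,
    so a signal needs d seconds to cover a distance d.
    """
    t_start = t_send + d_start       # our time when the start signal reaches him
    t_stop = t_receive - d_stop      # our time when he sent the stop signal
    our_elapsed = t_stop - t_start
    his_elapsed = cycles / clock_hz  # what his own clock counted
    return our_elapsed / his_elapsed

def speed_from_gamma(gamma):
    """Relative speed as a fraction of c, from gamma = 1/sqrt(1 - beta^2)."""
    return math.sqrt(1.0 - 1.0 / gamma**2)

# Hypothetical experiment: three billion cycles of an assumed 1 GHz clock.
gamma = inferred_gamma(t_send=0.0, t_receive=8.0,
                       d_start=1.0, d_stop=3.25,
                       cycles=3_000_000_000, clock_hz=1.0e9)
beta = speed_from_gamma(gamma)
print(gamma, beta)
```

With these made-up inputs, the ratio between our elapsed time and his is the Lorentz factor, from which we can back out the relative speed. The point is only that the comparison needs the positions as well as the timings.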

Does this explanation make it any easier to truly understand relativity theory? Maybe. Maybe not. For me, it does, because what I am describing here is nothing but the results of the Michelson-Morley experiment in a slightly more amusing context which, for some reason I do not quite understand, seems to make them more comprehensible. At the very least, it shows Galilean relativity is as incomprehensible – or as illogical or non-intuitive, I should say – as the modern-day concept of relativity as pioneered by Albert Einstein.

You may now think (or not): OK, but what about relativistic mass? That concept is, and will probably forever remain, non-intuitive. Right? Time dilation and length contraction are fine, because we can now somehow imagine the what and why of this, but how do you explain relativistic mass, really?

The only answer I can give you here is to think some more about Newton’s law: mass is a measure of inertia, so that is a resistance to a change in the state of motion of an object. Motion and, therefore, your measurement of any acceleration or deceleration (i.e. a change in the state of motion) will depend on how you measure time and distance too. Therefore, mass has to be relativistic too.

QED: quod erat demonstrandum. In fact, it is not a proof, so I should not say it’s QED. It’s SE: a satisfactory explanation. Why is it an explanation and not a proof? Because I take the constant speed of light for granted, and so I kinda derive the relativity of time, distance and mass from my point of departure (both figuratively and literally speaking, I’d say).

Post scriptum: For the mentioned calculation, we do need to know the (relative) position of the Martian, of course. Any event in physics is defined by both its position as well as its timing. That is what (also) makes it all very consistent, in fact. I should also note this short story here (I mean my post) is very well aligned with Einstein’s original 1905 article, so you can (also) go there to check the math. The main difference between his article and my explanation here is that I take the constant speed of light for granted, and then all that’s relative derives its relativity from that. Einstein looked at it the other way around, because things were not so obvious then. 🙂


Field energy and field momentum

This post goes to the heart of the E = mc² equation. It’s kinda funny, because Feynman just compresses all of it in a sub-section of his Lectures. However, as far as I am concerned, I feel it’s a very crucial section. Pivotal, I’d say, which would fit with its place in all of the 115 Lectures that make up the three volumes, which is sort of mid-way, which is where we are here. So let’s go for it. 🙂

Let’s first recall what we wrote about the Poynting vector S, which we calculate from the electric and magnetic field vectors E and B by taking their cross-product:

S = ε₀c²E×B

This vector represents the energy flow, per unit area and per unit time, in electrodynamical situations. If E and/or B are zero (which is the case in electrostatics, for example, because we don’t have magnetic fields in electrostatics), then S is zero too, so there is no energy flow then. That makes sense, because we have no moving charges, so where would the energy go to?

I also made it clear we should think of S as something physical, by comparing it to the heat flow vector h, which we presented when discussing vector analysis and vector operators. The heat flow out of a surface element da is the area times the component of h perpendicular to da, so that’s (h·n)·da = hₙ·da. Likewise, we can write (S·n)·da = Sₙ·da. The units of S and h are also the same: joule per second and per square meter or, using the definition of the watt (1 W = 1 J/s), watt per square meter. In fact, if you google a bit, you’ll find that both h and S are referred to as a flux density:

  1. The heat flow vector h is the heat flux density vector, from which we get the heat flux through an area through the (h·n)·da = hₙ·da product.
  2. The energy flow vector S is the energy flux density vector, from which we get the energy flux through the (S·n)·da = Sₙ·da product.
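If you want to play with these two products, the sketch below computes S = ε₀c²E×B and the flux through a small flat area in plain Python. The field values are purely illustrative (a plane wave with |B| = |E|/c is my assumption here), and the helper names are mine.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m
C = 299_792_458.0             # speed of light, m/s

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def poynting(E, B):
    """S = epsilon_0 * c^2 * (E x B), in W/m^2."""
    return tuple(EPSILON_0 * C**2 * comp for comp in cross(E, B))

def flux_through(S, n, area):
    """Energy per second through a small flat area with unit normal n."""
    return dot(S, n) * area

E = (1.0, 0.0, 0.0)      # V/m, illustrative
B = (0.0, 1.0 / C, 0.0)  # T, assuming the plane-wave relation |B| = |E|/c
S = poynting(E, B)

print(S)                                      # points along z
print(flux_through(S, (0.0, 0.0, 1.0), 2.0))  # area facing the flow
print(flux_through(S, (1.0, 0.0, 0.0), 2.0))  # area parallel to the flow
print(poynting(E, (0.0, 0.0, 0.0)))           # no B field, no energy flow
```

Only the component of S perpendicular to the surface element counts, which is why the flux through an area parallel to the flow comes out as zero.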

So that should be enough as an introduction to what I want to talk about here. Let’s first look at the energy conservation principle once again.

Local energy conservation

In a way, you can look at my previous post as being all about the equation below, which we referred to as the ‘local’ energy conservation law:

[Equation image: the ‘local’ energy conservation law]

Of course, it is not the complete energy conservation law. The local energy is not only in the field. We’ve got matter as well, and so that’s what I want to discuss here: we want to look at the energy in the field as well as the energy that’s in the matter. Indeed, field energy is conserved, and then it isn’t: if the field is doing work on matter, or matter is doing work on the field, then… Well… Energy goes from one to the other, i.e. from the field to the matter or from the matter to the field. So we need to include matter in our analysis, which we didn’t do in our last post. Feynman gives the following simple example: we’re in a dark room, and suddenly someone turns on the light switch. So now the room is full of field energy—and, yes, I just mean it’s not dark anymore. :-). So that means some matter out there must have radiated its energy out and, in the process, it must have lost the equivalent mass of that energy. So, yes, we had matter losing energy and, hence, losing mass.

Now, we know that energy and momentum are related. Respecting and incorporating relativity theory, we’ve got two equivalent formulas for it:

  1. E² − p²c² = m₀²c⁴
  2. pc = E·(v/c) ⇔ p = v·E/c² = m·v
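A quick numerical check shows that the two formulas are consistent with each other. The sketch below assumes the usual E = mc² and m = γ·m₀ relations, and uses an illustrative rest mass of 1 kg moving at 0.6c. The function names are mine.

```python
import math

C = 299_792_458.0  # speed of light, m/s

def gamma(v):
    """Lorentz factor for speed v (in m/s)."""
    return 1.0 / math.sqrt(1.0 - (v / C) ** 2)

def energy_and_momentum(m0, v):
    """Return (E, p) using E = m*c^2 and p = m*v with m = gamma*m0."""
    m = gamma(v) * m0
    return m * C**2, m * v

m0 = 1.0     # kg, illustrative rest mass
v = 0.6 * C  # illustrative speed
E, p = energy_and_momentum(m0, v)

invariant = E**2 - (p * C) ** 2  # formula 1: should equal (m0*c^2)^2
print(math.isclose(invariant, (m0 * C**2) ** 2, rel_tol=1e-9))
print(math.isclose(p * C, E * (v / C), rel_tol=1e-12))  # formula 2
```

Whatever speed you plug in (below c), the E² − p²c² combination gives back the same rest energy squared.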

The E = mc² and m = m₀/√(1−v²/c²) formulas connect both expressions. So we can look at it in either of two ways. We could use the energy conservation law, but Feynman prefers the conservation of momentum approach, so let’s see where he takes us. If the field has some energy (and, hence, some equivalent mass) per unit volume, and if there’s some flow, so if there’s some velocity (which there is: that’s what our previous post was all about), then it will have a certain momentum per unit volume. [Remember: momentum is mass times velocity.] That momentum will have a direction, so it’s a vector, just like p = mv. We’ll write it as g, so we define g as:

g is the momentum of the field per unit volume.

What units would we express it in? We’ve got a bit of choice here. For example, because we’re relating everything to energy here, we may want to convert our kilogram into eV/c² or J/c² units, using the mass-energy equivalence relation E = mc². Hmm… Let’s first keep the kg as a measure of inertia though. So we write: [g] = [m]·[v]/m³ = (kg·m/s)/m³. Hmm… That doesn’t show it’s energy, so let’s replace the kg with a unit that’s got newton and meter in it, cf. the F = ma law. So we write: [g] = (kg·m/s)/m³ = (kg/s)/m² = [(N·s²/m)/s]/m² = N·s/m³. Well… OK. The newton·second is the unit of momentum indeed, and we can re-write it including the joule (1 J = 1 N·m), so then we get [g] = (J·s/m⁴), so what’s that? Well… Nothing much. However, I do note it happens to be the dimension of S/c², so that’s [S/c²] = [J/(s·m²)]·(s²/m²) = (J·s/m⁴). 🙂 Let’s continue the discussion.
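Juggling units by hand is error-prone, so here is a small sketch that does the bookkeeping mechanically. It represents each dimension as a tuple of exponents of the SI base units (kg, m, s). That representation, and all the names in it, are just my choice for this illustration.

```python
# A dimension is a tuple of exponents of the SI base units (kg, m, s).
def dim(kg=0, m=0, s=0):
    return (kg, m, s)

def mul(a, b):
    """Multiplying quantities adds the exponents."""
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    """Dividing quantities subtracts the exponents."""
    return tuple(x - y for x, y in zip(a, b))

KG, M, S_ = dim(kg=1), dim(m=1), dim(s=1)
NEWTON = dim(kg=1, m=1, s=-2)  # F = m*a
JOULE = mul(NEWTON, M)         # 1 J = 1 N*m
VELOCITY = div(M, S_)
AREA = dim(m=2)
VOLUME = dim(m=3)

momentum_density = div(mul(KG, VELOCITY), VOLUME)   # [g] = (kg*m/s)/m^3
poynting = div(JOULE, mul(S_, AREA))                # [S] = J/(s*m^2)
s_over_c2 = div(poynting, mul(VELOCITY, VELOCITY))  # [S/c^2]

print(momentum_density == s_over_c2)                       # same dimension
print(div(mul(NEWTON, S_), VOLUME) == momentum_density)    # N*s/m^3
print(div(mul(JOULE, S_), dim(m=4)) == momentum_density)   # J*s/m^4
```

All three comparisons come out true, so the N·s/m³ and J·s/m⁴ expressions are indeed just different ways of writing the same thing.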

Now, momentum is conserved, and each component of it is conserved. So let’s look at the x-direction. We should have something like:

−∂(momentum of matter)x/∂t = ∂gx/∂t + (momentum outflow)x

If you look at this carefully, you’ll probably say: “OK. I understood the thing with the dark room and light switch. Mass got converted into field energy, but what’s that second term on the right?”

Good. Smart. Right remark. Perfect. […] Let me try to answer the question. While all of the quantities above are expressed per unit volume, we’re actually looking at the same infinitesimal volume element here, so the example of the light switch is actually an example of a ‘momentum outflow’, so it’s actually an example of that second term on the right-hand side of the equation above kicking in! 🙂

Indeed, the first term just sort of reiterates the mass-energy equivalence: the energy that’s in the matter can become field energy, so to speak, in our infinitesimal volume element itself, and vice versa. But if it doesn’t, then it should get out and, hence, become ‘momentum outflow’. Does that make sense? No?

Hmm… What to say? You’ll need to look at that equation a couple of times more, I guess. :-/ But I need to move on, unfortunately. [Don’t get put off when I say things like this: I am basically talking to myself, so it means I’ll need to re-visit this myself. :-/]

Let’s look at all of the three terms:

  1. The left-hand side (i.e. the time rate-of-change of the momentum of matter) is easy. It’s just the force on it, which we know is equal to F = q(E+v×B). Do we know that? OK… I’ll admit it. Sometimes it’s easy to forget where we are in an analysis like this, but so we’re looking at the electromagnetic force here. 🙂 As we’re talking infinitesimals here and, therefore, charge density rather than discrete charges, we should re-write this as the force per unit volume, which is ρE+j×B. [This is an interesting formula which I didn’t use before, so you should double-check it. :-)]
  2. The first term on the right-hand side should be equally obvious, or… Well… Perhaps somewhat less so. But with all my rambling on the Uncertainty Principle and/or the wave-particle duality, it should make sense. If we scrap the second term on the right-hand side, we basically have an equation that is equivalent to the E = mc2 equation. No? Sorry. Just look at it, again and again. You’ll end up understanding it. 🙂
  3. So it’s that second term on the right-hand side. What the hell does that say? Well… I could say: it’s the local energy or momentum conservation law. If the energy or momentum doesn’t stay in, it has to go out. 🙂 But that’s not very satisfactory as an answer, of course. However, please just go along with this ‘temporary’ answer for a while.
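For those who do want to double-check the ρE+j×B formula in the first item, the short derivation below is a sketch under a simplifying assumption that is mine: N identical charges q per unit volume, all moving with the same velocity v. The symbols N and f (the force per unit volume) are introduced here for illustration only.

```latex
\begin{aligned}
\mathbf{F} &= q\,(\mathbf{E} + \mathbf{v}\times\mathbf{B})
  && \text{(force on one charge)} \\
\rho &= Nq, \qquad \mathbf{j} = Nq\,\mathbf{v} = \rho\,\mathbf{v}
  && \text{(charge and current density)} \\
\mathbf{f} &= N\,\mathbf{F}
  = Nq\,\mathbf{E} + Nq\,\mathbf{v}\times\mathbf{B}
  = \rho\,\mathbf{E} + \mathbf{j}\times\mathbf{B}
  && \text{(force per unit volume)}
\end{aligned}
```

So the force density is just the Lorentz force on one charge multiplied by the number of charges per unit volume, with ρ and j absorbing the Nq and Nqv factors.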

So what is that second term on the right-hand side? As we wrote it, it’s an x-component – or, let’s put it differently, it is or was part of the x-component of the momentum density – but, frankly, we should probably allow it to go out in any direction really, as the only constraint on the left-hand side is a per second rate of change of something. Hence, Feynman suggests equating it to something like this:

(momentum outflow)x = ∂a/∂x + ∂b/∂y + ∂c/∂z

What are a, b and c? The components of some vector? Not sure. We’re stuck. This piece really requires very advanced math. In fact, as far as I know, this is the only time where Feynman says: “Sorry. This is too advanced. I’ll just give you the equation. Sorry.” So that’s what he does. He explains the philosophy of the argument, which is the following:

  1. On the left-hand side, we’ve got the time rate-of-change of momentum, so that obeys the F = dp/dt = d(mv)/dt law, with the force F per unit volume being equal to ρE+j×B.
  2. On the right-hand side, we’ve got something that can be written as:

∂gx/∂t + ∂a/∂x + ∂b/∂y + ∂c/∂z

So we’d need to find a way to write ρE+j×B in terms of E and B only – eliminating ρ and j by using Maxwell’s equations or whatever other trick – and then juggle terms and make substitutions to get it into a form that looks like the formula above, i.e. the right-hand side of that equation. But Feynman doesn’t show us how it’s done. He just mentions some theorem in physics, which says that the energy that’s flowing through a unit area per unit time divided by c² – so that’s E/c² per unit area and per unit time – must be equal to the momentum per unit volume in the space, so we write:

g = S/c²

He illustrates the general theorem that’s used to get the equation above by giving two examples:

[Figure: Feynman’s two examples illustrating the theorem]

OK. Two good examples. However, it’s still frustrating to not see how we get the g = S/c² in the specific context of the electromagnetic force, so let’s do a dimensional analysis at least. In my previous post, I showed that the dimension of S must be J/(m²·s), so [S/c²] = [J/(m²·s)]/(m²/s²) = [N·m/(m²·s)]·(s²/m²) = [N·s/m³]. Now, we know that the unit of mass is 1 kg = 1 N/(m/s²). That’s just the force law: a force of 1 newton will give a mass of 1 kg an acceleration of 1 m/s per second, so 1 N = 1 kg·(m/s²). So the [N·s/m³] dimension is equal to [kg·(m/s²)·s/m³] = [kg·(m/s)]/m³, which is the dimension of momentum (p = mv) per unit volume, indeed. So, yes, the dimensional analysis works out, and it’s also in line with the p = v·E/c² = m·v equation, but… Oh… We did a dimensional analysis already, where we also showed that [g] = [S/c²] = (J·s/m⁴). Well… In any case… It’s a bit frustrating to not see the detail here, but let us note the Grand Result once again:

The Poynting vector S gives us the energy flow as well as the momentum density g = S/c².
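To get a feel for the magnitudes involved, here is a minimal sketch that turns an energy flow into a momentum density. The input is an assumption of mine: roughly 1361 W/m², about the intensity of sunlight arriving at the Earth. It’s there for illustration only, and the function names are mine too.

```python
C = 299_792_458.0  # speed of light, m/s

def momentum_density(S):
    """g = S/c^2, in (kg*m/s) per m^3, i.e. N*s/m^3."""
    return S / C**2

def rescaled_flow(S):
    """S/c = c*g, in J/m^3 (or, equivalently, N/m^2)."""
    return S / C

S_sunlight = 1361.0  # W/m^2, assumed illustrative value
g = momentum_density(S_sunlight)
u = rescaled_flow(S_sunlight)

print(g)  # a very small momentum per cubic meter
print(u)  # energy per cubic meter
```

The 1/c² factor makes the momentum density tiny for everyday energy flows, which is why we don’t notice a flashlight pushing back on our hand.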

But what does it all mean, really? Let’s go through Einstein’s illustration of the principle. That will help us a lot. Before we do, however, I’d like to note something. I’ve always wondered a bit about that dichotomy between energy and momentum. Energy is force times distance: 1 joule is 1 newton × 1 meter indeed (1 J = 1 N·m). Momentum is force times time, as we can express it in N·s. Planck’s constant combines all three in the dimension of action, which is force times distance times time: h ≈ 6.6×10⁻³⁴ N·m·s, indeed. I like that unity. In this regard, you should, perhaps, quickly review that post in which I explain that h is the energy per cycle, i.e. per wavelength or per period, of a photon, regardless of its wavelength. So it’s really something very fundamental.

We’ve got something similar here: energy and momentum coming together, and being shown as one aspect of the same thing: some oscillation. Indeed, just see what happens with the dimensions when we ‘distribute’ the 1/c² factor on the right-hand side over the two sides, so we write: c·g = S/c and work out the dimensions:

  1. [c·g] = (m/s)·(N·s)/m³ = N/m² = J/m³.
  2. [S/c] = (s/m)·(N·m)/(s·m²) = N/m² = J/m³.

Isn’t that nice? Both sides of the equation now have a dimension like ‘the force per unit area’, or ‘the energy per unit volume’. To get that, we just re-scaled g and S, by c and 1/c respectively. As far as I am concerned, this shows an underlying unity we probably tend to mask with our ‘related but different’ energy and momentum concepts. It’s like E and B: I just love it that we can write them together in our Poynting formula S = ε₀c²E×B. In fact, let me show something else here, which you should think about. You know that c² = 1/(ε₀μ₀), so we can also write S as S = E×B/μ₀. That’s nice, but what’s nice too is the following:

  1. S/c = c·g = ε₀cE×B = E×B/(μ₀c)
  2. S/g = c² = 1/(ε₀μ₀)
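These identities are easy to verify numerically. The sketch below uses the CODATA values for ε₀ and μ₀ as I remember them (so do check them against an official table), and an illustrative plane wave with |E| = 1 V/m and |B| = |E|/c, with E perpendicular to B. The variable names are mine.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m
MU_0 = 1.25663706212e-6       # vacuum permeability, N/A^2
C = 299_792_458.0             # speed of light, m/s

c_from_constants = 1.0 / math.sqrt(EPSILON_0 * MU_0)

# Illustrative plane wave: E perpendicular to B, so |E x B| = |E|*|B|.
E_mag = 1.0        # V/m
B_mag = E_mag / C  # T

S_eps = EPSILON_0 * C**2 * E_mag * B_mag  # S = eps0 * c^2 * |E x B|
S_mu = E_mag * B_mag / MU_0               # S = |E x B| / mu0
g = S_eps / C**2                          # momentum density

print(math.isclose(c_from_constants, C, rel_tol=1e-8))
print(math.isclose(S_eps, S_mu, rel_tol=1e-8))
print(math.isclose(S_eps / g, 1.0 / (EPSILON_0 * MU_0), rel_tol=1e-8))
```

The two ways of writing S agree because c² = 1/(ε₀μ₀), and the S/g ratio does not depend on the field values at all.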

So, once again, Feynman may feel the Poynting vector is sort of counter-intuitive when analyzing specific situations but, as far as I am concerned, I feel the Poynting vector makes things actually easier to understand. Instead of two E and B vectors, and two concepts to deal with ‘energy’ (i.e. energy and momentum), we’re sort of unifying things here. In that regard – i.e. in regard of feeling we’re talking about the same thing really – I’d really highlight the S/g = c² = 1/(ε₀μ₀) equation. Indeed, the universal constant c acts just like the fine-structure constant here: it links everything to everything. 🙂

And, yes, it’s also about time we introduce the so-called principle of least action to explain things, because action, as a concept, combines force, distance and time indeed, so it’s a bit more promising than just energy, or just momentum. Having said that, you’ll see in the next section that it’s sometimes quite useful to have the choice between one formula or the other. But… Well… Enough talk. Let’s look at Einstein’s car.

Einstein’s car

Einstein’s car is a wonderful device: it rolls without any friction and it moves with a little flashlight. That’s all it needs. It’s pictured below. 🙂 So the situation is the following: the flashlight shoots some light out from one side, which is then stopped at the opposite end of the car. When the light is emitted, there must be some recoil. In fact, we know it’s going to be equal to 1/c times the energy because all we need to do is apply the pc = E·(v/c) formula for v = c, so we know that p = E/c. Of course, this momentum now needs to move Einstein’s car. It’s frictionless, so it should work, but still… The car has some mass M, and so that will determine its recoil velocity: v = p/M. We just apply the general p = mv formula here, and v is not equal to c here, of course! Of course, then the light hits the opposite end of the car and delivers the same momentum, so that stops the car again. However, it did move over some distance x = vt. So we could flash our light again and get to wherever we want to get. [Never mind the infinite accelerations involved!] So… Well… Great! Yes, but Einstein didn’t like this car when he first saw it. In fact, he still doesn’t like it, because he knows it won’t take you very far. 🙂

[Figure: Einstein’s car]

The problem is that we seem to be moving the center of gravity of this car by fooling around on the inside only. Einstein doesn’t like that. He thinks it’s impossible. And he’s right of course. The thing is: the center of gravity did not change. What happened here is that we’ve got some blob of energy, and so that blob has some equivalent mass (which we’ll denote by U/c2), and so that equivalent mass moved all the way from one side to the other, i.e. over the length of the car, which we denote by L. In fact, it’s stuff like this that inspired the whole theory of the field energy and field momentum, and how it interacts with matter.

What happens here is like switching the light on in the dark room: we’ve got matter doing work on the field, and so matter loses mass, and the field gains it, through its momentum and/or energy. To calculate how much, we could integrate S/c or c·g over the volume of our blob, and we’d get something in joule indeed, but there’s a simpler way here. The momentum conservation says that the momentum of our car and the momentum of our blob must be equal, so if T is the time that was needed for our blob to go to the other side – and so that’s, of course, also the time during which our car was rolling – then M·v = M·x/T must be equal to (U/c²)·c = (U/c²)·L/T. The 1/T factor on both sides cancels, so we write: M·x = (U/c²)·L. Now, what is x? Yes. In case you were wondering, that’s what we’re looking for here. 🙂 Here it is:

x = vT = vL/c = (p/M)·(L/c) = [(U/c)/M]·(L/c) = (U/c²)·(L/M)
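Here is a minimal numerical sketch of that result, including the center-of-mass check. The values for U, L and M are made up for illustration, and I use the same approximations as in the text: the light needs a time T = L/c, and the equivalent mass U/c² of the blob is tiny compared to M.

```python
C = 299_792_458.0  # speed of light, m/s

def car_displacement(U, L, M):
    """x = (U/c^2)*(L/M): how far the car rolls while the light crosses it."""
    return (U / C**2) * L / M

U = 1.0e3  # J of light energy (illustrative)
L = 3.0    # m, length of the car (illustrative)
M = 1.0e3  # kg, mass of the car (illustrative)

m_blob = U / C**2  # equivalent mass of the blob of light
x = car_displacement(U, L, M)

# Center-of-mass shift: the car (mass M) rolls back over x while the blob
# (mass m_blob) moves forward over L, to first order in m_blob/M.
com_shift = (m_blob * L - M * x) / (M + m_blob)

print(x)          # a ridiculously small distance
print(com_shift)  # zero, up to floating-point rounding
```

So the car does move, but only because an equivalent mass U/c² moved the other way: the two shifts cancel and the center of mass stays where it is.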

So what’s next? Well… Now we need to show that the center-of-mass actually did not move with this ‘transfer’ of the blob. I’ll leave the math to you here: it should all work out. And you can also think through the obvious questions:

  1. Where is the energy and, hence, the mass of our blob after it stops the car? Hint: think about excited atoms and imagine they might radiate some light back. 🙂
  2. As the car did move a little bit, we should be able to move it further and further away from its center of gravity, until the center of gravity is no longer in the car. Hint: think about batteries and energy levels going down while shooting light out. It just won’t happen. 🙂

Now, what about a blob of light going from the top to the bottom of the car? Well… That involves the conservation of angular momentum: we’ll have more mass on the bottom, but on a shorter lever-arm, so angular momentum is being conserved. It’s a very good question though, and it led Einstein to combine the center-of-gravity theorem with the angular momentum conservation theorem to explain stuff like this.

It’s all fascinating, and one can think of a great many paradoxes that, at first, seem to contradict the Grand Principles we used here, which means that they would contradict all that we have learned so far. However, a careful analysis of those paradoxes reveals that they are paradoxes indeed: propositions which sound true but are, in the end, self-contradictory. In fact, when explaining electromagnetism over his various Lectures, Feynman tasks his readers with a rather formidable paradox when discussing the laws of induction. He solves it here, ten chapters later, after describing what we described above. You can busy yourself with it but… Well… I guess you’ve got something better to do. If so, just take away the key lesson: there’s momentum in the field, and it’s also possible to build up angular momentum in a magnetic field and, if you switch it off, the angular momentum will be given back, somehow, as it’s stored energy.

That’s also why the seemingly irrelevant circulation of S we discussed in my previous post, where we had a charge next to an ordinary magnet, and where we found that there was energy circulating around, is not so queer. The energy is there, in the circulating field, and it’s real. As real as can be. 🙂


Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

On (special) relativity: what’s relative?

Pre-scriptum (dated 26 June 2020): These posts on elementary math and physics have not suffered much the attack by the dark force—which is good because I still like them. While my views on the true nature of light, matter and the force or forces that act on them have evolved significantly as part of my explorations of a more realist (classical) explanation of quantum mechanics, I think most (if not all) of the analysis in this post remains valid and fun to read. In fact, I find the simplest stuff is often the best. 🙂

Original post:

This is my third and final post about special relativity. In the previous posts, I introduced the general idea and the Lorentz transformations. I present these Lorentz transformations once again below, next to their Galilean counterparts. [Note that I continue to assume, for simplicity, that the two reference frames move with respect to each other along the x-axis only, so the y- and z-component of u is zero. It is not all that difficult to generalize to three dimensions (especially not when using vectors) but it makes an intuitive understanding of what relativity is all about more difficult.]

[Figure: the Lorentz transformations next to their Galilean counterparts]

As you can see, under a Lorentz transformation, the new ‘primed’ space and time coordinates are a mixture of the ‘unprimed’ ones. Indeed, the new x’ is a mixture of x and t, and the new t’ is a mixture as well. You don’t have that under a Galilean transformation: in the Newtonian world, space and time are neatly separated, and time is absolute, i.e. it is the same regardless of the reference frame. In Einstein’s world – our world – that’s not the case: time is relative, or local as Hendrik Lorentz termed it, and so it’s space-time – i.e. ‘some kind of union of space and time’ as Minkowski termed it – that transforms. In practice, physicists will use so-called four-vectors, i.e. vectors with four coordinates, to keep track of things. These four-vectors incorporate both the three-dimensional space vector as well as the time dimension. However, we won’t go into the mathematical details of that here.
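To see the ‘mixing’ at work, here is a small sketch that applies both transformations to the same event. It assumes the standard textbook form for motion along the x-axis, x’ = γ(x − ut) and t’ = γ(t − ux/c²), and uses units in which c = 1. The event coordinates and function names are just illustrative choices of mine.

```python
import math

def gamma(u, c=1.0):
    """Lorentz factor for relative speed u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def galilean(x, t, u):
    """Newtonian world: space shifts, time is absolute."""
    return x - u * t, t

def lorentz(x, t, u, c=1.0):
    """Einstein's world: both x' and t' mix x and t."""
    g = gamma(u, c)
    return g * (x - u * t), g * (t - u * x / c**2)

# An illustrative event at x = 1, t = 2, seen from a frame moving at u = 0.6c.
x, t, u = 1.0, 2.0, 0.6
print(galilean(x, t, u))
print(lorentz(x, t, u))

# The combination (ct)^2 - x^2 is the same in both frames under Lorentz.
xl, tl = lorentz(x, t, u)
print(t**2 - x**2, tl**2 - xl**2)
```

Under the Galilean transformation the time coordinate comes back unchanged, while under the Lorentz transformation both coordinates change and only the (ct)² − x² combination is preserved.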

What else is relative? Everything, except the speed of light. Of course, velocity is relative, just like in the Newtonian world, but the equation to go from a velocity as measured in one reference frame to a velocity as measured in the other, is different: it’s not a matter of just adding or subtracting speeds. In addition, besides time, mass becomes a relative concept as well in Einstein’s world, and that was definitely not the case in the Newtonian world.

What about energy? Well… We mentioned that velocities are relative in the Newtonian world as well, so momentum and kinetic energy were relative in that world as well: what you would measure for those two quantities would depend on your reference frame as well. However, here also, we get a different formula now. In addition, we have this weird equivalence between mass and energy in Einstein’s world, about which I should also say something more.

But let’s tackle these topics one by one. We’ll start with velocities.

Relativistic velocity

In the Newtonian world, it was easy. From the Galilean transformation equations above, it’s easy to see that

v’ = dx’/dt’ = d(x – ut)/dt = dx/dt – d(ut)/dt = v – u

So, in the Newtonian world, it’s just a matter of adding/subtracting speeds indeed: if my car goes 100 km/h (v), and yours goes 120 km/h (u), then you will see my car falling behind at a speed of (minus) 20 km/h. That’s it. In Einstein’s world, it is not so simple. Let’s take the spaceship example once again. So we have a man on the ground (the inertial or ‘unprimed’ reference frame) and a man in the spaceship (the primed reference frame), which is moving away from us with velocity u.

Now, suppose an object is moving inside the spaceship (along the x-axis as well) with a (uniform) velocity vx’, as measured from the point of view of the man inside the spaceship. Then the displacement x’ will be equal to x’ = vx’ t’. To know how that looks from the man on the ground, we just need to use the opposite Lorentz transformations: just replace u by –u everywhere (to the man in the spaceship, it’s like the man on the ground moves away with velocity –u), and note that the Lorentz factor does not change because we’re squaring and (–u)² = u². So we get:

x = γ(x’ + ut’) and t = γ(t’ + ux’/c²)

Hence, x’ = vx’ t’ can be written as x = γ(vx’ t’ + ut’). Now we should also substitute t’, because we want to measure everything from the point of view of the man on the ground. Now, t = γ(t’ + uvx’ t’/c²). Because we’re talking uniform velocities, vx (i.e. the velocity of the object as measured by the man on the ground) will be equal to x divided by t (so we don’t need to take the time derivative of x), and then, after some simplifying and re-arranging (note, for instance, how the t’ factor miraculously disappears), we get:

vx = (vx’ + u)/(1 + u·vx’/c²)

What does this rather complicated formula say? Just put in some numbers:

  • Suppose the object is moving at half the speed of light, so 0.5c, and that the spaceship is moving itself also at 0.5c, then we get the rather remarkable result that, from the point of view of the observer on the ground, that object is not going as fast as light, but only at vx = (0.5c + 0.5c)/(1 + 0.5·0.5) = 0.8c.
  • Or suppose we’re looking at a light beam inside the spaceship, so something that’s traveling at speed c itself in the spaceship. How does that look to the man on the ground? Just put in the numbers: vx = (0.5c + c)/(1 + 0.5·1) = c! So the speed of light is not dependent on the reference frame: it looks the same – both to the man in the ship as well as to the man on the ground. As Feynman puts it: “This is good, for it is, in fact, what the Einstein theory of relativity was designed to do in the first place–so it had better work!”
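The two examples above are easy to reproduce with a few lines of Python. This is just a sketch of the velocity addition formula for motion along the x-axis, in units where c = 1. The function name is mine.

```python
def add_velocities(v_prime, u, c=1.0):
    """Velocity seen from the ground: (v' + u) / (1 + u*v'/c^2)."""
    return (v_prime + u) / (1.0 + u * v_prime / c**2)

print(add_velocities(0.5, 0.5))    # object at 0.5c in a ship moving at 0.5c
print(add_velocities(1.0, 0.5))    # light beam in a ship moving at 0.5c
print(add_velocities(1e-7, 1e-7))  # everyday speeds: practically v' + u
```

For speeds that are small compared to c, the denominator is practically 1, so we get the Galilean sum of the speeds back.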

It’s interesting to note that, even if u has no y- or z-component, velocity in the y-direction will be affected too. Indeed, if an object is moving upward in the spaceship, then the distance of travel of that object to the man on the ground will appear to be larger. See the triangle below: if that object travels a distance Δs’ = Δy’ = Δy = v’Δt’ with respect to the man in the spaceship, then it will have traveled a distance Δs = vΔt to the man on the ground, and that distance is longer.

[Figure: the triangle formed by the vertical path Δs’ and the longer path Δs as seen from the ground]

I won’t go through the process of substituting and combining the Lorentz equations (you can do that yourself) but the grand result is the following:

vy = (1/γ)vy’ 

1/γ is the reciprocal of the Lorentz factor, and I’ll leave it to you to work out a few numeric examples. When you do that, you’ll find the rather remarkable result that vy is actually less than vy’. For example, for u = 0.6c, 1/γ will be equal to 0.8, so vy will be 20% less than vy’. How is that possible? The vertical distance is what it is (Δy’ = Δy), and that distance is not affected by the ‘length contraction’ effect (y’ = y). So how can the vertical velocity be smaller?  The answer is easy to state, but not so easy to understand: it’s the time dilation effect: time in the spaceship goes slower. Hence, the object will cover the same vertical distance indeed – for both observers – but, from the point of view of the observer on the ground, the object will apparently need more time to cover that distance than the time measured by the man in the spaceship: Δt > Δt’. Hence, the logical conclusion is that the vertical velocity of that object will appear to be less to the observer on the ground.

How much less? The time dilation factor is the Lorentz factor. Hence, Δt = γΔt’. Now, if u = 0.6c, then γ will be equal to 1.25 and Δt = 1.25Δt’. Hence, if that object would need, say, one second to cover that vertical distance, then, from the point of view of the observer on the ground, it would need 1.25 seconds to cover the same distance. Hence, its speed as observed from the ground is indeed only 1/(5/4) = 4/5 = 0.8 of its speed as observed by the man in the spaceship.
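The numbers in this example can be checked with the sketch below. It uses units where c = 1 and assumes, as in the text, that the object moves purely vertically inside the spaceship, so its horizontal velocity in the spaceship frame is zero. The function names are mine.

```python
import math

def gamma(u, c=1.0):
    """Lorentz factor for a spaceship moving at speed u."""
    return 1.0 / math.sqrt(1.0 - (u / c) ** 2)

def dilated_time(dt_prime, u):
    """Time needed as seen from the ground: dt = gamma * dt'."""
    return gamma(u) * dt_prime

def vertical_velocity(vy_prime, u):
    """Vertical velocity seen from the ground: vy = vy'/gamma."""
    return vy_prime / gamma(u)

u = 0.6  # spaceship speed as a fraction of c
print(gamma(u))                   # Lorentz factor
print(dilated_time(1.0, u))       # one second in the ship, seen from the ground
print(vertical_velocity(1.0, u))  # fraction of vy' seen from the ground
```

Same vertical distance, more time on the ground clock: that’s all there is to the smaller vertical velocity.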

Is that hard to understand? Maybe. You have to think through it. One common mistake is that people think that length contraction and/or time dilation are, somehow, related to the fact that we are looking at things from a distance and that light needs time to reach us. Indeed, on the Web, you can find complicated calculations using the angle of view and/or the line of sight (and tons of trigonometric formulas) as, for example, shown in the drawing below. These have nothing to do with relativity theory and you’ll never get the Lorentz transformation out of them. They are plain nonsense: they are rooted in an inability of these youthful authors to go beyond Galilean relativity. Length contraction and/or time dilation are not some kind of visual trick or illusion. If you want to see how one can derive the Lorentz factor geometrically, you should look for a good description of the Michelson-Morley experiment in a good physics handbook such as, yes :-), Feynman’s Lectures.

[Figure: visual effect 2]

So, I repeat: illustrations that try to explain length contraction and time dilation in terms of line of sight and/or angle of view are useless and will not help you to understand relativity. On the contrary, they will only confuse you. I will let you think through this and move on to the next topic.

Relativistic mass and relativistic momentum

Einstein actually stated two principles in his (special) relativity theory:

  1. The first is the Principle of Relativity itself, which is basically just the same as Newton’s principle of relativity. So that was nothing new actually: “If a system of coordinates K is chosen such that, in relation to it, physical laws hold good in their simplest form, then the same laws must hold good in relation to any other system of coordinates K’ moving in uniform translation relatively to K.” Hence, Einstein did not change the principle of relativity – quite on the contrary: he re-confirmed it – but he did change Newton’s Laws, as well as the Galilean transformation equations that came with them. He also introduced a new ‘law’, which is stated in the second ‘principle’, and that is the more revolutionary one really:
  2. The Principle of Invariant Light Speed: “Light is always propagated in empty space with a definite velocity [speed] c which is independent of the state of motion of the emitting body.”

As mentioned above, the most notable change in Newton’s Laws – the only change, in fact – is Einstein’s relativistic formula for mass:

$$m_v = \gamma m_0$$

This formula implies that the inertia of an object, i.e. its mass, also depends on the reference frame of the observer. If the object moves (but velocity is relative as we know: an object will not be moving if we move with it), then its mass increases. This affects its momentum. As you may or may not remember, the momentum of an object is the product of its mass and its velocity. It’s a vector quantity and, hence, momentum has not only a magnitude but also a direction:

$$\mathbf{p}_v = m_v\mathbf{v} = \gamma m_0\mathbf{v}$$

As evidenced from the formula above, the momentum formula is a relativistic formula as well, as it’s dependent on the Lorentz factor too. So where do I want to go from here? Well… In this section (relativistic mass and momentum), I just want to show that Einstein’s mass formula is not some separate law or postulate: it just comes with the Lorentz transformation equations (and the above-mentioned consequences in terms of measuring horizontal and vertical velocities).

Indeed, Einstein’s relativistic mass formula can be derived from the momentum conservation principle, which is one of the ‘physical laws’ that Einstein refers to. Look at the elastic collision between two billiard balls below. These balls are equal – same mass and same speed from the point of view of an inertial observer – but not identical: one is red and one is blue. The two diagrams show the collision from two different points of view: left, we have the inertial reference frame, and, right, we have a reference frame that is moving with a velocity equal to the horizontal component of the velocity of the blue ball.


The points to note are the following:

  1. The total momentum of such elastic collision before and after the collision must be the same.
  2. Because the two balls have equal mass (in the inertial reference frame at least), the collision will be perfectly symmetrical. Indeed, we may just turn the diagram ‘upside down’ and change the colors of the balls, as we do below, and the values w, u and v (as well as the angle α) are the same.

[Figure: Elastic collision]

As mentioned above, the velocity of the blue and red ball and, hence, their momentum, will depend on the frame of reference. In the diagram on the left, we’re moving with a velocity equal to the horizontal component of the velocity of the blue ball and, therefore, in this particular frame of reference, the velocity (and the momentum) of the blue ball consists of a vertical component only, which we refer to as w.

From this point of view (i.e. the reference frame moving with the blue ball), the velocity (and, hence, the momentum) of the red ball will have both a horizontal and a vertical component. If we denote the horizontal component by u, then it’s easy to show that the vertical velocity of the red ball must be equal to sin(α)v. Now, because u = cos(α)v, this vertical component will be equal to tan(α)u. But so what is tan(α)u? Now, you’ll say, that is quite evident: tan(α)u must be equal to w, right?

No. That’s Newtonian physics. The red ball is moving horizontally with speed u with respect to the blue ball and, hence, its vertical velocity will not be quite equal to w. Its vertical velocity will be given by the formula which we derived above: vy = (1/γ)vy’, so it will be a little bit slower than the w we see in the diagram on the right which is, of course, the same w as in the diagram on the left. [If you look carefully at my drawing above, then you’ll notice that the w vector is a bit longer indeed.]

Huh? Yes. Just think about it: tan(α)u = (1/γ)w. But then… How can momentum be conserved if these speeds are not the same? Isn’t the momentum conservation principle supposed to conserve both horizontal as well as vertical momentum? It is, and momentum is being conserved. Why? Because of the relativistic mass factor.

Indeed, the change in vertical momentum (Δp) of the blue ball in the diagram on the left or – which amounts to the same – the red ball in the diagram on the right (i.e. the vertically moving ball) is equal to $\Delta p_{blue} = 2m_w w$. [The factor 2 is there because the ball goes down and then up (or vice versa) and, hence, the total change in momentum must be twice the $m_w w$ amount.] Now, that amount must be equal to $\Delta p_{red}$, which is equal to $\Delta p_{red} = 2m_v(1/\gamma)w$. Equating both yields the following grand result:

$$m_v/m_w = \gamma \Leftrightarrow m_v = \gamma m_w$$

What does this mean? It means that the mass of the red ball in the diagram on the left is larger than the mass of the blue ball. So here we have actually derived Einstein’s relativistic mass formula from the momentum conservation principle!
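
A quick numerical consistency check of this result is sketched below. It assumes the familiar m = γm0 formula for each ball and then verifies that the mass ratio equals the Lorentz factor for the horizontal speed u, and that the two vertical momentum changes match. The values of `u` and `w` are arbitrary illustrative choices, and speeds are expressed as fractions of c.

```python
import math

def gamma(beta):
    """Lorentz factor for a speed given as a fraction of c."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

m0 = 1.0   # rest mass of each ball (arbitrary units, assumed equal)
u = 0.6    # horizontal speed of the red ball, fraction of c (illustrative)
w = 0.3    # vertical speed of the blue ball, fraction of c (illustrative)

# Vertical speed of the red ball: reduced by the factor 1/gamma(u).
red_vertical = w / gamma(u)
# Total speed v of the red ball from its horizontal and vertical components.
v = math.hypot(u, red_vertical)

m_w = gamma(w) * m0   # mass of the blue ball, moving at speed w
m_v = gamma(v) * m0   # mass of the red ball, moving at speed v

# Vertical momentum changes: 2*m_w*w versus 2*m_v*(1/gamma(u))*w.
dp_blue = 2 * m_w * w
dp_red = 2 * m_v * red_vertical

print(f"m_v / m_w = {m_v / m_w:.6f}, gamma(u) = {gamma(u):.6f}")
print(f"dp_blue   = {dp_blue:.6f}, dp_red   = {dp_red:.6f}")
```

The check works because 1 – v² factors as (1 – u²)(1 – w²) when the vertical speed is w/γ(u), so γ(v) = γ(u)γ(w).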

Of course you’ll say: not quite. This formula is not the $m_u = \gamma m_0$ formula that we’re used to! Indeed, it’s not. The blue ball has some velocity w itself, and so the formula links two velocities v and w. However, we can derive the $m_u = \gamma m_0$ formula as a limit of $m_v = \gamma m_w$ for w going to zero. How can w become infinitesimally small? If the angle α becomes infinitesimally small. It’s obvious, then, that v and u will be practically equal. In fact, if w goes to zero, then $m_w$ will be equal to $m_0$ in the limiting case, and $m_v$ will be equal to $m_u$. So, then, indeed, we get the familiar formula as a limiting case:

$$m_u = \gamma m_0$$

Hmm… You’ll probably find all of this quite fishy. I’d suggest you just think about it. What I presented above is actually Feynman’s presentation of the subject, but with a bit more verbosity. Let’s move on to the final topic.

Relativistic energy

From what I wrote above (and from what I wrote in my two previous posts on this topic), it should be obvious, by now, that energy also depends on the reference frame. Indeed, mass and velocity depend on the reference frame (moving or not), and both appear in the formula for kinetic energy which, as you’ll remember, is

$$\text{K.E.} = mc^2 - m_0c^2 = (m - m_0)c^2 = \gamma m_0c^2 - m_0c^2 = m_0c^2(\gamma - 1)$$

Now, if you go back to the post where I presented that formula, you’ll see that we’re actually talking about the change in kinetic energy here: if the mass is at rest, its kinetic energy is zero (because $m = m_0$), and it’s only when the mass is moving that we can observe the increase in mass. [If you wonder how, think about the example of the fast-moving electrons in an electron beam: we see it as an increase in the inertia: applying the same force no longer yields the same acceleration.]

Now, in that same post, I also noted that Einstein added an equivalent rest mass energy ($E_0 = m_0c^2$) to the kinetic energy above, to arrive at the total energy of an object:

$$E = E_0 + \text{K.E.} = mc^2$$

Now, what does this equivalence actually mean? Is mass energy? Can we equate them really? The short answer to that is: yes.

Indeed, in one of my older posts (Loose Ends), I explained that protons and neutrons are made of quarks and, hence, that quarks are the actual matter particles, not protons and neutrons. However, the mass of a proton – which consists of two up quarks and one down quark – is 938 MeV/$c^2$ (don’t worry about the units I am using here: because protons are so tiny, we don’t measure their mass in grams), but the mass figure you get when you add the rest mass of two u’s and one d is 9.6 MeV/$c^2$ only: about one percent of 938! So where’s the difference?

The difference is the equivalent mass (or inertia) of the binding energy between the quarks. Indeed, the so-called ‘mass’ that gets converted into energy when a nuclear bomb explodes is not the mass of quarks. Quarks survive: nuclear power is binding energy between quarks that gets converted into heat and radiation and kinetic energy and whatever else a nuclear explosion unleashes.

In short, 99% of the ‘mass’ of a proton or a neutron is due to the strong force. So that’s ‘potential’ energy that gets unleashed in a nuclear chain reaction. In other words, the rest mass of the proton is actually the inertia of the system of moving quarks and gluons that make up the particle. In such an atomic system, even the energy of massless particles (e.g. the virtual photons that are being exchanged between the nucleus and its electron shells) is measured as part of the rest mass of the system. So, yes, mass is energy. As Feynman put it, long before the quark model was confirmed and generally accepted:

“We do not have to know what things are made of inside; we cannot and need not justify, inside a particle, which of the energy is rest energy of the parts into which it is going to disintegrate. It is not convenient and often not possible to separate the total mc2 energy of an object into (1) rest energy of the inside pieces, (2) kinetic energy of the pieces, and (3) potential energy of the pieces; instead we simply speak of the total energy of the particle. We ‘shift the origin’ of energy by adding a constant m0c2 to everything, and say that the total energy of a particle is the mass in motion times c2, and when the object is standing still, the energy is the mass at rest times c2.” (Richard Feynman’s Lectures on Physics, Vol. I, p. 16-9)

 So that says it all, I guess, and, hence, that concludes my little ‘series’ on (special) relativity. I hope you enjoyed it.

Post scriptum:

Feynman describes the concept of space-time with a nice analogy: “When we move to a new position, our brain immediately recalculates the true width and depth of an object from the ‘apparent’ width and depth. But our brain does not immediately recalculate coordinates and time when we move at high speed, because we have had no effective experience of going nearly as fast as light to appreciate the fact that time and space are also of the same nature. It is as though we were always stuck in the position of having to look at just the width of something, not being able to move our heads appreciably one way or the other; if we could, we understand now, we would see some of the other man’s time—we would see “behind”, so to speak, a little bit. Thus, we shall try to think of objects in a new kind of world, of space and time mixed together, in the same sense that the objects in our ordinary space-world are real, and can be looked at from different directions. We shall then consider that objects occupying space and lasting for a certain length of time occupy a kind of a “blob” in a new kind of world, and that when we look at this “blob” from different points of view when we are moving at different velocities. This new world, this geometrical entity in which the “blobs” exist by occupying position and taking up a certain amount of time, is called space-time.”

If none of what I wrote could convey the general idea, then I hope the above quote will. 🙂 Apart from that, I should also note that physicists will prefer to re-write the Lorentz transformation equations by measuring time and distance in so-called equivalent units: velocities will be expressed not in km/h but as a ratio of c and, hence, c = 1 (a pure number) and so u will also be a pure number between 0 and 1. That can be done by expressing distance in light-seconds (a light-second is the distance traveled by light in one second) or, alternatively, by expressing time in ‘meter’. Both are equivalent but, in most textbooks, it will be time that will be measured in the ‘new’ units. So how do we express time in meter?

It’s quite simple: we multiply the old seconds with c and then we get: time expressed in meters = time expressed in seconds multiplied by $3\times 10^8$ meters per second. Hence, as the ‘second’ in the first factor and the ‘per second’ in the second factor cancel out, the dimension of the new time unit will effectively be the meter. Now, if both time and distance are expressed in meter, then velocity becomes a pure number without any dimension, because we are dividing distance expressed in meter by time expressed in meter, and it should be noted that it will be a pure number between 0 and 1 (0 ≤ u ≤ 1), because 1 ‘time second’ = $3\times 10^8$ ‘time meters’. Also, c itself becomes the pure number 1. The Lorentz transformation equations then become:
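
For reference, here is a sketch of the standard textbook form of these equations in units where c = 1 (so u is a pure number). The choice of the x-axis as the direction of motion and the layout are assumptions of this sketch:

```latex
\begin{aligned}
x' &= \frac{x - ut}{\sqrt{1 - u^2}}, \\
y' &= y, \\
z' &= z, \\
t' &= \frac{t - ux}{\sqrt{1 - u^2}}.
\end{aligned}
```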


They are easy to remember in this form (cf. the symmetry between x – ut and t – ux) and, if needed, we can always convert back to the old units to recover the original formulas.

I personally think there is no better way to illustrate how space and time are ‘mere shadows’ of the same thing indeed: if we express both time and space in the same dimension (meter), we can see how, as a result of that, velocity becomes a dimensionless number between zero and one and, more importantly, how the equations for x’ and t’ then mirror each other nicely. I am not sure what ‘kind of union’ between space and time Minkowski had in mind, but this must come pretty close, no?

Final note: I noted the equivalence of mass and energy above. In fact, mass and energy can also be expressed in the same units, and we actually do that above already. If we say that an electron has a rest mass of 0.511 MeV/$c^2$ (a bit less than a quarter of the mass of the u quark), then we express the mass in terms of energy. Indeed, the eV is an energy unit and so we’re actually using the $m = E/c^2$ formula when we express mass in such units. Expressing mass and energy in equivalent units allows us to derive similar ‘Lorentz transformation equations’ for the energy and the momentum of an object as measured under an inertial versus a moving reference frame. Hence, energy and momentum also transform like our space-time four-vectors and – likewise – the energy and the momentum themselves, i.e. the components of the (four-)vector, are less ‘real’ than the vector itself. However, I think this post has become way too long and, hence, I’ll just jot these four equations down – please note, once again, the nice symmetry between (1) and (2) – but then leave it at that and finish this post. 🙂
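
Written out in the same units (c = 1), the standard textbook transformation for motion along the x-axis looks as follows. The numbering, with (1) and (2) as the pair that mirror each other, is an assumption of this sketch:

```latex
\begin{aligned}
p_x' &= \frac{p_x - uE}{\sqrt{1 - u^2}} && (1) \\
E'   &= \frac{E - u p_x}{\sqrt{1 - u^2}} && (2) \\
p_y' &= p_y && (3) \\
p_z' &= p_z && (4)
\end{aligned}
```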


Another post for my kids: introducing (special) relativity

Pre-scriptum (dated 26 June 2020): These posts on elementary math and physics have not suffered much the attack by the dark force—which is good because I still like them. While my views on the true nature of light, matter and the force or forces that act on them have evolved significantly as part of my explorations of a more realist (classical) explanation of quantum mechanics, I think most (if not all) of the analysis in this post remains valid and fun to read. In fact, I find the simplest stuff is often the best. 🙂

Original post:

In my previous post, I talked about energy, and I tried to keep it simple – but also accurate. However, to be completely accurate, one must, of course, introduce relativity at some point. So how does that work? What’s ‘relativistic’ energy? Well… Let me try to convey a few ideas here.

The first thing to note is that the energy conservation law still holds: special theory or not, the sum of the kinetic and potential energies in a (closed) system is always equal to some constant C. What constant? That doesn’t matter: Nature does not care about our zero point and, hence, we can add or subtract any (other) constant to the equation K.E. + P.E. = T + U = C.

That being said, in my previous post, I pointed out that the constant depends on the reference point for the potential energy term U: we will usually take infinity as the reference point (for a force that attracts) and associate it with zero potential (U = 0). We then get a function U(x) like the one below: for gravitational energy we have $U(x) = -GMm/x$, and for electrical charges, we have $U(x) = q_1q_2/(4\pi\varepsilon_0 x)$. The mathematical shape is exactly the same but, in the case of the electromagnetic forces, you have to remember that likes repel, and opposites attract, so we don’t need the minus sign: the sign of the charges takes care of it.


Minus sign? In case you wonder why we need that minus sign for the potential energy function, well… I explained that in my previous post and so I’ll be brief on that here: potential energy is measured by doing work against the force. That’s why. So we have an infinite sum (i.e. an integral) over some trajectory or path looking like this: U = – ∫F·ds.

For kinetic energy, we don’t need any minus sign: as an object picks up speed, it’s the force itself that is doing the work as its potential energy is converted into kinetic energy, so the change in kinetic energy will equal the change in potential energy, but with opposite sign: as the object loses potential energy, it gains kinetic energy. Hence, we write ΔT = –ΔU = ∫F·ds.

That’s all kids’ stuff obviously. Let’s go beyond this and ask some questions. First, why can we add or subtract any constant to the potential energy but not to the kinetic energy? The answer is… Well… We actually can add or subtract a ‘constant’ to the kinetic energy as well. Now you will shake your head: Huh? Didn’t we have that $T = mv^2/2$ formula for kinetic energy? So how and why could one add or subtract some number to that?

Well… That’s where relativity comes into play. The velocity v depends on your reference frame. If another observer would move with and/or alongside the object, at the same speed, that observer would observe a velocity equal to zero and, hence, its kinetic energy – as that observer would measure it – would also be zero. You will object to that, saying that a change of reference frame does not change the force, and you’re right: the force will cause the object to accelerate or decelerate and, if the observer is not subject to the same force, then he’ll see the object accelerate or decelerate, regardless of whether his reference frame is a moving or an inertial frame. Hence, both the inertial as well as the moving observer will see an increase (or decrease) in its kinetic energy and, therefore, both will conclude that its potential energy decreases (or increases) accordingly. In short, it’s the change in energy that matters, both for the potential as well as for the kinetic energy. The reference point itself, i.e. the point from where we start counting so to say, does not: that’s relative. [This also shows in the derivation for kinetic energy which I’ll do below.]

That brings us to the second question. We all learned in high school that mass and energy are related through Einstein’s mass-energy relation, $E = mc^2$, which establishes an equivalence between the two: the mass of an object that’s picking up speed increases, and so we need to look at both speed and mass as a function of time. Indeed, remember Newton’s Law: force is the time rate of change of momentum: F = d(mv)/dt. When the speed is low (i.e. non-relativistic), then we can just treat m as a constant and write that F = m·dv/dt = ma (the mass times the acceleration). Treating m as a constant also allows us to derive the classical (Newtonian) formula for kinetic energy:
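
A sketch of that derivation in standard textbook form, taking the object from point O (with speed $v_O$) to point P (with speed $v$); the labels on the integration limits are assumptions chosen to match the surrounding text:

```latex
\Delta T = \int_O^P \mathbf{F}\cdot d\mathbf{s}
         = \int_O^P m\,\frac{d\mathbf{v}}{dt}\cdot\mathbf{v}\,dt
         = m\int_{v_O}^{v} v\,dv
         = \tfrac{1}{2}mv^2 - \tfrac{1}{2}mv_O^2
```

The step from the second to the third expression is where m is taken out of the integral, which is only allowed because m is treated as a constant.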


So if we assume that the velocity of the object at point O is equal to zero (so $v_O = 0$), then ΔT will be equal to T and we get what we were looking for: the kinetic energy at point P will be equal to $T = mv^2/2$.

Now, you may wonder why we can’t do that same derivation for a non-constant mass? The answer to that question is simple: taking the m factor out of the integral can only be done if we assume it is a constant. If not, then we should leave it inside. It’s similar to taking a derivative. If m would not be constant, then we would have to apply the product rule to calculate d(mv)/dt, so we’d write d(mv)/dt = (dm/dt)v + m(dv/dt). So we have two terms here and it’s only when m is constant that we can reduce it to d(mv)/dt = m(dv/dt).

So we have our classical kinetic energy function. However, when the velocity gets really high – i.e. if it’s like the same order of magnitude as the velocity of light – then we cannot assume that mass is constant. Indeed, the same high-school course in physics that taught you that $E = mc^2$ equation will probably also have taught you that an object can never go faster than light, regardless of the reference frame. Hence, as the object goes faster and faster, it will pick up more momentum, but its rate of acceleration should (and will) go down in such a way that the object can never actually reach the speed of light. Indeed, if Newton’s Law is to remain valid, we need to correct it in such a way that m is no longer constant: m itself will increase as a function of its velocity and, hence, as a function of time. You’ll remember the formula for that:
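
In standard notation, that formula reads as follows (this is the usual textbook form, written out here as a reconstruction):

```latex
m = \frac{m_0}{\sqrt{1 - v^2/c^2}}
```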


This is often written as $m = \gamma m_0$, with $m_0$ denoting the mass of the object at rest (in your reference frame that is) and $\gamma = (1 - v^2/c^2)^{-1/2}$ the so-called Lorentz factor. The Lorentz factor is named after a Dutch physicist who introduced it near the end of the 19th century in order to explain why the speed of light is always c, regardless of the frame of reference (moving or not), or – in other words – why the speed of light is not relative. Indeed, while you’ll remember that there is no such thing as an absolute velocity according to the (special) theory of relativity, the velocity of light actually is absolute! That means you will always see light traveling at speed c regardless of your reference frame. To put it simply, you can never catch up with light and, if you would be traveling away from some star in a spaceship with a velocity of 200,000 km per second, and a light beam from that star would pass you, you’d measure the speed of that light beam to be equal to 300,000 km/s, not 100,000 km/s. So c is an absolute speed that acts as an absolute speed limit regardless of your reference frame. [Note that we’re talking only about reference frames moving at a uniform speed: when acceleration comes into play, then we need to refer to the general theory of relativity and that’s a somewhat different ball game.]

The graph below shows how γ varies as a function of v. As you can see, the mass increase only becomes significant at speeds of like 100,000 km per second. Indeed, for v = 0.3c, the Lorentz factor is 1.048, so the increase is about 5% only. For v = 0.5c, it’s still limited to an increase of some 15%. But then it goes up rapidly: for v = 0.9c, the mass is more than twice the rest mass: $m \approx 2.3m_0$; for v = 0.99c, the mass increase is 600%: $m \approx 7m_0$; and so on. For v = 0.999c – so when the speed of the object differs from c only by 1 part in 1,000 – the mass of the object will be more than twenty-two times the rest mass ($m \approx 22.4m_0$).
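
The figures quoted above are easy to reproduce. The snippet below is a minimal sketch that tabulates the Lorentz factor for the same speeds; the names `lorentz_factor`, `speeds` and `factors` are illustrative.

```python
import math

def lorentz_factor(beta):
    """Return gamma = 1/sqrt(1 - beta^2), where beta = v/c (0 <= beta < 1)."""
    return 1.0 / math.sqrt(1.0 - beta ** 2)

# Speeds from the text, expressed as fractions of c.
speeds = [0.3, 0.5, 0.9, 0.99, 0.999]
factors = {beta: lorentz_factor(beta) for beta in speeds}

for beta, g in factors.items():
    increase = 100.0 * (g - 1.0)   # mass increase in percent
    print(f"v = {beta}c  ->  gamma = {g:.3f}  (mass increase of about {increase:.0f}%)")
```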


You probably know that we can actually reach such speeds and, hence, verify Einstein’s correction of Newton’s Law in particle accelerators: the electrons in an electron beam in a particle accelerator usually get pretty close to c and have a mass that’s like 2000 times their rest mass. How do we know that? Because the magnetic field needed to deflect them is like 2000 times as great as what one would expect on the basis of their (theoretical) rest mass. So how fast do they go? For their mass to be 2000 times $m_0$, $1 - v^2/c^2$ must be equal to 1/4,000,000. Hence, their velocity v differs from c only by one part in 8,000,000. You’ll have to admit that’s very close.
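
The ‘one part in 8,000,000’ figure follows from the mass ratio alone, as the sketch below shows. The variable names are illustrative, and the shortfall 1 – v/c is computed in a rearranged form to avoid subtracting two nearly equal floating-point numbers.

```python
import math

mass_ratio = 2000.0                          # m / m0, as quoted in the text
one_minus_beta_sq = 1.0 / mass_ratio ** 2    # 1 - v^2/c^2 = 1/gamma^2

# 1 - v/c = 1 - sqrt(1 - x), rewritten as x / (1 + sqrt(1 - x)) for numerical stability.
shortfall = one_minus_beta_sq / (1.0 + math.sqrt(1.0 - one_minus_beta_sq))

print(f"1 - v^2/c^2 is one part in {1.0 / one_minus_beta_sq:,.0f}")
print(f"1 - v/c     is one part in {1.0 / shortfall:,.1f}")
```

For a tiny x, the square root of 1 – x is roughly 1 – x/2, which is why the second number comes out at about twice the first.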

Other effects of relativistic speeds

So we mentioned the thing that’s best known about Einstein’s (special) theory of relativity: the mass of an object, as measured by the inertial observer, increases with its speed. Now, you may or may not be familiar with two other things that come out of relativity theory as well:

  1. The first is length contraction: objects are measured to be shortened in the direction of motion with respect to the (inertial) observer. The formula to be used incorporates the reciprocal of the Lorentz factor: $L = (1/\gamma)L_0$. For example, a meter stick in a space ship moving at a velocity v = 0.6c will appear to be only 80 cm to the external/inertial observer seeing it whizz past… That is if he can see anything at all of course: he’d have to take like a photo-finish picture as it zooms past! 🙂
  2. The second is time dilation, which is also rather well known – just like the mass increase effect – because of the so-called twin paradox: time will appear to be slower in that space ship and, hence, if you send one of two twins away on a space journey, traveling at such relativistic speed, he will come back younger than his brother. The formula here is a bit more complicated, but that’s only because we’re used to measuring time in seconds. If we would take a more natural unit, i.e. the time it takes light to travel a distance of 1 m, then the formula will look the same as our mass formula: $t = \gamma t_0$ and, hence, one ‘second’ in the space ship (again moving at v = 0.6c) will be measured as 1.25 ‘seconds’ by the external observer. Hence, the moving clock will appear to run slower – to the external (inertial) observer that is.

Again, the reality of this can be demonstrated. You’ll remember that we introduced the muon in previous posts: muons resemble electrons in the sense that they have the same charge, but their mass is more than 200 times the mass of an electron. As compared to other unstable particles, their average lifetime is quite long: 2.2 microseconds. Still, that would not be enough to travel more than 600 meters or so – even at the speed of light (2.2 μs × 300,000 km/s = 660 m). But so we do detect muons in detectors down here that come all the way down from the stratosphere, where they are created when cosmic rays hit the Earth’s atmosphere some 10 kilometers up. So how do they get here if they decay so fast? Well, those that actually end up in those detectors, do indeed travel very close to the speed of light and, hence, while from their own point of view they live only like two millionths of a second, they live considerably longer from our point of view.
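
To get a feel for the numbers, the sketch below estimates how large the Lorentz factor must be for a muon to cover the 10 km within its (dilated) lifetime. It treats the average lifetime as if it were a fixed travel time, which is a simplification because decay is a statistical process, and it uses the rounded value of c from the text. All names are illustrative.

```python
import math

C_KM_PER_S = 300_000.0        # speed of light, rounded as in the text
PROPER_LIFETIME_S = 2.2e-6    # average muon lifetime in its own frame
ALTITUDE_KM = 10.0            # production height quoted in the text

# Distance covered in one proper lifetime at (almost) the speed of light.
range_without_dilation_km = C_KM_PER_S * PROPER_LIFETIME_S

# With time dilation, distance = beta * gamma * c * tau, so we need
# beta * gamma = altitude / (c * tau).
beta_gamma = ALTITUDE_KM / range_without_dilation_km
gamma_needed = math.sqrt(1.0 + beta_gamma ** 2)   # since gamma^2 = 1 + (beta*gamma)^2
beta_needed = beta_gamma / gamma_needed

print(f"range without dilation: {range_without_dilation_km * 1000:.0f} m")
print(f"gamma needed: {gamma_needed:.1f}, speed needed: {beta_needed:.4f} c")
```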

Relativistic energy: $E = mc^2$

Let’s go back to our main story line: relativistic energy. We wrote above that it’s the change of energy that matters really. So let’s look at that.

You may or may not remember that the concept of work in physics is closely related to the concept of power. In fact, you may actually remember that power, in physics at least, is defined as the work done per second. Indeed, we defined work as the (dot) product of the force and the distance. Now, when we’re talking about a differential distance only (i.e. an infinitesimally small change only), then we can write dT = F·ds, but when we’re talking about something larger, then we have to do that integral: ΔT = ∫F·ds. However, we’re interested in the time rate of change of T here, and so that’s the time derivative dT/dt which, as you can easily verify, will be equal to dT/dt = (F·ds)/dt = F·(ds/dt) = F·v, and so we can use that differential formula and we don’t need the integral. Now, that (dot) product of the force and the velocity vectors is what’s referred to as the power. [Note that only the component of the force in the direction of motion contributes to the work done and, hence, to the power.]

OK. What am I getting at? Well… I just want to show an interesting derivation: if we assume, with Einstein, that mass and energy are equivalent and, hence, that the total energy of a body always equals $E = mc^2$, then we can actually derive Einstein’s mass formula from that. How? Well… If the time rate of change of the energy of an object is equal to the power expended by the forces acting on it, then we can write:

$$dE/dt = d(mc^2)/dt = \mathbf{F}\cdot\mathbf{v}$$

Now, we cannot take the mass out of those brackets after the differential operator (d) because the mass is not a constant in this case (relativistic speeds) and, hence, dm/dt ≠ 0. However, we can take out $c^2$ (that’s an absolute constant, remember?) and we can also substitute F using Newton’s Law (F = d(mv)/dt), again taking care to leave m between the brackets, not outside. So then we get:

$$d(mc^2)/dt = c^2\,dm/dt = [d(m\mathbf{v})/dt]\cdot\mathbf{v} = v\,d(mv)/dt$$

In case you wonder why we can replace the vectors (bold face) $\mathbf{v}$ and $d(m\mathbf{v})$ by their magnitudes (or lengths) $v$ and $d(mv)$: $\mathbf{v}$ and $m\mathbf{v}$ have the same direction and, hence, the angle θ between them is zero, and so $\mathbf{v}\cdot\mathbf{v} = |\mathbf{v}||\mathbf{v}|\cos\theta = v^2$. Likewise, $d(m\mathbf{v})$ and $\mathbf{v}$ also have the same direction and so we can just replace the dot product by the product of the magnitudes of those two vectors.

Now, let’s not forget the objective: we need to solve this equation for m and, hopefully, we’ll find Einstein’s mass formula, which we need to correct Newton’s Law. How do we do that? We’ll first multiply both sides by 2m. Why? Because we can then apply another mathematical trick, as shown below:

$$c^2(2m)\,dm/dt = 2mv\,d(mv)/dt \Leftrightarrow d(m^2c^2)/dt = d(m^2v^2)/dt$$

However, if the derivatives of two quantities are equal, then the quantities themselves can only differ by a constant, say C. So we integrate both sides and get:

$$m^2c^2 = m^2v^2 + C$$

Be patient: we’re almost there. The above equation must be true for all velocities v and, hence, we can choose the special case where v = 0 and call this mass $m_0$, and then substitute, so we get $m_0^2c^2 = m_0^2\cdot 0 + C = C$. Now we put this particular value for C back in the more general equation above and we get:

$$m^2c^2 = m^2v^2 + m_0^2c^2 \Leftrightarrow m^2 = m^2v^2/c^2 + m_0^2 \Leftrightarrow m^2(1 - v^2/c^2) = m_0^2 \Leftrightarrow m = m_0/(1 - v^2/c^2)^{1/2}$$

So there we are: we have just shown that we get the relativistic mass formula (it’s on the right-hand side above) if we assume that Einstein’s mass-energy equivalence relation holds.
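
As a sanity check on the integration step, the sketch below verifies numerically that m²c² – m²v² stays equal to the constant m0²c² when m follows the relativistic mass formula. The rest mass of 1 kg and the function names are illustrative assumptions.

```python
import math

C = 299_792_458.0   # speed of light in m/s
M0 = 1.0            # rest mass in kg (illustrative)

def relativistic_mass(v):
    """m = m0 / sqrt(1 - v^2/c^2) for a speed v in m/s."""
    return M0 / math.sqrt(1.0 - (v / C) ** 2)

def invariant(v):
    """m^2 c^2 - m^2 v^2, which the derivation says equals the constant m0^2 c^2."""
    m = relativistic_mass(v)
    return m ** 2 * C ** 2 - m ** 2 * v ** 2

fractions = (0.0, 0.3, 0.6, 0.9, 0.99)
for fraction in fractions:
    ratio = invariant(fraction * C) / (M0 * C) ** 2
    print(f"v = {fraction}c  ->  (m^2c^2 - m^2v^2) / (m0^2c^2) = {ratio:.9f}")
```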

Now, you may wonder why that’s significant. Well… If you’re disappointed, then, at the very least, you’ll have to admit that it’s nice to show how everything is related to everything in this theory: from $E = mc^2$, we get $m = m_0(1 - v^2/c^2)^{-1/2}$. I think that’s kinda neat!

In addition, let us analyze that mass-energy relation in another way. It actually allows us to re-define kinetic energy as the excess of a particle’s energy over its rest mass energy or – it’s the same expression really – the difference between its total energy and its rest energy.

How does that work? Well… When we’re looking at high-speed or high-energy particles, we will write the kinetic energy as:

$$\text{K.E.} = mc^2 - m_0c^2 = (m - m_0)c^2 = \gamma m_0c^2 - m_0c^2 = m_0c^2(\gamma - 1)$$

Now, we can expand that Lorentz factor $\gamma = (1 - v^2/c^2)^{-1/2}$ into a binomial series (the binomial series is an infinite Taylor series, so it’s not to be confused with the (finite) binomial expansion: just check it online if you’re in doubt). If we do that, we can write γ as an infinite sum of the following terms:

$$\gamma = 1 + (1/2)v^2/c^2 + (3/8)v^4/c^4 + (5/16)v^6/c^6 + \dots$$

Now, when we plug this back into our (relativistic) kinetic energy equation, we can scrap a few things (just do it) to get where I wanted to get:

$$\text{K.E.} = (1/2)m_0v^2 + (3/8)m_0v^4/c^2 + (5/16)m_0v^6/c^4 + \dots$$

Again, you’ll wonder: so what? Well… See how the non-relativistic formula for kinetic energy ($\text{K.E.} = m_0v^2/2$) appears here as the first term of this series and, hence, how the formula above shows that our ‘Newtonian’ formula is just an approximation. Of course, at low speeds, the second, third etcetera terms represent close to nothing and, hence, our Newtonian ‘approximation’ is then obviously pretty good!
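
How good is ‘pretty good’? The sketch below compares the exact relativistic kinetic energy with the Newtonian first term and with the first three terms of the series, in units where m0 = c = 1. The function names and the sample speeds are illustrative choices.

```python
import math

def kinetic_exact(beta):
    """Exact K.E. = m0*c^2*(gamma - 1), in units where m0 = c = 1."""
    return 1.0 / math.sqrt(1.0 - beta ** 2) - 1.0

def kinetic_series(beta, terms):
    """First `terms` terms (at most 3) of the series: (1/2)b^2 + (3/8)b^4 + (5/16)b^6."""
    coefficients = [1 / 2, 3 / 8, 5 / 16]
    return sum(c * beta ** (2 * (k + 1)) for k, c in enumerate(coefficients[:terms]))

for beta in (0.01, 0.1, 0.5):
    exact = kinetic_exact(beta)
    newton_error = 1.0 - kinetic_series(beta, 1) / exact   # relative shortfall, 1 term
    series_error = 1.0 - kinetic_series(beta, 3) / exact   # relative shortfall, 3 terms
    print(f"v = {beta}c: Newtonian term is off by {100 * newton_error:.4f}%, "
          f"three terms by {100 * series_error:.6f}%")
```

At a tenth of the speed of light the Newtonian term is still within about one percent of the exact value; at half the speed of light it is not.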

OK… But… Now you’ll say: that’s fine, but how did Einstein get inspired to write $E = mc^2$ in the first place? Well, truth be told, the relativistic mass formula was derived first (i.e. before Einstein wrote his $E = mc^2$ equation), out of a derivation involving the momentum conservation law and the formulas we must use to convert the space-time coordinates from one reference frame to another when looking at phenomena (i.e. the so-called Lorentz transformations). And it was only afterwards that Einstein noted, when expanding the relativistic mass formula, that the increase in mass of a body appeared to be equal to the increase in kinetic energy divided by $c^2$ ($\Delta m = \Delta(K.E.)/c^2$). Now, that, in turn, inspired him to also assign an equivalent energy to the rest mass of that body: $E_0 = m_0c^2$. […] At least that’s how Feynman tells the story in his 1965 Lectures… But so we’ve actually been doing it the other way around here!

Hmm… You will probably find all of this rather strange, and you may also wonder what happened to our potential energy. Indeed, that concept sort of ‘disappeared’ in this story: from the story above, it’s clear that kinetic energy has an equivalent mass, but what about potential energy?

That’s a very interesting question but, unfortunately, I can only give a rather rudimentary answer to that. Let’s suppose that we have two masses M and m. According to the potential energy formula above, the potential energy U between these two masses will then be equal to U = –GMm/r. Now, that energy is not interpreted as energy of either M or m, but as energy that is part of the (M, m) system, which includes the system’s gravitational field. So that energy is considered to be stored in that gravitational field. The closer the two masses sit to each other, the more negative U becomes and, hence, the less energy the system as a whole has. In contrast, when we separate them further apart, U rises towards zero, so we increase the energy of the system as a whole, and that extra energy is stored in the system’s gravitational field. So, yes, the potential energy does impact the (equivalent) mass of the system, but not the individual masses M and m. Does that make sense?
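To put a number on that, the sketch below computes U = –GMm/r and divides it by c² to get the equivalent mass of the system’s potential energy. The Earth-Moon figures are rounded textbook values that I am assuming for illustration, and the function names are hypothetical.

```python
G = 6.674e-11            # gravitational constant in m^3 kg^-1 s^-2 (rounded)
C_LIGHT = 299_792_458.0  # speed of light in m/s


def potential_energy(big_mass, small_mass, r):
    """Gravitational potential energy U = -G*M*m/r of the two-mass system."""
    return -G * big_mass * small_mass / r


def equivalent_mass(energy):
    """Mass equivalent of an energy, from E = m*c^2."""
    return energy / C_LIGHT**2


if __name__ == "__main__":
    # Approximate, rounded values (assumptions for this illustration).
    earth_mass = 5.972e24  # kg
    moon_mass = 7.342e22   # kg
    distance = 3.844e8     # m, mean Earth-Moon distance

    u = potential_energy(earth_mass, moon_mass, distance)
    print(u, equivalent_mass(u))

    # Moving the masses further apart raises U towards zero.
    print(potential_energy(earth_mass, moon_mass, 2 * distance) > u)
```

With these inputs the equivalent mass is negative and of the order of 10¹¹ to 10¹² kg, which is tiny compared to the masses themselves: that is why we never notice it in everyday mechanics.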

For me, it does, but I guess you’re a bit tired by now and, hence, I think I should wrap up here. In my next (and probably last) post on relativity, I’ll present those Lorentz transformations that allow us to ‘translate’ the space and time coordinates from one reference frame to another, and in that post I’ll also present the other derivation of Einstein’s relativistic mass formula, which is actually based on those transformations. In fact, I realize I should probably have started with that (as mentioned above, that’s how Feynman does it in his Lectures) but, for some reason, I find the presentation above more interesting, and so that’s why I am telling the story starting from another angle. I hope you don’t mind. In any case, it should be the same, because everything is related to everything in physics – just like in math. That’s why it’s important to have a good teacher. 🙂