Another post for my kids: introducing (special) relativity

Pre-scriptum (dated 26 June 2020): These posts on elementary math and physics have not suffered much the attack by the dark force—which is good because I still like them. While my views on the true nature of light, matter and the force or forces that act on them have evolved significantly as part of my explorations of a more realist (classical) explanation of quantum mechanics, I think most (if not all) of the analysis in this post remains valid and fun to read. In fact, I find the simplest stuff is often the best. 🙂

Original post:

In my previous post, I talked about energy, and I tried to keep it simple – but also accurate. However, to be completely accurate, one must, of course, introduce relativity at some point. So how does that work? What’s ‘relativistic’ energy? Well… Let me try to convey a few ideas here.

The first thing to note is that the energy conservation law still holds: special theory or not, the sum of the kinetic and potential energies in a (closed) system is always equal to some constant C. What constant? That doesn’t matter: Nature does not care about our zero point and, hence, we can add or subtract any (other) constant to the equation K.E. + P.E. = T + U = C.

That being said, in my previous post, I pointed out that the constant depends on the reference point for the potential energy term U: we will usually take infinity as the reference point (for a force that attracts) and associate it with zero potential (U = 0). We then get a function U(x) like the one below: for gravitational energy we have U(x) = –GMm/x, and for electrical charges, we have U(x) = q1q2/4πε0x. The mathematical shape is exactly the same but, in the case of the electromagnetic forces, you have to remember that likes repel, and opposites attract, so we don’t need the minus sign: the sign of the charges takes care of it.


Minus sign? In case you wonder why we need that minus sign for the potential energy function, well… I explained that in my previous post and so I’ll be brief on that here: potential energy is measured by doing work against the force. That’s why. So we have an infinite sum (i.e. an integral) over some trajectory or path looking like this: U = – ∫F·ds.

For kinetic energy, we don’t need any minus sign: as an object picks up speed, it’s the force itself that is doing the work as its potential energy is converted into kinetic energy, so the change in kinetic energy will equal the change in potential energy, but with opposite sign: as the object loses potential energy, it gains kinetic energy. Hence, we write ΔT = –ΔU = ∫F·ds..

That’s all kids stuff obviously. Let’s go beyond this and ask some questions. First, why can we add or subtract any constant to the potential energy but not to the kinetic energy? The answer is… Well… We actually can add or subtract a ‘constant’ to the kinetic energy as well. Now you will shake your head: Huh? Didn’t we have that T = mv2/2 formula for kinetic energy? So how and why could one add or subtract some number to that?

Well… That’s where relativity comes into play. The velocity v depends on your reference frame. If another observer would move with and/or alongside the object, at the same speed, that observer would observe a velocity equal to zero and, hence, its kinetic energy – as that observer would measure it – would also be zero. You will object to that, saying that a change of reference frame does not change the force, and you’re right: the force will cause the object to accelerate or decelerate indeed, and if the observer is not subject to the same force, then he’ll see the object accelerate or decelerate indeed, regardless of his reference frame is a moving or inertial frame. Hence, both the inertial as well as the moving observer will see an increase (or decrease) in its kinetic energy and, therefore, both will conclude that its potential energy decreases (or increases) accordingly. In short, it’s the change in energy that matters, both for the potential as well as for the kinetic energy. The reference point itself, i.e. the point from where we start counting so to say, does not: that’s relative. [This also shows in the derivation for kinetic energy which I’ll do below.]

That brings us to the second question. We all learned in high school that mass and energy are related through Einstein’s mass-energy relation, E = mc2, which establishes an equivalence between the two: the mass of an object that’s picking up speed increases, and so we need to look at both speed and mass as a function of time. Indeed, remember Newton’s Law: force is the time rate of change of momentum: F = d(mv)/dt. When the speed is low (i.e. non-relativistic), then we can just treat m as a constant and write that  F = mdv/dt = ma (the mass times the acceleration). Treating m as a constant also allows us to derive the classical (Newtonian) formula for kinetic energy:


So if we assume that the velocity of the object at point O is equal to zero (so vo = 0), then ΔT will be equal to T and we get what we were looking for: the kinetic energy at point P will be equal to T = mv2/2.

Now, you may wonder why we can’t do that same derivation for a non-constant mass? The answer to that question is simple: taking the m factor out of the integral can only be done if we assume it is a constant. If not, then we should leave it inside. It’s similar to taking a derivative. If m would not be constant, then we would have to apply the product rule to calculate d(mv)/dt, so we’d write d(mv)/dt = (dm/dt)v + m(dv/dt). So we have two terms here and it’s only when m is constant that we can reduce it to d(mv)/dt = m(dv/dt).

So we have our classical kinetic energy function. However, when the velocity gets really high – i.e. if it’s like the same order of magnitude as the velocity of light – then we cannot assume that mass is constant. Indeed, the same high-school course in physics that taught you that E = mc2 equation will probably also have taught you that an object can never go faster than light, regardless of the reference frame. Hence, as the object goes faster and faster, it will pick up more momentum, but its rate of acceleration should (and will) go down in such way that the object can never actually reach the speed of light. Indeed, if Newton’s Law is to remain valid, we need to correct it such a way that m is no longer constant: m itself will increase as a function of its velocity and, hence, as a function of time. You’ll remember the formula for that:


This is often written as m = γm0, with m0 denoting the mass of the object at rest (in your reference frame that is) and γ = (1 – v2/c2)–1/2 the so-called Lorentz factor. The Lorentz factor is named after a Dutch physicist who introduced it near the end of the 19th century in order to explain why the speed of light is always c, regardless of the frame of reference (moving or not), or – in other words – why the speed of light is not relative. Indeed, while you’ll remember that there is no such thing as an absolute velocity according to the (special) theory of relativity, the velocity of light actually is absolute ! That means you will always see light traveling at speed c regardless of your reference frame. To put it simply, you can never catch up with light and, if you would be traveling away from some star in a spaceship with a velocity of 200,000 km per second, and a light beam from that star would pass you, you’d measure the speed of that light beam to be equal to 300,000 km/s, not 100,000 km/s. So is an absolute speed that acts as an absolute speed limit regardless of your reference frame. [Note that we’re talking only about reference frames moving at a uniform speed: when acceleration comes into play, then we need to refer to the general theory of relativity and that’s a somewhat different ball game.]

The graph below shows how γ varies as a function of v. As you can see, the mass increase only becomes significant at speeds of like 100,000 km per second indeed. Indeed, for v = 0.3c, the Lorentz factor is 1.048, so the increase is about 5% only. For v = 0.5c, it’s still limited to an increase of some 15%. But then it goes up rapidly: for v = 0.9c, the mass is more than twice the rest mass: m ≈ 2.3m0; for v = 0.99c, the mass increase is 600%: m ≈ 7m0; and so on. For v = 0.999c – so when the speed of the object differs from c only by 1 part in 1,000 – the mass of the object will be more than twenty-two times the rest mass (m ≈ 22.4m0).


You probably know that we can actually reach such speeds and, hence, verify Einstein’s correction of Newton’s Law in particle accelerators: the electrons in an electron beam in a particle accelerator get usually pretty close to c and have a mass that’s like 2000 times their rest mass. How do we know that? Because the magnetic field needed to deflect them is like 2000 times as great as their (theoretical) rest mass. So how fast do they go? For their mass to be 2000 times m0, 1 – v2/c2 must be equal to 1/4,000,000. Hence, their velocity v differs from c only by one part in 8,000,000. You’ll have to admit that’s very close.

Other effects of relativistic speeds

So we mentioned the thing that’s best known about Einstein’s (special) theory of relativity: the mass of an object, as measured by the inertial observer, increases with its speed. Now, you may or may not be familiar with two other things that come out of relativity theory as well:

  1. The first is length contraction: objects are measured to be shortened in the direction of motion with respect to the (inertial) observer. The formula to be used incorporates the reciprocal of the Lorentz factor: L = (1/γ)L0. For example, a meter stick in a space ship moving at a velocity v = 0.6c will appear to be only 80 cm to the external/inertial observer seeing it whizz past… That is if he can see anything at all of course: he’d have to take like a photo-finish picture as it zooms past ! 🙂
  2. The second is time dilation, which is also rather well known – just like the mass increase effect – because of the so-called twin paradox: time will appear to be slower in that space ship and, hence, if you send one of two twins away on a space journey, traveling at such relativistic speed, he will come back younger than his brother. The formula here is a bit more complicated, but that’s only because we’re used to measure time in seconds. If we would take a more natural unit, i.e. the time it takes light to travel a distance of 1 m, then the formula will look the same as our mass formula: t = γt0 and, hence, one ‘second’ in the space ship will be measured as 1.25 ‘seconds’ by the external observer. Hence, the moving clock will appear to run slower – to the external (inertial) observer that is.

Again, the reality of this can be demonstrated. You’ll remember that we introduced the muon in previous posts: muons resemble electrons in the sense that they have the same charge, but their mass is more than 200 times the mass of an electron. As compared to other unstable particles, their average lifetime is quite long: 2.2 microseconds. Still, that would not be enough to travel more than 600 meters or so – even at the speed of light (2.2 μs × 300,000 km/s = 660 m). But so we do detect muons in detectors down here that come all the way down from the stratosphere, where they are created when cosmic rays hit the Earth’s atmosphere some 10 kilometers up. So how do they get here if they decay so fast? Well, those that actually end up in those detectors, do indeed travel very close to the speed of light and, hence, while from their own point of view they live only like two millionths of a second, they live considerably longer from our point of view.

Relativistic energy: E = mc2

Let’s go back to our main story line: relativistic energy. We wrote above that it’s the change of energy that matters really. So let’s look at that.

You may or may not remember that the concept of work in physics is closely related to the concept of power. In fact, you may actually remember that power, in physics at least, is defined as the work done per second. Indeed, we defined work as the (dot) product of the force and the distance. Now, when we’re talking a differential distance only (i.e. an infinitesimally small change only), then we can write dT = F·ds, but when we’re talking something larger, then we have to do that integral: ΔT = ∫F·ds. However, we’re interested in the time rate of change of T here, and so that’s the time derivative dT/dt which, as you easily verify, will be equal to dT/dt = (F·ds)/dt = F·(ds/dt) = F·and so we can use that differential formula and we don’t need the integral. Now, that (dot) product of the force and the velocity vectors is what’s referred to as the power. [Note that only the component of the force in the direction of motion contributes to the work done and, hence, to the power.]

OK. What am I getting at? Well… I just want to show an interesting derivation: if we assume, with Einstein, that mass and energy are equivalent and, hence, that the total energy of a body always equals E = mc2, then we can actually derive Einstein’s mass formula from that. How? Well… If the time rate of change of the energy of an object is equal to the power expended by the forces acting on it, then we can write:

dE/dt = d(mc2)/dt = F·v

Now, we cannot take the mass out of those brackets after the differential operator (d) because the mass is not a constant in this case (relativistic speeds) and, hence, dm/dt ≠ 0. However, we can take out c2 (that’s an absolute constant, remember?) and we can also substitute F using Newton’s Law (F = d(mv)/dt), again taking care to leave m between the brackets, not outside. So then we get:

d(mc2)/dt = c2dm/dt = [d(mv)/dt]·v = d(mv)/dt

In case you wonder why we can replace the vectors (bold face) v and d(mv) by their magnitudes (or lengths) v and d(mv): v and mv have the same direction and, hence, the angle θ between them is zero, and so v·v =v││v│cosθ =v2. Likewise, d(mv) and v also have the same direction and so we can just replace the dot product by the product of the magnitudes of those two vectors.

Now, let’s not forget the objective: we need to solve this equation for m and, hopefully, we’ll find Einstein’s mass formula, which we need to correct Newton’s Law. How do we do that? We’ll first multiply both sides by 2m. Why? Because we can then apply another mathematical trick, as shown below:

c2(2m)·dm/dt = 2md(mv)/dt ⇔ d(m2c2)/dt = d(m2v2)/dt

However, if the derivatives of two quantities are equal, then the quantities themselves can only differ by a constant, say C. So we integrate both sides and get:

m2c= m2v+ C

Be patient: we’re almost there. The above equation must be true for all velocities v and, hence, we can choose the special case where v = 0 and call this mass m0, and then substitute, so we get m0c= m00+ C = C. Now we put this particular value for C back in the more general equation above and we get:

mc= mv+ m0c⇔ m = mv2/c2 +m⇔ m(1 – v2/c2) = m⇔ m = m0/(1 – v2/c2)–1/2

So there we are: we have just shown that we get the relativistic mass formula (it’s on the right-hand side above) if we assume that Einstein’s mass-energy equivalence relation holds.

Now, you may wonder why that’s significant. Well… If you’re disappointed, then, at the very least, you’ll have to admit that it’s nice to show how everything is related to everything in this theory: from E = mc2, we get m0/(1 – v2/c2)–1/2. I think that’s kinda neat!

In addition, let us analyze that mass-energy relation in another way. It actually allows us to re-define kinetic energy as the excess of a particle over its rest mass energy, or – it’s the same expression really – or the difference between its total energy and its rest energy.

How does that work? Well… When we’re looking at high-speed or high-energy particles, we will write the kinetic energy as:

K.E. = mc– m0c= (m – m0)c= γm0c– m0c= m0c2(γ – 1). 

Now, we can expand that Lorentz factor γ = (1 – v2/c2)–1/2 into a binomial series (the binomial series is an infinite Taylor series, so it’s not to be confused with the (finite) binomial expansion: just check it online if you’re in doubt). If we do that, we we can write γ as an infinite sum of the following terms:

γ = 1 + (1/2)v2/c+ (3/8)v4/c+ (5/16)v6/c+ …

Now, when we plug this back into our (relativistic) kinetic energy equation, we can scrap a few things (just do it) to get where I wanted to get:

K.E. = (1/2)m0v+ (3/8)m0v4/c+ (5/16)m0v6/c+ …

Again, you’ll wonder: so what? Well… See how the non-relativistic formula for kinetic energy (K.E. = m0v2/2) appears here as the first term of this series and, hence, how the formula above shows that our ‘Newtonian’ formula is just an approximation. Of course, at low speeds, the second, third etcetera terms represent close to nothing and, hence, then our Newtonian ‘approximation is obviously pretty good of course !

OK… But… Now you’ll say: that’s fine, but how did Einstein get inspired to write E = mc2 in the first place? Well, truth be told, the relativistic mass formula was derived first (i.e. before Einstein wrote his E = mc2 equation), out of a derivation involving the momentum conservation law and the formulas we must use to convert the space-time coordinates from one reference frame to another when looking at phenomena (i.e. the so-called Lorentz transformations). And it was only afterwards that Einstein noted that, when expanding the relativistic mass formula, that the increase in mass of a body appeared to be equal to the increase in kinetic energy divided by c2 (Δm = Δ(K.E.)/c2). Now, that, in turn, inspired him to also assign an equivalent energy to the rest mass of that body: E0 = m0c2. […] At least that’s how Feynman tells the story in his 1965 Lectures… But so we’ve actually been doing it the other way around here!

Hmm… You will probably find all of this rather strange, and you may also wonder what happened to our potential energy. Indeed, that concept sort of ‘disappeared’ in this story: from the story above, it’s clear that kinetic energy has an equivalent mass, but what about potential energy?

That’s a very interesting question but, unfortunately, I can only give a rather rudimentary answer to that. Let’s suppose that we have two masses M and m. According to the potential energy formula above, the potential energy U between these two masses will then be equal to U = –GMm/r. Now, that energy is not interpreted as energy of either M or m, but as energy that is part of the (M, m) system, which includes the system’s gravitational field. So that energy is considered to be stored in that gravitational field. If the two masses would sit right on top of each other, then there would be no potential energy in the (M, m) system and, hence, the system as a whole would have less energy. In contrast, when we separate them further apart, then we increase the energy of the system as a whole, and so the system’s gravitational field then increases. So, yes, the potential energy does impact the (equivalent) mass of the system, but not the individual masses M and m. Does that make sense?

For me , it does, but I guess you’re a bit tired by now and, hence, I think I should wrap up here. In my next (and probably last) post on relativity, I’ll present those Lorentz transformations that allow us to ‘translate’ the space and time coordinates from one reference frame to another, and in that post I’ll also present the other derivation of Einstein’s relativistic mass formula, which is actually based on those transformations. In fact, I realize I should have probably started with that (as mentioned above, that’s how Feynman does it in his Lectures) but, then, for some reason, I find the presentation above more interesting, and so that’s why I am telling the story starting from another angle. I hope you don’t mind. In any case, it should be the same, because everything is related to everything in physics – just like in math. That’s why it’s important to have a good teacher. 🙂


Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s