# The Principle of Least Action re-visited

As I was posting some remarks on the Exercises that come with Feynman’s Lectures, I was thinking I should do another post on the Principle of Least Action, and how it is used in quantum mechanics. It is an interesting matter, because the Principle of Least Action sort of connects classical and quantum mechanics.

Let us first re-visit the Principle in classical mechanics. The illustrations which Feynman uses in his iconic exposé on it are copied below. You know what they depict: some object that goes up in the air, and then comes back down because of… Well… Gravity. Hence, we have a force field and, therefore, some potential which gives our object some potential energy. The illustration is nice because we can apply it to any (uniform) force field, so let’s analyze it a bit more in depth.

We know the actual trajectory – which Feynman writes as $\underline{x}(t)$ so as to distinguish it from some other nearby path $x(t) = \underline{x}(t) + \eta(t)$ – will minimize the value of the following integral:

$$S = \int_{t_1}^{t_2} (KE - PE)\,dt$$

In the mentioned post, I try to explain what the formula actually means by breaking it up in two separate integrals: one with the kinetic energy in the integrand and – you guessed it 🙂 – one with the potential energy. We can choose any reference point for our potential energy, of course, but to better reflect the energy conservation principle, we assume PE = 0 at the highest point. This ensures that the sum of the kinetic and the potential energy is zero. For a mass of 5 kg (think of the ubiquitous cannon ball), and a (maximum) height of 50 m, we got the following graph.

Just to make sure, here is how we calculate KE and PE as a function of time:

We can, of course, also calculate the action as a function of time:

Note the integrand: KE − PE = m·v². Strange, isn’t it? It’s like E = m·c², right? We get a weird cubic function, which I plotted below (blue). I added the function for the height (but in millimeters) because of the different scales.
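For reference, here is a sketch of how these formulas can be reconstructed from the definitions above. It assumes the object is launched from x = 0 at t = 0 with initial velocity v₀, reaches its maximum height h = v₀²/2g, and that PE = 0 at that highest point; τ is just a dummy integration variable.

$$
\begin{aligned}
v(t) &= v_0 - g\,t, \qquad x(t) = v_0 t - \tfrac{1}{2} g t^2, \qquad h = \frac{v_0^2}{2g} \\
KE(t) &= \tfrac{1}{2}\, m\,(v_0 - g t)^2 \\
PE(t) &= m g\,[x(t) - h] = -\tfrac{1}{2}\, m\,(v_0 - g t)^2 = -KE(t) \\
KE - PE &= m\,(v_0 - g t)^2 = m v^2 \\
S(t) &= \int_0^t m\,(v_0 - g\tau)^2\,d\tau = m\,v_0\,t\,(v_0 - g t) + \tfrac{1}{3}\, m\,g^2 t^3
\end{aligned}
$$

The last line is the cubic function of time: at the top of the trajectory (t = v₀/g) the first term vanishes and only the t³ term is left.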

So what’s going on? The action concept is interesting. As the product of force, distance and time, it makes intuitive sense: it’s force over distance over time. To cover some distance in some force field, energy will be used or spent but, clearly, the time that is needed should matter as well, right? Yes. But the question is: how, exactly? Let’s analyze what happens from t = 0 to t = 3.2 seconds, so that’s the trajectory from x = 0 to the highest point (x = 50 m). The action that is required to bring our 5 kg object there would be equal to F·h·t = m·g·h·t = 5×9.8×50×3.2 = 7828.9 J·s. [I use non-rounded values in my calculations.] However, our action integral tells us it’s only 5219.6 J·s. The difference (2609.3 J·s) is explained by the initial velocity and, hence, the initial kinetic energy, which we got for free, so to speak, and which, over the time interval, is spent as action. So our action integral gives us a net value, so to speak.
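A quick numerical check of these figures may help. This is only a sketch: it assumes g = 9.8 m/s² and the launch speed v₀ = √(2·g·h) needed to reach 50 m, so the results differ slightly (by well under 0.1%) from the non-rounded values quoted above. The function and variable names are mine.

```python
import math

m, g, h = 5.0, 9.8, 50.0           # mass (kg), gravity (m/s^2), maximum height (m)
v0 = math.sqrt(2 * g * h)          # launch speed needed to reach h (assumption)
t_top = v0 / g                     # time needed to reach the highest point

def action(t):
    """Closed form of the integral of m*(v0 - g*s)**2 for s from 0 to t."""
    return m * v0 * t * (v0 - g * t) + m * g**2 * t**3 / 3

def action_numeric(t, n=100_000):
    """Midpoint-rule integration of KE - PE = m*v**2, as a cross-check."""
    dt = t / n
    return sum(m * (v0 - g * (k + 0.5) * dt) ** 2 for k in range(n)) * dt

naive = m * g * h * t_top          # force x distance x time
net = action(t_top)                # value of the action integral at the top

print(f"t_top = {t_top:.2f} s")
print(f"F*h*t = {naive:.1f} J*s, action integral = {net:.1f} J*s")
print(f"difference = {naive - net:.1f} J*s (one third of F*h*t)")
```

The action integral at the top comes out as exactly two thirds of m·g·h·t, which is why the difference is one third of it.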

To be precise, we can calculate the time rate of change of the kinetic energy as d(KE)/dt = −1533.7 + 480.2·t, so that’s a linear function of time. The graph below shows how it works. The time rate of change is initially negative, as kinetic energy gets spent and increases the potential energy of our object. At the maximum height, the time rate of change is zero. The object then starts falling, and the time rate of change becomes positive, as the velocity of our object goes from zero to… Well… The velocity is a linear function of time as well: v₀ − g·t, remember? Hence, at t = v₀/g = 31.3/9.8 = 3.2 s, the velocity becomes negative so our cannon ball is, effectively, falling down. Of course, as it falls down and gains speed, it covers more and more distance per second and, therefore, the associated action also goes up ever faster (as the cube of the time, not exponentially). Just re-define our starting point at t = 3.2 s. The m·v₀·t·(v₀ − g·t) term is zero at that point, and so then it’s only the m·g²·t³/3 term that counts.

So… Yes. That’s clear enough. But it still doesn’t answer the fundamental question: how does that minimization of S (or the maximization of −S) work, exactly? Well… It’s not like Nature knows it wants to go from point a to point b, and then sort of works out some least action algorithm. No. The true path is given by the force law which, at every point in spacetime, will accelerate, or decelerate, our object at a rate that is equal to the ratio of the force and the mass of our object. In this case, we write: a = F/m = m·g/m = g, so that’s the acceleration of gravity. That’s the only real thing: all of the above is just math, some mental construct, so to speak.

Of course, this acceleration, or deceleration, then gives the velocity and the kinetic energy. Hence, once again, it’s not like we’re choosing some average for our kinetic energy: the force (gravity, in this particular case) just gives us that average. Likewise, the potential energy depends on the position of our object, which we get from… Well… Where it starts and where it goes, so it also depends on the velocity and, hence, the acceleration or deceleration from the force field. So there is no optimization. No teleology. Newton’s force law gives us the true path. If we drop something down, it will go down in a straight line, because any deviation from it would add to the distance. A more complicated illustration is Fermat’s Principle of Least Time, which combines distance and time. But we won’t go into any further detail here. Just note that, in classical mechanics, the true path can, effectively, be associated with a minimum value for that action integral: any other path will be associated with a higher S. So we’re done with classical mechanics here. What about the Principle of Least Action in quantum mechanics?

## The Principle of Least Action in quantum mechanics

We have the uncertainty in quantum mechanics: there is no unique path. However, we can, effectively, associate each possible path with a definite amount of action, which we will also write as S. However, instead of talking velocities, we’ll usually want to talk momentum. Photons have no rest mass (m₀ = 0), but they do have momentum because of their energy: for a photon, the E = m·c² equation can be rewritten as E = p·c, and the Einstein-Planck relation for photons tells us the photon energy (E) is related to the frequency (f): E = h·f. Now, for a photon, the frequency and the wavelength are related by f = c/λ. Hence, p = E/c = h·f/c = h/λ = ħ·k.

OK. What’s the action integral? What’s the kinetic and potential energy? Let’s just try the energy: E = m·c². It reflects the KE − PE = m·v² formula we used above. Of course, the energy of a photon does not vary, so the value of our integral is just the energy times the travel time, right? What is the travel time? Let’s do things properly by using vector notations here, so we will have two position vectors $\mathbf{r}_1$ and $\mathbf{r}_2$ for point a and b respectively. We can then define a vector pointing from $\mathbf{r}_1$ to $\mathbf{r}_2$, which we will write as $\mathbf{r}_{12}$. The distance between the two points is then, obviously, equal to $|\mathbf{r}_{12}| = \sqrt{\mathbf{r}_{12}^2} = r_{12}$. Our photon travels at the speed of light, so the time interval will be equal to $t = r_{12}/c$. So we get a very simple formula for the action: $S = E\cdot t = p\cdot c\cdot t = p\cdot c\cdot r_{12}/c = p\cdot r_{12}$. Now, it may or may not make sense to assume that the direction of the momentum of our photon and the direction of $\mathbf{r}_{12}$ are somewhat different, so we’ll want to re-write this as a vector dot product: $S = \mathbf{p}\cdot\mathbf{r}_{12}$. [Of course, you know the $\mathbf{p}\cdot\mathbf{r}_{12}$ dot product equals $|\mathbf{p}|\cdot|\mathbf{r}_{12}|\cdot\cos\theta = p\cdot r_{12}\cdot\cos\theta$, with θ the angle between $\mathbf{p}$ and $\mathbf{r}_{12}$. If the directions are the same (θ = 0), then cosθ is equal to 1. If the angle is ± π/2, then it’s 0.]
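To get a feel for the numbers, here is a minimal sketch of that calculation. The wavelength (500 nm) and the distance (1 m) are assumptions picked for illustration only; the variable names are mine.

```python
import math

H_PLANCK = 6.62607015e-34          # Planck constant (J*s)
HBAR = H_PLANCK / (2 * math.pi)    # reduced Planck constant (J*s)
C = 299_792_458.0                  # speed of light (m/s)

wavelength = 500e-9                # assumed: green light, 500 nm
r12 = 1.0                          # assumed: 1 m between point a and point b

f = C / wavelength                 # frequency, f = c/lambda
E = H_PLANCK * f                   # photon energy, E = h*f
p = E / C                          # photon momentum, p = E/c = h/lambda
k = 2 * math.pi / wavelength       # wavenumber, so that p = hbar*k

t = r12 / C                        # travel time at the speed of light
S = E * t                          # action = energy x travel time
S_in_hbar = S / HBAR               # the same action, counted in units of hbar

print(f"p = {p:.3e} kg*m/s, S = {S:.3e} J*s = {S_in_hbar:.3e} hbar")
```

Even for one meter of visible light, the action is millions of times ħ.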

So now we minimize the action so as to determine the actual path? No. We have this weird stopwatch stuff in quantum mechanics. We’ll use this S = p·r12 value to calculate a probability amplitude. So we’ll associate trajectories with amplitudes, and we just use the action values to do so. This is how it works (don’t ask me why – not now, at least):

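1. We measure action in units of ħ, because… Well… Planck’s constant is a pretty fundamental unit of action, right? 🙂 So we write $\theta = S/\hbar = \mathbf{p}\cdot\mathbf{r}_{12}/\hbar$.
2. θ usually denotes an angle, right? Right. $\theta = \mathbf{p}\cdot\mathbf{r}_{12}/\hbar$ is the so-called phase of… Well… A proper wavefunction:

$$\psi(\mathbf{p}\cdot\mathbf{r}_{12}) = a\cdot e^{i\theta} = \frac{1}{r_{12}}\cdot e^{i\,\mathbf{p}\cdot\mathbf{r}_{12}/\hbar}$$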

Wow! I realize you may never have seen this… Well… It’s my derivation of what physicists refer to as the propagator function for a photon. If you google it, you may see it written like this (most probably not, however, as it’s usually couched in more abstract math):

$$\langle \mathbf{r}_2 | \mathbf{r}_1 \rangle = \frac{e^{i\,\mathbf{p}\cdot\mathbf{r}_{12}/\hbar}}{r_{12}}$$

This formulation looks slightly better because it uses Dirac’s bra-ket notation: the initial state of our photon is written as $|\mathbf{r}_1\rangle$ and its final state is, accordingly, $\langle\mathbf{r}_2|$. But it’s the same: it’s the amplitude for our photon to go from point a to point b. In case you wonder, the $1/r_{12}$ coefficient is there to take care of the inverse square law. I’ll let you think about that for yourself. It’s just like any other physical quantity (or intensity, if you want): they get diluted as the distance increases. [Note that we get the inverse square ($1/r_{12}^2$) when calculating a probability, which we do by taking the absolute square of our amplitude: $|(1/r_{12})\cdot e^{i\,\mathbf{p}\cdot\mathbf{r}_{12}/\hbar}|^2 = |1/r_{12}|^2\cdot|e^{i\,\mathbf{p}\cdot\mathbf{r}_{12}/\hbar}|^2 = 1/r_{12}^2$.]
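The inverse square law is easy to check numerically. The sketch below codes the simplified propagator as a plain function; the function names and the momentum value (roughly that of a 500 nm photon) are assumptions for illustration.

```python
import cmath
import math

HBAR = 1.054571817e-34             # reduced Planck constant (J*s)

def propagator(p, r12, cos_angle=1.0):
    """Amplitude (1/r12) * exp(i * p * r12 * cos_angle / hbar)."""
    theta = p * r12 * cos_angle / HBAR       # phase = action in units of hbar
    return cmath.exp(1j * theta) / r12

def probability(p, r12, cos_angle=1.0):
    """Absolute square of the amplitude."""
    return abs(propagator(p, r12, cos_angle)) ** 2

p = 1.325e-27                      # assumed momentum (kg*m/s), about h / 500 nm
near = probability(p, 1.0)
far = probability(p, 2.0)
print(near, far, near / far)
```

The phase factor always has length one, so doubling the distance divides the probability by four, whatever the momentum is.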

So… Well… Now we are ready to understand Feynman’s own summary of his path integral formulation of quantum mechanics:

“Here is how it works: Suppose that for all paths, S is very large compared to ħ. One path contributes a certain amplitude. For a nearby path, the phase is quite different, because with an enormous S even a small change in S means a completely different phase—because ħ is so tiny. So nearby paths will normally cancel their effects out in taking the sum—except for one region, and that is when a path and a nearby path all give the same phase in the first approximation (more precisely, the same action within ħ). Only those paths will be the important ones.”

You are now, finally, ready to understand that wonderful animation that’s part of the Wikipedia article on Feynman’s path integral formulation of quantum mechanics. Check it out, and let the author (not me, but a guy who identifies himself as Juan David) know I think it’s great! 🙂

## Explaining diffraction

All of the above is nice, but how does it work? What’s the geometry? Let me be somewhat more adventurous here. So we have our formula for the amplitude of a photon to go from one point to another. The formula is far too simple, if only because it assumes photons always travel at the speed of light. As explained in an older post of mine, a photon also has an amplitude to travel slower or faster than c (I know that sounds crazy, but it is what it is) and a more sophisticated propagator function will acknowledge that and, unsurprisingly, ensure the spacetime intervals that are more light-like make greater contributions to the ‘final arrow’, as Feynman (or his student, Ralph Leighton, I should say) put it in his Strange Theory of Light and Matter. However, then we’d need to use four-vector notation and we don’t want to do that here. The simplified formula above serves the purpose. We can re-write it as:

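$$\psi(\mathbf{p}\cdot\mathbf{r}_{12}) = a\cdot e^{i\theta} = \frac{1}{r_{12}}\cdot e^{iS/\hbar} = \frac{e^{i\,\mathbf{p}\cdot\mathbf{r}_{12}/\hbar}}{r_{12}}$$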

Again, $S = \mathbf{p}\cdot\mathbf{r}_{12}$ is just the amount of action we calculate for the path. Action is energy over some time (1 N·m·s = 1 J·s), or momentum over some distance (1 kg·(m/s)·m = 1 N·(s²/m)·(m/s)·m = 1 N·m·s). For a photon traveling at the speed of light, we have E = p·c, and $t = r_{12}/c$, so we get a very simple formula for the action: $S = E\cdot t = p\cdot r_{12}$. Now, we know that, in quantum mechanics, we have to add the amplitudes for the various paths between $\mathbf{r}_1$ and $\mathbf{r}_2$ so we get a ‘final arrow’ whose absolute square gives us the probability of… Well… Our photon going from $\mathbf{r}_1$ to $\mathbf{r}_2$. You also know that we don’t really know what actually happens in-between: we know amplitudes interfere, but that’s what we’re modeling when adding the arrows. Let me copy one of Feynman’s famous drawings so we’re sure we know what we’re talking about.

Our simplified approach (the assumption of light traveling at the speed of light) reduces our least action principle to a least time principle: it is the arrows associated with the path of least time, and with the paths immediately left and right of it, that make the biggest contribution to the final arrow. Why? Think of the stopwatch metaphor: these stopwatches arrive around the same time and, hence, their hands point more or less in the same direction. It doesn’t matter what direction – as long as it’s more or less the same.
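The stopwatch argument can be tried out numerically. The sketch below uses a toy reflection set-up that is entirely my own assumption (it is not taken from the drawing): a source and a detector at the same height above a mirror, with every path bouncing off the mirror at some point x. All arrows have the same length here, so only the phase k·L(x) matters.

```python
import cmath
import math

wavelength = 0.01                  # assumed, in the same arbitrary unit as d and h
k = 2 * math.pi / wavelength       # phase per unit of path length (p / hbar)
d, h = 1.0, 1.0                    # source at (-d, h), detector at (+d, h), mirror on y = 0

def path_length(x):
    """Length of the path source -> mirror point (x, 0) -> detector."""
    return math.hypot(x + d, h) + math.hypot(x - d, h)

def arrow_sum(x_min, x_max, n=2000):
    """Length of the summed arrows for n equally spaced mirror points,
    weighted by the spacing so that it approximates an integral."""
    dx = (x_max - x_min) / n
    total = sum(
        cmath.exp(1j * k * path_length(x_min + (j + 0.5) * dx)) for j in range(n)
    )
    return abs(total) * dx

central = arrow_sum(-0.1, 0.1)     # strip around the path of least time (x = 0)
edge = arrow_sum(0.8, 1.0)         # strip of the same width, far away from it

print(f"central strip: {central:.4f}, edge strip: {edge:.4f}")
```

The two strips have the same width, but the arrows of the central one line up while those of the edge strip curl up and largely cancel.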

Now let me copy the illustrations he uses to explain diffraction. Look at them carefully, and read the explanation below.

When the slit is large, our photon is likely to travel in a straight line. There are many other possible paths – crooked paths – but the amplitudes that are associated with those other paths cancel each other out. In contrast, the straight-line path and, importantly, the nearby paths, are associated with amplitudes that have the same phase, more or less.

However, when the slit is very narrow, there is a problem. As Feynman puts it, “there are not enough arrows to cancel each other out” and, therefore, the crooked paths are also associated with sizable probabilities. Now how does that work, exactly? Not enough arrows? Why? Let’s have a look at it.

The phase (θ) of our amplitudes $a\cdot e^{i\theta} = (1/r_{12})\cdot e^{iS/\hbar}$ is measured in units of ħ: θ = S/ħ. Hence, we should measure the variation in S in units of ħ. Consider two paths, for example: one for which the action is equal to S, and one for which the action is equal to S + δS = S + π·ħ, so δS = π·ħ. They will cancel each other out:

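$$
\begin{aligned}
\frac{e^{iS/\hbar}}{r_{12}} + \frac{e^{i(S+\delta S)/\hbar}}{r_{12}} &= \frac{1}{r_{12}}\left(e^{iS/\hbar} + e^{i(S+\pi\hbar)/\hbar}\right) \\
&= \frac{1}{r_{12}}\left(e^{iS/\hbar} + e^{iS/\hbar}\cdot e^{i\pi}\right) = \frac{1}{r_{12}}\left(e^{iS/\hbar} - e^{iS/\hbar}\right) = 0
\end{aligned}
$$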
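The cancellation is easy to confirm with a couple of lines of code. In this sketch the action is expressed directly in units of ħ, and the value chosen for S is arbitrary.

```python
import cmath
import math

def amplitude(action_in_hbar, r12=1.0):
    """(1/r12) * exp(i * S / hbar), with S given in units of hbar."""
    return cmath.exp(1j * action_in_hbar) / r12

S = 1234.5                                   # arbitrary action, in units of hbar
total = amplitude(S) + amplitude(S + math.pi)
print(abs(total))                            # zero, up to floating-point rounding
```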

So nearby paths will interfere constructively, so to speak, by making the final arrow larger. In order for that to happen, δS should be smaller than 2πħ/3 ≈ 2ħ, as shown below.

Why? That’s just the way the addition of angles works. Look at the illustration below: if the red arrow is the amplitude to which we are adding another, any amplitude of the same length whose phase angle differs from it by less than 2π/3 (so δS < 2πħ/3 ≈ 2ħ) will add something to its length. That’s what the geometry of the situation tells us. [If you have time, you can perhaps find some algebraic proof: let me know the result!]
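For what it’s worth, here is a sketch of the algebra. It assumes both arrows have the same length (taken as 1 here) and differ in phase by φ = δS/ħ, with |φ| ≤ π:

$$
\begin{aligned}
|1 + e^{i\varphi}|^2 &= (1 + \cos\varphi)^2 + \sin^2\varphi = 2 + 2\cos\varphi \\
|1 + e^{i\varphi}| > 1 &\iff \cos\varphi > -\tfrac{1}{2} \iff |\varphi| < \tfrac{2\pi}{3} \iff |\delta S| < \tfrac{2\pi\hbar}{3} \approx 2.09\,\hbar
\end{aligned}
$$

If the second arrow is shorter than the first, the threshold angle is smaller, so the 2π/3 value only applies to arrows of equal length.
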
We need to note a few things here. First, unlike what you might think, the amplitudes of the higher and lower path in the drawing do not cancel. On the contrary, the action is the same, so their magnitudes just add up. Second, if this logic is correct, we will have alternating zones with paths that interfere positively and negatively, as shown below.

Interesting geometry. How relevant are these zones as we move out from the center, steadily increasing δS? I am not quite sure. I’d have to get into the math of it all, which I don’t want to do in a blog like this. What I do want to do is re-examine Feynman’s intuitive explanation of diffraction: when the slit is very narrow, “there are not enough arrows to cancel each other out.”

Huh? What’s that? Can’t we add more paths? It’s a tricky question. We are measuring action in units of ħ, but do we actually think action comes in units of ħ? I am not sure. It would make sense, intuitively, but… Well… There’s uncertainty on the energy (E) and the momentum (p) of our photon, right? And how accurately can we measure the distance? So there’s some randomness everywhere. Having said that, the whole argument does require us to assume action effectively comes in units of ħ: ħ is, effectively, the scaling factor here.

So how can we have more paths? More arrows? I don’t think so. We measure S as energy over some time, or as momentum over some distance, and we express all these quantities in old-fashioned SI units: newton for the force, meter for the distance, and second for the time. If we want smaller arrows, we’ll have to use other units, but then the numerical value for ħ will change too! So… Well… No. I don’t think so. And it’s not because of the normalization rule (all probabilities have to add up to one, so we do have some re-scaling for that). That doesn’t matter, really. What matters is the physics behind the formula, and the formula tells us the physical reality is ħ. So the geometry of the situation is what it is.

Hmm… I guess that, at this point, we should wrap up our rather intuitive discussion here, and resort to the mathematical formalism of Feynman’s path integral formulation, but you can find that elsewhere.

Post scriptum: I said I would show how the Principle of Least Action is relevant to both classical as well as quantum mechanics. Well… Let me quote the Master once more:

“So in the limiting case in which Planck’s constant ħ goes to zero, the correct quantum-mechanical laws can be summarized by simply saying: ‘Forget about all these probability amplitudes. The particle does go on a special path, namely, that one for which S does not vary in the first approximation.’”

So that’s how the Principle of Least Action sort of unifies quantum mechanics as well as classical mechanics. 🙂

Post scriptum 2: In my next post, I’ll be doing some calculations. They will answer the question as to how relevant those zones of positive and negative interference further away from the straight-line path are. I’ll give a numerical example which shows the 1/r₁₂ factor does its job. 🙂 Just have a look at it. 🙂

# Quantum math: the rules – all of them! :-)

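In my previous post, I made no compromise, and used all of the rules one needs to calculate quantum-mechanical stuff:

$$\text{(I)}\quad \langle i | j \rangle = \delta_{ij}$$

$$\text{(II)}\quad \langle \chi | \phi \rangle = \sum_{\text{all } i} \langle \chi | i \rangle \langle i | \phi \rangle$$

$$\text{(III)}\quad \langle \chi | \phi \rangle = \langle \phi | \chi \rangle^*$$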

However, I didn’t explain them. These rules look simple enough, but let’s analyze them now. They’re simple and not at the same time, indeed.

[I] The first equation uses the Kronecker delta, which sounds fancy but it’s just a simple shorthand: δᵢⱼ = δⱼᵢ is equal to 1 if i = j, and zero if i ≠ j, with i and j representing base states. Equation (I) basically says that base states are all different. For example, the angular momentum in the x-direction of a spin-1/2 particle – think of an electron or a proton – is either +ħ/2 or −ħ/2, not something in-between, or some mixture. So 〈 +x | +x 〉 = 〈 −x | −x 〉 = 1 and 〈 +x | −x 〉 = 〈 −x | +x 〉 = 0.

We’re talking base states here, of course. Base states are like a coordinate system: we settle on an x-, y- and z-axis, and a unit, and any point is defined in terms of an x-, y- and z-number. It’s the same here, except we’re talking ‘points’ in four-dimensional spacetime. To be precise, we’re talking constructs evolving in spacetime. To be even more precise, we’re talking amplitudes with a temporal as well as a spatial frequency, which we’ll often represent as:

$$a\cdot e^{-i\theta} = a\cdot e^{-i(\omega t - \mathbf{k}\cdot\mathbf{x})} = a\cdot e^{-(i/\hbar)(E t - \mathbf{p}\cdot\mathbf{x})}$$

The coefficient in front (a) is just a normalization constant, ensuring all probabilities add up to one. It may not be a constant, actually: perhaps it just ensures our amplitude stays within some kind of envelope, as illustrated below.

As for the ω = E/ħ and k = p/ħ identities, these are the de Broglie equations for a matter-wave, which the young Comte jotted down as part of his 1924 PhD thesis. He was inspired by the fact that the $E\cdot t - \mathbf{p}\cdot\mathbf{x}$ factor is an invariant four-vector product ($E\cdot t - \mathbf{p}\cdot\mathbf{x} = p_\mu x^\mu$) in relativity theory, and noted the striking similarity with the argument of any wave function in space and time ($\omega\cdot t - \mathbf{k}\cdot\mathbf{x}$) and, hence, couldn’t resist equating both. Louis de Broglie was inspired, of course, by the solution to the blackbody radiation problem, which Max Planck and Einstein had convincingly solved by accepting that the ω = E/ħ equation holds for photons. As he wrote it:

“When I conceived the first basic ideas of wave mechanics in 1923–24, I was guided by the aim to perform a real physical synthesis, valid for all particles, of the coexistence of the wave and of the corpuscular aspects that Einstein had introduced for photons in his theory of light quanta in 1905.” (Louis de Broglie, quoted in Wikipedia)

Looking back, you’d of course want the phase of a wavefunction to be some invariant quantity, and the examples we gave in our previous post illustrate how one would expect energy and momentum to impact its temporal and spatial frequency. But I am digressing. Let’s look at the second equation. However, before we move on, note that minus sign in the exponent of our wavefunction: $a\cdot e^{-i\theta}$. At a fixed position, the phase turns clockwise as time goes by. That’s just the way it is. I’ll come back to this.

[II] The φ and χ symbols do not necessarily represent base states. In fact, Feynman illustrates this law using a variety of examples including both polarized as well as unpolarized beams, or ‘filtered’ as well as ‘unfiltered’ states, as he calls it in the context of the Stern-Gerlach apparatuses he uses to explain what’s going on. Let me summarize his argument here.

I discussed the Stern-Gerlach experiment in my post on spin and angular momentum, but the Wikipedia article on it is very good too. The principle is illustrated below: an inhomogeneous magnetic field – note the direction of the gradient ∇B = (∂B/∂x, ∂B/∂y, ∂B/∂z) – will split a beam of spin-one particles into three beams. [Matter-particles with spin one are rather rare (Lithium-6 is an example), but three states (rather than two only, as we’d have when analyzing spin-1/2 particles, such as electrons or protons) allow for more play in the analysis. 🙂 In any case, the analysis is easily generalized.]

The splitting of the beam is based, of course, on the quantized angular momentum in the z-direction (i.e. the direction of the gradient): its value is either ħ, 0, or −ħ. We’ll denote these base states as +, 0 or −, and we should note they are defined in regard to an apparatus with a specific orientation. If we call this apparatus S, then we can denote these base states as +S, 0S and −S respectively.

The interesting thing in Feynman’s analysis is the imagined modified Stern-Gerlach apparatus, which – I am using Feynman’s words here 🙂 – “puts Humpty Dumpty back together.” It looks a bit monstrous, but it’s easy enough to understand. Quoting Feynman once more: “It consists of a sequence of three high-gradient magnets. The first one (on the left) is just the usual Stern-Gerlach magnet and splits the incoming beam of spin-one particles into three separate beams. The second magnet has the same cross section as the first, but is twice as long and the polarity of its magnetic field is opposite the field in magnet 1. The second magnet pushes in the opposite direction on the atomic magnets and bends their paths back toward the axis, as shown in the trajectories drawn in the lower part of the figure. The third magnet is just like the first, and brings the three beams back together again, so that leaves the exit hole along the axis.”

Now, we can use this apparatus as a filter by inserting blocking masks, as illustrated below.

But let’s get back to the lesson. What about the second ‘Law’ of quantum math? Well… You need to be able to imagine all kinds of situations now. The rather simple set-up below is one of them: we’ve got two of these apparatuses in series now, S and T, with T tilted at the angle α with respect to the first.

I know: you’re getting impatient. What about it? Well… We’re finally ready now. Let’s suppose we’ve got three apparatuses in series, with the first and the last one having the very same orientation, and the one in the middle being tilted. We’ll denote them by S, T and S’ respectively. We’ll also use masks: we’ll block the 0 and − state in the S-filter, like in that illustration above. In addition, we’ll block the + and − state in the T apparatus and, finally, the 0 and − state in the S’ apparatus. Now try to imagine what happens: how many particles will get through?

[…]

Just try to think about it. Make some drawing or something. Please!

[…]

OK… The answer is shown below. Despite the filtering in S, the +S particles that come out do have an amplitude to go through the 0T-filter, and so the number of atoms that come out will be some fraction (α) of the number of atoms (N) that came out of the +S-filter. Likewise, some other fraction (β) will make it through the +S’-filter, so we end up with βαN particles.

Now, I am sure that, if you’d tried to guess the answer yourself, you’d have said zero rather than βαN but, thinking about it, it makes sense: it’s not because we’ve got some angular momentum in one direction that we have none in the other. When everything is said and done, we’re talking components of the total angular momentum here, aren’t we? Well… Yes and no. Let’s remove the masks from T. What do we get?

[…]

Come on: what’s your guess? N?

[…] You’re right. It’s N. Perfect. It’s what’s shown below.

Now, that should boost your confidence. Let’s try the next scenario. We block the 0 and − state in the S-filter once again, and the + and − state in the T apparatus, so the first two apparatuses are the same as in our first example. But let’s change the S’ apparatus: let’s close the + and − state there now. Now try to imagine what happens: how many particles will get through?

[…]

Come on! You think it’s a trap, don’t you? It’s not. It’s perfectly similar: we’ve got some other fraction here, which we’ll write as γαN, as shown below.

Next scenario: S has the 0 and − gate closed once more, and T is fully open, so it has no masks. But, this time, we set S’ so it filters the 0-state with respect to it. What do we get? Come on! Think! Please!

[…]

The answer is zero, as shown below.

Does that make sense to you? Yes? Great! Because many think it’s weird: they think the T apparatus must ‘re-orient’ the angular momentum of the particles. It doesn’t: if the filter is wide open, then “no information is lost”, as Feynman puts it. Still… Have a look at it. It looks like we’re opening ‘more channels’ in the last example: the S and S’ filter are the same, indeed, and T is fully open, while it selected for 0-state particles before. But no particles come through now, while with the 0-channel, we had γαN.
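These four scenarios can be played out numerically. The sketch below is not Feynman’s calculation: it assumes the standard spin-one rotation matrix for a tilt about the y-axis (in one common sign convention) as the table of 〈 jT | iS 〉 amplitudes, and an arbitrary tilt of 40 degrees. The helper names are hypothetical. A masked T filter keeps one term of the sum over T states only, while a wide-open T filter keeps all three.

```python
import math

STATES = ("+", "0", "-")

def rotation_amplitudes(tilt):
    """Amplitudes <jT|iS> for spin one: rows are T states, columns are S states.

    Assumption: standard spin-one rotation matrix for a tilt (in radians)
    about the y-axis; the text above does not give these values.
    """
    c, s = math.cos(tilt), math.sin(tilt)
    r = s / math.sqrt(2)
    return [
        [(1 + c) / 2,  r, (1 - c) / 2],
        [-r,           c,  r],
        [(1 - c) / 2, -r, (1 + c) / 2],
    ]

def amp(U, j, i):
    """<jT|iS>."""
    return U[STATES.index(j)][STATES.index(i)]

def through(U, out_state, in_state, open_channels):
    """Amplitude to enter T in S-state `in_state` and leave in S'-state
    `out_state`, summing <out S|jT><jT|in S> over the open T channels."""
    return sum(
        complex(amp(U, j, out_state)).conjugate() * amp(U, j, in_state)
        for j in open_channels
    )

U = rotation_amplitudes(math.radians(40))            # arbitrary tilt

p_masked_plus = abs(through(U, "+", "+", ["0"])) ** 2   # +S, 0T only, +S'
p_open_plus = abs(through(U, "+", "+", STATES)) ** 2    # +S, T wide open, +S'
p_masked_zero = abs(through(U, "0", "+", ["0"])) ** 2   # +S, 0T only, 0S'
p_open_zero = abs(through(U, "0", "+", STATES)) ** 2    # +S, T wide open, 0S'

print(p_masked_plus, p_open_plus, p_masked_zero, p_open_zero)
```

With the 0T channel only, some fraction gets through in both cases. With T wide open, everything comes out as +S and nothing as 0S: opening more channels lets the three amplitudes interfere, and for the 0S’ output they cancel.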

Hmm… It actually is kinda weird, won’t you agree? Sorry I had to talk about this, but it will make you appreciate that second ‘Law’ now: we can always insert a ‘wide-open’ filter and, hence, split the beams into a complete set of base states − with respect to the filter, that is − and bring them back together provided our filter does not produce any unequal disturbances on the three beams. In short, the passage through the wide-open filter should not result in a change of the amplitudes. Again, as Feynman puts it: the wide-open filter should really put Humpty-Dumpty back together again. If it does, we can effectively apply our ‘Law’:

$$\langle \chi | \phi \rangle = \sum_{\text{all } i} \langle \chi | i \rangle \langle i | \phi \rangle$$

For an example, I’ll refer you to my previous post. This brings me to the third and final ‘Law’.

[III] The amplitude to go from state φ to state χ is the complex conjugate of the amplitude to go from state χ to state φ:

〈 χ | φ 〉 = 〈 φ | χ 〉*

This is probably the weirdest ‘Law’ of all, even if I should say, straight from the start, we can actually derive it from the second ‘Law’, and the fact that all probabilities have to add up to one. Indeed, a probability is the absolute square of an amplitude and, as we know, the absolute square of a complex number is also equal to the product of itself and its complex conjugate:

|z|² = |z|·|z| = z·z*

[You should go through the trouble of reviewing the difference between the square and the absolute square of a complex number. Just write z as a + ib and calculate (a + ib)² = a² − b² + 2abi, as opposed to |z|² = a² + b². Also check what it means when writing z as $z = r\cdot e^{i\theta} = r\cdot(\cos\theta + i\cdot\sin\theta)$.]
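If you’d rather let the computer do that review, here is a minimal sketch using Python’s built-in complex numbers. The value z = 3 + 4i is just an example.

```python
import cmath

z = complex(3, 4)                        # a = 3, b = 4

square = z * z                           # (a + ib)^2 = a^2 - b^2 + 2abi
abs_square = (z * z.conjugate()).real    # |z|^2 = z * z* = a^2 + b^2

r, theta = cmath.polar(z)                # z = r * e^(i*theta)
rebuilt = cmath.rect(r, theta)           # r * (cos(theta) + i*sin(theta))

print(square, abs_square, r, theta)
```

The square is still a complex number, while the absolute square is a real, non-negative one, which is what a probability needs to be.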

Let’s apply the probability rule to a two-filter set-up, i.e. the situation with the S and the tilted T filter which we described above, and let’s assume we’ve got a pure beam of +S particles entering the wide-open T filter, so our particles can come out in any of the three base states with respect to T. We can then write:

|〈 +T | +S 〉|² + |〈 0T | +S 〉|² + |〈 −T | +S 〉|² = 1

⇔ 〈 +T | +S 〉〈 +T | +S 〉* + 〈 0T | +S 〉〈 0T | +S 〉* + 〈 −T | +S 〉〈 −T | +S 〉* = 1

Of course, we’ve got two other such equations if we start with a 0S or a −S state. Now, we take the 〈 χ | φ 〉 = ∑ 〈 χ | i 〉〈 i | φ 〉 ‘Law’, and substitute +S for both χ and φ, and the base states with regard to T for the i states. We get:

〈 +S | +S 〉 = 1 = 〈 +S | +T 〉〈 +T | +S 〉 + 〈 +S | 0T 〉〈 0T | +S 〉 + 〈 +S | –T 〉〈 −T | +S 〉

These equations are consistent only if:

〈 +S | +T 〉 = 〈 +T | +S 〉*,

〈 +S | 0T 〉 = 〈 0T | +S 〉*,

〈 +S | −T 〉 = 〈 −T | +S 〉*,

which is what we wanted to prove. One can then generalize to any state φ and χ. However, proving the result is one thing. Understanding it is something else. One can write down a number of strange consequences, which all point to Feynman‘s rather enigmatic comment on this ‘Law’: “If this Law were not true, probability would not be ‘conserved’, and particles would get ‘lost’.” So what does that mean? Well… You may want to think about the following, perhaps. It’s obvious that we can write:

|〈 φ | χ 〉|² = 〈 φ | χ 〉〈 φ | χ 〉* = 〈 χ | φ 〉*〈 χ | φ 〉 = |〈 χ | φ 〉|²

This says that the probability to go from the φ-state to the χ-state  is the same as the probability to go from the χ-state to the φ-state.

Now, when we’re talking base states, that’s rather obvious, because the probabilities involved are either 0 or 1. However, if we substitute +S and −T for φ and χ, or some more complicated states, then it’s a different thing. My gut instinct tells me this third ‘Law’ – which, as mentioned, can be derived from the other ‘Laws’ – reflects the principle of reversibility in spacetime, which you may also interpret as a causality principle, in the sense that, in theory at least (i.e. not thinking about entropy and/or statistical mechanics), we can reverse what’s happening: we can go back in spacetime.

In this regard, we should also remember that the complex conjugate of a complex number in polar form, i.e. a complex number written as $r\cdot e^{i\theta}$, is equal to $r\cdot e^{-i\theta}$, so the argument in the exponent gets a minus sign. Think about what this means for our $a\cdot e^{-i\theta} = a\cdot e^{-i(\omega t - \mathbf{k}\cdot\mathbf{x})} = a\cdot e^{-(i/\hbar)(E t - \mathbf{p}\cdot\mathbf{x})}$ function. Taking the complex conjugate of this function amounts to reversing the direction of t and x which, once again, evokes that idea of going back in spacetime.

I feel there’s some more fundamental principle here at work, on which I’ll try to reflect a bit more. Perhaps we can also do something with that relationship between the multiplicative inverse of a complex number and its complex conjugate, i.e. z⁻¹ = z*/|z|². I’ll check it out. As for now, however, I’ll leave you to do that, and please let me know if you’ve got any inspirational ideas on this. 🙂
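Both relationships are easy to verify numerically. A minimal sketch, with an arbitrary example value for z (modulus 2, phase 0.75 rad):

```python
import cmath

z = cmath.rect(2.0, 0.75)                     # z = 2 * e^(0.75i), arbitrary example

conjugate_phase = cmath.phase(z.conjugate())  # the conjugate flips the sign of the phase
inverse = 1 / z                               # multiplicative inverse
via_conjugate = z.conjugate() / abs(z) ** 2   # z* / |z|^2

print(conjugate_phase, inverse, via_conjugate)
```

For a number of modulus one, such as a pure phase factor, the inverse and the complex conjugate are simply the same thing.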

So… Well… Goodbye as for now. I’ll probably talk about the Hamiltonian in my next post. I think we really did a good job in laying the groundwork for the really hardcore stuff, so let’s go for that now. 🙂

Post Scriptum: On the Uncertainty Principle and other rules

After writing all of the above, I realized I should add some remarks to make this post somewhat more readable. First thing: not all of the rules are there—obviously! Most notably, I didn’t say anything about the rules for adding or multiplying amplitudes, but that’s because I wrote extensively about that already, and so I assume you’re familiar with that. [If not, see my page on the essentials.]

Second, I didn’t talk about the Uncertainty Principle. That’s because I didn’t have to. In fact, we don’t need it here. In general, all popular accounts of quantum mechanics have an excessive focus on the position and momentum of a particle, while the approach in this and my previous post is quite different. Of course, it’s Feynman’s approach to QM really. Not ‘mine’. 🙂 All of the examples and all of the theory he presents in his introductory chapters in the Third Volume of Lectures, i.e. the volume on QM, are related to things like:

• What is the amplitude for a particle to go from spin state +S to spin state −T?
• What is the amplitude for a particle to be scattered, by a crystal, or from some collision with another particle, in the θ direction?
• What is the amplitude for two identical particles to be scattered in the same direction?
• What is the amplitude for an atom to absorb or emit a photon? [See, for example, Feynman’s approach to the blackbody radiation problem.]
• What is the amplitude to go from one place to another?

In short, you read Feynman, and it’s only at the very end of his exposé, that he starts talking about the things popular books start with, such as the amplitude of a particle to be at point (x, t) in spacetime, or the Schrödinger equation, which describes the orbital of an electron in an atom. That’s where the Uncertainty Principle comes in and, hence, one can really avoid it for quite a while. In fact, one should avoid it for quite a while, because it’s now become clear to me that simply presenting the Uncertainty Principle doesn’t help all that much to truly understand quantum mechanics.

Truly understanding quantum mechanics involves understanding all of these weird rules above. To some extent, that involves dissociating the idea of the wavefunction from our conventional ideas of time and position. From the questions above, it should be obvious that ‘the’ wavefunction does not actually exist: we’ve got a wavefunction for anything we can and possibly want to measure. That brings us to the question of the base states: what are they?

Feynman addresses this question in a rather verbose section of his Lectures titled: What are the base states of the world? I won’t copy it here, but I strongly recommend you have a look at it. 🙂

I’ll end here with a final equation that we’ll need frequently: the amplitude for a particle to go from one place (r1) to another (r2). It’s referred to as a propagator function, for obvious reasons – one of them being that physicists like fancy terminology! – and it looks like this:

〈r2¦r1〉 = e^((i/ħ)·p∙r12)/r12

The shape of the e^((i/ħ)·p∙r12) function is now familiar to you. Note the r12 in the argument, i.e. the vector pointing from r1 to r2. The p∙r12 dot product equals |p|∙|r12|·cosθ = p∙r12·cosθ, with θ the angle between p and r12. If p and r12 point in the same direction (θ = 0), then cosθ is equal to 1. If the angle is π/2, then it’s 0, and the function reduces to 1/r12. So the angle θ, through the cosθ factor, sort of scales the spatial frequency. Let me try to give you some idea of what this looks like by assuming p and r12 point in the same direction, so we’re looking at the space in the direction of the momentum only and |p|∙|r12|·cosθ = p∙r12. Now, we can look at the p/ħ factor as a scaling factor, and measure the distance x in units defined by that scale, so we write: x = p∙r12/ħ. The function then reduces to (ħ/p)·e^(ix)/x = (ħ/p)·cos(x)/x + i·(ħ/p)·sin(x)/x, and we just need to take the absolute square of this to get the probability. All of the graphs are drawn hereunder: I’ll let you analyze them. [Note that the graphs do not include the ħ/p factor, which you may look at as yet another scaling factor.] You’ll see – I hope! – that it all makes perfect sense: the probability quickly drops off with distance, both in the positive as well as in the negative x-direction, while it’s going to infinity when very near. [Note that the absolute square, using cos(x)/x and sin(x)/x, yields the same graph as squaring 1/x – obviously!]
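If you’d rather check the numbers than stare at graphs, here is a small Python sketch of the rescaled function. It leaves out the ħ/p factor, just like the graphs, and the function name and sample points are mine.

```python
import cmath

def propagator_shape(x):
    """e^(ix)/x: the propagator in rescaled units, without the ħ/p factor."""
    return cmath.exp(1j * x) / x

# A few arbitrary sample distances (positive and negative, in units of ħ/p).
for x in (0.5, 1.0, 2.0, 5.0, -2.0):
    amplitude = propagator_shape(x)
    probability = abs(amplitude) ** 2    # absolute square of the amplitude
    print(f"x = {x:5.1f}  Re = {amplitude.real:+.4f}  Im = {amplitude.imag:+.4f}  "
          f"|amp|^2 = {probability:.4f}  1/x^2 = {1 / x**2:.4f}")
```

The last two columns should agree for every x: the cosine and sine parts wiggle, but the absolute square only remembers the 1/x envelope.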

# Light and matter

In my previous post, I discussed the de Broglie wave of a photon. It’s usually referred to as ‘the’ wave function (or the psi function) but, as I explained, for every psi – i.e. the position-space wave function Ψ(x, t) – there is also a phi – i.e. the momentum-space wave function Φ(p, t).

In that post, I also compared it – without much formalism – to the de Broglie wave of ‘matter particles’. Indeed, in physics, we look at ‘stuff’ as being made of particles and, while the taxonomy of the particle zoo of the Standard Model of physics is rather complicated, one ‘taxonomic’ principle stands out: particles are either matter particles (known as fermions) or force carriers (known as bosons). It’s a strict separation: either/or. No split personalities.

A quick overview before we start…

Wikipedia’s overview of particles in the Standard Model (including the latest addition: the Higgs boson) illustrates this fundamental dichotomy in nature: we have the matter particles (quarks and leptons) on one side, and the bosons (i.e. the force carriers) on the other side.

Don’t be put off by my remark on the particle zoo: it’s a term coined in the 1960s, when the situation was quite confusing indeed (like more than 400 ‘particles’). However, the picture is quite orderly now. In fact, the Standard Model put an end to the discovery of ‘new’ particles, and it’s been stable since the 1970s, as experiments confirmed the reality of quarks. Indeed, all resistance to Gell-Mann’s quarks and his flavor and color concepts (which are just words to describe new types of ‘charge’, similar to electric charge but with more variety) ended when experiments at the Stanford Linear Accelerator Center (SLAC) in November 1974 confirmed the existence of the (second-generation and, hence, heavy and unstable) ‘charm’ quark (again, the names suggest some frivolity but it’s serious physical research).

As for the Higgs boson, its existence had also been predicted, since 1964 to be precise, but it took almost fifty years to confirm it experimentally because only something like the Large Hadron Collider could produce the required energy to find it in these particle smashing experiments – a rather crude way of analyzing matter, you may think, but so be it. [In case you harbor doubts on the Higgs particle, please note that, while CERN is the first to admit further confirmation is needed, the Nobel Prize Committee apparently found the evidence ‘evidence enough’ to finally award Higgs and others a Nobel Prize for their ‘discovery’ fifty years ago – and, as you know, the Nobel Prize committee members are usually rather conservative in their judgment. So you would have to come up with a rather complex conspiracy theory to deny its existence.]

Also note that the particle zoo is actually less complicated than it looks at first sight: the (composite) particles that are stable in our world – this world – consist of three quarks only: a proton consists of two up quarks and one down quark and, hence, is written as uud, and a neutron is two down quarks and one up quark: udd. Hence, for all practical purposes (i.e. for our discussion of how light interacts with matter), only the so-called first generation of matter particles – so that’s the first column in the overview above – is relevant.

All the particles in the second and third column are unstable. That being said, they survive long enough – a muon disintegrates after 2.2 millionths of a second (on average) – to deserve the ‘particle’ title, as opposed to a ‘resonance’, whose lifetime can be as short as a billionth of a trillionth of a second – but we’ve gone through these numbers before and so I won’t repeat that here. Why do we need them? Well… We don’t, but they are a by-product of our world view (i.e. the Standard Model) and, for some reason, we find everything that this Standard Model says should exist, even if most of the stuff (all second- and third-generation matter particles, and all these resonances) vanishes rather quickly – but so that also seems to be consistent with the model. [As for a possible fourth (or higher) generation, Feynman didn’t exclude it when he wrote his 1985 Lectures on quantum electrodynamics, but, checking on Wikipedia, I find the following: “According to the results of the statistical analysis by researchers from CERN and the Humboldt University of Berlin, the existence of further fermions can be excluded with a probability of 99.99999% (5.3 sigma).” If you want to know why… Well… Read the rest of the Wikipedia article. It’s got to do with the Higgs particle.]

As for the (first-generation) neutrino in the table – the only one which you may not be familiar with – these are very spooky things but – I don’t want to scare you – relatively high-energy neutrinos are going through your body and mine, right here and now, at a rate of some hundred trillion per second. They are produced by stars (stars are huge nuclear fusion reactors, remember?), and also as a by-product of these high-energy collisions in particle accelerators of course. But they are very hard to detect: the first trace of their existence was found in 1956 only – 26 years after their existence had been postulated: the fact that Wolfgang Pauli proposed their existence in 1930 to explain how beta decay could conserve energy, momentum and spin (angular momentum) demonstrates not only the genius but also the confidence of these early theoretical quantum physicists. Most neutrinos passing through Earth are produced by our Sun. Now they are being analyzed more routinely. The largest neutrino detector on Earth is called IceCube. It sits at the South Pole – or under it, as it’s suspended under the Antarctic ice, and it regularly captures high-energy neutrinos in the range of 1 to 10 TeV.

Let me – to conclude this introduction – just quickly list and explain the bosons (i.e. the force carriers) in the table above:

1. Of all of the bosons, the photon (i.e. the topic of this post) is the most straightforward: there is only one type of photon, even if it comes in different possible states of polarization.

[…]

I should probably do a quick note on polarization here – even if all of the stuff that follows will make abstraction of it. Indeed, the discussion on photons that follows (largely adapted from Feynman’s 1985 Lectures on Quantum Electrodynamics) assumes that there is no such thing as polarization – because it would make everything even more complicated. The concept of polarization (linear, circular or elliptical) has a direct physical interpretation in classical mechanics (i.e. light as an electromagnetic wave). In quantum mechanics, however, polarization becomes a so-called qubit (quantum bit): leaving aside so-called virtual photons (these are short-range disturbances going between a proton and an electron in an atom – effectively mediating the electromagnetic force between them), the property of polarization comes in two basis states (0 and 1, or left and right), but these two basis states can be superposed. In ket notation: if ¦0〉 and ¦1〉 are the basis states, then any linear combination α·¦0〉 + β·¦1〉 is also a valid state provided │α│² + │β│² = 1, in line with the need to get probabilities that add up to one.

In case you wonder why I am introducing these kets, there is no reason for it, except that I will be introducing some other tools in this post – such as Feynman diagrams – and so that’s all. In order to wrap this up, I need to note that kets are used in conjunction with bras. So we have a bra-ket notation: the ket gives the starting condition, and the bra – denoted as 〈 ¦ – gives the final condition. They are combined in statements such as 〈 particle arrives at x¦particle leaves from s〉 or – in short – 〈 x¦s〉 and, while x and s would have some real-number value, 〈 x¦s〉 would denote the (complex-valued) probability amplitude associated with the event consisting of these two conditions (i.e. the starting and final condition).

But don’t worry about it. This digression is just what it is: a digression. Oh… Just make a mental note that the so-called virtual photons (the mediators that are supposed to keep the electron in touch with the proton) have four possible states of polarization – instead of two. They are related to the four directions of space (x, y and z) and time (t). 🙂
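Going back to the two ordinary polarization basis states for a second: the condition that the squared moduli of the two coefficients add up to one is easy to play with. The sketch below is just an illustration; the starting coefficients are arbitrary numbers I made up, and the function name is mine.

```python
import math

def normalize(alpha, beta):
    """Rescale two complex coefficients so that |alpha|^2 + |beta|^2 = 1."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    return alpha / norm, beta / norm

# Arbitrary, unnormalized coefficients for the basis states 0 and 1 (assumptions).
alpha, beta = normalize(1 + 1j, 2j)

p0 = abs(alpha) ** 2   # probability of finding basis state 0
p1 = abs(beta) ** 2    # probability of finding basis state 1
print(f"P(0) = {p0:.4f}, P(1) = {p1:.4f}, sum = {p0 + p1:.4f}")
```

The phases of the coefficients do not show up in these two probabilities: they only start to matter when amplitudes get added.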

2. Gluons, the exchange particles for the strong force, are more complicated: they come in eight so-called colors. In practice, one should think of these colors as different charges, but so we have more elementary charges in this case than just plus or minus one (±1) – as we have for the electric charge. So it’s just another type of qubit in quantum mechanics.

[Note that the so-called elementary ±1 values for electric charge are not really elementary: it’s –1/3 (for the down quark, and for the second- and third-generation strange and bottom quarks as well) and +2/3 (for the up quark as well as for the second- and third-generation charm and top quarks). That being said, electric charge takes two values only, and the ±1 value is easily found from a linear combination of the –1/3 and +2/3 values.]
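That ‘linear combination’ is just a sum over the quarks in the particle, as this little Python sketch shows. I use exact fractions to avoid rounding noise; the helper name is mine.

```python
from fractions import Fraction

# Electric charge of the first-generation quarks, in units of the elementary charge.
CHARGE = {"u": Fraction(2, 3), "d": Fraction(-1, 3)}

def total_charge(quarks):
    """Add up the charges of the quarks in a string like 'uud'."""
    return sum(CHARGE[q] for q in quarks)

print("proton  (uud):", total_charge("uud"))
print("neutron (udd):", total_charge("udd"))
```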

3. Z and W bosons carry the so-called weak force, aka Fermi’s interaction: they explain how one type of quark can change into another, thereby explaining phenomena such as beta decay. Beta decay explains why carbon-14 will, after a very long time (as compared to the ‘unstable’ particles mentioned above), spontaneously decay into nitrogen-14. Indeed, carbon-12 is the (very) stable isotope, while carbon-14 has a half-life of 5,730 ± 40 years ‘only’ (so one can’t really call carbon-14 ‘unstable’: perhaps ‘less stable’ will do) and, hence, measuring how much carbon-14 is left in some organic substance allows us to date it (that’s what (radio)carbon-dating is about). As for the name, a beta particle can refer to an electron or a positron, so we can have β− decay (e.g. the above-mentioned carbon-14 decay) as well as β+ decay (e.g. magnesium-23 into sodium-23). There’s also alpha and gamma decay but that involves different things.

As you can see from the table, W± and Z bosons are very heavy (157,000 and 178,000 times heavier than an electron!), and W± carry the (positive or negative) electric charge. So why don’t we see them? Well… They are so short-lived that we can only see a tiny decay width, just a very tiny little trace, so they resemble resonances in experiments. That’s also the reason why we see little or nothing of the weak force in real life: the force-carrying particles mediating this force don’t get anywhere.
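To make the carbon-dating remark above a bit more tangible, here is a minimal sketch. It assumes plain exponential decay and treats the 5,730-year figure as the half-life of carbon-14; the function name is mine, and real-life dating involves calibration steps that I am ignoring altogether.

```python
import math

HALF_LIFE_C14 = 5730.0   # years (the figure quoted in the text, taken as a half-life)

def age_from_fraction(fraction_left):
    """Estimate the age (in years) from the fraction of carbon-14 that is left."""
    if not 0.0 < fraction_left <= 1.0:
        raise ValueError("fraction_left must be in the interval (0, 1]")
    # fraction_left = (1/2)^(age / half-life)  =>  age = -half-life · log2(fraction_left)
    return -HALF_LIFE_C14 * math.log2(fraction_left)

for fraction in (0.9, 0.5, 0.25, 0.1):
    print(f"{fraction:4.0%} left -> about {age_from_fraction(fraction):7.0f} years")
```

So half of the carbon-14 gone means one half-life has passed, three quarters gone means two half-lives, and so on.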

4. Finally, as mentioned above, the existence of the Higgs particle – and, hence, of the associated Higgs field – had been predicted since 1964 already but was only (tentatively) confirmed experimentally last year. The Higgs field gives fermions, and also the W and Z bosons, mass (but not photons and gluons), and – as mentioned above – that’s why the weak force has such short range as compared to the electromagnetic and strong forces. Note, however, that the Higgs particle does not actually explain the gravitational force, so it’s not the (theoretical) graviton and there is no quantum field theory for the gravitational force as yet. Just Google it and you’ll quickly find out why: there are theoretical as well as practical (experimental) reasons for that.

The Higgs field stands out from the other force fields because it’s a scalar field (as opposed to a vector field). However, I have no idea how this so-called Higgs mechanism (i.e. the interaction with matter particles (i.e. with the quarks and leptons, but not directly with neutrinos it would seem from the diagram below), with W and Z bosons, and with itself – but not with the massless photons and gluons) actually works. But then I still have a very long way to go on this Road to Reality.

In any case… The topic of this post is to discuss light and its interaction with matter – not the weak or strong force, nor the Higgs field.

Let’s go for it.

Amplitudes, probabilities and observable properties

Being born a boson or a fermion makes a big difference. That being said, both fermions and bosons are wavicles described by a complex-valued psi function, colloquially known as the wave function. To be precise, there will be several wave functions, and the square of their modulus (sorry for the jargon) will give you the probability of some observable property having a value in some relevant range, usually denoted by Δ. [I also explained (in my post on Bose and Fermi) how the rules for combining amplitudes differ for bosons versus fermions, and how that explains why they are what they are: matter particles occupy space, while photons not only can but also like to crowd together in, for example, a powerful laser beam. I’ll come back on that.]

For all practical purposes, relevant usually means ‘small enough to be meaningful’. For example, we may want to calculate the probability of detecting an electron in some tiny spacetime interval (Δx, Δt). [Again, ‘tiny’ in this context means small enough to be relevant: if we are looking at a hydrogen atom (whose size is about a tenth of a nanometer), then Δx is likely to be a cube or a sphere with an edge or a radius of a few picometer only (a picometer is a thousandth of a nanometer, so it’s a millionth of a millionth of a meter); and, noting that the electron’s speed is approximately 2200 km per second… Well… I will let you calculate a relevant Δt. :-)]
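In case you don’t feel like doing that calculation by hand, here is a rough sketch. It simply divides Δx by the quoted electron speed, which is a classical back-of-the-envelope estimate (my assumption about what ‘relevant’ means here), not a quantum-mechanical statement.

```python
ELECTRON_SPEED = 2.2e6   # m/s, i.e. the roughly 2200 km per second quoted above
PICOMETER = 1e-12        # m

def crossing_time(delta_x_m, speed=ELECTRON_SPEED):
    """Time (in seconds) needed to cover delta_x_m at the given speed."""
    return delta_x_m / speed

for pm in (1, 5, 10):
    dt = crossing_time(pm * PICOMETER)
    print(f"delta x = {pm:2d} pm  ->  delta t of about {dt:.2e} s")
```

The order of magnitude you get for a few picometer is an attosecond (a billionth of a billionth of a second), give or take.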

If we want to do that, then we will need to square the modulus of the corresponding wave function Ψ(x, t). To be precise, we will have to do a summation of all the values │Ψ(x, t)│² over the interval and, because x and t are real (and, hence, continuous) numbers, that means doing some integral (because an integral is the continuous version of a sum).
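Here is what such an integral looks like in practice, for a toy example only. I am assuming a one-dimensional Gaussian wave packet at one frozen moment in time, with a made-up width and wave number; the actual wave function of an electron in an atom looks different, of course.

```python
import cmath
import math

SIGMA = 1.0   # width of the toy wave packet (assumption)
K = 5.0       # wave number of the toy wave packet (assumption)

def psi(x):
    """Toy wave function: a normalized Gaussian envelope times a phase factor."""
    envelope = (2 * math.pi * SIGMA**2) ** -0.25 * math.exp(-x**2 / (4 * SIGMA**2))
    return envelope * cmath.exp(1j * K * x)

def probability(a, b, steps=10_000):
    """Approximate the integral of |psi|^2 from a to b with the midpoint rule."""
    dx = (b - a) / steps
    return sum(abs(psi(a + (i + 0.5) * dx)) ** 2 for i in range(steps)) * dx

print(f"P(-1 < x < 1)   = {probability(-1.0, 1.0):.4f}")
print(f"P(-10 < x < 10) = {probability(-10.0, 10.0):.4f}")
```

Note that the phase factor drops out as soon as we take the modulus: the probability only depends on the envelope. Also, integrating over (nearly) all of space gives a probability of one, as it should.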

But that’s only one example of an observable property: position. There are others. For example, we may not be interested in the particle’s exact position but only in its momentum or energy. Well, we have another wave function for that: the momentum wave function Φ(p, t). In fact, if you looked at my previous posts, you’ll remember the two are related because position and momentum are conjugate variables: the two wave functions are Fourier transform duals of one another. A less formal way of expressing that is to refer to the uncertainty principle. But this is not the time to repeat things.

The bottom line is that all particles travel through spacetime with a backpack full of complex-valued wave functions. We don’t know who and where these particles are exactly, and so we can’t talk to them – but we can e-mail God and He’ll send us the wave function that we need to calculate some probability we are interested in because we want to check – in all kinds of experiments designed to fool them – if it matches with reality.

As mentioned above, I highlighted the main difference between bosons and fermions in my Bose and Fermi post, so I won’t repeat that here. Just note that, when it comes to working with those probability amplitudes (that’s just another word for these psi and phi functions), it makes a huge difference: fermions and bosons interact very differently. Bosons are party particles: they like to crowd and will always welcome an extra one. Fermions, on the other hand, will exclude each other: that’s why there’s something referred to as the Pauli exclusion principle in quantum mechanics. That’s why fermions make matter (matter needs space) and bosons are force carriers (they’ll just call friends to help when the load gets heavier).

Light versus matter: Quantum Electrodynamics

OK. Let’s get down to business. This post is about light, or about light-matter interaction. Indeed, in my previous post (on Light), I promised to say something about the amplitude of a photon to go from point A to B (because – as I wrote in my previous post – that’s more ‘relevant’, when it comes to explaining stuff, than the amplitude of a photon to actually be at point x at time t), and so that’s what I will do now.

In his 1985 Lectures on Quantum Electrodynamics (which are lectures for the lay audience), Feynman writes the amplitude of a photon to go from point A to B as P(A to B) – and the P stands for photon obviously, not for probability. [I am tired of repeating that you need to square the modulus of an amplitude to get a probability but – here you are – I have said it once more.] That’s in line with the other fundamental wave function in quantum electrodynamics (QED): the amplitude of an electron to go from A to B, which is written as E(A to B). [You got it: E just stands for electron, not for our electric field vector.]

I also talked about the third fundamental amplitude in my previous post: the amplitude of an electron to absorb or emit a photon. So let’s have a look at these three. As Feynman says: “Out of these three amplitudes, we can make the whole world, aside from what goes on in nuclei, and gravitation, as always!”

Well… Thank you, Mr Feynman: I’ve always wanted to understand the World (especially if you made it).

The photon-electron coupling constant j

Let’s start with the last of those three amplitudes (or wave functions): the amplitude of an electron to absorb or emit a photon. Indeed, absorbing or emitting makes no difference: we have the same complex number for both. It’s a constant – denoted by j (for junction number) – equal to –0.1 (a bit less actually but it’s good enough as an approximation in the context of this blog).

Huh? Minus 0.1? That’s not a complex number, is it? It is. Real numbers are complex numbers too: –0.1 is 0.1·e^(iπ) in polar coordinates. As Feynman puts it: it’s “a shrink to about one-tenth, and half a turn.” The ‘shrink’ is the 0.1 magnitude of this vector (or arrow), and the ‘half-turn’ is the angle of π (i.e. 180 degrees). He obviously refers to multiplying (no adding here) j with other amplitudes, e.g. P(A, C) and E(B, C) if the coupling is to happen at or near C. And, as you’ll remember, multiplying complex numbers amounts to adding their phases, and multiplying their moduli (so that’s adding the angles and multiplying lengths).
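You can see the ‘shrink and half a turn’ at work with a few lines of Python. The second arrow below is an arbitrary amplitude I invented for the occasion: it is not an actual P or E value.

```python
import cmath
import math

j = complex(-0.1, 0.0)            # Feynman's (approximate) junction number
shrink, turn = cmath.polar(j)     # modulus and phase of j
print(f"shrink = {shrink}, turn = {turn / math.pi} times pi")

# An invented arrow: length 0.5, pointing at 60 degrees (assumption).
other = cmath.rect(0.5, math.pi / 3)
product = j * other

# The rule: multiply the lengths, add the angles.
same_by_rule = cmath.rect(shrink * abs(other), turn + cmath.phase(other))
print("rule agrees with plain multiplication:", cmath.isclose(product, same_by_rule))
```

Adding the angles here gives 240 degrees, which Python reports as −120 degrees: it is the same direction, of course.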

Let’s introduce a Feynman diagram at this point – drawn by Feynman himself – which shows three possible ways of two electrons exchanging a photon. We actually have two couplings here, and so the combined amplitude will involve two j‘s. In fact, if we label the starting point of the two lines representing our electrons as 1 and 2 respectively, and their end points as 3 and 4, then the amplitude for these events will be given by:

E(1 to 5)·j·E(5 to 3)·E(2 to 6)·j·E(6 to 4)·P(5 to 6)

As for how that j factor works, please do read the caption of the illustration below: the same j describes both emission as well as absorption. It’s just that we have both an emission as well as an absorption here, so we have a j² factor here, which is a bit less than 0.1·0.1 = 0.01. At this point, it’s worth noting that it’s obvious that the amplitudes we’re talking about here – i.e. for one possible way of an exchange like the one below happening – are very tiny. They only become significant when we add many of these amplitudes, which – as explained below – is what has to happen: one has to consider all possible paths, calculate the amplitudes for them (through multiplication), and then add all these amplitudes, to then – finally – square the modulus of the combined ‘arrow’ (or amplitude) to get some probability of something actually happening. [Again, that’s the best we can do: calculate probabilities that correspond to experimentally measured occurrences. We cannot predict anything in the classical sense of the word.]
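The multiply-then-add-then-square recipe can be sketched as follows. Only the recipe is real here: the arrows for the stretches between the couplings are numbers I invented, and three ‘ways’ is obviously a far cry from all possible ways.

```python
import cmath

J = -0.1   # Feynman's approximate junction number

def path_amplitude(factors):
    """Multiply all the arrows along one way the event can happen."""
    amplitude = 1 + 0j
    for factor in factors:
        amplitude *= factor
    return amplitude

# Three invented ways; each has two couplings (J appears twice) plus made-up
# placeholder arrows for the stretches in between (assumptions).
ways = [
    [cmath.rect(0.9, 0.2), J, cmath.rect(0.8, 0.5), J, cmath.rect(0.7, 1.0)],
    [cmath.rect(0.9, 0.3), J, cmath.rect(0.8, 0.6), J, cmath.rect(0.7, 1.1)],
    [cmath.rect(0.9, 2.9), J, cmath.rect(0.8, 0.1), J, cmath.rect(0.7, 0.3)],
]

amplitudes = [path_amplitude(way) for way in ways]   # multiply along each way
total = sum(amplitudes)                              # add the resulting arrows
probability = abs(total) ** 2                        # square the modulus at the very end

for amplitude in amplitudes:
    print(f"one way: length {abs(amplitude):.5f}")
print(f"all ways together: length {abs(total):.5f}, probability {probability:.2e}")
```

The order matters: squaring each arrow first and adding the results afterwards is a different calculation, and it throws away the information in the directions of the arrows.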

A Feynman diagram is not just some sketchy drawing. For example, we have to care about scales: the distance and time units are equivalent (so distance would be measured in light-seconds or, else, time would be measured in units equivalent to the time needed for light to travel one meter). Hence, particles traveling through time (and space) – from the bottom of the graph to the top – will usually not  be traveling at an angle of more than 45 degrees (as measured from the time axis) but, from the graph above, it is clear that photons do. [Note that electrons moving through spacetime are represented by plain straight lines, while photons are represented by wavy lines. It’s just a matter of convention.]

More importantly, a Feynman diagram is a pictorial device showing what needs to be calculated and how. Indeed, with all the complexities involved, it is easy to lose track of what should be added and what should be multiplied, especially when it comes to much more complicated situations like the one described above (e.g. making sense of a scattering event). So, while the coupling constant j (aka the ‘charge’ of a particle – but it’s obviously not the electric charge) is just a number, calculating an actual E(A to B) amplitude is not easy – not only because there are many different possible routes (paths) but because (almost) anything can happen. Let’s have a closer look at it.

E(A to B)

As Feynman explains in his 1985 QED Lectures: “E(A to B) can be represented as a giant sum of a lot of different ways an electron can go from point A to B in spacetime: the electron can take a ‘one-hop flight’, going directly from point A to B; it could take a ‘two-hop flight’, stopping at an intermediate point C; it could take a ‘three-hop flight’ stopping at points D and E, and so on.”

Fortunately, the calculation re-uses known values: the amplitude for each ‘hop’ – from F to G, for example – is P(F to G) – so that’s the amplitude of a photon (!) to go from F to G – even if we are talking about an electron here. But there’s a difference: we also have to multiply the amplitudes for each ‘hop’ with the amplitude for each ‘stop’, and that’s represented by another number – not j but n². So we have an infinite series of terms for E(A to B): P(A to B) + P(A to C)·n²·P(C to B) + P(A to D)·n²·P(D to E)·n²·P(E to B) + … for all possible intermediate points C, D, E, and so on, as per the illustration below.
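To get a feel for the structure of that series (and only for its structure!), here is a toy version in Python. Everything in it is invented: three intermediate points on a line instead of all points in spacetime, a made-up stand-in for the one-hop amplitude P, and a made-up value for n². It is a sketch of the bookkeeping, not of the physics.

```python
import cmath
from functools import lru_cache

# Invented toy world: five labelled points on a line (assumption).
POSITION = {"A": 0.0, "C": 1.0, "D": 2.0, "E": 3.0, "B": 4.0}
INTERMEDIATE = ("C", "D", "E")
N_SQUARED = 0.3   # invented value for the 'stop' factor (assumption)

def P(a, b):
    """Invented stand-in for the one-hop amplitude, NOT the real propagator."""
    d = abs(POSITION[a] - POSITION[b]) + 1.0
    return cmath.exp(1j * d) / (2.0 * d)

@lru_cache(maxsize=None)
def hops(a, b, stops):
    """Sum over all flights from a to b with exactly `stops` intermediate stops."""
    if stops == 0:
        return P(a, b)
    return sum(P(a, c) * N_SQUARED * hops(c, b, stops - 1) for c in INTERMEDIATE)

def E(a, b, max_stops):
    """Truncated series: one-hop + two-hop + ... flights with up to max_stops stops."""
    return sum(hops(a, b, k) for k in range(max_stops + 1))

for max_stops in (0, 1, 2, 5, 30):
    print(f"up to {max_stops:2d} stops: E = {E('A', 'B', max_stops):.6f}")
```

With these invented numbers the series settles down quickly, but that is entirely due to my choice of a small n² and small hop amplitudes: pick larger values and the partial sums will not settle at all.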

You’ll immediately ask: what’s the value of n? It’s quite important to know it, because we want to know how big these n², n⁴, etcetera terms are. I’ll be honest: I have not come to terms with that yet. According to Feynman (QED, p. 125), it is the ‘rest mass’ of an ‘ideal’ electron: an ‘ideal’ electron is an electron that doesn’t know Feynman’s amplitude theory and just goes from point to point in spacetime using only the direct path. 🙂 Hence, it’s not a probability amplitude like j: a proper probability amplitude will always have a modulus less than 1, and so when we see exponential terms like j², j⁴,… we know we should not be all that worried – because these sort of vanish (go to zero) for sufficiently large exponents. For E(A to B), we do not have such vanishing terms. I will not dwell on this right here, but I promise to discuss it in the Post Scriptum of this post. The frightening possibility is that n might be a number larger than one.

[As we’re freewheeling a bit anyway here, just a quick note on conventions: I should not be writing j in bold-face, because it’s a (complex- or real-valued) number and symbols representing numbers are usually not written in bold-face: vectors are written in bold-face. So, while you can look at a complex number as a vector, well… It’s just one of these inconsistencies I guess. The problem with using bold-face letters to represent complex numbers (like amplitudes) is that they suggest that the ‘dot’ in a product (e.g. j·j) is an actual dot product (aka a scalar product or an inner product) of two vectors. That’s not the case. We’re multiplying complex numbers here, and so we’re just using the standard definition of a product of complex numbers. This subtlety probably explains why Feynman prefers to write the above product as P(A to B) + P(A to C)*n²*P(C to B) + P(A to D)*n²*P(D to E)*n²*P(E to B) + … But then I find that using that asterisk to represent multiplication is a bit funny (although it’s a pretty common thing in complex math) and so I am not using it. Just be aware that a dot in a product may not always mean the same type of multiplication: multiplying complex numbers and multiplying vectors is not the same. […] And I won’t write j in bold-face anymore.]

P(A to B)

Regardless of the value for n, it’s obvious we need a functional form for P(A to B), because that’s the other thing (other than n) that we need to calculate E(A to B). So what’s the amplitude of a photon to go from point A to B?

Well… The function describing P(A to B) is obviously some wave function – so that’s a complex-valued function of x and t. It’s referred to as a (Feynman) propagator: a propagator function gives the probability amplitude for a particle to travel from one place to another in a given time, or to travel with a certain energy and momentum. [So our function for E(A to B) will be a propagator as well.] You can check out the details on it on Wikipedia. Indeed, I could insert the formula here, but believe me if I say it would only confuse you. The points to note are:

1. The propagator is also derived from the wave equation describing the system, so that’s some kind of differential equation which incorporates the relevant rules and constraints that apply to the system. For electrons, that’s the Schrödinger equation I presented in my previous post. For photons… Well… As I mentioned in my previous post, there is ‘something similar’ for photons – there must be – but I have not seen anything that’s equally ‘simple’ as the Schrödinger equation for photons. [I have Googled a bit but it’s obvious we’re talking pretty advanced quantum mechanics here – so it’s not the QM-101 course that I am currently trying to make sense of.]
2. The most important thing (in this context at least) is that the key variable in this propagator (i.e. the Feynman propagator for the photon) is I: that spacetime interval which I mentioned in my previous post already:

I = Δr² – Δt² = (z2 – z1)² + (y2 – y1)² + (x2 – x1)² – (t2 – t1)²

In this equation, we need to measure the time and spatial distance between two points in spacetime in equivalent units (these ‘points’ are usually referred to as four-vectors), so we’d use light-seconds for the unit of distance or, for the unit of time, the time it takes for light to travel one meter. [If we don’t want to transform time or distance scales, then we have to write I as I = Δr² – c²Δt².] Now, there are three types of intervals:

1. For time-like intervals, we have a negative value for I, so Δt² > Δr². For two events separated by a time-like interval, enough time passes between them so there could be a cause–effect relationship between the two events. In a Feynman diagram, the angle between the time axis and the line between the two events will be less than 45 degrees from the vertical axis. The traveling electrons in the Feynman diagrams above are an example.
2. For space-like intervals, we have a positive value for I, so Δt² < Δr². Events separated by space-like intervals cannot possibly be causally connected. The photons traveling between point 5 and 6 in the first Feynman diagram are an example, but then photons do have amplitudes to travel faster than light.
3. Finally, for light-like intervals, I = 0, or Δt2 = Δr2. The points connected by the 45-degree lines in the illustration below (which Feynman uses to introduce his Feynman diagrams) are an example of points connected by light-like intervals.

[Note that we are using the so-called space-like convention (+++–) here for I. There’s also a time-like convention, i.e. with +––– as signs: I = Δt² – Δr², so just check when you would consult other sources on this (which I recommend) and if you’d feel I am not getting the signs right.]
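The three cases are easy to turn into a few lines of code. The sketch below uses the space-like (+++–) convention, assumes that time and distance are measured in equivalent units (so c = 1), and writes an event as (x, y, z, t). The function names and the sample events are mine.

```python
def interval(event_a, event_b):
    """I = Δr² − Δt² for two events (x, y, z, t) in equivalent units (c = 1)."""
    (x1, y1, z1, t1), (x2, y2, z2, t2) = event_a, event_b
    return (x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2 - (t2 - t1) ** 2

def classify(event_a, event_b, tol=1e-12):
    """Label the interval between two events as time-, space- or light-like."""
    i = interval(event_a, event_b)
    if abs(i) <= tol:
        return "light-like"
    return "time-like" if i < 0 else "space-like"

origin = (0.0, 0.0, 0.0, 0.0)
print(classify(origin, (1.0, 0.0, 0.0, 2.0)))   # more time than distance
print(classify(origin, (2.0, 0.0, 0.0, 1.0)))   # more distance than time
print(classify(origin, (3.0, 4.0, 0.0, 5.0)))   # distance equals time
```

If you prefer the time-like (+–––) convention, all you need to do is flip the sign of I, and swap the two inequality cases accordingly.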

Now, what’s the relevance of this? To calculate P(A to B), we have to add the amplitudes for all possible paths that the photon can take, and not in space, but in spacetime. So we should add all these vectors (or ‘arrows’ as Feynman calls them) – an infinite number of them really. By now, you know it amounts to adding complex numbers, and that infinite sums are done by doing integrals, but let’s take a step back: how are vectors added?

Well…That’s easy, you’ll say… It’s the parallelogram rule… Well… Yes. And no. Let me take a step back here to show how adding a whole range of similar amplitudes works.

The illustration below shows a bunch of photons – real or imagined – from a source above a water surface (the sun for example), all taking different paths to arrive at a detector under the water (let’s say some fish looking at the sky from under the water). In this case, we make abstraction of all the photons leaving at different times and so we only look at a bunch that’s leaving at the same point in time. In other words, their stopwatches will be synchronized (i.e. there is no phase shift term in the phase of their wave function) – let’s say at 12 o’clock when they leave the source. [If you think this simplification is not acceptable, well… Think again.]

When these stopwatches hit the retina of our poor fish’s eye (I feel we should put a detector there, instead of a fish), they will stop, and the hand of each stopwatch represents an amplitude: it has a modulus (its length) – which is assumed to be the same because all paths are equally likely (this is one of the first principles of QED) – but their direction is very different. However, by now we are quite familiar with these operations: we add all the ‘arrows’ indeed (or vectors or amplitudes or complex numbers or whatever you want to call them) and get one big final arrow, shown at the bottom – just above the caption. Look at it very carefully.

If you look at the so-called contribution made by each of the individual arrows, you can see that it’s the arrows associated with the path of least time and the paths immediately left and right of it that make the biggest contribution to the final arrow. Why? Because these stopwatches arrive around the same time and, hence, their hands point more or less in the same direction. It doesn’t matter what direction – as long as it’s more or less the same.
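You can check this numerically with a toy model. The Python sketch below is not Feynman’s calculation: the geometry, the rotation speed of the stopwatch hand (OMEGA) and the grid of crossing points are all numbers I made up, and only the refractive index of water (about 1.33) is a real-world value. Each path is a straight line from the source to a point x on the water surface, and then another straight line to the fish. Each arrow has length one, and its direction is set by the travel time.

```python
import cmath
import math

# All numbers below are illustrative assumptions, not values from the post.
N_WATER = 1.33   # refractive index of water: light is slower by this factor
H_SOURCE = 1.0   # height of the source above the surface
H_FISH = 1.0     # depth of the detector below the surface
D = 1.0          # horizontal offset between source and detector
OMEGA = 200.0    # rotation speed of the stopwatch hand (radians per unit time)


def travel_time(x):
    """Travel time for the path crossing the surface at x (c = 1 in air)."""
    in_air = math.hypot(x, H_SOURCE)
    in_water = math.hypot(D - x, H_FISH)
    return in_air + N_WATER * in_water


def arrow(x):
    """Unit-length arrow whose direction is set by the travel time."""
    return cmath.exp(1j * OMEGA * travel_time(x))


# Crossing points from x = -2 to x = 3 in steps of 0.001.
xs = [-2.0 + 0.001 * k for k in range(5001)]
arrows = [arrow(x) for x in xs]

# Path of least time, found by brute force on the grid.
x_least_time = min(xs, key=travel_time)


def coherence(center, half_width=0.05):
    """Length of the summed arrows in a window, divided by their number.

    A value near 1 means the arrows point in nearly the same direction;
    a value near 0 means they largely cancel each other out.
    """
    window = [a for x, a in zip(xs, arrows) if abs(x - center) <= half_width]
    return abs(sum(window)) / len(window)


total = sum(arrows)
print(f"least-time crossing point: x = {x_least_time:.3f}")
print(f"coherence near least time: {coherence(x_least_time):.3f}")
print(f"coherence far away (x = 2): {coherence(2.0):.3f}")
print(f"final arrow length: {abs(total):.1f} out of {len(arrows)} arrows")
```

The arrows in a small window around the least-time path line up almost perfectly, while an equally wide window far from it mostly cancels out. That is also why the final arrow is so much shorter than the total number of arrows we added.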

[As for the calculation of the path of least time, that has to do with the fact that light is slowed down in water. Feynman shows why in his 1985 Lectures on QED, but I cannot possibly copy the whole book here ! The principle is illustrated below.]

So, where are we? These digressions go on and on, don’t they? Let’s go back to the main story: we want to calculate P(A to B), remember?

As mentioned above, one of the first principles in QED is that all paths – in spacetime – are equally likely. So we need to add amplitudes for every possible path in spacetime using that Feynman propagator function. You can imagine that will be some kind of integral which you’ll never want to solve. Fortunately, Feynman’s disciples have done that for you already. The result is quite predictable: light has a tendency to travel in straight lines and at the speed of light.

WHAT!? Did Feynman get a Nobel prize for trivial stuff like that?

Yes. The math involved in adding amplitudes over all possible paths, not only in space but also in time, uses the so-called path integral formulation of quantum mechanics. That’s got Feynman’s signature on it, and it’s the main reason why he got this award – together with Julian Schwinger and Sin-Itiro Tomonaga, who are both much less well known than Feynman, but they shared the burden. Don’t complain about it. Just take a look at the ‘mechanics’ of it.

We already mentioned that the propagator has the spacetime interval I in its denominator. Now, the way it works is that, for values of I equal or close to zero, so the paths that are associated with light-like intervals, our propagator function will yield large contributions in the ‘same’ direction (wherever that direction is), but for the spacetime intervals that are very much time- or space-like, the magnitude of our amplitude will be smaller and – worse – our arrow will point in the ‘wrong’ direction. In short, the arrows associated with the time- and space-like intervals don’t add up to much, especially over longer distances. [When distances are short, there are (relatively) few arrows to add, and so the probability distribution will be flatter: in short, the likelihood of having the actual photon travel faster or slower than the speed of light is higher.]

Conclusion

Does this make sense? I am not sure, but I did what I promised to do. I told you how P(A to B) gets calculated; and from the formula for E(A to B), it is obvious that we can then also calculate E(A to B) provided we have a value for n. However, that value n is determined experimentally, just like the value of j, in order to ensure this amplitude theory yields probabilities that match the probabilities we observe in all kinds of crazy experiments that try to prove or disprove the theory; and then we can use these three amplitude formulas “to make the whole world”, as Feynman calls it, except the stuff that goes on inside of nuclei (because that’s the domain of the weak and strong nuclear force) and gravitation, for which we have a law (Newton’s Law) but no real ‘explanation’. [Now, you may wonder if this QED explanation of light is really all that good, but Mr Feynman thinks it is, and so I have no reason to doubt that – especially because there’s surely not anything more convincing lying around as far as I know.]

So what remains to be told? Lots of things, even within the realm of expertise of quantum electrodynamics. Indeed, Feynman applies the basics as described above to a number of real-life phenomena – quite interesting, all of it ! – but, once again, it’s not my goal to copy all of his Lectures here. [I am only hoping to offer some good summaries of key points in some attempt to convince myself that I am getting some of it at least.] And then there is the strong force, and the weak force, and the Higgs field, and so on and so on. But that’s all very strange and new territory which I haven’t even started to explore. I’ll keep you posted as I am making my way towards it.

Post scriptum: On the values of j and n

In this post, I promised I would write something about how we can find j and n, but I won’t, because I realize it would just amount to copying three or four pages out of that book I mentioned above, and which inspired most of this post. Let me just say something more about that remarkable book, and then quote a few lines on what the author of that book – the great Mr Feynman ! – thinks of the math behind calculating these two constants (the coupling constant j, and the ‘rest mass’ of an ‘ideal’ electron). Now, before I do that, I should repeat that he actually invented that math (it makes use of a mathematical approximation method called perturbation theory) and that he got a Nobel Prize for it.

First, about the book. Feynman’s 1985 Lectures on Quantum Electrodynamics are not like his 1965 Lectures on Physics. The Lectures on Physics are proper courses for undergraduate and even graduate students in physics. This little 1985 book on QED is just a series of four lectures for a lay audience, conceived in honor of Alix G. Mautner. She was a friend of Mr Feynman’s who died a few years before he gave and wrote these ‘lectures’ on QED. She had a degree in English literature and would ask Mr Feynman regularly to explain quantum mechanics and quantum electrodynamics in a way she would understand. While they had known each other for about 22 years, he had apparently never taken enough time to do so, as he writes in his Introduction to these Alix G. Mautner Memorial Lectures: “So here are the lectures I really [should have] prepared for Alix, but unfortunately I can’t tell them to her directly, now.”

The great Richard Phillips Feynman himself died only three years later, in February 1988 – not of one but two rare forms of cancer. He was only 69 years old when he died. I don’t know if he was aware of the cancer(s) that would kill him, but I find his fourth and last lecture in the book, Loose Ends, just fascinating. Here we have a brilliant mind deprecating the math that earned him a Nobel Prize and without which the Standard Model would be unintelligible. I won’t try to paraphrase him. Let me just quote him. [If you want to check the quotes, the relevant pages are pages 125 to 131.]

[The math behind calculating these constants] is a “dippy process” and “having to resort to such hocus-pocus has prevented us from proving that the theory of quantum electrodynamics is mathematically self-consistent”. He adds: “It’s surprising that the theory still hasn’t been proved self-consistent one way or the other by now; I suspect that renormalization [“the shell game that we play to find n and j” as he calls it] is not mathematically legitimate.” […] Now, Mr Feynman writes this about quantum electrodynamics, not about “the rest of physics” (and so that’s quantum chromodynamics (QCD) – the theory of the strong interactions – and quantum flavordynamics (QFD) – the theory of weak interactions) which, he adds, “has not been checked anywhere near as well as electrodynamics.”

That’s a pretty damning statement, isn’t it? In one of my other posts (see: The End of the Road to Reality?), I explore these comments a bit. However, I have to admit I feel I really need to get back to math in order to appreciate these remarks. I’ve written way too much about physics anyway now (as opposed to my first dozen posts – which were much more math-oriented). So I’ll just have a look at some more stuff indeed (such as perturbation theory), and then I’ll get back to blogging. Indeed, I’ve written like 20 posts or so in a few months only – so I guess I should shut up for a while now !

In the meanwhile, you’re more than welcome to comment of course !