The weird force

In my previous post (Loose Ends), I mentioned the weak force as the weird force. Indeed, unlike photons or gluons (i.e. the presumed carriers of the electromagnetic and strong force respectively), the weak force carriers (W bosons) have (1) mass and (2) electric charge:

  1. W bosons are very massive. The equivalent mass of a W+ and W– boson is some 86.3 atomic mass units (amu): that’s about the same as a rubidium or strontium atom. The mass of a Z boson is even larger: roughly equivalent to the mass of a molybdenum atom (98 amu). That is extremely heavy: just compare with iron or silver, which have a mass of about 56 amu and 108 amu respectively. Because they are so massive, W bosons cannot travel very far before disintegrating (they actually go (almost) nowhere), which explains why the weak force has such an extremely short range. That’s yet another fundamental difference as compared to the other fundamental forces. [See the little sanity check right after this list.]
  2. The electric charge of W and Z bosons explains why we have a trio of weak force carriers rather than just one: W+, W– and Z0. Feynman calls them “the three W’s”.
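Here’s that sanity check – a back-of-the-envelope calculation of my own, using the standard conversion 1 amu = 931.5 MeV/c2, the measured masses (about 80.4 GeV/c2 for the W and 91.2 GeV/c2 for the Z), and a lifetime of the order of 3×10−25 s (more about that lifetime below):

    GEV_PER_AMU = 0.9315     # 1 amu = 931.5 MeV/c^2
    C = 3.0e8                # speed of light (m/s)

    print(80.4 / GEV_PER_AMU)   # ~86.3 amu: the W boson, about a rubidium/strontium atom
    print(91.2 / GEV_PER_AMU)   # ~97.9 amu: the Z boson, about a molybdenum atom

    # Even at (nearly) the speed of light, a particle living some 3e-25 s covers:
    print(C * 3.0e-25)          # ~1e-16 m: about a tenth of a proton radius, i.e. (almost) nowhere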

The electric charge of W and Z bosons is what it is: an electric charge – just like the charge of protons and electrons. Hence, one has to distinguish it from the weak charge as such: the weak charge (or, to be correct, I should say the weak isospin number) of a particle (such as a proton or a neutron for example) is related to the propensity of that particle to interact through the weak force — just like the electric charge is related to the propensity of a particle to interact through the electromagnetic force (think about Coulomb’s law for example: likes repel and opposites attract), and just like the so-called color charge is related to the propensity of quarks (and gluons) to interact with each other through the strong force.

In short, as compared to the electromagnetic force and the strong force, the weak force (or Fermi’s interaction as it’s often called) is indeed the odd one out: these W bosons seem to mix just about everything: mass, charge and whatever else. In his 1985 Lectures on Quantum Electrodynamics, Feynman writes the following about this:

“The observed coupling constant for W’s is much the same as that for the photon. Therefore, the possibility exists that the three W’s and the photon are all different aspects of the same thing. Stephen Weinberg and Abdus Salam tried to combine quantum electrodynamics with what’s called ‘the weak interactions’ into one quantum theory, and they did it. But if you look at the results they get, you can see the glue—so to speak. It’s very clear that the photon and the three W’s are interconnected somehow, but at the present level of understanding, the connection is difficult to see clearly—you can still see the ‘seams’ in the theories; they have not yet been smoothed out so that the connection becomes more beautiful and, therefore, probably more correct.” (Feynman, 1985, p. 142)

Well… That says it all, I think. And from what I can see, the (tentative) confirmation of the existence of the Higgs field has not made these ‘seams’ any less visible. However, before criticizing eminent scientists such as Weinberg and Salam, we should obviously first have a closer look at those W bosons without any prejudice.

Alpha decay, potential wells and quantum tunneling

The weak force is usually explained as the force behind a process referred to as beta decay. However, because beta decay is just one form of radioactive decay, I need to say something about alpha decay too. [There is also gamma decay but that’s like a by-product of alpha and beta decay: when a nucleus emits an α or β particle (i.e. when we have alpha or beta decay), the nucleus will usually be left in an excited state, and so it can then move to a lower energy state by emitting a gamma ray photon (gamma radiation is very hard (i.e. very high-energy) radiation) – in the same way that an atomic electron can jump to a lower energy state by emitting a (soft) light ray photon. But so I won’t talk about gamma decay.]

Atomic decay, in general, is a loss of energy accompanying a transformation of the nucleus of the atom. Alpha decay occurs when the nucleus ejects an alpha particle: an α-particle consists of two protons and two neutrons bound together and, hence, it’s identical to a helium nucleus. Alpha particles are commonly emitted by all of the larger radioactive nuclei, such as uranium (which becomes thorium as a result of the decay process), or radium (which becomes radon gas). However, alpha decay is explained by a mechanism not involving the weak force: the electromagnetic force and the nuclear force (i.e. the strong force) will do. The reasoning is as follows: the alpha particle can be looked at as a stable but somewhat separate particle inside the nucleus. Because of their charge (both positive), the alpha particle inside of the nucleus and ‘the rest of the nucleus’ are subject to strong repulsive electromagnetic forces between them. However, these repulsive electromagnetic forces are not as strong as the (residual) strong force between them and, hence, it’s the latter that keeps them together – most of the time, that is.

Let me be fully complete here. The so-called nuclear force between composite particles such as protons and neutrons – or between clusters of protons and neutrons in this case – is actually the residual effect of the strong force. The strong force itself is between quarks – and between them only – and so that’s what binds them together in protons and neutrons (so that’s the next level of aggregation you might say). Now, the strong force is mostly neutralized within those protons and neutrons, but there is some residual force, and so that’s what keeps a nucleus together and what is referred to as the nuclear force.

There is a very helpful analogy here: the electromagnetic forces between neutral atoms (and/or molecules)—referred to as van der Waals forces (that’s what explains the liquid shape of water, among other things)— are also the residual of the (much stronger) electromagnetic forces that tie the electrons to the nucleus.

Now, that residual strong force – i.e. the nuclear force – diminishes in strength with distance but, within a certain distance, that residual force is strong enough to do what it does, and that’s to keep the nucleus together. This stable situation is usually depicted by what is referred to as a potential well:

Potential well

The name is obvious: a well is a hole in the ground from which you can get water (or oil or gas or whatever). Now, the sea level might actually be lower than the bottom of a well, but the water would still stay in the well. In the illustration above, we are not depicting water levels but energy levels, but it’s equally obvious it would require some energy to kick a particle out of this well: if it were water, we’d require a pump to get it out but, of course, it would be happy to flow to the sea once it’s out. Indeed, once a charged particle is out (I am talking about our alpha particle now), it will obviously stay out because of the repulsive electromagnetic forces coming into play (positive charges repel each other).

But so how can it escape the nuclear force and go up on the side of the well? [A potential pond or lake would have been a better term – but then that doesn’t sound quite right, does it? :-)]

Well, the energy may come from outside – that’s what’s referred to as induced radioactive decay (just Google it and you will find tons of articles on experiments involving laser-induced accelerated alpha decay) – or, and that’s much more intriguing, the Uncertainty Principle comes into play.

Huh? Yes. According to the Uncertainty Principle, the energy of our alpha particle inside of the nucleus wiggles around some mean value, but our alpha particle also has an amplitude to be at some higher energy level. That results not only in a theoretical probability for it to escape out of the well but also in something actually happening if we wait long enough: the amplitude (and, hence, the probability) is tiny, but it’s what explains the decay process – and what gives U-232 a half-life of 68.9 years, and the more common U-238 a much more comfortable half-life of 4.47 billion years.
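If you want to see what such half-lives mean in practice, the decay law is easy to sketch (a two-liner of my own, nothing more):

    def fraction_left(t_years, half_life_years):
        """Fraction of a radioactive sample surviving after t_years."""
        return 0.5 ** (t_years / half_life_years)

    # After one century:
    print(fraction_left(100, 68.9))     # ~0.37: almost two-thirds of the U-232 is gone
    print(fraction_left(100, 4.47e9))   # ~0.999999984: U-238 barely notices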

[…]

Now that we’re talking about wells and all that, we should also mention that this phenomenon of getting out of the well is referred to as quantum tunneling. You can easily see why: it’s as if the particle dug its way out. Of course, it didn’t really dig: it went straight through the sidewall without ever having the energy to climb over it – as if there were a tunnel. Think of it being stuck and trying and trying and trying – a zillion times – to escape, until it finally did. So now you understand this fancy word: quantum tunneling.
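For what it’s worth, there is a standard textbook estimate for the probability of getting through a barrier – T ≈ e–2κL, with κ = √(2m(V–E))/ħ – which shows nicely why half-lives are so incredibly sensitive to the height and width of the well’s walls. The little sketch below is my own toy calculation (the numbers are made up – this is not a real nuclear model):

    import math

    HBAR = 1.0545718e-34   # reduced Planck constant (J·s)
    MEV = 1.602177e-13     # joules per MeV

    def transmission(m_kg, barrier_mev, width_m):
        """Crude WKB-style estimate T ~ exp(-2*kappa*L) for a rectangular barrier."""
        kappa = math.sqrt(2 * m_kg * barrier_mev * MEV) / HBAR
        return math.exp(-2 * kappa * width_m)

    m_alpha = 6.64e-27     # mass of an alpha particle (kg)
    print(transmission(m_alpha, 5.0, 5e-15))   # ~6e-5
    print(transmission(m_alpha, 5.0, 7e-15))   # ~1e-6: slightly wider wall, much smaller chance

Make the wall just a bit thicker or higher and the probability collapses by orders of magnitude – which is why half-lives range from split-seconds to billions of years. However, this post is about the weak force, so let’s discuss beta decay now.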

Beta decay and intermediate vector bosons

Beta decay also involves a transmutation of nuclei, but through the emission of a β-particle rather than an α-particle. A beta particle is just a different name for an electron (β–) and/or its anti-matter counterpart: the positron (β+). [Physicists usually simplify stuff but, in this case, they obviously didn’t: why don’t they just write e– and e+ here?]

An example of β– decay is the decay of carbon-14 (C-14) into nitrogen-14 (N-14), and an example of β+ decay is the decay of magnesium-23 into sodium-23. C-14 and N-14 have the same mass number but they are different atoms. The decay process is described by the equations below:

Beta decay

You’ll remember these formulas from your high-school days: beta decay does not change the mass number (carbon and nitrogen both have mass number 14) but it does change the atomic (or proton) number: nitrogen has an extra proton. So one of the neutrons became a proton ! [The second equation shows the opposite: a proton became a neutron.] In order to do that, the carbon atom had to eject a negative charge: that’s the electron you see in the equation above.
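Just to check that the books balance – a little bookkeeping exercise of my own, nothing more – here’s the first equation in numbers (the anti-neutrino, about which more in a moment, carries neither charge nor mass number):

    # C-14 -> N-14 + e- + anti-neutrino (A = mass number, Z = charge in proton-charge units)
    A, Z = 0, 1                                # tuple indices, just for readability
    carbon, nitrogen = (14, 6), (14, 7)
    electron, antineutrino = (0, -1), (0, 0)

    assert carbon[A] == nitrogen[A] + electron[A] + antineutrino[A]   # 14 = 14 + 0 + 0
    assert carbon[Z] == nitrogen[Z] + electron[Z] + antineutrino[Z]   #  6 =  7 - 1 + 0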

In addition, there is also the ejection of an anti-neutrino (that’s what the bar above the νe symbol stands for: antimatter). You’ll wonder what an antineutrino could possibly be. Don’t worry about it: it’s not any spookier than the neutrino. Neutrinos and anti-neutrinos have no electric charge and so you cannot distinguish them on that account (electric charge). However, all antineutrinos have right-handed helicity (i.e. they come in only one of the two possible spin states), while the neutrinos are all left-handed. That’s why beta decay is said to not respect parity symmetry, aka mirror symmetry. Hence, in the case of beta decay, Nature does distinguish between the world and the mirror world ! I’ll come back to that, but let me first lighten up the discussion somewhat with a graphical illustration of that neutron-proton transformation.

2000px-Beta-minus_Decay

As for the magnesium-sodium transformation, we’d have something similar, but with a positron instead of an electron (a positron is just an electron with a positive charge for all practical purposes) and a regular neutrino instead of an anti-neutrino. So we’d just have the anti-matter counterparts of the electron and the anti-neutrino. [Don’t be put off by the term ‘anti-matter’: anti-matter is really just like regular matter – except that the charges have opposite sign. For example, the anti-matter counterpart of a blue quark is an anti-blue quark, and the anti-matter counterpart of the neutrino has right-handed helicity – or spin – as opposed to the ‘left-handed’ ‘ordinary’ neutrinos.]

Now, you surely will have several serious questions. The most obvious question is: what happens with the electron and the neutrino? Well… Those spooky neutrinos are gone before you know it, so don’t worry about them. As for the electron: the carbon had only six electrons, but the nitrogen needs seven to be electrically neutral… So you might think the new atom will take care of it. Well… No. Sorry. Because of its kinetic energy, the electron is likely to just explore the world and crash into something else, and so we’re left with a positively charged nitrogen ion indeed. So I should have added a little + sign next to the N in the formula above. Of course, one cannot exclude the possibility that this ion will pick up an electron later – but don’t bet on it: the ion might absorb some other electron, or not find any free electrons at all !

As for the positron (in a β+ decay), it will just grab the nearest electron around and auto-destruct—thereby generating two high-energy photons (so that’s a little light flash). The net result is that we do not have an ion but a neutral sodium atom. There is also a closely related process in which the nucleus directly captures one of the electrons on a shell around it (the K or L shell for example): that process is described as electron capture, and the ‘transformation equation’ can then be written as p + e– → n + νe (with p and n denoting a proton and a neutron respectively).

The more important question is: where are the W and Z bosons in this story?

Ah ! Yes! Sorry I forgot about them. The Feynman diagram below shows how it really works—and why the name of intermediate vector bosons for these three strange ‘particles’ (W+, W–, and Z0) is so apt. These W bosons are just a short trace of ‘something’ indeed: their half-life is about 3×10−25 s, so that’s the same order of magnitude (or minitude I should say) as the mean lifetime of other resonances observed in particle collisions.

Feynman diagram beta decay

Indeed, you’ll notice that, in this so-called Feynman diagram, there’s no space axis. That’s because the distances involved are so tiny that we have to distort the scale—so we are not using equivalent time and distance units here, as Feynman diagrams should. That’s in line with a more prosaic description of what may be happening: W bosons mediate the weak force by seemingly absorbing an awful lot of momentum, spin, and whatever else describes the particles involved (all of their quantum numbers really), to then eject an electron (or positron) and a neutrino (or an anti-neutrino).

Hmm… That’s not a standard description of a W boson as a force-carrying particle, you’ll say. You’re right. This is more the description of a Z boson. What’s the Z boson again? Well… I haven’t explained it yet. It’s not involved in beta decay. There’s a process called elastic scattering of neutrinos. Elastic scattering means that some momentum is exchanged but neither the target (an electron or a nucleus) nor the incident particle (the neutrino) is affected as such (so there’s no break-up of the nucleus for example). In other words, things bounce back and/or get deflected but there’s no destruction and/or creation of particles, which is what you would have with inelastic collisions. Let’s examine what happens here.

W and Z bosons in neutrino scattering experiments

It’s easy to generate neutrino beams: remember, their existence was confirmed in 1956 because nuclear reactors create a huge flux of them ! So it’s easy to send lots of high-energy neutrinos into a cloud or bubble chamber and see what happens. Cloud and bubble chambers are prehistoric devices which were built and used to detect electrically charged particles moving through them. I won’t go into too much detail but I can’t resist inserting a few historic pictures here.

The first two pictures below document the first experimental confirmation of the existence of positrons by Carl Anderson, back in 1932 (and, no, he’s not Danish but American), for which he got a Nobel Prize. The magnetic field which gives the positron some curvature—the trace of which can be seen in the image on the right—is generated by the coils around the chamber. Note the opening in the coils, which allows for taking a picture when the supersaturated vapor is suddenly being decompressed – and so the charged particle that goes through it leaves a trace of ionized atoms behind that act as ‘nucleation centers’ around which the vapor condenses, thereby forming tiny droplets. Quite incredible, isn’t it? One can only admire the perseverance of these early pioneers.

Carl Anderson Positron

The picture below is another historical first: it’s the first detection of a neutrino in a bubble chamber. It’s fun to analyze what happens here: we have a mu-meson – aka a muon – coming out of the collision (that’s just a heavier version of the electron) and then a pion – which should (also) be electrically charged because the muon carries electric charge… But I will let you figure this one out. I need to move on with the main story. 🙂

FirstNeutrinoEventAnnotated

The point to note is that these spooky neutrinos collide with other matter particles. In the image above, it’s a proton, but when you’re shooting neutrino beams through a bubble chamber, a few of these neutrinos can also knock an electron out of orbit, and so that electron will seemingly appear out of nowhere in the image and move some distance with some kinetic energy (which can all be measured, because the magnetic field around the chamber will give the electron some curvature, and so we can calculate its momentum and all that).

Of course, they will tend to move in the same direction – more or less at least – as the neutrinos that knocked them loose. So it’s like the Compton scattering which we discussed earlier (from which we could calculate the so-called classical radius of the electron – or its size if you will)—but with one key difference: the electrons get knocked loose not by photons, but by neutrinos.

But… How can they do that? Photons carry the electromagnetic field so the interaction between them and the electrons is electromagnetic too. But neutrinos? Last time I checked, they were matter particles, not bosons. And they carry no charge. So what makes them scatter electrons?

You’ll say that’s a stupid question: it’s the neutrino, dummy ! Yes, but how? Well, you’ll say, they collide—don’t they? Yes. But we are not talking tiny billiard balls here: if particles scatter, one of the fundamental forces of Nature must be involved, and usually it’s the electromagnetic force: it’s the electron density around nuclei indeed that explains why atoms push each other away when they meet and, as explained above, it’s also the electromagnetic force that explains Compton scattering. So billiard balls bounce back because of the electromagnetic force too and…

OK-OK-OK. I got it ! So here it must be the strong force or something. Well… No. Neutrinos are not made of quarks. You’ll immediately ask what they are made of – but the answer is simple: they are what they are – one of the four matter particles in the Standard Model – and so they are not made of anything else. Capito?

OK-OK-OK. I got it ! It must be gravity, no? Perhaps these neutrinos don’t really hit the electron: perhaps they skim near it and sort of drag it along as they pass? No. It’s not gravity either. It can’t be. We have no exact measurement of the mass of a neutrino but it’s damn close to zero – and, hence, way too small to exert any such influence on an electron. It’s just not consistent with those traces.

OK-OK-OK. I got it ! It’s that weak force, isn’t it? YES ! The Feynman diagrams below show the mechanism involved. As far as terminology goes (remember Feynman’s complaints about the up, down, strange, charm, beauty and truth quarks?), I think this is even worse. The interaction is described as a current, and when the neutral Z boson is involved, it’s called a neutral current – as opposed to…  Well… Charged currents. Neutral and charged currents? That sounds like sweet and sour candy, doesn’t it? But isn’t candy supposed to be sweet? Well… No. Sour candy is pretty common too. And so neutral currents are pretty common too.

neutrino_scattering

You obviously don’t believe a word of what I am saying and you’ll wonder what the difference is between these charged and neutral currents. The end result is the same in the first two pictures: an electron and a neutrino interact, and they exchange momentum. So why is one current neutral and the other charged? In fact, when you ask that question, you are actually wondering whether we need that neutral Z boson. W bosons should be enough, no?

No. The first and second picture are “the same but different”—and you know what that means in physics: it means it’s not the same. It’s different. Full stop. In the second picture, there is electron absorption (only for a very brief moment obviously, but so that’s what it is, and you don’t have that in the first diagram) and then electron emission, and there’s also neutrino absorption and emission. […] I can sense your skepticism – and I actually share it – but that’s what I understand of it !

[…] So what’s the third picture? Well… That’s actually beta decay: a neutron becomes a proton, and there’s emission of an electron and… Hey ! Wait a minute ! This is interesting: this is not what we wrote above: we have an incoming neutrino instead of an outgoing anti-neutrino here. So what’s this?

Well… I got this illustration from a blog on physics (Galileo’s Pendulum – The Flavor of Neutrinos) which, in turn, mentions Physics Today as its source. The incoming neutrino has nothing to do with the usual representation of an anti-matter particle as a particle traveling backwards in time. It’s something different, and it triggers a very interesting question: could beta decay possibly be ‘triggered’ by neutrinos? Who knows?

I googled it, and there seems to be some evidence supporting such a thesis. However, this ‘evidence’ is flimsy (the only real ‘clue’ is that the activity of the Sun, as measured by the intensity of solar flares, seems to have some (tiny) impact on the rate of decay of radioactive elements on Earth) and, hence, most ‘serious’ scientists seem to reject the possibility. I wonder why: it would make the ‘weird force’ somewhat less weird in my view. So… What to say? Well… Nothing much at this moment. Let me move on and examine the question a bit more in detail in a Post Scriptum.

The odd one out

You may wonder if neutrino-electron interactions always involve the weak force. The answer to that question is simple: Yes ! Because they do not carry any electric charge, and because they are not quarks, neutrinos are only affected by the weak force. However, as evidenced by all the stuff I wrote on beta decay, you cannot turn this statement on its head: the weak force is relevant not only for neutrinos but for electrons and quarks as well ! That gives us the following connection between forces and matter:

forces and matter

[Specialists reading this post may say they’ve not seen this diagram before. That might be true. I made it myself – for a change – but I am sure it’s around somewhere.]

It is a weird asymmetry: almost massless particles (neutrinos) interact with other particles through massive bosons, and these massive ‘things’ are supposed to be ‘bosons’, i.e. force carrying particles ! These physicists must be joking, right? These bosons can hardly carry themselves – as evidenced by the fact they peter out just like all of those other ‘resonances’ !

Hmm… Not sure what to say. It’s true that their honorific title – ‘intermediate vectors’ – seems to be quite apt: they are very intermediate indeed: they only appear as a short-lived stage in between the initial and final state of the system. Again, it leads one to think that these W bosons may just reflect some kind of energy blob caused by some neutrino – or anti-neutrino – crashing into another matter particle (a quark or an electron). Whatever it is, this weak force is surely the odd one out.

Odd one out

In my previous post, I mentioned other asymmetries as well. Let’s revisit them.

Time irreversibility

In Nature, uranium is usually found as uranium-238. Indeed, that’s the most abundant isotope of uranium: about 99.3% of all uranium is U-238. There’s also some uranium-235 out there: some 0.7%. And there are also trace amounts of U-234. And that’s it really. So where is the U-232 we introduced above when talking about alpha decay? Well… We said it has a half-life of 68.9 years only and so it’s rather normal that U-232 cannot be found in Nature. What? Yes: 68.9 years is nothing compared to the half-life of U-238 (4.47 billion years) or U-235 (704 million years), and so it’s all gone. In fact, the tiny proportion of U-235 on this Earth is what allows us to date the Earth. The math and physics involved resemble the math and physics involved in carbon-dating, but carbon-dating is used for organic materials only, because the carbon-14 that’s used also has a fairly short half-life: 5,730 years—so that’s almost a hundred times longer than U-232’s but… Well… Not like millions or billions of years. [You’ll immediately ask why this C-14 is still around if it’s got such a short half-life. The answer to that is easy: C-14 is continually being produced in the atmosphere and, hence, unlike U-232, it doesn’t just disappear.]
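The arithmetic behind carbon-dating is simple enough to sketch – my own toy version, assuming we can measure the fraction of the original C-14 that is left:

    import math

    def age_from_fraction(fraction_left, half_life_years=5730.0):
        """Invert the decay law: t = -T_half * log2(fraction of C-14 remaining)."""
        return -half_life_years * math.log2(fraction_left)

    # A bone retaining 25% of its original C-14 is two half-lives old:
    print(age_from_fraction(0.25))   # 11460.0 years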

Hmm… Interesting. Radioactive decay suggests time irreversibility. Indeed, it’s wonderful and amazing – but sad at the same time:

  1. There’s so much diversity – a truly incredible range of chemical elements making life what it is.
  2. But so all these chemical elements have been produced through a process of nuclear fusion in stars (stellar nucleosynthesis), which were then blasted into space by supernovae, and so they then coagulated into planets like ours.
  3. However, all of the heavier atoms will decay back into some lighter element because of radioactive decay, as shown in the graph below.
  4. So we are doomed !

Overview of decay modes

In fact, some of the GUT theorists think that there is no such thing as ‘stable nuclides’ (that’s the black line in the graph above): they claim that all atomic species will decay because – according to their line of reasoning – the proton itself is NOT stable.

WHAT? Yeah ! That’s what Feynman complained about too: he obviously didn’t like these GUT theorists either. Of course, there is an expensive experiment trying to prove spontaneous proton decay: the so-called Super-K under Mount Kamioka in Japan. It’s basically a huge tank of ultra-pure water with a lot of machinery around it… Just google it. It’s fascinating. If, one day, it would be able to prove that there’s proton decay, our Standard Model would be in very serious trouble – because it doesn’t cater for unstable protons. That being said, I am happy that has not happened so far – because it would mean our world would really be doomed.

What do I mean by that? We’re all doomed, aren’t we? If only because of the Second Law of Thermodynamics. Huh? Yes. That ‘law’ just expresses a universal principle: all kinetic and potential energy observable in nature will, in the end, dissipate: differences in temperature, pressure, and chemical potential will even out. Entropy increases. Time is NOT reversible: it points in the direction of increasing entropy – till all is the same once again. Sorry?

Don’t worry about it. When everything is said and done, we humans – or life in general – are an amazing negation of the Second Law of Thermodynamics: temperature, pressure, chemical potential and what have you – it’s all super-organized and super-focused in our body ! But it’s temporary indeed – and we actually don’t negate the Second Law of Thermodynamics: we create order by creating disorder. In any case, I don’t want to dwell on this point. Time reversibility in physics usually refers to something else: time reversibility would mean that all basic laws of physics (and with ‘basic’, I am excluding this higher-level Second Law of Thermodynamics) would be time-reversible: if we’d put in minus t (–t) instead of t, all formulas would still make sense, wouldn’t they? So we could – theoretically – reverse our clock and stopwatches and go back in time.

Can we do that?

Well… We can reverse a lot. For example, U-232 decays into a lot of other stuff BUT we can produce U-232 from scratch once again—from thorium to be precise. In fact, that’s how we got it in the first place: as mentioned above, any natural U-232 that might have been produced in those stellar nuclear fusion reactors is gone. But so that means that alpha decay is reversible: we’re producing (fairly) stable stuff – U-232 lasts for dozens of years – that probably existed a long time ago but decayed, and now we’re reversing the arrow of time using our nuclear science and technology.

Now, you may object that you don’t see Nature spontaneously assemble the nuclear technology we’re using to produce U-232, except if Nature would go for that Big Crunch everyone’s predicting so it can repeat the Big Bang once again (so that’s the oscillating Universe scenario)—and you’re obviously right in that assessment. That being said, from some kind of weird existential-philosophical point of view, it’s kind of nice to know that – in theory at least – there is time reversibility indeed (or T symmetry as it’s called by scientists). 

[Voice booming from the sky] STOP DREAMING ! TIME REVERSIBILITY DOESN’T EXIST !

What? That’s right. For beta decay, we don’t have T symmetry. The weak force breaks all kinds of symmetries, and time symmetry is only one of them. I talked about these in my previous post (Loose Ends) – so please have a look at that, and let me just repeat the basics:

  1. Parity (P) symmetry or mirror symmetry revolves around the notion that Nature should not distinguish between right- and left-handedness, so everything that works in our world should also work in the mirror world. Now, the weak force does not respect P symmetry: β– decay produces right-handed antineutrinos (and left-handed neutrinos) only. In the mirror world, a right-handed antineutrino becomes a left-handed one – and Nature does not seem to produce those. Full stop. Our world is different from the mirror world because the weak force knows the difference between left and right: some stuff only works with left-handed stuff (and some other stuff only works with right-handed stuff). In short, the weak force just doesn’t work the same in the mirror world.
  2. Charge conjugation or charge (C) symmetry revolves around the notion that a world in which we reverse all (electric) charge signs should work just the same. Now, the weak force also does not respect C symmetry. I’ll let you go through the reasoning for that, but it’s the same really: just reversing all charge signs would not make the weak force ‘work’ in that reversed world – we’d have to ‘keep’ some of the signs, notably those of our W bosons !
  3. Initially, it was thought that the weak force respected the combined CP symmetry (and, therefore, that the principle of P and C symmetry could be substituted by a combined CP symmetry principle) but two experimenters – Val Fitch and James Cronin – got a Nobel Prize when they proved that this was not the case. To be precise, the spontaneous decay of neutral kaons (which is a type of decay mediated by the weak force) does not respect CP symmetry. Now, that was the death blow to time reversibility (T symmetry). Why? Can’t we just make a film of those experiments not respecting P, C or CP symmetry, and then just press the ‘reverse’ button? We could but one can show that the relativistic invariance in Einstein’s relativity theory implies a combined CPT symmetry. Hence, if CP is a broken symmetry, then the T symmetry is also broken. So we could play that film, but the laws of physics would not make sense ! In other words, the weak force does not respect T symmetry either !

To summarize this rather lengthy philosophical digression: a full CPT sequence of operations would work. So we could – in sequence – (1) change all particles to antiparticles (C), (2) reflect the system in a mirror (P), and (3) change the sign of time (T), and we’d have a ‘working’ anti-world that would be just as real as ours. HOWEVER, we do not live in a mirror world. We live in OUR world – and so left-handed is left-handed, and right-handed is right-handed, and positive is positive and negative is negative, and so THERE IS NO TIME REVERSIBILITY: the weak force does not respect T symmetry.

Do you understand now why I call the weak force the weird force? Penrose devotes a whole chapter to time reversibility in his Road to Reality, but he does not focus on the weak force. I wonder why. All that rambling on the Second Law of Thermodynamics is great – but one should relate that ‘principle’ to the fundamental forces and, most notably, to the weak force.

Post scriptum 1:

In one of my previous posts, I complained about not finding any good image of the Higgs particle. The problem is that these super-duper particle accelerators don’t use bubble chambers anymore. The scales involved have become incredibly small and so all that we have is electronic data, it seems, and that is then re-assembled into some kind of digital image but – when everything is said and done – these images are only simulations. Not the real thing. I guess I am just an old grumpy guy – a 45-year-old economist: what do you expect? – but I’ll admit that those black-and-white pictures above make my heart race a bit more than those colorful simulations. But so I found a good simulation. It’s the cover image of Wikipedia’s Physics beyond the Standard Model (I should have looked there in the first place, I guess). So here it is: the “simulated Large Hadron Collider CMS particle detector data depicting a Higgs boson (produced by colliding protons) decaying into hadron jets and electrons.”

CMS_Higgs-event (1)

So that’s what gives mass to our massive W bosons. The Higgs particle is a massive particle itself: an estimated 125–126 GeV/c2, so that’s about 1.5 times the mass of the W bosons. I tried to look into decay widths and all that, but it’s all quite confusing. In short, I have no doubt that the Higgs theory is correct – the data is all we have and, when everything is said and done, we have an honorable Nobel Prize Committee thinking the evidence is good enough (which – in light of their rather conservative approach (which I fully subscribe to: don’t get me wrong !) – usually means that it’s more than good enough !) – but I can’t help thinking this is a theory which has been designed to match experiment.

Wikipedia writes the following about the Higgs field:

“The Higgs field consists of four components, two neutral ones and two charged component fields. Both of the charged components and one of the neutral fields are Goldstone bosons, which act as the longitudinal third-polarization components of the massive W+, W– and Z bosons. The quantum of the remaining neutral component corresponds to (and is theoretically realized as) the massive Higgs boson.”

Hmm… So we assign some extra quantum numbers to W bosons (sorry for the jargon: I am talking these ‘longitudinal third-polarization components’ here), and to W bosons only, and then we find that the Higgs field gives mass to these bosons only? I might be mistaken – I truly hope so (I’ll find out when I am somewhat stronger in quantum-mechanical math) – but, as for now, it all smells somewhat fishy to me. It’s all consistent, yes – and I am even more skeptical about GUT stuff ! – but it does look somewhat artificial.

But then I guess this rather negative appreciation of the mathematical beauty (or lack of it) of the Standard Model is really what is driving all these GUT theories – and so I shouldn’t be so skeptical about them ! 🙂

Oh… And as I’ve inserted some images of collisions already, let me insert some more. The ones below document the discovery of quarks. They come out of the above-mentioned coffee table book of Lederman and Schramm (1989). The accompanying texts speak for themselves.

Quark - 1

Quark - 2

Quark - 3

 

Post scriptum 2:

I checked the source of that third diagram showing how an incoming neutrino could possibly cause a neutron to become a proton. It comes out of the August 2001 issue of Physics Today indeed, and it describes a very particular type of beta decay. This is the original illustration:

inverse beta decay

The article (and the illustration above) describes how solar neutrinos traveling through heavy water – i.e. water in which the hydrogen atoms are the deuterium isotope – can interact with the deuterium nucleus – which is referred to as the deuteron, and which we’ll represent by the symbol d in the process descriptions below. The nucleus of deuterium – an isotope of hydrogen – consists of one proton and one neutron, as opposed to the much more common protium isotope of hydrogen, which has just one proton in its nucleus. Deuterium occurs naturally (0.0156% of all hydrogen atoms in the Earth’s oceans are deuterium), but it can also be produced industrially – for use in heavy-water nuclear reactors for example. In any case, the point is that the deuteron can respond to solar neutrinos by breaking up in one of two ways:

  1. Quasi-elastically: νe + d → νe + p + n. So, in this case, the deuteron just breaks up into its two components: one proton and one neutron. That seems to happen pretty frequently, because the nuclear force holding the proton and the neutron together is pretty weak it seems.
  2. Alternatively, the solar neutrino can turn the deuteron’s neutron into a second proton: νe + d → e– + p + p. So what happens really is νe + n → e– + p. That’s what’s depicted in the third diagram above, and it’s also what the little bookkeeping exercise below checks.
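Just to convince myself that both channels are legitimate bookkeeping, here’s a quick check of my own (the names are mine, not standard notation) that electric charge and lepton number are conserved in each:

    # (electric charge in units of e; lepton number: +1 for the electron and the neutrino)
    particles = {
        "nu_e": (0, +1), "e-": (-1, +1),
        "p": (+1, 0), "n": (0, 0), "d": (+1, 0),   # deuteron = one proton + one neutron
    }

    def totals(names):
        return tuple(sum(particles[n][i] for n in names) for i in (0, 1))

    # 1. Quasi-elastic break-up: nu_e + d -> nu_e + p + n
    assert totals(["nu_e", "d"]) == totals(["nu_e", "p", "n"])
    # 2. 'Inverse beta decay': nu_e + d -> e- + p + p
    assert totals(["nu_e", "d"]) == totals(["e-", "p", "p"])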

The author of this article – which basically presents how a new neutrino detector, the Sudbury Neutrino Observatory, is supposed to work – refers to the second process as inverse beta decay – but that’s a rather generic and imprecise term, it seems. The conclusion is that the weak force seems to have myriad ways of expressing itself. However, the connection between neutrinos and the weak force seems to need further exploring. As for myself, I’d like to know why the hypothesis that any form of beta decay – or, for that matter, any other expression of the weak force – is actually being triggered by these tiny neutrinos crashing into (other) matter particles would not be reasonable.

In such scenario, the W bosons would be reduced to a (very) temporary messy ‘blob’ of energy, combining kinetic, electromagnetic as well as the strong binding energy between quarks if protons and neutrons are involved. Could this ‘odd one out’ be nothing but a pseudo-force? I am no doubt being very simplistic here – but then it’s an interesting possibility, isn’t it? In order to firmly deny it, I’ll need to learn a lot more about neutrinos no doubt – and about how the results of all these collisions in particle accelerators are actually being analyzed and interpreted.

Loose ends…

It looks like I am getting ready for my next plunge into Roger Penrose’s Road to Reality. I still need to learn more about those Hamiltonian operators and all that, but I can sort of ‘see’ what they are supposed to do now. However, before I venture off on another series of posts on math instead of physics, I thought I’d briefly present what Feynman identified as ‘loose ends’ in his 1985 Lectures on Quantum Electrodynamics – a few years before his untimely death – and then see if any of those ‘loose ends’ appears less loose today, i.e. some thirty years later.

The three-forces model and coupling constants

All three forces in the Standard Model (the electromagnetic force, the weak force and the strong force) are mediated by force carrying particles: bosons. [Let me talk about the Higgs field later and – of course – I leave out the gravitational force, for which we do not have a quantum field theory.]

Indeed, the electromagnetic force is mediated by the photon; the strong force is mediated by gluons; and the weak force is mediated by W and/or Z bosons. The mechanism is more or less the same for all. There is a so-called coupling (or a junction) between a matter particle (i.e. a fermion) and a force-carrying particle (i.e. the boson), and the amplitude for this coupling to happen is given by a number that is related to a so-called coupling constant.

Let’s give an example straight away – and let’s do it for the electromagnetic force, which is the only force we have been talking about so far. The illustration below shows three possible ways for two electrons moving in spacetime to exchange a photon. This involves two couplings: one emission, and one absorption. The amplitude for an emission or an absorption is the same: it’s –j. So the amplitude here will be (–j)(–j) = j2. Note that the two electrons repel each other as they exchange a photon, which reflects the electromagnetic force between them from a quantum-mechanical point of view !

Photon exchange

We will have a number like this for all three forces. Feynman writes the coupling constant for the electromagnetic force as j and the coupling constant for the strong force (i.e. the amplitude for a gluon to be emitted or absorbed by a quark) as g. [As for the weak force, he is rather short on that and actually doesn’t bother to introduce a symbol for it. I’ll come back to that later.]

The coupling constant is a dimensionless number and one can interpret it as the unit of ‘charge’ for the electromagnetic and strong force respectively. So the ‘charge’ q of a particle should be read as q times the coupling constant. Of course, we can argue about that unit. The elementary charge for electromagnetism was or is – historically – the charge of the proton (q = +1), but now the proton is no longer elementary: it consists of quarks with charge –1/3 and +2/3 (for the d and u quark respectively; a proton consists of two u quarks and one d quark, so you can write it as uud). So what’s j then? Feynman doesn’t give its precise value but uses an approximate value of –0.1. It is an amplitude, so it should be interpreted as a complex number to be added or multiplied with other complex numbers representing amplitudes – so –0.1 is “a shrink to about one-tenth, and half a turn.” [In these 1985 Lectures on QED, which he wrote for a lay audience, he calls amplitudes ‘arrows’, to be combined with other ‘arrows’. In complex notation, –0.1 = 0.1e^(iπ) = 0.1(cos π + i·sin π).]
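To make this ‘arrow arithmetic’ tangible, here’s a two-line sketch of my own (not Feynman’s) showing the shrink-and-turn reading of –0.1, and the two-junction amplitude (–j)(–j) = j2 from the illustration above:

    import cmath

    j = -0.1                       # Feynman's rough QED coupling amplitude
    print(abs(j), cmath.phase(j))  # 0.1 and pi: 'a shrink to one-tenth, and half a turn'

    # Two junctions (one emission, one absorption) means multiplying the arrows:
    print(j * j)                   # 0.01: the amplitude for the whole photon exchange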

Let me give a precise number. The coupling constant for the electromagnetic force is the so-called fine-structure constant, and it’s usually denoted by the alpha symbol (α). There is a remarkably easy formula for α, which becomes even easier if we fiddle with units to simplify the matter even more. Let me paraphrase Wikipedia on α here, because I have no better way of summarizing it (the summary is also nice as it shows how changing units – replacing the SI units by so-called natural units – can simplify equations):

1. There are three equivalent definitions of α in terms of other fundamental physical constants:

\alpha = \frac{k_\mathrm{e} e^2}{\hbar c} = \frac{1}{(4 \pi \varepsilon_0)} \frac{e^2}{\hbar c} = \frac{e^2 c \mu_0}{2 h}
where e is the elementary charge (so that’s the electric charge of the proton); ħ = h/2π is the reduced Planck constant; c is the speed of light (in vacuum); ε0 is the electric constant (i.e. the so-called permittivity of free space); µ0 is the magnetic constant (i.e. the so-called permeability of free space); and ke is the Coulomb constant.

2. In the old centimeter-gram-second variant of the metric system (cgs), the unit of electric charge is chosen such that the Coulomb constant (or the permittivity factor) equals 1. Then the expression of the fine-structure constant just becomes:

\alpha = \frac{e^2}{\hbar c}

3. When using so-called natural units, we equate ε0, c and ħ to 1. [That does not mean they are the same: they just become the unit of measurement for whatever is measured in them. :-)] The value of the fine-structure constant then becomes:

 \alpha = \frac{e^2}{4 \pi}.

Of course, then it just becomes a matter of choosing a value for e. Indeed, we still haven’t answered the question as to what we should choose as ‘elementary’: 1 or 1/3? If we take 1, then α is just a bit smaller than 0.08 (around 0.0795775 to be somewhat more precise). If we take 1/3 (the value for a quark), then we get a much smaller value: about 0.008842 (I won’t bother too much about the rest of the decimals here). Feynman’s (very) rough approximation of –0.1 obviously uses the historic proton charge, so e = +1.
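These numbers are a one-line check each, using the natural-units formula above:

    import math

    def alpha_natural(e):
        """Fine-structure constant in natural units: alpha = e^2/(4*pi)."""
        return e**2 / (4 * math.pi)

    print(alpha_natural(1.0))      # ~0.0795775 (taking the proton charge as the unit)
    print(alpha_natural(1.0 / 3))  # ~0.0088420 (taking the d-quark charge as the unit)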

The coupling constant for the strong force is much bigger. Indeed, if we use the SI units (i.e. one of the three formulas for α under point 1 above), then we get an α equal to some 7.297×10–3 for the electromagnetic force – a value usually written as (roughly) 1/137. In this scheme of things, the coupling constant for the strong force is of the order of 1, so that’s some 137 times bigger.

Coupling constants, interactions, and Feynman diagrams

So how does it work? The Wikipedia article on coupling constants makes an extremely useful distinction between the kinetic part and the proper interaction part of an ‘interaction’. Indeed, before we just blindly associate quantum numbers with particles, it’s probably useful to not only look at how photon absorption and/or emission works, but also at how a process as common as photon scattering works (so we’re talking Compton scattering here – discovered in 1923, and it earned Compton a Nobel Prize !).

The illustration below separates the kinetic and interaction part properly: the photon and the electron are both deflected (i.e. the magnitude and/or direction of their momentum (p) changes) – that’s the kinetic part – but, in addition, the frequency of the photon (and, hence, its energy – cf. E = hν) is also affected – so that’s the interaction part I’d say.

Compton scattering

With an absorption or an emission, the situation is different, but it also involves frequencies (and, hence, energy levels), as shown below: an electron absorbing a higher-energy photon will jump two or more levels as it absorbs the energy by moving to a higher energy level (i.e. a so-called excited state), and when it re-emits the energy, the emitted photon will have correspondingly higher energy and, hence, higher frequency.

Absorption-1

This business of frequencies and energy levels may not be so obvious when looking at those Feynman diagrams, but I should add that these Feynman diagrams are not just sketchy drawings: the time and space axes are precisely defined (time and distance are measured in equivalent units) and so the lines representing the particles (photons, electrons, or whatever particle is depicted) do reflect their direction of travel and, hence, convey precious information about both the direction as well as the magnitude of their momentum. That being said, a Feynman diagram does not care about a photon’s frequency and, hence, its energy (its velocity will always be c, and it has no mass, so we can’t get any information from its trajectory).

Let’s look at these Feynman diagrams now, and the underlying force model, which I refer to as the boson exchange model.

The boson exchange model

The quantum field model – for all forces – is a boson exchange model. In this model, electrons, for example, are kept in orbit through the continuous exchange of (virtual) photons between the proton and the electron, as shown below.

Electron-proton

Now, I should say a few words about these ‘virtual’ photons. The most important thing is that you should look at them as being ‘real’. They may be derided as being only temporary disturbances of the electromagnetic field, but they are very real force carriers in the quantum field theory of electromagnetism. They may carry very low energy as compared to ‘real’ photons, but they do conserve energy and momentum – in quite a strange way obviously: while it is easy to imagine a photon pushing an electron away, it is a bit more difficult to imagine it pulling it closer, which is what it does here. Nevertheless, that’s how forces are being mediated by virtual particles in quantum mechanics: we have matter particles carrying charge, but neutral bosons taking care of the exchange between those charges.

In fact, note how Feynman actually cares about the possibility of one of those ‘virtual’ photons briefly disintegrating into an electron-positron pair, which underscores the ‘reality’ of photons mediating the electromagnetic force between a proton and an electron,  thereby keeping them close together. There is probably no better illustration to explain the difference between quantum field theory and the classical view of forces, such as the classical view on gravity: there are no gravitons doing for gravity what photons are doing for electromagnetic attraction (or repulsion).

Pandora’s Box

I cannot resist a small digression here. The ‘Box of Pandora’ to which Feynman refers in the caption of the illustration above is the problem of calculating the coupling constants. Indeed, j is the coupling constant for an ‘ideal’ electron to couple with some kind of ‘ideal’ photon, but how do we calculate that when we actually know that all possible paths in spacetime have to be considered and that we have all of this ‘virtual’ mess going on? Indeed, in experiments, we can only observe probabilities for real electrons to couple with real photons.

In the ‘Chapter 4’ to which the caption makes a reference, he briefly explains the mathematical procedure, which he invented and for which he got a Nobel Prize. He calls it a ‘shell game’. It’s basically an application of ‘perturbation theory’, which I haven’t studied yet. However, he does so with skepticism about its mathematical consistency – skepticism which I mentioned and explored somewhat in previous posts, so I won’t repeat that here. Here, I’ll just note that the issue of ‘mathematical consistency’ is much more of an issue for the strong force, because the coupling constant is so big.

Indeed, terms with j2, j3, j4, etcetera (i.e. the terms involved in adding amplitudes for all possible paths and all possible ways in which an event can happen) quickly become very small as the exponent increases, but terms with g2, g3, g4, etcetera do not become negligibly small. In fact, they don’t become irrelevant at all. Indeed, if we write α for the electromagnetic force as 7.297×10–3, then the α for the strong force is about one, and so none of these terms becomes vanishingly small. I won’t dwell on this, but just quote Wikipedia’s very succinct appraisal of the situation: “If α is much less than 1 [in a quantum field theory with a dimensionless coupling constant α], then the theory is said to be weakly coupled. In this case it is well described by an expansion in powers of α called perturbation theory. [However] If the coupling constant is of order one or larger, the theory is said to be strongly coupled. An example of the latter [the only example as far as I am aware: we don’t have like a dozen different forces out there !] is the hadronic theory of strong interactions, which is why it is called strong in the first place. [Hadrons is just a difficult word for particles composed of quarks – so don’t worry about it: you understand what is being said here.] In such a case non-perturbative methods have to be used to investigate the theory.”
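The point is easy to make numerically – a sketch of my own that just compares powers of the two couplings (it is not an actual perturbative calculation):

    alpha_em = 7.297e-3    # electromagnetic coupling, ~1/137
    alpha_strong = 1.0     # strong coupling, of order one

    for n in range(1, 5):
        print(n, alpha_em**n, alpha_strong**n)

    # The QED column collapses (7e-3, 5e-5, 4e-7, 3e-9...), so a few terms suffice.
    # The strong-force column stays stubbornly at 1: the expansion never settles down.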

Hmm… If Feynman thought his technique for calculating weak coupling constants was fishy, then his skepticism about whether or not physicists actually know what they are doing when calculating stuff using the strong coupling constant is probably justified. But let’s come back on that later. With all that we know here, we’re ready to present a picture of the ‘first-generation world’.

The first-generation world

The first-generation world is our world, excluding all that goes on in those particle accelerators, where they discovered so-called second- and third-generation matter – but I’ll come back to that. Our world consists of only four matter particles, collectively referred to as (first-generation) fermions: two quarks (a u and a d type), the electron, and the neutrino. This is what is shown below.

first-generation matter

Indeed, u and d quarks make up protons and neutrons (a proton consists of two u quarks and one d quark, and a neutron must be neutral, so it’s two d quarks and one u quark), and then there are electrons circling around them, and so that’s our atoms. And from atoms, we make molecules, and then you know the rest of the story. Genesis !

Oh… But why do we need the neutrino? [Damn – you’re smart ! You see everything, don’t you? :-)] Well… There’s something referred to as beta decay: this allows a neutron to become a proton (and vice versa). Beta decay explains why carbon-14 will spontaneously decay into nitrogen-14. Indeed, carbon-12 is the (very) stable isotope, while carbon-14 has a half-life of 5,730 ± 40 years ‘only’ and, hence, measuring how much carbon-14 is left in some organic substance allows us to date it (that’s what (radio)carbon-dating is about). Now, a beta particle can refer to an electron or a positron, so we can have β– decay (e.g. the above-mentioned carbon-14 decay) or β+ decay (e.g. magnesium-23 into sodium-23). If we have β– decay, then some electron will be flying out, carrying away one unit of negative charge. If it’s β+ decay, then an emitted positron will carry away the positive charge. [I forgot to mention that each of the particles above also has an anti-matter counterpart – but don’t think I tried to hide anything else: the fermion picture above is pretty complete.] That being said, Wolfgang Pauli, one of those geniuses who invented quantum theory, noted, in 1930 already, that some momentum and energy was missing, and so he predicted the emission of this mysterious neutrino as well. Guess what? These things are very spooky (relatively high-energy neutrinos produced by stars (our Sun in the first place) are going through your body and mine, right now and right here, at a rate of some hundred trillion per second) but, because they are so hard to detect, the first actual trace of their existence was found in 1956 only. [Neutrino detection is fairly standard business now, however.] But back to quarks now.

Quarks are held together by gluons – as you probably know. Quarks come in flavors (u and d), but gluons come in ‘colors’. It’s a bit of a stupid name but the analogy works great. Quarks exchange gluons all of the time and so that’s what ‘glues’ them so strongly together. Indeed, the so-called ‘mass’ that gets converted into energy when a nuclear bomb explodes is not the mass of quarks (their mass is only 2.4 and 4.8 MeV/c2 respectively). Nuclear power is binding energy between quarks that gets converted into heat and radiation and kinetic energy and whatever else a nuclear explosion unleashes. That binding energy is reflected in the difference between the mass of a proton (or a neutron) – around 938 MeV/c2 – and the mass figure you get when you add two u‘s and one d, which is 9.6 MeV/c2 only. This ratio – a factor of one hundred – illustrates once again the strength of the strong force: 99% of the ‘mass’ of a proton or a neutron is due to the strong force.
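The arithmetic is worth spelling out, using the rough mass figures just quoted:

    m_u, m_d = 2.4, 4.8      # quark masses (MeV/c^2)
    m_proton = 938.0         # proton mass (MeV/c^2)

    quark_mass = 2 * m_u + m_d            # uud: 9.6 MeV/c^2
    print(1 - quark_mass / m_proton)      # ~0.99: ~99% of the mass is strong-force energy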

But I am digressing too much, and I haven’t even started to talk about the bosons associated with the weak force. Well… I won’t just now. I’ll just move on to the second- and third-generation world.

Second- and third-generation matter

When physicists started to look for those quarks in their particle accelerators, Nature had already confused them by producing lots of other particles in these accelerators: in the 1960s, there were more than four hundred of them. Yes. Too much. But they couldn’t get them back in the box. 🙂

Now, all these ‘other particles’ are unstable but they survive long enough – a muon, for example, disintegrates after 2.2 millionths of a second (on average) – to deserve the ‘particle’ title, as opposed to a ‘resonance’, whose lifetime can be as short as a billionth of a trillionth of a second. And so, yes, the physicists had to explain them too. So the guys who devised the quark-gluon model (the model is usually associated with Murray Gell-Mann but – as usual with great ideas – some others worked hard on it as well) had already included heavier versions of their quarks to explain (some of) these other particles. And so we do not only have heavier quarks, but also a heavier version of the electron (that’s the muon I mentioned) as well as a heavier version of the neutrino (the so-called muon neutrino). The two new ‘flavors’ of quarks were called s and c. [Feynman hates these names but let me give them: u stands for up, d for down, s for strange and c for charm. Why? Well… According to Feynman: “For no reason whatsoever.”]

Traces of the second-generation s and c quarks were found in experiments in 1968 and 1974 respectively (it took six years to boost the particle accelerators sufficiently), and the third-generation b quark (b for beauty or bottom – whatever) popped up in Fermilab‘s particle accelerator in 1978. To be fully complete, it then took 17 more years to detect the super-heavy t quark – t for truth (or top). [Of all the quarks, this name is probably the nicest: “If beauty, then truth” – as Lederman and Schramm write in their 1989 history of all of this.]

What’s next? Will there be a fourth or even a fifth generation? Back in 1985, Feynman didn’t exclude it (and actually seemed to expect it), but current assessments are more prosaic. Indeed, Wikipedia writes that, “according to the results of the statistical analysis by researchers from CERN and the Humboldt University of Berlin, the existence of further fermions can be excluded with a probability of 99.99999% (5.3 sigma).” If you want to know why… Well… Read the rest of the Wikipedia article. It’s got to do with the Higgs particle.

So the complete model of reality is the one I already inserted in a previous post and, if you find it complicated, remember that the first generation of matter is the one that matters and, among the bosons, it’s the photons and gluons. If you focus on these only, it’s not complicated at all – and surely a huge improvement over those 400+ particles no one understood in the 1960s.

Standard_Model_of_Elementary_Particles

As for the interactions, quarks stick together – and rather firmly so – by interchanging gluons. They thereby ‘change color’ (which is the same as saying there is some exchange of ‘charge’). I copy Feynman’s original illustration hereunder (not because there’s no better illustration – the stuff you can find on Wikipedia has actual colors ! – but just because it reflects the other illustrations above and, perhaps, because I want to make sure – with this black-and-white thing – that you don’t think there’s something like ‘real’ color inside of a nucleus).

quark gluon exchange

So what are the loose ends then? The problem of ‘mathematical consistency’ associated with the techniques used to calculate (or estimate) these coupling constants – which Feynman identified as a key defect in 1985 – amounts to a form of skepticism about the Standard Model that is not widely shared. The real loose ends concern the other forces. So let’s now talk about these.

The weak force as the weird force: about symmetry breaking

I included the weak force in the title of one of the sub-sections above (“The three-forces model”) and then talked about the other two forces only. The W+, W– and Z bosons – usually referred to, as a group, as the W bosons, or the ‘intermediate vector bosons’ – are an odd bunch. First, note that they are the only force carriers that have a (rest) mass (and not just a little bit: they’re almost 100 times heavier than a proton or neutron – or a hydrogen atom !) and, on top of that, they also carry electric charge (except for the Z boson). They are really the odd ones out. Feynman does not doubt their existence (a CERN team produced them in 1983, and Rubbia and van der Meer got a Nobel Prize for it, so little room for doubt here !), but it is obvious he finds the weak force interaction model rather weird.

He’s not the only one: in a wonderful publication designed to make a case for more powerful particle accelerators (probably successful, because the Large Hadron Collider came through – and discovered credible traces of the Higgs field, which is involved in the story that is about to follow), Leon Lederman and David Schramm look at the asymmetry involved in having massive W bosons and massless photons and gluons as just one of the many asymmetries associated with the weak force. Let me develop this point.

We like symmetries. They are aesthetic. But I am talking about something else here: in classical physics, characterized by strict causality and determinism, we can – in theory – reverse the arrow of time. In practice, we can’t – because of entropy – but, in theory, so-called reversible machines are not a problem. However, in quantum mechanics we cannot reverse time, for reasons that have nothing to do with thermodynamics. In fact, there are several types of symmetries in physics:

  1. Parity (P) symmetry revolves around the notion that Nature should not distinguish between right- and left-handedness, so everything that works in our world, should also work in the mirror world. Now, the weak force does not respect P symmetry. That was shown by experiments on the decay of pions, muons and radioactive cobalt-60 in 1956 and 1957 already.
  2. Charge conjugation or charge (C) symmetry revolves around the notion that a world in which we reverse all (electric) charge signs (so protons would have minus one as charge, and electrons would have plus one) would also just work the same. The same 1957 experiments showed that the weak force does not respect C symmetry either.
  3. Initially, smart theorists noted that the combined operation of CP was respected by these 1957 experiments (hence, the principle of P and C symmetry could be substituted by a combined CP symmetry principle) but, then, in 1964, Val Fitch and James Cronin proved that the spontaneous decay of neutral kaons (don’t worry if you don’t know what particle this is: you can look it up) into pairs of pions did not respect CP symmetry. In other words, it was – again – the weak force not respecting symmetry. [Fitch and Cronin got a Nobel Prize for this, so you can imagine it did mean something !]
  4. We mentioned time reversal (T) symmetry: how is that being broken? In theory, we can imagine a film being made of those events not respecting P, C or CP symmetry and then just pressing the ‘reverse’ button, can’t we? Well… I must admit I do not master the details of what I am going to write now, but let me just quote Lederman (another Nobel Prize physicist) and Schramm (an astrophysicist): “Years before this, [Wolfgang] Pauli [Remember him from his neutrino prediction?] had pointed out that a sequence of operations like CPT could be imagined and studied; that is, in sequence, change all particles to antiparticles, reflect the system in a mirror, and change the sign of time. Pauli’s theorem was that all nature respected the CPT operation and, in fact, that this was closely connected to the relativistic invariance of Einstein’s equations. There is a consensus that CPT invariance cannot be broken – at least not at energy scales below 10¹⁹ GeV [i.e. the Planck scale]. However, if CPT is a valid symmetry, then, when Fitch and Cronin showed that CP is a broken symmetry, they also showed that T symmetry must be similarly broken.” (Lederman and Schramm, 1989, From Quarks to the Cosmos, p. 122-123)

So the weak force doesn’t care about symmetries. Not at all. That being said, there is an obvious difference between the asymmetries mentioned above and the asymmetry involved in W bosons having mass while the other bosons have none. That’s true. Especially because we now have that Higgs field to explain why W bosons have mass – and not only W bosons but also the matter particles (i.e. the three generations of leptons and quarks discussed above). The diagram below shows what interacts with what.

2000px-Elementary_particle_interactions.svg

But so the Higgs field does not interact with photons and gluons. Why? Well… I am not sure. Let me copy the Wikipedia explanation: “The Higgs field consists of four components, two neutral ones and two charged component fields. Both of the charged components and one of the neutral fields are Goldstone bosons, which act as the longitudinal third-polarization components of the massive W+, W– and Z bosons. The quantum of the remaining neutral component corresponds to (and is theoretically realized as) the massive Higgs boson.”

Huh? […] This ‘answer’ probably doesn’t answer your question. What I understand from the explanation above is that the Higgs field only interacts with W bosons because its (theoretical) structure is such that it only interacts with W bosons. Now, you’ll remember Feynman’s oft-quoted criticism of string theory: “I don’t like that for anything that disagrees with an experiment, they cook up an explanation – a fix-up to say.” Is the Higgs theory such a cooked-up explanation? No. That kind of criticism would not apply here, in light of the fact that – some 50 years after the theory – there is (some) experimental confirmation at least !

But you’ll admit it does all look ‘somewhat ugly.’ However, while that’s a ‘loose end’ of the Standard Model, it’s not a fundamental defect. The argument is more about aesthetics, and different people have different views on aesthetics – especially when it comes to mathematical attractiveness or unattractiveness.

So… No real loose end here I’d say.

Gravity

The other ‘loose end’ that Feynman mentions in his 1985 summary is obviously still very relevant today (much more than his worries about the weak force I’d say). It is the lack of a quantum theory of gravity. There is none. Of course, the obvious question is: why would we need one? We’ve got Einstein’s theory, don’t we? What’s wrong with it?

The short answer to the last question is: nothing’s wrong with it – on the contrary ! It’s just that it is – well… – classical physics. No uncertainty. As such, the formalism of quantum field theory cannot be applied to gravity. That’s it. What’s Feynman’s take on this? [Sorry I refer to him all the time, but I made it clear in the introduction of this post that I would be discussing ‘his’ loose ends indeed.] Well… He makes two points – a practical one and a theoretical one:

1. “Because the gravitation force is so much weaker than any of the other interactions, it is impossible at the present time to make any experiment that is sufficiently delicate to measure any effect that requires the precision of a quantum theory to explain it.”

Feynman is surely right about gravity being ‘so much weaker’. Indeed, you should note that, at a scale of 10⁻¹³ cm (that’s the femtometer scale – so that’s the relevant scale indeed at the sub-atomic level), the coupling constants compare as follows: if the coupling constant of the strong force is 1, the coupling constant of the electromagnetic force is approximately 1/137, so that’s a factor of 10⁻² approximately. The strength of the weak force as measured by its coupling constant is smaller by a factor of 10⁻¹³ (so that’s 1/10,000,000,000,000). Incredibly small, but we do have a quantum field theory for the weak force ! However, the coupling constant for the gravitational force involves a factor of 10⁻³⁸. Let’s face it: this is unimaginably small. A quick side-by-side comparison follows below.
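Putting those orders of magnitude side by side (these are the rough figures quoted above, not precise measured values):

```python
# Relative strength of the four forces at the femtometer scale,
# normalized to the strong force (order-of-magnitude figures only).
couplings = {
    "strong":          1,
    "electromagnetic": 1e-2,    # ~1/137
    "weak":            1e-13,
    "gravitational":   1e-38,
}
for force, strength in couplings.items():
    print(f"{force:>15}: {strength:.0e}")
```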

However, Feynman wrote this in 1985 (i.e. thirty years ago), and scientists wouldn’t be scientists if they didn’t at least try to set up some kind of experiment. So there it is: LIGO. Let me quote Wikipedia on it:

LIGO, which stands for the Laser Interferometer Gravitational-Wave Observatory, is a large-scale physics experiment aiming to directly detect gravitational waves. […] At the cost of $365 million (in 2002 USD), it is the largest and most ambitious project ever funded by the NSF. Observations at LIGO began in 2002 and ended in 2010; no unambiguous detections of gravitational waves have been reported. The original detectors were disassembled and are currently being replaced by improved versions known as “Advanced LIGO”.

So, let’s see what comes out of that. I won’t put my money on it just yet. 🙂 Let’s go to the theoretical problem now.

2. “Even though there is no way to test them, there are, nevertheless, quantum theories of gravity that involve ‘gravitons’ (which would appear under a new category of polarizations, called spin “2”) and other fundamental particles (some with spin 3/2). The best of these theories is not able to include the particles that we do find, and invents a lot of particles that we don’t find. [In addition] The quantum theories of gravity also have infinities in the terms with couplings [Feynman does not refer to a coupling constant but to a factor n appearing in the so-called propagator for an electron – don’t worry about it: just note it’s a problem with one of those constants actually being larger than one !], but the “dippy process” that is successful in getting rid of the infinities in quantum electrodynamics doesn’t get rid of them in gravitation. So not only have we no experiments with which to check a quantum theory of gravitation, we also have no reasonable theory.”

Phew ! After reading that, you wouldn’t apply for a job at that LIGO facility, would you? That being said, the fact that there is a LIGO experiment would seem to undermine Feynman’s practical argument. But then is his theoretical criticism still relevant today? I am not an expert, but it would seem to be the case according to Wikipedia’s update on it:

“Although a quantum theory of gravity is needed in order to reconcile general relativity with the principles of quantum mechanics, difficulties arise when one attempts to apply the usual prescriptions of quantum field theory. From a technical point of view, the problem is that the theory one gets in this way is not renormalizable and therefore cannot be used to make meaningful physical predictions. As a result, theorists have taken up more radical approaches to the problem of quantum gravity, the most popular approaches being string theory and loop quantum gravity.”

Hmm… String theory and loop quantum gravity? That’s the stuff that Penrose is exploring. However, I’d suspect that for these (string theory and loop quantum gravity), Feynman’s criticism probably still rings true – to some extent at least: 

“I don’t like that they’re not calculating anything. I don’t like that they don’t check their ideas. I don’t like that for anything that disagrees with an experiment, they cook up an explanation–a fix-up to say, “Well, it might be true.” For example, the theory requires ten dimensions. Well, maybe there’s a way of wrapping up six of the dimensions. Yes, that’s all possible mathematically, but why not seven? When they write their equation, the equation should decide how many of these things get wrapped up, not the desire to agree with experiment. In other words, there’s no reason whatsoever in superstring theory that it isn’t eight out of the ten dimensions that get wrapped up and that the result is only two dimensions, which would be completely in disagreement with experience. So the fact that it might disagree with experience is very tenuous, it doesn’t produce anything; it has to be excused most of the time. It doesn’t look right.”

What to say by way of conclusion? Not sure. I think my personal “research agenda” is reasonably simple: I just want to try to understand all of the above somewhat better and then, perhaps, I might be able to understand some of what Roger Penrose is writing. 🙂

Bad thinking: photons versus the matter wave

In my previous post, I wrote that I was puzzled by that relation between the energy and the size of a particle: higher-energy photons are supposed to be smaller and, pushing that logic to the limit, we get photons becoming black holes at the Planck scale. Now, understanding what the Planck scale is all about, is important to understand why we’d need a GUT, and so I do want to explore that relation between size and energy somewhat further.

I found the answer by a coincidence. We’ll call it serendipity. 🙂 Indeed, an acquaintance of mine who is very well versed in physics pointed out a terrible mistake in (some of) my reasoning in the previous posts: photons do not have a de Broglie wavelength. They just have a wavelength. Full stop. It immediately reduced my bemusement about that energy-size relation and, in the end, eliminated it completely. So let’s analyze that mistake – which seems to be a fairly common freshman mistake judging from what’s being written about it in some of the online discussions on physics.

If photons are not to be associated with a de Broglie wave, it basically means that the Planck relation has nothing to do with the de Broglie relation, even if these two relations are identical from a pure mathematical point of view:

  1. The Planck relation E = hν states that electromagnetic waves with frequency ν come in discrete packets of energy referred to as photons, and that the energy of these photons is proportional to the frequency of the electromagnetic wave, with the Planck constant h as the factor of proportionality. The natural unit of action here is h, which is why h is referred to as the quantum of action.
  2. The de Broglie relation E = hf assigns a de Broglie wave with frequency f to a matter particle with energy E = mc² = γm₀c². [The factor γ in this formula is the Lorentz factor: γ = 1/√(1 – v²/c²). It just corrects for the relativistic effect on mass as the velocity of the particle (v) gets closer to the speed of light (c).]

These are two very different things: photons do not have rest mass (which is why they can travel at light speed) and, hence, they are not to be considered as matter particles. Therefore, one should not assign a de Broglie wave to them. So what are they then? A photon is a wave packet, but it’s an electromagnetic wave packet. Hence, its wave function is not some complex-valued psi function Ψ(x, t). What is oscillating in the illustration below (let’s say this is a procession of photons) is the electric field vector E. [To get the full picture of the electromagnetic wave, you should also imagine a (tiny) magnetic field vector B, which oscillates perpendicular to E, but that does not make much of a difference. Finally, in case you wonder about these dots: the red and green dot just make it clear that phase and group velocity of the wave are the same: vg = vp = v = c.]

Wave - same group and phase velocity

The point to note is that we have a real wave here: it is not a de Broglie wave. A de Broglie wave is a complex-valued function Ψ(x, t) with two oscillating parts: (i) the so-called real part of the complex value Ψ, and (ii) the so-called imaginary part (and, despite its name, that counts as much as the real part when working with Ψ !). That’s what’s shown in the examples of complex (standing) waves below: the blue part is one part (let’s say the real part), and then the salmon color is the other part. We need to square the modulus of that complex value to find the probability P of detecting that particle in space at point x at time t: P(x, t) = |Ψ(x, t)|². Now, if we write Ψ(x, t) as Ψ = u(x, t) + iv(x, t), then u(x, t) is the real part, and v(x, t) is the imaginary part. |Ψ(x, t)|² is then equal to u² + v², which shows that both the blue as well as the salmon amplitude matter when doing the math.

StationaryStatesAnimation
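To make that u² + v² point concrete, here is a two-line numerical check (the plane wave is just an arbitrary illustrative example):

```python
import numpy as np

# The probability density of a complex wave function counts the real and
# imaginary parts equally: |psi|^2 = u^2 + v^2.
x = np.linspace(0, 10, 1000)
k, omega, t = 2.0, 3.0, 0.5
psi = np.exp(1j * (k * x - omega * t))   # complex-valued plane wave
u, v = psi.real, psi.imag
assert np.allclose(np.abs(psi)**2, u**2 + v**2)
print(np.abs(psi[0])**2)                 # 1.0: a plane wave has a flat density
```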

So, while I may have given the impression that the Planck relation was like a limit of the de Broglie relation for particles with zero rest mass traveling at speed c, that’s just plain wrong ! The description of a particle with zero rest mass fits a photon but the Planck relation is not the limit of the de Broglie relation: photons are photons, and electrons are electrons, and an electron wave has nothing to do with a photon. Electrons are matter particles (fermions as physicists would say), and photons are bosons, i.e. force carriers.

Let’s now re-examine the relationship between the size and the energy of a photon. If the wave packet below would represent an (ideal) photon, what is its energy E as a function of the electric and magnetic field vectors E and B? [Note that the (non-boldface) E stands for energy (i.e. a scalar quantity, so it’s just a number) indeed, while the (italic and bold) E stands for the (electric) field vector (so that’s something with a magnitude (E – with the symbol in italics once again to distinguish it from energy E) and a direction).] Indeed, if a photon is nothing but a disturbance of the electromagnetic field, then the energy E of this disturbance – which obviously depends on E and B – must also be equal to E = hν according to the Planck relation. Can we show that?

Well… Let’s take a snapshot of a plane-wave photon, i.e. a photon oscillating in a two-dimensional plane only. That plane is perpendicular to our line of sight here:

photon

Because it’s a snapshot (time is not a variable), we may look at this as an electrostatic field: all points in the interval Δx are associated with some magnitude (i.e. the magnitude of our electric field E), and points outside of that interval have zero amplitude. It can then be shown (just browse through any course on electromagnetism) that the energy density (i.e. the energy per unit volume) is equal to (1/2)ε₀E² (ε₀ is the electric constant which we encountered in previous posts already). To calculate the total energy of this photon, we should integrate over the whole distance Δx, from left to right. However, rather than bothering you with integrals, I think that (i) the ε₀E²/2 formula and (ii) the illustration above should be sufficient to convince you that:

  1. The energy of a photon is proportional to the square of the amplitude of the electric field. Such an E ∝ A² relation is typical of any real wave, be it water waves or electromagnetic waves. So if we double, triple, or quadruple its amplitude (i.e. the magnitude E of the electric field E), then the energy of this photon will be multiplied by four, nine and sixteen respectively. [A quick numerical check follows after this list.]
  2. If we would not change the amplitude of the wave above but double, triple or quadruple its frequency, then we would only double, triple or quadruple its energy: there’s no exponential relation here. In other words, the Planck relation E = hν makes perfect sense, because it reflects that simple proportionality: there is nothing to be squared.
  3. If we double the frequency but leave the amplitude unchanged, then we can imagine a photon with the same energy occupying only half of the Δx space. In fact, because we also have that universal relationship between frequency and wavelength (the propagation speed of a wave equals the product of its wavelength and its frequency: v = λf), we would have to halve the wavelength (and, hence, that would amount to dividing the Δx by two) to make sure our photon is still traveling at the speed of light.
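Here’s that numerical check of point 1 (an illustrative snapshot with made-up units; only the electric-field half ε₀E²/2 of the energy density is integrated, which is enough to see the proportionality):

```python
import numpy as np

# Energy of a classical field snapshot scales with the *square* of the
# amplitude: density = (1/2) * eps0 * E^2, integrated over the interval.
eps0 = 8.854e-12                                    # electric constant (F/m)
x = np.linspace(0, 1.0, 10000)                      # the interval delta-x (m)

def field_energy(amplitude, wavelength=0.1):
    E = amplitude * np.sin(2 * np.pi * x / wavelength)   # snapshot of E(x)
    return np.trapz(0.5 * eps0 * E**2, x)

e1 = field_energy(1.0)
for A in (2.0, 3.0, 4.0):
    print(f"amplitude x{A:.0f}: energy x{field_energy(A) / e1:.1f}")  # 4, 9, 16
```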

Now, the Planck relation only says that higher energy is associated with higher frequencies: it does not say anything about amplitudes. As mentioned above, if we leave amplitudes unchanged, then the same Δx space will accommodate a photon with twice the frequency and twice the energy. However, if we double both frequency and amplitude, then the photon will occupy only half of the Δx space, and still have twice as much energy. So the only thing I now need to prove is that higher-frequency electromagnetic waves are associated with larger-amplitude E‘s. That is something we get straight out of the laws of electromagnetic radiation: electromagnetic radiation is caused by oscillating electric charges, and it’s the magnitude of the acceleration (written as a in the formula below) of the oscillating charge that determines the amplitude. For a full write-up of these ‘laws’, I’ll refer to a textbook (or just download Feynman’s 28th Lecture on Physics), but let me just give the formula for the (vertical) component of E:

EMR law
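For reference, the formula shown in that illustration is Feynman’s radiation law for the field of an accelerating charge:

\[
E \;=\; \frac{-q}{4\pi\varepsilon_0 c^2}\,\frac{a(t - r/c)}{r}
\]

Note that, for a charge oscillating as x(t) = x₀·sin(ωt), the magnitude of the acceleration is ω²·x₀, so the amplitude of E grows with the frequency of the oscillation.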

You will recognize all of the variables and constants in this one: the electric constant ε0, the distance r, the speed of light (and our wave) c, etcetera. The ‘a’ is the acceleration: note that it’s a function not of t but of (t – r/c), and so we’re talking the so-called retarded acceleration here, but don’t worry about that.

Now, higher frequencies effectively imply a higher magnitude of the acceleration vector, and so that’s what I had to prove and so we’re done: higher-energy photons not only have higher frequency but also larger amplitude, and so they take less space.

It would be nice if I could derive some kind of equation to specify the relation between energy and size, but I am not that advanced in math (yet). 🙂 I am sure it will come.

Post scriptum 1: The ‘mistake’ I made obviously fully explains why Feynman is only interested in the amplitude of a photon to go from point A to B, and not in the amplitude of a photon to be at point x at time t. The question of the ‘size of the arrows’ then becomes a question related to the so-called propagator function, which gives the probability amplitude for a particle (a photon in this case) to travel from one place to another in a given time. The answer seems to involve another important buzzword when studying quantum mechanics: the gauge parameter. However, that’s also advanced math which I don’t master (as yet). I’ll come back to it… Hopefully… 🙂

Post scriptum 2: As I am re-reading some of my post now (i.e. on 12 January 2015), I noted how immature this post is. I wanted to delete it, but finally I didn’t, as it does illustrate my (limited) progress. I am still struggling with the question of a de Broglie wave for a photon, but I dare to think that my analysis of the question at least is a bit more mature now: please see one of my other posts on it.

Re-visiting the Uncertainty Principle

Let me, just like Feynman did in his last lecture on quantum electrodynamics for Alix Mautner, discuss some loose ends. Unlike Feynman, I will not be able to tie them up. However, just describing them might be interesting and perhaps you, my imaginary reader, could actually help me with tying them up ! Let’s first re-visit the wave function for a photon by way of introduction.

The wave function for a photon

Let’s not complicate things from the start and, hence, let’s first analyze a nice Gaussian wave packet, such as the right-hand graph below: Ψ(x, t). It could be a de Broglie wave representing an electron but here we’ll assume the wave packet might actually represent a photon. [Of course, do remember we should actually show both the real as well as the imaginary part of this complex-valued wave function but we don’t want to clutter the illustration and so it’s only one of the two (cosine or sine). The ‘other’ part (sine or cosine) is just the same but with a phase shift. Indeed, remember that a complex number re^(iθ) is equal to r(cosθ + i·sinθ), and the shape of the sine function is the same as the cosine function but shifted to the left by π/2. So if we have one, we have the other. End of digression.]

example of wave packet

The assumptions associated with this wonderful mathematical shape include the idea that the wave packet is a composite wave consisting of a large number of harmonic waves with wave numbers k1, k2, k3, … all lying around some mean value μk. That is what is shown in the left-hand graph. The mean value is actually noted as k-bar in the illustration above but, because I can’t find a k-bar symbol among the ‘special characters’ in the text editor tool bar here, I’ll use the statistical symbols μ and σ to represent a mean value (μ) and some spread around it (σ). In any case, we have a pretty normal shape here, resembling the Gaussian distribution illustrated below.

normal probability distribution

These Gaussian distributions (also known as density functions) have outliers, but you will catch 95.4% of the observations within the μ ± 2σ interval, and 99.7% within the μ ± 3σ interval (that’s the so-called two- and three-sigma rule). Now, the shape of the left-hand graph of the first illustration, mapping the relation between k and A(k), is the same as this Gaussian density function, and if you would take a little ruler and measure the spread of k on the horizontal axis, you would find that the values for k are effectively spread over an interval that’s somewhat bigger than k-bar plus or minus 2Δk. So let’s say 95.4% of the values of k lie in the interval [μk – 2Δk, μk + 2Δk]. Hence, for all practical purposes, we can write that μk – 2Δk < kn < μk + 2Δk. In any case, we do not care too much about the rest because their contribution to the amplitude of the wave packet is minimal anyway, as we can see from that graph. Indeed, note that the A(k) values on the vertical axis of that graph do not represent the density of the k variable: there is only one wave number for each component wave, and so there’s no distribution or density function of k. These A(k) numbers represent the (maximum) amplitude of the component waves of our wave packet Ψ(x, t). In short, they are the values A(k) appearing in the summation formula for our composite wave, i.e. the wave packet:

Wave packet summation
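For reference, such a summation is usually written as follows (this is the standard textbook form; the illustration shows an equivalent expression):

\[
\Psi(x, t) \;=\; \sum_n A(k_n)\, e^{i(k_n x \,-\, \omega_n t)}
\]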

I don’t want to dwell much more on the math here (I’ve done that in my other posts already): I just want you to get a general understanding of that ‘ideal’ wave packet possibly representing a photon above so you can follow the rest of my story. So we have a (theoretical) bunch of (component) waves with different wave numbers kn, and the spread in these wave numbers – i.e. 2Δk, or let’s take 4Δk to make sure we catch (almost) all of them – determines the length of the wave packet Ψ, which is written here as 2Δx, or 4Δx if we’d want to include (most of) the tail ends as well. What else can we say about Ψ? Well… Maybe something about velocities and all that? OK.

To calculate velocities, we need both ω and k. Indeed, the phase velocity of a wave is equal to vp = ω/k. Now, the wave number k of the wave packet itself – i.e. the wave number of the oscillating ‘carrier wave’ so to say – should be equal to μk, according to the article I took this illustration from. I should check that but, looking at that relationship between A(k) and k, I would not be surprised if the math behind it is right. So we have the k for the wave packet itself (as opposed to the k’s of its components). However, I also need the angular frequency ω.

So what is that ω? Well… That will depend on all the ω’s associated with all the k’s, doesn’t it? It does. But, as I explained in a previous post, the component waves do not necessarily have to travel all at the same speed, and so the relationship between ω and k may not be simple. We would love that, of course, but Nature does what it wants. The only reasonable constraint we can impose on all those ω’s is that they should be some linear function of k. Indeed, if we do not want our wave packet to dissipate (or disperse or, to put it even more plainly, to disappear), then the so-called dispersion relation ω = ω(k) should be linear, so ωn should be equal to ωn = a·kn + b. What are a and b? We don’t know. Random constants. But if the relationship is not linear, then the wave packet will disperse, and it cannot possibly represent a particle – be it an electron or a photon. [A small numerical sketch of this follows below.]
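Here is that sketch (all numbers are arbitrary illustrative values): with a linear dispersion relation ω = a·k + b, the envelope of the packet travels undistorted at the group velocity a.

```python
import numpy as np

# A wave packet built from components with a *linear* dispersion omega = a*k + b
# keeps its shape and moves at the group velocity a (= d omega / d k).
a, b = 2.0, 0.5                          # arbitrary dispersion constants
ks = np.linspace(4.0, 6.0, 41)           # wave numbers around mean k = 5
amps = np.exp(-(ks - 5.0)**2)            # Gaussian amplitudes A(k)

def packet(x, t):
    return sum(A * np.cos(k * x - (a * k + b) * t) for A, k in zip(amps, ks))

x = np.linspace(-10, 50, 2000)
for t in (0.0, 10.0):
    peak = x[np.argmax(np.abs(packet(x, t)))]
    print(f"t = {t}: packet peak near x = {peak:.1f}")   # moves by a*dt = 20
```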

I won’t go through the math all over again but in my Re-visiting the Matter Wave (I), I used the other de Broglie relationship (E = ħω) to show that – for matter waves that do not disperse – we will find that the phase velocity will equal c/β, with β = v/c, i.e. the ratio of the speed of our particle (v) and the speed of light (c). But, of course, photons travel at the speed of light and, therefore, everything becomes very simple and the phase velocity of the wave packet of our photon would equal the group velocity. In short, we have:

vp = ω/k = vg = ∂ω/∂k = c

Of course, I should add that the angular frequency of all component waves will also be equal to ω = ck, so all component waves of the wave packet representing a photon are supposed to travel at the speed of light! What an amazingly simple result!

It is. In order to illustrate what we have here – especially the elegance and simplicity of that wave packet for a photon – I’ve uploaded two gif files (see below). The first one could represent our ‘ideal’ photon: group and phase velocity (represented by the speed of the green and red dot respectively) are the same. Of course, our ‘ideal’ photon would only be one wave packet – not a bunch of them like here – but then you may want to think that the ‘beam’ below might represent a number of photons following each other in a regular procession.

Wave - same group and phase velocity

The second animated gif below shows how phase and group velocity can differ. So that would be a (bunch of) wave packets representing a particle not traveling at the speed of light. The phase velocity here is faster than the group velocity (the red dot travels faster than the green dot). [One can actually also have a wave with positive group velocity and negative phase velocity – quite interesting ! – but so that would not represent a particle wave.] Again, a particle would be represented by one wave packet only (so that’s the space between two green dots only) but, again, you may want to think of this as representing electrons following each other in a very regular procession. 

Wave_group

These illustrations (which I took, once again, from the online encyclopedia Wikipedia) are a wonderful pedagogic tool. I don’t know if it’s by coincidence but the group velocity of the second wave is actually somewhat slower than the first – so the photon versus electron comparison holds (electrons are supposed to move (much) slower). However, as for the phase velocities, they are the same for both waves, and that does not reflect the results we found for matter waves. Indeed, you may or may not remember that we calculated superluminal speeds for the phase velocity of matter waves in that post I mentioned above (Re-visiting the Matter Wave): an electron traveling at a speed of 0.01c (1% of the speed of light) would be represented by a wave packet with a group velocity of 0.01c indeed, but its phase velocity would be 100 times the speed of light, i.e. 100c. [That being said, the second illustration may be interpreted as a little bit correct, as the red dot does travel faster than the green dot, which – as I explained – is not necessarily always the case when looking at such composite waves (we can have slower or even negative speeds).]
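For the record, the calculation behind that superluminal phase velocity is a one-liner, using the two de Broglie relations E = ħω and p = ħk:

\[
v_p = \frac{\omega}{k} = \frac{E/\hbar}{p/\hbar} = \frac{E}{p} = \frac{\gamma m_0 c^2}{\gamma m_0 v} = \frac{c^2}{v} = \frac{c}{\beta}
\]

So an electron traveling at v = 0.01c indeed gets a phase velocity of c/0.01 = 100c.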

Of course, I should once again repeat that we should not think that a photon or an electron is actually wriggling through space like this: the oscillations only represent the real or imaginary part of the complex-valued probability amplitude associated with our ‘ideal’ photon or our ‘ideal’ electron. That’s all. So this wave is an ‘oscillating complex number’, so to say, whose modulus we have to square to get the probability to actually find the photon (or electron) at some point x and some time t. However, the photon (or the electron) itself is just moving straight from left to right, with a speed matching the group velocity of its wave function.

Are they? 

Well… No. Or, to be more precise: maybe. WHAT? Yes, that’s surely one ‘loose end’ worth mentioning! According to QED, photons also have an amplitude to travel faster or slower than light, and they are not necessarily moving in a straight line either. WHAT? Yes. That’s the complicated business I discussed in my previous post. As for the amplitudes to travel faster or slower than light, Feynman dealt with them very summarily. Indeed, you’ll remember the illustration below, which shows that the contributions of the amplitudes associated with slower or faster speed than light tend to nil because (a) their magnitude (or modulus) is smaller and (b) they point in the ‘wrong’ direction, i.e. not the direction of travel.

Contribution interval

Still, these amplitudes are there and – Shock, horror ! – photons also have an amplitude to not travel in a straight line, especially when they are forced to travel through a narrow slit, or right next to some obstacle. That’s diffraction, described as “the apparent bending of waves around small obstacles and the spreading out of waves past small openings” in Wikipedia. 

Diffraction is one of the many phenomena that Feynman deals with in his 1985 Alix G. Mautner Memorial Lectures. His explanation is easy: “not enough arrows” – read: not enough amplitudes to add. With few arrows, there are also few that cancel out indeed, and so the final arrow for the event is quite random, as shown in the illustrations below.

Many arrows Few arrows

So… Not enough arrows… Feynman adds the following on this: “[For short distances] The nearby, nearly straight paths also make important contributions. So light doesn’t really travel only in a straight line; it “smells” the neighboring paths around it, and uses a small core of nearby space. In the same way, a mirror has to have enough size to reflect normally; if the mirror is too small for the core of neighboring paths, the light scatters in many directions, no matter where you put the mirror.” (QED, 1985, p. 54-56)

Not enough arrows… What does he mean by that? Not enough photons? No. Diffraction for photons works just the same as for electrons: even if the photons would go through the slit one by one, we would have diffraction (see my Revisiting the Matter Wave (II) post for a detailed discussion of the experiment). So even one photon is likely to take some random direction left or right after going through a slit, rather than to go straight. Not enough arrows means not enough amplitudes. But what amplitudes is he talking about?

These amplitudes have nothing to do with the wave function of our ideal photon we were discussing above: that’s the amplitude Ψ(x, t) of a photon to be at point x at time t. The amplitude Feynman is talking about is the amplitude of a photon to go from point A to B along one of the infinitely many possible paths it could take. As I explained in my previous post, we have to add all of these amplitudes to arrive at one big final arrow which, over longer distances, will usually be associated with a rather large probability that the photon will travel in a straight line and at the speed of light – which is what light seems to do at a macro-scale. 🙂

But back to that very succinct statement: not enough arrows. That’s obviously a very relative statement. Not enough as compared to what? What measurement scale are we talking about here? It’s obvious that the ‘scale’ of these arrows for electrons is different than for photons, because the 2012 diffraction experiment with electrons that I referred to used 50 nanometer slits (50×10⁻⁹ m), while one of the many experiments demonstrating light diffraction using pretty standard (red) laser light used slits of some 100 micrometer (that’s 100×10⁻⁶ m or – in units you are used to – 0.1 millimeter).

The key to the ‘scale’ here is the wavelength of these de Broglie waves: the slit needs to be ‘small enough’ as compared to these de Broglie wavelengths. For example, the width of the slit in the laser experiment corresponded to (roughly) 100 times the wavelength of the laser light, and the (de Broglie) wavelength of the electrons in that 2012 diffraction experiment was 50 picometer – so the 50 nanometer slit was actually a thousand times the electron wavelength – but it was OK enough to demonstrate diffraction. Much larger slits would not have done the trick. So, when it comes to light, we have diffraction at scales that do not involve nanotechnology, but when it comes to matter particles, we’re not talking micro but nano: that’s a thousand times smaller.

The weird relation between energy and size

Let’s re-visit the Uncertainty Principle, even if Feynman says we don’t need that (we just need to do the amplitude math and we have it all). We wrote the uncertainty principle using the more scientific Kennard formulation: σx·σp ≥ ħ/2, in which the sigma symbol represents the standard deviation of position x and momentum p respectively. Now that’s confusing, you’ll say, because we were talking wave numbers, not momentum in the introduction above. Well… The wave number k of a de Broglie wave is, of course, related to the momentum p of the particle we’re looking at: p = ħk. Hence, a spread in the wave numbers amounts to a spread in the momentum really and, as I wanted to talk scales, let’s now check the dimensions.

The value for ħ is about 1×10⁻³⁴ Joule·seconds (J·s) (it’s 1.054571726(47)×10⁻³⁴ to be precise, but let’s go with the gross approximation for now). One J·s is the same as one kg·m²/s because 1 Joule is shorthand for 1 kg·m²/s². It’s a rather large unit and you probably know that physicists prefer electronVolt·seconds (eV·s) because of that. However, even expressed in eV·s, the value for ħ comes out astronomically small: 6.58211928(15)×10⁻¹⁶ eV·s. In any case, because the J·s makes dimensions come out right, I’ll stick to it for a while. What does this incredibly small factor of proportionality, both in the de Broglie relations as well as in that Kennard formulation of the uncertainty principle, imply? How does it work out from a math point of view?

Well… It’s literally a quantum of measurement: even if Feynman says the uncertainty principle should just be seen “in its historical context”, and that “we don’t need it for adding arrows”, it is a consequence of the (related) position-space and momentum-space wave functions for a particle. In case you would doubt that, check it on Wikipedia: the author of the article on the uncertainty principle derives it from these two wave functions, which form a so-called Fourier transform pair. But so what does it say really?
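In fact, for a Gaussian wave packet, the product σx·σp comes out at exactly ħ/2, i.e. the bound itself. Here’s a minimal numerical check of that (working in natural units with ħ = 1; the value of σx is arbitrary):

```python
import numpy as np

# Numerical check of the Kennard bound for a Gaussian wave packet:
# for a Gaussian, sigma_x * sigma_p equals exactly hbar / 2.
hbar = 1.0                                       # natural units
sigma_x = 0.7                                    # chosen position spread

x = np.linspace(-20, 20, 4001)
psi = np.exp(-x**2 / (4 * sigma_x**2))           # Gaussian amplitude
psi /= np.sqrt(np.trapz(np.abs(psi)**2, x))      # normalize

prob = np.abs(psi)**2
var_x = np.trapz(x**2 * prob, x)                 # mean is 0 by symmetry

# Momentum-space wave function via Fourier transform (p = hbar * k)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(x.size, d=dx))
phi = np.fft.fftshift(np.fft.fft(np.fft.ifftshift(psi)))
prob_k = np.abs(phi)**2
prob_k /= np.trapz(prob_k, k)                    # normalize in k-space
var_k = np.trapz(k**2 * prob_k, k)

print(np.sqrt(var_x) * hbar * np.sqrt(var_k))    # ~ hbar/2 = 0.5
```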

Look at it. First, it says that we cannot know any of the two values exactly (exactly means 100%) because then we have a zero standard deviation for one or the other variable, and then the inequality makes no sense anymore: zero is obviously not greater or equal to 0.527286×10⁻³⁴ J·s. However, the inequality with the value for ħ plugged in shows how close to zero we can get with our measurements. Let’s check it out.

Let’s use the assumption that two times the standard deviation (written as 2Δk or 2Δx on or above the two graphs in the very first illustration of this post) sort of captures the whole ‘range’ of the variable. It’s not a bad assumption: indeed, if Nature would follow normal distributions – and in our macro-world, that seems to be the case – then we’d capture 95.4% of them, so that’s good. Then we can re-write the uncertainty principle as:

Δx·σp ≥ ħ or σx·Δp ≥ ħ

So that means we know x within some interval (or ‘range’ if you prefer that term) Δx or, else, we know p within some interval (or ‘range’ if you prefer that term) Δp. But we want to know both within some range, you’ll say. Of course. In that case, the uncertainty principle can be written as:

Δx·Δp ≥ 2ħ

Huh? Why the factor 2? Well… Each of the two Δ ranges corresponds to 2σ (hence, σx = Δx/2 and σp = Δp/2), and so we have (1/2)Δx·(1/2)Δp ≥ ħ/2. Note that if we would equate our Δ with 3σ to get 99.7% of the values, instead of 95.4% only, once again assuming that Nature distributes all relevant properties normally (not sure – especially in this case, because we are talking discrete quanta of action here – so Nature may want to cut off the ‘tail ends’!), then we’d get Δx·Δp ≥ 4.5×ħ: the cost of extra precision soars! Also note that, if we would equate Δ with σ (the one-sigma rule corresponds to 68.3% of a normally distributed range of values), then we get yet another ‘version’ of the uncertainty principle: Δx·Δp ≥ ħ/2. Pick and choose! And if we want to be purists, we should note that ħ is used when we express things in radians (such as the angular frequency for example: E = ħω), so we should actually use h when we are talking distance and (linear) momentum. The equation above then becomes Δx·Δp ≥ h/π.

It doesn’t matter all that much. The point to note is that, if we express x and p in regular distance and momentum units (m and kg·m/s), then the unit for ħ (or h) is 1×10⁻³⁴. Now, we can sort of choose how to spread the uncertainty over x and p. If we spread it evenly, then we’ll measure both Δx and Δp in units of 1×10⁻¹⁷ m and 1×10⁻¹⁷ kg·m/s respectively. That’s small… but not that small. In fact, it is (more or less) imaginably small I’d say.

For example, a photon of red light (let’s say a wavelength of around 660 nanometer) has a momentum p = h/λ of about 1×10⁻²⁷ kg·m/s (just work it out using the values for h and λ). You would usually see this value measured in a unit that’s more appropriate to the atomic scale: about 1.9 eV/c. [Converting momentum into energy using E = pc, and using the Joule-electronvolt conversion (1 eV ≈ 1.6×10⁻¹⁹ J) will get you there.] Hence, units of 1×10⁻¹⁷ kg·m/s are some ten billion times the rather average momentum of our light photon. We can’t have that, so let’s reduce the uncertainty related to the momentum to the scale of the momentum itself. For our photon, the uncertainty about position then becomes of the order of its wavelength. For the electrons in a (modern) electron microscope – which have a de Broglie wavelength of some 50 picometer and, hence, a momentum p = h/λ of about 1.3×10⁻²³ kg·m/s – the uncertainty about position will be measured in units of 1×10⁻¹² m. That’s the picometer scale, in-between the nanometer (1×10⁻⁹ m) and the femtometer (1×10⁻¹⁵ m) scale. You’ll remember that this scale corresponds to the resolution of such an electron microscope (50 pm). So can we see “uncertainty effects”? Yes. I’ll come back to that.
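A quick sanity check of these orders of magnitude (a minimal sketch; the two wavelengths are just the illustrative values used above):

```python
# Order-of-magnitude check: photon and electron momenta, and the position
# uncertainty that goes with a momentum uncertainty of roughly that size.
h = 6.626e-34       # Planck constant (J*s)
hbar = 1.055e-34    # reduced Planck constant (J*s)
c = 3.0e8           # speed of light (m/s)
eV = 1.602e-19      # 1 electronvolt in Joule

for name, wavelength in [("red photon", 660e-9), ("electron (microscope)", 50e-12)]:
    p = h / wavelength                 # photon / de Broglie momentum
    dx = hbar / (2 * p)                # Kennard bound with sigma_p ~ p
    print(f"{name}: p = {p:.2e} kg*m/s = {p * c / eV:.3g} eV/c, dx >= {dx:.1e} m")
```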

However, before I discuss these, I need to make a little digression. Despite the sub-title I am using above, the uncertainties in distance and momentum we are discussing here are nowhere near what is referred to as the Planck scale in physics: the Planck scale is at the other side of that Great Desert I mentioned. The Large Hadron Collider, which smashes particles with (average) energies of 4 tera-electronvolt (i.e. 4 trillion eV – all packed into one particle !), is probing stuff measuring at a scale of a thousandth of a femtometer (10⁻¹⁸ m), but we’re obviously at the limits of what’s technically possible, and so that’s where the Great Desert starts. The ‘other side’ of that Great Desert is the Planck scale: 10⁻³⁵ m. Now, why is that some kind of theoretical limit? Why can’t we just continue to further cut these scales down? Just like Dedekind did when defining irrational numbers? We can surely get infinitely close to zero, can we? Well… No. The reasoning is quite complex (and I am not sure if I actually understand it the way I should) but it is quite relevant to the topic here (the relation between energy and size), and it goes something like this:

  1. In quantum mechanics, particles are considered to be point-like but they do take space, as evidenced from our discussion on slit widths: light will show diffraction at the micro-scale (10⁻⁶ m) but electrons will do that only at the nano-scale (10⁻⁹ m), so that’s a thousand times smaller. That’s related to their respective de Broglie wavelengths: the wavelength of those electrons is also a thousand times smaller than that of the photons. Now, the de Broglie wavelength is related to the energy and/or the momentum of these particles: E = hf and p = h/λ.
  2. Higher energies correspond to smaller de Broglie wavelengths and, hence, are associated with particles of smaller size. To continue the example, the energy formula to be used in the E = hf relation for an electron – or any particle with rest mass – is the (relativistic) mass-energy equivalence relation: E = γm₀c², with γ the Lorentz factor, which depends on the velocity v of the particle. For example, electrons moving at more or less normal speeds (like in the 2012 experiment, or those used in an electron microscope) have typical (kinetic) energy levels of some 600 eV, and don’t think that’s a lot: the electrons from that cathode ray tube in the back of an old-fashioned TV, which lighted up the screen so you could watch it, had energies in the 20,000 eV range. So, for electrons, we are talking energy levels that are a hundred to ten thousand times higher than for your typical 2 to 10 eV photon.
  3. Of course, I am not talking X or gamma rays here: hard X-rays also have energies of 10 to 100 kilo-electronvolt, and gamma ray energies range from 1 million to 10 million eV (1–10 MeV). In any case, the point to note is that ‘small’ particles must have high energies, and I am not only talking massless particles such as photons. Indeed, in my post End of the Road to Reality?, I discussed the scale of a proton and the scale of quarks: 1.7 and 0.7 femtometer respectively, which is smaller than the so-called classical electron radius. So we have (much) heavier particles here that are smaller? Indeed, the rest mass of the u and d quarks that make up a proton (uud) is 2.4 and 4.8 MeV/c² respectively, while the (theoretical) rest mass of an electron is 0.511 MeV/c² only, so that’s almost 20 times more: (2.4 + 2.4 + 4.8)/0.511 ≈ 19. Well… No. The rest mass of a proton is actually 1836 times the rest mass of an electron: the difference between the added rest masses of the quarks that make it up and the rest mass of the proton itself (938 MeV/c²) is the equivalent mass of the strong force that keeps the quarks together.
  4. But let me not complicate things. Just note that there seems to be a strange relationship between the energy and the size of a particle: high-energy particles are supposed to be smaller, and vice versa: smaller particles are associated with higher energy levels. If we accept this as some kind of ‘factual reality’, then we may understand what the Planck scale is all about: the energy levels associated with theoretical ‘particles’ of the above-mentioned Planck scale (i.e. particles with a size in the 10⁻³⁵ m range) would be in the 10¹⁹ GeV range. So what? Well… This amount of energy corresponds to an equivalent mass density of a black hole. So any ‘particle’ we’d associate with the Planck length would not make sense as a physical entity: it’s the scale where gravity takes over – everything. [A back-of-the-envelope computation of these Planck-scale numbers follows right after this list.]
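That back-of-the-envelope computation goes as follows (a sketch using rounded values for the fundamental constants):

```python
import math

# The Planck scale from first principles: the length at which the Compton
# wavelength of a mass equals its Schwarzschild radius (gravity takes over).
hbar = 1.055e-34   # J*s
G = 6.674e-11      # m^3 / (kg * s^2)
c = 2.998e8        # m/s
eV = 1.602e-19     # J

l_planck = math.sqrt(hbar * G / c**3)          # ~1.6e-35 m
E_planck = math.sqrt(hbar * c**5 / G)          # ~2e9 J
print(f"Planck length: {l_planck:.2e} m")
print(f"Planck energy: {E_planck / eV / 1e9:.2e} GeV")  # ~1.22e19 GeV
```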

Again: so what? Well… I don’t know. It’s just that this is entirely new territory, and it’s also not the topic of my post here. So let me just quote Wikipedia on this and then move on: “The fundamental limit for a photon’s energy is the Planck energy [that’s the 10¹⁹ GeV which I mentioned above: to be precise, that ‘limit energy’ is said to be 1.22 × 10¹⁹ GeV], for the reasons cited above [that ‘photon’ would not be a ‘photon’ but a black hole, sucking up everything around it]. This makes the Planck scale a fascinating realm for speculation by theoretical physicists from various schools of thought. Is the Planck scale domain a seething mass of virtual black holes? Is it a fabric of unimaginably fine loops or a spin foam network? Is it interpenetrated by innumerable Calabi-Yau manifolds which connect our 3-dimensional universe with a higher-dimensional space? [That’s what string theory is about.] Perhaps our 3-D universe is ‘sitting’ on a ‘brane’ which separates it from a 2, 5, or 10-dimensional universe and this accounts for the apparent ‘weakness’ of gravity in ours. These approaches, among several others, are being considered to gain insight into Planck scale dynamics. This would allow physicists to create a unified description of all the fundamental forces. [That’s what these Grand Unification Theories (GUTs) are about.]”

Hmm… I wish I could find some easy explanation of why higher energy means smaller size. I do note there’s an easy relationship between energy and momentum for massless particles traveling at the velocity of light (like photons): E = pc (or p = E/c), but – from what I write above – it is obvious that it’s the spread in momentum (and, therefore, in wave numbers) which determines how short or how long our wave train is, not the energy level as such. I guess I’ll just have to do some more research here and, hopefully, get back to you when I understand things better.

Re-visiting the Uncertainty Principle

You will probably have read countless accounts of the double-slit experiment, and so you will probably remember that these thought or actual experiments also try to watch the electrons as they pass the slits – with disastrous results: the interference pattern disappears. I copy Feynman’s own drawing from his 1965 Lecture on Quantum Behavior below: a light source is placed behind the ‘wall’, right between the two slits. Now, light (i.e. photons) gets scattered when it hits electrons and so now we should ‘see’ through which slit the electron is coming. Indeed, remember that we sent them through these slits one by one, and we still had interference – suggesting the ‘electron wave’ somehow goes through both slits at the same time, which can’t be true – because an electron is a particle.

Watching the electrons

However, let’s re-examine what happens exactly.

  1. We can only detect all electrons if the light is high intensity, and high intensity does not mean higher-energy photons but more photons. Indeed, if the light source is dim, then electrons might get through without being seen. So a high-intensity light source allows us to see all electrons but – as demonstrated not only in thought experiments but also in the laboratory – it destroys the interference pattern.
  2. What if we use lower-energy photons, like infrared light with wavelengths of 10 to 100 microns instead of visible light? We can then use thermal imaging night vision goggles to ‘see’ the electrons. 🙂 And if that doesn’t work, we can use radio waves (or perhaps radar!). The problem – as Feynman explains it – is that such low-frequency light (associated with long wavelengths) only gives a ‘big fuzzy flash’ when the light is scattered: “We can no longer tell which hole the electron went through! We just know it went somewhere!” At the same time, “the jolts given to the electron are now small enough so that we begin to see some interference effect again.” Indeed: “For wavelengths much longer than the separation between the two slits (when we have no chance at all of telling where the electron went), we find that the disturbance due to the light gets sufficiently small that we again get the interference curve P12.” [P12 is the curve describing the original interference effect.]

Now, that would suggest that, when push comes to shove, the Uncertainty Principle only describes some indeterminacy in the so-called Compton scattering of a photon by an electron. This Compton scattering is illustrated below: it’s a more or less elastic collision between a photon and an electron, in which momentum gets exchanged (especially the direction of the momentum) and – quite important – the wavelength of the scattered light is different from that of the incident radiation. Hence, the photon loses some energy to the electron and, because it will still travel at speed c, that means its wavelength must increase as prescribed by the λ = h/p de Broglie relation (with p = E/c for a photon). The change in the wavelength is called the Compton shift, and its formula is given in the illustration: it depends on the (rest) mass of the electron obviously, and on the change in the direction of the momentum of the photon (but that change in direction will obviously also be related to the recoil direction of the electron).

Compton scattering
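For reference, the Compton shift formula that such illustrations display is:

\[
\Delta\lambda \;=\; \lambda' - \lambda \;=\; \frac{h}{m_e c}\,(1 - \cos\theta)
\]

with θ the scattering angle of the photon, and h/mₑc ≈ 2.43×10⁻¹² m the so-called Compton wavelength of the electron.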

This is a very physical interpretation of the Uncertainty Principle, but it’s the one which the great Richard P. Feynman himself stuck to in 1965, i.e. when he wrote his famous Lectures on Physics at the height of his career. Let me quote his interpretation of the Uncertainty Principle in full indeed:

“It is impossible to design an apparatus to determine which hole the electron passes through, that will not at the same time disturb the electrons enough to destroy the interference pattern. If an apparatus is capable of determining which hole the electron goes through, it cannot be so delicate that it does not disturb the pattern in an essential way. No one has ever found (or even thought of) a way around this. So we must assume that it describes a basic characteristic of nature.”

That’s very mechanistic indeed, and it points to indeterminacy rather than ontological uncertainty. However, there’s weirder stuff than electrons being ‘disturbed’ in some kind of random way by the photons we use to detect them, with the randomness only being related to us not knowing at what time photons leave our light source, and what energy or momentum they have exactly. That’s just ‘indeterminacy’ indeed; not some fundamental ‘uncertainty’ about Nature.

We see such ‘weirder stuff’ in those mega- and now tera-electronvolt experiments in particle accelerators. When Feynman wrote his famous Lectures on Physics, the 3 km long Stanford Linear Accelerator (SLAC) was still under construction (its first experiments date from 1966), and stuff like quarks and all that was discovered only in the late 1960s and early 1970s, so that’s after the Lectures. So let me just mention a rather remarkable example of the Uncertainty Principle at work which Feynman quotes in his 1985 Alix G. Mautner Memorial Lectures on Quantum Electrodynamics.

In the Feynman diagram below, we see a photon disintegrating, at time t = T3, into a positron and an electron. The positron (a positron is basically an electron with positive charge: it’s the electron’s anti-matter counterpart) meets another electron that ‘happens’ to be nearby, and the annihilation results in (another) high-energy photon being emitted. While, as Feynman underlines, “this is a sequence of events which has been observed in the laboratory”, how is all this possible? We create matter – an electron and a positron both have considerable mass – out of nothing here ! [Well… OK – there’s a photon, so that’s some energy to work with…]

Photon disintegration

Feynman explains this weird observation without reference to the Uncertainty Principle. He just notes that “Every particle in Nature has an amplitude to move backwards in time, and therefore has an anti-particle.” And so that’s what this electron coming from the bottom-left corner does: it emits a photon and then the electron moves backwards in time. So, while we see a (very short-lived) positron moving forward, it’s actually an electron quickly traveling back in time according to Feynman! And, after a short while, it has had enough of going back in time, so then it absorbs a photon and continues in a slightly different direction. Hmm… If this does not sound fishy to you, it does to me.

The more standard explanation is in terms of the Uncertainty Principle applied to energy and time. Indeed, I mentioned that we have several pairs of conjugate variables in quantum mechanics: position and momentum are one such pair (related through the de Broglie relation p = ħk), but energy and time are another (related through the other de Broglie relation E = hf = ħω). While the ‘energy-time uncertainty principle’ – ΔE·Δt ≥ ħ/2 – resembles the position-momentum relationship above, it is apparently used only for ‘very short-lived products’ produced in high-energy collisions in accelerators. I must assume the short-lived positron in the Feynman diagram is such an example: there is some kind of borrowing of energy (remember mass is equivalent to energy) against time, and then normalcy soon gets restored. Now THAT is something else than indeterminacy I’d say.

uncertainty principle energy time
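To get a feel for the time scales involved, here’s a rough back-of-the-envelope sketch of my own – so treat the numbers as indicative only. If the ‘borrowed’ energy ΔE is the rest energy of an electron-positron pair, then ΔE·Δt ≈ ħ/2 gives the time within which normalcy has to be restored:

```python
# Rough estimate: how long can Nature 'borrow' the energy of an
# electron-positron pair? Δt ≈ ħ / (2·ΔE), with ΔE = 2·m_e·c².
hbar = 1.055e-34                  # reduced Planck constant (J·s)
eV = 1.602e-19                    # one electronvolt in joules
pair_energy = 2 * 0.511e6 * eV    # ≈ 1.022 MeV for the pair

delta_t = hbar / (2 * pair_energy)
print(f"Δt ≈ {delta_t:.1e} s")    # on the order of 10⁻²² seconds
```

So that positron really doesn’t live long: some 10⁻²² seconds on this estimate.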

But so Feynman would say both interpretations are equivalent, because Nature doesn’t care about our interpretations.

What to say in conclusion? I don’t know. I obviously have some more work to do before I’ll be able to claim to understand the uncertainty principle – or quantum mechanics in general – somewhat. I think the next step is to solve my problem with the summary ‘not enough arrows’ explanation, which is – evidently – linked to the relation between the energy and the size of particles. That’s the one loose end I really need to tie up, I feel ! I’ll keep you posted !

Light and matter

In my previous post, I discussed the de Broglie wave of a photon. It’s usually referred to as ‘the’ wave function (or the psi function) but, as I explained, for every psi – i.e. the position-space wave function Ψ(x, t) – there is also a phi – i.e. the momentum-space wave function Φ(p, t).

In that post, I also compared it – without much formalism – to the de Broglie wave of ‘matter particles’. Indeed, in physics, we look at ‘stuff’ as being made of particles and, while the taxonomy of the particle zoo of the Standard Model of physics is rather complicated, one ‘taxonomic’ principle stands out: particles are either matter particles (known as fermions) or force carriers (known as bosons). It’s a strict separation: either/or. No split personalities.

A quick overview before we start…

Wikipedia’s overview of particles in the Standard Model (including the latest addition: the Higgs boson) illustrates this fundamental dichotomy in nature: we have the matter particles (quarks and leptons) on one side, and the bosons (i.e. the force carriers) on the other side.

Standard_Model_of_Elementary_Particles

Don’t be put off by my remark on the particle zoo: it’s a term coined in the 1960s, when the situation was quite confusing indeed (more than 400 ‘particles’!). However, the picture is quite orderly now. In fact, the Standard Model put an end to the discovery of ‘new’ particles, and it’s been stable since the 1970s, as experiments confirmed the reality of quarks. Indeed, all resistance to Gell-Mann’s quarks and his flavor and color concepts – which are just words to describe new types of ‘charge’, similar to electric charge but with more variety – ended when experiments at the Stanford Linear Accelerator Center (SLAC) in November 1974 confirmed the existence of the (second-generation and, hence, heavy and unstable) ‘charm’ quark (again, the names suggest some frivolity but it’s serious physical research).

As for the Higgs boson, its existence had also been predicted – since 1964 to be precise – but it took fifty years to confirm it experimentally, because only something like the Large Hadron Collider could produce the required energy to find it in these particle-smashing experiments – a rather crude way of analyzing matter, you may think, but so be it. [In case you harbor doubts on the Higgs particle, please note that, while CERN is the first to admit further confirmation is needed, the Nobel Prize Committee apparently found the evidence ‘evidence enough’ to finally award Higgs and others a Nobel Prize for their ‘discovery’ of fifty years ago – and, as you know, the Nobel Prize committee members are usually rather conservative in their judgment. So you would have to come up with a rather complex conspiracy theory to deny its existence.]

Also note that the particle zoo is actually less complicated than it looks at first sight: the (composite) particles that are stable in our world – this world – consist of three quarks only: a proton consists of two up quarks and one down quark and, hence, is written as uud, and a neutron is two down quarks and one up quark: udd. Hence, for all practical purposes (i.e. for our discussion of how light interacts with matter), only the so-called first generation of matter-particles – so that’s the first column in the overview above – is relevant.

All the particles in the second and third column are unstable. That being said, they survive long enough – a muon disintegrates after 2.2 millionths of a second (on average) – to deserve the ‘particle’ title, as opposed to a ‘resonance’, whose lifetime can be as short as a billionth of a trillionth of a second – but we’ve gone through these numbers before and so I won’t repeat that here. Why do we need them? Well… We don’t, but they are a by-product of our world view (i.e. the Standard Model) and, for some reason, we find everything that this Standard Model says should exist – even if most of the stuff (all second- and third-generation matter particles, and all these resonances) vanishes rather quickly, which also seems to be consistent with the model. [As for a possible fourth (or higher) generation, Feynman didn’t exclude it when he wrote his 1985 Lectures on quantum electrodynamics, but, checking on Wikipedia, I find the following: “According to the results of the statistical analysis by researchers from CERN and the Humboldt University of Berlin, the existence of further fermions can be excluded with a probability of 99.99999% (5.3 sigma).” If you want to know why… Well… Read the rest of the Wikipedia article. It’s got to do with the Higgs particle.]

As for the (first-generation) neutrino in the table – the only one which you may not be familiar with – these are very spooky things but – I don’t want to scare you – relatively high-energy neutrinos are going through your and my body, right now and here, at a rate of some hundred trillion per second. They are produced by stars (stars are huge nuclear fusion reactors, remember?), and also as a by-product of these high-energy collisions in particle accelerators of course. But they are very hard to detect: the first trace of their existence was found only in 1956 – 26 years after their existence had been postulated. The fact that Wolfgang Pauli proposed their existence in 1930, to explain how beta decay could conserve energy, momentum and spin (angular momentum), demonstrates not only the genius but also the confidence of these early theoretical quantum physicists. Most neutrinos passing through Earth are produced by our Sun. Now they are being analyzed more routinely. The largest neutrino detector on Earth is called IceCube. It sits on the South Pole – or under it, as it’s suspended under the Antarctic ice – and it regularly captures high-energy neutrinos in the range of 1 to 10 TeV.

Let me – to conclude this introduction – just quickly list and explain the bosons (i.e. the force carriers) in the table above:

1. Of all of the bosons, the photon (i.e. the topic of this post) is the most straightforward: there is only one type of photon, even if it comes in different possible states of polarization.

[…]

I should probably do a quick note on polarization here – even if all of the stuff that follows will make abstraction of it. Indeed, the discussion on photons that follows (largely adapted from Feynman’s 1985 Lectures on Quantum Electrodynamics) assumes that there is no such thing as polarization – because it would make everything even more complicated. The concept of polarization (linear, circular or elliptical) has a direct physical interpretation in classical mechanics (i.e. light as an electromagnetic wave). In quantum mechanics, however, polarization becomes a so-called qubit (quantum bit): leaving aside so-called virtual photons (these are short-range disturbances going between a proton and an electron in an atom – effectively mediating the electromagnetic force between them), the property of polarization comes in two basis states (0 and 1, or left and right), but these two basis states can be superposed. In ket notation: if ¦0〉 and ¦1〉 are the basis states, then any linear combination α·¦0〉 + β·¦1〉 is also a valid state, provided │α│2 + │β│2 = 1, in line with the need to get probabilities that add up to one.
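Just to make that normalization condition concrete, here’s a two-line numerical check (an illustration of │α│2 + │β│2 = 1, nothing more – the particular values of α and β are made up):

```python
import cmath, math

# A polarization qubit α·|0> + β·|1> with an arbitrary relative phase.
alpha = 1 / math.sqrt(2)
beta = cmath.exp(1j * math.pi / 4) / math.sqrt(2)

# The two probabilities must add up to one: |α|² + |β|² = 1.
print(abs(alpha)**2, abs(beta)**2, abs(alpha)**2 + abs(beta)**2)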

In case you wonder why I am introducing these kets: there is no particular reason for it, except that I will be introducing some other tools in this post – such as Feynman diagrams – and so that’s all. In order to wrap this up, I need to note that kets are used in conjunction with bras. So we have a bra-ket notation: the ket gives the starting condition, and the bra – denoted as 〈 ¦ – gives the final condition. They are combined in statements such as 〈 particle arrives at x ¦ particle leaves from s 〉 or – in short – 〈 x ¦ s 〉 and, while x and s would have some real-number value, 〈 x ¦ s 〉 would denote the (complex-valued) probability amplitude associated with the event consisting of these two conditions (i.e. the starting and the final condition).

But don’t worry about it. This digression is just what it is: a digression. Oh… Just make a mental note that the so-called virtual photons (the mediators that are supposed to keep the electron in touch with the proton) have four possible states of polarization – instead of two. They are related to the four directions of spacetime: the three directions of space (x, y and z) and time (t). 🙂

2. Gluons, the exchange particles for the strong force, are more complicated: they come in eight so-called colors. In practice, one should think of these colors as different charges, but so we have more elementary charges in this case than just plus or minus one (±1) – as we have for the electric charge. So it’s just another type of qubit in quantum mechanics.

[Note that the so-called elementary ±1 values for electric charge are not really elementary: it’s –1/3 (for the down quark, and for the second- and third-generation strange and bottom quarks as well) and +2/3 (for the up quark as well as for the second- and third-generation charm and top quarks). That being said, electric charge takes two values only, and the ±1 value is easily found from a linear combination of the –1/3 and +2/3 values: the proton’s +1 charge, for example, is just 2/3 + 2/3 – 1/3.]

3. Z and W bosons carry the so-called weak force, aka Fermi’s interaction: they explain how one type of quark can change into another, thereby explaining phenomena such as beta decay. Beta decay explains why carbon-14 will, after a very long time (as compared to the ‘unstable’ particles mentioned above), spontaneously decay into nitrogen-14. Indeed, carbon-12 is the (very) stable isotope, while carbon-14 has a half-life of 5,730 ± 40 years ‘only’ (so one can’t really call carbon-14 ‘unstable’: perhaps ‘less stable’ will do) and, hence, measuring how much carbon-14 is left in some organic substance allows us to date it (that’s what (radio)carbon-dating is about – see the little calculation below). As for the name, a beta particle can refer to an electron or a positron, so we can have β– decay (e.g. the above-mentioned carbon-14 decay) as well as β+ decay (e.g. magnesium-23 into sodium-23). There’s also alpha and gamma decay but that involves different things.
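As for that little calculation: carbon-dating just inverts the exponential decay law. If a fraction f of the original carbon-14 is left, the age is t = t½·log2(1/f). A quick sketch, using the 5,730-year half-life mentioned above:

```python
import math

HALF_LIFE_C14 = 5730.0   # years

def age_in_years(fraction_left: float) -> float:
    """Age of an organic sample from the fraction of carbon-14 remaining."""
    return HALF_LIFE_C14 * math.log2(1.0 / fraction_left)

for f in (0.5, 0.25, 0.1):
    print(f"{f:.0%} of the C-14 left → about {age_in_years(f):,.0f} years old")
```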

As you can see from the table, W± and Z0 bosons are very heavy (some 157,000 and 178,000 times heavier than an electron!), and the W± carry (positive or negative) electric charge. So why don’t we see them? Well… They are so short-lived that we never observe them directly: they only leave a very tiny little trace, so they resemble resonances in experiments. That’s also the reason why we see little or nothing of the weak force in real life: the force-carrying particles mediating this force don’t get anywhere, as the quick estimate below illustrates.
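That quick estimate combines the energy-time uncertainty relation with the W mass: a ‘borrowed’ W boson can only exist for Δt ≈ ħ/(mWc²), so it can travel at most about c·Δt. A sketch of my own (the numbers are approximate):

```python
# Why W bosons 'go nowhere': range ≈ c·Δt with Δt ≈ ħ/(m_W·c²).
hbar = 1.055e-34         # reduced Planck constant (J·s)
c = 2.998e8              # speed of light (m/s)
eV = 1.602e-19           # one electronvolt in joules
m_W_c2 = 80.4e9 * eV     # W boson rest energy ≈ 80.4 GeV

delta_t = hbar / m_W_c2
print(f"Δt ≈ {delta_t:.1e} s, range ≈ {c * delta_t:.1e} m")
# ≈ 8e-27 s and 2.5e-18 m: about a thousandth of a proton's diameter
```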

4. Finally, as mentioned above, the Higgs particle – and, hence, the associated Higgs field – had been predicted since 1964 already, but its existence was only (tentatively) confirmed experimentally last year. The Higgs field gives fermions, and also the W and Z bosons, mass (but not photons and gluons), and – as mentioned above – that’s why the weak force has such short range as compared to the electromagnetic and strong forces. Note, however, that the Higgs particle does not actually explain the gravitational force, so it’s not the (theoretical) graviton, and there is no quantum field theory for the gravitational force as yet. Just Google it and you’ll quickly find out why: there are theoretical as well as practical (experimental) reasons for that.

The Higgs field stands out from the other force fields because it’s a scalar field (as opposed to a vector field). However, I have no idea how this so-called Higgs mechanism (i.e. the interaction with matter particles (i.e. with the quarks and leptons, but not directly with neutrinos it would seem from the diagram below), with W and Z bosons, and with itself – but not with the massless photons and gluons) actually works. But then I still have a very long way to go on this Road to Reality.

2000px-Elementary_particle_interactions.svg

In any case… The topic of this post is to discuss light and its interaction with matter – not the weak or strong force, nor the Higgs field.

Let’s go for it.

Amplitudes, probabilities and observable properties

Being born a boson or a fermion makes a big difference. That being said, both fermions and bosons are wavicles described by a complex-valued psi function, colloquially known as the wave function. To be precise, there will be several wave functions, and the square of their modulus (sorry for the jargon) will give you the probability of some observable property having a value in some relevant range, usually denoted by Δ. [I also explained (in my post on Bose and Fermi) how the rules for combining amplitudes differ for bosons versus fermions, and how that explains why they are what they are: matter particles occupy space, while photons not only can but also like to crowd together in, for example, a powerful laser beam. I’ll come back to that.]

For all practical purposes, relevant usually means ‘small enough to be meaningful’. For example, we may want to calculate the probability of detecting an electron in some tiny spacetime interval (Δx, Δt). [Again, ‘tiny’ in this context means small enough to be relevant: if we are looking at a hydrogen atom (whose size is of the order of a tenth of a nanometer), then Δx is likely to be a cube or a sphere with an edge or a radius of a few picometer only (a picometer is a thousandth of a nanometer, so it’s a millionth of a millionth of a meter); and, noting that the electron’s speed is approximately 2200 km per second… Well… I will let you calculate a relevant Δt – or check the little sketch below. :-)]
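In case you want to check your answer, here’s that little sketch (with a Δx of a couple of picometer, just as an example):

```python
delta_x = 2e-12        # a couple of picometer (m)
v = 2.2e6              # the electron's speed: ≈ 2200 km/s (m/s)
print(f"Δt ≈ {delta_x / v:.1e} s")   # ≈ 1e-18 s: attosecond territory
```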

If we want to do that, then we will need to square the modulus of the corresponding wave function Ψ(x, t). To be precise, we will have to sum all the values │Ψ(x, t)│2 over the interval and, because x and t are real (and, hence, continuous) numbers, that means doing some integral (because an integral is the continuous version of a sum).

But that’s only one example of an observable property: position. There are others. For example, we may not be interested in the particle’s exact position but only in its momentum or energy. Well, we have another wave function for that: the momentum-space wave function Φ(p, t). In fact, if you looked at my previous posts, you’ll remember the two are related because position and momentum are conjugate variables: the two wave functions are Fourier transforms of one another. A less formal way of expressing that is to refer to the uncertainty principle. But this is not the time to repeat things.
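You can actually see this Fourier duality at work numerically: squeeze a Gaussian wave packet in position-space and its momentum-space counterpart spreads out, with the product of the two spreads staying put at the minimum value of 1/2 (in units of ħ). A little toy example of my own, using numpy:

```python
import numpy as np

def spread(values, density):
    """Standard deviation of a (discretized) probability density."""
    w = density / density.sum()
    mean = (w * values).sum()
    return np.sqrt((w * values**2).sum() - mean**2)

x = np.linspace(-50, 50, 4096)
dx = x[1] - x[0]
k = 2 * np.pi * np.fft.fftshift(np.fft.fftfreq(x.size, dx))

for sigma in (0.5, 2.0, 8.0):
    psi = np.exp(-x**2 / (2 * sigma**2))       # position-space amplitude
    phi = np.fft.fftshift(np.fft.fft(psi))     # momentum-space amplitude
    dx_s = spread(x, np.abs(psi)**2)
    dk_s = spread(k, np.abs(phi)**2)
    print(f"Δx = {dx_s:.3f}, Δk = {dk_s:.3f}, Δx·Δk = {dx_s * dk_s:.3f}")
```

A narrow packet in x gives a wide packet in k, and vice versa – but the product Δx·Δk stays at 0.5, the minimum allowed for a Gaussian.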

The bottom line is that all particles travel through spacetime with a backpack full of complex-valued wave functions. We don’t know who and where these particles are exactly, and so we can’t talk to them – but we can e-mail God and He’ll send us the wave function that we need to calculate some probability we are interested in because we want to check – in all kinds of experiments designed to fool them – if it matches with reality.

As mentioned above, I highlighted the main difference between bosons and fermions in my Bose and Fermi post, so I won’t repeat that here. Just note that, when it comes to working with those probability amplitudes (that’s just another word for these psi and phi functions), it makes a huge difference: fermions and bosons interact very differently. Bosons are party particles: they like to crowd and will always welcome an extra one. Fermions, on the other hand, will exclude each other: that’s why there’s something referred to as the Pauli exclusion principle in quantum mechanics. And that’s why fermions make matter (matter needs space) and bosons are force carriers (they’ll just call friends to help when the load gets heavier).

Light versus matter: Quantum Electrodynamics

OK. Let’s get down to business. This post is about light, or about light-matter interaction. Indeed, in my previous post (on Light), I promised to say something about the amplitude of a photon to go from point A to B (because – as I wrote in my previous post – that’s more ‘relevant’, when it comes to explaining stuff, than the amplitude of a photon to actually be at point x at time t), and so that’s what I will do now.

In his 1985 Lectures on Quantum Electrodynamics (which are lectures for the lay audience), Feynman writes the amplitude of a photon to go from point A to B as P(A to B) – and the P stands for photon obviously, not for probability. [I am tired of repeating that you need to square the modulus of an amplitude to get a probability but – here you are – I have said it once more.] That’s in line with the other fundamental wave function in quantum electrodynamics (QED): the amplitude of an electron to go from A to B, which is written as E(A to B). [You got it: E just stands for electron, not for our electric field vector.]

I also talked about the third fundamental amplitude in my previous post: the amplitude of an electron to absorb or emit a photon. So let’s have a look at these three. As Feynman says: “Out of these three amplitudes, we can make the whole world, aside from what goes on in nuclei, and gravitation, as always!”

Well… Thank you, Mr Feynman: I’ve always wanted to understand the World (especially if you made it).

The photon-electron coupling constant j

Let’s start with the last of those three amplitudes (or wave functions): the amplitude of an electron to absorb or emit a photon. Indeed, absorbing or emitting makes no difference: we have the same complex number for both. It’s a constant – denoted by j (for junction number) – equal to –0.1 (a bit less actually but it’s good enough as an approximation in the context of this blog).

Huh? Minus 0.1? That’s not a complex number, is it? It is. Real numbers are complex numbers too: –0.1 is 0.1·eiπ in polar form. As Feynman puts it: it’s “a shrink to about one-tenth, and half a turn.” The ‘shrink’ is the 0.1 magnitude of this vector (or arrow), and the ‘half-turn’ is the angle of π (i.e. 180 degrees). He obviously refers to multiplying (not adding) j with other amplitudes, e.g. P(A, C) and E(B, C) if the coupling is to happen at or near C. And, as you’ll remember, multiplying complex numbers amounts to adding their phases and multiplying their moduli (so that’s adding the angles and multiplying the lengths).
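If you want to see this ‘shrink and turn’ arithmetic in action, here’s a trivial numerical sketch (the 0.8 amplitude is made up; only j is Feynman’s):

```python
import cmath

j = 0.1 * cmath.exp(1j * cmath.pi)   # 'a shrink to one-tenth and half a turn'
print(j)                              # ≈ (-0.1+0j): indeed just the real number -0.1

a = cmath.rect(0.8, cmath.pi / 6)     # some amplitude: length 0.8, angle 30°
product = a * j
# lengths multiply: 0.8·0.1 = 0.08; angles add: 30° + 180° = 210° (i.e. -150°)
print(abs(product), cmath.phase(product) * 180 / cmath.pi)
```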

Let’s introduce a Feynman diagram at this point – drawn by Feynman himself – which shows three possible ways of two electrons exchanging a photon. We actually have two couplings here, and so the combined amplitude will involve two j‘s. In fact, if we label the starting point of the two lines representing our electrons as 1 and 2 respectively, and their end points as 3 and 4, then the amplitude for these events will be given by:

E(1 to 5)·j·E(5 to 3)·E(2 to 6)·j·E(6 to 4)·P(5 to 6)

As for how that j factor works, please do read the caption of the illustration below: the same j describes both emission and absorption. It’s just that we have both an emission as well as an absorption here, so we have a j2 factor, which is less than 0.1·0.1 = 0.01. At this point, it’s worth noting that the amplitudes we’re talking about here – i.e. for one possible way of an exchange like the one below happening – are obviously very tiny. They only become significant when we add many of these amplitudes, which – as explained below – is what has to happen: one has to consider all possible paths, calculate the amplitudes for them (through multiplication), and then add all these amplitudes, to then – finally – square the modulus of the combined ‘arrow’ (or amplitude) to get some probability of something actually happening. [Again, that’s the best we can do: calculate probabilities that correspond to experimentally measured occurrences. We cannot predict anything in the classical sense of the word.]

Feynman diagram of photon-electron coupling

A Feynman diagram is not just some sketchy drawing. For example, we have to care about scales: the distance and time units are equivalent (so distance would be measured in light-seconds or, else, time would be measured in units equivalent to the time needed for light to travel one meter). Hence, particles traveling through time (and space) – from the bottom of the graph to the top – will usually not be traveling at an angle of more than 45 degrees (as measured from the time axis) but, from the graph above, it is clear that photons do. [Note that electrons moving through spacetime are represented by plain straight lines, while photons are represented by wavy lines. It’s just a matter of convention.]

More importantly, a Feynman diagram is a pictorial device showing what needs to be calculated and how. Indeed, with all the complexities involved, it is easy to lose track of what should be added and what should be multiplied, especially when it comes to much more complicated situations like the one described above (e.g. making sense of a scattering event). So, while the coupling constant j (aka the ‘charge’ of a particle – but it’s obviously not the electric charge) is just a number, calculating an actual E(A to B) amplitude is not easy – not only because there are many different possible routes (paths) but because (almost) anything can happen. Let’s have a closer look at it.

E(A to B)

As Feynman explains in his 1985 QED Lectures: “E(A to B) can be represented as a giant sum of a lot of different ways an electron can go from point A to B in spacetime: the electron can take a ‘one-hop flight’, going directly from point A to B; it could take a ‘two-hop flight’, stopping at an intermediate point C; it could take a ‘three-hop flight’ stopping at points D and E, and so on.”

Fortunately, the calculation re-uses known values: the amplitude for each ‘hop’ – from F to G, for example – is P(F to G) – so that’s the amplitude of a photon (!) to go from F to G – even if we are talking about an electron here. But there’s a difference: we also have to multiply the amplitudes for each ‘hop’ with the amplitude for each ‘stop’, and that’s represented by another number – not j but n2. So we have an infinite series of terms for E(A to B): P(A to B) + P(A to C)·n2·P(C to B) + P(A to D)·n2·P(D to E)·n2·P(E to B) + … for all possible intermediate points C, D, E, and so on, as per the illustration below.

E(A to B)

You’ll immediately ask: what’s the value of n? It’s quite important to know it, because we want to know how big these n2, n4, etcetera terms are. I’ll be honest: I have not come to terms with that yet. According to Feynman (QED, p. 125), it is the ‘rest mass’ of an ‘ideal’ electron: an ‘ideal’ electron is an electron that doesn’t know Feynman’s amplitude theory and just goes from point to point in spacetime using only the direct path. 🙂 Hence, it’s not a probability amplitude like j: a proper probability amplitude will always have a modulus less than 1, and so when we see terms like j2, j4,… we know we should not be all that worried – because these sorts of terms vanish (go to zero) for sufficiently large exponents. For E(A to B), we do not have such vanishing terms. I will not dwell on this right here, but I promise to discuss it in the Post Scriptum of this post. The frightening possibility is that n might be a number larger than one.
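Just to illustrate why a modulus smaller than one is so reassuring – and why a factor larger than one would be frightening – compare how the successive even powers behave (a toy calculation, nothing more):

```python
# Successive even powers: j-like factors die out, factors above 1 blow up.
for factor in (0.1, 1.5):
    powers = [factor**(2 * k) for k in range(1, 6)]
    print(factor, ["%.2e" % p for p in powers])
```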

[As we’re freewheeling a bit anyway here, just a quick note on conventions: I should not be writing j in bold-face, because it’s a (complex- or real-valued) number, and symbols representing numbers are usually not written in bold-face: vectors are written in bold-face. So, while you can look at a complex number as a vector, well… It’s just one of these inconsistencies I guess. The problem with using bold-face letters to represent complex numbers (like amplitudes) is that they suggest that the ‘dot’ in a product (e.g. j·j) is an actual dot product (aka a scalar product or an inner product) of two vectors. That’s not the case. We’re multiplying complex numbers here, and so we’re just using the standard definition of a product of complex numbers. This subtlety probably explains why Feynman prefers to write the above product as P(A to B) + P(A to C)*n2*P(C to B) + P(A to D)*n2*P(D to E)*n2*P(E to B) + … But then I find that using that asterisk to represent multiplication is a bit funny (although it’s a pretty common thing in complex math) and so I am not using it. Just be aware that a dot in a product may not always mean the same type of multiplication: multiplying complex numbers and multiplying vectors is not the same. […] And I won’t write j in bold-face anymore.]

P(A to B)

Regardless of the value for n, it’s obvious we need a functional form for P(A to B), because that’s the other thing (other than n) that we need to calculate E(A to B). So what’s the amplitude of a photon to go from point A to B?

Well… The function describing P(A to B) is obviously some wave function – so that’s a complex-valued function of x and t. It’s referred to as a (Feynman) propagator: a propagator function gives the probability amplitude for a particle to travel from one place to another in a given time, or to travel with a certain energy and momentum. [So our function for E(A to B) will be a propagator as well.] You can check out the details on Wikipedia. Indeed, I could insert the formula here, but believe me if I say it would only confuse you. The points to note are that:

  1. The propagator is also derived from the wave equation describing the system, so that’s some kind of differential equation which incorporates the relevant rules and constraints that apply to the system. For electrons, that’s the Schrödinger equation I presented in my previous post. For photons… Well… As I mentioned in my previous post, there is ‘something similar’ for photons – there must be – but I have not seen anything that’s equally ‘simple’ as the Schrödinger equation for photons. [I have Googled a bit but it’s obvious we’re talking pretty advanced quantum mechanics here – so it’s not the QM-101 course that I am currently trying to make sense of.] 
  2. The most important thing (in this context at least) is that the key variable in this propagator (i.e. the Feynman propagator for the photon) is I: that spacetime interval which I mentioned in my previous post already:

I = Δr² – Δt² = (z2 – z1)² + (y2 – y1)² + (x2 – x1)² – (t2 – t1)²

In this equation, we need to measure the time and spatial distance between two points in spacetime in equivalent units (these ‘points’ are usually referred to as four-vectors), so we’d use light-seconds for the unit of distance or, for the unit of time, the time it takes for light to travel one meter. [If we don’t want to transform the time or distance scales, then we have to write I as I = Δr² – c²Δt².] Now, there are three types of intervals:

  1. For time-like intervals, we have a negative value for I, so Δt² > Δr². For two events separated by a time-like interval, enough time passes between them so there could be a cause–effect relationship between the two events. In a Feynman diagram, the line between the two events will be at less than 45 degrees from the (vertical) time axis. The traveling electrons in the Feynman diagrams above are an example.
  2. For space-like intervals, we have a positive value for I, so Δt² < Δr². Events separated by space-like intervals cannot possibly be causally connected. The photons traveling between points 5 and 6 in the first Feynman diagram are an example, but then photons do have amplitudes to travel faster than light.
  3. Finally, for light-like intervals, I = 0, or Δt² = Δr². The points connected by the 45-degree lines in the illustration below (which Feynman uses to introduce his Feynman diagrams) are an example of points connected by light-like intervals.

[Note that we are using the so-called space-like convention (+++–) here for I. There’s also a time-like convention, i.e. with +––– as signs: I = Δt² – Δr². So just check which convention is used when you consult other sources on this (which I recommend), and in case you’d feel I am not getting the signs right.]
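A small function makes the classification automatic (using the space-like convention and units where c = 1, as above):

```python
def classify_interval(dt, dx, dy=0.0, dz=0.0):
    """Spacetime interval I = Δr² − Δt² in units where c = 1 (+++− convention)."""
    I = dx**2 + dy**2 + dz**2 - dt**2
    if I < 0:
        return I, "time-like: a cause-effect relation is possible"
    if I > 0:
        return I, "space-like: no causal connection possible"
    return I, "light-like: on the light cone"

for dt, dx in ((2.0, 1.0), (1.0, 2.0), (1.0, 1.0)):
    print(f"Δt = {dt}, Δr = {dx} → {classify_interval(dt, dx)}")
```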

Spacetime intervals

Now, what’s the relevance of this? To calculate P(A to B), we have to add the amplitudes for all possible paths that the photon can take – not in space, but in spacetime. So we should add all these vectors (or ‘arrows’ as Feynman calls them) – an infinite number of them really. In the meanwhile, you know it amounts to adding complex numbers, and that infinite sums are done by doing integrals, but let’s take a step back: how are vectors added?

Well… That’s easy, you’ll say… It’s the parallelogram rule… Well… Yes. And no. Let me take a step back here to show how adding a whole range of similar amplitudes works.

The illustration below shows a bunch of photons – real or imagined – from a source above a water surface (the sun for example), all taking different paths to arrive at a detector under the water (let’s say some fish looking at the sky from under the water). In this case, we make abstraction of all the photons leaving at different times and so we only look at a bunch that’s leaving at the same point in time. In other words, their stopwatches will be synchronized (i.e. there is no phase shift term in the phase of their wave function) – let’s say at 12 o’clock when they leave the source. [If you think this simplification is not acceptable, well… Think again.]

When these photons hit the retina of our poor fish’s eye (I feel we should put a detector there, instead of a fish), their stopwatches will stop, and the hand of each stopwatch represents an amplitude: it has a modulus (its length) – which is assumed to be the same for all of them because all paths are equally likely (this is one of the first principles of QED) – but their directions are very different. However, by now we are quite familiar with these operations: we add all the ‘arrows’ indeed (or vectors or amplitudes or complex numbers or whatever you want to call them) and get one big final arrow, shown at the bottom – just above the caption. Look at it very carefully.

adding arrows

If you look at the so-called contribution made by each of the individual arrows, you can see that it’s the arrows associated with the path of least time and the paths immediately left and right of it that make the biggest contribution to the final arrow. Why? Because these stopwatches arrive around the same time and, hence, their hands point more or less in the same direction. It doesn’t matter what direction – as long as it’s more or less the same.

[As for the calculation of the path of least time, that has to do with the fact that light is slowed down in water. Feynman shows why in his 1985 Lectures on QED, but I cannot possibly copy the whole book here ! The principle is illustrated below.]

Least time principle
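By the way, if you’d like to see those turning stopwatches add up numerically, here’s a little sketch of my own of the sun-to-fish example: each surface crossing point defines a path, each path gets a unit arrow with phase ω·t(path), and the arrows near the least-time path dominate the sum (all the numbers below are arbitrary, of course):

```python
import numpy as np

c = 1.0                   # speed of light in air (arbitrary units)
n = 1.33                  # refractive index of water: speed below is c/n
source = (-5.0, 4.0)      # (x, height above the surface)
detector = (5.0, -3.0)    # (x, depth below the surface)
omega = 50.0              # how fast the stopwatch hand turns

x_cross = np.linspace(-10.0, 10.0, 2001)   # where each path crosses the surface
t_air = np.hypot(x_cross - source[0], source[1]) / c
t_water = np.hypot(detector[0] - x_cross, detector[1]) / (c / n)
t_total = t_air + t_water

arrows = np.exp(1j * omega * t_total)      # one unit 'arrow' per path
final_arrow = arrows.sum()

print(f"least-time crossing at x ≈ {x_cross[np.argmin(t_total)]:.2f}")
print(f"|final arrow| ≈ {abs(final_arrow):.0f} (out of {arrows.size} unit arrows)")
```

The final arrow’s length is only a small fraction of the 2,001 unit arrows: most of them point every which way and cancel; only the ones near the least-time crossing point line up.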

So, where are we? These digressions go on and on, don’t they? Let’s go back to the main story: we want to calculate P(A to B), remember?

As mentioned above, one of the first principles in QED is that all paths – in spacetime – are equally likely. So we need to add amplitudes for every possible path in spacetime using that Feynman propagator function. You can imagine that will be some kind of integral which you’ll never want to solve. Fortunately, Feynman’s disciples have done that for you already. The result is quite predictable: the grand result is that light has a tendency to travel in straight lines and at the speed of light.

WHAT!? Did Feynman get a Nobel prize for trivial stuff like that?

Yes. The math involved in adding amplitudes over all possible paths not only in space but also in time uses the so-called path integral formulation of quantum mechanics and so that’s got Feynman’s signature on it, and that’s the main reason why he got this award – together with Julian Schwinger and Sin-Itiro Tomonaga: both much less well known than Feynman, but so they shared the burden. Don’t complain about it. Just take a look at the ‘mechanics’ of it.

We already mentioned that the key variable in the propagator is the spacetime interval I. Now, the way it works is that, for values of I equal or close to zero – so for the paths that are associated with light-like intervals – our propagator function will yield large contributions in the ‘same’ direction (wherever that direction is), but for the spacetime intervals that are very much time- or space-like, the magnitude of our amplitude will be smaller and – worse – our arrow will point in the ‘wrong’ direction. In short, the arrows associated with the time- and space-like intervals don’t add up to much, especially over longer distances. [When distances are short, there are (relatively) few arrows to add, and so the probability distribution will be flatter: in short, the likelihood of having the actual photon travel faster or slower than the speed of light is higher.]

Contribution interval

Conclusion

Does this make sense? I am not sure, but I did what I promised to do. I told you how P(A to B) gets calculated; and from the formula for E(A to B), it is obvious that we can then also calculate E(A to B), provided we have a value for n. However, that value of n is determined experimentally, just like the value of j, in order to ensure this amplitude theory yields probabilities that match the probabilities we observe in all kinds of crazy experiments that try to prove or disprove the theory. And then we can use these three amplitude formulas “to make the whole world”, as Feynman calls it, except the stuff that goes on inside of nuclei (because that’s the domain of the weak and strong nuclear force) and gravitation, for which we have a law (Newton’s Law) but no real ‘explanation’. [Now, you may wonder if this QED explanation of light is really all that good, but Mr Feynman thinks it is, and so I have no reason to doubt it – especially because there’s surely not anything more convincing lying around as far as I know.]

So what remains to be told? Lots of things, even within the realm of expertise of quantum electrodynamics. Indeed, Feynman applies the basics as described above to a number of real-life phenomena – quite interesting, all of it ! – but, once again, it’s not my goal to copy all of his Lectures here. [I am only hoping to offer some good summaries of key points in some attempt to convince myself that I am getting some of it at least.] And then there is the strong force, and the weak force, and the Higgs field, and so on and so on. But that’s all very strange and new territory which I haven’t even started to explore. I’ll keep you posted as I am making my way towards it.

Post scriptum: On the values of j and n

In this post, I promised I would write something about how we can find j and n. I won’t really do that here, because it would just amount to copying three or four pages out of that book I mentioned above, and which inspired most of this post. Let me just say something more about that remarkable book, and then quote a few lines on what the author of that book – the great Mr Feynman ! – thinks of the math behind calculating these two constants (the coupling constant j, and the ‘rest mass’ of an ‘ideal’ electron). Now, before I do that, I should repeat that he actually invented that math (it makes use of a mathematical approximation method called perturbation theory) and that he got a Nobel Prize for it.

First, about the book. Feynman’s 1985 Lectures on Quantum Electrodynamics are not like his 1965 Lectures on Physics. The Lectures on Physics are proper courses for undergraduate and even graduate students in physics. This little 1985 book on QED is just a series of four lectures for a lay audience, conceived in honor of Alix G. Mautner. She was a friend of Mr Feynman’s who died a few years before he gave and wrote these ‘lectures’ on QED. She had a degree in English literature and would ask Mr Feynman regularly to explain quantum mechanics and quantum electrodynamics in a way she would understand. While they had known each other for about 22 years, he had apparently never taken enough time to do so, as he writes in his Introduction to these Alix G. Mautner Memorial Lectures: “So here are the lectures I really [should have] prepared for Alix, but unfortunately I can’t tell them to her directly, now.”

The great Richard Phillips Feynman himself died only three years later, in February 1988 – not of one but of two rare forms of cancer. He was only 69 years old when he died. I don’t know if he was aware of the cancer(s) that would kill him, but I find his fourth and last lecture in the book, Loose Ends, just fascinating. Here we have a brilliant mind deprecating the math that earned him a Nobel Prize and without which the Standard Model would be unintelligible. I won’t try to paraphrase him. Let me just quote him. [If you want to check the quotes, the relevant pages are pages 125 to 131.]

[The math behind calculating these constants] is a “dippy process” and “having to resort to such hocus-pocus has prevented us from proving that the theory of quantum electrodynamics is mathematically self-consistent“. He adds: “It’s surprising that the theory still hasn’t been proved self-consistent one way or the other by now; I suspect that renormalization [“the shell game that we play to find n and j”, as he calls it] is not mathematically legitimate.” […] Now, Mr Feynman writes this about quantum electrodynamics, not about “the rest of physics” (so that’s quantum chromodynamics (QCD) – the theory of the strong interactions – and quantum flavordynamics (QFD) – the theory of the weak interactions) which, he adds, “has not been checked anywhere near as well as electrodynamics.”

That’s a pretty damning statement, isn’t it? In one of my other posts (see: The End of the Road to Reality?), I explore these comments a bit. However, I have to admit I feel I really need to get back to math in order to appreciate these remarks. I’ve written way too much about physics anyway now (as opposed to my first dozen posts – which were much more math-oriented). So I’ll just have a look at some more stuff indeed (such as perturbation theory), and then I’ll get back to blogging. Indeed, I’ve written like 20 posts or so in a few months only – so I guess I should shut up for a while now !

In the meanwhile, you’re more than welcome to comment of course ! 

Light

I started the two previous posts attempting to justify why we need all these mathematical formulas to understand stuff: because otherwise we just keep on repeating very simplistic but nonsensical things such as ‘matter behaves (sometimes) like light’, ‘light behaves (sometimes) like matter’ or, combining both, ‘light and matter behave like wavicles’. Indeed: what does ‘like‘ mean? Like the same but different? 🙂 However, I have not said much about light so far.

Light and matter are two very different things. For matter, we have quantum mechanics. For light, we have quantum electrodynamics (QED). However, QED is not only a quantum theory about light: as Feynman pointed out in his little but exquisite 1985 book on quantum electrodynamics (QED: The Strange Theory of Light and Matter), it is first and foremost a theory about how light interacts with matter. However, let’s limit ourselves here to light.

In classical physics, light is an electromagnetic wave: it just travels on and on and on because of that wonderful interaction between electric and magnetic fields. A changing electric field induces a magnetic field, the changing magnetic field then induces an electric field, and then the changing electric field induces a magnetic field, and… Well, you got the idea: it goes on and on and on. This wonderful machinery is summarized in Maxwell’s equations – and most beautifully so in the so-called Heaviside form of these equations, which assume a charge-free vacuum space (so there are no other charges lying around exerting a force on the electromagnetic wave or on the (charged) particle whose behavior we want to study), and which also make abstraction of other complications such as electric currents (so there are no moving charges going around either).

I reproduced Heaviside’s Maxwell equations below, as well as an animated gif which is supposed to illustrate the dynamics explained above. [In case you wonder who Heaviside was: well… Check him out – he was quite a character.] The animation is not all that great but OK enough. And don’t worry if you don’t understand the equations – just note the following:

  1. The electric and magnetic field E and B are represented by perpendicular oscillating vectors.
  2. The first and third equation (∇·E = 0 and ∇·B = 0) state that there are no static or moving charges around and, hence, they do not have any impact on (the flux of) E and B.
  3. The second and fourth equation are the ones that are essential. Note the time derivatives (∂/∂t): E and B oscillate and perpetuate each other by inducing new circulation of B and E.

Heaviside form of Maxwell's equations
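In case the image doesn’t render for you: the four equations – in the order the notes above refer to them – read as follows (a standard rendering; the notation varies a bit across textbooks):

```latex
\begin{aligned}
(1)\quad \nabla \cdot \mathbf{E} &= 0
&\qquad (2)\quad \nabla \times \mathbf{E} &= -\frac{\partial \mathbf{B}}{\partial t} \\
(3)\quad \nabla \cdot \mathbf{B} &= 0
&\qquad (4)\quad \nabla \times \mathbf{B} &= \mu_0 \varepsilon_0 \, \frac{\partial \mathbf{E}}{\partial t}
\end{aligned}
```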

The constants μ and ε in the fourth equation are the so-called permeability (μ) and permittivity (ε) of the medium, and μ0 and ε0 are the values for these constants in a vacuum. Now, it is interesting to note that μ0ε0 equals 1/c², so a changing electric field only produces a tiny change in the circulation of the magnetic field. That’s got something to do with magnetism being a ‘relativistic’ effect, but I won’t explore that here – except for noting that the final Lorentz force on a (charged) particle, F = q(E + v×B), will be the same regardless of the reference frame: the reference frame will determine the mixture of E and B fields, but there is only one combined force on a charged particle in the end, whatever the reference frame (inertial or moving at whatever speed – relativistic (i.e. close to c) or not). [The forces F, E and B on a moving (charged) particle are shown below the animation of the electromagnetic wave.] In other words, Maxwell’s equations are compatible with special relativity. In fact, Einstein observed that these equations ensure that electromagnetic waves always travel at speed c (to use his own words: “Light is always propagated in empty space with a definite velocity c which is independent of the state of motion of the emitting body.”) and it’s this observation that led him to develop his special relativity theory.

Electromagneticwave3Dfromside

325px-Lorentz_force_particle
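That μ0ε0 = 1/c² claim is easily verified numerically, by the way (the constants below are the standard CODATA values, slightly rounded):

```python
import math

mu_0 = 1.25663706e-6       # vacuum permeability (H/m)
eps_0 = 8.8541878128e-12   # vacuum permittivity (F/m)

c = 1 / math.sqrt(mu_0 * eps_0)
print(f"1/√(μ₀ε₀) ≈ {c:,.0f} m/s")   # ≈ 299,792,458 m/s indeed
```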

The other interesting thing to note is that there is energy in these oscillating fields and, hence, in the electromagnetic wave. Hence, if the wave hits an impenetrable barrier, such as a paper sheet, it exerts pressure on it – known as radiation pressure. [By the way, did you ever wonder why a light beam can travel through glass but not through paper? Check it out!] A very oft-quoted example is the following: if the effects of the sun’s radiation pressure on the Viking spacecraft had been ignored, the spacecraft would have missed its Mars orbit by about 15,000 kilometers. Another common example is more science fiction-oriented: the (theoretical) possibility of space ships using huge sails driven by sunlight (paper sails obviously – one should not use transparent plastic for that). 

I am mentioning radiation pressure because, although it is not that difficult to explain radiation pressure using classical electromagnetism (i.e. light as waves), the explanation provided by the ‘particle model’ of light is much more straightforward and, hence, a good starting point to discuss the particle nature of light:

  1. Electromagnetic radiation is quantized in particles called photons. We know that because of Max Planck’s work on black body radiation, which led to Planck’s relation: E = hν. Photons are bona fide particles in the so-called Standard Model of physics: they are defined as bosons with spin 1, but zero rest mass and no electric charge (as opposed to W bosons). They are denoted by the letter or symbol γ (gamma), so that’s the same symbol that’s used to denote gamma rays. [Gamma rays are high-energy electromagnetic radiation (i.e. ‘light’) with a very definite particle character. They have a very short wavelength – less than 10 picometer (10×10–12 m) – and a high energy (hundreds of keV), as opposed to visible light, which has a wavelength between 380 and 750 nanometer (380–750×10–9 m) and a typical energy of 2 to 3 eV only (so a few hundred thousand times less). That’s why they are capable of penetrating thick layers of concrete, and the human body – where they might damage intracellular bodies and create cancer. Lead is a more efficient barrier obviously: a shield of a few centimeter of lead will stop most of them. In case you are not sure about the relation between energy and penetration depth, see the Post Scriptum.]
  2. Although photons are considered to have zero rest mass, they have energy and, hence, an equivalent relativistic mass (m = E/c²) and, therefore, also momentum. Indeed, energy and momentum are related through the following (relativistic) formula: E = √(p²c² + m0²c⁴) (the non-relativistic version is simply E = p²/2m0, but that’s – quite obviously – an approximation that cannot be used in this case, if only because the denominator would be zero). For a photon, this simplifies to E = pc or p = E/c. This basically says that the energy (E) and the momentum (p) of a photon are proportional, with c – the velocity of the wave – as the factor of proportionality.
  3. The generation of radiation pressure can then be directly related to the momentum property of photons, as shown in the diagram below – which shows how radiation force could – perhaps – propel a space sailing ship (a rough numerical sketch follows below the diagram). [Nice idea, but I’d rather bet on nuclear-thermal rocket technology.]

Sail-Force1
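Here’s the order of magnitude that comes out of the photon picture: a beam with intensity S delivers a momentum flux (i.e. a pressure) of S/c when absorbed, and 2S/c when reflected. For a sail near Earth (the 1,000 m² area is just an assumption for illustration):

```python
# Radiation pressure on a solar sail: each photon carries p = E/c.
S = 1361.0         # solar intensity near Earth (W/m²)
c = 2.998e8        # speed of light (m/s)
A = 1000.0         # sail area (m²) — an assumption, just for illustration

print(f"absorbing sail: {S * A / c * 1e3:.1f} mN")
print(f"perfect mirror: {2 * S * A / c * 1e3:.1f} mN")
```

A few millinewton only – but, in the vacuum of space, a tiny force acting all the time adds up, which is why ignoring it made a 15,000 km difference for the Viking spacecraft.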

I said in my introduction to this post that light and matter are two very different things. They are, and the logic connecting matter waves and electromagnetic radiation is not straightforward – if there is any. Let’s look at the two equations that are supposed to relate the two – the de Broglie relation and the Planck relation:

  1. The de Broglie relation E = hf assigns a de Broglie frequency (i.e. the frequency of a complex-valued probability amplitude function) to a particle with mass m through the mass-energy equivalence relation E = mc². However, the concept of a matter wave is rather complicated (if you don’t think so: read the two previous posts): matter waves have little – if anything – in common with electromagnetic waves. Feynman calls electromagnetic waves ‘real’ waves (just like water waves, or sound waves, or whatever other wave), as opposed to… Well – he does stop short of calling matter waves unreal, but it’s obvious they look ‘less real’ than ‘real waves’. Indeed, these complex-valued psi functions (Ψ) – for which we have to square the modulus to get the probability of something happening in space and time, or to measure the likely value of some observable property of the system – are obviously ‘something else’! [I tried to convey their ‘reality’ as well as I could in my previous post, but I am not sure I did a good job – not really.]
  2. The Planck relation E = hν relates the energy of a photon – the so-called quantum of light (das Lichtquant, as Einstein called it in 1905 – the term ‘photon’ was coined some 20 years later, it is said) – to the frequency of the electromagnetic wave of which it is part. [That Greek symbol (ν) – it’s the letter nu (the ‘v’ in Greek is amalgamated with the ‘b’) – is quite confusing: it’s not the v for velocity.]

So, while the Planck relation (which goes back to 1905) obviously inspired Louis de Broglie (who introduced his theory on electron waves some 20 years later – in his PhD thesis of 1924 to be precise), their equations look the same but are different – and that’s probably the main reason why we keep two different symbols – f and ν – for the two frequencies.

Photons and electrons are obviously very different particles as well. Just to state the obvious:

  1. Photons have zero rest mass, travel at the speed of light, have no electric charge, are bosons, and so on and so on, and so they behave differently (see, for example, my post on Bose and Fermi, which explains why one cannot make proton beam lasers). [As for the boson qualification, bosons are force carriers: photons in particular mediate (or carry) the electromagnetic force.]
  2. Electrons do not weigh much and, hence, can attain speeds close to that of light (but it requires tremendous amounts of energy to accelerate them very near c); but so they do have some mass, they have electric charge (photons are electrically neutral), and they are fermions – which means they’re an entirely different ‘beast’, so to say, when it comes to combining their probability amplitudes (so that’s why they’ll never get together in some kind of electron laser beam either – just like protons or neutrons – as I explain in my post on Bose and Fermi indeed).

That being said, there’s some connection of course (and that’s what’s being explored in QED):

  1. Accelerating electric charges cause electromagnetic radiation (so moving charges (the negatively charged electrons) cause the electromagnetic field oscillations, but it’s the (neutral) photons that carry it).
  2. Electrons absorb and emit photons as they gain/lose energy when going from one energy level to the other.
  3. Most important of all, individual photons – just like electrons – also have a probability amplitude function – so that’s a de Broglie or matter wave function if you prefer that term.

That means photons can also be described in terms of some kind of complex wave packet, just like that electron I kept analyzing in my previous posts – until I (and surely you) got tired of it. That means we’re presented with the same type of mathematics. For starters, we cannot be happy with assigning a unique frequency to our (complex-valued) de Broglie wave, because that would – once again – mean that we have no clue whatsoever where our photon actually is. So, while the shape of the wave function below might well describe the E and B of a bona fide electromagnetic wave, it cannot describe the (real or imaginary) part of the probability amplitude of the photons we would associate with that wave.

constant frequency wave

So that doesn’t work. We’re back at analyzing wave packets – and, by now, you know how complicated that can be: I am sure you don’t want me to mention Fourier transforms again! So let’s turn to Feynman once again – the greatest of all (physics) teachers – to get his take on it. Now, the surprising thing is that, in his 1985 Lectures on Quantum Electrodynamics (QED), he doesn’t really care about the amplitude of a photon to be at point x at time t. What he needs to know is:

  1. The amplitude of a photon to go from point A to B, and
  2. The amplitude of a photon to be absorbed/emitted by an electron (a photon-electron coupling as it’s called).

And then he needs only one more thing: the amplitude of an electron to go from point A to B. That’s all he needs to explain EVERYTHING – in quantum electrodynamics that is. So that’s partial reflection, diffraction, interference… Whatever! In Feynman’s own words: “Out of these three amplitudes, we can make the whole world, aside from what goes on in nuclei, and gravitation, as always!” So let’s have a look at it.

I’ve shown some of his illustrations already in the Bose and Fermi post I mentioned above. In Feynman’s analysis, photons get emitted by some source and, as soon as they do, they travel with some stopwatch, as illustrated below. The speed with which the hand of the stopwatch turns is the angular frequency of the phase of the probability amplitude, and its length is the modulus – which, you’ll remember, we need to square to get a probability of something; so for the illustration below we have a probability of 0.2×0.2 = 4%. Probability of what? Relax. Let’s go step by step.

Stopwatch

Let’s first relate this probability amplitude stopwatch to a theoretical wave packet, such as the one below – which is a nice Gaussian wave packet:

example of wave packet

This thing really fits the bill: it’s associated with a nice Gaussian probability distribution (aka a normal distribution, because, despite its ideal shape from a math point of view, it actually does describe many real-life phenomena), and we can easily relate the stopwatch’s angular frequency to the angular frequency of the phase of the wave. The only thing you’ll need to remember is that its amplitude is not constant in space and time: indeed, this photon is somewhere sometime, and that means it’s no longer there when it’s gone, and also that it’s not there when it hasn’t arrived yet. 🙂 So, as long as you remember that, Feynman’s stopwatch is a great way to represent a photon (or any particle really). [Just think of a stopwatch in your hand with no hand, but then suddenly that hand grows from zero to 0.2 (or some other random value between 0 and 1) and then shrinks back from that random value to 0 as the photon whizzes by. […] Or find some other creative interpretation if you don’t like this one. :-)]

Now, of course we do not know at what time the photon leaves the source and so the hand of the stopwatch could be at 2 o’clock, 9 o’clock or whatever: so the phase could be shifted by any value really. However, the thing to note is that the stopwatch’s hand goes around and around at a steady (angular) speed.

That’s OK. We can’t know where the photon is because we’re obviously assuming a nice standardized light source emitting polarized light with a very specific color, i.e. all photons have the same frequency (so we don’t have to worry about spin and all that). Indeed, because we’re going to add and multiply amplitudes, we have to keep it simple (the complicated things should be left to clever people – or academics). More importantly, it’s OK because we don’t need to know the exact position of the hand of the stopwatch as the photon leaves the source in order to explain phenomena like the partial reflection of light on glass. What matters there is only how much the stopwatch hand turns in the short time it takes to go from the front surface of the glass to its back surface. That difference in phase is independent from the position of the stopwatch hand as it reaches the glass: it only depends on the angular frequency (i.e. the energy of the photon, or the frequency of the light beam) and the thickness of the glass sheet. The two cases below present two possibilities: a 5% chance of reflection and a 16% chance of reflection (16% is actually a maximum, as Feynman shows in that little book, but that doesn’t matter here).

partial reflection
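To see how numbers like 5% and 16% come about, here’s a sketch of the two-arrow sum, following Feynman’s picture: one arrow of length 0.2 from the front surface and one from the back surface, with a relative phase set by how much the stopwatch hand turns inside the glass (the minus sign reflects the extra half-turn Feynman attaches to the front-surface reflection):

```python
import cmath

r = 0.2   # length of each little reflection arrow
for turn_deg in (0, 60, 90, 120, 180):
    delta = cmath.pi * turn_deg / 180
    amplitude = r * (cmath.exp(1j * delta) - 1)   # back arrow + front arrow
    print(f"extra turn {turn_deg:3d}° → P(reflection) ≈ {abs(amplitude)**2:.1%}")
```

So the reflection probability cycles between 0 and 16% as the glass gets thicker – exactly the range Feynman quotes.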

But – Hey! – I am suddenly talking amplitudes for reflection here, and the probabilities that I am calculating (by adding amplitudes, not probabilities) are also (partial) reflection probabilities. Damn ! YOU ARE SMART! It’s true. But you get the idea, and I told you already that Feynman is not interested in the probability of a photon just being here or there or wherever. He’s interested in (1) the amplitude of it going from the source (i.e. some point A) to the glass surface (i.e. some other point B), and then (2) the amplitude of photon-electron couplings – which determine the above amplitudes for being reflected (i.e. being (back)scattered by an electron actually).

So what? Well… Nothing. That’s it. I just wanted to give you some sense of de Broglie waves for photons. The thing to note is that they’re like de Broglie waves for electrons. So they are as real or unreal as these electron waves, and they have close to nothing to do with the electromagnetic wave of which they are part. The only thing that relates them with that real wave, so to say, is their energy level, and so that determines their de Broglie wavelength. So, it’s strange to say, but we have two frequencies for a photon: E = hν and E = hf. The first one is the Planck relation (E = hν): it associates the energy of a photon with the frequency of the real-life electromagnetic wave. The second is the de Broglie relation (E = hf): once we’ve calculated the energy of a photon using E = hν, we associate a de Broglie wavelength with the photon. So we imagine it as a traveling stopwatch with angular frequency ω = 2πf.

So that’s it (for now). End of story.

[…]

Now, you may want to know something more about these other amplitudes (that’s what I would want), i.e. the amplitude of a photon to go from A to B and this coupling amplitude and whatever else that may or may not be relevant. Right you are: it’s fascinating stuff. For example, you may or may not be surprised that photons have an amplitude to travel faster or slower than light from A to B, and that they actually have many amplitudes to go from A to B: one for each possible path. [Does that mean that the path does not have to be straight? Yep. Light can take strange paths – and it’s the interplay (i.e. the interference) between all these amplitudes that determines the most probable path – which, fortunately (otherwise our amplitude theory would be worthless), turns out to be the straight line.] We can summarize this in a really short and nice formula for the P(A to B) amplitude [note that the ‘P’ stands for photon, not for probability – Feynman uses an E for the related amplitude for an electron, so he writes E(A to B)].

However, I won’t make this any more complicated right now and so I’ll just reveal that P(A to B) depends on the so-called spacetime interval. This spacetime interval (I) is equal to I = (z2– z1)+ (y2– y1)+ (x2– x1)– (t2– t1)2, with the time and spatial distance being measured in equivalent units (so we’d use light-seconds for the unit of distance or, for the unit of time, the time it takes for light to travel one meter). I am sure you’ve heard about this interval. It’s used to explain the famous light cone – which determines what’s past and future in respect to the here and now in spacetime (or the past and present of some event in spacetime) in terms of

  1. What could possibly have impacted the here and now (taking into account that nothing can travel faster than light – even if we've mentioned some exceptions to this already, such as the phase velocity of a matter wave, but that's not a 'signal' and, hence, not in contradiction with relativity)?
  2. What could possibly be impacted by the here and now (again taking into account that nothing can travel faster than c)? [A quick numerical check of the interval's sign logic follows right below.]
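To make the sign logic concrete, here's a minimal sketch, with distance measured in light-seconds so that c = 1, and with made-up event coordinates:

```python
# The spacetime interval, with distance measured in light-seconds so that c = 1.
# Sign convention as in the formula above: spatial terms positive, time term negative.
def interval(event1, event2):
    (t1, x1, y1, z1), (t2, x2, y2, z2) = event1, event2
    return (x2 - x1)**2 + (y2 - y1)**2 + (z2 - z1)**2 - (t2 - t1)**2

here_now = (0, 0, 0, 0)
print(interval(here_now, (2, 1, 0, 0)))   # -3 < 0: time-like, can be causally related
print(interval(here_now, (1, 2, 0, 0)))   #  3 > 0: space-like, no signal can connect them
print(interval(here_now, (1, 1, 0, 0)))   #  0: light-like, on the light cone itself
```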

In short, the light cone defines the past, the here, and the future in spacetime in terms of (potential) causal relations. However, as this post has – once again – become too long already, I’ll need to write another post to discuss these other types of amplitudes – and how they are used in quantum electrodynamics. So my next post should probably say something about light-matter interaction, or on photons as the carriers of the electromagnetic force (both in light as well as in an atom – as it’s the electromagnetic force that keeps an electron in orbit around the (positively charged) nucleus). In case you wonder, yes, that’s Feynman diagrams – among other things.

Post scriptum: On frequency, wavelength and energy – and the particle- versus wave-like nature of electromagnetic waves

I wrote that gamma waves have a very definite particle character because of their very short wavelength. Indeed, most discussions of the electromagnetic spectrum will start by pointing out that higher frequencies or shorter wavelengths – higher frequency (f) implies shorter wavelength (λ) because the wavelength is the speed of the wave (c in this case) over the frequency: λ = c/f – will make the (electromagnetic) wave more particle-like. For example, I copied two illustrations from Feynman’s very first Lectures (Volume I, Lectures 2 and 5) in which he makes the point by showing

  1. The familiar table of the electromagnetic spectrum (we could easily add a column for the wavelength (just calculate λ = c/f) and the energy (E = hf) besides the frequency), and
  2. An illustration that shows what matter (a block of carbon, 1 cm thick in this case) looks like to an electromagnetic wave racing towards it. It does not look like Gruyère cheese, because Gruyère cheese is cheese with holes: matter is huge holes with just a tiny little bit of cheese! Indeed, at the micro-level, matter looks like a lot of nothing with only a few tiny specks of matter sprinkled about!

He then goes on to describe how 'hard' rays (i.e. rays with short wavelengths) just plow right through, and so on.

[Illustrations: the electromagnetic spectrum, and a close-up view of carbon.]
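If you want to add those extra columns to the spectrum table yourself, the two relations λ = c/f and E = hf are all you need. A minimal sketch (the frequencies below are rough, order-of-magnitude values I picked for illustration):

```python
# Wavelength and photon energy for a few rows of the electromagnetic spectrum,
# using lambda = c/f and E = h*f (frequencies are rough, illustrative values).
h = 6.626e-34   # Planck's constant (J*s)
c = 2.998e8     # speed of light (m/s)

for name, f in [("FM radio", 1e8), ("visible light", 5e14),
                ("X-rays", 1e18), ("gamma rays", 1e20)]:
    lam = c / f                      # wavelength in meters
    E_eV = h * f / 1.602e-19         # photon energy in electronvolt
    print(f"{name:14s} f = {f:.0e} Hz -> lambda = {lam:.2e} m, E = {E_eV:.2e} eV")
```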

Now it will probably sound very stupid to non-autodidacts but, for a very long time, I was vaguely intrigued that the amplitude of a wave doesn’t seem to matter when looking at the particle- versus wave-like character of electromagnetic waves. Electromagnetic waves are transverse waves so they oscillate up and down, perpendicular to the direction of travel (as opposed to longitudinal waves, such as sound waves or pressure waves for example: these oscillate back and forth – in the same direction of travel). And photon paths are represented by wiggly lines, so… Well, you may not believe it but that’s why I stupidly thought it’s the amplitude that should matter, not the wavelength.

Indeed, the illustration below – which could be an example of how E or B oscillates in space and time – would suggest that lower amplitudes (smaller A's) are the key to 'avoiding' those specks of matter. And if one can't do anything about amplitude, then one may be forgiven for thinking that longer wavelengths – not shorter ones – are the key to avoiding those little 'obstacles' presented by atoms or nuclei in some crystal or non-crystalline structure. [Just jot it down: more wiggly lines increase the chance of hitting something.] But… Both lower amplitudes and longer wavelengths imply less energy. Indeed, the energy of a wave is, in general, proportional to the square of its amplitude, and electromagnetic waves are no exception in this regard. As for wavelength, we have Planck's relation. So what's wrong with my very childish reasoning?

[Illustration: cosine wave concepts.]

As usual, the answer is easy for those who already know it: neither wavelength nor amplitude have anything to do with how much space this wave actually takes up as it propagates. But of course! You didn't know that? Well… Neither did I. Now I do. The vertical y axis might measure E and B indeed, but the graph and the nice animation above should not make you think that these field vectors actually occupy some space. So you can think of electromagnetic waves as particle waves indeed: we've got 'something' that's traveling in a straight line, and it's traveling at the speed of light. That 'something' is a photon, and it can have high or low energy. If it's low-energy, it's like a speck of dust: even if it travels at the speed of light, it is easy to deflect (i.e. scatter), and the 'empty space' in matter (which is, of course, not empty but full of all kinds of electromagnetic disturbances) may well feel like jelly to it: it will get stuck (read: it will be absorbed somewhere, or not even get through the first layer of atoms at all). If it's high-energy, then it's a different story: then the photon is like a tiny but very powerful bullet – same size as the speck of dust, and same speed, but much, much heavier. Such a 'bullet' (e.g. a gamma ray photon) will indeed have a tendency to plow through matter like it's air: it won't care about all these low-energy fields in it.

It is, most probably, a very trivial point to make, but I thought it’s worth doing so.

[When thinking about the above, also remember the trivial relationship between energy and momentum for photons: p = E/c, so more energy means more momentum: a heavy truck crashing into your house will create more damage than a Mini at the same speed because the truck has much more momentum. So just use the mass-energy equivalence (E = mc2) and think about high-energy photons as armored vehicles and low-energy photons as mom-and-pop cars.]

Re-visiting the matter wave (II)

My previous post was, once again, littered with formulas – even if I had not intended it to be that way. Still, I want to convey some kind of understanding of what an electron – or any particle at the atomic scale – actually is, with the minimum number of formulas necessary.

We know particles display wave behavior: when an electron beam encounters an obstacle or a slit that is somewhat comparable in size to its wavelength, we'll observe diffraction, or interference. [I have to insert a quick note on terminology here: the terms diffraction and interference are often used interchangeably, but there is a tendency to use interference when we have more than one wave source and diffraction when there is only one wave source. However, I'll immediately add that the distinction is somewhat artificial. Do we have one or two wave sources in a double-slit experiment? There is one beam, but the two slits break it up into two and, hence, we would call it interference. If it's only one slit, there is also an interference pattern, but the phenomenon will be referred to as diffraction.]

We also know that the wavelength we are talking about here is not the wavelength of some electromagnetic wave, like light. It's the wavelength of a de Broglie wave, i.e. a matter wave: such a wave is represented by an (oscillating) complex number – so we need to keep track of a real and an imaginary part – representing a so-called probability amplitude Ψ(x, t) whose modulus squared (│Ψ(x, t)│²) is the probability of actually detecting the electron at point x at time t. [The purists will say that complex numbers can't oscillate – but I am sure you get the idea.]

You should read the phrase above twice: we cannot know where the electron actually is. We can only calculate probabilities (and, of course, compare them with the probabilities we get from experiments). Hence, when the wave function tells us the probability is greatest at point x at time t, then we may be lucky when we actually probe point x at time t and find it there, but it may also not be there. In fact, the probability of finding it exactly at some point x at some definite time t is zero. That’s just a characteristic of such probability density functions: we need to probe some region Δx in some time interval Δt.

If you think that is not very satisfactory, there's actually a very common-sense explanation that has nothing to do with quantum mechanics: our scientific instruments do not allow us to go beyond a certain scale anyway. Indeed, the resolution of the best electron microscopes, for example, is some 50 picometer (1 pm = 1×10–12 m): that's small (and resolutions get higher by the year), but it implies that we are not looking at points – as defined in math, that is: something with zero dimension – but at pixels of size Δx = 50×10–12 m.

The same goes for time. Time is measured by atomic clocks nowadays, but even these clocks do 'tick', and these 'ticks' are discrete. Atomic clocks take advantage of the property of atoms to resonate at extremely consistent frequencies. I'll say something more about resonance soon – because it's very relevant for what I am writing about in this post – but, for the moment, just note that, for example, Caesium-133 (which was used to build the first atomic clock) resonates at 9,192,631,770 cycles per second. In fact, the International Bureau of Weights and Measures re-defined the (time) second in 1967 to correspond to "the duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the Caesium-133 atom at rest at a temperature of 0 K."

Don’t worry about it: the point to note is that when it comes to measuring time, we also have an uncertainty. Now, when using this Caesium-133 atomic clock, this uncertainty would be in the range of ±9.2×10–9 seconds (so that’s nanoseconds: 1 ns = 1×10–9 s), because that’s the rate at which this clock ‘ticks’. However, there are other (much more plausible) ways of measuring time: some of the unstable baryons have lifetimes in the range of a few picoseconds only (1 ps = 1×10–12 s) and the really unstable ones – known as baryon resonances – have lifetimes in the 1×10–22 to 1×10–24 s range. This we can only measure because they leave some trace after these particle collisions in particle accelerators and, because we have some idea about their speed, we can calculate their lifetime from the (limited) distance they travel before disintegrating. The thing to remember is that for time also, we have to make do with time pixels  instead of time points, so there is a Δt as well. [In case you wonder what baryons are: they are particles consisting of three quarks, and the proton and the neutron are the most prominent (and most stable) representatives of this family of particles.]  

So what’s the size of an electron? Well… It depends. We need to distinguish two very different things: (1) the size of the area where we are likely to find the electron, and (2) the size of the electron itself. Let’s start with the latter, because that’s the easiest question to answer: there is a so-called classical electron radius re, which is also known as the Thompson scattering length, which has been calculated as:

re = (1/(4πε0))·(e²/(me·c²)) = 2.817 940 3267(27)×10–15 m

As for the constants in this formula, you know these by now: the speed of light c, the electron charge e, its mass me, and the permittivity of free space ε0. For whatever it's worth (because you should note that, in quantum mechanics, electrons do not have a size: they are treated as point-like particles, so they have a point charge and zero dimension), that's small. It's in the femtometer range (1 fm = 1×10–15 m). You may or may not remember that the size of a proton is in the femtometer range as well – 1.7 fm to be precise – and we had a femtometer size estimate for quarks as well: 0.7 fm. So we have the rather remarkable result that the much heavier proton (its rest mass is 938 MeV/c², as opposed to only 0.511 MeV/c² for the electron, so the proton is about 1,836 times heavier) is 1.65 times smaller than the electron. That's something to be explored later: for the moment, we'll just assume the electron wiggles around a bit more – exactly because it's lighter. Here you just have to note that this 'classical' electron radius does measure something: it's something 'hard' and 'real' because it scatters, absorbs or deflects photons (and/or other particles). In one of my previous posts, I explained how particle accelerators probe things at the femtometer scale, so I'll refer you to that post (End of the Road to Reality?) and move on to the next question.
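You can verify that number yourself by just plugging the constants into the formula. A minimal sketch:

```python
import math

# Classical electron radius r_e = (1/(4*pi*eps0)) * e^2 / (m_e * c^2),
# computed from CODATA-style constants (SI units).
eps0 = 8.8541878128e-12   # permittivity of free space (F/m)
e    = 1.602176634e-19    # elementary charge (C)
m_e  = 9.1093837015e-31   # electron mass (kg)
c    = 2.99792458e8       # speed of light (m/s)

r_e = e**2 / (4 * math.pi * eps0 * m_e * c**2)
print(r_e)   # ~2.818e-15 m, i.e. in the femtometer range, as stated above
```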

The question concerning the area where we are likely to detect the electron is more interesting in light of the topic of this post (the nature of these matter waves). It is given by that wave function and, from my previous post, you’ll remember that we’re talking the nanometer scale here (1 nm = 1×10–9 m), so that’s a million times larger than the femtometer scale. Indeed, we’ve calculated a de Broglie wavelength of 0.33 nanometer for relatively slow-moving electrons (electrons in orbit), and the slits used in single- or double-slit experiments with electrons are also nanotechnology. In fact, now that we are here, it’s probably good to look at those experiments in detail.

The illustration below relates the actual experimental set-up of a double-slit experiment performed in 2012 to Feynman's 1965 thought experiment. Indeed, in 1965, the nanotechnology you need for this kind of experiment was not yet available, although the phenomenon of electron diffraction had been confirmed experimentally already in 1925 in the famous Davisson-Germer experiment. [It's famous not only because electron diffraction was a weird thing to contemplate at the time, but also because it confirmed the de Broglie hypothesis only two years after Louis de Broglie had advanced it!] So here is the experiment which Feynman thought would never be possible because of technology constraints:

[Illustration: electron double-slit set-up.]

The inset in the upper-left corner shows the two slits: they are each 50 nanometer wide (50×10–9 m) and 4 micrometer tall (4×10–6 m). [The thing in the middle of the slits is just a little support. Please do take a few seconds to contemplate the technology behind this feat: 50 nm is 50 millionths of a millimeter. Try to imagine dividing one millimeter by ten, and then one of these tenths by ten again, and again, and again: you just can't imagine that, because our mind is used to addition/subtraction and – to some extent – multiplication/division: our mind can't really deal with exponentiation, because it's not an everyday phenomenon.] The second inset (in the upper-right corner) shows the mask that can be moved to close one or both slits partially or completely.

Now, 50 nanometer is about 150 times larger than the 0.33 nanometer range we got for 'our' electron, but it's small enough to show diffraction and/or interference. [In fact, in this experiment (done by Bach, Pope, Liou and Batelaan from the University of Nebraska-Lincoln less than two years ago indeed), the beam consisted of electrons with an (average) energy of 600 eV and a de Broglie wavelength of 50 picometer. So that's like the electrons used in electron microscopes. 50 pm is 6.6 times smaller than the 0.33 nm wavelength we calculated for our slow-moving electron – whose kinetic energy, at 2.2×10⁶ m/s, is only some 14 eV – and that higher energy fully explains the shorter wavelength.] Let's go to the results.
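You can check both wavelengths with the non-relativistic de Broglie formula λ = h/p = h/√(2mE). A minimal sketch (the 13.8 eV value is just the kinetic energy that corresponds to our 2.2×10⁶ m/s electron):

```python
import math

h   = 6.626e-34    # Planck's constant (J*s)
m_e = 9.109e-31    # electron mass (kg)
eV  = 1.602e-19    # one electronvolt in joule

def de_broglie_wavelength(kinetic_energy_eV):
    # Non-relativistic: p = sqrt(2*m*E), lambda = h/p
    p = math.sqrt(2 * m_e * kinetic_energy_eV * eV)
    return h / p

print(de_broglie_wavelength(600))    # ~5.0e-11 m = 50 pm, the experiment's value
print(de_broglie_wavelength(13.8))   # ~3.3e-10 m = 0.33 nm, 'our' slow electron
```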

The illustration below shows the predicted pattern next to the observed pattern for the two scenarios:

  1. We first close slit 2, let a lot of electrons go through slit 1, and so we get a pattern described by the probability density function P1 = │Φ1│². Here we see no interference but a typical diffraction pattern: the intensity follows a more or less normal (i.e. Gaussian) distribution. We then close slit 1 (and open slit 2 again), again let a lot of electrons through, and get a pattern described by the probability density function P2 = │Φ2│². So that's how we get P1 and P2.
  2. We then open both slits, let a whole bunch of electrons through, and get the pattern described by the probability density function P12 = │Φ12│², which we get not from adding the probabilities P1 and P2 (hence, P12 ≠ P1 + P2) – as one would expect if electrons behaved like classical particles – but from adding the probability amplitudes (see the little sketch right after this list). We have interference, rather than diffraction.
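Here's a minimal one-dimensional sketch of that 'add amplitudes first, then square' rule. Each slit contributes a complex amplitude whose phase depends on the path length to a point x on the screen. The wavelength, slit separation and screen distance below are illustrative, not the exact values of the experiment.

```python
import numpy as np

# Toy two-slit model: each slit contributes a complex amplitude at screen
# position x; the phases differ because the path lengths differ.
lam, d, L = 50e-12, 280e-9, 0.24     # wavelength, slit separation, screen distance (m)
k = 2 * np.pi / lam

x = np.linspace(-2e-3, 2e-3, 2000)
r1 = np.sqrt(L**2 + (x - d / 2)**2)  # path length from slit 1
r2 = np.sqrt(L**2 + (x + d / 2)**2)  # path length from slit 2
phi1, phi2 = np.exp(1j * k * r1), np.exp(1j * k * r2)

P12 = np.abs(phi1 + phi2)**2                      # add amplitudes, then square
P_classical = np.abs(phi1)**2 + np.abs(phi2)**2   # add probabilities instead

print(P12.min(), P12.max())                  # ~0 and ~4: interference fringes
print(P_classical.min(), P_classical.max())  # both 2: flat, no interference
```

So adding amplitudes gives fringes, while adding probabilities gives a flat curve. That's the whole difference between P12 and P1 + P2.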

[Illustration: predicted interference effect.]

But so what exactly is interfering? Well… The electrons. But that can't be, can it?

The electrons are obviously particles, as evidenced from the impact they make – one by one – as they hit the screen as shown below. [If you want to know what screen, let me quote the researchers: “The resulting patterns were magnified by an electrostatic quadrupole lens and imaged on a two-dimensional microchannel plate and phosphorus screen, then recorded with a charge-coupled device camera. […] To study the build-up of the diffraction pattern, each electron was localized using a “blob” detection scheme: each detection was replaced by a blob, whose size represents the error in the localization of the detection scheme. The blobs were compiled together to form the electron diffraction patterns.” So there you go.]

[Illustration: the electron 'blobs' building up the interference pattern.]

Look carefully at how this interference pattern becomes 'reality' as the electrons hit the screen one by one. And then say it: WOW!

Indeed, as predicted by Feynman (and any other physics professor at the time), even if the electrons go through the slits one by one, they will interfere – with themselves, so to speak. [In case you wonder if these electrons really went through one by one, let me quote the researchers once again: "The electron source's intensity was reduced so that the electron detection rate in the pattern was about 1 Hz. At this rate and kinetic energy, the average distance between consecutive electrons was 2.3 × 10⁶ meters. This ensures that only one electron is present in the 1 meter long system at any one time, thus eliminating electron-electron interactions." You don't need to be a scientist or engineer to understand that, do you?]

While this is very spooky, I have not seen any better way to describe the reality of the de Broglie wave: the particle is not some point-like thing but a matter wave, as evidenced from the fact that it does interfere with itself when forced to move through two slits – or through one slit, as evidenced by the diffraction patterns built up in this experiment when closing one of the two slits: the electrons went through one by one as well!

But so how does it relate to the characteristics of that wave packet which I described in my previous post? Let me sum up the salient conclusions from that discussion:

  1. The wavelength λ of a wave packet is calculated directly from the momentum by using de Broglie‘s second relation: λ = h/p. In this case, the wavelength of the electrons averaged 50 picometer. That’s relatively small as compared to the width of the slit (50 nm) – a thousand times smaller actually! – but, as evidenced by the experiment, it’s small enough to show the ‘reality’ of the de Broglie wave.
  2. From a math point of view (but, of course, Nature does not care about our math), we can decompose the wave packet into a finite or infinite number of component waves. Such decomposition is referred to, in the first case (a finite number of component waves, i.e. discrete calculus), as a Fourier analysis or, in the second case, as a Fourier transform. A Fourier transform maps our (continuous) wave function Ψ(x) to a (continuous) wave function in the momentum space, which we noted as φ(p). [In fact, we noted it as Φ(p), but I don't want to create confusion with the Φ symbol used in the experiment, which is actually the wave function in space, so Ψ(x) is Φ(x) in the experiment – if you know what I mean.] The point to note is that uncertainty about momentum is related to uncertainty about position. In this case, we have pretty standard electrons (so not much variation in momentum), and so the location of the wave packet in space should be fairly precise as well.
  3. The group velocity of the wave packet (vg) – i.e. the speed of the envelope in which our Ψ wave oscillates – equals the speed of our electron (v), but the phase velocity (i.e. the speed of our Ψ wave itself) is superluminal: we showed it's equal to vp = E/p = c²/v = c/β, with β = v/c, i.e. the ratio of the speed of our electron and the speed of light. Hence, the phase velocity will always be superluminal but will approach c as the speed of our particle approaches c. For slow-moving particles, we get astonishing values for the phase velocity, like more than a hundred times the speed of light for the electron we looked at in our previous post. That's weird but it does not contradict relativity: if it helps, one can think of the wave packet as a modulation of an incredibly fast-moving 'carrier wave'.

Is any of this relevant? Does it help you to imagine what the electron actually is? Or what that matter wave actually is? Probably not. You will still wonder: What does it look like? What is it in reality?

That’s hard to say. If the experiment above does not convey any ‘reality’ according to you, then perhaps the illustration below will help. It’s one I have used in another post too (An Easy Piece: Introducing Quantum Mechanics and the Wave Function). I took it from Wikipedia, and it represents “the (likely) space in which a single electron on the 5d atomic orbital of an atom would be found.” The solid body shows the places where the electron’s probability density (so that’s the squared modulus of the probability amplitude) is above a certain value – so it’s basically the area where the likelihood of finding the electron is higher than elsewhere. The hue on the colored surface shows the complex phase of the wave function.

[Illustration: hydrogen eigenstate n = 5, l = 2, m = 1.]

So… Does this help? 

You will wonder why the shape is so complicated (but it’s beautiful, isn’t it?) but that has to do with quantum-mechanical calculations involving quantum-mechanical quantities such as spin and other machinery which I don’t master (yet). I think there’s always a bit of a gap between ‘first principles’ in physics and the ‘model’ of a real-life situation (like a real-life electron in this case), but it’s surely the case in quantum mechanics! That being said, when looking at the illustration above, you should be aware of the fact that you are actually looking at a 3D representation of the wave function of an electron in orbit. 

Indeed, wave functions of electrons in orbit are somewhat less random than – let’s say – the wave function of one of those baryon resonances I mentioned above. As mentioned in my Not So Easy Piece, in which I introduced the Schrödinger equation (i.e. one of my previous posts), they are solutions of a second-order partial differential equation – known as the Schrödinger wave equation indeed – which basically incorporates one key condition: these solutions – which are (atomic or molecular) ‘orbitals’ indeed – have to correspond to so-called stationary states or standing waves. Now what’s the ‘reality’ of that? 

The illustration below comes from Wikipedia once again (Wikipedia is an incredible resource for autodidacts like me indeed) and so you can check the article (on stationary states) for more details if needed. Let me just summarize the basics:

  1. A stationary state is called stationary because the system remains in the same ‘state’ independent of time. That does not mean the wave function is stationary. On the contrary, the wave function changes as function of both time and space – Ψ = Ψ(x, t) remember? – but it represents a so-called standing wave.
  2. Each of these possible states corresponds to an energy state, which is given through the de Broglie relation: E = hf. So the energy of the state is proportional to the oscillation frequency of the (standing) wave, and Planck’s constant is the factor of proportionality. From a formal point of view, that’s actually the one and only condition we impose on the ‘system’, and so it immediately yields the so-called time-independent Schrödinger equation, which I briefly explained in the above-mentioned Not So Easy Piece (but I will not write it down here because it would only confuse you even more). Just look at these so-called harmonic oscillators below:

[Animation: quantum harmonic oscillators.]

A and B represent a harmonic oscillator in classical mechanics: a ball with some mass m (mass is a measure for inertia, remember?) on a spring, oscillating back and forth. In case you'd wonder what the difference is between the two: both the amplitude and the frequency of the movement are different. 🙂 A spring and a ball?

It represents a simple system indeed. A harmonic oscillation is basically a resonance phenomenon: springs, electric circuits,… anything that swings, moves or oscillates (including large-scale things such as bridges and what have you – in his 1965 Lectures (Vol. I-23), Feynman even discusses resonance phenomena in the atmosphere) has some natural frequency ω0, also referred to as the resonance frequency, at which it oscillates naturally indeed: that means it requires (relatively) little energy to keep it going. How much energy it takes exactly to keep them going depends on the frictional forces involved: because the springs in A and B keep going, there's obviously no friction involved at all. [In physics, we say there is no damping.] However, both springs do have a different k (that's the key characteristic of a spring in Hooke's Law, which describes how springs work), and the mass m of the ball might be different as well. Now, one can show that the period of this 'natural' movement will be equal to t0 = 2π/ω0 = 2π√(m/k) or, equivalently, that ω0 = √(k/m). So we've got an A and a B situation which differ in k and m. Let's go to the so-called quantum oscillator, illustrations C to H.
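A minimal numerical sketch of that formula, with two made-up (k, m) pairs standing in for A and B:

```python
import math

# Natural (resonance) frequency of a mass on a spring: omega0 = sqrt(k/m),
# period t0 = 2*pi*sqrt(m/k). The (k, m) values are illustrative only.
for label, k, m in [("A", 1.0, 1.0), ("B", 4.0, 1.0)]:
    omega0 = math.sqrt(k / m)
    t0 = 2 * math.pi / omega0
    print(f"{label}: omega0 = {omega0:.2f} rad/s, period t0 = {t0:.2f} s")
```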

C to H in the illustration are six possible solutions to the Schrödinger equation for this situation. The horizontal axis is position (and time is the animation variable) – but we could switch the two independent variables easily: as I said a number of times already, time and space are interchangeable in the argument representing the phase (θ) of a wave, provided we use the right units (e.g. light-seconds for distance and seconds for time): θ = ωt – kx. Apart from the nice animation, the other great thing about these illustrations – and the main difference with resonance frequencies in the classical world – is that they show both the real part (blue) as well as the imaginary part (red) of the wave function as a function of space (fixed on the x axis) and time (the animation).

Is this ‘real’ enough? If it isn’t, I know of no way to make it any more ‘real’. Indeed, that’s key to understanding the nature of matter waves: we have to come to terms with the idea that these strange fluctuating mathematical quantities actually represent something. What? Well… The spooky thing that leads to the above-mentioned experimental results: electron diffraction and interference. 

Let’s explore this quantum oscillator some more. Another key difference between natural frequencies in atomic physics (so the atomic scale) and resonance phenomena in ‘the big world’ is that there is more than one possibility: each of the six possible states above corresponds to a solution and an energy state indeed, which is given through the de Broglie relation: E = hf. However, in order to be fully complete, I have to mention that, while G and H are also solutions to the wave equation, they are actually not stationary states. The illustration below – which I took from the same Wikipedia article on stationary states – shows why. For stationary states, all observable properties of the state (such as the probability that the particle is at location x) are constant. For non-stationary states, the probabilities themselves fluctuate as a function of time (and space of obviously), so the observable properties of the system are not constant. These solutions are solutions to the time-dependent Schrödinger equation and, hence, they are, obviously, time-dependent solutions.

[Animation: stationary states versus superpositions.]

We can find these time-dependent solutions by superimposing two stationary states, so we have a new wave function ΨN which is the sum of two others: ΨN = Ψ1 + Ψ2. [If you include the normalization factor (as you should, to make sure all probabilities add up to 1), it's actually ΨN = (1/√2)(Ψ1 + Ψ2).] So G and H above still represent possible states of a quantum harmonic oscillator – superpositions of two energy levels – but they are not standing waves. [The little sketch below makes the difference concrete.]
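Here's a minimal numerical sketch of that difference, using the first two stationary states of the harmonic oscillator in natural units (ħ = m = ω = 1, so the energy levels are 1/2 and 3/2):

```python
import numpy as np

# Harmonic-oscillator toy model in natural units (hbar = m = omega = 1):
# psi0 and psi1 are the first two stationary states, with energies 1/2 and 3/2.
x = np.linspace(-4, 4, 401)
psi0 = np.pi**-0.25 * np.exp(-x**2 / 2)                    # ground state
psi1 = np.pi**-0.25 * np.sqrt(2) * x * np.exp(-x**2 / 2)   # first excited state
E0, E1 = 0.5, 1.5

def prob_density(t):
    stationary = psi0 * np.exp(-1j * E0 * t)
    superposed = (psi0 * np.exp(-1j * E0 * t) + psi1 * np.exp(-1j * E1 * t)) / np.sqrt(2)
    return np.abs(stationary)**2, np.abs(superposed)**2

p_stat_0, p_sup_0 = prob_density(0.0)
p_stat_1, p_sup_1 = prob_density(1.0)
print(np.allclose(p_stat_0, p_stat_1))   # True: a stationary state's |Psi|^2 never changes
print(np.allclose(p_sup_0, p_sup_1))     # False: the superposition sloshes back and forth
```

The stationary state's probability density doesn't budge as time passes; the superposition's density sloshes back and forth – which is exactly what G and H show.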

Let’s go back to our electron traveling in a more or less straight path. What’s the shape of the solution for that one? It could be anything. Well… Almost anything. As said, the only condition we can impose is that the envelope of the wave packet – its ‘general’ shape so to say – should not change. That because we should not have dispersion – as illustrated below. [Note that this illustration only represent the real or the imaginary part – not both – but you get the idea.]

[Illustration: dispersion of a wave packet.]

That being said, if we exclude dispersion (because a real-life electron traveling in a straight line doesn't just disappear – as dispersive wave packets do), then, inside of that envelope, the weirdest things are possible – in theory, that is. Indeed, Nature does not care much about our Fourier transforms. So the example below, which shows a theoretical wave packet (again, the real or imaginary part only) based on some theoretical distribution of the wave numbers of the (infinite number of) component waves that make up the wave packet, may or may not represent our real-life electron. However, if our electron has any resemblance to real life, then I would expect it to not be as well-behaved as the theoretical one that's shown below.

[Illustration: example of a wave packet.]

The shape above is usually referred to as a Gaussian wave packet, because of the nice normal (Gaussian) probability density functions that are associated with it. But we can also imagine a 'square' wave packet: a somewhat weird shape but – in terms of the math involved – as consistent as the smooth Gaussian wave packet, in the sense that we can demonstrate that the wave packet is made up of an infinite number of waves with an angular frequency ω that is linearly related to their wave number k, so the dispersion relation is ω = ak + b. [Remember we need to impose that condition to ensure that our wave packet will not dissipate (or disperse or disappear – whatever term you prefer).] That's shown below: a Fourier analysis of a square wave.

[Illustration: Fourier analysis of a square wave.]
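You can re-do that Fourier analysis in a few lines: a square wave is (4/π) times the sum of sin(nx)/n over the odd harmonics n. A minimal sketch:

```python
import numpy as np

# Fourier synthesis of a square wave: sum of odd harmonics with 1/n amplitudes.
# square(x) ~ (4/pi) * sum over odd n of sin(n*x)/n
x = np.linspace(0, 2 * np.pi, 9)

def square_approx(x, n_terms):
    return (4 / np.pi) * sum(np.sin(n * x) / n for n in range(1, 2 * n_terms, 2))

print(np.round(square_approx(x, 1), 2))     # one term: just a sine
print(np.round(square_approx(x, 100), 2))   # many terms: close to +1 / -1 plateaus
```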

While we can construct many theoretical shapes of wave packets that respect the ‘no dispersion!’ condition, we cannot know which one will actually represent that electron we’re trying to visualize. Worse, if push comes to shove, we don’t know if these matter waves (so these wave packets) actually consist of component waves (or time-independent stationary states or whatever).

[…] OK. Let me finally admit it: while I am trying to explain to you the 'reality' of these matter waves, we actually don't know how real they are. We cannot 'see' or 'touch' them indeed. All that we know is that (i) assuming their existence, and (ii) assuming these matter waves are more or less well-behaved (e.g. that actual particles will be represented by a composite wave characterized by a linear dispersion relation between the angular frequencies and the wave numbers of its (theoretical) component waves), allows us to do all that arithmetic with these (complex-valued) probability amplitudes. More importantly, all that arithmetic with these complex numbers actually yields (real-valued) probabilities that are consistent with the probabilities we obtain through repeated experiments. So that's what's real and 'not so real', I'd say.

Indeed, the bottom-line is that we do not know what goes on inside that envelope. Worse, according to the commonly accepted Copenhagen interpretation of the Uncertainty Principle (and tons of experiments have been done to try to overthrow that interpretation – all to no avail), we never will.

Re-visiting the matter wave (I)

In my previous posts, I introduced a lot of wave formulas. They are essential to understanding waves – both real ones (e.g. electromagnetic waves) as well as probability amplitude functions. Probability amplitude function is quite a mouthful so let me call it a matter wave, or a de Broglie wave. The formulas are necessary to create true understanding – whatever that means to you – because otherwise we just keep on repeating very simplistic but nonsensical things such as ‘matter behaves (sometimes) like light’, ‘light behaves (sometimes) like matter’ or, combining both, ‘light and matter behave like wavicles’. Indeed: what does ‘like‘ mean? Like the same but different? 🙂 So that means it’s different. Let’s therefore re-visit the matter wave (i.e. the de Broglie wave) and point out the differences with light waves.

In fact, this post actually has its origin in a mistake in a post scriptum of a previous post (An Easy Piece: On Quantum Mechanics and the Wave Function), in which I wondered what formula to use for the energy E in the (first) de Broglie relation E = hf (with f the frequency of the matter wave and h the Planck constant). Should we use (a) the kinetic energy of the particle, (b) the rest mass (mass is energy, remember?), or (c) its total energy? So let us first re-visit these de Broglie relations which, you'll remember, relate energy and momentum to frequency (f) and wavelength (λ) respectively, with the Planck constant as the factor of proportionality:

E = hf and p = h/λ

The de Broglie wave

I first tried kinetic energy in that E = hf equation. However, if you use the kinetic energy formula (K.E. = mv²/2, with v the velocity of the particle), then the second de Broglie relation (p = h/λ) does not come out right. The second de Broglie relation has the wavelength λ on the right side, not the frequency f. But it's easy to go from one to the other: frequency and wavelength are related through the velocity of the wave (v). Indeed, the number of cycles per second (i.e. the frequency f) times the length of one cycle (i.e. the wavelength λ) gives the distance traveled by the wave per second, i.e. its velocity v. So fλ = v. Hence, using that kinetic energy formula and that very obvious fλ = v relation, we can write E = hf as mv²/2 = hv/λ and, hence, after moving one of the two v's in v² (and the 1/2 factor) from the left side to the right side of this equation, we get mv = 2h/λ. So there we are:

p = mv = 2h/λ.

Well… No. The second de Broglie relation is just p = h/λ. There is no factor 2 in it. So what’s wrong?

A factor of 2 in an equation like this surely doesn't matter, does it? It does. We are talking tiny wavelengths here, but a wavelength of 1 nanometer (1×10–9 m) – this is just an example of the scale we're talking about – is not the same as a wavelength of 0.5 nm. There's another problem too. Let's go back to our example of an electron with a mass of 9.1×10–31 kg (that's very tiny, and so you'll usually see it expressed in a unit that's more appropriate to the atomic scale), moving about with a velocity of 2.2×10⁶ m/s (that's the estimated speed of orbit of an electron around a hydrogen nucleus: it's fast (2,200 km per second), but still less than 1% of the speed of light), and let's do the math.

[Before I do the math, however, let me quickly insert a line on that 'other unit' to measure mass. You will usually see it written down as eV, so that's electronvolt. Electronvolt is a measure of energy, but that's OK because mass is energy according to Einstein's mass-energy equation: E = mc². The point to note is that the actual measure for mass at the atomic scale is eV/c², so we make the unit even smaller by dividing the eV (which already is a very tiny amount of energy) by c²: 1 eV/c² corresponds to 1.782662×10−36 kg, so the mass of our electron (9.1×10–31 kg) is about 510,000 eV/c², or 0.510 MeV/c². I am spelling it out because you will often just see 0.510 MeV in older or more popular publications, but don't forget that c² factor. As for the calculations below, I just stick to the kg and m measures because they make the dimensions come out right.]

According to our kinetic energy formula (K.E. = mv²/2), these mass and velocity values correspond to an energy value of 22×10−19 joule (the joule is the so-called SI unit for energy – don't worry about it right now). So, from the first de Broglie equation (f = E/h) – and using the right value for Planck's constant (6.626×10–34 J·s) – we get a frequency of 3.32×10¹⁵ hertz (hertz just means oscillations per second, as you know). Now, using v once again, and fλ = v, we see that corresponds to a wavelength of 0.66 nanometer (0.66×10−9 m). [Just take the numbers and do the math.]

However, if we use the second de Broglie relation, which relates wavelength to momentum instead of energy, then we get 0.33 nanometer (0.33×10−9 m), so that’s half of the value we got from the first equation. So what is it: 0.33 or 0.66 nm? It’s that factor 2 again. Something is wrong.
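Just to convince you the factor 2 is really there, here's the arithmetic spelled out:

```python
m = 9.1e-31      # electron mass (kg)
v = 2.2e6        # electron velocity (m/s)
h = 6.626e-34    # Planck's constant (J*s)

KE = m * v**2 / 2            # kinetic energy: ~2.2e-18 J
f = KE / h                   # first de Broglie relation: f = E/h
lam_from_E = v / f           # using f*lambda = v
lam_from_p = h / (m * v)     # second de Broglie relation: lambda = h/p

print(lam_from_E)   # ~6.6e-10 m: 0.66 nm
print(lam_from_p)   # ~3.3e-10 m: 0.33 nm -- the factor 2 discrepancy
```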

It must be that kinetic energy formula. You’ll say we should include potential energy or something. No. That’s not the issue. First, we’re talking a free particle here: an electron moving in space (a vacuum) with no external forces acting on it, so it’s a field-free space (or a region of constant potential). Second, we could, of course, extend the analysis and include potential energy, and show how it’s converted to kinetic energy (like a stone falling from 100 m to 50 m: potential energy gets converted into kinetic energy) but making our analysis more complicated by including potential energy as well will not solve our problem here: it will only make you even more confused.

Then it must be some relativistic effect you’ll say. No. It’s true the formula for kinetic energy above only holds for relatively low speeds (as compared to light, so ‘relatively’ low can be thousands of km per second) but that’s not the problem here: we are talking electrons moving at non-relativistic speeds indeed, so their mass or energy is not (or hardly) affected by relativistic effects and, hence, we can indeed use the more simple non-relativistic formulas.

The real problem we’re encountering here is not with the equations: it’s the simplistic model of our wave. We are imagining one wave here indeed, with a single frequency, a single wavelength and, hence, one single velocity – which happens to coincide with the velocity of our particle. Such wave cannot possibly represent an actual de Broglie wave: the wave is everywhere and, hence, the particle it represents is nowhere. Indeed, a wave defined by a specific wavelength λ (or a wave number k = 2π/λ if we’re using complex number notation) and a specific frequency f or period T (or angular frequency ω = 2π/T = 2πf) will have a very regular shape – such as Ψ = Aei(ωt-kx) and, hence, the probability of actually locating that particle at some specific point in space will be the same everywhere: |Ψ|= |Aei(ωt-kx)|= A2. [If you are confused about the math here, I am sorry but I cannot re-explain this once again: just remember that our de Broglie wave represents probability amplitudes – so that’s some complex number Ψ = Ψ(x, t) depending on space and time – and that we  need to take the modulus squared of that complex number to get the probability associated with some (real) value x (i.e. the space variable) and some value t (i.e. the time variable).]

So the actual matter wave of a real-life electron will be represented by a wave train, or a wave packet as it is usually referred to. Now, a wave packet is described by (at least) two types of wave velocity:

  1. The so-called group velocity: the group velocity of a wave is denoted by vg and is the velocity at which the wave packet as a whole travels. Wikipedia defines it as "the velocity with which the overall shape of the waves' amplitudes — known as the modulation or envelope of the wave — propagates through space."
  2. The so-called phase velocity: the phase velocity is denoted by vp and is what we usually associate with the velocity of a wave. It is just what it says it is: the rate at which the phase of the (composite) wave travels through space.

The term between brackets above – ‘composite’ – already indicates what it’s all about: a wave packet is to be analyzed as a composite wave: so it’s a wave composed of a finite or infinite number of component waves which all have their own wave number k and their own angular frequency ω. So the mistake we made above is that, naively, we just assumed that (i) there is only one simple wave (and, of course, there is only one wave, but it’s not a simple one: it’s a composite wave), and (ii) that the velocity v of our electron would be equal to the velocity of that wave. Now that we are a little bit more enlightened, we need to answer two questions in regard to point (ii):

  1. Why would that be the case?
  2. If it’s is the case, then what wave velocity are we talking about: the group velocity or the phase velocity?

To answer both questions, we need to look at wave packets once again, so let’s do that. Just to visualize things, I’ll insert – once more – that illustration you’ve seen in my other posts already:

[Illustration: explanation of the uncertainty principle.]

The de Broglie wave packet

The Wikipedia article on the group velocity of a wave has wonderful animations, which I would advise you to look at in order to make sure you are following me here. There are several possibilities:

  1. The phase velocity and the group velocity are the same: that’s a rather unexciting possibility but it’s the easiest thing to work with and, hence, most examples will assume that this is the case.
  2. The group and phase velocity are not the same, but our wave packet is 'stable', so to say. In other words, the individual peaks and troughs of the wave within the envelope travel at a different speed (the phase velocity vp), but the envelope as a whole (so the wave packet as such) does not get distorted as it travels through space.
  3. The wave packet dissipates: in this case, we have a constant group velocity, but the wave packet delocalizes. Its width increases over time and so the wave packet diffuses – as time goes by – over a wider and wider region of space, until it's actually no longer there. [In case you wonder why I did not group this third possibility under (2): it's a bit difficult to assign a fixed phase velocity to a wave like this.]

How the wave packet will behave depends on the characteristics of the component waves. To be precise, it will depend on their angular frequency and their wave number and, hence, their individual velocities. First, note the relationship between these three variables: ω = 2πf and k = 2π/λ so ω/k = fλ = v. So these variables are not independent: if you have two values (e.g. v and k), you also have the third one (ω). Secondly, note that the component waves of our wave packet will have different wavelengths and, hence, different wave numbers k.

Now, the de Broglie relation p = ħk (i.e. the same relation as p = h/λ, but we replace λ with 2π/k, and ħ is the so-called reduced Planck constant ħ = h/2π) makes it obvious that different wave numbers k correspond to different values p for the momentum of our electron, so allowing for a spread in k (or a spread in λ, as illustrated above) amounts to allowing for some spread in p. That's where the uncertainty principle comes in – which I actually derived from a theoretical wave function in my post on Fourier transforms and conjugate variables. But that's not something I want to dwell on here.

We’re interested in the ω’s. What about them? Well… ω can take any value really – from a theoretical point of view that is. Now you’ll surely object to that from a practical point of view, because you know what it implies: different velocities of the component waves. But you can’t object in a theoretical analysis like this. The only thing we could possible impose as a constraint is that our wave packet should not dissipate – so we don’t want it to delocalize and/or vanish after a while because we’re talking about some real-life electron here, and so that’s a particle which just doesn’t vanish like that.

To impose that condition, we need to look at the so-called dispersion relation. We know that we’ll have a whole range of wave numbers k, but so what values should ω take for a wave function to be ‘well-behaved’, i.e. not disperse in our case? Let’s first accept that k is some variable, the independent variable actually, and so then we associate some ω with each of these values k. So ω becomes the dependent variable (dependent on k that is) and that amounts to saying that we have some function ω = ω(k).

What kind of function? Well… It’s called the dispersion relation – for rather obvious reasons: because this function determines how the wave packet will behave: non-dispersive or – what we don’t want here – dispersive. Indeed, there are several possibilities:

  1. The speed of all component waves is the same: that means that the ratio ω/k = v is the same for all component waves. Now, that's the case only if ω is directly proportional to k, with the factor of proportionality equal to v. That means that we have a very simple dispersion relation: ω = αk, with α some constant equal to the velocity of the component waves as well as the group and phase velocity of the composite wave. So all velocities are just the same (v = vp = vg = α) and we're in the first of the three cases explained at the beginning of this section.
  2. There is a linear relation between ω and k but no direct proportionality, so we write ω = αk + β, in which β can be anything but not some function of k. So we allow different wave speeds for the component waves. The phase velocity will, once again, be equal to the ratio of the angular frequency and the wave number of the composite wave (whatever that is), but what about the group velocity, i.e. the velocity of our electron in this example? Well… One can show – but I will not do it here because it is quite a bit of work – that the group velocity of the wave packet will be equal to vg = dω/dk, i.e. the (first-order) derivative of ω with respect to k. So, if we want that wave packet to travel at the same speed as our electron (which is what we want, of course, because otherwise the wave packet would obviously not represent our electron), we'll have to impose that dω/dk (or ∂ω/∂k if you would want to introduce more independent variables) equals v. In short, we have the condition that dω/dk = d(αk + β)/dk = α = v. [The numerical sketch after this list shows the difference between a linear and a non-linear dispersion relation.]
  3. If the relation between ω and k is non-linear, well… Then we have none of the above. Hence, we then have a wave packet that gets distorted and stretched out and actually vanishes after a while. That case surely does not represent an electron.
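Here's a minimal numerical sketch of that difference: we superpose plane waves with a Gaussian spread of wave numbers and let them evolve once with a linear dispersion relation (ω = αk) and once with a quadratic one. All units are illustrative.

```python
import numpy as np

# Superpose plane waves with a Gaussian spread of wave numbers around k0 = 5
# and compare two dispersion relations: a linear one (no spreading) and a
# quadratic one (the packet disperses). Illustrative units only.
x = np.linspace(-50, 250, 2000)
k = np.linspace(4, 6, 200)                    # component wave numbers
a = np.exp(-((k - 5) ** 2) / (2 * 0.1 ** 2))  # Gaussian weights

def packet(t, omega):
    # sum over component waves: a(k) * exp(i*(k*x - omega(k)*t))
    return sum(ai * np.exp(1j * (ki * x - omega(ki) * t)) for ai, ki in zip(a, k))

def width(psi):
    p = np.abs(psi) ** 2
    p /= p.sum()
    mean = (p * x).sum()
    return np.sqrt((p * (x - mean) ** 2).sum())

linear = lambda k: 2.0 * k            # omega = a*k: all component speeds equal
quadratic = lambda k: 0.5 * k ** 2    # omega ~ k^2: non-linear, dispersive

for t in (0.0, 40.0):
    print(t, width(packet(t, linear)), width(packet(t, quadratic)))
# The linear case keeps the same width; the quadratic case spreads out.
```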

Back to the de Broglie wave relations

Indeed, it’s now time to go back to our de Broglie relations – E = hf and p = h/λ and the question that sparked the presentation above: what formula to use for E? Indeed, for p it’s easy: we use p = mv and, if you want to include the case of relativistic speeds, you will write that formula in a more sophisticated way by making it explicit that the mass m is the relativistic mass m = γm0: the rest mass multiplied with a factor referred to as the Lorentz factor which, I am sure, you have seen before: γ = (1 – v2/c2)–1/2. At relativistic speeds (i.e. speeds close to c), this factor makes a difference: it adds mass to the rest mass. So the mass of a particle can be written as m = γm0, with m0 denoting the rest mass. At low speeds (e.g. 1% of the speed of light – as in the case of our electron), m will hardly differ from m0 and then we don’t need this Lorentz factor. It only comes into play at higher speeds.

At this point, I just can’t resist a small digression. It’s just to show that it’s not ‘relativistic effects’ that cause us trouble in finding the right energy equation for our E = hf relation. What’s kinetic energy? Well… There’s a few definitions – such as the energy gathered through converting potential energy – but one very useful definition in the current context is the following: kinetic energy is the excess of a particle over its rest mass energy. So when we’re looking at high-speed or high-energy particles, we will write the kinetic energy as K.E. = mc– m0c= (m – m0)c= γm0c– m0c= m0c2(γ – 1). Before you think I am trying to cheat you: where is the v of our particle? [To make it specific: think about our electron once again but not moving at leisure this time around: imagine it’s speeding at a velocity very close to c in some particle accelerator. Now, v is close to c but not equal to c and so it should not disappear. […]

It’s in the Lorentz factor γ = (1 – v2/c2)–1/2.

Now, we can expand γ into a binomial series (it's basically an application of the Taylor series – but just check it online if you're in doubt), so we can write γ as an infinite sum of the following terms: γ = 1 + (1/2)·v²/c² + (3/8)·v⁴/c⁴ + (5/16)·v⁶/c⁶ + … etcetera. [The binomial series is an infinite Taylor series, so it's not to be confused with the (finite) binomial expansion.] Now, when we plug this back into our (relativistic) kinetic energy equation, we can scrap a few things (just do it) to get where I want to get:

K.E. = (1/2)·m0v² + (3/8)·m0v⁴/c² + (5/16)·m0v⁶/c⁴ + … etcetera

So what? Well… That’s it – for the digression at least: see how our non-relativistic formula for kinetic energy (K.E. = m0v2/2 is only the first term of this series and, hence, just an approximation: at low speeds, the second, third etcetera terms represent close to nothing (and more close to nothing as you check out the fourth, fifth etcetera terms). OK, OK… You’re getting tired of these games. So what? Should we use this relativistic kinetic energy formula in the de Broglie relation?

No. As mentioned above already, we don’t need any relativistic correction, but the relativistic formula above does come in handy to understand the next bit. What’s the next bit about?

Well… It turns out that we actually do have to use the total energy – including (the energy equivalent to) the rest mass of our electron – in the de Broglie relation E = hf.

WHAT!?

If you think a few seconds about the math of this – so we'd use γm0c² instead of (1/2)m0v² (so we use the speed of light instead of the speed of our particle) – you'll realize we'll be getting some astronomical frequency (we got that already, but here we are talking some kind of truly fantastic frequency) and, hence, combining that with the wavelength we'd derive from the other de Broglie equation (p = h/λ), we'd surely get some kind of totally unreal speed. Whatever it is, it will surely have nothing to do with our electron, will it?

Let’s go through the math.

The wavelength is just the same as that one given by p = h/λ, so we have λ = 0.33 nanometer. Don’t worry about this. That’s what it is indeed. Check it out online: it’s about a thousand times smaller than the wavelength of (visible) light but that’s OK. We’re talking something real here. That’s why electron microscopes can ‘see’ stuff that light microscopes can’t: their resolution is about a thousand times higher indeed.

So when we take the first equation once again (E = hf) and calculate the frequency from f = γm0c²/h, we get a frequency in the neighborhood of 12.34×10¹⁹ hertz. So that gives a velocity of v = fλ = 4.1×10¹⁰ meter per second (m/s). But… THAT'S MORE THAN A HUNDRED TIMES THE SPEED OF LIGHT. Surely, we must have got it wrong.

We don’t. The velocity we are calculating here is the phase velocity vp of our matter wave – and IT’S REAL! More in general, it’s easy to show that this phase velocity is equal to vp = fλ = E/p = (γm0c2/h)·(h/γm0v) = c2/v. Just fill in the values for c and v (3×108 and 2.2×106 respectively and you will get the same answer.

But that’s not consistent with relativity, is it? It is: phase velocities can be (and, in fact, usually are – as evidenced by our real-life example) superluminal as they say – i.e. much higher than the speed of light. However, because they carry no information – the wave packet shape is the ‘information’, i.e. the (approximate) location of our electron – such phase velocities do not conflict with relativity theory. It’s like amplitude modulation, like AM radiowaves): the modulation of the amplitude carries the signal, not the carrier wave.

The group velocity, on the other hand, can obviously not be faster than light and, in fact, should be equal to the speed of our particle (i.e. the electron). So how do we calculate that? We don't have any formula ω(k) here, do we? No. But we don't need one. Indeed, we can write:

vg = ∂ω/∂k = ∂(E/ħ)/∂(p/ħ) = ∂E/∂p

[Do you see why we prefer the ∂ symbol instead of the d symbol now? ω is a function of k but it’s – first and foremost – a function of E, so a partial derivative sign is quite appropriate.]

So what? Well… Now you can use either the relativistic or non-relativistic relation between E and p to get a value for ∂E/∂p. Let's take the non-relativistic one first (E = p²/2m): ∂E/∂p = ∂(p²/2m)/∂p = p/m = v. So we get the velocity of our electron! Just like we wanted. 🙂 As for the relativistic formula (E = (p²c² + m0²c⁴)^(1/2)), well… I'll let you figure that one out yourself – or check the little numerical sketch below. [You can also find it online in case you'd be desperate.]
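In fact, you can let the computer figure that one out: below, I take the relativistic E(p) and compute ∂E/∂p as a numerical derivative, next to the phase velocity E/p, for our 2.2×10⁶ m/s electron.

```python
import math

c  = 2.998e8      # speed of light (m/s)
m0 = 9.109e-31    # electron rest mass (kg)
v  = 2.2e6        # electron velocity (m/s)

gamma = 1 / math.sqrt(1 - (v / c) ** 2)
E = gamma * m0 * c**2      # total energy
p = gamma * m0 * v         # momentum

print(E / p)               # phase velocity E/p = c^2/v: ~4.1e10 m/s, superluminal
print(c**2 / v)            # same thing, directly

# Group velocity as a numerical derivative dE/dp of E(p) = sqrt(p^2 c^2 + m0^2 c^4):
E_of = lambda p: math.sqrt(p**2 * c**2 + m0**2 * c**4)
dp = p * 1e-6
print((E_of(p + dp) - E_of(p - dp)) / (2 * dp))   # ~2.2e6 m/s: the electron's speed
```

So the 'signal' (the envelope) travels at the electron's speed, while the phase races ahead at c²/v – just as claimed.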

Wow! So there we are. That was quite something! I will let you digest this for now. It’s true I promised to ‘point out the differences between matter waves and light waves’ in my introduction but this post has been lengthy enough. I’ll save those ‘differences’ for the next post. In the meanwhile, I hope you enjoyed and – more importantly – understood this one. If you did, you’re a master! A real one! 🙂

A not so easy piece: introducing the wave equation (and the Schrödinger equation)

The title above refers to a previous post: An Easy Piece: Introducing the wave function.

Indeed, I may have been sloppy here and there – I hope not – and so that’s why it’s probably good to clarify that the wave function (usually represented as Ψ – the psi function) and the wave equation (Schrödinger’s equation, for example – but there are other types of wave equations as well) are two related but different concepts: wave equations are differential equations, and wave functions are their solutions.

Indeed, from a mathematical point of view, a differential equation (such as a wave equation) relates a function (such as a wave function) with its derivatives, and its solution is that function or – more generally – the set (or family) of functions that satisfies this equation. 

The function can be real-valued or complex-valued, and it can be a function involving only one variable (such as y = y(x), for example) or more (such as u = u(x, t) for example). In the first case, it’s a so-called ordinary differential equation. In the second case, the equation is referred to as a partial differential equation, even if there’s nothing ‘partial’ about it: it’s as ‘complete’ as an ordinary differential equation (the name just refers to the presence of partial derivatives in the equation). Hence, in an ordinary differential equation, we will have terms involving dy/dx and/or d²y/dx², i.e. the first and second derivative of y respectively (and/or higher-order derivatives, depending on the order of the differential equation), while in partial differential equations, we will see terms involving ∂u/∂t and/or ∂²u/∂x² (and/or higher-order partial derivatives), with ∂ replacing d as a symbol for the derivative.

The independent variables could also be complex-valued but, in physics, they will usually be real variables (or scalars, as real numbers are also referred to – as opposed to vectors, which are nothing but two-, three- or more-dimensional numbers really). In physics, the independent variables will usually be x – or let’s use r = (x, y, z) for a change, i.e. the three-dimensional space vector – and the time variable t. An example is that wave function which we introduced in our ‘easy piece’:

Ψ(r, t) = Ae^(i(p·r – Et)/ħ)

[If you read the Easy Piece, then you might object that this is not quite what I wrote there, and you are right: I wrote Ψ(r, t) = Ae^(i((p/ħ)·r – ωt)). However, here I am just introducing the other de Broglie relation (i.e. the one relating energy and frequency): E = hf = ħω and, hence, ω = E/ħ. Just re-arrange a bit and you’ll see it’s the same.]

From a physics point of view, a differential equation represents a system subject to constraints, such as the energy conservation law (the sum of the potential and kinetic energy remains constant), and Newton’s law of course: F = d(mv)/dt. A differential equation will usually also be given with one or more initial conditions, such as the value of the function at point t = 0, i.e. the initial value of the function. To use Wikipedia’s definition: “Differential equations arise whenever a relation involving some continuously varying quantities (modeled by functions) and their rates of change in space and/or time (expressed as derivatives) is known or postulated.”

That sounds a bit more complicated, perhaps, but it means the same: once you have a good mathematical model of a physical problem, you will often end up with a differential equation representing the system you’re looking at, and then you can do all kinds of things, such as analyzing whether or not the actual system is in an equilibrium and, if not, whether it will tend to equilibrium or, if not, what the equilibrium conditions would be. But here I’ll refer to my previous posts on the topic of differential equations, because I don’t want to get into these details – as I don’t need them here.

The one thing I do need to introduce is an operator referred to as the gradient (it’s also known as the del operator, but I don’t like that word because it does not convey what it is). The gradient – denoted by ∇ – is a shorthand for the partial derivatives of our function u or Ψ with respect to space, so we write:

∇ = (∂/∂x, ∂/∂y, ∂/∂z)

You should note that, in physics, we apply the gradient only to the spatial variables, not to time. For the derivative in regard to time, we just write ∂u/∂t or ∂Ψ/∂t.

Of course, an operator means nothing until you apply it to a (real- or complex-valued) function, such as our u(x, t) or our Ψ(r, t):

∇u = ∂u/∂x and ∇Ψ = (∂Ψ/∂x, ∂Ψ/∂y, ∂Ψ/∂z)

As you can see, the gradient operator returns a vector with three components if we apply it to a real- or complex-valued function of r, and so we can do all kinds of funny things with it, combining it with the scalar or vector product, or with both. Here I need to remind you that, in a vector space, we can multiply vectors using either (i) the scalar product, aka the dot product (because of the dot in its notation: a•b) or (ii) the vector product, aka the cross product (yes, because of the cross in its notation: a×b).

So we can define a whole range of new operators using the gradient and these two products, such as the divergence and the curl of a vector field. For example, if E is the electric field vector (I am using an italic bold-type E so you should not confuse E with the energy E, which is a scalar quantity), then div E = ∇•E, and curl E = ∇×E. Taking the divergence of a vector field yields some number at each point (so that’s a scalar), while taking the curl yields another vector.

I am mentioning these operators because you will often see them. A famous example is the set of equations known as Maxwell’s equations, which integrate all of the laws of electromagnetism and from which we can derive the electromagnetic wave equation:

(1) ∇•E = ρ/ε₀ (Gauss’ law)

(2) ∇×E = –∂B/∂t (Faraday’s law)

(3) ∇•B = 0

(4) c²∇×B = j/ε₀ + ∂E/∂t

I should not explain these but let me just remind you of the essentials:

  1. The first equation (Gauss’ law) can be derived from Coulomb’s law and the equation for the force acting upon a charge q in an electromagnetic field: F = q(E + v×B), with B the magnetic field vector. [F is also referred to as the Lorentz force: it’s the combined force on a charged particle caused by the electric and magnetic fields. As for the other symbols: v is the velocity of the (moving) charge; ρ is the charge density (so charge is thought of as being distributed in space, rather than being packed into points, and that’s OK because our scale is not the quantum-mechanical one here); and, finally, ε₀ is the electric constant (some 8.854×10⁻¹² farads per meter).]
  2. The second equation (Faraday’s law) gives the electric field associated with a changing magnetic field.
  3. The third equation basically states that there is no such thing as a magnetic charge: there are only electric charges.
  4. Finally, in the last equation, we have a vector j representing the current density: indeed, remember that magnetism only appears when (electric) charges are moving, so when there’s an electric current. As for the equation itself, well… That’s a more complicated story, so I will leave that for the post scriptum.

We can do many more things: we can also take the curl of the gradient of some scalar, or the divergence of the curl of some vector (both have the interesting property that they are zero), and there are many more possible combinations – some of them useful, others not so useful. However, this is not the place to introduce differential calculus of vector fields (because that’s what it is).
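
Still, just to make those two ‘zero’ identities a bit less abstract, here’s a quick check in Python, using SymPy’s vector module (the fields φ and F below are arbitrary examples of my own making):

from sympy import sin, exp
from sympy.vector import CoordSys3D, gradient, divergence, curl

N = CoordSys3D('N')                             # a Cartesian coordinate system
phi = N.x**2 * N.y + sin(N.z)                   # some scalar field
F = N.x*N.y*N.i + exp(N.z)*N.j + N.y**2*N.k     # some vector field

print(curl(gradient(phi)))    # prints 0: the curl of a gradient always vanishes
print(divergence(curl(F)))    # prints 0: the divergence of a curl always vanishes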

The only other thing I need to mention here is what happens when we apply this gradient operator twice. Then we have a new operator ∇•∇ = ∇², which is referred to as the Laplacian. In fact, when we say ‘apply ∇ twice’, we are actually doing a dot product. Indeed, ∇ returns a vector, and so we are going to multiply this vector once again with a vector using the dot product rule: a•b = ∑aᵢbᵢ (so we multiply the individual vector components and then add them). In the case of our functions u and Ψ, we get:

∇•(∇u) = ∇•∇u = (∇•∇)u = ∇²u = ∂²u/∂x²

∇•(∇Ψ) = ∇²Ψ = ∂²Ψ/∂x² + ∂²Ψ/∂y² + ∂²Ψ/∂z²

Now, you may wonder what it means to take the derivative (or partial derivative) of a complex-valued function (which is what we are doing in the case of Ψ) but don’t worry about that: a complex-valued function of one or more real variables, such as our Ψ(x, t), can be decomposed as Ψ(x, t) = ΨRe(x, t) + iΨIm(x, t), with ΨRe and ΨIm two real-valued functions representing the real and imaginary part of Ψ(x, t) respectively. In addition, the rules for differentiating (and integrating) complex-valued functions are, to a large extent, the same as for real-valued functions. For example, if z is a complex number, then de^z/dz = e^z and, hence, using this and other very straightforward rules, we can indeed find the partial derivatives of a function such as Ψ(r, t) = Ae^(i(p·r – Et)/ħ) with respect to all the (real-valued) variables in the argument.

The electromagnetic wave equation  

OK. That’s enough math for now. We are ready to look at – and to understand – a real wave equation – I mean one that actually represents something in physics. Let’s take Maxwell’s equations as a start. To make it easy – and also to ensure that you have easy access to the full derivation – we’ll take the so-called Heaviside form of these equations:

Heaviside form of Maxwell's equations

This Heaviside form assumes a charge-free vacuum, so there are no external forces acting upon our electromagnetic wave. There are also no other complications such as electric currents. Also, the c² (i.e. the square of the speed of light) is written here as c₀² = 1/(μ₀ε₀), with μ and ε the so-called permeability and permittivity respectively (c₀, μ₀ and ε₀ are the values in a vacuum: indeed, light travels slower elsewhere (e.g. in glass) – if at all).

Now, these four equations can be replaced by just two, and it’s these two equations that are referred to as the electromagnetic wave equation(s):

electromagnetic wave equation

The derivation is not difficult. In fact, it’s much easier than the derivation of the Schrödinger equation, which I will present in a moment. But, even if it is very short, I will just refer to Wikipedia in case you would be interested in the details (see the article on the electromagnetic wave equation). The point here is just to illustrate what is being done with these wave equations and why – not so much how.
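
That being said, the gist of it fits on one line: take the curl of both sides of Faraday’s law (equation (2) in the Heaviside set), use the vector identity ∇×(∇×E) = ∇(∇•E) – ∇²E, and note that ∇•E = 0 in charge-free space:

∇×(∇×E) = –∂(∇×B)/∂t ⇒ ∇(∇•E) – ∇²E = –(1/c²)∂²E/∂t² ⇒ ∇²E = (1/c²)∂²E/∂t²

The same recipe, applied to the other curl equation, yields the wave equation for B. Now, you may wonder what we have gained with this ‘reduction’.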

The answer to this very legitimate question is easy: the two equations above are second-order partial differential equations which are relatively easy to solve. In other words, we can find a general solution, i.e. a set or family of functions that satisfy the equation and, hence, can represent the wave itself. Why a set of functions? If it’s a specific wave, then there should only be one wave function, right? Right. But to narrow our general solution down to a specific solution, we will need extra information, in the form of what are referred to as initial conditions, boundary conditions or, in general, constraints. [And if these constraints are not sufficiently specific, then we may still end up with a whole bunch of possibilities, even if they narrow down the choice.]

Let’s give an example by re-writing the above wave equation for our function u(r, t) or – to simplify the analysis – u(x, t), so we’re looking at a plane wave traveling in one dimension only:

Wave equation for u

There are many functional forms for u that satisfy this equation. One of them is the following:

general solution for wave equation

This resembles the one I introduced when presenting the de Broglie equations, except that – this time around – we are talking about a real electromagnetic wave, not some probability amplitude. Another difference is that we allow a composite wave with two components: one traveling in the positive x-direction, and one traveling in the negative x-direction. Now, if you read the post in which I introduced the de Broglie wave, you will remember that these Ae^(i(kx–ωt)) or Be^(–i(kx+ωt)) waves give strange probabilities. However, because we are not looking at some probability amplitude here – it’s not a de Broglie wave but a real wave, and we use complex-number notation only because it’s convenient (in practice, we’re only considering the real part) – this functional form is quite OK.

That being said, the following functional form, representing a wave packet (aka a wave train), is also a solution (or, better, a set of solutions):

Wave packet equation

Huh? Well… Yes. If you really can’t follow here, I can only refer you to my post on Fourier analysis and Fourier transforms: I cannot reproduce that one here because that would make this post totally unreadable. We have a wave packet here, and so that’s the sum of an infinite number of component waves that interfere constructively in the region of the envelope (so that’s the location of the packet) and destructively outside. The integral is just the continuum limit of a summation of n such waves. So this integral will yield a function u with x and t as independent variables… If we know A(k) that is. Now that’s the beauty of these Fourier integrals (because that’s what this integral is). 
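
If it helps to see this integral ‘in action’: here’s a crude numerical version of it in Python – just a Riemann sum over a grid of k’s, with a Gaussian A(k) of my own choosing (so the specific numbers mean nothing):

import numpy as np

k0, sigma = 20.0, 2.0                        # carrier wave number and spectral width
k = np.linspace(k0 - 10, k0 + 10, 2001)      # the component wave numbers
A = np.exp(-(k - k0)**2 / (2 * sigma**2))    # a Gaussian amplitude spectrum A(k)

x = np.linspace(-5.0, 5.0, 1001)
dk = k[1] - k[0]
u = (A * np.exp(1j * np.outer(x, k))).sum(axis=1) * dk   # u(x) = ∫A(k)·e^(ikx)dk

print(np.abs(u).argmax())   # 500, i.e. the middle of the x grid: |u| peaks at x = 0

Constructive interference around x = 0, destructive interference everywhere else: that’s the wave packet.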

Indeed, in my post on Fourier transforms I also explained how these amplitudes A(k) in the equation above can be expressed as a function of u(x, t) through the inverse Fourier transform. In fact, I actually presented the Fourier transform pair Ψ(x) and Φ(p) in that post, but the logic is the same – except that we’re inserting the time variable t once again (but with its value fixed at t = 0):

Fourier transform

OK, you’ll say, but where is all of this going? Be patient. We’re almost done. Let’s now introduce a specific initial condition. Let’s assume that we have the following functional form for u at time t = 0:

u at time 0

You’ll wonder where this comes from. Well… I don’t know. It’s just an example from Wikipedia. It’s random but it fits the bill: it’s a localized wave (so that’s a wave packet) because of the very particular form of the exponent (–x² + ik₀x: the real –x² part is what localizes the packet). The point to note is that we can calculate A(k) when inserting this initial condition in the equation above, and then – finally, you’ll say – we also get a specific solution for our u(x, t) function by inserting the value for A(k) in our general solution. In short, we get:

A

and

u final form

As mentioned above, we are actually only interested in the real part of this equation (so that’s the e with the real exponent factor (note there is no i in it, so it’s just some real number) multiplied with the cosine term).

However, the example above shows how easy it is to extend the analysis to a complex-valued wave function, i.e. a wave function describing a probability amplitude. We will actually do that now for Schrödinger’s equation. [Note that the example comes from Wikipedia’s article on wave packets, and so there is a nice animation which shows how this wave packet (be it the real or imaginary part of it) travels through space. Do watch it!]

Schrödinger’s equation

Let me just write it down:

Schrodinger's equation

That’s it. This is the Schrödinger equation – in a somewhat simplified form but it’s OK.

[…] You’ll find that equation above either very simple or, else, very difficult depending on whether or not you understood most or nothing at all of what I wrote above it. If you understood something, then it should be fairly simple, because it hardly differs from the other wave equation.

Indeed, we have that imaginary unit (i) in front of the left term, but then you should not panic over that: when everything is said and done, we are working here with the derivative (or partial derivative) of a complex-valued function, and so it should not surprise us that we have an i here and there. It’s nothing special. In fact, we had them in the equation above too, but they just weren’t explicit. The second difference with the electromagnetic wave equation is that we have a first-order derivative of time only (in the electromagnetic wave equation we had ∂²u/∂t², so that’s a second-order derivative). Finally, we have a –1/2 factor in front of the right-hand term, instead of c². OK, so what? It’s a different thing – but that should not surprise us: when everything is said and done, it is a different wave equation because it describes something else (not an electromagnetic wave but a quantum-mechanical system).

To understand why it’s different, I’d need to give you the equivalent of Maxwell’s set of equations for quantum mechanics, and then show how this wave equation is derived from them. I could do that. The derivation is somewhat lengthier than for our electromagnetic wave equation but not all that much. The problem is that it involves some new concepts which we haven’t introduced as yet – mainly some new operators. But then we have introduced a lot of new operators already (such as the gradient and the curl and the divergence) so you might be ready for this. Well… Maybe. The treatment is a bit lengthy, and so I’d rather do it in a separate post. Why? […] OK. Let me say a few things about it then. Here we go:

  • These new operators involve matrix algebra. Fine, you’ll say. Let’s get on with it. Well… It’s matrix algebra with matrices with complex elements, so if we write an n×m matrix A as A = (aᵢⱼ), then the elements aᵢⱼ (i = 1, 2,… n and j = 1, 2,… m) will be complex numbers.
  • That allows us to define Hermitian matrices: a Hermitian matrix is a square matrix A which is the same as the complex conjugate of its transpose.
  • We can use such matrices as operators indeed: transformations acting on a column vector X to produce another column vector AX.
  • Now, you’ll remember – from your course on matrix algebra with real (as opposed to complex) matrices, I hope – that we have this very particular matrix equation AX = λX which has non-trivial solutions (i.e. solutions X ≠ 0) if and only if the determinant of A-λI is equal to zero. This condition (det(A-λI) = 0) is referred to as the characteristic equation.
  • This characteristic equation is a polynomial equation of degree n in λ, and its roots are called eigenvalues or characteristic values of the matrix A. The non-trivial solutions X ≠ 0 corresponding to each eigenvalue are called eigenvectors or characteristic vectors. [A tiny numerical example follows right after this list.]
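
Here’s that tiny numerical example (the matrix is one I just made up): a Hermitian matrix has complex elements, but its eigenvalues turn out to be real numbers – which is exactly what we’ll need, because they are to represent energy levels:

import numpy as np

A = np.array([[2, 1 - 1j],
              [1 + 1j, 3]])          # equal to the conjugate of its transpose
assert np.allclose(A, A.conj().T)    # so it's Hermitian indeed

print(np.linalg.eigvalsh(A))         # [1. 4.]: real eigenvalues (eigvalsh is NumPy's routine for Hermitian matrices)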

Now – just in case you’re still with me – it’s quite simple: in quantum mechanics, we have the so-called Hamiltonian operator. The Hamiltonian in classical mechanics represents the total energy of the system: H = T + V (total energy H = kinetic energy T + potential energy V). Here we have got something similar but different. 🙂 The Hamiltonian operator is written as H-hat, i.e. an H with an accent circonflexe (as they say in French). Now, we need to let this Hamiltonian operator act on the wave function Ψ and if the result is proportional to the same wave function Ψ, then Ψ is a so-called stationary state, and the proportionality constant will be equal to the energy E of the state Ψ. These stationary states correspond to standing waves, or ‘orbitals’, such as in atomic orbitals or molecular orbitals. So we have:

E\Psi=\hat H \Psi

I am sure you are no longer there but, in fact, that’s it. We’re done with the derivation. The equation above is the so-called time-independent Schrödinger equation. It’s called like that not because the wave function is time-independent (it is), but because the Hamiltonian operator is time-independent: that obviously makes sense because stationary states are associated with specific energy levels indeed. However, if we do allow the energy level to vary in time (which we should do – if only because of the uncertainty principle: there is no such thing as a fixed energy level in quantum mechanics), then we cannot use some constant for E, but we need a so-called energy operator. Fortunately, this energy operator has a remarkably simple functional form:

\hat{E} \Psi = i\hbar\dfrac{\partial}{\partial t}\Psi = E\Psi

Now if we plug that into the equation above, we get our time-dependent Schrödinger equation:

i \hbar \frac{\partial}{\partial t}\Psi = \hat H \Psi

OK. You probably did not understand one iota of this but, even then, you will object that this does not resemble the equation I wrote at the very beginning: i(∂u/∂t) = (–1/2)∇²u.

You’re right, but we only need one more step for that. If we leave out potential energy (so we assume a particle moving in free space), then the Hamiltonian can be written as:

\hat{H} = -\frac{\hbar^2}{2m}\nabla^2

You’ll ask me how this is done but I will be short on that: the (non-relativistic) relationship between energy and momentum is being used here – E = p²/2m, with the operator –iħ∇ substituted for the momentum p – and so that’s where the 2m factor in the denominator comes from. However, I won’t say more about it because this post would become way too lengthy if I would include each and every derivation and, remember, I just want to get to the result because the derivations here are not the point: I want you to understand the functional form of the wave equation only. So, using the above identity and – OK, let’s be somewhat more complete and include potential energy once again – we can write the time-dependent wave equation as:

 i\hbar\frac{\partial}{\partial t}\Psi(\mathbf{r},t) = -\frac{\hbar^2}{2m}\nabla^2\Psi(\mathbf{r},t) + V(\mathbf{r},t)\Psi(\mathbf{r},t)
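
[A quick symbolic sanity check for the free-particle case (V = 0): let’s plug the plane wave Ψ = Ae^(i(px – Et)/ħ) into this equation with SymPy and see what constraint pops out:

import sympy as sp

x, t, p, E, m, hbar, A = sp.symbols('x t p E m hbar A', positive=True)
Psi = A * sp.exp(sp.I * (p*x - E*t) / hbar)

lhs = sp.I * hbar * sp.diff(Psi, t)            # iħ·∂Ψ/∂t
rhs = -hbar**2 / (2*m) * sp.diff(Psi, x, 2)    # -(ħ²/2m)·∂²Ψ/∂x²
print(sp.simplify((lhs - rhs) / Psi))          # prints E - p**2/(2*m)

So the plane wave satisfies the equation if and only if E = p²/2m: the non-relativistic relation between energy and momentum is built right into the wave equation.]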

Now, how is this time-dependent equation related to i(∂u/∂t) = (–1/2)∇²u? It’s a very simplified version of it: potential energy is, once again, assumed to be not relevant (so we’re talking a free particle again, with no external forces acting on it) but the real simplification is that we give m and ħ the value 1, so m = ħ = 1. Why?

Well… My initial idea was to do something similar as I did above and, hence, actually use a specific example with an actual functional form, just like we did for that real-valued u(x, t) function. However, when I look at how long this post has become already, I realize I should not do that. In fact, I would just copy an example from somewhere else – probably Wikipedia once again, if only because their examples are usually nicely illustrated with graphs (and often animated graphs). So let me just refer you here to the other example given in the Wikipedia article on wave packets: that example uses that simplified i(∂u/∂t) = (–1/2)∇²u equation indeed. It actually uses the same initial condition:

u at time 0

However, because the wave equation is different, the wave packet behaves differently. It’s a so-called dispersive wave packet: it delocalizes. Its width increases over time and so, after a while, it just vanishes because it diffuses all over space. So there’s a solution to the wave equation, given this initial condition, but it’s just not stable – as a description of some particle that is (from a mathematical point of view – or even a physical point of view – there is no issue).
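
For those who’d rather see some numbers than an animation: each Fourier mode e^(ikx) of the packet just picks up a phase factor e^(–i(k²/2)t) – that’s what i(∂u/∂t) = (–1/2)∇²u says – so we can evolve the packet exactly with two FFTs. A minimal illustration (the grid sizes and times are arbitrary choices of mine):

import numpy as np

N, L, k0 = 2048, 80.0, 5.0
x = np.linspace(-L/2, L/2, N, endpoint=False)
k = 2 * np.pi * np.fft.fftfreq(N, d=L/N)    # the FFT's wave numbers

u0 = np.exp(-x**2 + 1j * k0 * x)            # the initial condition above

for t in (0.0, 1.0, 2.0, 4.0):
    u = np.fft.ifft(np.fft.fft(u0) * np.exp(-1j * k**2 / 2 * t))
    P = np.abs(u)**2
    P /= P.sum()
    width = np.sqrt((P * x**2).sum() - (P * x).sum()**2)   # RMS width of |u|²
    print(t, round(width, 2))               # the width grows with t: the packet delocalizes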

In any case, this probably all sounds like Chinese – or Greek if you understand Chinese :-). I actually haven’t worked with these Hermitian operators yet, and so it’s pretty shaky territory for me. However, I felt like I had picked up enough math and physics on this long and winding Road to Reality (I don’t think I am even halfway) to give it a try. I hope I succeeded in passing the message, which I’ll summarize as follows:

  1. Schrödinger’s equation is just like any other differential equation used in physics, in the sense that it represents a system subject to constraints, such as the relationship between energy and momentum.
  2. It will have many general solutions. In other words, the wave function – which describes a probability amplitude as a function in space and time – will have many general solutions, and a specific solution will depend on the initial conditions.
  3. The solution(s) can represent stationary states, but not necessarily so: a wave (or a wave packet) can be non-dispersive or dispersive. However, when we plug the wave function into the wave equation, it will satisfy that equation.

That’s neither spectacular nor difficult, is it? But, perhaps, it helps you to ‘understand’ wave equations, including the Schrödinger equation. But what is understanding? Dirac once famously said: “I consider that I understand an equation when I can predict the properties of its solutions, without actually solving it.”

Hmm… I am not quite there yet, but I am sure some more practice with it will help. 🙂

Post scriptum: On Maxwell’s equations

First, we should say something more about those two other operators which I introduced above: the divergence and the curl. Let’s start with the divergence.

The divergence of a field vector E (or B) at some point r represents the so-called flux of E, i.e. the ‘flow’ of E per unit volume. So flux and divergence both deal with the ‘flow’ of electric field lines away from (positive) charges. [The ‘away from’ is from positive charges indeed – as per the convention: Maxwell himself used the term ‘convergence’ to describe flow towards negative charges, but so his ‘convention’ did not survive. Too bad, because I think convergence would be much easier to remember.]

So if we write that ∇•E = ρ/ε₀, then it means that we have some constant flux of E because of some (fixed) distribution of charges.

Now, we already mentioned that equation (3) in Maxwell’s set meant that there is no such thing as a ‘magnetic’ charge: indeed, ∇•B = 0 means there is no magnetic flux. But, of course, magnetic fields do exist, don’t they? They do. A current in a wire, for example, i.e. a bunch of steadily moving electric charges, will induce a magnetic field according to Ampère’s law, which is part of equation (4) in Maxwell’s set: c²∇×B = j/ε₀, with j representing the current density and ε₀ the electric constant.

Now, at this point, we have this curl: ∇×B. Just like divergence (or convergence as Maxwell called it – but then with the sign reversed), curl also means something in physics: it’s the amount of ‘rotation’, or ‘circulation’ as Feynman calls it, around some loop.

So, to summarize the above, we have (1) flux (divergence) and (2) circulation (curl) and, of course, the two must be related. And, while we do not have any magnetic charges and, hence, no flux for B, the current in that wire will cause some circulation of B, and so we do have a magnetic field. However, that magnetic field will be static, i.e. it will not change. Hence, the time derivative ∂B/∂t will be zero and, hence, from equation (2) we get that ∇×E = 0, so our electric field will be static too. The time derivative ∂E/∂t which appears in equation (4) also disappears and we just have c²∇×B = j/ε₀. This situation – of a constant electric and magnetic field – is described as electrostatics and magnetostatics respectively. It implies a neat separation of the four equations, and it makes electricity and magnetism appear as distinct phenomena. Indeed, as long as charges and currents are static, we have:

[I] Electrostatics: (1) ∇•E = ρ/ε₀ and (2) ∇×E = 0

[II] Magnetostatics: (3) c²∇×B = j/ε₀ and (4) ∇•B = 0

The first two equations describe a vector field with zero curl and a given divergence (i.e. the electric field), while the third and fourth equations describe a seemingly separate vector field with a given curl but zero divergence (i.e. the magnetic field). Now, I am not writing this post scriptum to reproduce Feynman’s Lectures on electromagnetism, and so I won’t say much more about this. I just want to note two points:

1. The first point to note is that factor c² in the c²∇×B = j/ε₀ equation. That’s something which you don’t have in the ∇•E = ρ/ε₀ equation. Of course, you’ll say: so what? Well… It’s weird. And if you bring it to the other side of the equation, it becomes clear that you need an awful lot of current for a tiny little bit of magnetic circulation (because you’re dividing by c², so that’s a factor 9 with 16 zeroes after it (9×10¹⁶): an awfully big number in other words). Truth be said, it reveals something very deep. Hmm? Take a wild guess. […] Relativity perhaps? Well… Yes!

It’s obvious that we buried v somewhere in this equation, the velocity of the moving charges. But then it’s part of j of course: the rate at which charge flows through a unit area per second. But – Hey! – velocity as compared to what? What’s the frame of reference? The frame of reference is us obviously or – somewhat less subjective – the stationary charges determining the electric field according to equation (1) in the set above: ∇•E = ρ/ε0. But so here we can ask the same question: stationary in what reference frame? As compared to the moving charges? Hmm… But so how does it work with relativity? I won’t copy Feynman’s 13th Lecture here, but so, in that lecture, he analyzes what happens to the electric and magnetic force when we look at the scene from another coordinate system – let’s say one that moves parallel to the wire at the same speed as the moving electrons, so – because of our new reference frame – the ‘moving electrons’ now appear to have no speed at all but, of course, our stationary charges will now seem to move.

What Feynman finds – and his calculations are very easy and straightforward – is that, while we will obviously insert different input values into Maxwell’s set of equations and, hence, get different values for the E and B fields, the actual physical effect – i.e. the final Lorentz force on a (charged) particle – will be the same. To be very specific, in a coordinate system at rest with respect to the wire (so we see charges move in the wire), we find a ‘magnetic’ force indeed, but in a coordinate system moving at the same speed of those charges, we will find an ‘electric’ force only. And from yet another reference frame, we will find a mixture of E and B fields. However, the physical result is the same: there is only one combined force in the end – the Lorentz force F = q(E + v×B) – and it’s always the same, regardless of the reference frame (inertial or moving at whatever speed – relativistic (i.e. close to c) or not).

In other words, Maxwell’s description of electromagnetism is invariant or, to say exactly the same in yet other words, electricity and magnetism taken together are consistent with relativity: they are part of one physical phenomenon: the electromagnetic interaction between (charged) particles. So electric and magnetic fields appear in different ‘mixtures’ if we change our frame of reference, and so that’s why magnetism is often described as a ‘relativistic’ effect – although that’s not very accurate. However, it does explain that c² factor in the equation for the curl of B. [How exactly? Well… If you’re smart enough to ask that kind of question, you will be smart enough to find the derivation on the Web. :-)]

Note: Don’t think we’re talking astronomical speeds here when comparing the two reference frames. It would also work for astronomical speeds but, in this case, we are talking the speed of the electrons moving through a wire. Now, the so-called drift velocity of electrons – which is the one we have to use here – in a copper wire of radius 1 mm carrying a steady current of 3 Amps is only about 1 m per hour! So the relativistic effect is tiny – but still measurable!

2. The second thing I want to note is that Maxwell’s set of equations with non-zero time derivatives for E and B clearly show that it’s changing electric and magnetic fields that sort of create each other, and it’s this that’s behind electromagnetic waves moving through space without losing energy. They just travel on and on. The math behind this is beautiful (and the animations in the related Wikipedia articles are equally beautiful – and probably easier to understand than the equations), but that’s stuff for another post. As the electric field changes, it induces a magnetic field, which then induces a new electric field, etc., allowing the wave to propagate itself through space. I should also note here that the energy is in the field and so, when electromagnetic waves, such as light, or radio waves, travel through space, they carry their energy with them.

Let me be fully complete here, and note that there’s energy in electrostatic fields as well, and the formula for it is remarkably beautiful. The total (electrostatic) energy U in an electrostatic field generated by charges located within some finite distance is equal to:

Energy of electrostatic field
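
[In case the formula doesn’t display: what it says – I am transcribing it from Feynman’s Lectures, so do double-check – is the following:

U = (1/2)∫ρΦ dV = (ε₀/2)∫E•E dV]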

This equation introduces the electrostatic potential. This is a scalar field Φ from which we can derive the electric field vector just by applying the gradient operator (and a minus sign: E = –∇Φ). In fact, all curl-free fields (such as the electric field in this case) can be written as the gradient of some scalar field. That’s a universal truth. See how beautiful math is? 🙂

End of the Road to Reality?

Or the end of theoretical physics?

In my previous post, I mentioned the Goliath of science and engineering: the Large Hadron Collider (LHC), built by the European Organization for Nuclear Research (CERN) under the Franco-Swiss border near Geneva. I actually started uploading some pictures, but then I realized I should write a separate post about it. So here we go.

The first image (see below) shows the LHC tunnel, while the other shows (a part of) one of the two large general-purpose particle detectors that are part of this Large Hadron Collider. A detector is the thing that’s used to look at those collisions. This is actually the smaller of the two general-purpose detectors: it’s the so-called CMS detector (the other one is the ATLAS detector), and it’s ‘only’ 21.6 meter long and 15 meter in diameter – and it weighs about 12,500 tons. But so it did detect a Higgs particle – just like the ATLAS detector. [That’s actually not 100% sure but it was sure enough for the Nobel Prize committee – so I guess that should be good enough for us common mortals :-)]

LHC tunnelLHC - CMS detector

image of collision

The picture above shows one of these collisions in the CMS detector. It’s not the one with the trace of the Higgs particle though. In fact, I have not found any image that actually shows the Higgs particle: the closest thing to such an image are some impressionistic images on the ATLAS site. See http://atlas.ch/news/2013/higgs-into-fermions.html

In case you wonder what’s being scattered here… Well… All kinds of things – but so the original collision is usually between protons (i.e. hydrogen nuclei, or H⁺ ions), although the LHC can produce other nucleon beams as well (collectively referred to as hadrons). These protons have energy levels of 4 TeV (tera-electronvolt: 1 TeV = 1,000 GeV = 1 trillion eV = 1×10¹² eV).

Now, let’s think about scale once again. Remember (from that same previous post) that we calculated a wavelength of 0.33 nanometer (1 nm = 1×10⁻⁹ m, so that’s a billionth of a meter) for an electron. Well, this LHC is actually exploring the sub-femtometer (fm) frontier. One femtometer (fm) is 1×10⁻¹⁵ m, so that’s another million times smaller. Yes: so we are talking a millionth of a billionth of a meter. The size of a proton is an estimated 1.7 femtometer indeed and, as you surely know, a proton is a point-like thing occupying a very tiny space, so it’s not like an electron ‘cloud’ swirling around: it’s much smaller. In fact, quarks – three of them make up a proton (or a neutron) – are usually thought of as being just a little bit less than half that size – so that’s about 0.7 fm.

It may also help you to use the value I mentioned for high-energy electrons when I was discussing the LEP (the Large Electron-Positron Collider, which preceded the LHC) – so that was 104.5 GeV – and calculate the associated de Broglie wavelength using E = hf and λ = v/f. The velocity is close to c and, hence, λ ≈ hc/E. At 1 GeV, that gives about 1.2×10⁻¹⁵ m, so GeV energy levels correspond to the femtometer scale indeed – and LEP’s 104.5 GeV electrons were probing scales another hundred times smaller still. [If you don’t want to calculate anything, then just note we’re going from eV to GeV and TeV energy levels here, and so our wavelength decreases accordingly. Also remember (from the previous posts) that we calculated a wavelength of 0.33×10⁻⁹ m and an associated energy of a dozen eV or so for a slow-moving electron – i.e. one going at 2,200 km per second ‘only’, i.e. less than 1% of the speed of light.] Also note that, at these energy levels, it doesn’t matter whether or not we include the rest mass of the electron: 0.511 MeV is nothing as compared to the GeV realm. In short, we are talking very, very tiny stuff here.
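
Here’s that scale arithmetic in a few lines of Python, in case you want to play with other energies (for v ≈ c, we have λ = v/f ≈ c/(E/h) = hc/E):

h = 4.136e-15    # Planck's constant (eV·s)
c = 2.998e8      # speed of light (m/s)

for label, E in (('1 GeV', 1e9), ('LEP, 104.5 GeV', 104.5e9), ('LHC, 4 TeV', 4e12)):
    print(label, h * c / E)   # ~1.2e-15 m, ~1.2e-17 m and ~3.1e-19 m respectively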

But so that’s the LEP scale. I wrote that the LHC is probing things at the sub-femtometer scale. So how much sub-something is that? Well… Quite a lot: the LHC is looking at stuff at a scale that’s more than a thousand times smaller. Indeed, if collision experiments in the giga-electronvolt (GeV) energy range correspond to probing stuff at the femtometer scale, then tera-electronvolt (TeV) energy levels correspond to probing stuff that’s, once again, another thousand times smaller, so we’re looking at distances of less than a thousandth of a millionth of a billionth of a meter. Now, you can try to ‘imagine’ that, but you can’t really.

So what do we actually ‘see’ then? Well… Nothing much, one could say: all we can ‘see’ are traces of point-like ‘things’ being scattered, which then disintegrate or just vanish from the scene – as shown in the image above. In fact, as mentioned above, we do not even have such a clear-cut ‘trace’ of a Higgs particle: we’ve got a ‘kinda signal’ only. So that’s it? Yes. But then these images are beautiful, aren’t they? If only to remind ourselves that particle physics is about more than just a bunch of formulas. It’s about… Well… The essence of reality: its intrinsic nature so to say. So… Well…

Let me be skeptical. So we know all of that now, don’t we? The so-called Standard Model has been confirmed by experiment. We now know how Nature works, don’t we? We observe light (or, to be precise, radiation: most notably that cosmic background radiation that reaches us from everywhere) that originated nearly 14 billion years ago (to be precise: 380,000 years after the Big Bang – but what’s 380,000 years on this scale?) and so we can ‘see’ things that are 14 billion light-years away. In fact, things that were 14 billion light-years away: indeed, because of the expansion of the universe, they are further away now and so that’s why the so-called observable universe is actually larger. So we can ‘see’ everything we need to ‘see’ at the cosmic distance scale and now we can also ‘see’ all of the particles that make up matter, i.e. quarks and electrons mainly (we also have some other so-called leptons, like neutrinos and muons), and also all of the particles that make up anti-matter of course (i.e. antiquarks, positrons etcetera). As importantly – or even more so – we can also ‘see’ all of the ‘particles’ carrying the forces governing the interactions between the ‘matter particles’ – which are collectively referred to as fermions, as opposed to the ‘force carrying’ particles, which are collectively referred to as bosons (see my previous post on Bose and Fermi). Let me quickly list them – just to make sure we’re on the same page:

  1. Photons for the electromagnetic force.
  2. Gluons for the so-called strong force, which explains why positively charged protons ‘stick’ together in nuclei – in spite of their electric charge, which should push them away from each other. [You might think it’s the neutrons that ‘glue’ them together but so, no, it’s the gluons.]
  3. W+, W– and Z bosons for the so-called ‘weak’ interactions (aka Fermi’s interaction), which explain how one type of quark can change into another, thereby explaining phenomena such as beta decay. [For example, carbon-14 will – through beta decay – spontaneously decay into nitrogen-14. Indeed, carbon-12 is the stable isotope, while carbon-14 has a half-life of 5,730 ± 40 years ‘only’ 🙂 and, hence, measuring how much carbon-14 is left in some organic substance allows us to date it (that’s what (radio)carbon-dating is about). As for the name, a beta particle can refer to an electron or a positron, so we can have β– decay (e.g. the above-mentioned carbon-14 decay) as well as β+ decay (e.g. magnesium-23 into sodium-23). There’s also alpha and gamma decay but that involves different things. In any case… Let me end this digression within the digression.]
  4. Finally, the existence of the Higgs particle – and, hence, of the associated Higgs field – had been predicted since 1964 already, but so it was only experimentally confirmed (i.e. we saw it, in the LHC) last year, so Peter Higgs – and a few others of course – got their well-deserved Nobel Prize only 50 years later. The Higgs field gives fermions, and also the W+, W– and Z bosons, mass (but not photons and gluons, and so that’s why the weak force has such short range – as compared to the electromagnetic and strong forces).

So there we are. We know it all. Sort of. Of course, there are many questions left – so it is said. For example, the Higgs particle does not actually explain the gravitational force, so it’s not the (theoretical) graviton, and so we do not have a quantum field theory for the gravitational force. [Just Google it and you’ll see why: there are theoretical as well as practical (experimental) reasons for that.] Secondly, while we do have a quantum field theory for all of the forces (or ‘interactions’ as physicists prefer to call them), there are a lot of constants in them (many more than just that Planck constant I introduced in my posts!) that seem to be ‘unrelated and arbitrary.’ I am obviously just quoting Wikipedia here – but it’s true.

Just look at it: three ‘generations’ of matter with various strange properties, four force fields (and some ‘gauge theory’ to provide some uniformity), bosons that have mass (the W+, W– and Z bosons, and then the Higgs particle itself) but then photons and gluons don’t… It just doesn’t look good, and then Feynman himself wrote, just a few years before his death (QED, 1985, p. 128), that the math behind calculating some of these constants (the coupling constant j for instance, or the rest mass n of an electron), which he actually invented (it makes use of a mathematical approximation method called perturbation theory) and for which he got a Nobel Prize, is a “dippy process” and that “having to resort to such hocus-pocus has prevented us from proving that the theory of quantum electrodynamics is mathematically self-consistent”. He adds: “It’s surprising that the theory still hasn’t been proved self-consistent one way or the other by now; I suspect that renormalization [“the shell game that we play to find n and j” as he calls it] is not mathematically legitimate.” And so he writes this about quantum electrodynamics, not about “the rest of physics” (and so that’s quantum chromodynamics (QCD) – the theory of the strong interactions – and quantum flavordynamics (QFD) – the theory of weak interactions) which, he adds, “has not been checked anywhere near as well as electrodynamics.”

Wow! That’s a pretty damning statement, isn’t it? In short, all of the celebrations around the experimental confirmation of the Higgs particle cannot hide the fact that it all looks a bit messy. There are other questions as well – most of which I don’t understand, so I won’t mention them. To make a long story short, physicists and mathematicians alike seem to think there must be some ‘more fundamental’ theory behind it all. But – Hey! – you can’t have it all, can you? And, of course, all these theoretical physicists and mathematicians out there do need to justify their academic budget, don’t they? And so all that talk about a Grand Unification Theory (GUT) is probably just what it is: talk. Isn’t it? Maybe.

The key question is probably easy to formulate: what’s beyond this scale of a thousandth of a proton diameter (0.001×10⁻¹⁵ m) – a thousandth of a millionth of a billionth of a meter that is. Well… Let’s first note that this so-called ‘beyond’ is a ‘universe’ which mankind (or let’s just say ‘we’) will never see. Never ever. Why? Because there is no way to go substantially beyond the 4 TeV energy levels that were reached last year – at great cost – in the world’s largest particle collider (the LHC). Indeed, the LHC is widely regarded not only as “the most complex and ambitious scientific project ever accomplished by humanity” (I am quoting a CERN scientist here) but – with a cost of more than 7.5 billion Euro – also as one of the most expensive ones. Indeed, taking into account inflation and all that, it was like the Manhattan project indeed (although scientists loathe that comparison). So we should not have any illusions: there will be no new super-duper LHC any time soon, and surely not during our lifetime: the current LHC is the super-duper thing!

Indeed, when I write ‘substantially’ above, I really mean substantially. Just to put things in perspective: the LHC is currently being upgraded to produce 7 TeV beams (it was shut down for this upgrade, and it should come back on stream in 2015). That sounds like an awful lot (from 4 to 7 is +75%), and it is: it amounts to packing the kinetic energy of seven flying mosquitos (instead of four previously :-)) into each and every particle that makes up the beam. But that’s not substantial, in the sense that it is very much below the so-called GUT energy scale, which is the energy level above which, it is believed (by all those GUT theorists at least), the electromagnetic force, the weak force and the strong force will all be part and parcel of one and the same unified force. Don’t ask me why (I’ll know when I’ve finished reading Penrose, I hope) but that’s what it is (if I should believe what I am reading currently, that is). In any case, the thing to remember is that the GUT energy levels are in the 10¹⁶ GeV range, so that’s – sorry for all these numbers – ten trillion TeV. That amounts to pumping more than a million joule into each of those tiny point-like particles that make up our beam. So… No. Don’t even try to dream about it. It won’t happen. That’s science fiction – with the emphasis on fiction. [Also don’t dream about ten trillion flying mosquitos packed into one proton-sized super-mosquito either. :-)]

So what?

Well… I don’t know. Physicists refer to the zone beyond the above-mentioned scale (so things smaller than 0.001×10⁻¹⁵ m) as the Great Desert. That’s a very appropriate name, I think – for more than one reason. And so it’s this ‘desert’ that Roger Penrose is actually trying to explore in his ‘Road to Reality’. As for me, well… I must admit I have great trouble following Penrose on this road. I’ve actually started to doubt that Penrose’s Road leads to Reality. Maybe it takes us away from it. Huh? Well… I mean… Perhaps the road just stops at that 0.001×10⁻¹⁵ m frontier?

In fact, that’s a view which one of the early physicists specialized in high-energy physics, Raoul Gatto, referred to as the zeroth scenario. I am actually not quoting Gatto here, but another theoretical physicist: Gerard ’t Hooft, another Nobel Prize winner (you may know him better because he’s a rather fervent Mars One supporter, but here I am referring to his popular 1996 book In Search of the Ultimate Building Blocks). In any case, Gatto, and most other physicists, including ’t Hooft (despite the fact that ’t Hooft got his Nobel Prize for his contribution to gauge theory – which, together with Feynman’s application of perturbation theory to QED, is actually the backbone of the Standard Model), firmly reject this zeroth scenario. ’t Hooft himself thinks superstring theory (i.e. supersymmetric string theory – which has now been folded into M-theory or – back to the original term – just string theory: the terminology is quite confusing) holds the key to exploring this desert.

But who knows? In fact, we can’t know – because of the above-mentioned practical problem of experimental confirmation. So I am likely to stay on this side of the frontier for quite a while – if only because there’s still so much to see here and, of course, also because I am just at the beginning of this road. 🙂 And then I also realize I’ll need to understand gauge theory and all that to continue on this road – which is likely to take me another six months or so (if not more) and then, only then, I might try to look at those little strings, even if we’ll never see them because… Well… Their theoretical diameter is the so-called Planck length. So what? Well… That’s equal to 1.6×10⁻³⁵ m. So what? Well… Nothing. It’s just that 1.6×10⁻³⁵ m is yet another factor of almost 10¹⁷ below that sub-femtometer scale. I don’t even want to write this in trillionths of trillionths of trillionths etcetera because I feel that’s just not making any sense. And perhaps it doesn’t. One thing is for sure: that ‘desert’ that GUT theorists want us to cross is not just ‘Great’: it’s ENORMOUS!

Richard Feynman – another Nobel Prize scientist whom I obviously respect a lot – surely thought trying to cross a desert like that amounts to certain death. Indeed, he’s supposed to have said the following about string theorists, about a year or two before he died (way too young): “I don’t like that they’re not calculating anything. I don’t like that they don’t check their ideas. I don’t like that for anything that disagrees with an experiment, they cook up an explanation – a fix-up to say, ‘Well, it might be true.’ For example, the theory requires ten dimensions. Well, maybe there’s a way of wrapping up six of the dimensions. Yes, that’s all possible mathematically, but why not seven? When they write their equation, the equation should decide how many of these things get wrapped up, not the desire to agree with experiment. In other words, there’s no reason whatsoever in superstring theory that it isn’t eight out of the ten dimensions that get wrapped up and that the result is only two dimensions, which would be completely in disagreement with experience. So the fact that it might disagree with experience is very tenuous, it doesn’t produce anything; it has to be excused most of the time. It doesn’t look right.”

Hmm… Feynman and ’t Hooft… Two giants in science. Two Nobel Prize winners – and for stuff that truly revolutionized physics. The amazing thing is that those two giants – who are clearly at loggerheads on this one – actually worked closely together on a number of other topics – most notably on the so-called Feynman–’t Hooft gauge, which – as far as I understand – is the one that is most widely used in quantum field calculations. But I’ll leave it at that here – and I’ll just make a mental note of the terminology here. The Great Desert… Probably an appropriate term. ’t Hooft says that most physicists think that desert is full of tiny flowers. I am not so sure – but then I am not half as smart as ’t Hooft. Much less actually. So I’ll just see where the road I am currently following leads me. With Feynman’s warning in mind, I should probably expect the road condition to deteriorate quickly.

Post scriptum: You will not be surprised to hear that there’s a word for 1×10⁻¹⁸ m: it’s called an attometer (with two t’s, and abbreviated as am). And beyond that we have the zeptometer (1 zm = 1×10⁻²¹ m) and the yoctometer (1 ym = 1×10⁻²⁴ m). In fact, these measures actually represent something: 20 yoctometer is the estimated radius of a 1 MeV neutrino – or, to be precise, it’s the radius of the cross section, which is “the effective area that governs the probability of some scattering or absorption event.” But so then there are no words anymore. The next measure is the Planck length: 1.62×10⁻³⁵ m – but so that’s still some sixty billion (6×10¹⁰) times smaller than a yoctometer. Unimaginable, isn’t it? Literally.

Note: A 1 MeV neutrino? Well… Yes. The estimated rest mass of an (electron) neutrino is tiny: at least 50,000 times smaller than the mass of the electron and, therefore, neutrinos are often assumed to be massless, for all practical purposes that is. However, just like the massless photon, they can carry high energy. High-energy gamma ray photons, for example, are also associated with MeV energy levels. Neutrinos are one of the many particles produced in high-energy particle collisions in particle accelerators, but they are present everywhere: they’re produced by stars (which, as you know, are nuclear fusion reactors). In fact, most neutrinos passing through Earth are produced by our Sun. The largest neutrino detector on Earth is called IceCube. It sits on the South Pole – or under it, as it’s suspended under the Antarctic ice – and it regularly captures high-energy neutrinos in the range of 1 to 10 TeV. Last year (in November 2013), it captured two with energy levels around 1000 TeV – so that’s the peta-electronvolt level (1 PeV = 1×10¹⁵ eV). If you think that’s amazing, it is. But also remember that 1 eV is 1.6×10⁻¹⁹ Joule, so 1 PeV is ‘only’ about a ten-thousandth of a Joule. In other words, you would need at least ten thousand of them to briefly light up an LED. The PeV pair was dubbed Bert and Ernie and the illustration below (from IceCube’s website) conveys how the detectors sort of lit up when they passed. It was obviously a pretty clear ‘signal’ – but so the illustration also makes it clear that we don’t really ‘see’ at such small scale: we just know ‘something’ happened.

Bert and Ernie

The Uncertainty Principle re-visited: Fourier transforms and conjugate variables

In previous posts, I presented a time-independent wave function for a particle (or wavicle as we should call it – but so that’s not the convention in physics) – let’s say an electron – traveling through space without any external forces (or force fields) acting upon it. So it’s just going in some random direction with some random velocity v and, hence, its momentum is p = mv. Let me be specific – so I’ll work with some numbers here – because I want to introduce some issues related to units for measurement.

So the momentum of this electron is the product of its mass m (about 9.1×10⁻²⁸ grams, i.e. 9.1×10⁻³¹ kg) with its velocity v (typically something in the range around 2,200 km/s, which is fast but not even close to the speed of light – and, hence, we don’t need to worry about relativistic effects on its mass here). Hence, the momentum p of this electron would be some 20×10⁻²⁵ kg·m/s. Huh? Kg·m/s? Well… Yes, kg·m/s or N·s are the usual measures of momentum in classical mechanics: its dimension is [mass][length]/[time] indeed. However, you know that, in atomic physics, we don’t want to work with these enormous units (because we then always have to add these ×10⁻²⁸ and ×10⁻²⁵ factors and so that’s a bit of a nuisance indeed). So the momentum p will usually be measured in eV/c, with c representing what it usually represents, i.e. the speed of light. Huh? What’s this strange unit? Electronvolts divided by c? Well… We know that eV is an appropriate unit for measuring energy in atomic physics: we can express eV in Joule and vice versa: 1 eV = 1.6×10⁻¹⁹ Joule, so that’s OK – except for the fact that this Joule is a monstrously large unit at the atomic scale indeed, and so that’s why we prefer electronvolt. But the Joule is a shorthand unit for kg·m²/s², which is the measure for energy expressed in SI units, so there we are: while the SI dimension for energy is actually [mass][length]²/[time]², using electronvolts (eV) is fine. Now, just divide the SI dimension for energy, i.e. [mass][length]²/[time]², by the SI dimension for velocity, i.e. [length]/[time]: we get something expressed in [mass][length]/[time]. So that’s the SI dimension for momentum indeed! In other words, dividing some quantity expressed in some measure for energy (be it Joules or electronvolts or erg or calories or coulomb-volts or BTUs or whatever – there are quite a lot of ways to measure energy indeed!) by the speed of light (c) will result in some quantity with the right dimensions indeed. So don’t worry about it. Now, 1 eV/c is equivalent to 5.344×10⁻²⁸ kg·m/s, so the momentum of this electron will be some 3,750 eV/c, i.e. 3.75 keV/c.

Let’s go back to the main story now. Just note that the momentum of this electron that we are looking at is a very tiny amount – as we would expect of course.

Time-independent means that we keep the time variable (t) in the wave function Ψ(x, t) fixed and so we only look at how Ψ(x, t) varies in space, with x as the (real) space variable representing position. So we have a simplified wave function Ψ(x) here: we can always put the time variable back in when we’re finished with the analysis. By now, it should also be clear that we should distinguish between real-valued wave functions and complex-valued wave functions. Real-valued wave functions represent what Feynman calls “real waves”, like a sound wave, or an oscillating electromagnetic field. Complex-valued wave functions describe probability amplitudes. They are… Well… Feynman actually stops short of saying that they are not real. So what are they?

They are, first and foremost, complex numbers, so they have a real and a so-called imaginary part (z = a + ib or, if we use polar coordinates, z = r·e^(iθ) = r(cosθ + i·sinθ)). Now, you may think – and you’re probably right to some extent – that the distinction between ‘real’ waves and ‘complex’ waves is, perhaps, less of a dichotomy than popular writers – like me 🙂 – suggest. When describing electromagnetic waves, for example, we need to keep track of both the electric field vector E as well as the magnetic field vector B (both are obviously related through Maxwell’s equations). So we have two components as well, so to say, and each of these components has three dimensions in space, and we’ll use the same mathematical tools to describe them (so we will also represent them using complex numbers). That being said, these probability amplitudes, usually denoted by Ψ(x), describe something very different. What exactly? Well… By now, it should be clear that that is actually hard to explain: the best thing we can do is to work with them, so they start feeling familiar. The main thing to remember is that we need to square their modulus (or magnitude or absolute value if you find these terms more comprehensible) to get a probability (P). For example, the expression below gives the probability of finding a particle – our electron for example – in the (space) interval [a, b]:

P[a, b] = ∫ab |Ψ(x)|2 dx (with the integral running from x = a to x = b)

Of course, we should not be talking about intervals but about three-dimensional regions in space. However, we’ll keep it simple: just remember that the analysis should be extended to three (space) dimensions (and, of course, include the time dimension as well) when we’re finished (to do that, we’d use so-called four-vectors – another wonderful mathematical invention).
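Just to make the |Ψ(x)|2 → probability recipe tangible, here’s a small numerical sketch. The Gaussian shape of the wave function is purely an illustrative choice (it’s not anything we derived):

```python
import numpy as np

# Take a normalized (real-valued, for simplicity) Gaussian wave function
# and integrate |Ψ(x)|² over the interval [a, b] to get a probability.
sigma = 1e-9                                   # spread of the packet: 1 nm
x = np.linspace(-10e-9, 10e-9, 20001)          # a 20 nm window, in meters
dx = x[1] - x[0]
psi = (2*np.pi*sigma**2)**-0.25 * np.exp(-x**2 / (4*sigma**2))

a, b = -1e-9, 1e-9                             # the interval [a, b]: one sigma on each side
mask = (x >= a) & (x <= b)
P = np.sum(np.abs(psi[mask])**2) * dx          # P = ∫ab |Ψ(x)|² dx
print(P)                                       # ≈ 0.68
```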

Now, we also used a simple functional form for this wave function, as an example: Ψ(x) could be proportional, we said, to some idealized function eikx. So we can write: Ψ(x) ∝ eikx (∝ is the standard symbol expressing proportionality). In this function, we have a wave number k, which is like the frequency in space of the wave (measured in radians per unit distance, because the phase of the wave function has to be expressed in radians). In fact, we actually wrote Ψ(x, t) = (1/x)ei(kx – ωt) (so the magnitude of this amplitude decreases with distance) but, again, let’s keep it simple for the moment: even with this very simple function eikx, things will become complex enough.
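To see what this eikx thing actually does, you can just evaluate it: it’s Euler’s formula at work, a point going round the unit circle in the complex plane. A minimal sketch (the value of k anticipates the one we’ll compute below):

```python
import numpy as np

k = 1.9e10                      # wave number in rad/m (see the calculation below)
x = np.linspace(0, 1e-9, 5)     # a few sample positions spread over 1 nm
psi = np.exp(1j * k * x)        # e^(ikx) = cos(kx) + i·sin(kx)
print(psi)                      # complex numbers on the unit circle...
print(np.abs(psi)**2)           # ...so |Ψ(x)|² = 1 everywhere (more on that below)
```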

We also introduced the de Broglie relation, which gives this wave number k as a function of the momentum p of the particle: k = p/ħ, with ħ the (reduced) Planck constant, i.e. a very tiny number in the neighborhood of 6.582×10−16 eV·s. So, using the numbers above, we’d have a value for k equal to 3,740 eV/c divided by 6.582×10−16 eV·s. So that’s 0.57×1019 (radians) per… Hey, how do we do it with the units here? We get an incredibly huge number here (57 with 17 zeroes after it) per second? We should get some number per meter, because k is expressed in radians per unit distance, right? Right. We forgot the c. We are actually measuring distance here, but in light-seconds instead of meters: a light-second is the distance traveled by light in one second, i.e. 2.998×108 m. So, if we want k expressed in radians per meter, we need to divide this huge number 0.57×1019 (in rad per light-second) by 2.998×108 (in m per light-second), and then we get a much more reasonable value for k, and with the right dimension too: to be precise, k is about 19×109 rad/m in this case. That’s still huge: it corresponds with a wavelength of 0.33 nanometer (1 nm = 10−9 m), but that’s the correct order of magnitude indeed.

[In case you wonder what formula I am using to calculate the wavelength: it’s λ = 2π/k. Note that our electron’s wavelength is more than a thousand times shorter than the wavelength of (visible) light (we humans can see light with wavelengths ranging from 380 to 750 nm), but that’s what gives the electron its particle-like character! If we increase the velocity of electrons (e.g. by accelerating them in an accelerator, using electromagnetic fields to propel them to speeds closer to the speed of light and to contain them in a beam), then we get hard beta rays. Hard beta rays are surely not as harmful as high-energy electromagnetic rays. X-rays and gamma rays consist of photons with wavelengths ranging from 1 to 100 picometer (1 pm = 10–12 m) – so that’s another factor of a thousand down – and thick lead shields are needed to stop them: they can cause cancer (radiation exposure is what killed Marie Curie), and the hard radiation of a nuclear blast will always end up killing more people than the immediate blast effect. In contrast, hard beta rays will cause skin damage (radiation burns) but they won’t go much deeper than that.]
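Again, the arithmetic above is easy to verify with a few lines of Python – a sketch using the same numbers:

```python
import math

# Check the de Broglie arithmetic: k = p/ħ and λ = 2π/k.
hbar = 6.582e-16        # reduced Planck constant in eV·s
c = 2.998e8             # speed of light in m/s
p = 3740                # the electron's momentum in eV/c (from above)

k = p / hbar / c        # rad per light-second, converted to rad/m
lam = 2 * math.pi / k
print(k, lam)           # ≈ 1.9e10 rad/m, ≈ 3.3e-10 m = 0.33 nm
```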

Let’s get back to our wave function Ψ(x) ∝ eikx. When we introduced it in our previous posts, we said it could not accurately describe a particle, because this wave function (Ψ(x) = Aeikx) is associated with probabilities |Ψ(x)|2 that are the same everywhere. Indeed, |Ψ(x)|2 = |Aeikx|2 = |A|2. Apart from the fact that these probabilities would add up to infinity (so this mathematical shape is unacceptable anyway), it also implies that we cannot locate our electron somewhere in space: it’s everywhere, and that’s the same as saying it’s actually nowhere. So, while we can use this wave function to explain and illustrate a lot of stuff (first and foremost the de Broglie relations), we actually need something different if we want to describe anything real (which, in the end, is what physicists want to do, right?). As we said in our previous posts, real particles will actually be represented by a wave packet, or a wave train. A wave train can be analyzed as a composite wave consisting of a (potentially infinite) number of component waves. So we write:

Ψ(x) = A1eik1x + A2eik2x + … + Aneiknx

Note that we do not have one unique wave number k or – what amounts to the same – one unique value p for the momentum: we have n values. So we’re introducing a spread in the wavelength here, as illustrated below:

[Illustration: a wave train and the spread of wavelengths of the component waves that make it up]
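To convince yourself that such a superposition really is localized, you can build one numerically. This is only a sketch – the amplitudes and wave numbers are illustrative, not anything we computed:

```python
import numpy as np
import matplotlib.pyplot as plt

# Superpose plane waves e^(i·kn·x) with wave numbers spread around k = 1:
# the components interfere destructively everywhere except near x = 0.
x = np.linspace(-50, 50, 2001)              # position, in arbitrary units
k_values = np.linspace(0.8, 1.2, 41)        # a spread of wave numbers
psi = sum(np.exp(1j * k * x) for k in k_values) / len(k_values)

plt.plot(x, np.abs(psi)**2)                 # |Ψ(x)|² peaks near x = 0 and dies off
plt.xlabel("x"); plt.ylabel("|Ψ(x)|²")
plt.show()
```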

In fact, the illustration above talks of a continuous distribution of wavelengths, so let’s take the continuum limit of the sum above and write what we should be writing:

Ψ(x) = ∫ Φ(p)·ei(p/ħ)x dp (with the integral running over all p, from –∞ to +∞)

Now that is an interesting formula. [Note that I didn’t care about normalization issues here, so it’s not quite what you’d see in a more rigorous treatment of the matter. I’ll correct that in the Post Scriptum.] Indeed, it shows how we can get the wave function Ψ(x) from some other function Φ(p). We actually encountered that function already, and we referred to it as the wave function in the momentum space. Indeed, Nature does not care much what we measure: whether it’s position (x) or momentum (p), Nature will not share her secrets with us and, hence, the best we can do – according to quantum mechanics – is to find some wave function associating some (complex) probability amplitude with each and every possible (real) value of x or p. What the equation above shows, then, is that these wave functions come as a pair: if we have Φ(p), then we can calculate Ψ(x) – and vice versa. Indeed, the particular relation between Ψ(x) and Φ(p) established above makes Ψ(x) and Φ(p) a so-called Fourier transform pair: we can transform Φ(p) into Ψ(x) using the above Fourier transform (that’s what that integral is called), and vice versa. More generally, a Fourier transform pair can be written as:

f(x) = (1/√2π)·∫ g(y)·eixy dy and g(y) = (1/√2π)·∫ f(x)·e−ixy dx (both integrals running from –∞ to +∞)

Instead of x and p, and Ψ(x) and Φ(p), we have x and y, and f(x) and g(y), in the formulas above, but that does not make much of a difference when it comes to the interpretation: x and p (or x and y in the formulas above) are said to be conjugate variables. What it really means is that they are not independent. There are quite a few such conjugate variables in quantum mechanics, such as, for example: (1) time and energy (and time and frequency, of course, in light of the de Broglie relation between energy and frequency), and (2) angular momentum and angular position (or orientation). There are other pairs too, but these involve quantum-mechanical variables which I do not understand as yet and, hence, I won’t mention them here. [To be complete, I should also say something about that 1/2π factor, but that’s just something that pops up when deriving the Fourier transform from the (discrete) Fourier series on which it is based. We can put it in front of either integral, or split it across both, as I did above with the 1/√2π factors. Also note the minus sign in the exponent of the inverse transform.]

When you look at the equations above, you may think that f(x) and g(y) must be real-valued functions. Well… No. The Fourier transform can be used for both real-valued and complex-valued functions. However, at this point, I’ll have to refer those who want to know each and every detail about these Fourier transforms to a course in complex analysis (such as Brown and Churchill’s Complex Variables and Applications (2004), for instance) or, else, to a proper course on real and complex Fourier transforms (they are used in signal processing – a very popular topic in engineering – and so there are quite a few of those courses around).
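That being said, you don’t need a full course to see the transform at work. Here’s a brute-force numerical sketch, with ħ set to 1 and a Gaussian Φ(p) as an illustrative choice: transform a normalized Gaussian, and a normalized Gaussian comes out.

```python
import numpy as np

# Evaluate Ψ(x) = (1/√2π) ∫ Φ(p) e^(ipx) dp by direct summation (ħ = 1).
p = np.linspace(-20, 20, 4001)
dp = p[1] - p[0]
sigma_p = 2.0
phi = (2*np.pi*sigma_p**2)**-0.25 * np.exp(-p**2 / (4*sigma_p**2))   # Gaussian Φ(p)

x = np.linspace(-2, 2, 401)
dx = x[1] - x[0]
psi = np.array([np.sum(phi * np.exp(1j * p * xi)) for xi in x]) * dp / np.sqrt(2*np.pi)

print(np.sum(np.abs(psi)**2) * dx)   # ≈ 1: Ψ(x) is again a normalized (Gaussian) amplitude
```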

The point to note in this post is that we can derive the Uncertainty Principle from the equations above. Indeed, the (complex-valued) functions Ψ(x) and Φ(p) describe (probability) amplitudes, but the (real-valued) functions |Ψ(x)|2 and |Φ(p)|2 describe probabilities or – to be fully correct – they are probability (density) functions. So it is pretty obvious that, if the functions Ψ(x) and Φ(p) are a Fourier transform pair, then |Ψ(x)|2 and |Φ(p)|2 must be related too. They are. The derivation is a bit lengthy (and, hence, I will not copy it from the Wikipedia article on the Uncertainty Principle), but one can indeed derive the so-called Kennard formulation of the Uncertainty Principle from the above Fourier transforms. The Kennard formulation does not use these rather vague Δx and Δp symbols but clearly states that the product of the standard deviations of these two probability density functions can never be smaller than ħ/2:

σxσp ≥ ħ/2

To be sure: ħ/2 is a rather tiny value, as you should know by now, 🙂 but, so, well… There it is.
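For a Gaussian wave packet, you can actually check that the product hits this lower bound exactly – Gaussians are so-called minimum-uncertainty packets. A numerical sketch (again with ħ = 1, and using the Gaussian pair from the sketch above):

```python
import numpy as np

# For a Gaussian Φ(p) with spread σp, the Fourier-transformed Ψ(x) is a
# Gaussian with spread σx = 1/(2σp), so σx·σp = 1/2 (i.e. ħ/2 with ħ = 1).
sigma_p = 2.0
p = np.linspace(-20, 20, 4001); dp = p[1] - p[0]
prob_p = (2*np.pi*sigma_p**2)**-0.5 * np.exp(-p**2 / (2*sigma_p**2))   # |Φ(p)|²
s_p = np.sqrt(np.sum(p**2 * prob_p) * dp)                              # std of p (mean 0)

sigma_x = 1 / (2*sigma_p)
x = np.linspace(-2, 2, 4001); dx = x[1] - x[0]
prob_x = (2*np.pi*sigma_x**2)**-0.5 * np.exp(-x**2 / (2*sigma_x**2))   # |Ψ(x)|²
s_x = np.sqrt(np.sum(x**2 * prob_x) * dx)

print(s_x * s_p)   # ≈ 0.5 = ħ/2: the bound is reached, not beaten
```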

As said, it’s a bit lengthy but not that difficult to do that derivation. However, just for once, I think I should try to keep my post somewhat shorter than usual so, to conclude, I’ll just insert one more illustration here (yes, you’ve seen that one before), which should now be very easy to understand: if the wave function Ψ(x) is such that there’s relatively little uncertainty about the position x of our electron, then the uncertainty about its momentum will be huge (see the top graphs). Vice versa (see the bottom graphs): precise information on (or a narrow range for) its momentum implies that its position cannot be known.

[Illustration (from Wikipedia): a localized wave function with a large spread in wavelength (top) versus a long wave train with a well-defined wavelength but no well-defined position (bottom)]

Does all this math make it any easier to understand what’s going on? Well… Yes and no, I guess. But then, if even Feynman admits that he himself “does not understand it the way he would like to” (Feynman Lectures, Vol. III, 1-1), who am I? In fact, I should probably not even try to explain it, should I? 🙂

So the best we can do is try to familiarize ourselves with the language used, and so that’s math for all practical purposes. And, then, when everything is said and done, we should probably just contemplate Mario Livio’s question: Is God a mathematician? 🙂

Post scriptum:

I obviously cut corners above, and so you may wonder how that ħ factor can be related to σx and σp if it doesn’t appear in the wave functions. Truth be told, it does. Because of (i) the presence of ħ in the exponent of our ei(p/ħ)x function, (ii) normalization issues (remember that the probabilities |Ψ(x)|2 and |Φ(p)|2 have to add up to 1) and, last but not least, (iii) the 1/2π factor involved in Fourier transforms, Ψ(x) and Φ(p) have to be written as follows:

Ψ(x, t) = (1/√(2πħ))·∫ Φ(p, t)·ei(p/ħ)x dp and Φ(p, t) = (1/√(2πħ))·∫ Ψ(x, t)·e−i(p/ħ)x dx (both integrals running from –∞ to +∞)

Note that we’ve also re-inserted the time variable here, so it’s pretty complete now. One more thing we could do is to replace x with a proper three-dimensional space vector or, better still, introduce four-vectors, which would allow us to also integrate relativistic effects (most notably the slowing of time with motion – as observed from the stationary reference frame) – which become important when, for instance, we’re looking at electrons being accelerated, which is the rule, rather than the exception, in experiments.

Remember (from a previous post) that we calculated that an electron traveling at its usual speed in orbit (2,200 km/s, i.e. less than 1% of the speed of light) had an energy of about 70 eV? Well, the Large Electron-Positron Collider (LEP) did accelerate electrons to speeds close to the speed of light, thereby giving them energy levels topping 104.5 billion eV (or 104.5 GeV, as it’s written), so they could hit each other with collision energies topping 209 GeV (they come from opposite directions, so it’s two times 104.5 GeV). Now, 209 GeV is tiny when converted to everyday energy units: it’s only some 33×10–9 Joule – and note the minus sign in the exponent here: we’re talking billionths of a Joule. Just to put things into perspective: 1 Watt is 1 Joule per second, i.e. roughly the power consumption of a small LED, so you’d need to combine the energy of billions of these fast-traveling electrons to power just one little LED lamp. But, of course, that’s not the right comparison: 104.5 GeV is more than 200,000 times the electron’s rest energy (0.511 MeV), so that means that – in practical terms – their mass (remember that mass is a measure of inertia) increased by the same factor (204,500 times, to be precise). Just to give an idea of the effort that was needed to do this: CERN’s LEP collider was housed in a tunnel with a circumference of 27 km. Was? Yes. The tunnel is still there, but it now houses the Large Hadron Collider (LHC) which, as you surely know, is the world’s largest and most powerful particle accelerator: its experiments confirmed the existence of the Higgs particle in 2013, thereby confirming the so-called Standard Model of particle physics. [But I’ll say a few things about that in my next post.]
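The LEP numbers are easy to check too – a sketch with standard constants:

```python
# Check the LEP arithmetic above.
eV = 1.602e-19            # 1 eV in Joule
E_beam = 104.5e9          # beam energy in eV (104.5 GeV)
m_e_c2 = 0.511e6          # electron rest energy in eV (0.511 MeV)

print(2 * E_beam * eV)    # collision energy: ≈ 3.3e-8 J (33 billionths of a Joule)
print(E_beam / m_e_c2)    # ≈ 204,500: the factor by which the electron's inertia increased
```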

Oh… And, finally, in case you’d wonder where we get the inequality sign in σxσp ≥ ħ/2: that’s because – at some point in the derivation – one has to use the Cauchy-Schwarz inequality, which is closely related to (but not quite the same as) the triangle inequality |z1 + z2| ≤ |z1| + |z2|. In fact, to be fully complete, the derivation uses the more general formulation of the Cauchy-Schwarz inequality, which also applies to functions as we interpret them as vectors in a function space. But I would end up copying the whole derivation here if I add any more to this – and I said I wouldn’t do that. 🙂 […]