The Aharonov-Bohm effect

This title sounds very exciting. It is – or was, I should say – one of these things I thought I would never ever understand, until I started studying physics, that is. 🙂

Having said that, there is – incidentally – nothing very special about the Aharonov-Bohm effect. As Feynman puts it: “The theory was known from the beginning of quantum mechanics in 1926. […] The implication was there all the time, but no one paid attention to it.”

To be fair, he also admits the experiment itself – proving the effect – is “very, very difficult”, which is why the first experiment that claimed to confirm the predicted effect was set up in 1960 only. In fact, some claim the results of that experiment were ambiguous, and that it was only in 1986, with the experiment of Akira Tonomura, that the Aharonov-Bohm effect was unambiguously demonstrated. So what is it about?

In essence, it proves the reality of the vector potential—and of the (related) magnetic field. What do we mean with a real field? To put it simply, a real field cannot act on some particle from a distance through some kind of spooky ‘action-at-a-distance’: real fields must be specified at the position of the particle itself and describe what happens there. Now you’ll immediately wonder: so what’s a non-real field? Well… Some field that does act through some kind of spooky ‘action-at-a-distance.’ As for an example… Well… I can’t give you one because we’ve only been discussing real fields so far. 🙂

So it’s about what a magnetic (or an electric) field does in terms influencing motion and/or quantum-mechanical amplitudes. In fact, we discussed this matter  quite a while ago (check my 2015 post on it). Now, I don’t want to re-write that post, but let me just remind you of the essentials. The two equations for the magnetic field (B) in Maxwell’s set of four equations (the two others specify the electric field E) are: (1) B = 0 and (2) c2×B = j0 + ∂E/ ∂t. Now, you can temporarily forget about the second equation, but you should note that the B = 0 equation is always true (unlike the ×E = 0 expression, which is true for electrostatics only, when there are no moving charges). So it says that the divergence of B is zero, always.

Now, from our posts on vector calculus, you may or may not remember that the divergence of the curl of a vector field is always zero. We wrote: div (curl A) = •(×A) = 0, always. Now, there is another theorem that we can now apply, which says the following: if the divergence of a vector field, say D, is zero – so if D = 0, then D will be the curl of some other vector field C, so we can write: D×C. When we now apply this to our B = 0 equation, we can confidently state the following: 

If B = 0, then there is an A such that B×A

We can also write this as follows:·B = ·(×A) = 0 and, hence, B×A. Now, it’s this vector field A that is referred to as the (magnetic) vector potential, and so that’s what we want to talk about here. As a start, it may be good to write out all of the components of our B×A vector:

formula for B

In that 2015 post, I answered the question as to why we’d need this new vector field in a way that wasn’t very truthful: I just said that, in many situations, it would be more convenient – from a mathematical point of view, that is – to first find A, and then calculate the derivatives above to get B.

Now, Feynman says the following about this argument in his Lecture on the topic: “It is true that in many complex problems it is easier to work with A, but it would be hard to argue that this ease of technique would justify making you learn about one more vector field. […] We have introduced A because it does have an important physical significance: it is a real physical field.” Let us follow his argument here.

Quantum-mechanical interference effects

Let us first remind ourselves of the quintessential electron interference experiment illustrated below. [For a much more modern rendering of this experiment, check out the  Tout Est Quantique video on it. It’s much more amusing than my rather dry exposé here, but it doesn’t give you the math.]

interference

We have electrons, all of (nearly) the same energy, which leave the source – one by one – and travel towards a wall with two narrow slits. Beyond the wall is a backstop with a movable detector which measures the rate, which we call I, at which electrons arrive at a small region of the backstop at the distance x from the axis of symmetry. The rate (or intensityI is proportional to the probability that an individual electron that leaves the source will reach that region of the backstop. This probability has the complicated-looking distribution shown in the illustration, which we understand is due to the interference of two amplitudes, one from each slit. So we associate the two trajectories with two amplitudes, which Feynman writes as A1eiΦ1 and A2eiΦ2 respectively.

As usual, Feynman abstracts away from the time variable here because it is, effectively, not relevant: the interference pattern depends on distances and angles only. Having said that, for a good understanding, we should – perhaps – write our two wavefunctions as A1ei(ωt + Φ1and A2ei(ωt + Φ2respectively. The point is: we’ve got two wavefunctions – one for each trajectory – even if it’s only one electron going through the slit: that’s the mystery of quantum mechanics. 🙂 We need to add these waves so as to get the interference effect:

R = A1ei(ωt + Φ1A2ei(ωt + Φ2= [A1eiΦ1 A2eiΦ2eiωt

Now, we know we need to take the absolute square of this thing to get the intensity – or probability (before normalization). The absolute square of a product, is the product of the absolute squares of the factors, and we also know that the absolute square of any complex number is just the product of the same number with its complex conjugate. Hence, the absolute square of the eiωt factor is equal to |eiωt|2 = eiωteiωt = e= 1. So the time-dependent factor doesn’t matter: that’s why we can always abstract away from it. Let us now take the absolute square of the [A1eiΦ1 A2eiΦ2] factor, which we can write as:

|R|= |A1eiΦ1 A2eiΦ2|= (A1eiΦ1 A2eiΦ2)·(A1eiΦ1 A2eiΦ2)

= A1+ A2+ 2·A1·A2·cos(Φ1−Φ2) = A1+ A2+ 2·A1·A2·cosδ with δ = Φ1−Φ2

OK. This is probably going a bit quick, but you should be able to figure it out, especially when remembering that eiΦ eiΦ = 2·cosΦ and cosΦ = cos(−Φ). The point to note is that the intensity is equal to the sum of the intensities of both waves plus a correction factor, which is equal to 2·A1·A2·cos(Φ1−Φ2) and, hence, ranges from −2·A1·A2 to +2·A1·A2. Now, it takes a bit of geometrical wizardry to be able to write the phase difference δ = Φ1−Φas

δ = 2π·a/λ = 2π·(x/L)·d/λ

—but it can be done. 🙂 Well… […] OK. 🙂 Let me quickly help you here by copying another diagram from Feynman – one he uses to derive the formula for the phase difference on arrival between the signals from two oscillators. A1 and A2 are equal here (A1 = A2 = A) so that makes the situation below somewhat simpler to analyze. However, instead, we have the added complication of a phase difference (α) at the origin – which Feynman refers to as an intrinsic relative phasetriangle

When we apply the geometry shown above to our electron passing through the slits, we should, of course, equate α to zero. For the rest, the picture is pretty similar as the two-slit picture. The distance in the two-slit – i.e. the difference in the path lengths for the two trajectories of our electron(s) – is, obviously, equal to the d·sinθ factor in the oscillator picture. Also, because L is huge as compared to x, we may assume that trajectory 1 and 2 are more or less parallel and, importantly, that the triangles in the picture – small and large – are rectangular. Now, trigonometry tells us that sinθ is equal to the ratio of the opposite side of the triangle and the hypotenuse (i.e. the longest side of the rectangular triangle). The opposite side of the triangle is x and, because is very, very small as compared to L, we may approximate the length of the hypotenuse with L. [I know—a lot of approximations here, but… Well… Just go along with it as for now…] Hence, we can equate sinθ to x/L and, therefore, d·x/L. Now we need to calculate the phase difference. How many wavelengths do we have in a? That’s simple: a/λ, i.e. the total distance divided by the wavelength. Now these wavelengths correspond to 2π·aradians (one cycle corresponds to one wavelength which, in turn, corresponds to 2π radians). So we’re done. We’ve got the formula: δ = Φ1−Φ= 2π·a/λ = 2π·(x/L)·d/λ.

Huh? Yes. Just think about it. I need to move on. The point is: when is equal to zero, the two waves are in phase, and the probability will have a maximum. When δ = π, then the waves are out of phase and interfere destructively (cosπ = −1), so the intensity (and, hence, the probability) reaches a minimum. 

So that’s pretty obvious – or should be pretty obvious if you’ve understood some of the basics we presented in this blog. We now move to the non-standard stuff, i.e. the Aharonov-Bohm effect(s).

Interference in the presence of an electromagnetic field

In essence, the Aharonov-Bohm effect is nothing special: it is just a law – two laws, to be precise – that tells us how the phase of our wavefunction changes because of the presence of a magnetic and/or electric field. As such, it is not very different from previous analyses and presentations, such as those showing how amplitudes are affected by a potential − such as an electric potential, or a gravitational field, or a magnetic field − and how they relate to a classical analysis of the situation (see, for example, my November 2015 post on this topic). If anything, it’s just a more systematic approach to the topic and – importantly – an approach centered around the use of the vector potential A (and the electric potential Φ). Let me give you the formulas:

f1

f2

The first formula tells us that the phase of the amplitude for our electron (or whatever charged particle) to arrive at some location via some trajectory is changed by an amount that is equal to the integral of the vector potential along the trajectory times the charge of the particle over Planck’s constant. I know that’s quite a mouthful but just read it a couple of times.

The second formula tells us that, if there’s an electrostatic field, it will produce a phase change given by the negative of the time integral of the (scalar) potential Φ.

These two expressions – taken together – tell us what happens for any electromagnetic field, static or dynamic. In fact, they are really the (two) law(s) replacing the q(v×B) expression in classical mechanics.

So how does it work? Let me further follow Feynman’s treatment of the matter—which analyzes what happens when we’d have some magnetic field in the two-slit experiment (so we assume there’s no electric field: we only look at some magnetic field). We said Φ1 was the phase of the wave along trajectory 1, and Φ2 was the phase of the wave along trajectory 2. Without magnetic field, that is, so B = 0. Now, the (first) formula above tells us that, when the field is switched on, the new phases will be the following:

f3

f4

Hence, the phase difference δ = Φ1−Φwill now be equal to:

f5

Now, we can combine the two integrals into one that goes forward along trajectory 1 and comes back along trajectory 2. We’ll denote this path as 1-2 and write the new integral as follows:

f6

Note that we’re using a notation here which suggests that the 1-2 path is closed, which is… Well… Yet another approximation of the Master. In fact, his assumption that the new 1-2 path is closed proves to be essential in the argument that follows the one we presented above, in which he shows that the inherent arbitrariness in our choice of a vector potential function doesn’t matter, but… Well… I don’t want to get too technical here.

Let me conclude this post by noting we can re-write our grand formula above in terms of the flux of the magnetic field B:

f7

So… Well… That’s it, really. I’ll refer you to Feynman’s Lecture on this matter for a detailed description of the 1960 experiment itself, which involves a magnetized iron whisker that acts like a tiny solenoid—small enough to match the tiny scale of the interference experiment itself. I must warn you though: there is a rather long discussion in that Lecture on the ‘reality’ of the magnetic and the vector potential field which – unlike Feynman’s usual approach to discussions like this – is rather philosophical and partially misinformed, as it assumes there is zero magnetic field outside of a solenoid. That’s true for infinitely long solenoids, but not true for real-life solenoids: if we have some A, then we must also have some B, and vice versa. Hence, if the magnetic field (B) is a real field (in the sense that it cannot act on some particle from a distance through some kind of spooky ‘action-at-a-distance’), then the vector potential A is an equally real field—and vice versa. Feynman admits as much as he concludes his rather lengthy philosophical excursion with the following conclusion (out of which I already quoted one line in my introduction to this post):

“This subject has an interesting history. The theory we have described was known from the beginning of quantum mechanics in 1926. The fact that the vector potential appears in the wave equation of quantum mechanics (called the Schrödinger equation) was obvious from the day it was written. That it cannot be replaced by the magnetic field in any easy way was observed by one man after the other who tried to do so. This is also clear from our example of electrons moving in a region where there is no field and being affected nevertheless. But because in classical mechanics A did not appear to have any direct importance and, furthermore, because it could be changed by adding a gradient, people repeatedly said that the vector potential had no direct physical significance—that only the magnetic and electric fields are “real” even in quantum mechanics. It seems strange in retrospect that no one thought of discussing this experiment until 1956, when Bohm and Aharonov first suggested it and made the whole question crystal clear. The implication was there all the time, but no one paid attention to it. Thus many people were rather shocked when the matter was brought up. That’s why someone thought it would be worthwhile to do the experiment to see if it was really right, even though quantum mechanics, which had been believed for so many years, gave an unequivocal answer. It is interesting that something like this can be around for thirty years but, because of certain prejudices of what is and is not significant, continues to be ignored.”

Well… That’s it, folks! Enough for today! 🙂

Diffraction and the Uncertainty Principle (I)

In his Lectures, Feynman advances the double-slit experiment with electrons as the textbook example explaining the “mystery” of quantum mechanics. It shows interference–a property of waves–of ‘particles’, electrons: they no longer behave as particles in this experiment. While it obviously illustrates “the basic peculiarities of quantum mechanics” very well, I think the dual behavior of light – as a wave and as a stream of photons – is at least as good as an illustration. And he could also have elaborated the phenomenon of electron diffraction.

Indeed, the phenomenon of diffraction–light, or an electron beam, interfering with itself as it goes through one slit only–is equally fascinating. Frankly, I think it does not get enough attention in textbooks, including Feynman’s, so that’s why I am devoting a rather long post to it here.

To be fair, Feynman does use the phenomenon of diffraction to illustrate the Uncertainty Principle, both in his Lectures as well as in that little marvel, QED: The Strange Theory of Light of Matter–a must-read for anyone who wants to understand the (probability) wave function concept without any reference to complex numbers or what have you. Let’s have a look at it: light going through a slit or circular aperture, illustrated in the left-hand image below, creates a diffraction pattern, which resembles the interference pattern created by an array of oscillators, as shown in the right-hand image.

Diffraction for particle wave Line of oscillators

Let’s start with the right-hand illustration, which illustrates interference, not diffraction. We have eight point sources of electromagnetic radiation here (e.g. radio waves, but it can also be higher-energy light) in an array of length L. λ is the wavelength of the radiation that is being emitted, and α is the so-called intrinsic relative phase–or, to put it simply, the phase difference. We assume α is zero here, so the array produces a maximum in the direction θout = 0, i.e. perpendicular to the array. There are also weaker side lobes. That’s because the distance between the array and the point where we are measuring the intensity of the emitted radiation does result in a phase difference, even if the oscillators themselves have no intrinsic phase difference.

Interference patterns can be complicated. In the set-up below, for example, we have an array of oscillators producing not just one but many maxima. In fact, the array consists of just two sources of radiation, separated by 10 wavelengths.

Interference two dipole radiatorsThe explanation is fairly simple. Once again, the waves emitted by the two point sources will be in phase in the east-west (E-W) direction, and so we get a strong intensity there: four times more, in fact, than what we would get if we’d just have one point source. Indeed, the waves are perfectly in sync and, hence, add up, and the factor four is explained by the fact that the intensity, or the energy of the wave, is proportional to the square of the amplitude: 2= 4. We get the first minimum at a small angle away (the angle from the normal is denoted by ϕ in the illustration), where the arrival times differ by 180°, and so there is destructive interference and the intensity is zero. To be precise, if we draw a line from each oscillator to a distant point and the difference Δ in the two distances is λ/2, half an oscillation, then they will be out of phase. So this first null occurs when that happens. If we move a bit further, to the point where the difference Δ is equal to λ, then the two waves will be a whole cycle out of phase, i.e. 360°, which is the same as being exactly in phase again! And so we get many maxima (and minima) indeed.

But this post should not turn into a lesson on how to construct a radio broadcasting array. The point to note is that diffraction is usually explained using this rather simple theory on interference of waves assuming that the slit itself is an array of point sources, as illustrated below (while the illustrations above were copied from Feynman’s Lectures, the ones below were taken from the Wikipedia article on diffraction). This is referred to as the Huygens-Fresnel Principle, and the math behind is summarized in Kirchhoff’s diffraction formula.

500px-Refraction_on_an_aperture_-_Huygens-Fresnel_principle Huygens_Fresnel_Principle 

Now, that all looks easy enough, but the illustration above triggers an obvious question: what about the spacing between those imaginary point sources? Why do we have six in the illustration above? The relation between the length of the array and the wavelength is obviously important: we get the interference pattern that we get with those two point sources above because the distance between them is 10λ. If that distance would be different, we would get a different interference pattern. But so how does it work exactly? If we’d keep the length of the array the same (L = 10λ) but we would add more point sources, would we get the same pattern?

The easy answer is yes, and Kirchhoff’s formula actually assumes we have an infinite number of point sources between those two slits: every point becomes the source of a spherical wave, and the sum of these secondary waves then yields the interference pattern. The animation below shows the diffraction pattern from a slit with a width equal to five times the wavelength of the incident wave. The diffraction pattern is the same as above: one strong central beam with weaker lobes on the sides.

5wavelength=slitwidthsprectrum

However, the truth is somewhat more complicated. The illustration below shows the interference pattern for an array of length L = 10λ–so that’s like the situation with two point sources above–but with four additional point sources to the two we had already. The intensity in the E–W direction is much higher, as we would expect. Adding six waves in phase yields a field strength that is six times as great and, hence, the intensity (which is proportional to the square of the field) is thirty-six times as great as compared to the intensity of one individual oscillator. Also, when we look at neighboring points, we find a minimum and then some more ‘bumps’, as before, but then, at an angle of 30°, we get a second beam with the same intensity as the central beam. Now, that’s something we do not see in the diffraction patterns above. So what’s going on here?

Six-dipole antenna

Before I answer that question, I’d like to compare with the quantum-mechanical explanation. It turns out that this question in regard to the relevance of the number of point sources also pops up in Feynman’s quantum-mechanical explanation of diffraction.

The quantum-mechanical explanation of diffraction

The illustration below (taken from Feynman’s QED, p. 55-56) presents the quantum-mechanical point of view. It is assumed that light consists of a photons, and these photons can follow any path. Each of these paths is associated with what Feynman simply refers to as an arrow, but so it’s a vector with a magnitude and a direction: in other words, it’s a complex number representing a probability amplitude.

Many arrows Few arrows

In order to get the probability for a photon to travel from the source (S) to a point (P or Q), we have to add up all the ‘arrows’ to arrive at a final ‘arrow’, and then we take its (absolute) square to get the probability. The text under each of the two illustrations above speaks for itself: when we have ‘enough’ arrows (i.e. when we allow for many neighboring paths, as in the illustration on the left), then the arrows for all of the paths from S to P will add up to one big arrow, because there is hardly any difference in time between them, while the arrows associated with the paths to Q will cancel out, because the difference in time between them is fairly large. Hence, the light will not go to Q but travel to P, i.e. in a straight line.

However, when the gap is nearly closed (so we have a slit or a small aperture), then we have only a few neighboring paths, and then the arrows to Q also add up, because there is hardly any difference in time between them too. As I am quoting from Feynman’s QED here, let me quote all of the relevant paragraph: “Of course, both final arrows are small, so there’s not much light either way through such a small hole, but the detector at Q will click almost as much as the one at P ! So when you try to squeeze light too much to make sure it’s going only in a straight line, it refuses to cooperate and begins to spread out. This is an example of the Uncertainty Principle: there is a kind of complementarity between knowledge of where the light goes between the blocks and where it goes afterwards. Precise knowledge of both is impossible.” (Feynman, QED, p. 55-56).

Feynman’s quantum-mechanical explanation is obviously more ‘true’ that the classical explanation, in the sense that it corresponds to what we know is true from all of the 20th century experiments confirming the quantum-mechanical view of reality: photons are weird ‘wavicles’ and, hence, we should indeed analyze diffraction in terms of probability amplitudes, rather than in terms of interference between waves. That being said, Feynman’s presentation is obviously somewhat more difficult to understand and, hence, the classical explanation remains appealing. In addition, Feynman’s explanation triggers a similar question as the one I had on the number of point sources. Not enough arrows !? What do we mean with that? Why can’t we have more of them? What determines their number?

Let’s first look at their direction. Where does that come from? Feynman is a wonderful teacher here. He uses an imaginary stopwatch to determine their direction: the stopwatch starts timing at the source and stops at the destination. But all depends on the speed of the stopwatch hand of course. So how fast does it turn? Feynman is a bit vague about that but notes that “the stopwatch hand turns around faster when it times a blue photon than when it does when timing a red photon.” In other words, the speed of the stopwatch hand depends on the frequency of the light: blue light has a higher frequency (645 THz) and, hence, a shorter wavelength (465 nm) then red light, for which f = 455 THz and λ = 660 nm. Feynman uses this to explain the typical patterns of red, blue, and violet (separated by borders of black), when one shines red and blue light on a film of oil or, more generally,the phenomenon of iridescence in general, as shown below.

Iridescence

As for the size of the arrows, their length is obviously subject to a normalization condition, because all probabilities have to add up to 1. But what about their number? We didn’t answer that question–yet.

The answer, of course, is that the number of arrows and their size are obviously related: we associate a probability amplitude with every way an event can happen, and the (absolute) square of all these probability amplitudes has to add up to 1. Therefore, if we would identify more paths, we would have more arrows, but they would have to be smaller. The end result would be the same though: when the slit is ‘small enough’, the arrows representing the paths to Q would not cancel each other out and, hence, we’d have diffraction.

You’ll say: Hmm… OK. I sort of see the idea, but how do you explain that pattern–the central beam and the smaller side lobes, and perhaps a second beam as well? Well… You’re right to be skeptical. In order to explain the exact pattern, we need to analyze the wave functions, and that requires a mathematical approach rather than the type of intuitive approach which Feynman uses in his little QED booklet. Before we get started on that, however, let me give another example of such intuitive approach.

Diffraction and the Uncertainty Principle

Let’s look at that very first illustration again, which I’ve copied, for your convenience, again below. Feynman uses it (III-2-2) to (a) explain the diffraction pattern which we observe when we send electrons through a slit and (b) to illustrate the Uncertainty Principle. What’s the story?

Well… Before the electrons hit the wall or enter the slit, we have more or less complete information about their momentum, but nothing on their position: we don’t know where they are exactly, and we also don’t know if they are going to hit the wall or go through the slit. So they can be anywhere. However, we do know their energy and momentum. That momentum is horizontal only, as the electron beam is normal to the wall and the slit. Hence, their vertical momentum is zero–before they hit the wall or enter the slit that is. We’ll denote their (horizontal) momentum, i.e. the momentum before they enter the slit, as p0.

Diffraction for particle wave

Now, if an electron happens to go through the slit, and we know because we detected it on the other side, then we know its vertical position (y) at the slit itself with considerable accuracy: that position will be the center of the slit ±B/2. So the uncertainty in position (Δy) is of the order B, so we can write: Δy = B. However, according to the Uncertainty Principle, we cannot have precise knowledge about its position and its momentum. In addition, from the diffraction pattern itself, we know that the electron acquires some vertical momentum. Indeed, some electrons just go straight, but more stray a bit away from the normal. From the interference pattern, we know that the vast majority stays within an angle Δθ, as shown in the plot. Hence, plain trigonometry allows us to write the spread in the vertical momentum py as p0Δθ, with pthe horizontal momentum. So we have Δpy = p0Δθ.

Now, what is Δθ? Well… Feynman refers to the classical analysis of the phenomenon of diffraction (which I’ll reproduce in the next section) and notes, correctly, that the first minimum occurs at an angle such that the waves from one edge of the slit have to travel one wavelength farther than the waves from the other side. The geometric analysis (which, as noted, I’ll reproduce in the next section) shows that that angle is equal to the wavelength divided by the width of the slit, so we have Δθ = λ/B. So now we can write:

Δpy = p0Δθ = p0λ/B

That shows that the uncertainty in regard to the vertical momentum is, indeed, inversely proportional to the uncertainty in regard to its position (Δy), which is the slit width B. But we can go one step further. The de Broglie relation relates wavelength to momentum: λ = h/p. What momentum? Well… Feynman is a bit vague on that: he equates it with the electron’s horizontal momentum, so he writes λ = h/p0. Is this correct? Well… Yes and no. The de Broglie relation associates a wavelength with the total momentum, but then it’s obvious that most of the momentum is still horizontal, so let’s go along with this. What about the wavelength? What wavelength are we talking about here? It’s obviously the wavelength of the complex-valued wave function–the ‘probability wave’ so to say.

OK. So, what’s next? Well… Now we can write that Δpy = p0Δθ = p0λ/B = p0(h/p0)/B. Of course, the pfactor vanishes and, hence, bringing B to the other side and substituting for Δy = B yields the following grand result:

Δy·Δp= h

Wow ! Did Feynman ‘prove’ Heisenberg’s Uncertainty Principle here?

Well… No. Not really. First, the ‘proof’ above actually assumes there’s fundamental uncertainty as to the position and momentum of a particle (so it actually assumes some uncertainty principle from the start), and then it derives it from another fundamental assumption, i.e. the de Broglie relation, which is obviously related to the Uncertainty Principle. Hence, all of the above is only an illustration of the Uncertainty Principle. It’s no proof. As far as I know, one can’t really ‘prove’ the Uncertainty Principle: it’s a fundamental assumption which, if accepted, makes our observations consistent with the theory that is based on it, i.e. quantum or wave mechanics.

Finally, note that everything that I wrote above also takes the diffraction pattern as a given and, hence, while all of the above indeed illustrates the Uncertainty Principle, it’s not an explanation of the phenomenon of diffraction as such. For such explanation, we need a rigorous mathematical analysis, and that’s a classical analysis. Let’s go for it!

Going from six to n oscillators

The mathematics involved in analyzing diffraction and/or interference are actually quite tricky. If you’re alert, then you should have noticed that I used two illustrations that both have six oscillators but that the interference pattern doesn’t match. I’ve reproduced them below. The illustration on the right-hand side has six oscillators and shows a second beam besides the central one–and, of course, there’s such beam also 30° higher, so we have (at least) three beams with the same intensity here–while the animation on the left-hand side shows only one central beam. So what’s going on here?

Six-dipole antenna Huygens_Fresnel_Principle

The answer is that, in the particular example on the left-hand side, the successive dipole radiators (i.e. the point sources) are separated by a distance of two wavelengths (2λ). In that case, it is actually possible to find an angle where the distance δ between successive dipoles is exactly one wavelength (note the little δ in the illustration, as measured from the second point source), so that the effects from all of them are in phase again. So each one is then delayed relative to the next one by 360 degrees, and so they all come back in phase, and then we have another strong beam in that direction! In this case, the other strong beam has an angle of 30 degrees as compared to the E-W line. If we would put in some more oscillators, to ensure that they are all closer than one wavelength apart, then this cannot happen. And so it’s not happening with light. 🙂 But now that we’re here, I’ll just quickly note that it’s an interesting and useful phenomenon used in diffraction gratings, but I’ll refer you to the literature on that, as I shouldn’t be bothering you with all these digressions. So let’s get back at it.

In fact, let me skip the nitty-gritty of the detailed analysis (I’ll refer you to Feynman’s Lectures for that) and just present the grand result for n oscillators, as depicted below:

n oscillatorsThis, indeed, shows the diffraction pattern we are familiar with: one strong maximum separated from successive smaller ones (note that the dotted curve magnifies the actual curve with a factor 10). The vertical axis shows the intensity, but expressed as a fraction of the maximum intensity, which is n2I(Iis the intensity we would observe if there was only 1 oscillator). As for the horizontal axis, the variable there is ϕ really, although we re-scale the variable in order to get 1, 2, 2 etcetera for the first, second, etcetera minimum. This ϕ is the phase difference. It consists of two parts:

  1. The intrinsic relative phase α, i.e. the difference in phase between one oscillator and the next: this is assumed to be zero in all of the examples of diffraction patterns above but so the mathematical analysis here is somewhat more general.
  2. The phase difference which results from the fact that we are observing the array in a given direction θ from the normal. Now that‘s the interesting bit, and it’s not so difficult to show that this additional phase is equal to 2πdsinθ/λ, with d the distance between two oscillators, λ the wavelength of the radiation, and θ the angle from the normal.

In short, we write:

ϕ α 2πdsinθ/λ

Now, because I’ll have to use the variables below in the analysis that follows, I’ll quickly also reproduce the geometry of the set-up (all illustrations here taken from Feynman’s Lectures): 

geometry

Before I proceed, please note that we assume that d is less than λ, so we only have one great maximum, and that’s the so-called zero-order beam centered at θ 0. In order to get subsidiary great maxima (referred to as first-order, second-order, etcetera beams in the context of diffraction gratings), we must have the spacing d of the array greater than one wavelength, but so that’s not relevant for what we’re doing here, and that is to move from a discrete analysis to a continuous one.

Before we do that, let’s look at that curve again and analyze where the first minimum occurs. If we assume that α = 0 (no intrinsic relative phase), then the first minimum occurs when ϕ 2π/n. Using the ϕ α 2πdsinθ/λ formula, we get 2πdsinθ/λ 2π/n or ndsinθ λ. What does that mean? Well, nd is the total length L of the array, so we have ndsinθ Lsinθ Δ = λWhat that means is that we get the first minimum when Δ is equal to one wavelength.

Now why do we get a minimum when Δ λ? Because the contributions of the various oscillators are then uniformly distributed in phase from 0° to 360°. What we’re doing, once again, is adding arrows in order to get a resultant arrow AR, as shown below for n = 6. At the first minimum, the arrows are going around a whole circle: we are adding equal vectors in all directions, and such a sum is zero. So when we have an angle θ such that Δ λ, we get the first minimum. [Note that simple trigonometry rules imply that θ must be equal to λ/L, a fact which we used in that quantum-mechanical analysis of electron diffraction above.]    

Adding waves

What about the second minimum? Well… That occurs when ϕ = 4π/n. Using the ϕ 2πdsinθ/λ formula again, we get 2πdsinθ/λ = 4π/n or ndsinθ = 2λ. So we get ndsinθ Lsinθ Δ = 2λ. So we get the second minimum at an angle θ such that Δ = 2λFor the third minimum, we have ϕ = 6π/n. So we have 2πdsinθ/λ = 6π/n or ndsinθ = 3λ. So we get the third minimum at an angle θ such that Δ = 3λAnd so on and so on.

The point to note is that the diffraction pattern depends only on the wavelength λ and the total length L of the array, which is the width of the slit of course. Hence, we can actually extend the analysis for n going from some fixed value to infinity, and we’ll find that we will only have one great maximum with a lot of side lobes that are much and much smaller, with the minima occurring at angles such that Δ = λ, 2λ, 3λ, etcetera.

OK. What’s next? Well… Nothing. That’s it. I wanted to do a post on diffraction, and so that’s what I did. However, to wrap it all up, I’ll just include two more images from Wikipedia. The one on the left shows the diffraction pattern of a red laser beam made on a plate after passing a small circular hole in another plate. The pattern is quite clear. On the right-hand side, we have the diffraction pattern generated by ordinary white light going through a hole. In fact, it’s a computer-generated image and the gray scale intensities have been adjusted to enhance the brightness of the outer rings, because we would not be able to see them otherwise.

283px-Airy-pattern 600px-Laser_Interference

But… Didn’t I say I would write about diffraction and the Uncertainty Principle? Yes. And I admit I did not write all that much about the Uncertainty Principle above. But so I’ll do that in my next post, in which I intend to look at Heisenberg’s own illustration of the Uncertainty Principle. That example involves a good understanding of the resolving power of a lens or a microscope, and such understanding also involves some good mathematical analysis. However, as this post has become way too long already, I’ll leave that to the next post indeed. I’ll use the left-hand image above for that, so have a good look at it. In fact, let me quickly quote Wikipedia as an introduction to my next post:

The diffraction pattern resulting from a uniformly-illuminated circular aperture has a bright region in the center, known as the Airy disk which together with the series of concentric bright rings around is called the Airy pattern.

We’ll need in order to define the resolving power of a microscope, which is essential to understanding Heisenberg’s illustration of the Principle he advanced himself. But let me stop here, as it’s the topic of my next write-up indeed. This post has become way too long already. 🙂

Photons as strings

In my previous post, I explored, somewhat jokingly, the grey area between classical physics and quantum mechanics: light as a wave versus light as a particle. I did so by trying to picture a photon as an electromagnetic transient traveling through space, as illustrated below. While actual physicists would probably deride my attempt to think of a photon as an electromagnetic transient traveling through space, the idea illustrates the wave-particle duality quite well, I feel.

Photon wave

Understanding light is the key to understanding physics. Light is a wave, as Thomas Young proved to the Royal Society of London in 1803, thereby demolishing Newton’s corpuscular theory. But its constituents, photons, behave like particles. According to modern-day physics, both were right. Just to put things in perspective, the thickness of the note card which Young used to split the light – ordinary sunlight entering his room through a pinhole in a window shutter – was 1/30 of an inch, or approximately 0.85 mm. Hence, in essence, this is a double-slit experiment with the two slits being separated by a distance of almost 1 millimeter. That’s enormous as compared to modern-day engineering tolerance standards: what was thin then, is obviously not considered to be thin now. Scale matters. I’ll come back to this.

Young’s experiment (from www.physicsclassroom.com)

Young experiment

The table below shows that the ‘particle character’ of electromagnetic radiation becomes apparent when its frequency is a few hundred terahertz, like the sodium light example I used in my previous post: sodium light, as emitted by sodium lamps, has a frequency of 500×1012 oscillations per second and, therefore (the relation between frequency and wavelength is very straightforward: their product is the velocity of the wave, so for light we have the simple λf = c equation), a wavelength of 600 nanometer (600×10–9 meter).

Electromagnetic spectrum

However, whether something behaves like a particle or a wave also depends on our measurement scale: 0.85 mm was thin in Young’s time, and so it was a delicate experiment then but now, it’s a standard classroom experiment indeed. The theory of light as a wave would hold until more delicate equipment refuted it. Such equipment came with another sense of scale. It’s good to remind oneself that Einstein’s “discovery of the law of the photoelectric effect”, which explained the photoelectric effect as the result of light energy being carried in discrete quantized packets of energy, now referred to as photons, goes back to 1905 only, and that the experimental apparatus which could measure it was not much older. So waves behave like particles if we look at them close enough. Conversely, particles behave like waves if we look at them close enough. So there is this zone where they are neither, the zone for which we invoke the mathematical formalism of quantum mechanics or, to put it more precisely, the formalism of quantum electrodynamics: that “strange theory of light and Matter”, as Feynman calls it.

Let’s have a look at how particles became waves. It should not surprise us that the experimental apparatuses needed to confirm that electrons–or matter in general–can actually behave like a wave is more recent than the 19th century apparatuses which led Einstein to develop his ‘corpuscular’ theory of light (i.e. the theory of light as photons). The engineering tolerances involved are daunting. Let me be precise here. To be sure, the phenomenon of electron diffraction (i.e. electrons going through one slit and producing a diffraction pattern on the other side) had been confirmed experimentally already in 1925, in the famous Davisson-Germer experiment. I am saying because it’s rather famous indeed. First, because electron diffraction was a weird thing to contemplate at the time. Second, because it confirmed the de Broglie hypothesis only two years after Louis de Broglie had advanced it. And, third, because Davisson and Germer had never intended to set it up to detect diffraction: it was pure coincidence. In fact, the observed diffraction pattern was the result of a laboratory accident, and Davisson and Germer weren’t aware of other, conscious, attempts of trying to prove the de Broglie hypothesis. 🙂 […] OK. I am digressing. Sorry. Back to the lesson.

The nanotechnology that was needed to confirm Feynman’s 1965 thought experiment on electron interference (i.e. electrons going through two slits and interfering with each other (rather than producing some diffraction pattern as they go through one slit only) – and, equally significant as an experiment result, with themselves as they go through the slit(s) one by one! – was only developed over the past decades. In fact, it was only in 2008 (and again in 2012) that the experiment was carried out exactly the way Feynman describes it in his Lectures.

It is useful to think of what such experiments entail from a technical point of view. Have a look at the illustration below, which shows the set-up. The insert in the upper-left corner shows the two slits which were used in the 2012 experiment: they are each 62 nanometer wide – that’s 50×10–9 m! – and the distance between them is 272 nanometer, or 0.272 micrometer. [Just to be complete: they are 4 micrometer tall (4×10–6 m), and the thing in the middle of the slits is just a little support (150 nm) to make sure the slit width doesn’t vary.]

The second inset (in the upper-right corner) shows the mask that can be moved to close one or both slits partially or completely. The mask is 4.5µm wide ×20µm tall. Please do take a few seconds to contemplate the technology behind this feat: a nanometer is a millionth of a millimeter, so that’s a billionth of a meter, and a micrometer is a millionth of a meter. To imagine how small a nanometer is, you should imagine dividing one millimeter in ten, and then one of these tenths in ten again, and again, and once again, again, and again. In fact, you actually cannot imagine that because we live in the world we live in and, hence, our mind is used only to addition (and subtraction) when it comes to comparing sizes and – to a much more limited extent – with multiplication (and division): our brain is, quite simply, not wired to deal with exponentials and, hence, it can’t really ‘imagine’ these incredible (negative) powers. So don’t think you can imagine it really, because one can’t: in our mind, these scales exist only as mathematical constructs. They don’t correspond to anything we can actually make a mental picture of.

Electron double-slit set-up

The electron beam consisted of electrons with an (average) energy of 600 eV. That’s not an awful lot: 8.5 times more than the energy of an electron in orbit in a atom, whose energy would be some 70 eV, so the acceleration before they went through the slits was relatively modest. I’ve calculated the corresponding de Broglie wavelength of these electrons in another post (Re-Visiting the Matter-Wave, April 2014), using the de Broglie equations: f = E/h or λ = p/h. And, of course, you could just google the article on the experiment and read about it, but it’s a good exercise, and actually quite simple: just note that you’ll need to express the energy in joule (not in eV) to get it right. Also note that you need to include the rest mass of the electron in the energy. I’ll let you try it (or else just go to that post of mine). You should find a de Broglie wavelength of 50 picometer for these electrons, so that’s 50×10–12 m. While that wavelength is less than a thousandth of the slit width (62 nm), and about 5,500 times smaller than the space between the two slits (272 nm), the interference effect was unambiguous in the experiment. I advice you to google the results yourself (or read that April 2014 post of mine if you want a summary): the experiment was done at the University of Nebraska-Lincoln in 2012.

Electrons and X-rays

To put everything in perspective: 50 picometer is like the wavelength of X-rays, and you can google similar double-slit experiments for X-rays: they also loose their ‘particle behavior’ when we look at them at this tiny scale. In short, scale matters, and the boundary between ‘classical physics’ (electromagnetics) and quantum physics (wave mechanics) is not clear-cut. If anything, it depends on our perspective, i.e. what we can measure, and we seem to be shifting that boundary constantly. In what direction?

Downwards obviously: we’re devising instruments that measure stuff at smaller and smaller scales, and what’s happening is that we can ‘see’ typical ‘particles’, including hard radiation such as gamma rays, as local wave trains. Indeed, the next step is clear-cut evidence for interference between gamma rays.

Energy levels of photons

We would not associate low-frequency electromagnetic waves, such as radio or radar waves, with photons. But light in the visible spectrum, yes. Obviously. […]

Isn’t that an odd dichotomy? If we see that, on a smaller scale, particles start to look like waves, why would the reverse not be true? Why wouldn’t we analyze radio or radar waves, on a much larger scale, as a stream of very (I must say extremely) low-energy photons? I know the idea sounds ridiculous, because the energies involved would be ridiculously low indeed. Think about it. The energy of a photon is given by the Planck relation: E = h= hc/λ. For visible light, with wavelengths ranging from 800 nm (red) to 400 nm (violet or indigo), the photon energies range between 1.5 and 3 eV. Now, the shortest wavelengths for radar waves are in the so-called millimeter band, i.e. they range from 1 mm to 1 cm. A wavelength of 1 mm corresponds to a photon energy of 0.00124 eV. That’s close to nothing, of course, and surely not the kind of energy levels that we can currently detect.

But you get the idea: there is a grey area between classical physics and quantum mechanics, and it’s our equipment–notably the scale of our measurements–that determine where that grey area begins, and where it ends, and it seems to become larger and larger as the sensitivity of our equipment improves.

What do I want to get at? Nothing much. Just some awareness of scale, as an introduction to the actual topic of this post, and that’s some thoughts on a rather primitive string theory of photons. What !? 

Yes. Purely speculative, of course. 🙂

Photons as strings

I think my calculations in the previous post, as primitive as they were, actually provide quite some food for thought. If we’d treat a photon in the sodium light band (i.e. the light emitted by sodium, from a sodium lamp for instance) just like any other electromagnetic pulse, we would find it’s a pulse of some 10 meter long. We also made sense of this incredibly long distance by noting that, if we’d look at it as a particle (which is what we do when analyzing it as a photon), it should have zero size, because it moves at the speed of light and, hence, the relativistic length contraction effect ensures we (or any observer in whatever reference frame really, because light always moves at the speed of light, regardless of the reference frame) will see it as a zero-size particle.

Having said that, and knowing damn well that we have treat the photon as an elementary particle, I would think it’s very tempting to think of it as a vibrating string.

Huh?

Yes. Let me copy that graph again. The assumption I started with is a standard one in physics, and not something that you’d want to argue with: photons are emitted when an electron jumps from a higher to a lower energy level and, for all practical purposes, this emission can be analyzed as the emission of an electromagnetic pulse by an atomic oscillator. I’ll refer you to my previous post – as silly as it is – for details on these basics: the atomic oscillator has a Q, and so there’s damping involved and, hence, the assumption that the electromagnetic pulse resembles a transient should not sound ridiculous. Because the electric field as a function in space is the ‘reversed’ image of the oscillation in time, the suggested shape has nothing blasphemous.

Photon wave

Just go along with it for a while. First, we need to remind ourselves that what’s vibrating here is nothing physical: it’s an oscillating electromagnetic field. That being said, in my previous post, I toyed with the idea that the oscillation could actually also represent the photon’s wave function, provided we use a unit for the electric field that ensures that the area under the squared curve adds up to one, so as to normalize the probability amplitudes. Hence, I suggested that the field strength over the length of this string could actually represent the probability amplitudes, provided we choose an appropriate unit to measure the electric field.

But then I was joking, right? Well… No. Why not consider it? An electromagnetic oscillation packs energy, and the energy is proportional to the square of the amplitude of the oscillation. Now, the probability of detecting a particle is related to its energy, and such probability is calculated from taking the (absolute) square of probability amplitudes. Hence, mathematically, this makes perfect sense.

It’s quite interesting to think through the consequences, and I hope I will (a) understand enough of physics and (b) find enough time for this—one day! One interesting thing is that the field strength (i.e. the magnitude of the electric field vector) is a real number. Hence, if we equate these magnitudes with probability amplitudes, we’d have real probability amplitudes, instead of complex-valued ones. That’s not a very fundamental issue. It probably indicates we should also take into account the fact that the E vector also oscillates in the other direction that’s normal to the direction of propagation, i.e. the y-coordinate (assuming that the z-axis is the direction of propagation). To put it differently, we should take the polarization of the light into account. The figure below–which I took from Wikipedia again (by far the most convenient place to shop for images and animations: what would I do without it?– shows how the electric field vector moves in the xy-plane indeed, as the wave travels along the z-axis. So… Well… I still have to figure it all out, but the idea surely makes sense.

Circular.Polarization.Circularly.Polarized.Light_Right.Handed.Animation.305x190.255Colors

Another interesting thing to think about is how the collapse of the wave function would come about. If we think of a photon as a string, it must have some ‘hooks’ which could cause it to ‘stick’ or ‘collapse’ into a ‘lump’ as it hits a detector. What kind of hook? What force would come into play?

Well… The interaction between the photon and the photodetector is electromagnetic, but we’re looking for some other kind of ‘hook’ here. What could it be? I have no idea. Having said that, we know that the weakest of all fundamental forces—gravity—becomes much stronger—very much stronger—as the distance becomes smaller and smaller. In fact, it is said that, if we go to the Planck scale, the strength of the force of gravity becomes quite comparable with the other forces. So… Perhaps it’s, quite simply, the equivalent mass of the energy involved that gets ‘hooked’, somehow, as it starts interacting with the photon detector. Hence, when thinking about a photon as an oscillating string of energy, we should also think of that string as having some inseparable (equivalent) mass that, once it’s ‘hooked’, has no other option that to ‘collapse into itself’. [You may note there’s no quantum theory for gravity as yet. I am not sure how, but I’ve got a gut instinct that tells me that may help to explain why a photon consists of one single ‘unbreakable’ lump, although I need to elaborate this argument obviously.]

You must be laughing aloud now. A new string theory–really?

I know… I know… I haven’t reach sophomore level and I am already wildly speculating… Well… Yes. What I am talking about here has probably nothing to do with current string theories, although my proposed string would also replace the point-like photon by a one-dimensional ‘string’. However, ‘my’ string is, quite simply, an electromagnetic pulse (a transient actually, for reasons I explained in my previous post). Naive? Perhaps. However, I note that the earliest version of string theory is referred to as bosonic string theory, because it only incorporated bosons, which is what photons are.

So what? Well… Nothing… I am sure others have thought of this too, and I’ll look into it. It’s surely an idea which I’ll keep in the back of my head as I continue to explore physics. The idea is just too simple and beautiful to disregard, even if I am sure it must be pretty naive indeed. Photons as ten-meter long strings? Let’s just forget about it. 🙂 Onwards !!! 🙂

Post Scriptum: The key to ‘closing’ this discussion is, obviously, to be found in a full-blown analysis of the relativity of fields. So, yes, I have not done all of the required ‘homework’ on this and the previous post. I apologize for that. If anything, I hope it helped you to also try to think somewhat beyond the obvious. I realize I wasted a lot of time trying to understand the pre-cooked ready-made stuff that’s ‘on the market’, so to say. I still am, actually. Perhaps I should first thoroughly digest Feynman’s Lectures. In fact, I think that’s what I’ll try to do in the next year or so. Sorry for any inconvenience caused. 🙂

Re-visiting the matter wave (II)

My previous post was, once again, littered with formulas – even if I had not intended it to be that way: I want to convey some kind of understanding of what an electron – or any particle at the atomic scale – actually is – with the minimum number of formulas necessary.

We know particles display wave behavior: when an electron beam encounters an obstacle or a slit that is somewhat comparable in size to its wavelength, we’ll observe diffraction, or interference. [I have to insert a quick note on terminology here: the terms diffraction and interference are often used interchangeably, but there is a tendency to use interference when we have more than one wave source and diffraction when there is only one wave source. However, I’ll immediately add that distinction is somewhat artificial. Do we have one or two wave sources in a double-slit experiment? There is one beam but the two slits break it up in two and, hence, we would call it interference. If it’s only one slit, there is also an interference pattern, but the phenomenon will be referred to as diffraction.]

We also know that the wavelength we are talking about it here is not the wavelength of some electromagnetic wave, like light. It’s the wavelength of a de Broglie wave, i.e. a matter wave: such wave is represented by an (oscillating) complex number – so we need to keep track of a real and an imaginary part – representing a so-called probability amplitude Ψ(x, t) whose modulus squared (│Ψ(x, t)│2) is the probability of actually detecting the electron at point x at time t. [The purists will say that complex numbers can’t oscillate – but I am sure you get the idea.]

You should read the phrase above twice: we cannot know where the electron actually is. We can only calculate probabilities (and, of course, compare them with the probabilities we get from experiments). Hence, when the wave function tells us the probability is greatest at point x at time t, then we may be lucky when we actually probe point x at time t and find it there, but it may also not be there. In fact, the probability of finding it exactly at some point x at some definite time t is zero. That’s just a characteristic of such probability density functions: we need to probe some region Δx in some time interval Δt.

If you think that is not very satisfactory, there’s actually a very common-sense explanation that has nothing to do with quantum mechanics: our scientific instruments do not allow us to go beyond a certain scale anyway. Indeed, the resolution of the best electron microscopes, for example, is some 50 picometer (1 pm = 1×10–12 m): that’s small (and resolutions get higher by the year), but so it implies that we are not looking at points – as defined in math that is: so that’s something with zero dimension – but at pixels of size Δx = 50×10–12 m.

The same goes for time. Time is measured by atomic clocks nowadays but even these clocks do ‘tick’, and these ‘ticks’ are discrete. Atomic clocks take advantage of the property of atoms to resonate at extremely consistent frequencies. I’ll say something more about resonance soon – because it’s very relevant for what I am writing about in this post – but, for the moment, just note that, for example, Caesium-133 (which was used to build the first atomic clock) oscillates at 9,192,631,770 cycles per second. In fact, the International Bureau of Standards and Weights re-defined the (time) second in 1967 to correspond to “the duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the Caesium-133 atom at rest at a temperature of 0 K.”

Don’t worry about it: the point to note is that when it comes to measuring time, we also have an uncertainty. Now, when using this Caesium-133 atomic clock, this uncertainty would be in the range of ±9.2×10–9 seconds (so that’s nanoseconds: 1 ns = 1×10–9 s), because that’s the rate at which this clock ‘ticks’. However, there are other (much more plausible) ways of measuring time: some of the unstable baryons have lifetimes in the range of a few picoseconds only (1 ps = 1×10–12 s) and the really unstable ones – known as baryon resonances – have lifetimes in the 1×10–22 to 1×10–24 s range. This we can only measure because they leave some trace after these particle collisions in particle accelerators and, because we have some idea about their speed, we can calculate their lifetime from the (limited) distance they travel before disintegrating. The thing to remember is that for time also, we have to make do with time pixels  instead of time points, so there is a Δt as well. [In case you wonder what baryons are: they are particles consisting of three quarks, and the proton and the neutron are the most prominent (and most stable) representatives of this family of particles.]  

So what’s the size of an electron? Well… It depends. We need to distinguish two very different things: (1) the size of the area where we are likely to find the electron, and (2) the size of the electron itself. Let’s start with the latter, because that’s the easiest question to answer: there is a so-called classical electron radius re, which is also known as the Thompson scattering length, which has been calculated as:

r_\mathrm{e} = \frac{1}{4\pi\varepsilon_0}\frac{e^2}{m_{\mathrm{e}} c^2} = 2.817 940 3267(27) \times 10^{-15} \mathrm{m}As for the constants in this formula, you know these by now: the speed of light c, the electron charge e, its mass me, and the permittivity of free space εe. For whatever it’s worth (because you should note that, in quantum mechanics, electrons do not have a size: they are treated as point-like particles, so they have a point charge and zero dimension), that’s small. It’s in the femtometer range (1 fm = 1×10–15 m). You may or may not remember that the size of a proton is in the femtometer range as well – 1.7 fm to be precise – and we had a femtometer size estimate for quarks as well: 0.7 m. So we have the rather remarkable result that the much heavier proton (its rest mass is 938 MeV/csas opposed to only 0.511 MeV MeV/c2, so the proton is 1835 times heavier) is 1.65 times smaller than the electron. That’s something to be explored later: for the moment, we’ll just assume the electron wiggles around a bit more – exactly because it’s lighterHere you just have to note that this ‘classical’ electron radius does measure something: it’s something ‘hard’ and ‘real’ because it scatters, absorbs or deflects photons (and/or other particles). In one of my previous posts, I explained how particle accelerators probe things at the femtometer scale, so I’ll refer you to that post (End of the Road to Reality?) and move on to the next question.

The question concerning the area where we are likely to detect the electron is more interesting in light of the topic of this post (the nature of these matter waves). It is given by that wave function and, from my previous post, you’ll remember that we’re talking the nanometer scale here (1 nm = 1×10–9 m), so that’s a million times larger than the femtometer scale. Indeed, we’ve calculated a de Broglie wavelength of 0.33 nanometer for relatively slow-moving electrons (electrons in orbit), and the slits used in single- or double-slit experiments with electrons are also nanotechnology. In fact, now that we are here, it’s probably good to look at those experiments in detail.

The illustration below relates the actual experimental set-up of a double-slit experiment performed in 2012 to Feynman’s 1965 thought experiment. Indeed, in 1965, the nanotechnology you need for this kind of experiment was not yet available, although the phenomenon of electron diffraction had been confirmed experimentally already in 1925 in the famous Davisson-Germer experiment. [It’s famous not only because electron diffraction was a weird thing to contemplate at the time but also because it confirmed the de Broglie hypothesis only two years after Louis de Broglie had advanced it!]. But so here is the experiment which Feynman thought would never be possible because of technology constraints:

Electron double-slit set-upThe insert in the upper-left corner shows the two slits: they are each 50 nanometer wide (50×10–9 m) and 4 micrometer tall (4×10–6 m). [The thing in the middle of the slits is just a little support. Please do take a few seconds to contemplate the technology behind this feat: 50 nm is 50 millionths of a millimeter. Try to imagine dividing one millimeter in ten, and then one of these tenths in ten again, and again, and once again, again, and again. You just can’t imagine that, because our mind is used to addition/subtraction and – to some extent – with multiplication/division: our mind can’t deal with with exponentiation really – because it’s not a everyday phenomenon.] The second inset (in the upper-right corner) shows the mask that can be moved to close one or both slits partially or completely.

Now, 50 nanometer is 150 times larger than the 0.33 nanometer range we got for ‘our’ electron, but it’s small enough to show diffraction and/or interference. [In fact, in this experiment (done by Bach, Pope, Liou and Batelaan from the University of Nebraska-Lincoln less than two years ago indeed), the beam consisted of electrons with an (average) energy of 600 eV and a de Broglie wavelength of 50 picometer. So that’s like the electrons used in electron microscopes. 50 pm is 6.6 times smaller than the 0.33 nm wavelength we calculated for our low-energy (70 eV) electron – but then the energy and the fact these electrons are guided in electromagnetic fields explain the difference. Let’s go to the results.

The illustration below shows the predicted pattern next to the observed pattern for the two scenarios:

  1. We first close slit 2, let a lot of electrons go through it, and so we get a pattern described by the probability density function P1 = │Φ12. Here we see no interference but a typical diffraction pattern: the intensity follows a more or less normal (i.e. Gaussian) distribution. We then close slit 1 (and open slit 2 again), again let a lot of electrons through, and get a pattern described by the probability density function P2 = │Φ22. So that’s how we get P1 and P2.
  2. We then open both slits, let a whole electrons through, and get according to the pattern described by probability density function P12 = │Φ122, which we get not from adding the probabilities P1 and P2 (hence, P12 ≠  P1 + P2) – as one would expect if electrons would behave like particles – but from adding the probability amplitudes. We have interference, rather than diffraction.

Predicted interference effectBut so what exactly is interfering? Well… The electrons. But that can’t be, can it?

The electrons are obviously particles, as evidenced from the impact they make – one by one – as they hit the screen as shown below. [If you want to know what screen, let me quote the researchers: “The resulting patterns were magnified by an electrostatic quadrupole lens and imaged on a two-dimensional microchannel plate and phosphorus screen, then recorded with a charge-coupled device camera. […] To study the build-up of the diffraction pattern, each electron was localized using a “blob” detection scheme: each detection was replaced by a blob, whose size represents the error in the localization of the detection scheme. The blobs were compiled together to form the electron diffraction patterns.” So there you go.]

Electron blobs

Look carefully at how this interference pattern becomes ‘reality’ as the electrons hit the screen one by one. And then say it: WAW ! 

Indeed, as predicted by Feynman (and any other physics professor at the time), even if the electrons go through the slits one by one, they will interfere – with themselves so to speak. [In case you wonder if these electrons really went through one by one, let me quote the researchers once again: “The electron source’s intensity was reduced so that the electron detection rate in the pattern was about 1 Hz. At this rate and kinetic energy, the average distance between consecutive electrons was 2.3 × 106 meters. This ensures that only one electron is present in the 1 meter long system at any one time, thus eliminating electron-electron interactions.” You don’t need to be a scientist or engineer to understand that, isn’t it?]

While this is very spooky, I have not seen any better way to describe the reality of the de Broglie wave: the particle is not some point-like thing but a matter wave, as evidenced from the fact that it does interfere with itself when forced to move through two slits – or through one slit, as evidenced by the diffraction patterns built up in this experiment when closing one of the two slits: the electrons went through one by one as well!

But so how does it relate to the characteristics of that wave packet which I described in my previous post? Let me sum up the salient conclusions from that discussion:

  1. The wavelength λ of a wave packet is calculated directly from the momentum by using de Broglie‘s second relation: λ = h/p. In this case, the wavelength of the electrons averaged 50 picometer. That’s relatively small as compared to the width of the slit (50 nm) – a thousand times smaller actually! – but, as evidenced by the experiment, it’s small enough to show the ‘reality’ of the de Broglie wave.
  2. From a math point (but, of course, Nature does not care about our math), we can decompose the wave packet in a finite or infinite number of component waves. Such decomposition is referred to, in the first case (finite number of composite waves or discrete calculus) as a Fourier analysis, or, in the second case, as a Fourier transform. A Fourier transform maps our (continuous) wave function, Ψ(x), to a (continuous) wave function in the momentum space, which we noted as φ(p). [In fact, we noted it as Φ(p) but I don’t want to create confusion with the Φ symbol used in the experiment, which is actually the wave function in space, so Ψ(x) is Φ(x) in the experiment – if you know what I mean.] The point to note is that uncertainty about momentum is related to uncertainty about position. In this case, we’ll have pretty standard electrons (so not much variation in momentum), and so the location of the wave packet in space should be fairly precise as well.
  3. The group velocity of the wave packet (vg) – i.e. the envelope in which our Ψ wave oscillates – equals the speed of our electron (v), but the phase velocity (i.e. the speed of our Ψ wave itself) is superluminal: we showed it’s equal to (vp) = E/p =   c2/v = c/β, with β = v/c, so that’s the ratio of the speed of our electron and the speed of light. Hence, the phase velocity will always be superluminal but will approach c as the speed of our particle approaches c. For slow-moving particles, we get astonishing values for the phase velocity, like more than a hundred times the speed of light for the electron we looked at in our previous post. That’s weird but it does not contradict relativity: if it helps, one can think of the wave packet as a modulation of an incredibly fast-moving ‘carrier wave’. 

Is any of this relevant? Does it help you to imagine what the electron actually is? Or what that matter wave actually is? Probably not. You will still wonder: How does it look like? What is it in reality?

That’s hard to say. If the experiment above does not convey any ‘reality’ according to you, then perhaps the illustration below will help. It’s one I have used in another post too (An Easy Piece: Introducing Quantum Mechanics and the Wave Function). I took it from Wikipedia, and it represents “the (likely) space in which a single electron on the 5d atomic orbital of an atom would be found.” The solid body shows the places where the electron’s probability density (so that’s the squared modulus of the probability amplitude) is above a certain value – so it’s basically the area where the likelihood of finding the electron is higher than elsewhere. The hue on the colored surface shows the complex phase of the wave function.

Hydrogen_eigenstate_n5_l2_m1

So… Does this help? 

You will wonder why the shape is so complicated (but it’s beautiful, isn’t it?) but that has to do with quantum-mechanical calculations involving quantum-mechanical quantities such as spin and other machinery which I don’t master (yet). I think there’s always a bit of a gap between ‘first principles’ in physics and the ‘model’ of a real-life situation (like a real-life electron in this case), but it’s surely the case in quantum mechanics! That being said, when looking at the illustration above, you should be aware of the fact that you are actually looking at a 3D representation of the wave function of an electron in orbit. 

Indeed, wave functions of electrons in orbit are somewhat less random than – let’s say – the wave function of one of those baryon resonances I mentioned above. As mentioned in my Not So Easy Piece, in which I introduced the Schrödinger equation (i.e. one of my previous posts), they are solutions of a second-order partial differential equation – known as the Schrödinger wave equation indeed – which basically incorporates one key condition: these solutions – which are (atomic or molecular) ‘orbitals’ indeed – have to correspond to so-called stationary states or standing waves. Now what’s the ‘reality’ of that? 

The illustration below comes from Wikipedia once again (Wikipedia is an incredible resource for autodidacts like me indeed) and so you can check the article (on stationary states) for more details if needed. Let me just summarize the basics:

  1. A stationary state is called stationary because the system remains in the same ‘state’ independent of time. That does not mean the wave function is stationary. On the contrary, the wave function changes as function of both time and space – Ψ = Ψ(x, t) remember? – but it represents a so-called standing wave.
  2. Each of these possible states corresponds to an energy state, which is given through the de Broglie relation: E = hf. So the energy of the state is proportional to the oscillation frequency of the (standing) wave, and Planck’s constant is the factor of proportionality. From a formal point of view, that’s actually the one and only condition we impose on the ‘system’, and so it immediately yields the so-called time-independent Schrödinger equation, which I briefly explained in the above-mentioned Not So Easy Piece (but I will not write it down here because it would only confuse you even more). Just look at these so-called harmonic oscillators below:

QuantumHarmonicOscillatorAnimation

A and B represent a harmonic oscillator in classical mechanics: a ball with some mass m (mass is a measure for inertia, remember?) on a spring oscillating back and forth. In case you’d wonder what the difference is between the two: both the amplitude as well as the frequency of the movement are different. 🙂 A spring and a ball?

It represents a simple system. A harmonic oscillation is basically a resonance phenomenon: springs, electric circuits,… anything that swings, moves or oscillates (including large-scale things such as bridges and what have you – in his 1965 Lectures (Vol. I-23), Feynman even discusses resonance phenomena in the atmosphere in his Lectures) has some natural frequency ω0, also referred to as the resonance frequency, at which it oscillates naturally indeed: that means it requires (relatively) little energy to keep it going. How much energy it takes exactly to keep them going depends on the frictional forces involved: because the springs in A and B keep going, there’s obviously no friction involved at all. [In physics, we say there is no damping.] However, both springs do have a different k (that’s the key characteristic of a spring in Hooke’s Law, which describes how springs work), and the mass m of the ball might be different as well. Now, one can show that the period of this ‘natural’ movement will be equal to t0 = 2π/ω= 2π(m/k)1/2 or that ω= (m/k)–1/2. So we’ve got a A and a B situation which differ in k and m. Let’s go to the so-called quantum oscillator, illustrations C to H.

C to H in the illustration are six possible solutions to the Schrödinger Equation for this situation. The horizontal axis is position (and so time is the variable) – but we could switch the two independent variables easily: as I said a number of times already, time and space are interchangeable in the argument representing the phase (θ) of a wave provided we use the right units (e.g. light-seconds for distance and seconds for time): θ = ωt – kx. Apart from the nice animation, the other great thing about these illustrations – and the main difference with resonance frequencies in the classical world – is that they show both the real part (blue) as well as the imaginary part (red) of the wave function as a function of space (fixed in the x axis) and time (the animation).

Is this ‘real’ enough? If it isn’t, I know of no way to make it any more ‘real’. Indeed, that’s key to understanding the nature of matter waves: we have to come to terms with the idea that these strange fluctuating mathematical quantities actually represent something. What? Well… The spooky thing that leads to the above-mentioned experimental results: electron diffraction and interference. 

Let’s explore this quantum oscillator some more. Another key difference between natural frequencies in atomic physics (so the atomic scale) and resonance phenomena in ‘the big world’ is that there is more than one possibility: each of the six possible states above corresponds to a solution and an energy state indeed, which is given through the de Broglie relation: E = hf. However, in order to be fully complete, I have to mention that, while G and H are also solutions to the wave equation, they are actually not stationary states. The illustration below – which I took from the same Wikipedia article on stationary states – shows why. For stationary states, all observable properties of the state (such as the probability that the particle is at location x) are constant. For non-stationary states, the probabilities themselves fluctuate as a function of time (and space of obviously), so the observable properties of the system are not constant. These solutions are solutions to the time-dependent Schrödinger equation and, hence, they are, obviously, time-dependent solutions.

StationaryStatesAnimationWe can find these time-dependent solutions by superimposing two stationary states, so we have a new wave function ΨN which is the sum of two others:  ΨN = Ψ1  + Ψ2. [If you include the normalization factor (as you should to make sure all probabilities add up to 1), it’s actually ΨN = (2–1/2)(Ψ1  + Ψ2).] So G and H above still represent a state of a quantum harmonic oscillator (with a specific energy level proportional to h), but so they are not standing waves.

Let’s go back to our electron traveling in a more or less straight path. What’s the shape of the solution for that one? It could be anything. Well… Almost anything. As said, the only condition we can impose is that the envelope of the wave packet – its ‘general’ shape so to say – should not change. That because we should not have dispersion – as illustrated below. [Note that this illustration only represent the real or the imaginary part – not both – but you get the idea.]

dispersion

That being said, if we exclude dispersion (because a real-life electron traveling in a straight line doesn’t just disappear – as do dispersive wave packets), then, inside of that envelope, the weirdest things are possible – in theory that is. Indeed, Nature does not care much about our Fourier transforms. So the example below, which shows a theoretical wave packet (again, the real or imaginary part only) based on some theoretical distribution of the wave numbers of the (infinite number) of component waves that make up the wave packet, may or may not represent our real-life electron. However, if our electron has any resemblance to real-life, then I would expect it to not be as well-behaved as the theoretical one that’s shown below.

example of wave packet

The shape above is usually referred to as a Gaussian wave packet, because of the nice normal (Gaussian) probability density functions that are associated with it. But we can also imagine a ‘square’ wave packet: a somewhat weird shape but – in terms of the math involved – as consistent as the smooth Gaussian wave packet, in the sense that we can demonstrate that the wave packet is made up of an infinite number of waves with an angular frequency ω that is linearly related to their wave number k, so the dispersion relation is ω = ak + b. [Remember we need to impose that condition to ensure that our wave packet will not dissipate (or disperse or disappear – whatever term you prefer.] That’s shown below: a Fourier analysis of a square wave.

Square wave packet

While we can construct many theoretical shapes of wave packets that respect the ‘no dispersion!’ condition, we cannot know which one will actually represent that electron we’re trying to visualize. Worse, if push comes to shove, we don’t know if these matter waves (so these wave packets) actually consist of component waves (or time-independent stationary states or whatever).

[…] OK. Let me finally admit it: while I am trying to explain you the ‘reality’ of these matter waves, we actually don’t know how real these matter waves actually are. We cannot ‘see’ or ‘touch’ them indeed. All that we know is that (i) assuming their existence, and (ii) assuming these matter waves are more or less well-behaved (e.g. that actual particles will be represented by a composite wave characterized by a linear dispersion relation between the angular frequencies and the wave numbers of its (theoretical) component waves) allows us to do all that arithmetic with these (complex-valued) probability amplitudes. More importantly, all that arithmetic with these complex numbers actually yields (real-valued) probabilities that are consistent with the probabilities we obtain through repeated experiments. So that’s what’s real and ‘not so real’ I’d say.

Indeed, the bottom-line is that we do not know what goes on inside that envelope. Worse, according to the commonly accepted Copenhagen interpretation of the Uncertainty Principle (and tons of experiments have been done to try to overthrow that interpretation – all to no avail), we never will.