Superconductivity and flux quantization

This post continues my mini-series on Feynman’s Seminar on Superconductivity. Superconductivity is a state which produces many wondrous phenomena, but… Well… The flux quantization phenomenon may not be part of your regular YouTube feed but, as far as I am concerned, it may well be the most amazing manifestation of a quantum-mechanical phenomenon at a macroscopic scale. I mean… Super currents that keep going, with zero resistance, are weird—they explain how we can trap a magnetic flux in the first place—but the fact that such fluxes are quantized is even weirder.

The key idea is the following. When we cool a ring-shaped piece of superconducting material in a magnetic field, all the way down to the critical temperature that causes the electrons to condense into a superconducting fluid, then a super current will emerge—think of an eddy current, here, but with zero resistance—that will force the magnetic field out of the material, as shown below. This current will permanently trap some of the magnetic field, even when the external field is being removed. As said, that’s weird enough by itself but… Well… If we think of the super current as an eddy current encountering zero resistance, then the idea of a permanently trapped magnetic field makes sense, right? In case you’d doubt the effect… Well… Just watch one of the many videos on the effect on YouTube. 🙂 The amazing thing here is not the permanently trapped magnetic field, but the fact that it’s quantized.

trapped flux

To be precise, the trapped flux will always be an integer times 2πħ/q. In other words, the magnetic flux, which Feynman denotes by Φ (the capitalized Greek letter phi), will always be equal to:

Φ = n·2πħ/q, with n = 0, 1, 2, 3,…

Hence, the flux can be 0, 2πħ/q, 4πħ/q, 6πħ/q , and so on. The fact that it’s a multiple of 2π shows us it’s got to do with the fact that our piece of material is, effectively, a ring. The nice thing about this phenomenon is that the mathematical analysis is, in fact, fairly easy to follow—or… Well… Much easier than what we discussed before. 🙂 Let’s quickly go through it.
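Before we go through the math, it may be good to get a sense of the scale of this flux quantum. Note that 2πħ is just h, so we're talking h/q. A quick check in Python, with q = e (as in the formula above) and also with q = 2e, the value that actually shows up in the experiments because, as we'll see further down, the carriers are electron pairs:

```python
# Order-of-magnitude check of the flux quantum 2*pi*hbar/q = h/q.
import math

hbar = 1.054571817e-34   # J*s (reduced Planck constant)
e = 1.602176634e-19      # C (elementary charge)

for label, q in [("q = e ", e), ("q = 2e", 2 * e)]:
    print(f"{label}: 2*pi*hbar/q = {2 * math.pi * hbar / q:.3e} Wb")

# q = e : ~4.14e-15 Wb
# q = 2e: ~2.07e-15 Wb (the measured superconducting flux quantum)
```

So we're talking extremely small fluxes here.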

We have a formula for the magnetic flux. It must be equal to the line integral of the vector potential (A) around a closed loop Γ, so we write:

Φ = ∮Γ A·ds

Now, we can choose the loop Γ to be well inside the body of the ring, so that it never gets near the surface, as illustrated below. So we know that the current J is zero there. [In case you doubt this, see my previous post.]

curve

One of the equations we introduced in our previous post, ħ∇θ = m·v + q·A, will then reduce to:

ħ∇θ = q·A

Why? The v in the m·v term (the velocity of the superconducting fluid, really), is zero. Remember the analysis is for this particular loop (well inside the ring) only. So… Well… If we integrate the expression above, we get:

ħ·∮Γ ∇θ·ds = q·∮Γ A·ds

Combining the two expressions with the integrals, we get:

Φ = ∮Γ A·ds = (ħ/q)·∮Γ ∇θ·ds

Now, the line integral of a gradient from one point to another (say from point 1 to point 2) is the difference of the values of the function at the two points, so we can write:

Φ = (ħ/q)·(θ2 − θ1)

Now what constraints are there on the values of θ1 and θ2? Well… You might think that, if they’re associated with the same point (we’re talking a closed loop, right?), then the two values should be the same, but… Well… No. All we can say is that the wavefunction must have the same value. We wrote that wavefunction as:

ψ = ρ(r)^(1/2)·e^(iθ(r))

The value of this function at some point r is the same if θ changes by 2π (or by any integer multiple of 2π, for that matter). Hence, when doing one complete turn around the ring, the ∮∇θ·ds integral in the integral formulas we wrote down must be equal to n·2π, with n some integer. Therefore, the second integral expression above can be re-written as:

Φ = n·2πħ/q

That’s the result we wanted to explain so… Well… We’re done. Let me wrap up by quoting Feynman’s account of the 1961 experiment which confirmed London’s prediction of the effect, a prediction that goes back to 1950! It’s interesting, because… Well… It shows how up to date Feynman’s Lectures really are—or were, back in 1963, at least!

feynman overview of experiment

Feynman’s Seminar on Superconductivity (2)

We didn’t get very far in our first post on Feynman’s Seminar on Superconductivity, and then I shifted my attention to other subjects over the past few months. So… Well… Let me re-visit the topic here.

One of the difficulties one encounters when trying to read this so-called seminar—which, according to Feynman, is ‘for entertainment only’ and, therefore, not really part of the Lectures themselves—is that Feynman throws in a lot of stuff that is not all that relevant to the topic itself but… Well… He apparently didn’t manage to throw all that he wanted to throw into his (other) Lectures on Quantum Mechanics and so he inserted a lot of stuff which he could, perhaps, have discussed elsewhere. :-/ So let us try to re-construct the main lines of reasoning here.

The first equation is Schrödinger’s equation for some particle with charge q that is moving in an electromagnetic field that is characterized not only by the (scalar) potential Φ but also by a vector potential A:

i·ħ·∂ψ/∂t = (1/2m)·[(ħ/i)∇ − q·A]·[(ħ/i)∇ − q·A]ψ + q·Φ·ψ

This closely resembles Schrödinger’s equation for an electron that is moving in an electric field only, which we used to find the energy states of electrons in a hydrogen atom: i·ħ·∂ψ/∂t = −(1/2)·(ħ²/m)·∇²ψ + V·ψ. We just need to note the following:

  1. On the left-hand side, we can, obviously, replace −1/i by i.
  2. On the right-hand side, we can replace V by q·Φ, because the potential energy of a charge in an electric field is the product of the charge (q) and the (electric) potential (Φ).
  3. As for the other term on the right-hand side—so that’s the −(1/2)·(ħ²/m)·∇²ψ term—we can re-write −ħ²·∇²ψ as [(ħ/i)·∇]·[(ħ/i)·∇]ψ because (1/i)·(1/i) = 1/i² = 1/(−1) = −1. 🙂
  4. So all that’s left now, is that additional −q·A term in the (ħ/i)∇ − q·A expression. In our post, we showed that’s easily explained because we’re talking magnetodynamics: we’ve got to allow for the possibility of changing magnetic fields, and so that’s what the −q·A term captures.

Now, the latter point is not so easy to grasp but… Well… I’ll refer you to that first post of mine, in which I show that some charge in a changing magnetic field will effectively gather some extra momentum, whose magnitude will be equal to p = m·v = −q·A. So that’s why we need to introduce another momentum operator here, which we write as:

(ħ/i)∇ − q·A

OK. Next. But… Then… Well… All of what follows are either digressions—like the section on the local conservation of probabilities—or, else, quite intuitive arguments. Indeed, Feynman does not give us the nitty-gritty of the Bardeen-Cooper-Schrieffer theory, nor is the rest of the argument nearly as rigorous as the derivation of the electron orbitals from Schrödinger’s equation in an electrostatic field. So let us closely stick to what he does write, and try our best to follow the arguments.

Cooper pairs

The key assumption is that there is some attraction between electrons which, at low enough temperatures, can overcome the Coulomb repulsion. Where does this attraction come from? Feynman does not give us any clues here. He just makes a reference to the BCS theory but notes this theory is “not the subject of this seminar”, and that we should just “accept the idea that the electrons do, in some manner or other, work in pairs”, and that “we can think of those pairs as behaving more or less like particles”, and that “we can, therefore, talk about the wavefunction for a pair.”

So we have a new particle, so to speak, which consists of two electrons who move through the conductor as one. To be precise, the electron pair behaves as a boson. Now, bosons have integer spin. According to the spin addition rule, we have four possibilities here but only three possible values: +1/2 + 1/2 = 1; −1/2 + 1/2 = 0; +1/2 − 1/2 = 0; −1/2 − 1/2 = −1. Of course, it is tempting to think these Cooper pairs are just like the electron pairs in the atomic orbitals, whose spin is always opposite because of the Pauli exclusion principle. Feynman doesn’t say anything about this, but the Wikipedia article on the BCS theory notes that the two electrons in a Cooper pair are, effectively, correlated because of their opposite spin. Hence, we must assume the Cooper pairs effectively behave like spin-zero particles.

Now, unlike fermions, bosons can collectively share the same energy state. In fact, they are likely to condense into the same state, into what is referred to as a Bose-Einstein condensate. As Feynman puts it: “Since electron pairs are bosons, when there are a lot of them in a given state there is an especially large amplitude for other pairs to go to the same state. So nearly all of the pairs will be locked down at the lowest energy in exactly the same state—it won’t be easy to get one of them into another state. There’s more amplitude to go into the same state than into an unoccupied state by the famous factor √n, where n−1 is the occupancy of the lowest state. So we would expect all the pairs to be moving in the same state.”
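Just to make that √n factor a bit more tangible (it is a standard result for bosons, not something Feynman derives at this point): if the amplitude for a pair to go into an empty state is a, then the amplitude for it to go into a state that is already occupied by n−1 pairs is √n·a, so the associated probability is larger by a factor n. With n being some huge number, the condensed state wins by an enormous margin.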

Of course, this only happens at very low temperatures: at higher temperatures, the thermal energy, even if it is quite small, gives the electrons sufficient energy to overcome the attractive force, so all pairs are broken up. It is only at very low temperature that they will pair up and go into a Bose-Einstein condensate. Now, Feynman derives this √n factor in a rather abstruse introductory Lecture in the third volume, and I’d advise you to google other material on Bose-Einstein statistics because… Well… The mentioned Lecture is not among Feynman’s finest. OK. Next step.

Cooper pairs and wavefunctions

We know the probability of finding a Cooper pair is equal to the absolute square of its wavefunction. Now, it is very reasonable to assume that this probability will be proportional to the charge density (ρ), so we can write:

|ψ|² = ψψ* ∼ ρ(r)

The argument here (r) is just the position vector. The next step, then, is to write ψ as the square root of ρ(r) times some phase factor e^(iθ). Abstracting away from time, this phase θ will also depend on r, of course. So this is what Feynman writes:

ψ = ρ(r)^(1/2)·e^(iθ(r))

As Feynman notes, we can write any complex function of r like this but… Well… The charge density is, obviously, something real. Something we can measure, so we’re not writing the obvious here. The next step is even less obvious.

In our first post, we spent quite some time on Feynman’s digression on the local conservation of probability and… Well… I wrote above I didn’t think this digression was very useful. It now turns out it’s a central piece in the puzzle that Feynman is trying to solve for us here. The key formula here is the one for the so-called probability current, which—as Feynman shows—we write as:

probability-current-2

This current J can also be written as:

probability-current-1

Now, Feynman skips all of the math here (he notes "it’s just a change of variables", so he doesn’t want to go through all of the algebra), and so I’ll just believe him when he says that, when substituting our wavefunction ψ = ρ(r)^(1/2)·e^(iθ(r)), we can express this ‘current’ (J) in terms of ρ and θ. To be precise, he writes J as:

J = (ħ/m)·[∇θ − (q/ħ)·A]·ρ

So what? Well… It’s really fascinating to see what happens next. While J was some rather abstract concept so far—what’s a probability current, really?—Feynman now suggests we may want to think of it as a very classical electric current: the charge density times the velocity of the fluid of electrons. Hence, we equate J to ρ·v. Now, if the equation above holds true, and J is also equal to ρ·v, then the equation above is equivalent to:

m·v = ħ∇θ − q·A

Now, that gives us a formula for ħ∇θ. We write:

ħ∇θ = m·v + q·A
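For the record, the ‘change of variables’ that Feynman skips is quick enough to sketch, assuming the probability current has the standard (gauge-invariant) form J = (1/2m)·[ψ*·((ħ/i)∇ − q·A)ψ + ψ·((−ħ/i)∇ − q·A)ψ*]. Substituting ψ = ρ(r)^(1/2)·e^(iθ(r)), we get ∇ψ = [∇ρ^(1/2) + i·ρ^(1/2)·∇θ]·e^(iθ), so ψ*·[(ħ/i)∇ − q·A]ψ = (ħ/i)·ρ^(1/2)·∇ρ^(1/2) + ħ·ρ·∇θ − q·A·ρ. When we add the complex conjugate term, the first (imaginary) term drops out, and we’re left with J = (1/m)·(ħ∇θ − q·A)·ρ = (ħ/m)·[∇θ − (q/ħ)·A]·ρ, which is the formula we used above. Equating that to ρ·v and dividing by ρ then gives the m·v = ħ∇θ − q·A expression.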

Now, in my previous post on this Seminar, I noted that Feynman attaches a lot of importance to this m·v + q·A quantity because… Well… It’s actually an invariant quantity. The argument can be, very briefly, summarized as follows. During the build-up of (or a change in) a magnetic flux, a charge will pick up some (classical) momentum that is equal to p = m·v = −q·A. Hence, the m·v + q·A sum is zero, and so… Well… That’s it, really: it’s some quantity that… Well… It has a significance in quantum mechanics. What significance? Well… Think of what we’ve been writing here. The v and the A have a physical significance, obviously. Therefore, that phase factor θ(r) must also have a physical significance.

But the question remains: what physical significance, exactly? Well… Let me quote Feynman here:

“The phase is just as observable as the charge density ρ. It is a piece of the current density J. The absolute phase (θ) is not observable, but if the gradient of the phase (∇θ) is known everywhere, then the phase is known except for a constant. You can define the phase at one point, and then the phase everywhere is determined.”

That makes sense, doesn’t it? But it still doesn’t quite answer the question: what is the physical significance of θ(r)? What is it, really? We may be able to answer that question after exploring the equations above a bit more, so let’s do that now.

Superconductivity

The phenomenon of superconductivity itself is easily explained by the mentioned condensation of the Cooper pairs: they all go into the same energy state. They form, effectively, a superconducting fluid. Feynman’s description of this is as follows:

“There is no electrical resistance. There’s no resistance because all the electrons are collectively in the same state. In the ordinary flow of current you knock one electron or the other out of the regular flow, gradually deteriorating the general momentum. But here to get one electron away from what all the others are doing is very hard because of the tendency of all Bose particles to go in the same state. A current once started, just keeps on going forever.”

Frankly, I’ve re-read this a couple of times, but I don’t think it’s the best description of what we think is going on here. I’d rather compare the situation to… Well… Electrons moving around in an electron orbital. That doesn’t involve any radiation or energy transfer either. There’s just movement. Flow. The kind of flow we have in the wavefunction itself. Here I think the video on Bose-Einstein condensates on the French Tout est quantique site is quite instructive: all of the Cooper pairs join to become one giant wavefunction—one superconducting fluid, really. 🙂

OK… Next.

The Meissner effect

Feynman describes the Meissner effect as follows:

“If you have a piece of metal in the superconducting state and turn on a magnetic field which isn’t too strong (we won’t go into the details of how strong), the magnetic field can’t penetrate the metal. If, as you build up the magnetic field, any of it were to build up inside the metal, there would be a rate of change of flux which would produce an electric field, and an electric field would immediately generate a current which, by Lenz’s law, would oppose the flux. Since all the electrons will move together, an infinitesimal electric field will generate enough current to oppose completely any applied magnetic field. So if you turn the field on after you’ve cooled a metal to the superconducting state, it will be excluded.

Even more interesting is a related phenomenon discovered experimentally by Meissner. If you have a piece of the metal at a high temperature (so that it is a normal conductor) and establish a magnetic field through it, and then you lower the temperature below the critical temperature (where the metal becomes a superconductor), the field is expelled. In other words, it starts up its own current—and in just the right amount to push the field out.”

The math here is interesting. Feynman first notes that, in any lump of superconducting metal, the divergence of the current must be zero, so we write: ∇·J = 0. At any point? Yes. The current that goes in must go out. No point is a sink or a source. Now, the divergence operator (∇·) is a linear operator. Hence, when applying it to the J = (ħ/m)·[∇θ − (q/ħ)·A]·ρ equation, we’ll need to figure out what ∇·∇θ = ∇²θ and ∇·A are. Now, as explained in my post on gauges, we can choose to make ∇·A equal to zero so… Well… We’ll make that choice and, hence, the term with ∇·A in it vanishes. So… Well… If ∇·J equals zero, then the term with ∇²θ has to be zero as well, so ∇²θ has to be zero. That, in turn, implies ∇θ has to be some constant (vector).

Now, there is a pretty big error in Feynman’s Lecture here, which notes: “Now the only way that ∇²θ can be zero everywhere inside the lump of metal is for θ to be a constant.” It should read: ∇²θ can only be zero everywhere if ∇θ is a constant (vector). So now we need to remind ourselves of the reality of θ, as described by Feynman (quoted above): “The absolute phase (θ) is not observable, but if the gradient of the phase (∇θ) is known everywhere, then the phase is known except for a constant. You can define the phase at one point, and then the phase everywhere is determined.” So we can define, or choose, our constant (vector) ∇θ to be 0.

Hmm… We re-set not one but two gauges here: A and θ. Tricky business, but let’s go along with it. [If we want to understand Feynman’s argument, then we actually have no choice but to go along with it, right?] The point is: the (ħ/m)·∇θ term in the J = (ħ/m)·[∇θ − (q/ħ)·A]·ρ equation vanishes, so the equation we’re left with tells us the current—so that’s an actual as well as a probability current!—is proportional to the vector potential:

J = −(q·ρ/m)·A

Now, we’ve neglected any possible variation in the charge density ρ so far because… Well… The charge density in a superconducting fluid must be uniform, right? Why? When the metal is superconducting, an accumulation of electrons in one region would be immediately neutralized by a current, right? [Note that Feynman’s language is more careful here. He writes: the charge density is almost perfectly uniform.]

So what’s next? Well… We have a more general equation from the equations of electromagnetism:

∇²A = −j/(ε₀·c²)

[In case you’d want to know how we get this equation out of Maxwell’s equations, you can look it up online in one of the many standard textbooks on electromagnetism.] You recognize this as a Poisson equation… Well… Three Poisson equations: one for each component of A and J. We can now combine the two equations above by substituting in that Poisson equation, so we get the following differential equation, which we need to solve for A:

∇²A = λ²·A

The λ2 in this equation is, of course, a shorthand for the following constant:

λ² = q·ρ/(ε₀·m·c²)

Now, it’s very easy to see that, in one dimension (with r replaced by the one-dimensional position variable x), both e^(−λx) and e^(+λx) are solutions of that equation (d²A/dx² = λ²·A). But what do they mean? You can check the shapes of these solutions with a graphing tool.

graph

Note that only one half of each graph counts: the vector potential must decrease when we go from the surface into the material, and there is a cut-off at the surface of the material itself, of course. So all depends on the size of λ, as compared to the size of our piece of superconducting metal (or whatever other substance our piece is made of). In fact, if we look at e^(−λx) as an exponential decay function, then τ = 1/λ is the so-called scaling constant (it’s the inverse of the decay constant, which is λ itself). [You can work this out yourself. Note that for x = τ = 1/λ, the value of our function e^(−λx) will be equal to e^(−λ·(1/λ)) = e^(−1) ≈ 0.368, so it means the value of our function is reduced to about 36.8% of its initial value.] For all practical purposes, we may say—as Feynman notes—that the field will, effectively, only penetrate to a thin layer at the surface: a layer of about 1/λ in thickness. He illustrates this as follows:

illustration

Moreover, he calculates the 1/λ distance for lead. Let me copy him here:

calculation

Well… That says it all, right? We’re talking two millionths of a centimeter here… 🙂
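For what it's worth, we can redo that estimate ourselves. This is just a sketch: the free-electron density of lead below is an assumed round number, and I use q = e and m = mₑ (single charges, as in the simple model above) rather than the values for pairs:

```python
# Rough re-do of the 1/lambda estimate for lead.
import math

eps0 = 8.8541878128e-12   # F/m
c = 2.99792458e8          # m/s
e = 1.602176634e-19       # C
m_e = 9.1093837015e-31    # kg
n = 1.3e29                # assumed free-electron density of lead, per m^3

# lambda^2 = q*rho/(eps0*m*c^2), with rho = n*q the charge density
lam_sq = e * (n * e) / (eps0 * m_e * c**2)
print(f"1/lambda ~ {1 / math.sqrt(lam_sq):.1e} m")
# ~1.5e-8 m, i.e. of the order of a millionth of a centimeter -- same ballpark as Feynman's number
```

The exact value obviously depends on what you plug in for the density, but the order of magnitude is robust: the field only penetrates a layer of a few tens of nanometers.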

So what’s left? A lot, like flux quantization, or the equations of motion for the superconducting electron fluid. But we’ll leave that for the next posts. 🙂

Feynman’s Seminar on Superconductivity (1)

The ultimate challenge for students of Feynman’s iconic Lectures series is, of course, to understand his final one: A Seminar on Superconductivity. As he notes in his introduction to this formidably dense piece, the text does not present the detail of each and every step in the development and, therefore, we’re not supposed to immediately understand everything. As Feynman puts it: we should just believe (more or less) that things would come out right if we were able to go through each and every step. Well… Let’s see. Feynman throws a lot of stuff in here—including, I suspect, some stuff that may not be directly relevant, but that he sort of couldn’t insert into all of his other Lectures. So where do we start?

It took me one long maddening day to figure out the first formula:

〈b|a〉 in A = 〈b|a〉A = 0·exp[(i·q/ħ)·∫ A·ds]

It says that the amplitude for a particle to go from a to b in a vector potential (think of a classical magnetic field) is the amplitude for the same particle to go from a to b when there is no field (A = 0) multiplied by the exponential of the line integral of the vector potential times the electric charge divided by Planck’s constant. I stared at this for quite a while, but then I recognized the formula for the magnetic effect on an amplitude, which I described in my previous post: it tells us that a magnetic field will shift the phase of the amplitude of a particle with an amount equal to:

φ = (q/ħ)·∫ A·ds

Hence, if we write 〈b|a〉 for A = 0 as 〈b|a〉A = 0 = C·e^(iθ), then 〈b|a〉 in A will, naturally, be equal to C·e^(i(θ+φ)) = C·e^(iθ)·e^(iφ) = 〈b|a〉A = 0·e^(iφ), and so that explains it. 🙂 Alright… Next. Or… Well… Let us briefly re-examine the concept of the vector potential, because we’ll need it a lot. We introduced it in our post on magnetostatics. Let’s briefly re-cap the development there. In Maxwell’s set of equations, two out of the four equations give us the magnetic field: ∇·B = 0 and c²·∇×B = j/ε₀. We noted the following in this regard:

  1. The ∇·B = 0 equation is true, always, unlike the ∇×E = 0 expression, which is true for electrostatics only (no moving charges). So the ∇·B = 0 equation says the divergence of B is zero, always.
  2. The divergence of the curl of a vector field is always zero. Hence, if A is some vector field, then div(curl A) = ∇·(∇×A) = 0, always.
  3. We can now apply another theorem: if the divergence of a vector field, say D, is zero—so if ∇·D = 0—then D will be the curl of some other vector field C, so we can write: D = ∇×C. Applying this to ∇·B = 0, we can write:

If ∇·B = 0, then there is an A such that B = ∇×A

So, in essence, we’re just re-defining the magnetic field (B) in terms of some other vector field. To be precise, we write it as the curl of some other vector field, which we refer to as the (magnetic) vector potential. The components of the magnetic field vector can then be re-written as:

Bx = ∂Az/∂y − ∂Ay/∂z, By = ∂Ax/∂z − ∂Az/∂x, Bz = ∂Ay/∂x − ∂Ax/∂y

We need to note an important point here: the equations above suggest that the components of B depend on position only. In other words, we assume static magnetic fields, so they do not change with time. That, in turn, assumes steady currents. We will want to extend the analysis to also include magnetodynamics. It complicates the analysis but… Well… Quantum mechanics is complicated. Let us remind ourselves here of Feynman’s re-formulation of Maxwell’s equations as a set of two equations (expressed in terms of the magnetic (vector) and the electric potential) only:

∇²A − (1/c²)·∂²A/∂t² = −j/(ε₀·c²)

∇²φ − (1/c²)·∂²φ/∂t² = −ρ/ε₀

These equations are wave equations, as you can see by writing out the second equation:

∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z² − (1/c²)·∂²φ/∂t² = −ρ/ε₀

It is a wave equation in three dimensions. Note that, even in regions where we do not have any charges or currents, we have non-zero solutions for φ and A. These non-zero solutions are, effectively, representing the electric and magnetic fields as they travel through free space. As Feynman notes, the advantage of re-writing Maxwell’s equations as we do above, is that the two new equations make it immediately apparent that we’re talking electromagnetic waves, really. As he notes, for many practical purposes, it will still be convenient to use the original equations in terms of E and B, but… Well… Not in quantum mechanics, it turns out. As Feynman puts it: “E and B are on the other side of the mountain we have climbed. Now we are ready to cross over to the other side of the peak. Things will look different—we are ready for some new and beautiful views.”

Well… Maybe. Appreciating those views, as part of our study of quantum mechanics, does take time and effort, unfortunately. 😦

The Schrödinger equation in an electromagnetic field

Feynman then jots down Schrödinger’s equation for the same particle (with charge q) moving in an electromagnetic field that is characterized not only by the (scalar) potential Φ but also by a vector potential A:

i·ħ·∂ψ/∂t = (1/2m)·[(ħ/i)∇ − q·A]·[(ħ/i)∇ − q·A]ψ + q·Φ·ψ

Now where does that come from? We know the standard formula in an electric field, right? It’s the formula we used to find the energy states of electrons in a hydrogen atom:

i·ħ·∂ψ/∂t = −(1/2)·(ħ²/m)·∇²ψ + V·ψ

Of course, it is easy to see that we replaced V by q·Φ, which makes sense: the potential energy of a charge in an electric field is the product of the charge (q) and the (electric) potential (Φ), because Φ is, obviously, the potential energy of the unit charge. It’s also easy to see we can re-write −ħ²·∇²ψ as [(ħ/i)·∇]·[(ħ/i)·∇]ψ because (1/i)·(1/i) = 1/i² = 1/(−1) = −1. 🙂 Alright. So it’s just that −q·A term in the (ħ/i)∇ − q·A expression that we need to explain now.

Unfortunately, that explanation is not so easy. Feynman basically re-does his trade-mark heuristic derivation of Schrödinger’s equation – which originally did not include any magnetic field – but now with a vector potential in it. The re-derivation is rather annoying, and I didn’t have the courage to go through it myself, so you should – just like me – just believe Feynman when he says that, when there’s a vector potential – i.e. when there’s a magnetic field – then that (ħ/i)·∇ operator – which is the momentum operator – ought to be replaced by a new momentum operator:

(ħ/i)∇ − q·A

So… Well… There we are… 🙂 So far, so good? Well… Maybe.

While, as mentioned, you may not be interested in the mathematical argument, it is probably worthwhile to reproduce Feynman’s more intuitive explanation of why the operator above is what it is. In other words, let us try to understand that −qA term. Look at the following situation: we’ve got a solenoid here, and some current I is going through it so there’s a magnetic field B. Think of the dynamics while we turn on this flux. Maxwell’s second equation (∇×E = −∂B/∂t) tells us the line integral of E around a loop will be equal to the time rate of change of the magnetic flux through that loop. The ∇×E = −∂B/∂t equation is a differential equation, of course, so it doesn’t have the integral, but you get the idea—I hope.

solenoid

Now, using the B = ∇×A equation, we can re-write ∇×E = −∂B/∂t as ∇×E = −∂(∇×A)/∂t. This allows us to write the following:

∇×E = −∂(∇×A)/∂t = −∇×(∂A/∂t) ⇒ E = −∂A/∂t (up to the gradient of some scalar function which, in the absence of charges, we can take to be zero)

This is a remarkable expression. Note its derivation is based on the commutativity of the curl and time derivative operators, which is a property that can easily be explained: if we have a function in two variables—say x and t—then the order of the derivation doesn’t matter: we can first take the derivative with respect to x and then to t or, alternatively, we can first take the time derivative and then do the ∂/∂x operation. So… Well… The curl is, effectively, a derivative with regard to the spatial variables. OK. So what? What’s the point?

Well… If we’d have some charge q, as shown in the illustration above, that would happen to be there as the flux is being switched on, it will experience a force which is equal to F = qE. We can now integrate this over the time interval (t) during which the flux is being built up to get the following:

∫0t F·dt = ∫0t m·a·dt = ∫0t m·(dv/dt)·dt = m·vt = ∫0t q·E·dt = −∫0t q·(∂A/∂t)·dt = −q·At

Assuming v0 and A0 are zero, we may drop the time subscript and simply write:

m·v = −q·A

The point is: during the build-up of the magnetic flux, our charge will pick up some (classical) momentum that is equal to p = m·v = −q·A. So… Well… That sort of explains the additional term in our new momentum operator.
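To make this a bit more concrete, here is a quick numerical check of that argument: just a sketch, with made-up numbers for the charge, the mass, and the ramp of the vector potential:

```python
# Ramp up a vector potential A(t), let the induced field E = -dA/dt push on a
# charge q, and check that the momentum it picks up is m*v = -q*A.
import numpy as np

q = 1.6e-19        # C (an electron-sized charge, arbitrary choice)
m = 9.1e-31        # kg
T = 1.0e-3         # s, duration of the ramp
A_final = 2.5e-7   # V*s/m, final value of the vector potential (arbitrary)

t = np.linspace(0.0, T, 100_001)
A = A_final * (t / T) ** 2                 # some smooth ramp from 0 to A_final
E = -np.gradient(A, t)                     # induced electric field, E = -dA/dt
v = np.cumsum(q * E / m) * (t[1] - t[0])   # integrate dv/dt = q*E/m

print(m * v[-1], -q * A_final)             # both ~ -4.0e-26 kg*m/s
```

Whatever shape you choose for the ramp, the end result is the same: the momentum picked up only depends on the final value of A.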

Note: For some reason I don’t quite understand, Feynman introduces the weird concept of ‘dynamical momentum’, which he defines as the quantity m·v + q·A, so that quantity must be zero in the analysis above. I quickly googled to see why but didn’t invest too much time in the research here. It’s just… Well… A bit puzzling. I don’t really see the relevance of his point here: I am quite happy to go along with the new operator, as it’s rather obvious that introducing changing magnetic fields must, obviously, also have some impact on our wave equations—in classical as well as in quantum mechanics.

Local conservation of probability

The title of this section in Feynman’s Lecture (yes, still the same Lecture – we’re not switching topics here) is the equation of continuity for probabilities. I find it brilliant, because it confirms my interpretation of the wave function as describing some kind of energy flow. Let me quote Feynman on his endeavor here:

“An important part of the Schrödinger equation for a single particle is the idea that the probability to find the particle at a position is given by the absolute square of the wave function. It is also characteristic of the quantum mechanics that probability is conserved in a local sense. When the probability of finding the electron somewhere decreases, while the probability of the electron being elsewhere increases (keeping the total probability unchanged), something must be going on in between. In other words, the electron has a continuity in the sense that if the probability decreases at one place and builds up at another place, there must be some kind of flow between. If you put a wall, for example, in the way, it will have an influence and the probabilities will not be the same. So the conservation of probability alone is not the complete statement of the conservation law, just as the conservation of energy alone is not as deep and important as the local conservation of energy. If energy is disappearing, there must be a flow of energy to correspond. In the same way, we would like to find a “current” of probability such that if there is any change in the probability density (the probability of being found in a unit volume), it can be considered as coming from an inflow or an outflow due to some current.”

This is it, really ! The wave function does represent some kind of energy flow – between a so-called ‘real’ and a so-called ‘imaginary’ space, which are to be defined in terms of directional versus rotational energy, as I try to point out – admittedly: more by appealing to intuition than to mathematical rigor – in that post of mine on the meaning of the wavefunction.

So what is the flow – or probability current as Feynman refers to it? Well… Here’s the formula:

probability-current-2

Huh? Yes. Don’t worry too much about it right now. The essential point is to understand what this current – denoted by J – actually stands for:

probability-current-1

So what’s next? Well… Nothing. I’ll actually refer you to Feynman now, because I can’t improve on how he explains how pairs of electrons start behaving when temperatures are low enough to render Boltzmann’s Law irrelevant: the kinetic energy that’s associated with temperature can no longer break up electron pairs if temperature comes close to the zero point.

Huh? What? Electron pairs? Electrons are not supposed to form pairs, are they? They carry the same charge and are, therefore, supposed to repel each other. Well… Yes and no. In my post on the electron orbitals in a hydrogen atom – which just presented Feynman’s presentation on the subject-matter in a, hopefully, somewhat more readable format – we calculated electron orbitals neglecting spin. In Feynman’s words:

“We make another approximation by forgetting that the electron has spin. […] The non-relativistic Schrödinger equation disregards magnetic effects. [However] Small magnetic effects [do] occur because, from the electron’s point-of-view, the proton is a circulating charge which produces a magnetic field. In this field the electron will have a different energy with its spin up than with its spin down. [Hence] The energy of the atom will be shifted a little bit from what we will calculate. We will ignore this small energy shift. Also we will imagine that the electron is just like a gyroscope moving around in space always keeping the same direction of spin. Since we will be considering a free atom in space the total angular momentum will be conserved. In our approximation we will assume that the angular momentum of the electron spin stays constant, so all the rest of the angular momentum of the atom—what is usually called “orbital” angular momentum—will also be conserved. To an excellent approximation the electron moves in the hydrogen atom like a particle without spin—the angular momentum of the motion is a constant.”

To an excellent approximation… But… Well… Electrons in a metal do form pairs, because they can give up energy in that way and, hence, they are more stable that way. Feynman does not go into the details here – I guess because that’s way beyond the undergrad level – but refers to the Bardeen-Cooper-Schrieffer (BCS) theory instead – the authors of which got a Nobel Prize in Physics in 1972 (that’s a decade or so after Feynman wrote this particular Lecture), so I must assume the theory is well accepted now. 🙂

Of course, you’ll shout now: Hey! Hydrogen is not a metal! Well… Think again: the latest breakthrough in physics is making hydrogen behave like a metal. 🙂 And I am really talking the latest breakthrough: Science just published the findings of this experiment last month! 🙂 🙂 In any case, we’re not talking hydrogen here but superconducting materials, to which – as far as we know – the BCS theory does apply.

So… Well… I am done. I just wanted to show you why it’s important to work your way through Feynman’s last Lecture because… Well… Quantum mechanics does explain everything – although the nitty-gritty of it (the Meissner effect, the London equation, flux quantization, etc.) are rather hard bullets to bite. 😦

Don’t give up ! I am struggling with the nitty-gritty too ! 🙂

Induced currents

In my two previous posts, I presented all of the ingredients of the meal we’re going to cook now, most notably:

  1. The formula for the torque on a loop of a current in a magnetic field, and its energy: (i) τ = μ×B, and (ii) Umech = −μ·B.
  2. The Biot-Savart Law, which gives you the magnetic field that’s produced by wires carrying currents:

B formula 2

Both ingredients are, obviously, relevant to the design of an electromagnetic motor, i.e. an ‘engine that can do some work’, as Feynman calls it. 🙂 Its principle is illustrated below.

motor

The two formulas above explain how and why the coil goes around, and the coil can be made to keep going by arranging that the connections to the coil are reversed each half-turn by contacts mounted on the shaft. Then the torque is always in the same direction. That’s how a small direct current (DC) motor is made. My father made me make a couple of these thirty years ago, with a magnet, a big nail and some copper coil. I used sliding contacts, and they were the most difficult thing in the whole design. But now I found a very nice demo on YouTube of a guy whose system to ‘reverse’ the connections is wonderfully simple: he doesn’t use any sliding contacts. He just removes half of the insulation on the wire of the coil on one side. It works like a charm, but I think it’s not so sustainable, as it spins so fast that the insulation on the other side will probably come off after a while! 🙂

Now, to make this motor run, you need current and, hence, 19th century physicists and mechanical engineers also wondered how one could produce currents by changing the magnetic field. Indeed, they could use Alessandro Volta’s ‘voltaic pile‘ to produce currents but it was not very handy: it consisted of alternating zinc and copper discs, with pieces of cloth soaked in salt water in-between!

Now, while the Biot-Savart Law goes back to 1820, it took another decade to find out how that could be done. Initially, people thought magnetic fields should just cause some current, but that didn’t work. Finally, Faraday unequivocally established the fundamental principle that electric effects are only there when something is changing. So you’ll get a current in a wire by moving it in a magnetic field, or by moving the magnet or, if the magnetic field is caused by some other current, by changing the current in that wire. It’s referred to as the ‘flux rule’, or Faraday’s Law. Remember: we’ve seen Gauss’ Law, then Ampère’s Law, and then that Biot-Savart Law, and so now it’s time for Faraday’s Law. 🙂 Faraday’s Law is Maxwell’s third equation really, aka the Maxwell-Faraday Law of Induction:

∇×E = −∂B/∂t

Now you’ll wonder: what’s flux got to do with this formula? ∇×E is about circulation, not about flux! Well… Let me copy Feynman’s answer:

Faraday's law

So… There you go. And, yes, you’re right, instead of writing Faraday’s Law as ∇×E = −∂B/∂t, we should write it as:

emf = ∮ E·ds = −dΦ/dt = −(d/dt)·∫ B·n·da

That’s easier to understand, and it’s also easier to work with, as we’ll see in a moment. So the point is: whenever the magnetic flux changes, there’s a push on the electrons in the wire. That push is referred to as the electromotive force, abbreviated as emf or EMF, and so it’s that line and/or surface integral above indeed. Let me paraphrase Feynman so you fully understand what we’re talking about here:

When we move our wire in a magnetic field, or when we move a magnet near the wire, or when we change the current in a nearby wire, there will be some net push on the electrons in the wire in one direction along the wire. There may be pushes in different directions at different places, but there will be more push in one direction than another. What counts is the push integrated around the complete circuit. We call this net integrated push the electromotive force (abbreviated emf) in the circuit. More precisely, the emf is defined as the tangential force per unit charge in the wire integrated over length, once around the complete circuit.

So that’s the integral. 🙂 And that’s how we can turn that motor above into a generator: instead of putting a current through the wire to make it turn, we can turn the loop, by hand or by a waterwheel or by whatever. Now, when the coil rotates, its wires will be moving in the magnetic field and so we will find an emf in the circuit of the coil, and so that’s how the motor becomes a generator.
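To put a number on that, here is a toy version of such a generator, a sketch with assumed values for the field, the coil, and the rotation speed:

```python
# A flat coil of N turns and area S rotating at angular speed w in a uniform
# field B: the flux through it is N*B*S*cos(w*t), so the "flux rule" gives
# emf(t) = N*B*S*w*sin(w*t).
import math

N = 100                 # turns
B = 0.05                # T
S = 0.01                # m^2 (a 10 cm x 10 cm loop)
w = 2 * math.pi * 50    # rad/s (50 turns per second)

peak_emf = N * B * S * w
print(f"peak emf = {peak_emf:.1f} V")   # ~15.7 V
```

Nothing fancy, but it shows the knobs we have: more turns, more area, a stronger field, or a faster rotation all increase the emf.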

Now, let me quickly interject something here: when I say ‘a push on the electrons in the wire’, what electrons are we talking about? How many? Well… I’ll answer that question in very much detail in a moment but, as for now, just note that the emf is some quantity expressed per coulomb or, as Feynman puts it above, per unit charge. So we’ll need to multiply it with the current in the circuit to get the power of our little generator.

OK. Let’s move on. Indeed, all I can do here is mention just a few basics, so we can move on to the next thing. If you really want to know all of the nitty-gritty, then you should just read Feynman’s Lecture on induced currents. That’s got everything. And, no, don’t worry: contrary to what you might expect, my ‘basics’ do not amount to a terrible pile of formulas. In fact, it’s all easy and quite amusing stuff, and I should probably include a lot more. But then… Well… I always need to move on… If not, I’ll never get to the stuff that I really want to understand. 😦

The electromotive force

We defined the electromotive force above, including its formula:

emf = ∮ E·ds = −dΦ/dt = −(d/dt)·∫ B·n·da

What are the units? Let’s see… We know B was measured not in newton per coulomb, like the electric field E, but in N·s/C·m, because we had to multiply the magnetic field strength with the velocity of the charge to find the force per unit charge, cf. the F/q = v×B equation. Now what’s the unit in which we’d express that surface integral? We must multiply with m², so we get N·m·s/C. Now let’s simplify that by noting that one volt is equal to 1 N·m/C. [The volt has a number of definitions, but the one that applies here is that it’s the potential difference between two points that will impart one joule (i.e. 1 N·m) of energy to a unit of charge (i.e. 1 C) that passes between them.] So we can measure the magnetic flux in volt-seconds, i.e. V·s. And then we take the derivative with regard to time, so we divide by s, and so we get… Volt! The emf is measured in volt!

Does that make sense? I guess so: the emf causes a current, just like a potential difference, i.e. a voltage, and, therefore, we can and should look at the emf as a voltage too!

But let’s think about it some more. In differential form, Faraday’s Law is just that ∇×E = −∂B/∂t equation, so that’s just one of Maxwell’s four equations, and so we prefer to write it as the “flux rule”. Now, the “flux rule” says that the electromotive force (abbreviated as emf or EMF) on the electrons in a closed circuit is equal to the time rate of change of the magnetic flux it encloses. As mentioned above, we measure magnetic flux in volt-seconds (i.e. V·s), so its time rate of change is measured in volt (because the time rate of change is a quantity expressed per second), and so the emf is measured in volt, i.e. joule per coulomb, as 1 V = 1 N·m/C = 1 J/C. What does it mean?

The time rate of change of the magnetic flux can change because the surface covered by our loop changes, or because the field itself changes, or by both. Whatever the cause, it will change the emf, or the voltage, and so it will make the electrons move. So let’s suppose we have some generator generating some emf. The emf can be used to do some work. We can charge a capacitor, for example. So how would that work?

More charge on the capacitor will increase the voltage V of the capacitor, i.e. the potential difference V = Φ1 − Φ2 between the two plates. Now, we know that the increase of the voltage V will be proportional to the increase of the charge Q, and that the constant of proportionality is defined by the capacity C of the capacitor: C = Q/V. [How do we know that? Well… Have a look at my post on capacitors.] Now, if our capacitor has an enormous capacity, then its voltage won’t increase very rapidly. However, it’s clear that, no matter how large the capacity, its voltage will increase. It’s just a matter of time. Now, its voltage cannot be higher than the emf provided by our ‘generator’, because it will then want to discharge through the same circuit!

So we’re talking power and energy here, and so we need to put some load on our generator. Power is the rate of doing work, so it’s the time rate of change of energy, and it’s expressed in joule per second. The energy of our capacitor is U = (1/2)·Q²/C = (1/2)·C·V². [How do we know that? Well… Have a look at my post on capacitors once again. :-)] So let’s take the time derivative of U. We get: dU/dt = d[(1/2)·Q²/C]/dt = (Q/C)·dQ/dt = V·dQ/dt. So that’s the power that the generator would need to supply to charge the capacitor. As I’ll show in a moment, the power supplied by a generator is, indeed, equal to the emf times the current, and the current is the time rate of change of the charge, so I = dQ/dt.
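Here is that bookkeeping in numbers, a small sketch with assumed values, just to check that the dU/dt = V·dQ/dt step works out:

```python
# Charge a capacitor C at a constant current I and compare the rate of change
# of its stored energy U = (1/2)*Q^2/C with V*I = (Q/C)*I.
C = 1.0e-3    # F
I = 0.1       # A (so dQ/dt = I)
t = 2.0       # s, some instant during the charging (starting from Q = 0)

Q = I * t                 # charge at time t
V = Q / C                 # voltage at time t
dU_dt = (Q / C) * I       # time derivative of (1/2)*Q^2/C
print(dU_dt, V * I)       # both 20.0 W: the power supplied goes into the capacitor
```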

So, yes, it all works out: the power that’s being supplied by our generator will be used to charge our capacitor. Now, you may wonder: what about the current? Where is the current in Faraday’s Law? The answer is: Faraday’s Law doesn’t have the current. It’s just not there. The emf is expressed in volt, and so that’s energy per coulomb, so it’s per unit charge. How much power a generator can and will deliver depends on its design, and the circuit and load that we will be putting on it. So we can’t say how many coulomb we will have. It all depends. But you can imagine that, if the loop would be bigger, or if we’d have a coil with many loops, then our generator would be able to produce more power, i.e. it would be able to move more electrons, so the mentioned power = (emf)×(current) product would be larger. 🙂

Finally, to conclude, note Feynman’s definition of the emf: the tangential force per unit charge in the wire integrated over length around the complete circuit. So we’ve got force times distance here, but per unit charge. Now, force times distance is work, or energy, and so… Yes, emf is joule per coulomb, definitely! 🙂

[…] Don’t worry too much if you don’t quite ‘get’ this. I’ll come back to it when discussing electric circuits, which I’ll do in my next posts.

Self-inductance and Lenz’s rule

We talked about motors and generators above. We also have transformers, like the one below. What’s going on here is that an alternating current (AC) produces a continuously varying magnetic field, which generates an alternating emf in the second coil, which produces enough power to light an electric bulb.

transformer

Now, the total emf in coil (b) is the sum of the emf’s of the separate turns of coil, so if we wind (b) with many turns, we’ll get a larger emf, so we can ‘transform’ the voltage to some other voltage. From your high-school classes, you should know how that works.

The thing I want to talk about here is something else, though. There is an induction effect in coil (a) itself. Indeed, the varying current in coil (a) produces a varying magnetic field inside itself, and the flux of this field is continually changing, so there is a self-induced emf in coil (a). The effect is called self-inductance, and so it’s the emf acting on a current itself when it is building up a magnetic field or, in general, when its field is changing in any way. It’s a most remarkable phenomenon, and so let me paraphrase Feynman as he describes it:

“When we gave “the flux rule” that the emf is equal to the rate of change of the flux linkage, we didn’t specify the direction of the emf. There is a simple rule, called Lenz’s rule, for figuring out which way the emf goes: the emf tries to oppose any flux change. That is, the direction of an induced emf is always such that if a current were to flow in the direction of the emf, it would produce a flux of B that opposes the change in B that produces the emf. In particular, if there is a changing current in a single coil (or in any wire), there is a “back” emf in the circuit. This emf acts on the charges flowing in the coil to oppose the change in magnetic field, and so in the direction to oppose the change in current. It tries to keep the current constant; it is opposite to the current when the current is increasing, and it is in the direction of the current when it is decreasing. A current in a self-inductance has “inertia,” because the inductive effects try to keep the flow constant, just as mechanical inertia tries to keep the velocity of an object constant.”

Hmm… That’s something you need to read a couple of times to fully digest it. There’s a nice MIT physics demo on YouTube showing this effect with a metal ring placed on the end of an electromagnet. You’ve probably seen it before: the electromagnet is connected to a current, and the ring flies into the air. The explanation is that the induced currents in the ring create a magnetic field opposing the change of field through it. So the ring and the coil repel just like two magnets with opposite poles. The effect is no longer there when a thin radial cut is made in the ring, because then there can be no current. The nice thing about the video is that it shows how the effect gets much more dramatic when an alternating current is applied, rather than a DC current. And it also shows what happens when you first cool the ring in liquid nitrogen. 🙂

You may also notice the sparks when the electromagnet is being switched off. Believe it or not, that’s also related to a “back emf”. Indeed, when we disconnect a large electromagnet by opening a switch, the current is supposed to immediately go to zero but, in trying to do so, it generates a large “back emf”: large enough to develop an arc across the opening contacts of the switch. The high voltage is also not good for the insulation of the coil, as it might damage it. So that’s why large electromagnets usually include some extra circuit, which allows the “back current” to discharge less dramatically. But I’ll refer you to Feynman for more details, as any illustration here would clutter the exposé.
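To see that 'inertia' and that 'back emf' in numbers, here is a quick sketch of a coil with some assumed inductance L and resistance R connected to a battery V: the current creeps up as I(t) = (V/R)·(1 − e^(−R·t/L)) rather than jumping to its final value, and the back emf is whatever part of V is not (yet) dropped across the resistance.

```python
# Current build-up in an RL circuit: the self-induced (back) emf L*dI/dt
# opposes the change, so the current cannot jump instantly.
import math

V = 12.0      # volt
R = 6.0       # ohm
L = 0.5       # henry

for t in [0.0, 0.05, 0.1, 0.25, 0.5]:
    I = (V / R) * (1 - math.exp(-R * t / L))
    back_emf = V - R * I                      # magnitude of the back emf, L*dI/dt
    print(f"t = {t:4.2f} s:  I = {I:5.3f} A,  back emf = {back_emf:5.2f} V")
```

Opening the switch is the reverse problem: the current has to go to zero very fast, so L·dI/dt, and hence the back emf, becomes very large, which is where the sparks come from.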

Eddy currents

I like educational videos, and so I should give you a few references here, but there are so many of these that I’ll let you google a few yourself. The most spectacular demonstrations of eddy currents are those that appear in a superconductor: even back in the early 1960s, when Feynman wrote his Lectures, the effect of magnetic levitation was well known. Feynman illustrates the effect with the simple diagram below: when bringing a magnet near to a perfect conductor, such as tin below 3.8°K, eddy currents will create opposing fields, so that no magnetic flux enters the superconducting material. The effect is also referred to as the Meissner effect, after the German physicist Walther Meissner, although superconductivity itself was discovered much earlier (in 1911) by a Dutch physicist in Leiden, Heike Kamerlingh Onnes, who got a Nobel Prize for it.

superconductor

Of course, we have eddy currents in less dramatic situations as well. The phenomenon of eddy currents is usually demonstrated by the braking of a sheet of metal as it swings back and forth between the poles of an electromagnet, as illustrated below (left). The illustration on the right shows how eddy-current effect can be drastically reduced by cutting slots in the plate, so that’s like making a radial cut in our jumping ring. 🙂

eddy currents — eddy currents 2

The Faraday disc

The Faraday disc is interesting, not only from a historical point of view – the illustration below is a 19th century model, one that Michael Faraday may have used himself – but also because it seems to contradict the “flux rule”: as the disc rotates through a steady magnetic field, it will produce some emf, but there’s no change in the flux. How is that possible?

Faraday_disk_generator — Faraday disk

The answer, of course, is that we are ‘cheating’ here: the material is moving, so we’re actually moving the ‘wire’, or the circuit if you want, so here we need to combine two equations:

F = q·(v×B) and ∇×E = −∂B/∂t

If we do that, you’ll see it all makes sense. 🙂 Oh… That Faraday disc is referred to as a homopolar generator, and it’s quite interesting. You should check out what happened to the concept in the Wikipedia article on it. The Faraday disc was apparently used as a source for power pulses in the 1950s. The thing below could store 500 mega-joules and deliver currents up to 2 mega-ampère, i.e. 2 million amps! Fascinating, isn’t it? 🙂

800px-Homopolar_anu-MJC
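For what it's worth, the emf of such a disc is easy to estimate: each radial strip of the disc moves through the field, so the v×B force per unit charge at radius r is ω·r·B, and integrating from the axis to the rim gives emf = (1/2)·B·ω·R². A quick sketch with assumed numbers:

```python
# Faraday disc (homopolar generator): emf = (1/2)*B*w*R^2.
import math

B = 0.5                  # T, assumed axial field
R = 0.1                  # m, assumed disc radius
w = 2 * math.pi * 50     # rad/s, 50 revolutions per second

emf = 0.5 * B * w * R**2
print(f"emf ~ {emf:.2f} V")   # ~0.79 V: low voltage, but potentially huge currents
```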

Bose and Fermi

Probability amplitudes: what are they?

Instead of reading Penrose, I’ve started to read Richard Feynman again. Of course, reading the original is always better than whatever others try to make of that, so I’d recommend you read Feynman yourself – instead of this blog. But then you’re doing that already, aren’t you? 🙂

Let’s explore those probability amplitudes somewhat more. They are complex numbers. In a fine little book on quantum mechanics (QED, 1985), Feynman calls them ‘arrows’ – and that’s what they are: two-dimensional vectors, aka complex numbers. So they have a direction and a length (or magnitude). When talking amplitudes, the direction and length are known as the phase and the modulus (or absolute value) respectively and you also know by now that the modulus squared represents a probability or probability density, such as the probability of detecting some particle (a photon or an electron) at some location x or some region Δx, or the probability of some particle going from A to B, or the probability of a photon being emitted or absorbed by an electron (or a proton), etcetera. I’ve inserted two illustrations below to explain the matter.

The first illustration just shows what a complex number really is: a two-dimensional number (z) with a real part (Re(z) = x) and an imaginary part (Im(z) = y). We can represent it in two ways: one uses the (x, y) coordinate system (z = x + iy), and the other is the so-called polar form: z = r·e^(iφ). The (real) number e in the latter equation is just Euler’s number, so that’s a mathematical constant (just like π). The little i is the imaginary unit, so that’s the thing we introduce to add a second (vertical) dimension to our analysis: i can be written as 0 + 1·i = (0, 1) indeed, and so it’s like a (second) basis vector in the two-dimensional (Cartesian or complex) plane.

polar form of complex number

I should not say much more about this, but I must list some essential properties and relationships:

  • The coordinate and polar form are related through Euler’s formula: z = x + iy = r·e^(iφ) = r·(cosφ + i·sinφ).
  • From this, and the fact that cos(-φ) = cosφ and sin(-φ) = –sinφ, it follows that the (complex) conjugate z* = x – iy of a complex number z = x + iy is equal to z* = r·e^(−iφ). [I use z* as a symbol, instead of z-bar, because I can’t find a z-bar in the character set here.] This equality is illustrated above.
  • The length/modulus/absolute value of a complex number is written as |z| and is equal to |z| = (x² + y²)^(1/2) = |r·e^(iφ)| = r (so r is always a positive (real) number).
  • As you can see from the graph, a complex number z and its conjugate z* have the same absolute value: |z| = |x+iy| = |z*| = |x-iy|.
  • Therefore, we have the following: |z|·|z| = |z*|·|z*| = |z|·|z*| = |z|², and we can use this result to calculate the (multiplicative) inverse: z^(−1) = 1/z = z*/|z|².
  • The absolute value of a product of complex numbers equals the product of the absolute values of those numbers: |z1z2| = |z1||z2|.
  • Last but not least, it is important to be aware of the geometric interpretation of the sum and the product of two complex numbers:
    • The sum of two complex numbers amounts to adding vectors, so that’s the familiar parallelogram law for vector addition: (a+ib) + (c+id) = (a+c) + i(b+d).
    • Multiplying two complex numbers amounts to adding the angles and multiplying their lengths – as evident from writing such product in its polar form: r·e^(iθ)·s·e^(iΘ) = r·s·e^(i(θ+Θ)). The result is, quite obviously, another complex number. So it is not the usual scalar or vector product which you may or may not be familiar with.

[For the sake of completeness: (i) the scalar product (aka dot product) of two vectors (a and b) is equal to the product of the magnitudes of the two vectors and the cosine of the angle between them: a·b = |a||b|·cosα; and (ii) the result of a vector product (or cross product) is a vector which is perpendicular to both, so it’s a vector that is not in the same plane as the vectors we are multiplying: a×b = |a||b|·sinα·n, with n the unit vector perpendicular to the plane containing a and b in the direction given by the so-called right-hand rule. Just be aware of the difference.]
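If you want to play with these properties, Python's built-in complex numbers will do. Here is a quick sanity check of the rules listed above (nothing quantum-mechanical about it):

```python
# Sanity checks on complex-number arithmetic.
import cmath

z1 = 3 + 4j
z2 = 1 - 2j

r1, phi1 = abs(z1), cmath.phase(z1)                   # polar form: z1 = r1*exp(i*phi1)
print(abs(z1), (3**2 + 4**2) ** 0.5)                  # modulus: both 5.0
print(z1.conjugate(), r1 * cmath.exp(-1j * phi1))     # conjugate: r*exp(-i*phi)
print(abs(z1 * z2), abs(z1) * abs(z2))                # |z1*z2| = |z1|*|z2|
print(cmath.phase(z1 * z2), phi1 + cmath.phase(z2))   # angles add when multiplying
```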

The second illustration (see below) comes from that little book I mentioned above already: Feynman’s exquisite 1985 Alix G. Mautner Memorial Lectures on Quantum Electrodynamics, better known as QED: the Strange Theory of Light and Matter. It shows how these probability amplitudes, or ‘arrows’ as he calls them, really work, without even mentioning that they are ‘probability amplitudes’ or ‘complex numbers’. That being said, these ‘arrows’ are what they are: probability amplitudes.

To be precise, the illustration below shows the probability amplitude of a photon (so that’s a little packet of light) reflecting from the front surface (front reflection arrow) and the back (back reflection arrow) of a thin sheet of glass. If we write these vectors in polar form (reiφ), then it is obvious that they have the same length (r = 0.2) but their phase φ is different. That’s because the photon needs to travel a bit longer to reach the back of the glass: so the phase varies as a function of time and space, but the length doesn’t. Feynman visualizes that with the stopwatch: as the photon is emitted from a light source and travels through time and space, the stopwatch turns and, hence, the arrow will point in a different direction.

[To be even more precise, the amplitude for a photon traveling from point A to B is a (fairly simple) function (which I won’t write down here though) which depends on the so-called spacetime interval. This spacetime interval (written as I or s2) is equal to I = [(x-x1)2+(y-y1)2+(z-z1)2] – (t-t1)2. So the first term in this expression is the square of the distance in space, and the second term is the difference in time, or the ‘time distance’. Of course, we need to measure time and distance in equivalent units: we do that either by measuring spatial distance in light-seconds (i.e. the distance traveled by light in one second) or by expressing time in units that are equal to the time it takes for light to travel one meter (in the latter case we ‘stretch’ time (by multiplying it with c, i.e. the speed of light) while in the former, we ‘stretch’ our distance units). Because of the minus sign between the two terms, the spacetime interval can be negative, zero, or positive, and we call these intervals time-like (I < 0), light-like (I = 0) or space-like (I > 0). Because nothing travels faster than light, two events separated by a space-like interval cannot have a cause-effect relationship. I won’t go into any more detail here but, at this point, you may want to read the article on the so-called light cone relating past and future events in Wikipedia, because that’s what we’re talking about here really.]

front and back reflection amplitude

Feynman adds the two arrows, because a photon may be reflected either by the front surface or by the back surface and we can’t know which of the two possibilities was the case. So he adds the amplitudes here, not the probabilities. The probability of the photon bouncing off the front surface is the modulus of the amplitude squared, i.e. |r·e^(iφ)|² = r², and so that’s 4% here (0.2·0.2). The probability for the back surface is the same: 4% also. However, the combined probability of a photon bouncing back from either the front or the back surface – we cannot know which path was followed – is not 8%, but some value between 0 and 16% (5% only in the top illustration, and 16% (i.e. the maximum) in the bottom illustration). This value depends on the thickness of the sheet of glass. That’s because it’s the thickness of the sheet that determines where the hand of our stopwatch stops. If the glass is just thick enough to make the stopwatch make one extra half turn as the photon travels through the glass from the front to the back, then we reach our maximum value of 16%, and so that’s what’s shown in the bottom half of the illustration above.
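Just to make that arithmetic tangible, here’s a small sketch (Python). The 0.2 amplitude is taken from the illustration; the free parameter is simply the angle between the two final arrows – how the thickness of the glass translates into that angle is exactly what the stopwatch story is about:

```python
import cmath

r = 0.2  # length of each reflection 'arrow', as in Feynman's illustration

def combined_probability(angle_between_arrows):
    """Add the front- and back-reflection amplitudes, then square the length of the sum."""
    front = r * cmath.exp(1j * 0.0)                    # reference direction
    back = r * cmath.exp(1j * angle_between_arrows)    # rotated by the stopwatch
    return abs(front + back) ** 2

for angle in (0.0, cmath.pi / 2, cmath.pi):
    print(f"angle {angle:.2f} rad -> probability {combined_probability(angle):.2%}")
# 0 rad   -> 16.00% (arrows aligned: the maximum)
# π/2 rad ->  8.00% (arrows at right angles: simply 4% + 4%)
# π rad   ->  0.00% (arrows opposite: they cancel out)
```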

For the sake of completeness, I need to note that the full explanation is actually a bit more complex. Just a little bit. 🙂 Indeed, there is no such thing as ‘surface reflection’ really: a photon has an amplitude for scattering by each and every layer of electrons in the glass and so we actually have many more arrows to add in order to arrive at a ‘final’ arrow. However, Feynman shows how all these arrows can be replaced by two so-called ‘radius arrows’: one for ‘front surface reflection’ and one for ‘back surface reflection’. The argument is relatively easy but I have no intention to fully copy Feynman here because the point here is only to illustrate how probabilities are calculated from probability amplitudes. So just remember: probabilities are real numbers between 0 and 1 (or between 0 and 100%), while amplitudes are complex numbers – or ‘arrows’ as Feynman calls them in this popular lecture series.

In order to give somewhat more credit to Feynman – and also to be somewhat more complete on how light really reflects from a sheet of glass (or a film of oil on water or a mud puddle), I copy one more illustration here – with the text – which speaks for itself: “The phenomenon of colors produced by the partial reflection of white light by two surfaces is called iridescence, and can be found in many places. Perhaps you have wondered how the brilliant colors of hummingbirds and peacocks are produced. Now you know.” The iridescence phenomenon is caused by really small variations in the thickness of the reflecting material indeed, and it is, perhaps, worth noting that Feynman is also known as the father of nanotechnology… 🙂

Iridescence

Light versus matter

So much for light – or electromagnetic waves in general. They consist of photons. Photons are discrete wave-packets of energy, and their energy (E) is related to the frequency of the light (f) through the Planck relation: E = hf. The factor h in this relation is the Planck constant, or the quantum of action in quantum mechanics, as this tiny number (6.62606957×10^(−34) J·s) is also referred to. Photons have no mass and, hence, they travel at the speed of light indeed. But what about the other wave-like particles, like electrons?

For these, we have probability amplitudes (or, more generally, a wave function) as well, the characteristics of which are given by the de Broglie relations. These de Broglie relations also associate a frequency and a wavelength with the energy and/or the momentum of the ‘wave-particle’ that we are looking at: f = E/h and λ = h/p. In fact, one will usually find those two de Broglie relations in a slightly different but equivalent form: ω = E/ħ and k = p/ħ. The symbol ω stands for the angular frequency, so that’s the frequency expressed in radians per second. In other words, ω is the speed with which the hand of that stopwatch is going round and round and round. Similarly, k is the wave number, i.e. the spatial frequency: the number of radians per unit of distance (it relates to the wavelength as k = 2π/λ). We use k and ω in wave functions because the argument of these wave functions is the phase of the probability amplitude, and this phase is expressed in radians. For more details on how we go from distance and time units to radians, I refer to my previous post. [Indeed, I need to move on here otherwise this post will become a book of its own! Just check out the following: λ = 2π/k and f = ω/2π.]
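As a quick numerical illustration of these relations – a sketch only, using the standard constants and a made-up (non-relativistic) electron speed:

```python
import math

h = 6.62607015e-34        # Planck constant (J·s)
hbar = h / (2 * math.pi)  # reduced Planck constant (J·s)
m_e = 9.1093837e-31       # electron mass (kg)

v = 1.0e6                 # made-up electron speed: 10^6 m/s
p = m_e * v               # momentum
E = p**2 / (2 * m_e)      # (non-relativistic) kinetic energy

wavelength = h / p        # λ = h/p
k = p / hbar              # wave number: k = p/ħ = 2π/λ
omega = E / hbar          # angular frequency: ω = E/ħ

print(f"λ = {wavelength:.3e} m")   # about 7.3e-10 m, i.e. a few ångströms
print(f"k = {k:.3e} rad/m")
print(f"ω = {omega:.3e} rad/s")
```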

How should we visualize a de Broglie wave for, let’s say, an electron? Well, I think the following illustration (which I took from Wikipedia) is not too bad.    

2000px-Quantum_mechanics_travelling_wavefunctions_wavelength

Let’s first look at the graph on the top of the left-hand side of the illustration above. We have a complex wave function Ψ(x) here but only the real part of it is being graphed. Also note that we only look at how this function varies over space at some fixed point of time, and so we do not have a time variable here. That’s OK. Adding the imaginary part would be nice but it would make the graph even more ‘complex’ :-), and looking at one point in space only and analyzing the amplitude as a function of time only would yield similar graphs. If you want to see an illustration with both the real as well as the imaginary part of a wave function, have a look at my previous post.

We also have the probability – that’s the red graph – computed from the probability amplitude: P = |Ψ(x)|² (so that’s just the modulus squared). What probability? Well, the probability that we can actually find the particle (let’s say an electron) at that location. Probability is obviously always positive (unlike the real (or imaginary) part of the probability amplitude, which oscillates around the x-axis). The probability is also reflected in the opacity of the little red ‘tennis ball’ representing our ‘wavicle’: the opacity varies as a function of the probability. So our electron is smeared out, so to say, over the space denoted as Δx.

Δx is the uncertainty about the position. The question mark next to the λ symbol (we’re still looking at the graph on the top left-hand side of the above illustration only: don’t look at the other three graphs now!) attributes this uncertainty to uncertainty about the wavelength. As mentioned in my previous post, wave packets, or wave trains, do not tend to have an exact wavelength indeed. And so, according to the de Broglie equation λ = h/p, if we cannot associate an exact value with λ, we will not be able to associate an exact value with p. Now that’s what’s shown on the right-hand side. In fact, because we’ve got a relatively good take on the position of this ‘particle’ (or wavicle we should say) here, we have a much wider interval for its momentum: Δp_x. [We’re only considering the horizontal component of the momentum vector p here, so that’s p_x.] Φ(p) is referred to as the momentum wave function, and |Φ(p)|² is the corresponding probability (or probability density as it’s usually referred to).

The two graphs at the bottom present the reverse situation: fairly precise momentum, but a lot of uncertainty about the wavicle’s position (I know I should stick to the term ‘particle’ – because that’s what physicists prefer – but I think ‘wavicle’ describes better what it’s supposed to be). So the illustration above is not only an illustration of the de Broglie wave function for a particle, but it also illustrates the Uncertainty Principle.

Now, I know I should move on to the thing I really want to write about in this post – i.e. bosons and fermions – but I feel I need to say a few things more about this famous ‘Uncertainty Principle’ – if only because I find it quite confusing. According to Feynman, one should not attach too much importance to it. Indeed, when introducing his simple arithmetic on probability amplitudes, Feynman writes the following about it: “The uncertainty principle needs to be seen in its historical context. When the revolutionary ideas of quantum physics were first coming out, people still tried to understand them in terms of old-fashioned ideas (such as, light goes in straight lines). But at a certain point, the old-fashioned ideas began to fail, so a warning was developed that said, in effect, ‘Your old-fashioned ideas are no damn good when…’ If you get rid of all the old-fashioned ideas and instead use the ideas that I’m explaining in these lectures – adding arrows for all the ways an event can happen – there is no need for the uncertainty principle!” So, according to Feynman, wave function math deals with all and everything and therefore we should, perhaps, indeed forget about this rather mysterious ‘principle’.

However, because it is mentioned so much (especially in the more popular writing), I did try to find some kind of easy derivation of its standard formulation: ΔxΔp ≥ ħ (ħ = h/2π, i.e. the quantum of angular momentum in quantum mechanics). To my surprise, it’s actually not easy to derive the uncertainty principle from other basic ‘principles’. As mentioned above, it follows from the de Broglie equation λ = h/p that momentum (p) and wavelength (λ) are related, but so how do we relate the uncertainty about the wavelength (Δλ) or the momentum (Δp) to the uncertainty about the position of the particle (Δx)? The illustration below, which analyzes a wave packet (aka a wave train), might provide some clue. Before you look at the illustration and start wondering what it’s all about, remember that a wave function with a definite (angular) frequency ω and wave number k (as described in my previous post), which we can write as Ψ = A·e^(i(ωt−kx)), represents the amplitude of a particle with a known momentum p = ħk at some point x and t, and that we had a big problem with such a wave, because the squared modulus of this function is a constant: |Ψ|² = |A·e^(i(ωt−kx))|² = A². So that means that the probability of finding this particle is the same at all points. So it’s everywhere and nowhere really (so it’s like the second wave function in the illustration above, but then with Δx infinitely long and the same wave shape all along the x-axis). Surely, we can’t have this, can we? Now we cannot – if only because of the fact that if we add up all of the probabilities, we would not get some finite number. So, in reality, particles are effectively confined to some region Δx (with x the position vector) or – if we limit our analysis to one dimension only (for the sake of simplicity) – Δx. So the probability amplitude of a particle is more likely to look like something that we refer to as a wave packet or a wave train. And so that’s what’s explained more in detail below.

Now, I said that localized wave trains do not tend to have an exact wavelength. What do I mean with that? It doesn’t sound very precise, does it? In fact, we actually can easily sketch a graph of a wave packet with some fixed wavelength (or fixed frequency), so what am I saying here? I am saying that, in quantum physics, we are only looking at a very specific type of wave train: they are a composite of a (potentially infinite) number of waves whose wavelengths are distributed more or less continuously around some average, as shown in the illustration below, and so the addition of all of these waves – or their superposition, as the process of adding waves is usually referred to – results in a combined ‘wavelength’ for the localized wave train that we cannot, indeed, equate with some exact number. I have not mastered the details of the mathematical process referred to as Fourier analysis (which refers to the decomposition of a combined wave into its sinusoidal components) as yet, and, hence, I am not in a position to quickly show you how Δx and Δλ are related exactly, but the point to note is that a wider spread of wavelengths results in a smaller Δx. Now, a wider spread of wavelengths corresponds to a wider spread in p too, and so there we have the Uncertainty Principle: the more precisely we pin down x, the less we know about p, and so that’s what the inequality ΔxΔp ≥ h/2π represents really.

Explanation of uncertainty principle

[Those who like to check things out may wonder why a wider spread in wavelength implies a wider spread in momentum. Indeed, if we just replace λ and p with Δλ and Δp in the de Broglie equation λ = h/p, we get Δλ = h/Δp and so we have an inversely proportional relationship here, don’t we? No. We can’t just write that Δλ = Δ(h/p): this Δ is not some mathematical operator that you can simply move inside of the brackets. What is Δλ? Is it a standard deviation? Is it the spread and, if so, what’s the spread? We could, for example, define it as the difference between some maximum value λ_max and some minimum value λ_min, so as Δλ = λ_max – λ_min. These two values would then correspond with p_max = h/λ_min and p_min = h/λ_max, and so the corresponding spread in momentum would be equal to Δp = p_max – p_min = h/λ_min – h/λ_max = h(λ_max – λ_min)/(λ_max·λ_min). So a wider spread in wavelength does result in a wider spread in momentum, but the relationship is more subtle than you might think at first. In fact, in a more rigorous approach, we would indeed see the standard deviation (represented by the sigma symbol σ) from some average as a measure of the ‘uncertainty’. To be precise, the more precise formulation of the Uncertainty Principle is: σ_x·σ_p ≥ ħ/2, but don’t ask me where that 2 comes from!]
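To see the superposition idea at work numerically, here is a small sketch (Python with NumPy; the Gaussian weighting of the wave numbers is just an illustrative choice of mine, not Feynman’s): it adds up many plane waves with wave numbers spread around some average, and shows that a wider spread in k (and hence in p = ħk) produces a narrower wave packet in x, and vice versa.

```python
import numpy as np

x = np.linspace(-50, 50, 2001)   # a one-dimensional stretch of space (arbitrary units)
k0 = 2.0                         # average wave number of the packet

def wave_packet(spread_k, n_waves=200):
    """Superpose n_waves plane waves e^(ikx) with wave numbers spread around k0."""
    ks = np.linspace(k0 - 3 * spread_k, k0 + 3 * spread_k, n_waves)
    weights = np.exp(-((ks - k0) ** 2) / (2 * spread_k ** 2))  # Gaussian weighting
    psi = (weights[:, None] * np.exp(1j * ks[:, None] * x)).sum(axis=0)
    return psi / np.abs(psi).max()

def width_x(psi):
    """A rough measure of Δx: the spread of |ψ|² around its mean position."""
    prob = np.abs(psi) ** 2
    prob /= prob.sum()
    mean = (x * prob).sum()
    return np.sqrt(((x - mean) ** 2 * prob).sum())

for dk in (0.05, 0.2, 0.8):
    print(f"spread in k = {dk:4.2f} -> spread in x ≈ {width_x(wave_packet(dk)):6.2f}")
# A larger spread in k gives a smaller spread in x – and the other way around.
```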

I really need to move on now, because this post is already way too lengthy and, hence, not very readable. So, back to that very first question: what’s that wave function math? Well, that’s obviously too complex a topic to be fully exhausted here. 🙂 I just wanted to present one aspect of it in this post: Bose-Einstein statistics. Huh? Yes.

When we say Bose-Einstein statistics, we should also say its opposite: Fermi-Dirac statistics. Bose-Einstein statistics were ‘discovered’ by the Indian scientist Satyendra Nath Bose (the only thing Einstein did was to give Bose’s work on this wider recognition) and they apply to bosons (so they’re named after Bose only), while Fermi-Dirac statistics apply to fermions (‘Fermi-Diraqions’ doesn’t sound good either obviously). Any particle, or any wavicle I should say, is either a fermion or a boson. There’s a strict dichotomy: you can’t have characteristics of both. No split personalities. Not even for a split second.

The best-known examples of bosons are photons and the recently experimentally confirmed Higgs particle. But, in case you have heard of them, gluons (which mediate the so-called strong interactions between particles), and the W+, W– and Z particles (which mediate the so-called weak interactions) are bosons too. Protons, neutrons and electrons, on the other hand, are fermions.

More complex particles, such as atomic nuclei, are also either bosons or fermions. That depends on the number of protons and neutrons they consist of. But let’s not get ahead of ourselves. Here, I’ll just note that bosons – unlike fermions – can pile on top of one another without limit, all occupying the same ‘quantum state’. This explains superconductivity, superfluidity and Bose-Einstein condensation at low temperatures. The latter two phenomena usually involve (bosonic) helium. You can’t do it with fermions. Superfluid helium has very weird properties, including zero viscosity – so it flows without dissipating energy and it creeps up the wall of its container, seemingly defying gravity: just Google one of the videos on the Web! It’s amazing stuff! Bose statistics also explain why photons of the same frequency can form coherent and extremely powerful laser beams, with (almost) no limit as to how much energy can be focused in a beam.

Fermions, on the other hand, avoid one another. Electrons, for example, organize themselves in shells around the atomic nucleus. They can never collapse into some kind of condensed cloud, as bosons can. If electrons were not fermions, we would not have such a variety of atoms with such a great range of chemical properties. But, again, let’s not get ahead of ourselves. Back to the math.

Bose versus Fermi particles

When adding two probability amplitudes (instead of probabilities), we are adding complex numbers (or vectors or arrows or whatever you want to call them), and so we need to take their phase into account or – to put it simply – their direction. If their phase is the same, the length of the new vector will be equal to the sum of the lengths of the two original vectors. When their phase is not the same, then the new vector will be shorter than the sum of the lengths of the two amplitudes that we are adding. How much shorter? Well, that obviously depends on the angle between the two vectors, i.e. the difference in phase: if it’s 180 degrees (or π radians), then they will cancel each other out and we have zero amplitude! So that’s destructive or negative interference. If it’s less than 90 degrees, then we will have constructive or positive interference.

It’s because of this interference effect that we have to add probability amplitudes first, before we can calculate the probability of an event happening in one or the other (indistinguishable) way (let’s say A or B) – instead of just adding probabilities as we would do in the classical world. It’s not subtle. It makes a big difference: |Ψ_A + Ψ_B|² is the probability when we cannot distinguish the alternatives (so when we’re in the world of quantum mechanics and, hence, we have to add amplitudes), while |Ψ_A|² + |Ψ_B|² is the probability when we can see what happens (i.e. we can see whether A or B was the case). Now, |Ψ_A + Ψ_B|² is definitely not the same as |Ψ_A|² + |Ψ_B|² – not for real numbers, and surely not for complex numbers either. But let’s move on with the argument – literally: I mean the argument of the wave function at hand here.
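Here’s a tiny numerical sketch (Python, with two made-up amplitudes of equal length) of just how different |Ψ_A + Ψ_B|² and |Ψ_A|² + |Ψ_B|² can be, depending on the relative phase:

```python
import cmath

def compare(relative_phase):
    """Two made-up amplitudes of length 0.5 each, with a variable relative phase."""
    psi_a = 0.5 * cmath.exp(1j * 0.0)
    psi_b = 0.5 * cmath.exp(1j * relative_phase)
    quantum = abs(psi_a + psi_b) ** 2               # add amplitudes first (indistinguishable)
    classical = abs(psi_a) ** 2 + abs(psi_b) ** 2   # add probabilities (distinguishable)
    return quantum, classical

for phase in (0.0, cmath.pi / 2, cmath.pi):
    q, c = compare(phase)
    print(f"relative phase {phase:.2f} rad: amplitudes first -> {q:.2f}, probabilities -> {c:.2f}")
# 0 rad: 1.00 vs 0.50 (fully constructive), π/2: 0.50 vs 0.50, π: 0.00 vs 0.50 (fully destructive)
```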

That stopwatch business above makes it easier to introduce the thought experiment which Feynman also uses to introduce Bose versus Fermi statistics (Feynman Lectures (1965), Vol. III, Lecture 4). The experimental set-up is shown below. We have two particles, which are being referred to as particle a and particle b respectively (so we can distinguish the two), heading straight for each other and, hence, they are likely to collide and be scattered in some other direction. The experimental set-up is designed to measure where they are likely to end up, i.e. to measure probabilities. [There’s no certainty in the quantum-mechanical world, remember?] So, in this experiment, we have a detector (or counter) at location 1 and a detector/counter at location 2 and, after many many measurements, we have some value for the (combined) probability that particle a goes to detector 1 and particle b goes to counter 2. The amplitude behind that probability is a complex number and you may expect it will depend on the angle θ as shown in the illustration below.

scattering identical particles

So this angle θ will obviously show up somehow in the argument of our wave function. Hence, the wave function, or probability amplitude, describing the amplitude of particle a ending up in counter 1 and particle b ending up in counter 2 will be some (complex) function Ψ_1 = f(θ). Please note, once again, that θ is not some (complex) phase but some real number (expressed in radians) between 0 and 2π that characterizes the set-up of the experiment above. It is also worth repeating that f(θ) is not the amplitude of particle a hitting detector 1 only but the combined amplitude of particle a hitting counter 1 and particle b hitting counter 2! It makes a big difference and it’s essential in the interpretation of this argument! So, the combined probability of particle a going to 1 and of particle b going to 2, which we will write as P_1, is equal to |Ψ_1|² = |f(θ)|².

OK. That’s obvious enough. However, we might also find particle a in detector 2 and particle b in detector 1. Surely, the probability amplitude for this should be equal to f(θ+π)? It’s just a matter of switching counters 1 and 2 – i.e. we rotate their position over 180 degrees, or π (in radians) – and then we just insert the new angle of this experimental set-up (so that’s θ+π) into the very same wave function and there we are. Right?

Well… Maybe. The probability of a going to 2 and b going to 1, which we will write as P_2, will be equal to |f(θ+π)|² indeed. However, our probability amplitude, which I’ll write as Ψ_2, may not be equal to f(θ+π). It’s just a mathematical possibility. I am not saying anything definite here. Huh? Why not?

Well… Think about the thing we said about the phase and the possibility of a phase shift: f(θ+π) is just one of the many mathematical possibilities for a wave function yielding a probability P_2 = |Ψ_2|² = |f(θ+π)|². But any function e^(iδ)·f(θ+π) will yield the same probability. Indeed, |z_1z_2| = |z_1||z_2| and so |e^(iδ)·f(θ+π)|² = (|e^(iδ)||f(θ+π)|)² = |e^(iδ)|²|f(θ+π)|² = |f(θ+π)|² (the square of the modulus of a complex number on the unit circle is always one – because the length of vectors on the unit circle is equal to one). It’s a general thing: if Ψ is some wave function (i.e. it describes some complex amplitude in space and time), then e^(iδ)·Ψ is the same wave function but with a phase shift equal to δ. Huh? Yes. Think about it: we’re multiplying complex numbers here, so that’s adding angles and multiplying lengths. Now the length of e^(iδ) is 1 (because it’s a complex number on the unit circle) but its phase is δ. So multiplying Ψ with e^(iδ) does not change the length of Ψ but it does shift its phase by an amount (in radians) equal to δ. That should be easy enough to understand.
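If you want to convince yourself numerically, here’s a three-line sketch (Python; the amplitude and the phase shift δ are arbitrary made-up values) showing that multiplying by e^(iδ) rotates the arrow but leaves its length – and hence the probability – untouched:

```python
import cmath

psi = 0.3 * cmath.exp(1j * 1.2)    # some arbitrary amplitude: length 0.3, phase 1.2 rad
delta = 0.7                        # some arbitrary phase shift

shifted = cmath.exp(1j * delta) * psi
print(abs(psi) ** 2, abs(shifted) ** 2)         # both 0.09: the probability does not change
print(cmath.phase(psi), cmath.phase(shifted))   # 1.2 and 1.9: the phase shifts by exactly δ
```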

You probably wonder why I am being so fussy, and what that δ could be, or why it would be there. After all, we do have a well-behaved wave function f(θ) here, depending on x, t and θ, and so the only thing we did was to change the angle θ (we added π radians to it). So why would we need to insert a phase shift here? Well… I don’t know. That’s what δ really is: some random phase shift, and this phase factor is just a mathematical possibility for now. So we just assume that, for some reason which we don’t understand right now, there might be some ‘arbitrary phase factor’ (that’s what Feynman calls δ) coming into play when we ‘exchange’ the ‘role’ of the particles. So maybe that δ is there, but maybe not. I admit it looks very ugly. In fact, if the story about Bose’s ‘discovery’ of this ‘mathematical possibility’ (in 1924) is correct, then it all started with an obvious ‘mistake’ in a quantum-mechanical calculation – but a ‘mistake’ that, miraculously, gave predictions that agreed with experimental results that could not be explained without introducing this ‘mistake’. So let the argument go full circle – literally – and take your time to appreciate the beauty of argumentation in physics.

Let’s swap detector 1 and detector 2 a second time, so we ‘exchange’ particle a and b once again. So then we need to apply this phase factor δ once again and, because of symmetry in physics, we obviously have to use the same phase factor δ – not some other value γ or something. We’re only rotating our detectors once again. That’s it. So all the rest stays the same. Of course, we also need to add π once more to the argument in our wave function f. In short, the amplitude for this is:

e^(iδ)·[e^(iδ)·f(θ+π+π)] = (e^(iδ))²·f(θ+2π) = e^(i2δ)·f(θ)

Indeed, the angle θ+2π is the same as θ. But so we have twice that phase shift now: 2δ. As ugly as that ‘thing’ above: e^(iδ)·f(θ+π). However, if we square the amplitude, we get the same probability: P_1 = |Ψ_1|² = |e^(i2δ)·f(θ)|² = |f(θ)|². So it must be right, right? Yes. But – Hey! Wait a minute! We are obviously back at where we started, aren’t we? We are looking at the combined probability – and amplitude – for particle a going to counter 1 and particle b going to counter 2, and the angle is θ! So it’s the same physical situation, and – What the heck! – reality doesn’t change just because we’re rotating these detectors a couple of times, does it? [In fact, we’re actually doing nothing but a thought experiment here!] Hence, not only the probability but also the amplitude must be the same. So (e^(iδ))²·f(θ) must equal f(θ) and so… Well… If (e^(iδ))²·f(θ) = f(θ), then (e^(iδ))² must be equal to 1. Now, what does that imply for the value of δ?

Well… While the square of the modulus of all vectors on the unit circle is always equal to 1, there are only two cases for which the square of the vector itself yields 1: (I) e^(iδ) = e^(iπ) = –1 (check it: (e^(iπ))² = e^(i2π) = e^(i0) = +1), and (II) e^(iδ) = e^(i2π) = e^(i0) = +1 (check it: (e^(i2π))² = e^(i4π) = e^(i0) = +1). In other words, our phase factor δ is either δ = 0 (or 0 ± 2nπ) or, else, δ = π (or π ± 2nπ). So e^(iδ) = ±1 and Ψ_2 is either +f(θ+π) or, else, –f(θ+π). What does this mean? It means that, if we’re going to be adding the amplitudes, then the ‘exchanged case’ may contribute with the same sign or, else, with the opposite sign.
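For those who like to check such things numerically, here’s a small sketch (Python) that scans phase shifts δ between 0 and 2π and keeps only those for which (e^(iδ))² comes back to +1:

```python
import cmath

# Scan δ in steps of one degree and keep the values for which (e^(iδ))² equals +1.
solutions = []
for n in range(360):
    delta = n * 2 * cmath.pi / 360
    if abs(cmath.exp(1j * delta) ** 2 - 1) < 1e-9:
        solutions.append(round(delta, 6))
print(solutions)   # [0.0, 3.141593]: only δ = 0 (the Bose case) and δ = π (the Fermi case) survive
```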

But, surely, there is no need to add amplitudes here, is there? Particle a can be distinguished from particle b and so the first case (particle a going into counter 1 and particle b going into counter 2) is not the same as the ‘exchanged case’ (particle a going into counter 2 and b going into counter 1). So we can clearly distinguish or verify which of the two possible paths is followed and, hence, we should be adding probabilities if we want to get the combined probability for both cases, not amplitudes. Now that is where the fun starts. Suppose that we have identical particles here – so not some beam of α-particles (i.e. helium nuclei) bombarding beryllium nuclei for instance but, let’s say, electrons on electrons, or photons on photons indeed – then we do have to add the amplitudes, not the probabilities, in order to calculate the combined probability of a particle going into counter 1 and the other particle going into counter 2, for the simple reason that we don’t know which is which and, hence, which is going where.

Let me immediately throw in an important qualifier: defining ‘identical particles’ is not as easy as it sounds. Our ‘wavicle’ of choice, for example, an electron, can have its spin ‘up’ or ‘down’ – and so that’s two different things. When an electron arrives in a counter, we can measure its spin (in practice or in theory: it doesn’t matter in quantum mechanics) and so we can distinguish it and, hence, an electron that’s ‘up’ is not identical to one that’s ‘down’. [I should resist the temptation but I’ll quickly make the remark: that’s the reason why we have two electrons in one atomic orbital: one is ‘up’ and the other one is ‘down’. Identical particles need to be in the same ‘quantum state’ (that’s the standard expression for it) to end up as ‘identical particles’ in, let’s say, a laser beam or so. As Feynman states it: in this (theoretical) experiment, we are talking polarized beams, with no mixture of different spin states.]

The wonderful thing in quantum mechanics is that mathematical possibility usually corresponds with reality. For example, electrons with positive charge (positrons), or anti-matter in general, are not only a theoretical possibility: they exist. Likewise, we effectively have particles which interfere with positive sign – these are called Bose particles – and particles which interfere with negative sign – Fermi particles.

So that’s reality. The factor e^(iδ) = ±1 is there, and it’s a strict dichotomy: photons, for example, always behave like Bose particles, and protons, neutrons and electrons always behave like Fermi particles. So they don’t change their mind and switch from one to the other category, not for a short while, and not for a long while (or forever) either. In fact, you may or may not be surprised to hear that there are experiments trying to find out if they do – just in case. 🙂 For example, just Google for Budker and English (2010) from the University of California at Berkeley. The experiments confirm the dichotomy: no split personalities here, not even for a nanosecond (10^(−9) s), or a picosecond (10^(−12) s). [A picosecond is the time taken by light to travel 0.3 mm in a vacuum. In a nanosecond, light travels about one foot.]

In any case, does all of this really matter? What’s the difference, in practical terms that is? Between Bose or Fermi, I must assume we prefer the booze.

It’s quite fundamental, however. Hang in there for a while and you’ll see why.

Bose statistics

Suppose we have, once again, some particle a and b that (i) come from different directions (but, this time around, not necessarily in the experimental set-up as described above: the two particles may come from any direction really), (ii) are being scattered, at some point in space (but, this time around, not necessarily the same point in space), (iii) end up going in one and the same direction and – hopefully – (iv) arrive together at some other point in space. So they end up in the same state, which means they have the same direction and energy (or momentum) and also whatever other condition that’s relevant. Again, if the particles are not identical, we can catch both of them and identify which is which. Now, if it’s two different particles, then they won’t take exactly the same path. Let’s say they travel along two infinitesimally close paths referred to as path 1 and 2 and so we should have two infinitesimally small detectors: one at location 1 and the other at location 2. The illustration below (credit to Feynman once again!) is for n particles, but here we’ll limit ourselves to the calculations for just two.

Boson particles

Let’s denote the amplitude of a to follow path 1 (and end up in counter 1) as a_1, and the amplitude of b to follow path 2 (and end up in counter 2) as b_2. Then the amplitude for these two scatterings to occur at the same time is the product of these two amplitudes, and so the probability is equal to |a_1b_2|² = [|a_1||b_2|]² = |a_1|²|b_2|². Similarly, the combined probability of a following path 2 (and ending up in counter 2) and b following path 1 (etcetera) is |a_2|²|b_1|². But so we said that the directions 1 and 2 were infinitesimally close and, hence, the values for a_1 and a_2, and for b_1 and b_2, should also approach each other, so we can equate them with a and b respectively and, hence, the probability of some kind of combined detector picking up both particles as they hit the counter is equal to P = 2|a|²|b|² (just substitute and add). [Note: For those who would think that separate counters and ‘some kind of combined detector’ radically alter the set-up of this thought experiment (and, hence, that we cannot just do this kind of math), I refer to Feynman (Vol. III, Lecture 4, section 4): he shows how it works using differential calculus.]

Now, if the particles cannot be distinguished – so if we have ‘identical particles’ (like photons, or polarized electrons) – and if we assume they are Bose particles (so they interfere with a positive sign – i.e. like photons, but not like electrons), then we should no longer add the probabilities but the amplitudes, so we get a_1b_2 + a_2b_1 = 2ab for the amplitude and – lo and behold! – a probability equal to P = |2ab|² = 4|a|²|b|². So what? Well… We’ve got a factor 2 difference here: 4|a|²|b|² is two times 2|a|²|b|².
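A quick numerical check of that factor 2 (Python; the amplitudes a and b are arbitrary made-up complex numbers, and we take the limit a_1 = a_2 = a and b_1 = b_2 = b described above):

```python
# Two arbitrary (made-up) scattering amplitudes, in the limit where the two
# detector directions coincide: a1 = a2 = a and b1 = b2 = b.
a = 0.3 * complex(0.8, 0.6)    # some complex amplitude for particle a
b = 0.2 * complex(0.6, -0.8)   # some complex amplitude for particle b

# Distinguishable particles: add the two probabilities.
p_distinguishable = abs(a * b) ** 2 + abs(a * b) ** 2    # = 2|a|²|b|²

# Identical Bose particles: add the two amplitudes first, then square.
p_bose = abs(a * b + a * b) ** 2                         # = |2ab|² = 4|a|²|b|²

print(p_bose / p_distinguishable)   # prints 2.0: twice as likely to end up in the same state
```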

This is a strange result: it means we’re twice as likely to find two identical Bose particles scattered into the same state as you would assuming the particles were different. That’s weird, to say the least. In fact, it gets even weirder, because this experiment can easily be extended to a situation where we have n particles present (which is what the illustration suggests), and that makes it even more interesting (more ‘weird’ that is). I’ll refer to Feynman here for the (fairly easy but somewhat lengthy) calculus in case we have n particles, but the conclusion is rock-solid: if we have n bosons already present in some state, then the probability of getting one extra boson is n+1 times greater than it would be if there were none before.

So the presence of the other particles increases the probability of getting one more: bosons like to crowd. And there’s no limit to it: the more bosons you have in one space, the more likely it is another one will want to occupy the same space. It’s this rather weird phenomenon which explains equally weird things such as superconductivity and superfluidity, or why photons of the same frequency can form such powerful laser beams: they don’t mind being together – literally on the same spot – in huge numbers. In fact, they love it: a laser beam, superfluidity or superconductivity are actually quantum-mechanical phenomena that are visible at a macro-scale.

OK. I won’t go into any more detail here. Let me just conclude by showing how interference works for Fermi particles. Well… That doesn’t work or, let me be more precise, it leads to the so-called (Pauli) Exclusion Principle which, for electrons, states that “no two electrons can be found in exactly the same state (including spin).” Indeed, we get a_1b_2 – a_2b_1 = ab – ab = 0 (zero!) if we let the values of a_1 and a_2, and b_1 and b_2, come arbitrarily close to each other. So the amplitude becomes zero as the two directions (1 and 2) approach each other. That simply means that it is not possible at all for two electrons to have the same momentum, location or, in general, the same state of motion – unless they are spinning opposite to each other (in which case they are not ‘identical’ particles). So what? Well… Nothing much. It just explains all of the chemical properties of atoms. 🙂
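And here is the mirror-image check for Fermi particles (Python again; the way the made-up amplitudes depend on the detector direction is purely illustrative): the exchanged case now enters with a minus sign, and we let the two directions approach each other gradually.

```python
import cmath

def amp_a(direction):
    """Made-up amplitude for particle a to scatter into a given direction (radians)."""
    return 0.3 * cmath.exp(1j * 1.0 * direction)

def amp_b(direction):
    """Made-up amplitude for particle b to scatter into a given direction (radians)."""
    return 0.2 * cmath.exp(1j * 2.5 * direction)

def fermi_probability(dir1, dir2):
    """Antisymmetric combination: a into 1 and b into 2, minus the exchanged case."""
    return abs(amp_a(dir1) * amp_b(dir2) - amp_a(dir2) * amp_b(dir1)) ** 2

for gap in (0.5, 0.1, 0.01, 0.0):
    print(f"gap between the two directions {gap:5.2f} rad -> probability {fermi_probability(0.0, gap):.8f}")
# As the two directions merge, the probability drops to exactly zero:
# two identical fermions refuse to end up in the same state.
```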

In addition, the Pauli exclusion principle also explains the stability of matter on a larger scale: protons and neutrons are fermions as well, and so they just “don’t get close together with one big smear of electrons around them”, as Feynman puts it, adding: “Atoms must keep away from each other, and so the stability of matter on a large scale is really a consequence of the Fermi particle nature of the electrons, protons and neutrons.”

Well… There’s nothing much to add to that, I guess. 🙂

Post scriptum:

I wrote that “more complex particles, such as atomic nuclei, are also either bosons or fermions”, and that this depends on the number of protons and neutrons they consist of. In fact, bosons are, in general, particles with integer spin (0 or 1), while fermions have half-integer spin (1/2, 3/2, etc.). Bosonic Helium-4 (He-4) has zero spin. Photons (which mediate electromagnetic interactions), gluons (which mediate the so-called strong interactions between particles), and the W+, W– and Z particles (which mediate the so-called weak interactions) all have spin one (1). As mentioned above, Lithium-7 (Li-7) has half-integer spin (3/2). The underlying reason for the difference in spin between He-4 and Li-7 is their composition indeed: He-4 consists of two protons and two neutrons, while Li-7 consists of three protons and four neutrons.

However, we have to go beyond the protons and neutrons for some better explanation. We now know that protons and neutrons are not ‘fundamental’ any more: they consist of quarks, and quarks have a spin of 1/2. It is probably worth noting that Feynman did not know this when he wrote his Lectures in 1965, although he briefly sketches the findings of Murray Gell-Mann and George Zweig, who published their findings in 1961 and 1964 only, so just a little bit before, and describes them as ‘very interesting’. I guess this is just another example of Feynman’s formidable intellect and intuition… In any case, protons and neutrons are so-called baryons: they consist of three quarks, as opposed to the short-lived (unstable) mesons, which consist of one quark and one anti-quark only (you may not have heard about mesons – they don’t live long – and so I won’t say anything about them). Now, an uneven number of quarks results in half-integer spin, and so that’s why protons and neutrons have half-integer spin. An even number of quarks results in integer spin, and so that’s why mesons have spin 0 or 1. Two protons and two neutrons together, so that’s He-4, can condense into a bosonic state with spin zero, because four half-integer spins allow for an integer sum. Seven half-integer spins, however, cannot be combined into some integer spin, and so that’s why Li-7 has half-integer spin (3/2). Electrons have half-integer spin (1/2) too. So there you are.

Now, I must admit that this spin business is a topic of which I understand little – if anything at all. And so I won’t go beyond the stuff I paraphrased or quoted above. The ‘explanation’ surely doesn’t ‘explain’ this fundamental dichotomy between bosons and fermions. In that regard, Feynman’s 1965 conclusion still stands: “It appears to be one of the few places in physics where there is a rule which can be stated very simply, but for which no one has found a simple and easy explanation. The explanation is deep down in relativistic quantum mechanics. This probably means that we do not have a complete understanding of the fundamental principle involved. For the moment, you will just have to take it as one of the rules of the world.”