While hurrying to try to understand the things I wanted to understand most – like Schrödinger’s equation and, equally important, its solutions explaining the weird shapes of electron orbitals – I skipped some interesting bits and pieces. Worse, I skipped two or three of Feynman’s Lectures on quantum mechanics entirely. These include Chapter 17 – on symmetry and conservation laws – and Chapter 18 – on angular momentum. With the benefit of hindsight, that was not the right thing to do. If anything, doing all of the Lectures would, at the very least, ensure I would have more than an ephemeral grasp of it all. So… In this and the next post, I want to tidy up and go over everything I skipped so far. 🙂
We’ve written a lot on how quantum mechanics applies to bosons as well as fermions. For example, we pointed out – in great detail – that the mathematical structure of the electromagnetic wave – light! 🙂 – is quite similar to that of the ubiquitous wavefunction. Equally fundamental – if not more – is the fact that light also arrives in lumps – little light-particles which we call photons. It’s the photoelectric effect, which Einstein explained in 1905 by… Well… By telling us that light consists of quanta – photons – whose energy must be high enough to dislodge an electron. It’s what got him his Nobel Prize. [Einstein never got a Nobel Prize for his relativity theory, which is – arguably – at least as important. There’s a lot of controversy around that but, in any case, that’s history.]
So it shouldn’t surprise you that there’s an equivalent to the spin of an electron. With spin, we refer to the angular momentum of a quantum-mechanical system – an atom, a nucleus, an electron, whatever – which, as you know, can only be one of a set of discrete values when measured along some direction, which we usually refer to as the z-direction. More formally, we write that the z-component of the angular momentum J is equal to
Jz = j·ħ, (j-1)·ħ, (j-2)·ħ, …, -(j-2)·ħ, -(j-1)·ħ, -j·ħ
The j in this expression is the so-called spin of the system. For an electron, it’s equal to 1/2, so Jz is either +ħ/2 or −ħ/2, which we referred to as “up” and “down” states respectively for obvious reasons: one state points upwards – more or less, that is (we know the angular momentum will actually precess around the direction of the magnetic field) – while the other points downwards.
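To make the ladder of allowed values concrete, here is a minimal Python sketch that lists the Jz values (in units of ħ) for a given spin j. The function name `jz_values` is just an illustrative choice, not something from Feynman’s text.

```python
from fractions import Fraction


def jz_values(j):
    """Return the allowed z-components of angular momentum, in units of h-bar.

    The values run from +j down to -j in steps of one, so there are 2j + 1
    of them.
    """
    j = Fraction(j)
    # A spin must be 0, 1/2, 1, 3/2, ... so 2j must be a non-negative integer.
    if j < 0 or (2 * j).denominator != 1:
        raise ValueError("spin must be a non-negative integer or half-integer")
    n_states = int(2 * j) + 1
    return [j - k for k in range(n_states)]


electron = jz_values(Fraction(1, 2))   # two values: +1/2 and -1/2
spin_one = jz_values(1)                # three values: +1, 0 and -1
print(electron, spin_one)
```

Using `Fraction` keeps the half-integer values exact, which makes it easy to see that a spin-1/2 system has two states while a spin-one system would, in general, have three.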
We also know that the magnetic energy of an electron in a (weak) magnetic field – which, as you know, we conveniently assume to be pointing in the same z-direction, so Bz = B – will be equal to:
Umag = g·μz·B·j = ± 2·μz·B·(1/2) = ± μz·B = ± B·(qe·ħ)/(2m)
In short, the magnetic energy is proportional to the magnetic field, and the constant of proportionality is the so-called Bohr magneton qe·ħ/2m. So far, so good. What’s the analog for a photon?
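To get a feel for the numbers, the sketch below evaluates the Bohr magneton qe·ħ/2m and the two magnetic energies ± μ·B. The constants are the CODATA 2018 values in SI units; the function names and the 1 tesla field are just illustrative assumptions.

```python
# CODATA 2018 values, SI units
Q_E = 1.602176634e-19    # elementary charge, in coulomb
HBAR = 1.054571817e-34   # reduced Planck constant, in joule-seconds
M_E = 9.1093837015e-31   # electron mass, in kilogram


def bohr_magneton():
    """Bohr magneton q_e * h-bar / (2 m), in joule per tesla."""
    return Q_E * HBAR / (2 * M_E)


def magnetic_energies(b_field):
    """The two magnetic energies +mu*B and -mu*B, in joule, for B in tesla."""
    mu = bohr_magneton()
    return (+mu * b_field, -mu * b_field)


print(bohr_magneton())          # about 9.274e-24 J/T
print(magnetic_energies(1.0))   # the two energies in a 1 tesla field
```

So in a field of one tesla the two energies are only about 10^-23 joule away from zero, which is tiny compared to the energy of a photon of visible light.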
Well… Let’s first discuss the equivalent of a Stern-Gerlach apparatus for photons. That would be a polarizing material, like a piece of calcite, for example. Now, it is, unfortunately, much more difficult to explain how a polarizing material works than to explain how a Stern-Gerlach apparatus works. [If you thought the workings of that (hypothetical) Stern-Gerlach filter were difficult to understand, think again.] We actually have different types of polarizers – some complicated, some easy. We’ll take the easy ones: linear ones. In addition, the phenomenon of polarization itself is a bit intricate. The phenomenon is well described in Chapter 33 of Feynman’s first Volume of Lectures, out of which I copied the two illustrations below the next paragraph.
Of course, to make sure you think about whatever it is that you’re reading, Feynman now chooses the z-direction such that it coincides with the direction of propagation of the electromagnetic radiation. So it’s now the x– and y-direction that we’re looking at. Not the z-direction any more. As usual, we forget about the magnetic field vector B and so we think of the oscillating electric field vector E only. Why can we forget about B? Well… If we have E, we know B. Full stop. As you know, I think B is pretty essential in the analysis too but… Well… You’ll see all textbooks on physics quickly forget about B when describing light. I don’t want to do that, but… Well… I need to move on. [I’ll come back to the matter – sideways – at the end of this post. :-)]
So we know the electric field vector E may oscillate in a plane (so that’s up and down and back again) but – interestingly enough – its direction may also rotate around the z-axis (again, remember the z-axis is the direction of propagation). Why? Well… Because E has an x– and a y-component (no z-component!), and these two components may oscillate in phase or out of phase, and so all of the combinations below are possible.

To make a long story short, light comes in two varieties: linearly polarized and elliptically polarized. Of course, elliptically may be circularly – if you’re lucky! 🙂
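The in-phase versus out-of-phase idea can be sketched in a few lines of Python. The function below is a hypothetical helper (name and tolerance are my own assumptions): it takes the amplitudes of the x- and y-components and their phase difference, and tells us which kind of polarization we get. It reports the circular case separately, even though it is just a special case of elliptical polarization.

```python
import math


def classify_polarization(e_x, e_y, phase_difference, tol=1e-9):
    """Classify light from its x/y amplitudes and their phase difference (radians)."""
    if e_x < 0 or e_y < 0:
        raise ValueError("amplitudes must be non-negative")
    # Only one component oscillates: E stays on a line.
    if e_x < tol or e_y < tol:
        return "linear"
    # Components in phase (or exactly opposite): E still stays on a line.
    if abs(math.sin(phase_difference)) < tol:
        return "linear"
    # Equal amplitudes and a quarter-cycle phase difference: E traces a circle.
    if abs(math.cos(phase_difference)) < tol and abs(e_x - e_y) < tol:
        return "circular"
    # Everything else: E traces an ellipse.
    return "elliptical"


print(classify_polarization(1.0, 1.0, 0.0))
print(classify_polarization(1.0, 1.0, math.pi / 2))
print(classify_polarization(1.0, 0.5, math.pi / 2))
```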
Now, a (linear) polarizer has an optical axis, and only light whose E vector is oscillating along that axis will go through. […] OK. That’s not true: the component along the optical axis of some E pointing in some other direction will go through too! I’ll show how that works in a moment. But so all the rest is absorbed, and the absorbed energy just heats up the polarizer (which, of course, then radiates heat back out).
In any case, if the optical axis happens to be our x-axis, then we know that the light that comes through will be x-polarized, so that corresponds to the rather peculiar Ex = 1 and Ey = 0 notation. [This notation refers to coefficients we’ll use later to resolve states into base states – but don’t worry about it now.] Needless to say, you shouldn’t confuse the electric field vector E with the energy of our photon, which we denote as E. No bold letter here. No subscript. 🙂
Pfff… This introduction is becoming way too long. What about our photon? We want to talk about one photon only and we’ve already written over a page and haven’t started yet. 🙂
Well… First, we must note that we’ll assume the light is perfectly monochromatic, so all photons will have an energy that’s equal to E = h·f, so the energy is proportional to the frequency of our light, and the constant of proportionality is Planck’s constant. That’s Einstein’s relation, not a de Broglie relation. Just remember: we’re talking definite energy states here.
Second – and much more importantly – we may define two base states for our photon, |x〉 and |y〉 respectively, which correspond to the classical linear x– and y-polarization. So a photon can be in state |x〉 or |y〉 but, as usual, it is much more likely to be in some state that is some linear combination of these two base states.
OK. Now we can start playing with these ideas. Imagine a polarizer – or polaroid, as Feynman calls it – whose optical axis is tilted – say, it’s at an angle θ from the x-axis, as shown below. Classically, the light that comes through will be polarized in the x’-direction, which we associate with that angle θ. So we say the photons will be in the |x‘〉 state. So far, so good. But what happens if we have two polarizers, set up as shown below, with the optical axis of the first one at an angle θ, which is, say, equal to 30°? Will any light get through?
Well? No answer? […] Think about it. What happens classically? […] No answer? Let me tell you. In a classical analysis, we’d say that only the x-component of the light that comes through the first polarizer would get through the second one. Huh? Yes. It is not all or nothing in a classical analysis. This is where the magnitude of E comes in, which we’ll write as E0, so as to not confuse it with the energy E. [I know you’ll confuse it anyway but… Well… I need to move on or I won’t get anywhere with this story.] So if E0 is the (maximum) magnitude (or amplitude – in the classical sense of the word, that is) of E as the light leaves the first polarizer, then its x-component will be equal to E0·cosθ. [I don’t need to make a drawing here, do I?] Of course, you know that the intensity of the light will be proportional to the square of the (maximum) field, which is equal to E0²·cos²θ = 0.75·E0² for θ = 30°.
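The classical calculation is easy to check numerically. This is a minimal sketch, with a function name of my own choosing, that returns the fraction of the intensity that gets through the second polarizer for a given angle θ between the two optical axes:

```python
import math


def transmitted_fraction(theta_degrees):
    """Fraction of the intensity passing a polarizer whose axis is at an
    angle theta from the polarization direction of the incoming light.

    The field component along the axis is E0*cos(theta), and the intensity
    goes with the square of the field, so the fraction is cos(theta)**2.
    """
    theta = math.radians(theta_degrees)
    return math.cos(theta) ** 2


for angle in (0, 30, 45, 90):
    print(angle, round(transmitted_fraction(angle), 4))
```

For 30° this gives 0.75 (up to floating-point rounding), while crossed polarizers (90°) let essentially nothing through.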
So our classical theory says that only 3/4 of the energy that we were sending in will get through. The rest (1/4) will be absorbed. So how do we model that quantum-mechanically? It’s amazingly simple. We’ve already associated the |x‘〉 state with the photons coming out of the first polaroid, and so now we’ll just say that this |x‘〉 state is equal to the following linear combination of the |x〉 and |y〉 base states:
|x‘〉 = cosθ·|x〉 + sinθ·|y〉
Huh? Yes. As Feynman puts it, we should think our |x‘〉 beam of photons can, somehow, be resolved into |x〉 and |y〉 beams. Of course, we’re talking amplitudes here, so we’re talking 〈x|x‘〉 and 〈y|x‘〉 amplitudes here, and the absolute square of those amplitudes will give us the probability that a photon in the |x‘〉 state gets into the |x〉 and |y〉 state respectively. So how do we calculate that? Well… If |x‘〉 = cosθ·|x〉 + sinθ·|y〉, then we can obviously write the following:
〈x|x‘〉 = cosθ·〈x|x〉 + sinθ·〈x|y〉
Now, we know that 〈x|y〉 = 0, because |x〉 and |y〉 are base states. Because of the same reason, 〈x|x〉 = 1. That’s just an implication of the definition of base states: 〈i|j〉 = δij. So we get:
〈x|x‘〉 = cosθ
Lo and behold! The absolute square of that is equal to cos²θ, so for θ = 30° each of these photons has an (average) probability of 3/4 to get through. So if we were to have like 10 billion photons, then some 7.5 billion of them would get through. As these photons are all associated with a definite energy – and they go through as one whole, of course (no such thing as a 3/4 photon!) – we find that 3/4 of all of the energy goes through. The quantum-mechanical theory gives the same result as the classical theory – as it should, in this case at least!
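The same bookkeeping can be done in code. The sketch below represents a polarization state as a pair of amplitudes on the |x〉 and |y〉 base states, and computes the 〈x|x’〉 and 〈y|x’〉 amplitudes as inner products. The names `x_prime`, `amplitude` and `probability` are illustrative choices, not standard library functions.

```python
import math

# Base states, written as (amplitude on |x>, amplitude on |y>)
X = (1.0, 0.0)
Y = (0.0, 1.0)


def x_prime(theta_degrees):
    """The state |x'> = cos(theta)|x> + sin(theta)|y>."""
    t = math.radians(theta_degrees)
    return (math.cos(t), math.sin(t))


def amplitude(final, initial):
    """The amplitude <final|initial>: conjugate the bra, then sum the products."""
    return sum(complex(f).conjugate() * i for f, i in zip(final, initial))


def probability(final, initial):
    """The absolute square of the amplitude."""
    return abs(amplitude(final, initial)) ** 2


state = x_prime(30)
p_x = probability(X, state)   # chance to get through an x-polarizer
p_y = probability(Y, state)   # chance to get through a y-polarizer
print(p_x, p_y, p_x + p_y)
```

The two probabilities come out as roughly 0.75 and 0.25, and they add up to one, as they should: every photon ends up in either the |x〉 or the |y〉 state.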
Now that’s all good for linear polarization. What about elliptical or circular polarization? Hmm… That’s a bit more complicated, but equally feasible. If we denote the state of a photon with a right-hand circular polarization (RHC) as |R〉 and, likewise, the state of a photon with a left-hand circular polarization (LHC) as |L〉, then we can write these as the following linear combinations of our base states |x〉 and |y〉:

|R〉 = (1/√2)·(|x〉 + i·|y〉)
|L〉 = (1/√2)·(|x〉 − i·|y〉)

That’s where those coefficients under illustrations (c) and (g) come in, although I think they’ve got the sign of i (the imaginary unit) wrong. 🙂 So how does it work? Well… That 1/√2 factor is – obviously – just there to make sure everything’s normalized, so all probabilities over all states add up to 1. So that is taken care of and now we just need to explain how and why we’re adding |x〉 and |y〉. For |R〉, the amplitudes must be the same but with a phase difference of 90°. That corresponds to the sine and cosine function, which are the same except for a phase difference of π/2 (90°), indeed: sin(φ + π/2) = cosφ. Now, a phase shift of 90° corresponds to a multiplication with the imaginary unit i. Indeed, i = e^(i·π/2) and, therefore, it is obvious that e^(i·π/2)·e^(i·φ) = e^(i·(φ + π/2)).
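Here is a small numerical check of these statements. It assumes the sign convention |R〉 = (|x〉 + i·|y〉)/√2 and |L〉 = (|x〉 − i·|y〉)/√2, which is the one used in Feynman’s Volume III; if you prefer the opposite convention, just swap the two definitions. The helper name `amplitude` is my own.

```python
import cmath
import math

S = 1 / math.sqrt(2)   # the normalization factor

# States as (amplitude on |x>, amplitude on |y>); sign convention is an assumption
R = (S, 1j * S)
L = (S, -1j * S)
X = (1.0, 0.0)


def amplitude(final, initial):
    """The amplitude <final|initial>: conjugate the bra, then sum the products."""
    return sum(complex(f).conjugate() * i for f, i in zip(final, initial))


# Multiplying by i is the same as shifting the phase by 90 degrees.
phase_shift = cmath.exp(1j * math.pi / 2)

norm_R = amplitude(R, R)          # should be 1: |R> is normalized
overlap_RL = amplitude(R, L)      # should be 0: |R> and |L> are orthogonal
p_x_given_R = abs(amplitude(X, R)) ** 2   # chance an RHC photon passes an x-polarizer

# Adding the two circular states gives back |x>, up to the 1/sqrt(2) factor.
x_rebuilt = tuple(S * (r + l) for r, l in zip(R, L))

print(phase_shift, norm_R, overlap_RL, p_x_given_R, x_rebuilt)
```

So a circularly polarized photon has a 50% chance of getting through a linear polarizer, whatever the orientation of its axis in this simple x/y picture, and the |R〉 and |L〉 states behave like a proper pair of base states.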
Of course, if we can write RHC and LHC states as a linear combination of the base states |x〉 and |y〉, then you’ll believe me if I say that we can write any polarization state – including non-circular elliptical ones – as a linear combination of these base states. Now, there are two or three other things I’d like to point out here:
1. The RHC and LHC states can be used as base states themselves – so they satisfy all of the conditions for a set of base states. Indeed, it’s easy to add and then subtract the two equations above to get the following:

|x〉 = (1/√2)·(|R〉 + |L〉)
|y〉 = −(i/√2)·(|R〉 − |L〉)

As an exercise, you should verify the right and left polarization states effectively satisfy the conditions for a set of base states.
2. We can also rotate the xy-plane around the z-axis (as mentioned, that’s the direction of propagation of our beam) and use the resulting |x‘〉 and |y‘〉 states as base states. In short, as Feynman puts it: “You can resolve light into x– and y-polarizations, or into x’– and y’-polarizations, or into right and left polarizations as a basis.” These pairs are always orthogonal and also satisfy the other conditions we’d impose on a set of base states.
3. The last point I want to make here is much more enigmatic but, as far as I am concerned – by far – the most interesting of all of Feynman’s Lecture on this topic. It’s actually just a footnote, but I am very excited about it. So… Well… What is it?
Well… Feynman does the calculations to show what a circularly polarized photon looks like when we rotate the coordinates around the z-axis, and shows that the phase of the right and left polarized states effectively keeps track of the x– and y-axes, so all of our “right-hand” rules don’t get lost somehow. He compares this analysis to an analysis he actually did – in a much earlier Lecture (in Chapter 5) – for spin-one particles. But, of course, here we’ve been analyzing the photon as a two-state system, right?
So… Well… Don’t we have a contradiction here? If photons are spin-one particles, then they’re supposed to be analyzed in terms of three base states, right? Well… I guess so… But then Feynman adds a footnote – with a very important remark:
“The photon is a spin-one particle which has, however, no ‘zero’-state.”
Why am I noting that? Because it confirms my theory about photons – force-particles – being different from matter-particles not only because of the different rules for adding amplitudes, but also because we get two wavefunctions for the price of one and, therefore, twice the energy for every oscillation! And so we’ll also have a distance of two Planck units between the equivalent of the “up” and “down” states of the photon, rather than one Planck unit, like what we have for the angular momentum for an electron.
I described the gist of my argument in my e-book, which you’ll find under another tab of this blog, and so I’ll refer you there. However, in case you’re interested, the summary of the summary is as follows:
- We can think of a photon having some energy that’s equal to E = p = m (assuming we choose our time and distance units such that c = 1), but that energy would be split up in an electric and a magnetic wavefunction respectively: ψE and ψB.
- Now, Schrödinger’s equation would then apply to both wavefunctions, but the E, p and m in those two wavefunctions are the same and not the same: their numerical value is the same (pE = EE = mE = pB = EB = mB), but they’re conceptually different. [They must be: I showed that, if they aren’t, then we get a phase and group velocity for the wave that doesn’t make sense.]
It is then easy to show that – using the B = i·E relation between the magnetic and the electric field vectors – we find a composite wavefunction for our photon which we can write as:
E + B = ψE + ψB = E + i·E = √2·e^(i·(p·x/2 − E·t/2 + π/4)) = √2·e^(i·π/4)·e^(i·(p·x/2 − E·t/2)) = √2·e^(i·π/4)·E
The whole thing then becomes:
ψ = ψE + ψB = √2·e^(i·(p·x/2 − E·t/2 + π/4)) = √2·e^(i·π/4)·e^(i·(p·x/2 − E·t/2))
So we’ve got a √2 factor here in front of our combined wavefunction for our photon which, knowing that the energy is proportional to the square of the amplitude, gives us twice the energy we’d associate with a regular amplitude… [With “regular”, I mean the wavefunction for matter-particles – fermions, that is.] So… Well… That little footnote of Feynman seems to confirm I really am on to something. Nice! Very nice, actually! 🙂
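The arithmetic behind that √2 factor is easy to verify. The sketch below only checks the complex-number identity used above, that is 1 + i = √2·e^(i·π/4), and that the absolute square of that factor is 2. The variable names are just illustrative.

```python
import cmath
import math

# The factor we get from adding a wavefunction to i times itself: E + i*E = (1 + i)*E
factor = 1 + 1j

# The same factor in polar form: sqrt(2) * e^(i*pi/4)
polar = math.sqrt(2) * cmath.exp(1j * math.pi / 4)

# Squaring the modulus gives the factor two
modulus_squared = abs(factor) ** 2

print(factor, polar, modulus_squared)
```

Both forms agree up to floating-point rounding, and the squared modulus comes out as 2 (again, up to rounding), which is where the “twice the energy” in the argument comes from.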