[Preliminary note (added on 13 June 2019): When re-reading what I wrote below, I realize I would fundamentally re-write certain sections. I think I have found a comprehensive realist interpretation of quantum mechanics and, hence, I’d recommend you check my recent papers. The writings below are probably just good to illustrate how I got there. Happy reading!]
When thinking about reality, what equation in physics comes to mind first? For me, it’s the E = ħ·ω equation. It can be interpreted in two ways:
- As the Planck-Einstein relation, it says that the energy of a photon is a multiple of the quantum of energy.
- As the first of the two de Broglie equations (ω = E/ħ and k = p/ħ), the equation gives us the temporal frequency of the wavefunction (denoted as psi: ψ) of a matter-particle (i.e. one of the spin-1/2 particles that make up our world) in free space:
$\psi(x, t) = a\,e^{-i[(E/\hbar)\,t - (p/\hbar) \cdot x]} = a\,e^{-i(\omega t - k \cdot x)} = a\,e^{i(k \cdot x - \omega t)} = a\,e^{i\theta} = a\,(\cos\theta + i\sin\theta)$
Let’s start with the first interpretation.
Counting cycles: frequency (and time) as count variables
You’ll usually hear the Planck-Einstein relation is about the energy of a photon being proportional to its frequency (E = h·f), with h the proportionality constant. So why do I say something different here? The energy of a photon is a multiple of the quantum of energy? What does that mean? Indeed, you’ve surely heard about the quantum of action: that’s h—Planck’s constant. But what’s the quantum of energy? Can we also equate that to h? Perhaps. Let’s look at it.
The dimension of Planck’s constant is the dimension of physical action: $h \approx 6.62607 \times 10^{-34}$ joule-second (J·s). The action and energy concepts are different and, therefore, the quantum of action and the quantum of energy – whatever it is – should also be different, right?
Well… Maybe. Maybe not.
I like to think of the frequency as a count variable. A count variable is a statistical data type: in statistics, it’s defined as a (non-negative) integer that arises from counting: 0, 1, 2, etcetera. In math (or in physics), it’s just referred to as a natural number and, as you know, we can get any other number set from the set of so-called natural numbers: the integers (by including an additive inverse); the rational numbers (by including a multiplicative inverse (1/n) for each non-zero integer n); the real numbers (we get them as Cauchy sequences of rationals); and, finally, the complex numbers (by including the unresolved square root of −1). [As I’ve explained a couple of times already, I’d rather explain the concept of a complex number by referring to the concept of direction: we’re adding the concept of direction when talking complex analysis. The i = √−1 business is rather abstract, even if it amounts to saying the same: we’re just introducing an additional concept.]
You’ll say: a frequency is not a count variable: it can be any real number. We can have a frequency that’s equal to $f = \pi^e$, for example. 🙂 [I am putting a smiley here because, as far as I am aware, mathematicians have not figured out whether $\pi^e$ is actually an irrational real number: an irrational power of an irrational number may be rational. For $e^\pi$ the question is settled: it is known to be transcendental. Whatever it is, numbers like $\pi^e$ or $e^\pi$ are surely not integers.]
Is frequency a continuous variable? In my humble opinion, it isn’t. It’s a bit of a definitional issue. Frequency is measured per second: frequency is the number of cycles per second. That number is a count variable. So let’s think about our unit of time. How do we define time? Time is obviously a continuous variable, isn’t it? Well… Maybe. But maybe not. Time itself is defined with regard to some frequency, so it refers to a count data type. For example, the concept of a day refers to one rotation of the Earth, and nowadays we measure time referring to the frequency of radiation. Let me be precise here: the second was defined as the duration of 9 192 631 770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium-133 atom. Look at the definition: we don’t say about 9 192 631 770 periods. No. It’s 9 192 631 770 cycles exactly. So… Well… I think frequency is a count variable. I’ll go even further: I actually like to think of time as a count variable too.
Huh? I must be joking, right? What would be the unit? Well… I am not saying time is a count variable: I am saying it’s nice to think of time like that. To do so, we must imagine some kind of fundamental cycle. How?
The fundamental cycle: the simplest of quantum-mechanical models
If we’d take sodium light, whose frequency is – roughly – $5.1 \times 10^{14}$ Hz, then we find that one cycle corresponds to an energy that’s equal to $h \cdot f \approx 3.38 \times 10^{-19}$ J. So we cannot use radiation to define some kind of fundamental cycle whose energy would correspond to h. Having said that, what we can do, on the other hand, is look at that fundamental difference between energy levels which explains, for example, blackbody radiation ($E_n = n \cdot \hbar \cdot \omega$) or, what amounts to the same, the difference in energy between an electron whose spin is ‘up’ (Jz = +ħ/2) as opposed to an electron whose spin is ‘down’ (Jz = −ħ/2). In other words, we can think of an oscillation between those two energy levels, and that – in my humble opinion – does correspond to a fundamental oscillation whose cycle corresponds to the energy quantum, i.e. ħ (rather than h).
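The sodium-light arithmetic is easy to redo. The snippet below is only a sketch: the constant and function names are mine, and the inputs are the rounded values quoted in the text.

```python
# Photon energy E = h*f for sodium light, using the rounded values from the text.
PLANCK_H = 6.62607e-34  # Planck's constant in J*s

def photon_energy(frequency_hz: float) -> float:
    """Return the energy (in joule) of one photon of the given frequency."""
    return PLANCK_H * frequency_hz

sodium_frequency = 5.1e14  # Hz, rough frequency of sodium light
sodium_energy = photon_energy(sodium_frequency)
print(f"E = {sodium_energy:.3e} J")
```

The result is of the order of 10⁻¹⁹ J, which is why one cycle of visible light cannot serve as a cycle whose energy is numerically comparable to h.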
Now you’ll ask: what’s its frequency? Can we apply the analysis we applied to two-state systems in general? The one we did for the ammonia molecule (NH3), or for the hydrogen ion (H2+). Well… I guess so. Why not? I’d say the following matrix looks like a pretty reasonable Hamiltonian for this very special case here:
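Reading the coefficients off the ammonia-type model used further down (H11 = H22 = E0 and H12 = H21 = −A), and then applying the two assumptions of this limit case (E0 = 0 and A = ħ), the matrix presumably takes the following form. Treat it as a reconstruction from those coefficients rather than as a given:

```latex
H =
\begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix}
=
\begin{pmatrix} E_0 & -A \\ -A & E_0 \end{pmatrix}
\;\xrightarrow{\;E_0 = 0,\ A = \hbar\;}\;
\begin{pmatrix} 0 & -\hbar \\ -\hbar & 0 \end{pmatrix}
```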
And remember that graph for the probabilities to be in state 1 or state 2? I copied it below. As A = ħ, we’d measure time now in units of… Well… What? Well… ħ/A = ħ/ħ = 1 here, so we’d just measure time in… Well… Its normal unit: seconds. Does that make sense?
Hmm… Strange… Let’s look at the underlying formulas:
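- $C_1(t) = \langle 1 | \psi \rangle = \tfrac{1}{2} e^{-(i/\hbar)(E_0 - A)t} + \tfrac{1}{2} e^{-(i/\hbar)(E_0 + A)t} = e^{-(i/\hbar)E_0 t} \cos[(A/\hbar)t]$
- $C_2(t) = \langle 2 | \psi \rangle = \tfrac{1}{2} e^{-(i/\hbar)(E_0 - A)t} - \tfrac{1}{2} e^{-(i/\hbar)(E_0 + A)t} = i\,e^{-(i/\hbar)E_0 t} \sin[(A/\hbar)t]$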
Is it the E0 factor? No. The $e^{-(i/\hbar)E_0 t}$ factor is equal to $e^0 = 1$ if E0 = 0, obviously. But the truth is: whether E0 is or isn’t zero doesn’t matter: the absolute square of $e^{-(i/\hbar)E_0 t}$ is always equal to one, regardless of the value of E0. [The i in the C2(t) formula disappears as well because $|i|^2 = 1$.] So, regardless of our choice for E0, we get the same P1(t) and P2(t) functions:
- $P_1(t) = |C_1(t)|^2 = \cos^2[(A/\hbar)t]$
- $P_2(t) = |C_2(t)|^2 = \sin^2[(A/\hbar)t]$
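The claim that E0 drops out of the probabilities can be checked numerically. This is a minimal sketch, assuming natural units (ħ = 1) and arbitrary test values for E0, A and t; the function names are hypothetical.

```python
import cmath
import math

HBAR = 1.0  # natural units, an assumption made only for this check

def amplitudes(t, e0, a, hbar=HBAR):
    """Return (C1, C2) built from the two exponentials with energies E0 - A and E0 + A."""
    low = 0.5 * cmath.exp(-1j * (e0 - a) * t / hbar)
    high = 0.5 * cmath.exp(-1j * (e0 + a) * t / hbar)
    return low + high, low - high

def probabilities(t, e0, a, hbar=HBAR):
    """Return (P1, P2) as the absolute squares of the two amplitudes."""
    c1, c2 = amplitudes(t, e0, a, hbar)
    return abs(c1) ** 2, abs(c2) ** 2

# The same probabilities come out for very different values of E0.
for e0 in (0.0, 3.7, 50.0):
    p1, p2 = probabilities(0.4, e0, a=1.0)
    print(e0, round(p1, 6), round(p2, 6))
```

Each row prints the same pair of numbers, and the pair adds up to one.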
We’re definitely on to something here. The difference between the two energy levels in a two-state system is what gives the wavefunction its density in time. However, in this case, our wavefunctions reduce to:
- C1(t) = 〈 1 | ψ 〉 = cos[(ħ/ħ)·t] = cos(t)
- C2(t) = 〈 2 | ψ 〉 = i·sin[(ħ/ħ)·t] = i·sin(t)
But… Hey! Wait a minute! These things don’t look like regular wavefunctions! One is a real function and the other one is purely imaginary. In fact, if we’d want a sensible wavefunction, we’d have to combine them. If we do that, and substituting t for θ, we just get Euler’s formula:
$e^{i\theta} = \cos\theta + i\sin\theta$
We find that Euler’s function – which, frankly, is just one of the many equivalent definitions of a complex number – describes the behavior of an elementary particle oscillating between two elementary states (spin up versus spin down), and the functional argument is just time. Isn’t that amazing? By the way, also note what happened here: we started with the Planck-Einstein relation and sort of slid into the de Broglie equation without really noticing where and how we were making the shift. You should note a number of things here:
First, the C1(t) and C2(t) functions are not independent. Of course. They never are. But this shows it in a very powerful way. They’re related through Euler’s function which, as I’ll show in a moment, represents the normalization condition (the probabilities, taken over all possible states, have to add up to one).
Second, think about the geometric interpretation of all of this. ‘Up’ and ‘down’ are defined with respect to the direction of measurement, which we usually refer to as the z-direction. Also note that the sine and cosine functions are the same function but for a phase difference equal to π/2: cosθ = cos(–θ) = sin(θ + π/2). We also noted that it doesn’t matter if you define the off-diagonal elements of the Hamiltonian (i.e. H12 = H21) as A or as −A. That’s just your convention in regard to the direction of rotation of the phase (clockwise or counterclockwise): you just need to make sure you’re consistent in your choice. Also, multiplying a complex number by i or −i amounts to a rotation by π/2 or −π/2 (±90°) respectively. So adding π/2 to the phase of our wavefunction and multiplying it by i are equivalent, as evidenced below:
$i\,e^{i\theta} = i(\cos\theta + i\sin\theta) = i\cos\theta - \sin\theta = i\sin(\theta + \pi/2) + \cos(\theta + \pi/2) = \cos(\theta + \pi/2) + i\sin(\theta + \pi/2) = e^{i(\theta + \pi/2)}$
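The equivalence between multiplying by i and shifting the phase by π/2 can be confirmed for a few arbitrary angles. A small sketch, with function names of my own choosing:

```python
import cmath
import math

def rotate_by_i(theta):
    """Multiply the unit phasor e^(i*theta) by i."""
    return 1j * cmath.exp(1j * theta)

def shift_phase(theta):
    """Add pi/2 to the phase of the unit phasor e^(i*theta)."""
    return cmath.exp(1j * (theta + math.pi / 2))

for theta in (0.0, 0.3, 1.0, 2.5, -1.2):
    difference = abs(rotate_by_i(theta) - shift_phase(theta))
    print(theta, difference < 1e-12)
```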
What can we do with this? Could we relate it, for example, with the E and B vectors of electromagnetic radiation, whose direction − but not their phase! − also differs by an angle equal to π/2, as illustrated below?
Perhaps, but we’re talking something else: we’re talking the quantum-mechanical description of a quantum-mechanical system—i.e. we’re describing reality! 🙂 So let’s not bother about how we’d describe radiation. The point is: we’ve got a mathematical model of reality here that makes sense, and one which respects Occam’s Razor.
Of course, now you’ll grumble. OK, we have a model of an electron in free space whose spin oscillates between ‘up’ and ‘down’ here, and each cycle corresponds to an energy level equal to ħ. So it should emit some radiation. You’re right.
Well… Maybe. Let me quickly say what we can say here. Let’s first look at the frequency of this ‘radiation’: its period (T) is π seconds. Hence, its frequency is f = 1/T = 1/π ≈ 0.32 Hz. Therefore, its wavelength should be equal to $\lambda = c/f \approx 3\pi \times 10^8$ m (about $9.4 \times 10^8$ m), i.e. roughly two and a half times the distance from here to the moon. Can we detect anything like that? The US military established a way of communicating with submarines which uses so-called ELF (extremely low frequency) waves as low as 50 Hz, which amounts to a wavelength of $\lambda = c/f \approx (3/50) \times 10^8$ m, i.e. 6,000 km. In fact, power lines are sources of ELF waves. But… Well… To detect waves like that the intensity of the signal needs to be sufficiently high. And one electron emitting a signal whose total energy is equal to ħ is surely very ELI (extremely low intensity). 🙂
Think about it. To be frank, I am not quite sure what to make of all this. Would free electrons really be emitting radiation like that – every time they make a transition? Electrons orbiting around the nucleus don’t emit radiation but… Well… They’re not supposed to change their spin every π seconds or so – which is what our model is saying here. Indeed, the model is really the same as that for the ammonia molecule, for which $A \approx 0.5 \times 10^{-4}$ eV. That’s tiny, but the A/ħ ratio in those probability functions – I copied them once more (see below) – is still huge: $(0.5 \times 10^{-4})/(6.582 \times 10^{-16}) \approx 76 \times 10^9$. So that’s 76 billion.
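For what it’s worth, here is the wavelength arithmetic as a short sketch. The Earth-moon distance is a rounded average that I am adding for the comparison, and the names are mine.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s
MOON_DISTANCE = 3.844e8         # m, rounded average Earth-moon distance (added for comparison)

def wavelength(frequency_hz: float) -> float:
    """Return the wavelength in metres for a wave travelling at c."""
    return SPEED_OF_LIGHT / frequency_hz

model_frequency = 1 / math.pi                  # Hz, from a period of pi seconds
model_wavelength = wavelength(model_frequency)
moon_ratio = model_wavelength / MOON_DISTANCE
elf_wavelength_km = wavelength(50.0) / 1000.0  # the 50 Hz ELF example

print(f"model wavelength: {model_wavelength:.3e} m")
print(f"in Earth-moon distances: {moon_ratio:.2f}")
print(f"50 Hz wavelength: {elf_wavelength_km:.0f} km")
```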
- $P_1(t) = |C_1(t)|^2 = \cos^2[(A/\hbar)t]$
- $P_2(t) = |C_2(t)|^2 = \sin^2[(A/\hbar)t]$
To be clear on this, if we say that we’re measuring time in units of ħ/A – which is what we do in that graph above – then we’re measuring time in units of roughly 13 picoseconds ($\hbar/A \approx 1.3 \times 10^{-11}$ s). Note that we do that because the A/ħ ratio is about 76 billion per second! You may also remember we calculated the period of the spontaneous transition frequency as $t = \hbar \cdot \pi / A \approx 41.3 \times 10^{-12}$ s, so we’re talking picoseconds here. Finally, you may or may not remember we also talked about the natural oscillations inside of a uranium nucleus, whose frequency was of the order of $10^{22}$ per second. In short, the period of that electron oscillation model that we’ve got here is huge by comparison. So the natural question is: does it make any sense? I think its logic is sound, but… Well… I’ll admit it all looks very funny. If anything, it makes one think through all of the stuff we’ve been discussing, doesn’t it? What we’ve got here is a limit situation, which tests our concepts.
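The ammonia numbers can be reproduced in a few lines. This is just a sketch with the rounded constants quoted above; the variable names are hypothetical.

```python
import math

HBAR_EV_S = 6.582e-16    # reduced Planck constant in eV*s (rounded)
A_AMMONIA_EV = 0.5e-4    # energy parameter A for the ammonia molecule, in eV

angular_rate = A_AMMONIA_EV / HBAR_EV_S          # A/hbar, in rad per second
time_unit = HBAR_EV_S / A_AMMONIA_EV             # hbar/A, in seconds
flip_period = math.pi * HBAR_EV_S / A_AMMONIA_EV # pi*hbar/A, period of cos^2 and sin^2

print(f"A/hbar    = {angular_rate:.3e} per second")
print(f"hbar/A    = {time_unit:.3e} s")
print(f"pi*hbar/A = {flip_period:.3e} s")
```

Compare these with the limit model in this post, where A/ħ = 1 and the corresponding period is π seconds.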
The model I just presented has some nice implications. One of them is that the ‘fundamental’ time unit, i.e. the time that corresponds to one cycle of our electron flipping its spin is equal to π or, preferably, that it’s equal to 2π, because that’s the cycle of the underlying amplitudes. I like that. I think it’s powerful. And it’s totally in line with those two mysterious formulas which come out of Euler’s formula if you equate θ to π or 2π:
$e^{i\pi} = e^{-i\pi} = -1$
$e^{i2\pi} = e^{0} = 1$
Let’s double-check our calculations against the Hamiltonian equations. Indeed, you’ll remember we got our set of Hamiltonian equations assuming there was some amplitude for a system to change from base state i to state j over some infinitesimally small time unit (Δt). We wrote that amplitude as Uij(t + Δt, t), and we related those amplitudes to the corresponding Hamiltonian coefficients as follows:
$U_{ij}(t + \Delta t, t) = \delta_{ij} + \Delta U_{ij}(t + \Delta t, t) = \delta_{ij} + K_{ij}(t)\,\Delta t \iff U_{ij}(t + \Delta t, t) = \delta_{ij} - (i/\hbar)\,H_{ij}(t)\,\Delta t$
In the model for our ammonia molecule, these equations become:
- U11(t + Δt, t) = 1 − i·[H11(t)/ħ]·Δt = 1 − i·[E0/ħ]·Δt
- U22(t + Δt, t) = 1 − i·[H22(t)/ħ]·Δt = 1 − i·[E0/ħ]·Δt
- U12(t + Δt, t) = 0 − i·[H12(t)/ħ]·Δt = 0 + i·[A/ħ]·Δt
- U21(t + Δt, t) = 0 − i·[H21(t)/ħ]·Δt = 0 + i·[A/ħ]·Δt
You can easily see we have a bit of a problem with our limit model, in which E0 = 0, because U11(t + Δt, t) and U22(t + Δt, t) just equal 1. Hmm… Let’s not think about that for a while. Let’s just pretend we didn’t notice the problem. So, equating E0 with 0, and A with ħ, we get:
- U11(t + Δt, t) = 1 − i·[E0/ħ]·Δt = 1 ⇔ − i·[H11(t)/ħ]·Δt = 0
- U22(t + Δt, t) = 1 − i·[E0/ħ]·Δt = 1 ⇔ − i·[H22(t)/ħ]·Δt = 0
- U12(t + Δt, t) = − i·[H12(t)/ħ]·Δt = i·Δt
- U21(t + Δt, t) = − i·[H21(t)/ħ]·Δt = i·Δt
Our set of Hamiltonian equations now reduces to:
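Starting from the general two-state equations and plugging in H11 = H22 = 0 and H12 = H21 = −ħ (the coefficients listed just above), the reduced set should read as follows. This is my reconstruction from those coefficients; it matches the check that follows.

```latex
i\hbar\,\frac{dC_1}{dt} = H_{11}C_1 + H_{12}C_2 = -\hbar\,C_2
\quad\Longrightarrow\quad
i\,\frac{dC_1}{dt} = -C_2

i\hbar\,\frac{dC_2}{dt} = H_{21}C_1 + H_{22}C_2 = -\hbar\,C_1
\quad\Longrightarrow\quad
i\,\frac{dC_2}{dt} = -C_1
```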
We’ve calculated those C1(t) and C2(t) already:
- C1(t) = 〈 1 | ψ 〉 = cos(t)
- C2(t) = 〈 2 | ψ 〉 = i·sin(t)
So let’s check if it all makes sense. Calculating those derivatives and substituting them in those Hamiltonian equations, yields the following:
- i·d[C1(t)]/dt = −C2(t) ⇔ i·d[cos(t)]/dt = −i·sin(t) = −C2(t) ⇔ C2(t) = i·sin(t)
- i·d[C2(t)]/dt = −C1(t) ⇔ i·d[i·sin(t)]/dt = i·i·cos(t) = −cos(t) = −C1(t) ⇔ C1(t) = cos(t)
Bingo! We’re bang on! 🙂
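If you’d rather not trust the derivatives done by hand, a finite-difference check does the job. A rough sketch, assuming a central-difference step of 10⁻⁶ is accurate enough for this purpose; the function names are mine.

```python
import math

def c1(t):
    """C1(t) = cos(t), written as a complex number."""
    return complex(math.cos(t), 0.0)

def c2(t):
    """C2(t) = i*sin(t)."""
    return complex(0.0, math.sin(t))

def derivative(func, t, step=1e-6):
    """Central-difference approximation of d(func)/dt."""
    return (func(t + step) - func(t - step)) / (2 * step)

def residuals(t):
    """Return |i*dC1/dt + C2| and |i*dC2/dt + C1|; both should be close to zero."""
    r1 = 1j * derivative(c1, t) + c2(t)
    r2 = 1j * derivative(c2, t) + c1(t)
    return abs(r1), abs(r2)

for t in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(t, residuals(t))
```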
The question, of course, remains the same: how do we interpret this really? Well… As far as I am concerned, there’s nothing wrong with a real-valued wavefunction, or with a purely imaginary one. And so that’s what we’ve got here. We could say that the state of the system, as a whole, is described by both. Both are linked through the normalization condition: the probability of being in state 1 and the probability of being in state 2 have to add up to one. That’s obviously the case here:
|C1(t)|2 + |C2(t)|2 = cos2(t) + sin2(t) = 1
As I told you already: Euler’s function here just states the normalization condition and, by doing so, we can look at it as representing the whole system. To put it differently: in this generic model, we have Euler’s function describing the state of an elementary particle (i.e. any spin-1/2 particle really) in time.
Frankly, I’ve never seen such a succinct description of reality. As far as I am concerned, this sums it all up. So… Well… Perhaps Euler’s function would be the equation to start with. 🙂
OK. So far so good. Let me present all of the above from another angle now.
Modeling uncertainty in classical versus quantum physics
Let’s come back to that zero energy assumption. E0 is the sum of the rest energy, the kinetic energy, and the potential energy. We can always choose the zero point for measuring potential energy such that E0 is zero, but I admit that’s very artificial. Moreover, in free space we assume there is no potential energy. Hence, when describing some real particle, we should not assume that E0 is zero. How does E0 ≠ 0 change the analysis above? Let’s look at those C1(t) and C2(t) formulas once more, but let’s simplify them by assuming we measure energy in units of ħ, so we write E0/ħ, A/ħ and ħ/ħ just as E0, A and 1.
- $C_1(t) = e^{-iE_0 t}\cos(t)$
- $C_2(t) = i\,e^{-iE_0 t}\sin(t)$
As mentioned above, these two functions yield the same probability functions P1(t) and P2(t), so there’s no impact there. However, the $e^{-iE_0 t}$ factor does yield another wavefunction. The most obvious thing to note is that our ‘real’ cos(t) and our purely ‘imaginary’ i·sin(t) wavefunctions re-become complex-valued functions. Let’s write them out:
- $C_1(t) = e^{-iE_0 t}\cos(t) = [\cos(E_0 t) - i\sin(E_0 t)]\cos(t) = \cos(E_0 t)\cos(t) - i\sin(E_0 t)\cos(t)$
- $C_2(t) = i\,e^{-iE_0 t}\sin(t) = i\,[\cos(E_0 t) - i\sin(E_0 t)]\sin(t) = \sin(E_0 t)\sin(t) + i\cos(E_0 t)\sin(t)$
This is another delightful formula: it shows how the non-zero energy splits our cosine and sine functions into a real and imaginary part. It’s quite interesting to play with the various sine and cosine combinations, i.e. the real and imaginary parts of those wavefunctions above. For example, the cos(E0t)·cos(t) function, over one cycle (so I let t range from −π to +π), looks as follows for E0 = 0, 10 and 100 respectively. As you can see, higher energy levels are associated with a higher ‘density’ in time.
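Here is one way to play with these combinations without a graphing tool. The sketch checks the real and imaginary parts against the complex exponentials, and then counts the sign changes of cos(E0·t)·cos(t) over one cycle as a crude measure of ‘density’ in time. Energy is in units of ħ, as above; the sample count and the function names are my own choices.

```python
import cmath
import math

def c1(t, e0):
    """C1(t) = e^(-i*E0*t) * cos(t), with energy measured in units of hbar."""
    return cmath.exp(-1j * e0 * t) * math.cos(t)

def c2(t, e0):
    """C2(t) = i * e^(-i*E0*t) * sin(t)."""
    return 1j * cmath.exp(-1j * e0 * t) * math.sin(t)

def c1_parts(t, e0):
    """Real and imaginary part of C1 as products of sines and cosines."""
    return math.cos(e0 * t) * math.cos(t), -math.sin(e0 * t) * math.cos(t)

def c2_parts(t, e0):
    """Real and imaginary part of C2 as products of sines and cosines."""
    return math.sin(e0 * t) * math.sin(t), math.cos(e0 * t) * math.sin(t)

def sign_changes(e0, samples=20000):
    """Count sign changes of cos(E0*t)*cos(t) for t sampled from -pi to +pi."""
    values = []
    for k in range(samples + 1):
        t = -math.pi + 2 * math.pi * k / samples
        values.append(math.cos(e0 * t) * math.cos(t))
    return sum(1 for a, b in zip(values, values[1:]) if a * b < 0)

counts = [sign_changes(e0) for e0 in (0, 10, 100)]
print(counts)
```

The count goes up with E0, which is the ‘higher density in time’ in numbers.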
And all possible sine and cosine combinations for E0 = 10 look like this:
You can play with this yourself using an online graphing tool. The question is: what does it all mean? The answer is: Occam’s Razor. When we’re going to describe the interaction between two particles, we’ll have four possible situations: both particles have their spin in the same direction (‘up’ or ‘down’), or one is ‘up’ while the other is ‘down’. The ‘degrees of freedom’ in the quantum-mechanical wavefunction model are just right: we don’t need any more or less. It’s all just right. Think about modeling uncertainty in classical versus quantum mechanics.
I. In classical mechanics, the angular momentum vector of an object can point in any direction: all angles θ are equally likely. Hence, θ, as a function of time (θ = ω·t), follows a continuous and uniform probability distribution. We write: P(θ) = 1/(2π), and the normalization condition is self-evident: ∫P(θ)dθ = 2π/(2π) − 0/(2π) = 1 − 0 = 1.
II. In quantum mechanics, the angular momentum vector (for a spin-1/2 particle) can take only two values in any particular direction: Jz = ± ħ/2. That’s an experimental fact: we know this is true because of the Stern-Gerlach experiment. Hence, we say that our particle can be in one of two states only: ‘up’ or ‘down’. So we now have a simple two-point distribution: P[Jz = +ħ/2] = 0.5 and P[Jz = −ħ/2] = 0.5, and we can no longer say that the θ variable represents the direction of the angular momentum vector: θ is now the argument of our wavefunction, which we write as $e^{-i\theta} = e^{-i\omega t} = \cos(\omega t) - i\sin(\omega t)$. The model implies the following:
- The probabilities of being in state 1 and state 2 are now equal to $\cos^2(\omega t)$ and $\sin^2(\omega t)$ respectively.
- The normalization condition is equally self-evident, but different: $\cos^2(\omega t) + \sin^2(\omega t) = 1$.
If we look at it like this, it’s equally self-evident that the natural unit to measure time is π or 2π, because that’s the periodicity of our wavefunction and probability function.
You may wonder: why do we need θ at all? The answer: interference. The Stern-Gerlach experiment is great, but we also need to explain interference, and we can only do that by assuming that those wavefunctions exist—somehow. 🙂 That leads to the last question: what is that θ variable then? It’s not time: θ is a function of time and, when talking moving particles (as opposed to the stationary system we’ve been looking at so far), θ is not only a function in time but in space as well. The answer is: it is what it is—it’s the argument of the wavefunction. 🙂
Have you ever had a serious course in statistics? One that included non-linear regression models, like the logit regression, or log-linear (Poisson) regressions? Those models are designed to link continuous variables with discrete outcomes, like yes/no, i.e. up or down in this case. Logit regressions do it for binary variables (yes/no), while Poisson regressions do it for count data (0, 1, 2, etc.). These models connect a continuous probability density function with continuous independent variables. The so-called link function establishes… Well… The link between the discrete outcome and the continuous probability density function. The link function in the logit model is the logit function – logit(p) = ln[p/(1−p)] – and in the log-linear regression we use another logarithmic function. These link functions connect the independent variables with the dependent variable (i.e. the probability of this or that outcome) through some intermediate variable. The logic is something like this:
P[x] = P[S(θ)] = P[S[f(x1, x2,…xk)]], with θ = f(x1, x2,…xk)
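To make the statistical side of the analogy concrete, here is the logit link and its inverse in a few lines. It is an illustration only: the linear form of θ and the coefficient values are hypothetical, not taken from any real model.

```python
import math

def logit(p):
    """The logit link: maps a probability in (0, 1) to the whole real line."""
    return math.log(p / (1 - p))

def inverse_logit(theta):
    """Inverse of the logit: maps the intermediate variable back to a probability."""
    return 1 / (1 + math.exp(-theta))

def intermediate_variable(xs, coefficients, intercept=0.0):
    """theta = f(x1, ..., xk); here a simple linear combination (an assumption)."""
    return intercept + sum(b * x for b, x in zip(coefficients, xs))

theta = intermediate_variable([1.5, -0.2], [0.8, 2.0], intercept=-0.5)
probability_yes = inverse_logit(theta)
print(round(theta, 3), round(probability_yes, 3))
```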
That’s what we’ve got here:
- The independent variables (i.e. the xi above) are time (t) and position (x).
- θ is a function of both: θ = ω·t − k∙x = (E/ħ)·t − (p/ħ)∙x
- Our link function S(θ) is the wavefunction: S(θ) = e−iθ
- We get the probabilities by taking the absolute square of the link function: $P(x, t) = |e^{-i\theta}|^2$
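The same chain can be written out for the wavefunction. This is a sketch in natural units (ħ = 1 is my assumption), with hypothetical function names. Note that for this single free-space exponential the absolute square equals one for every x and t; the cos² and sin² probabilities in the two-state model come from combining two such exponentials.

```python
import cmath

def phase(t, x, energy, momentum, hbar=1.0):
    """theta = (E/hbar)*t - (p/hbar)*x, the argument of the wavefunction."""
    return (energy / hbar) * t - (momentum / hbar) * x

def link(theta):
    """The 'link function' S(theta) = e^(-i*theta)."""
    return cmath.exp(-1j * theta)

def probability_density(t, x, energy, momentum):
    """Absolute square of the link function."""
    return abs(link(phase(t, x, energy, momentum))) ** 2

for t, x in ((0.0, 0.0), (1.3, -0.7), (10.0, 4.2)):
    print(t, x, round(probability_density(t, x, energy=2.0, momentum=0.5), 12))
```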
That’s it! That’s all there is to it! Isn’t that just nice? At the very least, you’ll have to admit it’s all quite aesthetic. 🙂 I have to leave you now, as I need to move on with the rest of Feynman’s course. I hope you enjoyed this rather original summary. If not… Well… I can only say I enjoyed writing it. 🙂
Post Scriptum on wrapped probability distributions
The statistics which I am referring to in my post here are known as circular statistics. They involve a so-called wrapped probability distribution. The term is clear enough: if t is just a real number between –π and +π, and its distribution is uniform, i.e. P(t) = 1/(2π), then what’s the distribution of cos(t), or of sin(t)? Circular statistics involves problems such as the one that’s illustrated below: if a light source emits photons in all directions in a continuous stream that we denote as I, then what’s the number of photons/sec directed into any wedge? The answer is: it’s going to be proportional to the area of the wedge.
But so we’ve got another problem here. It’s illustrated below: y(x) is, obviously, equal to cos(x). As you can see, as the angular velocity is some constant, and because of the geometry of the situation, the values near ±1 are more likely than the values near 0. So what’s the probability distribution here?
You may think it’s the cosine function itself but… No. Probabilities are always values between 0 and 1, not between −1 and +1. Of course, the square of the cos(x) function would be a candidate function. But… Well… Is it the probability density function we’re looking for here? In light of what I wrote above, you’ll be surprised to hear that the answer is: no! For starters, the cycle of $\cos^2(x)$ is π, not 2π (as shown below). We need a distribution over a [0, 2π] interval.
The formula you need is the following:
It can be re-written as follows, and I also included its graph below.
The interpretation of the probability densities near the endpoints (i.e. ± 1) is not self-evident, but you have a link here which explains it. 🙂
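For reference, the standard result for y = cos(x), with x uniform over [0, 2π], is the arcsine density f(y) = 1/(π·√(1 − y²)) on (−1, +1), with cumulative distribution F(y) = 1 − arccos(y)/π. I am assuming that is the formula meant here. The simulation below is a sketch that compares it with sampled values; the seed, the sample size and the names are arbitrary choices.

```python
import math
import random

def arcsine_pdf(y):
    """Density of y = cos(x) for x uniform on [0, 2*pi]; it blows up near y = -1 and y = +1."""
    return 1 / (math.pi * math.sqrt(1 - y * y))

def arcsine_cdf(y):
    """Cumulative distribution P[cos(x) <= y]."""
    return 1 - math.acos(y) / math.pi

def empirical_cdf(samples, y):
    """Fraction of the sampled values that are <= y."""
    return sum(1 for s in samples if s <= y) / len(samples)

rng = random.Random(2019)  # fixed seed so the run is repeatable
samples = [math.cos(rng.uniform(0.0, 2 * math.pi)) for _ in range(100_000)]

for y in (-0.9, -0.5, 0.0, 0.5, 0.9):
    print(y, round(arcsine_cdf(y), 4), round(empirical_cdf(samples, y), 4))
```

The two columns agree to within sampling noise, and the density is indeed much higher near ±1 than near 0.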
In any case, you’ve got plenty of stuff to think about now, so I’ll leave you at it. 🙂