Statistical mechanics re-visited

Quite a while ago – in June and July 2015, to be precise – I wrote a series of posts on statistical mechanics, which included digressions on thermodynamics, Maxwell-Boltzmann, Bose-Einstein and Fermi-Dirac statistics (probability distributions used in quantum mechanics), and so forth. I actually thought I had sort of exhausted the topic. However, when going through the documentation on that Stern-Gerlach experiment that MIT undergrad students need to analyze as part of their courses, I realized I had not actually presented some very basic formulas that you’ll definitely need in order to understand that experiment.

One of those basic formulas is the one for the distribution of velocities of particles in some volume (like an oven, for instance), or in a particle beam – like the beam of potassium atoms that is used to demonstrate the quantization of the magnetic moment in the Stern-Gerlach experiment. In fact, we’ve got two formulas here, which are subtly – as subtle as the difference between v (boldface, so it’s a vector) and v (lightface, so it’s a scalar) 🙂 – but fundamentally different:

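f(v) = (m/2πkT)^{3/2}·e^{−m·v²/2kT} (with v the velocity vector)

f(v) = (m/2πkT)^{3/2}·4π·v²·e^{−m·v²/2kT} (with v the speed)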

Both functions are referred to as the Maxwell-Boltzmann density distribution, but the first distribution gives us the density for some v in the velocity space, while the second gives us the distribution density of the absolute value (or modulus) of the velocity, so that is the distribution density of the speed, which is just a scalar – without any direction. As you can see, the second formula includes a 4π·v² factor.
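To make the difference between the two densities a bit more tangible, here is a minimal Python sketch. It assumes the standard Maxwell-Boltzmann forms with the (m/2πkT)^{3/2} coefficient. The potassium mass and the 450 K oven temperature are illustrative values only, not figures taken from the MIT documentation, and the function names are made up for this example.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant (J/K)
AMU = 1.66053906660e-27     # atomic mass unit (kg)
M_K = 39.0983 * AMU         # mass of a potassium atom (illustrative value)
T_OVEN = 450.0              # oven temperature in K (hypothetical)

def f_velocity(vx, vy, vz, m=M_K, T=T_OVEN):
    """Density in velocity space, i.e. per unit volume d3v."""
    v2 = vx * vx + vy * vy + vz * vz
    return (m / (2 * math.pi * K_B * T)) ** 1.5 * math.exp(-m * v2 / (2 * K_B * T))

def f_speed(v, m=M_K, T=T_OVEN):
    """Density of the speed v = |v|, i.e. per unit interval dv."""
    # The velocity density only depends on |v|, so we evaluate it on the x-axis.
    return 4 * math.pi * v * v * f_velocity(v, 0.0, 0.0, m, T)

def integrate(func, a, b, n=20000):
    """Composite midpoint rule: crude, but adequate for a smooth integrand."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# 5000 m/s is far out in the tail for these values of m and T.
total = integrate(f_speed, 0.0, 5000.0)
print(f"integral of the speed density = {total:.6f}")
```

The speed density integrates to 1 over a one-dimensional interval, while the velocity density would have to be integrated over the whole three-dimensional velocity space to give 1.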

The question is: how are these formulas related to Boltzmann’s f(E) = C·e^−energy/kT Law? The answer is: we can derive all of these formulas – for the distribution of velocities, or of momenta – by clever substitutions. However, as evidenced by the two formulas above, these substitutions are not always straightforward. So let me quickly show you a few things here.

First note the two formulas above already include the e^−energy/kT function if we equate the energy E with the kinetic energy: E = K.E. = m·v²/2. Of course, if you’ve read those June-July 2015 posts, you’ll note that we derived Boltzmann’s Law in the context of a force field, like gravity, or an electric potential. For example, we wrote the law for the density (n = N/V) of gas in a gravitational field (like the Earth’s atmosphere) as n = n₀·e^−P.E./kT. In this formula, we only see the potential energy: P.E. = m·g·h, i.e. the product of the mass (m), the gravitational acceleration (g), and the height (h). However, when we’re talking about the distribution of velocities – or of momenta – then the kinetic energy comes into play.
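The n = n₀·e^−P.E./kT law is easy to play with numerically. The sketch below assumes an isothermal column of nitrogen at 288 K, which is a simplification I am introducing here (the real atmosphere is neither pure nitrogen nor isothermal). The function and constant names are hypothetical.

```python
import math

K_B = 1.380649e-23                  # Boltzmann constant (J/K)
G = 9.81                            # gravitational acceleration near the surface (m/s^2)
M_N2 = 28.0 * 1.66053906660e-27     # approximate mass of an N2 molecule (kg), illustrative
T_AIR = 288.0                       # assumed uniform temperature (K)

def density_ratio(h, m, T):
    """n/n0 at height h for molecules of mass m in an isothermal column at temperature T."""
    return math.exp(-m * G * h / (K_B * T))

# Height over which the density drops by a factor e.
scale_height = K_B * T_AIR / (M_N2 * G)

for h in (0.0, 1000.0, scale_height):
    print(f"h = {h:8.1f} m   n/n0 = {density_ratio(h, M_N2, T_AIR):.4f}")
```

Only the potential energy m·g·h enters here, which is exactly the point made above: which energy goes into the exponent depends on the question you are asking.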

So that’s a first thing to note: Boltzmann’s Law is actually a whole set of laws. For example, the frequency distribution of particles in a system over various possible states, also involves the same exponential function: F(state) ∝ e^−E/kT. E is just the total energy of the state here (which varies from state to state, of course), so we don’t distinguish between potential and kinetic energy here.

So what energy concept should we use in that Stern-Gerlach experiment? Because these potassium atoms in that oven – or when they come out of it in a beam – have kinetic energy only, our E = m·v²/2 substitution does the trick: we can say that the potential energy is taken to be zero, so that all energy is in the form of kinetic energy. So now we understand the e^{−m·v²/2kT} function in those f(v) and f(v) formulas. Now we only need to explain those complicated coefficients. How do we get these?

We get them through clever substitutions using equations such as:

f_v(v)·dv = f_p(p)·dp

What are we writing here? We’re basically combining two normalization conditions: if f_v(v) and f_p(p) are proper probability density functions, then they must give us 1 when integrating over their domain. The domain of these two functions is, obviously, the velocity (v) and momentum (p) space. The velocity and momentum space are the same mathematical space, but they are obviously not the same physical space. But the two physical spaces are closely related: p = m·v, and so it’s easy to do the required transformation of variables. For example, it’s easy to see that, if E = m·v²/2, then E is also equal to E = p²/2m.

However, when doing these substitutions, things get tricky. We already noted that p and v are vectors, unlike E, or p and v – which are scalars, or magnitudes. So we write: p = (p_x, p_y, p_z) and |p| = p, and v = (v_x, v_y, v_z) and |v| = v. Of course, you also know how we calculate those magnitudes:

|p| = p = √(p_x² + p_y² + p_z²) and |v| = v = √(v_x² + v_y² + v_z²)

Note that this also implies the following: p·p = p² = p_x² + p_y² + p_z² = p². Trivial, right? Yes. But have a look now at the following differentials:

d³p
dp
dp = d(p_x, p_y, p_z)
dp_x·dp_y·dp_z

Are these the same or not? Now you need to think, right? That d³p and dp are different beasts is obvious: d³p is, obviously, some infinitesimal volume, as opposed to dp, which is, equally obviously, an (infinitesimal) interval. But what volume exactly? Is it the same as that dp = d(p_x, p_y, p_z) volume, and is that the same as the dp_x·dp_y·dp_z volume?

Fortunately, the volume differentials are, in fact, the same – so you can start breathing again. 🙂 Let’s get going with that d³p notation for the time being, as you will find that’s the notation which is used in the Wikipedia article on the Maxwell-Boltzmann distribution – which I warmly recommend, because – for a change – it is a much easier read than other Wikipedia articles on stuff like this. Among other things, the mentioned article writes the following:

f_E(E)·dE = f_p(p)·d³p

What is this? Well… It’s just like that f_v(v)·dv = f_p(p)·dp equation: it combines the normalization condition for both distributions. However, it’s much more interesting, because, on the left-hand side, we multiply a density with an (infinitesimal) interval (dE), while on the right-hand side we multiply with an (infinitesimal) volume (d³p). Now, the (infinitesimal) energy interval dE must, obviously, correspond with the (infinitesimal) momentum volume d³p. So how does that work?

Well… The mentioned Wikipedia article talks about the “spherical symmetry of the energy-momentum dispersion relation” (that dispersion relation is just E = |p|²/2m, of course), but that doesn’t make us all that wiser, so let’s try a more heuristic approach. You might remember the formula for the volume of a spherical shell, which is simply the volume of the outer sphere minus the volume of the inner sphere: V = (4π/3)·R³ − (4π/3)·r³ = (4π/3)·(R³ − r³). Now, for a very thin shell of thickness Δr, we can use the following first-order approximation: V ≈ 4π·r²·Δr. In case you wonder, I hereby copy a nice explanation from the Physics Stack Exchange site:

[Figure: approximation]
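If you would rather check the thin-shell approximation numerically than take it on faith, a few lines of Python will do. This is only a sanity check under my own choice of r and Δr values. The function names are made up for the occasion.

```python
import math

def shell_volume_exact(r, dr):
    """Volume between spheres of radius r and r + dr."""
    return 4.0 * math.pi / 3.0 * ((r + dr) ** 3 - r ** 3)

def shell_volume_first_order(r, dr):
    """First-order approximation: surface area times thickness."""
    return 4.0 * math.pi * r ** 2 * dr

r = 1.0
for dr in (0.1, 0.01, 0.001):
    exact = shell_volume_exact(r, dr)
    approx = shell_volume_first_order(r, dr)
    rel_err = (exact - approx) / exact
    print(f"dr = {dr}: relative error = {rel_err:.5f}")
```

Expanding (r + Δr)³ shows why: the terms we drop are 3·r·Δr² and Δr³, so the relative error is roughly Δr/r and vanishes as the shell gets thinner.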

Perfect. That’s all we need to know. We’ll use that first-order approximation to re-write d³p as:

d³p = dp = 4π·|p|²·d|p| = 4π·p²·dp

Note that we’ll have the same formula for d³v, of course: d³v = dv = 4π·|v|²·d|v| = 4π·v²·dv, and also note that we get that same 4π·v² factor which we mentioned when discussing the f(v) and f(v) formulas. That is not a coincidence, of course, but – as I’ll explain in a moment – it is not so easy to immediately relate the formulas. In any case, we’re now ready to relate dE and dp so we can re-write that d³p formula in terms of m, E and dE:

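E = p²/2m ⇒ dE = (p/m)·dp ⇒ dp = (m/p)·dE = m·dE/√(2mE), so that d³p = 4π·p²·dp = 4π·(2mE)·m·dE/√(2mE) = 4π·m·√(2mE)·dE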

We are now – finally! – sufficiently armed to derive all of the formulas we want – or need. Let me just copy them from the mentioned Wikipedia article:

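f_p(p_x, p_y, p_z) = (2π·m·kT)^{−3/2}·e^{−(p_x² + p_y² + p_z²)/2mkT}

f_E(E) = 2·√(E/π)·(1/kT)^{3/2}·e^{−E/kT}

f_v(v_x, v_y, v_z) = (m/2πkT)^{3/2}·e^{−m·(v_x² + v_y² + v_z²)/2kT}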
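The step from the momentum density to the energy density can be checked numerically. The sketch below assumes the standard forms of f_p and f_E, together with the d³p = 4π·m·√(2mE)·dE substitution. The values of m and kT are arbitrary units picked for the check, and the function names are hypothetical.

```python
import math

def f_p(p, m, kT):
    """Momentum-space density, evaluated at magnitude p (it only depends on |p|)."""
    return (2 * math.pi * m * kT) ** -1.5 * math.exp(-p * p / (2 * m * kT))

def f_E(E, kT):
    """Energy density, written out directly."""
    return 2 * math.sqrt(E / math.pi) * kT ** -1.5 * math.exp(-E / kT)

def f_E_from_f_p(E, m, kT):
    """Energy density obtained from f_p via d3p = 4*pi*m*sqrt(2mE)*dE."""
    p = math.sqrt(2 * m * E)
    return f_p(p, m, kT) * 4 * math.pi * m * math.sqrt(2 * m * E)

m, kT = 2.0, 1.5   # arbitrary units, chosen only for the check
for E in (0.1, 1.0, 5.0):
    print(E, f_E(E, kT), f_E_from_f_p(E, m, kT))
```

The two columns agree, and the mass drops out of the energy density altogether, which is why f_E(E) only has kT in it.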

As said, you’ll encounter these formulas regularly – and so it’s good that you know how you can derive them. Indeed, the derivation is very straightforward and is done in the same article: the tips I gave you should allow you to read it in a couple of minutes only. Only the density function for velocities might cause you a bit of trouble – but only for a very short moment: just use the p = m·v equation to write d³p as d³p = 4π·p²·dp = 4π·m²·v²·m·dv = 4π·m³·v²·dv = m³·d³v, and you’re all set. 🙂

Of course, you will recognize the formula for the distribution of velocities: it’s the f(v) for the velocity vector that we mentioned in the introduction. However, you’re more likely to need the f(v) formula for the speed (i.e. the probability density function for the scalar v) than the function for the vector. So how can we get that formula for the distribution of speeds – with the 4π·v² factor – from the formula for the distribution of velocities?

Well… I wish I could give you an easy answer. In fact, the same Wikipedia article suggests it’s easy – but it’s not. It involves a transformation from Cartesian to spherical coordinates: the volume element dv_x·dv_y·dv_z is to be written as v²·sinθ·dv·dθ·dφ. And then… Well… Have a look at this link. 🙂 It involves a so-called Jacobian transformation matrix. If you want to know more about it, then I recommend you read some article on how to transform distribution functions: here’s a link to one of those, but you can easily google others. Frankly, I’d suggest you just accept the formula for f(v) for now. 🙂 Let me copy it from the same article in a slightly different form:

f(v) = 4π·(m/2πkT)^{3/2}·v²·e^{−m·v²/2kT}

Now, the final thing to note is that you’ll often want to use so-called normalized velocities, i.e. velocities that are defined as a v/v₀ ratio, with v₀ the most probable speed, which is equal to √(2kT/m). You get that value by calculating the df(v)/dv derivative, and then finding the value v = v₀ for which df(v)/dv = 0. You should now be able to verify the formula that is used in the mentioned MIT version of the Stern-Gerlach experiment:

f(v)·dv = (4/√π)·(v²/v₀³)·e^{−v²/v₀²}·dv

Indeed, when you write it all out – note that π/π^{3/2} = 1/√π 🙂 – you’ll see the two formulas are effectively equivalent.

Of course, by now you are completely formula-ed out, and so you probably don’t even wonder what that f(v)·dv product actually stands for. What does it mean, really? Now you’ll sigh: why would I even want to know that? Well… I want you to understand that MIT experiment. 🙂 And you won’t if you don’t know what f(v)·dv actually represents. So think about it. […]
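Here is a small numerical check of the two claims above: that the speed density peaks at v₀ = √(2kT/m), and that the form written in terms of v₀ is the same function as the one written in terms of m and kT. It is a sketch in arbitrary units (m = kT = 1), and both function names are my own.

```python
import math

def f_speed(v, m, kT):
    """Speed density written in terms of m and kT."""
    return 4 * math.pi * (m / (2 * math.pi * kT)) ** 1.5 * v * v * math.exp(-m * v * v / (2 * kT))

def f_speed_normalized(v, v0):
    """Same density written in terms of the most probable speed v0."""
    return 4 / math.sqrt(math.pi) * v * v / v0 ** 3 * math.exp(-(v / v0) ** 2)

m, kT = 1.0, 1.0                 # arbitrary units
v0 = math.sqrt(2 * kT / m)       # claimed most probable speed

# Locate the maximum by brute force on a fine grid instead of differentiating.
grid = [i * 1e-4 for i in range(1, 50001)]
v_peak = max(grid, key=lambda v: f_speed(v, m, kT))

print("v0 from the formula :", v0)
print("peak on the grid    :", v_peak)
print("difference of forms :", f_speed(0.7, m, kT) - f_speed_normalized(0.7, v0))
```

A grid search is obviously not how you would do this on paper, but it confirms the result of the df(v)/dv = 0 calculation without any calculus.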

[…] OK. Let me help you once more. Remember the normalization condition once again: the integral of the whole thing – over the whole range of possible speeds – needs to add up to 1, so f(v)·dv is really the fraction of (potassium) atoms (inside the oven) with a speed in the (infinitesimally small) interval between v and v + dv. It’s going to be a tiny fraction, of course: just a tiny bit larger than zero. Surely not larger than 1, obviously. 🙂 Think of integrating the function between two values – say v₁ and v₂ – that are pretty close to each other.
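To see f(v)·dv as a fraction, you can integrate the density between two nearby speeds. The sketch below works with the normalized speed x = v/v₀ and uses the antiderivative erf(x) − (2/√π)·x·e^{−x²}, which you can verify by differentiating. The function names and the choice of interval are assumptions of mine.

```python
import math

def speed_cdf(x):
    """Fraction of atoms with speed below x*v0, with x = v/v0."""
    return math.erf(x) - 2 / math.sqrt(math.pi) * x * math.exp(-x * x)

def fraction_between(x1, x2):
    """Fraction of atoms with normalized speed between x1 and x2."""
    return speed_cdf(x2) - speed_cdf(x1)

# A narrow interval just above the most probable speed (x = 1).
narrow = fraction_between(1.0, 1.01)
# The 'density times interval' estimate f(v0)*dv for the same interval.
estimate = 4 / math.sqrt(math.pi) * math.exp(-1.0) * 0.01

print("exact fraction      :", narrow)
print("f(v)*dv estimate    :", estimate)
print("all speeds together :", fraction_between(0.0, 50.0))
```

For a narrow interval the exact fraction and the density-times-interval estimate are nearly indistinguishable, which is all that f(v)·dv is meant to say.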

So… Well… We’re done as for now. So where are we now in terms of understanding the calculations in that description of that MIT experiment? Well… We’ve got the meat. But we need a lot of other ingredients now. We’ll want formulas for the intensity of the beam at some point along the axis measuring its deflection from its main direction. That axis is the z-axis. So we’ll want a formula for some I(z) function.

Deflection? Yes. There are a lot of steps to go through now. Here’s the set-up:

[Figure: set-up]

First, we’ll need some formula measuring the flux of (potassium) atoms coming out of the oven. And then… Well… Just have a look and try to make your way through the whole thing now – which is just what I want to do in the coming days, so I’ll give you some more feedback soon. 🙂 Here I only wanted to introduce those formulas for the distribution of velocities and momenta, because you’ll need them in other contexts too.

So I hope you found this useful. Stuff like this all makes it somewhat more real, doesn’t it? 🙂 Frankly, I think the math is at least as fascinating as the physics. We could have a closer look at those distributions, for example, by noting the following:

1. The probability density function for the momenta is the product of three normal distributions. Which ones? Well… The distribution of p_x, p_y and p_z respectively: three normal distributions whose variance is equal to mkT. 🙂

2. The f_E(E) function is a chi-squared (χ²) distribution with 3 degrees of freedom (up to a scale factor: strictly speaking, it is 2E/kT that is χ²-distributed). Now, we have the equipartition theorem (which you should know – if you don’t, see my post on it), which tells us that this energy is evenly distributed among all three degrees of freedom. It is then relatively easy to show – if you know something about χ² distributions at least 🙂 – that the energy per degree of freedom (which we’ll write as ε below) will also be distributed as a chi-squared distribution with one degree of freedom:

f_ε(ε)·dε = √(1/(π·ε·kT))·e^{−ε/kT}·dε

This holds true for any number of degrees of freedom. For example, a diatomic molecule will have extra degrees of freedom, which are related to its rotational and vibrational motion (I explained that in my June-July 2015 posts too, so please go there if you’d want to know more). So we can really use this stuff in, for example, the theory of the specific heat of gases. 🙂

3. The function for the distribution of the velocities is also a product of three independent normally distributed variables – just like the density function for momenta. In this case, we have the v_x, v_y and v_z variables that are normally distributed, with variance kT/m.
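The three remarks above lend themselves to a quick Monte Carlo check: draw v_x, v_y and v_z independently from a normal distribution with variance kT/m, and look at the average kinetic energy per component. This is a sketch in arbitrary units with a fixed seed. The function name and the sample size are choices of mine.

```python
import random

def sample_energy_per_component(n, kT=1.0, m=1.0, seed=12345):
    """Average kinetic energy per velocity component over n sampled atoms."""
    rng = random.Random(seed)
    sigma = (kT / m) ** 0.5          # standard deviation of each velocity component
    sums = [0.0, 0.0, 0.0]
    for _ in range(n):
        for axis in range(3):
            v = rng.gauss(0.0, sigma)
            sums[axis] += 0.5 * m * v * v
    return [s / n for s in sums]

means = sample_energy_per_component(100_000)
print("mean energy per degree of freedom (units of kT):", means)
print("mean total kinetic energy (units of kT):", sum(means))
```

Each component should come out close to kT/2 and the total close to 3kT/2, which is the equipartition theorem at work.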

So… Well… I’m done – for the time being, that is. 🙂 Isn’t it a privilege to be alive and to be able to savor all these little wonderful intellectual excursions? I wish you a very nice day and hope you enjoy stuff like this as much as I do. 🙂

The quantization of magnetic moments

Original post:

You may not have many questions after a first read of Feynman’s Lecture on the Stern-Gerlach experiment and his more general musings on the quantization of the magnetic moment of an elementary particle. [At least I didn’t have all that many after my first reading, which I summarized in a previous post.]

However, a second, third or fourth reading should trigger some, I’d think. My key question is the following: what happens to that magnetic moment of a particle – and its spin [1] – as it travels through a homogeneous or inhomogeneous magnetic field? We know – or, to be precise, we assume – its spin is either “up” (J_z = +ħ/2) or “down” (J_z = −ħ/2) when it enters the Stern-Gerlach apparatus, but then – when it’s moving in the field itself – we would expect that the magnetic field would, somehow, line up the magnetic moment, right?

Feynman says that it doesn’t: from all of the schematic drawings – and the subsequent discussion of Stern-Gerlach filters – it is obvious that the magnetic field – which we denote as B, and which we assume to be inhomogeneous [2] – should not result in a change of the magnetic moment. Feynman states it as follows: “The magnetic field produces a torque. Such a torque you would think is trying to line up the (atomic) magnet with the field, but it only causes its precession.”

[…] OK. That’s too much information already, I guess. Let’s start with the basics. The key to a good understanding of this discussion is the force formula:

F_z = −∂U/∂z

We should first explain this formula before discussing the obvious question: over what time – or over what distance – should we expect this force to pull the particle up or down in the magnetic field? Indeed, if the force ends up aligning the moment, then the force will disappear!

So let’s first explain the formula. We start by explaining the energy U. U is the potential energy of our particle, which it gets from its magnetic moment μ and its orientation in the magnetic field B. To be precise, we can write the following:

U_mag = −μ∙B = −μ·B·cosθ

Of course, μ and B are the magnitudes of μ and B respectively, and θ is the angle between μ and B: if the angle θ is zero, then U_mag will be negative. Hence, the total energy of our particle (U) will actually be less than what it would be without the magnetic field: it is the energy when the magnetic moment of our particle is fully lined up with the magnetic field. When the angle is a right angle (θ = ±π/2), then the energy doesn’t change (U_mag = 0). Finally, when θ is equal to π or −π, then its energy will be more than what it would be outside of the magnetic field. [Note that the angle θ effectively varies between –π and π – not between 0 and 2π!]

[Figure: angles]

Of course, we may already note that, in quantum mechanics, U_mag will only take on a very limited set of values. To be precise, for a particle with spin number j = 1/2, the possible values of U_mag will be limited to two values only. We will come back to that in a moment. First that force formula.
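The three cases discussed above (θ = 0, a right angle, and θ = π) can be tabulated in a couple of lines. The sketch assumes the classical expression U_mag = −μ·B·cosθ, uses the Bohr magneton as an illustrative value for μ, and a field of 1 tesla that I picked arbitrarily.

```python
import math

MU_B = 9.2740100783e-24   # Bohr magneton (J/T), used here as an illustrative moment
B_FIELD = 1.0             # field strength in tesla (hypothetical)

def u_mag(mu, B, theta):
    """Classical energy of a magnetic moment mu at angle theta to a field B."""
    return -mu * B * math.cos(theta)

for label, theta in (("aligned", 0.0), ("right angle", math.pi / 2), ("opposite", math.pi)):
    print(f"{label:12s} theta = {theta:.4f} rad   U_mag = {u_mag(MU_B, B_FIELD, theta):+.3e} J")
```

The energy goes from −μ·B through zero to +μ·B, so the whole range is only 2·μ·B wide, which is tiny compared to thermal energies at oven temperatures.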

Energy is force applied over a distance. To be precise, when a particle is moved from point a to point b, then its change in energy can be written as the following line integral:

ΔU = U_b − U_a = −∫_a^b F∙ds

Note that the minus sign is there because of the convention that we’re doing work against the force when increasing the (potential) energy of whatever we’re moving. Also note that the F∙ds product is a dot product of two vectors, so it is a scalar: it is, obviously, equal to F_t times ds, with F_t the magnitude of the tangential component of the force. The equation above gives us that force formula:

F_z = −∂U/∂z

Feynman calls it the principle of virtual work, which sounds a bit mysterious – but you get it simply by taking the derivative of both sides of the energy formula.

Let me now get back to the real mystery of quantum mechanics, which tells us that the magnetic moment – as measured along our z-axis – will only take one of two possible values. To be precise, we have the following formula for μ_z:

μ_z = −g·(q_e/2m)·J_z

This is a formula you just have to accept for the moment. It needs a bit of interpretation, and you need to watch out for the sign. The g-factor is the so-called Landé g-factor: it is equal to 1 for a so-called pure orbital moment, 2 for a so-called pure spin moment, and some number in-between in reality, which is always some mixture of the two: both the electron’s orbit around the nucleus as well as the electron’s rotation about its own axis contribute to the total angular momentum and, hence, to the total magnetic moment of our electron. As for the other factors, m and q_e are, of course, the mass and the charge of our electron, and J_z is either +ħ/2 or −ħ/2. Hence, if we know g, we can easily calculate the two possible values for μ_z.
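To put numbers on this, the sketch below evaluates μ_z = −g·(q_e/2m)·J_z, i.e. the relation given in footnote [1] applied to the z-component, for the two allowed values of J_z. The constants are the usual SI values, the two g-values are just the limiting cases mentioned above, and the function name is mine.

```python
Q_E = 1.602176634e-19      # elementary charge (C)
M_E = 9.1093837015e-31     # electron mass (kg)
HBAR = 1.054571817e-34     # reduced Planck constant (J*s)

def mu_z(g, j_z):
    """z-component of the magnetic moment for a given g-factor and J_z."""
    return -g * Q_E / (2 * M_E) * j_z

for g in (1.0, 2.0):       # pure orbital and pure spin moment
    up = mu_z(g, +HBAR / 2)
    down = mu_z(g, -HBAR / 2)
    print(f"g = {g}: mu_z = {up:+.4e} J/T (J_z = +hbar/2), {down:+.4e} J/T (J_z = -hbar/2)")
```

For g = 2 the magnitude is the Bohr magneton, and the minus sign in the formula means that “spin up” goes with a magnetic moment pointing down.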

Now, that also means we could – theoretically – calculate the two possible values of that angle θ. For some reason, no handbook in physics ever does that. The reason is probably a good one: electron orbits, and the concept of spin itself, are not like the orbit and the spin of some planet in a planetary system. In fact, we know that we should not think of electrons like that at all: quantum physicists tell us we may only think of it as some kind of weird cloud around a center. That cloud has a density which is to be calculated by taking the absolute square of the quantum-mechanical amplitude of our electron.

In fact, when thinking about the two possible values for θ, we may want to remind ourselves of another peculiar consequence of the fact that the angular momentum – and, hence, the magnetic moment – is not continuous but quantized: the magnitude of the angular momentum J is not J = √(J·J) = √J² in quantum mechanics but J = √(J·J) = √[j·(j+1)·ħ²] = √[j·(j+1)]·ħ. For our electron, j = 1/2 and, hence, the magnitude of J is equal to J = √[(1/2)∙(3/2)]∙ħ = √(3/4)∙ħ ≈ 0.866∙ħ. Hence, the magnitude of the angular momentum is larger than the maximum value of J_z – and not just a little bit, because the maximum value of J_z is ħ/2! That leads to that weird conclusion: in quantum mechanics, we find that the angular momentum is never completely along any one direction [3]! In fact, this conclusion basically undercuts the very idea of the angular momentum – and, hence, the magnetic moment – of having any precise direction at all! [This may sound spectacular, but there is actually a classical equivalent to the idea of the angular momentum having no precisely defined direction: gyroscopes may not only precess, but nutate as well. Nutation refers to a kind of wobbling around the direction of the angular momentum. For more details, see the post I wrote after my first reading of Feynman’s Lecture on the quantization of magnetic moments. :-)]
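If you insist on calculating those two values of θ anyway, the semi-classical vector picture gives cosθ = J_z/|J|. The sketch below does just that. It treats J as an ordinary vector of length √[j·(j+1)]·ħ, which is a picture I am assuming for the sake of the calculation and which, as said, should not be taken too literally. The function names are hypothetical.

```python
import math

def j_magnitude(j):
    """|J| in units of hbar."""
    return math.sqrt(j * (j + 1))

def angles_deg(j):
    """(J_z, angle to the z-axis in degrees) for each allowed J_z, in units of hbar."""
    n = int(round(2 * j)) + 1
    jz_values = [j - k for k in range(n)]
    return [(jz, math.degrees(math.acos(jz / j_magnitude(j)))) for jz in jz_values]

print("|J| for j = 1/2 (units of hbar):", j_magnitude(0.5))
for jz, angle in angles_deg(0.5):
    print(f"J_z = {jz:+.1f} hbar  ->  theta = {angle:.2f} degrees")
```

The angle never gets anywhere near zero, which is the numerical version of the statement that the angular momentum is never completely along any one direction.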

Let’s move on. So if, in quantum mechanics, we cannot associate the magnetic moment – or the angular momentum – with some specific direction, then how should we imagine it? Well… I won’t dwell on that here, but you may want to have a look at another post of mine, where I develop a metaphor for the wavefunction which may help you to sort of understand what it might be. The metaphor may help you to think of some oscillation in two directions – rather than in one only – with the two directions separated by a right angle. Hence, the whole thing obviously points in some direction but it’s not very precise. In any case, I need to move on here.

We said that the magnetic moment will take one of two values only, in any direction along which we’d want to measure it. We also said that the (maximum) value along that direction – any direction, really – will be smaller than the magnitude of the moment. [To be precise, we said that for the angular momentum, but the formulas above make it clear the conclusions also hold for the magnetic moment.] So that means that the magnetic moment is, in fact, never fully aligned with the magnetic field. Now, if it is not aligned – and, importantly, if it also does not line up – then it should precess. Now, precession is a difficult enough concept in classical mechanics, so you may think it’s going to be totally abstruse in quantum mechanics. Well… That is true – to some extent. At the same time, it is surely not unintelligible. I will not repeat Feynman’s argument here, but he uses the classical formulas once more to calculate an angular velocity and a precession frequency – although he doesn’t explain what they might actually physically represent. Let me just jot down the formula for the precession frequency:

ω_p = g·(q_e/2m)·B

We get the same factors: g, q_e and m. In addition, you should also note that the precession frequency is directly proportional to the strength of the magnetic field, which makes sense. Now, you may wonder: what is the relevance of this? Can we actually measure any of this?
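To get a feel for the orders of magnitude, the sketch below evaluates the classical precession formula ω_p = g·(q_e/2m)·B for an electron. Both the formula as written here and the field strengths are assumptions on my part (the field values are not taken from any actual experiment), and the function name is made up.

```python
import math

Q_E = 1.602176634e-19      # elementary charge (C)
M_E = 9.1093837015e-31     # electron mass (kg)

def precession_omega(g, B):
    """Angular precession frequency (rad/s) for g-factor g in a field B (tesla)."""
    return g * Q_E / (2 * M_E) * B

for B in (0.01, 0.1, 1.0):             # hypothetical field strengths in tesla
    omega = precession_omega(2.0, B)   # g = 2: pure spin moment
    print(f"B = {B:5.2f} T   omega_p = {omega:.4e} rad/s   f = {omega / (2 * math.pi):.4e} Hz")
```

Doubling the field doubles the frequency, and for fields of the order of a tesla we are already in the tens of gigahertz for g = 2.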

We can. In fact, you may wonder about the if I inserted above: if we can measure the Landé g-factor… Can we? We can. It’s done in a resonance experiment, which is referred to as the Rabi molecular-beam method – but then it might also be just an atomic beam, of course!

The experiment is interesting, because it shows the precession is – somehow – real. It also illustrates some other principles we have been describing above.

The set-up looks pretty complicated. We have a series of three magnets. The first magnet is just a Stern-Gerlach apparatus: a magnet with a very sharp edge on one of the pole tips so as to produce an inhomogeneous magnetic field. Indeed, a homogeneous magnetic field implies that ∂B/∂z = 0 and, hence, the force along the z-direction would be zero and our atomic magnets would not be displaced.

The second magnet is more complicated. Its magnetic field is uniform, so there are no vertical forces on the atoms and they go straight through. However, the magnet includes an extra set of coils that can produce an alternating horizontal field as well. I’ll come back to that in a moment. Finally, the third magnet is just like the first one, but with the field inverted. Have a look at it:

[Figure: rabi-apparatus]

It may not look very obvious but, after some thinking, you’ll agree that the atoms can only arrive at the detector if they follow the trajectories a and/or b. In fact, these trajectories are the only possible ones because of the slits S₁ and S₂.

Now what’s the idea of that horizontal field B’ in magnet 2? In a classical situation, we could change the angular momentum – and the magnetic moment – by applying some torque about the z-axis. The idea is shown in Figure (a) and (b) below.

[Figure: changing-angular-momentum]

Figure (a) shows – or tries to show – some rotating field B’ – one that is always at right angles to both the angular momentum as well as to the (uniform) B field. That would be effective. However, Figure (b) shows another arrangement that is almost equally effective: an oscillating field that sort of pulls and pushes at some frequency ω. Classically, such fields would effectively change the angle of our gyroscope with respect to the z-axis. Is it also the case quantum-mechanically?

It turns out it sort of works the same in quantum mechanics. There is a big difference though. Classically, μ_z would change gradually, but in quantum mechanics it cannot: in quantum mechanics, it must jump suddenly from one value to the other, i.e. J_z must go from +ħ/2 to −ħ/2, or the other way around. In other words, it must flip up or down. Now, if an atom flips, then it will, of course, no longer follow the (a) or (b) trajectories: it will follow some other path, like a’ or b’, which makes it crash into the magnet. Now, it turns out that almost all atoms will flip if we get that frequency ω right. The graph below shows this ‘resonance’ phenomenon: there is a sharp drop in the ‘current’ of atoms if ω is close to or equal to ω_p.

[Figure: resonance]

What’s ω_p? It’s that precession frequency for which we gave you that formula above. To make a long story short, from the experiment, we can calculate the Landé g-factor for that particular beam of atoms – say, silver atoms [4]. So… Well… Now we know it all, don’t we?

Maybe. As mentioned when I started this post, when going through all of this material, I always wonder why there is no magnetization effect: why would an atom remain in the same state when it crosses a magnetic field? When it’s already aligned with the magnetic field – to the maximum extent possible, that is – then it shouldn’t flip, but what if its magnetic moment is opposite? It should lower its energy by flipping, right? And it should flip just like that. Why would it need an oscillating B’ field?

In fact, Feynman does describe how the magnetization phenomenon can be analyzed – classically and quantum-mechanically – but he does that for bulk materials: solids, or liquids, or gases – anything that involves lots of atoms that are kicked around because of the thermal motions. So that involves statistical mechanics – which I am sure you’ve skipped so far. 🙂 It is a beautiful argument – which ends with an equally beautiful formula, which tells us the magnetization (M) of a material – which is defined as the net magnetic moment per unit volume – has the same direction as the magnetic field (B) and a magnitude M that is proportional to the magnitude of B:

The μ in this formula is the magnitude of the magnetic moment of the individual atoms and so… Well… It’s just like the formula for the electric polarization P, which we described in some other post. In fact, the formula for P and M are same-same but different, as they would say in Thailand. 🙂 But this wonderful story doesn’t answer our question. The magnetic moment of an individual particle should not stay what it is: if it doesn’t change because of all the kicking around as a result of thermal motions, then… Well… These little atomic magnets should line up. That means atoms with their spin “up” should go into the “spin-down” state.

I don’t have an answer to my own question as for now. I suspect it’s got to do with the strength of the magnetic field: a Stern-Gerlach apparatus involves a weak magnetic field. If it’s too strong, the atomic magnets must flip. Hence, a more advanced analysis should probably include that flipping effect. When quickly googling – just now – I found an MIT lab exercise on it, which also provides a historical account of the Stern-Gerlach experiment itself. I skimmed through it – and will read all of it in the coming days – but let me just quote this from the historical background section:

“Stern predicted that the effect would be just barely observable. They had difficulty in raising support in the midst of the post war financial turmoil in Germany. The apparatus, which required extremely precise alignment and a high vacuum, kept breaking down. Finally, after a year of struggle, they obtained an exposure of sufficient length to give promise of an observable silver deposit. At first, when they examined the glass plate they saw nothing. Then, gradually, the deposit became visible, showing a beam separation of 0.2 millimeters! Apparently, Stern could only afford cheap cigars with a high sulfur content. As he breathed on the glass plate, sulfur fumes converted the invisible silver deposit into visible black silver sulfide, and the splitting of the beam was discovered.”

Isn’t this funny? And great at the same time? 🙂 But… Well… The point is: the paper for that MIT lab exercise makes me realize Feynman does cut corners when explaining stuff – and some corners are more significant than others. I note, for example, that they talk about interference peaks rather than “two distinct spots on the glass plate.” Hence, the analysis is somewhat more sophisticated than Feynman pretends it to be. So, when everything is said and done, Feynman’s Lectures may indeed be reading for undergraduate students only. Is it time to move on?

[1] The magnetic moment – as measured in a particular coordinate system – is equal to μ = −g·[q/(2m)]·J. The factor J in this expression is the angular momentum, and the coordinate system is chosen such that its z-axis is along the direction of the magnetic field B. The component of J along the z-axis is written as J_z. This z-component of the angular momentum is what is, rather loosely, being referred to as the spin of the particle in this context. In most other contexts, spin refers to the spin number j which appears in the formula for the value of J_z, which is J_z = j∙ħ, (j−1)∙ħ, (j−2)∙ħ,…, (−j+2)∙ħ, (−j+1)∙ħ, −j∙ħ. Note the separation between the possible values of J_z is equal to ħ. Hence, j itself must be an integer (e.g. 1 or 2) or a half-integer (e.g. 1/2). We usually look at electrons, whose spin number j is 1/2.

[2] One of the pole tips of the magnet that is used in the Stern-Gerlach experiment has a sharp edge. Therefore, the magnetic field strength varies with z. We write: ∂B/∂z ≠ 0.

[3] The z-direction can be any direction, really.

[4] The original experiment was effectively done with a beam of silver atoms. The lab exercise which MIT uses to show the effect to physics students involves potassium atoms.

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

Feynman’s Seminar on Superconductivity (1)

Pre-script (dated 26 June 2020): This post got mutilated by the removal of some material by the dark force. You should be able to follow the main story line, however. If anything, the lack of illustrations might actually help you to think things through for yourself. In any case, we now have different views on these concepts as part of our realist interpretation of quantum mechanics, so we recommend you read our recent papers instead of these old blog posts.

Original post:

The ultimate challenge for students of Feynman’s iconic Lectures series is, of course, to understand his final one: A Seminar on Superconductivity. As he notes in his introduction to this formidably dense piece, the text does not present the detail of each and every step in the development and, therefore, we’re not supposed to immediately understand everything. As Feynman puts it: we should just believe (more or less) that things would come out if we would be able to go through each and every step. Well… Let’s see. Feynman throws a lot of stuff in here—including, I suspect, some stuff that may not be directly relevant, but that he sort of couldn’t insert into all of his other Lectures. So where do we start?

It took me one long maddening day to figure out the first formula:

〈b|a〉_{in A} = 〈b|a〉_{A = 0}·exp[(i·q/ħ)·∫_a^b A·ds]

It says that the amplitude for a particle to go from a to b in a vector potential (think of a classical magnetic field) is the amplitude for the same particle to go from a to b when there is no field (A = 0) multiplied by the exponential of the line integral of the vector potential times the electric charge divided by Planck’s constant. I stared at this for quite a while, but then I recognized the formula for the magnetic effect on an amplitude, which I described in my previous post, which tells us that a magnetic field will shift the phase of the amplitude of a particle with an amount equal to:

φ = (q/ħ)·∫_a^b A·ds

Hence, if we write 〈b|a〉 for A = 0 as 〈b|a〉_{A = 0} = C·e^{iθ}, then 〈b|a〉 in A will, naturally, be equal to 〈b|a〉_{in A} = C·e^{i(θ+φ)} = C·e^{iθ}·e^{iφ} = 〈b|a〉_{A = 0}·e^{iφ}, and so that explains it. 🙂 Alright… Next. Or… Well… Let us briefly re-examine the concept of the vector potential, because we’ll need it a lot. We introduced it in our post on magnetostatics. Let’s briefly re-cap the development there. In Maxwell’s set of equations, two out of the four equations give us the magnetic field: ∇•B = 0 and c²∇×B = j/ε₀. We noted the following in this regard:

The ∇•B = 0 equation is true, always, unlike the ∇×E = 0 expression, which is true for electrostatics only (no moving charges). So the ∇•B = 0 equation says the divergence of B is zero, always.
The divergence of the curl of a vector field is always zero. Hence, if A is some vector field, then div(curl A) = ∇•(∇×A) = 0, always.
We can now apply another theorem: if the divergence of a vector field, say D, is zero – so if ∇•D = 0 – then D will be the curl of some other vector field C, so we can write: D = ∇×C. Applying this to ∇•B = 0, we can write:

If ∇•B = 0, then there is an A such that B = ∇×A
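If you want to convince yourself that the divergence of a curl really vanishes, here is a minimal numerical sketch. The field `vector_potential` is entirely made up for the illustration, and the step size `H` is an arbitrary choice: the script just builds B = ∇×A with central differences and then takes the divergence of the result.

```python
import math

H = 1e-3  # finite-difference step; an arbitrary choice for this sketch

def vector_potential(x, y, z):
    """A made-up smooth vector field A, used only for illustration."""
    return (math.sin(y * z), x * x * z, math.cos(x * y))

def partial(field, i, j, point):
    """Central difference of component i of `field` along coordinate j."""
    up, down = list(point), list(point)
    up[j] += H
    down[j] -= H
    return (field(*up)[i] - field(*down)[i]) / (2 * H)

def curl(field):
    """Return a function that evaluates curl(field) numerically."""
    def curl_at(x, y, z):
        p = (x, y, z)
        return (
            partial(field, 2, 1, p) - partial(field, 1, 2, p),  # dAz/dy - dAy/dz
            partial(field, 0, 2, p) - partial(field, 2, 0, p),  # dAx/dz - dAz/dx
            partial(field, 1, 0, p) - partial(field, 0, 1, p),  # dAy/dx - dAx/dy
        )
    return curl_at

def divergence(field, point):
    """Numerical divergence: sum of d(field_i)/d(x_i)."""
    return sum(partial(field, i, i, point) for i in range(3))

magnetic_field = curl(vector_potential)  # B = curl A
point = (0.3, -0.7, 1.1)
div_b = divergence(magnetic_field, point)
print("B at the point:    ", magnetic_field(*point))
print("div B at the point:", div_b)  # zero up to rounding error
```

Whatever smooth A you plug in, the divergence of the resulting B comes out as zero (up to rounding), which is why writing B as a curl takes care of ∇•B = 0 automatically.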

So, in essence, we’re just re-defining the magnetic field (B) in terms of some other vector field. To be precise, we write it as the curl of some other vector field, which we refer to as the (magnetic) vector potential. The components of the magnetic field vector can then be re-written as:

B_x = ∂A_z/∂y − ∂A_y/∂z, B_y = ∂A_x/∂z − ∂A_z/∂x, B_z = ∂A_y/∂x − ∂A_x/∂y

We need to note an important point here: the equations above suggest that the components of B depend on position only. In other words, we assume static magnetic fields, so they do not change with time. That, in turn, assumes steady currents. We will want to extend the analysis to also include magnetodynamics. It complicates the analysis but… Well… Quantum mechanics is complicated. Let us remind ourselves here of Feynman’s re-formulation of Maxwell’s equations as a set of two equations (expressed in terms of the magnetic (vector) and the electric potential) only:

∇²A − (1/c²)·∂²A/∂t² = −j/(ε₀·c²)

∇²φ − (1/c²)·∂²φ/∂t² = −ρ/ε₀

These equations are wave equations, as you can see by writing out the second equation:

∂²φ/∂x² + ∂²φ/∂y² + ∂²φ/∂z² − (1/c²)·∂²φ/∂t² = −ρ/ε₀

It is a wave equation in three dimensions. Note that, even in regions where we do not have any charges or currents, we have non-zero solutions for φ and A. These non-zero solutions are, effectively, representing the electric and magnetic fields as they travel through free space. As Feynman notes, the advantage of re-writing Maxwell’s equations as we do above, is that the two new equations make it immediately apparent that we’re talking electromagnetic waves, really. As he notes, for many practical purposes, it will still be convenient to use the original equations in terms of E and B, but… Well… Not in quantum mechanics, it turns out. As Feynman puts it: “E and B are on the other side of the mountain we have climbed. Now we are ready to cross over to the other side of the peak. Things will look different – we are ready for some new and beautiful views.”

Well… Maybe. Appreciating those views, as part of our study of quantum mechanics, does take time and effort, unfortunately. 😦
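To make the wave-equation claim a bit more tangible, here is a small sketch that checks, numerically and in one dimension only, that any shape travelling at speed c, so φ = f(x − c·t), satisfies ∂²φ/∂x² − (1/c²)·∂²φ/∂t² = 0 in a region without charges. The Gaussian pulse, the value of `C` and the step `H` are all assumptions chosen for the illustration.

```python
import math

C = 3.0   # wave speed in arbitrary units (an assumption for this sketch)
H = 1e-3  # finite-difference step

def phi(x, t):
    """A Gaussian pulse travelling in the +x direction: phi = f(x - C*t)."""
    return math.exp(-(x - C * t) ** 2)

def second_derivative(f, u):
    """Central-difference estimate of f''(u)."""
    return (f(u + H) - 2 * f(u) + f(u - H)) / H ** 2

def wave_residual(x, t):
    """d2(phi)/dx2 - (1/C^2) * d2(phi)/dt2, which should vanish in free space."""
    d2_dx2 = second_derivative(lambda s: phi(s, t), x)
    d2_dt2 = second_derivative(lambda s: phi(x, s), t)
    return d2_dx2 - d2_dt2 / C ** 2

residual = wave_residual(0.4, 0.1)
curvature = second_derivative(lambda s: phi(s, 0.1), 0.4)
print("spatial curvature of the pulse:", curvature)  # clearly non-zero
print("wave-equation residual:        ", residual)   # tiny by comparison
```

The two second derivatives are each far from zero, but they cancel in the combination above: that is all a wave equation says.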

The Schrödinger equation in an electromagnetic field

Feynman then jots down Schrödinger’s equation for the same particle (with charge q) moving in an electromagnetic field that is characterized not only by the (scalar) potential Φ but also by a vector potential A:

i·ħ·∂ψ/∂t = (1/2m)·[(ħ/i)·∇ − q·A]·[(ħ/i)·∇ − q·A]ψ + q·Φ·ψ

Now where does that come from? We know the standard formula in an electric field, right? It’s the formula we used to find the energy states of electrons in a hydrogen atom:

i·ħ·∂ψ/∂t = −(1/2)·(ħ²/m)∇²ψ + V·ψ

Of course, it is easy to see that we replaced V by q·Φ, which makes sense: the potential energy of a charge in an electric field is the product of the charge (q) and the (electric) potential (Φ), because Φ is, obviously, the potential energy of the unit charge. It’s also easy to see we can re-write −ħ²·∇²ψ as [(ħ/i)·∇]·[(ħ/i)·∇]ψ because (1/i)·(1/i) = 1/i² = 1/(−1) = −1. 🙂 Alright. So it’s just that −q·A term in the (ħ/i)∇ − q·A expression that we need to explain now.

Unfortunately, that explanation is not so easy. Feynman basically re-derives Schrödinger’s equation by repeating his trade-mark historical argument – which originally did not include any magnetic field – but now with a vector potential. The re-derivation is rather annoying, and I didn’t have the courage to go through it myself, so you should – just like me – just believe Feynman when he says that, when there’s a vector potential – i.e. when there’s a magnetic field – then that (ħ/i)·∇ operator – which is the momentum operator – ought to be replaced by a new momentum operator:

(ħ/i)·∇ → (ħ/i)·∇ − q·A

So… Well… There we are… 🙂 So far, so good? Well… Maybe.

While, as mentioned, you won’t be interested in the mathematical argument, it is probably worthwhile to reproduce Feynman’s more intuitive explanation of why the operator above is what it is. In other words, let us try to understand that −qA term. Look at the following situation: we’ve got a solenoid here, and some current I is going through it so there’s a magnetic field B. Think of the dynamics while we turn on this flux. Maxwell’s second equation (∇×E = −∂B/∂t) tells us the line integral of E around a loop will be equal to the time rate of change of the magnetic flux through that loop. The ∇×E = −∂B/∂t equation is a differential equation, of course, so it doesn’t have the integral, but you get the idea – I hope.

[Illustration: solenoid]

Now, using the B = ∇×A equation we can re-write the ∇×E = −∂B/∂t as ∇×E = −∂(∇×A)/∂t. This allows us to write the following:

∇×E = −∂(∇×A)/∂t = −∇×(∂A/∂t) ⇒ E = −∂A/∂t (taking the gradient term −∇Φ, which has zero curl anyway, to be zero here)

This is a remarkable expression. Note its derivation is based on the commutativity of the curl and time derivative operators, which is a property that can easily be explained: if we have a function in two variables—say x and t—then the order of the derivation doesn’t matter: we can first take the derivative with respect to x and then to t or, alternatively, we can first take the time derivative and then do the ∂/∂x operation. So… Well… The curl is, effectively, a derivative with regard to the spatial variables. OK. So what? What’s the point?

Well… If we’d have some charge q, as shown in the illustration above, that would happen to be there as the flux is being switched on, it will experience a force which is equal to F = qE. We can now integrate this over the time interval (t) during which the flux is being built up to get the following:

∫₀^t F·dt = ∫₀^t m·a·dt = ∫₀^t m·(dv/dt)·dt = m·v_t = ∫₀^t q·E·dt = −∫₀^t q·(∂A/∂t)·dt = −q·A_t

Assuming v₀ and A₀ are zero, we may drop the time subscript and simply write:

m·v = −q·A

The point is: during the build-up of the magnetic flux, our charge will pick up some (classical) momentum that is equal to p = m·v = −q·A. So… Well… That sort of explains the additional term in our new momentum operator.
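The impulse argument above is easy to check numerically. The sketch below integrates F = q·E = −q·∂A/∂t over the time it takes to switch on the flux and compares the momentum that was picked up with −q·A. The ramp shape and the values of `A_FINAL` and `T_RAMP` are hypothetical: any smooth switch-on should give the same end result.

```python
Q = 1.602e-19     # charge in coulombs (electron-sized, sign ignored)
M = 9.109e-31     # mass in kilograms
A_FINAL = 1.0e-9  # final value of A in T·m (hypothetical)
T_RAMP = 1.0e-3   # switch-on time in seconds (hypothetical)

def dA_dt(t):
    """Time derivative of the smooth ramp A(t) = A_FINAL * s^2 * (3 - 2s), s = t / T_RAMP."""
    s = t / T_RAMP
    return A_FINAL * 6.0 * s * (1.0 - s) / T_RAMP

def momentum_after_ramp(steps=10_000):
    """Integrate F = q*E = -q*dA/dt over the ramp with the midpoint rule."""
    dt = T_RAMP / steps
    momentum = 0.0
    for n in range(steps):
        t_mid = (n + 0.5) * dt
        force = -Q * dA_dt(t_mid)
        momentum += force * dt
    return momentum

p = momentum_after_ramp()
print("m*v after the ramp:", p)
print("-q*A              :", -Q * A_FINAL)
print("final velocity m/s:", p / M)
```

The shape of the ramp drops out completely: only the final value of A matters, which is exactly what m·v = −q·A says.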

Note: For some reason I don’t quite understand, Feynman introduces the weird concept of ‘dynamical momentum’, which he defines as the quantity m·v + q·A, so that quantity must be zero in the analysis above. I quickly googled to see why but didn’t invest too much time in the research here. It’s just… Well… A bit puzzling. I don’t really see the relevance of his point here: I am quite happy to go along with the new operator, as it’s rather obvious that introducing changing magnetic fields must, obviously, also have some impact on our wave equations—in classical as well as in quantum mechanics.

Local conservation of probability

The title of this section in Feynman’s Lecture (yes, still the same Lecture – we’re not switching topics here) is the equation of continuity for probabilities. I find it brilliant, because it confirms my interpretation of the wave function as describing some kind of energy flow. Let me quote Feynman on his endeavor here:

“An important part of the Schrödinger equation for a single particle is the idea that the probability to find the particle at a position is given by the absolute square of the wave function. It is also characteristic of the quantum mechanics that probability is conserved in a local sense. When the probability of finding the electron somewhere decreases, while the probability of the electron being elsewhere increases (keeping the total probability unchanged), something must be going on in between. In other words, the electron has a continuity in the sense that if the probability decreases at one place and builds up at another place, there must be some kind of flow between. If you put a wall, for example, in the way, it will have an influence and the probabilities will not be the same. So the conservation of probability alone is not the complete statement of the conservation law, just as the conservation of energy alone is not as deep and important as the local conservation of energy. If energy is disappearing, there must be a flow of energy to correspond. In the same way, we would like to find a “current” of probability such that if there is any change in the probability density (the probability of being found in a unit volume), it can be considered as coming from an inflow or an outflow due to some current.”

This is it, really ! The wave function does represent some kind of energy flow – between a so-called ‘real’ and a so-called ‘imaginary’ space, which are to be defined in terms of directional versus rotational energy, as I try to point out – admittedly: more by appealing to intuition than to mathematical rigor – in that post of mine on the meaning of the wavefunction.

So what is the flow – or probability current as Feynman refers to it? Well… Here’s the formula:

[Formula: Feynman’s expression for the probability current J]

Huh? Yes. Don’t worry too much about it right now. The essential point is to understand what this current – denoted by J – actually stands for:

∂P/∂t = −∇·J, with P = ψ*·ψ the probability density
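To get some feel for such a current, here is a minimal one-dimensional sketch. It uses the standard textbook form J = Re[ψ*·((ħ/i)·∂/∂x − q·A)ψ]/m, which I assume is equivalent to Feynman’s expression, and applies it to a plane wave ψ = e^{ikx} in a uniform vector potential. The natural units (ħ = m = q = 1) and all function names are just choices made for the illustration.

```python
import cmath

HBAR = 1.0  # natural units, an assumption for this sketch
M = 1.0
Q = 1.0
H = 1e-5    # finite-difference step

def psi(x, k):
    """Plane wave with wave number k."""
    return cmath.exp(1j * k * x)

def probability_current(x, k, a):
    """J = Re[psi* ((hbar/i) d/dx - q*a) psi] / m, with `a` a uniform x-component of A."""
    dpsi_dx = (psi(x + H, k) - psi(x - H, k)) / (2 * H)
    momentum_applied = (HBAR / 1j) * dpsi_dx - Q * a * psi(x, k)
    return (psi(x, k).conjugate() * momentum_applied).real / M

j_free = probability_current(0.3, 2.0, 0.0)
j_in_field = probability_current(0.3, 2.0, 0.5)
print("J without vector potential:", j_free)      # close to hbar*k/m
print("J with vector potential:   ", j_in_field)  # close to (hbar*k - q*a)/m
```

So the current is the probability density (which is 1 for this plane wave) times a velocity, and that velocity is (ħ·k − q·A)/m rather than ħ·k/m: the same −q·A shift we found classically.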

So what’s next? Well… Nothing. I’ll actually refer you to Feynman now, because I can’t improve on how he explains how pairs of electrons start behaving when temperatures are low enough to render Boltzmann’s Law irrelevant: the kinetic energy that’s associated with temperature can no longer break up electron pairs if temperature comes close to the zero point.

Huh? What? Electron pairs? Electrons are not supposed to form pairs, are they? They carry the same charge and are, therefore, supposed to repel each other. Well… Yes and no. In my post on the electron orbitals in a hydrogen atom – which just presented Feynman’s presentation on the subject-matter in a, hopefully, somewhat more readable format – we calculated electron orbitals neglecting spin. In Feynman’s words:

“We make another approximation by forgetting that the electron has spin. […] The non-relativistic Schrödinger equation disregards magnetic effects. [However] Small magnetic effects [do] occur because, from the electron’s point-of-view, the proton is a circulating charge which produces a magnetic field. In this field the electron will have a different energy with its spin up than with it down. [Hence] The energy of the atom will be shifted a little bit from what we will calculate. We will ignore this small energy shift. Also we will imagine that the electron is just like a gyroscope moving around in space always keeping the same direction of spin. Since we will be considering a free atom in space the total angular momentum will be conserved. In our approximation we will assume that the angular momentum of the electron spin stays constant, so all the rest of the angular momentum of the atom—what is usually called “orbital” angular momentum—will also be conserved. To an excellent approximation the electron moves in the hydrogen atom like a particle without spin—the angular momentum of the motion is a constant.”

To an excellent approximation… But… Well… Electrons in a metal do form pairs, because they can give up energy in that way and, hence, they are more stable that way. Feynman does not go into the details here – I guess because that’s way beyond the undergrad level – but refers to the Bardeen-Cooper-Schrieffer (BCS) theory instead – the authors of which got a Nobel Prize in Physics in 1972 (that’s a decade or so after Feynman wrote this particular Lecture), so I must assume the theory is well accepted now. 🙂

Of course, you’ll shout now: Hey! Hydrogen is not a metal! Well… Think again: the latest breakthrough in physics is making hydrogen behave like a metal. 🙂 And I am really talking the latest breakthrough: Science just published the findings of this experiment last month! 🙂 🙂 In any case, we’re not talking hydrogen here but superconducting materials, to which – as far as we know – the BCS theory does apply.

So… Well… I am done. I just wanted to show you why it’s important to work your way through Feynman’s last Lecture because… Well… Quantum mechanics does explain everything – although the nitty-gritty of it (the Meissner effect, the London equation, flux quantization, etc.) is a rather hard bullet to bite. 😦

Don’t give up ! I am struggling with the nitty-gritty too ! 🙂

The Aharonov-Bohm effect

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. In addition, I note the dark force has amused himself by removing some material. So no use to read this. Read my recent papers instead. 🙂

Original post:

This title sounds very exciting. It is – or was, I should say – one of these things I thought I would never ever understand, until I started studying physics, that is. 🙂

Having said that, there is – incidentally – nothing very special about the Aharonov-Bohm effect. As Feynman puts it: “The theory was known from the beginning of quantum mechanics in 1926. […] The implication was there all the time, but no one paid attention to it.”

To be fair, he also admits the experiment itself – proving the effect – is “very, very difficult”, which is why the first experiment that claimed to confirm the predicted effect was set up in 1960 only. In fact, some claim the results of that experiment were ambiguous, and that it was only in 1986, with the experiment of Akira Tonomura, that the Aharonov-Bohm effect was unambiguously demonstrated. So what is it about?

In essence, it proves the reality of the vector potential—and of the (related) magnetic field. What do we mean with a real field? To put it simply, a real field cannot act on some particle from a distance through some kind of spooky ‘action-at-a-distance’: real fields must be specified at the position of the particle itself and describe what happens there. Now you’ll immediately wonder: so what’s a non-real field? Well… Some field that does act through some kind of spooky ‘action-at-a-distance.’ As for an example… Well… I can’t give you one because we’ve only been discussing real fields so far. 🙂

So it’s about what a magnetic (or an electric) field does in terms of influencing motion and/or quantum-mechanical amplitudes. In fact, we discussed this matter quite a while ago (check my 2015 post on it). Now, I don’t want to re-write that post, but let me just remind you of the essentials. The two equations for the magnetic field (B) in Maxwell’s set of four equations (the two others specify the electric field E) are: (1) ∇•B = 0 and (2) c²∇×B = j/ε₀ + ∂E/∂t. Now, you can temporarily forget about the second equation, but you should note that the ∇•B = 0 equation is always true (unlike the ∇×E = 0 expression, which is true for electrostatics only, when there are no moving charges). So it says that the divergence of B is zero, always.

Now, from our posts on vector calculus, you may or may not remember that the divergence of the curl of a vector field is always zero. We wrote: div (curl A) = ∇•(∇×A) = 0, always. Now, there is another theorem that we can now apply, which says the following: if the divergence of a vector field, say D, is zero – so if ∇•D = 0 – then D will be the curl of some other vector field C, so we can write: D = ∇×C. When we now apply this to our ∇•B = 0 equation, we can confidently state the following:

If ∇•B = 0, then there is an A such that B = ∇×A

We can also write this as follows: ∇•B = ∇•(∇×A) = 0 and, hence, B = ∇×A. Now, it’s this vector field A that is referred to as the (magnetic) vector potential, and so that’s what we want to talk about here. As a start, it may be good to write out all of the components of our B = ∇×A vector:

B_x = ∂A_z/∂y − ∂A_y/∂z, B_y = ∂A_x/∂z − ∂A_z/∂x, B_z = ∂A_y/∂x − ∂A_x/∂y

In that 2015 post, I answered the question as to why we’d need this new vector field in a way that wasn’t very truthful: I just said that, in many situations, it would be more convenient – from a mathematical point of view, that is – to first find A, and then calculate the derivatives above to get B.

Now, Feynman says the following about this argument in his Lecture on the topic: “It is true that in many complex problems it is easier to work with A, but it would be hard to argue that this ease of technique would justify making you learn about one more vector field. […] We have introduced A because it does have an important physical significance: it is a real physical field.” Let us follow his argument here.

Quantum-mechanical interference effects

Let us first remind ourselves of the quintessential electron interference experiment illustrated below. [For a much more modern rendering of this experiment, check out the Tout Est Quantique video on it. It’s much more amusing than my rather dry exposé here, but it doesn’t give you the math.]

[Illustration: interference]

We have electrons, all of (nearly) the same energy, which leave the source – one by one – and travel towards a wall with two narrow slits. Beyond the wall is a backstop with a movable detector which measures the rate, which we call I, at which electrons arrive at a small region of the backstop at the distance x from the axis of symmetry. The rate (or intensity) I is proportional to the probability that an individual electron that leaves the source will reach that region of the backstop. This probability has the complicated-looking distribution shown in the illustration, which we understand is due to the interference of two amplitudes, one from each slit. So we associate the two trajectories with two amplitudes, which Feynman writes as A₁·e^{iΦ₁} and A₂·e^{iΦ₂} respectively.

As usual, Feynman abstracts away from the time variable here because it is, effectively, not relevant: the interference pattern depends on distances and angles only. Having said that, for a good understanding, we should – perhaps – write our two wavefunctions as A₁·e^{i(ωt + Φ₁)} and A₂·e^{i(ωt + Φ₂)} respectively. The point is: we’ve got two wavefunctions – one for each trajectory – even if it’s only one electron going through the slit: that’s the mystery of quantum mechanics. 🙂 We need to add these waves so as to get the interference effect:

R = A₁·e^{i(ωt + Φ₁)} + A₂·e^{i(ωt + Φ₂)} = [A₁·e^{iΦ₁} + A₂·e^{iΦ₂}]·e^{iωt}

Now, we know we need to take the absolute square of this thing to get the intensity – or probability (before normalization). The absolute square of a product is the product of the absolute squares of the factors, and we also know that the absolute square of any complex number is just the product of the same number with its complex conjugate. Hence, the absolute square of the e^{iωt} factor is equal to |e^{iωt}|² = e^{iωt}·e^{−iωt} = e⁰ = 1. So the time-dependent factor doesn’t matter: that’s why we can always abstract away from it. Let us now take the absolute square of the [A₁·e^{iΦ₁} + A₂·e^{iΦ₂}] factor, which we can write as:

|R|² = |A₁·e^{iΦ₁} + A₂·e^{iΦ₂}|² = (A₁·e^{iΦ₁} + A₂·e^{iΦ₂})·(A₁·e^{−iΦ₁} + A₂·e^{−iΦ₂})

= A₁² + A₂² + 2·A₁·A₂·cos(Φ₁−Φ₂) = A₁² + A₂² + 2·A₁·A₂·cosδ with δ = Φ₁−Φ₂

OK. This is probably going a bit quick, but you should be able to figure it out, especially when remembering that e^{iΦ} + e^{−iΦ} = 2·cosΦ and cosΦ = cos(−Φ). The point to note is that the intensity is equal to the sum of the intensities of both waves plus a correction term, which is equal to 2·A₁·A₂·cos(Φ₁−Φ₂) and, hence, ranges from −2·A₁·A₂ to +2·A₁·A₂. Now, it takes a bit of geometrical wizardry to be able to write the phase difference δ = Φ₁−Φ₂ as

δ = 2π·a/λ = 2π·(x/L)·d/λ

– but it can be done. 🙂 Well… […] OK. 🙂 Let me quickly help you here by copying another diagram from Feynman – one he uses to derive the formula for the phase difference on arrival between the signals from two oscillators. A₁ and A₂ are equal here (A₁ = A₂ = A) so that makes the situation below somewhat simpler to analyze. However, instead, we have the added complication of a phase difference (α) at the origin – which Feynman refers to as an intrinsic relative phase.

[Diagram: triangle]

When we apply the geometry shown above to our electron passing through the slits, we should, of course, equate α to zero. For the rest, the picture is pretty similar to the two-slit picture. The distance a in the two-slit picture – i.e. the difference in the path lengths for the two trajectories of our electron(s) – is, obviously, equal to the d·sinθ factor in the oscillator picture. Also, because L is huge as compared to x, we may assume that trajectory 1 and 2 are more or less parallel and, importantly, that the triangles in the picture – small and large – are right-angled. Now, trigonometry tells us that sinθ is equal to the ratio of the opposite side of the triangle and the hypotenuse (i.e. the longest side of the right-angled triangle). The opposite side of the triangle is x and, because x is very, very small as compared to L, we may approximate the length of the hypotenuse with L. [I know – a lot of approximations here, but… Well… Just go along with it for now…] Hence, we can equate sinθ to x/L and, therefore, a = d·x/L. Now we need to calculate the phase difference. How many wavelengths do we have in a? That’s simple: a/λ, i.e. the total distance divided by the wavelength. Now these wavelengths correspond to 2π·a/λ radians (one cycle corresponds to one wavelength which, in turn, corresponds to 2π radians). So we’re done. We’ve got the formula: δ = Φ₁−Φ₂ = 2π·a/λ = 2π·(x/L)·d/λ.
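If all of these approximations make you a bit nervous, the following sketch compares the exact difference in path length with the a = d·x/L approximation. I assume the slits sit at +d/2 and −d/2 on a wall at a distance L from the backstop, and the numbers for `d`, `L`, `x` and the wavelength are arbitrary illustration values.

```python
import math

def exact_path_difference(x, d, L):
    """Distance from the far slit minus distance from the near slit to the point x."""
    far = math.hypot(L, x + d / 2)
    near = math.hypot(L, x - d / 2)
    return far - near

def approx_path_difference(x, d, L):
    """The small-angle approximation used in the text: a = d*x/L."""
    return d * x / L

def phase_difference(x, d, L, wavelength):
    """delta = 2*pi*a/lambda, with a = d*x/L."""
    return 2 * math.pi * approx_path_difference(x, d, L) / wavelength

d, L, x = 1e-3, 1.0, 1e-2  # slit separation, distance to backstop, detector position
a_exact = exact_path_difference(x, d, L)
a_approx = approx_path_difference(x, d, L)
delta = phase_difference(x, d, L, wavelength=1e-6)
print("exact path difference: ", a_exact)
print("approximation d*x/L:   ", a_approx)
print("phase difference (rad):", delta)
```

For x much smaller than L the two path differences agree to a tiny fraction of a percent, so the approximation is harmless here.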

Huh? Yes. Just think about it. I need to move on. The point is: when x is equal to zero, the two waves are in phase, and the probability will have a maximum. When δ = π, then the waves are out of phase and interfere destructively (cosπ = −1), so the intensity (and, hence, the probability) reaches a minimum.
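The algebra above can be checked in a few lines with complex numbers. The sketch below adds the two amplitudes directly, takes the absolute square, and compares the result with the A₁² + A₂² + 2·A₁·A₂·cosδ formula. The function names and the test values are mine, chosen only for the illustration.

```python
import cmath
import math

def intensity(a1, a2, phi1, phi2, omega_t=0.0):
    """Absolute square of the sum of the two amplitudes, time factor included."""
    r = (a1 * cmath.exp(1j * phi1) + a2 * cmath.exp(1j * phi2)) * cmath.exp(1j * omega_t)
    return abs(r) ** 2

def intensity_formula(a1, a2, delta):
    """The closed form: A1^2 + A2^2 + 2*A1*A2*cos(delta)."""
    return a1 ** 2 + a2 ** 2 + 2 * a1 * a2 * math.cos(delta)

a1, a2 = 1.0, 0.6
direct = intensity(a1, a2, 0.9, 0.2, omega_t=12.3)  # the time factor drops out
closed = intensity_formula(a1, a2, 0.9 - 0.2)
in_phase = intensity(a1, a2, 0.0, 0.0)              # delta = 0: maximum
out_of_phase = intensity(a1, a2, math.pi, 0.0)      # delta = pi: minimum
print(direct, closed)
print(in_phase, out_of_phase)
```

With equal amplitudes the minimum drops all the way to zero, which is the fully destructive case.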

So that’s pretty obvious – or should be pretty obvious if you’ve understood some of the basics we presented in this blog. We now move to the non-standard stuff, i.e. the Aharonov-Bohm effect(s).

Interference in the presence of an electromagnetic field

In essence, the Aharonov-Bohm effect is nothing special: it is just a law – two laws, to be precise – that tells us how the phase of our wavefunction changes because of the presence of a magnetic and/or electric field. As such, it is not very different from previous analyses and presentations, such as those showing how amplitudes are affected by a potential – such as an electric potential, or a gravitational field, or a magnetic field – and how they relate to a classical analysis of the situation (see, for example, my November 2015 post on this topic). If anything, it’s just a more systematic approach to the topic and – importantly – an approach centered around the use of the vector potential A (and the electric potential Φ). Let me give you the formulas:

Magnetic change in phase = (q/ħ)·∫ A·ds (integral along the trajectory)

Electric change in phase = −(q/ħ)·∫ Φ·dt

The first formula tells us that the phase of the amplitude for our electron (or whatever charged particle) to arrive at some location via some trajectory is changed by an amount that is equal to the integral of the vector potential along the trajectory times the charge of the particle over Planck’s constant. I know that’s quite a mouthful but just read it a couple of times.

The second formula tells us that, if there’s an electrostatic field, it will produce a phase change given by the negative of the time integral of the (scalar) potential Φ, again times the charge of the particle over Planck’s constant.

These two expressions – taken together – tell us what happens for any electromagnetic field, static or dynamic. In fact, they are really the (two) law(s) replacing the F = q(E + v×B) expression in classical mechanics.

So how does it work? Let me further follow Feynman’s treatment of the matter – which analyzes what happens when we’d have some magnetic field in the two-slit experiment (so we assume there’s no electric field: we only look at some magnetic field). We said Φ₁ was the phase of the wave along trajectory 1, and Φ₂ was the phase of the wave along trajectory 2. Without magnetic field, that is, so B = 0. Now, the (first) formula above tells us that, when the field is switched on, the new phases will be the following:

Φ₁ = Φ₁(B = 0) + (q/ħ)·∫_{(1)} A·ds and Φ₂ = Φ₂(B = 0) + (q/ħ)·∫_{(2)} A·ds

Hence, the phase difference δ = Φ₁−Φ₂ will now be equal to:

δ = Φ₁(B = 0) − Φ₂(B = 0) + (q/ħ)·∫_{(1)} A·ds − (q/ħ)·∫_{(2)} A·ds

Now, we can combine the two integrals into one that goes forward along trajectory 1 and comes back along trajectory 2. We’ll denote this path as 1-2 and write the new integral as follows:

δ = δ(B = 0) + (q/ħ)·∮_{(1-2)} A·ds

Note that we’re using a notation here which suggests that the 1-2 path is closed, which is… Well… Yet another approximation of the Master. In fact, his assumption that the new 1-2 path is closed proves to be essential in the argument that follows the one we presented above, in which he shows that the inherent arbitrariness in our choice of a vector potential function doesn’t matter, but… Well… I don’t want to get too technical here.

Let me conclude this post by noting we can re-write our grand formula above in terms of the flux of the magnetic field B:

δ = δ(B = 0) + (q/ħ)·[flux of B between (1) and (2)]
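Here is a small numerical sketch of that last step. It assumes an ideal, infinitely long solenoid along the z-axis, for which the vector potential outside circulates around the axis with magnitude flux/(2π·r), and it integrates A·ds around a closed square path that encloses the solenoid. The path, the step count and the choice of flux (h/e, so that the extra phase should come out as one full cycle) are all assumptions made for the illustration.

```python
import math

HBAR = 1.054571817e-34      # reduced Planck constant in J·s
E_CHARGE = 1.602176634e-19  # elementary charge in C (sign ignored here)

def vector_potential(x, y, flux):
    """A outside an ideal, infinitely long solenoid on the z-axis carrying `flux`."""
    r2 = x * x + y * y
    return (-flux * y / (2 * math.pi * r2), flux * x / (2 * math.pi * r2))

def loop_integral(flux, half_side=1.0, steps=2000):
    """Midpoint-rule line integral of A·ds, counter-clockwise around a square."""
    corners = [(half_side, -half_side), (half_side, half_side),
               (-half_side, half_side), (-half_side, -half_side)]
    total = 0.0
    for k in range(4):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % 4]
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for n in range(steps):
            xm = x0 + (n + 0.5) * dx
            ym = y0 + (n + 0.5) * dy
            ax, ay = vector_potential(xm, ym, flux)
            total += ax * dx + ay * dy
    return total

flux = 2 * math.pi * HBAR / E_CHARGE  # h/e, chosen to give a 2*pi phase shift
circulation = loop_integral(flux)
phase_shift = E_CHARGE * circulation / HBAR
print("loop integral of A:", circulation)
print("enclosed flux:     ", flux)
print("extra phase (rad): ", phase_shift)
```

The loop integral reproduces the enclosed flux even though the path never enters the region where B is non-zero, which is the whole point of the Aharonov-Bohm argument.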

So… Well… That’s it, really. I’ll refer you to Feynman’s Lecture on this matter for a detailed description of the 1960 experiment itself, which involves a magnetized iron whisker that acts like a tiny solenoid—small enough to match the tiny scale of the interference experiment itself. I must warn you though: there is a rather long discussion in that Lecture on the ‘reality’ of the magnetic and the vector potential field which – unlike Feynman’s usual approach to discussions like this – is rather philosophical and partially misinformed, as it assumes there is zero magnetic field outside of a solenoid. That’s true for infinitely long solenoids, but not true for real-life solenoids: if we have some A, then we must also have some B, and vice versa. Hence, if the magnetic field (B) is a real field (in the sense that it cannot act on some particle from a distance through some kind of spooky ‘action-at-a-distance’), then the vector potential A is an equally real field—and vice versa. Feynman admits as much as he concludes his rather lengthy philosophical excursion with the following conclusion (out of which I already quoted one line in my introduction to this post):

“This subject has an interesting history. The theory we have described was known from the beginning of quantum mechanics in 1926. The fact that the vector potential appears in the wave equation of quantum mechanics (called the Schrödinger equation) was obvious from the day it was written. That it cannot be replaced by the magnetic field in any easy way was observed by one man after the other who tried to do so. This is also clear from our example of electrons moving in a region where there is no field and being affected nevertheless. But because in classical mechanics A did not appear to have any direct importance and, furthermore, because it could be changed by adding a gradient, people repeatedly said that the vector potential had no direct physical significance—that only the magnetic and electric fields are “real” even in quantum mechanics. It seems strange in retrospect that no one thought of discussing this experiment until 1956, when Bohm and Aharonov first suggested it and made the whole question crystal clear. The implication was there all the time, but no one paid attention to it. Thus many people were rather shocked when the matter was brought up. That’s why someone thought it would be worthwhile to do the experiment to see if it was really right, even though quantum mechanics, which had been believed for so many years, gave an unequivocal answer. It is interesting that something like this can be around for thirty years but, because of certain prejudices of what is and is not significant, continues to be ignored.”

Well… That’s it, folks! Enough for today! 🙂

An interpretation of the wavefunction

This is my umpteenth post on the same topic. 😦 It is obvious that this search for a sensible interpretation is consuming me. Why? I am not sure. Studying physics is frustrating. As a leading physicist puts it:

“The teaching of quantum mechanics these days usually follows the same dogma: firstly, the student is told about the failure of classical physics at the beginning of the last century; secondly, the heroic confusions of the founding fathers are described and the student is given to understand that no humble undergraduate student could hope to actually understand quantum mechanics for himself; thirdly, a deus ex machina arrives in the form of a set of postulates (the Schrödinger equation, the collapse of the wavefunction, etc); fourthly, a bombardment of experimental verifications is given, so that the student cannot doubt that QM is correct; fifthly, the student learns how to solve the problems that will appear on the exam paper, hopefully with as little thought as possible.”

That’s obviously not the way we want to understand quantum mechanics. [With we, I mean, me, of course, and you, if you’re reading this blog.] Of course, that doesn’t mean I don’t believe Richard Feynman, one of the greatest physicists ever, when he tells us no one, including himself, understands physics quite the way we’d like to understand it. Such statements should not prevent us from trying harder. So let’s look for better metaphors. The animation below shows the two components of the archetypal wavefunction – a simple sine and cosine. They’re the same function actually, but their phases differ by 90 degrees (π/2).

circle_cos_sin

It makes me think of a V-2 engine with the pistons at a 90-degree angle. Look at the illustration below, which I took from a rather simple article on cars and engines that has nothing to do with quantum mechanics. Think of the moving pistons as harmonic oscillators, like springs.

two-timer-576-px-photo-369911-s-original

We will also think of the center of each cylinder as the zero point: think of that point as a point where – if we’re looking at one cylinder alone – the internal and external pressure balance each other, so the piston would not move… Well… If it weren’t for the other piston, because the second piston is not at the center when the first is. In fact, it is easy to verify and compare the following positions of both pistons, as well as the associated dynamics of the situation:

Piston 1	Piston 2	Motion of Piston 1	Motion of Piston 2
Top	Center	Compressed air will push piston down	Piston moves down against external pressure
Center	Bottom	Piston moves down against external pressure	External air pressure will push piston up
Bottom	Center	External air pressure will push piston up	Piston moves further up and compresses the air
Center	Top	Piston moves further up and compresses the air	Compressed air will push piston down

When the pistons move, their linear motion will be described by a sinusoidal function: a sine or a cosine. In fact, the 90-degree V-2 configuration ensures that the linear motion of the two pistons will be exactly the same, except for a phase difference of 90 degrees. [Of course, because of the sideways motion of the connecting rods, our sine and cosine function describes the linear motion only approximately, but you can easily imagine the idealized limit situation. If not, check Feynman’s description of the harmonic oscillator.]

The question is: if we had a set-up like this, two springs – or two harmonic oscillators – attached to a shaft through a crank, would this really work as a perpetuum mobile? We are obviously talking about energy being transferred back and forth between the rotating shaft and the moving pistons… So… Well… Let’s model this: the total energy, potential and kinetic, in each harmonic oscillator is constant. Hence, the piston only delivers or receives kinetic energy from the rotating mass of the shaft.

Now, in physics, that’s a bit of an oxymoron: we don’t think of negative or positive kinetic (or potential) energy in the context of oscillators. We don’t think of the direction of energy. But… Well… If we’ve got two oscillators, our picture changes, and so we may have to adjust our thinking here.

Let me start by giving you an authoritative derivation of the various formulas involved here, taking the example of the physical spring as an oscillator—but the formulas are basically the same for any harmonic oscillator.

energy harmonic oscillator

The first formula is a general description of the motion of our oscillator. The coefficient in front of the cosine function (a) is the maximum amplitude. Of course, you will also recognize ω₀ as the natural frequency of the oscillator, and Δ as the phase factor, which takes into account our t = 0 point. In our case, for example, we have two oscillators with a phase difference equal to π/2 and, hence, Δ would be 0 for one oscillator, and –π/2 for the other. [The formula to apply here is sinθ = cos(θ – π/2).] Also note that we can equate our θ argument to ω₀·t. Now, if a = 1 (which is the case here), then these formulas simplify to:

K.E. = T = m·v²/2 = (1/2)·m·ω₀²·sin²(θ + Δ) = (1/2)·m·ω₀²·sin²(ω₀·t + Δ)
P.E. = U = k·x²/2 = (1/2)·k·cos²(θ + Δ)

The coefficient k in the potential energy formula characterizes the force: F = −k·x. The minus sign reminds us our oscillator wants to return to the center point, so the force pulls back. From the dynamics involved, it is obvious that k must be equal to m·ω₀², so that gives us the famous T + U = m·ω₀²/2 formula or, including a once again, T + U = m·a²·ω₀²/2.
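If you want to see the T + U = m·a²·ω₀²/2 result with actual numbers, here is a minimal sketch. The values for m, ω₀, a and Δ are arbitrary choices of mine, and the function name is just for illustration.

```python
import math

def oscillator_energies(m, omega0, a, delta, t):
    """Return (T, U) for x(t) = a*cos(omega0*t + delta), using k = m*omega0**2."""
    k = m * omega0 ** 2
    phase = omega0 * t + delta
    x = a * math.cos(phase)               # position
    v = -a * omega0 * math.sin(phase)     # velocity dx/dt
    T = 0.5 * m * v ** 2                  # kinetic energy
    U = 0.5 * k * x ** 2                  # potential energy
    return T, U

# Arbitrary illustrative values (assumptions, not taken from the text).
m, omega0, a, delta = 2.0, 3.0, 1.0, 0.0
total_expected = 0.5 * m * a ** 2 * omega0 ** 2

for t in (0.0, 0.1, 0.7, 1.3):
    T, U = oscillator_energies(m, omega0, a, delta, t)
    print(f"t={t:.1f}  T={T:.4f}  U={U:.4f}  T+U={T + U:.4f}  expected={total_expected:.4f}")
```

T and U slosh back and forth, but their sum should stay put at every t.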

Now, if we normalize our functions by equating k to one (k = 1) and ignoring the common 1/2 factor, then the motion of our first oscillator is given by the cosθ function, and its kinetic energy will be equal to sin²θ. Hence, the (instantaneous) change in kinetic energy, as a function of θ, will be equal to:

d(sin²θ)/dθ = 2∙sinθ∙d(sinθ)/dθ = 2∙sinθ∙cosθ

Let’s look at the second oscillator now. Just think of the second piston going up and down in our V-twin engine. Its motion is given by the sinθ function which, as mentioned above, is equal to cos(θ−π /2). Hence, its kinetic energy is equal to sin²(θ−π /2), and how it changes – as a function of θ – will be equal to:

2∙sin(θ−π/2)∙cos(θ−π/2) = −2∙cosθ∙sinθ = −2∙sinθ∙cosθ

We have our perpetuum mobile! While transferring kinetic energy from one piston to the other, the rotating shaft moves at constant speed. Linear motion becomes circular motion, and vice versa, in a frictionless Universe. We have the metaphor we were looking for!
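The cancellation is easy to check numerically. The sketch below uses the normalized kinetic energies sin²θ and sin²(θ − π/2) of the two pistons. The function names are mine, and this is only an illustration of the idealized, frictionless picture.

```python
import math

def kinetic_energies(theta):
    """Normalized kinetic energies of the two pistons at crank angle theta."""
    ke1 = math.sin(theta) ** 2                 # piston 1 moves as cos(theta)
    ke2 = math.sin(theta - math.pi / 2) ** 2   # piston 2 moves as sin(theta)
    return ke1, ke2

def rates_of_change(theta):
    """d(KE)/d(theta) for each piston."""
    r1 = 2 * math.sin(theta) * math.cos(theta)
    r2 = 2 * math.sin(theta - math.pi / 2) * math.cos(theta - math.pi / 2)
    return r1, r2

for deg in (0, 30, 45, 90, 200):
    theta = math.radians(deg)
    ke1, ke2 = kinetic_energies(theta)
    r1, r2 = rates_of_change(theta)
    print(f"{deg:3d} deg  KE1+KE2={ke1 + ke2:.6f}  dKE1+dKE2={r1 + r2:+.6f}")
```

What one piston gains, the other loses: the sum of the two kinetic energies stays constant, and the two rates of change add up to zero.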

Somehow, in this beautiful interplay between linear and circular motion, energy is being borrowed from one place to another, and then returned. From what place to what place? I am not sure. We may call it the real and imaginary energy space respectively, but what does that mean? One thing is for sure, however: the interplay between the real and imaginary part of the wavefunction describes how energy propagates through space!

How exactly? Again, I am not sure. Energy is, obviously, mass in motion – as evidenced by the E = m·c² equation, and it may not have any direction (when everything is said and done, it’s a scalar quantity without direction), but the energy in a linear motion is surely different from that in a circular motion, and our metaphor suggests we need to think somewhat more along those lines. Perhaps we will, one day, be able to square this circle. 🙂

Schrödinger’s equation

Let’s analyze the interplay between the real and imaginary part of the wavefunction through an analysis of Schrödinger’s equation, which we write as:

i·ħ∙∂ψ/∂t = –(ħ²/2m)∙∇²ψ + V·ψ

We can do a quick dimensional analysis of both sides:

[i·ħ∙∂ψ/∂t] = N∙m∙s/s = N∙m
[–(ħ²/2m)∙∇²ψ] = N∙m³/m² = N∙m
[V·ψ] = N∙m

Note the dimension of the ‘diffusion’ constant ħ²/2m: [ħ²/2m] = N²∙m²∙s²/kg = N²∙m²∙s²/(N·s²/m) = N∙m³. Also note that, in order for the dimensions to come out alright, the dimension of V – the potential – must be that of energy. Hence, Feynman’s description of it as the potential energy – rather than the potential tout court – is somewhat confusing but correct: V must equal the potential energy of the electron. Hence, V is not the conventional (potential) energy of the unit charge (1 coulomb). Instead, the natural unit of charge is used here, i.e. the charge of the electron itself.
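The bookkeeping above can be done mechanically by tracking the exponents of the SI base units. The sketch below is just that: a tiny exponent calculator of my own, with the assumption (as in the analysis above) that ψ itself is treated as dimensionless.

```python
# Dimensions as exponent tuples over the SI base units (kg, m, s).
KG = (1, 0, 0)
METRE = (0, 1, 0)
SECOND = (0, 0, 1)
DIMENSIONLESS = (0, 0, 0)

def mul(a, b):
    """Dimension of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    """Dimension of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))

NEWTON = div(mul(KG, METRE), mul(SECOND, SECOND))   # kg*m/s^2
JOULE = mul(NEWTON, METRE)                          # N*m
HBAR = mul(JOULE, SECOND)                           # N*m*s (action)
PSI = DIMENSIONLESS                                 # assumption: psi carries no units

# Left-hand side: hbar * d(psi)/dt
lhs = div(mul(HBAR, PSI), SECOND)

# 'Diffusion' constant hbar^2/m, and the kinetic term hbar^2/m * d2(psi)/dx2
diffusion_constant = div(mul(HBAR, HBAR), KG)
kinetic_term = div(mul(diffusion_constant, PSI), mul(METRE, METRE))

# Potential term V*psi, with V an energy
potential_term = mul(JOULE, PSI)

print("lhs                :", lhs)
print("hbar^2/m           :", diffusion_constant, "vs N*m^3:", mul(NEWTON, mul(METRE, mul(METRE, METRE))))
print("kinetic term       :", kinetic_term)
print("potential term     :", potential_term)
print("joule              :", JOULE)
```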

Now, Schrödinger’s equation – without the V·ψ term – can be written as the following pair of equations:

Re(∂ψ/∂t) = −(1/2)∙(ħ/m)∙Im(∇²ψ)
Im(∂ψ/∂t) = (1/2)∙(ħ/m)∙Re(∇²ψ)

This closely resembles the propagation mechanism of an electromagnetic wave as described by Maxwell’s equations for free space (i.e. a space with no charges), but E and B are vectors, not scalars. How do we get this result? Well… ψ is a complex function, which we can write as a + i∙b. Likewise, ∂ψ/∂t is a complex function, which we can write as c + i∙d, and ∇²ψ can then be written as e + i∙f. If we temporarily forget about the coefficients (ħ, ħ²/m and V), then Schrödinger’s equation – including the V·ψ term – amounts to writing something like this:

i∙(c + i∙d) = –(e + i∙f) + (a + i∙b) ⇔ a + i∙b = i∙c − d + e + i∙f ⇔ a = −d + e and b = c + f

Hence, we can now write:

V∙Re(ψ) = −ħ∙Im(∂ψ/∂t) + (1/2)∙( ħ²/m)∙Re(∇²ψ)
V∙Im(ψ) = ħ∙Re(∂ψ/∂t) + (1/2)∙( ħ²/m)∙Im(∇²ψ)

This simplifies to the two equations above for V = 0, i.e. when there is no potential (electron in free space). Now we can bring the Re and Im operators into the brackets to get:

V∙Re(ψ) = −ħ∙∂Im (ψ)/∂t + (1/2)∙( ħ²/m)∙∇²Re(ψ)
V∙Im(ψ) = ħ∙∂Re(ψ)/∂t + (1/2)∙( ħ²/m)∙∇²Im(ψ)

This is very interesting, because we can re-write this using the quantum-mechanical energy operator H = –(ħ²/2m)∙∇² + V· (note the multiplication sign after the V, which we do not have – for obvious reasons – for the –(ħ²/2m)∙∇² expression):

H[Re (ψ)] = −ħ∙∂Im(ψ)/∂t
H[Im(ψ)] = ħ∙∂Re(ψ)/∂t

A dimensional analysis shows us both sides are, once again, expressed in N∙m. It’s a beautiful expression because – if we write the real and imaginary part of ψ as r∙cosθ and r∙sinθ, we get:

H[cosθ] = −ħ∙∂sinθ/∂t = E∙cosθ
H[sinθ] = ħ∙∂cosθ/∂t = E∙sinθ

Indeed, θ = (p∙x − E∙t)/ħ and, hence, −ħ∙∂sinθ/∂t = ħ∙cosθ∙E/ħ = E∙cosθ and ħ∙∂cosθ/∂t = ħ∙sinθ∙E/ħ = E∙sinθ. Now we can combine the two equations in one equation again and write:

H[r∙(cosθ + i∙sinθ)] = r∙E∙(cosθ + i∙sinθ) ⇔ H[ψ] = E∙ψ

The operator H – applied to the wavefunction – gives us the (scalar) product of the energy E and the wavefunction itself. Isn’t this strange?
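As a sanity check, here is a small numerical sketch for a free particle (V = 0) in one dimension. The assumptions are mine: natural units (ħ = m = 1), an arbitrary momentum p, E = p²/2m, and θ = (p·x − E·t)/ħ, so that cosθ + i·sinθ is the elementary wavefunction e^{−i·(E·t − p·x)/ħ}. The derivatives are plain finite differences, so the numbers only agree up to a small numerical error.

```python
import math

HBAR = 1.0              # natural units (assumption for this sketch)
M = 1.0
P = 1.5                 # arbitrary momentum
E = P ** 2 / (2 * M)    # free particle, V = 0

def theta(x, t):
    return (P * x - E * t) / HBAR

def re_psi(x, t):
    return math.cos(theta(x, t))

def im_psi(x, t):
    return math.sin(theta(x, t))

def d_dt(f, x, t, h=1e-5):
    """Central-difference time derivative."""
    return (f(x, t + h) - f(x, t - h)) / (2 * h)

def hamiltonian(f, x, t, h=1e-4):
    """H f = -(hbar^2/2m) * d2f/dx2 for V = 0, via a central second difference."""
    d2 = (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / h ** 2
    return -(HBAR ** 2 / (2 * M)) * d2

x0, t0 = 0.3, 0.2
print("H[Re psi]      :", hamiltonian(re_psi, x0, t0))
print("-hbar dIm/dt   :", -HBAR * d_dt(im_psi, x0, t0))
print("E * Re psi     :", E * re_psi(x0, t0))
print("H[Im psi]      :", hamiltonian(im_psi, x0, t0))
print("hbar dRe/dt    :", HBAR * d_dt(re_psi, x0, t0))
print("E * Im psi     :", E * im_psi(x0, t0))
```

Within each group of three, the printed values should coincide to several decimals.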

Hmm… I need to further verify and explain this result… I’ll probably do so in yet another post on the same topic… 🙂

Post scriptum: The symmetry of our V-2 engine – or perpetuum mobile – is interesting: its cross-section has only one axis of symmetry. Hence, we may associate some angle with it, so as to define its orientation in the two-dimensional cross-sectional plane. Of course, the cross-sectional plane itself is at right angles to the crankshaft axis, which we may also associate with some angle in three-dimensional space. Hence, its geometry defines two orthogonal directions which, in turn, define a spherical coordinate system, as shown below.

We may, therefore, say that three-dimensional space is actually being implied by the geometry of our V-2 engine. Now that is interesting, isn’t it? 🙂

Quantum-mechanical operators

I wrote a post on quantum-mechanical operators some while ago but, when re-reading it now, I am not very happy about it, because it tries to cover too much ground in one go. In essence, I regret my attempt to constantly switch between the matrix representation of quantum physics – with the | state 〉 symbols – and the wavefunction approach, so as to show how the operators work for both cases. But then that’s how Feynman approaches this.

However, let’s admit it: while Heisenberg’s matrix approach is equivalent to Schrödinger’s wavefunction approach – and while it’s the only approach that works well for n-state systems – the wavefunction approach is more intuitive, because:

Most practical examples of quantum-mechanical systems (like the description of the electron orbitals of an atomic system) involve continuous coordinate spaces, so we have an infinite number of states and, hence, we need to describe it using the wavefunction approach.
Most of us are much better-versed in using derivatives and integrals, as opposed to matrix operations.
A more intuitive statement of the same argument above is the following: the idea of one state flowing into another, rather than being transformed through some matrix, is much more appealing. 🙂

So let’s stick to the wavefunction approach here. So, while you need to remember that there’s a ‘matrix equivalent’ for each of the equations we’re going to use in this post, we’re not going to talk about it.

The operator idea

In classical physics – high school physics, really – we would describe a pointlike particle traveling in space by a function relating its position (x) to time (t): x = x(t). Its (instantaneous) velocity is, obviously, v(t) = dx/dt. Simple. Obvious. Let’s complicate matters now by saying that the idea of a velocity operator would sort of generalize the v(t) = dx/dt velocity equation by making abstraction of the specifics of the x = x(t) function.

Huh? Yes. We could define a velocity ‘operator’ as ∂/∂t: applied to the x = x(t) function, it returns the velocity v(t).

Now, you may think that’s a rather ridiculous way to describe what an operator does, but – in essence – it’s correct. We have some function – describing an elementary particle, or a system, or an aspect of the system – and then we have some operator, which we apply to our function, to extract the information from it that we want: its velocity, its momentum, its energy. Whatever. Hence, in quantum physics, we have an energy operator, a position operator, a momentum operator, an angular momentum operator and… Well… I guess I listed the most important ones. 🙂

It’s kinda logical. Our velocity operator looks at one particular aspect of whatever it is that’s going on: the time rate of change of position. We do refer to that as the velocity. Our quantum-mechanical operators do the same: they look at one aspect of what’s being described by the wavefunction. [At this point, you may wonder what the other properties of our classical ‘system’ – i.e. other properties than velocity – are, because we’re just looking at a pointlike particle here, but… Well… Think of electric charge and forces acting on it, so it accelerates and decelerates in all kinds of ways, and we have kinetic and potential energy and all that. Or momentum. So it’s just the same: the x = x(t) function may cover a lot of complexities, just like the wavefunction does!]

The Wikipedia article on the momentum operator is, for a change (I usually find Wikipedia quite abstruse on these matters), quite simple – and, therefore – quite enlightening here. It applies the following simple logic to the elementary wavefunction ψ = e^{−i·(ω·t − k∙x)}, with the de Broglie relations telling us that ω = E/ħ and k = p/ħ:

mom op 1

Note we forget about the normalization coefficient a here. It doesn’t matter: we can always stuff it in later. The point to note is that we can sort of forget about ψ (or abstract away from it—as mathematicians and physicists would say) by defining the momentum operator, which we’ll write as:

mom op 2

Its three-dimensional equivalent is calculated in very much the same way:

wiki

So this operator, when operating on a particular wavefunction, gives us the (expected) momentum when we would actually catch our particle there, provided the momentum doesn’t vary in time. [Note that it may – and actually is likely to – vary in space!]
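To see the operator at work, the sketch below applies (ħ/i)·∂/∂x to the elementary wavefunction by a finite difference and divides the result by ψ. The units (ħ = 1) and the values of p and E are arbitrary assumptions of mine, and the function names are just for illustration.

```python
import cmath

HBAR = 1.0   # natural units (assumption)
P = 2.0      # arbitrary momentum
E = 1.3      # arbitrary energy; it plays no role in the spatial derivative

def psi(x, t=0.0):
    """Elementary wavefunction exp(-i*(E*t - p*x)/hbar), normalization omitted."""
    return cmath.exp(-1j * (E * t - P * x) / HBAR)

def momentum_op(f, x, h=1e-6):
    """Apply (hbar/i) * d/dx to f at x, using a central difference."""
    return (HBAR / 1j) * (f(x + h) - f(x - h)) / (2 * h)

for x in (-1.0, 0.0, 0.37, 2.5):
    ratio = momentum_op(psi, x) / psi(x)
    print(f"x={x:+.2f}  (p_op psi)/psi = {ratio.real:.6f} {ratio.imag:+.6f}i")
```

For this particular wavefunction the ratio should come out as p at every point, which is what it means for ψ to have a definite momentum.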

So that’s the basic idea of an operator. However, the comparison goes further. Indeed, a superficial reading of what operators are all about gives you the impression we get all these observables (or properties of the system) just by applying the operator to the (wave)function. That’s not the case. There is the randomness. The uncertainty. Actual wavefunctions are superpositions of several elementary waves with various coefficients representing their amplitudes. So we need averages, or expected values: E[X]. Even our velocity operator ∂/∂t – in the classical world – gives us an instantaneous velocity only. To get the average velocity (in quantum mechanics, we’ll be interested in the average momentum, or the average position, or the average energy – rather than the average velocity), we’re going to have to calculate the total distance traveled. Now, that’s going to involve a line integral:

S = ∫_L ds.

The principle is illustrated below.

line integral

You’ll say: this is kids stuff, and it is. Just note how we write the same integral in terms of the x and t coordinate, and using our new velocity operator:

integral

Kids stuff. Yes. But it’s good to think about what it represents really. For example, the simplest quantum-mechanical operator is the position operator. It’s just x for the x-coordinate, y for the y-coordinate, and z for the z-coordinate. To get the average position of a stationary particle – represented by the wavefunction ψ(r, t) – in three-dimensional space, we need to calculate the following volume integral:

position operator 3D V2

Simple? Yes and no. The r·|ψ(r)|² integrand is obvious: we multiply each possible position (r) by its probability (or likelihood), which is equal to P(r) = |ψ(r)|². However, look at the assumptions: we already omitted the time variable. Hence, the particle we’re describing here must be stationary, indeed! So we’ll need to re-visit the whole subject allowing for averages to change with time. We’ll do that later. I just wanted to show you that those integrals – even with very simple operators, like the position operator – can become very complicated. So you just need to make sure you know what you’re looking at.
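To make the integral a bit more tangible, here is a one-dimensional sketch: the average position ∫x·|ψ(x)|² dx for a stationary Gaussian wavefunction. The Gaussian, its center x0 and its width σ are my own illustrative choices, and the integral is done with a simple midpoint rule.

```python
import math

def psi(x, x0=1.2, sigma=0.5):
    """Real Gaussian wavefunction; its square is a normal density around x0."""
    norm = (2 * math.pi * sigma ** 2) ** -0.25
    return norm * math.exp(-((x - x0) ** 2) / (4 * sigma ** 2))

def integrate(f, a, b, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

total_probability = integrate(lambda x: abs(psi(x)) ** 2, -10.0, 10.0)
average_position = integrate(lambda x: x * abs(psi(x)) ** 2, -10.0, 10.0)

print("total probability:", total_probability)
print("average position :", average_position)
```

The first number checks the normalization (all probabilities add up to one). The second should land on the center of the Gaussian.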

One wavefunction—or two? Or more?

There is another reason why, with the immeasurable benefit of hindsight, I now feel that my earlier post is confusing: I kept switching between the position and the momentum wavefunction, which gives the impression we have different wavefunctions describing different aspects of the same thing. That’s just not true. The position and momentum wavefunction describe essentially the same thing: we can go from one to the other, and back again, by a simple mathematical manipulation. So I should have stuck to descriptions in terms of ψ(x, t), instead of switching back and forth between the ψ(x, t) and φ(p, t) representations.
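The ‘simple mathematical manipulation’ is a Fourier transform. As a rough sketch, and assuming the common convention φ(p) = (2πħ)^(−1/2)·∫ψ(x)·e^(−i·p·x/ħ) dx, the code below transforms a Gaussian ψ(x) numerically and compares it with the known closed-form result. The Gaussian, its width and the units (ħ = 1) are my own choices for illustration.

```python
import cmath
import math

HBAR = 1.0    # natural units (assumption)
SIGMA = 0.5   # width of the position-space Gaussian (arbitrary)

def psi(x):
    """Normalized Gaussian position wavefunction centered at x = 0."""
    norm = (2 * math.pi * SIGMA ** 2) ** -0.25
    return norm * math.exp(-(x ** 2) / (4 * SIGMA ** 2))

def phi(p, x_min=-8.0, x_max=8.0, n=400):
    """Momentum wavefunction: midpoint-rule Fourier integral of psi."""
    dx = (x_max - x_min) / n
    total = 0j
    for i in range(n):
        x = x_min + (i + 0.5) * dx
        total += psi(x) * cmath.exp(-1j * p * x / HBAR)
    return total * dx / math.sqrt(2 * math.pi * HBAR)

def phi_exact(p):
    """Closed-form transform of the Gaussian above (also a Gaussian)."""
    norm = (2 * SIGMA ** 2 / (math.pi * HBAR ** 2)) ** 0.25
    return norm * math.exp(-((SIGMA * p / HBAR) ** 2))

for p in (0.0, 0.7, 2.0):
    print(f"p={p:.1f}  numeric={abs(phi(p)):.8f}  exact={phi_exact(p):.8f}")
```

Same information, different coordinates: a narrow Gaussian in x becomes a wide Gaussian in p, and vice versa.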

In any case, the damage is done, so let’s move forward. The key idea is that, when we know the wavefunction, we know everything. I tried to convey that by noting that the real and imaginary part of the wavefunction must, somehow, represent the total energy of the particle. The structural similarity between the mass-energy equivalence relation (i.e. Einstein’s formula: E = m·c²) and the energy formulas for oscillators and spinning masses is too obvious:

The energy of any oscillator is given by E = m·a²·ω₀²/2 or, for a = 1, by E = m·ω₀²/2. We may want to liken the real and imaginary component of our wavefunction to two oscillators and, hence, add them up. The E = m·ω₀² formula we get is then identical to the E = m·c² formula.
The energy of a spinning mass is given by an equivalent formula: E = I·ω²/2 (I is the moment of inertia in this formula). The same 1/2 factor tells us our particle is, somehow, spinning in two dimensions at the same time (i.e. a ‘real’ as well as an ‘imaginary’ space—but both are equally real, because amplitudes interfere), so we get the E = I·ω² formula.

Hence, the formulas tell us we should imagine an electron – or an electron orbital – as a very complicated two-dimensional standing wave. Now, when I write two-dimensional, I refer to the real and imaginary component of our wavefunction, as illustrated below. What I am asking you, however, is to not only imagine these two components oscillating up and down, but also spinning about. Hence, if we think about energy as some oscillating mass – which is what the E = m·c² formula tells us to do, we should remind ourselves we’re talking very complicated motions here: mass oscillates, swirls and spins, and it does so both in real as well as in imaginary space.

rising_circular

What I like about the illustration above is that it shows us – in a very obvious way – why the wavefunction depends on our reference frame. These oscillations do represent something in absolute space, but how we measure it depends on our orientation in that absolute space. But I am writing this post to talk about operators, not about my grand theory about the essence of mass and energy. So let’s talk about operators now. 🙂

In that post of mine, I showed how the position, momentum and energy operator would give us the average position, momentum and energy of whatever it was that we were looking at, but I didn’t introduce the angular momentum operator. So let me do that now. However, I’ll first recapitulate what we’ve learnt so far in regard to operators.

The energy, position and momentum operators

The equation below defines the energy operator, and also shows how we would apply it to the wavefunction:

E_av = ∫ψ*(r)·H·ψ(r) d³r, with H = –(ħ²/2m)∙∇² + V

To the purists: sorry for not (always) using the hat symbol. [I explained why in that post of mine: it’s just too cumbersome.] The others 🙂 should note the following:

E_average is also an expected value: E_av = E[E]
The * symbol tells us to take the complex conjugate of the wavefunction.
As for the integral, it’s an integral over some volume, so that’s what the d³r shows. Many authors use double or triple integral signs (∫∫ or ∫∫∫) to show it’s a surface or a volume integral, but that makes things look very complicated, and so I don’t do that. I could also have written the integral as ∫ψ(r)*·H·ψ(r) dV, but then I’d need to explain that the dV stands for dVolume, not for any (differential) potential energy (V).
We must normalize our wavefunction for these formulas to work, so all probabilities over the volume add up to 1.
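Here is a one-dimensional sketch of the ∫ψ*·H·ψ recipe. The example is my own choice, not something from the text: the ground state of a particle in a box of width L, ψ(x) = √(2/L)·sin(π·x/L), with V = 0 inside the box and natural units (ħ = m = 1). The textbook ground-state energy for this system is π²·ħ²/(2·m·L²), so that is what the integral should reproduce.

```python
import math

HBAR = 1.0   # natural units (assumption)
M = 1.0
L = 1.0      # width of the box (arbitrary)

def psi(x):
    """Normalized ground state of a particle in a box on [0, L]."""
    return math.sqrt(2.0 / L) * math.sin(math.pi * x / L)

def h_psi(x, h=1e-4):
    """Apply H = -(hbar^2/2m) d2/dx2 (V = 0 inside the box) by finite differences."""
    d2 = (psi(x + h) - 2.0 * psi(x) + psi(x - h)) / h ** 2
    return -(HBAR ** 2 / (2.0 * M)) * d2

def average_energy(n=2000):
    """Midpoint-rule approximation of the integral of psi* H psi over the box."""
    dx = L / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        total += psi(x) * h_psi(x) * dx   # psi is real here, so psi* = psi
    return total

expected = math.pi ** 2 * HBAR ** 2 / (2.0 * M * L ** 2)
print("numerical E_av:", average_energy())
print("pi^2/2 (exact):", expected)
```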

OK. That’s the energy operator. As you can see, it’s a pretty formidable beast, but then it just reflects Schrödinger’s equation which, as I explained a couple of times already, we can interpret as an energy propagation mechanism, or an energy diffusion equation, so it is actually not that difficult to memorize the formula: if you’re able to remember Schrödinger’s equation, then you’ll also have the operator. If not… Well… Then you won’t pass your undergrad physics exam. 🙂

I already mentioned that the position operator is a much simpler beast. That’s because it’s so intimately related to our interpretation of the wavefunction. It’s the one thing you know about quantum mechanics: the absolute square of the wavefunction gives us the probability density function. So, for one-dimensional space, the position operator is just:

x_av = ∫ψ*(x)·x·ψ(x) dx = ∫x·|ψ(x)|² dx

The equivalent operator for three-dimensional space is equally simple:

position operator 3D V2

Note how the operator, for the one- as well as for the three-dimensional case, gets rid of time as a variable. In fact, the idea itself of an average makes abstraction of the temporal aspect. Well… Here, at least—because we’re looking at some box in space, rather than some box in spacetime. We’ll re-visit that rather particular idea of an average, and allow for averages that change with time, in a short while.

Next, we introduced the momentum operator in that post of mine. For one dimension, Feynman shows this operator is given by the following formula:

p_x = (ħ/i)·∂/∂x

Now that does not look very simple. You might think that the ∂/∂x operator reflects our velocity operator, but… Well… No: ∂/∂t gives us a time rate of change, while ∂/∂x gives us the spatial variation. So it’s not the same. Also, that ħ/i factor is quite intriguing, isn’t it? We’ll come back to it in the next section of this post. Let me just give you the three-dimensional equivalent which, remembering that 1/i = −i, you’ll understand to be equal to the following vector operator:

p = −i·ħ·∇ = −i·ħ·(∂/∂x, ∂/∂y, ∂/∂z)

Now it’s time to define the operator we wanted to talk about, i.e. the angular momentum operator.

The angular momentum operator

The formula for the angular momentum operator is remarkably simple:

L_z = x·p_y − y·p_x

Why do I call this a simple formula? Because it looks like the familiar formula of classical mechanics for the z-component of the classical angular momentum L = r × p. I must assume you know how to calculate a vector cross product. If not, check one of my many posts on vector analysis. I must also assume you remember the L = r × p formula. If not, the following animation might bring it all back. If that doesn’t help, check my post on gyroscopes. 🙂

Now, spin is a complicated phenomenon, and so, to simplify the analysis, we should think of orbital angular momentum only. This is a simplification, because the total angular momentum of an electron is some complicated mix of intrinsic (spin) and orbital angular momentum. Hence, the angular momentum operator we’re introducing here is only the orbital angular momentum operator. However, let us not get bogged down in all of the nitty-gritty and, hence, let’s just go along with it for the time being.

I am somewhat hesitant to show you how we get that formula for our operator, but I’ll try to show you using an intuitive approach, which uses only bits and pieces of Feynman’s more detailed derivation. It will, hopefully, give you a bit of an idea of how these differential operators work. Think about a rotation of our reference frame over an infinitesimally small angle – which we’ll denote as ε – as illustrated below.

rotation

Now, the whole idea is that, because of that rotation of our reference frame, our wavefunction will look different. It’s nothing fundamental, but… Well… It’s just because we’re using a different coordinate system. Indeed, that’s where all these complicated transformation rules for amplitudes come in. I’ve spoken about these at length when we were still discussing n-state systems. In contrast, the transformation rules for the coordinates themselves are very simple:

rotation

Now, because ε is an infinitesimally small angle, we may equate cos(θ) = cos(ε) to 1, and sin(θ) = sin(ε) to ε. Hence, x’ and y’ are then written as x’ = x + εy and y’ = y − εx, while z’ remains z. Vice versa, we can also write the old coordinates in terms of the new ones: x = x’ − εy’, y = y’ + εx’ (to first order in ε), and z = z’. That’s obvious. Now comes the difficult thing: you need to think about the two-dimensional equivalent of the simple illustration below.

izvod

If we have some function y = f(x), then we know that, for small Δx, we have the following approximation formula for f(x + Δx): f(x + Δx) ≈ f(x) + (dy/dx)·Δx. It’s the formula you saw in high school: you would then take a limit (Δx → 0), and define dy/dx as the Δy/Δx ratio for Δx → 0. You would do this after re-writing the f(x + Δx) ≈ f(x) + (dy/dx)·Δx formula as:

Δy = Δf = f(x + Δx) − f(x) ≈ (dy/dx)·Δx

Now you need to substitute ψ for f, and let ε play the role of Δx. There is only one complication here: ψ is a function of two variables: x and y. In fact, it’s a function of three variables – x, y and z – but we keep z constant. So think of moving from x and y to x + εy = x + Δx and to y + Δy = y − εx. Hence, Δx = εy and Δy = −εx. It then makes sense to write Δψ as:

angular momentum operator v2

If you agree with that, you’ll also agree we can write something like this:

formula 2

Now that implies the following formula for Δψ:

repair
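If you want to convince yourself that a first-order formula of this kind works, here is a small numerical sketch. It assumes the first-order expression is Δψ ≈ (∂ψ/∂x)·εy − (∂ψ/∂y)·εx, which is what Δx = εy and Δy = −εx suggest, and it uses an arbitrary smooth test function of my own in place of a real wavefunction.

```python
import cmath

def psi(x, y):
    """Arbitrary smooth complex test function (an assumption, not a physical state)."""
    return (x + 2j * y) * cmath.exp(-(x ** 2 + y ** 2) / 4)

def d_dx(f, x, y, h=1e-6):
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def d_dy(f, x, y, h=1e-6):
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

def delta_psi_exact(x, y, eps):
    """Change in psi when (x, y) moves to (x + eps*y, y - eps*x)."""
    return psi(x + eps * y, y - eps * x) - psi(x, y)

def delta_psi_first_order(x, y, eps):
    """First-order estimate: eps * (y * dpsi/dx - x * dpsi/dy)."""
    return eps * (y * d_dx(psi, x, y) - x * d_dy(psi, x, y))

x0, y0 = 0.8, -0.5
for eps in (1e-2, 1e-3, 1e-4):
    exact = delta_psi_exact(x0, y0, eps)
    approx = delta_psi_first_order(x0, y0, eps)
    print(f"eps={eps:.0e}  |exact - first order| = {abs(exact - approx):.3e}")
```

The mismatch should shrink roughly like ε², which is what you expect when only the first-order term is kept.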

This looks great! You can see we get some sort of differential operator here, which is what we want. So the next step should be simple: we just let ε go to zero and then we’re done, right? Well… No. In quantum mechanics, it’s always a bit more complicated. But it’s logical stuff. Think of the following:

1. We will want to re-write the infinitesimally small ε angle as a fraction of i, i.e. the imaginary unit.

Huh? Yes. This little i represents many things. In this particular case, we want to look at it as a right angle. In fact, you know multiplication with i amounts to a rotation by 90 degrees. So we should replace ε by ε·i. It’s like measuring ε in natural units. However, we’re not done.

2. We should also note that Nature measures angles clockwise, rather than counter-clockwise, as evidenced by the fact that the argument of our wavefunction rotates clockwise as time goes by. So our ε is, in fact, a −ε. We will just bring the minus sign inside of the brackets to solve this issue.

Huh? Yes. Sorry. I told you this is a rather intuitive approach to getting what we want to get. 🙂

3. The third modification we’d want to make is to express ε·i as a multiple of Planck’s constant.

Huh? Yes. This is a very weird thing, but it should make sense, intuitively: we’re talking angular momentum here, and its dimension is the same as that of physical action: N·m·s. Therefore, Planck’s quantum of action (ħ = h/2π ≈ 1.055×10⁻³⁴ J·s ≈ 6.6×10⁻¹⁶ eV·s) naturally appears as… Well… A natural unit, or a scaling factor, I should say.

To make a long story short, we’ll want to re-write ε as −(i/ħ)·ε. However, there is a thing called mathematical consistency, and so, if we want to do such substitutions and prepare for that limit situation (ε → 0), we should re-write that Δψ equation as follows:

final

So now – finally! – we do have the formula we wanted to find for our angular momentum operator:

final 2

The final substitution, which yields the formula we just gave you when commencing this section, just uses the formula for the linear momentum operator in the x– and y-direction respectively. We’re done! 🙂 Finally!
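To see the operator do something, the sketch below applies L_z = x·p_y − y·p_x, i.e. (ħ/i)·(x·∂/∂y − y·∂/∂x), to the test function ψ = (x + i·y)·e^(−(x² + y²)). That function is my own choice: working it out by hand, it should be an eigenfunction of L_z with eigenvalue ħ. Units (ħ = 1) are again an assumption, and the derivatives are finite differences.

```python
import cmath

HBAR = 1.0   # natural units (assumption)

def psi(x, y):
    """Test function (x + i*y) * exp(-(x^2 + y^2)); expected eigenvalue: +hbar."""
    return (x + 1j * y) * cmath.exp(-(x ** 2 + y ** 2))

def l_z(f, x, y, h=1e-6):
    """Apply (hbar/i) * (x d/dy - y d/dx) to f at (x, y) using central differences."""
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (HBAR / 1j) * (x * df_dy - y * df_dx)

for x0, y0 in ((0.6, -0.4), (1.0, 0.2), (-0.3, 0.9)):
    ratio = l_z(psi, x0, y0) / psi(x0, y0)
    print(f"({x0:+.1f}, {y0:+.1f})  (L_z psi)/psi = {ratio.real:.6f} {ratio.imag:+.6f}i")
```

The ratio should be the same at every point, which is the signature of a definite z-component of angular momentum.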

Well… No. 🙂 The question, of course, is the same as always: what does it all mean, really? That’s always a great question. 🙂 Unfortunately, the answer is rather boring: we can calculate the average angular momentum in the z-direction, using a similar integral as the one we used to get the average energy, or the average linear momentum in some direction. That’s basically it.

To compensate for that very boring answer, however, I will show you something that is far less boring. 🙂

Quantum-mechanical weirdness

I’ll shamelessly copy from Feynman here. He notes that many classical equations get carried over into a quantum-mechanical form (I’ll copy some of his illustrations later). But then there are some that don’t. As Feynman puts it, rather humorously: “There had better be some that don’t come out right, because if everything did, then there would be nothing different about quantum mechanics. There would be no new physics.” He then looks at the following super-obvious equation in classical mechanics:

x·p_x − p_x·x = 0

In fact, this equation is so super-obvious that it’s almost meaningless. Almost. It’s super-obvious because multiplication is commutative (for real as well as for complex numbers). However, when we replace x and p_x by the position and momentum operator, we get an entirely different result. You can verify the following yourself:

strange

This is plain weird! What does it mean? I am not sure. Feynman’s take on it is nice but leaves us in the dark on it:

Feynman quote 2

He adds: “If Planck’s constant were zero, the classical and quantum results would be the same, and there would be no quantum mechanics to learn!” Hmm… What does it mean, really? Not sure. Let me make two remarks here:

1. We should not put any dot (·) between our operators, because they do not amount to multiplying one with another. We just apply operators successively. Hence, commutativity is not what we should expect.

2. Note that Feynman forgot to put the subscript in that quote. When doing the same calculations for the equivalent of the x·p_y − p_y·x expression, we do get zero, as shown below:

not strange
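You can also let a computer do the verification. The sketch below builds the operators as functions acting on functions, with p_x = (ħ/i)·∂/∂x and p_y = (ħ/i)·∂/∂y approximated by finite differences, and applies x·p_x − p_x·x and x·p_y − p_y·x to an arbitrary smooth test function. With the standard operator definitions, the first should return i·ħ·ψ and the second zero. The test function and the units (ħ = 1) are my own assumptions.

```python
import cmath

HBAR = 1.0   # natural units (assumption)

def psi(x, y):
    """Arbitrary smooth complex test function (not a specific physical state)."""
    return cmath.exp(-(x ** 2 + y ** 2) / 2 + 0.5j * x * y)

def p_x(f, h=1e-5):
    """Return the function (hbar/i) * df/dx, via a central difference."""
    return lambda x, y: (HBAR / 1j) * (f(x + h, y) - f(x - h, y)) / (2 * h)

def p_y(f, h=1e-5):
    """Return the function (hbar/i) * df/dy, via a central difference."""
    return lambda x, y: (HBAR / 1j) * (f(x, y + h) - f(x, y - h)) / (2 * h)

def times_x(f):
    """Position operator: multiply the function by x."""
    return lambda x, y: x * f(x, y)

def commutator(a, b, f):
    """Return the function (a b - b a) f: operators are applied successively."""
    ab = a(b(f))
    ba = b(a(f))
    return lambda x, y: ab(x, y) - ba(x, y)

x0, y0 = 0.4, -0.3
print("(x p_x - p_x x) psi :", commutator(times_x, p_x, psi)(x0, y0))
print("i * hbar * psi      :", 1j * HBAR * psi(x0, y0))
print("(x p_y - p_y x) psi :", commutator(times_x, p_y, psi)(x0, y0))
```

Note that the code never multiplies two operators: it just applies one after the other, in both orders, and takes the difference.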

These equations – zero or not – are referred to as ‘commutation rules’. [Again, I should not have used any dot between x and p_y, because there is no multiplication here. It’s just a separation mark.] Let me quote Feynman on it, so the matter is dealt with:

quote

OK. So what do we conclude? What are we talking about?

Conclusions

Some of the stuff above was really intriguing. For example, we found that the linear and angular momentum operators are differential operators in the true sense of the word. The angular momentum operator shows us what happens to the wavefunction if we rotate our reference frame over an infinitesimally small angle ε. That’s what’s captured by the formulas we’ve developed, as summarized below:

angular momentum

Likewise, the linear momentum operator captures what happens to the wavefunction for an infinitesimally small displacement of the reference frame, as shown by the equivalent formulas below:

linear momentum

What’s the interpretation for the position operator, and the energy operator? Here we are not so sure. The integrals above make sense, but these integrals are used to calculate average values, as opposed to instantaneous values. So… Well… There is not all that much I can say about the position and energy operator right now, except… Well… We now need to explore the question of how averages could possibly change over time. Let’s do that now.

Averages that change with time

I know: you are totally quantum-mechanicked out by now. So am I. But we’re almost there. In fact, this is Feynman’s last Lecture on quantum mechanics and, hence, I think I should let the Master speak here. So just click on the link and read for yourself. It’s a really interesting chapter, as he shows us the equivalent of Newton’s Law in quantum mechanics, as well as the quantum-mechanical equivalent of other standard equations in classical mechanics. However, I need to warn you: Feynman keeps testing the limits of our intellectual absorption capacity by switching back and forth between matrix and wave mechanics. Interesting, but not easy. For example, you’ll need to remind yourself of the fact that the Hamiltonian matrix is equal to its own complex conjugate (or – because it’s a matrix – its own conjugate transpose).

Having said that, it’s all wonderful. The time rate of change of all those average values is denoted by using the over-dot notation. For example, the time rate of change of the average position is denoted by:

Once you ‘get’ that new notation, you will quickly understand the derivations. They are not easy (what derivations are in quantum mechanics?), but we get very interesting results. Nice things to play with, or think about—like this identity:

formula2

It takes a while, but you suddenly realize this is the equivalent of the classical dx/dt = v = p/m formula. 🙂

Another sweet result is the following one:

formula3

This is the quantum-mechanical equivalent of Newton’s force law: F = m·a. Huh? Yes. Think of it: minus the spatial derivative of the (potential) energy is the force. Now just think of the classical dp/dt = d(m·v)/dt = m·dv/dt = m·a formula. […] Can you see it now? Isn’t this just Great Fun?
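For reference, the two results are usually written in textbooks as statements about expectation values. The exact notation below is an assumption on my part (it is the standard textbook form, not necessarily the form Feynman uses in his Lecture):

```latex
\frac{d\langle x \rangle}{dt} = \frac{\langle p_x \rangle}{m},
\qquad
\frac{d\langle p_x \rangle}{dt} = -\left\langle \frac{\partial V}{\partial x} \right\rangle
```

The first mirrors the classical dx/dt = p/m, and the second mirrors dp/dt = F with F = −∂V/∂x.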

Note, however, that these formulas also show the limits of our analysis so far, because they treat m as some constant. Hence, we’ll need to relativistically correct them. But that’s complicated, and so we’ll postpone that to another day.

[…]

Well… That’s it, folks! We’re really through! This was the last of the last of Feynman’s Lectures on Physics. So we’re totally done now. Isn’t this great? What an adventure! I hope that, despite the enormous mental energy that’s required to digest all this stuff, you enjoyed it as much as I did. 🙂

Post scriptum 1: I just love Feynman but, frankly, I think he’s sometimes somewhat sloppy with terminology. In regard to what these operators really mean, we should make use of better terminology: an average is something else than an expected value. Our momentum operator, for example, as such returns an expected value – not an average momentum. We need to deepen the analysis here somewhat, but I’ll also leave that for later.

Post scriptum 2: There is something really interesting about that i·ħ or −(i/ħ) scaling factor – or whatever you want to call it – appearing in our formulas. Remember the Schrödinger equation can also be written as:

i·ħ·∂ψ/∂t = −(1/2)·(ħ²/m)∇²ψ + V·ψ = Hψ

This is interesting in light of our interpretation of the Schrödinger equation as an energy propagation mechanism. If we write Schrödinger’s equation like we write it here, then we have the energy on the right-hand side – which is time-independent. How do we interpret the left-hand side now? Well… It’s kinda simple, but we just have the time rate of change of the real and imaginary part of the wavefunction here, and the i·ħ factor then becomes a sort of unit in which we measure the time rate of change. Alternatively, you may think of ‘splitting’ Planck’s constant in two: Planck’s energy, and Planck’s time unit, and then you bring the Planck energy unit to the other side, so we’d express the energy in natural units. Likewise, the time rate of change of the components of our wavefunction would also be measured in natural time units if we’d do that.

I know this is all very abstract but, frankly, it’s crystal clear to me. This formula tells us that the energy of the particle that’s being described by the wavefunction is being carried by the oscillations of the wavefunction. In fact, the oscillations are the energy. You can play with the mass factor, by moving it to the left-hand side too, or by using Einstein’s mass-energy equivalence relation. The interpretation remains consistent.

In fact, there is something really interesting here. You know that we usually separate out the spatial and temporal part of the wavefunction, so we write: ψ(r, t) = ψ(r)·e^{−i·(E/ħ)·t}. In fact, it is quite common to refer to ψ(r) – rather than to ψ(r, t) – as the wavefunction, even if, personally, I find that quite confusing and misleading (see my page on Schrödinger’s equation). Now, we may want to think of what happens when we’d apply the energy operator to ψ(r) rather than to ψ(r, t). We may think that we’d get a time-independent value for the energy at that point in space, so energy is some function of position only, not of time. That’s an interesting thought, and we should explore it. For example, we then may think of energy as an average that changes with position – as opposed to the (average) position and momentum, which we like to think of as averages that change with time, as mentioned above. I will come back to this later – but perhaps in another post or so. Not now. The only point I want to mention here is the following: you cannot use ψ(r) in Schrödinger’s equation. Why? Well… Schrödinger’s equation is no longer valid when substituting ψ(r) for ψ(r, t), because the left-hand side is then always zero, as ∂ψ(r)/∂t is zero – for any r.

There is another, related, point to this observation. If you think that Schrödinger’s equation implies that the operators on both sides of Schrödinger’s equation must be equivalent (i.e. the same), you’re wrong:

i·ħ·∂/∂t ≠ H = −(1/2)·(ħ²/m)∇² + V

It’s a basic thing, really: Schrödinger’s equation is not valid for just any function. Hence, it does not work for ψ(r). Only ψ(r, t) makes it work, because… Well… Schrödinger’s equation gave us ψ(r, t)!

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

The energy and 1/2 factor in Schrödinger’s equation

Schrödinger’s equation, for a particle moving in free space (so we have no external force fields acting on it, so V = 0 and, therefore, the Vψ term disappears) is written as:

∂ψ(x, t)/∂t = i·(1/2)·(ħ/m_eff)·∇²ψ(x, t)

We already noted and explained the structural similarity with the ubiquitous diffusion equation in physics:

∂φ(x, t)/∂t = D·∇²φ(x, t) with x = (x, y, z)

The big difference between the wave equation and an ordinary diffusion equation is that the wave equation gives us two equations for the price of one: ψ is a complex-valued function, with a real and an imaginary part which, despite their name, are both equally fundamental, or essential. Whatever word you prefer. 🙂 That’s also what the presence of the imaginary unit (i) in the equation tells us. But for the rest it’s the same: the diffusion constant (D) in Schrödinger’s equation is equal to (1/2)·(ħ/m_eff).

Why the 1/2 factor? It’s ugly. Think of the following: If we bring the (1/2)·(ħ/m_eff) to the other side, we can write it as m_eff/(ħ/2). The ħ/2 now appears as a scaling factor in the diffusion constant, just like ħ does in the de Broglie equations: ω = E/ħ and k = p/ħ, or in the argument of the wavefunction: θ = (E·t − p∙x)/ħ. Planck’s constant is, effectively, a physical scaling factor. As a physical scaling constant, it usually does two things:

It fixes the numbers (so that’s its function as a mathematical constant).
As a physical constant, it also fixes the physical dimensions. Note, for example, how the 1/ħ factor in ω = E/ħ and k = p/ħ ensures that the ω·t = (E/ħ)·t and k·x = (p/ħ)·x terms in the argument of the wavefunction are both expressed as some dimensionless number, so they can effectively be added together. Physicists don’t like adding apples and oranges.

The question is: why did Schrödinger use ħ/2, rather than ħ, as a scaling factor? Let’s explore the question.

The 1/2 factor

We may want to think that 1/2 factor just echoes the 1/2 factor in the Uncertainty Principle, which we should think of as a pair of relations: σ_x·σ_p ≥ ħ/2 and σ_E·σ_t ≥ ħ/2. However, the 1/2 factor in those relations only makes sense because we chose to equate the fundamental uncertainty (Δ) in x, p, E and t with the mathematical concept of the standard deviation (σ), or the half-width, as Feynman calls it in his wonderfully clear exposé on it in one of his Lectures on quantum mechanics (for a summary with some comments, see my blog post on it). We may just as well choose to equate Δ with the full-width of those probability distributions we get for x and p, or for E and t. If we do that, we get σ_x·σ_p ≥ ħ and σ_E·σ_t ≥ ħ.

It’s a bit like measuring the weight of a person on an old-fashioned (non-digital) bathroom scale with 1 kg marks only: do we say this person is x kg ± 1 kg, or x kg ± 500 g? Do we take the half-width or the full-width as the margin of error? In short, it’s a matter of appreciation, and the 1/2 factor in our pair of uncertainty relations is not there because we’ve got two relations. Likewise, it’s not because I mentioned we can think of Schrödinger’s equation as a pair of relations that, taken together, represent an energy propagation mechanism that’s quite similar in its structure to Maxwell’s equations for an electromagnetic wave (as shown below), that we’d insert (or not) that 1/2 factor: either of the two representations below works. It just depends on our definition of the concept of the effective mass.

The 1/2 factor is really a matter of choice, because the rather peculiar – and flexible – concept of the effective mass takes care of it. However, we could define some new effective mass concept, by writing: m_eff^NEW = 2∙m_eff^OLD, and then Schrödinger’s equation would look more elegant:

∂ψ/∂t = i·(ħ/m_eff^NEW)·∇²ψ

Now you’ll want the definition, of course! What is that effective mass concept? Feynman talks at length about it, but his exposé is embedded in a much longer and more general argument on the propagation of electrons in a crystal lattice, which you may not necessarily want to go through right now. So let’s try to answer that question by doing something stupid: let’s substitute ψ in the equation for ψ = a·e^{−i·[E·t − p∙x]/ħ} (which is an elementary wavefunction), calculate the time derivative and the Laplacian, and see what we get. If we do that, the ∂ψ/∂t = i·(1/2)·(ħ/m_eff)·∇²ψ equation becomes:

~~−i·a~~·(E/ħ)·~~e^{−i∙(E·t − p∙x)/ħ}~~ = ~~−i·a~~·(1/2)·(ħ/m_eff)(p²/ħ²)·~~e^{−i∙(E·t − p∙x)/ħ}~~

⇔ E = (1/2)·p²/m_eff = (1/2)·(m·v)²/m_eff ⇔ m_eff = (1/2)·(m/E)·m·v²

⇔ m_eff = (1/c²)·(m·v²/2) = m·β²/2

Hence, the effective mass appears in this equation as the equivalent mass of the kinetic energy (K.E.) of the elementary particle that’s being represented by the wavefunction. Now, you may think that sounds good – and it does – but you should note the following:

1. The K.E. = m·v²/2 formula is only correct for non-relativistic speeds. In fact, it’s the kinetic energy formula if, and only if, m ≈ m₀. The relativistically correct formula for the kinetic energy calculates it as the difference between (1) the total energy (which is given by the E = m·c² formula, always) and (2) its rest energy, so we write:

K.E. = E − E₀ = m_v·c² − m₀·c² = m₀·γ·c² − m₀·c² = m₀·c²·(γ − 1)

2. The energy concept in the wavefunction ψ = a·e^{−i·[E·t − p∙x]/ħ} is, obviously, the total energy of the particle. For non-relativistic speeds, the kinetic energy is only a very small fraction of the total energy. In fact, using the formula above, you can calculate the ratio between the kinetic and the total energy: you’ll find it’s equal to 1 − 1/γ = 1 − √(1−v²/c²), and its graph goes from 0 to 1.
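The two remarks above are easy to put into numbers. The sketch below is just an illustration: the function names are mine, and β = v/c is the only input. It compares the relativistic kinetic energy with the classical m₀·v²/2 value (both expressed in units of the rest energy m₀·c²), and it calculates the ratio between the kinetic and the total energy.

```python
import math

def gamma(beta):
    """Lorentz factor for β = v/c (requires 0 ≤ β < 1)."""
    return 1.0 / math.sqrt(1.0 - beta**2)

def ke_over_rest_relativistic(beta):
    """K.E./(m₀·c²) from K.E. = m₀·c²·(γ − 1)."""
    return gamma(beta) - 1.0

def ke_over_rest_classical(beta):
    """K.E./(m₀·c²) from the non-relativistic K.E. = m₀·v²/2."""
    return beta**2 / 2.0

def kinetic_fraction(beta):
    """Relativistic K.E. divided by the total energy: (γ − 1)/γ = 1 − 1/γ."""
    return 1.0 - 1.0 / gamma(beta)

for beta in (0.01, 0.1, 0.6, 0.99):
    print(beta,
          ke_over_rest_relativistic(beta),
          ke_over_rest_classical(beta),
          kinetic_fraction(beta))
```

For small β the two kinetic energy columns agree closely, while the kinetic fraction stays tiny. It is only as β approaches 1 that the fraction approaches 1.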

Now, if we discard the 1/2 factor, the calculations above yield the following:

−i·a·(E/ħ)·e^{−i∙(E·t − p∙x)/ħ} = −i·a·(ħ/m_eff)(p²/ħ²)·e^{−i∙(E·t − p∙x)/ħ}

⇔ E = p²/m_eff = (m·v)²/m_eff ⇔ m_eff = (m/E)·m·v²

⇔ m_eff = m·v²/c² = m·β²
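Both substitutions (with and without the 1/2 factor) can be checked numerically. The sketch below assumes natural units (ħ = 1), one spatial dimension, and hypothetical values for E, p and a. It uses finite differences for the time derivative and the Laplacian, so it is an illustration of the algebra above, nothing more.

```python
import cmath

HBAR = 1.0                 # assumption: natural units
E, P, A = 2.0, 1.5, 1.0    # hypothetical energy, momentum and amplitude
X0, T0 = 0.7, 0.4          # arbitrary sample point
H = 1e-4                   # finite-difference step

def psi(x, t):
    """Elementary wavefunction a·e^{−i·(E·t − p·x)/ħ}."""
    return A * cmath.exp(-1j * (E * t - P * x) / HBAR)

def d_dt(f, x, t):
    """Central difference for ∂f/∂t."""
    return (f(x, t + H) - f(x, t - H)) / (2 * H)

def laplacian_1d(f, x, t):
    """Central second difference for ∂²f/∂x²."""
    return (f(x + H, t) - 2 * f(x, t) + f(x - H, t)) / H**2

def residual(factor, m_eff):
    """|∂ψ/∂t − i·factor·(ħ/m_eff)·∇²ψ| at the sample point."""
    lhs = d_dt(psi, X0, T0)
    rhs = 1j * factor * (HBAR / m_eff) * laplacian_1d(psi, X0, T0)
    return abs(lhs - rhs)

m_eff_half = P**2 / (2 * E)   # with the 1/2 factor: E = p²/(2·m_eff)
m_eff_full = P**2 / E         # without it:          E = p²/m_eff

print(residual(0.5, m_eff_half))   # small: the equation is satisfied
print(residual(1.0, m_eff_full))   # small: the equation is satisfied
print(residual(1.0, m_eff_half))   # not small: wrong pairing of factor and mass

# With E = m·c² and p = m·v, these become m·β²/2 and m·β² respectively.
m, v, c = 3.0, 0.6, 1.0       # hypothetical values
print((m * v)**2 / (2 * m * c**2), m * (v / c)**2 / 2)
print((m * v)**2 / (m * c**2), m * (v / c)**2)
```

The third residual shows that the 1/2 factor and the definition of the effective mass have to be changed together: dropping one without the other breaks the equation.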

In fact, it is fair to say that both definitions are equally weird, even if the dimensions come out alright: the effective mass is measured in old-fashioned mass units, and the β² or β²/2 factor appears as a sort of correction factor, varying between 0 and 1 (for β²) or between 0 and 1/2 (for β²/2). I prefer the new definition, as it ensures that m_eff becomes equal to m in the limit for the velocity going to c. In addition, if we bring the ħ/m_eff or (1/2)∙ħ/m_eff factor to the other side of the equation, the choice becomes one between a m_eff^NEW/ħ or a 2∙m_eff^OLD/ħ coefficient.

It’s a choice, really. Personally, I think the equation without the 1/2 factor – and, hence, the use of ħ rather than ħ/2 as the scaling factor – looks better, but then you may argue that – if half of the energy of our particle is in the oscillating real part of the wavefunction, and the other is in the imaginary part – then the 1/2 factor should stay, because it ensures that m_eff becomes equal to m/2 as v goes to c (or, what amounts to the same, β goes to 1). But then that’s the argument about whether or not we should have a 1/2 factor because we get two equations for the price of one, like we did for the Uncertainty Principle.

So… What to do? Let’s first ask ourselves whether that derivation of the effective mass actually makes sense. Let’s therefore look at both limit situations.

1. For v going to c (or β = v/c going to 1), we do not have much of a problem: m_eff just becomes the total mass of the particle that we’re looking at, and Schrödinger’s equation can easily be interpreted as an energy propagation mechanism. Our particle has zero rest mass in that case (we may also say that the concept of a rest mass is meaningless in this situation) and all of the energy – and, therefore, all of the equivalent mass – is kinetic: m = E/c² and the effective mass is just the mass: m_eff = m·c²/c² = m. Hence, our particle is everywhere and nowhere. In fact, you should note that the concept of velocity itself doesn’t make sense in this rather particular case. It’s like a photon (but note it’s not a photon: we’re talking some theoretical particle here with zero spin and zero rest mass): it’s a wave in its own frame of reference, but as it zips by at the speed of light, we think of it as a particle.

2. Let’s look at the other limit situation. For v going to 0 (or β = v/c going to 0), Schrödinger’s equation no longer makes sense, because the diffusion constant goes to zero, so we get a nonsensical equation. Huh? What’s wrong with our analysis?

Well… I must be honest. We started off on the wrong foot. You should note that it’s hard – in fact, plain impossible – to reconcile our simple a·e^{−i·[E·t − p∙x]/ħ} function with the idea of the classical velocity of our particle. Indeed, the classical velocity corresponds to a group velocity, or the velocity of a wave packet, and so we just have one wave here: no group. So we get nonsense. You can see the same when equating p to zero in the wave equation: we get another nonsensical equation, because the Laplacian is zero! Check it. If our elementary wavefunction is equal to ψ = a·e^{−i·(E/ħ)·t}, then that Laplacian is zero.

Hence, our calculation of the effective mass does not make much sense. Why? Because the elementary wavefunction is a theoretical concept only: it may represent some box in space, that is uniformly filled with energy, but it cannot represent any actual particle. Actual particles are always some superposition of two or more elementary waves, so then we’ve got a wave packet (as illustrated below) that we can actually associate with some real-life particle moving in space, like an electron in some orbital indeed. 🙂

wave-packet

I must credit Oregon State University for the animation above. It’s quite nice: a simple particle in a box model without potential. As I showed on my other page (explaining various models), we must add at least two waves – traveling in opposite directions – to model a particle in a box. Why? Because we represent it by a standing wave, and a standing wave is the sum of two waves traveling in opposite directions.
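The standing-wave statement is easy to verify: adding a·e^{−i·(E·t − p·x)/ħ} and a·e^{−i·(E·t + p·x)/ħ} gives 2a·cos(p·x/ħ)·e^{−i·E·t/ħ}, so the nodes stay where they are. Here is a small Python sketch of that, again assuming natural units and hypothetical values for E, p and a:

```python
import cmath
import math

HBAR = 1.0                 # assumption: natural units
E, P, A = 2.0, 1.5, 1.0    # hypothetical energy, momentum and amplitude

def wave(x, t, direction):
    """Elementary wave travelling in the +x (direction=+1) or −x (direction=−1) sense."""
    return A * cmath.exp(-1j * (E * t - direction * P * x) / HBAR)

def standing(x, t):
    """Sum of two elementary waves travelling in opposite directions."""
    return wave(x, t, +1) + wave(x, t, -1)

def closed_form(x, t):
    """2a·cos(p·x/ħ)·e^{−i·E·t/ħ}: the same sum, written as a standing wave."""
    return 2 * A * math.cos(P * x / HBAR) * cmath.exp(-1j * E * t / HBAR)

node = (math.pi / 2) * HBAR / P    # cos(p·x/ħ) = 0 at this position
for t in (0.0, 0.3, 1.1):
    # amplitude at the node stays (numerically) zero, at x = 0 it stays 2a
    print(t, abs(standing(node, t)), abs(standing(0.0, t)))
```

The absolute value at any given x does not depend on t: that is what makes it a standing wave rather than a travelling one.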

So, if our derivation above was not very meaningful, then what is the actual concept of the effective mass?

The concept of the effective mass

I am afraid that, at this point, I do have to direct you back to the Grand Master himself for the detail. Let me just try to sum it up very succinctly. If we have a wave packet, there is – obviously – some energy in it, and it’s energy we may associate with the classical concept of the velocity of our particle – because it’s the group velocity of our wave packet. Hence, we have a new energy concept here – and the equivalent mass, of course. Now, Feynman’s analysis – which is Schrödinger’s analysis, really – shows we can write that energy as:

E = m_eff·v²/2

So… Well… That’s the classical kinetic energy formula. And it’s the very classical one, because it’s not relativistic. 😦 But that’s OK for relatively slow-moving electrons! [Remember the typical (relative) velocity is given by the fine-structure constant: α = β = v/c. So that’s impressive (about 2,188 km per second), but it’s only a tiny fraction of the speed of light, so non-relativistic formulas should work.]
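A quick sanity check of the numbers in that bracketed remark, using the CODATA 2018 value of the fine-structure constant and the exact value of c (the variable names are just illustrative):

```python
import math

ALPHA = 7.2973525693e-3   # fine-structure constant (CODATA 2018)
C = 299_792_458.0         # speed of light in m/s (exact by definition)

v = ALPHA * C                                   # typical electron velocity for β = α
gamma = 1.0 / math.sqrt(1.0 - ALPHA**2)         # Lorentz factor at that velocity

print(v / 1000.0)     # velocity in km/s, roughly 2,188
print(gamma - 1.0)    # relativistic correction, of the order of α²/2
```

The γ − 1 correction is a few parts in 100,000, which is why the non-relativistic kinetic energy formula is good enough here.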

Now, the m_eff factor in this equation is a function of the various parameters of the model he uses. To be precise, we get the following formula out of his model (which, as mentioned above, is a model of electrons propagating in a crystal lattice):

m_eff = ħ²/(2·A·b²)

Now, the b in this formula is the spacing between the atoms in the lattice. The A basically represents an energy barrier: to move from one atom to another, the electron needs to get across it. I talked about this in my post on it, and so I won’t explain the graph below – because I did that in that post. Just note that we don’t need that factor 2: there is no reason whatsoever to write E₀ + 2·A and E₀ − 2·A. We could just re-define a new A: (1/2)·A^NEW = A^OLD. The formula for m_eff then simplifies to ħ²/(2·A^OLD·b²) = ħ²/(A^NEW·b²). We then get an E_eff = m_eff·v² formula for the extra energy.
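To get a feel for the orders of magnitude, here is a small sketch that evaluates the m_eff formula in SI units. The values for A (1 eV) and b (2 Å) are hypothetical: I picked them for illustration only, and they do not come from Feynman’s text. The sketch also shows that re-defining A changes the look of the formula, not the value of m_eff.

```python
HBAR = 1.054571817e-34    # reduced Planck constant in J·s
EV = 1.602176634e-19      # joule per electronvolt
M_E = 9.1093837015e-31    # free-space electron mass in kg

def m_eff_from_a_old(a_old, b):
    """m_eff = ħ²/(2·A^OLD·b²), with A^OLD in joule and b in metre."""
    return HBAR**2 / (2.0 * a_old * b**2)

def m_eff_from_a_new(a_new, b):
    """Same quantity after re-defining A^NEW = 2·A^OLD: m_eff = ħ²/(A^NEW·b²)."""
    return HBAR**2 / (a_new * b**2)

a_old = 1.0 * EV    # hypothetical energy parameter A
b = 2.0e-10         # hypothetical lattice spacing

m1 = m_eff_from_a_old(a_old, b)
m2 = m_eff_from_a_new(2.0 * a_old, b)
print(m1, m2)       # the same mass, in kg
print(m1 / M_E)     # effective mass in units of the free-space electron mass
```

With these made-up inputs the effective mass comes out at roughly the same order of magnitude as the free-space electron mass.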

E_eff = m_eff·v²?!? What energy formula is that? Schrödinger must have thought the same thing, and so that’s why we have that ugly 1/2 factor in his equation. However, think about it. Our analysis shows that it is quite straightforward to model energy as a two-dimensional oscillation of mass. In this analysis, both the real and the imaginary component of the wavefunction each store half of the total energy of the object, which is equal to E = m·c². Remember, indeed, that we compared it to the energy in an oscillator, which is equal to the sum of kinetic and potential energy, and for which we have the T + U = m·ω₀²/2 formula. But so we have two oscillators here and, hence, twice the energy. Hence, the E = m·c² corresponds to m·ω₀² and, hence, we may think of c as the natural frequency of the vacuum.

Therefore, the E_eff = m_eff·v² formula makes much more sense. It nicely mirrors Einstein’s E = m·c² formula and, in fact, naturally merges into E = m·c² for v approaching c. But, I admit, it is not so easy to interpret. It’s much easier to just say that the effective mass is the mass of our electron as it appears in the kinetic energy formula, or – alternatively – in the momentum formula. Indeed, Feynman also writes the following formula:

m_eff·v = p = ħ·k

Now, that is something we easily recognize! 🙂

So… Well… What do we do now? Do we use the 1/2 factor or not?

It would be very convenient, of course, to just stick with tradition and use m_eff as everyone else uses it: it is just the mass as it appears in whatever medium we happen to look at, which may be a crystal lattice (or a semi-conductor), or just free space. In short, it’s the mass of the electron as it appears to us, i.e. as it appears in the (non-relativistic) kinetic energy formula (K.E. = m_eff·v²/2), the formula for the momentum of an electron (p = m_eff·v), or in the wavefunction itself (k = p/ħ = (m_eff·v)/ħ). In fact, in his analysis of the electron orbitals, Feynman (who just follows Schrödinger here) drops the eff subscript altogether, and so the effective mass is just the mass: m_eff = m. Hence, the apparent mass of the electron in the hydrogen atom serves as a reference point, and the effective mass in a different medium (such as a crystal lattice, rather than free space or, I should say, a hydrogen atom in free space) will also be different.

The thing is: we get the right results out of Schrödinger’s equation, with the 1/2 factor in it. Hence, Schrödinger’s equation works: we get the actual electron orbitals out of it. Hence, Schrödinger’s equation is true – without any doubt. Hence, if we take that 1/2 factor out, then we do need to use the other effective mass concept. We can do that. Think about the actual relation between the effective mass and the real mass of the electron, about which Feynman writes the following: “The effective mass has nothing to do with the real mass of an electron. It may be quite different – although in commonly used metals and semiconductors it often happens to turn out to be the same general order of magnitude: about 0.1 to 30 times the free-space mass of the electron.” Hence, if we write the relation between m_eff and m as m_eff = g(m), then the same relation for our m_eff^NEW = 2∙m_eff^OLD becomes m_eff^NEW = 2·g(m), and the “about 0.1 to 30 times” becomes “about 0.2 to 60 times.”

In fact, in the original 1963 edition, Feynman writes that the effective mass is “about 2 to 20 times” the free-space mass of the electron. Isn’t that interesting? I mean… Note that factor 2! If we’d write m_eff = 2·m, then we’re fine. We can then write Schrödinger’s equation in the following two equivalent ways:

(m_eff/ħ)·∂ψ/∂t = i·∇²ψ
(2m/ħ)·∂ψ/∂t = i·∇²ψ

Both would be correct, and it explains why Schrödinger’s equation works. So let’s go for that compromise and write Schrödinger’s equation in either of the two equivalent ways. 🙂 The question then becomes: how to interpret that factor 2? The answer to that question is, effectively, related to the fact that we get two waves for the price of one here. So we have two oscillators, so to speak. Now that’s quite deep, and I will explore that in one of my next posts.

Let me now address the second weird thing in Schrödinger’s equation: the energy factor. I should be more precise: the weirdness arises when solving Schrödinger’s equation. Indeed, in the texts I’ve read, there is this constant switching back and forth between interpreting E as the energy of the atom, versus the energy of the electron. Now, both concepts are obviously quite different, so which one is it really?

The energy factor E

It’s a confusing point—for me, at least and, hence, I must assume for students as well. Let me indicate, by way of example, how the confusion arises in Feynman’s exposé on the solutions to the Schrödinger equation. Initially, the development is quite straightforward. Replacing V by −e²/r, Schrödinger’s equation becomes:

As usual, it is then assumed that a solution of the form ψ (r, t) = e^{−(i/ħ)·E·t}·ψ(r) will work. Apart from the confusion that arises because we use the same symbol, ψ, for two different functions (you will agree that ψ (r, t), a function in two variables, is obviously not the same as ψ(r), a function in one variable only), this assumption is quite straightforward and allows us to re-write the differential equation above as:

To get this, you just need to actually do that time derivative, noting that the ψ in our equation is now ψ(r), not ψ(r, t). Feynman duly notes this as he writes: “The function ψ(r) must solve this equation, where E is some constant – the energy of the atom.” So far, so good. In one of the (many) next steps, we re-write E as E = E_R·ε, with E_R = m·e⁴/2ħ². So we just use the Rydberg energy (E_R ≈ 13.6 eV) here as a ‘natural’ atomic energy unit. That’s all. No harm in that.

Then all kinds of complicated but legitimate mathematical manipulations follow, in an attempt to solve this differential equation – an attempt that is successful, of course! However, after all these manipulations, one ends up with the grand simple solution for the s-states of the atom (i.e. the spherically symmetric solutions):

E_n = −E_R/n² with 1/n² = 1, 1/4, 1/9, 1/16, …

So we get: E_n = −13.6 eV, −3.4 eV, −1.5 eV, etcetera. Now how is that possible? How can the energy of the atom suddenly be negative? More importantly, why is it so tiny in comparison with the rest energy of the proton (which is about 938 mega-electronvolt), or the electron (0.511 MeV)? The energy levels above are a few eV only, not a few million electronvolt. Feynman answers this question rather vaguely when he states the following:

“There is, incidentally, nothing mysterious about negative numbers for the energy. The energies are negative because when we chose to write V = −e²/r, we picked our zero point as the energy of an electron located far from the proton. When it is close to the proton, its energy is less, so somewhat below zero. The energy is lowest (most negative) for n = 1, and increases toward zero with increasing n.”

We picked our zero point as the energy of an electron located far away from the proton? But we were talking the energy of the atom all along, right? You’re right. Feynman doesn’t answer the question. The solution is OK – well, sort of, at least – but, in one of those mathematical complications, there is a ‘normalization’ – a choice of some constant that pops up when combining and substituting stuff – that is not so innocent. To be precise, at some point, Feynman substitutes the ε variable for the square of another variable – to be even more precise, he writes: ε = −α². He then performs some more hat tricks – all legitimate, no doubt – and finds that the only sensible solutions to the differential equation require α to be equal to 1/n, which immediately leads to the above-mentioned solution for our s-states.
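The chain E = E_R·ε, ε = −α², α = 1/n is short enough to put into a few lines of Python. The sketch below uses rounded reference values for the Rydberg energy and for the rest energies of the electron and the proton. Note that α here is Feynman’s auxiliary variable, not the fine-structure constant, and the function name is just illustrative.

```python
E_R = 13.605693                 # Rydberg energy in eV (rounded)
ELECTRON_REST = 0.51099895e6    # electron rest energy in eV (rounded)
PROTON_REST = 938.27208816e6    # proton rest energy in eV (rounded)

def s_state_energy(n):
    """Energy of the n-th s-state in eV, following E = E_R·ε with ε = −α², α = 1/n."""
    alpha = 1.0 / n          # the condition that gives sensible solutions
    epsilon = -alpha**2      # the substitution ε = −α²
    return E_R * epsilon

for n in (1, 2, 3, 4):
    print(n, s_state_energy(n))

# How small is the ground-state energy compared to the rest energy of the pieces?
print(abs(s_state_energy(1)) / (ELECTRON_REST + PROTON_REST))
```

The last ratio is of the order of 10⁻⁸, which illustrates the point: the energies that come out of the equation are measured from a zero point that leaves the rest energy of the proton and the electron out.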

The real answer to the question is given somewhere else. In fact, Feynman casually gives us an explanation in one of his very first Lectures on quantum mechanics, where he writes the following:

“If we have a “condition” which is a mixture of two different states with different energies, then the amplitude for each of the two states will vary with time according to an equation like a·e^{−iωt}, with ħ·ω = E₀ = m·c². Hence, we can write the amplitude for the two states, for example as:

e^{−i(E₁/ħ)·t} and e^{−i(E₂/ħ)·t}

And if we have some combination of the two, we will have an interference. But notice that if we added a constant to both energies, it wouldn’t make any difference. If somebody else were to use a different scale of energy in which all the energies were increased (or decreased) by a constant amount—say, by the amount A—then the amplitudes in the two states would, from his point of view, be

e^{−i(E₁+A)·t/ħ} and e^{−i(E₂+A)·t/ħ}

All of his amplitudes would be multiplied by the same factor e^{−i(A/ħ)·t}, and all linear combinations, or interferences, would have the same factor. When we take the absolute squares to find the probabilities, all the answers would be the same. The choice of an origin for our energy scale makes no difference; we can measure energy from any zero we want. For relativistic purposes it is nice to measure the energy so that the rest mass is included, but for many purposes that aren’t relativistic it is often nice to subtract some standard amount from all energies that appear. For instance, in the case of an atom, it is usually convenient to subtract the energy M_s·c², where M_s is the mass of all the separate pieces—the nucleus and the electrons—which is, of course, different from the mass of the atom. For other problems, it may be useful to subtract from all energies the amount M_g·c², where M_g is the mass of the whole atom in the ground state; then the energy that appears is just the excitation energy of the atom. So, sometimes we may shift our zero of energy by some very large constant, but it doesn’t make any difference, provided we shift all the energies in a particular calculation by the same constant.”

It’s a rather long quotation, but it’s important. The key phrase here is, obviously, the following: “For other problems, it may be useful to subtract from all energies the amount M_g·c², where M_g is the mass of the whole atom in the ground state; then the energy that appears is just the excitation energy of the atom.” So that’s what he’s doing when solving Schrödinger’s equation. However, I should make the following point here: if we shift the origin of our energy scale, it does not make any difference in regard to the probabilities we calculate, but it obviously does make a difference in terms of our wavefunction itself. To be precise, its density in time will be very different. Hence, if we’d want to give the wavefunction some physical meaning – which is what I’ve been trying to do all along – it does make a huge difference. When we leave the rest mass of all of the pieces in our system out, we can no longer pretend we capture their energy.
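Both halves of that observation can be illustrated with a two-state mixture. In the sketch below, the energies, the coefficients and the size of the shift are all hypothetical, and I use natural units (ħ = 1). The probabilities come out the same with or without the shift, but the amplitudes themselves do not.

```python
import cmath

HBAR = 1.0             # assumption: natural units
E1, E2 = 1.0, 1.5      # hypothetical energies of the two states
C1, C2 = 0.6, 0.8      # hypothetical (real) coefficients of the mixture
SHIFT = 1000.0         # hypothetical large constant A added to both energies

def amplitude(t, shift=0.0):
    """Combined amplitude C1·e^{−i(E1+A)·t/ħ} + C2·e^{−i(E2+A)·t/ħ}."""
    return (C1 * cmath.exp(-1j * (E1 + shift) * t / HBAR)
            + C2 * cmath.exp(-1j * (E2 + shift) * t / HBAR))

for t in (0.0, 0.5, 2.0):
    plain, shifted = amplitude(t), amplitude(t, SHIFT)
    # identical absolute squares, but different complex amplitudes (for t > 0)
    print(t, abs(plain)**2, abs(shifted)**2, plain, shifted)
```

The shifted amplitude is just the original one multiplied by e^{−i(A/ħ)·t}: a common phase factor that drops out of the absolute square, but that makes the wavefunction itself oscillate much faster in time.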

This is a rather simple observation, but one that has profound implications in terms of our interpretation of the wavefunction. Personally, I admire the Great Teacher’s Lectures, but I am really disappointed that he doesn’t pay more attention to this. 😦

The Essence of Reality

Original post:

I know it’s a crazy title. It has no place in a physics blog, but then I am sure this article will go elsewhere. […] Well… […] Let me be honest: it’s probably gonna go nowhere. Whatever. I don’t care too much. My life is happier than Wittgenstein’s. 🙂

My original title for this post was: discrete spacetime. That was somewhat less offensive but, while being less offensive, it suffered from the same drawback: the terminology was ambiguous. The commonly accepted term for discrete spacetime is the quantum vacuum. However, because I am just an arrogant bastard trying to establish myself in this field, I am telling you that term is meaningless. Indeed, wouldn’t you agree that, if the quantum vacuum is a vacuum, then it’s empty. So it’s nothing. Hence, it cannot have any properties and, therefore, it cannot be discrete – or continuous, or whatever. We need to put stuff in it to make it real.

Therefore, I’d rather distinguish mathematical versus physical space. Of course, you are smart, and so now you’ll say that my terminology is as bad as that of the quantum vacuumists. And you are right. However, this is a story that I am writing, and so I will write it the way I want to write it. 🙂 So where were we? Spacetime! Discrete spacetime.

Yes. Thank you! Because relativity tells us we should think in terms of four-vectors, we should not talk about space but about spacetime. Hence, we should distinguish mathematical spacetime from physical spacetime. So what’s the definitional difference?

Mathematical spacetime is just what it is: a coordinate space – Cartesian, polar, or whatever – which we define by choosing a representation, or a base. And all the other elements of the set are just some algebraic combination of the base set. Mathematical space involves numbers. They don’t – let me emphasize that: they do not! – involve the physical dimensions of the variables. Always remember: math shows us the relations, but it doesn’t show us the stuff itself. Think of it: even if we may refer to the coordinate axes as time, or distance, we do not really think of them as something physical. In math, the physical dimension is just a label. Nothing more. Nothing less.

In contrast, physical spacetime is filled with something – with waves, or with particles – so it’s spacetime filled with energy and/or matter. In fact, we should analyze matter and energy as essentially the same thing, and please do carefully re-read what I wrote: I said they are essentially the same. I did not say they are the same. Energy and mass are equivalent, but not quite the same. I’ll tell you what that means in a moment.

These waves, or particles, come with mass, energy and momentum. There is an equivalence between mass and energy, but they are not the same. There is a twist – literally (only after reading the next paragraphs, you’ll realize how literally): even when choosing our time and distance units such that c is numerically equal to 1 – e.g. when measuring distance in light-seconds (or time in light-meters), or when using Planck units – the physical dimension of the c² factor in Einstein’s E = mc² equation doesn’t vanish: the physical dimension of energy is kg·m²/s².

Using Newton’s force law (1 N = 1 kg·m/s²), we can easily see this rather strange unit is effectively equivalent to the energy unit, i.e. the joule (1 J = 1 kg·m²/s² = 1 (N·s²/m)·m²/s² = 1 N·m), but that’s not the point. The (m/s)² factor – i.e. the square of the velocity dimension – reflects the following:

Energy is nothing but mass in motion. To be precise, it’s oscillating mass. [And, yes, that’s what string theory is all about, but I didn’t want to mention that. It’s just terminology once again: I prefer to say ‘oscillating’ rather than ‘vibrating’. :-)]
The rapidly oscillating real and imaginary component of the matter-wave (or wavefunction, we should say) each capture half of the total energy of the object E = mc².
The oscillation is an oscillation of the mass of the particle (or wave) that we’re looking at.

In the mentioned publication, I explore the structural similarity between:

The oscillating electric and magnetic field vectors (E and B) that represent the electromagnetic wave, and
The oscillating real and imaginary part of the matter-wave.

The story is simple or complicated, depending on what you know already, but it can be told in an obnoxiously easy way. Note that the associated force laws do not differ in their structure:

Coulomb Law

gravitation law

The only difference is the dimension of m versus q: mass – the measure of inertia – versus charge. Mass comes in one color only, so to speak: it’s always positive. In contrast, electric charge comes in two colors: positive and negative. You can guess what comes next, but I won’t talk about that here. Just note the absolute distance between two charges (with the same or the opposite sign) is twice the distance between 0 and 1, which must explain the rather mysterious 2 factor I get for the Schrödinger equation for the electromagnetic wave (but I still need to show how that works out exactly).

The point is: remembering that the physical dimension of the electric field is N/C (newton per coulomb, i.e. force per unit of charge) it should not come as a surprise that we find that the physical dimension of the components of the matter-wave is N/kg: newton per kg, i.e. force per unit of mass. For the detail, I’ll refer you to that article of mine (and, because I know you will not want to work your way through it, let me tell you it’s the last chapter that tells you how to do the trick).

So where were we? Strange. I actually just wanted to talk about discrete spacetime here, but I realize I’ve already dealt with all of the metaphysical questions you could possibly have, except the (existential) Who Am I? question, which I cannot answer on your behalf. 🙂

I wanted to talk about physical spacetime, so that’s sanitized mathematical space plus something. A date without logistics. Our mind is a lazy host, indeed.

Reality is the guest that brings all of the wine and the food to the party.

In fact, it’s a guest that brings everything to the party: you – the observer – just need to set the time and the place. In fact, in light of what Kant – and many other eminent philosophers – wrote about space and time being constructs of the mind, that’s another statement which you should interpret literally. So physical spacetime is spacetime filled with something – like a wave, or a field. So what does that look like? Well… Frankly, I don’t know! But let me share my idea of it.

Because of the unity of Planck’s quantum of action (ħ ≈ 1.0545718×10⁻³⁴ N·m·s), a wave traveling in spacetime might be represented as a set of discrete spacetime points and the associated amplitudes, as illustrated below. [I just made an easy Excel graph. Nothing fancy.]

spacetime

The space in-between the discrete spacetime points, which are separated by the Planck time and distance units, is not real. It is plain nothingness, or – if you prefer that term – the space in-between is mathematical space only: a figment of the mind – nothing real, because quantum theory tells us that the real, physical, space is discontinuous.

Why is that so? Well… Smaller time and distance units cannot exist, because we would not be able to pack Planck’s quantum of action in them: a box of the Planck scale, with ħ in it, is just a black hole and, hence, nothing could go from here to there, because all would be trapped. Of course, now you’ll wonder what it means to ‘pack‘ Planck’s quantum of action in a Planck-scale spacetime box. Let me try to explain this. It’s going to be a rather rudimentary explanation and, hence, it may not satisfy you. But then the alternative is to learn more about black holes and the Schwarzschild radius, which I warmly recommend for two equivalent reasons:

The matter is actually quite deep, and I’d recommend you try to fully understand it by reading some decent physics course.
You’d stop reading this nonsense.

If, despite my warning, you would continue to read what I write, you may want to note that we could also use the logic below to define Planck’s quantum of action, rather than using it to define the Planck time and distance unit. Everything is related to everything in physics. But let me now give the rather naive explanation itself:

Planck’s quantum of action (ħ ≈ 1.0545718×10⁻³⁴ N·m·s) is the smallest thing possible. It may express itself as some momentum (whose physical dimension is N·s) over some distance (Δs), or as some amount of energy (whose dimension is N·m) over some time (Δt).
Now, energy is an oscillation of mass (I will repeat that a couple of times, and show you the detail of what that means in the last chapter) and, hence, ħ must necessarily express itself both as momentum as well as energy over some time and some distance. Hence, it is what it is: some force over some distance over some time. This reflects the physical dimension of ħ, which is the product of force, distance and time. So let’s assume some force ΔF, some distance Δs, and some time Δt, so we can write ħ as ħ = ΔF·Δs·Δt.
Now let’s pack that into a traveling particle – like a photon, for example – which, as you know (and as I will show in this publication) is, effectively, just some oscillation of mass, or an energy flow. Now let’s think about one cycle of that oscillation. How small can we make it? In spacetime, I mean.
If we decrease Δs and/or Δt, then ΔF must increase, so as to ensure the integrity (or unity) of ħ as the fundamental quantum of action. Note that the momentum (ΔF·Δt = ħ/Δs) and the energy (ΔF·Δs = ħ/Δt) are inversely proportional to Δs and Δt respectively, so they increase as Δs and Δt decrease. Now, in our search for the Planck-size spacetime box, we will obviously want to decrease Δs and Δt simultaneously.
Because nothing can exceed the speed of light, we may want to use equivalent time and distance units, so the numerical value of the speed of light is equal to 1 and all velocities become relative velocities. If we now assume our particle is traveling at the speed of light – so it must be a photon, or a (theoretical) matter-particle with zero rest mass (which is something different than a photon) – then our Δs and Δt should respect the following condition: Δs/Δt = c = 1.
Now, when Δs = 1.6162×10⁻³⁵ m and Δt = 5.391×10⁻⁴⁴ s, we find that Δs/Δt = c, but ΔF = ħ/(Δs·Δt) = (1.0545718×10⁻³⁴ N·m·s)/[(1.6162×10⁻³⁵ m)·(5.391×10⁻⁴⁴ s)] ≈ 1.21×10⁴⁴ N. That force is monstrously huge. Think of it: because of gravitation, a mass of 1 kg in our hand, here on Earth, will exert a force of 9.8 N. Now note the exponent in that 1.21×10⁴⁴ number.
If we multiply that monstrous force with Δs – which is extremely tiny – we get the Planck energy: (1.6162×10⁻³⁵ m)·(1.21×10⁴⁴ N) ≈ 1.956×10⁹ joule. Despite the tininess of Δs, we still get a fairly big value for the Planck energy. Just to give you an idea, it’s the energy that you’d get out of burning 60 liters of gasoline—or the mileage you’d get out of 16 gallons of fuel! In fact, the equivalent mass of that energy, packed in such tiny space, makes it a black hole.
In short, the conclusion is that our particle can’t move (or, thinking of it as a wave, that our wave can’t wave) because it’s caught in the black hole it creates by its own energy: so the energy can’t escape and, hence, it can’t flow. 🙂
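
If you want to re-run the arithmetic of the list above yourself, here is a minimal Python sketch. The values for ħ, Δs and Δt are the ones quoted in the text. The gravitational constant G and the Schwarzschild radius formula r_s = 2·G·m/c² are not given in this post: I bring them in here as assumptions, only to put a number on the black-hole remark.

```python
# Re-running the Planck-box arithmetic from the list above.
HBAR = 1.0545718e-34   # N·m·s, as quoted in the text
DS = 1.6162e-35        # m, Planck length as quoted in the text
DT = 5.391e-44         # s, Planck time as quoted in the text
C = 299_792_458.0      # m/s, speed of light
G = 6.67430e-11        # N·m²/kg², assumption: not given in the text

ratio = DS / DT              # should be (very nearly) c
force = HBAR / (DS * DT)     # ΔF = ħ/(Δs·Δt)
energy = force * DS          # Planck energy, ΔF·Δs
mass = energy / C**2         # equivalent mass, m = E/c²
r_s = 2 * G * mass / C**2    # Schwarzschild radius, assumed formula

print(f"Δs/Δt  = {ratio:.4e} m/s")
print(f"ΔF     = {force:.3e} N")
print(f"E      = {energy:.3e} J")
print(f"m      = {mass:.3e} kg")
print(f"r_s/Δs = {r_s / DS:.2f}")
```

With these inputs, the Schwarzschild radius comes out at about twice Δs, which is the sense in which the box would be a black hole.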

Of course, you will now say that we could imagine half a cycle, or a quarter of that cycle. And you are right: we can surely imagine that, but we get the same thing: to respect the unity of ħ, we’ll then have to pack it into half a cycle, or a quarter of a cycle, which just means the energy of the whole cycle is 2·ħ, or 4·ħ. However, our conclusion still stands: we won’t be able to pack that half-cycle, or that quarter-cycle, into something smaller than the Planck-size spacetime box, because it would make it a black hole, and so our wave wouldn’t go anywhere, and the idea of our wave itself – or the particle – just doesn’t make sense anymore.

This brings me to the final point I’d like to make here. When Maxwell or Einstein, or the quantum vacuumists – or I 🙂 – say that the speed of light is just a property of the vacuum, then that’s correct and not correct at the same time. First, we should note that, if we say that, we might also say that ħ is a property of the vacuum. All physical constants are. Hence, it’s a pretty meaningless statement. Still, it’s a statement that helps us to understand the essence of reality. Second, and more importantly, we should dissect that statement. The speed of light combines two very different aspects:

It’s a physical constant, i.e. some fixed number that we will find to be the same regardless of our reference frame. As such, it’s as essential as those immovable physical laws that we find to be the same in each and every reference frame.
However, its physical dimension is the ratio of the distance and the time unit: m/s. We may choose other time and distance units, but we will still combine them in that ratio. These two units represent the two dimensions in our mind that – as Kant noted – structure our perception of reality: the temporal and spatial dimension.

Hence, we cannot just say that c is ‘just a property of the vacuum’. In our definition of c as a velocity, we mix reality – the ‘outside world’ – with our perception of it. It’s unavoidable. Frankly, while we should obviously try – and we should try very hard! – to separate what’s ‘out there’ versus ‘how we make sense of it’, it is and remains an impossible job because… Well… When everything is said and done, what we observe ‘out there’ is just that: it’s just what we – humans – observe. 🙂

So, when everything is said and done, the essence of reality consists of four things:

Nothing
Mass, i.e. something, or not nothing
Movement (of something), from nowhere to somewhere.
Us: our mind. Or God’s Mind. Whatever. Mind.

The first two are like yin and yang, or manicheism, or whatever dualistic religious system. As for Movement and Mind… Hmm… In some very weird way, I feel they must be part of one and the same thing as well. 🙂 In fact, we may also think of those four things as:

0 (zero)
1 (one), or as some sine or a cosine, which is anything in-between 0 and 1.
Well… I am not sure! I can’t really separate point 3 and point 4, because they combine point 1 and point 2.

So we don’t have a quadrupality, right? We do have a Trinity here, don’t we? […] Maybe. I won’t comment, because I think I just found Unity here. 🙂

The wavefunction and relativity

When reading about quantum theory, and wave mechanics, you will often encounter the rather enigmatic statement that the Schrödinger equation is not relativistically correct. What does that mean?

In my previous post on the wavefunction and relativity, I boldly claimed that relativity theory had been around for quite a while when the young Comte Louis de Broglie wrote his short groundbreaking PhD thesis, back in 1924. Moreover, it is more than likely that he suggested the θ = ω∙t – k∙x = (E∙t – p∙x)/ħ formula for the argument of the wavefunction exactly because relativity theory had already established the invariance of the four-vector product p_μx_μ = E∙t – p∙x = p_μ’x_μ’ = E’∙t’ – p’∙x’. [Note that Planck’s constant, as a physical constant, should obviously not depend on the reference frame either. Hence, if the E∙t – p∙x product is invariant, so is (E∙t – p∙x)/ħ.] However, I didn’t prove that, and I didn’t relate it to Schrödinger’s equation. Hence, let’s explore the matter somewhat further here.

I don’t want to do the academic thing, of course – and that is to prove the invariance of the four-vector dot product. If you want such proof, let me just give you a link to some course material that does just that. Here, I will just summarize the conclusions of such course material:

Four-vector dot products – like x_μx_μ = x_μ², p_μp_μ = p_μ², the spacetime interval s² = (Δr)² – Δt², or our p_μx_μ product here – are invariant under a Lorentz transformation (aka a Lorentz boost). To be formally correct, I should write x_μx^μ, p_μp^μ, and p_μx^μ, because the product multiplies a row vector with a column vector, which is what the sub- and superscript indicate.
Four-vector dot products are referred to as Lorentz scalars.
When derivatives are involved, we must use the so-called four-gradient, which is denoted by ∂ or ∇_μ and defined as:

∂ = ∇_μ = (∂/∂t, –∇) = (∂/∂t, –∂/∂x, –∂/∂y, –∂/∂z)

Applying the four-gradient vector operator to the wavefunction, we get:

∇_μψ= (∂ψ/∂t, –∇ψ) = (∂ψ/∂t, –∂ψ/∂x, –∂ψ/∂y, –∂ψ/∂z)

We wrote about that in the context of electromagnetic theory (see, for instance, my post on the relativistic transformation of fields), so I won’t dwell on it here. Note, however, that that’s the weak spot in Schrödinger’s equation: it’s good, but not good enough. However, in the context in which it’s being used – i.e. to calculate electron orbitals – the approximation works just fine, so you shouldn’t worry about it. The point to remember is that the wavefunction itself is relativistically correct. 🙂

Of course, it is always good to work through a simple example, so let’s do that here. Let me first remind you of that transformation we presented a couple of times already, and that’s how to calculate the argument of the wavefunction in the reference frame of the particle itself, i.e. the inertial frame. It goes like this: when measuring all variables in Planck units, the physical constants ħ and c are numerically equal to one, and we can then re-write the argument of the wavefunction as follows:

ħ = 1 ⇒ θ = (E∙t – p∙x)/ħ = E∙t – p∙x = E_v∙t − (m_v∙v)∙x
E_v = E₀/√(1−v²) and m_v = m₀/√(1−v²) ⇒ θ = [E₀/√(1−v²)]∙t – [m₀∙v/√(1−v²)]∙x
c = 1 ⇒ m₀ = E₀ ⇒ θ = [E₀/√(1−v²)]∙t – [E₀∙v/√(1−v²)]∙x = E₀∙(t − v∙x)/√(1−v²)

⇔ θ = E₀∙t’ = E’·t’ with t’ = (t − v∙x)/√(1−v²)
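
A quick numerical check of this substitution may be helpful. The sketch below, in natural units (ħ = c = 1), just compares E∙t – p∙x with E₀∙t’ for a few arbitrary values; the function names and the sample values are mine, for illustration only.

```python
import math

def theta_lab(e0, v, t, x):
    """Phase E·t − p·x in the lab frame, in natural units (ħ = c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    energy = gamma * e0          # E_v = E₀/√(1−v²)
    momentum = gamma * e0 * v    # p = m_v·v, with m₀ = E₀ numerically
    return energy * t - momentum * x

def theta_rest(e0, v, t, x):
    """Phase E₀·t' with t' = (t − v·x)/√(1−v²)."""
    t_prime = (t - v * x) / math.sqrt(1.0 - v**2)
    return e0 * t_prime

# Arbitrary sample values (v is a fraction of c, so 0 <= v < 1).
for v, t, x in [(0.0, 1.0, 2.0), (0.3, 2.0, 0.5), (0.9, 1.5, 1.2)]:
    print(v, theta_lab(1.0, v, t, x), theta_rest(1.0, v, t, x))
```

The two columns should agree up to floating-point rounding, whatever values you pick for v, t and x.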

The t’ in the θ = E₀∙t’ expression is, obviously, the proper time as measured in the inertial reference frame. Needless to say, v is the relative velocity, which is usually denoted by β. Note that this derivation uses the numerical m₀ = E₀ identity, which emerges when using natural time and distance units (c = 1). However, while mass and energy are equivalent, they are different physical concepts and, hence, they still have different physical dimensions. It is interesting to spell out what happens with the dimensions here:

The dimension of E_v∙t and/or E₀∙t’ is (N∙m)∙s, i.e. the dimension of (physical) action.
The dimension of the (m_v∙v)∙x term must be the same, but how is that possible? Despite us using natural units – so the value of v is now some number between 0 and 1 – velocity is what it is: velocity. Hence, its dimension is m/s. Hence, the dimension of the m_v∙v∙x term is kg∙(m/s)∙m = (N∙s²/m)∙(m/s)∙m = N∙m∙s.
Hence, the dimension of the [E₀∙v/√(1−v²)]∙x term only makes sense if we remember the m²/s² dimension of the c² factor in the E = m∙c² equivalence relation. We write: [E₀∙v∙x] = [E₀]∙[v]∙[x] = [(N∙m)∙(s²/m²)]∙(m/s)∙m = N∙m∙s. In short, when doing the m_v = E_v and/or m₀ = E₀ substitution, we should not get rid of the physical 1/c² dimension.

That should be clear enough. Let’s now do the example. The rest energy of an electron, expressed in Planck units, is E_eP = E_e/E_P = (0.511×10⁶ eV)/(1.221×10²⁸ eV) ≈ 4.185×10⁻²³. That is a very tiny fraction. However, the numerical value of the Planck time unit is even smaller: about 5.4×10⁻⁴⁴ seconds. Hence, as a frequency is expressed as the number of cycles (or, as an angular frequency, as the number of radians) per time unit, the natural frequency of the wavefunction of the electron is 4.185×10⁻²³ rad per Planck time unit, so that’s a frequency in the order of [4.185×10⁻²³/(2π)]/(5.4×10⁻⁴⁴ s) ≈ 1.2×10²⁰ cycles per second (or hertz). The relevant calculations are given hereunder.

Electron
Rest energy (in joule)	8.1871E-14
Planck energy (in joule)	1.9562E+09
Rest energy in Planck units	4.1853E-23
Frequency in cycles per second	1.2356E+20
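
The table above can be reproduced in a few lines. This is only a sketch: the two energies are taken from the table, and for the Planck time I assume the 5.391×10⁻⁴⁴ s value that was used earlier in this text.

```python
import math

REST_ENERGY_J = 8.1871e-14     # electron rest energy, from the table
PLANCK_ENERGY_J = 1.9562e9     # Planck energy, from the table
PLANCK_TIME_S = 5.391e-44      # Planck time (assumed value, see lead-in)

# Rest energy in Planck units = angular frequency in rad per Planck time unit.
rest_energy_planck = REST_ENERGY_J / PLANCK_ENERGY_J

# Convert to cycles per second: divide by 2π, then by the Planck time.
freq_hz = rest_energy_planck / (2 * math.pi) / PLANCK_TIME_S

print(f"Rest energy in Planck units: {rest_energy_planck:.4e}")
print(f"Frequency in cycles per second: {freq_hz:.4e}")
```

Small differences in the last digit, as compared to the table, are just a matter of how many digits the inputs carry.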

Because of these rather incredible numbers (like 10⁻³¹ or 10²⁰), the calculations are not always very obvious, but the logic is clear enough: a higher rest mass increases the (angular) frequency of the real and imaginary part of the wavefunction, and gives them a much higher density in spacetime. How does a frequency like 1.235×10²⁰ Hz compare to, say, the frequency of gamma rays? The answer may surprise you: they are of the same order, as is their energy! 🙂 However, their nature, as a wave, is obviously very different: gamma rays are an electromagnetic wave, so they involve an E and B vector, rather than the two components of the matter-wave. As an energy propagation mechanism, they are structurally similar, though, as I showed in my previous post.

Now, the typical speed of an electron is given by the fine-structure constant (α), which is (also) equal to the (relative) speed of an electron (for the many interpretations of the fine-structure constant, see my post on it). So we write:

α = β = v/c

More importantly, we can use this formula to calculate it, which is done hereunder. As you can see, while the typical electron speed is quite impressive (about 2,188 km per second), it is only a fraction of the speed of light and, therefore, the Lorentz factor is still equal to one for all practical purposes. Therefore, its speed adds hardly anything to its energy.

Fine-structure constant	0.007297353
Typical speed of the electron (m/s)	2.1877E+06
Typical speed of the electron (km/s)	2,188 km/s
Lorentz factor (γ)	1.0000266267
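
Again, these numbers are easy to check. The sketch below takes α from the table and the (exact) SI value for c; the variable names are mine.

```python
import math

ALPHA = 0.007297353      # fine-structure constant, from the table
C = 299_792_458.0        # speed of light in m/s

speed = ALPHA * C                          # v = α·c
gamma = 1.0 / math.sqrt(1.0 - ALPHA**2)    # Lorentz factor for β = α

print(f"Typical speed of the electron: {speed:.4e} m/s")
print(f"Lorentz factor: {gamma:.10f}")
```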

But I admit it does have momentum now and, hence, the p∙x term in the θ = E∙t – p∙x comes into play. What is its momentum? That’s calculated below. Remember we calculate all in Planck units here!

Electron energy moving at alpha (in Planck units)	4.1854E-23
Electron mass moving at alpha (in Planck units)	4.1854E-23
Momentum in Planck units (p = m·v = m·α)	3.0542E-25

The momentum is tiny, but it’s real. Also note the increase in its energy. Now, when substituting x for x = v·t, we get the following formula for the argument of our wavefunction:

θ = E·t – p·x = E·t − p·v·t = m_v·t − m_v·v·v·t = m_v·(1 − v²)·t

Now, how does that compare to our θ = E₀∙t’ = E’·t’ expression? Well… The value of the two coefficients is calculated below. You can, effectively, see it hardly matters.

m_v·(1 − v²)	4.1852E-23
Rest energy in Planck units	4.1853E-23
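
For those who want to see where these values come from: the sketch below recomputes the moving mass, the momentum and the m_v·(1 − v²) coefficient from the rest energy in Planck units and α, both taken from the tables above. Note that m_v·(1 − v²) is the same thing as E₀·√(1 − v²), which is why the difference with E₀ is so small.

```python
import math

ALPHA = 0.007297353    # v = α in natural units, from the table
E0 = 4.1853e-23        # electron rest energy in Planck units, from the table

gamma = 1.0 / math.sqrt(1.0 - ALPHA**2)
m_v = gamma * E0                   # moving mass (= energy) in Planck units
p = m_v * ALPHA                    # momentum p = m_v·v
coeff = m_v * (1.0 - ALPHA**2)     # coefficient of t in θ = m_v·(1 − v²)·t

print(f"m_v          = {m_v:.4e}")
print(f"p            = {p:.4e}")
print(f"m_v·(1 − v²) = {coeff:.4e}")
print(f"E0           = {E0:.4e}")
```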

With that, we are finally ready to use the non-relativistic Schrödinger equation in a non-relativistic way, i.e. we can start calculating electron orbitals with it now, which is what we did in one of my previous posts, but I will re-visit that post soon – and provide some extra commentary! 🙂

The Poynting vector for the matter-wave

Original post:

In my various posts on the wavefunction – which I summarized in my e-book – I wrote at length on the structural similarities between the matter-wave and the electromagnetic wave. Look at the following images once more:

Animation 5d_euler_f

Both are the same, and then they are not. The illustration on the right-hand side is a regular quantum-mechanical wavefunction, i.e. an amplitude wavefunction: the x-axis represents time, so we are looking at the wavefunction at some particular point in space. [Of course, we could just switch the dimensions and it would all look the same.] The illustration on the left-hand side looks similar, but it is not an amplitude wavefunction. The animation shows how the electric field vector (E) of an electromagnetic wave travels through space. Its shape is the same. So it is the same function. Is it also the same reality?

Yes and no. The two energy propagation mechanisms are structurally similar. The key difference is that, in electromagnetics, we get two waves for the price of one. Indeed, the animation above does not show the accompanying magnetic field vector (B), which is equally essential. But, for the rest, Schrödinger’s equation and Maxwell’s equations model a similar energy propagation mechanism, as shown below.

They have to, as the force laws are similar too:

Coulomb Law

gravitation law

The only difference is that mass comes in one color only, so to speak: it’s always positive. In contrast, electric charge comes in two colors: positive and negative. You can now guess what comes next: quantum chromodynamics, but I won’t write about that here, because I haven’t studied that yet. I won’t repeat what I wrote elsewhere, but I want to make good on one promise, and that is to develop the idea of the Poynting vector for the matter-wave. So let’s do that now. Let me first remind you of the basic ideas, however.

Basics

The animation below shows the two components of the archetypal wavefunction, i.e. the sine and cosine:

circle_cos_sin

Think of the two oscillations as (each) packing half of the total energy of a particle (like an electron or a photon, for example). Look at how the sine and cosine mutually feed into each other: the sine reaches zero as the cosine reaches plus or minus one, and vice versa. Look at how the moving dot accelerates as it goes to the center point of the axis, and how it decelerates when reaching the end points, so as to switch direction. The two functions are exactly the same function, but for a phase difference of 90 degrees, i.e. a right angle. Now, I love engines, and so it makes me think of a V-2 engine with the pistons at a 90-degree angle. Look at the illustration below. If there is no friction, we have a perpetual motion machine: it would store energy in its moving parts, while not requiring any external energy to keep it going.

two-timer-576-px-photo-369911-s-original

If it is easier for you, you can replace each piston by a physical spring, as I did below. However, I should learn how to make animations myself, because the image below does not capture the phase difference. Hence, it does not show how the real and imaginary part of the wavefunction mutually feed into each other, which is (one of the reasons) why I like the V-2 image much better. 🙂

summary 2

The point to note is: all of the illustrations above are true representations – whatever that means – of (idealized) stationary particles, and both for matter (fermions) as well as for force-carrying particles (bosons). Let me give you an example. The (rest) energy of an electron is tiny: about 8.2×10⁻¹⁴ joule. Note the minus 14 exponent: that’s an unimaginably small amount. It sounds better when using the more commonly used electronvolt scale for the energy of elementary particles: 0.511 MeV. Despite its tiny mass (or energy, I should say, but then mass and energy are directly proportional to each other: the proportionality coefficient is given by the E = m·c² formula), the frequency of the matter-wave of the electron is of the order of 1×10²⁰ = 100,000,000,000,000,000,000 cycles per second. That’s an unimaginably large number and – as I will show when we get there – that’s not because the second is a huge unit at the atomic or sub-atomic scale.

We may refer to this as the natural frequency of the electron. Higher rest masses increase the frequency and, hence, give the wavefunction an even higher density in spacetime. Let me summarize things in a very simple way:

The (total) energy that is stored in an oscillating spring is the sum of the kinetic and potential energy (T and U) and is given by the following formula: E = T + U = a₀²·m·ω₀²/2. The a₀ factor is the maximum amplitude – which depends on the initial conditions, i.e. the initial pull or push. The ω₀ in the formula is the natural frequency of our spring, which is a function of the stiffness of the spring (k) and the mass on the spring (m): ω₀² = k/m.
Hence, the total energy that’s stored in two springs is equal to a₀²·m·ω₀².
The similarity between the E = a₀²·m·ω₀² and the E = m·c² formula is much more than just striking. It is fundamental: the two oscillating components of the wavefunction each store half of the total energy of our particle.
To emphasize the point: ω₀ = √(k/m) is, obviously, a characteristic of the system. Likewise, c = √(E/m) is just the same: a property of spacetime.
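
The spring formulas in the list above can be checked numerically. The sketch below is just that: a check, with made-up values for a₀, m and ω₀. It adds up the kinetic and potential energy of two identical springs oscillating 90 degrees out of phase and compares the total with a₀²·m·ω₀².

```python
import math

A0 = 0.02       # maximum amplitude in m (hypothetical value)
M = 0.5         # mass on each spring in kg (hypothetical value)
OMEGA0 = 3.0    # natural angular frequency in rad/s (hypothetical value)
K = M * OMEGA0**2   # stiffness, from ω₀² = k/m

def spring_energy(t, phase=0.0):
    """Return (T, U) for one spring at time t."""
    x = A0 * math.cos(OMEGA0 * t + phase)              # displacement
    v = -A0 * OMEGA0 * math.sin(OMEGA0 * t + phase)    # velocity
    return 0.5 * M * v**2, 0.5 * K * x**2

def two_spring_energy(t):
    """Total energy of two springs that are 90 degrees out of phase."""
    t1, u1 = spring_energy(t)
    t2, u2 = spring_energy(t, phase=-math.pi / 2)
    return t1 + u1 + t2 + u2

for t in (0.0, 0.4, 1.3):
    print(f"t = {t:.1f} s: E = {two_spring_energy(t):.6e} J")
print(f"a0²·m·ω0² = {A0**2 * M * OMEGA0**2:.6e} J")
```

The total does not depend on t: each spring stores a₀²·m·ω₀²/2, and the two together store a₀²·m·ω₀².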

Of course, the key question is: what is it that is oscillating here? In our V-2 engine, we have the moving parts. Now what exactly is moving when it comes to the wavefunction? The easy answer is: it’s the same thing. The V-2 engine, or our springs, store energy because of the moving parts. Hence, energy is equivalent only to mass that moves, and the frequency of the oscillation obviously matters, as evidenced by the E = a₀²·m·ω₀²/2 formula for the energy in an oscillating spring. Mass. Energy is moving mass. To be precise, it’s oscillating mass. Think of it: mass and energy are equivalent, but they are not the same. That’s why the dimension of the c² factor in Einstein’s famous E = m·c² formula matters. The equivalent energy of a 1 kg object is approximately 9×10¹⁶ joule. To be precise, it is the following monstrous number:

89,875,517,873,681,764 kg·m²/s²
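
That number is just c² for a mass of 1 kg. Because c is an exact integer number of meters per second, Python's integer arithmetic reproduces it digit by digit:

```python
C = 299_792_458          # speed of light in m/s (exact by definition)
MASS_KG = 1              # the 1 kg object from the text

energy_joule = MASS_KG * C**2    # E = m·c², in kg·m²/s²
print(f"{energy_joule:,} kg·m²/s²")
```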

Note its dimension: the joule is the product of the mass unit and the square of the velocity unit. So that, then, is, perhaps, the true meaning of Einstein’s famous formula: energy is not just equivalent to mass. It’s equivalent to mass that’s moving. In this case, an oscillating mass. But we should explore the question much more rigorously, which is what I do in the next section. Let me warn you: it is not an easy matter and, even if you are able to work your way through all of the other material below in order to understand the answer, I cannot promise you that the answer will satisfy you entirely. However, it will surely help you to phrase the question.

The Poynting vector for the matter-wave

For the photon, we have the electric and magnetic field vectors E and B. The boldface highlights the fact that these are vectors indeed: they have a direction as well as a magnitude. Their magnitude has a physical dimension. The dimension of E is straightforward: the electric field strength (E) is a quantity expressed in newton per coulomb (N/C), i.e. force per unit charge. This follows straight from the F = q·E force relation.

The dimension of B is much less obvious: the magnetic field strength (B) is measured in (N/C)/(m/s) = (N/C)·(s/m). That’s what comes out of the F = q·v×B force relation. Just to make sure you understand: v×B is a vector cross product, and yields another vector, which is given by the following formula:

a×b = |a×b|·n = |a|·|b|·sinφ·n

The φ in this formula is the angle between a and b (in the plane containing them) and, hence, is always some angle between 0 and π. The n is the unit vector that is perpendicular to the plane containing a and b in the direction given by the right-hand rule. The animation below shows it works for some rather special angles:

We may also need the vector dot product, so let me quickly give you that formula too. The vector dot product yields a scalar given by the following formula:

a•b = |a|·|b|·cosφ
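
Both product formulas are easy to try out. The sketch below uses plain tuples and two arbitrary example vectors (my choice, nothing special about them). It takes the angle φ from the dot product and then checks that the magnitude of the cross product is |a|·|b|·sinφ, and that the result is perpendicular to both a and b.

```python
import math

def dot(a, b):
    """Vector dot product: a scalar."""
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    """Vector cross product: a vector following the right-hand rule."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def norm(a):
    return math.sqrt(dot(a, a))

a = (1.0, 2.0, 0.5)     # arbitrary example vectors
b = (-0.3, 0.4, 2.0)

# Angle between a and b, from the dot-product formula (0 <= φ <= π).
phi = math.acos(dot(a, b) / (norm(a) * norm(b)))
n_vec = cross(a, b)

print(norm(n_vec), norm(a) * norm(b) * math.sin(phi))   # should be equal
print(dot(n_vec, a), dot(n_vec, b))                     # should be (nearly) zero
```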

Let’s get back to the F = q·v×B relation. A dimensional analysis shows that the dimension of B must involve the reciprocal of the velocity dimension in order to ensure the dimensions come out alright:

[F]= [q·v×B] = [q]·[v]·[B] = C·(m/s)·(N/C)·(s/m) = N

We can derive the same result in a different way. First, note that the magnitude of B will always be equal to E/c (except when none of the charges is moving, so B is zero), which implies the same:

[B] = [E/c] = [E]/[c] = (N/C)/(m/s) = (N/C)·(s/m)

Finally, the Maxwell equation we used to derive the wavefunction of the photon was ∂E/∂t = c²∇×B, which also tells us the physical dimension of B must involve that s/m factor. Otherwise, the dimensional analysis would not work out:

[∂E/∂t] = (N/C)/s = N/(C·s)
[c²∇×B] = [c²]·[∇×B] = (m²/s²)·[(N/C)·(s/m)]/m = N/(C·s)
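
If you don't trust my bookkeeping, you can let a few lines of Python do it. The sketch below is not a units library: it is a hypothetical little helper that writes a dimension as a dictionary of exponents, using N, C, m and s as base units (as in the text), and multiplies dimensions by adding exponents. The ∇× operator is treated as a division by a distance.

```python
def mul(*dims):
    """Multiply dimensions by adding the exponents of their base units."""
    out = {}
    for d in dims:
        for unit, power in d.items():
            out[unit] = out.get(unit, 0) + power
    return {u: p for u, p in out.items() if p != 0}

def inv(d):
    """Reciprocal of a dimension: flip the sign of every exponent."""
    return {u: -p for u, p in d.items()}

NEWTON, COULOMB, METRE, SECOND = {"N": 1}, {"C": 1}, {"m": 1}, {"s": 1}

velocity = mul(METRE, inv(SECOND))       # m/s
e_field = mul(NEWTON, inv(COULOMB))      # N/C
b_field = mul(e_field, inv(velocity))    # (N/C)·(s/m)

force = mul(COULOMB, velocity, b_field)              # [q]·[v]·[B]
lhs = mul(e_field, inv(SECOND))                      # [∂E/∂t]
rhs = mul(velocity, velocity, b_field, inv(METRE))   # [c²]·[∇×B]

print(force)   # expect newton only
print(lhs)
print(rhs)     # expect the same exponents as lhs
```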

This analysis involves the curl operator ∇×, which is a rather special vector operator. It gives us the (infinitesimal) rotation of a three-dimensional vector field. You should look it up so you understand what we’re doing here.

Now, when deriving the wavefunction for the photon, we gave you a purely geometric formula for B:

B = e_x×E = i·E

Now I am going to ask you to be extremely flexible: wouldn’t you agree that the B = E/c and the B = e_x×E = i·E formulas, jointly, only make sense if we’d assign the s/m dimension to e_x and/or to i? I know you’ll think that’s nonsense because you’ve learned to think of the e_x× and/or i· operation as a rotation only. What I am saying here is that it also transforms the physical dimension of the vector on which we do the operation: it multiplies it with the reciprocal of the velocity dimension. Don’t think too much about it, because I’ll do yet another hat trick. We can think of the real and imaginary part of the wavefunction as being geometrically equivalent to the E and B vector. Just compare the illustrations below:

e-and-b Rising_circular

Of course, you are smart, and you’ll note the phase difference between the sine and the cosine (illustrated below). So what should we do with that? Not sure. Let’s hold our breath for the moment.

Let’s first think about what dimension we could possibly assign to the real part of the wavefunction. We said this oscillation stores half of the energy of the elementary particle that is being described by the wavefunction. How does that storage work for the E vector? As I explained in my post on the topic, the Poynting vector describes the energy flow in a varying electromagnetic field. It’s a bit of a convoluted story (which I won’t repeat here), but the upshot is that the energy density is given by the following formula:

u = (ε₀/2)·E•E + (ε₀·c²/2)·B•B

Its shape should not surprise you. The formula is quite intuitive really, even if its derivation is not. The formula represents the one thing that everyone knows about a wave, electromagnetic or not: the energy in it is proportional to the square of its amplitude, and so that’s E•E = E² and B•B = B². You should also note the c² factor that comes with the B•B product. It does two things here:

As a physical constant, with some dimension of its own, it ensures that the dimensions on both sides of the equation come out alright.
The magnitude of B is 1/c of that of E, so cB = E, and so that explains the extra c² factor in the second term: we do get two waves for the price of one here and, therefore, twice the energy.

Speaking of dimensions, let’s quickly do the dimensional analysis:

E is measured in newton per coulomb, so [E•E] = [E²] = N²/C².
B is measured in (N/C)/(m/s), so we get [B•B] = [B²] = (N²/C²)·(s²/m²). However, the dimension of our c² factor is (m²/s²) and so we’re left with N²/C². That’s nice, because we need to add stuff that’s expressed in the same units.
The ε₀ is that ubiquitous physical constant in electromagnetic theory: the electric constant, aka the vacuum permittivity. Besides ensuring proportionality, it also ‘fixes’ our units, and so we should trust it to do the same thing here, and it does: [ε₀] = C²/(N·m²), so if we multiply that with N²/C², we find that u is expressed in N/m².

Why is N/m² an energy density? The correct answer to that question involves a rather complicated analysis, but there is an easier way to think about it: just multiply N/m² with m/m, and then its dimension becomes N·m/m³ = J/m³, so that’s joule per cubic meter. That looks more like an energy density dimension, doesn’t it? But it’s actually the same thing. In any case, I need to move on.

We talked about the Poynting vector, and said it represents an energy flow. So how does that work? It is also quite intuitive, as its formula really speaks for itself. Let me write it down:

∂u/∂t = −∇•S

Just look at it: u is the energy density, so that’s the amount of energy per unit volume at a given point, and so whatever flows out of that point must represent its time rate of change. As for the –∇•S expression… Well… The ∇• operator is the divergence, and so it gives us the magnitude of a (vector) field’s source or sink at a given point. If C is a vector field (any vector field, really), then ∇•C is a scalar, and if it’s positive in a region, then that region is a source. Conversely, if it’s negative, then it’s a sink. To be precise, the divergence represents the volume density of the outward flux of a vector field from an infinitesimal volume around a given point. So, in this case, it gives us the volume density of the flux of S. If you’re somewhat familiar with electromagnetic theory, then you will immediately note that the formula has exactly the same shape as the ∇•j = −∂ρ/∂t formula, which represents a flow of electric charge.

But I need to get on with my own story here. In order to not create confusion, I will denote the total energy by U, rather than E, because we will continue to use E for the magnitude of the electric field. We said the real and the imaginary component of the wavefunction were like the E and B vector, but what’s their dimension? It must involve force, but it should obviously not involve any electric charge. So what are our options here? You know the electric force law (i.e. Coulomb’s Law) and the gravitational force law are structurally similar:

Coulomb Law

gravitation law

So what if we would just guess that the dimension of the real and imaginary component of our wavefunction should involve a newton per kg factor (N/kg), so that’s force per mass unit rather than force per unit charge? But… Hey! Wait a minute! Newton’s force law defines the newton in terms of mass and acceleration, so we can do a substitution here: 1 N = 1 kg·m/s² ⇔ 1 kg = 1 N·s²/m. Hence, our N/kg dimension becomes:

N/kg = N/(N·s²/m) = m/s²

What is this: m/s²? Is that the dimension of the a·cosθ term in the a·e^−i·θ = a·cosθ − i·a·sinθ wavefunction? I hear you. This is getting quite crazy, but let’s see where it leads us. To calculate the equivalent energy density, we’d then need an equivalent for the ε₀ factor, which – replacing the C by kg in the [ε₀] = C²/(N·m²) expression – would be equal to kg²/(N·m²). Because we know what we want (energy is defined using the force unit, not the mass unit), we’ll want to substitute the kg unit once again, so – temporarily using the μ₀ symbol for the equivalent of that ε₀ constant – we get:

[μ₀] = [N·s²/m]²/(N·m²) = N·s⁴/m⁴

Hence, the dimension of the equivalent of that ε₀·E² term becomes:

[(μ₀/2)]·[cosθ]² = (N·s⁴/m⁴)·(m²/s⁴) = N/m²

Bingo! How does it work for the other component? The other component has the imaginary unit (i) in front. If we continue to pursue our comparison with the E and B vectors, we should assign an extra s/m dimension because of the e_x and/or i factor, so the physical dimension of the i·sinθ term would be (m/s²)·(s/m) = 1/s. What? Just the reciprocal of a second? Relax. That second term in the energy density formula has the c² factor, so it all works out:

[(μ₀/2)]·[c²]·[i·sinθ]² = [(μ₀/2)]·[c²]·[i]²·[sinθ]² = (N·s⁴/m⁴)·(m²/s²)·(s²/m²)·(m²/s⁴) = N/m²
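For those who do not trust my juggling with units, here is a small sketch that redoes the calculation with a toy dimension class. The class Dim and all variable names are hypothetical, introduced only for this check. It assumes, as in the text, that 1 kg = 1 N·s²/m, that the equivalent of ε₀ has dimension kg²/(N·m²), and that the i factor carries an s/m dimension.

```python
class Dim:
    """A physical dimension stored as integer exponents of N, m and s."""

    def __init__(self, N=0, m=0, s=0):
        self.exps = (N, m, s)

    def __mul__(self, other):
        return Dim(*(a + b for a, b in zip(self.exps, other.exps)))

    def __truediv__(self, other):
        return Dim(*(a - b for a, b in zip(self.exps, other.exps)))

    def __pow__(self, n):
        return Dim(*(a * n for a in self.exps))

    def __eq__(self, other):
        return self.exps == other.exps

    def __repr__(self):
        return "Dim(N=%d, m=%d, s=%d)" % self.exps

NEWTON, METER, SECOND = Dim(N=1), Dim(m=1), Dim(s=1)
KG = NEWTON * SECOND**2 / METER        # 1 kg = 1 N·s²/m (Newton's force law)

field = NEWTON / KG                    # assumed dimension of a·cosθ: N/kg
mu0 = KG**2 / (NEWTON * METER**2)      # equivalent of ε₀ with C replaced by kg
c = METER / SECOND
i_factor = SECOND / METER              # the s/m dimension assigned to i

real_term = mu0 * field**2                     # equivalent of ε₀·E²
imag_term = mu0 * c**2 * (i_factor * field)**2 # equivalent of ε₀·c²·B²
pressure = NEWTON / METER**2                   # N/m², i.e. J/m³

print(field, real_term, imag_term)
```

Both terms should print the same exponents as N/m², which is what allows us to add them.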

As weird as it is, it all works out. We can calculate u and, hence, we can now also calculate the equivalent Poynting vector (S). However, I will let you think about that as an exercise. 🙂 Just note the grand conclusions:

The physical dimension of the argument of the wavefunction is physical action (newton·meter·second) and Planck’s quantum of action is the scaling factor.
The physical dimension of both the real and imaginary component of the elementary wavefunction is newton per kg (N/kg). This allows us to analyze the wavefunction as an energy propagation mechanism that is structurally similar to Maxwell’s equations, which represent the energy propagation mechanism when electromagnetic energy is involved.

As such, all we presented so far was a deep exploration of the mathematical equivalence between the gravitational and electromagnetic force laws:

Coulomb Law

gravitation law

Despite our grand conclusions, you should note we have not answered the most fundamental question of all. What is mass? What is electric charge? We have all these relations and equations, but are we any wiser, really? The answer to that question probably lies in general relativity: mass is that which curves spacetime. Likewise, we may look at electric charge as causing a very special type of spacetime curvature. However, even such an answer – which would involve a much more complicated mathematical analysis – may not satisfy you. In any case, I will let you digest this post. I hope you enjoyed it as much as I enjoyed writing it. 🙂

Post scriptum: Of all of the weird stuff I presented here, I think the dimensional analyses were the most interesting. Think of the N/kg = N/(N·s²/m) = m/s² identity, for example. The m/s² dimension is the dimension of physical acceleration (or deceleration): the rate of change of the velocity of an object. The identity comes straight out of Newton’s force law:

F = m·a ⇔ F/m = a

Now look, once again, at the animation, and remember the formula for the argument of the wavefunction: θ = E₀∙t’. The energy of the particle that is being described is the (angular) frequency of the real and imaginary components of the wavefunction.

circle_cos_sin

The relation between (1) the (angular) frequency of a harmonic oscillator (which is what the sine and cosine represent here) and (2) the acceleration along the axis is given by the following equation:

a(x) = −ω₀²·x
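In case you wonder where that equation comes from, it follows from differentiating a sinusoid twice. The short derivation below is a standard one. The amplitude A and the phase φ are symbols I introduce here for illustration only.

```latex
\begin{aligned}
x(t) &= A\cos(\omega_0 t + \varphi) \\
\frac{dx}{dt} &= -A\,\omega_0\sin(\omega_0 t + \varphi) \\
\frac{d^2x}{dt^2} &= -A\,\omega_0^2\cos(\omega_0 t + \varphi) = -\omega_0^2\,x(t)
\end{aligned}
```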

I’ll let you think about what that means. I know you will struggle with it – because I did – and, hence, let me give you the following hint:

The energy of an ordinary string wave, like a guitar string oscillating in one dimension only, will be proportional to the square of the frequency.
However, for two-dimensional waves – such as an electromagnetic wave – we find that the energy is directly proportional to the frequency. Think of Einstein’s E = h·f = ħ·ω relation, for example. There is no squaring here!

It is a strange observation. Those two-dimensional waves – the matter-wave, or the electromagnetic wave – give us two waves for the price of one, each carrying half of the total energy but, as a result, we no longer have that square function. Think about it. Solving the mystery will make you feel like you’ve squared the circle, which – as you know – is impossible. 🙂

Quantum Mechanics: The Other Introduction

About three weeks ago, I brought my most substantial posts together in one document: it’s the Deep Blue page of this site. I also published it on Amazon/Kindle. It’s nice. It crowns many years of self-study, and many nights of short and bad sleep – as I was mulling over yet another paradox haunting me in my dreams. It’s been an extraordinary climb but, frankly, the view from the top is magnificent. 🙂

The offer is there: anyone who is willing to go through it and offer constructive and/or substantial comments will be included in the book’s acknowledgements section when I go for a second edition (which it needs, I think). First person to be acknowledged here is my wife though, Maria Elena Barron, as she has given me the spacetime and, more importantly, the freedom to take this bull by its horns. Below I just copy the foreword, just to give you a taste of it. 🙂

Foreword

Another introduction to quantum mechanics? Yep. I am not hoping to sell many copies, but I do hope my unusual background—I graduated as an economist, not as a physicist—will encourage you to take on the challenge and grind through this.

I’ve always wanted to thoroughly understand, rather than just vaguely know, those quintessential equations: the Lorentz transformations, the wavefunction and, above all, Schrödinger’s wave equation. In my bookcase, I’ve always had what is probably the most famous physics course in the history of physics: Richard Feynman’s Lectures on Physics, which have been used for decades, not only at Caltech but at many of the best universities in the world. Plus a few dozen other books. Popular books—which I now regret I ever read, because they were an utter waste of time: the language of physics is math and, hence, one should read physics in math—not in any other language.

But Feynman’s Lectures on Physics – three volumes of about fifty chapters each – are not easy to read. However, the experimental verification of the existence of the Higgs particle in CERN’s LHC accelerator a couple of years ago, and the award of the Nobel prize to the scientists who had predicted its existence (including Peter Higgs and François Englert), convinced me it was about time I take the bull by its horns. While I consider myself to be of average intelligence only, I do feel there’s value in the ideal of the ‘Renaissance man’ and, hence, I think stuff like this is something we all should try to understand – somehow. So I started to read, and I also started a blog (www.readingfeynman.org) to externalize my frustration as I tried to cope with the difficulties involved. The site attracted hundreds of visitors every week and, hence, it encouraged me to publish this booklet.

So what is it about? What makes it special? In essence, it is a common-sense introduction to the key concepts in quantum physics. However, while common-sense, it does not shy away from the math, which is complicated, but not impossible. So this little book is surely not a Guide to the Universe for Dummies. I do hope it will guide some Not-So-Dummies. It basically recycles what I consider to be my more interesting posts, but combines them in a comprehensive structure.

It is a bit of a philosophical analysis of quantum mechanics as well, as I will – hopefully – do a better job than others in distinguishing the mathematical concepts from what they are supposed to describe, i.e. physical reality.

Last but not least, it does offer some new didactic perspectives. For those who know the subject already, let me briefly point these out:

I. Few, if any, of the popular writers seem to have noted that the argument of the wavefunction (θ = E·t – p·x) – using natural units (hence, the numerical value of ħ and c is one), and for an object moving at constant velocity (hence, x = v·t) – can be written as the product of the proper time of the object and its rest mass:

θ = E·t − p·x = m_v·t − m_v·v·x = m_v·(t − v·x)

⇔ θ = m₀·(t − v·x)/√(1 – v²) = m₀·t’

Hence, the argument of the wavefunction is just the proper time of the object with the rest mass acting as a scaling factor for the time: the internal clock of the object ticks much faster if it’s heavier. This symmetry between the argument of the wavefunction of the object as measured in its own (inertial) reference frame, and its argument as measured by us, in our own reference frame, is remarkable, and allows us to understand the nature of the wavefunction in a more intuitive way.

While this approach reflects Feynman’s idea of the photon stopwatch, the presentation in this booklet generalizes the concept for all wavefunctions, first and foremost the wavefunction of the matter-particles that we’re used to (e.g. electrons).

II. Few, if any, have thought of looking at Schrödinger’s wave equation as an energy propagation mechanism. In fact, when helping my daughter out as she was trying to understand non-linear regression (logit and Poisson regressions), I suddenly realized we can analyze the wavefunction as a link function that connects two physical spaces: the physical space of our moving object, and a physical energy space.

Re-inserting Planck’s quantum of action in the argument of the wavefunction – so we write θ as θ = (E/ħ)·t – (p/ħ)·x = [E·t – p·x]/ħ – we may assign a physical dimension to it: when interpreting ħ as a scaling factor only (and, hence, when we only consider its numerical value, not its physical dimension), θ becomes a quantity expressed in newton·meter·second, i.e. the (physical) dimension of action. It is only natural, then, that we would associate the real and imaginary part of the wavefunction with some physical dimension too, and a dimensional analysis of Schrödinger’s equation tells us this dimension must be energy.

This perspective allows us to look at the wavefunction as an energy propagation mechanism, with the real and imaginary part of the probability amplitude interacting in very much the same way as the electric and magnetic field vectors E and B. This leads me to the next point, which I make rather emphatically in this booklet: the propagation mechanism for electromagnetic energy – as described by Maxwell’s equations – is mathematically equivalent to the propagation mechanism that’s implicit in the Schrödinger equation.

I am, therefore, able to present the Schrödinger equation in a much more coherent way, describing not only how this famous equation works for electrons, or matter-particles in general (i.e. fermions or spin-1/2 particles), which is probably the only use of the Schrödinger equation you are familiar with, but also how it works for bosons, including the photon, of course, but also the theoretical zero-spin boson!

In fact, I am personally rather proud of this. Not because I am doing something that hasn’t been done before (I am sure many have come to the same conclusions before me), but because one always has to trust one’s intuition. So let me say something about that third innovation: the photon wavefunction.

III. Let me tell you the little story behind my photon wavefunction. One of my acquaintances is a retired nuclear scientist. While he knew I was delving into it all, I knew he had little time to answer any of my queries. However, when I asked him about the wavefunction for photons, he bluntly told me photons didn’t have a wavefunction. I should just study Maxwell’s equations and that’s it: there’s no wavefunction for photons, just these traveling electric and magnetic field vectors. Look at Feynman’s Lectures, or any textbook, he said. None of them talk about photon wavefunctions. That’s true, but I knew he had to be wrong. I mulled over it for several months, and then just sat down and started to fiddle with Maxwell’s equations, assuming the oscillations of the E and B vector could be described by regular sinusoids. And – Lo and behold! – I derived a wavefunction for the photon. It’s fully equivalent to the classical description, but the new expression solves the Schrödinger equation, if we modify it in a rather logical way: we have to double the diffusion constant, which makes sense, because E and B give you two waves for the price of one!

[…]

In any case, I am getting ahead of myself here, and so I should wrap up this rather long introduction. Let me just say that, through my rather long journey in search of understanding – rather than knowledge alone – I have learned there are so many wrong answers out there: wrong answers that hamper rather than promote a better understanding. Moreover, I was most shocked to find out that such wrong answers are not the preserve of amateurs alone! This emboldened me to write what I write here, and to publish it. Quantum mechanics is a logical and coherent framework, and it is not all that difficult to understand. One just needs good pointers, and that’s what I want to provide here.

As of now, it focuses on the mechanics in particular, i.e. the concept of the wavefunction and wave equation (better known as Schrödinger’s equation). The other aspect of quantum mechanics – i.e. the idea of uncertainty as implied by the quantum idea – will receive more attention in a later version of this document. I should also say I will limit myself to quantum electrodynamics (QED) only, so I won’t discuss quarks (i.e. quantum chromodynamics, which is an entirely different realm), nor will I delve into any of the other more recent advances of physics.

In the end, you’ll still be left with lots of unanswered questions. However, that’s quite OK, as Richard Feynman himself was of the opinion that he himself did not understand the topic the way he would like to understand it. But then that’s exactly what draws all of us to quantum physics: a common search for a deep and full understanding of reality, rather than just some superficial description of it, i.e. knowledge alone.

So let’s get on with it. I am not saying this is going to be easy reading. In fact, I blogged about much easier stuff than this in my blog – treating only aspects of the whole theory. This is the whole thing, and it’s not easy to swallow. In fact, it may well be too big to swallow as a whole. But please do give it a try. I wanted this to be an intuitive but formally correct introduction to quantum math. However, when everything is said and done, you are the only one who can judge if I reached that goal.

Of course, I should not forget the acknowledgements but… Well… It was a rather lonely venture, so I am only going to acknowledge my wife here, Maria, who gave me all of the spacetime and all of the freedom I needed, as I would get up early, or work late after coming home from my regular job. I sacrificed weekends, which we could have spent together, and – when mulling over yet another paradox – the nights were often short and bad. Frankly, it’s been an extraordinary climb, but the view from the top is magnificent.

I just need to insert one caution: my site (www.readingfeynman.org) includes animations, which make it much easier to grasp some of the mathematical concepts that I will be explaining. Hence, I warmly recommend you also have a look at that site, and its Deep Blue page in particular – as that page has the same contents, more or less, but the animations make it a much easier read.

Have fun with it!

Jean Louis Van Belle, BA, MA, BPhil, Drs.

Wave functions and equations: a summary

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation mechanism. With the benefit of hindsight, I would recommend you immediately read the more recent exposé on the matter that is being presented here, which you can find by clicking on the provided link. In fact, I actually made some (small) mistakes when writing the post below.

Original post:

Schrödinger’s wave equations for spin-zero, spin-1/2, and spin-one particles in free space differ from each other by a factor of two:

For particles with zero spin, we write: ∂ψ/∂t = i·(ħ/m)·∇²ψ. We get this by multiplying the ħ/(2m) factor in Schrödinger’s original wave equation – which applies to spin-1/2 particles (e.g. electrons) only – by two. Hence, the correction that needs to be made is very straightforward.
For fermions (spin-1/2 particles), Schrödinger’s equation is what it is: ∂ψ/∂t = i·[ħ/(2m)]·∇²ψ.
For spin-1 particles (photons), we have ∂ψ/∂t = i·(2ħ/m)·∇²ψ, so here we multiply the ħ/m factor in Schrödinger’s wave equation for spin-zero particles by two, which amounts to multiplying Schrödinger’s original coefficient by four.

Look at the coefficients carefully. It’s a strange succession:

The ħ/m factor (which is just the reciprocal of the mass measured in units of ħ) works for spin-0 particles.
For spin-1/2 particles, we take only half that factor: ħ/(2m) = (1/2)·(ħ/m).
For spin-1 particles, we double that factor: 2ħ/m = 2·(ħ/m).

I describe the detail on my Deep Blue page, so please go there for more detail. What I did there, can be summarized as follows:

The spin-one particle is the photon, and we derived the photon wavefunction from Maxwell’s equations in free space, and found that it solves the ∂ψ/∂t = i·(2ħ/m)·∇²ψ equation, not the ∂ψ/∂t = i·(ħ/m)·∇²ψ or ∂ψ/∂t = i·[ħ/(2m)]·∇²ψ equations.
As for the spin-zero particles, we simplified the analysis by assuming our particle had zero rest mass, and we found that we were basically modeling an energy flow.
The analysis for spin-1/2 particles is just the standard analysis you’ll find in textbooks.

We can speculate how things would look for spin-3/2 particles, or for spin-2 particles, but let’s not do that here. In any case, we will come back to this. Let’s first focus on the more familiar terrain, i.e. the wave equation for spin-1/2 particles, such as protons or electrons. [A proton is not elementary – as it consists of quarks – but it is a spin-1/2 particle, i.e. a fermion.]

The phase and group velocity of the wavefunction for spin-1/2 particles (fermions)

We’ll start with the very beginning of it all, i.e. the two equations that the young Comte Louis de Broglie presented in his 1924 PhD thesis, which give us the temporal and spatial frequency of the wavefunction, i.e. the ω and k in the θ = ω·t − k·x argument of the a·e^–i·θ wavefunction:

ω = E/ħ
k = p/ħ

This allows us to calculate the phase velocity of the wavefunction:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p

This is an elementary wavefunction. To describe a localized particle, we would add several of these with appropriate coefficients, with the uncertainty in the energy and momentum ensuring our component waves have different frequencies. For a single elementary wavefunction, however, the concept of a group velocity does not apply. In effect, the a·e^–i·θ wavefunction does not describe a localized particle: the probability to find it somewhere is the same everywhere. We may want to think of our wavefunction being confined to some narrow band in space, with us having no prior information about the probability density function, and, therefore, we assume a uniform distribution. Assuming our box in space is defined by Δx = x₂ − x₁, and imposing the normalization condition (all probabilities have to add up to one), we find that the following logic should hold:

(Δx)·a² = (x₂ − x₁)·a² = 1 ⇔ Δx = 1/a²

However, we are, of course, interested in the group velocity, as the group velocity should correspond to the classical velocity of the particle. The group velocity of a composite wave is given by the v_g = ∂ω/∂k formula. Of course, that formula assumes an unambiguous relation between the temporal and spatial frequency of the component waves, which we may want to denote as ω_n and k_n, with n = 1, 2, 3,… However, we will not use the index as the context makes it quite clear what we are talking about.

The relation between ω_n and k_n is known as the dispersion relation, and one particularly nice way to calculate ω as a function of k is to distinguish the real and imaginary parts of the ∂ψ/∂t = i·[ħ/(2m)]·∇²ψ wave equation and, hence, re-write it as a pair of two equations:

Re(∂ψ/∂t) = −[ħ/(2m)]·Im(∇²ψ) ⇔ ω·sin(kx − ωt) = k²·[ħ/(2m)]·sin(kx − ωt)
Im(∂ψ/∂t) = [ħ/(2m)]·Re(∇²ψ) ⇔ ω·cos(kx − ωt) = k²·[ħ/(2m)]·cos(kx − ωt)

Both equations imply the following dispersion relation:

ω = ħ·k²/(2m)

We can now calculate v_g = ∂ω/∂k as:

v_g = ∂ω/∂k = ∂[ħ·k²/(2m)]/∂k = 2ħk/(2m) = ħ·(p/ħ)/m = p/m = m·v/m = v

That’s nice, because it’s what we wanted to find. If the group velocity would not equal the classical velocity of our particle, then our model would not make sense.
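If you prefer numbers over symbols, the same result can be checked with a finite difference. This is a minimal sketch only: the values for m and v are arbitrary, ħ is set to one, and the function names are my own.

```python
HBAR = 1.0  # natural units: an assumption made for this sketch

def omega(k, m):
    """Dispersion relation ω = ħ·k²/(2m) from the spin-1/2 wave equation."""
    return HBAR * k**2 / (2.0 * m)

def group_velocity(k, m, dk=1e-4):
    """Approximate v_g = ∂ω/∂k with a central difference."""
    return (omega(k + dk, m) - omega(k - dk, m)) / (2.0 * dk)

m = 2.0           # arbitrary mass
v = 0.3           # arbitrary classical velocity
k = m * v / HBAR  # k = p/ħ with the classical momentum p = m·v

v_g = group_velocity(k, m)
print(v, v_g)     # the two values should agree up to rounding
```

The central difference happens to be exact for a quadratic function (apart from floating-point rounding), so it is a fair test of the v_g = v claim.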

We used the classical momentum formula in our calculation above: p = m·v. To calculate the phase velocity of our wavefunction, we need to calculate that E/p ratio and, hence, we need an energy formula. Here we have a lot of choice, as energy can be defined in many ways: is it rest energy, potential energy, or kinetic energy? At this point, I need to remind you of the basic concepts.

The argument of the wavefunction as the proper time

It is obvious that the energy concept that is to be used in the ω = E/ħ relation is the total energy. Louis de Broglie himself noted that the energy of a particle consisted of three parts:

The particle’s rest energy m₀c², which de Broglie referred to as internal energy (E_int): it includes the rest mass of the ‘internal pieces’, as de Broglie put it (now we call those ‘internal pieces’ quarks), as well as their binding energy (i.e. the quarks’ interaction energy);
Any potential energy (V) it may have because of some field (so de Broglie was not assuming the particle was traveling in free space): the field(s) can be anything—gravitational, electromagnetic—you name it: whatever changes the energy because of the position of the particle;
The particle’s kinetic energy, which he wrote in terms of its momentum p: K.E. = m·v²/2 = m²·v²/(2m) = (m·v)²/(2m) = p²/(2m).

So the wavefunction, as de Broglie wrote it, can be written as follows:

ψ(θ) = ψ(x, t) = a·e^{−i·θ} = a·e^{−i·[(E_int + p²/(2m) + V)·t − p∙x]/ħ}

This formula allows us to analyze interesting phenomena such as the tunneling effect and, hence, you may want to stop here and start playing with it. However, you should note that the kinetic energy formula that is used here is non-relativistic. The relativistically correct energy formula is E = m_vc², and the relativistically correct formula for the kinetic energy is the difference between the total energy and the rest energy:

K.E. = E − E₀ = m_v·c² − m₀·c² = m₀·γ·c² − m₀·c² = m₀·c²·(γ − 1), with γ the Lorentz factor.
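To see how the non-relativistic formula fits in, one can expand the Lorentz factor for small v/c. The expansion below is the standard binomial series, written here with c explicit (we only switch to natural units in the next step).

```latex
\begin{aligned}
\gamma &= \left(1 - \frac{v^2}{c^2}\right)^{-1/2}
        = 1 + \frac{1}{2}\,\frac{v^2}{c^2} + \frac{3}{8}\,\frac{v^4}{c^4} + \dots \\
\mathrm{K.E.} &= m_0 c^2\,(\gamma - 1)
        = \frac{m_0 v^2}{2} + \frac{3}{8}\,\frac{m_0 v^4}{c^2} + \dots
\end{aligned}
```

So the p²/(2m) term in de Broglie's formula is just the first term of the relativistic expression, and the correction is of the order of v²/c² relative to it.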

At this point, we should simplify our calculations by adopting natural units, so as to ensure the numerical value of c = 1, and likewise for ħ. Hence, we assume all is described in Planck units, but please note that the physical dimensions of our variables do not change when adopting natural units: time is time, energy is energy, etcetera. But when using natural units, the E = m_vc² reduces to E = m_v. As for our formula for the momentum, this formula remains p = m_v·v, but v is now some relative velocity, i.e. a fraction between 0 and 1. We can now re-write θ = (E/ħ)·t – (p/ħ)·x as:

θ = E·t – p·x = E·t − p·v·t = m_v·t − m_v·v·v·t = m_v·(1 − v²)·t

We can also write this as:

ψ(x, t) = a·e^{−i·(m_v·t − p∙x)} = a·e^{−i·[(m₀/√(1−v²))·t − (m₀·v/√(1−v²))∙x]} = a·e^{−i·m₀·(t − v∙x)/√(1−v²)}

The (t − v∙x)/√(1−v²) factor in the argument is the proper time of the particle as given by the formulas for the Lorentz transformation of spacetime:

relativity

However, both the θ = m_v·(1 − v²)·t and the θ = m₀·t’ = m₀·(t − v∙x)/√(1−v²) expressions are relativistically correct. Note that the rest mass of the particle (m₀) acts as a scaling factor as we multiply it with the proper time: a higher m₀ gives the wavefunction a higher density, in time as well as in space.
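A quick numerical check may be reassuring here. The sketch below, in natural units (c = ħ = 1) and for a particle at x = v·t, computes the argument once as E·t − p·x and once as m₀·t’. The function names and the sample values are my own choices.

```python
import math

def phase_lab(m0, v, t):
    """θ = E·t − p·x in natural units, for a particle located at x = v·t."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    energy = gamma * m0          # E = m_v
    momentum = gamma * m0 * v    # p = m_v·v
    x = v * t
    return energy * t - momentum * x

def phase_proper(m0, v, t):
    """θ = m0·t', with t' = (t − v·x)/√(1 − v²) from the Lorentz transformation."""
    x = v * t
    t_proper = (t - v * x) / math.sqrt(1.0 - v**2)
    return m0 * t_proper

for v in (0.1, 0.5, 0.9):
    print(v, phase_lab(1.0, v, 2.0), phase_proper(1.0, v, 2.0))
```

For each velocity, the two columns should agree up to rounding, and both equal m₀·t·√(1 − v²).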

Let’s go back to our v_p = E/p formula. Using natural units, it becomes:

v_p = E/p = m_v/(m_v·v) = 1/v

Interesting! The phase velocity is the reciprocal of the classical velocity! This implies it is always superluminal, ranging from v_p = ∞ to v_p = c = 1 for v going from 0 to 1 = c, as illustrated in the simple graph below.
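The same range can be tabulated with a few lines of code. This is a sketch in natural units (c = 1), with a rest mass of one and a handful of velocities that I picked arbitrarily.

```python
import math

def phase_velocity(v, m0=1.0):
    """v_p = E/p with E = m_v and p = m_v·v, in natural units (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v**2)
    energy = gamma * m0
    momentum = gamma * m0 * v
    return energy / momentum

rows = [(v, phase_velocity(v)) for v in (0.01, 0.1, 0.5, 0.9, 0.99)]
for v, v_p in rows:
    print(f"v = {v:<5} v_p = {v_p:.4f}  v·v_p = {v * v_p:.4f}")
```

The last column is the product of the classical velocity and the phase velocity, which should come out as c² = 1 for every row.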

Let me note something here, as you may also want to use the dispersion relation, i.e. ω = ħ·k²/(2m), to calculate the phase velocity. You’d write:

v_p = ω/k = [ħ·k²/(2m)]/k = ħ·k/(2m) = ħ·(p/ħ)/(2m) = m·v/(2m) = v/2

That’s a nonsensical result. Why do we get it? Because we are mixing two different mass concepts here: the mass that’s associated with the component wave, and the mass that’s associated with the composite wave. Think of it. That’s where Schrödinger’s equation is different from all of the other diffusion equations you’ve seen: the mass factor in the ∂ψ/∂t = i·[ħ/(2m)]·∇²ψ equation is the mass of the particle that’s being represented by the wavefunction that solves the equation. Hence, the diffusion constant ħ/(2m) is not a property of the medium. In that sense, it’s different from the κ/k factor in the ∂T/∂t = (κ/k)·∇²T heat diffusion equation, for example. We don’t have a medium here and, therefore, Schrödinger’s equation and the associated wavefunction are intimately connected.

It’s an interesting point, because if we’re going to be measuring the mass as multiples of ħ/2 (as suggested by the ħ/(2m) = 1/[m/(ħ/2)] factor itself), then its possible values (for ħ = 1) will be 1/2, 1, 3/2, 2, 5/2,… Now that should remind you of a few things – things like harmonics, or allowable spin values, or… Well… So many things. 🙂

Let’s do the exercise for bosons now.

The phase and group velocity of the wavefunction for spin-0 particles

My Deep Blue page explains why we need to drop the 1/2 factor in Schrödinger’s equation to make it fit the wavefunction for bosons. We distinguished two bosons: (1) the (theoretical) zero-mass particle (which has spin zero), and (2) the (actual) photon (which has spin one). Let’s first do the analysis for the spin-zero particle.

A zero-mass particle (i.e. a particle with zero rest mass) should be traveling at the speed of light: both its phase as well as its group velocity should be equal to c = 1. In fact, we’re not talking composite wavefunctions here, so there’s no such thing as a group velocity. We’re not adding waves: there is only one wavefunction. [Note that we don’t need to add waves with different frequencies in order to localize our particle, because quantum mechanics and relativity theory come together here in what might well be the most logical and absurd conclusion ever: as an outside observer, we’re going to see all those zero-mass particles as point objects whizzing by because of the relativistic length contraction. So their wavefunction is only all over spacetime in their proper space and time, but not in ours!]
Now, it’s easy to show that, if we choose our time and distance units such that c = 1, then the energy formula reduces to E = m∙c² = m. Likewise, we find that p = m∙c = m. So we have this strange condition: E = p = m.
Now, this is not consistent with the ω = ħ·k²/(2m) we get out of the ∂ψ/∂t = i·[ħ/(2m)]·∇²ψ equation, because E/ħ = ħ·(p/ħ)²/(2m) ⇔ E = m²/(2m) = m/2. That does not fit the E = p = m condition. The only way out is to drop the 1/2 factor, i.e. to multiply Schrödinger’s coefficient with 2.

Let’s quickly check if it does the trick. We assume E, p and m will be multiples of ħ/2 (E = p = m = n·(ħ/2)), so the wavefunction is e^{−i∙[t − x]·n/2}, Schrödinger’s ħ/m constant becomes 2/n, and the derivatives for ∂ψ/∂t = i·(ħ/m)·∇²ψ are:

∂ψ/∂t = −i·(n/2)·e^{−i∙[t − x]·n/2}
∇²ψ = ∂²[e^{−i∙[t − x]·n/2}]/∂x²= i·(n/2)·∂[e^{−i∙[t − x]·n/2}]/∂x = −(n²/4)·e^{−i∙[t − x]·n/2}

So the Schrödinger equation becomes:

−i·(n/2)·e^{−i∙[t − x]·n/2} = −i·(2/n)·(n²/4)·e^{−i∙[t − x]·n/2} ⇔ n/2 = n/2 ⇔ 1 = 1

As Feynman would say, it works like a charm, and note that n does not have to be some integer to make this work.
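If you’d rather see numbers than symbols, here is a minimal Python sketch of the check above. It assumes natural units (c = ħ = 1), takes both derivatives by finite differences rather than analytically, and the function names (psi, residual) are just mine, for illustration. It also tries a non-integer n, and compares against a coefficient that is only half as large.

```python
import cmath

def psi(x, t, n):
    """Elementary wavefunction e^{-i(t - x)n/2} (natural units, c = hbar = 1)."""
    return cmath.exp(-1j * (t - x) * n / 2)

def residual(x, t, n, coeff, h=1e-4):
    """|dpsi/dt - i*coeff*d2psi/dx2|, both derivatives by central differences."""
    dpsi_dt = (psi(x, t + h, n) - psi(x, t - h, n)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t, n) - 2 * psi(x, t, n) + psi(x - h, t, n)) / h**2
    return abs(dpsi_dt - 1j * coeff * d2psi_dx2)

# n does not have to be an integer, so 2.5 is included on purpose.
for n in (1, 2, 3, 2.5):
    with_2_over_n = residual(0.3, 0.7, n, coeff=2 / n)   # coefficient used above
    with_1_over_n = residual(0.3, 0.7, n, coeff=1 / n)   # half of it, for contrast
    print(n, with_2_over_n, with_1_over_n)
```

The first residual should be zero up to finite-difference noise, while the second one should come out at roughly n/4.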

So what makes spin-1/2 particles different? The answer is: they have both linear as well as angular momentum, and the equipartition theorem tells us the energy will be shared equally among both, so they will pick up linear and angular momentum. Hence, the associated condition is not E = p = m, but E = p = 2m. We’ll come back to this.

Let’s now summarize how it works for spin-one particles.

The phase and group velocity of the wavefunction for spin-1 particles (photons)

Because of the particularities that characterize an electromagnetic wave, the wavefunction packs two waves, capturing both the electric as well as the magnetic field vector (i.e. E and B). For the detail, I’ll refer you to the mentioned page, because the proof is rather lengthy (but easy to follow, so please do check it out). I will just briefly summarize the logic here.

1. For the spin-zero particle, we measured E, m and p in units of – or as multiples of – the ħ/2 factor. Hence, the elementary wavefunction (i.e. the wavefunction for E = p = m = 1) for the zero-mass particle is e^{i(x/2 − t/2)}.

2. For the spin-1 particle (the photon), one can show that we get two of these elementary wavefunctions (ψ_E and ψ_B), and one can then prove that we can write the sum of the electric and magnetic field vector as:

E + B = ψ_E + ψ_B = E + i·E

= √2·e^{i(x/2 − t/2 + π/4)} = √2·e^{iπ/4}·e^{i(x/2 − t/2)} = √2·e^{iπ/4}·E
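The only algebra that is really needed for the last step is 1 + i = √2·e^{iπ/4}. A small Python sketch, assuming E stands for the elementary wavefunction e^{i(x/2 − t/2)} and using a function name (elementary) that I made up for the occasion, confirms it at an arbitrary point:

```python
import cmath
import math

def elementary(x, t):
    """Elementary wavefunction e^{i(x/2 - t/2)}."""
    return cmath.exp(1j * (x / 2 - t / 2))

x, t = 0.4, 1.1            # arbitrary point in space and time
E = elementary(x, t)

lhs = E + 1j * E                                          # E + i·E
rhs = math.sqrt(2) * cmath.exp(1j * math.pi / 4) * E      # sqrt(2)·e^{i·pi/4}·E
print(abs(lhs - rhs))      # difference between the two forms
```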

Hence, the photon has a special wavefunction. Does it solve the Schrödinger equation? It does when we use the 2ħ/m diffusion constant, rather than the ħ/m or ħ/(2m) coefficient. Let us quickly check it. The derivatives are:

∂ψ/∂t = −√2·e^{iπ/4}·e^{−i∙[t − x]/2}·(i/2)
∇²ψ = ∂²[√2·e^{iπ/4}·e^{−i∙[t − x]/2}]/∂x² = √2·e^{iπ/4}·∂[e^{−i∙[t − x]/2}·(i/2)]/∂x = −√2·e^{iπ/4}·e^{−i∙[t − x]/2}·(1/4)

Note, however, that we have two mass, energy and momentum concepts here: E_E, p_E, m_E and E_B, p_B, and m_B respectively. Hence, if E_E = p_E = m_E = E_B = p_B = m_B = 1/2, then E = E_E + E_B, p = p_E + p_B and m = m_E + m_B are all equal to 1. Hence, because E = p = m = 1 and we measure in units of ħ, the 2ħ/m factor is equal to 2 and, therefore, the modified Schrödinger’s equation ∂ψ/∂t = i·(2ħ/m)·∇²ψ becomes:

−i·√2·e^{iπ/4}·e^{−i∙[t − x]/2}·(1/2) = −i·√2·2·e^{iπ/4}·e^{−i∙[t − x]/2}·(1/4) ⇔ 1/2 = 2/4 = 1/2

It all works out. Let’s quickly check it for E, m and p being multiples of ħ, so we write: E = p = m = n·ħ = n, so the wavefunction is √2·e^{iπ/4}·e^{−i∙[t − x]·n/2}, Schrödinger’s 2ħ/m constant becomes 2ħ/m = 2/n, and the derivatives for ∂ψ/∂t = i·(2ħ/m)·∇²ψ are:

∂ψ/∂t = −i·(n/2)·√2·e^{iπ/4}·e^{−i∙[t − x]·n/2}
∇²ψ = ∂²[√2·e^{iπ/4}·e^{−i∙[t − x]·n/2}]/∂x² = i·√2·e^{iπ/4}·(n/2)·∂[e^{−i∙[t − x]·n/2}]/∂x = −√2·(n²/4)·e^{iπ/4}·e^{−i∙[t − x]·n/2}

So the Schrödinger equation becomes:

−i·√2·e^{iπ/4}·(n/2)·e^{−i∙[t − x]·n/2} = −i·√2·e^{iπ/4}·(2/n)·(n²/4)·e^{−i∙[t − x]·n/2} ⇔ n/2 = n/2 ⇔ 1 = 1

It works like a charm again. Note the subtlety of the difference between the ħ/(2m) and 2ħ/m factor: it depends on us measuring the mass (and, hence, the energy and momentum as well) in units of ħ/2 (for spin-0 particles) or, alternatively (for spin-1 particles), in units of ħ. This is very deep—but it does make sense in light of the E_n =n·ħ·ω = n·h·f formula that solves the black-body radiation problem, as illustrated below. [The formula next to the energy levels is the probability of an atomic oscillator occupying that energy level, which is given by Boltzmann’s Law. You can check things in my post on it.]

It is now time to look at something else.

Schrödinger’s equation as an energy propagation mechanism

The Schrödinger equations above are not complete. The complete equation includes force fields, i.e. potential energy:

i·ħ·∂ψ/∂t = −(ħ²/2m)·∇²ψ + V·ψ

To write the equation like this, we need to move the i on the right-hand side of our ∂ψ/∂t = i·[ħ/(2m)]·∇²ψ equation to the other side, and multiply both sides with −ħ. [Remember: 1/i = −i.] Now, it is very interesting to do a dimensional analysis of this equation. Let’s do the right-hand side first. The ħ² factor in the ħ²/(2m) coefficient is expressed in J²·s². Now that doesn’t make much sense, but then that mass factor in the denominator makes everything come out alright. Indeed, we can use the mass-energy equivalence relation to express m in J/(m/s)² units. So we get: (J²·s²)·[(m/s)²/J] = J·m². But so we multiply that with some quantity (the Laplacian) that’s expressed per m². So −(ħ²/2m)·∇²ψ is something expressed in joule, so it’s some amount of energy! Interesting, isn’t it? [Note that it works out fine with the additional Vψ term, which is also expressed in joule.] On the left-hand side, we have ħ, and its dimension is the action dimension: J·s, i.e. force times distance times time (N·m·s). So we multiply that with a time derivative and we get J once again, the unit of energy. So it works out: we have joule units both left and right. But what does it mean?
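The bookkeeping in that dimensional analysis is easy to mechanize. Below is a rough Python sketch that tracks the exponents of the SI base units (kg, m, s) for each factor. It assumes, as the argument above implicitly does, that ψ itself carries no dimension of its own in this count, and the little Dim helper is something I made up for the purpose: it is not a real units library.

```python
from collections import namedtuple

# Exponents of the SI base units kilogram, meter, second.
Dim = namedtuple("Dim", "kg m s")

def mul(a, b):
    """Dimension of a product: add the exponents."""
    return Dim(*(x + y for x, y in zip(a, b)))

def div(a, b):
    """Dimension of a quotient: subtract the exponents."""
    return Dim(*(x - y for x, y in zip(a, b)))

KILOGRAM = Dim(1, 0, 0)
SECOND = Dim(0, 0, 1)
JOULE = Dim(1, 2, -2)            # 1 J = 1 kg·m²/s²
HBAR = mul(JOULE, SECOND)        # action: J·s
PER_SQUARE_METER = Dim(0, -2, 0) # what the Laplacian contributes
PER_SECOND = Dim(0, 0, -1)       # what the time derivative contributes

# Right-hand side: (hbar²/m) times the Laplacian.
rhs = mul(div(mul(HBAR, HBAR), KILOGRAM), PER_SQUARE_METER)
# Left-hand side: hbar times the time derivative.
lhs = mul(HBAR, PER_SECOND)

print(lhs, rhs, JOULE)
```

Both sides come out with the exponents of the joule, which is all the argument above claims.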

Well… The Laplacian on the right-hand side works just the same as for our heat diffusion equation: it gives us a flux density, i.e. something expressed per square meter (1/m²). Likewise, the time derivative on the left-hand side gives us a flow per second. But so what is it that is flowing here? Well… My interpretation is that it is energy, and it’s flowing between a real and an imaginary space—but don’t be fooled by the terms here: both spaces are equally real, as both have an actual physical dimension. Let me explain.

Things become somewhat more comprehensible when we remind ourselves that the Schrödinger equation is equivalent to the following pair of equations:

Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ) ⇔ ω·sin(kx − ωt) = k²·(ħ/2m)·sin(kx − ωt)
Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ) ⇔ ω·cos(kx − ωt) = k²·(ħ/2m)·cos(kx − ωt)
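For those who like to check such things numerically: the sketch below takes the elementary wavefunction e^{i(kx − ωt)}, computes both derivatives by finite differences, and looks at how badly each of the two equations is violated. It assumes ħ = 1, and the values for m and k are arbitrary picks of mine. The pair should only be satisfied when ω = k²·(ħ/2m).

```python
import cmath

HBAR = 1.0  # natural units (assumption)

def psi(x, t, k, omega):
    """Elementary wavefunction e^{i(kx - omega*t)}."""
    return cmath.exp(1j * (k * x - omega * t))

def pair_residuals(k, omega, m, x=0.3, t=0.7, h=1e-4):
    """Violation of the Re/Im pair at one point (central differences)."""
    dpsi_dt = (psi(x, t + h, k, omega) - psi(x, t - h, k, omega)) / (2 * h)
    laplacian = (psi(x + h, t, k, omega) - 2 * psi(x, t, k, omega)
                 + psi(x - h, t, k, omega)) / h**2
    c = HBAR / (2 * m)
    first = abs(dpsi_dt.real + c * laplacian.imag)   # Re(dpsi/dt) = -c·Im(lap)
    second = abs(dpsi_dt.imag - c * laplacian.real)  # Im(dpsi/dt) = +c·Re(lap)
    return first, second

m, k = 1.5, 2.0
good = pair_residuals(k, HBAR * k**2 / (2 * m), m)  # omega = k²·hbar/(2m)
bad = pair_residuals(k, HBAR * k**2 / m, m)         # omega twice as large
print(good, bad)
```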

So what? Let me insert an illustration here. See what happens. The wavefunction acts as a link function between our physical spacetime and some other space whose dimensions – in my humble opinion – are also physical. We have those sines and cosines, which mirror the energy of the system at any point in time, as measured by the proper time of the system.

summary

Let me be more precise. The wavefunction, as a link function between two spaces here, associates every point in spacetime with some real as well as some imaginary energy here – but, as mentioned above, that imaginary energy is as real as the real energy. What it embodies really is the energy conservation law: at any point in time (as measured by the proper time) the sum of kinetic and potential energy must be equal to some constant, and so that’s what’s shown here. Indeed, you should note the phase shift between the sine and the cosine function: if one reaches the +1 or −1 value, then the other function reaches the zero point, and vice versa. It’s a beautiful structure.

Of course, the million-dollar question is: is it a physical structure, or a mathematical structure? The answer is: it’s a bit of both. It’s a mathematical structure but, at the same time, its dimension is physical: it’s an energy space. It’s that energy that explains why amplitudes interfere—which, as you know, is what they do. So these amplitudes are something real, and as the dimensional analysis of Schrödinger’s equation reveals their dimension is expressed in joule, then… Well… Then these physical equations say what they say, don’t they? And what they say, is something like the diagram below.

summary 2

Note that the diagram above does not show the phase difference between the two springs. The animation below does a better job here, although you need to realize the hand of the clock will move faster or slower as our object travels through force fields and accelerates or decelerates accordingly.

Circle_cos_sin

We may relate that picture above to the principle of least action, which ensures that the difference between the kinetic energy (KE) and potential energy (PE) in the integrand of the action integral, i.e.

S = ∫ (KE − PE)·dt

is minimized along the path of travel.

The spring metaphor should also make us think of the energy formula for a harmonic oscillator, which tells us that the total energy – kinetic (i.e. the energy related to its momentum) plus potential (i.e. the energy stored in the spring) – is equal to T + U = m·ω₀²/2 (for an oscillation with unit amplitude). The ω₀ here is the angular velocity, and we have two springs here, so the total energy would be the sum of both, i.e. m·ω₀², without the 1/2 factor. Does that make sense? It’s like an E = m·v² equation, so that’s twice the (non-relativistic) kinetic energy. Does that formula make any sense?

In the context of what we’re discussing here, it does. Think about the limit situation by trying to imagine a zero-mass particle here (I am talking a zero-mass spin-1/2 particle this time). It would have no rest energy, so its only energy is kinetic, which is equal to:

K.E. = E − E₀ = m_v·c² − m₀·c² = m_c·c²

Why is m_v equal to m_c? Zero-mass particles must travel at the speed of light, as the slightest force on them gives them an infinite acceleration. So there we are: the m·ω₀² equation makes sense! But what if we have a non-zero rest mass? In that case, look at that pair of equations again: they give us a dispersion relation, i.e. a relation between ω and k. Indeed, using natural units again, so the numerical value of ħ = 1, we can write:

ω = k²/(2m) = p²/(2m) = (m·v)²/(2m) = m·v²/2

This equation seems to represent the kinetic energy but m is not the rest mass here: it’s the relativistic mass, so that makes it different from the classical kinetic energy formula (K.E. = m₀·v²/2). [It may be useful here to remind you of how we get that classical formula. We basically integrate force over distance, from some start to some final point of a path in spacetime. So we write: ∫ F·ds = ∫ (m·a)·ds = ∫ [m·(dv/dt)]·ds = ∫ [m·(ds/dt)]·dv = ∫ m·v·dv. So we can solve that using the m·v²/2 primitive but only if m does not vary, i.e. if m = m₀. If velocities are high, we need the relativistic mass concept.]
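The classical integral in that bracketed note can be checked with a few lines of Python. This is only a sketch of the constant-mass case (m = m₀), using a plain midpoint sum for ∫ m·v·dv from zero to the final velocity; the function name and the numbers are arbitrary choices of mine.

```python
def work_constant_mass(m, v_final, steps=100_000):
    """Midpoint-rule approximation of the integral of m·v·dv from 0 to v_final."""
    dv = v_final / steps
    return sum(m * (i + 0.5) * dv * dv for i in range(steps))

m, v = 2.0, 3.0
numeric = work_constant_mass(m, v)
closed_form = m * v**2 / 2     # the m·v²/2 primitive
print(numeric, closed_form)
```

The two numbers agree as long as m is kept constant, which is exactly the condition mentioned in the note.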

So we have a new energy concept here: m·v², and it’s split over those two springs. Hmm… The interpretation of all of this is not so easy, so I will need to re-visit this. As for now, however, it looks like the Universe can be represented by a V-twin engine! 🙂

V-Twin engine

Is it real?

You may still doubt whether that new ‘space’ has an actual energy dimension. It’s a figment of our mind, right? Well… Yes and no. Again, it’s a bit of a mixture between a mathematical and a physical space: it’s definitely not our physical space, as it’s not the spacetime we’re living in. But, having said that, I don’t think this energy space is just a figment of our mind. Let me give you some additional reasons, beside the dimensional analysis we did above.

For example, there is the fact that we need to take the absolute square of the wavefunction to get the probability that our elementary particle is actually right there! Now that’s something real! Hence, let me say a few more things about that. The absolute square gets rid of the time factor. Just write it out to see what happens:

|r·e^{iθ}|² = |r|²·|e^{iθ}|² = r²·[√(cos²θ + sin²θ)]² = r²·(√1)² = r²

Now, the r gives us the maximum amplitude (sorry for the mix of terminology here: I am just talking the wave amplitude here – i.e. the classical concept of an amplitude – not the quantum-mechanical concept of a probability amplitude). Now, we know that the energy of a wave – any wave, really – is proportional to the square of the amplitude of the wave. It would also be logical to expect that the probability of finding our particle at some point x is proportional to the energy density there, isn’t it? [I know what you’ll say now: you’re squaring the amplitude, so if the dimension of its square is energy, then its own dimension must be the square root, right? No. Wrong. That’s why this confusion between amplitude and probability amplitude is so bad. Look at the formula: we’re squaring the sine and cosine, to then take the square root again, so the dimension doesn’t change: it’s √J² = J.]

The third reason why I think the probability amplitude represents some energy is that its real and imaginary part also interfere with each other, as is evident when you take the ordinary square (i.e. not the absolute square). Then the i² = –1 rule comes into play and, therefore, the square of the imaginary part starts messing with the square of the real part. Just write it out:

(r·e^{iθ})² = r²·(cosθ + i·sinθ)² = r²·(cos²θ – sin²θ + 2i·cosθ·sinθ) = r²·(1 – 2sin²θ + 2i·cosθ·sinθ)
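To see the difference between the two squares with actual numbers, here is a small Python sketch. The values of r and θ are arbitrary, and the function names are mine. The absolute square does not depend on θ at all, while the ordinary square mixes the real and imaginary parts through the i² = −1 rule.

```python
import cmath
import math

def absolute_square(r, theta):
    """|r·e^{i·theta}|², which should equal r² for any theta."""
    return abs(r * cmath.exp(1j * theta)) ** 2

def ordinary_square(r, theta):
    """(r·e^{i·theta})², a complex number in general."""
    return (r * cmath.exp(1j * theta)) ** 2

r, theta = 2.0, 0.6
z2 = ordinary_square(r, theta)
expected_real = r**2 * (math.cos(theta)**2 - math.sin(theta)**2)
expected_imag = r**2 * 2 * math.cos(theta) * math.sin(theta)

print(absolute_square(r, theta))            # no theta dependence
print(z2, expected_real, expected_imag)     # real and imaginary parts mix
```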

As mentioned above, if there’s interference, then something is happening, and so then we’re talking something real. Hence, the real and imaginary part of the wavefunction must have some dimension, and not just any dimension: it must be energy, as that’s the currency of the Universe, so to speak.

Let me add a philosophical note here—or an ontological note, I should say. When you think we should only have one physical space, you’re right. This new physical space, in which we relate energy to time, is not our physical space. It’s not reality—as we know, as we experience it. So, in that sense, you’re right. It’s not physical space. But then… Well… It’s a definitional matter. Any space whose dimensions are physical, is a physical space for me. But then I should probably be more careful. What we have here is some kind of projection of our physical space to a space that lacks… Well… It lacks the spatial dimension. It’s just time – but a special kind of time: relativistic proper time – and energy—albeit energy in two dimensions, so to speak. So… What can I say? Just what I said a couple of times already: it’s some kind of mixture between a physical and mathematical space. But then… Well… Our own physical space – including the spatial dimension – is something like a mixture as well, isn’t it? We can try to disentangle them – which is what I am trying to do – but we’ll never fully succeed.

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

The Imaginary Energy Space

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter that is being presented here, which you can find by clicking on the provided link. In addition, I see the dark force has amused himself by removing some material even here!

Original post:

Intriguing title, isn’t it? You’ll think this is going to be highly speculative and you’re right. In fact, I could also have written: the imaginary action space, or the imaginary momentum space. Whatever. It all works! It’s an imaginary space – but a very real one, because it holds energy, or momentum, or a combination of both, i.e. action. 🙂

So the title is either going to deter you or, else, encourage you to read on. I hope it’s the latter. 🙂

In my post on Richard Feynman’s exposé on how Schrödinger got his famous wave equation, I noted an ambiguity in how he deals with the energy concept. I wrote that piece in February, and it is now May. In-between, I looked at Schrödinger’s equation from various perspectives, as evidenced by the many posts that followed that February post, which I summarized on my Deep Blue page, where I note the following:

The argument of the wavefunction (i.e. θ = ωt – kx = [E·t – p·x]/ħ) is just the proper time of the object that’s being represented by the wavefunction (which, in most cases, is an elementary particle—an electron, for example).
The 1/2 factor in Schrödinger’s equation (∂ψ/∂t = i·(ħ/2m)·∇²ψ) doesn’t make all that much sense, so we should just drop it. Writing ∂ψ/∂t = i·(ħ/m)·∇²ψ (i.e. Schrödinger’s equation without the 1/2 factor) does away with the mentioned ambiguities and, more importantly, avoids obvious contradictions.

Both remarks are rather unusual—especially the second one. In fact, if you’re not shocked by what I wrote above (Schrödinger got something wrong!), then stop reading—because then you’re likely not to understand a thing of what follows. 🙂 In any case, I thought it would be good to follow up by devoting a separate post to this matter.

The argument of the wavefunction as the proper time

Frankly, it took me quite a while to see that the argument of the wavefunction is nothing but the t’ = (t − v∙x)/√(1−v²) formula that we know from the Lorentz transformation of spacetime. Let me quickly give you the formulas (just substitute v for the u):

relativity

In fact, let me be precise: the argument of the wavefunction also has the particle’s rest mass m₀ in it. That mass factor (m₀) appears in it as a general scaling factor, so it determines the density of the wavefunction both in time as well as in space. Let me jot it down:

ψ(x, t) = a·e^{−i·(m_v·t − p∙x)} = a·e^{−i·[(m₀/√(1−v²))·t − (m₀·v/√(1−v²))∙x]} = a·e^{−i·m₀·(t − v∙x)/√(1−v²)}

Huh? Yes. Let me show you how we get from θ = ωt – kx = [E·t – p·x]/ħ to θ = m_v·t − p∙x. It’s really easy. We first need to choose our units such that the speed of light and Planck’s constant are numerically equal to one, so we write: c = 1 and ħ = 1. So now the 1/ħ factor no longer appears.

[Let me note something here: using natural units does not do away with the dimensions: the dimensions of whatever is there remain what they are. For example, energy remains what it is, and so that’s force times distance: 1 joule = 1 newton·meter (1 J = 1 N·m). Likewise, momentum remains what it is: force times time (or mass times velocity). Finally, the dimension of the quantum of action doesn’t disappear either: it remains the product of force, distance and time (N·m·s). So you should distinguish between the numerical value of our variables and their dimension. Always! That’s where physics is different from algebra: the equations actually mean something!]

Now, because we’re working in natural units, the numerical value of both c and c² will be equal to 1. It’s obvious, then, that Einstein’s mass-energy equivalence relation reduces from E = m_v·c² to E = m_v. You can work out the rest yourself – noting that p = m_v·v and m_v = m₀/√(1−v²). Done! For a more intuitive explanation, I refer you to the above-mentioned page.
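If you don’t feel like working out the rest yourself, a few lines of Python will do it for you. The sketch below assumes natural units (c = ħ = 1) and a velocity v expressed as a fraction of c. It computes the argument of the wavefunction once as E·t − p·x and once as m₀ times the Lorentz-transformed time. The function names and the sample values are just illustrative.

```python
import math

def phase_from_energy_momentum(m0, v, x, t):
    """theta = E·t - p·x with E = m_v and p = m_v·v (natural units)."""
    m_v = m0 / math.sqrt(1 - v * v)   # relativistic mass
    return m_v * t - (m_v * v) * x

def phase_from_proper_time(m0, v, x, t):
    """theta = m0·t', with t' = (t - v·x)/sqrt(1 - v²)."""
    t_prime = (t - v * x) / math.sqrt(1 - v * v)
    return m0 * t_prime

m0, v, x, t = 1.0, 0.6, 2.0, 3.0
print(phase_from_energy_momentum(m0, v, x, t))
print(phase_from_proper_time(m0, v, x, t))
```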

So that’s for the wavefunction. Let’s now look at Schrödinger’s wave equation, i.e. that differential equation of which our wavefunction is a solution. In my introduction, I bluntly said there was something wrong with it: that 1/2 factor shouldn’t be there. Why not?

What’s wrong with Schrödinger’s equation?

When deriving his famous equation, Schrödinger uses the mass concept as it appears in the classical kinetic energy formula: K.E. = m·v²/2, and that’s why – after all the complicated turns – that 1/2 factor is there. There are many reasons why that factor doesn’t make sense. Let me sum up a few.

[I] The most important reason is that de Broglie made it quite clear that the energy concept in his equations for the temporal and spatial frequency for the wavefunction – i.e. the ω = E/ħ and k = p/ħ relations – is the total energy, including rest energy (m₀), kinetic energy (m·v²/2) and any potential energy (V). In fact, if we just multiply the two de Broglie equations (aka the matter-wave equations) and use the old-fashioned v = f·λ relation (so we write E as E = ω·ħ = (2π·f)·(h/2π) = f·h, and p as p = k·ħ = (2π/λ)·(h/2π) = h/λ and, therefore, we have f = E/h and λ = h/p), we find that the energy concept that’s implicit in the two matter-wave equations is equal to E = m∙v², as shown below:

f·λ = (E/h)·(h/p) = E/p
v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v²

Huh? E = m∙v²? Yes. Not E = m∙c² or m·v²/2 or whatever else you might be thinking of. In fact, this E = m∙v² formula makes a lot of sense in light of the two following points.

Skeptical note: You may – and actually should – wonder whether we can use that v = f·λ relation for a wave like this, i.e. a wave with both a real (cos(-θ)) as well as an imaginary component (i·sin(-θ)). It’s a deep question, and I’ll come back to it later. But… Yes. It’s the right question to ask. 😦

[II] Newton told us that force is mass times acceleration. Newton’s law is still valid in Einstein’s world. The only difference between Newton’s and Einstein’s world is that, since Einstein, we should treat the mass factor as a variable as well. We write: F = m_v·a = [m₀/√(1−v²)]·a. This formula gives us the definition of the newton as a force unit: 1 N = 1 kg·(m/s)/s = 1 kg·m/s². [Note that the 1/√(1−v²) factor – i.e. the Lorentz factor (γ) – has no dimension, because v is measured as a relative velocity here, i.e. as a fraction between 0 and 1.]

Now, you’ll agree the definition of energy as a force over some distance is valid in Einstein’s world as well. Hence, if 1 joule is 1 N·m, then 1 J is also equal to 1 (kg·m/s²)·m = 1 kg·(m²/s²), so this also reflects the E = m∙v² concept. [I can hear you mutter: that kg factor refers to the rest mass, no? No. It doesn’t. The kg is just a measure of inertia: as a unit, it applies to both m₀ as well as m_v. Full stop.]

Very skeptical note: You will say this doesn’t prove anything – because this argument just shows the dimensional analysis for both equations (i.e. E = m∙v² and E = m∙c²) is OK. Hmm… Yes. You’re right. 🙂 But the next point will surely convince you! 🙂

[III] The third argument is the most intricate and the most beautiful at the same time, not because it’s simple (like the arguments above) but because it gives us an interpretation of what’s going on here. It’s fairly easy to verify that Schrödinger’s equation, ∂ψ/∂t = i·(ħ/2m)·∇²ψ (including the 1/2 factor to which I object), is equivalent to the following set of two equations:

Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ)
Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ)

[In case you don’t see it immediately, note that two complex numbers a + i·b and c + i·d are equal if, and only if, their real and imaginary parts are the same. However, here we have something like this: a + i·b = i·(c + i·d) = i·c + i²·d = − d + i·c (remember i²= −1).]

Now, before we proceed (i.e. before I show you what’s wrong here with that 1/2 factor), let us look at the dimensions first. For that, we’d better analyze the complete Schrödinger equation so as to make sure we’re not doing anything stupid here by looking at one aspect of the equation only. The complete equation, in its original form, is:

i·ħ·∂ψ/∂t = −(ħ²/2m)·∇²ψ + V·ψ

Notice that, to simplify the analysis above, I had moved the i and the ħ on the left-hand side to the right-hand side (note that 1/i = −i, so −(ħ²/2m)/(i·ħ) = i·ħ/2m). Now, the ħ² factor on the right-hand side is expressed in J²·s². Now that doesn’t make much sense, but then that mass factor in the denominator makes everything come out alright. Indeed, we can use the mass-energy equivalence relation to express m in J/(m/s)² units. So our ħ²/2m coefficient is expressed in (J²·s²)/[J/(m/s)²] = J·m². Now we multiply that by that Laplacian operating on some scalar, which yields some quantity per square meter. So the whole right-hand side becomes some amount expressed in joule, i.e. the unit of energy! Interesting, isn’t it?

On the left-hand side, we have i and ħ. We shouldn’t worry about the imaginary unit because we can treat that as just another number, albeit a very special number (because its square is minus 1). However, in this equation, it’s like a mathematical constant and you can think of it as something like π or e. [Think of the magical formula: e^{iπ} = i² = −1.] In contrast, ħ is a physical constant, and so that constant comes with some dimension and, therefore, we cannot just do what we want. [I’ll show, later, that even moving it to the other side of the equation comes with interpretation problems, so be careful with physical constants, as they really mean something!] In this case, its dimension is the action dimension: J·s = N·m·s, so that’s force times distance times time. So we multiply that with a time derivative and we get joule once again (N·m·s/s = N·m = J), so that’s the unit of energy. So it works out: we have joule units both left and right in Schrödinger’s equation. Nice! Yes. But what does it mean? 🙂

Well… You know that we can – and should – think of Schrödinger’s equation as a diffusion equation – just like a heat diffusion equation, for example – but then one describing the diffusion of a probability amplitude. [In case you are not familiar with this interpretation, please do check my post on it, or my Deep Blue page.] But then we didn’t describe the mechanism in very much detail, so let me try to do that now and, in the process, finally explain the problem with the 1/2 factor.

The missing energy

There are various ways to explain the problem. One of them involves calculating group and phase velocities of the elementary wavefunction satisfying Schrödinger’s equation but that’s a more complicated approach and I’ve done that elsewhere, so just click the reference if you prefer the more complicated stuff. I find it easier to just use those two equations above:

Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ)
Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ)

The argument is the following: if our elementary wavefunction is equal to e^{i(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt), then it’s easy to prove that this pair of conditions is fulfilled if, and only if, ω = k²·(ħ/2m). [Note that I am omitting the normalization coefficient in front of the wavefunction: you can put it back in if you want. The argument here is valid, with or without normalization coefficients.] Easy? Yes. Check it out. The time derivative on the left-hand side is equal to:

∂ψ/∂t = −iω·e^{i(kx − ωt)} = −iω·[cos(kx − ωt) + i·sin(kx − ωt)] = ω·sin(kx − ωt) − iω·cos(kx − ωt)

And the second-order derivative on the right-hand side is equal to:

∇²ψ = ∂²ψ/∂x² = i²·k²·e^{i(kx − ωt)} = −k²·cos(kx − ωt) − i·k²·sin(kx − ωt)

So the two equations above are equivalent to writing:

Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ) ⇔ ω·sin(kx − ωt) = k²·(ħ/2m)·sin(kx − ωt)
Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ) ⇔ ω·cos(kx − ωt) = k²·(ħ/2m)·cos(kx − ωt)

So both conditions are fulfilled if, and only if, ω = k²·(ħ/2m). You’ll say: so what? Well… We have a contradiction here—something that doesn’t make sense. Indeed, the second of the two de Broglie equations (always look at them as a pair) tells us that k = p/ħ, so we can re-write the ω = k²·(ħ/2m) condition as:

ω/k = v_p = k²·(ħ/2m)/k = k·ħ/(2m) = (p/ħ)·(ħ/2m) = p/2m ⇔ p = 2m

You’ll say: so what? Well… Stop reading, I’d say. That p = 2m doesn’t make sense—at all! Nope! In fact, if you thought that the E = m·v² is weird—which, I hope, is no longer the case by now—then… Well… This p = 2m equation is much weirder. In fact, it’s plain nonsense: this condition makes no sense whatsoever. The only way out is to remove the 1/2 factor, and to re-write the Schrödinger equation as I wrote it, i.e. with an ħ/m coefficient only, rather than an (1/2)·(ħ/m) coefficient.

Huh? Yes.

As mentioned above, I could do those group and phase velocity calculations to show you what rubbish that 1/2 factor leads to – and I’ll do that eventually – but let me first find yet another way to present the same paradox. Let’s simplify our life by choosing our units such that c = ħ = 1, so we’re using so-called natural units rather than our SI units. Our mass-energy equivalence then becomes: E = m·c² = m·1² = m. [Again, note that switching to natural units doesn’t do anything to the physical dimensions: a force remains a force, a distance remains a distance, and so on. So we’d still measure energy and mass in different but equivalent units. Hence, the equality sign should not make you think mass and energy are actually the same: energy is energy (i.e. force times distance), while mass is mass (i.e. a measure of inertia). I am saying this because it’s important, and because it took me a while to make these rather subtle distinctions.]

Let’s now go one step further and imagine a hypothetical particle with zero rest mass, so m₀ = 0. Hence, all its energy is kinetic and so we write: K.E. = m_v·v²/2. Now, because this particle has zero rest mass, the slightest acceleration will make it travel at the speed of light. In fact, we would expect it to travel at the speed of light, so m_v = m_c and, according to the mass-energy equivalence relation, its total energy is, effectively, E = m_v = m_c. However, we just said its total energy is kinetic energy only. Hence, its total energy must be equal to E = K.E. = m_c·c²/2 = m_c/2. So we’ve got only half the energy we need. Where’s the other half? Where’s the missing energy? Quid est veritas? Is its energy E = m_c or E = m_c/2?

It’s just a paradox, of course, but one we have to solve. Of course, we may just say we trust Einstein’s E = m·c²formula more than the kinetic energy formula, but that answer is not very scientific. 🙂 We’ve got a problem here and, in order to solve it, I’ve come to the following conclusion: just because of its sheer existence, our zero-mass particle must have some hidden energy, and that hidden energy is also equal to E = m·c²/2. Hence, the kinetic and the hidden energy add up to E = m·c² and all is alright.

Huh? Hidden energy? I must be joking, right?

Well… No. Let me explain. Oh. And just in case you wonder why I bother to try to imagine zero-mass particles. Let me tell you: it’s the first step towards finding a wavefunction for a photon and, secondly, you’ll see it just amounts to modeling the propagation mechanism of energy itself. 🙂

The hidden energy as imaginary energy

I am tempted to refer to the missing energy as imaginary energy, because it’s linked to the imaginary part of the wavefunction. However, it’s anything but imaginary: it’s as real as the imaginary part of the wavefunction. [I know that sounds a bit nonsensical, but… Well… Think about it. And read on!]

Back to that factor 1/2. As mentioned above, it also pops up when calculating the group and the phase velocity of the wavefunction. In fact, let me show you that calculation now. [Sorry. Just hang in there.] It goes like this.

The de Broglie relations tell us that the k and the ω in the e^{i(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt) wavefunction (i.e. the spatial and temporal frequency respectively) are equal to k = p/ħ, and ω = E/ħ. Let’s now think of that zero-mass particle once more, so we assume all of its energy is kinetic: no rest energy, no potential! So… If we now use the kinetic energy formula E = m·v²/2 – which we can also write as E = m·v·v/2 = p·v/2 = p·p/2m = p²/2m, with v = p/m the classical velocity of the elementary particle that Louis de Broglie was thinking of – then we can calculate the group velocity of our e^{i(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt) wavefunction as:

v_g = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂[p²/2m]/∂p = 2p/2m = p/m = v

[Don’t tell me I can’t treat m as a constant when calculating ∂ω/∂k: I can. Think about it.]

Fine. Now the phase velocity. For the phase velocity of our e^{i(kx − ωt)} wavefunction, we find:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = (p²/2m)/p = p/2m = v/2

So that’s only half of v: it’s the 1/2 factor once more! Strange, isn’t it? Why would we get a different value for the phase velocity here? It’s not like we have two different frequencies here, do we? Well… No. You may also note that the phase velocity turns out to be smaller than the group velocity (as mentioned, it’s only half of the group velocity), which is quite exceptional as well! So… Well… What’s the matter here? We’ve got a problem!
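The two results can be checked numerically. The snippet below is a minimal sketch, assuming units in which ħ = m = 1 and an arbitrary sample wavenumber (both are illustrative choices, not values from the text): it differentiates the dispersion relation ω = ħ·k²/(2m) by a central finite difference and compares the outcome with ω/k.

```python
# Numerical check of v_g = dω/dk and v_p = ω/k for ω(k) = ħ·k²/(2m).
# The units (ħ = m = 1) and the sample wavenumber are illustrative assumptions.

HBAR = 1.0
MASS = 1.0


def omega(k):
    """Dispersion relation that follows from E = p²/2m with E = ħω and p = ħk."""
    return HBAR * k**2 / (2.0 * MASS)


def group_velocity(k, dk=1e-6):
    """Central finite-difference estimate of dω/dk."""
    return (omega(k + dk) - omega(k - dk)) / (2.0 * dk)


def phase_velocity(k):
    """Phase velocity ω/k."""
    return omega(k) / k


k = 3.0
v_classical = HBAR * k / MASS  # v = p/m
print("group velocity:", group_velocity(k))
print("phase velocity:", phase_velocity(k))
print("classical v   :", v_classical)
```

The group velocity comes out equal to v = p/m and the phase velocity equal to half of it, which is just the 1/2 factor again.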

What’s going on here? We have only one wave here – one frequency and, hence, only one k and ω. However, on the other hand, it’s also true that the e^{i(kx − ωt)} wavefunction gives us two functions for the price of one – one real and one imaginary: e^{i(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt). So the question here is: are we adding waves, or are we not? It’s a deep question. If we’re adding waves, we may get different group and phase velocities, but if we’re not, then… Well… Then the group and phase velocity of our wave should be the same, right? The answer is: we are and we aren’t. It all depends on what you mean by ‘adding’ waves. I know you don’t like that answer, but that’s the way it is, really. 🙂

Let me make a small digression here that will make you feel even more confused. You know – or you should know – that the sine and the cosine function are the same except for a phase difference of 90 degrees: sinθ = cos(θ − π/2). Now, at the same time, multiplying something with i amounts to a rotation by 90 degrees, as shown below.

Hence, in order to sort of visualize what our e^{i(kx − ωt)} function really looks like, we may want to super-impose the two graphs and think of something like this:

You’ll have to admit that, when you see this, our formulas for the group or phase velocity, or our v = f·λ relation, do no longer make much sense, do they? 🙂

Having said that, that 1/2 factor is and remains puzzling, and there must be some logical reason for it. For example, it also pops up in the Uncertainty Relations:

Δx·Δp ≥ ħ/2 and ΔE·Δt ≥ ħ/2

So we have ħ/2 in both, not ħ. Why do we need to divide the quantum of action here? How do we solve all these paradoxes? It’s easy to see how: the apparent contradiction (i.e. the different group and phase velocity) gets solved if we’d use the E = m∙v² formula rather than the kinetic energy E = m∙v²/2. But then… What energy formula is the correct one: E = m∙v² or m∙c²? Einstein’s formula is always right, isn’t it? It must be, so let me postpone the discussion a bit by looking at a limit situation. If v = c, then we don’t need to make a choice, obviously. 🙂 So let’s look at that limit situation first. So we’re discussing our zero-mass particle once again, assuming it travels at the speed of light. What do we get?

Well… Measuring time and distance in natural units, so c = 1, we have:

E = m∙c² = m and p = m∙c = m, so we get: E = m = p

Wow! E = m = p! What a weird combination, isn’t it? Well… Yes. But it’s fully OK. [You tell me why it wouldn’t be OK. It’s true we’re glossing over the dimensions here, but natural units are natural units and, hence, the numerical value of c and c² is 1. Just figure it out for yourself.] The point to note is that the E = m = p equality yields extremely simple but also very sensible results. For the group velocity of our e^{i(kx − ωt)} wavefunction, we get:

v_g = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂p/∂p = 1

So that’s the velocity of our zero-mass particle (remember: the 1 stands for c here, i.e. the speed of light) expressed in natural units once more—just like what we found before. For the phase velocity, we get:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = p/p = 1

Same result! No factor 1/2 here! Isn’t that great? My ‘hidden energy theory’ makes a lot of sense.

However, if there’s hidden energy, we still need to show where it’s hidden. 🙂 Now that question is linked to the propagation mechanism that’s described by those two equations, which now – leaving the 1/2 factor out – simplify to:

Re(∂ψ/∂t) = −(ħ/m)·Im(∇²ψ)
Im(∂ψ/∂t) = (ħ/m)·Re(∇²ψ)
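For reference, here is a short sketch of how those two real equations follow from the single complex one, assuming the equation without the 1/2 factor, ∂ψ/∂t = i·(ħ/m)·∇²ψ, and simply splitting ψ into its real and imaginary part:

```latex
\begin{aligned}
\frac{\partial \psi}{\partial t} &= i\,\frac{\hbar}{m}\,\nabla^2\psi,
\qquad \psi = \operatorname{Re}\psi + i\,\operatorname{Im}\psi \\
\frac{\partial \operatorname{Re}\psi}{\partial t} + i\,\frac{\partial \operatorname{Im}\psi}{\partial t}
&= i\,\frac{\hbar}{m}\,\nabla^2\operatorname{Re}\psi - \frac{\hbar}{m}\,\nabla^2\operatorname{Im}\psi \\
\Rightarrow\quad
\operatorname{Re}\!\left(\frac{\partial \psi}{\partial t}\right) &= -\frac{\hbar}{m}\,\operatorname{Im}\!\left(\nabla^2\psi\right),
\qquad
\operatorname{Im}\!\left(\frac{\partial \psi}{\partial t}\right) = \frac{\hbar}{m}\,\operatorname{Re}\!\left(\nabla^2\psi\right)
\end{aligned}
```

The minus sign in the first equation is just i·i = −1 at work: the time derivative of the real part is driven by the Laplacian of the imaginary part, and vice versa.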

Propagation mechanism? Yes. That’s what we’re talking about here: the propagation mechanism of energy. Huh? Yes. Let me explain in another separate section, so as to improve readability. Before I do, however, let me add another note—for the skeptics among you. 🙂

Indeed, the skeptics among you may wonder whether our zero-mass particle wavefunction makes any sense at all, and they should do so for the following reason: if x = 0 at t = 0, and it’s traveling at the speed of light, then x(t) = t. Always. So if E = m = p, the argument of our wavefunction becomes E·t – p·x = E·t – E·t = 0! So what’s that? The proper time of our zero-mass particle is zero—always and everywhere!?

Well… Yes. That’s why our zero-mass particle – as a point-like object – does not really exist. What we’re talking about is energy itself, and its propagation mechanism. 🙂

While I am sure that, by now, you’re very tired of my rambling, I beg you to read on. Frankly, if you got as far as you have, then you should really be able to work yourself through the rest of this post. 🙂 And I am sure that – if anything – you’ll find it stimulating! 🙂

The imaginary energy space

Look at the propagation mechanism for the electromagnetic wave in free space, which (for c = 1) is represented by the following two equations:

∂B/∂t = –∇×E
∂E/∂t = ∇×B

[In case you wonder, these are Maxwell’s equations for free space, so we have no stationary nor moving charges around.] See how similar this is to the two equations above? In fact, in my Deep Blue page, I use these two equations to derive the quantum-mechanical wavefunction for the photon (which is not the same as that hypothetical zero-mass particle I introduced above), but I won’t bother you with that here. Just note the so-called curl operator in the two equations above (∇×) can be related to the Laplacian we’ve used so far (∇²). It’s not the same thing, though: for starters, the curl operator operates on a vector quantity, while the Laplacian operates on a scalar (including complex scalars). But don’t get distracted now. Let’s look at the revised Schrödinger’s equation, i.e. the one without the 1/2 factor:

∂ψ/∂t = i·(ħ/m)·∇²ψ

On the left-hand side, we have a time derivative, so that’s a flow per second. On the right-hand side we have the Laplacian and the i·ħ/m factor. Now, written like this, Schrödinger’s equation really looks exactly the same as the general diffusion equation, which is written as: ∂φ/∂t = D·∇²φ, except for the imaginary unit, which makes it clear we’re getting two equations for the price of one here, rather than one only! 🙂 The point is: we may now look at that ħ/m factor as a diffusion constant, because it does exactly the same thing as the diffusion constant D in the diffusion equation ∂φ/∂t = D·∇²φ, i.e.:

As a constant of proportionality, it quantifies the relationship between both derivatives.
As a physical constant, it ensures the dimensions on both sides of the equation are compatible.

So the diffusion constant for Schrödinger’s equation is ħ/m. What is its dimension? That’s easy: (N·m·s)/(N·s²/m) = m²/s. [Remember: 1 N = 1 kg·m/s².] But then we multiply it with the Laplacian, so that’s something expressed per square meter, so we get something per second on both sides.

Of course, you wonder: what per second? Not sure. That’s hard to say. Let’s continue with our analogy with the heat diffusion equation so as to try to get a better understanding of what’s being written here. Let me give you that heat diffusion equation here. Assuming the heat per unit volume (q) is proportional to the temperature (T) – which is the case when expressing T in kelvin (K), so we can write q as q = k·T – we can write it as:

k·∂T/∂t = κ·∇²T

So that’s structurally similar to Schrödinger’s equation, and to the two equivalent equations we jotted down above. So we’ve got T (temperature) in the role of ψ here—or, to be precise, in the role of ψ ‘s real and imaginary part respectively. So what’s temperature? From the kinetic theory of gases, we know that temperature is not just a scalar: temperature measures the mean (kinetic) energy of the molecules in the gas. That’s why we can confidently state that the heat diffusion equation models an energy flow, both in space as well as in time.

Let me make the point by doing the dimensional analysis for that heat diffusion equation. The time derivative on the left-hand side (∂T/∂t) is expressed in K/s (Kelvin per second). Weird, isn’t it? What’s a Kelvin per second? Well… Think of a Kelvin as some very small amount of energy in some equally small amount of space—think of the space that one molecule needs, and its (mean) energy—and then it all makes sense, doesn’t it?

However, in case you find that a bit difficult, just work out the dimensions of all the other constants and variables. The constant in front (k) makes sense of it. That coefficient (k) is the (volume) heat capacity of the substance, which is expressed in J/(m³·K). So the dimension of the whole thing on the left-hand side (k·∂T/∂t) is J/(m³·s), so that’s energy (J) per cubic meter (m³) and per second (s). Nice, isn’t it? What about the right-hand side? On the right-hand side we have the Laplacian operator – i.e. ∇² = ∇·∇, with ∇ = (∂/∂x, ∂/∂y, ∂/∂z) – operating on T. The Laplacian operator, when operating on a scalar quantity, gives us a flux density, i.e. something expressed per square meter (1/m²). In this case, it’s operating on T, so the dimension of ∇²T is K/m². Again, that doesn’t tell us very much (what’s the meaning of a Kelvin per square meter?) but we multiply it by the thermal conductivity (κ), whose dimension is W/(m·K) = J/(m·s·K). Hence, the dimension of the product is the same as the left-hand side: J/(m³·s). So that’s OK again, as energy (J) per cubic meter (m³) and per second (s) is definitely something we can associate with an energy flow.

In fact, we can play with this. We can bring k from the left- to the right-hand side of the equation, for example. The dimension of κ/k is m²/s (check it!), and multiplying that by K/m² (i.e. the dimension of ∇²T) gives us some quantity expressed in Kelvin per second, and so that’s the same dimension as that of ∂T/∂t. Done!
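If you’d rather let a machine do the ‘check it!’ part, the dimensional bookkeeping can be sketched in a few lines. The helper functions below (dim, mul, div) are hypothetical names made up for this illustration, not part of any units library: a dimension is just a tuple of exponents over the SI base units kg, m, s and K.

```python
# Dimensions as exponent tuples over the SI base units (kg, m, s, K).
# The helpers below are a hypothetical illustration, not a units library.


def dim(kg=0, m=0, s=0, K=0):
    return (kg, m, s, K)


def mul(a, b):
    """Dimension of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))


def div(a, b):
    """Dimension of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))


JOULE = dim(kg=1, m=2, s=-2)
KELVIN = dim(K=1)
METER = dim(m=1)
SECOND = dim(s=1)
KILOGRAM = dim(kg=1)

heat_capacity = div(JOULE, mul(dim(m=3), KELVIN))            # k: J/(m³·K)
conductivity = div(JOULE, mul(mul(METER, SECOND), KELVIN))   # κ: J/(m·s·K)
dT_dt = div(KELVIN, SECOND)                                  # ∂T/∂t: K/s
laplacian_T = div(KELVIN, dim(m=2))                          # ∇²T: K/m²

lhs = mul(heat_capacity, dT_dt)                   # k·∂T/∂t
rhs = mul(conductivity, laplacian_T)              # κ·∇²T
diffusivity = div(conductivity, heat_capacity)    # κ/k
hbar_over_m = div(mul(JOULE, SECOND), KILOGRAM)   # ħ/m

print("k·∂T/∂t :", lhs)
print("κ·∇²T   :", rhs)
print("κ/k     :", diffusivity)
print("ħ/m     :", hbar_over_m)
```

Both sides of the heat equation come out as J/(m³·s), and κ/k and ħ/m both come out as m²/s, which is the dimension of a diffusion constant.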

In fact, we’ve got two different ways of writing Schrödinger’s diffusion equation. We can write it as ∂ψ/∂t = i·(ħ/m)·∇²ψ or, else, we can write it as ħ·∂ψ/∂t = i·(ħ²/m)·∇²ψ. Does it matter? I don’t think it does. The dimensions come out OK in both cases. However, interestingly, if we do a dimensional analysis of the ħ·∂ψ/∂t = i·(ħ²/m)·∇²ψ equation, we get joule on both sides. Interesting, isn’t it? The key question, of course, is: what is it that is flowing here?

I don’t have a very convincing answer to that, but the answer I have is interesting, I think. 🙂 Think of the following: we can multiply Schrödinger’s equation with whatever we want, and then we get all kinds of flows. For example, if we multiply both sides with 1/(m²·s) or 1/(m³·s), we get an equation expressing the energy conservation law, indeed! [And you may want to think about the minus sign of the right-hand side of Schrödinger’s equation now, because it makes much more sense now!]

We could also multiply both sides with s, so then we get J·s on both sides, i.e. the dimension of physical action (J·s = N·m·s). So then the equation expresses the conservation of action! Huh? Yes. Let me re-phrase that: then it expresses the conservation of angular momentum—as you’ll surely remember that the dimension of action and angular momentum are the same. 🙂

And then we can divide both sides by m (the meter here, not the mass!), so then we get N·s on both sides, so that’s momentum. So then Schrödinger’s equation embodies the momentum conservation law.

Isn’t it just wonderful? Schrödinger’s equation packs all of the conservation laws! The only catch is that it flows back and forth from the real to the imaginary space, using that propagation mechanism as described in those two equations.

Now that is really interesting, because it does provide an explanation – as fuzzy as it may seem – for all those weird concepts one encounters when studying physics, such as the tunneling effect, which amounts to energy flowing from the imaginary space to the real space and, then, inevitably, flowing back. It also allows for borrowing time from the imaginary space. Hmm… Interesting! [I know I still need to make these points much more formally, but… Well… You kinda get what I mean, don’t you?]

To conclude, let me re-baptize my real and imaginary ‘space’ by referring to them as what they really are: a real and imaginary energy space respectively. Although… Now that I think of it: it could also be real and imaginary momentum space, or a real and imaginary action space. Hmm… The latter term may be the best. 🙂

Isn’t this all great? I mean… I could go on and on—but I’ll stop here, so you can freewheel around yourself. For example, you may wonder how similar that energy propagation mechanism actually is as compared to the propagation mechanism of the electromagnetic wave? The answer is: very similar. You can check how similar in one of my posts on the photon wavefunction or, if you’d want a more general argument, check my Deep Blue page. Have fun exploring! 🙂

So… Well… That’s it, folks. I hope you enjoyed this post—if only because I really enjoyed writing it. 🙂

[…]

OK. You’re right. I still haven’t answered the fundamental question.

So what about the 1/2 factor?

What about that 1/2 factor? Did Schrödinger miss it? Well… Think about it for yourself. First, I’d encourage you to further explore that weird graph with the real and imaginary part of the wavefunction. I copied it below, but with an added 45º line—yes, the green diagonal. To make it somewhat more real, imagine you’re the zero-mass point-like particle moving along that line, and we observe you from our inertial frame of reference, using equivalent time and distance units.

So we’ve got that cosine (cosθ) varying as you travel, and we’ve also got the i·sinθ part of the wavefunction going while you’re zipping through spacetime. Now, THINK of it: the phase velocity of the cosine bit (i.e. the red graph) contributes as much to your lightning speed as the i·sinθ bit, doesn’t it? Should we apply Pythagoras’ basic r² = x² + y² Theorem here? Yes: the velocity vector along the green diagonal is going to be the sum of the velocity vectors along the horizontal and vertical axes. So… That’s great.

Yes. It is. However, we still have a problem here: it’s the velocity vectors that add up—not their magnitudes. Indeed, if we denote the velocity vector along the green diagonal as u, then we can calculate its magnitude as:

u = √u² = √[(v/2)² + (v/2)²] = √[2·(v²/4)] = √[v²/2] = v/√2 ≈ 0.7·v

So, as mentioned, we’re adding the vectors, but not their magnitudes. We’re somewhat better off than we were in terms of showing that the phase velocity of those sine and cosine velocities add up—somehow, that is—but… Well… We’re not quite there.

Fortunately, Einstein saves us once again. Remember we’re actually transforming our reference frame when working with the wavefunction? Well… Look at the diagram below (for which I thank the author)

special relativity

In fact, let me insert an animated illustration, which shows what happens when the velocity u goes up and down from (close to) −c to +c and back again. It’s beautiful, and I must credit the author here too. It sort of speaks for itself, but please do click the link as the accompanying text is quite illuminating. 🙂

Animated_Lorentz_Transformation

The point is: for our zero-mass particle, the x’ and t’ axis will rotate into the diagonal itself which, as I mentioned a couple of times already, represents the speed of light and, therefore, our zero-mass particle traveling at c. It’s obvious that we’re now adding two vectors that point in the same direction and, hence, their magnitudes just add without any square root factor. So, instead of u = √[(v/2)² + (v/2)²], we just have v/2 + v/2 = v! Done! We solved the phase velocity paradox! 🙂

So… I still haven’t answered that question. Should that 1/2 factor in Schrödinger’s equation be there or not? The answer is, obviously: yes. It should be there. And as for Schrödinger using the mass concept as it appears in the classical kinetic energy formula: K.E. = m·v²/2… Well… What other mass concept would he use? I probably got a bit confused with Feynman’s exposé – especially this notion of ‘choosing the zero point for the energy’ – but then I should probably just re-visit the thing and adjust the language here and there. But the formula is correct.

Thinking it all through, the ħ/2m constant in Schrödinger’s equation should be thought of as the reciprocal of m/(ħ/2). So what we’re doing basically is measuring the mass of our object in units of ħ/2, rather than units of ħ. That makes perfect sense, if only because it’s ħ/2, rather than ħ, that appears in the Uncertainty Relations Δx·Δp ≥ ħ/2 and ΔE·Δt ≥ ħ/2. In fact, in my post on the wavefunction of the zero-mass particle, I noted its elementary wavefunction should use the m = E = p = ħ/2 values, so it becomes ψ(x, t) = a·e^{−i∙[(ħ/2)∙t − (ħ/2)∙x]/ħ} = a·e^{−i∙[t − x]/2}.

Isn’t that just nice? 🙂 I need to stop here, however, because it looks like this post is becoming a book. Oh, and note that nothing of what I wrote above discredits my ‘hidden energy’ theory. On the contrary, it confirms it. In fact, the nice thing about those illustrations above is that they associate the imaginary component of our wavefunction with travel in time, while the real component is associated with travel in space. That makes our theory quite complete: the ‘hidden’ energy is the energy that moves time forward. The only thing I need to do is to connect it to that idea of action expressing itself in time or in space, cf. what I wrote on my Deep Blue page: we can look at the dimension of Planck’s constant, or at the concept of action in general, in two very different ways – from two different perspectives, so to speak:

[Planck’s constant] = [action] = N∙m∙s = (N∙m)∙s = [energy]∙[time]
[Planck’s constant] = [action] = N∙m∙s = (N∙s)∙m = [momentum]∙[distance]

Hmm… I need to combine that with the idea of the quantum vacuum, i.e. the mathematical space that’s associated with time and distance becoming countable variables…. In any case. Next time. 🙂

Before I sign off, however, let’s quickly check if our a·e^{−i∙[t − x]/2} wavefunction solves the Schrödinger equation:

∂ψ/∂t = −a·e^{−i∙[t − x]/2}·(i/2)
∇²ψ = ∂²[a·e^{−i∙[t − x]/2}]/∂x² = ∂[a·e^{−i∙[t − x]/2}·(i/2)]/∂x = −a·e^{−i∙[t − x]/2}·(1/4)

So the ∂ψ/∂t = i·(ħ/2m)·∇²ψ equation becomes:

−a·e^{−i∙[t − x]/2}·(i/2) = −i·(ħ/[2·(ħ/2)])·a·e^{−i∙[t − x]/2}·(1/4)

⇔ 1/2 = 1/4 !?

The damn 1/2 factor. Schrödinger wants it in his wave equation, but not in the wavefunction—apparently! So what if we take the m = E = p = ħ solution? We get:

∂ψ/∂t = −a·i·e^{−i∙[t − x]}
∇²ψ = ∂²[a·e^{−i∙[t − x]}]/∂x² = ∂[a·i·e^{−i∙[t − x]}]/∂x = −a·e^{−i∙[t − x]}

So the ∂ψ/∂t = i·(ħ/2m)·∇²ψ equation now becomes:

−a·i·e^{−i∙[t − x]} = −i·(ħ/[2·ħ])·a·e^{−i∙[t − x]}

⇔ 1 = 1/2 !?

We’re still in trouble! So… Was Schrödinger wrong after all? There’s no difficulty whatsoever with the ∂ψ/∂t = i·(ħ/m)·∇²ψ equation:

−a·e^{−i∙[t − x]/2}·(i/2) = −i·[ħ/(ħ/2)]·a·e^{−i∙[t − x]/2}·(1/4) ⇔ 1 = 1
−a·i·e^{−i∙[t − x]} = −i·(ħ/ħ)·a·e^{−i∙[t − x]} ⇔ 1 = 1
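These four checks can also be done numerically. The sketch below assumes natural units (ħ = 1), an amplitude a = 1 and an arbitrary sample point (x, t), all of which are illustrative choices. It uses finite differences to compute the coefficient D for which ∂ψ/∂t = i·D·∂²ψ/∂x² actually holds, so that D can be compared with ħ/m and with ħ/2m.

```python
import cmath

# Natural units (ħ = 1) as in the text; the amplitude a = 1 and the sample
# point (x, t) are arbitrary choices made for this illustration.


def psi(x, t, m):
    """Elementary wavefunction with E = p = m, i.e. e^{−i·m·(t − x)} for ħ = 1."""
    return cmath.exp(-1j * m * (t - x))


def required_coefficient(m, x=0.3, t=0.7, h=1e-4):
    """Return D such that ∂ψ/∂t = i·D·∂²ψ/∂x² holds at (x, t)."""
    dpsi_dt = (psi(x, t + h, m) - psi(x, t - h, m)) / (2 * h)
    d2psi_dx2 = (psi(x + h, t, m) - 2 * psi(x, t, m) + psi(x - h, t, m)) / h**2
    return dpsi_dt / (1j * d2psi_dx2)


for m in (0.5, 1.0):  # m = ħ/2 and m = ħ respectively
    D = required_coefficient(m)
    print("m =", m, "| D =", round(D.real, 6),
          "| ħ/m =", 1 / m, "| ħ/2m =", 1 / (2 * m))
```

In both cases D comes out as ħ/m, not as ħ/2m, which is the numerical counterpart of the ‘1 = 1’ lines above.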

What these equations might tell us is that we should measure mass, energy and momentum in terms of ħ (and not in terms of ħ/2) but that the fundamental uncertainty is ± ħ/2. That solves it all. So the magnitude of the uncertainty is ħ but it separates not 0 and ± 1, but −ħ/2 and +ħ/2. Or, more generally, the following series:

…, −7ħ/2, −5ħ/2, −3ħ/2, −ħ/2, +ħ/2, +3ħ/2, +5ħ/2, +7ħ/2,…

Why are we not surprised? The series represents the energy values that a spin one-half particle can possibly have, and ordinary matter – i.e. all fermions – is composed of spin one-half particles.

To conclude this post, let’s see if we can get any indication on the energy concepts that Schrödinger’s revised wave equation implies. We’ll do so by just calculating the derivatives in the ∂ψ/∂t = i·(ħ/m)·∇²ψ equation (i.e. the equation without the 1/2 factor). Let’s also not assume we’re measuring stuff in natural units, so our wavefunction is just what it is: a·e^{−i·[E·t − p∙x]/ħ}. The derivatives now become:

∂ψ/∂t = −a·i·(E/ħ)·e^{−i∙[E·t − p∙x]/ħ}
∇²ψ = ∂²[a·e^{−i∙[E·t − p∙x]/ħ}]/∂x² = ∂[a·i·(p/ħ)·e^{−i∙[E·t − p∙x]/ħ}]/∂x = −a·(p²/ħ²)·e^{−i∙[E·t − p∙x]/ħ}

So the ∂ψ/∂t = i·(ħ/m)·∇²ψ equation now becomes:

−a·i·(E/ħ)·e^{−i∙[E·t − p∙x]/ħ} = −i·(ħ/m)·a·(p²/ħ²)·e^{−i∙[E·t − p∙x]/ħ} ⇔ E = p²/m = m·v²

It all works like a charm. Note that we do not assume stuff like E = m = p here. It’s all quite general. Also note that the E = p²/m closely resembles the kinetic energy formula one often sees: K.E. = m·v²/2 = m·m·v²/(2m) = p²/(2m). We just don’t have the 1/2 factor in our E = p²/m formula, which is great—because we don’t want it! Of course, if you’d add the 1/2 factor in Schrödinger’s equation again, you’d get it back in your energy formula, which would just be that old kinetic energy formula which gave us all these contradictions and ambiguities. 😦
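The role of the 1/2 factor can be made explicit by carrying a generic numerical coefficient through the same substitution. The sketch below uses a hypothetical symbol α (not used anywhere else in this post) for that coefficient: α = 1 gives the revised equation, α = 1/2 the equation with the 1/2 factor.

```latex
\begin{aligned}
\psi &= a\,e^{-i(Et - px)/\hbar}, \qquad
\frac{\partial\psi}{\partial t} = -\,i\,\frac{E}{\hbar}\,\psi, \qquad
\nabla^2\psi = -\frac{p^2}{\hbar^2}\,\psi \\
\frac{\partial\psi}{\partial t} &= i\,\alpha\,\frac{\hbar}{m}\,\nabla^2\psi
\;\Longrightarrow\;
-\,i\,\frac{E}{\hbar}\,\psi = -\,i\,\alpha\,\frac{p^2}{m\hbar}\,\psi
\;\Longrightarrow\;
E = \alpha\,\frac{p^2}{m} \\
\alpha &= 1:\; E = \frac{p^2}{m} = m v^2, \qquad
\alpha = \tfrac{1}{2}:\; E = \frac{p^2}{2m} = \frac{m v^2}{2}
\end{aligned}
```

So whatever coefficient one puts in the wave equation shows up, unchanged, in the energy formula that the elementary wavefunction has to satisfy.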

Finally, and just to make sure: let me add that, when we wrote that E = m = p – like we did above – we mean their numerical values are the same. Their dimensions remain what they are, of course. Just to make sure you get that subtle point, we’ll do a quick dimensional analysis of that E = p²/m formula:

[E] = [p²/m] ⇔ N·m = N²·s²/kg = N²·s²/[N·s²/m] = N·m = joule (J)

So… Well… It’s all perfect. 🙂

Post scriptum: I revised my Deep Blue page after writing this post, and I think that a number of the ideas that I express above are presented more consistently and coherently there. In any case, the missing energy theory makes sense. Think of it: any oscillator involves both kinetic as well as potential energy, and they both add up to twice the average kinetic (or potential) energy. So why not here? When everything is said and done, our elementary wavefunction does describe an oscillator. 🙂

Schrödinger’s equation in action

Pre-script (dated 26 June 2020): Our ideas have evolved into a full-blown realistic (or classical) interpretation of all things quantum-mechanical. So no use to read this. Read my recent papers instead. 🙂

Original post:

This post is about something I promised to write about aeons ago: how do we get those electron orbitals out of Schrödinger’s equation? So let me write it now – for the simplest of atoms: hydrogen. I’ll largely follow Richard Feynman’s exposé on it: this text just intends to walk you through it and provide some comments here and there.

Let me first remind you of what that famous Schrödinger’s equation actually represents. In its simplest form – i.e. not including any potential, so then it’s an equation that’s valid for free space only—no force fields!—it reduces to:

i·ħ∙∂ψ/∂t = –(1/2)∙(ħ²/m_eff)∙∇²ψ

Note the enigmatic concept of the effective mass in it (m_eff), as well as the rather awkward 1/2 factor, which we may get rid of by re-defining it. We then write: m_eff^NEW = 2∙m_eff^OLD, and Schrödinger’s equation then simplifies to:

∂ψ/∂t + i∙(V/ħ)·ψ = i(ħ/m_eff)·∇²ψ
In free space (no potential): ∂ψ/∂t = i∙(ħ/m_eff)·∇²ψ

In case you wonder where the minus sign went, I just brought the imaginary unit to the other side. Remember 1/i = −i. 🙂

Now, in my post on quantum-mechanical operators, I drew your attention to the fact that this equation is structurally similar to the heat diffusion equation – or to any diffusion equation, really. Indeed, assuming the heat per unit volume (q) is proportional to the temperature (T) – which is the case when expressing T in kelvin (K), so we can write q as q = k·T – we can write the heat diffusion equation as:

k·∂T/∂t = κ·∇²T

Moreover, I noted the similarity is not only structural. There is more to it: both equations model energy flows. How exactly is something I wrote about in my e-publication on this, so let me refer you to that. Let’s jot down the complete equation once more:

∂ψ/∂t + i∙(V/ħ)·ψ = i(ħ/m_eff)·∇²ψ

In fact, it is rather surprising that Feynman drops the eff subscript almost immediately, so he just writes:

i·ħ·∂ψ/∂t = −(ħ²/2m)·∇²ψ + V·ψ

Let me first remind you that ψ is a function of position in space and time, so we write: ψ = ψ(x, y, z, t) = ψ(r, t), with (x, y, z) = r. And m, on the other side of the equation, is what it always was: the effective electron mass. Now, we talked about the subtleties involved before, so let’s not bother about the definition of the effective electron mass, or wonder where that factor 1/2 comes from here.

What about V? V is the potential energy of the electron: it depends on the distance (r) from the proton. We write: V = −e²/│r│ = −e²/r. Why the minus sign? Because we say the potential energy is zero at large distances (see my post on potential energy). Back to Schrödinger’s equation.

On the left-hand side, we have ħ, and its dimension is J·s (or N·m·s, if you want). So we multiply that with a time derivative and we get J, the unit of energy. On the right-hand side, we have Planck’s constant squared, the mass factor in the denominator, and the Laplacian operator – i.e. ∇² = ∇·∇, with ∇ = (∂/∂x, ∂/∂y, ∂/∂z) – operating on the wavefunction.

Let’s start with the latter. The Laplacian works just the same as for our heat diffusion equation: it gives us a flux density, i.e. something expressed per square meter (1/m²). The ħ² factor gives us J²·s². The mass factor makes everything come out alright, if we use the mass-equivalence relation, which says it’s OK to express the mass in J/(m/s)². [The mass of an electron is usually expressed as being equal to 0.5109989461(31) MeV/c². That unit uses the E = m·c² mass-equivalence formula. As for the eV, you know we can convert that into joule, which is a rather large unit, which is why we use the electronvolt as a measure of energy.] To make a long story short, we’re OK: (J²·s²)·[(m/s)²/J]·(1/m²) = J! Perfect. [As for the Vψ term, that’s obviously expressed in joule too.]

In short, Schrödinger’s equation expresses the energy conservation law too, and we may express it per square meter or per second or per cubic meter as well, if we’d wish: we can just multiply both sides by 1/m² or 1/s or 1/m³ or by whatever dimension you want. Again, if you want more detail on the Schrödinger equation as an energy propagation mechanism, read the mentioned e-publication. So let’s get back to our equation, which, taking into account our formula for V, now looks like this:

i·ħ·∂ψ/∂t = −(ħ²/2m)·∇²ψ − (e²/r)·ψ

Feynman then injects one of these enigmatic phrases—enigmatic for novices like us, at least!

“We want to look for definite energy states, so we try to find solutions which have the form: ψ (r, t) = e^{−(i/ħ)·E·t}·ψ(r).”

At first, you may think he’s just trying to get rid of the relativistic correction in the argument of the wavefunction. Indeed, as I explain in that little booklet of mine, the –(p/ħ)·x term in the argument of the elementary wavefunction a·e^{−i·θ} = a·e^{−i·[(E/ħ)·t – (p/ħ)·x]} is there because the young Comte Louis de Broglie, back in 1924, when he wrote his groundbreaking PhD thesis, suggested the θ = ω∙t – k∙x = (E∙t – p∙x)/ħ formula for the argument of the wavefunction, as he knew that relativity theory had already established the invariance of the four-vector (dot) product p_μ·x_μ = E∙t – p∙x = p_μ’·x_μ’ = E’∙t’ – p’∙x’. [Note that Planck’s constant, as a physical constant, should obviously not depend on the reference frame either. Hence, if the E∙t – p∙x product is invariant, so is (E∙t – p∙x)/ħ.] So the θ = E∙t – p∙x and the θ = E₀∙t’ = E’·t’ are fully equivalent. Using lingo, we can say that the argument of the wavefunction is a Lorentz scalar and, therefore, invariant under a Lorentz boost. Sounds much better, doesn’t it? 🙂

But… Well. That’s not why Feynman says what he says. He just makes abstraction of uncertainty here, as he looks for states with a definite energy, indeed. Nothing more, nothing less. Indeed, you should just note that we can re-write the elementary a·e^{−i[(E/ħ)·t – (p/ħ)·x]} function as e^{−(i/ħ)·E·t}·a·e^{i·(p/ħ)·x}. So that’s what Feynman does here: he just eases the search for functional forms that satisfy Schrödinger’s equation. You should note the following:

Writing the coefficient in front of the complex exponential as ψ(r) = a·e^{i·(p/ħ)·x} does the trick we want it to do: we do not want that coefficient to depend on time: it should only depend on the size of our ‘box’ in space, as I explained in one of my posts.
Having said that, you should also note that the ψ in the ψ(r, t) function and the ψ in the ψ(r) denote two different beasts: one is a function of two variables (r and t), while the other makes abstraction of the time factor and, hence, becomes a function of one variable only (r). I would have used another symbol for the ψ(r) function, but then the Master probably just wants to test your understanding. 🙂

In any case, the differential equation we need to solve now becomes:

−(ħ²/2m)·∇²ψ = (E + e²/r)·ψ

Huh? How does that work? Well… Just take the time derivative of e^{−(i/ħ)·E·t}·ψ(r), multiply with the i·ħ in front of that term in Schrödinger’s original equation and re-arrange the terms. [Just do it: ∂[e^{−(i/ħ)·E·t}·ψ(r)]/∂t = −(i/ħ)·E·e^{−(i/ħ)·E·t}·ψ(r). Now multiply that with i·ħ: the ħ factor cancels and the minus disappears because i² = −1.]
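Written out in full, the substitution goes as follows. This is a sketch that assumes the starting point is the time-dependent equation with the 1/2 factor and with V = −e²/r, and it writes Ψ(r, t) for the full wavefunction (a symbol introduced here only to keep it apart from the time-independent ψ(r)):

```latex
\begin{aligned}
i\hbar\,\frac{\partial \Psi}{\partial t} &= -\frac{\hbar^2}{2m}\,\nabla^2\Psi - \frac{e^2}{r}\,\Psi,
\qquad \Psi(\mathbf{r}, t) = e^{-(i/\hbar)Et}\,\psi(\mathbf{r}) \\
i\hbar\,\frac{\partial \Psi}{\partial t} &= i\hbar\cdot\left(-\frac{i}{\hbar}E\right) e^{-(i/\hbar)Et}\,\psi(\mathbf{r})
= E\,e^{-(i/\hbar)Et}\,\psi(\mathbf{r}) \\
E\,e^{-(i/\hbar)Et}\,\psi &= e^{-(i/\hbar)Et}\left[-\frac{\hbar^2}{2m}\,\nabla^2\psi - \frac{e^2}{r}\,\psi\right] \\
\Rightarrow\quad -\frac{\hbar^2}{2m}\,\nabla^2\psi &= \left(E + \frac{e^2}{r}\right)\psi
\end{aligned}
```

The time-dependent exponential appears on both sides and never vanishes, so it can be divided out: what remains is an equation in r only.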

So now we need to solve that differential equation, i.e. we need to find functional forms for ψ – and please do note we’re talking ψ(r) here – not ψ(r, t)! – that satisfy the above equation. Interesting question: is our equation still Schrödinger’s equation? Well… It is and it isn’t. Any linear combination of the definite energy solutions we find will also solve Schrödinger’s equation, but we limited the solution set here to those definite energy solutions only. Hence, it’s not quite the same equation. We removed the time dependency here – and in a rather interesting way, I’d say.

The next thing to do is to switch from Cartesian to polar coordinates. Why? Well… When you have a central-force problem – like this one (because of the potential) – it’s easier to solve them using polar coordinates. In fact, because we’ve got three dimensions here, we’re actually talking a spherical coordinate system. The illustration and formulas below show how spherical and Cartesian coordinates are related:

x = r·sinθ·cosφ; y = r·sinθ·sinφ; z = r·cosθ

558px-3D_Spherical
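For those who like to play with the transformation formulas, here is a minimal sketch of the conversion in both directions. The function names are made up for this illustration, and the inverse transformation assumes the point is not the origin (where θ and φ are not defined).

```python
import math


def spherical_to_cartesian(r, theta, phi):
    """theta is the polar angle (measured from the z-axis), phi the azimuthal angle."""
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return x, y, z


def cartesian_to_spherical(x, y, z):
    """Inverse transformation; assumes (x, y, z) is not the origin."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.acos(z / r)
    phi = math.atan2(y, x)
    return r, theta, phi


point = spherical_to_cartesian(2.0, math.pi / 3, math.pi / 4)
print("Cartesian:", point)
print("Back to spherical:", cartesian_to_spherical(*point))
```

Going there and back again should return the r, θ and φ we started from, up to rounding errors.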

As you know, θ (theta) is referred to as the polar angle, while φ (phi) is the azimuthal angle, and the coordinate transformation formulas can be easily derived. The rather simple differential equation above now becomes the following monster:

(1/r)·∂²(r·ψ)/∂r² + (1/r²)·{(1/sinθ)·∂[sinθ·∂ψ/∂θ]/∂θ + (1/sin²θ)·∂²ψ/∂φ²} = −(2m/ħ²)·(E + e²/r)·ψ

Huh? Yes, I am very sorry. That’s how it is. Feynman does this to help us. If you think you can get to the solutions by directly solving the equation in Cartesian coordinates, please do let me know. 🙂 To tame the beast, we might first look for solutions that are spherically symmetric, i.e. solutions that do not depend on θ and φ. That means we could rotate the reference frame and none of the amplitudes would change. That means the ∂ψ/∂θ and ∂ψ/∂φ (partial) derivatives in our formula are equal to zero. These spherically symmetric states, or s-states as they are referred to, are states with zero (orbital) angular momentum, but you may want to think about that statement before accepting it. 🙂 [It’s not that there’s no angular momentum (on the contrary: there’s lots of it), but the total angular momentum should obviously be zero, and so that’s what is meant when these states are denoted as l = 0 states.] So now we have to solve:

de 3

Now that looks somewhat less monstrous, but Feynman still fills two rather dense pages to show how this differential equation can be solved. It’s not only tedious but also complicated, so please check it yourself by clicking on the link. One of the steps is a switch in variables, or a re-scaling, I should say. Both E and r are now measured as follows:

r = (ħ²/(m·e²))·ρ and E = (m·e⁴/(2·ħ²))·ε

The complicated-looking factors are just the Bohr radius (r_B = ħ²/(m·e²) ≈ 0.529 Å) and the Rydberg energy (E_R = m·e⁴/(2·ħ²) ≈ 13.6 eV). We calculated those a long time ago using a rather heuristic model to describe an atom. In case you’d want to check the dimensions, note e² is a rather special animal. It’s got nothing to do with Euler’s number. Instead, e² is equal to k_e·q_e², and the k_e here is Coulomb’s constant: k_e = 1/(4πε₀). This allows us to re-write the force between two electrons as a function of the distance: F = e²/r². This, in turn, explains the rather weird dimension of e²: [e²] = N·m² = J·m. But I am digressing too much. The bottom line is: the various energy levels that fit the equation, i.e. the allowable energies, are fractions of the Rydberg energy, i.e. E_R = m·e⁴/(2·ħ²). To be precise, the formula for the n-th energy level is:

E_n = −E_R/n².
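As a quick sanity check on those numbers, the sketch below evaluates r_B, E_R and the first few E_n in SI units, with e² = q_e²/(4πε₀) as explained above. The constants are rounded CODATA values that I typed in myself, and the variable names are just illustrative.

```python
import math

# Rounded CODATA values in SI units (typed in by hand, so treat as approximate)
HBAR = 1.054571817e-34    # J·s
M_E = 9.1093837015e-31    # electron mass, kg
Q_E = 1.602176634e-19     # elementary charge, C
EPS_0 = 8.8541878128e-12  # electric constant, F/m

e2 = Q_E**2 / (4 * math.pi * EPS_0)   # the 'special animal' e², in J·m
r_B = HBAR**2 / (M_E * e2)            # Bohr radius, in m
E_R = M_E * e2**2 / (2 * HBAR**2)     # Rydberg energy, in J
E_R_eV = E_R / Q_E                    # same energy, in eV

def energy_level_eV(n):
    """E_n = −E_R/n², in electronvolt."""
    return -E_R_eV / n**2

print(r_B * 1e10, "Å")
print(E_R_eV, "eV")
for n in (1, 2, 3):
    print(n, energy_level_eV(n))
```

The printed radius and energy should land close to the values quoted above, and the energy levels get closer to zero (from below) as n increases.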

The interesting thing is that the spherically symmetric solutions yield real-valued ψ(r) functions. The solutions for n = 1, 2, and 3, and their graphs, are given below.

graph

As Feynman writes, all of the wave functions approach zero rapidly for large r (also, confusingly, denoted as ρ) after oscillating a few times, with the number of ‘bumps’ equal to n. Of course, you should note that you should put the time factor back in in order to correctly interpret these functions. Indeed, remember how we separated them when we wrote:

ψ(r, t) = e^{−i·(E/ħ)·t}·ψ(r)

We might say the ψ(r) function is sort of an envelope function for the whole wavefunction, but it’s not quite as straightforward as that. However, I am sure you’ll figure it out.
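To see what putting the time factor back in does, here is a small sketch. It assumes natural units (ħ = 1) and made-up values for E and for ψ(r) at some fixed r, so the numbers mean nothing by themselves. The point is only that the real and imaginary parts oscillate in time, while the absolute square stays put.

```python
import cmath

HBAR = 1.0    # assumption: natural units
E = 2.0       # hypothetical energy of the definite-energy state
PSI_R = 0.3   # hypothetical (real) value of ψ(r) at some fixed r

def psi(t):
    """Full wavefunction ψ(r, t) = exp(−i·(E/ħ)·t)·ψ(r) at that fixed r."""
    return cmath.exp(-1j * (E / HBAR) * t) * PSI_R

for t in (0.0, 0.5, 1.0):
    value = psi(t)
    # real part, imaginary part, and absolute square at time t
    print(t, round(value.real, 4), round(value.imag, 4), round(abs(value) ** 2, 4))
```

So the real-valued ψ(r) fixes the magnitude at each point, and the time factor just makes the complex value go around and around.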

States with an angular dependence

So far, so good. But what if those partial derivatives are not zero? Now the calculations become really complicated. Among other things, we need these transformation matrices for rotations, which we introduced a very long time ago. As mentioned above, I don’t have the intention to copy Feynman here, who needs another two or three dense pages to work out the logic. Let me just state the grand result:

We’ve got a whole range of definite energy states, which correspond to orbitals that form an orthonormal basis for the actual wavefunction of the electron.
The orbitals are characterized by three quantum numbers, denoted as l, n and m respectively:
- The l is the quantum number of (total) angular momentum, and it’s equal to 0, 1, 2, 3, etcetera. [Of course, as usual, we’re measuring in units of ħ.] The l = 0 states are referred to as s-states, the l = 1 states are referred to as p-states, and the l = 2 states are d-states. They are followed by f, g, h, etcetera – for no particularly good reason. [As Feynman notes: “The letters don’t mean anything now. They did once – they meant “sharp” lines, “principal” lines, “diffuse” lines and “fundamental” lines of the optical spectra of atoms. But those were in the days when people did not know where the lines came from. After f there were no special names, so we now just continue with g, h, and so on.”]
- The m is referred to as the ‘magnetic’ quantum number, and it ranges from −l to +l.
- The n is the ‘principal’ quantum number, and it goes from l + 1 to infinity (∞).
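The three rules above are easy to turn into a counting exercise. The sketch below (the function name and the letter string are my own choices) lists all (n, l, m) combinations for a given n, using the fact that n ≥ l + 1 is the same as l ≤ n − 1.

```python
LETTERS = "spdfgh"   # spectroscopic letters for l = 0, 1, 2, 3, 4, 5

def orbitals(n):
    """All (n, l, m) triples allowed for principal quantum number n."""
    return [(n, l, m) for l in range(n) for m in range(-l, l + 1)]

for n in (1, 2, 3):
    states = orbitals(n)
    labels = sorted({f"{n}{LETTERS[l]}" for (_, l, _) in states})
    print(n, len(states), labels)
```

For each n we get n² combinations: 1, 4 and 9 for n = 1, 2 and 3.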

What do these things actually look like? Let me insert two illustrations here: one from Feynman, and the other from Wikipedia.

shape

The number in front just tracks the number of s-, p-, d-, etc. orbital. The shaded region shows where the amplitudes are large, and the plus and minus signs show the relative sign of the amplitude. [See my remark above on the fact that the ψ factor is real-valued, even if the wavefunction as a whole is complex-valued.] The Wikipedia image shows the same density plots but, as it was made some 50 years later, with some more color. 🙂

660px-Hydrogen_Density_Plots

This is it, guys. Feynman takes it further by also developing the electron configurations for the next 35 elements in the periodic table but… Well… I am sure you’ll want to read the original here, rather than my summaries. 🙂

Congrats! We now know all we need to know. All that remains is lots of practical exercises, so you can be sure you master the material for your exam. 🙂

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

Freewheeling…

In my previous post, I copied a simple animation from Wikipedia to show how one can move from Cartesian to polar coordinates. It’s really neat. Just watch it a few times to appreciate what’s going on here.

Cartesian_to_polar

First, the function is being inverted, so we go from y = f(x) to x = g(y) with g = f⁻¹. In this case, we know that if y = sin(6x) + 2 (that’s the function above), then x = (1/6)·arcsin(y – 2). [Note the troublesome convention to denote the inverse function by the -1 superscript: it’s troublesome because that superscript is also used for a reciprocal, and f⁻¹ has, obviously, nothing to do with 1/f. In any case, let’s move on.] So we swap the x-axis for the y-axis, and vice versa. To be precise, we reflect them about the diagonal. In fact, we’re reflecting the whole space here, including the graph of the function. Note that, in three-dimensional space, this reflection can also be looked at as a rotation – again, of all space, including the graph and the axes – by 180 degrees. The axis of rotation is, obviously, the same diagonal. [I like how the animation visualizes this. Neat! It made me think!]

Of course, if we swap the axes, then the domain and the range of the function get swapped too. Let’s see how that works here: x goes from −π to +π, so that’s one full turn of the angle (but one that starts from −π and goes to +π, rather than from 0 to 2π), during which sin(6x) completes six cycles. Meanwhile, y ranges between 1 and 3. [Whatever its argument, the sine function always yields a value between −1 and +1, but we add 2 to every value it takes, so we get the [1, 3] interval now.] After swapping the x- and y-axis, the angle, i.e. the interval between −π and +π, is now on the vertical axis. That’s clear enough. So far so good. 🙂 The operation that follows, however, is a much more complicated transformation of space and, therefore, much more interesting.

The transformation bends the graph around the origin so its head and tail meet. That’s easy to see. What’s a bit more difficult to understand is how the coordinate axes transform. I had to look at the animation several times – so please do the same. Note how this transformation wraps all of the vertical lines around a circle, and how the radius of those circles depends on the distance of those lines from the origin (as measured along the horizontal axis). What about the vertical axis? The animation is somewhat misleading here, as it gives the impression we’re first making another circle out of it, which we then sort of shrink—all the way down to a circle with zero radius! So the vertical axis becomes the origin of our new space. However, there’s no shrinking really. What happens is that we also wrap it around a circle—but one with zero radius indeed!

It’s a very weird operation because we’re dealing with a non-linear transformation here (unlike rotation or reflection) and, therefore, we’re not familiar with it. Even weirder is what happens to the horizontal axis: somehow, this axis becomes an infinite disc, so the distance out is now measured from the center outwards. I should figure out the math here, but that’s for later. The point is: the r = sin(6θ) + 2 function in the final graph (i.e. the curve that looks like a petaled flower) is the same as that y = sin(6x) + 2 curve, so y = r and x = θ, and so we can write what’s written above: r(θ) = sin(6·θ) + 2.
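For those who prefer numbers over animations, here is a rough sketch of that wrapping operation. It samples the r(θ) = sin(6·θ) + 2 curve for θ between −π and +π and maps every (θ, r) pair to a point in the plane. The function names and the half-degree step are my own choices.

```python
import math

def r_of_theta(theta):
    """The curve from the animation: r(θ) = sin(6·θ) + 2."""
    return math.sin(6 * theta) + 2

def to_plane(theta):
    """Wrap the (θ, r) pair around the origin: angle θ, distance r."""
    r = r_of_theta(theta)
    return r * math.cos(theta), r * math.sin(theta)

STEPS = 720   # half-degree steps from −π to +π
thetas = [-math.pi + 2 * math.pi * k / STEPS for k in range(STEPS + 1)]
radii = [r_of_theta(t) for t in thetas]
head, tail = to_plane(thetas[0]), to_plane(thetas[-1])

print(min(radii), max(radii))   # the distance from the origin stays within [1, 3]
print(head, tail)               # first and last point of the wrapped curve
```

The first and last point coincide (up to rounding), which is just the numerical version of saying that the head and tail of the graph meet.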

You’ll say: nice, but so what? Well… When I saw this animation, my first reaction was: what if the x and y would be time and space respectively? You’ll say: what space? Well… Just space: three-dimensional space. So think of one of the axes as packing three dimensions really, or three directions—like what’s depicted below. Now think of some point-like object traveling through spacetime, as shown below. It doesn’t need to be point-like, of course—just small enough so we can represent its trajectory by a line. You can also think of the movement of its center-of-mass if you don’t like point-like assumptions. 🙂

trajectory

Of course, you’ll immediately say the trajectory above is not kosher, as our object travels back in time in three sections of this ‘itinerary’.

You’re right. Let’s correct that. It’s easy to see how we should correct it. We just need to ensure the itinerary is a well-defined function, which isn’t the case with the function above: for one value of t, we have only one value of x everywhere—except where we allow our particle to travel back in time. So… Well… We shouldn’t allow that. The concept of a well-defined function implies we need to choose one direction in time. 🙂 That’s neat, because this gives us an explanation for the unique direction of time without having to invoke entropy or other macro-concepts. So let’s replace that thing above by something more kosher traveling in spacetime, like the thing below.

trajectory 2

Now think of wrapping that around some circle. We’d get something like below. [Don’t worry about the precise shape of the graph, as I made up a new one. Note the remark on the need to have a well-behaved function applies here too!]

trajectory 4

Neat, you’ll say, but so what? All we’ve done so far is show that we can represent some itinerary in spacetime in two different ways. In the first representation, we measure time along some linear axis, while, in the second representation, time becomes some angle – an angle that increases, counter-clockwise. To put it differently: time becomes an angular velocity.

Likewise, the spatial dimension was a linear feature in the first representation, while in the second we think of it as some distance measured from some zero point. Well… In fact… No. That’s not correct. The r above has got nothing to do with the distance traveled: the distance traveled would need to be measured along the curve.

Hmm… What’s the deal here?

Frankly, I am not sure. Now that I look at it once more, I note that the exercise with our graph above involved one cycle of a periodic function only—so it’s really not like some object traveling in spacetime, because that’s not a periodic thing. But… Well… Does that matter all that much? It’s easy to imagine how our new representation would just involve some thing that keeps going around and around, as illustrated below.

trajectory 5

So, in this representation, any movement in spacetime – regular or irregular – does become something periodic. But what is periodic here? My first answer is the simplest and, hence, probably the correct one: it’s just time. Time is the periodic thing here.

Having said that, I immediately thought of something else that’s periodic: the wavefunction that’s associated with this object – any object traveling in spacetime, really – is periodic too. So my gut instinct tells me there’s something here that we might want to explore further. 🙂 Could we replace the function for the trajectory with the wavefunction?

Huh? Yes. The wavefunction also associates a value with each x and t, although the association is a bit more complex – literally, because we’ll associate it with two periodic functions: the real part and the imaginary part of the (complex-valued) wavefunction. But for the rest, no problem, I’d say. Remember our wavefunction, when squared, represents the probability of our object being there. [I should say “absolute-squared” rather than squared, but that sounds so weird.]

But… Yes? Well… Don’t we get in trouble here because the same complex number (i.e. r·e^{i·θ} = x + i·y) may be related to two points in spacetime – as shown in the example above? My answer is the same: I don’t think so. It’s the same thing: our new representation implies stuff keeps going around and around in it. In fact, that just captures the periodicity of the wavefunction. So… Well… It’s fine. 🙂

The more important question is: what can we do with this new representation? Here I do not have any good answer. Nothing much for the moment. I just wanted to jot it down, because it triggers some deep thoughts—things I don’t quite understand myself, as yet.

First, I like the connection between a regular trajectory in spacetime – as represented by a well-defined function – and the unique direction in time it implies. It’s a simple thing: we know something can travel in any direction in space – forward, backwards, sideways, whatever – but time has one direction only. At least we can see why now: both in Cartesian as well as polar coordinates, we’d want to see a well-behaved function. 🙂 Otherwise we couldn’t work with it.

Another thought is the following. We associate the momentum of a particle with a linear trajectory in spacetime. But what’s linear in curved spacetime? Remember how we struggle to represent – or imagine, I would say – curved spacetime, as evidenced by the fact that most illustrations of curved spacetime represent a two-dimensional space in three-dimensional Cartesian space? Think of the typical illustration, like that rubber surface with the ball deforming it.

That’s why this transformation of a Cartesian coordinate space into a polar coordinate space is such an interesting exercise. We now measure distance along the circle. [Note that we suddenly need to keep track of the number of rotations, which we can do by keeping track of time, as time units become some angle, and linear speed becomes angular speed.] The whole thing underscores, in my view, that it’s only our mind that separates time and space: the reality of the object is just its movement or velocity – and that’s one movement.

My gut instinct tells me that this is what the periodicity of the wavefunction (or its component waves, I should say) captures, somehow. If the movement is linear, it’s linear both in space as well as in time, so to speak:

As a mental construct, time is always linear – it goes in one direction (and we think of the clock being regular, i.e. not slowing down or speeding up) – and, hence, the mathematical qualities of the time variable in the wavefunction are the same as those of the position variable: it’s a factor in one of its two terms. To be precise, it appears as the t in the E·t term in the argument θ = E·t – p·x. [Note the minus sign appears because we measure angles counter-clockwise when using polar coordinates or complex numbers.]
The trajectory in space is also linear – whether or not space is curved because of the presence of other masses.

OK. I should conclude here, but I want to take this conversation one step further. Think of the two graphs below as representing some oscillation in space. Some object that goes back and forth in space: it accelerates and decelerates – and reverses direction. Imagine the g-forces on it as it does so: if you’d be traveling with that object, you would sure feel it’s going back and forth in space! The graph on the left-hand side is our usual perspective on stuff like this: we measure time using some steadily ticking clock, and so the seconds, minutes, hours, days, etcetera just go by.

graph 1

The graph on the right-hand side applies our inversion technique. But, frankly, it’s the same thing: it doesn’t give us any new information. It doesn’t look like a well-behaved function but it actually is. It’s just a matter of mathematical convention: if we’d be used to looking at the y-axis as the independent variable (rather than the dependent variable), the function would be acceptable.

This leads me to the idea I started to explore in my previous post, and that’s to try to think of wavefunctions as oscillations of spacetime, rather than oscillations in spacetime. I inserted the following graph in that post—but it doesn’t say all that much, as it suggests we’re doing the same thing here: we’re just swapping axes. The difference is that the θ in the first graph now combines both time and space. We might say it represents spacetime itself. So the wavefunction projects it into some other ‘space’, i.e. the complex space. And then in the second graph, we reflect the whole thing.

dependent independent

So the idea is the following: our functions sort of project one ‘space’ into another ‘space’. In this case: the wavefunction sort of transforms spacetime – i.e. what we like to think of as the ‘physical’ space – into a complex space – which is purely mathematical.

Hmm… This post is becoming way too long, so I need to wrap it up. Look at the graph below, and note the dimension of the axes. We’re looking at an oscillation once more, but an oscillation of time this time around.

Graph 2

Huh? Yes. Imagine that, for some reason, you don’t feel those g-forces while going up and down in space: it’s the rest of the world that’s moving. You think you’re stationary or – what amounts to the same according to the relativity principle – moving in a straight line at constant velocity. The only way you could explain the rest of the world moving back and forth, accelerating and decelerating, is that time itself is oscillating: objects reverse their direction for no apparent reason – so that’s time reversal – and they do so at varying speeds, so we’ve got a clock going wild!

You’ll nod your head in agreement now and say: that’s Einstein’s intuition in regard to the gravitational force. There’s no force really: mass just bends spacetime in such a way that a planet in orbit follows a straight line, in a curved spacetime continuum. What I am saying here is that there must be ways to think of the electromagnetic force in exactly the same way. If the accelerations and decelerations of an electron moving in some orbital would really be due to an electromagnetic force in the classical picture of a force (i.e. something pulling or pushing), then it would radiate energy away. We know it doesn’t do that – because otherwise it would spiral down into the nucleus itself. So I’ve been thinking it must be traveling in its own curved spacetime, but then it’s curved because of the electromagnetic force – obviously, as that’s the force that’s much more relevant at this scale.

The underlying thought is simple enough: if gravity curves spacetime, why can’t we look at the other forces as doing the same? Why can’t we think of any force coming ‘with its own space’, so to say? The difference between the various forces is the curvature – which will, obviously, be much more complex (literally) for the electromagnetic force. Just think of the other forces as curving space in more than one dimension. 🙂

I am sure you’ll think I’ve just gone crazy. Perhaps. In any case, I don’t care too much. As mentioned, because the electromagnetic force is different—we don’t have negative masses attracting positive masses when discussing gravity—it’s going to be a much weirder type of curvature, but… Well… That’s probably why we need those ‘two-dimensional’ complex numbers when discussing quantum mechanics! 🙂 So we’ve got some more mathematical dimensions, but the physical principle behind all forces should be the same, no? All forces are measured using Newton’s Law, so we relate them to the motion of some mass. The principle is simple: if force is related to the change in motion of a mass, then the trajectory in the space that’s related to that force will be linear if the force is not acting.

So… Well… Hmm… What?

All of what I write above is a bit of a play with words, isn’t it? An oscillation of spacetime – but then spacetime must oscillate in something else, doesn’t it? So in what then is it oscillating?

Great question. You’re right. It must be oscillating in something else or, to be precise, we need some other reference space so as to define what we mean by an oscillation of spacetime. That space is going to be some complex mathematical space—and I use complex both in its mathematical as well as in its everyday meaning here (complicated). Think of, for example, that x-axis representing three-dimensional space. We’d have something similar here: dimensions within dimensions.

There are some great videos on YouTube that illustrate how one can turn a sphere inside out without punching a hole in it. That’s basically what we’re talking about here: it’s more than just switching the range for the domain of a function, which we can do by that reflection – or mirroring – using the 45º line. Conceptually, it’s really like turning a sphere inside out. Think of the surface of the curve connecting the two spaces.

Huh? Yes. But… Well… You’re right. Stuff like this is for the graduate level, I guess. So I’ll let you think about it—and do watch the videos that follow it. 🙂

In any case, I have to stop my wandering about here. Rather than wrapping up, however, I thought of something else yesterday—and so I’ll quickly jot that down as well, so I can re-visit it some other time. 🙂

Some other thinking on the Uncertainty Principle

I wanted to jot down something else too here. Something about the Uncertainty Principle once more. In my previous post, I noted we should think of Planck’s constant as expressing itself in time or in space, as we have two ways of looking at the dimension of Planck’s constant:

[Planck’s constant] = [ħ] = N∙m∙s = (N∙m)∙s = [energy]∙[time]
[Planck’s constant] = [ħ] = N∙m∙s = (N∙s)∙m = [momentum]∙[distance]

The bracket symbols [ and ] mean: ‘the dimension of what’s between the brackets’. Now, this may look like kids stuff, but the idea is quite fundamental: we’re thinking here of some amount of action (ħ, i.e. the quantum of action) expressing itself in time or, alternatively, expressing itself in space, indeed. In the former case, some amount of energy is expended during some time. In the latter case, some momentum is expended over some distance. We also know ħ can be written in terms of fundamental units, which are referred to as Planck units:

ħ = F_P∙l_P∙t_P = Planck force unit × Planck distance unit × Planck time unit

Finally, we thought of the Planck distance unit and the Planck time unit as the smallest units of time and distance possible. As such, they become countable variables, so we’re talking of a trajectory in terms of discrete steps in space and time here, or discrete states of our particle. As such, the E·t and p·x in the argument (θ) of the wavefunction—remember: θ = (E/ħ)·t − (p/ħ)·x—should be some multiple of ħ as well. We may write:

E·t = m·ħ and p·x = n·ħ, with m and n both positive integers

Of course, there’s uncertainty: Δp·Δx ≥ ħ/2 and ΔE·Δt ≥ ħ/2. Now, if Δx and Δt also become countable variables, so Δx and Δt can only take on values like ±1, ±2, ±3, ±4, etcetera, then we can think of trying to model some kind of random walk through spacetime, combining various values for n and m, as well as various values for Δx and Δt. The relation between E and p, and the related difference between m and n, should determine in what direction our particle should be moving even if it can go along different trajectories. In fact, Feynman’s path integral formulation of quantum mechanics tells us it’s likely to move along different trajectories at the same time, with each trajectory having its own amplitude. Feynman’s formulation uses continuum theory, of course, but a discrete analysis – using a random walk approach – should yield the same result because, when everything is said and done, the fact that physics tells us time and space must become countable at some scale (the Planck scale), suggests that continuum theory may not represent reality, but just be an approximation: a limiting situation, in other words.

Hmmm… Interesting… I’ll need to do something more with this. Unfortunately, I have little time over the coming weeks. Again, I am just writing it down to re-visit it later—probably much later. 😦

The wavefunction as an oscillation of spacetime

Original post:

You probably heard about the experimental confirmation of the existence of gravitational waves by Caltech’s LIGO Lab. Besides further confirming our understanding of the Universe, I also like to think it confirms that the elementary wavefunction represents a propagation mechanism that is common to all forces. However, the fundamental question remains: what is the wavefunction? What are those real and imaginary parts of those complex-valued wavefunctions describing particles and/or photons? [In case you think photons have no wavefunction, see my post on it: it’s fairly straightforward to re-formulate the usual description of an electromagnetic wave (i.e. the description in terms of the electric and magnetic field vectors) in terms of a complex-valued wavefunction. To be precise, in the mentioned post, I showed an electromagnetic wave can be represented as the sum of two wavefunctions whose components reflect each other through a rotation by 90 degrees.]

So what? Well… I’ve started to think that the wavefunction may not only describe some oscillation in spacetime. I’ve started to think the wavefunction—any wavefunction, really (so I am not talking gravitational waves only)—is nothing but an oscillation of spacetime. What makes them different is the geometry of those wavefunctions, and the coefficient(s) representing their amplitude, which must be related to their relative strength—somehow, although I still have to figure out how exactly.

Huh? Yes. Maxwell, after jotting down his equations for the electric and magnetic field vectors, wrote the following back in 1862: “The velocity of transverse undulations in our hypothetical medium, calculated from the electromagnetic experiments of MM. Kohlrausch and Weber, agrees so exactly with the velocity of light calculated from the optical experiments of M. Fizeau, that we can scarcely avoid the conclusion that light consists in the transverse undulations of the same medium which is the cause of electric and magnetic phenomena.”

We now know there is no medium – no aether – but physicists still haven’t answered the most fundamental question: what is it that is oscillating? No one has gone beyond the abstract math. I dare to say now that it must be spacetime itself. In order to prove this, I’ll have to study Einstein’s general theory of relativity. But this post will already cover some basics.

The quantum of action and natural units

We can re-write the quantum of action in natural units, which we’ll call Planck units for the time being. They may or may not be the Planck units you’ve heard about, so just think of them as being fundamental, or natural—for the time being, that is. You’ll wonder: what’s natural? What’s fundamental? Well… That’s the question we’re trying to explore in this post, so… Well… Just be patient… 🙂 We’ll denote those natural units as F_P, l_P, and t_P, i.e. the Planck force, Planck distance and Planck time unit respectively. Hence, we write:

ħ = F_P∙l_P∙t_P

Note that F_P, l_P, and t_P are expressed in our old-fashioned SI units here, i.e. in newton (N), meter (m) and seconds (s) respectively. So F_P, l_P, and t_P have a numerical value as well as a dimension, just like ħ. They’re not just numbers. If we’d want to be very explicit, we could write: F_P = F_P [force], or F_P = F_P N, and you could do the same for l_P and t_P. However, it’s rather tedious to mention those dimensions all the time, so I’ll just assume you understand the symbols we’re using do not represent some dimensionless number. In fact, that’s what distinguishes physical constants from mathematical constants.

Dimensions are also what distinguishes physics equations from purely mathematical ones: an equation in physics will always relate some physical quantities and, hence, when you’re dealing with physics equations, you always need to wonder about the dimensions. [Note that the term ‘dimension’ has many definitions… But… Well… I suppose you know what I am talking about here, and I need to move on. So let’s do that.] Let’s re-write that ħ = F_P∙l_P∙t_P formula as follows: ħ/t_P = F_P∙l_P.

F_P∙l_P is, obviously, a force times a distance, so that’s energy. Please do check the dimensions on the left-hand side as well: [ħ/t_P] = [ħ]/[t_P] = (N·m·s)/s = N·m. In short, we can think of E_P = F_P∙l_P = ħ/t_P as being some ‘natural’ unit as well. But what would it correspond to – physically? What is its meaning? We may be tempted to call it the quantum of energy that’s associated with our quantum of action, but… Well… No. While it’s referred to as the Planck energy, it’s actually a rather large unit, and so… Well… No. We should not think of it as the quantum of energy. We have a quantum of action but no quantum of energy. Sorry. Let’s move on.

In the same vein, we can re-write the ħ = F_P∙l_P∙t_P as ħ/l_P = F_P∙t_P. Same thing with the dimensions – or ‘same-same but different’, as they say in Asia: [ħ/l_P] = [F_P∙t_P] = (N·m·s)/m = N·s. Force times time is momentum and, hence, we may now be tempted to think of p_P = F_P∙t_P = ħ/l_P as the quantum of momentum that’s associated with ħ, but… Well… No. There’s no such thing as a quantum of momentum. Not now in any case. Maybe later. 🙂 But, for now, we only have a quantum of action. So we’ll just call ħ/l_P = F_P∙t_P the Planck momentum for the time being.
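To get a feel for the magnitudes, the sketch below plugs in numbers. Big assumption: it uses the standard Planck units (l_P = √(ħ·G/c³), t_P = l_P/c and F_P = c⁴/G), which, as said, may or may not be the natural units we are after. The constants are rounded values I typed in myself.

```python
import math

# Rounded SI values (assumption: standard definitions of the Planck units)
HBAR = 1.054571817e-34   # J·s
G = 6.67430e-11          # gravitational constant, m³/(kg·s²)
C = 299792458.0          # speed of light, m/s

l_P = math.sqrt(HBAR * G / C**3)   # Planck length, m
t_P = l_P / C                      # Planck time, s
F_P = C**4 / G                     # Planck force, N

E_P = F_P * l_P   # Planck energy, J (should equal ħ/t_P)
p_P = F_P * t_P   # Planck momentum, N·s (should equal ħ/l_P)

print(F_P * l_P * t_P, HBAR)   # the product gives us ħ back
print(E_P, HBAR / t_P)
print(p_P, HBAR / l_P)
```

With these definitions, the Planck energy comes out at roughly 2·10⁹ joule, which is indeed a rather large unit for something we’d like to call a quantum.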

So now we have two ways of looking at the dimension of Planck’s constant:

[Planck’s constant] = N∙m∙s = (N∙m)∙s = [energy]∙[time]
[Planck’s constant] = N∙m∙s = (N∙s)∙m = [momentum]∙[distance]

In case you didn’t get this from what I wrote above: the brackets here, i.e. the [ and ] symbols, mean: ‘the dimension of what’s between the brackets’. OK. So far so good. It may all look like kids stuff – it actually is kids stuff so far – but the idea is quite fundamental: we’re thinking here of some amount of action (h or ħ, to be precise, i.e. the quantum of action) expressing itself in time or, alternatively, expressing itself in space. In the former case, some amount of energy is expended during some time. In the latter case, some momentum is expended over some distance.
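If you want to do that dimensional bookkeeping mechanically, the sketch below represents a dimension as a dictionary of unit exponents (in newton, meter and second). Multiplying quantities then amounts to adding exponents. The helper name and the representation are my own, just for illustration.

```python
def times(a, b):
    """Multiply two dimensions by adding the exponents of their units."""
    out = dict(a)
    for unit, power in b.items():
        out[unit] = out.get(unit, 0) + power
    # drop units whose exponent cancelled out
    return {unit: power for unit, power in out.items() if power != 0}

FORCE = {"N": 1}
DISTANCE = {"m": 1}
TIME = {"s": 1}

ENERGY = times(FORCE, DISTANCE)       # N·m
MOMENTUM = times(FORCE, TIME)         # N·s

ACTION_1 = times(ENERGY, TIME)        # [energy]·[time]
ACTION_2 = times(MOMENTUM, DISTANCE)  # [momentum]·[distance]

print(ACTION_1, ACTION_2)
print(times(ACTION_1, {"s": -1}))     # [ħ]/[time]: should be N·m again
```

Both ways of grouping give N·m·s, and dividing the action by a time gives us the dimension of energy back.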

Of course, ideally, we should try to think of action expressing itself in space and time simultaneously, so we should think of it as expressing itself in spacetime. In fact, that’s what the so-called Principle of Least Action in physics is all about—but I won’t dwell on that here, because… Well… It’s not an easy topic, and the diversion would lead us astray. 🙂 What we will do, however, is apply the idea above to the two de Broglie relations: E = ħω and p = ħk. I assume you know these relations by now. If not, just check one of my many posts on them. Let’s see what we can do with them.

**The de Broglie relations**

We can re-write the two de Broglie relations as ħ = E/ω and ħ = p/k. We can immediately derive an interesting property here:

ħ/ħ = 1 = (E/ω)/(p/k) ⇔ E/p = ω/k

So the ratio of the energy and the momentum is equal to the wave velocity. What wave velocity? The group or the phase velocity? We’re talking an elementary wave here, so both are the same: we have only one E and p, and, hence, only one ω and k. The E/p = ω/k identity underscores the following point: the de Broglie equations are a pair of equations here, and one of the key things to learn when trying to understand quantum mechanics is to think of them as an inseparable pair – like inseparable twins, really – as the quantum of action incorporates both a spatial as well as a temporal dimension. Just think of what Minkowski said back in 1908, shortly after he had re-formulated Einstein’s special relativity theory in terms of four-dimensional spacetime, and only months before he died – unexpectedly – from appendicitis: “Henceforth space by itself, and time by itself, are doomed to fade away into mere shadows, and only a kind of union of the two will preserve an independent reality.”

So we should try to think of what that union might represent—and that surely includes looking at the de Broglie equations as a pair of matter-wave equations. Likewise, we should also think of the Uncertainty Principle as a pair of equations: ΔpΔx ≥ ħ/2 and ΔEΔt ≥ ħ/2—but I’ll come back to those later.

The ω in the E = ħω equation and the argument (θ = kx – ωt) of the wavefunction is a frequency in time (or temporal frequency). It’s a frequency expressed in radians per second. You get one radian by dividing one cycle by 2π. In other words, we have 2π radians in one cycle. So ω is related to the frequency you’re used to, i.e. f, the frequency expressed in cycles per second (i.e. hertz): we multiply f by 2π to get ω. So we can write: E = ħω = ħ∙2π∙f = h∙f, with h = ħ∙2π (or ħ = h/2π).
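If you want to see the E = ħω = h∙f identity at work numerically, here is a small sketch. The frequency value is just a hypothetical example I picked (roughly visible light); the value of h is the exact SI one.

```python
import math

H = 6.62607015e-34          # Planck's constant h in J·s (exact SI value)
HBAR = H / (2 * math.pi)    # reduced Planck constant, ħ = h/2π

f = 5.0e14                  # hypothetical frequency in cycles per second (Hz)
omega = 2 * math.pi * f     # the same frequency in radians per second

E_from_omega = HBAR * omega   # E = ħ·ω
E_from_f = H * f              # E = h·f

print(E_from_omega, E_from_f)
```

Both numbers are the same (up to floating-point rounding): the 2π we multiply f by is exactly the 2π we divide h by.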

Likewise, the k in the p = ħk equation and the argument (θ = kx – ωt) of the wavefunction is a frequency in space (or spatial frequency). Unsurprisingly, it’s expressed in radians per meter.

At this point, it’s good to properly define the radian as a unit in quantum mechanics. We often think of a radian as some distance measured along the circumference, because of the way the unit is constructed (see the illustration below) but that’s right and wrong at the same time. In fact, it’s more wrong than right: the radian is an angle that’s defined using the length of the radius of the unit circle but, when everything is said and done, it’s a unit used to measure some angle—not a distance. That should be obvious from the 2π rad = 360 degrees identity. The angle here is the argument of our wavefunction in quantum mechanics, and so that argument combines both time (t) as well as distance (x): θ = kx – ωt = k(x – c∙t). So our angle (the argument of the wavefunction) integrates both dimensions: space as well as time. If you’re not convinced, just do the dimensional analysis of the kx – ωt expression: both the kx and ωt yield a dimensionless number—or… Well… To be precise, I should say: the kx and ωt products both yield an angle expressed in radians. That angle connects the real and imaginary part of the argument of the wavefunction. Hence, it’s a dimensionless number—but that does not imply it is just some meaningless number. It’s not meaningless at all—obviously!

Let me try to present what I wrote above in yet another way. The θ = kx – ωt = (p/ħ)·x − (E/ħ)·t equation suggests a fungibility: the wavefunction itself also expresses itself in time and/or in space, so to speak – just like the quantum of action. Let me be precise: the p·x factor in the (p/ħ)·x term represents momentum (whose dimension is N·s) being expended over a distance, while the E·t factor in the (E/ħ)·t term represents energy (expressed in N·m) being expended over some time. [As for the minus sign in front of the (E/ħ)·t term, that’s got to do with the fact that the arrow of time points in one direction only while, in space, we can go in either direction: forward or backwards.] Hence, the expression for the argument tells us that both are essentially fungible – which suggests they’re aspects of one and the same thing. So that’s what Minkowski’s intuition is all about: spacetime is one, and the wavefunction just connects the physical properties of whatever it is that we are observing – an electron, or a photon, or whatever other elementary particle – to it.

Of course, the corollary to thinking of unified spacetime is thinking of the real and imaginary part of the wavefunction as one—which we’re supposed to do as a complex number is… Well… One complex number. But that’s easier said than actually done, of course. One way of thinking about the connection between the two spacetime ‘dimensions’ – i.e. t and x, with x actually incorporating three spatial dimensions in space in its own right (see how confusing the term ‘dimension’ is?) – and the two ‘dimensions’ of a complex number is going from Cartesian to polar coordinates, and vice versa. You now think of Euler’s formula, of course – if not, you should – but let me insert something more interesting here. 🙂 I took it from Wikipedia. It illustrates how a simple sinusoidal function transforms as we go from Cartesian to polar coordinates.

[Illustration from Wikipedia: a sinusoidal function transforming from Cartesian to polar coordinates]

Interesting, isn’t it? Think of the horizontal and vertical axis in the Cartesian space as representing time and… Well… Space indeed. 🙂 The function connects the space and time dimension and might, for example, represent the trajectory of some object in spacetime. Admittedly, it’s a rather weird trajectory, as the object goes back and forth in some box in space, and accelerates and decelerates all of the time, reversing its direction in the process… But… Well… Isn’t that how we think of an electron moving in some orbital? 🙂 With that in mind, look at what the same movement in spacetime looks like in polar coordinates. It’s also some movement in a box – but both the ‘horizontal’ and ‘vertical’ axis (think of these axes as the real and imaginary part of a complex number) are now delineating our box. So, whereas our box is a one-dimensional box in spacetime only (our object is constrained in space, but time keeps ticking), it’s a two-dimensional box in our ‘complex’ space. Isn’t it just nice to think about stuff this way?

As far as I am concerned, it triggers the same profound thoughts as that E/p = ω/k relation. The left-hand side is a ratio between energy and momentum. Now, one way to look at energy is that it’s action per time unit. Likewise, momentum is action per distance unit. Of course, ω is expressed as some quantity (expressed in radians, to be precise) per time unit, and k is some quantity (again, expressed in radians) per distance unit. Because this is a physical equation, the dimension of both sides of the equation has to be the same – and, of course, it is the same: the action dimension in the numerator and denominator of the ratio on the left-hand side of the E/p = ω/k equation cancel each other. But… What? Well… Wouldn’t it be nice to think of the dimension of the argument of our wavefunction as being the dimension of action, rather than thinking of it as just some mathematical thing, i.e. an angle? I like to think the argument of our wavefunction is more than just an angle. When everything is said and done, it has to be something physical – if only because the wavefunction describes something physical. But… Well… I need to do some more thinking on this, so I’ll just move on here. Otherwise this post risks becoming a book in its own right. 🙂

Let’s get back to the topic we were discussing here. We were talking about natural units. More in particular, we were wondering: what’s natural? What does it mean?

Back to Planck units

Let’s start with time and distance. We may want to think of l_P and t_P as the smallest distance and time units possible—so small, in fact, that both distance and time become countable variables at that scale.

Huh? Yes. I am sure you’ve never thought of time and distance as countable variables but I’ll come back to this rather particular interpretation of the Planck length and time unit later. So don’t worry about it now: just make a mental note of it. The thing is: if t_P and l_P are the smallest time and distance units possible, then the smallest cycle we can possibly imagine will be associated with those two units: we write: ω_P = 1/t_P and k_P = 1/l_P. What’s the ‘smallest possible’ cycle? Well… Not sure. You should not think of some oscillation in spacetime as for now. Just think of a cycle. Whatever cycle. So, as for now, the smallest cycle is just the cycle you’d associate with the smallest time and distance units possible—so we cannot write ω_P = 2/t_P, for example, because that would imply we can imagine a time unit that’s smaller than t_P, as we can associate two cycles with t_P now.

OK. Next step. We can now define the Planck energy and the Planck momentum using the de Broglie relations:

E_P = ħ∙ω_P = ħ/t_P and p_P = ħ∙k_P = ħ/l_P

You’ll say that I am just repeating myself here, as I’ve given you those two equations already. Well… Yes and no. At this point, you should raise the following question: why are we using the angular frequency (ω_P = 2π·f_P) and the reduced Planck constant (ħ = h/2π), rather than f_P or h?

That’s a great question. In fact, it begs the question: what’s the physical constant really? We have two mathematical constants – ħ and h – but they must represent the same physical reality. So is one of the two constants more real than the other? The answer is unambiguously: yes! The Planck energy is defined as E_P = ħ/t_P =(h/2π)/t_P, so we cannot write this as E_P = h/t_P. The difference is that 1/2π factor, and it’s quite fundamental, as it implies we’re actually not associating a full cycle with t_P and l_P but a radian of that cycle only.

Huh? Yes. It’s a rather strange observation, and I must admit I haven’t quite sorted out what this actually means. The fundamental idea remains the same, however: we have a quantum of action, ħ (not h!), that can express itself as energy over the smallest distance unit possible or, alternatively, that expresses itself as momentum over the smallest time unit possible. In the former case, we write it as E_P = F_P∙l_P = ħ/t_P. In the latter, we write it as p_P = F_P∙t_P = ħ/l_P. Both are aspects of the same reality, though, as our particle moves in space as well as in time, i.e. it moves in spacetime. Hence, one step in space, or in time, corresponds to one radian. Well… Sort of… Not sure how to further explain this. I probably shouldn’t try anyway. 🙂

The more fundamental question is: with what speed is it moving? That question brings us to the next point. The objective is to get some specific value for l_P and t_P, so how do we do that? How can we determine these two values? Well… That’s another great question. 🙂

The first step is to relate the natural time and distance units to the wave velocity. Now, we do not want to complicate the analysis and so we’re not going to throw in some rest mass or potential energy here. No. We’ll be talking a theoretical zero-mass particle. So we’re not going to consider some electron moving in spacetime, or some other elementary particle. No. We’re going to think about some zero-mass particle here, or a photon. [Note that a photon is not just a zero-mass particle. It’s similar but different: in one of my previous posts, I showed a photon packs more energy, as you get two wavefunctions for the price of one, so to speak. However, don’t worry about the difference here.]

Now, you know that the wave velocity for a zero-mass particle and/or a photon is equal to the speed of light. To be precise, the wave velocity of a photon is the speed of light and, hence, the speed of any zero-mass particle must be the same—as per the definition of mass in Einstein’s special relativity theory. So we write: l_P/t_P = c ⇔ l_P = c∙t_P and t_P = l_P/c. In fact, we also get this from dividing E_P by p_P, because we know that E/p = c, for any photon (and for any zero-mass particle, really). So we know that E_P/p_P must also equal c. We can easily double-check that by writing: E_P/p_P = (ħ/t_P)/(ħ/l_P) = l_P/t_P = c. Substitution in ħ = F_P∙l_P∙t_P yields ħ = c∙F_P∙t_P² or, alternatively, ħ = F_P∙l_P²/c. So we can now write F_P as a function of l_P and/or t_P:

F_P = ħ∙c/l_P² = ħ/(c∙t_P²)

We can quickly double-check this by dividing F_P = ħ∙c/l_P² by F_P = ħ/(c∙t_P²). We get: 1 = c²∙t_P²/l_P² ⇔ l_P²/t_P² = c² ⇔ l_P/t_P = c.

Nice. However, this does not uniquely define F_P, l_P, and t_P. The problem is that we’ve got only two equations (ħ = F_P∙l_P∙t_P and l_P/t_P = c) for three unknowns (F_P, l_P, and t_P). Can we throw in one or both of the de Broglie equations to get some final answer?
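To make the ‘two equations, three unknowns’ problem tangible, here is a little sketch. It assumes the rounded values of ħ and c used elsewhere in this post, and the helper name `units_for_length` is mine. Pick any length unit you like, and the two constraints will happily hand you a matching time and force unit:

```python
import math

HBAR = 1.0545718e-34   # reduced Planck constant, in N·m·s
C = 2.998e8            # speed of light, in m/s

def units_for_length(l):
    """For an arbitrary length unit l (in m), return the (F, l, t) triple
    that satisfies both ħ = F·l·t and l/t = c."""
    t = l / C                  # from l/t = c
    F = HBAR * C / l**2        # from F = ħ·c/l²
    return F, l, t

# Three very different choices of length unit: 1 m, 1 ångström, and the Planck length.
candidates = [units_for_length(l) for l in (1.0, 1e-10, 1.6162e-35)]

# Every single one of them satisfies both equations.
all_ok = all(
    math.isclose(F * l * t, HBAR, rel_tol=1e-12) and math.isclose(l / t, C, rel_tol=1e-12)
    for F, l, t in candidates
)
print(all_ok)
```

So the two equations, on their own, do not single out the Planck length: we need a third, independent condition.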

I wish that would help, but it doesn’t—because we get the same ħ = F_P∙l_P∙t_P equation. Indeed, we’re just re-defining the Planck energy (and the Planck momentum) by that E_P = ħ/t_P (and p_P = ħ/l_P) equation here, and so that does not give us a value for E_P (and p_P). So we’re stuck. We need some other formula so we can calculate the third unknown, which is the Planck force unit (F_P). What formula could we possibly choose?

Well… We got a relationship by imposing the condition that l_P/t_P = c, which implies that if we’d measure the velocity of a photon in Planck time and distance units, we’d find that its velocity is one, so c = 1. Can we think of some similar condition involving ħ? The answer is: we can and we can’t. It’s not so simple. Remember we were thinking of the smallest cycle possible? We said it was small because t_P and l_P were the smallest units we could imagine. But how do we define that? The idea is as follows: the smallest possible cycle will pack the smallest amount of action, i.e. h (or, expressed per radian rather than per cycle, ħ).

Now, we usually think of packing energy, or momentum, instead of packing action, but that’s because… Well… Because we’re not good at thinking the way Minkowski wanted us to think: we’re not good at thinking of some kind of union of space and time. We tend to think of something moving in space, or, alternatively, of something moving in time—rather than something moving in spacetime. In short, we tend to separate dimensions. So that’s why we’d say the smallest possible cycle would pack an amount of energy that’s equal to E_P = ħ∙ω_P = ħ/t_P, or an amount of momentum that’s equal to p_P = ħ∙k_P = ħ/l_P. But both are part and parcel of the same reality, as evidenced by the E = m∙c² = m∙c∙c = p∙c equality. [This equation only holds for a zero-mass particle (and a photon), of course. It’s a bit more complicated when we’d throw in some rest mass, but we can do that later. Also note I keep repeating my idea of the smallest cycle, but we’re talking radians of a cycle, really.]

So we have that mass-energy equivalence, which is also a mass-momentum equivalence according to that E = m∙c² = m∙c∙c = p∙c formula. And so now the gravitational force comes into play: there’s a limit to the amount of energy we can pack into a tiny space. Or… Well… Perhaps there’s no limit—but if we pack an awful lot of energy into a really tiny speck of space, then we get a black hole.

However, we’re getting a bit ahead of ourselves here, so let’s first try something else. Let’s throw in the Uncertainty Principle.

The Uncertainty Principle

As mentioned above, we can think of some amount of action expressing itself over some time or, alternatively, over some distance. In the former case, some amount of energy is expended over some time. In the latter case, some momentum is expended over some distance. That’s why the energy and time variables, and the momentum and distance variables, are referred to as complementary. It’s hard to think of both things happening simultaneously (whatever that means in spacetime), but we should try! Let’s now look at the Uncertainty relations once again (I am writing uncertainty with a capital U out of respect—as it’s very fundamental, indeed!):

ΔpΔx ≥ ħ/2 and ΔEΔt ≥ ħ/2.

Note that the ħ/2 factor on the right-hand side quantifies the uncertainty, while the left-hand side of the two equations (ΔpΔx and ΔEΔt) are just an expression of that fundamental uncertainty. In other words, we have two equations (a pair), but there’s only one fundamental uncertainty, and it’s an uncertainty about a movement in spacetime. Hence, that uncertainty expresses itself in both time as well as in space.

Note the use of ħ rather than h, and the fact that the 1/2 factor makes it look like we’re splitting ħ over ΔpΔx and ΔEΔt respectively—which is actually a quite sensible explanation of what this pair of equations actually represent. Indeed, we can add both relations to get the following sum:

ΔpΔx + ΔEΔt ≥ ħ/2 + ħ/2 = ħ

Interesting, isn’t it? It explains that 1/2 factor which troubled us when playing with the de Broglie relations.

Let’s now think about our natural units again—about l_P, and t_P in particular. As mentioned above, we’ll want to think of them as the smallest distance and time units possible: so small, in fact, that both distance and time become countable variables, so we count x and t as 0, 1, 2, 3 etcetera. We may then imagine that the uncertainty in x and t is of the order of one unit only, so we write Δx = l_P and Δt = t_P. So we can now re-write the uncertainty relations as:

Δp·l_P = ħ/2
ΔE·t_P = ħ/2

Hey! Wait a minute! Do we have a solution for the value of l_P and t_P here? What if we equate the natural energy and momentum units to ΔE and Δp here? Well… Let’s try it. First note that we may think of the uncertainty in t, or in x, as being equal to plus or minus one unit, i.e. ±1. So the uncertainty is two units really. [Frankly, I just want to get rid of that 1/2 factor here.] Hence, we can re-write the ΔpΔx = ΔEΔt = ħ/2 equations as:

ΔpΔx = p_P∙l_P = F_P∙t_P∙l_P = ħ
ΔEΔt = E_P∙t_P = F_P∙l_P∙t_P = ħ

Hmm… What can we do with this? Nothing much, unfortunately. We’ve got the same problem: we need a value for F_P (or for p_P, or for E_P) to get some specific value for l_P and t_P, so we’re stuck once again. We have three variables and two equations only, so we have no specific value for either of them. 😦

What to do? Well… I will give you the answer now – the answer you’ve been waiting for, really – but not the technicalities of it. There’s a thing called the Schwarzschild radius, aka the gravitational radius. Let’s analyze it.

The Schwarzschild radius and the Planck length

The Schwarzschild radius is just the radius of a black hole. Its formal definition is the following: it is the radius of a sphere such that, if all the mass of an object were to be compressed within that sphere, the escape velocity from the surface of the sphere would equal the speed of light (c). The formula for the Schwarzschild radius is the following:

R_S = 2m·G/c²

G is the gravitational constant here: G ≈ 6.674×10⁻¹¹ N⋅m²/kg². [Note that Newton’s F = m·a Law tells us that 1 kg = 1 N·s²/m, as we’ll need to substitute units later.]

But what is the mass (m) in that R_S = 2m·G/c² equation? Using equivalent time and distance units (so c = 1), we wrote the following for a zero-mass particle and for a photon respectively:

E = m = p = ħ/2 (zero-mass particle)
E = m = p = ħ (photon)

How can a zero-mass particle, or a photon, have some mass? Well… Because it moves at the speed of light. I’ve talked about that before, so I’ll just refer you to my post on that. Of course, the dimension of the right-hand side of these equations (i.e. ħ/2 or ħ) has to be the same as the dimension on the left-hand side, so the ‘ħ’ in the E = ħ equation (or E = ħ/2 equation) is a different ‘ħ’ than the one in the p = ħ equation (or p = ħ/2 equation). So we must be careful here. Let’s write it all out, so as to remind ourselves of the dimensions involved:

E [N·m] = ħ [N·m·s/s] = E_P = F_P∙l_P∙t_P/t_P
p [N·s] = ħ [N·m·s/m] = p_P = F_P∙l_P∙t_P/l_P

Now, let’s check this by cheating. I’ll just give you the numerical values—even if we’re not supposed to know them at this point—so you can see I am not writing nonsense here:

E_P= 1.0545718×10⁻³⁴N·m·s/(5.391×10⁻⁴⁴ s) = (1.21×10⁴⁴N)·(1.6162×10⁻³⁵m) = 1.9561×10⁹N·m
p_P=1.0545718×10⁻³⁴N·m·s/(1.6162×10⁻³⁵m) = (1.21×10⁴⁴N)·(5.391×10⁻⁴⁴ s) = 6.52485 N·s

You can google the Planck units, and you’ll see I am not taking you for a ride here. 🙂
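If you’d rather check the arithmetic than google it, the sketch below just redoes the two lines above, using the same rounded values quoted in this post (so expect agreement to three or four digits only):

```python
HBAR = 1.0545718e-34   # N·m·s
T_P = 5.391e-44        # Planck time, in s (value quoted above)
L_P = 1.6162e-35       # Planck length, in m (value quoted above)
F_P = 1.21e44          # Planck force, in N (value quoted above)

# Planck energy, computed in two ways
E_P = HBAR / T_P       # ħ/t_P
E_alt = F_P * L_P      # F_P·l_P

# Planck momentum, computed in two ways
p_P = HBAR / L_P       # ħ/l_P
p_alt = F_P * T_P      # F_P·t_P

print(E_P, E_alt)      # both close to 1.956×10⁹ N·m
print(p_P, p_alt)      # both close to 6.52 N·s
print(E_P / p_P)       # l_P/t_P, i.e. close to c
```

The small differences between the two ways of computing each value are just rounding in the inputs.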

The associated Planck mass is m_P = E_P/c² = 1.9561×10⁹ N·m/(2.998×10⁸ m/s)² = 2.17651×10⁻⁸ N·s²/m = 2.17651×10⁻⁸ kg. So let’s plug that value into the R_S = 2m·G/c² equation. We get:

R_S = 2m·G/c² = (2.17651×10⁻⁸ kg)·(6.674×10⁻¹¹ N⋅m²/kg²)/(8.988×10¹⁶ m²·s⁻²)

= 1.6162×10⁻³⁵ kg·N⋅m²·kg⁻²·m⁻²·s² = 1.6162×10⁻³⁵ N·s²/kg = 1.6162×10⁻³⁵ m = l_P

Bingo! You can look it up: 1.6162×10⁻³⁵m is the Planck length indeed, so the Schwarzschild radius is the Planck length. We can now easily calculate the other Planck units:

t_P= l_P/c = 1.6162×10⁻³⁵m/(2.998×10⁸m/s) = 5.391×10⁻⁴⁴s
F_P = ħ/(t_P∙l_P) = (1.0545718×10⁻³⁴ N·m·s)/[(1.6162×10⁻³⁵ m)·(5.391×10⁻⁴⁴ s)] = 1.21×10⁴⁴ N

Bingo again! 🙂
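Here is the same chain of calculations as a sketch you can run yourself. It uses the rounded constants quoted in this post, and it makes explicit something the numbers above quietly do: the result only matches the Planck length if we plug in m·G/c² rather than 2m·G/c² (more on that below).

```python
G = 6.674e-11          # gravitational constant, in N·m²/kg²
C = 2.998e8            # speed of light, in m/s
HBAR = 1.0545718e-34   # reduced Planck constant, in N·m·s
E_P = 1.9561e9         # Planck energy, in N·m (the value we 'cheated' with above)

m_P = E_P / C**2                 # Planck mass, in kg

r_with_m = m_P * G / C**2        # what the numbers above actually compute
r_with_2m = 2 * m_P * G / C**2   # the textbook Schwarzschild radius of that mass

t_P = r_with_m / C               # Planck time from l_P/t_P = c
F_P = HBAR / (t_P * r_with_m)    # Planck force from ħ = F_P·l_P·t_P

print(m_P)         # close to 2.176×10⁻⁸ kg
print(r_with_m)    # close to 1.616×10⁻³⁵ m
print(r_with_2m)   # twice that
print(t_P, F_P)    # close to 5.39×10⁻⁴⁴ s and 1.21×10⁴⁴ N
```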

[…] But… Well… Look at this: we’ve been cheating all the way. First, we just gave you that formula for the Schwarzschild radius. It looks like an easy formula but its derivation involves a profound knowledge of general relativity theory. So we’d need to learn about tensors and what have you. The formula is, in effect, a solution to what is known as Einstein’s field equations, and that’s pretty complicated stuff.

However, my crime is much worse than that: I also gave you those numerical values for the Planck energy and momentum, rather than calculating them. I just couldn’t calculate them with the knowledge we have so far. When everything is said and done, we have more than three unknowns. We’ve got five in total, including the Planck charge (q_P) and, hence, we need five equations. Again, I’ll just copy them from Wikipedia, because… Well… What we’re discussing here is way beyond the undergraduate physics stuff that we’ve been presenting so far. The equations are the following. Just have a look at them and move on. 🙂
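For what it’s worth, three of those base units (length, time and mass) can be evaluated directly from the standard textbook definitions in terms of ħ, G and c. The sketch below does only that, with the rounded constants used in this post; it leaves out the Planck charge and the other units, and it is not a derivation of anything:

```python
import math

HBAR = 1.0545718e-34   # reduced Planck constant, in N·m·s
G = 6.674e-11          # gravitational constant, in N·m²/kg²
C = 2.998e8            # speed of light, in m/s

# Standard definitions of the Planck length, time and mass
l_P = math.sqrt(HBAR * G / C**3)   # metres
t_P = math.sqrt(HBAR * G / C**5)   # seconds
m_P = math.sqrt(HBAR * C / G)      # kilograms

print(l_P)   # close to 1.616×10⁻³⁵ m
print(t_P)   # close to 5.39×10⁻⁴⁴ s
print(m_P)   # close to 2.177×10⁻⁸ kg
```

Note that l_P/t_P comes out as c by construction, which is the l_P/t_P = c condition we imposed earlier.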

Finally, I should note one more thing: I did not use 2m but m in Schwarzschild’s formula. Why? Well… I have no good answer to that. I did it to ensure I got the result we wanted to get. It’s that 1/2 factor again. In fact, E = m = p = ħ/2 is the correct formula to use, and all would come out alright if we did that and defined the magnitude of the uncertainty as one unit only, but we used the E = m = p = ħ formula instead, i.e. the equation that’s associated with a photon. You can re-do the calculations as an exercise: you’ll see it comes out alright.

Just to make things somewhat more real, let me note that the Planck energy is very substantial: 1.9561×10⁹ N·m ≈ 2×10⁹ J is equivalent to the energy that you’d get out of burning 60 liters of gasoline – or the mileage you’d get out of 16 gallons of fuel! In short, it’s huge, and so we’re packing that into an unimaginably small space. To understand how that works, you can think of the E = h∙f ⇔ h = E/f relation once more. The h = E/f ratio implies that energy and frequency are directly proportional to each other, with h the coefficient of proportionality. Shortening the wavelength amounts to increasing the frequency and, hence, the energy. So, as you think of our cycle becoming smaller and smaller, until it becomes the smallest cycle possible, you should think of the frequency becoming unimaginably large. Indeed, as I explained in one of my other posts on physical constants, we’re talking the 10⁴³ Hz scale here. However, we can always choose our time unit such that we measure the frequency as one cycle per time unit. Because the energy per cycle remains the same, it means the quantum of action (ħ = F_P∙l_P∙t_P) expresses itself over extremely short time spans, which means the E_P = F_P∙l_P product becomes huge, as we’ve shown above. The rest of the story is the same: gravity comes into play, and so our little blob in spacetime becomes a tiny black hole. Again, we should think of both space and time: they are joined in ‘some kind of union’ here, indeed, as they’re intimately connected through the wavefunction, which travels at the speed of light.

The wavefunction as an oscillation in and of spacetime

OK. Now I am going to present the big idea I started with. Let me first ask you a question: when thinking about the Planck-Einstein relation (I am talking about the E = ħ∙ω relation for a photon here, rather than the equivalent de Broglie equation for a matter-particle), aren’t you struck by the fact that the energy of a photon depends on the frequency of the electromagnetic wave only? I mean… It does not depend on its amplitude. The amplitude is mentioned nowhere. The amplitude is fixed, somehow—or considered to be fixed.

Isn’t that strange? I mean… For any other physical wave, the energy would not only depend on the frequency but also on the amplitude of the wave. For a photon, however, it’s just the frequency that counts. Light of the same frequency but higher intensity (read: more energy) is not a photon with higher amplitude, but just more photons. So it’s the photons that add up somehow, and so that explains the amplitude of the electric and magnetic field vectors (i.e. E and B) and, hence, the intensity of the light. However, every photon considered separately has the same amplitude apparently. We can only increase its energy by increasing the frequency. In short, ω is the only variable here.

Let’s look at that angular frequency once more. As you know, it’s expressed in radians per second but, if you divide ω by 2π, you get the frequency you’re probably more acquainted with: f = ω/2π, expressed in cycles per second. The Planck-Einstein relation is then written as E = h∙f. That’s easy enough. But what if we’d change the time unit here? For example, what if our time unit becomes the time that’s needed for a photon to travel one meter? Let’s examine it.

Let’s denote that time unit by t_m, so we write: 1 t_m = 1/c s ⇔ t_m^–¹ = c s^–¹, with c ≈ 3×10⁸. The frequency, as measured using our new time unit, changes, obviously: we have to divide its former value by c now. So, using our little subscript once more, we could write: f_m = f/c. [Why? Just add the dimension to make things more explicit: f s^–¹ = (f/c) t_m^–¹.] But the energy of the photon should not depend on our time unit, should it?

Don’t worry. It doesn’t: the numerical value of Planck’s constant (h) would also change, as we’d replace the second in its dimension (N∙m∙s) by c times our new time unit t_m. However, Planck’s constant remains what it is: some physical constant. It does not depend on our measurement units: we can use the SI units, or the Planck units (F_P, l_P, and t_P), or whatever unit you can think of. It doesn’t matter: h (or ħ = h/2π) is what it is – it’s the quantum of action, and so that’s a physical constant (as opposed to a mathematical constant) that’s associated with one cycle.
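A quick numerical sketch of that change of time unit, in case you want to see the factors of c cancel. The photon frequency is a hypothetical example value, and the variable names (`f_m`, `h_m`) simply follow the t_m subscript used above:

```python
import math

H_SI = 6.62607015e-34   # Planck's constant, in N·m·s
C = 2.998e8             # speed of light, in m/s, so 1 s = C units of t_m

f_si = 5.0e14           # hypothetical photon frequency, in cycles per second

# Switch to the new time unit t_m = (1/c) s:
f_m = f_si / C          # cycles per t_m: fewer cycles fit in the shorter time unit
h_m = H_SI * C          # h in N·m·t_m: the numerical value grows by the same factor

E_si = H_SI * f_si      # energy computed with seconds
E_m = h_m * f_m         # energy computed with the new time unit

print(E_si, E_m)        # the same number of N·m (joule) either way
```

The energy is expressed in N·m, which does not contain the time unit, so its numerical value is untouched.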

Now, I said we do not associate the wavefunction of a photon with an amplitude, but we do associate it with a wavelength. We do so using the standard formula for the velocity of a wave: c = f∙λ ⇔ λ = c/f. We can also write this using the angular frequency and the wavenumber: c = ω/k, with k = 2π/λ. We can double-check this, because we know that, for a photon, the following relation holds: E/p = c. Hence, using the E = ħ∙ω and p = ħ∙k relations, we get: (ħ∙ω)/(ħ∙k) = ω/k = c. So we have options here: h can express itself over a really long wavelength, or it can do so over an extremely short wavelength. We re-write p = ħ∙k as p = E/c = ħ∙2π/λ = h/λ ⇔ E = h∙c/λ ⇔ h∙c = E∙λ. We know this relationship: the energy and the wavelength of a photon (or an electromagnetic wave) are inversely proportional to each other.
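The relations in this paragraph are easy to play with. The sketch below assumes a hypothetical 500 nm photon (green-ish light) and a helper called `photon` that I made up for the occasion; it goes from the wavelength to k and ω, and then to E and p:

```python
import math

H = 6.62607015e-34          # Planck's constant, in J·s
HBAR = H / (2 * math.pi)    # reduced Planck constant
C = 2.998e8                 # speed of light, in m/s

def photon(wavelength):
    """Return (E, p) for a photon of the given wavelength (in metres)."""
    k = 2 * math.pi / wavelength   # wavenumber, in radians per metre
    omega = C * k                  # angular frequency, from c = ω/k
    return HBAR * omega, HBAR * k  # E = ħ·ω and p = ħ·k

wavelength = 500e-9                # hypothetical example: 500 nm
E, p = photon(wavelength)
E_half, p_half = photon(wavelength / 2)

print(E / p)                 # c
print(E * wavelength)        # h·c
print(E_half / E)            # 2: halving the wavelength doubles the energy
```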

Once again, we may want to think of the shortest wavelength possible. As λ gets a zillion times smaller, E gets a zillion times bigger. Is there a limit? There is. As I mentioned above, the gravitational force comes into play here: there’s a limit to the amount of energy we can pack into a tiny space. If we pack an awful lot of energy into a really tiny speck of space, then we get a black hole. In practical terms, that implies our photon can’t travel, as it can’t escape from the black hole it creates. That’s what that calculation of the Schwarzschild radius was all about.

We can—in fact, we should—now apply the same reasoning to the matter-wave. Instead of a photon, we should try to think of a zero-mass matter-particle. You’ll say: that’s a contradiction. Matter-particles – as opposed to force-carrying particles, like photons (or bosons in general) – must have some rest mass, so they can’t be massless. Well… Yes. You’re right. But we can throw the rest mass in later. I first want to focus on the abstract principles, i.e. the propagation mechanism of the matter-wave.

Using natural units, we know our particle will move in spacetime with velocity Δx/Δt = 1/1 = 1. Of course, it has to have some energy to move, or some momentum. We also showed that, if it’s massless, and the elementary wavefunction is e^i[(p/ħ)x − (E/ħ)t], then we know the energy, and the momentum, has to be equal to ħ/2. Where does it get that energy, or momentum? Not sure. I like to think it borrows it from spacetime, as it breaks some potential barrier between those two points, and then it gives it back. Or, if it’s at point x = t = 0, then perhaps it gets it from some other massless particle moving from x = t = −1. In both cases, we’d like to think our particle keeps moving. So if the first description (borrowing) is correct, it needs to keep borrowing and returning energy in some kind of interaction with spacetime itself. If it’s the second description, it’s more like spacetime bumping itself forward.

In both cases, however, we’re actually trying to visualize (or should I say: imagine?) some oscillation of spacetime itself, as opposed to an oscillation in spacetime.

Huh? Yes. The idea is the following here: we like to think of the wavefunction as the dependent variable: both its real as well as its imaginary part are a function of x and t, indeed. But what if we’d think of x and t as dependent variables? In that case, the real and imaginary part of the wavefunction would be the independent variables. It’s just a matter of perspective. We sort of mirror our function: we switch its domain for its range, and its range for its domain, as shown below. It all makes sense, doesn’t it? Space and time appear as separate dimensions to us, but they’re intimately connected through c, ħ and the other fundamental physical constants. Likewise, the real and imaginary part of the wavefunction appear as separate dimensions, but they’re intimately connected through π and Euler’s number, i.e. through mathematical constants. That cannot be a coincidence: the mathematical and physical ‘space’ reflect each other through the wavefunction, just like the domain and range of a function reflect each other through that function. So physics and math must meet in some kind of union—at least in our mind, they do!

So, yes, we can—and probably should—be looking at the wavefunction as an oscillation of spacetime, rather than as an oscillation in spacetime only. As mentioned in my introduction, I’ll need to study general relativity theory—and very much in depth—to convincingly prove that point, but I am sure it can be done.

You’ll probably think I am arrogant when saying that—and I probably am—but then I am very much emboldened by the fact some nuclear scientist told me a photon doesn’t have any wavefunction: it’s just those E and B vectors, he told me—and then I found out he was dead wrong, as I showed in my previous post! So I’d rather think more independently now. I’ll keep you guys posted on progress—but it will probably take a while to figure it all out. In the meanwhile, please do let me know your ideas on this. 🙂

Let me wrap up this little excursion with two small notes:

We have this E/c = p relation. The mass-energy equivalence relation implies momentum must also have an equivalent mass. If E = m∙c², then p = m∙c ⇔ m = p/c. It’s obvious, but I just thought it would be useful to highlight this.
When we studied the ammonia molecule as a two-state system, our space was not a continuum: we allowed just two positions—two points in space, which we defined with respect to the system. So x was a discrete variable. We assumed time to be continuous, however, and so we got those nice sinusoids as a solution to our set of Hamiltonian equations. However, if we look at space as being discrete, or countable, we should probably think of time as being countable as well. So we should, perhaps, think of a particle being at point x = t = 0 first, and, then, being at point x = t = 1. Instead of the nice sinusoids, we get some boxcar function, as illustrated below, but probably varying between 0 and 1—or whatever other normalized values. You get the idea, I hope. 🙂

Post Scriptum on the Principle of Least Action: As noted above, the Principle of Least Action is not very intuitive, even if Feynman’s exposé of it is not as impregnable as it may look at first. To put it simply, the Principle of Least Action says that the average kinetic energy less the average potential energy is as little as possible for the path of an object going from one point to another. So we have a path or line integral here. In a gravitational field, that integral is the following:

Least action

The integral is not all that important. Just note its dimension is the dimension of action indeed, as we multiply energy (the integrand) with time (dt). We can use the Principle of Least Action to re-state Newton’s Law, or whatever other classical law. Among other things, we’ll find that, in the absence of any potential, the trajectory of a particle will just be some straight line.
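To make the principle a bit more tangible, here is a minimal numerical sketch (it is not Feynman’s calculation). It assumes a mass thrown straight up in a uniform gravitational field that comes back to its starting height after a time T, and it approximates the action as a sum of (K.E. − P.E.)·dt over small time steps. The function names, the values of m, g and T, and the sine-shaped bump are all illustrative assumptions.

```python
import math

def action(path, dt, m=1.0, g=9.81):
    """Discretised action: sum of (kinetic - potential energy) * dt."""
    total = 0.0
    for h0, h1 in zip(path, path[1:]):
        v = (h1 - h0) / dt               # average velocity on this step
        h_mid = 0.5 * (h0 + h1)          # average height on this step
        total += (0.5 * m * v**2 - m * g * h_mid) * dt
    return total

def newtonian_path(T, n, g=9.81):
    """Heights of a vertical throw that starts and ends at h = 0."""
    dt = T / n
    return [0.5 * g * (i * dt) * (T - i * dt) for i in range(n + 1)], dt

def perturb(path, eps):
    """Add a bump that vanishes at both end points (assumed shape: a sine)."""
    n = len(path) - 1
    return [h + eps * math.sin(math.pi * i / n) for i, h in enumerate(path)]

path, dt = newtonian_path(T=1.0, n=200)
S_true = action(path, dt)
S_up = action(perturb(path, +0.1), dt)
S_down = action(perturb(path, -0.1), dt)
print(S_true < S_up, S_true < S_down)
```

Both comparisons should come out true: a small detour on either side of the Newtonian path raises the action.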

In quantum mechanics, however, we have uncertainty, as expressed in the ΔpΔx ≥ ħ/2 and ΔEΔt ≥ ħ/2 relations. Now, that uncertainty may express itself in time, or in distance, or in both. That’s where things become tricky. 🙂 I’ve written on this before, but let me copy Feynman himself here, with a more exact explanation of what’s happening (just click on the text to enlarge):

Feynman

The ‘student’ he speaks of above, is himself, of course. 🙂

Too complicated? Well… Never mind. I’ll come back to it later. 🙂

Some content on this page was disabled on June 16, 2020 as a result of a DMCA takedown notice from The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/
Some content on this page was disabled on June 17, 2020 as a result of a DMCA takedown notice from Michael A. Gottlieb, Rudolf Pfeiffer, and The California Institute of Technology. You can learn more about the DMCA here:

https://wordpress.com/support/copyright-and-the-dmca/

All what you ever wanted to know about the photon wavefunction…

Post scriptum note added on 11 July 2016: This is one of the more speculative posts which led to my e-publication analyzing the wavefunction as an energy propagation. With the benefit of hindsight, I would recommend you to immediately read the more recent exposé on the matter that is being presented here, which you can find by clicking on the provided link.

Original post:

This post is, essentially, a continuation of my previous post, in which I juxtaposed the following images:

Both are the same, and then they’re not. The illustration on the right-hand side is a regular quantum-mechanical wavefunction, i.e. an amplitude wavefunction. You’ve seen that one before. In this case, the x-axis represents time, so we’re looking at the wavefunction at some particular point in space. [You know we can just switch the dimensions and it would all look the same.] The illustration on the left-hand side looks similar, but it’s not an amplitude wavefunction. The animation shows how the electric field vector (E) of an electromagnetic wave travels through space. Its shape is the same. So it’s the same function. Is it also the same reality?

Yes and no. And I would say: more no than yes—in this case, at least. Note that the animation does not show the accompanying magnetic field vector (B). That vector is equally essential in the electromagnetic propagation mechanism according to Maxwell’s equations, which—let me remind you—are equal to:

∂B/∂t = –∇×E
∂E/∂t = ∇×B

In fact, I should write the second equation as ∂E/∂t = c²∇×B, but then I assume we measure time and distance in equivalent units, so c = 1.

You know that E and B are two aspects of one and the same thing: if we have one, then we have the other. To be precise, B is always orthogonal to E in the direction that’s given by the right-hand rule for the following vector cross-product: B = e_x×E, with e_x the unit vector pointing in the x-direction (i.e. the direction of propagation). The reality behind is illustrated below for a linearly polarized electromagnetic wave.

E and b

The B = e_x×E equation is equivalent to writing B = i·E, which is equivalent to:

B = i·E = eⁱ^(π/2)·eⁱ^{(kx − ωt)} = cos(kx − ωt + π/2) + i·sin(kx − ωt + π/2)

= −sin(kx − ωt) + i·cos(kx − ωt)

Now, E and B have only two components: E_y and E_z, and B_y and B_z. That’s only because we’re looking at some ideal or elementary electromagnetic wave here but… Well… Let’s just go along with it. 🙂 It is then easy to prove that the equation above amounts to writing:

B_y = cos(kx − ωt + π/2) = −sin(kx − ωt) = −E_z
B_z = sin(kx − ωt + π/2) = cos(kx − ωt) = E_y
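The component relations above can be checked with a few lines of Python. This is only a sketch: it treats E as the complex number E_y + i·E_z, assumes unit amplitude, and picks arbitrary values for k, ω, x and t. The function name is hypothetical.

```python
import cmath
import math

def fields(k, omega, x, t):
    """Return (E, B) as the complex numbers E_y + i*E_z and B_y + i*B_z."""
    theta = k * x - omega * t
    E = cmath.exp(1j * theta)   # E_y = cos(theta), E_z = sin(theta)
    B = 1j * E                  # multiplying by i rotates by 90 degrees
    return E, B

E, B = fields(k=1.0, omega=1.0, x=0.3, t=0.1)
print(math.isclose(B.real, -E.imag))   # B_y = -E_z
print(math.isclose(B.imag, E.real))    # B_z =  E_y
```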

We should now think of E_y and E_z as the real and imaginary part of some wavefunction, which we’ll denote as ψ_E = eⁱ^{(kx − ωt)}. So we write:

E = (E_y, E_z) = E_y + i·E_z = cos(kx − ωt) + i∙sin(kx − ωt) = Re(ψ_E) + i·Im(ψ_E) = ψ_E = eⁱ^{(kx − ωt)}

What about B? We just do the same, so we write:

B = (B_y, B_z) = B_y + i·B_z = ψ_B = i·E = i·ψ_E = −sin(kx − ωt) + i∙cos(kx − ωt) = −Im(ψ_E) + i·Re(ψ_E)

Now we need to prove that ψ_E and ψ_B are regular wavefunctions, which amounts to proving Schrödinger’s equation, i.e. ∂ψ/∂t = i·(ħ/m)·∇²ψ, for both ψ_E and ψ_B. [Note I use the Schrödinger’s equation for a zero-mass spin-zero particle here, which uses the ħ/m factor rather than the ħ/(2m) factor.] To prove that ψ_E and ψ_B are regular wavefunctions, we should prove that:

Re(∂ψ_E/∂t) = −(ħ/m)·Im(∇²ψ_E) and Im(∂ψ_E/∂t) = (ħ/m)·Re(∇²ψ_E), and
Re(∂ψ_B/∂t) = −(ħ/m)·Im(∇²ψ_B) and Im(∂ψ_B/∂t) = (ħ/m)·Re(∇²ψ_B).

Let’s do the calculations for the second pair of equations. The time derivative on the left-hand side is equal to:

∂ψ_B/∂t = −iω·i·eⁱ^{(kx − ωt)} = ω·[cos(kx − ωt) + i·sin(kx − ωt)] = ω·cos(kx − ωt) + iω·sin(kx − ωt)

The second-order derivative on the right-hand side is equal to:

∇²ψ_B = ∂²ψ_B/∂x² = −i·k²·eⁱ^{(kx − ωt)} = k²·sin(kx − ωt) − i·k²·cos(kx − ωt)

So the two equations for ψ_B are equivalent to writing:

Re(∂ψ_B/∂t) = −(ħ/m)·Im(∇²ψ_B) ⇔ ω·cos(kx − ωt) = k²·(ħ/m)·cos(kx − ωt)
Im(∂ψ_B/∂t) = (ħ/m)·Re(∇²ψ_B) ⇔ ω·sin(kx − ωt) = k²·(ħ/m)·sin(kx − ωt)

So we see that both conditions are fulfilled if, and only if, ω = k²·(ħ/m).
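For readers who prefer a numerical check, the sketch below tests the ω = k²·(ħ/m) condition for ψ_B with central finite differences. It assumes natural units (ħ = 1) and an arbitrary mass factor. The names HBAR, M, psi_B and residual are illustrative only.

```python
import cmath

HBAR = 1.0   # natural units (assumption for this sketch)
M = 2.0      # arbitrary illustrative mass factor

def psi_B(x, t, k, omega):
    return 1j * cmath.exp(1j * (k * x - omega * t))

def residual(k, omega, x=0.4, t=0.2, h=1e-4):
    """|dpsi/dt - i*(hbar/m)*d2psi/dx2| estimated with central differences."""
    dpsi_dt = (psi_B(x, t + h, k, omega) - psi_B(x, t - h, k, omega)) / (2 * h)
    d2psi_dx2 = (psi_B(x + h, t, k, omega) - 2 * psi_B(x, t, k, omega)
                 + psi_B(x - h, t, k, omega)) / h**2
    return abs(dpsi_dt - 1j * (HBAR / M) * d2psi_dx2)

k = 3.0
omega_ok = k**2 * HBAR / M   # satisfies the condition
omega_bad = k                # violates it here, because k*hbar/m is not 1
print(residual(k, omega_ok), residual(k, omega_bad))
```

The first residual should be close to zero (up to discretisation error), while the second should be of order |ω − k²·ħ/m|.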

Now, we also demonstrated in that post of mine that Maxwell’s equations imply the following:

∂B_y/∂t = –(∇×E)_y = ∂E_z/∂x = ∂[sin(kx − ωt)]/∂x = k·cos(kx − ωt) = k·E_y
∂B_z/∂t = –(∇×E)_z = – ∂E_y/∂x = – ∂[cos(kx − ωt)]/∂x = k·sin(kx − ωt) = k·E_z

Hence, using those B_y = −E_z and B_z = E_y equations above, we can also calculate these derivatives as:

∂B_y/∂t = −∂E_z/∂t = −∂sin(kx − ωt)/∂t = ω·cos(kx − ωt) = ω·E_y
∂B_z/∂t = ∂E_y/∂t = ∂cos(kx − ωt)/∂t = −ω·[−sin(kx − ωt)] = ω·E_z

In other words, Maxwell’s equations imply that ω = k, which is consistent with us measuring time and distance in equivalent units, so the phase velocity is c = 1 = ω/k.

So far, so good. We basically established that the propagation mechanism for an electromagnetic wave, as described by Maxwell’s equations, is fully coherent with the propagation mechanism—if we can call it like that—as described by Schrödinger’s equation. We also established the following equalities:

ω = k
ω = k²·(ħ/m)

The second of the two de Broglie equations tells us that k = p/ħ, so we can combine these two equations and re-write these two conditions as:

ω/k = 1 = k·(ħ/m) = (p/ħ)·(ħ/m) = p/m ⇔ p = m

What does this imply? The p here is the momentum: p = m·v, so this condition implies v must be equal to 1 too, so the wave velocity is equal to the speed of light. Makes sense, because we actually are talking light here. 🙂 In addition, because it’s light, we also know E/p = c = 1, so we have – once again – the general E = p = m equation, which we’ll need!

OK. Next. Let’s write the Schrödinger wave equation for both wavefunctions:

∂ψ_E/∂t = i·(ħ/m_E)·∇²ψ_E, and
∂ψ_B/∂t = i·(ħ/m_B)·∇²ψ_B.

Huh? What’s m_E and m_B? We should only associate one mass concept with our electromagnetic wave, shouldn’t we? Perhaps. I just want to be on the safe side now. Of course, if we distinguish m_E and m_B, we should probably also distinguish p_E and p_B, and E_E and E_B as well, right? Well… Yes. If we accept this line of reasoning, then the mass factor in Schrödinger’s equations is pretty much like the 1/c² = μ₀ε₀ factor in Maxwell’s (1/c²)·∂E/∂t = ∇×B equation: the mass factor appears as a property of the medium, i.e. the vacuum here! [Just check my post on physical constants in case you wonder what I am trying to say here, in which I explain why and how c defines the (properties of the) vacuum.]

To be consistent, we should also distinguish p_E and p_B, and E_E and E_B, and so we should write ψ_E and ψ_B as:

ψ_E = eⁱ^{(k_Ex − ω_Et)}, and
ψ_B = eⁱ^{(k_Bx − ω_Bt)}.

Huh? Yes. I know what you think: we’re talking one photon—or one electromagnetic wave—so there can be only one energy, one momentum and, hence, only one k, and one ω. Well… Yes and no. Of course, the following identities should hold: k_E = k_B and, likewise, ω_E= ω_B. So… Yes. They’re the same: one k and one ω. But then… Well… Conceptually, the two k’s and ω’s are different. So we write:

p_E = E_E = m_E, and
p_B = E_B = m_B.

The obvious question is: can we just add them up to find the total energy and momentum of our photon? The answer is obviously positive: E = E_E + E_B, p = p_E + p_B and m = m_E + m_B.

Let’s check a few things now. How does it work for the phase and group velocity of ψ_E and ψ_B? Simple:

v_g = ∂ω_E/∂k_E = ∂[E_E/ħ]/∂[p_E/ħ] = ∂E_E/∂p_E = ∂p_E/∂p_E = 1
v_p = ω_E/k_E = (E_E/ħ)/(p_E/ħ) = E_E/p_E = p_E/p_E = 1

So we’re fine, and you can check the result for ψ_B by substituting the subscript B for E. To sum it all up, what we’ve got here is the following:

We can think of a photon having some energy that’s equal to E = p = m (assuming c = 1), but that energy would be split up in an electric and a magnetic wavefunction respectively: ψ_E and ψ_B.
Schrödinger’s equation applies to both wavefunctions, but the E, p and m in those two wavefunctions are the same and not the same: their numerical value is the same (p_E = E_E = m_E = p_B = E_B = m_B), but they’re conceptually different. They must be: if not, we’d get a phase and group velocity for the wave that doesn’t make sense.

Of course, the phase and group velocity for the sum of the ψ_E and ψ_B waves must also be equal to c. This is obviously the case, because we’re adding waves with the same phase and group velocity c, so there’s no issue with the dispersion relation.
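As a small illustration of the two velocities, the sketch below evaluates them numerically for the linear dispersion relation ω = k, assuming natural units (c = ħ = 1). The helper names and the finite-difference step are illustrative choices.

```python
def phase_velocity(omega, k):
    """v_p = omega / k for a dispersion relation omega(k)."""
    return omega(k) / k

def group_velocity(omega, k, h=1e-6):
    """v_g = d(omega)/dk, estimated with a central difference."""
    return (omega(k + h) - omega(k - h)) / (2 * h)

def linear(k):
    """omega = k, i.e. E = p when c = hbar = 1."""
    return k

k0 = 2.5
v_p = phase_velocity(linear, k0)
v_g = group_velocity(linear, k0)
print(v_p, v_g)
```

Both values should come out as 1 (within rounding), whatever k0 is chosen, because the relation is linear.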

So let’s insert those p_E = E_E = m_E = p_B = E_B = m_B values in the two wavefunctions. For ψ_E, we get:

ψ_E = eⁱ^{(k_E·x − ω_E·t)} = eⁱ^{[(p_E/ħ)·x − (E_E/ħ)·t]}

You can do the calculation for ψ_B yourself. Let’s simplify our life a little bit and assume we’re using Planck units, so ħ = 1, and so the wavefunction simplifies to ψ_E = eⁱ^{(p_E·x − E_E·t)}. We can now add the components of E and B using the summation formulas for sines and cosines:

1. B_y + E_y = cos(p_B·x − E_B·t + π/2) + cos(p_E·x − E_E·t) = 2·cos[(p·x − E·t + π/2)/2]·cos(π/4) = √2·cos(p·x/2 − E·t/2 + π/4)

2. B_z + E_z = sin(p_B·x − E_B·t + π/2) + sin(p_E·x − E_E·t) = 2·sin[(p·x − E·t + π/2)/2]·cos(π/4) = √2·sin(p·x/2 − E·t/2 + π/4)

Interesting! We find a composite wavefunction for our photon which we can write as:

E + B = ψ_E + ψ_B = E + i·E = √2·eⁱ^{(p·x/2 − E·t/2 + π/4)} = √2·eⁱ^(π/4)·eⁱ^{(p·x/2 − E·t/2)} = √2·eⁱ^(π/4)·E

What a great result! It’s easy to double-check, because we can see the E + i·E = √2·eⁱ^(π/4)·E formula implies that 1 + i should equal √2·eⁱ^(π/4). Now that’s easy to prove, either geometrically (just do a drawing) or formally: √2·eⁱ^(π/4) = √2·[cos(π/4) + i·sin(π/4)] = (√2/√2) + i·(√2/√2) = 1 + i. We’re bang on! 🙂
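The same identity can be confirmed numerically. The sketch below uses Python’s cmath module. The phase value theta is an arbitrary illustrative choice standing in for p_E·x − E_E·t, and the variable names are assumptions.

```python
import cmath
import math

prefactor = math.sqrt(2) * cmath.exp(1j * math.pi / 4)   # sqrt(2)*e^(i*pi/4)

theta = 0.7                       # stands in for p_E*x - E_E*t (arbitrary)
psi_E = cmath.exp(1j * theta)     # electric wavefunction
psi_B = 1j * psi_E                # magnetic wavefunction, B = i*E
total = psi_E + psi_B

print(cmath.isclose(prefactor, 1 + 1j))
print(cmath.isclose(total, prefactor * psi_E))
```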

We can double-check once more, because we should get the same from adding E and B = i·E, right? Let’s try:

E + B = E + i·E = cos(p_E·x − E_E·t) + i·sin(p_E·x − E_E·t) + i·cos(p_E·x − E_E·t) − sin(p_E·x − E_E·t)

= [cos(p_E·x − E_E·t) – sin(p_E·x − E_E·t)] + i·[sin(p_E·x − E_E·t) + cos(p_E·x − E_E·t)]

Indeed, we can see we’re going to obtain the same result, because the −sinθ in the real part of our composite wavefunction is equal to cos(θ+π/2), and the cosθ in its imaginary part is equal to sin(θ+π/2). So the sum above is the same sum of cosines and sines that we did already.

So our electromagnetic wavefunction, i.e. the wavefunction for the photon, is equal to:

ψ = ψ_E + ψ_B = √2·eⁱ^{(p·x/2 − E·t/2 + π/4)} = √2·eⁱ^(π/4)·eⁱ^{(p·x/2 − E·t/2)}

What about the √2 factor in front, and the π/4 term in the argument itself? Not sure. It must have something to do with the way the magnetic force works, which is not like the electric force. Indeed, remember the Lorentz formula: the force on some unit charge (q = 1) will be equal to F = E + v×B. So… Well… We’ve got another cross-product here and so the geometry of the situation is quite complicated: it’s not like adding two forces F₁ and F₂ to get some combined force F = F₁ + F₂.

In any case, we need the energy, and we know that it’s proportional to the square of the amplitude, so… Well… We’re spot on: the square of the √2 factor in the √2·cos product and √2·sin product is 2, so that’s twice… Well… What? Hold on a minute! We’re actually taking the absolute square of the E + B = ψ_E + ψ_B = E + i·E = √2·eⁱ^{(p·x/2 − E·t/2 + π/4)} wavefunction here. Is that legal? I must assume it is – although… Well… Yes. You’re right. We should do some more explaining here.

We know that we usually measure the energy as some definite integral, from t = 0 to some other point in time, or over the cycle of the oscillation. So what’s the cycle here? Our combined wavefunction can be written as √2·eⁱ^{(p·x/2 − E·t/2 + π/4)} = √2·eⁱ^{(θ/2 + π/4)}, so a full cycle would correspond to θ going from 0 to 4π here, rather than from 0 to 2π. So that explains the √2 factor in front of our wave equation.

Bingo! If you were looking for an interpretation of the Planck energy and momentum, here it is. And, while everything that’s written above is not easy to understand, it’s close to the ‘intuitive’ understanding to quantum mechanics that we were looking for, isn’t it? The quantum-mechanical propagation model explains everything now. 🙂 I only need to show one more thing, and that’s the different behavior of bosons and fermions:

The amplitudes of identical bosonic particles interfere with a positive sign, so we have Bose-Einstein statistics here. As Feynman writes it: (amplitude direct) + (amplitude exchanged).
The amplitudes of identical fermionic particles interfere with a negative sign, so we have Fermi-Dirac statistics here: (amplitude direct) − (amplitude exchanged).

I’ll think about it. I am sure it’s got something to do with that B = i·E formula or, to put it simply, with the fact that, when bosons are involved, we get two wavefunctions (ψ_E and ψ_B) for the price of one. The reasoning should be something like this:

I. For a massless particle (i.e. a zero-mass fermion), our wavefunction is just ψ = eⁱ^{(p·x − E·t)}. So we have no √2 or √2·eⁱ^(π/4) factor in front here. So we can just add any number of them – ψ₁ + ψ₂ + ψ₃ + … – and then take the absolute square of the amplitude to find a probability density, and we’re done.

II. For a photon (i.e. a zero-mass boson), our wavefunction is √2·eⁱ^(π/4)·eⁱ^{(p·x − E·t)/2}, which – let’s introduce a new symbol – we’ll denote by φ, so φ = √2·eⁱ^(π/4)·eⁱ^{(p·x − E·t)/2}. Now, if we add any number of these, we get a similar sum but with that √2·eⁱ^(π/4) factor in front, so we write: φ₁ + φ₂ + φ₃ + … = √2·eⁱ^(π/4)·(ψ₁ + ψ₂ + ψ₃ + …). If we take the absolute square now, we’ll see the probability density will be equal to twice the density for the ψ₁ + ψ₂ + ψ₃ + … sum, because

|√2·eⁱ^(π/4)·(ψ₁ + ψ₂ + ψ₃ + …)|² = |√2·eⁱ^(π/4)|²·|ψ₁ + ψ₂ + ψ₃ + …|² = 2·|ψ₁ + ψ₂ + ψ₃ + …|²
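The factor 2 is easy to verify for any concrete set of amplitudes. In the sketch below, the three phases are arbitrary illustrative values, and the names PREFACTOR and density are assumptions.

```python
import cmath
import math

PREFACTOR = math.sqrt(2) * cmath.exp(1j * math.pi / 4)

def density(amplitudes):
    """Absolute square of the summed amplitudes."""
    return abs(sum(amplitudes)) ** 2

psis = [cmath.exp(1j * phase) for phase in (0.1, 0.9, 2.3)]   # arbitrary phases
phis = [PREFACTOR * psi for psi in psis]

ratio = density(phis) / density(psis)
print(ratio)
```

The ratio should be 2 up to floating-point rounding, independently of the phases chosen (as long as the ψ sum is not zero).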

So… Well… I still need to connect this to Feynman’s (amplitude direct) ± (amplitude exchanged) formula, but I am sure it can be done.

Now, we haven’t tested the complete √2·eⁱ^(π/4)·eⁱ^{(p·x − E·t)/2} wavefunction. Does it respect Schrödinger’s ∂ψ/∂t = i·(1/m)·∇²ψ or, including the 1/2 factor, the ∂ψ/∂t = i·[1/(2m)]·∇²ψ equation? [Note we assume, once again, that ħ = 1, so we use Planck units once more.] Let’s see. We can calculate the derivatives as:

∂ψ/∂t = −√2·eⁱ^(π/4)·eⁱ^{(p·x − E·t)/2}·(i·E/2)
∇²ψ = ∂²[√2·eⁱ^(π/4)·eⁱ^{(p·x − E·t)/2}]/∂x² = ∂[√2·eⁱ^(π/4)·eⁱ^{(p·x − E·t)/2}·(i·p/2)]/∂x = −√2·eⁱ^(π/4)·eⁱ^{(p·x − E·t)/2}·(p²/4)

So Schrödinger’s equation becomes:

−√2·eⁱ^(π/4)·eⁱ^{(p·x − E·t)/2}·(i·E/2) = −i·(1/m)·√2·eⁱ^(π/4)·eⁱ^{(p·x − E·t)/2}·(p²/4) ⇔ E/2 = p²/(4m) ⇔ 1/2 = 1/4!?

That’s funny ! It doesn’t work ! The E and m and p² are OK because we’ve got that E = m = p equation, but we’ve got problems with yet another factor 2. It only works when we use the 2/m coefficient in Schrödinger’s equation.

So… Well… There’s no choice. That’s what we’re going to do. The Schrödinger equation for the photon is ∂ψ/∂t = i·(2/m)·∇²ψ !
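A numerical sketch of that conclusion: instead of assuming a coefficient, we can ask which coefficient c makes ∂φ/∂t = i·c·∇²φ hold for the composite wavefunction when E = p = m. The code below estimates both derivatives with finite differences, assuming ħ = 1 and an arbitrary value for m. The function names are illustrative.

```python
import cmath
import math

def phi(x, t, p, E):
    """Composite wavefunction sqrt(2)*e^(i*pi/4)*e^(i*(p*x - E*t)/2)."""
    return math.sqrt(2) * cmath.exp(1j * math.pi / 4) * cmath.exp(1j * (p * x - E * t) / 2)

def required_coefficient(p, E, x=0.2, t=0.1, h=1e-4):
    """Return c such that dphi/dt = i*c*d2phi/dx2 (central differences)."""
    dphi_dt = (phi(x, t + h, p, E) - phi(x, t - h, p, E)) / (2 * h)
    d2phi_dx2 = (phi(x + h, t, p, E) - 2 * phi(x, t, p, E) + phi(x - h, t, p, E)) / h**2
    return dphi_dt / (1j * d2phi_dx2)

m = 1.5                                  # arbitrary illustrative value
c = required_coefficient(p=m, E=m)       # E = p = m, as in the text
print(c, 2 / m)
```

Up to discretisation error, the estimated coefficient should match 2/m rather than 1/m or 1/(2m).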

It’s a very subtle point. This is all great, and very fundamental stuff! Let’s now move on to Schrödinger’s actual equation, i.e. the ∂ψ/∂t = i·(ħ/2m)·∇²ψ equation.

Post scriptum on the Planck units:

If we measure time and distance in equivalent units, say seconds, we can re-write the quantum of action as:

1.0545718×10⁻³⁴N·m·s = (1.21×10⁴⁴N)·(1.6162×10⁻³⁵m)·(5.391×10⁻⁴⁴ s)

⇔ (1.0545718×10⁻³⁴/2.998×10⁸) N·s² = (1.21×10⁴⁴N)·(1.6162×10⁻³⁵/2.998×10⁸ s)(5.391×10⁻⁴⁴ s)

⇔ (1.21×10⁴⁴N) = [(1.0545718×10⁻³⁴/2.998×10⁸)]/[(1.6162×10⁻³⁵/2.998×10⁸ s)(5.391×10⁻⁴⁴ s)] N·s²/s²

You’ll say: what’s this? Well… Look at it. We’ve got a much easier formula for the Planck force—much easier than the standard formulas you’ll find on Wikipedia, for example. If we re-interpret the symbols ħ and c so they denote the numerical value of the quantum of action and the speed of light in standard SI units (i.e. newton, meter and second)—so ħ and c become dimensionless, or mathematical constants only, rather than physical constants—then the formula above can be written as:

F_P newton = (ħ/c)/[(l_P/c)·t_P] newton ⇔ F_P = ħ/(l_P·t_P)

Just double-check it: 1.0545718×10⁻³⁴/(1.6162×10⁻³⁵·5.391×10⁻⁴⁴) = 1.21×10⁴⁴. Bingo!
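The arithmetic can be reproduced in a few lines. The values of ħ, l_P and t_P are the ones quoted above. For comparison, the sketch also evaluates the textbook expression F_P = c⁴/G, for which the value of G (about 6.674×10⁻¹¹ in SI units) is an assumption not taken from this post.

```python
HBAR = 1.0545718e-34   # quantum of action, N·m·s (value quoted in the text)
L_P = 1.6162e-35       # Planck length, m (value quoted in the text)
T_P = 5.391e-44        # Planck time, s (value quoted in the text)
C = 2.998e8            # speed of light, m/s (value quoted in the text)
G = 6.674e-11          # gravitational constant, SI units (assumed value)

F_P = HBAR / (L_P * T_P)       # formula derived above
F_P_standard = C**4 / G        # textbook expression, for comparison

print(f"{F_P:.3e} N, {F_P_standard:.3e} N")
```

Both numbers should agree with the 1.21×10⁴⁴ N figure to within rounding of the input constants.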

You’ll say: what’s the point? The point is: our model is complete. We don’t need the other physical constants – i.e. the Coulomb, Boltzmann and gravitational constant – to calculate the Planck units we need, i.e. the Planck force, distance and time units. It all comes out of our elementary wavefunction! All we need to explain the Universe – or, let’s be more modest, quantum mechanics – is two numerical constants (c and ħ) and Euler’s formula (which uses π and e, of course). That’s it.

If you don’t think that’s a great result, then… Well… Then you’re not reading this. 🙂

The photon wavefunction

Original post:

In my previous posts, I juxtaposed the following images:

Both are the same, and then they’re not. The illustration on the left-hand side shows how the electric field vector (E) of an electromagnetic wave travels through space, but it does not show the accompanying magnetic field vector (B), which is as essential in the electromagnetic propagation mechanism according to Maxwell’s equations:

∂B/∂t = –∇×E
∂E/∂t = c²∇×B = ∇×B for c = 1

The second illustration shows a wavefunction eⁱ^{(kx − ωt)}= cos(kx − ωt) + i∙sin(kx − ωt). Its propagation mechanism—if we can call it like that—is Schrödinger’s equation:

∂ψ/∂t = i·(ħ/2m)·∇²ψ

We already drew attention to the fact that an equation like this models some flow. To be precise, the Laplacian on the right-hand side is the second derivative with respect to x here, and, therefore, expresses a flux density: a flow per unit surface area, i.e. per square meter. To be precise: the Laplacian represents the flux density of the gradient flow of ψ.

On the left-hand side of Schrödinger’s equation, we have a time derivative, so that’s a flow per second. The ħ/2m factor is like a diffusion constant. In fact, strictly speaking, that ħ/2m factor is a diffusion constant, because it does exactly the same thing as the diffusion constant D in the diffusion equation ∂φ/∂t = D·∇²φ, i.e:

As a constant of proportionality, it quantifies the relationship between both derivatives.
As a physical constant, it ensures the dimensions on both sides of the equation are compatible.

So our diffusion constant here is ħ/2m. Because of the Uncertainty Principle, m is always going to be some integer multiple of ħ/2, so ħ/2m = 1, 1/2, 1/3, 1/4 etcetera. In other words, the ħ/2m term is the inverse of the mass measured in units of ħ/2. We get the terms of the harmonic series here. How convenient! 🙂

In our previous posts, we studied the wavefunction for a zero-mass particle. Such particle has zero rest mass but – because of its movement – does have some energy, and, therefore, some mass and momentum. In fact, measuring time and distance in equivalent units (so c = 1), we found that E = m = p = ħ/2 for the zero-mass particle. It had to be. If not, our equations gave us nonsense. So Schrödinger’s equation was reduced to:

∂ψ/∂t = i·∇²ψ

How elegant! We only need to explain that imaginary unit (i) in the equation. It does a lot of things. First, it gives us two equations for the price of one—thereby providing a propagation mechanism indeed. It’s just like the E and B vectors. Indeed, we can write that ∂ψ/∂t = i·∇²ψ equation as:

Re(∂ψ/∂t) = −Im(∇²ψ)
Im(∂ψ/∂t) = Re(∇²ψ)

[You should be able to show that the two equations above are effectively equivalent to Schrödinger’s equation. If not… Well… Then you should not be reading this stuff.] The two equations above show that the real part of the wavefunction feeds into its imaginary part, and vice versa. Both are as essential. Let me say this one more time: the so-called real and imaginary part of a wavefunction are equally real – or essential, I should say!

Second, i gives us the circle. Huh? Yes. Writing the wavefunction as ψ = a + i·b is not just like writing a vector in terms of its Cartesian coordinates, even if it looks very much that way. Why not? Well… Never forget: i²= −1, and so—let me use mathematical lingo here—the introduction of i makes our metric space complete. To put it simply: we can now compute everything. In short, the introduction of the imaginary unit gives us that wonderful mathematical construct, eⁱ^{(kx − ωt)}, which allows us to model everything. In case you wonder, I mean: everything! Literally. 🙂

However, we’re not going to impose any pre-conditions here, and so we’re not going to make that E = m = p = ħ/2 assumption now. We’ll just re-write Schrödinger’s equation as we did last time—so we’re going to keep our ‘diffusion constant’ ħ/2m as for now:

Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ)
Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ)

So we have two pairs of equations now. Can they be related? Well… They look the same, so they had better be related! 🙂 Let’s explore it. First note that, if we’d equate the direction of propagation with the x-axis, we can write the E vector as the sum of two y- and z-components: E = (E_y, E_z). Using complex number notation, we can write E as:

E = (E_y, E_z) = E_y + i·E_z

In case you’d doubt, just think of this simple drawing:

The next step is to imagine – funny word when talking complex numbers – that E_y and E_z are the real and imaginary part of some wavefunction, which we’ll denote as ψ_E = eⁱ^{(kx − ωt)}. So now we can write:

E = (E_y, E_z) = E_y + i·E_z = cos(kx − ωt) + i∙sin(kx − ωt) = Re(ψ_E) + i·Im(ψ_E)

What’s k and ω? Don’t worry about it—for the moment, that is. We’ve done nothing special here. In fact, we’re used to representing waves as some sine or cosine function, so that’s what we are doing here. Nothing more. Nothing less. We just need two sinusoids because of the circular polarization of our electromagnetic wave.

What’s next? Well… If ψ_E is a regular wavefunction, then we should be able to check if it’s a solution to Schrödinger’s equation. So we should be able to write:

Re(∂ψ_E/∂t) = −(ħ/2m)·Im(∇²ψ_E)
Im(∂ψ_E/∂t) = (ħ/2m)·Re(∇²ψ_E)

Are we? How does that work? The time derivative on the left-hand side is equal to:

∂ψ_E/∂t = −iω·eⁱ^{(kx − ωt)}= −iω·[cos(kx − ωt) + i·sin(kx − ωt)] = ω·sin(kx − ωt) − iω·cos(kx − ωt)

The second-order derivative on the right-hand side is equal to:

∇²ψ_E= ∂²ψ_E/∂x²= −k²·eⁱ^{(kx − ωt)}= −k²·cos(kx − ωt) − ik²·sin(kx − ωt)

So the two equations above are equivalent to writing:

Re(∂ψ_E/∂t) = −(ħ/2m)·Im(∇²ψ_E) ⇔ ω·sin(kx − ωt) = k²·(ħ/2m)·sin(kx − ωt)
Im(∂ψ_E/∂t) = (ħ/2m)·Re(∇²ψ_E) ⇔ −ω·cos(kx − ωt) = −k²·(ħ/2m)·cos(kx − ωt)

Both conditions are fulfilled if, and only if, ω = k²·(ħ/2m). Now, assuming we measure time and distance in equivalent units (c = 1), we can calculate the phase velocity of the electromagnetic wave as being equal to c = ω/k = 1. We also have the de Broglie equation for the matter-wave, even if we’re not quite sure whether or not we should apply that to an electromagnetic wave. In any case, the de Broglie equation tells us that k = p/ħ. So we can re-write this condition as:

ω/k = 1 = k·(ħ/2m) = (p/ħ)·(ħ/2m) = p/2m ⇔ p = 2m ⇔ m = p/2

So that’s different from the E = m = p equality we imposed when discussing the wavefunction of the zero-mass particle: we’ve got that 1/2 factor which bothered us so much once again! And it’s causing us the same trouble: how do we interpret that m = p/2 equation? It leads to nonsense once more! E = m·c²= m, but E is also supposed to be equal to p·c = p. Here, however, we find that E = p/2! We also get strange results when calculating the group and phase velocity. So… Well… What’s going on here?
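To see where the factor comes from, the sketch below solves the two conditions (ω = k and ω = k²·ħ/(n·m)) for the momentum p = ħ·k, with n = 2 for the ħ/(2m) coefficient and n = 1 for the ħ/m coefficient. It assumes ħ = 1 by default and an arbitrary mass. The function name and the parameter n are hypothetical.

```python
import math

def momentum_from_conditions(m, n, hbar=1.0):
    """Momentum p = hbar*k satisfying omega = k and omega = k**2*hbar/(n*m)."""
    diffusion = hbar / (n * m)    # coefficient in front of the Laplacian
    k = 1.0 / diffusion           # from k = k**2 * diffusion, with k != 0
    return hbar * k

m = 1.5                                         # arbitrary illustrative value
p_half = momentum_from_conditions(m, n=2)       # hbar/(2m) coefficient
p_full = momentum_from_conditions(m, n=1)       # hbar/m coefficient
print(p_half, p_full)
```

With the ħ/(2m) coefficient the result is p = 2m, while the ħ/m coefficient gives p = m.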

I am not quite sure. It’s that damn 1/2 factor. Perhaps it’s got something to do with our definition of mass. The m in the Schrödinger equation was referred to as the effective or reduced mass of the electron wavefunction that it was supposed to model. Now that concept is something funny: it sure allows for some gymnastics, as you’ll see when going through the Wikipedia article on it! I promise I’ll dig into it—but not now and here, as I’ve got no time for that. 😦

However, the good news is that we also get a magnetic field vector with an electromagnetic wave: B. We know B is always orthogonal to E, and in the direction that’s given by the right-hand rule for the vector cross-product. Indeed, we can write B as B = e_x×E/c, with e_x the unit vector pointing in the x-direction (i.e. the direction of propagation), as shown below.

E and b

So we can do the same analysis: we just substitute B for E everywhere, and we’ll find the same condition: m = p/2. To distinguish the two wavefunctions, we used the E and B subscripts for our wavefunctions, so we wrote ψ_E and ψ_B. We can do the same for that m = p/2 condition:

m_E = p_E/2
m_B = p_B/2

Should we just add m_E and m_B to get a total momentum and, hence, a total energy, that’s equal to E = m = p for the whole wave? I believe we should, but I haven’t quite figured out how we should interpret that summation!

So… Well… Sorry to disappoint you. I haven’t got the answer here. But I do believe my instinct tells me the truth: the wavefunction for an electromagnetic wave—so that’s the wavefunction for a photon, basically—is essentially the same as our wavefunction for a zero-mass particle. It’s just that we get two wavefunctions for the price of one. That’s what distinguishes bosons from fermions! And so I need to figure out how they differ exactly! And… Well… Yes. That might take me a while!

In the meanwhile, we should play some more with those E and B vectors, as that’s going to help us to solve the riddle—no doubt!

Fiddling with E and B

The B = e_x×E/c equation is equivalent to saying that we’ll get B when rotating E by 90 degrees which, in turn, is equivalent to multiplication by the imaginary unit i. Huh? Yes. Sorry. Just google the meaning of the vector cross product and multiplication by i. So we can write B = i·E, which amounts to writing:

B = i·E = eⁱ^(π/2)·eⁱ^{(kx − ωt)} = eⁱ^{(kx − ωt + π/2)} = cos(kx − ωt + π/2) + i·sin(kx − ωt + π/2)

So we can now associate a wavefunction ψ_B with the magnetic field vector B, which is the same wavefunction as ψ_E except for a phase shift equal to π/2. You’ll say: so what? Well… Nothing much. I guess this observation just concludes this long digression on the wavefunction of a photon: it’s the same wavefunction as that of a zero-mass particle – except that we get two for the price of one!

It’s an interesting way of looking at things. Let’s look at the equations we started this post with, i.e. Maxwell’s equations in free space—i.e. no stationary charges, and no currents (i.e. moving charges) either! So we’re talking those ∂B/∂t = –∇×E and ∂E/∂t = ∇×B equations now.

Note that they actually give you four equations, because they’re vector equations:

∂B/∂t = –∇×E ⇔ ∂B_y/∂t = –(∇×E)_y and ∂B_z/∂t = –(∇×E)_z
∂E/∂t = ∇×B ⇔ ∂E_y/∂t = (∇×B)_y and ∂E_z/∂t = (∇×B)_z

To figure out what that means, we need to remind ourselves of the definition of the curl operator, i.e. the ∇× operator. For E, the components of ∇×E are the following:

(∇×E)_z = ∇_x E_y – ∇_y E_x = ∂E_y/∂x – ∂E_x/∂y
(∇×E)_x = ∇_y E_z – ∇_z E_y = ∂E_z/∂y – ∂E_y/∂z
(∇×E)_y = ∇_z E_x – ∇_x E_z = ∂E_x/∂z – ∂E_z/∂x

So the four equations above can now be written as:

∂B_y/∂t = –(∇×E)_y = –∂E_x/∂z + ∂E_z/∂x
∂B_z/∂t = –(∇×E)_z = –∂E_y/∂x + ∂E_x/∂y
∂E_y/∂t = (∇×B)_y = ∂B_x/∂z – ∂B_z/∂x
∂E_z/∂t = (∇×B)_z= ∂B_y/∂x – ∂B_x/∂y

What can we do with this? Well… The x-component of E and B is zero, so one of the two terms in the equations simply disappears. We get:

∂B_y/∂t = –(∇×E)_y = ∂E_z/∂x
∂B_z/∂t = –(∇×E)_z = – ∂E_y/∂x
∂E_y/∂t = (∇×B)_y = – ∂B_z/∂x
∂E_z/∂t = (∇×B)_z= ∂B_y/∂x

Interesting: only the derivatives with respect to x remain! Let’s calculate them:

∂B_y/∂t = –(∇×E)_y = ∂E_z/∂x = ∂[sin(kx − ωt)]/∂x = k·cos(kx − ωt) = k·E_y
∂B_z/∂t = –(∇×E)_z = – ∂E_y/∂x = – ∂[cos(kx − ωt)]/∂x = k·sin(kx − ωt) = k·E_z
∂E_y/∂t = (∇×B)_y = – ∂B_z/∂x = – ∂[sin(kx − ωt + π/2)]/∂x = – k·cos(kx − ωt + π/2) = – k·B_y
∂E_z/∂t = (∇×B)_z= ∂B_y/∂x = ∂[cos(kx − ωt + π/2)]/∂x = − k·sin(kx − ωt + π/2) = – k·B_z

What wonderful results! The time derivatives of the components of B and E are equal to ±k times the components of E and B respectively! So everything is related to everything, indeed! 🙂

Let’s play some more. Using the cos(θ + π/2) = −sin(θ) and sin(θ + π/2) = cos(θ) identities, we know that B_y = cos(kx − ωt + π/2) and B_z = sin(kx − ωt + π/2) are equal to:

B_y = cos(kx − ωt + π/2) = −sin(kx − ωt) = −E_z
B_z = sin(kx − ωt + π/2) = cos(kx − ωt) = E_y

Let’s calculate those derivatives once more now:

∂B_y/∂t = −∂E_z/∂t = −∂sin(kx − ωt)/∂t = ω·cos(kx − ωt) = ω·E_y
∂B_z/∂t = ∂E_y/∂t = ∂cos(kx − ωt)/∂t = −ω·[−sin(kx − ωt)] = ω·sin(kx − ωt) = ω·E_z

This result can, obviously, be true only if ω = k, which we assume to be the case, as we’re measuring time and distance in equivalent units, so the phase velocity is c = 1 = ω/k.
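The ω = k requirement can also be checked numerically for all four component equations at once. The sketch below assumes the unit-amplitude fields used in this post (E_y = cos, E_z = sin, B_y = −E_z, B_z = E_y) and estimates the derivatives with central differences. The function names and the sample point are illustrative.

```python
import math

def make_wave(k, omega):
    """Field components of the circularly polarised wave used in the text."""
    phase = lambda x, t: k * x - omega * t
    return {
        "E_y": lambda x, t: math.cos(phase(x, t)),
        "E_z": lambda x, t: math.sin(phase(x, t)),
        "B_y": lambda x, t: -math.sin(phase(x, t)),   # B_y = -E_z
        "B_z": lambda x, t: math.cos(phase(x, t)),    # B_z =  E_y
    }

def d_dt(f, x, t, h=1e-5):
    return (f(x, t + h) - f(x, t - h)) / (2 * h)

def d_dx(f, x, t, h=1e-5):
    return (f(x + h, t) - f(x - h, t)) / (2 * h)

def maxwell_residuals(k, omega, x=0.3, t=0.1):
    """Left-hand side minus right-hand side of the four equations (c = 1)."""
    w = make_wave(k, omega)
    return [
        d_dt(w["B_y"], x, t) - d_dx(w["E_z"], x, t),   # dB_y/dt =  dE_z/dx
        d_dt(w["B_z"], x, t) + d_dx(w["E_y"], x, t),   # dB_z/dt = -dE_y/dx
        d_dt(w["E_y"], x, t) + d_dx(w["B_z"], x, t),   # dE_y/dt = -dB_z/dx
        d_dt(w["E_z"], x, t) - d_dx(w["B_y"], x, t),   # dE_z/dt =  dB_y/dx
    ]

worst_equal = max(abs(r) for r in maxwell_residuals(2.0, 2.0))
worst_unequal = max(abs(r) for r in maxwell_residuals(2.0, 3.0))
print(worst_equal, worst_unequal)
```

The residuals should vanish (up to discretisation error) when ω = k and be clearly non-zero otherwise.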

Hmm… I am sure it won’t be long before I’ll be able to prove what I want to prove. I just need to figure out the math. It’s pretty obvious now that the wavefunction—any wavefunction, really—models the flow of energy. I just need to show how it works for the zero-mass particle—and then I mean: how it works exactly. We must be able to apply the concept of the Poynting vector to wavefunctions. We must be. I’ll find how. One day. 🙂

As for now, however, I feel we’ve played enough with those wavefunctions now. It’s time to do what we promised to do a long time ago, and that is to use Schrödinger’s equation to calculate electron orbitals—and other stuff, of course! Like… Well… We hardly ever talked about spin, did we? That comes with huge complexities. But we’ll get through it. Trust me. 🙂

The quantum of time and distance

Original post:

In my previous post, I introduced the elementary wavefunction of a particle with zero rest mass in free space (i.e. the particle also has zero potential). I wrote that wavefunction as eⁱ^{(kx − ωt)}= eⁱ^{(x/2 − t/2)}= cos[(x−t)/2] + i∙sin[(x−t)/2], and we can represent that function as follows:

5d_euler_f

If the real and imaginary axis in the image above are the y- and z-axis respectively, then the x-axis here is time, so here we’d be looking at the shape of the wavefunction at some fixed point in space.

Now, we could also look at its shape at some fixed point in time, so the x-axis would then represent the spatial dimension. Better still, we could animate the illustration to incorporate both the temporal as well as the spatial dimension. The following animation does the trick quite well:

Animation

Please do note that space is one-dimensional here: the y- and z-axis represent the real and imaginary part of the wavefunction, not the y- or z-dimension in space.

You’ve seen this animation before, of course: I took it from Wikipedia, and it actually represents the electric field vector (E) for a circularly polarized electromagnetic wave. To get a complete picture of the electromagnetic wave, we should add the magnetic field vector (B), which is not shown here. We’ll come back to that later. Let’s first look at our zero-mass particle denuded of all properties, so that’s not an electromagnetic wave—read: a photon. No. We don’t want to talk charges here.

OK. So far so good. A zero-mass particle in free space. So we got that eⁱ^{(x/2 − t/2)}= cos[(x−t)/2] + i∙sin[(x−t)/2] wavefunction. We got that function assuming the following:

Time and distance are measured in equivalent units, so c = 1. Hence, the classical velocity (v) of our zero-mass particle is equal to 1, and we also find that the energy (E), mass (m) and momentum (p) of our particle are numerically the same. We wrote: E = m = p, using the p = m·v (for v = c) and the E = m∙c² formulas.
We also assumed that the quantum of energy (and, hence, the quantum of mass, and the quantum of momentum) was equal to ħ/2, rather than ħ. The de Broglie relations (k = p/ħ and ω = E/ħ) then gave us the rather particular argument of our wavefunction: kx − ωt = x/2 − t/2.

The latter hypothesis (E = m = p = ħ/2) is somewhat strange at first but, as I showed in that post of mine, it avoids an apparent contradiction: if we’d use ħ, then we would find two different values for the phase and group velocity of our wavefunction. To be precise, we’d find v for the group velocity, but v/2 for the phase velocity. Using ħ/2 solves that problem. In addition, using ħ/2 is consistent with the Uncertainty Principle, which tells us that ΔxΔp = ΔEΔt = ħ/2.

OK. Take a deep breath. Here I need to say something about dimensions. If we’re saying that we’re measuring time and distance in equivalent units – say, in meter, or in seconds – then we are not saying that they’re the same. The dimension of time and space is fundamentally different, as evidenced by the fact that, for example, time flows in one direction only, as opposed to x. To be precise, we assumed that x and t become countable variables themselves at some point in time. However, if we’re at t = 0, then we’d count time as t = 1, 2, etcetera only. In contrast, at the point x = 0, we can go to x = +1, +2, etcetera but we may also go to x = −1, −2, etc.

I have to stress this point, because what follows will require some mental flexibility. In fact, we often talk about natural units, such as Planck units, which we get from equating fundamental constants, such as c, or ħ, to 1, but then we often struggle to interpret those units, because we fail to grasp what it means to write c = 1, or ħ = 1. For example, writing c = 1 implies we can measure distance in seconds, or time in meter, but it does not imply that distance becomes time, or vice versa. We still need to keep track of whether or not we’re talking a second in time, or a second in space, i.e. c meter, or, conversely, whether we’re talking a meter in space, or a meter in time, i.e. 1/c seconds. We can make the distinction in various ways. For example, we could mention the dimension of each equation between brackets, so we’d write: t = 1×10⁻¹⁵ s [t] ≈ 299.8×10⁻⁹ m [t]. Alternatively, we could put a little subscript (like _t, or _d), so as to make sure it’s clear our meter is a ‘light-meter’, so we’d write: t = 1×10⁻¹⁵ s ≈ 299.8×10⁻⁹ m_t. Likewise, we could add a little subscript when measuring distance in light-seconds, so we’d write x = 3×10⁸ m ≈ 1 s_d, rather than x = 3×10⁸ m [x] ≈ 1 s [x].
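The two conversions in the examples above can be sketched in a few lines. The function names are hypothetical (they are just my labels for the ‘time-meter’ and ‘distance-second’ ideas), and the only physical input is the SI value of c.

```python
C = 299_792_458.0  # speed of light in m/s (exact SI value)

def seconds_to_time_meters(t_seconds):
    """Express a time interval in 'time-meters' (m_t): t multiplied by c."""
    return t_seconds * C

def meters_to_distance_seconds(x_meters):
    """Express a distance in 'distance-seconds' (s_d): x divided by c."""
    return x_meters / C

t_m = seconds_to_time_meters(1e-15)    # one femtosecond, in m_t
x_s = meters_to_distance_seconds(3e8)  # 3x10^8 m, in s_d
print(t_m, x_s)
```

The numbers convert, but the labels do not: the result of the first call is still a time, and the result of the second is still a distance.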

If you wish, we could refer to the ‘light-meter’ as a ‘time-meter’ (or a meter of time), and to the light-second as a ‘distance-second’ (or a second of distance). It doesn’t matter what you call it, or how you denote it. In fact, you will never hear of a meter of time, nor will you ever see those subscripts or brackets. But that’s because physicists always keep track of the dimensions of an equation, and so they know. They know, for example, that the dimension of energy combines the dimensions of both force as well as distance, so we write: [energy] = [force]·[distance]. Read: energy amounts to applying a force over a distance. Likewise, momentum amounts to applying some force over some time, so we write: [momentum] = [force]·[time]. Using the usual symbols for energy, momentum, force, distance and time respectively, we can write this as [E] = [F]·[x] and [p] = [F]·[t]. Using the units you know, i.e. joule, newton, meter and seconds, we can also write this as: 1 J = 1 N·m and 1…

Hey! Wait a minute! What’s that N·s unit for momentum? Momentum is mass times velocity, isn’t it? It is. But it amounts to the same. Remember that mass is a measure for the inertia of an object, and so mass is measured with reference to some force (F) and some acceleration (a): F = m·a ⇔ m = F/a. Hence, [m] = kg = [F/a] = N/(m/s²) = N·s²/m. [Note that the m in the brackets is the symbol for mass but the other m is a meter!] So the unit of momentum is (N·s²/m)·(m/s) = N·s = newton·second.

Now, the dimension of Planck’s constant is the dimension of action, which combines all dimensions: force, time and distance. We write: ħ ≈ 1.0545718×10⁻³⁴N·m·s (newton·meter·second). That’s great, and I’ll show why in a moment. But, at this point, you should just note that when we write that E = m = p = ħ/2, we’re just saying they are numerically the same. The dimensions of E, m and p are not the same. So what we’re really saying is the following:

The quantum of energy is ħ/2 newton·meter ≈ 0.527286×10⁻³⁴N·m.
The quantum of momentum is ħ/2 newton·second ≈ 0.527286×10⁻³⁴N·s.

What’s the quantum of mass? That’s where the equivalent units come in. We wrote: 1 kg = 1 N·s²/m. So we could substitute the distance unit in this equation (m) by s_d/c = s_d/(3×10⁸). So we get: 1 kg = 3×10⁸ N·s²/s_d. Can we scrap both ‘seconds’ and say that the quantum of mass (ħ/2) is equal to the quantum of momentum? Think about it.

[…]

The answer is… Yes and no – but much more no than yes! The two sides of the equation are only numerically equal, but we’re talking a different dimension here. If we’d write that 1 kg = 0.527286×10⁻³⁴ N·s²/s_d = 0.527286×10⁻³⁴ N·s, you’d be equating two dimensions that are fundamentally different: space versus time. To reinforce the point, think of it the other way: think of substituting the second (s) for 3×10⁸ m. Again, you’d make a mistake. You’d have to write 0.527286×10⁻³⁴ N·(m_t)²/m, and you should not assume that a time-meter is equal to a distance-meter. They’re equivalent units, and so you can use them to get some number right, but they’re not equal: what they measure is fundamentally different. A time-meter measures time, while a distance-meter measures distance. It’s as simple as that. So what is it then? Well… What we can do is remember Einstein’s energy-mass equivalence relation once more: E = m·c² (and m is the mass here). Just check the dimensions once more: [m]·[c²] = (N·s²/m)·(m²/s²) = N·m. So we should think of the quantum of mass as the quantum of energy, as energy and mass are equivalent, really.

Back to the wavefunction

The beauty of the construct of the wavefunction resides in several mathematical properties of this construct. The first is its argument:

θ = kx − ωt, with k = p/ħ and ω = E/ħ

Its dimension is the dimension of an angle: we express it in radians. What’s a radian? You might think that a radian is a distance unit because… Well… Look at how we measure an angle in radians below:

But you’re wrong. An angle’s measurement in radians is numerically equal to the length of the corresponding arc of the unit circle but… Well… Numerically only. 🙂 Just do a dimensional analysis of θ = kx − ωt = (p/ħ)·x − (E/ħ)·t. The dimension of p/ħ is (N·s)/(N·m·s) = 1/m = m⁻¹, so we get some quantity expressed per meter, which we then multiply by x, so we get a pure number. No dimension whatsoever! Likewise, the dimension of E/ħ is (N·m)/(N·m·s) = 1/s = s⁻¹, which we then multiply by t, so we get another pure number, which we then add to get our argument θ. Hence, Planck’s quantum of action (ħ) does two things for us:

It expresses p and E in units of ħ.
It sorts out the dimensions, ensuring our argument is a dimensionless number indeed.

In fact, I’d say the ħ in the (p/ħ)·x term in the argument is a different ħ than the ħ in the (E/ħ)·t term. Huh? What? Yes. Think of the distinction I made between s and s_d, or between m and m_t. Both were numerically the same: they captured a magnitude, but they measured different things. We’ve got the same thing here:

The meter (m) in ħ ≈ 1.0545718×10⁻³⁴N·m·s in (p/ħ)·x is the dimension of x, and so it gets rid of the distance dimension. So the m in ħ ≈ 1.0545718×10⁻³⁴N·m·s goes, and what’s left measures p in terms of units equal to 1.0545718×10⁻³⁴N·s, so we get a pure number indeed.
Likewise, the second (s) in ħ ≈ 1.0545718×10⁻³⁴N·m·s in (E/ħ)·t is the dimension of t, and so it gets rid of the time dimension. So the s in ħ ≈ 1.0545718×10⁻³⁴N·m·s goes, and what’s left measures E in terms of units equal to 1.0545718×10⁻³⁴N·m, so we get another pure number.
Adding both gives us the argument θ: a pure number that measures some angle.
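The bookkeeping in the three points above can be done mechanically. The sketch below is my own illustration, not a standard library: it represents a dimension as a tuple of exponents of (newton, meter, second) and confirms that k·x and ω·t both come out dimensionless.

```python
# A dimension is a tuple of exponents of (newton, meter, second).
# These helpers are illustrative assumptions, not an existing package.

def mul(a, b):
    """Dimension of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    """Dimension of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))

FORCE = (1, 0, 0)
LENGTH = (0, 1, 0)
TIME = (0, 0, 1)
PURE_NUMBER = (0, 0, 0)

ENERGY = mul(FORCE, LENGTH)    # N·m
MOMENTUM = mul(FORCE, TIME)    # N·s
ACTION = mul(ENERGY, TIME)     # N·m·s, the dimension of hbar

k_dim = div(MOMENTUM, ACTION)      # p/hbar -> 1/m
omega_dim = div(ENERGY, ACTION)    # E/hbar -> 1/s
kx_dim = mul(k_dim, LENGTH)        # (p/hbar)·x
wt_dim = mul(omega_dim, TIME)      # (E/hbar)·t

# Mass as N·s²/m, and momentum as mass times velocity.
MASS = div(FORCE, div(LENGTH, mul(TIME, TIME)))
VELOCITY = div(LENGTH, TIME)

print(kx_dim, wt_dim, mul(MASS, VELOCITY) == MOMENTUM)
```

Both terms of θ reduce to the empty dimension, so subtracting them is legitimate, and the last check reproduces the N·s unit for momentum.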

That’s why you need to watch out when writing θ = (p/ħ)·x − (E/ħ)·t as θ = (p·x − E·t)/ħ or – in the case of our elementary wavefunction for the zero-mass particle – as θ = (x/2 − t/2) = (x − t)/2. You can do it – in fact, you should do so when trying to calculate something – but you need to be aware that you’re making abstraction of the dimensions. That’s quite OK, as you’re just calculating something – but don’t forget the physics behind!

You’ll immediately ask: what are the physics behind here? Well… I don’t know. Perhaps nobody knows. As Feynman once famously said: “I think I can safely say that nobody understands quantum mechanics.” But then he never wrote that, and I am sure he didn’t really mean that. And then he said that back in 1964, which is 50 years ago now. 🙂 So let’s try to understand it at least. 🙂

Planck’s quantum of action – 1.0545718×10⁻³⁴ N·m·s – comes to us as a mysterious quantity. A quantity is more than a number. A number is something like π or e, for example. It might be a complex number, like eⁱ^θ, but that’s still a number. In contrast, a quantity has some dimension, or some combination of dimensions. A quantity may be a scalar quantity (like distance), or a vector quantity (like a field vector). In this particular case (Planck’s ħ or h), we’ve got a physical constant combining three dimensions: force, time and distance – or space, if you want. It’s a quantum, so it comes as a blob – or a lump, if you prefer that word. However, as I see it, we can sort of project it in space as well as in time. In fact, if this blob is going to move in spacetime, then it will move in space as well as in time: t will go from 0 to 1, and x goes from 0 to ± 1, depending on what direction we’re going. So when I write that E = p = ħ/2 – which, let me remind you, are two numerical equations, really – I sort of split Planck’s quantum over E = m and p respectively.

You’ll say: what kind of projection or split is that? When projecting some vector, we’ll usually have some sine and cosine, or a 1/√2 factor—or whatever, but not a clean 1/2 factor. Well… I have no answer to that, except that this split fits our mathematical construct. Or… Well… I should say: my mathematical construct. Because what I want to find is this clean Schrödinger equation:

∂ψ/∂t = i·(ħ/2m)·∇²ψ = i·∇²ψ for m = ħ/2

Now I can only get this equation if (1) E = m = p and (2) if m = ħ/2 (which amounts to writing that E = p = m = ħ/2). There’s also the Uncertainty Principle. If we are going to consider the quantum vacuum, i.e. if we’re going to look at space (or distance) and time as count variables, then Δx and Δt in the ΔxΔp = ΔEΔt = ħ/2 equations are ± 1 and, therefore, Δp and ΔE must be ± ħ/2. In any case, I am not going to try to justify my particular projection here. Let’s see what comes out of it.

The quantum vacuum

Schrödinger’s equation for my zero-mass particle (with energy E = m = p = ħ/2) amounts to writing:

Re(∂ψ/∂t) = −Im(∇²ψ)
Im(∂ψ/∂t) = Re(∇²ψ)

Now that reminds me of the propagation mechanism for the electromagnetic wave, which we wrote as ∂B/∂t = –∇×E and ∂E/∂t = ∇×B, also assuming we measure time and distance in equivalent units. However, we’ll come back to that later. Let’s first study the equation we have, i.e.

eⁱ^{(kx − ωt)} = eⁱ^{(ħ·x/2 − ħ·t/2)/ħ} = eⁱ^{(x/2 − t/2)}= cos[(x−t)/2] + i∙sin[(x−t)/2]

Let’s think some more. What is that eⁱ^{(x/2 − t/2)} function? It’s subject to conceiving time and distance as countable variables, right? I am tempted to say: as discrete variables, but I won’t go that far – not now – because the countability may be related to a particular interpretation of quantum physics. So I need to think about that. In any case… The point is that x can only take on values like 0, 1, 2, etcetera. And the same goes for t. To make things easy, we’ll not consider negative values for x right now (and, obviously, not for t either). But you can easily check it doesn’t make a difference: if you think of the propagation mechanism – which is what we’re trying to model here – then x is always positive, because we’re moving away from some source that caused the wave. In any case, we’ve got an infinite set of points like:

eⁱ^{(0/2 − 0/2)}= eⁱ⁽⁰⁾= cos(0) + i∙sin(0)
eⁱ^{(1/2 − 0/2)}= eⁱ^(1/2)= cos(1/2) + i∙sin(1/2)
eⁱ^{(0/2 − 1/2)}= eⁱ^(−1/2)= cos(−1/2) + i∙sin(−1/2)
eⁱ^{(1/2 − 1/2)}= eⁱ⁽⁰⁾= cos(0) + i∙sin(0)
…

In my previous post, I calculated the real and imaginary part of this wavefunction for x going from 0 to 14 (as mentioned, in steps of 1) and for t doing the same (also in steps of 1), and what we got looked pretty good:

I also said that, if you wonder what the quantum vacuum could possibly look like, you should probably think of these discrete spacetime points, and some complex-valued wave that travels as illustrated above. In case you wonder what’s being illustrated here: the right-hand graph is the cosine value for all possible x = 0, 1, 2,… and t = 0, 1, 2,… combinations, and the left-hand graph depicts the sine values, so that’s the imaginary part of our wavefunction. Taking the absolute square of the wavefunction gives 1 for all combinations. So it’s obvious we’d need to normalize and, more importantly, we’d have to localize the particle by adding several of these waves with the appropriate contributions. But that’s not our worry right now. I want to check whether those discrete time and distance units actually make sense. What’s their size? Is it anything like the Planck length (for distance) and/or the Planck time?

Let’s see. What are the implications of our model? The question here is: if ħ/2 is the quantum of energy, and the quantum of momentum, what’s the quantum of force, and the quantum of time and/or distance?

Huh? Yep. We treated distance and time as countable variables above, but now we’d like to express the difference between x = 0 and x = 1 and between t = 0 and t = 1 in the units we know, that is, in meter and in seconds. So how do we go about that? Do we have enough equations here? Not sure. Let’s see…

We obviously need to keep track of the various dimensions here, so let’s refer to that discrete distance and time unit as l_P and t_P respectively. The subscript (P) refers to Planck, and the l refers to a length, but we’re likely to find something else than Planck units. I just need placeholder symbols here. To be clear: l_P and t_P are expressed in meter and seconds respectively, just like the actual Planck length and time, which are equal to 1.6162×10⁻³⁵ m (more or less) and 5.391×10⁻⁴⁴ s (more or less) respectively. As I mentioned above, we get these Planck units by equating fundamental physical constants to 1. Just check it: (1.6162×10⁻³⁵ m)/(5.391×10⁻⁴⁴ s) = c ≈ 3×10⁸ m/s. So the following relation must be true: l_P = c·t_P, or l_P/t_P = c.

Now, as mentioned above, there must be some quantum of force as well, which we’ll write as F_P, and which is – obviously – expressed in newton (N). So we have:

E = ħ/2 ⇒ 0.527286×10⁻³⁴ N·m = F_P·l_P N·m
p = ħ/2 ⇒ 0.527286×10⁻³⁴ N·s = F_P·t_P N·s

Let’s try to divide both formulas: E/p = (F_P·l_P N·m)/(F_P·t_P N·s) = l_P/t_P m/s = c m/s. That’s consistent with the E/p = c equation. Hmm… We found what we knew already. My model is not fully determined, it seems. 😦

What about the following simplistic approach? E is numerically equal to 0.527286×10⁻³⁴, and its dimension is [E] = [F]·[x], so we write: E = 0.527286×10⁻³⁴·[E] = 0.527286×10⁻³⁴·[F]·[x]. Hence, [x] = [E]/[F] = (N·m)/N = m. That just confirms what we already know: the quantum of distance (i.e. our fundamental unit of distance) can be expressed in meter. But our model does not give that fundamental unit. It only gives us its dimension (meter), which is stuff we knew from the start. 😦

Let’s try something else. Let’s just accept the Planck length and time, so we write:

l_P = 1.6162×10⁻³⁵m
t_P= 5.391×10⁻⁴⁴ s

Now, if the quantum of action is equal to ħ N·m·s = F_P·l_P·t_P N·m·s = 1.0545718×10⁻³⁴ N·m·s, and if the two definitions of l_P and t_P above hold, then 1.0545718×10⁻³⁴ N·m·s = (F_P N)×(1.6162×10⁻³⁵ m)×(5.391×10⁻⁴⁴ s) ≈ F_P·8.713×10⁻⁷⁹ N·m·s ⇔ F_P ≈ 1.21×10⁴⁴ N.

Does that make sense? It does according to Wikipedia, but how do we relate this to our E = p = m = ħ/2 equations? Let’s try this:

E_P = (1.0545718×10⁻³⁴ N·m·s)/(5.391×10⁻⁴⁴ s) = 1.956×10⁹ J. That corresponds to the regular Planck energy.
p_P = (1.0545718×10⁻³⁴ N·m·s)/(1.6162×10⁻³⁵ m) = 6.525 N·s. That corresponds to the regular Planck momentum.

Is E_P = p_P? Let’s substitute: 1.956×10⁹ N·m = 1.956×10⁹ N·(s/c) = (1.956×10⁹/2.998×10⁸) N·s = 6.525 N·s. So, yes, it comes out alright. In fact, I omitted the 1/2 factor in the calculations, but it doesn’t matter: it does come out alright. So I did not prove that the difference between my x = 0 and x = 1 points (or my t = 0 and t = 1 points) is equal to the Planck length (or the Planck time unit), but I did show my theory is, at the very least, compatible with those units. That’s more than enough for now. And I’ll surely come back to it in my next post. 🙂
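For those who prefer to let the machine do the arithmetic, here is a minimal sketch of the three divisions used in this section. It takes the rounded values of ħ, the Planck length and the Planck time quoted above as given, and the variable names are just my labels.

```python
HBAR = 1.0545718e-34  # quantum of action in N·m·s (value quoted in the text)
L_P = 1.6162e-35      # Planck length in m (value quoted in the text)
T_P = 5.391e-44       # Planck time in s (value quoted in the text)

F_P = HBAR / (L_P * T_P)  # force unit, in N
E_P = HBAR / T_P          # energy unit, in N·m
p_P = HBAR / L_P          # momentum unit, in N·s

# E_P/p_P reduces to L_P/T_P, which should reproduce c in m/s.
ratio = E_P / p_P

print(f"F_P = {F_P:.4e} N")
print(f"E_P = {E_P:.4e} N·m")
print(f"p_P = {p_P:.4e} N·s")
print(f"E_P/p_P = {ratio:.4e} m/s")
```

Note that ħ drops out of the ratio altogether, which is why the 1/2 factor cannot make any difference to the E/p = c check.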

Post Scriptum: One must solve the following equations to get the fundamental Planck units:

We have five fundamental equations for five fundamental quantities (t_P, l_P, F_P, m_P, and E_P respectively), so that’s OK: it’s a fully determined system alright! But where do the expressions with G, k_B (the Boltzmann constant) and ε₀ come from? What does it mean to equate those constants to 1? Well… I need to think about that, and I’ll get back to you on it. 🙂

The wavefunction of a zero-mass particle

Original post:

I hope you find the title intriguing. A zero-mass particle? So I am talking a photon, right? Well… Yes and no. Just read this post and, more importantly, think about this story for yourself. 🙂

One of my acquaintances is a retired nuclear physicist. We mail every now and then – but he has little or no time for my questions: he usually just tells me to keep studying. I once asked him why there is never any mention of the wavefunction of a photon in physics textbooks. He bluntly told me photons don’t have a wavefunction – not in the sense I was talking at least. Photons are associated with a traveling electric and a magnetic field vector. That’s it. Full stop. Photons do not have a ψ or φ function. [I am using ψ and φ to refer to position or momentum wavefunction. You know both are related: if we have one, we have the other.] But then I never give up, of course. I just can’t let go of the idea of a photon wavefunction. The structural similarity in the propagation mechanism of the electric and magnetic field vectors E and B just looks too much like the quantum-mechanical wavefunction. So I kept trying and, while I don’t think I fully solved the riddle, I feel I understand it much better now. Let me show you the why and how.

I. An electromagnetic wave in free space is fully described by the following two equations:

∂B/∂t = –∇×E
∂E/∂t = c²∇×B

We’re making abstraction here of stationary charges, and we also do not consider any currents here, so no moving charges either. So I am omitting the ∇·E = ρ/ε₀ equation (i.e. the first of the set of four equations), and I am also omitting the j/ε₀ in the second equation. So, for all practical purposes (i.e. for the purpose of this discussion), you should think of a space with no charges: ρ = 0 and j = 0. It’s just a traveling electromagnetic wave. To make things even simpler, we’ll assume our time and distance units are chosen such that c = 1, so the equations above reduce to:

∂B/∂t = –∇×E
∂E/∂t = ∇×B

Perfectly symmetrical! But note the minus sign in the first equation. As for the interpretation, I should refer you to previous posts but, briefly, the ∇× operator is the curl operator. It’s a vector operator: it describes the (infinitesimal) rotation of a (three-dimensional) vector field. We discussed heat flow a couple of times, or the flow of a moving liquid. So… Well… If the vector field represents the flow velocity of a moving fluid, then the curl is the circulation density of the fluid. The direction of the curl vector is the axis of rotation as determined by the ubiquitous right-hand rule, and the magnitude of the curl is the magnitude of rotation. OK. Next step.

II. For the wavefunction, we have Schrödinger’s equation, ∂ψ/∂t = i·(ħ/2m)·∇²ψ, which relates two complex-valued functions (∂ψ/∂t and ∇²ψ). Complex-valued functions consist of a real and an imaginary part, and you should be able to verify this equation is equivalent to the following set of two equations:

Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ)
Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ)

[Two complex numbers a + ib and c + id are equal if, and only if, their real and imaginary parts are the same. However, note the i factor in the right-hand side of the equation, so we get: a + ib = i·(c + id) = −d + ic.] The Schrödinger equation above also assumes free space (i.e. zero potential energy: V = 0) but, in addition – see my previous post – it also assumes a zero rest mass of the elementary particle (E₀ = 0). So just assume E₀ = V = 0 in de Broglie’s elementary ψ(θ) = ψ(x, t) = a·e⁻ⁱ^θ = a·e^{−i[(E₀ + p²/(2m) + V)·t − p∙x]/ħ} wavefunction. So, in essence, we’re looking at the wavefunction of a massless particle here. Sounds like nonsense, doesn’t it? But… Well… That should be the wavefunction of a photon in free space then, right? 🙂

Maybe. Maybe not. Let’s go as far as we can.
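Before going on, the split of ∂ψ/∂t = i·(ħ/2m)·∇²ψ into a real and an imaginary equation can be checked with ordinary complex arithmetic. The sketch below is only that: it takes an arbitrary complex number as a stand-in for the value of ∇²ψ at some point, multiplies it by i times a real coefficient (my stand-in for ħ/2m), and compares the parts. The function name is hypothetical.

```python
def split_holds(laplacian_value, coeff, tol=1e-12):
    """Check Re(dpsi/dt) = -coeff*Im(lap) and Im(dpsi/dt) = coeff*Re(lap).

    laplacian_value: any complex number standing in for the Laplacian of psi.
    coeff: a real number standing in for hbar/(2m).
    """
    dpsi_dt = 1j * coeff * laplacian_value  # right-hand side of the equation
    real_ok = abs(dpsi_dt.real - (-coeff * laplacian_value.imag)) < tol
    imag_ok = abs(dpsi_dt.imag - (coeff * laplacian_value.real)) < tol
    return real_ok and imag_ok

samples = [complex(2.0, -3.0), complex(-0.5, 0.25), complex(0.0, 1.0)]
print(all(split_holds(z, 0.7) for z in samples))
```

Multiplying by i rotates a complex number by 90 degrees, which is all the two equations say: the real part of the left side picks up minus the imaginary part of the right, and vice versa.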

The energy of a zero-mass particle

What m would we use for a photon? Its rest mass is zero, but it’s got energy and, hence, an equivalent mass. That mass is given by the m = E/c² mass-energy equivalence. We also know a photon has momentum, and it’s equal to its energy divided by c: p = m·c = E/c. [I know the notation is somewhat confusing: E is, obviously, not the magnitude of E here: it’s energy!] Both yield the same result. We get: m·c = E/c ⇔ m = E/c² ⇔ E = m·c².

OK. Next step. Well… I’ve always been intrigued by the fact that the kinetic energy of a photon, using the E = m·v²/2 = m·c²/2 formula, is only half of its total energy E = m·c². Half: 1/2. That 1/2 factor is intriguing. Where’s the rest of the energy? It’s really a contradiction: our photon has no rest mass, and there’s no potential here, but its total energy is still twice its kinetic energy. Quid?

There’s only one conclusion: just because of its sheer existence, it must have some hidden energy, and that hidden energy is also equal to E = m·c²/2, and so the kinetic and hidden energy add up to E = m·c².

Huh? Hidden energy? I must be joking, right?

Well… No. No joke. I am tempted to call it the imaginary energy, because it’s linked to the imaginary part of the wavefunction—but then it’s everything but imaginary: it’s as real as the imaginary part of the wavefunction. [I know that sounds a bit nonsensical, but… Well… Think about it: it does make sense.]

Back to that factor 1/2. You may or may not remember it popped up when we were calculating the group and the phase velocity of the wavefunction respectively, again assuming zero rest mass, and zero potential. [Note that the rest mass term is mathematically equivalent to the potential term in both the wavefunction as well as in Schrödinger’s equation: E₀·t + V·t = (E₀ + V)·t, and V·ψ + E₀·ψ = (V + E₀)·ψ, obviously!]

In fact, let me quickly show you that calculation again: the de Broglie relations tell us that the k and the ω in the eⁱ^{(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt) wavefunction (i.e. the spatial and temporal frequency respectively) are equal to k = p/ħ, and ω = E/ħ. If we would now use the kinetic energy formula E = m·v²/2 – which we can also write as E = m·v·v/2 = p·v/2 = p·p/2m = p²/2m, with v = p/m the classical velocity of the elementary particle that Louis de Broglie was thinking of – then we can calculate the group velocity of our eⁱ^{(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt) as:

v_g = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂[p²/2m]/∂p = 2p/2m = p/m = v

[Don’t tell me I can’t treat m as a constant when calculating ∂ω/∂k: I can. Think about it.] Now the phase velocity. The phase velocity of our eⁱ^{(kx − ωt)} is only half of that. Again, we get that 1/2 factor:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = (p²/2m)/p = p/2m = v/2

Strange, isn’t it? Why would we get a different value for the phase velocity here? It’s not like we have two different frequencies here, do we? You may also note that the phase velocity turns out to be smaller than the group velocity, which is quite exceptional as well! So what’s the matter?

Well… The answer is: we do seem to have two frequencies here while, at the same time, it’s just one wave. There is only one k and ω here but, as I mentioned a couple of times already, that eⁱ^{(kx − ωt)} wavefunction seems to give you two functions for the price of one—one real and one imaginary: eⁱ^{(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt). So are we adding waves, or are we not? It’s a deep question. In my previous post, I said we were adding separate waves, but now I am thinking: no. We’re not. That sine and cosine are part of one and the same whole. Indeed, the apparent contradiction (i.e. the different group and phase velocity) gets solved if we’d use the E = m∙v² formula rather than the kinetic energy E = m∙v²/2. Indeed, assuming that E = m∙v² formula also applies to our zero-mass particle (I mean zero rest mass, of course), and measuring time and distance in natural units (so c = 1), we have:

E = m∙c² = m and p = m∙c = m, so we get: E = m = p

Waw! What a weird combination, isn’t it? But… Well… It’s OK. [You tell me why it wouldn’t be OK. It’s true we’re glossing over the dimensions here, but natural units are natural units, and so c = c²= 1. So… Well… No worries!] The point is: that E = m = p equality yields extremely simple but also very sensible results. For the group velocity of our eⁱ^{(kx − ωt)} wavefunction, we get:

v_g = ∂ω/∂k = ∂[E/ħ]/∂[p/ħ] = ∂E/∂p = ∂p/∂p = 1

So that’s the velocity of our zero-mass particle (c, i.e. the speed of light) expressed in natural units once more—just like what we found before. For the phase velocity, we get:

v_p = ω/k = (E/ħ)/(p/ħ) = E/p = p/p = 1

Same result! No factor 1/2 here! Isn’t that great? My ‘hidden energy theory’ makes a lot of sense. 🙂 In fact, I had mentioned a couple of times already that the E = m∙v² relation comes out of the de Broglie relations if we just multiply the two and use the v = f·λ relation:

f·λ = (E/h)·(h/p) = E/p
v = f·λ ⇒ f·λ = v = E/p ⇔ E = v·p = v·(m·v) ⇒ E = m·v²

But so I had no good explanation for this. I have one now: the E = m·v² formula is the correct energy formula for our zero-mass particle. 🙂
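The two velocity calculations in this section can be put side by side numerically. This is a minimal sketch under the assumptions used above: the de Broglie relations, so that ħ cancels in both ∂ω/∂k = ∂E/∂p and ω/k = E/p, and natural units (c = 1) for the E = p case. The mass and momentum values are arbitrary illustrations, and the function names are mine.

```python
def group_velocity(energy, p, h=1e-6):
    """Numerical dE/dp, which equals d(omega)/dk because hbar cancels."""
    return (energy(p + h) - energy(p - h)) / (2 * h)

def phase_velocity(energy, p):
    """E/p, which equals omega/k because hbar cancels."""
    return energy(p) / p

M = 2.0  # illustrative mass (an assumption, any positive value works)

def kinetic(p):
    """Kinetic energy formula E = p^2/(2m)."""
    return p ** 2 / (2 * M)

def massless(p):
    """E = p, i.e. E = m = p with c = 1."""
    return p

p = 3.0    # illustrative momentum
v = p / M  # classical velocity for the kinetic case

print(group_velocity(kinetic, p), phase_velocity(kinetic, p), v)
print(group_velocity(massless, p), phase_velocity(massless, p))
```

With the kinetic energy formula the first line shows the group velocity matching v and the phase velocity coming out at v/2. With E = p both velocities are 1.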

The quantization of energy and the zero-mass particle

Let’s now think about the quantization of energy. What’s the smallest value for E that we could possibly think of? That’s h, isn’t it? That’s the energy of one cycle of an oscillation according to the Planck-Einstein relation (E = h·f). Well… Perhaps it’s ħ? Because… Well… We saw energy levels were separated by ħ, rather than h, when studying the blackbody radiation problem. So is it ħ = h/2π? Is the natural unit a radian (i.e. a unit distance), rather than a cycle?

Neither is natural, I’d say. We also have the Uncertainty Principle, which suggests the smallest possible energy value is ħ/2, because ΔxΔp = ΔtΔE = ħ/2.

Huh? What’s the logic here?

Well… I am not quite sure but my intuition tells me the quantum of energy must be related to the quantum of time, and the quantum of distance.

Huh? The quantum of time? The quantum of distance? What’s that? The Planck scale?

No. Or… Well… Let me correct that: not necessarily. I am just thinking in terms of logical concepts here. Logically, as we think of the smallest of smallest, then our time and distance variables must become count variables, so they can only take on some integer value n = 0, 1, 2 etcetera. So then we’re literally counting in time and/or distance units. So Δx and Δt are then equal to 1. Hence, Δp and ΔE are then equal to Δp = ΔE = ħ/2. Just think of the radian (i.e. the unit in which we measure θ) as measuring both time as well as distance. Makes sense, no?

No? Well… Sorry. I need to move on. So the smallest possible value for m = E = p would be ħ/2. Let’s substitute that in Schrödinger’s equation, or in that set of equations Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ) and Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ). We get:

Re(∂ψ/∂t) = −(ħ/2m)·Im(∇²ψ) = −(2ħ/2ħ)·Im(∇²ψ) = −Im(∇²ψ)
Im(∂ψ/∂t) = (ħ/2m)·Re(∇²ψ) = (2ħ/2ħ)·Re(∇²ψ) = Re(∇²ψ)

Bingo! The Re(∂ψ/∂t) = −Im(∇²ψ) and Im(∂ψ/∂t) = Re(∇²ψ) equations were what I was looking for. Indeed, I wanted to find something that was structurally similar to the ∂B/∂t = –∇×E and ∂E/∂t = ∇×B equations—and something that was exactly similar: no coefficients in front or anything. 🙂

What about our wavefunction? Using the de Broglie relations once more (k = p/ħ, and ω = E/ħ), our eⁱ^{(kx − ωt)} = cos(kx−ωt) + i∙sin(kx−ωt) now becomes:

eⁱ^{(kx − ωt)} = eⁱ^{(ħ·x/2 − ħ·t/2)/ħ} = eⁱ^{(x/2 − t/2)}= cos[(x−t)/2] + i∙sin[(x−t)/2]

Hmm… Interesting! So we’ve got that 1/2 factor now in the argument of our wavefunction! I really feel I am close to squaring the circle here. 🙂 Indeed, it must be possible to relate the ∂B/∂t = –∇×E and ∂E/∂t = c²∇×B equations to the Re(∂ψ/∂t) = −Im(∇²ψ) and Im(∂ψ/∂t) = Re(∇²ψ) equations. I am sure it’s a complicated exercise. It’s likely to involve the formula for the Lorentz force, which says that the force on a unit charge is equal to E + v×B, with v the velocity of the charge. Why? Note the vector cross-product. Also note that ∂B/∂t and ∂E/∂t are vector-valued functions, not scalar-valued functions. Hence, in that sense, ∂B/∂t and ∂E/∂t are not like the Re(∂ψ/∂t) and/or Im(∂ψ/∂t) function. But… Well… For the rest, think of it: E and B are orthogonal vectors, and that’s how we usually interpret the real and imaginary part of a complex number as well: the real and imaginary axis are orthogonal too!

So I am almost there. Who can help me prove what I want to prove here? The two propagation mechanisms are “same-same but different”, as they say in Asia. The difference between the two propagation mechanisms must also be related to that fundamental dichotomy in Nature: the distinction between bosons and fermions. Indeed, when combining two directional quantities (i.e. two vectors), we like to think there are four different ways of doing that, as shown below. However, when we’re only interested in the magnitude of the result (and not in its direction), then the first and third result below are really the same, as are the second and fourth combination. Now, we’ve got pretty much the same in quantum math: we can, in theory, combine complex-valued amplitudes in four different ways but, in practice, we only have two (rather than four) types of behavior: fermions versus bosons.

Is our zero-mass particle just the electric field vector?

Let’s analyze that e^{i(x/2 − t/2)} = cos[(x−t)/2] + i∙sin[(x−t)/2] wavefunction some more. It’s easy to represent it graphically. The following animation does the trick:

Animation

I am sure you’ve seen this animation before: it represents a circularly polarized electromagnetic wave… Well… Let me be precise: it represents the electric field vector (E) of such a wave only. The B vector is not shown here, but you know where and what it is: orthogonal to the E vector, as shown below for a linearly polarized wave.

e^{i(0/2 − 0/2)} = cos(0) + i∙sin(0)
e^{i(1/2 − 0/2)} = cos(1/2) + i∙sin(1/2)
e^{i(0/2 − 1/2)} = cos(−1/2) + i∙sin(−1/2)
e^{i(1/2 − 1/2)} = cos(0) + i∙sin(0)
…

Now, I quickly opened Excel and calculated those cosine and sine values for x and t going from 0 to 14. It’s really easy. Just five minutes of work. You should do it yourself as an exercise. The result is shown below. Both graphs connect 14×14 = 196 data points, but you can see what’s going on: this does, effectively, represent the elementary wavefunction of a particle traveling in spacetime. In fact, you can see its speed is equal to 1, i.e. it effectively travels at the speed of light, as it should: the wave velocity is v = f·λ = (ω/2π)·(2π/k) = ω/k = (1/2)/(1/2) = 1. The amplitude of our wave doesn’t change along the x = t diagonal. As the Last Samurai puts it, just before he moves to the Other World: “Perfect! They are all perfect!” 🙂

graph imaginary graph real
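If you’d rather not open Excel, the same table can be sketched in a few lines of Python. This is a rough sketch only: I am assuming integer values of x and t from 0 to 14 with both end points included (so the exact number of grid points depends on that choice), and the variable names are just illustrative.

```python
import math

N = 15  # assumption: x and t each take the integer values 0, 1, ..., 14


def psi_parts(x, t):
    """Return (real part, imaginary part) of e^{i(x/2 - t/2)}."""
    theta = (x - t) / 2
    return math.cos(theta), math.sin(theta)


# One row per value of t, one column per value of x.
real_grid = [[psi_parts(x, t)[0] for x in range(N)] for t in range(N)]
imag_grid = [[psi_parts(x, t)[1] for x in range(N)] for t in range(N)]

# Along the x = t diagonal the phase is zero, so the value never changes.
diagonal = [(real_grid[n][n], imag_grid[n][n]) for n in range(N)]

# Phase velocity v = omega/k, with k = omega = 1/2 for our wavefunction.
k = 0.5
omega = 0.5
phase_velocity = omega / k
```

Every entry of `diagonal` is the same pair of values, which is just another way of saying that the wave pattern moves one unit of x per unit of t.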

In fact, in case you wonder what the quantum vacuum could possibly look like, you should probably think of these discrete spacetime points, and some complex-valued wave that travels as it does in the illustration above.

Of course, that elementary wavefunction above does not localize our particle. For that, we’d have to add a potentially infinite number of such elementary wavefunctions, so we’d write the wavefunction as a sum of ∑ a_j·e^{−iθ_j} functions. [I use the j symbol here for the subscript, rather than the more conventional i symbol for a subscript, so as to avoid confusion with the symbol used for the imaginary unit.] The a_j coefficients are the contribution that each of these elementary wavefunctions would make to the composite wave. What could they possibly be? Not sure. Let’s first look at the argument of our elementary component wavefunctions. We’d inject uncertainty in it. So we’d say that m = E = p is equal to

m = E = p = ħ/2 + j·ħ/2 with j = 0, 1, 2,…
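Enumerating the first few values makes the spacing explicit. Here is a minimal sketch, working in units of ħ and assuming steps of ħ/2, which is the spacing of the values and of the sum written out further on (the `level` function and its `step` parameter are hypothetical names, just for illustration):

```python
from fractions import Fraction


def level(j, step=Fraction(1, 2)):
    """Value of m = E = p in units of hbar for j = 0, 1, 2, ...

    The default step of hbar/2 is an assumption; see the remark below.
    """
    return Fraction(1, 2) + j * step


levels = [level(j) for j in range(5)]
print([str(value) for value in levels])  # ['1/2', '1', '3/2', '2', '5/2']
```

With a full step of ħ instead, i.e. `level(j, step=Fraction(1))`, the sequence would be ħ/2, 3ħ/2, 5ħ/2, and so on, so the choice of step does matter.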

That amounts to writing: m = E = p = ħ/2, ħ, 3ħ/2, 2ħ, 5ħ/2, etcetera. Wow! That’s nice, isn’t it? My intuition tells me that our a_j coefficients will be smaller for higher j, so the a_j(j) function would be some decreasing function. What shape? Not sure. Let’s first sum up our thoughts so far:

The elementary wavefunction of a zero-mass particle (again, I mean zero rest mass) in free space is associated with an energy that’s equal to ħ/2.
The zero-mass particle travels at the speed of light, obviously (because it has zero rest mass), and its kinetic energy is equal to E = m·v²/2 = m·c²/2.
However, its total energy is equal to E = m·v² = m·c²: it has some hidden energy. Why? Just because it exists.
We may associate its kinetic energy with the real part of its wavefunction, and the hidden energy with its imaginary part. However, you should remember that the imaginary part of the wavefunction is as essential as its real part, so the hidden energy is equally real. 🙂

So… Well… Isn’t this just nice?

I think it is. Another obvious advantage of this way of looking at the elementary wavefunction is that – at first glance at least – it provides an intuitive understanding of why we need to take the (absolute) square of the wavefunction to find the probability of our particle being at some point in space and time. The energy of a wave is proportional to the square of its amplitude. Now, it is reasonable to assume the probability of finding our (point) particle would be proportional to the energy and, hence, to the square of the amplitude of the wavefunction, which is given by those a_j(j) coefficients.

Huh?

OK. You’re right. I am a bit too fast here. It’s a bit more complicated than that, of course. The argument of probability being proportional to energy being proportional to the square of the amplitude of the wavefunction only works for a single wave a·e^{−iθ}. The argument does not hold water for a sum of functions ∑ a_j·e^{−iθ_j}. Let’s write it all out. Taking our m = E = p = ħ/2 + j·ħ/2 = ħ/2, ħ, 3ħ/2, 2ħ, 5ħ/2,… formula into account, this sum would look like:

a₁·e^{i(x − t)(1/2)} + a₂·e^{i(x − t)(2/2)} + a₃·e^{i(x − t)(3/2)} + a₄·e^{i(x − t)(4/2)} + …

But hey! We can write this as some power series, can’t we? We just need to add a₀·e^{i(x − t)(0/2)} = a₀, and then… Well… It’s not so easy, actually. Who can help me? I am trying to find something like this:

power series

Or… Well… Perhaps something like this:

power series 2

Whatever power series it is, we should be able to relate it to this one—I’d hope:

Hmm… […] It looks like I’ll need to re-visit this, but I am sure it’s going to work out. Unfortunately, I’ve got no more time today, so I’ll let you have some fun now with all of this. 🙂 By the way, note that the result of the first power series is only valid for |x| < 1. 🙂
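To make the power series idea a bit more concrete: if we write z = e^{i(x − t)/2}, then the j-th term of the sum above is just a_j·z^j. The sketch below checks that numerically. Note that the coefficients a_j = 2^(−j) are purely hypothetical: I picked them only because they decrease with j and make the series geometric, so there is a closed form to compare with. It is not a claim about what the a_j should actually be.

```python
import cmath


def composite_direct(coeffs, x, t):
    """Sum a_j * e^{i(x - t)(j/2)} for j = 0, 1, 2, ..."""
    return sum(a * cmath.exp(1j * (x - t) * j / 2) for j, a in enumerate(coeffs))


def composite_power_series(coeffs, x, t):
    """Same sum, written as a power series in z = e^{i(x - t)/2}."""
    z = cmath.exp(1j * (x - t) / 2)
    return sum(a * z**j for j, a in enumerate(coeffs))


# Hypothetical, decreasing coefficients a_j = 2^(-j): an illustration only.
coeffs = [0.5**j for j in range(60)]

x, t = 3.0, 1.0
z = cmath.exp(1j * (x - t) / 2)

# With these coefficients the series is geometric with ratio z/2,
# so it converges to 1/(1 - z/2) because |z/2| = 1/2 < 1.
closed_form = 1 / (1 - z / 2)

assert cmath.isclose(composite_direct(coeffs, x, t), composite_power_series(coeffs, x, t), abs_tol=1e-9)
assert cmath.isclose(composite_power_series(coeffs, x, t), closed_form, abs_tol=1e-9)
```

Because |z| = 1 here, a plain geometric series with all coefficients equal to 1 would not converge, which ties in with the |x| < 1 remark: the decreasing a_j are what make the sum behave.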

Note 1: What we should also do now is to re-insert mass in the equations. That should not be too difficult. It’s consistent with classical theory: the total energy of some moving mass is E = m·c², out of which m·v²/2 is the classical kinetic energy. All the rest – i.e. m·c² − m·v²/2 – is potential energy, and so that includes the energy that’s ‘hidden’ in the imaginary part of the wavefunction. 🙂

Note 2: I really didn’t pay much attention to dimensions when doing all of these manipulations above but… Well… I don’t think I did anything wrong. Just to give you some more feel for that wavefunction e^{i(kx − ωt)}, please do a dimensional analysis of its argument. I mean, k = p/ħ, and ω = E/ħ, so check the dimensions:

Momentum is expressed in newton·second, and we divide it by the quantum of action, which is expressed in newton·meter·second. So we get something per meter. But then we multiply it with x, so we get a dimensionless number.
The same is true for the ωt term. Energy is expressed in joule, i.e. newton·meter, and so we divide it by ħ once more, so we get something per second. But then we multiply it with t, so… Well… We do get a dimensionless number: a number that’s expressed in radians, to be precise. And so the radian does, indeed, integrate both the time as well as the distance dimension. 🙂
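The bookkeeping in the two points above can also be done mechanically. The following is just a small sketch, not a proper units library: it represents a dimension as a tuple of exponents of kilogram, meter and second (the `Dim`, `mul` and `div` names are mine), and checks that both kx and ωt come out dimensionless.

```python
from collections import namedtuple

# A dimension is a tuple of exponents of the SI base units kg, m and s.
Dim = namedtuple("Dim", "kg m s")


def mul(a, b):
    """Multiply two quantities: add the exponents."""
    return Dim(*(p + q for p, q in zip(a, b)))


def div(a, b):
    """Divide two quantities: subtract the exponents."""
    return Dim(*(p - q for p, q in zip(a, b)))


NEWTON = Dim(1, 1, -2)  # kg·m/s²
METER = Dim(0, 1, 0)
SECOND = Dim(0, 0, 1)
DIMENSIONLESS = Dim(0, 0, 0)

momentum = mul(NEWTON, SECOND)            # newton·second
action = mul(mul(NEWTON, METER), SECOND)  # newton·meter·second (the unit of ħ)
energy = mul(NEWTON, METER)               # joule = newton·meter

k = div(momentum, action)   # something per meter
omega = div(energy, action)  # something per second

kx = mul(k, METER)
omega_t = mul(omega, SECOND)

assert kx == DIMENSIONLESS
assert omega_t == DIMENSIONLESS
```

The radian itself does not show up in such a check, of course: it counts as a dimensionless unit.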