Two-state systems: the math versus the physics, and vice versa.

I think my previous post, on the math behind the maser, was a bit of a brain-racker. However, the results were important and, hence, it is useful to generalize them so we can apply them to other two-state systems. 🙂 Indeed, we'll use the very same two-state framework to analyze things like the stability of neutral and ionized hydrogen molecules and the binding of diatomic molecules in general – and lots of other stuff that can be analyzed as a two-state system. However, let's first have a look at the math once more. More importantly, let's analyze the physics behind it.

At the center of our little Universe here 🙂 is the fact that the dynamics of a two-state system are described by a set of two differential equations, which we wrote as:

dC1/dt = −(i/ħ)·(H11·C1 + H12·C2)
dC2/dt = −(i/ħ)·(H21·C1 + H22·C2)

It's obvious these two equations are usually not easy to solve: the C1 and C2 functions are complex-valued amplitudes which vary not only in time but also in space, obviously, but, in fact, that's not the problem. The issue is that the Hamiltonian coefficients Hij may also vary in space and in time, and that's what makes things quite nightmarish to solve. [Note that, while H11 and H22 represent some energy level and, hence, are usually real numbers, H12 and H21 may be complex-valued. However, in the cases we'll be analyzing, they will be real numbers too, as they will usually also represent some energy. Having noted that, being real- or complex-valued is not the problem: we can work with complex numbers and, as you can see from the equations above, the i/ħ factor in front of our differential equations results in a complex-valued coefficient matrix anyway.]

So… Yes. It's those non-constant Hamiltonian coefficients that caused us so much trouble when trying to analyze how a maser works or, more generally, how induced transitions work. [The same equations apply to blackbody radiation, indeed, or other phenomena involving induced transitions.] In any case, we won't do that again – not now, at least – and so we'll just go back to analyzing 'simple' two-state systems, i.e. systems with constant Hamiltonian coefficients.

Now, even for such simple systems, Feynman made life super-easy for us – too easy, I think – because he didn't use the general mathematical approach to solve the issue at hand. That more general approach would be based on a technique you may or may not remember from your high school or university days: it's based on finding the so-called eigenvalues and eigenvectors of the coefficient matrix. I won't say too much about that, as there's excellent online coverage of it, but… Well… We do need to relate the two approaches, and so that's where math and physics meet. So let's have a look at it all.

If we write the first-order time derivatives of those C1 and C2 functions as C1′ and C2′ respectively (so we just put a prime instead of writing dC1/dt and dC2/dt), and we put them in a two-by-one column matrix, which I'll write as C′, and then, likewise, we also put the functions themselves, i.e. C1 and C2, in a column matrix, which I'll write as C, then the system of equations can be written as the following simple expression:

C′ = A·C

One can then show that the general solution will be equal to:

C = a1·eλI·t·vI + a2·eλII·t·vII

The λI and λII in the exponential functions are the eigenvalues of A, so that’s that two-by-two matrix in the equation, i.e. the coefficient matrix with the −(i/ħ)Hij elements. The vI and vII column matrices in the solution are the associated eigenvectors. As for a1 and a2, these are coefficients that depend on the initial conditions of the system as well as, in our case at least, the normalization condition: the probabilities we’ll calculate have to add up to one. So… Well… It all comes with the system, as we’ll see in a moment.

Let’s first look at those eigenvalues. We get them by calculating the determinant of the A−λI matrix, and equating it to zero, so we write det(A−λI) = 0. If A is a two-by-two matrix (which it is for the two-state systems that we are looking at), then we get a quadratic equation, and its two solutions will be those λI and λII values. The two eigenvalues of our system above can be written as:

λI = −(i/ħ)·EI and λII = −(i/ħ)·EII.

EI and EII are two possible values for the energy of our system, which are referred to as the upper and the lower energy level respectively. We can calculate them as:

EI = (H11 + H22)/2 + √[(H11 − H22)²/4 + H12·H21]
EII = (H11 + H22)/2 − √[(H11 − H22)²/4 + H12·H21]
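If you want to check this numerically, here's a quick sketch – in Python, with hypothetical numbers for the Hamiltonian coefficients, so they don't represent any particular molecule – that computes EI and EII from the expression above and compares them with the eigenvalues of the coefficient matrix A = −(i/ħ)·H:

```python
import numpy as np

hbar = 1.0                         # natural units, just to keep the numbers simple
H = np.array([[1.0, -0.3],         # H11, H12  (hypothetical, real-valued coefficients)
              [-0.3, 1.4]])        # H21, H22

# The closed-form energy levels from the expression above
avg  = (H[0, 0] + H[1, 1]) / 2
root = np.sqrt((H[0, 0] - H[1, 1])**2 / 4 + H[0, 1] * H[1, 0])
E_I, E_II = avg + root, avg - root           # upper and lower energy level

# The eigenvalues of the coefficient matrix A = -(i/hbar)·H
A = -1j / hbar * H
lambdas = np.linalg.eigvals(A)

print(E_I, E_II)                             # ≈ 1.561 and 0.839 with these numbers
print(lambdas)                               # purely imaginary: -(i/hbar)·E_I and -(i/hbar)·E_II
```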

Note that we use the Roman numerals I and II for these two energy levels, rather than the usual Arabic numbers 1 and 2. That’s in line with Feynman’s notation: it relates to a special set of base states that we will introduce shortly. Indeed, plugging them into the a1eλI·t and a2eλII·t expressions gives us a1e−(i/ħ)·EI·t and a2e−(i/ħ)·EII·t and…

Well… It’s time to go back to the physics class now. What are we writing here, really? These two functions are amplitudes for so-called stationary states, i.e. states that are associated with probabilities that do not change in time. Indeed, it’s easy to see that their absolute square is equal to:

  • PI = |a1·e−(i/ħ)·EI·t|² = |a1|²·|e−(i/ħ)·EI·t|² = |a1|²
  • PII = |a2·e−(i/ħ)·EII·t|² = |a2|²·|e−(i/ħ)·EII·t|² = |a2|²

Now, the a1 and a2 coefficients depend on the initial and/or normalization conditions of the system, so let’s leave those out for the moment and write the rather special amplitudes e−(i/ħ)·EI·t and e−(i/ħ)·EII·t as:

  • CI = 〈 I | ψ 〉 = e−(i/ħ)·EI·t
  • CII = 〈 II | ψ 〉 = e−(i/ħ)·EII·t

As you can see, there are two base states that go with these amplitudes, which we denote as state | I 〉 and | II 〉 respectively, so we can write the state vector of our two-state system – like our ammonia molecule, or whatever – as:

| ψ 〉 = | I 〉 CI + | II 〉 CII = | I 〉〈 I | ψ 〉 + | II 〉〈 II | ψ 〉

In case you forgot, you can apply the magical | = ∑ | i 〉 〈 i | formula to see this makes sense: | ψ 〉 = ∑ | i 〉 〈 i | ψ 〉 = | I 〉 〈 I | ψ 〉 + | II 〉 〈 II | ψ 〉 = | I 〉 CI + | II 〉 CII.

Of course, we should also be able to revert back to the base states we started out with so, once we've calculated C1 and C2, we can also write the state of our system in terms of state | 1 〉 and | 2 〉, which are the states as we defined them when we first looked at the problem. 🙂 In short, once we've got C1 and C2, we can also write:

| ψ 〉 = | 1 〉 C1 + | 2 〉 C2 = | 1 〉〈 1 | ψ 〉 + | 2 〉〈 2 | ψ 〉

So… Well… I guess you can sort of see how this is coming together. If we substitute what we’ve got so far, we get:

C = a1·CI·vI + a2·CII·vII

Hmm… So what's that? We've seen something like C = a1·CI + a2·CII, as we wrote something like C1 = (a/2)·CI + (b/2)·CII in our previous posts, for example—but what are those eigenvectors vI and vII? Why do we need them?

Well… They just pop up because we're solving the system as mathematicians would do it, i.e. not as Feynman-the-Great-Physicist-and-Teacher-cum-Simplifier does it. 🙂 From a mathematical point of view, they're the vectors that solve the (A − λI·I)·vI = 0 and (A − λII·I)·vII = 0 equations, so they come with the eigenvalues, and their components will depend on the eigenvalues λI and λII as well as the Hamiltonian coefficients. [I is the identity matrix in these matrix equations.] In fact, because the eigenvalues are written in terms of the Hamiltonian coefficients, they depend on the Hamiltonian coefficients only, but then it will be convenient to use the EI and EII values as a shorthand.

Of course, one can also look at them as base vectors that uniquely specify the solution C as a linear combination of vI and vII. Indeed, just ask your math teacher, or google, and you’ll find that eigenvectors can serve as a set of base vectors themselves. In fact, the transformations you need to do to relate them to the so-called natural basis are the ones you’d do when diagonalizing the coefficient matrix A, which you did when solving systems of equations back in high school or whatever you were doing at university. But then you probably forgot, right? 🙂 Well… It’s all rather advanced mathematical stuff, and so let’s cut some corners here. 🙂
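To make those eigenvectors a bit more tangible, here's a small numerical sketch – Python/NumPy, with hypothetical values for E0 and A in the ammonia-type Hamiltonian we'll use below – that builds the general solution C = a1·eλI·t·vI + a2·eλII·t·vII from the eigenvalues and eigenvectors of A = −(i/ħ)·H, fixes a1 and a2 from the starting condition C(0) = [1, 0], and checks that the result satisfies C′ = A·C with probabilities adding up to one:

```python
import numpy as np

hbar = 1.0
E0, A_coef = 1.0, 0.1                 # hypothetical values for an ammonia-like case
H = np.array([[E0, -A_coef],
              [-A_coef, E0]])
A_mat = -1j / hbar * H                # the coefficient matrix of C' = A·C

lam, v = np.linalg.eig(A_mat)         # eigenvalues lam[k] and eigenvectors v[:, k]

# Fix a1, a2 from the initial condition C(0) = [1, 0]: the system starts in state 1
a = np.linalg.solve(v, np.array([1.0, 0.0]))

def C(t):
    """General solution C(t) = a1*exp(lamI*t)*vI + a2*exp(lamII*t)*vII."""
    return v @ (a * np.exp(lam * t))

t, dt = 0.7, 1e-6
derivative = (C(t + dt) - C(t)) / dt
print(np.allclose(derivative, A_mat @ C(t), atol=1e-5))   # True: C' = A·C indeed
print(np.abs(C(t))**2, np.sum(np.abs(C(t))**2))           # P1(t), P2(t), and their sum (= 1)
```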

We know, from the physics of the situation, that the C1 and C2 functions and the CI and CII functions are related in the same way as the associated base states. To be precise, we wrote:

C1 = cos(α/2)·CI + sin(α/2)·CII
C2 = −sin(α/2)·CI + cos(α/2)·CII

This two-by-two matrix here is the transformation matrix for a rotation of a state filtering apparatus about the y-axis, over an angle equal to α, when only two states are involved. You've seen it before, but we wrote it differently:

[The same transformation matrix as we wrote it in the previous posts]

In fact, we can be more precise: the angle that we chose was equal to minus 90 degrees. Indeed, we wrote our transformation as:

[The transformation written out with the numerical values we used before – check them against α = −π/2.] However, let's keep our analysis somewhat more general for the moment, so as to see if we really need to specify that angle. After all, we're looking for a general solution here, so… Well… Remembering the definition of the inverse of a matrix (and the fact that cos²α + sin²α = 1), we can write:

CI = cos(α/2)·C1 − sin(α/2)·C2
CII = sin(α/2)·C1 + cos(α/2)·C2

Now, if we write the components of vI and vII as vI1 and vI2, and vII1 and vII2 respectively, then the C = a1·CI·vI + a2·CII·vII expression is equivalent to:

  • C1 = a1·vI1·CI + a2·vII1·CII
  • C2 = a1·vI2·CI + a2·vII2·CII

Hence, a1·vI1 = a2·vII2 = cos(α/2) and a2·vII1 = −a1·vI2 = sin(α/2). What can we do with this? Can we solve this? Not really: we’ve got two equations and four variables. So we need to look at the normalization and starting conditions now. For example, we can choose our t = 0 point such that our two-state system is in state 1, or in state I. And then we know it will not be in state 2, or state II. In short, we can impose conditions like:

|C1(0)|² = 1 = |a1·vI1·CI(0) + a2·vII1·CII(0)|² and |C2(0)|² = 0 = |a1·vI2·CI(0) + a2·vII2·CII(0)|²

However, as Feynman puts it: “These conditions do not uniquely specify the coefficients. They are still undetermined by an arbitrary phase.”

Hmm… He means the α, of course. So… What to do? Well… It’s simple. What he’s saying here is that we do need to specify that transformation angle. Just look at it: the a1·vI1 = a2·vII2 = cos(α/2) and a2·vII1 = −a1·vI2 = sin(α/2) conditions only make sense when we equate α with −π/2, so we can write:

  • a1·vI1 = a2·vII2 = cos(−π/4) = 1/√2
  • a2·vII1 = −a1·vI2 = sin(−π/4) = –1/√2

It's only then that we get a unique ratio for a1/a2 = vII2/vI1 = −vII1/vI2. [In case you think there are two angles in the circle for which the cosine equals minus the sine – or, what amounts to the same, for which the sine equals minus the cosine – then… Well… You're right, but we've got α divided by two in the argument. So if α/2 is equal to the 'other' angle, i.e. 3π/4, then α itself will be equal to 6π/4 = 3π/2. And so that's the same −π/2 angle as above: 3π/2 − 2π = −π/2, indeed. So… Yes. It all makes sense.]

What are we doing here? Well… We're sort of imposing a 'common-sense' condition here. Think of it: if the vII2/vI1 and −vII1/vI2 ratios were different, we'd have a huge problem, because we'd have two different values for the a1/a2 ratio! And… Well… That just doesn't make sense. The system must come with some specific value for a1 and a2. We can't just invent two 'new' ones!

So… Well… We are alright now, and we can analyze whatever two-state system we want. One example was our ammonia molecule in an electric field, for which we found that the following systems of equations were fully equivalent:

iħ·(dC1/dt) = (E0 + με)·C1 − A·C2 and iħ·(dC2/dt) = −A·C1 + (E0 − με)·C2

which is fully equivalent to:

iħ·(dCI/dt) = (E0 + A)·CI + με·CII and iħ·(dCII/dt) = με·CI + (E0 − A)·CII

So, the upshot is that you should always remember that everything we're doing is subject to the condition that the '1' and '2' base states and the 'I' and 'II' base states (Feynman suggests reading I and II as 'Eins' and 'Zwei' – or try 'Uno' and 'Duo' instead 🙂 – so as to distinguish them from 'one' and 'two') are 'separated' by an angle of (minus) 90 degrees. [Of course, I am not using the 'right' language here, obviously. I should say 'projected', or 'orthogonal', perhaps, but then that's hard to say for base states: the [1/√2, 1/√2] and [1/√2, −1/√2] vectors are obviously orthogonal, because their dot product is zero, but, as you know, the base states themselves do not have such a geometrical interpretation: they're just 'objects' in what's referred to as a Hilbert space. But… Well… I shouldn't dwell on that here.]

So… There we are. We’re all set. Good to go! Please note that, in the absence of an electric field, the two Hamiltonians are even simpler:

iħ·(dC1/dt) = E0·C1 − A·C2 and iħ·(dC2/dt) = −A·C1 + E0·C2

which is equivalent to:

iħ·(dCI/dt) = (E0 + A)·CI and iħ·(dCII/dt) = (E0 − A)·CII

In fact, they’ll usually do the trick in what we’re going to deal with now.

[…] So… Well… That's it, really! 🙂 We're now going to apply all this in the next posts, so as to analyze things like the stability of neutral and ionized hydrogen molecules and the binding of diatomic molecules. More interestingly, we're going to talk about virtual particles. 🙂

Addendum: I started writing this post because Feynman actually does give the impression there's some kind of 'doublet' of a1 and a2 coefficients as he starts his chapter on 'other two-state systems'. It's the symbols he's using: 'his' a1 and a2, and the other doublet with the primes, i.e. a1′ and a2′, are the transformation amplitudes, not the coefficients that I am calculating above, and that he was calculating (in the previous chapter) too. So… Well… Again, the only thing you should remember from this post is that 90 degree angle as a sort of physical 'common sense condition' on the system.

Having criticized the Great Teacher for not being consistent in his use of symbols, I should add that the interesting thing is that, while confusing, his summary in that chapter does give us precise formulas for those transformation amplitudes, which he didn't do before. Indeed, if we write them as a, b, c and d respectively (so as to avoid that confusing a1 and a2, and then a1′ and a2′ notation), so if we have:

CI = a·C1 + b·C2 and CII = c·C1 + d·C2

then one can show that:

[Feynman's formulas for the transformation amplitudes a, b, c and d in terms of the Hamiltonian coefficients]

That’s, of course, fully consistent with the ratios we introduced above, as well as with the orthogonality condition that comes with those eigenvectors. Indeed, if a/b = −1 and c/d = +1, then a/b = −c/d and, therefore, a·d + b·c = 0. [I’ll leave it to you to compare the coefficients so as to check that’s the orthogonality condition indeed.]

In short, it all shows everything does come out of the system in a mathematical way too, so the math does match the physics once again—as it should, of course! 🙂

Working with base states and Hamiltonians

I wrote a pretty abstract post on working with amplitudes, followed by more of the same, and then illustrated how it worked with a practical example (the ammonia molecule as a two-state system). Now it’s time for even more advanced stuff. Here we’ll show how to switch to another set of base states, and what it implies in terms of the Hamiltonian matrix and all of those equations, like those differential equations and – of course – the wavefunctions (or amplitudes) themselves. In short, don’t try to read this if you haven’t done your homework. 🙂

Let me continue the practical example, i.e. the example of the NH3 molecule, as shown below. We abstracted away from all of its motion, except for its angular momentum – or its spin, you'd like to say, but that's rather confusing, because we shouldn't be using that term for the classical situation we're presenting here – around its axis of symmetry. That angular momentum doesn't change from state | 1 〉 to state | 2 〉. What's happening here is that we allow the nitrogen atom to flip through to the other side, so it tunnels through the plane of the hydrogen atoms, thereby going through an energy barrier.

[Illustration: the two states of the ammonia molecule, with the nitrogen atom above or below the plane of the hydrogen atoms, and the associated electric dipole moment μ]

It's important to note that we do not specify what that energy barrier consists of. In fact, the illustration above may be misleading, because it presents all sorts of things we don't need right now, like the electric dipole moment, or the center of mass of the molecule, which actually doesn't change, unlike what's suggested above. We just put them there to remind you that (a) quantum physics is based on physics – so there's lots of stuff involved – and (b) we'll need that electric dipole moment later. But, as we're introducing it, note that we're using the μ symbol for it, which is usually reserved for the magnetic dipole moment, which is what you'd usually associate with the angular momentum or the spin, both in classical as well as in quantum mechanics. So the direction of rotation of our molecule, as indicated by the arrow around the axis at the bottom, and the μ in the illustration itself, have nothing to do with each other. So now you know. Also, as we're talking symbols, you should note the use of ε to represent an electric field. We'd usually write the electric dipole moment and the electric field vector as p and E respectively, but we use those symbols now for linear momentum and energy, and so we borrowed μ and ε from our study of magnets. 🙂

The point to note is that, when we’re talking about the ‘up’ or ‘down’ state of our ammonia molecule, you shouldn’t think of it as ‘spin up’ or ‘spin down’. It’s not like that: it’s just the nitrogen atom being beneath or above the plane of the hydrogen atoms, and we define beneath or above assuming the direction of spin actually stays the same!

OK. That should be clear enough. In quantum mechanics, the situation is analyzed by associating two energy levels with the ammonia molecule, E0 + A and E0 − A, so they are separated by an amount equal to 2A. This pair of energy levels has been confirmed experimentally: they are separated by an energy amount equal to 1×10−4 eV, so that's less than a ten-thousandth of the energy of a photon in the visible-light spectrum. Therefore, a molecule that makes a transition will emit a photon in the microwave range. The principle of a maser is based on exciting the NH3 molecules, and then inducing transitions. One can do that by applying an external electric field. The mechanism works pretty much like what we described when discussing the tunneling phenomenon: an external force field will change the energy factor in the wavefunction, by adding potential energy (let's say an amount equal to U) to the total energy, which usually consists of the internal (Eint) and kinetic (p²/(2m) = m·v²/2) energy only. So now we write a·e−i[(Eint + m·v²/2 + U)·t − p∙x]/ħ instead of a·e−i[(Eint + m·v²/2)·t − p∙x]/ħ.
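As a quick sanity check on those numbers – a back-of-the-envelope sketch only, using rounded values for the physical constants – an energy splitting of about 10−4 eV corresponds to a frequency of roughly 24 GHz, i.e. a wavelength of a bit more than a centimeter, which is indeed in the microwave range:

```python
h = 6.626e-34             # Planck's constant, in J·s
eV = 1.602e-19            # one electronvolt, in joules
c = 3.0e8                 # speed of light, in m/s

delta_E = 1e-4 * eV       # the energy splitting quoted above, in joules
f = delta_E / h           # frequency of the emitted photon
wavelength = c / f

print(f / 1e9)            # ≈ 24 GHz
print(wavelength * 100)   # ≈ 1.2 cm
```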

Of course, a·e−i·(E·t − p∙x)/ħ is an idealized wavefunction only, or a Platonic wavefunction – as I jokingly referred to it in my previous post. A real wavefunction has to deal with these uncertainties: we don't know E and p. At best, we have a discrete set of possible values, like E0 + A and E0 − A in this case. But it might as well be some range, which we denote as ΔE and Δp, and then we need to make some assumption in regard to the probability density function that we're going to associate with it. But I am getting ahead of myself here. Back to NH3, i.e. our simple two-state system. Let's first do some mathematical gymnastics.

Choosing another representation

We have two base states in this system: ‘up’ or ‘down’, which we denoted as base state | 1 〉 and base state | 2 〉 respectively. You’ll also remember we wrote the amplitude to find the molecule in either one of these two states as:

  • C1 = 〈 1 | ψ 〉 = (1/2)·e−(i/ħ)·(E0 − A)·t + (1/2)·e−(i/ħ)·(E0 + A)·t
  • C2 = 〈 2 | ψ 〉 = (1/2)·e−(i/ħ)·(E0 − A)·t − (1/2)·e−(i/ħ)·(E0 + A)·t

That gave us the following probabilities:

[Graph: the probabilities P1(t) = cos²(A·t/ħ) and P2(t) = sin²(A·t/ħ) of finding the molecule in state 1 or state 2, oscillating back and forth]

If our molecule can be in two states only, and it starts off in one, then the probability that it will remain in that state will gradually decline, while the probability that it flips into the other state will gradually increase. So that’s what’s shown above, and it makes perfect sense.
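If you want to see those curves come out of the amplitudes themselves, here's a minimal sketch – Python/NumPy, with hypothetical values for E0 and A – that evaluates C1 and C2 as written above and checks that the probabilities reduce to cos²(A·t/ħ) and sin²(A·t/ħ), adding up to one at all times:

```python
import numpy as np

hbar = 1.0
E0, A = 1.0, 0.1                              # hypothetical energy values
t = np.linspace(0, 60, 301)

C1 = 0.5 * (np.exp(-1j*(E0 - A)*t/hbar) + np.exp(-1j*(E0 + A)*t/hbar))
C2 = 0.5 * (np.exp(-1j*(E0 - A)*t/hbar) - np.exp(-1j*(E0 + A)*t/hbar))

P1, P2 = np.abs(C1)**2, np.abs(C2)**2
print(np.allclose(P1, np.cos(A*t/hbar)**2))   # True
print(np.allclose(P2, np.sin(A*t/hbar)**2))   # True
print(np.allclose(P1 + P2, 1))                # True: the probabilities add up to one
```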

Now, you may think there is only one possible set of base states here, as it's not like measuring spin along this or that direction. These two base states are much simpler: it's a matter of the nitrogen being beneath or above the plane of the hydrogens, and we're only interested in the angular momentum of the molecule around its axis of symmetry to help us define what's 'up' and what's 'down'. That's all. However, from a quantum math point of view, we can actually choose some other 'representation'. Now, these base state vectors | i 〉 are a bit tough to understand, so let's, in our first go at it, use those coefficients Ci, which are 'proper' amplitudes. We'll define two new coefficients, CI and CII, which – you've guessed it – we'll associate with an alternative set of base states | I 〉 and | II 〉. We'll define them as follows:

  • CI = 〈 I | ψ 〉 = (1/√2)·(C1 − C2)
  • CII = 〈 II | ψ 〉 = (1/√2)·(C1 + C2)

[The (1/√2) factor is there because of the normalization condition, obviously. We could take it out and then do the whole analysis to plug it in later, as Feynman does, but I prefer to do it this way, as it reminds us that our wavefunctions are to be related to probabilities at some point in time. :-)]

Now, you can easily check that, when substituting those wavefunctions above for our C1 and C2, we get:

  • CI = 〈 I | ψ 〉 = (1/√2)·e−(i/ħ)·(E0 + A)·t
  • CII = 〈 II | ψ 〉 = (1/√2)·e−(i/ħ)·(E0 − A)·t

Note that the way plus and minus signs switch here makes things not so easy to remember, but that’s how it is. 🙂 So we’ve got our stationary state solutions here, that are associated with probabilities that do not vary in time. [In case you wonder: that’s the definition of a ‘stationary state’: we’ve got something with a definite energy and, therefore, the probability that’s associated with it is some constant.] Of course, now you’ll cry wolf and say: these wavefunctions don’t actually mean anything, do they? They don’t describe how ammonia actually behaves, do they? Well… Yes and no. The base states I and II actually do allow us to describe whatever we need to describe. To be precise, describing the state φ in terms of the base states | 1 〉 and | 2 〉, i.e. writing | φ 〉 as:

| φ 〉 = | 1 〉 C1 + | 2 〉 C2,

is mathematically equivalent to writing:

| φ 〉 = | I 〉 CI + | II 〉 CII.

We can easily show that, even if it requires some gymnastics indeed—but then you should look at it as just another exercise in quantum math and so, yes, please do go through the logic. First note that the CI = 〈 I | ψ 〉 = (1/√2)·(C1 − C2) and CII = 〈 II | ψ 〉 = (1/√2)·(C1 + C2) expressions are equivalent to:

〈 I | ψ 〉 = (1/√2)·[〈 1 | ψ 〉 − 〈 2 | ψ 〉] and 〈 II | ψ 〉 = (1/√2)·[〈 1 | ψ 〉 + 〈 2 | ψ 〉]

Now, using our quantum math rules, we can abstract the | ψ 〉 away, and so we get:

〈 I | = (1/√2)·[〈 1 | − 〈 2 |] and 〈 II | = (1/√2)·[〈 1 | + 〈 2 |]

We could also have applied the complex conjugate rule to the expression for 〈 I | ψ 〉 above (the complex conjugate of a sum (or a product) is the sum (or the product) of the complex conjugates), and then abstract 〈 ψ | away, so as to write:

| I 〉 = (1/√2)·[| 1 〉 − | 2 〉] and | II 〉 = (1/√2)·[| 1 〉 + | 2 〉]

OK. So what? We’ve only shown our new base states can be written as similar combinations as those CI and CII coefficients. What proves they are base states? Well… The first rule of quantum math actually defines them as states respecting the following condition:

〈 i | j 〉 = 〈 j | i 〉 = δij, with δij = δji equal to 1 if i = j, and zero if i ≠ j

We can prove that as follows. First, use the | I 〉 = (1/√2)·[| 1 〉 − | 2 〉] and | II 〉 = (1/√2)·[| 1 〉 + | 2 〉] result above to check the following:

  • 〈 I | I 〉 = (1/√2)·[〈 I | 1 〉 − 〈 I | 2 〉]
  • 〈 II | II 〉 = (1/√2)·[〈 II | 1 〉 + 〈 II | 2 〉]
  • 〈 II | I 〉 = (1/√2)·[〈 II | 1 〉 − 〈 II | 2 〉]
  • 〈 I | II 〉 = (1/√2)·[〈 I | 1 〉 + 〈 I | 2 〉]

Now we need to find those 〈 I | i 〉 and 〈 II | i 〉 amplitudes. To do that, we can use that 〈 I | ψ 〉 = (1/√2)·[〈 1 | ψ 〉 − 〈 2 | ψ 〉] and 〈 II | ψ 〉 = (1/√2)·[〈 1 | ψ 〉 + 〈 2 | ψ 〉] equation and substitute:

  • 〈 I | 1 〉 = (1/√2)·[〈 1 | 1 〉 − 〈 2 | 1 〉] = (1/√2)
  • 〈 I | 2 〉 = (1/√2)·[〈 1 | 2 〉 − 〈 2 | 2 〉] = −(1/√2)
  • 〈 II | 1 〉 = (1/√2)·[〈 1 | 1 〉 + 〈 2 | 1 〉] =  (1/√2)
  • 〈 II | 2 〉 = (1/√2)·[〈 1 | 2 〉 + 〈 2 | 2 〉] =  (1/√2)

So we get:

  • 〈 I | I 〉 = (1/√2)·[〈 I | 1 〉 − 〈 I | 2 〉] = (1/√2)·[(1/√2) + (1/√2)] = 2/(√2·√2) = 1
  • 〈 II | II 〉 = (1/√2)·[〈 II | 1 〉 + 〈 II | 2 〉] = (1/√2)·[(1/√2) + (1/√2)] = 1
  • 〈 II | I 〉 = (1/√2)·[〈 II | 1 〉 − 〈 II | 2 〉] = (1/√2)·[(1/√2) − (1/√2)] = 0
  • 〈 I | II 〉 = (1/√2)·[〈 I | 1 〉 + 〈 I | 2 〉] = (1/√2)·[(1/√2) − (1/√2)] = 0

So… Well.. Yes. That’s equivalent to:

〈 I | I 〉 = 〈 II | II 〉 = 1 and 〈 I | II 〉 = 〈 II | I 〉 = 0

Therefore, we can confidently say that our | I 〉 = (1/√2)·[| 1 〉 − | 2 〉] and | II 〉 = (1/√2)·[| 1 〉 + | 2 〉] state vectors are, effectively, base vectors in their own right. Now, we’re going to have to grow very fond of matrices, so let me write our ‘definition’ of the new base vectors as a matrix formula:

[Matrix formula: the column of new base vectors | I 〉 and | II 〉 equals the two-by-two matrix (1/√2)·[[1, −1], [1, 1]] times the column of old base vectors | 1 〉 and | 2 〉]

You've seen this before. The two-by-two matrix is the transformation matrix for a rotation of a state filtering apparatus about the y-axis, over an angle equal to (minus) 90 degrees, when only two states are involved:

[The same transformation matrix in its general form, for an arbitrary rotation angle]
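A one-line check, if you want one – Python/NumPy, with the (1/√2)-matrix written out above – showing that the transformation is unitary (here simply orthogonal, since all entries are real), which is exactly why the orthonormality conditions for | I 〉 and | II 〉 come out right:

```python
import numpy as np

S = np.array([[1, -1],
              [1,  1]]) / np.sqrt(2)      # row 1: | I 〉, row 2: | II 〉, in terms of | 1 〉 and | 2 〉

print(np.allclose(S @ S.T, np.eye(2)))    # True: S·S^T = identity, so ⟨I|I⟩ = ⟨II|II⟩ = 1
                                          # and ⟨I|II⟩ = ⟨II|I⟩ = 0 follow automatically
```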

You’ll wonder why we should go through all that trouble. Part of it, of course, is to just learn these tricks. The other reason, however, is that it does simplify calculations. Here I need to remind you of the Hamiltonian matrix and the set of differential equations that comes with it. For a system with two base states, we’d have the following set of equations:

iħ·(dC1/dt) = H11·C1 + H12·C2
iħ·(dC2/dt) = H21·C1 + H22·C2

Now, adding and subtracting those two equations, and then differentiating the expressions you get (with respect to t), should give you the following two equations:

[The two equations that result from this procedure, involving the sum C1 + C2 and the difference C1 − C2]

So what about it? Well… If we transform to the new set of base states, and use the CI and CII coefficients instead of those C1 and C2 coefficients, then it turns out that our set of differential equations simplifies, because – as you can see – two out of the four Hamiltonian coefficients are zero, so we can write:

iħ·(dCI/dt) = (E0 + A)·CI and iħ·(dCII/dt) = (E0 − A)·CII
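Here's the same statement as a small numerical sketch – Python/NumPy, with hypothetical values for E0 and A – showing that the change of representation with the (1/√2)-matrix from above diagonalizes the Hamiltonian, with E0 + A and E0 − A on the diagonal:

```python
import numpy as np

E0, A = 1.0, 0.1                          # hypothetical values, as before
H = np.array([[E0, -A],
              [-A, E0]])                  # Hamiltonian in the (1, 2) representation

S = np.array([[1, -1],
              [1,  1]]) / np.sqrt(2)      # basis change from (1, 2) to (I, II)

H_new = S @ H @ S.T                       # Hamiltonian in the (I, II) representation
print(np.round(H_new, 12))
# [[1.1 0. ]   -> E0 + A and E0 - A on the diagonal,
#  [0.  0.9]]     and (essentially) zero amplitude to go from I to II
```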

Now you might think that’s not worth the trouble but, of course, now you know how it goes, and so next time it will be easier. 🙂

On a more serious note, I hope you can appreciate the fact that, with more states than just two, it will become important to diagonalize the Hamiltonian matrix so as to simplify the problem of solving the related set of differential equations. Once we've got the solutions, we can always go back to calculate the wavefunctions we want, i.e. the C1 and C2 functions that we happen to like more in this particular case. Just to remind you of how this works, remember that we can describe any state φ both in terms of the base states | 1 〉 and | 2 〉 as well as in terms of the base states | I 〉 and | II 〉, so we can either write:

| φ 〉 = | 1 〉 C1 + | 2 〉 C2 or, alternatively, | φ 〉 = | I 〉 CI + | II 〉 CII.

Now, if we choose, or define, CI and CII the way we do – so that's as CI = (1/√2)·(C1 − C2) and CII = (1/√2)·(C1 + C2) respectively – then the Hamiltonian matrices that come with them are the following ones:

H11 = H22 = E0 and H12 = H21 = −A in the (1, 2) representation, versus HI,I = E0 + A, HII,II = E0 − A and HI,II = HII,I = 0 in the (I, II) representation.

To understand those matrices, let me remind you here of that equation for the Hamiltonian coefficients in those matrices:

Uij(t + Δt, t) = δij + Kij(t)·Δt = δij − (i/ħ)·Hij(t)·Δt

In my humble opinion, this makes the difference clear. The | I 〉 and | II 〉 base states are clearly separated, mathematically, as much as the | 1 〉 and | 2 〉 base states were separated conceptually. There is no amplitude to go from state I to state II, but then both states are a mix of state 1 and 2, so the physical reality they’re describing is exactly the same: we’re just pushing the temporal variation of the probabilities involved from the coefficients we’re using in our differential equations to the base states we use to define those coefficients – or vice versa.

Huh? Yes… I know it's all quite deep, and I haven't quite come to terms with it myself, so that's why I'll let you think about it. 🙂 To help you think this through, think about this: the C1 and C2 wavefunctions made sense but, at the same time, they were not very 'physical' (read: classical), because they incorporated uncertainty—as they mix two different energy levels. However, the associated base states – which I'll call 'up' and 'down' here – made perfect sense, in a classical 'physical' sense, that is (my English seems to be getting poorer and poorer—sorry for that!). Indeed, in classical physics, the nitrogen atom is either here or there, right? Not somewhere in-between. 🙂 Now, the CI and CII wavefunctions make sense in the classical sense because they are stationary and, hence, they're associated with a very definite energy level. In fact, as definite, or as classical, as when we say: the nitrogen atom is either here or there. Not somewhere in-between. But they don't make sense in some other way: we know that the nitrogen atom will, sooner or later, effectively tunnel through. So they do not describe anything real. So how do we capture reality now? Our CI and CII wavefunctions don't do that explicitly, but implicitly, as the base states now incorporate all of the uncertainty. Indeed, the CI and CII wavefunctions are described in terms of the base states I and II, which themselves are a mixture of our 'classical' up or down states. So, yes, we are kicking the ball around here, from a math point of view. Does that make sense? If not, sorry. I can't do much more. You'll just have to think through this yourself. 🙂

Let me just add one little note, totally unrelated to what I just wrote, to conclude this little excursion. I must assume that, in regard to diagonalization, you've heard about eigenvalues and eigenvectors. In fact, I must assume you heard about this when you learned about matrices in high school. So… Well… In case you wonder, that's where we need this stuff. 🙂

OK. On to the next !

The general solution for a two-state system

Now, you’ll wonder why, after all of the talk about the need to simplify the Hamiltonian, I will now present a general solution for any two-state system, i.e. any pair of Hamiltonian equations for two-state systems. However, you’ll soon appreciate why, and you’ll also connect the dots with what I wrote above.

Let me first give you the general solution. In fact, I’ll copy it from Feynman (just click on it to enlarge it, or read it in Feynman’s Lecture on it yourself):

[Feynman's general solution for the two-state system with constant Hamiltonian coefficients, copied from his Lecture]

The problem is, of course, how do we interpret that solution? Let me make it big:

EI = (H11 + H22)/2 + √[(H11 − H22)²/4 + H12·H21]
EII = (H11 + H22)/2 − √[(H11 − H22)²/4 + H12·H21]

This says that the general solution to any two-state system amounts to calculating two separate energy levels using the Hamiltonian coefficients as they are being used in those equations above. So there is an ‘upper’ energy level, which is denoted as EI, and a ‘lower’ energy level, which is denoted as EII.

What? So it doesn’t say anything about the Hamiltonian coefficients themselves? No. It doesn’t. What did you expect? Those coefficients define the system as such. So the solution is as general as the ‘two-state system’ we wanted to solve: conceptually, it’s characterized by two different energy levels, but that’s about all we can say about it.

[…] Well… No. The solutions above are specific functional forms and, to find them, we had to make certain assumptions and impose certain conditions so as to ensure there's any non-zero solution at all! In fact, that's all the fine print above, so I won't dwell on that—and you had better stop complaining! 🙂 Having said that, the solutions above are very general indeed, and so now it's up to us to look at specific two-state systems, like our ammonia molecule, and make educated guesses so as to come up with plausible values or functional forms for those Hamiltonian coefficients. That's what we did when we equated H11 and H22 with some average energy E0, and H12 and H21 with some energy A. [Minus A, in fact—but we might have chosen some positive value +A. Same solution. In fact, I wonder why Feynman didn't go for the +A value. It doesn't matter, really, because we're talking energy differences, but… Well… Any case… That's how it is. I guess he just wanted to avoid having to switch the indices 1 and 2, and the coefficients a and b and what have you. But it's the same. Honestly. :-)]

So… Well… We could do the same here and analyze the solutions we've found in our previous posts but… Well… I don't think that's very interesting. In addition, I'll make some references to that in my next post anyway, where we're going to be analyzing the ammonia molecule in terms of its I and II states, so as to prepare a full-blown analysis of how a maser works.

Just to whet your appetite, let me tell you that the mysterious I and II states do have a wonderfully practical physical interpretation as well. Just scroll all the way back up, and look at the opposite electric dipole moments that are associated with states 1 and 2. Now, the two pictures have the angular momentum in the same direction, but we might expect that, when looking at a beam of random NH3 molecules – think of gas being let out of a little jet 🙂 – the angular momentum will be distributed randomly. So… Well… The thing is: the molecules in state I, or in state II, will all have their electric dipole moment lined up in the very same physical direction. So, in that sense, they're really 'up' or 'down', and we'll be able to separate them in an inhomogeneous electric field, just like we were able to separate 'up' or 'down' electrons, protons or whatever spin-1/2 particles in an inhomogeneous magnetic field.

But so that’s for the next post. I just wanted to tell you that our | I 〉 and | II 〉 base states do make sense. They’re more than just ‘mathematical’ states. They make sense as soon as we’re moving away from an analysis in terms of one NH3 molecule only because… Well… Are you surprised, really? You shouldn’t be. 🙂 Let’s go for it straight away.

The ammonia molecule in an electric field

Our educated guess of the Hamiltonian matrix for the ammonia molecule was the following:

H11 = H22 = E0 and H12 = H21 = −A

This guess was ‘educated’ because we knew what we wanted to get out of it, and that’s those time-dependent probabilities to be in state 1 or state 2:

[Graph: the probabilities P1(t) and P2(t), oscillating between zero and one]

Now, we also know that state 1 and 2 are associated with opposite electric dipole moments, as illustrated below.

[Illustration: the two states of the ammonia molecule, with their opposite electric dipole moments]

Hence, it’s only natural, when applying an external electric field ε to a whole bunch of ammonia molecules –think of some beam – that our ‘educated’ guess would change to:

H11 = E0 + με, H22 = E0 − με, and H12 = H21 = −A

Why the minus sign for με in the H22 term? You can answer that question yourself: the associated energy is μ·ε = μ·ε·cosθ, and θ is ±π here, as we’re talking opposite directions. So… There we are. 🙂 The consequences show when using those values in the general solution for our system of differential equations. Indeed, the

EI = (H11 + H22)/2 + √[(H11 − H22)²/4 + H12·H21]
EII = (H11 + H22)/2 − √[(H11 − H22)²/4 + H12·H21]

equations become:

EI = E0 + √(A² + μ²ε²) and EII = E0 − √(A² + μ²ε²)

The graph of this looks as follows:

[Graph: the two energy levels EI and EII as a function of the field strength ε – they spread further apart as the field gets stronger]
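If you want to reproduce those curves, here's a minimal sketch – Python/NumPy, with hypothetical values for E0, A and μ, chosen for illustration only – that just evaluates the two energy levels above as a function of the field strength ε:

```python
import numpy as np

E0, A, mu = 1.0, 0.1, 0.05                  # hypothetical values (natural units)
eps = np.linspace(0, 5, 101)                # external field strength

E_I  = E0 + np.sqrt(A**2 + (mu * eps)**2)   # upper level: pushed up by the field
E_II = E0 - np.sqrt(A**2 + (mu * eps)**2)   # lower level: pushed down by the field

print(E_I[0] - E_II[0])                     # 2·A at zero field
print(E_I[-1] - E_II[-1])                   # ≈ 2·mu·eps for strong fields
```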

The upshot is: we can separate the NH3 molecules in an inhomogeneous electric field based on their state – and then I mean state I or II, not state 1 or 2. How? Let me copy Feynman on that: it's like a Stern-Gerlach apparatus, really. 🙂

So that's it. We get the following:

[Diagram: the state selector – molecules in state I and state II are deflected in opposite directions by the inhomogeneous electric field]

That will feed into the maser, which looks as follows:

[Diagram: the ammonia maser – the state selector feeding into a resonant cavity]

But… Well… Analyzing how a maser works involves another realm of physics: cavities and resonances. I don’t want to get into that here. I only wanted to show you why and how different representations of the same thing are useful, and how it translates into a different Hamiltonian matrix. I think I’ve done that, and so let’s call it a night. 🙂 I hope you enjoyed this one. If not… Well… I did. 🙂

Quantum math: the Hamiltonian

After all of the 'rules' and 'laws' we've introduced in our previous post, you might think we're done but, of course, we aren't. Things change. As Feynman puts it: "One convenient, delightful 'apparatus' to consider is merely a wait of a few minutes; during the delay, various things could be going on—external forces applied or other shenanigans—so that something is happening. At the end of the delay, the amplitude to find the thing in some state χ is no longer exactly the same as it would have been without the delay."

In short, the picture we presented in the previous posts was a static one. Time was frozen. In reality, time passes, and so we now need to look at how amplitudes change over time. That’s where the Hamiltonian kicks in. So let’s have a look at that now.

[If you happen to understand the Hamiltonian already, you may want to have a look at how we apply it to a real situation: we’ll explain the basics involving state transitions of the ammonia molecule, which are a prerequisite to understanding how a maser works, which is not unlike a laser. But that’s for later. First we need to get the basics.]

Using Dirac's bra-ket notation, which we introduced in the previous posts, we can write the amplitude to find a 'thing' – i.e. a particle, for example, or some system of particles or other things – in some state χ at the time t = t2, when it was in some state φ at the time t = t1, as follows:

〈 χ | U(t2, t1) | φ 〉

Don’t be scared of this thing. If you’re unfamiliar with the notation, just check out my previous posts: we’re just replacing A by U, and the only thing that we’ve modified is that the amplitudes to go from φ to χ now depend on t1 and t2. Of course, we’ll describe all states in terms of base states, so we have to choose some representation and expand this expression, so we write: 

〈 χ | U(t2, t1) | φ 〉 = ∑ij 〈 χ | i 〉〈 i | U(t2, t1) | j 〉〈 j | φ 〉

I've explained the point a couple of times already, but let me note it once more: in quantum physics, we always measure some (vector) quantity – like angular momentum, or spin – in some direction, let's say the z-direction, or the x-direction, or whatever direction really. Now we can do that in classical mechanics too, of course, and then we find the component of that vector quantity (vector quantities are defined by their magnitude and, importantly, their direction). However, in classical mechanics, we know the components in the x-, y- and z-direction will unambiguously determine that vector quantity. In quantum physics, it doesn't work that way. The magnitude is never all in one direction only, so we can always find some of it in some other direction (see my post on transformations, or on quantum math in general). So there is an ambiguity in quantum physics that has no parallel in classical mechanics. So the concept of a component of a vector needs to be carefully interpreted. There's nothing definite there, like in classical mechanics: all we have is amplitudes, and all we can do is calculate probabilities, i.e. expected values based on those amplitudes.

In any case, I can't keep repeating this, so let me move on. In regard to that 〈 χ | U | φ 〉 expression, I should, perhaps, add a few remarks. First, why U instead of A? The answer: no special reason, but it's true that the use of U reminds us of energy, like potential energy, for example. We might as well have used W. The point is: energy and momentum do appear in the argument of our wavefunctions, and so we might as well remind ourselves of that by choosing symbols like W or U here. Second, we may, of course, want to choose our time scale such that t1 = 0. However, it's fine to develop the more general case. Third, it's probably good to remind ourselves we can think of matrices to model it all. More in particular, if we have three base states, say 'plus', 'zero', or 'minus', and denoting 〈 i | φ 〉 and 〈 i | χ 〉 as Ci and Di respectively (so 〈 χ | i 〉 = 〈 i | χ 〉* = Di*), then we can re-write the expanded expression above as:

〈 χ | U | φ 〉 = ∑ij Di*·Uij·Cj, with Uij = 〈 i | U | j 〉 – i.e. a row matrix of the Di* amplitudes times the three-by-three matrix of the Uij amplitudes times the column matrix of the Cj amplitudes.
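In matrix language, that's just a row times a matrix times a column. A tiny sketch of that bookkeeping – Python/NumPy, with made-up amplitudes and, for lack of anything better, the identity matrix standing in for the Uij amplitudes:

```python
import numpy as np

# Made-up amplitudes for a three-state example ('plus', 'zero', 'minus')
C = np.array([0.6, 0.48j, 0.64])          # C_i = ⟨ i | φ ⟩
D = np.array([0.8, 0.6, 0.0])             # D_i = ⟨ i | χ ⟩
U = np.eye(3, dtype=complex)              # stand-in for the U_ij matrix ('do nothing')

# ⟨ χ | U | φ ⟩ = Σ_ij D_i*·U_ij·C_j: row of conjugates, times U, times the column of C_j
amplitude = D.conj() @ U @ C
print(amplitude)                          # (0.48+0.288j) with these made-up numbers
```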

Fourth, you may have heard of the S-matrix, which is also known as the scattering matrix—which explains the S in front, but it's actually a more general thing. Feynman defines the S-matrix as the U(t2, t1) matrix for t1 → −∞ and t2 → +∞, so as some kind of limiting case of U. That's true in the sense that the S-matrix is used to relate initial and final states, indeed. However, the relation between the S-matrix and the so-called evolution operators U is slightly more complex than he wants us to believe. I can't say too much about this now, so I'll just refer you to the Wikipedia article on that, as I have to move on.

The key to the analysis is to break things up once more. More in particular, one should appreciate that we could look at three successive points in time, t1, t2, t3, and write U(t3, t1) as:

U(t3, t1) = U(t3, t2)·U(t2, t1)

It's just like adding another apparatus in series, so it's just like what we did in our previous post, when we wrote:

〈 χ | B·A | φ 〉 = ∑i 〈 χ | B | i 〉〈 i | A | φ 〉

So we just put a | bar between B and A and wrote it all out. That | bar is really like a factor 1 in multiplication but – let me caution you – you really need to watch the order of the various factors in your product, and read symbols in the right order, which is often from right to left, like in Hebrew or Arabic, rather than from left to right. In that regard, you should note that we wrote U(t3, t1) rather than U(t1, t3): you need to keep your wits about you here! So as to make sure we can all appreciate that point, let me show you what that U(t3, t1) = U(t3, t2)·U(t2, t1) actually says by spelling it out if we have two base states only (like 'up' or 'down', which I'll note as '+' and '−' again):

U++(t3, t1) = U++(t3, t2)·U++(t2, t1) + U+−(t3, t2)·U−+(t2, t1)
U+−(t3, t1) = U++(t3, t2)·U+−(t2, t1) + U+−(t3, t2)·U−−(t2, t1)
U−+(t3, t1) = U−+(t3, t2)·U++(t2, t1) + U−−(t3, t2)·U−+(t2, t1)
U−−(t3, t1) = U−+(t3, t2)·U+−(t2, t1) + U−−(t3, t2)·U−−(t2, t1)
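Here's a quick numerical illustration – Python, using NumPy and SciPy, with hypothetical Hamiltonian values and the standard result (not derived in this post) that, for a constant Hamiltonian, U(tb, ta) is the matrix exponential of −(i/ħ)·H·(tb − ta) – just to see the composition rule, and the importance of the order of the arguments, at work:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
H = np.array([[1.0, -0.1],
              [-0.1, 1.0]])                 # constant Hamiltonian, hypothetical values

def U(t_b, t_a):
    """Evolution matrix U(t_b, t_a) for a constant H: expm(-(i/hbar)*H*(t_b - t_a))."""
    return expm(-1j / hbar * H * (t_b - t_a))

t1, t2, t3 = 0.0, 0.4, 1.1
# Read right to left: U(t2, t1) acts first, then U(t3, t2)
print(np.allclose(U(t3, t1), U(t3, t2) @ U(t2, t1)))    # True
print(np.allclose(U(t3, t1), U(t1, t3)))                # False: U(t1, t3) undoes U(t3, t1)
```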

So now you appreciate why we try to simplify our notation as much as we can! But let me get back to the lesson. To explain the Hamiltonian, which we need to describe how states change over time, Feynman embarks on a rather spectacular differential analysis. Now, we've done such exercises before, so don't be too afraid. He substitutes t for t1 tout court, and t + Δt for t2, with Δt the infinitesimal you know from Δy = (dy/dx)·Δx, with the derivative dy/dx being defined as the Δy/Δx ratio for Δx → 0. So we write U(t2, t1) = U(t + Δt, t). Now, we also explained the idea of an operator in our previous post. It came up when we were being creative, and so we dropped the 〈 χ | state from the 〈 χ | A | φ〉 expression and just wrote:

| ψ 〉 = A | φ 〉

If you ‘get’ that, you’ll also understand what I am writing now:

| ψ(t + Δt) 〉 = U(t + Δt, t) | ψ(t) 〉

This is quite abstract, however. It is an ‘open’ equation, really: one needs to ‘complete’ it with a ‘bra’, i.e. a state like 〈 χ |, so as to give a 〈 χ | ψ〉 = 〈 χ | A | φ〉 type of amplitude that actually means something. What we’re saying is that our operator (or our ‘apparatus’ if it helps you to think that way) does not mean all that much as long as we don’t measure what comes out, so we have to choose some set of base states, i.e. a representation, which allows us to describe the final state, which we write as 〈 χ |. In fact, what we’re interested in is the following amplitudes:

〈 i | ψ(t + Δt) 〉 = 〈 i | U(t + Δt, t) | ψ(t) 〉

So now we’re in business, really. 🙂 If we can find those amplitudes, for each of our base states i, we know what’s going on. Of course, we’ll want to express our ψ(t) state in terms of our base states too, so the expression we should be thinking of is:

〈 i | ψ(t + Δt) 〉 = ∑j 〈 i | U(t + Δt, t) | j 〉〈 j | ψ(t) 〉

Phew! That looks rather unwieldy, doesn’t it? You’re right. It does. So let’s simplify. We can do the following substitutions:

  • 〈 i | ψ(t + Δt)〉 = Ci(t + Δt) or, more generally, 〈 j | ψ(t)〉 = Cj(t)
  • 〈 i | U(t2, t1) | j〉 = Uij(t2, t1) or, more specifically, 〈 i | U(t + Δt, t) | j〉 = Uij(t + Δt, t)

Ci(t + Δt) = ∑j Uij(t + Δt, t)·Cj(t)

As Feynman notes, that's what the dynamics of quantum mechanics really look like. But, of course, we do need something in terms of derivatives rather than in terms of differentials. That's where the Δy = (dy/dx)·Δx equation comes in. The analysis looks kinda dicey because it's like doing some kind of first-order linear approximation of things – rather than an exact kinda thing – but that's how it is. Let me remind you of the following formula: if we write our function y as y = f(x), and we're evaluating the function near some point a, then our Δy = (dy/dx)·Δx equation can be used to write:

y = f(x) ≈ f(a) + f'(a)·(x − a) = f(a) + (dy/dx)·Δx

To remind yourself of how this works, you can complete the drawing below with the actual y = f(x) as opposed to the f(a) + Δy approximation, remembering that the (dy/dx) derivative gives you the slope of the tangent to the curve, but it’s all kids’ stuff really and so we shouldn’t waste too much spacetime on this. 🙂

[Figure: the tangent line f(a) + f′(a)·(x − a) approximating y = f(x) near x = a]

The point is: our Uij(t + Δt, t) is a function too, not only of time, but also of i and j. It's just a rather special function, because we know that, for Δt → 0, Uij will be equal to 1 if i = j (in plain language: if Δt goes to zero, nothing happens and we just stay in state i), and equal to 0 if i ≠ j. That's just as per the definition of our base states. Indeed, remember the first 'rule' of quantum math:

〈 i | j 〉 = 〈 j | i 〉 = δij, with δij = δji equal to 1 if i = j, and zero if i ≠ j

So we can write our f(x) ≈ f(a) + (dy/dx)·Δx expression for Uij as:

Uij(t + Δt, t) = δij + Kij(t)·Δt

So Kij is also some kind of derivative, and the Kronecker delta, i.e. δij, serves as the reference point around which we're evaluating Uij. However, that's about as far as the comparison goes. We need to remind ourselves that we're talking complex-valued amplitudes here. In that regard, it's probably also good to remind ourselves once more that we need to watch the order of stuff: Uij = 〈 i | U | j〉, so that's the amplitude to go from base state j to base state i, rather than the other way around. Of course, we have the 〈 χ | φ 〉 = 〈 φ | χ 〉* rule, but we still need to see how that plays out with an expression like 〈 i | U(t + Δt, t) | j〉. So, in short, we should be careful here!

Having said that, we can actually play a bit with that expression, and so that’s what we’re going to do now. The first thing we’ll do is to write Kij as a function of time indeed:

Kij = Kij(t)

So we don't have that Δt in the argument. It's just like dy/dx = f′(x): a derivative is a derivative—a function which we derive from some other function. However, we'll do something weird now: just like any function, we can multiply or divide it by some constant, so we can write something like G(x) = c·F(x), which is equivalent to saying that F(x) = G(x)/c. I know that sounds silly, but that's how it is, and we can also do it with complex-valued functions: we can define some other function by multiplying or dividing by some complex-valued constant, like a + b·i, or ξ or whatever other constant. Just note that the i here is no longer a base state index but the imaginary unit. So it's all done so as to confuse you even more. 🙂

So let's take −i/ħ as our constant and re-write our Kij(t) function as −i/ħ times some other function, which we'll denote by Hij(t), so Kij(t) = −(i/ħ)·Hij(t). You guessed it, of course: Hij(t) is the infamous Hamiltonian, and it's written the way it's written both for historical as well as for practical reasons, which you'll soon discover. Of course, we're talking one coefficient only and we'll have nine if we have three base states i and j, or four if we have only two. So we've got an n-by-n matrix once more. As for its name… Well… As Feynman notes: "How Hamilton, who worked in the 1830s, got his name on a quantum mechanical matrix is a tale of history. It would be much better called the energy matrix, for reasons that will become apparent as we work with it."

OK. So we’ll just have to acknowledge that and move on. Our Uij(t + Δt, t) = δij + Kij(t)·Δt expression becomes:

Uij(t + Δt, t) = δij − (i/ħ)·Hij(t)·Δt

[Isn’t it great you actually start to understand those Chinese-looking formulas? :-)] We’re not there yet, however. In fact, we’ve still got quite a bit of ground to cover. We now need to take that other monster:

Ci(t + Δt) = ∑j Uij(t + Δt, t)·Cj(t)

So let’s substitute now, so we get:

Ci(t + Δt) = ∑j [δij − (i/ħ)·Hij(t)·Δt]·Cj(t)

We can get this in the form we want to get – so that's the form you'll find in textbooks 🙂 – by noting that the ∑j δij·Cj(t) sum, taken over all j, is, quite simply, equal to Ci(t). [Think about the indexes here: we're looking at some i, and so it's only the j that's taking on whatever value it can possibly have.] So we can move that to the other side, which gives us Ci(t + Δt) − Ci(t). We can then divide both sides of our expression by Δt, which gives us an expression like [f(x + Δx) − f(x)]/Δx = Δy/Δx, which is actually the definition of the derivative for Δx going to zero. Now, that allows us to re-write the whole thing in terms of a proper derivative, rather than having to work with this rather unwieldy differential stuff. So, if we substitute d[Ci(t)]/dt for [Ci(t + Δt) − Ci(t)]/Δt, and then also move −(i/ħ) to the left-hand side, remembering that 1/i = −i (and, hence, [−(i/ħ)]−1 = i·ħ), we get the formula in the shape we wanted it in:

iħ·(dCi(t)/dt) = ∑j Hij(t)·Cj(t)

Done! Of course, this is a set of differential equations and… Well… Yes. Yet another set of differential equations. 🙂 It seems like we can't solve anything in physics without involving differential equations, doesn't it? But… Well… I guess that's the way it is. So, before we turn to some example, let's note a few things.

First, we know that a particle, or a system, must be in some state at any point of time. That's equivalent to stating that the sum of the probabilities |Ci(t)|² = |〈 i | ψ(t)〉|² is some constant. In fact, we'd like to say it's equal to one, but then we haven't normalized anything here. You can fiddle with the formulas but it's probably easier to just acknowledge that, if we'd measure anything – think of the angular momentum along the z-direction, or some other direction, if you'd want an example – then we'll find it's either 'up' or 'down' for a spin-1/2 particle, or 'plus', 'zero', or 'minus' for a spin-1 particle.

Now, we know that the complex conjugate of a sum is equal to the sum of the complex conjugates: [∑ zi]* = ∑ zi*, and that the complex conjugate of a product is the product of the complex conjugates, so we have [∑ zi·zj]* = ∑ zi*·zj*. Now, some fiddling with the formulas above should allow you to prove that Hij = Hji*, so the Hamiltonian matrix is equal to its own conjugate transpose: it is what's usually referred to as a Hermitian matrix. If the original Hamiltonian matrix is denoted as H, then its conjugate transpose will be denoted by H*, H† or even HH (so the H in the superscript stands for Hermitian, instead of Hamiltonian). So… Yes. There are competing notations around. 🙂
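As a small illustration of why that Hermiticity matters – a sketch only, with a made-up Hermitian matrix, and using the matrix-exponential solution for a constant Hamiltonian as a shortcut – the total probability stays put as the amplitudes evolve:

```python
import numpy as np
from scipy.linalg import expm

hbar = 1.0
H = np.array([[1.0, 0.2 - 0.3j],
              [0.2 + 0.3j, 1.5]])          # made-up Hermitian matrix: H[i][j] = H[j][i]*
print(np.allclose(H, H.conj().T))          # True: H equals its conjugate transpose

C0 = np.array([1.0, 0.0], dtype=complex)   # start in state 1
for t in (0.5, 2.0, 10.0):
    C = expm(-1j / hbar * H * t) @ C0      # evolve the amplitudes over a time t
    print(np.sum(np.abs(C)**2))            # = 1 (up to rounding): total probability is conserved
```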

The simplest situation, of course, is when the Hamiltonian coefficients do not depend on time. In that case, we're back in the static case, and all Hij coefficients are just constants. For a system with two base states, we'd have the following set of equations:

iħ·(dC1/dt) = H11·C1 + H12·C2
iħ·(dC2/dt) = H21·C1 + H22·C2

This set of two equations can be easily solved by remembering the solution for one equation only. Indeed, if we assume there's only one base state – which is like saying: the particle is at rest somewhere (yes: it's that stupid!) – our set of equations reduces to only one:

iħ·(dC1/dt) = H11·C1

This is a differential equation which is easily solved to give:

C1(t) = a·e−(i/ħ)·H11·t

[As for being 'easily solved', just remember the exponential function is its own derivative and, therefore, d[a·e−(i/ħ)·H11·t]/dt = a·d[e−(i/ħ)·H11·t]/dt = −a·(i/ħ)·H11·e−(i/ħ)·H11·t, which gives you the differential equation, so… Well… That's the solution.]
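By the way, the Uij(t + Δt, t) = δij − (i/ħ)·Hij(t)·Δt formula gives you a nice way to check this numerically: step the amplitude forward with that little evolution factor many times and compare the result with the exponential. A sketch – Python, with a hypothetical value for H11:

```python
import numpy as np

hbar = 1.0
H11 = 2.0                                  # a constant 'energy' (hypothetical value)
a = 1.0                                    # amplitude at t = 0

T, N = 1.0, 100_000
dt = T / N

C = a
for _ in range(N):
    C = (1 - 1j / hbar * H11 * dt) * C     # U(t + dt, t) ≈ 1 - (i/ħ)·H11·dt

exact = a * np.exp(-1j / hbar * H11 * T)   # the closed-form solution above
print(abs(C - exact))                      # small, and it shrinks further as dt gets smaller
```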

This should, of course, remind you of the equation that inspired Louis de Broglie to write down his now famous matter-wave equation (see my post on the basics of quantum math):

a·ei·θ = a·e−i·(ω·t − k∙x) = a·e−(i/ħ)·(E·t − p∙x)

Indeed, if we look at the temporal variation of this function only – so we don't consider the space variable x – then this equation reduces to a·e−(i/ħ)·(E·t), and so we find that our Hamiltonian coefficient H11 is equal to the energy of our particle, so we write: H11 = E, which, of course, explains why Feynman thinks the Hamiltonian matrix should be referred to as the energy matrix. As he puts it: "The Hamiltonian is the generalization of the energy for more complex situations."

Now, I’ll conclude this post by giving you the answer to Feynman’s remark on why the Irish 19th century mathematician William Rowan Hamilton should be associated with the Hamiltonian. The truth is: the term ‘Hamiltonian matrix’ may also refer to a more general notion. Let me copy Wikipedia here: “In mathematics, a Hamiltonian matrix is a 2n-by-2n matrix A such that JA is symmetric, where J is the skew-symmetric matrix

J = [ 0    In ]
    [ −In   0 ]

and In is the n-by-n identity matrix. In other words, A is Hamiltonian if and only if (JA)T = JA, where ( )T denotes the transpose." So… That's the answer. 🙂 And there's another reason too: Hamilton invented the quaternions and… Well… I'll leave it to you to check out what these have got to do with quantum physics. 🙂

[…] Oh! And what about the maser example? Well… I am a bit tired now, so I'll just refer you to Feynman's exposé on it. It's not that difficult if you understood all of the above. In fact, it's actually quite straightforward, and so I really recommend you work your way through the example, as it will give you a much better 'feel' for the quantum-mechanical framework we've developed so far. In fact, walking through the whole thing is like a kind of 'reward' for having worked so hard on the more abstract stuff in this and my previous posts. So… Yes. Just go for it! 🙂 [And, just in case you don't want to go for it, I did write a little introduction to it in the following post. :-)]