I actually wanted to write about the Hamiltonian matrix. However, I realize that, before I can serve the *plat de résistance*, we need to review or introduce some more concepts and ideas. It all revolves around the same theme: working with states is like working with vectors, but you need to know *how* exactly. Let’s go for it. 🙂

In my previous posts, I repeatedly said that a set of base states is like a coordinate system. A coordinate system allows us to describe (i.e. uniquely identify) *vectors* in an n-dimensional space: we associate a vector with a set of real numbers, like x, y and z, for example. Likewise, we can describe any state in terms of a set of *complex *numbers – *amplitudes*, really – once we’ve chosen a set of base states. We referred to this set of base states as a ‘representation’. For example, if our set of base states is +S, 0S and −S, then any state φ can be defined by the *amplitudes* C_{+} = 〈 +S | φ 〉, C_{0} = 〈 0S | φ 〉, and C_{−} = 〈 −S | φ 〉.

We have to choose *some* representation (but we are free to choose which one) because, as I demonstrated when doing a practical example (see my description of muon decay in my post on how to work with amplitudes), we’ll usually want to calculate something like the amplitude to go from one state to another – which we denoted as 〈 χ | φ 〉 – and *we’ll do that by breaking it up*. To be precise, we’ll write that amplitude 〈 χ | φ 〉 – i.e. the amplitude to go from state φ to state χ (you have to read this thing from right to left, like Hebrew or Arabic) – as the following sum:

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 (sum over *all* i)

So that’s a sum over a *complete *set of base states (that’s why I write *all *i under the summation symbol ∑). We discussed this rule in our presentation of the ‘Laws’ of quantum math.

Now we can play with this. As χ can be defined in terms of the chosen set of base states too, it’s handy to know that 〈 χ | i 〉 and 〈 i | χ 〉 are each other’s complex conjugates – we write this as: 〈 χ | i 〉 = 〈 i | χ 〉* – so if we have one, we have the other (we can also write: 〈 i | χ 〉* = 〈 χ | i 〉). In other words, if we have all C_{i} = 〈 i | φ 〉 and all D_{i} = 〈 i | χ 〉, i.e. the ‘components’ of both states in terms of our base states, then we can calculate 〈 χ | φ 〉 as:

〈 χ | φ 〉 = ∑ D_{i}*C_{i} = ∑〈 χ | i 〉〈 i | φ 〉,

provided we make sure we do the summation over a *complete* set of base states. For example, if we’re looking at the angular momentum of a spin-1/2 particle, like an electron or a proton, then we’ll have two base states, +ħ/2 and −ħ/2, so then we’ll have only two terms in our sum. The spin number (*j*) of a cobalt nucleus, however, is 7/2, so if we look at the angular momentum of a cobalt nucleus, we’ll have *eight* (2·j + 1) base states and, hence, eight terms when doing the sum. So it’s very much like working with vectors, indeed, and that’s why states are often referred to as **state vectors**. So now you know that term too. 🙂
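To make the ∑ D_{i}*C_{i} rule concrete, here is a minimal numerical sketch. The amplitudes are made-up numbers for a spin-one particle (three base states), and the helper `n_base_states` is just a hypothetical name for the 2·j + 1 count mentioned above.

```python
import numpy as np

# Made-up amplitudes in the order (+S, 0S, -S); they are assumptions
# chosen for illustration, not values from any experiment.
C = np.array([0.6, 0.8j, 0.0])             # C_i = <i|phi>
D = np.array([1j, 1.0, 0.0]) / np.sqrt(2)  # D_i = <i|chi>

# <chi|phi> = sum over i of D_i* C_i, written out term by term
amp_sum = sum(np.conj(D[i]) * C[i] for i in range(3))

# np.vdot conjugates its FIRST argument, so it computes the same sum
amp_vdot = np.vdot(D, C)


def n_base_states(j):
    """Number of base states (2j + 1) for spin number j."""
    return int(2 * j + 1)


print(amp_sum, amp_vdot)
print(n_base_states(0.5), n_base_states(1), n_base_states(3.5))
```

Note that the complex conjugate goes on the D numbers, i.e. on the state we are going *to*, which is exactly what `np.vdot` does with its first argument.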

However, the similarities run even deeper, and we’ll explore all of them in this post. You may or may not remember that your math teacher actually also defined ordinary vectors in three-dimensional space in terms of *base vectors* **e**_{i}, defined as: **e**_{1} = [1, 0, 0], **e**_{2} = [0, 1, 0] and **e**_{3} = [0, 0, 1]. You may also remember that the units along the x, y and z-axis didn’t have to be the same – we could, for example, measure in cm along the x-axis, but in inches along the z-axis, even if that’s not very convenient to calculate stuff – but that it was *very* important to ensure that the base vectors were a set of *orthogonal* vectors. In any case, we’d choose our set of orthogonal base vectors and write all of our vectors as:

**A** = A_{x}·**e**_{1} + A_{y}·**e**_{2 }+ A_{z}·**e**_{3}

That’s simple enough. In fact, one might say that the equation above actually *defines *coordinates. However, there’s another way of defining them. We can write A_{x}, A_{y}, and A_{z} as *vector dot products*, aka *scalar *vector products (as opposed to *cross *products, or vector products *tout court*). Check it:

A_{x }= **A**·**e**_{1}, A_{y }= **A**·**e**_{2}, and A_{z }= **A**·**e**_{3}.

This actually allows us to re-write the vector dot product **A**·**B** in a way you probably haven’t seen before. Indeed, you’d usually calculate **A**·**B** as |**A**|∙|**B**|·cosθ = A∙B·cosθ (A and B are the *magnitudes* of the vectors **A** and **B** respectively) or, quite simply, as A_{x}B_{x} + A_{y}B_{y} + A_{z}B_{z}. However, using the dot products above, we can now also write it as:

**B**·**A** = ∑(**B**·**e**_{i})(**e**_{i}·**A**) (sum over all i)

We deliberately wrote **B**·**A** instead of **A**·**B** because, while the mathematical similarity with the

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉

equation is obvious, **B**·**A** = **A**·**B** but 〈 χ | φ 〉 ≠ 〈 φ | χ 〉. Indeed, 〈 χ | φ 〉 and 〈 φ | χ 〉 are complex conjugates – so 〈 χ | φ 〉 = 〈 φ | χ 〉* – but they’re not equal. So we’ll have to watch the order when working with those amplitudes. That’s because we’re working with *complex* numbers instead of *real *numbers. Indeed, it’s only because the **A**·**B** dot product involves *real *numbers, whose complex conjugate is the same, that we have that commutativity in the *real *vector space. Apart from that – so apart from having to carefully check the order of our products – **the correspondence is complete**.
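A quick numerical check of both statements, as a sketch with made-up numbers: the real dot product written as a sum over base vectors is commutative, while swapping the two states in a complex amplitude gives the complex conjugate.

```python
import numpy as np

# Real case: B.A = sum over i of (B.e_i)(e_i.A)
e = np.eye(3)                     # rows are e_1, e_2, e_3
A = np.array([1.0, 2.0, 3.0])     # arbitrary example vectors
B = np.array([4.0, -1.0, 0.5])
BA = sum((B @ e[i]) * (e[i] @ A) for i in range(3))
AB = sum((A @ e[i]) * (e[i] @ B) for i in range(3))

# Complex case: made-up amplitudes C_i = <i|phi> and D_i = <i|chi>
C = np.array([0.6, 0.8j, 0.0])
D = np.array([1j, 1.0, 0.0]) / np.sqrt(2)
chi_phi = np.vdot(D, C)           # <chi|phi> = sum D_i* C_i
phi_chi = np.vdot(C, D)           # <phi|chi> = sum C_i* D_i

print(BA, AB)                     # same real number
print(chi_phi, phi_chi)           # complex conjugates of each other
```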

Let me mention another similarity here. As mentioned above, our base vectors **e**_{i} had to be orthogonal. We can write this condition as:

**e**_{i}·**e**_{j} = δ_{ij}, with δ_{ij }= 0 if i ≠ j, and 1 if i = j.

Now, our first quantum-mechanical rule says the same:

〈 i | j 〉 = δ_{ij}, with δ_{ij }= 0 if i ≠ j, and 1 if i = j.

So our set of base *states* also has to be ‘orthogonal’, which is the term you’ll find in physics textbooks, although – as evidenced from our discussion on the base states for measuring angular momentum – one should not try to give any geometrical interpretation here: +ħ/2 and −ħ/2 (so that’s spin ‘up’ and ‘down’ respectively) are not ‘orthogonal’ in any geometric sense, indeed. It’s just that *pure* states, i.e. base states, are separate, which we write as: 〈 ‘up’ | ‘down’ 〉 = 〈 ‘down’ | ‘up’ 〉 = 0 and 〈 ‘up’ | ‘up’ 〉 = 〈 ‘down’ | ‘down’ 〉 = 1. It just means they are *different* base states, and so it’s one or the other. For our +S, 0S and −S example, we’d have nine such amplitudes, and we can organize them in a little matrix:

[〈 i | j 〉] = [[1, 0, 0], [0, 1, 0], [0, 0, 1]], with the rows i and the columns j running over +S, 0S and −S

In fact, just like we *defined* the base *vectors* **e**_{i} as **e**_{1} = [1, 0, 0], **e**_{2} = [0, 1, 0] and **e**_{3} = [0, 0, 1] respectively, we may say that the matrix above, which states exactly the same as the 〈 i | j 〉 = δ_{ij} rule, can serve as a *definition* of what base states actually are. [Having said that, it’s obvious we like to believe that base states are more than just mathematical constructs: we’re talking reality here. The angular momentum as measured in the x-, y- or z-direction, or in whatever direction, is more than just a number.]
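Here is one way to see the 〈 i | j 〉 = δ_{ij} rule as a matrix, as a sketch only. It assumes we represent the three base states by the columns of a 3×3 identity matrix, which is a choice of representation made for this example, not something the physics forces on us.

```python
import numpy as np

labels = ["+S", "0S", "-S"]

# Assumption for this sketch: base state number k is the k-th column of
# the identity matrix, i.e. [1,0,0], [0,1,0] and [0,0,1].
identity = np.eye(3, dtype=complex)
base = {label: identity[:, k] for k, label in enumerate(labels)}

# The nine amplitudes <i|j>, organized as a 3x3 matrix
amplitudes = np.array(
    [[np.vdot(base[i], base[j]) for j in labels] for i in labels]
)

print(amplitudes.real)
```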

OK. You get this. In fact, you’re probably getting impatient because this is too simple for you. So let’s take another step. We showed that the 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 and **B**·**A** = ∑(**B**·**e**_{i})(**e**_{i}·**A**) expressions are *structurally* equivalent – from a mathematical point of view, that is – but **B** and **A** are **separate vectors**, while 〈 χ | φ 〉 is *just a complex number*. **Right?**

Well… **No.** We *can* actually analyze the *bra* and the *ket* in the 〈 χ | φ 〉 bra-ket as separate pieces too. Moreover, we’ll show they are actually *state vectors* too, even if the *bra*, i.e. 〈 χ |, and the *ket*, i.e. | φ 〉, are ‘unfinished pieces’, so to speak. Let’s be bold. Let’s just cut the 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 expression by writing:

| φ 〉 = ∑ | i 〉〈 i | φ 〉 and 〈 χ | = ∑〈 χ | i 〉〈 i |

*Huh? *

Yes. That’s the power of Dirac’s bra-ket notation: we can just drop symbols left or right. It’s quite incredible. But, of course, the question is: so what does this actually *mean*? Well… Don’t rack your brain. I’ll tell you. We *define* | φ 〉 as a *state* vector because we *define* | i 〉 as a (base) state vector. Look at it this way: we wrote the 〈 +S | φ 〉, 〈 0S | φ 〉 and 〈 −S | φ 〉 amplitudes as C_{+}, C_{0}, C_{−}, respectively, so we can write the equation for | φ 〉 as:

| φ 〉 = ∑ | i 〉 C_{i}

So we’ve got a sum of products here, and it’s just like **A** = A_{x}·**e**_{1 }+ A_{y}·**e**_{2} + A_{z}·**e**_{3}. Just substitute the A_{i }coefficients for C_{i} and the **e**_{i }base vectors for the | i 〉 base states. We get:

| φ 〉 = |+S〉 C_{+} + |0S〉 C_{0} + |−S〉 C_{−}

Of course, you’ll wonder what those terms mean: what does it mean to ‘multiply’ C_{+} (remember: C_{+ } is some complex number) by |+S〉? Be patient. Just wait. You’ll understand when we do some examples, so when you start *working *with this stuff. You’ll see it all makes sense—later. 🙂
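If you can’t wait, here is one concrete reading, offered as a sketch rather than as *the* meaning. Assume each base ket is a column of numbers (the unit vectors [1, 0, 0], [0, 1, 0] and [0, 0, 1]); then ‘multiplying’ a ket by C_{+} is just scaling that column by a complex number, and the sum builds the state vector. The amplitudes are made up.

```python
import numpy as np

# Assumption: base kets |+S>, |0S>, |-S> as the rows of the identity matrix
ket_plus, ket_zero, ket_minus = np.eye(3, dtype=complex)

# Made-up amplitudes C_+ = <+S|phi>, C_0 = <0S|phi>, C_- = <-S|phi>
C_plus, C_zero, C_minus = 0.6, 0.8j, 0.0

# |phi> = |+S> C_+ + |0S> C_0 + |-S> C_-
phi = ket_plus * C_plus + ket_zero * C_zero + ket_minus * C_minus

# Projecting back on a base state returns the amplitude we put in
recovered_zero = np.vdot(ket_zero, phi)   # <0S|phi>

print(phi)
print(recovered_zero)
```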

Of course, we’ll have a similar equation for | χ 〉, and so if we write 〈 i | χ 〉 as D_{i}, then we can write | χ 〉 = ∑ | i 〉〈 i | χ 〉 as | χ 〉 = ∑ | i 〉 D_{i}.

So what? Again: be patient. We know that 〈 χ | i 〉 = 〈 i | χ 〉*, so our *second* equation above becomes:

〈 χ | = ∑ D_{j}*〈 j |

You’ll have *two *questions now. The first is the same as the one above: what does it mean to ‘multiply’, let’s say, D_{0}* (i.e. the complex conjugate of D_{0}, so if D_{0 }= a + *i*b, then D_{0}* = a − *i*b) with 〈0S|? The answer is the same: be patient. 🙂 Your second question is: why do I use another symbol for the index here? Why j instead of i? Well… We’ll have to re-combine stuff, so it’s better to keep things separate by using another symbol for the same index. 🙂

In fact, let’s re-combine stuff right now, in exactly the same way as we took it apart: we just write the two things right next to each other. We get the following:

〈 χ | φ 〉 = ∑ D_{j}*〈 j | i 〉C_{i} (sum over all i and j) = ∑ D_{i}*C_{i} (sum over all i)

**What? Is that it?** So we went through all of this hocus-pocus just to find the same equation as we started out with?

Yes. I had to take you through this so you get used to juggling all those symbols, because that’s what we’ll do in the next post. Just think about it and give yourself some time. I know you’ve probably never ever handled such exercise in symbols before – I haven’t, for sure! – but it all makes sense: we cut and paste. It’s all great! 🙂 [Oh… In case you wonder about the transition from the sum involving i and j to the sum involving i only, think about the Kronecker expression: 〈 j | i 〉 = δ_{ij}, with δ_{ij }= 0 if i ≠ j, and 1 if i = j, so most of the terms are zero.]
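The collapse of the double sum over i and j into a single sum over i is easy to check numerically. This is a sketch with made-up amplitudes; the Kronecker delta is written as a 3×3 array, so the same amplitude can also be computed as a row times a matrix times a column.

```python
import numpy as np

C = np.array([0.6, 0.8j, 0.0])             # made-up C_i = <i|phi>
D = np.array([1j, 1.0, 0.0]) / np.sqrt(2)  # made-up D_i = <i|chi>
delta = np.eye(3)                          # <j|i> = delta_ij

# Double sum over i and j: nine terms, six of which vanish
double_sum = sum(
    np.conj(D[j]) * delta[j, i] * C[i]
    for i in range(3)
    for j in range(3)
)

# Single sum over i: only the i == j terms survive
single_sum = sum(np.conj(D[i]) * C[i] for i in range(3))

# Same thing as (row of D_j*) x (delta matrix) x (column of C_i)
matrix_form = D.conj() @ delta @ C

print(double_sum, single_sum, matrix_form)
```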

To summarize the whole discussion, note that the expression above is completely analogous to the **B**·**A** = B_{x}A_{x} + B_{y}A_{y} + B_{z}A_{z} formula. The only difference is that we’re talking complex numbers here, so we need to watch out. We have to watch the order of stuff, and we can’t use the D_{i} numbers themselves: we have to use their complex conjugates D_{i}*. But, for the rest, we’re all set! 🙂 If we’ve got a set of base states, then we can define any state in terms of a set of ‘coordinates’ or ‘coefficients’ – i.e. the C_{i} or D_{i} numbers for the φ or χ example above – and we can then calculate the amplitude to go from one state to another as:

〈 χ | φ 〉 = ∑ D_{i}*C_{i}

In case you’d get confused, just take the original equation:

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉

The two equations are fully equivalent.

[…]

*So we just went through all of the shit above so as to show that structural similarity with vector spaces? *

Yes. It’s important. You just need to remember that we may have two, three, four, five,… or even an infinite number of base states depending on the situation we’re looking at, and what we’re trying to measure. I am sorry I had to take you through all of this. However, there’s more to come, and so you need this baggage. We’ll take the next step now, and that is to introduce the concept of an **operator**.

Look at the middle term in that 〈 χ | φ 〉 expression above, i.e. the double sum. Let me copy it:

∑ D_{j}*〈 j | i 〉C_{i} (sum over all i and j)

We’ve got *three* factors in each term of that double sum (a double sum is a sum involving two indices, which is what we have here: *i* and *j*). When we have two indices like that, one thinks of matrices. That’s easy to do here, because we represented that 〈 i | j 〉 = δ_{ij} equation as a matrix too! To be precise, we presented it as the *identity matrix*, and a simple substitution allows us to re-write our equation above as:

〈 χ | φ 〉 = [D_{+}*, D_{0}*, D_{−}*] · [[1, 0, 0], [0, 1, 0], [0, 0, 1]] · [C_{+}, C_{0}, C_{−}]ᵀ, i.e. a row, times the identity matrix, times a column

I must assume you’re shaking your head in disbelief now: we’ve expanded a simple *amplitude* into a product of three matrices now. Couldn’t we just stick to that sum, i.e. that vector dot product ∑ D_{i}*C_{i}? What’s next? Well… I am afraid there’s a lot more to come. For starters, we’ll take that idea of ‘putting something in the middle’ to the next level by going back to our Stern-Gerlach filters and whatever other apparatus we can think of. Let’s assume that, instead of some filter S or T, we’ve got something more complex now, which we’ll denote by A. [Don’t confuse it with our vectors: we’re talking an apparatus now, so you should imagine some beam of particles, polarized or not, entering it, going through, and coming out.]

We’ll stick to the symbols we used already, and so we’ll just assume a particle enters into the apparatus in some state φ, and that it comes out in some state χ. Continuing the example of spin-one particles, and assuming our beam has *not* been filtered – so, using the *lingo*, we’d say it’s *un*polarized – we’d say there’s a probability of 1/3 for being either in the ‘plus’, ‘zero’, or ‘minus’ state with respect to whatever representation we’d happen to be working with, and the related amplitudes would be 1/√3. In other words, we’d say that φ is defined by C_{+} = 〈 +S | φ 〉, C_{0} = 〈 0S | φ 〉, and C_{−} = 〈 −S | φ 〉, with C_{+} = C_{0} = C_{−} = 1/√3. In fact, using that | φ 〉 = |+S〉 C_{+} + |0S〉 C_{0} + |−S〉 C_{−} expression we invented above, we’d write: | φ 〉 = (1/√3)|+S〉 + (1/√3)|0S〉 + (1/√3)|−S〉 or, using ‘matrices’ (just a row and a column, really):

| φ 〉 = [ |+S〉, |0S〉, |−S〉 ] · [1/√3, 1/√3, 1/√3]ᵀ

However, you don’t need to worry about that now. The new big thing is the following expression:

〈 χ | A | φ〉

It looks simple enough: φ to A to χ. Right? Well… Yes and no. The question is: what do you do with this? How would we take its complex conjugate, for example? And if we know how to do that, would it be equal to 〈 φ | A | χ〉?

You guessed it: we’ll have to take it apart, but how? We’ll do this using another fantastic abstraction. Remember how we took Dirac’s 〈 χ | φ 〉 bra-ket apart by writing | φ 〉 = ∑ | i 〉〈 i | φ 〉? We just dropped the 〈 χ left and right in our 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 expression. We can go one step further now, and drop the φ 〉 left and right in our | φ 〉 = ∑ | i 〉〈 i | φ 〉 expression. We get the following wonderful thing:

| = ∑ | i 〉〈 i | over all base states *i*

With characteristic humor, Feynman calls this ** ‘The Great Law of Quantum Mechanics’** and, frankly, there’s actually more than one grain of truth in this. 🙂
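The ‘Great Law’ can be checked numerically, at least in a sketch. Assume a made-up orthonormal set of three base states written as columns of numbers (deliberately *not* the trivial [1, 0, 0] set): adding up the | i 〉〈 i | pieces as outer products gives the identity matrix, which is why the | bar behaves like a factor 1.

```python
import numpy as np

s = 1 / np.sqrt(2)

# A made-up complete orthonormal set of base states (an assumption for
# this example, not a representation used elsewhere in the post)
base_kets = [
    np.array([s, s, 0], dtype=complex),
    np.array([s, -s, 0], dtype=complex),
    np.array([0, 0, 1], dtype=complex),
]

# | = sum over i of |i><i| : each term is an outer product (a 3x3 matrix)
bar = sum(np.outer(ket, ket.conj()) for ket in base_kets)

# Applying it to any state gives the state back: |phi> = sum |i><i|phi>
phi = np.array([0.6, 0.8j, 0.0])
resolved = sum(ket * np.vdot(ket, phi) for ket in base_kets)

print(np.round(bar.real, 12))
print(resolved)
```

If you leave out one of the three base states, the sum is no longer the identity, which is the numerical face of the ‘complete set’ condition.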

Now, if we apply this ‘Great Law’ to our 〈 χ | A | φ〉 expression – we should apply it twice, actually – we get:

〈 χ | A | φ 〉 = ∑〈 χ | i 〉〈 i | A | j 〉〈 j | φ 〉 (sum over all i and j)

As Feynman points out, it’s easy to add another apparatus in series. We just write:

〈 χ | B A | φ 〉 = ∑〈 χ | i 〉〈 i | B | j 〉〈 j | A | k 〉〈 k | φ 〉 (sum over all i, j and k)

Just put a | bar between B and A and apply the same trick. The | bar is really like a factor 1 in multiplication. However, that’s all great fun but it doesn’t solve our problem. Our ‘Great Law’ allows us to sort of ‘resolve’ our apparatus A in terms of *base* states, as we now have 〈 i | A | j 〉 in the middle, rather than 〈 χ | A | φ〉 but, again, how do we work with that?

Well… The answer will surprise you. Rather than trying to break this thing up, we’ll say that the apparatus A is actually being described, or *defined*, by the nine 〈 i | A | j 〉 amplitudes. [There are nine for this example, but four only for the example involving spin-1/2 particles, of course.] We’ll call those amplitudes, quite simply, the *matrix of amplitudes*, and we’ll often denote it by A_{ij}.
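In code, ‘the apparatus is defined by its matrix of amplitudes’ looks like the sketch below. The matrices `A` and `B` and the amplitudes are made-up numbers for illustration only; they do not describe any real Stern-Gerlach arrangement.

```python
import numpy as np

# Made-up matrices of amplitudes A_ij = <i|A|j> and B_ij = <i|B|j>
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex) / np.sqrt(2)
B = np.diag([1, 1j, -1])

C = np.array([0.6, 0.8j, 0.0])             # made-up C_j = <j|phi>
D = np.array([1j, 1.0, 0.0]) / np.sqrt(2)  # made-up D_i = <i|chi>
n = 3

# <chi|A|phi> = sum over i, j of <chi|i><i|A|j><j|phi>
amp = sum(
    np.conj(D[i]) * A[i, j] * C[j] for i in range(n) for j in range(n)
)
amp_matrix = D.conj() @ A @ C              # same thing as a matrix product

# Two apparatuses in series: <chi|BA|phi>, a triple sum over i, j, k
series = sum(
    np.conj(D[i]) * B[i, j] * A[j, k] * C[k]
    for i in range(n)
    for j in range(n)
    for k in range(n)
)
series_matrix = D.conj() @ (B @ A) @ C

print(amp, amp_matrix)
print(series, series_matrix)
```

The triple sum is just the matrix product B·A sandwiched between the row of D_{i}* and the column of C_{k}, so adding an apparatus in series amounts to multiplying matrices.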

Now, I wanted to talk about *operators* here. The idea of an operator comes up when we’re creative again, and when we drop the 〈 χ | state from the 〈 χ | A | φ〉 expression. We write:

| ψ 〉 = A | φ 〉

So now we think of the particle entering the ‘apparatus’ A in the state φ and coming out of A in some state ψ (‘psi’). We can generalize this and think of it as an ‘operator’, which Feynman intuitively defines as follows:

**“The symbol A is neither an amplitude, nor a vector; it is a new kind of thing called an operator. It is something which “operates on” a state to produce a new state.”**

But… **Wait a minute!** | ψ 〉 is not the same as 〈 χ |.

Why can we do that substitution? We can only do it because any state ψ and χ are related through that other ‘Law’ of quantum math:

Combining the two shows our ‘definition’ of an operator is OK. We should just note that it’s an ‘open’ equation until it is completed with a ‘bra’, i.e. a state like 〈 χ |, so as to give the 〈 χ | ψ〉 = 〈 χ | A | φ〉 type of amplitude that actually means something. In practical terms, that means our operator or our apparatus doesn’t mean much as long as we don’t *measure *what comes out, so then we choose some set of base states, i.e. a *representation*, which allows us to describe the final state, i.e. 〈 χ |.
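As a sketch of how the ‘open’ equation gets completed: take the unpolarized φ with all amplitudes equal to 1/√3, let a made-up matrix `A` operate on it to get ψ, and then close the expression with a bra. Here the bra is the base state 〈 0S |, and the matrix is an assumption for illustration only.

```python
import numpy as np

# |phi> with C_+ = C_0 = C_- = 1/sqrt(3), in the order (+S, 0S, -S)
phi = np.full(3, 1 / np.sqrt(3), dtype=complex)

# Made-up matrix of amplitudes <i|A|j> standing in for the apparatus
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=complex) / np.sqrt(2)

# The operator 'operates on' a state to produce a new state: |psi> = A|phi>
psi = A @ phi

# Complete the open equation with a bra, here <chi| = <0S|
chi = np.array([0, 1, 0], dtype=complex)
amp_via_psi = np.vdot(chi, psi)         # <chi|psi>
amp_direct = chi.conj() @ A @ phi       # <chi|A|phi>

print(psi)
print(amp_via_psi, amp_direct)
```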

[…]

Well… Folks, that’s it. I know this was mighty abstract, but the next posts should bring things back to earth again. I realize it’s only by working examples and doing exercises that one can get some kind of ‘feel’ for this kind of stuff, so that’s what we’ll have to go through now. 🙂
