Quantum math revisited

It’s probably good to review the concepts we’ve learned so far. Let’s start with the foundation of all of our math, i.e. the concept of the state, or the state vector. [The difference between the two concepts is subtle but real. I’ll come back to it.]

State vectors and base states

We used Dirac’s bra-ket notation to denote a state vector, in general, as | ψ 〉. The obvious question is: what is this thing? We called it a vector because we use it like a vector: we multiply it with some number, and then add it to some other vector. So that’s just what you did in high school, when you learned about real vector spaces. In this regard, it is good to remind you of the definition of a vector space. To put it simply, it is is a collection of objects called vectors, which may be added together, and multiplied by numbers. So we have two things here: the ‘objects’, and the ‘numbers’. That’s why we’d say that we have some vector space over a field of numbers. [The term ‘field’ just refers to an algebraic structure, so we can add and multiply and what have you.] Of course, what it means to ‘add’ two ‘objects’, and what it means to ‘multiply’ an object with a number, depends on the type of objects and, unsurprisingly, the type of numbers.

Huh? The type of number?! A number is a number, no?

No, hombre, no! We’ve got natural numbers, rational numbers, real numbers, complex numbers—and you’ve probably heard of quaternions too – and, hence, ‘multiplying’ a ‘number’ with ‘something else’ can mean very different things. At the same time, the general idea is the general idea, so that’s the same, indeed. 🙂 When using real numbers and the kind of vectors you are used to (i.e. Euclidean vectors), then the multiplication amounts to a re-scaling of the vector, and so that’s why a real number is often referred to as a scalar. At the same time, anything that can be used to multiply a vector is often referred to as a scalar in math so… Well… Terminology is often quite confusing. In fact, I’ll give you some more examples of confusing terminology in a moment. But let’s first look at our ‘objects’ here, i.e. our ‘vectors’.

I did a post on Euclidean and non-Euclidean vector spaces two years ago, when I started this blog, but state vectors are obviously very different ‘objects’. They don’t resemble the vectors we’re used to. We’re used to so-called polar vectors, aka as real vectors, like the position vector (x or r), or the momentum vector (p = m·v), or the electric field vector (E). We are also familiar with the so-called pseudo-vectors, aka as axial vectors, like angular momentum (L = r×p), or the magnetic dipole moment. [Unlike what you might think, not all vector cross products yield a pseudo-vector. For example, the cross-product of a polar and an axial vector yields a polar vector.] But here we are talking some very different ‘object’. In math, we say that state vectors are elements in a Hilbert space. So a Hilbert space is a vector space but… Well… With special vectors. 🙂

The key to understanding why we’d refer to states as state vectors is the fact that, just like Euclidean vectors, we can uniquely specify any element in a Hilbert space with respect to a set of base states. So it’s really like using Cartesian coordinates in a two- or three-dimensional Euclidean space. The analogy is complete because, even in the absence of a geometrical interpretation, we’ll require those base states to be orthonormal. Let me be explicit on that by reminding you of your high-school classes on vector analysis: you’d choose a set of orthonormal base vectors e1e2, and e3, and you’d write any vector A as:

A = (Ax, Ay, Az) = Ax·e1 + Ay·e2 + Az·e3 with ei·ej = 1 if i = j, and ei·ej = 0 if i ≠ j

The ei·ej = 1 if i = j and ei·ej = 0 if i ≠ j condition expresses the orthonormality condition: the base vectors need to be orthogonal unit vectors. We wrote it as ei·ej = δij using the Kronecker delta ij = 1 if i = j and 0 if i ≠ j). Now, base states in quantum mechanics do not necessarily have a geometrical interpretation. Indeed, although one often can actually associate them with some position or direction in space, the condition of orthonormality applies in the mathematical sense of the word only. Denoting the base states by i = 1, 2,… – or by Roman numerals, like I and II – so as to distinguish them from the Greek ψ or φ symbols we use to denote state vectors in general, we write the orthonormality condition as follows:

〈 i | j 〉 = δij, with δij = δji is equal to 1 if i = j, and zero if i ≠ j

Now, you may grumble and say: that 〈 i | j 〉 bra-ket does not resemble the ei·ej product. Well… It does and it doesn’t. I’ll show why in a moment. First note how we uniquely specify state vectors in general in terms of a set of base states. For example, if we have two possible base states only, we’ll write:

| φ 〉 = | 1 〉 C1 + | 2 〉 C2

Or, if we chose some other set of base states | I 〉 and | II 〉, we’ll write:

| φ 〉 = | I 〉 CI + | II 〉 CII

You should note that the | 1 〉 C1 term in the | φ 〉 = | 1 〉 C1 + | 2 〉 C2 sum is really like the Ax·e1 product in the A = Ax·e1 + Ay·e2 + Az·e3 expression. In fact, you may actually write it as C1·| 1 〉, or just reverse the order and write C1| 1 〉. However, that’s not common practice and so I won’t do that, except occasionally. So you should look at | 1 〉 C1 as a product indeed: it’s the product of a base state and a complex number, so it’s really like m·v, or whatever other product of some scalar and some vector, except that we’ve got a complex scalar here. […] Yes, I know the term ‘complex scalar’ doesn’t make sense, but I hope you know what I mean. 🙂

More generally, we write:


Writing our state vector | ψ 〉, | φ 〉 or | χ 〉 like this also defines these coefficients or coordinates Ci. Unlike our state vectors, or our base states, Cis an actual number. It has to be, of course: it’s the complex number that makes sense of the whole expression. To be precise, Cis an amplitude, or a wavefunction, i.e. a function depending on both space and time. In our previous posts, we limited the analysis to amplitudes varying in time only, and we’ll continue to do so for a while. However, at some point, you’ll get the full picture.

Now, what about the supposed similarity between the 〈 i | j〉 bra-ket and the ei·ej product? Let me invoke what Feynman, tongue-in-cheek as usual, refers to as the Great Law of Quantum Mechanics:


You get this by taking | ψ 〉 out of the | ψ 〉 = ∑| i 〉〈 i | ψ 〉 expression. And, no, don’t say: what nonsense! Because… Well… Dirac’s notation really is that simple and powerful! You just have to read it from right to left. There’s an order to the symbols, unlike what you’re used to in math, because you’re used to operations that are commutative. But I need to move on. The upshot is that we can specify our base states in terms of the base states too. For example, if we have only two base states, let’s say I and II, then we can write:

| I 〉 = ∑| i 〉〈 i | 1 〉 = 1·| I 〉 + 0·| II 〉 and | II 〉 = ∑| i 〉〈 i | II 〉 = 0·| 1 〉 + 0·| II 〉

We can write this using a matrix notation:

M1Now that is silly, you’ll say. What’s the use of this? It doesn’t tell us anything new, and it also does not show us why we should think of the 〈 i | j 〉 bra-ket and the ei·ej product as being similar! Well… Yes and no. Let me show you something else. Let’s assume we’ve got some state χ and φ, which we specify in terms of our chosen set of base states as | χ 〉 = ∑ | i 〉 Di and | φ 〉 = ∑ | i 〉 Ci respectively. Now, from our post on quantum math, you’ll remember that 〈 χ | i 〉 and 〈 i | χ 〉 are each other’s complex conjugates, so we know that 〈 χ | i 〉 = 〈 i | χ 〉* = Di*. So if we have all Ci = 〈 i | φ 〉 and all Di = 〈 i | χ 〉, i.e. the ‘components’ of both states in terms of our base states, then we can calculate 〈 χ | φ 〉 – i.e. the amplitude to go from some state φ to some state χ as:

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 =∑ Di*C= ∑ Di*〈 i | φ 〉

We can now scrap | φ 〉 in this expression – yes, it’s the power of Dirac’s notation once more! – so we get:


Now, we can re-write this using a matrix notation:


[I assumed that we have three base states now, so as to make the example somewhat less obvious. Please note we can never leave one of the base states out when specifying a state vector, so it’s not like the previous example was not complete. I’ll switch from two-state to three-state systems and back again all the time, so as to show the analysis is pretty general. To visualize things, think of the ammonia molecule as an example of a two-state system versus the spin of a proton or an electron as a three-state system, respectively. OK. Let’s get back to the lesson.]

You’ll say: so what? Well… Look at this:


I just combined the notations for 〈 I | and | III 〉. Can you now see the similarity between the the 〈 i | j〉 bra-ket and the ei·ej product? It really is the same: you just need to respect the subtleties in regard to writing the 〈 i | and | j 〉 vector, or the eand ej vectors, as a row vector or a column vector respectively.

It doesn’t stop here, of course. When learning about vectors in high school, we also learned that we could go from one set of base vectors to another by a transformation, such as, for example, a rotation, or a translation. We showed how a rotation worked in one of our post on two-state systems, where we wrote:


So we’ve got that transformation matrix, which, of course, isn’t random. To be precise, we got the matrix equation above (note that we’re back to two states only, so as to simplify) because we defined the Cand CII coefficients in the | φ 〉 = | I 〉 CI + | II 〉 CII = | 1 〉 C1 + | 2 〉 C2 expression as follows:

  • C= 〈 I | ψ 〉 = (1/√2)·(C1 − C2)
  • CII = 〈 II | ψ 〉 = (1/√2)·(C1 + C2)

The (1/√2) factor is there because of the normalization condition, and the two-by-two matrix equals the transformation matrix for a rotation of a state filtering apparatus about the y-axis, over an angle equal to (minus) 90 degrees, which we wrote as:


I promised I’d say something more about confusing terminology so let me do that here. We call a set of base states a ‘representation‘, and writing a state vector in terms of a set of base states is often referred to as a ‘projection‘ of that state into the base set. Again, we can see it’s sort of a mathematical projection, rather than a geometrical one. But it makes sense. In any case, that’s enough on state vectors and base states.

Let me wrap it up by inserting one more matrix equation, which you should be able to reconstruct yourself:


The only thing we’re doing here is to substitute 〈 χ | and | φ 〉 for ∑ Dj*〈 j | and ∑ | i 〉 Ci respectively. All the rest follows. Finally, I promised I’d tell you the difference between a state and a state vector. It’s subtle and, in practice, the two concepts refer to the same. However, we write a state as a state, like ψ or, if it’s a base state, like I, or ‘up’, or whatever. When we say a state vector, then we think of a set of numbers. It may be a row vector, like the 〈 χ | row vector with the Di* coefficients, or a column vector, like the | φ 〉 column vector with the Di* coefficients. But so if we say vector, then we think of a one-dimensional array of numbers, while the state itself is… Well… The state. So that’s some reality in physics. So you might define the state vector as the set of numbers that describes the state. While the difference is subtle, it’s important. It’s also important to note that the 〈 χ | and | χ 〉 state vectors are different too. The former appears as the final state in an amplitude, while the latter describes the starting condition. The former is referred to as a bra in the 〈 χ | φ 〉 bra-ket, while the latter is a ket in the 〈 φ | χ 〉 = 〈 χ | φ 〉* amplitude. 〈 χ | is a row vector equal to ∑ Di*〈 i |, while | χ 〉 = ∑ D| χ 〉. So it’s quite different. More in general, we’d define bras and kets as row and column vectors respectively, so we write:


That makes it clear that a bra next to a ket is to be understood as a matrix multiplication. From what I wrote, it is also obvious that the conjugate transpose (which is also known as the Hermitian conjugate) of a bra is the corresponding ket and vice versa, so we write:


Let me formally define the conjugate or Hermitian transpose here: the conjugate transpose of an m-by-n matrix A with complex elements is the n-by-m matrix A† obtained from A by taking the transpose (so we write the rows as columns and vice versa) and then taking the complex conjugate of each element (i.e. we switch the sign of the imaginary part of the complex number). A† is read as ‘A dagger’, but mathematicians will usually denote it by A*. In fact, there are a lot of equivalent notations, as we can write:


OK. That’s it on this.

One more thing, perhaps. We’ll often have states, or base states, that make sense, in a physical sense, that is. But it’s not always the case: we’ll sometimes use base states that may not represent some situation we’re likely to encounter, but that make sense mathematically. We gave the example of the ‘mathematical’ | I 〉 and | II 〉 base states, versus the ‘physical’ | 1 〉 and | 2 〉 base states, in our post on the ammonia molecule, so I won’t say more about this here. Do keep it in mind though. Sometimes it may feel like nothing makes sense, physically, but it usually does mathematically and, therefore, all usually comes out alright in the end. 🙂 To be precise, what we did there, was to choose base states with a unambiguous, i.e. a definite, energy level. That made our calculations much easier, and the end result was the same, indeed!

So… Well… I’ll let this sink in, and move on to the next topic.

The Hamiltonian operator

In my post on the post on the Hamiltonian, I explained that those Ci and Di coefficients are usually a function of time, and how they can be determined. To be precise, they’re determined by a set of differential equations (i.e. equations involving a function and the derivative of that function) which we wrote as:


If we have two base states only, then this set of equations can be written as:

set - two-base

Two equations and two functions – C= C1(t) and C= C2(t) – so we should be able to solve this thing, right? Well… No. We don’t know those Hij coefficients. As I explained in that post, they also evolve in time, so we should write them as Hij(t) instead of Hij tout court, and so it messes the whole thing up. We have two equations and six functions really. Of course, there’s always a way out, but I won’t dwell on that here—not now at least. What I want to do here is look at the Hamiltonian as an operator.

We introduced operators – but not very rigorously – when explaining the Hamiltonian. We did so by ‘expanding’ our 〈 χ | φ 〉 amplitude as follows. We’d say the amplitude to find a ‘thing’ – like a particle, for example, or some system, of particles or other things – in some state χ at the time t = t2, when it was in some state φ state at the time t = t1 was equal to:


Now, a formula like this only makes sense because we’re ‘abstracting away’ from the base states, which we need to describe any state. Hence, to actually describe what’s going on, we have to choose some representation and expand this expression as follows:


That looks pretty monstrous, so we should write it all out. Using the matrix notation I introduced above, we can do that – let’s take a practical example with three base states once again – as follows:

Matrix U

Now, this still looks pretty monstrous, but just think of it. We’re just applying that ‘Great Law of Quantum Physics’ here, i.e. | = ∑ | i 〉〈 i | over all base states i. To be precise, we apply it to an 〈 χ | A | φ 〉 expression, and we do so twice, so we get:


Nothing more, nothing less. 🙂 Now, the idea of an operator is the result of being creative: we just drop the 〈 χ | state from the expression above to write:


Yes. I know. That’s a lot to swallow, but you’ll see it makes sense because of the Great Law of Quantum Mechanics:


Just think about it and continue reading when you’re ready. 🙂 The upshot is: we now think of the particle entering some ‘apparatus’ A in the state ϕ and coming out of A in some state ψ or, looking at A as an operator, we can generalize this. As Feynman puts it:

“The symbol A is neither an amplitude, nor a vector; it is a new kind of thing called an operator. It is something which “operates on” a state to produce a new state.”

Back to our Hamiltonian. Let’s go through the same process of ‘abstraction’. Let’s first re-write that ‘Hamiltonian equation’ as follows:


The Hij(t) are amplitudes indeed, and we can represent them in a 〈 i | Hij(t) | j 〉 matrix indeed! Now let’s take the first step in our ‘abstraction process’: let’s scrap the 〈 i | bit. We get:


We can, of course, also abstract away from the | j 〉 bit, so we get:


Look at this! The right-hand side of this expression is exactly the same as that A | χ 〉 format we presented when introducing the concept of an operator. [In fact, when I say you should ‘abstract away’ from the | j 〉 bit, then you should think of the ‘Great Law’ and that matrix notation above.] So H is an operator and, therefore, it’s something which operates on a state to produce a new state.

OK. Clear enough. But what’s that ‘state’ on the left-hand side? I’ll just paraphrase Feynman here, who says we should think of it as follows: “The time derivative of the state vector |ψ〉 times i is equal to what you get by (1) operating with the Hamiltonian operator H on each base state, (2) multiplying by the amplitude that ψ is in the state j (i.e. 〈j|ψ〉), and (1) summing over all j.” Alternatively, you can also say: “The time derivative, times iħ, of a state |ψ〉 is equal to what you get if you operate on it with the Hamiltonian.” Of course, that’s true for any state, so we can ‘abstract away’ the |ψ〉 bit too and, putting a little hat (^) over the operator to remind ourselves that it’s an operator (rather than just any matrix), we get the Hamiltonian operator equation:


Now, that’s all nice and great, but the key question, of course, is: what can you do with this? Well… It turns out his Hamiltonian operators is useful to calculate lots of stuff. In the first place, of course, it’s a useful operator in the context of those differential equations describing the dynamics of a quantum-mechanical system. When everything is said and done, those equations are the equivalent, in quantum physics, of the law of motion in classical physics. [And I am not joking here.]

In addition, the Hamiltonian operator also has other uses. The one I should really mention here is that you can calculate the average or expected value (EV[X]) of the energy  of a state ψ (i.e. any state, really) by first operating on | ψ 〉 with the Hamiltonian, and then multiplying 〈 ψ | with the result. That sounds a bit complicated, but you’ll understand it when seeing the mathematical expression, which we can write as:


The formula is pretty straightforward. [If you don’t think so, then just write it all out using the matrix notation.] But you may wonder how it works exactly… Well… Sorry. I don’t want to copy all of Feynman here, so I’ll refer you to him on this. In fact, the proof of this formula is actually very straightforward, and so you should be able to get through it with the math you got here. You may even understand Feynman’s illustration of it for the ‘special case’ when base states are, indeed, those mathematically convenient base states with definite energy levels.

Have fun with it! 🙂

Post scriptum on Hilbert spaces:

As mentioned above, our state vectors are actually functions. To be specific, they are wavefunctions, i.e. periodic functions, evolving in space and time, so we usually write them as ψ = ψ(x, t). Our ‘Hilbert space’, i.e. our collection of state vectors, is, therefore, often referred to as a function space. So it’s a set of functions. At the same time, it is a vector space too, because we have those addition and multiplication operations, so our function space has the algebraic structure of a vector space. As you can imagine, there are some mathematical conditions for a space or a set of objects to ‘qualify’ as a Hilbert space, and the epithet itself comes with a lot of interesting properties. One of them is completeness, which is a property that allows us to jot down those differential equations that describe the dynamics of a quantum-mechanical system. However, as you can find whatever you’d need or want to know about those mathematical properties on the Web, I won’t get into it. The important thing here is to understand the concept of a Hilbert space intuitively. I hope this post has helped you in that regard, at least. 🙂

Quantum math: states as vectors, and apparatuses as operators

I actually wanted to write about the Hamiltonian matrix. However, I realize that, before I can serve the plat de résistance, we need to review or introduce some more concepts and ideas. It all revolves around the same theme: working with states is like working with vectors, but so you need to know how exactly. Let’s go for it. 🙂

In my previous posts, I repeatedly said that a set of base states is like a coordinate system. A coordinate system allows us to describe (i.e. uniquely identify) vectors in an n-dimensional space: we associate a vector with a set of real numbers, like x, y and z, for example. Likewise, we can describe any state in terms of a set of complex numbers – amplitudes, really – once we’ve chosen a set of base states. We referred to this set of base states as a ‘representation’. For example, if our set of base states is +S, 0S and −S, then any state φ can be defined by the amplitudes C+ = 〈 +S | φ 〉, C0 = 〈 0S | φ 〉, and C = 〈 −S | φ 〉.

We have to choose some representation (but we are free to choose which one) because, as I demonstrated when doing a practical example (see my description of muon decay in my post on how to work with amplitudes), we’ll usually want to calculate something like the amplitude to go from one state to another – which we denoted as 〈 χ | φ 〉 – and we’ll do that by breaking it up. To be precise, we’ll write that amplitude 〈 χ | φ 〉  – i.e. the amplitude to go from state φ to state χ (you have to read this thing from right to left, like Hebrew or Arab) – as the following sum:


So that’s a sum over a complete set of base states (that’s why I write all i under the summation symbol ∑). We discussed this rule in our presentation of the ‘Laws’ of quantum math.

Now we can play with this. As χ can be defined in terms of the chosen set of base states too, it’s handy to know that 〈 χ | i 〉 and 〈 i | χ 〉 are each other’s complex conjugates – we write this as: 〈 χ | i 〉 = 〈 i | χ 〉* – so if we have one, we have the other (we can also write: 〈 i | χ 〉* = 〈 χ | i 〉). In other words, if we have all Ci = 〈 i | φ 〉 and all Di = 〈 i | χ 〉, i.e. the ‘components’ of both states in terms of our base states, then we can calculate 〈 χ | φ 〉 as:

〈 χ | φ 〉 = ∑ Di*Ci = ∑〈 χ | i 〉〈 i | φ 〉,

provided we make sure we do the summation over a complete set of base states. For example, if we’re looking at the angular momentum of a spin-1/2 particle, like an electron or a proton, then we’ll have two base states, +ħ/2 and +ħ/2, so then we’ll have only two terms in our sum, but the spin number (j) of a cobalt nucleus is 7/2, so if we’d be looking at the angular momentum of a cobalt nucleus, we’ll have eight (2·j + 1) base states and, hence, eight terms when doing the sum. So it’s very much like working with vectors, indeed, and that’s why states are often referred to as state vectors. So now you know that term too. 🙂

However, the similarities run even deeper, and we’ll explore all of them in this post. You may or may not remember that your math teacher actually also defined ordinary vectors in three-dimensional space in terms of base vectors ei, defined as: e= [1, 0, 0], e= [0, 1, 0] and e= [0, 0, 1]. You may also remember that the units along the x, y and z-axis didn’t have to be the same – we could, for example, measure in cm along the x-axis, but in inches along the z-axis, even if that’s not very convenient to calculate stuff – but that it was very important to ensure that the base vectors were a set of orthogonal vectors. In any case, we’d chose our set of orthogonal base vectors and write all of our vectors as:

A = Ax·e1 + Ay·e+ Az·e3

That’s simple enough. In fact, one might say that the equation above actually defines coordinates. However, there’s another way of defining them. We can write Ax, Ay, and Az as vector dot products, aka scalar vector products (as opposed to cross products, or vector products tout court). Check it:

A= A·e1, A= A·e2, and A= A·e3.

This actually allows us to re-write the vector dot product A·B in a way you’ve probably haven’t seen before. Indeed, you’d usually calculate A·B as |A|∙|B|·cosθ = A∙B·cosθ (A and B is the magnitude of the vectors A and B respectively) or, quite simply, as AxB+ AyB+ AzBz. However, using the dot products above, we can now also write it as:

equation 2

We deliberately wrote B·A instead of Abecause, while the mathematical similarity with the

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉

equation is obvious, B·A = A·B but 〈 χ | φ 〉 ≠ 〈 φ | χ 〉. Indeed, 〈 χ | φ 〉 and 〈 φ | χ 〉 are complex conjugates – so 〈 χ | φ 〉 = 〈 φ | χ 〉* – but they’re not equal. So we’ll have to watch the order when working with those amplitudes. That’s because we’re working with complex numbers instead of real numbers. Indeed, it’s only because the A·B dot product involves real numbers, whose complex conjugate is the same, that we have that commutativity in the real vector space. Apart from that – so apart from having to carefully check the order of our products – the correspondence is complete.

Let me mention another similarity here. As mentioned above, our base vectors ei had to be orthogonal. We can write this condition as:

ei·ej = δij, with δij = 0 if i ≠ j, and 1 if i = j.

Now, our first quantum-mechanical rule says the same:

〈 i | j 〉 = δij, with δij = 0 if i ≠ j, and 1 if i = j.

So our set of base states also has to be ‘orthogonal’, which is the term you’ll find in physics textbooks, although – as evidenced from our discussion on the base states for measuring angular momentum – one should not try to give any geometrical interpretation here: +ħ/2 and +ħ/2 (so that’s spin ‘up’ and ‘down’ respectively) are not ‘orthogonal’ in any geometric sense, indeed. It’s just that pure states, i.e. base states, are separate, which we write as: 〈 ‘up’ | ‘down’ 〉 = 〈 ‘down’ | ‘up’ 〉 = 0 and 〈 ‘up’ | ‘up’ 〉 = 〈 ‘down’ | ‘down’ 〉 = 1. It just means they are just different base states, and so it’s one or the other. For our +S, 0S and −S example, we’d have nine such amplitudes, and we can organize them in a little matrix:

def base statesIn fact, just like we defined the base vectors ei as e= [1, 0, 0], e= [0, 1, 0] and e= [0, 0, 1] respectively, we may say that the matrix above, which states exactly the same as the 〈 i | j 〉 = δij rule, can serve as a definition of what base states actually are. [Having said that, it’s obvious we like to believe that base states are more than just mathematical constructs: we’re talking reality here. The angular momentum as measured in the x-, y- or z-direction, or in whatever direction, is more than just a number.]

OK. You get this. In fact, you’re probably getting impatient because this is too simple for you. So let’s take another step. We showed that the 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | χ 〉 and B·= ∑(B·ei)(ei·A) are structurally equivalent – from a mathematical point of view, that is – but B and A are separate vectors, while 〈 χ | φ 〉 is just a complex number. Right?

Well… No. We can actually analyze the bra and the ket in the 〈 χ | φ 〉 bra-ket as separate pieces too. Moreover, we’ll show they are actually state vectors too, even if the bra, i.e. 〈 χ |, and the ket, i.e. | φ 〉, are ‘unfinished pieces’, so to speak. Let’s be bold. Let’s just cut the 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | χ 〉 by writing:

bra and ket


Yes. That’s the power of Dirac’s bra-ket notation: we can just drop symbols left or right. It’s quite incredible. But, of course, the question is: so what does this actually mean? Well… Don’t rack your brain. I’ll tell you. We define | φ 〉 as a state vector because we define | i 〉 as a (base) state vector. Look at it this way: we wrote the 〈 +S | φ 〉, 〈 0S | φ 〉 and 〈 −S | φ 〉 amplitudes as C+, C0, C, respectively, so we can write the equation above as:


So we’ve got a sum of products here, and it’s just like A = Ax·e+ Ay·e2 + Az·e3. Just substitute the Acoefficients for Ci and the ebase vectors for the | i 〉 base states. We get:

| φ 〉 = |+S〉 C+ + |0S〉 C0  + |+S〉 C

Of course, you’ll wonder what those terms mean: what does it mean to ‘multiply’ C+ (remember: C+  is some complex number) by |+S〉? Be patient. Just wait. You’ll understand when we do some examples, so when you start working with this stuff. You’ll see it all makes sense—later. 🙂

Of course, we’ll have a similar equation for | χ 〉, and so if we write 〈 χ | i 〉 as Di, then we can write | χ 〉 = ∑ | i 〉〈 χ | i 〉 as | χ 〉 = ∑ | i 〉 Di.

So what? Again: be patient. We know that 〈 χ | i 〉 = 〈 i | χ 〉*, so our second equation above becomes:


You’ll have two questions now. The first is the same as the one above: what does it mean to ‘multiply’, let’s say, D0* (i.e. the complex conjugate of D0, so if D= a + ib, then D0* = a − ib) with 〈0S|? The answer is the same: be patient. 🙂 Your second question is: why do I use another symbol for the index here? Why j instead of i? Well… We’ll have to re-combine stuff, so it’s better to keep things separate by using another symbol for the same index. 🙂

In fact, let’s re-combine stuff right now, in exactly the same way as we took it apart: we just write the two things right next to each other. We get the following:


What? Is that it? So we went through all of this hocus-pocus just to find the same equation as we started out with?

Yes. I had to take you through this so you get used to juggling all those symbols, because that’s what we’ll do in the next post. Just think about it and give yourself some time. I know you’ve probably never ever handled such exercise in symbols before – I haven’t, for sure! – but it all makes sense: we cut and paste. It’s all great! 🙂 [Oh… In case you wonder about the transition from the sum involving i and j to the sum involving i only, think about the Kronecker expression: 〈 j | i 〉 = δij, with δij = 0 if i ≠ j, and 1 if i = j, so most of the terms are zero.]

To summarize the whole discussion, note that the expression above is completely analogous with the B·= BxA+ ByA+ BzAformula. The only difference is that we’re talking complex numbers here, so we need to watch out. We have to watch the order of stuff, and we can’t use the Dnumbers themselves: we have to use their complex conjugates Di*. But, for the rest, we’re all set! 🙂 If we’ve got a set of base states, then we can define any state in terms of a set of ‘coordinates’ or ‘coefficients’ – i.e. the Ci or Di numbers for the φ or χ example above – and we can then calculate the amplitude to go from one state to another as:


In case you’d get confused, just take the original equation:


The two equations are fully equivalent.


So we just went through all of the shit above so as to show that structural similarity with vector spaces?

Yes. It’s important. You just need to remember that we may have two, three, four, five,… or even an infinite number of base states depending on the situation we’re looking at, and what we’re trying to measure. I am sorry I had to take you through all of this. However, there’s more to come, and so you need this baggage. We’ll take the next step now, and that is to introduce the concept of an operator.

Look at the middle term in that expression above—let me copy it:


We’ve got three terms in that double sum (a double sum is a sum involving two indices, which is what we have here: i and j). When we have two indices like that, one thinks of matrices. That’s easy to do here, because we represented that 〈 i | j 〉 = δij equation as a matrix too! To be precise, we presented it as the identity matrix, and a simple substitution allows us to re-write our equation above as:


I must assume you’re shaking your head in disbelief now: we’ve expanded a simple amplitude into a product of three matrices now. Couldn’t we just stick to that sum, i.e that vector dot product ∑ Di*Ci? What’s next? Well… I am afraid there’s a lot more to come. :-/ For starters, we’ll take that idea of ‘putting something in the middle’ to the next level by going back to our Stern-Gerlach filters and whatever other apparatus we can think of. Let’s assume that, instead of some filter S or T, we’ve got something more complex now, which we’ll denote by A. [Don’t confuse it with our vectors: we’re talking an apparatus now, so you should imagine some beam of particles, polarized or not, entering it, going through, and coming out.]

We’ll stick to the symbols we used already, and so we’ll just assume a particle enters into the apparatus in some state φ, and that it comes out in some state χ. Continuing the example of spin-one particles, and assuming our beam has not been filtered – so, using lingo, we’d say it’s unpolarized – we’d say there’s a probability of 1/3 for being either in the ‘plus’, ‘zero’, or ‘minus’ state with respect to whatever representation we’d happen to be working with, and the related amplitudes would be 1/√3. In other words, we’d say that φ is defined by C+ = 〈 +S | φ 〉, C0 = 〈 0S | φ 〉, and C = 〈 −S | φ 〉, with C+ = C0 = C− = 1/√3. In fact, using that | φ 〉 = |+S〉 C+ + |0S〉 C0  + |+S〉 C− expression we invented above, we’d write: | φ 〉 = (1/√3)|+S〉 + (1/√3)|0S〉 C0  + (1/√3)|+S〉 C or, using ‘matrices’—just a row and a column, really:

matrix 2

However, you don’t need to worry about that now. The new big thing is the following expression:

〈 χ | A | φ〉

It looks simple enough: φ to A to χ. Right? Well… Yes and no. The question is: what do you do with this? How would we take its complex conjugate, for example? And if we know how to do that, would it be equal to 〈 φ | A | χ〉?

You guessed it: we’ll have to take it apart, but how? We’ll do this using another fantastic abstraction. Remember how we took Dirac’s 〈 χ | φ 〉 bra-ket apart by writing | φ 〉 = ∑ | i 〉〈 i | φ 〉? We just dropped the 〈 χ left and right in our 〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 expression. We can go one step further now, and drop the φ 〉 left and right in our | φ 〉 = ∑ | i 〉〈 i | φ 〉 expression. We get the following wonderful thing:

| = ∑ | i 〉〈 i | over all base states i

With characteristic humor, Feynman calls this ‘The Great Law of Quantum Mechanics’ and, frankly, there’s actually more than one grain of truth in this. 🙂

Now, if we apply this ‘Great Law’ to our 〈 χ | A | φ〉 expression – we should apply it twice, actually – we get:


As Feynman points out, it’s easy to add another apparatus in series. We just write:


Just put a | bar between B and A and apply the same trick. The | bar is really like a factor 1 in multiplication. However, that’s all great fun but it doesn’t solve our problem. Our ‘Great Law’ allows us to sort of ‘resolve’ our apparatus A in terms of base states, as we now have 〈 i | A | j 〉 in the middle, rather than 〈 χ | A | φ〉 but, again, how do we work with that?

Well… The answer will surprise you. Rather than trying to break this thing up, we’ll say that the apparatus A is actually being described, or defined, by the nine 〈 i | A | j 〉 amplitudes. [There are nine for this example, but four only for the example involving spin-1/2 particles, of course.] We’ll call those amplitudes, quite simply, the matrix of amplitudes, and we’ll often denote it by Aij.

Now, I wanted to talk about operators here. The idea of an operator comes up when we’re creative again, and when we drop the 〈 χ | state from the 〈 χ | A | φ〉 expression. We write:


So now we think of the particle entering the ‘apparatus’ A in the state ϕ and coming out of A in some state ψ (‘psi’). We can generalize this and think of it as an ‘operator’, which Feynman intuitively defines as follows:

The symbol A is neither an amplitude, nor a vector; it is a new kind of thing called an operator. It is something which “operates on” a state to produce a new state.”

But… Wait a minute! | ψ 〉 is not the same as 〈 χ |. Why can we do that substitution? We can only do it because any state ψ and χ are related through that other ‘Law’ of quantum math:


Combining the two shows our ‘definition’ of an operator is OK. We should just note that it’s an ‘open’ equation until it is completed with a ‘bra’, i.e. a state like 〈 χ |, so as to give the 〈 χ | ψ〉 = 〈 χ | A | φ〉 type of amplitude that actually means something. In practical terms, that means our operator or our apparatus doesn’t mean much as long as we don’t measure what comes out, so then we choose some set of base states, i.e. a representation, which allows us to describe the final state, i.e. 〈 χ |.


Well… Folks, that’s it. I know this was mighty abstract, but the next posts should bring things back to earth again. I realize it’s only by working examples and doing exercises that one can get some kind of ‘feel’ for this kind of stuff, so that’s what we’ll have to go through now. 🙂

Quantum math: transformations

We’ve come a very long way. Now we’re ready for the Big Stuff. We’ll look at the rules for transforming amplitudes from one ‘base’ to ‘another’. [In quantum mechanics, however, we’ll talk about a ‘representation’, rather than a ‘base’, as we’ll reserve the latter term for a ‘base’ state.] In addition, we’ll look at how physicists model how amplitudes evolve over time using the so-called Hamiltonian matrix. So let’s go for it.

Transformations: how should we think about them?

In my previous post, I presented the following hypothetical set-up: we have an S-filter and a T-filter in series, but the T-filter at the angle α with respect to the first. In case you forgot: these ‘filters’ are modified Stern-Gerlach apparatuses, designed to split a particle beam according to the angular momentum in the direction of the gradient of the magnetic field, in which we may place masks to filter out one or more states.


The idea is illustrated in the hypothetical example below. The unpolarized beam goes through S, but we have masks blocking all particles with zero or negative spin in the z-direction, i.e. with respect to S. Hence, all particles entering the T-filter are in the +S state. Now, we assume the set-up of the T-filter is such that it filters out all particles with positive or negative spin. Hence, only particles with zero spin go through. So we’ve got something like this:


However, we need to be careful as what we are saying here. The T-apparatus is tilted, so the gradient of the magnetic field is different. To be precise, it’s got the same tilt as the T-filter itself (α). Hence, it will be filtering out all particles with positive or negative spin with respect to T. So, unlike what you might think at first, some fraction of the particles in the +S state will get through the T-filter, and come out in the 0T state. In fact, we know how many, because we have formulas for situations like this. To be precise, in this case, we should apply the following formula:

〈 0T | +S 〉 =  −(1/√2)·sinα

This is a real-valued amplitude. As usual, we get the following probability by taking the absolute square, so P = |−(1/√2)·sinα|= (1/2)·sin2α, which gives us the following graph of P:

graph probability

The probability varies between 0  (for α = 0 or π) and 1/2 = 0.5 (for α = π/2 or 3π/2). Now, this graph may or may not make sense to you, so you should think about it. You’ll admit it makes sense to find P = 0 for α = 0, but what about the non-zero values?

Think about what this would mean in classical terms: we’ve got a beam of particles whose angular momentum is ‘up’ in the z-direction. To be precise, this means that Jz = +ħ. [Angular momentum and the quantum of action have the same dimension: the joule·second.] So that’s the maximum value out of the three permitted values, which are +ħ, 0 and –ħ. Note that the particles here must be bosons. So you may think we’re talking photons, in practice but… Well… No. As I’ll explain in a later post, the photon is a spin-one particle but it’s quite particular, because it has no ‘zero spin’-state. Don’t worry about it here – but it’s really quite remarkable. So, instead of thinking of a photon, you should think of some composite matter particle obeying Bose-Einstein statistics. These are not so rare as you may think: all matter-particles that contain an even number of fermions – like elementary particles – have integer spin – but… Well… Their spin number is usually zero – not one. So… Well… Feynman’s particle here is somewhat theoretical – but it doesn’t matter. Let’s move on. 🙂

Let’s look at another transformation formula. More in particular, let’s look at the formula we (should) get for 〈 0T | −S 〉 as a function of α. So we change the set-up of the S-filter to ensure all particles entering T have negative spin. The formula is:

〈 0T | −S 〉 =  +(1/√2)·sinα

That gives the same probabilities: |+(1/√2)·sinα|= (1/2)·sin2α. Adding |〈 0T | +S 〉|2 and |〈 0T | −S 〉|gives us a total probability equal to sin2α, which is equal to 1 if α = π/2 or 3π/2. We may be tempted to interpret this as follows: if a particle is in the +S or −S state before entering the T-apparatus, and the T-apparatus is tilted at an angle α = π/2 or 3π/2 with respect to the S-apparatus, then this particle will come out of the T-apparatus in the 0T-state. No ambiguity here: P = 1.

Is this strange? Well… Let’s think about what it means to tilt the T-apparatus. You’ll have to admit that, if the apparatus is tilted at the angle π/2 or 3π/2, it’s going to measure the angular momentum in the x-direction. [The y-direction is the common axis of both apparatuses here.] So… Well… It’s pretty plausible, isn’t it? If all of the angular momentum is in the positive or negative z-direction, then it’s not going to have any angular momentum in the x-direction, right? And not having any angular momentum in the x-direction effectively corresponds to being in the 0T-state, right?

Oh ! Is it that easy?

Well… No! Not at all! The reasoning above shows how easy it is to be led astray. We forgot to normalize. Remember, if we integrate the probability density function over its domain, i.e. α ∈ [0, 2π], then we have to get one, as all probabilities have to add up to one. The definite integral of (1/2)·sin2α over [0, 2π] is equal to π/2 (the definite integral of the sine or cosine squared over a full cycle is equal to π), so we need to multiply this function by 2/π to get the actual probability density function, i.e. (1/π)·sin2α. It’s got the same shape, obviously, but it gives us maximum probabilities equal to 1/π ≈ 0.32 for α = π/2 or 3π/2, instead of 1/2 = 0.5.

Likewise, the sin2α function we got when adding |〈 0T | +S 〉|2 and |〈 0T | −S 〉|should also be normalized. One really needs to keep one’s wits about oneself here. What we’re saying here is that we have a particle that is either in the +S or the −S state, so let’s say that the chance is 50/50 to be in either of the two states. We then have these probabilities |〈 0T | +S 〉|2 and |〈 0T | −S 〉|2, which we calculated as (1/π)·sin2α. So the total combined probability is equal to 0.5·(1/π)·sin2α + 0.5·(1/π)·sin2α = (1/π)·sin2α. So we’re now weighing the two (1/π)·sin2α functions – and it doesn’t matter if the weights are 50/50 or 75/25 or whatever, as long as the two weights add up to one. The bottom line is: we get the same (1/π)·sin2α function for P, and the same maximum probability 1/π ≈ 0.32 for α = π/2 or 3π/2.

So we don’t get unity: P ≠ 1 for α = π/2 or 3π/2. Why not? Think about it. The classical analysis made sense, didn’t it? If the angular momentum is all in the z-direction (or in one of the two z-directions, I should say), then we cannot have any of it in the x-direction, can it? Well… The surprising answer is: yes, we can. The remarkable thing is that, in quantum physics, we actually never have all of the angular momentum in one direction. As I explained in my post on spin and angular momentum, the classical concepts of angular momentum, and the related magnetic moment, have their limits in quantum mechanics. In quantum physics, we find that the magnitude of a vector quantity, like angular momentum, or the related magnetic moment, is generally not equal to the maximum value of the component of that quantity in any direction. The general rule is that the maximum value of any component of J in whatever direction – i.e. +ħ in the example we’re discussing here – is smaller than the magnitude of J – which I calculated in the mentioned post as |J| = J = +√2·ħ ≈ 1.414·ħ, so that’s almost 1.5 times ħ! So it’s quite a bit smaller! The upshot is that we cannot associate any precise and unambiguous direction with quantities like the angular momentum J or the magnetic moment μ. So the answer is: the angular momentum can never be all in the z-direction, so we can always have some of it in the x-direction, and so that explains the amplitudes and probabilities we’re having here.


Yep. I know. We never seem to get out of this ‘weirdness’, but then that’s how quantum physics is like. Feynman warned us upfront:

“Because atomic behavior is so unlike ordinary experience, it is very difficult to get used to, and it appears peculiar and mysterious to everyone—both to the novice and to the experienced physicist. Even the experts do not understand it the way they would like to, and it is perfectly reasonable that they should not, because all of direct, human experience and of human intuition applies to large objects. We know how large objects will act, but things on a small scale just do not act that way. So we have to learn about them in a sort of abstract or imaginative fashion and not by connection with our direct experience.”

As I see it, quantum physics is about explaining all sorts of weird stuff, like electron interference and tunneling and what have you, so it shouldn’t surprise us that the framework is as weird as the stuff it’s trying to explain. 🙂 So… Well… All we can do is to try to go along with it, isn’t it? And so that’s what we’ll do here. 🙂

Transformations: the formulas

We need to distinguish various cases here. The first case is the case explained above: the T-apparatus shares the same y-axis – along which the particles move – but it’s tilted. To be precise, we should say that it’s rotated about the common y-axis by the angle α. That implies we can relate the x’, y’, z’ coordinate system of T to the x, y, z coordinate system of S through the following equations: z′ cosα sinα, x′ cosα − sinα, and y′ y. Then the transformation amplitudes are:

transformation 1

We used the formula for 〈 0T | +S 〉 and 〈 0T | −S 〉 above, and you can play with the formulas above by imagining the related set-up of the S and T filters, such as the one below:


If you do your homework (just check what formula and what set-up this corresponds to), you should find the following graph for the amplitude and the probability as a function of α: the graph is zero for α = π, but is non-zero everywhere else. As with the other example, you should think about this. It makes sense—sort of, that is. 🙂

probability 2

OK. Next case. Now we’re going to rotate the T-apparatus around the z-axis by some angle β. To illustrate what we’re doing here, we need to take a ‘top view’ of our apparatus, as shown below, which shows a rotation over 90°. More in general, for any angle β, the coordinate transformation is given by z′ z, x′ cosβ sinβ, y′ cosβ − sinβ. [So it’s quite similar to case 1: we’re only rotating the thing in a different plane.]

rotation z

The transformation amplitudes are now given by:

transformation 2

As you can see, we get complex-valued transformation amplitudes, unlike our first case, which yielded real-valued transformation amplitudes. That’s just the way it is. Nobody says transformation amplitudes have to be real-valued. On the contrary, one would expect them to be complex numbers. 🙂 Having said that, the combined set of transformation formulas is, obviously, rather remarkable. The amplitude to go from the +S state to, say, the 0T state is zero. Also, when our particle has zero spin when coming out of S, it will always have zero spin when and if it goes through T. In fact, the absolute value of those e±iβ functions is also equal to one, so they are also associated with probabilities that are equal to one: |e±iβ|2 = 12 =  1. So… Well… Those formulas are simple and weird at the same time, aren’t they? They sure give us plenty of stuff to think about, I’d say.

So what’s next? Well… Not all that much. We’re sort of done, really. Indeed, it’s just a property of space that we can get any rotation of T by combining the two rotations above. As I only want to introduce the basic concepts here, I’ll refer you to Feynman for the details of how exactly that’s being done. [He illustrates it for spin-1/2 particles in particular.] I’ll just wrap up here by generalizing our results from base states to any state.

Transformations: generalization

We mentioned a couple of times already that the base states are like a particular coordinate system: we will usually describe a state in terms of base states indeed. More in particular, choosing S as our representation, we’ll say:

The state φ is defined by the three numbers:

C+ = 〈 +S | φ 〉,

C0 = 〈 0S | φ 〉,

C = 〈 −S | φ 〉.

Now, the very same state can, of course, also be described in the ‘T system’, so then our numbers – i.e. the ‘components’ of φ – would be equal to:

C’+ = 〈 +T | φ 〉, C’0 = 〈 0T | φ 〉, and C’ = 〈 −T | φ 〉.

So how can we go from the unprimed ‘coordinates’ to the primed ones? The trick is to use the second of the three quantum math ‘Laws’ which I introduced in my previous post:


Just replace χ in [II] by +T, 0T and/or –T. More in general, if we denote +T, 0T or –T by jT, we can re-write this ‘Law’ as:

transformation 3

So the 〈 jT | iS 〉 amplitudes are those nine transformation amplitudes. Now, we can represent those nine amplitudes in a nice three-by-three matrix and, yes, we’ll call that matrix the transformation matrix. So now you know what that is.

To conclude, I should note that it’s only because we’re talking spin-one particles here that we have three base states here and, hence, three ‘components’, which we denoted by C+, C and C0, which transform the way they do when going from one representation to another, and so that is very much like what vectors do when we move to a different coordinate system, which is why spin-one particles are often referred to as ‘vector particles. [I am just mentioning this in case you’d come across the term and wonder why they’re being called that way. Now you know.] In fact, if we have three base states, in respect to whatever representation, and we define some state φ in terms of them, then we can always re-define that state in terms of the following ‘special’ set of components:

special set

The set is ‘special’ because one can show (you can do that yourself that by using those transformation laws) that these components transform exactly the way as x, y, z transform to x, y, z. But so I’ll leave at this.


Oh… What about the Hamiltonian? Well… I’ll save that for my next posts, as my posts have become longer and longer, and so it’s probably a good idea to separate them out. 🙂

Post scriptum: transformations for spin-1/2 particles

You should actually really check out that chapter of Feynman. The transformation matrices for spin-1/2 particles look different because… Well… Because there’s only two base states for spin-1/2 particles. It’s a pretty technical chapter, but then spin-1/2 particles are the ones that make up the world. 🙂