It’s probably good to review the concepts we’ve learned so far. Let’s start with the *foundation* of all of our math, i.e. the concept of the state, or the state *vector*. [The difference between the two concepts is subtle but real. I’ll come back to it.]

**State vectors and base states**

We used Dirac’s *bra-ket* notation to denote a state vector, in general, as | ψ 〉. The obvious question is: what *is* this thing? We called it a vector because we *use* it *like* a vector: we multiply it with some number, and then add it to some other vector. So that’s just what you did in high school, when you learned about *real* vector spaces. In this regard, it is good to remind you of the definition of a vector space. To put it simply, it is a collection of objects called **vectors**, which may be added together, and multiplied by **numbers**. So we have *two* things here: the ‘objects’, and the ‘numbers’. That’s why we’d say that we have some vector *space* over a *field* of numbers. [The term ‘field’ just refers to an algebraic structure, so we can add and multiply and what have you.] Of course, what it means to ‘add’ two ‘objects’, and what it means to ‘multiply’ an object with a number, depends on the type of objects and, unsurprisingly, the type of numbers.

*Huh? The type of number?! A number is a number, no?*

*No, hombre, no!* We’ve got natural numbers, rational numbers, real numbers, complex numbers – and you’ve probably heard of quaternions too – and, hence, ‘multiplying’ a ‘number’ with ‘something else’ can mean *very* different things. At the same time, the general idea is the general idea, so that’s the same, indeed. 🙂 When using real numbers and the kind of vectors you are used to (i.e. *Euclidean* vectors), the multiplication amounts to a re-*scaling* of the vector, and so that’s why a real number is often referred to as a *scalar*. At the same time, anything that can be used to multiply a vector is often referred to as a scalar in math so… Well… Terminology is often quite confusing. In fact, I’ll give you some more examples of confusing terminology in a moment. But let’s first look at our ‘objects’ here, i.e. our ‘vectors’.
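Here’s a quick numerical aside (a Python sketch, not from the original argument) showing why the *type* of number matters: multiplying by a complex number doesn’t just re-*scale*, it also *rotates*.

```python
import cmath
import math

z = 1 + 1j        # a complex 'scalar': modulus sqrt(2), phase 45 degrees
v = 3 + 4j        # think of this as a vector in the complex plane

w = z * v         # 'multiplying by a number', complex-style

# the modulus gets re-scaled by |z|...
assert math.isclose(abs(w), abs(z) * abs(v))
# ...but the phase also shifts by the phase of z: a rotation, not just a re-scaling
assert math.isclose(cmath.phase(w), cmath.phase(v) + cmath.phase(z))
```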

I did a post on Euclidean and non-Euclidean vector spaces two years ago, when I started this blog, but *state vectors* are obviously very different ‘objects’. They don’t resemble the vectors we’re used to. We’re used to so-called *polar* vectors, aka *real* vectors, like the position vector (**x** or **r**), the momentum vector (**p** = m·**v**), or the electric field vector (**E**). We are also familiar with the so-called *pseudo*-vectors, aka *axial* vectors, like angular momentum (**L** = **r**×**p**), or the magnetic dipole moment. [Unlike what you might think, not all vector *cross* products yield a pseudo-vector. For example, the cross-product of a polar and an axial vector yields a polar vector.] But here we are talking about a very different ‘object’. In math, we say that state vectors are elements in a *Hilbert space*. So a Hilbert space is a vector space but… Well… With special vectors. 🙂

The key to understanding why we’d refer to *states* as state *vectors* is the fact that, just like Euclidean vectors, we can *uniquely* specify any *element* in a Hilbert space with respect to a set of *base* states. So it’s really like using Cartesian coordinates in a two- or three-dimensional Euclidean space. The analogy is complete because, even in the absence of a *geometrical* interpretation, we’ll require those base states to be *orthonormal*. Let me be explicit on that by reminding you of your high-school classes on vector analysis: you’d choose a set of orthonormal base vectors **e**_{1}, **e**_{2} and **e**_{3}, and you’d write any vector **A** as:

**A** = (A_{x}, A_{y}, A_{z}) = A_{x}·**e**_{1} + A_{y}·**e**_{2} + A_{z}·**e**_{3}, with **e**_{i}·**e**_{j} = 1 if i = j, and **e**_{i}·**e**_{j} = 0 if i ≠ j

The **e**_{i}·**e**_{j} = 1 if i = j and **e**_{i}·**e**_{j} = 0 if i ≠ j condition expresses the orthonormality condition: the base vectors need to be orthogonal *unit* vectors. We wrote it as **e**_{i}·**e**_{j} = δ_{ij} using the *Kronecker delta* (δ_{ij} = 1 if i = j and 0 if i ≠ j). Now, base states in quantum mechanics do *not* necessarily have a geometrical interpretation. Indeed, although one often *can* actually associate them with some *position* or *direction* in space, the condition of orthonormality applies in the *mathematical* sense of the word only. Denoting the base states by *i* = 1, 2,… – or by Roman numerals, like I and II – so as to distinguish them from the Greek ψ or φ symbols we use to denote state vectors *in general*, we write the orthonormality condition as follows:

〈 i | j 〉 = δ_{ij}, with δ_{ij} = δ_{ji} equal to 1 if *i* = *j*, and zero if *i* ≠ *j*
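If you want to see the orthonormality condition in action, here’s a minimal numerical sketch (using NumPy, with the ordinary Cartesian basis standing in for the base states):

```python
import numpy as np

# three orthonormal base vectors (the ordinary Cartesian basis)
e = [np.array([1.0, 0.0, 0.0]),
     np.array([0.0, 1.0, 0.0]),
     np.array([0.0, 0.0, 1.0])]

# e_i . e_j = delta_ij: 1 if i equals j, 0 otherwise
for i in range(3):
    for j in range(3):
        assert np.dot(e[i], e[j]) == (1.0 if i == j else 0.0)
```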

Now, you may grumble and say: that 〈 i | j 〉 *bra-ket* does *not* resemble the **e**_{i}·**e**_{j} product. Well… It does and it doesn’t. I’ll show why in a moment. First note how we uniquely specify state vectors in general in terms of a set of base states. For example, if we have two possible base states only, we’ll write:

| φ 〉 = | 1 〉 C_{1} + | 2 〉 C_{2}

Or, if we choose some other set of base states | I 〉 and | II 〉, we’ll write:

| φ 〉 = | I 〉 C_{I} + | II 〉 C_{II}

You should note that the | 1 〉 C_{1} term in the | φ 〉 = | 1 〉 C_{1} + | 2 〉 C_{2} sum is really like the A_{x}·**e**_{1} product in the **A** = A_{x}·**e**_{1} + A_{y}·**e**_{2} + A_{z}·**e**_{3} expression. In fact, you may actually write it as C_{1}·| 1 〉, or just reverse the order and write C_{1}| 1 〉. However, that’s not common practice and so I won’t do that, except occasionally. So you should look at | 1 〉 C_{1} as a product indeed: it’s the product of a base state and a complex number, so it’s really like m·**v**, or whatever other product of some *scalar* and some *vector*, except that we’ve got a complex scalar here. […] Yes, I know the term ‘complex scalar’ doesn’t make sense, but I hope you know what I mean. 🙂
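To make the ‘product of a base state and a complex number’ idea concrete, here’s a small sketch (NumPy, with illustrative coefficients chosen so the state is normalized):

```python
import numpy as np

# base states |1> and |2> as vectors
ket_1 = np.array([1, 0], dtype=complex)
ket_2 = np.array([0, 1], dtype=complex)

# complex coefficients, chosen so that |C1|^2 + |C2|^2 = 1
C1, C2 = (1 + 1j) / 2, (1 - 1j) / 2

# |phi> = |1> C1 + |2> C2: each term is a base state times a complex scalar
phi = ket_1 * C1 + ket_2 * C2

assert phi[0] == C1 and phi[1] == C2          # the C_i are the components
assert np.isclose(np.vdot(phi, phi).real, 1)  # the state is normalized
```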

More generally, we write:

| ψ 〉 = ∑ | i 〉 C_{i} = ∑ | i 〉〈 i | ψ 〉

Writing our state vector | ψ 〉, | φ 〉 or | χ 〉 like this also defines these coefficients or coordinates C_{i}. Unlike our state vectors, or our base states, C_{i} is an actual *number*. It *has* to be, of course: it’s the *complex* number that makes sense of the whole expression. To be precise, C_{i} is an *amplitude*, or a *wavefunction*, i.e. a function depending on both space and time. In our previous posts, we limited the analysis to amplitudes varying in time only, and we’ll continue to do so for a while. However, at some point, you’ll get the full picture.
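The C_{i} = 〈 i | ψ 〉 idea – projecting onto a base state to get a coefficient – can be checked numerically too (a sketch; `np.vdot` conjugates its first argument, which is exactly what a bra does):

```python
import numpy as np

ket_1 = np.array([1, 0], dtype=complex)
ket_2 = np.array([0, 1], dtype=complex)
phi = 0.6 * ket_1 + 0.8j * ket_2   # some state |phi>

# C_i = <i|phi>: projecting onto a base state recovers the coefficient
C1 = np.vdot(ket_1, phi)           # np.vdot conjugates its first argument
C2 = np.vdot(ket_2, phi)

assert C1 == 0.6 and C2 == 0.8j
```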

Now, what about the supposed similarity between the 〈 i | j 〉 *bra-ket* and the **e**_{i}·**e**_{j} product? Let me invoke what Feynman, tongue-in-cheek as usual, refers to as the Great Law of Quantum Mechanics:

| = ∑ | i 〉〈 i | over all base states *i*

You get this by taking | ψ 〉 out of the | ψ 〉 = ∑| i 〉〈 i | ψ 〉 expression. And, no, don’t say: what nonsense! Because… Well… Dirac’s notation really *is* that simple and powerful! You just have to read it from right to left. There’s an order to the symbols, unlike what you’re used to in math, because you’re used to operations that are commutative. But I need to move on. The upshot is that we can specify our base states in terms of the base states too. For example, if we have only two base states, let’s say I and II, then we can write:

| I 〉 = ∑| i 〉〈 i | I 〉 = 1·| I 〉 + 0·| II 〉 and | II 〉 = ∑| i 〉〈 i | II 〉 = 0·| I 〉 + 1·| II 〉

We can write this using a *matrix* notation:
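In matrix notation, the two base states are just the columns of the identity matrix, and the ‘Great Law’ says the | i 〉〈 i | products sum to the identity. A quick check (a NumPy sketch):

```python
import numpy as np

# two base states as column vectors: the columns of the 2x2 identity matrix
kets = [np.array([[1], [0]], dtype=complex),
        np.array([[0], [1]], dtype=complex)]

# sum over i of |i><i|: each term is a column times a row, i.e. a 2x2 matrix
resolution = sum(k @ k.conj().T for k in kets)

# the 'Great Law': the sum is the identity, so inserting it changes nothing
assert np.allclose(resolution, np.eye(2))
```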

Now *that* is silly, you’ll say. What’s the use of this? It doesn’t tell us anything new, and it also does *not* show us why we should think of the 〈 i | j 〉 *bra-ket* and the **e**_{i}·**e**_{j} product as being similar! Well… Yes and no. Let me show you something else. Let’s assume we’ve got some states χ and φ, which we specify in terms of our chosen set of base states as | χ 〉 = ∑ | i 〉 D_{i} and | φ 〉 = ∑ | i 〉 C_{i} respectively. Now, from our post on quantum math, you’ll remember that 〈 χ | i 〉 and 〈 i | χ 〉 are each other’s complex conjugates, so we know that 〈 χ | i 〉 = 〈 i | χ 〉* = D_{i}*. So if we have all C_{i} = 〈 i | φ 〉 and all D_{i} = 〈 i | χ 〉, i.e. the ‘components’ of both states in terms of our base states, then we can *calculate* 〈 χ | φ 〉 – i.e. the amplitude to go from some state φ to some state χ – as:

〈 χ | φ 〉 = ∑〈 χ | i 〉〈 i | φ 〉 = ∑ D_{i}*C_{i} = ∑ D_{i}*〈 i | φ 〉
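Here’s that inner product as a one-liner (a NumPy sketch with illustrative components):

```python
import numpy as np

C = np.array([0.6, 0.8j])             # components C_i of |phi>
D = np.array([1, 1j]) / np.sqrt(2)    # components D_i of |chi>

# <chi|phi> = sum over i of D_i* C_i  (np.vdot conjugates its first argument)
amp = np.vdot(D, C)
assert np.isclose(amp, np.sum(D.conj() * C))

# and <phi|chi> = <chi|phi>*, as it should be
assert np.isclose(np.vdot(C, D), amp.conjugate())
```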

We can now scrap | φ 〉 in this expression – *yes, it’s the power of Dirac’s notation once more!* – so we get:

〈 χ | = ∑ D_{i}*〈 i |

Now, we can re-write this using a matrix notation:

[I assumed that we have three base states now, so as to make the example somewhat less obvious. Please note we can *never* leave one of the base states out when specifying a state vector, so it’s not like the previous example was not complete. I’ll switch from two-state to three-state systems and back again all the time, so as to show the analysis is pretty general. To visualize things, think of the ammonia molecule as an example of a two-state system versus the spin states of a spin-one particle as an example of a three-state system. OK. Let’s get back to the lesson.]

You’ll say: so what? Well… Look at this:

I just combined the notations for 〈 I | and | III 〉. Can you *now* see the similarity between the 〈 i | j 〉 *bra-ket* and the **e**_{i}·**e**_{j} product? It really *is* the same: you just need to respect the subtleties in regard to writing the 〈 i | and | j 〉 vectors, or the **e**_{i} and **e**_{j} vectors, as a *row* vector or a *column* vector respectively.
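To see the row-vector-times-column-vector mechanics explicitly (a NumPy sketch):

```python
import numpy as np

ket = np.array([[0.6], [0.8j]])   # |chi> as a 2x1 column vector
bra = ket.conj().T                # <chi| as the 1x2 conjugate-transpose row

# bra times ket is a (1x2)(2x1) matrix product: a 1x1 'matrix', i.e. a number
braket = (bra @ ket)[0, 0]
assert np.isclose(braket, 1.0)    # |0.6|^2 + |0.8|^2 = 1
```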

It doesn’t stop here, of course. When learning about vectors in high school, we also learned that we could go from one set of base vectors to another by a *transformation*, such as, for example, a *rotation*, or a *translation*. We showed how a *rotation* worked in one of our posts on two-state systems, where we wrote:

So we’ve got that *transformation* matrix, which, of course, isn’t random. To be precise, we got the matrix equation above (note that we’re back to two states only, so as to simplify) because we defined the C_{I} and C_{II} coefficients in the | φ 〉 = | I 〉 C_{I} + | II 〉 C_{II} = | 1 〉 C_{1} + | 2 〉 C_{2} expression as follows:

- C_{I} = 〈 I | φ 〉 = (1/√2)·(C_{1} − C_{2})
- C_{II} = 〈 II | φ 〉 = (1/√2)·(C_{1} + C_{2})

The (1/√2) factor is there because of the normalization condition, and the two-by-two matrix equals the transformation matrix for a rotation of a state filtering apparatus about the y-axis, over an angle equal to (minus) 90 degrees, which we wrote as:
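Whatever its geometrical interpretation, that transformation matrix is *unitary*: it preserves the total probability ∑|C_{i}|². A quick numerical check (a sketch with illustrative components):

```python
import numpy as np

# C_I = (1/sqrt(2))(C1 - C2) and C_II = (1/sqrt(2))(C1 + C2), as a matrix
S = np.array([[1, -1],
              [1,  1]], dtype=complex) / np.sqrt(2)

C_12 = np.array([0.6, 0.8j])      # components in the |1>, |2> representation
C_I_II = S @ C_12                 # components in the |I>, |II> representation

# the transformation is unitary, so the total probability is preserved
assert np.allclose(S @ S.conj().T, np.eye(2))
assert np.isclose(np.vdot(C_I_II, C_I_II), np.vdot(C_12, C_12))
```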

I promised I’d say something more about confusing terminology so let me do that here. We call a set of base states a ‘*representation*’, and writing a state vector in terms of a set of base states is often referred to as a ‘*projection*’ of that state into the base set. Again, we can see it’s sort of a mathematical projection, rather than a geometrical one. But it makes sense. In any case, that’s enough on state vectors and base states.

Let me wrap it up by inserting one more matrix equation, which you should be able to reconstruct yourself:

The only thing we’re doing here is to substitute 〈 χ | and | φ 〉 for ∑ D_{j}*〈 j | and ∑ | i 〉 C_{i} respectively. All the rest follows. Finally, I promised I’d tell you the difference between a state and a state *vector*. It’s subtle and, in practice, the two concepts refer to the same thing. However, we write a state as a state, like ψ or, if it’s a base state, like I, or ‘up’, or whatever. When we say a state *vector*, then we think of a set of numbers. It may be a row vector, like the 〈 χ | row vector with the D_{i}* coefficients, or a column vector, like the | φ 〉 column vector with the C_{i} coefficients. So if we say *vector*, then we think of a one-dimensional *array* of numbers, while the *state* itself is… Well… The state. So that’s some *reality* in physics. So you might define the state vector as the set of numbers that *describes* the state. While the difference is subtle, it’s important. It’s also important to note that the 〈 χ | and | χ 〉 state vectors are different too. The former appears as the *final state* in an amplitude, while the latter describes the starting condition. The former is referred to as a *bra* in the 〈 χ | φ 〉 bra-ket, while the latter is a *ket* in the 〈 φ | χ 〉 = 〈 χ | φ 〉* amplitude. 〈 χ | is a *row* vector equal to ∑ D_{i}*〈 i |, while | χ 〉 = ∑ D_{i}| i 〉. So it’s quite different. More generally, we’d *define bras* and *kets* as row and column vectors respectively, so we write:

That makes it clear that a bra next to a ket is to be understood as a matrix multiplication. From what I wrote, it is also obvious that the *conjugate transpose *(which is also known as the *Hermitian conjugate*) of a bra is the corresponding ket and vice versa, so we write:

Let me formally define the *conjugate* or *Hermitian transpose* here: the conjugate transpose of an *m*-by-*n* matrix *A* with complex elements is the *n*-by-*m* matrix *A*† obtained from *A* by taking the transpose (so we write the rows as columns and vice versa) and then taking the complex conjugate of each element (i.e. we switch the sign of the *imaginary* part of the complex number). *A*† is read as ‘A dagger’, but mathematicians will usually denote it by *A**. In fact, there are a lot of equivalent notations, as we can write:
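NumPy gives the conjugate transpose directly as `.conj().T`. A quick illustration of the definition (the matrix itself is just an example):

```python
import numpy as np

A = np.array([[1 + 2j, 3],
              [0, 4 - 1j],
              [5j, 6]])            # a 3-by-2 matrix with complex elements

A_dagger = A.conj().T              # transpose, then conjugate each element

assert A_dagger.shape == (2, 3)    # an m-by-n matrix becomes n-by-m
assert A_dagger[0, 0] == 1 - 2j    # the sign of the imaginary part switches
assert np.array_equal(A_dagger.conj().T, A)   # and the dagger of the dagger is A
```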

OK. That’s it on this.

One more thing, perhaps. We’ll often have states, or *base* states, that make sense, in a *physical* sense, that is. But it’s not always the case: we’ll sometimes use base states that may *not* represent some situation we’re likely to encounter, but that make sense *mathematically*. We gave the example of the ‘mathematical’ | I 〉 and | II 〉 base states, versus the ‘physical’ | 1 〉 and | 2 〉 base states, in our post on the ammonia molecule, so I won’t say more about this here. Do keep it in mind though. Sometimes it may feel like nothing makes sense, *physically*, but it usually does *mathematically* and, therefore, all usually comes out alright in the end. 🙂 To be precise, what we did there, was to choose base states with an unambiguous, i.e. a *definite*, energy level. That made our calculations much easier, and the end result was the same, indeed!

So… Well… I’ll let this sink in, and move on to the next topic.

**The Hamiltonian operator**

In my post on the Hamiltonian, I explained that those C_{i} and D_{i} coefficients are usually a *function of time*, and how they can be determined. To be precise, they’re determined by a *set* of differential equations (i.e. equations involving a function *and* the *derivative* of that function) which we wrote as:

iħ·(dC_{i}/dt) = ∑ H_{ij}C_{j}

If we have *two* base states only, then this set of equations can be written as:

iħ·(dC_{1}/dt) = H_{11}C_{1} + H_{12}C_{2} and iħ·(dC_{2}/dt) = H_{21}C_{1} + H_{22}C_{2}

Two equations and two functions – C_{1 }= C_{1}(t) and C_{2 }= C_{2}(t) – so we should be able to solve this thing, right? Well… No. We don’t know those H_{ij }coefficients. As I explained in that post, they also evolve in time, so we should write them as H_{ij}(t) instead of H_{ij }*tout court*, and so it messes the whole thing up. We have two equations and *six* functions really. Of course, there’s always a way out, but I won’t dwell on that here—not *now *at least. What I want to do here is look at the Hamiltonian as an *operator*.
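For illustration only – and assuming *constant* H_{ij} coefficients, which, as noted above, is generally *not* the case – the two-state set of equations can be solved by diagonalizing the Hamiltonian. A Python sketch in natural units, with illustrative, ammonia-like parameters:

```python
import numpy as np

hbar = 1.0                              # natural units, for this sketch only
E0, A = 1.0, 0.5                        # illustrative, ammonia-like parameters
H = np.array([[E0, -A],
              [-A, E0]], dtype=complex) # a constant, Hermitian Hamiltonian

def evolve(C0, t):
    """Solve i*hbar*dC/dt = H C by diagonalizing H (valid for constant H)."""
    E, V = np.linalg.eigh(H)            # H = V diag(E) V-dagger
    return V @ (np.exp(-1j * E * t / hbar) * (V.conj().T @ C0))

C0 = np.array([1, 0], dtype=complex)    # start in base state |1>
Ct = evolve(C0, 2.0)

# the evolution is unitary: |C1|^2 + |C2|^2 stays equal to 1
assert np.isclose(np.vdot(Ct, Ct).real, 1.0)
```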

We introduced operators – but not very rigorously – when explaining the Hamiltonian. We did so by ‘expanding’ our 〈 χ | φ 〉 amplitude as follows. We’d say the amplitude to find a ‘thing’ – like a particle, for example, or some *system* of particles or other things – in some state χ at the time t = t_{2}, when it was in some state φ at the time t = t_{1}, was equal to:

〈 χ | A | φ 〉

Now, a formula like this only makes sense because we’re ‘abstracting away’ from the base states, which we need to describe any state. Hence, to actually *describe* what’s going on, we *have to* choose some *representation* and expand this expression as follows:

〈 χ | A | φ 〉 = ∑∑ 〈 χ | i 〉〈 i | A | j 〉〈 j | φ 〉

That looks pretty monstrous, so we should write it all out. Using the matrix notation I introduced above, we can do that – let’s take a practical example with three base states once again – as follows:

Now, this still looks pretty monstrous, but just think of it. We’re just applying that ‘Great Law of Quantum Physics’ here, i.e. | = ∑ | i 〉〈 i | over all base states *i*. To be precise, we apply it to an 〈 χ | A | φ 〉 expression, and we do so *twice*, so we get:

Nothing more, nothing less. 🙂 Now, the idea of an operator is the result of being creative: we just drop the 〈 χ | state from the expression above to write:

A | φ 〉 = ∑ | i 〉〈 i | A | j 〉〈 j | φ 〉

Yes. I know. That’s a lot to swallow, but you’ll see it makes sense because of the Great Law of Quantum Mechanics: | = ∑ | i 〉〈 i | over all base states *i*.

Just think about it and continue reading when you’re ready. 🙂 The upshot is: we now think of the particle entering some ‘apparatus’ A in the state ϕ and coming out of A in some state ψ or, looking at A as an operator, we can generalize this. As Feynman puts it:

**“The symbol A is neither an amplitude, nor a vector; it is a new kind of thing called an operator. It is something which “operates on” a state to produce a new state.”**

Back to our Hamiltonian. Let’s go through the same process of ‘abstraction’. Let’s first re-write that ‘Hamiltonian equation’ as follows:

iħ·(d〈 i | ψ 〉/dt) = ∑ 〈 i | H | j 〉〈 j | ψ 〉

The H_{ij}(t) are *amplitudes* indeed – H_{ij}(t) = 〈 i | H | j 〉 – and we can represent them in a *matrix* indeed! Now let’s take the first step in our ‘abstraction process’: let’s scrap the 〈 i | bit. We get:

iħ·(d| ψ 〉/dt) = ∑ H | j 〉〈 j | ψ 〉

We can, of course, also abstract away from the | j 〉 bit, so we get:

iħ·(d| ψ 〉/dt) = H | ψ 〉

Look at this! The right-hand side of this expression is *exactly *the same as that A | χ 〉 format we presented when introducing the concept of an operator. [In fact, when I say you should ‘abstract away’ from the | j 〉 bit, then you should think of the ‘Great Law’ and that matrix notation above.] So H is an operator and, therefore, it’s something which operates on a state to produce a new state.

OK. Clear enough. But what’s that ‘state’ on the left-hand side? I’ll just paraphrase Feynman here, who says we should think of it as follows: “The time derivative of the *state vector* |ψ〉 times iℏ is equal to what you get by (1) operating with the Hamiltonian *operator* H on each base state, (2) multiplying by the amplitude that ψ is in the state j (i.e. 〈j|ψ〉), and (3) summing over all j.” Alternatively, you can also say: “The time derivative, times *i*ħ, of a state |ψ〉 is equal to what you get if you *operate* on it with the Hamiltonian.” Of course, that’s true for *any* state, so we can ‘abstract away’ the |ψ〉 bit too and, putting a little hat (^) over the operator to remind ourselves that it’s an operator (rather than just *any* matrix), we get the Hamiltonian operator equation:

iħ·(d| ψ 〉/dt) = Ĥ | ψ 〉

Now, that’s all nice and great, but the key question, of course, is: what can you *do* with this? Well… It turns out this Hamiltonian operator is useful to calculate lots of stuff. In the first place, of course, it’s a useful operator in the context of those differential equations describing the *dynamics* of a quantum-mechanical system. When everything is said and done, **those equations are the equivalent, in quantum physics, of the law of motion in classical physics**. [And I am not joking here.]

In addition, the Hamiltonian operator also has other uses. The one I should really mention here is that you can calculate the *average* or *expected value* (EV[*X*]) of the energy of a state ψ (i.e. *any* state, really) by first operating on | ψ 〉 with the Hamiltonian, and then multiplying 〈 ψ | with the result. That sounds a bit complicated, but you’ll understand it when seeing the mathematical expression, which we can write as:

EV[E] = 〈 ψ | H | ψ 〉 = ∑∑ 〈 ψ | i 〉〈 i | H | j 〉〈 j | ψ 〉

The formula is pretty straightforward. [If you don’t think so, then just write it all out using the matrix notation.] But you may wonder how it works, *exactly*… Well… Sorry. I don’t want to copy all of Feynman here, so I’ll refer you to him on this. In fact, the *proof* of this formula is actually *very* straightforward, and so you should be able to get through it with the math you got here. You may even understand Feynman’s illustration of it for the ‘special case’ when the base states are, indeed, those mathematically convenient base states with definite energy levels.
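A quick numerical sanity check of the expected-value recipe (a sketch; the Hamiltonian and the state are illustrative):

```python
import numpy as np

H = np.array([[1.0, -0.5],
              [-0.5, 1.0]], dtype=complex)  # a Hermitian Hamiltonian (illustrative)
psi = np.array([0.6, 0.8j])                 # a normalized state vector

# EV[E] = <psi| H |psi>: operate with H on |psi>, then multiply by <psi|
EV = np.vdot(psi, H @ psi)

assert np.isclose(EV.imag, 0.0)             # the expected energy is real
assert 0.5 - 1e-9 <= EV.real <= 1.5 + 1e-9  # and lies between the energy levels
```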

Have fun with it! 🙂

**Post scriptum on Hilbert spaces**:

As mentioned above, our state vectors are actually *functions. *To be specific, they are wavefunctions, i.e.* periodic *functions, evolving in space and time, so we usually write them as ψ = ψ(**x**, t). Our ‘Hilbert space’, i.e. our *collection *of state vectors, is, therefore, often referred to as a *function space*. So it’s a set of functions. At the same time, it is a vector space too, because we have those addition and multiplication operations, so our function space has the *algebraic structure* of a vector space. As you can imagine, there are some mathematical conditions for a space or a set of objects to ‘qualify’ as a Hilbert space, and the epithet itself comes with a lot of interesting properties. One of them is *completeness*, which is a property that allows us to jot down those differential equations that describe the dynamics of a quantum-mechanical system. However, as you can find whatever you’d need or want to know about those mathematical properties on the Web, I won’t get into it. The important thing here is to understand the concept of a Hilbert space *intuitively*. I hope this post has helped you in that regard, at least. 🙂
