Principal value – Reading Feynman

Pre-scriptum (dated 26 June 2020): the material in this post remains interesting but is, strictly speaking, not a prerequisite to understand quantum mechanics. It’s yet another example of how one can get lost in math when studying or teaching physics.

Original post:

In my previous post on this blog, I once again mentioned the issue of multiple-valuedness. It is probably time to deal with the issue once and for all by introducing Riemann surfaces.

Penrose attaches a lot of importance to these Riemann surfaces (so I must assume they are very important). In contrast, in their standard textbook on complex analysis, Brown and Churchill note that the two sections on Riemann surfaces are not essential reading, as it’s just ‘a geometric device’ to deal with multiple-valuedness. But so let’s go for it.

I already signaled that complex powers w = z^c are multiple-valued functions of z and so that causes all kinds of problems, because we can’t do derivatives and integrals and all that. In fact, z^c = e^{c log z} and so we have two components on the right-hand side of this equation. The first one is the (complex) exponential function e^c, i.e. the real number e raised to a complex power c. We already know (see the other posts below) that this is a periodic function with (imaginary) period 2πi: e^c = e^{c+2πi} = e^{2πi}·e^c = 1·e^c. While this periodic component of z^c is somewhat special (as compared to exponentiation in real analysis), it is not this periodic component but the log z component which is causing the problem of multiple-valuedness. [Of course, it’s true that the problem of multiple-valuedness of the log function is, in fact, a logical consequence of the periodicity of the complex exponential function, but so you can figure that out yourself I guess.] So let’s look at that log z function once again.
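Here is a quick numerical check of the two claims above, as a minimal sketch using Python’s standard cmath module. The exponent c, the base z = 4 and the helper name power_values are arbitrary choices I made for the illustration, nothing more.

```python
import cmath
import math

# Arbitrary exponent, chosen for illustration only.
c = 0.5 + 1.25j

# Periodicity: shifting the exponent by 2*pi*i leaves e^c unchanged
# (up to floating-point rounding).
same = abs(cmath.exp(c) - cmath.exp(c + 2j * math.pi)) < 1e-12
print(same)

def power_values(z, c, ns):
    """Values of z^c = e^{c log z}, one per choice log z = Log z + 2*pi*i*n."""
    return [cmath.exp(c * (cmath.log(z) + 2j * math.pi * n)) for n in ns]

# For c = 1/2, the different logarithms of z = 4 give the two square roots:
# roughly +2 for n = 0, roughly -2 for n = 1, and roughly +2 again for n = 2.
roots = power_values(4, 0.5, [0, 1, 2])
print(roots)
```

So the exponential itself is perfectly single-valued; it is the choice of n in the logarithm that produces the different values of z^c.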

If we write z in its polar form z = re^{iθ}, then log z will be equal to log z = ln r + i(θ+2nπ) with n = 0, ±1, ±2,… Hence, if we write log z in rectangular coordinates (i.e. log z = x + iy), then we note that the x component (i.e. the real part) of log z is equal to ln r and, hence, x is just an ordinary real number with some fixed value (x = ln r). However, the y component (i.e. the imaginary part of log z) does not have any fixed value: θ is just one of the values, but so are θ+2π and θ – 2π and θ+4π etcetera. In short, we have an infinite number of values for y, and so that’s the issue: what do we do with all these values? It’s not a proper function anymore.
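To make that infinite list of values a bit more tangible, here is a small sketch that lists a few of them for one sample point. The function name log_values and the sample point z = 1 + i are my own illustrative choices.

```python
import cmath
import math

def log_values(z, ns):
    """Return ln r + i(theta + 2*n*pi) for each integer n in ns."""
    r, theta = cmath.polar(z)  # theta is the principal angle, in [-pi, pi]
    return [complex(math.log(r), theta + 2 * math.pi * n) for n in ns]

z = 1 + 1j  # arbitrary sample point: r = sqrt(2), theta = pi/4
values = log_values(z, range(-2, 3))
for w in values:
    # Every candidate has the same real part ln r and exponentiates back to z.
    print(w, cmath.exp(w))
```

Each printed pair has the same x = ln r, a y that differs by a multiple of 2π, and an exponential that lands on the same z again (up to rounding).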

Now, this problem of multiple-valuedness is usually solved by just picking a so-called principal value for log z, which is written as Log z = ln r + iθ, and which is defined by mathematicians by imposing the condition that θ takes a value in the interval between -π and +π only (hence, -π < θ ≤ π; if we also want the function to be analytic, we have to leave out the negative real axis itself, so the range becomes -π < θ < π). In short, the mathematicians usually just pretend that the 2nπi thing doesn’t matter.

However, this is not trivial: as we are imposing these restrictions on the value of θ, we are actually defining some new single-valued function Log z = ln r + iθ. This Log z function, then, is a complex-valued function, analytic away from the negative real axis, with two real-valued components: x = ln r and y = θ. So, while x = ln r can take any value on the real axis, we let θ range from -π to +π only (in the usual counterclockwise or ‘positive’ direction, because that happens to be the convention). If we do this, we get a principal value for z^c as well: P.V. z^c = e^{c Log z}, and so we’ve ‘solved’ the problem of multiple values for the function z^c too in this way.
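For what it’s worth, the principal value is also what Python’s standard cmath.log returns, so we can check the formulas numerically. This is only a sketch: the helper name principal_power, the sample point z and the exponent c are my own arbitrary choices, and the comparison with Python’s built-in power operator is an observation for this one sample point.

```python
import cmath
import math

def principal_power(z, c):
    """P.V. z^c = e^{c Log z}, with Log z taken from cmath.log."""
    return cmath.exp(c * cmath.log(z))

z = -1 + 1j      # arbitrary sample point: r = sqrt(2), theta = 3*pi/4
c = 0.5 + 0.3j   # arbitrary complex exponent

Log_z = cmath.log(z)
print(Log_z.real, math.log(abs(z)))  # both are ln r
print(Log_z.imag)                    # theta, between -pi and pi

# Python's built-in power operator agrees with e^{c Log z} for this sample.
print(principal_power(z, c), z ** c)
```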

What we are doing here has a more general significance: we are taking a so-called branch out of a multiple-valued function, in order to make it single-valued and, hence, analytic. To illustrate what is really going on here, let us go back to the original multiple-valued log z = ln r + i(θ+2nπ) function and let’s do away with this integer n by writing log z in the more general form log z = ln r + iΘ. Of course, Θ is equal to θ+2nπ but so we’ll just forget about the θ and, most importantly, about the n, and allow the y component (i.e. the imaginary part) of the complex number log z = x + iy to take on any value Θ in the real field. In other words, we treat this angle Θ just like any other ordinary real number. We can now define branches of log z again, but in a more general way: we can pick any value α and say that the ray Θ = α is a branch cut (the origin, which all such cuts have in common, is the branch point), as it will define a range α < Θ < α + 2π in which, once again, we limit the possible values of log z to just one.

For example, if we choose α = -π, then Θ will range from -π to +π and so then we’re back to log z’s principal branch, i.e. Log z. However, let us now, instead of taking this Log z branch, define another branch – we’ll call it the L(z) branch – by choosing α = 0 and, hence, letting Θ range from 0 to 2π. So we have 0 < Θ < 2π and, of course, you’ll note that this range overlaps with the range that is being used for the principal branch of log z (i.e. Log z). It does, and it’s not a problem. Indeed, for values 0 < Θ < π (i.e. the upper half-plane, where the two ranges overlap) we get the same values Log z = L(z) for log z, and so we are talking about the same function there. In the lower half-plane, the two branches simply differ by 2πi: L(z) = Log z + 2πi.
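The comparison between the two branches can be sketched in a few lines of code. The function name log_branch and the two sample points are my own; note also that the sketch uses the half-open range α ≤ Θ < α + 2π for convenience, which is an assumption on my part, as points exactly on the cut are not part of the branch as defined above.

```python
import cmath
import math

TWO_PI = 2 * math.pi

def log_branch(z, alpha):
    """Branch of log z with alpha <= Theta < alpha + 2*pi.

    Points exactly on the cut ray Theta = alpha are not part of the open
    branch in the text; this sketch does not treat them specially.
    """
    r, theta = cmath.polar(z)
    Theta = alpha + (theta - alpha) % TWO_PI  # shift theta into the chosen range
    return complex(math.log(r), Theta)

def L(z):
    """The branch with alpha = 0, i.e. 0 < Theta < 2*pi off the cut."""
    return log_branch(z, 0.0)

upper = 2j   # a point in the upper half-plane
lower = -2j  # a point in the lower half-plane

print(L(upper) - cmath.log(upper))  # approximately 0: the branches agree
print(L(lower) - cmath.log(lower))  # approximately 2*pi*i
```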

OK. I guess we understand that. So what? Well… The fact is that we have found a very very nice way of illustrating the multiple-valuedness of the log z function and – more importantly – a nice way of ‘solving’ it too. Have a look at the beautiful 3D graph below. It represents the log z function. [Well… Let me be correct and note that, strictly speaking, this particular surface seems to represent the imaginary part of the log z function only, but that’s OK at this stage.]

Huh? What’s happening here? Well, this spiral surface represents the log z function by ‘gluing’ successive log z branches together. I took the illustration from Wikipedia’s article on the complex logarithm and, to explain how this surface has been constructed, let’s start at the origin, which is located right in the center of this graph, between the yellow-green and the red-pinkish sheets (so the horizontal (x, y) plane we start from is not the bottom of this rectangular prism: you should imagine it at its center).

From there, we start building up the first ‘level’ of this graph (i.e. the yellowish level above the origin) as the angle Θ sweeps around the origin, in counterclockwise direction, across the upper half of the complex z plane. So it goes from 0 to π and, when Θ crosses the negative side of the real axis, it has added π to its original value. With ‘original value’, I mean its value when it crossed the positive real axis the previous time. As we’ve just started, Θ was equal to 0. We then go from π to 2π, across the lower half of the complex plane, back to the positive real axis: that gives us the first ‘level’ of this spiral staircase (so the vertical distance reflects the value of Θ indeed, which is the imaginary part of log z). Then we can go around the origin once more, and so Θ goes from 2π to 4π, and so that’s how we get the second ‘level’ above the origin – i.e. the greenish one. But – hey! – how does that work? The angle 2π is the same as zero, isn’t it? And 4π as well, no? Well… No. Not here. It is the same angle in the complex plane, but is not the same ‘angle’ if we’re using it here in this log z = ln r + iΘ function.
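If you want to see this ‘angle that remembers how often we went around’ in numbers, here is a minimal sketch. It walks twice around the unit circle in small steps and accumulates the angle increments, instead of recomputing the principal angle at every point. The function name continuous_angle and the number of steps are arbitrary choices of mine.

```python
import cmath
import math

def continuous_angle(points):
    """Accumulate the angle along a path that never passes through the origin."""
    Theta = cmath.phase(points[0])
    history = [Theta]
    for prev, cur in zip(points, points[1:]):
        # cur / prev has a small principal angle: the increment between samples.
        Theta += cmath.phase(cur / prev)
        history.append(Theta)
    return history

# Two counterclockwise loops on the unit circle, sampled in 400 steps.
steps = 400
path = [cmath.exp(1j * 4 * math.pi * k / steps) for k in range(steps + 1)]
Theta = continuous_angle(path)

print(Theta[steps // 2])      # about 2*pi after the first loop
print(Theta[-1])              # about 4*pi after the second loop
print(cmath.phase(path[-1]))  # about 0: the principal angle has 'forgotten' the loops
```

The last two lines show the point of the whole exercise: the end point of the path is the same point z = 1 in the complex plane, but the accumulated Θ is 4π, not 0.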

Let’s look at the first two levels (so the yellow-green ones) of this 3D graph once again. Let’s start with Θ = 0 and keep Θ fixed at this zero value for a while. The value of log z is then just the real component of this log z = ln r + iΘ function, and so we have log z = ln r + i0 = ln r. This ln r function (or ln(x) as it is written below) is just the (real) logarithmic function, which has the familiar form shown below. I guess there is no need to elaborate on that although I should, perhaps, remind you that r (or x in the graph below) is always some positive real number, as it’s the modulus of a vector – or a vector length if that’s easier to understand. So, while ln(r) can take on any (real-number) value between -∞ and +∞, the argument r is always a positive real number.

Let us now look at what happens with this log z function as Θ moves from 0 to 2π: first through the upper half of the complex z plane, to Θ = π, and then further to 2π through the lower half of the complex plane. That’s less easy to visualize, but the illustration below might help. The circles in the plane below (which is the z plane) represent the real part of log z: these circles are the curves Re(log z) = ln r = constant. In short, when we’re on these circles, going around the origin, we keep r fixed in the z plane (and, hence, ln r is constant indeed) but we let the argument of z (i.e. Θ) vary from 0 to 2π and, hence, the imaginary part of log z (which is equal to Θ) will also vary. On the rays it is the other way around: we let r vary but we keep the argument Θ of the complex number z = re^{iΘ} fixed. Hence, each ray is a curve Im(log z) = Θ = constant, so Θ is some fixed angle in the interval 0 < Θ < 2π.

Let’s now go back to that spiral surface and construct the first level of that surface (or the first ‘sheet’ as it’s often referred to) once again. In fact, there is actually more than one way to construct such a spiral surface: while the spiral ramp above seems to depict the imaginary part of log z only, the vertical distance on the illustration below includes both the real as well as the imaginary part of log z (i.e. Re log z + Im log z = ln r + Θ).

Again, we start at the origin, which is, again, the center of this graph (there is a zero (0) marker nearby, but that’s actually just the value of Θ on that ray (Θ = 0), not a marker for the origin point). If we move outwards from the center, i.e. from the origin, on the horizontal two-dimensional z = x + iy = (x,y) plane but along the ray Θ = 0, then we again have log z = ln r + i0 = ln r. So, looking from above, we would see an image resembling the illustration above: we move on a circle around the origin if we keep r constant, and we move on rays if we keep Θ constant. So, in this case, we fix the value of Θ at 0 and move out on a ray indeed and, in three dimensions, the shape of that ray reflects the ln r function. As we then become somewhat more adventurous and start moving around the origin, rather than just moving away from it, the iΘ term in this ln r + iΘ function kicks in and the imaginary part of w (i.e. Im(log z) = y = Θ) grows. To be precise, the value 2π gets added to y with every loop around the origin as we go around it. You can actually ‘measure’ this distance 2π ≈ 6.3 between the various ‘sheets’ on the spiral surface along the vertical coordinate axis (that is if you could read the tiny little figures along the vertical coordinate axis in these 3D graphs, which you probably can’t).

So, by now you should get what’s going on here. We’re looking at this spiral surface and combining both movements now. If we move outwards, away from this center, keeping Θ constant, we can see that the shape of this spiral surface reflects the shape of the ln r function, going to -∞ as we get close to the center of the spiral, and taking on more moderate (positive) values further away from it. So if we move outwards from the center, we get higher up on this surface. We can also see that we move higher up this surface as we move (counterclockwise) around the origin, rather than away from it. Indeed, as mentioned above, the vertical coordinate in the graph above (i.e. the measurements along the vertical axis of the spiral surface) is equal to the sum of Re(log z) and Im(log z). In other words, the ‘z’ coordinate in the Euclidean three-dimensional (x, y, z) space which the illustrations above are using is equal to ln r + Θ, and, hence, as 2π gets added to the previous value of Θ with every turn we’re making around the origin, we get to the next ‘level’ of the spiral, which is exactly 2π higher than the previous level. Vice versa, 2π gets subtracted from the previous value of Θ as we’re going down the spiral, i.e. as we are moving clockwise (or in the ‘negative’ direction as it is aptly termed).
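A few numbers may help here. The sketch below assumes that the vertical coordinate is indeed ln r + Θ, as I read it off the illustration; the function name height and the sample radii are just illustrative choices.

```python
import math

def height(r, Theta):
    """Vertical coordinate assumed for the second illustration: ln r + Theta."""
    return math.log(r) + Theta

r = 1.5  # arbitrary radius
levels = [height(r, 0.25 * math.pi + 2 * math.pi * turns) for turns in range(3)]
for turns, h in enumerate(levels):
    # Each full turn around the origin lifts us by exactly 2*pi.
    print(turns, h)

# Moving outwards along the fixed ray Theta = 0 raises the surface,
# and close to the center it plunges towards minus infinity.
radial = [height(rr, 0.0) for rr in (0.001, 0.1, 1.0, 10.0)]
print(radial)
```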

OK. This has been a very lengthy explanation but so I just wanted to make sure you got it. The horizontal plane is the z plane, so that’s all the points z = x + iy = re^{iθ}, and so that’s the domain of the log z function. And then we have the image of all these points z under the log z function, i.e. the points w = ln r + iΘ right above or right below the z points on the horizontal plane through the origin.

Fine. But so how does this ‘solve’ the problem of multiple-valuedness, apart from ‘illustrating’ it? Well… From the title of this post, you’ll have inferred – and rightly so – that the spiral surface which we have just constructed is one of these so-called Riemann surfaces.

We may look at this Riemann surface as just another complex surface because, just like the complex plane, it is a two-dimensional manifold. Indeed, even if we have represented it in 3D, it is not all that different from a sphere as a non-Euclidean two-dimensional surface: we only need two real numbers (r and Θ) to identify any point on this surface and so it’s two-dimensional only indeed (although it has more ‘structure’ than the ‘flat’ complex plane we are used to). It may help to note that there are other surfaces like this, such as the ones below, which are Riemann surfaces for other multiple-valued functions: in this case, the surfaces below are Riemann surfaces for the (complex) square root function (f(z) = z^{1/2}) and the (complex) arcsin(z) function.
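For the square root, the same (r, Θ) bookkeeping shows why its surface needs only two sheets rather than infinitely many. The sketch below is my own illustration, with the hypothetical helper name sqrt_on_surface, and it assumes that we take the square root of a point (r, Θ) to be √r·e^{iΘ/2}.

```python
import cmath
import math

def sqrt_on_surface(r, Theta):
    """Square root of the surface point (r, Theta): sqrt(r) * e^{i Theta / 2}."""
    return math.sqrt(r) * cmath.exp(0.5j * Theta)

r, Theta = 4.0, 0.0
w0 = sqrt_on_surface(r, Theta)                # start: +2
w1 = sqrt_on_surface(r, Theta + 2 * math.pi)  # one loop around the origin: about -2
w2 = sqrt_on_surface(r, Theta + 4 * math.pi)  # two loops: back to about +2
print(w0, w1, w2)
```

One loop flips the sign, and the second loop brings us back to where we started, so after two sheets the surface closes up on itself.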

[Illustration: Riemann surface for arcsin(z)]

Nice graphs, you’ll say but, again, what is this all about? These graphs surely illustrate the problem of multiple-valuedness but so how do they help to solve it? Well… The trick is to use such a Riemann surface as a domain really: now that we’ve got this Riemann surface, we can actually use it as a domain and then log z (or z^{1/2} or arcsin(z) if we use these other Riemann surfaces) will be a nice single-valued (and analytic) function for all points on that surface.
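In code, ‘using the surface as a domain’ boils down to feeding the function a pair (r, Θ) instead of a complex number z. Here is a minimal sketch of that idea; the names SurfacePoint, project and log_on_surface are hypothetical and only serve the illustration.

```python
import cmath
import math
from typing import NamedTuple

class SurfacePoint(NamedTuple):
    """A point on the spiral surface: modulus r > 0 and unrestricted angle Theta."""
    r: float
    Theta: float

def project(p):
    """The point z = r e^{i Theta} of the ordinary complex plane lying under p."""
    return cmath.rect(p.r, p.Theta)

def log_on_surface(p):
    """Single-valued logarithm on the surface: ln r + i Theta."""
    return complex(math.log(p.r), p.Theta)

p = SurfacePoint(2.0, math.pi / 3)
q = SurfacePoint(2.0, math.pi / 3 + 2 * math.pi)  # one sheet higher, same z underneath

print(project(p), project(q))                # the same z, up to rounding
print(log_on_surface(p), log_on_surface(q))  # two different, unambiguous values
```

The two points p and q sit above the same z, but they are different points of the domain, and so there is nothing ambiguous about the function having a different value at each of them.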

Huh? What? […] Hmm… I agree that it looks fishy: we first use the function itself to construct a ‘Riemannian’ surface, and then we use that very same surface as a ‘Riemannian’ domain for the function itself? Well… Yes. As Penrose puts it: “Complex (analytic) functions have a mind of their own, and decide themselves what their domain should be, irrespective of the region of the complex plane which we ourselves may initially have allotted to it. While we may regard the function’s domain to be represented by the Riemann surface associated with the function, the domain is not given ahead of time: it is the explicit form of the function itself that tells us which Riemann surface the domain actually is.”

I guess we’ll have to judge the value of this bright Riemannian idea (Bernhard Riemann had many bright ideas during his short lifetime it seems) when we understand somewhat better why we’d need these surfaces for solving physics problems. Back to Penrose. 🙂

Post scriptum: Brown and Churchill seem to approach the matter of how to construct a Riemann surface somewhat less rigorously than I do, as they do not provide any 3D illustrations but just talk about joining thin sheets, by cutting them along the positive half of the real axis and then joining the lower edge of the slit of the first sheet to the upper edge of the slit in the second sheet. This should be done, obviously, by making sure there is no (additional) tearing of the original sheet surfaces and all that (so we’re talking ‘continuous deformations’ I guess), but so that could be done, perhaps, without creating that ‘tornado vortex’ around the vertical axis, which you can clearly see in that gray 3D graph above. If we don’t include the ln r term in the definition of the ‘z’ coordinate in the Euclidean three-dimensional (x, y, z) space which the illustrations above are using, then we’d have a spiral ramp without a ‘hole’ in the center. However, that being said, in order to construct a ‘proper’ two-dimensional manifold, we would probably need some kind of function of r in the definition of ‘z’. In fact, we would probably need to write r as some function of Θ in order to make sure we’ve got a proper analytic mapping. I won’t go into detail here (because I don’t know the detail) but leave it to you to check it out on the Web: just check on various parametric representations of spiral ramps: there’s usually (and probably always) a connection between Θ and how, and also how steep, spiral ramps climb around their vertical axis.

Riemann surfaces (I)