I got stuck in Penrose’s Road to Reality in chapter 5 already. That is not very encouraging – because the book has 34 chapters, and every chapter builds on the previous one, and usually also looks more difficult than the previous one.
In Chapter 5, Penrose introduces complex algebra. As I tried to get through it, I realized I had to do some more self-study. Indeed, while Penrose claims no other books or courses are needed to get through his, I do not find this to be the case. So I bought a fairly standard course in complex analysis (James Ward Brown and Ruel V. Churchill, Complex variables and applications) and I’ve done chapter 1 and 2 now. Although these first two chapter do not nothing else than introduce the subject-matter, I find the matter rather difficult and the terminology confusing. Examples:
1. The term ‘scalar’ is used to denote real numbers. So why use the term ‘scalar’ if the word ‘real’ is available as well? And why not use the term ‘real field’ instead of ‘scalar field’? Well… The term ‘real field’ actually means something else. A scalar field associates a (real) number to every point in space. So that’s simple: think of temperature or pressure. The term ‘scalar’ is said to be derived from ‘scaling’: a scalar is that what scales vectors. Indeed, scalar multiplication of a vector and a real number multiplies the magnitude of the vector without changing its direction. So what is a real field then? Well… A (formally) real field is a field that can be extended with a (not necessarily unique) ordering which makes it an ordered field. Does that help? Somewhat I guess. But why the qualifier ‘formally real’? I checked and there is no such thing as an ‘informally real’ field. I guess it’s just to make sure we know what we are talking about, as ‘real’ is a word with many meanings.
2. So what’s a field in mathematics? It is an algebraic structure: a set of ‘things’ (like numbers) with operations defined on it, including the notions of addition, subtraction, multiplication, and division. As mentioned above, we have scalar fields and vector fields. In addition, we also have fields of complex numbers. We also have fields with some less likely candidates for addition and multiplication, such as functions (one can add and multiply functions with each other). In short, anything which satisfies the formal definition of a field – and here I should note that the above definition of a field is not formal – is a field. For example, the set of rational numbers satisfies the definition of a field too. So what is the formal definition? First of all, a field is a ring. Huh? Here we are in this abstract classification of algebraic structures: commutative groups, rings, fields, etcetera (there are also modules – a type of algebraic structure which I had never ever heard of before). To put it simply – because we have to move on of course – a ring (no one seems to know where that word actually comes from) has addition and multiplication only, while a field has division too. In other words, a ring does not need to have multiplicative inverses. Huh? It’s simply really: the integers form a ring, but the equation 2x = 1 does not have a solution in integers (x = ½) and, hence, the integers do not form a field. The same example shows why rational numbers do.
3. But what about a vector field? Can we do division with vectors? Yes, but not by zero – but that is not a problem as that is understood in the definition of a field (or in the general definition of division for that matter). In two-dimensional space, we can represent vectors by complex numbers: z = (x,y), and we have a formula for the so-called multiplicative inverse of a complex number: z-1 = (x/x2+y2, -y/x2+y2). OK. That’s easy. Let’s move on to more advanced stuff.
4. In logic, we have the concept of well-formed formulas (wffs). In math, we have the concept of ‘well-behaved’: we have well-behaved sets, well-behaved spaces and lots of other well-behaved things, including well-behaved functions, which are, of course, those of interest to engineers and scientists (and, hence, in light of the objective of understanding Penrose’s Road to Reality, to me as well). I must admit that I was somewhat surprised to learn that ‘well-behaved’ is one of the very few terms in math that have no formal definition. Wikipedia notes that its definition, in the science of mathematics that is, depends on ‘mathematical interest, fashion, and taste’. Let me quote in full here: “To ensure that an object is ‘well-behaved’ mathematicians introduce further axioms to narrow down the domain of study. This has the benefit of making analysis easier, but cuts down on the generality of any conclusions reached. […] In both pure and applied mathematics (optimization, numerical integration, or mathematical physics, for example), well-behaved means not violating any assumptions needed to successfully apply whatever analysis is being discussed. The opposite case is usually labeled pathological.” Wikipedia also notes that “concepts like non-Euclidean geometry were once considered ill-behaved, but are now common objects of study.”
5. So what is a well-behaved function? There is actually a whole hierarchy, with varying degrees of ‘good’ behavior, so one function can be more ‘well-behaved’ than another. First, we have smooth functions: a smooth function has derivatives of all orders (as for its name, it’s actually well chosen: the graph of a smooth function is actually, well, smooth). Then we have analytic functions: analytic functions are smooth but, in addition to being smooth, an analytic function is a function that can be locally given by a convergent power series. Huh? Let me try an alternative definition: a function is analytic if and only if its Taylor series about x0 converges to the function in some neighborhood for every x0 in its domain. That’s not helping much either, is it? Well… Let’s just leave that one for now.
In fact, it may help to note that the authors of the course I am reading (J.W. Brown and R.V. Churchill, Complex Variables and Applications) use the terms analytic, regular and holomorphic as interchangeable, and they define an analytic function simply as a function which has a derivative everywhere. While that’s helpful, it’s obviously a bit loose (what’s the thing about the Taylor series?) and so I checked on Wikipedia, which clears the confusion and also defines the terms ‘holomorphic’ and ‘regular’:
“A holomorphic function is a complex-valued function of one or more complex variables that is complex differentiable in a neighborhood of every point in its domain. The existence of a complex derivative in a neighborhood is a very strong condition, for it implies that any holomorphic function is actually infinitely differentiable and equal to its own Taylor series. The term analytic function is often used interchangeably with ‘holomorphic function’ although the word ‘analytic’ is also used in a broader sense to describe any function (real, complex, or of more general type) that can be written as a convergent power series in a neighborhood of each point in its domain. The fact that the class of complex analytic functions coincides with the class of holomorphic functions is a major theorem in complex analysis.”
Wikipedia also adds following: “Holomorphic functions are also sometimes referred to as regular functions or as conformal maps. A holomorphic function whose domain is the whole complex plane is called an entire function. The phrase ‘holomorphic at a point z0’ means not just differentiable at z0, but differentiable everywhere within some neighborhood of z0 in the complex plane.”
6. What to make of all this? Differentiability is obviously the key and, although there are many similarities between real differentiability and complex differentiability (both are linear and obey the product rule, quotient rule, and chain rule), real-valued functions and complex-valued functions are different animals. What are the conditions for differentiability? For real-valued functions, it is a matter of checking whether or not the limit defining the derivative exists and, of course, a necessary (but not sufficient) condition is continuity of the function.
For complex-valued functions, it is a bit more sophisticated because we’ve got so-called Cauchy-Riemann conditions applying here. How does that work? Well… We write f(z) as the sum of two functions: f(z) = u(x,y) + iv(x,y). So the real-valued function u(x,y) yields the real part of f(z), while v(x,y) yields the imaginary part of f(z). The Cauchy-Riemann equations (to be interpreted as conditions really) are the following: ux = vy and uy = -vx (note the minus sign in the second equation).
That looks simple enough, doesn’t it? However, as Wikipedia notes (see the quote above), differentiability at a point z0 is not enough (to ensure the existence of the derivative of f(z) at that point). We need to look at some neighborhood of the point z0 and see if these first-order derivatives (ux, uy, vx and vy) exist everywhere in that neighborhood and satisfy these Cauchy-Riemann equations. So we need to look beyond the point z0 itself hen doing our analysis: we need to ‘approach’ it from various directions before making any judgment. I know this sounds like Chinese but it became clear to me when doing the exercises.
7. OK. Phew! I got this far – but so that’s only chapter 1 and 2 of Brown & Churchill’s course ! In fact, chapter 2 also includes a few sections on so-called harmonic functions and harmonic conjugates. Let’s first talk about harmonic functions. Harmonic functions are even better behaved than holomorphic or analytic functions. Well… That’s not the right way to put it really. A harmonic function is a real-valued analytic function (its value could represent temperature, or pressure – just as an example) but, for a function to qualify as ‘harmonic’, an additional condition is imposed. That condition is known as Laplace’s equation: if we denote the harmonic function as H(x,y), then it has to have second-order derivatives which satisfies Hxx(x,y) + Hyy(x,y) = 0.
Huh? Laplace’s equation, or harmonic functions in general, plays an important role in physics, as the condition that is being imposed (the Laplace equation) often reflects a real-life physical constraint and, hence, the function H would describe real-life phenomena, such as the temperature of a thin plate (with the points on the plate defined by the (x,y) coordinates), or electrostatic potential. More about that later. Let’s conclude this first entry with the definition of harmonic conjugates.
8. As stated above, a harmonic function is a real-valued function. However, we also noted that a complex function f(z) can actually be written as a sum of a real and imaginary part using two real-valued functions u(x,y) and v(x,y). More in particular, we can write f(z) = u(x,y) + iv(x,y), with i the imaginary number (0,1). Now, if u and v would happen to be harmonic functions (but so that’s an if of course – see the Laplace condition imposed on their second-order derivatives in order to qualify for the ‘harmonic’ label) and, in addition to that, if their first-order derivatives would happen to satisfy the Cauchy-Riemann equations (in other words, f(z) should be a well-behaved analytic function), then (and only then) we can label v as the harmonic conjugate of u.
What does that mean? First, one should note that when v is a harmonic conjugate of u in some domain, it is not generally true that u is a harmonic conjugate of v. So one cannot just switch the functions. Indeed, the minus sign in the Cauchy–Riemann equations makes the relationship asymmetric. But so what’s the relevance of this definition of a harmonic conjugate? Well… There is a theorem that turns the definition around: ‘A function f(z) = u(x,y) + iv(x,y) is analytic (or holomorphic to use standard terminology) in a domain D if and only if v is a harmonic conjugate of u. In other words, introducing the definition of a harmonic conjugate (and the conditions which their first- and second-order derivatives have to satisfy), allows us to check whether or not we have a well-behaved complex-valued function (and with ‘well-behaved’ I mean analytic or holomorphic).
9. But, again, why do we need holomorphic functions? What’s so special about them? I am not sure for the moment, but I guess there’s something deeper in that one phrase which I quoted from Wikipedia above: “holomorphic functions are also sometimes referred to as regular functions or as conformal maps.” A conformal mapping preserves angles, as you can see on the illustration below, which shows a rectangular grid and its image under a conformal map: f maps pairs of lines intersecting at 90° to pairs of curves still intersecting at 90°. I guess that’s very relevant, although I do not know why exactly as for now. More about that in later posts.