Diffraction gratings are fascinating. The iridescent reflections from the grooves of a compact disc (CD), or from oil films, soap bubbles: it is all the same principle (or closely related – to be precise). In my April, 2014 posts, I introduced Feynman’s ‘arrows’ to explain it. Those posts talked about probability amplitudes, light as a bundle of photons, quantum electrodynamics. They were not wrong. In fact, the quantum-electrodynamical explanation is actually the only one that’s 100% correct (as far as we ‘know’, of course). But it is also more complicated than the classical explanation, which just explains light as waves.
To understand the classical explanation, one first needs to understand how electromagnetic waves interfere. That’s easy, you’ll say. It’s all about adding waves, isn’t it? And we have done that before, haven’t we? Yes. We’ve done it for sinusoidal waves. We also noted that, from a math point of view, the easiest way to go about it was to use vectors or complex numbers, and equate the real parts of the complex numbers with the actual physical quantities, i.e. the electric field in this case.
You’re right. Let’s continue to work with sinusoidal waves, but instead of having just two waves, we’ll consider a whole array of sources, because that’s what we’ll need to analyze when analyzing a diffraction grating.
First the simple case: two sources
Let’s first re-analyze the simple situation: two sources – or two dipole radiators as I called them in my previous post. The illustration below gives a top view of two such oscillators. They are separated, in the north-south direction, by a distance d.
Is that realistic? It is for radio waves: the wavelength of a 1 megahertz radio wave is 300 m (remember: λ = c/f). So, yes, we can separate two sources by a distance in the same order of magnitude as the wavelength of the radiation, but, as Feynman writes: “We cannot make little optical-frequency radio stations and hook them up with infinitesimal wires and drive them all with a given phase.”
For light, it will work differently – and we’ll describe how, but not now. As for now, we should continue with our radio waves.
The illustration above assumes that the radiation from the two sources is sinusoidal and has the same (maximum) amplitude A, but that the two sources might be out of phase: we’ll denote the difference by α. Hence, we can represent the radiation emitted by the two sources by the real part of the complex numbers A·e^(iωt) and A·e^(i(ωt + α)) respectively. Now, we can move our detector around to measure the intensity of the radiation from these two antennas. If we place our detector at some point P, sufficiently far away from the sources, then the angle θ will result in another phase difference, due to the difference in distance from point P to the two oscillators. From simple geometry, we know that this difference will be equal to d·sinθ. The phase difference due to the distance difference will then be equal to the product of the wave number k (i.e. the rate of change of the phase, expressed in radians, with distance, i.e. per meter) and that distance d·sinθ. So the phase difference at arrival (i.e. at point P) would be
Φ₂ – Φ₁ = α + k·d·sinθ = α + (2π/λ)·d·sinθ
That’s pretty obvious, but let’s play a bit with this, in order to make sure we understand what’s going on. The illustration below gives two examples: α = 0 and α = π.
How do we get these numbers 0, 2 and 4, which indicate the intensity, i.e. the amount of energy that the field carries past per second, which is proportional to the square of the field, averaged in time? [If it were (visible) light, instead of radio waves, the intensity would be the brightness of the light.]
Well… In the first case, we have α = 0 and d = λ/2 and, hence, at an angle of 30 degrees, we have d·sin(30°) = (λ/2)(1/2) = λ/4. Therefore, Φ₂ – Φ₁ = α + (2π/λ)·d·sinθ = 0 + (2π/λ)·(λ/4) = π/2. So what? Well… Let’s add the waves. We will have some combined wave with amplitude A_R and phase Φ_R:
A_R·e^(i(ωt + Φ_R)) = A₁·e^(i(ωt + Φ₁)) + A₂·e^(i(ωt + Φ₂))
Now, to calculate the length of this ‘vector’, i.e. the amplitude A_R, we take the product of this complex number and its complex conjugate, and that will give us the length squared, and then we multiply it all out and so on and so on. To make a long story short, we’ll find that
A_R² = A₁² + A₂² + 2·A₁·A₂·cos(Φ₂ – Φ₁)
The last term in this sum is the interference effect, and so that’s equal to zero in the case we’ve been studying above (α = 0, d = λ/2 and θ = 30°), so we get twice the intensity of one oscillator only. The other cases can be worked out in the same way.
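To check those numbers 0, 2 and 4, here is a minimal Python sketch of the formula above. The function name two_source_intensity and its parameters are my own labels, and I am assuming the same spacing d = λ/2 for the α = π case as for the α = 0 case. Both amplitudes are set to 1, so one oscillator on its own gives an intensity of 1.

```python
import math

def two_source_intensity(alpha, d_over_lambda, theta_deg, a1=1.0, a2=1.0):
    """Intensity of two interfering sources, in units where a single
    unit-amplitude source gives 1 (illustrative helper, not a library call)."""
    # Phase difference at the detector: intrinsic part plus path-length part
    phi = alpha + 2 * math.pi * d_over_lambda * math.sin(math.radians(theta_deg))
    # A_R^2 = A_1^2 + A_2^2 + 2*A_1*A_2*cos(phi_2 - phi_1)
    return a1**2 + a2**2 + 2 * a1 * a2 * math.cos(phi)

for alpha in (0.0, math.pi):
    for theta_deg in (0, 30, 90):
        intensity = two_source_intensity(alpha, 0.5, theta_deg)
        print(f"alpha = {alpha:.2f} rad, theta = {theta_deg:2d} deg: I = {intensity:.2f}")
```

For α = 0 the three angles should give intensities close to 4, 2 and 0; for α = π the order is reversed. The interference term is the only thing that changes from one line to the next.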
Now, you should not think that the pattern is always symmetric, or simple, as the two illustrations below make clear.
With more oscillators, the patterns become even more interesting. The illustration below shows part of the intensity pattern of a six-dipole antenna array:
Let’s look at that now indeed: arrays with n oscillators.
Arrays with n oscillators
If we have six oscillators, like in the illustration above, we have to add something like this:
R = A[cos(ωt) + cos(ωt + Φ) + cos(ωt + 2Φ) + … + cos(ωt + 5Φ)]
From what we wrote above, it is obvious that the phase difference Φ can have two causes: the oscillators may be driven differently in phase, or we may be looking at them at an angle so that there is a difference in time delay. Hence, we have the same formula as the one above:
Φ = α + (2π/λ)·d·sinθ
Now, we have an interesting geometrical approach to finding the net amplitude A_R. We can, once again, consider the various waves as vectors and add them, as shown below.
The length of all vectors is the same (A), and then we have the phase difference, i.e. the different angles: zero for A₁, Φ for A₂, 2Φ for A₃, etcetera. So as we’re adding these vectors, we’re going around and forming an equiangular polygon with n sides, with the vertices (corner points) lying on a circle with radius r. It requires just a bit of trigonometry to establish that the following equality must hold: A = 2r·sin(Φ/2). So that fixes r. We also have that the large angle OQT equals nΦ and, hence, A_R = 2r·sin(nΦ/2). We can now combine the results to find the following amplitude and intensity formula:
A_R = A·sin(nΦ/2)/sin(Φ/2) and, hence, I = I₀·sin²(nΦ/2)/sin²(Φ/2)
This formula is obvious for n = 1 and for n = 2: it gives us the results which were shown above already. But here we want to know how this thing behaves for large n. Both the numerator, i.e. sin²(nΦ/2), and the denominator, sin²(Φ/2), are – obviously – smaller than or equal to 1, but the numerator is not always the larger of the two: it drops to zero whenever nΦ/2 is a multiple of π, so the ratio swings between 0 and n². It can be demonstrated that this function of the angle Φ reaches its maximum value for Φ = 0. Indeed, taking the limit gives us I = I₀·n². [We can intuitively see this because, for small angles expressed in radians, we can replace sin(Φ/2) and sin(nΦ/2) by Φ/2 and nΦ/2, and then the (Φ/2)² factor cancels to leave n².]
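The geometrical argument can also be checked numerically. The sketch below, with function names of my own choosing, adds the n arrows as complex numbers and compares the squared length of the sum with the ratio of the two squared sines. The values n = 6 and Φ = 0.7 are arbitrary.

```python
import cmath
import math

def intensity_from_arrows(n, phi, a=1.0):
    """Add n arrows of length a, each rotated by phi relative to the
    previous one, and return the squared length of the resultant."""
    resultant = sum(a * cmath.exp(1j * j * phi) for j in range(n))
    return abs(resultant) ** 2

def intensity_closed_form(n, phi, a=1.0):
    """I = I0 * sin^2(n*phi/2) / sin^2(phi/2), with I0 = a^2."""
    s = math.sin(phi / 2)
    if abs(s) < 1e-12:
        # Limit in which the ratio of the sines tends to n^2
        return (n * a) ** 2
    return (a * math.sin(n * phi / 2) / s) ** 2

n, phi = 6, 0.7
print(intensity_from_arrows(n, phi), intensity_closed_form(n, phi))
print(intensity_closed_form(n, 0.0))  # all arrows aligned: n^2 times I0
```

The two numbers on the first line should agree to within rounding error, which is all the polygon construction claims.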
It’s a bit more difficult to understand what happens next. If Φ becomes a bit larger, the ratio of the two sines begins to fall off (so it becomes smaller than n²). Note that the numerator, i.e. sin²(nΦ/2), will be equal to one if nΦ/2 = π/2, i.e. if Φ = π/n, and the ratio sin²(nΦ/2)/sin²(Φ/2) then becomes sin²(π/2)/sin²(π/(2n)) = 1/sin²(π/(2n)). Again, if we assume that n is (very) large, we can approximate and write that this ratio is more or less equal to 1/(π²/(4n²)) = 4n²/π². That means that the intensity there will be 4/π² times the intensity of the beam at the maximum, i.e. 40.53% of it. That’s the point at nΦ/2π = 0.5 on the graph below.
The graph above has a re-scaled vertical as well as a re-scaled horizontal axis. Indeed, instead of I, the vertical axis shows I/(n²·I₀), so the maximum value is 1. And the horizontal axis does not show Φ but nΦ/2π, so if Φ = π/n, then nΦ/2π = 0.5 indeed. [Don’t worry about the dotted curve: that’s the solid-line curve multiplied by 10: it’s there to make sure you see what’s going on, as this ratio of those sines becomes very small very rapidly indeed.]
So, once we’re past that 40.53% point, we arrive at our first minimum, which is reached at nΦ/2π = 1 or Φ = 2π/n. The numerator sin²(nΦ/2) equals sin²(π) = 0 there indeed, so the whole ratio becomes zero. Then it goes up again, to our second maximum, which we get when our numerator comes close to one again, i.e. when sin²(nΦ/2) ≈ 1. That happens when nΦ/2 = 3π/2, or Φ = 3π/n. Again, when n is (very) large, Φ will be very small, and so we can replace the denominator sin²(Φ/2) by Φ²/4 = 9π²/(4n²). We then get a ratio equal to 1/(9π²/(4n²)) = 4n²/(9π²), or an intensity equal to 4n²·I₀/(9π²), i.e. only 4.5% of the intensity at the central maximum. So that’s tiny. [Well… All is relative, of course. :-)] We can go on and on like that but that’s not the point here: the point is that we have a very sharp central maximum with very weak subsidiary maxima on the sides.
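Those two percentages are easy to verify. Here is a small sketch that evaluates the re-scaled intensity I/(n²·I₀) at a few points on the re-scaled axis u = nΦ/2π. The helper name and the choice n = 1000 are mine: any large n should do.

```python
import math

def normalized_intensity(n, u):
    """I / (n^2 * I0) as a function of u = n*phi/(2*pi), the horizontal
    axis of the graph (helper name is illustrative)."""
    phi = 2 * math.pi * u / n
    s = math.sin(phi / 2)
    if abs(s) < 1e-12:
        # Principal maximum: the ratio of the sines tends to n^2
        return 1.0
    return (math.sin(n * phi / 2) / (n * s)) ** 2

n = 1000  # arbitrary "large" number of oscillators
for u in (0.0, 0.5, 1.0, 1.5):
    print(f"u = {u:.1f}: I/(n^2 I0) = {normalized_intensity(n, u):.4f}")
```

The values at u = 0.5 and u = 1.5 should come out close to 4/π² and 4/(9π²) respectively, and the value at u = 1 should be zero to within rounding error.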
But what about that big lobe at 30 degrees on that graph with the six-dipole antenna? Relax. We’re not done yet with this ‘quick’ analysis. Let’s look at the general case from yet another angle, so to say. 🙂
The general case
To focus our minds, we’ve depicted that array with n oscillators below. Once again, we note that the phase difference between two sources, one to the next, will depend on (1) the intrinsic phase difference between them, which we denote by α, and (2) the time delay because we’re observing the system in a given direction θ from the normal, which effect we calculated as equal to (2π/λ)·d·sinθ. So the whole effect is Φ = α + (2π/λ)·d·sinθ = α + k·d·sinθ, with k the wave number.
To make things simple, let’s first assume that α = 0. We’re then in the case that we described above: we’ll have a sharp maximum at Φ = 0, so that means θ = 0. It’s easy to see why: all oscillators are in phase and so we have maximum positive (or constructive) interference.
Let’s now examine the first minimum. When looking back at that geometrical interpretation, with the polygon, all the arrows come back to the starting point: we’ve completed a full circle. Indeed, n times Φ gives nΦ = n·2π/n = 2π. So what’s going on here? Well… If we put that value in our formula Φ = α + (2π/λ)·d·sinθ, we get 2π/n = 0 + (2π/λ)·d·sinθ or, getting rid of the 2π factor, n·d·sinθ = λ.
Now, n·d is the total length of the array, i.e. L, and, from the illustration above, we see that n·d·sinθ = L·sinθ = Δ. So we have that n·d·sinθ = λ = Δ. Hence, Δ is equal to one wavelength. That means that the total phase difference between the first and the last oscillator is equal to 2π, and the contributions of all the oscillators in-between are uniformly distributed in phase between 0° and 360°. The net result is a vector A_R with length zero and, hence, the intensity is zero as well.
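To put a number on that first minimum, the sketch below takes a hypothetical array of six oscillators spaced half a wavelength apart, with α = 0. These values and the function name are assumptions for illustration only. It computes the angle at which n·d·sinθ = λ and confirms that the six arrows, spread evenly over 360°, add up to nothing.

```python
import cmath
import math

def first_null_angle_deg(n, d, wavelength):
    """Angle of the first minimum next to the zero-order beam, from
    n*d*sin(theta) = lambda, assuming alpha = 0."""
    return math.degrees(math.asin(wavelength / (n * d)))

n, wavelength = 6, 1.0
d = wavelength / 2          # hypothetical half-wavelength spacing
phi = 2 * math.pi / n       # phase step between neighbours at the null

# Phase of each oscillator's contribution, and the length of their sum
phases_deg = [math.degrees(j * phi) for j in range(n)]
resultant = sum(cmath.exp(1j * j * phi) for j in range(n))

print(first_null_angle_deg(n, d, wavelength))
print(phases_deg)
print(abs(resultant))
```

The last number printed should be zero up to rounding error: the polygon of arrows has closed on itself.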
OK, you’ll say, you’re just repeating yourself here. What about the other lobe or lobes? Well… Let’s go back to that maximum. We had it at Φ = 0, but we will also have it at Φ = 2π, and at Φ = 4π, and at Φ = 6π etcetera, etcetera. We’ll have such sharp maximum – the maximum, in fact – at any Φ = m⋅2π, where m is any integer. Now, plugging that into the Φ = α + (2π/λ)·d·sinθ formula (again, assuming that α = 0), we get m⋅2π = (2π/λ)·d·sinθ or d·sinθ = mλ.
While that looks very similar to our n·d·sinθ = λ = Δ condition for the (first) minimum, we’re not looking at that Δ here but at δ, the path difference between two adjacent sources, and so we have δ = Δ/n = d·sinθ = mλ. What’s being said here is that each successive source is out of phase by m times 360° and, because being out of phase by a multiple of 360° obviously means that you’re in phase once again, all sources are, once again, contributing in phase and produce a maximum that is just as good as the one we had for m = 0. Now, each of these maxima will also have a (first) minimum next to it, described by that other formula above, and so that’s how we get that pattern of lobes with weak ‘side lobes’.
Now, the conditions presented above for maxima and minima obviously all depend on the distance d, i.e. the spacing of the array, and the wavelength λ. That brings us to an interesting point: if d is smaller than λ (so if the spacing is smaller than one wavelength), then (d/λ)·sinθ is always smaller than 1 in absolute value, so the condition (d/λ)·sinθ = m has only one solution: m = 0. So we only have one beam in that case, the so-called zero-order beam centered at θ = 0. [Note that we also have a beam in the opposite direction.]
The point to note is that we can only have subsidiary great maxima if the spacing d of the array is greater than the wavelength λ. If we have such subsidiary great maxima, we’ll call them first-order, second-order etcetera beams, according to the value m.
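The condition d·sinθ = mλ is simple enough to turn into a few lines of Python. The sketch below assumes α = 0 and uses a helper name of my own. It lists every order m that fits for a given spacing, together with the direction of its beam.

```python
import math

def principal_maxima(d, wavelength):
    """Return (m, theta in degrees) for every order m that satisfies
    d*sin(theta) = m*lambda with |sin(theta)| <= 1, assuming alpha = 0."""
    m_max = math.floor(d / wavelength)
    beams = []
    for m in range(-m_max, m_max + 1):
        s = m * wavelength / d
        s = max(-1.0, min(1.0, s))  # guard against rounding just past 1
        beams.append((m, math.degrees(math.asin(s))))
    return beams

print(principal_maxima(0.5, 1.0))  # spacing below one wavelength
print(principal_maxima(2.5, 1.0))  # spacing of two and a half wavelengths
```

With d = 0.5λ only the zero-order beam survives, while d = 2.5λ gives five beams, from m = -2 to m = +2.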
We are now, finally, ready to discuss diffraction gratings. A diffraction grating, in its simplest form, is a plane glass sheet with scratches on it: several hundred grooves, or several thousand even, to the millimeter. That is because the spacing has to be of the same order of magnitude as the wavelength of light, so that’s 400 to 700 nanometers (nm) indeed – with the 400-500 nm range corresponding to violet-blue light, and the (longer) 700+ nm range corresponding to red light. Remember, a nanometer is a billionth of a meter (1×10⁻⁹ m), so even one thousandth of a millimeter is 1000 nanometers, i.e. longer than the wavelength of red light. Of course, from what we wrote above, it is obvious that the spacing d must be wider than the wavelength of interest to produce first- and higher-order beams and, therefore, diffraction but, still, the order of magnitude must be the same to produce anything of interest. Isn’t it amazing that scientists were able to do such diffraction experiments as early as the late 18th century? One of the earliest apparatuses, made in 1785 by the first director of the United States Mint, used hair strung between two finely threaded screws. In any case, let’s go back to the physics of it.
In my previous post, I already noted Feynman’s observation that “we cannot literally make little optical-frequency radio stations and hook them up with infinitesimal wires and drive them all with a given phase.” What happens is something similar to the following set-up, and I’ll quote Feynman again (Vol. I, p. 30-3), just because it’s easier to quote than to paraphrase: “Suppose that we had a lot of parallel wires, equally spaced at a spacing d, and a radio-frequency source very far away, practically at infinity, which is generating an electric field which arrives at each one of the wires at the same phase. Then the external electric field will drive the electrons up and down in each wire. That is, the field which is coming from the original source will shake the electrons up and down, and in moving, these represent new generators. This phenomenon is called scattering: a light wave from some source can induce a motion of the electrons in a piece of material, and these motions generate their own waves.”
When Feynman says “light” here, he means electromagnetic radiation in general. So what’s happening with visible light? Well… All of the glass in the piece that makes up our diffraction grating scatters light, but the notches in it scatter differently than the rest of the glass. The light going through the ‘rest of the glass’ goes straight through (a phenomenon which deserves an explanation in itself, but we won’t give it here), while the notches act as sources and produce secondary or even tertiary beams. That is illustrated by the picture below, which shows a flash of light seen through such a grating, with three diffracted orders: the order m = 0 corresponds to a direct transmission of light through the grating, while the first-order beams (m = +1 and m = -1) show colors with increasing wavelengths (from violet-blue to red) being diffracted at increasing angles.
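To see why the colors fan out, we can plug visible wavelengths into d·sinθ = mλ. The sketch below assumes a hypothetical grating with 600 grooves per millimeter and light arriving along the normal (α = 0). Neither the groove density nor the function name comes from the picture.

```python
import math

def diffraction_angle_deg(wavelength_nm, grooves_per_mm, m=1):
    """Angle of the order-m beam from d*sin(theta) = m*lambda, for light
    arriving along the normal (alpha = 0). Returns None if that order
    does not exist for this spacing."""
    d_nm = 1e6 / grooves_per_mm  # groove spacing in nanometers
    s = m * wavelength_nm / d_nm
    if abs(s) > 1:
        return None
    return math.degrees(math.asin(s))

grooves_per_mm = 600  # hypothetical grating
for name, wavelength_nm in (("violet", 400), ("red", 700)):
    print(name, diffraction_angle_deg(wavelength_nm, grooves_per_mm))
```

Red light, with the longer wavelength, comes out at a larger angle than violet light, which is the ordering of colors described above. For m = 0 the angle is zero whatever the wavelength, which is why the directly transmitted beam is not split into colors.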
The ‘mechanics’ are very complicated, and the correct explanation in physics involves a good understanding of quantum electrodynamics, which we touched upon in our April, 2014 posts. I won’t do that here, because here we are introducing the so-called classical theory only. This classical theory does away with all of the complexity of a quantum-electrodynamical explanation and replaces it by what is now known as the Huygens-Fresnel Principle, which was first formulated in 1678 (!), and which basically states that “every point which a luminous disturbance reaches becomes a source of a spherical wave, and the sum of these secondary waves determines the form of the wave at any subsequent time.”
This comes from Wikipedia, as do the illustrations below. It does not only ‘explain’ diffraction gratings, but it also ‘explains’ what happens when light goes through a slit, cf. the second (animated) illustration.
Now that, light being diffracted as it is going through a slit, is obviously much more mysterious than a diffraction grating – and, you’ll admit, a diffraction grating is already mysterious enough, because it’s rather strange that only certain points in the grating (i.e. the notches) would act as sources, isn’t it? Now, if that’s difficult to understand, it’s even more difficult to understand why an empty space, i.e. a slit, would act as a diffraction grating! However, because this post has become way too long already, we’ll leave this discussion for later.