3.8: Optional — Integrals in General Coordinates
One of the most important tools used in dealing with single variable integrals is the change of variable (substitution) rule
\begin{align*} x=f(u)\qquad \mathrm{d}x = f'(u)\,\mathrm{d}u \tag{3.8.1} \end{align*}
See Theorems 1.4.2 and 1.4.6 in the CLP-2 text. Expressing multivariable integrals using polar or cylindrical or spherical coordinates is really a multivariable substitution. For example, switching to spherical coordinates amounts to replacing the coordinates x,y,z with the coordinates \rho,\theta,\varphi by using the substitution
\begin{align*} \textbf{X}=\textbf{r}(\rho,\theta,\varphi)\qquad \mathrm{d}x\,\mathrm{d}y\,\mathrm{d}z = \rho^2\sin\varphi\,\mathrm{d}\rho\,\mathrm{d}\theta\,\mathrm{d}\varphi \end{align*}
where
\begin{align*} \textbf{X}=\left \langle x,y,z \right \rangle \quad\text{and}\quad \textbf{r}(\rho,\theta,\varphi) =\left \langle \rho\,\cos\theta\,\sin\varphi\,,\, \rho\,\sin\theta\,\sin\varphi\,,\, \rho\,\cos\varphi \right \rangle \end{align*}
We'll now derive a generalization of the substitution rule 3.8.1 to two dimensions. It will include polar coordinates as a special case. Later, we'll state (without proof) its generalization to three dimensions. It will include cylindrical and spherical coordinates as special cases.
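Before deriving the two dimensional version, here is a quick sanity check of (3.8.1) in code. This SymPy snippet (our own illustration, not part of the CLP text, with an arbitrarily chosen integrand) evaluates one integral directly and then again via the substitution x=u^2.

```python
import sympy as sp

u, x = sp.symbols("u x")

# Evaluate the integral of cos(x) over [0, 1] directly ...
direct = sp.integrate(sp.cos(x), (x, 0, 1))

# ... and again using the substitution x = f(u) = u**2, dx = f'(u) du = 2u du.
# As x runs from 0 to 1, so does u.
f = u**2
substituted = sp.integrate(sp.cos(f) * sp.diff(f, u), (u, 0, 1))

print(direct, substituted)                 # both print sin(1)
print(sp.simplify(direct - substituted))   # 0
```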
Suppose that we wish to integrate over a region, \mathcal{R}, in \mathbb{R}^2 and that we also wish 1 to use two new coordinates, that we'll call u and v, in place of x and y. The new coordinates u, v are related to the old coordinates x, y, by the functions 2
\begin{align*} x&=x(u,v)\\ y&=y(u,v) \end{align*}
To make formulae more compact, we'll define the vector valued function r(u,v) by
\begin{align*} \textbf{r}(u,v) = \left \langle x(u,v)\,,\,y(u,v) \right \rangle \end{align*}
As an example, if the new coordinates are polar coordinates, with r renamed to u and \theta renamed to v, then x(u,v)=u\cos v and y(u,v)=u\sin v.
Note that if we hold v fixed and vary u, then \textbf{r}(u,v) sweeps out a curve. For example, if x(u,v)=u\cos v and y(u,v)=u\sin v, then, if we hold v fixed and vary u, \textbf{r}(u,v) sweeps out a straight line (that makes the angle v with the x-axis), while, if we hold u>0 fixed and vary v, \textbf{r}(u,v) sweeps out a circle (of radius u centred on the origin).
We start by cutting \mathcal{R} (the shaded region in the figure below) up into small pieces by drawing a bunch of curves of constant u (the blue curves in the figure below) and a bunch of curves of constant v (the red curves in the figure below).
Concentrate on any one of the small pieces. Here is a greatly magnified sketch.
For example, the lower red curve was constructed by holding v fixed at the value v_0, varying u and sketching \textbf{r}(u,v_0), and the upper red curve was constructed by holding v fixed at the slightly larger value v_0+\mathrm{d}v, varying u and sketching \textbf{r}(u,v_0+\mathrm{d}v). So the four intersection points in the figure are
\begin{align*} P_2&=\textbf{r}(u_0\,,\,v_0+\mathrm{d}v) & P_3&=\textbf{r}(u_0+\mathrm{d}u\,,\,v_0+\mathrm{d}v)\\ P_0&=\textbf{r}(u_0\,,\,v_0) & P_1&=\textbf{r}(u_0+\mathrm{d}u\,,\,v_0) \end{align*}
Now, for any small constants \mathrm{d}U and \mathrm{d}V, we have the linear approximation 3
\begin{align*} \textbf{r}(u_0+\mathrm{d}U\,,\,v_0+\mathrm{d}V) \approx \textbf{r}(u_0\,,\,v_0) +\frac{\partial \textbf{r}}{\partial u}(u_0\,,\,v_0)\,\mathrm{d}U +\frac{\partial \textbf{r}}{\partial v}(u_0\,,\,v_0)\,\mathrm{d}V \end{align*}
Applying this three times, once with \mathrm{d}U=\mathrm{d}u, \mathrm{d}V=0 (to approximate P_1), once with \mathrm{d}U=0, \mathrm{d}V=\mathrm{d}v (to approximate P_2), and once with \mathrm{d}U=\mathrm{d}u, \mathrm{d}V=\mathrm{d}v (to approximate P_3),
\begin{alignat*}{2} P_0&=\textbf{r}(u_0\,,\,v_0)\\ P_1&=\textbf{r}(u_0+\mathrm{d}u\,,\,v_0) &&\approx \textbf{r}(u_0\,,\,v_0) +\frac{\partial \textbf{r}}{\partial u}(u_0\,,\,v_0)\,\mathrm{d}u\\ P_2&=\textbf{r}(u_0\,,\,v_0+\mathrm{d}v) &&\approx \textbf{r}(u_0\,,\,v_0) +\frac{\partial \textbf{r}}{\partial v}(u_0\,,\,v_0)\,\mathrm{d}v\\ P_3&=\textbf{r}(u_0+\mathrm{d}u\,,\,v_0+\mathrm{d}v) &&\approx \textbf{r}(u_0\,,\,v_0) +\frac{\partial \textbf{r}}{\partial u}(u_0\,,\,v_0)\,\mathrm{d}u +\frac{\partial \textbf{r}}{\partial v}(u_0\,,\,v_0)\,\mathrm{d}v \end{alignat*}
We have dropped all Taylor expansion terms that are of degree two or higher in \mathrm{d}u, \mathrm{d}v. The reason is that, in defining the integral, we take the limit \mathrm{d}u,\mathrm{d}v\rightarrow 0. Because of that limit, all of the dropped terms contribute exactly 0 to the integral. We shall not prove this. But we shall show, in the optional §3.8.1, why this is the case.
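To get a concrete feel for how small the dropped terms are, here is a short NumPy sketch (our own illustration, reusing the polar coordinate map \textbf{r}(u,v)=\left \langle u\cos v, u\sin v \right \rangle from above). It compares \textbf{r}(u_0,v_0+\mathrm{d}v)-\textbf{r}(u_0,v_0) with its linear approximation; halving \mathrm{d}v roughly quarters the error, as expected of a degree two term.

```python
import numpy as np

def r(u, v):
    # The polar coordinate map used in the text: x = u cos(v), y = u sin(v).
    return np.array([u * np.cos(v), u * np.sin(v)])

def dr_dv(u, v):
    # Partial derivative of r with respect to v.
    return np.array([-u * np.sin(v), u * np.cos(v)])

u0, v0 = 1.0, 0.7
for dv in [0.1, 0.05, 0.025]:
    exact = r(u0, v0 + dv) - r(u0, v0)
    linear = dr_dv(u0, v0) * dv
    err = np.linalg.norm(exact - linear)
    # The ratio err/dv**2 settling down to a constant signals a degree two error.
    print(f"dv = {dv:5.3f}   |error| = {err:.2e}   |error|/dv**2 = {err / dv**2:.3f}")
```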
The small piece of \mathcal{R} with corners P_0, P_1, P_2, P_3 is approximately a parallelogram with sides
\begin{alignat*}{2} \overrightarrow{P_0P_1} \approx \overrightarrow{P_2P_3} &\approx \frac{\partial \textbf{r}}{\partial u}(u_0\,,\,v_0)\,\mathrm{d}u &&= \left \langle \frac{\partial x}{\partial u}(u_0\,,\,v_0)\,,\, \frac{\partial y}{\partial u}(u_0\,,\,v_0) \right \rangle \mathrm{d}u\\ \overrightarrow{P_0P_2} \approx \overrightarrow{P_1P_3} &\approx \frac{\partial \textbf{r}}{\partial v}(u_0\,,\,v_0)\,\mathrm{d}v &&= \left \langle \frac{\partial x}{\partial v}(u_0\,,\,v_0)\,,\, \frac{\partial y}{\partial v}(u_0\,,\,v_0) \right \rangle \mathrm{d}v \end{alignat*}
Here the notation, for example, \overrightarrow{P_0P_1} refers to the vector whose tail is at the point P_0 and whose head is at the point P_1. Recall, from 1.2.17, that
\begin{align*} \text{area of parallelogram with sides } \left \langle a,b \right \rangle \text{ and } \left \langle c,d \right \rangle = \left|\det\left[\begin{matrix} a & b \\ c & d \end{matrix}\right]\right| = |ad-bc| \end{align*}
So the area of our small piece of \mathcal{R} is essentially
\begin{align*} \mathrm{d}A = \left|\det\left[\begin{matrix} \frac{\partial x}{\partial u} & \frac{\partial y}{\partial u}\\ \frac{\partial x}{\partial v} & \frac{\partial y}{\partial v} \end{matrix}\right]\right| \mathrm{d}u\,\mathrm{d}v \tag{3.8.2} \end{align*}
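If you would like to experiment with (3.8.2), the following SymPy sketch (our own helper, named area_factor here, tested on an arbitrarily chosen linear map) computes the factor \left|\det\right| directly from a given pair x(u,v), y(u,v).

```python
import sympy as sp

def area_factor(x_expr, y_expr, a, b):
    # |det| of the 2x2 Jacobian matrix appearing in dA = |det| du dv, as in (3.8.2).
    J = sp.Matrix([[sp.diff(x_expr, a), sp.diff(y_expr, a)],
                   [sp.diff(x_expr, b), sp.diff(y_expr, b)]])
    return sp.Abs(sp.simplify(J.det()))

u, v = sp.symbols("u v")

# Test on the linear map x = 2u + v, y = u - v (an illustrative choice):
# the factor should be the constant |2*(-1) - 1*1| = 3.
print(area_factor(2*u + v, u - v, u, v))   # 3
```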
Recall that \det M denotes the determinant of the matrix M. Also recall that we don't really need determinants for this text, though it does make for nice compact notation.
The formula (3.8.2) is the heart of the following theorem, which tells us how to translate an integral in one coordinate system into an integral in another coordinate system.
Theorem 3.8.3 (Change of Variables).
Let the functions x(u,v) and y(u,v) have continuous first partial derivatives and let the function f(x,y) be continuous. Assume that x=x(u,v), y=y(u,v) provides a one-to-one correspondence between the points (u,v) of the region \mathcal{U} in the uv-plane and the points (x,y) of the region \mathcal{R} in the xy-plane. Then
\begin{align*} \iint_{\mathcal{R}} f(x,y)\,\mathrm{d}x\,\mathrm{d}y = \iint_{\mathcal{U}} f\big(x(u,v)\,,\,y(u,v)\big)\, \left|\det\left[\begin{matrix} \frac{\partial x}{\partial u}(u,v) & \frac{\partial y}{\partial u}(u,v)\\ \frac{\partial x}{\partial v}(u,v) & \frac{\partial y}{\partial v}(u,v) \end{matrix}\right]\right| \mathrm{d}u\,\mathrm{d}v \end{align*}
The determinant
\begin{align*} \det\left[\begin{matrix} \frac{\partial x}{\partial u}(u,v)&\frac{\partial y}{\partial u}(u,v) \\ \frac{\partial x}{\partial v}(u,v)&\frac{\partial y}{\partial v}(u,v) \end{matrix}\right] \end{align*}
that appears in (3.8.2) and Theorem 3.8.3 is known as the Jacobian 4.
We'll start with a pretty trivial example in which we simply rename x to Y and y to X\text{.} That is
\begin{align*} x(X,Y) &= Y\\ y(X,Y) &= X \end{align*}
Since
\begin{align*} \frac{\partial x}{\partial X}&=0 &\frac{\partial y}{\partial X}&=1\\ \frac{\partial x}{\partial Y}&=1 &\frac{\partial y}{\partial Y}&=0 \end{align*}
(3.8.2), but with u renamed to X and v renamed to Y\text{,} gives
\begin{align*} \mathrm{d}A &= \left|\det\left[\begin{matrix}0 & 1 \\ 1 & 0 \end{matrix}\right]\right| \mathrm{d}X\,\mathrm{d}Y = \mathrm{d}X\,\mathrm{d}Y \end{align*}
which should really not be a shock.
Polar coordinates have
\begin{align*} x(r,\theta) &= r\cos\theta\\ y(r,\theta) &= r\sin\theta \end{align*}
Since
\begin{align*} \frac{\partial x}{\partial r}&=\cos\theta &\frac{\partial y}{\partial r}&=\sin\theta\\ \frac{\partial x}{\partial \theta}&=-r\sin\theta &\frac{\partial y}{\partial \theta}&=r\cos\theta \end{align*}
(3.8.2), but with u renamed to r and v renamed to \theta\text{,} gives
\begin{align*} \mathrm{d}A &= \left|\det\left[\begin{matrix}\cos\theta &\sin\theta \\ -r\sin\theta & r\cos\theta \end{matrix}\right]\right| \mathrm{d}r \mathrm{d}{\theta} =\big(r\cos^2\theta + r\sin^2\theta\big)\,\mathrm{d}r \mathrm{d}{\theta} \\ &= r\,\mathrm{d}r\, \mathrm{d}{\theta} \end{align*}
which is exactly what we found in 3.2.5.
Parabolic 5 coordinates are defined by
\begin{align*} x(u,v) &= \frac{u^2-v^2}{2}\\ y(u,v) &= uv \end{align*}
Since
\begin{align*} \frac{\partial x}{\partial u}&= u &\frac{\partial y}{\partial u}&=v\\ \frac{\partial x}{\partial v}&=-v &\frac{\partial y}{\partial v}&=u \end{align*}
(3.8.2) gives
\begin{align*} \mathrm{d}A &= \left|\det\left[\begin{matrix} u & v \\ -v & u \end{matrix}\right]\right| \mathrm{d}u\mathrm{d}v = (u^2+v^2)\,\mathrm{d}u\,\mathrm{d}v \end{align*}
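As a check on the last two examples, this SymPy snippet (our own, reusing the same kind of helper as above) recomputes the polar and parabolic area factors directly from (3.8.2); it should print r and u**2 + v**2.

```python
import sympy as sp

r, theta, u, v = sp.symbols("r theta u v", positive=True)

def area_factor(x_expr, y_expr, a, b):
    # |det| of the 2x2 Jacobian matrix of (3.8.2).
    J = sp.Matrix([[sp.diff(x_expr, a), sp.diff(y_expr, a)],
                   [sp.diff(x_expr, b), sp.diff(y_expr, b)]])
    return sp.Abs(sp.simplify(J.det()))

# Polar coordinates: expect r.
print(area_factor(r*sp.cos(theta), r*sp.sin(theta), r, theta))

# Parabolic coordinates: expect u**2 + v**2.
print(area_factor((u**2 - v**2)/2, u*v, u, v))
```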
In practice applying the change of variables Theorem 3.8.3 can be quite tricky. Here is just one simple (and rigged) example.
Evaluate
\iint_\mathcal{R}\frac{y}{1+x}\ \mathrm{d}{x} \, \mathrm{d}{y} \qquad\text{where } \mathcal{R}=\left \{(x,y)|0\le x\le 1,\ 1+x\le y\le 2+2x\right \} \nonumber
Solution
We can simplify the integrand considerably by making the change of variables
\begin{align*} s&=x & x&=s\\ t&=\frac{y}{1+x} & y&=t(1+x) = t(1+s) \end{align*}
Of course to evaluate the given integral by applying Theorem 3.8.3 we also need to know
- the domain of integration in terms of s and t and
- \mathrm{d}{x} \, \mathrm{d}{y} in terms of \mathrm{d}s\,\mathrm{d}t\text{.}
By (3.8.2), recalling that x(s,t)=s and y(s,t)=t(1+s)\text{,}
\begin{align*} \mathrm{d}{x} \, \mathrm{d}{y} &= \left|\det\left[\begin{matrix}\frac{\partial x}{\partial s}&\frac{\partial y}{\partial s}\\ \frac{\partial x}{\partial t}&\frac{\partial y}{\partial t} \end{matrix}\right]\right| \mathrm{d}s\,\mathrm{d}t = \left|\det\left[\begin{matrix}1&t\\ 0&1+s \end{matrix}\right]\right| \mathrm{d}s\,\mathrm{d}t = (1+s)\,\mathrm{d}s\,\mathrm{d}t \end{align*}
To determine what the change of variables does to the domain of integration, we'll sketch \mathcal{R} and then reexpress the boundary of \mathcal{R} in terms of the new coordinates s and t\text{.} Here is the sketch of \mathcal{R} in the original coordinates (x,y)\text{.}
The region \mathcal{R} is a quadrilateral. It has four sides.
- The left side is part of the line x=0\text{.} Recall that x=s\text{.} So, in terms of s and t\text{,} this line is s=0\text{.}
- The right side is part of the line x=1\text{.} In terms of s and t\text{,} this line is s=1\text{.}
- The bottom side is part of the line y=1+x\text{,} or \frac{y}{1+x}=1\text{.} Recall that t=\frac{y}{1+x}\text{.} So, in terms of s and t\text{,} this line is t=1\text{.}
- The top side is part of the line y=2(1+x)\text{,} or \frac{y}{1+x}=2\text{.} In terms of s and t\text{,} this line is t=2\text{.}
Here is another copy of the sketch of \mathcal{R}\text{.} But this time the equations of its four sides are expressed in terms of s and t\text{.}
So, expressed in terms of s and t\text{,} the domain of integration \mathcal{R} is much simpler:
\left \{(s,t)|0\le s\le 1,\ 1\le t\le 2\right \} \nonumber
As \mathrm{d}{x} \, \mathrm{d}{y} = (1+s)\,\mathrm{d}s\,\mathrm{d}t and the integrand \frac{y}{1+x}=t\text{,} the integral is, by Theorem 3.8.3,
\begin{align*} \iint_\mathcal{R}\frac{y}{1+x}\ \mathrm{d}{x} \, \mathrm{d}{y} &=\int_0^1\mathrm{d}s\int_1^2\mathrm{d}t\ (1+s)t =\int_0^1\mathrm{d}s\ (1+s)\ \left[\frac{t^2}{2}\right]_1^2\\ &=\frac{3}{2}\left[s+\frac{s^2}{2}\right]_0^1\\ &=\frac{3}{2}\times \frac{3}{2}\\ &=\frac{9}{4} \end{align*}
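As a purely numerical cross-check of this answer (not part of the solution above), one can evaluate both sides of the change of variables formula with scipy.integrate.dblquad; both computations should return approximately 9/4 = 2.25.

```python
from scipy.integrate import dblquad

# Original coordinates: integrate y/(1+x) over 0 <= x <= 1, 1+x <= y <= 2+2x.
# dblquad integrates func(y, x), with the inner variable (here y) listed first.
I_xy, _ = dblquad(lambda y, x: y / (1 + x), 0, 1,
                  lambda x: 1 + x, lambda x: 2 + 2 * x)

# New coordinates: integrate t*(1+s) over the rectangle 0 <= s <= 1, 1 <= t <= 2.
I_st, _ = dblquad(lambda t, s: t * (1 + s), 0, 1,
                  lambda s: 1, lambda s: 2)

print(I_xy, I_st)   # both approximately 2.25
```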
There are natural generalizations of (3.8.2) and Theorem 3.8.3 to three (and also to higher) dimensions, that are derived in precisely the same way as (3.8.2) was derived. The derivation is based on the fact, discussed in the optional Section 1.2.4, that the volume of the parallelepiped (three dimensional parallelogram)
determined by the three vectors \textbf{a}=\left \langle a_1,a_2,a_3 \right \rangle ,\ \textbf{b}=\left \langle b_1,b_2,b_3 \right \rangle and \textbf{c}=\left \langle c_1,c_2,c_3 \right \rangle is given by the formula
\begin{align*} \text{volume of parallelepiped with edges } \textbf{a}, \textbf{b}, \textbf{c} &= \left| \det\left[\begin{matrix}a_1&a_2&a_3 \\ b_1&b_2&b_3\\ c_1&c_2&c_3\end{matrix}\right] \right| \end{align*}
where the determinant of a 3\times 3 matrix can be defined in terms of some 2\times 2 determinants by
\begin{align*} \det\left[\begin{matrix}a_1&a_2&a_3 \\ b_1&b_2&b_3\\ c_1&c_2&c_3\end{matrix}\right] &= a_1\det\left[\begin{matrix} b_2&b_3\\ c_2&c_3\end{matrix}\right] -a_2\det\left[\begin{matrix} b_1&b_3\\ c_1&c_3\end{matrix}\right] +a_3\det\left[\begin{matrix} b_1&b_2\\ c_1&c_2\end{matrix}\right] \end{align*}
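Here is a small NumPy sketch (our own illustration, with arbitrarily chosen edge vectors and a helper named det3_by_cofactors) that checks this cofactor expansion against numpy.linalg.det and then uses the absolute value of the determinant as the parallelepiped volume.

```python
import numpy as np

def det3_by_cofactors(M):
    # Expand along the first row, exactly as in the 2x2-determinant formula above.
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = M
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))

# Three illustrative edge vectors, stacked as the rows of a matrix.
M = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [2.0, 0.0, 1.0]])

# The two determinant computations agree, and |det| is the parallelepiped volume.
print(det3_by_cofactors(M), np.linalg.det(M), abs(np.linalg.det(M)))
```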
If we use
\begin{align*} x&=x(u,v,w)\\ y&=y(u,v,w)\\ z&=z(u,v,w) \end{align*}
to change from old coordinates x,y,z to new coordinates u,v,w\text{,} then
\begin{align*} \mathrm{d}V = \left|\det\left[\begin{matrix} \frac{\partial x}{\partial u}&\frac{\partial y}{\partial u}&\frac{\partial z}{\partial u}\\ \frac{\partial x}{\partial v}&\frac{\partial y}{\partial v}&\frac{\partial z}{\partial v}\\ \frac{\partial x}{\partial w}&\frac{\partial y}{\partial w}&\frac{\partial z}{\partial w} \end{matrix}\right]\right| \mathrm{d}u\,\mathrm{d}v\,\mathrm{d}w \tag{3.8.8} \end{align*}
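Here is a minimal SymPy sketch of this \mathrm{d}V formula (our own helper, named volume_factor, tested on a simple scaling map chosen for illustration). For x=au, y=bv, z=cw a box with sides \mathrm{d}u, \mathrm{d}v, \mathrm{d}w is mapped to a box with sides a\,\mathrm{d}u, b\,\mathrm{d}v, c\,\mathrm{d}w, so the factor should be abc.

```python
import sympy as sp

u, v, w, a, b, c = sp.symbols("u v w a b c", positive=True)

def volume_factor(x_expr, y_expr, z_expr, p, q, s):
    # |det| of the 3x3 Jacobian matrix appearing in the dV formula (3.8.8).
    J = sp.Matrix([[sp.diff(e, var) for e in (x_expr, y_expr, z_expr)]
                   for var in (p, q, s)])
    return sp.Abs(sp.simplify(J.det()))

# Scaling map x = a*u, y = b*v, z = c*w: expect the factor a*b*c.
print(volume_factor(a*u, b*v, c*w, u, v, w))   # a*b*c
```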
Cylindrical coordinates have
\begin{align*} x(r,\theta,z) &= r\cos\theta\\ y(r,\theta,z) &= r\sin\theta\\ z(r,\theta,z) & = z \end{align*}
Since
\begin{align*} \frac{\partial x}{\partial r}&=\cos\theta &\frac{\partial y}{\partial r}&=\sin\theta &\frac{\partial z}{\partial r}&=0\\ \frac{\partial x}{\partial \theta}&=-r\sin\theta &\frac{\partial y}{\partial \theta}&=r\cos\theta &\frac{\partial z}{\partial \theta}&=0\\ \frac{\partial x}{\partial z}&= 0 &\frac{\partial y}{\partial z}&=0 &\frac{\partial z}{\partial z}&=1 \end{align*}
(3.8.8), but with u renamed to r\text{,} v renamed to \theta and w renamed to z\text{,} gives
\begin{align*} \mathrm{d}V &= \left|\det\left[\begin{matrix}\cos\theta &\sin\theta&0 \\ -r\sin\theta & r\cos\theta&0 \\ 0 & 0 & 1 \end{matrix}\right]\right| \mathrm{d}r\, \mathrm{d}{\theta} \, \mathrm{d}{z} \\ &= \left|\cos\theta\det\left[\begin{matrix} r\cos\theta&0 \\ 0 & 1 \end{matrix}\right] -\sin\theta\det\left[\begin{matrix} -r\sin\theta&0 \\ 0 & 1 \end{matrix}\right]\right.\\ &\hskip2.3in+0\left.\det\left[\begin{matrix} -r\sin\theta & r\cos\theta \\ 0 & 0 \end{matrix}\right] \right| \mathrm{d}r\, \mathrm{d}{\theta} \, \mathrm{d}{z} \\ &=\big(r\cos^2\theta + r\sin^2\theta\big)\,\mathrm{d}r\, \mathrm{d}{\theta} \, \mathrm{d}{z} \\ &= r\,\mathrm{d}r\, \mathrm{d}{\theta} \, \mathrm{d}{z} \end{align*}
which is exactly what we found in (3.6.3).
Spherical coordinates have
\begin{align*} x(\rho,\theta,\varphi) &= \rho\,\cos\theta\,\sin\varphi\\ y(\rho,\theta,\varphi) &= \rho\,\sin\theta\,\sin\varphi\\ z(\rho,\theta,\varphi) & = \rho\,\cos\varphi \end{align*}
Since
\begin{align*} \frac{\partial x}{\partial \rho}&=\cos\theta\,\sin\varphi &\frac{\partial y}{\partial \rho}&=\sin\theta\,\sin\varphi &\frac{\partial z}{\partial \rho}&=\cos\varphi\\ \frac{\partial x}{\partial \theta}&=-\rho\,\sin\theta\,\sin\varphi &\frac{\partial y}{\partial \theta}&=\rho\,\cos\theta\,\sin\varphi &\frac{\partial z}{\partial \theta}&=0\\ \frac{\partial x}{\partial \varphi}&= \rho\,\cos\theta\,\cos\varphi &\frac{\partial y}{\partial \varphi}&=\rho\,\sin\theta\,\cos\varphi &\frac{\partial z}{\partial \varphi}&=-\rho\,\sin\varphi \end{align*}
(3.8.8), but with u renamed to \rho\text{,} v renamed to \theta and w renamed to \varphi\text{,} gives
\begin{align*} \mathrm{d}V &= \left|\det\left[\begin{matrix}\cos\theta\,\sin\varphi & \sin\theta\,\sin\varphi &\cos\varphi \\ -\rho\,\sin\theta\,\sin\varphi &\rho\,\cos\theta\,\sin\varphi &0 \\ \rho\,\cos\theta\,\cos\varphi &\rho\,\sin\theta\,\cos\varphi &-\rho\,\sin\varphi \end{matrix}\right]\right| \mathrm{d}\rho\, \mathrm{d}{\theta} \,\mathrm{d}\varphi\\ &= \left|\cos\theta\,\sin\varphi\det\left[\begin{matrix} \rho\,\cos\theta\,\sin\varphi&0 \\ \rho\,\sin\theta\,\cos\varphi &-\rho\,\sin\varphi \end{matrix}\right] \right.\\ &\hskip1in\left. -\sin\theta\,\sin\varphi\det\left[\begin{matrix} -\rho\,\sin\theta\,\sin\varphi &0 \\ \rho\,\cos\theta\,\cos\varphi &-\rho\,\sin\varphi \end{matrix}\right] \right.\\ &\hskip1in\left. +\cos\varphi\det\left[\begin{matrix} -\rho\,\sin\theta\,\sin\varphi &\rho\,\cos\theta\,\sin\varphi \\ \rho\,\cos\theta\,\cos\varphi &\rho\,\sin\theta\,\cos\varphi \end{matrix}\right] \right| \mathrm{d}\rho\, \mathrm{d}{\theta} \,\mathrm{d}\varphi\\ &=\rho^2 \big|-\cos^2\theta \sin^3\varphi - \sin^2\theta\sin^3\varphi -\sin\varphi\cos^2\varphi \big|\,\mathrm{d}\rho\, \mathrm{d}{\theta} \,\mathrm{d}\varphi\\ &=\rho^2 \big|-\sin\varphi \sin^2\varphi -\sin\varphi\cos^2\varphi \big|\,\mathrm{d}\rho\, \mathrm{d}{\theta} \,\mathrm{d}\varphi\\ &= \rho^2\sin\varphi\,\mathrm{d}\rho\, \mathrm{d}{\theta} \,\mathrm{d}\varphi \end{align*}
which is exactly what we found in (3.7.3).
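The same kind of SymPy computation (our own check, analogous to the sketch after (3.8.8)) reproduces both of these volume factors without any by-hand cofactor expansions; it should print r for cylindrical coordinates and -rho**2*sin(phi), whose absolute value is \rho^2\sin\varphi, for spherical coordinates.

```python
import sympy as sp

r, theta, z, rho, phi = sp.symbols("r theta z rho phi", positive=True)

def jacobian_det(x_expr, y_expr, z_expr, p, q, s):
    # Signed determinant of the 3x3 Jacobian matrix of (3.8.8);
    # dV is its absolute value times dp dq ds.
    J = sp.Matrix([[sp.diff(e, var) for e in (x_expr, y_expr, z_expr)]
                   for var in (p, q, s)])
    return sp.simplify(J.det())

# Cylindrical coordinates: expect r.
print(jacobian_det(r*sp.cos(theta), r*sp.sin(theta), z, r, theta, z))

# Spherical coordinates: expect -rho**2*sin(phi).
print(jacobian_det(rho*sp.cos(theta)*sp.sin(phi),
                   rho*sp.sin(theta)*sp.sin(phi),
                   rho*sp.cos(phi),
                   rho, theta, phi))
```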
3.8.1: Optional — Dropping Higher Order Terms in \mathrm{d}u,\mathrm{d}v
In the course of deriving (3.8.2), that is, the \mathrm{d}A formula for the small piece of \mathcal{R} with corners P_0, P_1, P_2, P_3, we approximated, for example, the vectors
\begin{alignat*}{2} \overrightarrow{P_0P_1} &=\textbf{r}(u_0+\mathrm{d}u, v_0) -\textbf{r}(u_0\,,\,v_0) &= \frac{\partial \textbf{r}}{\partial u}(u_0\,,\,v_0)\,\mathrm{d}u + \textbf{E}_1 &\approx \frac{\partial \textbf{r}}{\partial u}(u_0\,,\,v_0)\,\mathrm{d}u\\ \overrightarrow{P_0P_2} &=\textbf{r}(u_0, v_0+\mathrm{d}v)-\textbf{r}(u_0\,,\,v_0) &= \frac{\partial \textbf{r}}{\partial v}(u_0\,,\,v_0)\,\mathrm{d}v + \textbf{E}_2 &\approx \frac{\partial \textbf{r}}{\partial v}(u_0\,,\,v_0)\,\mathrm{d}v \end{alignat*}
where \textbf{E}_1 is bounded 6 by a constant times (\mathrm{d}u)^2 and \textbf{E}_2 is bounded by a constant times (\mathrm{d}v)^2\text{.} That is, we assumed that we could just ignore the errors and drop \textbf{E}_1 and \textbf{E}_2 by setting them to zero.
So we approximated
\begin{align*} \left|\overrightarrow{P_0P_1}\times\overrightarrow{P_0P_2}\right| &=\left|\Big[\frac{\partial \textbf{r}}{\partial u}(u_0\,,\,v_0)\,\mathrm{d}u + \textbf{E}_1\Big] \times\Big[\frac{\partial \textbf{r}}{\partial v}(u_0\,,\,v_0)\,\mathrm{d}v + \textbf{E}_2\Big] \right|\\ &=\left|\frac{\partial \textbf{r}}{\partial u}(u_0\,,\,v_0)\,\mathrm{d}u \times\frac{\partial \textbf{r}}{\partial v}(u_0\,,\,v_0)\,\mathrm{d}v + \textbf{E}_3 \right|\\ &\approx \left|\frac{\partial \textbf{r}}{\partial u}(u_0\,,\,v_0)\,\mathrm{d}u \times\frac{\partial \textbf{r}}{\partial v}(u_0\,,\,v_0)\,\mathrm{d}v \right| \end{align*}
where the length of the vector \textbf{E}_3 is bounded by a constant times (\mathrm{d}u)^2\,\mathrm{d}v+\mathrm{d}u\,(\mathrm{d}v)^2\text{.} We'll now see why dropping terms like \textbf{E}_3 does not change the value of the integral at all 7. Suppose that our domain of integration consists of all (u,v)'s in a rectangle of width W and height H\text{,} as in the figure below.
Subdivide the rectangle into a grid of n\times n small subrectangles by drawing lines of constant v (the red lines in the figure) and lines of constant u (the blue lines in the figure). Each subrectangle has width \mathrm{d}u = \frac{W}{n} and height \mathrm{d}v = \frac{H}{n}\text{.} Now suppose that in setting up the integral we make, for each subrectangle, an error that is bounded by some constant times
(\mathrm{d}u)^2\,\mathrm{d}v+\mathrm{d}u\,(\mathrm{d}v)^2 =\Big(\frac{W}{n}\Big)^2 \frac{H}{n} + \frac{W}{n}\Big(\frac{H}{n}\Big)^2 =\frac{WH(W+H)}{n^3} \nonumber
Because there are a total of n^2 subrectangles, the total error that we have introduced, for all of these subrectangles, is no larger than a constant times
n^2 \times \frac{WH(W+H)}{n^3} = \frac{WH(W+H)}{n} \nonumber
When we define our integral by taking the limit n\rightarrow\infty of the Riemann sums, this error converges to exactly 0\text{.} As a consequence, it was safe for us to ignore the error terms when we established the change of variables formulae.
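As a concrete illustration of this argument (our own sketch, using polar coordinates on the unit disk), the following computes the disk's area by summing the linearized piece areas r\,\mathrm{d}r\,\mathrm{d}\theta over an n\times n grid; the discrepancy with the exact area \pi indeed shrinks like 1/n.

```python
import numpy as np

# Cut the unit disk into n x n small pieces using curves of constant r and
# constant theta, and replace each piece by the linearized parallelogram of
# area r dr dtheta, with r evaluated at a corner of the piece.
for n in [10, 100, 1000]:
    dr, dtheta = 1.0 / n, 2 * np.pi / n
    r_corners = np.arange(n) * dr                 # r at a corner of each piece
    approx = np.sum(r_corners) * dr * dtheta * n  # n identical angular strips
    # The dropped higher order terms contribute a total error of order 1/n.
    print(f"n = {n:5d}   approximate area = {approx:.6f}   error = {np.pi - approx:.2e}")
```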
- We'll keep our third wish in reserve.
- We are abusing notation a little here by using x and y both as coordinates and as functions. We could write x=f(u,v) and y=g(u,v)\text{,} but it is easier to remember x=x(u,v) and y=y(u,v)\text{.}
- Recall 2.6.1.
- It is not named after the Jacobin Club, a political movement of the French revolution. It is not named after the Jacobite rebellions that took place in Great Britain and Ireland between 1688 and 1746. It is not named after the Jacobean era of English and Scottish history. It is named after the German mathematician Carl Gustav Jacob Jacobi (1804 – 1851). He died from smallpox.
- The name comes from the fact that both the curves of constant u and the curves of constant v are parabolas.
- Remember the error in the Taylor polynomial approximations. See 2.6.13 and 2.6.14.
- See the optional § 1.1.6 of the CLP-2 text for an analogous argument concerning Riemann sums.