14.4: The Chain Rule
Consider the surface \(z=x^2y+xy^2\), and suppose that \(x=2+t^4\) and \(y=1-t^3\). We can think of the latter two equations as describing how \(x\) and \(y\) change relative to, say, time. Then
\[z=x^2y+xy^2=(2+t^4)^2(1-t^3)+(2+t^4)(1-t^3)^2\]
tells us explicitly how the \(z\) coordinate of the corresponding point on the surface depends on \(t\). If we want to know \(dz/dt\) we can compute it more or less directly; it is actually a bit simpler to use the chain rule:
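\[\begin{aligned}
\frac{dz}{dt} &= x^2y' + 2xx'y + x\,2yy' + x'y^2\\
&= (2xy+y^2)x' + (x^2+2xy)y'\\
&= \bigl(2(2+t^4)(1-t^3)+(1-t^3)^2\bigr)(4t^3) + \bigl((2+t^4)^2+2(2+t^4)(1-t^3)\bigr)(-3t^2)
\end{aligned}\]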
If we look carefully at the middle step, \(dz/dt=(2xy+y^2)x'+(x^2+2xy)y'\), we notice that \(2xy+y^2\) is \(\partial z/\partial x\), and \(x^2+2xy\) is \(\partial z/\partial y\). This turns out to be true in general, and gives us a new chain rule:
Suppose that \(z=f(x,y)\), \(f\) is differentiable, \(x=g(t)\), and \(y=h(t)\). Assuming that the relevant derivatives exist,
\[\frac{dz}{dt}=\frac{\partial z}{\partial x}\frac{dx}{dt}+\frac{\partial z}{\partial y}\frac{dy}{dt}.\]
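As a quick sanity check, the formula can be tested numerically on the opening example. The sketch below evaluates the chain rule expression for \(z=x^2y+xy^2\), \(x=2+t^4\), \(y=1-t^3\) and compares it with a symmetric difference quotient of \(z\) written explicitly in terms of \(t\). The helper names, the sample point `t0`, and the step size `h` are choices made for illustration only.

```python
# Numerical check of dz/dt for z = x^2*y + x*y^2 with x = 2 + t^4, y = 1 - t^3.

def x(t):
    return 2 + t**4

def y(t):
    return 1 - t**3

def z_of_t(t):
    # z written explicitly as a function of t
    return x(t)**2 * y(t) + x(t) * y(t)**2

def dz_dt_chain(t):
    # Chain rule: dz/dt = z_x * x' + z_y * y'
    z_x = 2 * x(t) * y(t) + y(t)**2   # partial derivative of z with respect to x
    z_y = x(t)**2 + 2 * x(t) * y(t)   # partial derivative of z with respect to y
    dx_dt = 4 * t**3
    dy_dt = -3 * t**2
    return z_x * dx_dt + z_y * dy_dt

def central_difference(f, t, h=1e-6):
    # Symmetric difference quotient, an approximation to f'(t)
    return (f(t + h) - f(t - h)) / (2 * h)

t0 = 0.5  # arbitrary sample point, chosen for illustration
print("chain rule:         ", dz_dt_chain(t0))
print("difference quotient:", central_difference(z_of_t, t0))
```

The two printed values should agree to several decimal places; the small discrepancy comes from the finite step size, not from the chain rule.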
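If \(f\) is differentiable, then
\[\Delta z=f_x(x_0,y_0)\Delta x+f_y(x_0,y_0)\Delta y+\epsilon_1\Delta x+\epsilon_2\Delta y,\]
where \(\epsilon_1\) and \(\epsilon_2\) approach 0 as \((x,y)\) approaches \((x_0,y_0)\). Then
\[\frac{\Delta z}{\Delta t}=f_x\frac{\Delta x}{\Delta t}+f_y\frac{\Delta y}{\Delta t}+\epsilon_1\frac{\Delta x}{\Delta t}+\epsilon_2\frac{\Delta y}{\Delta t}.\tag{14.4.1}\]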
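As \(\Delta t\) approaches 0, \((x,y)\) approaches \((x_0,y_0)\) and so
\[\begin{aligned}
\lim_{\Delta t\to0}\frac{\Delta z}{\Delta t}&=\frac{dz}{dt}\\
\lim_{\Delta t\to0}\epsilon_1\frac{\Delta x}{\Delta t}&=0\cdot\frac{dx}{dt}\\
\lim_{\Delta t\to0}\epsilon_2\frac{\Delta y}{\Delta t}&=0\cdot\frac{dy}{dt}
\end{aligned}\]
and so taking the limit of (14.4.1) as \(\Delta t\) goes to 0 gives
\[\frac{dz}{dt}=f_x\frac{dx}{dt}+f_y\frac{dy}{dt},\]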
as desired.
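The limiting argument can be watched numerically. The following sketch, using the opening example at the sample point \(t_0=1\) (a choice made here for convenience), computes the quotient \(\Delta z/\Delta t\) for shrinking \(\Delta t\) and compares it with the chain rule value. At \(t_0=1\) we have \(x=3\), \(y=0\), so \(f_x=0\), \(f_y=9\), \(x'=4\), \(y'=-3\), and the chain rule gives \(dz/dt=-27\).

```python
# Watch Delta z / Delta t approach the chain rule value as Delta t shrinks.

def z_of_t(t):
    x = 2 + t**4
    y = 1 - t**3
    return x**2 * y + x * y**2

t0 = 1.0
exact = -27.0  # chain rule value at t0 = 1: (0)(4) + (9)(-3)

errors = []
for k in range(1, 6):
    dt = 10.0 ** (-k)
    quotient = (z_of_t(t0 + dt) - z_of_t(t0)) / dt   # Delta z / Delta t
    errors.append(abs(quotient - exact))
    print(f"dt = {dt:.0e}   quotient = {quotient:.6f}   error = {errors[-1]:.2e}")
```

The error column should shrink along with \(\Delta t\), which reflects the terms \(\epsilon_1\Delta x/\Delta t\) and \(\epsilon_2\Delta y/\Delta t\) dropping out in the limit.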
We can write the chain rule in a way that is somewhat closer to the single variable chain rule:
\[\frac{df}{dt}=\langle f_x,f_y\rangle\cdot\langle x',y'\rangle,\]
or (roughly) the derivatives of the outside function "times" the derivatives of the inside functions. Not surprisingly, essentially the same chain rule works for functions of more than two variables. For example, given a function of three variables \(f(x,y,z)\), where each of \(x\), \(y\) and \(z\) is a function of \(t\),
\[\frac{df}{dt}=\langle f_x,f_y,f_z\rangle\cdot\langle x',y',z'\rangle.\]
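The dot product form translates almost directly into code. The sketch below uses a sample function and curve that are not from the text: \(f(x,y,z)=xyz\) along \(x=\cos t\), \(y=\sin t\), \(z=t\). Along this curve \(f=t\sin t\cos t=\tfrac{t}{2}\sin 2t\), so the derivative can also be found by hand and used as a cross-check.

```python
import math

def gradient(x, y, z):
    # <f_x, f_y, f_z> for the sample function f(x, y, z) = x*y*z
    return (y * z, x * z, x * y)

def position(t):
    # Sample curve (an assumption for illustration): x = cos t, y = sin t, z = t
    return (math.cos(t), math.sin(t), t)

def velocity(t):
    # <x', y', z'> for the sample curve
    return (-math.sin(t), math.cos(t), 1.0)

def df_dt(t):
    # Chain rule as a dot product: gradient at the point, dotted with the velocity
    grad = gradient(*position(t))
    vel = velocity(t)
    return sum(g * v for g, v in zip(grad, vel))

def df_dt_direct(t):
    # f along the curve is (t/2) sin(2t); differentiated by hand
    return 0.5 * math.sin(2 * t) + t * math.cos(2 * t)

t0 = 0.7  # arbitrary sample point
print(df_dt(t0), df_dt_direct(t0))
```

Both expressions should print the same value up to rounding.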
We can even extend the idea further. Suppose that \(f(x,y)\) is a function and \(x=g(s,t)\) and \(y=h(s,t)\) are functions of two variables \(s\) and \(t\). Then \(f\) is "really" a function of \(s\) and \(t\) as well, and
\[\frac{\partial f}{\partial s}=f_xg_s+f_yh_s\qquad\frac{\partial f}{\partial t}=f_xg_t+f_yh_t.\]
The natural extension of this to \(f(x,y,z)\) works as well.
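To illustrate the two-parameter version, the sketch below takes \(f(x,y)=x^2y+xy^2\) from the opening example and, as an assumption made purely for illustration, the inside functions \(x=g(s,t)=s\cos t\) and \(y=h(s,t)=s\sin t\). It computes \(\partial f/\partial s\) and \(\partial f/\partial t\) from the chain rule and compares them with symmetric difference quotients of the composite function.

```python
import math

def f(x, y):
    return x**2 * y + x * y**2

def f_x(x, y):
    return 2 * x * y + y**2

def f_y(x, y):
    return x**2 + 2 * x * y

def g(s, t):
    # x = g(s, t), a sample choice
    return s * math.cos(t)

def h(s, t):
    # y = h(s, t), a sample choice
    return s * math.sin(t)

def partials_chain(s, t):
    # Chain rule: f_s = f_x g_s + f_y h_s and f_t = f_x g_t + f_y h_t
    x, y = g(s, t), h(s, t)
    g_s, g_t = math.cos(t), -s * math.sin(t)
    h_s, h_t = math.sin(t), s * math.cos(t)
    df_ds = f_x(x, y) * g_s + f_y(x, y) * h_s
    df_dt = f_x(x, y) * g_t + f_y(x, y) * h_t
    return df_ds, df_dt

def partials_numeric(s, t, step=1e-6):
    # Symmetric difference quotients of the composite F(s, t) = f(g(s, t), h(s, t))
    def F(s, t):
        return f(g(s, t), h(s, t))
    df_ds = (F(s + step, t) - F(s - step, t)) / (2 * step)
    df_dt = (F(s, t + step) - F(s, t - step)) / (2 * step)
    return df_ds, df_dt

s0, t0 = 1.5, 0.4  # arbitrary sample point
print(partials_chain(s0, t0))
print(partials_numeric(s0, t0))
```

The two printed pairs should match to several decimal places.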
Recall that we used the ordinary chain rule to do implicit differentiation. We can do the same with the new chain rule.
The equation \(x^2+y^2+z^2=4\) defines a sphere, which is not the graph of a function of \(x\) and \(y\), though it can be thought of as the graphs of two functions, the top and bottom hemispheres. We can think of \(z\) as one of these two functions, so really \(z=z(x,y)\), and we can think of \(x\) and \(y\) as particularly simple functions of \(x\) and \(y\), and let \(f(x,y,z)=x^2+y^2+z^2\). Since \(f(x,y,z)=4\) is constant, \(\partial f/\partial x=0\), but using the chain rule:
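\[0=\frac{\partial f}{\partial x}=f_x\frac{\partial x}{\partial x}+f_y\frac{\partial y}{\partial x}+f_z\frac{\partial z}{\partial x}=(2x)(1)+(2y)(0)+(2z)\frac{\partial z}{\partial x},\]
noting that since \(y\) is temporarily held constant its derivative \(\partial y/\partial x=0\). Now we can solve for \(\partial z/\partial x\):
\[\frac{\partial z}{\partial x}=-\frac{2x}{2z}=-\frac{x}{z}.\]
In a similar manner we can compute \(\partial z/\partial y\).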
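The result can be checked on the top hemisphere, where \(z=\sqrt{4-x^2-y^2}\) is available explicitly. The sketch below compares \(-x/z\) with a symmetric difference quotient of that explicit formula, and does the same for \(-y/z\), which is what the same computation gives with the roles of \(x\) and \(y\) exchanged. The sample point and the helper names are choices made for illustration.

```python
import math

def z_top(x, y):
    # Top hemisphere of x^2 + y^2 + z^2 = 4
    return math.sqrt(4 - x**2 - y**2)

def dz_dx_implicit(x, y):
    # From implicit differentiation: dz/dx = -x/z
    return -x / z_top(x, y)

def dz_dy_implicit(x, y):
    # Same computation with the roles of x and y exchanged: dz/dy = -y/z
    return -y / z_top(x, y)

def partial_numeric(func, x, y, var, step=1e-6):
    # Symmetric difference quotient in the chosen variable ("x" or "y")
    if var == "x":
        return (func(x + step, y) - func(x - step, y)) / (2 * step)
    return (func(x, y + step) - func(x, y - step)) / (2 * step)

x0, y0 = 1.0, 1.0  # sample point inside the disk x^2 + y^2 < 4
print(dz_dx_implicit(x0, y0), partial_numeric(z_top, x0, y0, "x"))
print(dz_dy_implicit(x0, y0), partial_numeric(z_top, x0, y0, "y"))
```

On the bottom hemisphere \(z\) is negative, so the same formulas apply with the sign of \(z\) reversed.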
Contributors
Integrated by Justin Marshall.