12.5: The Multivariable Chain Rule
The Chain Rule, as learned in Section 2.5, states that \( \frac{d}{dx}\Big(f\big(g(x)\big)\Big) = f'\big(g(x)\big)g'(x)\). If \(t=g(x)\), we can express the Chain Rule as
\[\frac{df}{dx} = \frac{df}{dt}\frac{dt}{dx}.\]
In this section we extend the Chain Rule to functions of more than one variable.
THEOREM 107 Multivariable Chain Rule, Part I
Let \(z=f(x,y)\), \(x=g(t)\) and \(y=h(t)\), where \(f\), \(g\) and \(h\) are differentiable functions. Then \(z = f(x,y) = f\big(g(t),h(t)\big)\) is a function of \(t\), and
\[\begin{align*}
\frac{dz}{dt} = \frac{df}{dt} &= f_x(x,y)\frac{dx}{dt}+f_y(x,y)\frac{dy}{dt}\\[4pt]
&= \frac{\partial f}{\partial x}\frac{dx}{dt}+\frac{\partial f}{\partial y}\frac{dy}{dt}.
\end{align*}\]
It is good to understand what the situation of \(z=f(x,y)\), \(x=g(t)\) and \(y=h(t)\) describes. We know that \(z=f(x,y)\) describes a surface; we also recognize that \(x=g(t)\) and \(y=h(t)\) are parametric equations for a curve in the \(x\)-\(y\) plane. Combining these together, we are describing a curve that lies on the surface described by \(f\). The parametric equations for this curve are \(x=g(t)\), \(y=h(t)\) and \(z=f\big(g(t),h(t)\big)\).
Consider Figure 12.14, in which a surface is drawn along with a dashed curve in the \(x\)-\(y\) plane. Restricting \(f\) to just the points on this dashed curve gives the curve shown on the surface. The derivative \(\frac{df}{dt}\) gives the instantaneous rate of change of \(f\) with respect to \(t\). If we consider an object traveling along this path, \(\frac{df}{dt}\) gives the rate at which the object rises or falls.
We now practice applying the Multivariable Chain Rule.
Example \(\PageIndex{1}\): Using the Multivariable Chain Rule
Let \(z=x^2y+x\), where \(x=\sin t\) and \(y=e^{5t}\). Find \( \frac{dz}{dt}\) using the Chain Rule.
Solution
Following Theorem 107, we find
\[f_x(x,y) = 2xy+1,\qquad f_y(x,y) = x^2,\qquad \frac{dx}{dt} = \cos t,\qquad \frac{dy}{dt}= 5e^{5t}.\]
Applying the theorem, we have
\[\frac{dz}{dt} = (2xy+1)\cos t+ 5x^2e^{5t}.\]
This may look odd, as it seems that \(\frac{dz}{dt}\) is a function of \(x\), \(y\) and \(t\). Since \(x\) and \(y\) are functions of \(t\), \(\frac{dz}{dt}\) is really just a function of \(t\), and we can replace \(x\) with \(\sin t\) and \(y\) with \(e^{5t}\):
\[\frac{dz}{dt} = (2xy+1)\cos t+ 5x^2e^{5t} = (2\sin (t)e^{5t}+1)\cos t+5e^{5t}\sin^2t.\]
The previous example can make us wonder: if we substituted for \(x\) and \(y\) at the end to show that \(\frac{dz}{dt}\) is really just a function of \(t\), why not substitute before differentiating, showing clearly that \(z\) is a function of \(t\)?
That is, \(z = x^2y+x = (\sin t)^2e^{5t}+\sin t.\) Applying the Chain and Product Rules, we have
\[\frac{dz}{dt} = 2\sin t\cos t\, e^{5t}+ 5\sin^2t\,e^{5t}+\cos t,\]
which matches the result from the example.
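The agreement between the two approaches can also be checked numerically. The following Python sketch (standard library only) evaluates the Chain Rule formula from Example 1 and compares it with a central-difference estimate of the derivative of \(z\) written directly in terms of \(t\). The sample time `t0 = 0.3`, the step size `h`, and all function names are illustrative choices, not part of the example.

```python
import math

def z_of_t(t):
    """z = x^2 y + x with x = sin t and y = e^(5t), written directly in terms of t."""
    x = math.sin(t)
    y = math.exp(5 * t)
    return x**2 * y + x

def dz_dt_chain_rule(t):
    """dz/dt from Theorem 107: f_x * dx/dt + f_y * dy/dt."""
    x = math.sin(t)
    y = math.exp(5 * t)
    f_x = 2 * x * y + 1          # partial of x^2 y + x with respect to x
    f_y = x**2                   # partial of x^2 y + x with respect to y
    dx_dt = math.cos(t)
    dy_dt = 5 * math.exp(5 * t)
    return f_x * dx_dt + f_y * dy_dt

def central_difference(func, t, h=1e-6):
    """Numerical estimate of func'(t); h is an arbitrary small step (assumption)."""
    return (func(t + h) - func(t - h)) / (2 * h)

t0 = 0.3  # arbitrary sample time (assumption)
exact = dz_dt_chain_rule(t0)
approx = central_difference(z_of_t, t0)
print(exact, approx)  # the two values should agree to several decimal places
```

A numerical check of this kind does not prove the formula, but a mismatch would signal an algebra slip in one of the partial derivatives.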
This may now make one wonder, "What's the point? If we could already find the derivative, why learn another way of finding it?" In some cases, applying this rule makes differentiation simpler, but this is hardly the power of the Chain Rule. Rather, in the case where \(z=f(x,y)\), \(x=g(t)\) and \(y=h(t)\), the Chain Rule is extremely powerful when we do not know what \(f\), \(g\) and/or \(h\) are. It may be hard to believe, but often in "the real world" we know rate-of-change information (i.e., information about derivatives) without explicitly knowing the underlying functions. The Chain Rule allows us to combine several rates of change to find another rate of change. The Chain Rule also has theoretical use, giving us insight into the behavior of certain constructions (as we'll see in the next section).
We demonstrate this in the next example.
Example \(\PageIndex{2}\): Applying the Multivariable Chain Rule
An object travels along a path on a surface. The exact path and surface are not known, but at time \(t=t_0\) it is known that
\[\frac{\partial z}{\partial x} = 5,\qquad \frac{\partial z}{\partial y}=-2,\qquad \frac{dx}{dt}=3\qquad \text{and}\qquad \frac{dy}{dt}=7.\]
Find \(\frac{dz}{dt}\) at time \(t_0\).
Solution
The Multivariable Chain Rule states that
\[\begin{align*}
\frac{dz}{dt} &= \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} \\
&= 5(3)+(-2)(7) \\
&=1.
\end{align*}\]
By knowing certain rate-of-change information about the surface and about the path of the object in the \(x\)-\(y\) plane, we can determine how quickly the object is rising or falling.
We next apply the Chain Rule to solve a max/min problem.
Example \(\PageIndex{3}\): Applying the Multivariable Chain Rule
Consider the surface \(z=x^2+y^2-xy\), a paraboloid, on which a particle moves with \(x\) and \(y\) coordinates given by \(x=\cos t\) and \(y=\sin t\). Find \(\frac{dz}{dt}\) when \(t=0\), and find where the particle reaches its maximum/minimum \(z\)-values.
Solution
It is straightforward to compute
\[f_x(x,y) = 2x-y,\qquad f_y(x,y) = 2y-x,\qquad \frac{dx}{dt} = -\sin t,\qquad \frac{dy}{dt} = \cos t.\]
Combining these according to the Chain Rule gives:
\[\frac{dz}{dt} = -(2x-y)\sin t + (2y-x)\cos t.\]
When \(t=0\), \(x=1\) and \(y=0\). Thus \(\frac{dz}{dt} = -(2)(0)+ (-1)(1) = -1\). When \(t=0\), the particle is moving down, as shown in Figure 12.15.
To find where the \(z\)-value is maximized/minimized on the particle's path, we set \(\frac{dz}{dt}=0\) and solve for \(t\):
\[\begin{align*}
\frac{dz}{dt} =0 &= -(2x-y)\sin t + (2y-x)\cos t\\
0&= -(2\cos t-\sin t)\sin t+(2\sin t-\cos t)\cos t\\
0&= \sin^2t-\cos^2t\\
\cos^2t &=\sin^2t\\
t&= n\frac{\pi}4\quad \text{(for odd \(n\))}
\end{align*}\]
We can use the First Derivative Test to find that on \([0,2\pi]\), \(z\) reaches its absolute minimum at \(t=\pi/4\) and \(5\pi/4\); it reaches its absolute maximum at \(t=3\pi/4\) and \(7\pi/4\), as shown in Figure 12.15.
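As a check on these critical points, we can also substitute before differentiating. Using \(\cos^2t+\sin^2t=1\) and \(\sin t\cos t = \frac12\sin(2t)\), the height along the path is
\[z = \cos^2t+\sin^2t-\cos t\sin t = 1-\tfrac12\sin(2t),\qquad\text{so}\qquad \frac{dz}{dt} = -\cos(2t) = \sin^2t-\cos^2t.\]
Thus \(\frac{dz}{dt}=0\) exactly when \(2t\) is an odd multiple of \(\pi/2\), that is, when \(t=n\frac{\pi}{4}\) for odd \(n\). The minimum value \(z=\frac12\) occurs where \(\sin(2t)=1\) (at \(t=\pi/4\) and \(5\pi/4\)), and the maximum value \(z=\frac32\) occurs where \(\sin(2t)=-1\) (at \(t=3\pi/4\) and \(7\pi/4\)).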
We can extend the Chain Rule to include the situation where \(z\) is a function of more than one variable, and each of these variables is also a function of more than one variable. The basic case of this is where \(z=f(x,y)\), and \(x\) and \(y\) are functions of two variables, say \(s\) and \(t\).
THEOREM 108 Multivariable Chain Rule, Part II
- Let \(z=f(x,y)\), \(x=g(s,t)\) and \(y=h(s,t)\), where \(f\), \(g\) and \(h\) are differentiable functions. Then \(z\) is a function of \(s\) and \(t\), and
  - \( \frac{\partial z}{\partial s} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial s}\), and
  - \( \frac{\partial z}{\partial t} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial t}.\)
- Let \(z = f(x_1,x_2,\ldots,x_m)\) be a differentiable function of \(m\) variables, where each of the \(x_i\) is a differentiable function of the variables \(t_1,t_2,\ldots,t_n\). Then \(z\) is a function of the \(t_i\), and
\[\frac{\partial z}{\partial t_i} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial t_i} + \frac{\partial f}{\partial x_2}\frac{\partial x_2}{\partial t_i} + \cdots + \frac{\partial f}{\partial x_m}\frac{\partial x_m}{\partial t_i}.\]
Example \(\PageIndex{4}\): Using the Multivariable Chain Rule, Part II
Let \(z=x^2y+x\), \(x=s^2+3t\) and \(y=2s-t\). Find \(\frac{\partial z}{\partial s}\) and \(\frac{\partial z}{\partial t}\), and evaluate each when \(s=1\) and \(t=2\).
Solution
Following Theorem 108, we compute the following partial derivatives:
\[\frac{\partial f}{\partial x} = 2xy+1\qquad\qquad \frac{\partial f}{\partial y} = x^2,\]
\[\frac{\partial x}{\partial s} = 2s \qquad\qquad \frac{\partial x}{\partial t} = 3\qquad\qquad \frac{\partial y}{\partial s} = 2 \qquad\qquad \frac{\partial y}{\partial t} = -1.\]
Thus
\[ \frac{\partial z}{\partial s} = (2xy+1)(2s) + (x^2)(2) = 4xys+2s + 2x^2,\quad \text{and}\]
\[ \frac{\partial z}{\partial t} = (2xy+1)(3) + (x^2)(-1) = 6xy-x^2+3.\]
When \(s=1\) and \(t=2\), \(x= 7\) and \(y= 0\), so
\[\frac{\partial z}{\partial s} = 100\qquad \text{and}\qquad \frac{\partial z}{\partial t} = -46.\]
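The values in Example 4 can be reproduced with a short Python sketch (standard library only). It evaluates the two Chain Rule formulas at \(s=1\), \(t=2\) and, as an independent check, estimates the same partial derivatives by central differences on \(z\) written in terms of \(s\) and \(t\). The function names and the step size `h` are illustrative assumptions.

```python
def z_of_st(s, t):
    """z = x^2 y + x with x = s^2 + 3t and y = 2s - t."""
    x = s**2 + 3 * t
    y = 2 * s - t
    return x**2 * y + x

def partials_chain_rule(s, t):
    """Return (dz/ds, dz/dt) using Theorem 108, Part 1."""
    x = s**2 + 3 * t
    y = 2 * s - t
    f_x = 2 * x * y + 1
    f_y = x**2
    dz_ds = f_x * (2 * s) + f_y * 2      # dx/ds = 2s, dy/ds = 2
    dz_dt = f_x * 3 + f_y * (-1)         # dx/dt = 3,  dy/dt = -1
    return dz_ds, dz_dt

s0, t0 = 1.0, 2.0
h = 1e-6  # finite-difference step (assumption)

dz_ds, dz_dt = partials_chain_rule(s0, t0)
# Hold one variable fixed and vary the other, as in the definition of a partial derivative.
num_ds = (z_of_st(s0 + h, t0) - z_of_st(s0 - h, t0)) / (2 * h)
num_dt = (z_of_st(s0, t0 + h) - z_of_st(s0, t0 - h)) / (2 * h)

print(dz_ds, dz_dt)    # 100.0 -46.0
print(num_ds, num_dt)  # numerical estimates, close to the values above
```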
Example \(\PageIndex{5}\): Using the Multivariable Chain Rule, Part II
Let \(w = xy+z^2\), where \(x= t^2e^s\), \(y= t\cos s\), and \(z=s\sin t\). Find \(\frac{\partial w}{\partial t}\) when \(s=0\) and \(t=\pi\).
Solution
Following Theorem 108, we compute the following partial derivatives:
\[\frac{\partial f}{\partial x} = y\qquad\qquad \frac{\partial f}{\partial y} = x\qquad\qquad \frac{\partial f}{\partial z} = 2z,\]
\[\frac{\partial x}{\partial t} = 2te^s\qquad\qquad \frac{\partial y}{\partial t} = \cos s\qquad\qquad \frac{\partial z}{\partial t} = s\cos t.\]
Thus \[ \frac{\partial w}{\partial t} = y(2te^s) + x(\cos s) + 2z(s\cos t).\]
When \(s=0\) and \(t=\pi\), we have \(x=\pi^2\), \(y=\pi\) and \(z=0\). Thus
\[\frac{\partial w}{\partial t} = \pi(2\pi) + \pi^2 = 3\pi^2.\]
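The general formula in Theorem 108 is a sum of products, one term per intermediate variable. A minimal Python sketch of that bookkeeping follows; the helper name `chain_rule_sum` is hypothetical, and the derivative values fed to it are the ones computed by hand in Examples 5 and 2.

```python
import math

def chain_rule_sum(partials, rates):
    """Sum of (df/dx_i) * (dx_i/dt) over all intermediate variables x_i."""
    if len(partials) != len(rates):
        raise ValueError("need one rate per partial derivative")
    return sum(p * r for p, r in zip(partials, rates))

# Example 5: w = xy + z^2 with x = t^2 e^s, y = t cos s, z = s sin t, at s = 0, t = pi.
s, t = 0.0, math.pi
x = t**2 * math.exp(s)
y = t * math.cos(s)
z = s * math.sin(t)

partials = [y, x, 2 * z]                                      # w_x, w_y, w_z
rates = [2 * t * math.exp(s), math.cos(s), s * math.cos(t)]   # x_t, y_t, z_t
dw_dt = chain_rule_sum(partials, rates)
print(dw_dt, 3 * math.pi**2)  # both approximately 29.6

# Example 2: only the rates are known, not the underlying functions.
dz_dt_example2 = chain_rule_sum([5, -2], [3, 7])
print(dz_dt_example2)  # 1
```

The second call illustrates the point made after Example 1: the rule needs only the rates of change at the moment of interest, not formulas for the functions.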
Implicit Differentiation
In Section 2.6 we studied in detail how to find \(\frac{dy}{dx}\) when \(y\) is given as an implicit function of \(x\). We find here that the Multivariable Chain Rule gives a simpler method of finding \(\frac{dy}{dx}\).
For instance, consider the implicit function \(x^2y-xy^3=3.\) We learned to use the following steps to find \(\frac{dy}{dx}\):
\[\begin{align}
\frac{d}{dx}\Big(x^2y-xy^3\Big) &= \frac{d}{dx}\Big(3\Big) \notag\\
2xy + x^2\frac{dy}{dx}-y^3-3xy^2\frac{dy}{dx} &= 0\notag \\
\frac{dy}{dx} &= -\frac{2xy-y^3}{x^2-3xy^2}.\label{eq:mchain2}
\end{align}\]
Instead of using this method, consider \(z=x^2y-xy^3\). The implicit function above describes the level curve \(z=3\). Considering \(x\) and \(y\) as functions of \(x\), the Multivariable Chain Rule states that
\[\frac{dz}{dx} = \frac{\partial z}{\partial x}\frac{dx}{dx}+\frac{\partial z}{\partial y}\frac{dy}{dx}.\label{eq:mchain1}\]
Since \(z\) is constant (in our example, \(z=3\)), \(\frac{dz}{dx} = 0\). We also know \(\frac{dx}{dx} = 1\). Equation \ref{eq:mchain1} becomes
\[\begin{align*}
0 &= \frac{\partial z}{\partial x}(1) + \frac{\partial z}{\partial y}\frac{dy}{dx} \quad \Rightarrow\\
\frac{dy}{dx} &= -\frac{\partial z}{\partial x}\Big/\frac{\partial z}{\partial y}\\
&= -\frac{\,f_x\,}{f_y}.
\end{align*}\]
Note how our solution for \(\frac{dy}{dx}\) in Equation \ref{eq:mchain2} is just the partial derivative of \(z\) with respect to \(x\), divided by the partial derivative of \(z\) with respect to \(y\), all multiplied by \(-1\).
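To see this explicitly for \(z=x^2y-xy^3\), compute the two partial derivatives directly:
\[\frac{\partial z}{\partial x} = 2xy-y^3,\qquad \frac{\partial z}{\partial y} = x^2-3xy^2,\qquad\text{so}\qquad -\frac{\partial z}{\partial x}\Big/\frac{\partial z}{\partial y} = -\frac{2xy-y^3}{x^2-3xy^2},\]
which agrees with Equation \ref{eq:mchain2}.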
We state the above as a theorem.
THEOREM 109 Implicit Differentiation
Let \(f\) be a differentiable function of \(x\) and \(y\), where \(f(x,y)=c\) defines \(y\) as an implicit function of \(x\), for some constant \(c\). Then
\[\frac{dy}{dx} = - \frac{f_x(x,y)}{f_y(x,y)}.\]
We practice using Theorem 109 by applying it to a problem from Section 2.6.
Example \(\PageIndex{6}\): Implicit Differentiation
Given the implicitly defined function \(\sin(x^2y^2)+y^3=x+y\), find \(y'\). Note: this is the same problem as given in Example 2.6.4 from Section 2.6, where the solution took about a full page to find.
Solution
Let \(f(x,y) = \sin(x^2y^2)+y^3-x-y\); the implicitly defined function above is equivalent to \(f(x,y)=0\). We find \(\frac{dy}{dx}\) by applying Theorem 109. We find
\[f_x(x,y) = 2xy^2\cos(x^2y^2)-1\qquad \text{and}\qquad f_y(x,y) = 2x^2y\cos(x^2y^2)+3y^2-1,\]
so
\[\frac{dy}{dx} = -\frac{2xy^2\cos(x^2y^2)-1}{2x^2y\cos(x^2y^2)+3y^2-1},\]
which matches our solution from Example 2.6.4.
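The formula can also be tested numerically at a point on the curve. The Python sketch below (standard library only) uses the point \((0,1)\), which satisfies \(\sin(x^2y^2)+y^3=x+y\), and compares the slope from Theorem 109 with a slope estimated by solving for \(y\) at two nearby \(x\)-values. The choice of point, the step `h`, the use of Newton's method, and all function names are assumptions made for illustration.

```python
import math

def f(x, y):
    """f(x, y) = sin(x^2 y^2) + y^3 - x - y; the curve is f(x, y) = 0."""
    return math.sin(x**2 * y**2) + y**3 - x - y

def f_x(x, y):
    return 2 * x * y**2 * math.cos(x**2 * y**2) - 1

def f_y(x, y):
    return 2 * x**2 * y * math.cos(x**2 * y**2) + 3 * y**2 - 1

def solve_for_y(x, y_guess, steps=20):
    """Newton's method in y for fixed x: find y near y_guess with f(x, y) = 0."""
    y = y_guess
    for _ in range(steps):
        y -= f(x, y) / f_y(x, y)
    return y

x0, y0 = 0.0, 1.0   # sample point on the curve (assumption)
h = 1e-4            # small change in x (assumption)

slope_theorem = -f_x(x0, y0) / f_y(x0, y0)
# Slope of the implicitly defined curve, estimated from two nearby points on it.
slope_numeric = (solve_for_y(x0 + h, y0) - solve_for_y(x0 - h, y0)) / (2 * h)
print(slope_theorem, slope_numeric)  # 0.5 and a value close to 0.5
```

Note that Theorem 109 requires \(f_y(x,y)\neq 0\) at the point in question; here \(f_y(0,1)=2\), so the division is safe.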