4.8: Implicit Differentiation
As we have seen, there is a close relationship between the derivatives of \( e^x\) and \(\ln x\) because these functions are inverses. Rather than relying on pictures for our understanding, we would like to be able to exploit this relationship computationally. In fact this technique can help us find derivatives in many situations, not just when we seek the derivative of an inverse function.
We will begin by illustrating the technique to find what we already know, the derivative of \(\ln x\). Let's write \(y=\ln x\) and then \( x=e^{\ln x}=e^y\), that is, \( x=e^y\). We say that this equation defines the function \(y=\ln x\) implicitly because while it is not an explicit expression \(y=\ldots\), it is true that if \( x=e^y\) then \(y\) is in fact the natural logarithm function.
Now, for the time being, pretend that all we know of \(y\) is that \( x=e^y\); what can we say about derivatives? We can take the derivative of both sides of the equation:
\[{d\over dx}x={d\over dx}e^y.\]
Then we use the chain rule on the right-hand side:
\[1 = \left({d\over dx}y\right) e^y = y'e^y.\]
Then we can solve for \(y'\):
\[y'={1\over e^y} = {1\over x}.\]
There is one little difficulty here. To use the chain rule to compute \( d/dx(e^y)=y'e^y\) we need to know that the function \(y\) has a derivative. All we have shown is that if it has a derivative then that derivative must be \(1/x\). When using this method we will always have to assume that the desired derivative exists, but fortunately this is a safe assumption for most such problems.
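As a quick numerical sanity check of this result, we can compare \(1/x\) with a difference-quotient estimate of the slope of \(\ln x\). This is only a sketch: the helper name `central_difference`, the step size `h`, and the sample points are choices made here for illustration, and agreement at a few points is evidence, not a proof.

```python
import math

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Compare the numerical slope of ln x with the implicit result y' = 1/x.
for x in (0.5, 1.0, 2.0, 10.0):
    numeric = central_difference(math.log, x)
    implicit = 1 / x
    print(f"x = {x:5.1f}   numeric = {numeric:.8f}   1/x = {implicit:.8f}")
```

The two columns should agree to several decimal places, which is consistent with \(y'=1/x\).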
The example \(y=\ln x\) involved an inverse function defined implicitly, but other functions can be defined implicitly, and sometimes a single equation can be used to implicitly define more than one function. Here's a familiar example. The equation \( r^2=x^2+y^2\) describes a circle of radius \(r\). The circle is not a function \(y=f(x)\) because for some values of \(x\) there are two corresponding values of \(y\).
If we want to work with a function, we can break the circle into two pieces, the upper and lower semicircles, each of which is a function. Let's call these \(y=U(x)\) and \(y=L(x)\); in fact this is a fairly simple example, and it's possible to give explicit expressions for these:
\[ U(x)=\sqrt{r^2-x^2 }\]
and
\[ L(x)=-\sqrt{r^2-x^2 }.\]
But it's somewhat easier, and quite useful, to view both functions as given implicitly by \( r^2=x^2+y^2\): both \( r^2=x^2+U(x)^2\) and \( r^2=x^2+L(x)^2\) are true, and we can think of \( r^2=x^2+y^2\) as defining both \(U(x)\) and \(L(x)\).
Now we can take the derivative of both sides as before, remembering that \(y\) is not simply a variable but a function---in this case, \(y\) is either \(U(x)\) or \(L(x)\) but we're not yet specifying which one. When we take the derivative we just have to remember to apply the chain rule where \(y\) appears.
\[ \eqalign{ {d\over dx}r^2&={d\over dx}(x^2+y^2)\cr 0&=2x+2yy'\cr y'&={-2x\over 2y}=-{x\over y}.\cr }\]
Now we have an expression for \(y'\), but it contains \(y\) as well as \(x\). This means that if we want to compute \(y'\) for some particular value of \(x\) we'll have to know or compute \(y\) at that value of \(x\) as well. It is at this point that we will need to know whether \(y\) is \(U(x)\) or \(L(x)\). Occasionally it will turn out that we can avoid explicit use of \(U(x)\) or \(L(x)\) by the nature of the problem.
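To see that the single formula \(y'=-x/y\) really serves both semicircles, here is a small numerical sketch. The radius `R = 5`, the sample values of \(x\), and the helper names are assumptions made only for this illustration; the slope of each explicit branch is estimated with a difference quotient and compared with \(-x/y\), using the value of \(y\) from that branch.

```python
import math

R = 5.0  # radius chosen only for illustration

def U(x):
    """Upper semicircle."""
    return math.sqrt(R**2 - x**2)

def L(x):
    """Lower semicircle."""
    return -math.sqrt(R**2 - x**2)

def central_difference(f, x, h=1e-6):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

def implicit_slope(x, y):
    """Slope from implicit differentiation of r^2 = x^2 + y^2."""
    return -x / y

for branch in (U, L):
    for x in (-4.0, 1.0, 3.0):
        y = branch(x)  # the branch decides which y goes into -x/y
        numeric = central_difference(branch, x)
        print(f"{branch.__name__}: x = {x:4.1f}  y = {y:8.4f}  "
              f"numeric = {numeric:9.6f}  -x/y = {implicit_slope(x, y):9.6f}")
```

The only place the choice between \(U\) and \(L\) enters is the value of \(y\) substituted into \(-x/y\).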
Example \(\PageIndex{1}\)
Find the slope of the circle \( 4=x^2+y^2\) at the point \( (1,-\sqrt{3})\).
Solution
Since we know both the \(x\) and \(y\) coordinates of the point of interest, we do not need to explicitly recognize that this point is on \(L(x)\), and we do not need to use \(L(x)\) to compute \(y\)---but we could. Using the calculation of \(y'\) from above,
\[y'=-{x\over y}=-{1\over -\sqrt{3}}={1\over \sqrt{3}}.\]
It is instructive to compare this approach to others.
We might have recognized at the start that \( (1,-\sqrt{3})\) is on the function \( y=L(x)=-\sqrt{4-x^2}\). We could then take the derivative of \(L(x)\), using the power rule and the chain rule, to get
\[L'(x)=-{1\over 2}(4-x^2)^{-1/2}(-2x)={x\over\sqrt{4-x^2}}.\]
Then we could compute \( L'(1)=1/\sqrt{3}\) by substituting \(x=1\).
Alternatively, we could realize that the point is on \(L(x)\), but use the fact that \(y'=-x/y\). Since the point is on \(L(x)\) we can replace \(y\) by \(L(x)\) to get
\[y'=-{x\over L(x)}=-{x\over -\sqrt{4-x^2}}={x\over \sqrt{4-x^2}},\]
without computing the derivative of \(L(x)\) explicitly. Then we substitute \(x=1\) and get the same answer as before.
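The three routes in this example can be evaluated side by side. The following is a minimal sketch with variable names invented here; it simply evaluates each formula at \(x=1\).

```python
import math

def L(x):
    """Lower semicircle of 4 = x^2 + y^2."""
    return -math.sqrt(4 - x**2)

def L_prime(x):
    """Explicit derivative of L from the power and chain rules."""
    return x / math.sqrt(4 - x**2)

x0, y0 = 1.0, -math.sqrt(3)

slope_from_point = -x0 / y0             # y' = -x/y with both coordinates known
slope_from_explicit = L_prime(x0)       # differentiate L(x), then substitute
slope_from_substitution = -x0 / L(x0)   # y' = -x/y with y replaced by L(x)

print(slope_from_point, slope_from_explicit, slope_from_substitution)
```

All three values should equal \(1/\sqrt{3}\) up to rounding.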
In the case of the circle it is possible to find the functions \(U(x)\) and \(L(x)\) explicitly, but there are potential advantages to using implicit differentiation anyway. In some cases it is more difficult or impossible to find an explicit formula for \(y\) and implicit differentiation is the only way to find the derivative.
Example \(\PageIndex{2}\)
Find the derivative of any function defined implicitly by \( yx^2+e^y=x\).
Solution
We treat \(y\) as an unspecified function and use the chain rule:
\[\eqalign{ {d\over dx}(yx^2+e^y)&={d\over dx}x\cr (y\cdot 2x+y'\cdot x^2)+y'e^y &= 1\cr y'x^2+y'e^y&= 1-2xy\cr y'(x^2+e^y)&= 1-2xy\cr y'&={1-2xy\over x^2+e^y}.\cr }\]
You might think that the step in which we solve for \(y'\) could sometimes be difficult---after all, we're using implicit differentiation here because we can't solve the equation \( yx^2+e^y=x\) for \(y\), so maybe after taking the derivative we get something that is hard to solve for \(y'\).
In fact, this never happens. All occurrences of \(y'\) come from applying the chain rule, and whenever the chain rule is used it deposits a single \(y'\) multiplied by some other expression. So it will always be possible to group the terms containing \(y'\) together and factor out the \(y'\), just as in the previous example. If you ever get anything more difficult you have made a mistake and should fix it before trying to continue.
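Even though we cannot solve \( yx^2+e^y=x\) for \(y\) by hand, we can check the formula for \(y'\) numerically. The sketch below makes several assumptions that are not part of the example: it finds \(y\) for a given \(x\neq 0\) with Newton's method (the function names, starting guess, tolerance, and sample points are all chosen here for illustration), and then compares \((1-2xy)/(x^2+e^y)\) with a difference-quotient estimate of the slope.

```python
import math

def F(x, y):
    """Left side minus right side of y*x^2 + e^y = x."""
    return y * x**2 + math.exp(y) - x

def solve_for_y(x, y_guess=0.0, tol=1e-14, max_iter=100):
    """Find y with F(x, y) = 0 by Newton's method in y (assumes x != 0)."""
    y = y_guess
    for _ in range(max_iter):
        # The derivative of F with respect to y is x^2 + e^y, which is positive.
        step = F(x, y) / (x**2 + math.exp(y))
        y -= step
        if abs(step) < tol:
            break
    return y

def implicit_slope(x, y):
    """The formula for y' obtained by implicit differentiation."""
    return (1 - 2 * x * y) / (x**2 + math.exp(y))

def numeric_slope(x, h=1e-5):
    """Difference-quotient estimate of dy/dx along the curve."""
    return (solve_for_y(x + h) - solve_for_y(x - h)) / (2 * h)

for x in (0.5, 1.0, 2.0):
    y = solve_for_y(x)
    print(f"x = {x:3.1f}  y = {y:9.6f}  "
          f"formula = {implicit_slope(x, y):9.6f}  numeric = {numeric_slope(x):9.6f}")
```

For instance, the point \((1,0)\) satisfies the equation, and there the formula gives \(y'=1/2\).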
It is sometimes the case that a situation leads naturally to an equation that defines a function implicitly.
Example \(\PageIndex{3}\)
Consider all the points \((x,y)\) that have the property that the distance from \((x,y)\) to \( (x_1,y_1)\) plus the distance from \((x,y)\) to \( (x_2,y_2)\) is \(2a\) (\(a\) is some constant). These points form an ellipse, which like a circle is not a function but can be viewed as two functions pasted together. Because we know how to write down the distance between two points, we can write down an implicit equation for the ellipse:
\[\sqrt{(x-x_1)^2+(y-y_1)^2}+\sqrt{(x-x_2)^2+(y-y_2)^2}=2a.\]
Then we can use implicit differentiation to find the slope of the ellipse at any point, though the computation is rather messy.
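Here is a sketch of that computation. To keep it readable we introduce the shorthand \(d_1=\sqrt{(x-x_1)^2+(y-y_1)^2}\) and \(d_2=\sqrt{(x-x_2)^2+(y-y_2)^2}\) (notation used only here), and we assume as usual that \(y'\) exists and that neither distance is zero. Differentiating both sides of \(d_1+d_2=2a\) with the chain rule gives
\[{(x-x_1)+(y-y_1)y'\over d_1}+{(x-x_2)+(y-y_2)y'\over d_2}=0.\]
Grouping the terms that contain \(y'\) and solving, we get
\[y'=-{\displaystyle {x-x_1\over d_1}+{x-x_2\over d_2}\over \displaystyle {y-y_1\over d_1}+{y-y_2\over d_2}},\]
valid wherever the denominator is not zero.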
Example \(\PageIndex{4}\)
We have already justified the power rule by using the exponential function, but we could also do it for rational exponents by using implicit differentiation. Suppose that \( y=x^{m/n}\), where \(m\) and \(n\) are positive integers. We can write this implicitly as \( y^n=x^m\). Then, because we justified the power rule for integers, we can take the derivative of each side:
\[\eqalign{ ny^{n-1}y' &= mx^{m-1}\cr y'&= {m\over n}{x^{m-1}\over y^{n-1}}\cr y'&= {m\over n}{x^{m-1}\over (x^{m/n})^{n-1}}\cr y'&= {m\over n}x^{m-1-(m/n)(n-1)}\cr y'&= {m\over n}x^{m-1-m+(m/n)}\cr y'&= {m\over n}x^{(m/n)-1}.\cr }\]
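The first and last lines of this computation can be compared numerically. In the sketch below the function names, the sample exponents, and the point \(x=2\) are choices made here for illustration; it evaluates the intermediate form \({m\over n}x^{m-1}/y^{n-1}\), the final form \({m\over n}x^{(m/n)-1}\), and a difference-quotient estimate of the slope of \(x^{m/n}\).

```python
def central_difference(f, x, h=1e-6):
    """Approximate f'(x) with a symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)

def implicit_form(x, m, n):
    """y' = (m/n) * x^(m-1) / y^(n-1), with y = x^(m/n)."""
    y = x ** (m / n)
    return (m / n) * x ** (m - 1) / y ** (n - 1)

def power_rule(x, m, n):
    """y' = (m/n) * x^((m/n) - 1)."""
    return (m / n) * x ** (m / n - 1)

x = 2.0
for m, n in ((1, 2), (2, 3), (5, 4)):
    numeric = central_difference(lambda t: t ** (m / n), x)
    print(f"m/n = {m}/{n}  implicit = {implicit_form(x, m, n):.8f}  "
          f"power rule = {power_rule(x, m, n):.8f}  numeric = {numeric:.8f}")
```

The three columns should agree up to rounding for each exponent.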
Contributors
Integrated by Justin Marshall.