Mathematics LibreTexts

3.11: Chain Rule

    With one additional rule, we will have the power to take the derivative of any function we can write down. What is this amazing rule? Why, it’s called the chain rule. The chain rule is \(\frac{d}{dx} f(g(x)) = f'(g(x)) \cdot g'(x)\). If we drop the \((x)\) on each function (note that it is still there, it is just implied), we have a slightly shorter version:

    \(\boxed{\cfrac{d}{dx} f(g) = f'(g) \cdot g'}\)

    Here, \(f(g)\) is called function composition. It does not mean \(f\) times \(g\). It means we are sticking \(g\) inside of \(f\)! Like \(f\) is eating \(g\)! That’s actually cannibalism if you think about it — so don’t think about it too closely. But do remember how to do function composition.

    Why does this work? I’m not going to do a formal proof, but let’s run through an idea behind it. Recall that the derivative is the slope or how steep the graph of a function is. This is a lot easier to think about if we’re talking about lines. For example, suppose \({\color{red}f = 6x - 10}\), and \({\color{blue}g = \frac{1}{2}x + 3}\). As we know from algebra, the slope of the \({\color{red}f}\) line is \({\color{red}6}\), and the slope of the \({\color{blue}g}\) line is \({\color{blue} \frac{1}{2}}\).

    So what is the slope of \({\color{red}f}({\color{blue}g})\)? What this means is we are putting \({\color{blue}g}\) inside of \({\color{red}f}\). So if \({\color{red}f = 6x - 10}\), and \({\color{blue}g = \frac{1}{2}x + 3}\), we take the blue \({\color{blue}\frac{1}{2}x + 3}\) and use that to replace the red \({\color{red}x}\). Here is what it would look like:

    \[\begin{align*} {\color{red}f}({\color{blue}g}) & = {\color{red}6{\color{blue}\left(\frac{1}{2}x + 3\right)}-10} \\ & = 6\left(\frac{1}{2}x + 3\right)-10 \\ & = 3x + 18 - 10 \\ & = 3x + 8 \end{align*}\]

    So again, what is the slope of \({\color{red}f}({\color{blue}g})\)? We can see from this calculation that it is \(3\), the product of the two slopes \({\color{red}6}\) and \({\color{blue} \frac{1}{2}}\). That’s why you have \(f' \cdot g'\) in the formula.

    Okay, but what about the \((g)\) after the \(f'\) in the formula? One way to think about the function composition \(f(g)\) is we are looking at the \(f\) curve at the \(x\)-location of \(g\). When you take the derivative, you’re still looking at that same location (now on the \(f'\) curve), so you still need that \((g)\) there to specify that location.
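
    The two-line example above can be checked with a few lines of Python. This is only a sketch: the helper name `slope` and the sample points 0 and 2 are choices made here for illustration, not part of the original text. It measures rise over run for the composition and compares it with the product of the two individual slopes.

```python
def f(x):
    return 6 * x - 10      # the red line, slope 6

def g(x):
    return 0.5 * x + 3     # the blue line, slope 1/2

def f_of_g(x):
    return f(g(x))         # composition: g goes inside f

def slope(line, x0, x1):
    """Rise over run of a function between x0 and x1."""
    return (line(x1) - line(x0)) / (x1 - x0)

composite_slope = slope(f_of_g, 0.0, 2.0)
product_of_slopes = slope(f, 0.0, 2.0) * slope(g, 0.0, 2.0)
print(composite_slope, product_of_slopes)  # prints 3.0 3.0
```

    For lines the choice of sample points does not matter, since the slope is the same everywhere.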

    Chain rule with \(\ln(x^2 + x)\)

    Find \(\frac{d}{dx} \ln(x^2 + x)\).

    We must identify an “inside” function \(g\) and an “outside” function \(f\) in order to use the chain rule. Often, the “inside” function will be in parentheses (though not always). That works in this case, so the “inside” function is \(g = x^2 + x\). The outside function is \(\ln\), and hence \(f =\ln(x)\). We also know \(f' = \frac{1}{x}\) and \(g' = 2x + 1\). Now to use the chain rule, we first need \(f'(g)\). What is this? Well, remember that this is not multiplication, but sticking one function inside another. In this case, we are taking \(g = x^2 + x\) and sticking it into the function \(f' = \frac{1}{x}\). This means we replace the \(x\) in \(\frac{1}{x}\) with \(x^2 + x\), and get \(f'(g) = \frac{1}{x^2 + x}\). Hence we have

    \[\begin{align*} \frac{d}{dx} \ln(x^2 + x) & = f'(g) \cdot g' \\ & = \frac{1}{(x^2 + x)} \cdot (2x + 1) \\ & = \boxed{\frac{2x + 1}{x^2 + x}}. \end{align*}\]

    There you have it!
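
    A numerical sanity check of this answer might look like the following. It is a sketch, not a proof: the helper `central_difference`, the step size, and the sample points are assumptions made here. The idea is to estimate the slope of \(\ln(x^2 + x)\) from two nearby points and compare it with \(\frac{2x + 1}{x^2 + x}\).

```python
import math

def h(x):
    return math.log(x**2 + x)            # the original function

def chain_rule_answer(x):
    return (2 * x + 1) / (x**2 + x)      # f'(g) * g' from the example

def central_difference(func, x, step=1e-6):
    """Estimate the slope of func at x from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)

# Sample points are kept positive so that x^2 + x > 0 and ln is defined.
for x in (0.5, 1.0, 3.0):
    print(x, central_difference(h, x), chain_rule_answer(x))
```

    The two printed columns should agree to several decimal places at each sample point.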

    Chain rule with \((3x + 1)^2\)

    Find \(\frac{d}{dx} (3x + 1)^2\) in two different ways: using the power rule, and using the chain rule.

    Using the power rule, we first multiply out \((3x + 1)^2 = (3x + 1)(3x + 1) = 9x^2 + 3x + 3x + 1 = 9x^2 + 6x + 1\). In this form, it is easy to find the derivative: \(\frac{d}{dx} \left(9x^2 + 6x + 1\right) = 18x + 6\).

    Using the chain rule, we identify the inside function as \(g = 3x + 1\), and the outside function as \(f = x^2\). Then \(f' = 2x\) and \(g' = 3\), so \(f'(g) = 2(3x + 1)\). We then have

    \[\begin{align*} \frac{d}{dx} (3x + 1)^2 & = f'(g) \cdot g' \\ & = 2(3x + 1)(3) \\ & = 6(3x + 1) \\ & = \boxed{18x + 6}. \end{align*}\]

    Again, math just works!
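
    The agreement between the two methods can also be spot-checked in code. In this sketch the function names are made up for illustration. It relies on the fact that for a quadratic, the symmetric quotient \(\frac{p(x + h) - p(x - h)}{2h}\) equals the derivative exactly, so no approximation is involved.

```python
def p(x):
    return (3 * x + 1) ** 2              # the function being differentiated

def chain_rule_form(x):
    return 2 * (3 * x + 1) * 3           # f'(g) * g' with f = x^2, g = 3x + 1

def power_rule_form(x):
    return 18 * x + 6                    # derivative of 9x^2 + 6x + 1

def symmetric_quotient(func, x, step=1):
    """Slope through the points at x - step and x + step."""
    return (func(x + step) - func(x - step)) / (2 * step)

for x in range(-3, 4):
    print(x, chain_rule_form(x), power_rule_form(x), symmetric_quotient(p, x))
```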

    Things can get quite complicated with the chain rule.

    Complicated chain rule

    Find \(\frac{d}{dx} \sqrt[4]{ x^2 + 2e^x}\).

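    There are no explicit parentheses here, but the radical acts like parentheses, and it designates an inside function of \(g = x^2 + 2e^x\). The outside function is therefore \(f = \sqrt[4]{x}\). If we rewrite \(\sqrt[4]{x}\) as \(x^{1/4}\), the power rule gives \(f' = \frac{1}{4}x^{-3/4}\). We also have \(g' = 2x + 2e^x\). Hence, the chain rule gives

    \[\begin{align*} \frac{d}{dx} \sqrt[4]{x^2 + 2e^x} & = f'(g) \cdot g' \\ & = \frac{1}{4}(x^2 + 2e^x)^{-3/4} \cdot (2x + 2e^x) \\ & = \boxed{\frac{x + e^x}{2(x^2 + 2e^x)^{3/4}}}. \end{align*}\]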

    That’s as simplified as we can get the answer to be.
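
    For a derivative this messy, a numerical cross-check is reassuring. The sketch below assumes the pieces \(f = x^{1/4}\), \(f' = \frac{1}{4}x^{-3/4}\), \(g = x^2 + 2e^x\), and \(g' = 2x + 2e^x\), multiplies \(f'(g)\) by \(g'\), and compares the result with a slope estimated from two nearby points. The helper names and sample points are illustrative choices.

```python
import math

def h(x):
    return (x**2 + 2 * math.exp(x)) ** 0.25      # fourth root of the inside function

def chain_rule_answer(x):
    g = x**2 + 2 * math.exp(x)                   # inside function (always positive)
    g_prime = 2 * x + 2 * math.exp(x)            # its derivative
    return 0.25 * g ** (-0.75) * g_prime         # f'(g) * g'

def central_difference(func, x, step=1e-6):
    """Estimate the slope of func at x from two nearby points."""
    return (func(x + step) - func(x - step)) / (2 * step)

for x in (-1.0, 0.0, 2.0):
    print(x, central_difference(h, x), chain_rule_answer(x))
```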

    One more quick example.

    Proof of \(a^x\) rule

    Prove the rule \(\frac{d}{dx} \ a^x = \ln(a) \cdot a^x\).

    Recall that \(e^{\ln(a)} = a\). To prove this rule, we rewrite \(a^x = (e^{\ln(a)})^x = e^{x \ln(a)}\). We are then computing

    \(\frac{d}{dx} \ a^x = \frac{d}{dx} \ e^{x \ln(a)}\)

    To compute this derivative, we set \(f = e^x\) and \(g = x \ln(a)\). We find \(f' = e^x\) and \(g' = \ln(a)\), so by the chain rule

    \[\begin{align*} \frac{d}{dx} \ a^x & = \frac{d}{dx} \ e^{x \ln(a)} \\ & = f'(g) \cdot g' \\ & = e^{x \ln(a)} \cdot \ln(a). \end{align*}\]

    We’ve already shown that \(a^x = e^{x \ln(a)}\), so this simplifies to \(a^x \cdot \ln(a)\), as desired.
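
    The rule can be tried out numerically for a few bases. This is a sketch and not a substitute for the proof: the function names, the bases, and the sample points are assumptions chosen for illustration, and the base \(a\) must be positive so that \(\ln(a)\) is defined.

```python
import math

def rule(a, x):
    """Right-hand side of the rule: ln(a) * a^x."""
    return math.log(a) * a**x

def numerical_derivative(a, x, step=1e-6):
    """Estimate d/dx of a^x from two nearby points."""
    return (a ** (x + step) - a ** (x - step)) / (2 * step)

for a in (0.5, 2.0, 10.0):
    for x in (-1.0, 0.0, 1.5):
        print(a, x, numerical_derivative(a, x), rule(a, x))
```

    Note that for \(a = 0.5\) the derivative comes out negative, which matches \(\ln(0.5) < 0\).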

    This page titled 3.11: Chain Rule is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Tyler Seacrest via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.
