3.3: The Chain Rule

    Learning Objectives
    • State the chain rule for the composition of two functions.
    • Apply the chain rule together with the power rule.
    • Apply the chain rule and the product/quotient rules correctly in combination when both are necessary.

    We have seen the techniques for differentiating basic functions (for example, polynomials) as well as sums, differences, products, quotients, and constant multiples of these functions. However, these techniques do not allow us to differentiate compositions of functions, such as \(k(x)=\sqrt{3x^2+1}\). In this section, we study the rule for finding the derivative of the composition of two or more functions.

    Deriving the Chain Rule

    When we have a function that is a composition of two or more functions, we could use all of the techniques we have already learned to differentiate it. However, using all of those techniques to break down a function into simpler parts that we are able to differentiate can get cumbersome. Instead, we use the chain rule, which states that the derivative of a composite function is the derivative of the outer function evaluated at the inner function times the derivative of the inner function.

    To put this rule into context, let’s take a look at an example: \(h(x)=(x^2+3)^2\). We can think of the derivative of this function with respect to \(x\) as the rate of change of \((x^2+3)^2\) relative to the change in \(x\). Consequently, we want to know how \((x^2+3)^2\) changes as \(x\) changes. We can think of this event as a chain reaction: As \(x\) changes, \(x^2+3\) changes, which leads to a change in \((x^2+3)^2\). This chain reaction gives us hints as to what is involved in computing the derivative of \((x^2+3)^2\). First of all, a change in \(x\) forcing a change in \(x^2+3\) suggests that somehow the derivative of \(x^2+3\) is involved. In addition, the change in \(x^2+3\) forcing a change in \((x^2+3)^2\) suggests that the derivative of \(u^2\) with respect to \(u\), where \(u=x^2+3\), is also part of the final derivative.

    We can take a more formal look at the derivative of \(h(x)=(x^2+3)^2\) by setting up the limit that would give us the derivative at a specific value \(a\) in the domain of \(h(x)=(x^2+3)^2\).

    \[h'(a)=\lim_{x→a}\dfrac{(x^2+3)^2−(a^2+3)^2}{x−a}\nonumber \]

    This expression does not seem particularly helpful; however, we can modify it by expanding the numerator to obtain

    \[h'(a)=\lim_{x→a}\dfrac{x^4+6x^2+9-a^4-6a^2-9}{x-a}\]

    Simplifying further, and rearranging terms, we have

    \[h'(a)=\lim_{x→a}\dfrac{x^4-a^4+6x^2-6a^2}{x-a}\]

    Factoring the numerator, we have

    \[h'(a)=\lim_{x→a}\dfrac{(x^2+a^2)(x^2-a^2)+6(x^2-a^2)}{x-a}\]

    Factoring further,

    \[h'(a)=\lim_{x→a}\dfrac{(x^2-a^2)[x^2+a^2+6]}{x-a}\]

    Factoring \(x^2-a^2=(x-a)(x+a)\) and canceling the common factor of \(x-a\), we are at long last left with

    \[h'(a)=\lim_{x→a}(x+a)(x^2+a^2+6)\]

    Thus, using direct substitution, \(h'(a)=(a+a)(a^2+a^2+6)=2a(2a^2+6)=2a⋅2(a^2+3)\).

    In other words, if \(h(x)=(x^2+3)^2\), then \(h'(x)=2(x^2+3)⋅2x\). Thus, if we think of \(h(x)=(x^2+3)^2\) as the composition \((f∘g)(x)=f\big(g(x)\big)\) where \(f(x)= x^2\) and \(g(x)=x^2+3\), then the derivative of \(h(x)=(x^2+3)^2\) is the product of the derivative of \(g(x)=x^2+3\) and the derivative of the function \(f(x)=x^2\) evaluated at the function \(g(x)=x^2+3\).
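
    The limit above can also be checked numerically. The following is a minimal Python sketch, not part of the original derivation; the function names `difference_quotient` and `h_prime_formula` are illustrative. It evaluates the difference quotient of \(h(x)=(x^2+3)^2\) at \(a=1\) for shrinking step sizes and compares the results with the formula \(h'(a)=2(a^2+3)⋅2a\), which gives 16 at \(a=1\).

```python
def h(x):
    """The composite function h(x) = (x^2 + 3)^2 from the text."""
    return (x**2 + 3) ** 2


def difference_quotient(func, a, x):
    """Slope of the secant line through (a, func(a)) and (x, func(x))."""
    return (func(x) - func(a)) / (x - a)


def h_prime_formula(a):
    """The derivative found above: h'(a) = 2(a^2 + 3) * 2a."""
    return 2 * (a**2 + 3) * 2 * a


a = 1.0
for step in (1e-1, 1e-3, 1e-5):
    # The secant slopes should approach h'(1) as the step shrinks.
    print(step, difference_quotient(h, a, a + step))
print("formula:", h_prime_formula(a))
```

    The printed secant slopes should settle toward the value given by the formula as the step size decreases.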

    Now that we have derived a special case of the chain rule, we state the general case and then apply it in a general form to other composite functions. An informal proof is provided later in this section.

    Rule: The Chain Rule

    Let \(f\) and \(g\) be functions. For all \(x\) in the domain of \(g\) for which \(g\) is differentiable at \(x\) and \(f\) is differentiable at \(g(x)\), the derivative of the composite function

    \[h(x)=(f∘g)(x)=f\big(g(x)\big) \nonumber \]

    is given by

    \[h'(x)=f'\big(g(x)\big)\cdot g'(x). \nonumber \]

    Alternatively, if \(y\) is a function of \(u\), and \(u\) is a function of \(x\), then

    \[\dfrac{dy}{dx}=\dfrac{dy}{du}⋅\dfrac{du}{dx}. \nonumber \]

    Problem-Solving Strategy: Applying the Chain Rule
    1. To differentiate \(h(x)=f\big(g(x)\big)\), begin by identifying the outer function \(f(x)\) and the inner function \(g(x)\).
    2. Find \(f'(x)\) and evaluate it at \(g(x)\) to obtain \(f'\big(g(x)\big)\).
    3. Find \(g'(x).\)
    4. Write \(h'(x)=f'\big(g(x)\big)⋅g'(x).\)

    Note: When applying the chain rule to the composition of two or more functions, keep in mind that we work our way from the outside function in. It is also useful to remember that the derivative of the composition of two functions can be thought of as having two parts; the derivative of the composition of three functions has three parts; and so on. Also, remember that we never evaluate a derivative at a derivative.
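
    The four steps of the strategy can be sketched in code. This is a hedged illustration in Python rather than a definitive implementation; the helper name `chain_rule` is hypothetical, and the derivatives \(f'\) and \(g'\) are supplied by hand rather than computed. It reuses \(h(x)=(x^2+3)^2\) with outer function \(f(u)=u^2\) and inner function \(g(x)=x^2+3\).

```python
def chain_rule(f_prime, g, g_prime):
    """Return h' for h(x) = f(g(x)), given f', g, and g' as callables."""
    def h_prime(x):
        # Outer derivative evaluated at the inner function, times inner derivative.
        return f_prime(g(x)) * g_prime(x)
    return h_prime


# Step 1: identify the outer function f(u) = u^2 and the inner function g(x) = x^2 + 3.
def g(x):
    return x**2 + 3


# Step 2: f'(u) = 2u, to be evaluated at u = g(x).
def f_prime(u):
    return 2 * u


# Step 3: g'(x) = 2x.
def g_prime(x):
    return 2 * x


# Step 4: h'(x) = f'(g(x)) * g'(x).
h_prime = chain_rule(f_prime, g, g_prime)
print(h_prime(1))  # 2*(1 + 3) * 2*1 = 16
```

    Note that `f_prime` is evaluated at `g(x)`, not at `g_prime(x)`, in keeping with the reminder that we never evaluate a derivative at a derivative.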

    The Chain and Power Rules Combined

    We can now apply the chain rule to composite functions, but note that we often need to use it with other rules. For example, to find derivatives of functions of the form \(h(x)=\big(g(x)\big)^n\), we need to use the chain rule combined with the power rule. To do so, we can think of \(h(x)=\big(g(x)\big)^n\) as \(f\big(g(x)\big)\) where \(f(x)=x^n\). Then \(f'(x)=nx^{n−1}\). Thus, \(f'\big(g(x)\big)=n\big(g(x)\big)^{n−1}\). This leads us to the derivative of a power function using the chain rule,

    \(h'(x)=n\big(g(x)\big)^{n−1}\cdot g'(x)\)

    Rule: Power Rule for Composition of Functions (General Power Rule)

    For all values of \(x\) for which the derivative is defined, if

    \[h(x)=\big(g(x)\big)^n, \nonumber \]

    then

    \[h'(x)=n\big(g(x)\big)^{n−1}\cdot g'(x). \label{genpow} \]
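
    For instance, the function \(k(x)=\sqrt{3x^2+1}\) from the start of this section can be written as \(k(x)=(3x^2+1)^{1/2}\). As a worked sketch, assuming the power rule also holds for the exponent \(n=\tfrac{1}{2}\) and taking \(g(x)=3x^2+1\), so that \(g'(x)=6x\), the general power rule would give

    \[k'(x)=\tfrac{1}{2}(3x^2+1)^{-1/2}\cdot 6x=\dfrac{3x}{\sqrt{3x^2+1}}. \nonumber \]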

    Example \(\PageIndex{1}\): Using the Chain and Power Rules

    Find the derivative of \(h(x)=\dfrac{1}{(3x^2+1)^2}\).

    Solution

    First, rewrite \(h(x)=\dfrac{1}{(3x^2+1)^2}=(3x^2+1)^{−2}\).

    Applying the general power rule (Equation \ref{genpow}) with \(g(x)=3x^2+1\) and \(n=−2\), so that \(g'(x)=6x\), we have

    \(h'(x)=−2(3x^2+1)^{−3}\cdot 6x\).

    Rewriting back to the original form gives us

    \(h'(x)=\dfrac{−12x}{(3x^2+1)^3}\).

    Exercise \(\PageIndex{1}\)

    Find the derivative of \(h(x)=(2x^3+2x−1)^4\).

    Hint

    Use the General Power Rule (Equation \ref{genpow}) with \(g(x)=2x^3+2x−1\).

    Answer

    \(h'(x)=4(2x^3+2x−1)^3(6x^2+2)=8(3x^2+1)(2x^3+2x−1)^3\)

    Example \(\PageIndex{2}\): Finding the Equation of a Tangent Line

    Find the equation of a line tangent to the graph of \(h(x)=\dfrac{1}{(3x−5)^2}\) at \(x=2\).

    Solution

    Because we are finding an equation of a line, we need a point. The \(x\)-coordinate of the point is 2. To find the \(y\)-coordinate, substitute 2 into \(h(x)\). Since \(h(2)=\dfrac{1}{(3(2)−5)^2}=1\), the point is \((2,1)\).

    For the slope, we need \(h'(2)\). To find \(h'(x)\), first we rewrite \(h(x)=(3x−5)^{−2}\) and apply the general power rule with \(g(x)=3x−5\), so that \(g'(x)=3\), to obtain

    \(h'(x)=−2(3x−5)^{−3}(3)=−6(3x−5)^{−3}\).

    By substituting, we have \(h'(2)=−6(3(2)−5)^{−3}=−6.\)

    Therefore, the line has equation \(y−1=−6(x−2)\). Rewriting, the equation of the line is \(y=−6x+13\).
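
    The point-slope computation in this example can be reproduced with exact rational arithmetic. The sketch below is illustrative only; the helper `tangent_line` is a hypothetical name, and the derivative \(h'(x)=−6(3x−5)^{−3}\) is entered by hand from the solution above rather than derived by the program.

```python
from fractions import Fraction


def h(x):
    """h(x) = 1 / (3x - 5)^2, evaluated exactly."""
    return Fraction(1) / (3 * x - 5) ** 2


def h_prime(x):
    """h'(x) = -6 (3x - 5)^(-3), taken from the worked solution."""
    return Fraction(-6) / (3 * x - 5) ** 3


def tangent_line(func, func_prime, x0):
    """Return (slope, intercept) of the tangent line y = slope*x + intercept at x0."""
    slope = func_prime(x0)
    intercept = func(x0) - slope * x0
    return slope, intercept


slope, intercept = tangent_line(h, h_prime, 2)
print(slope, intercept)  # -6 13
```

    The same helper can be pointed at any function whose derivative is known, as long as \(x_0\) is in the domain of both.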

    Exercise \(\PageIndex{2}\)

    Find the equation of the line tangent to the graph of \(f(x)=(x^2−2)^3\) at \(x=−2\).

    Hint

    Use the preceding example as a guide.

    Answer

    \(y=−48x−88\)

    Combining the Chain Rule with Other Rules

    Now that we can combine the chain rule and the power rule, we examine how to combine the chain rule with the other rules we have learned. In particular, we can use it with the product rule or quotient rule.

    Example \(\PageIndex{3}\): Combining the Chain Rule with the Product Rule

    Find the derivative of \(h(x)=(2x+1)^5(3x−2)^7\).

    Solution

    First apply the product rule, then apply the chain rule to each factor of the product.

    \(\begin{align*} h'(x)&=\dfrac{d}{dx}\big((2x+1)^5\big)⋅(3x−2)^7+\dfrac{d}{dx}\big((3x−2)^7\big)⋅(2x+1)^5 & & \text{Apply the product rule.}\\[4pt]
    &=5(2x+1)^4⋅2⋅(3x−2)^7+7(3x−2)^6⋅3⋅(2x+1)^5 & & \text{Apply the chain rule.}\\[4pt]
    &=10(2x+1)^4(3x−2)^7+21(3x−2)^6(2x+1)^5 & & \text{Simplify.}\\[4pt]
    &=(2x+1)^4(3x−2)^6(10(3x−2)+21(2x+1)) & & \text{Factor out }(2x+1)^4(3x−2)^6\\[4pt]
    &=(2x+1)^4(3x−2)^6(72x+1) & & \text{Simplify.} \end{align*}\)
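
    The algebra in the last two steps is easy to get wrong, so it can be worth checking the factored answer against the unsimplified one. The Python sketch below is an added illustration with hypothetical function names. Both expressions are polynomials of degree 11, and two such polynomials that agree at more than 11 distinct points must be identical, so exact agreement at the 13 integers from \(−6\) to \(6\) confirms the simplification.

```python
def unsimplified(x):
    """Derivative straight from the product and chain rules."""
    return (5 * (2 * x + 1) ** 4 * 2 * (3 * x - 2) ** 7
            + 7 * (3 * x - 2) ** 6 * 3 * (2 * x + 1) ** 5)


def factored(x):
    """Final factored form: (2x + 1)^4 (3x - 2)^6 (72x + 1)."""
    return (2 * x + 1) ** 4 * (3 * x - 2) ** 6 * (72 * x + 1)


# Integer inputs keep the arithmetic exact, so == is a fair comparison.
for x in range(-6, 7):
    assert unsimplified(x) == factored(x)

print(factored(1))  # 3^4 * 1^6 * 73 = 5913
```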

    Exercise \(\PageIndex{3}\)

    Find the derivative of \(h(x)=\dfrac{x}{(2x+3)^3}\).

    Hint

    Start out by applying the quotient rule. Remember to use the chain rule to differentiate the denominator.

    Answer

    \(h'(x)=\dfrac{3−4x}{(2x+3)^4}\)

    Example \(\PageIndex{4}\): Using the Chain Rule in a Velocity Problem

    A particle moves along a coordinate axis. Its position at time \(t\) is given by \(s(t)=(t^2+3t-4)^3\) feet, with time measured in seconds. What is the velocity of the particle at time \(t=2\)?

    Solution

    To find \(v(t)\), the velocity of the particle at time \(t\), we must differentiate \(s(t)\). Thus,

    \[v(t)=s'(t)=3(t^2+3t-4)^2\cdot (2t+3) \text{ ft/sec}.\]

    Evaluating at \(t=2\) gives

    \[v(2)=3(2^2+3\cdot 2-4)^2\cdot (2\cdot 2+3)=756 \text{ ft/sec}.\]
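
    As a sanity check on this value, the short Python sketch below (an added illustration; `central_difference` is a hypothetical helper and the step size `1e-6` is an arbitrary choice) evaluates the velocity formula at \(t=2\) and compares it with a numerical estimate of the slope of \(s(t)\).

```python
def position(t):
    """s(t) = (t^2 + 3t - 4)^3, in feet."""
    return (t**2 + 3 * t - 4) ** 3


def velocity(t):
    """v(t) = 3 (t^2 + 3t - 4)^2 (2t + 3), in ft/sec, from the chain rule."""
    return 3 * (t**2 + 3 * t - 4) ** 2 * (2 * t + 3)


def central_difference(func, t, step=1e-6):
    """Numerical estimate of func'(t) from points on either side of t."""
    return (func(t + step) - func(t - step)) / (2 * step)


print(velocity(2))  # 3 * 36 * 7 = 756
print(central_difference(position, 2.0))
```

    The second printed value should be close to 756, up to floating-point rounding.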

    Proof of Chain Rule

    At this point, we present a very informal proof of the chain rule. For simplicity’s sake we ignore certain issues: For example, we assume that \(g(x)≠g(a)\) for \(x≠a\) in some open interval containing \(a\). We begin by applying the limit definition of the derivative to the function \(h(x)\) to obtain \(h'(a)\):

    \[h'(a)=\lim_{x→a}\dfrac{f\big(g(x)\big)−f\big(g(a)\big)}{x−a}. \nonumber \]

    Rewriting, we obtain

    \[h'(a)=\lim_{x→a}\dfrac{f\big(g(x)\big)−f\big(g(a)\big)}{g(x)−g(a)}⋅\dfrac{g(x)−g(a)}{x−a}. \nonumber \]

    Although it is clear that

    \[\lim_{x→a}\dfrac{g(x)−g(a)}{x−a}=g'(a), \nonumber \]

    it is not obvious that

    \[\lim_{x→a}\dfrac{f\big(g(x)\big)−f\big(g(a)\big)}{g(x)−g(a)}=f'\big(g(a)\big). \nonumber \]

    To see that this is true, first recall that since \(g\) is differentiable at \(a\), \(g\) is also continuous at \(a.\) Thus,

    \[\lim_{x→a}g(x)=g(a). \nonumber \]

    Next, make the substitution \(y=g(x)\) and \(b=g(a)\) and use a change of variables in the limit to obtain

    \[\lim_{x→a}\dfrac{f\big(g(x)\big)−f \big(g(a) \big)}{g(x)−g(a)}=\lim_{y→b}\dfrac{f(y)−f(b)}{y−b}=f'(b)=f'\big(g(a)\big). \nonumber \]

    Finally,

    \[h'(a)=\lim_{x→a}\dfrac{f\big(g(x)\big)−f\big(g(a)\big )}{g(x)−g(a)}⋅\dfrac{g(x)−g(a)}{x−a}=f'\big(g(a)\big)\cdot g'(a). \nonumber \]
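
    To see the two factors of this product converge separately, consider the following Python sketch. It is a numerical illustration only, not a proof, and it assumes the specific choices \(f(u)=u^2\), \(g(x)=x^2+3\), and \(a=1\), for which \(g(x)≠g(a)\) when \(x\) is near but not equal to \(a\). The name `quotient_factors` is hypothetical.

```python
def f(u):
    return u**2


def g(x):
    return x**2 + 3


def quotient_factors(a, x):
    """Return the two difference quotients whose product appears in the proof."""
    outer = (f(g(x)) - f(g(a))) / (g(x) - g(a))  # should approach f'(g(a))
    inner = (g(x) - g(a)) / (x - a)              # should approach g'(a)
    return outer, inner


a = 1.0
for step in (1e-1, 1e-2, 1e-4):
    outer, inner = quotient_factors(a, a + step)
    print(step, outer, inner, outer * inner)
```

    For these choices, \(f'\big(g(1)\big)=f'(4)=8\) and \(g'(1)=2\), so the three printed columns should approach 8, 2, and 16, respectively.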

    Example \(\PageIndex{5}\): Using the Chain Rule with Functional Values

    Let \(h(x)=f\big(g(x)\big).\) If \(g(1)=4,g'(1)=3\), and \(f'(4)=7\), find \(h'(1).\)

    Solution

    Use the chain rule, then substitute.

    \[ \begin{align*} h'(1)&=f'\big(g(1)\big)\cdot g'(1) & & \text{Apply the chain rule.} \\[4pt]
    &=f'(4)⋅3 & &\text{Substitute}\; g(1)=4 \;\text{and}\;g'(1)=3. \\[4pt]
    &=7⋅3 & &\text{Substitute}\; f'(4)=7. \\[4pt]
    &=21 & &\text{Simplify.} \end{align*} \nonumber \]

    Exercise \(\PageIndex{5}\)

    Given \(h(x)=f(g(x))\). If \(g(2)=−3,g'(2)=4,\) and \(f'(−3)=7\), find \(h'(2)\).

    Hint

    Follow Example \(\PageIndex{5}\).

    Answer

    28

    The Chain Rule Using Leibniz’s Notation

    As with other derivatives that we have seen, we can express the chain rule using Leibniz’s notation. This notation for the chain rule is used heavily in physics applications.

    For \(h(x)=f(g(x)),\) let \(u=g(x)\) and \(y=h(x)=f(u).\) Thus,

    \[h'(x)=\dfrac{dy}{dx}\nonumber \]

    \[f'(g(x))=f'(u)=\dfrac{dy}{du}\nonumber \]

    and

    \[g'(x)=\dfrac{du}{dx}.\nonumber \]

    Consequently,

    \[\dfrac{dy}{dx}=h'(x)=f'\big(g(x)\big)\cdot g'(x)=\dfrac{dy}{du}⋅\dfrac{du}{dx}.\nonumber \]

    Rule: Chain Rule Using Leibniz’s Notation

    If \(y\) is a function of \(u\), and \(u\) is a function of \(x\), then

    \[\dfrac{dy}{dx}=\dfrac{dy}{du}⋅\dfrac{du}{dx}. \nonumber \]

    Example \(\PageIndex{6}\): Taking a Derivative Using Leibniz’s Notation I

    Find the derivative of \(y=\left(\dfrac{x}{3x+2}\right)^5.\)

    Solution

    First, let \(u=\dfrac{x}{3x+2}\). Thus, \(y=u^5\). Next, find \(\dfrac{du}{dx}\) and \(\dfrac{dy}{du}\). Using the quotient rule,

    \(\dfrac{du}{dx}=\dfrac{2}{(3x+2)^2}\)

    and

    \(\dfrac{dy}{du}=5u^4\).

    Finally, we put it all together.

    \[\begin{align*} \dfrac{dy}{dx}&=\dfrac{dy}{du}⋅\dfrac{du}{dx} & & \text{Apply the chain rule.}\\[4pt]
    &=5u^4⋅\dfrac{2}{(3x+2)^2} & & \text{Substitute}\; \frac{dy}{du}=5u^4\;\text{and}\;\frac{du}{dx}=\frac{2}{(3x+2)^2}. \\[4pt]
    &=5\left(\dfrac{x}{3x+2}\right)^4⋅\dfrac{2}{(3x+2)^2} & & \text{Substitute}\; u=\frac{x}{3x+2}. \\[4pt]
    &=\dfrac{10x^4}{(3x+2)^6} & & \text{Simplify.} \end{align*}\]

    It is important to remember that, when using the Leibniz form of the chain rule, the final answer must be expressed entirely in terms of the original variable given in the problem.
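
    The result of Example \(\PageIndex{6}\) can be checked exactly at a sample point. The Python sketch below is an added illustration; the function names are hypothetical and the test point \(x=\tfrac{1}{2}\) is an arbitrary choice. It multiplies \(\dfrac{dy}{du}\) and \(\dfrac{du}{dx}\) using rational arithmetic and compares the product with the simplified formula \(\dfrac{10x^4}{(3x+2)^6}\).

```python
from fractions import Fraction


def du_dx(x):
    """du/dx = 2 / (3x + 2)^2, from the quotient rule."""
    return Fraction(2) / (3 * x + 2) ** 2


def dy_du(u):
    """dy/du = 5u^4, from the power rule."""
    return 5 * u**4


def dy_dx_closed_form(x):
    """Simplified answer: 10x^4 / (3x + 2)^6."""
    return Fraction(10 * x**4) / (3 * x + 2) ** 6


x = Fraction(1, 2)
u = x / (3 * x + 2)  # substitute back so the answer depends only on x
print(dy_du(u) * du_dx(x) == dy_dx_closed_form(x))  # True
```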

    Key Concepts
    • The chain rule allows us to differentiate compositions of two or more functions. It states that for \(h(x)=f\big(g(x)\big),\)

    \(h'(x)=f'\big(g(x)\big)\cdot g'(x).\)

    In Leibniz’s notation this rule takes the form

    \(\dfrac{dy}{dx}=\dfrac{dy}{du}⋅\dfrac{du}{dx}\).

    • We can use the chain rule with other rules that we have learned, and we can derive formulas for some of them.
    • The chain rule combines with the power rule to form a new rule:

    If \(h(x)=\big(g(x)\big)^n\), then \(h'(x)=n\big(g(x)\big)^{n−1}\cdot g'(x)\).

    • When applied to the composition of three functions, the chain rule can be expressed as follows: If \(h(x)=f\Big(g\big(k(x)\big)\Big),\) then \(h'(x)=f'\Big(g\big(k(x)\big)\Big)\cdot g'\big(k(x)\big)\cdot k'(x).\)
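
    As an illustrative sketch (this example is not from the original text), take \(h(x)=\big((2x+1)^2+1\big)^3\) with the hypothetical choices \(f(u)=u^3\), \(g(v)=v^2+1\), and \(k(x)=2x+1\). Then \(f'(u)=3u^2\), \(g'(v)=2v\), and \(k'(x)=2\), so working from the outside in gives

    \[h'(x)=3\big((2x+1)^2+1\big)^2\cdot 2(2x+1)\cdot 2=12(2x+1)\big((2x+1)^2+1\big)^2. \nonumber \]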

    Key Equations

    • The chain rule

    \(h'(x)=f'\big(g(x)\big)\cdot g'(x)\)

    • The power rule for functions

    \(h'(x)=n\big(g(x)\big)^{n−1}\cdot g'(x)\)

    Glossary

    chain rule
    the chain rule defines the derivative of a composite function as the derivative of the outer function evaluated at the inner function times the derivative of the inner function

    This page titled 3.3: The Chain Rule is shared under a CC BY-NC-SA 1.0 license and was authored, remixed, and/or curated by Gilbert Strang & Edwin “Jed” Herman via source content that was edited to the style and standards of the LibreTexts platform.