C.2 The Chain Rule
On the preceding screen we reviewed compound functions and developed a conceptual understanding of the Chain Rule using the ascending balloon scenario. In particular, we saw that to find the rate-of-change of the overall function
Let's formalize that process, and then we'll work some initial problems to see how easy the Chain Rule is to use.
Formal Statement of the Chain Rule
Here is a formal statement of the Chain Rule:
Chain Rule, with prime notation:
For two functions $f$ and $g$, $[f(g(x))]' = f'(g(x)) \cdot g'(x)$
The Chain Rule is often written with Leibniz notation instead, because it's easy to remember and is one case where thinking of the derivative as a fraction actually comes in handy.
Chain Rule, with Leibniz notation:
For two differentiable functions $y = f(u)$ and $u = g(t)$, $\dfrac{dy}{dt} = \dfrac{dy}{du} \cdot \dfrac{du}{dt}$
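Using the alternate notation $(f \circ g)(x)$ for the composition $f(g(x))$, we can write the same rule another way.
Chain Rule, alternate notation: $\dfrac{d}{dx}(f \circ g)(x) = \dfrac{df}{dg} \cdot \dfrac{dg}{dx}$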
If we think of the derivative as a fraction of vanishing quantities (as Leibniz did), then the statement of the Chain Rule seems almost obvious. As you can see, when we multiply two fractions that share a common factor in the numerator and denominator (
This is probably what you did in your head when you considered the ascending balloon scenario on the preceding screen:
WARNING: While for first derivatives we can think of differentials as canceling in this way, we cannot extend this reasoning to second- and other higher-order derivatives. We also cannot apply other properties of fractions to derivatives. Indeed, even this notion of canceling differentials was quite controversial until the 1960s, which is quite late in the development of Calculus given that Newton and Leibniz were alive in the 1600s.
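To see concretely why the canceling picture fails for second derivatives, here is a short derivation. It is a sketch that assumes $y = f(u)$ and $u = g(t)$ are both twice differentiable; those names are chosen here for illustration. Differentiating the Chain Rule once more, using the Product Rule, gives:

```latex
\begin{align*}
\frac{d^2y}{dt^2}
  &= \frac{d}{dt}\!\left(\frac{dy}{du}\cdot\frac{du}{dt}\right) \\
  &= \frac{d}{dt}\!\left(\frac{dy}{du}\right)\cdot\frac{du}{dt}
     + \frac{dy}{du}\cdot\frac{d^2u}{dt^2} \\
  &= \frac{d^2y}{du^2}\left(\frac{du}{dt}\right)^{2}
     + \frac{dy}{du}\cdot\frac{d^2u}{dt^2}
\end{align*}
```

Treating the symbols as fractions would suggest the product $\frac{d^2y}{du^2}\cdot\frac{d^2u}{dt^2}$, which is not what we get: the true result has two terms, and the first involves the square of $\frac{du}{dt}$.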
Using the Chain Rule
Enough with the abstract; let's get to some examples to show how we use the Chain Rule routinely in practice. We'll begin by resolving the first quick examples we introduced at the top of the preceding page to illustrate why we need the Chain Rule at all.
Chain Rule Example #1:
Use the Chain Rule to differentiate
Note: As we saw above without using the Chain Rule, since
We'll solve this using three different approaches — but we encourage you to become comfortable with the third approach as quickly as possible, because that's the one you'll use to compute derivatives quickly as the course progresses.
• Solution 1.
Let's use the first form of the Chain Rule above:
Then
Hence
Ah: our naive approach at the start of the preceding page was missing that very factor of 2 that comes from the Chain Rule as the derivative of the inner function. As we said, the Chain Rule makes this all work easily.
• Solution 2.
Let's use the second form of the Chain Rule above:
We have
Then
• Solution 3.
With some experience, you won't introduce a new variable like
Note: You'd never actually write "stuff = ...." Instead just hold in your head what that "stuff" is, and proceed to write down the required derivatives.
Let's consider next one of the functions from our Compound Functions Example 1 on the preceding screen.
Chain Rule Example #2:
Given
Solution.
We'll again solve this using three different approaches, and again encourage you to become comfortable with the third approach as quickly as possible.
• Solution 1.
Let's use the first form of the Chain Rule above:
In Compound Functions Example 1, we recast this function as the composition
Then
Hence
• Solution 2.
Let's use the second form of the Chain Rule above:
We have
Then
• Solution 3.
With some experience, you won't introduce a new variable like
Note: You'd never actually write "stuff = ...." Instead just hold in your head what that "stuff" is, and proceed to write down the required derivatives. We'll make this more formal immediately below this Example.
What's chained in the Chain Rule?
Let's use the preceding Example to both explain why the Chain Rule has the name it does, and also to justify our use of "stuff" in quickly reasoning our way through finding the derivative of even the most complex functions.
First, the "Chain Rule" has the name it does because compositions of functions can be thought of as "chains" of functions, and the Chain Rule provides the means to differentiate these functions.
Consider the example from immediately above. The function
The third, final link in this chain is the most "outside" procedure we apply, whereas the first link is the inner-most piece of the function. We can write the entire procedure we are using with function and box notation as
To differentiate this chain, we start from the outside and work our way inward until we hit something that can be considered a function. We'll use a downward pointing arrow to indicate our focus as we work our way along:
So let's move the arrow to the right.
The crux of the Chain Rule is what happens next: now we must multiply this first Chain Rule term by the derivative of the inside function, meaning the stuff we covered up with
Remember the Chain Rule term!
For most beginning students, the most common error on exams is to forget to multiply by the derivative of the inner function.
Indeed, at this very moment all over the world teachers and tutors are saying "Chain Rule!" to beginning students who've forgotten this term. (It's also why, right before an exam, many students will write "Chain Rule!" on one hand to remind themselves to include this factor.) The "naive calculations" at the start of the preceding page illustrate this very error. With practice, you'll start catching yourself when you make it, and quickly remember to multiply by this "missing" Chain Rule factor.
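To make this error concrete, here is a small numerical check. It is a sketch only: the function $h(x) = (5x - 2)^3$ and all of the names below are hypothetical choices for illustration and are not taken from the examples on this page. The code compares the "naive" derivative (which forgets the inner derivative) and the Chain Rule derivative against a central-difference estimate of the slope.

```python
def h(x):
    """Hypothetical compound function: outer (stuff)**3, inner stuff = 5x - 2."""
    return (5 * x - 2) ** 3


def naive_derivative(x):
    """Differentiates only the outer function: the common beginner's error."""
    return 3 * (5 * x - 2) ** 2


def chain_rule_derivative(x):
    """Outer derivative times the derivative of the inner function, which is 5."""
    return 3 * (5 * x - 2) ** 2 * 5


def numerical_derivative(func, x, step=1e-6):
    """Central-difference estimate of the slope of func at x."""
    return (func(x + step) - func(x - step)) / (2 * step)


x0 = 1.0
estimate = numerical_derivative(h, x0)
print("numerical estimate:", estimate)
print("naive formula:     ", naive_derivative(x0))
print("with Chain Rule:   ", chain_rule_derivative(x0))
```

At `x0 = 1.0` the numerical estimate should agree closely with the Chain Rule value and differ from the naive value by the missing factor of 5, which is exactly the derivative of the inner function.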
Quick practice, and what's to come
On the next screen we'll provide you with lots of basic practice with the Chain Rule. Since the rest of the course will depend on your ability to quickly compute correct derivatives, practicing now (and making as many errors as you need to, and you will almost certainly initially make some) is super-important. On the screen after that, we'll address more complex problems: we simply extend the chain, and just keep going as we did above to find the derivative of quite complicated-looking functions.
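As a preview of how "extending the chain" works, the sketch below treats a compound function as a list of links and multiplies together the derivative of each link, evaluated at that link's own input. The helper `differentiate_chain` and the sample chain $\sin\big((2x+1)^3\big)$ are hypothetical illustrations, not part of this course's materials. The code walks the links from the innermost outward because each link needs the value produced by the one before it. Since multiplication is commutative, the product is the same as when you work from the outside in by hand.

```python
import math


def differentiate_chain(links, x):
    """Evaluate a chain of functions at x and its derivative there.

    links is a list of (function, derivative) pairs, innermost link first.
    Returns (value of the whole chain at x, derivative of the whole chain at x).
    """
    value = x
    derivative = 1.0
    for func, func_prime in links:
        # Multiply by this link's derivative, evaluated at this link's input.
        derivative *= func_prime(value)
        # Then pass the link's output on to the next link in the chain.
        value = func(value)
    return value, derivative


# Hypothetical three-link chain for sin((2x + 1)**3), innermost link first.
links = [
    (lambda x: 2 * x + 1, lambda x: 2.0),         # inner-most: 2x + 1
    (lambda u: u ** 3, lambda u: 3 * u ** 2),     # middle: (stuff)**3
    (math.sin, math.cos),                         # outer-most: sin(stuff)
]

value, derivative = differentiate_chain(links, 0.0)
print("value:", value)
print("derivative:", derivative)
```

Adding a fourth or fifth link only lengthens the list; the multiplication step stays the same.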
To end this screen, let's do some quick work on your ability to immediately notice where a Chain-Rule term is missing. We'll treat similar problems in more depth on the next screen; we mean these to be fast, just so you can get used to basic usage of the Chain Rule.
CHECK QUESTION 1:
CHECK QUESTION 2:
CHECK QUESTION 3:
The Upshot
- A compound (or composite) function is composed of an outer function and an inner function.
- When we take the derivative of a compound function, we must use the Chain Rule.
In prime notation: $[f(g(x))]' = f'(g(x)) \cdot g'(x) = [\text{derivative of the outer function, evaluated at the inner function}] \times [\text{derivative of the inner function}]$
In Leibniz notation: $\dfrac{dy}{dt} = \dfrac{dy}{du} \cdot \dfrac{du}{dt}$
In alternate notation: $\dfrac{d}{dx}(f \circ g)(x) = \dfrac{df}{dg} \cdot \dfrac{dg}{dx}$
And informally, the way you may quickly come to think about it: $\dfrac{df}{dx} = \left[\dfrac{df}{d(\text{stuff})}\text{, with the same stuff inside}\right] \times \dfrac{d}{dx}(\text{stuff})$
On the next screen, you'll get lots of practice with basic Chain Rule problems before we move on to more complex ones.