Chain Rule Proof

Theorem (Chain Rule): If `g` is differentiable at `a` and `f` is differentiable at `b = g(a)`, and `P(x) = f(g(x))` for all `x` in an interval `I` with `a in I`,
then `P` is differentiable at `a` and `P'(a) = f'(g(a)) cdot g'(a)`.

   
PROOF of Chain Rule:
Part I: Assume there is some interval `I` containing `a` such that `g(x) ne g(a)` for all `x in I` with `x ne a`.
Let `b = g(a)`, `x = a + h`, and `k = g(a+h) - g(a) = g(x) - g(a)` for `h ne 0` with `a + h in I`.
Note `g(x) = g(a+h) = g(a) + k = b + k`. See Figure 1.

From the assumption that `g` is differentiable at `a`, we have that `g` is also continuous at `a`. Thus we can conclude that `k to 0` as `h to 0`.

We’ll follow the usual steps in finding the derivative of `P` at `a`:
Step I: `P(a+h) = f(g(a+h))`
          `\underline{- P(a)\ \  \ \  = f(g(a))}`
Step II: `P(a+h) - P(a) = f(g(a+h)) - f(g(a)) = f(b + k) - f(b)`.
By the assumption of Part I, `k ne 0`, so we may divide and multiply by `k`. [Note: This is a major assumption; it fails for some functions.]
So
`P(a+h) - P(a) = {f(b + k) - f(b)}/k cdot k`
Therefore
`P'(a) = lim_{h to 0} { P(a+h) - P(a)}/h`
`= lim _{h to 0,  k to 0} {f(b + k) - f(b)}/k  cdot  k/h`
`= lim _{h to 0,  k to 0} {f(b + k) - f(b)}/k  cdot {g(a+h)-g(a) }/h`
`= f '(b)  cdot g'(a)`
`= f '(g(a)) cdot g'(a)`.
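
The factorization used in Part I can be checked numerically. The following is a minimal sketch, not part of the proof: the choices `f = sin`, `g(x) = x^3 + x`, and `a = 1` are assumptions made only for illustration (this `g` is strictly increasing, so `g(x) ne g(a)` whenever `x ne a`, as Part I requires). It computes the two factors `{f(b + k) - f(b)}/k` and `k/h` separately and compares their product with `f'(g(a)) cdot g'(a)`.

```python
import math

def f(y):
    # Illustrative outer function (an assumption, not from the text).
    return math.sin(y)

def g(x):
    # Illustrative inner function; strictly increasing, so k != 0 for h != 0.
    return x**3 + x

def factored_quotient(a, h):
    """Return the two factors of the difference quotient of P = f(g(x)) and their product."""
    b = g(a)
    k = g(a + h) - g(a)
    outer = (f(b + k) - f(b)) / k   # tends to f'(b) as k -> 0
    inner = k / h                   # tends to g'(a) as h -> 0
    return outer, inner, outer * inner

a = 1.0
# f'(g(a)) * g'(a) with f' = cos and g'(x) = 3x^2 + 1
expected = math.cos(g(a)) * (3 * a**2 + 1)

for h in (1e-1, 1e-2, 1e-3, 1e-4):
    outer, inner, product = factored_quotient(a, h)
    print(f"h={h:g}  outer={outer:.6f}  inner={inner:.6f}  "
          f"product={product:.6f}  error={abs(product - expected):.2e}")
```

As `h` shrinks, the first factor should settle near `f'(b)`, the second near `g'(a)`, and the product near `f'(g(a)) cdot g'(a)`.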

Part II:  Recall that we had `k = g(a+h)-g(a)` for `h ne 0`.
Suppose that  `k = 0` for values of `h` arbitrarily close to `0`.
Since we assume that `g` is differentiable at `a`, we know that
`lim_{h to 0} {g(a+h)-g(a)}/h` must exist. Our assumption that `k = 0` for `h` arbitrarily close to `0` means that
there is a sequence of values of `h`, `{h_n}`, with `h_n ne 0`, `h_n to 0` and `g(a+h_n) - g(a) = 0` for all `n`.
Thus `lim_{n to oo} {g(a+h_n)-g(a)}/{h_n} = lim_{n to oo} 0/{h_n} = 0`. [See Figure 2]
Thus `g'(a) = lim_{h to 0} {g(a+h)-g(a)}/h = 0`. [Since the limit exists, it must agree with the limit along the sequence `{h_n}`, so 0 is the only possible limit.]

To complete the argument we need only show that `P'(a)= 0`.
But for precisely the same `h` values that had `k = g(a+h)-g(a) = 0`, we have `g(a+h) = g(a)`. Thus for these values of `h`
`P(a+h) - P(a) = f(g(a+h)) - f(g(a)) = f(b + k) - f(b) = f(b) - f(b) = 0`
and hence `{P(a+h) - P(a)}/h = 0`. [See Figure 3]

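Now for any `h` where `k ne 0` (see Figure 1), the factorization of Part I is still valid: `{P(a+h) - P(a)}/h = {f(b + k) - f(b)}/k cdot k/h`. As `h to 0` through such values, the first factor approaches `f'(b)` and the second approaches `g'(a) = 0`, so `{P(a+h) - P(a)}/h to 0`.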

In summary, as `h` approaches `0`, `{P(a+h)-P(a)}/h` is either exactly `0` (when `k = 0`) or close to `0` (when `k ne 0`).
Thus `P'(a) = lim_{h to 0} {P(a+h) - P(a)}/h = 0 = f'(g(a)) cdot g'(a)`, as claimed. EOP.
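
The two cases of Part II can also be seen numerically. This is a hedged sketch under stated assumptions: the inner function `g(x) = x^2 cdot d(1/x)`, where `d(t)` is the distance from `t` to the nearest integer, and the outer function `f = exp` are hypothetical examples chosen for illustration, with `a = 0`. This `g` satisfies `|g(x)| le x^2 / 2`, so `g'(0) = 0`, and `g(h) = 0` whenever `1/h` is an integer, so `k = 0` for `h` arbitrarily close to `0`. Powers of two are used for the `k = 0` case so that the floating-point arithmetic is exact.

```python
import math

def g(x):
    # Hypothetical inner function: g(0) = 0, g'(0) = 0, and g(x) = 0
    # exactly whenever 1/x is an integer.
    if x == 0:
        return 0.0
    t = 1.0 / x
    return x * x * abs(t - round(t))  # x^2 times distance from 1/x to nearest integer

def f(y):
    # Illustrative outer function (an assumption, not from the text).
    return math.exp(y)

def P(x):
    return f(g(x))

def quotient(h, a=0.0):
    """Difference quotient (P(a+h) - P(a)) / h."""
    return (P(a + h) - P(a)) / h

# Case k = 0: h_n = 2**-n, so 1/h_n is an integer and g(h_n) = g(0).
zero_k = [quotient(2.0 ** -n) for n in range(1, 11)]

# Case k != 0: h = 1.5 * 2**-n, so 1/h is not an integer and g(h) != g(0).
nonzero_k = [quotient(1.5 * 2.0 ** -n) for n in range(1, 11)]

for n, (q0, q1) in enumerate(zip(zero_k, nonzero_k), start=1):
    print(f"n={n:2d}  quotient with k=0: {q0:.3e}   quotient with k!=0: {q1:.3e}")
```

Along the first sequence the quotient is exactly `0`; along the second it is nonzero but shrinks toward `0`, which matches the summary of Part II.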