CMSC 27100 — Lecture 12

Strong Induction

Sometimes, ordinary induction proves inconvenient for a given task. Conveniently, one of the best examples of this is the Fundamental Theorem of Arithmetic, which we want to prove anyway. Here's the statement of the theorem:

Theorem 11.1. Every natural number $n \gt 1$ can be written uniquely as a product of primes.

First, let's think about the problem, leaving the "uniquely" aside for the moment. We want to show that every natural number $n \gt 1$ has a prime factorization. Assuming that we set everything up properly, our inductive hypothesis would be that $n$ has a prime factorization, and we would conclude that $n+1$ has one too. So we consider $n+1$. If it's prime, then we're done. If it isn't, then we observe that $n+1 = ab$ for some natural numbers $a$ and $b$ with $1 \lt a, b \lt n+1$. What we want is to apply our inductive hypothesis to $a$ and $b$. However, our inductive hypothesis doesn't quite say what we want: it only says that $n$ has a prime factorization, and it says nothing about $a$ or $b$!

Strong induction is a reformulation of induction that gives us a "stronger" hypothesis to say these kinds of things.

Definition 11.2. Let $P(n)$ be a statement that depends on $n \in \mathbb N$. If

  1. $P(0)$, and
  2. for all $n \in \mathbb N$, if $P(k)$ for all $k \lt n$, then $P(n)$

are both true, then the following is true:

  1. for all $n \in \mathbb N$, $P(n)$ holds.

This is called the principle of strong induction.

What is the difference between ordinary mathematical induction and strong induction? Or, in other words, what makes strong induction strong? The hypothesis that we make in each of our implications is stronger. Recall that ordinary induction was stated as a chain of implications $P(k) \rightarrow P(k+1)$. Strong induction is a chain of implications of the form $$P(0) \wedge P(1) \wedge P(2) \wedge \cdots \wedge P(k) \rightarrow P(k+1).$$ Then what we're really proving is all of the following statements, \begin{align*} &P(0), \\ P(0) \rightarrow &P(1), \\ P(0) \wedge P(1) \rightarrow &P(2), \\ P(0) \wedge P(1) \wedge P(2) \rightarrow &P(3), \\ & \vdots \end{align*}

It's important to note that strong induction is not stronger in the sense that it allows us to prove things that we wouldn't be able to with mathematical induction. This is why I described strong induction as an alternate form of mathematical induction. It is possible to use mathematical induction to prove anything we would prove with strong induction, but this involves slightly altering the statement that you want to prove to incorporate your stronger assumptions.

A proof by strong induction looks similar to a proof by mathematical induction.

  1. Statement. Clearly state the statement $P$ that you want to prove and the variable on which you are performing induction.
  2. Base case. Prove $P(0)$.
  3. Inductive case. State the inductive hypothesis, $\forall k \lt n, P(k)$. Then, using the assumption that $P(k)$ for all $k \lt n$, prove $P(n)$. Clearly indicate when the inductive hypothesis is used.

Theorem 11.1a. Every natural number $n \gt 1$ can be written as a product of primes.

Proof. We will prove that for $n \gt 1$, $n$ can be written as a product of primes via strong induction on $n$.

Base case. $n = 2$. Clearly, 2 is a product of primes because 2 itself is a prime number.

Inductive case. Assume that for $2 \leq m \lt k$, $m$ can be written as a product of primes. We want to show that $k$ can be written as a product of primes. There are two cases to consider. If $k$ is prime, then $k$ is already a product of primes. Otherwise, $k$ is composite, so $k = ab$ for some natural numbers $a$ and $b$ with $2 \leq a \lt k$ and $2 \leq b \lt k$. By the inductive hypothesis, both $a$ and $b$ can be written as products of primes, and multiplying these factorizations together writes $k$ as a product of primes.

$\tag*{$\Box$}$
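The structure of this proof translates directly into a recursive program: the recursive calls on the smaller factors are exactly the appeals to the strong inductive hypothesis. Here's a sketch in Python (the function name and the trial-division search are my own choices, not part of the theorem):

```python
def prime_factorization(n):
    """Return a list of primes whose product is n, for n > 1.

    Mirrors the strong induction: if n = a * b with 1 < a, b < n,
    recurse on the strictly smaller factors a and b.
    """
    assert n > 1
    a = 2
    while a * a <= n:
        if n % a == 0:
            # n is composite: n = a * (n // a), both factors < n
            return prime_factorization(a) + prime_factorization(n // a)
        a += 1
    return [n]  # no divisor up to sqrt(n), so n is prime (the easy case)
```

For example, `prime_factorization(60)` yields `[2, 2, 3, 5]`. Note that ordinary induction on $n$ wouldn't justify this recursion, since the calls are on $a$ and $n / a$, not on $n - 1$.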

Fibonacci Numbers

Another really nice example for strong induction is the Fibonacci numbers.

Definition 11.3. The Fibonacci numbers are defined by

  1. $f_0 = 0$, $f_1 = 1$, and
  2. $f_n = f_{n-1} + f_{n-2}$ for $n \geq 2$.
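The recurrence also tells us how to compute $f_n$ directly, by carrying the last two values forward. A quick sketch (the function name is my own):

```python
def fib(n):
    """Compute the Fibonacci number f_n from the recurrence."""
    a, b = 0, 1              # invariant: a = f_k, b = f_{k+1}
    for _ in range(n):
        a, b = b, a + b      # advance the window by one index
    return a
```

So `fib(10)` is 55, matching $f_{10}$.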

In this form, the Fibonacci numbers lend themselves to inductive proofs quite naturally. However, just like with sums, sometimes we just want a nice neat formula to compute one term. It turns out that one can get an exact formula for $f_n$ in terms of the solutions to $x^2 - x - 1 = 0$. These are $\varphi = \frac{1 + \sqrt 5}{2}$, the Golden Ratio, and its conjugate root, $\widetilde \varphi = 1 - \varphi = \frac{1-\sqrt 5}{2}$. What we get is $$f_n = \frac{\varphi^n - \widetilde \varphi^n}{\sqrt 5}.$$ This is now known as Binet's formula, although it seems like it was known by a bunch of mathematicians like Euler, Bernoulli, and de Moivre over a century earlier.
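Before proving Binet's formula, we can at least sanity-check it numerically. The rounding below is only there because floating point can't represent $\sqrt 5$ exactly; this check is no substitute for the proof:

```python
from math import sqrt

PHI = (1 + sqrt(5)) / 2    # the Golden Ratio
PSI = (1 - sqrt(5)) / 2    # its conjugate root, 1 - PHI

def fib_binet(n):
    """Binet's formula, rounded to absorb floating-point error."""
    return round((PHI**n - PSI**n) / sqrt(5))

# compare against the recurrence for the first 30 terms
f = [0, 1]
for i in range(2, 30):
    f.append(f[i - 1] + f[i - 2])
assert all(fib_binet(n) == f[n] for n in range(30))
```

Since $|\widetilde \varphi| \lt 1$, the term $\widetilde \varphi^n$ shrinks quickly, which is why the rounded value of $\varphi^n / \sqrt 5$ alone already gives $f_n$ for modest $n$.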

Theorem 11.4. For all $n \in \mathbb N$, $$f_n = \frac{\varphi^n - \widetilde \varphi^n}{\sqrt 5}.$$

Proof. We will show this by strong induction on $n$.

Base case. For $n = 0$, we have $f_0 = 0 = \frac{\varphi^0 - \widetilde \varphi^0}{\sqrt 5}$. For $n = 1$, we have \begin{align*} \frac{\varphi - \widetilde \varphi}{\sqrt 5} &= \frac{\frac{1+\sqrt 5}{2} - \frac{1-\sqrt 5}{2}}{\sqrt 5} \\ &= \frac{\frac{2\sqrt 5}{2}}{\sqrt 5} \\ &= \frac{\sqrt 5}{\sqrt 5} \\ &= 1 \end{align*}

Inductive case. Suppose that $f_j = \frac{\varphi^j - \widetilde \varphi^j}{\sqrt 5}$ for all $j$ with $j \leq k$, where $k \geq 1$ (the values $n = 0$ and $n = 1$ are handled by the base case). Consider $f_{k+1}$. \begin{align*} f_{k+1} &= f_k + f_{k-1} \\ &= \frac{\varphi^k - \widetilde \varphi^k}{\sqrt 5} + \frac{\varphi^{k-1} - \widetilde \varphi^{k-1}}{\sqrt 5} &\text{by ind. hyp.} \\ &= \frac{\varphi^k + \varphi^{k-1} - (\widetilde \varphi^k + \widetilde \varphi^{k-1})}{\sqrt 5} \\ &= \frac{(1 + \varphi) \varphi^{k-1} - (1 + \widetilde \varphi) \widetilde \varphi^{k-1}}{\sqrt 5} \\ &= \frac{\varphi^2 \varphi^{k-1} - \widetilde \varphi^2 \widetilde \varphi^{k-1}}{\sqrt 5} &\text{since $\varphi^2 = \varphi + 1$ and $\widetilde \varphi^2 = \widetilde \varphi+1$}\\ &= \frac{\varphi^{k+1} - \widetilde \varphi^{k+1}}{\sqrt 5} \end{align*} $\tag*{$\Box$}$

Algorithmic Implications

This is post-class material that may be of interest but isn't required reading.

Using this property of the Fibonacci numbers, we can prove something interesting about the Euclidean algorithm. The following is a result attributed to Gabriel Lamé in 1844, which makes it likely the earliest example of both analysis of an algorithm and an application of the Fibonacci numbers.

Theorem 11.5 (Lamé's Theorem). Suppose $a_n \gt a_{n-1} \gt \cdots \gt a_1 \gt a_0 = 0$ is the sequence of numbers obtained from Euclid's algorithm (the successive values it computes, listed in reverse order). Then $a_i \geq f_i$ for all $i \leq n$.

Proof. We will prove this by (strong) induction on $n$.

Base case. From the statement, we have $a_0 = 0 = f_0$. Since $a_1 \gt a_0 = 0$ and $a_1$ is an integer, we have $a_1 \geq 1 = f_1$.

Inductive case. Assume that $a_j \geq f_j$ for all $j \lt n$. By the Division Theorem, there exists an integer $q$ such that $a_n = q \cdot a_{n-1} + a_{n-2}$ with $0 \leq a_{n-2} \lt a_{n-1}$. Since $a_n \gt a_{n-1}$, we have $q \geq 1$, and so \begin{align*} a_n &= q \cdot a_{n-1} + a_{n-2} \\ & \geq a_{n-1} + a_{n-2} \\ & \geq f_{n-1} + f_{n-2} & \text{by ind. hyp.} \\ & = f_n \end{align*} $\tag*{$\Box$}$
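To see Lamé's theorem in action, we can record the sequence of values Euclid's algorithm produces and compare it with the Fibonacci numbers. Consecutive Fibonacci numbers make the sequence shrink as slowly as possible, since then every quotient $q$ is 1. A sketch (the function name is mine):

```python
def euclid_sequence(a, b):
    """The values a_n > a_{n-1} > ... > a_1 > a_0 = 0 produced by
    Euclid's algorithm on (a, b), returned in increasing order."""
    seq = [a, b]
    while seq[-1] != 0:
        seq.append(seq[-2] % seq[-1])   # one division step
    return seq[::-1]                    # reverse so a_0 = 0 comes first

# Lamé's bound: each a_i is at least the i-th Fibonacci number
seq = euclid_sequence(144, 89)          # 144 and 89 are consecutive Fibonaccis
fibs = [0, 1]
while len(fibs) < len(seq):
    fibs.append(fibs[-1] + fibs[-2])
assert all(a_i >= f_i for a_i, f_i in zip(seq, fibs))
```

Running this on $(144, 89)$ produces $0, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144$, which sits just above the Fibonacci sequence term by term, as the theorem promises.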

When analyzing algorithms, we are concerned with the number of elementary operations that are performed in proportion to the size of the input and we are concerned with the growth of this function. Here, we consider the number of divisions to be our elementary operation, since that's the most important and costly operation we execute.

Corollary 11.6. Let $a \gt b \geq 0$ be natural numbers such that the representation of $a$ requires $d$ decimal digits and the calculation of $\gcd(a,b)$ via Euclid's algorithm requires $k$ division steps. Then $k \leq 5 \cdot d$.

Proof. Since the decimal representation of $a$ requires $d$ digits, we have $a \lt 10^d$. If the computation of $\gcd(a,b)$ by Euclid's algorithm requires $k$ steps, then by Lamé's Theorem, we have $a \geq f_k$. Since $|\widetilde \varphi| \lt 1$, Binet's formula gives $f_k \gt \frac{\varphi^k}{\sqrt 5} - 1$, and absorbing that additive error of less than 1 (it does not affect the final bound), we have \begin{align*} \frac{\varphi^k}{\sqrt 5} &\lt 10^d \\ k \log \varphi - \frac{\log 5}{2} &\lt d \log 10 \\ k &\lt \frac{d \log 10}{\log \varphi} + \frac{\log 5}{2 \log \varphi} \\ k &\lt 4.785d + 1.673 \end{align*} That this implies $k \leq 5d$ is clear for large $d$. For small $d$, we can check the values by hand. $\tag*{$\Box$}$
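The corollary is easy to test empirically by counting division steps on the Fibonacci worst cases. A sketch (the function name is mine):

```python
def division_steps(a, b):
    """Number of division steps Euclid's algorithm uses on (a, b)."""
    steps = 0
    while b != 0:
        a, b = b, a % b
        steps += 1
    return steps

# worst case: consecutive Fibonacci numbers as inputs
f_prev, f_curr = 1, 2
while f_curr < 10**12:
    k = division_steps(f_curr, f_prev)
    d = len(str(f_curr))          # decimal digits of the larger input
    assert k <= 5 * d             # the corollary's bound: k <= 5d
    f_prev, f_curr = f_curr, f_prev + f_curr
```

The counts track $4.785d$ closely, so the constant 5 in the corollary is nearly tight.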

This is a neat result that goes back to our brief discussion about the efficiency of trying to find a gcd. What Lamé's theorem tells us is that Euclid's algorithm executes a number of division steps roughly linear in the number of digits of the larger number $a$. The number of digits of $a$ is about $\log a$, so Euclid's algorithm scales roughly logarithmically with $a$.

If we compare this to the method of computing all the divisors, that method executes approximately as many operations as there are numbers from 1 to $a$; that is, the naive algorithm scales roughly linearly with $a$. Comparing the two, the gap is exponential: as a function of the number of digits, Euclid's algorithm is linear while the naive algorithm is exponential.
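To make the comparison concrete, here's a sketch that counts the elementary operations of each approach: divisibility checks for the naive method, division steps for Euclid's algorithm (function names are my own):

```python
def naive_gcd_ops(a, b):
    """Try every candidate divisor from 1 to min(a, b):
    about min(a, b) operations, i.e. linear in the inputs."""
    g, ops = 1, 0
    for d in range(1, min(a, b) + 1):
        ops += 1
        if a % d == 0 and b % d == 0:
            g = d
    return g, ops

def euclid_gcd_ops(a, b):
    """Euclid's algorithm: roughly log(a) division steps."""
    ops = 0
    while b != 0:
        a, b = b, a % b
        ops += 1
    return a, ops

# same gcd, wildly different operation counts
g1, n1 = naive_gcd_ops(123456, 7890)
g2, n2 = euclid_gcd_ops(123456, 7890)
assert g1 == g2
assert n2 < 50 < n1
```

On this input the naive method performs 7890 checks while Euclid's algorithm finishes in a handful of divisions, and the gap only widens as the inputs grow.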