% DEFINE some information that will be populated throughout the course notes.
\def \coursename {Linear Algebra}
\def \coursecode {MATH 2221}
\def \courseterm {Winter 2017}
\def \instructorname {Nathan Johnston}
% END DEFINITIONS

% IMPORT the course note formatting and templates
\input{course_notes_template}
% END IMPORT

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\setcounter{chapter}{10} % Set to one less than the week number
\chapter{Diagonalization}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

{\large This week we will learn about:
\begin{itemize}
	\item Diagonalization of matrices,
	\item Matrix functions, and
	\item Why diagonalization is amazing.
\end{itemize}\bigskip\bigskip

\noindent Extra reading:
\begin{itemize}
	\item Section 3.4 in the textbook
	\item Lecture videos \href{https://www.youtube.com/watch?v=LoTuUy4aB7g&list=PLOAf1ViVP13jmawPabxnAa00YFIetVqbd&index=40}{40},
	\href{https://www.youtube.com/watch?v=xGkCoOjfotA&list=PLOAf1ViVP13jmawPabxnAa00YFIetVqbd&index=41}{41},
	\href{https://www.youtube.com/watch?v=INMXYEMTRVY&list=PLOAf1ViVP13jmawPabxnAa00YFIetVqbd&index=42}{42},
	\href{https://www.youtube.com/watch?v=fzkD49_86pU&list=PLOAf1ViVP13jmawPabxnAa00YFIetVqbd&index=43}{43}, and
	\href{https://www.youtube.com/watch?v=MaG1f0GJNJc&list=PLOAf1ViVP13jmawPabxnAa00YFIetVqbd&index=44}{44} on YouTube
	\item \href{http://en.wikipedia.org/wiki/Diagonalizable_matrix}{Diagonalizable matrix} at Wikipedia
	\item \href{http://en.wikipedia.org/wiki/Matrix_exponential}{Matrix exponential} at Wikipedia
\end{itemize}\bigskip\bigskip

\noindent Extra textbook problems:
\begin{itemize}
	\item[$\star$] 3.4.1
	\item[$\star \, \star$] 3.4.2, 3.4.4, 3.4.6, 3.4.7, 3.4.22
	\item[$\star\star\star$] 3.4.8--3.4.12, 3.4.21
	\item[$\skull$] 3.4.23
\end{itemize}}

\newpage

\section*{Diagonalization}

One of the primary uses of eigenvalues and eigenvectors is that they let us put (most) matrices into a form that makes them almost as easy to work with as diagonal matrices.

\begin{definition}[Diagonalizable Matrices]
	A square matrix $A$ is called \textbf{diagonalizable} if there is a diagonal matrix $D$ and an invertible matrix $P$ such that $A = PDP^{-1}$.
\end{definition}

To get an idea of why diagonalizability is useful, consider the problem of computing a large power of a matrix, like $A^{500}$.
\begin{itemize}
	\item If $A$ is a general matrix... \horlines{2}
	% Ugh! AAA...A 500 times. Lots of matrix multiplication.

	\item If $A$ is a diagonal matrix... \horlines{2}
	% Easy! Just take 500th power of each diagonal entry (recall that diagonal matrix multiplication is entrywise)

	\item If $A$ is diagonalizable... \horlines{4}
	% A = PDP^{-1} so
	% A^500 = (PDP^{-1})(PDP^{-1})...(PDP^{-1}) = PD^{500}P^{-1}
\end{itemize}

\newpage

But how could we ever hope to determine whether or not a matrix is diagonalizable? It turns out that eigenvalues and eigenvectors give us the answer:

\begin{theorem}[Diagonalizability]
	Let $A$ be an $n \times n$ matrix. Then the following are equivalent:
	\begin{enumerate}
		\item $A$ is diagonalizable.
		\item $A$ has a set of $n$ linearly independent eigenvectors.
	\end{enumerate}

	\noindent Furthermore, if $A$ is diagonalizable then $A = PDP^{-1}$, where $P$ is the matrix whose columns are the $n$ linearly independent eigenvectors, and $D$ is the diagonal matrix whose diagonal entries are the eigenvalues corresponding to the eigenvectors in $P$, in the same order.
\end{theorem}
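For a quick sanity check of this theorem (on a small matrix of our own choosing, not one of this week's examples), consider $A = \begin{bmatrix}2 & 1 \\ 0 & 3\end{bmatrix}$, which has eigenvalues $2$ and $3$ with corresponding eigenvectors $(1,0)$ and $(1,1)$. The theorem says that
\begin{align*}
	P = \begin{bmatrix}1 & 1 \\ 0 & 1\end{bmatrix} \quad \text{and} \quad D = \begin{bmatrix}2 & 0 \\ 0 & 3\end{bmatrix}
\end{align*}
should work, and indeed
\begin{align*}
	PDP^{-1} = \begin{bmatrix}1 & 1 \\ 0 & 1\end{bmatrix}\begin{bmatrix}2 & 0 \\ 0 & 3\end{bmatrix}\begin{bmatrix}1 & -1 \\ 0 & 1\end{bmatrix} = \begin{bmatrix}2 & 3 \\ 0 & 3\end{bmatrix}\begin{bmatrix}1 & -1 \\ 0 & 1\end{bmatrix} = \begin{bmatrix}2 & 1 \\ 0 & 3\end{bmatrix} = A.
\end{align*}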
\begin{proof}
	Start by noticing that the equation $A = PDP^{-1}$ is equivalent to...

	\horlines{13}\vspace*{-1.3cm}
	% AP = PD
	% Use block-matrix multiplication to rewrite this in terms of the columns of P:
	% A[p_1 | p_2 | ... | p_n] = [p_1 | p_2 | ... | p_n]D
	% [Ap_1 | Ap_2 | ... | Ap_n] = [d_11 p_1 | d_22 p_2 | ... | d_nn p_n]
	% Ap_1 = d_11 p_1, etc.
	% In other words, A = PDP^{-1} if and only if the columns of P are eigenvectors of A whose corresponding eigenvalues are the diagonal entries of D. The eigenvectors must be linearly independent so that P is invertible.
	% Maybe add some statement about geometric multiplicities adding up to n as a 3rd equivalent condition.
\end{proof}

\newpage

\exx[5]{Diagonalize the matrix $A = \begin{bmatrix}1 & 2 \\ 5 & 4\end{bmatrix}$.}
% We already found the eigenvalues 6 and -1 earlier (week 10) and corresponding eigenvectors, so just plug in: (6, (2,5)) and (-1, (-1,1)).
% Then P = [2 -1; 5 1], D = diag(6,-1), P^{-1} = [1 1; -5 2]/7

OK, now that we've diagonalized this matrix... so what? How does it make our lives easier? Well...

\exx[12]{Find a formula for $A^{n}$ when $A = \begin{bmatrix}1 & 2 \\ 5 & 4\end{bmatrix}$.}
% We already diagonalized this matrix, so just compute the n-th power of the diagonal part and then un-diagonalize.
% Note that it would have been extremely hard to "eyeball" the formula like we did on early assignments.
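For reference, here is the formula that this procedure produces (worked out here as a check of our own, using the diagonalization found in the previous example):
\begin{align*}
	A^n = PD^nP^{-1} = \frac{1}{7}\begin{bmatrix}2 & -1 \\ 5 & 1\end{bmatrix}\begin{bmatrix}6^n & 0 \\ 0 & (-1)^n\end{bmatrix}\begin{bmatrix}1 & 1 \\ -5 & 2\end{bmatrix} = \frac{1}{7}\begin{bmatrix}2 \cdot 6^n + 5(-1)^n & 2 \cdot 6^n - 2(-1)^n \\ 5 \cdot 6^n - 5(-1)^n & 5 \cdot 6^n + 2(-1)^n\end{bmatrix}.
\end{align*}
Plugging in $n = 1$ recovers $A$ itself, which is a useful way to double-check a computation like this one.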
\newpage

\exx[22]{Find a formula for $A^{n}$ when $A = \begin{bmatrix}0 & -1 \\ 1 & 0\end{bmatrix}$.}
% There are many ways to solve this.
% Method 1: take powers and find a pattern. The powers repeat every 4th power.
% Method 2: geometric. This matrix is a rotation CCW by pi/2.
% Method 3: diagonalization. We did the eigenvalue/eigenvector computations last week.
% Go through all 3 methods, and note that method 3 is more work, but has the advantage of working in many situations where methods 1 and 2 don't.

\newpage

The previous theorem is nice since it completely characterizes when a matrix is diagonalizable. However, there is one special case worth pointing out in which it is much easier to show that a matrix is diagonalizable.

\begin{theorem}[Matrices with Distinct Eigenvalues]
	Let $A$ be an $n \times n$ matrix with \textbf{distinct} eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $A$ is diagonalizable.
\end{theorem}

\begin{proof}
	We just need to prove that eigenvectors corresponding to different eigenvalues are linearly independent. To show this...

	\horlines{15}\vspace*{-1.3cm}
	% Let v_j be an eigenvector corresponding to lambda_j.
	% Suppose (for contradiction) that the eigenvectors are dependent, and let k be the largest integer such that v_1, ..., v_k are independent. Then there exist c_1, c_2, ..., c_k so that \sum_{i=1}^k c_i v_i = v_{k+1} (Eq. 1). After multiplying by A we get \sum_{i=1}^k c_i \lambda_i v_i = \lambda_{k+1} v_{k+1} (Eq. 2). Subtracting \lambda_{k+1}(Eq. 1) - (Eq. 2) gives
	% \sum_{i=1}^k (\lambda_{k+1} - \lambda_i) c_i v_i = 0.
	% Since v_1, ..., v_k are independent, every coefficient (\lambda_{k+1} - \lambda_i)c_i = 0, and since the eigenvalues are distinct this forces every c_i = 0. But then v_{k+1} = 0, which contradicts the fact that eigenvectors are nonzero.
\end{proof}

\newpage

\exx[10]{Show that the matrix $A = \begin{bmatrix}1 & 1 \\ 1 & 0\end{bmatrix}$ is diagonalizable.}
% Its eigenvalues are phi = (1+sqrt(5))/2 and 1-phi. They are distinct, so we are done.

\noindent However, to actually perform the diagonalization itself, we still need to know the eigenvectors.

\section*{The Fibonacci Sequence}

As an example to demonstrate the usefulness of diagonalization, let's investigate the Fibonacci sequence, which is the sequence of integers that starts as follows:

\horlines{1}
% 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, ...

\noindent A bit more properly, it is defined by

\horlines{1}
% F_1 = 1, F_2 = 1, F_{n+1} = F_n + F_{n-1}

\noindent My question to you is: can we find a simple closed-form formula for the $n$-th term of this sequence, without computing all of the previous terms first?
% Yes!

\newpage

\noindent What we will do is represent the Fibonacci sequence via matrix multiplication. Notice that

\horlines{2}
% [F_{n+1}; F_n] = [1 1; 1 0][F_n; F_{n-1}]

\noindent Well, if we iterate this line of thinking, we get

\horlines{3}
% [F_{n+1}; F_n] = [1 1; 1 0][F_n; F_{n-1}] = [1 1; 1 0]^2[F_{n-1}; F_{n-2}] = ... = [1 1; 1 0]^{n-1}[1; 1]

Matrix multiplication is kinda nasty, but fortunately we already saw that this matrix is diagonalizable, so we can compute large powers of it easily! So let's find its eigenvectors (we already computed its eigenvalues):

\horlines{12}
% (1-lam)(-lam) - 1 = 0, so lam^2 - lam - 1 = 0, so lam = 1/2 +- sqrt(5)/2. Call these phi and 1-phi.
% If lam = phi then compute null([1-phi, 1; 1, -phi]) (FIRST ROW OPERATION: swap R1 and R2 to make the algebra work out nicely), so we get x = phi*y, and thus all vectors of the form y[phi; 1].
% If lam = 1-phi then compute null([phi, 1; 1, -1+phi]) (FIRST ROW OPERATION: swap R1 and R2 to make the algebra work out nicely, but actually just state the answer here, since it's very similar to the last one we found), so we get x = (1-phi)y, and thus all vectors of the form y[1-phi; 1].
% The row operations above are important for making the algebra work out cleanly below.

\newpage

\noindent Thus its diagonalization is:

\horlines{4}
% We need to compute P^{-1}; use the explicit 2x2 inverse formula.
% P = [phi, 1-phi; 1, 1]
% D = [phi, 0; 0, 1-phi]
% P^{-1} = [1, phi-1; -1, phi]/sqrt(5)

Finally, we can use this diagonalization to compute arbitrary powers of the matrix:

\horlines{11}
% IMPORTANT: DO NOT MULTIPLY MATRICES TOGETHER. INSTEAD, only do matrix-vector multiplications (i.e., go right-to-left) to make it work out cleanly.
% Thus [F_{n+1}; F_n] = [1 1; 1 0]^{n-1}[1; 1] = PD^{n-1}P^{-1}[1; 1] **BE CAREFUL TO PUT P ON THE LEFT AND P^{-1} ON THE RIGHT!!**
% = (1/sqrt(5))[phi, 1-phi; 1, 1][phi^{n-1}, 0; 0, (1-phi)^{n-1}][1, phi-1; -1, phi][1; 1]
% = (1/sqrt(5))[phi, 1-phi; 1, 1][phi^{n-1}, 0; 0, (1-phi)^{n-1}][phi; phi-1]
% = (1/sqrt(5))[phi, 1-phi; 1, 1][phi^n; -(1-phi)^n]
% = (1/sqrt(5))[phi^{n+1} - (1-phi)^{n+1}; phi^n - (1-phi)^n].

\noindent Thus we obtain the following simple formula for the $n$-th Fibonacci number:

\horlines{1}
% F_n = (phi^n - (1-phi)^n)/sqrt(5)

The idea used throughout this example applies in a lot of generality: if we can represent something by matrix multiplication, then there's a good chance that diagonalization (via eigenvalues/eigenvectors) can shed light on the problem.

\newpage

\section*{Arbitrary Matrix Powers}

Once we have diagonalized a matrix, performing an operation on it is almost as easy as performing that operation on a number. We already saw this with computing large powers of a matrix: our procedure was...
\begin{enumerate}
	\item First, \\
	% diagonalize the matrix: A = PDP^{-1}
	\item Next, \\
	% compute the large power of D
	\item Finally, \\
	% un-diagonalize: multiply by P on the left and P^{-1} on the right
\end{enumerate}

This same basic idea works in lots of generality, and helps us talk about things we wouldn't even know how to define otherwise. For example...

\begin{center}
	\Large{What is a square root $B$ of the matrix $A$?}
\end{center}

\horlines{2}
% A matrix B with the property that B^2 = A.

\begin{center}
	\Large{OK, how could we find a square root of $A$?}
\end{center}

\horlines{2}
% Diagonalize and find a square root of the diagonal part! If A = PDP^{-1} and B = P sqrt(D) P^{-1}, then B^2 = P sqrt(D) P^{-1} P sqrt(D) P^{-1} = PDP^{-1} = A.

\exx{Find a square root of the matrix $A = \begin{bmatrix}0 & 4 \\ -1 & 5\end{bmatrix}$.}
% The eigenvalues are 1 and 4, with corresponding eigenvectors [4; 1] and [1; 1].
% Diagonalize and compute.
% Double-check that the square of your answer is the original matrix.
% Note that you have flexibility in how you take your square roots due to plus-minus sign stuff (there are 4 square roots in this case), so we should say "A" square root, not "THE" square root.

\newpage

\horlines{22}

\newpage

\horlines{7}
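For reference, here is one square root that this procedure produces (a worked check of our own, choosing the positive square roots of the eigenvalues $1$ and $4$, whose eigenvectors are $(4,1)$ and $(1,1)$): taking
\begin{align*}
	P = \begin{bmatrix}4 & 1 \\ 1 & 1\end{bmatrix}, \quad \sqrt{D} = \begin{bmatrix}1 & 0 \\ 0 & 2\end{bmatrix}, \quad P^{-1} = \frac{1}{3}\begin{bmatrix}1 & -1 \\ -1 & 4\end{bmatrix}
\end{align*}
gives
\begin{align*}
	B = P\sqrt{D}P^{-1} = \frac{1}{3}\begin{bmatrix}2 & 4 \\ -1 & 7\end{bmatrix}, \quad \text{and indeed} \quad B^2 = \frac{1}{9}\begin{bmatrix}0 & 36 \\ -9 & 45\end{bmatrix} = \begin{bmatrix}0 & 4 \\ -1 & 5\end{bmatrix} = A.
\end{align*}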
Similarly, we can use this method to define the $r$-th power of a diagonalizable matrix for any real number $r$ (i.e., $r$ doesn't need to be an integer):

\begin{definition}[Matrix Powers]
	Let $r$ be a real number and let $A$ be a diagonalizable matrix (i.e., $A = PDP^{-1}$ for some invertible $P$ and diagonal $D$). Then $A^r = PD^rP^{-1}$, where $D^r$ is obtained by raising each of its diagonal entries to the $r$-th power.
\end{definition}
% Note that if r is a positive integer, this agrees with the usual definition.
% If r = 1/2, this agrees with the method of finding matrix square roots that we just discussed.

\exx[5]{Compute $A^\pi$ when $A = \begin{bmatrix}0 & 4 \\ -1 & 5\end{bmatrix}$.}

\noindent Question to ponder: What happens when $r = -1$ in the above definition?

\horlines{1}
% You get A^{-1}.

\newpage

\section*{Arbitrary Matrix Functions}

The previous section just touched the tip of the iceberg: we can also extend any function with a Taylor series to matrices now. For example, let's consider the function $e^x$. Recall that
\begin{align*}
	e^x = 1 + x \ + \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad \quad
	% x^2/2! + x^3/3! + ...
\end{align*}

\noindent With that in mind, we define $e^A$, where $A$ is a square matrix, as follows:

\horlines{1}
% e^A = I + A + A^2/2! + ... = \sum_{n=0}^\infty A^n / n!

That seems like it might be nasty to calculate in general. Fortunately, we can just do what we usually do: diagonalize, apply the function $e^x$ to each diagonal entry, and then un-diagonalize.

\exx[5]{Compute $e^A$, where $A = \begin{bmatrix}0 & 4 \\ -1 & 5\end{bmatrix}$.}

\noindent Why does this work?

\horlines{6}
% Because e^A = I + A + A^2/2! + ... = PP^{-1} + PDP^{-1} + PD^2P^{-1}/2! + ..., and then we can factor out P on the left and P^{-1} on the right.

\newpage

The following properties of the matrix exponential are straightforward to check:
\begin{itemize}
	\item $e^O = I$ \\
	% O is already diagonal, so just apply e^x to its diagonal entries.

	\item $e^Ae^{-A} = I$
	\horlines{3}
	% Pe^DP^{-1}Pe^{-D}P^{-1} = Pe^De^{-D}P^{-1} = I
\end{itemize}

There wasn't really anything too special about the function $e^x$ here: we can do the same thing for any function that equals its Taylor series. The idea is exactly the same: just apply the function to the diagonal entries in the diagonalization of the matrix.

\exx[7]{Compute $\sin(A)$, where $A = \begin{bmatrix}0 & 4 \\ -1 & 5\end{bmatrix}$.}

So go forth, and compute the $\sin$, $\arctan$, and $\log$ of matrices to your heart's content! \\

\noindent (Unless the matrices aren't diagonalizable... in that case, take Advanced Linear Algebra (MATH~3221) to learn what to do.)
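\bigskip

\noindent One final note for reference (a summary sketch of our own, restating the ``why does this work?'' argument above, and assuming $A = PDP^{-1}$ is diagonalizable):
\begin{align*}
	e^A = \sum_{n=0}^\infty \frac{A^n}{n!} = \sum_{n=0}^\infty \frac{PD^nP^{-1}}{n!} = P\left(\sum_{n=0}^\infty \frac{D^n}{n!}\right)P^{-1} = Pe^DP^{-1},
\end{align*}
where $e^D$ is computed by applying $e^x$ to each diagonal entry of $D$. Exactly the same factoring trick justifies $\sin(A) = P\sin(D)P^{-1}$ and friends.

\end{document}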