% DEFINE some information that will be populated throughout the course notes.
\def \coursename {Linear Algebra}
\def \coursecode {MATH 2221}
\def \courseterm {Winter 2020}
\def \instructorname {Nathan Johnston}
% END DEFINITIONS

% IMPORT the course note formatting and templates
\input{course_notes_template}
% END IMPORT

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\setcounter{chapter}{1} % Set to one less than the week number
\chapter{Lengths, Angles, and \\ the Dot Product}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

{\large This week we will learn about:
\begin{itemize}
\item The dot product,
\item The length of vectors and the angle between them, and
\item The Cauchy--Schwarz and triangle inequalities.
\end{itemize}\bigskip\bigskip

\noindent Extra reading and watching:
\begin{itemize}
\item Section 1.2 in the textbook
\item Lecture videos \href{https://www.youtube.com/watch?v=PJfvKCXpWZM&list=PLOAf1ViVP13jmawPabxnAa00YFIetVqbd&index=4}{4}, \href{https://www.youtube.com/watch?v=iffOTbS3IYw&list=PLOAf1ViVP13jmawPabxnAa00YFIetVqbd&index=5}{5}, \href{https://www.youtube.com/watch?v=iQmX26y9ZvI&list=PLOAf1ViVP13jmawPabxnAa00YFIetVqbd&index=6}{6}, and \href{https://www.youtube.com/watch?v=f73qCiJCIXE&list=PLOAf1ViVP13jmawPabxnAa00YFIetVqbd&index=7}{7} on YouTube
\item \href{http://en.wikipedia.org/wiki/Dot_product}{Dot product} at Wikipedia
\item \href{http://en.wikipedia.org/wiki/Cauchy-Schwarz_inequality}{Cauchy--Schwarz inequality} at Wikipedia
\end{itemize}\bigskip\bigskip

\noindent Extra textbook problems:
\begin{itemize}
\item[$\star$] 1.2.1--1.2.3, 1.2.7, 1.2.8
\item[$\star \, \star$] 1.2.4--1.2.6, 1.2.9--1.2.11
\item[$\star \star \star$] 1.2.12, 1.2.13, 1.2.17--1.2.21
\item[$\skull$] 1.2.23
\end{itemize}}

\newpage

\section*{The Dot Product}

In 2D (and sometimes in 3D), it is fairly intuitive to talk about geometric quantities like lengths or angles. You have used things like similar triangles and the law of cosines for tackling problems like this in the past. \\

Using vectors, we can now generalize these concepts to arbitrary dimensions (even though we can't picture it)! Our main tool will be...

\begin{definition}[Dot Product]
If $\v = (v_1,v_2,\ldots,v_n) \in \R^n$ and $\w = (w_1,w_2,\ldots,w_n) \in \R^n$ then the \textbf{dot product} of $\v$ and $\w$, denoted by $\v \cdot \w$, is the quantity
\begin{align*}
\v \cdot \w \defeq v_1w_1 + v_2w_2 + \cdots + v_nw_n.
\end{align*}
\end{definition}

Please be wary of what types of objects go into and come out of the dot product:

\horlines{3}
% Input: 2 vectors, Output: a single number!
% v \cdot w \cdot x makes no sense, for example.

\noindent Intuitively, the dot product $\mathbf{v} \cdot \mathbf{w}$ tells you how much $\mathbf{v}$ points in the direction of $\mathbf{w}$ (or how much $\mathbf{w}$ points in the direction of $\mathbf{v}$).

\exx[6]{2D examples.}
% Maybe ask students for one and then pick another one to do as well where the second vector is changed to perpendicular. Also dot product of (x,y) with (1,0) and (0,1). Draw pictures.

\newpage

\exx[5]{Higher-dimensional examples.}
% Compute the dot product of $(1,2,4)$ and $(3,-1,2)$.
% 3 - 2 + 8 = 9
% 4D or higher dimensional example too
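% Added note (not in the original notes): one possible 4D example to fill the "4D or higher" request above, with vectors chosen arbitrarily for illustration:
% (1,0,2,-1) \cdot (2,3,0,4) = 1*2 + 0*3 + 2*0 + (-1)*4 = 2 + 0 + 0 - 4 = -2.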
We have defined a new mathematical operation, so it's time for another ``obvious'' theorem telling us what properties it satisfies:

\begin{theorem}[Properties of the Dot Product]
Let $\v,\w,\z \in \R^n$ be vectors and let $c \in \R$ be a scalar. Then
\begin{enumerate}
\item $\v \cdot \w = \w \cdot \v$ \hfill {\color{gray}(commutativity)}
\item $\v \cdot (\w + \z) = \v \cdot \w + \v \cdot \z$ \hfill {\color{gray}(distributivity)}
\item $(c\v) \cdot \w = c(\v \cdot \w)$
\end{enumerate}
\end{theorem}

\begin{proof}
We will prove property~(a). You can try the rest on your own (the method is quite similar).

\horlines{6}
% To prove~(a), simply apply the definition of the dot product:
% \v \cdot \w
% = v_1w_1 + v_2w_2 + \cdots + v_nw_n
% = w_1v_1 + w_2v_2 + \cdots + w_nv_n
% = \w \cdot \v

\noindent This completes the proof.
\end{proof}

\newpage

\exx[3]{Compute $\frac{1}{2}(-1,-3,2) \cdot (6,-4,2)$.}
% We can move the 1/2 to the right vector and *then* take the dot product, which makes the algebra a bit cleaner. We get (-1,-3,2) \cdot (3,-2,1) = -3 + 6 + 2 = 5.
% Maybe also do this the "ugly" way.

\exx[5]{Show that $(\mathbf{v} + \mathbf{w}) \cdot (\mathbf{v} + \mathbf{w}) = \mathbf{v}\cdot\mathbf{v} + 2\mathbf{v}\cdot\mathbf{w} + \mathbf{w}\cdot\mathbf{w}$ for all $\mathbf{v},\mathbf{w}\in\mathbb{R}^n$.}
% Just go through the calculation.
% The point: you can "FOIL" or "multiply out" brackets with the dot product just like with regular multiplication.

\section*{Length of a Vector}

We now start making use of the dot product to talk about things like the length of vectors or the angle between vectors (even in high-dimensional spaces).

\exx[7]{Length of vectors in $\mathbb{R}^2$.}
% A vector $\mathbf{v} = (v_1,v_2)$ has length... what? Draw it. Use Pythagorean theorem.
% sqrt(v_1^2 + v_2^2) = sqrt(\v \cdot \v).
% THE LAST LINE ABOVE IS IMPORTANT: make sure to write it.

\newpage

\exx[8]{Length of vectors in $\mathbb{R}^3$.}
% FILES NEEDED: week2/3d_length.png
% INSERT image (week2/3d_length.png). The point is to write v = (v_1,v_2,v_3) as (v_1,v_2,0) + (0,0,v_3) and use Pythagoras together with the R^2 length we just derived.
% Again, MAKE SURE to write \|v\| = \sqrt{v \cdot v}

In higher dimensions, we \emph{define} the length of a vector so as to continue the pattern that we observed above:

\begin{definition}[Length of a Vector]
The \textbf{length} of a vector $\v = (v_1,v_2,\ldots,v_n) \in \mathbb{R}^n$, denoted by $\|\v\|$, is defined by
\begin{align*}
\|\v\| \defeq \sqrt{\v\cdot\v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}.
\end{align*}
\end{definition}
% The square root in the previous definition makes sense since $\v \cdot \v = v_1^2 + v_2^2 + \cdots + v_n^2 \geq 0$.

\exx[4]{Compute the length of some vectors.}
% Do some simple ones like (2,3), (1,2,3), and then a higher-dimensional one like (1,3,-5,-1,2)
% Maybe ||(cos(x),sin(x))|| = 1 too.

As always, we have defined a new mathematical object, so we want a theorem that tells us what its properties are.

\newpage

\begin{theorem}[Properties of Vector Length]
Let $\v \in \R^n$ be a vector and let $c \in \R$ be a scalar. Then
\begin{enumerate}
\item $\|c\v\| = |c|\|\v\|$
\item $\|\v\| = 0$ if and only if $\v = \0$
\end{enumerate}
\end{theorem}

\begin{proof}
To prove property~(a), we just apply the relevant definitions:

\horlines{2}
% \|c\mathbf{v}\| = \sqrt{(c\mathbf{v})\cdot(c\mathbf{v})} = \sqrt{c^2 \mathbf{v} \cdot \mathbf{v}} = \sqrt{c^2}\sqrt{\mathbf{v}\cdot\mathbf{v}} = |c|\|\mathbf{v}\|

\noindent To prove property~(b), we have to prove two things:

\horlines{7}
% \|0\| = \sqrt{0^2+...+0^2} = 0. For the other direction, suppose sqrt(v_1^2 + ... + v_n^2) = 0. Then v_1^2 + ... + v_n^2 = 0. Since each v_i^2 >= 0, this means that v_i^2 = 0 for all i, so v_i = 0, so v = 0.

\noindent This completes the proof.
\end{proof}
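As a quick added illustration of property~(a) (this example is not in the original notes; the vector and scalar are chosen arbitrarily):
\begin{align*}
\|3(1,2)\| = \|(3,6)\| = \sqrt{9 + 36} = \sqrt{45} = 3\sqrt{5} = 3\|(1,2)\|,
\end{align*}
and note that the absolute value in the theorem matters: $\|(-3)(1,2)\|$ also equals $3\sqrt{5}$, not $-3\sqrt{5}$, since lengths are never negative.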
A vector with length~$1$ is called a \textbf{unit vector}. Every non-zero vector $\mathbf{v} \in \mathbb{R}^n$ can be divided by its length to get a unit vector:

\horlines{2}
% If w = v/||v|| then ||w|| = 1/||v|| * ||v|| = 1 by part (a) of the theorem from last page.

\noindent Scaling $\mathbf{v}$ to have length~$1$ like this is called \textbf{normalizing} $\mathbf{v}$ (and this unit vector $\mathbf{w}$ is called the \textbf{normalization} of $\mathbf{v}$).

\newpage

\exx[5]{Normalize the vector $(3,4) \in \R^2$.}
% Draw it and compute w = (3/5, 4/5). Draw its normalization too, and maybe the unit circle (noting that a vector's normalization is always on the unit circle in the same direction).
% Note that unit vectors basically just describe a direction; we discarded length.

\exx[3]{Show that the standard basis vectors are unit vectors.}
% Remind students that e_i has a 1 in its i-th position, 0 elsewhere. Thus ||e_i|| = sqrt(0^2 + ... + 1^2 + ... + 0^2) = 1.
% Maybe mention that this is where the course starts to get much harder for a lot of students.

We now start to look at somewhat more interesting properties of the dot product and vector lengths. Our first result in this direction is an inequality that relates the dot product of two vectors to their lengths:

\begin{theorem}[Cauchy--Schwarz Inequality]\label{thm:cauchy_schwarz}
Suppose that $\v,\w \in \R^n$ are vectors. Then $|\v\cdot\w| \leq \|\v\|\|\w\|$.
\end{theorem}

\begin{proof}
Define the vector $\x = \|\w\|\v - \|\v\|\w$ and then expand the quantity $\|\x\|^2$ in terms of the dot product:

\horlines{5}

\newpage

\horlines{6}\vspace*{-1.3cm}
%\begin{align*}
% 0 & \leq \|\x\|^2 \\
% & = \x\cdot\x \\
% & = \big(\|\w\|\v - \|\v\|\w\big)\cdot\big(\|\w\|\v - \|\v\|\w\big) \\
% & = \|\w\|^2(\v\cdot\v) - 2\|\v\|\|\w\|(\v\cdot\w) + \|\v\|^2(\w\cdot\w) \\
% & = 2\|\v\|^2\|\w\|^2 - 2\|\v\|\|\w\|(\v\cdot\w).
%\end{align*}
% If $\v = \0$ or $\w = \0$ then the inequality $|\v\cdot\w| \leq \|\v\|\|\w\|$ just says $0 \leq 0$, so we may assume $\|\v\|\|\w\| \neq 0$. By dividing the inequality $0 \leq 2\|\v\|^2\|\w\|^2 - 2\|\v\|\|\w\|(\v\cdot\w)$ by $2\|\v\|\|\w\|$ and then rearranging, we arrive at the inequality $\v\cdot\w \leq \|\v\|\|\w\|$. To strengthen this statement to the desired form $|\v\cdot\w| \leq \|\v\|\|\w\|$, we must also prove that $-(\v\cdot\w) \leq \|\v\|\|\w\|$, which can be accomplished by mimicking the above argument with the vector $\y = \|\w\|\v + \|\v\|\w$ instead of $\x = \|\w\|\v - \|\v\|\w$ (assignment question?). Easier way: just replace w by -w. Then the inequality we already proved immediately implies the negative version too.
\end{proof}\vspace*{0.3cm}

The above theorem is our first example of a theorem with a very non-obvious proof: even though we can follow the steps and see that they are individually true, the choice of $\x = \|\w\|\v - \|\v\|\w$ at the start was something like magic. This particular $\x$ was chosen precisely so that the proof gives us what we want. Other choices of $\x$ also result in true inequalities, but ones that are less useful than Cauchy--Schwarz.

\exx[5]{Do there exist vectors $\v,\w \in \R^2$ with...}
% v \cdot w = 7, ||v|| = 3, ||w|| = 2? No, violates C-S.
% v \cdot w = 5, ||v|| = 3, ||w|| = 2? Yes, can construct. w = (2,0), v = (5/2,sqrt(11/4))

We have two main uses for the Cauchy--Schwarz inequality. The first is that it helps us prove another geometrically ``obvious'' fact about vector lengths:

\horlines{4}
% The length of one side of a triangle is never longer than the sum of the other two side lengths. Draw picture in R^2, but now we can actually prove it in R^n.
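Before putting the Cauchy--Schwarz inequality to work, here is a quick numerical sanity check (an added illustration that is not in the original notes; the two vectors are chosen arbitrarily). If $\v = (1,2,4)$ and $\w = (3,-1,2)$ then
\begin{align*}
|\v\cdot\w| = |3 - 2 + 8| = 9 \qquad \text{and} \qquad \|\v\|\|\w\| = \sqrt{21}\sqrt{14} = \sqrt{294} \approx 17.1,
\end{align*}
so $|\v\cdot\w| \leq \|\v\|\|\w\|$ indeed holds, exactly as the theorem guarantees.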
% Triangle inequality: Stopping by Starbucks on your way to class cannot be a shortcut.
\begin{theorem}[Triangle Inequality]\label{thm:triangle_inequality}
Suppose that $\v,\w \in \R^n$ are vectors. Then $\|\v+\w\| \leq \|\v\| + \|\w\|$.
\end{theorem}

\begin{proof}
We start by expanding $\|\v + \w\|^2$ in terms of the dot product:

\horlines{6}\vspace*{-1.4cm}
% \begin{align*}
% \|\v + \w\|^2 & = (\v + \w)\cdot(\v + \w) & & \\
% & = (\v\cdot\v) + 2(\v\cdot\w) + (\w\cdot\w) & & \\
% & = \|\v\|^2 + 2(\v\cdot\w) + \|\w\|^2 & & \\
% & \leq \|\v\|^2 + 2\|\v\|\|\w\| + \|\w\|^2 & & \text{\color{gray}(by Cauchy--Schwarz)} \\
% & = (\|\v\| + \|\w\|)^2. & &
% \end{align*}
% We can then take the square root of both sides of the inequality to see $\|\v + \w\| \leq \|\v\| + \|\w\|$, as desired.
\end{proof}\vspace*{0.3cm}

% FUN ACTIVITY: high-dimensional cubes are "spiky" (corners are super far away)
% https://www.reddit.com/r/math/comments/6o8a98/corners_stick_out_more_in_high_dimensions/

\section*{Angle Between Vectors}

The second immediate use of the Cauchy--Schwarz inequality is that it helps us define angles in $\R^n$. To get an idea of how this works, let's start by thinking about a triangle with sides given by the vectors $\mathbf{v}$, $\mathbf{w}$, and $\mathbf{v} - \mathbf{w}$:

\horlines{7}
% Draw the triangle (remember v-w goes from head of w to head of v)
% Law of cosines says that ||v - w||^2 = ||v||^2 + ||w||^2 - 2||v||*||w||*cos(theta), where theta is the angle between v and w
% Expand the LHS via the dot product and rearrange to get cos(theta) = v\cdot w / (||v||*||w||)
% Maybe mention the v \cdot w = ||v|| ||w|| cos(theta) form, to drive home the idea from earlier that the dot product measures how much v and w point in the same direction? Largest when theta close to 0. Or leave this to next page when introducing orthogonality?

Our reasoning above gave us a formula for the angle between two vectors in $\R^2$ (and in $\R^3$). We now state it as a definition in higher-dimensional spaces.

\newpage

% DEFINITION: Angle Between Vectors
\begin{definition}[Angle Between Vectors]\label{thm:angle}
The \textbf{angle} $\theta$ between two non-zero vectors $\v,\w \in \R^n$ is the quantity
$$\theta = \arccos\left(\frac{\v\cdot\w}{\|\v\|\|\w\|}\right).$$
\end{definition}
% END DEFINITION

\exx[6]{What is the angle between $\v = (1,1,1,1)$ and $\w = (2,0,2,0)$?}
% v \cdot w = 4
% ||v||*||w|| = 4*sqrt(2)
% arccos(1/sqrt(2)) = pi/4 (special triangle with sides 1, 1, sqrt(2), angles pi/4, pi/4, pi/2)
% DRAW this special triangle

\exx[10]{What is the angle between the diagonals of two adjacent faces of a cube?}
% FILES NEEDED: week2/cube_faces.png
% Can be positioned so that it has one vertex at $(0,0,0)$ and its opposite vertex at $(1,1,1)$. There are lots of pairs of face diagonals that we could choose; we (arbitrarily) choose the face diagonals $\v = (1,0,1) - (1,1,0) = (0,-1,1)$ and $\w = (0,1,1) - (1,1,0) = (-1,0,1)$ (INSERT week2/cube_faces.png). Then $\v\cdot\w = 0 + 0 + 1 = 1$ and $\|\v\| = \|\w\| = \sqrt{2}$, so the angle between $\v$ and $\w$ is $\theta = \arccos(1/(\sqrt{2}\cdot\sqrt{2})) = \arccos(1/2) = \pi/3$. (again, DRAW the 1-2-sqrt(3) special triangle)

\newpage

Recall that $\arccos(x)$ is only defined if $-1 \leq x \leq 1$. How do we know that $-1 \leq \frac{\v \cdot \w}{\|\v\|\|\w\|} \leq 1$?

\horlines{1}
% Cauchy-Schwarz inequality
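As an added sanity check (this remark is not in the original notes): the definition above agrees with our intuition when one vector is a scalar multiple of the other. If $\w = c\v$ for some scalar $c > 0$, then
\begin{align*}
\frac{\v\cdot\w}{\|\v\|\|\w\|} = \frac{c(\v\cdot\v)}{\|\v\|\,|c|\,\|\v\|} = \frac{c\|\v\|^2}{c\|\v\|^2} = 1, \quad \text{so} \quad \theta = \arccos(1) = 0,
\end{align*}
and similarly if $c < 0$ then the fraction equals $-1$ and $\theta = \arccos(-1) = \pi$. That is, vectors pointing in the same direction make an angle of $0$, and vectors pointing in opposite directions make an angle of $\pi$, as we would hope.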
One special case of vector angles that is worth pointing out is the case when $\v \cdot \w = 0$. When this happens...

\horlines{5}
% \theta = \arccos(v\cdot w / ||v|| ||w||) = arccos(0) = pi/2.
% i.e., v and w are perpendicular! Maybe draw a simple picture and note that acute means \v \cdot \w > 0 and obtuse means \v \cdot \w < 0.

\noindent This special case is important enough that we give it its own name:

% DEFINITION: Orthogonality
\begin{definition}[Orthogonality]\label{thm:orthogonal}
Two vectors $\v,\w \in \R^n$ are called \textbf{orthogonal} if $\v \cdot \w = 0$.
\end{definition}
% END DEFINITION
% Stress that, again, this can be thought of as a theorem in small dimensions, but it's a definition in 4 or more dimensions. (We mention that the zero vector is orthogonal to everything soon, so no need to now.)

\exx[1]{Show that the vectors $(1,1,-2)$ and $(3,1,2)$ are orthogonal.}
% 3 + 1 - 4 = 0

\exx[5]{Find a non-zero vector orthogonal to $\mathbf{v} = (v_1,v_2) \in \mathbb{R}^2$.}
% Why "non-zero"? Because the zero vector is orthogonal to everything!
% (v2,-v1)
% Draw picture of where this vector is

% WEEK 2 Friday fun:
% High-dimensional cubes are spiky!
% https://www.reddit.com/r/math/comments/6o8a98/corners_stick_out_more_in_high_dimensions/
% On a cube in high dimensions, how far the corners stick out isn't as remarkable as how deep the faces dig in.
%
% Consider the million-dimensional cube [-1,1]^1,000,000. The centers of the faces of this cube are at distance 1 from the origin. The corners of this cube are at distance 1,000 from the origin. Picking a point uniformly at random on the surface of this cube (or in the interior of this cube) makes it very likely that its distance from the origin is close to 3^(-1/2)*1,000, i.e. around 577. Specifically, this distance is between 576 and 578 with probability around 99%.
%
% So if you're going to picture a million-dimensional cube as something like a spiky million-dimensional ball, then really you should picture it as a million-dimensional ball of radius around 577, covered in "mountain ranges" of height around 423, and deep "trenches" that go practically to the center.
%
% RELATED:
% Concentration of measure: https://ttic.uchicago.edu/~harry/teaching/pdf/lecture13.pdf
% If you pick an equator of the n-dim hypersphere and make a tiny sausage with radius epsilon around it, you cover most of the sphere.
% Concrete example: Equator of million-dim hypersphere, walk distance of 1/1000 around it (wrap in sausage), now cover 99% of hypersphere. Even weirder, this works with ANY equator -- most points are near ANY of them!
% With radius of sphere = 1, radius of sausage = 0.1:
% n=2: about 6.345% is contained within sausage (easy calculation with high school math)
% n=3: about 10% (calculation using techniques from Calc 2)

\end{document}