\documentclass{article}
\usepackage{amssymb,amsmath,amsthm,comment,graphicx,geometry,hyperref,mathtools,amscd}
\geometry{
a4paper,
total={170mm,257mm},
left=20mm,
top=20mm,
}
\newtheorem{definition}{Definition}
\numberwithin{definition}{section}
\newtheorem{theorem}{Theorem}
\numberwithin{theorem}{section}
\newtheorem{lemma}{Lemma}
\numberwithin{lemma}{section}
\newcommand{\R}{\mathbb{R}}
\newcommand{\E}{\mathcal{E}}
\newcommand{\F}{\mathcal{F}}
\newcommand{\cE}{\mathcal{E}}
\newcommand{\cF}{\mathcal{F}}
\newcommand{\cD}{\mathcal{D}}
\newcommand{\cP}{\mathcal{P}}
\newcommand{\QH}{\mathbf{H}}
\newcommand{\oa}{\overrightarrow}
\newcommand{\la}{\langle}
\newcommand{\ra}{\rangle}
\newcommand{\p}{\partial}
\newcommand{\an}{\angle}
\newcommand{\man}{\measuredangle}
\newcommand{\bi}{\mathbf{i}}
\newcommand{\bj}{\mathbf{j}}
\newcommand{\bk}{\mathbf{k}}
\newcommand{\Vect}{\mathrm{Vec}}
\title{Lecture Notes Geometry}
\date{22-3-2020}
\author{Roland van der Veen}
\begin{document}
\maketitle
\section{Introduction}
\emph{It is indeed very satisfying for a mathematician to define an affine space
as being a set acted on by a vector space (and this is what I do here) but this
formal approach, although elegant, must not hide the “phenomenological”
aspect of elementary geometry, its own aesthetics: yes, Thales’ theorem expresses
the fact that a projection is an affine mapping, no, you do not need
to orient the plane before defining oriented angles. . . but this will prevent
neither the Euler circle from being tangent to the incircle and excircles, nor
the Simson lines from enveloping a three-cusped hypocycloid!} -Mich\`ele Audin.\\
In these lecture notes we aim to summarize some of the main points of the first chapters of the book \emph{Geometry} by Mich\`ele Audin.
We will also treat some elementary notions of differential geometry afterwards.
All our vector spaces will be over $\R$ for simplicity.
\section{Affine space}
A natural setting for discussing elementary geometry of lines and planes is affine space. We are mostly concerned with properties of parallelism and intersection,
not yet with distances. The approach is firmly rooted in linear algebra: as we will see, an affine space is just a vector space in which one refuses to say where the origin is, since geometrically there is no natural choice of origin.
\begin{definition} (Audin 1.1)\\
A set $\E$ is called an {\bf affine space} if there exist a vector space $E$ and a map $\E\times\E\xrightarrow{\Theta}E$.
We use the notation $\Theta(A,B) = \oa{AB}$. The map should satisfy two conditions:
\begin{enumerate}
\item For all $A\in \E$ the map $B\xmapsto{\Theta_A} \oa{AB}$ is a bijection from $\E$ to $E$.
\item For all $A,B,C\in \E$ we have $\oa{AC}+\oa{CB} = \oa{AB}$.
\end{enumerate}
\end{definition}
We say $\E$ is directed by vector space $E$ or $E$ underlies $\E$. Also sometimes the image of $\Theta_A$ is named $\E_A$: it is $\E$ viewed as a vector space with the origin at $A$.\\
For two points $A,B\in \E$ we may express $\Theta_A\circ \Theta_B^{-1}:E\to E$ as follows:
$\Theta_A\circ \Theta_B^{-1}v = \oa{AB}+v$. This is because $\Theta_B^{-1}v = C$ for a unique $C\in \E$ such that $v=\oa{BC}$.
But then $\Theta_A C = \oa{AC}$ and the second property of affine space says $\oa{AC} = \oa{AB}+\oa{BC}$. In other words
$\Theta_A\circ \Theta_B^{-1}v=\oa{AC} = \oa{AB}+v$.
Going in the opposite direction we can promote any vector space $V$ to an associated affine space $\mathcal{V}$ as follows.
Define $\mathcal{V} = V$ as a set and define $\Theta:\mathcal{V}\times \mathcal{V}\to V$ by $\Theta(A,B) = B-A$.
This provides many examples of affine spaces but just like with vector spaces there is in some sense only one for each dimension.
We call it affine $n$-space. To get even more concrete we could do the above procedure to $V = \R^n$. However being so concrete is not always the easiest approach to difficult geometric problems. Often it is not the coordinates that count but rather the relationships between them and these are observed more readily in an affine setting.
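\par\medskip\noindent
To make this concrete, here is a small check in the affine plane associated to $V=\R^2$ (the specific points are our own choice, for illustration only). Take $A=(1,2)$, $B=(4,6)$ and $C=(0,5)$. Then
\[
\oa{AC}+\oa{CB} = (C-A)+(B-C) = (-1,3)+(4,1) = (3,4) = B-A = \oa{AB},
\]
as the second condition requires. For the first condition, $\Theta_A(M) = M-A$ has inverse $v\mapsto A+v$, so it is a bijection from $\mathcal{V}$ to $V$.
\par\medskip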
\begin{definition} (Audin p.10)\\
A subset $\F\subset \E$ of an affine space $\E$ is called an {\bf affine subspace} if it is empty or for some $A\in \F$ we have $\Theta_A(\F)$ is a linear subspace of $E$.
\end{definition}
\begin{definition} (Audin p.11)\\
For $S\subset\E$ the {\bf subspace generated} by $S$ is the intersection of all affine subspaces of $\E$ containing $S$. Notation $\langle S \rangle$.
\end{definition}
For points $A,B\in \E$ the line $\langle A,B\rangle$ is also described as the line $AB$. The line segment from $A$ to $B$ is the set $\{M\in \E|\oa{AM} = \lambda \oa{AB},\ \lambda\in [0,1]\}$.
\begin{definition} (Audin p.12)\\
Points $A_0\dots A_k\in \E$ are {\bf affine independent} if $\langle A_0,\dots A_k\rangle$ has dimension $k$. When $k = \dim \E$ we say $(A_0,\dots,A_k)$ is an affine frame of $\E$.
\end{definition}
Affine independence is closely related to linear independence as becomes clear when we choose one of the points as an origin. More precisely,
\begin{lemma}
\label{lem.indep}
If the points $A_0\dots A_k\in \E$ are affine independent then the vectors $\oa{A_0A_i}$ for $i=1,\dots k$ are linearly independent.
\end{lemma}
\begin{proof}
The vectors $\oa{A_0A_i}$ are linearly independent if and only if their span $F\subset E$ has dimension $k$.
Set $T = \Theta_{A_0}:\E\to E$. By definition $T^{-1}(F)\subset \E$ is an affine subspace containing all the $A_i$.
This shows $\la A_0,\dots A_k\ra \subset T^{-1}(F)$, so $k = \dim \la A_0,\dots A_k\ra \leq \dim F\leq k$ and hence $\dim F = k$.
\end{proof}
In particular the above lemma relates affine frames in $\E$ to bases of $E$. This is of great importance for applying linear algebra to understand affine space. Of course the choice of origin $A_0$ in the above was arbitrary; we could just as well have chosen some other point $A_i$.
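\par\medskip\noindent
As an illustration in the affine plane $\R^2$ (points chosen by us), take $A_0=(1,1)$, $A_1=(2,1)$ and $A_2=(1,3)$. These points are not collinear, so $\la A_0,A_1,A_2\ra$ is the whole plane and they are affine independent. In line with the lemma,
\[
\oa{A_0A_1} = (1,0),\qquad \oa{A_0A_2} = (0,2)
\]
are linearly independent. By contrast $(1,1),(2,2),(3,3)$ lie on one line, so they generate a subspace of dimension $1<2$ and are not affine independent; correspondingly the vectors $(1,1)$ and $(2,2)$ are linearly dependent.
\par\medskip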
\begin{definition} (Audin p.12)\\
Two affine subspaces are said to be {\bf parallel} if they have the same direction (underlying vector space).
\end{definition}
More precisely this means that affine subspaces $\mathcal{F},\mathcal{G}\subset \E$ are parallel iff for every $A\in \mathcal{F}$ and $B\in \mathcal{G}$ we have
$\Theta_A(\mathcal{F}) = \Theta_B(\mathcal{G})$ as linear subspaces of $E$.
\begin{definition} (Audin p.14)\\
A map $\E\xrightarrow{\phi}\F$ between two affine spaces is called an {\bf affine map} if for some $O\in \E$ there is a linear map $E\xrightarrow{f}F$
such that for all $M\in \E$ we have $\oa{\phi(O)\phi(M)} = f(\oa{OM})$. Sometimes we use the notation $f = \oa{\phi}$.
\end{definition}
Sometimes it is useful to reformulate this condition in terms of the bijection $\Theta_O:\E\to E$ and $\Theta_{\phi(O)}:\F\to F$ as follows:
$f\circ\Theta_O = \Theta_{\phi(O)} \circ \phi$. These compositions are conveniently visualized in a (commutative) diagram:
\[
\begin{CD}\E @>\phi>> \F\\@VV\Theta_O V @VV\Theta_{\phi(O)}V\\E @>f>> F\end{CD}
\]
So after choosing an origin $O$ and the compatible choice of origin $\phi(O)$ we find that $\phi$ is represented by the linear map $f$.
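\par\medskip\noindent
In coordinates this takes a familiar form. As a sketch, assume $\E=\R^n$ and $\F=\R^m$ are viewed as affine spaces with $\oa{XY} = Y-X$, and let $S$ be an $m\times n$ matrix and $b\in \R^m$ (both names are ours, for illustration). Then $\phi(X) = SX+b$ is affine, because for any choice of $O$
\[
\oa{\phi(O)\phi(X)} = (SX+b)-(SO+b) = S(X-O) = f(\oa{OX})\qquad\text{with } f(v) = Sv.
\]
Note that the translation part $b$ disappears from $f$, and that $f$ does not depend on the chosen origin $O$.
\par\medskip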
Examples of affine maps:\\
1) For $u\in E$ the translation $t_u:\E\to\E$ is an affine map defined by $t_u(A) = B$ with $\oa{AB} = u$.\\
2) For $O\in \E$ and $\lambda\in \R$ the (central) dilation with center $O$ and ratio $\lambda$, denoted $h_{O,\lambda}:\E\to \E$,
is the affine map defined by $\oa{Oh_{O,\lambda}(M)} = \lambda \oa{OM}$.\\
3) Another example is the projection $\pi_{\F,L}$ onto affine subspace $\F\subset \E$ in the direction of linear subspace $L\subset E$.
This makes sense only if $L+F = E$ and $L\cap F=0$ where $F$ is the direction of $\F$. In that case $\pi_{\F,L}(A) = B$ where $B\in \F$ is the unique point such that $\oa{AB}\in L$. Point $B$ is indeed unique because if $C$ was another such point then $\oa{AB}-\oa{AC}$ would be in $F\cap L$. Point $B$ exists because for any $O\in \F$ we can write $\oa{OA}= \ell+v$ for some $v\in F,\ell \in L$ and set $B$ to be such that $\oa{OB} = v$.
The corresponding linear map is the projection $\ell+v\xmapsto{f} v$ for any $\ell\in L,v\in F$. Indeed $f(\oa{OA}) = \oa{\pi_{\F,L}(O)\pi_{\F,L}(A)}$.
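\par\medskip\noindent
For a concrete instance of the projection (our own choice of data, in the affine plane $\R^2$), let $\F$ be the $x$-axis, so $F$ is spanned by $(1,0)$, and let $L$ be spanned by $(1,1)$. Then $L+F=\R^2$ and $L\cap F = 0$. For $A=(a,b)$ we look for $B=(c,0)$ with $\oa{AB} = (c-a,-b)\in L$, which forces $c-a=-b$. Hence
\[
\pi_{\F,L}(a,b) = (a-b,0),\qquad (a,b) = b\,(1,1)+(a-b)\,(1,0)\ \xmapsto{f}\ (a-b)\,(1,0).
\]
For example $A=(3,2)$ is sent to $B=(1,0)$ and indeed $\oa{AB} = (-2,-2)\in L$.
\par\medskip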
\begin{lemma}(Audin 2.5 p.16)\\
Any affine map sends affine subspaces to affine subspaces.
\end{lemma}
\begin{proof}
Given affine subspace $\mathcal{G}\subset \E$ and affine map $\phi:\E\to \F$ as above we need to check that $\phi(\mathcal{G})$ is an affine subspace of $\F$.
When $\mathcal{G} = \emptyset$ this is clear. Otherwise choose a point $O\in \mathcal{G}$, since it is an affine subspace $G = \Theta_O(\mathcal{G})\subset E$ is a linear subspace. Now $\phi(O) \in \phi(\mathcal{G})$ so we need to check that $J=\Theta_{\phi(O)}(\phi(\mathcal{G}))\subset F$ is also a linear subspace.
Looking at the diagram of maps above and writing $f = \oa{\phi}$ we have $\Theta_{\phi(O)}\circ \phi = f\circ \Theta_O$ so $J=f\circ \Theta_O(\mathcal{G}) = f(G)$.
From linear algebra we know that the linear map $f$ sends linear subspace $G\subset E$ to a linear subspace $f(G)\subset F$.
\end{proof}
\begin{lemma}(Audin 2.10 p.18)\\
An affine mapping is determined by the images of an affine frame. Any affine mapping of $n$-dimensional affine space that fixes $n+1$ affine independent points is the identity.
\end{lemma}
\begin{proof}
Given affine frame $A_0,\dots A_n$ and $O=A_0$ Lemma \ref{lem.indep} tells us that $\Theta_O$ sends $A_1,\dots A_n$ to a basis $v_1,\dots v_n$ of $E$.
Using the same notation as above the linear map $f = \oa{\phi}$ completely determines $\phi$. Also $f(v_i) = \Theta_{\phi(O)}\circ\phi\circ \Theta_O^{-1}(v_i) =
\Theta_{\phi(O)}(\phi(A_i))$ shows that $f$ in turn is completely determined by the images of the frame $A_i$ under $\phi$. Here we used that a linear map is determined by what it does on the basis $v_i$.
\end{proof}
For four collinear points $A,B,C\neq D$ the ratio $\frac{\oa{AB}}{\oa{CD}}$ means the scalar $\lambda$ such that $\oa{AB} = \lambda \oa{CD}$.
\begin{theorem}(Pappus, Audin Thm 3.4, p.25)\\
In an affine plane imagine points $A,B,C$ on line $\mathcal{D}$ and $A',B',C'$ three points on a second distinct line $\mathcal{D}'$. If $AB'$ is parallel to $A'B$ and $BC'$ parallel to $CB'$ then $AC'$ must be parallel to $CA'$.
\end{theorem}
\begin{lemma}
\label{lem.P1}
Given two distinct lines $\mathcal{L},\mathcal{L}'$ intersecting at $O$ and points $X,Y$ on $\mathcal{L}$ and $X',Y'$ on $\mathcal{L}'$, all distinct from $O$.
There is a $\lambda\in \R$ such that the dilation $h_{O,\lambda}$ sends both $X$ to $Y$ and $X'$ to $Y'$ if and only if $XX'$ is parallel to $YY'$.
\end{lemma}
\begin{proof}
Lines $XX'$ and $YY'$ are parallel if and only if $\oa{YY'} = \mu\oa{XX'}$ for some $\mu\in \R$.
If $\oa{OY}=\lambda\oa{OX}$ and $\oa{OY'}=\lambda\oa{OX'}$ then
\[
\oa{YY'} = \oa{OY'}-\oa{OY} = \lambda(\oa{OX'}-\oa{OX})=\lambda\oa{XX'}
\]
Conversely, suppose $\oa{OY} = \lambda \oa{OX}$ and $\oa{YY'} = \mu\oa{XX'}$ for some $\mu\in \R$. Assuming $Y,Y'\neq O$ we have $Y\neq Y'$ and so $\mu\neq 0$.
Then
\[\lambda\oa{OX'}=\lambda\oa{OX}+\lambda\oa{XX'} = \oa{OY}+\frac{\lambda}{\mu}\oa{YY'}= \oa{OY'}-\oa{YY'}+\frac{\lambda}{\mu}\oa{YY'}\]
Since the lines $\mathcal{L},\mathcal{L}'$ are assumed distinct $\oa{YY'}$ is independent from the (dependent) vectors $\oa{OY'},\oa{OX'}$, so the two terms involving $\oa{YY'}$ on the right hand side must cancel out. Hence $\lambda = \mu$ and $\lambda\oa{OX'} = \oa{OY'}$.
\end{proof}
\begin{lemma}
\label{lem.P2}
Given two distinct parallel lines $\mathcal{L},\mathcal{L}'$ and points $X,Y$ on $\mathcal{L}$ and $X',Y'$ on $\mathcal{L}'$.
The translation $t_{\oa{XY}}$ sends both $X$ to $Y$ and $X'$ to $Y'$ if and only if $XX'$ is parallel to $YY'$.
\end{lemma}
\begin{proof}
If $\oa{X'Y'}=\oa{XY}$ then using $\oa{XY}=-\oa{YX}$ we get
\[
\oa{YY'} = \oa{YX}+\oa{XX'}+\oa{X'Y'} = \oa{XX'}
\]
Conversely, suppose $\oa{YY'} = \mu\oa{XX'}$ for some $\mu\in \R$.
Then $\oa{X'Y'}=\oa{X'X}+\oa{XY}+\oa{YY'} = \oa{XY}+(-1+\mu)\oa{XX'}$ implies that $\mu=1$ since $\oa{X'Y'}$ and $\oa{XY}$ are parallel vectors assumed independent of $\oa{XX'}$ because the lines $\mathcal{L},\mathcal{L}'$ are assumed distinct.
\end{proof}
\begin{proof}(of Pappus)\\
First we will assume the lines $\mathcal{D},\mathcal{D}'$ intersect in point $O$. For some $\lambda,\mu$ we have $h_{O,\lambda}A=B$ and $h_{O,\mu}B=C$.
By Lemma \ref{lem.P1} $h_{O,\lambda}B'=A'$ and $h_{O,\mu}C'=B'$. This means that the composition $h_{O,\lambda\mu} = h_{O,\lambda}\circ h_{O,\mu} = h_{O,\mu}\circ h_{O,\lambda}$ must map both $A$ to $C$ and $C'$ to $A'$. According to Lemma \ref{lem.P1} this means $AC'$ is parallel to $CA'$.\\
When lines $\mathcal{D},\mathcal{D}'$ in the plane do not intersect they must be parallel and we argue in the same way using translations:
The translation $t_{\oa{AB}}$ sends $A$ to $B$ and the translation $t_{\oa{BC}}$ sends $B$ to $C$. By Lemma \ref{lem.P2}
we also have $t_{\oa{AB}}B' = A'$ and $t_{\oa{BC}}C' = B'$. So $t_{\oa{AB}+\oa{BC}}$ sends both $A$ to $C$ and $C'$ to $A'$.
According to Lemma \ref{lem.P2} this means $AC'$ is parallel to $CA'$.
\end{proof}
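\par\medskip\noindent
The argument can be followed in a numerical instance (the coordinates below are our own choice, not taken from Audin). In $\R^2$ let $\mathcal{D}$ be the $x$-axis and $\mathcal{D}'$ the $y$-axis, meeting in $O=(0,0)$, and take
\[
A=(1,0),\quad B=(2,0),\quad C=(6,0),\qquad A'=(0,6),\quad B'=(0,3),\quad C'=(0,1).
\]
Then $\oa{AB'} = (-1,3)$ and $\oa{A'B} = (2,-6)$ are proportional, as are $\oa{BC'} = (-2,1)$ and $\oa{CB'} = (-6,3)$, so the hypotheses hold. Here $h_{O,2}$ sends $A$ to $B$ and $B'$ to $A'$, while $h_{O,3}$ sends $B$ to $C$ and $C'$ to $B'$. The composition $h_{O,6}$ sends $A$ to $C$ and $C'$ to $A'$, and indeed
\[
\oa{CA'} = (-6,6) = 6\,(-1,1) = 6\,\oa{AC'}.
\]
\par\medskip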
\begin{theorem}(Desargues)\\
Given distinct triangles $A,B,C$ and $A',B',C'$ in an affine plane such that $AB$ is parallel to $A'B'$ and $AC$ parallel to $A'C'$ and $BC$ parallel to $B'C'$,
the lines $AA'$, $BB'$ and $CC'$ are either all parallel or intersect in a single point.
\end{theorem}
\begin{proof}
First consider the case where $AA'$ and $BB'$ intersect in point $O$. By Lemma \ref{lem.P1} there is a $\lambda\in \R$ such that $h_{O,\lambda}A = A'$ and
$h_{O,\lambda}B = B'$. We aim to show that the point $C'' = h_{O,\lambda}C$ is equal to $C'$. Again Lemma \ref{lem.P1} assures us that $A'C''$ and $AC$ are parallel
so $C''$ lies on $A'C'$. For the same reason $C''$ also lies on line $B'C'$. Together this means $C'' = C'$. Also $CC'$ passes through $O$ by definition of $C''$.\\
Second, the case where $AA'$ and $BB'$ are parallel is approached analogously using Lemma \ref{lem.P2}. So the translation $t_{\oa{AA'}}$ sends $A$ to $A'$ and also $B$ to $B'$. Define $C'' = t_{\oa{AA'}}C$ then we want to show $C'' = C'$. Lemma \ref{lem.P2} implies that $A'C''$ is parallel to $AC$ so $C''$ lies on $A'C'$. For the same reason $C''$ also lies on line $B'C'$. Together this means $C''=C'$ and so $CC'$ is parallel to $AA'$.
\end{proof}
\subsection*{Exercises}
\begin{enumerate}
\item Can you give an example of an affine space that is not a vector space?
\item Prove that for any points $B,C$ of affine subspace $\F\subset \E$ we have $\Theta_B(\F) = \Theta_C(\F)$.
\item Prove that through any two points of an affine space passes a unique line.
\item Let $A$ be an $m\times n$ matrix and $b\in \R^m$. Define $\F = \{x\in \R^n: Ax=b\}$.
Prove that $\F$ is an affine subspace of $\R^n$. What is its direction? When is it empty? Express the dimension of $\F$ in terms of the rank of $A$.
\item Given an affine map $\phi:\E\to \F$ between affine spaces and $\mathcal{M}$ an affine subspace of $\F$. Prove that $\phi^{-1}(\mathcal{M})$ is an affine subspace of $\E$.
\item Prove that an affine map is completely determined by the image of an affine frame.
\item Is Pappus theorem valid if the lines $\mathcal{D},\mathcal{D}'$ are two distinct lines in three-dimensional affine space?
\item The set of all invertible affine maps from $\E$ to itself is denoted $GA(\E)$.
\begin{enumerate}
\item Prove that the composition of two affine maps $\phi$ and $\psi$ is again an affine map, and that the linear map associated to $\psi \circ \phi$ is the composition of the linear maps associated to $\psi$ and $\phi$, i.e. that $\overrightarrow{\psi \circ \phi} = \overrightarrow{\psi} \overrightarrow{\phi}.$
\item Let $\phi \in GA(\E)$, and let $h_{O,\lambda}$ be a central dilation. Compute $\phi \circ h_{O,\lambda} \circ \phi^{-1}.$
\item Prove that two dilations with the same center commute.
\item Compute $h_{B,\lambda'} \circ h_{A,\lambda}.$
\item Explain whether or not the set of all dilations is a subgroup of $GA(\E).$
\item Prove $\oa{AB} = -\oa{BA}$ for any $A,B\in \E$.
\end{enumerate}
\item In this exercise we aim to prove {\bf Menelaus' theorem} in a couple of steps, see also Audin Exercise I.37.
In an affine plane consider three distinct non-collinear points $A,B,C$ and points $A'$ on line $BC$, $B'$ on $CA$ and $C'$ on $AB$ all distinct from $A,B,C$.
Points $A',B',C'$ are collinear if and only if $\frac{\oa{A'B}}{\oa{A'C}}\cdot\frac{\oa{B'C}}{\oa{B'A}}\cdot\frac{\oa{C'A}}{\oa{C'B}}=1$.
\begin{enumerate}
\item Draw a picture.
\item Explain why $h_{C',\frac{\oa{C'A}}{\oa{C'B}}}$ sends $B$ to $A$ and $h_{B',\frac{\oa{B'C}}{\oa{B'A}}}$ sends $A$ to $C$ and
$h_{A',\frac{\oa{A'B}}{\oa{A'C}}}$ sends $C$ to $B$.
\item The composition of the three maps above sends $B$ to itself, why must it be of the form $h_{B,\lambda}$ for some $\lambda\in \R$?
\item The first two maps fix the line $C'B'$.
\item The third map fixes $C'B'$ if and only if $A'$ is on that line.
\item Conclude that if the composition of the three is the identity then $A',B',C'$ are collinear.
\item Explain why the point $B$ is not on the line $A'C'$: otherwise $\oa{A'B}$ and $\oa{C'B}$ would be proportional and hence the sides $BC$ and $AB$ of the triangle would not be independent.
\item Conversely if $A',B',C'$ are collinear then show the composition of the three is the identity because it fixes both the line $C'B'$ and $B$ (not on that line).
\end{enumerate}
\item \begin{enumerate}
\item Let $V_1$ and $V_2$ be linear subspaces of a vector space $V$. Under which conditions is $V_1 \cup V_2$ a linear subspace of $V$? Prove your claim.
\item Let $\cF_1$ and $\cF_2$ be affine subspaces of an affine space $\cE$, directed by $E$. Under which condition is $\cF_1 \cup \cF_2$ an affine subspace of $\cE$? Prove your claim.
\end{enumerate}
\item Let $\cD$ be a line in an affine plane $\cP$ with underlying vector space $P$ and let $A$ be a point on $\cD$. For a one-dimensional linear subspace $L\subset P$ not equal to $\Theta_A(\cD)$ we consider the mapping $\pi: \cP \to \cP$ defined as follows. $\pi(M)$ is the point $M'$ defined by
$M' \in \cD$ and $\overrightarrow{MM'} \in L$.
\begin{enumerate}
\item Why can there be only one point $M'$ with properties $M' \in \cD$ and $\oa{MM'} \in L$?
\item Prove that $\pi$ is an affine map.
\item What is the linear map $\oa{\pi}:P\to P$ associated to $\pi$?
\item Why is $\pi$ not defined if $L$ is in the same direction as $\cD$?
\end{enumerate}
\item All affine $n$-spaces are ``the same''.
\begin{enumerate}
\item Prove that any two vector spaces of dimension $n$ are isomorphic, i.e. there is a linear bijection from one to the other that has a linear inverse.
\item Suppose $\E$ and $\F$ are two $n$-dimensional affine spaces. Pick $B\in \E$ and $C\in \F$ and use $\Theta_B,\Theta_C$ and the previous part to construct an affine bijection between $\E$ and $\F$ with affine inverse.
\end{enumerate}
\end{enumerate}
\section{Euclidean space in general}
By Euclidean space we mean affine space that also has a notion of dot product (also known as scalar product or inner product) $\la \cdot, \cdot\ra$. This allows us to discuss angles and distances as Euclid did but still using vector spaces as our foundation.
\begin{definition}(Audin p.44)\\
A Euclidean vector space is a vector space with a choice of scalar product.
A Euclidean affine space $\E$ is an affine space directed by a Euclidean vector space.
The distance between two points $A,B$ is $d(A,B)=|\oa{AB}|$.
\end{definition}
In a Euclidean vector space $E$ the notion of perpendicular is very important. For any subset $S\subset E$ we define
$S^\perp = \{x\in E: \forall s\in S,\ \la s, x\ra = 0\}$. When $S$ is a subspace of $E$ then $E = S\oplus S^\perp$. Recall
that $A=B\oplus C$ means every element $a$ of vector space $A$ can uniquely be written as $a = b+c$ for some $b$ in subspace $B\subset A$ and $c$ in subspace $C\subset A$.
\subsection{Euclidean isometries}
The motions that preserve the geometry are fundamental so we will spend some time defining and studying them before moving on to more classical theorems in geometry.
\begin{definition}(Audin p.44)\\
A linear isometry is a linear map between Euclidean vector spaces that preserves the scalar product.
An affine isometry is an affine map between Euclidean affine spaces that preserves the distance between any pair of points.
The set of all linear isometries of $E$ is $O(E)$. The set of all affine isometries of $\E$ is $Isom(\E)$.
\end{definition}
Examples of isometries include
\begin{enumerate}
\item Translations are affine isometries.
\item Dilations $h_{O,\lambda}$ are affine isometries if and only if $\lambda = \pm 1$.
\item Orthogonal linear isometries are the following. For a linear subspace $F\subset E$ define
$E\xrightarrow{s_F} E$ by $s_F(x+y) = x-y$ where $x\in F$ and $y\in F^\perp$.
\item Orthogonal affine isometries are defined similarly. If $\F\subset \E$ is an affine subspace
then $\E\xrightarrow{\sigma_\F} \E$ is defined by choosing any $O\in \F$ and setting $F =\Theta_O(\F)$ and
$\oa{O\sigma_\F(M)} = s_F(\oa{OM})$ for any $M\in \E$.
\item In case the subspace is a hyperplane (so the dimension is one less than the whole space) we say the orthogonal isometry
is a reflection.
\end{enumerate}
When $v\in E$ is a vector in Euclidean space $E$ there is a simple formula for the reflection $s_{v^\perp}$ in the hyperplane orthogonal to $v$.
Namely $s_{v^\perp}w = w-2\frac{\la v, w\ra}{|v|^2}v$. This formula is correct since it fixes any $w\in v^\perp$ and sends $v$ to $-v$.
Notice that $s_{v^\perp}$ is its own inverse.
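\par\medskip\noindent
As a quick check of the formula in $\R^2$ with the standard scalar product (vectors chosen by us), take $v=(1,1)$ and $w=(3,1)$. Then $\la v,w\ra = 4$ and $|v|^2=2$, so
\[
s_{v^\perp}w = (3,1)-2\cdot\tfrac{4}{2}\,(1,1) = (-1,-3).
\]
The length is preserved, $|w|^2 = 10 = |s_{v^\perp}w|^2$, and applying the reflection once more gives $(-1,-3)-2\cdot\tfrac{-4}{2}\,(1,1) = (3,1) = w$, in agreement with $s_{v^\perp}$ being its own inverse.
\par\medskip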
\begin{theorem}
\label{thm.vectreflect}
Suppose Euclidean vector space $E$ is $n$-dimensional and $g \in O(E)$. If there is a $k$ dimensional subspace $F$ such that $\forall f\in F: g(f)=f$
then $g$ is the composition of at most $n-k$ reflections.
\end{theorem}
\begin{proof}
We argue by induction on $n-k$. When $k=n$ then $g$ must be the identity.
Suppose $g$ fixes a $k-1$ dimensional subspace $F$ and suppose $g(v) = w\neq v$ for some $v\in E$ (if no such $v$ exists then $g$ is the identity). Note that $v\notin F$ because $g$ fixes every vector of $F$.
We will show that reflection in $u^\perp$ where $u=v-w$ fixes all vectors in $F$ and sends $w$ to $v$.
Since $g$ is an isometry we have $|v| = |w|$ and hence $-2\la u, w\ra = |u|^2$, so $s_{u^\perp}w = w-2\frac{\la u,w\ra}{|u|^2}u= w+v-w=v$.
Moreover for any $f\in F$ we have $\la f,v\ra = \la g(f),g(v)\ra = \la f,w\ra$ so $\la f,u\ra = 0$ meaning $s_{u^\perp}$ fixes $f$.
We conclude that the composition $s_{u^\perp}\circ g$ fixes all vectors in $F$ and also fixes $v$, since $g$ sends $v$ to $w$ and $s_{u^\perp}$ maps it back to $v$. So $s_{u^\perp}\circ g$ fixes the $k$-dimensional subspace spanned by $F$ and $v$. By the induction hypothesis $s_{u^\perp}\circ g$ is a composition of at most $n-k$ reflections so
composing with $s_{u^\perp}$, which is its own inverse, we see $g$ is obtained from at most $n-k+1$ reflections.
\end{proof}
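\par\medskip\noindent
The inductive step can be traced on a small example of our own choosing: the quarter turn $g(x,y) = (-y,x)$ of $\R^2$, which fixes only $F=\{0\}$, so $n-k=2$. Take $v=(1,0)$, so $w=g(v) = (0,1)$ and $u=v-w=(1,-1)$. The reflection formula gives
\[
s_{u^\perp}(x,y) = (x,y)-(x-y)\,(1,-1) = (y,x),\qquad s_{u^\perp}\circ g\,(x,y) = s_{u^\perp}(-y,x) = (x,-y).
\]
So $s_{u^\perp}\circ g$ fixes $v$ and is itself the reflection in the $x$-axis. Composing with $s_{u^\perp}$ we recover $g$ as a composition of two reflections: $(x,y)\mapsto (x,-y)\mapsto (-y,x)$.
\par\medskip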
\begin{lemma}(Audin exercise II.7)\\
\label{lem.perpbisect}
Given two distinct points $A,A'\in \E$ there exists an affine hyperplane $\mathcal{H}\subset \E$ such that $\sigma_\mathcal{H}(A) = A'$.
\end{lemma}
\begin{proof}
Define $\mathcal{H} = \{B\in \E: |\oa{AB}| = |\oa{A'B}|\}$. Then $\mathcal{H}$ contains $O=t_{\oa{AA'}/2}A$ since $\oa{OA} = -\oa{AA'}/2$ and $\oa{OA'} = \oa{OA}+\oa{AA'} = \oa{AA'}/2$. Next $\mathcal{H}$ is an affine hyperplane since $\Theta_O(\mathcal{H}) = \oa{AA'}^\perp$. To see why notice that for $B\in \mathcal{H}$ we have $|\oa{AB}| = |\oa{A'B}|$ so $|\oa{OA}-\oa{OB}|^2 = |\oa{OA'}-\oa{OB}|^2$ meaning $\la \oa{OA},\oa{OB}\ra = \la \oa{OA'},\oa{OB}\ra$ so $\oa{OB}\perp \oa{OA'}-\oa{OA} = \oa{AA'}$. Conversely if $\oa{OB}\perp \oa{AA'}$ then $\la \oa{OB},\oa{OA} \ra = \la \oa{OB},\oa{OA'} \ra$ so $|\oa{AB}| = |\oa{A'B}|$ meaning $B\in \mathcal{H}$.
Finally if $\sigma_\mathcal{H}(A) = A''$ then $\oa{OA''} = s_{\oa{AA'}^\perp}(\oa{OA}) = s_{(\oa{OA'}-\oa{OA})^\perp}(\oa{OA}) = \oa{OA'}$ by the calculation in the proof of Theorem \ref{thm.vectreflect}. Therefore $A''=A'$.
\end{proof}
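\par\medskip\noindent
For instance (with points chosen by us) in the Euclidean plane $\R^2$ take $A=(0,0)$ and $A'=(2,0)$. The condition $|\oa{AB}| = |\oa{A'B}|$ for $B=(x,y)$ reads $x^2+y^2 = (x-2)^2+y^2$, that is $x=1$. So $\mathcal{H}$ is the vertical line through the midpoint $O=(1,0)$, its direction is $\oa{AA'}^\perp$, and
\[
\sigma_\mathcal{H}(x,y) = (2-x,y),\qquad \sigma_\mathcal{H}(A) = (2,0) = A'.
\]
\par\medskip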
\begin{theorem}(Audin thm 2.2, p.49)\\
Suppose $\E$ is an $n$ dimensional Euclidean affine space. Any element of $Isom(\E)$ can be written as the composition of at most $n+1$ reflections.
\end{theorem}
\begin{proof}
If the affine isometry $\psi$ has a fixed point $O$ then there is a linear isometry $L = \Theta_O \circ \psi \circ \Theta_O^{-1}$ which by Theorem \ref{thm.vectreflect} is the composition of at most $n$ reflections through hyperplanes containing $O$. This means that the same is true for $\psi$ because
we may conjugate each of these reflections by $\Theta_O$. More precisely if $L = s_{H_1}\circ \dots \circ s_{H_j}$ then $\psi = \Theta_O^{-1}\circ L \circ \Theta_O =
\Theta_O^{-1}\circ s_{H_1} \circ \Theta_O \circ \Theta_O^{-1}\circ \dots \circ \Theta_O^{-1} \circ s_{H_j} \circ \Theta_O$ and each $\Theta_O^{-1}\circ s_{H_i} \circ \Theta_O$ is a reflection in an affine hyperplane through $O$.
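In case $\psi$ does not fix any point then choose $A\in \E$ and set $A' = \psi A\neq A$. Lemma \ref{lem.perpbisect} provides an affine hyperplane $\mathcal{H}$ with $\sigma_\mathcal{H}(A) = A'$, and since $\sigma_\mathcal{H}$ is its own inverse $\sigma_\mathcal{H}\circ \psi$ fixes $A$. By the first case $\sigma_\mathcal{H}\circ \psi$ is a composition of at most $n$ reflections, so we conclude $\psi = \sigma_\mathcal{H}\circ(\sigma_\mathcal{H}\circ \psi)$ is the composition of at most $n+1$ reflections.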
\end{proof}
Linear reflections reverse orientation in the sense that they have determinant $-1$. Since the determinant of the product is the product of the determinants this means that linear isometries have determinant $\pm 1$. The linear isometries with determinant $1$ are known as the rigid motions. The same terminology is used for the affine isometries. We use the notation $Isom^+(\E)$ for the rigid affine motions and $O^+(E)$ or $SO(E)$ for the linear rigid motions.
Choosing an orthonormal basis of Euclidean $n$-dimensional space $E$ we may identify $O(E)$ with a subset $O(n)\subset M_n$ inside the set $M_n$ of the $n\times n$ matrices. Recall the columns of the matrix of a linear map are the images of the basis vectors and a linear isometry preserves the inner product. Therefore the columns of the matrix with respect to an orthonormal basis must again form an orthonormal basis. In other words $O(n) = \{A\in M_n: AA^t = I\}$, where $A^t$ is the transpose of matrix $A$. Identifying $M_n$ with $\R^{n^2}$ we see that $A\mapsto AA^t$ is a continuous map and $O(n)$ is the inverse image of a point so $O(n)$ must be a closed subset of $M_n$ with respect to this topology. Also $O(n)$ is bounded since all the columns must be unit vectors so all in all we conclude that $O(n)$ is a compact subset of $M_n$.
More precisely the determinant shows that $O(n)$ decomposes neatly into two equal parts, the matrices with determinant $1$ called $O^+(n)$ and those with determinant $-1$.
The two dimensional case is fundamental so we examine $O^+(2)$ in more detail. It will be shown to be identified with the unit circle $U$ in the complex plane.
\begin{lemma}(Audin prop 3.4, p. 53)\\
\label{lem.O2}
The group $O^+(2)$ is isomorphic and homeomorphic to the complex units $U\subset \mathbb{C}$.
\end{lemma}
\begin{proof}
Matrix $A=\left(\begin{array}{cc}a & c\\ b&d \end{array}\right)$ belongs to $O(2)$ if and only if the columns form an orthonormal basis. So $a^2+b^2 = 1$ and $c^2+d^2=1$ and $ac+bd=0$. This means that $c=-\epsilon b$ and $d = \epsilon a$ for some $\epsilon\in\{-1,1\}$. In fact $\epsilon = \det A$ because $\det A = ad-bc = \epsilon(a^2+b^2)$.
The map $\varphi:O^+(2) \to U$ given by $\varphi\left(\begin{array}{cc}a & -b\\ b&a \end{array}\right) = a+ib$ is
a homomorphism of groups and it is also a continuous bijection with continuous inverse as the reader can check by explicit calculations, finishing the proof.
\end{proof}
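\par\medskip\noindent
The explicit calculation behind the homomorphism property is short enough to record here, as a sketch of what the reader is asked to check. For real numbers $a,b,c,d$ with $a^2+b^2=c^2+d^2=1$,
\[
\begin{pmatrix} a & -b\\ b & a\end{pmatrix}\begin{pmatrix} c & -d\\ d & c\end{pmatrix} = \begin{pmatrix} ac-bd & -(ad+bc)\\ ad+bc & ac-bd\end{pmatrix},
\qquad (a+ib)(c+id) = (ac-bd)+i(ad+bc),
\]
so the product of two such matrices corresponds to the product of the associated complex units.
\par\medskip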
We thus find a surjective map $\R \to U \to O^+(2)$ sending $\theta\in \R$ to $e^{\theta i}$. The image in $O^+(2)$ is called the rotation with angle $\theta$.
In section \ref{sec.planegeom} we will have more to say about angles and plane geometry.
\subsection{Centroid, Orthocenter and Circumcenter}
We start with generalizing triangles to any dimension. These are known as simplices.
\begin{definition}
Given affine independent points $A_0,\dots A_n$ in affine space $\E$,
the {\bf $n$-simplex} $[A_0,\dots,A_n]$ is defined as
\[
[A_0,\dots,A_n] = \{B\in \E| \exists \beta_0,\dots \beta_n\in [0,1]: \sum_{i=0}^n \beta_i = 1 \text{ and } \sum_{i=0}^n \beta_i\oa{BA_i} = 0\}
\]
\end{definition}
A $0$-simplex is a point, a $1$-simplex is simply an interval and a $2$-simplex is a triangle. $3$-simplices are known as tetrahedra.
The face of simplex $[A_0,\dots A_n]$ opposite to $A_i$ is the $(n-1)$-simplex $[A_0,\dots \hat{A_i} \dots A_n]$ where
the hat means we removed $A_i$ from the sequence. The faces of a $2$-simplex are simply its sides viewed as segments.
The numbers $\beta_i$ appearing in the definition of simplex are known as barycentric coordinates.
\begin{definition}
Numbers $\beta_0,\dots \beta_n\in \R$, not all zero, are {\bf barycentric coordinates} of point $B$ with respect to an affine frame $A_0,\dots A_n$ if $\sum_{i=0}^n \beta_i \oa{BA_i} = 0$.
\end{definition}
Barycentric coordinates are not unique, since they may be rescaled, but they become unique when we add the condition that $\sum_{i=0}^n \beta_i = 1$. This condition can always be met by rescaling because affine independence of the $A_i$ implies $\sum_{i=0}^n \beta_i\neq 0$ whenever the $\beta_i$ are not all zero. The $n$-simplex consists of those points whose normalized barycentric coordinates lie in $[0,1]^{n+1}$.
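\par\medskip\noindent
As an illustration in $\R^2$ (the frame is our own choice), take $A_0=(0,0)$, $A_1=(1,0)$, $A_2=(0,1)$ and $B=(x,y)$. The coordinates with sum $1$ are $(\beta_0,\beta_1,\beta_2) = (1-x-y,\,x,\,y)$, since
\[
(1-x-y)\,(-x,-y)+x\,(1-x,-y)+y\,(-x,1-y) = (0,0).
\]
For example $B=(\tfrac14,\tfrac14)$ has coordinates $(\tfrac12,\tfrac14,\tfrac14)\in [0,1]^3$ and lies in the triangle, whereas $B=(1,1)$ has coordinates $(-1,1,1)$ and lies outside.
\par\medskip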
What is the middle of a simplex? There are several competing notions and we will survey a few of them: centroid, circumcenter and orthocenter. For $2$-simplices (triangles) we will investigate how they relate to each other.
Looking at the definition of simplex we define the {\bf centroid} $G$ of the simplex as follows.
$G\in \E$ is the unique point such that $\sum_{i=0}^n \frac{1}{n+1}\oa{GA_i} = 0$.
The faces of the simplex also have centroids called $G_i$ and the lines $G_iA_i$ all meet in $G$. To see why this is the case, notice that the segment $A_iG_i$ is precisely the set of points with barycentric coordinates $\beta_j=\begin{cases}\frac{1-t}{n}\ &j\neq i\\ t\ &j= i
\end{cases}$ for $t\in [0,1]$. Since $G$ has barycentric coordinates $\beta_k = \frac{1}{n+1}$ this is the case $t = \frac{1}{n+1}$.
In fact $n\oa{GG_i} = \oa{A_iG}$ (exercise).
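\par\medskip\noindent
A numerical instance with $n=2$ (triangle chosen by us): for $A_0=(0,0)$, $A_1=(6,0)$, $A_2=(0,3)$ the centroid is $G=(2,1)$, because
\[
\oa{GA_0}+\oa{GA_1}+\oa{GA_2} = (-2,-1)+(4,-1)+(-2,2) = (0,0).
\]
The face opposite to $A_0$ has centroid $G_0 = (3,\tfrac32)$, the midpoint of $A_1A_2$, and indeed $2\,\oa{GG_0} = 2\,(1,\tfrac12) = (2,1) = \oa{A_0G}$.
\par\medskip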
A competing notion of the middle of a simplex is known as the {\bf circumcenter} $O$. It is the center of the unique sphere that passes through all vertices of the simplex. First we should define the $(n-1)$-{\bf sphere} $S^{n-1}(c,r)$ with center $c\in \E$ and radius $r\geq 0$ in Euclidean affine $n$-space $\E$ to be
$S^{n-1}(c,r) = \{A\in \E: |\oa{cA}| = r\}$. The following lemma shows that the circumcenter is uniquely defined by the simplex.
\begin{lemma}
\label{lem.spheresimplex}
For an $n$-simplex $[A_0,\dots,A_n]$ in affine Euclidean $n$-space $\E$ there exists a unique $(n-1)$-sphere $S^{n-1}(O,r)$ such that $\forall i: A_i\in S^{n-1}(O,r)$.
\end{lemma}
\begin{proof}
We argue by induction on the dimension $n$. The base case $n=0$ is clear so now assume the statement is true in all dimensions $< n$.
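The face of our simplex opposite to $A_n$ spans an affine hyperplane $\mathcal{H}_n$ in which, by the induction hypothesis, there is a unique $(n-2)$-sphere with center $O_n$ and radius $r_n$ that passes through all points $A_i$ with $i<n$.
By Pythagoras a point $P\in \E$ is at equal distance from all $A_i$ with $i<n$ if and only if its orthogonal projection onto $\mathcal{H}_n$ is $O_n$, so $P$ must lie on the line through $O_n$ perpendicular to $\mathcal{H}_n$. Writing $\oa{O_nP} = tu$ with $u$ a unit vector perpendicular to $\mathcal{H}_n$ and $t\in \R$, the remaining condition $|\oa{PA_0}|^2 = |\oa{PA_n}|^2$ reads $r_n^2+t^2 = |\oa{O_nA_n}|^2-2t\la u,\oa{O_nA_n}\ra+t^2$. Since $A_n\notin \mathcal{H}_n$ we have $\la u,\oa{O_nA_n}\ra\neq 0$ so there is exactly one solution $t$, giving the unique center $O$ and radius $r = |\oa{OA_0}|$.
\end{proof}
Another candidate for the middle of a simplex comes from its altitudes. From here on $\mathcal{H}_i$ denotes the {\bf altitude} of the simplex $[A_0,\dots,A_n]$ through $A_i$, that is the line through $A_i$ perpendicular to the face opposite to $A_i$. The altitudes need not meet in a single point when $n>2$. For example the $3$-simplex $[(1,0,0),(0,0,1),(1,1,0),(0,0,0)]$ has non-intersecting altitudes $\mathcal{H}_2$ and $\mathcal{H}_1$ as is easily seen by inscribing this tetrahedron inside a regular cube.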
For $n=2$ the altitudes do meet in a single point called the {\bf orthocenter} $H$ of the triangle $\mathcal{A} = [A_0,A_1,A_2]$. To see why this is the case consider the
triangle $\mathcal{B}=[B_0,B_1,B_2]$ defined by $h_{G,-2}(A_i) = B_i$ where $G$ is the centroid of $\mathcal{A}$.
Notice that $A_iA_j$ is parallel to $B_iB_j$ by Lemma \ref{lem.P1}. Also when $i,j,k$ are distinct $A_i$ is the midpoint of the segment $B_jB_k$.
This is because the midpoint of segment $A_jA_k$ is $G_i$ and $\oa{B_jA_i} = \oa{GA_i}-\oa{GB_j} = -2\oa{GG_i}+2\oa{GA_j} = 2\oa{G_iA_j}$ and likewise $\oa{B_kA_i}=2\oa{G_iA_k}$. Finally the altitudes of $\mathcal{A}$ are precisely the perpendicular bisectors of the sides of $\mathcal{B}$. The latter meet in the circumcenter of $\mathcal{B}$ so this point is the orthocenter of $\mathcal{A}$.
Looking carefully at what we just proved we find the following theorem of Euler:
\begin{theorem} (Euler line)\\
The circumcenter, centroid and orthocenter $O,G,H$ of triangle $\mathcal{B} = [B_0,B_1,B_2]$ lie on a line and more precisely $-2\oa{GO} = \oa{GH}$.
Also on this line is the center $N$ of the circle through the midpoints $G_i$ of the sides of our triangle. Moreover, $N$ is the midpoint of the segment $HO$.
\end{theorem}
\begin{proof}
As before we use two triangles $\mathcal{A},\mathcal{B}$, where $\mathcal{B} = h_{G,-2}\mathcal{A}$. By construction they have the same centroid $G$ and $A_i=G_i$. Moreover, we showed the circumcenter $O=O_\mathcal{B}$ of $\mathcal{B}$ is the orthocenter $H_\mathcal{A}$ of $\mathcal{A}$.
By definition this means the orthocenter $H=H_\mathcal{B}$ of $\mathcal{B}$ is $h_{G,-2}H_\mathcal{A} = h_{G,-2}O$. We conclude that $O,H,G$ lie on a line and moreover $\oa{GH} = h_{G,-2}\oa{GO} = -2\oa{GO}$.
Notice $N$ is the circumcenter $O_\mathcal{A}$ of $\mathcal{A}$ and so $h_{G,-2}O_\mathcal{A} = O_\mathcal{B} = O$. Therefore $O,G,N,H$ lie on the same line and
$\oa{GO} = h_{G,-2}\oa{GN} = -2\oa{GN}$. This means that $\oa{NO} = \oa{NG}+\oa{GO} = \frac{3}{2}\oa{GO} = 2\oa{GO}+\oa{GN} = \oa{HG}+\oa{GN} = \oa{HN}$.
\end{proof}
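
To make the theorem concrete, here is a small worked example. The coordinates are our own choice for illustration and are not taken from Audin. In $\R^2$ take $B_0=(0,0)$, $B_1=(4,0)$ and $B_2=(1,3)$. Then
\[
G = \left(\tfrac{5}{3},1\right),\qquad O = (2,1),\qquad H = (1,1),\qquad N = \left(\tfrac{3}{2},1\right).
\]
Indeed $O$ has distance $\sqrt{5}$ to all three vertices, and $H$ lies on the altitude $x=1$ through $B_2$ as well as on the altitude $y=x$ through $B_0$. All four points lie on the line $y=1$ and
\[
\oa{GH} = \left(-\tfrac{2}{3},0\right) = -2\left(\tfrac{1}{3},0\right) = -2\oa{GO},\qquad \oa{HN} = \left(\tfrac{1}{2},0\right) = \oa{NO}.
\]
Finally the midpoints of the sides, $G_0 = \left(\tfrac{5}{2},\tfrac{3}{2}\right)$, $G_1 = \left(\tfrac{1}{2},\tfrac{3}{2}\right)$ and $G_2 = (2,0)$, all have distance $\tfrac{1}{2}\sqrt{5}$ to $N$.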
By similar methods much more can be said about the circle through the midpoints of the sides of our triangle. It is the famous nine-point circle.
\begin{theorem} (Nine point circle)\\
Given triangle $\mathcal{B} = [B_0,B_1,B_2]$, there is a circle that passes through the following nine special points:
the midpoints $G_i$ of the sides, the intersections $H_i$ of the altitudes $\mathcal{H}_i$ with the opposite sides, and the midpoints of the segments $B_iH$, where $H$ is the orthocenter.
\end{theorem}
\begin{proof}
Keeping the previous notation we call our triangle $\mathcal{B}$ and define the nine-point circle $\mathcal{C}$ as the circle through the $G_i=A_i$. There must be a unique such circle by Lemma \ref{lem.spheresimplex}, also it has center $N$. We now need to show that $\mathcal{C}$ contains both the $H_i$ and the midpoints $M_i$ of the segments $B_iH$.
First $h_{G,-\frac{1}{2}}\circ h_{H,2} = h_{N,-1}$ because the left hand side is a dilation with ratio $-1$ that has fixed point $N$ by the previous theorem. Also looking at the left hand side we see that $h_{N,-1}(M_i) = A_i$, since $h_{H,2}(M_i) = B_i$ and $h_{G,-\frac{1}{2}}(B_i) = A_i$. Since $h_{N,-1}$ is an isometry we have $|NM_i| = |NA_i|$ so the $M_i$ are on $\mathcal{C}$.
Second $H_i$ and $G_i$ are both on line $B_jB_k$ for $i,j,k$ distinct. To show that $|NH_i| = |NG_i|$ we argue as follows. The lines $HH_i$ and $OG_i$ are parallel because both are perpendicular to $B_jB_k$. Since by the previous theorem $N$ is the midpoint of segment $HO$, Lemma \ref{lem.P1} shows $NN_i$ is also perpendicular to $B_jB_k$ where $N_i$ is the midpoint of segment $H_iG_i$. It follows by a reflection in line $NN_i$ that $|NH_i| = |NG_i|$ as desired.
\end{proof}
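
As a sanity check one can list all nine points for a concrete triangle. The following numbers are an illustration of our own, assuming $\E=\R^2$ with $B_0=(0,0)$, $B_1=(4,0)$, $B_2=(1,3)$, for which $H=(1,1)$, $O=(2,1)$ and $N=\left(\tfrac{3}{2},1\right)$:
\[
\begin{array}{lll}
G_0 = \left(\tfrac{5}{2},\tfrac{3}{2}\right), & G_1 = \left(\tfrac{1}{2},\tfrac{3}{2}\right), & G_2 = (2,0),\\[2pt]
H_0 = (2,2), & H_1 = \left(\tfrac{2}{5},\tfrac{6}{5}\right), & H_2 = (1,0),\\[2pt]
M_0 = \left(\tfrac{1}{2},\tfrac{1}{2}\right), & M_1 = \left(\tfrac{5}{2},\tfrac{1}{2}\right), & M_2 = (1,2).
\end{array}
\]
Each of these has squared distance $\tfrac{5}{4}$ to $N$. For instance for $H_1$ we find $\left(\tfrac{2}{5}-\tfrac{3}{2}\right)^2+\left(\tfrac{6}{5}-1\right)^2 = \tfrac{121}{100}+\tfrac{4}{100} = \tfrac{5}{4}$. The radius $\tfrac{1}{2}\sqrt{5}$ is half the circumradius $\sqrt{5}$, as expected from the dilation $h_{G,-2}$.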
There are in fact many more interesting points on the nine-point circle. For example Feuerbach found there are precisely four circles that are tangent to all three sides of our triangle and the nine-point circle is tangent to all four of those!
\subsection*{Exercises}
\begin{enumerate}
\item Is it true that for any subset $S$ in a Euclidean vector space $E$ we have $(S^\perp)^{\perp} = S$?
\item Imagine a line $\mathcal{L}$ in affine Euclidean plane $\E$ and a vector $u\in L$ in the direction (underlying vector subspace) of $\mathcal{L}$.
The affine map $\gamma_{\mathcal{L},u}:\E\to\E$ defined by $\gamma_{\mathcal{L},u} = t_u \circ \sigma_{\mathcal{L}}$ is called a glide reflection.
\begin{enumerate}
\item Prove that the glide reflection $\gamma_{\mathcal{L},u}$ is an affine isometry.
\item Why can $\gamma_{\mathcal{L},u}$ not be written as the composition of two or fewer reflections? (Hint: fixed points)
\item Write $\gamma_{\mathcal{L},u}$ explicitly as a composition of three reflections.
\end{enumerate}
\item Verify that $s_F\circ s_F$ is the identity. Give an example of linear subspaces $F,G\subset E$ such that $s_F\circ s_G \neq s_G\circ s_F$.
\item If we choose a basis, $O(E)$ can be identified with a subset of $\R^{n^2}$ where $n=\dim(E)$. Show that if $f$ is the composition of an even number of reflections then there is a continuous function $\gamma:[0,1]\to O(E)$ with $\gamma(0)=id_E$ and $\gamma(1) = f$.
\item Imagine a $2$-simplex $[A_0,A_1,A_2]$ with centroid $G$ and $G_i$ the centroid of the face opposite to $A_i$.
Using $\oa{GA_0}+\oa{GA_1}+\oa{GA_2} = 0 = \oa{G_0A_1}+\oa{G_0A_2}$, prove that $2\oa{GG_0} = \oa{A_0G}$.
\item Prove that for any nonzero $\lambda\in \R$ and $O\in \E$ the dilation $h_{O,\lambda}$ sends any $n$-simplex in $\E$ to another $n$-simplex in $\E$.
\item By a rotation around point $A$ in affine Euclidean plane $\E$ we mean any composition of two affine reflections in lines passing through $A$.
For $A,B\in \E$ let $\rho_A$ be a rotation around $A$ and $\rho_B$ a rotation around $B\neq A$.
\begin{enumerate}
\item What is the determinant of the linear map $\oa{\rho_A}:E\to E$ associated to $\rho_A$?
\item Prove that $\oa{\rho_A\circ \rho_B} = \oa{\rho_A}\circ \oa{\rho_B}$.
\item What is the determinant of $\oa{\rho_A\circ \rho_B}$?
\item Is it true that $\rho_A\circ \rho_B$ must also be a rotation around some point $C\in \E$? Prove or provide a counterexample.
\end{enumerate}
\end{enumerate}
\section{Plane Euclidean geometry}
\label{sec.planegeom}
Throughout this section $\E$ denotes a two-dimensional Euclidean affine space with underlying (2d) vector space $E$.
We begin our study of the plane with a more detailed discussion of the isometries, starting with $E$. From Theorem \ref{thm.vectreflect}
we know that every linear isometry is either the identity, a reflection in a line through the origin or a composition of two such reflections.
The latter are usually called rotations and they will play an important role in what follows.
\begin{definition}
A {\bf rotation} is an element of $O(E)$ that can be written as the composition of two reflections.
\end{definition}
In fact the set $O^+(E)$ of linear isometries of positive determinant coincides with the set of all rotations.
It forms a commutative subgroup isomorphic to $O^+(2)$ or the unit circle in the complex plane, see Lemma \ref{lem.O2}.
Notice that the identity is also a rotation according to this definition.
As we mentioned before, reflections reverse orientation while rotations preserve it.
Orientation is an important and often overlooked concept so we will provide a definition.
\begin{definition}(Orientation)\\
Recall a basis for a vector space $V$ is an \emph{ordered} sequence of independent vectors that span $V$.
An {\bf orientation} of a vector space $V$ is a choice of an equivalence class of bases of $V$.
Here two bases of $V$ are said to be equivalent if the linear map sending one to the other has positive determinant.
\end{definition}
For $V=\R^2$ the standard choice of orientation is to take the equivalence class of the standard basis $(e_1,e_2)$.
It often appears as saying that the positive direction of rotation is counter-clockwise. In $\R^3$ orientation is
often referred to as the `right hand rule' but we will deal with space later.
As promised reflections do reverse orientation in the following sense. The reflection $s=s_{v^\perp}$
is easy to describe in the basis $v,v'$ where $v$ is orthogonal to $v'$. By definition $s(v) = -v$ while $s(v') = v'$
so the matrix with respect to this basis is $\left( \begin{array}{cc} -1 & 0\\ 0& 1 \end{array}\right)$\footnote{Remember to write down the matrix of a linear transformation one just lists the images of the base vectors as columns of this matrix.} so the determinant is $-1$.
To connect with the above definition we would say that the bases $(v,v')$ and $(s(v),s(v'))$ belong to different equivalence classes, i.e. orientations.
Since there are only two orientations for $E$ and a single reflection reverses the orientation, all rotations preserve orientation. This is why they play such an important role.
Closely tied to the concept of rotation is the notion of angle. But what exactly \emph{is} an angle? This is not as straightforward as it seems.
In fact we will define four related notions of angle:
\begin{enumerate}
\item The oriented angle $\an(u,v)$ between two vectors $u,v\in E$.
\item The oriented angle of lines where we identify $\an(u,v)$ with $\an(u,-v)$.
\item The geometric angle which identifies $\an(u,v)$ and $\an(v,u)$.
\item Given an orientation of $E$, the measure $\man(u,v)$ of the oriented angle, a real number mod $2\pi$.
\end{enumerate}
What corresponds to the `usual' notion of angles as real numbers modulo $2\pi$ is the measure of oriented angle. However it is NOT well defined in
our Euclidean vector space $E$. The common definition of an angle $\theta$ between vectors $u,v$ using the inner product as
$\cos \theta = \frac{\la u,v \ra}{|u||v|}$ is also incomplete. The right hand side is symmetric in $u,v$ so without additional assumptions it cannot possibly distinguish between $\theta$ and $-\theta$. Do we go from $u$ to $v$ or vice versa and which way do we go?
The issue is that there is no natural choice for orientation of $E$. Of course we can simply choose an arbitrary orientation but this muddles the theory and clutters the proofs. Refusing to do so is much like the way affine space arises from our refusal to make an arbitrary choice of origin.
\subsection{Angles}
In this subsection we define four notions of angle, the most important of which is called the oriented angle between two vectors.
It is firmly rooted in our understanding of rotations.
\begin{lemma}(Audin prop 1.1, p.65)\\
\label{lem.planerot}
For any (ordered) pair of unit vectors $(u,u')$ in $E$ there is a unique rotation that sends $u$ to $u'$.
\end{lemma}
\begin{proof}
Suppose $u,u'$ are unit vectors and choose a unit vector $v$ such that $u,v$ form an orthonormal basis.
Then $u' = au+bv$ for some $a,b\in \R$ with $a^2+b^2=1$. With respect to the basis $u,v$ any orthogonal linear map
of determinant $1$ sending $u$ to $u'$ has matrix $\left( \begin{array}{cc} a & -b\\ b& a \end{array}\right)$,
see also Lemma \ref{lem.O2}.
\end{proof}
Our oriented angles now come out by considering pairs of vectors, modulo rotations.
\begin{definition}(Audin p.66)\\
\label{def.oangle}
Define $\hat{\mathcal{A}}$ to be the set of pairs of unit vectors $(u,v)$ in $E$.
Two pairs $(p_1,p_2),(q_1,q_2)\in \hat{\mathcal{A}}$ are said to be equivalent if there exists a rotation
that sends both $p_1$ to $q_1$ and $p_2$ to $q_2$.
The equivalence classes are called {\bf oriented angles} and the set of oriented angles is denoted $\mathcal{A}$.
The class of $(u,v)$ is denoted\footnote{Audin uses just $(u,v)$ to denote both the pair and the oriented angle.} $\an(u,v)$ and is called the oriented angle between $u$ and $v$.
The {\bf flat angle} is $\an(v,-v)$ for any unit vector $v$.\\
This definition is extended to any pair of non-zero vectors $u',v'$ where we set $\an(u',v')=\an(\frac{u'}{|u'|},\frac{v'}{|v'|})$.
\end{definition}
To clarify the relation between angles and rotations further we define a map from oriented angles to rotations $O^+(E)$ as follows.
\begin{lemma}(Audin lemma 1.2, p.66)\\
The map $\hat{\Phi}:\hat{\mathcal{A}}\to O^+(E)$ is defined by setting $\hat{\Phi}(u,v)$ to be the unique rotation sending $u$ to $v$.
It defines a bijection $\Phi:\mathcal{A}\to O^+(E)$ by sending the class of $\an(u,v)$ to $\hat{\Phi}(u,v)$.
\end{lemma}
\begin{proof}
To show $\hat{\Phi}$ is surjective choose any $f\in O^+(E)$ and unit vector $u\in E$.
We have $\hat{\Phi}(u,f(u)) = f$ since both sides send $u$ to $f(u)$ and there is only one such rotation by Lemma \ref{lem.planerot}.
Next we need to check that $\Phi$ is well-defined by showing that if $(p_1,p_2)$ is equivalent in the sense of Definition \ref{def.oangle} to $(q_1,q_2)$ then $\hat{\Phi}$ maps them to the same rotation in $O^+(E)$. By Lemma \ref{lem.planerot} there is a unique rotation $g$ taking $p_1$ to $q_1$ and by assumption
it also takes $p_2$ to $q_2$. Suppose $f=\hat{\Phi}(p_1,p_2)$ so $f(p_1)=p_2$ then plane rotations commute so $f(q_1) = f\circ g(p_1) = g\circ f(p_1) = g(p_2) = q_2$, showing $f = \hat{\Phi}(q_1,q_2)$.
Finally we should show that if $\hat{\Phi}(p_1,p_2) = \hat{\Phi}(q_1,q_2)=f$ then there is a single rotation that sends both $p_1$ to $q_1$ and $p_2$ to $q_2$. By Lemma \ref{lem.planerot} there is a unique rotation $g$ sending $p_1$ to $q_1$. By assumption $f\circ g\circ f^{-1}$ sends $p_2$ to $p_1$ to $q_1$ to $q_2$. Since the rotations form a commutative group $f\circ g\circ f^{-1} = g$, so $g$ also sends $p_2$ to $q_2$.
\end{proof}
The map $\Phi$ is bijective so with any rotation we may identify an oriented angle in $\mathcal{A}$. We showed there is a unique rotation that is the image of
$\an(u,v)$ so it rotates unit vector $u$ to $v$. We call this the rotation over angle $\an(u,v)$ and to any rotation there corresponds such an angle.
Moreover since the map $\Phi$ is a bijection we may use it to 'add' angles by composing their images in the group $O^+(E)$. More precisely we set
\[\an(u,v)+\an(w,z) = \Phi^{-1}\big(\ \Phi(\an(u,v))\circ \Phi(\an(w,z))\ \big)\]
Unpacking the definitions this says that the sum of the angles is an angle $\an(a,b)$ where $b = f(a)$ and the rotation
$f$ is the result of first rotating $u$ to $v$ and then $w$ to $z$. At the cost of changing $z$ we may always choose $v=w$ and then this kind of addition is just placing the angles next to each other (draw a picture!). This gives the important special case
\[
\an(u,v)+\an(v,z) = \an(u,z)
\]
Now let us turn to measuring angles with respect to an orientation. This depends on the following lemma.
\begin{lemma}
Suppose $(p_1,p_2)$ and $(q_1,q_2)$ are two orthonormal bases of $E$ in the same orientation $\mathcal{O}$ and $r\in O^+(E)$ is a rotation.
The matrix of $r$ with respect to $(p_1,p_2)$ is the same as that with respect to $(q_1,q_2)$.
The orientation $\mathcal{O}$ thus determines an isomorphism $F_{\mathcal{O}}:O^+(E)\to O^+(2)$.
\end{lemma}
\begin{proof}
The linear map $\rho$ sending basis $p$ to basis $q$ must be in $O^+(E)$ since both bases are orthonormal and in the same orientation.
Now rotations commute so if $r p_1 = a p_1+b p_2$ and $r p_2 = c p_1+d p_2$ then also $r q_1 = r\rho p_1 = \rho r p_1 = \rho(ap_1+bp_2) = aq_1+bq_2$ and
similarly $r q_2 = r\rho p_2 = \rho r p_2 = \rho(cp_1+dp_2) = cq_1+dq_2$.\\
The isomorphism $F_{\mathcal{O}}$ comes from choosing any basis in $\mathcal{O}$. This is an isomorphism by Lemma \ref{lem.O2}.
\end{proof}
Since the matrix of a rotation only depends on the orientation of the basis we can use it to define our measure of angle.
\begin{definition}(Measure of an oriented angle)\\
Given an orientation $\mathcal{O}$ of $E$ and $a\in \mathcal{A}$, the matrix $F_\mathcal{O}(\Phi a)\in O^+(2)$ is of the form
$\left( \begin{array}{cc} \cos \mu & -\sin \mu\\ \sin \mu& \cos \mu \end{array}\right)$. This determines a number $\mu = \mu(a) \in \R$ modulo $2\pi$ and we call it the {\bf measure} $\mu(a)$ of oriented angle $a$.
\end{definition}
It should be clear from this definition that the measure of the angle really does depend on the choice of orientation.
Also the definition agrees with $\cos \mu = \la u,v \ra$ where $\mu = \mu\an(u,v)$ for unit vectors $u,v$.
The measure of the flat angle is $\pi\mod 2\pi$. Since $F_\mathcal{O}$ is an isomorphism we have $\mu(a)+\mu(b) = \mu(a+b) \mod 2\pi$ as
implied by the word measure.
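
The dependence on orientation can be seen in a small example, which is our own illustration under the assumption $E=\R^2$ with its standard inner product. Let $r = \Phi(\an(e_1,e_2))$ be the rotation sending $e_1$ to $e_2$, so that $r(e_2) = -e_1$. Listing the images of the basis vectors as columns, the matrix of $r$ with respect to the basis $(e_1,e_2)$ and with respect to the basis $(e_2,e_1)$, which lies in the other orientation, is respectively
\[
\left( \begin{array}{cc} 0 & -1\\ 1& 0 \end{array}\right)
\qquad \mathrm{and} \qquad
\left( \begin{array}{cc} 0 & 1\\ -1& 0 \end{array}\right).
\]
These matrices are each other's inverse, so the two measures of $\an(e_1,e_2)$ are opposite: one is $\frac{\pi}{2}$ and the other is $-\frac{\pi}{2}$ modulo $2\pi$. In both cases $\cos \mu = 0 = \la e_1,e_2\ra$, which shows again that the inner product alone cannot tell the two apart.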
Returning to the oriented angles $\an(u,v)$ we next define the angle between two lines and the geometric angle between two vectors. Neither depends on a choice of orientation.
\begin{definition}(Angle between lines and geometric angle)\\
The {\bf angle between two lines} spanned by $u$ and $v$ in $E$ is defined to be $\an(u,v)\mod \an(u,-u)$.\\
The {\bf geometric angle} between two vectors $u,v\in E$ is the equivalence class of $\an(u,v)$ under the equivalence relation
where we identify $\an(x,y)$ with $\an(y,x)$ for any $x,y\in E$.
\end{definition}
One should check that the definition of the angle between lines does not depend on the choice of spanning vectors $u$ and $v$. To this end recall that $\an(u,-u)$ represents the flat angle and flipping the sign of $u$ or $v$ amounts to adding or subtracting such a flat angle.
The point of defining the notion of geometric angle is that this is what is actually invariant under all isometries. By definition a rotation $\rho\in O^+(E)$ preserves oriented angles in the sense that $\an(u,v) = \an(\rho u,\rho v)$. However a reflection $s$ does not.
\begin{lemma}(Audin prop 1.10, p.70 )\\
For any unit vectors $u,v\in E$ and a reflection $s$ we have $\an(s u,s v) = \an(v,u)$.
\end{lemma}
\begin{proof}
The reflection $s' = s_{(u-v)^{\perp}}$ sends $u$ to $v$ and $v$ to $u$.
The composition $s\circ s'$ is a rotation so $\an(v,u) = \an(s\circ s'(v),s\circ s'(u)) = \an(s u,s v)$.
\end{proof}
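
A quick check of this lemma in coordinates may help. The setting $E=\R^2$ and the particular vectors are assumptions made only for this illustration. Let $s$ be the reflection in the line spanned by $e_1$, so $s(e_1) = e_1$ and $s(e_2) = -e_2$, and take $u=e_1$, $v=e_2$. Then
\[
\an(su,sv) = \an(e_1,-e_2).
\]
The rotation $\rho$ sending $e_1$ to $-e_2$ has matrix $\left( \begin{array}{cc} 0 & 1\\ -1& 0 \end{array}\right)$ with respect to $(e_1,e_2)$, so it also sends $e_2$ to $e_1$. Hence $\rho$ is at the same time the unique rotation sending $v$ to $u$ and therefore $\an(e_1,-e_2) = \an(e_2,e_1) = \an(v,u)$, in agreement with the lemma.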
It follows from this discussion that geometric angles are preserved by all isometries, both rotations and reflections.
So far we considered angles in a two-dimensional Euclidean vector space $E$. These can be carried over to the affine Euclidean plane $\E$ as follows.
\begin{definition}(Angles in the affine Euclidean plane)\\
For points $O,A,B\in \E$ with $O\neq A$ and $O\neq B$, the oriented angle between the segments ($1$-simplices) $[O,A]$ and $[O,B]$ is denoted $\an OAB$
and defined as $\an OAB = \an(\oa{OA},\oa{OB})$. Note that the vertex of the angle is written first.\\
For a pair of lines $\mathcal{L}_1,\mathcal{L}_2$ intersecting at $O$ the angle between them is the angle between the lines $\Theta_O(\mathcal{L}_i)$ in $E$.
The geometric angle and the measure of the angle $\an OAB$ are defined as explained above.
\end{definition}
\subsection{Plane constructions involving angles}
To illustrate what we just learned about angles let us prove a few elementary facts involving angles.
\begin{lemma}(Angle sum, Audin prop 1.14 p.73)\\
\label{lem.anglesum}
If $A,B,C$ are distinct points in an affine Euclidean plane then $\an ABC + \an BCA+\an CAB$
is the flat angle and so is $\an ACB + \an BAC+\an CBA$.
\end{lemma}
\begin{proof}
The rotation $h=h_{C,-1}$ sends $\oa{CA}$ to $\oa{AC}$ and $\oa{CB}$ to $\oa{BC}$ so
$\an(\oa{CA},\oa{CB}) = \an(h \oa{CA},h\oa{CB}) = \an(\oa{AC},\oa{BC})$. Now $\an(x,y)+\an(y,z) = \an(x,z)$ so
\[\an(\oa{AB},\oa{AC})+\an(\oa{BC},\oa{BA})+\an(\oa{CA},\oa{CB}) = \an(\oa{AB},\oa{AC})+\an(\oa{BC},\oa{BA})+\an(\oa{AC},\oa{BC}) = \an(\oa{AB},\oa{BA})\]
which is the flat angle. The second statement follows from $\an(x,y) = -\an(y,x)$.
\end{proof}
Notice the cyclic permutation of the order of the vertices $A,B,C$ in the angle sum: ABC, BCA, CAB.
In the presence of an orientation this becomes the familiar theorem about the angle sum in a triangle.
If $A,B,C$ are distinct points in an oriented affine Euclidean plane then $\mu(\an ABC)+\mu(\an BCA)+\mu(\an CAB) = \pi \mod 2\pi$.
This follows from the previous statement since the measure of the sum is the sum of the measures modulo $2\pi$.
More interestingly we have the classic proposition about the angles in the circumference.
\begin{theorem}(Audin prop 1.17, p.74)\\
If $A,B,C$ are distinct points in an affine Euclidean plane and $O$ is the circumcenter of the triangle $[A,B,C]$ then
\[
\an OAB = 2 \an CAB
\]
\end{theorem}
\begin{proof}
$O$ is the center of the circle through $A,C$ so it is fixed by the reflection in the perpendicular bisector $\mathcal{H}$ of segment $[A,C]$.
This is the affine line of all points of equal distance to both $A$ and $C$. Since $\sigma_{\mathcal{H}}$ reverses orientation and permutes $A$ and $C$
we have $\an COA = \an ACO$. Lemma \ref{lem.anglesum} tells us that $\an COA + \an ACO+ \an OAC = 2\an COA+\an OAC$ is flat. In the same way
$2\an CBO+\an OCB$ is also flat. Adding two flat angles gives the angle $0$ so
\[
0=2\an COA+\an OAC+2\an CBO+\an OCB = 2\an CBA+\an OAB
\]
Since $\an(x,y) = -\an(y,x)$ we are done.
\end{proof}
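
The simplest instance of this theorem can be verified by hand. The following configuration is our own choice for illustration, assuming $\E=\R^2$: take $A=(1,0)$, $B=(0,1)$ and $C=(-1,0)$ on the unit circle, so that $O=(0,0)$. Then
\[
\an OAB = \an(e_1,e_2), \qquad \an CAB = \an(\oa{CA},\oa{CB}) = \an\big((2,0),(1,1)\big) = \an(e_1,w) \quad \mathrm{with}\quad w = \tfrac{1}{\sqrt{2}}(e_1+e_2).
\]
The rotation $\rho$ sending $e_1$ to $w$ has matrix $\frac{1}{\sqrt{2}}\left( \begin{array}{cc} 1 & -1\\ 1& 1 \end{array}\right)$ and it sends $w$ to $e_2$. Since rotations preserve oriented angles, $\an(w,e_2) = \an(\rho e_1,\rho w) = \an(e_1,w)$ and so
\[
\an OAB = \an(e_1,w)+\an(w,e_2) = 2\an(e_1,w) = 2\an CAB.
\]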
Returning briefly to our discussion of centers of triangles, for each pair of lines there is a pair of bisectors of the two angles defined by the lines.
If the lines are directed by unit vectors $u,v$ then these bisectors are directed by $u+v$ and $u-v$. They are perpendicular and cut the angles in two as reflection in either of the bisectors permutes the original two lines.
One can prove that given a triangle the six bisectors of the lines generated by the sides of the triangle
meet in triples to define four points. These points are the centers of four circles tangent to the extended sides of the triangle. One of them is inside the triangle and is called the inscribed circle. As was mentioned earlier these circles are all tangent to the nine-point circle, making it effectively a thirteen-point circle. Curiously, if instead we \emph{trisect} each angle then the six trisectors inside the triangle meet in adjacent pairs to define a regular triangle. This is known as Morley's theorem.
Finally let us mention an interesting map that is not an isometry but as we will see later it reverses angles as if it were an ordinary reflection.
In some sense inversions are reflections in a circular mirror.
\begin{definition}(Audin def 4.1, p.84)\\
For nonzero $\lambda\in \R$ and $O$ a point in affine Euclidean plane $\E$, consider the inversion map $I_{O,\lambda}:\E-\{O\}\to\E-\{O\}$
defined by $I_{O,\lambda}(M) = M'$ where $M'$ is the point on line $OM$ such that $\oa{OM'} = \frac{\lambda}{|OM|^2}\oa{OM}$.
\end{definition}
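
To get a feeling for this map, consider the following example, where the choice $\E=\R^2$, $O=(0,0)$ and $\lambda=4$ is ours and serves only as an illustration:
\[
I_{O,4}(1,0) = \tfrac{4}{1}(1,0) = (4,0),\qquad I_{O,4}(0,4) = \tfrac{4}{16}(0,4) = (0,1),\qquad I_{O,4}(\sqrt{2},\sqrt{2}) = \tfrac{4}{4}(\sqrt{2},\sqrt{2}) = (\sqrt{2},\sqrt{2}).
\]
Points inside the circle of radius $2$ around $O$ are exchanged with points outside it, while the points of this circle itself are fixed, just as a mirror fixes its own points. In general the definition gives $|OM'| = \frac{|\lambda|}{|OM|}$, and applying the map twice yields
\[
\frac{\lambda}{|OM'|^2}\oa{OM'} = \frac{\lambda |OM|^2}{\lambda^2}\cdot\frac{\lambda}{|OM|^2}\oa{OM} = \oa{OM},
\]
so $I_{O,\lambda}\circ I_{O,\lambda}$ is the identity on $\E-\{O\}$.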
\section{Three dimensional Euclidean geometry}
In this section we briefly explore three dimensional Euclidean geometry. Our main focus is the classification of the regular polyhedra, also known as the Platonic solids. Traditionally they are a centerpiece of mathematics. For example Euclid's \emph{Elements} concludes with constructions of the polyhedra and an argument that there can only be five of them. Even though the Platonic solids do not seem to have much significance in themselves, the tools needed to define, classify and construct them nicely illustrate this kind of geometry.
A new feature in dimension three is that in addition to usual angles between planes one now also has a notion of 'solid angle' or a piece of a sphere cut out by a cone. As such we are naturally led to consider the geometry of the sphere itself.
\subsection{Rotations revisited}
As in the plane we define a rotation in general to be a composition of two reflections in hyperplanes through the origin.
According to Theorem \ref{thm.vectreflect} every linear isometry of $3$-dimensional Euclidean space $E$ is a composition of $0,1,2$ or $3$ reflections in planes through the origin. Looking at the determinant it is still true that $O^+(E)$ consists of only rotations (including the identity).
In three dimensions the planes $P,P'$ we are reflecting in generally meet in a line $L$ and points on this line are left fixed by the rotation $r = s_P\circ s_{P'}$. The plane $L^\perp$ is also sent to itself (but not pointwise) because $r$ preserves the inner product.
Since $E = L\oplus L^\perp$ we may describe the rotation $r$ by saying how it behaves in $L^\perp$. Viewed as a Euclidean vector space in its own right
$L^\perp$ is two dimensional and the restriction of $r$ to $L^\perp$ is a plane rotation as described in the previous sections.
This is because the restriction of $r$ is now the composition of reflections in the lines $P\cap L^\perp$ and $P'\cap L^\perp$ viewed as an element of $O^+(L^\perp)$. As such we may describe it using an oriented angle.
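
A concrete example may clarify this description. The planes below are our own choice, assuming $E=\R^3$ with the standard inner product. Let $P = e_2^\perp$ and $P' = (e_1-e_2)^\perp$. The reflection $s_{P'}$ exchanges $e_1$ and $e_2$ and fixes $e_3$, while $s_P$ sends $e_2$ to $-e_2$ and fixes $e_1$ and $e_3$. Listing images of the basis vectors as columns, the rotation $r = s_P\circ s_{P'}$ has matrix
\[
\left( \begin{array}{ccc} 0 & 1 & 0\\ -1 & 0 & 0\\ 0 & 0 & 1 \end{array}\right),
\qquad \mathrm{that\ is} \qquad r(e_1) = -e_2,\quad r(e_2) = e_1,\quad r(e_3) = e_3.
\]
Here $L = P\cap P'$ is spanned by $e_3$ and $L^\perp$ is spanned by $e_1,e_2$. The restriction of $r$ to $L^\perp$ is the composition of the reflections in the lines $P\cap L^\perp$, spanned by $e_1$, and $P'\cap L^\perp$, spanned by $e_1+e_2$. These lines make a geometric angle of measure $\frac{\pi}{4}$ and the resulting plane rotation is a quarter turn, twice that angle.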
A more concrete approach is to choose an orientation $\mathcal{O}_E$ for $E$ usually referred to as the `right hand rule' and also an orientation $\mathcal{O}_L$ of line $L$ usually thought of as an arrow on $L$. In that case we obtain an orientation $\mathcal{O}_{L^\perp}$ of $L^\perp$ as follows. Choose a basis vector $\ell$ in $\mathcal{O}_L$. A basis $(a,b)$ of $L^\perp$ is in the orientation $\mathcal{O}_{L^\perp}$ if the triple $(a,b,\ell)$ is in orientation $\mathcal{O}_E$. In other words it should agree with the right hand rule where $a,b$ are index finger and middle finger and $\ell$ is the thumb.
With all these orientations in place we are in a position to use the measure of the oriented angle of the rotation in $L^\perp$. The rotation can then be described by giving an axis $\ell$ and an angle $\mu\in \R \mod 2\pi$. Choosing the angle to be in $[0,2\pi)$ we may encode the rotation into the single axis vector $\ell$ by requiring $|\ell| = \mu$. We will then use the notation $r_{\ell}$ to denote the rotation with axis $\ell$ and angle $|\ell|\in [0,2\pi)$. By convention the case $\ell=0$ refers to the identity.
Taking the standard basis $(e_1,e_2,e_3)$ of $E=\R^3$ to represent our orientation the most commonly used rotations are $r_{\mu e_2}$ (rolling), $r_{\mu e_1}$ (pitching) and $r_{\mu e_3}$ (yawing)\footnote{rollen, stampen en gieren in goed Nederlands.}. For pilots the coordinate system is often thought of as being attached to the cockpit such that $e_2$ points forwards, $e_1$ points to the right wing and $e_3$ points upwards. We prefer instead to fix the coordinate system once and for all.
This notation is sufficient in the sense that any other rotation can be written as a composition of rolls, pitches and yaws. Actually rolls and pitches suffice. We will briefly return to this point using quaternions later.
\subsection{Convex polyhedra}
Convexity is an important property of objects in affine space (no inner product/distance necessary!). When discussing polyhedra in space we (as did Euclid) often assume implicitly that the polyhedron is convex. Roughly it means the polyhedron looks like a ball without cavities, craters or holes. More formally,
\begin{definition}
A subset $S$ of affine space $\E$ is called {\bf convex} if for all $A,B\in S$ the line segment (1-simplex) $[A,B]\subset S$.
\end{definition}
In any dimension the ball $B=\{x\in E:|x|\leq r\}$ and the simplex are convex (Exercise!). The regular polygons, triangle, square, pentagon \dots in the plane (equal angles and equal sides) are also convex.
Inspecting the definition we remark that the intersection of two convex subsets is again convex (Exercise!). This allows us to generate many examples of convex subsets by taking convex hulls. The convex hull of a finite set is called a polyhedron.
\begin{definition} (Audin Def 5.5 p.29)\\
The {\bf convex hull} of $S\subset \E$ is the intersection of all convex subsets of $\E$ that contain $S$.
\end{definition}
The convex hull of a sphere $S^{n-1}$ is the solid ball $B$ bounded by it. Indeed the ball is convex and any point in $B$ lies on a line segment between two points of the sphere. This means that this point must be contained in any convex subset containing the sphere.
\begin{definition} (Audin Def 4.1 p.122)\\
A {\bf polyhedron} $P\subset \E$ is the convex hull of finitely many points in $\E$. It is said to be non-degenerate if the affine subspace generated by these points is $\E$. A point $p\in P$ is called interior if there exists a ball around $p$ that is also contained in $P$. The set of all interior points is called $P^o$.
\end{definition}
In other words a non-degenerate polyhedron always contains an affine frame and hence the simplex spanned by it.
It also means that there exist interior points.
For non-degenerate polyhedra we will give a careful definition of the vertices, edges and faces. This is not as straightforward as one might think.
\begin{definition}
Define the vertices of a polyhedron $P$ to be the intersection of all $T\subset \E$ whose convex hull is $P$.
Any subset of $n$ of the vertices determines a hyperplane $H$. We say $H\cap P$ is a face of $P$ if $P^o\cap H=\emptyset$.
In case the intersection of two distinct faces contains two vertices we call it an edge. The sets of vertices, edges and faces are generally denoted by
$V,E,F$.
\end{definition}
Even though a polyhedron $P$ may be the convex hull of ten points, it may have fewer vertices, for example take the eight corners of a regular cube and place two more points inside. The convex hull will still be that cube and the eight corners will be the vertices. It is true that if $P$ is the convex hull of set $S$ then $V$ is a subset of $S$ (exercise!).
Our terminology is mostly adapted for dimension three. In two dimensions the sides of a triangle should now be called faces and there are no edges.
In three dimensions the edges are intervals connecting two vertices (exercise!) but in higher dimensions they will have higher dimension too and more should be said to describe the combinatorial structure of the polyhedron completely.
Now that we defined faces properly we can say what we mean by a regular polyhedron. Again the definition we give is mostly tailored to dimension three, in higher dimensions we can ask even more regularity.
\begin{definition}
A non-degenerate convex polyhedron in Euclidean affine space is called {\bf regular} if all its faces are isometric to a single regular polygon and if for each pair of vertices $a,b$ there is an isometry taking $a$ to $b$ that sends any edge adjacent to $a$ to an edge adjacent to $b$.
\end{definition}
Simple examples of regular polyhedra in any dimension are the orthoplexes which are the convex hull of the unit vectors $\pm e_i$ in $\R^n$ (viewed as affine $n$ space). Dually the $n$-cubes $[-1,1]^n$ are also regular.
These two examples illustrate a general phenomenon of duality of convex polyhedra. Recall that for triangles we defined the centroid to be the center of gravity.
This notion works for any finite subset $S\subset E$. It is just the average of the points in $S$. The centroid of a polyhedron is the centroid of its vertices.
\begin{definition}
The dual of a convex polyhedron $P$ is the convex hull of the centroids of the faces of $P$.
\end{definition}
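
For example, and this computation is our own illustration of the definition, consider the cube $[-1,1]^3$ in $\R^3$. Its vertices are the eight points $(\pm 1,\pm 1,\pm 1)$ and the face in the plane $x_1=1$ has the four vertices $(1,\pm 1,\pm 1)$, with centroid
\[
\tfrac{1}{4}\big((1,1,1)+(1,1,-1)+(1,-1,1)+(1,-1,-1)\big) = (1,0,0) = e_1.
\]
In the same way the six faces have centroids $\pm e_1,\pm e_2,\pm e_3$, so the dual of the cube is the orthoplex (octahedron). Conversely the eight triangular faces of the orthoplex have vertices $\epsilon_1 e_1,\epsilon_2 e_2,\epsilon_3 e_3$ with signs $\epsilon_i=\pm 1$ and centroids $\frac{1}{3}(\epsilon_1,\epsilon_2,\epsilon_3)$. The dual of the orthoplex is therefore the cube $[-\frac{1}{3},\frac{1}{3}]^3$, a scaled copy of the cube we started with.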
\subsection{Spherical geometry}
Throughout this subsection we work in a three dimensional Euclidean vector space $E$. Our main object of study is the sphere $S^2 = S^2(0,1)$ of radius $1$ centered at the origin.
\begin{definition}
A great circle on $S^2$ is the intersection of $S^2$ with a two dimensional linear subspace of $E$.
A half space defined by vector $v\in E$ is the set $H_v =\{x\in E| \la v, x\ra \geq 0\}$.
A {\bf convex spherical polygon} is the intersection of $S^2$ and finitely many half spaces.
\end{definition}
The most important special cases of convex spherical $n$-gons are the case $n=1$, known as a hemisphere, the case $n=2$, called a time zone (lune), and the case $n=3$, called a spherical triangle.
We think of the great circles as the `straight lines' (geodesics). Later in the course we will see that indeed among all paths that run on the surface of the sphere the shortest path between two points is always an arc of a great circle. By arc of great circle we just mean a connected part. Unlike Euclidean straight lines, any two distinct great circles intersect, and not just in one point but in two points. This pair of points is always antipodal (opposite). There is a fundamental isometry called the antipodal isometry $a:E\to E$ defined by $a(x) = -x$ and we say it sends points $x$ on the sphere to their antipode $-x$.
We make the assumption that there exists an area function $\mathrm{Area}:\mathcal{P}\to \R$ defined on the set $\mathcal{P}$ of all $P\subset S^2$ bounded by finitely many segments of great circles\footnote{One should not include all subsets of the sphere because this will allow paradoxical constructions akin to the Banach-Tarski paradox.}. We assume $\mathrm{Area}$ satisfies the following properties:
\begin{enumerate}
\item $\mathrm{Area}(S^2) = 4\pi$
\item If $P,Q\in \mathcal{P}$ and $P\cap Q$ consists of segments of great circles then $\mathrm{Area}(P\cup Q) = \mathrm{Area}(P)+\mathrm{Area}(Q)$.
\item For any isometry $\phi$ of $E$ and $P\in \mathcal{P}$ we have $\mathrm{Area}(P) = \mathrm{Area}(\phi(P))$.
\end{enumerate}
The angle between two great circles is the geometric angle between the two corresponding planes.
\begin{theorem}(Audin Prop 3.1, p. 121)\\
The sum of the angles of a spherical triangle is the area of the triangle plus $\pi$.
\end{theorem}
\begin{proof}(Nicer than Audin!).
Extend the sides of our spherical triangle $ABC$ to three great circles. Each pair of great circles will meet
in an additional point antipodal to the original, say $A'$ is antipodal to $A$ and $B'$ the antipode of $B$ and $C'$ the antipode of $C$.
The antipodal isometry sends triangle $ABC$ to triangle $A'B'C'$ which means they have equal area.
For each vertex of our triangle two of the three great circles intersect and these two circles bound two time zones ending at the vertex and its antipode.
One of these time zones contains $ABC$ the other contains $A'B'C'$. The time zones are related by a $\pi$ rotation with axis $\oa{AA'}$.
The measure of the geometric angle of both the time zones at $A$ is called $\alpha\in [0,\pi)$, the angle at $B$ is called $\beta$ and the one at $C$ is called $\gamma$.
The total of six time zones covers the sphere completely with a triple overlap precisely at $ABC$ and $A'B'C'$. Since the area of a time zone with angle $\mu$ is $2\mu$ and the area of the sphere is $4\pi$, the following computation finishes the proof:
\[
4\pi = \mathrm{Area}(S^2) = 4\alpha+4\beta+4\gamma-2\mathrm{Area}(ABC)-2\mathrm{Area}(A'B'C') = 4(\alpha+\beta+\gamma-\mathrm{Area}(ABC))
\]
\end{proof}
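As a quick check of the theorem (an example of our own, not taken from Audin), consider the octant triangle with vertices $A=e_1$, $B=e_2$, $C=e_3$ on $S^2$. Its sides lie on the three coordinate planes, which are pairwise perpendicular, so $\alpha=\beta=\gamma=\frac{\pi}{2}$. The eight octants are related by isometries and tile the sphere, so by the assumed properties of $\mathrm{Area}$ each has area $\frac{4\pi}{8}$. This agrees with the theorem:
\[
\alpha+\beta+\gamma-\pi = \frac{3\pi}{2}-\pi = \frac{\pi}{2} = \frac{4\pi}{8} = \mathrm{Area}(ABC)
\]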
\subsection{Euler's formula}
In the remainder of this subsection we assume the dimension of $\E$ is $3$.
\begin{theorem}(Euler's polyhedron formula)\\
If $V$, $E$ and $F$ denote the number of vertices, edges and faces of a convex non-degenerate polyhedron in $\E$
then
\[V-E+F=2\]
\end{theorem}
\begin{proof}
Without loss of generality we may assume that all faces of our polyhedron $Q\subset \E$ are triangular. This is because we can always divide a face with more corners by adding a diagonal edge. This increases both $E$ and $F$ by one while not changing the number of vertices. Hence $V-E+F$ is unchanged.
By non-degeneracy we may choose a point $O$ in the interior of our polyhedron $Q$. Applying the map $\Theta_O:\E\to \mathbb{E}$ where we wrote the underlying vector space as $\mathbb{E}$ to avoid conflict of notation we now consider a polyhedron $P = \Theta_O(Q)$ containing the origin in its interior.
The radial projection map $\rho:\mathbb{E}-\{0\}\to S^2$ defined by $\rho(x) = \frac{x}{|x|}$ sends the vertices, edges and faces of $P$
to a tiling of the sphere $S^2$ by triangles. More precisely the images of the edges end at the images of the vertices and their complement on the sphere
is a union of spherical triangles.
We calculate in two ways the sum of the (measures of the geometric) angles of all the triangles. Collecting the angles around each vertex we see that they add up to $2\pi$ at each vertex so we get $2\pi V$ in total.
On the other hand we can add the angle sums of the triangular faces. By the previous theorem this gives
$\sum_{\mathrm{faces}\ f}\big(\mathrm{Area}(f)+\pi\big) = 4\pi+\pi F$ since the $F$ faces cover $S^2$ precisely.
Putting it all together we found
\[
2\pi V = 4\pi +\pi F
\]
Since all faces are assumed to be triangles we have $3F=2E$ so $-\frac{F}{2} = -E+F$. Dividing the above by $2\pi$ we find $2=V-\frac{F}{2} = V-E+F$.
\end{proof}
Euler's polyhedron formula restricts the possibilities for regular polyhedra severely.
\begin{lemma}
If in a convex polyhedron all faces have $s$ vertices and around every vertex $r$ edges meet then $\{r,s\}\in \{\{3,3\},\{3,4\},\{3,5\}\}$.
\end{lemma}
\begin{proof}
Any edge belongs to two faces so $sF = 2E$ and every edge has two end points so $rV=2E$.
Substituting $V=\frac{2}{r}E$ and $F=\frac{2}{s}E$ into Euler's formula we find $2=V-E+F=\frac{2}{r}E-E+\frac{2}{s}E$ so dividing by $2E$ we find
\[
\frac{1}{E}+\frac{1}{2} = \frac{1}{r}+\frac{1}{s}
\]
The number of edges is positive so $\frac{1}{r}+\frac{1}{s}>\frac{1}{2}$. Since $r,s$ are supposed to be integers $\geq 3$ there are only finitely many possibilities to satisfy this inequality. Five ordered pairs $(r,s)$ to be precise, as one can see by listing all possibilities.
\end{proof}
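The listing can be made explicit. The table below is a sketch of that enumeration: for each admissible ordered pair $(r,s)$ we compute $E$ from the displayed equation and then $V=\frac{2E}{r}$ and $F=\frac{2E}{s}$. No other pairs occur, since $r,s\geq 4$ gives $\frac{1}{r}+\frac{1}{s}\leq \frac{1}{2}$ and so does a pair with one entry equal to $3$ and the other $\geq 6$.
\begin{center}
\begin{tabular}{c|c|c|c|c}
$(r,s)$ & $\frac{1}{r}+\frac{1}{s}-\frac{1}{2}$ & $E$ & $V$ & $F$\\
\hline
$(3,3)$ & $\frac{1}{6}$ & $6$ & $4$ & $4$\\
$(3,4)$ & $\frac{1}{12}$ & $12$ & $8$ & $6$\\
$(4,3)$ & $\frac{1}{12}$ & $12$ & $6$ & $8$\\
$(3,5)$ & $\frac{1}{30}$ & $30$ & $20$ & $12$\\
$(5,3)$ & $\frac{1}{30}$ & $30$ & $12$ & $20$\\
\end{tabular}
\end{center}
In every row one checks directly that $V-E+F=2$.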
So far we have shown that if regular polyhedra exist in $3$-space then the possible values of $s,r$ are as above. However we do not yet know if for each of these allowed values there exists such a polyhedron. This is the content of the final book of Euclid's Elements:
\begin{theorem}(the Platonic solids)\\
Up to isometries and dilations there exist precisely five regular polyhedra in Euclidean affine $3$-space. They are:
\begin{enumerate}
\item The tetrahedron $(r,s) = (3,3)$
\item The cube $(3,4)$
\item The octahedron $(4,3)$
\item The dodecahedron $(3,5)$
\item The icosahedron $(5,3)$
\end{enumerate}
\end{theorem}
The existence of tetrahedron, cube and octahedron is not hard to establish explicitly. The construction of the dodecahedron and icosahedron is more surprising.
The strategy is to start with a cube and place roofs on each of the faces of a cube in a symmetric fashion.
The roof has a square horizontal base and a top line segment parallel to a pair of sides of the cube. The remaining faces are determined by connecting the vertices
of the square to the endpoints of the top segment. The five edges not part of the square should have equal length. The two dihedral angles that the faces make with the square base should add up to $\pi/2$. The top angles of the triangles should be $3\pi/5$, the interior angle of a regular pentagon.
The vertices of a dodecahedron are $(\pm 1,\pm 1,\pm 1),(0,\pm \tau,\pm \tau^{-1}),(\pm \tau^{-1},0,\pm \tau),(\pm \tau,\pm \tau^{-1},0)$, where $\tau = \frac{1+\sqrt{5}}{2}$ is the golden ratio.
Higher dimensional analogues of the Platonic solids exist in any dimension. We have already met the simplices and the generalization of the cube is the hypercube $[-1,1]^n$. Its dual is the analogue of the octahedron and also exists in any dimension. However dodecahedron and icosahedron are unique to three dimensions.
In dimensions five and up the three regular solids, simplex, hypercube and its dual the orthoplex, are the only ones. In dimension four there are three more that go by the name of $120$-cell, $600$-cell (analogues of dodecahedron and icosahedron) and something new called the $24$-cell. One way to understand their existence is as lifts of the symmetries of the three-dimensional Platonic solids. This uses the fact that $SO(3)$ is doubly covered by the $3$-sphere $S^3$.
\subsection{Quaternions}
No discussion of three dimensions is complete without mention of the quaternions. Much like complex numbers are very effective in discussing the plane, the four-dimensional space of quaternions illuminates the isometry group of three-space. Following Hamilton we boldly construct these new numbers:
\begin{definition}
The quaternions $\QH$ form a $4$-dimensional real vector space with basis $1,\bi,\bj,\bk$ and bilinear, associative product
$\QH\times \QH \to \QH$ determined by Hamilton's relations
\[
\bi^2=\bj^2=\bk^2=-1 \quad \bi\bj = \bk\quad \bj\bk=\bi \quad \bk\bi=\bj
\]
If $q = w+x\bi+y\bj+z\bk\in \QH$ the conjugate is $\bar{q} = w-x\bi-y\bj-z\bk$.
The norm $|q|\geq 0$ is defined by $|q|^2 = q\bar{q} = w^2+x^2+y^2+z^2$.
\end{definition}
Calculus books still use $\bi,\bj,\bk$ to denote the unit vectors in $\R^3$ and this originated from Hamilton's work.
If we identify vector $v=(v_1,v_2,v_3)\in \R^3$ with the imaginary quaternion $q_v = v_1\bi+v_2\bj+v_3\bk\in \QH$ then we have
\[q_vq_u = -v\cdot u+q_{v\times u}\]
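Since $v\times u = -u\times v$ this gives $q_vq_u+q_uq_v = -2v\cdot u$, so $v\perp u$ if and only if $q_v q_u + q_u q_v = 0$.
Now assume $u$ is a unit vector so that $q_u\bar{q}_u = 1$. Since $\bar{q}_u = -q_u$ we get $q_u^2=-1$ and, multiplying $q_vq_u = -q_uq_v$ by $q_u$ from the left,
$v\perp u$ if and only if $q_v=q_u q_v q_u$. Notice that the right hand side is a pure quaternion for any $v$, since its conjugate is $\bar{q}_u\bar{q}_v\bar{q}_u = -q_uq_vq_u$.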
This means there is a map $s:\R^3\to \R^3$ given by $v\mapsto q_u q_v q_u = q_{s(v)}$.
We claim $s = s_{u^\perp}$ the reflection in the plane orthogonal to unit vector $u$.
Indeed $s$ fixes vectors in the plane $u^\perp$ by the above discussion and $q_{s(u)} = q_u^3 = -q_u=q_{-u}$ since $q_u^2=-1$.
As in the plane, a rotation in three-dimensional Euclidean space $E$ is a composition of two reflections. Any element of $O(E)$ is the composition of $0,1,2$ or $3$ reflections,
and for an element of $O^+(E)$ the number of reflections must be even, so any element of $O^+(E)$ is a rotation (or the identity).
For two unit vectors $u,t$ composing the reflections $s_{u^\perp}$ and $s_{t^\perp}$ we get a map sending
$q_v$ to $q_tq_uq_vq_uq_t$. Now $q_tq_u = (q_uq_t)^{-1}$ since both $u,t$ are unit vectors. The product $q_tq_u$ is also a unit quaternion
so all rotations can be written in quaternions as conjugation by a unit quaternion.
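To see this at work in a small example (with vectors of our own choosing, offered as a sanity check), take $u=e_1$ and $t=e_2$, so $q_u=\bi$ and $q_t=\bj$. Then $q_tq_u=\bj\bi=-\bk$ and $q_uq_t=\bi\bj=\bk$, so the composition $s_{t^\perp}\circ s_{u^\perp}$ sends $q_v$ to $-\bk q_v\bk$. On the basis vectors this gives
\[
-\bk\,\bi\,\bk = -\bj\bk = -\bi,\qquad -\bk\,\bj\,\bk = \bi\bk = -\bj,\qquad -\bk\,\bk\,\bk = \bk
\]
So $(x,y,z)\mapsto(-x,-y,z)$, which is the rotation by $\pi$ about the axis spanned by $e_3$, as expected for the composition of the reflections in the planes $x=0$ and $y=0$.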
\section{Differential geometry}
In the second half of this course we explore what remains of geometry when the space is curved and distances are distorted. Such spaces will be introduced as Riemannian\footnote{B. Riemann wrote his PhD thesis on this topic. His advisor was C. Gauss.} charts\footnote{Usually one considers Riemannian manifolds (see the course in the 3rd year) instead of charts but for our purposes charts are sufficient.}.
Our main concern will be to
formulate what we mean by a 'straight line' in such a curved space. The curves taking the place of straight lines are known as geodesics and we will show that they exist at least locally as the curve minimizing the distance from point $A$ to point $B$. In doing so we will need a fair amount of techniques from differential calculus including the existence of solutions to ordinary differential equations and a little variational calculus.
\subsection{Derivative}
\begin{definition}
A $C^1$ function is a function whose partial derivatives exist and are continuous. Likewise $C^2$ means the partial derivatives of the partial derivatives are continuous and so on.\\
Imagine a $C^1$ function $P\xrightarrow{f}\R^m$ defined on an open subset $P\subset \R^n$. Set $f = \sum_{i=1}^m f_i e_i$ for some functions $f_i:P\to \R$.
The derivative of $f$ at $p\in P$ is the linear transformation $\R^n\xrightarrow{f'(p)}\R^m$ whose matrix with respect to the standard bases is
$(\frac{\p f_i}{\p x_j}(p))_{i=1\dots m, j=1\dots n}$.\\
In case $P$ is not open we say $f$ is differentiable if there exists a differentiable function on an open set containing $P$ that coincides with $f$ on $P$.
\end{definition}
The matrix of the derivative with respect to the standard bases is also known as the Jacobian matrix. Its $(i,j)$-th entry is the $j$-th partial derivative $\frac{\p f_i}{\p x_j}(p)$ of the $i$-th component of $f$. Even though it is often enough to work in the standard bases it is important to notice that our derivative is the linear transformation corresponding to the Jacobian matrix, not the Jacobian itself. In other words $f'(p)e_j = \p_jf(p) = \sum_i \frac{\p f_i}{\p x_j}(p)e_i$.
In geometry it is often useful to be able to change to a basis adapted to the situation. Also, phrasing things in terms of linear transformations makes the formulas easier to manage.
It is traditional to call a function $X:P\to \R^n$ for $P\subset \R^n$ a {\bf vector field} on $P$. We will take for granted the following properties of the derivative:
\begin{lemma}(Properties of the derivative)
\begin{enumerate}
\item (Chain rule) For $C^1$ functions $P\xrightarrow{f} Q\xrightarrow{g} R$ we have $(g\circ f)'(p) = g'(f(p)) \circ f'(p)$.
\item For $C^1$ functions $P\xrightarrow{f,g} Q$ and $\alpha,\beta\in \R$ the function $\alpha f+\beta g$ is also $C^1$.
\item For $C^1$ functions $P\xrightarrow{f,g} \R$ the product $fg$ is also $C^1$.
\item The function $P\xrightarrow{f} Q$ is $C^1$ if and only if its coefficient functions $f_i$ are $C^1$ where $f(p) = \sum_i f_i(p)e_i$.
\end{enumerate}
\end{lemma}
\begin{definition}
A $C^1$ function $\R^n\supset P\xrightarrow{f}Q\subset\R^n$ is said to be a diffeomorphism if it is a bijection whose inverse is also $C^1$.
\end{definition}
Diffeomorphisms are often important as symmetries or coordinate changes because by definition we do not lose any information applying them to some object.
The inverse function theorem provides lots of theoretical examples of diffeomorphisms. One way to state the theorem is
that if a $C^1$ function $P\xrightarrow{f} Q$ between open subsets $P,Q\subset \R^n$ has invertible derivative $f'(p)$ for some $p\in P$ then there are open neighbourhoods $p\in A\subset P$ and $f(p)\in B\subset Q$ such that the restriction of $f$ to $A$ is a diffeomorphism with image $B$.
A geometric way to think about the derivative of a map $P\xrightarrow{f}Q$ between open subsets $P\subset \R^n$ and $Q\subset \R^m$ is the following.
At every point $p\in P$ the derivative at $p$ is a linear map $\R^n\xrightarrow{f'(p)} \R^m$ and to visualize $f'(p)v$ for some $v\in \R^n$ we use the chain rule.
To make the vector $v$ more geometric we interpret it as the velocity vector of some curve $\gamma:(-\epsilon,\epsilon)\to P$ for some small $\epsilon>0$. So $\gamma(0)=p$ and $\dot{\gamma}(0)= v$. We cannot apply $f$ directly to $v$ but we can now consider what happens to $\gamma$ as we apply $f$. This produces a new curve $\beta=f\circ\gamma$ with image in $Q$. The velocity vector $w$ of this new curve $\beta$ at $f(p) = \beta(0)$ is a natural geometric way to transport the vector $v$ at $p$ to a vector $w$ at $f(p)$. In formulas
\[w = \dot{\beta}(0) = (f\circ \gamma)'(0) = f'(\gamma(0))\dot{\gamma}(0) = f'(p)v\]
In the third step we used the chain rule. We leave it as an exercise to the reader that this construction did not depend on the particular choice of $\gamma$.
Another good way to think about the derivative $f'(p)v$ is as a directional derivative $\p_v f(p)$:
\[
\p_v f(p) = \lim_{h\to 0}\frac{f(p+hv)-f(p)}{h}
\]
The chain rule tells us that $\p_v f(p) = f'(p)v$. This is left as an exercise to the reader.
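A small worked example (with a function of our own choosing) may help to see that both descriptions agree. Take $f:\R^2\to\R^2$ given by $f(x,y) = (x^2y,\,x+y)$, the point $p=(1,2)$ and the vector $v=(3,-1)$. The Jacobian matrix gives
\[
f'(x,y) = \begin{pmatrix} 2xy & x^2\\ 1 & 1\end{pmatrix},\qquad
f'(p)v = \begin{pmatrix} 4 & 1\\ 1 & 1\end{pmatrix}\begin{pmatrix}3\\-1\end{pmatrix} = \begin{pmatrix}11\\2\end{pmatrix}
\]
On the other hand $f(p+hv) = \big((1+3h)^2(2-h),\,3+2h\big) = (2+11h+12h^2-9h^3,\,3+2h)$ and $f(p) = (2,3)$, so
\[
\p_v f(p) = \lim_{h\to 0}\frac{f(p+hv)-f(p)}{h} = \lim_{h\to 0}\,(11+12h-9h^2,\,2) = (11,2)
\]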
\subsection{Riemannian metrics}
\begin{definition}(Riemannian chart)\\
A {\bf Riemannian chart} $(P,g)$ is an open subset $P\subset \R^n$ together with a {\bf Riemannian metric} $g$.
The Riemannian metric is a choice of inner product $g(p)$ on $\R^n$ for each $p\in P$ in such a way that the functions $g_{ij}:P\to \R$ defined by
$g_{ij}(p) = g(p)(e_i,e_j)$ are $C^1$.
\end{definition}
Euclidean $n$-space is the fundamental example of a Riemannian chart. It looks like $(\R^n,g_E)$ where the Riemannian metric $g_E$ is given by
$g_E(p)(v,w) = \la v,w \ra$, the usual inner product. Since the standard basis $e_i$ is orthonormal we have $g_{ij}(p) = \delta_{ij}$ for all $p$
which is surely a $C^1$ function.
Lengths and measures of angles can still be defined in a Riemannian chart $(P,g)$:
\begin{definition}(Length and angle)\\
Suppose $\beta,\gamma:[a,b]\to P$ are curves in Riemannian chart $(P,g)$.\\
The {\bf length} $L(\gamma)$ of curve $\gamma$ is the integral $L(\gamma) = \int_a^b \sqrt{g(\gamma(t))(\gamma'(t),\gamma'(t))}dt$. \\
If $\beta(q)=\gamma(q)=p\in P$ the {\bf angle} between $\beta,\gamma$ at $p$ is
the angle between the vectors $\beta'(q),\gamma'(q)$ in the Euclidean space $(\R^n,g(p))$, provided that $\beta'(q),\gamma'(q)\neq 0$.
\end{definition}
Metrics can also be transferred by pull-back as follows.
\begin{definition}{\bf(Pull-back metric)}\\
Given a $C^2$ differentiable map $P\xrightarrow{\varphi}Q$ between open $P\subset \R^m$ and Riemannian chart $(Q,g)$ such that $\varphi'(p)$ is injective for all $p\in P$,
we may define a metric $\varphi^*g$ on $P$ by $(\varphi^*g)(p)(v,w) = g(\varphi(p))( \varphi'(p)v,\varphi'(p)w)$.
\end{definition}
The derivative of $\varphi$ is required to be injective in order to ensure non-degeneracy of the pulled back inner product.
For example we may take the sphere and geographic coordinates $G = (0,\pi)\times (-\pi,\pi)$ and $G\ni (\mu,\lambda)\xmapsto{\varphi} (\cos \lambda \sin\mu,\sin\lambda \sin\mu,\cos\mu)\in \R^3$. Here $\mu$ is the colatitude, that is the angle measured from the north pole so that the latitude is $\frac{\pi}{2}-\mu$, and $\lambda$ is the longitude. For example Leiden, at latitude $52.1601^\circ$ and longitude $4.4970^\circ$, is the point $\varphi\big((90-52.1601)\frac{\pi}{180},\,4.4970\frac{\pi}{180}\big)$.
Explicitly the inner product $\varphi^*g_{Eucl}$ is given by calculating it at every point for the basis vectors $e_1,e_2$. Since the matrix for $\varphi'(p)$ is
$\left(\begin{array}{cc}
\cos \lambda \cos \mu & -\sin \lambda \sin \mu \\
\sin \lambda \cos \mu & \cos \lambda \sin \mu \\
-\sin \mu & 0\\
\end{array}\right)$ we get $\varphi^*g_{Eucl}(p)(e_1,e_1) = 1$, $\varphi^*g_{Eucl}(p)(e_1,e_2) = 0$, $\varphi^*g_{Eucl}(p)(e_2,e_2) = \sin^2 \mu$.
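
As an illustration of the length formula in this chart (a sketch, with the curve chosen by us), fix $\mu_0\in(0,\pi)$ and $-\pi<a<b<\pi$ and consider the arc $\gamma:[a,b]\to G$, $\gamma(t) = (\mu_0,t)$, of a circle of constant $\mu$. Then $\gamma'(t)=e_2$ and
\[
L(\gamma) = \int_a^b\sqrt{\varphi^*g_{Eucl}(\gamma(t))(e_2,e_2)}\,dt = \int_a^b \sin\mu_0\,dt = (b-a)\sin\mu_0
\]
Letting $a\to-\pi$ and $b\to\pi$ we recover $2\pi\sin\mu_0$, the circumference of a Euclidean circle of radius $\sin\mu_0$. Similarly an arc $t\mapsto(t,\lambda_0)$ of a meridian with $t\in[a,b]\subset(0,\pi)$ has length $b-a$.
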
Given a geometric object $Q\subset \R^n$ in Euclidean space there are two ways to study it: {\bf intrinsically} or {\bf extrinsically}.
The intrinsic way of studying $Q$ would be to find a differentiable map $P\xrightarrow{\varphi}Q$ with injective derivative and turn $P$ into a Riemannian chart $(P,g)$ as above by pulling back the Euclidean metric: $g=\varphi^*g_{Eucl}$. Properties of $Q$ that can be described in this way are called {\bf intrinsic} properties. The length of a curve $\gamma:[0,1]\to Q$ is an example of an intrinsic property.
The other way is to work with $Q$ from the 'outside' or {\bf extrinsically} by studying how it sits in $\R^n$ using the tools from Euclidean geometry and differential calculus. A typical extrinsic property of $Q$ is the Euclidean distance between two points in $Q$.
Both the intrinsic and the extrinsic point of view are important but in modern geometry there is an emphasis on the intrinsic features. Often working intrinsically is more efficient in that one needs fewer coordinates and is less distracted by 'irrelevant' extrinsic features. Also sometimes our space $Q$ is not presented to us as a subset of some Euclidean space but still has a Riemannian metric. In that case the intrinsic approach is the only option. This happens for example when $Q$ is the universe, or when $Q = O(E)$ is the orthogonal group or the space of quaternions of unit length.
Finally another important example of a Riemannian chart is hyperbolic space $\mathbb{H}^n = \R^{n-1}\times \R_{>0}$ with metric
$g_{hyp}(x,y)(v,w) = \frac{1}{y^2}g_{E}(v,w)$.
For example in the hyperbolic plane the length of the vertical line between $(0,a)$ and $(0,b)$ given by the curve $\gamma$ defined by $\gamma(t) = (t(b-a)+ a)e_2$ with $a<b$ is $L(\gamma) = \int_0^1\frac{b-a}{t(b-a)+a}\,dt = \log\frac{b}{a}$. [\dots]
\begin{proof}
\dots{} $>0$ for $t\in (0,1)$. By assumption $0=\int_0^1 M(t)h(t)dt$ but by the mean value theorem for integrals
this integral equals $M(s)h(s)$ for some $s\in (0,1)$.
\end{proof}
\begin{proof}(Of Theorem \ref{thm.EL})\\
Let us assume curve $\gamma\in \mathcal{C}$ minimizes $S$ in the sense that for any other curve $\beta\in \mathcal{C}$
$S(\gamma)\leq S(\beta)$.
Choose $i\in \{1,\dots, n\}$ and consider perturbing the curve $\gamma$ by adding an arbitrary function $h$ in the $i$ direction.
More precisely choose some fixed $C^2$ function $h:[0,1]\to \R$ with $h(0)=h(1) =0$. The function $V:(-v,v)\times [0,1]\to P$ defined by
$V(\epsilon,t) = \gamma(t)+\epsilon h(t)e_i$ represents a family of curves, one for each value of $\epsilon$ and $v$ is chosen so that all the curves
actually have image contained in $P$. So for any fixed $|\epsilon|<v$ [\dots]
\end{proof}
[\dots] $>0$.
It thus has an inverse $(g^{-1}_{rk}(\gamma(t)))$ so $\sum_{k}g^{-1}_{rk}(\gamma(t))g_{ks}(\gamma(t)) = \delta_{rs}$ (Kronecker delta). Applying this to both sides of our differential equations
we get the {\bf geodesic equation}. For fixed $r\in \{1,\dots n\}$ we have:
\begin{equation}
\label{eq.geod}
0=\gamma''_r(t)+ \sum_{i,j=1}^n \Gamma^r_{ij}(\gamma(t))\gamma'_i(t)\gamma'_j(t)
\end{equation}
For any values of $i,j,r\in \{1,\dots n\}$ the function $\Gamma^r_{ij}:P\to \R$ is known as a {\bf Christoffel symbol} and is defined as:
\begin{equation}
\label{eq.Christoffel}
\Gamma^r_{ij}(p) = \frac{1}{2}\sum_{k}g^{-1}_{rk}(p)\Big(\frac{\p g_{jk}}{\p x_i}(p)+\frac{\p g_{ik}}{\p x_j}(p)-\frac{\p g_{ij}}{\p x_k}(p)\Big)
\end{equation}
For example the eight Christoffel symbols for the sphere with geographic coordinates $(x_1,x_2)=(\mu,\lambda)$ are as follows: $\Gamma^1_{ij} = 0$ except $\Gamma^1_{22} = -\cos \mu\sin \mu$,
$\Gamma^2_{11} = \Gamma^2_{22} = 0$ and $\Gamma^2_{12} = \Gamma^2_{21} = \cot\mu=\frac{\cos\mu}{\sin\mu}$. So the geodesic equations for $\gamma = (\gamma_1,\gamma_2)$ are $\ddot{\gamma}_1 -\cos\mu\sin\mu\, \dot{\gamma}_2\dot{\gamma}_2 = 0 = \ddot{\gamma}_2 +2\cot\mu\,\dot{\gamma}_1\dot{\gamma}_2$, where $\mu = \gamma_1(t)$.
While a little complicated at first sight it should be clear that the meridians $\gamma(t) = (t,c)$ for some constant $c\in \R$ are solutions.
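In the same spirit one may test the circles of constant $\mu$ (a check of our own, using only the equations above). For $\gamma(t) = (\mu_0,t)$ we have $\ddot{\gamma}_1 = \ddot{\gamma}_2=0$, $\dot{\gamma}_1=0$ and $\dot{\gamma}_2=1$, so the second equation holds while the first becomes
\[
-\cos\mu_0\sin\mu_0 = 0
\]
Since $\mu_0\in(0,\pi)$ this forces $\mu_0=\frac{\pi}{2}$. Among these circles only the equator is a geodesic, in agreement with the fact that it is the only great circle among them.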
Returning to the general case: the fundamental theorem on the existence and uniqueness of solutions to ordinary differential equations assures us that geodesics always exist. Intuitively the next theorem states that in a Riemannian chart one can start walking 'straight' in any direction at any point of the chart.
\begin{theorem}
In a Riemannian chart $(P,g)$ any $p\in P\subset \R^n$ and $v\in \R^n$ determine a unique $C^2$ curve
$\gamma:[C,D]\to P$ with $C<0<D$ such that $\gamma(0) = p$ and $\dot{\gamma}(0)=v$ and for all $m$ we have $\gamma''_m(t)+ \sum_{i,j=1}^n \Gamma^m_{ij}(\gamma(t))\gamma'_i(t)\gamma'_j(t)=0$. The domain $[C,D]$ is not quite unique but if we have another curve $\tilde{\gamma}$ with the same properties and domain $[A,B]$ then $\gamma=\tilde{\gamma}$ on $[A,B]\cap [C,D]$.
\end{theorem}
We arrived at geodesics by attempting to minimize the distance between two points but what we found is actually more like the curves that are 'straight'.
Depending on global issues travelling a straight line may or may not actually minimize the distance travelled.
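
The same recipe applies to the hyperbolic plane $\mathbb{H}^2$ (the following is a sketch of our own, writing the coordinates as $(x_1,x_2)=(x,y)$). Here $g_{11}=g_{22}=y^{-2}$ and $g_{12}=0$, so the inverse matrix has entries $g^{-1}_{11}=g^{-1}_{22}=y^2$, $g^{-1}_{12}=0$ and the only non-zero partial derivatives are $\p_2g_{11}=\p_2g_{22}=-2y^{-3}$. Formula \eqref{eq.Christoffel} then gives
\[
\Gamma^1_{12}=\Gamma^1_{21} = -\frac{1}{y},\quad \Gamma^2_{11}=\frac{1}{y},\quad \Gamma^2_{22}=-\frac{1}{y}, \quad \Gamma^1_{11}=\Gamma^1_{22}=\Gamma^2_{12}=\Gamma^2_{21}=0
\]
and the geodesic equations \eqref{eq.geod} read
\[
\ddot{\gamma}_1 - \frac{2}{\gamma_2}\dot{\gamma}_1\dot{\gamma}_2 = 0 = \ddot{\gamma}_2 + \frac{1}{\gamma_2}\big(\dot{\gamma}_1^2-\dot{\gamma}_2^2\big)
\]
The vertical curve $\gamma(t) = (0,e^t)$ is a solution: $\gamma_1=0$ and $\ddot{\gamma}_2-\dot{\gamma}_2^2/\gamma_2 = e^t-e^{2t}/e^t = 0$. The curve $t\mapsto (0,t(b-a)+a)$ with $0<a<b$ traces the same line but is not a solution since $\ddot{\gamma}_2-\dot{\gamma}_2^2/\gamma_2 = -\frac{(b-a)^2}{t(b-a)+a}\neq 0$. So the geodesic equation also constrains the parametrization of the curve.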
\subsection{Surfaces in Euclidean space}
In this subsection we examine the meaning of the geodesic equation from the \emph{extrinsic} point of view.
This means we assume our Riemannian chart $(P,g)$ comes with an injective $C^2$ map $\phi:P\to \R^m$ with injective derivative and the metric is the pull-back of the Euclidean metric $g = \phi^*g_E$. We will assume $P\subset \R^n$ throughout. The reader should pay particular attention to the case $n=2, m=3$ that we see around us every day.
The tangent space to $\phi(P)$ at point $\phi(p)$ is spanned by the partial derivatives $\p_i \phi(p) = \phi'(p)e_i$ or more generally the directional derivatives $\p_v \phi(p)$. Let us denote the {\bf tangent space} by $T_{\phi(p)} = \phi'(p)(\R^n)$. It is an $n$-dimensional linear subspace of $\R^m$, translated by vector $\phi(p)$ it actually becomes an affine subspace tangent to $\phi(P)$.
We will see that in this setting the geodesic equation simply states that the acceleration of a geodesic
is perpendicular to the tangent space. More precisely, given a curve $\gamma$ in $P$ we also have its extrinsic counterpart $\beta = \phi\circ \gamma$ and we really mean that
$\ddot{\beta}(t) \perp T_{\beta(t)}$ is equivalent to $\gamma$ being a geodesic in $(P,g)$.
In other words, the acceleration is $0$ just like we are used to for straight lines in the plane, except that some acceleration in the normal direction is needed to keep our curve on the surface $\phi(P)$.
\begin{comment}
It makes sense to generalize our discussion a little by considering any vector field $X:P\to \R^n$ instead of the special vector field $\dot{\gamma}$.
Again $X$ has an extrinsic counterpart $Y$ by transporting it to $\R^m$ using $\phi$ or rather the derivative. The vector field $Y$ is only defined on
$\phi(P)$ by the formula $Y(\phi(p)) = \phi'(p)X(p)$. Setting $X(\gamma(t)) = \dot{\gamma}(t)$ will recover $Y(\beta(p)) = \dot{\beta}(t)$ the promised geodesic equation. Instead of the accelleration $\ddot{\phi\circ\gamma}$ we will then consider the directional derivative $\p_{\dot{\beta}(t)}Y(\beta(t))$.
The vector field $Y$ is said to be {\bf parallel} along curve $\beta$ if $\p_{\dot{\beta}(t)}Y(\beta(t))\perp T_{\beta(t)}$. Intuitively this means that as we move along the curve $\beta$, the vector $Y(\beta(t))$ does not change, except in the normal direction. In case $Y(\beta(t)) = \dot{\beta}(t)$ we thus restated the geodesic equation as saying that $\dot{\beta}$ is parallel along $\beta$.
\end{comment}
In summary the theorem we would like to prove is the following:
\begin{theorem}
With $\beta = \phi\circ \gamma$ we have
$\ddot{\beta} \perp T_{\beta(t)}$ is equivalent to the geodesic equation for $\gamma$:
\[0=\ddot{\gamma}_r(t)+ \sum_{i,j=1}^n \Gamma^r_{ij}(\gamma(t))\gamma'_i(t)\gamma'_j(t)\]
\end{theorem}
The key to the proof is the relation between the Christoffel symbols and the second order partial derivatives:
\begin{lemma}
\label{lem.ddC}
\[
\la \p_i \p_j \phi(p),\p_k\phi(p) \ra= \sum_{r=1}^n \Gamma^r_{ij}(p) g_{k r}(p) = \frac{1}{2}(\p_i g_{jk}+\p_j g_{ki}-\p_k g_{ij})
\]
\end{lemma}
\begin{proof}
Dropping the evaluation at $p$ for brevity, the middle expression can be expanded using \eqref{eq.Christoffel}:
\[\frac{1}{2}\sum_{r,\ell}g_{k r}g^{-1}_{r\ell}(\p_i g_{j\ell}+\p_j g_{\ell i}-\p_\ell g_{ij})=
\frac{1}{2}\sum_{\ell}\delta_{k\ell}(\p_i g_{j\ell}+\p_j g_{\ell i}-\p_\ell g_{ij})= \frac{1}{2}(\p_i g_{jk}+\p_j g_{k i}-\p_k g_{ij})\]
Here we used the defining equation for the inverse of $g$, namely $\sum_{r}g_{k r}g^{-1}_{r\ell} = \delta_{k\ell}$.
Writing the dot product with a $\cdot$ and using the product rule for the dot product we have:
$\p_i g_{jk} = \p_i(\p_j \phi \cdot \p_k\phi) = (\p_i\p_j\phi) \cdot \p_k\phi+\p_j \phi \cdot (\p_i\p_k\phi)$.
Write this equation three times while cyclically permuting the indices $i,j,k$:
\begin{align}
\p_i g_{jk} &= (\p_i\p_j\phi) \cdot \p_k\phi+\p_j \phi \cdot (\p_i\p_k\phi) \\
\p_j g_{ki} &= (\p_j\p_k\phi) \cdot \p_i\phi+\p_k \phi \cdot (\p_j\p_i\phi) \\
\p_k g_{ij} &= (\p_k\p_i\phi) \cdot \p_j\phi+\p_i \phi \cdot (\p_k\p_j\phi)
\end{align}
It follows that $\p_i g_{jk}+\p_j g_{ki}-\p_k g_{ij} = 2(\p_i\p_j\phi) \cdot \p_k\phi$.
\end{proof}
\begin{proof}(of the Theorem)\\
Since the vectors $\p_k\phi(p)$ span the tangent space $T_{\phi(p)}$ as $k$ runs from $1$ to $n$ it suffices to prove
\[\forall k: \la \ddot{\beta}(t),\p_k\phi(\gamma(t))\ra = 0\]
By the chain rule $\dot{\beta}(t) = (\phi\circ \gamma)'(t) = \phi'(\gamma(t))\dot{\gamma}(t) = \sum_{i=1}^n\p_i\phi(\gamma(t))\dot{\gamma}_i(t)$.
Differentiating once more using both product rule and chain rule we find
\[
\ddot{\beta}(t) = \sum_{i,j=1}^n \p_j\p_i\phi(\gamma(t))\dot{\gamma}_i(t)\dot{\gamma}_j(t)+\sum_{i=1}^n \p_i\phi(\gamma(t))\ddot{\gamma}_i(t)
\]
Taking the inner product with $\p_k\phi(\gamma(t))$ on both sides and abbreviating $p = \gamma(t)$ we can make use of our lemma as follows
\[
\la \ddot{\beta}(t),\p_k\phi(\gamma(t))\ra =
\sum_{i,j=1}^n \la \p_j\p_i\phi(p), \p_k\phi(p)\ra\dot{\gamma}_i(t)\dot{\gamma}_j(t)+\sum_{s=1}^n \la \p_s\phi(p),\p_k\phi(p)\ra\ddot{\gamma}_s(t)=
\]
\[
\sum_{i,j=1}^n \frac{1}{2}(\p_i g_{jk}+\p_j g_{ki}-\p_k g_{ij})\dot{\gamma}_i(t)\dot{\gamma}_j(t)+\sum_{s=1}^n g_{sk}(p)\ddot{\gamma}_s(t)
\]
Up to a factor $2$, the vanishing of the final line for all $k$ is precisely equation \eqref{eq.geod0} which is equivalent to the geodesic equation.
\end{proof}
Lemma \ref{lem.ddC} actually gives us more intuition about the Christoffel symbols. They express the projection of the second derivative
in terms of the basis $\p_k\phi(p)$ of the tangent space $T_{\phi(p)}$. More precisely, for any $i,j$ we have
$\p_i\p_j\phi(p) = \sum_{k=1}^n \Gamma^k_{ij}(p)\p_k\phi(p) + N_{ij}(p)$ for some vectors $N_{ij}(p)$ perpendicular to $T_{\phi(p)}$.
If we denote by $\pi_q$ the orthogonal projection of $\R^m$ onto the linear subspace $T_{q}$ then we may reformulate our lemma
as
\begin{equation}
\label{eq.proj}
\pi_{\phi(p)}\p_i\p_j\phi(p) = \sum_{k=1}^n \Gamma^k_{ij}(p)\p_k\phi(p)
\end{equation}
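To make \eqref{eq.proj} concrete (an illustration of our own), take for $\phi$ the geographic parametrization of the sphere, $\phi(\mu,\lambda) = (\cos \lambda \sin\mu,\sin\lambda \sin\mu,\cos\mu)$, and $i=j=2$. Then
\[
\p_2\p_2\phi = (-\cos\lambda\sin\mu,\,-\sin\lambda\sin\mu,\,0),\qquad
\la \p_2\p_2\phi,\p_1\phi\ra = -\cos\mu\sin\mu,\qquad \la \p_2\p_2\phi,\p_2\phi\ra = 0
\]
Since $g_{11}=1$, $g_{12}=0$ and $g_{22}=\sin^2\mu$ this reproduces $\Gamma^1_{22} = -\cos\mu\sin\mu$ and $\Gamma^2_{22}=0$. The remaining normal part is
\[
N_{22} = \p_2\p_2\phi+\cos\mu\sin\mu\,\p_1\phi = -\sin^2\mu\,\phi(\mu,\lambda)
\]
which indeed points along the radius of the sphere and is therefore perpendicular to the tangent space.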
\begin{comment}
In order to formulate this more geometrically let us generalize the second order partial derivative to a second order directional derivative.
Recall the directional derivative of $\phi$ at $p$ in direction $v$ is denoted $\p_v \phi(p) = \phi'(p)v$ and partial derivatives are the special case
$\p_i = \p_{e_i}$. For two vector fields $A,B$ on $\R^m$ define $\p_AB$ to be the vector field defined by $(\p_AB)(q) = \p_{A(q)}B(q)$.
For example $\p_i\p_j\phi(p) = \p_{e_i}\p_j\phi(p)$
In other words if
$A = \sum_{i=1}^mA_ie_i$ and $B = \sum_{i=1}^mB_je_j$ for some functions $A_i,B_j:\R^m\to \R$ then
$\p_AB = \sum_{i=1}^m (\p_AB)_i e_i$ with $(\p_A B)_i(q) = \sum_{i,j=1}^mA_i(q)(\p_i B_j)(q)e_j$.
Take a vector field $X = \sum_{i=1}^nX_ie_i$ on $P$ and transport it to a vector field $\tilde{X}$ on $\phi(P)\subset \R^m$ by $\tilde{X}(\phi(p))=\phi'(p)X(p) = \sum_{i=1}^n X_i\p_i\phi(p)$. The Christoffel symbols now enter when we have two vector fields $X,Y$ on $P$, transport them to $\R^m$ and differentiate one in the direction of the other: With $q = \phi(p)$ we find
\[
\p_{\tilde{X}}\tilde{Y}(q) = \sum_{i,j=1}^n X_i\p_{\p_i\phi(p)}Y_j\p_j\phi(p) = \sum_{i,j,k=1}^n X_i\p_{w_{k,i}\phi(p)e_k}Y_j\p_j\phi(p)=
\sum_{i,j,k=1}^n w_{k,i}\phi(p) X_iY_j\p_k\p_j\phi(p)
\]
\end{comment}
\subsection{Covariant derivative and curvature}
Intuitively the meaning of the geodesic equation is that one should set the acceleration of the curve to $0$
except for some Christoffel symbols that correct for the fact that the space we are studying is 'curved'.
In this subsection we make the notion of being curved painfully precise by introducing the Riemann curvature.
It is built on a generalization of the arguments in the previous subsections. A basic geometric question that curvature answers
is why all maps of the earth have to be distorted in some way. We cannot flatten the surface of a sphere onto the plane
without stretching or tearing.
The new feature we want to emphasize is the invariance of our constructions under isometries. To this end we pass from
differentiation with respect to the arbitrary coordinates in our chart to differentiation in any direction.
Partial derivatives should be replaced by the directional derivative $\p_v F(p) = F'(p)v$.
Since the partial derivatives of the previous subsection were viewed as functions on $P$, their collection
really forms a bunch of vector fields. So instead of merely the directional derivative we would like to differentiate
one vector field in the direction of another at every point. This produces a new vector field as follows.
\begin{definition}
\label{def.dXY}
For $C^1$ vector fields $X,Y:P\to \R^n$ on open set $P\subset \R^n$ we define a new vector field $\p_X Y:P\to \R^n$
by $(\p_X Y)(p) = (\p_{X(p)}Y)(p) = Y'(p)X(p)$. The commutator is $[X,Y] = \p_X Y-\p_Y X$.\\
If $P\xrightarrow{f} Q$ has injective derivative and $X$ is a vector field on $P$ then we denote by $f'X$ the vector field on $f(P)\subset Q$ defined by
$f'X(f(p)) = f'(p)X(p)$.
\end{definition}
For example in $P=\R^2$ with $X(x,y) = (2x+y^2,x)$ and $Y(x,y) = (3y+x^3,y^2)$ we get
$\p_X(Y)(x,y) = \p_{(2x+y^2,x)}Y(x,y) = Y'(x,y)(2x+y^2,x) = \left(\begin{array}{cc} 3x^2 & 3\\ 0 & 2y\end{array}\right){2x+y^2 \choose x} = {6x^3+3x^2y^2+3x\choose 2xy}$. More generally if $X = \sum_{i=1}^n X_i e_i$ and $Y= \sum_{j=1}^n Y_j e_j$ then $(\p_X Y)(p) = \sum_{i,j}X_i(p)(\p_iY_j)(p) e_j$.
Another way to think about $(\p_XY)(p)$ is that we find an integral curve $\gamma$ for $X$ so $\gamma(0)=p$ and $\dot{\gamma}(0) = X(p)$. Then
$(\p_XY)(p) = \frac{d}{dt}Y(\gamma(t))\big|_{t=0}$.
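
Continuing the example with $X(x,y) = (2x+y^2,x)$ and $Y(x,y) = (3y+x^3,y^2)$ (a computation of our own, to illustrate the commutator), we also need $\p_YX$:
\[
\p_Y X(x,y) = X'(x,y)Y(x,y) = \left(\begin{array}{cc} 2 & 2y\\ 1 & 0\end{array}\right){3y+x^3 \choose y^2} = {2x^3+6y+2y^3\choose x^3+3y}
\]
and therefore
\[
[X,Y](x,y) = \p_XY-\p_YX = {4x^3+3x^2y^2+3x-6y-2y^3\choose 2xy-x^3-3y}
\]
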
Our new operation on vector fields can be used to give an important criterion for space being 'flat'.
\begin{lemma}
For any three vector fields $X,Y,Z$ on $\R^n$ we have
\[
\p_X (\p_Y Z) - \p_Y (\p_X Z) - \p_{[X,Y]} Z = 0
\]
\end{lemma}
\begin{proof}
To see why we simply expand the definitions, for brevity we drop the evaluation at $p$ from our notation:
\[
\p_X (\p_Y Z) =
\sum_{i,j,k}X_k\p_k(Y_i(\p_iZ_j)) e_j =
\sum_{i,j,k}X_k\big(\p_kY_i+Y_i\p_k\big)\p_iZ_j e_j
\]
Therefore interchanging $X$ and $Y$ and also the indices $i,k$ we find
\[
\p_X (\p_Y Z)-\p_Y (\p_X Z) = \sum_{i,j,k}\big(X_i\p_iY_k\p_k-Y_i\p_iX_k\p_k\big)Z_j e_j
\]
On the other hand $[X,Y] = \p_X Y-\p_YX = \sum_{i,k}(X_i\p_iY_k-Y_i\p_iX_k)e_k$ proving the lemma.
\end{proof}
As we saw in the geodesic equation the proper generalization of $\p_X Y$ to an arbitrary Riemannian chart is called the {\bf covariant derivative}
also known as the Levi-Civita connection $\nabla_X Y$. It turns out to be related to the Christoffel symbols
by $\nabla_{e_i}e_j = \sum_{k}\Gamma^k_{ij}e_k$. We will see that in this notation the geodesic equation would be $\nabla_{\dot{\gamma}}\dot{\gamma} = 0$.
\begin{definition}{\bf(Levi-Civita connection)}\\
Denote the set of all $C^2$ vector fields on $P$ by $\Vect(P)$.
An {\bf LC-connection} on a Riemannian chart $(P,g)$ is a map $\nabla:\Vect(P)\times \Vect(P) \to \Vect(P)$
with the following properties. For any $X,Y,Z\in \Vect(P)$, scalar $a\in \R$ and $C^1$-function $f:P\to \R$ we have:
\begin{enumerate}
\item $\nabla_X (fY+Z) = f\nabla_X Y+(\p_X f)Y+\nabla_XZ$
\item $\nabla_{fX+Z} (Y) = f\nabla_X Y+\nabla_{Z} Y$
\item $g(p)(\nabla_X Y(p),Z(p))+g(p)(Y(p),\nabla_X Z(p)) = \p_{X(p)} G(p)$, where $G(s) = g(s)(Y(s),Z(s))$
\item $\nabla_XY-\nabla_YX=[X,Y]$
\end{enumerate}
\end{definition}
In $(\R^n,g_E)$ the formula $\nabla_XY = \p_X Y$ defines an LC-connection. In checking the first two axioms the hardest bit is
$\nabla_X (fY) = \sum_{i=1}^n X_i\p_i(fY) = \sum_{i=1}^n\big(X_i(\p_i f)Y+X_if\p_iY\big) = (\p_Xf) Y+f\nabla_XY$. The third axiom is true because
the partial derivative satisfies a product rule with respect to the standard dot product. The last axiom was our definition of the commutator, so there is nothing to check.
While a little abstract, this definition encodes exactly what we were doing before. In the theorem below we will see that the Christoffel symbols come out as the coefficients of
$\nabla_{e_i}e_j$, so $\nabla_{e_i}e_j = \sum_{k}\Gamma^k_{ij}e_k$. Assuming this for a moment, the geodesic equation can be rewritten as
$\nabla_{\dot{\gamma}}\dot{\gamma} = 0$ with the caveat that we need to extend $\dot{\gamma}$ to a vector field $V$ such that $V(\gamma(t)) = \dot{\gamma}(t)$ for all $t$. Indeed using $V= \sum_{j=1}^nV_je_j$ the properties of the connection give us
\[
\nabla_{\dot{\gamma}}\dot{\gamma} = \sum_{i,j}\dot{\gamma}_i\nabla_{e_i}(V_je_j) = \sum_{i,j}\Big(\dot{\gamma}_i(\p_iV_j)e_j +\dot{\gamma}_iV_j\nabla_{e_i}e_j\Big)=
\sum_{k}\frac{d}{dt}V_k(\gamma(t))e_k +\sum_{i,j,k}\Gamma^k_{ij}\dot{\gamma}_iV_j e_k = \sum_k\Big(\ddot{\gamma}_k +\sum_{i,j}\Gamma^k_{ij}\dot{\gamma}_i\dot{\gamma}_j\Big) e_k
\]
What is important is that the Riemannian metric $g$ determines the LC-connection uniquely, much like our discussion in the previous subsection.
\begin{theorem} {\bf (Fundamental lemma of Riemannian geometry)}\\
There exists a unique Levi-Civita connection $\nabla$ on any Riemannian chart $(P,g)$.\\ Moreover $\nabla_{e_i}e_j = \sum_{k}\Gamma^k_{ij}e_k$.
\end{theorem}
\begin{proof}
For simplicity we write the Riemannian metric as $\la\cdot,\cdot\ra$ and do not explicitly write its dependence on the point. Condition 3), combined with condition 4) in the form $\nabla_X Z = \nabla_Z X+[X,Z]$, is then abbreviated as \begin{equation}
\label{eq.gcons}
\p_{X} \la Y,Z\ra=\la\nabla_X Y,Z\ra+\la Y,\nabla_X Z\ra =\la\nabla_X Y,Z\ra+\la Y,\nabla_Z X\ra+\la Y,[X,Z]\ra
\end{equation}
We start with uniqueness: we show that the LC-connection is determined entirely by $g$, much like in the proof of Lemma \ref{lem.ddC}.
For vector fields $X,Y,Z$ we write Equation \eqref{eq.gcons} three times cycling around $X,Y,Z$:
\begin{align*}
\p_{X} \la Y,Z\ra&=\la\nabla_X Y,Z\ra+\la Y,\nabla_Z X\ra+\la Y,[X,Z]\ra \\
\p_{Y} \la Z,X\ra&=\la\nabla_Y Z,X\ra+\la Z,\nabla_X Y\ra+\la Z,[Y,X]\ra \\
\p_{Z} \la X,Y\ra&=\la\nabla_Z X,Y\ra+\la X,\nabla_Y Z\ra+\la X,[Z,Y]\ra
\end{align*}
Subtracting the third equation from the sum of the first two we can solve for $\la\nabla_X Y,Z\ra$ to get
\begin{equation}
\label{eq.nablas}
\la\nabla_X Y,Z\ra = \frac{1}{2}\Big(\p_{X} \la Y,Z\ra +\p_{Y} \la Z,X\ra-\p_{Z} \la X,Y\ra-\la Y,[X,Z]\ra-\la Z,[Y,X]\ra+\la X,[Z,Y]\ra\Big)
\end{equation}
Uniqueness follows from this equation since if we had two LC-connections $\nabla,\tilde{\nabla}$ then for all $Z$ we would find
$\la\nabla_X Y,Z\ra = \la\tilde{\nabla}_X Y,Z\ra$ which is only possible if $\nabla_X Y = \tilde{\nabla}_X Y$.
Existence follows from the same equation \eqref{eq.nablas} because taking $X,Y,Z = e_i,e_j,e_k$ we find
\[
\la \nabla_{e_i}e_j,e_k\ra = \frac{1}{2}\Big(\p_i g_{jk} +\p_j g_{ki}-\p_k g_{ij}\Big) = \sum_{r=1}^n \Gamma^r_{ij} g_{k r}
\]
using $[e_a,e_b] = 0$ and the definition of the Christoffel symbols \eqref{eq.Christoffel}. Using the inverse $g^{-1}$ we thus find $\nabla_{e_i}e_j = \sum_{k=1}^n\Gamma_{ij}^ke_k$ as desired.
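This also shows that an LC-connection exists: by properties 1) and 2) it must be given by the formula below, and one can check that this formula indeed satisfies all four properties.
\[\nabla_{X}Y = \sum_{i,j}X_i\nabla_{e_i}(Y_je_j) = \sum_{i,j}\Big(X_i(\p_iY_j)e_j+X_iY_j\nabla_{e_i}e_j\Big) = \sum_{i,j}X_i(\p_iY_j)e_j+\sum_{i,j,k}X_iY_j\Gamma^k_{ij}e_k\]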
\end{proof}
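To see the formula at work, here is a sketch for polar coordinates, under the assumption (not made in the text above) that $P=(0,\infty)\times(-\pi,\pi)$ with coordinates $(r,\theta)$ and $g=\phi^*g_E$ for $\phi(r,\theta) = (r\cos\theta,r\sin\theta)$, so that $g_{11}=1$, $g_{12}=g_{21}=0$ and $g_{22}=r^2$. For the coordinate fields all commutators vanish, so solving the three cycled equations for $\la\nabla_XY,Z\ra$ gives $2\la\nabla_XY,Z\ra = \p_X\la Y,Z\ra+\p_Y\la Z,X\ra-\p_Z\la X,Y\ra$ and hence
\begin{align*}
\la\nabla_{e_2}e_2,e_1\ra &= \tfrac12\big(\p_2g_{21}+\p_2g_{12}-\p_1g_{22}\big) = -r,\\
\la\nabla_{e_1}e_2,e_2\ra &= \tfrac12\big(\p_1g_{22}+\p_2g_{21}-\p_2g_{12}\big) = r,
\end{align*}
while the remaining pairings $\la\nabla_{e_1}e_1,e_k\ra$, $\la\nabla_{e_1}e_2,e_1\ra$ and $\la\nabla_{e_2}e_2,e_2\ra$ all vanish. Inverting $g$ we read off $\nabla_{e_2}e_2 = -re_1$, $\nabla_{e_1}e_2=\nabla_{e_2}e_1 = \frac{1}{r}e_2$ and $\nabla_{e_1}e_1=0$. In other words $\Gamma^1_{22}=-r$ and $\Gamma^2_{12}=\Gamma^2_{21}=\frac{1}{r}$, with all other Christoffel symbols equal to zero.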
When our Riemannian metric comes from pulling back the Euclidean metric along a suitable map $\phi:P\to \R^m$ as in the previous subsection,
the LC-connection is easier to understand. It amounts to using the LC-connection in $(\R^m,g_E)$ given by Definition \ref{def.dXY} and then projecting the result onto the tangent space in $\R^m$. More precisely, let $g=\phi^*g_E$ be the resulting metric on $P\subset \R^n$. We claim that the LC-connection in that case is given by
$\phi'\nabla_XY=\pi\nabla_{\tilde{X}}\tilde{Y}$ where $\pi(p):\R^m\to \R^m$ is the linear map that projects orthogonally onto the tangent space $\mathrm{Image}(\phi'(p))$ and
$\tilde{X}$ and $\tilde{Y}$ are vector fields on $\R^m$ such that $\tilde{X}\circ \phi = \phi'X$ and likewise $\tilde{Y}\circ \phi = \phi'Y$.
Also recall that by $\phi'X$ we mean the vector field on $\mathrm{Image}(\phi)$ defined by $\phi'X(\phi(p)) = \phi'(p)X(p)$.
This means that on the image of $\phi$ the chain rule gives
\[\nabla_{\tilde{X}}\tilde{Y}(\phi(p)) = \tilde{Y}'(\phi(p))\tilde{X}(\phi(p)) = \tilde{Y}'(\phi(p))\phi'(p)X(p) = (\tilde{Y}\circ\phi)'(p)X(p) = \p_X (\phi'Y)(p)
\]
When $X = e_i$ and $Y=e_j$ we get $\phi'Y = \p_j\phi$ so that by equation \eqref{eq.proj}
$\phi'\nabla_{e_i}e_j = \pi \p_i\p_j\phi = \sum_{k=1}^n \Gamma^k_{ij}(p)\p_k\phi$.
Since $\phi'e_k=\p_k\phi$ and $\phi'(p)$ is injective we conclude $\nabla_{e_i}e_j = \sum_{k=1}^n \Gamma^k_{ij}(p)e_k$.
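As an illustration of the projection formula, here is a sketch for the unit sphere, assuming the spherical coordinate map $\phi(\mu,\lambda) = (\cos \lambda \sin\mu,\sin \lambda \sin\mu,\cos\mu)$ on $P=(0,\pi)\times(-\pi,\pi)$, with $e_1,e_2$ the $\mu$ and $\lambda$ directions and $\la\cdot,\cdot\ra$ denoting the Euclidean inner product of $\R^3$. The tangent space is spanned by the orthogonal vectors
\[
\p_1\phi = (\cos\lambda\cos\mu,\sin\lambda\cos\mu,-\sin\mu),\qquad \p_2\phi = (-\sin\lambda\sin\mu,\cos\lambda\sin\mu,0)
\]
with $\la\p_1\phi,\p_1\phi\ra = 1$ and $\la\p_2\phi,\p_2\phi\ra=\sin^2\mu$. Since $\p_2\p_2\phi = (-\cos\lambda\sin\mu,-\sin\lambda\sin\mu,0)$ satisfies $\la\p_2\p_2\phi,\p_1\phi\ra = -\sin\mu\cos\mu$ and $\la\p_2\p_2\phi,\p_2\phi\ra = 0$, we get $\pi\p_2\p_2\phi = -\sin\mu\cos\mu\,\p_1\phi$, so $\Gamma^1_{22} = -\sin\mu\cos\mu$ and $\Gamma^2_{22}=0$. Similarly $\p_1\p_2\phi = (-\sin\lambda\cos\mu,\cos\lambda\cos\mu,0)$ is orthogonal to $\p_1\phi$ and has $\la \p_1\p_2\phi,\p_2\phi\ra = \sin\mu\cos\mu$, so $\Gamma^1_{12}=0$ and $\Gamma^2_{12}=\Gamma^2_{21} = \cot\mu$. Finally $\p_1\p_1\phi = -\phi$ is normal to the sphere and hence $\Gamma^1_{11}=\Gamma^2_{11}=0$.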
\begin{lemma}{\bf (Isometry invariance)}\\
\label{lem.isominv1}
For any isometry $P\xrightarrow{f}Q$ between Riemannian charts we have $f'(\nabla_XY) = \nabla_{f'X}f'Y$.
Also $\gamma$ is a geodesic if and only if $f\circ \gamma$ is a geodesic.
\end{lemma}
\begin{proof}
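Define a map $\Delta:\Vect(P)\times \Vect(P) \to \Vect(P)$ by $\Delta_XY = (f^{-1})'\nabla_{f'X}f'Y$. If we can show that $\Delta$ is an LC-connection on $P$
then by uniqueness $\nabla = \Delta$ so $\nabla_XY = (f^{-1})'\nabla_{f'X}f'Y$ proving $f'(\nabla_XY) = \nabla_{f'X}f'Y$. Checking the four properties for $\Delta$ is a direct computation using the chain rule and, for property 3), the fact that $f$ preserves the metric.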
Next, if $\gamma$ is a geodesic in $P$ and $\beta = f\circ \gamma$ then $\nabla_{\dot{\gamma}}\dot{\gamma}=0$. Applying $f'$ to both sides yields
$0=f'\nabla_{\dot{\gamma}}\dot{\gamma}=\nabla_{f'\dot{\gamma}}f'\dot{\gamma} = \nabla_{\dot{\beta}}\dot{\beta}$, showing $\beta$ is a geodesic too.
Conversely if $\beta$ is a geodesic then so is $\gamma$ because $f^{-1}$ is an isometry too.
\end{proof}
We have seen that for the LC-connection on $(\R^n,g_E)$ we have
$\nabla_X \nabla_Y Z-\nabla_Y \nabla_X Z-\nabla_{[X,Y]}Z = 0$.
For other Riemannian charts this need not be the case.
\begin{definition}{\bf (Riemannian Curvature)}\\
The Riemann curvature of an $n$-dimensional Riemannian chart $(P,g)$ is the map
$R:\Vect(P)\times \Vect(P)\times \Vect(P)\to \Vect(P)$ defined by
\[R(X,Y)Z = \nabla_X \nabla_Y Z-\nabla_Y \nabla_X Z-\nabla_{[X,Y]}Z\]
\end{definition}
It follows that curvature is invariant under isometries in the following sense:
\begin{lemma}{\bf (Isometry invariance of curvature)}\\
\label{lem.isominv2}
For any isometry $P\xrightarrow{f}Q$ between Riemannian charts we have
$f'(R(X,Y)Z) = R(f'X,f'Y)f'Z$
\end{lemma}
\begin{proof}
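Using Lemma \ref{lem.isominv1} repeatedly, together with its consequence $f'[X,Y] = f'(\nabla_XY-\nabla_YX) = \nabla_{f'X}f'Y-\nabla_{f'Y}f'X = [f'X,f'Y]$, we find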
\[f'(R(X,Y)Z) = \nabla_{f'X} \nabla_{f'Y} f'Z-\nabla_{f'Y} \nabla_{f'X} f'Z-\nabla_{[f'X,f'Y]}f'Z = R(f'X,f'Y)f'Z\]
\end{proof}
Concretely the coefficients of the Riemann curvature can be computed in terms of Christoffel symbols.
\begin{lemma}
The curvature is linear in $X,Y,Z$ and for $C^2$-functions $f:P\to \R$ we have
$fR(X,Y)Z = R(X,Y)(fZ) = R(X,fY)Z=R(fX,Y)Z$. If we set
$R(e_i,e_j)e_k = \sum_{\ell}R_{i,j,k}^\ell e_\ell$ then
\[
R(e_i,e_j)e_k =\sum_\ell\Big(\p_i\Gamma_{jk}^\ell -\p_j\Gamma_{ik}^\ell
+\sum_{r}\big(\Gamma_{jk}^r\Gamma_{ir}^\ell
-\Gamma_{ik}^r\Gamma_{jr}^\ell\big)
\Big)e_\ell
\]
\end{lemma}
\begin{proof}
Linearity in $X,Y,Z$ follows directly from the properties of the LC-connection.
The property with function $f$ is left to the reader as an exercise.
Notice $[e_i,e_j] = 0$ so
$R(e_i,e_j)e_k = \nabla_{e_i}\nabla_{e_j}e_k-\nabla_{e_j}\nabla_{e_i}e_k$.
Now
\[\nabla_{e_i}\nabla_{e_j}e_k=\nabla_{e_i}\sum_r \Gamma_{jk}^re_r =
\sum_r\p_i\Gamma_{jk}^re_r+\sum_{r,s}\Gamma_{jk}^r\Gamma_{ir}^se_s =
\sum_\ell\Big(\p_i\Gamma_{jk}^\ell +\sum_{r}\Gamma_{jk}^r\Gamma_{ir}^\ell\Big)e_\ell\]
Therefore swapping $i$ and $j$ and subtracting we obtain the desired result.
\end{proof}
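A useful sanity check, sketched here under the assumption that we use polar coordinates $(r,\theta)$ on $P=(0,\infty)\times(-\pi,\pi)$ with $g_{11}=1$, $g_{12}=g_{21}=0$ and $g_{22}=r^2$: the only non-zero Christoffel symbols are then $\Gamma^1_{22}=-r$ and $\Gamma^2_{12}=\Gamma^2_{21}=\frac{1}{r}$, and the formula of the lemma gives
\[
R^1_{1,2,2} = \p_1\Gamma^1_{22}-\p_2\Gamma^1_{12}+\sum_s\big(\Gamma^s_{22}\Gamma^1_{1s}-\Gamma^s_{12}\Gamma^1_{2s}\big) = -1-0+0-\Gamma^2_{12}\Gamma^1_{22} = -1+1 = 0.
\]
So although the Christoffel symbols do not vanish in these coordinates, this curvature coefficient does, as expected for a piece of the Euclidean plane.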
As promised we can now understand how the geometry of the sphere is intrinsically different from that of the plane.
\begin{theorem}
Suppose $(P,g)$ is a Riemannian chart such that $g=\phi^*g_E$ and $\phi:P\to S^2\subset \R^3$ is an injective $C^2$ map with injective derivative.
Also take $(Q,h)$ to be a part of the plane with the Euclidean metric so $Q\subset \R^2$ and $h=g_E$.
There is no isometry between $(P,g)$ and $(Q,h)$.
\end{theorem}
\begin{proof}
Restricting the domains $P,Q$ if necessary, Lemma \ref{lem.isominv2} tells us we can assume without loss of generality that $P=(0,\pi)\times(-\pi,\pi)$ and that $\phi$ is the usual spherical coordinate map $\phi(\mu,\lambda) = (\cos \lambda \sin\mu,\sin \lambda \sin\mu,\cos\mu)$.
By computing the Christoffel symbols in this case we see $R$ is not zero in $P$. For example
$R_{1,2,2,1} = \sin^2 \mu$ is non-zero.
In $Q$ the curvature maps any triple of vector fields to $0$, so by Lemma \ref{lem.isominv2} no isometry between $(P,g)$ and $(Q,h)$ can exist.
\end{proof}
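The value quoted in the proof can be checked directly. In the spherical chart we have $g_{11}=1$, $g_{12}=g_{21}=0$ and $g_{22}=\sin^2\mu$, and the only non-zero Christoffel symbols are $\Gamma^1_{22} = -\sin\mu\cos\mu$ and $\Gamma^2_{12}=\Gamma^2_{21}=\cot\mu$. The coefficient formula for the curvature then gives
\[
R^1_{1,2,2} = \p_1\Gamma^1_{22}-\p_2\Gamma^1_{12}+\sum_r\big(\Gamma^r_{22}\Gamma^1_{1r}-\Gamma^r_{12}\Gamma^1_{2r}\big) = (\sin^2\mu-\cos^2\mu)-0+0-\cot\mu\cdot(-\sin\mu\cos\mu) = \sin^2\mu.
\]
Assuming the convention (not spelled out above) that the last index is lowered with the metric, $R_{i,j,k,\ell} = \sum_m g_{\ell m}R^m_{i,j,k}$, this yields $R_{1,2,2,1} = g_{11}R^1_{1,2,2} = \sin^2\mu$.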
Of course the Riemann curvature does much more than distinguishing the sphere from the plane but we leave this for further study except for a few general comments. Since the Riemann curvature coefficients contain a lot of information it is common to package them
into simpler quantities known as the Ricci curvature coefficients $R_{ij} = \sum_{k=1}^nR_{kij}^k$
and the scalar curvature $S = \sum_{i,j}(g^{-1})_{ij}R_{ij}$. For surfaces in 3-space all the information of the Riemann curvature is already contained in $S$ and
$S =2R_{1,2,2,1}/\det(g)$ is known to be twice the Gauss curvature which describes how the normal vector turns as one walks around a point.
In general a simple interpretation of the scalar curvature is possible in any Riemannian chart by comparing the circumference of a small (geodesic) circle with radius $r$ to $2\pi r$. Similar expressions are available in terms of area and volume. For future reference we state a definition for volume in a Riemannian chart.
\begin{definition}{\bf (Volume)}\\
For a Riemannian chart $(P,g)$ we define the determinant of the metric
$\det g:P\to \R$ by $\det g(p)=\det (g_{ij}(p))_{i,j=1,\dots, n}$, the determinant of the matrix
formed by the functions $g_{ij}$. The volume of a closed subset $D\subset P$ is defined to be
\[\mathrm{vol}(D) = \int_D \sqrt{\det g}\; dx_1\cdots dx_n\]
\end{definition}
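For instance, assuming the spherical chart $P=(0,\pi)\times(-\pi,\pi)$ with coordinates $(\mu,\lambda)$ and $g_{11}=1$, $g_{12}=g_{21}=0$, $g_{22}=\sin^2\mu$, we have $\sqrt{\det g} = \sin\mu$. Taking $D=P$, which is closed as a subset of itself, we find
\[
\mathrm{vol}(P) = \int_{-\pi}^{\pi}\int_0^\pi \sin\mu\; d\mu\, d\lambda = 2\pi\cdot 2 = 4\pi,
\]
the familiar area of the unit sphere; the chart only misses a set of area zero.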
\end{document}