Edits
triangle-man committed Apr 23, 2024
1 parent 87000d2 commit 1975109
Showing 3 changed files with 85 additions and 68 deletions.
6 changes: 3 additions & 3 deletions notes/bilinear-form.asy
@@ -26,11 +26,11 @@ draw(mapcurve, margin = DotMargins, Arrow(size=4pt));
label("$V$", vt * (0.5, 1), 2 * N);
label("$V^*$", xt * vt * (0.5, 1), 2 * N);

label("$C$", midpoint(mapcurve), N);
label("$\bm{C}$", midpoint(mapcurve), N);
// label("$C^{-1}$", reflect((0,0),E) * midpoint(mapcurve), S);

dot("$x$", vt * (0.7, 0.7), W);
dot("$C(x)$", xt * vt * (0.3, 0.7), E);
dot("$v$", vt * (0.7, 0.7), W);
dot("$\bm{C}(v)$", xt * vt * (0.3, 0.7), E);

// dot("$\tilde{b}$", xt * vt * (0.3, 0.2), E);
// dot("$C^{-1}(\tilde{b})$", vt * (0.7, 0.2), W);
10 changes: 5 additions & 5 deletions notes/bilinear-form2.asy
@@ -26,11 +26,11 @@ draw(reflect((0,0),E) * mapcurve, margin = DotMargins, BeginArrow(size = 4pt));
label("$V$", vt * (0.5, 1), 2 * N);
label("$V^*$", xt * vt * (0.5, 1), 2 * N);

label("$C$", midpoint(mapcurve), N);
label("$C^{-1}$", reflect((0,0),E) * midpoint(mapcurve), S);
label("$\bm{C}$", midpoint(mapcurve), N);
label("$\bm{C}^{-1}$", reflect((0,0),E) * midpoint(mapcurve), S);

dot("$x$", vt * (0.7, 0.7), W);
dot("$C(x)$", xt * vt * (0.3, 0.7), E);
dot("$v$", vt * (0.7, 0.7), W);
dot("$\bm{C}(v)$", xt * vt * (0.3, 0.7), E);

dot("$\tilde{b}$", xt * vt * (0.3, 0.2), E);
dot("$C^{-1}(\tilde{b})$", vt * (0.7, 0.2), W);
dot("$\bm{C}^{-1}(\tilde{b})$", vt * (0.7, 0.2), W);
137 changes: 77 additions & 60 deletions notes/optimisation.tex
@@ -22,6 +22,7 @@
\date{\today}
%%
\DeclareBoldMathCommand{\setR}{R}
\DeclareBoldMathCommand{\bfC}{C}
\DeclareMathOperator*{\argmin}{arg\,min}
\newcommand{\eg}{\emph{Example:}}
\newcommand{\ie}{\emph{i.e.}}
@@ -37,7 +38,7 @@
prevent an analytic solution and impede a numerical one.

The general problem is this. Suppose $X$ is some set, possibly with
additional structure, and $f\colon \setR \to \setR$ a real-valued function
additional structure, and $f\colon X \to \setR$ a real-valued function
on~$X$. We are to find $x_\text{min}\in X$ (if one exists) such that
\[
f(x_\text{min}) \leq f(x) \quad\text{for all $x\in X$}.
@@ -147,67 +148,69 @@
with $v$, somehow “carry it across” to $V^*$, and then act with the
result on~$w$.

Thus, let $C\colon V\to V^*$ be a linear map from $V$ to its dual. For
any vector $v\in V$, we obtain $C(v)\in V^*$, a linear map from $V$
Thus, let $\bfC\colon V\to V^*$ be a linear map from $V$ to its dual. For
any vector $v\in V$, we obtain $\bfC(v)\in V^*$, a linear map from $V$
to~$\setR$ (see figure~\ref{fig:bilinear-form}). Since an element of $V^*$ is a linear map from $V$
to~$\setR$, we may apply $C(v)$ to $w\in V$ and thereby obtain a number,
$(C(v))(w)$.
to~$\setR$, we may apply $\bfC(v)$ to $w\in V$ and thereby obtain a number,
$(\bfC(v))(w)$.

\begin{marginfigure}
\begin{center}
\asyinclude[width=5cm]{bilinear-form.asy}
\end{center}
\caption{A vector space $V$ and its dual $V^*$, showing: an element
$x\in V$; a linear map $C\colon V\to V^*$; and the image of $x$ in
$V^*$ under $C$.\label{fig:bilinear-form}}
$v\in V$; a linear map $\bfC\colon V\to V^*$; and the image of $v$ in
$V^*$ under $\bfC$.\label{fig:bilinear-form}}
\end{marginfigure}
In a sense, one may think of $C$ as a map, from pairs
In a sense, one may think of $\bfC$ as a map, from pairs
$(v,w)\in V\times V$ to the reals, which is “linear in both $v$ and
$w$.” This view suggests a less cumbersome notation: instead of
$(C(v))(w)$ we shall write $C(v,w)$. Thus, by $C(v, w)$ we shall mean,
“apply $C$ to $v$, obtaining an element of $V^*$, and apply this
element to $w$, obtaining a number.” When $C$ is viewed from this
$(\bfC(v))(w)$ we shall write $\bfC(v,w)$. Thus, by $\bfC(v, w)$ we shall mean,
“apply $\bfC$ to $v$, obtaining an element of $V^*$, and apply this
element to $w$, obtaining a number.” When $\bfC$ is viewed from this
perspective, it is known as a \emph{bilinear form}.

\eg{} For $C$ any bilinear form,
$C(\alpha v, \beta w) = \alpha\beta C(v,w)$ (which very much gives $C$ the flavour of a
\eg{} For $\bfC$ any bilinear form,
$\bfC(\alpha v, \beta w) = \alpha\beta \bfC(v,w)$ (which very much gives $\bfC$ the flavour of a
product).
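
\eg{} As a concrete sketch (the space and the coefficients are our own
choice, made purely for illustration), take $V=\setR^2$ with components
$v=(v_1,v_2)$ and set
\[
\bfC(v,w) = v_1w_1 + 2v_1w_2 + 3v_2w_2.
\]
For fixed $v$, the element $\bfC(v)\in V^*$ is the linear map
$w\mapsto v_1w_1 + (2v_1+3v_2)w_2$. With $v=(1,2)$ this is
$w\mapsto w_1+8w_2$, and applying it to $w=(1,-1)$ gives
$\bfC(v,w) = 1 - 8 = -7$. Scaling both arguments, say
$\bfC(2v,3w)$, multiplies the result by $2\cdot 3$, giving $-42$.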

We can now say roughly what is meant by a “quadratic term:” it is an
expression of the form $C(v,v)$ for some bilinear form~$C$.
expression of the form $\bfC(v,v)$ for some bilinear form~$\bfC$.

Notice, however, that in this expression $C$ is applied to a single
Notice, however, that in this expression $\bfC$ is applied to a single
$v$ (twice); whereas more generally a bilinear form may be applied to
two different vectors. Is there some redundancy in this definition?
Let $A$ be any bilinear form such that $A(v,w)=-A(w,v)$ and consider
the bilinear form $C+A$. By linearity, we have $(C+A)(v, v) =
C(v,v)+A(v,v)$. However, $A(v,v)=-A(v,v)$ (by assumption), whence
$A(v,v)=0$. Thus $(C+A)(v,v)=C(v,v)$; that is, $C+A$ gives rise to the
same quadratic form as~$C$.

A bilinear form $A$ for which $A(v,w)=-A(w,v)$ is said to be
\emph{antisymmetric}. Conversely, a bilinear form $S$ for which
$S(v,w)=S(w,v)$ is said to be \emph{symmetric}. Let $C$ be any
bilinear form and consider the identity:
Let $\bm{A}$ be any bilinear form such that $\bm{A}(v,w)=-\bm{A}(w,v)$ and consider
the bilinear form $\bfC+\bm{A}$. By linearity, we have $(\bfC+\bm{A})(v, v) =
\bfC(v,v)+\bm{A}(v,v)$. However, $\bm{A}(v,v)=-\bm{A}(v,v)$ (by assumption), whence
$\bm{A}(v,v)=0$. Thus $(\bfC+\bm{A})(v,v)=\bfC(v,v)$; that is, $\bfC+\bm{A}$ gives rise to the
same quadratic form as~$\bfC$.

A bilinear form $\bm{A}$ for which $\bm{A}(v,w)=-\bm{A}(w,v)$ is said to be
\emph{antisymmetric}. Conversely, a bilinear form $\bm{S}$ for which
$\bm{S}(v,w)=\bm{S}(w,v)$ is said to be \emph{symmetric}. From the foregoing, we
may add to $\bfC$ any antisymmetric bilinear form without affecting the
value of $\bfC(v,v)$. Now consider the identity (for any bilinear form):
\[
C(v, w) = \frac{1}{2}\bigl[C(v,w) + C(w,v)\bigr]
+ \frac{1}{2}\bigl[C(v,w) - C(w,v)\bigr].
\bfC(v, w) = \frac{1}{2}\bigl[\bfC(v,w) + \bfC(w,v)\bigr]
+ \frac{1}{2}\bigl[\bfC(v,w) - \bfC(w,v)\bigr].
\]
The first term on the right-hand side is symmetric whereas the second
is antisymmetric. Since the antisymmetric term vanishes when both
arguments are the same, we may, without loss of generality, assume
that $C$ is symmetric when evaluating~$C(v,v)$.\sidenote{Well, we have
to show that every symmetric bilinear form arises in this way.}
that $\bfC$ is symmetric when evaluating~$\bfC(v,v)$.\sidenote{It is also
true, though we do not show it, that there is no further
redundancy.}
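
\eg{} To see the decomposition at work in a small case (the
coefficients are chosen by us for illustration only), let
$V=\setR^2$ and
\[
\bfC(v,w) = v_1w_1 + 3v_1w_2 + v_2w_1 + 2v_2w_2.
\]
The two bracketed terms of the identity are then
\[
\begin{aligned}
\bm{S}(v,w) & = v_1w_1 + 2v_1w_2 + 2v_2w_1 + 2v_2w_2, \\
\bm{A}(v,w) & = v_1w_2 - v_2w_1,
\end{aligned}
\]
which are symmetric and antisymmetric respectively and which sum
to~$\bfC$. Setting $w=v$ gives $\bm{A}(v,v)=0$ and
$\bfC(v,v)=\bm{S}(v,v)=v_1^2+4v_1v_2+2v_2^2$, so only the symmetric
part contributes to the quadratic term.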

We are now in a position to say what we mean by a quadratic function on
a vector space. It is a function of the form:
\begin{equation}
f(v) = a - 2\tilde{b}(v) + C(v, v).
f(v) = a - 2\tilde{b}(v) + \bfC(v, v),
\label{eq:quadratic-function}
\end{equation}
In this expression, $a$ is a number, $\tilde{b}$ is an element of the
dual space and $C$ is a symmetric bilinear form. (The factor of $-2$
is conventional as it simplifies certain calculations.)
where, in this expression, $a$ is a number, $\tilde{b}$ is an element
of the dual space and $\bfC$ is a symmetric bilinear form. (The factor of
$-2$ is conventional as it simplifies certain calculations.)

Having written down a function on $V$, we return to the problem of
finding the location of its minimum.
@@ -216,79 +219,93 @@
eq.~\eqref{eq:completing-the-square}, we might attempt to rewrite
eq.~\eqref{eq:quadratic-function} as:
\begin{equation}
f(v) = \kappa + \Gamma(v - \xi, v - \xi),
f(v) = \kappa + \bm{\Gamma}(v - \xi, v - \xi),
\label{eq:vector-square}
\end{equation}
where now $\kappa$ is a number, $\xi$ is a vector (which we hope will turn
out to be the minimiser of $f$!), and $\Gamma$ is a symmetric bilinear
out to be the minimiser of $f$!), and $\bm{\Gamma}$ is a symmetric bilinear
form. (Note that, previously, the last term on the right-hand side
involved the expression ${(x-\xi)}^2$; here, a symmetric bilinear form
is required to effect the square.)

In the one-dimensional case we next expanded the term in ${(x-\xi)}^2$
and equated coefficients of each power of $x$. To do the same thing
here, we shall have to expand the term in $\Gamma$. Recall the meaning of
$\Gamma(v-\xi, v-\xi)$: $\Gamma$ is applied to $v-\xi$ to obtain an element
here, we shall have to expand the term in $\bm{\Gamma}$. Recall the meaning of
$\bm{\Gamma}(v-\xi, v-\xi)$: $\bm{\Gamma}$ is applied to $v-\xi$ to obtain an element
of~$V^*$; this element is then applied to $v-\xi$. Both of these
applications are linear, and so
\[
\begin{aligned}
\Gamma(v-\xi,v-\xi) & = \Gamma(v-\xi,v)-\Gamma(v-\xi, \xi) \\
& = \Gamma(v,v)-\Gamma(v, \xi) - \Gamma(\xi,v) + \Gamma(\xi, \xi) \\
& = \Gamma(v,v)-2\Gamma(\xi,v)+\Gamma(\xi,\xi).
\bm{\Gamma}(v-\xi,v-\xi) & = \bm{\Gamma}(v-\xi,v)-\bm{\Gamma}(v-\xi, \xi) \\
& = \bm{\Gamma}(v,v)-\bm{\Gamma}(v, \xi) - \bm{\Gamma}(\xi,v) + \bm{\Gamma}(\xi, \xi) \\
& = \bm{\Gamma}(v,v)-2\bm{\Gamma}(\xi,v)+\bm{\Gamma}(\xi,\xi).
\end{aligned}
\]
Replacing $\Gamma$ in eq.~\eqref{eq:vector-square} with this expansion, we
Replacing $\bm{\Gamma}$ in eq.~\eqref{eq:vector-square} with this expansion, we
obtain
\[
a -2\tilde{b}(v)+C(v,v) = \bigl[\kappa+\Gamma(\xi,\xi)\bigr] -2\Gamma(\xi,v) + \Gamma(v,v)
a -2\tilde{b}(v)+\bfC(v,v) = \bigl[\kappa+\bm{\Gamma}(\xi,\xi)\bigr] -2\bm{\Gamma}(\xi,v) + \bm{\Gamma}(v,v)
\]
from which we conclude: $\Gamma(v,v) = C(v,v)$ (from the
terms “quadratic in $v$”); $\Gamma(\xi, v) = \tilde{b}(v)$ (from the terms
linear in $v$); and $\kappa+\Gamma(\xi,\xi)=a$ (from the constant terms).
from which we conclude: $\bm{\Gamma}(v,v) = \bfC(v,v)$ (from the
terms “quadratic in $v$”); $\bm{\Gamma}(\xi, v) = \tilde{b}(v)$ (from the terms
linear in $v$); and $\kappa+\bm{\Gamma}(\xi,\xi)=a$ (from the constant terms).

The first and third of these identifications are clear. We should
choose $\Gamma=C$ and therefore $\kappa=a-C(\xi,\xi)$. The second term is less
obvious. Replacing $\Gamma$ with $C$, it is
$C(\xi, v) = \tilde{b}(v)$. What meaning should we ascribe to this?
Recall the meaning of $C(\xi,v)$: it is notation for $(C(\xi))(v)$, or
$C$ applied first to $\xi$, and the result is applied to~$v$ (see
choose $\bm{\Gamma}=\bfC$ and therefore $\kappa=a-\bfC(\xi,\xi)$. The second term is less
obvious. Replacing $\bm{\Gamma}$ with $\bfC$, it is
$\bfC(\xi, v) = \tilde{b}(v)$. What meaning should we ascribe to this?
Recall the meaning of $\bfC(\xi,v)$: it is notation for $(\bfC(\xi))(v)$, or
$\bfC$ applied first to $\xi$, and the result is applied to~$v$ (see
figure~\ref{fig:bilinear-form2}).
\begin{marginfigure}
\begin{center}
\asyinclude[width=5cm]{bilinear-form2.asy}
\end{center}
\caption{A vector space $V$ and its dual $V^*$, showing an element $v\in
V$ and its image in $V^*$ under $C$, as well as an element
$\tilde{b}\in V^*$ and its image in $V$ under~$C^{-1}$.\label{fig:bilinear-form2}}
V$ and its image in $V^*$ under $\bfC$, as well as an element
$\tilde{b}\in V^*$ and its image in $V$ under~$\bfC^{-1}$.\label{fig:bilinear-form2}}
\end{marginfigure}
That is, $C(\xi)$ is an element of~$V^*$, as is $\tilde{b}$. Moreover,
That is, $\bfC(\xi)$ is an element of~$V^*$, as is $\tilde{b}$. Moreover,
both of these give the same result when acting on any $v\in V$ and,
hence, are the same element of~$V^*$. That is, $C(\xi) = \tilde{b}$.
hence, are the same element of~$V^*$. That is, $\bfC(\xi) = \tilde{b}$.

A candidate answer for the minimiser of $f(v)$ is therefore
\begin{equation}
\label{eq:minimiser}
\xi = C^{-1}(\tilde{b}).
\xi = \bfC^{-1}(\tilde{b}).
\end{equation}
Unfortunately, we are not yet done. To conclude that this is a
minimiser we must in addition show two things: first, that $C$
minimiser we must in addition show two things: first, that $\bfC$
\emph{has} an inverse; and second that $f(v)$ is indeed a minimum at
this value.

It is convenient to tackle the second condition first. Assume, for the
moment, that $C$ is invertible and that $\xi$ is given by
moment, that $\bfC$ is invertible and that $\xi$ is given by
eq.~\eqref{eq:minimiser}. We must show that $f(v)>f(\xi)$ for all
$v\neq \xi$ which, from eq.~\eqref{eq:vector-square}, is equivalent to
requiring $C(v-\xi,v-\xi)>C(0,0)$. Since $v$ is arbitrary and
$C(0,0)=0$ this condition is equivalent to
requiring $\bfC(v-\xi,v-\xi)>\bfC(0,0)$. Since $v$ is arbitrary and
$\bfC(0,0)=0$ this condition is equivalent to
\begin{equation}
\label{eq:positive-definite}
C(x,x) > 0 \quad\text{for all $x\in V$ such that $x\neq0$}.
\bfC(x,x) > 0 \quad\text{for all $x\in V$ such that $x\neq0$}.
\end{equation}
A symmetric bilinear form for which eq.~\eqref{eq:positive-definite}
holds is said to be \emph{positive definite}.
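
\eg{} As a hedged illustration of the recipe so far (the space and
the numbers are our own choice, made only to exercise the formulas
above), take $V=\setR^2$ with components $v=(v_1,v_2)$ and let
\[
\bfC(v,w) = 2v_1w_1 + v_1w_2 + v_2w_1 + 2v_2w_2, \qquad
\tilde{b}(v) = 3v_1 + 3v_2, \qquad a = 10.
\]
This $\bfC$ is symmetric, and it is positive definite because
$\bfC(x,x) = x_1^2 + x_2^2 + {(x_1+x_2)}^2$. The condition
$\bfC(\xi,v)=\tilde{b}(v)$ for all $v$ reads
\[
(2\xi_1+\xi_2)v_1 + (\xi_1+2\xi_2)v_2 = 3v_1 + 3v_2,
\]
so that $2\xi_1+\xi_2=3$ and $\xi_1+2\xi_2=3$, whence $\xi=(1,1)$.
Then $\kappa = a - \bfC(\xi,\xi) = 10 - 6 = 4$, and
$f(v) = 4 + \bfC(v-\xi,v-\xi)$, which exceeds $4$ for every
$v\neq(1,1)$.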

Now we return to the issue of whether $\bfC$ is invertible. In fact,
we have:

\emph{Theorem}: Any positive-definite, symmetric bilinear form on a
finite-dimensional vector space is invertible.

\emph{Proof}: Suppose $\bfC$ is a positive-definite, symmetric
bilinear form. $\bfC$ is invertible if it is injective and
surjective. To show injectivity suppose, for contradiction, that there
is some $u\neq\bm{0}$ in $V$ such that $\bfC(u)=\bm{0}$ (with the
right-hand side being the zero element of~$V^*$, noting that the
existence of such a $u$ is equivalent to a failure of injectivity). Then we would have
$\bfC(u, x) =0$ for any $x$ and in particular $\bfC(u,u)=0$,
contradicting the assumed positive-definiteness of~$\bfC$.


